I began by installing Ubuntu Server 20.10 for ARM64 via the Raspberry Pi Imager tool.
After booting Ubuntu on the Pi 4, I installed the C++ build tools and cloned and built µWebSockets. I then ran the HelloWorldThreaded executable and started htop.
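Concretely, the steps looked roughly like this (the repository URL is real, but treat the make target and binary location as a sketch; they may differ between µWebSockets versions):

```shell
# Install C++ build tools on Ubuntu
sudo apt update && sudo apt install -y build-essential git

# Clone µWebSockets together with its µSockets submodule and build the examples
git clone --recursive https://github.com/uNetworking/uWebSockets.git
cd uWebSockets
make examples

# Run the threaded Hello World server, then watch CPU usage
./HelloWorldThreaded &
htop
```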
If you don’t know about µWebSockets: a minimal Hello World server takes only a handful of lines of C++.

On my laptop I built and ran the http_load_test of µSockets. This test establishes 200 connections making non-pipelined HTTP requests as fast as the server will allow. That got me to 93k req/sec at 400% CPU-time usage (all 4 CPUs on the Pi fully busy). So I figured that, with my elite cooling solution, I ought to manage a slight overclock from the default 1.5 GHz to 1.7 GHz.
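For reference, the minimal snippet mentioned above is essentially the project’s Hello World example; a sketch based on µWebSockets’ App API (single-threaded here for brevity):

```cpp
#include "App.h"  // µWebSockets

int main() {
    // Route every GET request and answer with a fixed body
    uWS::App().get("/*", [](auto *res, auto *req) {
        res->end("Hello world!");
    }).listen(3000, [](auto *listenSocket) {
        if (listenSocket) {
            // Server is accepting connections on port 3000
        }
    }).run();
}
```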
Overclocking the Pi 4 is super simple — all you need to do is edit /boot/firmware/config.txt as root and add a couple of lines raising the clock. With that, the results were a stable 106k routed HTTP req/sec. Running for a few minutes, everything stays stable and performs as I wanted it to, without overheating or otherwise failing.
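The exact lines didn’t survive formatting here; a typical 1.7 GHz overclock looks like the following (arm_freq and over_voltage are real firmware options, but the over_voltage value below is an assumption — pick one appropriate for your cooling):

```
over_voltage=2
arm_freq=1700
```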
The code running is a production ready solution that’s currently deployed in many large companies. It does proper and secure HTTP parsing, URL routing, parameter and querystring parsing, timeouts and passes rigorous security testing. It also does (optional) TLS 1.3 encryption. So it’s not your typical “100-liner benchmark winner” we are testing here — just to point that out. Also note yet again that we are doing non-pipelined HTTP requests here.
So, is that good?
These results are quite impressive. Or are they? I feel like 100k is something anyone can claim — you just pick the hardware to make it. So what I really like about this little experiment is that the hardware is fixed, cheap and well-known — anyone can pick this cheap computer up and compare with other software solutions. The fact that we are measuring a physical signal on a cable further simplifies the problem specification and eliminates ambiguity.
For comparison, running the same test against Node.js / Fastify as a cluster yields only 8.8k. That really puts things in perspective — we just outperformed Node.js / Fastify by 12x! That’s something you probably wouldn’t expect from a simple software change, but there you have it. Nothing but software changed!

Despite Fastify openly boasting about its performance with texts like “the fastest web framework in town” and “serving the highest number of requests as possible”, it really is a subpar solution in terms of performance, and so is Node.js, at least without native addons in use.
What about TLS 1.3 then?
We can go even further in our comparison: by enabling TLS 1.3 in µWebSockets we can run the same test with modern encryption enabled. Now we get a stable stream of 77k req/sec over this secure encryption standard. That is, we are outperforming Node.js / Fastify by 8.75x on a secure-vs-insecure basis! If that is not getting the message across, I don’t know what more to say.
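Enabling TLS in µWebSockets is a matter of constructing a uWS::SSLApp instead of a uWS::App and pointing it at a certificate; a sketch (the .pem paths are placeholders):

```cpp
#include "App.h"  // µWebSockets

int main() {
    // SSLApp takes certificate options up front; paths here are placeholders
    uWS::SSLApp({
        .key_file_name = "key.pem",
        .cert_file_name = "cert.pem"
    }).get("/*", [](auto *res, auto *req) {
        res->end("Hello world!");
    }).listen(3000, [](auto *listenSocket) {
        // listenSocket is non-null on success
    }).run();
}
```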
With µWebSockets it is possible to serve modern TLS 1.3 traffic way faster than many other software solutions can serve even insecure cleartext traffic.
“This is an unfair comparison”
“This test pitted a native C++ server against a scripted Node.js one, so of course the outcome was a given.” Well, not really. If I run the same test using µWebSockets.js for Node.js, the numbers are a stable 75k req/sec (for cleartext). That’s still 8.5x that of Node.js with Fastify, and roughly 70% of native µWebSockets itself.
So you can’t really say that Node.js is “naturally slow and that’s expected”. Native C++ addons can boost the performance of Node.js by huge amounts. Node.js loaded with µWebSockets.js can compare quite favorably with full-on native solutions in many cases, especially if your particular application is performing lots of small-message I/O.
What this test shows, really, is that different software solutions can have huge implications on performance as a whole.