Can We Double HTTP Client Throughput?


The Node.js HTTP client is a fundamental part of any application, yet many think it cannot be improved. I took this as a challenge and I'm now ready to present a new HTTP client for Node.js, undici, that doubles the throughput of your application. The story behind this improvement begins with the birth of TCP/IP and is rooted in one of the fundamental limitations of networking: head-of-line blocking (HOL blocking). HOL blocking is one of those topics that developers blissfully ignore, yet it deeply impacts the runtime experience of the distributed applications they build every day. Undici is an HTTP/1.1 client that avoids HOL blocking by using keep-alive and pipelining, resulting in a doubling of your application's throughput.

20 min
24 Jun, 2021

Video Summary and Transcription

Today's Talk discusses HTTP clients, servers, microservices, and maximizing performance in Node.js. It covers topics such as TCP, latency, HTTP Keep-Alive, pipelining, the Node.js event loop, timeouts, and introduces the Undici library. The speaker emphasizes the importance of reusing connections, minimizing blocking, and using benchmarks to measure performance impact. Undici is highlighted as a new client for Node.js that eliminates the need for multiple agents and offers easy configuration options.


1. Introduction to HTTP and Node.js

Short description:

Today, I'm going to talk about HTTP clients, servers, and how to improve the throughput of our HTTP client in Node.js. I have a good grasp of user needs and maintain Node.js. As part of my job, I work with cloud servers and have experience with building fast and scalable Node.js applications. I'm also a co-author of Fastify.

Hi, everyone. I am Matteo Collina, and today I'm going to talk to you about HTTP clients, servers, maybe microservices a little bit, and how we can double or maybe even triple the throughput of our HTTP client in Node.js.

So first, a little bit about me. I am Matteo Collina. I'm part of the Node.js Technical Steering Committee. I'm the co-creator of the Fastify web framework and the Pino logger. I'm a software architect and consultant by trade, and technical director at NearForm. Follow me on Twitter at @matteocollina.

So a couple of notes. I also had maybe 6 billion downloads on npm for the whole of 2020. I don't know, I was totally stunned by this. So maybe I know what I'm talking about, maybe not. Make of it what you want.

So what do I do? I typically help companies build fast and scalable Node.js applications. This is one of the key parts of my job. I am also one of the maintainers of Node.js and a key part of its ecosystem, with those 6 billion downloads per year, so I have a good grasp of what our users need and what they complain about. I need to balance those two things all the time: on one side helping our clients, on the other maintaining Node.js. This gives me a lot of perspective on what I can do, and what I need to do, for the development of Node.js applications and the ecosystem. The two sides of my job strengthen each other to some extent.

As part of my job, I work most of the time with cloud servers. So there is a client, typically a web browser or a mobile app, that talks to the cloud and, specifically, to one server. It can run as multiple instances, but it's still the same thing that runs; it's what we call a monolith. As I said, I'm a Fastify co-author, so shameless plug here: use this thing, it actually works really well. This slide is actually not up to date. Ah, I'm sorry.

2. Introduction to Microservices and Node Core HTTP

Short description:

This part discusses the use of a fast web server and framework for Node.js, which is suitable for building both small and large apps, including monoliths and microservices. The speaker highlights the need for microservices to scale teams and avoid overlapping responsibilities. They also address the issue of chattiness in microservices systems and emphasize the importance of communication between microservices. The speaker then introduces the Node Core HTTP as the focus of the presentation, explaining its role as the backing for popular HTTP clients. They discuss the process of creating a TCP socket and the potential latency involved. Additionally, they mention the concept of the congestion window for new sockets.

So, essentially, this is a really fast web server and web framework for Node.js. You can build small and big apps with it, and it works really well both for monoliths and for microservices.

Now, why would you need microservices? Because you need to scale teams. Microservices are a clear way of scaling teams, so that different teams can maintain different parts of your system without stepping on each other's toes. It's actually great.

However, one of the problems of microservices systems is their chattiness. In fact, all the microservices chat a lot with each other, because you often need data that is managed by some other microservice. So you actually have a lot of communication between the various microservices; from time to time, somebody will call this a microservices mesh. What we are going to focus on for most of this presentation is this link between microservices. I've been researching this problem for three, four, five years, something like that, so it has been brewing in my head for some time.

Even the most basic HTTP server is enough to show this. So let's consider a very simple server that just waits on a timeout of one millisecond. Very simple. This simulates a very fast database that always replies 'hello world' in one millisecond. And an HTTP client: the Node core HTTP client. Why are we focusing just on Node core HTTP? Well, because Axios, node-fetch, request, got, they all use it as their backing. And every single time you make a request like this, they create a TCP socket.
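Here is a minimal sketch of that setup. The exact code is my reconstruction from the description, not the speaker's slide, and the port is arbitrary:

```js
const http = require('http')

// Simulate a very fast database: every request is answered
// with "hello world" after one millisecond.
const server = http.createServer((req, res) => {
  setTimeout(() => {
    res.end('hello world')
  }, 1)
})

server.listen(3000, () => {
  // A bare-bones Node core client call. With the default agent
  // (no keep-alive in the Node.js versions of this talk), every
  // request like this pays for a fresh TCP handshake.
  http.get('http://localhost:3000', (res) => {
    res.on('data', (chunk) => process.stdout.write(chunk))
    res.on('end', () => server.close())
  })
})
```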

So, essentially, when a sender opens a TCP socket, the two peers need to do a little bit of a dance. Establishing the connection typically costs one full round trip, which is quite a lot, because depending on the latency, the physical distance between the two, it can take some time: 10, 20 milliseconds, something like that. We're talking small numbers, but remember, you have maybe 200 milliseconds to respond to your client, or maybe 400, whatever your budget is, and the more hops you do, the higher your latency gets. So you don't really want to waste time here, because once the three-way handshake has finished, you still haven't transferred any data; you have just created the socket. Consider that if you're using TLS or SSL it takes even longer. And that's not all, because once you create a TCP socket there is a concept called the congestion window, which starts small for newly established sockets.

3. Understanding TCP and Latency

Short description:

The server sends bytes to the client, which then needs to acknowledge them. As the congestion window grows, more bytes can be sent without acknowledgement. TCP's success lies in its ability to work on networks with varying bandwidth. The need for acknowledgements arises to prevent data loss, but it introduces latency.

So what happens, as you can see on the left, is that the server sends some bytes, then the client needs to ACK them, then the server sends more bytes, and so on and so forth. Once the congestion window has grown, it can send a whole lot of bytes without an ACK. This is the reason why TCP is so successful, by the way: it works on very low-bandwidth networks as well as very high-bandwidth ones. Why would you need all those ACKs? Because if you lose some messages in between, the unacknowledged data is the maximum amount you can lose. However, this comes at a cost, which is latency.

4. Maximizing Performance with HTTP Keep-Alive

Short description:

To maximize bandwidth, it is crucial to reuse existing connections in order to avoid losing the work done by the network layer. In Node.js, the HTTP 1.1 feature called KEEPALIVE allows for the reuse of HTTP sockets, which is particularly important for TLS. By using HTTP clients with Keep-Alive turned on, we can increase the performance and throughput of our applications. To test this, a scenario with one client making 500 parallel requests to a server was used, and the results showed significant improvements. However, it is important to note that these results may vary depending on the system, so it is recommended to run benchmarks to measure the actual impact.

So one of the key problems is this: if you want maximum bandwidth, you must reuse the existing connection. Once you have established a connection and sent some data, the congestion window has been increased, and it keeps growing over time, which is one of the greatest features of TCP. So we must reuse the existing connection. If you don't reuse connections, you are throwing away all the work the network layer has done for you.

In order to do that in Node.js, you need to use an HTTP/1.1 feature called keep-alive, which you can see here: you set keepAlive to true and you can set the maximum number of sockets to keep open. This enables you to reuse your HTTP sockets across requests. It's even more important with TLS, because you can avoid re-establishing the full crypto context, the secure context, between the two peers. So it's actually very, very important for TLS as well.
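As a sketch, enabling keep-alive on the core HTTP agent looks like this; the option names are real http.Agent options, and the limit of 50 sockets matches the scenario below:

```js
const http = require('http')

// Reuse sockets across requests instead of paying for a new
// TCP handshake (and TLS handshake, over HTTPS) every time.
const keepAliveAgent = new http.Agent({
  keepAlive: true, // keep sockets open after each response
  maxSockets: 50   // maximum concurrent sockets per origin
})

http.get(
  { host: 'localhost', port: 3000, path: '/', agent: keepAliveAgent },
  (res) => {
    res.resume() // drain the body so the socket returns to the pool
  }
)
```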

So this is the theory. We should be able to increase the performance, the throughput, of our applications just by using HTTP clients with keep-alive turned on. Is this the case? Well, let's see. Scenario: we have one client that calls one server with 500 parallel requests on the same route, which is more or less equal to 500 parallel inbound requests. The target server takes 10 milliseconds to process each request, and the client sets a limit of 50 sockets. This is completely synthetic, okay? It doesn't match your system, so always measure this stuff. Don't trust me. Run your benchmarks.
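Here is a rough sketch of that synthetic scenario. It is not the speaker's actual benchmark harness, and a proper tool such as autocannon would give more rigorous numbers:

```js
const http = require('http')

const agent = new http.Agent({ keepAlive: true, maxSockets: 50 })

function get () {
  return new Promise((resolve, reject) => {
    http
      .get({ host: 'localhost', port: 3000, path: '/', agent }, (res) => {
        res.resume()
        res.on('end', resolve)
      })
      .on('error', reject)
  })
}

async function run () {
  const start = process.hrtime.bigint()
  // 500 "parallel" requests, funneled through at most 50 sockets.
  await Promise.all(Array.from({ length: 500 }, get))
  const ms = Number(process.hrtime.bigint() - start) / 1e6
  console.log(`500 requests in ${ms.toFixed(1)} ms`)
  agent.destroy() // close the pooled sockets so the process can exit
}

run()
```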

5. HTTP 1.1 Pipelining and Reliability

Short description:

Always use an agent. HTTP 1.1 pipelining is an obscure feature that can be used on the server but not on the Node.js HTTP client. It suffers from head of line blocking and only works for small files. It doesn't work well on unreliable links, but is less of a problem in reliable data centers.

Now, this is the difference between the two. If you forget everything else from this talk, remember this: always use an agent. That's it. That's the only thing you need to remember. Always use an agent. The difference is so massive that you can't even consider not using one.

But can we still improve things? I've been researching this topic for a while, so I might have some more things to say. Well, yes, we can. In fact, there is something called HTTP/1.1 pipelining. Now, this is one of the most obscure features of HTTP/1.1, and something people will tell you not to use: it's wrong, browsers don't use it, it's not supported by browsers. But it's part of the standard, and you can actually use it on the server. The Node.js HTTP server supports pipelining out of the box; you don't need to do anything to enable it. The Node.js HTTP client, however, does not, so you need something else to use this technique.

In HTTP/1.1 pipelining, all responses must be received in order. This means that you suffer from head-of-line blocking: a slow request can stall the pipeline. Essentially, if the first thing you ask for is very slow, all the other requests get backed up, waiting until the first one finishes. So this is a problem. HTTP pipelining also only works well for small files, because the real problem is always retransmits: the moment you start losing packets, everything goes nuts. So it doesn't really work well on unreliable links. However, our own data centers are actually reliable links. When we call from one microservice to the next, those links, those connections, those sockets are very reliable. They don't fail. It's not somebody moving around with their iPhone, hopping between cells so the connection goes on and off. They're very reliable in the data center, so this is much less of a problem there.

6. Node.js Event Loop and Performance

Short description:

The Node.js event loop works by scheduling I.O. to be done asynchronously. The event loop waits for something to happen and then calls back into C++. To improve application performance, it is important to minimize the time blocking the event loop. Flame graphs can be used to visualize and minimize function calls.

Now, one more thing to be concerned with. It's very important to remember how the Node.js event loop works. Whenever you get a TCP socket, whenever you do any I/O, essentially your JavaScript code schedules some I/O to be done asynchronously, and the callback is queued on the event queue. In practice, what does this mean? It means the event loop waits for something to happen, then it calls into C++, and from C++ it calls JavaScript. JavaScript finishes, processes next ticks and promises and so forth, returns to C++, and then the event loop starts waiting again. Now, there is a window between those two points where the event loop is blocked: while the C++ and JavaScript functions are executing. So what does this mean? If we want to improve the performance of our applications, we need to minimize the time during which we block the event loop. That is the key strategy to increase throughput. You can use flame graphs to visualize the functions on the hot path and minimize them. It's pretty great, it works very well.
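One way to observe this, as a small sketch, is Node's built-in event-loop delay monitor from perf_hooks:

```js
const { monitorEventLoopDelay } = require('perf_hooks')

// Samples how long the event loop stays blocked between iterations.
const histogram = monitorEventLoopDelay({ resolution: 20 })
histogram.enable()

setTimeout(() => {
  histogram.disable()
  // The histogram reports values in nanoseconds.
  console.log('mean delay:', (histogram.mean / 1e6).toFixed(2), 'ms')
  console.log('max delay:', (histogram.max / 1e6).toFixed(2), 'ms')
}, 5000)
```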

7. Timeouts, ECONNRESETs, and Introducing Undici

Short description:

Now, this is one of the problems we are trying to solve - timeouts and ECONNRESETs. If you use agents, you might end up having ECONNRESETs and timeouts. The problem is that a socket might die, and you want to minimize this. By default, Node.js used a FIFO strategy, but recently a new scheduling strategy called LIFO was added to the HTTP agent in Node.js. This LIFO approach minimizes timeouts and ECONNRESETs by reusing the sockets that worked most recently. Now, let me introduce you to a new library called undici, whose name comes from HTTP/1.1: 'eleven' translates to 'undici' in Italian.

Now, the next problem we are trying to solve, and this is the bonus point, is timeouts and ECONNRESETs. If you use agents, okay, you might end up getting ECONNRESETs and timeouts. The problem is that the sockets you are reusing might die, and if they die, you might need to reschedule the request.

If a socket dies, it's a problem, because it can happen that I say, oh, I'm sending data on this socket, I'm trying to reuse it, and when I use it, it dies. It's not available anymore, and I get an ECONNRESET. It's really bad. So you want to minimize this. By default, the original strategy in Node.js was FIFO, first in, first out, which means it rotated through all the sockets to try to create the fewest of them.

Now, the problem with that approach is that the oldest sockets are the ones most likely to time out, precisely because they are old. So, recently, we added a new scheduling strategy for the HTTP agent in Node.js. It's called LIFO, last in, first out, meaning the socket used most recently is the one we try to reuse first. This means we will actually create more sockets, because essentially we let more sockets expire. However, the LIFO approach minimizes the number of timeouts and ECONNRESETs, because you reuse the socket that worked last time. You know that it works, it's right there, so it's way, way more probable that your request goes through.
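Opting into the strategy explicitly looks like this. The scheduling option is a real http.Agent option (added around Node.js v14.5.0; 'lifo' is the default in current versions):

```js
const http = require('http')

const agent = new http.Agent({
  keepAlive: true,
  maxSockets: 50,
  scheduling: 'lifo' // reuse the most recently released socket first
})
```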

Okay, so now we know all of these things. Let me introduce you to a new library called undici. So what is undici? Well, the name comes from HTTP/1.1, from eleven. In Italian, eleven translates to 'undici'. So that's why undici.

8. Introduction to Undici

Short description:

Undici is a brand new client for Node.js that uses Node core internals. It supports both HTTP and HTTPS, uses LIFO scheduling, and allows unlimited connections. It eliminates the need for multiple agents and provides easy configuration options.

And you know it's also totally a Stranger Things reference, in case you're wondering. So what does undici do? Undici is a brand new client for Node.js, implemented from scratch using only Node core internals. It's great. You can just use it with a global agent that keeps your connections alive by default. By default it uses LIFO scheduling, no pipelining, and unlimited connections, so it will work more or less the way you're used to. It supports both HTTP and HTTPS at the same time; there's no need for shenanigans with multiple agents and so on. It just does it all. And that's pretty cool. You can configure the agent, or you can use a client directly. It works really well, from my point of view.
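A short usage sketch with undici's public API; the connections and pipelining values here are illustrative, and the URL assumes the toy server from earlier:

```js
const { request, Agent, setGlobalDispatcher } = require('undici')

// A global dispatcher with keep-alive connections and pipelining.
setGlobalDispatcher(new Agent({
  connections: 50, // sockets per origin
  pipelining: 10   // requests in flight per socket
}))

async function main () {
  const { statusCode, body } = await request('http://localhost:3000/')
  console.log('status:', statusCode)
  for await (const chunk of body) {
    process.stdout.write(chunk)
  }
}

main().catch(console.error)
```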
