Multithreaded Logging with Pino

Rate this content
Bookmark

Almost every developer thinks that adding one more log line would not decrease the performance of their server... until logging becomes the biggest bottleneck for their systems! We created one of the fastest JSON loggers for Node.js: pino. One of our key decisions was to remove all "transport" to another process (or infrastructure): it reduced both CPU and memory consumption, removing any bottleneck from logging. However, this created friction and lowered the developer experience of using Pino and in-process transports is the most asked feature our user.

In the upcoming version 7, we will solve this problem and increase throughput at the same time: we are introducing pino.transport() to start a worker thread that you can use to transfer your logs safely to other destinations, without sacrificing neither performance nor the developer experience.

19 min
11 Jun, 2021

Video Summary and Transcription

Today's Talk is about logging with Pino, one of the fastest loggers for Node.js. Pino's speed and performance are achieved by avoiding expensive logging and optimizing event loop processing. It offers advanced features like async mode and distributed logging. The use of Worker Threads and Threadstream allows for efficient data processing. Pino.Transport enables log processing in a worker thread with various options for log destinations. The Talk concludes with a demonstration of logging output and an invitation to reach out for job opportunities.

Available in Español

1. Introduction to Pino Logging

Short description:

Today I'm going to talk about logging. I work for Nearform, a professional services company based in Ireland. I'm a member of the technical steering committee and co-creator of Fastify and Pino. Pino is one of the fastest loggers for Node.js with a minimal feature set. It has been downloaded 6 million times per month and we are working on version seven. Using Pino is simple with a factory function and it helps reduce the hover rate of logging inside your application.

Hi, everyone, I am Matteo Collina. Today I'm going to talk about one of my favorite topics, logging. Just a little bit of intro of me before we start. I'm Matteo Collina, at Matteo Collina on Twitter. I work for a company called Nearform. We are a professional services company based in Ireland. I'm also a member of the technical steering committee and I'm the co-creator of Fastify and these libraries we're talking today, Pino. I'm also a software architect, so I typically help companies run Node.js in production. I also write every week on a new my newsletter at nodeland.dev, please subscribe.

Anyway, as part of my, oh, this is actually important. Maybe I have a lot of downloads on NPM. I ended up maintaining a lot of things in the ecosystem because as my activity as a consultant, I tend to balance between client work and the need of the client with the maintaining Node.js as ecosystem, and that is a good synergy. So I end up implementing new libraries and new modules to solve problems that my clients have, and I bring those new things to my clients. As part of this initiative I built up various things like Fastify and Pino. Pino, what's Pino? Pino is one of the fastest logger for Node.js. It does a minimal feature set, so we'll talk a little bit about the minimal, and it is a great community. Pino's now been downloaded 6 million times per month, and we have four collaborators. Maybe three, but whatever, four collaborators. And it's version six, and we are working on version seven. This talk, it's about version seven. Hey, this is actually nice. So new things coming through.

How do you use Pino? Using Pino is very simple. You can just create your logger. Use just a factory function, so you just create your logger, and then you start doing info, warn, error, debug, those things that you like. We also have this child logger functionality that enables you to create a new child with a certain set of properties already pre-populated, and it's a newline delimited JSON logger, so it produces newline delimited JSON, so it's a JSON followed by a newline, and another JSON followed by a newline. Pretty interesting. It's fast, and you can use pnode to drastically reduce the hover rate of logging inside your application. You cannot really bring it to zero because it's still doing IO in the end, but it helps reducing it quite a lot. Know that logging should be fast.

2. The Speed and Performance of Pina Logging

Short description:

You should not have very expensive logging because if you have very expensive logging, then you will be inclined to log less and not more, and you need more observability most of the time. Pina's been around for a bit, so, hey, happy days. And most of what we say is still modern. The first bit that I want to cover about Pina is how come it is so fast. This diagram explains the event loop and how to achieve a very performant IEO in Node.js and authorized performance Node.js applications. When that lands in the JavaScript land, it means that, well, before doing that, the event loop called some C++. So in the side, the C++ code inside Node.js, then we kick off the JavaScript function, which is a function that gets executed, which in turn, might resolve a promise or schedule a next tick. And therefore, we have to process also the microtick queues. From the moment we move from the... We start calling inside C++ to the moment that we return, we let C++ return, the event loop is blocked. So there's no more event, no nothing. There's nothing happening IO-wise in the system. One of the... What happens if you have a slow IO operation? So let's imagine in the context of login that you're writing to a destination and you need to do some mangling and processing of your logs before moving them out. Now because of, you know, and you're writing all those logs and you have a slow, then therefore you have a slow IO operation and the amount of tasks, concurrent tasks increase. But if the concurrent task, asynchronous concurrent tasks increase, then it also increase the memory consumption of your application.

You should not have very expensive logging because if you have very expensive logging, then you will be inclined to log less and not more, and you need more observability most of the time.

There is this nice talk called the Cost of Logging that my partner in crime, David Marklamas, and myself have done at node-conf.eu 2016. It's a long time ago, you know? But Pina's been around for a bit, so, hey, happy days. And you can check it out because it was a really good... It's a really good talk. And most of what we say is still modern. There was a bunch of things that are different and we're going to change in v7 that we're going to talk about soon.

So, well, the first bit that I want to cover about Pina in how come it is so fast. How can it be fast? Well, I just want to give you a little bit of overall. And thank you, James. These slides are amazing. This diagram is probably one of the best ones we have ever done. This diagram explains the event loop and explains when we're doing IEO and how can we achieve and how to achieve a very performant IEO in Node.js and authorized performance Node.js applications.

So first and foremost, you have data that is coming on the event loop. So you have data, you have events coming in. So some IEO has finished or maybe an HTTP request has arrived. When that lands in the JavaScript land, it means that, well, before doing that, the event loop called some C++. So in the side, the C++ code inside Node.js, then we kick off the JavaScript function, which is a function that gets executed, which in turn, might resolve a promise or schedule a next tick. And therefore, we have to process also the microtick queues. After all of that is said and done, we go back into the event loop. From the moment we move from the... We start calling inside C++ to the moment that we return, we let C++ return, the event loop is blocked. So there's no more event, no nothing. There's nothing happening IO-wise in the system. This is not great. Well, or it is great in the sense that it's, you know, if I want to make it fast, if you make my code fast, the time I spend executing JavaScript needs to be as small as possible. So... One of the... What happens in, you know, if you have a slow IO operation? So let's imagine in the context of login that you're writing to a destination and you need to do some mangling and processing of your logs before moving them out. Now because of, you know, and you're writing all those logs and you have a slow, then therefore you have a slow IO operation and you, the amount of tasks, concurrent tasks increase, right? But if the concurrent task, asynchronous concurrent tasks increase, then it also increase the memory consumption of your application.

3. Pino Logging Performance and Advanced Features

Short description:

If the memory consumption increases, more garbage collection is required, which runs on the CPU. This can lead to a slowdown in CPU and a buildup of events, potentially causing a blockage. To ensure good throughput, Pino processes data synchronously, avoiding memory leaks. Pino also allows for async mode, although it may affect debugging. In 2016, Pino proposed a different approach to logging, recommending separate processes or relying on infrastructure for log transport. However, Pino now allows for distribution within the same process without blocking the main thread. Users have requested features such as log formatting, sending logs to remote destinations, implementing log rotation, and sending multiple logs to multiple places simultaneously.

But if the memory consumption increase, you have more garbage collection to do, which runs on the CPU. And so when the CPU this seed runs, the GC grows, the memory grows, the GC grows, grows in usage and then the CPU gets filled in by the garbage collection activity to do. Now, that's the problem, you see, because latency and throughput are deeply connected. So if I have a slow operation in my code, then it will increase concurrency, which will increase memory pressure, which will slow down the CPU. And it's possible that because of that slowdown in CPU, the buildup of events to process will go up again and enter some sort of catastrophic cycle where until the pressure is relieved somewhere, everything will be blocked.

So, you know, the trick here is to make sure that most of the process is done as fast as it's possibly can. That's it, that's the answer to this. And it does not schedule any more work. Now, these means in the context of logging, for example, that if you are having, if you want to send the same log line to multiple destination, it will be problematic because we have, or especially even over a network, because if we try to do that, that data will stay alive for longer and we are creating actually more work for our event loop. In Pino, we do all the data processing synchronously. So whenever you call .info whatever, you can be certain that by using the default, that all the processing is as done within the macro tick. So there's nothing scheduled to be executed later. This is phenomenal and really important for getting good throughput, because we make sure that there is no memory located left and it gets essentially cycled very easily. Now you could turn on the async mode in Pino so that the logging is then flashed after a bit, and it's not written immediately, but it's a little bit tougher on the debugging side of things. We'll talk about that in a second.

In the original presentation about Pino from 2016, we had this light about the fact that Pino is more than a module, it's a way of life. Well, Pino, it is more than a module. And at the beginning, we flagged this because it was clear that we were proposing something radically different. And what were we proposing in 2016? Well, we were telling people, well, look, you need to implement your transport as separate processes or rely on your infra. So you log on standard output and your infra picks standard output and send it somewhere else. Yes, we still recommend that, that has not changed. However, several teams have reached out to us and us as well, but we really need to execute the distribution in our process. The typical target is sending your log to data dog or elastic search or something like that. Well, Pino allows you to do that. Well, Pino will soon allow you to do that from the same process without blocking the main thread. We'll see that in a moment. You know, what happened was that all people started asking us for the same features. They wanted to format their logs, they want to send it to remote destination. They wanted to implement log rotation, this was a big thing. And really they wanted to send multiple logs to multiple places at the same time.

4. Using Worker Thread and Threadstream

Short description:

We could have used Worker Thread, which are now stable in all long-term support release lines of Node.js 12 and 14. They offer easy synchronization primitives like Atomics. I started writing a library called Threadstream, which wraps a Worker Thread in a stream-based API. It's fast and allows writing strings and buffers. By providing a writable stream, you can send data to the WorkerThread. This provides a way to send data to my WorkerThread as a stream.

By us saying, well, you should do all those things out of process. We were saying to all those teams, well, you need to have your operations people or your DevOps to take care of those things. It was harder for most teams, to be honest. So, and users keep asking these questions every day, every day, every day. So hey, what could we do? Well, there was something that we could have done. We could have used Worker Thread. The Worker Thread are now stable in all long-term support release lines of Node.js 12 and 14, of course. And they offer some easy synchronization primitives, like, for example, Atomics. And it's great. And maybe we could use them to build transports, right? Wow, well, how? First of all, this idea is not new. It was originally showed to me by Mark Martin at a conference a few years back, where he was actually using a native add-on to do most of this work. And it was showing a really good potential in terms of throughput and performance. So it was pretty great. So what I did was, I started writing this library called Threadstream. What does Threadstream do? Well, it essentially wraps a Worker Thread in a stream-based API. So it's not purely a writable stream because it does not inherit from a writable, but it's fast, and you can write strings to it. You can write buffers. So basically, you call streams write and you just start writing to your Thread. And on the other Thread, you need to provide a writable stream in. That's it. That's the only interface that you need to provide. So you can, once you have done this thing, that you have provided both of those, it's actually pretty great because then now I have one way to send data to my WorkerThread as a stream. But you know, a stream is the interface the PNO have. So let's see how to use that.

5. Using Pinot Transport for Log Processing

Short description:

We want our main Thread to send logs to our WorkerThread. They communicate using a Ring Buffer, allowing for writing and consuming without blocking. Pinot.Transport is a new function that allows log processing in a worker thread. It offers various options for log destinations and pretty printing. A demo showcases the basic example of Pino and the use of transports for multiple destinations.

Well, what we do is we want to have our main Thread send logs to our WorkerThread. That's what we do. So in AV Node app, we will have two Threads, one for main and one for the logs processing. How do they communicate? They communicate using a Ring Buffer. Using a Ring Buffer allows them to write to this buffer mostly without locking the buffer itself. So one can write and the other one can consume without blocking between them.

So, and I can always determine the part to read from that work. So I'm going to introduce you to Pinot Transport now. I built it up all this tension and hopefully we release it right now. So Pinot.Transport is a new function inside Pinot that allow you to create, to do log processing in a remote transport, sorry, in a worker thread that is wrapped as a transport. You can use modules, you can specify either a dependency or a single file with an absolute path, or you can use a destination file or you can use a console and pretty print. So that's it. So you can actually do a lot of those things in different ways.

So it's time for a demo. So what I'm going to do, I'm going to open up Kibana and we'll see our way. So it is Kibana, we have last five minutes, this is nothing. And I have my script. So first of all I'm going to show you the example and the example, this is the basic example of Pino, these just log things out and to standard output. So what we, what we were used to do is no example and you see it prints this, but then I could do Pino Pretty and then get pretty printing. Note that what I've done here is, ah, that's it okay. So what I've done here I have cold executed Node Modules Bean, dot bean slash Pino Pretty to prettify and colorize my locks. Now what I can do instead is with the transports is these new, oop yay, opened anyway, yay. And is use this new technique where I am creating my transport I'm specifying a file for one destination. I'm specifying I'm going to use a Pino Elasticsearch to send the data to Elastic specifying that the Elasticsearcher node and doing pretty print. So let's see how if that works. So I've been doing and oh note that this is the rest does not change at all. It's the exact same thing. And it's pretty cool, right? Because I can write my transport, not keep my app but doing multiple destinations in the same process. Well, let's run this. So if I run example transports, I got the exact same...

6. Logging Output and Conclusion

Short description:

I got the exact same output as before. I only got these log lines added because it shows only the warning logs. And now you can see that they also have logs inside Elasticsearch. It's really nice because we have all our data in an easy-to-process fashion with Kibana. Thank you for watching this talk. If you have more questions, please reach out to me. We are hiring for all sorts of roles.

I got the exact same output as before. I only got these log lines added because in fact, I can, you know, this shows only the warning logs. Okay, here it is. It's pretty great. And then we can see if these were updated. And now you can see that they also have logs in inside Elasticsearch. And it's really nice because you can see that we have all our data in a really easy to process faction which Kibana gives us. So it's pretty great.

Cool, I just wanted to thank you for watching this talk. If you have any more questions about Pino and Fastify and Node.js, please reach out to me or ping me the email. Oh, by the way, we are hiring for all sorts of roles.

Check out more articles and videos

We constantly think of articles and videos that might spark Git people interest / skill us up or help building a stellar career

Node Congress 2022Node Congress 2022
26 min
It's a Jungle Out There: What's Really Going on Inside Your Node_Modules Folder
Top Content
Do you know what’s really going on in your node_modules folder? Software supply chain attacks have exploded over the past 12 months and they’re only accelerating in 2022 and beyond. We’ll dive into examples of recent supply chain attacks and what concrete steps you can take to protect your team from this emerging threat.
You can check the slides for Feross' talk here.
Node Congress 2022Node Congress 2022
34 min
Out of the Box Node.js Diagnostics
In the early years of Node.js, diagnostics and debugging were considerable pain points. Modern versions of Node have improved considerably in these areas. Features like async stack traces, heap snapshots, and CPU profiling no longer require third party modules or modifications to application source code. This talk explores the various diagnostic features that have recently been built into Node.
You can check the slides for Colin's talk here. 
JSNation 2023JSNation 2023
22 min
ESM Loaders: Enhancing Module Loading in Node.js
Native ESM support for Node.js was a chance for the Node.js project to release official support for enhancing the module loading experience, to enable use cases such as on the fly transpilation, module stubbing, support for loading modules from HTTP, and monitoring.
While CommonJS has support for all this, it was never officially supported and was done by hacking into the Node.js runtime code. ESM has fixed all this. We will look at the architecture of ESM loading in Node.js, and discuss the loader API that supports enhancing it. We will also look into advanced features such as loader chaining and off thread execution.
Node Congress 2023Node Congress 2023
30 min
Building a modular monolith with Fastify
In my journey through Nodeland, I saw most teams struggling with the free-form nature of Node.js development: there are no guardrails for maximum flexibility. Yet, not all paths offer a smooth ride.
How to build applications that are well-organized, testable, and extendable? How could we build a codebase that would stand the test of time?
In this talk, we will explore how to avoid the trap of Singletons to create robust Node.js applications through the use of Fastify plugins: we will build a modular monolith!

Workshops on related topic

Node Congress 2023Node Congress 2023
109 min
Node.js Masterclass
Workshop
Have you ever struggled with designing and structuring your Node.js applications? Building applications that are well organised, testable and extendable is not always easy. It can often turn out to be a lot more complicated than you expect it to be. In this live event Matteo will show you how he builds Node.js applications from scratch. You’ll learn how he approaches application design, and the philosophies that he applies to create modular, maintainable and effective applications.

Level: intermediate
Node Congress 2023Node Congress 2023
63 min
0 to Auth in an Hour Using NodeJS SDK
WorkshopFree
Passwordless authentication may seem complex, but it is simple to add it to any app using the right tool.
We will enhance a full-stack JS application (Node.JS backend + React frontend) to authenticate users with OAuth (social login) and One Time Passwords (email), including:- User authentication - Managing user interactions, returning session / refresh JWTs- Session management and validation - Storing the session for subsequent client requests, validating / refreshing sessions
At the end of the workshop, we will also touch on another approach to code authentication using frontend Descope Flows (drag-and-drop workflows), while keeping only session validation in the backend. With this, we will also show how easy it is to enable biometrics and other passwordless authentication methods.
Table of contents- A quick intro to core authentication concepts- Coding- Why passwordless matters
Prerequisites- IDE for your choice- Node 18 or higher
JSNation 2023JSNation 2023
104 min
Build and Deploy a Backend With Fastify & Platformatic
WorkshopFree
Platformatic allows you to rapidly develop GraphQL and REST APIs with minimal effort. The best part is that it also allows you to unleash the full potential of Node.js and Fastify whenever you need to. You can fully customise a Platformatic application by writing your own additional features and plugins. In the workshop, we’ll cover both our Open Source modules and our Cloud offering:- Platformatic OSS (open-source software) — Tools and libraries for rapidly building robust applications with Node.js (https://oss.platformatic.dev/).- Platformatic Cloud (currently in beta) — Our hosting platform that includes features such as preview apps, built-in metrics and integration with your Git flow (https://platformatic.dev/). 
In this workshop you'll learn how to develop APIs with Fastify and deploy them to the Platformatic Cloud.
JSNation Live 2021JSNation Live 2021
156 min
Building a Hyper Fast Web Server with Deno
WorkshopFree
Deno 1.9 introduced a new web server API that takes advantage of Hyper, a fast and correct HTTP implementation for Rust. Using this API instead of the std/http implementation increases performance and provides support for HTTP2. In this workshop, learn how to create a web server utilizing Hyper under the hood and boost the performance for your web apps.
React Summit 2022React Summit 2022
164 min
GraphQL - From Zero to Hero in 3 hours
Workshop
How to build a fullstack GraphQL application (Postgres + NestJs + React) in the shortest time possible.
All beginnings are hard. Even harder than choosing the technology is often developing a suitable architecture. Especially when it comes to GraphQL.
In this workshop, you will get a variety of best practices that you would normally have to work through over a number of projects - all in just three hours.
If you've always wanted to participate in a hackathon to get something up and running in the shortest amount of time - then take an active part in this workshop, and participate in the thought processes of the trainer.
TestJS Summit 2023TestJS Summit 2023
78 min
Mastering Node.js Test Runner
Workshop
Node.js test runner is modern, fast, and doesn't require additional libraries, but understanding and using it well can be tricky. You will learn how to use Node.js test runner to its full potential. We'll show you how it compares to other tools, how to set it up, and how to run your tests effectively. During the workshop, we'll do exercises to help you get comfortable with filtering, using native assertions, running tests in parallel, using CLI, and more. We'll also talk about working with TypeScript, making custom reports, and code coverage.