1. Introduction to Pino Logging
Today I'm going to talk about logging. I work for Nearform, a professional services company based in Ireland. I'm a member of the Node.js Technical Steering Committee and co-creator of Fastify and Pino. Pino is one of the fastest loggers for Node.js, with a minimal feature set. It is downloaded 6 million times per month and we are working on version seven. Using Pino is simple: you create a logger with a factory function, and it helps reduce the overhead of logging inside your application.
Hi, everyone, I am Matteo Collina. Today I'm going to talk about one of my favorite topics, logging. Just a little bit of intro about me before we start. I'm Matteo Collina, @matteocollina on Twitter. I work for a company called Nearform. We are a professional services company based in Ireland. I'm also a member of the Node.js Technical Steering Committee and I'm the co-creator of Fastify and of the library we're talking about today, Pino. I'm also a software architect, so I typically help companies run Node.js in production. I also write every week in my new newsletter at nodeland.dev, please subscribe.
Anyway, as part of my, oh, this is actually important. I have a lot of downloads on npm. I ended up maintaining a lot of things in the ecosystem because, in my activity as a consultant, I tend to balance between client work and the needs of the client and maintaining the Node.js ecosystem, and that is a good synergy. So I end up implementing new libraries and new modules to solve problems that my clients have, and I bring those new things to my clients. As part of this initiative I built up various things like Fastify and Pino. Pino, what's Pino? Pino is one of the fastest loggers for Node.js. It has a minimal feature set, so we'll talk a little bit about the minimal, and it has a great community. Pino is now downloaded 6 million times per month, and we have four collaborators. Maybe three, but whatever, four collaborators. And it's at version six, and we are working on version seven. This talk is about version seven. Hey, this is actually nice. So new things coming through.
How do you use Pino? Using Pino is very simple. You use a factory function to create your logger, and then you start doing info, warn, error, debug, those things that you like. We also have this child logger functionality that enables you to create a new child with a certain set of properties already pre-populated. And it's a newline-delimited JSON logger, so it produces a JSON object followed by a newline, and another JSON object followed by a newline. Pretty interesting. It's fast, and you can use Pino to drastically reduce the overhead of logging inside your application. You cannot really bring it to zero because it's still doing I/O in the end, but it helps reduce it quite a lot. Note that logging should be fast.
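To make the output format concrete, here is a minimal sketch of a newline-delimited JSON logger with child loggers, using only Node.js built-ins. It is not Pino's implementation: `createLogger`, its `write` option, and the sample properties are hypothetical names chosen for illustration. The numeric level values match the ones Pino uses, and the trailing comments show the equivalent calls with Pino itself.

```javascript
// Minimal sketch of a newline-delimited JSON logger with child loggers.
// NOT Pino's implementation: createLogger and its options are hypothetical.
const LEVELS = { debug: 20, info: 30, warn: 40, error: 50 };

function createLogger({
  level = 'info',
  bindings = {},
  write = (line) => process.stdout.write(line),
} = {}) {
  const logger = {
    // A child carries the parent's properties plus its own, pre-populated.
    child(extra) {
      return createLogger({ level, bindings: { ...bindings, ...extra }, write });
    },
  };
  for (const [name, value] of Object.entries(LEVELS)) {
    logger[name] = (msg) => {
      if (value < LEVELS[level]) return; // below the configured level: skip
      // Serialization and the write call both happen synchronously here.
      write(JSON.stringify({ level: value, time: Date.now(), ...bindings, msg }) + '\n');
    };
  }
  return logger;
}

// Collect the lines in memory so they can be inspected, then print them.
const lines = [];
const logger = createLogger({ write: (line) => lines.push(line) });
logger.info('hello world');
logger.debug('not emitted at the default level');
const child = logger.child({ requestId: 'abc' });
child.warn('something looks off');
process.stdout.write(lines.join(''));

// With Pino installed, the equivalent calls would be:
//   const logger = require('pino')();
//   logger.info('hello world');
//   const child = logger.child({ requestId: 'abc' });
//   child.warn('something looks off');
```

Each call produces exactly one line, which is what makes the output easy to pipe into other tools or parse line by line.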
2. The Speed and Performance of Pino Logging
You should not have very expensive logging because if you have very expensive logging, then you will be inclined to log less and not more, and you need more observability most of the time.
There is this nice talk called The Cost of Logging that my partner in crime, David Mark Clements, and I gave at NodeConf EU 2016. It's a long time ago, you know? But Pino's been around for a bit, so, hey, happy days. And you can check it out because it's a really good talk. And most of what we say is still current. There are a bunch of things that are different and that we're going to change in v7, which we're going to talk about soon.
So, well, the first bit that I want to cover about Pino is how come it is so fast. How can it be fast? Well, I just want to give you a little bit of an overview. And thank you, James. These slides are amazing. This diagram is probably one of the best ones we have ever done. This diagram explains the event loop, and explains when we're doing I/O, how to achieve very performant I/O in Node.js, and how to write performant Node.js applications.
3. Pino Logging Performance and Advanced Features
If the memory consumption increases, more garbage collection is required, which runs on the CPU. This can slow down the CPU and cause a buildup of events, potentially blocking everything. To ensure good throughput, Pino processes data synchronously, so no data is kept alive waiting for work scheduled later. Pino also offers an async mode, although it makes debugging harder. In 2016, Pino proposed a different approach to logging, recommending separate processes or relying on infrastructure for log transport. That is still the recommendation, but Pino will soon allow distribution within the same process without blocking the main thread. Users have requested features such as log formatting, sending logs to remote destinations, implementing log rotation, and sending logs to multiple places simultaneously.
But if the memory consumption increases, you have more garbage collection to do, which runs on the CPU. And so as the memory grows, the GC grows in usage, and then the CPU gets filled up by the garbage collection activity it has to do. Now, that's the problem, you see, because latency and throughput are deeply connected. So if I have a slow operation in my code, then it will increase concurrency, which will increase memory pressure, which will slow down the CPU. And it's possible that because of that slowdown in CPU, the buildup of events to process will go up again and enter some sort of catastrophic cycle where, until the pressure is relieved somewhere, everything will be blocked.
So, you know, the trick here is to make sure that most of the processing is done as fast as it possibly can. That's it, that's the answer to this. And that it does not schedule any more work. Now, this means, in the context of logging, for example, that if you want to send the same log line to multiple destinations, it will be problematic, especially over a network, because if we try to do that, that data will stay alive for longer and we are actually creating more work for our event loop. In Pino, we do all the data processing synchronously. So whenever you call .info or whatever, you can be certain that, by using the defaults, all the processing is done within the same macro tick. So there's nothing scheduled to be executed later. This is phenomenal and really important for getting good throughput, because we make sure that there is no allocated memory left around and it gets collected very easily. Now, you could turn on the async mode in Pino so that the logging is flushed after a bit and not written immediately, but it's a little bit tougher on the debugging side of things. We'll talk about that in a second.
In the original presentation about Pino from 2016, we had this slide about the fact that Pino is more than a module, it's a way of life. Well, Pino is more than a module. And at the beginning, we flagged this because it was clear that we were proposing something radically different. And what were we proposing in 2016? Well, we were telling people, look, you need to implement your transports as separate processes or rely on your infra. So you log to standard output, and your infra picks up standard output and sends it somewhere else. Yes, we still recommend that, that has not changed. However, several teams have reached out to us and said, well, we really need to execute the distribution in our process. The typical target is sending your logs to Datadog or Elasticsearch or something like that. Well, Pino allows you to do that. Well, Pino will soon allow you to do that from the same process without blocking the main thread. We'll see that in a moment. You know, what happened was that all these people started asking us for the same features. They wanted to format their logs, they wanted to send them to a remote destination. They wanted to implement log rotation, this was a big thing. And really they wanted to send their logs to multiple places at the same time.
4. Using Worker Thread and Threadstream
We could use Worker Threads, which are now stable in all long-term support release lines of Node.js, 12 and 14. They offer easy synchronization primitives like Atomics. I started writing a library called Threadstream, which wraps a Worker Thread in a stream-based API. It's fast and allows writing strings and buffers. On the worker side you only need to provide a writable stream, which gives me a way to send data to my Worker Thread as a stream.
By saying, well, you should do all those things out of process, we were saying to all those teams, well, you need to have your operations people or your DevOps take care of those things. It was harder for most teams, to be honest. And users keep asking these questions every day, every day, every day. So hey, what could we do? Well, there was something that we could do. We could use Worker Threads. Worker Threads are now stable in all long-term support release lines of Node.js, 12 and 14, of course. And they offer some easy synchronization primitives, like, for example, Atomics. And it's great. And maybe we could use them to build transports, right? Wow, well, how? First of all, this idea is not new. It was originally shown to me by Mark Martin at a conference a few years back, where he was actually using a native add-on to do most of this work. And it was showing really good potential in terms of throughput and performance. So it was pretty great. So what I did was, I started writing this library called Threadstream. What does Threadstream do? Well, it essentially wraps a Worker Thread in a stream-based API. It's not purely a writable stream, because it does not inherit from Writable, but it's fast, and you can write strings to it. You can write buffers. So basically, you call stream.write and you just start writing to your thread. And on the other thread, you need to provide a writable stream. That's it. That's the only interface that you need to provide. Once you have provided both of those, it's actually pretty great, because now I have a way to send data to my Worker Thread as a stream. And, you know, a stream is the interface that Pino has. So let's see how to use that.
5. Using Pino Transport for Log Processing
We want our main thread to send logs to our Worker Thread. They communicate using a ring buffer, allowing one side to write and the other to consume without blocking each other. pino.transport is a new function that allows log processing in a worker thread. It offers various options for log destinations and pretty printing. A demo showcases the basic example of Pino and the use of transports for multiple destinations.
Well, what we do is we want to have our main thread send logs to our Worker Thread. That's what we do. So in every Node app, we will have two threads, one for main and one for the log processing. How do they communicate? They communicate using a ring buffer. Using a ring buffer allows them to write to this buffer mostly without locking the buffer itself. So one can write and the other one can consume without blocking each other.
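The ring buffer idea can be sketched with a `SharedArrayBuffer` and `Atomics`. This is an illustrative single-producer, single-consumer buffer, not the actual layout or protocol used by Threadstream: the `RingBuffer` class, the index layout, and the buffer size are all assumptions. For simplicity both ends run in the same thread here; in a real setup the consumer would live in the Worker Thread and receive the same shared memory.

```javascript
// Sketch of a single-producer, single-consumer ring buffer over shared memory.
// Hypothetical names and layout; the real library may work differently.
const INDEX_BYTES = 8; // two Int32 slots: [0] = read index, [1] = write index
const READ = 0;
const WRITE = 1;

class RingBuffer {
  constructor(shared) {
    this.index = new Int32Array(shared, 0, 2);
    this.data = new Uint8Array(shared, INDEX_BYTES);
    this.capacity = this.data.length;
  }

  // Producer side: returns false instead of blocking when there is no room.
  write(text) {
    const bytes = new TextEncoder().encode(text);
    const read = Atomics.load(this.index, READ);
    const write = Atomics.load(this.index, WRITE);
    const used = (write - read + this.capacity) % this.capacity;
    // One slot always stays empty so that "full" and "empty" look different.
    if (bytes.length > this.capacity - 1 - used) return false;
    for (let i = 0; i < bytes.length; i++) {
      this.data[(write + i) % this.capacity] = bytes[i];
    }
    // Publish the new write index only after the bytes are in place.
    Atomics.store(this.index, WRITE, (write + bytes.length) % this.capacity);
    Atomics.notify(this.index, WRITE); // wake a consumer waiting on the index
    return true;
  }

  // Consumer side: drains whatever is currently available.
  read() {
    const read = Atomics.load(this.index, READ);
    const write = Atomics.load(this.index, WRITE);
    const available = (write - read + this.capacity) % this.capacity;
    const out = new Uint8Array(available);
    for (let i = 0; i < available; i++) {
      out[i] = this.data[(read + i) % this.capacity];
    }
    // Hand the consumed space back to the producer.
    Atomics.store(this.index, READ, write);
    return new TextDecoder().decode(out);
  }
}

const shared = new SharedArrayBuffer(INDEX_BYTES + 64);
const producer = new RingBuffer(shared);
const consumer = new RingBuffer(shared); // in practice this lives in the worker
producer.write('{"msg":"one"}\n');
producer.write('{"msg":"two"}\n');
const drained = consumer.read();
process.stdout.write(drained);
```

The producer only ever moves the write index and the consumer only ever moves the read index, which is why neither side needs to take a lock on the data itself.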
And I can always determine which part to read from on the worker side. So I'm going to introduce you to pino.transport now. I built up all this tension, and hopefully we release it right now. So pino.transport is a new function inside Pino that allows you to do log processing in a remote transport, sorry, in a worker thread that is wrapped as a transport. You can use modules: you can specify either a dependency or a single file with an absolute path. Or you can use a destination file, or you can use the console and pretty print. So that's it. So you can actually do a lot of those things in different ways.
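The options described above could look roughly like the following. This sketch only builds the option objects, so it runs without Pino installed; the commented lines show how they would be passed to `pino.transport`. The option names follow the shape documented for released Pino v7 and may differ from the pre-release shown in the talk. The file path, the Elasticsearch URL, and the per-target levels are placeholders, not the values from the demo.

```javascript
// Option objects in the shape accepted by pino.transport().
// Paths, URLs and levels below are placeholders chosen for illustration.

// A single target, resolved as a dependency by module name.
// An absolute path to a file would also work, for example
// { target: '/absolute/path/to/my-transport.js' } (hypothetical file).
const singleTarget = {
  target: 'pino-pretty',
};

// Several targets at once, each running off the main thread.
const multipleTargets = {
  targets: [
    // Write newline-delimited JSON to a destination file.
    { target: 'pino/file', level: 'info', options: { destination: '/tmp/app.log' } },
    // Ship logs to an Elasticsearch node.
    { target: 'pino-elasticsearch', level: 'info', options: { node: 'http://localhost:9200' } },
    // Pretty print to the console.
    { target: 'pino-pretty', level: 'info', options: {} },
  ],
};

// With Pino v7 and the target modules installed, the logger would be built as:
//   const pino = require('pino');
//   const logger = pino(pino.transport(multipleTargets));
//   logger.info('hello world');
console.log(JSON.stringify(multipleTargets, null, 2));
```

The application code that calls the logger stays the same whichever set of targets is chosen; only the object passed to the factory changes.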
So it's time for a demo. So what I'm going to do, I'm going to open up Kibana and we'll see how it goes. So here is Kibana, we have the last five minutes, and there is nothing. And I have my script. So first of all I'm going to show you the example, and this is the basic example of Pino, it just logs things out to standard output. So what we were used to doing is node example, and you see it prints this, but then I could pipe it to pino-pretty and get pretty printing. Note that what I've done here is, ah, that's it, okay. So what I've done here is I have executed node_modules/.bin/pino-pretty to prettify and colorize my logs. Now what I can do instead with the transports is, oop, yay, opened anyway, yay. I use this new technique where I am creating my transport. I'm specifying a file for one destination. I'm specifying that I'm going to use pino-elasticsearch to send the data to Elastic, specifying the Elasticsearch node, and doing pretty print. So let's see if that works. And, oh, note that the rest does not change at all. It's the exact same thing. And it's pretty cool, right? Because I can write my transport and keep my app as it is, but send to multiple destinations in the same process. Well, let's run this. So if I run example transports, I got the exact same...
6. Logging Output and Conclusion
I got the exact same output as before. I only got these log lines added because it shows only the warning logs. And now you can see that we also have logs inside Elasticsearch. It's really nice because we have all our data in an easy-to-process fashion with Kibana. Thank you for watching this talk. If you have more questions, please reach out to me. We are hiring for all sorts of roles.
I got the exact same output as before. I only got these log lines added because, in fact, you know, this shows only the warning logs. Okay, here it is. It's pretty great. And then we can see if these were updated. And now you can see that we also have logs inside Elasticsearch. And it's really nice because you can see that we have all our data in a really easy-to-process fashion, which Kibana gives us. So it's pretty great.
Cool, I just wanted to thank you for watching this talk. If you have any more questions about Pino, Fastify, and Node.js, please reach out to me or ping me via email. Oh, by the way, we are hiring for all sorts of roles.