Examining Observability in Node.js


Imagine your productivity and confidence developing web apps without Chrome DevTools. Many do exactly that with Node.js.

It is important to observe and learn what’s happening in your app to stay competitive and create the most performant and efficient Node.js applications, following the best practices.

In this talk, we will explore useful tools to examine your Node.js applications and how observability will speed up development, produce better code while improving reliability and uptime. Don’t miss it!

22 min
24 Jun, 2021

AI Generated Video Summary

Today's Talk explores the concept of observability in Node.js, emphasizing the importance of understanding what's happening inside a system without needing to ship new code. The Talk covers various observability tools and techniques, including performance measurement and tracing, heap snapshots and Chrome DevTools, the V8 Inspector, and external tools like N|Solid. N|Solid is highlighted as an enhanced observability tool specifically built for Node.js, offering low-impact performance insights and greater security for mission-critical applications.

1. Introduction to Observability in Node.js

Short description:

Today, I will be talking about examining observability in Node.js. Observability is a term from control theory. In software products and services, observability means you can answer any question about what's happening on the inside of the system by observing it or by asking questions from the outside, without needing to ship new code to answer those questions. This matters because system complexity is outpacing our ability to predict what is going to break. There are a lot of useful tools that come to the rescue.

Hello, all. Today, I will be talking about examining observability in Node.js. Hi, my name is Liz Parody, I'm the Head of Developer Relations at NodeSource, and I am from the beautiful country of Colombia.

This is the agenda for today. First, we're going to talk about what observability is, then why observability is important, and how we can use Node.js internals for observability, including performance hooks, trace events, heap snapshots, and the V8 Inspector. And then we will look at some external tools, benchmarks, and conclusions at the end.

So, let's begin by understanding what observability is and why it's important. Observability is a term from control theory. A simple definition could be: it's the measure of how well the internal states of a system can be inferred from knowledge of its external outputs. In other words, we see the results, or the output, and we can know what's happening on the inside. Let's take, for example, bananas and avocados. In Colombia, we have a lot of bananas and avocados. It's amazing. And just from looking at the outside, we can know how they are on the inside, whether they're ripe or not ready for consumption yet. In software products and services, observability means you can answer any question about what's happening on the inside of the system by observing it or by asking questions from the outside, without needing to ship new code to answer those questions. And that's very important. We shouldn't need to write new code to observe what's happening in one of our systems. This matters because system complexity is outpacing our ability to predict what is going to break. There are a lot of useful tools that come to the rescue.

2. Importance of Observability and Choosing Tools

Short description:

Talking about system complexity, software is becoming exponentially more complex. The number of products and tools is multiplying. With environments as complex as we see today, simply monitoring for problems is not enough. Observability gives you the instrumentation you need to understand what's happening in your software and fix it. Observing is exposing the internal state to be viewed externally, and monitoring is collecting and displaying the information that has been exposed. To solve complex Node.js problems, it's necessary to have an observability tool. In choosing an observability tool, we're going to focus first on Node.js internals, starting with performance hooks, and then on external tools.

Talking about system complexity, software is becoming exponentially more complex. In infrastructure, we're seeing things such as microservices, containers, Docker, Kubernetes, and others that decompose monoliths into microservices, which are great for products but can be hard on humans. The number of products and tools is also multiplying. There are countless platforms and tools for empowering people to have better observability and control over their code. That's great for users, but it makes it very hard to choose the best one.

Now that you know what observability is, you may be wondering why it's important and why you should care. With environments as complex as we see today, simply monitoring for problems is not enough to recognize, find, and fix the number of issues that arise. Sometimes the new issues are unknown unknowns, which means that you don't know the problem, and even worse, you don't know how to find it. So without tools to observe the environment, it's almost impossible to fix that problem. This is why observability is important. It gives you the instrumentation you need to understand what's happening in your software and fix it. A good observability tool helps you find what and where the problem is. It doesn't add overhead to your app. We don't want our app to slow down. Quite the opposite. It also has enhanced security, has flexible integration, and doesn't require you to modify your code. And it's also very important to differentiate between observing and monitoring. Observing is exposing the internal state to be viewed externally, and monitoring is collecting and displaying the information that has been exposed, and usually involves writing automation tools around that.

If you want to solve the most complex Node.js problems such as memory leaks or performance issues, or even just follow best practices to keep your code healthy, it's necessary to have an observability tool. The next step is choosing the observability tool that best fits our needs. So in choosing an observability tool, we're going to focus first on Node.js internals, and then we're going to check some external tools. In Node.js internals for observability, first we're going to talk about performance hooks. This is particularly helpful for checking on performance. It's an object that can be used to collect performance metrics from the current Node.js instance. Performance monitoring is not something you should start considering once you start seeing problems. Instead, it should be part of your development process in order to detect possible problems before they are visible in production. Because of the asynchronous nature of Node, code profiling with regular tools can be very challenging, especially because part of the time spent could be outside of your own code and inside of the event loop itself. This is why it's important to use either internal Node.js tools like performance hooks or external tools, as we're going to discuss later.

3. Demo of Performance Measurement and Tracing

Short description:

In this demo, we measure the performance of different search engines using performance hooks. The fastest search engine in this example is DuckDuckGo, followed by Google, Bing, and Yahoo. Another example demonstrates the use of performance hooks to measure the duration of a simple 'hello world' program. However, there are tradeoffs, as manual code instrumentation impacts reliability and the performance observer has a significant overhead cost. Profiling with Node's --prof flag generates a log file that provides insights into where the application's time is spent. Trace events centralize tracing information, including file system access and performance data. Tracing in Chrome allows for visualization of events and their durations. A heap snapshot provides a static snapshot of memory usage in V8, allowing analysis of memory usage patterns.

Now let's do a little demo. So here I'm using PerformanceObserver from performance hooks, and then we have the four biggest search engines: Google, Yahoo, Bing, and DuckDuckGo. Then we have an initial mark and an end mark here, and we're going to measure the time between them here. And here we're just going to console.log the duration. So if we go to the terminal, we can see the fastest one in this example is DuckDuckGo, followed by Google, then Bing, and lastly Yahoo with one second, and this one is 447 milliseconds.

Let's look at another example. Here we are creating just a simple hello world using PerformanceObserver from performance hooks, and I'm just going to console.log the duration of this hello world, and that's it. So if we just run this hello world app with node, it takes just 8 milliseconds. While this is very informative, there are some tradeoffs. It requires you to instrument your code manually, which impacts reliability. In the case of the performance observer, there is a significant overhead cost to the observer, which is not good. It makes the application slower.
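The demo code itself is not part of this transcript, so here is a minimal sketch of the mark/measure pattern described above. The `fakeRequest` and `timeIt` helpers are hypothetical names introduced for illustration, and a timer stands in for a real network request so the example runs offline.

```javascript
const { performance, PerformanceObserver } = require('node:perf_hooks');

// Finished measurements, kept so they can be logged or exported later.
const measurements = [];

// The observer callback runs asynchronously once new "measure" entries exist.
const observer = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    measurements.push({ name: entry.name, duration: entry.duration });
    console.log(`${entry.name}: ${entry.duration.toFixed(2)} ms`);
  }
});
observer.observe({ entryTypes: ['measure'] });

// Hypothetical stand-in for a real request (for example, to a search engine).
function fakeRequest(delayMs) {
  return new Promise((resolve) => setTimeout(resolve, delayMs));
}

// Wrap any async task between a start mark and an end mark, then measure it.
async function timeIt(label, task) {
  performance.mark(`${label}-start`);
  await task();
  performance.mark(`${label}-end`);
  performance.measure(label, `${label}-start`, `${label}-end`);
}

timeIt('fake-request', () => fakeRequest(50));
```

Note that every measured task has to be wrapped by hand, which is the manual instrumentation tradeoff mentioned in the talk.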

Now profiling. If we run node with the --prof flag, it will generate a log file like this one, which can be processed into more human-readable information, like flame graphs like this one, to see where the time of your application is being spent. This can be very productive in getting insights into what is going on in your application. The downside is that it has a significant overhead, and handling the generated files can be tricky.

Trace events. Trace events provide a mechanism to centralize tracing information generated by V8, Node.js core, and user space code. Tracing can be enabled with the --trace-event-categories flag, which accepts a list of comma-separated category names, or by using the trace_events module. Here the node.async_hooks category is enabled. So we can run with trace events enabled to get the output of several events that happen inside Node.js. This can include accessing the file system, performance data, async hooks, and others.

In Chrome, we can open chrome://tracing and click on the red record button, which allows you to visualize the tracing like this. If you look at the bottom of the screen, you can see fs.sync.read; this is the read operation of the file system. Then there are 546 bytes read. It's also possible to see when the tracing started, how long it took, and the CPU duration, which is also useful to see what's happening in your code. This other example of tracing is a little bit more complex, using more processes and multiple threads. We are looking at the main thread and different useful information like duration time, as we can see here, processes depending on other processes, arguments, and others. Tracing has a little less overhead than profiling, but management of the files becomes harder to handle because this is looking at a lower level of instrumentation of your Node.js internals.

The next one is HeapSnapshot. A heap snapshot is a static snapshot of memory usage details at a moment in time, and it provides a glimpse into the heap usage of V8, the JavaScript engine that powers Node.js. Looking at this heap snapshot, you can begin to understand where and how memory is being used.
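Going back to trace events for a moment: besides the command-line flag, tracing can be switched on from code through the built-in trace_events module. The sketch below is an illustration, not the code from the slides. The chosen categories, the `tracedRead` helper, and the temporary sample file are assumptions. By default Node.js writes the trace to a node_trace log file in the current working directory, which can then be loaded into Chrome's tracing view.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const traceEvents = require('node:trace_events');

// Categories chosen for this sketch: node.fs.sync covers synchronous file
// system calls, node.perf covers performance marks and measures.
const tracing = traceEvents.createTracing({
  categories: ['node.fs.sync', 'node.perf'],
});

// Run a synchronous read with tracing switched on only for its duration.
function tracedRead(file) {
  tracing.enable();
  try {
    return fs.readFileSync(file).length; // number of bytes read
  } finally {
    tracing.disable();
  }
}

// Hypothetical sample file, created only so there is something to read.
const sample = path.join(os.tmpdir(), `trace-sample-${process.pid}.txt`);
fs.writeFileSync(sample, 'hello trace events\n');

const bytes = tracedRead(sample);
console.log(`read ${bytes} bytes; categories traced: ${tracing.categories}`);
fs.unlinkSync(sample);
```

Enabling tracing only around the code of interest keeps the log files smaller, which helps with the file management problem mentioned above.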

4. HeapSnapshot and Chrome DevTools

Short description:

You can use the built-in --heapsnapshot-signal flag to dump a heap snapshot in Node.js. Chrome DevTools allows you to compare snapshots and identify objects in memory. A heap snapshot in Chrome DevTools provides information about objects on the JavaScript heap, including object count, shallow size, and retained size.

You can use the built-in --heapsnapshot-signal flag, as we can see here. You can send as many signals as you want, and Node.js will dump a heap snapshot each time. Chrome DevTools allows you to compare snapshots, and you can identify objects in memory, which will help you narrow down where a memory leak might be occurring. This is how a heap snapshot looks in Chrome DevTools at a very high level. The column on the far left here lists the objects on the JavaScript heap. On the far right, you can see the object count column, which represents how many objects are in memory, the shallow size column, and the retained size column. The retained size represents the amount of memory that will be freed by the garbage collector when the object is collected.
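The signal flag needs no code changes (for example, `node --heapsnapshot-signal=SIGUSR2 app.js`, where `app.js` is a placeholder name). As a complement, a snapshot can also be written on demand from code with the built-in v8 module. The sketch below assumes a hypothetical `dumpHeap` helper and writes to the operating system's temporary directory. The resulting file can be loaded in the Memory tab of Chrome DevTools.

```javascript
const os = require('node:os');
const path = require('node:path');
const v8 = require('node:v8');

// Write a snapshot of the current V8 heap and return the path of the file.
// The call is synchronous and blocks the event loop while it writes.
function dumpHeap(dir = os.tmpdir()) {
  const file = path.join(dir, `app-${process.pid}-${Date.now()}.heapsnapshot`);
  return v8.writeHeapSnapshot(file);
}

// Quick numeric view of the heap, without opening DevTools.
const { used_heap_size, total_heap_size } = v8.getHeapStatistics();
console.log(`heap used: ${used_heap_size} of ${total_heap_size} bytes`);

const snapshotPath = dumpHeap();
console.log(`snapshot written to ${snapshotPath}`);
```

Taking two snapshots some time apart and comparing them in DevTools is a common way to see which objects keep growing.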

5. The V8 Inspector and Observability Tools

Short description:

The V8 Inspector is a development tool that helps you monitor your application. Chrome DevTools integrated into V8 expands its capabilities. There are multiple ways to get started, including using the --inspect flag or the --inspect-brk flag for local development. A demo showcases the use of the --inspect-brk flag and the WebSocket communication session. However, the V8 Inspector should never be used in production. Node.js provides observability tools like profiling and performance hooks, but they have limitations. External tools like the blocked library can help with observability.

The V8 Inspector. This is not an observability tool, but instead is a development tool that helps you monitor what's happening in your application.

A few years ago, Chrome DevTools was integrated directly into V8, expanding its capabilities to include Node.js applications.

There are a few ways to get started. One is using the --inspect flag, as we can see here, which will start the inspector. Then you can pass the host and port that you want it to listen on, just as here. And if no parameters are passed, it will listen on 127.0.0.1:9229 by default, as we can see here. Another way, useful when doing local development, is the --inspect-brk flag, this flag. This has the same options for host and port as the --inspect flag, but it also puts a breakpoint before the user code starts. So you can do any type of setup you prefer without having to try to catch breakpoints in your code at runtime.

Now, let's check a little demo. So if we create this line of code here, just a Promise.reject with a new error, a very cool error, and then run node with the --inspect-brk flag and the name of the file, we can see that the debugger is listening on ws://, and then we have a WebSocket URL. WebSocket makes it possible to open a two-way interactive communication session between the user's browser and a server. We can also see a message here that directs us to the Node.js documentation so we understand what's happening there, and if we have any questions, we can just go there. Then if we go to chrome://inspect, this will show us a link here, a Node.js link, and if we open that link, it will show us a pop-up window for debugging your Node.js session. So now DevTools is connected to Node.js, providing you with access to all of the Chrome DevTools features you are used to. This allows you to edit pages on the fly, access source maps, live edit, console evaluation, the sampling JavaScript profiler with flame graph, heap snapshots, asynchronous stacks for native promises, and others.
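The same WebSocket endpoint can also be opened from inside a running program with the built-in inspector module, which makes it easy to see what the flags set up. This is a minimal sketch rather than the demo from the talk. The `openInspector` helper is a hypothetical name, and port 0 (let the operating system pick a free port) is an assumption made so the example does not clash with a debugger already using the default port 9229.

```javascript
const inspector = require('node:inspector');

// Open the inspector from code (unless it is already active, for example
// because the process was started with --inspect) and return its URL.
function openInspector(port = 0, host = '127.0.0.1') {
  if (!inspector.url()) {
    inspector.open(port, host);
  }
  return inspector.url();
}

const wsUrl = openInspector();
console.log(`DevTools can attach at ${wsUrl}`);

// Close it again: an open inspector should not be left exposed.
inspector.close();
```

Binding to 127.0.0.1 keeps the endpoint reachable only from the local machine, which matters because anyone who can connect to it can run code in the process.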

However, the V8 Inspector should never be used in production, because DevTools actions halt the event loop. This is acceptable in development, but it's not suitable for production environments. For production environments, we will see later which options are best. But the problem with the things that Node.js already provides for observability, like profiling, performance hooks, and others, is that they tell you that there is a problem but not where to find it. Also, sometimes they are not easy to implement. They don't give you enough information, or it is not presented in an easy way, like the graphs or central performance metrics that standard tools provide. Also, they have significant overhead. Generally, they are not viable in production, and they provide data overload, which means that there's a ton of data and it requires expertise to separate the signal, the significant data, from the noise.

But there are some pros, because this is great tooling, with extensive data and insight. So now, we will check some external tools for Node.js observability. First is the blocked library. The blocked npm package is a concise example of using timers for observability. It helps you check if the event loop is blocked.
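To make the timer idea concrete, here is a minimal sketch of how a timer-based check for a blocked event loop can work. It is written from scratch for illustration and is not the source code of the npm package. The `watchEventLoop` name, the 100 ms interval, and the 10 ms threshold are assumptions.

```javascript
// Schedule a repeating timer and compare when it actually fires with when it
// was supposed to fire. A late timer means something held the event loop.
function watchEventLoop(onBlocked, { threshold = 10, interval = 100 } = {}) {
  let last = process.hrtime.bigint();
  const timer = setInterval(() => {
    const now = process.hrtime.bigint();
    const elapsedMs = Number(now - last) / 1e6;
    const lagMs = elapsedMs - interval;
    if (lagMs > threshold) onBlocked(Math.round(lagMs));
    last = now;
  }, interval);
  timer.unref(); // do not keep the process alive just for the watcher
  return () => clearInterval(timer);
}

const reports = [];
const stop = watchEventLoop((ms) => {
  reports.push(ms);
  console.log(`event loop blocked for about ${ms} ms`);
});

// Simulate a blocking task: a busy loop that holds the thread for 200 ms.
setTimeout(() => {
  const end = Date.now() + 200;
  while (Date.now() < end) { /* busy wait */ }
}, 150);

setTimeout(stop, 600);
```

The callback only receives a delay, not the code responsible for it, which is why this kind of check can tell you that the loop was blocked but not what blocked it.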

6. Observability Tools and N|Solid

Short description:

If you're running Node 8 or a higher version, you can get a stack trace pointing to the blocking function. The blocked function reports every value over the configured threshold, which defaults to 10 milliseconds. While it's useful for understanding the overhead of event loops, it can have false positives in some scenarios. Another external tool is New Relic, an observability platform that helps engineers create better software. Datadog is a monitoring service for cloud-scale applications, providing metrics on requests, latency, distributions, errors, and more. Instana is an application performance monitoring tool for microservices, providing detailed metrics on calls, error rate, mean latency, and more. Dynatrace is a software intelligence platform that monitors and optimizes application performance and development, with Node.js observability features. However, these external tools have limitations and overhead. N|Solid is an enterprise runtime for Node.js with minimal overhead.

If you're running Node 8 or a higher version, you can get a stack trace pointing to the blocking function. The blocked function reports every value over the configured threshold, which defaults to 10 milliseconds. And you can do whatever you want with it. You can graph it, log it, alert on it, and so on. While it's useful for understanding the overhead of event loops, it can have false positives in some scenarios because of the time offset. In addition, it can also create a numbing effect, alerting you to event loop blocks but not pointing to what is actually causing the blockage.

Many times, developers will just ignore it as there's no clear action to take. Another external tool is New Relic. New Relic is an observability platform built to help engineers create better software. From monoliths to serverless, it helps you instrument everything, then analyze, troubleshoot, and optimize your entire software stack. It also provides different solutions. This is how New Relic insights appear. You can see web transaction times, application activity, error rate, hosts, and others. The next one is Datadog. Datadog is a monitoring service for cloud-scale applications, providing monitoring of servers, databases, tools, and services through a SaaS-based data analytics platform. With Datadog, you can check requests, latency, distributions, errors, percentage of time spent, and other metrics, as we can see here. Instana is an application performance monitoring tool for microservices. It lets you manage the performance of your application in real time and see every detail about the inner workings and interdependencies of your application services. We can see some of the metrics here, like calls, error rate, mean latency, top services, processing times, and others.

So, Dynatrace. Dynatrace produces a software intelligence platform based on artificial intelligence to monitor and optimize application performance and development, IT infrastructure, and user experience. For Node.js observability, it can tell you the number of processes, CPU and memory usage percentages, connectivity and availability, traffic, the most time-consuming requests, and other Node.js metrics. But there is a problem with all of these solutions, all of these external tools. The way APMs work, they become agents, as we can see here, which are basically intermediaries between the application and the Node.js runtime. The APM is injected into your code and encapsulates your application so it can extract the information, and thus it has a significant cost, also known as overhead. Another problem is that sometimes you have to modify your own code in order to implement the APM. But they can be very useful tools that provide you with additional insights and extensive data. Now let's look at one tool that doesn't have this problem because it's not an APM. It's an enterprise runtime for Node.js and it adds minimal overhead: N|Solid. N|Solid is a drop-in alternative to the Node.js runtime.

7. Benefits of N|Solid for Node.js Observability

Short description:

N|Solid is an enhanced observability tool specifically built for Node.js, offering low-impact performance insights and greater security for mission-critical applications. It provides valuable insights into N|Solid processes, including CPU usage, garbage collector counts, and more. Unlike traditional APM tools, N|Solid operates at a lower level, avoiding the overhead of wrapping user code. In benchmarks, N|Solid outperforms other tools in load times and speed, with minimal memory overhead. Observability in Node.js is crucial for security and performance, and N|Solid is the top choice for Node.js-specific observability. Other tools may have their merits, but N|Solid offers the best of both worlds.

It's enhanced to deliver low-impact performance insights and greater security for mission-critical Node.js applications. It has faster time to resolution, a stronger infrastructure, and better security. This is important because traditional APM tools sit on top of the Node.js runtime layer, and that has a performance overhead that might vary from one application to the next depending on the architecture.

N|Solid was built specifically for Node.js because it is the Node.js runtime itself; it's not an agent. This is the console overview, which provides valuable insights into the clusters of N|Solid processes running in a variety of configurations. You can see the number of processes, vulnerabilities, hosts, and the number of applications. Then there is the cluster view, where you can see each process, its CPU percentage, garbage collector counts, and others.

It's important to clarify that the tools shown on the previous slides all contain libraries that help you expose data, but their main function is as a monitor. For example, I can export data using the New Relic API and consume it via AWS. This is where N|Solid has an advantage. The additional metrics that N|Solid provides can be consumed by many monitoring solutions without any additional overhead. This is the best of both worlds.

And finally, we will see some benchmarks. If we check load times, which is the time that it takes for a Node.js process to be available to receive requests, we can see that vanilla Node.js here is the fastest with 30 milliseconds, followed by N|Solid with 40 milliseconds. Then we can see Instana with 210 milliseconds, which is an increase of 600%, then New Relic, Datadog, and finally AppDynamics, which is an increase of 3600%. For startup times, we can see that vanilla Node.js takes 30 milliseconds, followed by N|Solid with 35 milliseconds. Then 150 and 250 milliseconds by AppDynamics and Datadog, which is an increase of 600%. For added baggage, or memory overhead, N|Solid only adds 2 MB of memory, while New Relic adds 15 MB and Datadog adds 57 MB of overhead. Finally, measuring speed, N|Solid is the fastest one with almost 10,000 RPS, then AppDynamics with 2,000 RPS, and finally Datadog with 1,500 RPS here.
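The percentages quoted for load times are increases relative to the vanilla Node.js baseline. As a small worked example, the helper below (a hypothetical `percentIncrease` function, not part of any benchmark tooling) reproduces the 600% figure from the 30 ms baseline and the 210 ms load time mentioned above.

```javascript
// Percentage increase of a measured value over a baseline value.
function percentIncrease(baseline, measured) {
  return ((measured - baseline) / baseline) * 100;
}

// Load times quoted in the talk, in milliseconds.
const vanillaLoadMs = 30;
const instanaLoadMs = 210;

console.log(`${percentIncrease(vanillaLoadMs, instanaLoadMs)}%`); // prints "600%"
```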

In conclusion, observability in Node.js is very important for security and performance. It allows you to fix errors faster, and if you are focusing on Node.js specifically, N|Solid is the best observability tool out there. The other tools are great, but they come with a cost because they add noticeable overhead from wrapping the user's code in their own libraries. N|Solid avoids these penalties by observing the application at a lower level, allowing N|Solid to make observations without directly affecting how the program runs. For other types of applications, and depending on your needs, there are other great observability tools that add a lot of value. Thank you so much. This is where you can find me on social media, and if you have any questions, please let me know.
