Comprehensive Observability via Distributed Tracing on Node.js


The benefits of Node.js for developing real-time applications at scale are well known. As Node.js architectures grow more complex, being able to visualize your microservice-based architecture becomes crucial. However, visualizing microservices is hard given their scale and the number of transactions that cross them. A complete observability solution needs to do more than visualize your Node.js applications: it also has to analyze their health, flow, and performance. In this talk, we'll go over the challenges of scaling your Node.js applications and the tools (such as distributed tracing) available to help you scale with confidence.

FAQ

Microservices bring challenges such as difficulty in observing and monitoring due to their distributed nature. Traditional monitoring systems often fail to provide clear visibility under the hood of microservices.

The three pillars of observability are logs, metrics, and traces. Metrics help identify issues, logs explain why they occurred, and traces provide detailed information about request paths through services.

Distributed tracing provides the ability to track a request's path through various services, helping identify where failures or bottlenecks occur. This can significantly reduce the time to detect and resolve issues compared to traditional methods.
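To make "tracking a request's path" concrete, here is a minimal sketch of how a trace identifier can travel between services. The talk does not name a propagation format, so this assumes the W3C Trace Context `traceparent` header; the helper names (`newTraceContext`, `toTraceparent`, `parseTraceparent`, `childContext`) are hypothetical and not part of any library.

```javascript
// Sketch only: assumes the W3C Trace Context `traceparent` header format.
// All function names here are illustrative, not a real library API.
const crypto = require('node:crypto');

// Start a new trace: 16-byte trace id, 8-byte span id, "sampled" flag set.
function newTraceContext() {
  return {
    traceId: crypto.randomBytes(16).toString('hex'),
    spanId: crypto.randomBytes(8).toString('hex'),
    flags: '01',
  };
}

// Serialize as version-traceid-parentid-flags (version 00).
function toTraceparent(ctx) {
  return `00-${ctx.traceId}-${ctx.spanId}-${ctx.flags}`;
}

// Parse an incoming header value; returns null when missing or malformed.
function parseTraceparent(header) {
  const match = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(header || '');
  if (!match) return null;
  const [, traceId, spanId, flags] = match;
  // All-zero trace or span ids are invalid.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return null;
  return { traceId, spanId, flags };
}

// A downstream service keeps the trace id and generates its own span id.
function childContext(parent) {
  return {
    traceId: parent.traceId,
    spanId: crypto.randomBytes(8).toString('hex'),
    flags: parent.flags,
  };
}
```

Each service would parse the incoming header, create a child context for its own work, and send the serialized child on any outgoing call. Because the trace id stays constant along the way, a tracing backend can stitch the spans from every service into one request path.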

Important metrics to monitor include CPU usage, memory usage, as well as business-level metrics like bounce rates, revenue, and click-through rates.

In a highly distributed microservices environment, logs can be voluminous and scattered, making it difficult to pinpoint issues without a significant amount of time and effort.

Correlation in observability involves linking metrics, logs, and traces across various services to provide a comprehensive view of system performance and issues, facilitating quicker problem identification and resolution.

Distributed tracing helps narrow down the scope of services involved in an issue, reduces guesswork, pinpoints where time is spent in the code, and provides actionable data through visualizations of service interactions.
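As a rough illustration of "pinpointing where time is spent", the sketch below times named units of work and reports the slowest one. It is a toy, in-memory stand-in for what a tracing library records as spans; `withSpan`, `slowestSpan`, and the `spans` array are hypothetical names introduced only for this example.

```javascript
// Sketch only: a toy span recorder, not a real tracing SDK.
const { performance } = require('node:perf_hooks');

// In-memory list of finished spans; a real tracer would export these.
const spans = [];

// Run `fn`, measure how long it takes, and record the result as a span.
async function withSpan(name, fn) {
  const start = performance.now();
  try {
    return await fn();
  } finally {
    // Recorded even when `fn` throws, so failures still show up in the timing data.
    spans.push({ name, durationMs: performance.now() - start });
  }
}

// Return the span with the largest duration, or null if nothing was recorded.
function slowestSpan() {
  return spans.reduce(
    (worst, span) => (worst === null || span.durationMs > worst.durationMs ? span : worst),
    null
  );
}

// Helper that simulates work taking roughly `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
```

Wrapping each stage of a request, for example `await withSpan('update-table', () => sleep(80))`, and then calling `slowestSpan()` gives a first answer to "which step was slow?" without reading any logs.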

A sustainable observability strategy should include clarity on business goals, a choice between DIY or managed solutions, implementation of lightweight observability tools, and scalability to handle fast-growing microservice architectures.

Being proactive in observability allows organizations to prevent issues before they impact the system significantly, reducing downtime and maintaining high performance and reliability.

Chinmay Gaikwad
8 min
24 Jun, 2021

Video Summary and Transcription

Welcome to the session on comprehensive observability via distributed tracing on Node.js. We'll explore the challenges of microservices and troubleshoot distributed applications using an example. Correlation is the missing piece in troubleshooting distributed applications. Distributed tracing helps pinpoint issues that logging or metrics may miss, reducing mean time to resolution. It provides visualization of microservices architecture, actionable data, and enables code optimization.

1. Introduction to Observability

Short description:

Welcome to the session on comprehensive observability via distributed tracing on Node.js. In this session, we'll look at the new challenges in microservices, troubleshoot distributed applications using an example, and build a sustainable observability strategy for your company. Microservices have great benefits but also bring new challenges such as observability. Traditional monitoring systems make it hard to know what's happening under the hood.

Hello, everyone. Welcome to the session on comprehensive observability via distributed tracing on Node.js. I'm the host for the session. I'm Chinmay Gaikwad. I'm a technical evangelist at Epsagon.

Let's get started with the session. In this session, we'll look at the new challenges in microservices, specifically focusing on observability. We'll also look at how to troubleshoot distributed applications using an example, and finally we'll look at how to build a sustainable observability strategy for your company.

So let's start with the challenges of microservices. We know microservices have great benefits, including scalability, speed of development, and decreased system administration time, but they have also brought new challenges, such as observability. Using traditional monitoring systems, it can be nearly impossible to know what is going on under the hood. We'll explore this in much more detail in the upcoming slides.

2. Troubleshooting Distributed Applications

Short description:

Let's start with metrics, which are a great way to identify issues. Logs tell us why something went wrong, but they are not sufficient in a microservices-based environment. The traditional way of debugging involves looking at metrics, then logs, but it lacks context. Correlation is the missing piece in troubleshooting distributed applications.

First, let's see how to troubleshoot distributed applications. So we know the three pillars of observability are metrics, logs, and traces. We'll deep dive into tracing a bit later. Let's start with metrics. Metrics are a great way for ops to figure out if something has gone wrong. Some examples of metrics include CPU usage and memory usage. We also have business-level metrics such as bounce rates, revenue, click-through rates, etc.
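For a sense of what collecting such metrics looks like in Node.js, here is a minimal sketch using only built-in modules. The `collectMetrics` function and the in-memory `businessMetrics` counters are hypothetical names for illustration; a real setup would ship these values to a metrics backend rather than keep them in memory.

```javascript
// Sketch only: one-off sampling of host/process metrics plus a toy business metric.
const os = require('node:os');

// Take a single snapshot of process and host metrics.
function collectMetrics() {
  const mem = process.memoryUsage(); // values in bytes
  const cpu = process.cpuUsage();    // microseconds of CPU time used by this process
  return {
    timestamp: Date.now(),
    rssBytes: mem.rss,
    heapUsedBytes: mem.heapUsed,
    cpuUserMicros: cpu.user,
    cpuSystemMicros: cpu.system,
    loadAvg1m: os.loadavg()[0],      // 1-minute load average; always 0 on Windows
  };
}

// Hypothetical business-level counters, kept in memory for illustration.
const businessMetrics = { pageViews: 0, clicks: 0 };

function recordPageView() {
  businessMetrics.pageViews += 1;
}

function recordClick() {
  businessMetrics.clicks += 1;
}

// Click-through rate: clicks per page view (0 when there are no views yet).
function clickThroughRate() {
  return businessMetrics.pageViews === 0
    ? 0
    : businessMetrics.clicks / businessMetrics.pageViews;
}
```

Calling `collectMetrics()` on an interval gives the kind of time series that tells you something has gone wrong, which is exactly the role metrics play in the talk. It says nothing yet about why.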

Logs, on the other hand, tell us why something went wrong. So for this session, let us consider an example of a virtual shop. As you can see, the SAP server authenticates requests using Auth0, and then pushes them onto the Kafka stream. A Java container pulls the stream and updates a DynamoDB table. Let's say there was a situation where users complained about an order that was sent but not handled. Where would you start?

Traditional monitoring solutions come at the expense of higher resource utilization because they rely on multiple heavyweight agents. They are also often limited to collecting host metrics, or are purely metric-driven. Metrics, as we have seen, really only let us know that something is broken, but not when or why. Context is absolutely critical in today's environments. Using the traditional way, first you look at the Kafka metrics. You don't see anything abnormal there, so maybe you look at the DynamoDB metrics next. We see some spikes here, so that's pretty interesting. To debug this, you need more data, and more data means logs. But are logs really sufficient in a microservices-based environment? Let's look into it.

We all know what logs look like. Personally, I have a love-hate relationship with logs. I love the fact that they are available, but I hate digging through them. I've found myself digging through hundreds or even thousands of lines of logs hoping to spot that outlier. What if I knew the exact path that a request takes through individual services and components? Logs are good for debugging, but they don't really work as a starting point in a highly distributed system. So in our virtual shop example, if you're very lucky, you'll be able to spot the problem, but it might take a very long time. So let's recap what is missing here. It essentially boils down to correlation.
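One common way to get that correlation is to stamp every log line with an identifier for the request that produced it. The sketch below does this inside a single Node.js process with the built-in `AsyncLocalStorage`. The stage names and the in-memory `logLines` sink are hypothetical, and the talk does not prescribe this technique; across real services the same id would have to travel with the request, for example in a header.

```javascript
// Sketch only: correlate log lines by request id within one process.
const { AsyncLocalStorage } = require('node:async_hooks');
const { randomUUID } = require('node:crypto');

// Holds per-request context that survives across awaits.
const requestContext = new AsyncLocalStorage();

// In-memory log sink for illustration; a real app would write to a logger.
const logLines = [];

// Log a message, automatically attaching the current request id (if any).
function log(service, message) {
  const store = requestContext.getStore();
  const entry = { requestId: store ? store.requestId : 'none', service, message };
  logLines.push(entry);
  return entry;
}

// Hypothetical stages, loosely mirroring the auth -> stream -> table flow.
async function authenticate() {
  log('auth', 'request authenticated');
}

async function publishToStream() {
  // Yield to the event loop so concurrent requests interleave.
  await new Promise((resolve) => setImmediate(resolve));
  log('stream', 'message published');
}

async function updateTable() {
  log('db', 'table updated');
}

// Handle one request; every log call inside sees the same request id.
function handleRequest(requestId = randomUUID()) {
  return requestContext.run({ requestId }, async () => {
    await authenticate();
    await publishToStream();
    await updateTable();
    return requestId;
  });
}

// Pull out only the log lines that belong to one request.
function logsFor(requestId) {
  return logLines.filter((line) => line.requestId === requestId);
}
```

Even when several requests are handled concurrently and their log lines interleave, `logsFor(id)` returns one request's story in order. That is the step from "thousands of lines of logs" to "the exact path that a request takes".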
