Monitoring Errors and Slowdowns with a JS Frontend and Node Backend

We've got a JavaScript frontend hitting a Node (Express.js) backend. We'll walk through how to tell which side is responsible for each error, what its impact is, and how to gather all the context needed to solve it.



Transcription


Hey everyone, my name is Chris. I'm a solutions engineer at Sentry, and I'm here to talk to you today about monitoring errors and slowdowns in JavaScript. Sentry is code monitoring for developers: we tell you when your code is broken and when your code is slow. We're not infrastructure monitoring or analytics monitoring; we live specifically at the application layer, and we're designed for developers. Today we're going to cover two main areas: error monitoring and performance monitoring. We'll head to docs.sentry.io and take a look at the Node documentation. It's a simple npm install, then you initialize the SDK with Sentry.init and a few configuration options. Very easy to get started. We land here on our demo site, where we're going to purchase some e-commerce products. We notice that the products endpoint is taking a while to load, so we'll come back to that in a little bit. We go ahead and add some products to the cart, check out, wait to get our products, and we see an error.
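As a rough sketch of that setup (the DSN, release, and integration details below are placeholders, and the exact API varies by SDK version):

    // Minimal Sentry setup for an Express app; values are illustrative
    const Sentry = require("@sentry/node");
    const Tracing = require("@sentry/tracing");
    const express = require("express");

    const app = express();

    Sentry.init({
      dsn: "https://examplePublicKey@o0.ingest.sentry.io/0", // placeholder DSN
      environment: "production",
      release: "my-app@1.0.0",
      integrations: [
        new Sentry.Integrations.Http({ tracing: true }),
        new Tracing.Integrations.Express({ app }),
      ],
      tracesSampleRate: 1.0, // sample every transaction; lower this in production
    });

    // The request handler must come before any routes; the error handler
    // goes after all routes so thrown errors get reported to Sentry
    app.use(Sentry.Handlers.requestHandler());
    app.use(Sentry.Handlers.tracingHandler());
    // ...routes go here...
    app.use(Sentry.Handlers.errorHandler());

    app.listen(3000);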

So without something like Sentry, we wouldn't know anything was wrong, both in terms of the products endpoint being slow and, more importantly, in terms of this error. What we see here is that the error shows up in Slack right away. We can see that it's a 500, that it's happening in production, and the release it went out with; maybe this was something that got released earlier today. We know it's happening in production, so it's important, and we click through.

Here we're on the issue page. This is the who, what, when, where, and why of the error. We can see what the error is (a 500), that it's happened 45 times to 22 unique users, how many times in the last 24 hours, and how many in the last 30 days. It was first seen five months ago and, understandably, most recently seen a few seconds ago in this more recent release, because we just triggered it. We can see that the user was on Chrome on Mac OS X, we get their email address, and we see any other custom tags we've set here. The stack trace tells us the error type and message: you can see that if the response wasn't OK, we manually call captureException. The Sentry SDK attaches to the global error handler and automatically captures any uncaught, unhandled errors, but if you're handling errors yourself, as in this case, you can also capture them manually. Breadcrumbs give us a sense of the user journey leading up to the error: what was the user doing right before it occurred? You can see that just before, there was a POST request to the checkout endpoint, some print statements, and a click. The other thing that's really important to mention is that we're seeing a human-readable version of the stack trace. If your JavaScript has been bundled and minified, the raw minified stack trace won't be useful; once you upload source maps, you'll see the original, human-readable source code. We also have integrations with different source code management tools, so you can see these suspect commits, which give us a sense of who might have committed the code that caused this problem.
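A hedged sketch of what that manual capture might look like on the frontend; the /checkout URL and response handling here are assumptions for illustration, not the demo's actual code:

    // Report a handled error explicitly; Sentry also catches unhandled ones globally
    import * as Sentry from "@sentry/browser";

    async function checkout(cart) {
      const response = await fetch("/checkout", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(cart),
      });
      if (!response.ok) {
        // The response wasn't OK, so capture the error manually
        const error = new Error(`Checkout failed with status ${response.status}`);
        Sentry.captureException(error);
        throw error;
      }
      return response.json();
    }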

We can also see that there's a child error event, and we can trace that across.

There's another project, in this case our Node Express app, and we can see that there's a different error message: not enough inventory for product.

It's happened 87,000 times to 85,000 unique users, so clearly this is not a new issue.

It's happened a lot more frequently in the last 24 hours and 30 days than the prior issue.

We can see when it was first seen and most recently seen.

It's all the same deal, the who, what, when, where, and why: there's not enough inventory for the product, and we throw a new error.

At this point, we trace it down to the root cause.
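As a sketch of the kind of backend check that could produce this issue (the route, data shapes, and inventory lookup are assumptions for illustration):

    // Express route that throws when stock is insufficient; the thrown error
    // is forwarded via next() to Sentry's error handler middleware
    app.post("/checkout", (req, res, next) => {
      try {
        for (const item of req.body.cart) {
          const product = inventory.get(item.productId); // assumed in-memory store
          if (!product || product.stock < item.quantity) {
            throw new Error(`Not enough inventory for product ${item.productId}`);
          }
          product.stock -= item.quantity;
        }
        res.sendStatus(200);
      } catch (err) {
        next(err); // reaches app.use(Sentry.Handlers.errorHandler())
      }
    });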

We can consider this solved, and turn our attention to performance.

So if you recall, back here, we clicked on the products endpoint and saw that performance was slow.

Now, we can also see that reflected within Sentry itself.

You can see a number of Google's Web Vitals, the standard SEO-related metrics like how long it takes for the largest element to show up on the page (Largest Contentful Paint) and for the first element to show up (First Contentful Paint), and we can also head over to look specifically at our transactions in general.

You can see there's a user misery score that's quite high here for the products endpoint, so even if we didn't already know what we were looking for, we'd be able to spot this.

This is also configurable, in case you have endpoints that you know will take a long time.

Basically, it's a helpful way to see which transactions are taking much longer than we expected.

We'll click through, take a look at some of our recent transactions, and see that this view shows a lot of the different resources and assets the browser is loading.

You can see React components being mounted and updated.

We're able to expand this and see that in our backend project there was an HTTP request that took about 7.2 seconds out of the total 7.8 seconds, so that's where the slowdown is.

It looks like most of these things are not contributing to it, but this is the culprit.
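Spans like these come from Sentry's automatic browser instrumentation, but you can also add your own. Here's a hedged sketch using the tracing API of that era; the transaction and span names are made up:

    // Wrap a unit of work in a transaction with a child span
    const transaction = Sentry.startTransaction({ name: "load-products", op: "navigation" });
    const span = transaction.startChild({ op: "http.client", description: "GET /products" });

    fetch("/products")
      .then((res) => res.json())
      .finally(() => {
        span.finish();        // the span's duration shows up in the waterfall
        transaction.finish(); // sends the transaction to Sentry
      });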

Here on this page, we also get context.

We have similar breadcrumbs showing what the user was doing in the time leading up to this point.

For more information, we have a bunch of different tags we can access, and we can also use Sentry's tracing feature to head over to, again, our Node project on the backend and realize, OK, this is where things are actually going wrong.

So it looks like there are some database queries happening here, and in this case we're running them sequentially: fetching products by individual ID rather than fetching all of the products at once with a set of product IDs. So this might be an area where there's room for improvement.
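To illustrate the pattern (the query shapes and the node-postgres-style API here are assumptions, not the demo's actual code):

    // Slow: one round trip per product ID, each awaited before the next starts
    async function getProductsSequential(db, ids) {
      const products = [];
      for (const id of ids) {
        const { rows } = await db.query("SELECT * FROM products WHERE id = $1", [id]);
        products.push(rows[0]);
      }
      return products;
    }

    // Faster: a single query fetching the whole set of IDs at once
    // (assumes integer IDs, hence the ::int[] cast)
    async function getProductsBatched(db, ids) {
      const { rows } = await db.query("SELECT * FROM products WHERE id = ANY($1::int[])", [ids]);
      return rows;
    }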

So we traced it from the frontend over to the backend without having to dig through logs for both applications.

Sentry's goal is basically to consolidate all the context you need to solve errors and performance problems into one place, and to automate a lot of the work that would otherwise take your time.

At this point, we could go back.

We could also set up an alert so that we get notified in Slack in the same way. If we go back to the products page, we can create an alert directly from here for the /products transaction. Let's say if its duration is above eight seconds, it's considered critical; above six seconds, it's a warning; and it's considered resolved below three. You can see that reflected here. What we can then do is add actions: on warning status, we email the team, and on critical status, we also go ahead and send something over via Slack. So now we'll be helping ourselves surface any future problems with this endpoint, this transaction.

To recap: we surfaced an error. It showed up in Slack, and we had instant context on whether it was important and whether we needed to do something about it. We clicked through and got the who, what, when, where, and why of the error; we looked at how many times it had occurred and how many people it was affecting. When we decided it was worth our time solving, we dug in, traced it across to a different project that had Sentry implemented, and found the root cause. That tracing feature is enabled as long as you have two projects, two applications, that both have Sentry initialized and configured correctly, which is a very easy thing to do. Performance monitoring was the same deal: we surfaced a slowdown and figured out what went wrong. Thanks, everyone, for watching, and good luck smashing those bugs out there.

