Every business today is a software business. Software is made of code. And code is meant to be improved. Yet developers get stuck reactively monitoring, investigating, and debugging code to fix issues. They lose too much time manually searching through logs, APM, and observability tools. Instead, they could be using that time to innovate. In this workshop, the participants will be introduced to the continuous code improvement platform that can help them see errors in real time and give them the tools needed to automate how they respond. Participants will learn how to instrument Rollbar's lightweight SDKs into their applications to capture uncaught exceptions as they happen, along with the surrounding context and details. Participants will walk away with complete visibility into every error in their application, coupled with all the important data needed to make resolution painless.
The Crash Course for Continuous Code Improvement
AI Generated Video Summary
Today's workshop introduces Rollbar as a solution for continuous code improvement, addressing the problem of deprioritizing bugs and the impact they have on developers. Rollbar helps identify and fix bugs quickly, saving time and improving code health. The machine learning-powered grouping algorithm automates bug resolution and reduces duplicate bugs. Rollbar provides integrations with various tools and services, such as Slack and Jira, to streamline the error management process. The workshop also covers Rollbar's support for feature flagging and its commitment to improving the developer experience.
1. Introduction to Rollbar and the Workshop
Today's session is called the crash course for continuous code improvement. It will be a combination demo and training workshop, where I'll introduce Rollbar as a product and service. My name is David Waller, a solutions engineer at Rollbar, specializing in post-sales customer journey and user training.
All right, so it's now three minutes after, so we can go ahead and get started. We'll just do some quick intros: I'll introduce myself and what we're going to do today. So today's session is called the crash course for continuous code improvement. And this is going to be a combination demo and training workshop. I'm going to introduce Rollbar as a product and a service, and then we will pivot that into some hands-on training with the service itself. So my name is David Waller. I'm a solutions engineer here at Rollbar. So I typically focus on the post-sales customer journey. So all things that have to do with implementing the product, using it effectively, and so, naturally, that covers a lot of user training. So that's one of my sweet spots within the Rollbar product, and hopefully I can help to train up some folks on how to use this as we go through it today.
2. Workshop Agenda and Introduction
We'll start with the slide deck for the introductory portion, discussing the problem space and how Rollbar solves it. We'll have a light demo and cover some educational content. The workshop will conclude with open-ended Q&A. The session is expected to take around an hour, with ample time for questions.
Alright, so this agenda is- it's actually just for the slide deck. Didn't really think about writing up a full agenda for the workshops. So what we'll do is we'll start with this slide deck and this will be the introductory portion. We'll talk about the problem space that we have, how Rollbar fits into that space and helps to solve that problem. We'll look at a light demo and just a few slides on some of the educational content. And then we'll close out with the workshop and with some open-ended Q&A time. I don't anticipate for this to take much over an hour. We're scheduled for two. There's plenty of time for Q&A and anything that comes up towards the end. But I don't plan on holding everyone hostage for a full two hours. So hopefully that's good news.
3. Introduction to Bugs and Rollbar
Today, we'll discuss the problem of deprioritizing existing bugs and how it affects developers. Bugs are a real problem that can impact the code and cost money. As developers, we have the best understanding of these issues. Let's dive into the numbers to quantify the problem and explore the solution provided by Rollbar.
Alright, so let's kick off with just a little bit of introduction of this space that we have. And what better way than to start that with a Dilbert comic. So I'll let each one of you read through this. I don't typically like to echo what's on my slides too much. I like to let folks read them if they choose. So basically here, the issue is more or less deprioritizing the existing bugs. And this is a fun way to say that bugs are a problem. And they're a real problem. We, as developers, or those folks working as developers, are usually the front line of awareness of those. Sometimes that's not the case. But with developers being the ones who create the code, we're often the ones that have the best understanding of what's actually happening there. So that's sort of an introduction.
4. Quantifying the Problem and Introducing Rollbar
Bugs in code have a significant impact on companies and businesses. They can cause resource problems, increased costs, and lost revenue. Resolving bugs is also time-consuming and takes away from new feature development. Rollbar helps address these issues by providing a solution to quickly identify and fix bugs.
And now, let's use some numbers to really quantify the problem. These are the bad numbers, so we'll talk about them first. These are the problem numbers: where do they impact the company and the business itself? So with bugs being prevalent in the code, they're not just there, but there's a lot of them. And they cost money. They have a number of effects. They can just cause problems with the resources that support them. Maybe you just have to pay more for extra processing power to get around the bugs. Maybe you lose direct revenue as a result of an end user not being able to purchase things through your application. There's a number of ways that these bugs can cost you money. But then the way that we get rid of them is also a costly process. Developer hours to find, triage, and resolve these bugs: that's valuable time that's not being spent on new feature development and things that are improving the software product. So, clearly we don't want to have to spend a lot of time at this point in the process. If we can move quickly past the bug hunting and the bug resolving, then that leads to cleaner code and happier developers. So, those are some of the things that we would like to get out of that process. You know, obviously if we could make it go away entirely, that'd be great, but the way we do that is by working at it and iterating against those bugs. And that's where Rollbar comes in.
5. Continuous Code Improvement and Error Monitoring
We're the leader in continuous code improvement, achieved through error monitoring. We report errors in real time, providing contextual information and helping developers identify and fix code problems. Our components include complete visibility into errors, with real-time updates in Rollbar.
So, we're the leader in continuous code improvement. And that's one of the new terms that we've devised for this. But the way that we achieve that continuous code improvement, the bread and butter of this service, is the error monitoring. So, we take the errors, these exceptions, and the other problems that are occurring in your code and report them in real time back to our platform, which is then going to do some further analysis on that. You'll have contextual grouping and stack trace analysis to point to the true root cause of that issue. With a bug being a code problem, all of this relates back to the code. That's ultimately where we're going to make the fixes as well. And so a lot of this contextual information, a lot of what we do, is ultimately focused on that, with your process, with the developer lifecycle, being a key part of that as well. We want to fit into that process. So some of the components that we have, I think I already sort of mentioned. Number one, there's complete visibility: having real-time visibility into these errors. There's not going to be a delay for ingest and processing when you send these errors across from your application. They show up in Rollbar within a matter of milliseconds, usually. So watching those go directly from inside your code, where you typically may have a bit of a black box, and then seeing that come out of your application and being externally visible in Rollbar, that's a key part of the process.
6. Automation Grade Grouping and Event Classification
We have a machine-learning-powered grouping algorithm that classifies and groups events with the same signature, pointing to the single point of failure, the code problem. Rollbar provides a low-touch way of getting insights that would typically require manual work in log analysis.
But we don't just stop there after we ingest these. If we did stop there, then you would say that that could be akin to a logging service. You ingest the text and then you have to do the work to find the useful info inside of that text data. So, step two, the automation-grade grouping. You can think of that as being a very low-touch way of getting the insights that you may typically have to work for in something like log analysis. So we have a machine-learning-powered grouping algorithm that is working on each new event that comes into Rollbar to classify that and to group it with the other events that may have the same signature there. So that's another key distinction from something that you would look at for logging: instead of showing each individual text entry and the message that comes along with that, we're grouping those to try to point to the single point of failure, the code problem. So when we move into the Rollbar UI in the demo, I think it will become clear how we're better organized around the problems underlying these events.
7. AI-Assisted Workflows and Bug Resolution
The AI-assisted workflows and grouping algorithm help you find and solve bugs faster, saving time and frustration. Rollbar provides visibility and actionable insights to improve deploy success rate and iterate through releases faster. It supports the entire development cycle, from lower environments to production, and even helps customer support agents address UX degradation. By saving time and improving code health, Rollbar leads to a better product and a smoother development process.
And then the AI-assisted workflows, so with the grouping algorithm and with it having the better intelligence, we're also working to develop new solutions that allow you to take the signal generated from these Rollbar items and then point that and integrate that into other automated workflows. I have a demo example that we'll show towards the end of the slide time. So all of that is ultimately helping you to get around these bugs faster, to find them and solve them much more quickly, spending less time and not banging your head on your desk as much in frustration because you can't find the problem.
So visibility is key, and then also having actionable visibility to fix the problems. So here are the good numbers. We talked about the bad numbers, and the good numbers are here: improving the deploy success rate, iterating through releases faster. That typically points to healthier processes. It may not be the end result of that. But all of these things are basically less time spent resolving bugs, which is going to eat up less of your money through developer hours and give you a better experience all around.
And with all of this talk about creating code and the development process, it's important to note that we don't have a vertical pane of visibility there. Rollbar helps across the cycle, going all the way from the things that happen in lower environments, like your test environments, up through production, all the way to your first line of defense against UX degradation, which can be your customer support agents. And that's actually a growing use case that we've identified: maybe it's not your developers that are starting to create the tickets in the workflows, but maybe it's the support folks that are being notified about the problems that those end users are experiencing. And even in that case, that helps to provide more information to the folks that may be making the changes, like your developers and the folks that work to fix these problems. And with all the time saved, of course, that leads to a better product in the long term, because your developers are more engaged with the type of problems they need to be solving. The other folks in the life cycle are generally less bogged down by those errors as well. Thinking about it all the way up to support, I can think of a specific customer testimonial where they said, our support team had the quietest night that they've ever had, and they were able to attribute that pretty strongly to Rollbar and its effects on their release cycles and their code health.
8. Rollbar React SDK and Error Workflow
In this part, the speaker introduces the Rollbar React SDK and demonstrates its usage in a React application. They show how Rollbar provides contextual information, such as stack traces and telemetry data, to help developers identify and solve errors. The speaker also discusses integrating Rollbar with source control systems and ticketing systems, demonstrating how Rollbar can generate workable tickets and automate the ticket creation process. They highlight Rollbar's support for various notification preferences and its integration with Slack for real-time error notifications and actions.
All right, so that's enough slides for now. We'll try to get through the rest pretty quickly. Just to do a brief mechanical demo here. So, this is where I will take a second to introduce my colleague Jeff Hoffer's new Rollbar React SDK. So, we're sort of... this is part of the reason that we're here, is to help promote the new library that we have here. We've been able to monitor React for a long time, but only last week when Jeff released this new library did we make this more of a natural process for React developers.
So, if you've used Rollbar before, if you've used it with React, getting it set up may have been a little bit of a process itself, and so Jeff's library is designed to ameliorate those problems and help make this a more React natural process. So, let me go ahead and paste this into the chat, and I think I see a couple of questions in there as well. Let's see if I can swing back to those. So, there's a link to the SDK, this is a public repo. You'll be using this to set up with the new library. So, I'm gonna pause just a second for questions, I'll try not to do this, we'll get to Q&A time at the end. Just wanna make sure I'm not missing anything time-sensitive.
All right, so, yeah, I think most of these can be addressed in the Q&A towards the end. So, this brief demo I have, it is going to be using React. Due to the infancy of this library, I have to make a bit of a disclaimer slash an excuse in that I may not have the full-fledged demo environment for React that I would like, but this is just a calculator app, it's forked from GitHub. I think it's one of the more popular sample applications you can have there. And in here, I'm more or less just adding the Rollbar configuration with the provider context that it needs. And then in the display file, I'm throwing an error whenever the number 42 is displayed. So a little bit of a Douglas Adams problem that we've got there, can't get to 42. So with this being injected into my code, Rollbar has been added, and we also use the Node package manager to add that. So, there's not a lot that has to be done to the code, and as a result of me getting Rollbar into this, for things that may typically fly under the radar or not provide a ton of contextual insight, I will now have better information to solve those. So this is what I see in my app here. Obviously there is a problem and, oops, I can see a good bit of this stack trace here, but at the same time I'm now receiving this information in Rollbar as well. I have a new uncaught error here and an illegal value. I'm actually sending two here; this was one of the explicit method calls just to make sure we're getting this stuff in. But now we see that there is this illegal value, we have this stack trace information being shown in the Rollbar web app, and I'm also receiving telemetry data as well. So all of this is more or less out of the box for Rollbar. We can even see the user clicks that I was making; a lot of contextual information is being captured here. So with this alone, and with this being a specific example I've got, I don't think there's any question about what I've done.
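To make the mechanics of that demo concrete, here is a minimal sketch of the idea: a display that throws on 42, and a reporter that stores the error together with the clicks that led up to it. The `Reporter` class, `renderDisplay`, and `renderWithReporting` are hypothetical stand-ins written for this illustration; they are not the Rollbar React SDK's API, which does this wiring for you.

```typescript
// Hypothetical stand-ins: `Reporter` only mimics the idea of an error
// reporting client with telemetry. It is not the Rollbar SDK.
type TelemetryEvent = { kind: "click" | "log"; detail: string };
type CapturedError = { message: string; stack: string; telemetry: TelemetryEvent[] };

class Reporter {
  private telemetry: TelemetryEvent[] = [];
  readonly captured: CapturedError[] = [];

  // Record a breadcrumb, such as a user click, before anything goes wrong.
  track(event: TelemetryEvent): void {
    this.telemetry.push(event);
  }

  // Store the error together with a copy of the breadcrumbs collected so far.
  capture(err: Error): void {
    this.captured.push({
      message: err.message,
      stack: err.stack ?? "",
      telemetry: [...this.telemetry],
    });
  }
}

// The calculator display from the demo: showing 42 is treated as an error.
function renderDisplay(value: string): string {
  if (value === "42") {
    throw new Error("Illegal value: 42");
  }
  return value;
}

// Run a render and report anything that would otherwise go uncaught.
function renderWithReporting(reporter: Reporter, value: string): string | null {
  try {
    return renderDisplay(value);
  } catch (err) {
    reporter.capture(err instanceof Error ? err : new Error(String(err)));
    return null;
  }
}

const demoReporter = new Reporter();
demoReporter.track({ kind: "click", detail: "button 4" });
demoReporter.track({ kind: "click", detail: "button 2" });
const shown = renderWithReporting(demoReporter, "42");
```

After this runs, `demoReporter.captured` holds one entry with the message, the stack, and the two clicks, which is roughly the shape of what shows up on the item page in the demo.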
So that was more or less, I wanted to show that for the sake of using React and then for some of the other stuff I've got more of my features hooked up here. So this donations page is just designed to send some errors through. I can see them here in the console and then those bubble up to the top of my Rollbar items list as well. So one of the things that's hard to do, for me, not really knowing React, was to get the code context involved, and so this is where you can see more of the value. When you integrate Rollbar with your source control system it's not only going to tell you where the problem occurred, but I even have git blame info that's telling me what line it occurred at, who made the change, and what commit they made this change in. So using this info, if there was a problem or a question about what's happening here, how did this error occur, what is the invocation pathway, it's now pretty clear. There's a lot of info here to help me unravel that mystery. So I'm also getting telemetry showing on this as well. And so I'm following through the Rollbar workflow. I'm talking more about how this is actually going to fit into my process. From here, I would have a number of options. I can add this to my ticketing system, create a new issue in Jira. This will send me over to my board and then I will have a more bona fide ticket there. So now, in my JavaScript demo project, I have a workable ticket and it's also going to include a link back to the original item in Rollbar as well as some of this stack trace info. So one of the topics that we sort of build towards in this workshop is the Rollbar workflows, and this is starting to establish some of that. Getting things into a ticketing system is usually one of the core parts of making sure this is captured by the developer process. And so now we can either do that manually, where I clicked to create that ticket, or we can automate that process as well.
I have another project here that's doing automated ticket creation and I can see there's a lot of tickets in here. These are all being generated as those errors are, excuse me, as those errors are reported to Rollbar. So, whatever your personal preferences are for receiving notifications, for creating tickets and working through them, we try to support that. Here's a brief look into my Slack channel that I have set up for this demo. As these errors are coming through, I'm getting Slack notifications about this, so the Slack actions are allowing me to quickly resolve or mute these. I can change the severity level of those. I can assign them to one of the users in my Rollbar domain.
9. Decisions, Terminology, and Item Management
We'll discuss the decisions between different workflows and the terminology used in Rollbar. Each project represents a code base, and each event sent to Rollbar is called an occurrence. The menu items in Rollbar are called items, and we can focus on them for triaging. Projects are typically associated with Git Repos, and item management involves changing status and prioritizing items.
So, we'll talk more about the decisions you have to make between the different workflows. Obviously, if I didn't need tickets, then I could just send the messages in here and then resolve them quickly without having to do the extra work there.
All right, so let's pivot back to the slides. We'll get through those quickly. I want to get to the hands-on content as fast as possible, since that's where this really gets interesting.
All right, so before we start with the workshop, there's just a few things that I want to introduce to make sure everyone starts off on the right foot. I don't want there to be a lot of confusion coming right out. So, just some terminology review and some of the key functional changes you can make in this process.
So, there are a number of different projects that are usually configured in a single Rollbar account. Mine, I just used two of those to show off the React and then also the Browser.js that I had there. So, each one of those is a project. You can think of a project as being a code base. So, every microservice can have its own project, and it's typically best to configure it that way so that you get more granular settings for each of those.
Then each event that we send to Rollbar, we're going to call each of those an occurrence. So, typically they represent stack traces. That's where we see a lot of our value, but we also have the ability to just send log events as occurrences. Rollbar will be able to tell the difference between those. The analysis done on log messages is not as heavy as what's done on exceptions because they don't have stack traces.
And then what we end up with, the things that we see in the menu that we work against, we call those items. So, each of these is an item. We can see from the total column here how many individual occurrences we're getting for an item. So here, the respond with from my demo was a newly generated item, I created that as new. But if I go back to my demo here and I send a few more, we'll see the occurrence count go up on that while the number of items stays the same. So sending a few more of those in, I sometimes have to trigger a refresh here. Should see those come through any minute now. Let's see, so, watch for the total occurrences on that. What was I sending through there? I may have chosen the wrong one. I've got a lot of demos open right now. We'll look at LaunchDarkly next.
All right, so items, occurrences, those sort of roll up together. We don't have to do a lot of occurrence investigation typically. So when we look at these, most of the time we can focus on the items. There is a lot of data in the occurrences. We can go to that level. But that may or may not be required for the triaging that we do.
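The relationship between occurrences and items can be sketched in a few lines. This is an illustration only: the `Occurrence` and `ItemSummary` shapes and the `rollUp` helper are assumptions made for the example, and the fingerprint is taken as given rather than computed.

```typescript
// Hypothetical shapes for illustration; not Rollbar's data model.
type Occurrence = { fingerprint: string; message: string; timestamp: number };
type ItemSummary = { fingerprint: string; title: string; total: number; lastSeen: number };

// Roll individual occurrences up into one item per fingerprint.
function rollUp(occurrences: Occurrence[]): ItemSummary[] {
  const items = new Map<string, ItemSummary>();
  for (const occ of occurrences) {
    const existing = items.get(occ.fingerprint);
    if (existing) {
      // Same item: only the occurrence count and last-seen time change.
      existing.total += 1;
      existing.lastSeen = Math.max(existing.lastSeen, occ.timestamp);
    } else {
      // First occurrence with this fingerprint creates a new item.
      items.set(occ.fingerprint, {
        fingerprint: occ.fingerprint,
        title: occ.message,
        total: 1,
        lastSeen: occ.timestamp,
      });
    }
  }
  return Array.from(items.values());
}

const itemList = rollUp([
  { fingerprint: "fp-a", message: "Illegal value", timestamp: 1 },
  { fingerprint: "fp-b", message: "Donation failed", timestamp: 2 },
  { fingerprint: "fp-a", message: "Illegal value", timestamp: 3 },
]);
```

Three occurrences produce two items here; sending more "Illegal value" errors would raise that item's total without adding rows to the list.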
So another slide on projects. I don't think this is all that important here. I'll leave it up for just a second, just so everyone can read through this. But the main thing to understand is that we typically associate projects with Git repos due to the way that we integrate our code contexts with those. That's typically best to think of it that way, sort of the deployable services which also tie in with code bases. And so as we begin to receive those items, that's where the value is. And this is where we sort of pick up and work with the Rollbar web app.
The next topic for the slides, Item Management. This deals with the way that we change their status and how we can sort of prioritize these items. By default, everything that comes into the Rollbar web app has a status of active. But then we also have the ability to resolve and to mute those items. So as I work through and identify the code change that I wanna make for that, then I can resolve the item. I have the ability to do that right here. From the items list, that's another one of the details that we have here. We can resolve these here through the UI. We can choose to resolve it in a specific version.
10. Muting, Item Levels, and Automation-Grade Grouping
Resolving an item marks it as done, but if it reappears, we'll see it again. Muting an item deprioritizes it and stops notifications. Item levels mimic log levels, with critical being the highest. Changing levels can be done in the UI or code side. Automation-grade grouping with ML-powered engine identifies root causes. Rollbar's machine learning-powered grouping is an industry leader, providing the best signal for operational analytics.
And then the mute button is here as well. So resolving an item is marking it as done. But if it shows up again, we'll see it again. We'll typically get notified about that. And muting an item is deprioritizing it as much as we can, really.
So a muted item is held in a separate list. I don't have any that are muted currently. But when you do mute them, you'll stop receiving notifications about them as well. So it is a very effective way to turn off the noise about something, to move it into that status. And then that's also going to move it out of your working list here, so those status changes are quick ways to redirect items and clean up your workflow.
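The three statuses behave like a small state machine, which the sketch below tries to capture. The type names and the `recordOccurrence` function are hypothetical, and the exact notification behavior in a real account depends on the notification rules you configure.

```typescript
// Hypothetical model of the status behavior described above.
type ItemStatus = "active" | "resolved" | "muted";
type TrackedItem = { title: string; status: ItemStatus; total: number };
type Outcome = "counted" | "reactivated" | "suppressed";

// Apply one new occurrence to an item and say what happened.
function recordOccurrence(item: TrackedItem): Outcome {
  item.total += 1;
  if (item.status === "muted") {
    return "suppressed"; // muted items stay muted and stay quiet
  }
  if (item.status === "resolved") {
    item.status = "active"; // the problem came back, so we see it again
    return "reactivated";
  }
  return "counted";
}

const resolvedItem: TrackedItem = { title: "Illegal value", status: "resolved", total: 3 };
const mutedItem: TrackedItem = { title: "Noisy warning", status: "muted", total: 40 };
const resolvedOutcome = recordOccurrence(resolvedItem);
const mutedOutcome = recordOccurrence(mutedItem);
```

In this sketch a resolved item flips back to active on its next occurrence, while a muted item keeps counting occurrences without ever asking for attention.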
And then we also have the item levels, the severity levels. And these are going to mimic the log levels. And we have five of them, they're ordinal. So critical is the greatest and then debug is the least of those. You can think of them as having that ordinal property. In the back end, we represent these with numbers and so we can do inequality statements. You can think of critical as being higher than all the rest of these. So by default, the items that come in are going to have the error level. I think that's what we're seeing on all of these here, with the orange error level there. But I can also change these after they come in. And this is how we can organize this list a little better. So I can change this to a warning. Then as we go back, we'll see this represented differently in the UI and we can sort these based on the different levels that we have. So this is organizational stuff, but it's also important for optimizing your workflow. As we get into things like the notification rules and some of the other tools that we have for sorting these, it becomes important to categorize them based on your use case and how you plan to act against them.
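Because the levels are ordinal, "error or above" style comparisons reduce to comparing numbers. Here is a minimal sketch of that idea. The five level names come from the talk; the numeric ranks, `atLeast`, and `sortBySeverity` are assumptions chosen for the example rather than Rollbar's internal values.

```typescript
// Illustrative ranks only: any increasing numbers would work the same way.
const LEVEL_RANK = { debug: 0, info: 1, warning: 2, error: 3, critical: 4 } as const;
type Level = keyof typeof LEVEL_RANK;

// True when `level` is at or above `threshold` in severity.
function atLeast(level: Level, threshold: Level): boolean {
  return LEVEL_RANK[level] >= LEVEL_RANK[threshold];
}

// Return a copy of the list with the most severe items first.
function sortBySeverity<T extends { level: Level }>(items: T[]): T[] {
  return [...items].sort((a, b) => LEVEL_RANK[b.level] - LEVEL_RANK[a.level]);
}

const demoItems: { title: string; level: Level }[] = [
  { title: "Slow query", level: "warning" },
  { title: "Checkout down", level: "critical" },
  { title: "Verbose trace", level: "debug" },
  { title: "Illegal value", level: "error" },
];
const sortedDemo = sortBySeverity(demoItems);
```

A notification rule such as "only alert on error and above" is then just `atLeast(item.level, "error")`.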
All right, so yeah, changing levels. You can do this in the UI like I showed, but you can also set the level code side in the SDK, and whatever level the first occurrence of an item is reported with, that's the level that the item will preserve. So, we don't look at that a lot in the demo exercises. We change those in the UI most of the time, and that change will be respected over the code-side level as well.
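That precedence can be stated as two small functions. This is a sketch under the assumptions stated in the talk (first occurrence sets the level, a UI change wins afterwards); `ingestOccurrence` and `setLevelInUi` are hypothetical names, not SDK or API calls.

```typescript
type Severity = "debug" | "info" | "warning" | "error" | "critical";
type LeveledItem = { level: Severity; seen: number };

// The first occurrence creates the item and fixes its level;
// later occurrences only bump the count, whatever level they report.
function ingestOccurrence(item: LeveledItem | undefined, reportedLevel: Severity): LeveledItem {
  if (item === undefined) {
    return { level: reportedLevel, seen: 1 };
  }
  return { level: item.level, seen: item.seen + 1 };
}

// A manual change in the UI replaces the level on the item itself.
function setLevelInUi(item: LeveledItem, level: Severity): LeveledItem {
  return { ...item, level };
}

const firstSeen = ingestOccurrence(undefined, "error");
const secondSeen = ingestOccurrence(firstSeen, "critical"); // stays "error"
const relabeled = setLevelInUi(secondSeen, "warning");
const thirdSeen = ingestOccurrence(relabeled, "error"); // stays "warning"
```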
All right, so, automation-grade grouping. This is a big component of the Rollbar product, but it's not necessarily something we have to dive into a lot. So, I'm gonna try to keep this light. Our grouping engine is ML-powered, and we're iterating every two weeks to release new grouping rules that are based on what those learning algorithms are picking up. So, I'm just kind of reiterating on terminology here. Occurrences and items, we group these together with what we call fingerprints. And that's where the automated engine is going to come in and do a lot of that work, with these fingerprints. So, again, we're trying to ultimately identify the root cause here. And this basic fingerprinting algorithm, I think I'm probably just going to skip this for now. We don't really have to worry about the basic algorithm as much anymore, because the ML engine is going to override that in most cases. So this is one of our big distinguishing factors actually. I think I saw a question come through in the chat about that. This is where Rollbar is an industry leader. Our machine-learning-powered grouping does not have any direct competition. I believe we're the first to have a grouping engine this advanced. And because the grouping engine is not where we spend a lot of our time, it's sort of an unnoticed thing: because it's under the hood, you may not see a lot of the value in making these changes. But ultimately, what we're delivering here is the best signal that you can get from any of these products. So for anything that you're doing with operational analytics, we are the one focused specifically on errors that gives you the most context and the most quickly actionable events to investigate here.
So I'm looking at the illustration that we've got here. We go from no grouping, down to hard-coded algorithms, down to ML. So you can think of the no-grouping world. This could be logging services where you're just ingesting the stack traces that come out of the errors, but there's no root cause analysis there. Then for hard-coded algorithms, we see that the bugs are getting sorted a little better, but it's not perfect. Those are still grouping rules that are created by humans to solve these problems.
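To see what a hard-coded, human-written grouping rule looks like and where it falls short, consider this deliberately simple sketch. It is not Rollbar's fingerprinting algorithm; `naiveFingerprint` and the trace shapes are assumptions made up for the illustration.

```typescript
// Hypothetical trace shape for illustration.
type StackFrame = { file: string; method: string; line: number };
type StackTrace = { exceptionClass: string; frames: StackFrame[] };

// A hard-coded rule: exception class plus file and method of every frame.
// Line numbers are ignored so that unrelated edits do not split a group.
function naiveFingerprint(trace: StackTrace): string {
  const parts = trace.frames.map((frame) => `${frame.file}:${frame.method}`);
  return [trace.exceptionClass, ...parts].join("|");
}

const beforeEdit: StackTrace = {
  exceptionClass: "TypeError",
  frames: [{ file: "Display.js", method: "render", line: 12 }],
};
const afterEdit: StackTrace = {
  exceptionClass: "TypeError",
  frames: [{ file: "Display.js", method: "render", line: 15 }],
};
const otherPath: StackTrace = {
  exceptionClass: "TypeError",
  frames: [
    { file: "App.js", method: "handleClick", line: 30 },
    { file: "Display.js", method: "render", line: 12 },
  ],
};
```

`beforeEdit` and `afterEdit` get the same fingerprint, which is the improvement over no grouping. `otherPath` fails at the same place but arrives through a different call path, so this rule puts it in a separate group even though the root cause may be identical. That kind of miss is what a learned grouping engine is meant to reduce.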
11. Importance of Grouping and Automation
The machine learning grouping engine is crucial in reducing duplicate bugs and allowing for automation. By consolidating root causes, we can automate processes such as rolling back changes and directing tickets to specific groups. Grouping accuracy improves the signal, enabling text analytics, PagerDuty messages, and Slack alerts. The goal is to achieve a highly automated system that fixes issues based on the signal generated.
And as we all know, human error is a very real thing. And so that's where the power of the machine learning grouping engine really shines: over time, and with our focus on error ingestion, we obviously have a lot of sample data to use in this process. Every two weeks specific languages and frameworks are usually targeted for improvements, and we'll see the number of duplicate bugs going down. You won't have as many things to sift through in your Rollbar account. And that gives you more time to actually resolve the bugs. So being sort of meta there, you can think about all the bugs that you may have to sift through in your Rollbar account if the grouping engine were not so great, and then ultimately arriving down to here where we've consolidated that into the true root causes. And with that comes the ability to automate elsewhere. So that's where we see a lot of the value. And this is building towards a bigger picture solution within the development world. So being able to tie in a single error that's associated with something like a canary deployment would allow us to automatically pull away from that new canary deploy and go back to the prior version. So it's a great way to test in production or to test things progressively, because we can automate on the signal and say, once the errors increase, we need the amount of deployment and the distribution to decrease. Then we can tie in something like our people tracking, where we know who's been affected, how many times, and by what errors. And that allows us to direct those errors, or direct those Rollbar items and the tickets, to specific groups and certain areas of product engineering maybe. And then looking at that in contrast, we could use people tracking to maybe say, this is something that only our lower tier users are experiencing. This is not an emergency-level event, but we do need to fix that. We could send that over into a Jira ticket where it can be handled later.
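Both automation ideas here, pulling back a canary when errors rise and routing by who is affected, come down to simple decisions once the error signal is trustworthy. The sketch below shows one way to express them. Every name, the 1% tolerance, the doubling rollout step, and the tier labels are assumptions for illustration, not Rollbar features or defaults.

```typescript
type DeployStats = { requests: number; errors: number };

function errorRate(stats: DeployStats): number {
  return stats.requests === 0 ? 0 : stats.errors / stats.requests;
}

// Decide the canary's next traffic share (0 to 1).
// If the canary errors noticeably more than the stable version, drop it to 0
// (go back to the prior version); otherwise keep rolling out by doubling.
function nextCanaryShare(
  currentShare: number,
  stable: DeployStats,
  canary: DeployStats,
  tolerance: number = 0.01, // hypothetical threshold
): number {
  if (errorRate(canary) > errorRate(stable) + tolerance) {
    return 0;
  }
  return Math.min(1, currentShare * 2);
}

// Route by impact: page someone only if a tier other than "free" is affected.
// The tier names are made up for the example.
type Impact = { tiers: string[] };
function routeByImpact(impact: Impact): "page" | "ticket" {
  return impact.tiers.some((tier) => tier !== "free") ? "page" : "ticket";
}
```

For example, with the stable version at 5 errors per 1,000 requests and the canary at 10 per 100, `nextCanaryShare(0.1, …)` returns 0, while an error affecting only free-tier users is routed to a ticket rather than a page.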
So the importance of grouping in this process, it's foundational here, because without the grouping intelligence that we have, we would be creating a lot of extra work in other parts of the development cycle. Sending in 100 tickets to Jira for one code problem, that's legwork for somebody; somebody has to clean up the Jira board. And if nobody does, then the problem is ignored because nobody wants to deal with that. So this graph is sort of trending upwards here with grouping accuracy. As that accuracy improves, we go from just being able to do something like text analytics of those messages, and we improve the signal by grouping better. Once we've grouped well enough to start sending PagerDuty messages and Slack alerts to our team, then that means we trust the grouping engine to point to the root cause in most cases. Now we can automate because we know that grouping is good enough to allow other automated systems to rely on that signal. And eventually we'd like to arrive at the point where we're so well automated that our software is maybe rolling back its own broken changes and implementing fixes based off the signal that we've helped to create and put in place with this process. So that's sort of the value there. And as we continue on this journey, as grouping improves over time, and as we work on this, we have more capabilities that are unlocked by getting this right and reducing the noise in this process.
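The "100 tickets for one code problem" point is easy to show in code: if tickets are keyed by the item's fingerprint, repeated occurrences update one ticket instead of flooding the board. `TicketBoard` below is a hypothetical in-memory stand-in, not the Jira integration, and the `DEMO-` keys are invented.

```typescript
type ErrorReport = { fingerprint: string; title: string };
type Ticket = { key: string; title: string; occurrences: number };

// Hypothetical in-memory board: at most one ticket per fingerprint.
class TicketBoard {
  private byFingerprint = new Map<string, Ticket>();

  file(report: ErrorReport): Ticket {
    const existing = this.byFingerprint.get(report.fingerprint);
    if (existing) {
      existing.occurrences += 1; // same root cause: update, don't duplicate
      return existing;
    }
    const ticket: Ticket = {
      key: `DEMO-${this.byFingerprint.size + 1}`,
      title: report.title,
      occurrences: 1,
    };
    this.byFingerprint.set(report.fingerprint, ticket);
    return ticket;
  }

  count(): number {
    return this.byFingerprint.size;
  }
}

const board = new TicketBoard();
for (let i = 0; i < 100; i += 1) {
  board.file({ fingerprint: "fp-a", title: "Illegal value" });
}
const otherTicket = board.file({ fingerprint: "fp-b", title: "Donation failed" });
```

One hundred reports of the same problem leave a single ticket with a count of 100. The quality of that outcome depends entirely on the fingerprint being right, which is why grouping accuracy comes before automation.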
12. Grouping Tools and Customization
When grouping is not going well, Rollbar provides tools to coalesce items and merge duplicates. Custom fingerprinting can be used to override the default grouping engine and create rules specific to each customer's use case. Once you have trust in the grouping engine, you can take advantage of Rollbar's integrations, such as Slack notifications and actions, to more efficiently organize Rollbar items. Rollbar also supports various notification channels, allowing you to intelligently route alerts to different teams. Custom formatting can be applied to Slack messages, and the generic webhook can be used with services like webhook.site to receive and evaluate occurrences.
All right, so a couple of slides on what to do when grouping is not going so well. This is really becoming a minor topic because of the machine learning in those grouping engines, but we do support it and it's still a part of how most larger customers use Rollbar.
When things are maybe not grouped together and you have to coalesce those items, we do have tools for that as well. So there are four duplicates right here and these were all a part of me dogfooding the new React library. There were things that were popping up as we were working through some of that and they're four identical messages and I don't think there's really any extra signal to be gained here. They're probably different because of the invocation pathways, different stack traces. If there's any difference in these coordinates then those would be grouped separately but I can merge these to solve the undergrouping. And doing this, it's going to suggest a custom fingerprint here. So this is the other way to override the default grouping engine. In this case, I'll create one just for demo purposes and we'll take a look at that.
So in cases where the machine learning grouping engine is not doing its job well enough, or maybe if it's mistakenly grouping things the wrong way, we also have the ability to come in and create custom rules for that. So these are JSON rules. They come in as a list, and each one has a set of conditions that are required to match this fingerprint. And again, this is where things get really technical. This could be its own session just talking about custom fingerprinting and the ways to make that useful. This is very custom for each customer. This is where your use case, your data set, your business domain, all of that can make an impact on how these custom rules are created. So merging and custom fingerprinting are the ways to help the grouping engine with anything that's not going correctly. And they allow you to fine tune it, to make sure that it's working in an optimized fashion. So once you've got that level of trust in the grouping engine, that's where you can really start to take advantage of Rollbar's integrations and how we provide value throughout the developer process.
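To make "a list of rules, each with conditions that must match" concrete, here is a minimal sketch of how such a rule list could be evaluated. The rule keys (`condition`, `path`, `contains`, `fingerprint`) and the payload path are assumptions for illustration and should be checked against Rollbar's custom fingerprinting documentation; the matcher itself is a toy, not Rollbar's grouping engine.

```javascript
// Illustrative rule list: a condition on a payload path, plus the fingerprint to assign.
const rules = [
  {
    condition: { path: 'body.trace.exception.message', contains: 'missingMethod' },
    fingerprint: 'missing-method',
  },
];

// Read a dotted path such as "body.trace.exception.message" from a payload object.
function getPath(obj, path) {
  return path.split('.').reduce((node, key) => (node == null ? undefined : node[key]), obj);
}

// Return the fingerprint of the first matching rule, or null to keep default grouping.
function customFingerprint(payload, ruleList) {
  for (const rule of ruleList) {
    const value = getPath(payload, rule.condition.path);
    if (typeof value === 'string' && value.includes(rule.condition.contains)) {
      return rule.fingerprint;
    }
  }
  return null;
}

const occurrence = {
  body: { trace: { exception: { message: 'form.missingMethod is not a function' } } },
};
console.log(customFingerprint(occurrence, rules)); // missing-method
```

Occurrences that share a fingerprint end up in one item, which is what lets a rule like this override both undergrouping and overgrouping.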
So most of what I've shown through the visibility here has been me using the web app. And that's something that most people would like to avoid if they can. You know, having to come here, for me having to leave this window open and watch for these errors to come in, not really valuable and going to have a lot of missed messages if I'm just trying to look across those. So that's where we see things like the Slack notifications, the Slack actions that I showed a moment ago, those becoming more valuable. So with these I can more quickly organize my rollbar items. I can just divert these to different user groups. I can assign this back to one of my other personas in my rollbar account. I'll see that reflected in the items when I go and view these, we'll see that there's probably going to be an owner associated there. See if that was error 10 or not.
So in addition to the Slack notifications, we have a number of different channels here. In looking at my projects page we can see we've got Slack, I've got Jira, there's a Trello board that's connected to one and then there's some others where I'm also just sending across generic JavaScript notifications. So, or excuse me, not JavaScript, generic webhook notifications, my mistake. So within the project settings, each one has its own notification channels and its own rules that you configure. So this is where you can intelligently route these alerts to different teams. I can see here that I've got a number of Slack rules in place. There are 12 different rules for that. Some of them are more specific than others. This is one only for critical occurrences, and in the message I'm adding a little bit of custom formatting there. I'm sending across the framework and also the context that comes from that occurrence. So this is where we can really get to a lot of the power in the customization, formatting the Slack messages in whatever manner you would like. You can see there's a default format that comes from this. But then we can also add other stuff as well. I think this is the correct way to do something like an @-mention here. You can format those into the messages that you send out of these systems. So because we don't have something that I know everyone can use, the generic webhook is what I've added to the workshop. So in that, you'll be using webhook.site. It's a free website that allows you to receive these occurrences, these payloads that come through, and just evaluate those. So this could be a part of a longer process where maybe you send the webhooks to a Lambda listener or something along those lines. You can do any number of things with that. I think Zapier is an app that allows you to send out notifications based on the content of these payloads.
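As a rough sketch of the "Lambda listener" idea, the function below takes a webhook payload and decides where it should go. The field names (`event_name`, `data.item.level`, `data.item.title`) and the string level values are assumptions for illustration; inspect a real payload on webhook.site before relying on any of them.

```javascript
// Hypothetical listener logic: map an incoming webhook payload to a destination.
// Assumed shape: { event_name, data: { item: { title, level } } }.
function routeWebhook(payload) {
  const item = (payload.data && payload.data.item) || {};
  // Only react to brand-new items in this sketch; everything else is ignored.
  if (payload.event_name !== 'new_item') {
    return { channel: 'ignore', summary: '' };
  }
  // Critical items page someone; the rest become tickets to handle later.
  const channel = item.level === 'critical' ? 'pager' : 'ticket';
  return { channel, summary: `[${item.level}] ${item.title}` };
}

const example = {
  event_name: 'new_item',
  data: { item: { title: 'TypeError: form.missingMethod is not a function', level: 'error' } },
};
console.log(routeWebhook(example).channel); // ticket
```

Keeping the routing in a pure function like this makes it easy to replay payloads captured on webhook.site against it before wiring up a real listener.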
13. Integrations and Workshop Kick-off
We have custom integrations and a webhook channel for services not yet integrated. The webhook notification type sends out the full dataset, including stack trace details. Jira integration allows for manual ticket creation and filtering based on various criteria. Source control integration enables viewing code context by connecting to Git-based version control systems. Other integrations, like LaunchDarkly, allow for triggering feature flags.
So a lot of things that you can do there. I see in the chat a question about Mattermost. I don't think we have support for that as a boxed integration currently. These are the channels that we have for the custom integrations, but the idea behind the webhook is that for those that are not yet integrated, you can still post those webhook payloads to a URL. So, for any service that's not a part of that yet, feel free to send us a note about a feature request or something. But for a temporary workaround, you may be able to leverage the webhook channel here.
And interestingly enough, the webhook notification type is the only one that sends out the full dataset. This is the exact payload received by the occurrence that comes into the rollbar platform. There's a lot of text here. There's a lot of data about how this message was received, where it came from, what was the stack trace? And here are the coordinates as a part of that stack trace. So all this detail that we get is also going to be available in the web app. But this is sort of a unique case where these generic webhooks contain all the info that comes from that event.
All right, so besides just the notification channels, so issue tracking like Jira, I mentioned before, this one does fall under the notification rules in that it is configured here. They're slightly different when you go to set them up. We can see for the Jira settings here on this project. In addition to supplying credentials and creating rules, I tell it which project it's going to. You give it all of the Jira configs that it needs to create and update these tickets, the reactivated status, the resolved status. And in this case, I did not have any automated ticket creation. It was just doing the manual tickets where I clicked through and clicked the button to create that. So that's sort of the low trust way of working with Jira. But once I gained that trust in the grouping engine itself to know that I don't have to manually create these tickets, that's where I can add more value here.
So for every new item that comes into this project, we would want to create a Jira issue there. And I can filter this heavily. The default filter is anything that's error level or above. We don't always want that. Sometimes I do need to know about the warning messages. So maybe I get rid of that. But then I can add as much filtering here as I want. And this is where the templated notification rules we have are useful, but they really become more powerful when you fine tune them and add all of the extra context here. I can say what file name is going to trigger this. You can see here, some of the autofill from other training workshops I've done with things like foobar, doing a regex match on API, I don't remember what we were searching on with the term wave there. But then something like subscriptions for a specific business unit. The path filter allows you to inject your own key value pairs that you may have added to the data and then filter on those as well. So this is where I could do something like route it to different teams and team could be something that I've created and added to the error data. So that's a bit of an advanced use case where we can start to explore some of that as we go through it.
So here just for that one I can have all new items create a ticket on that given Jira board and have those go through. So notifications and issue tracking are all based on the premise of taking info and signal from the Rollbar web app and then sending that outward to the other services that we also use for development.
The third one, the last integration that we'll talk about, and then we'll just go ahead and get started with the workshop: the source control integration. We have the ability to connect to the Git-based version control systems, and that's what allows us to show the code context like I was seeing with this message here. So this is a result of me integrating source control through my settings here. If I go back to this project, JS demo, and go to source control, I have authenticated it with my personal GitHub username that's there. And then I've pointed it to the correct repo with the right branch. And since this is just the home root there, I just had to give it the forward slash there to say, this is my project root. So by doing that and by adding some context to the payload, where I'm giving it the Git SHA for the code version and I'm giving it the server root from where it begins, that allows me to see the full Git code context with each of those messages. And we have a number of other integrations. One that I'll demo in a moment after we start the workshop is the LaunchDarkly workflow, where notifications can trigger LaunchDarkly feature flags to auto kill switch certain things that may be causing problems.
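The "context in the payload" mentioned above can be sketched as a configuration object for the browser snippet. The key names (`payload.client.javascript.code_version`, `payload.server.root`) reflect my understanding of the Rollbar.js configuration reference and should be verified there; the token, environment name, and SHA are placeholders.

```javascript
// Build a configuration object carrying the code version and project root.
// Key names are assumed from the Rollbar.js configuration reference; values are placeholders.
function buildRollbarConfig(accessToken, gitSha) {
  return {
    accessToken,
    captureUncaught: true,
    captureUnhandledRejections: true,
    payload: {
      environment: 'workshop',
      // The Git SHA ties each occurrence to the exact revision that produced it.
      client: { javascript: { code_version: gitSha } },
      // A forward slash marks the repository root as the project root.
      server: { root: '/' },
    },
  };
}

const config = buildRollbarConfig('POST_CLIENT_ITEM_TOKEN', '0123abc');
console.log(config.payload.client.javascript.code_version); // 0123abc
```

In the workshop page, an object like this would take the place of the `_rollbarConfig` value in the pasted snippet.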
All right, and so in the interest of time and of folks getting their hands dirty as I talk about some of these other things, let's go ahead and kick off the workshop and then we'll circle back to some of these things for those that are still interested. So this is the link to the repo that you'll use. You don't actually have to clone it. You can just copy the HTML file from the raw.
14. Rollbar Project Setup and Error Reporting
To set up your Rollbar project, follow the instructions on the repo page. Copy the index.html file to a text editor and uncomment the necessary Rollbar method calls. Create your first Rollbar project and select the JavaScript option. Retrieve the Rollbar snippet and paste it into your file. Test the error reporting by calling window.onerror from the console. Any questions about the code or instructions can be asked in the chat. Microsoft Teams integration is a highly requested feature.
So navigating over to that repo. Let's see, I've got so many tabs open. So here there are instructions on the repo page. This isn't too complicated, but this is basically going to get your first Rollbar project up and running in the case that you didn't do that during this demo already. So here you can follow the instructions, but you'll take this index.html file. For simplicity's sake, this is all about portability, just taking a basic webpage that can be copied out in a single file. I can copy this over to a text editor. You can use a code editor, but again, it's not really necessary. I think that was part of why we're doing that here. With this text editor, it does auto format once I save it as a certain type. So I've got the nice typing there. There's a couple of things that did not work unless they were commented out first. So there are three Rollbar method calls that you'll need to uncomment after you paste in your snippet there. So to get your Rollbar code to work, you'll need to use a description and your Rollbar code snippet. To set up this project, you will need to create your first Rollbar project. So for me, I've already got one in my account, so I have to go to the projects list. If you're starting fresh, then there should be a nice guide to ask you to do this. But here, let's see, I already have one called workshop. We'll do React workshop. Because this is basic browser JS, we can just choose the JavaScript option here. Sorry, somebody asked for the link in the chat, I only pasted it in the Discord. My apologies. Oops. Sorry, that is actually not the right link. Sorry, trying to do too many things at once here, folks. There we go, now that's in the chat. So from the tutorial, you will select your framework. For everybody, it's going to be JavaScript. And continuing through here, our new user guide, the new project setup, is going to automate a lot of the typing and the packaging and stuff.
And so to get the Rollbar snippet that you'll be using, you should be seeing a page like this where it offers you the code snippet. And it will even show this little loading wheel up here. And that's because this page will change to your items page as soon as you've got your first error reporting in. So proof of concept, I will take mine and paste it in here. I'm gonna go ahead and uncomment my two little Rollbar method calls there, or three of them, excuse me. And now I can open this file, and we should be able to see an error come through there. So this is the page itself. This is my React workshop. I'm hoping that I can report an error from here. There's actually a way to test that before we get into the sending of the real errors that are built into the workshop: this window.onerror call. If you paste this into the console for Chrome, which is what I'm using here, it's basically just sending a test notification. So it says undefined, and here on my project setup page, it's thinking, it's trying to do something. There's something that's been detected here. So it refreshes with the new items list specific to this project, and I can see that first error having come through.
All right, so I'm going to allow folks to work on the workshop for a second and try to answer some of the questions that may have come through, just to make sure everyone's on the same page. Again, those instructions here on the repo, this is not anything that complicated. It's mostly designed to get you acquainted with Rollbar and some of the core features. So if you do have any questions about the code itself and sort of what was expected to change there, feel free to ask. Thanks.
All right, question in the chat just now about Microsoft Teams. That has rapidly escalated in priority level because it is requested, I think, on a weekly basis at this point. We have a lot of folks that want it, and I believe if it's not on this quarter's roadmap, then it's definitely on the next.
15. Workaround and Handling Errors
There is a workaround for MS Teams installation. Rollbar captures and reports all unhandled exceptions. Catching errors in the code is the best way to prevent them from going to Rollbar. The speaker is available for assistance and questions throughout the workshop.
But, with that, we do also have a workaround. I think you can install it from the MS Teams site, from inside the domain. I know one of my SE colleagues worked on this recently. I'll try to find some info for a follow-up on that.
So, question in the Discord, does Rollbar replace the onError handler, or just listen to the error event? So, the onError is just used for testing purposes there. Rollbar is going to capture and report all unhandled exceptions. So if you're looking to prevent things from going to Rollbar, then catching them in the code is the best way to prevent that. And that's sort of the, I think the train of thought behind that is, you know, if it is being caught in the code, then it's being handled there. And so what we're primarily dealing with are unhandled errors. And that may be because they're unnoticed, or just because they're truly not being handled.
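The handled versus unhandled distinction can be shown with a small sketch. The `reporter` object here is a stand-in so the example runs anywhere; in the workshop page the global `Rollbar` object and its `Rollbar.error(...)` call would play that role. The function names are hypothetical.

```javascript
// Stand-in reporter that records what would be sent; not the Rollbar SDK itself.
const reported = [];
const reporter = { error: (err) => reported.push(err.message) };

function parseQuantity(text) {
  const n = Number(text);
  if (Number.isNaN(n)) throw new Error(`not a number: ${text}`);
  return n;
}

// Caught and handled quietly: nothing is reported, because the code dealt with it.
function quantityOrDefault(text) {
  try {
    return parseQuantity(text);
  } catch (err) {
    return 1;
  }
}

// Caught, but reported on purpose before recovering.
function quantityWithReport(text) {
  try {
    return parseQuantity(text);
  } catch (err) {
    reporter.error(err);
    return 1;
  }
}

console.log(quantityOrDefault('abc'), reported.length);  // 1 0
console.log(quantityWithReport('abc'), reported.length); // 1 1
```

Anything that escapes both patterns is the unhandled case that gets captured automatically.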
All right, so the unfortunate thing for me about the workshop is I really have no way of tracking progress without being in the room and running around a bunch. So, anybody that may need help, feel free to do a raised hand in the Zoom or send a chat message directly to me. We've only used up the first hour. So, any questions you have, I'm here to support all rollbar things that have to do with this.
16. Debugging Errors and Grouping in Rollbar
There was an issue with adblock blocking the messages, but it can be resolved by using an alternate browser. During the workshop, small changes were made to the code, and errors were debugged. The speaker mentioned a question about catching errors related to failed loading, but they needed more information to provide a detailed answer. They also addressed a concern about the instructions disappearing after sending a test message, explaining that it was because Rollbar was successfully integrated into the application. The speaker then encountered another missing method error, which they fixed by removing the broken code. They highlighted the value of grouping in complex applications and explained how Rollbar detects distinct events even when the same missing method is present in different lines of code. The missing methods on lines 18 and 36 were fixed, and basic formatting mistakes were addressed. The speaker mentioned a question about capturing uncaught and unhandled rejections.
So just a second. All right, I see something in Discord about adblock getting in the way. So interestingly enough, I had not encountered that, maybe I don't have adblock installed on my professional persona here. Blocked by client, okay. Did not have any logistical problems last time, I was expecting a few. So I think honestly the easiest solution would probably be to just open one of your alternate browsers that you're not using for your actual browsing purposes. Like for me I would maybe open Safari or something that didn't have as many extensions. That's good to know though I've never encountered that on my own so I'll have to debug the adblock. Although something was getting in the way of these coming through.
So with this there's I think a total of seven or eight messages that you'll see come through. So there's just the sample test message that I had and then as we iterate through this you're basically just making small quick changes to the code. Let's see, yep too many demos open. So... Missing method comes through. Then I can try to debug through validating these. We'll find more errors as we go along. So missing method. If I look here... So even without the full GitHub integration, the coordinates should be helpful if it's not glaringly obvious what's wrong. So Line 18, Column 18. If I look there, pretty clearly got a broken method there. I need to go ahead and remove that. Save, refresh, and iterate. That's sort of the flow that I had in mind for this, so...
Alright, so another question about catching errors about failed loading. I guess that would probably depend, uh, yeah, so we would probably wanna follow that one up offline just because I don't think I know enough about that. We have support for a number of different ways of getting things to Rollbar. I don't know if this is relevant at all, but you can thread the Rollbar execution process. So if it was about things that are failing to load, as long as the agent is instantiated as part of that, as long as there is some type of exception or some indicative message that comes out of that process, I would think that probably works. So I'm gonna try to get more detailed info to answer that question as well. And another item in the Discord: when you send your test message the instructions disappear. Yeah, forgot to warn you about that. The good news is that your instructions have disappeared because you've got Rollbar in your application. So, the only stuff that was left in the instructions down there was just about how to troubleshoot if you're not able to get your events in here. So, if you go back and set up another project, you can read in more detail there. But the reason it disappears after that is because, for all intents and purposes, you're good to go at that point. So this is where things get different and I have to look around a little more, because it was throwing another missing method there. Alright, so we got missing method twice. I figure out what's different about these. Well, obviously I know because I wrote the workshop, but this is an example, I think, of all the things I can do to show off grouping in a very simple application. Grouping gets most of its value from the more complex applications, those with lots of dependencies and high user volumes.
But even for something simple, like where I put the same missing method in two different places, Rollbar is going to detect those as distinct events because they're not coming from the same code problem. Those are two different lines of code that just happened to have the same missing method. So there, I would probably choose not to merge those, but instead just fix both problems. So missing method was on line 18 and I got rid of that. And then on line 36, I can delete that as well. Let's see, and then in here, I had put some basic formatting there. A couple of just small mistakes that may be made. The type of things that may go unnoticed. So, a question about capture uncaught versus capture unhandled rejections.
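The point about the same missing method at two different locations can be seen directly in the errors themselves. This sketch is not the workshop code and not Rollbar's grouping algorithm; the function names are made up, and the stack parsing assumes the V8 format used by Chrome and Node.

```javascript
const form = {}; // deliberately has no missingMethod

// Two different call sites that hit the same missing method.
function validateName() { form.missingMethod(); }
function validateEmail() { form.missingMethod(); }

// Run a function and return its error message plus the top stack frame.
function capture(fn) {
  try {
    fn();
    return null;
  } catch (err) {
    const frames = String(err.stack)
      .split('\n')
      .filter((line) => line.trim().startsWith('at '));
    return { message: err.message, topFrame: frames.length ? frames[0].trim() : '' };
  }
}

const first = capture(validateName);
const second = capture(validateEmail);
console.log(first.message === second.message);   // same message text
console.log(first.topFrame !== second.topFrame); // different location in the code
```

Identical messages with different top frames are two code problems, which is why fixing both is a better choice here than merging them.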
17. Generating Multiple Errors in the Web App
To generate multiple errors in the web app, the speaker iterated through the code and solved the problems in each text box. The goal is to put info in the field and press subscribe, but there will be errors at each step. As each method is solved, a little Rollbar nugget is fixed. A critical message appears at the end to indicate successful debugging. Refreshing the web page is necessary after making changes.
So I think the difference there is just being very specific in what type of event we're talking about. So uncaught would be the uncaught exceptions, and then unhandled rejections would be promise rejections that nothing handles, I believe. So when you look at the telemetry data that comes along with some of these events, then that's where we may also see some of those being captured. And the other question was how did I generate multiple errors? So in order to get my web app to show more, I iterated through the code, solving the problems. So I should have explained this more clearly, but this is a subscription page. That's what we're mimicking. And I wrote it in such a way that you can kind of debug down through each text box to try to solve the problems. So ultimately you want to be able to put info in this field and press subscribe and have it work. But at each step along that way, there's going to be an error that is in the way. So because I'd done this so many times getting ready, I think I sort of skimmed through that, but with each text box, I think the way I wrote it was, there's the possibility for some validation here, not required, that's more of like the extra credit stuff to do regex matching and that. But as you solve each method, these are all just called in order by the wrapper method there. And then each one has just a little Rollbar nugget in it that you can solve. And then there's a critical message that comes through at the end and it will pop up and tell you that you've successfully debugged all the fields when you've gotten around those issues. So there's another code problem, we'll get rid of that. And you have to refresh the web page every time, of course, since we're just making these changes locally and we're not kicking off servers and stuff. Any last thoughts?
18. Conditional Rollbar Usage and Privacy Features
A participant asked about conditional Rollbar usage based on user consent under privacy regulations. Rollbar provides privacy features like automatic scrubbing of certain fields. You can engineer a toggle into the code to control event reporting. Rollbar's advantage is its code-side implementation and full code access, unlike log collection agents.
Latest one about conditional rollbar usage based on consent from, looks like, privacy regulations. I don't think we have anything today, I know that we are doing our best to always stay compliant and up to date with attestations and those things, yeah GDPR, I believe is the European data privacy. So we have a number of privacy features that are built in, such as the automatic scrubbing of certain fields from the data that goes into rollbar. But as far as having a sort of toggle on whether or not the events are reported themselves, you could engineer one into the code. That's where rollbar does have an advantage in being code side is that, for something like a webpage, I could write myself an alert to ask the user to consent maybe, and then these configs could be changed based on that. There is an enabled property that you can always just set to false, and that turns off rollbar while leaving it in place. So, yeah, this is, I used to work for a log company where we had a log collection Java agent, and that's more of a black box because you implement that, you run the collector, but you can't change the code itself. And here, we're sort of the other way around. We have full code access. So, those things that you can codify, you can typically do with the SDK.
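A consent toggle of the kind described above could look roughly like this. The `rollbarStub` object is a stand-in so the sketch runs outside a browser; in the page you would call `Rollbar.configure({ enabled: ... })` on the real global. The `applyConsent` helper and the way consent is collected are hypothetical.

```javascript
// Stand-in for the global Rollbar object; it only records the merged options.
const rollbarStub = {
  options: { enabled: false }, // start disabled until the user answers
  configure(opts) {
    Object.assign(this.options, opts);
    return this;
  },
};

// Hypothetical hook: call this with the answer from your own consent prompt.
function applyConsent(rollbar, userConsented) {
  rollbar.configure({ enabled: Boolean(userConsented) });
}

applyConsent(rollbarStub, true);
console.log(rollbarStub.options.enabled); // true
applyConsent(rollbarStub, false);
console.log(rollbarStub.options.enabled); // false
```

Starting disabled and enabling only after a positive answer keeps the SDK in place without reporting anything before the user has agreed.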
19. Using the SDK and Addressing Errors
You can codify things with the SDK. There was a blunder when pressing the demo buttons, causing missing items due to the default behavior of refreshing the page. The main errors were addressed.
So, let's see. So, Juan just asked a question that helped me to unravel the mystery I just gave myself. So, I made a little bit of a blunder here when I was pressing one of those demo buttons and I just saw, subscribe pages, and I did not see those items come through. That's because of the way that these pages have been written. In order to get these to populate through without it automatically refreshing the page, I had to change the default behavior here. So normally, it's gonna trigger a refresh, which will mean that these events are not always reported. So in this case, it's by design, but obviously it's not always the most insightful thing because I kind of missed out on that as well. So I think I got mine through all of the main errors that I wanted.
20. Demoing Rollbar and Future Functionality
Going back to the page, there are seven notifications if you send the test notification, and six if not. After the core instructions, notifications were left as a supplemental item for users to generate their own personal URL using webhook.site. Rollbar on-prem is in development and is expected to be delivered within the next year or so. The workshop concludes with additional demoing and a discussion of future functionality, including the LaunchDarkly integration for feature flagging.
So going back to the page here, there's a total of seven if you send the test notification. There's six if not. And some of the things that are sort of left without instructions, but hopefully you can identify. If I send an info level here, then maybe I want to change this back to info. I think it was good to know that you're done with the critical level event. If you get that critical message to come in, that means you've completed the workshop. So hopefully folks are getting through that. Feel free to drop a note if you are.
And then after the core instructions for the workshop, just solving those code problems and getting used to the way that Rollbar reports these things, the notifications were left as a supplemental item where you can use webhook.site to generate your own personal URL that you can then take and use as a temporary listener for some of these events. So all right. Another question. Rollbar on-prem. It is in development. It's been requested a few times, and it is something we ultimately want to deliver. Currently, it's still a cloud-only service. It's just hosted over the website. But we are looking to make that journey at some point soon. I don't think it's an ultra-long-term goal. I think it's something we'd like done in the next year or so.
All right, so hopefully everyone has done OK with the workshop. I really didn't do a good job of figuring out a way to make this more interactive. It's kind of just putting the code in your hands and hoping that you're interested enough to do this stuff. I'd like to believe that's the case with those that are still hanging around. So just to close out, I'll do a little bit of extra demoing and explain some of the workflows that we have. You can think of this as parking lot time, where we're done with the workshop unless I have specific questions to address. But I did want to also just kind of point ahead towards where we intend to go with this in the future and try to shine a light on some of that extra functionality. So I mentioned LaunchDarkly as one of those other demo paths that I would try to touch on here. So with that, you can think of this as being one of the more advanced integrations that we have. But it's unique in that it combines with a product that gives us some operational control over our code and our application. So I have a separate project just for this. It's called LD demo. That's my LaunchDarkly project. And in here, I have my LaunchDarkly notification channel set up. So for those that aren't using LaunchDarkly and may not know about feature flagging, it is a way of adding multiple different behaviors into a single deployment. So your code can have the ability to do multiple different things to handle an event. And the feature flag is the conditional check for who gets which version. So in its simplest form, this may be a canary test or an A/B test to say the feature flag is on. I wrote this new button. I don't know if it's going to work. So I just want to test it out. And if it doesn't work, then I need to pull that away. I don't want folks using my broken button if it's broken. So with that, the targeting is set on, so we can think of this feature flag as being on.
And as a result of that, here on my demo page, there's a button called Say Hello. But with the settings that I have here, trying to organize these tabs, there is a notification rule here for my LaunchDarkly stuff that says, if a new item comes in or a reactivated item shows up in this project, then I want to trigger that feature flag. In this case, it's got to be a message that has the Say Hello method as a part of that. So if Say Hello shows up in an error, then we're going to turn off the Say Hello button. And so LaunchDarkly, it's not my job to promo for them, but I will say it's a great tool and it works really fast. So I click the button, the error is reported and we'll see that come through. I think I forgot to toggle my item here in Rollbar.
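The kill-switch workflow just described (a new or reactivated item mentioning the Say Hello method turns the flag off, and the button disappears) can be mimicked with an in-memory flag. This is only a sketch: the `flags` map stands in for a feature flag service and is not the LaunchDarkly SDK, and the event shape, flag key, and function names are assumptions.

```javascript
// In-memory stand-in for a feature flag service.
const flags = new Map([['say-hello-button', true]]);

// Hypothetical notification rule: new or reactivated items that mention
// sayHello switch the flag off.
function onItemEvent(event) {
  const triggers = event.type === 'new_item' || event.type === 'reactivated_item';
  if (triggers && event.title.includes('sayHello')) {
    flags.set('say-hello-button', false);
  }
}

// The page checks the flag to decide which buttons to show.
function renderButtons() {
  return flags.get('say-hello-button') ? ['Subscribe', 'Say Hello'] : ['Subscribe'];
}

console.log(renderButtons().length); // 2
onItemEvent({ type: 'new_item', title: 'ReferenceError: sayHello is not defined' });
console.log(renderButtons().length); // 1
```

Turning the flag back on is a manual step in this sketch, much like resolving the item and re-enabling targeting in the demo.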
21. Feature Flags and Developer Experience
Feature flags can be toggled on and off to control the behavior of an application. Rollbar does not currently support custom feature flagging, but the team is working towards it. As a solutions engineer, the speaker's job is to ensure that different workflows are satisfied and that developers get what they need. Rollbar continues to integrate convenience tools for developers and architects, and is constantly improving the developer experience.
But we'll see that feature flag have its state changed. Let me go back to my demo here. It's a lot of demos to keep ready. So here's this message. Toggle that to resolved so we can get that to come through. Oh, never mind. It did actually toggle there. I refreshed too quickly. So, let's run through that one more time just so it looks a little better. The feature flag was toggled off as a result of that item coming through. Let me toggle this back on. Refresh that. There's the item. So make sure the item is resolved as well. And we will also see a history in here. We'll see sort of an execution path of that. So there's my Rollbar item. Press the Say Hello button. See, give it another second. Hopefully, we can see that one come back. I think we've got a little bit of pipeline delay here. We should see that toggle in just a second. So the LaunchDarkly integration, that's a specific service provider, of course. Now there we go. So custom feature flagging is what we want to work towards as well. We don't have support for that yet. I know a lot of folks do feature flagging in-house. You can get creative with the webhooks, but if that's something that you're trying to do, then we'd like to help out. I am a solutions engineer after all, so it's sort of my job to help make sure that those different workflows are satisfied and that folks are getting what they need from them. Look for continued integration, and we try to add all the types of convenience tools that developers and architects will use. I think I should probably spare going into additional demos. There's a Terraform demo that I typically run for customers as well. So even things like getting your Rollbar account configured and getting your Rollbar infrastructure set up, those are all things that we can do, and we're constantly trying to improve the developer experience through this product.
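For teams that do feature flagging in-house, "getting creative with the webhooks" might look roughly like the sketch below: a handler that receives a webhook body and switches a flag off when a new or reactivated item mentions the Say Hello method. Everything here is an assumption for illustration, including the payload shape (`event_name`, `data.item.title`), the trigger text `sayHello`, and the in-memory `FLAGS` store; the field names should be checked against the webhook documentation before relying on them.

```python
import json
from typing import Dict

# Hypothetical in-house flag store; a real system would persist this.
FLAGS: Dict[str, bool] = {"say-hello-button": True}

# Assumed rule, mirroring the demo: only new or reactivated items count,
# and the item title must mention the method behind the button.
TRIGGER_EVENTS = {"new_item", "reactivated_item"}
TRIGGER_TEXT = "sayhello"
FLAG_KEY = "say-hello-button"


def handle_webhook(body: str) -> bool:
    """Turn the flag off if the payload matches the rule.

    Returns True only when this call actually switched the flag off.
    The payload field names used here are assumptions, not a confirmed schema.
    """
    payload = json.loads(body)
    event = payload.get("event_name", "")
    title = payload.get("data", {}).get("item", {}).get("title", "")

    matches = event in TRIGGER_EVENTS and TRIGGER_TEXT in title.lower()
    if matches and FLAGS.get(FLAG_KEY, False):
        FLAGS[FLAG_KEY] = False  # kill switch: hide the broken button
        return True
    return False
```

A small web endpoint would pass each request body to `handle_webhook`; keeping the rule in a plain function makes it easy to test without any network setup.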