Debugging JavaScript Apps in CI/CD


- Causes of failed builds in CI/CD pipelines

- Approaches to debugging (reviewing logs, accessing environments, reproducing issues)

- Debugging application-related causes (failing tests, failed application builds)

- Debugging pipeline-related causes (pipeline setup, environment issues, container issues)

124 min
11 Apr, 2022

AI Generated Video Summary

This workshop covers the debugging process for CI/CD pipelines, including the steps involved and potential points of failure. It emphasizes the importance of good reporting, logging, and artifacts for effective debugging. The workshop also introduces Replay, an open-source time travel debugger, and demonstrates its capabilities for manual and automated testing. Participants are encouraged to reach out for further assistance.

1. Introduction and Workshop Agenda

Short description:

In this part, Cecilia Martinez introduces herself and gives an overview of the workshop agenda. She discusses her experience with automated testing and debugging in CI/CD pipelines. Cecilia also provides information on how to connect with her on Twitter and access the workshop slides. The workshop will cover the anatomy of a CI/CD pipeline, the debugging process, causes of CI/CD failures, and a practical exercise using Replay. Cecilia explains the workflow trigger, environment provisioning, and dependency installation steps in a CI/CD pipeline.

OK, can anyone confirm that they see the slides, Debugging JavaScript Apps in CI/CD? I'm assuming that unless anyone tells me different, we can see them there. OK, great. So I am Cecilia Martinez. Just to get started with a little bit about myself: I am a community lead at a company called Replay. It is a time travel debugger. Previously I was at Cypress.io, so that's where I learned a lot about automated testing, which we'll talk a little bit about today as well, and debugging failed tests in CI/CD pipelines. I'm also a part of the GitHub Stars program, a volunteer with Women Who Code Front End, and the chapter head for Out in Tech Atlanta. So I like to do a lot of coding, a lot of teaching, and a lot of events like this. So let me just take a look at the chat.

Okay, great. And you can find me at Cecilia Creates on Twitter. That's the link at the bottom of my slides there. I'm also going to post the link to these slides in the Discord channel, so if anyone would like to follow along or keep these slides for afterwards, they are public. So I just want to go through the agenda that we'll be working through today. This workshop will probably not be the full three hours, but I wanted to make sure we had plenty of time in case there are any questions or people want to dive into specific topics as we go along. Feel free. I'm also happy to answer questions about automated testing or debugging in general within your CI/CD pipelines. So given the size that we have here today as well, please feel free to ask questions and engage as much as you'd like.

The agenda today: I'm going to start with the content presentation. So the first part of the workshop will be me talking through the anatomy of a CI/CD pipeline, the debugging process, and causes of CI/CD failures. Then we'll take a break and then we'll do a practical exercise using Replay. This is essentially a workflow that allows you to record automated tests in your CI/CD pipeline so you can debug failed tests. We'll get a GitHub Actions workflow set up and we'll debug that failure. So, starting with the anatomy of the pipeline, the first piece is the workflow trigger itself. I'm going to pull up an example, and I'm going to be bouncing back and forth between a couple of different code examples, workflow examples, and the slides. Everything that we're going to be taking a look at today is open source, so you'll be able to access it as well.

So I'm going to start by taking a look here at Floating UI, which is an open source library that we have forked and that we're going to be using to run tests and do deployments. Specifically I'm looking at a run from this morning, I just tested it, and it's based on a ci.yml file that triggers on every pull request. When we talk about a CI/CD pipeline, we're essentially talking about all the potential steps that we go through from the trigger to the deployment. And the first is going to be that workflow trigger itself. So here we can see in GitHub Actions a summary of our workflow. If you go into any repository and go into GitHub Actions, you can see all the available workflows and the history of your workflows. I'm taking a look here at this one from this morning, and you're able to see not only the history of that specific workflow, but also the file that defines that workflow. The files that define your workflows are stored like code; infrastructure as code is how it's referred to. These are all stored in the repository in the .github/workflows directory, and here's where we're able to see the file itself that orchestrates this workflow. When we talk about the workflow trigger, we're referring to the event that initiates the workflow. In GitHub Actions, these are declared using the on key. So in this case we have on pull request or on a push to the main branch, and this workflow will be triggered. In other pipelines, this syntax might be the word triggers itself. I've seen that in YAML files, and that is where you'd indicate whether it runs on pull requests, on any commits, or only on releases. 
And then if you're using a UI editor type of CI/CD tool like Jenkins or Travis, you may actually be able to manually select what your triggers are for a specific workflow. That's what we're referring to. The next step is environment provisioning. Once you trigger the workflow, the first step is getting the resources, the machines, the computers needed to actually run the process. When you kick off a CI/CD pipeline, depending on your organization, and I've seen all types of infrastructure deployments, it may be that you have virtual machines that you need to request access to. It may be that you have dedicated machines in your own infrastructure, if you're a large organization. GitHub Actions has runners. So, let me pull up the documentation here. GitHub has its own built-in runners that you can use. There are Ubuntu Linux, Microsoft Windows, and macOS runners to run your workflows. What this gives you is a fresh, newly provisioned virtual machine, depending on what you define in your workflow. In this one here, we're using runs-on with ubuntu-latest. But you'll see a wide variety of different types that you can leverage. You can also declare multiple types of machines using a matrix. I'm going to pull up another example here. This is the Cypress Real World App, which is a demonstration app used by Cypress to show automated testing workflows. This also leverages GitHub Actions. We can see here in GitHub Actions our main YAML file. We have the job that runs on ubuntu-latest, but then we also have a Windows version as well. So when this pipeline runs and it's provisioning the needed resources, it's going to provision a Linux machine, and it's also going to provision a Windows machine. 
That is another piece of environment provisioning: making sure that the necessary resources, the necessary machines, are there in order to run the workflow. Next we have dependency installation. And I'm just going to check the chat here. Okay, great. So dependency installation means that once we have the machines, we need to make sure that they're set up the way we need them to be in order to build our app, test it, and deploy it.
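To make the trigger and matrix ideas above concrete, here is a minimal TypeScript sketch. It is not GitHub Actions code: the `WorkflowConfig` type and the `isTriggeredBy` and `expandMatrix` functions are hypothetical names used only to illustrate how a list of trigger events and a matrix of machine types might be reasoned about.

```typescript
// Hypothetical, simplified model of a workflow's triggers and runner matrix.
// These names are illustrative and are not part of GitHub Actions.
type EventName = "push" | "pull_request" | "release";

interface WorkflowConfig {
  on: EventName[]; // events that start the workflow
  matrix: Record<string, string[]>; // e.g. { os: [...], node: [...] }
}

// Returns true when the incoming event is one the workflow listens for.
function isTriggeredBy(config: WorkflowConfig, event: EventName): boolean {
  return config.on.some((candidate) => candidate === event);
}

// Expands a matrix into one entry per combination of values,
// which corresponds to one provisioned machine per combination.
function expandMatrix(matrix: Record<string, string[]>): Record<string, string>[] {
  let combos: Record<string, string>[] = [{}];
  for (const key of Object.keys(matrix)) {
    const next: Record<string, string>[] = [];
    for (const combo of combos) {
      for (const value of matrix[key]) {
        next.push({ ...combo, [key]: value });
      }
    }
    combos = next;
  }
  return combos;
}

const ci: WorkflowConfig = {
  on: ["push", "pull_request"],
  matrix: { os: ["ubuntu-latest", "windows-latest"], node: ["16"] },
};

// Two combinations: one Linux machine and one Windows machine.
const matrixJobs = expandMatrix(ci.matrix);
```

When debugging, listing the expanded combinations like this can help confirm which machines a run was expected to provision.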

2. CI/CD Pipeline Steps

Short description:

In this part, we discuss the steps involved in the CI/CD pipeline. We cover the installation of dependencies and the code checkout process. Pre-build checks, such as linting and type checking, are also explained. The application build step is highlighted, including the use of specific build commands. The importance of package.json in determining dependencies is emphasized. We clarify that the code is checked, but not compiled, before running the checks. Finally, the execution of unit tests and the build command are mentioned.

So this could mean a number of things. It may mean that you need to install Node, or you may be using a container that has preset dependencies from an image, like Docker, but all of that is going to happen in the CI/CD pipeline itself. And we can see the steps here. The Cypress one is a good example because it has a couple more dependencies that are needed. We need to, for example, set up Windows, we need to install Node, we need to use a container here. In this one, we're using a container that includes certain browsers that are needed to run the tests. All of those are essentially the pieces that are required to make the machine ready to run. And that is another step where things could go wrong that you may need to debug.

The next step is code checkout. So we've triggered our workflow, we've got the machines we need, and we've installed the necessary dependencies, software, operating systems, and whatever else we need on those machines. Now we want to get the code for our application, and we need to make sure it's the right version of the code. So whether it was a commit, a pull request, or a release that triggered this workflow, we need to check out the correct version of that code before we build. Typically in GitHub Actions, we will use an action called checkout for this. So let me pull that up here. The GitHub Actions checkout action does essentially two things. It authenticates that you have permission to check out that code, and then it checks out the specific commit of that code, typically related to the commit, pull request, or release that triggered the workflow.
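As a rough illustration of why the trigger matters for checkout, the sketch below maps a triggering event to the Git ref a pipeline would check out. The `TriggerEvent` type and `resolveCheckoutRef` function are hypothetical. The ref formats are an assumption modeled on GitHub's conventions (a merge ref for pull requests, a tag ref for releases), so check the documentation for your own CI provider.

```typescript
// Hypothetical event shapes; real CI providers expose much richer payloads.
type TriggerEvent =
  | { kind: "push"; sha: string }
  | { kind: "pull_request"; number: number }
  | { kind: "release"; tag: string };

// Picks the ref to check out so the build runs against the code
// that actually triggered the workflow.
function resolveCheckoutRef(event: TriggerEvent): string {
  switch (event.kind) {
    case "push":
      return event.sha; // the exact commit that was pushed
    case "pull_request":
      return `refs/pull/${event.number}/merge`; // assumed merge-ref convention
    case "release":
      return `refs/tags/${event.tag}`;
  }
}

const pullRequestRef = resolveCheckoutRef({ kind: "pull_request", number: 42 });
```

If a build fails in a way you cannot reproduce locally, comparing the ref in the checkout log with the commit you have locally is a reasonable first check.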

Once we know that we have the correct code, we're going to run some pre-build checks. Again, these are all steps that could happen. They may not always happen, depending on the complexity of your CI/CD pipeline, but the next step would be pre-build checks. If we go back to our action here and take a look at the steps, in this pipeline we have three jobs that are running: a linting and type checking job, a unit test job, and a functional test job. Jobs are essentially groups of steps that occur within the workflow. So if you think of the hierarchy, you have the workflow at the top level, then you have jobs within that workflow, and then within each job you have the steps. These jobs can run in sequence or they can run concurrently, depending on how you need the order of operations to go. And you can see what happens in the run based on the job here. The first job that we have is the linting and type checking job. This is running TypeScript checks and also a linter on the code before we actually build it. If you have a different type of application that's not a JavaScript application, a compile step may also be required. So before you can do any kind of checks on your code, you may need to compile it first if you're using a different type of language. But JavaScript specifically is not a compiled language, so that's not required. We can see here in the log for this run that we set up the job and provisioned our environment. We have our operating system as Ubuntu, because we're using ubuntu-latest. We've received our virtual environment, and it has some included software from the Ubuntu image. Then we prepared the workflow directory, and we started with the action to check out our code. So we're running actions/checkout. 
We're syncing the repository, which is my Cecilia Creates Floating UI fork. We're also getting the Git version info from the Git binary on the machine. From there, we're going to initialize the repository. We're going to check out the main branch here. We're going to set up auth to ensure that we are authenticated and able to check out that code. We're going to fetch the repository, and this takes some time, and then we're determining the checkout information. Again, we're getting the specific Git ref for the SHA. The head here is from that merge, because a pull request initiated the run. Then we go ahead and set it up, and you'll see we are running npm install. We're installing all the dependencies required to run our application. That's going to be based on your package.json file. If we take a look at the code, I'm going to open that in a new tab here, and we have our package.json. This is what dictates the dependencies that will be installed during this step. If you have any issues at that point, which we'll talk more about later, this would be a good place to start to check which packages are being installed. But what I wanted to point out is that before we run these checks, all we're doing is npm install. We're not actually building our application. We don't have to compile it before running these checks because we're only checking the code itself, not the runtime behavior of the application. So we're going to run our TypeScript checks and our lint check, and that's it. We're not actually building the application until we get to the next step. That's something to keep in mind, and it's specific to JavaScript applications and JavaScript CI/CD pipelines.
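The workflow, job, and step hierarchy described above can be sketched as a small TypeScript model. This is an illustration only: the `Step`, `Job`, and `Workflow` types and the `runJob` function are hypothetical and do not correspond to any GitHub Actions API. The example job mirrors the pre-build checks in the walkthrough, with no build step, and simulates a failing type check.

```typescript
// Illustrative model of the workflow > job > step hierarchy.
interface Step {
  name: string;
  run: () => boolean; // returns true on success
}

interface Job {
  name: string;
  steps: Step[];
}

interface Workflow {
  name: string;
  jobs: Job[];
}

interface JobResult {
  job: string;
  passed: boolean;
  executed: string[]; // steps that actually ran, in order
}

// Runs the steps of a job in order and stops at the first failure.
function runJob(job: Job): JobResult {
  const executed: string[] = [];
  for (const step of job.steps) {
    executed.push(step.name);
    if (!step.run()) {
      return { job: job.name, passed: false, executed };
    }
  }
  return { job: job.name, passed: true, executed };
}

// Runs every job in the workflow (sequentially, for simplicity).
function runWorkflow(workflow: Workflow): JobResult[] {
  return workflow.jobs.map(runJob);
}

const lintAndTypes: Job = {
  name: "lint-and-types",
  steps: [
    { name: "checkout", run: () => true },
    { name: "npm install", run: () => true },
    { name: "type check", run: () => false }, // simulated failure
    { name: "lint", run: () => true },
  ],
};

const ciResults = runWorkflow({ name: "ci", jobs: [lintAndTypes] });
```

The `executed` list shows the last step that ran, which is usually the first place to look in the log for a failed job.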

The next step is the application build step. Once we have all the dependencies installed and we've checked our code to make sure that it looks okay according to our linter, and, if we're using TypeScript, we've done the type checking that's required, we're going to build the application. We can see here, as part of our unit test job, we run the npm build command. The build command could be different depending on your application's build process. You may be using Webpack, or you may be using React, or another framework that has a specific build command. This build command is defined within your GitHub Actions workflow. If we take a second look at the Cypress Real World App here, for example, we are running yarn types, we're running yarn lint, and then we're running the yarn unit tests, which don't need the application to be deployed first, and then we run the yarn build CI script.

3. Build Command and Configuration

Short description:

The build command defined in the package.json file starts the build process for the application. Different build commands can be used depending on the requirements. For example, Netlify allows you to define the build command in its build settings. In the case of the Nuxt framework, the build command is 'npm run generate'. Ensure that the build command is specified correctly for your specific setup.

And that build command is what actually starts the build process for our application. Again, you can typically see these defined in your package.json. So if your build command is failing, you can see what it translates to here. Our commands are defined here in our scripts, and we have a couple of build commands. For example, we have build AWS exports ES5, which we may use, we have a build CI command that uses Cognito and starts that up, and we have build CI, which uses React scripts to actually build the application itself. So depending on the build command that's defined in your package.json, you may need to use something specific. That can be defined in your YAML file, or if you're using something like Netlify, for example, Netlify has a field to define your build command within Netlify itself. We can take a look at that example. Let me get that pulled up for you as well. Just give me one second here. Okay, so this is Netlify, which I use to deploy my personal website. We can see here that under build settings, I have my repository set up, linked to my GitHub. And then I have the build settings here, and my build command is npm run generate. That is specific to Nuxt. I am using the Nuxt framework, which is a Vue framework that supports static site generation, and it's really good for content-focused sites. So I needed to come in here and specify that build command because it is specific to Nuxt. As you saw, it's not going to be a react-scripts build, it's not going to be npm run build, it's npm run generate. So you may need to update your build settings if you're using a UI-based CI/CD provider like Netlify, or you need to define it within your YAML file itself.
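When a build command fails, it can help to check what the script name in the pipeline actually maps to in package.json, as described above. The following TypeScript sketch shows one way to do that lookup. The `PackageJson` type, the `resolveScript` function, and the sample scripts are hypothetical and only loosely modeled on a Nuxt-style project.

```typescript
// Minimal shape of the part of package.json we care about.
interface PackageJson {
  scripts?: Record<string, string>;
}

// Returns the underlying command for a script name, or throws with
// the list of available scripts so a misconfigured pipeline is easy to spot.
function resolveScript(pkg: PackageJson, name: string): string {
  const command = pkg.scripts ? pkg.scripts[name] : undefined;
  if (command === undefined) {
    const available = Object.keys(pkg.scripts || {}).join(", ") || "(none)";
    throw new Error(`No "${name}" script in package.json. Available: ${available}`);
  }
  return command;
}

// Hypothetical package.json contents for a statically generated site.
const sitePackage: PackageJson = {
  scripts: { dev: "nuxt", generate: "nuxt generate" },
};

const buildCommand = resolveScript(sitePackage, "generate");
```

Asking for a script that does not exist, such as `build` in this sample, throws an error that lists `dev` and `generate`, which points directly at a mismatch between the pipeline's build settings and the project.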

4. Running Tests in the CICD Pipeline

Short description:

In the CICD pipeline, tests are run on the application that is built and running on a local machine. It is important to ensure that the tests are written to hit the local host within the same machine and not a separate environment. Various types of tests can be performed, such as unit tests, component tests, integration or end-to-end tests, and visual testing. Performance testing is usually done in a production environment. This is all part of the CICD process.

Okay. So the next step is going to be running tests on your application. You've built your application, and it's now running locally on a server in that virtual machine. When you run tests on your own computer, you may have done npm start, the app is served at localhost:3000, and your tests execute against localhost:3000. The exact same thing is going to happen in your CI/CD pipeline. You build your application, you start it up, and it's now running on a port within that virtual machine. Your tests then run entirely within that machine. Now, I have seen cases where people will, for example, deploy to a staging, testing, or QA environment, and tests will be run against that. But in most cases, that's only used for manual testing or release testing. In this case, you want to ensure that you're running against the version of the code that you just checked out. So you'll need to make sure that your tests are written to hit localhost within that same machine and not a separate environment which may have different code. I've seen that happen before too, where the tests weren't running against that specific version, but your tests should always run against the same version within the same machine, on localhost. And there could be a wide variety of tests that take place: unit tests, component tests, integration or end-to-end tests, visual testing. Performance testing isn't typically done as part of deployment. Usually that's done against something like a production environment, but you may have some baseline performance testing that ensures you're within certain thresholds. And all of this takes place as part of the CI/CD pipeline.
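One way to guard against tests accidentally pointing at a separate environment is a small check on the configured base URL before the test run starts. The sketch below is a hypothetical helper, not part of any test framework, and it assumes a runtime where the standard `URL` class is available (Node.js or a browser).

```typescript
// Returns true when the base URL points at the machine running the tests.
function isLocalBaseUrl(baseUrl: string): boolean {
  const { hostname } = new URL(baseUrl);
  return hostname === "localhost" || hostname === "127.0.0.1";
}

// Throws early so a misconfigured pipeline fails with a clear message
// instead of testing code from a different environment.
function assertLocalBaseUrl(baseUrl: string): void {
  if (!isLocalBaseUrl(baseUrl)) {
    throw new Error(`Tests should target localhost in CI, but got: ${baseUrl}`);
  }
}

const localOk = isLocalBaseUrl("http://localhost:3000");
const stagingOk = isLocalBaseUrl("https://staging.example.com");
```

The staging hostname here is a placeholder. A team that intentionally runs release tests against a deployed environment would skip or relax this check for those runs.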

5. Data Management and Testing

Short description:

In end-to-end testing, data management is crucial. One approach is to use a local flat-file JSON database and reseed it between each test. Another option is to hit an external test database using an API endpoint. However, this may cause issues if the database is not reseeded between tests. Using mock data is a faster option for UI testing, but it may not cover backend reliability. It's important to consider what needs to be tested and set up a suitable testing database. For critical paths, perform full end-to-end testing with a real database. For testing specific areas of the application, like components or UI, mock out unnecessary data. At Replay, an open-source debugging tool, various types of automated testing, including end-to-end tests, are used. Different testing workflows can be implemented using tools like GitHub Actions and Playwright.

Okay. And I just see a question in Zoom. I'm going to go ahead and answer that, and just so you know, I'm checking both Zoom and the Discord chat. So the question is: in the case of end-to-end testing, most of the data depends on the backend, so how do you handle this? Does it make sense to use mock data and run these, given that it's not real? This is a really great question. Data management in testing is something that I get asked about a lot, and I talk about it a lot. There are multiple strategies for how you work with it. The Cypress Real World App is a good example for test data management. What it does is use actual test data. It uses a local flat-file JSON database, and it reseeds the test data between each and every test. So you're not using mock data that's randomly generated per se. What's actually happening is that, as part of the build process, there is a seed data script that takes the database seed from JSON. It takes that seed data and reseeds the JSON database between each test so you have fresh data.
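The reseeding idea described above can be sketched with a tiny in-memory stand-in for a flat-file JSON database. This is not the Cypress Real World App's implementation: the `TestDatabase` class, the `User` shape, and the seed records are all made up for illustration. The point is only that each test starts from a fresh copy of known seed data.

```typescript
interface User {
  id: number;
  firstName: string;
}

interface Database {
  users: User[];
}

// Made-up seed data; a real project would load this from a JSON file.
const seedData: Database = {
  users: [
    { id: 1, firstName: "Ada" },
    { id: 2, firstName: "Grace" },
  ],
};

class TestDatabase {
  private data: Database = { users: [] };

  // Replaces all data with a deep copy of the seed, so changes made by
  // one test can never leak into the next one or into the seed itself.
  reseed(source: Database): void {
    this.data = JSON.parse(JSON.stringify(source)) as Database;
  }

  addUser(user: User): void {
    this.data.users.push(user);
  }

  users(): User[] {
    return this.data.users;
  }
}

const testDb = new TestDatabase();

// "Test 1" creates a user.
testDb.reseed(seedData);
testDb.addUser({ id: 3, firstName: "Linus" });
const countAfterFirstTest = testDb.users().length;

// "Test 2" reseeds first, so it does not see the user from test 1.
testDb.reseed(seedData);
const countAtStartOfSecondTest = testDb.users().length;
```

In a test framework, the `reseed` call would typically live in a before-each hook so that no individual test can forget it.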

And then when the tests are written, what they're doing is using a specific API endpoint to get data from that test database. We can take a look here at the code. If we look at any of the tests, I'm going to pull one of these up here, new transaction, for example, we can see that before each test, we're running a task called db:seed. That's seeding the database inside the test environment with that fresh data. And then when it comes time to actually get a user, we're running a cy.database command, which essentially just allows you to get data from that database. We're grabbing the first user here, setting that as our contact user, and using that in our tests. So we're saying, you know, the contact first name, we're validating it against that. Everything is dynamic in the test. There's no hard-coded data, but we're always grabbing data from the local database, so we know what it is. So that's one approach that you can use: data specifically for your test environment that is reseeded each time you build, with a specific API endpoint set up to get data just for your testing. That's something that can be stripped out as part of the deployment process so that test data doesn't end up in your production build. Another option that I have seen is using an API endpoint to hit an external test database. So if you have a staging database or a testing database with an API endpoint for that database specifically, I've seen people, for example, create a new user, grab that new user's information, use it to log in, and then run the test. It's still being done dynamically. That's another option that I've seen people use. 
The only issue with that is that because you're not reseeding the database between each test, if something goes wrong with one of the tests, or if you create a user twice, for example, that could cause flakiness in your tests that is not related to the test itself or the application itself. So if you can have fresh, clean, deterministic data for every single test, that's the absolute best solution, because then your data is never going to affect the quality of your tests. The other solution is to use mock data. That's where you are maybe using Faker or another library to generate random mock data, stubbing out the backend, and only testing the front end of your application. The key thing to remember when you're thinking about your test data is: what is it that you actually need to test? If you are doing UI tests and you want to make sure the front end of your application is working properly, and you're not really concerned with the connections to the backend, if you aren't testing the reliability of your database or your API endpoints, then use mock data. It's going to be faster. You're still going to be able to ensure that your front end is working properly, and if there are any UI issues, they're going to get caught. If you are doing a full end-to-end test and you want to make sure that a user could go all the way from the login process through some key transactions, working through the critical path, and it's really important that you are testing those database connections and API endpoints, maybe even making assertions against the database to ensure that data was updated correctly, like if you delete something, does it actually get deleted from the database, then you are going to want to set up that nice testing database, whether that's on the local machine that you're using to run the tests or a dedicated database. That's going to be more important. Now, it's going to be more effort. 
It's going to be slower, but because you have decided that you need to test that full flow, you're going to go through the effort to set that up. So the general rule of thumb is: for the critical paths, the really important things where we need to make sure the entire flow works, do the full end-to-end testing with a real database. If you're only testing certain areas of your application, like the components or the UI, then mock out what you don't need. Awesome, great question. I love talking about testing, so if that bleeds over into this session a little bit, I hope that's okay with everyone. I did also want to point out here that I work at a company called Replay. We are a debugging tool: it's a runtime recorder that comes along with debugging tools. We'll dive into that a little bit more later, but we are open source. Our front end, our protocol, pretty much all of our application is open source, except for things that require security, like part of our backend. And I wanted to show what that looks like and what kind of testing we run. So this is our DevTools repository. This is essentially the front end of our tooling, and you can see that we do have some different types of automated testing. You can see that we use GitHub Actions as well, and you can see all our workflows here. We have our test suites workflow, and I'm going to pull that up here. You can see we have browsers. We're actually using the Replay browser itself to run the tests, but we use Playwright, which is different for me. I'm used to using Cypress, obviously. We also have some unit tests, and we can see that we're running these end-to-end tests in a few different ways. That's something where you can make the decision of whether to break out the types of tests in your workflow or have them all in one step. So I'll go back into an action here. Let's just pull one up. 
Let's see if we can find, okay, that one just has Storybook.

6. Organizing Workflows in CI/CD Pipeline

Short description:

Workflows in the CI/CD pipeline are separated into different jobs, allowing for flexibility and isolation. Checks such as the lint check, test suite, Storybook build, type check, GraphQL schema check, node, recording driver, unit tests, end-to-end tests, QA Wolf tests, and mock tests are all run as separate jobs. This allows for rerunning specific jobs and provides options for rerunning all jobs or just the failed ones. Organizing the jobs separately provides flexibility but may result in inefficiency if multiple tests or the build fail. The decision on how to organize the CI/CD pipeline depends on resource availability and desired run time.

So our workflows are actually separated out. If you look at a pull request, this is probably a better way to demonstrate that. We can see here that these are all different workflows. We have our lint check runner, we have our test suite, we have the Storybook build, we have our type check, GraphQL schema check, node, and the recording driver. And then we have a Vercel preview branch. This is where we're actually deploying to Vercel, and then we're running the tests against that deployment. Then we're running our unit tests and then our end-to-end tests. These are all separate jobs that are taking place. And then we have some end-to-end tests that are done by QA Wolf, and we have our mock tests as well. So we have a lot of different tests that take place, and those are all separated out into different jobs. That is so that if we need to rerun a certain job, we can isolate it. In GitHub Actions, you're given the option here. If an action fails, actually, let me use one of these, I just don't want to mess something up. So here I have three different jobs, and I can rerun all the jobs or I can rerun just the failed jobs. I can also go into an individual job and rerun just that one. So breaking it out can give you a little more flexibility, but sometimes it's helpful to have them all in one workflow, because you can say: if any one given step fails, stop everything else after it. With them separated out and running concurrently, if, for example, all the tests fail or something goes wrong with the build, you may end up being a little inefficient. So it depends on your resource availability and how much time you want the runs to take when you make that decision on how to organize your CI/CD pipeline.
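The choice between rerunning all jobs and rerunning only the failed ones can be expressed as a small selection function. This TypeScript sketch is purely illustrative: the `JobRun` type, the `selectJobsToRerun` function, and the job names are hypothetical and are not how GitHub Actions implements its rerun options.

```typescript
type JobStatus = "success" | "failure";

interface JobRun {
  name: string;
  status: JobStatus;
}

type RerunMode = "all" | "failed";

// Returns the names of the jobs that a rerun would execute.
function selectJobsToRerun(runs: JobRun[], mode: RerunMode): string[] {
  const selected = mode === "all" ? runs : runs.filter((run) => run.status === "failure");
  return selected.map((run) => run.name);
}

// Made-up results for three separate jobs.
const lastRun: JobRun[] = [
  { name: "lint-and-types", status: "success" },
  { name: "unit-tests", status: "success" },
  { name: "end-to-end-tests", status: "failure" },
];

const failedOnly = selectJobsToRerun(lastRun, "failed");
const everything = selectJobsToRerun(lastRun, "all");
```

Splitting checks into separate jobs is what makes the narrower rerun possible, since a single combined job can only be rerun as a whole.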

7. Visual Testing and Artifact Storage

Short description:

Visual testing involves taking visual snapshots of your application and comparing them to previous versions. Tools like Percy and Applitools are commonly used for visual testing. Artifacts, such as screenshots and diffs, are generated during the process. These artifacts can be uploaded locally or stored externally. External artifact storage services include Cypress Dashboard, Applitools, Percy, and Codecov. The artifact upload process should not break the build, unless it is critical to have the artifacts. The location of artifacts depends on the storage service being used.

Okay. So that's our application tests. One thing I will note is visual testing. Visual testing, if you're not familiar with it, is essentially where you take a visual snapshot of your application and store it, and then new snapshots are taken and compared against the previous version. So if you think about this, you're saying: this is the way my website should look. I'm going to take a new screenshot, and it should still look the same. So that's a type of testing that's done. It uses tools like Percy and Applitools, which are some of the more popular ones.
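As a very rough illustration of the comparison step in visual testing, the sketch below compares two snapshots represented as flat arrays of pixel values and reports the fraction that differ. This is a naive approach chosen for brevity, with hypothetical function names. It should not be read as how Percy or Applitools work, since real tools handle rendering differences, anti-aliasing, and much more.

```typescript
// Fraction of pixel values that differ between two snapshots (0 to 1).
function mismatchRatio(baseline: number[], current: number[]): number {
  if (baseline.length !== current.length) {
    return 1; // different dimensions are treated as a complete mismatch
  }
  if (baseline.length === 0) {
    return 0;
  }
  let differing = 0;
  for (let i = 0; i < baseline.length; i++) {
    if (baseline[i] !== current[i]) {
      differing++;
    }
  }
  return differing / baseline.length;
}

// Passes when the mismatch stays at or below the allowed threshold.
function passesVisualCheck(baseline: number[], current: number[], threshold: number = 0): boolean {
  return mismatchRatio(baseline, current) <= threshold;
}

// Tiny made-up snapshots: one of four values has changed.
const baselineSnapshot = [0, 0, 255, 255];
const currentSnapshot = [0, 0, 255, 0];
const ratio = mismatchRatio(baselineSnapshot, currentSnapshot);
```

With these sample values, the check fails at the default threshold of zero and passes if a threshold of 0.3 is allowed, which is the kind of tolerance setting that can explain a visual test passing locally but failing in CI.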

And typically what that will generate is screenshots, and it'll also generate a diff. So if it does find any differences, those will be artifacts that are generated. And that leads us to artifact uploads. As I mentioned, if you have visual testing, you may end up with a diff. If we take a look at this action here and go into the summary, we can see that we have some artifacts. We have our visual snapshots diff, which was produced during runtime. And if we download that, give me one second, I'm just going to take this off screen really quick and put that back, I'm going to open that up here in my Finder. Okay, so this is our download. Essentially it just creates this file, and it allows you to see the diff as an image. So this is the screenshot of the test that took place, and we can see what that looked like at the time. So if you ever need to visually compare those, the artifacts are stored here in your GitHub Actions run.

Now, there are different ways that artifacts can be uploaded. They can be stored locally, so on the same machine that ran the test. So these artifacts, for example, here in your GitHub Actions are stored with the run. It may be a log, it may be screenshots, it may be a video of the run. So, for example, with the Cypress Real World App, if we go to one of their runs here. And we take a look, they may not have any for that one. Let's see, that's a nice big one. Yes, okay. So we have, for example, the build that was produced during runtime. If we needed to recreate that, for example, if you wanted to reproduce an issue locally, we have the build available. But there are also external artifacts that can be created. So with Cypress, the way that the Cypress Dashboard works is that all of the tests that are recorded generate screenshots and then a video of the test run. So we can see that in the Cypress Dashboard. All of the Cypress Real World App runs are made public, so you're able to see them. So if we come in here to this run, we're able to see, for example, screenshots of the test failure, and we're also able to see videos. And so these are an example of additional assets that may be generated. It could also be an HTML report of a test failure. It could be screenshots that are just done with a different testing library. Playwright has traces, for example, that it will generate. But those are all stored externally, so they're uploaded via an API. We're going to be taking a look at replays later. So replays are essentially a debuggable recording of your test run, and those are stored in your Replay library. So during the artifact upload process, something could go wrong if, for example, that service is down or the API call doesn't work. Typically, you don't want the artifact upload process to break your build, unless it's critical that you have those artifacts.
So if you need to upload an artifact for security reasons, or if you have certain quality requirements where you need to have screenshots before you can deploy, then you may want that to be a breaking step. But typically, you don't want this to actually break your build just for the artifact step. Yeah, so that's a good question. So the question is, where are these artifacts stored? For example, the videos or screenshots in the Cypress Dashboard. So the Cypress Dashboard is an example of an external artifact. So that's being stored by Cypress. It's being stored by their service. Typically, it's like a subscription-based service. Same thing with Applitools or Percy, or if you have a code coverage service. What's it called? Codecov, for example. So if I come back here, the Cypress Real World App also uses Codecov for its coverage reporting. So it stores these artifacts, which show the coverage sunbursts for different pull requests. That's being stored on Codecov. Same thing with visual testing on Percy, right? So the visual tests on Percy will have the screenshot diffs. And those are the snapshots here that show the diffs. Like this is supposed to look like this, but it looks like this. This is supposed to look like this, it looks like this. This is all being stored there. So those are examples of external artifacts that are gonna be stored on different servers. They're not gonna be stored on the machine where you're actually running the tests. Now there are certain GitHub Actions that will, for example, generate screenshots, that will generate the build artifact, that will take a video, potentially, depending on what type of program you're using. And then it will have a command to store that locally.

8. Artifacts and Deployments in CICD Pipeline

Short description:

In a CICD pipeline, artifacts produced during runtime, such as zip files or screenshots, are stored in the CI provider connected to the run. Deployments involve the new version of the application being deployed to the hosting platform after passing all checks. Examples of deployment processes using Vercel and Netlify are provided. Understanding the various steps in a CICD pipeline and potential points of failure is crucial for effective debugging.

That's where you're gonna see these artifacts that are produced during runtime. CircleCI, also, I've used them before. They have artifacts that you can store as part of a run. And typically it is like a zip file of the build or it's screenshots. And so those are gonna be stored in your CI provider, whether it's GitHub Actions, whether it's CircleCI, whether it's Jenkins. Those are all gonna be stored connected to the run and not as part of a third-party external service.
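
As a minimal sketch of the "stored with the run" case in GitHub Actions: the workflow below uploads screenshots after the tests and uses `continue-on-error` so that a failed upload does not fail the job. The workflow name, the `npm test` command, the artifact name, and the `cypress/screenshots` path are assumptions for illustration, not taken from the repositories shown in the workshop.

```yaml
# Sketch: keep test artifacts with the run without letting the upload break the build.
name: tests
on: [pull_request]

jobs:
  e2e:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm test                     # assumed test command

      - name: Upload screenshots
        if: always()                      # run even when the test step failed
        continue-on-error: true           # an upload problem should not fail the job
        uses: actions/upload-artifact@v4
        with:
          name: test-screenshots          # hypothetical artifact name
          path: cypress/screenshots       # assumed output folder
          if-no-files-found: ignore       # do not error when nothing was produced
```

If the artifacts are a hard requirement, for example for a security or quality gate, removing `continue-on-error` turns the upload into a breaking step.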

Okay, great, good questions. Let me just check Discord really quick. Perfect, awesome. And then just check how we're doing on time. Looking good, okay. And then last but not least, we have deployments. Okay, so deployment is where the new version of your application, once it's run through all the checks and we've said, okay, this looks good, we're ready to go, actually gets deployed to wherever you're hosting your application. And so this may be part of your GitHub Action, or it may be a separate step. We can see, for example, in the DevTools workflows, if you go back to the pull request, like I said, we're using the Vercel bot. So Vercel is kind of creating this initial deployment that we can preview. And then once all the checks are run and something is merged, then it's actually deployed. And the deploy itself, for example, let me see if I can find one that's complete, we're going to pull up a closed pull request here. Fixing rounding errors, and that was fixing rounding of service mode 6085. And then we do have, we're a very active team, so we have a lot of things that are going on here. Here it goes, okay. So fix rounding error and focus mode, that's Semgrep, which is essentially one of the workflows that we have, and that gets deployed. A better example actually is probably going to be Netlify, so like I said, I use Netlify in order to deploy because it's actually doing the hosting. And so we can go ahead and take a look, for example, at this deploy summary. We can see the build step, we can see that this part is deployed and it's deploying the site from dist. And again, if we go to our build settings, deploy settings here, we can see that dist is the publish directory. So after I do npm run generate, it actually builds that dist directory and that is what is being deployed to my website. So, okay. So that is the anatomy of a CICD pipeline.
So essentially, the reason that I wanted to go through that is because at every single one of those steps, something can go wrong. And there's a lot of steps, right? And you think about how often you kick off a CICD process. Depending on your organization, it could be nightly, it could be on release, but a lot of times you're doing it on every pull request, and then if you push a new commit, it's running the pipeline again. So, being able to understand what all the different steps are and where things can go wrong will help you narrow in, when something does go wrong, on how to fix it.

9. Debugging Process for CICD Pipelines

Short description:

In this section, we discuss the general debugging process for CICD pipelines. It is important to have a defined process to effectively debug issues. The debugging process for CICD pipelines is different from debugging issues in production or JavaScript applications. Debugging in CICD pipelines involves isolating and understanding the event or process that occurred. Having a good debugging process is crucial to avoid wasting time and effort. Let's dive into the steps of the debugging process for CICD applications.

So in this next section, I'm going to talk about a general debugging process. And this is, like I said, a process that can be used if you're using GitHub Actions, or if you're using any kind of CICD pipeline. And I wanted to just note that this is a little bit different than how you debug an issue in production. So I recently did a talk about debugging JavaScript applications. So not CICD pipelines, but JavaScript applications. And the process is a little bit different: you want to recreate the issue, you want to isolate the issue. There are similar, overlapping concepts, but this process here is specific to CICD pipelines because, again, you're not necessarily debugging an application, you're debugging an event. You're debugging a process that took place, and that can be different than trying to debug something that's continually happening like you see in an application. So the important thing is that you have a process. A lot of times with debugging, this is not something that we're taught, it's not something that they necessarily give you a good process for. Typically you have to kind of just figure stuff out on your own, learn, see what happens. And so I've talked to people about having a good debugging process and how you approach these problems so that you're not just kind of throwing spaghetti at the wall or banging your head against the wall trying to figure it out. So, I just wanted to set some context for this next section here.

10. Debugging Process for CICD Applications

Short description:

The first step in debugging CICD applications is to isolate the initial point of failure. GitHub Actions provides visual charts that show the different jobs in the workflow and where they broke down. Understanding the initial point of failure is crucial as fixing it may resolve subsequent issues. Confirming expected behavior involves understanding what the pipeline is supposed to do and comparing it to what it is actually doing. Reviewing artifacts, such as logs and test artifacts, provides evidence for debugging. Enabling debug logs and rerunning the pipeline can provide more detailed information. This can be helpful when dealing with complex applications or issues with test runners.

So, the first step of the debugging process for CICD applications, and again, these are all gonna be JavaScript applications, just because it's gonna be a little bit different since they don't have compilation. We have some nice tools that come along with web applications, but this should be applicable to different types of CICD pipelines as well. But the first thing is to isolate the initial point of failure.

So, GitHub Actions gives you these nice charts. It's this kind of dependency graph, this visual graph that shows you the different jobs that took place as part of your workflow. And it shows you the status and where things kind of broke down. And you may have, I'm sorry, you may have individual jobs, you may have matrices of jobs, you may have jobs that can run concurrently and then jobs that have to run in sequence. And these charts give you a nice representation of that. So, you can see where the initial point of failure was. So, as we saw in our pull request right here with the Replay pull request, you can have a lot of things go wrong, right? So, like, this one has three different failures, and we may not necessarily know if one of those failures resulted in the others failing or if they were their own individual issues. Same thing with your tests, right? We talked about how, if you have tests that rely on each other, one test failure could cause a cascading failure. So it's important to understand where that first point of failure occurred, because that's where you're gonna wanna start, because everything that happened after that may or may not be fixed if you fix the first thing. So this is an example of this dependency graph here. These are the steps that are required to take place first. We had to download the Replay browser, wait for the Vercel preview, download Node, download the recording driver. Once we have this step, then the mock tests can kick off. Once we have both of these steps, then the end-to-end tests can kick off, and then the unit tests can kick off at any time. Those aren't required to have any of those dependencies. So here we can see that our unit tests failed, but our unit tests aren't connected to these other steps. So we can also see that our end-to-end tests failed even though all of this was successful. So if I were starting here, I would definitely take a look at the unit test failure.
First, see what happened there, because that could indicate an issue with the code base itself. So we'll talk a little bit in the next section about things that could go wrong with testing, but you're gonna wanna identify the initial point of failure and start debugging from there. So if I was looking and I, for example, saw that I had some test failures, but then I also saw that the wait for Vercel preview step failed, it wouldn't make sense to go directly to the test failures, because if the preview branch failed, then that could be why the tests failed. So we want to start from the first point of failure. Then we want to confirm expected behavior.
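
A job graph like the one described here comes from `needs` in the workflow file. The sketch below is loosely modeled on that description; the job names, the exact dependency edges, and the `echo` placeholders are hypothetical and are not the actual Replay workflow.

```yaml
# Sketch: jobs with and without prerequisites, which is what the visual graph renders.
name: ci
on: [pull_request]

jobs:
  download-browser:
    runs-on: ubuntu-latest
    steps:
      - run: echo "download the browser here"

  wait-for-preview:
    runs-on: ubuntu-latest
    steps:
      - run: echo "wait for the preview deployment here"

  mock-tests:
    needs: download-browser                        # starts after this one job succeeds
    runs-on: ubuntu-latest
    steps:
      - run: echo "run mock tests"

  e2e-tests:
    needs: [download-browser, wait-for-preview]    # starts after both succeed
    runs-on: ubuntu-latest
    steps:
      - run: echo "run end-to-end tests"

  unit-tests:                                      # no needs, so it can start right away
    runs-on: ubuntu-latest
    steps:
      - run: echo "run unit tests"
```

By default a job is skipped when a job it `needs` fails, so a failed prerequisite shows up as skipped downstream jobs, while a failure in `unit-tests` here stands on its own.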

So what that means is that before you can understand what went wrong, you have to know what being right looks like. And this is a little more complex when you have applications. There, it basically means what we expect the application to do based on a click event, or what this UI element is supposed to look like. That can be a little bit more complex. When we're talking about CI-CD pipelines, what we mean is: what have we told the pipeline to do, and is it doing that or not? So for example, if you've recently made a change to your YAML file and it's running the... Maybe like we have our functional tests, but they're running concurrently instead of waiting for Node to install first, then the pipeline is doing exactly what you told it to do. It's just that you didn't tell it to do the right thing. So the first step is to understand what is expected, what the pipeline is supposed to be doing. Same thing with triggers. So you may say, hey, I pushed a change but it's not triggering the workflow. If you look at what's expected, you look at the YAML file, it could be, okay, it's only gonna trigger for certain branches and I've excluded certain branches, and that's why it's not triggering. So it's important to reframe and start off by asking what is supposed to be happening when something goes wrong, and making sure that your understanding of what's expected actually matches up with what you're telling the pipeline to do. Again, it's a little bit easier. I wouldn't say easier, but it's just different, because you're typically looking at one YAML file, maybe some configuration files. If you're working with Jenkins and you have custom Groovy scripts, it's gonna be a lot more complex. I've seen that before. But again, it's typically a series of steps, versus applications, which are dealing with a lot of concurrent asynchronous activity.

Then you're gonna review the artifacts. So this is kind of like taking a look at the evidence. So something's gone wrong. You've found where the first point of failure was. You said, okay, it was supposed to do this, but instead it did this. Why? That's where you start to review the evidence that you have. So that's gonna be your logs. It's gonna be what we just kind of looked through in our GitHub Action. So with this one that failed, for example, we take a look at our logs, and we're able to come in here and see what errors there are. It's also going to include any failed test artifacts that you have, like screenshots, videos, diffs or replays. And sometimes you need more information. So sometimes you review the logs and maybe there is a Node error when it installs, maybe something goes wrong when trying to provision the machine. And maybe it just times out and it fails and it freezes. Maybe the machine resources are exhausted. And sometimes you have to enable debug logs. So if the workflow logs aren't giving you enough information, then you need to get more, and so you may need to enable debug logs and then essentially redo the run. So there are a couple of options there. There's the runner diagnostic logging, which will give you really detailed information about the entire run. And then there's also step debugging, which is going to have the logs during and after the job's execution. And it's gonna be very, very verbose. And so there are some reasons that you may wanna do this. For example, with Cypress, I spent a lot of time debugging Cypress. Sometimes we'll ask people to turn on a log that actually shows the CPU usage as you are doing your run. So let me see if I can find this. So we have logging in Cypress events. You can also use this if there's anything going on with your test runner, for example, and you don't know what's happening, or maybe something's going on with the install itself.
But one of the things that we talked about, which was in, I think it's in, see if I can remember guides, it's in continuous integration guide, okay, here it is. Introduction, here we go. Environment variables, machine requirements.

11. Machine Resources and Debug Logs

Short description:

Sometimes your machine may lack the necessary resources to run your application and tests. Enabling debug logs can provide information on memory and CPU usage. If you encounter issues and need more information to generate a hypothesis, consider enabling additional debugger logs.

Boom, found it, okay. So sometimes your machine doesn't have the horsepower that it needs to run your application and run your test. So you can run with debug logs, which will actually show memory and CPU usage. And it prints out this little chart that actually shows all the processes that are running on the machine and how much CPU usage that they're taking up. So what we typically see is Chrome goes like 75%, 85%, 100%, 900%, and then everything dies after that. And that's an indication that there's not enough resources on the machine. But that isn't always available. We have to enable that. So if you are looking at the logs, if you're running into issues and you don't have enough information to be able to generate a hypothesis, which is the next step, then you may need to get more information by enabling additional debugger logs.
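
A minimal sketch of turning on this extra logging in a GitHub Actions job. For the workflow itself, runner diagnostic logging and step debug logging are switched on by setting the repository secrets or variables `ACTIONS_RUNNER_DEBUG` and `ACTIONS_STEP_DEBUG` to `true` and re-running the workflow. For Cypress, the `DEBUG` environment variable selects debug namespaces; the `cypress:server:util:process_profiler` value used below is believed to be the namespace that prints the process CPU and memory table, but treat it as an assumption and confirm it against the Cypress continuous integration guide. The workflow name and manual trigger are also just illustrative.

```yaml
# Sketch: a manually triggered run with Cypress process profiling enabled.
name: e2e-debug
on: workflow_dispatch          # run on demand while investigating a failure

jobs:
  cypress:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - name: Run Cypress with process profiling
        run: npx cypress run
        env:
          # Assumed debug namespace; prints CPU and memory usage per process group
          DEBUG: cypress:server:util:process_profiler
```

Because this output is verbose, it is usually worth keeping it in a separate, on-demand workflow rather than in the one that runs on every pull request.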

12. Debugging CICD Failures

Short description:

To effectively debug CICD failures, start by identifying points of divergence and generating multiple hypotheses. Trace the failure to its root cause, which can be application, environment, or pipeline related. Incorporate test code related issues into pipeline related problems. Finally, test and validate your solution before making any changes.

Okay, and so let me just check the chat real quick. Okay, we're looking good. All right. And so next you're gonna identify points of divergence. So we talked about how, in order to know what's wrong, you need to know what right looks like. Once you know what right looks like, then you can say, where does it go wrong? Where does it not match up? And that point of divergence could look like a couple of different things. It could be very clear. It could be an error message. It could be that you expected a certain test set to run and it didn't. That's an area of divergence. It could be that typically it installs certain dependencies, and in this case, it didn't install one of them. That's a point of divergence. So you wanna kind of start to narrow in on where it was going wrong. Or, I'm sorry, where it was going right, and then why all of a sudden it went wrong. And then you're gonna generate hypotheses. And I have hypotheses, not hypothesis, because statistically, if you have three hypotheses versus one hypothesis, you're actually typically gonna spend less time debugging than if you just have one. Because if you have one hypothesis and it's wrong, you will spend a lot of time trying to prove that hypothesis. Whereas if you have multiple ideas of what could have gone wrong, then everything that you're investigating will help to contribute to all three of those. And then you can start to cancel some out as you go along. So don't get tied up on one solution right away, unless it's something you've seen like a million times before. But even then I caution you against making assumptions. But start to kind of think around it: okay, the tests are failing, so a couple of hypotheses. Something actually broke with the application. It could also be that our data didn't reset properly. Or it could be that the test code was updated, or the test code needs to be updated to match an application code change. Maybe the new test hasn't been deployed yet.
So those are like three hypotheses. And as I start to investigate the failed test I may say, okay, you know what? Like, the test code was changed, so I do see a corresponding test code change, so I can cut that one out. And then as you go along further you see, okay, the test data itself appears to be good, and the assertion that's failing is not related to data. It's related to a UI element that doesn't appear in the DOM. And so I can cross out the test data hypothesis. And so as I'm going along, I'm learning, and that's because I started with a little bit of a broader swath when it comes to my potential root causes. And so you wanna start narrow as far as determining where in the process you're gonna isolate your research, but you don't wanna be narrow when it comes to your potential root causes. So you wanna kind of come up with a few of them and go from there. And then you're gonna trace that failure to the root cause itself. And when it comes to debugging CICD specifically, it can be application related. So it could be that something in the code itself, with the application, is broken. It could be environment related. So that may be a missing environment variable, it may be that the dependencies didn't install correctly. It may be that there's not enough resources on the machine, so the test is flaky and it fails. That's something that's gonna be related to the hardware, so to speak, of where you're running the deployment. And it could also be pipeline related. And what that means is that it can be related to how you've structured your pipeline and the steps. So that is where you're gonna see a divergence between what you think is supposed to happen and what the pipeline is actually doing. And those are the three essential categories of root cause. And the one thing that you won't really see here is test code related.
And so I actually incorporate test code related into pipeline related, because testing is part of the deployment process, as part of that pipeline process. So if you have a test that hasn't been updated to match the application code, that's a process problem. That's something that's related to how you are deploying your code. Maybe you need to have a check where you're not pushing a pull request unless it has an attached test code change, right? So that's a process problem. So I didn't break that out separately, but that is also a possibility: test failures can be related to the pipeline itself. So, great. And then last but not least, test and validate your solution. So once you have a hypothesis and you feel like you've found it, you don't wanna just necessarily make the change. It depends on how you have your setup, but you wanna essentially test your solution, whether that be in a sandbox environment, or whether you make the change locally. Maybe you're able to run in a QA or a development or a staging environment if you make any pipeline changes, or you're able to rerun an isolated test locally, the one that just failed. Whatever it may be, you wanna test and validate your solution and then go ahead and make the change. So that is the debugging process for CICD. All right, awesome. All right, so let's go ahead and move into the next section, which is causes of CICD failures. So we talked about all the different steps of a CICD pipeline. We talked about a general approach to debugging. Now, we're gonna talk about, once you've zeroed in on the step where something's gone wrong, how to help you generate some hypotheses. So these are gonna be some of the things that can go wrong, some of the things that take place throughout these different steps. Okay. The first step we talked about was the workflow trigger. So that was essentially what triggers the workflow to start, right? Whether it be an event, whether it be a manual trigger.

13. Trigger Criteria and Syntax Errors

Short description:

To ensure that your workflow triggers correctly, define the event, conditions, and branches that will trigger it. Use filters and conditions to control the workflow triggering based on specific criteria. Some CICD providers offer trigger testing to validate your workflow file. You can also manually trigger a simple workflow to test your conditions. Be aware of syntax errors in the YAML file, as typos can cause issues.

And in order to talk about some things that could go wrong here, the one thing to look at is the trigger criteria. So what is the event that you have defined to trigger the workflow? What are some of the conditions? And then what are the branches, or the branch, that you've indicated? And we can take a look here. Again, we'll be looking at GitHub Actions, but this will apply to any kind of YAML-based CICD. But you trigger a workflow using on, and on can refer to a lot of different events. So you can use a single event, you can have multiple events as well, and then you can also use activity types. So if a label is created, if a push is made, any kind of push to a certain branch. So say you don't do pull requests, maybe it's just your own code base, or you just have a small group and you just push straight to main, then that may be what triggers instead of a pull request. And so understanding what you've actually defined as the events that trigger the workflow is really important. The other thing is that you can have specific filters or conditions on workflow triggers. And if you have anything like that, it's important to understand how they work and what they do. What's neat is that some CICD providers actually have trigger testing, where you can set up your workflow file and then you can run a test and it'll show you what would or would not have triggered that workflow. So if you have access to some kind of environment like that, that's kind of a neat way to test it. You could also just manually try and trigger a simple workflow to ensure that the conditions that you've set up work as you expect. So you can have filters. For example, you can exclude certain branches. You can also have conditions. And so conditions are really interesting. Conditionals essentially let you put if statements, conditional logic, within your YAML.
So for example, if your GitHub label is bug, then it's gonna do a certain thing or it's gonna trigger a certain workflow. A lot of times we see this based on environments, or maybe based on a certain branch. If you're working on a certain feature branch, a really popular workflow that I saw when I was working with teams at Cypress is that, when they're working on a feature branch, it would run a subset of tests first related to that feature. And then if those tests pass, then it would run the entire suite. And what that did is it kind of ran the tests that were most likely to be affected by a change in that area first, before running through the entire full test suite. Or it may only run the full test suite when it actually comes time to merge to main. So that is an example of how you can use conditions based on branch, based on any kind of label, based on who is making a pull request. Maybe you can give one person superpowers, so they don't need to run the checks. But those are some things that you'll wanna check if you run into an issue where something is not triggering the way that you expected. The other thing to keep in mind is a syntax error, right? So if you're looking at everything and it's making sense, or maybe it's stopped working, it could be that there's just been a change to the YAML file. Typos get the best of us all the time, it happens to everyone, so that's something to keep in mind as well.
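
The trigger pieces discussed above (events, activity types, branch filters, and conditions) can be sketched in one GitHub Actions workflow. The branch patterns, the `bug` label check, and the `test:feature` and `test:all` npm scripts are hypothetical names chosen for illustration.

```yaml
# Sketch: events, activity types, branch filters, and a conditional job.
name: feature-checks
on:
  push:
    branches:
      - main
      - 'feature/**'             # only pushes to these branches trigger the workflow
  pull_request:
    types: [opened, synchronize, labeled]
    branches-ignore:
      - 'docs/**'                # skip pull requests that target docs branches
  workflow_dispatch:             # allows a manual trigger for testing the setup

jobs:
  feature-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm run test:feature    # hypothetical script: subset of tests

  full-suite:
    needs: feature-tests             # runs only if the subset passed
    # Run the full suite on main, or on pull requests labeled "bug"
    if: github.ref == 'refs/heads/main' || contains(github.event.pull_request.labels.*.name, 'bug')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm run test:all        # hypothetical script: entire suite
```

When a workflow does not start as expected, comparing the event, the branch name, and the activity type against this block is usually the quickest check, followed by looking for a YAML typo or indentation slip.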

14. Environment Provisioning Challenges

Short description:

Resource availability, Docker image issues, provisioning requirements, resource management, security, and data requirements are potential challenges in the environment provisioning step.

So in the environment provisioning step, the first thing that could go wrong is resource availability. So depending on what your CI provider is, you may only be allowed a certain number of machines, or certain types of machines. If you have a lot of jobs running concurrently, maybe there aren't enough resources available. You may end up having what's called queuing, which is where jobs are essentially waiting to start until resources are available. As you're setting up your environment, it could be that you have an issue with the Docker image that you're using. So if your image is out of date, or something's going on with the container, that could cause problems as well. There are also provisioning requirements that may be in place at your organization, depending on the company. Sometimes, for example, you need to have certain authorization to provision machines. I've seen before where you have to, for example, make requests to tap into a main workflow, and you have to use workflow templates. There may be resource management requirements where you're only allowed to provision a certain number of machines. And maybe you've asked for 10 and you're only allowed five, and so you need to change your YAML file, and something goes wrong there. There may also be security or data requirements as well. Or maybe the machine that you're provisioning is too small for the job, things like that. So those are the things that could go wrong at that step.
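
As a sketch of how these provisioning choices appear in a GitHub Actions workflow: the job below pins a container image, caps how many matrix jobs run at once, and sets a timeout so a starved job fails instead of hanging. The image tag, the limits, the shard count, the `test:e2e` script, and the `SHARD_INDEX` and `SHARD_TOTAL` variables are all assumptions for illustration.

```yaml
# Sketch: container image, concurrency cap, and timeout for a sharded test job.
name: e2e-sharded
on: [pull_request]

jobs:
  e2e:
    runs-on: ubuntu-latest
    container:
      image: node:20-bullseye        # pinned image; an outdated tag is a common culprit
    timeout-minutes: 30              # fail the job instead of letting it hang
    strategy:
      fail-fast: false               # let other shards finish when one fails
      max-parallel: 5                # ten shards requested, at most five run at once
      matrix:
        shard: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm run test:e2e        # hypothetical script that reads the shard variables
        env:
          SHARD_INDEX: ${{ matrix.shard }}
          SHARD_TOTAL: 10
```

With a cap like `max-parallel`, the remaining shards wait in the queue, which is the queuing behavior described above.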

15. Dependency Installation

Short description:

Dependency installation is the stage where you install all the required dependencies for your application. Issues can arise if dependencies are missing or incompatible with each other. Compatibility issues may also occur with the operating system. It's important to check the package.json and YAML files for the listed dependencies and ensure correct installation.

Next is dependency installation. And then let me just make sure here that I check in the chat. Okay, perfect. Okay, so dependency installation. This is the stage where you are installing all the things that are required to run your application. So we can see here, we are doing that in our npm install step, which actually takes place here in our setup command. So we are extracting, and sometimes you may be downloading remotely. So for example, if you needed to install something from a third-party server, those are all things that can go wrong at this step. As I mentioned before, if you have a dependency that's missing, you'll wanna check your package.json. So your package.json is what is going to determine what dependencies are listed. You will also check your YAML file, because your YAML file will also indicate what dependencies are required. So for example, if you have to install Node, if you have to install certain browsers. Probably a better example for that is going to be the Cypress workflow, because that has a lot more dependencies, since you do have to install certain browsers. But just to give you an example, I'll look at this one too. I just clicked the wrong button there. So, you know, we have to have a step where we download the browser. We actually have a script that we're running to get that from a certain API endpoint. If that is down, for example, we could have an issue there. Here we're using our Vercel preview action. Here we have a Playwright test base URL, and that's the branch URL coming from Vercel. So all of that is being set dynamically, and we're downloading the browser, we're downloading Node, we're downloading a driver, we're downloading a preview branch, so those are all things that could go wrong at that step.
There could also be compatibility issues between dependencies. So if you made changes, maybe you changed some of your dependencies or updated the versions of dependencies, and you didn't run it locally, you just pushed the change. (This workshop is being recorded, so it'll be available after the fact as well.) This happens a lot when you're bumping versions, right? Everyone's probably familiar with the dependency management aspect of working with modern JavaScript applications. Maybe you're bumping dependencies and some are not compatible with each other. So if you didn't build it locally first, you're probably gonna first encounter that when it runs in your CI/CD pipeline, and you could have something fail to install because of a different dependency issue. You could also have dependency compatibility issues with the operating system that you're running. So for example, with Replay, if you wanted to use the Replay-enabled browser to record your tests, it only works on certain operating systems. We have some for Linux, some for Windows, but say you tried to install the wrong version on an operating system, then that would be a compatibility issue between the dependency and the operating system. So those are some things to look out for at the dependency installation stage.
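As a rough illustration of the "check your package.json" advice above, here is a minimal sketch of a check that compares what a workflow needs against what the manifest declares. The function name `findMissingDependencies` and the example manifest contents are hypothetical and were not shown in the workshop.

```javascript
// Hypothetical pre-install check: which required packages are not
// declared in package.json (dependencies or devDependencies)?
function findMissingDependencies(packageJson, requiredNames) {
  const declared = {
    ...(packageJson.dependencies || {}),
    ...(packageJson.devDependencies || {}),
  };
  return requiredNames.filter((name) => !(name in declared));
}

// Illustrative manifest only; versions are placeholders.
const examplePackageJson = {
  dependencies: { react: "^18.0.0" },
  devDependencies: { "@playwright/test": "^1.20.0" },
};

const missingDependencies = findMissingDependencies(examplePackageJson, [
  "react",
  "@playwright/test",
  "webpack",
]);

// Lists every required package that the manifest does not declare.
console.log("Missing from package.json:", missingDependencies);
```

A check like this only catches undeclared packages. Version conflicts between dependencies would still surface during the install itself.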

16. Code Checkout and Authentication

Short description:

Code checkout authentication issues can occur if there's a problem with the authentication process, such as being behind a VPN or having specific role-based restrictions. Another issue could be if the code base has changed since starting the process and the branch or commit being accessed has been deleted.

Okay, so the code checkout. As I mentioned before, a lot of times we're using a checkout package, I'm sorry, a checkout action if you're using GitHub Actions, but this essentially has two steps. It authenticates you, it says, yes, you have the right to check out this code, and then it also goes to a specific commit. And so you really only have issues here if you are trying to check out code and there's some kind of authentication issue. Typically that won't be a problem if you're in the same repository using GitHub Actions. But for example, if you are checking out code and there's something wrong with the authentication process, maybe it's behind a VPN or SSO, or maybe you have really specific restrictions that are based on role type, maybe you're able to push but you can't check out, that could cause issues at that stage. Sometimes you'll also see an error message that says something like "checkout failed." And it could be because the code base has changed since you started the process, and the commit or the branch that you're trying to access has been deleted. Pretty rare, but that can happen as well.

17. Pre-build Checks and Application Build Phase

Short description:

Pre-build checks are performed before starting the application, including type checks, linting checks, and security checks. Automatic tools like Trunk can fix linting issues. Security checks scan the code base for exposed environment variables or data. The application build phase can encounter issues with machine resources or incorrect configuration. Building locally can help determine if the issue is application, environment, or pipeline-related.

Okay, so pre-build checks. These are the checks that take place before you actually start up your application. Typically these are type checks, linting checks, or security checks. I actually had this happen to me recently. I wonder if I can find it. Let's see. It was a while ago, but anyway. So that'll occur when something fails in the linting or TypeScript stage. Sometimes people will run those locally first in order to prevent it from happening in CI. It depends on how large your code base is, but the log should be a pretty clear indication at that point of what the issue is.

A lot of times there will be automatic tooling. We use Trunk, for example. And so with Trunk you could just run a command, I'm forgetting what the exact command is, and it'll automatically fix any linting issues. And then there could also be security checks. So for example, if you've exposed any environment variables, if you've exposed any data, anything like that, before you actually build the application you may have certain checks that scan the code base for things like that as well. And that could cause issues.

And then next is the application build phase. So let's take a look at the things that could go wrong. I would say that most of the time, the checking out is going to be fine; with pre-build checks, you at least know where to go; with triggering the workflow, usually your YAML file is going to be the culprit. Application build errors can happen pretty commonly, especially if you have complex build steps in your application, if you're using something like Babel or webpack in order to build. A couple of things that could go wrong: machine resources. If the machine is not meaty enough, like not big enough for your application to actually build on, that could be something that takes place. Missing or incorrect configuration. This is a big one, especially, like I said, if you're using any kind of bundler, any kind of build tooling. If you have configuration that is off, that could cause the application build to fail. Typically, a good way to determine, as I said before, if it's application related, environment related, or pipeline related, is: are you able to build it locally?

18. Build and Test Failures

Short description:

Dependency failures, environment variables, build cache issues, and machine resources can cause build failures. Test failures can be due to true test failures, test code issues, test data issues, or test flakiness. It is important to rule out test code issues and test data issues before debugging the application itself. Test flakiness can be caused by DOM instability, network instability, or environment instability. Having assets related to tests, such as reporting, logging, screenshots, videos, and debuggable replays, can help identify the cause of test failures.

So a dependency failure would mean essentially that something that was needed to build the application didn't install correctly. Environment variables are what they sound like: an environment variable is something you set in your CI/CD provider. We're actually doing that in this next section when we do the workshop part. We'll set an environment variable within GitHub to access from our YAML file. So if you don't have that set correctly, that would be dependent on your environment, not the actual application code itself. You could also have a build cache issue. Many CI/CD providers, GitHub Actions, Netlify, CircleCI, will cache dependencies so that you don't have to reinstall them every time. So if you run an npm install and every single time you're installing certain packages, or you're installing Playwright or whatever it may be, and the version is the same, it won't reinstall it. It'll just grab it from the cache in order to expedite the process and reduce the requirements on the resources. But this can cause problems. When you go to build the application, if there's something left over in the cache that shouldn't be there, or it thinks it's the same version but you actually made a manual change and so it's not recognizing it for some reason, you may wanna clear out your build cache in order to start from scratch and see if that's what the issue is. So again, if you're able to build locally but not in CI, you're gonna wanna check for dependency failures, environment variables, build cache issues, and machine resources. If it's failing both locally and in CI, then that's where you're gonna wanna check your configuration. You're gonna wanna check your plugins, if you have any build plugins, and any configuration files that are used for your bundler. That's where you're gonna wanna start to debug.
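The local-versus-CI rule of thumb above can be written down as a small decision helper. This is only a sketch of the reasoning from the talk; the function name `triageBuildFailure` and its input shape are hypothetical.

```javascript
// Hypothetical helper: given where the build fails, suggest where to look
// first, following the rule of thumb described in the talk.
function triageBuildFailure({ failsLocally, failsInCI }) {
  if (!failsInCI) {
    return []; // nothing to triage for the pipeline
  }
  if (failsLocally) {
    // Fails everywhere: a build issue regardless of CI/CD.
    return ["bundler configuration", "build plugins", "configuration files"];
  }
  // Builds locally but not in CI: look at the environment.
  return [
    "dependency failures",
    "environment variables",
    "build cache issues",
    "machine resources",
  ];
}

const ciOnlyChecklist = triageBuildFailure({ failsLocally: false, failsInCI: true });
console.log(ciOnlyChecklist.join(", "));
```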
So that's actually a build issue regardless of CI/CD, but it still absolutely will happen in CI/CD as well. Any questions on that step? Because it does happen pretty frequently and can be a problem. Good, great. All right, so your application tests. This is, I would say, the number one cause of CI/CD pipeline failures in my experience. So when tests fail, it's important to understand if it was a true test failure, meaning something in the application code changed that caused a regression, and the application no longer works the way that it's supposed to. That's good. Those are good test failures. That means that we caught something bad that shouldn't have been deployed. We like those. Those are great. Unfortunately, that is probably the smallest percentage of test failures, depending on what kind of framework and what kind of tooling you use. But you want to rule out whether it is a test code issue. So as I mentioned earlier, for example, if the application was changed but the test code was not updated, or if you changed a class or an HTML tag or whatever selector you were using for your test code, but you didn't actually change it in the test code itself. It could also be the way that you're writing your tests. So say, for example, your tests rely on each other and one test changed, but you didn't update the other tests. That's a test code issue, but all of that has nothing to do with your application itself. It's still important to get right. It's still important to have that quality, but that's not going to be a true test failure. Then there are also test data issues, which I talked about in pretty good detail earlier, thanks to that really great question. This is where you'll see data causing problems. That's, again, unrelated to your application. It has to do with the data that you're using for your tests. Then we have just good old-fashioned test flakiness.
Test flakiness is when a test will fail and then pass with no other changes. There's a lot of different causes for that. There's DOM related instability. There's network related instability. There's environment related instability. If you're interested in flake, I've done quite a few talks on that. If you look up Cecilia Cypress flakiness, you'll see quite a few. But again, the key here is to identify, is this a true test failure that should have broken my pipeline? Or was it one of these other reasons? Because if it's one of the other reasons, then you have a process issue at that point. And that's gonna kinda help you point in the right direction of where you need to debug. Do I need to be debugging my test process? Do I need to be debugging my data process? Do I need to be debugging my actual application itself? And things that help with this is if you do have assets related to your tests. So if you have really good reporting, if your tests have nice logging, if you have screenshots, if you have videos, if you have debuggable replays, which we'll take a look at in a little bit, that can all help you identify and pinpoint which of the categories that this falls into.
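The definition of flakiness above (a test that fails and then passes with no other changes) can be sketched as a check over repeated attempts of the same test on the same commit. The function name `classifyAttempts` and the `"pass"`/`"fail"` labels are hypothetical, chosen for illustration.

```javascript
// Hypothetical classifier: `attempts` holds the outcomes of re-running one
// test against the same commit, with no code or data changes in between.
function classifyAttempts(attempts) {
  const passed = attempts.includes("pass");
  const failed = attempts.includes("fail");
  if (passed && failed) {
    return "flaky"; // mixed results with no changes
  }
  if (failed) {
    return "consistent failure"; // true failure, test code, or test data
  }
  return "passing";
}

const retryOutcome = classifyAttempts(["fail", "pass"]);
console.log(retryOutcome);
```

A "consistent failure" still needs the triage described in the talk: it could be a true regression, a test code issue, or a test data issue.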

19. Artifact Storage, Deployment, and Configuration

Short description:

Having good reporting, logging, screenshots, videos, and debuggable replays can help identify issues. The artifact upload process should not be a breaking step unless specific artifacts are required. Incorrect error handling, permissions, and directory location can cause issues in this step. In the deployment step, permissions, deployment size, and hosting issues can go wrong. Configuration issues with the host may be the cause.

So during the artifact upload process, again, you really don't need this to be a breaking step unless there's something very specific that you need from an artifact. So if you say, for our security compliance, we have to have a report of the test run or whatever it may be, and if that fails, then the whole thing needs to be redone. So if you have issues with this step of the process, and that's breaking your pipeline, you may have incorrect error handling. You may have it set up so this step is a dependency that breaks everything after it, and you don't necessarily want it to be set up that way. Or if you are having issues within this step, you might not have the right permissions for wherever it is that you're trying to store the artifacts. And then you also may have an incorrect directory, the location where you're trying to store the artifacts. Nine times out of 10, this is automated. GitHub Actions makes it really easy. With a lot of the actions that you'll use, that's just set up for you. But if you're using a different CI/CD provider where you potentially have to manually define paths for where you want certain things stored, maybe you have different artifacts that go to different locations. The directory syntax, name, or location could be the issue there.
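To make the error-handling point concrete, here is a minimal sketch of a step runner where an optional step (such as an artifact upload) logs a warning instead of breaking everything after it. It is simplified to synchronous code, and `runStep` and `uploadArtifacts` are hypothetical names, not part of any CI provider's API.

```javascript
// Hypothetical step runner: optional steps report failure without throwing,
// required steps rethrow so the pipeline stops.
function runStep(name, action, { required = false } = {}) {
  try {
    action();
    return { name, ok: true };
  } catch (error) {
    if (required) {
      throw error;
    }
    console.warn(`Optional step "${name}" failed: ${error.message}`);
    return { name, ok: false };
  }
}

// Stand-in for an upload that fails because of a permissions problem.
function uploadArtifacts() {
  throw new Error("permission denied for artifact directory");
}

const uploadResult = runStep("upload-artifacts", uploadArtifacts);
console.log("Pipeline continues, upload ok:", uploadResult.ok);
```

If a compliance rule means the report must exist, passing `{ required: true }` restores the breaking behavior for that one step.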

Okay, and then last but not least, deployment. So things that could go wrong at this step: permissions. Again, if you don't have the correct permissions to deploy to your host, there may be certain secret keys, for example, that are required and that you don't have set up in your environment. The deployment size may be too large. That's if you have certain hosting limitations. More recently I'm seeing build tools, sorry, telemetry tools or metric tools that will measure build size and how long it took to deploy, and give a sense of whether that's a safe deployment to make. And then you could have hosting issues. I've had a Netlify build fail, a deployment fail, just because there was an error, and then I had to restart it. So those are some of the things that could go wrong at this step. Chances are if you've made it this far, it's not something that's wrong with the code base or anything like that. It's gonna be more of a configuration issue with the actual host of your application.

20. Replay: Time Travel Debugger

Short description:

Replay is a company that creates an open source time travel debugger. It allows you to record a session and create a debuggable replayable recording of that session. The recording includes everything that happened in the browser session, such as click events, network requests, and code execution. The debugger provides familiar debugging tools and allows you to evaluate code and view the state of elements at different points in time. You can share the recording with others, who can then debug the session and view error messages. The recording is like time travel, allowing you to fast forward and rewind to different instances and observe the code execution and values at each point in time.

Okay, so those were things that can go wrong and how to start to approach them based on what step you're at. Now we're going to talk about Replay. So, Replay is a company that creates an open source time travel debugger. It initially started at Mozilla, as part of the Firefox debugger and DevTools, and now is its own standalone tool.

So, first I'd like to start by just kinda showing you what it does, and it's a little bit tough to explain what a time travel debugger is, right? So, I'm gonna show you what that looks like first by walking through the manual recording process. So, Replay is a browser. And the Replay browser allows you to record a session and create a debuggable, replayable recording of that session. So, I'm gonna start. I'm gonna go ahead and open up this application I have on my machine. It's just a React calculator app. It's one of the demo apps from the React docs. And let me just double check what the script is. It's just npm start. I'll go ahead and start that on the machine here locally. And I have it running here at localhost:3000. So on the slides, you'll see this "reporting bug reports" link that goes to our documentation. And this is essentially the process that I'm gonna be using. So if you wanna follow along, feel free, or just kind of watch as we go through. So I'm gonna start by opening my Replay browser, and this is a browser that's installed on my machine. You can install it, and inside the browser, I'm gonna go ahead and go to my localhost:3000 where the calculator is running. And now that I have the application open, I'm gonna go ahead and just hit record, and it's gonna reload, and I'm just gonna kind of do some things here. So two plus two equals four. I already know where the error is: it's divide by zero, and when that happens, we get an error. So I'm gonna go ahead and hit stop. What that's gonna do is upload that to my library, and I'm gonna go ahead and name it "divide by zero error." I'm gonna go ahead and make this public so y'all can see it too if you like. I'm gonna go ahead and hit save, and so now I have this recording of what I just did. So I reproduced that bug once, and this recording now lives on forever. It has a recording not just of what happened on the screen. So we can rewatch the video.
We can see where I clicked, we can see what went wrong. It also has a recording of everything that was happening in the browser session. So we have all of the click events, we have all the network requests. We even have the execution of the code that ran. We have the state of all of the elements in our application at every point in time. So, if I click over here to go from the viewer mode to the DevTools tab, this gives us all of the debugging tools that you may be familiar with from Google Chrome or VS Code, or whatever IDE you use, built in right alongside the recording. So I can see here my mouse events, and like I said, I can see WebSocket and XHR requests, but I can also click back and forth in time and evaluate certain things. So here I can click on this element. I can see where that is in my HTML DOM tree. I can see what CSS rules are applied to it at that given time. I can see what event listeners are attached to it, if there are any. So if I click on this button here, I can see that we have two click event listeners, I can see where that's defined in my JavaScript, and I can actually click on it and see all of the source code that was recorded. And this works, I was running it locally, but you can also record, you know, or whatever it may be because JavaScript lives in the browser. So it's able to record all of that. So I'm able to come in here and see that this line of code here was hit 171 times. I'm able to see in my button component, let's say, for example, that this click handler was passing through this.props.name. So let's say I wanted to evaluate that. We're all kind of used to console logs, right? You add a console log in your code, you evaluate it. We can do that here. I can log this.props.name to the console, and I can see every execution of that line of code and what the value of this.props.name was every single time that line of code executed.
And I can kind of rewind to each instance. So, in this instance, I'm hitting the plus sign and it's handling my click with the name of plus. If I fast forward to this instance, I'm clicking the eight button and we can see handleClick is processing with props.name of eight. So, this recording exists, it's shareable. I can click share, I can copy the link. I can go ahead and paste that here in the Discord chat. So, anybody that has the link could come in here now, they can play around with it and they can debug it, right? So, if we wanted to take a look at what our error was, in our console I can go ahead and log the errors that occurred. I'm gonna go ahead and get rid of this one here. So, we had an issue when we divided by zero. And so we can fast forward. Yeah, this is like time travel, right? It's a recording. So, we can fast forward and we can take a look at our error message. We have an error with division by zero, and we can see that in this operate file, on line 20, it's dividing by zero and it's throwing an error. So, let's go ahead and pull that up in our source explorer. Here we can search for a file, and it's operate.js, so we can pull that up. We can see here that this is what was returned, and it returned one time, and it was one divided by two. And we can see here, if we log two to the console as a string, that we have an issue here. We can see that zero is the value. And so, we're getting an error there.

21. Replay: Manual and Automated Testing

Short description:

In this section, we explore the capabilities of Replay, including inspecting React components, evaluating React state and CSS at different points in time, and viewing network requests. Replay is a manual process for reporting bugs during development, manual QA, and support. Automated tests can also replicate this experience using forks of Playwright and Puppeteer. We'll be using Playwright in the next section to record and analyze failed tests. The Replay browser replaces the regular browser used for running tests, providing a recording of the test execution. We'll walk through the process and discuss the Replay Playwright configuration. A GitHub action is available to automatically record and upload failed Playwright tests to the Replay library. The library stores all the replays, which can be reviewed and debugged. We'll be working with the floating UI fork and demonstrate the process of reviewing failed test recordings. The debuggable replay provides detailed information about the test run, allowing for code evaluation and debugging. We'll recreate this process in the following section.

And what we can see above here that we do have like this kind of check that's supposed to happen where if somebody tries to divide by zero, it will alert divide by zero error and return zero. But those two lines have zero hits, so those did not execute. And I kinda like to, as I'm going through, add comments. So, we can say here, this line is supposed to hit but does not. And then anybody else that looks at this replay will be able to then click here and go directly to that line in the code, see the comment and start to debug it.
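One common way a guard ends up with zero hits is a strict comparison between values of different types. The sketch below is not the calculator's actual source, which isn't reproduced in this transcript; it is a hypothetical example using plain numbers, with made-up function names, to show the pattern.

```javascript
// Hypothetical example of a guard that can never run: `two` is a number,
// so a strict comparison against the string "0" is always false.
function divideWithDeadGuard(numberOne, numberTwo) {
  const one = Number(numberOne);
  const two = Number(numberTwo);
  if (two === "0") {
    console.warn("Divide by 0 error"); // never reached
    return "0";
  }
  return String(one / two);
}

// Same function with the guard comparing like with like.
function divideWithWorkingGuard(numberOne, numberTwo) {
  const one = Number(numberOne);
  const two = Number(numberTwo);
  if (two === 0) {
    console.warn("Divide by 0 error");
    return "0";
  }
  return String(one / two);
}

console.log(divideWithDeadGuard("1", "0")); // guard skipped
console.log(divideWithWorkingGuard("1", "0")); // guard hit
```

In a recording, the first version would show the two guarded lines with zero hits, which is the kind of signal described above.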

You can also inspect React components. So, if I come back here and I go to my app, I can see the state of my application. I can go ahead and search here for my app component, search for a file, app.js. So, I can see that my React state is total, next, and operation, and that corresponds to what the value is here. And that changes at different points in the replay. So, when I'm here, my app state is 8.0.0. But when I'm here, my app state is... Let me see, go a bit further down. 8.0 division 8. So, I'm able to also evaluate React state at different points in time. Again, I'm able to evaluate CSS at different points in time. And I'm also able to see the network requests. This is a local UI application, so there's not a lot happening with the network requests, but you could see the request headers, the response body, the stack trace, the timings. And then you can also add comments directly to network requests. So, that's a quick tour of what Replay is. This is the manual process using the Replay browser to report a bug. It's helpful if you encounter a bug during the development process, if you're doing manual QA, if you're on a support team; those are the use cases that we see. But you can also replicate this experience using automated tests.

So, if I go back to my slides here, we can go ahead and minimize the browser. We have recording automated tests, and "Replay automated tests" is the tiny URL there on the slides. So, we have forks for Playwright and Puppeteer. We do have a very basic proof of concept for Jest as well, and then also a Cypress fork, but those are still being worked on. So, very, very alpha. But we have Playwright and Puppeteer, and we're gonna be using the Playwright one today. Essentially, what it does is it swaps out the regular browser that you would use to run your tests. So when you're running your tests on your machine, Playwright is opening a browser, it's navigating through the test code, it's executing your tests in a browser. So instead of using Chrome or Firefox for that, we're using the Replay version of that browser. That way you can create that recording and you have that asset after the fact, so if a test fails, you have all that helpful information about what happened during the execution of that test. So that's what we're gonna be working on in the next section here. And just to start, I'm gonna walk through what the process looks like. So again, we're gonna be recording Playwright. We're gonna be using a fork of an existing setup, so you won't have to go through this many steps. But Replay Playwright supports Firefox on Mac and Linux, and Chrome on Linux, so when I was talking earlier about how there could be a dependency discrepancy with the operating system, this is an example. If you tried to use it on a Windows machine, it would not work, for example. So, what this does is there is a @replayio/playwright package that gives you access to a Replay devices array that you just use in your Playwright configuration in order to use the Replay-enabled browser instead.
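The operating-system support just described lends itself to a quick preflight check, so a pipeline fails early with a clear message instead of at install time. This sketch encodes only the support stated in the talk (Firefox on Mac and Linux, Chrome on Linux), which may have changed since; the table and function names are hypothetical.

```javascript
// Support as described in the workshop; treat as a snapshot, not a spec.
// Keys are browser names, values are Node's process.platform identifiers.
const SUPPORTED_PLATFORMS = {
  firefox: ["darwin", "linux"],
  chromium: ["linux"],
};

// Hypothetical preflight check for a CI runner.
function isReplayBrowserSupported(browser, platform = process.platform) {
  const platforms = SUPPORTED_PLATFORMS[browser];
  return Array.isArray(platforms) && platforms.includes(platform);
}

for (const browser of Object.keys(SUPPORTED_PLATFORMS)) {
  console.log(`${browser} supported here:`, isReplayBrowserSupported(browser));
}
```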
Once you've done that, it'll go ahead and use the Replay-enabled browser to run the tests and record instead. So we do have a GitHub action, which is what we'll be using today, that does this automatically for you. It records your failed Playwright tests in Replay and it automatically uploads them to your Replay library. Your Replay library is essentially where all your replays are stored. So if I go to my test team we can see that this morning I ran some tests, and those failed recordings are all here in my library. So that happens automatically as a result of the GitHub action. Then when a failure occurs, I can come in here and I can review the recordings.

So, just to show you what that looks like, we're gonna be working with the floating UI fork here. I'll go ahead and pop that in the chat and then also in Discord. And as you can see, this morning I updated the readme just to add the text that we're using the Replay browser for this fork, and everything passed. So, lint and type checking passed, my unit tests passed, functional tests passed. Then we created a pull request with a change, and it is a breaking change, and so now our tests did not pass. So if we go into here, I can see that my unit tests and my functional tests did not pass. If I go to my functional tests, I can see at the end here, "upload finished, view your replays," and it has a link here to the replays that were generated of the failed tests. So it creates a replay for each individual failed test. It's added to my library. I can also click on the link here from the GitHub Action in order to access that. This is similar to, for example, what we looked at with the Cypress example, right? The Cypress example has the screenshots and the video recording. This takes it one step further and gives you that debuggable replay with all the information from the test run. So if we were to take a look at one from this morning, I'll go ahead and just click on this one here. So let me go to viewer. I can watch my test run here and I can see it floats. It's off; it's supposed to be lined up, but it is not lined up, and that is why it is failing. So "arrow should not overflow floating element left end," and it is overflowing. So I can come in here, I have access to my code, I can see what code executed and what didn't, and start to debug. I could add comments to it. But what we're gonna do is essentially recreate this process.

22. Forking Floating UI and Setting up GitHub Secret

Short description:

We'll start by forking the specific floating UI repository and creating accounts at Then, we'll create a team API key and add it to our GitHub repository secret. After updating our visual snapshots and running a passing CI-CD, we'll make a breaking change together and debug it.

So, yeah, like I said, we're gonna go ahead and start by forking the ceceliacreates floating UI repository. You're gonna wanna use that specific fork because I have removed a lot of the tests. Floating UI has a really robust Playwright test suite, which is why we wanted to use that one specifically, because it already had existing Playwright tests, but there's a lot of them, and so when it fails, it takes quite a while to run. So with this fork I only left in a certain number of tests, and that's what will essentially run. So you'll start by forking that, and the next step is going to be, again, we'll take a look at the action, the Replay action. We're going to go ahead and create Replay accounts; they're free accounts, but you do have to sign in with Google. So if you want to just follow along, feel free. And then we're going to create a team API key so that we can add that to our GitHub repository secrets. And so we're going to get some practice with learning how to add environment variables here. And then essentially we already have the existing GitHub action in that fork, so we don't have to manually set it up. We're going to start by updating our visual snapshots. We're going to run a passing CI/CD, then we'll go ahead and make a breaking change together and watch that fail and debug it. Okay, so we go ahead and have that forked.

23. Setting up GitHub Secret

Short description:

To set up the GitHub secret, create a new team and a new API key in the Replay library. Switch to the team settings and create a new API key with the desired label and necessary permissions. Copy the API key. In your repository settings, navigate to secrets and actions. Click on new repository secret, name it RECORD_REPLAY_API_KEY, and paste the value. This keeps your API key secure and private, and it is not stored anywhere in your source code.

And then the next step is going to be to set up the GitHub secret. So I'm going to walk through how to do that. I'm gonna go ahead and create a new team and a new GitHub secret. So in Replay, once you log in, you'll be taken to your library. You may be prompted to do an initial demo; you can close out of that and go directly until you see this screen where it says your library. You're gonna click create new team, and we'll do DevOps JS team here. And then it'll prompt you with an invite link; you can just click next. And then go ahead and switch over to that team and open up the team settings. On API keys, we're gonna be able to create a new API key. You can give it any label you want. I'm just gonna do GHA for GitHub Actions. For the permissions to create recordings and upload source maps, you can leave both of those checked, that's fine. And I'm gonna go ahead and add that. And I'm gonna go ahead and copy it. Okay, so the next step is back in our repository. And so this is gonna be specific to your project itself. You will go to settings, and then secrets, and then actions. So again, settings, secrets, actions. So I already have this RECORD_REPLAY_API_KEY secret that I created this morning. But you'll wanna click on new repository secret up here at the top. You're gonna wanna name it RECORD_REPLAY_API_KEY, and you're just gonna paste the value. I'm gonna leave the one that I already have. You'll paste the value, and you'll go ahead and add secret. Once you do that, you'll be able to see the repository secret here. You won't be able to view the value again. It's saved. If you accidentally delete it, you'll need to create a new one, which is fine. But what this does is it keeps your environment variable, your API key, safe, and it's not stored anywhere in your source code. So if anybody checks out your source code, they don't have access to that secret.
So that's what keeps it secure and private so that nobody else can find it. I'm gonna go ahead really quick and just click done. And I'm just gonna go ahead and delete that one really quick, so it's not out there.
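A workflow step usually receives a repository secret as an environment variable. As a minimal sketch, assuming the variable is named `RECORD_REPLAY_API_KEY` and using a hypothetical helper `checkRequiredEnv` (not part of Replay or GitHub), a script could confirm the key is present without ever printing its value:

```typescript
// Hypothetical helper: reports which required variables are set, never their values.
type Env = Record<string, string | undefined>;

interface EnvReport {
  present: string[];
  missing: string[];
}

function checkRequiredEnv(env: Env, required: string[]): EnvReport {
  const present: string[] = [];
  const missing: string[] = [];
  for (const name of required) {
    const value = env[name];
    // Treat empty or whitespace-only values as missing.
    if (value !== undefined && value.trim() !== "") {
      present.push(name);
    } else {
      missing.push(name);
    }
  }
  return { present, missing };
}

// Stand-in environment object for illustration; in Node you would pass process.env.
const report = checkRequiredEnv(
  { RECORD_REPLAY_API_KEY: "example-not-a-real-key" },
  ["RECORD_REPLAY_API_KEY"]
);
console.log(
  report.missing.length === 0
    ? "All required secrets are set"
    : "Missing: " + report.missing.join(", ")
);
```

Logging only the variable names keeps the secret out of the CI logs, which is the same reason it is kept out of the source code.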

24. Running the Update Visual Snapshots Workflow

Short description:

Once you've added the action secret, go ahead and run the update visual snapshots workflow. This is necessary because there are no baseline visual snapshots to compare against, and the tests won't pass without them. The update visual snapshots workflow should take about four minutes to complete. While it's running, you can review the logs to see what's happening. If you need more time, feel free to ask.

Okay, so once you've added the action secret, that's now stored, and we can go ahead and take a look at our code, right? So we have this forked. We have our record replay API key saved as a secret in our GitHub repository. We can see now we have some GitHub Actions workflows. The first time that you click on the GitHub Actions tab here, it's going to give you a warning message that says we've disabled workflows because this is a fork. That's normal, just go ahead and click I understand, enable. And then you'll be able to see the potential workflows here. Just double check nightly; this one will still be disabled and that's okay, you can leave that disabled. That is a scheduled workflow. So if we take a look at that YAML file, we can see that it uses a cron job to schedule this nightly run. So you can leave that disabled; you don't necessarily want this running every night just for the purposes of practicing. But I wanted to explain what that one was. The rest of these are triggered workflows. So we have release Floating UI, release Popper, test, and then update visual snapshots. So the first thing that you're gonna want to do is run the update visual snapshots workflow. This workflow has a workflow_dispatch event trigger. What that means is that it's not related to a pull request, it's not related to a commit, it's not related to any other event. The event that it's tied to is you saying, yes, please run this workflow. So you have to dispatch this workflow in order to run it. So that's gonna be the first step that you're gonna do. Once you've added the API key, you're gonna run the update visual snapshots workflow. And the reason that we have to do this is, I just lost this here. Where am I? There it is, okay. The reason that you have to do this is because this is your first run. If you try to run the tests, there are no baseline visual snapshots to compare against and the tests won't pass.
So the first step that we need to do is run the update visual snapshots workflow. It should take about four minutes. That's how long it took when I ran it earlier. And while that's running, we'll go ahead and just dig into the logs here. So go ahead and dispatch that, get that running. And then I'm gonna go ahead and dig into the logs and show what was happening along the process. And then before we move on, if anybody needs more time, just raise your hand or pop in the chat if you need more time.
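To make the difference between scheduled, manually dispatched, and pull-request-triggered workflows concrete, here is a small model of how an event selects which workflows start. This is an illustrative sketch only: the trigger assigned to each workflow below is an assumption for the example, and the real triggers live in each workflow's YAML file.

```typescript
// Simplified model of GitHub Actions event triggers (illustrative only).
type Trigger = "schedule" | "workflow_dispatch" | "pull_request";

interface Workflow {
  name: string;
  on: Trigger[];
}

// Assumed trigger assignments for this sketch; check the YAML files for the real ones.
const workflows: Workflow[] = [
  { name: "nightly", on: ["schedule"] },
  { name: "update visual snapshots", on: ["workflow_dispatch"] },
  { name: "test", on: ["pull_request"] },
];

// Return the names of the workflows that would start for a given event.
function workflowsFor(event: Trigger, all: Workflow[]): string[] {
  return all
    .filter((w) => w.on.indexOf(event) !== -1)
    .map((w) => w.name);
}

// Opening a pull request does not start the snapshot update;
// only an explicit manual dispatch does.
console.log(workflowsFor("pull_request", workflows));      // ["test"]
console.log(workflowsFor("workflow_dispatch", workflows)); // ["update visual snapshots"]
```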

25. Running the Workflow and Updating Visual Snapshots

Short description:

The workflow checks out the code and sets up Node. It installs the system dependencies using npx playwright install-deps, installs the npm packages, and installs the required browsers using npx playwright install. The application is built and the tests are run, generating the initial snapshots. After completing the job, the visual snapshots are updated. Congratulations on running your first GitHub Actions workflow!

So yeah, so just update visual snapshots, what this is doing is it's going ahead and checking out the code. And so again, this is run off of your workflow dispatch event. So it's not gonna be tied to a specific pull request or a specific commit. This is actually gonna just go ahead and fetch the repository. So the ref is actually just gonna be our master branch. So we can see here that the branch is branch master is what the ref is that we're checking out. So it's again, not tied to a pull request, not tied to a commit. We're just checking out master and we're going from there.

Then we're gonna go ahead and we're setting up node. So we can see that we attempted to download, it was not found in the manifest, which means that we did not have that cached in our machine, which makes sense. This is a brand new GitHub Action Workflow that shouldn't have anything cached. And we can see after the fact that it was added to the cache so going forward, we won't have to install node every single time.

Next, we're gonna run npx playwright install-deps to install the dependencies. So we see that happening there, quite a bit of logs. Let's go ahead and scroll past that. This package here, bahmutov/npm-install, is what allows us to actually do caching of npm modules. Small world: Gleb Bahmutov is the person who made this package, who I worked with at Cypress, and he has a lot of great resources on Cypress. He's no longer at Cypress, he's at a different company, so am I, but we both still do a lot with Cypress testing. And he does a lot in general, including this package. But what this does is essentially allow you to cache npm modules so you don't have to reinstall them every time. So we can see here that we don't have an npm cache yet; that makes sense, this is the first time running this workflow. So we're gonna go ahead and install our npm packages, around 2,300 packages, and then we're gonna go ahead and run npx playwright install. So we're installing the required browsers. And again, we're not using Replay yet. This is just creating those initial visual snapshots, right? So we can see that npx playwright install, what that does is it just installs any browsers that you would need to run Playwright tests. Here we are building our application, so npm run build. Then we are gonna go ahead and run test functional update. So this is a script that's in our package.json, it's a command that we can run, and what this is doing is it's running the tests and it is taking the snapshots. And we can see here in the logs, right? This png is missing in snapshots, writing actual. So it's going through and it's creating our initial snapshots for all of the tests that we have. On this fork, I've deleted a lot of the tests. I just left one directory, the functional arrow tests. So those are the ones that are being generated. And then we're just gonna go ahead and have some post actions, and then we're gonna complete the job.
Alright, so at this point, you should ideally have completed that job. You should have a successful action that updated the visual snapshots. If anybody needs more time, just raise your hand or pop in the chat that you need some more time before we move on. Okay, and if you've never used GitHub Actions before, congratulations, you just ran your first workflow. You are officially a GitHub Actions specialist. No, I'm just kidding. But yeah, that really is how easy it is to get set up and start running with it. I know for me, I was a little bit nervous the first time I ran one because I had heard all these horror stories about runaway machines and breaking things, but GitHub Actions is free for all public repositories and projects. And so feel free to use it and experiment with it. Okay, great. So the next step is going to be to make a breaking change PR. But first, actually, I wanted us to do something else. I changed my mind. I'm going to have us go ahead and just run a working test run. So you're gonna make a change, and you don't have to download the code to do this. You can do this right in GitHub. We're gonna make a change. And so let's go ahead and go to our code. And a really easy one is just to change the README. You can go to code and you can click on this little edit button here next to README. And this will allow you to edit the file directly. And I'm just gonna type, this is a test. Add that there. And then what we're gonna do down here is we're gonna actually check the box, I'm sorry, the radio option that says create a new branch for this commit and start a pull request, because it is the pull request that triggers our workflow. So we're creating a pull request. All I've changed is the README file. So this should be a working pull request. So let's go ahead and propose changes. That's gonna go ahead and create a pull request to update the README. I'm gonna go ahead and create that.
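The log message about a missing snapshot reflects how baseline-driven visual tests behave on a first run. The sketch below models that logic in a simplified way. It is not Playwright's actual implementation; the `checkSnapshot` function, the store shape, and the snapshot name are hypothetical and exist only for illustration.

```typescript
// Illustrative model of baseline snapshot handling; not Playwright's real code.
// Maps a snapshot name to its stored image (a plain string stands in for pixels).
type SnapshotStore = Record<string, string>;

type Outcome = "written" | "missing" | "matched" | "mismatched";

function checkSnapshot(
  store: SnapshotStore,
  name: string,
  actual: string,
  update: boolean
): Outcome {
  if (update) {
    // Update mode: record the actual image as the new baseline.
    store[name] = actual;
    return "written";
  }
  const baseline = store[name];
  if (baseline === undefined) {
    // Nothing to compare against, so a normal test run cannot pass.
    return "missing";
  }
  return baseline === actual ? "matched" : "mismatched";
}

const store: SnapshotStore = {};
console.log(checkSnapshot(store, "arrow-top.png", "pixels-v1", false)); // "missing"
console.log(checkSnapshot(store, "arrow-top.png", "pixels-v1", true));  // "written"
console.log(checkSnapshot(store, "arrow-top.png", "pixels-v1", false)); // "matched"
console.log(checkSnapshot(store, "arrow-top.png", "pixels-v2", false)); // "mismatched"
```

This is why the update workflow has to run once before the regular test workflow: the first call has no baseline, and only the update run creates one.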
So what that will have done is kicked off a workflow run.

26. Running Tests and Debugging in CICD Pipeline

Short description:

The update README has started and is currently queued. The linting, type checking, unit test, and functional test are all running concurrently. Each job within the workflow has its own steps. Debugging options include SSH into the CICD provider or SSH into the container of the running machine. Functional tests are running with replay Firefox or replay Chromium. It is unnecessary to install both Playwright and Replay browsers. The GitHub Action and tests have passed successfully.

So we can see here that the update README run has now kicked off and is currently queued, but we can go ahead and click on it and see it start to pull up. So, okay. There it goes, it's in progress. And again, these three jobs can run concurrently. So they're not running one right after another. The linting and type checking happens. The unit test happens. The functional test happens. Those are all happening at the same time. They're not dependent upon each other. Each job within the workflow will have its own steps that it needs in order to check out, install, build and run the tests.

So if anybody has any issues up to this point, raise your hand, let me know in the chat, but you should see some running tests at this point or some running workflows. And we can click into any of them here. We can see the setup job. We can see the GitHub token permissions. We can see the operating system that we're using. We can see the ref that was checked out. So this is kinda fun, checking out the ref. We can actually see the head is now the merge of this into this. Those are the two SHAs. That's the pull request, that's the actual Git ref that is being used to check out the code, which is kinda interesting there. And then, so now it's going ahead and running the setup actions. We click over to our unit test. Now again, these are all running at the same time, we can click in, we can kinda start to look at things. This is doing the same thing. It's checking out the same version of our code. The ref is gonna be the same. It's merging this into this. And so, I kinda talked earlier about how it's up to you to decide if you want things to run in sequence or if you want them to run concurrently. You could essentially have one job that has the steps in order where you're checking out, installing, and then, for example, running the linter, then building, then running the unit test, then running the functional test. And that's gonna be one longer process, but it uses one machine. So it just really depends on the resource needs that you have and how long you want the runs to take in order to determine what the best strategy is for setting up your pipeline.
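The trade-off between concurrent and sequential jobs can be sketched with some simple arithmetic. The job names mirror the ones in this workflow, but the durations below are made up for illustration, and the `totalMinutes` helper is hypothetical:

```typescript
interface Job {
  name: string;
  minutes: number;
}

// Wall-clock time for a set of independent jobs.
// Sequential: one machine runs everything, so durations add up.
// Concurrent: one machine per job, so the slowest job sets the total.
function totalMinutes(jobs: Job[], mode: "concurrent" | "sequential"): number {
  if (mode === "sequential") {
    return jobs.reduce((sum, job) => sum + job.minutes, 0);
  }
  return jobs.reduce((longest, job) => Math.max(longest, job.minutes), 0);
}

// Made-up durations, for illustration only.
const jobs: Job[] = [
  { name: "lint and type check", minutes: 2 },
  { name: "unit tests", minutes: 1 },
  { name: "functional tests", minutes: 5 },
];

console.log(totalMinutes(jobs, "sequential")); // 8
console.log(totalMinutes(jobs, "concurrent")); // 5
```

The concurrent run finishes sooner but occupies three machines, and each of them repeats the checkout and install steps, which is the resource cost to weigh against the shorter wait.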

Okay, so it looks like we have a passing unit test. So we can go ahead and click into the step here, run GitHub Actions Unit, and we can see that all of those have passed and that's great, so that one's already good. We are still running our linting and type checking. And okay, so it just finished the TypeScript check. It looks like that was good. And it's just doing the cleanup and completing the process right now. And then our functional tests are running. So right now it's reading the database, preparing to unpack. And all this is happening at the same time. So a couple of things I wanted to mention about debugging in general that we didn't really get too much into earlier, because a lot of it has to do with Docker and containers and I didn't want to get too deep into that. But say you are in the middle of an active CI/CD pipeline process, maybe it's very complex and you're running into issues. A couple of options: you can SSH into the CI/CD provider that you have, or if you're using something like Docker or Kubernetes, you can actually SSH directly into the container of the machine that is running in order to do additional troubleshooting. So we talked earlier about those debug logs, right? And how you had to enable them and then rerun to get that verbose information. If you're able to open a terminal and actually SSH into the machine, it's like you're running in the same terminal that you have in your VS Code editor, right? So like I'm in here, I've started the server. If I wanted to run commands and inspect things, I could. So you can do the same thing with certain CI/CD providers. I know DigitalOcean, for example, lets you SSH into an existing job. So that's something that you may want to do.
Again, if you're in the middle of a long deployment and there's an issue, rather than having to enable debug logs and rerun, if you know some particular commands that you want to try out, maybe you want to validate your environment variables while you have the machine up and running, that's another option as well that we didn't dive into too much. It's a little bit... It takes some additional configuration to demo that. But, great. So now we're running our functional tests. And so, I just want to kind of show here that you can see that, the tests that's running, the device that's being used is replay Firefox. So it's not the regular Firefox, it's replay Firefox. And that's going to be different than the action that we had before, where you updated your visual snapshots. So when you updated your visual snapshots, that was just using Playwright. And so, you can see that that's essentially just running the tests. And it's... Oh, actually this is using replay Chromium. Okay, so we must have it set up for both. That's interesting. So, it's interesting that we still installed the Playwright dependencies browsers then. Probably don't need to do that. Probably an optimization that you could do there. If we don't necessarily need to install the Playwright browsers then, in addition to the Replay browsers. But, okay, so now we have a working GitHub Action, it's passed, our tests have passed, this is great.

27. Making a Breaking Change PR

Short description:

To make a breaking change PR, edit the compute position file to return X plus 20 instead of X. Create a new branch, propose the change, and create a pull request. The action will run the same jobs and report to the replay library. You can access the replay library in any browser at

So we can go ahead and approve our pull request if we wanted to, right? I won't do that right now, but if you want to, feel free, this is a test. So, now that you know what working looks like, let's see what not working looks like. So, we're gonna make a breaking change PR. You can do whatever you like; I'm gonna show you what I did, just so that we'll be working on the same page if you don't want to come up with your own change. You can see it if I go back to the open pull request I have here, update compute position. I just changed one file, it's our compute position file, and I changed where it returns X so that instead it returns X plus 20, so it just makes it off by 20. That was actually in the original work as well; it is just a really easy way to force a failure. So, you can do something similar, or you can change something different, if you want to break whatever you want to break, but we'll make a breaking change. Again, you don't have to download the code and open up an IDE to do this. We're just making a tiny little change, so we can do it right in GitHub, so it's a little bit easier to manage. And so, I'm gonna go into packages, core, source, compute position, and here on line 127, I'm gonna edit that file. So again, I can just click this little pencil icon to edit this file directly. You really don't wanna do this for a lot of stuff, this is just for demo purposes, just so you know. And so we're gonna return X plus 20, and then I'm just going to confirm, really quick, that's the exact same syntax as the other one, cause I've been known to type things wrong. X plus 20, okay, great. And then again, we're going to create a new branch for the commit and start a pull request, and I'm gonna call it introduces breaking change to fail tests. You go ahead and propose the change, and we're gonna create the pull request, and that is going to kick off our action and it's gonna go ahead and run the same jobs.
Now, because we have added that API key to our GitHub secrets, we don't have to do any additional configuration; it's just gonna run and it's gonna report to your Replay library. So, we'll go ahead and let that run. While that's running, really what we care about is the functional test, once that comes up. We'll go ahead and close some of those things out here. And so, the thing to note about the Replay library is that you can access it in the Replay browser, and you will automatically be taken there if you're creating a recording, but you don't actually need the Replay browser in order to open a replay. You can open it in Chrome, you can open it in Firefox. So, if you're using Replay to record automated tests, you don't actually ever have to install the Replay browser yourself. The Replay browser is being installed on your CI/CD machine, and then the recording is going to your Replay library, which you can open in any browser. So, that's essentially where it's going, and I'm able to access my recordings here.

28. Debugging Unit Test Failures with Replay

Short description:

We identified a failing unit test that indicated an off-by-20 error in the X value. By examining the code and using replay, we traced the error to the compute position function. The visual snapshot comparison also revealed the off-by-20 error. We explained the difference between unit tests and functional tests in catching this type of error. We concluded the workshop by summarizing the steps and tools for debugging CI/CD pipelines and encouraged participants to reach out for further assistance.

Okay, and so let's go ahead and take a look, oh, it's actually in here, great. So, it's gonna go ahead and it's checking out. Okay, so we do have a failing unit test here. So, we can see that we expected object equality. So, we expected X to be 25 and that is failing because now we are off by 20, right? So, we changed the positioning of our X for our item, which is essentially forcing that to break. So, expected was 25, received was 45. So, we are now off by 20, which is why our test is failing. So, we have a unit test which points to that kind of initial point of failure. We can see that something is off with that value of X. So, if we were looking for our kind of initial point of failure, going back to our debugging process, you could start there and it gives you a really, really big hint as to what's happening, right? X is off by 20. So, we're expecting then to see the same thing have started to happen in our functional test. And just for the sake of time, I'm gonna go ahead and open up, like I said, the replay from this morning where I recorded this run. And we can go ahead and go into any of these. So, go ahead and click that to open it. And you can see the test is that arrow should not overflow floating element top. So, this is the floating. That's the reference, we play it. We can see it's overflowed. And we can go into our dev tools. And just to show kind of what's happening here. We can filter our functions and we know that we changed the compute position, right? So, compute position has seven hits here. And we can see that what's actually happening in our code, we can make this a little bit bigger, let me see, make this code bigger here for you. So this is compute position. And we can see that it's calling another function, compute position one, that takes reference and floating. And then it returns the platform and then it also spreads out the options that are returning. 
So we can go into this compute position one and go into the call stack there, but we can also see it here on the left-hand side in the outline. So we have our compute position one, and we have down here where we can see the return value is X plus 20. So we're able to identify that in our code here. We could also look in the outline where it says, like, let x, let y, computeCoordsFromPlacement. So we may be able to narrow in on that. And then we have quite a bit of hits, so it's not actually allowing me to do that. But that's where we can see in our code that the change took place and why that failure's occurring. So we can come down to that line, and again, this is the bundled code, so it looks a little bit different than it did in our source code. Oh, wait, I lost it there. There it goes. And so here we have return X plus 20, and so we can go ahead and add a comment to this line, and we can say returning X plus 20, off by 20 error. The nice thing too with replays is that, once I've identified where the bug is, it can let you mark it as resolved, and then that shows as a resolved bug. So once you have solved an issue, identified the reason for a failed test, you can mark it as resolved, and then that will be tracked in your Replay library. All right, great, yep. So you can see here that we're starting to see the failures: snapshot comparison fails, and we're expecting it to match the snapshot. They're obviously not matching because previously we ran it before we made that error. Now it's running with the off by 20 error, so the snapshots do not match. We are gonna be able to see the diff report from the run. So if I go to the previous one here, we have our visual snapshots diff produced during runtime. Go ahead and open that up. We can see, for example, expected, actual and diff. It generates three files; this is how the visual testing works. So we expect it to look like this, and it actually looks like this.
So see, it's off by 20 and it generates what's called the diff, which shows the difference between the two. So this section and this section are off and then the text is a little bit messed up there. So that's what visual testing does as well, it highlights those visual differences and then it matches, it says, does this match or not? And that's what's actually causing the failure. So we're not actually making an assertion on the location of it. We're doing that in the unit test, the unit test was off by 20, we were actually making that mathematical calculation. The end-to-end test, the functional test is failing because of the visual snapshot mismatch. So we're kind of catching it in two different places for two different reasons and depending on how you like to debug, one may be more helpful for you than the other, so.
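The two kinds of failure can be reproduced in miniature. The sketch below is a simplified stand-in, not the library's real `computePosition` or Playwright's image comparison: the `broken` flag, the one-row "image", and the 5-pixel marker width are all assumptions made for illustration.

```typescript
interface Coords {
  x: number;
  y: number;
}

// Stand-in for the positioning function; `broken` reproduces the off-by-20 change.
function computePosition(reference: Coords, broken: boolean): Coords {
  return { x: broken ? reference.x + 20 : reference.x, y: reference.y };
}

// Render a one-row "screenshot": 1 where the 5-pixel-wide element is drawn, else 0.
function renderRow(x: number, width: number): number[] {
  const row: number[] = [];
  for (let col = 0; col < width; col++) {
    row.push(col >= x && col < x + 5 ? 1 : 0);
  }
  return row;
}

// Count the pixels that differ between the baseline and the new render.
function diffPixels(expected: number[], actual: number[]): number {
  let count = 0;
  for (let i = 0; i < expected.length; i++) {
    if (expected[i] !== actual[i]) count++;
  }
  return count;
}

const reference: Coords = { x: 25, y: 10 };
const received = computePosition(reference, true);

// Unit-style check: asserts on the computed number itself.
console.log("expected x 25, received x " + received.x); // expected x 25, received x 45

// Visual-style check: compares rendered output against the baseline render.
const baseline = renderRow(computePosition(reference, false).x, 100);
const current = renderRow(received.x, 100);
console.log("differing pixels: " + diffPixels(baseline, current)); // differing pixels: 10
```

The unit check names the wrong value directly, while the visual check only reports that the pictures differ, which is why the unit failure is often the faster hint and the diff image is the better confirmation.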

Okay, awesome. So yeah, I said we wouldn't take up the whole three hours, so I'll give you back a little bit of time in your day, and hopefully this was helpful for you and you now understand the steps and the things that can go wrong in the CI/CD pipeline. We also explored some tooling that you can use to debug CI/CD pipelines, including GitHub Actions. We took a look at functional testing tools like Playwright and also Cypress. We took a look at visual testing tools like Percy, and then we also looked at Replay, which gives you a debuggable asset of build test recordings. So if there aren't any questions, thank you so much for your participation today, and I hope this was helpful for you. Please feel free to reach out to me on Twitter if you do have any questions. I am also in the Discord, the GitNation Discord, so feel free to message me there as well.
