Flaky Test Management with Cypress


This workshop is for Cypress users who want to step up their game against flake in their test suites. Leveraging the Cypress Real World App, we’ll cover the most common causes of flake, code through some examples of how to make tests more flake resistant, and review best practices for detecting and mitigating flake to increase confidence and reliability.

Table of contents:
- Cypress Real World App Overview
- What is Flake?
- Causes of Flake
- Managing Network-related Flake (Activity)
- Managing DOM-related Flake (Activity)
- Flake Detection and Mitigation Best Practices
- Q&A



Transcription


I'm Cecilia Martinez, and I'll be your workshop instructor today. I'm a technical account manager at Cypress. I've been at Cypress for almost two years now — January will be two years. I started when the company was pretty small; there were about 20 of us. I was the first person on the support and success team, and I've spent essentially the last two years working with our users: helping them overcome technical challenges, talking to them about their test strategy, understanding how they use Cypress and how they test at their organizations, and also developing resources and doing education around how to best leverage Cypress and implement it on a team. Through that time I've gathered some learnings, which I love to share with the wider community through things like this. So today I'll be talking about flaky test management with Cypress. This is a workshop, so there are some coding elements to it — we'll be doing two coding activities. There are about 26 people so far, so we'll end up having around that many. I like to keep things pretty casual. There will be time for questions throughout the workshop, but also feel free to drop them in the chat as they come up; I'll be keeping an eye on it. We'll have some dedicated Q&A time at the end as well, so if you prefer to turn your microphone on and talk through it instead of posting a question, we'll have time for that too. Some general housekeeping: the slide link is posted in the Discord, and it's also on the screen. These slides are public, so feel free to save them for later, use them at your convenience, and share them. We'll talk about the setup instructions in just a minute — they were included in the workshop description, so hopefully y'all were able to follow through with those, but if not, we'll have some time to troubleshoot initially as well. We'll also have some built-in time for breaks: when we do the activities, I'll say, feel free to take some time now and we'll come back at a set time, so if you need to grab some water or whatever it may be, there's built-in time for that — three hours is a long time without any kind of break. But if you do need to step away at any point, feel free to do so. If you have any questions on logistics before we get started, pop them in the chat or raise your hand in Zoom if you want to ask over voice. Awesome. All right, so let's go ahead and talk about those setup instructions. We'll be working today with the Cypress Real World App — I'll post this link in the chat for those of you who weren't able to get set up in advance. This is a full-stack React application built by our developer experience team to demonstrate best practices, use cases, and essentially all the things you can do with Cypress and how to best use it. It's a Venmo-style payment application where you can send money back and forth to your friends and like transactions. It's just a codebase — it doesn't live anywhere because it's not a real app — but it's meant to show different types of UIs and different user flows that you may need to test. And it comes with a full suite of UI and API tests. We'll be working with the UI tests today.
One of the things it also has is a specific branch called flake demo, and that's the branch we'll be working on today. This branch has some intentionally flaky behavior that we'll be working through and troubleshooting as part of this workshop. So to get started, if everyone can go to the flake demo branch — if you're familiar with GitHub, you'll essentially just come down here, grab the flake demo branch, and pull it down; you can also download the zip file if need be. Once you have that pulled down, run yarn install. It is preferred to use yarn — if you have to use npm, you may run into some technical issues, so yarn is preferred, and let me know if you do have any issues. So, yarn install, and then once the installation is complete, run yarn dev. This will start up the application on your system, and once it's running you'll see it on localhost:3000. Then, in a new terminal window, run yarn cypress open, and this will open the Cypress Test Runner. Raise your hand in Zoom once you're done, so I can get a sense of where everybody's at. The link is posted in the Discord — we'll be using Discord for chat. I posted the general GitHub repo, and I also posted that one specific branch, the flake demo branch. Hopefully you were able to get up and running and check that out before joining today, but if not, we'll walk through these steps now. I'm going to leave these instructions up for a few minutes. Again, if you could raise your hand in Zoom once you're up and running — there's an option in Zoom to raise your hand — that way I can get a sense of how many people are ready to go and how many are still setting up. If you have any questions, I can share my screen and walk you through the steps as well. I'll give it about five minutes for everyone to get up and running. I posted the slide link in the chat as well — I was getting a request for that. And I believe the workshop recordings will be made available later — correct? That was a question in Discord. Yeah, it may take a minute to install everything the first time. Also, if this is the first time you're doing this — if you weren't able to do it in advance — it will take some time to install Cypress the first time you run yarn cypress open. That's a good question. Okay. And if we're able to use Discord for the chat, that's ideal; it's a little easier to follow along with the messages and respond there. So again, the Discord link — I can post that in the Zoom chat as well. I'm seeing some activity in the Zoom chat; again, if we could try to post in Discord, that's a little easier to follow along with and respond to directly. If you're having issues running yarn, Node, or Cypress, you could have some security considerations on your machine — that's part of why it's preferable to do this in advance, so we can address those ahead of time, but we can take a look here. If you're running into issues with yarn dev, make sure that you're running yarn install first and that everything is completing properly. If you have any errors when running yarn install, review those errors and see what could be causing them.
Sometimes you may not be running the correct Node version, or there could be some kind of dependency that's missing. If we go to the Getting Started / Installing Cypress page in the docs, it shows the system requirements for using Cypress, including what version of Node you need to have — I'll post this in the chat. Additionally, there's some information there if you have certain security considerations. What is it called? Not VPN — proxy. If you're using a work computer, for example, or you're on a work network, you may have issues with the proxy configuration. The Real World App, I think, prefers Node version 14 or higher. The installation instructions are also on the Real World App repository, so follow that link as well if you have any troubleshooting issues getting started. I see some more hands raised — awesome, people are getting set up. Thanks to everyone who was able to get set up in advance; it helps us move things along. I'm going to give it a little more time — I want to get to where at least half the people here have it up and running. Okay, so there's a question about using npm. You would do npm install, and then I think it's npm run dev — let me take a look at the package.json under the scripts. These are still leveraging yarn, so essentially you'd have to take the same start commands that are built in here and run them with npm. It's really preferred to use yarn, but you can take a look at these commands and break them down: we need to start the API — it uses concurrently to run ts-node against the backend files — and you also need to start the React app. So all of these commands you'd need to run concurrently: set the environment, then concurrently run the npm versions of the start React and start API watch commands if you wanted to use npm. Thanks, James, for helping out in the chat — I appreciate that. And again, if you want to use yarn, you need to install it globally in order to be able to use it in the project. Thanks for sharing in the Discord what versions you're using, Claire — that's super helpful. I'll reproduce some of these links from the Zoom chat in the Discord as well. While we're going through that, I'm going to go ahead and spin mine up as well, to show you what you can expect. It does take a minute to get started. Once it loads up, you'll see the Cypress Real World App, and it'll look like an onboarding or welcome flow. You can go ahead and close that out — we don't actually need to have it open — but then you'll know that it's running. Then, in a new terminal, run yarn cypress open. If it's the first time you're doing it, it'll have to install Cypress, but then it'll open the Test Runner, which looks like this. You'll be able to see all of the spec files; they're pulled up automatically in the Test Runner. I pulled it down recently, so it's version 9.0 — there was a new release this morning, I think 9.1, but 9.0 is good.
This will also pre-fill with whatever browsers are available on your machine. It comes pre-bundled with Electron; I also have Chrome 96 on my machine, which is why it's showing here. I'm going to be running the tests in Chrome, but feel free to use a different browser or the bundled Electron — it just may look a little different because it's a different browser. All right, I'm seeing seven hands raised. Has anybody else been able to get up and running? If so, raise your hand in Zoom or let me know in the chat — I think we actually have eight, because I believe Claire said their hand is raised as well. And if you're planning on just watching today and don't plan on getting set up, you could also raise your hand just to let me know, that way I'm not waiting for you to proceed. So again, if you're just here to watch and this doesn't apply to you, raise your hand too so I know you're ready to move on. Let me come back to the instructions. All right, we're at about half, so I'll go ahead and proceed. We'll have some time as we move along — taking breaks, being available for questions — so we can continue to troubleshoot. This is being recorded, and all the slides are available to you, so if you want to come back later on and run through it, that's totally fine as well. Let me go ahead and lower some hands here — feel free to lower your own as well if you're ready to move forward. Enough housekeeping. Awesome. Okay, great. So for those of you who do have the code repo pulled down, there's one more thing we need to do. There's a line that can cause some unexpected behavior in this branch, and I don't want it to interfere with anything we're doing today. It's in the cypress/support/commands.ts file: we're going to delete or comment out line 204. So again, in cypress/support/commands.ts, on line 204, you can either put two slashes in front of it to comment it out or delete it, and then save the file. I'm going to give you all a minute to do that — let me know if you have any questions. I found that today when I went to pull down the most recent version. And remember, we're working on the flake demo branch — that's the flake demo link that was on the slides. We do have to be on the flake demo branch, because it's the branch that has the specifically built-in flaky examples. If you already pulled down the main branch, you can just check out the flake demo branch — you don't have to pull it down again; just check it out and use that branch instead. Awesome. All right, so I think we're good to move forward. If anybody has any questions, let me know in the chat. And again, you can follow along with the slides — they're linked in the Discord and in the Zoom chat — so if you need to refer back to the setup instructions page, if you joined later or just didn't get a chance, you can go back and reference it. Let's go ahead and move on. All right. The first section is going to be me talking about flake.
So you'll be able to continue troubleshooting and downloading things as we move forward, but let's go ahead and start talking about what flake is. I think the reason y'all are here today is that you're probably very familiar with flake, and it's probably a big pain point for you. But I want to define the technical definition of flake and talk about what the impacts can be. A test is considered flaky when it can pass and fail across multiple retry attempts without any code changes. So if a test executes and fails, but then you rerun it with no other changes — no changes to the environment, no changes to the test code, no changes to the source code — and it passes, that is what we define as a flaky test. And there's no need to log into the application — once it pulls up on your screen, you can just close it out; I just want to make sure it's actually running. I just saw that question in the chat. So again, today we're working with the Cypress Real World App, the full-stack demonstration app built by our DX team. We're working specifically with the flake demo branch because it has some built-in flaky test cases that we can leverage to understand how flake works and how to mitigate it in our test suite. One of the test cases we'll work with today is in the notifications spec, so I can show you what this looks like running. In our test code we have a notifications spec, and we're going to run one specific test — I've put it.only on it. You can see the comment that this test is flaky since the likes API has an inconsistent response time. So we'll run that test and see what happens. Under our UI tests, it's going to be the notifications spec, and because I've put .only after the it block, it's only going to run that one single test, not the entire spec file. Let's let it get up and running — it's the first time I'm running it today, so it might be a little slow to start. It's going to go through this test case. Okay, cool. It passed the first time, which is great — let me double-check and make sure I didn't accidentally fix anything. Okay, get notifications looks pretty good. So we're going to run it a couple of times until we can see the flake in action. Okay. What's happening now is Cypress is retrying the test automatically — Cypress has built-in retry functionality, which we'll talk more about later. We had two attempts here: the first attempt failed, and the second attempt passed. So with no other changes, the test was flaky. What happened on the first attempt is that it expected this button to be disabled and it did not find the element in time. The second time through, it expected the button to be disabled, had no problem, found it, and it was disabled. This is what we mean when we say a test is flaky. The reason this test is flaky is that in our backend source code — again, this is a branch that specifically has flaky test cases built in — there's an arbitrary delay on the server to simulate inconsistent response times.
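To make that concrete, here's a minimal sketch of the kind of artificial delay the flake demo branch adds on the backend — the route path, port, and timing here are assumptions for illustration, not the actual Real World App source:

```ts
import express from "express";

const app = express();

// Hypothetical illustration only: respond after a random delay of up to
// five seconds, so the response sometimes arrives past Cypress's 4s default.
app.post("/likes/:transactionId", (req, res) => {
  const delay = Math.random() * 5000; // 0–5000 ms
  setTimeout(() => {
    res.sendStatus(200); // the like is still created, just slowly
  }, delay);
});

app.listen(3001);
```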
So you've probably run into this in the real world, right, where your API can take a long time sometimes and other times it's fast. Maybe it's a slow environment, maybe it's just chugging that day. What we've done here is essentially add a random setTimeout before sending the response back, to simulate those inconsistent response times. As you saw, Cypress waited up to 4,000 milliseconds — 4 seconds — for the element to reach the expected state before it actually failed the test. What that means is that sometimes this API response takes more than 4 seconds and sometimes it doesn't. When it takes less than 4 seconds, the test passes; when it takes more, it fails. That's what's causing our flake in this specific situation. So this is the test code, and this is the behavior we're zeroing in on: we're asserting that the like count selector has a count of zero, then we're getting the like button and clicking it. That click sends off an API request to our backend saying, hey, we liked this transaction. Then our test code looks for the like button and asserts that it should be disabled, because when we fire off that request to the backend, the UI also disables the button so we can't re-like it. What's happening in this case is that the assertion fails because the API response takes too long to complete before our next command fires off. So this is maybe a use case you've seen a lot, where there's inconsistency in your DOM related to an inconsistent network. Like we saw, sometimes it can pass, sometimes it can fail. So when you have flaky tests like this, how does it impact your test suite? It can cause a longer deployment process, because you have to rerun tests or restart a CI build when there's a failure. With Cypress we have automatic test retries — it retried the test and it passed the second time. That's great; it's better than having to go in and restart your entire build process or kick off the test run again. It does still take some time, though, to run the test a second time. It can cause a longer debugging process, because you have to determine whether a failed test is a real failure or flake. In this case, the functionality of the UI was working: we fired off the request, eventually the like button was disabled, the like showed up. But because of the slow network request, if that test were to fail, we don't actually know whether the failure is related to the functionality of the test case or to something in the environment or the backend. So we have to debug not only what the failure reason is, but also whether it's a real failure or flake. That ultimately results in reduced confidence: do failures actually represent regressions, or is it something due to the way the test is written or to our test environment? Another part of it is flake hiding underlying issues in the application or test suite. In this instance, obviously, this is an engineered situation. But if you have certain areas of your application — say an API endpoint that's consistently slow — it could turn out that the way that endpoint was coded, it's just taking longer to respond because it's processing things in a way that isn't quite as performant.
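For reference, the flaky assertion we were just looking at has roughly this shape — a hedged sketch where the data-test selectors are assumptions modeled on the Real World App's conventions, not copied from the spec:

```ts
it("likes a transaction and disables the like button", () => {
  // Selectors here are assumptions for illustration only
  cy.get("[data-test*=like-count]").should("contain", 0);
  cy.get("[data-test*=like-button]").click(); // fires the POST to the likes endpoint

  // Flaky: this only passes when the API responds (and the UI re-renders)
  // within the default 4-second command timeout
  cy.get("[data-test*=like-button]").should("be.disabled");
});
```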
So when you're seeing flake, if you're consistently just rerunning the test or restarting the build, and you're not really keeping track of that flake across your suite, then you may not notice those patterns and be able to make better decisions about how to address flake across your test suite. All right, any questions so far about what flake is, or anything else we've covered? Feel free to drop them in the Discord. I'm also super interested — if you want to drop it in the Discord — whether you've had similar experiences with flake, if you've seen anything like what we just saw, where a slow network request causes the DOM to take a long time to update. Maybe just a little plus-one in the chat, or let me know what experiences you've had; I'm curious what people here think as well. Yeah, some good questions: Cypress will note a retry — it'll note in the output that the test was retried, and it'll also note it in the Cypress Dashboard if you're using it. And we're actually going to talk about causes of flake right now, James — network is definitely one of them, but there are a couple of others we'll talk through. Good questions. In your output it'll say that the test was retried, so you'll see attempt one, attempt two, attempt three. And just to confirm, this is all configurable, which we'll talk about a little later on. And if you do switch branches, you may need to rerun yarn install — there could be different modules in different branches. All right, so let's talk about the causes of flake. At a really high level, flake is caused when either the functionality you're testing is inconsistent — this is actually helpful flake: if the way the application is coded has logical inconsistencies, for example, that can cause the test to pass once and then fail a second time, that's good flake because it points to issues in your underlying application — or when something completely unrelated to the functionality you're testing is inconsistent. That's what we just saw in our example, and in our case that's something we want to try and mitigate as much as possible. So again, if you're seeing flake and you determine that it's because the functionality is inconsistent, that's good flake because it's pointing to a defect. But if it's something unrelated to the functionality you're testing, that's bad flake, because it reduces confidence in your test suite and makes it harder to tell whether that test case is failing because of the purpose and functionality of the test case itself. So there are a couple of different types of flake; I've put them into categories. The first is DOM-related flake. This occurs when there's inconsistency with how elements are rendered to the DOM, or how quickly they're rendered. Some of the errors you may see associated with this type of flake are: timed out retrying because the element is disabled, timed out retrying because the element was never found, or — this is a big one we see a lot — timed out retrying because the element is detached from the DOM. Now, the root cause of DOM-related flake could be related to the network, but it's typically related more to how your DOM handles network and data updates.
So: if it's not rendering as quickly as it should, if it refreshes when new data comes in even after you've already done an input, if it's not handling state changes properly and is disabling elements when it shouldn't — these are all inconsistencies in the DOM that can cause flake. Some examples I just talked through: sometimes an element will load within that cy.get timeout, other times it won't. It may get an element, but when it tries to click it, the element is disabled because the state hasn't actually updated yet. We see this a lot because Cypress is faster than a user would be. A button may become enabled once all the form elements are filled out, but the application hasn't caught up yet — the state is still pending and hasn't registered all those form inputs — so the button is still disabled, and when Cypress goes to get it and tries to click it, it's still disabled. The DOM can also re-render between the cy.get and an action command, causing a selected element to be detached from the DOM. One of the activities we're going to do later relates to this exact use case, so I'm going to dive pretty deep into those detached-from-DOM issues. And there's the one I was just describing, where Cypress types into an input field, but the application is slow to process those key press events and the field value doesn't update completely before clicking submit. We've seen that a lot, especially if you have state changes on each key press — for example, a model binding that connects the input value to something in your state — and Cypress types really fast and the update hasn't completed yet. So when it goes to submit the password, for example, the last two characters are missing, and that's what causes the failure. Definitely have seen that one before. Then we have network-related flake, which is what we were just talking about: when a network request responds inconsistently. It could be an internal API, a third party, or a serverless endpoint, and it can be related to either incorrect responses or slow responses. In the first example, it didn't have the correct number of elements — the response we got back from the API wasn't right; it didn't have the correct data. In the second example, it timed out, because the request got aborted or just didn't complete within that time frame. Slow API response time, like in our example, happens a lot with third-party providers. One of the things we recommend, if you're using something like a third-party authentication provider, is to stub out that behavior so that testing your application isn't dependent on it. In our documentation we have a testing strategies section that covers how to programmatically authenticate with Auth0, Amazon Cognito, Okta, Google authentication, and GraphQL. These are some of the most common authentication providers, and there are patterns for how to bypass them in your tests so that you aren't dependent on them, because they can definitely cause some flake.
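Circling back to the fast-typing example for a second: one hedged way to guard against the field value lagging behind the test is to assert on the value before submitting — the selectors here are hypothetical:

```ts
const password = "s3cret";

cy.get("[data-test=signin-password]")
  .type(password)
  // keeps re-asserting (and retrying) until the full value has been applied,
  // so a slow state update can't leave the last characters behind
  .should("have.value", password);

cy.get("[data-test=signin-submit]").click();
```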
And then microservices endpoints — this is something I see a lot too. If you're using a microservices endpoint or a Lambda or something that has a cold start, or that can take some time to start up the first time, it may fail the first time, but when you retry it, it's warmed up and ready to go. One thing I've seen leveraged here is firing off a cy.request to that endpoint in advance — or even better, as part of your CI build process before the test run even starts — sending off the request, validating the endpoint, and ensuring that it's up and running before you even fire off your test run. But that's another thing I've seen cause network-related flake as well. And last, but definitely not least, is environment-related flake. These are things related to the test environment — the environment you're actually running your tests against. Sometimes you have inaccessible or inaccurate environment variables: maybe you have something different locally versus in CI. You can be running tests across different-size machines — sometimes the resources differ across machines, so a test may run a lot faster on one machine and a lot slower on another. You can have a failed dependency install in the environment, so when you actually build your application, something fails to install, or the machine doesn't have all the dependencies it needs to run properly. And — this is a really big one — inconsistent data across your environments. If your tests depend on having certain values in your test data, like hard-coded values, or if you're making assertions against the actual content on the page, and that data changes, that can cause flake and failures that have nothing to do with the test itself. Okay, awesome. So those are some of the causes of flake. Before we go into managing flake, I've seen some questions in the chat, so I want to address those real quick. On the situations where Cypress does or doesn't attempt a retry: retries are configurable. We'll talk about that a little later, but you can set how many times you want it to retry in open mode versus run mode. And then we'll also talk about chaining selectors and assertions, and commands and assertions — we'll cover the chain of command and how to make the entire chain retryable. And John, if you're seeing that error, try just restarting your server — that was happening to me earlier; it happens to me sometimes. Just shut down the yarn dev that's running and restart it; that was enough to resolve it for me. Okay, and then: do you have a code example of getting a microservice endpoint to run before the test? I don't know if I have a specific example on hand — we'd have to find one, but I've noticed some out there. But essentially, you just send a cy.request to the endpoint. Actually, I can show you what that looks like with our API tests. In the Cypress Real World App we also have a suite of API tests, and we actually run those first — for example, in our CircleCI config we have our API tests running first, and it's going to run these specifically.
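A hedged sketch of that kind of warm-up check — the endpoint URL here is hypothetical, and where you put it (support file, before hook, CI step) will depend on your setup:

```ts
// Hypothetical cold-start warm-up: hit the slow endpoint once, before any
// UI test depends on it, and make sure it actually responds.
before(() => {
  cy.request("GET", "/api/slow-service/health").then((response) => {
    expect(response.status).to.eq(200);
  });
});
```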
But the syntax we're leveraging for the API tests is similar to what you would do, maybe in your support file or something you kick off before the tests actually start. So let's say we have a slow likes API and we just want to make sure it's up: you can do a cy.request to that endpoint, take the response, and assert that you expect the status to equal 200. That's going to fire off a real request to that endpoint. So again, if you have a cold start, or something like a Lambda or microservice, you can fire off a cy.request and validate that the response is good. If it's not, there are a couple of ways you can repeat or retry that, but essentially you would not want to proceed with the rest of your tests until you know that endpoint is up and running. Again, if you don't actually need it for your test, you can also stub it out and bypass it completely, which is what I would recommend — we'll talk about that in a minute. But this would be the model for firing that off: a cy.request. If you're going to do it in your CI, that's going to depend on what CI provider you use — some of them have a run-script option where you can run a fetch or a curl call, or install a package to do that — but that's going to be dependent on your CI setup. Awesome. All right, so I'm going to go ahead and talk about managing flake. It's kind of a long section — is everybody okay to keep going, or do we need a break? If anybody has a strong opinion... All right, let's keep going, and we'll take a break when we do the first activity. So, like I said, the first thing we talked about was network-related flake; that was the example we looked at. In order to mitigate network-related flake, here are some test-writing best practices. (And yes, we're going to take a break right after this section once we start the activity, so maybe in 10 to 15 minutes. If you need to step away at any time, please feel free to do so.) Okay, so as a test-writing best practice: cy.intercept can spy on and stub all network requests. I just did a talk on network requests for TestJS Summit last week — not sure if you saw it, but the slides are available, and you'll see some of this on there if you want to dive deeper. We'll be using it today as well. cy.intercept is a really powerful command that lets you spy on and stub all the network requests going in and out of your application. So how can we leverage this to battle network-related flake? We can wait for long requests to complete before proceeding. We can say, hey, we have this API accounts endpoint that's just slow — we have a lot of accounts, it takes a while to get them, it's a huge data dump. So what we can do is say it's going to take longer than the typical — I think it's six or eight seconds — that we give for a network request. Let's go ahead and wait for that request, but give it more time, up to 30 seconds, because we know it takes a long time. What that will do is give the request more time to complete.
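As a rough sketch of that pattern — the endpoint, route, and alias name here are assumptions, not the actual app — it looks something like this:

```ts
// Spy on the known-slow request before anything triggers it
cy.intercept("GET", "/api/accounts").as("getAccounts");

cy.visit("/accounts");

// Give the wait a bigger budget than the default, but the test moves on
// as soon as the response actually lands
cy.wait("@getAccounts", { timeout: 30000 });
```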
It will also ensure that the request has completed before we proceed with the test. That way we know the account name will be on the page — or it should be; at least that's what's expected of the DOM behavior we're testing. We'll dive more into that in just a second, don't worry. We can also stub inconsistent or unneeded network requests. This is what I was just talking about: if you don't need it for your test, if you're not testing its functionality, then you can stub it out so it doesn't affect the behavior of your test. You can use cy.intercept for stubbing as well — you can either use it to watch a real request, or use it to intercept and stub, which essentially means passing a fake result back for that API request. So this is how we do it: declaring a spy with cy.intercept. You can pass a URL, a method, or a route matcher. If no method is passed, any method type will match. If you watched my talk last week on network requests, this is from that, so I apologize if it's repeated for you. But essentially, these are the options you can use to declare an intercept; we'll be using this a little later on today, and you can also refer back to the slides. I want to point out the second one here, where you pass through a method and then a URL — so you can say, please watch this endpoint, and we're going to be looking for a certain method type, which could be GET, POST, PATCH, or DELETE. Once you've declared that intercept, you can save it to use throughout your test code by using .as. In this case, we have our cy.intercept, and we're saying: any request that goes to the search endpoint with a certain query, please save that as search. Once we've saved it, we can wait for the request with cy.wait. So here again, we have our intercept declared — any GET request to our users endpoint, please save that as getUsers — and then within our test code we say cy.wait for getUsers. What that's going to do is wait for that request to complete before proceeding with the rest of your test code. So again, if you have a network request that takes a long time, or one that's critical to complete in order to have the data in your DOM, you can leverage the cy.intercept / .as / cy.wait pattern to ensure it completes before proceeding. And if you need even more time, you can pass a timeout option to make the wait longer. Going back to our framing from the beginning: flake caused by something unrelated to the functionality you're testing being inconsistent is bad flake. It's not helpful; it gives you no insight into what's happening in your application. So if a problematic network request is unrelated to the functionality you're testing, stub it out. If a certain network request is causing you a lot of pain, and the only reason you're making it is to get to a certain page or because you need to be logged in to test your application, then don't use it — just stub it out. So what we have in the Cypress Real World App, going back to the code — okay, I have the real code here — is, in our custom commands, three options for logging in. We have login via the UI, which is a login command where we're actually typing in the username, typing in the password, and checking the box.
We're logging in via the UI because sometimes we do want to test that — we should have a test that covers our login flow, makes sure the form validates properly, et cetera. But then we also have a login-by-API command, where we're just making a POST request to our API endpoint with the username and password, bypassing the UI. That's a lot faster — we're saying, hey, let's not type everything in, let's forget the UI, just send a POST request to our backend and tell the app that we're logged in. We also have login via XState — and a logout via XState here as well. XState is the front-end store we're using for this application, and when we do this, we're just going into our store and telling it that we're logged in. What that's doing — it's window.authService.send — is just telling our front-end store, hey, we're logged in with this user. We didn't touch the backend, we didn't log in via the UI. So if all we need to do is log in to get to a certain screen so that we can test our date picker, and that's all we care about in that test case, then reduce as many of the other variables as you can and focus on that test case and that functionality specifically. All right. We're going to do an activity now; it's also going to double as break time. We're going to be working with the UI notifications spec file — the test case we were just looking at. Again, this is going to be under cypress, then tests, then ui, then notifications.spec.ts, and it's on line 30. You'll see it has the word flake in it — searching for flake in the file is another option too. This is the flaky test we've talked about and identified. Using these instructions, we're going to fix it. Run the test in your Cypress Test Runner by selecting the notifications spec. I would recommend adding .only after the it block so you don't have to wait for all the tests to run — you can add .only on line 30. Then select the notifications spec and run the test in order to note the command that fails. You may have to rerun it a couple of times because it's flaky — it's not failing every time — but run it until you do get a failure, and note the command that fails.
All right, so we got a failure the first time there. Go ahead and take a look at the error — the Cypress Test Runner gives you pretty helpful errors. It'll tell you what line of code is failing, which is helpful, and it'll tell you what the error is: it's on line 53, the button should be disabled. Then refactor the test so it's resistant to flake, using one of the following strategies — you can try both and test them out. Option one is to increase the default timeout on the flaky command. I recommend using the docs for this if you need to; there's a timeouts guide in our docs that talks about how you can increase the time to retry. I'll post that in the chat, and in the Zoom chat too. The second option is to use the model we just talked through: identifying the slow POST request and then waiting for it to occur using cy.intercept, .as, and cy.wait. And if you're curious and want to see the requests, there are a couple of ways to do that. You can see them in the command log, and if you click on them, the information is output to the console. You can also have the network tab open while you're running the test, and it'll show all the network requests taking place. Or, if you're familiar and comfortable with looking at the source code, the code that's causing the flake is in backend/like-routes as well. Okay, awesome. Feel free to lower your hands now — thanks so much to all of you who put them up. It looks like we have almost half, so I can go ahead and talk through some solutions. I did see some people posting solutions in the Discord that looked good, so that's exciting, but we'll talk through the options now. Let me get my code pulled back up in the window. All right, so this is our test code; when we run it, it's flaky. Not great. The first option we talked about was to increase the default timeout on the flaky command. We saw that the failing command was the cy.get on the data-test like button — it's on line 53; we have that really helpful error message from Cypress. So if we come here to line 53, how can we increase the default timeout? Timeouts can be configured at the block level — you can do it for a describe or an it block, or a context, if you use context as a describe. They can be set at the command level, passed as an option to any individual Cypress command. You can also increase the default timeout globally in your cypress.json — if you just have a really slow app and want to give everything a little extra time, you can do it there too. So why would you want to increase a timeout and not just add an arbitrary cy.wait? Say, for example, we increase the timeout on this like button to 10 seconds — just give it a bunch of time, who knows how slow it's going to be. If instead we did cy.wait for 10 seconds, it's going to wait 10 seconds every single time, no matter what, so your test is now 10 seconds slower every single time. Whereas if we increase the default timeout, it will only take 10 seconds if it needs 10 seconds. If it needs four seconds, great; if it needs six seconds, great — it depends on the time it actually needs. You're increasing the maximum allowable time versus saying, please wait this much time every single time. So we go ahead and save that — and again, we're passing this as an option to the command itself; you just add a comma and pass the timeout option with the milliseconds. We save it, Cypress automatically detects the change and reruns the test, and we take a look at what that looks like. One thing you'll notice is that the loading bar is bigger now — it gave us 10 seconds instead of four, so as the bar counts down, it's going down slower than before. So there we go.
Voila — it passed the first time. So that's one way to do it. Now, it's not necessarily the most elegant way, for a couple of reasons. Number one: what if one day it takes 11 seconds? What if one day it's 12? Hopefully not, but you'd have to manually update these timeouts. It's also something you have to maintain at the command level — what if this network request happens a couple of times in a test and we want to wait for it multiple times? So this is kind of the quick-fix method. The other option is to use the cy.intercept, .as, and cy.wait pattern to intercept the POST request and wait for it to occur. So what does that look like? We want to declare our intercepts as early in the test code as possible. We actually have some existing intercepts here, if you scrolled up and noticed, in our beforeEach, because those are ones we leverage heavily throughout all the tests in this spec file. So you could do it in your beforeEach hook; just to keep it easy, we'll keep working within this one it block. But if you see cy.wait calls thrown throughout the test code, those are defined up in the beforeEach. We'll do it here, but you do want to declare it as early as possible, because it has to be declared before the network request happens. I've seen people throw them in the middle of their test code — declaring it here and waiting for it there — and it doesn't get registered fast enough. So we'll declare our cy.intercept. This is a POST request going to the likes endpoint, and it has a transaction ID, which we can just wildcard. Then we save it with .as. I like to follow the naming pattern of postLikes — the type of request and then the endpoint. Sometimes you'll see getUsers, getLikes; I think it's descriptive, and it also lets you know what pattern you're following. So now we can use this postLikes in our test code. This is the action that triggers the API call, and this is the command that's flaky because it needs that call to have finished. So here's where we want to put the wait in our test code. We use the at symbol — that's how we reference anything saved with .as — and put in our postLikes. And again, if you're using VS Code and you have IntelliSense — with the Real World App it's already set up — it shows you the syntax and gives you an example when you hover over it, so you can see the pattern we're using: cy.intercept with .as, and then cy.wait. In that hover example we're actually making assertions on the request, but in our case we're just waiting for it to complete, so that's all we need. Now if we save that, again, Cypress reruns the test. Let me just hop over to the Discord for a second. And — no longer a flaky test. If we look here, the Cypress Test Runner actually keeps track of all of our intercepts. We have here the POST request that occurred; if we hover over it, we can see that it matched the cy.intercept spy with an alias of postLikes. We can also see that we waited for it and then found the alias. So we're able to track this along the command log as well.
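To recap the two approaches side by side, here's a hedged sketch — the selector and endpoint shapes are assumptions based on the walkthrough, not the exact spec code:

```ts
// Option 1: the quick fix — raise the timeout on just the flaky query
// (the option is passed to the command itself, not to the assertion).
cy.get("[data-test*=like-button]", { timeout: 10000 }).should("be.disabled");

// Option 2: declare the intercept early, then wait for the slow POST to finish.
cy.intercept("POST", "/likes/*").as("postLikes");

cy.get("[data-test*=like-button]").click(); // triggers the POST
cy.wait("@postLikes");                      // resolves as soon as the response arrives
cy.get("[data-test*=like-button]").should("be.disabled");
```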
So there's a question about instructing Cypress to run a test a certain number of times, so you can check that your fix isn't just a lucky streak. We'll be talking about this a little later. There's a way to do it yourself using what's called the Module API. If I go back to our docs: the Cypress Module API essentially allows you to orchestrate test runs however you want — it's like using a Node module — so you could essentially say run the test, run the test, run the test. But the feature you're talking about is something we're actually going to be adding; it's under development consideration, on our roadmap for the Cypress Dashboard. It's something we call test burn-in, where the first time you add a new test, it runs it a bunch of times to ensure it's a quality test and not flaky. We'll also talk a little more about how you can leverage retries to ensure it's a good test. But essentially, that's something we've gotten feedback on and it's a really good idea; right now you have to orchestrate it yourself. Good questions. Awesome. So did anybody have any other questions about this activity? Do you all feel more comfortable now with the cy.intercept / .as and then cy.wait method? Feel free to throw any questions in the Discord chat, but hopefully you found that helpful. Awesome, seeing some thumbs up. Nice. All right, so let's talk about DOM-related flake. Cypress has a couple of built-in functionalities to help address flake at the DOM level, so I'm going to walk through those first — these are things the Cypress Test Runner already does under the hood. One of them is query command retry-ability. What Cypress will do is retry a command, allowing up to four seconds by default — again, as we just saw, you can extend that by increasing the default timeout — before failing the test. It gives it some time and says, hey, just keep retrying, we may have some DOM slowness here, we don't want to fail right away. So let's look at this example: we have an it block that adds two items to a to-do list. We get the new-todo input, type the first item, type the second one, and then make an assertion that the to-do list should have a length of two — a pretty simplified example. If we look at what this is: cy.get is a command — it's a selector, but it's also a command, it's doing something — and .should is an assertion. If the assertion that follows the cy.get command passes, the command finishes successfully. But Cypress will also do this automatically for you: if the assertion that follows the cy.get command fails, the cy.get command will re-query the DOM again, until the command timeout is reached. So if we get the to-do list items, assert that they should have a length of two, and it comes back with one, we won't just retry the should — we'll go back and re-query the DOM with the get. It's really important to understand how that chain of command works, because you can leverage it to fight DOM flake by ensuring the chain is retryable. So right now, in the example, the list still has zero items, but Cypress is re-getting the DOM element over and over again until the assertion finally passes — it's actually re-querying the DOM with that .get each time. You want to leverage this pattern as much as you can, so that Cypress is re-querying the DOM whenever the assertions fail.
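Here's roughly that to-do example in code form — the selectors are generic placeholders:

```ts
it("adds two todos", () => {
  cy.get(".new-todo")
    .type("buy milk{enter}")
    .type("walk the dog{enter}");

  // cy.get keeps re-querying the DOM until the chained assertion passes
  // (or the command timeout is reached), so a slow re-render is tolerated
  cy.get(".todo-list li").should("have.length", 2);
});
```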
Otherwise, you're just going to be making an assertion against the same element over and over, and it's not going to actually be updating. So what could cause that to happen? Something to keep in mind: only the last query is retried. If you have a cy.get and then chain a .contains after it — or I've also seen cy.get and then .find, or .children, or .parent, or .its — only the last query command will be retried. So if you have cy.get, .contains, .should, the cy.get will not be retried; that initial selector will always stay the same. The .contains will retry — it will keep looking for, and re-query, whatever it finds with that text — but the entire chain will not. If instead you use cy.contains and pass both the selector and the text, then it's going to re-run that entire cy.contains. So whenever you have chained selectors — I see this sometimes with cy.get and then .children — it's going to re-query the children, but it may not re-query that initial get. So you can either use a single query command if possible — something like cy.contains where you're passing both a selector and text — or you can use a really specific selector that zeroes in on exactly what you need instead of traversing the DOM. Or you can alternate commands and assertions to make sure you have the right version of the element before proceeding and chaining off of it. So if you absolutely have to get a list and then get the child, and you can't just zero in on the child element, then what you can do — and this is the example here — is cy.get the selector, let's say it's a list, and assert it should have a length of three before we go on to the child (or up to the parent; child is probably the better example because it's more common). Say, for example, we did cy.get on the list and we didn't have that should-have-length-of-three: we might be grabbing a list where only two of the elements have rendered so far, and again, that get is never going to re-query in this chain. So we want to make sure we have the right version of that element before we proceed. In this case we have cy.get selector, should have length of three — now that we know we have the correct version of that list, it's got three children, or whatever the assertion may be (if we need to assert that it has a certain state or shows certain content), then we can go ahead and proceed. Otherwise, if we only have assertions at the end, it's only going to re-query that last command, and maybe we never actually had the correct element to start with. So alternate commands and assertions, especially around areas of your application where you have to be very specific. I see this a lot with select dropdowns or forms, or places where you have really complex components with lots of child components instead of a single parent, maybe dynamically rendered with lots of different data, so it's a little hard to get into. You'll want to make sure you have the right version of each selector before you proceed. Yeah, so good question, Blaine.
So exactly — if you have a cy.get and then a .click and then an assertion, and the assertion fails: what's happening there is that you're chaining an assertion off of .click. Typically you'd chain an assertion off a query, but you can do that. So what will happen — will it retry the .click? Let's find out; that's a good question. All of our API methods have documentation about this. So yes, okay: .click will automatically retry until all chained assertions have passed, and there's also information about the subject management. But I don't believe it'll retry the get — cy.click will automatically wait for the element to reach an actionable state, it requires being chained off a command that yields an element, and it will automatically retry until the chained assertions pass. But it's going to retry the click, not the get. So you want to ensure that you have the correct element before proceeding with the click. Awesome. Okay, so I hope that's helpful. Another thing I wanted to share here — and this is in the slides — is that we had a webinar, wow, last year already, about using code smells to fix flaky tests in Cypress. In that webinar, Josh went through three different examples of detached-from-DOM errors and how they related back to the source code. I also wanted to share this blog post that Amitav wrote about, again, specifically detached-from-DOM errors and what they can mean in your underlying source code. Detached-from-DOM errors are typically caused whenever there's a DOM refresh, or a state change in your application that causes a re-render, after you use the get and before you do the action. We do have a couple of open GitHub issues with requests and ideas for how to maybe address this, and we've done some additional error handling and better error messaging. But if you think about it, that's essentially what's happening: you're getting the element, then your front-end application — the DOM — re-renders, and the element you had previously isn't there anymore. You're trying to click on an element that no longer exists in the DOM; it's been detached. So what we typically need to do is ensure that the element we have is going to be consistent before we act on it. We're going to take a look at an example in the next activity. And one more thing: we get a lot of feedback about this behavior, and we're always open to feedback via GitHub issues. The Cypress Test Runner is open source — we use GitHub — so if you have any requests about behavior, or questions, you can always use that repository to submit feature requests. I think we may have some open ones as well, so you can go in, plus-one them, or add comments. And then another question: is writing long chains without any assertions in between bad in general? Not necessarily — Claire, that's a really good question. It depends, right? That's always the answer. But it depends on what you're doing with those chains. One thing to consider is whether you're doing a lot of actions versus selectors. If you have repeated selectors in a chain, I would recommend putting assertions in between them, because again, otherwise it's only ever going to re-query the last one.
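To make that concrete, a hedged sketch of the difference — the selectors and text here are hypothetical:

```ts
// Only the last query in the chain is retried: if the list is still rendering,
// .children() can grab an incomplete set and the initial get never re-runs.
cy.get("[data-test=transaction-list]").children().first().click();

// Safer: assert on the query before chaining further, so Cypress keeps
// re-querying until the DOM is actually in the expected state.
cy.get("[data-test=transaction-list] li")
  .should("have.length", 3) // retries the get until all three items exist
  .first()
  .click();

// Or collapse selector + text into a single query that re-runs as a unit.
cy.contains("[data-test=transaction-item]", "Payment to Kaylin").click();
```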
But if you have a lot of actions chained off of each other, because you just have to do that a few times, it's probably not a best practice, but it's not necessarily a bad practice either if your application can handle it. Now, if it's causing a lot of flake or inconsistency, or it's causing performance issues, then maybe don't. It all depends on your application. But if you look at some of the cypress real world app tests, since that's what we're leveraging for best practices, we have a should, then we get the first element, then another should, then an and. We have chains, but they're never super long. Even where there's more going on, like getBySel, location, should, should, we have quite a few assertions bundled into those chains. So with super long chains, I wouldn't say it's automatically bad, but it can definitely lead to more flake and more inconsistency. All right, so we're going to do another activity; we're also coming up on the second hour here. This will be the second coding activity, and the rest of the time will be used to talk through more best practices and examples, and then some Q&A. So let's take some good time here, about 15 minutes, and maybe a bit of a break as well, but I want to walk through it first. We're going to use a different spec file: the new transaction spec, line 259. Let me get that pulled up. We have our new transaction spec, still under the UI folder; line 259 is where it's described, and line 262 is where the test code starts. I'd definitely recommend popping a .only on that line so you only run this one test. So what's happening here? Let's run it. We'll stop the previous one, and in the cypress test runner we'll look for the new transaction spec. Because of the .only I added on line 262, it's only going to run that single test. Let's see what happens. All right, this one will probably fail every time; it's pretty flaky. So what's going on? We have a really long error message: the test timed out after four seconds, and the assertion failed because the element is detached from the DOM. There's a learn more link that goes directly to a section in our docs about this error message; cypress has a bunch of custom error messages that are meant to point you in the right direction, and this one walks through how to fix the test code: re-query for newly added DOM elements, understand when your application re-renders, and guard cypress from running commands until a specific condition is met, usually by writing an assertion or waiting on an XHR. In this case, the element we're working with is this Kaylin list item; that's what's getting detached from the DOM. So what's happening here? Let's take a look at the instructions: the test is intentionally flaky and demonstrates an example of how an element can become detached from the DOM.
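If you haven't used it before, here is a quick hypothetical illustration of what adding .only looks like; the test title is illustrative rather than the exact one on line 262:

```ts
// Adding .only tells the runner to execute just this test from the spec file.
it.only("navigates to the new transaction form, selects a user and submits a transaction payment", () => {
  // ...existing test body from the spec...
});
```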
So let's take a look and see what's happening here. We're clicking on our element, and once we do that, a GET request fires off. Then we're getting our list items and looking for the first one. But once that request completes, if we look at our application source code, it actually triggers a refresh of our DOM, and that is what's detaching the element. We can see this if we inspect the page, look at the network tab, and rerun the test. It can be interesting to watch because it shows the requests in progress, and you can see the milliseconds; it's really fast. But what's happening is that once the response from that search endpoint, that api request, comes back, it triggers a re-render of our DOM. So again, understanding what triggers re-renders in your application is really important for debugging these cases. Because we know it's related to a network request, we can leverage the same pattern we used last time. When we type into the search field, that fires off the request, and then with no wait time at all we immediately grab that user list item, chain .first, and then a should contain. So we grab the element here, and somewhere between that get and the should, the request completes, the DOM refreshes or re-renders, whichever term you prefer, and the element is detached. By the time we make our assertion, it's been detached from the DOM. Let's take a look at the solution. Again, there are a couple of different ways to solve it; if you looked at the source code and saw what was causing it, that would point you in the right direction, but let's start with what the solution looks like here. I'll drag my code back over. We're using the cy.intercept, .as, cy.wait pattern here. When we looked at the test code, we were able to identify the request that was being fired off. And it actually isn't caused by the click; I misspoke earlier. It's caused here: when we type in Kaylin, that's what eventually causes her to show up as the first item, but it takes a moment to resolve. Clicking outside the field, which is the next step, happens before the response comes back, and when it does come back, that's what causes the re-render. So what we've done in the test code to make it pass, to make it less flaky, is intercept that GET request. You don't have to pass the method; if you want to be more prescriptive and only match GET, you can, but I usually go for the simplest version. We match the users search endpoint with a wildcard and alias it as searchUsers, and then we wait for searchUsers here. To dig a little deeper into why this is happening, this involves our user list search. If we look at the react component we're working with, we can see that whenever there is a change in the text field, it fires off userListSearch.
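Here is a hedged sketch of that pattern; the endpoint path, alias, and custom getBySel selectors are illustrative and may differ slightly from the actual spec:

```ts
// Intercept the search request so we can wait on it by alias.
cy.intercept("GET", "/users/search*").as("searchUsers");

// Typing fires the search request.
cy.getBySel("user-list-search-input").type("Kaylin");

// Wait for the response — and the re-render it triggers — to settle
// before touching the list, so the element we grab can't be detached.
cy.wait("@searchUsers");

cy.getBySel("user-list-item")
  .first()
  .should("contain", "Kaylin")
  .click();
```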
So any time we type and there's a change, it fires off that request: it's searching on every single change in that text input, which is fine; we just need to understand that in order to test it. From a user experience perspective, the user doesn't really care that the item is getting detached from the DOM and re-rendering every time; they don't notice it. From a testing perspective, we want to understand it. So every time there's an onChange in this text field, it fires off this userListSearch method with the query, and the query is obviously whatever is in that input. That's defined in our transaction create container, the userListSearch. We can see here that a debounce has been added, and the debounce delays the call by 200 milliseconds. Then we're doing the fetch and sending that payload off to our back end with the query. So the fact that this is debounced is actually what's causing that delay, though we could solve for that a little differently as well. A couple of other things to keep in mind are just understanding how and when the DOM is being refreshed. If, say, we were using XState and, instead of a function within the component firing off that request, we were updating our state or leveraging our global store, we could actually follow along with what's been dispatched and wait for the dispatch to complete before proceeding. In this case, though, we just waited for the HTTP request to come back before proceeding. Could you use the target user's first name within the intercept? Yes, you absolutely could; that's a really, really good question, Wayne. Inside cy.intercept you can do a lot of things. In one of the examples we looked at in the slides for network requests, let me find it here, we're actually using a query, and query is one of the properties on the route matcher object that you can use to get specific. So if you really wanted to be granular and say we want to wait for this particular request, which is actually a really good best practice, because there could be different requests firing off to that search endpoint and you want to make sure you're waiting specifically for that one, then you could leverage the query property on that object. You could pass through a variable, like whatever username you're using in the test; in this case it was Kaylin, or whatever the expected term is. So yeah, really great question, really great idea. And Lars, that's an unrelated question, about making the test case more readable, so that's about command logging and the cypress command log. That is correct: if you're using methods defined outside of the test, or you're pulling them in, you may not get as much helpful information in the command log unless you're leveraging something like logging customizations. Anyway, sorry, I got a little distracted there.
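Here is a hedged sketch of waiting on that one specific search request using the route matcher's query property; the query parameter name "q" is an assumption, not confirmed from the app's code:

```ts
const searchTerm = "Kaylin";

// Match only the search request whose query string contains our term,
// so other requests to the same endpoint don't satisfy the wait.
cy.intercept({
  method: "GET",
  pathname: "/users/search",
  query: { q: searchTerm }, // "q" is an assumed parameter name
}).as("searchForUser");

cy.getBySel("user-list-search-input").type(searchTerm);
cy.wait("@searchForUser");
```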
But so that was how to handle some detached from DOM issues; I really wanted to show the source code so you understand how to follow that trace into the application and see what's causing those re-renders. How was that, any questions on that activity? Hopefully now you're feeling much more comfortable with the intercept, alias, and wait pattern for handling a lot of the flake you may be experiencing in your application. That's when the flake can be addressed with test code; sometimes it can't, and so we have to talk about how to address all the different types of flake. Awesome, yeah, thanks for that example, that's perfect. Yep, you can pass that through directly as part of the path as well. Okay, so sometimes, like we talked about, we have DOM-related flake and network-related flake, but we also have environmental flake: flake that's related to things outside of the test code itself, and sometimes even outside of the application itself. We're always at the mercy of our source code and our test environments; we're not testing perfect things, and we're not testing in perfect environments. So there are a couple of strategies you can take to mitigate issues that come from outside of what you're writing in your test code. Let's talk about those. The first one is test retries. Test retries are a function of the cypress test runner. For those of you who may not be familiar: the cypress test runner, the cypress app, is free and open source and always will be; that's what we've been using this entire time. On top of that, we also have the cypress dashboard, which is essentially a SaaS product that lets you record your test results to one location where you can see them and get analytics; we'll talk about some of that functionality too. But a lot of what we're talking about is in the test runner itself. Test retries are configured in the test runner; you don't have to use the dashboard for it, but the dashboard is helpful because it will show you each attempt. You can configure retries either with a single number for how many times you want it to retry, or by mode, by passing an object to the retries property with runMode and openMode. Sometimes when you're running locally you don't want it to retry; I actually turned it on because I wanted to leverage it for this workshop. But in run mode, which is typically CI, you want it to retry so you get that information. That configuration goes in your cypress.json. Here I have runMode and openMode; it was one, and I just upped it for this workshop so you could see how it works. And then in the cypress dashboard, you can see the screenshots, the videos, and the information for each attempt. Just to show you what that looks like: with the cypress real world app, we use the cypress dashboard to record all the results of our test runs, and we make those public. From the GitHub repository, if you click on the tests badge, it will take you to our public dashboard, where you can see all the information I'm about to show you.
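As a hedged sketch, the suite-wide configuration described here lives in cypress.json, and the same shape can also be applied to a single test via test configuration; the retry counts below are just example values:

```ts
// Equivalent cypress.json entry (suite-wide):
// { "retries": { "runMode": 2, "openMode": 0 } }

// The same options can be set for one test through its config object.
it(
  "user A likes a transaction of user B",
  { retries: { runMode: 2, openMode: 0 } },
  () => {
    // ...test body...
  }
);
```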
But if we go into, actually, there's a flake demo branch, yeah, here we go. This is the branch we're actually using today; it has a lot of built-in flake. If we click into a failed test case, for example, we can see that it was attempted three times, that it hit the same error all three times, that this error caused 24 failed test results, and we can see the screenshots and the video from each attempt. That shows you how test retries can be leveraged. The dashboard will also show you flaky tests. As we said earlier, flake is when a test fails but then passes, so the cypress dashboard will automatically detect these flaky cases where a subsequent retry ultimately passed, and it will flag them. It will tell you that a test is flaky, show how many flaky tests there are in each run, and collect all of those flaky test cases together in the flaky test analytics. That's all based on test retries: it identifies flaky tests and places a severity on them based on the number of times a test was retried, because that's the only indicator we're leveraging to define a flaky test case. So you do have to use test retries in the cypress test runner; that's how it's enabled. You can see artifacts and also some historical information. And again, if you want to take a look at the cypress real world app, all of this is public. For our flaky test cases, we have an overall flakiness of 18% across the entire suite. What's more helpful is to look at it by branch: on the develop branch we only have 3% flake, but if we go to the flake demo branch and deselect develop, we'll see that a lot of the flake is coming from this branch, which has a 16% flake rate. And if we look at the flaky test cases, we'll recognize them from the ones we used today: user A likes a transaction of user B. We can see all the runs it was flaky on, the errors that caused it, which were the like button and the transaction item, the test code history and the last time it was changed, basically the key information about those flaky cases and their severity. I mentioned before that we have more flake management features coming. You do get alerts back to GitHub pull requests and also in Slack whenever flaky tests are identified. We don't want you to just say, okay, well, it passed eventually, I guess we're good, because as we talked about, flake can be an indicator of issues with the way the tests are written, with the underlying application, or with the environment. So it's helpful to see those patterns across your entire test suite and understand what's causing the flake. For example, if we see that a lot of these are coming from the transaction feed, notification, and new transaction specs, and a lot of them relate to things not populating, maybe we have an api endpoint that's misbehaving, or maybe we have a debounced method. Obviously these are contrived examples, but that kind of view can point to the underlying causes of flake in your overall test suite, not just at the level of individual tests. And then the other thing I mentioned was test burn-in. That's coming soon; it's on our roadmap.
Essentially, it would identify new tests: whenever new tests are added to your test suite, it would automatically retry them multiple times, and the number of retries is supposed to be configurable, so you could do five or ten or whatever it may be, and check for flake. That way you can battle-test new cases before introducing them into your test suite, to see whether they're actually good, stable tests. I'm not sure if it would cover changed tests; that was the question that prompted this earlier, how can I ensure that a change I make is a good change? That's really interesting feedback, so definitely make sure to get that to our team. But again, you can also orchestrate some of this yourself; we have some really good blog posts around orchestrating tests, like running changed specs first or running modified areas of your application first. You also get CI recommendations in the dashboard, but I wanted to point out this section in our documentation. There's a lot of variability in testing an application in CI: there's the environment, the machine resources, where the application itself is hosted, whether you're building the application in CI every single time based on the code changes, whether you're running the tests on the same machine that's serving the application. A lot goes into it, and it's all highly variable, but we added this section to our documentation that can give you some indication that if you're seeing flake or performance issues in CI, it could be a result of machine requirements. We also have example information in our cypress documentation about how we execute the real world app tests and the machine sizes we use. So if you're having environment-related flake, something that only happens in CI, or tests slowing down and crashing in CI, I definitely recommend checking out that section of the documentation. I also wanted to point out that the cypress dashboard will give you recommendations on how many machines to run your test suite against in order to maximize the time savings, and to ensure the tests are split across the right number of machines so they run quickly enough and you get those resource savings. Okay, the other thing I want to talk about is test data management; that was one of the causes of environmental flake. This goes back to that really big bullet point from earlier: when your tests are flaky for anything other than the functionality you are testing, that is bad flake. A lot of what I see is failing or flaky tests related to test data management. Test data management refers to what data you're using to test, how it's updated, how it's maintained, and how you reference that data in your test code. I've seen everything from hard-coding a check for specific content, to using fake data generated every time and passed through variables, to using fixtures or static data and seeding a test database. There are a lot of different strategies you can take, but here are some general best practices for test data management to reduce that environment-related flake. Tests should be independent of each other.
So you shouldn't be setting up data in one test that you're going to use in a different test. Tests should also be idempotent (I can never say that word): you should be able to put a .only in front of any it block and have it run and pass on its own. Tests should not depend on each other, and that holds across spec files too; it's a good practice. It will also help you when you go to parallelize, because cypress parallelizes by spec file, so if you have tests that interfere with each other and they're running on parallel machines, I've seen that cause issues when test data management isn't solid. And when possible, use seeded data. With the cypress real world app, if you go to cypress.slides.com/Cecilia, I have a presentation specifically about test data management in the cypress real world app, so I'd definitely recommend checking that out. What we do is, and you may have noticed this, there's a db:seed task that runs in the beforeEach hook; it literally reseeds the database before each it block, so we always have clean, fresh data every single time. The other thing we're doing is leveraging an endpoint specifically for test data. Let me pull a test case back up. (And zero means no retries, by the way, to answer the question in the chat.) So if we come back to one of our test cases here, this is a new transaction test, and we're getting a user and getting transactions from our database. We have cy.database custom commands, where we're grabbing a user and a username from an endpoint specifically designed to give us test data. That essentially allows us to know that we're always going to get a value that exists in our database. So instead of saying, well, I really hope that username exists, or I really hope that password works, we can say, hey, test data endpoint, give me a username I can log in with, and we know for a fact it's going to work because we just got it from that endpoint in our back end. Another option is to use your api or your front-end state management to set up and clean up data for tests. As we talked about with those three custom commands, we have login via the UI, login via the api, and login via XState, and we also have a logout via XState and a switch user via XState. So if we don't care about the back end for a given test, because we've already tested it elsewhere, and we just need to switch context in order to test something in the UI, we can use our front-end state management to set up the data for that it block without having to worry about anything else affecting it. And then again, when possible, leverage network stubbing and fixtures when you don't need to hit your real api for testing. By all means, for your critical paths, if you want full end-to-end coverage on those paths, then yes, run through everything, hit your back end, and ensure those end-to-end tests work. But for other areas of your test suite that don't depend on that, login is the low-hanging fruit here: some applications require you to log in for every single test because everything is behind a login.
You only need to test your login flow once; you don't need to test it for every single it block. So test it once, with that full end-to-end coverage, but then for other areas of your application where you're only testing certain feature areas, just log in via the api, or even better, via state management on the front end, so you're not dependent on the login flow. Any questions on test data management, or anything I showed about flake management with test retries or the dashboard? Let me pop back over to the Discord here. Okay, awesome. All right, I said we probably wouldn't take up the full three hours today, and I think we did pretty well on time; the last time I ran this workshop it took about two hours and fifteen minutes, and I could choose between a two- or three-hour slot, so I wanted to make sure we had enough time. So I wanted to leave some time for any general questions around cypress, or, you know, what it's like in Atlanta this time of year, but it'll probably mostly be cypress related, anything I can help answer or look up in the docs. Feel free to use this time, and if you don't have any questions for me, please feel free to go ahead and drop off if you like. Okay, good, so a question here: what's a better way to add test data to an environment, api requests or database seeding via SQL queries? It depends on how your data is set up. You could do both: you could have an api endpoint that you use to seed the database, and that endpoint ends up running a SQL query to do the seeding, or you can tap into the underlying node process with cy.task. It also depends on where your database lives. If you're using a database on the same machine and you're able to use cy.task to run a SQL query to seed it, then you can definitely do that. But if your database is hosted somewhere else and you're not able to access it directly from the same process where you're running cypress, then you typically need to use an api request. With the cypress real world app, we're using cy.task, so I can dig into that a little bit more here; give me one second to pull up the slide I was talking about earlier. Okay, test data management, oh yeah, here we go, Exploring the cypress real world app. All right, so: there's a local JSON database managed with lowdb. Even if you didn't have a JSON flat-file database, if you did have a database running on the same machine you're testing on, you could use the same approach. The database is reseeded when the app is started and in between each test, and the seed is based on the business logic of the application. We actually have a script that generates the data based on what we need, for example, transactions between friends, or a transaction with a particular dollar amount or on a particular date. So there's no need to create business logic within the tests, because the data types and transactions we need have already been created. If you have a really complex data model, that could be difficult to do with a script, but the goal here is to keep the test code clear and not have to set up data within your test code specifically.
So what we do is we have, in our beforeEach hook, this cy.task('db:seed'). What that's doing is making a POST request to our test data endpoint for the seed command. That api endpoint, which is created only for test data, lives in our back end's test data routes.ts file; it accepts the POST request, goes ahead and calls a seedDatabase function, and then sends a response status back. So it's honing in on that seedDatabase function. seedDatabase is in our database.ts file, and it's actually doing a readFileSync, so it's a node operation being executed, and then we just call db.setState with the test seed. The test seed comes from our database seed JSON file that was generated when the application started up, and then we write that test seed to our database. So we can then use that test data throughout our tests, and we can query our database to get specific data to leverage in our tests. We have a database custom command that's just a general command, and then an operation argument that lets you do two different things: a filter and a find. Filter will give you data results based on the lodash filter function, and find is based on the lodash find function. What this looks like in our tests is that when we query the database, we fetch the data and pass it back into our test code. We're actually going to that endpoint, passing through the entity, whether it's users, transactions, likes, bank accounts, whatever it may be, and then waiting for that to come back so we can use it in the test code itself. So that is how we do it in the cypress real world app. Again, it's going to depend on whether you have access to the database to seed it directly with cy.task, whether you're able to run a SQL query against it, or whether you need to make an api request. Can you use an alias to store a return value from a cy.task? Oh, that's a good question. I don't think so, but let's check the documentation, as always. Okay, so .as has requirements: it does require being chained off a previous command, and it will not run assertions. So let's take a look at cy.task's requirements. Okay, so a task will only run assertions you have chained once, it will not retry, and it can time out. It doesn't say in the documentation, so that's a really good question; I can find out and post that as a follow-up. I don't believe so, just because typically I see aliases being used for either routes or variables. Yeah, it's a really good question; I'm not sure, I'll have to dig in and find out for you. And then: what safeguards would you put in place to make sure those data seed processes don't run on the live database? Yeah, so you definitely would only want that to run against a test environment or test database. A lot of times you can just use environment variables to check what kind of environment you're in. Obviously, in this case, we're only running these commands locally on the machine we're running the tests on; this is just a local database, it's not production, we're not even touching staging or dev.
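As a hedged sketch of what that querying pattern can look like in a test, modeled on the cy.database custom command described above; the selectors and the seeded password are assumptions based on the app's conventions, so treat the details as illustrative:

```ts
// Re-seed before each test so the data below is guaranteed to exist.
beforeEach(() => {
  cy.task("db:seed");
});

it("logs in with a user pulled from the seeded test data", () => {
  // "find" returns a single matching record from the given entity.
  cy.database("find", "users").then((user) => {
    // Use a seeded username instead of hard-coding one and hoping it exists.
    cy.getBySel("signin-username").type(user.username);
    cy.getBySel("signin-password").type("s3cret"); // assumed seeded password
    cy.getBySel("signin-submit").click();
  });
});
```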
But typically I see people use different environment variables to set, for example, the database URL or which host it's on, depending on the environment they're running against. And I would say you really never want to make changes to a production database, even if you are testing against production; that's really dangerous. So typically you'll point at your staging database, your dev database, or your testing database, whichever location that is, and you can set that host URL in your cypress.json. You can also have different cypress config files based on the environment you're on, so you can say, we're on staging today, use this cypress.json with all of our staging variables, or we're testing on dev today, use the dev cypress.json with all of those variables in it. But yeah, don't touch production; I can just totally see that happening. Okay: we recently faced a big pipeline blocker from a test stuck in an infinite loop. So, as far as a timeout on execution, cypress has some built-in timeout behavior where a run will ultimately crash and end; I think the default is around 30 minutes, so if a run goes longer than that it will just crash and end, and I think that's configurable. Let's see, you'd probably want to check the cypress run command line docs, which list all the options, including the config file option. I don't think a run timeout is one of the CLI options, though; if it does exist and is configurable, you'd probably have to set it in your cypress.json. But yes, there is a default timeout for a cypress run. For database queries, is there a particular plugin? It's really going to depend on what database you're using. cy.task allows you to tap into the underlying node process, and from there you can write whatever node code you like. But there are some plugins as well; there are a lot of plugins, so definitely check out the plugins page. All of our plugins are tagged as either official, if it's something from cypress, verified, if it's been reviewed by our team, or community, and there are also some experimental ones, which is something to keep in mind before introducing them into a production test suite; whether a plugin is supported or experimental is worth taking into consideration. Yeah, so that cy.log question, the symbols and pictures: you probably noticed that we have an authenticating message that shows a lock and the username. That's part of the patterns and practices material I posted as a response to the cy.log question. Let me get this pulled up, the slides; it was also a webinar the DX team did (I didn't do it) called Patterns and Practices, and one of the things it covers is customizing the command log. It walks through how to add those messages, so it says it's authenticating the username, and we actually put the little emojis in there. Nice, thanks for the link there.
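On the safeguard question, here is a small hedged sketch of one way to guard a seed step so it can only run against a local or test host; the environment variable name, fallback URL, and check are assumptions, not the Real World App's actual code:

```ts
beforeEach(() => {
  // Abort loudly if the configured API host is not a local/test environment.
  const apiUrl: string = Cypress.env("apiUrl") || "http://localhost:3001";
  expect(apiUrl, "refusing to seed a non-local database").to.include("localhost");

  cy.task("db:seed");
});
```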
Okay, could someone change the environment variable? Yeah, that's true, too. Although if you're only hitting a certain endpoint just for test data and reading from it, then it wouldn't actually be able to mutate anything in your database. Awesome. Any other questions? Feel free to come off mute if that's easier, or keep popping them in the chat. Awesome, all right, great. Well, thank you all so much for your time today, for the great questions, and for participating. I hope this was helpful and that you at least have some good strategies you can take back to your team for mitigating some of the flake you're experiencing. So again, I'm Cecilia, a technical account manager at cypress; I work with all of our users that leverage the dashboard, and I also do things like this. You can follow me on Twitter at CeciliaCreates, just my first name plus the word "creates"; it's usually a good place to keep up to date, and I like to post a lot of cypress talks, resources, and things that I see. Thank you so much again. Feel free to leverage the slides and share them with your teams, and hopefully we'll have the recording for this available soon, if I understand correctly. Awesome, great. Thanks, everyone. Cheers, enjoy the rest of your day, and for those of you who celebrate, have a nice Thanksgiving. Bye, everyone.
