Visual regression is one of the hardest parts of UI testing, and you will likely agree that it is extremely powerful. But how does it work? What problems is it solving under the hood? Why do people choose visual regression services, and how did we build the fastest visual regression tool in the world? :)
Visual Regression Under the Hood
Transcription
Hey, everybody. I'm really excited to be here today and to talk about visual regression. My name is Dmitry. I'm from Ukraine, working full-time at Cypress.io and doing some work in the open source community. Let's start.
Today we've been talking a lot about UI testing, but you will probably agree that the hardest part of UI testing is testing how the UI looks for users, right? Because computers don't know anything about UI. And that's where visual regression gives us a lot of value. Let's walk through a simple example of visual regression and then dive into the process.
So here is an example: a simple screenshot of the Cypress.io homepage. And here's the next screenshot. You probably can't spot the difference, right? Because they're changing too quickly. But visual regression can do this automatically. You can see that there are two changes between these screenshots. And this is extremely helpful when we want to build a stable and reliable system. It makes us confident about how our UI changes for real users.
But in my experience, visual regression is also an extremely flaky test category. You probably know the situation when literally every second screenshot, every second commit has some visual regression noise. We are all humans, so we get used to it and start ignoring the diffs, auto-approving them, and so on and so forth. And this is a problem, because once visual regression becomes flaky, it loses its value. So today I'd like to discuss this problem by diving into how visual regression works under the hood, and then use that knowledge to build more reliable visual regression.
Under the hood, visual regression always consists of four simple steps. First, you need to load a page. Then you need to make a screenshot, compare it with the previously approved version, and see the difference. It looks pretty easy, but each of these steps has its own hidden problems, and I'd like to discuss them. So first of all, you need to load a page.
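As a rough sketch of the last two steps (compare with the approved version, report the difference), the TypeScript below keeps approved screenshots in memory and returns a verdict for each new one. Everything in it is an assumption made for illustration: the names `BaselineStore`, `check`, `approve`, and `diffRatio` are hypothetical, screenshots are plain byte arrays rather than real images, and treating the first screenshot as the baseline is just one possible policy.

```typescript
// Hypothetical types and names, for illustration only.
type Screenshot = Uint8Array; // stand-in for the raw bytes of a captured image
type Verdict = "new-baseline" | "match" | "mismatch";

// Fraction of bytes that differ between two screenshots (0 = identical).
function diffRatio(a: Screenshot, b: Screenshot): number {
  if (a.length !== b.length) return 1; // different size: treat as fully different
  if (a.length === 0) return 0;
  let differing = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) differing++;
  }
  return differing / a.length;
}

class BaselineStore {
  private approved = new Map<string, Screenshot>();

  // Compare a new screenshot with the approved one and report the outcome.
  check(testName: string, shot: Screenshot, maxDiffRatio = 0): Verdict {
    const approvedShot = this.approved.get(testName);
    if (approvedShot === undefined) {
      // Assumed policy: on the first run the screenshot becomes the baseline.
      this.approved.set(testName, shot);
      return "new-baseline";
    }
    return diffRatio(approvedShot, shot) <= maxDiffRatio ? "match" : "mismatch";
  }

  // Called after a human has reviewed a mismatch and accepted the new look.
  approve(testName: string, shot: Screenshot): void {
    this.approved.set(testName, shot);
  }
}

const store = new BaselineStore();
const firstRun = store.check("homepage", Uint8Array.from([1, 2, 3, 4]));
const sameAgain = store.check("homepage", Uint8Array.from([1, 2, 3, 4]));
const changed = store.check("homepage", Uint8Array.from([1, 2, 9, 4]));
store.approve("homepage", Uint8Array.from([1, 2, 9, 4]));
const afterApproval = store.check("homepage", Uint8Array.from([1, 2, 9, 4]));

console.log(firstRun, sameAgain, changed, afterApproval);
// prints: new-baseline match mismatch match
```

The sketch takes the screenshots as given. Producing them reliably is a separate problem, and it starts with loading the page.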
But it's not enough to just load a page using your favorite browser-based test runner like Cypress, right? You need to make this page predictable. And this is a problem, especially when you are not using visual regression services, because when your page is not in a stable state, you can easily get a lot of noise. For example, most of the screenshots here have sections that change from time to time, like inline videos or a carousel that rotates on a timeout. All of this can easily break the visual regression process. Animations, times, and random values can break it too. So we need to be careful about this.
But that's not everything we should care about. The UI can also differ even when you are running the same code, just in a different operating system or a different browser. Different layout engines, or the operating system itself, can produce layout shifts or a different default appearance, and that breaks our tests. This is a real problem, which is solved perfectly by visual regression services, but it causes a lot of trouble for people who are trying to do visual regression by themselves. Visual regression services solve it by loading your HTML, not the screenshot, rendering this HTML with all the styles in the specified browser, and only there making a screenshot and comparing it.
But you can get the same level of predictability by running all of your tests, or only your visual regression tests, in Docker. It can even be reasonable to keep a separate set of tests only for visual regression and run it only in Docker, even approve it in Docker. This makes you confident that your tests run in the same environment and don't produce a lot of noise and layout shifts between developers' local machines. But there is also an interesting middle ground between these two approaches.
There is a project called Visual Regression Tracker that lets you run these tests inside Docker as a self-hosted service. It gives you an interface for approving screenshots and the same level of predictability as visual regression services, but self-hosted. I'm sure this project will shape the future of visual regression.
But then you need to make a screenshot, right? But which one? And here is a problem. I constantly see, especially in the Cypress community, that people use the default Cypress resolution or some small resolution that is honestly not used by anybody in the world. We need to ensure that we test our UI at the resolutions that our users actually use. You can easily get this information from any analytics tool. For example, here are the stats of my personal website. You can see that most of my users use this weird resolution of some tablet. And honestly, I'm not testing my website at this resolution. You probably know how easy it is to miss a visual defect at a resolution that is not widely popular or is too big, for example. So you need to ensure that you test at the resolutions used by your users.
And it's actually weird that, by default, visual regression tools and services don't use the most popular resolutions, like Full HD. The reason we do screenshot testing on small images is that comparing screenshots is really slow. To compare two images at Full HD resolution, you need to iterate over 2 million pixels, calculate the difference for each one using a specialized formula, and only then save the difference. It's a pretty hard and not performance-friendly task for computers, especially when you're trying to do it in JavaScript.
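To make the cost concrete, here is a minimal TypeScript sketch of a pixel-by-pixel comparison of two Full HD images. It is an illustration under stated assumptions, not the algorithm of any particular tool: the names `RgbaImage`, `compareImages`, and `solidImage` are hypothetical, the images are synthetic RGBA buffers, and a normalized squared RGB distance with an arbitrary `threshold` stands in for the specialized formula a real tool would use.

```typescript
// Hypothetical names; the color distance is a simple stand-in formula.
interface RgbaImage {
  width: number;
  height: number;
  data: Uint8Array; // 4 bytes per pixel (R, G, B, A), row by row
}

interface DiffResult {
  diffPixels: number;   // how many pixels differ beyond the threshold
  diffMask: Uint8Array; // 1 where a pixel differs, 0 elsewhere
}

function compareImages(a: RgbaImage, b: RgbaImage, threshold = 0.1): DiffResult {
  if (a.width !== b.width || a.height !== b.height) {
    throw new Error("Images must have the same dimensions");
  }
  const pixelCount = a.width * a.height;
  const diffMask = new Uint8Array(pixelCount);
  const maxDistance = 3 * 255 * 255; // largest possible squared RGB distance
  let diffPixels = 0;
  // Every pixel is visited once; alpha is ignored in this sketch.
  for (let i = 0; i < pixelCount; i++) {
    const offset = i * 4;
    const dr = a.data[offset] - b.data[offset];
    const dg = a.data[offset + 1] - b.data[offset + 1];
    const db = a.data[offset + 2] - b.data[offset + 2];
    const distance = (dr * dr + dg * dg + db * db) / maxDistance;
    if (distance > threshold) {
      diffMask[i] = 1;
      diffPixels++;
    }
  }
  return { diffPixels, diffMask };
}

// Build a synthetic image filled with a single opaque color.
function solidImage(width: number, height: number, rgb: [number, number, number]): RgbaImage {
  const data = new Uint8Array(width * height * 4);
  for (let i = 0; i < width * height; i++) {
    data[i * 4] = rgb[0];
    data[i * 4 + 1] = rgb[1];
    data[i * 4 + 2] = rgb[2];
    data[i * 4 + 3] = 255;
  }
  return { width, height, data };
}

const baselineImage = solidImage(1920, 1080, [255, 255, 255]);
const currentImage = solidImage(1920, 1080, [255, 255, 255]);
// Paint a 10x10 black square in the top-left corner of the new screenshot.
for (let y = 0; y < 10; y++) {
  for (let x = 0; x < 10; x++) {
    const offset = (y * currentImage.width + x) * 4;
    currentImage.data[offset] = 0;
    currentImage.data[offset + 1] = 0;
    currentImage.data[offset + 2] = 0;
  }
}

const comparison = compareImages(baselineImage, currentImage);
console.log(baselineImage.width * baselineImage.height); // 2073600 pixels visited
console.log(comparison.diffPixels); // 100
```

Even this simple loop has to touch every one of the roughly 2 million pixels to find a 100-pixel change, which is the cost described above.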
And that's why I created, and am right now working on, a library called odiff that lets you do the image comparison not in JavaScript but in a native, more performant language. It saves you a lot of time and lets you test the screenshots that you want, and do it faster.
So we are probably out of time. Let's go through the conclusion and the key to painless visual regression. You need to ensure that your tests run in the same environment, and that you don't have any unstable content on your page, even if you're using visual regression services. And you need to test your UI at the resolutions that your users actually use, not just those that are fast or performance-friendly for some service. And that's it. I'm happy to be here. Thank you. And bye. Bye.