Visual testing compares the appearance of your application with a previous state. If changes become visible, you can allow them or not. So you or your testers have their eyes everywhere - without needing to check manually repeatedly. I've been using visual testing for a while, saving my neck a few times. Let's look at my journey together and explore if and how visual testing can also help your projects.
Let’s Get Visual - Visual Testing in Your Vue.JS Project
AI Generated Video Summary
This Talk discusses the importance of fixing small UI errors and typos, as they can leave a negative impression and raise questions about trust in applications. Traditional testing methods may not catch all UI errors, so visual testing is introduced as a solution. The Visual Regression Tracker is recommended as a tool for managing visual test results. Best practices for visual testing include ensuring the application is fully loaded, addressing flakiness, and handling false negatives. The key lessons include giving tests eyes, looking beyond the given path, using visual testing, and covering the original with suitable tests if consistent results can't be obtained.
1. Introduction to Visual Testing
Hello and welcome to my session at Vue.js Live. I'm Ramona Schwering, a software engineer at Shopware. I'll showcase the importance of fixing small UI errors and typos. These errors can leave a negative impression and raise questions about trust in applications. The phenomenon of inattentional blindness contributes to such errors.
Hello and welcome to my session here at Vue.js Live. I'm so glad to have you here and that you seem to be interested in learning more about visual testing in your Vue application, because to be honest with you guys, it saved my neck a couple of times, and I hope I can basically give you the same experience, especially as testing can sometimes be a little daunting.
But well, before that, my name is Ramona Schwering. I'm working as a software engineer at Shopware, which is a company providing an open source ecommerce platform. And there's much Vue involved, so I'm working with Vue for three days now, I think. And apart from that, I became a Google Developer Expert in web technologies and a Cypress ambassador. And yeah, I guess you might not be surprised to hear that I'm especially known for testing, and I hope I can make testing accessible for anyone, and especially pain-free, or a little more pain-free, maybe, for everyone.
And without further ado, there's one point in testing which I'd like to showcase to you. And I don't know if you are similar to me, but sometimes when I'm dealing with my cell phone, with applications, no matter if it's Vue or not, there are some bugs which I encounter often, but I'm not sure if it's me being a perfectionist or if you feel like that too. There are bugs which are not release blockers. They are small user interface errors or typos. Just plain looking ugly, right? So I think they are basically everywhere. And they leave a certain impression if you don't fix them.
See this one, which I took some time ago from my cell phone, where this string in the middle of it, keine Mitteilungen, or in English, no notifications, is clearly broken, right? And you can find it basically everywhere. It's also the case for larger companies like Google, where you have a button in the wrong location, right? Or take a look at this Facebook app, where the button has completely wrong padding. There are so many examples I could showcase to you, but we have a certain time frame.
To be honest, I sometimes feel a bit triggered by it, and it's not only my own perfectionism, I think. Because I wonder one thing. If you have an app or a website with lots of UI errors, which just looks broken or shows no sign of care, would you trust it? Would you trust such applications with your credit card data, for example? Well, I'm not quite sure when it comes to my opinion. But I don't want to be too strict here either, because we're all humans, right? Behind all applications, behind all websites, there's a developer. And we humans sometimes do a bit of strange stuff.
And there's one phenomenon which is, at least in my opinion, one of the reasons why such errors occur. It's a phenomenon called inattentional blindness. It's well-known in psychology, and it's depicted not only in psychology classes, but also in traffic ads like the famous Whodunnit from the U.K. You can take a look at this video later on. I posted it here as a QR code. All of those videos, all of those campaigns double down on the fact that a person fails to notice an unexpected stimulus in plain sight, solely because of a lack of attention and not because of any visual defects or blindness or deficits. Imagine a designer who builds a wonderful banner but doesn't notice there is a huge typo in the headline. Stuff like that.
2. The Limitations of Traditional Testing
We have various types of testing like unit testing, integration testing, and end-to-end testing. However, these tests may not catch all UI errors and typos. End-to-end testing, for example, may miss issues that are outside of its scope. There is a need to give our tests eyes and introduce visual testing.
I guess everyone has had such a situation, right? But don't we have testing for such a situation? We have good test automation, right? We have unit testing, integration testing, end-to-end testing. Don't we? Shouldn't they catch that? And they do. But there is a catch, at least in my opinion. I would say they don't always catch it, because all of those testing types will only test what they are supposed to test. I like to phrase it as: end-to-end testing doesn't look left or right. So things could remain undetected if they are outside of the concept, outside of the things you explicitly wrote down, right?
3. Visual Testing and Tools
We need to give our tests eyes. Visual testing is like a spot the difference puzzle. It compares screenshots to find differences, allowing us to decide if they are intentional changes or visual bugs. While there are AI tools available, human review is still necessary. Open source solutions like the Visual Regression Tracker are worth exploring for managing visual tests.
We need to give our tests eyes. Like we as humans do it automatically, right? We see things, we don't always just focus on one thing. In other words, we want to do visual testing. So what does this exactly mean? What is visual testing and how does it work?
Well, I don't know if you saw my promo video on Twitter, but I did a little cut in it and it was really intentional. So imagine those two frames from the video as a spot the difference puzzle. I loved to do them in my childhood. I loved to find all the differences. So can you spot the differences in this puzzle and this little video here? I guess the most obvious one has already been found. The glasses. I changed my glasses throughout the video. Another one is the penguin here, which was an elephant before. And the box here had a different color. It was white in the beginning. And as I was editing my slides, I even saw a fourth difference. The GitHub logo on my arm was gone. Whatever, it was like that. But yeah, four differences we as humans can find really, really quickly, right?
So I want a test to do exactly the same. I want the test to find those differences like a human would. And it can achieve that through, basically, a screenshot comparison. You take one screenshot from your master branch or from anything else you consider the status quo, which is the correct way your application should look. So it defines what's correct. And then you take a new, current screenshot from your branch, from your new feature, whatever you are building right now, as long as it's taken from your Vue application, and then you compare those screenshots and highlight the differences. Like a spot the difference puzzle, basically. When we have those differences, we can decide whether the change is desired, done by intention, or not. So was it a feature we built, or some visible change we made in the UI, a difference we intended, or is it a visual bug?
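To make the screenshot comparison described above a bit more concrete, here is a minimal sketch in plain JavaScript. It is not how any of the tools in this talk are implemented; it only assumes that both screenshots are available as raw RGBA pixel buffers of the same size. The function name `diffScreenshots` and the tiny sample image are made up for illustration.

```javascript
// Compare two same-sized RGBA pixel buffers, like a spot the difference puzzle.
// Returns how many pixels differ and a mask marking each differing pixel.
// Hypothetical helper for illustration, not part of any tool named in the talk.
function diffScreenshots(baseline, current, width, height) {
  const expectedLength = width * height * 4; // 4 channels per pixel: R, G, B, A
  if (baseline.length !== expectedLength || current.length !== expectedLength) {
    throw new Error("Both screenshots must be RGBA buffers of the given size");
  }
  const mask = new Uint8Array(width * height); // 1 = this pixel differs
  let changedPixels = 0;
  for (let pixel = 0; pixel < width * height; pixel++) {
    const offset = pixel * 4;
    for (let channel = 0; channel < 4; channel++) {
      if (baseline[offset + channel] !== current[offset + channel]) {
        mask[pixel] = 1;
        changedPixels++;
        break; // count each pixel only once
      }
    }
  }
  return { changedPixels, mask };
}

// Illustrative 2x1 image: the second pixel changes from white to red.
const baseline = Uint8Array.from([255, 255, 255, 255, 255, 255, 255, 255]);
const current = Uint8Array.from([255, 255, 255, 255, 255, 0, 0, 255]);
const result = diffScreenshots(baseline, current, 2, 1);
console.log(`Changed pixels: ${result.changedPixels}`);
```

The mask is what a tool could paint in red on top of the new screenshot. Whether a reported difference is a feature or a bug is still a decision for a human reviewer.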
The test cannot completely decide this by itself yet. How could it? There are some tools which provide you with some AI, some machine learning, but in my opinion we still need a review done by an actual human who can tell if a change is intentional or not. So, there are a couple of tools you could take a look at to implement such visual testing. Maybe Applitools, which does provide some AI, Percy, or Chromatic from Storybook, all wonderful tools. However, I'm especially interested in open source solutions. And there are many custom implementations and open source plugins you can take a look at. My favorite one is the Visual Regression Tracker. So, the Visual Regression Tracker is basically a tool for managing the results of visual tests, displaying those screenshot comparisons, giving you an opportunity to reject or to approve such changes.
4. Using the Visual Regression Tracker
The Visual Regression Tracker is a tool for managing visual test results. It allows you to compare screenshots and approve or reject changes. It's automation-framework-independent, supporting tools like Playwright, Cypress, and Selenium. It's open source and self-hosted, requiring only Docker. Installation is similar to adding a dependency, and using it in tests is done through a custom command. The service provides a UI for approving screenshots, comparing the baseline and new implementation. Differences can be highlighted for easier identification.
I like that it's at least a bit automation-framework-independent. So, you can use Playwright with it, you can use Cypress for it, and I guess even Selenium and others.
So, I will use Cypress in this talk, for example, but this shouldn't be a roadblock if you don't use it. So, I will show you this Visual Regression Tracker and I like it because it's open source, it's a self-hosted solution, so I can control everything I want to. And it only requires Docker on your machine, and as I like Docker, this is basically a wonderful thing I like too.
So, let's stay with the Visual Regression Tracker in this talk. And to get it installed, I will just briefly cover it, so we don't waste lots of time, but I could give you a more extended video if you're interested in it. So, you just run... Yeah, after installing Docker, of course, you run this little script, which installs everything you need. So, you then have this service to start. No worries. We'll take a look at it later. And from the point of view of your tests, it's just like installing another dependency. In my case, installing or importing a Cypress plugin.
If I want to use the Visual Regression Tracker inside of my tests, it's only a custom command. So you see the cy.track command here. As I don't use any filter or any parameters besides the title, I will capture the complete page here. But if you chain it right after a little element, like I do in the second try, I will only take a screenshot of that one element. And then it will be given to the Visual Regression Tracker service.
This service looks like this. So you have a little UI, which helps you with the approval workflow of the screenshots. And you can accept a change or you can reject it. On the left, you will see the baseline. So the basic screenshot for the comparison. So what should be correct. Correct is, of course, a matter of definition, but we assume that it's the correct state. And on the right side, there's the new screenshot from the new implementation, from the branch or whatever you are working on. So you see there's a menu open, which is obviously a visual change we should decide on. And sometimes it could be that those differences are really small and difficult to see. So if this happens, you can highlight them in red if you want to.
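The two ways of using the custom command mentioned above could look roughly like the sketch below. The command name `cy.track` is taken from the talk; the name your plugin version registers may differ, so check its documentation. The route `/login`, the selector `.login-form` and the titles are hypothetical. `cy` is passed in as a parameter only so the sketch can be exercised with a stub; in a real Cypress spec, `cy` is a global.

```javascript
// Sketch of a visual check with a tracking command, assuming a custom command
// named `track` as mentioned in the talk. Route, selector and titles are made up.
function trackLoginScreens(cy) {
  cy.visit("/login");

  // No parameters besides the title: capture the complete page.
  cy.track("Login page");

  // Chained right after an element: capture only that element.
  cy.get(".login-form").track("Login form");
}
```

Inside a spec, the body of this function would typically live directly in an `it(...)` block.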
5. Challenges in Visual Testing
Visual testing works well for Vue applications, but there are difficulties and pitfalls. Time specifications can cause UI changes, but we can suppress them by freezing the time. Flakiness is a problem in visual testing, where tests fail or pass without any changes. We need to address this issue to improve the reliability of visual testing. Loading times are a common culprit in visual testing for Vue applications.
So you can see really precisely what the difference between those two screenshots is. Yeah, from my daily business, I can say that it works really fine when it comes to Vue applications and all of those tweaks they have and all of those behaviors they show in daily use. So it matches well. However, of course, you can imagine that there are some little things to take into account, because where there's light, there will be shadows. Especially when it comes to Vue and the way it renders elements and how components work, there might be some difficulties and pitfalls I experienced before.
And I'd like to take a quick look at those, with solutions, of course, so I won't leave you out in the open here. The first point is a typical trend when it comes to testing, and I think you have heard me ranting about time specifications before, but they are a thing in visual testing as well. Picture a dashboard you want to take a screenshot of, and there is a date in it, and you run, for example, visual testing on a nightly build, so the tests will be executed every night. So you will have a change of date from yesterday to today to tomorrow. Or, if those time specifications are even more precise, even hours, minutes or seconds. They will trigger a change in the UI, because another time is displayed. And this isn't something we need to be told, because we know that the day will change; we don't want to be annoyed by such notifications.
The easiest way to suppress those things is freezing the time on the client side and setting it ourselves to a fixed date or a fixed time, basically faking it. We can do this in Cypress, for example, with a custom command or the Cypress command cy.clock. So, like we do in this little snippet, we freeze the system to always the same time in January 2018 or something and continue like normal below. So, we will not be alerted by time changes now, but timing could be another point which leads to another problem, which is often seen in end-to-end testing, but also in others.
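Freezing the time as described above might look like the following sketch. It is not the snippet from the slides. It assumes Cypress' `cy.clock(now, functionNames)` command; the exact date, the helper name `visitWithFrozenClock` and the URL are illustrative. As before, `cy` is passed in only so the function can be tried with a stub.

```javascript
// Fixed point in time used for every visual test run: 1 January 2018, 00:00 UTC.
// The concrete date is an arbitrary choice for this sketch.
const FROZEN_NOW = Date.UTC(2018, 0, 1);

// Hypothetical helper: freeze the clock, then open the page.
function visitWithFrozenClock(cy, url) {
  // Set the clock before visiting, so the page sees the fake date while it loads.
  // Restricting the override to Date keeps setTimeout and setInterval running.
  cy.clock(FROZEN_NOW, ["Date"]);
  cy.visit(url);
}
```

Overriding only `Date` is a design choice: dates in the UI stay constant between nightly runs, while timers in the application keep working as usual.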
It's called flakiness. I love to use the story The Boy Who Cried Wolf by Aesop to showcase what flakiness is about. Just to rehash it real quick: it's a story about a boy who tends a flock of sheep, is bored, and plays pranks on the villagers, basically. He's calling, "Help, a wolf is attacking me", without a wolf being there, so actually giving a false alarm. And when the villagers come to defend him, they come for nothing and go away disappointed. Until there is a real wolf attack and no villagers come to defend the flock of sheep. So yeah, the boy cried wolf and nobody believes him anymore. Well, a liar will not be believed even when he speaks the truth, that's basically the lesson of it.
And I like to use this quote to explain flakiness too. A flaky test is a test which will fail or pass without any change in between. This could also be the case with visual testing. So sometimes you will be notified of a change, sometimes not, but you didn't do anything in between to cause it, right? So yeah, we really need to get rid of those, not only for end-to-end testing, but for visual testing too. And when it comes to visual testing in specific, and for Vue in specific, the main culprits, at least in my experience, are loading times.
6. Best Practices for Visual Testing
When taking snapshots for visual testing, ensure that your application is fully loaded and the appropriate rendering or UI changes have been made. Avoid using fixed wait times and instead wait for consistent snapshots. Be aware of false negatives, which can occur due to natural changes or changes that cannot be prevented. To handle these, configure the Visual Regression Tracker to ignore changes or use custom commands. However, be cautious when interfering with the application in tests and ensure that separate tests are written to verify the functionality being tested. For more best practices beyond visual testing, check out Marco's talk on writing good tests for Vue applications. Together with this talk, you'll be able to have wonderful tests that mimic human testing behavior and prevent errors caused by unforeseen side effects.
So basically, the question is: what happens if my website is still loading? What happens if my element didn't finish rendering yet, or if my user interface is still changing? Am I really taking the right snapshot at the right time? We really need to make sure to do exactly that. So making sure our application is ready to be screenshotted, because otherwise it might cause flakiness.
And the solution is rather simple, because it's the same solution as you would use for testing in general too. So use your assertions consciously and use them dynamically, not with fixed waiting times. Wait for consistent snapshots. Wait until all load times have been completed, until all the appropriate rendering or UI changes have been made, before you create the snapshot. And I know I'm really annoying in this regard, but it should be a general best practice not to use fixed waiting times, but to really wait until everything is properly done.
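Waiting dynamically instead of using fixed waiting times, as recommended above, could be sketched like this in Cypress. The `.should(...)` assertions are retried by Cypress until they pass or time out. The route and the selectors `.loading-spinner` and `.dashboard-chart` are hypothetical, `cy.track` is the command name used in the talk, and `cy` is a parameter only so the sketch can be tried with a stub.

```javascript
// Sketch: only take the screenshot once the page has settled.
// Selectors, route and title are made up for illustration.
function trackDashboardWhenReady(cy) {
  cy.visit("/dashboard");

  // Assertions are retried until they pass, so no fixed cy.wait(5000) is needed.
  cy.get(".loading-spinner").should("not.exist");
  cy.get(".dashboard-chart").should("be.visible");

  // Everything we depend on is rendered; now the snapshot is meaningful.
  cy.track("Dashboard");
}
```

Which elements to assert on depends on the page: pick the ones that appear last or whose late rendering has caused flaky screenshots before.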
Another point, last but not least, would be false negatives. And I think they are dangerous, because they make it look like your test is failing because something's broken, but your test fails without any errors being present. This can especially be the case for natural changes, which are not erroneous, and for changes that cannot be prevented. It may be, again, a time specification, if you have a read-only time which you cannot influence from the client. Or, my favorite example, which caused me some nightmares before. It's this one. It's an image in a login screen, taken from the Shopware 6 administration UI, which is basically for an online shop. I guess it's still true at the moment. It looks fairly harmless, but this image here depends on the time. So there will be a different image depending on the time of day, and it's randomly chosen from an image pool. So even at the same time, it could be a different image, thus causing all those notifications that something changed in the application. And we know that it's natural, but we don't want to be notified. Again, a false negative.
The solution for this would be making the test ignore the changes, maybe by using a pixel threshold if it's a rendering difference, by blurring it, or even by ignoring areas or elements. You can configure this in the Visual Regression Tracker service, or in the codebase if the Visual Regression Tracker is not enough to help you there. For this, I use my own custom command, where I actually take the image and set it to another fixed background image which is always the same. But we need to be really careful when it comes to those interferences, because this is actually what we do: we interfere with the app through the test. So if you do this, write a separate test of its own to make sure that, for example, the image or the image selection process is really working, so you don't hide an error just because you interfere with the application in the test. And document it, so that other developers looking at the test know that you are doing such things here.
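To illustrate the idea of a pixel threshold and ignored areas in isolation, here is a small sketch in plain JavaScript. It is not the Visual Regression Tracker's implementation or configuration format; the function `isWithinTolerance` and the option names `ignoreAreas` and `maxDiffPercent` are assumptions made up for this example, and screenshots are assumed to be raw RGBA buffers.

```javascript
// Decide whether two RGBA screenshots count as "the same", skipping ignored
// rectangles and tolerating a small percentage of changed pixels.
// Hypothetical helper and option names, for illustration only.
function isWithinTolerance(baseline, current, width, height, options = {}) {
  const { ignoreAreas = [], maxDiffPercent = 0 } = options;
  const isIgnored = (x, y) =>
    ignoreAreas.some(
      (area) =>
        x >= area.x && x < area.x + area.width &&
        y >= area.y && y < area.y + area.height
    );
  let compared = 0;
  let changed = 0;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      if (isIgnored(x, y)) continue; // e.g. the rotating login image
      compared++;
      const offset = (y * width + x) * 4;
      for (let channel = 0; channel < 4; channel++) {
        if (baseline[offset + channel] !== current[offset + channel]) {
          changed++;
          break; // count each pixel only once
        }
      }
    }
  }
  const diffPercent = compared === 0 ? 0 : (changed / compared) * 100;
  return { diffPercent, passed: diffPercent <= maxDiffPercent };
}

// Illustrative 2x2 white image where only the top-left pixel turns black.
const before = new Uint8Array(16).fill(255);
const after = Uint8Array.from(before);
after.set([0, 0, 0], 0);

const strict = isWithinTolerance(before, after, 2, 2);
const ignoringImage = isWithinTolerance(before, after, 2, 2, {
  ignoreAreas: [{ x: 0, y: 0, width: 1, height: 1 }],
});
console.log(strict.passed, ignoringImage.passed);
```

Ignoring an area hides every change inside it, including real bugs, which is why the separate test for the ignored behavior mentioned above matters.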
Okay, so this is about visual testing best practices, basically, or pitfalls I encountered. But if you want to learn more about best practices not limited to visual testing, please take a look at Marco's talk about writing good tests for Vue applications, because it covers the general area and not only visual testing. And if you didn't have the chance to see this talk yet, please check out the recording later on, it's really worth it. And together with that talk and my talk, we will be able to have wonderful tests. So our tests are now detectives in this regard, maybe Cypress or it could be anything else: Sherlock Cypress, Sherlock Playwright, Sherlock Selenium or WebDriver, whatever you use, because we make them be a bit more like the way we humans do testing. So not only taking a look at the things we describe, but also outside of the concept, looking left and right a little bit, and this really can be a lifesaver because it prevents errors caused by side effects you might not be aware of.
7. Key Lessons and Conclusion
There are four key lessons to remember: give your tests eyes, teach them to look beyond the given path, make them act like a kid doing a spot the difference puzzle, and use visual testing or screenshot comparison. If consistent results can't be obtained, it's okay to interfere with the test and cover the original with a suitable test. The Visual Regression Tracker is a recommended tool. Thank you for listening and feel free to reach out with questions.
If there are four lessons I want you to remember from this talk, it's these. Always try to give your tests eyes, teach them to look beyond the given path and make them act like a kid doing a spot the difference puzzle. This can be achieved through visual testing or, simply put, through screenshot comparison.
If you can't get consistent results, it's okay to interfere with your test. Just cover the original behavior with a suitable test so you don't hide any issues, but it's okay to do this. A tool as a starting point can be the Visual Regression Tracker, but it could be one of the other mentioned tools too.
Okay, what else to say than thank you. Thank you for listening to me. If you've got questions, please post them here, find me around here, or check me out on Twitter, LinkedIn, Mastodon; the handle can be found on this page. So basically wherever you found me.