Let’s Get Visual - Visual Testing in Your Vue.JS Project


Visual testing compares the appearance of your application with a previous state. If changes become visible, you can allow them or reject them. So you and your testers have eyes everywhere, without needing to check manually again and again. I've been using visual testing for a while, and it has saved my neck a few times. Let's look at my journey together and explore if and how visual testing can also help your projects.

22 min
15 May, 2023

AI Generated Video Summary

This Talk discusses the importance of fixing small UI errors and typos, as they can leave a negative impression and raise questions about trust in applications. Traditional testing methods may not catch all UI errors, so visual testing is introduced as a solution. The Visual Regression Tracker is recommended as a tool for managing visual test results. Best practices for visual testing include ensuring the application is fully loaded, addressing flakiness, and handling false negatives. The key lessons include giving tests eyes, looking beyond the given path, using visual testing, and covering the original with suitable tests if consistent results can't be obtained.

1. Introduction to Visual Testing

Short description:

Hello and welcome to my session at Vue.js Live. I'm Ramona Schwering, a software engineer at Shopware. I'll showcase the importance of fixing small UI errors and typos. These errors can leave a negative impression and raise questions about trust in applications. The phenomenon of inattentional blindness contributes to such errors.

Hello and welcome to my session here at Vue.js Live. I'm so glad to have you here and that you seem to be interested in learning more about visual testing for your application, because to be honest with you, it saved my neck a couple of times, and I hope I can give you the same experience, especially as testing can sometimes be a little daunting.

But well, before that: my name is Ramona Schwering. I'm working as a software engineer at Shopware, which is a company providing an open source ecommerce platform. There's a lot of Vue involved, so I've been working with Vue for three years now, I think. Apart from that, I became a Google Developer Expert in web technologies and a Cypress ambassador. And yeah, I guess you might not be surprised to hear that I'm especially known for testing, and I hope I can make testing accessible for anyone, and pain-free, or at least a little more pain-free, for everyone.

And without further ado, there's one point in testing which I'd like to showcase to you. I don't know if you're similar to me, but sometimes when I'm dealing with my cell phone, with applications, no matter if they're built with Vue or not, there are some bugs which I encounter often, and I'm not sure if it's just me being a perfectionist. They are bugs which are not release blockers: small user interface errors or typos. Just plain ugly, right? I think they are basically everywhere. And they leave a certain impression if you don't fix them.

See this one, which I took some time ago on my cell phone, where the string in the middle of it, keine Mitteilungen, or in English, no notifications, is clearly broken, right? And you can find this basically everywhere. It's also the case for larger companies like Google, where you have a button in the wrong location. Or take a look at this Facebook app, where the button has a completely wrong padding. There are so many examples I could showcase to you, but we have a certain time frame. To be honest, I sometimes feel a bit triggered by it, and it's not only my own perfectionism, I think. Because I wonder one thing. If you have an app or a website with lots of UI errors, which just looks broken or signals no sign of care, would you trust such an application with your credit card data, for example? Well, I'm not quite sure when it comes to my opinion. But I don't want to be too strict here either, because we're all humans, right? Behind every application, behind every website, there's a developer. And we humans sometimes do strange things. There's one phenomenon which is, at least in my opinion, one of the reasons why such errors occur. It's called inattentional blindness. It's well known in psychology, and it's depicted not only in psychology classes, but also in traffic-safety campaigns like the famous "Whodunnit?" awareness test from the UK. You can take a look at this video later on; I posted it here as a QR code. All of those videos, all of those campaigns, double down on the fact that a person fails to notice an unexpected stimulus in plain sight, solely because of a lack of attention and not because of any visual defects or deficits. Imagine a designer who builds a wonderful banner but doesn't notice there is a huge typo in the headline. Stuff like that.

2. The Limitations of Traditional Testing

Short description:

We have various types of testing like unit testing, integration testing, and end-to-end testing. However, these tests may not catch all UI errors and typos. End-to-end testing, for example, may miss issues that are outside of its scope. There is a need to give our tests eyes and introduce visual testing.

I guess everyone has been in such a situation, right? But don't we have testing for that? We have good test automation, right? We have unit testing, integration testing, end-to-end testing. Don't we? Shouldn't they catch that? And they do. But there is a catch, at least in my opinion. I would say they don't always catch it, because all of those testing types will only test what they are supposed to test. I like to phrase it as: end-to-end testing doesn't look left or right. So things can remain undetected if they are outside of the script, outside of the things you explicitly wrote down, right?

3. Visual Testing and Tools

Short description:

We need to give our tests eyes. Visual testing is like a spot the difference puzzle. It compares screenshots to find differences, allowing us to decide if they are intentional changes or visual bugs. While there are AI tools available, human review is still necessary. Open source solutions like the Visual Regression Tracker are worth exploring for managing visual tests.

We need to give our tests eyes, just as we humans use ours automatically, right? We see things; we don't always focus on just one spot. In other words, we want to do visual testing. So what does this mean exactly? What is visual testing and how does it work? Well, I don't know if you saw my promo video on Twitter, but I made a little cut in it, and it was completely intentional. Imagine my video, or those two frames from the video, as a spot-the-difference puzzle. I love those puzzles, and I'd love for you to find all the differences in this one. So, can you spot the differences in this little video here? I guess the most obvious one has already been found: the glasses. I changed my glasses throughout the video. Another one is the penguin here, which was an elephant before. And the box here had a different color; it was white in the beginning. And as I was editing my slides, I even saw a fourth difference: the GitHub logo on my arm was gone. Anyway, those are four differences we as humans can find really, really quickly, right? So I want a test to do exactly the same. I want the test to find those differences like a human would. And it can achieve that through, basically, a screenshot comparison. You take one screenshot from your master branch, or from anything else you consider the status quo, which is the correct way your application should look. So it defines what's correct. Then you take a new, current screenshot from your branch, from your new feature, whatever you are building right now, as long as it's taken from your Vue application. And then you compare those screenshots and highlight the differences, like a spot-the-difference puzzle, basically. When we have those differences, we can decide whether the change is desired, or intentional, or not. So was it a feature we built, some change we made in the UI, a difference we intended? Or is it a visual bug?
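The comparison itself boils down to a per-pixel diff of two same-sized images. As a rough illustration of the idea, and not the actual implementation of any of the tools mentioned, a naive diff over flat RGBA pixel buffers could look like this:

```javascript
// Naive per-pixel comparison of two same-sized RGBA images.
// `baseline` and `current` are flat arrays: [r, g, b, a, r, g, b, a, ...].
// Returns the indices of all pixels whose channel difference exceeds the threshold.
function diffPixels(baseline, current, threshold = 0) {
  if (baseline.length !== current.length) {
    throw new Error('Screenshots must have the same dimensions');
  }
  const changed = [];
  for (let p = 0; p < baseline.length; p += 4) {
    let delta = 0;
    for (let c = 0; c < 4; c++) {
      delta += Math.abs(baseline[p + c] - current[p + c]);
    }
    if (delta > threshold) changed.push(p / 4); // pixel index, not byte index
  }
  return changed;
}

// Two 2-pixel "screenshots": only the second pixel differs (white vs. red).
const baselineShot = [0, 0, 0, 255, 255, 255, 255, 255];
const currentShot  = [0, 0, 0, 255, 255, 0, 0, 255];
console.log(diffPixels(baselineShot, currentShot)); // → [1]
```

Real tools work on PNGs, apply perceptual color metrics and anti-aliasing detection, and render the changed pixels as the red highlight you see in the review UI, but the core idea is exactly this spot-the-difference loop.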
The test cannot completely decide this by itself yet. How could it? There are some tools which provide some AI, some machine learning, but in my opinion we still need a review done by an actual human who can tell whether a change is intentional or not. So, there are a couple of tools you could take a look at to implement such visual testing: maybe Applitools, which does provide some AI, or Percy, or Chromatic from Storybook. All wonderful tools. However, I'm especially interested in open source solutions, and there are many open source plugins and custom implementations you can take a look at. My favorite one is the Visual Regression Tracker. The Visual Regression Tracker is basically a tool for managing the results of visual tests, displaying those screenshot comparisons and giving you the opportunity to approve or reject such changes.

4. Using the Visual Regression Tracker

Short description:

The Visual Regression Tracker is a tool for managing visual test results. It allows you to compare screenshots and approve or reject changes. It's automation-framework-independent, supporting tools like Playwright, Cypress, and Selenium. It's open source and self-hosted, requiring only Docker. Installation is similar to adding a dependency, and using it in tests is done through a custom command. The service provides a UI for approving screenshots, comparing the baseline and new implementation. Differences can be highlighted for easier identification.

So, as I said, the Visual Regression Tracker is basically a tool for managing the results of visual tests, displaying those screenshot comparisons and giving you the opportunity to approve or reject such changes. I like that it's at least somewhat automation-framework-independent. So you can use Playwright with it, you can use Cypress with it, and I guess even Selenium and others.

I will use Cypress in this talk, for example, but this shouldn't be a roadblock if you don't use it. So, I will show you this Visual Regression Tracker, and I like it because it's open source and a self-hosted solution, so I can control everything I want to. It only requires Docker on your machine, and as I like Docker, that's a wonderful thing for me too.

So, let's stay with the Visual Regression Tracker for this talk. To get it installed, I will just cover it briefly so we don't waste a lot of time, but I could give you a more extended video if you're interested. After installing Docker, of course, you run this little script, which installs everything you need, and then you have the service up and running. No worries, we'll take a look at it later. From the point of view of your tests, it's just like installing another dependency; in my case, installing or importing a Cypress plugin. If I want to use the Visual Regression Tracker inside of my tests, it's only a custom command. You see the cy.track command here. As I don't use any filter or any parameters besides the title, I will capture the complete page here. But if you chain it right after an element, like I do in the second example, it will only take a screenshot of that one element. And then it will be handed to the Visual Regression Tracker service. This service looks like this: you have a little UI which helps you with the approval workflow for the screenshots, and you can accept or reject them. On the left, you will see the baseline, so the base screenshot for the comparison, what should be correct. Correct is, of course, a matter of definition; we assume that it's the correct state. And on the right side, there's the new screenshot from the new implementation, from the branch or whatever you are working on. So you see there's a menu open, which is obviously a visual change we should decide on. And sometimes those differences are really small and difficult to see. If this happens, you can highlight them in red if you want to.
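For reference, the Cypress plugin reads the connection details of your self-hosted service from a small config. The exact field names below are from my memory of the plugin's README, so double-check them against the Visual Regression Tracker documentation, but the shape is roughly:

```json
{
  "apiUrl": "http://localhost:4200",
  "project": "my-vue-app",
  "apiKey": "YOUR_API_KEY",
  "branchName": "main",
  "enableSoftAssert": false
}
```

The apiKey and project come from the Visual Regression Tracker UI after you create a project there; branchName is what the service uses to pick the right baseline.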

5. Challenges in Visual Testing

Short description:

Visual testing works well for view applications, but there are difficulties and pitfalls. Time specifications can cause UI changes, but we can suppress them by freezing the time. Flakiness is a problem in visual testing, where tests fail or pass without any changes. We need to address this issue to improve the reliability of visual testing. Loading times are a common culprit in visual testing for view applications.

So you can see exactly and really precisely what's different between those two screenshots. From my daily business, I can say that it works really well with Vue applications and all the dynamic behavior they show. It matches well. However, you can imagine that there are some little things to take into account, because where there's light, there's also shadow. Especially when it comes to Vue, the way it renders elements and how components work, there might be some difficulties and pitfalls I experienced before.

And I'd like to take a quick look at those, with solutions of course, so I won't leave you out in the open here. The first point is a typical suspect when it comes to testing, and I think you've heard me rant about time specifications before, but they are a thing in visual testing as well. Picture a dashboard you want to take a screenshot of, and there is a date in it, and you run your visual tests, for example, on a nightly build, so they will be executed every night. You will have a change of date from yesterday to today to tomorrow. Or, if your time specifications are even more precise, a change of hours, minutes or seconds. They will trigger a change in the UI, because a different time is displayed. And this isn't something we need to be told; we know that the day will change, and we don't want to be annoyed by such notifications. The easiest way to suppress them is freezing the time on the client side and setting it ourselves to a fixed date or time, basically faking it. We can do this in Cypress, for example, with the cy.clock command. So, like we do in this little snippet, we freeze the system to always the same time in January 2018 and continue like normal below. Now we will not be alerted by time changes. But timing can lead to another problem, one which is often seen in end-to-end testing, but also elsewhere.
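In Cypress, cy.clock() does this inside the browser. Conceptually, freezing the clock just means swapping the time source for a constant; here is a plain-JavaScript sketch of the same idea:

```javascript
// Replace Date.now with a fixed instant for the duration of `fn`,
// so anything that renders "the current time" becomes deterministic.
function withFrozenTime(fixedMs, fn) {
  const realNow = Date.now;
  Date.now = () => fixedMs; // every read now returns the same instant
  try {
    return fn();
  } finally {
    Date.now = realNow; // always restore the real clock
  }
}

const january2018 = Date.UTC(2018, 0, 1); // months are zero-based: 0 = January
const frozen = withFrozenTime(january2018, () => Date.now());
console.log(frozen === january2018); // → true
```

cy.clock additionally fakes setTimeout, setInterval and the Date constructor inside the application under test, which is why it's the better choice in actual Cypress tests.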

It's called flakiness. I love to use the fable The Boy Who Cried Wolf by Aesop to show what flakiness is about. Just to rehash it real quick: it's a story about a boy who tends a flock of sheep, is bored, and plays pranks on the villagers. He cries for help, claiming a wolf is attacking him, without a wolf being there, so it's actually a false alarm. And when the villagers come to defend him, they come for nothing and go away disappointed. Until there is a real wolf attack, and no villagers come to defend the flock of sheep. So yeah, the boy cried wolf, and nobody believes him anymore. "A liar will not be believed, even when he speaks the truth." That's basically the moral of it. And I like to use this quote to explain flakiness too. A flaky test is a test which will fail or pass without any change in between. This can also be the case with visual testing. Sometimes you will be notified of a change, sometimes not, but you didn't do anything in between to cause it, right? So we really need to get rid of those, if not for end-to-end testing, then for visual testing too. And when it comes to visual testing specifically, and to Vue specifically, the main culprit, at least in my experience, is loading times.

6. Best Practices for Visual Testing

Short description:

When taking snapshots for visual testing, ensure that your application is fully loaded and the appropriate rendering or UI changes have been made. Avoid using fixed wait times and instead wait for consistent snapshots. Be aware of false negatives, which can occur due to natural changes or changes that cannot be prevented. To handle these, configure the Visual Regression Tracker to ignore changes or use custom commands. However, be cautious when interfering with the application in tests and ensure that separate tests are written to verify the functionality being tested. For more best practices beyond visual testing, check out Marco's talk on writing good tests for UBI applications. Together with this talk, you'll be able to have wonderful tests that mimic human testing behavior and prevent errors caused by unforeseen side effects.

So basically the question is: what happens if my website is still loading? What happens if my element hasn't finished rendering yet, or if my user interface is still changing? Am I really taking the right snapshot at the right time? We need to make sure we do exactly that, making sure our application is ready to be screenshotted, because otherwise it might cause flakiness.

And the solution is rather simple, because it's the same solution you would use for functional testing too. Use your assertions consciously and dynamically, not with fixed waiting times. Wait for consistent snapshots. Wait until all loading has completed and all the appropriate rendering or UI changes have been made before you create the snapshot. And I know I'm really annoying in this regard, but it should be a general best practice: don't use fixed waiting times, really wait until everything is properly done.
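One way to phrase "wait for a consistent snapshot" in code: instead of a fixed sleep, poll some observable state until two consecutive reads agree. This is a simplified synchronous sketch of that retry idea; real helpers (and Cypress's built-in retry-ability) poll asynchronously with a timeout:

```javascript
// Re-read some state until two consecutive reads are identical,
// i.e. the UI has stopped changing; only then is it safe to snapshot.
function waitForStable(read, { retries = 10 } = {}) {
  let previous = read();
  for (let i = 0; i < retries; i++) {
    const next = read();
    if (next === previous) return next; // stable: two identical reads in a row
    previous = next;
  }
  throw new Error('UI never stabilized; refusing to take a flaky snapshot');
}

// Simulated loading sequence: a spinner that eventually settles.
const readings = ['loading', 'loading…', 'done', 'done'];
let i = 0;
const stable = waitForStable(() => readings[Math.min(i++, readings.length - 1)]);
console.log(stable); // → 'done'
```

The important property is that the helper fails loudly when the UI keeps changing, instead of silently screenshotting a half-rendered page.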

Another point, last but not least, would be false negatives. I think they are dangerous, because they make it look like your test is failing because something's broken, but your test fails without any error being present. This can especially be the case for natural changes, which are not erroneous, and changes that cannot be prevented. It may be, again, a time specification, if you have a displayed time which you cannot influence from the client. Or, my favorite example, which caused me some nightmares before: it's this one. It's an image on a login screen, taken from the Shopware 6 administration UI, which is basically the back office of an online shop. I guess it's still there at the moment. It looks fairly harmless, but this image depends on the time. There will be a different image depending on the time of day, and it's randomly chosen from an image pool. So even at the same time of day, it could be a different image, thus causing all those notifications that something changed in the application. We know that it's natural, but we don't want to be notified. Again, a false negative.

The solution for this would be making the test ignore those changes, maybe by using a pixel threshold if it's a rendering difference, blurring the area, or even ignoring whole areas or elements. You can configure this in the Visual Regression Tracker service, or in the codebase if the service's options are not enough to help you. For this case, I use my own custom command, where I actually take the image and set it to a fixed background image which is always the same. But we need to be really careful when it comes to such interference, because that is actually what we do: we interfere with the app through the test. So if you do this, write a separate test to make sure that, for example, the image or the image selection process is really working, so you don't hide an error just because you interfere with the application in the test. And document it, so that other developers who look at the test know that you are doing such things here.
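Ignoring an area before the comparison can be pictured as masking those pixels to a constant color in both screenshots, so whatever the random image shows can never produce a diff. This is a hypothetical sketch of that masking step over flat RGBA buffers; the Visual Regression Tracker lets you configure ignore areas in its UI instead of writing this yourself:

```javascript
// Paint a rectangular region of a flat RGBA pixel array solid black,
// so a dynamic area (random hero image, ads, timestamps) can't cause diffs.
function maskRegion(pixels, imageWidth, { x, y, w, h }) {
  const out = pixels.slice();
  for (let row = y; row < y + h; row++) {
    for (let col = x; col < x + w; col++) {
      const p = (row * imageWidth + col) * 4;
      out[p] = out[p + 1] = out[p + 2] = 0; // black out r, g, b
      out[p + 3] = 255;                     // keep fully opaque
    }
  }
  return out;
}

// Two 2×1 "screenshots" that differ only in their second pixel…
const baseline = [0, 0, 0, 255, 10, 20, 30, 255];
const current  = [0, 0, 0, 255, 90, 80, 70, 255];
// …but once that pixel is masked in both, they compare as identical.
const region = { x: 1, y: 0, w: 1, h: 1 };
const same = JSON.stringify(maskRegion(baseline, 2, region)) ===
             JSON.stringify(maskRegion(current, 2, region));
console.log(same); // → true
```

The same caveat from the talk applies: masking hides the area from the visual test, so the masked feature needs its own separate test.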

Okay, so that's visual testing best practices, basically, or the pitfalls I encountered. If you want to learn more about best practices not limited to visual testing, please take a look at Marco's talk about writing good tests for Vue applications, because it covers the general area and not only visual testing. And if you didn't have the chance to see that talk yet, please check out the recording later on; it's really worth it. Together with my talk, you will be able to have wonderful tests. Our tests are now detectives in this regard, maybe Cypress, or it could be anything else: Sherlock Cypress, Sherlock Playwright, Sherlock Selenium or WebDriver, whatever you use. Because we make them behave a bit more like the way we humans do testing: not only taking a look at the things we describe, but also looking left and right of the given path. And this really can be a lifesaver, because it prevents errors caused by side effects you might not be aware of.

7. Key Lessons and Conclusion

Short description:

There are four key lessons to remember: give your tests eyes, teach them to look beyond the given path, make them act like a kid doing a spot-the-difference puzzle, and use visual testing, i.e. screenshot comparison. If consistent results can't be obtained, it's okay to interfere with the test and cover the original behavior with a suitable test. The Visual Regression Tracker is a recommended tool. Thank you for listening, and feel free to reach out with questions.

If there are four lessons I want you to remember from this talk, it's those: always try to give your tests eyes, teach them to look beyond the given path, and make them act like a kid doing a spot-the-difference puzzle. This can be achieved through visual testing, or simply put, a screenshot comparison.

If you can't get consistent results, it's okay to interfere with your test. Cover the original behavior with a suitable test so you don't hide any issues, but it's okay to do this. A tool to start with can be the Visual Regression Tracker, but it could be the other mentioned tools too.

Okay, what else to say than thank you. Thank you for listening to me. If you have questions, please post them here, find me around, or reach out on Twitter, LinkedIn or Mastodon; the handles can be found on this page. So basically, wherever you find me.
