Transcription
Hey everyone, my name is Marie and welcome to my talk, A Medley of Frontend and Backend Performance Testing. Before I start, I want to tell you a story. This story is about Overcooked. I've been playing this game with my 5-year-old daughter, and if you're not familiar with Overcooked, it's a cooperative game where you have to work through unusual kitchen layouts and serve as much food as possible to customers. When the game starts, it's still pretty normal. Orders are flowing in nicely and you're getting tips because sushi orders are served on time. As the game gets harder and you get more orders than expected, the kitchen becomes overwhelmed, and without proper coordination and teamwork in place, the kitchen catches fire. You're not getting tips anymore and you've got hungry customers waiting impatiently for their food. Because the kitchen can't keep up with the overflowing orders from customers, the whole kitchen is now on fire. Of course, this is very dramatic, but you get the picture. The customers are very unhappy, you're getting negative tips, and the kitchen is such a mess that you can't even cook a single potato. Going back to my topic of performance testing, imagine you're trying to buy some items during Black Friday or Cyber Monday sales. You found an item that you really like, but suddenly the website you were using has crashed. It can't keep up with the overwhelming requests from different users arriving simultaneously. This is a very common phenomenon during Black Friday sales. You can see in this example graph that during the peak times of Black Friday sales, response times are significantly higher than in normal periods. This then results in response-time errors that can break your website. Most of the time, the response from companies is to buy more servers, thinking this will fix their performance problems, but it could end up costing them more money. 
A better investment is to understand, test, monitor, and make performance improvements to your own application. Now, let's get to the more serious part of this talk. In order to make sure that our users have a positive experience, we need to do performance testing. Performance testing is the practice of measuring and evaluating how your system responds under certain conditions. When we think of performance testing, we are concerned with the speed, reliability, and stability of the application under test. There is often a misconception that performance testing is all about load testing. Performance testing is the umbrella term for any type of performance test, while load testing is just one type. In a nutshell, load testing checks how your application behaves when it's exposed to a large number of concurrent virtual users sending multiple requests at a given time. Within load testing, there are also different variations such as stress testing, soak testing, and spike testing. Performance testing is typically divided into two areas. We have front-end or client-side performance testing, which is aimed at verifying how fast a user sees responses in the browser. It is concerned with the end-user experience of an application, which usually involves a browser. Front-end performance testing has metrics that are distinct from back-end performance testing: for example, how long it took the browser to render the entire page, or how long it took for the page to become fully interactive. On the other hand, we have back-end or server-side performance testing, which is aimed at ensuring that when multiple requests are sent from different users simultaneously, your back-end can handle the load accordingly. Example metrics here could be how long it took for a response to come back from a specific request, or how many failed requests we encountered. 
So as you can see, performance testing is not just about load testing. With different types of performance testing, you might wonder whether one is more important than another. The answer, as with everything, is that it depends. It always is. The golden rule of web performance states that 80 to 90% of the load time of a web page or application is spent in the front-end, while 10 to 20% is spent in the back-end. You can see in this image, which I took from Steve Souders' blog, that the average front-end time is significantly higher than the back-end timings. If you follow this golden rule and want to make performance improvements, it's always a great idea to start on the front-end and make small recommendations to your team. Performance testing on the front-end is also much closer to our users' experience. However, the golden rule of web performance is not always accurate. If a lot of traffic arrives at your website, the front-end response time can remain roughly similar, but once your back-end struggles with the increased concurrency, the back-end time will grow exponentially. Front-end performance testing is executed on the client side and is therefore limited in scope: it doesn't provide enough information about your entire application. Back-end performance testing is really useful for catching performance bottlenecks when your application servers are exposed to high levels of load. At the same time, front-end performance testing can catch browser-specific issues that are missed entirely at the protocol level. This is why a mixture of both is key. Moving on, I want to talk a bit about performance testing tools, because there's a variety of tools out there to support your performance testing needs. 
From a front-end perspective, tools such as Lighthouse, Google PageSpeed, Sitespeed.io, WebPageTest, and even your browser's developer tools can help. Other testing tools such as Playwright and Cypress also offer ways to measure front-end performance. On the back-end side, there's JMeter, k6, Gatling, Taurus, Locust, and Artillery, just to name a few. These tools predominantly perform load testing at the protocol level. So, as you can see, you would need a combination of different tools to test your front-end and back-end. But what if there were a single tool you could use for both? What if there were a tool that could combine a browser-based test with a protocol-level test, so you could understand how the front-end behaves during various performance events? This is where xk6-browser comes in. xk6-browser is an extension to k6 which brings browser automation and end-to-end web testing to k6 while supporting core k6 features. It adds browser-level scripting APIs to interact with real browsers and collect front-end metrics as part of your k6 tests. This gives you the ability to measure how your front-end behaves during certain events, which would be difficult to catch from the protocol level. xk6-browser, like k6, is written in Go, but the tests are written in JavaScript. It's also great news for Playwright users, because xk6-browser aims to provide rough compatibility with the Playwright API. xk6-browser is still in its very early stages, so it's been created as a k6 extension, which means it's not included in k6 core yet. To get started, you need to install xk6 first via Go, then build a custom version of k6 with the xk6-browser extension added to it. Since k6 tests are written in JavaScript, there will be some familiarity already. To demo a really simple test, I just want to visit a test URL. 
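A minimal sketch of that first script, based on the early xk6-browser API described in the talk (the `k6/x/browser` module with a Chromium-only launcher; the API has been evolving, so module paths and option names may differ in later releases). It only runs under an xk6-browser or custom-built k6 binary, not plain Node:

```javascript
import { chromium } from 'k6/x/browser';

// Virtual user code: everything in the default function is executed
// by each virtual user, once per iteration.
export default function () {
  // Launch a real Chromium instance; headless: false shows the window.
  const browser = chromium.launch({ headless: false });
  const page = browser.newPage();

  // Visit the test URL and wait until network traffic settles.
  page.goto('https://test.k6.io/', { waitUntil: 'networkidle' });

  page.close();
  browser.close();
}
```

Saved as, say, `examples/visit.js` (a hypothetical file name), this would run as `xk6-browser run examples/visit.js` with the standalone binary from the talk, or `./k6 run examples/visit.js` with a custom k6 build.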
To create that, first I need to import chromium from xk6-browser. At the moment, xk6-browser only supports Chromium-based browsers, but we have plans to support Firefox and WebKit as well. Next, I have my export default function, which is our virtual user code. Anything inside the default function will be executed by a virtual user again and again, depending on your configuration. For now, this will only be executed once. I'm telling chromium to launch a browser, and since I want to see the browser, I'm passing in headless as false. Then we're telling the browser to create a new page. To visit our test URL, I'm using the page.goto method, passing in my test URL, and then waiting until the network is idle. Finally, I'm closing both my page and my browser. To run a test in xk6-browser, we just need to use xk6-browser followed by the run command and then the file name. Let's see that in action by typing xk6-browser run followed by the file that I want to run. In this example, I've saved it in a folder called examples. You can see that it has opened up my Chrome browser and visited the page. When that's finished, k6 prints out a summary of performance testing metrics. Apart from the usual HTTP-specific metrics that k6 already tracks, there are a few browser performance metrics that are now also added, such as browser DOM content loaded, first contentful paint, first meaningful paint, and so on. For each of these metrics, you get an overview of the average time, the max response time, and even the 99th percentile, among others. This gives you insight into how performant your website is from a browser perspective. Let's make the script a bit more complex by automating a login flow. Let's say that I want to automate typing in my login name and password and then check that I have logged in successfully. 
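The login flow walked through next might look like the following sketch, adapted to the early xk6-browser API. The selectors, credentials, and the "Welcome, admin!" heading are assumptions based on the test.k6.io demo site the talk uses; again, this only runs under an xk6-browser/k6 binary:

```javascript
import { check } from 'k6';
import { chromium } from 'k6/x/browser';

export default function () {
  // slowMo slows every input action and navigation down so the
  // run is easier to follow in the visible browser window.
  const browser = chromium.launch({ headless: false, slowMo: '500ms' });
  const page = browser.newPage();

  page.goto('https://test.k6.io/my_messages.php', { waitUntil: 'networkidle' });

  // Fill in the login form.
  page.locator('input[name="login"]').type('admin');
  page.locator('input[name="password"]').type('123');

  // Clicking submit triggers a navigation, so wait for both the
  // click and the navigation to finish before asserting anything.
  Promise.all([
    page.waitForNavigation(),
    page.locator('input[type="submit"]').click(),
  ]).then(() => {
    // A k6 check records a pass/fail result in the end-of-test summary.
    check(page, {
      header: page.locator('h2').textContent() == 'Welcome, admin!',
    });
  }).finally(() => {
    page.close();
    browser.close();
  });
}
```

The Promise.all pattern mirrors Playwright's: start waiting for the navigation before the click that causes it, so the navigation isn't missed.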
Let's launch an instance of Chromium, and notice that apart from the headless option, I'm also passing in an option called slowMo, which slows down input actions and navigation by the specified time. Next, I'm creating a new page. I'm visiting the same test URL again and waiting until the network is idle. After that, I'm using page.locator, which can be interchanged with page.$, to interact with the given selectors and type the username and password. For some context, a lot of the operations in xk6-browser are currently synchronous. However, Playwright operations are async, so we're also working towards supporting async operations. Since clicking the submit button loads a different page, I also need to use page.waitForNavigation, because the page won't be ready until the navigation completes. Once all the promises have resolved, we can check that the new page has loaded by asserting that the text content is equivalent to what we expect. Then finally, I'm closing the page and the browser. To see that in action, let's run our tests. Compared to the previous execution, performance metrics are again reported, but there is also a check here indicating that the assertion passed. The real power of xk6-browser shines when it's combined with the existing features of k6. xk6-browser allows for mixing browser-level and protocol-level APIs. You can have a scenario where you simulate the bulk of your traffic with protocol-level virtual users, and at the same time have one or two virtual users accessing your website at the browser level to collect front-end metrics such as DOMContentLoaded or First Contentful Paint. To see that in code, first I need to import the relevant modules from k6 as well as xk6-browser. Next, I'm configuring the behavior of my tests using options, which is built into k6 already. Here I can run two scenarios concurrently. 
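A sketch of that two-scenario script, using the messages/news function names from the demo. The test.k6.io endpoints and the one-minute duration are assumptions for illustration; the scenarios/executor syntax is standard k6, while the browser part again needs the xk6-browser binary:

```javascript
import http from 'k6/http';
import { chromium } from 'k6/x/browser';

export const options = {
  scenarios: {
    // One browser-level VU collecting front-end metrics.
    browser: {
      executor: 'constant-vus',
      exec: 'messages',
      vus: 1,
      duration: '1m',
    },
    // Twenty protocol-level VUs generating the bulk of the load.
    news: {
      executor: 'constant-vus',
      exec: 'news',
      vus: 20,
      duration: '1m',
    },
  },
};

// Browser-level test: yields metrics such as DOMContentLoaded
// and First Contentful Paint.
export function messages() {
  const browser = chromium.launch({ headless: true });
  const page = browser.newPage();
  page.goto('https://test.k6.io/my_messages.php', { waitUntil: 'networkidle' });
  page.close();
  browser.close();
}

// Protocol-level test: plain HTTP requests, no browser involved.
export function news() {
  http.get('https://test.k6.io/news.php');
}
```

Both scenarios run concurrently against the same target, so the single browser VU measures the user experience while the protocol VUs supply the load.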
My first scenario is for my browser tests, while my second scenario is for my protocol tests. I'm using the constant-vus executor for both scenarios, which will introduce a constant number of virtual users to execute as many iterations as possible for a specified amount of time. In this example, I've set 1 VU for my browser test and 20 VUs for my protocol tests. Next, I have my messages function, which is my browser test, and my news function, which is my protocol test. It's all in one script, and this allows for greater collaboration amongst teams, because if you have back-end teams already using k6, front-end teams can collaborate more with them and be more effective when doing performance testing. Here is a sample output from the test run. You can see both scenarios are executed concurrently, with various performance metrics reported. Now, since I started my talk with Overcooked, I would also like to end it with Overcooked. If we want to have better kitchen coordination, handle high volumes of customer orders, make sure we don't have customers waiting impatiently for their food, and provide the best customer experience, we need the right blend of front-end and back-end performance testing. Just some final words before I finish. Since xk6-browser is still a fairly new tool, we need help from the community, so you're most welcome to try it out and give us feedback. Do check out our GitHub project, have a look at our examples, and play around with the tool. That's it for my talk at TestJS Summit. I hope you learned something new today, and thank you so much for listening. I want to go ahead and start by discussing the answer to your poll question. Earlier, you asked which type of testing k6 cannot do, since k6 is primarily known for load testing but can do a lot of other things. The correct answer is unit testing; unit testing is the only item on the list that you cannot do. Yeah. Yeah, that's right. 
It's really interesting to see the results here, because even though a lot of people voted for the right answer, there are still a few people who maybe aren't aware that k6 can also do browser testing, that k6 can also do chaos experiments, and even contract testing in the form of schema validation. The main type of testing that it can't really do is unit testing, because k6 is more of a black-box type of tool. It's not really suited for unit testing, because there you want to go deep into the individual functions, so you would want the tool to be part of your actual code. You can do it, but we don't really recommend using k6 for that, because obviously there are better tools available for that type of test. Yeah, it sounds really helpful, though, to have such a versatile tool in k6 and be able to cover all those different use cases without having to stand up separate tools for each of them. Yeah, really interesting stuff. I also wanted to ask you, going back to our first poll question for the audience about what your favorite testing framework is. And no pressure, but I'm curious what your answer is. Yeah, I'm not going to lie, my favorite framework is really Cypress. Before joining k6, I was a Cypress ambassador, and it was the first framework I used after my maternity leave. It was just really great in the sense that it was easy for me to get started, easy to install, and the visual test runner was really helpful when I needed to debug tests. So I think that developer experience is a really great thing that Cypress can offer. So yeah, I'm a big Cypress fan. Yeah, and you know me, I can't hide it either: obviously a really big Cypress fan. And talking about the different use cases and the developer experience, being able to leverage these tools for so many different types of things, Cypress and k6 are similar in that space. 
So we are having some questions come in from the audience, and we'll make sure that we answer those for you. The first one is about how browser-level performance testing differs from protocol-level. Yeah, so if I go back to one of my slides, I differentiated performance testing as either front-end or back-end performance testing. When we're talking about browser level, it's really about testing the front-end: we're trying to verify, for example, that our website is fast enough from the point of view of a single user. It also has distinct metrics, the browser performance metrics. An example metric could be how long it took for the browser to render an entire page, or how long it took for the page to become fully interactive. Protocol level, on the other hand, is the most common case for load testing APIs. With protocol level, rather than using a real browser, we're using protocols such as HTTP to simulate load. We're interested in testing our server response times and ensuring that when multiple requests are sent from different users simultaneously, our back-end servers and our databases can handle that load. When it comes to load testing, protocol level is obviously a really popular choice because it's less resource intensive. If you think about browser-level load testing, even though with xk6-browser we now have the capability to spin up browser-based virtual users, we still need to consider that it can be quite resource intensive. We don't really want to spin up a thousand browsers just to simulate a load test, because that could crash the machines we're running the test from. So we have to strike a balance between browser level and protocol level. Right. And that's something that we see a lot: testing can never be quite as real-world as production. 
And so it sounds like when you have separate metrics for browser and protocol, and you're testing them both, you can better identify where bugs are or understand where issues may arise, because you've already put the system through its paces, so to speak, in a test environment. And if something goes wrong in production, you can go back and evaluate it with that context already in place. Have you seen that be a benefit as well? Definitely. One of the use cases that I mentioned during the talk, and the real power we're trying to communicate with xk6-browser, is that you can run the bulk of your load testing at the protocol level. Let's say you want to create a thousand protocol-level requests. On its own, that wouldn't give you any insight into your user experience, for example, whether any loading spinners are taking a long time. So what you can do is, while those thousand protocol-level virtual users are running, have a couple of browser-level virtual users just to check the overall user experience as well. You can now have the full picture when it comes to performance, because rather than taking an isolated approach, you can check what's happening on your front-end while your back-end is exposed to this high number of requests. Yeah. And that actually leads into this next question from the audience. You may have touched on this a bit, but it states that there is an overhead involved in diagnosing CI failures for a single use case in end-to-end testing. With the k6 back-end, you have access to the CI report and you can see failures and checks and thresholds. 
But knowing that you may or may not have a CI diagnosis tool like Replay or the Cypress dashboard, when the k6 browser extension runs into issues in CI, how can you make that CI diagnosis aspect easy, or easier? That's a really great question, and I don't have a complete answer yet, because xk6-browser is still in a very early beta stage. But one of the things I know the team wants to achieve in the future is this: within k6 Cloud, we have a feature called performance insights. It can give you insights into which area has the worst bottlenecks. If your CPU has experienced very high utilization, performance insights will be able to tell you that, and suggest, for example, adding some think time, adding some sleeps in your test, because again, we want to simulate what's happening in production as closely as possible. We're still doing a lot of beta testing in k6 Cloud, so it's not open to the wider public yet, but that is something we will have. It can then give you a diagnosis of which areas have real performance bottlenecks. We already have that for existing k6 tests: performance insights gives you information on, for example, which servers have degraded because of the high load that you ran. We want to have the same sort of feature for xk6-browser, but for now it's not fully out in public yet. Well, it sounds really interesting to be able to get those kinds of insights from a CI environment. That's something we all struggle with: understanding what's happening in those spaces. Another question here from the audience is about the resulting output: is there any way to configure alerts when certain metrics increase? Yep. So you can use thresholds for that. 
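In k6, that takes the shape of a `thresholds` entry in the test options. A minimal sketch, using a hypothetical SLA of 500 milliseconds at the 95th percentile (matching the example discussed next) plus an error-rate bound, against the test.k6.io demo site:

```javascript
import http from 'k6/http';

export const options = {
  thresholds: {
    // Fail the whole run if the 95th-percentile request
    // duration reaches 500 ms...
    http_req_duration: ['p(95)<500'],
    // ...or if more than 1% of requests fail.
    http_req_failed: ['rate<0.01'],
  },
};

export default function () {
  http.get('https://test.k6.io/');
}
```

When a threshold is crossed, k6 reports the run as failed and exits with a non-zero exit code, which is what CI pipelines and notification hooks can key off.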
So what you can do is, let's say one of your SLAs is that a specific response time should be less than, say, 500 milliseconds at the 95th percentile. In k6, we have the concept of a threshold, which is a pass/fail criterion. If the threshold fails, then your test run is reported as a fail. In terms of notifications, I believe there is actually an extension that one of our k6 contributors has written. I'd have to come back with the actual name, but I've seen that there's a list of extensions in our xk6 ecosystem that you can use to send notifications to whichever platform you want. But yeah, the way to do it with k6 is to configure a threshold, and after the test has finished running, that threshold will say whether it met the criteria or not. Yeah. Okay. And that probably helps with establishing your metrics too, because you assign those thresholds in advance and everyone's on the same page about what acceptable performance is. So that's always a good discussion to have. Another question here is: can you test Kafka producers and consumers with k6? I believe so. We do have a k6 extension if you want to load test Kafka producers and consumers. I would have to refer you to one of my colleagues, because my main expertise is around browser automation, but yes, we can load test Kafka as well. Okay. Yeah. There's a follow-up question about whether you can fire custom events directly and test how the system reacts with Kafka producers and consumers, but maybe that might be a better question to hold off on. If this person sends me a message, I can point you to the right person from the k6 team who would be much better suited to answer. Yeah, absolutely. So we do have a very important question here, which is: which character do you main in Overcooked? Which character did I main? Yes. Yeah. 
Which character do you like to play in Overcooked? Oh, I don't know any of the names. I just let my daughter pick any random ones that she likes, but normally it's the cute animals. She doesn't like the one character who looks like an angry chef guy, so we tend to avoid using that character. But yeah, it's a really interesting game, because it's supposed to be a cooperative game, but playing with a five-year-old is very stressful, which is why I wanted to relate it to my talk about load testing: once the orders start to pile up, we can relate that to what happens if you haven't tested your servers properly. You just get a lot of queues, a lot of requests that haven't been processed, and ultimately it results in a bad user experience. Absolutely. I've played Overcooked as well, and it requires a lot of communication, right? If you think about the front-end and the back-end, it requires a lot of communication between you and the other person to really be able to handle that. So I thought that was very fun. Awesome. I don't have a main character either; I just pick different ones. So, back to another question about k6: what was the main motivation for introducing xk6-browser? Yeah. Just to give everyone some context, we have this golden rule of web performance. I talked about it during my talk as well, and basically it says that 80% of bottleneck issues actually happen on the front-end. k6 stayed away from real browsers at the beginning, because the team wanted to make sure that back-end performance testing was very stable. And I think now that k6 has definitely matured and is very stable, and we have a lot of support from the community, we want to shift our focus to front-end performance testing, because we know that in order to have a hybrid approach to performance testing, we can't just do it at the protocol level. We have to do both. 
So the main motivation really is to make sure that we provide our users a way of getting a full picture of their application's performance. You can do that already by plugging in different tools, but we want to see if we can offer a single tool that can do it, where you can use the same script and leverage the existing k6 features that our users are already relying on. Now we just want to focus on highlighting the front-end performance side as well. That's great. Yeah. And thank you for providing that context. So when it comes to other testing tools like Playwright and Cypress, how does xk6-browser compete with, complement, or work alongside those types of tools? Yeah. This is a really interesting point for me, because even when I started at k6, we had this thing called the week of browser testing. One of the things I spoke about there is that we don't want to compete with Playwright, and we don't want to compete with Cypress, because the message we want to share is that we want to provide a hybrid approach to performance testing. Yes, you can do performance testing with Playwright. Yes, you can do front-end performance testing with Cypress. But what if you want to use the same tool for the back-end as well? Then you would need other tools. So the angle we're trying to communicate is: if you want a single tool for a hybrid approach to performance testing, then xk6-browser with k6 can help with that. We're not looking to compete; we're solving a different problem that our users are facing. And I guess the more tools we can provide to our users that solve their problems, the better. Right. The answer is always it depends, right? It depends on the use case, the product that you're testing, everything like that. 
So, attendees, if you do have any additional questions, you can continue asking those on Discord in the Q&A production track. But we'll go ahead and wrap things up there. Marie, thank you so much for answering all these questions and for your really great and interesting talk. I know that I'm definitely interested in checking out more about xk6-browser and performance testing for the front-end and back-end. So thank you so much, Marie. Again, you can continue to ask questions of Marie in the Q&A production track channel on Discord. Thank you so much. Thank you so much.