You’re Probably Using Lighthouse Wrong: How We Got Tricked by a Single Magic Number


These days web performance is one of the most important things everyone wants to optimize in their apps, and it's clear to everyone how dramatic the impact of a poorly optimized website is on business. Yet we as an industry completely fail to recognize its complexity and widely misuse the most common tool to measure it: Google Lighthouse. If you’re one of those people thinking that good performance equals a good Lighthouse score, you’ve also fallen into this trap and this talk is for you.



Transcription


Hello! Hey, hey! How are you doing? Are you having fun? Okay, good, good, I'm happy because I'm having a lot of fun. It's great to be here again. I think it's the first time I'm here in Vue London, in Vue.js Live this time. And the title of my talk is You're Probably Using Lighthouse Wrong. And I know it sounds a little bit provoking, that was my intention, but I don't assume you're using it wrong. I really hope you're using it in the right way. But just in case you're using it wrong, here's my talk. So, my name is Filip Rakowski. I'm a chief developer experience officer and co-founder at Vue Storefront. I was introduced as a CTO because I used to be a CTO, but we hired a better CTO. So, right now, I can move to the things that I'm best at and that I enjoy, honestly, a little bit more. So, yeah, I'm also a technology council member of the MACH Alliance. Who here heard about the MACH Alliance? Okay. So, the MACH Alliance is an alliance of the biggest enterprise vendors that are modernizing the e-commerce landscape. And I'm extremely proud to represent it. And who heard about Vue Storefront? Please raise your hand. Nice, nice. It's getting better every year. So, you know, I work in the e-commerce industry. And I have worked in the e-commerce industry literally all my life. And building an e-commerce storefront is harder than it seems. Like, you feel powerful after displaying, you know, the first product, the first data on your website from one API endpoint. But the road from there to production is very long and it's often painful. And I can guarantee that you will feel physical pain once you learn what faceting is. So, the goal of Vue Storefront is actually to provide tools that save you from this pain. And Vue Storefront is open source, so you can check it on GitHub and give a star if you like it. I'm not encouraging, but, you know, it would be nice. And in the e-commerce industry, performance is one of the most important things to look at, really.
The fact that the way people look at this is often completely wrong is another topic, but that's what I'm going to address in this talk. So, Amazon did a study on that topic. And what they learned is that every 100 milliseconds of added page load time costs 1% of revenue. For Amazon, it's millions of dollars, really. 100 milliseconds. And speaking about numbers, if you need a good source of arguments for your boss, for example, to take care of performance because you know it's important but you need an argument, check out this website, WPO Stats, which stands for Web Performance Optimization Stats. It will give you a great, great insight into how optimizing performance helped other companies to grow their revenue. And it feels quite crazy from today's perspective, but five years ago, when we were writing the first lines of code for Vue Storefront, the topic of front-end performance was almost non-existent in the web developer space. At that time, JavaScript frameworks were just gaining traction. AngularJS, React, they were already well-established tools gaining in popularity every day. Vue, it was just getting the attention of the broader developer community, fighting with frameworks like this one. Do you know those frameworks? Raise your hands if you know all the frameworks, all the logos from this picture. Oh, really? It could have been Vue, but luckily, Vue made it to the third place. So, almost no one cared about how fast the websites built with those frameworks were. And of course, now everyone says that, you know, putting so much JavaScript on the front-end, it was a terrible idea. But honestly, it wasn't that clear at that time. It wasn't that clear because the reason why we are having performance issues with single page applications these days is the ecosystem and how much JavaScript you're adding through the ecosystem, not the frameworks themselves.
And, you know, as long as we were using PCs or laptops as our primary machines, which, believe me, like seven years ago was the normal way to consume the web, no one seemed to be concerned with the growing size of websites. Both CPU power and internet bandwidth were growing faster than websites were growing in size. It all changed when mobile phones started to become the preferred way of consuming the web. And according to Google research in 2017, it took on average 15 seconds to fully load a web page on a mobile phone. Imagine, 15 seconds. If I didn't have only 20 minutes, I would just wait to give you, you know, this perception. At that time, the awareness about the impact of this poor mobile performance on the business started to emerge. But we were still lacking an easy way to actually link those two components. So, performance and business metrics. And everything changed when Google Lighthouse started to gain popularity. So, I remember when it was first introduced in 2018, it became super popular, super rapidly adopted, the same way as progressive web apps. And at that time, everyone started to be obsessed about performance. Everyone started to be obsessed about progressive web apps. But you know what? They really didn't know much about those. It was all marketing from Google. And unfortunately, not that much has changed since then. So, what makes Lighthouse so widely adopted? I think it's simplicity. You run a test, you get a number between 0 and 100 that tells you how good or how bad the performance of your website is. Everyone, even those without any technical background, can understand that. And honestly, that's the root of the problem. Because the reality is not that simple. And web performance or user experience, it cannot be represented by, you know, just a single number on the screen. In addition, there are tons of nuances around Lighthouse. So, how it works, where the numbers come from, et cetera, et cetera.
And we'll navigate through all these nuances. And at the end of this talk, I hope you will feel that you know how Lighthouse works, where these numbers come from, when to trust it, and when not to trust it. But let's start with a simple question. What does Google Lighthouse measure, really? Can anyone tell me? What does Google Lighthouse measure? Okay. I have a feeling that a lot of people don't try to answer this question. So, they assume that the number has to be high to be right and when it is low, it is bad and that's it. That's all they need to know. And as we can read on the Google Lighthouse website, the goal of Lighthouse is to measure page quality. So, the audit divides quality into four different metrics: performance, accessibility, best practices, and SEO. All of those combined should give you a very good perspective on the quality of the website and, by that, try to accurately predict the user experience of this website. And "try to predict" is super important here because no audit will give you definitive answers on whether your user experience is good or bad. Guess what? Your users will. So, Google has always promoted this performance score as the most important one. And I think in the heads of the general audience and all of us, really, Lighthouse score equals performance score. So, quality page means page with, you know, a high performance score. And don't get me wrong here, performance is definitely a major factor influencing the user experience. But the truth is, not only this one; even all four metrics of Google Lighthouse won't tell you if it is good or bad. In reality, there are just so many factors influencing good user experience that it's impossible to tell that just by using any tool. So, without knowing the context, it is very easy to make bad decisions that logically could seem correct. So, let me give you an example. BMI. Who knows BMI? Do you know what it stands for? Body mass index.
And I think body mass index is exactly like Google Lighthouse and I will tell you why. So, does a BMI of 30 mean that you're obese? If you look at the chart, you can say, yeah, with full confidence, right? But when you dig a little bit deeper into the details of how the results should be interpreted, you learn that this scale doesn't work for a very large number of people. Older adults, women, muscular individuals, the interpretation for them is different. But the list doesn't end here. The interpretation for children and adolescents is also different. So, actually, the only group that comes to mind that actually fits into BMI is, I don't know, non-muscular, middle-aged males. That's it. So, the initial results could lead you to decisions that are bad. And if you don't dig into those specific details, you'll just make bad decisions. So, this kind of thinking could lead us to real disasters. For example, here on the chart, we can see that we can quickly jump to the conclusion that we can put an end to really horrible things just by banning worldwide cheese consumption. So, let me quickly explain how the Google Lighthouse score is calculated to make sure that we are all on the same page. The Lighthouse score is calculated from a bunch of other metrics. Each of them has its own weight. Some of them are more important, some of them are less important. And it's important to know that because when you want to optimize the score, you should start by optimizing the things that have the biggest weight, not because it will give you the best results but just because they are the most important ones. And the algorithm is changing with each version. So, it's a very important thing to acknowledge because if you are optimizing a website or if you are doing, I don't know, a migration or something like that, never compare the previous Lighthouse score with the new Lighthouse score.
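To make the weighting concrete, here is a minimal sketch of a weighted performance score. It assumes the per-metric scores are already on a 0 to 1 scale (Lighthouse derives those from raw timings with scoring curves that are not reproduced here) and uses the metric weights published for Lighthouse 8 and Lighthouse 10. The function name `performanceScore` and the sample values are illustrative, not taken from the talk.

```typescript
// Per-metric scores on a 0..1 scale; the raw-timing-to-score curves are out of scope here.
type MetricScores = Record<string, number>;
type Weights = Record<string, number>;

// Metric weights as published in the Lighthouse scoring documentation.
const WEIGHTS_V8: Weights = { fcp: 0.10, si: 0.10, lcp: 0.25, tti: 0.10, tbt: 0.30, cls: 0.15 };
const WEIGHTS_V10: Weights = { fcp: 0.10, si: 0.10, lcp: 0.25, tbt: 0.30, cls: 0.25 };

// Weighted average of the metric scores, scaled to 0..100 and rounded.
function performanceScore(scores: MetricScores, weights: Weights): number {
  let total = 0;
  let weightSum = 0;
  for (const [metric, weight] of Object.entries(weights)) {
    const score = scores[metric];
    if (score === undefined) throw new Error(`missing score for ${metric}`);
    total += score * weight;
    weightSum += weight;
  }
  return Math.round((total / weightSum) * 100);
}

// Hypothetical metric scores for one and the same page.
const sample: MetricScores = { fcp: 0.9, si: 0.8, lcp: 0.6, tti: 0.5, tbt: 0.4, cls: 1.0 };

console.log(performanceScore(sample, WEIGHTS_V8));  // 64
console.log(performanceScore(sample, WEIGHTS_V10)); // 69
```

The same page with the same metric scores moves from 64 to 69 purely because the weights changed between versions, which is why scores produced by different Lighthouse versions are not comparable.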
Scores from before and after are not the right thing to compare because, most likely, the scoring algorithm has changed during that time. And, you know, the score could decrease or increase mostly because the algorithm has changed, not because you have changed something. Okay. So, we already know how the score is calculated. This is like 50% of the truth. But what we don't know yet is where this number is exactly coming from. So, we know how it is calculated. We don't know the environment. And the environment you run the test in is extremely, extremely important. So, most people use Chrome DevTools to measure the performance, so to run the Lighthouse audits. And I will tell you this, this is the least reliable way of doing that. Okay. So, raise your hand if you are doing this. So, don't, please. Because there are multiple external factors that are actually influencing your score. One of them is your network. Another one is your CPU. Another one is extensions. Yes, if you have any extensions, they are also included in the Lighthouse audit. For example, if you have Grammarly, you can take a look. It is adding something at the very bottom of your page. So, we can decrease the impact of those by running it in incognito mode, by applying proper throttling, et cetera. But this is still going to be different on different devices. So, let's say we have 20 people in the company and everyone runs the Lighthouse audit; the scores would be completely different. And you can have much more consistent results if you implement something like Lighthouse CI. It's, well, Lighthouse that is running in your CI on every pull request. Or use PageSpeed Insights. What is PageSpeed Insights? Who heard? Good, good, good. Because that would be my recommendation. It's a website made by Google itself and you can use it to perform very quick Lighthouse audits on any website remotely in one of the Google data centers.
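Because a single run is noisy in any environment, a common way to get a steadier number is to repeat the audit and report the median instead of one result. A minimal sketch with made-up scores follows; the helper names `median` and `spread` are illustrative and not part of Lighthouse.

```typescript
// Median of a list of scores; less sensitive to one unusually slow or fast run than the mean.
function median(values: number[]): number {
  if (values.length === 0) throw new Error("need at least one run");
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Distance between the best and the worst run, as a rough indicator of noise.
function spread(values: number[]): number {
  return Math.max(...values) - Math.min(...values);
}

// Hypothetical performance scores from five runs against the same, unchanged page.
const runs = [58, 71, 64, 66, 49];

console.log(median(runs)); // 64
console.log(spread(runs)); // 22
```

A spread of 22 points on an unchanged page means that a difference of a few points between two single runs says very little.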
So, PageSpeed Insights usually uses the data center that is the closest to your location. Sometimes it could use a different one, for example, if the one that it would normally use is under heavy load. So, you could have slightly different results between runs. In addition, please keep in mind that on your website you probably consume multiple APIs, and they could also have inconsistencies. So, really, nitpicking on the Lighthouse score is not the point because it will always be different. There will always be some differences between runs and those differences are coming from the, you know, dependencies, for example. And so, even though the PageSpeed Insights score will be more consistent, it's still far from being, you know, a good representation of how your users are actually experiencing your website because this audit is running on an emulated budget Motorola. So, unless all of your users are using a budget Motorola, it's not a very good representation of how they are actually experiencing your website. The good news is PageSpeed Insights will also tell you how your website performs in the real world. So, at the very top of every audit, you will see three metrics called Core Web Vitals that have to be green to positively impact your SEO results. Yes, you can get an SEO boost if you have green Core Web Vitals. And the boost is individual for each of the metrics. And the three others at the bottom are also quite important, especially Interaction to Next Paint, which will become a new Core Web Vital in March of the next year. And another problem with Lighthouse is that it is just an algorithm. So, you just learned how it works, right? And if you know how something works, then you can cheat it. It gets an input, it gives an output, right? You can easily find a lot of articles that are actually showing how you can build the least accessible website in the world with a 100 Lighthouse score. But you can do even more.
It's equally easy to cheat the performance score. You can detect the Lighthouse user agent and serve a completely different version of your website to the auditing tool. So, the thing that I've seen, and I've seen it unfortunately more than once, is that people are removing the script tags from a server-side rendered application and serving this to the Lighthouse audit. And because JavaScript is usually the main bottleneck, the main reason why we're having bad performance scores, then it's magically super high. But it has nothing to do with the real user experience. And if you think about this, it's just stupid because Lighthouse became so important that we are cheating it for really no reason. Like, there's no benefit from that except for lying to your customer, for example. So, after hearing my presentation, you can get the impression that in my opinion, Lighthouse is completely useless. And this is definitely not my point. I think Lighthouse is a wonderful tool, really wonderful tool. And, you know, it contributed a lot to a better, faster web. But the problem is not in the tool itself. It's more in the way we are using it or misusing it. Because the name Lighthouse has its purpose. Its goal is to guide you on improving page quality, not to give you definitive answers on whether it is good or bad. And I've seen websites with great user experience and a low Lighthouse score, and the opposite. Good performance is a tradeoff. You have to always keep that in mind. So, 100 is never a goal because you always have to sacrifice something to make the performance better. Sometimes it's an analytics script, sometimes it's a feature, and it's not always a good business decision, really, to get rid of some of them. So, remember, it's a tradeoff. Don't treat performance as the ultimate goal because you might end up with a website that is only displaying text without CSS because that would be the perfect website for Lighthouse, right?
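For reference, "green" for the Core Web Vitals mentioned above has a concrete meaning: Google publishes "good" and "poor" thresholds for each metric and assesses them at the 75th percentile of real-user visits. A minimal sketch of that check follows. The function names and the sample field values are hypothetical, while the thresholds are the published ones (LCP 2.5 s / 4 s, FID 100 ms / 300 ms, INP 200 ms / 500 ms, CLS 0.1 / 0.25).

```typescript
type Rating = "good" | "needs-improvement" | "poor";

interface Threshold {
  good: number; // values up to and including this are "good"
  poor: number; // values above this are "poor"
}

// Published Core Web Vitals thresholds; timings in milliseconds, CLS is unitless.
const THRESHOLDS: Record<string, Threshold> = {
  lcp: { good: 2500, poor: 4000 },
  fid: { good: 100, poor: 300 },
  inp: { good: 200, poor: 500 },
  cls: { good: 0.1, poor: 0.25 },
};

// Rate a single metric from its 75th-percentile field value.
function rate(metric: string, p75: number): Rating {
  const t = THRESHOLDS[metric];
  if (!t) throw new Error(`unknown metric ${metric}`);
  if (p75 <= t.good) return "good";
  if (p75 <= t.poor) return "needs-improvement";
  return "poor";
}

// A page passes only if every supplied metric is rated "good".
function passesCoreWebVitals(field: Record<string, number>): boolean {
  return Object.entries(field).every(([metric, value]) => rate(metric, value) === "good");
}

// Hypothetical p75 field values for a page whose lab score might still look fine.
const sampleField = { lcp: 3800, fid: 40, cls: 0.18 };

console.log(rate("lcp", sampleField.lcp));     // "needs-improvement"
console.log(passesCoreWebVitals(sampleField)); // false
```

The check uses only field values, so a page can fail it regardless of what its synthetic Lighthouse score says.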
And now, when is it worth using Lighthouse? So, to me, Lighthouse shines the most when we want to quickly compare different versions of our website to see if there was an improvement or maybe a decrease in its performance. So, it's definitely worth implementing Lighthouse checks in your CI/CD pipelines using Lighthouse CI. And you should also audit websites with similar complexity, or your competitors, to get, like, a realistic perspective of what is a good score and what is a bad score. Because for a blog, everything below 100 is a disgrace, really. It's a shame. But for an e-commerce website, 60, it's a good score. It's much more complex. It requires much more analytics. It's just this type of website where performance shouldn't be the highest priority. It should be extremely high because, as we learned, we are losing money. But at the same time, the user experience is not only performance. And Lighthouse is definitely not a good tool to measure the actual user experience because it's synthetic data. It has nothing to do with how your users are experiencing your website. And to understand how your users are experiencing your websites, well, try talking to your users. You could be surprised about how they are experiencing it. And you really don't have to set up any additional monitoring tools to check that. If you audit your page on PageSpeed Insights, at the very top, you will see how it is scoring against those four, well, three most important metrics. So, you could think, okay, where is the data actually coming from, right? This is an old picture of the previous results of PageSpeed Insights. Right now, we have three at the top, but I think it's irrelevant. So, where does the data come from? The data comes from the last 30 days and is collected on the devices using Chrome. It's very important technology. So, it doesn't include other browsers. And if you're using Chrome, actually, you're automatically sending this data to Google.
So, this is how they know how the website is used. And the only requirement for actually collecting this data from any website is to make it crawlable. So, if your website is crawlable, then you will see the real-world data. Nothing else you have to set up. And this data, if you go a bit deeper, comes from something that is called the Chrome User Experience Report, CrUX in short, which collects performance metrics from all of those devices using Google Chrome. And you can also get access to the history of your metrics in the CrUX dashboard in Google Data Studio. It's all free. It can generate really nice reports that are showing how your performance was changing over time, how it is changing, you know, depending on the device, et cetera, et cetera. Extremely useful. And again, the only thing that you need to do is make your website crawlable, then the data is collected. Now, before I finish, I keep talking about this difference between so-called lab data, measured in a specific environment, and the real-world data of your users. And it's important not to optimize only for the former. And here is why. If you do a PSI benchmark on a few websites, you could often see this picture. What is wrong in here? Who can tell me? Anyone? Okay. So, this is an e-commerce website. And we see that the synthetic score is 81. I said 60 is good for an e-commerce website. So, 81 is amazing, right? But if you look at how it translates to the real user experience, we see that it is terrible. Like, all of the Core Web Vitals are not passed. And if we look at things like Interaction to Next Paint, it's terrible. So, this is just an example of what could happen if you only optimize for the Lighthouse score. It could be completely mismatched with how your users are experiencing your website. And that's all I have prepared for today. So, I hope you learned something. If we have any e-commerce geeks in here, I'm running a newsletter about this.
So, you can check that. And if we have any web performance geeks, I'm posting a lot of tips on Twitter, if you're interested. So, thank you so much. Thank you for your time. Thank you for having me. Thank you so much.
Please step into my office. Really, really enjoyed the talk. I've learned a lot about Lighthouse today as well myself. It's really funny, actually, because I thought we'd be able to get through the whole of today without really talking about AI. I know we did Copilot. But the top question right now is what about a machine learning approach to evaluating a website? That's a broad question. Do you see maybe any uses for...
So, here's the thing. I mean, it's not that much connected to performance, but it's actually connected to the things that I'm working on in the e-commerce space. So, you could use AI algorithms, for example, for A/B tests. You could use AI, for example, to see how your business metrics and your performance metrics are correlating. And by doing that, you can predict that, okay, so perhaps I should increase performance a little bit because historically it gave me a better revenue increase or things like that. But yeah, that's all I can say about that.
I agree. I feel like there's going to probably be more tools that come out as time goes on. We'll see. And this is another good question that came in from Alvaro, which is how can you measure the performance of pages that require authentication first? We currently rely on DevTools Lighthouse because PageSpeed Insights does not support this case.
Yeah. So, for that, I would suggest to just have a separate version of the website that you can measure synthetically. And unfortunately, if you have pages that require authentication, they just cannot be measured that much. But at the same time, if you have websites that require authentication, these are probably like dashboards or B2B applications. And for those, the performance is usually not that much of a concern.
I mean, it's much less important than, for example, in e-commerce. But yeah, it's an issue. It's an issue. You can do some things around that. You can set up an alternative version of your website, even have some traffic on it, like synthetic traffic. But it's just hard.
Nice, nice. And we have another question as well, which is what about the use of webpagetest.org and the role that that could play for performance testing in the real world?
I am absolutely a fan of WebPageTest. So, I was talking about Lighthouse, but this is not my go-to tool for measuring performance. My go-to tool is actually WebPageTest. And just try it out. You will see how many insights you're getting, how many very actionable tips you're getting from it. Also, another one I would recommend is Yellow Labs. So, Google it. Yellow Lab, actually. So, Yellow Lab, I think, is the most actionable one. If you run an audit, you will quickly see why I'm saying that. So, yeah.
Nice, nice. And another one as well is any ideas on how to automate the synthetic measurement of the UX, of the user experience?
Of the user experience? I have no idea how you measure synthetic user experience. But if you meant performance, then I mentioned, I think, twice a tool called Lighthouse CI. So, this is basically a very simple GitHub action. Implementing it is really not hard. Also, Jakub Andrzejewski, who was talking like 30 minutes ago, he has a very nice article on how to add that into your CI/CD pipelines. So, check that out. And this is actually great to compare how the performance was changing from one deploy to another. But again, like the numbers that the real users are getting could be completely different.
Nice. Thank you. And another question we've got coming in, which is, what would you say is a good budget from a JavaScript size perspective for a website nowadays?
Well, it used to be 170 kilobytes, which for today's standards, I mean, JavaScript budget. Yeah.
Oh, ideally zero. Like, honestly, like, the biggest cause of all performance bottlenecks these days is JavaScript. And if you take a look at how the frameworks are evolving, you will see that they're all evolving towards server-side execution. There is a reason for that. I would say that the budget, well, it depends also on what you're building, right? For an e-commerce website, I would say that you shouldn't exceed like 70, 80 kilobytes gzipped. For a blog, ideally zero, and lazy load everything that is interactive. But really don't think about those things as, you know, a budget that you have to fill. Try to have as little JavaScript as possible. Try to lazy load, try to execute things on a server and just put the static files on the front end. Yeah.
Nice. We're going to go to the last question. I know there's a couple more. So remember, if you are online, there is the speaker Q&A room. And if you are here, you can go upstairs and talk to Filip as well. But for the last one, Google gave us Lighthouse, but maybe if they're not optimizing Google Tag Manager, Analytics, and some of their other sort of third-party libraries and stuff, do you have any advice for third-party libraries and performance and how to maybe still increase performance when you're depending on these other libraries?
Yeah. So, first of all, like there is a very nice tool called Partytown. Who heard about Partytown? Yes, from builder.io. So what Partytown is doing is actually allowing you to run those analytics scripts in a web worker. This way you're not impacting the main thread. So they're just loading in parallel. But sometimes it could also impact your analytics data. So make sure that the analytics tool that you're using is on the list of tools supported by Partytown. And also, we've heard today that Daniel will be proposing an RFC of Nuxt Script. Nuxt Script, like Next Script, is actually a way to also load your scripts either in a web worker or in an asynchronous way. So take a look at that.
Basically, the worst thing you can do is to load your analytics scripts together with everything that is in the main thread because they will impact the user experience. Try to have them asynchronous. Try to have them in parallel, but never in the main thread.
That's amazing. I love when talks connect to each other. Thank you so much for sitting and talking with us today. And thank you so much. Give them a round of applause.
29 min
12 May, 2023
