Performance can make or break a website, but how can you quantify that? In this session we will look at the Core Web Vitals as a way to measure performance on the web. Specifically, we'll go through the history of web performance measurements, where the new metrics come from and how they are measured.
Core Web Vitals - What, Why and How?
AI Generated Video Summary
This Talk provides an introduction to the Core Web Vitals and their contribution to Google Search. It discusses the evolution of website performance metrics and the need to consider factors beyond the time to first byte. The concept of Core Web Vitals is introduced, consisting of three metrics: Largest Contentful Paint, First Input Delay, and Cumulative Layout Shift. The upcoming Page Experience signal, launching in May 2021, will combine Core Web Vitals with existing ranking signals. The Talk also addresses challenges in measuring performance and provides insights on layout stability and visual completeness.
1. Introduction to the Core Web Vitals
Hello and welcome to my session about the Core Web Vitals. We'll talk about web performance, the Core Web Vitals, and how they contribute to Google Search. Website performance is about quantifying whether a website is fast and delightful for users. It has evolved over time and continues to evolve as our understanding of web performance changes.
Hello and welcome to my session about the Core Web Vitals: their what, why, and how, more specifically. So this is a testing conference, and I'm always a little humbled to speak at testing conferences because I'm not that much into the testing space anymore. I do write tests when I write my code, but you all are probably more expert here than I am. Nonetheless, testing your website performance is an important thing, and the Core Web Vitals are a tool to accomplish exactly that.
So I think it makes sense to discuss these things. I'll look at three different things with you tonight. First things first, we'll talk about web performance, or what website performance actually is. Then we'll talk about the Core Web Vitals, and then we'll also talk about how the Core Web Vitals will contribute to Google Search in the form of the Page Experience signal launching in May. So there are some SEO, or search engine optimization, implications from this as well.
So let's start with what website performance is. Intuitively, we all know the answer to this question: is a website fast and delightful to use or not? But if you want to compare that between sites, and maybe even between different versions of the same site, it becomes a lot more tricky, because you want something that you can compare and track over time, and intuitive measurements don't really tick that box. The goal is to quantify it, to have some sort of number or metric that tells us if a website is fast and delightful for a user to use or not. As we will see in this talk, this has evolved over time and continues to evolve even today, as our understanding of what makes a website fast, performant, and delightful for users changes, and as the web and the kinds of websites we build change. There won't be an easy answer; that's the spoiler alert. But let's have a look at this.
2. Quantifying Web Page Performance
One of the earliest metrics to quantify web page performance is the time to first byte. However, this metric is no longer sufficient to determine if a website is fast and delightful. The website architecture has changed, and bandwidth and connection speeds are not the main bottleneck anymore. A better metric is the overall completeness of the response. For example, a slower website that delivers a more complete response is considered better than a faster website that delivers an incomplete response. Time to first byte is still useful in identifying connection issues, but other factors such as rendering speed should also be considered.
How could we quantify web page performance? One of the earliest metrics has probably been the time to first byte. We measure how long it takes for the first byte of the response to come back from the server to our computer or device, so that the browser can start parsing and eventually rendering the page.
And historically, this has made a lot of sense. With classical websites, like this example.com case, our browser makes a request, the web server responds with the HTML, and then the content becomes visible in the browser. There are a few factors that we can influence as website owners and developers to keep this fast. We can make sure that our server is fast, has enough memory and capacity, and has good network bandwidth. We can also make sure that the server is physically close enough, because it simply takes time for data, as electrical or light impulses, to travel. If I'm here in Switzerland and the server is in Australia, it might take a while until the request has made its way to Australia and the response comes back. It might be lost on the way and then have to be retransmitted. So this can take significantly longer than when the server is, for instance, in my own city. I live near a data center, so if the site is hosted there, the round trip takes basically no time at all, and the time to first byte will be a lot shorter than it would be with a server in Australia.
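In the browser, the time to first byte can be read from the Navigation Timing API. Here is a minimal sketch; the helper name `ttfbFromNavigationEntry` is mine, not part of any standard:

```javascript
// Time to first byte from a PerformanceNavigationTiming-like entry:
// responseStart is when the first byte of the response arrived,
// startTime is when the navigation began, both in milliseconds.
function ttfbFromNavigationEntry(entry) {
  return entry.responseStart - entry.startTime;
}

// In a real page you would feed it the live navigation entry, e.g.:
//   const [nav] = performance.getEntriesByType("navigation");
//   console.log("TTFB:", ttfbFromNavigationEntry(nav), "ms");
```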
3. Evaluating Website Speed Metrics
Looking only at the time to first byte is not sufficient to determine website speed. Many metrics, such as Speed Index and First Contentful Paint, are used to evaluate when elements start to appear and how long it takes to reach visual completion.
But the point still stands: that metric is not sufficient. If you just look at the time to first byte and say, what? My server responded in 0.1 seconds and the data was there in 0.5 seconds, how can this be slow? You might miss things. And that's why we have looked at many, many other metrics. For instance, Speed Index, where we try to figure out not just when the website is there or when the network part is mostly done, but when things start to pop up and how long it takes over time to near visual completion. Then we looked at First Contentful Paint: how long does it take to get the first bit of the content actually visible in the browser window?
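In the browser, paint timings like First Contentful Paint are reported through a `PerformanceObserver`. A small sketch; the helper `firstContentfulPaint` is my own name for the extraction step:

```javascript
// Pull the First Contentful Paint time out of a list of paint entries.
// Returns undefined until the browser has reported that paint.
function firstContentfulPaint(paintEntries) {
  const entry = paintEntries.find((e) => e.name === "first-contentful-paint");
  return entry ? entry.startTime : undefined;
}

// In the browser, paint entries arrive via a PerformanceObserver:
//   new PerformanceObserver((list) => {
//     const fcp = firstContentfulPaint(list.getEntries());
//     if (fcp !== undefined) console.log("FCP:", fcp, "ms");
//   }).observe({ type: "paint", buffered: true });
```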
4. Metrics and Interactivity
We had tons of metrics to track different aspects of web performance. It's not just about when stuff starts to show up. For example, imagine you have an online pizza shop. The menu loads quickly, but if your clicks are only processed seconds later and a hundred pizzas suddenly land in your cart, it's not delightful. We also consider metrics like Time to Interactive.
And then we had tons of other metrics over time to track different aspects. Because it's not just about when stuff starts to show up. Imagine you have an online shop for pizzas; you want pizza delivered to your place. The site shows up really quickly, the menu is there in no time, in the blink of an eye, fantastic. But then you're like, I want this pizza, I want this pizza, hello, I want this pizza. You click, click, click, click, click. And after five seconds, suddenly you have a hundred pizzas in your cart. That's not delightful either. So we also looked at other metrics like Time to Interactive: when can I actually start to interact with the content? And a lot of other metrics.
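That clicking lag is exactly what the browser's Event Timing API exposes, and what the First Input Delay metric introduced later in this talk is built on. A minimal sketch, assuming a `first-input` entry; the helper name `firstInputDelay` is mine:

```javascript
// First Input Delay for a first-input entry: the gap between the moment
// the user interacted (startTime) and the moment the browser was free to
// start running event handlers (processingStart), in milliseconds.
function firstInputDelay(entry) {
  return entry.processingStart - entry.startTime;
}

// Browser usage via the Event Timing API:
//   new PerformanceObserver((list) => {
//     for (const entry of list.getEntries()) {
//       console.log("FID:", firstInputDelay(entry), "ms");
//     }
//   }).observe({ type: "first-input", buffered: true });
```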
5. Evolving Performance Metrics and Challenges
Performance metrics have evolved over time, becoming more complex and difficult to communicate. Measuring performance is challenging, as metrics need to be stable, sensitive, and reflective of the user experience. It's important to find a metric that is comparable over time and stable, rather than too sensitive or rough. Additionally, metrics should be able to be generated in lab settings and gather real user data to account for different devices and connections. As performance understanding changes, metrics may fall out of favor, leading to fluctuations in scores and potential discomfort when reporting to stakeholders. One approach is to focus on obtaining vital signs for the web.
And this has evolved over time, at seemingly random intervals. At some point, someone said: actually, you know what, this metric doesn't really reflect what we looked for or what users experienced; here's a new metric. And then someone else said: that's a great metric, but it also needs to take this other aspect into account, for instance interactivity. And thus we don't look at only one metric, but at a set of metrics. Which, unfortunately, also makes this more complicated, not only to understand but also to communicate to others.
So if you are communicating with others and you say, you have a Lighthouse score of 100 out of 100, that really doesn't mean that much to them. Or they might say, oh, it's only 80 out of 100 possible points. That still does not really say much, because are these last 20 points really a problem, or is that just cosmetic?

So measuring performance is actually a big challenge, as it turns out. One thing is that we want metrics to be relatively stable, but also fine-grained and sensitive enough to really spot problems when they occur. We had a metric called First Meaningful Paint that tried to figure out when the meaningful part of the content was showing up, and it was usually very janky. That means you could measure the same website three times and get three different results without changing anything in the circumstances. That's not really helpful. You want a metric that is comparable over time and more or less stable. It will never be 100% stable, but you don't want to pick one that is too sensitive, and you don't want to pick one that is too rough, like time to first byte, which is a very broad, rough metric that doesn't really reflect things either. And these metrics really need to reflect the actual user experience. I said that already with the time to interact: we needed a way to actually track these things as well.

Also, we would like the data to be able to be generated in lab settings, where we can run these things automated and with things that are not necessarily public yet. If we can only gather real user data, that is a little tricky. But it would also be nice to get real user data, to get a feeling for how this actually looks on real people's machines and connections. Because we, with our high-end computers and good, stable, fast internet connections, might not be the target group of our website, and people on phones, on flaky connections, might have a very, very different experience.
So it would be cool if our metrics could be measured in both contexts.
Another big thing was that as we were changing the metrics, because our understanding of performance changed over time, you might end up working on improving a metric and then finding that this metric fell out of favour, because now we look at other metrics. So your score is constantly increasing, and then it drops, and you're like, why? Have we done something wrong? Which also puts you in the uncomfortable position that whoever you're reporting to might ask for these metrics, and if your numbers are now lower than they were when you started the initiative to make things faster, that might not be very comfortable. So we needed to change a few things. And we figured out that one approach would be to basically get vital signs for the web.
6. Introduction to Core Web Vitals
Core web vitals are three metrics that help web developers, testers, and SEOs measure and improve user experience in terms of performance. These metrics are already available in various Google tools, such as Lighthouse, Chrome DevTools, PageSpeed Insights, and Google Search Console. The thresholds for judging website performance are based on field data and are updated roughly every year.
These are called the Core Web Vitals. But what are they? They are basically three metrics for web developers, testers, and SEOs to look at to figure out what the user experience in terms of performance is on each page, to measure that, and to work reliably on improving it.
The thresholds that we use to judge whether websites are doing well, need improvement, or are doing poorly are based on field data that we gathered and analyzed. So these targets can be, and already are being, achieved by lots of websites. Even if you don't necessarily hit them yet, you can achieve them; that is definitely possible. And to fix the moving-goalposts issue, we will update them roughly every year. We published them in May last year, 2020, and we will probably give an update on them at Google I/O this year as well. After that it will again be a roughly yearly cadence for us to review the thresholds and the metrics.
7. Metrics of Core Web Vitals
The Core Web Vitals consist of three metrics. Largest Contentful Paint measures visual completeness, with a good value being less than 2.5 seconds. First Input Delay measures how long it takes for the page to respond to user input, with a target of under 100 milliseconds. Cumulative Layout Shift measures the stability of page content, with a value below 0.1 considered acceptable.
So what are these three metrics that make up the Core Web Vitals? The first one measures visual completeness. How long does it take until I actually see what I care about? What's the main content, and how long does it take to show up? That's measured by the Largest Contentful Paint metric. It's basically the visual loading time that we had measured with other things before. A good value is less than 2.5 seconds. If you are getting your main content visible within 4 seconds, that is in the needs improvement area. Everything that takes longer than 4 seconds has an impact on how users perceive your site, so we would recommend making the website faster then.
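Those thresholds can be expressed as a tiny helper. A sketch using the numbers from the talk; the function name `classifyLcp` is mine, not part of any API:

```javascript
// Bucket a Largest Contentful Paint value (in seconds) using the
// thresholds from the talk: good up to 2.5 s, needs improvement up
// to 4 s, poor beyond that.
function classifyLcp(seconds) {
  if (seconds <= 2.5) return "good";
  if (seconds <= 4) return "needs improvement";
  return "poor";
}
```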
Last but not least, we also want to make sure that the page content is visually stable. What does that mean? Well, it means: how much does it move around? You probably know this from your phone or computer: the website shows you a button and you want to interact with that button, but before you can click on it, something else moves, a new thing is there, you click on that, and you're like, oh no, I didn't want to interact with this. Why did that happen? We measure that with a new metric called Cumulative Layout Shift. It's basically how much of the content shifted and by how much it shifted. That value should be below 0.1, because everything between 0.1 and 0.25 is in the needs improvement range, and everything above that is definitely considered a problem.

What are these values? To be fair, I haven't really found the unit I should use, because it's not really percent, but the way you calculate it is: how much of the page is affected by the shift, and by how much does it shift. In this case, for instance, we have a website that has two halves, the gray half and the green half. After a while, a button pops into the middle of the page, which means the entire lower half shifts. So 50% of the page is affected by the shift. The button and the spacing it introduces make up roughly 14% of the page, so the content shifts by 14%. We can multiply the 50% affected area by the 14% shift, and that gives us 7%. It's not really percent, but 0.07 is the value we get if we multiply 0.5 by 0.14. And 0.07 would actually be within the acceptable range. Now assume that this button pops in at the top of the page instead; then everything on the page would shift.
8. Core Web Vitals and Page Experience
Learn about the Core Web Vitals, test your different page versions, integrate them into your automated flow, and check for any issues using the Search Console. The upcoming Page Experience signal, launching in May 2021, will combine existing ranking signals with the Core Web Vitals measurements. AMP will no longer be required for the top stories carousel. Don't worry about the Page Experience update, but ensure there are no issues with Core Web Vitals, mobile friendliness, or safe browsing. Connect with us on Twitter or check out our documentation for more information.
So 100% of it would shift. It would probably still shift by 14%, so that would be 0.14, which is already above the threshold. So you can see that both large amounts of space being taken up after the first rendering and a shift that moves everything on the page are taken into account as problematic shifts.
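The arithmetic in this example is just the affected fraction multiplied by the shift distance fraction. A minimal sketch of the talk's simplified calculation (the real metric, defined in the Layout Instability spec, derives these fractions from element geometry in the viewport):

```javascript
// Score of a single layout shift, per the talk's simplified model:
// the fraction of the page affected by the shift times the fraction
// it moved by.
function layoutShiftScore(impactFraction, distanceFraction) {
  return impactFraction * distanceFraction;
}

// The two cases from the example: half the page shifting by 14% stays
// within the acceptable range, the whole page shifting by 14% does not.
const midPageButton = layoutShiftScore(0.5, 0.14); // about 0.07, below 0.1
const topPageButton = layoutShiftScore(1.0, 0.14); // 0.14, above 0.1
```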
What can you do with regard to the Core Web Vitals? web.dev/vitals has lots of information. Learn about these metrics: understand what they do, how they are measured, and how you can improve on them. It's very useful to know how this works. Test all your different page versions. If you have a mobile, a desktop, and an AMP version of your content, then test all three of these, or whichever combination you have. Do look into integrating this into your automated flow, because that's really helpful. It also has, or will have, an impact on Google Search.
There will be a new signal called Page Experience. In May 2021, we want to launch this new ranking signal, which is composed from existing ranking signals. We take things like mobile friendliness, safe browsing, HTTPS, and intrusive interstitials, which are already ranking factors, replace the proprietary page speed measurement that we used in ranking with the Core Web Vitals measurements, and combine all of these into a signal called the Page Experience signal. Page speed and mobile user experience are not new ranking factors; they have been ranking factors before, so it's not something to worry about too much. It is just something to be aware of. Both the page speed, measured using the Core Web Vitals, and the other signals I just mentioned will form this new Page Experience signal that will launch in May.
One of the upsides is that once this Page Experience signal has launched, we know how fast different pages are, and AMP will no longer be a requirement to show up in the top stories carousel. If you are a news site and want to have your articles in the top stories carousel, once the Page Experience signal has launched, AMP will not be a requirement to show up in there anymore.
What can you do about these things? If people in your organization are worried about the Page Experience update, don't be. It is an update, but it's not the biggest we've done. Do check your pages. Make sure that there are no Core Web Vitals issues, no mobile friendliness issues, no safe browsing issues. You can use the Search Console, a free tool we put out there at search.google.com/search-console. Sign up, get a feeling for how your pages are doing, and use the Core Web Vitals report as well as the Mobile-Friendly Test to figure out where there are areas for improvement, and work on those. If you want to learn more, feel free to ping us on Twitter at @googlesearchc, or ping me at @g33konaut. You can also check out our documentation at developers.google.com/search, which has loads of information.
YouTube Channel and Q&A
We also run a YouTube channel with regular office hours at youtube.com/GoogleSearchCentral. Then we go into the questions from our audience. The first question is from our guest, Yanni. He's asking about visual completeness within 2.5 seconds versus the first input within 100 milliseconds. He wants to know whether it's feasible to have effective input if the main content is not loaded yet, and what exactly is measured by the first input metric.
And we also run a YouTube channel with regular office hours at youtube.com/GoogleSearchCentral. With that, I'd like to say thank you so much for watching and listening, and bring on your questions.
I'm really excited to hear what you are up to. Martin, hey, thanks for joining us. How are you doing? Hi there. Yeah, oh, I'm doing pretty well. I mean, all things considered, still doing well, I guess. Yeah, how are you doing? Good. Happy to hear that. Yeah, very well, very well. I can't complain. I mean, it's been a lovely two days here at TestJS Summit. So anything that happens in life, you forget with such nice conference days.
We're going to go into the questions from our audience. So strap yourselves in. We're going to go... Okay, the first question is from our guest, Yanni. And he's asking: visual completeness within 2.5 seconds, but first input within 100 milliseconds? Is it really feasible to have any sort of effective input if the main content is not loaded yet? Or does the first input metric measure the time between the user clicking and the input showing up? He doesn't quite understand what exactly is measured there.
Understanding CLS and Layout Stability
The CLS metric is not time-bound and can be challenging for single-page applications. Layout shifts during navigation can result in high CLS scores, even when there is no visual instability. Feedback on CLS is being collected through a survey, as the metric is due for a major rework. Page shifts caused by privacy/cookie notifications and banners can affect layout stability, so it's important to avoid shifting elements for a good score.
Again, it's not really easy to express that in the metrics yet, unfortunately. Well, you'll get there one day, Martin, I know you will. I'm not smart enough for that; that's something they need to work on, the smart people. Just act, just play the role.
Sure, sure, I'm on it. Next question, and I think that's the last question we have time for, is from Autogibbon: is the CLS metric time-bound? For example, I've seen some websites shift all the time while you're using them, and others spend the first few seconds of loading bouncing stuff around.
It is not time-bound, as far as I'm aware. That's actually one of the biggest complaints about it, because single-page applications currently have a bit of a hard time with CLS. Technically, when you navigate from one view to another, you have a huge layout shift, right? Pretty much everything on the page is affected and it shifts by a lot. So as CLS is measured throughout the page lifetime in the user's browser, you might see high CLS scores when there isn't really visual instability; it's just the way that single-page applications happen to work. If you go to Chrome devs, I think that's the account name, basically the Chrome developers Twitter account, you will find the link to our survey where you can give feedback on problems with CLS. I know that metric is definitely in for a major rework, because the lack of time boxing or time bounding on CLS can cause high CLS values where there isn't really a user experience problem.

Yeah, it feels like cheating.

Yeah. Like I said, that's all the time we have for our face-to-face Q&A. Oh, we have one more. What a quick one, Martin. The question is from Saf: how does page shift handle things like privacy/cookie notifications, banners and stuff?

So I had a look at that, and it depends on how it's implemented. If it is implemented outside of the rest of the layout flow, so if you're basically absolutely positioning things on top without other things shifting, that's not a layout shift. Unfortunately, many solutions on the market are implemented in a way that does shift things around, and then that is a problem. So don't shift things around if you want to have a good score.

If you want Martin's approval, don't shift things around. Well, thanks, Martin. For the rest of the questions, you're going to have to go to Martin's speaker room. He's going to go to the spatial chat; click the link below in the timetable and you'll find Martin there.
Martin, thanks. Yes! Love to see you again. Thank you very much. Hopping over to spatial chat. Thanks a lot.