1. Optimizations Backfire: The CDN Dilemma
The talk discusses cases where common optimizations can make an app slower instead of faster. The speaker shares their experience implementing a CDN to improve the performance of a web app; performance tests revealed that the site became slower after adding it. The speaker explains how the Lighthouse score is calculated from several metrics and highlights that while most metrics stayed the same, the First Contentful Paint metric worsened after adding the CDN. They emphasize the need to analyze the frame-by-frame rendering and the network waterfall to understand the impact of optimizations: in this case, the responses took less time with the CDN, but the First Contentful Paint metric suffered.
So, yeah, the subject of my talk is when optimizations backfire. It's about cases when you achieve some common optimization but it, in turn, makes the app slower, not faster.
So first, let's talk about CDNs. So who here uses a CDN in production? Hm, quite a lot of people. Nice. So one time, a while ago, I was a developer on a big complex web app, and the app was slow. Slow apps aren't nice, so we had a project to make the app faster, and as part of that project, we decided to ship a CDN.
Now, a CDN is a service that brings your files closer to the user, right? If your server is in the US, and the user is in India with a CDN, the request for the bundle does not need to go all the way back to the US. It can just go to the CDN server close to the user in India. So our server was in the US, and our users were all around the world. So we decided to try this optimization. We configured the buckets to upload built results into, we configured the CDN in front of that, and we pointed all URLs in the app to that CDN server. Basically, our index.html, instead of looking like this, started looking like this, with the CDN origin in front of all the URLs.
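Roughly, the change looked something like this (the file names and CDN origin below are made-up placeholders, not the actual app's):

```html
<!-- Before: assets served from our own origin -->
<link rel="stylesheet" href="/assets/styles.css">
<script src="/assets/bundle.js"></script>

<!-- After: the same assets, pointed at the CDN domain -->
<link rel="stylesheet" href="https://cdn.example.com/assets/styles.css">
<script src="https://cdn.example.com/assets/bundle.js"></script>
```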
So far so good, right? But when we ran performance tests, we discovered that suddenly the site became not faster, but slower. By the way, does anyone have an idea why? Based on the change we made, does anyone have an idea why this happens? No hands. Good. Well, good for me. So let's try to investigate this, as if I were investigating it today, with the knowledge and the methods I have now.
So what's important to know about the Lighthouse score is that it's not just an abstract performance score, right? It's calculated based on these five metrics. If one metric gets worse, the whole score gets worse. If one metric gets better, the whole score gets better. There's even a calculator for this. Now, most metrics in these two tests, before adding the CDN and after adding it, stayed more or less the same. Largest Contentful Paint even got better, which is good, but one metric, First Contentful Paint, got significantly worse. First Contentful Paint is a metric that measures how quickly the first content renders. It's the time from the moment the site started loading to the moment the site rendered its first content, and it got worse.
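As a rough sketch, the score is a weighted average of the individual metric scores. The weights below approximate Lighthouse v10's and are an assumption on my part; the official calculator has the exact values:

```javascript
// Approximate Lighthouse v10 metric weights (assumption; see the official
// scoring calculator for exact values).
const weights = { fcp: 0.10, si: 0.10, lcp: 0.25, tbt: 0.30, cls: 0.25 };

// Each metric is first normalized to a 0..1 score, then weighted.
function performanceScore(scores) {
  return Object.entries(weights)
    .reduce((sum, [metric, w]) => sum + w * scores[metric], 0);
}

// Illustrative numbers only: FCP gets worse, LCP gets slightly better,
// everything else stays the same — and the overall score still drops.
const before = performanceScore({ fcp: 0.9, si: 0.8, lcp: 0.7, tbt: 0.8, cls: 1.0 });
const after  = performanceScore({ fcp: 0.5, si: 0.8, lcp: 0.8, tbt: 0.8, cls: 1.0 });
```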
So now, whenever I have a site where rendering metrics like First Contentful Paint or Largest Contentful Paint get worse, what I like to do is look at the frame-by-frame rendering of that site and at the network waterfall of that site, and then compare them to try to figure out what the heck has happened. So in this case, if I compare both versions of this site, one without the CDN and one with it, I'll notice two big things. The first one is that the responses now actually take less time, as intended. These tests were made from Canada, for example, using webpagetest.org, and with the CDN, the server was closer to the users, so the round trip was shorter, so the files took less time to download than without the CDN.
2. CDN Connection Delay and First Contentful Paint
Despite the files taking less time to download, the first paint happened later with the CDN. The delay was due to the connection process, which consists of DNS resolution, the TCP connection, and the TLS handshake. By adding a CDN on a separate domain, the CSS files were delayed, causing the First Contentful Paint to be delayed as well. This issue was not initially understood, leading to the abandonment of the CDN optimization. However, the speaker now knows how to avoid this problem.
Second, even though our files took less time to download, the first paint actually happened later. You can see that, without the CDN, the first paint happened around the 2.0 second mark, and when we added the CDN, the first paint started happening around the 3.0 second mark. And if I try to figure out why that happens, if I look through the part of the waterfall that precedes the first paint and try to compare what changed, I will notice this.
Does anyone know what that means? Oh, I can't hear from here, it turns out. But, okay. Some people are raising their hands. Oh, okay. Somebody knows. Great. So, this bit of the waterfall, this piece of the waterfall, is what bit us back then when we were trying to set up a CDN. And now that I'm working with clients, now that I'm an independent consultant, I see it biting my clients as well.
The issue is that whenever we try to load something from a new domain, the browser has to set up a connection to that domain first. And that is surprisingly slow. The connection process consists of three steps. The first step is DNS resolution: the browser takes the domain name you have referenced in your HTML or elsewhere and looks up its IP address. That ideally takes one round trip. The second step is the TCP connection: once the browser has the server's IP address, it needs to set up a connection to the server behind that address. That involves one round trip, so one ping to the server and then one response from the server. So if there's a 100 millisecond delay between the client and the server, this is going to take 200 milliseconds. And the third step is the TLS handshake: once the browser has connected to the server, it needs to encrypt the connection to upgrade HTTP to HTTPS, and this takes two more round trips. So setting up a connection has these three steps, and on a typical 4G connection, they take 500 to 600 milliseconds to complete, like in this waterfall.
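The steps above can be sketched as a back-of-the-envelope model. The round-trip counts follow the description in the talk (DNS one round trip, TCP one, TLS two for TLS 1.2; TLS 1.3 would need only one):

```javascript
// Rough cost of opening a connection to a brand-new domain.
// rttMs is the full round-trip time between client and server.
function connectionCostMs(rttMs, tlsRoundTrips = 2) {
  const dnsRoundTrips = 1; // resolve the domain name to an IP address
  const tcpRoundTrips = 1; // TCP handshake
  return (dnsRoundTrips + tcpRoundTrips + tlsRoundTrips) * rttMs;
}

console.log(connectionCostMs(150)); // 600 — roughly the delay seen on 4G
console.log(connectionCostMs(150, 1)); // 450 — with a TLS 1.3 handshake
```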
So in this case, what happened was that by adding a CDN, we also accidentally added the connection delay, because the CDN was on a separate domain. As a result, we delayed the CSS files the page uses by 600 milliseconds, and because the page can't render without CSS, the First Contentful Paint also got delayed. We moved the CSS files to the CDN, the CSS files got delayed, the First Contentful Paint happened later. Right? So, back then, we actually did not figure out what happened, so we had to ditch the CDN optimization. Today, luckily, I know how to fix this, I know how to avoid this.
3. CDN Optimization and Code Splitting
To make the CDN optimization work without adding connection costs, use a pull CDN in front of the whole origin. Popular files like Google Fonts are no longer cached across sites due to cache partitioning, so loading everything from your own origin is crucial to prevent connection delays. Another optimization the speaker tried was code splitting.
So for the CDN optimization, for the connection cost to not backfire, I need to make sure I don't add any new connection costs, that I don't add any new domains. And the only way to do that is to put the CDN in front of the whole origin, not just in front of some files. The technical term you want here is a pull CDN, which is what you want to use, versus a push CDN, which is what you want to avoid. This is a bit trickier to set up than just pushing some files to a bucket and putting a CDN in front of that, but if you do it, you are, well, not going to have a performance regression, which is good.
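To make the difference concrete, here's a toy model of what a pull CDN edge does conceptually (entirely hypothetical code, not any vendor's API): the edge sits behind your own domain, pulls a file from the origin the first time it's requested, and serves all later requests from its cache.

```javascript
// Toy model of a pull CDN edge node (hypothetical, heavily simplified).
function makePullEdge(fetchFromOrigin) {
  const cache = new Map();
  let originRequests = 0;
  return {
    get(path) {
      if (!cache.has(path)) {
        originRequests++;                       // first request: pull from origin
        cache.set(path, fetchFromOrigin(path));
      }
      return cache.get(path);                   // later requests: served at the edge
    },
    originRequests: () => originRequests,
  };
}

const edge = makePullEdge((path) => `contents of ${path}`);
edge.get("/assets/bundle.js");
edge.get("/assets/bundle.js"); // cache hit: no second trip to the origin
```

Because the edge answers on your existing domain, the browser reuses the connection it already has — no new DNS, TCP, or TLS steps.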
Now, some of my clients whom I introduce to this connection cost issue have a different concern. They say, hey, look, I understand there's a connection cost, but in my case, I use the Roboto font from the Google Fonts CDN, and it's really popular across many sites. Won't it just be cached when the user visits other sites? Won't that avoid the connection cost completely in this case? Meaning: if a user gets to some site that uses Roboto from Google Fonts, the browser will cache that font, right? And if five minutes later the user gets to my site, which also uses Roboto from Google Fonts, the browser will take the font from the cache and the file will be available immediately, without any connection delay, right? So the answer is no. A while ago, this actually worked like I described. However, a few years ago, this changed. At the end of 2020, Chrome shipped a thing called cache partitioning. The idea of cache partitioning is that every file the browser caches gets cached only for the site that requested it. So if you have othersite.com requesting the Roboto font, that file will get cached in the othersite.com bucket. If you have mysite.com also requesting the Roboto font, that file will get cached in the mysite.com bucket; it will not be taken from the othersite.com one. This is done for privacy reasons, and it's been implemented everywhere since early 2021. So if you're thinking, "but I'm loading some popular file, it will probably be cached from other sites," it won't. For the connection costs to not backfire, you need to load everything from your own origin, even stuff like Google Fonts. In general, the rule of thumb with CDNs and domains is that everything must be behind the same domain. As long as you follow this rule, you should be good. If you step away from it, it will backfire. So this is the first optimization.
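A toy model of the partitioned cache key (heavily simplified; the real key has more components, such as the frame origin):

```javascript
// With cache partitioning, the HTTP cache is keyed by the top-level site
// *plus* the resource URL, not by the URL alone (simplified model).
function cacheKey(topLevelSite, resourceUrl) {
  return JSON.stringify([topLevelSite, resourceUrl]);
}

const font = "https://fonts.gstatic.com/roboto.woff2"; // illustrative URL
const onOtherSite = cacheKey("https://othersite.com", font);
const onMySite = cacheKey("https://mysite.com", font);

console.log(onOtherSite === onMySite); // false — no cross-site cache reuse
```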
Now, let's talk about another optimization: code splitting. So who here has done code splitting on their projects before? Okay, yeah, a lot of people. This is like an Anonymous Code Splitters meetup. So back when I was working on that app, the CDN was not the only thing we tried to make the app faster. Another optimization we tried was code splitting. At the beginning of the optimization round, our bundle looked roughly like this.
4. Code Splitting and Delayed Rendering
We attempted to solve the issue of all routes being bundled into just two chunks by implementing code splitting. While the bundle size decreased, the app took longer to render, with a delay of two seconds: the first UI elements appeared sooner, but the full UI rendered later.
We had four or five separate independent routes, each of them having their own components, their own dependencies. But all of these routes were getting bundled just into two chunks, the main chunk and the vendor chunk, ten megabytes in total.
So what did we try to solve this? Of course, we tried code splitting. Our single-page app was built from a single entry point, it had multiple routes, and it used React Router to pick the right route to serve. If you google code splitting guides for React Router or Vue Router, they will both recommend pretty much the same thing: take your route components, wrap them with dynamic imports, and code split them away. Which is precisely what we did.
And so we did it, we checked our bundle, and noticed it's definitely getting smaller: instead of serving the whole ten-megabyte blob right away, we can now load just five megabytes of code. Yay. However, once we ran the performance tests, we realised the app now actually takes longer to render. The first UI elements appear sooner, but the whole app UI renders two whole seconds later.
5. Code Splitting and Performance
To debug this issue, let's analyze the network waterfall. In the original waterfall, the browser requests index.html and then the files it references, including the bundles. After executing the bundle and sending API requests, the app is rendered. Code splitting, however, introduces a waterfall of requests that delays rendering: despite the smaller initial bundle, the app's performance gets worse. To avoid this, frameworks like Next.js and Nuxt.js handle code splitting differently, building separate bundles for each route and managing code sharing between them.
Has anyone seen anything like this before? No, no hands. So, to debug this issue, let's look at the network waterfall again. This is the original waterfall; let's go through what happens here. First, we request index.html. When index.html arrives, we request the files that are referenced by it, including our bundles. When the bundle arrives, there's a pause in network requests because we execute the bundle; you can see this in the thin pink JS-execution rectangles. Then, once the JS execution completes, we send a few API requests. And once that's complete, we render the app. You can see this by the green dashed vertical bar, which indicates the Largest Contentful Paint in WebPageTest.
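The request chain described above, with and without naive route splitting, can be sketched numerically (all times below are made-up illustrative values, not measurements):

```javascript
// Toy model: each step on the critical path must finish before the next starts.
const sum = (steps) => steps.reduce((total, ms) => total + ms, 0);

// Without code splitting: html → bundle → JS execution + API → render
const withoutSplitting = sum([300, 1500, 400, 300]);

// With naive route splitting, the route chunk is only discovered after the
// main chunk downloads and executes, adding an extra serial network step:
// html → main chunk → execute → route chunk → execute → API → render
const withSplitting = sum([300, 800, 250, 800, 250, 300]);

console.log({ withoutSplitting, withSplitting }); // logs { withoutSplitting: 2500, withSplitting: 2700 }
```

Even though each individual download is smaller, the critical path gained a link, so the render happens later.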
6. Optimizing Code Splitting
To keep code splitting from backfiring, use dynamic import() only for lazy components that don't affect the first render. For critical components, like root components, avoid dynamic import(); use multiple entry points instead to prevent the app from becoming slower.
This is option one. Option two: you can theoretically also replicate what these frameworks are doing on your own. I'm not going to say how to do it, because I don't know how to do it. But if you sneak into the source code, set up the right Webpack plugins, and also check this deeper-dive article, which I linked (it's available at the link that's shown everywhere), you might be able to set something up. I've never done this, honestly. Hopefully I never will have to. But option three: if you already have a bunch of code that uses dynamic imports for routes and you're not feeling like migrating to Next.js, there's still a way to fix this. What you can try doing is parsing the Webpack compilation stats and detecting which chunks each route needs, and then generating multiple HTML files with the relevant link rel preloads for every route, so that every route starts preloading the chunks it needs right ahead of time. It sounds tricky, and it is tricky to do. But I linked to one case where I had to implement this for an open-source client of mine, with a lot of documentation, so if you're in this situation, you could check it out as well. Anyway, the key idea, the rule of thumb here, is that the next time you decide to code split something, use import() only for lazy components, for components that don't affect the first render. Maybe it's a modal component, maybe it's something that's not visible to the user immediately. For critical components, like your root components, like components that the app depends on, do not use import(). Use multiple entry points, or code splitting will make your app slower and backfire.
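A generated HTML file for one route might look something like this (the chunk file names are hypothetical): the preloads let the route's chunks download in parallel instead of forming a serial chain.

```html
<!-- dashboard.html: preload the chunks this route is known to need -->
<link rel="preload" href="/assets/main.abc123.js" as="script">
<link rel="preload" href="/assets/vendor.def456.js" as="script">
<link rel="preload" href="/assets/route-dashboard.0a1b2c.js" as="script">
<script src="/assets/main.abc123.js" defer></script>
```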
7. Lazy Loaded Images and Performance
Lazy loading images can hurt performance if applied too aggressively. Lazy-loaded images can be delayed because they are blocked by CSS: the browser waits for the CSS to finish loading before downloading lazy images, which delays their loading. To avoid this, specify the priority attribute on critical images and use tools like the Web Vitals extension to detect and optimize the Largest Contentful Paint (LCP) element.
All right. Let's talk about another optimisation: lazy-loaded images. This is actually an example I give to attendees in my workshop, and here's a simple product page. This page lists 100 products, and each of these products has an image. When you open the page, the page immediately loads 100 images. How could you fix this? Yes, lazy loading! A lot of my workshop attendees correctly identify that what they need to do is make these images lazy-loaded. What they do is find the component that renders every product card and add the loading="lazy" attribute to it, which tells the browser: hey, can you avoid loading this image until it enters the viewport? And then they check DevTools and confirm that, yeah, it looks like the issue is resolved. We're now loading nine images before we scroll down. It looks pretty effective, right? But again, if you look at the Lighthouse results from before and after, you'll discover that the LCP of the site actually got worse. Why? To figure out why, you'll need to scroll down the report and see what exactly the Largest Contentful Paint element was, the element that triggered the Largest Contentful Paint on the page. In this case, it was this image, shell-gold-some-hash.jpg. And once you know what the Largest Contentful Paint element was, you need to look at the waterfall and see what exactly changed about loading that image. So if you look into the waterfall, you'll see that before adding loading="lazy", that image started loading around the 0.8 second mark, whereas after adding loading="lazy", the same image started loading at the 1.6, actually 1.7, second mark. Here it is, the same image. But why exactly does this happen? If you check the LCP element once again, you'll notice that the LCP element, the LCP image, has this loading="lazy" attribute, because we applied loading="lazy" to all the product cards, and some of the product cards were actually LCP elements.
As it turns out, lazy-loaded images are not delayed just because they're lazy or something; they're delayed for technical reasons. They're delayed because they're blocked by CSS. When you have a lazy image, the browser can't load that image until it knows whether the image is within the viewport. And this is controlled by CSS. Your CSS can have any of these rules, and the browser won't know which rules it has until it actually has the CSS. So before the browser can download lazy images, before the browser knows whether these images are within the viewport, it has to wait for the CSS to finish loading. And that's what happened here. We applied lazy loading, and the images had to wait for the CSS to load, which caused them to start loading almost a second later. So that's how, by applying lazy loading a bit too aggressively, we actually hurt our performance. Now again, how do you avoid this backfiring? First, if you use the next/image component or Angular's NgOptimizedImage, one thing to remember is that these components use lazy loading by default. So if you take some critical image, wrap it with next/image, put it above the fold, and it ends up being the LCP element, it will delay your LCP. What you need to do is remember to specify the priority attribute on critical images, so that they are actually loaded as soon as possible. And second, to actually detect the critical images, one thing I like doing is installing the Web Vitals extension, enabling console logging in its options, and making it a habit of mine to check the LCP element of the site I'm working on. So, if I install this extension and enable console logging, the extension will start logging all this information into the console.
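Put together, the rule looks like this in plain HTML (file names are hypothetical; `fetchpriority` is an optional extra that bumps the image up the download queue):

```html
<!-- Above the fold, likely the LCP element: never lazy-load it -->
<img src="/images/product-1.jpg" fetchpriority="high" alt="Product 1">

<!-- Below the fold: safe to lazy-load -->
<img src="/images/product-42.jpg" loading="lazy" alt="Product 42">
```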
8. Image Optimization and Code Splitting
And so if I'm working on some site, and I see the LCP element being an image, and I notice the image has the loading="lazy" attribute, then I know I'm in danger. So that's the second point. And in general, remember the rule of thumb: all images above the fold must not have loading="lazy". As long as you follow this rule, the image optimization will not backfire.
So thanks. This was Ivan Akulov, I'm a Google Developer Expert. I have Twitter where I post performance trivia. So maybe you want to subscribe. No pressure, folks. And thanks for having me.
9. Preconnect and Preload Headers
Preconnect and preload link headers can help with connection and loading cascades, but only if you load your files later. If you reference a third-party domain from your HTML, these headers would not help, because they also arrive with your HTML, and the browser starts connecting to the third-party domain as soon as it receives the HTML anyway. They do not significantly help with critical stuff.
Just in general, 100 files is maybe a tricky idea. All right. Let me see. How much do link preconnect and preload headers help with... Oh, it's gone. Oh, no, it's not gone. Help with connection and loading cascades. Oh, that's actually a great question. So they can help, but only if you load your files later. So I'm not sure... Can I still show my slides? No. No, no. That's hard now. Okay. So basically, if you reference some third-party domain from your HTML, link rel="preconnect" would not help you at all, because the link rel="preconnect" would also arrive in your HTML, and the browser would start connecting to your third-party domain as soon as it receives the HTML anyway. So you cannot make this faster. Well, maybe you can with the HTTP 103 Early Hints standard, but that's advanced stuff. In general, they do not really help for critical stuff.
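For illustration, here are the two situations side by side (domains are just examples):

```html
<!-- Helps: the third-party request starts much later (e.g. triggered from
     JS), so warming up the connection ahead of time saves the handshakes -->
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>

<!-- Doesn't help: the script below sits in the same HTML as the hint,
     so the browser discovers the domain at the same moment either way -->
<link rel="preconnect" href="https://cdn.example.com">
<script src="https://cdn.example.com/bundle.js"></script>
```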