Transcription
A really long time. In fact, some of you I think I've met for the first time ever. So it's been great. It's been surreal really. My name is Sid. I work at Gatsby Inc. I've been there for a while now. Over the years I've helped build and maintain the Gatsby open source project. More recently I've been working on Gatsby Cloud. Gatsby Cloud is the best place to build and deploy your Gatsby sites. And I'm going to be talking about a bunch of different stuff that we've done with Gatsby v4. In case you missed it, we've been busy. Gatsby v4 came out yesterday and it has a bunch of new rendering modes. Let's talk about what that means. So before all of that, a quick history lesson. The Jamstack has been around for a while now. It's been a couple of years. But when it started out, it was really just static sites. You had static site generators like Hugo. Now you have Eleventy. We've had Gatsby, Next, at some point added SSG as well. The idea with the Jamstack was to pre-build your assets, HTML, CSS, JavaScript, etc. and deploy it to the edge. And the core principles, as I said, I copy-pasted this, the core principles of pre-rendering and decoupling are supposed to give you more confidence about your site. The thing that's interesting about this though is pre-rendering. Why does that give you more confidence and why is that more resilient? Because the stuff you need to do to construct a page is already done. You don't have to do it at request time. So there's very little that can go wrong. However, it's not always that simple. Before we get to why it's not simple. Static site generation has worked well for us for a while. At Gatsby, on Gatsby Cloud, we've seen folks with pretty large sites, we've seen folks with e-commerce sites and blogs and whatnot. We've seen sites that have gone as high as almost 100,000 pages. And it's incredible that we're able to do that with what started out as a simple static site generator. 
And in case you've missed out on the benefits of SSG over the past couple of years, it's good for SEO, it's fast, it's also cheap to deploy. We have Matt here from Netlify. All of you have probably used Netlify at some point, right? And it's almost become sort of the default way to deploy things, right? Because it's cheap, it's free. He's right there, by the way. His picture was there for a second. And in case you want to know what SSG actually looks like under the hood, when you visit a site on Netlify, Gatsby Cloud, any of these hosts, you're effectively just reaching a storage bucket, whether it's S3 or GCP or whatever. You're hitting a storage bucket and you're getting a static file. There's not a lot of code that's actually running at runtime, and that's why it's so resilient. Let's talk a little more about where this kind of falls apart, though. It doesn't always scale. I've heard this over the years from people on GitHub, on Twitter, etc., that it's great for your blog or for a tiny site, but it doesn't scale for really large sites. What if you have a lot of pages and so on? I disagree. We'll talk about why. Let's say you want to build an e-commerce site. You start out with what looks like a template from Tailwind UI, which I bought, by the way. And then you have a homepage with your products and so on. And you have about 20, 30-odd products and it builds quickly and it's all great. And you have another page where you have a T-shirt that you sell and it's all wonderful. Your whole site builds in under a minute. Life's good. But then you have a lot more products. Over time, you make money. You sell more T-shirts or mugs or whatever you're selling. And now, suddenly, this happens. You see that? You see that time stamp there? That's an hour and 54 minutes. I didn't make that up. That's a screenshot. Julian works with me. That's a screenshot. A site took an hour and 54 minutes to build. That's kind of unusable, no? 
You can't possibly wait two hours after every change you make. And this is real. We've seen this. We've had customers who've seen this. And this is one of the reasons why SSG sort of falls apart. It can get slow because you want to do all of that work at build time and maybe that's too much work.
So how do you fix this? Well, in our e-commerce site that we're trying to build, what you typically do is, you know, you'd get frustrated and you'd say, hey, this doesn't scale. And you'd call an old friend. Next. Or really anything that does SSR. And the way SSR works is that, well, we've done this over the years. You have some server, whether it's Node.js, PHP, whatever, that serves every request, right? And you have a cache in front of it. It works well. It's served us for years.
But now you've lost all the benefits and everything that SSG promised you, everything that you sort of wanted to go static for in the first place. The cache is much harder to predict. It's not deterministic. The cache tends to be user-specific, sometimes even edge-specific, right? It's hard to know what is cached at what point in time for your site, for all your users. You also consume all of these services and APIs at runtime. Remember Facebook going down a couple of weeks ago? I mean, if they can go down, our APIs can go down too, right? And let's say that happens at runtime when you try to visit a page. What happens? Your page doesn't render. So it's brittle. And let's say that doesn't happen and you scale up well, that stuff can get really expensive, right? And this is what you see when it falls apart. You can see that I got that from the internet, because with Gatsby, this never happens. So I downloaded the screenshot.
That's why often you go for SSR when you have a lot of pages. But the moment you mix in SSG with SSR, you have a whole new set of problems. SSG is great for a couple of pages. So let's say you build your homepage statically, because it doesn't change that often.
And in our example that we were talking about for a while now, your product pages are being generated via SSR. It's a good compromise. It scales well, right? Your builds don't take two hours anymore. But you lose out on build atomicity. And what I mean by build atomicity is because both of these parts of your site are built independently at different points in time, it's hard to make sure that they're consistent with each other. You might hit a homepage that says product A is available and click through and then see that it's not. And that's a terrible experience. You don't want to do that. So how do you fix this? How do you compromise and have fast builds but also have consistency across pages but still try to build some stuff later? It's tricky. How do you almost defer some work? The spectrum that we've been talking about with SSR on one end and SSG on the other, there's got to be something in between. Right? I mean, SSR is wonderful for some use cases. SSG should be your default. But what about these use cases in between? And that's really what this talk is about. With Gatsby v4, we have what we call deferred static generation. You can think of it as SSG, but later. You get all the benefits of static site generation, but with the scale of SSR. And this might sound too good to be true, but it's actually fairly simple. The idea is that we spoke about all of that work you need to do to build all these pages, which took like two hours. But instead of building all of them, you just build the critical ones. You just defer the ones that aren't as critical. And now instead of your build times being a function of the number of pages your site has, it's a function of your traffic, which is pretty interesting if you think about it. Also, you don't need to hit these APIs or consume these services at runtime. What Gatsby does is that Gatsby will create a snapshot of your data and your JavaScript bundle at build time and use that to generate these pages, but later. 
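
The arithmetic behind "a function of your traffic" can be sketched as a rough cost model. This is a hypothetical illustration, not Gatsby code: the SiteShape type, the function names, the per-page cost, and the page counts are all assumptions picked to keep the numbers easy to follow.

```typescript
// Hypothetical cost model for the build-time argument above.
// All names and numbers are illustrative assumptions, not Gatsby internals.
interface SiteShape {
  totalPages: number;     // every page the site can serve
  criticalPages: number;  // pages built eagerly at build time
  secondsPerPage: number; // assumed average cost to generate one page
}

// Plain SSG: every page is generated during the build.
function fullBuildSeconds(site: SiteShape): number {
  return site.totalPages * site.secondsPerPage;
}

// With deferral: only the critical pages are generated during the build.
function initialBuildSeconds(site: SiteShape): number {
  return site.criticalPages * site.secondsPerPage;
}

// Deferred work happens later, once per distinct deferred page that gets visited.
function deferredSeconds(site: SiteShape, distinctDeferredPagesVisited: number): number {
  const deferredPages = site.totalPages - site.criticalPages;
  const generated = Math.min(distinctDeferredPagesVisited, deferredPages);
  return generated * site.secondsPerPage;
}

// Assumed shop: 10,000 pages, 200 of them critical, half a second per page.
const shop: SiteShape = { totalPages: 10000, criticalPages: 200, secondsPerPage: 0.5 };
console.log(fullBuildSeconds(shop));     // 5000
console.log(initialBuildSeconds(shop));  // 100
console.log(deferredSeconds(shop, 300)); // 150
```

Under these assumed numbers the build itself drops from 5,000 seconds to 100, and the remaining cost is only paid for deferred pages that somebody actually visits.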
Also, your cache is a lot more deterministic. Remember we spoke about caching. Caching is hard, right? It's one of the hardest problems that we have to deal with and solve. And a non-deterministic cache can cause a lot of confusion. With deferred static generation, your cache is not edge-specific or user-specific. Your cache is global just like it typically is with SSG, but you're still able to get these benefits of being able to defer some work. And as a result of that, it's also atomic. You won't have those inconsistencies that we spoke about.
Let's talk a little more about how it works though. There's this pretty little diagram here. And if you look at it, there are two sections. There's the first request at the top and the second request below it. Let's look at the first one. The first one looks identical to an SSR request, right? You hit a URL, there's a cache miss. So instead, you go back somewhere, right? Somewhere, there's some code running. Let's say the page is called about. There's some file called about.js which runs, looks at that snapshot, generates that page for you, does everything it needs to, just like an SSR server would. And it sends that response back, but look at what else it does. It also caches it.
And this might seem fairly simple on the surface, but it's a lot different from how SSR works and what we've been used to. That cache is not an HTTP cache. That cache is these static assets saved in a bucket somewhere. So any time there's a request after that, you simply get it like you would for any other SSG page. And that isn't the first or the second request for one specific user. That's the first and second request across the board. So if I shipped a Gatsby site with a page that was marked as deferred, and if my crawler happened to hit that page, that would generate it, and anybody else that visited my site after that would get that page just like any other static page. So you can see where this is going.
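
The first-request and second-request flow described above can be simulated in a few lines. This is a minimal sketch under stated assumptions, not how Gatsby or Gatsby Cloud implement it: the DeferredSite class, its in-memory Map standing in for the storage bucket, and the snapshot shape are all hypothetical.

```typescript
// Hypothetical simulation of the deferred request flow; not Gatsby's implementation.
type Source = "static" | "generated";

interface PageResponse {
  html: string;
  source: Source;
}

class DeferredSite {
  // Stands in for the storage bucket that holds generated static files.
  private bucket = new Map<string, string>();
  // Data captured at build time, keyed by page path.
  private snapshot: Map<string, string>;
  // Counts how many times a page was actually rendered.
  renderCount = 0;

  constructor(snapshot: Map<string, string>) {
    this.snapshot = snapshot;
  }

  private render(path: string, data: string): string {
    this.renderCount += 1;
    return `<h1>${path}</h1><p>${data}</p>`;
  }

  // Serve a saved file if one exists; otherwise generate it from the
  // snapshot, save it for everyone, and return it.
  handle(path: string): PageResponse | undefined {
    const saved = this.bucket.get(path);
    if (saved !== undefined) {
      return { html: saved, source: "static" };
    }
    const data = this.snapshot.get(path);
    if (data === undefined) {
      return undefined; // unknown page
    }
    const html = this.render(path, data);
    this.bucket.set(path, html);
    return { html, source: "generated" };
  }
}

const site = new DeferredSite(new Map([["/about", "About us"]]));
const first = site.handle("/about");  // e.g. a crawler triggers generation
const second = site.handle("/about"); // any later visitor gets the saved file
console.log(first?.source, second?.source, site.renderCount); // generated static 1
```

The bucket is shared by every caller, which is the point made in the talk: the saved page is global rather than per user or per edge location, and the render step runs once per page.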
Gatsby 4 now supports all of these different rendering modes. We have SSG, which is sort of ahead of time. We have SSR, which is just in time, and we have DSG, which I like to call fashionably late.
It's not hard to use either. If you look at the documentation, really all you need to do to opt into DSG for a page is call createPage. If you've used Gatsby before, you've probably seen it. If you haven't, it's one function. You need to pass one more option, set defer to true, and it's done. That can even be a condition. In most cases, your plugins will even handle it for you.
Opting into SSR is similar. You export one function in your file, it's called getServerData. This is really similar to what Next does with getServerSideProps, I think. And the pages that you opt in like this are automatically skipped at build time. So you don't need to do anything else. That all works, right? And it's all invalidated. Everything's taken care of. You can even mix SSR data and build-time data in a page. That's pretty incredible when you think about it, right? You can have product pages that have product data that doesn't change often, which is sourced at build time, and you can have availability data sourced at runtime. You can mix that in one page. So really, on that spectrum, you can fall anywhere you want in between with all of your pages. It's Gatsby.
This is what those examples look like. This is what an SSR page looks like in Gatsby. You have a React component like we've always had. The only thing different here from a regular page is that little export at the bottom: export async function getServerData. The moment you export that function from a page file or a template, that becomes an SSR page. And the example for deferring is, like I said, one key. You just set defer to true, and that's it. Your page is deferred.
All of this is in Gatsby v4. I had to keep this slide because it's so pretty. And so it's a nice four.
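
The slides are not part of this transcript, so here is a rough sketch of what the two opt-ins might look like. It is not the code from the talk. To keep it runnable without installing gatsby, the Actions and PageInput types are local stand-ins, and the product list, the bestseller flag, the template path, and the availability value are hypothetical. Only the defer option on createPage and the getServerData name with its status/props return shape are taken from Gatsby v4.

```typescript
// Local stand-ins for the parts of Gatsby's API used here, so the sketch runs
// without installing gatsby. Everything not named in the talk is hypothetical.
interface PageInput {
  path: string;
  component: string;
  context?: Record<string, unknown>;
  defer?: boolean;
}

interface Actions {
  createPage(page: PageInput): void;
}

interface Product {
  slug: string;
  name: string;
  bestseller: boolean;
}

// Hypothetical catalogue; a real site would query its data layer instead.
const products: Product[] = [
  { slug: "classic-tee", name: "Classic Tee", bestseller: true },
  { slug: "retro-mug", name: "Retro Mug", bestseller: false },
];

// Part 1 (DSG): in a real project this function is exported from gatsby-node.
function createPages({ actions }: { actions: Actions }): void {
  for (const product of products) {
    actions.createPage({
      path: `/products/${product.slug}`,
      // Gatsby expects an absolute path here; a plain string keeps the sketch simple.
      component: "/site/src/templates/product.js",
      context: { slug: product.slug },
      // The condition: build bestsellers eagerly, defer everything else.
      defer: !product.bestseller,
    });
  }
}

// Part 2 (SSR): in a real project this function is exported from a page or
// template file, and its props reach the page component as `serverData`.
interface ServerDataResult {
  status: number;
  props: { inStock: boolean };
}

async function getServerData(): Promise<ServerDataResult> {
  // Hypothetical availability value; real code would call an inventory API here.
  const inStock = true;
  return { status: 200, props: { inStock } };
}

// Usage with a stub that records what createPages asks for.
const created: PageInput[] = [];
createPages({ actions: { createPage: (page) => { created.push(page); } } });
console.log(created.map((page) => `${page.path} defer=${page.defer}`));
```

In this sketch the bestseller page is built during the build and the other product page is deferred to its first request, which is one way to read "that can even be a condition".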
Looks like a K-pop album cover. But yeah. It's all out in Gatsby v4. Gatsby v4 has other stuff too. We now have parallel query running. We figure out how many threads you have on your machine. We run all your queries in parallel. Your builds are faster. Hopefully you notice. But you don't have to do anything to opt into it. It just works. You know, the stuff under the hood. There's a lot of other stuff under the hood as well, including something called LMDB, which as a user of Gatsby, you don't have to know or worry about. But Gatsby uses it under the hood and it sort of makes a lot of this stuff possible.
You can try Gatsby 4 now. It came out yesterday. Fairly well-timed, I'd say. Just npm install gatsby, and you'll get Gatsby v4. And it works on Gatsby Cloud today. Works on other cloud platforms too. You know what I mean by that. But yeah, to be fair, Gatsby is still open source. You can build your assets, deploy that code really anywhere you want, as long as you put in some work.
And that was it. That was my talk. My name is Sid. That's my Twitter handle. If you have questions or if you want to talk, feel free to DM or tweet at me. Thank you. Thank you. Thank you.
Sid, you are my favorite person for two reasons. One, because you're just incredible. But also, you are under time, so we're back on schedule. Please take a seat. Welcome to the hot box. So we have quite a lot of questions for you. And there's not a lot of voting going on. They're all at the same level at this point. Could we have this? You had all of them, Jan. I did not. OK, here we go. So I'm just going to start from the top. Can you adapt Gatsby's DSG to render pages at the edge? For example, Compute@Edge or Cloudflare Workers?
Yes, you can. If you run a Gatsby build locally, you can take a look at your .cache directory. All of the assets that you need for DSG are in there. The documentation is sparse. It just came out a couple of days ago. That will get better.
But you can if you look at the .cache directory. There's a functions directory inside. Pick some of those files out. Put them on the right cloud, and it'll just work, hopefully.
Nice. Glenn has asked the question: if the data doesn't change for a particular route, does DSG know not to rebuild that page, that it can reuse the last deployment's version?
Excellent question. The answer is yes, it can. So the beauty of how we've built stuff on Gatsby Cloud is that the pages are all hashed by content. And even if it's from a different build, we still save those static assets independently of build ID. If you build the same page over and over again and nothing's changed, we're going to use the same one and just skip it automatically.
I have a personal follow-up question to that. How much complexity are you introducing within Gatsby internally to make all of this happen? Because the diagram that you drew is very beautiful, and it's clear, and it's basically like write-through caching. But how much magic does Gatsby need to do to make this work?
A bunch. I won't lie, a bunch. And that's what LMDB is. LMDB lets us serialize and deserialize that state across workers. And it also lets us... In the past, Gatsby has had a lot of its state in memory, and that's been one of the biggest reasons we haven't been able to move into workers and parallelize things. With LMDB, we can do that now. And yes, there is complexity. Yes, stuff under the hood is trickier, but it's all for DX, and the DX is better. So I guess it's worth it.
Exactly. As a developer, I appreciate any improvement to my experience. And we hope that also translates to the user experience at the tail end of it. The next question is from Adam Young. Can we use DSG for code we might have as a client route? Something like custom content specific to an authenticated user, which is cached specifically for them?
Great question. Not yet, because to be able to do that, you'd still have to...
We would still need to run some userland code of yours to be able to check authentication per user, etc. We do intend to build out what we're calling pre-route middleware during SSR and DSG time. So this will happen. But as of today, no, you can't. You want to go SSR in cases of pages like that, or go client-side.
Awesome. What about the time it takes for the build system to spin up? Doesn't that add a lot of latency? I'm assuming... Okay, you answered the question.
That's a great question. We actually worked really hard on that one. Try it out. It's actually ridiculously fast. So what we do is, even before your build finishes, your entire build... Now the Gatsby build involves several things, right? We generate your JavaScript bundles, we generate CSS, and then we write HTML files per page. Now your HTML files are a function of that bundle, but your SSR and your DSG don't have to wait for all of those. So as soon as we have your bundle ready, even before the entire build finishes, we send those over via sockets. We pre-warm containers which would run for SSR. So before your build finishes, your SSR containers are already up and running and ready. We also require your code, so it's in the require cache. It's all pre-warmed. There shouldn't ideally be any latency introduced by any of this infra. The latency that you'll see will likely be because of the complexity of userland code and the amount of data you have. But yeah, we did think of this.
I will follow up on that. I've always been curious. You build a platform like this, and you already referred to containers. So you just spin up Node instances in containers and you host them somewhere. Do you have to do anything optimized at the platform or runtime level, or is it just standard Node boxes?
We have a custom image that we run that's really stripped down, that doesn't have a lot of stuff that you typically associate with running any Node server. But really, it's just a container. It's just Node.js.
All right, cool. I think we have time for a couple of more questions, and there are a lot. Is optimizing images only for the first request viable, or should images be more critical? Interesting. So here's the thing. You can do either, really. I think it depends on your use case. If you feel like you want your images... If your images are being shared across pages, and if some of those pages are statically generated, those images will be critically generated at build time anyway. In case your images aren't critical and they're just in deferred pages, they will be built on the first request. In most instances, I wouldn't worry so much because remember, this is just the first request we're talking about. Every request after that is just going to be static. Often hitting a page that links to another page like this will trigger that first request anyway because of preload. So typically, I wouldn't worry so much, but try it out. Nice. A couple more questions. I think Slido is supposed to protect us from things like this is a comment instead of a question, but one comment slipped through and I think it's a good one. I'll just read it out loud. Build times are now a function of your traffic. This blew my mind. And this is not a question, but I would actually like to hear a bit more about this. Yeah. Yeah. So it's... And I wrote this this morning at 6 a.m. But really, if you think about all your statically generated sites before this, before this was possible, your build time was always a function of the total number of pages you have. If you have 10,000 pages, your build time is going to be a function of how many pages you have. But if you were to mark all of your pages as deferred, and you probably wouldn't do it for all your pages, but if you were to, your build times are now just a function of your runtime traffic. 
Your build times would be really quick, but then it's almost like slicing your build time into your initial build time and then your deferred build time that would then happen as traffic hits your site. And that's why I think it's a function of your traffic.
Excellent. Excellent. And I think a final question, this is a good one to finish on, like a one-minute answer. What's coming next for Gatsby? What's the exciting thing that you're looking forward to?
A lot. I think with Gatsby v4, what we worked on and sort of invested heavily in was being able to make Gatsby viable for use cases that it wasn't for before. Gatsby has never had SSR before. And that's something that we've been thinking of for years now. And we finally sort of built SSR in. I think now Gatsby can really be used for any kind of site, whether you're doing SSR, DSG, SSG, and so on. In terms of what's next, I think the DX is what's next for us. Today, to use DSG, you still need to pass in that option, defer: true, to createPage. It doesn't support the File System Route API, for instance. A lot of plugins, a lot of community plugins and CMSs need to sort of embrace these things and work out of the box. The next thing for us is to focus on these integrations and make sure everything really works as well as we'd like it to. And the general DX, that's what's next for this quarter.
Incredible. Thank you very much, Sid. Thank you. Applause for Sid, please. Thank you.