1. Introduction to Gatsby v4 and the Jamstack
It's been a really long time. In fact, some of you I think I've met for the first time ever, so it's been great. It's been surreal, really. My name is Sid; I work at Gatsby Inc. I've been there for a while now.
Over the years, I've helped build and maintain the Gatsby open source project. More recently, I've been working on Gatsby Cloud. Gatsby Cloud is the best place to build and deploy your Gatsby sites. And I'm going to be talking about a bunch of different stuff that we've done with Gatsby v4. In case you missed it, we've been busy. Gatsby v4 came out yesterday, and it has a bunch of new rendering modes. Let's talk about what that means.
The thing that's interesting about this, though, is pre-rendering. Why does it give you more confidence, and why is it more resilient? Because the stuff you need to do to construct a page is already done; you don't have to do it at request time. So there's very little that can go wrong. However, it's not always that simple. Before we get to why it's not simple: static site generation has worked well for us for a while. At Gatsby, on Gatsby Cloud, we've seen folks with pretty large sites, we've seen folks with e-commerce sites and blogs and whatnot. We've seen sites that have gone as high as almost 100,000 pages. And it's incredible that we're able to do that with what started out as a simple static site generator. And in case you've missed out on the benefits of SSG over the past couple of years: it's good for SEO, it's fast, and it's also cheap to deploy.
2. Introduction to Netlify and SSG
We have Matt here from Netlify. All of you have probably used Netlify at some point, right? It's almost become the default way to deploy things, because it's cheap, it's free. He's right there, by the way; his picture was right there for a second. And in case you want to know what SSG actually looks like under the hood: when you visit a site on Netlify, Gatsby Cloud, any of these hosts, you're effectively just reaching a storage bucket, whether it's S3, or GCP, or whatever. You're hitting a storage bucket, and you're getting a static file. There's not a lot of code that's actually running at runtime, and that's why it's so resilient.
3. Challenges with SSG and the Introduction of SSR
Let's talk a little more about where this kind of falls apart though, right? This doesn't always scale. I've heard this over the years from people on GitHub, on Twitter, etc. That it's great for your blog, or for a tiny site, but it doesn't scale for really large sites. What if you have a lot of pages and so on? I disagree. We'll talk about why.
Let's say you want to build an e-commerce site, right? You start out with what looks like a template from Tailwind UI, which I bought, by the way. And you have a homepage with your products and so on, and you have about 20- or 30-odd products, and it builds quickly and it's all great. And you have another page where you have a t-shirt that you sell, and it's all wonderful. Your whole site builds in under a minute. Life's good. But then, you have a lot more products, right? Over time, you make money, you sell more t-shirts or mugs or whatever you're selling. And now, suddenly, this happens. You see that timestamp there? That's an hour and 54 minutes. I didn't make that up. That's a screenshot. Julian works with me; that's a screenshot. A site took an hour and 54 minutes to build. That's kind of unusable, no? I mean, you can't possibly wait two hours after every change you make. And this is real. We've seen this. We've had customers who have seen this. And this is one of the reasons why SSG sort of falls apart: it can get slow, because you want to do all of that work at build time, and maybe that's too much work. So how do you fix this? Well, in the e-commerce site that we're trying to build, what you'd typically do is get frustrated, say, hey, this doesn't scale, and call an old friend: Next.js. Or really anything that does SSR.
And the way SSR works, as we've done it over the years, is that you have some server, whether it's Node.js, PHP, whatever, that serves every request, and you have a cache in front of it. It works well, it's served us for years, but now you've lost all the benefits and everything that SSG promised you, everything that you wanted to go static for in the first place. The cache is much harder to predict; it's not deterministic. Caches tend to be user specific, sometimes even edge specific. It's hard to know what is cached at what point in time for your site, for all your users.
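The SSR-plus-cache setup described above can be sketched roughly like this. It's a minimal sketch, not Gatsby's or Next.js's implementation: an in-memory map with a TTL stands in for the HTTP/CDN cache, and the render function stands in for the server.

```javascript
// Sketch of classic SSR with a cache in front, as described above.
// An in-memory map with a TTL stands in for the HTTP/CDN cache.
const cache = new Map(); // url -> { html, expires }
const TTL_MS = 60_000;

// Stand-in for the server (Node.js, PHP, ...) that renders per request.
function renderOnServer(url, now) {
  return `<html><body>${url} rendered at ${now}</body></html>`;
}

function handle(url, now = Date.now()) {
  const entry = cache.get(url);
  // Cache hit: the server is skipped entirely. Exactly which URLs are
  // cached at any given moment is hard to predict, which is the
  // downside described above.
  if (entry && entry.expires > now) {
    return { html: entry.html, cache: "hit" };
  }
  const html = renderOnServer(url, now);
  cache.set(url, { html, expires: now + TTL_MS });
  return { html, cache: "miss" };
}
```

Note how the cache's state depends on traffic and timing: the same URL can be a hit for one visitor and a miss for the next, which is exactly the non-determinism the talk is pointing at.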
4. Challenges with SSG and APIs at Runtime
You also consume all of these services and APIs at runtime. Remember Facebook going down a couple of weeks ago? I mean, if they can go down, our APIs can go down too, right? And let's say that happens at runtime when you try to visit a page. What happens? Your page doesn't render. So it's brittle. And let's say that doesn't happen and you scale up well; that stuff can get really expensive. And this is what you see when it falls apart. You can tell I got that from the internet, because with Gatsby this never happens. So I downloaded the screenshot.
5. Challenges with SSG and Deferred Static Generation
That's why you often go for SSR when you have a lot of pages. But the moment you mix SSG with SSR, you have a whole new set of problems. SSG is great for a couple of pages, so let's say you build your homepage statically because it doesn't change that often, and in the example we've been talking about, your product pages are generated via SSR. It's a good compromise. It scales well; your builds don't take two hours anymore. But you lose out on build atomicity. And what I mean by build atomicity is that because both of these parts of your site are built independently at different points in time, it's hard to make sure that they are consistent with each other. You might hit a homepage that says product A is available, click through, and then see that it's not. And that's a terrible experience. You don't want to do that.
6. Deferred Static Generation and Rendering Modes
With deferred static generation, your cache is not edge specific or user specific. Your cache is global just like it is typically with SSG but you're still able to get these benefits of being able to defer some work. And as a result of that, it's also atomic. You won't have those inconsistencies that we spoke about.
Let's talk a little more about how it works, though. There's this pretty little diagram here, and if you look at it, there are two sections: the first request at the top and the second request below it. Let's look at the first one. The first one looks identical to an SSR request. You hit a URL, there's a cache miss, so instead you go back somewhere. There's some code running. Let's say the page is called about. There's some file called about.js which runs, looks at that snapshot, generates that page for you, does everything you need, just like an SSR server would. And it sends that response back. But look at what else it does: it also caches it. This might seem fairly simple on the surface, but it's a lot different from how SSR works and what we've been used to. That cache is not an HTTP cache. That cache is the static assets saved in a bucket somewhere. So anytime there's a request after that, you simply get it like you would for any other SSG page. And that isn't the first and second request for one specific user; that's the first and second request across the board. So if I shipped a Gatsby site with a page that was marked as deferred, and a crawler happened to hit that page, that would generate it. And anybody else who visited my site after that would get that page just like any other static page. So you can see where this is going. Gatsby 4 now supports all of these different rendering modes.
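The first-request/second-request flow in the diagram can be sketched as follows. This is a toy model, not Gatsby's actual hosting code: a map stands in for the static-asset bucket, and `renderFromSnapshot` stands in for the deferred page code (the hypothetical about.js) running against the build-time data snapshot.

```javascript
// Minimal sketch of the DSG request flow described above. The "bucket"
// stands in for static-asset storage (S3, GCS, ...), and it is global:
// shared by every visitor, not per-user or per-edge.
const bucket = new Map();

// Stand-in for the deferred page renderer running against the
// build-time data snapshot (hypothetical).
function renderFromSnapshot(path) {
  return `<html><body>rendered ${path} from snapshot</body></html>`;
}

function handleRequest(path) {
  // Any request after the first: served like any other SSG page.
  if (bucket.has(path)) {
    return { html: bucket.get(path), cache: "hit" };
  }
  // First request across the board: render once, persist the result
  // as a static asset, then respond.
  const html = renderFromSnapshot(path);
  bucket.set(path, html);
  return { html, cache: "miss" };
}
```

The key difference from the SSR sketch earlier is that there's no TTL and no per-user variation: once the page is in the bucket, every visitor gets the same static asset.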
7. SSG, SSR, and DSG in Gatsby
We have SSG, which is ahead of time; we have SSR, which is just in time; and we have DSG, which I like to call fashionably late. It's not hard to use either. If you look at the documentation, really all you need to do to opt a page in to DSG is call createPage. If you've used Gatsby before, you've probably seen it. If you haven't, it's one function. You need to pass one more option, set defer to true, and it's done. That can even be a condition.
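As a sketch, opting pages into DSG from gatsby-node.js might look like the following. The product list, template path, and the "defer the long tail" condition are all hypothetical; defer is the real createPage option named in the talk.

```javascript
// gatsby-node.js (sketch): opting product pages into DSG.
// The product data and template path here are made up for illustration.
exports.createPages = async ({ actions }) => {
  const products = [
    { slug: "t-shirt" },
    { slug: "mug" },
    // ...thousands more in a large store
  ];

  products.forEach((product, index) => {
    actions.createPage({
      path: `/products/${product.slug}`,
      component: require.resolve("./src/templates/product.js"),
      context: { slug: product.slug },
      // The one extra option: defer rendering to the first request.
      // As the talk notes, this can be a condition, e.g. build the
      // most important pages eagerly and defer the rest.
      defer: index >= 100,
    });
  });
};
```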
In most cases, your plugins will even handle it for you. Opting into SSR is similar. You export one function in your file; it's called getServerData. This is really similar to what Next.js does with getServerSideProps, I think. The pages that you opt in like this are automatically skipped at build time, so you don't need to do anything else. It all works, it's all invalidated, everything's taken care of. You can even mix pages with SSR data and build-time data. That's pretty incredible when you think about it. You can have product pages with product data that doesn't change often, which is sourced at build time, and availability data sourced at run time. You can mix those in one page. So really, you can fall anywhere you want on that spectrum with all of your pages.
It's Gatsby. Here's what those examples look like. This is what an SSR page looks like in Gatsby. You have a React component like we've always had. The only thing different here from a regular page is that little export at the bottom: export async function getServerData. The moment you export that function from a page file or a template, that becomes an SSR page. And the example for deferring is, like I said, one key. You just set defer to true and that's it.
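Since the slide itself isn't visible in the transcript, here is a sketch of what such a page file might look like. The component, API URL, and response shape are hypothetical; getServerData is the real export name mentioned in the talk.

```javascript
// src/templates/product.js (sketch): a Gatsby v4 SSR page.
// A regular React component, plus one extra export at the bottom.
import * as React from "react";

export default function ProductPage({ serverData }) {
  // serverData is whatever getServerData returned under `props`
  return <p>In stock: {serverData.available ? "yes" : "no"}</p>;
}

// Exporting this function from a page file or template is what turns
// the page into an SSR page; it runs on every request.
export async function getServerData({ params }) {
  // Hypothetical availability API, fetched at request time
  const res = await fetch(`https://api.example.com/stock/${params.slug}`);
  return { props: await res.json() };
}
```

This is also where the mixing described above happens: the component can render build-time GraphQL data alongside the request-time serverData on the same page.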
8. Gatsby V4 Features and Availability
Your page is deferred. All of this is in Gatsby V4. I had to keep the slide because it's so pretty. It's got a nice fold. It looks like a K-pop album cover. But yeah, it's all out in Gatsby V4.
Gatsby V4 has other stuff too. We now have parallel query running. We figure out how many threads you have on your machine and run all your queries in parallel. Your builds are faster; hopefully you'll notice. But you don't have to do anything to opt in. It just works; it's stuff under the hood. There's a lot of other stuff under the hood as well, including something called LMDB, which as a user of Gatsby you don't have to know or worry about, but Gatsby uses it under the hood and it makes a lot of this stuff possible.
You can try Gatsby V4 now. It came out yesterday. Fairly well-timed, I'd say. Run npm install gatsby, and you'll get Gatsby V4. And it works on Gatsby Cloud today. Works on other cloud platforms, too. You know what I mean by that. And to be fair, Gatsby is still open source. You can build your assets and deploy that code really anywhere you want, as long as you put in some work. And that was it. That was my talk. My name is Sid. That's my Twitter handle. If you have questions or if you want to talk, feel free to DM and tweet at me.
9. Q&A on Gatsby's DSG and Rebuilding Pages
Thank you. Sid, you are my favorite person for two reasons. One, because you're just incredible. But also, you are under time, so we're back on schedule. Please take a seat. Welcome to the hot box.
So we have quite a lot of questions for you. And there's not a lot of voting going on these. They're all at the same level at this point. Did you add all of them, Jan? I did not. OK, here we go. So I'm just going to start from the top.
Can you adapt Gatsby's DSG to render pages at the edge, for example, on Compute@Edge or Cloudflare Workers? Yes, you can. If you run a Gatsby build locally, you can take a look at your .cache directory. All of the assets that you need for DSG are in there. The documentation is sparse; it just came out a couple of days ago. That will get better. But if you look at the .cache directory, there's a functions directory inside it. Pick some of those files out, put them on the right cloud, and it'll just work, hopefully.
Nice. Glenn has asked: if the data doesn't change for a particular route, does DSG know not to rebuild that page, so that it can reuse the last deployment's version? Excellent question. The answer is yes, it can.
10. Caching and Complexity in Gatsby
So the beauty of how we've built stuff on Gatsby Cloud is that the pages are all hashed by content. Even if it's from a different build, we still save those static assets independently of build ID. If you build the same page over and over again and nothing's changed, we're going to use the same one and just skip it automatically.
I have a personal follow-up question to that. How much complexity are you introducing within Gatsby internally to make all of this happen? Because the diagram that you drew is very beautiful and clear, and it's basically read-through/write-through caching. But how much magic does Gatsby need to do to make this work? A bunch. I won't lie, a bunch. And that's what LMDB is for. LMDB lets us serialize and deserialize that state across workers. In the past, Gatsby has had a lot of its state in memory, and that's been one of the biggest reasons we haven't been able to move into workers and parallelize things. With LMDB, we can do that now. And yes, there is complexity. Yes, stuff under the hood is trickier. But it's all for DX, and the DX is better. So I guess it's worth it.
Exactly. As a developer, I appreciate any improvement to my experience. So, you know. And we hope that also translates to the user experience at the tail end of it.
The next question is from Adam Young. Can we use DSG for code we might have as a client route, something like custom content specific to an authenticated user, which is cached specifically for them? Great question. Not yet, because to be able to do that, we would still need to run some user-land code of yours to check authentication per user, et cetera. We do intend to build out what we're calling pre-route middleware that runs during SSR and DSG. So this will happen.
11. Build System Latency and Containerization
But as of today, no, you can't. You want to go SSR for pages like that, or go client side.

Awesome. What about the time it takes for the build system to spin up? Doesn't that add a lot of latency? It's actually ridiculously fast. We send those assets over via sockets, even before your entire build finishes, and we pre-warm the containers that would run for SSR. There shouldn't ideally be any latency introduced by any of this infra. The latency that you'll see will likely be because of the complexity of user-land code and the amount of data you have. But yeah, we did think of this.
12. Optimizing Images and Build Times
So you just spin up Node instances in containers and you host them somewhere. But do you have to do anything optimized at the platform or runtime level? Is it just standard Node boxes? We have a custom image that we run that's really stripped down; it doesn't have a lot of the stuff that you'd typically associate with running a Node server. But really, it's just a container. It's just Node.js.
All right, cool. I think we have time for a couple more questions, and there are a lot of them. Is optimizing images only for the first request viable, or should images be more critical? Interesting. So here's the thing: you can do either, really. I think it depends on your use case. If your images are being shared across pages, and some of those pages are statically generated, those images will be generated at build time anyway. If your images aren't critical and they're only in deferred pages, they will be built on the first request. In most instances, I wouldn't worry so much, because remember, this is just the first request we're talking about. Every request after that is just going to be static. Often, hitting a page that links to a page like this will trigger that first request anyway, because of preload. So typically, I wouldn't worry so much, but try it out.
Nice. We've got a couple more questions. I think Slido is supposed to protect us from things like this, a comment instead of a question, but one comment slipped through, and I think it's a great one, so I'll just read it out loud: build times are now a function of your traffic. This blew my mind. And this is not a question, but I would actually like to hear a bit more about this. Yeah. And I wrote this this morning at 6 AM. But really, if you think about all your statically generated sites before this was possible, your build time was always a function of the total number of pages you have. If you have 10,000 pages, your build time is going to be a function of how many pages you have. But if you were to mark all of your pages as deferred, and you probably wouldn't do it for all your pages.
13. Gatsby's Future and Focus on Integrations
But if you were to, your build times would then just be a function of your runtime traffic, and they would be really quick. It's almost like slicing your build into your initial build time and then the deferred build time that happens as traffic hits your site. And that's why I think it's a function of your traffic.
Excellent. Oh, excellent. And I think a final question; this is a good one to finish on, a one-minute answer. What's coming next for Gatsby? What's the exciting thing that you are looking forward to? A lot. I think with Gatsby v4, what we worked on and invested heavily in was making Gatsby viable for use cases that it wasn't viable for before. Gatsby has never had SSR before. That's something that we've been thinking of for years now, and we finally built SSR in. I think now Gatsby can really be used for any kind of site, whether you're doing SSR, DSG, SSG, and so on. In terms of what's next, I think the DX is what's next for us. Today, to use DSG, you still need to pass that defer: true option to createPage. It doesn't support the File System Routing API, for instance. A lot of community plugins and CMSs need to embrace these things and work out of the box. So the next thing for us is to focus on these integrations and make sure everything really works as well as we'd like. And the general DX; that's what's next for this quarter.
Incredible. Thank you very much, Sid. Applause for Sid, please.