How to Edge Cache GraphQL APIs


For years, not being able to cache GraphQL was considered one of its main downsides compared to RESTful APIs. Not anymore. GraphCDN makes it possible to cache almost any GraphQL API at the edge, and not only that, but our cache is even smarter than any RESTful cache could ever be. Let's dive deep into the inner workings of GraphCDN to figure out how exactly we make this happen.


Hello, everyone. I am super excited to be here today and to talk to you about edge caching GraphQL APIs. My name is Max Stoiber. I am here in beautiful Vienna, Austria. Unfortunately, I can't be there in person this time, but I am really excited to be here. And if you want to follow me practically anywhere on the internet, I am @mxstbr basically everywhere. I am the co-founder of GraphCDN, which is the GraphQL CDN. If you are in the React community, in the React JS community, or in the JavaScript community more generally, you might have used some of the open source projects that I helped build, like styled-components, or react-boilerplate, or micro-analytics, or a whole bunch of others. I'm really active in that scene, and so if you're there, you might have used some of those projects as well. The story of GraphCDN and how we got there started in 2018. At the time, I was the CTO of another startup called Spectrum. At Spectrum, we were building a modern take on the classic community forum. Essentially, we were trying to combine the best of what phpBB gave us 20 years ago with the best of what Discord and Slack give us nowadays. It was a public forum, but all of the comments on any post were real-time chat. So we tried to take these two worlds that are currently very separate, where communities in Slack and Discord write lots of messages but none of them are findable, and make them public and a little bit more organized so that you could find them afterwards on Google or elsewhere. That actually worked out surprisingly well, which led to quite a bit of user growth. As you can imagine, with all of this user-generated content, lots of people found us on Google and elsewhere and started visiting Spectrum quite regularly. That meant we had quite a bit of growth. Now, unfortunately, I had chosen a database that wasn't very well supported.
I had chosen RethinkDB, which nowadays doesn't even exist anymore; the company behind it shut down after a while. I'd chosen that database originally because they advertised themselves as the real-time database. Their key feature, the thing they praised externally, was that you could put this `changes` key at the end of any database query, and it would stream real-time updates for that database query to you. So you could listen to practically any data changes, which felt like a fantastic fit for what we were trying to do, because almost anything in Spectrum was real-time, right? The posts popped in in real-time. The chat was real-time, of course. We had direct messages, which had to be real-time. So this felt like a great fit for what we were trying to do. Lesson learned, in hindsight: rely on the databases that everybody uses. There's a reason everybody uses Postgres and MySQL and now Mongo. There's a reason those databases are as prevalent as they are, and it's because they work. I'm a lot wiser now; I wasn't that wise back then. And so it very quickly turned out that the real-time nature of RethinkDB didn't scale at all. We had hundreds of thousands of users every single month, but RethinkDB couldn't even handle a hundred concurrent change listeners. As you can imagine, every person that visits the website starts many different change listeners, right? We're listening to changes of the specific post that they're looking at. We're listening to changes of the community that the post is posted in. We're listening to new notifications. We had a bunch of listeners per user. And essentially, our database servers were on fire, literally on fire. Well, thankfully not literally, but they were crashing quite frequently. I Googled "servers on fire" and found this amazing stock photo of servers on fire, which, if your data center looks like this, you have some really serious problems.
Ours weren't quite as bad, but they were still pretty bad. So we had this database that didn't scale, and we essentially had to work around that limitation. We wanted to switch to a more well-supported database. However, that's a lot of work. Rewriting the hundreds of database queries we'd written and optimized up to that point, migrating all of that data without any downtime, that was just a whole project, and we wanted to get there eventually, but we needed a solution right at that moment for the fact that we were crashing literally every day. As I was thinking about this, I realized that we had an ideal use case for caching, because our API was really read-heavy. It's public data; lots of people read it, but not as many people write to it. We'd originally chosen GraphQL for our API because we had a lot of relational data. We were fetching a community, all the posts within that community, the authors of every post, the number of comments, a bunch of relational data, and GraphQL was a fantastic fit for that use case. It worked out extremely well for us, and we really enjoyed our experience of building our API with GraphQL. The one big downside that we ran into was that there weren't any pre-built solutions for caching GraphQL at the edge, which is what we wanted to do. We wanted to run code in many, many data centers all around the world, route our users to the nearest data center, and cache their data very close to them, for very fast response times, but also so that we could reduce the load on our servers. Now, if you've ever used GraphQL, then you know that that is essentially what GraphQL clients do in the browser. If you've heard of Apollo Client, Relay, or Urql, what all of these GraphQL clients are is essentially a fetching mechanism for GraphQL queries that very intelligently caches them in the browser for a better user experience.
So in my head, the question I wanted to answer was: can't I just run a GraphQL client at the edge? GraphQL clients already do this in the browser, so why can't I just take this GraphQL client that's running in my local browser, put it on a server somewhere, and have that same caching logic, but at the edge? To answer that question, I want to dive a little bit into how GraphQL clients cache. Let's look at this example of a GraphQL query, which fetches a blog post by a slug, and fetches its ID, title, and author, and of the author, it fetches the ID, name, and avatar. There is one magic trick that makes GraphQL caching really great, and that is the __typename meta field. In your query, you can add that to any object type, and you will get back the name of that type in the response. So for example, with this query, we would add __typename in these two places, for the post and also for the author. When the origin responds with the data, the response will look something like this, with the important piece being that now we have the post data, and we know that the type that was returned there was a Post. And the same thing for the author: we got the author data, and we also know that the author is a User. And when we take this response and store it in our cache locally in the browser, we can now associate that cached query response with those two objects. We can tag it with the Post with the ID 5 and the User with the ID 1. So we've taken this query response, we've put it in the cache, and we've keyed that by the query that we saw, so by the getPost query. Any time we see the same query, we return that same data. Why are these tags relevant? Why do I care that this contains the Post with the ID 5 and the User with the ID 1? Well, this is where the magic comes in. GraphQL also has something called mutations, which are essentially just actions. Anything that changes data needs to be a mutation.
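The query and the tagging described above can be sketched like this. This is a minimal sketch: the query shape follows the talk, but the response values and the `collectTags` helper are illustrative, not Spectrum's or any client's actual code.

```typescript
// The getPost query from the talk, with __typename added in both places:
const GET_POST = `
  query getPost($slug: String!) {
    post(slug: $slug) {
      __typename
      id
      title
      author {
        __typename
        id
        name
        avatar
      }
    }
  }
`;

// A response shaped like the one described in the talk (illustrative values).
const response = {
  data: {
    post: {
      __typename: "Post",
      id: "5",
      title: "How to Edge Cache GraphQL APIs",
      author: { __typename: "User", id: "1", name: "Max", avatar: "max.png" },
    },
  },
};

// Walk the response and collect a "Type:id" tag for every object that
// carries both __typename and id; these tags are what the cached query
// result gets associated with.
function collectTags(value: unknown, tags = new Set<string>()): Set<string> {
  if (Array.isArray(value)) {
    for (const item of value) collectTags(item, tags);
  } else if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    if (typeof obj.__typename === "string" && obj.id !== undefined) {
      tags.add(`${obj.__typename}:${obj.id}`);
    }
    for (const key of Object.keys(obj)) collectTags(obj[key], tags);
  }
  return tags;
}

// collectTags(response) yields the tags Post:5 and User:1
```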
For example, take a mutation called editPost, which edits a post. In this case, we're editing the post with the ID 5 and changing its title. Any mutation also has to fetch whatever it changed, so in this case, we're getting back the post. And again, we can do the same thing we did for the query and add the __typename field to the response. Now, when that response comes back from the origin to our client, the client can look at this response and go: oh look, we just sent a mutation to the origin, that mutation has come back from the origin, and the data that was returned was the Post with the ID 5. Huh, I actually have a cached query response that contains that Post with the ID 5. I can now automatically invalidate that cached query result that contains the stale data of this post. That's amazing, right? And this is what GraphQL clients do under the hood. They do this magic invalidation based on the __typename field and the ID field, and they combine them to invalidate any stale data that has been changed at the origin. There's one slight edge case here where the magic kind of ends, which is list invalidation. Imagine a query that fetches a list of blog posts, in this case just their ID and title. When we look at the response to this query, it's an array that just contains the one blog post that we have right now: the Post with the ID 5, "How to Edge Cache GraphQL APIs". Now, a mutation that creates a new post poses an interesting problem, because of course the response to this createPost mutation will look something like this: it will return an object of a Post with the ID 6. But our cached query result for the post list doesn't contain the Post with the ID 6. And that's really annoying, because it means that GraphQL clients can't automatically invalidate lists when new items are created. Kind of frustrating. Thankfully, they found a good workaround for this, which is manual invalidation.
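The automatic invalidation step can be sketched as follows. This is a toy model of the idea, not any real client's internals: each cached query result remembers the "Type:id" tags it contains, and a mutation response evicts every entry sharing a tag.

```typescript
type CacheEntry = { data: unknown; tags: Set<string> };
const queryCache = new Map<string, CacheEntry>();

// The cached getPost result from before, tagged Post:5 and User:1.
queryCache.set("getPost:how-to-edge-cache-graphql-apis", {
  data: { id: "5", title: "How to Edge Cache GraphQL APIs" },
  tags: new Set(["Post:5", "User:1"]),
});

// Called when a mutation response comes back carrying, say, Post:5.
function invalidateByTags(mutatedTags: string[]): string[] {
  const evicted: string[] = [];
  for (const [key, entry] of queryCache) {
    if (mutatedTags.some((tag) => entry.tags.has(tag))) {
      queryCache.delete(key);
      evicted.push(key);
    }
  }
  return evicted;
}

// The editPost mutation returned { __typename: "Post", id: "5" }, so:
invalidateByTags(["Post:5"]); // evicts the stale cached getPost result
```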
Essentially, GraphQL clients give you different APIs to manually influence the cache and change it depending on which things pass through it. For example, with Urql, which is the third biggest GraphQL client, this would look a little bit like this: you could tell Urql that when the createPost mutation passes through the GraphQL client, it should invalidate any cached query result that contains the posts query, the one that contains the list of posts. That way we can invalidate it, no problem, and whenever a post is created, our GraphQL client will automatically refetch the fresh data from the origin. GraphQL clients actually go one step further and do something called normalized caching. If we go back to our original query, fetching a single blog post, its ID, title, and its author, then rather than taking the entire response, the Post with the ID 5 and the User with the ID 1, and putting that entire thing keyed by the query into the cache, they actually take each object within the query response and store it individually. So inside of Urql's cache, this looks a little bit like this, where in the cache we essentially store: the Post with the ID 5 corresponds to this data, and the User with the ID 1 corresponds to this other data. Why do we care to do this? Because now if a query comes in that, for example, fetches the User with the ID 1, then the cache can go: oh, hold on, you're fetching the User with the ID 1. Although we haven't seen this specific query before, we do actually have that specific data in our cache, and we can just serve you that on the client without you having to go to the origin to fetch that data again, because we've already fetched it.
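The manual list invalidation described above, if I recall Urql's API correctly, looks roughly like this with Urql's normalized cache package, `@urql/exchange-graphcache`. Treat this as a sketch against that package's documented `updates` config (the `createPost` and `posts` field names are this article's running example), not copy-paste-ready code for your schema.

```typescript
import { cacheExchange } from "@urql/exchange-graphcache";

// When a createPost mutation passes through the client, invalidate the
// cached `posts` list on the root Query type, so the next read of the
// list refetches fresh data from the origin.
const exchange = cacheExchange({
  updates: {
    Mutation: {
      createPost(_result, _args, cache) {
        cache.invalidate("Query", "posts");
      },
    },
  },
});
```

The resulting `exchange` is then passed to the Urql client's `exchanges` array in place of the default document cache.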
It was just deeply nested in some other query, but we've normalized that for you and can now give you the data for the User with the ID 1, just like that, which is very nice and actually makes for less network traffic and a much nicer user experience, because things will resolve much, much faster since they're already on the client and loaded. You essentially only ever fetch every object once, which is fantastic, particularly if people navigate around your app quite frequently. Now, the one thing that's missing here, which you might have noticed, is the links between objects. We have the Post with the ID 5 data and the User with the ID 1 data, but how do we know that the post's author is the User with the ID 1? Well, Urql stores that in a separate data structure that looks like this, which essentially just describes the relations, or the links, between things. Here we're essentially saying: hey, if you're fetching the post with this specific slug, that corresponds to the Post with the ID 5. If you're fetching the Post with the ID 5's author, that corresponds to the User with the ID 1. And the User with the ID 1 doesn't have any further relations or links that you can go into. Now, what I really want you to take away from this section is that GraphQL is actually awesome for caching. It's really, really good for caching because of its introspectability. It tells you what data you're returning, and this introspectability, combined with a strict schema where you have to return something that matches that schema, means it's actually really good for caching. And that's also a lot of the reason why so much great tooling has sprung up around GraphQL. It's gotten such wide community adoption that if one person builds tooling for it, because it's always the same GraphQL spec that it has to follow, everybody else gets to benefit from that tooling. And that's incredibly powerful.
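The two data structures described above, the normalized entities and the links between them, can be sketched as a toy model. This is a sketch in the spirit of what the talk describes, not Urql's actual implementation; the key formats and field names are illustrative.

```typescript
// Entities live in one map, each stored exactly once, keyed by "Type:id".
const entities = new Map<string, Record<string, unknown>>([
  ["Post:5", { __typename: "Post", id: "5", title: "How to Edge Cache GraphQL APIs" }],
  ["User:1", { __typename: "User", id: "1", name: "Max" }],
]);

// Links record which entity a root field or a relation resolves to.
const links = new Map<string, string>([
  ['Query.post({"slug":"how-to-edge-cache-graphql-apis"})', "Post:5"],
  ["Post:5.author", "User:1"],
]);

// Resolve a relation straight from the cache: follow the link, then read
// the entity, with no network round-trip.
function readLinkedEntity(from: string, field: string) {
  const target = links.get(`${from}.${field}`);
  return target ? entities.get(target) ?? null : null;
}

// readLinkedEntity("Post:5", "author") returns the User:1 record
```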
Now, to get back to my original question that I posed way back in 2018: can't I just run a GraphQL client at the edge? Can't I just take this logic that Apollo Client, Relay, and Urql have internally anyway, take that same code, and put it on a bunch of servers around the world, at the edge, so that everybody that uses Spectrum everywhere gets super fast response times and we massively reduce the load our server has to handle? Well, the key to the answer lies in the last part: the edge. Because as it turns out, GraphQL clients are designed with very specific constraints that differ ever so slightly from the constraints we would have to work with at the edge. One of the main ones we have to deal with if we deploy caching logic to the edge is authorization. If a GraphQL client runs in the browser, it knows that if something is in the cache, whoever's requesting it again can access it, because it's the same person, right? If I'm using Spectrum and I'm querying for the post with the ID 5 and the GraphQL client puts that in the cache, then the GraphQL client doesn't have to worry about authorization. It doesn't even have to know anything about authorization, because I am allowed to access the post with the ID 5. So if I request the same post again, the client can just give that to me from the cache, no problem. At the edge, that's slightly different. If we have one server sitting there that a lot of users are requesting data from, some of those users might be allowed to access the post with the ID 5, but others maybe aren't. Or maybe even more specifically, if you think about user data: maybe somebody is allowed to access their own email, but nobody else is. So we can't just take a query and put that result in the cache, because that would mean everyone gets served the same data.
So if somebody queries some data that's sensitive, that's specific to that user, suddenly that would be served to everyone. That would be a terrible security nightmare and a really bad experience, because we would essentially just be leaking data. Very bad idea. So at the edge, rather than just making the cache key a hash of the query, where essentially we take the query text and the variables and use that as the cache key, we also have to take the authorization token into account. Whether that's sent via the Authorization header or whether that's a cookie, we have to add it to the cache key, so that if somebody else sends the same query, they don't get the same response. It's as simple as that: just put the authorization token in the cache key and everything will be fine. The other part that's a little bit different is cache purging, because not only do we have to do automatic cache purging and support manual invalidation for lists, we also have to do it globally. If you're running at the edge in all of these data centers globally, then you have to invalidate data globally, right? If the post with the ID 5 changes, and the user sends a mutation to edit it, or the server says, hey, look, this has changed, and wants to manually invalidate it, then you have to do it globally. You can't just do it in one data center. That would be a terrible experience, because the stale data would stick around in every other data center. So as we were thinking about these problems for GraphCDN, and as we were building out this GraphQL edge cache solution, we came to the conclusion that we were going to use Fastly's Compute@Edge product. Now, we are huge fans of Fastly here. And the reason we chose Fastly is because, like their name suggests, they are super fast.
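Folding the authorization token into the cache key can be sketched like this. A minimal sketch, not GraphCDN's actual implementation; a real edge cache would also decide per query whether the token needs to be part of the key at all (public data can safely share one entry).

```typescript
import { createHash } from "node:crypto";

// Hash the query text, its variables, and the caller's authorization
// token together, so two users never share a cached entry.
function cacheKey(
  query: string,
  variables: Record<string, unknown>,
  authToken: string | null,
): string {
  return createHash("sha256")
    .update(query)
    .update(JSON.stringify(variables))
    .update(authToken ?? "public")
    .digest("hex");
}

const query = "query { viewer { email } }";
const keyA = cacheKey(query, {}, "token-of-user-a");
const keyB = cacheKey(query, {}, "token-of-user-b");
// keyA differs from keyB: the same query gets a separate entry per user
```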
Fastly has about 60 data centers worldwide, and ever increasing, spread across the entire globe. Now, here is a crazy fact about Fastly's invalidation logic: if you take a query response, put it into Fastly's cache, and tag it with the Post with the ID 5, and you then send an API request to Fastly to invalidate any cached query result that contains the Post with the ID 5, they can invalidate that stale data within 150 milliseconds, globally. 150 milliseconds, globally. That is probably faster than you can blink. In the time that it takes me to do this, Fastly has already invalidated the data globally. That is absolutely mind-blowing to me. I actually looked it up a while ago: wait, hold on, how fast even is the speed of light? Surely that takes a while to go around the globe once. And it turns out light does take about 133 milliseconds, if I remember correctly, to get around the entire globe. So how can Fastly invalidate within 150 milliseconds? Well, the answer is of course that they don't have to go around the entire globe, because they're going bi-directional. They're going both ways at the same time, so they only have to go around half the globe, which cuts the time in half. And then they also have a really fancy gossiping algorithm, which you can Google; they've written some great articles about it. I bow down in front of their engineers, because it's absolutely genius. And it is so fast that it enables our customers to cache a lot more data. If you can invalidate stale data within 150 milliseconds globally, imagine how much more data you can cache, because it will never be stale. When the data changes, send an API request, and 150 milliseconds later everybody globally has the fresh data. Imagine how much more data you can cache if you have this super fast invalidation.
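The "send an API request to invalidate" step can be sketched as building a surrogate-key purge request. The endpoint shape here follows my reading of Fastly's documented key-based purge (POST to /service/{service_id}/purge/{key} with a Fastly-Key auth header); the service ID and token are made-up placeholders, so check Fastly's docs before relying on this.

```typescript
// Build (but don't send) a purge request for everything tagged Post:5.
function purgeRequest(serviceId: string, surrogateKey: string, apiToken: string) {
  return {
    method: "POST" as const,
    // The tag must be URL-encoded since ":" is not safe in a path segment.
    url: `https://api.fastly.com/service/${serviceId}/purge/${encodeURIComponent(surrogateKey)}`,
    headers: { "Fastly-Key": apiToken },
  };
}

// Evict every cached response tagged with Post:5, in every data center.
const req = purgeRequest("my-service-id", "Post:5", "my-api-token");
// req.url ends with "/purge/Post%3A5"
```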
And that's the reason we use Fastly: they're super fast and we're super happy with them. So that's essentially what GraphCDN is. We rebuilt this caching logic to run at the edge, to take authorization into account, and to have this global cache purging. We deploy it to Fastly's Compute@Edge, and its 60 worldwide data centers allow our customers to cache their GraphQL queries and responses at the edge. I wish this had existed back in 2018 when we had our scaling problems with Spectrum. At the time, I just built a terrible in-memory caching solution that reduced the load slightly, until we eventually got acquired by GitHub. If we had had GraphCDN, we would have been able to scale so much more smoothly. We would have saved so much money, because of course running something at the edge is much cheaper than running the request through our entire infrastructure. And it would have been a much better experience for all of our global user base, because everybody would have had super fast response times from their local data center. All right, I hope you learned about GraphQL caching today. The main thing I want you to take away is: GraphQL is amazing for caching. That's really the takeaway I want to home in on. GraphQL, absolutely fantastic for caching. The introspectability, the strict schema, chef's kiss, just absolutely fantastic. And if you have a GraphQL API, I'd love to meet you. I'd love to hear what else we can do for you in the future, even if you don't need caching. Thank you for having me. If you have any questions, feel free to hit me up anytime. I am @mxstbr practically everywhere on the internet, and I look forward to hearing from you. That's awesome. So before we started playing the talk, we asked the audience the question you wanted to ask: do you cache your GraphQL API? And the results are in. I see that 44% of the people say yes, while 33% say no, and 22% say that they do not have a GraphQL API.
So what's your take on that? Are you surprised, or were you expecting this kind of percentage? That's actually great to hear. I'm surprised 22% of people watching don't have a GraphQL API; that's awesome. You should really be using GraphQL. I can highly recommend it. I'm obviously a huge fan. I think it's great to see that 44% are already caching their GraphQL API, and I think the remaining 33% should think about whether they have a use case for it and whether they should maybe cache their GraphQL APIs. That's really amazing. So now that you talk about the use case, I would like to start with a question which I really had while I was going through the talk. What could be some of the probable use cases for GraphCDN, and when should one start thinking about caching their GraphQL queries? That's a great question. We at Spectrum, the story I told from 2018, had a perfect use case for caching. We had lots of public data and were very read-heavy, probably 95 to 99% reads and only a few percent writes. So really an ideal use case for caching. And I think that's essentially what it boils down to. I have some friends that work at Sentry, for example; Sentry is the error tracking service. And when I told them that we were building a GraphQL caching service, they were like, why would I ever want to cache my API? And of course that makes total sense, because Sentry's use case is super write-heavy. Sentry ingests millions of errors probably every single hour, and so their data changes way more often than a cache could ever be updated. So caching for them doesn't really make a lot of sense at many layers, and really they're a lot more focused on ingestion performance, and they've spent a lot of time on making that really good. So I think what it really boils down to is: do you have a very read-heavy use case? And how shared is your data?
If you have lots of authenticated data, and every user only sees their own data and nothing else, and they only visit your app once every week, then you're not going to have a very high cache hit rate either, because yes, the data for that one user might be stored, but they're not going to come back to actually get a cache hit. So you don't really get any of the performance benefits or load benefits of caching your GraphQL API. So I think it really boils down to: do you have a read-heavy use case, and secondarily, do you have a use case where some data is shared between users? Now, that being said, we've been surprised by how high the cache hit rates are that people have been able to get, even with authenticated use cases. Our customers that have authenticated use cases still see 40, 50, 60% cache hit rates, which is a lot higher than we were expecting. And that's still a lot of load off your plate. That's still 50, 60% less traffic at your origin and much faster performance for your users. But of course, it's not the 95, 99% cache hit rate you would see with very public data that is very read-heavy, if that makes sense. Yeah, that's really impressive, actually. Yes, thank you. That definitely makes sense. So we have a question from Newbie. They want to know: what is the edge? Can you explain the edge like I'm five? Yes, I'm going to try. I'm not sure if a five-year-old will understand this, but, wait, how do I explain this to a five-year-old? Now I have to think about this. Rather than everyone having to drive all the way to the farmer to get their food and then drive all the way back, you build supermarkets, and the supermarkets get some stuff from the farmer and then store it at the supermarket. So you only have to go to your local supermarket and back, rather than all the way to the farmer. And the farmer has to deal with way fewer people; he just sells his stuff to one supermarket, and that's it.
And then you don't have to drive all the way to the farmer and all the way back; you can just go to the supermarket that's nearest to you. That is what edge computing is: it is moving compute closer to where your users are, rather than having everything in one central place. I just came up with that off the top of my head. I don't know if that makes any sense at all, but I think that's how I would explain it to a five-year-old. Yeah, wow, it's amazing. Even I didn't have much information about the edge, but I was able to understand that. I hope the person who asked the question was able to understand at least a bit of it. So thank you for taking the time to explain it that way. So we have another question from Danny Z. First of all, they want to say: amazing talk, as always. They are so keen to start using GraphCDN. And they have a question: do we still need local GraphQL client caching when using GraphCDN, especially the manual invalidation? That's a great question that I get sometimes and that I actually should have spoken about in the talk; that's already great feedback for me. The answer is most likely yes, because nothing is faster than loading data from memory in your browser, right? Your user clicking around the app and going back to screens they've already seen doesn't really require going to an edge cache, even if it's an edge cache in their local metropolitan city. You still want to use a GraphQL client for the very best user experience. It's almost like a layered caching approach: on the one hand, you have the per-user, in-browser client cache; then you have the edge cache that's shared between everyone in the same location; and then you have your origin. So you have a layered cache strategy, where of course the single user is only ever going to get a cache hit on data they've loaded before, but at the edge, you might get cache hits for data that other users have loaded as well.
And so you're going to have different trade-offs and different invalidation requirements, because of course the browser doesn't know when the server data changes. So yes, you most likely want to use a GraphQL client for the great user experience and the great developer experience that they provide. Honestly, I think one of the things that makes GraphQL great, as I'm sure you know, is GraphQL clients; just using them to fetch data from an API feels amazing, right? All of these hooks that we have nowadays in the React world, the developer experience is just super duper nice. So yeah, I would always use a GraphQL client. We use a GraphQL client for the GraphCDN dashboard, and I've always used a GraphQL client. I hope that answers the question. Thank you. So yeah, you got the answer: do use it. So now we have the next question, by Alexander Warwick: how do you cache with Apollo? Well, that depends on what they're referring to. Either they're referring to Apollo Client, in which case I just answered that question: you still want to use caching with Apollo Client, and that is separate from your edge cache. But there's also Apollo Server. If you have a GraphQL API that you created with Apollo Server, you can put GraphCDN in front of it without any issues, and we will cache your GraphQL API. We support any GraphQL API, whether you own it or not. You can put us in front of a GraphQL API that you built with Apollo, but you can also put us in front of GraphCMS, or the Shopify API, or the GitHub GraphQL API, or any other GraphQL API that you have, and we'll cache it for you. That's awesome. Actually, I mixed up the names: the previous question was from Cafe M-A-K-R-A, and now the question is from Alexander. And remember, Max, we were talking about the cache key a while ago.
So they have a similar question. You mentioned adding auth data to the cache key. How do you prevent cache hits from never happening, since the cache key is unique for every request? Is there any upper limit to the authorization flexibility that a service using GraphCDN can have before it stops making sense? Yeah, like I mentioned, I think if you have a very, very unique use case, where every user sees their own data and nothing else, then caching probably doesn't make sense, because your cache hit rate is going to be very, very low. However, if you have any data at all that is shared between users and that you can cache, then yeah, it totally makes sense. GraphCDN gives you very fine-grained control. You can say: hey, look, if it contains the data of a certain user, then please cache the entire query for that specific user. But if it contains, I don't know, a blog post, then cache it publicly for everyone. We have the ability to differentiate on a per-query level, so you can configure GraphCDN to cache things differently depending on whether the data is public or authenticated and should only be returned to that single user. So you have ways of tweaking your cache hit rate and getting it higher, even if you have some authenticated use cases. But again, if you have a use case that's very authenticated, it's not going to help; you're not going to have a great cache hit rate no matter what. That's amazing. So that was the last question we are taking today. Thanks a lot, Max, again, for taking the time to give us such a nice overview of caching, and also for all these answers to the questions. To the audience: you can still catch Max in the speaker room, so head over there and you can still speak to him. And you can also ask him questions on his socials. Thanks a lot, Max, once again. It was really amazing talking to you. Bye. Thank you for having me.
23 min
09 Dec, 2021
