GraphQL Caching Demystified

How would you implement a performant GraphQL cache? How can we design a good algorithm for it? Is there a good Open Source solution that is efficient, scalable and easy to deploy? How is the caching key computed? What about cache invalidations? Would it be possible to deduplicate resolver execution? This might seem daunting, but in reality it is all code and algorithms.

In this talk we are going to walk through a GraphQL caching system that we have developed for Mercurius - one of the fastest GraphQL servers for Node.js.

21 min
10 Dec, 2021

Video Summary and Transcription

Today's talk focuses on GraphQL caching and improving performance using Fastify and Mercurius. The experiment involves federated services, resolver caching, and load testing with autocannon. Enabling caching with a 0 second TTL can increase throughput by 4 times. The async-cache-dedupe module allows for efficient caching and avoiding unnecessary computations. In production, this logic combined with Redis auto-pipelining has improved requests per second by 100 times. Cache invalidation is an ongoing topic of development.

1. Introduction to GraphQL Caching

Short description:

Today, Matteo Collina will talk about GraphQL caching and how to improve the performance of your GraphQL gateway by four times. He will use Fastify, one of the fastest web frameworks for Node.js, and Mercurius, the GraphQL adapter that runs on top of Fastify and integrates with the GraphQL JIT library for faster query execution.

Hi, everyone. I am Matteo Collina, and today I'm going to talk to you about GraphQL caching. Before we start, please follow me on Twitter at matteocollina. You can find it on the slide, so hey, here I am. I talk a lot about Node.js, JavaScript, GraphQL, open source, all the things, so I don't know, you might find it interesting.

So today we are going to talk about GraphQL. Before we start, though, oh, one more thing: follow my newsletter, Adventures in Nodeland. Who am I? I'm Matteo, I'm part of the Node.js Technical Steering Committee, and Chief Software Architect at a company called NearForm. Check us out, we are hiring, and we are doing a lot with GraphQL, so if you want, it's a good company.

Anyway, going a little bit further, back when I was a kid in the 90s, yes, I am telling how old I am, I was really, really, really impressed by this show by David Copperfield, and I don't know about you, but I always wanted to be, I was fascinated by magic, right? So, and, you know, how to make things disappear, how to make things, you know, fly, whatever. It's very, very interesting and I found them very entertaining, the shows. And in fact, there is a lot of hard work behind magic, right? So in this talk, we are going to talk about magic and we are going to make things disappear. So in fact, we are going to apply magic to GraphQL. So we are going to show how to improve the performance of your GraphQL gateway by four times. How? By making things disappear.

So, how? Well, let's talk a little bit about the tools of the craft. We need tools, right? We need things that we're going to use for this little demonstration. So first of all, we are going to use Fastify. Fastify is one of the fastest web frameworks for Node.js. It's very easy to use. It's similar to Express, but more modern, faster. It has more features. All things that you will need. It's great! Check it out. We are going to use Mercurius. Mercurius is the GraphQL adapter that you can run on top of Fastify. It's cool. Mercurius offers a few interesting features that make it unique. First of all, it integrates with the GraphQL JIT library, so that we can take your query and do just-in-time compilation for it, so that it can execute way faster. And so on.

2. Tools, Experiment, and Magic

Short description:

The tools discussed include a library called autocannon for load testing in JavaScript. The experiment involves two services federated by a gateway, offering user and post objects. The just-in-time compiler and cache module will be used to enable resolver caching. The service is a simple user object with an ID and a name. Live load testing will be performed on dedicated hardware.

It also does a little bit more things like that for performance and speed reasons. It's great. So check it out. Oh, it also supports the full federation, both as a gateway or as a microservice.

So the last tool of the craft is a library called autocannon. autocannon is a tool that I wrote long ago to do some load testing, and you can use it to script things in JavaScript. So it's for scripting load testing in JavaScript. It's great. I use it a lot. So these are our tools, right?

Okay. So we're going to use these three things. So let's talk a little bit about our experiment. We have two services that are federated by a gateway. And one offers the user object, and the other one offers the post object. And we are going to use the just-in-time compiler, and we will enable the cache for the resolver depending on our algorithms. So we can run multiple experiments, right? You can see it here. You can run multiple experiments. And see the impact of this cache module, what does this module look like? So let's see where things disappear or reappear.

What's the service? Well, this is an example of the service. Literally, it's a user object that has an ID and a name. Very simple, okay? It's nothing special here.

So it's time for some magic. Are you ready for the magic? Let's make things disappear. So how? Well, let's go back into our terminal. So this is connected to my server. So it's running on dedicated hardware. So I'm going to do live load testing. Oh, wow.

3. Mercurius Cache Repo and Experiments

Short description:

In the Mercurius cache repo, we have benchmarks, gateway services for user and post data, and experiments using autocannon. Running the script without caching does 3000 requests per second. With a zero second TTL, it increases 4x. Let's explain this further.

So let's look at my repo. All of this is in the Mercurius cache repo. And we can see that we have our benchmarks and this is the gateway that I just showed you. And we have our gateway services. This is the user and this is the post. Note that these services are all serving the data from memory. So there are no databases involved. They're really fast.

And this is our bench. So how do we benchmark things using autocannon? So basically we require autocannon. And then we have our query, and we send our query as a body with 100 concurrent connections. That's it. And then we do several experiments using our bench script. So in our bench script, you see that we are running all the services plus multiple examples: one with no cache, one with a zero second time to live, one with a one second time to live, and one with a 10 second time to live.

So let's run this script. So first of all, this is our basic, this is our control check, right? It's a gateway mode. Something where we are not going to cache anything. So we have done this, and, whoa! It does 3000 requests per second. Okay, seems fast or not, depends on what you want to do. Latency is good though. So I'm pretty happy with the results. Now with zero second TTL, whoa! How? It just bumped 4x. I'm not doing any caching. It's zero seconds time to live. I'm just turning it on and it just does 4x. And, whoa! Still 4x. Like, how is this possible? Like, how does this work? OK, let's leave this running. Let's explain this in a little bit of a second.

4. Caching and Deduplication

Short description:

Our baseline has a P99 latency of 80 milliseconds, resulting in approximately 3000 requests per second. By enabling caching with a 0 second time to live, we can reduce latency to 18 milliseconds and multiply the number of requests per second, the throughput, by 4. The flame graph shows that the majority of time is now spent on caching, thanks to deduplication. The Node.js event loop diagram provides insight into the execution flow and the blocking time between C++ and JavaScript, which is utilized for deduplication by computing a cache key.

So our baseline has a P99 latency, which is what you want to measure for latency, of 80 milliseconds. In terms of throughput, it gives you more or less 3000 requests per second.

However, I can also create flame graphs. What is a flame graph? Well, this is a representation of where our CPU time is being spent. More specifically, all that time is being spent in doing HTTP requests. By the way, if you have not seen my talks about undici and Node.js, please check it out because you can speed up quite a lot your HTTP calls.

But the result is that the vast majority of the time is spent doing HTTP. So, well, what can we do? We need to reduce the HTTP. Yeah, how can we improve this? Well, just by setting a 0 second time to live, we can reduce the latency to 18 milliseconds and multiply by 4 the number of requests per second, the throughput. Whoa! This is quite an improvement for not having any caching at all. Zero caching. It's not caching at all. We just enable the cache. And if we give the cache a longer time to live, it does not improve much.

Okay, so how is this possible? Well, this is the flame graph of our gateway now. And as you can see in the middle, the HTTP request that was there before is gone. And now we have in the middle a huge block of time being spent doing the caching, okay. So literally now the bottleneck is the caching system. But where did the HTTP call go? Where did it disappear? Well, what we are doing is deduplication, which is a strategy that will make things incredibly faster, especially on the GraphQL side.

So, let's go back and talk a little bit about the Node.js event loop. You probably have seen this diagram about Node. This is great because you can see the request comes in, you have an event, it goes into the event queue, it gets processed, and then this generates more asynchronous activity. What you have not seen is this diagram. This diagram is a different presentation of the exact same event loop. However, it shows it from the point of view of the JavaScript function being executed. So, when the event loop is running, it's waiting for something, right? On the left and on the right, the event loop is waiting. Then, when an event happens, it calls into C++, which calls into JavaScript, which typically schedules some next ticks or some promises, then goes back to C++, which in turn kicks off the promises and next-tick execution, and finally, once all of that is done and settled, it goes back and relinquishes control to the event loop. All the time it takes in between, though, from this starting point in C++ to the end of it, is the time where the event loop is blocked. So, in order to do the deduplication of the request, what we are doing is, when our resolver is being executed, we compute a cache key, okay? And with that cache key, we can create a matching promise.

5. async-cache-dedupe Module

Short description:

The async-cache-dedupe module allows you to compute the same cache key for a resolver and avoid executing it multiple times. It automatically caches the results and provides a fast and efficient way to avoid unnecessary computations.

And then we can, you know, complete our execution, right? However, when a follow-up execution comes in, we can compute the exact same cache key and get the promise that we put there before, which might still be pending. So we don't need to execute the same resolver two times. We can execute it only once, right? It's pretty great. We can avoid a lot of computations this way. This is what this module does. It's called async-cache-dedupe. You create a new cache, you define some asynchronous methods on it, and then it automatically caches the results. You can have a TTL, but it automatically dedupes and caches the result. It's phenomenal and it's really fast. So you can use this in all the other places where you want this kind of system, right?
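The deduplication idea described above can be sketched in a few lines. This is a minimal illustration and not the async-cache-dedupe API: the `dedupe` helper, its `ttl` option (in seconds) and the injectable `now` clock are names assumed here for the example. Calls that share a key while the first one is still pending receive the same promise, and with a TTL of 0 nothing is kept once the promise settles.

```javascript
// Hypothetical helper (not the async-cache-dedupe API): wraps an async
// function so that calls with the same key share a single promise.
function dedupe(fn, { ttl = 0, now = Date.now } = {}) {
  const entries = new Map(); // key -> { promise, expiresAt }

  return function deduped(key) {
    const hit = entries.get(key);
    // Reuse the entry while it is pending (expiresAt === null) or still fresh.
    if (hit && (hit.expiresAt === null || hit.expiresAt > now())) {
      return hit.promise;
    }

    const entry = { promise: null, expiresAt: null };
    entry.promise = Promise.resolve()
      .then(() => fn(key))
      .then(
        (value) => {
          if (ttl > 0) {
            entry.expiresAt = now() + ttl * 1000; // cache the settled value
          } else if (entries.get(key) === entry) {
            entries.delete(key); // ttl 0: dedupe only, keep nothing afterwards
          }
          return value;
        },
        (err) => {
          if (entries.get(key) === entry) entries.delete(key); // never cache failures
          throw err;
        }
      );
    entries.set(key, entry);
    return entry.promise;
  };
}

// Usage: three calls issued together trigger one execution of the function.
let calls = 0;
const getUser = dedupe(async (id) => {
  calls++;
  return { id, name: `user-${id}` };
});

Promise.all([getUser('1'), getUser('1'), getUser('1')]).then((users) => {
  console.log(`results: ${users.length}, executions: ${calls}`);
});
```

Storing the promise rather than the resolved value is what makes the zero-second case useful: the second caller arrives while the first request is still in flight, so there is no value to cache yet, only work to share.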

6. Implementing Resolvers and Caching

Short description:

When implementing a resolver in Node.js, you have four arguments: root, arguments, context, and info. By combining the resolver anatomy, the info object, and other parameters, you can compute a cache key for each GraphQL resolver. However, in-process caches are problematic, and using Redis as a shared cache between nodes can lead to performance issues.

When you implement a resolver in Node.js, you have four arguments: the root, the arguments, the context, and the info. The root is the current object; then you have the arguments for the resolver; the context, which can include your Node.js request, response, database connections, all those things; and then the info object, which includes the definition of the query that you are computing.

Well, keep that in mind and just wait for a second. Now, what you can do is take an arbitrary object and JSON-ify it, right? You can call JSON.stringify on it. If you do that, depending on the order of the properties, you will get different JSON. However, there is a module called safe-stable-stringify which, independently of the ordering of the properties, will always generate the same JSON. So, what we can do is use this module and combine it with the resolver anatomy, the data on the info object, the root and all those things to create a cache key, an arbitrary hash key, for that specific resolver.
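To make the property-order problem concrete, here is a small hand-rolled serializer that sorts object keys before writing them out. It is only an illustration of the idea and is not how safe-stable-stringify is implemented; the `stableStringify` name is an assumption for this sketch, and it does not handle circular references.

```javascript
// Illustrative only: a simplified deterministic serializer. It is NOT the
// safe-stable-stringify implementation and does not detect circular references.
function stableStringify(value) {
  if (value === null || typeof value !== 'object') {
    return JSON.stringify(value); // primitives; undefined and functions yield undefined
  }
  if (typeof value.toJSON === 'function') {
    return stableStringify(value.toJSON()); // e.g. Date objects
  }
  if (Array.isArray(value)) {
    // Arrays keep their order; unserializable items become null, as in JSON.
    return '[' + value.map((item) => stableStringify(item) ?? 'null').join(',') + ']';
  }
  const parts = [];
  for (const key of Object.keys(value).sort()) {
    const serialized = stableStringify(value[key]);
    if (serialized !== undefined) {
      parts.push(JSON.stringify(key) + ':' + serialized);
    }
  }
  return '{' + parts.join(',') + '}';
}

// Same data, different property order.
const first = { id: 1, name: 'Ada' };
const second = { name: 'Ada', id: 1 };

console.log(JSON.stringify(first) === JSON.stringify(second)); // false
console.log(stableStringify(first) === stableStringify(second)); // true
```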

Now, how is it implemented? Well, as you can see here, it is pretty simple. We navigate the info object to get the current field selection, and then we create an object including the current resolved object, the arguments, the fields, and some more parameters. It's pretty great, you see. We can compute a cache key for each GraphQL resolver. So, this is what we call the zero-second TTL. We are deduplicating all the resolvers accessing your data.
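The slide code is not part of the transcript, so the following is only a sketch of what such a key computation might look like. The `resolverCacheKey` function and the shape of the key object are assumptions made for illustration; `fieldName`, `parentType` and `fieldNodes` are fields of the standard graphql-js resolve info object, mocked here so the example runs on its own. A sorting replacer stands in for safe-stable-stringify.

```javascript
// JSON.stringify replacer that rebuilds plain objects with sorted keys, so the
// output does not depend on property insertion order.
const sortKeys = (_key, value) =>
  value && typeof value === 'object' && !Array.isArray(value)
    ? Object.fromEntries(Object.keys(value).sort().map((k) => [k, value[k]]))
    : value;

// Hypothetical key builder: the exact fields Mercurius puts in its key are
// not shown in the talk, this layout is an assumption.
function resolverCacheKey(root, args, info) {
  const selections = info.fieldNodes[0].selectionSet?.selections ?? [];
  const fields = selections
    .filter((node) => node.kind === 'Field')
    .map((node) => node.name.value)
    .sort(); // the order of selected fields should not change the key

  return JSON.stringify(
    { type: info.parentType.name, field: info.fieldName, self: root, args, fields },
    sortKeys
  );
}

// A hand-written stand-in for the info object of `query { user(id: "1") { name id } }`.
const mockInfo = {
  parentType: { name: 'Query' },
  fieldName: 'user',
  fieldNodes: [
    {
      selectionSet: {
        selections: [
          { kind: 'Field', name: { value: 'name' } },
          { kind: 'Field', name: { value: 'id' } },
        ],
      },
    },
  ],
};

console.log(resolverCacheKey(undefined, { id: '1' }, mockInfo));
```

Two resolver invocations with the same parent object, arguments and selected fields produce the same string, which is what lets a deduplicating cache recognise them as the same piece of work.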

Adding some caching is not improving much here because the target services are mostly very simple. They don't require a lot of CPU to compute. They don't have a database. They don't have anything. However, this will change in case you need more: adding more time here will improve your performance if the target services are slow. Well, all of this is very good, right? But in-process caches are problematic. So we can't really increase the time too much because it's all in process, right? If it's all in process, when the data expires on one node, it's not expiring on the other node. So how can we solve that? Well, you know, one of the good solutions is to use something like Redis to implement a shared state, a shared cache, between all the nodes. Yeah, but we tried that, and we implemented it, and it did not work. It did not work mainly because, you know, we have in our benchmark a hundred GraphQL queries per second, each of which invokes 20 resolvers. And it turns out that, you know, if you want to fetch that data from the cache, this is two hundred to two thousand Redis GETs per second. And unfortunately, the round trip time of Redis is 0.5 milliseconds, but the actual round trip time is 15. So, can't do much.

7. Redis Pipelining and Performance

Short description:

We have solved the problem of head-of-line blocking with auto-pipelining, a technique that batches multiple commands into one Redis pipeline, reducing network round trip time. This logic in production has improved requests per second by 100 times, with a 15 times expansion factor on Redis. Redis handles the traffic without any issues. However, naming things and cache invalidation remain challenges.

We need to parallelize these Redis GETs, right? So, maybe you can use a connection pool or, I don't know. Well, there is something better. Well, I actually solved this problem already. Yes, it's with this figure.

Anyway, check out this talk that I did at RedisConf 2021. I explained how to solve the problem of head-of-line blocking with auto-pipelining. So, basically, it's a technique that we have invented... well, that we have applied to the Redis client, which enables batching of multiple commands that happen in the same event loop iteration into one single Redis pipeline. So we are sending them as a batch, making sure that we actually cut down the network round trip time to the server. It's great, it works beautifully, and you can really speed up your Redis access. But this is actually the same thing that we were doing before with the async deduplication. So, it's turtles all the way down. Happy days.
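The batching idea can be sketched without a real Redis client. In this example `createBatchedGet` and `sendBatch` are hypothetical names: `sendBatch` stands in for whatever sends one pipeline to the server and resolves with one value per key. Using `setImmediate` to decide when a batch ends is an assumption of this sketch, since the talk does not say how the client schedules the flush.

```javascript
// Hypothetical sketch of auto-pipelining: commands issued before the next
// flush are collected and sent as one batch instead of one round trip each.
function createBatchedGet(sendBatch) {
  let queue = []; // pending { key, resolve, reject } for the current batch

  function flush() {
    const batch = queue;
    queue = [];
    // Wrapping in a promise also turns a synchronous throw into a rejection.
    new Promise((resolve) => resolve(sendBatch(batch.map((item) => item.key)))).then(
      (values) => batch.forEach((item, i) => item.resolve(values[i])),
      (err) => batch.forEach((item) => item.reject(err))
    );
  }

  return function get(key) {
    return new Promise((resolve, reject) => {
      // The first command of a batch schedules exactly one flush.
      if (queue.length === 0) setImmediate(flush);
      queue.push({ key, resolve, reject });
    });
  };
}

// Usage with an in-memory stand-in for Redis.
const fakeRedis = new Map([['user:1', 'Ada'], ['user:2', 'Grace']]);
let roundTrips = 0;
const get = createBatchedGet(async (keys) => {
  roundTrips++;
  return keys.map((key) => fakeRedis.get(key) ?? null);
});

Promise.all([get('user:1'), get('user:2'), get('user:3')]).then((values) => {
  console.log(values, `round trips: ${roundTrips}`);
});
```

Callers still see an ordinary promise-returning `get`, so resolvers do not need to know that their lookups were grouped into a single round trip.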

So, we have all of this logic in production. It's important to say that this code in production is giving us an improvement of 100 times in terms of number of requests per second. And it has a 15 times expansion factor on Redis. So, for each complex query that we receive, on average we are doing 15 Redis GETs with different cache keys to verify if things work as we would like. It's pretty great, right? But it's also quite scary. By the way, Redis is not even blinking an eye. It's perfectly fine with all this traffic, so we don't care. Redis is amazing, by the way. Go use Redis. More Redis for everybody.

So, this is our real-life stuff. This technique has been a phenomenal lifesaver recently. We were able to handle a huge peak of traffic without even blinking. So, yeah, check it out. It's great. However, there are two hard things in computer science, right? One is naming things, and the other is cache invalidation.

8. Cache Invalidation and Conclusion

Short description:

We haven't discussed cache invalidation, but it's a fundamental topic. Although I've run out of time, we are actively working on implementing this module. Soon, you'll be able to invalidate the cache locally and on Redis. Stay tuned for updates on Twitter and my newsletter. Thank you for watching!

Oh, come on, okay. That's really bad, right? Because we haven't talked about how we invalidate the cache, and this is one of the fundamental topics. However, we are almost at the 20 minutes mark, so I've run out of time. So, I'm not going to cover them in this talk.

I'm joking. We have not finished the implementation of this module, but we are actively working on it. So, in reality, we'll be adding cache invalidation to async-cache-dedupe soon, and you'll be able to invalidate the cache both locally and on Redis sooner rather than later.

So, stay tuned and keep watching my news on Twitter and my newsletter, because there will be some good announcements in the coming weeks. So, with that, I just wanted to say thank you. As I said, I am Matteo Collina. I am Chief Software Architect at NearForm. You can find me on Twitter at matteocollina. Please ask me any question you want on Twitter, and I will be very happy to respond as soon as I can. So thank you for watching this talk.


Level: intermediate