Power Up your GraphQL Apps with CDNs

If you have some GraphQL data that you think would benefit from CDN caching at the edge, it’s actually really simple to get everything working well. This talk will walk you through the interplay between several tools:
* Automatic Persisted Queries with Apollo Link lets queries use GET while mutations still use POST
* Apollo Cache Control lets you specify cache control information in a fine-grained, schema-oriented way
* Apollo Engine generates small query IDs you can use in those GET requests to limit the cache key size, and sets the Cache-Control header for the CDN
Then, when we put it all together, you can see those results getting cached in your favorite CDN service, tada!!

13 min
05 Dec, 2022

Video Summary and Transcription

This Talk discusses how to grow GraphQL apps with CDNs by exploring concepts like caching, freshness, and validation. It explains how CDNs cache content closer to end users, improving delivery speed. The use of persisted queries and cache control headers in GraphQL is explored as a solution to caching challenges. The talk also highlights the interplay between automatic persisted queries, Apollo Cache Control, and Apollo Engine for efficient CDN caching.

1. Introduction

Short description:

How to grow your GraphQL apps with CDNs. GraphQL and caching, two words that don't really go well together. Let me give you a little bit of an intro about me. My name is Naz. I am currently an engineering manager working at LinkedIn.

How to grow your GraphQL apps with CDNs. Faster GraphQL queries with caching and CDNs. This is what we're going to talk about today. GraphQL and caching, two words that don't really go well together. There's been a lot of talk in the community: how are we going to make caching and GraphQL queries work together? Well, before we jump into that, let me give you a little bit of an intro about me. My name is Naz. I am currently an engineering manager working at LinkedIn. Before LinkedIn, I worked as an engineering manager and individual contributor at Netflix. I'm currently running JavaScript Weekly with a group of amazing individuals on Twitter Spaces and also hosting career Q&As on LinkedIn Events. I'm also a career coach and a mentor on MentorCruise, mentoring and coaching a lot of engineers across the globe. If you want to learn more about me, visit my website, naz.dev.

2. Caching and CDNs

Short description:

So, let's talk about caching. HTTP caching has two main concepts: freshness and validation. Freshness determines how long a resource can be kept in the cache, while validation checks if the resource needs to be refetched. Last-Modified and ETag headers are used for validation. CDNs are content delivery networks that cache content closer to end users, delivering it more quickly.

So, let's talk about caching. Before we learn about GraphQL and caching, let's talk about HTTP caching. What is HTTP caching, and how is it done? HTTP caching has two main concepts. One is freshness and two is validation.

Freshness means, as a browser, how long can I keep this resource in my cache? Freshness is a way for the server to give a resource to the client and then instruct the client on how long it can keep that resource. In practice, this is done through the HTTP Cache-Control header. Cache-Control: max-age=60 means the browser can keep the resource for 60 seconds and after that has to go back to the server for it.
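To make the freshness rule concrete, here is a minimal sketch of the check a cache could run. The function names `parseMaxAge` and `isFresh` are made up for illustration, and real caches also account for directives such as `no-store` and `s-maxage`, which this sketch ignores.

```typescript
// Pull the max-age value (in seconds) out of a Cache-Control header value.
// Returns null when the header carries no usable max-age directive.
function parseMaxAge(cacheControl: string): number | null {
  for (const directive of cacheControl.split(",")) {
    const parts = directive.trim().split("=");
    const name = (parts[0] ?? "").toLowerCase();
    const value = parts[1];
    if (name === "max-age" && value !== undefined) {
      const seconds = Number.parseInt(value, 10);
      return Number.isNaN(seconds) ? null : seconds;
    }
  }
  return null;
}

// A stored copy counts as fresh while its age is still below max-age.
function isFresh(cacheControl: string, storedAtMs: number, nowMs: number): boolean {
  const maxAge = parseMaxAge(cacheControl);
  if (maxAge === null) {
    return false; // no freshness information: treat the copy as stale
  }
  const ageSeconds = (nowMs - storedAtMs) / 1000;
  return ageSeconds < maxAge;
}
```

With `max-age=60`, a copy stored at time 0 is reusable at 59 seconds and stale from 60 seconds on.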

Then we come to validation. Validation means that when those 60 seconds are up and the client decides to request the resource again, it asks the server: hey server, do I really need to refetch this? So there is a way for the server to know whether the client really needs the resource again or already has the latest, valid version. If nothing has changed on that resource, there is no need for the server to send it back to the client. This is done through the Last-Modified and ETag headers on the server side. Last-Modified is a date and time, and ETag is a token that identifies the state of the resource. For example, the client can send the ETag back in an If-None-Match header.
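A minimal sketch of that validation exchange from the server's side, assuming a single ETag per resource and an exact string comparison. The types and the `respond` function are hypothetical; a real server also handles lists of ETags, weak validators, and Last-Modified.

```typescript
// What the server currently holds for a resource.
interface StoredResource {
  body: string;
  etag: string;
}

// Simplified response: 304 carries no body, 200 carries the full resource.
interface ValidationResponse {
  status: 200 | 304;
  etag: string;
  body?: string;
}

// Decide whether the client's cached copy is still valid.
// ifNoneMatch is the ETag the client sent back, if any.
function respond(resource: StoredResource, ifNoneMatch?: string): ValidationResponse {
  if (ifNoneMatch !== undefined && ifNoneMatch === resource.etag) {
    // Nothing changed: tell the client to keep using its copy.
    return { status: 304, etag: resource.etag };
  }
  // First request, or the resource changed: send the full body again.
  return { status: 200, etag: resource.etag, body: resource.body };
}
```

The saving comes from the 304 branch: the resource body never travels over the network again when the token still matches.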

These are very important headers, but can GraphQL actually use any of these mechanisms? Why are we saying they don't go together? They are super useful, and we could just attach them as HTTP headers. Well, we'll see. Before we dig into that, let's talk about CDNs a little bit. If you're not familiar with what a CDN is, a CDN is a content delivery network, which caches content like images, videos, and webpages in proxy servers that are located closer to the end users than the origin servers.

A proxy server is a server that receives requests from clients and passes them along to other servers. Because the proxy servers are closer to the clients making the requests, a CDN is able to deliver the content more quickly and seamlessly. Let's explain this in an easier way. We can think of a CDN as a chain of grocery stores. Imagine there is just one grocery store, one Walmart, the main branch, and everyone in the area has to go to it because it's the only place to shop. Instead, we can have small branches of Walmart in every neighborhood. So instead of going to the main branch to pick up their stuff, people can look in the smaller branch first. And if the thing they want exists in that smaller branch, awesome, they can pick it up from there.

3. CDN Caching and Persisted Queries

Short description:

CDNs cache static content on proxy servers at the edge of the network, saving copies of requested content. GraphQL queries can have cache control headers, but attaching them to POST requests is challenging. Persisted queries provide a solution by using GET requests and shortened query IDs. This brings GraphQL closer to regular HTTP GET requests. Another tool is Apollo Cache Control; with a REST API, by contrast, a cache control header can simply be returned from a specific endpoint.

It's way faster. If not, they can go to the main branch, and also ask for those things to be stocked in the smaller, proxy branches so they can get them from there next time. This is how CDN caching works. It's basically replicating the static content on proxy servers at the edge of the network. So when a user requests content from a website using a CDN, the CDN fetches the content from the origin server, the main server, and then saves a copy of the content for future requests. Cached content remains in the CDN cache as long as users continue to request it.
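The fetch-then-keep-a-copy behaviour can be sketched as a tiny in-memory edge cache. The `EdgeCache` class and the synchronous `OriginFetch` callback are hypothetical simplifications; a real CDN is asynchronous, honors freshness headers, and evicts entries that stop being requested.

```typescript
// Stand-in for a request to the origin server (synchronous for simplicity).
type OriginFetch = (url: string) => string;

class EdgeCache {
  private readonly store = new Map<string, string>();
  private readonly fetchFromOrigin: OriginFetch;
  // Counts how often the origin had to be contacted.
  public originHits = 0;

  constructor(fetchFromOrigin: OriginFetch) {
    this.fetchFromOrigin = fetchFromOrigin;
  }

  // Serve from the edge copy when there is one; otherwise go to the
  // origin once and keep a copy for future requests.
  get(url: string): string {
    const cached = this.store.get(url);
    if (cached !== undefined) {
      return cached;
    }
    this.originHits += 1;
    const body = this.fetchFromOrigin(url);
    this.store.set(url, body);
    return body;
  }
}
```

Requesting the same URL twice reaches the origin only once; the second request is answered from the copy at the edge.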

Well, what about GraphQL queries? Where are we going with this? Well, CDNs know how to cache resources when those headers we talked about are attached to them. But can we use those headers with GraphQL queries? Yes, we can. We can set cache control headers on a GraphQL query, right? Well, except that our resources are usually query documents. Well, that's still a resource, so we can set headers. In the example that you see here, one document is our resource, and we could undoubtedly attach Cache-Control, Last-Modified, and ETag headers to it. But even though that is possible in theory, GraphQL uses POST, and in practice those headers basically don't work on POST: you need to use GET. So that's why we come to persisted queries as solution number one to get around the problem of attaching those headers to GraphQL queries.

A central principle in REST is that you use a URL to identify a piece of data, a resource, and then use the GET verb in the HTTP request to indicate that you're doing a data read, not a write. That tells our CDNs it's OK to cache that result, since the request is not expected to modify something on the back end. In contrast to that, historically, most GraphQL tools sent HTTP requests using POST. Instead of a URL, they used a complicated request body that contains a query and variables. As an added complication, some browsers have a relatively small URL size limit. That means you can't fit the entire query that you're making, plus the variables, in the GET request.

So what can we do? Well, persisted queries come to our rescue. By combining Apollo Link, our modular network interface for the client, and Apollo Engine's automatic persisted queries feature, we can address both concerns at once. After setting up the engine, you can easily add the Apollo Link persisted queries link to your client code. So here is a code example of using a persisted query link.

This will do two things for us. First, it sends queries over HTTP GET instead of POST, because CDNs need that GET request to understand that resources are not changing, while still using POST for mutations. And second, it uses a shortened persisted query ID in the GET URL, so that the cache key for CDNs is shorter and we don't hit the URL size limits. This brings GraphQL much closer to the regular HTTP GET requests that CDNs are designed to handle.
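The slide's code is not reproduced in the transcript. As a rough sketch of what such a link does on the wire, assuming the automatic persisted queries convention of sending a SHA-256 hash of the query text in an `extensions` URL parameter, the GET URL could be built like this. The function names and the `https://example.com/graphql` endpoint are hypothetical, and the fallback for a hash the server has not seen yet (resending the full query) is left out.

```typescript
import { createHash } from "node:crypto";

// The query ID is the hex-encoded SHA-256 hash of the query text.
function sha256Hex(query: string): string {
  return createHash("sha256").update(query, "utf8").digest("hex");
}

// Build a GET URL that identifies the query by its hash instead of
// carrying the full query text, keeping the CDN cache key short.
function buildPersistedQueryUrl(
  endpoint: string,
  query: string,
  variables: Record<string, unknown>,
): string {
  const extensions = {
    persistedQuery: { version: 1, sha256Hash: sha256Hex(query) },
  };
  const params = new URLSearchParams({
    variables: JSON.stringify(variables),
    extensions: JSON.stringify(extensions),
  });
  return `${endpoint}?${params.toString()}`;
}

// Hypothetical endpoint and query, for illustration only.
const exampleUrl = buildPersistedQueryUrl(
  "https://example.com/graphql",
  "{ posts { id title } }",
  { first: 10 },
);
console.log(exampleUrl);
```

Because the hash is always 64 hex characters, the URL length no longer grows with the size of the query, and identical queries with identical variables map to the same URL, which is what lets a CDN use it as a cache key.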

Well, what else can we do besides persisted queries? Let's talk about Apollo Cache Control. What's that? With a REST API, we can simply return a Cache-Control header from a specific endpoint.

4. Cache Control and CDN Caching

Short description:

With GraphQL, we constantly improve queries on the front end, adding and removing fields as needed. Apollo Cache Control ensures cache hints stay up to date with query changes over time. It allows specifying cache expiration at different levels while maintaining front-end query flexibility. The engine combines cache hints into a Cache-Control header understood by CDNs. Cache control can also be used with Apollo Engine for caching without a CDN. This talk highlighted the interplay between automatic persisted queries, Apollo Cache Control, and Apollo Engine for efficient CDN caching.

Just like we talked about, until we write a new endpoint, it will remain the same, right? But with GraphQL, we're constantly improving queries on the front end. You're adding fields and removing fields. You have different versions of the UI that need different data. So how do you make sure your cache control hints stay up to date with the shape of the query, even as the data included in the result changes over time? That's what Apollo Cache Control is designed to solve.

This is a spec for how a GraphQL server should return cache hints on a per-field level. It comes with a reference implementation for JavaScript that shows how we can specify cache hints with different levels of specificity. So here we have a cache hint, a max age of 5 seconds, set for the whole schema with the cache control. Or we could set it on a GraphQL type or field, as we did here. Or we can even set it on a single execution of a resolver. It doesn't need to be on the whole schema. This is super important because it allows our API to specify the expiration of different pieces of data, since we don't want everything to expire at the same time, while maintaining the freedom of the front-end code to specify whatever queries it needs.
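Per-field hints eventually have to be reduced to one policy for the whole response. A minimal sketch of that reduction, assuming the combined policy takes the lowest max age of any hint and becomes private as soon as one hint is private. The `CacheHint` type and `toCacheControlHeader` function are hypothetical, not the Apollo Cache Control API.

```typescript
type CacheScope = "PUBLIC" | "PRIVATE";

// One hint per field of the query result that declared a cache policy.
interface CacheHint {
  path: string; // e.g. "posts" or "posts.votes", for readability only
  maxAge: number; // seconds
  scope?: CacheScope;
}

// Collapse all hints into a single Cache-Control header value.
// Returns null when the response should not be cached at all.
function toCacheControlHeader(hints: CacheHint[]): string | null {
  if (hints.length === 0) {
    return null;
  }
  let lowestMaxAge = Number.POSITIVE_INFINITY;
  let scope: CacheScope = "PUBLIC";
  for (const hint of hints) {
    // The whole response can only live as long as its shortest-lived field.
    lowestMaxAge = Math.min(lowestMaxAge, hint.maxAge);
    if (hint.scope === "PRIVATE") {
      scope = "PRIVATE";
    }
  }
  if (lowestMaxAge <= 0) {
    return null;
  }
  return `max-age=${lowestMaxAge}, ${scope.toLowerCase()}`;
}
```

Taking the minimum means that adding a short-lived field to a query automatically shortens the lifetime of the whole response, without anyone having to touch the front-end code.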

At the end of the day, the engine combines all these hints into one convenient Cache-Control header. That's our winner, the one our CDN can understand. Just a note here: if you're not using a CDN, you can use cache control to power the caching feature of Apollo Engine too, so you don't specifically have to use a CDN.

So, to wrap up everything we talked about today: if you have some GraphQL data that you think would benefit from CDN caching at the edge, it's actually really simple to get everything working well. This is a great example of the interplay between several tools we've been working on for a while. First, automatic persisted queries with Apollo Link let queries use GET while mutations still use POST. Second, Apollo Cache Control lets you specify cache control information in a fine-grained, schema-oriented way. And third, Apollo Engine generates small query IDs, so we can use those query IDs in our GET requests to limit the cache key size, and sets the Cache-Control header for the CDN.

I hope you really enjoyed this talk. If you have any questions, or if you want to connect with me, feel free to find all my handles at naz.dev. And I'm looking forward to chatting with you all on the Discord channel of the conference.
