@defer has been long discussed within the GraphQL working group, and while not part of the specification yet, it’s an exciting new addition that may help with your application’s user experience and mask performance problems.
GraphQL won’t solve your performance problems, but @defer might help
AI Generated Video Summary
The talk discusses the defer directive in GraphQL, which allows clients to specify parts of a query that can be delivered incrementally. It addresses the problem of higher latency fields while still having a single response. The talk explores different approaches to solving this problem, such as query splitting and prefetching. It also covers examples of using the defer directive for partial rendering and lazy loading, as well as its use in mutations. The talk emphasizes the importance of performance for user experience and provides resources for further exploration.
1. Introduction to the Defer Directive
Hello, everybody. My name is Lucas Ledbetter and I'm a solutions architect here at Apollo. I'm extremely excited to be here today, at least virtually, to talk about the defer directive and what it can mean for your clients. GraphQL requires the entire body be sent as a single response. Some fields require additional time to resolve. The client should be able to partially render data as it comes in.
Hello, everybody. My name is Lucas Ledbetter and I'm a solutions architect here at Apollo. I'm extremely excited to be here today, at least virtually, to talk about the defer directive and what it can mean for your clients.
To get the full picture, we'll need to cover a few background details before diving into what defer actually is. Lastly, I'll give a few examples showing where defer can be an incredibly powerful tool in your arsenal to help your clients give a better user experience.
So, to take a step back, you might be aware that GraphQL requires the entire body be sent as a single response. This is great for most queries as your UI expects all data to be there. You certainly wouldn't want to return only some of the data for a user on a user profile page, for example. As your graph grows, you realize not all fields are built the same. Some fields require additional time to resolve for one reason or another. For many organizations, this is simply due to scale. But for others, it could be tied to a third-party service that they may not have full control over, leaving them beholden to the other party's response times, or a number of other reasons. Regardless of why it takes more time, they realize that requests for those fields cause the entire response to slow down. As clients begin to access those fields, they start to see the overall graph latency increasing as a result, which isn't ideal and means the user experience is impacted. The client should be able to partially render data as it comes in, but that is unfortunately not possible with a single response.
2. Different Approaches to Address the Problem
Companies came up with two different approaches to solve this problem. The first solution is query splitting, which required clients to orchestrate multiple queries. This addressed the weakest link problem in traditional GraphQL but had performance and context limitations. Prefetching became another popular approach, making assumptions about user intention and requesting data ahead of time. However, this placed additional load on servers and relied on informed guesswork.
So to solve this growing problem, companies came up with two different approaches. Both addressed some of the problems, but each came with tradeoffs. The first solution we'll cover is query splitting. This solution was to split the single query into multiple queries and required clients to orchestrate the responses. This sounds a lot like one of the problems that GraphQL was built to solve. However, it addressed the weakest link problem that plagued traditional GraphQL until the performance problems could be mitigated or addressed on the server.
In addition to having to handle the responses on the client, it also required the server to handle multiple unique queries from the clients, each with traditional HTTP and GraphQL overhead, as well as being unable to use already fetched data such as a parent data type. Losing that additional context, especially for fields needing to make many calls to other services, could be quite expensive.
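To make the orchestration cost concrete, here is a minimal sketch of the stitching step a client takes on when it splits one query in two. The type names and the stitchSplitResponses helper are hypothetical and not from the talk; the sketch assumes one response carries the fast user fields and the other carries the slow friends field, matched by id.

```typescript
// Hypothetical shapes for the two split responses.
interface UserCore {
  id: string;
  username: string;
}

interface UserFriends {
  id: string;
  friends: UserCore[] | null;
}

interface UserView extends UserCore {
  friends: UserCore[] | null;
}

// The client, not the server, has to join the two responses by id.
function stitchSplitResponses(core: UserCore[], slow: UserFriends[]): UserView[] {
  const friendsById = new Map<string, UserCore[] | null>();
  slow.forEach((u) => friendsById.set(u.id, u.friends));
  // Users missing from the slow response get `null` friends.
  return core.map((u) => ({ ...u, friends: friendsById.get(u.id) ?? null }));
}
```

Every component that splits its query needs some version of this join, and the server resolves the parent users twice, which is the lost context described above.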
On the other hand, prefetching became another popular way to address this problem. By requesting data ahead of time, it is possible to mitigate performance concerns by making assumptions about the user's intention. For example, grabbing the user's payment information prior to hitting the checkout page. This had the added benefit of it still being a single request. However, the user's experience in the client doesn't change much, if at all, when it works. As might be obvious, servers are required to handle this additional load for assumed interaction, even if the user never interacted with the returned data, making this mostly informed guesswork.
3. Introduction to the Defer Directive
Both of these solutions made progress towards the end goal of a good user experience, but never got there all the way. That takes us to the defer directive. As you might have guessed, defer is a possible solution to the ongoing problem of higher latency fields while still having a single response. Defer is a directive used by clients to inform GraphQL servers that the response can be delivered incrementally by specifying parts of a query that can be returned later. As of today, this directive is not part of the official GraphQL specification. However, it's been used by a number of companies in production already. It is currently under the working group with a publicly accessible RFC which covers the ins and outs of the directive, and I'll have a link to the RFC at the end of the presentation. By making defer indicated by the client instead of the server, we can now let the clients inform the server of what fields can be responded to later and which ones cannot. So how does this directive actually work? Is it magic? Well, it's not magic, but it might seem like it after working with it for the first time. We'll be covering this at a high level today for the sake of time, but there are excellent talks and posts covering the more technical aspects of the directive, as well as the RFC from the Working Group. So consider the basic schema for a fictional social media site, which has a query to return a list of users, as well as a user type with a list of friends. In this example, the company running this GraphQL schema notices that the friends field isn't as performant as it probably should be. Now, given the query on the left to fetch the list of users along with their friends, we send the request and it takes a second for the entire response. This isn't great, and as a client developer, we could render the user data partially before rendering the list of friends for each user.
Both of these solutions made progress towards the end goal of a good user experience, but never got there all the way. That takes us to the defer directive. As you might have guessed, defer is a possible solution to the ongoing problem of higher latency fields while still having a single response.
The first question you might ask yourself is, what is this defer directive thing? And it's a great question. Defer is a directive used by clients to inform GraphQL servers that the response can be delivered incrementally by specifying parts of a query that can be returned later. Or in other words, by deferring the response.
As of today, this directive is not part of the official GraphQL specification. However, it's been used by a number of companies in production already. It is currently under the working group with a publicly accessible RFC which covers the ins and outs of the directive, and I'll have a link to the RFC at the end of the presentation.
As a quick sidebar, we've all seen directives used in server-side applications, from formatting information, to Apollo Federation notation, and more. But the GraphQL specification also calls out the ability to use them in the query itself. In general, I'm a big proponent of executable directives or client-side directives. They allow clients to provide additional context to the server for use in processing, and not simply notation in the schema about expected behavior as with server-side directives. We've seen client-side directives in the past with things like skip and include as defined in the specification, and defer is much the same.
So why do I bring this up? By making defer indicated by the client instead of the server, we can now let the clients inform the server of what fields can be responded to later and which ones cannot. Not all requests are the same. It can be fine to defer portions of a query in one component, but that same query may need to be returned as one response in another component. That context is hard to convey without defer. So now, back to our regularly scheduled programming.
So how does this directive actually work? Is it magic? Well, it's not magic, but it might seem like it after working with it for the first time. We'll be covering this at a high level today for the sake of time, but there are excellent talks and posts covering the more technical aspects of the directive, as well as the RFC from the Working Group.
So consider the basic schema for a fictional social media site, which has a query to return a list of users, as well as a user type with a list of friends. In this example, the company running this GraphQL schema notices that the friends field isn't as performant as it probably should be. I want to explicitly call out something here, however. The friends field, while returning a list of non-null users, is itself nullable. This is important and is something we'll touch on in a bit as to why that's the case.
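As a rough sketch of the schema being described, the SDL below is an assumption reconstructed from the talk's wording: the field names, and the nullability of everything other than friends, are illustrative only. The part that matters later is that friends is a nullable list of non-null users.

```typescript
// Hypothetical SDL for the fictional social media site.
const typeDefs = `
  type Query {
    users: [User!]!
  }

  type User {
    id: ID!
    # The list itself is nullable; the users inside it are not.
    friends: [User!]
  }
`;

// The same shape as a client-side type: `friends` may be null, its items may not.
interface User {
  id: string;
  friends: User[] | null;
}
```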
Now, given the query on the left to fetch the list of users along with their friends, we send the request and it takes a second for the entire response. This isn't great, and as a client developer, we could render the user data partially before rendering the list of friends for each user. On a user profile, we probably render their username, contact information, and more.
4. Using the Defer Directive in Queries
We've modified the query to include the new defer directive. The multi-part response includes a hasNext field to indicate more responses and an incremental field for additional data. The response is shown in a gif, with an immediate response in about 8 milliseconds and the remainder coming later.
Well, the friends list can load when it arrives, instead of blocking the entire rendering process. So, we've now modified the query to include the new defer directive. You first might note that the directive is on an inline fragment. This is because the current RFC states it must be included only on fragments or inline fragments. Now, if you execute this query, we get back a multi-part response shown on the left. The first response has a new field but otherwise contains the same structure as the normal GraphQL response; subsequent responses have a new format. While it's not critical to understand all of this, there are two parts I'd like to highlight. First, each response has the hasNext field, indicated in blue, which tells the client that there are more responses coming. Once it's false, the entire response has been sent. Second, each subsequent response has an incremental field which contains the delta and the path to the additional data. As you can see, the additional info here is inserted into the users array at index 0. Now, to put it together, here's a short gif of the results. You can see we get back an immediate response in about 8 milliseconds and then a second later we get the remainder. If you're curious, the gif is from Apollo Sandbox, which now supports defer out of the box, including a nice timeline to scroll through the various states of the response. It can be a helpful tool for understanding how your deferred response is delivered.
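The response format described above can be sketched as follows. This is a simplified illustration under stated assumptions, not a client implementation: the payload shapes follow the hasNext and incremental fields as the talk describes them, while the query text and the applyIncremental helper are hypothetical. Real clients such as Apollo Client do this merging for you.

```typescript
// A deferred version of the users query: `friends` sits in an inline fragment.
const DEFERRED_USERS_QUERY = `
  query Users {
    users {
      id
      ... @defer {
        friends {
          id
        }
      }
    }
  }
`;

type Path = Array<string | number>;

// Simplified shape of each subsequent payload in the multi-part response.
interface IncrementalPayload {
  incremental?: Array<{ data: Record<string, unknown>; path: Path }>;
  hasNext: boolean;
}

// Walk `path` from the root of `data`, then shallow-merge the delta into the
// object found there. Mutates `data` in place.
function applyIncremental(data: Record<string, unknown>, payload: IncrementalPayload): void {
  for (const patch of payload.incremental ?? []) {
    let target: any = data;
    for (const key of patch.path) {
      target = target[key];
    }
    Object.assign(target, patch.data);
  }
}

// First payload: everything outside the deferred fragment.
const firstPayload = { data: { users: [{ id: "1" }] }, hasNext: true };
// Second payload: the delta for users[0]; hasNext false means we are done.
const secondPayload: IncrementalPayload = {
  incremental: [{ data: { friends: [{ id: "2" }] }, path: ["users", 0] }],
  hasNext: false,
};
applyIncremental(firstPayload.data, secondPayload);
```

After the second payload is applied, firstPayload.data holds the same object a non-deferred query would have returned, but the client was able to render the id a second earlier.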
5. Understanding the Importance of Performance
So as you begin to think about defer, you might notice that it doesn't actually improve performance. But to really dive into how it can help, you must first ask yourself, why do you care about performance? This is a question I have to ask a number of companies to answer for one reason or another. For some, it's measuring how third parties are performing within their graph. For others, it's about how the user experience is going. Most of the time, however, the reality is that users are the most impacted by poorly performing graphs.
So as you begin to think about defer, you might notice that it doesn't actually improve performance. You're still waiting for a response after all. But to really dive into how it can help, you must first ask yourself, why do you care about performance? This is a question I have to ask a number of companies to answer for one reason or another. For some, it's measuring how third parties are performing within their graph. For others, it's about how the user experience is going. Most of the time, however, the reality is that users are the most impacted by poorly performing graphs. Many client teams measure against time to first byte, time to interactivity, payload size, or any number of other metrics. Defer specifically targets time to first byte and can intersect with time to interactivity, depending on how it's implemented.
6. Examples of Defer in Action
Let's explore examples of Defer in action and its impact on user experience. We'll start with a straightforward way to use Defer for partial rendering or lazy loading. By adding additional fields to the schema, we can render the user's profile information first and load friends later.
So let's look at a few examples of Defer in action and talk about how they can improve the user experience. We'll first want to look at a relatively straightforward way to leverage Defer as a way to do partial rendering or lazy loading. Going back to our previous schema, we have now added a few more fields to the fictional social media site. As mentioned earlier, we can properly render the user's profile information quickly before adding in friends later. Now, we'll see the two queries side by side. Note that we've added the avatar URL, username, and title to the base query in both since we need that right away. Otherwise, we'll defer the friends list, same as before.
7. Using Defer Directive for UX Improvements
Now, excuse my terrible mock-up. In the normal query, we're blocked by the slowest field, which is the friends field, and we can't render any other data until it's finished. But you can see right away that we get the profile information in the deferred request and can clearly see that we get our data quickly. Our time to first byte has dropped significantly, and probably even more importantly, time to interactivity has decreased as well. Now, going a step further, we've added parameters to our friends field to specify the number of friends to show and an offset. Using defer and GraphQL aliasing, we can load the first 10 friends quickly and then offset the remaining friends in the deferred statement. These optimizations are enabled by using defer. It's also possible to have more than one defer or even nested defers. In this example, we have a user with a company profile associated with it. We can defer each step to start returning data along the path, rendering data as it comes in. Now let's consider some other potential UX improvements. Fetching data is straightforward, but what about changing data? Mutations are in GraphQL as well.
Now, excuse my terrible mock-up. I'm most certainly not a UX designer. In the normal query, we're blocked by the slowest field, which is in this case the friends field, and we can't render any other data until it's finished. But you can see right away that we get the profile information in the deferred request and can clearly see that we get our data quickly. Our time to first byte has dropped significantly, and probably even more importantly, time to interactivity has decreased as well. The user can see and engage with parts of the data before the full response is sent.
Now, going a step further, perhaps our company identified that the friends field slowed down based on the number of friends selected, for whatever reason. There's another potential path we could take. We've now added two new parameters to our friends field in the schema shown on the left to allow us to specify the number of friends to show and an offset. Using defer as well as GraphQL aliasing, we can now load the first 10 friends quickly and then offset the remaining friends in the deferred statement. Traditionally, this would consist of multiple queries to fetch the other friends to render. Allowing the server to run these queries in parallel gives other performance benefits beyond simple UX improvements. In fact, this is quite similar to the stream directive, which is a part of the same working group as the defer directive, but falls a bit out of scope for today's discussion. In summary, it uses many of the same patterns as defer to send large lists incrementally, with an initial list of items sent across. Each subsequent chunk in the multi-part response would then contain additional entries to the list until finished. There are a number of great talks and documents about stream, and I highly recommend taking a look at the RFC for that directive if sending large lists is a concern for your organization. These sorts of optimizations are enabled by using defer. And so as you go down the path of considering defer as a solution, consider how you might also leverage other aspects of GraphQL to layer features together. You'll likely realize you can create solutions with deferred returns that weren't previously possible and can help with improving client-side performance. Note that it's also possible to have more than one defer or even nested defers. In this example, we have a user which has a company profile associated with it. That company also lists their owner.
In this case, it's quite possible that the user service communicates with the company service, which then goes back to the user service to resolve the owner information. These hops take time. And in this case, we can defer each step to start returning data along the path, rendering data as it comes in. It's quite possible that the company name isn't immediately critical on a profile, and the owner probably isn't important at all.
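The two patterns just described, aliasing with an offset and nested defers, might look roughly like the operations below. Everything here is a sketch: the limit and offset argument names, the user(id:) lookup, and the company and owner fields are assumptions, since the talk only shows them on slides.

```typescript
// Pattern 1: fetch the first page eagerly and defer the rest. The two
// selections of `friends` need distinct aliases because their arguments differ.
function buildFriendsQuery(initialCount: number): string {
  return `
    query UsersWithFriends {
      users {
        id
        username
        firstFriends: friends(limit: ${initialCount}, offset: 0) {
          id
        }
        ... @defer {
          remainingFriends: friends(offset: ${initialCount}) {
            id
          }
        }
      }
    }
  `;
}

// Pattern 2: nested defers. Each hop between services gets its own deferred
// fragment, so data can be rendered as each step resolves.
const NESTED_DEFER_QUERY = `
  query UserWithCompany {
    user(id: "1") {
      id
      username
      ... @defer {
        company {
          name
          ... @defer {
            owner {
              id
              username
            }
          }
        }
      }
    }
  }
`;
```

In the first pattern the client concatenates firstFriends and remainingFriends once the deferred payload arrives, which keeps the pagination logic in a single operation rather than two separate queries.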
Now let's consider some other potential UX improvements. Fetching data is pretty straightforward. We can all envision how to do partial loading since there was a solve prior to defer. We'd simply use multiple queries or queries per component in React. But what about changing data? Mutations are in GraphQL as well.
8. Defer Directive and Mutations
And thankfully, defer supports mutations as well. As an example, here's a sample mutation for a fictional payment service. Breaking this down, it returns a payment ID and a user, as well as a payment status interface. This interface indicates either a success or a failure state. As a side note, I personally love codifying errors in schemas for things just like this. And we'll actually touch on errors in just a moment.
Diving a bit deeper into this, for folks that work with e-commerce, you know that payments can take a while. Defer affords us the ability to return a transaction ID, which we can use later, such as when the user refreshes the page during processing, and otherwise show a loading screen. Once we've gotten a result from the payment status field, we can then render appropriately. Either it's a success, in which case we have the billed amount, or it's a failure, in which case we have the message or reason. From a user perspective, however, we can render the right result without a subscription or polling to get an updated status. Instead, we use the existing HTTP connection to forward the information to our clients for use later.
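A sketch of how a client might consume such a deferred mutation follows. The operation, the PaymentSuccess and PaymentFailed type names, and the paymentMessage helper are all hypothetical stand-ins for the fictional payment service; the talk describes the status as an interface with a success state carrying a billed amount and a failure state carrying a message or reason.

```typescript
// Hypothetical mutation: the ID comes back immediately, the status is deferred.
const MAKE_PAYMENT = `
  mutation MakePayment($input: PaymentInput!) {
    makePayment(input: $input) {
      id
      user {
        id
      }
      ... @defer {
        paymentStatus {
          __typename
          ... on PaymentSuccess {
            billedAmount
          }
          ... on PaymentFailed {
            reason
          }
        }
      }
    }
  }
`;

// Errors codified in the schema become a discriminated union on the client.
type PaymentStatus =
  | { __typename: "PaymentSuccess"; billedAmount: number }
  | { __typename: "PaymentFailed"; reason: string };

interface PaymentResult {
  id: string;
  // Absent until the deferred payload has been merged in.
  paymentStatus?: PaymentStatus;
}

function paymentMessage(result: PaymentResult): string {
  const status = result.paymentStatus;
  if (status === undefined) {
    return `Processing payment ${result.id}...`;
  }
  if (status.__typename === "PaymentSuccess") {
    return `Charged ${status.billedAmount}`;
  }
  return `Payment failed: ${status.reason}`;
}
```

The same function renders the loading screen and the final result, so no subscription or polling loop is needed to move from one state to the other.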
9. Handling Partial Data and Error Behavior
Lastly, if you have partial data, use that partial data and don't kick a user off the page if one piece fails to load. Defer supports both nullable and non-null fields, but the behavior of errors differs depending on how you use defer and how the field is typed.
Lastly, if you have partial data, use that partial data and don't kick a user off the page if one piece fails to load. As obvious as this sounds, defer is best used for lower priority data, things that can wait to be returned, and as such retrying or other mechanisms can help make sure the user experience is still excellent. And if you recall, earlier I explicitly called out that the social media site's friends field was nullable. Defer supports both nullable and non-null fields, but the behavior of errors differs depending on how you use defer and how the field is typed. This differs from traditional GraphQL, so it's important information to know. Our friends list was a nullable field with a non-null list of users. The query used before is great. It's likely that the request would be all-or-nothing, where if we fail to get the friends for one reason or another, the field will remain null, as you can see here.
10. Understanding Defer Behavior and Implementation
In a normal query, if a non-null friends field returns null, we get an empty data object. However, with deferred statements, we receive a partial response without the friends field. The same applies when deferring specific fields. Be careful with nullability as defer introduces additional complexity. Apollo supports defer in their clients, making it easy to use in practice. Apollo Studio and Apollo Sandbox are great tools for testing defer. Apollo has also released a version of defer for federated graphs, which doesn't require additional support. If you're using federation, consider checking out defer. The implementation is agnostic of the library. Adding federation and entity declaration can enable defer without much code change on your server.
Great, now what happens if the friends field was non-null? In a normal query, we get back an empty data object since a non-null field returned null. Now how about deferred statements? You might expect that the same would happen. However, we get back a partial response without the friends field but including the ID field. This is an important aspect to consider when dealing with defer in clients. The error behavior changes from traditional GraphQL and you must be aware of this change when migrating to using defer in your queries.
The same also applies when deferring specific fields. For example, you have a query to get a user and their avatar URL, which comes from a service that isn't 100% reliable. This field needs a response, however. So we marked it as non-null so clients can expect it to exist. Same as before, query on the left is not deferred. And as expected, we get back an empty data object. Now for the defer, we get back the partial object again. So all of this in mind, my recommendation is this, be careful with nullability as defer introduces additional complexity when it comes to the response. Just be aware that this exists as it currently stands in the RFC and you'll be okay.
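One way to act on that recommendation is to have the client treat a deferred field as possibly absent, whatever the schema says about nullability. The sketch below is an assumption about how a component could model this and is not taken from the talk: the DeferredUser shape and friendsView helper are hypothetical, and it assumes a single deferred fragment so that hasNext alone says whether friends can still arrive.

```typescript
interface Friend {
  id: string;
}

// `friends` is optional: with defer it can be missing entirely, even if the
// schema declares the field as non-null.
interface DeferredUser {
  id: string;
  friends?: Friend[] | null;
}

type FriendsView =
  | { kind: "loading" }
  | { kind: "unavailable" }
  | { kind: "ready"; friends: Friend[] };

function friendsView(user: DeferredUser, hasNext: boolean): FriendsView {
  if (user.friends === undefined) {
    // Not delivered yet: keep waiting only while more payloads are expected.
    return hasNext ? { kind: "loading" } : { kind: "unavailable" };
  }
  if (user.friends === null) {
    // Delivered, but the field failed to resolve.
    return { kind: "unavailable" };
  }
  return { kind: "ready", friends: user.friends };
}
```

The rest of the profile keeps rendering in every branch, which is the point of using the partial data rather than failing the whole page.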
And while we've talked about what defer is and how you can use it in principle, using it in practice is thankfully fairly straightforward. Apollo recently announced support for defer in both their React and Kotlin clients, with iOS support coming soon. For those considering defer, this can be a massive help if you're currently using Apollo's clients, since defer can plug into existing applications without headache-inducing rewrites. And once you have a client that supports defer, you simply need to rewrite your queries with defer in mind. As mentioned earlier, Apollo Studio and Apollo Sandbox can be great ways to test defer, see how it works on your graph, and ensure that it does what you expect. In fact, that's what you're seeing on the right. It's the nested defer example from earlier. Apollo has also released a version of defer for organizations using a federated graph, and this is especially unique since it doesn't require any support beyond being able to resolve federated entities. I won't have time to dive in deep on this, but the short version is this. If you're using federation and have even a remote interest in defer and what it can afford your clients, check it out. The implementation is agnostic of library. So even if your subgraph library doesn't support it out of the box, this can enable it so long as you're using Apollo Router to serve your supergraph. To summarize how it does this, the router leverages the existing entity relationships and manages the asynchronous process of getting the data for you. So subgraphs do not need to handle sending the responses as multi-part at all. If you don't currently have a federated environment, adding federation and entity declaration, which is just syntax, is a simple way to utilize the router to get defer without much code change on your server. We've seen how great defer can be.
11. Considerations and Resources
Consider the experiences you drive currently and what use cases might benefit from being deferred. Defer is driven by the client and is best used for lower priority data. It won't solve your server-side performance concerns. Defer is a powerful tool for your consumers and should be leveraged if possible. Get involved in the graph community and provide feedback. Resources: the RFC, Sasha Solomon's 200 OK talk, Apollo React client docs, Apollo Router Federation defer support. Thank you for being here.
It also solves a lot of fairly complex problems in a declarative manner, playing to the strengths of GraphQL. But at the same time, consider the experiences you drive currently and what use cases might benefit from being deferred. We've seen examples of how to do this. One was a drop-in fix for the team, and another was where the schema needed to change to accommodate defer to provide a better client experience. Keeping in mind those experiences can ensure your schema remains expressive while still enabling defer as an option to improve user experience.
Finally, as mentioned before, defer is driven by the client. They will be most aware of what use cases can be driven by defer. We've seen that defer is best used for lower priority data. So simply stuffing a query full of defers will provide little benefit. Make sure defer is used carefully and ensure it makes sense to do so when you do use it. And while you can improve the user experience and client performance, defer isn't a panacea. It won't solve your server-side performance concerns and it should be treated as such. Identifying and resolving server-side performance issues can also remove some of the need for defer. So doing both can help your graph seem and be more performant for clients and servers.
And so a few parting thoughts around this. First, defer is a powerful tool for the consumers of your graph and should be leveraged if possible. There are a menagerie of ways to improve your user experience using it. And to that end, while the specification is new, there are a few libraries that already support it, and Apollo recently announced support for organizations using a federated graph, as mentioned previously. And my last point today is this. If you are interested in defer but it doesn't meet your requirements or you feel like it can be improved, get involved. The graph community is deep and wide, and the working groups can always use more feedback about specific use cases, concerns, or more. Get actively involved if you can. Lastly, just some resources. As mentioned, we have the link to the RFC, Sasha Solomon's 200 OK talk about codifying errors in your schema, and docs for both the Apollo React client, which was used to demo defer earlier, as well as the Apollo Router Federation defer support. With that, thank you. You can reach me at elevator on Twitter, and my email is lucas@apollographql.com. Hi everyone. Nice to meet everyone. Thanks for joining today's topic. Awesome. Thank you so much for being here.
12. Using Defer Directive and Other Approaches
So just a reminder to everyone, you can join the Andromeda Q&A channel on Discord to ask your questions for Lucas here in this Q&A, but let's go ahead and get started by taking a look at the answers to the poll question. It looks like the majority of companies currently handle slow fields by waiting for the response or splitting the query and prefetching when possible. Defer doesn't necessarily replace these approaches, as there are use cases where split queries or prefetching may be more suitable. Implementing defer is an incremental process and requires server support. It's important to measure the success and benefits of defer for your specific use case. Tools like the router and React's profiling tool can help with implementation and measuring rendering time. Other approaches like the stream directive can also address performance problems.
So just a reminder to everyone, you can join the Andromeda Q&A channel on Discord to ask your questions for Lucas here in this Q&A, but let's go ahead and get started by taking a look at the answers to the poll question.
So again, the question that we had was how do you currently handle slow fields within your graph? And it looks like now a pretty overwhelming and strong response for just wait for the response. A little bit behind that is split the query and prefetch when possible, but Lucas what do you think about the results? Is that aligned with what you expected? Yeah, for sure. I think in general, we see a lot of companies doing this where it's not currently feasible or possible to orchestrate the responses to handle split queries or making assumptions about user interaction may not have data there. And so in general, it's really just, we wait for those slow fields to show up and then render when we get it.
Yeah, and just to answer a question here from the audience, does defer replace things like query splitting or prefetching? It doesn't necessarily replace them a hundred percent of the time. There are certainly use cases for both of those approaches. For example, you may not want to have the data requested all the time. There may be an expensive field that the user needs, and it may only be necessary on the page at some point, for example, if they scroll down. So split queries or prefetching might be a more appropriate solution, but in general, defer can help with most of these.
Yeah, and when you're talking about implementing something like defer, are there certain areas that you may want to try it out first, or is there a way that you can do more of a proof of concept or implement it into a code base? Or do you typically see people going all in on it at once? Yeah, it's definitely an incremental process. We don't see people all of a sudden adding a defer to every single query for a number of reasons. Primarily the fact that defer requires server support, and server support can vary across languages. I know we heard that GraphQL Yoga, for example, supports it now. However, on the Apollo side of things, we just introduced support for entity-based defer. So if you're running Federation, this has just become available, and it may be something that you're not aware exists. But as you continue to look at it, part of it is that you need to measure what success means to you, whether that's time to first byte or whether it's when the user can start interacting with the page. And that's going to be something that you have to measure to see how beneficial defer can be, because you may realize that it's not providing the benefit that you expect and may not be worth the continued engineering investment towards that specific use case. And so with all that being said, I definitely think it's worth taking a look at it and doing a proof of concept. If you haven't already looked, the router is an easy way to do this through a kind of Federation-like model where you add entity declarations to your existing GraphQL server. That can be an easy way to get started and measure whether defer can be a good tool for you. And are there certain best practices or tools that you would use to measure or stress test in order to determine what may be a good place to apply this first? Yeah, I would definitely look at where it takes the most time to render on your site.
This can vary based on UI framework you're using, but understanding where your rendering time comes from most can be usually where you can start to drill down and find out why is this happening. And oftentimes it's related to your data service. React, for example, has a really good profiling tool, which you can use to get an understanding of how long it takes to render a specific component. And it could be a good place to start to see how people or how your data is being rendered within your UI and seeing, can we delay this and partially render this component until we get the rest of the data? And you talked about performance problems in general and some of those issues. Are you seeing any kind of newer approaches or best practices to performance problems aside from defer that are of interest to you? Yeah, I think obviously there's things like the stream directive which obviously didn't fall in today's discussion but I think it's definitely worth talking about because Stream also solves a lot of the problems that defer may not fully address. Things like sending full lists or arrays of data across.
13. Alternative Approaches to Improve Performance
Defer may not be appropriate for all cases, especially when constraints like offsets are involved. Another interesting area to explore alongside defer is Stream. Leveraging tools like subscriptions or live queries can also improve data responsiveness.
This is a really common problem where you have a gigantic array of, for example, statistical information, and you don't want to request all of it at once, which may slow down your request endpoint. But in this case, defer may not be appropriate given constraints like the ability to do offsets, like we talked about during the talk. And so stream is one area that I think is really interesting and something that sits alongside defer, so once that becomes official, it will be interesting to follow. I also think that there are other tools you can leverage to improve performance. I know a lot of people use subscriptions or effectively live queries, and those are also good utilities to help with some of this data responsiveness.