Though GraphQL is declarative, resolvers operate field-by-field, layer-by-layer, often resulting in unnecessary work for your business logic even when using techniques such as DataLoader. In this talk, Benjie will introduce his vision for a new general-purpose GraphQL execution strategy whose holistic approach could lead to significant efficiency and scalability gains for all GraphQL APIs.
Step aside resolvers: a new approach to GraphQL execution
From: GraphQL Galaxy 2022
Transcription
Hello, everyone. My name is Benjie and I love GraphQL. I think GraphQL is awesome. It's made such a huge impact on the way that we build our client applications, our websites, mobile apps, and more. It's shown us how we can be more efficient and get rid of overfetching and underfetching, and really do things in a way that is highly optimized for what the client needs. Minimizing round trips and having type safety reduces the risk of something going wrong. Handling partially successful results means that even if things go wrong, we can still render something useful to the user. Built-in documentation increases productivity. And of course, by delivering the data in the shape that the client expects, we can minimize the amount of wrangling that we need to do on the client side. All of this makes it much, much easier and faster to write our client applications, which makes it easier and faster to build things that our users love. GraphQL is amazing. But I have a confession to make. I hate resolvers. I've always hated resolvers. The GraphQL language is declarative, and yet resolvers were not built to leverage this awesome capability. Instead, they turn execution into a procedural, layer-by-layer, field-by-field, item-by-item approach. To my mind, resolvers are very much a minimum viable product approach to execution. They're simple to understand and to specify, but they punt things like solving the N+1 problem into user space, requiring schema designers to remember to use abstractions such as DataLoader to achieve acceptable performance. And they rule out entire classes of optimization strategies. If you want to optimize what you're asking your business logic layer to do based on the incoming GraphQL query, for example, to only select certain fields from a database or to tell your remote API to include additional resources that are needed, you have to dabble with abstract syntax trees or similar look-ahead or transpilation complexities.
It's unpleasant and a lot of effort. Even people who put in this effort tend to only do so to a fairly superficial level. But the real efficiency gains would come from pushing this a bit further. All of this means that GraphQL is making our servers do more than they should need to: burning more CPU cycles, making more network calls, using more energy, putting more pressure on the environment. And it's not doing as much as it could to save us money on our server bills. In fact, my hatred for resolvers is actually why I joined the GraphQL Working Group in the first place back in 2018. The GraphQL specification seemed to dictate that we must execute our operations using resolvers. It just seemed so unnecessary. GraphQL being a declarative language, why must we stipulate that it be executed in a procedural manner? As I grew to learn the GraphQL specification, I realized, of course, that we don't stipulate that we must use resolvers at all. One paragraph right near the start of the GraphQL spec, which I must admit I completely skipped over when I first read it because I went straight to the execution section, states that conformance requirements expressed as algorithms can be fulfilled by an implementation of this specification in any way, as long as the perceived result is equivalent. We stipulate that it must look like that's how it's executed, but that doesn't necessarily have to be what we actually do on the server side. So long as the observable result is the same, do whatever you want. But we still have resolvers. Resolvers are still the dominant way of executing GraphQL. Even in projects that delve a little deeper into optimizing backend queries, using tools such as my graphql-parse-resolve-info module to look ahead and figure out what fields are being requested, we're still using resolvers. And the reasoning behind them is sound. The way they describe execution is correct. Without this definition, GraphQL could become more of a transfer format than an execution engine.
Clients wouldn't be able to rely on the same assumptions, the assumptions that make things like normalized caching possible. Resolvers enforce the graph nature of GraphQL, where we traverse from node to node, with the value of the node that we're on depending on neither where we came from nor where we're going next. And yet, this doesn't have to be the way that we actually execute operations. I've been battling this problem on and off for almost six years. I've tried many experiments over that time. I've really delved deep into the GraphQL specification to understand exactly why resolvers are specified, what they're protecting us from, and why they are declared the way that they are. I've tried to work within the constraints of GraphQL.js to solve this problem, and I've been reasonably successful. But it's always irked me that it just doesn't feel as ergonomic as it should be. The workarounds that I've come up with have been really clumsy and rigid. Admittedly, I came up with many of those solutions before I had the support of my many sponsors, so I couldn't invest much time into their research. But thankfully, a lot of people have found that the software I write is very helpful. And as my sponsorship has grown (thank you, sponsors!), the amount of time I have to invest in this problem has increased. About two and a half years ago, I set out, very part-time, to solve this problem once and for all. But before we get into that, let's talk about my inspiration. Those of you who know me will know that I love relational databases. Well, one relational database in particular, really. Relational databases use SQL, which is a declarative language. It specifies to the server what data is required, and the server decides how to execute it. And it can choose many different ways of executing that query, for example, factoring in the presence of indexes, or figuring out how much data is expected so that it knows which types of operations to attempt.
This is called query planning. Modern Postgres even has things such as genetic algorithms to help choose the best execution plan, and just-in-time compilation to compile expressions down to machine code so they can be executed more efficiently. Now, GraphQL is declarative too. Yet we have resolvers. We don't have generic execution planning for GraphQL. But it's not fair to say that we have nothing. Loads of people have felt this pain. So now we have specialized planners, such as the GraphQL-to-GraphQL planner found in Apollo Federation, or optimization of the GraphQL internals in GraphQL JIT. We also have ways of optimizing particular patterns, such as projections in Hot Chocolate, or GraphQL-to-SQL transpilation with Hasura. But we don't have a powerful, general-purpose query planning system that takes a holistic approach and allows advanced optimizations, no matter what your underlying business logic may be. At least until now. So after that really long introduction, I'm here today to say, step aside, resolvers! There's a new way to execute any GraphQL operation. The working title for this new project is Grafast. And it works by taking your GraphQL operation and compiling it into an execution plan and an output plan. The output plan is a straightforward mapping from the data that we retrieve through the execution phase to the GraphQL result that we want to output for the user. So we'll skip over that. Mostly what we care about is the execution plan. We build the execution plan by mostly following the execution algorithm in the GraphQL specification. However, we do this before actually executing. So we're not dealing with concrete data, as we would be in the spec. Instead we're dealing with what we call steps that represent this future data. When it comes time to execute, a step is quite similar to DataLoader in that it accepts a list of inputs and it returns a list of outputs. Batching is built into the very heart of Grafast.
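To make the batching idea concrete, here is a minimal TypeScript sketch of a step in the sense just described: an object whose execute method receives the whole list of inputs in one call and returns one output per input, in the same order. The `Step` interface, the `UserByIdStep` class, and the in-memory `USERS` table are hypothetical names invented for this illustration, not the project's actual API, and a real step would normally be asynchronous.

```typescript
// Hypothetical types for illustration only; not a real library API.
interface Step<TIn, TOut> {
  // Receives every input for this step in one call and returns one output
  // per input, in the same order (the same contract as a DataLoader batch function).
  execute(inputs: readonly TIn[]): TOut[];
}

interface User {
  id: number;
  name: string;
}

// Stand-in for a database table.
const USERS: User[] = [
  { id: 1, name: "Alice" },
  { id: 2, name: "Bob" },
  { id: 3, name: "Carol" },
];

// Counts how many times the stand-in "database" is queried.
let queryCount = 0;

function fetchUsersByIds(ids: readonly number[]): Map<number, User> {
  queryCount++;
  const wanted = new Set(ids);
  const found = new Map<number, User>();
  for (const user of USERS) {
    if (wanted.has(user.id)) found.set(user.id, user);
  }
  return found;
}

class UserByIdStep implements Step<number, User | null> {
  execute(ids: readonly number[]): (User | null)[] {
    // One backend call for the whole batch, however many items there are.
    const byId = fetchUsersByIds(ids);
    // Outputs must line up with inputs, so map over the original ids.
    return ids.map((id) => byId.get(id) ?? null);
  }
}

const step = new UserByIdStep();
const results = step.execute([3, 1, 99, 1]);
console.log(results.map((u) => (u ? u.name : null)), queryCount);
```

Four inputs, including a duplicate and a missing id, cost a single backend query here; a per-item resolver without batching would have issued one query per item.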
Whereas a field in a traditional GraphQL schema would have a resolver, in Grafast it would have a plan resolver. The plan resolver has similar-looking code, as you might be able to see at a glance. But again, rather than dealing with concrete runtime data, it's dealing with future data, these steps, which we represent with a dollar symbol in the code on the right. By walking through the selection sets and calling the plan resolvers at each stage, we can build out a plan diagram, like this one, that expresses what needs to happen. Now this plan diagram is not the final thing that will be executed. This is just our first draft. The Grafast system will go through each step in the plan diagram and give it a chance to deduplicate itself, to optimize itself, and to finalize itself. These are the three main lifecycle methods that these steps may implement. Deduplication allows similar steps to be amalgamated to simplify our plan diagrams. Optimization is the main phase, the most critical one for performance, and it allows steps of the same family to communicate with each other and pass information between one another. For example, if we were dealing with the Stripe payments API, we might have a GraphQL field that fetches the customer from Stripe, and then in our query, we might have a child field of that which fetches the subscriptions for the customer, and then various fields below that. When the step responsible for getting the subscriptions is being optimized, it could determine that its grandparent is pulling data from a Stripe customer, and thus tell the customer step to fetch the subscriptions at the same time. Stripe has a feature called expanding for this. This would make the subscriptions-fetching step redundant. So as the last action in optimize, it can replace itself with a step that simply extracts the relevant data from the customer response. This way, we now only need one round trip to Stripe to get all the information that we require.
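The optimize phase from the Stripe example can be sketched roughly as follows in TypeScript. Everything here is an assumption made for illustration: `PlanStep`, `CustomerStep`, `SubscriptionsStep`, `AccessStep`, and the fake `fetchCustomer` and `fetchSubscriptions` functions are hypothetical stand-ins rather than the project's real classes or Stripe's client library, and the subscriptions step depends directly on the customer step to keep the sketch short.

```typescript
// Hypothetical sketch; class and function names are illustrative, not a real API.
type Customer = { id: string; subscriptions?: string[] };

let roundTrips = 0;

// Stand-ins for a remote payments API that can embed related resources on request.
function fetchCustomer(id: string, expand: string[]): Customer {
  roundTrips++;
  const customer: Customer = { id };
  if (expand.includes("subscriptions")) {
    customer.subscriptions = ["sub_for_" + id];
  }
  return customer;
}

function fetchSubscriptions(customerId: string): string[] {
  roundTrips++;
  return ["sub_for_" + customerId];
}

abstract class PlanStep {
  constructor(readonly deps: PlanStep[] = []) {}
  abstract execute(depValues: unknown[]): unknown;
  // Lifecycle hook: a step may return a cheaper replacement for itself.
  optimize(): PlanStep {
    return this;
  }
}

class CustomerStep extends PlanStep {
  readonly expand: string[] = [];
  constructor(readonly customerId: string) {
    super();
  }
  execute(): Customer {
    return fetchCustomer(this.customerId, this.expand);
  }
}

// Extracts one property from the value produced by its parent step.
class AccessStep extends PlanStep {
  constructor(parent: PlanStep, readonly key: string) {
    super([parent]);
  }
  execute(depValues: unknown[]): unknown {
    return (depValues[0] as Record<string, unknown>)[this.key];
  }
}

class SubscriptionsStep extends PlanStep {
  constructor(parent: PlanStep) {
    super([parent]);
  }
  execute(depValues: unknown[]): string[] {
    return fetchSubscriptions((depValues[0] as Customer).id);
  }
  optimize(): PlanStep {
    const parent = this.deps[0];
    if (parent instanceof CustomerStep) {
      // Ask the customer step to fetch subscriptions in the same request,
      // then replace this step with a plain property access.
      parent.expand.push("subscriptions");
      return new AccessStep(parent, "subscriptions");
    }
    return this;
  }
}

// Runs a step after its dependencies, executing each step at most once.
function run(step: PlanStep, cache: Map<PlanStep, unknown>): unknown {
  if (!cache.has(step)) {
    cache.set(step, step.execute(step.deps.map((d) => run(d, cache))));
  }
  return cache.get(step);
}

function executePlan(steps: PlanStep[]): unknown[] {
  const cache = new Map<PlanStep, unknown>();
  return steps.map((s) => run(s, cache));
}

// First draft of the plan: customer and subscriptions are fetched separately.
const draftCustomer = new CustomerStep("cus_123");
executePlan([draftCustomer, new SubscriptionsStep(draftCustomer)]);
const draftRoundTrips = roundTrips;

// Optimized plan: the subscriptions step folds itself into the customer request.
roundTrips = 0;
const customerStep = new CustomerStep("cus_123");
const optimizedSubs = new SubscriptionsStep(customerStep).optimize();
const optimizedResult = executePlan([customerStep, optimizedSubs]);
const optimizedRoundTrips = roundTrips;
console.log(draftRoundTrips, optimizedRoundTrips, optimizedResult[1]);
```

The draft plan makes two calls to the fake API and the optimized plan makes one, while the value produced for the subscriptions field is the same in both cases.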
Finally, we have the finalize method. This is typically unused for most steps, but it gives the step a chance to prepare for execution, to do something that only needs to be done once. For example, it might be used to compile an optimized JavaScript function to execute its action more efficiently, or it might be used to prepare the final SQL or GraphQL query text just one time. Steps are designed to be something that you can implement yourself, much like you would with a DataLoader callback. They're a little more powerful than DataLoader and have these optional lifecycle methods that we just discussed. But in the simplest case, all they need is an execute method that takes a list of values and returns a list of values in the same way that a DataLoader callback does. We also have a number of optimized pre-built steps that you can use to handle common concerns, including loadOne for batch loading remote resources similar to DataLoader, each for mapping over lists, and access for extracting properties from objects. And we're building out more optimized steps for dealing with particular concerns, for example, issuing HTTP requests, sending GraphQL queries, or talking to databases. Ultimately, our intent is to use these steps to pass additional information to your business logic layer, no matter what that is, so that it may execute its tasks more efficiently. Just like GraphQL helps eliminate over- and underfetching on the client side, Grafast helps you eliminate it on the server side. For example, if your business logic is implemented with an ORM or something like that, you can use this additional information to perform selective eager loading to reduce database round trips. If your business logic is from an HTTP API, you could use this contextual information to dictate which parameters to pass, better controlling the data you're retrieving, reducing server load and network traffic.
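As a rough illustration of finalize and of passing field-level information down to the data layer, the TypeScript sketch below collects the columns that child fields ask for at plan time, builds the SQL text exactly once, and then reuses it for every batch. The `SelectRowsStep` class and its `requestColumn` method are hypothetical names made up for this example, not a real API, and execute only returns the query it would send instead of talking to a database.

```typescript
// Hypothetical sketch; not a real library API.
class SelectRowsStep {
  // Always select the key column; other columns are added on request.
  private readonly columns = new Set<string>(["id"]);
  private sqlText: string | null = null;
  finalizeCalls = 0;

  // The table name is assumed to come from trusted schema code, never from user input.
  constructor(private readonly table: string) {}

  // Called at plan time by child fields to declare which columns they need.
  requestColumn(column: string): void {
    if (this.sqlText !== null) throw new Error("Plan already finalized");
    this.columns.add(column);
  }

  // Lifecycle hook: runs once, after planning and before any execution.
  finalize(): void {
    this.finalizeCalls++;
    const cols = Array.from(this.columns).sort().join(", ");
    this.sqlText = `SELECT ${cols} FROM ${this.table} WHERE id = ANY($1)`;
  }

  // Batched execution: one parameterized statement for the whole list of ids.
  execute(ids: readonly number[]): { text: string; values: [number[]] } {
    if (this.sqlText === null) throw new Error("finalize() must run first");
    return { text: this.sqlText, values: [Array.from(ids)] };
  }
}

const usersStep = new SelectRowsStep("users");
usersStep.requestColumn("name");
usersStep.requestColumn("email");
usersStep.requestColumn("name"); // duplicate requests are ignored
usersStep.finalize();

const firstQuery = usersStep.execute([1, 2]);
const secondQuery = usersStep.execute([3]);
console.log(firstQuery.text);
```

Only the columns that the operation actually needs are selected, and the query text is prepared once no matter how many times the step executes.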
And since Grafast is designed around the concept of batching, you never need to think about the N+1 problem again. It's solved for you out of the box by virtue of how the system works. Just like with GraphQL resolvers, Grafast plan resolvers are intended to be short and only express what is necessary to communicate between GraphQL and your business logic. And despite their pleasant ergonomics, they unlock much greater efficiency and performance than resolvers can. Grafast has been built from the ground up to support all of the features of modern GraphQL: queries, mutations, subscriptions, polymorphism, and even cutting-edge technology such as the stream and defer directives. And it's backwards compatible with existing GraphQL schemas. So you can use Grafast to execute requests against an existing resolver-based schema and then migrate to plan resolvers on a field-by-field basis. And if you already use DataLoader, then migrating to Grafast plans should be very straightforward. We're hoping to release an open source version of Grafast under the MIT license, the very same license that GraphQL.js uses, in the first half of 2023. To be notified when we're ready, please enter your email address at grafast.org. If you'd like to help me continue to invest time in projects like this, please consider sponsoring me on GitHub Sponsors, and you may even get early access. Feel free to reach out to me with any questions. Thank you for your time, and I hope you're as excited about the future of GraphQL as I am. Thank you.