GraphQL is a powerful and useful tool, especially popular among frontend developers. It can significantly speed up app development and improve application speed, API discoverability, and documentation. But GraphQL is not a good fit only for simple APIs: it can also power more advanced architectures. The separation between queries and mutations makes GraphQL a natural fit for event sourcing and Command Query Responsibility Segregation (CQRS). By making your advanced GraphQL app serverless, you get a fully managed, cheap, and extremely powerful architecture.
Advanced GraphQL Architectures: Serverless Event Sourcing and CQRS
AI-Generated Video Summary
GraphQL is a strongly typed, version-free query language that allows you to ask for specific data and get it in JSON format. It simplifies data retrieval and modification by allowing the server to handle all necessary operations. Serverless architectures, such as AWS Lambda, are scalable, cost-effective, and good for event-driven applications. Event sourcing and CQRS are techniques that ensure consistency and separate reading and writing parts of an application. Building a GraphQL API with commands and queries can be achieved using AWS AppSync and DynamoDB. This approach offers low latency, scalability, and supports multiple languages. Challenges include application complexity, data modeling, and tracing, but starting with simplicity and making something work first can lead to success.
1. Introduction to GraphQL and its Benefits
I'm here today to tell you a few stories. First story is about GraphQL. Facebook built a mobile application packed inside a web view with data fed as HTML. They had issues with serving data, so they came up with GraphQL. It allows you to ask for specific data and get it in JSON format. GraphQL is strongly typed, has a hierarchy, and provides documentation. It's version-free.
It's good that you're having fun. I'm here to change that. So I love stories, of course, as everyone. I'm not that good at telling stories. But fortunately with AI, I can just ask it to pretend to be some famous writer or something like that and help me out.
But I'm here today to tell you a few stories. The first story is about GraphQL. Many of you know GraphQL, and many of you know an application called Facebook. Now probably mostly used by our parents and people like that. But at some point in time, Facebook built a mobile application. It was just a mobile view of the website, packed inside a mobile app as a web view. And they actually fed the data to that app as HTML. So it was rendered somewhere on the server and just sent as full HTML to the application.
Now that's probably not a huge problem today because our Internet is much faster, but at that moment, instead of just getting posts, getting full responses and doing many round trips was a bit of a problem. So they came up with a cool tool called GraphQL, where basically you can ask your server for specific data and it gives you exactly that data. It works something like this: you define the data that you want. For example, I want a user with a specific ID, but I don't want everything about that user; I just want a few specific things, and I want a picture with a specific size and maybe the first 5 friends of that person. Then the server gives me JSON in the same shape, which is amazing. And another amazing thing about GraphQL is that it was called GraphQL and it's still called GraphQL. The cool thing is that you define the data shape, and you have a hierarchy, which helps GraphQL know which data to load first. It's strongly typed, which helps you navigate the schema easily. It's a protocol, not really a different way of writing a whole server-side back end: it basically just defines the data shapes and the rules by which it works. Another cool thing is that you get documentation out of the box. With the introspection schema you can ask GraphQL, hey, tell me what I can query for a user. And something that I'm not sure is a good or a bad thing: it's version-free. It basically has no versions.
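That "ask for a shape, get JSON in the same shape" idea can be sketched in a few lines. This is not a real GraphQL executor, just a toy field selection over made-up data, to show how the response mirrors the query:

```typescript
// Hypothetical user record, standing in for data a real GraphQL server would resolve.
type User = { id: string; name: string; email: string; friends: User[] };

const user: User = {
  id: "1",
  name: "Ada",
  email: "ada@example.com",
  friends: [
    { id: "2", name: "Grace", email: "grace@example.com", friends: [] },
    { id: "3", name: "Alan", email: "alan@example.com", friends: [] },
  ],
};

// A toy "selection": like a GraphQL query, the caller names only the fields
// it wants, and the result mirrors that shape.
function select(source: Record<string, any>, fields: string[]): Record<string, any> {
  const out: Record<string, any> = {};
  for (const f of fields) out[f] = source[f];
  return out;
}

// Roughly what "query { user(id: "1") { name friends(first: 5) { name } } }" asks for:
const result = {
  ...select(user, ["name"]),
  friends: user.friends.slice(0, 5).map((f) => select(f, ["name"])),
};

console.log(JSON.stringify(result));
// {"name":"Ada","friends":[{"name":"Grace"},{"name":"Alan"}]}
```

The email field never leaves the server because it was never asked for, which is exactly the over-fetching problem GraphQL was built to solve.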
2. Introduction to GraphQL Basics
Before GraphQL, applications worked by sending multiple requests for different data. GraphQL allows you to tell the server what data you want, and the server will handle all the necessary operations to retrieve and return that data. Types, schema, and resolvers are used to define and retrieve the data. GraphQL supports queries, mutations, and subscriptions for data retrieval, modification, and real-time updates.
You need to be backwards compatible on the server because the front end can ask for anything that is available inside that schema. Before GraphQL, applications worked, and often still work, in a way where they send one request, get a response, then ask for another thing using something from the first response, and then query a third thing. For example, you want to get users from your database, but then you want to get their pictures from, let's say, Amazon S3, and then finally you want to get some analytics from some third tool.
GraphQL basically did something that we had before Ajax and everything, which is to tell the server what you want. The server does all the work and gets you the data back, just not as HTML. So types are just types: you define which fields you have. Then you define a schema. For example, I have some queries here, and each query defines its attributes and return values. And then I write something called a resolver, which basically tells my back end how to get that data. A resolver can be anything: it can read data from a database or query some API, it doesn't really matter. When you write a query, GraphQL parses that query, validates that everything is fine, and when everything is fine, it runs that resolver for us, gets the data, and then packs the result in the shape that we want. GraphQL supports queries, which are a way for us to get data; mutations, which are a way for us to change something on the server side; and subscriptions, which give us updates in real time.
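The resolver idea can be sketched without any library. This is a minimal illustration, not real GraphQL execution; the field name, data source, and `execute` helper are all made up. Each query field gets a function that knows how to fetch its data, and the server packs the result under the field name:

```typescript
// Pretend data source: could just as well be a database call or an HTTP API.
const usersDb = new Map([["1", { id: "1", name: "Ada" }]]);

// One resolver per query field, as in a GraphQL resolver map.
const resolvers: Record<string, (args: any) => any> = {
  // Resolver for a hypothetical "user(id: ID!)" query field.
  user: ({ id }) => usersDb.get(id) ?? null,
};

// A real GraphQL server parses and validates the query against the schema
// first; this sketch jumps straight to running the matching resolver and
// shaping the response the way GraphQL does: { data: { fieldName: ... } }.
function execute(field: string, args: any) {
  const resolve = resolvers[field];
  if (!resolve) throw new Error(`Unknown query field: ${field}`);
  return { data: { [field]: resolve(args) } };
}

console.log(JSON.stringify(execute("user", { id: "1" })));
// {"data":{"user":{"id":"1","name":"Ada"}}}
```

In a real server the resolvers would typically be async, and the library (AppSync, Apollo, graphql-js, whatever you use) handles parsing, validation, and nesting for you.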
3. Advanced GraphQL Architectures and Serverless
Today we'll talk about advanced GraphQL architectures, for example serverless event sourcing with CQRS. Let me tell you a story about serverless. Who has used serverless at any point? Probably the first thing that we saw on the internet is from an article in 2012 which said that the future of all software apps is serverless. Almost 10 years ago, AWS announced something called AWS Lambda, which allows you to write a function and attach a trigger to it. Serverless can be applied to many things, not just computing. It scales automatically, you pay per request, and it's fully managed by your service provider. It's good for event-driven applications, for teams that are moving fast, and when you can't predict your workload. Serverless is like managed infrastructure: someone else is running your servers.
My friend explained the term serverless in the best way, a long time ago. He said that serverless is serverless in the same way Wi-Fi is wireless. There are wires somewhere in the background. When I use my phone, I don't need to have a wire connected to it, but there is a router somewhere that has wires. These are just not my problems. It's the same with serverless. And serverless can be applied to many things; it's not just the computing part. You now have databases, message buses, and things like that. I don't have a good definition of serverless anymore. But the pure serverless thing was something that scales automatically when you need it to scale, where you pay per request. So you're paying just for things that you're actually using; you're not paying anything if no one is using your application. And it's fully managed by your service provider. Serverless is good for many things. For example, event-driven applications: wherever there is some kind of event, and that event can be an API call, an IoT device that sends something, it doesn't really matter. Whenever you have some kind of event, serverless is decent for that. It's good for teams that are moving fast, because you're building small functions and you can change one function without affecting the whole system. It's good when you can't predict your workload, when sometimes there are many users using your application and at some point there's no one. And it's actually good for many applications, because in the end, serverless is just like managed infrastructure. Someone else is running our servers; we don't need to care about that, at least not that much anymore.
4. Serverless Prototypes and Event Sourcing
Serverless is good for prototypes. I used it to build a PTO application. But as our application grew, we faced problems with storing and managing the application state. We found a solution called event sourcing, which is similar to Redux.
And I know that there are many people who do that much better than I do. Well, there are some cases where serverless is not that good. Recently, Amazon Prime Video said that serverless doesn't work for them anymore, so I guess they hit one scenario where it doesn't work for them. The guys from Basecamp often argue that plain servers are much better. But basically, if you have some long-running tasks, if you need really, really low latency, or if you have specific hardware or data jurisdiction requirements, you probably want to use something else, either containers or whatever. But who cares about that? Serverless was, and still is, good for prototypes.
So we used serverless a long time ago to build a prototype. And my prototype worked something like this: I wanted a small PTO application that would allow me to request a leave. Then my manager receives my request inside Slack, and they can click and approve my leave. And I should probably introduce myself before I continue. My name is Slobodan and I work as a CTO and co-founder of Vacation Tracker. That thing is not AI-driven; that's the PTO requesting thing. I'm also a co-author of the book Serverless Applications with Node.js, an AWS Serverless Hero, and I organize some meetups in Belgrade, Serbia.
Well, I was working with that application called Vacation Tracker, and it was an early prototype. But at some point people started using our application, so it grew. And the problem was that when our app became a real app, our problems became real problems. One of the big problems was that we were storing the state of the application inside a database. For example, I have 10 PTO days remaining for this year. But then many things can change that state. At some point, for example, someone moves me to another location, my manager gives me two more days, then I request some days, then someone edits my request, and many crazy things happen. We were not sure anymore what the correct state of the application was. So we searched for a solution to our problem, and luckily there's always an old solution to most of the new problems that we are facing. So let me tell you about a thing called event sourcing. I know these are mostly back-end things, but I'm pretty sure many of you have heard about Redux. Who has used Redux? Okay, perfect. So how does Redux work? You're sending some actions, and these actions go through a reducer and generate some state. When you send a new action, the reducer generates a new state, and a third action generates a third state. And when you read the state from the UI, you don't really care about the events.
5. Event Sourcing and CQRS
Redux has the same idea as event sourcing. Event sourcing ensures that all changes to application state are stored as a sequence of events. CQRS, or Command Query Responsibility Segregation, divides the reading and writing parts of an application. It uses different objects for commands and states. CQRS comes with eventual consistency and is useful for specific parts of an application.
You just read the state. But if you go to Redux DevTools, you can travel through the history of these events and see the state at different points in time. Well, on a high level, Redux has the same idea as event sourcing. There are some differences, but basically, Redux is nothing else than event sourcing implemented on the front end.
So we wanted to do something similar in our application. And the definition of event sourcing is that it ensures that all changes to application state are stored as a sequence of events. So at some point you can query these events and generate some kind of current state, but you can also travel through the history of these events and see what the state was at each point. This is the perfect solution for our problem, because I can store each change in our sequence as an event, and if something changes, we can generate the state again.
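For the PTO example, "generate the state again" is literally a reduce over the event sequence. The event names and numbers below are made up for illustration; the point is that the event log is the source of truth and the balance is derived:

```typescript
// Every change to a user's PTO balance is recorded as an event.
type PtoEvent =
  | { type: "YearStarted"; days: number }
  | { type: "DaysGranted"; days: number }      // e.g. manager adds extra days
  | { type: "LeaveRequested"; days: number }
  | { type: "RequestEdited"; daysDelta: number };

// The "reducer": folds one event into the running balance.
function applyEvent(balance: number, event: PtoEvent): number {
  switch (event.type) {
    case "YearStarted": return event.days;
    case "DaysGranted": return balance + event.days;
    case "LeaveRequested": return balance - event.days;
    case "RequestEdited": return balance + event.daysDelta;
  }
}

// The event store is append-only; the current state can always be rebuilt,
// and replaying a prefix of the log gives you the state at any point in history.
const events: PtoEvent[] = [
  { type: "YearStarted", days: 20 },
  { type: "DaysGranted", days: 2 },
  { type: "LeaveRequested", days: 5 },
  { type: "RequestEdited", daysDelta: 1 }, // request shortened by one day
];

const remaining = events.reduce(applyEvent, 0);
console.log(remaining); // 18
```

If someone edits a request or moves to another location, you append another event and re-derive the balance, instead of guessing what the stored number should now be.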
So there's another thing called CQRS, or Command Query Responsibility Segregation, which sounds really scary, but as I'll show you in a few seconds, it's not that scary. It often goes together with event sourcing. It doesn't need to, but it's often connected. Again, it's an old idea, like 35 years old, older than some people in this room, almost older than me. It started as an idea called Command Query Separation. The idea was that instead of building one big application where your query can change the state of the database, you divide the reading part and the writing part. Whenever you want to update something in your application, you just do that update and don't return anything. And whenever you read data, that read does not affect your application in any way. Then it evolved into something called Command and Query Responsibility Segregation, or CQRS, which basically uses the same definition. The only difference is that it defines two different objects, so your command and the state that you're reading are not in the same shape at all.
So your command is something like an action in Redux, and the state you're reading is like the state in Redux; they are not the same shape at all. And it works very similarly to Redux. Whenever you send some command, it goes through some command model or reducer that stores the event in some storage, but also creates the current representation of your application state somewhere in a different database. Then whenever you want to read your data, you just go to that cache table and read the data. For example, your bank account works the same way. You have different transactions, and all you care about is the balance of your account at any point. You don't really want to go through each transaction all the time.
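The write/read split can be sketched in memory. Everything here is illustrative (the command name, the tables as `Map`s); in the real architecture the event store and the read model would be separate database tables and the projection update would happen asynchronously:

```typescript
// Write side: commands produce append-only events.
type Command = { type: "RequestLeave"; userId: string; days: number };
type StoredEvent = { type: "LeaveRequested"; userId: string; days: number };

const eventStore: StoredEvent[] = [];                            // source of truth
const readModel = new Map<string, { remainingDays: number }>();  // "cache" table

readModel.set("u1", { remainingDays: 20 });

// Command handler: appends the event and updates the projection. Doing the
// projection asynchronously (as a real system would) is exactly where
// eventual consistency comes from.
function handle(cmd: Command): void {
  if (cmd.type === "RequestLeave") {
    eventStore.push({ type: "LeaveRequested", userId: cmd.userId, days: cmd.days });
    const state = readModel.get(cmd.userId) ?? { remainingDays: 0 };
    readModel.set(cmd.userId, { remainingDays: state.remainingDays - cmd.days });
  }
}

// Query handler: never touches the event store, just reads the projection,
// like checking your bank balance instead of replaying every transaction.
function query(userId: string) {
  return readModel.get(userId);
}

handle({ type: "RequestLeave", userId: "u1", days: 5 });
console.log(query("u1")); // { remainingDays: 15 }
```

Note that the command (`RequestLeave`) and the read state (`{ remainingDays }`) have completely different shapes, which is the one thing CQRS adds on top of plain Command Query Separation.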
Well, one of the important things about CQRS is that, because of its nature and the way it works, it has eventual consistency. What does that mean? When I send a command and read the state immediately, there's a chance that I will not get the most recent state, because something is still processing that new event in the background. So CQRS is a powerful architecture pattern that helps you with some specific parts of your application, but it's not for everything inside your application. If you're working with, for example, blog posts, it's okay to store the blog post as a normal state instead of a series of events, unless you want to do something specific like being able to go back in history.
6. Building a GraphQL API with Commands and Queries
If we have queries and mutators or commands in CQRS and query subscriptions and mutations in GraphQL, maybe they're somehow related in a way that I can just use like GraphQL queries as queries and GraphQL mutations as commands. We want to have some kind of like GraphQL API that has a command here with some kind of a resolver or reducer, whatever, that stores some events to that events database. We send it, and then we want to have queries, which basically allows us to read the data from the database as a state, not as a series of events. So let's try to build this with AWS. We'll use something called AWS AppSync for that. AppSync is completely managed and scalable GraphQL by AWS. It gives you subscriptions out of the box. Then we want to store these events and these like states, current states somewhere, we can use something called DynamoDB, which is managed and scalable, NoSQL database by AWS.
So you often want to use these things in part of your application, not everywhere. But let's go back to this definition. If we take a look at it, he's talking about queries and also commands, but he says that commands are also called modifiers or mutators, which gives me an excellent idea, or maybe not an excellent one, but an idea.
If we have queries and mutators or commands in CQRS and query subscriptions and mutations in GraphQL, maybe they're somehow related in a way that I can just use like GraphQL queries as queries and GraphQL mutations as commands.
So let me tell you about the thing that we were building. And again, thanks to GPT for helping with these slides. Here's what we want to build: we want to have some kind of GraphQL API that has a command with some kind of resolver or reducer that stores events in an events database, where each event is stored exactly the way it's sent.
And then we want to have queries, which basically allow us to read the data from the database as a state, not as a series of events. So what we can do is send a GraphQL mutation; that mutation goes to that resolver or reducer, stores something inside the events database to record that some event happened, and then it can create a projection, reduce that to a current state, and store that current state in some other database. Whenever a user wants to read something from my application, they can simply read it from that cache table. And if I delete this table at the bottom, I can just replay all the events and get the same values.
A bonus point here is subscriptions, because this is not a synchronous application; the user on the front end doesn't know when the process finishes. The good thing about GraphQL is that it defines subscriptions, which allow us to send a real-time message back to our front end and tell the user, hey, this is finished. So let's try to build this with AWS. I'll do this really quickly, because you can implement this with many different things, but let me walk you through the process.
7. Benefits of Low Latency and AWS Lambda
DynamoDB has low latency and can handle millions of requests per second. You can stream data and be notified of new or changed data. AWS Lambda can be used to create projections and is cost-effective. It supports multiple languages.
DynamoDB has really low latency. You pay just for things that you really use; you don't pay if no one is using your application, except for some data storage. And it can handle millions of requests per second. You probably don't need that at an early stage, but it's good to have that scalability. You can also stream the data, which is really cool, because the database itself can tell you when new data is stored or data changes.
Of course, we'll need something to create projections from these events. We can use an AWS Lambda function for that. Lambda is, as I mentioned, just a function which is managed and scalable. It's cheap: it costs something like $0.20 per million requests. And it supports many languages, including Node.js.
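The projection Lambda can be sketched like this. The record shape below is heavily simplified from the real DynamoDB Streams event format, and the projection "table" is an in-memory map standing in for a DynamoDB write via the AWS SDK, so treat this as a shape sketch rather than deployable code:

```typescript
// Simplified stand-in for a DynamoDB Streams record: the new item that was
// written to the events table (real records carry typed attribute values).
type StreamRecord = {
  eventName: "INSERT";
  newImage: { userId: string; type: string; days: number };
};

// Stand-in for the read-model (projection) table.
const projectionTable = new Map<string, { remainingDays: number }>();

// The Lambda handler: DynamoDB pushes batches of changed items, and we fold
// each event into the current projection, entirely in the background.
function handler(event: { Records: StreamRecord[] }): void {
  for (const record of event.Records) {
    if (record.eventName !== "INSERT") continue; // events are append-only
    const e = record.newImage;
    const current = projectionTable.get(e.userId) ?? { remainingDays: 0 };
    const delta =
      e.type === "DaysGranted" ? e.days :
      e.type === "LeaveRequested" ? -e.days : 0;
    projectionTable.set(e.userId, { remainingDays: current.remainingDays + delta });
  }
}

handler({
  Records: [
    { eventName: "INSERT", newImage: { userId: "u1", type: "DaysGranted", days: 20 } },
    { eventName: "INSERT", newImage: { userId: "u1", type: "LeaveRequested", days: 3 } },
  ],
});
console.log(projectionTable.get("u1")); // { remainingDays: 17 }
```

The same function (or a sibling one) is also a natural place to publish the subscription event back to AppSync once the projection is written.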
8. Using AppSync, EventBridge, and SQS Queue
Now, if we go back to my diagram, we can use AppSync for this part. We can store the data in DynamoDB. DynamoDB can notify Lambda of changes and create projections. Different Lambda functions can handle events using EventBridge. Queries remain the same. For high loads, use an SQS queue. In Vacation Tracker, data is stored, functions are invoked, events go through an event bus, and a cached version is stored. The front end or Slack is notified using an event bus. Reading the data is easy with GraphQL, or a REST API for Slack. The application is fully managed and stable.
Now, if we go back to my diagram, we can use AppSync for this part. Nothing changed here. We can store the data in DynamoDB. The cool thing about DynamoDB is that it can tell Lambda in the background, hey, a new thing changed in the database. Then we can create that projection completely in the background, and we can use that Lambda function to return the subscription event back to our user. It's the same with reading the data. And that's basically it.
Now, it's hard to use one Lambda function for many events. We have many, many events in the application now, so we need some kind of message bus that will split the work across multiple smaller functions. We use something called EventBridge, which is basically an event bus that sends a message somewhere, and then different Lambda functions can just wake up and respond to it. The cool thing about this service is that it has storage and replay for events. In the end, everything works something like this: we store the data, the data goes through EventBridge, and EventBridge knows exactly which functions should handle that kind of event. Everything else is the same.
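Conceptually, EventBridge is a router from event types to subscriber functions. This sketch models that routing with an in-memory dispatcher; the event names, rule table, and `publish` function are all illustrative, not real AWS APIs (in AWS you'd define rules matching on `detail-type` and attach Lambda targets):

```typescript
// An event on the bus: a type plus a payload, loosely mirroring
// EventBridge's detail-type / detail fields.
type BusEvent = { detailType: string; detail: { userId: string } };

const handled: string[] = [];

// Each "rule" maps an event type to the small functions that should wake up.
// One event type can fan out to several independent handlers.
const rules: Record<string, Array<(detail: { userId: string }) => void>> = {
  LeaveRequested: [(d) => handled.push(`notify-manager:${d.userId}`)],
  LeaveApproved: [
    (d) => handled.push(`update-balance:${d.userId}`),
    (d) => handled.push(`notify-slack:${d.userId}`),
  ],
};

// The bus delivers the event to every matching handler; unknown types
// simply have no subscribers.
function publish(event: BusEvent): void {
  for (const fn of rules[event.detailType] ?? []) fn(event.detail);
}

publish({ detailType: "LeaveApproved", detail: { userId: "u1" } });
console.log(handled); // [ 'update-balance:u1', 'notify-slack:u1' ]
```

The payoff is the one described above: you can change or add one small handler without touching the rest of the system.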
For queries, nothing changes; it's the same. If you have a huge load, you can use a queue in front of everything. There is a famous queue in AWS called SQS, which again is scalable and pay-per-usage. So you can put a queue in front of your database and everything else, and if you have a huge spike, it doesn't crash your application. All these things scale even without a queue, but at some huge scale you probably need it. So let's see this in practice, and I'll finish with that. For us at Vacation Tracker, it works something like this. Again, it's very similar to what I told you: it stores to a database, invokes some functions and sends some events to an EventBridge event bus, the business logic does something, stores a cached version in some database, and uses another event bus to notify the front end, Slack, or Microsoft Teams. And when you read the data, it's easy: you just use GraphQL, or you use a REST API for Slack, because Slack doesn't know how to work with GraphQL. Benefits: we got a fully managed application. The last incident where someone worked after hours to fix a bug was more than two years ago. So it's stable, and it's easy to trace all the changes.
9. Challenges, Quotes, and Conclusion
Less code, better control, scalability. Challenges with new services and event tracing. Application complexity and data modeling in DynamoDB. Overcoming challenges in development, testing, monitoring, and security. Cost-effectiveness. Starting with simplicity and making something work first. Making it beautiful leads to speed. AI product mentioned. Open for questions.
Less code, because we are just using services to do these things, and we have better control, and everything is really scalable. Downsides: we have many new services that everyone needs to know. It's hard to trace the events, because an event goes through many different services. The complexity of the application is a bit higher. And yeah, there are some challenges in writing the application. Modeling the data is really hard with DynamoDB; luckily, ChatGPT can help you with that. And the cool thing is that you can use TypeScript for events, schemas, and everything, and basically tie everything together into one solution.
We had different challenges with development, testing, monitoring, and security, but we overcame all of them. If you want to know more about these things, feel free to approach me after the talk; I don't have enough time to talk about them now. But the last cool thing is that it's really cheap. We have 12 identical environments, and everything costs a bit more than $1,000 per month. And that application generates way more than that.
So instead of a summary, I want to finish with two quotes. The first quote is that some simple rules can build really complicated things in the future. Basically, something that starts very simple can grow over time, become way more complicated, and solve some really hard problem. So you don't need to start with something complex; you should always start with something simple. And another quote that I really love: we should make something work first. Then, if we have time, we'll make it beautiful. And finally, if we really, really need to, we'll make it fast. Ninety percent of the time, if we make it beautiful, it will automatically be fast. So just make it beautiful.
So that's it for me. This is the AI thing and the co-founder AI that was mentioned at the beginning. And this is basically a product that I'm building that is not AI. Thank you. That's it. If you have any questions.
10. Handling Projection Table Regeneration and Tracing
We store each version of Lambda to handle updates in the projection logic. Creating a new Lambda function for event structure changes is easy and cost-effective. Tracing can be done using correlation ID and tools like X-ray or CloudWatch log insights. Speaker available for more questions in the Q&A room.
All right, so we have someone who asked: how can you regenerate the projection tables from the events table when the projection logic, i.e. the Lambda code, was updated in between? So we basically store each version of the Lambda, because you don't pay for a Lambda function if no one is using it. It's really cheap to have one Lambda for each version. So if our event structure changes, we don't just update one Lambda; instead, we create a new Lambda function and connect it to listen to the new version of the event.
Nice, nice. That thing is actually really easy. It gives you more code, but eventually you don't really maintain the old versions of each Lambda function anymore. So, yeah.
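That versioning answer can be sketched as routing on a version field. The versions, fields, and the days-to-hours change below are invented purely to illustrate the pattern: old events keep their old handler, new events get a new one, and replaying a mixed history still works:

```typescript
// Two schema versions of the "same" event; say v2 switched from days to hours.
type EventV1 = { version: 1; days: number };
type EventV2 = { version: 2; hours: number };

// One handler per schema version, mirroring one Lambda per event version.
// The v1 handler is never edited again; v2 events go to new code.
const handlers = {
  1: (e: EventV1) => e.days,
  2: (e: EventV2) => e.hours / 8, // assume an 8-hour workday for the sketch
};

// The version field selects the right projection logic.
function project(event: EventV1 | EventV2): number {
  return event.version === 1 ? handlers[1](event) : handlers[2](event);
}

// Replaying a history that spans both schema versions still works,
// because every version of the event has a handler that understands it.
const history: Array<EventV1 | EventV2> = [
  { version: 1, days: 2 },
  { version: 2, hours: 16 },
];
const totalDays = history.reduce((sum, e) => sum + project(e), 0);
console.log(totalDays); // 4
```

This is why regenerating a projection table is safe even after the logic changed: the event log never gets rewritten, only new handlers get added for new event versions.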
That's awesome, that's awesome. We have another question which is just coming in: how do you handle tracing? So, tracing. You need to have some kind of correlation ID that goes through all parts of the system. Then there are specific tools that you can use, for example X-Ray, which can trace different things through AWS services. But you can also go with simple things, like just logging that correlation ID and then using something called CloudWatch Logs Insights to search for the same ID across the functions.
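The correlation-ID approach can be sketched like this. The service names are made up, and the log sink is just an array standing in for CloudWatch; the idea is that the ID is generated once at the edge and copied onto every structured log line, so a later search by that ID stitches the whole path back together:

```typescript
import { randomUUID } from "node:crypto";

const logs: string[] = [];

// Emit structured JSON logs; in AWS these would land in CloudWatch, and
// CloudWatch Logs Insights can filter them by correlationId later.
function log(service: string, correlationId: string, message: string): void {
  logs.push(JSON.stringify({ service, correlationId, message }));
}

// The ID is created once, at the API boundary, and passed along with every
// event as it hops between services.
function handleRequest(): string {
  const correlationId = randomUUID();
  log("appsync-resolver", correlationId, "mutation received");
  log("projection-lambda", correlationId, "event projected");
  log("notifier-lambda", correlationId, "subscription sent");
  return correlationId;
}

const id = handleRequest();

// "Tracing" is then just filtering every service's logs by the same ID:
const trace = logs.filter((l) => JSON.parse(l).correlationId === id);
console.log(trace.length); // 3
```

X-Ray automates much of this for AWS services, but the manual version above works across anything that can write a log line.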
That's awesome. Now I know that some people have more questions coming up, but you can come and find him. He will also be at the speakers Q&A room. So if you're online, you will also be able to ask more questions, but thank you so much. Thank you very much. Definitely one of the last but not least talks of the day, really appreciate you.