Ashley Narcisse
Software Engineer, Hackathon Hacker, Functional Programming & System Observability Lover.
GraphQL Observability
GraphQL Galaxy 2020
8 min
GraphQL is an immensely powerful tool, and while there are tons of resources out there on how to leverage it, there doesn't seem to be much open discussion around Day 2 operations (maintenance in production) of GraphQL. In this talk, we'll focus on observability and the various techniques and tools we can use to get a better understanding of how our GraphQL services are running in production. More specifically, we'll focus on combining Apollo Server and OpenTelemetry.
Transcript
Intro
Oh, hey guys. How's everyone doing?
Thank you for being here. My name is Ashley Narcisse. I'm an engineer at Apollo GraphQL and I work with an awesome team of individuals that are building tools to help you developers make data more accessible.
[00:45] Now, speaking of information accessibility, I have a question for you. Have you ever wondered what's really happening in production? Like, especially for your GraphQL services.
I don't know about you, but I certainly have. It's something that I couldn't shake for a long time: the fact that GraphQL oftentimes felt like a black box. At least until I discovered a lot of the tooling in Apollo Studio. And that really, really, really helped me start to paint a better picture of all the different aspects of your graph. Now, there are a lot of features in Apollo Studio that I won't go into, but there's one feature that I will go into.
[01:29] So I lied. GraphQL comes with a tracing spec, and with this little nifty configuration right here, you can ship that information over into Apollo Studio. Now, this was really the starting point of saying, "Wow, all right, cool. I'm starting to understand what's happening." I no longer have this black-box imagination that "it just works" while not really being able to paint a picture of the sheer breadth of different operations and the concentrations of what data my clients are requesting.
We're going to dive into how to tool our Apollo Server with OpenTracing.
About Observability
[02:08]
- Whoa, whoa, whoa. You're not just going to skip over that observability part, right? I know you.
- I feel like they can kind of like Google the definition for it.
- Nah man, they can't just Google that.
- Yeah, you're right.
- Don't worry, I got it.
[02:26] Observability is a term borrowed from mechanical engineering or control theory. And it pretty much boils down to, "Can you understand what's happening inside your system or any particular state that it can arrive at?" A lot of APMs out there describe observability as having three pillars: logs, metrics and tracing.
And while the first two do fall in the category of things that you can predict, you can describe them as known unknowns, right? Things that you know can potentially go wrong. But tracing is getting a bit closer. Observability boils down to really getting all the right data at the right level of abstraction to get the right context as a request to your system propagates throughout your entire infrastructure. Observability can really be boiled down to understanding the unknown unknowns.
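To make "context as a request propagates" a little more concrete, here is a minimal sketch of what a trace looks like as data: spans that share one trace ID and point at their parent. The SpanRecord shape, the renderTrace helper and the sample values are invented for illustration; real tracers record considerably more.

```typescript
// Hypothetical, simplified span record; real tracers carry more fields.
interface SpanRecord {
  traceId: string; // shared by every span that belongs to one request
  spanId: string;
  parentId: string | null; // null marks the root span of the trace
  operation: string;
  startMs: number;
  endMs: number;
}

// Render one trace as an indented tree, children ordered by start time.
function renderTrace(spans: SpanRecord[], traceId: string): string[] {
  const inTrace = spans.filter((s) => s.traceId === traceId);
  const lines: string[] = [];
  const visit = (parentId: string | null, depth: number): void => {
    inTrace
      .filter((s) => s.parentId === parentId)
      .sort((a, b) => a.startMs - b.startMs)
      .forEach((s) => {
        lines.push(`${"  ".repeat(depth)}${s.operation} (${s.endMs - s.startMs}ms)`);
        visit(s.spanId, depth + 1);
      });
  };
  visit(null, 0);
  return lines;
}

// Made-up sample data: two requests, the first with nested work.
const sampleSpans: SpanRecord[] = [
  { traceId: "t1", spanId: "a", parentId: null, operation: "POST /graphql", startMs: 0, endMs: 50 },
  { traceId: "t1", spanId: "b", parentId: "a", operation: "Mutation.render", startMs: 5, endMs: 45 },
  { traceId: "t1", spanId: "c", parentId: "b", operation: "HTTP GET localhost", startMs: 10, endMs: 40 },
  { traceId: "t2", spanId: "d", parentId: null, operation: "POST /graphql", startMs: 60, endMs: 70 },
];

console.log(renderTrace(sampleSpans, "t1").join("\n"));
```

Logs and metrics tell you that something happened; the parent links are what let you ask where in the request it happened.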
How to tool Apollo Server with OpenTracing
[03:17] I appreciate you, you know that? My dawg. Now, OpenTracing is a spec published by the Cloud Native Computing Foundation, which is a way to help provide a vendor-agnostic format for creating this tracing data. Now, it works really, really well with distributed systems and microservices, which is one of the things that GraphQL also does a really, really good job of helping abstract for our end clients.
This is perfect. And here's why: we're able to go beyond the GraphQL tracing spec and convert over to a vendor-agnostic format, while also enriching the information that is being published from our GraphQL servers. And if you have any providers that already support OpenTracing, it's plug and play. It's demo time, let's go.
Demo
[04:14] So we're starting off with an initialized instance of the Jaeger client, and we're going to pull in some of our OpenTracing tooling to initialize that tracer as the global tracer for our system. We're also going to export that so we can use it in other areas of our code.
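The demo's tracer file is not reproduced in the transcript, so the following is only a sketch of the "one global tracer" pattern it describes. In a real setup, the jaeger-client package's initTracer and the opentracing package's initGlobalTracer and globalTracer would play these roles; here they are replaced by hand-written stand-ins so the example runs without dependencies. InMemoryTracer and the service name are assumptions.

```typescript
// Hypothetical minimal tracer interface, mirroring the shape of OpenTracing's
// startSpan / setTag / finish calls without depending on any library.
interface Span {
  setTag(key: string, value: unknown): Span;
  finish(): void;
}
interface Tracer {
  startSpan(name: string): Span;
}

// Stand-in for a real backend client: records finished spans in memory.
class InMemoryTracer implements Tracer {
  readonly finished: { name: string; tags: Record<string, unknown> }[] = [];
  constructor(readonly serviceName: string) {}

  startSpan(name: string): Span {
    const tags: Record<string, unknown> = { service: this.serviceName };
    const span: Span = {
      setTag: (key, value) => {
        tags[key] = value;
        return span;
      },
      finish: () => {
        this.finished.push({ name, tags });
      },
    };
    return span;
  }
}

// One process-wide tracer. In a real tracer file these functions (or the
// tracer itself) would be exported so other modules can import them.
let currentTracer: Tracer | undefined;

function initGlobalTracer(tracer: Tracer): void {
  currentTracer = tracer;
}

function globalTracer(): Tracer {
  if (!currentTracer) throw new Error("initGlobalTracer() has not been called");
  return currentTracer;
}

const tracer = new InMemoryTracer("graphql-gateway"); // service name is made up
initGlobalTracer(tracer);
globalTracer().startSpan("startup-check").setTag("ok", true).finish();
console.log(tracer.finished);
```

Registering the tracer once and looking it up elsewhere keeps resolvers and data sources free of backend-specific setup code.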
So here we have an instance of Apollo Server. This particular implementation is on Vercel, on serverless. But we're going to go ahead and import the dependencies that we'll need to wire this up.
[04:46] Primarily, Apollo OpenTracing is the notable package that we're using. And we're also going to pull in the tracer that we initialized in the tracer file. When initializing the OpenTracing plugin and adding it to the plugins array in the configuration, it requires a minimum of two properties, which are server and local. Both of these take tracer objects, which is what we're going to pass along to it.
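As a rough illustration of why there are two tracer properties, the sketch below models a plugin that uses one tracer for the incoming request and another for local work such as field resolution. This is not the Apollo OpenTracing package's code: createTracingHooks, its hook names and makeLogTracer are hypothetical stand-ins, and only the server and local option names come from the talk.

```typescript
interface Span {
  finish(): void;
}
interface Tracer {
  startSpan(name: string, parent?: Span): Span;
}

interface TracingOptions {
  server: Tracer; // traces the incoming request
  local: Tracer; // traces local work such as field resolution
}

// Hypothetical stand-in for a tracing plugin; NOT the real package's source.
function createTracingHooks({ server, local }: TracingOptions) {
  return {
    requestDidStart(operationName: string) {
      const requestSpan = server.startSpan(`request:${operationName}`);
      return {
        // Returns a callback to invoke once the field has resolved.
        fieldDidStart(fieldPath: string): () => void {
          const fieldSpan = local.startSpan(`field:${fieldPath}`, requestSpan);
          return () => fieldSpan.finish();
        },
        requestDidEnd(): void {
          requestSpan.finish();
        },
      };
    },
  };
}

// Tiny logging tracer so the sketch runs without a tracing backend.
function makeLogTracer(label: string, log: string[]): Tracer {
  return {
    startSpan(name: string): Span {
      log.push(`${label} start ${name}`);
      return {
        finish: () => {
          log.push(`${label} finish ${name}`);
        },
      };
    },
  };
}

const events: string[] = [];
const hooks = createTracingHooks({
  server: makeLogTracer("server", events),
  local: makeLogTracer("local", events),
});

const request = hooks.requestDidStart("Render");
const endField = request.fieldDidStart("Mutation.render");
endField();
request.requestDidEnd();
console.log(events);
```

In the demo the same global tracer is passed for both properties, which simply sends request spans and field spans to one backend.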
At that point, we have feature parity with the same GraphQL tracing that we can also leverage in Apollo Studio, but we're going to expand on this information by leveraging on field result to further instrument other layers of our code.
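The transcript does not show how the per-field hook is wired up, so this sketch only illustrates the general idea: a callback that runs around each field resolution and can attach extra tags to that field's span. The traceResolver wrapper, the onFieldResolve parameter shape and the template.id tag are assumptions, and only synchronous resolvers are handled.

```typescript
interface Span {
  setTag(key: string, value: unknown): void;
  finish(): void;
}

type Resolver<A, R> = (args: A) => R;
// Hypothetical hook: runs just before the resolver, with access to its span.
type FieldHook<A> = (fieldName: string, args: A, span: Span) => void;

// Wrap a synchronous resolver so each call gets its own span, enriched by the hook.
function traceResolver<A, R>(
  fieldName: string,
  resolver: Resolver<A, R>,
  startSpan: (name: string) => Span,
  onFieldResolve: FieldHook<A>,
): Resolver<A, R> {
  return (args: A): R => {
    const span = startSpan(fieldName);
    onFieldResolve(fieldName, args, span);
    try {
      return resolver(args);
    } catch (err) {
      span.setTag("error", true); // mark the span before rethrowing
      throw err;
    } finally {
      span.finish();
    }
  };
}

// In-memory span factory for the demo.
const recorded: { name: string; tags: Record<string, unknown> }[] = [];
const startRecordedSpan = (name: string): Span => {
  const tags: Record<string, unknown> = {};
  return {
    setTag: (key, value) => {
      tags[key] = value;
    },
    finish: () => {
      recorded.push({ name, tags });
    },
  };
};

const render = traceResolver(
  "Mutation.render",
  (args: { templateId: string }) => `rendered ${args.templateId}`,
  startRecordedSpan,
  (_field, args, span) => span.setTag("template.id", args.templateId),
);

const rendered = render({ templateId: "t-42" });
console.log(rendered, recorded);
```

Tags added this way are what turn a generic "field resolved" span into something you can search for by business identifier.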
[06:25] So we're going to hop into Jaeger and execute a search so we can pull up our tracing. We can go to the requests that we see, and here are the traces for that render mutation that was executed.
If we expand the details view for this trace and read the tags a little bit better, we'll see that there are the spans that are tracking the render, which is the field resolution, as well as the outbound I/O to this localhost API that I also have running on my machine.
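The parent and child relationship visible in that trace (field resolution on top, outbound I/O underneath) can be sketched as follows. The span shape, withChildSpan, the URL and the fake request are all invented for illustration; the demo's actual instrumentation is not shown in the transcript.

```typescript
interface Span {
  name: string;
  parent: Span | null;
  tags: Record<string, unknown>;
  finished: boolean;
}

// Create a span and record it in `sink` so the demo can inspect it later.
function startSpan(sink: Span[], name: string, parent: Span | null = null): Span {
  const span: Span = { name, parent, tags: {}, finished: false };
  sink.push(span);
  return span;
}

// Run an async outbound call inside a child span of `parent`.
async function withChildSpan<T>(
  sink: Span[],
  parent: Span,
  name: string,
  tags: Record<string, unknown>,
  work: () => Promise<T>,
): Promise<T> {
  const span = startSpan(sink, name, parent);
  Object.assign(span.tags, tags);
  try {
    return await work();
  } catch (err) {
    span.tags["error"] = true;
    throw err;
  } finally {
    span.finished = true; // finish on success and on failure
  }
}

async function demo(): Promise<{ body: string; spans: Span[] }> {
  const spans: Span[] = [];
  const fieldSpan = startSpan(spans, "Mutation.render");
  const body = await withChildSpan(
    spans,
    fieldSpan,
    "HTTP GET",
    { "http.url": "http://localhost:3000/api" }, // made-up local endpoint
    async () => "ok", // stands in for the real outbound request
  );
  fieldSpan.finished = true;
  return { body, spans };
}

demo().then(({ spans }) => {
  console.log(spans.map((s) => (s.parent ? `${s.parent.name} > ${s.name}` : s.name)));
});
```

Because the child span keeps a reference to its parent, a tracing UI can nest the outbound call under the field that triggered it.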
[07:01] We're just beginning to scratch the surface on this. But I do hope that this talk has given you a few leads on how you can improve the insights that you're obtaining from your systems, while also helping to continue the conversation forward. The code for this demo will be available at this URL right here, and you can also reach out to me on Twitter.
And on that note, thank you. Stay safe, and I'm sending good vibes your way.


Lightning Talks - Day 2 - GraphQL
GraphQL Galaxy 2020
24 min
Video
Hello everyone. Thanks for joining. My name is Nikhil Chandrappa. I'm a software engineer at YugabyteDB. We are a distributed SQL company. I work in the drivers and ecosystem team of YugabyteDB, where we build integrations with popular developer tools.

Of late, we are seeing a lot of interest in our community, as well as from our users, in building first-class support for GraphQL, right? If you do a quick Google Trends search, you will see GraphQL is gaining rapid adoption. For UX developers, GraphQL provides autonomy over querying the data. They are able to retrieve only the data they need, right?

For us, being a database company that works on a daily basis with backend developers and API developers, it made sense to understand the server-side concepts of a GraphQL architecture: the GraphQL server, which provides an abstraction over the database for querying and mutating the data, and also a few other advanced features like pagination, filtering, and eventing, right? We wanted to see how all these things fit in with the general REST API development that we generally see, right? So, an API developer would understand the business domain, come up with the domain objects, and implement all the access patterns around them. We wanted to see how this all looked in the GraphQL server, right?

For that, we wanted to use a real-world use case. We considered the e-commerce domain, which is well known and provides challenges for both UX developers, who have to build an immersive, engaging experience, and API developers, who have to implement the APIs for the randomness of the traffic, right? So, it's not always constant. It scales based on the user traffic. You have to scale your APIs up or down, and the API also needs to be available all the time, right? This is the general microservices architecture you will see for e-commerce applications.
You'll have a bunch of microservices which expose a REST API, and this REST API will be consumed by the UI applications, or by the UX developers to implement the UX apps, right? So, it's probably a very quick sell for them to move to GraphQL rather than going through entire docs.

For us, during our evaluation, we wanted to see how well the GraphQL server can use the native database capabilities. Is it performant or not? And how easy is it to get to production, right? For that, we first wanted to use one of our simple tables, the product catalog. It doesn't have a complex access pattern. You get the product ID in the request, and you send back the product details, right? It was a quick win for us. Instead of this complex JSON object that was being sent in the response, GraphQL clients or the UX applications can now retrieve only the data they need.

Once we dipped our toes in the water, we felt comfortable with the GraphQL applications. So, we wanted to bring a much more complex table, or dataset, into the picture, right? We took product ranking, which has a few things going on. You have to filter the data based on the category, and also sort it based on the ranking. So, some of this happens on the database, and some of this filtering also happens on the API layer. One thing we quickly understood was that any CRUD against these tables is super easy, and it leads to faster prototyping, where the UX developer need not go back and forth with them. As and when we were bringing in more tables, we soon realized that building all these domain mappings and the resolvers is kind of a complex task. It's not something where you can have a couple of sprints and you have a robust GraphQL server, right? That's where we wanted to pivot to some of the popular open source GraphQL server implementations out there.
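The product ranking access pattern described above (filter by category, sort by ranking, return only the fields the client asked for) might look roughly like this at the API layer. The ProductRanking shape, the helper names and the sample rows are invented for illustration; in the talk, part of this work is pushed down to the database rather than done in memory.

```typescript
// Hypothetical row shape for the product ranking table.
interface ProductRanking {
  productId: string;
  title: string;
  category: string;
  rank: number; // 1 is the top-ranked product in its category
  price: number;
}

// Filter by category, then sort ascending by rank; the input array is not mutated.
function topRanked(rows: ProductRanking[], category: string, limit: number): ProductRanking[] {
  return rows
    .filter((row) => row.category === category)
    .sort((a, b) => a.rank - b.rank)
    .slice(0, limit);
}

// Keep only the requested fields, the way a GraphQL selection set would.
function pickFields<T, K extends keyof T>(row: T, fields: K[]): Pick<T, K> {
  const out = {} as Pick<T, K>;
  for (const field of fields) out[field] = row[field];
  return out;
}

// Made-up sample data.
const rankings: ProductRanking[] = [
  { productId: "p3", title: "Desk Lamp", category: "home", rank: 2, price: 24 },
  { productId: "p1", title: "Novel", category: "books", rank: 1, price: 12 },
  { productId: "p2", title: "Kettle", category: "home", rank: 1, price: 30 },
  { productId: "p4", title: "Rug", category: "home", rank: 3, price: 80 },
];

const topHome = topRanked(rankings, "home", 2).map((row) => pickFields(row, ["productId", "title"]));
console.log(topHome);
```

With a hand-written REST endpoint, each new combination of filter and fields tends to need another round trip with the API developer, which is the back and forth the talk says GraphQL removed.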
In our research, what we found out was that all these tools have first-class support for event-based systems using GraphQL subscriptions, right? We wanted to use that. So, in our iteration two, we used GraphQL subscriptions for an order management system, where you want to track all the user orders and their status. This kind of solves a lot of issues, right? You don't have to poll for the data. Whenever there's a new status update, you get the notification.

The next thing for us to determine was whether it can handle the scale and the SLA requirements that were currently there on the API. For that, we went ahead and benchmarked the GraphQL subscriptions. We were able to scale to the SLA requirements we needed, and we are super happy with that.

The next thing everybody will ask in a cloud-native application is, how resilient are your apps? As and when you move to the cloud, it becomes super essential for any one of us to build an application which doesn't go down if there is a cloud outage or a region outage, or any such thing, right? So, as the next step, we took the GraphQL servers and deployed them with a stretched YugabyteDB cluster, where we were able to do resiliency tests by taking out an entire region, or a few machines in a region. And that gave us enough confidence to take it to production.

So, we believe GraphQL simplifies the feedback loop between the API developers and the UX developers. And it becomes very powerful when it's able to use the capabilities of modern cloud-native applications.

That's it, folks. That's all the talk I had today. If you have any questions, please feel free to ask them in the live Q&A. Thank you.

[Ashley Narcisse's lightning talk, GraphQL Observability, follows; see the transcript above.]

All right, let's continue. Thank you, Nikhil and Ashley, for these amazing lightning talks. And well, I guess we have to give it up for Ashley for his production value. I can't imagine how much time you've spent. Well, I don't even know what you did. Writing a script and then thinking, "Well, I am now two people," multiple cameras set up, amazing. So thanks for the production value at least.
And yeah, so I want to remind the audience that they can still ask questions in the GQL-Q&A Discord channel, if you have any questions. So I'm stuck for words, sorry guys.

Well, first of all, thank you. But yeah, observability, at least in my talk, is still an evolving problem, right? Things are constantly changing, and the spec is always improving. The one thing I guess I might add is that while I did cover OpenTracing, there is a 2.0 version, still in, well, active development, called OpenTelemetry. So that's one caveat, but we can talk a lot more about that in the discussion panel.

Yeah, because you're going to join the discussion panel later on, right?

Yeah.

Yeah, specifically in my talk, what I wanted to show was the perspective of an API developer coming in with a database background, and how our customers and our users come and ask us, "What are the GraphQL capabilities you provide in the database?" And how we went ahead with evaluating GraphQL without having any knowledge of how GraphQL works: building the GraphQL server, experimenting with these things. So that was a journey that I wanted to bring into the talk. One other thing that we generally get asked is, "How do I deploy my GraphQL application in an HA fashion?" Because I come from a database world, and in the microservices architecture, where I talk to a bunch of enterprises, that's what they ask us. So that's why in this talk we wanted to present our perspective of how we see GraphQL applications get deployed across multiple regions in an HA fashion, which means it cannot go down in any way. And also how you can scale it, which means you can scale your GraphQL apps as well as your database. So that's the point we wanted to bring in this talk.
Yeah, I have to say I really enjoyed that part, because for me as a front-ender, you know that you deploy to some server and it's in the cloud, and that's where my knowledge stops. It's really like seeing the other side of the coin, how you look at GraphQL as a database provider. That's really interesting to see. And I think and hope that for a lot of our viewers this was also really new and gave them some more background.

Yeah, generally, right? For me, it is super important to know whether my query is performing, what throughput I can support, and things like that. You don't have to know, as a UX developer, how all those things are maintained. But for me, if I'm developing a GraphQL server, all those things become very important. That's why I wanted to understand that side of the GraphQL architecture, right? So yeah, even for me, learning some of the things that UX developers do compared to API developers was a different experience. Generally, the access patterns of a UX developer are completely different from those of an API developer. So we understood the gap there in the understanding we had.

Yeah, it's always fun to see: you think people will use your product in a certain way, and then you see people with another profession, or just any user, and it's the complete opposite of what you thought of.

Yeah, exactly.

Yeah, I did some UI design back in the day, and you're like, "This is really obvious." Then you see your user, and they're just going around the button you want them to click. And yeah, it's a funny world.

Totally. Yeah. I think another thing to add on, to bring this full circle, is that the awesome part about GraphQL is that there are so many ways you can really implement it across different languages, different technologies.
So while Nikhil just showed you how to do that across multiple zones and really make it highly available, one aspect of that is that you need to understand things as they are happening. So I guess it's awesome that we had these talks back-to-back, because they play really, really well together. There's really no cookie-cutter approach where you can say, "All right, cool, you can do monitoring in one way and it'll apply across all these different ways that you're building your systems." So that's the challenge that the spec for OpenTelemetry, and the libraries and tools that it comes with, is helping with by creating some basic building blocks. When you couple the two together, these are approaches that already happen in backend systems. So now it's like, all right, let's bring it a bit higher, so you're able to get more visibility from the GraphQL layer. Take, for example, the whole aspect of caching, which is a tough problem even from a front-end perspective. You might be like, "All right, cool. Is it a cache from your browser?" But with some good observability, you can at least prove that, all right, cool, your request that hit your GraphQL layer is hitting your caching layer. So at least that might be a reason why you're seeing some sort of side effects or behavior. But yeah, it's really, it's dope.

Yeah, yeah. And we had a talk earlier today.

Sorry, I had an interesting conversation with Ash, because I wanted to understand how tracing actually works in the API world and in the GraphQL clients, I mean, apps world. Some of the points that Ash made were really interesting to me as well, being a backend developer.

Yeah, I was saying, we had a talk earlier that looked back on how we used to do database design a little bit earlier, like 10 years ago, maybe. And now we are going through GraphQL.
And like Ashley said, it's now an exciting time for GraphQL. Can you imagine, yeah, what we are gonna do in 10 more years? Of course, the whole world will be different. Do we even need to write our own queries anymore? Or do you just somehow write your new fancy frontend framework and it will just understand? I don't know. It's a really exciting time, and I can't wait to see the next five to 10 years. But yeah, well, if we still have to write code, of course, because AI will take over and we'll all be jobless and have to look for something else to do. But hey, until that day comes, wee!

Okay. So as a reminder, Ashley is now gonna go to the discussion room on observability. So on the website, go to the timetable below, and then you can continue the conversation with Ashley. And I'm gonna thank you for joining us today and hope to see you again at another conference. Maybe next year we'll have the second edition. Thanks for joining us.

Likewise. Thank you guys.

Thank you. Bye bye.

Bye bye.