GraphQL Observability

GraphQL is an immensely powerful tool, and while there are tons of resources out there on how to leverage it, there doesn't seem to be much open discussion around Day 2 (maintenance in production) operations of GraphQL. In this talk, we'll be focusing on observability and the various techniques and tools we can use to get a better understanding of how our GraphQL services are running in production. More specifically, we'll be focusing on combining Apollo Server and OpenTelemetry.



Oh, hey guys. How's everyone doing?

Thank you for being here. My name is Ashley Narcisse. I'm an engineer at Apollo GraphQL and I work with an awesome team of individuals who are building tools to help you developers make data more accessible.

[00:45] Now, speaking of information accessibility, I have a question for you. Have you ever wondered what's really happening in production? Like, especially for your GraphQL services.

I don't know about you, but I certainly have. It's something that I couldn't shake for a long time: the fact that GraphQL oftentimes felt like a black box. At least until I discovered a lot of the tooling in Apollo Studio. And that really, really, really helped me start to paint a better picture of all the different aspects of your graph. Now, there are a lot of features in Apollo Studio that I won't go into, but there's one feature that I will go into.

[01:29] So I lied. GraphQL comes with a tracing spec, and with this nifty little configuration right here, you can ship that information over into Apollo Studio. Now this was really the starting point of thinking, "Wow. All right, cool. I understand what's happening." I no longer have this black-box impression that "it just works" while I can't really paint a picture of the sheer breadth of different operations and the concentrations of data my clients are requesting.

We're going to dive into how to tool our Apollo Server with OpenTracing.

About Observability


- Whoa, whoa, whoa. You're not just going to skip over that observability part, right? I know you.

- I feel like they can kind of like Google the definition for it.

- Nah man, they can't just Google that.

- Yeah, you're right.

- Don't worry, I got it.

[02:26] Observability is a term borrowed from mechanical engineering, or control theory. And it pretty much boils down to, "Can you understand what's happening inside your system, or any particular state that it can arrive at?" A lot of APMs out there describe observability as having three pillars: logs, metrics, and tracing.

And while the first two do fall in the category of things that you can predict, you can describe them as known unknowns, right? Things that you know can potentially go wrong. But tracing is getting a bit closer. Observability boils down to really getting all the right data at the right level of abstraction to get the right context as a request to your system propagates throughout your entire infrastructure. Observability can really be boiled down to understanding the unknown unknowns.

How to tool Apollo Server with OpenTracing

[03:17] I appreciate you, you know that? My dawg. Now, OpenTracing is a spec published by the Cloud Native Computing Foundation, which is a way to help provide a vendor-agnostic format for creating this tracing data. Now it works really, really well with distributed systems and microservices, which is one of the things that GraphQL also does a really, really good job of helping abstract for our end clients.

This is perfect. And here's why: we're able to go beyond the GraphQL tracing spec and convert over to a vendor-agnostic format, while also enriching the information that is being published from our GraphQL servers. And if you have any providers that already support OpenTracing, it's plug and play. It's demo time, let's go.


[04:14] So we're starting off with an initialized instance of the Jaeger client, and we're going to pull in some of our OpenTracing tooling to initialize that tracer as the global tracer for our system. We're also going to export that so we can use it in other areas of our code.
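For readers following along without the video, the tracer file described here might look roughly like the sketch below. It assumes the jaeger-client and opentracing npm packages are installed. The file name, the service name, and the sampler and reporter settings are placeholders chosen for illustration, not values taken from the demo.

```typescript
// tracer.ts (file name is an assumption)
import { initTracer } from "jaeger-client";
import { initGlobalTracer } from "opentracing";

// Placeholder configuration: adjust for your own service and Jaeger agent.
const config = {
  serviceName: "graphql-server", // hypothetical service name
  sampler: { type: "const", param: 1 }, // sample every request (fine for a demo)
  reporter: { logSpans: true }, // log each reported span to ease debugging
};

// Create the Jaeger tracer and export it for use elsewhere in the code.
export const tracer = initTracer(config, {});

// Register it as the process-wide OpenTracing tracer.
initGlobalTracer(tracer);
```

Sampling every request keeps a demo easy to inspect; a production service would more likely use a probabilistic or rate-limiting sampler.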

So here we have an instance of Apollo Server; in particular, this implementation runs serverless on Vercel. But we're going to go ahead and import the dependencies that we'll need to wire this up.

[04:46] Primarily, Apollo OpenTracing is the notable package that we're using. And we're also going to pull in the tracer that we initialized in the tracer file. When initializing the OpenTracing plugin and adding it to the plugins array in the configuration, it requires a minimum of two properties, which are server and local. Both of these take tracer objects, which is what we're going to pass along to it.

At that point, we have feature parity with the GraphQL tracing that we can also leverage in Apollo Studio, but we're going to expand on this information at field resolution to further instrument other layers of our code.
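As a rough sketch of that wiring, assuming the apollo-opentracing package and the generic apollo-server package (the demo itself runs on Vercel, so its server package and handler setup will differ). The schema and resolvers below are hypothetical stand-ins, and the tracer import assumes a local tracer file that exports the initialized tracer.

```typescript
import { ApolloServer, gql } from "apollo-server";
import OpentracingPlugin from "apollo-opentracing";
import { tracer } from "./tracer"; // the tracer file from the previous step

// Hypothetical schema, only here so the server has something to resolve.
const typeDefs = gql`
  type Query {
    ping: String
  }
  type Mutation {
    render: String
  }
`;

const resolvers = {
  Query: { ping: () => "pong" },
  Mutation: { render: () => "rendered" },
};

export const server = new ApolloServer({
  typeDefs,
  resolvers,
  plugins: [
    // The plugin needs at least `server` and `local`; both take a tracer.
    OpentracingPlugin({ server: tracer, local: tracer }),
  ],
});
```

Passing the same tracer for both options is the simplest setup and matches what the talk describes.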
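One way to picture "instrumenting other layers" is wrapping an outbound call in a child span of the field's span. The sketch below uses a tiny in-memory stand-in for the OpenTracing tracer and span interfaces so that it runs without a tracing backend. Everything named here (SpanLike, InMemoryTracer, withChildSpan, callLocalApi, renderResolver, the span name) is hypothetical. How the field's span reaches your resolver depends on the plugin, so it is passed in explicitly.

```typescript
// Minimal stand-ins for the parts of the OpenTracing API used below.
interface SpanLike {
  setTag(key: string, value: unknown): this;
  finish(): void;
}

interface TracerLike {
  startSpan(name: string, options?: { childOf?: SpanLike }): SpanLike;
}

// In-memory span that records what happened to it.
class RecordedSpan implements SpanLike {
  tags: Record<string, unknown> = {};
  finished = false;
  constructor(public name: string, public parent?: SpanLike) {}
  setTag(key: string, value: unknown): this {
    this.tags[key] = value;
    return this;
  }
  finish(): void {
    this.finished = true;
  }
}

// In-memory tracer that keeps every span it starts.
class InMemoryTracer implements TracerLike {
  spans: RecordedSpan[] = [];
  startSpan(name: string, options: { childOf?: SpanLike } = {}): RecordedSpan {
    const span = new RecordedSpan(name, options.childOf);
    this.spans.push(span);
    return span;
  }
}

// Runs `work` inside a child span, tags failures, and always finishes the span.
async function withChildSpan<T>(
  tracer: TracerLike,
  parent: SpanLike | undefined,
  name: string,
  work: (span: SpanLike) => Promise<T>
): Promise<T> {
  const span = tracer.startSpan(name, { childOf: parent });
  try {
    return await work(span);
  } catch (err) {
    span.setTag("error", true);
    throw err;
  } finally {
    span.finish();
  }
}

const demoTracer = new InMemoryTracer();

// Hypothetical outbound call; a real resolver would do an HTTP request here.
async function callLocalApi(): Promise<string> {
  return "rendered";
}

// Hypothetical resolver body: the outbound call gets its own child span.
async function renderResolver(fieldSpan: SpanLike): Promise<string> {
  return withChildSpan(demoTracer, fieldSpan, "local-api.render", async (span) => {
    span.setTag("http.method", "POST");
    return callLocalApi();
  });
}
```

Finishing the span in a finally block matters: a span that is never finished is never reported, so failed calls would silently vanish from the trace.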

[06:25] So we're going to hop into Jaeger and execute a search so we can pull up our traces. We can go to the requests that we see, and here are the traces for that render mutation that was executed.

If we expand the details view for this trace and read the tags a little bit better, we'll see the spans that are tracking the render, which is the field resolution, as well as the outbound I/O to this localhost API that I also have running on my machine.
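To make the shape of that view concrete without the UI, here is a small sketch that prints a trace as an indented tree, with each child span nested under its parent. The SpanRecord shape is a simplification of what a tracing backend stores, and the span names, durations, and tags are made up for illustration rather than copied from the demo.

```typescript
// Simplified span record; real spans carry more (logs, process info, references).
interface SpanRecord {
  spanId: string;
  parentId?: string;
  operation: string;
  durationMs: number;
  tags: Record<string, string>;
}

// Returns one indented line per span, with children listed under their parent.
function renderTrace(spans: SpanRecord[]): string[] {
  const lines: string[] = [];
  const visit = (parentId: string | undefined, depth: number): void => {
    for (const span of spans.filter((s) => s.parentId === parentId)) {
      const tagText = Object.keys(span.tags)
        .map((key) => `${key}=${span.tags[key]}`)
        .join(" ");
      const suffix = tagText ? ` ${tagText}` : "";
      lines.push(`${"  ".repeat(depth)}${span.operation} (${span.durationMs}ms)${suffix}`);
      visit(span.spanId, depth + 1);
    }
  };
  visit(undefined, 0); // root spans have no parent
  return lines;
}

// Made-up example data: a request, the field resolution, and the outbound call.
const exampleTrace: SpanRecord[] = [
  { spanId: "a", operation: "request", durationMs: 42, tags: {} },
  { spanId: "b", parentId: "a", operation: "render", durationMs: 38, tags: {} },
  { spanId: "c", parentId: "b", operation: "outbound-http", durationMs: 30, tags: { "http.method": "POST" } },
];

console.log(renderTrace(exampleTrace).join("\n"));
```

The nesting is the useful part: it shows at a glance how much of the field's resolution time was spent waiting on the outbound call.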

[07:01] We're just beginning to scratch the surface on this. But I do hope that this talk has given you a few leads on how you can improve the insights that you're obtaining from your systems, while also helping to move the conversation forward. The code for this demo will be available at this URL right here, and you can also reach out to me on Twitter.

And on that note, thank you. Stay safe, and I'm sending good vibes your way.

Ashley Narcisse
8 min
