Performance Monitoring of a Heterogeneous GraphQL Mesh App

Today it is fairly easy to integrate GraphQL on both the client and the server and get it all up and running quickly with a cloud service of your choice, such as Netlify or Vercel. With this setup, how can we monitor performance, and how can we observe all parts together to find the root cause when problems occur?

8 min
10 Dec, 2021

AI Generated Video Summary

Performance monitoring is crucial for businesses because users don't like to wait. The Apollo Engine tool helps track and analyze metrics, revealing response time variance and other information. Instana combines traces of service communication with infrastructure metrics and end user monitoring, and it implements OpenTelemetry. Apollo Studio is great for managing the GraphQL schema and provides full observability, enabling efficient root cause analysis.

1. Performance Monitoring and Issue Investigation

Short description:

I'm Robert Hostlowsky, a software engineer at Instana, an IBM company. I have experience with GraphQL and have encountered performance issues in a live demo application. Performance monitoring is necessary because users don't like to wait, and APIs are crucial for businesses. Investigating a real performance issue, I found that the communication with the database was sometimes very slow. The Apollo Engine tool helped track and analyze metrics, revealing response time variance and other information.

Hi everybody! I'm very happy to be here and to have the opportunity to share my thoughts and learnings about performance with GraphQL, specifically in a service mesh. Let me quickly introduce myself. I'm Robert Hostlowsky, working at Instana, an IBM company. In 2016 I gave a talk about GraphQL and Relay. Later, in 2018, I published a video course about a full-stack Trello clone on top of GraphQL. Then, in 2019, I found a subtle performance issue in its live demo application, which got all of this rolling.

But let's first dive in and see what we mean by a distributed mesh. We don't have only one service; typically, our landscape looks like this from an infrastructure perspective. Of course, one or two machines can go down, but this is typically handled. But what is happening on the service level? This is typically how a service mesh looks when you look into it and have a representation of the traffic and the communication. Here, too, there is a lot of communication going on, and it is not easy to see if you don't have such a tool.

But first, let's ask the question: why is performance monitoring necessary? It's quite simple. Users don't like to wait. Typically, today we have a service mesh, or at least some services in use. Maybe one of them is a payment service or something like that, and other services depend on it. This needs to be tracked somehow, and in case of a failure, the cause should be easy to find and fix. Why is this important? Today, when APIs are at the center of a business, it's very important that timings are as expected. Nobody wants to wait for something and later find out that it was not their fault, but somebody else's. There might even be a contract, a so-called SLA, which defines that a specific service needs to respond within a certain time. If it does not, somebody has a problem, and in the end the business has a problem.

But let's come to investigating a real performance issue. As I mentioned, I had a problem with my live demo at the time. It's a simple Kanban board with a backend where the data is stored, and at that time the communication with the database was GraphQL. For some reason, it was sometimes very slow and at other times very fast. I couldn't say where the problem was, but sometimes it was really, really slow. The only tool out there at the time was called Apollo Engine. It was quite simple: you just add an API key to the Apollo Server when using the Apollo Server library, and then it automatically tracks these metrics and shows them here on the board. So you can see here the variance, let's say, or the spectrum of the response times: up to 13 seconds for a call, which of course is not acceptable. There is some more information on the right, such as the number of queries and so on.
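To make the "spectrum of response times" idea concrete, here is a minimal sketch of how such a spread could be summarized from raw timings. It is not what Apollo Engine does internally; the `summarize` and `percentile` helpers and the sample numbers are made up for illustration, with one 13-second outlier like the one mentioned above.

```typescript
// Hypothetical helpers (not part of Apollo): summarize response times in ms.
interface LatencySummary {
  count: number;
  min: number;
  p50: number;
  p95: number;
  max: number;
}

// Nearest-rank percentile on an array that is already sorted ascending.
function percentile(sorted: number[], p: number): number {
  const rank = Math.ceil((p / 100) * sorted.length);
  const index = Math.min(sorted.length - 1, Math.max(0, rank - 1));
  const value = sorted[index];
  if (value === undefined) {
    throw new Error("cannot compute a percentile without samples");
  }
  return value;
}

function summarize(durationsMs: number[]): LatencySummary {
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...durationsMs].sort((a, b) => a - b);
  return {
    count: sorted.length,
    min: percentile(sorted, 0),
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    max: percentile(sorted, 100),
  };
}

// Made-up sample: mostly fast calls plus a single 13-second outlier.
const sampleDurationsMs = [120, 95, 110, 13000, 130, 105, 98, 140, 115, 101];
const latencySummary = summarize(sampleDurationsMs);
```

Looking at the median next to the high percentiles is one way to spot the pattern described here: most calls are fast, but a few are far too slow.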

2. Instana and Apollo Studio

Short description:

A year ago, I had the chance to use Instana, which combines traces of service communication with infrastructure metrics and end user monitoring. It implements OpenTelemetry. To collect user data, inject the EUM snippet into the website. Tracking down backend traces and analyzing query counts is easy. I also monitor my application running on Netlify Functions using the Instana wrapper. The real problem was using a GraphQL service backend on a freemium plan. Apollo Studio is great for managing the GraphQL schema and provides full observability, enabling efficient root cause analysis.

This was a year ago. In the meantime, they have improved their service and also have some tracing built in, which can be enabled very easily and, for specific freemium services, doesn't cost anything.

At that time, this already gave me a little bit of information. I also had the chance to use Instana, which combines these traces of the communication between services with infrastructure metrics and with EUM, end user monitoring. And by the way, it implements OpenTelemetry, the latest standard in this area.
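One way traces get stitched together across services is by passing a trace identifier along with every call. The sketch below assumes the W3C Trace Context `traceparent` header, which OpenTelemetry uses by default; the helper names (`startTrace`, `childOf`, `toTraceparent`, `parseTraceparent`) are hypothetical and this is not Instana's implementation.

```typescript
// W3C Trace Context "traceparent" header: version-traceId-parentId-flags
interface TraceContext {
  traceId: string; // 32 lowercase hex characters, shared by the whole trace
  spanId: string; // 16 lowercase hex characters, unique per span
  sampled: boolean;
}

// Illustration only: real tracers use a stronger random source.
function randomHex(length: number): string {
  let out = "";
  for (let i = 0; i < length; i++) {
    out += Math.floor(Math.random() * 16).toString(16);
  }
  return out;
}

// Called where a request first enters the system.
function startTrace(): TraceContext {
  return { traceId: randomHex(32), spanId: randomHex(16), sampled: true };
}

// A downstream call keeps the trace id and gets a fresh span id.
function childOf(parent: TraceContext): TraceContext {
  return { ...parent, spanId: randomHex(16) };
}

function toTraceparent(ctx: TraceContext): string {
  return `00-${ctx.traceId}-${ctx.spanId}-${ctx.sampled ? "01" : "00"}`;
}

// Returns null for anything that is not a valid version-00 header.
function parseTraceparent(header: string): TraceContext | null {
  const match = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(header);
  if (match === null) return null;
  const traceId = match[1];
  const spanId = match[2];
  const flags = match[3];
  if (traceId === undefined || spanId === undefined || flags === undefined) {
    return null;
  }
  // All-zero ids are invalid according to the specification.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return null;
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}
```

A service that receives the header parses it, creates a child span, and forwards the new header on its own outgoing calls, so every span ends up with the same trace id.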

So how do we get there? It's quite simple in the end. To get all the information about the user and what the user is doing, you just inject the EUM snippet into the website; then it can collect all the data, even JavaScript errors and so on. You can even find specific requests here, and tracking them down, we find a view of the backend trace, where the GraphQL query also shows up. On the right side you see some meta information about the operation and so on. We can also do some more analytics on the counts of queries and so on. But nowadays, my application also runs in Netlify Functions, which in the end run on AWS Lambda. So how can we track that? It's quite easy, just by using this Instana wrapper. With this, I was able to monitor the Apollo application server, as we saw on the slide before.
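The wrapper idea itself is easy to sketch: a function takes the original handler and returns a new one that measures each invocation. The code below is a hypothetical illustration of that pattern, not Instana's wrapper; the `HttpEvent` and `HttpResult` shapes are simplified stand-ins for the real Lambda types, and `withTiming` is an invented name.

```typescript
// Simplified stand-ins for the Lambda event and result (assumptions, not the AWS types).
interface HttpEvent {
  path: string;
  body: string | null;
}
interface HttpResult {
  statusCode: number;
  body: string;
}
type Handler = (event: HttpEvent) => Promise<HttpResult>;

interface Measurement {
  path: string;
  durationMs: number;
  statusCode: number;
  error: boolean;
}

// Hypothetical wrapper: times every invocation and hands the result to `report`.
// The clock is injectable so the behavior can be checked deterministically.
function withTiming(
  handler: Handler,
  report: (m: Measurement) => void,
  now: () => number = Date.now
): Handler {
  return async (event) => {
    const start = now();
    try {
      const result = await handler(event);
      report({
        path: event.path,
        durationMs: now() - start,
        statusCode: result.statusCode,
        error: result.statusCode >= 500,
      });
      return result;
    } catch (err) {
      // Record the failed call, then let the error propagate unchanged.
      report({ path: event.path, durationMs: now() - start, statusCode: 500, error: true });
      throw err;
    }
  };
}

// Usage: wrap a placeholder GraphQL handler and collect measurements in memory.
const measurements: Measurement[] = [];
const graphqlHandler: Handler = async (event) => ({
  statusCode: 200,
  body: JSON.stringify({ data: { servedPath: event.path } }),
});
const handler = withTiming(graphqlHandler, (m) => {
  measurements.push(m);
});
```

The application code stays untouched; only the exported handler changes, which is why this kind of instrumentation is quick to add to a serverless function.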

So finally, what was the real problem? In the end, I figured out that the real, small thing was that at that time I used a GraphQL service backend on a freemium plan. That was the only problem. In summary, it's quite easy. Apollo Studio is great for managing the GraphQL schema, and full-blown observability with all these extra features enables shifting left, giving developers the full context of their application running in production. This also makes it very efficient to find any root cause. Thank you very much for listening. For any questions, please reach me on Twitter at @rhosts or by email. And, of course, I hope to see you and meet you in the conference chat.
