Exploring the Data Mesh Powered by GraphQL


Different approaches are being explored for building an operational data lake with an easy data access API, as well as a federated data access API, and GraphQL opens up opportunities for enabling these architectures by laying the foundation for a data mesh.

FAQ

GraphQL is a query language for APIs that enables clients to request exactly the data they need in a single API call. This can reduce the load on underlying data systems, increase performance, and help in managing complex data structures efficiently.

A data API layer is crucial for standardizing API interfaces, ensuring performance quality, and providing security compliance across various data sources and client types. It facilitates efficient data management and faster application development.

While GraphQL offers flexibility and efficient data fetching, it also presents challenges such as increased complexity in schema design, authorization, security, and optimal query planning across varied data sources.

GraphQL integrates data fetching with authorization logic, allowing for complex security rules that are tailored to user permissions and data access requirements. This integration helps prevent unauthorized data access and ensures compliance with security policies.

GraphQL APIs can serve a wide range of clients including internal and external clients, clients operating on cloud or on-premises, and those across different teams or regions. This flexibility makes GraphQL suitable for diverse operational environments.

Serverless architectures can significantly enhance GraphQL performance by allowing precise data fetching that reduces unnecessary operations and costs. This architecture adapts well to the scalable and efficient nature of GraphQL.

Yes, GraphQL effectively prevents over-fetching and under-fetching by allowing clients to specify exactly what data they need. This not only optimizes data retrieval but also improves overall application performance and user experience.

Tanmai Gopal
34 min
08 Dec, 2022


Video Summary and Transcription

This Talk discusses the challenges of working with data APIs and GraphQL, including standardization, performance, and security. It emphasizes the need to optimize data fetches and push down authorization logic. The concept of externalizing authorization and using a GraphQL engine is explored. The Talk also covers the generation of GraphQL schemas and APIs, as well as the implementation of node-level security. Overall, the focus is on designing and standardizing GraphQL for data APIs while addressing authorization challenges.

1. Introduction to Data APIs and GraphQL

Short description:

In this part, Tanmai discusses the need for a data API layer to address the challenges of working with different data sources and clients. He highlights the benefits of GraphQL in selecting and structuring data, but also acknowledges the challenges of standardization, performance, and security. Tanmai explains how performance optimization can vary depending on the data sources and shares examples of query plans. He also mentions the discussion around the N+1 problem in GraphQL.

Hi, folks. I'm Tanmai, co-founder and CEO at Hasura, and I'm going to talk to you a little bit about data APIs powered by GraphQL today. Increasingly, platform teams across various organizations are setting up a data API layer to deal with the problem of having so many different data sources and so many different types of clients. You need to solve problems of performance, standardization, and security to allow these clients to move quickly.

We have to deal with the fact that the domain data is coming from different sources: databases and services. Clients can be internal or external; they can be at the edge, on the cloud, or on-prem; they can be within the same team or across different teams. We need a data API layer that can absorb this and solve for standardizing the API, providing a certain performance quality of service, and providing security and compliance guarantees. As a data API, GraphQL can be a great fit, and we'll see some of the benefits of GraphQL in addressing these challenges as well.

GraphQL is a nice API because, as we all know, it allows us to select exactly the data that we need in a single API call. This has a pretty large impact if we're fetching models that have hundreds of attributes, where we can drastically reduce the load on the underlying data system. Increasingly, as we move to serverless architectures, there's also a massive cost saving when we're able to select exactly the data that we need.

We all know that GraphQL has a really nice schema: you have a typed graph that allows us to structure the way we get our output, and it also allows us to structure our input and parameters fairly easily. That has an impact on our ability to handle increasing complexity. Think about this query: I'm fetching orders where the user ID is greater than a particular value, ordered by the user ID in ascending order. Providing these input parameters and arguments is much easier with GraphQL than trying to do this with a REST API, for example. So being able to layer on this complexity becomes much easier.

When we take these niceties of GraphQL that we're all aware of and think about standardizing and scaling them, we run into some challenges. At its core, it's because the cost of providing this increased flexibility is that we need to do a little more work in solving for standardization or schema design, guaranteeing performance, and solving for authorization and security.
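The orders query described above can be sketched as a GraphQL document sent together with variables. This is only an illustration: the field and argument names (`orders`, `where`, `order_by`, `user_id`, `_gt`) are assumptions modeled on a common filter-and-sort argument style, not the exact schema shown in the talk, and `build_orders_request` is a hypothetical helper.

```python
import json

# Hypothetical query: field and argument names are illustrative assumptions,
# not the schema shown in the talk.
ORDERS_QUERY = """
query Orders($minUserId: Int!) {
  orders(where: {user_id: {_gt: $minUserId}}, order_by: {user_id: asc}) {
    id
    user_id
  }
}
"""


def build_orders_request(min_user_id: int) -> str:
    """Serialize a JSON request body carrying the query and its variables."""
    payload = {"query": ORDERS_QUERY, "variables": {"minUserId": min_user_id}}
    return json.dumps(payload)


if __name__ == "__main__":
    print(build_orders_request(100))
```

The filter and the sort order both travel as structured arguments inside one document, which is the part that tends to get awkward when encoded as REST query-string parameters.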

Let's take a look at performance, for example. If we think about the types of data sources that we have and the way that we execute a query across those data sources, the optimal data fetching can be very contextual. Take a simple example of fetching orders and, for each order, the user's name. Depending on the topology of this data, we might have varying query plans.

For example, if the data came from the same data source and that source supported JSON aggregation, and I had to implement a controller that responded with just this data, I could make a single query that performs the JSON aggregation at the data source itself. That means I'm not even doing a join that fetches a Cartesian product; I'm making a more efficient query that fetches just the order and the user, constructs the shape of the JSON, and sends that back to the client.

Let's say it's coming from two different databases, in which case I would use something like an IN query and perform memoization, so that I'm not fetching duplicate entities in this cross-database join. If this was coming from two different services, then I'd have to make multiple API calls, but again, I would do a form of memoization to prevent duplicate entities being fetched within the same request. It's a variation of the DataLoader pattern. The point is that the query plan depends on the kind of data sources that we have, and the same query plan will not work across these different data sources.

There's an interesting thread that popped up on Twitter a few weeks ago about whether GraphQL creates an N+1 problem, and Nick, one of the creators of GraphQL, chimed in saying that GraphQL doesn't create the N+1 problem, but because of the way that we typically think about executing a GraphQL query, it does make it harder to address that problem in a systematic way.
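The two-database plan can be sketched roughly as follows. This is a minimal illustration, not the engine's actual implementation: two in-memory SQLite databases stand in for the separate data sources, and the table names, column names, and the `fetch_orders_with_users` helper are all assumptions made for the example.

```python
import sqlite3

# Two separate in-memory databases stand in for two independent data sources.
orders_db = sqlite3.connect(":memory:")
users_db = sqlite3.connect(":memory:")

orders_db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER)")
orders_db.executemany(
    "INSERT INTO orders (id, user_id) VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 20), (4, 10)],
)
users_db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
users_db.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(10, "Ada"), (20, "Grace")],
)


def fetch_orders_with_users():
    """Return (orders with user names, number of queries sent to users_db)."""
    orders = orders_db.execute(
        "SELECT id, user_id FROM orders ORDER BY id"
    ).fetchall()

    # Deduplicate the keys so each user is fetched at most once per request.
    user_ids = sorted({user_id for _, user_id in orders})

    # One IN query instead of one query per order (the N+1 shape).
    user_query_count = 0
    users_by_id = {}
    if user_ids:
        placeholders = ", ".join("?" for _ in user_ids)
        rows = users_db.execute(
            f"SELECT id, name FROM users WHERE id IN ({placeholders})", user_ids
        ).fetchall()
        user_query_count += 1
        users_by_id = dict(rows)

    result = [
        {"id": order_id, "user": {"name": users_by_id.get(user_id)}}
        for order_id, user_id in orders
    ]
    return result, user_query_count
```

With four orders that reference two distinct users, the users database receives a single query. A naive per-order lookup would have sent four.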

2. Challenges of Data Fetching and Authorization

Short description:

In this part, Tanmai discusses the challenges of integrating predicate pushdown with data fetching and the need to push down authorization logic. He emphasizes the importance of optimizing data fetches and explains the challenges of doing this across data sources.

That's what we'll look at, and we'll see how we can address those kinds of challenges. Now think about authorization. A very common challenge is that we have to integrate predicate pushdown with our data fetching. Look at the same query, where we're fetching orders and users, and let's say this query is being made by a regional manager who can only fetch orders that are placed within their region. If we made a naive request where we selected all of this data and only then started filtering by region, that would be terrible. We obviously can't do this when we have millions or billions of rows. What you want to do instead is select from the orders model or table where the region is equal to the current region. That is the predicate, and we're pushing that predicate down into our data fetch. We'd want to push down our authorization logic with our data fetch as much as we possibly can. Doing this across data sources can become challenging.
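The difference between filtering after the fetch and pushing the predicate down can be sketched like this. It is a minimal illustration under assumed names: the `orders` table, its `region` column, and both helper functions are hypothetical, and an in-memory SQLite database stands in for the real data source.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount INTEGER)"
)
db.executemany(
    "INSERT INTO orders (id, region, amount) VALUES (?, ?, ?)",
    [(1, "emea", 50), (2, "apac", 70), (3, "emea", 20), (4, "amer", 90)],
)


def orders_naive(session_region):
    """Fetch every row, then filter in application code.

    Returns (allowed rows, number of rows fetched from the database).
    """
    all_rows = db.execute(
        "SELECT id, region, amount FROM orders ORDER BY id"
    ).fetchall()
    allowed = [row for row in all_rows if row[1] == session_region]
    return allowed, len(all_rows)


def orders_pushed_down(session_region):
    """Put the authorization predicate into the WHERE clause.

    Returns (allowed rows, number of rows fetched from the database).
    """
    rows = db.execute(
        "SELECT id, region, amount FROM orders WHERE region = ? ORDER BY id",
        (session_region,),
    ).fetchall()
    return rows, len(rows)
```

Both functions return the same rows for a given region, but only the second keeps the disallowed rows from ever leaving the data source, which is what matters at millions or billions of rows.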
