With only an HTTP endpoint provided you get autocompletion and client-side validation. Isn't it magic? No, it is introspection. Whether you have heard about it or not, you have most likely already utilized it. Let's uncover together what it is, how it works and why it is the fuel to power the GraphQL ecosystem.
All You (n)ever Wanted to Know about Introspection
AI Generated Video Summary
Introspection in GraphQL allows APIs to be self-aware and self-discoverable. It eliminates the need for external standards or documentation and provides detailed information about types and fields. The introspection query provides insights into the schema, and GraphQL Tools allows schema transformation. The introspection result can be used for generating powerful tooling and detecting breaking changes in CI/CD pipelines.
1. Introduction to Introspection in GraphQL
Hello and welcome. In this talk, I'll be discussing introspection in GraphQL. Introspection allows your API to be self-aware and self-discoverable, providing information about the types it works with. Unlike REST, GraphQL has built-in support for introspection, eliminating the need for relying on external standards or documentation. With introspection, you can easily query your GraphQL server and retrieve information about the available types and their fields. Let's take a look at a real-world example using GraphiQL, a popular graphical interface for issuing queries against GraphQL servers. GraphiQL provides auto-completion and detailed information about the schema, types, and fields.
Hello and welcome. Thank you for joining my GraphQL Galaxy talk. I'll be talking about introspection, how it can be useful and all those things. The idea for this talk actually came from when a colleague dragged me into a live stream where they wanted to figure out how they can get GraphQL query autocompletion into some API tooling. And I thought it might be interesting to just share some of the learnings with other folks as well.
So who am I? My name is Stephan Schneider and I have been working with APIs for roughly five years now and started working with GraphQL in 2018 when our company was building our first GraphQL API. Since then, I basically have never stopped working with it and want to give you some more details about introspection. One slight distinction we have at Contentful is that we generate a thousand schemas every month, because our customers can define their own content model, and based on that, we generate a GraphQL API for them. And for that, we also use introspection in some of our tests, and that will be quite interesting.
What is introspection and why should I care? Introspection is also a term used in psychology, and I really like two terms from there, self-awareness and self-discoverability, so I carried them over to the technical side of introspection as well. Self-awareness means that your API knows which types it is working with and can give you information about them. Self-discoverability means there is a standardised way for the server to describe its types, which you can use to actually fetch those types from the server. In comparison to REST, this is an upfront design decision of the language itself. With REST, you have to rely on the API following some standard, like the OpenAPI spec, for example. If it doesn't comply with one, you basically have no information about what kinds of types it provides and have to rely on good developer documentation having been written.
So let's look at a real-world example. Most of you have probably worked with it before: GraphiQL. The GraphiQL interface ships with most server implementations, and it allows you to issue queries against your GraphQL server. Here, for example, you can see all the types that our Contentful schema knows about, such as the Contentful Metadata or the Contentful Tag. For all of those types, it also knows which fields are available and, as you can see here, their descriptions. Not only that, for all those fields you also get their types and their arguments. So in the end, GraphiQL knows about everything that your schema contains and can give you auto-completion for it.
How does it work? Let me introduce you to a couple of types that you might have never seen before, because you never wrote them into your schema, yet they are part of it. They are there implicitly. On the query type, we have two more fields called __schema and __type. The two underscores mean they are meant for internal use, and you are not allowed to create types that start with two underscores. They are owned by GraphQL as a language. We also have the type __Schema, which is returned when you query the __schema field. The __Schema type then contains more fields that you can use for querying; you will see those later.
2. Implicit Introspection Types
You can query for types and get information about specific types and their fields. The __typename is useful for differentiating runtime types. Introspection types are implicit and cannot be removed from your schema. Disabling introspection does not guarantee schema safety. GraphQL provides helpful error messages, including did you mean completions, even with introspection disabled.
For example, you can query for all the types, or even for a single type on the query object, and you will get information about that specific type, like its fields. Those fields are implicit, which means they never actually show up in your schema and you don't have to write them yourself.
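To make the two entry points concrete, here is a minimal sketch of what such queries can look like. The constant names, the `GraphQLRequest` shape and the type name `ContentfulTag` are assumptions made for illustration; only the `__schema` and `__type` fields and their sub-fields come from the GraphQL introspection system.

```typescript
// Shape of a GraphQL request body as it is commonly sent to a server.
interface GraphQLRequest {
  query: string;
  variables?: { [variableName: string]: string };
}

// Lists every type the schema knows about, via the implicit __schema field.
const ALL_TYPES_QUERY = `
  query AllTypes {
    __schema {
      types {
        kind
        name
      }
    }
  }
`;

// Looks up one type by name, via the implicit __type field.
const SINGLE_TYPE_QUERY = `
  query SingleType($name: String!) {
    __type(name: $name) {
      kind
      name
      description
      fields {
        name
        description
      }
    }
  }
`;

// Builds the request for a single named type.
function singleTypeRequest(typeName: string): GraphQLRequest {
  return { query: SINGLE_TYPE_QUERY, variables: { name: typeName } };
}

// "ContentfulTag" is a hypothetical type name used only for this example.
const tagRequest = singleTypeRequest("ContentfulTag");
```

Sending `tagRequest` to a server is left out on purpose, since the transport (HTTP, WebSockets, RPC) is independent of the query itself.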
Then there are a bunch more, and of those I think the most interesting one is __typename. You might have used it already when you have worked with interfaces or with union types. You can use it to differentiate which actual type is returned at runtime, so you can reason about which of the two or three possible types you have actually received in your result. It is also implicit; you never write it into your schema.
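On the client side, __typename pairs naturally with a discriminated union. The following is a small sketch under assumed names: the `Article` and `Author` types and their fields are hypothetical and only stand in for the members of some union in your own schema.

```typescript
// Hypothetical union: a search result is either an Article or an Author.
interface Article {
  __typename: "Article";
  title: string;
}

interface Author {
  __typename: "Author";
  name: string;
}

type SearchResult = Article | Author;

// __typename tells us at runtime which member of the union we received,
// and the compiler narrows the type inside each branch.
function describeResult(result: SearchResult): string {
  if (result.__typename === "Article") {
    return "Article: " + result.title;
  }
  return "Author: " + result.name;
}

// Hand-written data standing in for a query response.
const searchResults: SearchResult[] = [
  { __typename: "Article", title: "Introspection 101" },
  { __typename: "Author", name: "Stephan" },
];

const descriptions = searchResults.map(describeResult);
```

For this to work, the query has to select `__typename` explicitly on the union or interface field.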
What does it mean that those introspection types are implicit? It means you don't have to write them. They will be there, and there is no way for you to forget to implement them or to implement them differently. Every __Type object will always have a fields field, and that field cannot be implemented differently; it is part of the implementation of the server that you are using. If a server doesn't stick to that, it is basically not spec-compliant. They therefore give you a standardized interface to work with the types in your system. You can technically disable introspection, but you cannot actually remove those types from your schema. In the GraphQL.js reference implementation, for example, disabling can be done via the so-called NoSchemaIntrospectionCustomRule. Quite a handy name, but it does imply it is a custom rule. With that, you are relying on the validation phase to disallow querying the __schema or __type fields or the other types that are there implicitly. By disallowing querying any of them, you have effectively removed them from your schema, but please take note that it doesn't mean your schema is now safe from being inspected. GraphQL is quite helpful when it comes to errors. Within those errors, you get some "did you mean" completions, and those completions will still show up even if you have introspection disabled. Someone could brute-force your API into returning those names, so don't take it as a security measure.
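To illustrate why the "did you mean" completions leak schema details, here is a small sketch that pulls the suggested names out of an error message. The message format is an assumption modelled on what GraphQL.js typically produces and may differ between versions and server implementations; the function name and the sample message are made up for this example.

```typescript
// Extracts the quoted suggestions from a validation error message such as:
//   Cannot query field "nam" on type "User". Did you mean "name"?
// The format is an assumption; other servers may phrase this differently.
function extractSuggestions(message: string): string[] {
  const marker = "Did you mean ";
  const start = message.indexOf(marker);
  if (start === -1) {
    return [];
  }
  // Only look at the text after the marker, so the misspelled field and
  // the parent type name are not mistaken for suggestions.
  const tail = message.slice(start + marker.length);
  const quoted = /"([^"]+)"/g;
  const suggestions: string[] = [];
  let match = quoted.exec(tail);
  while (match !== null) {
    suggestions.push(match[1]);
    match = quoted.exec(tail);
  }
  return suggestions;
}

// Hypothetical error message returned for a misspelled field.
const leakedNames = extractSuggestions(
  'Cannot query field "nam" on type "User". Did you mean "name" or "names"?'
);
```

Here `leakedNames` contains the two real field names even though no introspection query was ever sent, which is exactly why disabling introspection alone does not hide a schema.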
3. Introspection Query and Schema Definition Language
Let's have a look at the introspection query. It provides information about all the types in the schema, including their kind, name, description, and fields. You can also write your own introspection queries to test generation assumptions. However, using the schema definition language (SDL) directly can be cumbersome, especially in transport-independent scenarios. Creating another HTTP endpoint or returning a lengthy string with the whole SDL may not be ideal. Apollo's Federation approach addresses this but can feel hacky. It's important to serve the correct final schema.
Let's have a look at the introspection query. This is the introspection query in a very shortened form, because the real one has over 100 lines and that would not be very good to display here on a slide. This is a brief form of it. You can import the full query from the GraphQL.js reference implementation, and it will give you basically everything that the type system knows about.
What it does is give you all the types: the kind of type (for example, whether it's an object, an enum or an interface), the name, the description and, if available, the fields. So it really relies on the nullability feature of GraphQL. Every type has the same shape, and you don't need to use fragments or the like to fetch them. For kinds that have no fields, fields will just be null in the response. For objects, however, you will actually get the fields returned.
What we can also do is write our own introspection queries. As I said earlier in the introduction, at Contentful we are generating different schemas based on customer input. So we thought it would be quite handy to test that our generation assumptions actually hold true. For a given input, we want to check that a specific set of types is generated. What we did was write our own introspection query, which we call here "introspect all type names", which just gets us all the types in the schema and their names. Then in a test assertion, we can just check that, for example, the query or the test CT type are available in our API. With that, you can basically test all those assumptions with your own introspection queries without taking the full, lengthy 100-line one.
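A test like the one described could be sketched roughly as follows. This is not the actual Contentful test code: the query name, the helper `missingTypes`, the type name `TestCt` and the hand-written sample response are all assumptions for illustration.

```typescript
// A custom introspection query that asks only for type names.
const INTROSPECT_ALL_TYPE_NAMES = `
  query IntrospectAllTypeNames {
    __schema {
      types {
        name
      }
    }
  }
`;

// Shape of the JSON that the query above returns.
interface TypeNamesResult {
  data: { __schema: { types: { name: string }[] } };
}

// Returns the expected type names that are absent from the result.
function missingTypes(result: TypeNamesResult, expected: string[]): string[] {
  const present = result.data.__schema.types.map((t) => t.name);
  return expected.filter((typeName) => present.indexOf(typeName) === -1);
}

// Hand-written stand-in for a server response (hypothetical type names).
const sampleResult: TypeNamesResult = {
  data: {
    __schema: {
      types: [{ name: "Query" }, { name: "TestCt" }, { name: "String" }],
    },
  },
};

// An empty array means every expected type was generated.
const missingFromSample = missingTypes(sampleResult, ["Query", "TestCt"]);
```

In a real test, `sampleResult` would be replaced by the response of executing `INTROSPECT_ALL_TYPE_NAMES` against the generated schema, and the assertion would check that the returned array is empty.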
At this point, you might be wondering: well, we already have the schema definition language, why can't we take this? Yes, this would be possible. You could take the SDL string, parse it into the object types, and then work with them. This is not really the nicest experience. Also, there is another problem. GraphQL is transport independent. While it's usually served via HTTP, and that's so common there's even a GraphQL over HTTP spec, you technically could also use RPC calls, for example, or you could have some WebSocket interactions. On those, you also want to be able to get your introspection results. What we can't do is just create another HTTP endpoint, because that is not available on transports like RPC or WebSockets. Then you might be thinking, okay, we could just put it on a field somewhere. Yes, you can, and then basically just return a very lengthy string which contains your whole SDL. Actually, that is what Apollo is doing when you are using Federation. In Federation, you then have, I think it's called, a service type, and on that service you have an sdl field which gives you the whole SDL string again. For me, that feels a bit hacky, but I do get the reasoning, and you will learn about that today as well. Schemas often also get filtered or stitched and so on, and you only want to serve the real final one, so you would have to take care that you always serve back the correct one.
4. Introspection and SDL
The stitched API in its introspection always knows about all the types stitched into it, ensuring no mistakes or sync issues. Introspection is handy for understanding commonalities and differences between SDL and introspection. However, introspection cannot be used for schema stitching. When working with a GraphQL schema, you typically build it from an SDL string and introspect its types via introspection. The resulting introspection result is a JSON response that can be analyzed. You can also print the schema again to get the SDL representation. Introspection results can be used to build a client schema, which can then be printed as an SDL string. The process can be reversed, transforming an SDL or introspection into each other. However, there are differences between the two. SDL retains directive information, while introspection does not.
But the stitched API in its introspection will always know about all the types that are stitched into it, and no mistake can happen. It cannot get out of sync. Next, the commonalities and differences between SDL and introspection. I think they nicely explain why it is so handy to have introspection around, and also why we cannot use introspection for schema stitching.
So when we have a schema, the GraphQLSchema is the domain type we are always working with. For example, when we pass it to express-graphql or to Apollo Server, they all work in the background with a GraphQLSchema object. A GraphQLSchema object can be used to execute queries against it, run validations against it and so on. Usually you would build your schema from an SDL string by passing it to buildSchema. buildSchema is provided by the GraphQL.js reference implementation, the same as all the other functions I'll be calling in this section. You can then introspect its types via introspectionFromSchema and you get an introspection result, which is the same JSON response that, for example, GraphiQL would receive. You can then even take the schema and print it again, and you get the SDL out of it. Here the whitespace is preserved because, when you parse it from an SDL string, the types in the schema know about their position and whitespacing, so in the end it should be a one-to-one representation of the initial string that you parsed.
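Because the introspection result is plain JSON, it helps to see how a single field type in that JSON relates to its SDL notation. The sketch below is a hand-rolled illustration and not the printing logic of GraphQL.js; the interface and function names are assumptions, while the kind, name and ofType properties follow the introspection format.

```typescript
// A type reference as it appears in an introspection result.
// Wrapper kinds (NON_NULL, LIST) carry no name and point at the wrapped type.
interface TypeRef {
  kind: string;
  name: string | null;
  ofType: TypeRef | null;
}

// Turns a nested type reference back into SDL notation.
function typeRefToSDL(ref: TypeRef): string {
  if (ref.kind === "NON_NULL" && ref.ofType !== null) {
    return typeRefToSDL(ref.ofType) + "!";
  }
  if (ref.kind === "LIST" && ref.ofType !== null) {
    return "[" + typeRefToSDL(ref.ofType) + "]";
  }
  return ref.name === null ? "" : ref.name;
}

// Introspection JSON for a hypothetical field declared as `tags: [String!]!`.
const tagsFieldType: TypeRef = {
  kind: "NON_NULL",
  name: null,
  ofType: {
    kind: "LIST",
    name: null,
    ofType: {
      kind: "NON_NULL",
      name: null,
      ofType: { kind: "SCALAR", name: "String", ofType: null },
    },
  },
};

const printedTagsType = typeRefToSDL(tagsFieldType);
```

With the sample above, `printedTagsType` is the string `[String!]!`, which shows that the two representations carry the same type information in different shapes.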
5. Introspection and Schema Transformation
In introspection, you won't see where directives are used. GraphQL Tools allows schema transformation, such as prefixing types and filtering schema. GraphQL defines two primitives: schema and document. Loaders generate a schema from sources like a string, file, or URL. The URL loader sends the introspection query to the endpoint. Once you have a schema, you can attach resolvers and work with it. Introspection provides a standard way to work with your API and its types.
In the introspection result, on the other hand, you won't see that. You will only know that there is a directive defined in the schema, but you will not know where it has been used. I personally think the reason for that is that directives can carry some business logic that you don't want to see exposed, but there is also a lengthy discussion going on in the GitHub repository. If you feel strongly about it, feel free to chime in there and maybe, one day, we won't need the SDL string for using federation or any other type of stitching.
Another pretty cool example of what you can do with introspection is GraphQL Tools. GraphQL Tools is built by The Guild. The Guild is doing a lot of awesome tooling around the GraphQL ecosystem, and GraphQL Tools allows you to transform your schema in various ways. You could say we want to prefix all of our types with Contentful, because we are stitching it with, for example, the GitHub schema. We don't want to have clashes on the types, so we just prefix them. No problem with that. You could also filter your schema to only expose specific types. Those tooling methods all work with basically two primitives that GraphQL has defined: the schema and the document, aka the parsed query string. A query always gets parsed into a document, and then a document can be executed against the schema. Those are the two primitives, and you get to a schema by using different loaders. Those loaders can take sources like a string, a file, or even a URL, and from any of these they will generate a schema that you can work with. You can see some loader examples here.
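The idea behind prefixing can be sketched on a plain list of type names. This is a simplified illustration and not the GraphQL Tools API: the function names are assumptions, the sample type names are hypothetical, and root types such as Query are left out of the example because they usually need separate handling.

```typescript
// Scalars built into GraphQL; renaming them would break the schema.
const BUILT_IN_SCALARS = ["String", "Int", "Float", "Boolean", "ID"];

// Decides whether a type belongs to the user's schema and may be renamed.
// Names starting with two underscores are owned by the introspection system.
function isRenamable(typeName: string): boolean {
  const isIntrospectionType = typeName.slice(0, 2) === "__";
  return !isIntrospectionType && BUILT_IN_SCALARS.indexOf(typeName) === -1;
}

// Adds the prefix to every renamable type name and leaves the rest alone.
function prefixTypeNames(typeNames: string[], prefix: string): string[] {
  return typeNames.map((typeName) =>
    isRenamable(typeName) ? prefix + typeName : typeName
  );
}

// Hypothetical type names taken from an introspection result.
const prefixedNames = prefixTypeNames(
  ["Entry", "Asset", "String", "__Schema"],
  "Contentful"
);
```

A real transform also has to rewrite every reference to the renamed types (field types, arguments, union members), which is the part a library takes care of for you.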
The easiest one is just a string, as we had in an earlier example. You can just throw the string in, it will delegate under the hood to the buildSchema function, and now you have a schema available. It can also take a file, so that you don't need to read the file yourself before calling it: you use the file loader, pass it a schema.graphql file, it reads the file in, then uses the buildSchema function, and you again have a schema that you can work with. I think the most interesting one for our talk here is the URL. You can pass loadSchema the URL of your GraphQL endpoint and use the URL loader. Once you've done that, what happens under the hood is that the full introspection query that GraphQL.js provides is sent to your GraphQL endpoint to be executed there. Afterwards, the result is fed into buildClientSchema, and you have a schema again. It basically does exactly what I've shown you in more detail before. Once you have a schema, you can attach resolvers to it if you need to, if you really want to do execution or delegation. At this point, there's basically no distinction for you anymore where that schema came from. You can just work with it.
To give you a recap of what we have just learned, introspection provides you a standard way to work with your API and to work with the types it has.
6. Benefits of Introspection
You don't need to write your own introspection query. The introspection result can be used to generate powerful tooling, including interfaces for type languages, visual representation of the API schema, and code completion. It can also be used in CI/CD pipelines to detect breaking changes. The introspection result can generate a non-executable schema that can be used as desired.
You can, but you actually don't need to write your own introspection query. In fact, most people are fine with just using the full introspection query the reference implementation provides. If you ever need to, you can query just portions of your type system.
We can get code completion, which is basically the reason for this talk, like GraphiQL does: when you type, you get all the field names that match, so you can use auto-completion and are aware of which fields are available. What is also pretty cool is that you can use it in your CI or CD pipeline to take the new schema from your current code and compare it with what's currently in production. You could detect any sort of breaking changes and have them automatically attached to your pull request, for example. There is even tooling for that, I think even from The Guild, that does exactly this.
The introspection result can generate a non-executable schema. Of course, all the business logic that is in resolvers is not part of the introspection, but you can generate a schema from it and then use it as you wish. And with that, I thank you for your attention. If you have any questions or feedback, you can reach out to me on the Twitter handle linked here.
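The comparison step in such a pipeline can be sketched with a deliberately small model of a schema. This is a hand-rolled illustration that only detects removed types and removed fields; GraphQL.js ships a findBreakingChanges utility that covers far more cases. The SchemaShape type, the function name and the sample data are assumptions for this example.

```typescript
// Minimal view of a schema: type name -> field names.
// Such a shape can be derived from an introspection result.
type SchemaShape = { [typeName: string]: string[] };

// Reports types and fields that exist in production but not in the candidate.
function findRemovals(production: SchemaShape, candidate: SchemaShape): string[] {
  const problems: string[] = [];
  Object.keys(production).forEach((typeName) => {
    const newFields = candidate[typeName];
    if (newFields === undefined) {
      problems.push("Type removed: " + typeName);
      return;
    }
    production[typeName].forEach((fieldName) => {
      if (newFields.indexOf(fieldName) === -1) {
        problems.push("Field removed: " + typeName + "." + fieldName);
      }
    });
  });
  return problems;
}

// Hypothetical shapes: the candidate drops Query.asset and adds Tag.slug.
const productionShape: SchemaShape = { Query: ["tag", "asset"], Tag: ["id", "name"] };
const candidateShape: SchemaShape = { Query: ["tag"], Tag: ["id", "name", "slug"] };

const removals = findRemovals(productionShape, candidateShape);
```

Adding a field, like Tag.slug here, is not reported because existing queries keep working; only the removed Query.asset field shows up in `removals`.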