The Secret to Graph Onboarding

“What is GraphQL used for?” “How do I find the fields I need?” “How do I test queries against a local graph?” These are all common and valid questions that developers learning GraphQL ask when they are first getting started with a unified graph. The secret to overcoming these challenges? You guessed it, Apollo Studio! In this session, we’ll dive into the best tools in Apollo Studio to help new (and experienced) developers feel confident when querying and collaborating on a unified graph.



Transcription


Hey everyone, my name is Danielle and I'm an engineering director at Apollo, where I've been working on helping people use graphs for more than five years now. Today I want to share some tips with you for how to set your graph up so other people can use it successfully. This talk will be a little different than your standard GraphQL deep dive because instead of talking about some of the elements that go into building a graph, I would like to talk about how people use graphs. And in particular, I would like to talk about how to make your graph more usable to other people.
You might wonder, why is it important to take explicit steps towards making your graph more usable? GraphQL has introspection built into its specification, so it's self-documenting, right? And the GraphQL ecosystem has a lot of wonderful tools that are auto-generated from GraphQL introspection to make things like schema diagrams and fast documentation sites. But the reality is that, like any other piece of code that you author or any other product that has a bit of success, eventually your graph will start growing and growing, and you're going to need to scale the velocity of your graph's development beyond a single person or a single team of people. And the more you think about how you want to use your graph early on, the easier it will be to onboard new folks into your graph down the road.
So with our time today, I want to talk first about how people who are running graphs at a pretty large scale are onboarding developers into their graphs, both to consume and to contribute to. And then with the majority of our time, I want to write some code together and show you how you can set your graph up with a developer portal-like experience for other people to plug into with very little effort. So let's dive in. I feel like the GraphQL community is inundated with technical content relating to how to scale a graph, but I haven't seen a lot of content around the human side of scaling a graph.
So I asked some of our customers and some folks from our community: what do you do to onboard new people to your graph? And I got a lot of answers back with overlap, so it was exciting to see the patterns in what people were doing. A lot of folks are using internal #graphql Slack channels to make themselves available for questions and to create processes around things like schema design. A lot of folks host regular onboarding sessions to help new teams get started. Folks also host monthly lunch-and-learn sessions and brown bag talks on advanced GraphQL topics or to talk about new things that are happening in the ecosystem. And a lot of folks have set up a graph section in their internal wikis. Some folks have even created full graph websites. And these sites tell people how to get started with the graph, how to authenticate given the company's unique needs, what types of conventions the graph is following, et cetera.
Some of my favorite answers that I got back, though, were actually around the tooling that people have set up to enforce that their graph is being used according to the rules of the company. One example is that there are a lot of ways that people set up observability for their APIs. And with GraphQL, once you put something into a graph, it becomes discoverable through introspection. So it's very possible that you might add a field to your schema for your own use, but someone else in some other department might also start using it. So if you require that every request being made to your graph has headers to identify which client was making those requests, then you can set up observability to be alerted whenever a new client starts using your field, which becomes really useful information as you evolve a graph. And what I have on this slide here is just an example of what that client-level observability looks like in Apollo Studio.
Another pattern that I really liked was the idea that you could establish some naming conventions for your operations. So first and foremost, one thing that you can do is enforce that every operation sent to your server has a name. But then in addition to that, you can set up naming conventions so that your operations are all being named in a standardized way. And this becomes really useful down the road, because inevitably, one day, you'll end up having some issue with some section of your graph. And if all of your operations follow a standard naming convention, it will be very easy to figure out which parts of your products are being affected by this graph issue. So even though operation names aren't unique identifiers, in practice, they're actually very, very helpful as a tool for product resiliency.
And then on the developer tooling side of things, there was a pattern of people setting up schema and query linting tools that I really liked. One example of this is in a federated graph, where you have lots of subgraphs that are all independently owned and controlled by different teams. It can often be really useful to enforce that every subgraph schema has contact information in the schema itself. You can use GraphQL directives to define contact information, and then you can use schema linting and GraphQL schema tools to make sure that those directives are present. And one of the cool things about using the schema and directives and linting for capturing all of that metadata is that all of the tools further down in the GraphQL ecosystem can make use of that metadata as well. So here in the screenshot, I have a picture of the schema documentation page for a graph in Studio that is showing contact information about the subgraph that a particular field belongs to.
And there are so many more conventions that you could set up along these same lines. How you want your graph to handle errors is a really big topic in GraphQL.
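
To make the naming-convention idea concrete, here is a minimal sketch of a check that a lint step or a server hook could run. The convention itself (an area prefix, an underscore, then the operation's own name) and the helper name checkOperationName are invented for illustration; the talk does not prescribe a specific pattern.

```javascript
// Hypothetical convention: <Area>_<Name>, for example "Checkout_GetCart".
// Neither the pattern nor the helper name comes from the talk.
const OPERATION_NAME_PATTERN = /^[A-Z][A-Za-z0-9]*_[A-Z][A-Za-z0-9]*$/;

function checkOperationName(operationName) {
  if (!operationName) {
    return { ok: false, reason: "operation is anonymous" };
  }
  if (!OPERATION_NAME_PATTERN.test(operationName)) {
    return {
      ok: false,
      reason: `"${operationName}" does not match <Area>_<Name>`,
    };
  }
  // The prefix identifies which part of the product sent the operation,
  // which is what makes an incident easy to scope later.
  const [area] = operationName.split("_");
  return { ok: true, area };
}
```

With a prefix like this, grouping failing operations by area during an incident becomes a simple string split rather than a search through client code.
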
How do you want your graph to handle authentication? Is there a pattern that you want to follow for the types of arguments that mutations are going to take in the schema? Is there a particular nomenclature that you want to make sure people abide by when adding new things to the schema? And amidst all of this onboarding and teaching people how to use the graph and setting up conventions for them, you also need to keep in mind that a lot of folks are going to be coming to your graph without much GraphQL experience at all. So you need to have a way to give them the information that they need to succeed without completely overwhelming them or making the barrier to entry for your graph too high.
And across all of the feedback that I got from these teams about how they onboard developers into their graph, the thing that was consistent for pretty much every team is that they had created some sort of developer portal. These portals took all sorts of different shapes: websites, wikis, even Google Docs. But every team had needed to create some sort of hub for all of this content. And it might seem like building a full developer portal is kind of a big task, and therefore one you should take on only when you're ready to start scaling your graph beyond the team that started it. But there are some new things in the Apollo ecosystem that make sharing your graph much easier than it used to be, and I wanted to show you all just how easy this could be.
So with the majority of our remaining time, I wanted to write some code together and go through the process of setting up a fully customized developer portal for a graph, just to show you how accessible that now is. I have a little demo here for us to start from, and you should be able to get into this CodeSandbox by scanning that QR code.
And this is actually a graph that I created about a year and a half ago when COVID first hit and I transitioned to working from home. In my work-from-home life, I started to try to make some routines, and I made this habit of going on a walk and getting a bagel every morning before work to get myself into the right headspace. And a couple of months in, I started to wonder, what is this bagel walk habit costing my budget? And being the GraphQL developer that I am, I figured the best way to answer that question is obviously to make a graph to query that data.
So here I have a graph that can query financial transaction information. This is a very, very lightweight graph. It's just a very simple Apollo Server 3 with no configuration at the moment. My schema is very small. I just have one field on the root query type, which is transactions that we can query. And in my resolvers over here, the way this graph works under the hood is that we're making use of the Plaid financial transactions API. Plaid is a company that provides a financial transactions API as a service. They have a REST API, and this GraphQL API is really just a very lightweight wrapper around their REST API. We are using the Plaid Node client with its Plaid API bindings to just translate our transactions resolver into a call to their API. And then down here in the mutations, I have two little mutations for getting a link token and getting an access token. These mutations are how we facilitate authentication with this graph.
So I can come in here and query the server. We'll open up the Apollo Sandbox, and I can ask for transactions and maybe their amount and their name here. But if I run this request, I am going to get an error saying you're not authenticated, because there is nothing we can do meaningfully with this graph without authenticating. It's all about financial transactions.
And so if we want to invite developers to participate in contributing to this graph, we want to give them an experience where they can come and make whatever query they want at will. But we want to make login way easier for them. We don't want them to have to come here, get a link token, go to some other website, exchange that for an auth token, come back and put it into headers. That's just a lot of work. It would be much nicer if they could visit a website that feels almost entirely like this but also has a login button that they can just press and authenticate with right in place.
So in order to do that, we're going to need to create a custom web page. We can't just use the Sandbox out of the box. And I'm thinking the best place for our web page is actually at the same URL as our server, because then it will be pretty easy to find. So what we're going to want to do is change the default landing page of Apollo Server and put our custom page there. We can do that by plugging into Apollo Server's plugin architecture. There's a whole ecosystem of plugins, but Apollo Server also has a plugin API that lets you define your own. So we can come back to our server and add this plugin that serves the HTML in this string as the landing page instead of the default landing page. You can see now our server is serving that HTML.
And what we want is that login button experience. So I pre-prepared a little bit of HTML for us just so we don't have to write it on the spot. In my HTML file that we've got here, I've got some helper functions at the top of the file. The only real structure to the DOM of this file is the button that we have for login and then a little span to show login context. And then down here, I have a click handler that facilitates authentication.
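
The landing-page swap described above can be sketched roughly as follows. This is a minimal sketch, not the demo's actual code: it relies on the Apollo Server 3 plugin shape (a serverWillStart hook that returns a renderLandingPage hook), and the markup and element ids are placeholders.

```javascript
// Placeholder markup; the talk's real page also has helper functions and a
// click handler. The element ids here are made up for illustration.
const landingPageHtml = `<!DOCTYPE html>
<html>
  <body>
    <button id="login">Log in</button>
    <span id="login-status"></span>
  </body>
</html>`;

// Apollo Server 3 plugin: serverWillStart can return a renderLandingPage
// hook, and the HTML that hook returns is served to browsers that visit the
// server's URL in place of the default landing page.
const customLandingPagePlugin = {
  async serverWillStart() {
    return {
      async renderLandingPage() {
        return { html: landingPageHtml };
      },
    };
  },
};

// Wiring (needs the apollo-server package, so it is left as a comment):
// const server = new ApolloServer({
//   typeDefs,
//   resolvers,
//   plugins: [customLandingPagePlugin],
// });
```
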
So when this button is clicked, we get a link token from that mutation in our API. We open up the Plaid login handler. And if you have a successful login with this login handler (please don't share my banking password, everyone, this is very, very confidential information), then on success, with the data that we get back from our mutation, we will put that auth token into local storage. So now our little website here has a Plaid token in its local storage.
What we want to do next is combine this website and this login button with a query IDE experience. The way we're going to do that is by using a new feature of Apollo Studio called public graphs. You can make a new graph in Studio based on a schema from anywhere. In our case, we're going to make it based on a schema from introspecting our CodeSandbox. And if we come in here and make this graph public, then in our Explorer page, there's a setting that becomes available to us, which is a setting to embed this Explorer in other websites.
And so we have some options here. We want to embed a dark Explorer because we're putting it in CodeSandbox. We probably want to hide our docs panel because CodeSandbox is quite small. And then we're going to hide headers and environment variables because we want to handle auth through our website so that users don't have to think about how to handle auth in headers. So we're going to copy this code now, go back to our CodeSandbox, and in our HTML file right below our button, we'll paste that. We will restart our sandbox, refresh our page, and now we have an embedded Explorer. But this embedded Explorer is still just an iframe that is making requests to our server from an iframe in another website.
So we haven't fully solved our problem yet, because what we really want is to have our parent website make requests on behalf of the Explorer and not have the Explorer make requests from the iframe. So we're going to go into the Explorer's advanced config, and we're going to define this handleRequest function inside of the Explorer config. In our handleRequest function, we pretty much want the request to be sent as it otherwise would have been by the Explorer, but we want to add this one header, the authentication header, which is the token that we're going to take out of our local storage. So now if I save this, restart my sandbox, refresh my HTML page, and make this request, we should now have authenticated requests. And if we change our page's default landing query to be a little bit more interesting, so that instead of account owner, which is just the default field that we got, we query amount and name, then when we refresh this and land on that default query now and run it, we get our transactions back. And so now any developer that comes to our little dev portal in the future will be able to log into their bank account and immediately get their transactions with pretty much a login click and a run query click.
So now that our graph is super easy to query, its popularity is totally going to take off. I just know it, because everybody's going to want to know how much their bagels are costing them. And before that happens, I think we should put into place some protections so that people follow the rules of using our graph from day one. One of the rules that we had from our slides was that we want every request that comes to our graph to have our required headers. So in this case, let's say we want every request to have a required apollographql-client-name header, because then we can plug that into our other observability. The way we're going to accomplish that is by extending the logic in this context function.
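
The handleRequest step described above can be sketched as a small factory function. This is a sketch under assumptions: the storage key, the "Bearer" scheme, the graph ref, and the URLs are placeholders rather than values from the demo. The (endpointUrl, options) signature is the one the embedded Explorer passes to handleRequest.

```javascript
// Builds a handleRequest function for the embedded Explorer. The storage key
// and the "Bearer" scheme are assumptions; use whatever your server expects.
function makeHandleRequest({ storage, fetchImpl, tokenKey = "plaidAccessToken" }) {
  return (endpointUrl, options) => {
    const token = storage.getItem(tokenKey);
    // Send the request as the Explorer would have, plus one extra header.
    return fetchImpl(endpointUrl, {
      ...options,
      headers: {
        ...options.headers,
        // Only attach the header once the user has logged in.
        ...(token ? { authorization: `Bearer ${token}` } : {}),
      },
    });
  };
}

// In the page itself this would be passed to the embed, roughly:
// new window.EmbeddedExplorer({
//   target: "#embedded-explorer",
//   graphRef: "my-graph@current",         // placeholder graph ref
//   endpointUrl: "https://example.com/",  // placeholder server URL
//   handleRequest: makeHandleRequest({
//     storage: window.localStorage,
//     fetchImpl: window.fetch.bind(window),
//   }),
// });
```

Injecting the storage and fetch implementations keeps the function easy to exercise outside a browser; in the page they are simply window.localStorage and window.fetch.
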
So if a request does not have our apollographql-client-name header, then we are going to throw a new error: all requests must be from an identified client. Now if I run this query, we should get that error back, which we do, because our little web page here isn't identifying itself as a client yet. But over here, just like we extended our request using this handleRequest function with an authorization header, we can also pretty easily add an apollographql-client-name header, and just say this is our portal embedded explorer. Now if I refresh my page here and run this query, we should be good to go again. And if we were to look into our observability tools, now the requests coming from our portal embedded explorer will be identified as such.
So now that we're enforcing that required headers are indeed present on our requests, another rule that we had from the slides was that all operations should also have a name attached. And this one is a little bit more involved, because unlike headers, which are just a property that you can pull off the request directly, operation names are embedded in the body of the operation, and the body of the operation when it goes over the network and first hits our server is entirely in a stringified format. And so in order to know whether or not the operation is actually named, we have to have our server parse that operation into an abstract syntax tree, and then ask at that point, does this operation have a name? So unfortunately, we can't just continue to extend the logic that we have in context here, but fortunately, we can make use of Apollo Server 3's plugin architecture. If we come back to the Apollo Server docs and look at how the plugin request lifecycle works, there's an entire set of states here that every request goes through in Apollo Server, and at any point, we could plug into these states and add some augmented logic.
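
The header check described above might look roughly like this in an Apollo Server 3 style context function. It is a sketch rather than the demo's code: the shape of the returned context object is an assumption, and the existing authentication logic is only indicated by a comment.

```javascript
const CLIENT_NAME_HEADER = "apollographql-client-name";

// Context function in the Apollo Server 3 style: it receives the incoming
// HTTP request and runs for every request before any resolver executes.
function context({ req }) {
  // Node lowercases incoming header names, so look the header up in lowercase.
  const clientName = req.headers[CLIENT_NAME_HEADER];
  if (!clientName) {
    throw new Error("All requests must be from an identified client");
  }
  // Any existing context logic (such as reading the authorization header)
  // would stay here; the returned shape is illustrative.
  return { clientName };
}

// On the portal side, the same handleRequest function that adds the
// authorization header would also add, for example:
//   "apollographql-client-name": "portal-embedded-explorer"
```
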
And so the first stage of the request lifecycle where the request operation has been fully parsed is the stage right after validationDidStart. So this would be didResolveOperation. We're going to come in here, add another plugin, and then hook into didResolveOperation and its context. All of this is inside the requestDidStart lifecycle. So here we're going to say, if there's no context.operationName, then throw a new error: all operations must be named. And now if I run our operation again, we should still be just fine because this operation is named. But if I delete that name and run this operation, we should get an error. So we are good to go. With just one small plugin in Apollo Server and some logic in our context function, we are now automatically enforcing that all clients who are using our graph are following our rules. And as this graph takes off, and more people discover that they can use our API and learn all about their bagel transactions, we will now know exactly who is using our API and how, which is pretty amazing.
The last thing that I wanted to cover in our demo is how to make this graph truly approachable. It's one thing to give folks that are going to be developing against our graph a link to this sandbox that we made and say, have at it, figure it out, good luck. But it's another thing entirely to give someone a link to a homepage that welcomes them and tells them where to start. I showed you all how to make a public graph in Studio earlier because that's how we embedded the Explorer, but I didn't really tell you what public graphs were all about. We built public graphs in Studio to make it arbitrarily easy, basically turnkey, for you to share your graph with your users. And every graph in Studio has a homepage where you can fully customize its metadata.
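
The operation-name rule described above can be sketched as a small plugin. This is a minimal sketch using the Apollo Server 3 hook names from the talk; the helper assertOperationIsNamed is a name introduced here so the check can be exercised on its own.

```javascript
// Hypothetical helper, split out so the rule can be tested without a server.
function assertOperationIsNamed(requestContext) {
  if (!requestContext.operationName) {
    throw new Error("All operations must be named");
  }
}

const requireOperationNamePlugin = {
  async requestDidStart() {
    return {
      // Runs after the document has been parsed and validated, once the
      // server knows which operation in it is going to execute.
      async didResolveOperation(requestContext) {
        assertOperationIsNamed(requestContext);
      },
    };
  },
};

// Wiring (needs the apollo-server package, so it is left as a comment):
// const server = new ApolloServer({
//   typeDefs,
//   resolvers,
//   plugins: [requireOperationNamePlugin],
// });
```
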
So here I pre-prepared a little readme for us so we don't have to type it out on the spot, but we can customize this readme and explain all about how this is a little GraphQL wrapper around the Plaid REST API. Because we have our developer portal with this login button that we made, we can add that as an external link. And we could even give this graph an avatar if we wanted to make it look real sharp and fancy. Once we've set up the homepage to look exactly like we want, we can get the public link to our graph. Now anyone on the open internet can come, consume our graph, read all about it and use it. And we're pretty much done here with this job.
So in just about 15 minutes, we've set up our graph to be very approachable. Even folks with no context on this graph specifically or folks who might be newer to GraphQL should be able to take a look at that readme and get started. We made a custom developer portal with a login button so that people don't have to jump through hoops to run their first query and just start looking at their bagel data. We set up guardrails at the code level so that everybody using our graph and evolving their usage of this graph will continue to do so according to our rules of observability. And we created a front page to the graph at a URL that can be shared with anyone as a friendly place to get started.
So the point I really want to leave you all with today is that the earlier you think about how people are going to consume your graph, the better. I think a lot of folks have a tendency to focus on the early graph development days as being all about data modeling and productionizing their deployments because those are the hardest and most obvious problems. But the industry is not moving towards GraphQL because it's easier to build with. It's moving towards GraphQL because it's easier to use. So please make sure your graph is easy to use. That's the tea. Thank you so much for watching.
Enjoy the rest of the conference today and tomorrow. And if you have any questions or comments, I will see you on the internet. Bye-bye!
24 min
10 Dec, 2021
