As it says in the GraphQL documentation: "Delegate authorization logic to the business logic layer". Is that really everything you need to know? This advice is coming from a good place, but it relies on you knowing how you would go about doing authorization in the first place, and this isn't a widely solved problem! On top of that, many of the approaches used in traditional applications don't quite carry over. In this talk, you'll get a crash course in authorization and how to implement it for GraphQL APIs.
Authorization Patterns in GraphQL
From: GraphQL Galaxy 2022
Transcription
All good talks need to start with a quote, right? So you know how they say in theory there's no difference between theory and practice, but in practice there is? Well, that's what this talk is about. I'm going to talk to you about the theory of authorization in GraphQL, and then if and how this matches practice. So I'm Sam, I'm the co-founder and CTO at Oso, and we build authorization for developers. And as someone who did a PhD in cryptography and then gradually slid more and more into the practical world, I'm pretty familiar with the distinction between theory and practice. So strap yourselves in and learn a little about patterns in GraphQL, both theoretical and practical. So I know there's a lot of confusion out there around the difference between authentication and authorization, so I'll start out with a few definitions. Authentication is all about identity. You know, who is the user? Maybe you identify them through username and password. You may do single sign-on, you may have two-factor authentication. That's all authentication. Authorization is the piece that comes next. Now that you know who the user is, what can they do? And I'm going to be talking about application authorization. So specifically, what can they do inside your application? To give you a few examples, take GitHub, which is a great example. It's got a pretty robust authorization system. It allows you to do things like roles at the organization level, like owner and member, and which of those can create repositories. You can also have roles at a more granular level on repositories. I can invite you as a collaborator to my repository and give you a specific role, which grants you different access. Now I really like GitHub as an example, because they also have a GraphQL API, so we'll be able to see authorization not just generically, but also in the context of GraphQL. For a bit more of a complex authorization example, take AWS IAM.
And with AWS IAM, you can write your own authorization logic and policies to determine who is allowed to do what inside of this ginormous platform. It's a pretty complex one. But there are some examples of application authorization that you might not think about so much. Take something like Google Docs or Notion, where part of the core workflow is inviting people to collaborate on documents, or maybe inviting them to an entire folder and deciding whether they can view, edit, or comment on those documents. That's all authorization. So GitHub, AWS, Google Docs, Notion, these are all fantastic examples of application authorization. So before we start getting into the technical details, why is this an important topic to talk about? Well, first of all, if you don't have authorization in an application, your product is entirely broken. I mean, you have anarchy. People can go in and do anything, they can go and delete other users, they can delete other people's data, they can do anything. But on the other hand, if your authorization is broken, people can't get access to anything. Your app doesn't work. If your authorization is buggy, then users start getting annoyed, right? We've all been in that situation where the authorization logic in an app is so broken and frustrating that you just say, you know what? I'm going to go make everybody an admin. I don't want to deal with this permission system. Okay, so those can be the implications of doing authorization wrong or even just poorly. Okay, so getting into the authorization patterns. The first pattern I want to talk about is authorization in the business logic layer. Now if you're like me and you're coming to a new topic, probably the first thing you do is just go ahead and Google it. So you type in GraphQL authorization. The first hit happens to be graphql.org. There's this beautiful conceptual overview on authorization in GraphQL. And it starts with this one kind of golden rule.
It says delegate authorization to the business logic layer. Now there are potentially two new concepts in this that you haven't thought about before. One, what does it mean to delegate authorization? And two, what is this business logic layer? So the term business logic layer also comes from another graphql.org page, this page around thinking in graphs. And it lays out this architectural diagram of layers, of how to think about where GraphQL fits into your application. I have my own version of this diagram so that I can doodle on it and draw extra things. And okay, so when I speak about the layers of your application, what I'm really talking about is a backend application. So putting this picture into context, it maybe looks more like this. First of all, on the left, you maybe have your client. This could be a web browser. This could be a mobile app. And those things are all going to be making various requests to your backend APIs. Those could be REST requests. Those could be GraphQL queries. So the business logic layer handles mapping those requests to the actual things that are going on inside your application. For example, if you want to get a specific organization, the business logic will be gathering all the relevant data to return, computing fields, doing transformations, things like that. And at the bottom is the persistence layer. This is what's reading and writing data to the database. Okay. For example, coming back to that user who wants to view a specific organization, right? So maybe they do that through a REST request. And in the backend, maybe we have some application logic that we wrote to handle that. You know, it looks up the organization by that ID from the database. It authorizes: is the user allowed to read that organization? And if not, it returns null. Okay. So that's maybe our REST endpoint.
Similarly, on the GraphQL side, we now instead have a query for an organization by a specific ID. But again, we write our resolver logic to say, first, look up that organization by ID, then authorize: is the user allowed to read that organization? And similarly, return null if not. Okay. So far, so good. We have our REST API, we've got our GraphQL API, we have authorization. Doesn't seem too bad. So going back to that original graphql.org docs page, what it's saying the problem is here is that you've actually duplicated this authorization logic across multiple places. And they highlight this as a bad thing because, okay, in this case we have two reads, two methods, so not too bad. But as you start growing and growing your different APIs and endpoints, you now need to keep all of these different authorize statements in sync between these two APIs, the REST and GraphQL. Now why that's problematic is if these things fall out of sync, maybe you check for one permission in one place and check for a different permission somewhere else, then your application starts behaving differently based on what API somebody is using. And that can create a really poor user experience. So going back to that diagram, when that golden rule said delegate authorization to the business logic layer, what it's talking about is pushing that authorize call down. Don't handle it in those API handlers, but instead push it down and handle it inside the business logic layer. So in my previous example, what that might mean is in your method to look up the organization by ID, maybe that's where you put that authorize logic. So when you go and fetch that organization, you then check whether the user can read it or not. Because the REST and the GraphQL API are both calling the same method, we guarantee that they're going to be in sync. So there you have it. That's the conceptual theory of authorization in GraphQL.
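The idea of pushing the authorize call down can be sketched roughly as follows. This is a minimal illustration, not code from the talk: the `User` and `Organization` shapes, the in-memory `organizations` table, the membership rule inside `authorize`, and the handler names are all assumptions made up for the example.

```typescript
// Hypothetical types and data, for illustration only.
interface User { id: string; orgIds: string[] }
interface Organization { id: string; name: string }

// Stand-in for the persistence layer: an in-memory table instead of a database.
const organizations = new Map<string, Organization>([
  ["1", { id: "1", name: "Acme" }],
  ["2", { id: "2", name: "Globex" }],
]);

// Single authorization entry point. The rule (members may read) is a placeholder.
function authorize(user: User, action: string, org: Organization): boolean {
  return action === "read" && user.orgIds.includes(org.id);
}

// Business logic layer: fetching and authorizing happen together, in one place.
function getOrganization(user: User, id: string): Organization | null {
  const org = organizations.get(id);
  if (!org || !authorize(user, "read", org)) return null;
  return org;
}

// REST handler (framework-agnostic sketch): no authorization logic of its own.
function restGetOrganization(
  user: User,
  params: { id: string },
): { status: number; body: Organization | null } {
  const org = getOrganization(user, params.id);
  return org ? { status: 200, body: org } : { status: 404, body: null };
}

// GraphQL resolver: calls the same business logic method as the REST handler.
const resolvers = {
  Query: {
    organization: (_parent: unknown, args: { id: string }, context: { user: User }) =>
      getOrganization(context.user, args.id),
  },
};
```

Because both entry points go through `getOrganization`, a change to the read rule only has to be made in `authorize`.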
In fact, you don't actually do it in GraphQL. You push it down to the business logic layer and handle it there, so you don't have to repeat that logic multiple times. And actually, that's not just a theory. I think in practice, this is also fantastic advice. I think for a lot of people, this should be the right default option, especially if you already have an application and you already have authorization in a lot of places. As you go and start adding those GraphQL endpoints, you don't want to go and copy and paste that authorization logic around. Better that you can just have it defined in one place. I want to briefly extend this pattern with something else that I think can be super helpful to do. Suppose we were getting a specific organization. But what if instead we were fetching multiple organizations? Suppose we want to get the first ten orgs. The naive thing to do would be to fetch those organizations and then go through and authorize each one, one by one. Now, the reason that's problematic is, one, it can be kind of slow to repeatedly do that authorization. But two, if one of those is not allowed, then what do you do? Do you only return nine organizations instead of ten? Do you return a null? Do you require that the user goes back and fetches more, and so on? So this can actually be a pretty tricky thing to get right. But again, that theory from before holds really well. You should do this authorization down in that business logic layer. But in practice, how you achieve that is by applying your authorization logic as a filter at the query level. So maybe what you need to do is filter those organizations down to all the organizations that the user belongs to. Maybe we've got a list of orgs on the user. Now as that authorization logic gets more complex, you probably want to separate out that logic as well behind a single interface. Say, something that can list all the authorized organization IDs for you. Then you can go and filter the database.
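The difference between authorizing one by one after fetching and filtering at the query level can be sketched like this. It is only an illustration under stated assumptions: `orgTable` is an in-memory stand-in for a database table, and `listAuthorizedOrgIds` is a hypothetical name for the "list all the authorized organization IDs" interface mentioned above. With a real database, the filter would typically become part of the query itself (for example a `WHERE id IN (...)` clause).

```typescript
// Hypothetical types and data, for illustration only.
interface User { id: string; orgIds: string[] }
interface Organization { id: string; name: string }

// Stand-in for a database table, ordered by id.
const orgTable: Organization[] = [
  { id: "1", name: "Acme" },
  { id: "2", name: "Globex" },
  { id: "3", name: "Initech" },
  { id: "4", name: "Umbrella" },
  { id: "5", name: "Hooli" },
];

// Hypothetical authorization interface: which organizations may this user read?
// Here it is plain membership; a real policy could be far more involved.
function listAuthorizedOrgIds(user: User): Set<string> {
  return new Set(user.orgIds);
}

// Naive approach: take a page first, then authorize each row.
// Pages can come back short when some rows are not allowed.
function listOrganizationsNaive(user: User, first: number): Organization[] {
  const allowed = listAuthorizedOrgIds(user);
  return orgTable.slice(0, first).filter((org) => allowed.has(org.id));
}

// Filtering approach: apply authorization as part of the query, then paginate.
function listOrganizations(user: User, first: number): Organization[] {
  const allowed = listAuthorizedOrgIds(user);
  return orgTable.filter((org) => allowed.has(org.id)).slice(0, first);
}
```

For a user who belongs to organizations 2, 4, and 5 and asks for the first two, the naive version only finds organization 2 in its page, while the filtered version fills the page with organizations 2 and 4.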
And that's the kind of thing you can actually get from modern authorization solutions like Oso. And so the pattern I want you to take away here is that when you're doing your query resolvers, a great pattern to follow is actually doing that authorization as part of fetching the data from the database. Not just for one thing, but also for multiple things. And so maybe an alternative version of the diagram might have that authorize step just before the persistence layer. I don't know. You know, it's all part of that business logic layer. Okay. So it seems like we've covered everything. We done? I'll give you your time back. You can go for a walk or something before the next talk. Okay, not quite, not quite. So the previous stuff, right, it's all great in theory and actually mostly great in practice too. And you know, when we first started looking at GraphQL authorization a few years ago, this just completely made sense to me. I was like, this is great. We'll solve authorization for people with GraphQL and there's nothing extra for us to do. And actually it was my co-founder who repeatedly asked me, there must be more we could do for GraphQL authorization. Every time we thought about it, we were like, but the rationale says we do it in the business logic layer and that's where we fit anyway. And yet, you know, we did see people who were doing custom authorization things with GraphQL. They were asking us about integrations. And so we wanted to find out what was going on there. Why was this? And there were three different reasons we saw out there. So number one, honestly, it's just easier, right? It's actually a lot easier to not have this rigorous discipline where you have perfectly separated and decoupled your authorization from your resolver handler logic and your persistence layer and so on. For a lot of people, they're just trying to move fast.
They are adopting GraphQL and they put their authorization in the resolver because it's just easiest to do it that way. And there's zero judgment in there, by the way, because for years people have been doing this for authorization with REST endpoints, right? They put it into their route handling logic. They may write middleware. There are many, many different things people do. And so often, you know, that can be the rationale. The second one I saw was maybe a little bit more principled than that, which is for people who are GraphQL-first as a company, you know, they're all in on GraphQL. And so that rationale of, well, you don't want to duplicate it between GraphQL and REST, that doesn't quite add up for them because they're only using GraphQL anyway. And so that doesn't hold. And we'll see in a second why it can actually be very powerful to do authorization in the GraphQL layer. I think following from that, a lot of people see it as an opportunity to do authorization very cleanly. And we'll see there are some really great benefits of doing authorization in the GraphQL layer. And so people, I think, are jumping on that opportunity. I think about that as the GraphQL-first approach, where you're saying you want to do authorization in GraphQL intentionally. And finally, and this is always a fantastic reason in authorization: defense in depth. It can be hard to know that you've done authorization correctly in all the right places, all the way throughout your backend, across all the different routes you need to handle. And so having a single place to put it, whether it's the REST API or the GraphQL API, can give you that comfort of knowing that there is at least some amount of authorization done in one place. So the next pattern I want to talk about is what does it look like to do authorization directly in the GraphQL layer?
And there are a couple of different ways we could do this. I'm going to talk about the resolver side of things. So suppose we have some mutations. Maybe there's create a repository, open an issue, and so on. And we've implemented resolvers for each of those mutations, right? Each one looks at the data. And for one of those reasons I mentioned before, we have done authorization here in the resolver layer. Okay, so there are a couple of things I want to point out around what I have here. Number one, I think it's absolutely crucial, if you're going to do authorization anywhere, that you abstract it away behind an API like this. I really cannot recommend that one highly enough. It lets you keep all your authorization logic in one place: one place to make changes, one place to debug, log, and so on. You'll see throughout this talk, I'm not even going to show you any authorization logic itself. Like, you know, you can do this because you have a role. Because here in the integration layer, we don't need to worry about that. Okay, so number two, even with that API in place, even with this really clean, one-line authorize call, it can still be really tedious to go through every single resolver and make sure that you've done authorization correctly. You'll probably go around and copy and paste, and you might end up making a mistake. And so a pattern that I've seen emerging to reduce that tedium, to make this a little bit easier and more repeatable, is to use a custom directive. I'm not going to go deep into the definition of directives and how those work. But suppose that we have this custom check directive. What this lets you do is annotate your mutations with the permissions that you should check. So, for example, again, when you create a repository, you should check that the user has the create repo permission on the organization.
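A rough sketch of the directive idea follows. Everything here is an assumption for illustration: the `@check` directive name and its `permission` argument, the permission strings, the role table, and the `withCheck` helper. How a directive is actually wired to resolvers depends on the GraphQL server library in use and is not shown; `withCheck` stands in for that wiring by wrapping each resolver with the permission named in the schema fragment.

```typescript
// Schema fragment (SDL). The @check directive is a hypothetical custom directive;
// the Repository and Invitation types are omitted from this fragment.
const typeDefs = `
  directive @check(permission: String!) on FIELD_DEFINITION

  type Mutation {
    createRepository(orgId: ID!, name: String!): Repository @check(permission: "create_repo")
    inviteMember(orgId: ID!, userId: ID!): Invitation @check(permission: "invite_member")
  }
`;

interface User { id: string; roles: Map<string, string> } // orgId -> role name
interface Context { user: User }
type Resolver<A, R> = (parent: unknown, args: A, context: Context) => R;

// Placeholder policy: which permissions each role grants.
const rolePermissions = new Map<string, string[]>([
  ["owner", ["create_repo", "invite_member"]],
  ["member", ["create_repo"]],
]);

// Single authorization API; resolvers never look at roles directly.
function authorize(user: User, permission: string, orgId: string): boolean {
  const role = user.roles.get(orgId);
  if (role === undefined) return false;
  return (rolePermissions.get(role) ?? []).includes(permission);
}

// Stand-in for directive wiring: run the permission check, then the resolver.
function withCheck<A extends { orgId: string }, R>(
  permission: string,
  resolve: Resolver<A, R>,
): Resolver<A, R> {
  return (parent, args, context) => {
    if (!authorize(context.user, permission, args.orgId)) {
      throw new Error(`Forbidden: missing ${permission}`);
    }
    return resolve(parent, args, context);
  };
}

const Mutation = {
  createRepository: withCheck(
    "create_repo",
    (_parent: unknown, args: { orgId: string; name: string }) =>
      ({ id: "r1", orgId: args.orgId, name: args.name }),
  ),
  inviteMember: withCheck(
    "invite_member",
    (_parent: unknown, args: { orgId: string; userId: string }) =>
      ({ orgId: args.orgId, userId: args.userId }),
  ),
};
```

The resolver bodies contain no authorization code at all; the mapping from mutation to permission lives next to the schema.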
What's really nice about this approach, if you go down this path, is that your GraphQL schema becomes a single declarative place to map your GraphQL APIs to the permissions you need to access them. And so the takeaway is that doing authorization at the resolver can actually work pretty well. However, remember with the business logic layer, we had two patterns. There was the single authorize call, and there was the query filtering. You're never going to get away from that data filtering piece. You can do it in alternative ways, but you'll never fully get away from it. And so this pattern of doing resolver-level authorization should only really be used in combination with, say, filtering at the database. And so in particular, I see this resolver-level authorization being really great for mutations. Because those often have a wide range of different things you can do and different permission checks. Okay. So, I have one final bonus pattern that I want to share with you. And it's all about how you can build frontends that are permissions-aware without needing to duplicate all your authorization logic between your backend and frontend. And the gist is basically you can extend your GraphQL types with a permissions field. And that permissions field will contain all the permissions the current user has on that particular resource. And you can see an example of this out in the wild. So for example, if you go look at the GitHub GraphQL API, when querying for a repository, you can ask for the viewerPermission field. And what you'll get back is a string representing the permission level that the user has. And now the general idea is that a UI can just take that information and put it into a simple conditional expression to decide whether you should maybe gray out a button, hide a configuration option, hide a tab, things like that. I feel it's a very elegant pattern.
And in particular, it's very, very nice and easy to express this with GraphQL. Because to implement this, all you're doing is maybe just extending your resolver with a new permissions field, which you're going to compute in some completely different way. Because GraphQL allows you to extend your types with other kinds of data this easily, it can be very natural just to inject this extra field into your types. So the way you want to implement this, again, you probably want that abstracted interface. So something like list actions that can return all the permissions a user has on a particular resource. For a simple authorization model, the way you might implement this is to check what role the user has and return the permissions that role has. As the authorization logic gets more complex, this can get more involved, which is where existing authorization solutions like Oso can be really helpful. So what I'd really love is for everybody to take this pattern away, because I think it's an incredibly powerful one. You want your backend to stay the source of truth for authorization logic. And by exposing permissions in this way, it helps you build really, really great UIs that use that permission data and extend that schema. Okay. So in summary, right? Three patterns. One, doing data filtering for read-level permissions at the business logic layer is a great one to help you centralize that logic in one place. But when you're thinking about doing lots of mutations and trying to figure out what permissions a mutation should check, doing something like a custom directive can be a really nice way to manage that declaratively in your GraphQL schema. And then finally, the pattern that I mentioned around exposing permissions as part of your GraphQL schema so that UIs can build on top of that, I think it's a really great one for building applications that are very aware of authorization.
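The permissions-field pattern could be sketched as follows for a simple role-based model. The names here are assumptions for illustration: `listActions` is a hypothetical spelling of the "list actions" interface, and the roles, permission strings, and `canShowDeleteButton` helper are invented for the example. The schema side would add something like `permissions: [String!]!` to the `Repository` type.

```typescript
// Hypothetical types, for illustration only.
type Role = "owner" | "member" | "guest";
interface User { id: string; roles: Map<string, Role> } // repoId -> role
interface Repository { id: string; name: string }

// Placeholder policy: the permissions each role grants on a repository.
const rolePermissions: Record<Role, string[]> = {
  owner: ["read", "write", "delete"],
  member: ["read", "write"],
  guest: ["read"],
};

// Abstracted authorization interface: every action the user may take on a resource.
function listActions(user: User, repo: Repository): string[] {
  const role = user.roles.get(repo.id);
  return role ? rolePermissions[role] : [];
}

// Field resolver for the extra `permissions` field on the Repository type.
// The parent argument is the repository already resolved by the parent query.
const resolvers = {
  Repository: {
    permissions: (repo: Repository, _args: unknown, context: { user: User }) =>
      listActions(context.user, repo),
  },
};

// Frontend side: a simple conditional on the data returned by the API,
// with no authorization rules duplicated in the UI.
function canShowDeleteButton(repo: { permissions: string[] }): boolean {
  return repo.permissions.includes("delete");
}
```

The backend stays the source of truth: if the role table changes, the UI picks up the new behavior from the `permissions` field without a code change.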
So that's everything I want to speak about. Thanks for listening. If you'd like to learn more about any of these topics, a lot of this content was based on a blog post written by a colleague of mine, Patrick, which talks through these different patterns of authorization in GraphQL. If you're just interested in learning more about authorization, I only just scratched the surface on different kinds of authorization. There's also everything around how to do things like roles, relationships, attributes in logic modeling, how to structure this across microservices, things like that. We basically took a lot of the thinking we've done around authorization and put it in this vendor-neutral set of technical guides on how to build authorization for your application. And then finally, if you are looking for a way for someone else to solve this, if you're listening to this talk and you're like, ugh, Sam, just make this problem go away for me, well, we have a product called Oso Cloud that will help you with that. So thank you very much for listening to my talk, and I'll be happy to answer any questions in the Q&A session.