Lightning Talks - Day 1 - GraphQL Galaxy 2020

Transcription


Hello, everyone, and welcome to this lightning talk about how to secure your GraphQL endpoints in five minutes, and we're going to be doing that using Tyk. My name is Matt Tanner. I am a product evangelist here at Tyk, and I'm going to be walking you through this.

Getting right down to it, since we have a limited amount of time, let's look at a few problems that we're going to solve in securing GraphQL. The first one is adding authentication and authorization, putting those mechanisms in quickly. The second is securing the schema, so making sure that only specific users have access to specific fields. The third is protecting ourselves against denial-of-service attacks. How do we do that? Well, we have batteries-included security, which is a phrase that we like to use at Tyk to say everything that's within our gateway is included. There are no plug-ins or anything like that that you need to add. As part of that, we're going to put in some field-based permissions to secure the schema, and then we're going to add some query depth limiting as well for those denial-of-service attacks.

So let's see how it works. Let's just get right to it. I'm going to jump out of this, and here I am in the Tyk dashboard. First I'm going to show you what I want to secure. There's this Trevor Blades countries API, a GraphQL API, that right now is completely open, and I can hit it. There's no security of any type at all. What I'm going to do is proxy to that through Tyk and then secure it using Tyk. So I'm going to grab this. This is as if it was your API. We come over into Tyk, go to APIs, add new API. I'm going to call it countries. It is a GraphQL API. We're going to proxy to an existing GraphQL service, and you'll see that I have the Trevor Blades countries URL in there. Now, at this point, believe it or not, we already have some authorization built in.
We've now proxied to it. If I come over to the Playground, which is built into Tyk, and, if I just hide the meeting controls here, grab this query, come back over here and run it, you'll see that it says the authorization field is missing. That's great. That means we're already enforcing an authentication token. Where is that specified? Well, in our setup right down here. We support quite a few different things, but today we're going to use authentication tokens just for brevity. We also support mutual TLS, OAuth 2.0, JWTs, all of those good types of authentication modes.

In order to access this now, I need to generate some keys, and in order to have some keys, I need to have a policy created. So let's save this, jump over to policies, which is down here in the corner, add policy, and I'm going to cover my countries API and come over to configurations here. I'm just going to call this countries policy. The keys that I generate are never going to expire. Then I'm going to hop back over here to access rights, and there are a few things that we're going to do. Set per-API limits and quota: I'm going to turn this on, and this would allow us to enforce rate limiting, throttling, usage quotas, all that stuff. We won't worry about that today. What we are going to worry about here is this query depth limiting. I'm going to make my maximum query depth five, and I'll demonstrate that to you in a moment. Just with that, it will now be enforced. What I'm also going to do, under field-based permissions, is restrict what users of this policy can access. As you can see, you can see all the types available through this API as well as all the fields individually. I don't want them to have access to continent code or country code. Then I'm going to create the policy. There we go. Policy has been created.
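To make the two policy settings concrete, here is a minimal sketch of what a gateway-side check for a maximum query depth and field-based permissions could look like. It is not Tyk's implementation. The nested-dict query model, the tiny `SCHEMA` map, the way depth is counted, and the error strings are all assumptions made for illustration; a real gateway would parse the GraphQL document and use the actual schema.

```python
# Hypothetical, simplified model: a query is a nested dict of field -> sub-selection,
# with an empty dict for a leaf field. None of these names come from Tyk.

SCHEMA = {  # parent type -> {field name: type of the field's value, None for scalars}
    "Query": {"countries": "Country", "continents": "Continent"},
    "Country": {"name": None, "code": None, "continent": "Continent"},
    "Continent": {"name": None, "code": None, "countries": "Country"},
}

RESTRICTED = {("Country", "code"), ("Continent", "code")}  # blocked by the policy
MAX_DEPTH = 5  # maximum query depth set in the policy


def query_depth(selection):
    """Count nesting levels; a query with only top-level leaf fields has depth 1."""
    if not selection:
        return 0
    return 1 + max(query_depth(sub) for sub in selection.values())


def _check_fields(selection, parent_type):
    """Walk the selection and report the first restricted field, if any."""
    for field, sub in selection.items():
        if (parent_type, field) in RESTRICTED:
            return f"field: {field} is restricted on type: {parent_type}"
        child_type = SCHEMA[parent_type][field]
        if child_type is not None:
            error = _check_fields(sub, child_type)
            if error:
                return error
    return None


def check_query(selection):
    """Return an error string, or None when the query may be forwarded upstream."""
    if query_depth(selection) > MAX_DEPTH:
        return "depth limit exceeded"
    return _check_fields(selection, "Query")
```

Because both checks run before anything is forwarded, a rejected query never reaches the upstream service, which is the behavior the demo relies on.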
And now I'm going to hop over to keys. I'm going to add a key for this policy, create key. And with that, my key is created. Now if I come back over to APIs, I'm going to open a new tab, come over here and come to countries, which is the playground for the GraphQL proxy we created. Remember that we weren't able to issue that query. I'm going to add a request header, an authorization header that includes our key. I'm going to come back and grab the query that we had, and I'll paste it in here. Now I'm just going to take out code because we blocked that field and I want this to work. I'm going to run this. As you can see, now we have access to the API. I'm using that authentication token in order to access it.

Now let's add in code, which we don't have access to. So countries code and continent code, we don't have access to these fields. What happens if I try and hit them? Code is restricted on type continent, so if I get rid of that, next code is restricted on type country, and I can take that out, and away we go. Now I'll be able to run it.

And lastly, I have a query here that is nested, and I'm going to demonstrate that query depth limiting that we put in place as well. I'll paste this in. As you can see, I've got some redundancies in the query. Oh, I need to come back here and run this. Am I missing another bracket? I must be. There we go. Okay. So as you can see, field code is restricted on type continent, so let's just get rid of those quickly. Code, code, code. Now we run this, and we get some data back. Now what if I go one level deeper here, and I say countries, and I do name and run this? Depth limit exceeded. So now you can see that things are getting cut off at the gateway level, without even going to that back-end service, and that is how easy it is to secure APIs with Tyk and our GraphQL features. That's all. Thank you very much.

Hi, everyone. I'm Brecht.
I'm super excited today to talk at the GraphQL Galaxy conference. I'm databrecht on Twitter. I work for the Fauna database, and today I'm going to talk about native GraphQL, or GraphQL as a database query language. Now, if we talk about native GraphQL at Fauna, what does it mean? Well, first of all, we have the Fauna Query Language, which we call FQL, and basically native GraphQL means that a GraphQL query is going to translate into one FQL query. That one-to-one translation has huge advantages. So first of all, you might wonder which advantages. We'll look into that. And question two, why doesn't everyone do this if there are such advantages?

To answer these questions, we actually have to answer other questions, like how do GraphQL resolvers work? So let's take a detour. Typically, if you have a query like this, with getList, todo, title, every field in here, like getList and todo and title, will map onto a function. So getList will be a function, and that will delegate to the todo function, which will delegate again to the title function, for example for the title attribute. This is a resolver chain, which is a chain of functions, but it's actually more of a resolver tree of function calls, because here there's one function that calls n functions. If we turn this around, we get n plus one, and this is actually called the n plus one problem. That's why I turned it around. When is this a problem? Well, basically, if you're going to call the database for each of these resolvers, because then you get n plus one database calls, which is not efficient.

So question four, how can we solve the n plus one problem? Well, there are multiple solutions. Solution one is batching and in-memory caching. In that approach, we're going to hook into these functions, for example todo.titles, and just wait until all the todo.titles are called, and then combine these.
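The n plus one pattern and the batching idea can be sketched in a few lines. This is only an illustration, not DataLoader or any real GraphQL server: `FakeDb`, its methods, and the sample data are hypothetical stand-ins, and the "database" is an in-memory object that just counts round trips.

```python
class FakeDb:
    """In-memory stand-in for a database that counts round trips (hypothetical)."""

    def __init__(self):
        self.calls = 0
        self.lists = {"l1": ["t1", "t2", "t3"]}
        self.titles = {"t1": "Buy milk", "t2": "Write talk", "t3": "Ship it"}

    def get_list(self, list_id):
        self.calls += 1
        return self.lists[list_id]

    def get_title(self, todo_id):
        self.calls += 1
        return self.titles[todo_id]

    def get_titles(self, todo_ids):
        self.calls += 1  # one round trip for the whole batch
        return [self.titles[todo_id] for todo_id in todo_ids]


def resolve_naive(db, list_id):
    # One call for the list, then one call per todo: n + 1 round trips.
    return [db.get_title(todo_id) for todo_id in db.get_list(list_id)]


def resolve_batched(db, list_id):
    # One call for the list, then one combined call for all titles: 2 round trips.
    return db.get_titles(db.get_list(list_id))
```

With three todos in the list, the naive resolver makes four round trips and the batched one makes two, while both return the same titles.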
So instead of doing n calls for these todo.titles, we're going to do one call, so in total, two calls. That's batching, and that's often combined with caching. So if a similar call comes in, then instead of going to the database, we can go to an in-memory cache, so we don't hit the database at all. A very popular implementation is Facebook's DataLoader, which you can just plug in on top of your resolvers, and it will handle it for you. However, there's a problem with this solution as well. It should, in fact, be a last resort. Why? Your data is no longer live. It's no longer consistent. You can't apply it to everything. You can't batch everything, so you will still have multiple calls. And what about cache invalidation and memory pressure, which you suddenly have to deal with? So it introduces complexity.

So, the first question: which advantages does Fauna's approach provide? Well, it doesn't deal with these problems because it doesn't have these problems. It is live by default. It is consistent. It requires no extra work, and there is no memory constraint problem. It just works out of the box, so you don't have to do this.

Solution two: generate one query, which is what Fauna does behind the scenes. But why doesn't everyone do that? If we look at SQL, for example, and let's say we select star from lists where id is equal to something, then we would go to the to-do calls and do the same and try to concatenate that query. Of course, we'll have to do it for multiple to-dos, so we'll end up with a join. The problem is that if we go deeper like that in a GraphQL traversal, we might end up with a lot of joins. Now, not only is it super complex to analyze this query, generate SQL from it, and then transform the results back to a GraphQL format, it might also be inefficient depending on the joins. You might overfetch a lot and then have to throw away things. And then how are we going to paginate this?
Limit 100 might not be exactly what you're looking for. The problem here is that what joins solve, which is a join between two tables, is a different problem than the actual problem, which is more like a tree traversal problem or a graph-like problem. Joins are maybe the wrong tool for the job. There is a very impressive implementation called Join Monster, which is actually named after the problem they're trying to solve, a monster join that might be the result of a GraphQL query. If you look at the work involved, you can see that it's a complex problem to solve. That's question four, how can we solve the n plus one problem: the two solutions.

That brings us back to question two. Why doesn't everyone do this? Well, we just showed it. The query language might not fit the problem or the execution plan might not fit the problem. Then, of course, why does FQL fit the problem? Well, we do it quite differently because it's a different query language and has quite graph-like properties. If we look at the same query, we would start by getting a list with Match and Index and then the list ID. We would immediately wrap it in Paginate. So we actually have pagination on every level, and very sane pagination with an after and before cursor that is always correct. Then we just map over the results of these lists and call a function. That's actually just like a normal programming language where you would map over something and then call a function. In that function, we can do whatever we want. And if we look at the get to-dos there, well, what is this? That's just a JavaScript function, because I'm using the JavaScript driver for FQL, where we just throw out more FQL. Pure function composition. Then we see the same pattern, paginate and map. So we have the second level of pagination immediately, and map, and again a function that will be called. This is actually a graph-like traversal that we're just implementing in FQL.
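The paginate-and-map pattern described here can be imitated in plain Python to show its shape. This is an analogy only, not FQL and not the Fauna driver: `paginate`, `map_page`, `get_todos`, `get_lists`, and the `LISTS` data are hypothetical names, the cursor is a simple integer offset, and only the after cursor is modeled.

```python
LISTS = {"groceries": ["milk", "eggs", "bread"], "work": ["slides"]}  # sample data


def paginate(items, size, after=0):
    """Return one page plus an 'after' cursor (None when there is no next page)."""
    page = items[after:after + size]
    next_cursor = after + size if after + size < len(items) else None
    return {"data": page, "after": next_cursor}


def map_page(page, fn):
    """Apply fn to every item on a page, keeping the page's cursor."""
    return {"data": [fn(item) for item in page["data"]], "after": page["after"]}


def get_todos(list_name, size=2):
    # Second level: paginate the todos of one list, then map over them.
    return map_page(paginate(LISTS[list_name], size), lambda title: {"title": title})


def get_lists(size=10):
    # First level: paginate the lists, then map a function over each one.
    return map_page(
        paginate(sorted(LISTS), size),
        lambda name: {"name": name, "todos": get_todos(name)},
    )
```

Each level of the result carries its own cursor, so a client could ask for the next page of todos in one list without refetching the others. The nested functions compose the whole traversal into a single expression, which is the property the talk attributes to FQL.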
Because that's possible, it was super easy for Fauna to implement that one-to-one translation from GraphQL to FQL. So what is actually happening here? If we look at the query execution, we map Get over all the lists, and we paginate that immediately. Then we just continue with map, get, and paginate on every level. There is no monster join problem because we do it completely differently, so we don't have to solve that problem. So question five: that's why FQL actually fits the problem.

Back to question one: which advantages does that bring? We have mentioned advantages, but there are others. Because we have the same advantages as the rest of the native FQL language, we can actually combine GraphQL with FQL and use FQL for the flexibility and power and GraphQL for the ease of use. We have multi-region out of the box and scalability out of the box. We have 100% ACID transactionality out of the box. So that's what native GraphQL is. I hope you like that idea. And if you want, try it out for free at fauna.com.

Hi there, GraphQL Galaxy. I'm Ryan Severns, one of the founders and COO of StackHawk. I'm here to tell you a little bit about what we do at StackHawk. We are an application security testing tool. We make it easy for developers to find and fix security bugs. And in particular, we have some really cool things around GraphQL, so I'll run you through that. Like I said, we do application security testing. We do testing of the underlying APIs as well, and part of that is GraphQL. If you're not familiar with application security testing, there are really three main types. One is software composition analysis, which looks at the open source components for vulnerabilities. Another is static code analysis, which looks at the code for known error types within whatever language you're using. And what we do here at StackHawk is called dynamic application security testing.
So we're running active tests against a running version of your application. We test server-side HTML, REST APIs, single-page applications, and we test GraphQL as well. We are the only product that does active automated testing of GraphQL. There's a handful that do some best-practices checking, making sure you're doing certain things that are known to be best practices from a security standpoint. But we're the only one to actively run a test against your GraphQL endpoints and look for potential security vulnerabilities. A big belief for StackHawk is automation in CI/CD. We believe that every time you open a pull request, an application security test should run, to make sure that you're not introducing any new vulnerabilities before it passes the build and goes on to production. And ultimately, we make finding and fixing security vulnerabilities very simple.

Let me tell you a little bit about how it works. It all starts with a YAML configuration file in which you describe what to scan: server-side HTML, single-page apps, REST APIs, GraphQL. If you have authentication for your application, you can configure that here. We also have all kinds of other customization in terms of how the scanner runs. The beauty of GraphQL is that the configuration is really simple. You can see in the image here, you mark GraphQL enabled equals true and point the schema path at your introspection endpoint. You can also control certain things around which operations you're testing and the depth of recursion. There's a lot that you can customize there. Then you kick off a scan with this docker run hawkscan command, and this GIF will cycle through and show us a preview of it. The beauty of it running in Docker is that it can run anywhere. It can run locally on your machine as you're developing. It's super easy to implement in CI/CD. And you can even point it at a production application.
I would say use caution because this is running an active security test and trying to find input validation errors, among other things. So it does try to input data into your database, which is why we always advise testing pre-production in a CI/CD environment. It's super fast AppSec testing. You can see results in the terminal, which is great for CI/CD logs. And it always has a link out to the findings within the StackHawk web app, which I'll show you next, and which is where you go when you actually need to fix a bug.

So you jump into the StackHawk app. The first thing I'll say is that we are big believers in integrating with developer tools. We integrate with Slack and with Jira, so you get alerts in Slack, manage your issues in Jira, and really only land in StackHawk when there is a vulnerability that you need to fix. When you do end up there, we make it really easy to jump in and figure out the context of what that bug is. We have a description of what the vulnerability is. We have links to documentation on the fix so you know how to fix it. And then we provide all of the information: the request that was sent to the application, the response, and a simple curl command to recreate it as you step through the code in debug mode to figure out where you're mishandling the data. One other nice thing is that there's finding triage for CI/CD instrumentation. The scan might break the build if you've introduced a new vulnerability. If it ultimately is low risk, maybe you're not going to fix it, or maybe it's something that will be prioritized in an upcoming sprint, so you can send it to Jira. You can mark it as risk accepted. The scanner will still find it, but it won't break the build every time.

So that's a quick overview of StackHawk in a nutshell. We would love for you to come test us out. You can sign up for a free single-user account at stackhawk.com if you want to test your own applications. We also have free trials for our team product.
It's the same product, you just have extra users and can collaborate with teammates. And be sure to swing by our booth here at GraphQL Galaxy. We are giving away T-shirts and entries to win a Nintendo Switch, and we would love to chat with you more. All right. Thanks so much.

Q&A with all three lightning talks at once. One of the best bits here is, of course, that if someone has a question, someone else can answer it. I'm not going to moderate this. That's for someone else to do. But let's start off with a question for Matt. Is it possible to add more than one GraphQL service? And if so, how do you resolve type conflicts?

Yeah. So, okay. If we're talking about the stuff that I went through, I showed how to add security and proxy to an existing service. You may have some naming conflicts. Our latest release is actually going to fix that. So there is a way to resolve those. It takes some manual workarounds, but there are ways for us to resolve those type conflicts.

Amazing. Well, it's good to know the deal forward facing. And a second question for yourself, this one from Bastian. One of the things regarding security is query depth. Usually we do a fixed limit, but I found some insightful scientific insights on the formalism of the GraphQL language at this thing. And he's asking if there's any research on query depth prevention or attacks using query depth.

Yeah. So from our side, we added in the query depth limiter. At Tyk, we just introduced GraphQL functionality in July, and query depth limiting was one thing that we found when we were doing the research about building GraphQL into our API management product. That's one thing that a lot of people requested. Now, it is pretty simplistic when you think about it right now. We're able to set a query depth, and that applies across.
Now, the nice part is if you have multiple policies, so let's say I have group A, group B, and group C, I would be able to say, okay, group A gets a query depth of five, B gets six, and C might have unlimited query depth. So even if there's no way to dynamically set that query depth, there is a way within Tyk to configure it based on what policy you set. Now, we do have some more features coming in the next little bit that are definitely going to enhance our offering in terms of query depth and a couple of other metrics that may factor into those types of attacks, the denial-of-service attacks.

I think you may be on mute.

Oh, wow, okay, well, that was embarrassing. This was one of those times when the MC muted himself. A question for Brecht from Fauna. What's next on the feature list for your application? It looks really interesting.

That's a very good question, which actually, maybe we should go to another question and I can answer you in a few seconds because I would have to look it up myself. But one of the things that is requested often is the ability to do complex conditionals and range queries from GraphQL, and the other one is streaming. These are both things that we are considering. Whether one will take priority over the other, I'm not certain.

Okay, that's great. It's always hard to queue up the most important features. That's one of the parts of the game.

And that's, of course, only for GraphQL. Because we're a database, the next big thing that is coming up is actually streaming, so that you get push-based streaming, but that will initially only be for FQL.

Is that going to incorporate some of the subscription grammar?

Not directly, I think.

That's okay. I realize I'm really digging into your future plans.

But eventually it will also be used for GraphQL subscriptions once it gets to the GraphQL endpoint as well.

That's really interesting. All right.
A question for Ryan, since you're here from StackHawk and you do security testing. What do you find is the most prominent security failure, or kind of unpreparedness, in GraphQL applications on the back end?

To be honest, I don't know the answer to the most common. We haven't actually looked into any data. More anecdotally, the things that I see, SQL injection, information disclosure, and remote OS command injection, are the things that we've seen pop up as we're testing with our customers. So, again, I can't back it up with data, but those are the three that we've seen most frequently.

Do you think there are aspects of GraphQL that make it weak to any particular kind of attack that would be less prominent in RESTful APIs?

I guess the way that I think about that is purely that it's such a new thing that there's not this robust tooling built around security testing in the way that there is with other web application and API frameworks. Many of these things are simple mistakes that developers will make, with simple fixes, but there's just no automated testing that's been widely adopted to catch those things. And so you're waiting until a quarterly pen test where you hope your pen test firm actually knows GraphQL and is able to dig in to find any potential issues.

So just because there isn't this blessed and validated library stack and there isn't a set of conventions that everyone knows, you find more, I don't want to call them rookie errors, just oversights.

Yeah, absolutely.

Amazing. Okay, I think we have exhausted our question stack for now. Do you have any questions for each other? This is something we can't do in the other Q&As.

That's A-okay.

And from me, yeah.

All right, great. Well, then we're going to go to a quick break, but put your hands together, send those claps to the GraphQL Q&A chat for our lovely lightning talkers, and we will see you soon.

Awesome. Thank you.
Thank you very much.
29 min
02 Jul, 2021
