How to Automate Security Testing for Your GraphQL Service

Workshop from: GraphQL Galaxy 2022

We’ve all heard the buzz around pushing application security into the hands of developers, but if you’re like most companies, it has been hard to actually make this a reality. You aren’t alone – putting the culture, processes, and tooling in place to make this happen is tough – especially for sophisticated applications like those backed by GraphQL.

In this hands-on technical session, StackHawk Lead Engineer Topher Lamey will walk through how to protect your GraphQL APIs from vulnerabilities using automated security testing. Get ready to roll up your sleeves for automated AppSec testing.



Welcome, everybody. Thank you for your time; we really appreciate it. In this workshop, we're going to go through automating security testing for a GraphQL service. We're going to take a vulnerable example GraphQL service and put some security testing in the CI/CD pipeline for that repo. That gives us some nice coverage around the code and the application, which helps us find bugs earlier, because really that's the point: you want to find those bugs as soon as possible. We'll go through the agenda, but at a high level, we're going to fork an example repo, modify that repo, and add some GitHub Actions to do some testing. We're going to hook up CodeQL from GitHub and StackHawk scanning. We'll be able to get some good testing in place around this application and surface some of those vulnerabilities.
A little bit about me before we get into things. My name is Topher. I'm a Lead Software Engineer here at StackHawk. I've been here almost since the very beginning of the company. I was the second engineer hired, after a principal architect. When I started, there was no source code at all. I've been involved in building everything for StackHawk from the ground up: process, code, architecture, all that stuff. I'm in Colorado. I've been in the startup world for quite a long time; my first one was in the mid-90s. I started out in the Bay Area. The one you've probably heard of is Netscape. I worked there early on, way back in the day, when they were inventing JavaScript and SSL (or TLS, as it's called now) and all sorts of stuff. That was super fun. Got some kids, love to ride my bike. My main hobby is playing music, so I try to get out and play as much as possible. Nicole is my co-host, by the way. She's the one who's going to be helping out with the Discord and everything.
Before we get into the workflow steps, I wanted to give a little overview of the security stuff we're talking about and why it's important. In this workshop, we're going to implement three different types of automated security testing for this GraphQL service.
The first one is software composition analysis, or SCA. If you use GitHub, you've probably seen Dependabot give you some notices, or at least GitHub has asked you to enable Dependabot. What this does is look at your dependencies and say, hey, are there vulnerabilities in these dependencies, and then it tells you about them. It's useful: the big recent one was Log4Shell. If you're using that library, or if it's a downstream dependency or something like that, this will let you know. The downside is that it's not looking at your code, at the specific logic of your application. It's really just looking at known vulnerabilities in dependencies. But it is very fast, and it's really cheap and easy to implement. It's good to know about and good to have. We're going to enable that as part of the first step we go through.
The second one is static application security testing. I also call it SAST, so if you hear me say SAST, that's what I'm talking about. Some big vendors here are Snyk and CodeQL; that's what that top icon is. What this does is look at your code. It analyzes your code, and it's not quite a compiler, but it has a parser and it tries to identify known patterns in your source code and alert you to them. It's really nice because it points out a specific line of code, which is really helpful. The downside is that it's not always right. It's a well-educated guess, but at the end of the day it's still just a guess.
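To make the "known patterns" idea concrete, here is a deliberately tiny sketch of a pattern-based check. It is not how CodeQL or Snyk work internally (real SAST tools parse the code and do far more); the patterns, the sample source, and the function name are all made up for illustration. It does show both properties mentioned above: each finding comes with a line number, and some findings are false positives.

```python
import re

# Toy patterns, invented for this sketch. Real SAST tools parse the code
# and reason about it; they do not simply grep lines like this.
PATTERNS = {
    "hard-coded secret": re.compile(
        r"""(secret|password|api_?key)\s*[:=]\s*["'][^"']+["']""", re.IGNORECASE
    ),
    "insecure randomness": re.compile(r"Math\.random\s*\("),
}


def scan_source(source: str) -> list[tuple[int, str]]:
    """Return (line_number, finding_name) pairs for lines matching a pattern."""
    findings = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, name))
    return findings


# Hypothetical JavaScript-like source, used only as scanner input.
SAMPLE = '''const port = 3000;
const jwtSecret = "hunter2";
// const password = "only-used-in-a-comment";
const id = Math.floor(Math.random() * 1000);
'''
```

Calling scan_source(SAMPLE) flags lines 2, 3, and 4. Line 3 is only a comment, which is the "it's just a guess" problem in miniature: the pattern matched, but there is no real issue there.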
We're going to implement that as well. We're going to turn on CodeQL as part of this demo, and then maybe, if we have time, Snyk as well, because we just launched that.
The third one is dynamic application security testing, or DAST. This is different in that it actually stands up an instance of your application and then runs a scanner against the live, running application. It's very useful because when it finds a problem, it's actually a problem in the running application. One thing that's difficult, though, is that it doesn't point out the line of code like SAST does. But you know the vulnerability is there, and you can then go trace it down and figure out what's going on. For DAST tools, we've got StackHawk, ZAP, and Burp Suite. Today we're going to implement StackHawk, which I work for, and it's a great product.
Those are the three different types of security products that we're going to implement on this sample application today. We're going to put them in the build pipeline. On every commit that goes against any branch, just the way we're going to set it up, it will run these tests, and then we will know if that commit caused a regression. It's conventional wisdom at this point, but it's worth pointing out: the earlier you find a bug in the dev cycle, security or otherwise, the faster and cheaper it is to fix, because you know that whatever you just checked in changed something and caused this alert to trigger, this security issue to be found. Those are the three types, and that's why we're doing it.
At StackHawk, we think we do this really well. We are built around ZAP, which is an open-source scanner. We build it and package it specifically for CI/CD. We make it really easy to use in GitHub Actions, Jenkins, CodeBuild, all kinds of different build systems. We make it easy to run in CI/CD and easy to understand what's going on.
We take the results. At a high level, what happens is our scanner runs and pushes the results to our platform, where they get aggregated and there's historical information. You can go back in time and see if things changed. You can do reports, all kinds of functionality and features around the DAST tests that get run. Basically, we want it to be fast and easy to use and understand, because there are some security tools out there that say, hey, there's a problem, and then you really don't know more than that. It takes a bit of detective work to figure out what's going on. We try to make that easier.
These are the high-level steps for the workflow. Time to do some stuff, make some things happen.
Step 1, we're going to use GitHub Actions to automatically build a GraphQL application. I talked a little bit about this already. We're going to take a vulnerable GraphQL application, and I'm going to fork it into my own personal GitHub account so I can modify it and then add GitHub Actions to it for a CI/CD pipeline.
Step 2, we're going to add the SCA, the Dependabot scanner, so that we get alerted if there are any problems with the dependencies. We'll look at how to set that up. There are a ton of different options around that. We're not going to go through them all, just some useful ones, the ones I use.
Step 3, we'll add GitHub CodeQL, which is a SAST product that will scan the application's code base. When it finds anything on the commits, it will let us know.
Step 4, we'll add StackHawk to scan an instance of the running app. This is during the CI/CD build process. We'll add StackHawk so it can run the scanner and find any vulnerabilities, and it will report them back to the StackHawk platform. For the other two, the results actually show up in GitHub, and we can look at things there. With step 4, the results get reported back to StackHawk.
We'll go through StackHawk and look at what the scanner found, what that means, and how we can do something with that.
If you haven't worked with GitHub Actions before, they are GitHub's CI/CD offering. They have a huge marketplace of different things you can do. It's basically controlled by checking a YAML file or two into your repo in a specific place. Once GitHub sees that, it will then perform actions.
The first thing we're going to do is fork this vuln-graphql-api. kaakaww is the StackHawk public GitHub organization, and we have this vulnerable app here. It's a Ruby app. It's actually a third-party one. If we go look over here, it's this CarveSystems one. So this is kaakaww/vuln-graphql-api, which is based on the CarveSystems one. Just to give you an example, here it is running locally. You don't have to do this as part of the workshop. I just wanted to show everybody what this vulnerable app looks like. It's your standard GraphQL introspection interface.
Let's go ahead and fork this, the vuln-graphql-api. Go ahead and create the fork. Now I've got a fork here. What we're going to do is add CI to this repo as GitHub Actions. Like I was saying before, you put a file in this repo that GitHub can recognize. In this case, it is this build and test YAML. In your fork, you want to do Add file, Create new file, and the name of the file is .github/workflows/build-and-test.yaml. Then in our workflow guide, in the steps, here are the contents of the file. This is a YAML file. It basically says we're creating a new GitHub Action here, and we're going to run it on Ubuntu. It only has two steps to start: we're going to clone the repo, and then we're going to build the app. The app is very simple to build. It just uses Docker Compose.
Go ahead and commit that. We're going to commit directly to main, which is fine. Now, with that file checked in, you can go over to the GitHub Actions tab. Based on that commit, the one I just did, we now have an action called Build and Test here. If you haven't worked with these before, you can go in and look at the output of what's going on. If you remember, we had two steps, clone repo and build the app, and those are right here. Clone repo does pretty much exactly what you'd expect: it just does a git clone. Then build the app is running that Docker Compose here.
Let's start talking about adding Dependabot. Dependency scanning with Dependabot is step two in the workflow guide. We could wait for this action to finish, but actually we don't have to. In GitHub, in your forked vuln-graphql-api, go to the Settings tab, then Code security and analysis. They keep changing this, but basically, in here you want to enable your dependency graph. That allows you to then turn on some Dependabot features. For this workshop, we want the alerts. You can read what it does here, but basically, when it detects that one of your dependencies has a vulnerability, it will alert us. Then we want security updates. This one's pretty slick. It allows Dependabot to open pull requests automatically to resolve the alerts. We'll turn this on, and then when it does find an alert, if it thinks it can resolve it, it will open a PR for us, which is pretty nice. This is just a vulnerable example app, but imagine the real app you're working with: if Dependabot finds something and automatically opens the PR for you, that's a lot easier for you and your team to deal with than having to go research it and figure it out. With those enabled, now we go over here.
Now we go from Settings to the Security tab, we go to Dependabot, and hey, Dependabot found stuff. We have eight open alerts that it found, including an inefficient regular expression problem in Axios here. You can go into each one of these and get some more detail. In this one, it's saying, hey, go from version 0.18.1 to 0.21.2, and it will fix some vulnerabilities for you. They do some nice things around telling you whether this is severe, and they give some information about the vulnerability that it found. One thing the security world has, if you're not super familiar with it, is these IDs: the CWE, the CVE, and the GHSA. These are just information around the vulnerability so that you can learn more about it. This one tripped these IDs. It's basically a common database, if you haven't gotten into the CWE stuff before, and it gives you information about the issue. In this case, we've got a regex-related vulnerability here, which upgrading the library will fix.
Like we were talking about before, Dependabot actually opened a PR to upgrade that library for us. If we go in here, cool, this is a nicely formatted PR. Now, this is an example application, so I'm not going to merge this PR. But if this were your team's application, you'd probably have a bunch of tests around it. When Dependabot opened this PR, at least the way we have things set up for our builds, it would run a series of integration tests, a series of unit tests, and all that stuff. We would be very comfortable that moving from this version to the new version would be okay. In this case, I'm just going to leave this here.
You can go back to Security here. Now, Dependabot didn't open PRs for all of these, because not every one was a clean upgrade. You can get information here. It said the latest version that can be installed is 1.5.10. There are these other dependencies that it could not upgrade.
Basically, it's a fail-fast thing, where it's like, oh, if I can't do a clean upgrade on a single library, then I'm just going to let you know about the vulnerability and give you some details about what you could do here. Again, here's more detail on the bug. This is what was identified in the library. These CWE IDs are pretty common in the security world.
Back to Security, there's the list of Dependabot things here. You can go through each of these and get an idea. Now, I will tell you, the first time you run this against your repo or your project, it's going to find a bunch of these things. Dedicate some time to going through them and learning a little about what they are and how much impact they could have. Like we said before, the thing with SCA tools like this dependency scanner is that a finding may or may not be an issue. It just lets you know that the dependency has a known security vulnerability. It doesn't actually tell you if it surfaces in your code. If this was something to do with regexes and you didn't use regexes at all, it might not be an issue for you. But it takes a little bit of digging to figure that out.
So let's go look at that file. As part of the step, we added this .github/workflows directory using Add file. Yeah, so Nicole just posted the guidebook. If you go in there and go down to step one, and the second part of step one, there's the build and test YAML. So in .github/workflows/build-and-test.yaml, we have this. And we saw those steps over in the action: we've got clone repo and build the app. If we go look at the Actions tab, we've got clone repo and build the app in here. Once you get that YAML file in there, just go on over to Settings, Code security and analysis, and you can enable the dependency graph and then enable the first two options for Dependabot. Those actually run as GitHub Actions as well.
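As a rough illustration of what a dependency scanner is doing conceptually, the sketch below compares installed versions against a table of "first patched" versions. It is not Dependabot's implementation. The axios versions are the ones from the alert discussed above; the advisory table, the other package, and the function names are assumptions made up for this example, and the version parsing ignores pre-release tags.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn '0.18.1' into (0, 18, 1). Pre-release tags are not handled."""
    return tuple(int(part) for part in version.split("."))


# Hypothetical advisory table: package name -> first patched version.
ADVISORIES = {"axios": "0.21.2"}


def find_vulnerable(dependencies: dict[str, str]) -> list[str]:
    """Report dependencies installed below the first patched version."""
    messages = []
    for name, installed in dependencies.items():
        patched = ADVISORIES.get(name)
        if patched and parse_version(installed) < parse_version(patched):
            messages.append(f"{name} {installed} -> upgrade to {patched}")
    return messages
```

For example, find_vulnerable({"axios": "0.18.1"}) reports the upgrade to 0.21.2. Note what the check does not do: it never looks at whether your code calls the vulnerable function, which is exactly the "may or may not be an issue" caveat.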
You can see those runs here because it opened a PR, which caused those two actions to run. So in the workshop guidebook, that was step two, dependency scanning with Dependabot. We can go into Security, Dependabot, and here's what Dependabot found.
I think we're gonna move on to step three here. We're gonna enable static code analysis with CodeQL. This is also done in GitHub. In the repo, you go to the Security tab, go to Code scanning, then Configure scanning tool. And hey, look, the very first thing is CodeQL Analysis. So we're gonna go ahead and configure this. This is yet another GitHub Action. Basically, when we configure CodeQL, it's gonna drop another GitHub Actions YAML file into that .github/workflows directory off the root of the repo. It's gonna have a bunch of configuration in it. We're not actually gonna change much. We're just gonna commit this as is and see how the default works, because it's actually pretty good.
Just to go through that again: you go into Security, Code scanning, Configure scanning tool. For me, it comes up right away. If it doesn't come up, you can just search for CodeQL, and it will find that workflow. You want the CodeQL Analysis by GitHub, and go ahead and click that Configure button. All it does is add this file into your repo. So here's the file. You can look it over if you want. I'm just gonna go ahead and commit it. On the right here, you do Start commit, Commit directly to main, Commit new file.
So now in our .github/workflows directory, we have two files. We have the one we added just to get GitHub Actions to build the application (sorry, that one wasn't for Dependabot), and then this CodeQL one that got added when we added the CodeQL workflow.
So if we go to Actions now, we have those two show up. By default, you see all the workflows. We have those two defined using the two YAML files in the .github/workflows directory. The commit I just did to add the CodeQL YAML triggered both of these. We have the CodeQL one, which is the new one. If you go in here, oh, it's analyzing JavaScript, cool. It's doing all kinds of stuff. And then this is the build and test one that we added. By now this is kind of old hat to us because we've already seen it, but it's basically building the app.
Okay, so CodeQL completed. Now, in the repo, go to the Security tab. Before, when I clicked on Code scanning, we got that prompt to go search for workflows. This time, because we have one installed, it actually went through. What this has done is scan the source code of the repo, looking for known security vulnerability patterns in the code itself. It found four of them here. Like we were talking about before, the cool thing with SAST products is that they highlight the line of code for you in the file.
The way this app is written (again, this is an example vulnerable app), someone put a secret into the code. I would agree that's a critical thing. Whoops, sorry. So here it is on line 42 in app.ts. The secret's in here, and it gives you more information about that, including, hey, don't do that; use environment variables or some sort of runtime resolution. It's nice because it tells you the source. If you don't believe it, you can go look at the source directly. It shows you the line of code, and then it gives you some code examples here. Again, here are these CWE IDs. They link out so you can learn more about them there. So that's the first CodeQL code scanning alert that it found.
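The advice in that alert, resolving the secret at runtime instead of committing it, can be sketched in a few lines. The workshop app is JavaScript/TypeScript; this is the same idea in Python, and the function and variable names are assumptions for illustration, not anything from the app.

```python
import os


def load_secret(name: str) -> str:
    """Read a secret from the environment and fail loudly if it is missing.

    Failing at startup is usually better than falling back to a default
    value that ends up committed to the repo.
    """
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value
```

A caller would then do something like signing_key = load_secret("JWT_SECRET"), where JWT_SECRET is a hypothetical variable name supplied by the deployment environment or the CI system's secret store.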
If we go down here, we can look at some more of these. This one is about the user ID, which is just a random number from a very finite range. We probably wanna beef that up, make that a little better. Again, it's the same kind of thing: we can go in here and see some examples of writing better code to get that random ID. Then the third one here: when you log user information out to the console, maybe there's sensitive user stuff in there, like user IDs or credit cards or who knows. So you definitely want to take a look at that. You either mask your logs or don't log that; there are a number of different ways to solve it. But what this is doing is telling you the line of code, and also whether it thinks it's critical or high. And then this one here is cookies transmitted in clear text without SSL. So that's CodeQL for this app. It's pretty nice.
At this point, we've got our dependency security scanner going. That lets us know when it finds known vulnerabilities in our dependencies that may or may not affect our application. Code scanning is a little more direct. Whoops, here we go. This is saying, hey, in this line of code, this looks like a problem. That's very useful, and it's certainly worth looking at. But again, if that's in a test, or in some piece of code that doesn't actually get invoked, maybe it's not an actual issue in your running application. It's still very good to have.
The third one we're gonna do is StackHawk. We're gonna stand up an instance of the application in GitHub Actions and then run our scanner against it directly. When it finds problems, you know that's a problem in your running application. So let's go ahead and get that going. This is step four, dynamic app scanning with StackHawk.
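Going back to two of the CodeQL alerts just described, the narrow-range random user ID and user data written to logs, here is a minimal Python sketch of the usual remedies. The app itself is JavaScript/TypeScript, so this only shows the idea, and the field names and function names are hypothetical.

```python
import secrets

# Hypothetical field names that should never reach the logs in clear text.
SENSITIVE_FIELDS = {"password", "credit_card", "token"}


def new_user_id() -> str:
    """Return 128 bits from the OS CSPRNG, hex-encoded (32 characters)."""
    return secrets.token_hex(16)


def redact(record: dict) -> dict:
    """Return a copy of record with sensitive values masked, safe to log."""
    return {
        key: ("***" if key in SENSITIVE_FIELDS else value)
        for key, value in record.items()
    }
```

The design choice in both cases is the same: keep the safe path easy. An ID helper backed by the secrets module removes the temptation to use a small random integer, and logging redact(user) instead of user means a new sensitive field only has to be added in one place.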
In order to get this working, I'm gonna sign up for a new StackHawk account. When I do this, it's gonna send me a confirmation email. If you don't have a StackHawk account already, you can use your GitHub login, you can use your Google login, or you can create an email one. I created an email-based one here just because I have so many. I went through that, created it, and it sent me a confirmation email. I clicked on that and landed here, creating a new account in StackHawk. We're gonna say, yes, that's fine for the org. We have an option: you can either load in some sample data or just start scanning. We're just gonna start scanning here.
Okay. We're gonna be copying and pasting a little bit of information out of StackHawk into GitHub. When you get to this point, you have the scanner setup. This is the first step of the onboarding modal. We're not gonna do the CLI. You can run StackHawk as a developer tool locally; it's just another command line utility, called hawk. There are different ways to install it. On macOS you can use brew, but there are other ways to do it too. We're not gonna do that as part of this, because we're running this in GitHub Actions, so we wanna run it in Docker. In the scanner setup for StackHawk, for the scanner type, we want the StackHawk Docker image.
This is the part we're gonna cut and paste from StackHawk into GitHub. We're gonna take our API key out of StackHawk. You can either select it like that or just click that button. So we copy the StackHawk API key. Then we go back into GitHub and add it as a repository secret. This is in your vuln-graphql-api fork. We go into Settings, Secrets, Actions.
In Settings, there's a Secrets tab, with Actions underneath. In Actions, we wanna add a new repository secret. We take the API key out of StackHawk and put it into GitHub as an Actions secret, so that when the action runs, it can just look that up. In this case, we're gonna call it HAWK_API_KEY, all caps. Then we paste the API key out of StackHawk into here.
Just to recap: in the first step of the StackHawk setup, you have two options, CLI or Docker image. We want the Docker image, and we want this API key, so copy that. Then we go back to GitHub for our repo: Settings, Secrets, Actions, New repository secret. The name is HAWK_API_KEY, and the secret is the key out of StackHawk. Go ahead and click Add secret. Now we've got it. This secret will be available to the GitHub Actions in this repo, so they will have the API key. Alrighty, so that's cool. We've got that saved in GitHub. We will configure the rest of that in a sec, but we gotta go back to StackHawk here.
In the scanner setup, we're gonna click Next to get to app details. For the way StackHawk organizes the scans, keep in mind what's happening: you have your GraphQL application that's gonna run, and StackHawk has a scanner that's going to essentially attack that application. Then it's gonna report whatever it finds back to StackHawk. The way it reports those things is it groups them into applications and environments. So here's where we set up our first application. Depending on your team and organization, there's no one right way to do this. A lot of folks will do it by service, so we'll name it after the service. Maybe I'll call it vuln-graphql. Beyond application, we group things by environment, so we'll call this Development. And then we need to put something in for the host.
What's gonna happen is GitHub Actions is going to run the vuln application in the CI/CD pipeline. It's gonna run it in Docker, on port 3000, on what's effectively localhost for that CI/CD pipeline. So as far as the StackHawk scanner knows, it's running on localhost port 3000 with no SSL. So go ahead and put http://localhost:3000.
The next step is telling the StackHawk scanner what kind of application it's gonna be testing. There are lots of options here, and we support lots of different things. The one we want is API, so GraphQL API. I talked a little earlier about how StackHawk takes the ZAP scanner and configures it specifically for a given use case. This is part of that. It's gonna tell the scanner to load up the GraphQL vulnerability tests. It's gonna configure the introspection. It's gonna configure a bunch of stuff for you. So for application type you pick API, and for API type you pick GraphQL. When you pick GraphQL, it wants to know the introspection endpoint, and it just tacks that onto the host URL that we gave it previously. So this looks right. We're gonna have http://localhost:3000 with the introspection path on the end. That's where the scanner is gonna go to figure out what this particular application has in terms of data types and mutations and all that kind of stuff.
Okay, this is the last step here. Again, we're gonna copy this StackHawk YAML. This is very similar to how GitHub Actions work, where you check in a YAML file in a specific place in your repo, and then the actions, or StackHawk, know where to find it. If it sees the file there, it will do stuff. So this is our StackHawk configuration file. It's called stackhawk.yaml. If you look through it, it's got the ID of the application we just created, the env, and there's the host, localhost 3000.
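To show roughly what "tacking the introspection path onto the host" amounts to, the sketch below builds the endpoint URL and a small introspection request body. This is not StackHawk's code. The host comes from the walkthrough above, but the /graphql path in the usage note, the trimmed-down query, and the function name are assumptions for illustration; use whatever path your setup flow shows.

```python
import json
from urllib.parse import urljoin

# A reduced introspection query that only asks for operation names.
INTROSPECTION_QUERY = """
query {
  __schema {
    queryType { fields { name } }
    mutationType { fields { name } }
  }
}
"""


def introspection_request(host: str, path: str) -> tuple[str, bytes]:
    """Return the endpoint URL and the JSON body for an introspection POST."""
    url = urljoin(host.rstrip("/") + "/", path.lstrip("/"))
    body = json.dumps({"query": INTROSPECTION_QUERY}).encode("utf-8")
    return url, body
```

With a hypothetical path of /graphql, introspection_request("http://localhost:3000", "/graphql") yields the URL http://localhost:3000/graphql. Sending the body as a POST with a Content-Type of application/json is the conventional way to run the query against a locally running instance.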
This is the customized part for GraphQL. We want GraphQL enabled, and there's the introspection endpoint. We're gonna go for all operations. We're doing POSTs here, auto input vectors, all that kind of stuff. What we wanna do is copy this, go back to GitHub, go back to our vuln app, and go to the Code tab. Like we did in that first step, we're gonna add a file. So we create a new file, and we're gonna call it just stackhawk.yaml. The contents are pasted right out of the wizard, the StackHawk modal, or setup guide. They used to be called wizards; I don't think anyone calls them wizards anymore. Anyway, we cut and paste the stackhawk.yaml out of the StackHawk setup flow into here as a new file, and then we just go ahead and commit this.
We go to Actions now. I just created that stackhawk.yaml, so that triggered our two existing actions. We have build and test, which just builds the application, and then the CodeQL one. Now we have to tie the StackHawk scanner to the build step. I have those two optional steps in the guide. They're for if you didn't use the setup flow; they just reiterate that stuff. What you wanna do is add a StackHawk scan to your build and test workflow. That's the step I'm on right now.
So we're gonna go back to the code, go into .github/workflows, and modify that build-and-test.yaml that we created in the first step. It had those two steps, clone repo and build the app. We're gonna add some more steps. So open the file and click Edit this file. Then we're gonna add two more steps, and they're in the workflow guide.
The whole file is in the workflow guide. You can just take the last two steps here. We're gonna have run the app, which just does a Docker Compose up. Then we have a new step called HawkScan. It uses a prebuilt action, the HawkScan Action, and it uses that API key secret that we put in earlier. The way you reference those is you say secrets, and then the name of the key. We named ours HAWK_API_KEY, so we're gonna specify that here. What that's gonna do is run the scanner and pull in that API key from the repository secrets. It will use that API key to upload the results. It actually streams them back to the StackHawk platform, and we can watch that.
So we'll put that in here and get this going. We're adding in these two steps. When we commit this, we see we have run the app and HawkScan, with the Docker Compose up and then the StackHawk action with the API key. With this new commit, we have our new steps here in the build and test action, and you can see that they showed up here. After build the app, we now have run the app, which is that docker compose up -d. After that, it will run the scanner. If we go back to StackHawk now, we're basically finished. We're just waiting for the scan results to show up here. That's what will happen as part of the build process.
So the question is: how does CodeQL work? For example, with console.log(user), does it look for the keyword "user", or does it use some other technique? Do you have any insight on that? Yeah, it has a parser and it looks for patterns.
So it's not quite a compiler; it doesn't actually compile the code and then look at execution flows. Although some of them can do that. I don't think CodeQL does that. I think it's essentially a very, very sophisticated regex machine. So for this one here, the insecure randomness, it has a bunch of patterns that it looks for, for different types of source code. So for JavaScript, and I don't know the exact implementation, but I would guess that it looks for "get random int" and it says, oh, that's kind of a narrow range for a random int. And then user ID, yeah, I don't know. It must look at that and say, oh, something ID, that's important, right? In a security context. So that's kind of the downfall of SAST: it may or may not be a thing. It's just looking at the patterns. It can only go to a certain point as to whether it's an actual vulnerability or not. So here's their thing. So "random source get", that's probably the trigger, like, oh, look at this, they're using "get random values".
Now the scanner's running right now in GitHub. So if we go back to StackHawk here, we can click on the left here and look at our scans. So here's the scan that we started: Vulny GraphQL. Like I said before, it's actually streaming results back. The scanner is running right now against that instance of the GraphQL application. You can watch it go here and see what step it's on, what it's doing, or how long it took. So it's gonna go through all these different things. And these are all the ones that are tuned for, remember we picked that it's an API, and specifically a GraphQL type. So it's tuned to that type of application. All these things are things that it wants to look for in GraphQL.
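To make the pattern-matching idea from the CodeQL answer concrete, here is a toy sketch. It is not how CodeQL is implemented; real SAST tools parse the code first, as mentioned above. The rule itself (a non-cryptographic random call on a line that also mentions an ID-like name) and every name in it are made up for illustration.

```python
import re

# Hypothetical rule, invented for this sketch: flag Math.random() when the
# same line also mentions something that looks security-sensitive.
RANDOM_CALL = re.compile(r"Math\.random\s*\(")
SENSITIVE_NAME = re.compile(r"(id|token|secret|password)\b", re.IGNORECASE)


def scan_source(source: str) -> list[tuple[int, str]]:
    """Return (line number, line text) for every line matching the toy rule."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if RANDOM_CALL.search(line) and SENSITIVE_NAME.search(line):
            findings.append((lineno, line.strip()))
    return findings


# Two made-up JavaScript lines: only the first involves an ID-like name.
sample = """\
const userId = Math.floor(Math.random() * 1000);
const jitter = Math.random() * 50;
"""

for lineno, text in scan_source(sample):
    print(f"line {lineno}: possible insecure randomness: {text}")
```

A rule like this can only say "this looks suspicious"; it cannot tell whether the value is ever used in a way that matters, which is the SAST limitation described above.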
So it finished, scan completed, and it found some stuff. This is very similar to the CodeQL security vulnerabilities list in GitHub. But these are things actually found in the running application, so these are things that you definitely wanna look at. So these are the findings. Just before we dive into the findings, these are the paths it looked at. Now for GraphQL, because the URL is always /graphql and it's just text and parameters, this isn't super interesting; there's not gonna be a lot of different stuff. For a traditional site, it'll have a bunch of routes here. But it does have operations. Found 11 of them. Not sure what's going on there. Okay, there we go. So these are the operations that it found in our GraphQL application. It found a bunch of mutations, including one called "super secret private mutation". What? That doesn't sound good. And then the queries. So we got search, post, user, all that kind of stuff. Like I was saying, paths aren't super interesting for GraphQL, but operations are. So we detail them out here for you. These are the things it found, operation-wise.
And then these are the security findings that it detected. We can go into each one of these. Because it's a DAST scanner, it's testing the application basically through that host and port. We don't know about the source code. So CodeQL and the SAST products will tell you which line of code they found that pattern on, where that user ID was. StackHawk can't do that. But what it does do is trap the request and the response, and then point out the evidence that triggered this alert, this finding. So here we have this remote OS command injection. Again, we have a CWE ID, which is the same database, same thing. We've got details here. It was a mutation, the super secret private mutation. Hey, the evidence is right here. So here's the request and response.
So this is the good stuff. This is a request it made. The command variable was "cat /etc/passwd". That's not great. And what's even worse is that it actually did cat /etc/passwd. And right here, you can see the evidence is highlighted as part of the response. So that's not a great vulnerability to have in your GraphQL application. And here's some more information on it: the scan rule was able to retrieve the contents of a file on the operating system. So instead of source code, we have these request and response objects that we can look at. If you want, you can look in here and see the headers. It was a POST to /graphql. Here are the cookies that got used, and here are the variables that got posted.
So there's a question: is the GitLab implementation for StackHawk similar to the one for GitHub? It is the same one. StackHawk stays the same. The scanner is the same. All it does is get run in whatever the GitLab equivalent of Actions is. In Jenkins, it would run in Docker. If you're running it at the command line, I do that all the time on my box. It just runs against the application, against a host and port, and then it reports the results back to StackHawk. The only limitations I can think of would be around maybe some environment stuff between the two, but really everything's configured via that StackHawk YAML file. So we'll go back here. We can look at this StackHawk YAML. If you go look at the docs, this file can get really big and complicated, but there's a lot there you can do. So really it's the same scanner between the two.
Okay. So another thing that we do is provide this curl command, which lets you reproduce it. So this is how HawkScan, this is how our scanner found it. And we basically give you this.
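As a rough sketch of what the scanner's request and evidence check amount to, the snippet below builds a GraphQL POST body carrying the `cat /etc/passwd` payload and looks for a passwd-style line in a response. The mutation signature, field names, and the `stdout` field are assumptions for illustration; the real schema and the real scan rule may differ. Nothing is sent over the network.

```python
import json
import re

# Assumed shape of the vulnerable mutation; the actual schema may differ.
MUTATION = (
    "mutation ($command: String!) "
    "{ superSecretPrivateMutation(command: $command) { stdout } }"
)

# A line such as "root:x:0:0:..." is strong evidence that /etc/passwd leaked.
PASSWD_EVIDENCE = re.compile(r"\broot:[^:]*:0:0:")


def build_graphql_body(query: str, variables: dict) -> str:
    """Serialize a GraphQL request the way it is POSTed to the endpoint."""
    return json.dumps({"query": query, "variables": variables})


def looks_like_passwd(response_text: str) -> bool:
    """Return True when the response contains a passwd-style root entry."""
    return bool(PASSWD_EVIDENCE.search(response_text))


body = build_graphql_body(MUTATION, {"command": "cat /etc/passwd"})
print(body)

# A made-up response resembling the highlighted evidence in the finding.
fake_response = '{"data": {"result": {"stdout": "root:x:0:0:root:/root:/bin/bash"}}}'
print(looks_like_passwd(fake_response))
```

Only run a payload like this against an application you own, such as the forked example repo used in this workshop.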
So you can imagine if you're doing this locally on your machine, you're trying to develop some things, you run the scanner, and it finds this problem. You can reproduce it using this recreate request. You just copy this, paste it into your terminal, and run it. You can see the query here, the super secret mutation, and then the variable, cat /etc/passwd. And then another thing you can do is, let's say you're not okay with it, because it's not great. You wanna assign it. So you can mark it as assigned. You can mark it as a false positive if you don't think it's an issue, or as risk accepted. It's basically a way to keep track of these. And then the next time the scanner runs, it will remember that and say, oh, it's not new, it's a false positive. You've already dealt with it.
So that was the first one. The second one, you basically go through the same process. You look at this and go, oh, it's CWE-89. Here's some more information on it. It's a query. The query is a single tick. And what did we get back from that? So yeah, it actually just passed that single tick down to the database. That's not good. That could be a lot more malicious. So you probably wanna address that and do some sanitization on your input. So those are the two high ones that the scanner found.
Then we start getting into the mediums here, but it's the same kind of thing. You look at the CSP here; it has a wildcard. And the evidence is here. If you click over on the response, it highlights it for you, so you know exactly where it was in the headers. Super helpful. Similar kind of thing here: this is cross-domain, and this is across all these different operations. So on your response headers coming out of your GraphQL app, you wanna be a little bit better about cross-domain configuration. All righty.
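The single-tick finding can be reproduced in miniature with Python's built-in `sqlite3` module. This is only a sketch under assumptions: the table, column, and function names are invented, and the workshop app itself is not written in Python. It shows why concatenating the argument into the SQL breaks on a lone quote, and how a bound parameter avoids that.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with one example post."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
    conn.execute("INSERT INTO posts (title) VALUES ('hello world')")
    return conn


def search_unsafe(conn: sqlite3.Connection, term: str) -> list[str]:
    # Vulnerable: the argument is pasted straight into the SQL text.
    sql = "SELECT title FROM posts WHERE title LIKE '%" + term + "%'"
    return [row[0] for row in conn.execute(sql)]


def search_safe(conn: sqlite3.Connection, term: str) -> list[str]:
    # Safer: the argument is bound as a parameter and never parsed as SQL.
    sql = "SELECT title FROM posts WHERE title LIKE ?"
    return [row[0] for row in conn.execute(sql, ("%" + term + "%",))]


conn = make_db()
try:
    search_unsafe(conn, "'")
except sqlite3.OperationalError as exc:
    print("unsafe query failed:", exc)

print(search_safe(conn, "'"))      # the quote is treated as plain data
print(search_safe(conn, "hello"))  # normal searches still work
```

Binding parameters is generally preferable to hand-rolled escaping, because the database driver keeps data and SQL text separate.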
So if you were doing things at the command line, or even through the Docker command here, you can tell StackHawk to rescan, and it will only run the tests for the findings that it found in this one. Basically the use case is if you have something that takes a while and you don't wanna run the whole thing every time, you can have it run only against the findings from the previous one. So you say, hey, run the scan, it's a rescan, and here's the scan you start from. So that's a new feature. Thought I'd point that out. Spent a lot of time on it. As you run scans, you'll get them in here and it will let you know the differences between them. I don't think this does this. This just has open ones, but we keep track of things over time. So you can look at new ones that came in, or if you addressed one, this number will go down, that kind of stuff. Oh, look at this. You just ran your first scan. Yeah, I completely forgot about this over here.
Let's see. Dependabot, we went through that. CodeQL scans code, it has patterns. Oh, it's open source; you can go look at those patterns if you want. StackHawk, I think we covered a lot of this stuff. One of the things I didn't talk about was our integrations. We integrate with a whole bunch of different stuff, including things like Jira, so you can create issues directly in StackHawk for a finding. So if you imagine that SQL vulnerability, if you have the Jira integration hooked up, you can go in there and say, oh, you know what? Send to Jira, and then it will create the Jira ticket for you and tie it back. That's super helpful. We also have more notification-based stuff where you can say, send me a Slack message every time a scan fails, that kind of thing. Same with Teams.
We have a generic webhook, so we can send out events as things happen in our scanner. If you want to get the payload, if you want to automatically drop things into some sort of database so you can keep track of these things on your own, you can do that. But I think that was kind of it for the, oh, the sneak preview of custom variables. So this is a new feature that we are releasing now, I think, or next week maybe, not sure. It's available, but we're gonna do an official launch next week with lots of resources and materials to help people get up and running with custom variables for GraphQL APIs. So what this does is allow you to put in custom variables in a specific format. If you have an email field here, you can say, you know what, I want to inject some fake data in the form of an email, and it will create a random email-looking string and use that, like you saw where it did the cat /etc/passwd. It will start putting those kinds of things in those places. So it's really great for running through your data checks, because I've written lots of code where you sort of assume it's gonna be an email, and you handle one pattern, but then there's some other thing that comes in that you hadn't thought of. So this lets you inject those in, and there's a whole range of them: emails, phones, UUIDs, all that kind of stuff. And it's realistic-looking data that will come in. It really stretches the application. It's great to have.
Topher, correct me if I'm wrong, you can supply your own variables too, if you want to test something like a password reset. So that's another really useful option.
Yep, so yeah, so right above that is the, if you don't want to use the faker library, if you don't want, if you know that there's a string that you want to look for, like let's say you had, yeah, a password, or there's some string that, if you had a bug where you know it caused a specific problem and you want to force that string, you can do that here. So you can specify these custom variables, and then it will use those as part of the tests. So you have a couple different options here. You can either specify hard-coded strings, like if you know that there's a problem string, or like I said, if it's a password, if you really need that, or if you just want to generate a smart fake looking data, you can do that as well. And we talked a little bit about the GraphQL configuration. There's a lot of options here. So our docs are trying to be pretty thorough. This workshop is really just about setting up those three types of security scanners. And with those three, I think, yeah, so for this application, now we have three different types of security scanning enabled. We've got the software composition analysis using Dependabot, tells us about dependencies. We've got the SAST, which will go look at the source code for patterns that look suspicious, and it will alert you to those. And then we have DAST using StackHawk, which ran against the actual running live application and found some issues in there. So we know that those are actual issues and we've got to go look at them and triage them, either fix them or say, you know what, we're okay with that for now. But with those three things, I know we've just added some pretty good tooling around our vulnerable application here, our example application. So going forward, hopefully you can use this to implement this in your own GraphQL service. 
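The two options described here, hard-coded values versus generated fake data, can be sketched as follows. This is not StackHawk's implementation or configuration format; the generator names, the `values_for` helper, and the example field names are all hypothetical, and only the Python standard library is used in place of a faker library.

```python
import random
import string
import uuid


def fake_email(rng: random.Random) -> str:
    """Generate a random, email-looking string."""
    user = "".join(rng.choice(string.ascii_lowercase) for _ in range(8))
    return f"{user}@example.com"


def fake_phone(rng: random.Random) -> str:
    """Generate a random, phone-looking string."""
    return "555-" + "".join(rng.choice(string.digits) for _ in range(4))


def fake_uuid(rng: random.Random) -> str:
    """Generate a UUID-shaped string from the seeded generator."""
    return str(uuid.UUID(int=rng.getrandbits(128), version=4))


GENERATORS = {"email": fake_email, "phone": fake_phone, "uuid": fake_uuid}


def values_for(field: str, kind: str, hard_coded: dict, rng: random.Random) -> list:
    """Prefer hard-coded values for a field; otherwise generate fake data."""
    if field in hard_coded:
        return list(hard_coded[field])
    return [GENERATORS[kind](rng)]


rng = random.Random(0)                       # seeded so runs are repeatable
overrides = {"password": ["known-problem-string"]}
print(values_for("password", "email", overrides, rng))  # forced value wins
print(values_for("contactEmail", "email", overrides, rng))
print(values_for("userId", "uuid", overrides, rng))
```

Seeding the generator is a design choice for the sketch: it keeps the fake data repeatable between runs, which makes a failing input easier to reproduce.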
And then the idea is, once these are in place, the first time you run it, you're gonna get a bunch of stuff. You go through and triage and fix stuff and say, no, I don't care about that library or whatever, or I'm gonna fix that. And then going forward, whenever you get a new one on any commit, you know that, oh, hey, that commit changed something, and I have to go figure out what the change was and why it triggered this alert. But that's a lot better than it going through the whole process and getting deployed, and then you get an email that says, hey, we found this problem out in your production server. We definitely don't wanna do that. So we wanna do it as close to the commit as possible.
And so I think, let's see, next steps. Yeah, so now you basically translate what we did into your environment. If you're not using GitHub Actions, whatever kind of CI/CD provider you are using, StackHawk supports a lot of them. Dependabot is, I think, pretty easy. We've enabled it for all our stuff in our GitHub repos. CodeQL is one SAST provider. Snyk is another big one that's out there. We have integrations with both. So yeah, you can use those. And then, stay in touch. StackHawk, we've got a blog. We're doing stuff all the time.
If we've got time, I could show some Snyk integration. So Snyk Code is a lot like GitHub's CodeQL, that SAST tool. It will scan for patterns in your source code and then let you know the file and line number of where it found the problem. And so what we do, our integration with actually both of them, I just don't have the GitHub one ready to go, is we will try and correlate a SAST finding in Snyk with a DAST finding in our scanner.
So basically we'll use that CWE ID to correlate the two, so that you can see that not only was it found in the SAST scanning, where it's like, oh, you know what, that user ID doesn't look right, or that's a hard-coded secret password. If our scanner finds it, then we will link the two and you can say, oh, wow, this actually got found in the running application by the StackHawk scanner.
So let me get this wired up here. I'm in StackHawk here. We're gonna set up a Snyk integration. Snyk org ID. So we have an account in Snyk for our organization, and I just generated an API token here. So now we're gonna link the two together. Snyk, because it works on your source code, has to have access to your repositories, because it has to read the source code. So this is the Vulny Graph API. It's the one that we forked, basically. Source-code-wise, it's gonna be the same, because we haven't changed anything. And I only have one application in StackHawk, which is the Vulny Graph API. So that's pretty much it. You give it your Snyk org ID and your Snyk API key. And then you link your source code repo that you've already set up in Snyk with your StackHawk application. And you say, finish. And so now they're here.
So if we go back to our repo here, we go to our Actions. We go to the build and test workflow here. We just wanna rerun this guy. It will rerun, and as part of its rerun, it has the HawkScan step. The difference is, this time when it runs the HawkScan step, it'll run the scanner, the scanner will run against the application and send the results back to the StackHawk platform, but this time we have the integration with Snyk there. So as the findings come in, our platform will talk to Snyk and say, hey, do you have any findings for this? And if it does, then it will correlate the two.
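The correlation idea, joining DAST and SAST findings on their CWE ID, can be sketched in a few lines. This is an illustration of the concept only, not the StackHawk platform's code; the data classes, the operation names, and the line number are invented for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SastFinding:
    cwe: int    # CWE ID reported by the static analysis tool
    file: str
    line: int


@dataclass(frozen=True)
class DastFinding:
    cwe: int    # CWE ID reported by the dynamic scanner
    title: str
    operation: str


def correlate(dast: list, sast: list) -> dict:
    """Map each DAST finding to the SAST findings that share its CWE ID."""
    by_cwe = defaultdict(list)
    for finding in sast:
        by_cwe[finding.cwe].append(finding)
    return {d: by_cwe.get(d.cwe, []) for d in dast}


# Example data; the operation names and line number are hypothetical.
dast_findings = [
    DastFinding(cwe=89, title="SQL Injection", operation="search"),
    DastFinding(cwe=78, title="Remote OS Command Injection",
                operation="superSecretPrivateMutation"),
]
sast_findings = [SastFinding(cwe=89, file="post.ts", line=42)]

for finding, matches in correlate(dast_findings, sast_findings).items():
    where = ", ".join(f"{m.file}:{m.line}" for m in matches) or "no SAST match"
    print(f"CWE-{finding.cwe} {finding.title}: {where}")
```

A CWE ID is a coarse join key: two findings with the same CWE may still be unrelated, so a match is a pointer to go look at, not a proof.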
And then we can see that in StackHawk. And so that's really, really nice because StackHawk found the problem in the running application. And then you can go through the integration to the actual line of code that Snyk found. So you kind of tie those two things together. It's a problem where the SAST tools will say, hey, this looks like a problem. This user ID looks bad or this secret looks bad, but it might not actually be an issue in the running application if that piece of code isn't ever used or if it's just a test utility or could be any number of things. That logic path doesn't get exercised in the code, whatever it is. And then the other side, StackHawk, hey, we found a problem, but StackHawk isn't code aware, right? It's running outside of your application. It's running against a host and port. It knows it's a GraphQL application, but that's about it. It can't get into the source code. So we're trying to bridge the gap here between the two by this integration with Snyk and the CodeQL one as well. We just released the CodeQL integration. It's basically gonna look very similar to what I'm gonna show you with the Snyk one. I just hadn't updated this workshop because it's so new, so it's building app here. Now it ran the app, so now it's running Hawk Scan. As you can see, this kind of answers, partially answers what's the difference between this and GitLab. The core of it, the scanner is the same. It's just running this Hawk command in Docker. So you can run it. It can be run in Docker. It can be run at your desktop. And then all the CICD providers have ways to run things in Docker, or you can try setting up the Hawk command directly in there. We're doing some of that stuff with Microsoft Azure right now. They have a way to say, hey, you know what? I'm gonna run a Java command. So Hawk is a Java command, Java application. So you can set up an action that says, hey, don't run the scanner in Docker. 
I wanna run it directly, but you have to have Java set up, and then you have to download Hawk. We provide it as a zip file. So you can download the zip file, unzip it, and just run it directly. That sort of depends on how the CI/CD system works, how well it supports Docker, and how well it supports just running arbitrary third-party stuff. But anyway, this is the same output as if I was running it on my laptop, or if you were running it on your machine. So it runs this Hawk command. There's the stackhawk.yml file specified. It's in the root of this project now. So it starts up. You can see it's running version 2.9. Here's the localhost 3000. That's the host that we configured when we were setting up. All done. I wanna go through this, though, so you can get a taste of what it's doing here. It found 11 GraphQL routes. It's gonna try them all. It goes through here, does all that, runs a bunch of status stuff. And then at the end of it, for the command line, we output this helpful little summary. So there's that remote OS command injection. Here's the SQL injection. It's basically the stuff we found. I don't think it prints them all; I think it only prints out some of them. So it ran, and that's the output.
And then if we go back to here, now we've run another instance. At 10 o'clock Denver time, we ran the first one, and then we just ran the second one here, but with the Snyk integration. So now we have indicators here telling us that some of the findings got correlated against SAST findings in Snyk. So if we go into SQL injection, we go, okay, cool. We looked at this one previously. It's that query. It's a single tick that would cause a problem. SQLite error, remember this one? Yeah, we're just passing those down. We're not sanitizing the inputs. The difference now is that we have this Snyk Code tab here. We can click on this and we go, oh, CWE-89.
Wow, Snyk found that in this post.ts file. And if we go here, we go, oh yeah. If you look at this thing, it's taking this query right off the URL. Oh, arguments, sorry, arguments, yes. So it's passing this argument straight off the query into this, and it's not doing anything with it. So a single tick would cause this to be parsed and finished, and you'd have some extra stuff. So yeah, we need to address that. But you can see how this is a problem detected in the running application, and then this is the actual line of code that looks pretty suspect and needs to get fixed. We can also link back to Snyk and look at it. Similar kind of thing, where they have the line of code and a little bit more information on it.
Let's see, and then there was another one. So this is "server leaks X-Powered-By", and here are all the operations that were detected to have this problem. We go into the Snyk Code tab now, and it's in a couple of different places. So we go here, it doesn't like that, and it doesn't like, oh yeah, this one too. So by default, Express sends this X-Powered-By header, and it's just leaking out some information about your application. A malicious third party could say, oh, I know this is an Express app, so now I can go probe for certain Express vulnerabilities. You can use the Helmet middleware to not disclose that information. But yeah, there are two places in the code. You should go look at that and figure it out.
So that's our Snyk integration. It's pretty nice. You can see that between the scan runs here, nothing really changed, because we didn't change any code. The only difference is that we added the Snyk integration. And then like I said before, the other SAST provider that we integrate with is GitHub CodeQL. And it looks very similar.
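A header check like the one behind the X-Powered-By finding can be approximated with a small function over response headers. This is a simplified sketch, not the scanner's actual rule; the function name and the wording of the messages are invented, and the cross-domain check is included only because the same idea applies to the earlier cross-domain finding.

```python
def leaky_headers(headers: dict[str, str]) -> list[str]:
    """Report response headers that disclose too much or are too permissive."""
    # Header names are case-insensitive, so normalize before checking.
    normalized = {name.lower(): value for name, value in headers.items()}
    findings = []
    if "x-powered-by" in normalized:
        findings.append(f"X-Powered-By discloses: {normalized['x-powered-by']}")
    if normalized.get("access-control-allow-origin") == "*":
        findings.append("Access-Control-Allow-Origin allows any origin")
    return findings


# Headers resembling what an Express app sends by default (example data).
default_headers = {
    "X-Powered-By": "Express",
    "Access-Control-Allow-Origin": "*",
    "Content-Type": "application/json",
}
for message in leaky_headers(default_headers):
    print(message)

# After the header is suppressed and CORS is tightened, nothing is reported.
print(leaky_headers({"Content-Type": "application/json"}))
```

A check like this is handy in an integration test, so a regression in header configuration is caught at commit time rather than by the next full scan.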
In fact, you can have both running at the same time, I think, but it would have, we just basically have CodeQL markers here and the links would be back to CodeQL. So that is trying to bridge those two. We had the Dependabot software composition analysis. We have SAST, the static analysis and the StackHawk dynamic analysis. We're really trying to bridge those two together because being able to say, hey, this line of code is causing this problem in your running application is really powerful. It's a great thing if you're an engineer, like being able to know, being able to see that, like when you do a commit, you commit a bunch of changes and then this thing triggers and it's like, oh, you know what? This line that you just added has this problem in it. Doing that at the commit time is super powerful. It just, it makes things real easy and fast. That is all we've got. Yeah, next steps, scanner, CI updates, all good stuff. Hopefully this translates pretty easily into your GraphQL apps. The CI piece, like I said, if you're not using GitHub Actions, we support a ton of them. The basic idea is the same that you set up, if it's GitLab or Jenkins or whatever, you basically set up the scanner, the StackHawk scanner to run in your CI, CI CD pipeline against your application and then it reports things back. The Dependabot and CodeQL are, because those are operating against, you know, source code, you don't have to have a running application. Those are a little bit easier to get going. Snyk is very similar. Basically when you go to run, go to use Snyk, it will want access to your GitHub repos and you say, yeah, that's cool. And it checks them out and then runs, runs its pattern matching against your repos directly. But it's the same thing as the CodeQL that we did. Hopefully that translates either Snyk or CodeQL. It's very, very similar. Yeah, and then we've got our updates. Stay tuned, there's always new stuff coming out. I think, yeah, I think that wraps it up. 
Like Topher said, I mean, we kind of went through like a really robust security system here with various types of security testing going on at once. So thank you for that, Topher. That was awesome. And like he mentioned, you know, there's, we went through a couple of specific examples using particular tools to make it easy, but there's so many, you know, tools you could use. CodeQL can be swapped with Snyk, and I believe Snyk even has like a software composition analysis tool as well. So all of these work, you know, hand in hand together, you know, to give you really complete coverage of your APIs and applications. So please feel free to use any of those resources that we dropped in the Discord to go further, you know, with this experimentation. Now, I mean, if you're in the workshop, you've got a StackHawk account, you're on a free trial. I believe it's an enterprise level free trial, when you go through that. Yep, so you've kind of got access to everything during that time. So play around with it, connect some things. And our team will also be in touch with some more resources on how you can just play around and explore with StackHawk and connecting some of these tools. So yeah, thanks everybody for joining. Thank you, Topher, for going through such a good workshop. Hope you enjoy the rest of your day. Thanks everybody.
76 min
07 Dec, 2022
