Secrets in Source Code - How Your JS Code is Exposing Your Credentials
AI Generated Video Summary
This lightning presentation discusses how secrets leak in code and expose digital authentication credentials. GitGuardian detected over 10 million secrets in public repositories on GitHub, with Python being the top language for leaked secrets. Secrets can be exposed in both public and private repositories, so it is important to avoid hardcoding secrets and to store keys securely. Best practices for handling keys and secrets include using a centralized place for storing keys, using tools like dotenv for loading secrets, and implementing vaults and secrets managers.
1. Introduction to Secrets in Code
In this lightning presentation, I'm gonna talk about secrets inside code and how your applications, or your React applications, may be leaking your secrets. Secrets are digital authentication credentials that grant access into services and allow for data ingestion and writing. Our applications are a collection of different services, like Okta for authentication, Stripe for credit card processing, and MongoDB for managing databases. GitHub is a platform where we scan code and find a huge amount of data, including secrets, with over a billion commits made to public repositories.
Hey, everyone. My name is Mackenzie. I'm a developer and security advocate at GitGuardian. And in this lightning presentation, I'm gonna talk about secrets inside code and how your applications, or your React applications, may be leaking your secrets.
So a good place to start is really what are secrets. When I'm talking about secrets, I'm referring to digital authentication credentials. These are typically things like your API keys, your credential pairs, like your database credentials, security certificates, anything that grants access into services, or allows you to ingest data or write data. Really, these are the crown jewels of any organization because an attacker is going to go after these immediately when they gain access to anything. Secrets allow them to persist their access or move laterally into different systems, elevate their privileges, and they can do all of that while remaining undetected because they're properly authenticated into services.
So to understand how we use secrets, let's take a step back and look at how we build applications. No longer are we building monoliths. Our applications are more or less a collection of different services. So for example, we may use Okta to authenticate users into our systems. Maybe we're using Stripe to process credit cards, or MongoDB as a managed database. So very quickly, our applications can become a collection of all these different systems.
One of the things we do at GitGuardian is we scan code around the place to try and identify where these secrets are. And we release it in a report each year called the State of Secrets Sprawl. One of the places that we scan is github.com. You're probably familiar with it. Lots of developers are. In fact, in 2022, 94 million developers were using GitHub according to GitHub themselves. And we've crossed the 100 million developer mark by now. And last year, just last year, over a billion commits were made to public code repositories. Just public code repositories. So that's a huge amount of data. 84 million new repositories were made. Again, just public. So this is a literal firehose of information. You can find anything and everything in GitHub in public repositories, including secrets. So there's 1 billion commits.
2. Secrets Detected in Public Repositories
GitGuardian actually scanned every single one of them last year. We scan them for secrets. In fact, nearly 400 different types of secrets we were looking for. And what did we find? We found over 10 million secrets that were detected in public repositories in GitHub. 10 million. This is a big increase from previous years. The first year we released this report was in 2020 when we found 3 million, and you can see the progression to 10 million now.
Partly this is explained by more developers on GitHub, but it's also explained because we're actually using more and more secrets every year. Now we have different things like Infrastructure as Code, which is the fastest growing segment on GitHub. We're now programmatically managing our infrastructure, so we need to use secrets to do that. So more secrets, more leaked secrets on GitHub. There's a whole lot of information that we can find on GitHub. We look for specific secrets and we validate them when we find them. Some of the more interesting things that we found are, for example, that 20% of the secrets that we found were for cloud providers, things like Google Cloud services or AWS. And again, these are valid credentials. So this is 2 million valid cloud provider keys that we found in public places in GitHub. We even found lots of different things like messaging systems, database keys, and even keys to your version control platform. So this is access to your private repositories that you put in a public repository.
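To give a feel for what "looking for specific secrets" can mean, here is a minimal sketch of pattern-based detection. It is not GitGuardian's implementation. The detector names, the `scanForSecrets` function and the generic pattern are assumptions made for illustration, and a real scanner would use far more detectors and would also validate each candidate against the provider.

```typescript
// Hypothetical, simplified detectors for illustration only.
interface Finding {
  line: number;     // 1-based line number in the scanned text
  detector: string; // name of the pattern that matched
  match: string;    // the text that matched
}

const DETECTORS: { name: string; pattern: RegExp }[] = [
  // AWS access key IDs commonly start with "AKIA" followed by 16 uppercase letters or digits.
  { name: "aws-access-key-id", pattern: /\bAKIA[0-9A-Z]{16}\b/ },
  // A long quoted value assigned to something named like a key, secret, token or password.
  {
    name: "generic-assignment",
    pattern: /(?:api[_-]?key|secret|token|password)\s*[:=]\s*["'][^"']{16,}["']/i,
  },
];

// Check every line of the source text against every detector.
function scanForSecrets(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    for (const { name, pattern } of DETECTORS) {
      const m = pattern.exec(text);
      if (m) {
        findings.push({ line: index + 1, detector: name, match: m[0] });
      }
    }
  });
  return findings;
}
```

Running this over a file that contains a hardcoded key would flag the line with the key, while a line that reads the key from the environment would pass. Pattern matching alone produces false positives, which is why validating the candidates matters.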
3. Exposure of Secrets in Repositories
Secrets can be leaked from both public and private repositories. The Lapsus Group has publicly released the source code of major companies, showing that private code is not safe. Source code is spread across various platforms, including developers' machines, backups, wikis, messaging systems, and Git repositories. Even if you don't have open source repositories, secrets should never be in your code. To prevent exposure, avoid hardcoding secrets and store keys on the backend instead of handling them directly on the front end.
These leaks happen very frequently. One example that we had last year was with Toyota, where a contractor working for Toyota accidentally made a private repository public, which leaked keys to a database that Toyota's mobile application, called T-Connect, was using. So these are keys that gave access to the information of all the users of T-Connect, and they were sitting out there in a public repository. So this is just one example, but there are many, many more examples where real keys belonging to organizations have been leaked in public places with real consequences.
But it's not just public repositories that you need to worry about. It's your private repositories, too. Source code that is private is always accidentally finding its way out into public places, or as I call it, involuntarily open sourced. The Lapsus Group, for example, last year publicly released the source code of NVIDIA, Samsung, Microsoft and many, many more. You may ask, how did they gain access to private source code? Well, there's lots of different ways. Source code is a very leaky asset, which means it's sprawled everywhere. It's on your developers' machines, it's backed up, it's in wikis, it's shared over messaging systems, and of course, it's in your Git repositories. Also, a lot of people have access to your Git repositories. And I know for a fact that Lapsus was simply paying employees to grant them access to private source code. So it's not that difficult for a threat actor that's motivated to access your private source code. So if you're thinking you're safe because you don't have open source repositories, you're not. Secrets definitely should not be in your code regardless of where they are.
So why do secrets get exposed? There's lots of different ways. Maybe you're committing things and you're testing out keys so you've hardcoded them in there, not realizing that those keys are going to be in your history forever. Even if you do something on a development branch, you hardcode credentials, you quickly remove them later because you're just testing it, those credentials are still going to be in your Git history and they're going to be there forever. They can be printed out in auto-generated files like debug logs. You can include sensitive files in your Git repositories like .env files or .pem files. They should never be in your repository, so make sure you use a .gitignore file. And you can also accidentally push code to the wrong place.
So how do you prevent secrets from being exposed? Well, you should never hardcode secrets, never directly type the secret into your application. That's rule number one, doesn't matter if it's private, public or what kind of branch you're working on or for how long it's there. It will be there forever in Git. We can also use tools to prevent them from being leaked. And we want to store keys on the backend. If you're building a React app and you're making it look pretty, we want to make sure that you're accessing your credentials through the backend and it's passing you the data that you need and you're not dealing with it directly. And there's not many scenarios where you actually need to handle those keys on the front end.
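The two rules above (don't hardcode, keep keys on the backend) could look something like the sketch below. Everything named here is an assumption for illustration: the `PAYMENT_API_KEY` variable, the `PaymentClient` type and the `createCharge` handler are hypothetical and do not belong to any real provider's SDK. The environment is passed in as a plain object; in Node.js that object would typically be `process.env`, populated from an untracked .env file or a secrets manager.

```typescript
// A map of environment variables; in Node.js this would be process.env.
type Env = Record<string, string | undefined>;

// Read a required secret from the environment and fail fast if it is missing,
// instead of falling back to a hardcoded value.
function requireSecret(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required secret: ${name}`);
  }
  return value;
}

// Hypothetical shape of what the front end is allowed to see.
interface ChargeSummary {
  id: string;
  status: string;
}

// Hypothetical payment client, injected so the sketch needs no network access.
type PaymentClient = (
  apiKey: string,
  amountCents: number
) => Promise<{ id: string; status: string; raw: unknown }>;

// Backend handler: the key never leaves the server, and only
// non-sensitive fields are returned to the browser.
async function createCharge(
  env: Env,
  client: PaymentClient,
  amountCents: number
): Promise<ChargeSummary> {
  const apiKey = requireSecret(env, "PAYMENT_API_KEY");
  const result = await client(apiKey, amountCents);
  return { id: result.id, status: result.status };
}
```

The React app would call an endpoint backed by something like `createCharge` and only ever receive the `ChargeSummary`. The key itself stays out of the browser bundle and out of the repository.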
4. Best Practices for Handling Keys and Secrets
If you want to go a step further, we should definitely be using vaults and secrets managers. We need to rotate our keys regularly so that even if a key does get leaked, you're on a rotation schedule so it won't be valid for long, and we need to restrict the access of the key. We should restrict it down to just the services that we expect, so if a threat actor gets it, they can't do anything with the key.
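As a rough sketch of the rotation and restriction ideas, the snippet below checks whether a key is past its rotation window and whether a service is on the key's allow list. In practice the vault, secrets manager or provider would enforce both. The `ApiKeyRecord` shape and the helper names are hypothetical, and the 90-day window used in the usage note is an arbitrary example, not a recommendation from the talk.

```typescript
// Hypothetical metadata kept alongside a key (never the key value itself).
interface ApiKeyRecord {
  id: string;
  createdAt: Date;
  allowedServices: string[]; // the only services this key is expected to call
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// True when the key has reached or passed the rotation window.
function isDueForRotation(key: ApiKeyRecord, maxAgeDays: number, now: Date): boolean {
  const ageDays = (now.getTime() - key.createdAt.getTime()) / MS_PER_DAY;
  return ageDays >= maxAgeDays;
}

// True only for services that were explicitly granted to this key.
function isAllowed(key: ApiKeyRecord, service: string): boolean {
  return key.allowedServices.indexOf(service) !== -1;
}
```

With a 90-day window, for example, a key created on 1 January would be flagged by `isDueForRotation` from 31 March onward (30 March in a leap year), and a key scoped only to `"billing"` would be rejected by `isAllowed` for any other service.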
For sensitive stuff, we should be signing those keys with a hash of the application that's using it so that we're absolutely certain that everything that's being used is for legitimate purposes only. So thank you very much. There's a couple of QR codes on the screen. One will take you to the State of Secrets Sprawl report for 2023, and the other will take you to a secrets management maturity model that looks at how you can correctly handle your secrets in various different ways. Thanks for listening.