Secrets in Source Code - How Your JS Code is Exposing Your Credentials

Secrets like API keys are constantly leaking through source code. The 2021 State of Secrets Sprawl report found over 6 million secrets in public git repos. This presentation reviews the new, unreleased 2022 State of Secrets Sprawl report, focusing on how JavaScript source code specifically leaks secrets.

Mackenzie Jackson
11 min
06 Jun, 2023


Video Summary and Transcription

This lightning presentation discusses how secrets leak through code and expose digital authentication credentials. GitGuardian detected over 10 million secrets in public repositories on GitHub, with Python being the top language for leaked secrets. Secrets can be exposed in both public and private repositories, so it is important to avoid hardcoding secrets and to store keys securely. Best practices for handling keys and secrets include using a centralized place for storing keys, using tools like .env for loading secrets, and implementing vaults and secrets managers.

1. Introduction to Secrets in Code

In this lightning presentation, I'm gonna talk about secrets inside code and how your applications, or your React applications, may be leaking your secrets. Secrets are digital authentication credentials that grant access into services and allow for data ingestion and writing. Our applications are a collection of different services, like Okta for authentication, Stripe for credit card processing, and MongoDB as a managed database. GitHub is one of the places where we scan code, and it holds a huge amount of data, including secrets, with over a billion commits made to public repositories last year.

Hey, everyone. My name is Mackenzie. I'm a developer and security advocate at GitGuardian. And in this lightning presentation, I'm gonna talk about secrets inside code and how your applications, or your React applications, may be leaking your secrets.

So a good place to start is really, what are secrets? When I'm talking about secrets, I'm referring to digital authentication credentials. These are typically things like your API keys, your credential pairs, like your database credentials, security certificates, anything that grants access into services, or allows you to ingest data or write data. Really, these are the crown jewels of any organization, because an attacker is going to go after these immediately when they gain access to anything. Secrets allow them to persist their access, move laterally into different systems, and elevate their privileges, and they can do all of that while remaining undetected because they're properly authenticated into services.

So to understand how we use secrets, let's take a step back and look at how we build applications. No longer are we building monoliths. Our applications are more or less a collection of different services. So for example, we may use Okta to authenticate users into our systems. Maybe we're using Stripe to process credit cards, or MongoDB as a managed database. So very quickly, our applications can become a collection of all these different systems.

One of the things we do at GitGuardian is we scan code around the place to try and identify where these secrets are. And we release the results each year in a report called the State of Secrets Sprawl. One of the places that we scan is GitHub. You're probably familiar with it. Lots of developers are. In fact, in 2022, 94 million developers were using GitHub, according to GitHub themselves. And we've crossed the 100 million developer mark by now. And last year, just last year, over a billion commits were made to public code repositories. Just public code repositories. So that's a huge amount of data. 84 million new repositories were made. Again, just public. So this is a literal firehose of information. You can find anything and everything in GitHub in public repositories, including secrets. So there's 1 billion commits.

2. Secrets Detected in Public Repositories

GitGuardian detected over 10 million secrets in public repositories on GitHub, including valid cloud provider keys, messaging system keys, database keys, and keys to version control platforms. Python was the top language for leaked secrets, followed by JavaScript. Hard-coding credentials in main files like app.js and index.js is a common but insecure practice. Configuration files for specific services, like Docusaurus, can also expose keys. Hackers do monitor GitHub for these credentials.

GitGuardian actually scanned every single one of them last year. We scan them for secrets. In fact, nearly 400 different types of secrets we were looking for. And what did we find? We found over 10 million secrets that were detected in public repositories in GitHub. 10 million. This is a big increase from previous years. The first year we released this report was in 2020 when we found 3 million, and you can see the progression to 10 million now.

Partly this is explained by more developers on GitHub, but it's also explained because we're actually using more and more secrets every year. Now we have different things like Infrastructure as Code, which is the fastest growing segment on GitHub. We're now programmatically managing our infrastructure, so we need to use secrets to do that. So more secrets, more leaked secrets on GitHub. There's a whole lot of information that we can find on GitHub. We look for specific secrets and we validate them when we find them. Some of the more interesting things that we found are, for example, that 20% of the secrets that we found were for cloud providers, things like Google Cloud services or AWS. And again, these are valid credentials. So this is 2 million valid cloud provider keys that we found in public places in GitHub. We even found lots of different things like keys to messaging systems, database keys, and even keys to your version control platform. So this is access to your private repositories that you put in a public repository.
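To make "looking for specific secrets" a little more concrete, here is a minimal sketch of pattern-based detection. It is not GitGuardian's detection engine, which the talk describes as covering nearly 400 secret types and validating what it finds. The `PATTERNS` list and the `findSecrets` helper are hypothetical names, the generic assignment pattern is a rough heuristic invented for illustration, and nothing here validates a match against the provider.

```javascript
// Hypothetical, simplified detector: two patterns only, no validation step.
const PATTERNS = [
  // AWS access key IDs for IAM users start with "AKIA" followed by
  // 16 uppercase letters or digits.
  { name: "aws_access_key_id", regex: /\bAKIA[0-9A-Z]{16}\b/g },
  // Rough heuristic (an assumption, prone to false positives): a name such as
  // apiKey, secret, or token assigned a quoted string of 16+ characters.
  {
    name: "generic_assignment",
    regex: /\b(?:api[_-]?key|secret|token)\b\s*[:=]\s*["'][A-Za-z0-9_\-]{16,}["']/gi,
  },
];

// Scan source text line by line and report each match with its line number.
function findSecrets(source) {
  const findings = [];
  source.split("\n").forEach((line, index) => {
    for (const { name, regex } of PATTERNS) {
      for (const match of line.matchAll(regex)) {
        findings.push({ type: name, line: index + 1, match: match[0] });
      }
    }
  });
  return findings;
}
```

Calling `findSecrets` on the contents of a file such as app.js returns an array of findings. A real scanner would also check whether each candidate is still valid, since a revoked key is far less of a risk than a live one.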

Looking at the file extensions that we found here, Python was the number one language that leaked secrets, but JavaScript is really the second language when we exclude JSON and ENV files, which are really language-agnostic files. But we're just looking here on this slide at JavaScript file extensions. Now, it may not be a surprise that app.js and index.js are really the number one files. And this is really a main file where the developer has hard-coded their credentials into their source code. Now, this is a very bad practice, but it doesn't come as an overall surprise. But we can find some interesting things as we look through this data. For example, we have configuration files for specific services. So Docusaurus is kind of like an out-of-the-box framework to build websites on. And it comes with a configuration file. You can just simply replace the example text and put your own keys in here, and commit this to GitHub. And now you've just exposed your keys. So is this really a security risk? Do hackers actually monitor GitHub for these credentials? The answer is yes.
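As a contrast to hard-coding a key in app.js or pasting it into a committed configuration file, here is a minimal sketch of reading a secret from the environment at runtime in Node.js. The `requireEnv` helper and the `STRIPE_SECRET_KEY` variable name are hypothetical and chosen only for illustration; the talk does not prescribe this exact code.

```javascript
// Insecure pattern the talk warns about: the key is committed with the code.
// const stripeKey = "hard-coded-key-goes-here";

// Hypothetical helper: read a required secret from the environment and fail
// fast with a clear error if it is missing, without ever logging its value.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage (the variable name is an assumption, set outside the repository):
// const stripeKey = requireEnv("STRIPE_SECRET_KEY");
```

This only keeps the key out of the repository for code that runs on a server. Any value bundled into code that ships to the browser is still visible to whoever loads the page, so a key that must stay secret should not be placed in client-side code at all.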

