AI Generated Video Summary
It's a really great way to just interact and engage with developers. But through that work, I also do a lot of things with the community, whether that's through the OWASP security project, or through things like the Node Foundation security work, triaging vulnerabilities, and a lot of work around open source. And that kind of helps me get a clear picture of what is going on and where things are going.
Now, I realize that everyone probably relates to this in a very emotional way, right, when you go and do an npm install. Yes? So, I'm here to tell you that this is okay. You are feeling something that every one of us feels before we do an npm install, and this whole talk will basically be about why you feel that way, but it will also give you some preventive measures, some security controls that you can add tomorrow in your team to mitigate the risks around the things that happen there.
So, that feeling that you have, if you can relate to that meme, is actually based on some foundational scientific research. One study from a couple of years ago showed us that when we install the average npm package, we put a lot of trust into maintainers and into the third-party dependencies that we're bringing in. Just by installing the average npm package, you're probably trusting about 79 third-party dependencies and 39 maintainers. That's a lot. That means there's probably going to be a lot of noise, and potentially pain, when you need to remediate some of this. But this is the truth of things.
And I'm also here to tell you that this isn't a new concern. In fact, this whole question of where we put our trust as developers, how much we should trust, and what exactly we should trust is something that was talked about almost 40 years ago. This person called Ken Thompson, he's a Turing Award-winning developer, went on to write an essay called Reflections on Trusting Trust. I highly recommend reading it, but I'll just give you the gist of what it actually means.
So this person went off and said, I want to show you what it means to trust people. And then he added a back door to the Unix login program. But of course people review code, right? On open source. So then he went on and continued this chain by adding the back door to the compiler that compiles the login program, so the compiler injects it. But, well, people also review the compiler code. Well, how do you compile compilers? You need one entry point to begin with. And so he actually went on and added that back door there.
2. Insights on Open Source and Supply Chain Security
That's the Trojan that he wanted to show us, as an experiment in how this works. He added it to the compiler that then compiles the Unix login program. It reveals why trust is important and how much further we need to go. Open source is great and using it is a productivity tool. We need to understand the gift of open source and the supply chain security story. It's not just about npm dependencies, but also the entire software building process and its integration points.
That's the Trojan that he wanted to show us, as an experiment in how this works. He added it to the compiler that then compiles the Unix login program. So if you review the Unix login program and if you review the compiler source, at that point you will not see it anymore. Because you still need a binary compiler to then compile all of those. And that is where things are happening. That is where the back door is inserted.
It's a very interesting insight, revealing how software kind of has traits and how it passes them on to the programs that get generated out of it. And so I highly recommend reading this. But it shows us why trust is important and how much further we need to go in order to put that trust somewhere.
So, still, open source is great. And we can't deny the fact that to build software today, we need to use open source software, even when the program that we build is not open source itself, or maybe only half of it is, or whatever. That's kind of the reality. And of course, why not? Why not use open source software, right? Because essentially what we really want is not to reinvent the wheel. We want to use work that great people have done, and then we can take that work and put it into practice. And this is a great productivity tool.
So by now, I'm pretty sure we're hitting that two million mark on npm. So, I don't know, that's amazing. Thanks to us, to all of you here, helping promote open source software. But at the same time, we kind of need to understand and recognize this gift that open source has given to the world, and what it actually means. So all of those packages, they are essentially the supply. This is part of the supply chain security story. And it is relatively easy for us to think that supply chain security is all about npm dependencies. But it's not really just that. In fact, if we go all the way back to the basics of how software is being built, we can see that we have several connection points along the way. So you're a developer, you're building something, maybe pushing it to GitHub. That's basically your source control. Then there's a build getting triggered, then there's some output out of that. Maybe that's a package, or maybe it's getting pushed onto some CDN or whatever. And you're using some open source through the build process. So all of that is essentially how we're building software. And these are the integration points of what supply chain security means at the very basic level.
3. Supply Chain Security and Lock File Tampering
Developers are being targeted as malware distribution vehicles and in targeted attacks to steal tokens. Installing an npm package can be risky. Let's discuss preventive measures, starting with lock file tampering. In 2019, I disclosed potential security problems with lock files. A pull request can include a Trojan hiding in the package.json and lock file. The yarn.lock file is often ignored in review, but it poses a threat.
Essentially, at any one of these intersection points someone can go ahead and insert bad code, which we've seen. For example, the Linux "hypocrite commits", I think that was last year, an incident where a university had actually inserted those to make the same point as Ken Thompson, if you want to relate to that. And, you know, compromising source control is something that's happening. For example, the PHP source code was not managed on GitHub, and someone was able to get access to the PHP Git servers and potentially modify the code. That's millions of servers running on the internet that could get the back doors or the Trojans out of that. And there's more and more. Someone can modify your code. Your build might be compromised, maybe because we're not building those GitHub Actions correctly with the best practices. Maybe you're using a bad dependency, like we've seen with event-stream. Maybe the actual result of what you build, what your consumers actually get, does not go through the formal CI/CD process, which is a very relevant security story for us in the ecosystem, because Codecov was part of this problem, where the binary was actually changed behind the scenes. So all of this is how software is getting built today, and this is the whole supply chain security story. The npm package is a part of it, but it's not all of it. Still, I think we're seeing that developers are being targeted right now, and have been for a few years already if not more, as a malware distribution vehicle, or just in targeted attacks to steal all of our tokens for npm and for GitHub and for everything else. Because the stuff we have on our laptops is, well, we have secrets for production, right, and access keys for staging and all of those things. So if you install an npm package and it does something bad, you should be worried about that.
So with that said, with this kind of intro, let's go and talk about some preventive measures. What can we do as security controls for this ecosystem? Starting off with something that I've actually done in the past, which is lock file tampering. Maël, who opened the session here today about Yarn 4, has actually talked about this, about the security aspects of package managers and how this is now getting mitigated. So back in 2019, I disclosed research on potential security problems with lock files, and it has to do with lock files on Yarn and npm and whatever. It's basically about how they are managed. So let's take a practical approach and see what this actually means. This is a screenshot of how I opened a pull request to a repository. And I'm pretty sure you can all recognize what's going on here. There's essentially no code change that I've proposed, but this pull request still includes a malicious package. There is a Trojan hiding here in this pull request. And this is the entire code for the pull request, just this: package.json and the lock file. So what's really going on there? Because it looks like this isn't a typosquatting attack, since those package names are legitimate. The versions check out. And if you were running something like Snyk with a Git integration, it would run a check on the pull request and tell you that none of these specific libraries and versions introduce new vulnerabilities. So essentially everything looks to be okay here, right? Okay. Well, there's a yarn.lock file here, which we all kindly ignore, right? Because who wants to code review this? I don't. Just as much as I don't want to review regexes, right? This is not supposed to be human readable and not supposed to be human consumable. Still, this poses a threat. Let's see what it is. So I expanded this and, you know, this is my lock file.
4. Mitigating Package Manager Risks
When giving a pull request to a project, I can use a feature of package managers like npm and Yarn to install packages from unconventional sources. This poses a risk as malicious packages can introduce post-install scripts that run arbitrary commands on your machine. To mitigate this, I propose linting the lock file and enforcing trust policies. Additionally, it is advisable to prevent contributions to lock files and automate dependency management through bots.
And if you were to start reviewing this, you can see there's a line change of like 5,000 or whatever, so this is pretty, you know, long. So I'll scroll down a little bit and we'll try to review it together. Scroll, scroll, scroll. Okay. Do you see it? Do you see the issue? Not yet. Almost. There we go. Okay. So all I need to do when I give that pull request to a project is use this really cool feature of package managers like npm and Yarn, which essentially allows us to install packages off of really weird things, like a GitHub gist, or a tarball, essentially the head commit of a source control repository. So I can do that. Once I have the integrity check and everything else checks out, I can go ahead and push this into my pull request. I can change the source of ms so that it comes not from npm but rather from my own GitHub, and it will install it. And I'm saying this is a malicious package because once you install it, I may introduce some post-install scripts that will run commands, so I can install whatever I want on your machine. How do we mitigate this? This is where I was disclosing this research and came up with the idea of linting the lock file. We all use linters for different things, like JSLint for your code quality and clean code or whatever you want to use it for, which is great. This is another one that you should probably think about adding, because it essentially gives you the ability to say that your lock file needs to follow specific trust policies. For example, the origin of where something comes from, the allowed hosts here. Or, not even related to the origin, maybe some software is getting installed over an HTTP connection, which enables people to perform man-in-the-middle attacks. So you kind of want to have this trust policy, and this is how you do it. So use it in your CI or your pre-commit hooks or whatever you're using, but essentially you want to have more mitigation measures.
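To make the trust-policy idea concrete, here is a minimal sketch of what a lock file check could look like. It is not the linter shown in the talk, just an illustration under stated assumptions: it only understands the `resolved "<url>"` lines of a Yarn v1 style lock file, the allow-list of hosts is a hypothetical policy, and the sample lock file entries (package names, URLs, hashes) are made up.

```python
import re
from urllib.parse import urlparse

# Hypothetical trust policy: only these registry hosts are acceptable.
ALLOWED_HOSTS = {"registry.yarnpkg.com", "registry.npmjs.org"}

# Yarn v1 lock files record where each package comes from on a line like:
#   resolved "https://registry.yarnpkg.com/<name>/-/<name>-1.0.0.tgz#<hash>"
RESOLVED_LINE = re.compile(r'^\s*resolved\s+"([^"]+)"', re.MULTILINE)


def lint_lockfile(lock_text, allowed_hosts=ALLOWED_HOSTS):
    """Return a list of (url, reason) tuples for every policy violation."""
    violations = []
    for url in RESOLVED_LINE.findall(lock_text):
        parsed = urlparse(url)
        if parsed.scheme != "https":
            violations.append((url, "not fetched over HTTPS"))
        elif parsed.hostname not in allowed_hosts:
            violations.append((url, "host is not on the allow-list"))
    return violations


# Made-up lock file content, for illustration only.
SAMPLE_LOCK = '''
example-a@^2.1.1:
  version "2.1.2"
  resolved "https://codeload.github.com/some-user/example-a/tar.gz/abc123"

example-b@^4.1.0:
  version "4.1.1"
  resolved "https://registry.yarnpkg.com/example-b/-/example-b-4.1.1.tgz#abc"

example-c@^1.0.0:
  version "1.0.0"
  resolved "http://registry.npmjs.org/example-c/-/example-c-1.0.0.tgz#def"
'''

for url, reason in lint_lockfile(SAMPLE_LOCK):
    print(f"{reason}: {url}")
```

A check like this would fail the build on the first and third entries: one is pulled from a source-control host instead of a registry, the other over plain HTTP. A real lock file linter has to handle more formats and edge cases than this sketch does.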
So besides maybe using this, you should figure out two things here, right? First of all, you probably do not want to allow or receive any contributions to lock files because of this issue, because realistically none of us is really going to review those lines of a lock file. So let's not, you know, open this door to begin with. And also, on how we manage dependencies, you essentially want to hand all of this dependency management off to some bots, because they're good at it and they can, you know, raise those automatic PRs for us. So that's another thing to just realize.
5. Arbitrary Command Execution in NPM
Arbitrary command execution is a feature of package managers like npm. This exposes users to the risk of installing malicious packages or compromised modules. In January 2022, a malicious version of the npm package 'colors' was published to the registry, putting users at risk.
So, continuing on. Arbitrary command execution for all of us. That's like a feature of package managers. So it's amazing. Let's see. npm install colors. I'm going to go and copy-paste that into my terminal, but... Yeah. Maybe I should take a few seconds before I run that command. And I should, because this actually happens. npm, as a package manager, allows any dependency along the tree, no matter how big or small, to execute commands before or after something in that tree is installed. And so, if I went and did npm install colors on the day when there was actually a malicious colors version published to the npm registry, I would be exposing myself to malicious packages, or maybe compromised modules, maintainers, and things like that. So this really did happen. In January 2022, in case you missed it, if you had installed colors from npm, that's something that would have happened.
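As a rough illustration of what "executing commands before or after install" means in practice, the sketch below reads a package manifest and lists the install-time lifecycle scripts it declares. The hook names `preinstall`, `install`, and `postinstall` are npm's; the helper name and the sample manifest are hypothetical and only there for the example.

```python
import json

# npm lifecycle hooks that run automatically while a package is installed.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def install_time_scripts(manifest_text):
    """Return the install-time hooks declared in a package.json string."""
    manifest = json.loads(manifest_text)
    scripts = manifest.get("scripts", {})
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}


# Made-up manifest, for illustration only.
SAMPLE_MANIFEST = '''
{
  "name": "example-package",
  "version": "1.0.0",
  "scripts": {
    "test": "node test.js",
    "postinstall": "node ./setup.js"
  }
}
'''

print(install_time_scripts(SAMPLE_MANIFEST))
```

Any command listed there runs with the permissions of whoever ran the install, which is why npm's ignore-scripts setting, raised in the Q&A at the end, is worth knowing about. It only covers install time, though, not what a module does once your application loads it.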
6. Impact of the Sabotaged colors Package
But let's drill down a little bit to understand what's going on. colors was kind of sabotaged by its own maintainer to run some, well, I won't get into this, but that is what happened. You can see it hasn't had any downloads in the last two years. No, sorry, no new versions in the last two years. But suddenly a new patch version was released, and at this point in time, very, very quickly, in just the last seven days, it gains something like 100K downloads from end users downloading this version. What's going on with this version that all of those 100,000 users have been downloading, maybe me and you? Let's see.
7. Working Around Blind Upgrades and Vulnerabilities
So you need to work around those things. But most of us probably do not need to trust everything and everyone by default and carry this insecurity. How about avoiding blind upgrades? Another thing that I've been seeing happen, you know, talking to developers all the time: they have things in their CI like running an npm update, or running an npm-check-updates command, because they want CI to always update to the latest versions and test that none of the packages they depend on has broken their code. So, why would you want to do this to yourself? So, you need to think of how you do this well, right? Not with a blind upgrade, but with context. Number four is what you see is not what you execute. It's a real favorite of mine.
So you need to work around those things. But most of us probably do not need to trust everything and everyone by default and carry this insecurity. How about avoiding blind upgrades?
Another thing that I've been seeing happen, you know, talking to developers all the time: they have things in their CI like running an npm update, or running an npm-check-updates command, and essentially they are running that in CI because they want CI to always update to the latest versions and test that none of the packages they depend on has broken their code. Which is, I mean, understandable, but it is exposing you again to a plethora of issues that could be happening. Security incidents like dependency confusion and a ton of other things. Like, why would you want to be there? If you had done that in your CI, and that CI were running in those days when colors was out, when node-ipc was out, all of those security incidents, you would be getting those malicious versions automatically. So, why would you want to do this to yourself?
So, you need to think of how you do this well, right? Not with a blind upgrade, but with context. Which essentially means, please, again, use those automated bots. You can use GitHub or Snyk, whatever you want. But use them in order to streamline those package upgrades, not in a way that actually gives packages all of this access to your machine. In fact, some of them can actually protect you. With Snyk, what we've done is this: we are doing npm upgrades for your packages, not just for security, but also just because they're out of date. But when we do those, if node-ipc or colors 1.4.1 just got out yesterday, we do not immediately rush to give you those updates. We actually went and looked at a bunch of security incidents that happened in the past, at what actually happened there and how much time it took the ecosystem to go ahead and remediate them. And that is why we have this inherent delay of about 21 days before we suggest a newly upgraded package. So if something malicious is going on right now, you're not getting that malicious package the next day, before everyone has had time to react to it.
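The cool-down idea can be sketched in a few lines. This is only an illustration of the principle, not how any particular bot implements it: the function name, the package versions, and the publish dates below are all made up, and the 21-day window is simply the figure mentioned above.

```python
from datetime import datetime, timedelta, timezone

# Cool-down window taken from the figure mentioned in the talk.
COOL_DOWN = timedelta(days=21)


def versions_safe_to_suggest(published, now, cool_down=COOL_DOWN):
    """Return versions that have been public for at least `cool_down`.

    `published` maps a version string to its publish time.
    """
    return [version for version, when in published.items()
            if now - when >= cool_down]


# Made-up publish history for a hypothetical package.
PUBLISHED = {
    "2.0.0": datetime(2022, 1, 1, tzinfo=timezone.utc),
    "2.0.1": datetime(2022, 3, 1, tzinfo=timezone.utc),
}
NOW = datetime(2022, 3, 10, tzinfo=timezone.utc)

print(versions_safe_to_suggest(PUBLISHED, NOW))
```

With these sample dates, only the older release is offered; the one published nine days earlier is held back until the ecosystem has had time to notice anything wrong with it.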
Number four is what you see is not what you execute. It's a real favorite of mine. I don't know how many of you have heard about the Trojan Source attack. But let's go and drill into a bit of code. So here's a bit of code that I took from, like, a Fastify middleware thing. It's a Node.js example. Go ahead, look at it for a second, and tell me where you see the vulnerability coming in. Is it the first paragraph, the second paragraph, the third paragraph, or is it all okay? Of course, you're developers, so what am I thinking? I'll just highlight this and you'll find it right away.
8. Hidden Comments and Trojan Source Attacks
This code has an issue with how it is written. Control characters in strings can hide comments and change the code's logic. Recent research on Trojan Source attacks has led to improved warnings in tools like VS Code and GitHub. Mitigation options include using the Snyk extension in VS Code or an ESLint plugin.
So, some ideas for mitigating it. Oh, there we go. I like that dog. I like dogs in general, so it fits. So with Trojan Source attacks, right, we have some ways to mitigate them, and essentially what we want to do is be able to mitigate them as fast as possible. Again, you have this already in VS Code, in IDEs, and things like that. And so you can either use that, or you can use an ESLint plug-in I wrote for it, and no matter what VS Code version or IDE or whatever you're using, it will just detect them and tell you about this.
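The detection itself is conceptually simple. The sketch below is not the ESLint plug-in or any editor's implementation; it is a minimal scanner, assuming that flagging the Unicode bidirectional control characters (the ones Trojan Source attacks rely on) is what we want. The function name and the sample snippet are hypothetical, and the control character is written as an escape so nothing invisible hides in the example itself.

```python
import unicodedata

# Unicode bidirectional control characters: embeddings, overrides, isolates,
# and the characters that pop them.
BIDI_CONTROLS = set("\u202a\u202b\u202c\u202d\u202e\u2066\u2067\u2068\u2069")


def find_bidi_controls(source):
    """Return (line, column, character name) for each bidi control found."""
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                findings.append((line_no, col, unicodedata.name(ch)))
    return findings


# Made-up snippet with a right-to-left override hidden inside a string.
SAMPLE = 'let a = 1;\nlet role = "user\u202e";\n'

for line_no, col, name in find_bidi_controls(SAMPLE):
    print(f"line {line_no}, column {col}: {name}")
```

Running a check like this in CI or a pre-commit hook catches the characters regardless of which editor a reviewer happens to use.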
Next up, avoiding dependency confusion. Whoa, let's see what this actually means.
9. Dependency Confusion and Mitigation
This research highlights the risks of incorrectly managing dependencies and the potential for dependency confusion attacks. Mitigating this is easy with tools like snync, which scans package.json and Git commits for signs of dependency confusion. Thank you all for attending my talk and remember to write secure code.
I'll run through this pretty quickly as we get towards the end. So this is research that's been going on in the ecosystem for quite a bit, and a lot of actual pen testers have been using it to try and get inside companies' internal systems, because of the way those companies are incorrectly managing dependencies and the configuration around them.
There's a bunch to this, and I won't go into how the whole thing works, but essentially dependency confusion is rooted in the fact that private packages hosted internally for a company are not found in the npm registry. That name is open and free for everyone to register, and then a potential misconfiguration could allow someone to take that namespace, add malicious code into it, again something like an install script, and run that on your machine. How do you mitigate it? There's a bunch of tooling that is pretty simple today. But if you do things like, again, npm update, or manually manage your dependencies, there's a good chance that you will be prone to those dependency confusion attacks. So, again, do not do this. You see it's a repeating theme across different types of attacks. Mitigating this is pretty easy if you want to use one of those tools. So we created one back then. It's called snync. The idea is that it scans your package.json. It even scans your Git commits to understand when you introduced a private package and how that relates, in terms of the time frame, to what's on npm. It will give you warnings saying that potentially you are vulnerable right now, or maybe that there's something suspicious about a package that exists but we're not entirely sure whether it is malicious or not.
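The timeline comparison described above can be illustrated with a small sketch. This is not the tool's actual logic, only a guess at the kind of rule it implies: the function name, the three outcome labels, and the dates are all hypothetical, and in practice the two dates would come from Git history and from the public registry.

```python
from datetime import date


def classify_private_dependency(first_seen_in_git, first_published_on_npm):
    """Compare when a private name entered the repo with when it hit npm.

    Pass None for `first_published_on_npm` if the name is not on the
    public registry at all.
    """
    if first_published_on_npm is None:
        # Nobody owns the name publicly, so anyone could still register it.
        return "vulnerable"
    if first_published_on_npm > first_seen_in_git:
        # A public package appeared after the name was already used internally.
        return "suspicious"
    return "ok"


# Made-up dates, for illustration only.
print(classify_private_dependency(date(2020, 5, 1), None))
print(classify_private_dependency(date(2020, 5, 1), date(2021, 2, 15)))
print(classify_private_dependency(date(2020, 5, 1), date(2018, 1, 1)))
```

The "suspicious" outcome is deliberately not a verdict: a later public package might be an attacker squatting the name, or the company itself reserving it, which is why a human still needs to look.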
So with all that said, thank you so much. We're on time. Thanks, everyone, for coming to my talk, and I hope you all write secure code. (Applause.) Thank you so much. I think everybody was taking notes. Right? Otherwise, if they were not... I will share the slides after that. Thank you. Asking for a friend. Share the slides. If you still have questions, definitely put them on Slido so I can read them from the screen here. So, to pick one: what are your thoughts on the feeling of safety introduced by features like npm's ignore-scripts, when a Node module can access the FS at any time at run time? Yes. Great question. There is some talk.
10. Node Foundation and Threat Model Discussion
The Node Foundation, now the OpenJS Foundation, has a security ecosystem working group. It is run transparently and openly on GitHub, so anyone can join the discussions and monthly calls. There is a current discussion on establishing a threat model for Node.js applications. Compartmentalizing the capabilities of modules or apps is a potential solution. This is a real threat that needs to be addressed.
There actually is. The Node Foundation, or today the OpenJS Foundation, has a security ecosystem working group, so you could also join that, and all of that is managed very transparently and openly on GitHub, so you can actually join the discussions, the monthly calls, et cetera. There is now a recent discussion about establishing a threat model for Node.js applications. This is kind of related to the whole Deno versus Node.js thing, in terms of the security aspect. So, this is being discussed there. As for my own thoughts, of course, I'd be happy if there were a way for us to essentially compartmentalize the capabilities of maybe specific modules, maybe the whole app, or whatever. But this is a real threat anyway, regardless of npm. And so, this is a good thing that we still need to fix.