At first glance, it seemed pretty limited in impact: it would basically be a good way to crash some code. However, multiple cases of Remote Code Execution have happened based on this vector.
In this talk, we will clarify what prototype pollutions are, their real impact, and how to prevent them from happening in your codebase.
AI Generated Video Summary
2. Understanding Prototype Pollution
3. Understanding Prototype Chain
The prototype chain forms a tree structure. Objects created with different methods may or may not share prototypes. Item three has a prototype, my proto, while item one and item two have the prototype of class one. The prototype of old style class is still in the chain. The method bar is not available on cl.prototype but is available on old style class.prototype.
So prototype chain, basically, it's a tree. So on the left hand side, I defined a prototype in my proto. I use that to create objects. I defined a class with a constructor, old style class. I give it a prototype directly, and I even define a class with the class keyword. And basically when I create objects on the right hand side with these, they will share prototypes or not.
Okay. Let's go a bit deeper. Item three has a prototype, my proto, because on line eight of the right hand side code, we call Object.setPrototypeOf on this object. So we have a method to arbitrarily put a prototype on an object. But item one and item two have the prototype of class one, because we created them with new and called the constructor of the class at the bottom of the left hand side. But since this class extends old style class, well, the prototype of old style class, defined on lines eight to 13 on the left hand side, is still on the chain. So if I want to check the method bar on item one or item two, it won't be available on cl.prototype but it will be available on old style class.prototype. Okay. Is that clear? I hope. Please don't throw stuff at me. Okay, thanks.
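The slides themselves are not part of this transcript, so here is a minimal sketch of what the chain described above could look like. The names myProto, OldStyleClass, Class1 and the methods foo, bar and baz are assumptions reconstructed from the spoken description, not the speaker's actual code.

```javascript
// Hypothetical reconstruction of the slide; all names are assumptions.
const myProto = { foo() { return 'foo from myProto'; } };

// Old-style "class": a constructor function whose prototype is set directly.
function OldStyleClass() {}
OldStyleClass.prototype.bar = function () { return 'bar from OldStyleClass'; };

// A class declared with the class keyword, extending the old-style one.
class Class1 extends OldStyleClass {
  baz() { return 'baz from Class1'; }
}

const item1 = new Class1();
const item2 = new Class1();
const item3 = {};
Object.setPrototypeOf(item3, myProto); // arbitrarily put a prototype on an object

const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// item1 and item2 share Class1.prototype; item3 has myProto.
console.log(Object.getPrototypeOf(item1) === Class1.prototype); // true
console.log(Object.getPrototypeOf(item1) === Object.getPrototypeOf(item2)); // true
console.log(Object.getPrototypeOf(item3) === myProto); // true

// bar lives one level further up the chain, on OldStyleClass.prototype.
console.log(hasOwn(Class1.prototype, 'bar')); // false
console.log(hasOwn(OldStyleClass.prototype, 'bar')); // true
console.log(item1.bar()); // 'bar from OldStyleClass'
```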
4. Accessing Object Prototypes
So how do we access the prototype of an object? Yeah. But the people telling me it was clear know everything about prototypes. So how do we access the prototype of an object? We have multiple ways. So let's have a class named my class, because I'm very original in the way I name my classes, and create two items, my item and my item two. If we check whether show prop, which is a method of the class, is available on these objects, it is not, because hasOwnProperty on line 13 tells us it's not available on the object directly. But if we do Object.getPrototypeOf on my item, this will return the prototype of the class. And this has the own property show prop, that's what we see on line 14. We can also access a prototype with __proto__. So if we do my item.__proto__.hasOwnProperty with show prop, it will return true. And if we console.log my item.constructor.prototype.hasOwnProperty with show prop, it will return true too. And what is worth noticing is that there is a single instance of this prototype in the heap, meaning my item.__proto__ is exactly the same thing as my item two.__proto__. Pretty sure nobody has said prototypes that much in their life in such a short amount of time.
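As a rough illustration of the three access paths just described, here is a small sketch. The identifiers MyClass, myItem, myItem2 and showProp are assumed spellings of the names spoken in the talk.

```javascript
// Assumed spellings of the names mentioned in the talk.
class MyClass {
  showProp() { return 'prop'; }
}

const myItem = new MyClass();
const myItem2 = new MyClass();

// The method is not an own property of the instance...
console.log(myItem.hasOwnProperty('showProp')); // false

// ...but it is an own property of the prototype, reachable three ways.
console.log(Object.getPrototypeOf(myItem).hasOwnProperty('showProp')); // true
console.log(myItem.__proto__.hasOwnProperty('showProp')); // true
console.log(myItem.constructor.prototype.hasOwnProperty('showProp')); // true

// Both instances point at the very same prototype object.
console.log(myItem.__proto__ === myItem2.__proto__); // true
```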
5. Understanding Prototype Pollution
Prototype pollution occurs when an arbitrary payload can overwrite properties or methods on the prototype chain of objects. This can happen when using a merge function. A specific example is shown with the Hoek library, where a malicious payload is used to modify the prototype chain. The impact of prototype pollution can be severe, with around 200 disclosed vulnerabilities since 2018, including remote code execution in Kibana and Parse Server.
So what is the impact of prototype pollution? Because it sounds like a good way to mess with someone's code base, but can it be evil? So since 2018, there have been around 200 CVEs. CVEs are public vulnerabilities disclosed for everyone to know, and not every vulnerability gets a CVE. So that's probably the tip of the iceberg. There have been a few remote code executions, in Kibana and in Parse Server, and no, I'm not working at Datadog anymore, so I can say what I think about Kibana without looking so well.
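To make the merge scenario concrete, here is a minimal sketch of a naive recursive merge and a JSON payload that pollutes Object.prototype through it. The function naiveMerge and the isAdmin property are hypothetical; this is not the code of any specific library.

```javascript
// Hypothetical vulnerable merge: it recurses into every key, including __proto__.
function naiveMerge(target, source) {
  for (const key of Object.keys(source)) {
    const value = source[key];
    if (value !== null && typeof value === 'object') {
      if (typeof target[key] !== 'object' || target[key] === null) {
        target[key] = {};
      }
      // When key is "__proto__", target[key] is Object.prototype itself.
      naiveMerge(target[key], value);
    } else {
      target[key] = value;
    }
  }
  return target;
}

// JSON.parse creates "__proto__" as an ordinary own key, so Object.keys sees it.
const payload = JSON.parse('{"__proto__": {"isAdmin": true}}');
naiveMerge({}, payload);

// Every plain object now appears to have isAdmin.
const pollutedBefore = ({}).isAdmin;
console.log(pollutedBefore); // true

// Clean up so the rest of the process is not affected by the demo.
delete Object.prototype.isAdmin;
const pollutedAfter = ({}).isAdmin;
console.log(pollutedAfter); // undefined
```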
6. Prototype Pollution in Kibana and Parse Server
Parse Server. So Parse, who is familiar with Parse here? It was very, very popular a while ago. It's basically a backend project for mobile applications that exposes an API in front of a MongoDB server. It was acquired by Facebook, shut down by Facebook, and now there is only an open source version that only Google shut down product people love. And basically that's an API in front of MongoDB. You can store objects.
7. Prototype Pollution in MongoDB
You can request objects from MongoDB through a web API. It was vulnerable to prototype pollution before it was fixed. The library used, bsonjs, allows storing functions in MongoDB. By default, functions are not unserialized. However, if the evalFunctions option for the bson library is true, arbitrary functions can be evaluated. This can lead to running arbitrary code when retrieving objects from the database.
You can request objects from MongoDB through a web API. It was vulnerable to prototype pollution before it was fixed, of course. And it uses a library that's named bsonjs. So, BSON is a format to store objects in MongoDB. It stands for binary JSON or something like that. But BSON allows you to store functions, which will be stored in MongoDB and which you can unserialize. But by default, they are not unserialized. Because unserializing a function that would come from a database would basically mean, let's do eval on that string that comes from the database I have no idea about, and you don't want that to be a default. But if the evalFunctions option on the options object passed to the bson library is true, well, you will be evaluating those arbitrary functions. And because Parse allows you to write pretty much anything you want in your database from the network, because it's Parse, you could actually run arbitrary code when the object is retrieved, which is, oh my god, that's because they check the diversion hack.
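The mechanism can be sketched without the real library. The deserializeSketch function below is entirely hypothetical and is not the bson API; it only mimics the idea of an options object whose evalFunctions flag is read through the prototype chain, so a polluted Object.prototype silently turns the flag on.

```javascript
// Hypothetical deserializer, NOT the real bson API. Stored functions are
// represented here as { $code: '<source string>' } purely for illustration.
function deserializeSketch(doc, options = {}) {
  const out = {};
  for (const [key, value] of Object.entries(doc)) {
    // options.evalFunctions is looked up on the prototype chain if not set.
    if (value && value.$code !== undefined && options.evalFunctions) {
      out[key] = new Function(`return (${value.$code})`)(); // evaluates a string
    } else {
      out[key] = value;
    }
  }
  return out;
}

const stored = { greet: { $code: 'function () { return "ran"; }' } };

// Default behavior: the flag is absent, the function is left as inert data.
const safe = deserializeSketch(stored);
console.log(typeof safe.greet); // 'object'

// After pollution, every options object inherits evalFunctions = true.
Object.prototype.evalFunctions = true;
const revived = deserializeSketch(stored);
delete Object.prototype.evalFunctions; // clean up the demo pollution

console.log(typeof revived.greet); // 'function'
console.log(revived.greet()); // 'ran'
```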
8. Preventing Prototype Pollution
To prevent prototype pollution, filter keys in merge functions and specifically remove __proto__. Lodash has fixed all known instances of prototype pollution. When using hasOwnProperty, ensure the property exists on the object and not on its prototype chain. Building defensive objects using Object.create(null) can prevent prototype pollution. Sanitization and data validation are crucial for preventing outside attacks. Consider using libraries like Joi for data sanitization when building a Node.js web server. Node.js has an option to disable __proto__, but be cautious as it may break some code.
How to prevent prototype pollution? Because I'm a responsible person, I don't want you to feel scared and say, you know, let's use a language without prototypes, like Python. How to prevent? Well, let's filter keys in, you know, merge functions. You see, for instance on line nine here, line nine, nine, three, or on line four here, that we filter out __proto__. And that's how a lot of libraries have been fixed. Lodash has been adding more prototype pollution than any other library I know, and they've all been fixed one by one. If you find a new one, feel free to responsibly disclose it to the maintainers, whatever library it is.
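A minimal sketch of that kind of filtering could look like the following. The safeMerge function is hypothetical, and skipping constructor and prototype in addition to __proto__ is an extra assumption on top of what the talk mentions.

```javascript
// Keys that should never be followed during a merge. Only __proto__ is
// mentioned in the talk; the other two are an added assumption.
const FORBIDDEN_KEYS = new Set(['__proto__', 'constructor', 'prototype']);

const hasOwnKey = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Hypothetical defensive merge: skips dangerous keys and only recurses
// into properties that the target owns itself.
function safeMerge(target, source) {
  for (const key of Object.keys(source)) {
    if (FORBIDDEN_KEYS.has(key)) continue;
    const value = source[key];
    if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
      const current = hasOwnKey(target, key) ? target[key] : undefined;
      if (current === null || typeof current !== 'object') {
        target[key] = {};
      }
      safeMerge(target[key], value);
    } else {
      target[key] = value;
    }
  }
  return target;
}

const incoming = JSON.parse('{"__proto__": {"polluted": true}, "a": {"c": 2}}');
const merged = safeMerge({ a: { b: 1 } }, incoming);

console.log(merged.a.b, merged.a.c); // 1 2
console.log(({}).polluted); // undefined
```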
Sometimes, you will know that your code path is critical and you want to make sure that you're using hasOwnProperty. Well, hasOwnProperty can be tampered with through third-party attacks, but that's something else. So make sure that if you expect a property to exist on an object, it exists on the object itself and not on its prototype chain. Also, this one I like. It's what I call building defensive objects. I don't know if that's the academic term, but you can use Object.create, and that will create a new object with its argument as prototype. Well, null is an object. So you can do Object.create(null). These objects won't have all the methods you expect them to have, like hasOwnProperty and the other methods inherited from Object.prototype, but they will be safe from prototype pollution because they don't have any prototype.
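Both ideas can be illustrated in a few lines. This sketch deliberately pollutes Object.prototype with a hypothetical debug property, shows how an own-property check and a prototype-less object react, then cleans up.

```javascript
// Simulate a pollution for the duration of the demo (hypothetical property).
Object.prototype.debug = true;

const config = {};

// A plain lookup walks the prototype chain and sees the polluted value.
const inheritedValue = config.debug;
console.log(inheritedValue); // true

// An own-property check, called through Object.prototype so it cannot be
// shadowed by a property on the object itself, is not fooled.
const isOwn = Object.prototype.hasOwnProperty.call(config, 'debug');
console.log(isOwn); // false

// A "defensive" object has no prototype at all, so nothing is inherited.
const bare = Object.create(null);
const bareValue = bare.debug;
console.log(bareValue); // undefined
console.log(Object.getPrototypeOf(bare)); // null

// The trade-off: the usual inherited methods are gone too.
console.log(typeof bare.hasOwnProperty); // 'undefined'

delete Object.prototype.debug; // clean up the demo pollution
```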
Sanitization: make sure that the stuff that gets into your process from the outside is safe. Do data validation. I love the Joi library because I'm a hapi fanboy, but there are a lot of amazing libraries to do data sanitization. Use them. They are very cool. And anyway, you should use them if you're building a web server with Node.js. As mentioned, that will also probably reduce your attack surface for NoSQL injection. So go for it.
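In practice a schema library would do this job. To keep the example free of dependencies, here is a hand-rolled sketch of the same idea: an allow-list validator that rejects any unexpected key, __proto__ included. The validateUser function and its name and age fields are hypothetical.

```javascript
// Hypothetical allow-list validator; a schema library would normally do this.
const ALLOWED_KEYS = ['name', 'age'];

function validateUser(input) {
  if (input === null || typeof input !== 'object' || Array.isArray(input)) {
    throw new TypeError('expected a plain object');
  }
  for (const key of Object.keys(input)) {
    if (!ALLOWED_KEYS.includes(key)) {
      throw new TypeError(`unexpected key: ${key}`);
    }
  }
  if (typeof input.name !== 'string' || input.name.length === 0) {
    throw new TypeError('name must be a non-empty string');
  }
  if (!Number.isInteger(input.age) || input.age < 0) {
    throw new TypeError('age must be a non-negative integer');
  }
  // Return a fresh object containing only the validated fields.
  return { name: input.name, age: input.age };
}

const good = validateUser(JSON.parse('{"name": "Ada", "age": 36}'));
console.log(good); // { name: 'Ada', age: 36 }

let rejected = false;
try {
  validateUser(JSON.parse('{"name": "Eve", "age": 1, "__proto__": {"isAdmin": true}}'));
} catch (err) {
  rejected = err instanceof TypeError;
}
console.log(rejected); // true
```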
Conclusions. Oh my god, I'm on time. What's now? Monitor incoming objects. Node.js has an option to disable __proto__ (the --disable-proto flag). It might break some code. So be warned that it might break some code, because the Internet, but you can use it. And go for sanitization and prototype-less objects. Oh, you remember when I told you that you should use Python? That was a joke in January.
9. Mitigating Prototype Pollution
Someone published a paper about class pollution in Python. There is no proof of actual use in the wild for malicious attacks, but it highlights the vulnerability. It's important to check where your objects come from in your codebase, especially for web applications that accept objects from the outside. Be cautious of third-party attacks from npm modules and inputs from the network. Sanitize and validate the objects that enter your app to prevent injections and ensure they match your expectations. Use tools like Snyk and npm audit to check for known vulnerabilities in your codebase.
Someone published a paper. There is no proof of actual use in the wild for malicious attacks, but there has been a paper about class pollution saying that, oh, basically Python is vulnerable to that too. So there's nowhere safe.
Some links. The slides will be on Twitter. Please shout on Twitter if you want the slides. So let's stay in touch. You can find my Twitter with this short URL.
I hope you enjoyed this presentation and you have questions that I can answer. Thanks so much for being an amazing group. We will come to this question in due course. There was a lot of suggestions there on how you can mitigate the risks brought about by prototype pollution. If our audience were to go home and go to their code bases that they're currently working on and with the concerns that they may now have, what would be the first thing that you encourage people to do? Maybe it's a more simple low lift action or starting off a more significant piece of work.
Thank you so much. So some questions from the audience. Is there an easy way, I think you touched on this, but I'll ask the question explicitly regardless. Is there an easy way to check if my service is vulnerable to this attack, as someone who uses a lot of third-party npm modules? With Snyk or npm audit, you will already know the vulnerable methods, if they are known. Check if you're using merge methods in your own code, but yeah, basically making sure you don't have known vulnerabilities in your codebase is the first step. Ideally, frameworks should be able to handle these kinds of things for us.
10. Questions and Answers
Are they? Is there something like Express or Fastify or equivalents that prevents prototype pollution? As far as I know, no. I'm not up to date on the Fastify documentation. On the Express documentation, I'm up to date because it hasn't changed in five or six years, but I don't think so. That's a very good point. I guess Matteo will say PRs are welcome, so feel free to PR Fastify.
Does Object.assign also create polluted prototypes? That's a good question. I want to say no, but I'm not sure of it. So that's your homework for tonight. I don't think so, but it's worth checking. Didn't think you were coming to Node Congress for that homework. Are there other ways of doing RCE with prototype pollution without node options, like in the Kibana example? Well, there was the BSON example with a function being unserialized. So I guess so, in a way. That really depends on the application you are attacking: does this application have string evaluation at some point, whether it's through environment variables, through eval, or through the vm module? In that case, yes, but it's very business logic dependent. Thank you. Got a few more you want to get through. Should we, based on your talk, therefore be using maps more, instead of objects, to avoid these problems? I mean, yes. Yes. I mean, if you're very, very cautious about that, you need to ensure that there is no intrinsic pollution, meaning that someone overrides the map-based methods. They can't do that, as far as I know, with prototype pollution, but they can do that with a malicious third party. The Node.js code base is actually very defensive against that, so you can check. But yeah, maps are probably one of the smartest things you can use in your web application as far as I'm concerned. I love maps. If you've been to James' talk about asynchronous storage, the first version I proposed of asynchronous storage would force you to use a map as the store, so as a map lover, I would say yes, but I'm biased. Just a tad. What if I merge an object with the native spread operator instead of an old version of lodash? I think you're safe from prototype pollution, but it's like Object.assign, I never tried it, so I can't be assertive on that.
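For anyone doing the homework, here is a small experiment covering the three cases raised in these questions. It is a sketch under the assumption that the payload arrives through JSON.parse; it only checks the global Object.prototype and the resulting objects, and does not claim that these operations are safe in every situation.

```javascript
// JSON.parse creates "__proto__" as an ordinary own property.
const payload = JSON.parse('{"__proto__": {"polluted": true}, "name": "demo"}');

// Object.assign performs a normal assignment, which triggers the __proto__
// setter: the TARGET's prototype is replaced, Object.prototype is untouched.
const assigned = Object.assign({}, payload);
console.log(Object.getPrototypeOf(assigned) === Object.prototype); // false
console.log(assigned.polluted); // true

// Object spread defines own properties instead of assigning, so the result
// simply gets an own "__proto__" key and keeps its normal prototype.
const spread = { ...payload };
console.log(Object.getPrototypeOf(spread) === Object.prototype); // true
console.log(Object.prototype.hasOwnProperty.call(spread, '__proto__')); // true
console.log(spread.polluted); // undefined

// In both cases, unrelated objects are not affected.
console.log(({}).polluted); // undefined

// A Map treats "__proto__" as just another key.
const store = new Map();
store.set('__proto__', { polluted: true });
console.log(store.get('__proto__').polluted); // true
console.log(({}).polluted); // undefined
```

The Object.assign result is worth noticing: the global prototype stays clean, but the returned object itself now inherits attacker-controlled properties, so validating incoming keys still matters.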
I think with a lot of these questions it is probably the case that, as you're trying to solve for this risk, you are trying all the solutions and seeing what the outcome is, right? Also, I kind of hope that if these ways were vulnerable to prototype pollution, we would vastly know it as a community, so that's why I tend to think we are safe, but as a security person, I don't want to assume things. And you don't want to assume either. Are there, or could there be, some kind of ESLint rules that detect these problems? Good question. There could be a need. It's hard because it needs a bit of taint tracking.
11. Semgrep and Merging Objects
Semgrep is a powerful static analysis tool that can find vulnerabilities by running part of your code in a VM. It's open source, and a Semgrep rule could check whether you're merging based on incoming objects.
There could be at least a Semgrep rule that will check that you're merging based on incoming objects. So Semgrep, for those who are not familiar, is a static analysis tool. It's open source. It's written in OCaml, built on the west coast, it's really, really cool, and it's designed to find vulnerabilities. But it's a bit smart. It's more powerful than most linting tools because it has some kind of symbolic execution engine and can basically run part of your code in a VM. I mean, it executes part of your code and decides if it's vulnerable. So I'm not sure about a linter, but I'm pretty sure about Semgrep. Cool. Awesome.