Building a JS Engine -- For Fun!

Modern production-grade JS engines can seem intimidating, but that's not all there is! This lightning talk draws from my past experiences working on hobbyist JS engines, and it will cover the following:


- Different angles to approach this seemingly complex task

- Things you can learn along the way

- Some existing projects that optimize for fun (instead of chasing benchmarks), and what makes them unique


You'll see that you don't need a background in compiler design to get started exploring JS engine internals. Most importantly, it's fun!

Linus Groh
9 min
13 Jun, 2024

Video Summary and Transcription

The Talk discusses the basics of building a JS engine, highlighting the complexity and feature completeness of existing engines. It emphasizes the possibility of creating a simpler engine tailored to specific use cases and target audiences. The speaker suggests starting anywhere in the process and provides tips on using parser libraries, implementing runtime features, and ensuring correctness through testing. Additionally, the Talk encourages exploring JavaScript standards and engaging with the open-source community.

1. JS Engine Building Basics

Short description:

I'm Linus, and I work at Bloomberg on JavaScript infrastructure. Today, I want to talk about building a JS engine for fun. SpiderMonkey, V8, and JavaScriptCore are mature JS engines that made JavaScript viable at scale. However, they are complex and largely feature-complete, which makes it difficult to find something simple to implement. They are also tied to product roadmaps and require extensive browser integration. We can make our own JS engine instead, in whatever language we like, choosing the scope based on our use case and target audience.

I'm Linus, I work at Bloomberg on JavaScript infrastructure, and I want to talk briefly about something I've been doing for the last few years, which is building a JS engine for fun. But first of all, let's take a step back. We all know these, right? So, SpiderMonkey, V8, and JavaScriptCore from WebKit, and they are great. I'm not here to change your opinion on that. These are all very mature, highly optimised JS engines, and they basically made JavaScript viable at scale and made it as popular as it is today, both on the client and the server side. This is all good.

But they are also very complex. This has a reason, which is largely performance: they are not just simple interpreters, they have several tiers of just-in-time compilers that generate native code. And, obviously, they all exist in browsers, so they need a lot of integration with that, which makes them pretty intimidating and not as easy to get into. They are also largely feature-complete. These have been around for at least a decade, V8 and the others even longer, so it can be pretty difficult to find something to actually implement that is not purely performance-related or pretty complex. They are all very competitive, again for a reason: none of these want to fall behind, everyone wants to stay at the forefront of performance and compliance. But, again, that can take the fun out of it a little bit. Lastly, they're tied to product roadmaps, at least in some sense, as they exist in browsers and can't just do their own thing but have to follow along with everything else around them, which is more than just JavaScript.

Now, you can just show up and participate, but that is more the exception than the norm. Most people who work on these are employed at Google, or Apple, or Mozilla, and do this as a full-time job. But we can make our own, for fun, to change that, and just see how that works.

First of all, you need to pick a language to implement a language. Traditionally, that is C or C++, which gives you good speed by default and a lot of control over memory and allocations and all that. Now, that can be pretty scary to some people. Obviously, these languages are not known to be safe, and they have a lot of issues that we can't really fix because of history. But the good thing is you don't have to use C or C++. There is a long list of languages here. I found at least one implementation in all of these, there might be even more, but, you know, it could be whatever: Go, Java, JavaScript, any of these. Just pick whatever you want. You can even write one in raw x86 assembly if that's your thing. You choose.

Then, next up, you have to pick a scope. It could be very simple, just ES5: transpilers still exist, so you can use all the new features and compile them down. Or it could be anything in between, like ES6, mix it up, use the newer features. Or you decide to target the latest and greatest, commonly known as ESNext. Or you could implement custom extensions: for example, the QuickJS engine has a directive similar to "use strict", called "use math", that gives you some non-standard extensions. All of this depends a bit on what your use case and your target audience are. Obviously, a browser needs more support for various features than a simple implementation that serves as a plugin in, for example, a game engine. And there are also certain parts of the language that are commonly referred to as Annex B, which is just what they call it in the spec.

2. Building a JS Engine

Short description:

You can start anywhere when building a JS engine. Use a standalone parser library, or write your AST by hand at first. Choose what to implement on the run-time side. Follow the ECMA-262 specification and run tests to ensure correctness and handle edge cases. Around 50,000 tests are maintained alongside the specification, and test262.fyi tracks how each engine does on them. Along the way, you gain a better understanding of JavaScript and can interact with standards contributors on GitHub. Finally, explore existing projects that build JS engines for fun.

It kind of only exists for legacy reasons, and, if they could delete it, they totally would, but, you know, history.

Again, you choose. Then, next up, you type git init, and you're a bit lost where to even start. You might look at it using the script life cycle: first you parse the script, so obviously you need to start with the parser, then you need something that runs your script, and then you build the run-time at the end. That's not really true.

You can basically start anywhere. Thanks to a lot of JS tooling, like formatters and linters, we have standalone parser libraries that you can just take, and then you get a quick start. Or you just write your AST by hand in the beginning. That's something we did in the engine I worked on. Then, later on, you can add the parser, and you already have everything after it implemented, and it all fits together nicely. On the run-time side, again, that is huge, at least if you choose to implement all of it, so just pick whatever you want. It could be the classic String, Number, and Boolean prototypes, or something more exciting like typed arrays or proxies. Just start wherever you want. That is fine.
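To make the "AST by hand, parser later" idea concrete, here is a minimal sketch of a tree-walking evaluator fed with a hand-built tree. It is not taken from the engine mentioned in the talk. The node classes and the `evaluate` function are hypothetical names chosen for illustration (loosely following ESTree naming), and only numbers, variables, and three arithmetic operators are covered.

```python
from dataclasses import dataclass

# Hypothetical node types for illustration only, loosely named after ESTree.
@dataclass
class NumericLiteral:
    value: float

@dataclass
class Identifier:
    name: str

@dataclass
class BinaryExpression:
    operator: str
    left: object
    right: object

@dataclass
class VariableDeclaration:
    name: str
    init: object

@dataclass
class Program:
    body: list

def evaluate(node, env):
    """Walk the tree and compute a value; env maps variable names to values."""
    if isinstance(node, Program):
        result = None
        for statement in node.body:
            result = evaluate(statement, env)
        return result  # value of the last statement
    if isinstance(node, VariableDeclaration):
        env[node.name] = evaluate(node.init, env)
        return None
    if isinstance(node, NumericLiteral):
        return node.value
    if isinstance(node, Identifier):
        if node.name not in env:
            # Stand-in for a JS ReferenceError.
            raise NameError(f"{node.name} is not defined")
        return env[node.name]
    if isinstance(node, BinaryExpression):
        left = evaluate(node.left, env)
        right = evaluate(node.right, env)
        if node.operator == "+":
            return left + right
        if node.operator == "-":
            return left - right
        if node.operator == "*":
            return left * right
        raise NotImplementedError(f"operator {node.operator}")
    raise NotImplementedError(type(node).__name__)

# Hand-written AST for:  var x = 2; x * (x + 40)
program = Program([
    VariableDeclaration("x", NumericLiteral(2.0)),
    BinaryExpression(
        "*",
        Identifier("x"),
        BinaryExpression("+", Identifier("x"), NumericLiteral(40.0)),
    ),
])
result = evaluate(program, {})
```

Once a parser exists, it only has to produce the same node objects, and the evaluator stays unchanged.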

Then, obviously, you need to have a specification, known as ECMA-262. It's very complete, which is great. Nowadays, not a lot is unspecified. It largely looks like this, so you get pseudocode that you can roughly translate into your own code. That might not seem like a whole lot of fun, but you still get to do a bunch of custom stuff like optimizations. This is really just focusing on the correctness and behavior. Then you need some tests, obviously. After you've implemented some stuff, you need to make sure it works correctly and handles all the edge cases.
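The slide with the spec excerpt is not part of the transcript. As an illustration of translating spec pseudocode step by step, here is a sketch of the ToInt32 abstract operation from ECMA-262. The step comments paraphrase the spec rather than quote it, the function name `to_int32` is made up for this example, and step 1 (ToNumber) is assumed to have happened already, so the input is a Python float.

```python
import math

def to_int32(number: float) -> int:
    """Sketch of ECMA-262 ToInt32, assuming ToNumber was already applied."""
    # Step 2: if number is not finite, or is +0 or -0, return +0.
    if not math.isfinite(number) or number == 0:
        return 0
    # Step 3: let int be the number truncated toward zero.
    integer = math.trunc(number)
    # Step 4: let int32bit be int modulo 2^32 (Python's % is non-negative here).
    int32bit = integer % 2**32
    # Step 5: values at or above 2^31 wrap around to the negative range.
    if int32bit >= 2**31:
        return int32bit - 2**32
    return int32bit
```

Each spec step maps to one or two lines, which is what makes this style of implementation easy to review against the specification text.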

Great news: you get 50,000 tests for free, which are maintained next to the specification, so it's an actual official thing. It's a requirement nowadays for everything that goes into the language that it has new tests added to the test suite, so new features are known to be well tested. There's this wonderful website called test262.fyi that tracks the results of all the engines out there and updates them every day, so you can see in great detail which engine implements which thing how well. Personally, I find that having a graph that goes up over time is very good for motivation.
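Running those tests means reading the metadata block at the top of each file. The following is a rough sketch of that first step, assuming the usual test262 layout of a YAML block between `/*---` and `---*/`. The sample file is invented for this example, `read_metadata` and `modes_to_run` are hypothetical helpers, and only one-line bracketed lists are handled; a real runner would use a proper YAML parser and cover more flags (for example `module`).

```python
import re

# Invented sample in the style of a test262 file, not a real test.
SAMPLE_TEST = """\
// Copyright notice would go here.
/*---
description: Hypothetical sample, not a real test262 file
includes: [compareArray.js]
flags: [onlyStrict]
---*/
assert.sameValue(1 + 1, 2);
"""

FRONTMATTER = re.compile(r"/\*---(.*?)---\*/", re.DOTALL)
FLOW_LIST = re.compile(r"^(\w+):\s*\[(.*?)\]\s*$", re.MULTILINE)

def read_metadata(source: str) -> dict:
    """Extract one-line list fields such as `flags: [a, b]` from the metadata block."""
    match = FRONTMATTER.search(source)
    if match is None:
        return {}
    metadata = {}
    for key, items in FLOW_LIST.findall(match.group(1)):
        metadata[key] = [item.strip() for item in items.split(",") if item.strip()]
    return metadata

def modes_to_run(metadata: dict) -> list:
    """Decide whether a test runs in sloppy mode, strict mode, or both."""
    flags = metadata.get("flags", [])
    if "onlyStrict" in flags:
        return ["strict"]
    if "noStrict" in flags or "raw" in flags:
        return ["sloppy"]
    return ["sloppy", "strict"]

metadata = read_metadata(SAMPLE_TEST)
```

From there, a runner would prepend the harness files named in `includes`, execute the test once per mode, and record pass or fail, which is the number that ends up on the graph.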

More generally, you can obviously learn a huge amount if you do this: how parsers and interpreters work, code generation if you do a bytecode engine, native code if you even want to do a JIT, certain optimizations. For me, a huge thing was a better understanding of JavaScript itself. Once you implement it, you really, really understand how things work under the hood. There are a few low-level concepts, and then one of my favorites is that you get exposed to standards. These new features don't appear out of nowhere, and you get to interact with the people who add new things to JavaScript, and you can even get involved. It's all on GitHub.

Now, I do this. I'm a delegate for Bloomberg, but I started out as an invited expert in TC39, which is the standards body. They reached out one day and said, like, this is cool what you work on, do you want to get involved? And then lastly, here's a list of a few projects that you should definitely check out if this sounds interesting to you. They don't aim to compete with V8 and all these big engines, they just do it for fun, and there are a few of them. Definitely check those out.

And that is that. If you have any questions, please find me afterwards or online. Thanks.