The Core of Turbopack Explained (Live Coding)

Tobias Koppers
29 min
01 Jun, 2023


Video Summary and Transcription

Tobias Koppers introduces TurboPack and TurboEngine, addressing the limitations of Webpack. He demonstrates live coding to showcase the optimization of cache validation and build efficiency. The talk covers adding logging and memoization, optimizing execution and tracking dependencies, implementing invalidation and a watcher, and storing and deleting invalidators. It also discusses incremental compilation, integration with other monorepo tools, error display, and the possibility of a plugin system for TurboPack. Lastly, a comparison with Bun's bundler is mentioned.

1. Introduction to TurboPack and TurboEngine

Short description:

I'm Tobias Koppers, the creator of TurboPack and TurboEngine. I'm here to demonstrate live coding in JavaScript, focusing on the core of TurboEngine. The motivation behind TurboPack is to address the limitations of Webpack in handling large applications and incremental builds. TurboPack introduces a new architecture to optimize cache validation and improve build efficiency. I will showcase a simple application that copies JavaScript files based on the dependency graph, with the addition of a copyright header. Through live coding, I will explain the process and demonstrate how TurboEngine enhances incremental builds. Let's get started!

Thanks for having me. I'm trying something new: I'm trying to do live coding today, so I hope it works out. My name is Tobias Koppers. I worked on Webpack for 10 years, and now I've joined Vercel and work on TurboPack, trying to do something new, something better. As I've said, I want to focus on one aspect of TurboPack and explain in a bit more detail how the core of TurboEngine works. To demo that, I'm going to live code a little bit of the core of TurboEngine in JavaScript.

So the motivation is that with Webpack we saw applications grow larger and larger, and the architecture of Webpack is not built for that: incremental builds tend to get slower and slower as the application grows. It's not that huge of a problem yet, but it might become one in a few years when applications have millions of modules or whatever. A few problems we isolated were that we do a lot of cache lookups, and we have to compute a lot of stuff to do cache validation, like checking if files are still the same and hashing stuff. That is the problem, because you pay all this overhead on every incremental build. We wanted a new architecture to tackle this problem, and that's why we created TurboEngine and TurboPack. I can explain that architecture a little bit by doing live coding.

What I want to show is a small application which is super simple. It's not a bundler, but something similar to a bundler: it takes any JavaScript application and copies it over to another folder by following the dependency graph, and while doing that it also adds a copyright header, just to demo something. I start with the basic application written in JavaScript and explain it, and then I try to add something similar to TurboEngine to make it more efficient and to make incremental builds possible, in a similar way to how it works in TurboPack, in Rust, with TurboEngine. For that I prepared this little application. It's really simple, it's just a bunch of Node.js. We use Acorn to parse the modules and get the dependency graph.

And I'll go through the application a little bit so you understand it. The main process is really simple: we get the base directory, which is the source directory, we have an output directory, and then we have an entry file, which is actually this file we're looking at. So we're actually copying the application itself to another folder. Then we start following the dependency graph from that entry point and copy it from the base dir to the output dir. And to make it a little bit more complicated, I add this header file which is basically, let me show it, a copyright statement that should be added to every file to make it a little bit more interesting.

So then we invoke this function, copy graph, which computes the output path from the current file by just relocating it. It calls the copy function, which copies the file, super simple. Then it calls another function, get references, which we'll see later. It gets the references, meaning all the files that have been imported from one file, and then it loops over them and calls itself recursively to copy the whole application.

Copy is also pretty simple: read the header, read the file, and write it to another file. Nothing super complicated here. Get references is a little bit more complicated, but you don't really have to understand it. It calls parse to get an AST out of the module and then does some magic to extract the import statements, returning a list of all files referenced by that file. Parse is also pretty simple: it reads the file, calls Acorn, which is a JavaScript parsing library, and returns the AST. After that I start the whole thing, and that should copy the application to the new folder. So let's try it. Oh, a few things I want to explain first. I also have this task function, which is actually doing nothing currently.
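The walkthrough above can be sketched roughly as follows. This is not the code from the talk: to keep the sketch self-contained, files live in an in-memory `Map` instead of on disk, a regular expression stands in for the Acorn-based import extraction, and the file names and contents are made up for illustration. Only the function names `copyGraph`, `copy`, and `getReferences` follow the talk.

```javascript
// Minimal sketch of the demo app. Assumptions: an in-memory "file system"
// and a regex instead of Acorn; all file names and contents are hypothetical.
const path = require("node:path");

const files = new Map([
  ["src/header.txt", "// Copyright (c) Example\n"],
  ["src/index.js", 'import "./task.js";\nimport "./utils.js";\n'],
  ["src/task.js", 'import "./utils.js";\n'],
  ["src/utils.js", "export const x = 1;\n"],
]);

let copyCalls = 0; // counts how often copy() runs, to make duplicate work visible

// Stand-in for parse() plus the AST walk: list the relative imports of `file`.
function getReferences(file) {
  const source = files.get(file);
  const dir = path.posix.dirname(file);
  const importPattern = /import\s+(?:[^"']*?from\s+)?["'](\.[^"']+)["']/g;
  const refs = [];
  for (const match of source.matchAll(importPattern)) {
    refs.push(path.posix.join(dir, match[1]));
  }
  return refs;
}

// Read the header and the file, then write header + content to `output`.
function copy(file, output, header) {
  copyCalls++;
  files.set(output, files.get(header) + files.get(file));
}

// Relocate `file` from baseDir to outDir, copy it, then recurse into its imports.
// Note: without a cache this would loop forever on circular imports.
function copyGraph(baseDir, outDir, file, header) {
  const output = path.posix.join(outDir, path.posix.relative(baseDir, file));
  copy(file, output, header);
  for (const ref of getReferences(file)) {
    copyGraph(baseDir, outDir, ref, header);
  }
}

copyGraph("src", "dist", "src/index.js", "src/header.txt");
```

In this sketch `utils.js` is imported from two modules, so it gets copied twice, which is the kind of duplicate work the rest of the talk sets out to remove.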

2. Adding Logging and Memoization

Short description:

The task function basically only adds some logging, so you can actually see what the application is doing. We add more logic to it later. The first step is to add some kind of memoization system, which works like a cache. We store the cache in a map. With that we have memoization, and it's pretty simple, actually.

So it's basically only adding some logging so you can actually see what the application is doing. Otherwise it just prints nothing, and that's pretty boring. So it only has logging: what I do is basically call the function with logging. You can see it here; it's straightforward and not doing anything special.

We'll add more logic to that later. So what you end up seeing is this whole application running: it's calling main, calling copy graph, calling copy, and calling all these functions in a kind of tree-like manner. This is basically a stack trace.
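A logging-only wrapper of this kind could look like the sketch below. The name `task` comes from the talk; the `depth` counter, the `log` array, and the small example functions are assumptions added for illustration, and the sketch is synchronous, which the original may not be.

```javascript
// Sketch of a task() wrapper that only logs. Indentation by call depth gives
// the tree-like, stack-trace-style output described above.
let depth = 0;
const log = []; // hypothetical: collected lines, in addition to console output

function task(fn, ...args) {
  const line = `${"  ".repeat(depth)}${fn.name}(${args.join(", ")})`;
  log.push(line);
  console.log(line);
  depth++;
  try {
    return fn(...args); // no caching yet: every call runs the function again
  } finally {
    depth--;
  }
}

// Tiny stand-ins for the real functions, just to produce some output.
function readHeader() {
  return "// Copyright\n";
}
function copy(file) {
  return task(readHeader) + file;
}
function main() {
  task(copy, "a.js");
  task(copy, "b.js");
}

task(main);
```

Running this logs `readHeader()` once under each `copy(...)` call, so the repeated reads are easy to spot in the output.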

But you also see a lot of problems with this application. For example, we're reading the header a bunch of times, like here and here and here. And we're also calling copy graph multiple times for the same file, because that file is referenced from more than one module. So we're doing a lot of duplicate work, and that's what we want to avoid.

The first step is basically to add some kind of memoization system, which is like a cache. So if you execute the same function twice, we just return the existing result. So let's add that. To add a cache, we have to store it somewhere, and in JavaScript we can just use a map for that. As a first step, we want to get the task from the map, keyed by the function. Actually, we want to key it by the function and all its arguments, because you can call the same function with different arguments, which is basically a different task. And then, if we don't have a task, we can just create one. That means we create a new object which has a result, which is undefined for now, and then we set the result, which is basically what we were doing before, so I copy that one here. And then, in any case, we return the result. So now we should have some kind of memoization system.

I missed some stuff, so I actually have to set the cache, yes, like this. And there's a bug, you probably see it if you're a JavaScript developer: the map doesn't work with arrays as keys, because keys are compared by identity. So what we actually need to do is store the task by the value of the key, and for that I prepared something called a TupleMap, which I need to import. Copilot, don't do it wrong. And now we should have this kind of memoization. It's pretty simple, actually.
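The memoization step, including the array-key problem, can be sketched like this. The `TupleMap` in the talk is a prepared helper whose implementation is not shown, so the class below is only one possible way to build it (a trie of nested `Map`s); `readHeader`, `reads`, and the file names are hypothetical.

```javascript
// Hypothetical TupleMap: nested Maps, one level per tuple element, so two
// different array objects with the same elements resolve to the same entry.
class TupleMap {
  constructor() {
    this.root = { children: new Map(), hasValue: false, value: undefined };
  }
  get(tuple) {
    let node = this.root;
    for (const part of tuple) {
      node = node.children.get(part);
      if (!node) return undefined;
    }
    return node.hasValue ? node.value : undefined;
  }
  set(tuple, value) {
    let node = this.root;
    for (const part of tuple) {
      let next = node.children.get(part);
      if (!next) {
        next = { children: new Map(), hasValue: false, value: undefined };
        node.children.set(part, next);
      }
      node = next;
    }
    node.hasValue = true;
    node.value = value;
    return this;
  }
}

const cache = new TupleMap();

// Memoizing task(): the same function with the same arguments is one task.
function task(fn, ...args) {
  const key = [fn, ...args];
  let t = cache.get(key);
  if (!t) {
    t = { result: undefined }; // result is undefined until the function ran
    cache.set(key, t);
    t.result = fn(...args);
  }
  return t.result;
}

let reads = 0;
function readHeader(file) {
  reads++;
  return `// header from ${file}\n`;
}

task(readHeader, "header.txt"); // runs readHeader
task(readHeader, "header.txt"); // cache hit, readHeader is not called again
task(readHeader, "other.txt"); // different argument, so a different task

// For comparison: a plain Map compares array keys by identity and misses.
const plain = new Map();
plain.set([readHeader, "header.txt"], "cached");
const plainHit = plain.has([readHeader, "header.txt"]);
```

Here `reads` ends up at 2 rather than 3, while `plainHit` is `false` because the two arrays are distinct objects. Arguments are still compared by identity at each level of the trie, which is fine for strings and function references like the ones used here.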
