Challenges for Incremental Production Optimizations


We will look into the usual optimizations bundlers apply to production builds and how they can be combined with incremental builds.

Tobias Koppers
32 min
13 Jun, 2024


Video Summary and Transcription

TurboPack is a new bundler similar to Webpack, focusing on incremental builds to make them as fast as possible. Challenges in production builds include persistent caching, incremental algorithms, and optimizing export usage. The compilation process can be split into parsing and transforming modules, and chunking the module graph. TurboPack aims to achieve faster production builds through incremental optimization and efficiency. Collaboration and compatibility with other ecosystems are being considered, along with the design of a plugin interface and tree-shaking optimization.

1. Introduction to TurboPack and Incremental Builds

Short description:

I work on TurboPack, a new bundler similar to Webpack but designed from scratch. Our mission is to focus on incremental builds, making them as fast as possible. We want developers to spend their time on incremental builds, and only have to do the initial build once. This talk covers the unique challenges of production builds.

So, my name is Tobias Koppers, and I work at Vercel on TurboPack. TurboPack is a new bundler we're working on, similar to Webpack, but we're designing it from scratch. We're using it for Next.js, and our mission for TurboPack from the beginning was to focus on incremental builds. We want to make incremental builds as fast as possible, even if we have trade-offs on initial builds, because we think that most developers spend most of their time waiting on incremental builds, because that is what you do while developing.

On the other hand, we also try to make every build incremental. We try to make it so that you only have to do your initial build once, and then spend the remaining time only on incremental builds. That also means we don't want you to lose your cache if you upgrade the Next.js version or the TurboPack version, or if you upgrade your dependencies. This also includes production builds, so we want to focus on production builds too. That's what this talk is about. Production builds have some unique challenges I want to cover, and I'll go a little bit over them.

2. Challenges and Optimisations in Production Builds

Short description:

There are several common production optimisations for bundlers, including tree-shaking, export mangling, module IDs, chunking, scope hoisting, dead code elimination, minification, and content hashing. These optimisations pose different challenges for incremental builds, because several of them are whole-application optimisations. For incremental production builds, we need to have at least two ingredients.

So, if we look at both sides: on one side we have TurboPack, which is really focused on incremental builds, and on the other hand we have production builds, which are really optimised. That doesn't sound too opposed, but in fact there are some challenges that come with the optimisations we usually do in production builds. In development we can focus on making builds as fast as possible, even if we trade off bundle size, make builds a little bit larger, or make similar trade-offs, but in production builds you don't want to make these trade-offs. You want to make production builds as optimised as possible, and then you basically have to trade off build performance for that. Bringing both together, incremental builds and production optimisations, is a bit of a challenge.

So let's look at some common production optimisations for bundlers. The one you probably know is called tree-shaking. It's basically about this: you have your repository with a lot of files, and in your application you only want to include the files that you're actually using on your pages. Bundlers usually do that by following the dependency graph, following your imports, and only including the files you actually reference. But it goes more low-level. Every file usually has multiple exports, and you maybe have some kind of utility library with a lot of function declarations. Tree-shaking also looks at these, looks into your source code, checks which of these exports are actually used in your whole application, includes only those in your bundle, and basically throws away the remaining ones. That's actually the first challenge.
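As a rough sketch (the file names and functions here are invented, not from the talk), export-level tree-shaking looks roughly like this:

```js
// utils.js: an invented utility module with three named exports
export function formatDate(date) {
  return date.toISOString().slice(0, 10);
}
export function parseDate(text) {
  return new Date(text);
}
export function slugify(text) {
  return text.toLowerCase().trim().replace(/\s+/g, '-');
}

// If the only import anywhere in the application is
//   import { formatDate } from './utils.js';
// then parseDate and slugify can be dropped from the bundle, but
// proving they are unused means checking every module in the
// whole application, not just this file.
```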

We have to look at your whole application to figure out which exports are used, and looking at the whole application is basically the opposite of making it incremental, where you usually want to look at one piece at a time and, if something changes, minimise the effects of that change. This whole-application optimisation is a little bit opposed to that. The next optimisation is called export mangling. As a good developer, you made these function declarations and gave them good, meaningful, long names that explain to your co-workers what they do, and, in production, you don't want to leak these long names into your bundles. You want the bundler or the tool to optimise that, and usually bundlers do it by renaming things to A, B, and C, something like that, so it's just more compact. But there's also a problem with that. If you rename these exports in a file, you also have to rename them on the import side, so every module that is importing the module with those declarations needs to reference not the long names but A, B, C, the mangled names. So you have this effect where your optimisation changes one module, and that affects a lot of other modules. This is also a little bit challenging for incremental builds, because you change one thing and it bubbles up into multiple other changes, and that's not really incremental.
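A minimal sketch of what export mangling does (the module names and the mangled identifier are made up):

```js
// Export mangling, sketched with invented module contents.
//
// Before: the defining module and every importer use the long name.
//   pricing.js:   export function calculateMonthlyPayment(p, r, n) { ... }
//   checkout.js:  import { calculateMonthlyPayment } from './pricing.js';
//
// After: the export is shortened to "a", so every importing module in
// the whole application has to be rewritten to use the new name too.
export function a(principal, rate, months) {
  return (principal * rate) / (1 - Math.pow(1 + rate, -months));
}
// checkout.js now needs:  import { a } from './pricing.js';
```

This is why mangling one module ripples into changes in all of its importers.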

Another optimisation is called module IDs. Some modules in your application need to be addressable at runtime, because you might want to get some export from that module, so you have to give them a name at runtime. You could just use the full path as the name, but that's very long and verbose, and for production builds we want something shorter, similar to export mangling. So in Webpack we usually just give them short numbers and address them by those. The problem is that when you give every module a number, you have to make sure that this number is unique in your whole application, and uniqueness is again a problem where you need to look at your whole application to figure out whether a name is already taken, whether there is a conflict. So this is again a whole-application optimisation.
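To illustrate the idea (this is a simplified sketch in the style of a Webpack runtime, not TurboPack's actual implementation), the bundled output addresses modules by number:

```js
// Each module gets a short numeric ID instead of its full path.
// The IDs must be unique across the entire application.
const modules = {
  0: (module, exports, requireModule) => {
    // was ./src/index.js
    const date = requireModule(1);
    console.log(date.today());
  },
  1: (module, exports, requireModule) => {
    // was ./src/utils/date.js
    module.exports = { today: () => new Date().toISOString().slice(0, 10) };
  },
};

// A tiny module runtime that resolves modules by their number.
function requireModule(id) {
  const module = { exports: {} };
  modules[id](module, module.exports, requireModule);
  return module.exports;
}

requireModule(0); // the entry point is addressed by its number, not its path
```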

Another optimisation is chunking. Usually your bundler doesn't put all your modules into a single file and serve that for all pages, because you would end up with a huge bundle of many megabytes. So we split it up, with code splitting, into bundles per page, but we also do something like a common-chunk or shared-modules optimisation, where we figure out whether there are modules shared between multiple pages or multiple chunks, and then we put them into a shared file so we don't have to load the same modules multiple times; we basically load the shared module once. Finding shared modules also requires looking at multiple pages at the same time, so this is again a whole-application optimisation. There is also an optimisation called scope hoisting. As a good developer, you write many small modules, because that organises your code well, and we don't want this abstraction of modules leaking into the runtime, so we want to get rid of many small modules at runtime and only keep what we actually need at runtime.
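A toy model of the shared-chunk decision (page and module names invented) could look like this:

```js
// Which modules each page uses; in a real bundler this comes from
// traversing the import graph of every page.
const pages = {
  pageA: ['header.js', 'cart.js'],
  pageB: ['header.js', 'search.js'],
};

// Count how many pages reference each module.
const usage = new Map();
for (const moduleList of Object.values(pages)) {
  for (const m of moduleList) usage.set(m, (usage.get(m) ?? 0) + 1);
}

// Modules used by more than one page are candidates for a shared chunk.
const shared = [...usage].filter(([, count]) => count > 1).map(([m]) => m);

console.log(shared); // ['header.js']: only knowable by looking at every page
```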

So the scope hoisting optimisation basically merges modules together under certain conditions, and the tricky thing in this case are the conditions. We have to figure out which modules are, in the whole application, always executed together in the same order, and then we can merge them, because then we know the order they're executed in. Finding that this condition holds, that something always happens in your whole application, is again a whole-application optimisation. Then there are some simpler optimisations like dead code elimination, which just omits code that is not used, or minification, which just writes your code more efficiently and removes whitespace, comments, and that kind of stuff. The last optimisation is content hashing, where we put a hash at the end of every file name to make it long-term cacheable. That basically means you can just send an immutable cache header, and then the browser cache can cache it. And yes, those are basically all of the eight optimisations I came up with.
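Here is a small sketch of what the scope hoisting merge does (module contents invented):

```js
// Before: two tiny modules, each with its own scope and runtime wrapper.
//   greeting.js:  export const greeting = 'Hello';
//   index.js:     import { greeting } from './greeting.js';
//                 console.log(greeting + ', world');
//
// After: if greeting.js is provably always executed immediately before
// index.js in the whole application, both can be merged into one scope
// and the import/export indirection disappears entirely:
const greeting = 'Hello';
console.log(greeting + ', world');
```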

So if we summarise that in a table, you see that about half of these optimisations need to be whole-application optimisations, and a bunch of them also have effects where you change something and then the importer side of that changes too. These are a bit complicated for incremental builds. But we'll look at that later. So in general, for incremental production builds, we need at least two ingredients.
