React Day Berlin 2023

Rethinking Bundling Strategies

Tobias Koppers

Webpack and Turbopack Creator

We take a look at different challenges and decisions when bundling code for web applications. We look at how these are commonly solved and why we need to rethink them.

FAQ

Tobias Koppers is the creator of Webpack, a popular module bundler for JavaScript. He joined Vercel and contributed to improving Webpack for Next.js. Currently, he is working on Turbopack and integrating it with Next.js.

Turbopack is a new tool developed by Tobias Koppers that aims to improve upon Webpack's features and integrate seamlessly with Next.js. It focuses on efficient bundling strategies and better caching mechanisms to enhance web development workflows.

Tobias Koppers highlighted two main challenges in bundling: ensuring deterministic builds and ensuring that small input changes result in small output changes. These challenges are crucial for effective long-term caching and minimizing the impact of updates on bundled resources.

Long-term caching in web development involves storing web resources in a browser's cache to improve load times and reduce server requests. It utilizes techniques like immutable caching, where resources are cached without revalidation, and ETag caching, which allows browsers to check if the content has changed before downloading it again.

Webpack addresses content hash dependencies by using a manifest file that lists all the chunk hashes. This prevents changes in one part of the application from affecting unrelated parts, thereby optimizing caching and minimizing the need to re-download unchanged assets.

Turbopack proposes improvements such as more efficient handling of module fragments and exports, reducing unnecessary code in bundles, and optimizing the generation of module graphs to focus only on used exports. This leads to faster builds and more efficient application performance.

Effective code splitting strategies involve isolating changes to specific entry points or pages, ensuring that changes in one part of the application do not impact others. This can be achieved through heuristic methods such as separating node module dependencies from application code to leverage long-term caching more effectively.

32 min

08 Dec, 2023

Video Summary and Transcription

The talk discusses rethinking bundling strategies, focusing on challenges such as long-term caching and improving the state of Next.js and Webpack. It explores handling immutable caching and content hashes, optimizing asset references and page manifests, and addressing issues with client-side navigation and long-term caching. The talk also covers tree shaking and optimization, optimizing module fragments and code placement, and the usage and relationship of Turbopack with Webpack. Additionally, it touches on customizing configuration and hash risks, barrel imports and code splitting, and entry points and chunking heuristics.

1. Rethinking Bundling Strategies

Short description:

I'm Tobias Koppers, the creator of Webpack. Today, I want to talk about rethinking bundling strategies, focusing on two challenges in writing bundlers. The first challenge is long-term caching, leveraging the browser cache to store resources between deployments. The second challenge involves improving the current state of Next.js and Webpack. Let's dive into these challenges and explore how we can do better.

Thank you. Yeah, I'm actually talking about rethinking bundling strategies today, and my name is Tobias Koppers. I created Webpack 11 or 12 years ago, and two or three years ago I joined Vercel and worked a little bit on Next.js, improving Webpack for Next.js.

Now I'm working on Turbopack and integrating Next.js with Turbopack. My talk is actually a little bit more general-facing, so I want to talk about a few things. I want to look at two different challenges in writing bundlers. We're actually looking at the magic in bundlers. So I grabbed two topics for that, two challenges that I currently face, or will face in the future, with building Turbopack. And I want to go a little bit deep into that because I think learning this bundler magic can be important, even if you technically should not face it in your day-to-day job. The bundler should make it transparent and should not confront you with all these challenges. It should just solve them magically. But I think it's still useful to know, you get some deep insight from it, and it may help you in a few edge cases.

First, I want to present these two challenges, and then go into the current state with Next.js and Webpack for that. After that, I want to spend a bit of time rethinking that: how we can improve on it, what we can do better in the future, and what we actually want to do in Turbopack with these challenges. A little disclaimer first: I mostly work with Next.js, Webpack, and Turbopack, so everything is from the perspective of these tools. There are other tools out there, and they have similar things with different implementations. Also, most of the ideas are not really new; they are largely inspired by other tools.

The first topic is mostly about long-term caching, which is not very well known. So what is long-term caching at all? Long-term caching means we want to leverage the browser cache to store our resources, especially between deployments. There are basically three practical levels of leveraging the browser cache.

The first one is max-age caching, where you just specify that your resources are valid for, say, two hours, so the browser doesn't have to check again and can just use the cache for two hours. But in practice, that's pretty much unsuitable for our kind of application, because we might have a critical bug fix to deploy, and we don't want to wait two hours until the user actually gets the bug fix. So we don't want to use that at all.

What we want to use instead is something like ETag caching. ETag caching means that when the server responds with the resource, it sends a special header, ETag, which usually contains a hash of the content, and the browser stores that in its cache. You also want to specify must-revalidate, so the next time the browser wants to use the resource, it does a new request for it, but it includes a special If-None-Match header, which contains the ETag, so the hash of the content. If the resource didn't change in the meantime, the server can respond with a special status code, 304 Not Modified, meaning: it hasn't changed, you can just use the cache, and you don't need to download it again. That basically always works, which is great. But it also always revalidates the request. So it sends a new request, and you have to pay the round trip, but you don't have to pay the download cost. So it's good, but you can do better.
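The ETag round trip described above can be sketched from the server's point of view. This is a minimal illustration, not how any particular server implements it: `toyHash`, `handleConditionalGet`, and `SketchResponse` are hypothetical names, and the toy hash only stands in for a real content hash. The `ETag` and `If-None-Match` headers and the 304 status code are standard HTTP.

```typescript
// Toy polynomial hash standing in for a real content hash (assumption: not cryptographic).
function toyHash(text: string): string {
  let h = 7;
  for (let i = 0; i < text.length; i++) {
    h = (h * 31 + text.charCodeAt(i)) % 4294967291;
  }
  return ("0000000" + h.toString(16)).slice(-8);
}

// Hypothetical, simplified response shape for this sketch.
interface SketchResponse {
  statusCode: number; // 200 with a body, or 304 without one
  headers: Record<string, string>;
  body: string | null;
}

// Decide how to answer a GET for `content`, given the request's If-None-Match value (if any).
function handleConditionalGet(content: string, ifNoneMatch: string | null): SketchResponse {
  const etag = '"' + toyHash(content) + '"';
  const headers: Record<string, string> = {
    "ETag": etag,
    // Store the response, but revalidate with the server before every reuse.
    "Cache-Control": "max-age=0, must-revalidate",
  };
  if (ifNoneMatch === etag) {
    // Content unchanged: the round trip is paid, the download is not.
    return { statusCode: 304, headers: headers, body: null };
  }
  return { statusCode: 200, headers: headers, body: content };
}

const firstVisit = handleConditionalGet("console.log('v1')", null);
const revisit = handleConditionalGet("console.log('v1')", firstVisit.headers["ETag"]);
const afterDeploy = handleConditionalGet("console.log('v2')", firstVisit.headers["ETag"]);
```

Here `revisit` is answered without a body, while `afterDeploy` carries the new content because the stored ETag no longer matches.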

2. Handling Immutable Caching and Content Hashes

Short description:

The best way to handle caching for static resources is through immutable caching, where the browser can cache the resources indefinitely. To ensure consistency, a unique URL with a content hash is used, allowing for easy updates without breaking the cache. To achieve deterministic builds, the bundler must generate the same output for the same application, while also ensuring that small changes result in small output changes. However, handling content hashes becomes more complex when there are references between different parts of the application. Webpack and Next.js have made progress in solving these challenges, but the issue of content hashes remains.

The best one, I think, at least for static resources and that kind of stuff, is immutable caching, which means you send Cache-Control: immutable and a few other headers. That means the browser can cache it forever: it never has to do a round trip, never has to request it again, and can just store it forever, usually one year or something.

But that only works if you never change the content of the resource. If the browser stores it forever without revalidating and you change it, browsers might still have the old version cached, and things become inconsistent.

So usually you tackle that by making the URL unique, in a way that its content never changes. Usually you just add a content hash to the URL; you might have seen file names with this hash attached. That makes the URL so unique that its content will never change, and if you deploy a new version, it just gets a new URL with a new hash.
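As a rough sketch of that idea, the snippet below derives a content-hashed URL and pairs it with an immutable caching header. The helper names (`toyHash`, `emitImmutableAsset`), the `/static/` prefix, and the one-year `max-age` value are assumptions for illustration; a real bundler would use a proper hash function.

```typescript
// Toy polynomial hash standing in for a real content hash (assumption: not cryptographic).
function toyHash(text: string): string {
  let h = 7;
  for (let i = 0; i < text.length; i++) {
    h = (h * 31 + text.charCodeAt(i)) % 4294967291;
  }
  return ("0000000" + h.toString(16)).slice(-8);
}

interface EmittedAsset {
  url: string;          // unique per content, so the content behind a URL never changes
  cacheControl: string; // header the server would send for this URL
}

// Hypothetical helper: put the content hash into the file name.
function emitImmutableAsset(baseName: string, extension: string, content: string): EmittedAsset {
  return {
    url: "/static/" + baseName + "." + toyHash(content) + "." + extension,
    cacheControl: "public, max-age=31536000, immutable",
  };
}

const assetV1 = emitImmutableAsset("main", "js", "render(1)");
const assetV1Again = emitImmutableAsset("main", "js", "render(1)"); // same build input
const assetV2 = emitImmutableAsset("main", "js", "render(2)");      // new deployment
```

Building the same content twice yields the same URL, so the cached copy stays valid; a changed file gets a new URL and the old cache entry is simply never requested again.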

Yeah, that would be the best one. So how do we approach that from a bundler level? The challenge can be solved with a few different techniques. One thing is we want to build the bundler in a way that it generates deterministic builds. If you build the same application, it should just generate the same output assets so that the cache can actually be used. If you generated different output assets, you couldn't use the cache. But you also want another property: even if you make a small change to your application, which you usually do, like in every pull request, you want a small change to result in a small output change. If you only change one module, you might expect only one or a few chunks to change in the output bundle. That's the way we can generally make use of the browser cache.

Now we want to use this immutable caching thing, so we want to just put a content hash on every file name we emit from the bundler. It sounds pretty easy: you just hash the content and add it to the file name. But it gets a little bit complicated because there are actually references between the different things in your application. For example, HTML references your chunks, the chunks reference each other, maybe for async loading and that stuff, and chunks also reference assets, like images and fonts. That's where the problem comes in.

We basically solved the first few things with Webpack in the current state with Next.js. To make builds deterministic, we are just careful implementing that and basically avoid absolute paths, to make the output independent of things like cloning your repository to a different directory. That's pretty easy, actually.

The more difficult one is this property of small input change, small output change, where you have to consider for every algorithm that it doesn't have this whole-application effect. Take module IDs: we can't really number them one by one, because then inserting one module at the start would renumber all the modules, which is not the property we want. So we make use of hashes to generate module IDs, and also when grouping your modules into chunks, you have to make it deterministic in a way that small changes turn into small output changes. It's also relevant for optimizations, like mangling and that stuff. In general, we solved a few things, but let's look into this content hashes problem.
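The module ID point can be illustrated with a small comparison. This is a sketch under assumptions: `sequentialIds`, `hashedIds`, and `countChangedIds` are made-up helpers, and the toy hash of the relative path stands in for whatever a real bundler hashes.

```typescript
// Toy polynomial hash standing in for a real hash (assumption: not cryptographic).
function toyHash(text: string): string {
  let h = 7;
  for (let i = 0; i < text.length; i++) {
    h = (h * 31 + text.charCodeAt(i)) % 4294967291;
  }
  return ("0000000" + h.toString(16)).slice(-8);
}

// Sequential IDs: each module gets its position in the sorted module list.
function sequentialIds(paths: string[]): Record<string, string> {
  const ids: Record<string, string> = {};
  paths.slice().sort().forEach((path, index) => { ids[path] = String(index); });
  return ids;
}

// Hash-based IDs: each module's ID depends only on its own relative path.
function hashedIds(paths: string[]): Record<string, string> {
  const ids: Record<string, string> = {};
  paths.forEach((path) => { ids[path] = toyHash(path); });
  return ids;
}

// Count how many pre-existing modules received a different ID.
function countChangedIds(oldIds: Record<string, string>, newIds: Record<string, string>): number {
  let changed = 0;
  Object.keys(oldIds).forEach((path) => { if (oldIds[path] !== newIds[path]) changed++; });
  return changed;
}

const modulesBefore = ["./src/b.js", "./src/c.js", "./src/d.js"];
const modulesAfter = ["./src/a.js"].concat(modulesBefore); // one module inserted at the start

const sequentialChurn = countChangedIds(sequentialIds(modulesBefore), sequentialIds(modulesAfter));
const hashedChurn = countChangedIds(hashedIds(modulesBefore), hashedIds(modulesAfter));
```

With sequential numbering every existing module is renumbered (`sequentialChurn` is 3), while hash-based IDs leave them untouched (`hashedChurn` is 0). Using relative rather than absolute paths as hash input also keeps the result independent of where the repository was cloned.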

3. Optimizing Asset References and Page Manifests

Short description:

If you go with a naive approach and put content hashes and references in the normal way, changes in one asset can cause the entire dependency graph to invalidate. To solve this, Webpack uses a manifest that contains all the file names, URLs, and content hashes of the chunks. The chunks no longer reference each other directly, but instead reference the manifest. This allows unrelated chunks to remain cached while only the manifest and HTML files change when an asset is updated. However, in a multi-page application, all the chunks reference the manifest, leading to invalidation of all HTML files. To address this, a per-page manifest can be used, isolating pages and eliminating hash dependencies between them. This change allows for independent pages with no hash dependencies, benefiting long-term and build caching, but requiring additional handling for client-side navigation.

If you just go with a naive approach and put content hashes on everything, with references in the normal way, then you get this property: you have an asset here, like an image or a font, that includes a content hash, so the file name is the base name plus a hash of the content. Because AsyncChunk2 references it in this case, we have to embed the URL of this asset into AsyncChunk2. That basically means the hash of the asset becomes part of AsyncChunk2. And AsyncChunk2 is also hashed and gets a name, and that basically happens for everything.

Now the important thing: if something changes, like I change this font file to include more font subsets, whatever, then this asset of course changes and gets a new URL. The problem is that the new URL needs to be embedded into AsyncChunk2, which means this chunk also changes and gets a new URL, and that needs to be embedded into AsyncChunk1. So basically the problem is that the change bubbles up your dependencies, the reference graph of your application, and in the end the whole graph is invalidated just because you changed a leaf of the graph. That's not the property we want; we want something else. And Webpack has a solution for that.
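The bubbling effect can be reproduced with a toy model of the naive approach. Everything here is an illustrative assumption: the graph shape, the names (`naiveUrls`, `buildGraph`, `changedIds`), and the toy hash. Each node's URL is derived from its own source plus the URLs it embeds.

```typescript
// Toy polynomial hash standing in for a real content hash (assumption: not cryptographic).
function toyHash(text: string): string {
  let h = 7;
  for (let i = 0; i < text.length; i++) {
    h = (h * 31 + text.charCodeAt(i)) % 4294967291;
  }
  return ("0000000" + h.toString(16)).slice(-8);
}

interface GraphNode { source: string; references: string[]; }
type ReferenceGraph = Record<string, GraphNode>;

// Naive approach: every file embeds the hashed URLs of the files it references.
function naiveUrls(graph: ReferenceGraph): Record<string, string> {
  const urls: Record<string, string> = {};
  const urlOf = (id: string): string => {
    if (urls[id] === undefined) {
      const node = graph[id];
      const embedded = node.references.map(urlOf).join(",");
      urls[id] = id + "." + toyHash(node.source + "|" + embedded);
    }
    return urls[id];
  };
  Object.keys(graph).forEach(urlOf);
  return urls;
}

function buildGraph(fontBytes: string): ReferenceGraph {
  return {
    font: { source: fontBytes, references: [] },
    asyncChunk2: { source: "chunk 2 code", references: ["font"] },
    asyncChunk1: { source: "chunk 1 code", references: ["asyncChunk2"] },
    unrelatedChunk: { source: "other code", references: [] },
    entry: { source: "entry code", references: ["asyncChunk1", "unrelatedChunk"] },
  };
}

function changedIds(a: Record<string, string>, b: Record<string, string>): string[] {
  return Object.keys(a).filter((id) => a[id] !== b[id]).sort();
}

// Only the font (a leaf) changes between the two builds.
const naiveChanged = changedIds(naiveUrls(buildGraph("font v1")), naiveUrls(buildGraph("font v2")));
```

In this model `naiveChanged` is expected to contain the font and every ancestor up to the entry, while `unrelatedChunk` keeps its URL only because nothing it references changed.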

So what we do in Webpack is, instead of putting references to chunks into other chunks, we put all the file names, all the URLs, all the content hashes of chunks into a manifest, which is embedded into the Webpack runtime, and reference the chunks through that. In our graph it looks like this: all the chunk hashes are in the manifest, so the chunks don't reference each other directly; they reference each other indirectly instead. That changes the property: if you change the asset, it still bubbles up to AsyncChunk2, but then it stops bubbling up to all the other chunks. It only bubbles up to the manifest file, which changes, and the HTML, which will always change. But we can keep all the unrelated chunks cached, and that gives a lot of benefit for caching.

But there are still issues with that. Take a multi-page application, where you have many, maybe thousands of HTML files, and you want to client-side navigate between these pages, which is what you usually want in, for example, a Next.js application. You need one runtime, because you want to share modules between all the pages, and you want to share state between them. So you basically have a single runtime manifest file, and now you see the problem. All the chunks in your thousand pages are referenced by the manifest file, and that gives you the problem that if you change your image on page B, it bubbles up to AsyncChunk4 in this case, and then it bubbles up to the manifest, and that will invalidate all your HTML files. That's okay, it works, and we used to do it for years, but I think we can do better.

So I spent a bit of time on this very obvious change you can make here. Instead of a global manifest with all this stuff, you just make a per-page manifest. That changes the property: now your pages are isolated from each other, and you don't have hash dependencies between them. Now you can change one page, or change a module on one page, and it only invalidates one of the HTML files. That is a simple change, but it has a big impact. You still have the property that the initial page load is fast, cached, and content-hashed. But now you have independent pages, with no hash dependencies between each other. That has benefits for long-term caching, but it also has benefits for build caching, where you can just cache HTML files between deployments if they don't change. But there's a trade-off. Now you have to do something for client-side navigation.
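A toy model of the manifest idea follows. It is a sketch, not Webpack's actual runtime format: `buildWithManifest`, `ChunkSpec`, and the JSON manifest are assumptions. Chunks refer to each other by stable IDs, so only asset URLs are embedded in chunk content, and the ID-to-URL mapping lives in one manifest.

```typescript
// Toy polynomial hash standing in for a real content hash (assumption: not cryptographic).
function toyHash(text: string): string {
  let h = 7;
  for (let i = 0; i < text.length; i++) {
    h = (h * 31 + text.charCodeAt(i)) % 4294967291;
  }
  return ("0000000" + h.toString(16)).slice(-8);
}

// Chunk sources mention other chunks only by stable ID (e.g. load("asyncChunk2")),
// so chunk-to-chunk references do not influence the content hash.
interface ChunkSpec { source: string; assets: string[]; }

interface ManifestBuild { chunkUrls: Record<string, string>; manifestHash: string; }

function buildWithManifest(assets: Record<string, string>, chunks: Record<string, ChunkSpec>): ManifestBuild {
  const assetUrls: Record<string, string> = {};
  Object.keys(assets).forEach((id) => { assetUrls[id] = id + "." + toyHash(assets[id]); });

  const chunkUrls: Record<string, string> = {};
  Object.keys(chunks).forEach((id) => {
    // Asset URLs are still embedded directly, so an asset change reaches its chunk.
    const embedded = chunks[id].assets.map((a) => assetUrls[a]).join(",");
    chunkUrls[id] = id + "." + toyHash(chunks[id].source + "|" + embedded) + ".js";
  });

  // The manifest maps stable chunk IDs to hashed URLs; the runtime looks URLs up here.
  return { chunkUrls: chunkUrls, manifestHash: toyHash(JSON.stringify(chunkUrls)) };
}

const chunkSpecs: Record<string, ChunkSpec> = {
  entry: { source: 'load("asyncChunk1")', assets: [] },
  asyncChunk1: { source: 'load("asyncChunk2")', assets: [] },
  asyncChunk2: { source: "use font", assets: ["font"] },
};

const buildV1 = buildWithManifest({ font: "font v1" }, chunkSpecs);
const buildV2 = buildWithManifest({ font: "font v2" }, chunkSpecs);
```

Between the two builds, `asyncChunk2` and the manifest are expected to change, while `asyncChunk1` and `entry` keep their URLs and stay cached.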
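The difference between a global and a per-page manifest can be sketched as follows. The data layout (`PageChunks`) and helper names are assumptions made for this illustration only.

```typescript
// Toy polynomial hash standing in for a real content hash (assumption: not cryptographic).
function toyHash(text: string): string {
  let h = 7;
  for (let i = 0; i < text.length; i++) {
    h = (h * 31 + text.charCodeAt(i)) % 4294967291;
  }
  return ("0000000" + h.toString(16)).slice(-8);
}

// page name -> chunk id -> chunk content (hypothetical build input)
type PageChunks = Record<string, Record<string, string>>;

function chunkUrl(id: string, content: string): string {
  return id + "." + toyHash(content) + ".js";
}

// One manifest per page: it only lists the chunks of that page.
function pageManifestHash(chunks: Record<string, string>): string {
  const urls = Object.keys(chunks).sort().map((id) => chunkUrl(id, chunks[id]));
  return toyHash(urls.join(","));
}

// One global manifest: it lists the chunks of every page.
function globalManifestHash(pages: PageChunks): string {
  const urls: string[] = [];
  Object.keys(pages).sort().forEach((page) => {
    Object.keys(pages[page]).sort().forEach((id) => urls.push(chunkUrl(id, pages[page][id])));
  });
  return toyHash(urls.join(","));
}

const pagesV1: PageChunks = {
  pageA: { asyncChunk1: "code A1", asyncChunk2: "code A2" },
  pageB: { asyncChunk3: "code B3", asyncChunk4: "image v1" },
};
const pagesV2: PageChunks = {
  pageA: { asyncChunk1: "code A1", asyncChunk2: "code A2" },
  pageB: { asyncChunk3: "code B3", asyncChunk4: "image v2" }, // only page B changed
};

const pageAUnchanged = pageManifestHash(pagesV1.pageA) === pageManifestHash(pagesV2.pageA);
const pageBChanged = pageManifestHash(pagesV1.pageB) !== pageManifestHash(pagesV2.pageB);
const globalChanged = globalManifestHash(pagesV1) !== globalManifestHash(pagesV2);
```

With the global manifest, the change on page B alters the one manifest every HTML file depends on; with per-page manifests, page A's manifest (and therefore its HTML) is untouched.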

4. Client-side Navigation and Long-term Caching

Short description:

When doing client-side navigation, getting the manifest and initial chunks of the new page is crucial. Immutable cached requests won't work, so uncached or ETag-cached requests are necessary. Combining data requests with requests for new chunks can solve this problem, as demonstrated by React Server Components.

If you do a client-side navigation from one page to another page, you need to somehow get the manifest and the initial chunks of the other page. And you can't do it with an immutable cached request, because then you would need to embed the content hash into the HTML page again, and that would break the whole thing. So you need to do it uncached or ETag-cached. But that's a good trade-off, because when you navigate to another page, you often have to do a data request anyway. And usually it's kind of easy to combine this data request with a request for the new chunks. Actually, React Server Components is doing that: just bundle them together. We have one request for data and chunks, and then you have solved this client-side navigation problem. Yeah. That's it for long-term caching.
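One way to picture the combined request is a single response that carries both the page data and the list of chunks for the target page. This is a hypothetical shape invented for illustration; it is not the actual wire format of Next.js or React Server Components, and all names and URLs below are made up.

```typescript
// Hypothetical shape of a combined navigation response.
interface NavigationPayload {
  data: Record<string, string>; // the page data the client needed anyway
  chunkUrls: string[];          // content-hashed URLs of the target page's initial chunks
}

// Per-page manifests as they might be produced at build time (values are made up).
const perPageChunks: Record<string, string[]> = {
  "/a": ["/static/shared.1a2b3c4d.js", "/static/page-a.5e6f7a8b.js"],
  "/b": ["/static/shared.1a2b3c4d.js", "/static/page-b.9c0d1e2f.js"],
};

// Server side: answer one uncached (or ETag-revalidated) request with data and chunk list together.
function buildNavigationPayload(pagePath: string, data: Record<string, string>): NavigationPayload {
  return { data: data, chunkUrls: perPageChunks[pagePath] || [] };
}

// Client side: only fetch chunks that are not loaded yet. The chunk URLs themselves are
// content-hashed, so those follow-up requests can still use immutable caching.
function chunksStillNeeded(payload: NavigationPayload, alreadyLoaded: string[]): string[] {
  return payload.chunkUrls.filter((url) => alreadyLoaded.indexOf(url) === -1);
}

// Navigating from /a to /b: the shared chunk is already there, only page B's chunk is missing.
const payloadForB = buildNavigationPayload("/b", { title: "Page B" });
const toFetch = chunksStillNeeded(payloadForB, perPageChunks["/a"]);
```

The point of the design is that no page needs to know another page's content hashes ahead of time; they arrive with the navigation response.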

5. Tree Shaking and Optimization

Short description:

Another topic I want to cover is tree shaking. The challenge with tree shaking is that you have a lot of unused code in your application, such as utility libraries, component libraries, and icon libraries. These libraries often re-export all the icons, which can negatively impact bundle size. However, when optimizing for development, we need to be careful with module replacement and file dependencies. In Webpack, we have three levels of tree shaking, including module graph tree shaking and side-effect-free module elimination.

Another topic I want to cover, hopefully in the remaining time, is tree shaking. The challenge with tree shaking is that you have a lot of unused code in your application. You have utility libraries like lodash, which bundle a lot of functions, of which maybe you're only using four. Or you have component libraries, maybe from open-source projects and that stuff. Or you have icon libraries, which are actually the worst ones. They are bundles of thousands of icons, of which you only use two. Yeah. Usually that's the problem.

What libraries often do is barrel files: this icon library just re-exports all the icons in the library, so you can easily import them. It's kind of a nightmare for bundle size, because basically you're referencing all these icons in this library. So we want to do some optimization there. But we have to be careful, because in development you want to use something like HMR, hot module replacement. That could potentially conflict with all these optimizations. If you skip a module that might change over time, you might know it's unused now, but later it might be used. So you have to be careful. Also be careful with this kind of scheme where changing one file affects a lot of other files. That's not really great for HMR.

In Webpack we have basically three levels of tree shaking. The first level is module graph tree shaking, where just by following only the imports in the application, you already skip a lot of files because they're never referenced by other files. That's the base level: it just follows the module graph and automatically tree shakes the unused files away. It's not really a thing we have to do. It's just by design. The next level is side-effect-free module elimination. That means, for example, that a barrel file in an icon library is flagged as side-effect-free via the sideEffects field in package.json. So the bundler can skip that file if none of its own exports are used. That's often really useful because then you can skip over this barrel file and all these unused icons and only really import the file you need.
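The first two levels can be modelled with a small graph walk. This is a simplified sketch, not Webpack's algorithm: `ModuleInfo`, `collectModules`, and the example file names are assumptions, and the `sideEffectFree` flag plays the role of `"sideEffects": false` in a library's package.json.

```typescript
interface ModuleInfo {
  imports: { from: string; names: string[] }[];
  reexports: Record<string, string>; // export name -> module that really defines it
  sideEffectFree: boolean;
}

// Walk the graph from the entry. Files that are never imported are never visited (level 1).
// A side-effect-free barrel whose imported names are all re-exports is skipped (level 2).
function collectModules(graph: Record<string, ModuleInfo>, entry: string): string[] {
  const included: Record<string, boolean> = {};
  const visit = (id: string): void => {
    if (included[id]) return;
    included[id] = true;
    const info = graph[id];
    info.imports.forEach((imp) => {
      const target = graph[imp.from];
      const allReexported = imp.names.every((n) => target.reexports[n] !== undefined);
      if (target.sideEffectFree && allReexported) {
        // Skip the barrel and go straight to the modules defining the used names.
        imp.names.forEach((n) => visit(target.reexports[n]));
      } else {
        visit(imp.from);
      }
    });
    // A module that is included pulls in everything it re-exports.
    Object.keys(info.reexports).forEach((n) => visit(info.reexports[n]));
  };
  visit(entry);
  return Object.keys(included).sort();
}

function makeGraph(barrelIsSideEffectFree: boolean): Record<string, ModuleInfo> {
  const leaf: ModuleInfo = { imports: [], reexports: {}, sideEffectFree: true };
  return {
    "app.js": { imports: [{ from: "icons/index.js", names: ["Home"] }], reexports: {}, sideEffectFree: false },
    "icons/index.js": {
      imports: [],
      reexports: { Home: "icons/home.js", Star: "icons/star.js", Trash: "icons/trash.js" },
      sideEffectFree: barrelIsSideEffectFree,
    },
    "icons/home.js": leaf,
    "icons/star.js": leaf,
    "icons/trash.js": leaf,
    "unused-page.js": leaf, // never imported by anyone
  };
}

const withFlag = collectModules(makeGraph(true), "app.js");
const withoutFlag = collectModules(makeGraph(false), "app.js");
```

With the flag, only `app.js` and `icons/home.js` are included; without it, the barrel and every icon it re-exports come along. `unused-page.js` is absent in both cases simply because nothing imports it.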

6. Optimizing Tree Shaking and Bundle Units

Short description:

That works by skipping over modules in the reference and generating code that only references the used exports, removing the unused ones. Webpack analyzes the used exports, skips over side effect modules, and generates code that skips unused code. In development, only side effects free modules elimination is used. The main issue is that modules are the lowest unit in Webpack, resulting in embedding unused exports in bundles. This is problematic for larger applications. Additionally, most of the optimization is disabled in development, which is not ideal. A new strategy is needed to determine the smallest unit in a bundle.

That works by just skipping over these modules when following the references. The last and most complex level is module exports tree shaking. There we detect which exports of a module are used in your application and then generate code that only contains the exports that are actually used, basically removing the exports that are unused.

In Webpack it works by having a lot of steps. This is only the part for the two later levels. In Webpack in production, it first analyzes which exports are provided and which are used, and then analyzes the whole module graph and skips over side-effect-free modules. After all this analysis, it can generate code which skips the unused code and mangles the used exports. In the end, the minifier kicks in, and dead code elimination in the minifier removes the code that's actually unused because it's no longer referenced.

In development, it looks much lighter. We basically only use the side-effect-free module elimination because all the other steps would be too expensive, and especially because they would break HMR in ways that are really unpredictable. And that's not great.

There are a few issues with that. The biggest one, I think, is that because the module graph and the exports analysis are separate processes, the module is basically the lowest unit in Webpack. What it places into chunks, into bundles, is a module. A module might be reduced by its exports, but it still places a whole module into chunks. And that's not ideal. Say from page A we're using export one of a library, and from page B we're using export two of that library. In the end, after the analysis, we have the library and detect that exports one and two are used. And in both pages, we actually embed the library with both exports, and that's not optimal. We actually want something where we can place exports into bundles and chunks, so we can split the module up and have more efficient code. That matters especially for larger applications, where your admin page uses an export, but your user-facing application doesn't really use it, and you don't want to carry that into its bundles.

It's also problematic that most of that is disabled in development. It's unusable for development today, but it might be useful for faster builds because you could skip stuff in development too. And I think the fundamental problem in this case is that in Webpack, it works by first creating this module graph, the unoptimized module graph, and then the optimization kicks in and removes stuff from the module graph again. I think we should do better in that. And the new strategy is to reconsider what the smallest unit in a bundle is. We shouldn't build a module graph by modules.

7. Optimizing Module Fragments and Code Placement

Short description:

Instead of bundling modules as a whole, we can split them into smaller fragments based on their exports. By following only the used exports or fragments, we can create a smaller optimized graph, automatically skipping unused code. This approach eliminates the need for an unoptimized graph and allows for efficient merging of modules using scope hoisting. By isolating changes from modules, this optimization provides benefits for development. Splitting modules into smaller units also allows for code placement where it is actually used, enabling new optimization possibilities.

Instead, we should build it from module fragments, by splitting modules into smaller pieces by exports. My library with 10 exports might be split into 10 fragments where every export is its own fragment. So we do the module graph thing, but on a module fragment level. We have a module fragment graph where we follow only the exports that are actually used: not the modules, but the exports or fragments that are used. That makes it much, much smaller. And then we can just use this automatic behavior of a module graph, automatically skipping all the unused stuff, on the export level too. So it makes it much easier and much more efficient.

The good thing is that this is a one-pass approach. We don't create an unoptimized graph and then optimize it. We just create an optimized graph in the first place, which is much more efficient. But to do that, we basically have to merge the fragments again. There's an existing optimization for that called scope hoisting, which can merge modules together into larger units, and that basically eliminates the runtime cost of having smaller units in the bundler in the first place. And for development it's really great because now we also get the effect that we isolate changes within modules.

Yeah, let's take a look at how this actually works code-wise. I have this small counter module, which is basically a simple module. It has a count variable and increment and get count exports. Now we can split it up into smaller units. The tricky thing is that modules might share state. In this case, the exports share this count variable. For this, we have to create an internal fragment where we export only the count state, put it in a separate module fragment, and then the different exports like increment and get count can use that shared fragment. The benefit is that now we don't have unused code anymore, so we don't have to bundle it. And we don't have to process it at compile time because we can just skip over it. And we can also, which is new, place the code where it's actually used. So we can place export A in this chunk and export B in another chunk, maybe in separate pages, maybe in an async load, different exports of a library. That gives us new possibilities to optimize stuff.
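A fragment-level graph can be sketched with identifiers of the form `module#export`. The graph below and the `reachableFragments` helper are assumptions for illustration, not Turbopack's internal representation.

```typescript
// fragment id ("module#export") -> fragments it depends on
type FragmentGraph = Record<string, string[]>;

const fragmentGraph: FragmentGraph = {
  "lib#exportOne": ["lib#shared"],
  "lib#exportTwo": ["lib#shared"],
  "lib#exportThree": [],
  "lib#shared": [],
  "pageA#default": ["lib#exportOne"],
  "pageB#default": ["lib#exportTwo"],
};

// Follow only the fragments that are actually used, starting from the given roots.
// Unused fragments are never visited, so nothing has to be removed afterwards.
function reachableFragments(graph: FragmentGraph, roots: string[]): string[] {
  const seen: Record<string, boolean> = {};
  const visit = (id: string): void => {
    if (seen[id]) return;
    seen[id] = true;
    graph[id].forEach(visit);
  };
  roots.forEach(visit);
  return Object.keys(seen).sort();
}

const fragmentsForPageA = reachableFragments(fragmentGraph, ["pageA#default"]);
const fragmentsForPageB = reachableFragments(fragmentGraph, ["pageB#default"]);
```

Page A ends up with `lib#exportOne` and the shared fragment, page B with `lib#exportTwo` and the shared fragment, and `lib#exportThree` is never touched by either walk.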
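The counter example can be reconstructed roughly as follows. The original module in the comment is a guess based on the description, and the tiny fragment runtime (`fragmentFactories`, `loadFragment`) is a hypothetical stand-in for whatever the bundler actually generates.

```typescript
// The counter module as described (reconstruction; exact names are assumptions):
//   let count = 0;
//   export function increment() { count++; }
//   export function getCount() { return count; }

type FragmentFactory = (load: (id: string) => any) => any;

const instantiated: string[] = []; // records which fragments actually ran

// The same module split into one internal state fragment plus one fragment per export.
const fragmentFactories: Record<string, FragmentFactory> = {
  "counter#state": () => ({ count: 0 }),
  "counter#increment": (load) => {
    const state = load("counter#state");
    return () => { state.count++; };
  },
  "counter#getCount": (load) => {
    const state = load("counter#state");
    return () => state.count;
  },
};

// Minimal fragment runtime: instantiate each fragment at most once, on demand.
const fragmentCache: Record<string, any> = {};
function loadFragment(id: string): any {
  if (!(id in fragmentCache)) {
    instantiated.push(id);
    fragmentCache[id] = fragmentFactories[id](loadFragment);
  }
  return fragmentCache[id];
}

// A page that only reads the counter never instantiates the increment fragment.
const getCount: () => number = loadFragment("counter#getCount");
const countBefore = getCount();
const instantiatedForReader = instantiated.slice();

// Another chunk loads the increment export later; both exports share the state fragment.
const increment: () => void = loadFragment("counter#increment");
increment();
const countAfter = getCount();
```

After the first load only `counter#getCount` and `counter#state` have run; once the increment fragment is loaded from elsewhere, it mutates the same shared state, so `countAfter` is 1.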

Summary and Q&A

Short description:

We discussed long-term caching, isolating pages, and tree shaking. The new architecture allows for splitting modules into exports and placing them in different contexts. Developing tools that affect a wide range of developers is cool and fun. It's like bundler magic that usually doesn't affect developers, but sometimes it does. Let's now move on to the audience's technical questions.

And it also makes faster builds possible, just because we don't create an unoptimized graph first; we create an optimized graph in the first place. That makes it easy to skip all the steps for something like an icon library. If you have a barrel file, we can skip resolving all the unused exports entirely. And by skipping resolving, we can skip reading, parsing, analyzing, chunking, code generation, and bundling. All this stuff is easier to skip, at least if the code is side-effect-free.

So in summary, we talked about long-term caching, which has two important properties. We want small input changes to result in small output changes. And we want pages to be isolated from each other so we can deploy stuff independently and don't get cache invalidation for things that didn't change. We also looked into tree shaking. Instead of a module graph, we build a module fragment graph, which makes the unit smaller. And due to this change of architecture, we are now able to split modules into exports and place them in different contexts, like different pages or different async loads for individual exports.

So that's the end of the talk. Nearly in time.

Great. Thank you. I really enjoy your talks, especially because they dive so deep. You have an intimate knowledge of Turbopack and Webpack and all of these tools that I oftentimes depend on without necessarily looking under the hood. But what's it like developing tools that affect such a wide range of developers?

Yeah, it's really cool. Maybe you could say this results in a large responsibility or something like that. But I don't feel it like that; it's more like I have fun doing this cool stuff, this fancy algorithmic optimization stuff. And I especially like that all the stuff I talked about is basically transparent to the developer. They don't have to care about it. It's what people often describe as bundler magic. Usually it doesn't affect you at all, but sometimes it does. So it's really useful to have this knowledge. And I also like to do deep talks on deep topics, even if they are really hard to cover in the time here. But it's fun.

Yeah, no, you're one of those people, a silent hero, well, not silent because you're clearly educating all of us, but one of those heroes that definitely keeps my day ticking along just fine. So why don't we jump into some of the technical questions that came from the audience. Don't forget that you can ask your questions.

Turbopack Usage and Relationship with Webpack

Short description:

We have plans to make TurboPack usable outside of Next.js, but currently it's not stable enough for standalone use. There is a TurboPack CLI binary that can run a single page application, but it's not recommended for public use yet. TurboPack is not just a new version of Webpack, but rather a successor with a different architecture, language, configuration, and plugin API.

We also have the little screen in the front as well. So you're very, very closely into TurboPack and seeing the roadmap and there's probably stuff even further that you can talk about that you kind of have an idea about.

So what's the timeline, if there is any, for using TurboPack outside of Next.js? Yeah. So we didn't make a plan for that yet, but it's planned to do that eventually. So we want to get Next.js finished up. So we actually solved all the problems for that and we don't have to make larger refactorings again once it's public, especially if probably making it stand alone, be usable stand alone, makes API at least a little bit public. So we want to make it as stable as possible and solve that stuff for Next.js and then we can make it stand alone.

But technically there is a binary you can compile yourself, which is TurboPack CLI, which can run a single page application. It doesn't have any configuration or any plugins, but it's technically possible to, I think it could make a create direct app, similar thing. So you're saying do so at your own risk. Yeah. You shouldn't do that. It's like not public yet, but we did it for like our own purposes to our own test cases, one with our like basically on a TurboPack stand alone thing.

And another thing as well, because I think this kind of what we talked about, because when TurboPack was branded as new, it seems that it's an evolution of Webpack. Why not bump the version only and what led to the creation of a totally new tool? Because like, especially with TurboPack, you are building a tool within the context of Next.js. Like it's very closely coupled to the fact that it's going to be used a lot in this ecosystem, whereas with Webpack, you're kind of have a much broader audience that you cater to.

Yeah. We're aiming for this broader audience too, eventually. So basically I would have felt bad if it were a Webpack 6 or 7 or 10 or whatever, because it would be such a major change, especially as we don't support Webpack plugins in the same way. And we don't support Webpack's configuration in the same way. And I don't feel like this is a new version of Webpack, because that would be a breaking change. It changes everything. It's a completely new architecture, a completely different language. So I don't think you could claim it's a new version of Webpack. It's more like something that is inspired by Webpack or has the same purpose as Webpack, but it's not the next version of it. We call it a successor, but it's a new tool that does the same thing use-case-wise; it's not the same tool. It's a different configuration, a different plugin API. Yeah.

Customizing Configuration and Hash Risks

Short description:

We took over some stuff from Webpack and redesigned the configuration and plugin interface to address common struggles. We haven't created a configuration yet to avoid overwhelming complexity. By developing and testing internally, we can iterate and improve the API before making it public. Regarding hashes for caching, there is a risk of false positive conflicts, but the hash length is usually long enough to prevent practical occurrences. It's more likely to win the lottery than to encounter hash conflicts.

It's a different configuration, a different plugin API. Yeah. Everything is different. But at least we took over some stuff from Webpack. The runtime looks very similar to Webpack. The way modules are generated in chunks is pretty similar. So we used the good parts of Webpack and tried to redo the bad parts, like the configuration and the plugin interface, which is what people really struggle with.

And yeah, we didn't do any configuration yet, because we are basically afraid of it: if we create a configuration, that is like creating an API that will stick for a while and is hard to change afterwards. We don't want to make the same mistake as with Webpack, where the configuration is really overwhelming and complicated to use. And that is one benefit of doing it inside of Next.js: we don't have to expose any configuration or any API from TurboPack yet. We can just make it work in Next.js and iterate on the API, on the internal stuff, for as long as we want before making it public.

It's like you're sharpening the recipe in-house before you open the restaurant. It's like that. We actually do that a lot: we dogfood our own tools internally first, before we make them public. It actually helps to have our own people work on that and work with that and try it and maybe find the problems. Then we can iterate on that before we make it public. And then it's hopefully a good product in the end.

That's awesome. I'm looking forward to that as well. Another person asked, because you spoke about the hashes when we were talking about caching, how is the hash calculated? And are there any risks or false positives? There is a risk of false positives, or rather of hash conflicts. But usually the hashes are long enough that this doesn't happen in practice. It's always a risk if you use hashing: you can have conflicts, and then two files would have the same URL, and that causes problems with the caches. But in practice this never happens, because I think it's an eight-byte hash or so. It's so large and a conflict is so unlikely that it doesn't happen. At that point, you should also go and win the lottery as well. Yeah, it's probably more likely to win the lottery than to hit a hash conflict. But you're also emitting a lot of files. So technically it's maybe possible, but I've never seen a case of that. I did see some cases with a shortened hash: you can configure the hash length, and some people try to make it as short as possible, like four chars, and that actually conflicts a lot more often.

Barrel Imports and Code Splitting

Short description:

And then people run into problems with that. But if you keep the normal length of eight chars, which is pretty high, that should be fine in most cases. We recently shipped an optimization in TurboPack that handles barrel imports efficiently, allowing you to keep using them without impacting your build. For code splitting, it's important to consider which modules go into which chunks. Webpack's approach of splitting out common modules can be inefficient and problematic for long-term caching. In development, we focus on making changes isolated to pages and ensuring that each entry point only cares about itself.

And then people run into problems with that. But if you keep like the normal length of eight chars, which is pretty high, that should be fine in most cases.
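To put rough numbers on the hash-length point, here is a minimal sketch of the birthday-bound estimate for truncated hashes. It assumes the hash characters are hexadecimal (4 bits each) and uniformly distributed; the function name and the file count of 1000 are illustrative choices, not anything taken from Webpack or TurboPack.

```typescript
// Birthday-bound estimate: the chance that at least two of `fileCount` files
// end up with the same truncated hash. Assumes uniformly distributed hex
// hashes (4 bits per character). Illustration only, not bundler code.
function collisionProbability(hashChars: number, fileCount: number): number {
  const space = Math.pow(2, 4 * hashChars); // number of distinct hash values
  const pairs = (fileCount * (fileCount - 1)) / 2; // number of file pairs
  // Standard approximation: 1 - exp(-pairs / space)
  return 1 - Math.exp(-pairs / space);
}

// Hypothetical build emitting 1000 hashed files.
const shortHash = collisionProbability(4, 1000);
const defaultHash = collisionProbability(8, 1000);
console.log("4 chars:", shortHash.toFixed(4));
console.log("8 chars:", defaultHash.toExponential(2));
```

Under these assumptions a four-character hash is almost certain to collide somewhere in a build of that size, while an eight-character hash collides only very rarely, which matches the experience described above.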

Nice. Nice. It's amazing. The questions just keep coming thick and fast. We won't be able to get through all of them, but we're going to go through as many as we can. And this one is about barrel imports. So is there a good alternative to barrel imports where the imports are still readable, but you don't impact your build or get circular dependencies?

Yeah. So actually we recently shipped an optimization in TurboPack, which does something similar to the tree shaking that we have on the higher level, but only for exports. If you write a barrel file with named re-exports, so not with `export * from`, but with `export { blah } from "something"`, then this barrel file basically has no cost. It just skips over the re-exports. You have to flag it as side-effect free, of course. But then it can skip that. And for TurboPack, there would be no cost for this barrel file. So you can actually keep using barrel files. I would not recommend doing that right now if it's a library, because other tools, like Webpack, don't cope with barrel files very well. Webpack basically parses the barrel file and then follows all the other imports. But at least for TurboPack we made the change so that it's efficient to write barrel files, which keeps the code looking friendly while still having good optimization.
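The idea of skipping over a barrel file can be sketched as follows. This is a hypothetical model, not TurboPack's implementation: the `ModuleInfo` shape, the `resolveThroughBarrels` helper, and the file names are all assumptions made for illustration. The `sideEffectFree` flag stands in for whatever marks a module as side-effect free, for example the `sideEffects` field in `package.json` that Webpack reads.

```typescript
// Hypothetical data model (not TurboPack internals).
interface ReExport { from: string; name: string }
interface ModuleInfo {
  sideEffectFree: boolean;             // module may be skipped if unused
  reexports: Record<string, ReExport>; // export name -> where it really lives
}
interface Target { moduleId: string; exportName: string }

// Follow named re-exports until we reach the module that defines the export.
function resolveThroughBarrels(
  modules: Record<string, ModuleInfo>,
  start: Target
): Target {
  const seen: Record<string, boolean> = {};
  let current = start;
  for (;;) {
    const key = current.moduleId + "#" + current.exportName;
    if (seen[key]) return current; // circular re-export: stop here
    seen[key] = true;
    const info = modules[current.moduleId];
    // Only skip a module when it is known to be side-effect free.
    if (!info || !info.sideEffectFree) return current;
    const next = info.reexports[current.exportName];
    if (!next) return current; // the export is defined in this module
    current = { moduleId: next.from, exportName: next.name };
  }
}

// A barrel file equivalent to:
//   export { Button } from "./Button";
//   export { Modal } from "./Modal";
const modules: Record<string, ModuleInfo> = {
  "components/index.ts": {
    sideEffectFree: true,
    reexports: {
      Button: { from: "components/Button.ts", name: "Button" },
      Modal: { from: "components/Modal.ts", name: "Modal" },
    },
  },
  "components/Button.ts": { sideEffectFree: true, reexports: {} },
  "components/Modal.ts": { sideEffectFree: true, reexports: {} },
};

const resolved = resolveThroughBarrels(modules, {
  moduleId: "components/index.ts",
  exportName: "Button",
});
console.log(resolved.moduleId);
```

In this model an import of `Button` through the barrel ends up pointing straight at `components/Button.ts`, so `Modal` never has to be processed. If the barrel were not flagged as side-effect free, the resolver would stop at the barrel itself.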

That's awesome. And the next one is more about like implementation. So how do we implement code splitting and what are the most important considerations when we're doing that?

Yeah. So code splitting is really interesting, especially how you decide which module goes into which chunks. Webpack did this whole-graph thing where vendors are split out and common modules are all split out. And that is really inefficient. But it's also problematic for long-term caching. If you extract common code, then you often have this problem that it is referenced from multiple chunks. So currently, in development, we don't do this common-code extraction. We basically try to keep changes isolated to pages. So one entry point should only care about this entry point.

Entry Points and Chunking Heuristics

Short description:

It doesn't look at other entry points. That makes a lot of sense from a development performance perspective. In production, we want to do something similar to Webpack, extracting common code and using heuristics. One heuristic is to put node modules and app code in different chunks for long-term caching. There's so much detail that can be explored, but time constraints prevent us from going into it. If you have more questions, please join Tobias in the Q&A discussion.

It doesn't look at other entry points. That makes a lot of sense from a development performance perspective, because then you don't have to care about the other pages. Changing one page or adding one page doesn't affect other pages. That's really important for us in development.
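A small sketch can illustrate why per-entry isolation keeps changes local. It assumes a hypothetical module graph and a toy string hash standing in for a real content hash; none of the names below come from TurboPack. Each entry point's chunk is built only from the modules reachable from that entry, so editing a module used by one page leaves the other page's chunk hash untouched.

```typescript
// Hypothetical module graph: id -> source text and direct imports.
interface Mod { source: string; imports: string[] }
type Graph = Record<string, Mod>;

// Toy djb2-style string hash, standing in for a real content hash.
function toyHash(text: string): string {
  let h = 5381;
  for (let i = 0; i < text.length; i++) {
    h = (((h << 5) + h) + text.charCodeAt(i)) >>> 0; // h * 33 + c, mod 2^32
  }
  return h.toString(16);
}

// Collect every module reachable from one entry point.
function reachable(graph: Graph, entry: string): string[] {
  const seen: Record<string, boolean> = {};
  const stack: string[] = [entry];
  while (stack.length > 0) {
    const id = stack.pop() as string;
    if (seen[id] || !graph[id]) continue;
    seen[id] = true;
    for (const dep of graph[id].imports) stack.push(dep);
  }
  return Object.keys(seen).sort();
}

// The chunk for an entry depends only on that entry's own modules.
function entryChunkHash(graph: Graph, entry: string): string {
  const parts = reachable(graph, entry).map(
    (id) => id + "\n" + graph[id].source
  );
  return toyHash(parts.join("\n---\n"));
}

// Two pages, each with a helper that only it imports.
function makeGraph(aHelperSource: string): Graph {
  return {
    "pages/a.ts": { source: "render(a)", imports: ["lib/a-helper.ts"] },
    "pages/b.ts": { source: "render(b)", imports: ["lib/b-helper.ts"] },
    "lib/a-helper.ts": { source: aHelperSource, imports: [] },
    "lib/b-helper.ts": { source: "export const b = 1", imports: [] },
  };
}

const before = makeGraph("export const a = 1");
const after = makeGraph("export const a = 2");
console.log(entryChunkHash(before, "pages/a.ts"), entryChunkHash(after, "pages/a.ts"));
console.log(entryChunkHash(before, "pages/b.ts"), entryChunkHash(after, "pages/b.ts"));
```

The trade-off in this model is that a module imported by both pages would be included in both chunks, which is the duplication that common-code extraction is meant to avoid in production.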

And in production, we haven't done that yet. But we probably want to do something similar to Webpack: we have global information about the application and extract common code. Yeah. But we also need to do some heuristic stuff. One heuristic in Webpack is that node modules go into one chunk and app code goes into another chunk. That makes sense from a long-term caching perspective, because vendor code changes less often than app code. If you isolate it in different files, you can leverage long-term caching for the vendors much longer than if you put it all together, where changing app code would affect the vendor code and you'd redownload that. Nice. A lot of considerations. But it's mostly heuristics and other stuff. There's so much detail that you can go into.
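For reference, the vendor/app heuristic mentioned above is commonly expressed in Webpack through the `optimization.splitChunks.cacheGroups` option. The sketch below writes that option as a plain object so it runs on its own; the `chunkFor` helper is a hypothetical stand-in that only mimics the classification, since Webpack performs the actual chunk assignment itself.

```typescript
// Plain object in the shape of Webpack's `optimization.splitChunks` option.
const config = {
  optimization: {
    splitChunks: {
      cacheGroups: {
        vendors: {
          // Matches module paths under node_modules on POSIX and Windows.
          test: /[\\/]node_modules[\\/]/,
          name: "vendors",
          chunks: "all",
        },
      },
    },
  },
};

// Hypothetical helper: shows which group a module path would fall into.
const vendorTest = config.optimization.splitChunks.cacheGroups.vendors.test;
function chunkFor(modulePath: string): string {
  return vendorTest.test(modulePath) ? "vendors" : "app";
}

console.log(chunkFor("/repo/node_modules/react/index.js"));
console.log(chunkFor("/repo/src/pages/home.tsx"));
```

With a split like this, a change to application code alters only the app chunk's content hash, so browsers can keep serving the vendor chunk from cache.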

Yeah, definitely. And it's really complicated. Yeah. Well, we don't have enough time to go into all of that. But there were so many more questions. And if you did have more questions, please come and find Tobias over in the Q&A discussion. And you'll be around during the conference, and people can find you online. Let's give him a round of applause, ladies and gentlemen. Thank you. Thank you. Thanks for coming.
