Turbopack. Why? How? When? and the Vision...

    Transcript


    Intro


    Thanks for having me. My name is Tobias Koppers, I'm from Germany, from Bavaria, from South Germany. And I'm going to tell you something about Turbopack. I started with web development 10 years ago when I founded Webpack, and I have maintained it for 10 years. So it's pretty old now. And nearly two years ago I joined Vercel and worked with the Next.js team on Next.js, on the integration of Webpack with Next.js, on performance and things like that. And now, for about 10 months, I have been working on Turbopack and I'm going to tell you something about that.


    The history behind Turbopack creation


    [00:58] First of all, what's our mission with Turbopack? Our mission is to create the successor of Webpack. We want to align with the goals of Webpack. We want to make some kind of tool which is really like Webpack, similar to Webpack, and fulfills at least the goals of Webpack.

    I know that's a really ambitious mission and we will take years or a long time to get there, but at least that's our direction we are trying to head to. And this basically motivates us for our project goals. We don't want to build something that's only for Next.js. We want to build something that's framework-independent. We do not want to build something that's only for Vercel. We want to make something that's for the open source community, which is a community for something... Really we want to align with the goals and the motivation behind Webpack. We also want to make sure that we are building something that's as flexible and as extensible as Webpack. We want to follow Webpack's footsteps in that way. We actually want to create a building block for web development for the next 10 years of web development. Ambitious goals, yeah.

    [02:09] Okay. Let's look at what led to the creation of Turbopack, at the past, at how it works, and at what exactly our vision with Turbopack is. It all started when I joined Vercel and worked with the Next.js team. Basically, we wanted to solve some developer experience challenges, and one of these challenges was performance. It's kind of working well, but there are some challenges with performance, especially as Next.js is mostly built on top of JavaScript-based tooling. And JavaScript-based tooling has a really hard time leveraging all the power of your computer for compute-heavy work, such as using multiple CPUs. Really, JavaScript might not be the best language for compute-heavy work or for build kind of things.

    Basically, the Next.js team and I also started to work on porting some parts of Next.js, or of the compiler infrastructure, into the Rust world. So SWC was integrated into Next.js, and it really has a lot of benefits performance-wise, but there are also some challenges integration-wise. There's always a boundary between the JavaScript world and the Rust world and you have serialization problems in between, so there are still challenges while working on that. There are also some trade-offs we had to make in Next.js for performance. One example was that we were resolving module requests in Webpack and we had to be really optimistic about this to make it performant. Once we successfully resolved something, we just assumed that this has not changed, and we also assumed that node_modules usually don't change. And this is working well for 99% of cases, but it's kind of a trade-off and we don't want to be forced to make it.

    [04:14] But there were also some implementation challenges in the Next.js team. Currently, Next.js is using four or five Webpack compilers to make all this work. One for client, one for server rendering, one for edge rendering, one for server components, and the fallback compiler for error messages. And all this orchestration work between these compilers is not that easy. It is working, but it's a source of errors where you can make mistakes or forget something, or where the communication is not working correctly, and that's not something we want to be forced to deal with.

    There were also some challenges on the Webpack side. Webpack was built 10 years ago and it still has the architecture from 10 years ago, when applications were hundreds of modules. And web development scaled a lot in the last decade. So the Webpack architecture was not really built for this large-scale incremental build performance kind of thing.

    [05:12] That is a problem. And we can't easily fix that because there are a lot of plugins that depend on the architecture of Webpack, and it's really hard to make backward-compatible changes to the architecture. We can't really change the architecture while being backward-compatible, and we also don't want to break everyone's use cases by just changing the architecture of Webpack. So that's not really a good option to fix that. On the other hand, we fixed a lot of things in Webpack while I was working on Next.js to make it more performant. But we hit the limit of the kind of optimization we can do without rewriting everything.

    There were also some other challenges like cache invalidation. In many cases, it's too sensitive: you change something and it has to rebuild a larger part of the application even though that part is probably not affected... For example, if you change a comment, why do we have to rebuild the module? It could be more granular in what we can cache.

    [06:30] One problem of the architecture is this cache lookup cost problem. What we do when doing an incremental build is we basically start, similar to a full build, building everything, and for every piece of work we want to do, we first check if it's already in the cache, and then we can just skip doing that work. That sounds great, but at large scale, if you look up 10,000 modules in the cache, you have this serious cost of looking things up in the cache, and that's kind of a problem. But when we are talking about lookup problems, we are talking about rebuilds of a few seconds. If you worked in the native world, you might know that there are incremental build times of minutes or something. Yeah, we have a really luxurious problem in the web development world, I guess.
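
    The lookup cost described here can be sketched in a few lines of Python. This is only an illustration, not Webpack's actual code: the function `rebuild_with_cache_lookups`, the module names, and the uppercase "compilation" stand-in are all assumptions. The point is that the number of cache lookups follows the size of the application, even when only one module changed.

```python
import hashlib


def content_hash(source: str) -> str:
    """Hash the module source so an unchanged module maps to the same cache key."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def rebuild_with_cache_lookups(sources, cache):
    """Walk every module like a full build, but skip work that is already cached.

    Returns (outputs, lookups, recompiled).
    """
    lookups = 0
    recompiled = []
    outputs = {}
    for name, source in sources.items():
        key = (name, content_hash(source))
        lookups += 1  # one lookup (plus one hash) per module, changed or not
        if key not in cache:
            cache[key] = source.upper()  # stand-in for real compilation
            recompiled.append(name)
        outputs[name] = cache[key]
    return outputs, lookups, recompiled


# Hypothetical application with 10,000 modules.
sources = {f"module_{i}.js": f"export const v = {i};" for i in range(10_000)}
cache = {}
rebuild_with_cache_lookups(sources, cache)  # initial build fills the cache

sources["module_42.js"] = "export const v = 'changed';"
_, lookups, recompiled = rebuild_with_cache_lookups(sources, cache)
print(lookups, recompiled)
```

    Only one module is recompiled, but all 10,000 modules are still hashed and looked up, which is the cost the talk is pointing at.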

    But there were also some opportunities we could address while working on a new bundler. At Vercel, we have this monorepo build tool called Turborepo, and when combining this with Next.js, the two can learn a lot of stuff from each other. For example, Next.js has this cool granular caching. If you change one module, you will only rebuild one module and some boilerplate stuff. But in the Turborepo world, you always have this caching granularity of whole commands. So Turborepo can learn from Next.js by having more granular caching, and it can also learn by having a more development-focused watch mode where you get this incremental experience by default. But on the other side, Next.js can learn from Turborepo. Turborepo has these cool features around remote caching. It can actually share your cache with your team: it's uploaded to the cloud or to a self-hosted solution. And then you can share that with your team and you don't have to rebuild what your colleague has already built. That's cool. So we want to have that in Next.js, like sharing the cache with the team and sharing the cache with deployments, so we don't have to do the whole work twice.

    [08:41] But there are also some new opportunities we can offer. I've seen that many developers actually want to have more insight into the build process. What's the size of the page or of its components? What's the size of dependencies? How does this pull request change my application or the performance of my application? These kinds of insights. And we want to offer more of these types of insights. Also related to build performance, like meta statistics. Why is my build slower? Or how does this Webpack plugin affect my build time? These kinds of statistics.

    We made this magic plan where we had this idea about building something new. The idea was that we have these common challenges about caching and invalidation and watch mode and incremental builds, and we want to abstract that from the bundler. The plan was to build some kind of core engine which solves these common problems and then build a bundler on top of this core engine, so we don't have to solve these problems all over again and take care of cache invalidation in every piece of code we write. And then after writing this bundler, we just want to use it in Next or in other frameworks to get the benefits out of this new idea. And that's basically what we did.

    [10:05] And we always had this goal in mind that incremental builds should be independent of app size, which means my application can grow as large as it wants to, but my incremental builds should still be constant in performance. It should always be the same time spent on incremental builds. Basically, incremental builds should only depend on the size of the affected modules or affected changes in my application, not on the total size, which is not the case for Webpack, for example.

    On top of this idea, we built a layering system. We wanted to use Rust as the base layer, as the language. The most pressing reason was that we wanted to use SWC as a parser, because we don't want to write a new parser; we want to innovate quickly and all this stuff. SWC is based on Rust, so it was an obvious choice to use Rust, but Rust also fits well with our challenges: it has predictable performance, it has … which is easy to use, and shared memory and such. It's also a safe language, with memory safety, which is a good point for remote stuff and for security reasons. But there are also some trade-offs in using Rust. Rust is usually much harder to write compared to JavaScript. And this could be a developer experience problem when we want to let developers write plugins. We made the decision that we always want to provide both JavaScript and Rust as plugin interfaces. The plan is to always allow the developer to write plugins either in JavaScript or in Rust. JavaScript might be a little bit slower, but in many cases that might not be relevant. And in the end, you can still start by writing your plugin in JavaScript and port it to Rust afterwards, once you have figured everything out and if it turns out to be a performance problem.

    [12:00] But in most cases, it will probably not be a performance problem. And on top of Rust, we basically built Turbo Engine, which is this common core engine which solves these common problems, caching, invalidation, incremental builds, and which also kind of abstracts these common sources of external assets like file system, fetching, networking, stuff like that. And on top of Turbo Engine, we build Turbopack, which is just a bundler. It's basically doing all the stuff that a bundler is doing, like CSS, static assets, whatever, ECMAScript, TypeScript, WebAssembly, images, fonts. Well, there are a lot of things, actually. And then Turbopack can be used by Next.js as a bundler in place of Webpack, but it can also be used by other frameworks like whatever.


    How the Turbo Engine works


    [12:52] Let's zoom into how Turbo Engine works. Turbo Engine is an automatic... Turbo Engine gives you the ability to write turbo functions. A turbo function is basically a function you annotate with some kind of annotation, and that makes it a turbo function. A turbo function means that you opt into per-function memoization for caching. It will automatically memoize, or cache, your function. If you call it twice, it will not compute it twice. But we also do something for cache invalidation. We automatically track all dependencies of the function. If you write this kind of turbo function and then you access some data, we automatically track that and build a large graph, which we call the task graph, out of all executions of turbo functions. We have this large graph of tasks and dependencies between them and can also do all this cool graph work, like analytics, on top of this compute graph or task graph.
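
    As a rough illustration of the idea of memoized functions with automatic dependency tracking, here is a minimal Python sketch. It is not the Turbo Engine API (which is written in Rust): the decorator name `turbo_function`, the global `task_graph`, and the toy `read_file` and `parse` functions are all hypothetical.

```python
import functools

task_graph = {}   # task key -> set of task keys it read from (its dependencies)
_cache = {}       # task key -> memoized result
_stack = []       # tasks currently executing, innermost last
executions = []   # log of real executions, kept only for demonstration


def turbo_function(fn):
    """Hypothetical decorator: memoize per (function, arguments) and record
    which other turbo functions were called while computing the result."""

    @functools.wraps(fn)
    def wrapper(*args):
        key = (fn.__name__, args)
        if _stack:
            # The task that is currently running depends on this task.
            task_graph[_stack[-1]].add(key)
        if key not in _cache:
            task_graph[key] = set()
            _stack.append(key)
            try:
                executions.append(key)
                _cache[key] = fn(*args)
            finally:
                _stack.pop()
        return _cache[key]

    return wrapper


# Toy in-memory "file system" standing in for real file access.
FILES = {"a.js": "import './b.js'", "b.js": "export default 1"}


@turbo_function
def read_file(path):
    return FILES[path]


@turbo_function
def parse(path):
    return read_file(path).split()


parse("a.js")
parse("a.js")  # second call is served from the cache, nothing is recomputed
print(executions)
print(task_graph)
```

    The author of `parse` never declares its dependency on `read_file`; the edge in `task_graph` is recorded as a side effect of the call, which is the property the talk relies on for invalidation.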

    And by tracking dependencies, we can also automatically schedule your work in the background in a thread pool to make … automatically. Tracking dependencies means that once you access data from a different task or different turbo function, we can await it at that point because we track it, and we make scheduling automatic and transparent for the user of Turbo Engine.

    [14:22] We have an example of what a task graph looks like. It's really a simplified version. In reality, a task graph has about 200 tasks per module of a web application, so it's really, really granular, but this gives you an idea of what to expect from it. We have this graph where all those function invocations, the tasks, are connected to each other by their dependencies.

    And what we can do with this kind of graph is make incremental builds super cool. For the initial build, we obviously have to execute all the functions, only once, but we have to execute them. But once you make an incremental build, you have some source of change. For example, a file watcher, which basically invalidates one of these tasks or functions. In this example, it invalidates the Read file task, and we basically start invalidating or recomputing the graph from that external source, and we can bubble the change through the graph by following the edges backwards. And we compute only the tasks that are needed to update the graph to the new change. This has a lot of benefits. We only have to touch tasks that are really affected by the change, and we can also stop bubbling the change. In this example, it's shown here where some change might not affect the output of a task.

    [15:57] For example, you change some kind of code which doesn't affect imports. In this case, after getting the module references, it will see the references are the same and it will stop bubbling the change through the graph. We can stop at any point. But you also see that we can automatically run code in parallel: both of these tasks depend on parsing TypeScript, and we can just run them in parallel automatically. And basically, you don't have to think about these common problems when writing a bundler on top of Turbo Engine. This solves a lot of problems. We don't have this problem with single-threaded JavaScript because we are writing Rust and we have automatic scheduling and parallelization. It also solves the problems of cache invalidation. You can't miss invalidating the cache because it's automatic. You can't break it, or at least it's really hard to break. And it also solves the problem of too-sensitive cache invalidation because we have this really granular function graph where we can follow changes and dependencies in this really granular way, and so it makes cache invalidation really granular and correct.
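
    The bubbling and the early stop can be sketched like this in Python. This is a simplified illustration under stated assumptions, not how Turbo Engine is implemented: the `TaskGraph` class, its methods, and the four toy tasks are hypothetical, and tasks must be added in dependency order so that the insertion order is a valid topological order.

```python
_MISSING = object()  # sentinel for "no previous output"


class TaskGraph:
    """Toy task graph: recompute from a changed task and bubble to dependents
    only while outputs actually change."""

    def __init__(self):
        self.order = []       # task names in topological (insertion) order
        self.compute = {}     # task name -> function of its dependencies' outputs
        self.deps = {}        # task name -> list of dependency names
        self.dependents = {}  # task name -> list of tasks that read from it
        self.outputs = {}     # task name -> last computed output
        self.log = []         # names of executed tasks, for demonstration

    def add(self, name, deps, fn):
        self.order.append(name)
        self.compute[name] = fn
        self.deps[name] = deps
        self.dependents[name] = []
        for dep in deps:
            # Raises KeyError if a dependency was not added first.
            self.dependents[dep].append(name)

    def _run(self, name):
        inputs = [self.outputs[dep] for dep in self.deps[name]]
        self.log.append(name)
        new_output = self.compute[name](*inputs)
        changed = self.outputs.get(name, _MISSING) != new_output
        self.outputs[name] = new_output
        return changed

    def build(self):
        for name in self.order:
            self._run(name)

    def invalidate(self, name):
        dirty = {name}
        for task in self.order:
            if task in dirty and self._run(task):
                dirty.update(self.dependents[task])  # bubble only on change


FILES = {"a.ts": "import './b'\nconsole.log(1)"}

graph = TaskGraph()
graph.add("read_file", [], lambda: FILES["a.ts"])
graph.add("parse", ["read_file"], lambda src: src.splitlines())
graph.add("references", ["parse"],
          lambda lines: [line for line in lines if line.startswith("import")])
graph.add("resolve", ["references"],
          lambda refs: [ref.split("'")[1] for ref in refs])

graph.build()       # initial build runs every task once
graph.log.clear()

FILES["a.ts"] = "import './b'\nconsole.log(2)"  # edit that keeps the imports
graph.invalidate("read_file")                   # what a file watcher would trigger
print(graph.log)
```

    The edit changes the file and its parse result, but the list of imports stays the same, so `resolve` is never re-executed. That is the early stop described above.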

    It also solves the problem of many cache lookups, because for all these tasks which are gray here, we don't have to do anything. We don't have to look them up in the cache. They are just sitting around, not being touched by the change at all. There just isn't any cost for inactive tasks. Basically, you can have an application as large as you want to, and your change performance is only affected by the tasks you have to recompute, which is really minimal. This basically gives us our goal that incremental builds should be independent of application size.

    [17:43] And on top of this Turbo Engine system, we build Turbopack which is basically a bundler but with two major differences compared to Webpack. It has a system of mixed environments. In a Turbopack asset graph, you basically can mix environments too. You can mix server, and server is importing a client component, or you're importing edge function from a server component, whatever. You can mix environments in the graph and it's just one compiler taking care of all the graphs.

    And this gives a lot of benefits. We can do cross-environment optimization, like tree-shaking between environments and such stuff. A lot more opportunities to optimize your code. We also have this concept of lazy asset graphs. Turbo Engine takes care of incremental builds, but what about the initial build? Do we want to build all the tasks from the start? Probably not. We want to have some kind of lazy system. Basically, in a bundler we build multiple graphs, like a module graph and a chunk graph. And we build the graphs in a way that they are derived from the previous graph. The module graph is derived from the source code and the output graph is derived from the module graph. In this way, we don't have to build a function which converts one graph into another graph. We build this derived graph, and it uses a function to get references, which is a turbo function. And this way you don't have to build the graph up front, you just build it on demand when you access it. Basically, this means everything is lazy by default, or the complete graph is lazily built. And this means only if you do an HTTP request to the dev server, it will compute, read your files, and build a module graph only for the part of the graph that is needed for serving this HTTP request, which makes it really lazy, and it should also be kind of a streaming experience when you open a page on your desktop.
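
    A small Python sketch of the lazy graph idea follows. The file names, the `references` function, and `modules_for_request` are assumptions made for illustration, and `functools.lru_cache` stands in for a turbo function. The graph is never built up front; it is discovered by following references from the entry point of a request.

```python
import functools

# Hypothetical project: module path -> the modules it imports.
SOURCES = {
    "pages/index.js": ["components/header.js"],
    "pages/about.js": ["components/header.js", "components/team.js"],
    "components/header.js": [],
    "components/team.js": [],
}

read_log = []  # which modules were actually read, for demonstration


@functools.lru_cache(maxsize=None)
def references(module):
    """Stand-in for a memoized 'get references' function: reads and analyses
    a module only the first time somebody asks for its references."""
    read_log.append(module)
    return tuple(SOURCES[module])


def modules_for_request(entry):
    """Walk the module graph on demand, starting from the requested page."""
    seen, stack = [], [entry]
    while stack:
        module = stack.pop()
        if module not in seen:
            seen.append(module)
            stack.extend(references(module))
    return seen


modules_for_request("pages/index.js")
after_first_request = list(read_log)

modules_for_request("pages/about.js")
print(after_first_request)
print(read_log)
```

    The first request only reads the index page and the header. The second request reads the about page and the team component, and reuses the already known header.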

    [19:47] And this new bundler, we basically want to use it in Next. To do that, we try to move all the cool stuff from Next into Turbopack, so we have this in the core, in the bundler, and it's also available for other frameworks. And in Next.js, the build system is only left with a few conventions and some Next.js-specific runtime code, and that's basically Next.js on top of Turbopack.


    Next steps


    [20:14] And what are the next steps? For Next.js, we basically did an alpha release at Next.js Conf, it's open source, and we are basically seeking feedback. It's obviously not ready for consumption and not production ready because it's missing a lot of features. We want to reach feature parity with Next. That's our next step. And we also want to start dogfooding it on our own systems, which gives us a lot of testing and a direct connection with the people testing it, but you can also test it. That would be really helpful. Of course, there are a lot of bugs to fix, like edge cases we didn't consider yet and that stuff. And next year, we want to do a beta release to make it feature complete with Next.js and put it in public hands to test it.

    But our vision is larger. We want to have Turbo not as an opt-in. We want to make it the default. It probably takes years to do that, but the vision is to make it Turbo for everyone. And we also have a lot of ideas. When Turbo is the default, we can get rid of the complexity in the old system, which really limits the innovation we can do. We want to have a more advanced building block for making more innovative new ideas. And the next step for Turbopack is basically to make a plugin system, move the Next.js support into a plugin, and then add more plugins for other frameworks. We don't want to be Next.js-specific, we really want to add plugin support and add more frameworks. This should be something usable by everyone, by every framework.

    [21:56] But the vision is larger. Currently, bundlers don't give you that much control at runtime. So it's really hard to test production builds. You have to reconfigure Webpack to make production builds, or you have to do a full build just to test if it's working or if you have production-only bugs. We really want to make it more interactive. We want to give you control. In my vision, there's a slider in the dev server UI where you can just slide from development to production and test the production version of your page right in the dev server. This kind of experience is something we want to give you.

    And there are also more optimization opportunities. Currently, the optimization abilities in Webpack are really limited because most of the more advanced optimizations have a really large performance cost. Currently, modules are the smallest unit of optimization. What we can do instead is split up your modules into statements and declarations and make these the smallest unit of optimization. This makes tree-shaking much more granular. We can split modules, we can split pre-bundled libraries into their building blocks. We can split modules into different chunks and make more optimizations for the user at runtime.
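
    To show what declaration-level granularity could mean for tree-shaking, here is a minimal Python sketch. It is an assumption about the general technique (reachability over declarations), not Turbopack's implementation, and the `shake` function and the declaration names are hypothetical.

```python
def shake(declarations, used_exports):
    """Keep only the declarations reachable from the exports that are used.

    declarations maps each top-level declaration of a module to the set of
    other declarations in the same module that it references.
    """
    keep, stack = set(), list(used_exports)
    while stack:
        name = stack.pop()
        if name not in keep:
            keep.add(name)
            stack.extend(declarations[name])
    return keep


# Hypothetical date utility module with two independent halves.
declarations = {
    "formatDate": {"pad"},
    "pad": set(),
    "parseDate": {"tokenize"},
    "tokenize": set(),
}

# The application only imports formatDate.
print(sorted(shake(declarations, ["formatDate"])))
```

    With whole modules as the smallest unit, importing `formatDate` would pull in all four declarations. With declarations as the unit, `parseDate` and `tokenize` can be dropped or moved to a different chunk.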

    [23:09] For Turbo Engine, the next steps are… it's already open source but it's not really consumable standalone. But we really want to make Turbo Engine something that consumers can build on without Turbopack. It should be something you can build on top of if you want to build something cool, some incrementally optimized thing. There are some steps missing. We don't have a logo for it, and we want to stabilize the API. It's still evolving, and we can evolve it by working on Turbopack. That's really great. And there's no documentation yet, but the plan is to make an actual release, make it usable standalone, and then it can be used in other scenarios than bundling, maybe.

    But the vision is even larger. Currently, the task graph is limited to one user, to one process, and that's not really what we want to do in the future. In the future, it should be shared with your team, like I mentioned before. It should be one graph for your team, and they can share modules, share tasks, and share the computation of tasks. And that basically works because you can trust your team that the computation your colleague is doing also works for your case. And we can go even further. Computation still happens on local machines, but we can also move it, or at least some of it, to the cloud and make some kind of edge compute cloud which computes part of your application if you don't want to wait so long for your own computer to build it.

    [24:39] This is cool because you can usually trust the cloud. And if you trust the cloud, you can get something like public caching, where when you ask the cloud to compute some node module, there's a good chance that somebody else in the world already computed this node module before. So we can just use this public caching opportunity to make it even faster.
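
    One common way to share computed results between users is a content-addressed cache, where the key is a hash of the task and its inputs. The talk does not say how Turbo would do this, so the Python sketch below is purely an assumption for illustration; `cache_key`, `run_task`, the in-memory `shared_cache`, and the package name are all hypothetical.

```python
import hashlib
import json

shared_cache = {}        # stands in for a team-wide or public cache service
stats = {"computed": 0}  # how often real work was done


def cache_key(task_name, inputs):
    """Derive the key only from the task and its inputs, so two users with
    identical inputs arrive at the same key."""
    payload = json.dumps({"task": task_name, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_task(task_name, inputs, compute):
    key = cache_key(task_name, inputs)
    if key not in shared_cache:
        stats["computed"] += 1
        shared_cache[key] = compute(inputs)
    return shared_cache[key]


def minify(inputs):
    # Stand-in for an expensive transformation of a node module.
    return inputs["source"].replace(" ", "")


module = {"package": "some-lib@1.0.0", "source": "a = 1"}
alice_result = run_task("minify", module, minify)        # computed once
bob_result = run_task("minify", dict(module), minify)    # served from the cache
print(stats["computed"], alice_result, bob_result)
```

    The second caller gets the result without recomputing, as long as the task and its inputs hash to the same key. This only works if everyone trusts whoever filled the cache, which is the trust question raised in the talk.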

    But we also want to give you more visibility. We want to have more insights into builds. We want to have statistics on how your build is performing. Cool summaries which say what's really affecting the current build time. And also some kind of linting or hinting system for performance, where you can say that this should not take longer than blah time or whatever. And there's this bundle analyzer in Webpack, but there could also be a build performance analyzer where you get the same insights a bundle analyzer gives you, but for the build process. How long are modules taking? What is affecting your performance? This giant, huge node module is affecting your performance, whatever.

    [25:45] And the vision for Turborepo is to get more granular caching for everyone. If we had all the granular caching we have with Turbopack and Next.js in Turborepo, we could also make it available for more operations, for other frameworks, also for common file operations, and these kinds of things.

    Thank you. That was all I have to say. And if you want to find me afterwards, I'm either in the Vercel booth or in the Q and A room or in the performance discussion room, or in the after-party.


    Questions


    [26:23] Eli Schutze: Thank you. Please, step into my office. Let's have a quick chat with Tobias about this stuff. We had a lot of questions in, we're going to get to a couple. And the rest just as a reminder, you can go see him in the speaker room out by the reception. Are they up? First question, I think you touched on this a little bit at the beginning, but why not release the same thing but as Webpack 6? Why a new product?

    [26:51] Tobias Koppers: We thought about that, but the problem is that it is too large a breaking change. I would not be comfortable calling it Webpack 6, because you would basically get some Webpack 6 release which is like Angular 2, which breaks everything, and that's not what users are expecting from that. I think a new name makes sense for a completely new architecture. If it were Webpack 6, it would probably be incompatible with everything before. All the plugins would not work. That's not what we want to do. A new name makes sense for that, and it's sort of cooler.

    Eli Schutze: So it's mostly a different thing.

    [27:28] Tobias Koppers: It's a different thing but with the same motivation. But we're also working on making a good migration tool from Webpack. I guess it will be some kind of Webpack-style plugin which gives you the Webpack thing with much configuration, much advanced things for Webpack, and in the Turbo world, it will make migration easy and stuff like that.

    [27:53] Eli Schutze: It's a perfect lead-in to our next question, which is, will there be migration paths from Webpack to Turbopack?

    [27:58] Tobias Koppers: There will be. Currently, you can't really migrate yet because we don't support most of the things. But once we have it done the way that you can easily migrate, we'll have an advanced migration guide with all the things you want to. We really want to get the Webpack people onto Turbopack and that's why we want to offer a good migration guide.

    [28:22] Eli Schutze: There's a bit of a spicy Webpack question, which is, the Webpack config was rather complicated, according to this asker. Is this problem also tackled somehow?

    Tobias Koppers: Yeah. Currently, we don't have any config, but the plan is to...

    Eli Schutze: Easy.

    [28:34] Tobias Koppers: Yeah. We probably will have some config. We know that there's a problem with the Webpack config. We can't easily fix it in Webpack because of all the breaking changes and stuff like that. But we know about the problem and we want to make it easier, and that's definitely on our mind.

    [29:00] Eli Schutze: Fair enough. There's a couple of questions here that are sort of tying together, which is a bit of a nervousness. Since it's based on Rust, will it be just as simple to integrate into a JavaScript project, or will the plugins be any different? What language are they going to be in?

    [29:19] Tobias Koppers: Currently, it's already integrated into the JavaScript ecosystem by Next.js using it. Next.js is technically JavaScript, it starts up with JavaScript and then calls out to Turbopack. It can be integrated; it's like a native Node.js plugin to integrate. But there will probably also be some kind of standalone binary executable. And we will make it possible to integrate it. Also, with our story that we want to offer JavaScript-based plugins, it will be able to run the JavaScript code and integrate with it.

    Eli Schutze: So we don't all have to learn Rust really fast.

    Tobias Koppers: Yeah. I don't want to expect developers to learn Rust.

    Eli Schutze: Turbo learn if you will.

    [29:55] Tobias Koppers: Yeah. That's what I... Rust is a good choice for performance but it's really hard to learn and hard to write. I wouldn't feel comfortable forcing developers to write plugins in Rust. They won't do that, and nobody wants to learn Rust if they're just working on web development. So we want to offer JavaScript plugins and give you all the... You will not have to learn Rust when you want to run Turbopack.

    [30:22] Eli Schutze: Sounds good to me because I don't know Rust. This is a fun one. Is it too optimistic to expect Turbopack with Next to come out of beta before summer 2023?

    Tobias Koppers: I think that's not too optimistic.

    Eli Schutze: Oh, there you go.

    Tobias Koppers: At least in the beta version. It will not be super stable and production ready probably, but we plan early next year, a beta version should have most of the features.

    [30:45] Eli Schutze: So it depends on how brave you're feeling, I guess.

    Tobias Koppers: Yeah. Basically, if you have a custom Webpack config in Next.js, then we probably won't have this until summer, or you have to write it in a different plugin system. But at least for basic Next.js usage, where you don't have any advanced, Webpack-specific configuration, I'm pretty sure we'll have this early next year, by summer. It should be fine.

    [31:14] Eli Schutze: Perfect. We are out of time for onstage questions, but you can find Tobias out in the speaker Q and A room after this. Round of applause, please. Thank you for joining us.

    Tobias Koppers: Yeah, thank you for having me.

    32 min
    02 Dec, 2022
