In TypeScript 5.0, the TypeScript toolchain migrated to modules. In this talk, we'll get deep in the weeds, discussing what "modules" even are (and how we somehow weren't using them), the specifics of the migration itself, how we managed to make the switch "mid-flight" on an actively-developed project, how the migration went, and what's next.
Migrating TypeScript to Modules: The Fine Details
AI Generated Video Summary
This talk discusses TypeScript's migration to modules, the challenges faced, and the automation used for the migration. The ts-morph library and git are used for code transformation and managing changes. The final step involves converting "ts."-prefixed names to named imports. Bundling with ESBuild brings benefits like a faster development loop, and the migration has prompted work on improving import organization in TypeScript.
1. TypeScript's Migration to Modules
Hey everyone, I'm Jake. Today, I'm going to talk about TypeScript's migration to modules. It was a huge change, completely changing the way the TypeScript compiler was structured internally. We'll discuss what a migration to modules means, why it was challenging, how I made it less painful, and what's next for modules in TypeScript. Modules in TypeScript refer to the import and export syntax, unlike the previous use of scripts. Namespaces, originally called internal modules, provided encapsulation, export, and import like modules, but with some differences.
Hey everyone, I'm Jake. I'm a senior software engineer at Microsoft, and today I'm going to talk about TypeScript's migration to modules.
So, first off, what are we even talking about? Well, in November of last year, I sent this PR. It was the culmination of many, many months of work. It was a huge change, some quarter of a million lines of code, and it completely changed the way the TypeScript compiler was structured internally. We talked about this in length on our blog, but there are plenty of details that we didn't get to talk about, and that's what I'm going to talk about today.
Real briefly in outline, we're going to talk about what a migration to modules even means. We're going to talk about why it was so challenging, but how I made it less painful. We're going to talk about how the migration actually worked under the hood, and we're going to talk about what that meant for us, how it went, and what's next for modules in TypeScript.
So getting back to basics here, what even are modules? I think there's a few different definitions one can come up with for this, but I think the two most critical are these. First, there's the syntax itself: that's import and export, and the part where you make multiple files and import them between each other. Second, there's the output format: ESM like I've written below, CommonJS, which is still used pretty heavily in the Node ecosystem, and plenty of other minor formats. When I'm talking about migrating TypeScript to modules, what I'm referring to is specifically the first one. So that's changing TypeScript to use the import and export syntax.
Now that raises the question: if TypeScript wasn't using modules before, and didn't use import and export, what did it use? The answer is scripts. So in TypeScript, everything is either a script or a module. So if our files aren't modules, then they are scripts. We put all of our code into global namespaces. So you can see that we have parser.ts here, we define a function createSourceFile, export it, then in another file that declares the same namespace, we can use createSourceFile, and it all just works. You'll notice that we don't have to use ts. here to refer to that. If it's inside of the namespace we're currently working in, we get to refer to it implicitly. Now, namespaces were originally called internal modules, and you can sort of see why. They provide some of the encapsulation, export, and import that modules do, but differently. Now, when we go to emit these namespaces, they turn into plain objects and functions. So you can see on the left here that we have createSourceFile in parser.ts, it's defined as a function inside of a closure, and because it's exported, it gets set on this ts namespace object. Then later inside of program.ts, we define the ts namespace again. But when you get to the createSourceFile call, you notice that we get a ts. explicitly. Our implicit access became explicit.
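To make the namespace pattern concrete, here is a minimal sketch of two declarations of the same ts namespace, roughly as they might appear in parser.ts and program.ts. The function bodies and the SourceFile shape are placeholders invented for illustration, not the compiler's real code, and both blocks are shown in one file only so the snippet stands alone.

```typescript
// Sketch only: bodies and the SourceFile shape are illustrative placeholders.

// --- roughly what parser.ts might contain ---
namespace ts {
    export interface SourceFile {
        fileName: string;
        text: string;
    }

    export function createSourceFile(fileName: string, text: string): SourceFile {
        return { fileName, text };
    }
}

// --- roughly what program.ts might contain ---
namespace ts {
    export function createProgram(fileName: string, text: string): SourceFile[] {
        // No "ts." prefix needed: exported members of the merged namespace are in scope.
        return [createSourceFile(fileName, text)];
    }
}

// Each block compiles to a function that populates a shared `ts` object, and the
// implicit call above is emitted as an explicit `ts.createSourceFile(...)`.
const sketchProgram = ts.createProgram("a.ts", "let x = 1;");
```

Outside the namespace, as on the last line, the prefix is always required; only code inside a ts block gets the implicit access.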
2. Bundling and Imports in TypeScript
Now, all of our code is in different files, but we can bundle them by using outFile and prepend. So, tsc has effectively been a bundler this whole time, although this option is being removed. If someone wants to import us, we can be clever by using namespaces. Namespaces have some upsides, but nobody writes code like this anymore. We want to have files that explicitly export and import each other.
Now, all of our code is in different files, but we can bundle them by using outFile and prepend. So you can see that inside of tsc's tsconfig, we set outFile to be tsc.js, and then we ask tsc to prepend all the contents of its dependent projects. Below, you can see that all the code that the compiler project defined and that the executeCommandLine project defined gets shoved at the top of the file, followed by all the code that happened to be in the tsc project.
So, tsc has effectively been a bundler this whole time, although we'll see later that this option is being removed. Now, if someone wants to import us, this is sort of interesting because we're global scripts. There are no imports or exports. But we can be clever here. What we can do is say that if we're in a CommonJS context, that is, there is module and module.exports, then we can just read our ts namespace as a value and set it to module.exports. This means that if someone loads us inside of a context that supports CommonJS, so Node, but also bundlers, they'll see that this is a CJS module and it'll work as people expect. But if someone loads this into their web page by just using a script tag, they'll get a ts variable and it'll also work.
Namespaces have some upsides. Because we're using namespaces, we don't have to write imports. Obviously, we're talking about adding imports; there were no imports at all before. When we add new code, we don't have to import it. If it's in another file, we just use it. If we move code from one file to another, we don't have to change imports either. We also get bundling for free, as used by tsc.
But nobody writes code like this anymore. This means that we don't get to dogfood modules. If we want to test out things like NodeNext resolution, auto imports, organize imports, all that good stuff, we can't do it in our own codebase. We're also unable to use external tools that only handle imports and exports. So, if we want to, say, test for cycles between files, all those tools expect imports to be present and we can't use them. We also need to maintain prepend. It turns out that nobody uses this feature except for TypeScript itself. So, it would be really great to remove it from the product and not maintain it anymore.
We want to have files that explicitly export and import each other. We have the same files as before, but now parser.ts exports createSourceFile at the top level, and then in program.ts we import it by name and use it.
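The CommonJS trick described earlier in this section boils down to a runtime check along the lines of "does module.exports exist?". The sketch below models that check as a function so it can run anywhere; exportIfCommonJS, ModuleLike, and the fake objects are hypothetical names for illustration, with maybeModule standing in for CommonJS's module variable.

```typescript
// Sketch only: `api` stands in for the global `ts` namespace object, and
// `maybeModule` stands in for CommonJS's `module` (undefined in a plain <script>).
interface ModuleLike {
    exports: unknown;
}

function exportIfCommonJS(api: object, maybeModule: ModuleLike | undefined): boolean {
    // Mirrors the check described above: only assign when `module.exports` exists.
    if (maybeModule !== undefined && maybeModule.exports) {
        maybeModule.exports = api;
        return true;
    }
    return false;
}

const fakeApi = { version: "0.0.0-sketch" };

// CommonJS-like context: the API object becomes the module's exports.
const fakeModule: ModuleLike = { exports: {} };
const exportedInCommonJS = exportIfCommonJS(fakeApi, fakeModule);

// Script-tag-like context: nothing to assign, so the global variable is the API.
const exportedInScriptTag = exportIfCommonJS(fakeApi, undefined);
```

Either way the consumer ends up with the same object, which is why one bundled file could serve both Node and a script tag.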
3. Challenges and Automation in TypeScript Migration
So, we know what we want, but we need to figure out how we can actually make the switch while maintaining the same behavior and a compatible API. TypeScript is massive, with over a quarter million lines of code. We also have huge files, like checker.ts with nearly 50,000 lines. Additionally, TypeScript changes very often, with an average of five commits every weekday. To handle these challenges, we're automating the change as much as possible using code transformation and git. We're doing everything in steps to debug and review each one individually. We're also using the ts-morph library in the migration tool for code transformation.
So, we know what we want, but we need to figure out how we can actually make the switch while maintaining the same behavior and a compatible API. Of course, this is challenging. TypeScript is massive. So, there are over a quarter million lines of code. And you'll remember from before, a quarter million lines of code were changed in my PR. And all of this is going to have to change.
We also have huge files in general. So, checker.ts is nearly 50,000 lines. So, any tooling we use needs to be able to support large files. Additionally, TypeScript changes very often. So, in the time that it took me to work on the migration itself, there were over a thousand commits to main. These aren't merge commits or commits on people's branches. These are literally over a thousand PRs that were merged into main. And any of these changes invalidates the entire change. That's an average of five commits every weekday. So, whatever we do also needs to be able to handle the code changing often while letting us iterate and work on the problem.
Obviously, we're not going to make this change by hand. We're gonna automate as much as we possibly can. So, that means using code transformation wherever we possibly can. We're also gonna use git to store manual changes. And we're gonna do everything in steps. This is gonna help us debug each step individually. We're gonna be able to review them, which is important when you're changing a lot of code. And we're also gonna be able to keep git blame working, which is really important for us as we really like to go back and figure out when something changed.
The migration tool itself uses ts-morph for code transformation. This is a great TypeScript API wrapper made by David Sherret. It's great at doing TS-to-TS transformation. When I started, I was using TypeScript's own transformation system. It's really good, but it's really more suited for emitting JS files or DTS files.
4. ts-morph Library and Git
The ts-morph library captures formatting and allows us to make the transformation without breaking formatting. Manual changes are stored in patch files and managed with git. If a patch doesn't apply, we can fix the problem and save the changes for later.
TypeScript's own transformation system doesn't necessarily preserve things like formatting, indentation, etc. And so, the ts-morph library, at the cost of some performance, was able to capture all those things and let us make the transformation without breaking all of our formatting. We're also gonna store all of our manual changes in patch files. So, git, obviously, is really great at managing changes. And so, we can make the changes onto the tree and then dump them to disk. Then later, when we want to bring them back again, we can just use git am. And because am is sort of like a rebase, if one of the patches doesn't apply, it will pause, and we can fix the problem and save the changes for later.
5. Code Transformation Steps
Now, let's take a look at the actual code transformation process. The first step is to de-dent the code so it sits at the top-level indentation. Next, we make all namespace accesses explicit by adding 'ts.'. The big step involves removing the code from the namespaces, placing it in the outer scope, and adding imports. This allows us to match the old API using namespace barrels. These barrels can emulate the behaviors of namespaces and allow for nested namespaces and merging of multiple barrels. After step three, we're left with namespace imports and the code containing 'ts.'.
Now, if you'd like to see a demo of what the module migration looked like itself at the time that I ran it on the day, there's a demo below. It takes about five minutes or so. But let's go take a look at the actual code transformation itself.
The first step is a silly one. But if you think about it, when you write code using modules, all of your code is at the top level. But when you write your code using namespaces, all of your code is inside of the namespace block itself. Eventually, we're going to have to take all the code inside of our TS namespace and pull it out to the top level. So, whatever change does that is going to basically change all the white space for all lines of code. If we do this early instead, then Git will figure out what we've done. It will also make all the changes easier to review. So, the first step we're going to take is to just make the styling look bad, and take our code, de-dent it by one level and stick it up to the front.
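As a rough illustration of what the de-dent step does to a file, here is a toy, string-based sketch. It is not the real migration tool (which used ts-morph on the syntax tree); dedentNamespaceBody is a hypothetical helper that assumes a single top-level namespace block, a four-space indent, and a closing brace alone on its own line.

```typescript
// Toy sketch, not the real migration tool: assumes a single top-level
// `namespace ... {` block whose body is indented by exactly one `indent`.
function dedentNamespaceBody(source: string, indent: string = "    "): string {
    let insideNamespace = false;
    return source
        .split("\n")
        .map(line => {
            if (line.startsWith("namespace ")) {
                insideNamespace = true;
                return line; // keep the opening line as-is
            }
            if (insideNamespace && line === "}") {
                insideNamespace = false;
                return line; // closing brace of the namespace itself
            }
            // Strip exactly one level of indentation from body lines.
            return insideNamespace && line.startsWith(indent) ? line.slice(indent.length) : line;
        })
        .join("\n");
}

const indentedSource = [
    "namespace ts {",
    "    export function createSourceFile() {",
    "        return {};",
    "    }",
    "}",
].join("\n");

const dedentedSource = dedentNamespaceBody(indentedSource);
```

The result still has the namespace wrapper, just with its body shifted left, so a later commit that removes the wrapper touches only two lines instead of every line in the file.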
The next step is to make all of our namespace accesses explicit. So, we saw before that when we wanted to access something in the ts namespace, potentially from a different file, we didn't have to prefix our accesses with ts., but this step adds in the ts. prefix. You'll see later why this is important, but we're going to be able to more easily see where all of our namespace uses are within the file if we do this now.
Step three is the big one. This is the one that actually takes all the code out of the namespaces and puts it into the outer scope, and then adds imports. So given our code, which looks like this now, it's de-dented and has the ts. prefix, we're going to take all the code out of the block and stick it in the top level and then add an import, and you'll see that if we look at a diff here, the only change in the file is exactly that: adding the new import and removing the namespace.
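A minimal sketch of what code looks like after step two, assuming placeholder function bodies and with both namespace blocks shown in one file so the snippet is self-contained:

```typescript
// Sketch only: placeholder bodies, shown in one file for self-containment.
namespace ts {
    export function createSourceFile(fileName: string): { fileName: string } {
        return { fileName };
    }
}

namespace ts {
    // Before step two this read `createSourceFile(fileName)`.
    // After step two the reference is spelled out; behavior is unchanged.
    export function createProgram(fileName: string) {
        return ts.createSourceFile(fileName);
    }
}

const explicitResult = ts.createProgram("b.ts");
```

Inside a ts block the name ts refers to the namespace itself, so the explicit form is still valid code and each intermediate step of the migration can be compiled and tested.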
Now, you're probably wondering what this namespaces thing is. This is a way for us to match our old API. So the trick is that inside of a module, say, ts.ts, we can re-export everything from all of the files that used to contribute to the ts namespace. And then when someone goes to import that module, the view that they see is exactly what those files used to declare before for the ts namespace. In the JS world, this is commonly referred to as a barrel module. So I'm just going to call these namespace barrels for consistency. Now, namespace barrels can emulate all the behaviors that namespaces used to have. So for example, if we want to declare a nested namespace, say ts.performance, we can just create a file that contains the ts.performance namespace exports and then re-export them through the ts namespace barrel. If you want to take lots of different namespace barrels and merge them together like prepend would, that's the same as just exporting all of them through one single module. And then later, when we want to export our code, we can just import that ts namespace barrel and then export it as our public API. It's exactly the same API. So recapping, after step three, we're left with these namespace imports. So you can see that we import ts from the namespaces ts module, and our code still contains the ts. prefix.
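Real namespace barrels are separate module files made of re-export statements, which cannot be shown in one runnable file. The single-file sketch below only approximates the idea with plain objects: each object stands in for one file that used to contribute to a namespace, and spreading them plays the role of the barrel's re-exports. All names and bodies are hypothetical.

```typescript
// Approximation only: in the real layout each object below would be a module file,
// and the barrels would consist of `export * from "..."`-style re-exports.

// Stand-in for a file that used to contribute to `namespace ts`.
const parserFile = {
    createSourceFile: (fileName: string) => ({ fileName }),
};

// Stand-in for a file that used to contribute to `namespace ts.performance`.
const performanceFile = {
    mark: (label: string) => `mark:${label}`,
};

// Stand-in for the nested barrel: collects everything for `ts.performance`.
const performanceBarrel = { ...performanceFile };

// Stand-in for the top-level barrel: merges the files and exposes the nested
// barrel under a name, which is how a nested namespace can be emulated.
const tsBarrel = {
    ...parserFile,
    performance: performanceBarrel,
};

// Consumers see one object shaped like the old `ts` namespace.
const barrelFile = tsBarrel.createSourceFile("c.ts");
const barrelMark = tsBarrel.performance.mark("start");
```

One difference worth noting: object spread lets a later duplicate key silently win, whereas module re-exports treat conflicting names differently, so this is a model of the shape of the API rather than of module semantics.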
6. Final Step: Named Imports and Custom DTS Bundler
What this final step is going to do is to take all of those explicitly "ts."-prefixed names and make them named imports. The tedious work is done, but there's loads of manual changes that we have to make after the bulk transformation. We're using ESBuild to bundle TypeScript, which is really fast and has great features like scope hoisting, tree shaking, and enum inlining. We have an alternate mode called nobundle to emit code as CommonJS. We also rolled our own DTS bundler to handle the DTS files.
What this final step is going to do is to take all of those explicitly "ts."-prefixed names and make them named imports. So this is much, much closer to what we actually want at the end of the day. And in fact, if you took the code and you diffed it against the original state of the world, the only thing that would have changed is the import line itself. The code below is identical.
And at this point, the tedious work is done. But we're not done yet, because there's loads of manual changes that we have to make after the bulk transformation. In fact, there were 29 commits. Obviously, this is kind of scary, right? Because any of the changes upstream could break any of our patches, and we're going to have a bad time. But thankfully, we were using Git to manage these. So when we go to ask Git, hey, apply all these patches, it's going to go, hey, your patch didn't apply, can you fix the problem? And then we continue the rebase. And then we just ask it to dump the patches. The patches are stored with the transformation tool, so when the time comes, we can just run the tool and it works.
Now, obviously there are 29 manual changes, so I can't go through all of them. But there are some really interesting ones. So first of all, we're bundling TypeScript using ESBuild. For various reasons, TypeScript still wanted to ship all of its code as these large bundled files. Before, we saw that it was produced by outfile and prepend, so if we want to recreate those same files in new TypeScript with modules, we need to use some sort of bundler. There's lots of bundlers out there, I won't try to list them all, but I went with ESBuild. Obviously, it's really, really fast, so 200 milliseconds to build TSC is really good. It also has a lot of really great features, such as scope hoisting, so that's taking all the code out of people's modules and pulling them to the top level so you don't have to indirect. There's tree shaking, so removing dead code, and then enum inlining. So, when we declare our enums in our compiler, they get inlined and thrown all over the code so that it's faster and doesn't have to go through an object lookup.
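To illustrate what enum inlining buys, here is a small sketch comparing a comparison as written in source with the form an inlining bundler can effectively produce. TokenKind is a tiny stand-in enum invented for this example, not the compiler's real one, and the second function is hand-written to show the idea rather than actual bundler output.

```typescript
// Sketch only: a tiny stand-in enum, not one of the compiler's real enums.
enum TokenKind {
    Identifier = 1,
    Keyword = 2,
}

// As written in source: the comparison reads a property off the enum object at runtime.
function isKeywordViaLookup(kind: number): boolean {
    return kind === TokenKind.Keyword;
}

// What inlining effectively produces: the constant replaces the property lookup.
function isKeywordInlined(kind: number): boolean {
    return kind === 2 /* TokenKind.Keyword */;
}
```

Both functions behave identically; the inlined form simply skips the object lookup on every call, which adds up in hot paths that compare against enum members constantly.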
Now, obviously, we're depending on an external project to bundle our code, and just to make sure that we can still run ourselves and work, our build has an alternate mode called nobundle, which makes sure that we can emit all of our code as CommonJS and it continues to function. Now, because we're using esbuild to bundle, there's nothing around that can bundle the DTS files. So, I ended up rolling my own DTS bundler. It's about 400 lines of code. There are many other DTS bundlers out there, API Extractor, tsup, Rollup plugins, etc. And they all are good in their own way, but for us, we only needed a very small subset of the features. Specifically, because we were using namespaces, we didn't have any naming conflicts. And so, we were able to use our own custom bundler to go a little bit faster, but also produce an output that looks a lot like our old output.
7. Namespace Declarations and Build Changes
We can declare namespaces in our DTS files, maintaining compatibility. We completely changed our build, moving away from Gulp. I rolled my own task runner called Hereby. It runs in parallel, has explicit dependencies, and improves build performance. TypeScript users experienced a 10% to 20% speed up and a significant package size reduction. ESBuild tree shook dead code and the change had no API impact on downstream users.
So, we can declare namespaces in our DTS files, and so people won't know that anything changed, but things will work exactly as they did before. This was also a great opportunity to completely change our build.
So, previously, our old build was Gulp. I think many, many years ago, it was Jake. But it had gotten somewhat convoluted, I'd say. And when we're using modules, all of our build steps are completely different. And so, it was really difficult for me to try to overlay the new build onto the old one and get it working again.
So, as with the DTS bundler, I rolled my own task runner. So, this one's about 500 lines of code. It just works based off of plain functions and exports. And it runs everything in parallel where it can. It has explicit dependencies, kind of like make. And it helps make our build perform faster with fewer dependencies. If you want to look it up, it's called Hereby. I don't think you should use it. It's feature complete at this point. But some daring people have. So, you can go take a look if you'd like.
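The general idea of a task runner with explicit dependencies and parallel execution can be sketched in a few lines. This is not Hereby's API or implementation; Task, task, and runTask are hypothetical names, and the sketch assumes the dependency graph has no cycles.

```typescript
// Sketch only: this is NOT Hereby's API. All names here are hypothetical.
interface Task {
    name: string;
    dependencies: Task[];
    run: () => Promise<void> | void;
}

function task(name: string, dependencies: Task[], run: () => Promise<void> | void): Task {
    return { name, dependencies, run };
}

// Runs a task after its dependencies. Independent dependencies are started
// concurrently, and each task runs at most once even if several tasks need it.
// Assumes the dependency graph has no cycles.
function runTask(target: Task, started: Map<Task, Promise<void>> = new Map()): Promise<void> {
    const existing = started.get(target);
    if (existing) return existing;

    const promise = Promise.all(target.dependencies.map(dep => runTask(dep, started)))
        .then(() => target.run());
    started.set(target, promise);
    return promise;
}

// Usage: "build" needs "compile" and "lint", which both need "generate".
const taskOrder: string[] = [];
const generateTask = task("generate", [], () => { taskOrder.push("generate"); });
const compileTask = task("compile", [generateTask], () => { taskOrder.push("compile"); });
const lintTask = task("lint", [generateTask], () => { taskOrder.push("lint"); });
const buildTask = task("build", [compileTask, lintTask], () => { taskOrder.push("build"); });

const buildFinished = runTask(buildTask);
```

Caching the promise per task is what keeps the shared "generate" step from running twice, and it is also what lets "compile" and "lint" proceed side by side once that step completes.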
So, we did it. So, how did it turn out? Well, it went great. So, for TypeScript users, there were lots of big benefits. So, we saw a 10% to 20% speed up from ESBuild scope hoisting alone. That's really great. Some people just installed TypeScript 5.0 and they got 20% faster for doing nothing. We also got a huge package size reduction from various different sources. So, we were able to delete the TypeScript services bundle out of our package. ESBuild also tree shook away a bunch of dead code in some of our bundles. And interestingly, ESBuild hardcodes a two-space indent, TypeScript hardcodes a four-space indent. And that two-space indent does have a significant impact on the on-disk size. Additionally, the change provided effectively no API change to downstream users.
8. Benefits of ESBuild and Future Plans
They upgraded and nothing really changed. For the TypeScript team, bundling with ESBuild brought a great development loop boost. It allowed for instant debugging, and the move to modules led to the discovery of auto-import bugs. The team is now focused on improving TS's import organization and has deprecated the prepend feature.
They upgraded and nothing really changed. Now, for the TypeScript team, we got a great development loop boost. So, because we're using ESBuild, what we can do is we can just run the ESBuild step and then debug a test instantly. Additionally, we are able to actually do the dogfooding, which was the original reason we wanted to make this change in the first place. So, we immediately found some auto-import bugs and fixed them. It's also spawned an effort to try to make TS better at organizing imports, more like ecosystem tools. And we've also been able to deprecate the prepend feature, which will be removed in TS 5.5.