As your team grows and becomes multiple teams, the size of your codebase follows. You get to 100k lines of code and your build time dangerously approaches the 10min mark 😱 But that’s not all, your static CI checks (linting, type coverage, dead code) and tests are also taking longer and longer...
How do you keep your teams moving fast and shipping features to users regularly if your PRs take forever to be tested and deployed?
After exploring a few options we decided to go down the Nx route. Let’s look at how to migrate a large codebase to Nx and take advantage of its incremental builds!
Transcription
Hi everyone, welcome to my talk on scaling your React app without micro-frontends. Very quick disclaimer first: this talk is about the scalability of your codebase from a developer perspective, not about user-facing performance. That's it for the disclaimer.
Hello again, I'm Jonathan Wagner, I'm an engineering manager at Theodo UK. For the past four years I've worked on a bit more than 10 projects in production, and for the last eight months I've been working with Nx, which is a build system that helps you build faster and do things faster. So that's going to be the core of the talk here.
Let's have a look at what we'll talk about. First of all, we'll discuss the problem, what happens when you scale your codebase, and then my journey to fixing it and what I've learned along the way: what I've tried, what didn't work, what worked.
So first of all, the problem. I mentioned having a codebase that grows. That means your CI becomes a bit slower, and when it becomes slower you have a slower feedback loop, which means your developers are unhappy, they take longer to develop features, and in the end your users are unhappy, which we definitely don't want. When your codebase grows, you have type checking, you have ESLint, you have maybe some dead code checks, some unit tests, end-to-end tests, a lot of testing, and the build time. Everything takes a bit of time and it adds up. In my case it added up to more than 30 minutes, and that's my trigger. Ideally, I want my CI to take 10 or 15 minutes. When it reaches 30 minutes, it means something is going terribly wrong and I want to address it. In this case, we can even see the billable time. It's just skyrocketing. It's going to three, four hours, which means it's costing the company a lot of money and it's frustrating for everyone.
Let's try to look at a more precise example here. So we have a growing codebase, which means we may have 800 tests.
Imagine you have a pull request where you make a one-line change. You push everything and the CI runs everything again. So that means you have to wait 20 minutes, maybe 30 minutes, for the tests to pass. Does it sound normal to you that a one-line change triggers a 20-minute run in the CI? I don't think so. Nx doesn't think so either. The Nx docs actually say that the duration of the invoked operation should be proportional to the size of the change, and that's something very strong. It seems simple, but it's quite tricky to put in place. It involves a bit of caching, a lot of caching, and then putting all of that together properly.
But I didn't know that at first, so obviously I tried to just over-engineer my CI. That meant doing a lot of parallelization and putting some rules in place to skip frontend or backend tests depending on what was changed. That meant writing a lot of custom rules, which were tricky and introduced a few regressions, and even switching TypeScript compilation to something based on Rust, like SWC. Doing all of this, I got some improvements, got down to 20 minutes-ish, but it meant I had quite a tricky CI to manage. We had 27 different jobs, and it meant understanding how they all played together and which ones were conditional, which became tricky to manage and to maintain.
So that's the start of where I wanted to go afterwards. We spent hours and hours optimizing the CI. Where do we go from there? How do we make it faster without over-engineering it more? And here comes the journey: discovering Nx, discovering a bit more about how to use Nx properly, what the secret trick is, and how the caching stitches everything together. As I said before, the main idea we want to aim for is only rerunning things that have changed. And that's basically what Nx does. It does smart rebuilds only on the things that have changed and that have been affected.
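To make "only the things that have changed and that have been affected" concrete, here is a minimal sketch of how an affected set could be derived from a project dependency graph. It is not Nx's implementation: the `ProjectGraph` type, the `affectedProjects` function and the project names are all hypothetical, chosen only for illustration.

```typescript
// Hypothetical project graph: each project lists the projects it depends on.
type ProjectGraph = Record<string, string[]>;

// Returns the changed projects plus everything that depends on them,
// directly or transitively. Illustration only, not how Nx is implemented.
function affectedProjects(graph: ProjectGraph, changed: string[]): string[] {
  // Invert the graph: for each project, which projects depend on it?
  const dependents: Record<string, string[]> = {};
  for (const project of Object.keys(graph)) {
    for (const dep of graph[project]) {
      if (!dependents[dep]) dependents[dep] = [];
      dependents[dep].push(project);
    }
  }

  // Walk upwards from the changed projects through their dependents.
  const seen: Record<string, boolean> = {};
  const queue = changed.slice();
  while (queue.length > 0) {
    const current = queue.pop() as string;
    if (seen[current]) continue;
    seen[current] = true;
    for (const dependent of dependents[current] || []) queue.push(dependent);
  }
  return Object.keys(seen).sort();
}

// Assumed example workspace: one app ("frontend") built from four libraries.
const exampleGraph: ProjectGraph = {
  frontend: ["ui", "data-access"],
  ui: ["design-system"],
  "data-access": ["utils"],
  "design-system": [],
  utils: [],
};

const affected = affectedProjects(exampleGraph, ["design-system"]);
console.log(affected);
```

In this made-up workspace, a change to `design-system` marks `ui` and `frontend` as affected, while `utils` and `data-access` are left alone, so their tests and lint would not need to run again.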
And for that, it uses computation caching, and it helps you generate things so that you don't have to do them by hand every time. Basically, Nx was already set up on the project. We weren't using it as the build system. We were just using it for the monorepo management. Then when I realized how much it could do, that was the open door to so much more.
So now that we know that Nx can do all of this, how does it actually work? I've mentioned a secret trick. It's actually not so secret. It's advertised everywhere in the Nx documentation. The idea is to use libraries. Libraries everywhere. You might wonder, what is a library in Nx? It's simply a folder where you can execute a few operations, like testing, linting, basically anything you want. And each operation only concerns the code in that folder.
So if we look at the example on my project, we had here the frontend, which was the app that was actually deployed and used by our users. And then everything you'll see below are libraries. So the app is using libraries, and then it gets deployed. The libraries are there as a way of organizing the code in your folder structure. And the thing is, you can generate them easily, so there's a common way to create a new one and it doesn't cost you much.
What we can see here as well is that it's highlighting in pink the projects that have been affected by a change. In this case, I made a change in the design system, and it shows me that a few libraries have been affected, and the app as well. So when I run my tests or when I run the lint, it's only going to retest and relint those pink projects, so the libraries and the app. But it's not going to touch anything else, because it hasn't changed, so it doesn't need to be retested. And that's the beauty of it. That's the core principle.
The concept of libraries might still be a bit blurry, so let's have a look here at what it means concretely.
So this is an example repository from Nx where you have basically a couple of apps, products, and then a few libraries. And if we zoom in on the libraries, each one is actually just an ESLint config, a Jest config, and a TypeScript config, and then all your source code in the src folder. It's as simple as that. There's not much more. And it's everything you need to be able to run your operations in each of those folders.
The side bonus that you get from using libraries is that it forces you to have a clean architecture, because you have to ask yourself the question: where should I put my code? Should I put it into the same library? Should I create a new library? Because you ask yourself that question, it forces you to have the discussion, think about it, and put the code in the right place. It saves you some time and prevents spaghetti code from happening in the future. So it's basically saving you time just by using Nx.
This concept of libraries is very nice said like this, but for it to work, as we saw here, in order to know that something has been built already, you need to store that information somewhere. And that's where caching comes in. Nx provides you with some advanced caching where it looks at all the source files and the operation that you're running. So if it's test, the key includes test, and it can be build or anything else. It also looks at the app or the library that you're building, and then some of the runtime options or configuration that you have for React, Jest or something else. It means Nx has a big table with a hash key and then the output it's supposed to give you, in terms of files and in terms of console output in your terminal.
So knowing all of this, I knew there was caching. We had a cache folder and I thought to myself, okay, can I put the cache in my CI and use it just like this? The answer is yes, you can do that. It's the first step you can try.
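The "big table with a hash key" can be sketched in a few lines. This is an illustration under assumptions, not Nx's actual hashing or storage: `TaskInputs`, `CachedResult`, `cacheKey` and `runWithCache` are hypothetical names, and the small string hash stands in for whatever hashing Nx really uses.

```typescript
// Everything that can change the result of a task goes into the key (assumed shape).
interface TaskInputs {
  project: string;                        // the app or library, e.g. "design-system"
  target: string;                         // the operation, e.g. "test" or "build"
  sourceFiles: Record<string, string>;    // file path -> file content
  runtimeOptions: Record<string, string>; // e.g. tool versions or config values
}

// What gets stored and replayed on a cache hit.
interface CachedResult {
  terminalOutput: string;
  outputFiles: string[];
}

// Small deterministic string hash (djb2-style), a stand-in for a real hash.
function hashString(text: string): string {
  let hash = 5381;
  for (let i = 0; i < text.length; i++) {
    hash = ((hash << 5) + hash + text.charCodeAt(i)) | 0;
  }
  return (hash >>> 0).toString(16);
}

function cacheKey(inputs: TaskInputs): string {
  const parts: string[] = [inputs.project, inputs.target];
  // Sort the keys so the hash does not depend on object key order.
  for (const path of Object.keys(inputs.sourceFiles).sort()) {
    parts.push(path + "=" + inputs.sourceFiles[path]);
  }
  for (const name of Object.keys(inputs.runtimeOptions).sort()) {
    parts.push(name + "=" + inputs.runtimeOptions[name]);
  }
  return hashString(parts.join("\n"));
}

const cacheTable: Record<string, CachedResult> = {};

// Runs the task only if no result is stored for these exact inputs.
function runWithCache(inputs: TaskInputs, execute: () => CachedResult): CachedResult {
  const key = cacheKey(inputs);
  if (cacheTable[key]) return cacheTable[key]; // cache hit: replay stored output
  const result = execute();
  cacheTable[key] = result;
  return result;
}

// Usage: the second identical run is a hit, a one-character change is a miss.
let executions = 0;
const fakeTestRun = (): CachedResult => {
  executions++;
  return { terminalOutput: "tests passed", outputFiles: [] };
};
const inputs: TaskInputs = {
  project: "design-system",
  target: "test",
  sourceFiles: { "src/button.ts": "export const a = 1;" },
  runtimeOptions: { jest: "example-version" },
};
runWithCache(inputs, fakeTestRun);
runWithCache(inputs, fakeTestRun);
const executionsAfterRepeat = executions;
const changedInputs: TaskInputs = {
  project: inputs.project,
  target: inputs.target,
  sourceFiles: { "src/button.ts": "export const a = 2;" },
  runtimeOptions: inputs.runtimeOptions,
};
runWithCache(changedInputs, fakeTestRun);
console.log(executionsAfterRepeat, executions);
```

The point of the sketch is the shape of the key: the same files, operation, project and options always map to the same entry, so the stored terminal output and files can be replayed instead of rerunning the task.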
But then it means you download the whole cache every time you open a pull request. In some cases, it can grow and grow and grow up to a gigabyte, which means it takes one or two minutes to download and unzip. And sometimes you might spend those two minutes downloading and unzipping, and in the end you don't even need the cache because it's not relevant to the pull request you have. So it's not that optimized, and it cannot be shared with the local developers either. And beyond caching, you can also do distributed task execution.
I'm mentioning all of this because that's something that Nx Cloud does. Nx Cloud has been developed by the Nx team as well, and it basically sets all of this up very easily, so that you just have to run a command in your terminal. It prepares everything. You don't have to sign up. It just works in the CI. It works locally too: whenever you build something and another developer tries to build it, they get the same cache, because it's shared between developers. And it optimizes the order in which it runs tasks so that everything gets faster. So basically, you should use Nx Cloud. It has an amazing free tier. It's easy to set up. It brings so many benefits, and it's so simple to use.
So we've looked at the libraries. We've looked at the caching to stitch everything together. Let's see what I've tried and where I actually struggled in all of this. Here come the learnings. I've played a bit with incremental builds and with trying to split an existing large codebase. So let's have a look at what happened here.
The concept of the incremental build is that you want to reuse the outputs of your libraries. The initial behavior, the default behavior, is that when you want to build your frontend app, you use tools like Webpack and Babel to transform your TypeScript files into JavaScript, and then your JavaScript files into another version of JavaScript understood by your browser.
And then some minification to make it a bit smaller, and some tree shaking to remove what you don't use. That's basically all the bundling happening within Webpack, so that you have a small package that you can release to your users.
The new behavior suggested by the incremental build from Nx is that, instead of building everything when you need it, you build each library incrementally. That means the first time you want to build the app, you build all your dependencies. And then when you make a change, let's say to the design system, you rebuild the design system, the UI, the web testing, the frontend domain, and you reuse the previous build from the other ones. Then you basically bundle everything back together in the frontend app. So it's supposed to go faster. And that's what Nx tells you in the documentation, where they show this graph: in blue, you can see the build time for a normal build, then for a cold run with the incremental build, and then another run, this time warm, with the incremental build.
So I tried to set this up and I didn't quite get the same results. Basically what we can see here is that building from source was taking about 60 seconds, cold and warm, no difference. And then when adding the incremental build, I had 20 more seconds. That seems fairly normal, because I have to build all the libraries first. But then I expected that when it's warm, I don't have to rebuild the libraries. I just have to build the app again. So I would expect this time here to be much faster.
So I looked into it a bit more and I discovered a few things. In my configuration, we had a custom Webpack config where we were setting up some more Babel loaders, some CSS loaders and others. And the assumption I made is that I was building my libraries independently beforehand, and then when building the app, it was taking the output of my libraries and building that again. So basically doing everything twice.
Hence the 60 seconds we had here. So I tried to do something smart and just exclude this folder from the Babel section, so the output from my libraries, so that it wouldn't build them again. And I saw some improvements. There's a difference of around 15 seconds here between the first incremental try and the second one, which is a good start. But given the time I invested in making all the libraries buildable, and then understanding the problems we had with Webpack and trying to optimize that, it doesn't seem like a good investment. Basically I spent hours on it, I had build errors everywhere, and I don't see many benefits. I'm sure my Webpack config can be optimized much better and I can surely get it to look like what Nx has. But is it really worth hours and hours to optimize it more? In my opinion, not yet. Maybe at some point. But so far we have other ways to make our build faster.
One of them, which we discovered along the way, was simply updating our dependencies. This is a funny little story about Browserslist, which uses a dependency called caniuse-lite, which decides which polyfills your browsers need, depending on which versions you want to support. By updating Browserslist, we updated caniuse-lite, which basically said we don't need those polyfills, we don't need to build for ES5 anymore. That means we just need to build the ESM format. And that took our build time down from 12 minutes to 6 minutes. It took us a couple of hours and gave us way bigger benefits than the incremental builds I mentioned just before.
Something else I discovered along the way was a profiling tool from Nx, where you can see and zoom in on what's happening behind the scenes. When triggering a build of basically everything, you can see all the dependencies here, in what order they are built and what happens in parallel. And it can show you whether you should be running more threads in parallel, or which library you should be splitting.
Like in this case, we can see that the build of the design system is quite big, it's taking 6 seconds. Maybe it's the next candidate to be split into two design systems. We could have a design system specific to the forms and another design system for everything else. So it gives you insight into what you could refactor next. So yeah, very cool tool.
Next up is how to split your app. I mentioned having lots of libraries. It seems very simple and easy said like that, but what kind of proportion should we be looking at? Nx actually recommends having about 20% of your code in the app and 80% in your libraries. What that means is that you put as much code as possible in the libraries, and then in the app you play Lego with your libraries. It's a bit like what you do when you use external libraries from npm, but in this case it's code that you've written, so it's a bit safer.
That sounds very nice said like this, but how does it look when you actually do it on your project? Let's say you have two scenarios. When you start a new project, you can create new libraries very easily. You have a command to automate that and it's very simple. So you can create libraries for your features, for your UI elements, for data access, for utilities, and have as many as you like. You could have a thousand libraries and it would just be so fast, because you'd only be rebuilding a tiny proportion of the code when you change your app.
But it becomes much, much harder when you have an existing project. Let's say you have a few hundred thousand lines and you want to start splitting them up. I tried to do that. I had, let's say, a flow of a few pages that I knew I wasn't touching, and that I was testing and building again every time I made a change. So I wanted to take that out and basically not test it again, because I knew it wouldn't change.
So I tried to do that: just create a new library and move the code into that library. And then I saw errors everywhere. It wouldn't build, it wouldn't compile. It had imports from the library to the app, which meant some things were cyclic, which meant problems everywhere. And that made me realize that if you want to split your app, you have to split the leaves of your tree first. So instead of moving all the pages, I had to move the components one by one first. And every time you move one, you have to update the imports, fix everything, make a commit, and make sure everything works. So it's not a simple job of just, okay, I'm going to create a library, two libraries, split my code in two, and I'm going to benefit from Nx straight away. You're going to have to go about it bit by bit.
My recommendation for this is to try and adapt it to your teams. If you have domain teams, for example, let's look at the Spotify app, where you have podcasts, you have radios, you have playlists. Each of those contains a lot of different features, but the first step would be that each team works on its own library. They might have some overlap. They might have more than one library. But the idea is that they start with just one, and then bit by bit it improves and it grows.
So yeah, when adding Nx to a large existing project, you get some benefits straight out of the box. But to get the full power, it takes a lot of time. It doesn't come just by snapping your fingers. It's going to be a lot of refactoring, a lot of effort, and a lot of training for your teams. So just be aware of that.
So just a quick conclusion on that: having a build system is the next step after optimizing your CI. When your codebase grows, it basically takes away the pain of refactoring it and improving your CI. Nx is a very good candidate for that. It's not the only one, but in my experience, it's been a delight to use.
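The leaves-first order described above for splitting an existing app can be sketched as a post-order walk over the import graph. This is a hypothetical helper, not an Nx command: the `ImportGraph` type, the `extractionOrder` function and the file names are assumptions made for illustration.

```typescript
// Hypothetical import graph: each file lists the files it imports.
type ImportGraph = Record<string, string[]>;

// Returns an order in which files can be moved into libraries so that every
// file is moved after the files it imports (leaves first). Throws when it
// finds an import cycle, since a cycle has to be broken before splitting.
function extractionOrder(imports: ImportGraph, root: string): string[] {
  const order: string[] = [];
  const state: Record<string, "visiting" | "done"> = {};

  function visit(file: string, path: string[]): void {
    if (state[file] === "done") return;
    if (state[file] === "visiting") {
      throw new Error("Import cycle: " + path.concat(file).join(" -> "));
    }
    state[file] = "visiting";
    for (const dep of imports[file] || []) visit(dep, path.concat(file));
    state[file] = "done";
    order.push(file); // pushed only after everything it imports
  }

  visit(root, []);
  return order;
}

// Assumed example: a page that uses two components and a utility.
const pageImports: ImportGraph = {
  CheckoutPage: ["PriceSummary", "Button"],
  PriceSummary: ["formatPrice", "Button"],
  Button: [],
  formatPrice: [],
};

const moveOrder = extractionOrder(pageImports, "CheckoutPage");
console.log(moveOrder);
```

In this made-up example the utility and the button come out first and the page comes out last, which mirrors moving components one by one before moving the pages that use them.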
When you have a large app, splitting it is quite hard, but the sooner you start, the easier it gets. So my advice is that you should start now, and let me know if you have any issues. Thank you everyone. Feel free to contact me if you have any questions, and see you in the Q&A. Bye!