Supercharging Your Dev Experience With Turborepo


Monorepos are a hot topic in the TypeScript/JavaScript community these days, but building a high-performing monorepo setup from the ground up can be challenging. In this talk, we will see how Turborepo can help you run your monorepo tasks at light speed.



Hello, everyone. Thanks for joining me today. We're going to talk about how to supercharge your dev experience with Turborepo. A little bit about me: my name is Bruno Paulino. I'm a tech lead at this cool company called N26, where we're building the bank that we love to use. I'm a software engineer focused on the web, and at N26 we're building the platform that all our engineers use to build web applications, and not just web applications, but also web libraries. You could think of design system components, for example. There, we don't like to call ourselves DevOps; we actually like to call ourselves DivOps, because we help web engineers ship divs as fast as possible to the browser. I'm also bpaulino0 on Twitter, so if you use Twitter and you want to follow me, please do. So let's jump right in. This presentation is divided into two parts. In the first one, we're going to talk about monorepos, multi-repos, and how you can use them. In the second part, we're going to talk about Turborepo itself, and to close, we're going to see a nice live demo. But before we start, I want to talk about multi-repos, monorepos, and how companies actually organize their code. The most common approach is to have multi-repos: you go to a company, they have several different projects, and they organize those projects in different repositories. That's the most sensible way of doing it. You have several different teams working on different projects; they have their own tooling, their own standards, and so on. That's pretty common and very reasonable: you want to give teams the independence to use their own tools and their own ways of building software and shipping as fast as possible. But there is another way, which is organizing your code in the same repository. That's monorepos. Don't confuse monorepos with monoliths.
You can still have a monorepo and you can still have microservices, for example, inside of the monorepo. The only difference between a monorepo and a multi-repo approach is that you have all your apps, packages, and libraries inside the same repository. But then you might ask yourself: why do I need a monorepo? Why do I need to put all my code together in the same repository instead of having it split across individual repositories? There must be a good reason for other companies doing that, right? Why are monorepos so hot at those companies? Google, Netflix, Facebook, and Twitter, for example, all use monorepos in some shape or form, and there must be very good reasons for them not to go with the multi-repo approach. Let's talk about those reasons. The first one is code reuse. If you have everything under the same repository, it's much easier to share code with your teammates. Think about it: you could have a models package with all your database models, and you could also have a package with your UI components. Everybody else on your team can then just reuse them without having to reinvent anything or install any extra package. The other reason you might want to consider a monorepo is shared standards. In a monorepo, you can have truly shared standards across your codebase. You could have an ESLint config that's completely shared across all the packages; you update it in one single place, and every other package benefits from it immediately. If you're using TypeScript, for example, you can have one single TypeScript configuration shared across your whole project. So it's also very easy to adopt and share standards among the packages inside of the monorepo.
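As a concrete sketch of that shared-standards idea, a base TypeScript config could live in its own package and be extended everywhere. The package and file names below are illustrative, not taken verbatim from the talk's repository:

```json
{
  "compilerOptions": {
    "strict": true,
    "target": "ES2020",
    "module": "ESNext",
    "moduleResolution": "node",
    "esModuleInterop": true,
    "skipLibCheck": true
  }
}
```

With that in, say, `packages/tsconfig/base.json`, each package's own `tsconfig.json` can be as small as `{ "extends": "tsconfig/base.json" }`, and a change to the base file updates every package at once.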
Bringing us back to the example I gave before, you can also have a shared database model with queries, and then you can reuse it across your entire monorepo whenever you need database access, for example. Another thing that works really well in the monorepo setup is team collaboration. If you have everything under the same repository, you can easily share a pull request with all the engineers and get feedback, because they already have all the context around the codebase. They don't have to set up anything, they don't have to install a different Node version, they don't have to set up any SDK on their machine or anything like that, because they have been working in the same codebase. They have every tool installed already, so they don't have to figure anything out. If you need to do a pair programming session, for example, it's just much easier: you share a screen, fire up the dev server, and everything is there. You don't need to clone anything, you don't need to do any setup. It's just much easier and much faster for a big team. Another big selling point for monorepos is atomic changes. What do I mean by that? If you have a single repository, you can change several different apps and packages at the same time in the same pull request. This way, you can guarantee that everything is going to work together. In contrast, if you have multiple repositories, you have to coordinate those changes with different teams and release a new version of a library or an app before you can move forward with a dependent change. In a monorepo, everything can go together in the same pull request. If you have tests and build pipelines in your CI system, you can guarantee that everything works together, or you don't merge that change. So it's much more straightforward to keep everything in sync.
Last but not least, another important point is isolation. You might ask yourself: how can you have code isolation in a monorepo where everything is inside the same repository? This is possible because of workspaces. Today, npm, pnpm, and yarn all implement workspaces, which are a way of having self-contained packages. All your packages, including libraries and apps, are fully self-contained, with their own dependencies declared and installed separately. The dependencies are explicitly defined in each package, so you can make sure that each package has all the dependencies it needs to be properly built, tested, and shipped to production. All right, so now it seems we have got the packaging sorted. It looks like we can have a solid monorepo thanks to workspaces. It's a very neat feature, and it works really well across all the common package managers we have now, including npm, pnpm, and yarn. But running tasks efficiently is quite tricky. It can be very challenging, or at least it was until now. Today, we have Turborepo, a build system created specifically for the JavaScript and TypeScript ecosystem. Turborepo was built by Jared Palmer. He's a very prolific engineer, and he built Turborepo as an open-source tool. Today, Turborepo is part of Vercel, and it's still being developed in the open. So let's take a look at the features Turborepo provides us so we can build this monorepo in a very efficient way. The first one is that Turborepo never does the same work twice. If you run a build or a test or a lint task, Turborepo is going to remember that and cache it for you. If you try to run it again for a package that didn't change, Turborepo will immediately say: hey, you have done this task already, you don't have to run it again.
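Coming back to workspaces for a second: with pnpm, which is what the demo later in this talk uses, the whole workspace declaration is a single `pnpm-workspace.yaml` at the repo root (with npm or yarn you would use a `workspaces` field in the root `package.json` instead). The folder globs below match the apps/packages layout described later:

```yaml
# pnpm-workspace.yaml at the monorepo root:
# every folder matching these globs becomes a self-contained package
packages:
  - "apps/*"
  - "packages/*"
```

After `pnpm install`, each package under `apps/` and `packages/` gets its own dependencies, and packages can depend on each other by name.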
I'll just show you the logs from the previous run, and everything will work in the same way. You're going to see that in our live demo at the end. Another big win when using Turborepo is optimized scheduling. Turborepo will figure out how many CPUs you have available on your computer, and then it's going to run as many tasks in parallel as it can. If you have tasks that are completely independent, Turborepo will make sure they run on different CPU cores, for example. If those tasks have dependencies, for example if your test task depends on a build task, Turborepo will make sure that the build task executes first and that the test task follows up on it. We're also going to see that in the live demo at the end. Remember that I told you that Turborepo will never do the same work twice? Here, Turborepo takes this to the next level. If you do something on your machine, you can share that cache across your entire dev team and also your CI service. For example, if you run a test or a lint task locally on your machine, and you have your cache configured, Turborepo can automatically upload that cache for you. It's then shared with all your CI systems and your whole dev team, and they benefit from the cache automatically. This feature works transparently, but you either have to use Vercel connected to your machine and CI servers, or you have to self-host your own remote cache. There are a few remote caches that are open source, and you can deploy them on your own cloud. We're also going to see that in action in our live demo very soon. Last but not least, Turborepo has zero runtime overhead, which means that Turbo is just a dev dependency. It doesn't ship anything with your code when it goes to production.
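On the remote caching point: hooking a repo up to Vercel's remote cache is two CLI commands, and on CI you would typically authenticate through environment variables instead. The commands and variable names below are from Turborepo's documentation; check the docs for your Turborepo version before relying on them:

```shell
# One-time local setup: authenticate and link the repo to a remote cache
npx turbo login
npx turbo link

# On CI, authenticate non-interactively instead
export TURBO_TOKEN="<your-access-token>"
export TURBO_TEAM="<your-team-slug>"
```

Once linked, cache uploads and downloads happen automatically on every `turbo run`; no task definitions need to change.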
In fact, your packages don't even know that Turbo exists, because Turbo only lives at the monorepo level, and all your packages are completely independent and unaware of it. So enough talking. Let's jump right into our live demo. Here we have the link for our repository. You don't have to code along, just follow me on the screen. But at the end, if you want to give the repository a star, feel free. All right, so we're here in VS Code, and I have a demo repository that I prepared for this talk. Here we have a monorepo using Turborepo as a base. In this monorepo, I'm using pnpm to manage the packages, but you could just as well use yarn or npm in the same way. So let's take a little tour across the monorepo and understand how we can compose those packages. At the root level, we have a package.json file, the same package.json file that you're used to seeing everywhere in JavaScript or TypeScript projects. The only difference is that in a monorepo, you have tasks and dependencies declared here that are just meant for the monorepo itself. We're going to look at specific tasks later on. Here I have a packages folder with a collection of packages. The first one is an ESLint config: basically a common ESLint config that I can share across all the packages in our monorepo. Then we have a tsconfig package: our TypeScript configuration, which is also shared across packages and apps. Then we have a UI library. It's pretty much our design system, let's say. It's a very small library, just for the sake of this demo, but you'll see it in action very soon. And on top of that, we have the apps folder, which is where we keep the apps separate. So we have apps, while packages is meant more for libraries.
From apps, you can build a Docker image, for example, or deploy to a serverless environment and so on. It doesn't really matter. In this case, we have two apps. We have the shop app, our Next.js app where customers go to buy T-shirts. And we have our admin app, which is kind of like the back office, where employees can refund orders, ship stuff to customers and so on. So let's have a look at how Turborepo can help us with development in our monorepo. I'm going to open my terminal and run pnpm dev. This is going to fire up the dev server in our monorepo, and this is where Turbo is already helping us execute tasks in parallel as much as possible. Remember that I told you that you can declare tasks and Turbo can take care of running them in parallel? That's what's happening here. In my package.json file at the root level of the monorepo, I have a dev task that uses Turbo, which is declared below as a dev dependency. I'm just using the latest version of Turborepo. The dev task calls turbo run dev --parallel, which means: execute all the tasks called dev across our packages in parallel, and don't care about the dependencies between them. That's what Turbo is doing here. If you look at the console, we can see a bunch of different outputs, even with different colors. We have the admin dev, the shop dev, and the UI dev. So this is executing the dev command for all of those packages. Let's go to our browser and see what our shop and admin apps look like. I'm going to go to localhost:3000, and that should be our shop. As you can see, we have the Turbo store, and I can add stuff to the cart. It's just a dummy store, it's not doing much, but you can see there's a little button here.
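For reference, the root package.json just described would look roughly like this. The exact script list is my reconstruction from the tasks mentioned throughout the talk (dev, build, lint, test, ci), not a verbatim copy of the demo repo:

```json
{
  "name": "turborepo-demo",
  "private": true,
  "scripts": {
    "dev": "turbo run dev --parallel",
    "build": "turbo run build",
    "lint": "turbo run lint",
    "test": "turbo run test",
    "ci": "turbo run lint test"
  },
  "devDependencies": {
    "turbo": "latest"
  }
}
```

Note that Turbo appears only here, as a dev dependency at the root; none of the individual packages reference it.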
Then I'm going to go to localhost:3001, and that's where we have the Turbo admin. That's our admin back office for employees, and here you can refund orders. It's also just a mock app to show you. But notice that a common thing between those apps is this blue button here, right? It's actually quite blue, and the buttons are very similar. So this is probably coming from our design system component library. Let's take a look. Let's go to VS Code again, into packages/ui, where we have src/button. We have a very simple React button here. It doesn't do much, but we have some CSS. Let's change this a little bit: let's change it to some shade of black. And then let's go back to Firefox. We can see that reflected immediately, right? So our dev environment is picking up changes across packages. Notice that I didn't make any change to my admin app. I changed my UI library, and we can already see that reflected here. And if I go to my shop, we also see that the button is in a shade of black. And that's pretty cool. We can do development locally without having to set up anything extra. Imagine that you had your UI library in a different repository, and you were using it in your admin app and your shop app. Development would be much more difficult: you'd have to release a new version of the library, then import that new version in your app, and so on. It gets a little bit more complicated. In a monorepo, things are much simpler. Now, back to VS Code. Since we made some changes here, let's take a look at how we can run test and lint. I'm just going to stop my dev server, and then I'm going to run a lint task.
I'm going to execute pnpm lint, and let's see what happens. It executed a lint task, and we see a bunch of lint output here. Our lint is configured to run across all of our packages, so we have ui:lint, shop:lint, and admin:lint. But notice down here that we see an output saying three successful, two of them cached, three in total. That means two of them have already been executed in the past. Turbo already knows that just one package changed, which was the ui package. If you look closely here, you can see that ui:lint is a cache miss. Turbo knows: hey, we don't have a cache for this hash. Turbo has a very smart hashing algorithm behind the scenes that computes which files have changed, and from that it knows whether it needs to execute a task. In this case, it's a cache miss, so it's going to execute the lint for this package. For the other packages, shop:lint is a cache hit. Turbo found it already in the cache, so it's not going to execute it. The other one, admin:lint, is also a cache hit. It has already been executed before, because I ran it before showing you this demo, and it's already cached. And that's pretty cool, right? Now your monorepo is only executing what's actually necessary. Now I'm going to run another command, pnpm test. Let's test our packages and see how that behaves. I execute the test task, and we have a bunch of output here again. Let's take a look. We have a different output now: we have ui:test and ui:build. I didn't tell Turbo or pnpm to execute a build, right? I just told it to execute a test. But remember that I told you that you can declare dependencies between tasks? In this case, my test task depends on a build task, so Turbo makes sure that the build is executed first.
So let's take a look at how you can accomplish that with Turbo. Once you have Turbo installed in your repository, the only thing you need is a turbo.json file, here in the root of your monorepo. That's the file where you declare everything related to Turborepo, and the most important thing in it is the pipeline. The pipeline is where you declare your pipeline of tasks, and where you have to be very explicit about which tasks you want to run with Turbo and what their dependencies look like. For example, we have the build task, and it has a key called dependsOn. dependsOn basically tells Turbo: hey, before you run the build task, make sure that you have executed everything it depends on. In this case, it's ^build, which means: execute the build task of every dependency that I depend on before you execute my build. So for example, if the admin app depends on the UI library, Turbo is going to build the UI library first, and then it's going to build the app next. It will make sure that the UI library is ready before it builds the app. The outputs key declares what we expect this task to output. If I'm doing a build in a Next.js project, I can have several output folders, right? Next.js outputs several different folders. In this case, I want Turbo to cache the dist and .next folders, which means that you can retrieve those artifacts from the cache when you need them. The test task is no different. It follows the same standard: it has dependsOn, and it says ^build, the same thing we had for build, but it also depends on build itself. That means that whenever you execute a test task for any package, Turbo makes sure to first execute a build for any package it depends on. Let's go back to our example for the app: we have the shop app that depends on the UI library.
If I want to test the shop app, I want to make sure that the UI library is ready and available for me. So Turbo is going to execute the build first, and then it's going to run the test. And that's also true for the library itself: it's going to build the library, and then it's going to execute the tests. Here, outputs means that this task is expected to output something. In this case, it's an empty array: the task doesn't output anything. So for the test task, Turbo doesn't have to cache anything besides the logs it generates. Then we have the lint task, which doesn't have a dependsOn; it just has outputs, and it declares an empty array again, meaning it doesn't output anything and doesn't depend on anybody. Since we don't have the dependsOn key here, lint can run in parallel as much as possible. Then we have the dev command. The dev command also doesn't have any dependencies and doesn't output anything. It just has cache: false, which means that Turbo never caches anything for dev, because when you're doing development locally, you always want fresh output. All right, but now that we have those pipelines declared, how does Turbo know that it needs to execute those tasks? Then we come back again to our package.json file. Remember that I showed you the dev command? It's the same for build, lint, test, and ci. We're going to execute all of them. We have already executed the lint and the tests, and we're going to execute ci in our CI environment; I'm going to show you how that outputs there. Whenever you want to execute a task across your monorepo, you just run it through the turbo binary: it's just turbo, then you pass the command run, and then the task that needs to execute, in this case test. It runs that one specific task for all packages across your monorepo. In our CI, it's a little bit different.
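Before we get to CI, pulling the pieces just described together, the turbo.json would look roughly like this. The structure follows the talk's description; the exact outputs globs are my assumption of typical Next.js (`.next`) and library (`dist`) builds:

```json
{
  "pipeline": {
    "build": {
      "dependsOn": ["^build"],
      "outputs": ["dist/**", ".next/**"]
    },
    "test": {
      "dependsOn": ["^build", "build"],
      "outputs": []
    },
    "lint": {
      "outputs": []
    },
    "dev": {
      "cache": false
    }
  }
}
```

The `^` prefix means "this task in my dependencies first", while a bare task name means "this other task in my own package first"; that is the whole mechanism behind the build-before-test ordering in the demo.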
I'm telling Turbo to execute lint and test, so it's going to execute all of those tasks in parallel as much as possible. If they have any dependencies, it executes those task dependencies first, in parallel as well. All right, so I have shown you in my terminal that Turbo is already caching something, right? We saw that it had some cache before. So let me run a new command. I'm going to execute the pnpm test command again, and now we have a different output. We see FULL TURBO here, which means Turborepo knows that nothing changed in our packages. We did change the button, but we have already executed the tests, right? Turbo has already cached this, and it remembers for as long as you don't change the code. If I try to execute the lint again, same thing as before: it's also FULL TURBO. That's great. It's not executing anything; it's very fast. You see it took 100 milliseconds to run everything. It also shows us all the logs it had before, because Turbo caches the logs and replays them to you, so you can see what happened previously. All right, so we have seen how to do development with Turbo, how to configure our tasks across the repository, and how to execute Turbo commands across the repo. The thing that's missing is how Turbo knows that it needs to execute a dev task or a lint task or a test task across our packages. The only thing you have to make sure of is that in each package, you have those tasks declared. So let's take a look at our UI package and see what its package.json file looks like. Here we have the package.json file inside the UI package, with a bunch of scripts. Notice that we have a dev command here, and we have a test and a lint. That's how Turbo knows how to invoke those scripts. As long as they have the same name across the packages, Turbo will execute them immediately.
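So a package like ui just needs matching script names. A sketch of such a package.json follows; the specific tools wired into each script (tsup, jest, eslint) are illustrative choices, not necessarily what the demo repo uses:

```json
{
  "name": "ui",
  "version": "0.0.0",
  "scripts": {
    "dev": "tsup src/index.tsx --watch",
    "build": "tsup src/index.tsx",
    "test": "jest",
    "lint": "eslint src/"
  }
}
```

When you run `turbo run test` at the root, Turbo looks for a script named `test` in every workspace package and runs the ones it finds; packages without that script are simply skipped.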
All right, so now we know how Turbo figures this out across all of our packages. Let's commit this, push it to GitHub, and see how it behaves in our CI environment. I'm pushing this straight to the main branch. And what do we expect there? I expect that my CI environment, being aware of the cache, will be able to fetch the remote cache that I generated locally, and it will just replay the logs. Remember that I told you that Turbo never does the same work twice? That should also be true for our CI environment. If we have done a task already, either on our local machine or in a CI environment, Turbo should be able to fetch that cache and just replay the logs. Let's head to GitHub and see what this looks like. I'm now on the GitHub page of this repository, and I can see that the action has already been executed. Let's check it out. We have a bunch of steps here, but the most important one is the checks step. That's the one where we actually execute our pnpm commands. We have a command here that we executed; let's increase the font a little bit. We have run pnpm ci, and that's our ci script. If we take a look at VS Code again, we can see the ci script: it's turbo run lint test. So it's doing linting and tests at the same time. Back on GitHub, let's take a look at the output. We see ui build, admin lint, ui lint, ui test, a bunch of outputs, which we expect Turbo to execute in parallel while still respecting the dependencies, right? Let's scroll down a little bit more and see the result. It took 1.3 seconds to execute all of those tasks. And we see FULL TURBO here again, which means Turbo was able to fetch the cache that I generated locally on my machine and share it with my CI runners.
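The workflow behind a checks job like that can be sketched as a short GitHub Actions file. The action versions and secret names below are assumptions on my part, not taken from the demo repo:

```yaml
# .github/workflows/ci.yml - illustrative sketch
name: CI
on: [push]
jobs:
  checks:
    runs-on: ubuntu-latest
    env:
      # Credentials for the Turborepo remote cache, stored as repo secrets
      TURBO_TOKEN: ${{ secrets.TURBO_TOKEN }}
      TURBO_TEAM: ${{ secrets.TURBO_TEAM }}
    steps:
      - uses: actions/checkout@v3
      - uses: pnpm/action-setup@v2
      - uses: actions/setup-node@v3
        with:
          node-version: 18
          cache: pnpm
      - run: pnpm install
      # Runs "turbo run lint test"; cached tasks are replayed from the remote cache
      - run: pnpm ci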
And if you're not using GitHub Actions but GitLab CI, for example, it works the same way. You can share your cache with any other continuous integration system. And that's it. That's what I wanted to share with you today. Thank you very much, and I'll see you around.
26 min
05 Dec, 2022
