The Age of Monorepos

The history of the web can be divided into evolutionary development leaps. The age of inline scripts, the age of jQuery, the age of SPAs, the age of JAMStack...

We are now entering the next stage that has been carefully prepared in the past few years. Let me invite you to the world of modern monorepo solutions and share with you the benefits you will reap by using them in every project size and setup. It's time you automate those boilerplate tasks and reduce the bottlenecks so you can focus on what truly matters.

Get ready for the next leap! Welcome to the age of monorepos!

25 min
16 Jun, 2022

AI Generated Video Summary

Today's talk is about the world of monorepos, their history, benefits, and features. Monorepos address the slow build processes that came with modern web development. Collocation in monorepos enables easy sharing of functions and components among projects. Speed and efficiency in monorepos are achieved through collocation, dependency graphs, and task orchestration. Monorepo tools like Nx, which now also powers Lerna, offer features such as caching and distributed task execution. Monorepos provide code sharing, consistent tooling, and automated migration, resulting in a 10x developer experience.

1. Introduction to Monorepos

Short description:

Thank you all for joining my talk. Today I'm going to be talking about the amazing world of monorepos. But before we dive into that, I have an important disclaimer. In these slides, you will see some examples of extremely bad web design. You will see some flickering colours that might cause epileptic attacks. And finally, you will see some life-changing features.

Thank you all for joining my talk. Unfortunately, the MC decided to resign at the last moment, so I have to announce myself, but it's fine. Today I'm going to be talking about the amazing world of monorepos. But before we dive into that, I have an important disclaimer. In these slides, you will see some examples of extremely bad web design. You will see some flickering colours that might cause epileptic attacks. And finally, you will see some life-changing features. So, if you have a medical history with any of these symptoms, perhaps it's better to change the track. Otherwise, I assume you take full responsibility for being here.

2. Introduction to Web Development History

Short description:

And with that formal note, let me introduce myself. My name is Miroslav Jonas. I work for Nrwl on the tool called Nx. Before we dive into what monorepos are, let's take a trip back through history to understand how we got here. In the beginning, the web was static. Pages were boring, but then scripting languages like JavaScript brought dynamicity. As websites became more complicated, single page applications emerged. However, the rise of smartphones brought new challenges with unstable connections.

And with that formal note, let me introduce myself. My name is Miroslav Jonas. I work for Nrwl on the tool called Nx, which you're going to hear a lot about today. I also co-organise two meet-ups in Vienna, ViennaJS and AngularVienna.

Now, before we dive into what monorepos are, in order to understand how we came to the point where monorepos are needed, we need to take a trip back through history, all the way to the beginning of the web, to retrace our steps and see how we got here. Fasten your seatbelts, it's history time.

In the beginning, as you all know, the web was static. It was merely a collection of HTML pages connected by hyperlinks. The first web pages looked something like this Yahoo page. They had lots of text, lots of links and very small images; it was dial-up time, so things had to be small and fast. Usually, they had a contrasting choice of colours, but the pages were boring. They were too static, so people came up with a graphical format that would shake things up a bit. Who remembers this dancing baby? Some pages took this to a whole new level, where the entire page was spinning in animations. But you see, it still wasn't what we needed, because this was running in a loop; it wasn't controlled animation, it wasn't controlled movement. So Brendan Eich from Netscape, the company producing the popular browser at the time, was given the task of coming up with a language in just two weeks that would pick up some ideas from Java and finally bring dynamicity to the browser. And two weeks later, LiveScript was born, which was later renamed to JavaScript, much to the delight of generations of recruiters and headhunters ever since.

And so the age of scripting began, and with it we finally had pages with fancy image galleries, crazy menu effects and buttons that would run away from our cursor. The pages could still look very ugly, but now they finally had controlled movement. As the number of scripts on a page grew, we started to recognize certain patterns that kept repeating. This was also the time of the famous browser wars between Microsoft and Netscape, and there were lots of inconsistencies between the standards in these two browsers, so developers usually had to implement things for both browsers. Luckily, we now had helper libraries, most notably jQuery, which would overcome these differences and create a wrapper around DOM manipulation. This would allow you to quickly create your websites. And as websites became more and more complicated, we started calling them web applications, not websites.

But encapsulating DOM and animations wasn't the only boilerplate. There was still a lot to be encapsulated, like routing, event management and state management, and this is what led to single page applications. The first widely popular framework that implemented single page applications was AngularJS, and soon React and Vue followed. All of these are still used today in some variation, and together they changed our thinking about web development. They set up web development as we know it today. Unfortunately for them, this was also the time when our phones became smart, and suddenly we no longer browsed the internet on our desktop computers; we started browsing it on our mobile phones while sitting on park benches, on public transport or on a toilet seat. In these places, the connection wasn't really stable. We could hope for 3G at best, with lots of interruptions.

3. Monorepos and Collocation

Short description:

Single page applications were too heavy for mobile phones, leading to the birth of Jamstack. However, this came at the expense of developer experience and slow build processes. Monorepos addressed these issues by allowing different parts of the system to immediately notify each other of changes. Collocation in a monorepo enables easy sharing of functions and components among projects.

And suddenly we realized that single page applications were just too heavy for mobile phones, and we needed to address this. People were used to having fast websites, so when you suddenly had to wait minutes for something to load, it just didn't work. This led to the birth of Jamstack, where we needed something fast and light, in contrast to the large single page applications that took minutes and megabytes to load.

We needed to address the elephant that was being squeezed through a straw. It was the birth of the first meta-frameworks like Next, Nuxt or Gatsby, or the ones popular today, Remix, Qwik and Astro, that bring something new to the table, where we finally had pages that were fast to load and where you would immediately see the results. Unfortunately, all of this came at the expense of developer experience. In order to have fast websites for users, we had to do the heavy lifting on our own machines, so our build processes became super slow. Not only did this frustrate developers, but when we consider that most CI and CD is now pushed to the cloud, slow build processes also meant that our monthly cloud bills were getting higher and higher.

It was time to address the second part of the slowness: websites were now fast, but we needed to work fast as well, so we needed to speed up our developer experience. This led to monorepos. In order to understand what monorepos exactly do, let's first consider a typical web application today. Our application usually consists of a front-end application built with your chosen framework, and then we have some back end, which is not a monolithic back end but has a microservices architecture behind it, and then we have some UI components. Now what happens occasionally is that one of the developers working on the back end slightly changes a method in one of the services, thereby changing a contract, and simply forgets to inform you, the maintainer of the front end. Suddenly your website is broken, everyone is pointing fingers at you, and you have no idea what happened because you didn't touch that code.

There are some clunky attempts to solve this by constantly exporting contracts from the back end, converting them to TypeScript and importing them in your front-end applications so that you always have up-to-date information, but this requires manual intervention, and when it comes to the human factor, given enough time, it will eventually fail. On top of that, there is a sync issue, because if something changes in the back end, you have to roll it out at the same time on the front end, and there is a lot of coordination and juggling involved. And your application might not be the only one. There might be two additional applications, a dedicated mobile application and a dedicated admin portal, maybe built with completely different technology, that also share code with the other applications.

Now, what you want is that whenever one of these changes, each part of the system that is affected by it is immediately notified. In a typical poly-repo approach, what happens is that, for example, if you change the utility, you have to publish a new version, then update the dependencies in the home page application and the admin portal, and then test whether this works. If it doesn't work, you have to go back again, make a fix in the utility, and again publish a new version. Of course, you can solve this with, for example, symlinking or some local repository, but it requires a lot of coordination and a lot of things that you need to maintain, and on top of that, it doesn't scale well.

Now, wouldn't it be nice if they could just simply talk to each other? So whenever something changes, the entire ecosystem already knows what happened. And this is what happens with collocation. Collocation means that all of these projects are sitting in the same repo. So whenever one of these changes, all the projects can immediately see that because they are sitting together, so there's no need to publish a new version, no need for any rolling or any other rituals. Now, having things collocated also helps us to identify certain functions that can, for example, be reused. So if you have some sophisticated admin function like this in your code, you can easily tell your colleagues, hey, I have this amazing function, maybe you want to use it in your project as well? And they can see it and they're, yeah, amazing, we need this for mobile phone as well. And you can easily extract this into a library and share it with everyone else in the repo. And when people think about monorepos, this is what they usually think about, that it's just about collocation. But it's not just about collocation.

4. Achieving Speed and Efficiency in Monorepos

Short description:

Collocation is the first step towards achieving speed in monorepos. By having everything collocated, we can easily identify connections and create dependency graphs. This allows us to optimize our build and deployment processes, reducing the time and effort required. Additionally, we can maintain up-to-date dependency and architectural graphs, eliminating the need for stale diagrams. With this knowledge, we can orchestrate tasks more efficiently, running them in parallel and optimizing the order of execution. Recent advancements in monorepo tools, such as Lerna with the use of NX, have made it even more powerful and competitive. Check out the new website for more information.

Collocation is just a precondition to everything else. It's the speed that's the main selling point of the monorepos. So how do we achieve this speed? First of all, when we have things in one place collocated, we can easily spot how things are connected. And by knowing how things are connected, we can come up with this nice graph of dependencies. And for example, if we would change here our core library, we can see that the only two projects that might be affected by this are store and admin.

Furthermore, if the command that we're trying to run is deploy, we know that core is just a building block library; it doesn't get deployed. So there is nothing to be done there. But our store and admin do get deployed. So by knowing what target we are running and how things are connected, we can reduce this entire graph to just two nodes. Now imagine if this entire graph had hundreds of projects. How much would you save there? So instead of running build, for example, for all of these projects, you would simply run build for core, store and admin, and then deploy just store and admin.
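The reduction described above can be sketched in a few lines of TypeScript. This is only an illustration of the idea, not how Nx or any other tool implements it; the project names beyond core, store and admin, the `Project` shape and the function names are all assumptions made for the example.

```typescript
// Hypothetical workspace: each project lists the projects it depends on.
interface Project {
  type: "app" | "lib"; // assumption: only apps get deployed
  dependsOn: string[];
}

const workspace: Record<string, Project> = {
  core: { type: "lib", dependsOn: [] },
  ui: { type: "lib", dependsOn: [] },
  store: { type: "app", dependsOn: ["core", "ui"] },
  admin: { type: "app", dependsOn: ["core"] },
  blog: { type: "app", dependsOn: ["ui"] },
};

// Returns the changed projects plus everything that depends on them,
// directly or transitively.
function affectedBy(changed: string[], graph: Record<string, Project>): Set<string> {
  const affected = new Set<string>(changed);
  let grew = true;
  while (grew) {
    grew = false;
    for (const name of Object.keys(graph)) {
      const dependsOnAffected = graph[name].dependsOn.some((dep) => affected.has(dep));
      if (!affected.has(name) && dependsOnAffected) {
        affected.add(name);
        grew = true;
      }
    }
  }
  return affected;
}

// For the deploy target, libraries are skipped: they are only building blocks.
function deployTargets(changed: string[], graph: Record<string, Project>): string[] {
  return Array.from(affectedBy(changed, graph))
    .filter((name) => graph[name].type === "app")
    .sort();
}

const toBuild = Array.from(affectedBy(["core"], workspace)).sort();
const toDeploy = deployTargets(["core"], workspace);
console.log(toBuild);  // [ 'admin', 'core', 'store' ]
console.log(toDeploy); // [ 'admin', 'store' ]
```

A change to core leaves ui and blog untouched, which is where the savings come from once the graph has hundreds of nodes.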

Again, if we know how things are connected, we can always have an up-to-date dependency graph and an up-to-date architectural graph. You no longer have to maintain stale architectural diagrams on your network drives. You can just run the command and get an up-to-date diagram immediately. And not only that, you can nicely filter it by nodes, and you can click on the edges and see which files create the connection between two projects. You can see whether you have circular dependencies. You can spot all these problems.

If we know how things are connected, we can also create a nice task orchestration, because we know in which order things need to be processed. So if we need, for example, to run test, build and lint, a naive way would be to simply run them sequentially: first build, then lint, and then test. Let's assume that test includes end-to-end tests, so we have to run it after the build. A better way would be, of course, to run build and lint in parallel, because they don't depend on each other, and then run test afterwards. An even better way would be to start running tests as soon as the parts that need to be tested are built.

And if you ever used Lerna, up until a month ago this was the limit of its capabilities. It was fine for most things, but it also left a lot to be desired. In this state of limbo, a lot of new monorepo tools came out, and they offered more functionality on top of Lerna. This changed recently when Nrwl took over stewardship of Lerna, because now, with a single flag, useNx, you can have most of the functionality that Nx, one of the popular monorepo tools, provides. So Lerna is now competitive with all the other tools, and it became way faster than it was before. There's also a new website, so if you haven't seen it yet, you can go ahead and check it out. Not now, after the talk.
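To make the build, lint and test ordering from this section concrete, here is a minimal sketch that groups tasks into waves that can run in parallel. The `taskDeps` map and the `planWaves` function are hypothetical names for illustration; real tools schedule at a finer granularity (per project and per task), which this sketch does not attempt.

```typescript
// Hypothetical task list: "test" needs the build output, "lint" is independent.
const taskDeps: Record<string, string[]> = {
  build: [],
  lint: [],
  test: ["build"],
};

// Groups tasks into waves. Every task in a wave can run in parallel,
// because all of its dependencies finished in an earlier wave.
function planWaves(deps: Record<string, string[]>): string[][] {
  const done = new Set<string>();
  const waves: string[][] = [];
  let remaining = Object.keys(deps);

  while (remaining.length > 0) {
    const ready = remaining.filter((task) => deps[task].every((d) => done.has(d)));
    if (ready.length === 0) {
      // Nothing can start: the remaining tasks wait on each other.
      throw new Error("Circular dependency between tasks: " + remaining.join(", "));
    }
    ready.forEach((task) => done.add(task));
    remaining = remaining.filter((task) => !done.has(task));
    waves.push(ready.sort());
  }
  return waves;
}

const waves = planWaves(taskDeps);
console.log(waves); // [ [ 'build', 'lint' ], [ 'test' ] ]
```

The same check that detects a stuck schedule is also a cheap way to surface circular dependencies.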

5. Monorepo Tools and Automation

Short description:

As I mentioned, it's now powered by Nx. Companies like Google and Facebook have had their monorepo solutions for a long time, but setting them up required a lot of manual configuration. Luckily, companies like Microsoft, Nrwl, and Vercel offer automated solutions that simplify the process.

As I mentioned, it's now powered by Nx. This is opt-in, so you can switch it on or off as you like; it's up to you. So what are these other monorepo tools? Companies like Google and Facebook have had their monorepo solutions for a long time. Their repositories consist of thousands of projects that are constantly running, and for this type of ecosystem you need really sophisticated build tools in order to run this in real time, right? It took a very long time to get there, and until recently these tools weren't open to the public. But once they became public, although we realized how impressive these tools are, there was a slight catch. You might need a PhD to be able to set them up, because almost none of it was automated. It required lots and lots of configuration, and not just once, but over and over, every time you create a new project. So in order to maintain this in your repository, you might need a dedicated person doing just this. This, of course, wasn't feasible for any smaller company. But luckily, companies like Microsoft with Rush and Lage, Nrwl with Nx, and now Vercel with Turborepo offered solutions that give more or less the same feature set, but automated, with all the boilerplate taken away from you, so you don't have to do this manually.

6. Lerna Features: Caching and Remote Cache

Short description:

The most important feature missing from Lerna is caching. Caching allows you to avoid running code that you've already run. It stores build artifacts and logs in a cache, so when you run a build again, it can replay the results from the cache. With a remote cache, you can store your cache locally or in the cloud, saving time on CI and reducing cloud costs.

So what are these features that we were missing from Lerna? The most important one is caching. Caching is based on the premise that you shouldn't have to run code that you have already run. So if you're running, for example, a build on a given state of the repository, and you need to run it again, you should just have the results replayed. Why struggle again? And this is what caching does.

So for example, if I run a build on my system, it will be stored in a cache. And not just my artifacts, but also my logs. So the next time I run it, it will not only copy those artifacts back as if I were running it now, but it will also replay all the logs, so you wouldn't even notice that it was running from a cache, apart from the result being there immediately.

But what happens if you have a colleague, for example, in Australia, and they're running the same command on the same state of the repository? They would have to do it again, because they can't see your local cache. But luckily, with a remote cache, you can not only store your cache locally, but also store it in the cloud. So the next time your colleague tries to run the build, it will first check whether the result is available in the local cache. If yes, it copies it from the local cache; if not, it checks the cloud. Is it there? If yes, it copies it from the cloud; otherwise it runs the build from scratch, saves the result in the local cache and saves it in the cloud. And this not only saves developer time, but it also saves a tremendous amount of time on CI, especially because while you're working on your feature for a PR, you're already building these things on your machine, and your cloud cache already knows about it. So when you run the CI, the CI is just picking things up from the cache. And this not only saves time while you wait for the PR, but it also saves a huge amount of money on your monthly cloud bills.
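The lookup order described above (local cache, then cloud, then run from scratch) can be sketched as follows. Everything here is a stand-in: in-memory maps play the role of the local cache folder and the cloud store, the cache key is assumed to be some hash of the command and the repository state, and copying a cloud hit into the local cache is an assumption of this sketch.

```typescript
interface CachedResult {
  artifacts: string[]; // e.g. paths of build outputs
  logs: string;        // terminal output to replay
}

// In-memory maps stand in for the local cache folder and the cloud cache.
class TaskCache {
  private local: Map<string, CachedResult>;
  private remote: Map<string, CachedResult>;

  constructor(local: Map<string, CachedResult>, remote: Map<string, CachedResult>) {
    this.local = local;
    this.remote = remote;
  }

  // `key` is assumed to be a hash of the command and the repository state.
  run(key: string, execute: () => CachedResult): { result: CachedResult; source: string } {
    const fromLocal = this.local.get(key);
    if (fromLocal) return { result: fromLocal, source: "local" };

    const fromRemote = this.remote.get(key);
    if (fromRemote) {
      this.local.set(key, fromRemote); // assumption: keep a local copy of cloud hits
      return { result: fromRemote, source: "remote" };
    }

    const fresh = execute(); // cache miss everywhere: run from scratch
    this.local.set(key, fresh);
    this.remote.set(key, fresh);
    return { result: fresh, source: "executed" };
  }
}

// Two machines with separate local caches sharing one cloud cache.
const cloud = new Map<string, CachedResult>();
const myMachine = new TaskCache(new Map<string, CachedResult>(), cloud);
const colleague = new TaskCache(new Map<string, CachedResult>(), cloud);

let executions = 0;
const build = (): CachedResult => {
  executions++;
  return { artifacts: ["dist/main.js"], logs: "Build finished" };
};

const first = myMachine.run("build:abc123", build);
const second = myMachine.run("build:abc123", build);
const third = colleague.run("build:abc123", build);
console.log(first.source, second.source, third.source, executions);
// executed local remote 1
```

The build function runs only once, even though three runs were requested on two machines.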

7. Distributed Task Execution

Short description:

The next very cool feature is distributed task execution. It allows running tasks in parallel or on multiple agents, which is crucial for handling major changes or refactoring in a monorepo. Distributed task execution is the hardest problem of monorepos, and currently, only Bazel and NX are implementing it.

The next very cool feature is distributed task execution. No matter how well implemented your caching is, or how performant your affected graph is, there are occasionally times when none of this matters. For example, if you are updating the version of your framework, suddenly everything is affected and your cache is effectively non-existent, because you have a completely new version of the framework. All the cache that you have doesn't apply anymore. So you have to run everything from scratch. And although this might not seem important, we've seen solutions with hundreds of projects where just running the build takes an hour and a half. Their nightly builds run for a day and a half. So it's not even nightly, it's already bi-nightly. And imagine if they had to run this every time something major changes in their framework, or when they do some major refactoring. That would be a disaster. Luckily, distributed task execution gives us the ability to run things in parallel on multiple agents. Meaning, for example, if we have a hundred tasks that we need to finish, we can split them across, let's say, five agents, and each takes 20 tasks. But it's not that simple, because tasks don't have the same size. They might be intertwined; there might be some dependencies. And on top of that, if something fails, I don't want to go through all of the agents to see exactly where it failed. I want to see a unified report. And this is why distributed task execution is the hardest problem of monorepos. So far, only two monorepo solutions, Bazel and Nx, implement it.
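To show why "each agent takes 20 tasks" is too simple when tasks differ in size, here is a small sketch that balances by estimated duration instead of by count. It uses a plain greedy heuristic (longest task first, always to the least loaded agent); this is an illustrative technique chosen for the example, not the algorithm Bazel or Nx use, and it ignores dependencies between tasks. The task names and durations are made up.

```typescript
interface Task {
  name: string;
  minutes: number; // assumed estimate, e.g. taken from previous runs
}

interface Agent {
  id: number;
  tasks: string[];
  totalMinutes: number;
}

// Greedy balancing: hand the longest remaining task to the least loaded agent.
function distribute(tasks: Task[], agentCount: number): Agent[] {
  if (agentCount < 1) throw new Error("At least one agent is required");

  const agents: Agent[] = [];
  for (let id = 0; id < agentCount; id++) {
    agents.push({ id, tasks: [], totalMinutes: 0 });
  }

  const longestFirst = tasks.slice().sort((a, b) => b.minutes - a.minutes);
  for (const task of longestFirst) {
    const target = agents.reduce((least, agent) =>
      agent.totalMinutes < least.totalMinutes ? agent : least,
    );
    target.tasks.push(task.name);
    target.totalMinutes += task.minutes;
  }
  return agents;
}

const plan = distribute(
  [
    { name: "build-store", minutes: 30 },
    { name: "build-admin", minutes: 20 },
    { name: "test-core", minutes: 10 },
    { name: "lint-all", minutes: 5 },
    { name: "test-store", minutes: 25 },
  ],
  2,
);
console.log(plan);
// Agent 0: build-store, test-core, lint-all (45 minutes)
// Agent 1: test-store, build-admin (45 minutes)
```

Splitting the same five tasks by count would give one agent three tasks and the other two, with no guarantee that the totals are anywhere near equal. Handling dependencies and collecting a unified report are the harder parts that this sketch leaves out.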

8. Monorepo Features and Benefits

Short description:

Code constraints, powerful generators, automated migration, and consistent tooling are key features of monorepos. They provide clarity, speed, and efficiency in project development. Monorepos enable developers to restrict access to projects, configure CLI generators, automate migrations, and maintain consistent tooling. They also offer workspace analysis, graph visualization, local and remote caching, task orchestration, and distributed task execution.

The next feature, which is quite dear to me, is code constraints. This is your only weapon against unmanageable architectures. Imagine you are working on some experimental feature, and while you're still experimenting, someone suddenly starts using your feature in their project. Now not only do you have a dependent, for whom anything you change might be a breaking change, but if you decide that your entire experiment was a failure and you would rather remove the feature, well, you can't anymore, because someone depends on it.

Code constraints are the ability to restrict access to your project, to specify which types of projects can access it. It could be as simple as this: if you have, for example, Angular and React in your repository, you can say that Angular libraries should not load React libraries. But what if you have just a single project? Should you still use monorepos? The answer is simple. Absolutely, yes.
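The Angular and React rule mentioned above can be expressed as a small tag check. This is a simplified sketch of the idea only; the tag names, the `Constraint` shape and the `canDependOn` function are hypothetical and do not reproduce the configuration format of any particular tool.

```typescript
interface TaggedProject {
  name: string;
  tags: string[];
}

// A project carrying `sourceTag` may only depend on projects
// that carry at least one of `allowedTags`.
interface Constraint {
  sourceTag: string;
  allowedTags: string[];
}

// Hypothetical rules: framework-specific code may use its own framework
// or framework-agnostic code, but never the other framework.
const constraints: Constraint[] = [
  { sourceTag: "framework:angular", allowedTags: ["framework:angular", "framework:agnostic"] },
  { sourceTag: "framework:react", allowedTags: ["framework:react", "framework:agnostic"] },
];

function canDependOn(source: TaggedProject, target: TaggedProject, rules: Constraint[]): boolean {
  return rules
    .filter((rule) => source.tags.indexOf(rule.sourceTag) !== -1)
    .every((rule) => rule.allowedTags.some((tag) => target.tags.indexOf(tag) !== -1));
}

const angularUi: TaggedProject = { name: "angular-ui", tags: ["framework:angular"] };
const reactUi: TaggedProject = { name: "react-ui", tags: ["framework:react"] };
const dateUtils: TaggedProject = { name: "date-utils", tags: ["framework:agnostic"] };

console.log(canDependOn(angularUi, reactUi, constraints));   // false
console.log(canDependOn(angularUi, dateUtils, constraints)); // true
console.log(canDependOn(reactUi, dateUtils, constraints));   // true
```

An experimental feature could be protected the same way, for example with a tag that no other project is allowed to depend on until the experiment is finished.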

They also provide powerful generators. Normally when you start a project you start with a CLI that generates and scaffolds projects for you. Most of the time it's fine. But often it's not. That's why there are so many boilerplate projects that take you one additional step forward. Now, imagine if you could configure your CLI to generate stuff as you like it with unit test, with end-to-end test, with your state management set up the way you like with all the utilities and applications just the way you like it. With NX, for example, you have these powerful generators where you can really configure everything and not just on init time but also every time you're creating a new component or you're creating a new function or a new project you can specify how it should look like.

And it also gives you the ability to nicely migrate things. So imagine there's a new version of your beloved framework and there are, of course, some breaking changes. Normally, you would have to go through your codebase, spot all those problems and fix them manually. With generators and automated migration, you can just run the migration and it will fix all of this for you. And finally, we have consistent tooling. I don't know about you, but for me it was always annoying when I switched between different frameworks and had to remember: was it npm start, or npm serve, or was it npm develop? And it's not just about serving applications. All of these commands have different parameters and different ways of running. Consistent tooling creates a wrapper around these commands, so you can always run them in the same way regardless of which framework is under the hood, whether it's Next or Nuxt, Cypress or ESLint. And it's not just the commands; the parameters are always the same too. So you only need to learn one set of commands. And you don't even have to do that, because, as I forgot to mention, we also provide extensions for VS Code that give you a nice graphical representation of all the generators, so you can easily find the generator you like, and it will give you information about all the parameters you need to pass. You don't have to remember anything.

So let's recap. Monorepos bring clarity by giving us workspace analysis and graph visualization. They give us speed by leveraging local and remote caching, task orchestration, detection of affected nodes and distributed task execution.

9. Benefits of Monorepos

Short description:

Monorepos make development easy with code sharing, collocation, generators, consistent tooling, and code constraints. Visit monorepo.tools for comparisons. Use Nx for a 10x developer experience.

And finally, they make our development life easy because they give us code sharing, code collocation, powerful generators, consistent tooling and code constraints. If you would like to know how different monorepo solutions compare to each other, you can head over to monorepo.tools, which gives you an in-depth comparison with all the features listed. It is an open source website, maintained by the creators of popular monorepo solutions. Or, if you don't want to spend so much time, you can just take my word and use Nx. And if you are wondering whether this is the thing that will finally make you a 10x developer, why settle for 10x when you can be an Nx developer? Thank you and enjoy the conference.
