End the Pain: Rethinking CI for Large Monorepos

Scaling large codebases, especially monorepos, can be a nightmare on Continuous Integration (CI) systems. The current landscape of CI tools leans towards being machine-oriented, low-level, and demanding in terms of maintenance. What's worse, they're often disassociated from the developer's actual needs and workflow.

Why is CI a stumbling block? Because current CI systems are jacks-of-all-trades, with no specific understanding of your codebase. They can't take advantage of the context they operate in to offer optimizations.

In this talk, we'll explore the future of CI, designed specifically for large codebases and monorepos. Imagine a CI system that understands the structure of your workspace, dynamically parallelizes tasks across machines using historical data, and does all of this with a minimal, high-level configuration. Let's rethink CI, making it smarter, more efficient, and aligned with developer needs.

FAQ

The main challenges of running CI in a monorepo include managing the complexity of running multiple projects simultaneously, ensuring efficient and fast pipeline execution, and maintaining the CI setup as the monorepo grows. This often requires sophisticated tooling and strategies to handle dependencies and parallelize tasks effectively.

NX optimizes CI processes through features like affected commands, which only run tasks related to changed projects, and advanced caching mechanisms to avoid redundant computations. Additionally, NX supports fine-grained task distribution and dynamic scaling across multiple machines to improve efficiency and reduce CI times.
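
The caching idea can be illustrated with a small sketch. This is not NX's implementation: the `TaskCache` class, the task names, and the input strings are hypothetical, and a real tool would hash file contents and configuration rather than use the raw inputs as the key.

```typescript
// Hypothetical illustration of computation caching; not NX's actual code.
type CachedResult = { output: string; fromCache: boolean };

class TaskCache {
  private store = new Map<string, string>();

  // The key combines the task name with every input that can affect its result.
  run(task: string, inputs: string[], execute: () => string): CachedResult {
    const key = JSON.stringify([task, ...inputs]);
    const cached = this.store.get(key);
    if (cached !== undefined) {
      return { output: cached, fromCache: true }; // same inputs: replay the result
    }
    const output = execute();
    this.store.set(key, output);
    return { output, fromCache: false };
  }
}

let executions = 0;
const taskCache = new TaskCache();
const runBuild = (): string => {
  executions++;
  return "bundle";
};

const firstRun = taskCache.run("shop-app:build", ["main.ts@abc"], runBuild);
const secondRun = taskCache.run("shop-app:build", ["main.ts@abc"], runBuild);
const changedRun = taskCache.run("shop-app:build", ["main.ts@def"], runBuild);
```

Only `firstRun` and `changedRun` execute the build; `secondRun` is served from the cache because its inputs are identical.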

In monorepo management, project graphs are crucial for tracking the dependencies between different projects within the repository. This allows tools like NX to efficiently determine which parts of the monorepo are affected by changes, optimizing build and test processes by only processing relevant parts.
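
As a rough sketch of how a project graph can narrow down what needs to run, the function below walks from the changed projects to everything that depends on them. The graph shape, the project names, and the `affectedProjects` helper are assumptions made for illustration, not NX's internal data structures.

```typescript
// Hypothetical project graph: each project lists the projects it depends on.
type ProjectGraph = Record<string, string[]>;

function affectedProjects(graph: ProjectGraph, changed: string[]): string[] {
  // Invert the edges so we can walk from a project to its dependents.
  const dependents = new Map<string, string[]>();
  for (const project of Object.keys(graph)) {
    for (const dep of graph[project] ?? []) {
      const list = dependents.get(dep) ?? [];
      list.push(project);
      dependents.set(dep, list);
    }
  }

  // Breadth-first walk: a changed project affects everything that uses it.
  const affected = new Set<string>();
  const queue = [...changed];
  while (queue.length > 0) {
    const current = queue.shift();
    if (current === undefined || affected.has(current)) continue;
    affected.add(current);
    queue.push(...(dependents.get(current) ?? []));
  }
  return Array.from(affected).sort();
}

const exampleGraph: ProjectGraph = {
  "shop-app": ["ui-lib", "cart-lib"],
  "admin-app": ["ui-lib"],
  "cart-lib": ["utils"],
  "ui-lib": [],
  utils: [],
};

// Only cart-lib changed, so admin-app, ui-lib and utils can be skipped.
const affectedByCartChange = affectedProjects(exampleGraph, ["cart-lib"]);
```

Here `affectedByCartChange` contains `cart-lib` and `shop-app`, the only projects whose tasks would need to run for that change.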

NX can distribute tasks dynamically across multiple CI machines. It uses a coordinator on NX Cloud infrastructure to distribute tasks based on the project graph, optimizing resource usage and reducing build times by balancing loads across available machines.
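
One simple way to picture load balancing based on historical data is a greedy "longest task first" assignment. The sketch below is an assumption about how such a scheduler could look, not the algorithm NX Cloud uses; it also ignores dependencies between tasks, and all names and durations are made up.

```typescript
// Hypothetical task record with a duration learned from earlier CI runs.
type TimedTask = { id: string; historicalSeconds: number };
type AgentPlan = { load: number; tasks: string[] };

function distributeTasks(tasks: TimedTask[], agentCount: number): AgentPlan[] {
  if (agentCount < 1) throw new Error("agentCount must be at least 1");
  const agents: AgentPlan[] = [];
  for (let i = 0; i < agentCount; i++) agents.push({ load: 0, tasks: [] });

  // Place the longest tasks first, each on the currently least-loaded agent.
  const longestFirst = [...tasks].sort((a, b) => b.historicalSeconds - a.historicalSeconds);
  for (const task of longestFirst) {
    const target = agents.reduce((least, agent) => (agent.load < least.load ? agent : least));
    target.tasks.push(task.id);
    target.load += task.historicalSeconds;
  }
  return agents;
}

const distributionPlan = distributeTasks(
  [
    { id: "shop-app:e2e", historicalSeconds: 60 },
    { id: "shop-app:build", historicalSeconds: 40 },
    { id: "ui-lib:test", historicalSeconds: 30 },
    { id: "cart-lib:test", historicalSeconds: 20 },
    { id: "utils:lint", historicalSeconds: 10 },
  ],
  2
);
```

With these numbers both agents end up with 80 seconds of work, whereas a naive split by task count could leave one machine idle while the other finishes.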

NX Agents are part of NX's suite of tools designed to improve CI efficiency by helping with the distribution of tasks across multiple machines. They enable dynamic scaling and fine-grained distribution, reducing the overhead of manual configuration and ensuring efficient use of resources.

NX addresses flakiness detection by leveraging caching to identify when a task produces different results under the same conditions, indicating potential flakiness. It can then automatically rerun these tasks on different machines to confirm and address the issue, ensuring reliability in the CI process.
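
The "different results under the same conditions" check can be sketched as follows, treating an identical input hash as "the same conditions". The `TaskRun` record and the `findFlakyTasks` helper are hypothetical and only illustrate the idea, not how NX stores or compares runs.

```typescript
// Hypothetical record of one task execution in CI.
type TaskRun = { task: string; inputHash: string; passed: boolean };

function findFlakyTasks(runs: TaskRun[]): string[] {
  // Group runs by task and input hash, remembering which outcomes occurred.
  const byKey = new Map<string, { task: string; sawPass: boolean; sawFail: boolean }>();
  for (const run of runs) {
    const key = JSON.stringify([run.task, run.inputHash]);
    const entry = byKey.get(key) ?? { task: run.task, sawPass: false, sawFail: false };
    if (run.passed) entry.sawPass = true;
    else entry.sawFail = true;
    byKey.set(key, entry);
  }

  // Identical inputs that both passed and failed point to a flaky task.
  const flaky = new Set<string>();
  byKey.forEach((entry) => {
    if (entry.sawPass && entry.sawFail) flaky.add(entry.task);
  });
  return Array.from(flaky).sort();
}

const flakyTasks = findFlakyTasks([
  { task: "shop-app:e2e", inputHash: "h1", passed: false },
  { task: "shop-app:e2e", inputHash: "h1", passed: true }, // same inputs, other outcome
  { task: "ui-lib:test", inputHash: "h2", passed: false },
  { task: "ui-lib:test", inputHash: "h3", passed: true }, // inputs changed, so not flaky
]);
```

Only `shop-app:e2e` is reported, because `ui-lib:test` changed its outcome together with its inputs.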

Juri Strumpflohner
25 min
15 Nov, 2023

Video Summary and Transcription

Today's Talk discusses rethinking CI in monorepos, with a focus on leveraging the implicit graph of project dependencies to optimize build times and manage complexity. The use of NX Replay and NX Agents is highlighted as a way to enhance CI efficiency by caching previous computations and distributing tasks across multiple machines. Fine-grained distribution and flakiness detection are discussed as methods to improve distribution efficiency and ensure a clean setup. Enabling distribution with NX Agents simplifies the setup process, and NX Cloud offers dynamic scaling and cost reduction. Overall, the Talk explores strategies to improve the scalability and efficiency of CI pipelines in monorepos.

1. Introduction to Rethinking CI in Monorepos

Short description:

Today, I would like to talk about how we could potentially rethink how CI works in monorepos. My name is Juri Strumpflohner, and I have been using monorepos for six years. I am also a core team member of NX and a Google developer expert in web technologies and Angular.

All right. So today, I would like to talk a bit about how we could potentially rethink how CI works compared to the current CI situation that we have, with a particular focus on monorepos and potentially large monorepos, and how we could optimize that. So before we go ahead, my name is Juri Strumpflohner. I've been using monorepos for probably six years already. For about four years, I've also been a core team member of NX, which is a monorepo management tool. I'm also a Google developer expert in web technologies and Angular and an instructor on Egghead, where I publish courses on web development and developer tools.

2. Considerations for CI in Monorepos

Short description:

When working with monorepos, we need to consider the local developer experience, automation and rules, and task pipelines. Current CI solutions are not optimized for monorepos and require low-level manual maintenance. Developers want a high-level way of defining their CI structure and need strategies to ensure scalability and manageable speed and throughput.

So when we go in the direction of a monorepo, it doesn't come for free, right? There are some considerations that come into play. One big one is obviously the local developer experience. How do we structure a project in a monorepo? How do we make sure that we have consistency in how these projects are set up? Which versions do they use? How are they configured, so that we can also have some sort of team mobility between projects, which will obviously also help with maintenance? Automation and rules around those projects are also a very important part, especially looking at maintenance and the longevity of such a monorepo.

And also features like task pipelines, being able to run things in parallel. Because clearly in a monorepo, we don't run just one project anymore, but potentially a series of projects where there are also dependencies. And so we need to be able to build the projects we depend on first before we actually run our project. And those are things that we don't want to do manually, but rather want to have tooling support for.

But today I would like to specifically focus on the elephant in the room whenever we talk about monorepos, which often is not paid attention to immediately, which is kind of a mistake: CI. The current CI situation is basically not optimized for monorepos. It is very machine-oriented, so we need to spell out the exact instructions that we want to process, a very instructional kind of approach. It is very low level in that sense as well. It requires a lot of maintenance because, as I mentioned, we no longer run just one project and that's it. We run multiple projects. And so we need to have strategies for tuning the CI to make sure that, even as our monorepo structure changes and more projects come into the monorepo, it still works.

It is also, I would say, a bit removed from what developers want. As a developer, I would want a more high-level way of defining my CI structure, my CI run, my CI pipeline, in the sense of saying, hey, I want to run all the projects that got touched in that PR, rather than having to fine-tune every single aspect of that project. And as I said before, current CI systems don't really work for monorepos: they are designed to be much more general purpose and geared towards single-project workspaces.
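
The point above about building the projects we depend on before the project itself can be sketched as a topological sort over the dependency graph. This is a minimal illustration under assumed names: the `DependencyMap` type, the `buildOrder` function, and the example projects are hypothetical and do not reflect how any particular tool schedules tasks.

```typescript
// Hypothetical dependency map: each project lists what must be built before it.
type DependencyMap = Record<string, string[]>;

function buildOrder(deps: DependencyMap): string[] {
  const order: string[] = [];
  const state = new Map<string, "visiting" | "done">();

  // Depth-first visit: emit a project only after all of its dependencies.
  const visit = (project: string): void => {
    const current = state.get(project);
    if (current === "done") return;
    if (current === "visiting") throw new Error(`Circular dependency at ${project}`);
    state.set(project, "visiting");
    for (const dep of deps[project] ?? []) visit(dep);
    state.set(project, "done");
    order.push(project);
  };

  for (const project of Object.keys(deps).sort()) visit(project);
  return order;
}

const workspaceOrder = buildOrder({
  app: ["feature", "ui"],
  feature: ["ui", "utils"],
  ui: ["utils"],
  utils: [],
});
// workspaceOrder is ["utils", "ui", "feature", "app"]
```

Projects that do not depend on each other could run in parallel; the sketch only shows the ordering constraint that tooling has to respect.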

So today I would like to dive into some of these aspects, specifically looking at speed and throughput, because that is one major thing we need to pay attention to; otherwise our monorepo becomes a problem. If we have good collaboration going on locally within teams, but our pipeline takes over an hour for each PR, that's going to be a problem.
