Yarn in Depth: Why & How

Since 2017, Yarn has proven itself a pillar of JavaScript development, incubating numerous features our ecosystem now heavily relies on. As years passed and competitors improved, so did Yarn, and it's now time to dive into the features and tradeoffs that make Yarn a truly unique gem of the JavaScript ecosystem.

34 min
09 Jun, 2021

Video Summary and Transcription

Yarn is a package manager and project manager that prioritizes soundness and reliability. It offers workflows and features like zero-installs and workspace alignment. Yarn's success lies in its infrastructure and hidden gems. It has a modular architecture and is committed to the ecosystem. Yarn emphasizes correctness and is a better fit for those who value it.

1. Introduction to Yarn

Short description:

Yarn is a package manager for JavaScript, but it's actually more than that. It's intended to be a project manager that lets you manage scripts, split your application into stand-alone modules, manage release cycles, monitor script usage, and enforce monorepo standards. Yarn values community contribution and relies on its contributors to make the project sustainable. It prioritizes soundness and ensures that issues in your application are not overlooked. Yarn aims to provide a reliable and dependable development environment.

Hi everyone, for those who don't know me, my name is Maël. I currently work at Datadog, a company focused on cloud monitoring, and I have been leading Yarn's development for a few years now.

Today we're going to talk extensively about what Yarn is and what it can bring you. I hope that by the end of this talk you will have a better idea of what makes this project unique in our ecosystem. If you ask anyone, they will likely tell you that Yarn is a package manager for JavaScript, and they would be right. But that's only part of the story. Yarn is not just a package manager, it's intended to be a project manager. And indeed, if you think about it, Yarn lets you manage scripts. It lets you split your application into stand-alone modules. As we'll see later, it will also manage your release cycles, monitor script usage, or even enforce monorepo standards. All these tasks go far beyond the typical package manager, and every release we make pushes the boundaries even further.

So it's a project manager, which sounds nice, but how does it translate in practice? What are the things that we look for when we merge PRs, the things that make the project what it is? Let's discuss core values. The first thing to realize is that we are a community of contributors. Open source is a very taxing environment, and most projects struggle to find ways to make their work sustainable. Yarn isn't exempt from that. To help a bit, we rely a lot on our contributors to be the change they want to see and to contribute back to the project they like. In practice, it means that our core team spends as much time working on our infrastructure as on the product itself. Recently, we moved from Webpack to esbuild to make building Yarn easier. Multiple commands let you build parts of the Yarn binaries from source, so you can easily try out independent features, or even write them yourself and start using them in your project without waiting for them to be merged. Yarn is all about making it possible for you to experiment well past what we could offer by ourselves.

The second very important value we always keep in mind when developing Yarn is soundness. Yarn must tell you if something is wrong in your application. It must not let it go unnoticed. It must not make uncontrolled assumptions. This may sound a bit rigid, and it is the reason why you may have seen so many errors about X depending on Y without listing it in its dependencies, but it's really critical. Whether you author applications or libraries, you need to have confidence that something that works now will also work in production or when installed by your customers. If something needs to break, then it needs to break early so that you don't get hit by it later on when you're not prepared for it. The third value is good practices.

2. Yarn's Impact and Workflows

Short description:

The JavaScript landscape is large and fast-paced. Yarn, as a package manager, aims to guide users and help them understand the tools they are using. Yarn prioritizes developer experience and aims to do the right thing by default. It eliminates the need for manual validation of project installation. Yarn's impact can be seen through stories of projects that have adopted it, including the Yarn repository itself. The Yarn team actively uses all features shipped into the core, which has positively influenced their work. Yarn also provides workflows, such as release cycles, to enhance development processes.

The JavaScript landscape is large, changes fast, and has many opinionated people. As a package manager, we are uniquely positioned to help our users understand the tools they are using and guide them along the happy path. Not only should using Yarn solve a practical need, it should also help you learn and become a better engineer along the way.

We've reached the last one in this set: developer experience, doing the right thing by default. Let's take an example. In Yarn 1, there was a command named yarn check. It validated that the project was correctly installed. As a result, various companies had a policy of calling yarn check after every install, just to be sure that everything was okay. We removed this command from Yarn 2, and can you guess why? The thing is, Yarn should do, and already does, the right thing by default. You shouldn't have to validate that your project is correctly installed, because it should be correctly installed from the get-go.

So we've seen a bit of what Yarn claims to be. Let's talk a bit about DevOps. What can Yarn do for you, practically speaking? We're going to go over two interesting stories of projects that adopted it. One is an open source project you may have heard of. The other is an internal application. Unsurprisingly, the first one is Yarn itself. Before we dive in, let me tell you a funny story. Back in Yarn 1, we didn't actually use workspaces to develop Yarn, and it was a real problem because not only were bugs hidden from us, we also weren't directly confronted with the value of some important features. For example, when you use workspaces, it's very apparent that you need to be able to run a script in all your workspaces at once. However, since we didn't use workspaces ourselves, it didn't seem like a huge deal to us at the time, and thus the feature took a long time to come. Nowadays, we have an internal rule that the Yarn team needs to use all features shipped into the core, and I believe it has had a strong positive impact on the work we've done so far. This is one reason why the Yarn repository is so important for us: it's where we experiment with a lot of features, not all of them being used by our users themselves. Anyway, let's talk about workflows.

The first one is release cycles. Our previous process was very simple. We had a file at the root of the repository, and each PR was expected to add one line to it. It worked fine, but after switching to a monorepo, it wouldn't have scaled very well, since we needed the ability to release each workspace by itself. So we developed the release workflow instead. The idea is that each PR we merge has to include a little file created by Yarn that lists all the workspaces that have been changed by the PR, whether they will need to be part of the next release, and whether it will be a patch, a minor, or a major version. Our CI validates that this file exists and checks its content, and at release time we simply need to tell Yarn to aggregate all the version files currently checked into the repository into package bumps and changelog generation.
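The aggregation step described above can be sketched as follows. This is a minimal TypeScript model, not Yarn's actual implementation or file format: the `VersionFile` shape, the `aggregateBumps` and `applyBump` helpers, and the workspace names are all hypothetical, and the rule that the strongest requested bump wins is an assumption made for illustration.

```typescript
// Hypothetical model of deferred release files; not Yarn's real file format.
type Bump = "patch" | "minor" | "major";

// One file per merged PR: which workspaces changed and how each should be bumped.
type VersionFile = Record<string, Bump>;

const strength: Record<Bump, number> = { patch: 0, minor: 1, major: 2 };

// Merge every checked-in version file, keeping the strongest bump per workspace
// (assumption: "major" overrides "minor", which overrides "patch").
function aggregateBumps(files: VersionFile[]): Map<string, Bump> {
  const result = new Map<string, Bump>();
  for (const file of files) {
    for (const workspace of Object.keys(file)) {
      const bump = file[workspace];
      const current = result.get(workspace);
      if (current === undefined || strength[bump] > strength[current]) {
        result.set(workspace, bump);
      }
    }
  }
  return result;
}

// Apply a bump to a plain "major.minor.patch" version string.
function applyBump(version: string, bump: Bump): string {
  const [major, minor, patch] = version.split(".").map(Number);
  if (bump === "major") return `${major + 1}.0.0`;
  if (bump === "minor") return `${major}.${minor + 1}.0`;
  return `${major}.${minor}.${patch + 1}`;
}

// Two merged PRs touched the (made-up) core workspace with different bumps.
const pending: VersionFile[] = [
  { "@example/core": "patch", "@example/cli": "minor" },
  { "@example/core": "minor" },
];
const plan = aggregateBumps(pending);
console.log(applyBump("2.4.1", plan.get("@example/core")!)); // prints "2.5.0"
```

Keeping one small file per PR means contributors never edit a shared file, so the release decision can be validated in CI and merged without conflicts.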

3. Yarn's Zero-installs and Workspace Alignment

Short description:

We use Zero-installs to keep the project cache inside the repository, allowing instant project loading and easy branch switching. Yarn's constraints feature ensures workspace alignment and enforces patterns across all workspaces. This eliminates the need for external tools and provides consistency. At Datadog, we use a Yarn plugin to monitor script execution and identify potential bottlenecks. Checking in dependencies helps us avoid deployment issues caused by remote registry downtime.

Another workflow that we are using is Zero-installs. The idea isn't that complicated. It's just that we decided to keep the cache of the project inside the repository itself. So, for each package that we have, we keep exactly one zip file in the repository. As a result, anyone who clones the project is able to instantly start working on Yarn. They don't even have to run yarn install. More importantly, it means that our contributors are able to switch from one branch to another at almost no cost in terms of context switching. That's extremely valuable because it ties into the experimentation subject that I mentioned earlier.

We can quickly iterate on multiple different branches without having to pay the cognitive cost of running an install each time we run git checkout. I should mention that adding the packages of the cache to the repository is a trade-off that we made for our project, but it's not something that you absolutely have to do yourself if you're not convinced by the idea. We know that a few other projects have decided that they wanted to run yarn install as they always used to, and that's perfectly fine. It's just that nowadays you have the choice.

One problem when you have a lot of workspaces, and we have a lot, I think something like 44 at the moment, is that it becomes difficult to make sure that all the workspaces are aligned in their configuration. For instance, how do you ensure that no two workspaces depend on different versions of the same dependency? It can become a daunting task, and existing tools didn't help much with that. Now, thanks to the constraints feature in the Yarn core, we are able to enforce specific patterns across all workspaces in very few lines. Better yet, Yarn can apply the fixes itself. For example, in the Yarn repository itself, we use that feature not only to enforce that all workspaces depend on the same versions of their dependencies, but also to ensure that the license is properly set on all of our packages, that the build scripts are consistent, this kind of thing. So, those three examples give you a good idea of what value open source projects can derive from Yarn. In general, they no longer need tools like Lerna or Changesets, as those workflows now receive first-party support. But internal projects can benefit from Yarn as well, as we are going to see.
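To make the "no two workspaces depend on different versions of the same dependency" rule concrete, here is a small TypeScript sketch. It does not use Yarn's constraints syntax or engine; the `Workspace` shape, the `findVersionConflicts` and `alignDependency` helpers, and the workspace names are hypothetical and only model the check and the automatic fix.

```typescript
// Hypothetical in-memory view of a monorepo; names and ranges are made up.
interface Workspace {
  name: string;
  dependencies: Record<string, string>;
}

interface Conflict {
  dependency: string;
  // Maps each requested range to the workspaces requesting it.
  ranges: Record<string, string[]>;
}

// Report every dependency that is requested with more than one range.
function findVersionConflicts(workspaces: Workspace[]): Conflict[] {
  const seen = new Map<string, Map<string, string[]>>();
  for (const ws of workspaces) {
    for (const dep of Object.keys(ws.dependencies)) {
      const range = ws.dependencies[dep];
      const byRange = seen.get(dep) ?? new Map<string, string[]>();
      byRange.set(range, [...(byRange.get(range) ?? []), ws.name]);
      seen.set(dep, byRange);
    }
  }
  const conflicts: Conflict[] = [];
  seen.forEach((byRange, dependency) => {
    if (byRange.size > 1) {
      const ranges: Record<string, string[]> = {};
      byRange.forEach((names, range) => { ranges[range] = names; });
      conflicts.push({ dependency, ranges });
    }
  });
  return conflicts;
}

// The "fix" step: rewrite every workspace so the dependency uses one chosen range.
function alignDependency(workspaces: Workspace[], dependency: string, range: string): Workspace[] {
  return workspaces.map((ws) =>
    dependency in ws.dependencies
      ? { ...ws, dependencies: { ...ws.dependencies, [dependency]: range } }
      : ws,
  );
}

const repo: Workspace[] = [
  { name: "@example/cli", dependencies: { typescript: "^4.2.0", chalk: "^4.0.0" } },
  { name: "@example/core", dependencies: { typescript: "^4.3.0" } },
];
console.log(findVersionConflicts(repo).map((c) => c.dependency)); // one conflict: typescript
console.log(findVersionConflicts(alignDependency(repo, "typescript", "^4.3.0")).length); // prints 0
```

The same pattern of "collect facts about every workspace, report mismatches, optionally rewrite" extends to other fields such as the license or the build scripts.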

I work at Datadog, and Datadog is a huge web application along with a sizable JavaScript infrastructure supporting everything from linting to deployments. Yarn has a central place in this engine, and we'll go over three examples. If you don't know what Datadog is, we are making some really good cloud monitoring, extracting signals out of arbitrary metrics. Given this DNA, it won't surprise you to know that we also monitor the way our developers interact with our own infrastructure, in particular in terms of response time. To achieve this, we wrote a plugin for Yarn that watches all scripts that get executed and how much time they take, and sends it all to a dashboard. All this data then helps us identify potential bottlenecks before they become a problem, and generally helps us prioritize our work. As we saw with the Yarn repository itself, checking in dependencies has a powerful effect on one's ability to efficiently contribute to a project. But even in corporate environments, this pattern has its use. From our data, the remote registry goes down approximately once a month. By keeping our own copy of the packages, we're never blocked during our deployments.
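The script-monitoring idea can be illustrated with a generic timing wrapper. This is a sketch only: the real plugin would hook into Yarn's plugin API, which is not reproduced here, and the `ScriptExecutor` and `ScriptTiming` types, the `withTiming` helper, and the in-memory sink are all hypothetical stand-ins.

```typescript
// Hypothetical shapes: this does not use Yarn's real plugin API.
type ScriptExecutor = () => Promise<number>; // resolves to the script's exit code

interface ScriptTiming {
  workspace: string;
  script: string;
  durationMs: number;
  exitCode: number;
}

// Wrap a script executor so that each run is timed and reported to a sink.
function withTiming(
  workspace: string,
  script: string,
  executor: ScriptExecutor,
  report: (timing: ScriptTiming) => void,
): ScriptExecutor {
  return async () => {
    const start = Date.now();
    let exitCode = 1; // reported as a failure if the executor throws
    try {
      exitCode = await executor();
      return exitCode;
    } finally {
      report({ workspace, script, durationMs: Date.now() - start, exitCode });
    }
  };
}

// Usage: collect timings in memory instead of sending them to a dashboard.
const timings: ScriptTiming[] = [];
const build = withTiming("@example/app", "build", async () => 0, (t) => { timings.push(t); });
build().then((code) => console.log(code, timings.length)); // prints 0 1
```

Reporting from a `finally` block means failed scripts are measured too, which matters when the goal is to spot slow or flaky steps.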

4. Yarn's CI Efficiency and Hidden Gems

Short description:

Despite a high volume of commits, our CI doesn't waste much time running installs. Yarn supports the patch protocol, integrating it with the cache system. Yarn offers many hidden gems, such as compressed packages, installing from Git, and portable scripts. Yarn's success lies in the infrastructure built around it, solidifying its foundations and enabling fast iteration.

Additionally, despite a high volume of commits, our CI doesn't waste much time running installs. All in all, instead of having to spend minutes installing projects in every single CI run, it's now really just a matter of seconds.

And one last item I want to mention. We have a lot of projects, and a lot of dependencies. Some of them have bugs, and we try to address them as well as we can. But until the fixes are actually merged in the upstream project and released, which is sometimes a bit difficult, we need some simple way to use our changes locally while keeping some auditability. Some third-party tools exist just for that. But Yarn now supports it out of the box thanks to the patch protocol, in a way that's directly integrated with the cache system and all its checks. It's really nice to see this kind of feature, which came from the community through packages such as patch-package, published on npm, become a first-party feature directly integrated with the package manager.

So I think those three examples give you a good overview of what Yarn has to offer in the context of a company. But keep in mind that those are only examples. For instance, we didn't talk about how Yarn can keep your packages compressed on disk and share them between projects, if that's what you prefer. Or how you can install packages straight from Git, or how your scripts are made portable across both Windows and POSIX without you having to do anything. And that itself is only a quick list from a single slide. There are a lot more hidden gems in Yarn.

So far we've discussed what Yarn is and what it can bring you and your organization. But now I want to take the opportunity to go further than usual and tell you why it works so well. Let's make a quick dive into Yarn's development process itself. A large part of our secret sauce, if you will, is our infrastructure. It may come as a surprise that the most valuable asset we have isn't Yarn itself but rather the layers built around it. But that's really the case. You probably noticed that Yarn's iteration speed is far beyond the industry standard, and the big reason for that is that we have spent a lot of resources solidifying our foundations.

5. Testing Strategy and Infrastructure

Short description:

End-to-end tests are run every four hours on popular third-party projects to ensure compatibility with Yarn. Building the codebase using the latest tools like TypeScript has increased confidence in the code. The testing strategy was revised to use the CLI binary, making it easier for external contributors to write tests. Trusting the infrastructure and investing upfront costs leads to faster iteration and decreased burnout risk.

One example of this is the end-to-end tests. Every four hours, a comprehensive set of tests is run. But rather than test Yarn's behavior, we actually test popular third-party projects from the ecosystem: Next.js, Create React App, Gatsby, Angular, Jest, and those are only some of the big names that we monitor. If any of them accidentally ships something incompatible with Yarn, or if Yarn accidentally ships a regression, we are aware of it less than four hours later and can immediately work with the maintainers to find a solution. Even if that's an uncommon scenario, having this mechanism in place means that we can afford to be slightly less conservative than usual. We don't have to assume that our work is compatible with the ecosystem. We have proof that it really is the case.

Similarly, we made a huge effort to build our codebase using the latest available tools, including TypeScript. I can't stress enough the benefit that it gave us. Not only did it help us avoid silly mistakes in our own PRs, but it also raised the confidence we can have that a PR won't have unforeseen effects, which means that it's easier for us to merge pull requests from third-party contributors and trust that things will just work.

Finally, our last major release also came with a revision of our testing strategy. At the time, we had a bunch of unit tests where we instantiated a lot of classes from the core and called their methods to check their behaviors. While it worked well enough for a time, this approach proved very difficult to maintain, because it made refactorings practically impossible: they would have required either rewriting all the tests or keeping an outdated interface just for the tests. So we decided to try something a bit different, starting from v2. We still rewrote all the tests, unfortunately, and I had to do this incrementally, but this time we updated them to directly use the CLI binary, exactly like real users would. And because the CLI interface is not really meant to change, it allowed us to remove a lot of mental overhead while making it much easier for our external contributors to understand how to write tests, because literally all they have to do is write CLI commands.

I think the lesson here is simple. To write code efficiently at large scale, like for a major open source project, you must be able to trust your infrastructure. And by accepting to pay the costs upfront like we did, you will be able to both reach a faster iteration speed and decrease the risk of burnout for your team, because they will have less work and less pressure to assume that things are working.
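The difference between testing internal classes and testing through the command line can be sketched like this. To stay self-contained, the example uses a hypothetical in-process `createCli` function as a stand-in for spawning a real binary; the `Store` class, the `add` and `list` commands, and the output format are all made up and have nothing to do with Yarn's own commands.

```typescript
// Stand-in for a real CLI binary. Everything here is hypothetical.
interface CliResult {
  exitCode: number;
  stdout: string;
}

// Internals: free to be refactored, since no test references them directly.
class Store {
  private items: string[] = [];
  add(name: string): void { this.items.push(name); }
  list(): string[] { return this.items.slice().sort(); }
}

// The only surface the tests know about: arguments in, exit code and output out.
function createCli(): (argv: string[]) => CliResult {
  const store = new Store();
  return (argv) => {
    const [command, ...args] = argv;
    if (command === "add" && args.length === 1) {
      store.add(args[0]);
      return { exitCode: 0, stdout: `added ${args[0]}\n` };
    }
    if (command === "list") {
      return { exitCode: 0, stdout: store.list().join("\n") + "\n" };
    }
    return { exitCode: 1, stdout: "unknown command\n" };
  };
}

// A test written only against the command-line surface.
const cli = createCli();
cli(["add", "lodash"]);
cli(["add", "chalk"]);
const listed = cli(["list"]);
console.log(listed.exitCode === 0 && listed.stdout === "chalk\nlodash\n"); // prints true
```

Because the test never touches `Store`, that class can be renamed, split, or replaced without rewriting a single test, which is the property described above.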

6. Yarn's Architecture and Future

Short description:

Yarn used to be a monolith, but now it is modular, with a core containing critical algorithms. Features are implemented as modules, making it easy to extend Yarn with new functionalities. Community members have built their own plugins to customize workflows. Yarn 3 is in development with improvements and new features. Sponsor programs are being explored to support Yarn's sustainability.

So, let's focus on Yarn itself, and more precisely on one particular part of it: its architecture. As I mentioned earlier, Yarn used to be a monolith. Workspaces didn't exist at the beginning, so we didn't use them. And as we worked on Yarn, it became clear that some parts of the application would benefit from being independent from the rest, even if only to prevent them from accidentally relying on unrelated components they shouldn't even be aware of. So, one of the first things I did while working on it was to figure out how to make Yarn modular.

As of today, we have a core containing all the critical algorithms, for example the resolver, which takes all the packages, queries the npm registry, and gets all the versions needed, or the fetcher, which downloads packages. The core exposes a bunch of interfaces that modules can implement whenever they want. Most of the features you see in Yarn are implemented this way. Communication with the npm registry is a module. The package tarball generation is a module. Plug'n'Play itself, which you may know as the new install strategy in Yarn, is an external module. It's only at release time that those modules are bundled together to yield the Yarn CLI that you know. Interestingly, this architecture also makes it very easy to extend Yarn with new functionalities. For instance, the focus command as we implemented it is literally a hundred lines inside its own module. You could implement it yourself, and in fact a few people actually did. That's an interesting point, because it's sometimes very difficult to build workflows that satisfy everyone. For example, we have two community members who have built their own plugins that do exactly what they want with their focused installs. They remove dependencies they don't want. They generate a zip archive that they can publish to AWS in one pass. In some cases, it's really much easier to implement your own workflow than to expect an open source tool to implement exactly what you need. So if a behavior doesn't match your expectations, you can experiment with a new one and let us know how it goes.
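The shape of such a modular design can be sketched in a few lines. This is an illustration under stated assumptions, not Yarn's plugin API: the `Resolver` and `Plugin` interfaces, the `Core` class, the `registry:` prefix, and both example plugins are hypothetical and much simpler than the real thing.

```typescript
// Hypothetical interfaces; Yarn's real plugin API is richer and differs in detail.
interface Resolver {
  supports(descriptor: string): boolean;
  resolve(descriptor: string): string; // returns a resolved reference
}

type CommandHandler = (args: string[]) => string;

interface Plugin {
  name: string;
  resolvers?: Resolver[];
  commands?: Record<string, CommandHandler>;
}

// The core only knows the interfaces; features come from whatever is bundled in.
class Core {
  private resolvers: Resolver[] = [];
  private commands: Record<string, CommandHandler> = {};

  constructor(plugins: Plugin[]) {
    for (const plugin of plugins) {
      this.resolvers.push(...(plugin.resolvers ?? []));
      Object.assign(this.commands, plugin.commands ?? {});
    }
  }

  resolve(descriptor: string): string {
    const resolver = this.resolvers.find((r) => r.supports(descriptor));
    if (!resolver) throw new Error(`No resolver supports ${descriptor}`);
    return resolver.resolve(descriptor);
  }

  run(command: string, args: string[]): string {
    const handler = this.commands[command];
    if (!handler) throw new Error(`Unknown command ${command}`);
    return handler(args);
  }
}

// Two made-up plugins: one contributes a resolver, the other a command.
const registryPlugin: Plugin = {
  name: "example-registry",
  resolvers: [{
    supports: (d) => d.indexOf("registry:") === 0,
    resolve: (d) => `${d}#resolved`,
  }],
};
const helloPlugin: Plugin = {
  name: "example-hello",
  commands: { hello: (args) => `hello ${args.join(" ")}` },
};

// "Release time": bundle the chosen plugins with the core.
const core = new Core([registryPlugin, helloPlugin]);
console.log(core.resolve("registry:left-pad")); // prints "registry:left-pad#resolved"
console.log(core.run("hello", ["world"])); // prints "hello world"
```

Adding a custom workflow then amounts to writing one more plugin object and including it in the list, without touching the core.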

So, as I mentioned, in the past few months, at least two community members have started to solve their monorepo deployments by authoring plugins dedicated to their use case. So where do we see Yarn in the future? No one can predict what's going to happen, but I can already tell you what's currently happening. We're working on the next major Yarn version, which will be Yarn 3. It will feature its share of improvements, clean up some behaviors of our CLI, and add new features, including some that I'm sure will make sense. At the same time, we're starting to look into making Yarn sustainable by looking into sponsor programs. I mentioned earlier that open source projects take a lot of effort, resources, and time, and that we need to find ways to support this work. This is where you can do something. If you or your team are interested in getting your company to sponsor some of the time we spend on Yarn, please contact us. We'd be happy to discuss arrangements that will be favorable to both parties.

7. Yarn's Impact and Commitment

Short description:

Yarn is alive and in better shape than ever before. It protects you from mistakes and offers value to the ecosystem. We've made an impact on projects like Next.js, Gatsby, Create React App, webpack, and more. Our test suite rarely reports problems. We're committed to moving the ecosystem in the right direction. Thank you for joining us!

All this to say, Yarn is very much alive and in better shape than ever before. Our team is aligned. Our objectives are clear and our tech is sound. Certainly, we have taken a more opinionated course than we used to. And we think that, for most people, it will be a net positive, knowing that the tool will protect you from your mistakes.

Of course, different people may prefer different kinds of tools. And that's fine, because no one project is for everyone. Overall, I think it's clear that Yarn is here to stay. It's not just a fad. And we truly believe that we offer something different enough to provide value to the ecosystem.

We've been in contact with many projects during the past few years, tweaking configurations, explaining concepts, and pushing for good practices. And I think it really made an impact on projects like Next.js, Gatsby, Create React App, webpack, and many more. When we started, the tests that I mentioned earlier, those end-to-end tests that we run on all projects every four hours, frequently failed due to missing dependencies. Each time it happened, we went to the relevant project and shared a PR with them in order to add those missing dependencies. Over time, they started to pay more and more attention to it. And nowadays, our test suite very rarely reports problems. This is also the work that Yarn does. We don't only make a CLI tool. We also help the ecosystem move in the right direction, not only for Yarn users, but for users of every package manager. Of course, the problem is that this work is not always public-facing or shiny. So sometimes, it may be difficult for outsiders to have any idea of the real amount of work that we put into the project, but it's something that really is important for us and that we are going to keep doing in the future.

So, we've reached the end of this presentation. I hope you enjoyed it. And I'll be happy to answer questions you might have on this subject. Thank you.

Thank you so much for that amazing talk, Maël. I really, really enjoyed it. I love hearing about Yarn, about where it came from, where it's been, and where it's going. Remember, if you have any questions, throw them into the chat so that we can grab them.

8. Yarn Poll Results and Evolution

Short description:

The poll results show that there are both new adopters and long-time users of Yarn. It's interesting to see a significant number of people who have been using it for less than a year. I joined the Yarn project while working at Facebook and continued working on it even after leaving the company. Yarn has evolved to become more stable and reliable over time, with a strong emphasis on technical correctness.

And we can ask Maël. I'm going to invite Maël on. But the first thing we're going to talk about is the results of the poll. So, the poll question was, for how long have you been using Yarn? And the top answer was one to three years, followed by less than a year. So, quite a lot of new adopters, and just a few people who have been using it for longer. Does that surprise you at all, Maël?

No, that's actually very interesting, because we hear every now and then that most people using Yarn have been using it for a while. But it's interesting to see that almost half of the people who answered said less than one year, and the rest are in one to three years. Yarn isn't that old yet, so one to three years can be a lot of time. But I'm really interested in the less than one year group. It's really interesting.

Well, I mean, it's good to have so many people who are beginning to use it and take it up. We've got a couple of questions that have come into the chat and probably a few more that are going to come in. I'm going to pick one that I really like, because I love seeing how many people work in open source or on these big projects. And I always wonder, how did they get there? So how did you get into maintaining the Yarn project?

So, at the time it was because I joined Facebook. I joined it for no particular reason related to Yarn itself. And then I happened to be in the same country as the team that was in charge of Yarn at the time. So, I joined the team. When I say team, it was a very small number of people who were working on Yarn itself. In fact, when I joined the team, there was a single person, who quickly moved to other projects in another country. So, I suddenly had to work on Yarn by myself. And I liked it. I really like working on a project that the whole ecosystem can use. It was very difficult, I will say, because it meant that I had to do a lot of work for open source and correctly prioritize that work against the needs of the company I was working for as well. And I think I learned so much that when I left Facebook, which was a year and a half or two years ago, I decided to keep working on it. And that's about when we started releasing the new branches for Yarn 2 and above that we are still maintaining to this day, with a core team made almost entirely of new people who joined starting from Yarn 2, except for me, who went from Yarn 1 to Yarn 2.

That's amazing. And one thing I find so interesting about your talk on Yarn is that there's so much I learned about what Yarn could do that I didn't know, and it became so much more. I like what you said, that it's not just a package manager, it's a project manager. But if you had to give just one answer, what would you say is the biggest thing? How different is Yarn now from when it was first released?

I think Yarn is much more stable and sounder in terms of the technical aspect, because we put a very strong emphasis on doing things only when we know for sure that they are correct.

9. Yarn's Sound Development and Collaborative Growth

Short description:

Yarn helps you build your application in a sound way, similar to TypeScript and Flow. The organizational point of view has improved with a core team of lead contributors. Finding people who share the same passion keeps the flame burning.

So, for example, with the default installation strategy, Yarn throws an error when you try to access packages that are not listed in your package.json, so that you have no risk of accidentally depending on packages that might be missing on a given machine. So, we really try to help you build your application in a sound way, a bit like TypeScript and Flow actually do.
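The strict behavior described here can be modeled with a small lookup function. This is a hypothetical sketch, not Yarn's resolution code: the `PackageInfo` shape, the `strictRequire` helper, and the package names are made up, and the point is only that a package which happens to be installed is still refused if the issuer did not declare it.

```typescript
// Hypothetical model: package names and the lookup table are made up.
interface PackageInfo {
  name: string;
  dependencies: string[]; // names declared in the package's manifest
}

// Resolve `request` on behalf of `issuer`, but only if the issuer declared it.
function strictRequire(
  packages: Map<string, PackageInfo>,
  issuer: string,
  request: string,
): PackageInfo {
  const from = packages.get(issuer);
  if (!from) throw new Error(`Unknown package ${issuer}`);
  if (from.dependencies.indexOf(request) === -1) {
    // Fail early, even though the requested package may well be installed.
    throw new Error(`${issuer} tried to access ${request}, but it isn't declared in its dependencies`);
  }
  const target = packages.get(request);
  if (!target) throw new Error(`${request} is declared by ${issuer} but not installed`);
  return target;
}

const installed = new Map<string, PackageInfo>([
  ["app", { name: "app", dependencies: ["ui"] }],
  ["ui", { name: "ui", dependencies: ["utils"] }],
  ["utils", { name: "utils", dependencies: [] }],
]);

console.log(strictRequire(installed, "app", "ui").name); // prints "ui"
try {
  // "utils" is installed (through "ui"), but "app" never declared it.
  strictRequire(installed, "app", "utils");
} catch (error) {
  console.log((error as Error).message);
}
```

The second call fails immediately on the developer's machine, which is the "break early" behavior that avoids surprises in production.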

And from an organizational point of view, it's very different from what it was before, because before it was mostly one single person, sometimes two, but not for long, and it was extremely exhausting. Nowadays, the core team is composed of, I would say, four main lead contributors and a few others who gravitate in the same circle. So, it feels much less lonely to work on the project, and that's a great improvement. In open source, I think that the main way we can keep the flame burning is by finding people who share the same passion as we do, and I think that we have started to find that.

Yeah, I love the fact that you spoke about how having different people in the team now to help you must make that a much nicer experience.

10. Reasons to Use Yarn over npm

Short description:

The biggest reason to use Yarn versus npm is Yarn's emphasis on correctness. While npm focuses on things working and doing something sensible, Yarn takes a more radical approach, ensuring that things work as intended. Both approaches have their benefits, but the emphasis on correctness makes Yarn a better fit for me and our users.

There's a question here, and I'm pretty sure it's a question you've been asked over and over and over again, so I apologize in advance for asking it, but what is the biggest reason to use Yarn versus npm?

So from my perspective there are a lot of reasons, but I would say that if something works for you, then it's fine to keep using it, whether it's Yarn or npm. The reason I choose to work on Yarn is its emphasis on correctness. npm has a slightly different philosophy, which is more that things should work and do something sensible. On Yarn we do something a bit more radical, in that things should work if they are meant to work. The two approaches have their own benefits, but I think that the one we have chosen fits me better, and it fits our users better as well.
