Yarn in Depth: Why & How

Since 2017, Yarn has proved itself a pillar of JavaScript development, incubating numerous features our ecosystem now heavily relies on. As years passed and competitors improved, so did Yarn, and it's now time to dive into the features and tradeoffs that make Yarn a truly unique gem of the JavaScript ecosystem.


Transcript


About Maël


Hi everyone. For those who don't know me, my name is Maël. I currently work at Datadog, a company focused around cloud monitoring. I've been leading Yarn's development for a few years now.

Today, we're going to talk extensively about what Yarn is and what it can bring you. I hope that by the end of this talk you'll have a better idea of what makes this project unique in our ecosystem.


What's Yarn?


[00:38] So if you ask anyone, they will likely tell you that Yarn is a package manager for JavaScript, and they would be right. But that's actually only part of the story. Yarn is not just a package manager, it intends to be a project manager. And indeed, if you think about it, Yarn lets you manage scripts. It lets you split your application into stand-alone modules. As we'll see later, it will also manage your release cycles, monitor script usage, or even enforce monorepo status.

All these tasks go far beyond the typical package manager, and every release we make pushes the boundaries even further. So, it's a project manager, which sounds nice, but how does it translate in practice? What are the things that we look for when we merge PRs, the things that make the project what it is? Let's discuss core values.


Core values


[01:25] So, the first thing to realize is that we are a community of contributors. Open source is a very taxing environment and most projects are struggling to find ways to make their work sustainable. Yarn isn't exempt from that. To help a bit, we rely a lot on our contributors to be the changes they want to see, to contribute back to the project they like. In practice, it means that our core team spends as much time working on our infrastructure as on the product itself. Recently, we moved from webpack to esbuild in order to make building Yarn easier. Multiple commands let you build parts of the Yarn binaries from sources, so you can easily try out independent features or even write them yourself and start using them in your project without waiting for them to be merged. Yarn is all about making it possible for you to experiment well past what we could offer ourselves.

[02:13] The second very important value we always keep in mind when developing Yarn is soundness. Yarn must tell you if something is wrong in your application. It must not let it go unnoticed. It must not make uncontrolled assumptions. This may sound a bit rigid, and it is the reason why you see so many reports about package X depending on Y without listing it in its dependencies, but it's really critical. Whether you author applications or libraries, you need to have confidence that something that works now will also work in production, or when installed by your consumers. If something needs to break, then it needs to break early so that you don't get hit later on when you're not prepared for it.

[03:00] Good practices. The JavaScript landscape is large, changes fast, and has many opinionated people. As a package manager, we are uniquely positioned to help our users understand the tools they are using and guide them along the path. Not only should using Yarn solve a practical need, it should also help you learn and become a better engineer along the way. We reach the last one of this set, developer experience: doing the right thing by default.

Let's take an example. In Yarn 1.0, there was a command named yarn check. It validated that the project was correctly installed. As a result, various companies had a policy of calling yarn check after every install just to be sure that everything was okay. We removed this command from Yarn 2.0, and can you guess why? The thing is, Yarn should do, and already does, the right thing by default. You shouldn't have to validate that your project is correctly installed because it should be correctly installed from the get-go.


Workflows


[03:54] We've just seen a bit of what Yarn claims to be. Let's talk a bit about DevOps. What can Yarn do for you, practically speaking? We're going to go over two interesting stories of projects that adopted it. One is a project you may have heard of, the other is an internal application.

Without surprise, the first one is Yarn itself. Before we dive in, let me tell you a funny story. Back in Yarn 1.0, we didn't actually use workspaces to develop Yarn. This was a real problem because not only were bugs hidden from us, we also weren't directly confronted with the value of some important features. For example, when you use workspaces, it's very apparent that you need to be able to run a script in all your workspaces at once. However, since we didn't use those workspaces ourselves, it didn't seem like a huge deal to us at the time and thus the feature took a long time to come.

[04:42] Nowadays, we have an informal rule that the Yarn team needs to use all features shipped into the core, and I believe it has had a strong, positive impact on the work we've done so far. This is one reason why the Yarn repository is so important for us, because it's where we experiment with a lot of features, not all of them being used by our users themselves.

Anyway, let's talk about workflows. The first one is release cycles. Our previous process was very simple. We had a file at the root of the repository and each PR was expected to add one line to it. It worked fine, but after switching to a monorepo it wouldn't have scaled very well, since we needed the ability to release each workspace by itself. So we developed the release workflow instead.

[05:24] The idea is that each PR we merge has to include a little file, created by Yarn itself, that lists all the workspaces that have been changed in the PR, whether they will need to be part of the next release, and whether it will be a patch, a minor, or a major version. Our CI will validate that this file exists and check its contents, and at release time, we simply need to tell Yarn to aggregate all the version files checked into the repository into package bumps and changelog generation.
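To make the aggregation step more concrete, here is a minimal sketch of how per-PR version files could be folded into one bump per workspace. The `VersionFile` shape, the `decline` value, and both function names are assumptions made for illustration; this is not Yarn's actual file format or implementation.

```typescript
// Hypothetical model of the per-PR version files; not Yarn's real format.
type Bump = "decline" | "patch" | "minor" | "major";

// One record per merged PR: workspace name -> requested bump.
type VersionFile = Record<string, Bump>;

// Bumps ordered from weakest to strongest.
const BUMP_ORDER: Bump[] = ["decline", "patch", "minor", "major"];

// Keep the strongest bump requested for each workspace across all files.
function aggregateBumps(files: VersionFile[]): VersionFile {
  const result: VersionFile = {};
  for (const file of files) {
    for (const workspace of Object.keys(file)) {
      const requested = file[workspace];
      const current = result[workspace];
      if (
        current === undefined ||
        BUMP_ORDER.indexOf(requested) > BUMP_ORDER.indexOf(current)
      ) {
        result[workspace] = requested;
      }
    }
  }
  return result;
}

// Apply a bump to a plain "major.minor.patch" version string.
function applyBump(version: string, bump: Bump): string {
  const [major, minor, patch] = version.split(".").map(Number);
  switch (bump) {
    case "major":
      return `${major + 1}.0.0`;
    case "minor":
      return `${major}.${minor + 1}.0`;
    case "patch":
      return `${major}.${minor}.${patch + 1}`;
    default:
      return version; // "decline": the workspace is not released
  }
}
```

With this model, two PRs asking for a patch and a minor bump of the same workspace would result in a single minor release, which matches the idea of deciding the bump at PR time and only aggregating at release time.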

[05:56] Another workflow that we are using is Zero-Installs. The idea isn't that complicated. It's just that we decided to keep the cache of the project inside the repository itself. So for each package that we have, we keep exactly one zip file in the repository. As a result, anyone who clones the project is able to instantly start working on Yarn. They don't even have to run yarn install. More importantly, it means that our contributors are able to switch from one branch to another at almost no cost in terms of context switching. That's extremely valuable because it ties into the experimentation topic that I mentioned earlier. We can quickly iterate on multiple different feature branches without having to pay the cognitive cost of running an install each time we run a checkout.

[06:44] I still feel the need to mention that adding the packages of the cache into the repository is a trade-off that we made for our project, but it's not something that you absolutely have to do yourself if you're not convinced by the idea. We know that a few other projects have decided that they wanted to run yarn install as they always used to, and that's perfectly fine. It's just that nowadays you have the choice.

[07:10] One problem when you have a lot of workspaces, and we have a lot, I think we have something like 44 at the moment, is that it becomes difficult to make sure that all the workspaces are aligned in their configuration. For instance, how do you ensure that no two workspaces will depend on different versions of the same dependency? It can become a daunting task, and existing tools didn't help much with that.

Now, thanks to the Constraints feature in Yarn core, we are able to enforce structural patterns across all workspaces in very few lines. Better yet, Yarn can even apply the fixes itself. So for example, in the Yarn repository itself, we are using that feature not only to enforce that all workspaces depend on the same versions of their dependencies, but also to ensure that, for example, the license is properly set on all of our packages, that the build scripts are consistent, this kind of thing.
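The kind of rule described here can be sketched as a small check that reports every dependency requested with more than one range across workspaces. This is only an illustration of the idea in plain TypeScript: the `Workspace` and `Violation` types and the function name are hypothetical, and Yarn's actual constraints engine has its own syntax, which this sketch does not attempt to reproduce.

```typescript
// Hypothetical, simplified view of a workspace manifest.
interface Workspace {
  name: string;
  dependencies: Record<string, string>;
}

// One entry per dependency that is requested with conflicting ranges.
interface Violation {
  dependency: string;
  ranges: string[]; // distinct ranges found, sorted alphabetically
}

function findInconsistentDependencies(workspaces: Workspace[]): Violation[] {
  // Prototype-less object so names like "constructor" are safe keys.
  const rangesByDependency: Record<string, string[]> = Object.create(null);

  for (const workspace of workspaces) {
    for (const dependency of Object.keys(workspace.dependencies)) {
      const range = workspace.dependencies[dependency];
      const seen =
        rangesByDependency[dependency] || (rangesByDependency[dependency] = []);
      if (seen.indexOf(range) === -1) seen.push(range);
    }
  }

  return Object.keys(rangesByDependency)
    .filter((dependency) => rangesByDependency[dependency].length > 1)
    .map((dependency) => ({
      dependency,
      ranges: rangesByDependency[dependency].slice().sort(),
    }));
}
```

A CI job could fail whenever the returned list is non-empty; applying fixes automatically, as the talk mentions Yarn can do, would additionally require choosing which range wins.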

[08:01] So these three examples give you a good idea of what value projects can derive from Yarn. In general, they no longer need tools like Lerna or Changesets, as those workflows receive first-party support.

But internal projects can benefit from Yarn as well as we are going to see. I work at Datadog and Datadog has a huge web application, along with a sizeable JavaScript infrastructure, supporting everything from linting to deployments. We'll go over three examples.

[08:32] So if you don't know what Datadog is, we are making some really good cloud monitoring, extracting signals out of arbitrary metrics. Given this DNA, it won't surprise you to know that we also monitor the way our developers interact with our own infrastructure, in particular in terms of response time. So to actually have this, we wrote a plugin for Yarn that watches all scripts that get executed, how much time they take, and sends it all to a dashboard.
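The measuring part of such a plugin can be sketched as a wrapper around script execution. This is a generic illustration only: it does not use Yarn's plugin API, and `ScriptTiming`, `withTiming`, and the `report` callback are hypothetical names standing in for whatever the real plugin does.

```typescript
// Hypothetical shape of one measurement; field names are illustrative only.
interface ScriptTiming {
  script: string;
  durationMs: number;
}

// Runs `run`, then reports how long it took, even if it threw.
// `now` is injectable so the clock can be faked in tests.
function withTiming<T>(
  script: string,
  run: () => T,
  report: (timing: ScriptTiming) => void,
  now: () => number = Date.now
): T {
  const start = now();
  try {
    return run();
  } finally {
    report({ script, durationMs: now() - start });
  }
}
```

In a real setup, `report` would forward the measurement to a metrics backend; for local experiments it can simply push each timing into an array.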

All this data then helps us identify potential bottlenecks before they even become a problem, and generally helps us prioritize our work. As we saw with the Yarn repository itself, checking in dependencies has a powerful effect on one's ability to efficiently contribute to a project, but even corporate environments can benefit from this pattern, for instance when the registry goes down.

[09:20] From our data, the registry does that approximately once a month. By keeping our own copy of the packages, we are never blocked during our deployments. Additionally, despite the high volume, our CI doesn't waste much time running installs. All in all, instead of having to spend minutes setting up the project in every single CI run, it's now just really a matter of seconds.

One last item I want to mention: we have a large project and a lot of dependencies. Some of them have bugs, and we try to address them as well as we can. But until the fixes are actually merged in the upstream project and released, which is sometimes a bit difficult, we need some simple way to use our changes locally while keeping some auditability. Some third-party tools exist just for that, but Yarn supports it out of the box, thanks to the patch protocol, in a way that's directly integrated with the cache system and all its checksums.

[10:14] It's really nice to see this kind of feature, which came from the community, for example through the patch-package project published on npm, moved into a first-party feature directly integrated with the package manager.

So I think those three examples give you a good overview of what Yarn has to offer in the context of a company. But keep in mind that those are only examples. For example, we didn't talk about how Yarn can keep your packages compressed on disk and share them between projects if that's what you prefer. Or how you can install packages straight from Git monorepos. Or how Yarn scripts are made portable across operating systems, Windows included, without you having to do anything. That in itself is only a quick list from a single slide. There are a lot more hidden gems in Yarn.


Under the hood


[11:03] Okay, so far we've discussed what Yarn is and what it can bring you and your organization. But now I want to take the opportunity to go further than usual and tell you why it works so well. Let's make a quick dive into Yarn's development process itself.

A large part of our secret sauce, if you will, is our infrastructure. It may come as a surprise that the most valuable asset we have isn't Yarn itself, but rather the layers built around it, but that's really the case. You probably noticed that Yarn's iteration speed was far beyond the industry standard. The big reason for that is that we have spent a lot of resources solidifying our foundations.

One part of this is the end-to-end tests. Every four hours a set of comprehensive tests is run, but rather than test Yarn's behavior, we actually test popular third-party projects from the ecosystem. Next.js, Gatsby, Angular JS, those are only some of the big names that we monitor. If any of them accidentally ships something incompatible with Yarn, or if Yarn accidentally ships a regression, we are aware of it less than four hours later and can immediately work with the maintainers in order to find a solution. Even if that's an uncommon scenario, having this mechanism in place means that we can afford to be slightly less conservative than usual. We don't have to assume that our work is compatible with the ecosystem. We have proof that it's really the case.

[12:29] And similarly, we made a huge effort to build our codebase using the latest available tools, including TypeScript itself. I can't stress enough the benefit that it gave us. Not only did it help us avoid silly mistakes on our own PRs, but it also raises the confidence we can have that a PR won't have unforeseen effects, which means that it's easier for us to merge pull requests from first-time contributors and trust that things will just work.

Finally, our last major release also came with an overhaul of our testing strategy. At the time we had a bunch of unit tests where we instantiated a lot of classes from the core and called their methods to check their behaviors. And while it worked well enough for a time, that approach proved very difficult to maintain, because it meant that refactorings were practically impossible: they would have required us to rewrite all the tests, or to keep an outdated interface just for the tests. So we decided to try something a bit different, starting from v2.

[13:34] We still rewrote all the tests, unfortunately, and had to do this incrementally, but this time we updated them to directly use the CLI binary, exactly like real users would. And because the CLI interface is not really meant to change, it allowed us to remove a lot of obstacles, while making it much easier for our external contributors to understand how to write tests, because literally all they have to do is write CLI calls.

I think the main lesson here is simple: to write code efficiently at large scale, for a major repository or project, you must be able to trust your infrastructure. By accepting to pay the cost upfront, like we did, you will be able to both gain speed and decrease the risk of burnout for your team, because they will have less work and less pressure to assume that things are working.


Modular architecture


[14:32] So now let's focus on Yarn itself, and more precisely on one particular part of it: its architecture.

As I mentioned earlier, Yarn used to be a monolith. Workspaces didn't exist at the beginning, so we didn't use them. And while we worked on Yarn 2.0, it became clear that some parts of the application would benefit from being independent from the rest, even if only to prevent them from accidentally relying on unrelated components they shouldn't even be aware of. One of the first things I worked on was how to make Yarn modular. As of today, we have a core containing all the critical algorithms. For example, the resolver, which will take all the packages, query the API registry, and get all the versions needed. Or the fetcher, which downloads the packages. The core also exposes a bunch of interfaces that modules can implement however and wherever they want.

[15:30] Most of the features you see in Yarn are implemented this way. Communication with the API registry is a module. The package tarball generation is a module. Plug-and-play, which you may know as the new install strategy in Yarn, is itself a module. It's only at release time that those modules are bundled together to yield the Yarn CLI that you know.
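The split between a core and its modules can be sketched as follows. The `Resolver` and `Plugin` interfaces, the `Core` class, and the two sample plugins are assumptions made for illustration; they are much simpler than, and not taken from, Yarn's actual plugin API.

```typescript
// Hypothetical interfaces; names are illustrative, not Yarn's real API.
interface Resolver {
  // Whether this resolver knows how to handle the given range.
  supports(range: string): boolean;
  // Turns a name and a range into a concrete reference.
  resolve(name: string, range: string): string;
}

interface Plugin {
  name: string;
  resolvers: Resolver[];
}

// The core only knows the interfaces; behavior comes from the plugins.
class Core {
  private resolvers: Resolver[] = [];

  constructor(plugins: Plugin[]) {
    for (const plugin of plugins) {
      this.resolvers = this.resolvers.concat(plugin.resolvers);
    }
  }

  resolve(name: string, range: string): string {
    for (const resolver of this.resolvers) {
      if (resolver.supports(range)) return resolver.resolve(name, range);
    }
    throw new Error(`No plugin can resolve ${name}@${range}`);
  }
}

// Two toy plugins, bundled together when the core is constructed.
const workspacePlugin: Plugin = {
  name: "workspace",
  resolvers: [{
    supports: (range) => range.indexOf("workspace:") === 0,
    resolve: (name) => `${name}@workspace`,
  }],
};

const registryPlugin: Plugin = {
  name: "registry",
  resolvers: [{
    supports: (range) => range.indexOf(":") === -1,
    resolve: (name, range) => `${name}@registry:${range}`,
  }],
};

const core = new Core([workspacePlugin, registryPlugin]);
```

Adding support for a new kind of range then only means writing another plugin and passing it to the constructor, without touching `Core`.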

Interestingly, this architecture also makes it very easy to extend Yarn with new functionalities. For instance, the focused install, as we implemented it, is literally a hundred lines inside its own module. You could implement it yourself, and in fact a few people actually did. Something that is interesting is that it's sometimes very difficult to build workflows that satisfy everyone. So for example, we have two community members who have built their own plugins that do exactly what they want from their focused installs. They remove dependencies if they don't want them, they generate a zip archive that they can publish to AWS in one pass. It's really much easier in some cases to implement your own workflows than to expect a tool to implement exactly what you need.

[16:44] So if a behavior doesn't match your expectations, you can experiment with a new one. Let us know how it goes. As I mentioned, in the past few months at least two community members have started to solve their monorepo deployments by authoring plugins dedicated to that use case.


Yarn's future


[17:02] So where do we see Yarn in the future? No one can predict what's going to happen, but I can already tell you what's currently happening. We're working on the next major Yarn version which will be Yarn 3.0. It will feature its share of improvements, clean up some behavior of our CLI and add new features, including some that I'm sure we make those.

At the same time, we're starting to look into making Yarn sustainable by looking into sponsor programs. I mentioned earlier that projects like this take a lot of effort, resources, and time, and that we needed to find ways to support this work. And this is where you can do something. If you or your team are interested in getting your company to sponsor some of the time we spend on Yarn, please contact us. We'll be happy to discuss terms that will be favorable to both parties.

All this to say, Yarn is very much alive and in better shape than ever before. Our team is alive. Our objectives are clear and our tech is sound. Certainly we have taken a more opinionated course than we used to, and we think that for most people it will be a net positive, knowing that the tool will protect you from your mistakes. Of course, different people may prefer different kinds of tools, and that's fine, because no one project is for everyone.

Overall, I think it's clear that Yarn is here to stay. It's not just a phase, and we truly believe that we offer something different enough to provide value to the ecosystem. We've been in touch with many projects during the past few years, tweaking our repositories, expanding concepts and pushing for good practices, and I think it really made an impact.

Things like Next.js, Gatsby, Traintrack, webpack, and many more. When we started these tests that I mentioned earlier, those end-to-end tests that we run on all projects every four hours, they frequently failed due to missing dependencies. Each time it happened, we went to the relevant project and shared a PR with them in order to add those missing dependencies. Over time, they started to pay more and more attention to it, and nowadays our test suite very rarely reports problems.

[19:05] This is also the work that Yarn does. We don't only make a CLI tool, we also help the ecosystem move in the right direction, not only for Yarn users, but for the users of every package manager. Of course the problem is that, as always, this kind of work isn't very visible from the outside, so sometimes it may be difficult for outsiders to have any idea of the real amount of work that we pour into the project. But it's something that really is important for us and that we are going to keep doing in the future.

So we've reached the end of this presentation, I hope you enjoyed it and now I'll be happy to answer questions you might have on the subject. Thank you.


Questions


[19:44] Nathaniel Okenwa: Thank you so much for that amazing talk, Maël. I really, really enjoyed it. I loved hearing about Yarn, about where it's been, where it came from, and where it's going in the future. Remember, if you have any questions, throw them into the chat so that we can grab them and ask Maël. I'm going to invite Maël on, but the first thing we're going to talk about are the results of the poll. So the poll question was, "For how long have you been using Yarn?" And the top answer was one to three years, followed by less than a year. So quite a lot of new adopters, and there's a few people who've been using it for a while. Does that surprise you at all, Maël?

[20:24] Maël Nison: No, actually that's very interesting, because we hear every now and then that most people using Yarn have been using it for a while, but it's interesting to see that almost half of the people who answered, answered less than one year, and the rest is in one to three years. Yarn isn't that old yet, so three years, 1, 2, 3, can be a lot of time, but I'm really interested in the less than one year answers. It's really interesting.

[20:55] Nathaniel Okenwa: Well, I mean, it's good to have so many people who are beginning to use it and take it up. We've got a couple of questions that have come into the chat, and there will probably be a few more coming in. I'm going to pick one that I really like, because I love seeing how many people work in different open source or big projects, and I always wonder, how did they get there? So, how did you get into maintaining the Yarn project?

[21:18] Maël Nison: So at the time it was because I joined Facebook. I joined it for no particular reason related to Yarn itself, and then I happened to be in the same country as the team that was in charge of Yarn at the time, so I joined the team. When I say team, it was a very small number of people working on Yarn itself. In fact, when I joined the team there was a single person, who quickly moved on to other projects across countries. I suddenly had to work on Yarn myself, and I liked it. I really liked working on a project that the whole ecosystem could use.

It was very difficult, I found, because it meant that I had to do a lot of work for open source and correctly prioritize it against the work for the company I was working for as well. And I think I learned so much. When I left Facebook, one and a half to two years ago, I decided to keep working on it, and that's about when we started releasing the new branches for Yarn 2.0 and above, which we are still maintaining to this day, with a core team made almost entirely of new people who joined starting from Yarn 2.0, except for me, who went from Yarn 1.0 to Yarn 2.0.

[22:42] Nathaniel Okenwa: That's amazing. And one thing I find so interesting about your talk on Yarn is that there's so much I've learnt about what Yarn could do that I didn't know, and it became so much more. I like what you said, that it's not just a package manager, it's a project manager. But what would you say is the biggest thing, if you had to answer this in one answer: how different is Yarn now from when it was first released?

[23:09] Maël Nison: I think Yarn is much more stable and sounder in terms of the technical aspect, because we put a very strong emphasis on doing things only when we know for sure that they are correct. So, for example, with the default install strategy in Yarn, we throw an error when you try to access packages that are not listed in your dependencies, so that you run no risk of accidentally depending on packages that would be missing on your consumers' machines. So we really try to help you build your application in a sound way, a bit like TypeScript or Flow do.
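The strictness described in this answer can be illustrated with a toy lookup that refuses any package the requesting manifest does not declare. The `Manifest` type and the `resolveRequest` function are hypothetical and only model the rule; they say nothing about how Yarn implements it internally.

```typescript
// Hypothetical, simplified manifest of the package issuing the request.
interface Manifest {
  name: string;
  dependencies: Record<string, string>;
}

// Returns "name@range" for declared dependencies and throws otherwise,
// instead of silently picking up whatever happens to be installed.
function resolveRequest(issuer: Manifest, request: string): string {
  const declared = Object.prototype.hasOwnProperty.call(
    issuer.dependencies,
    request
  );
  if (!declared) {
    throw new Error(
      `${issuer.name} tried to access ${request}, but it isn't declared in its dependencies`
    );
  }
  return `${request}@${issuer.dependencies[request]}`;
}
```

Failing at the first undeclared access is what makes the problem visible on the author's machine rather than on a consumer's.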

From an organizational point of view, it's very different from what it was before, because before it was mostly one single person, sometimes two, but not for long, and it was extremely exhausting. Nowadays the core team is actually composed of, I would say, four main release contributors and a few other ones who are gravitating around the central circle. So it feels much less lonely to work on the project, and that's a great improvement. In open source, I think that the main way that we can keep the flame burning is by finding all the people that share the same passion as we do. I think that we have started to find that here.

[24:36] Nathaniel Okenwa: Yeah, I love the fact that you spoke about how having the different people in the team now to help you must just make that so much nicer an experience. There's a question here, and I'm pretty sure it's a question you've been asked over and over and over again. So I'm sorry, I apologize in advance that I'm going to ask this, but what is the biggest reason to use Yarn versus npm?

[25:00] Maël Nison: So from my perspective there are a lot of reasons. I would say that if something works for you, then it's fine to keep using it, whether it's Yarn or npm. The reason I chose to work on Yarn is its emphasis on correctness. npm has a slightly different philosophy, which is a bit more that things should work and do something sensible. On Yarn, we do something that is a bit more radical, in that things should work if they are meant to work. The two approaches have their own benefits, but I think that the one that we have chosen fits me better, and it fits our users better as well.

[25:46] Nathaniel Okenwa: Yeah, I love that answer. I think a lot of times we think of it as A versus B when really it's like, "Do you prefer a knife or a spoon?" Like, which one?

[25:55] Maël Nison: Exactly. The two tools have very different purposes, and it depends on what you prefer, on how you prefer things to be developed, because Yarn and npm are completely different organizations. They go through a very strict process. We have a more fluid, flexible way of merging feature requests. Those are two very different orgs.

[26:21] Nathaniel Okenwa: Cool, this is a question, it's a bit related to npm, and I just find this interesting from a development perspective, and someone has asked, do you share a feature roadmap with the npm dev team, or do you just come up with your features independently, and they sometimes coincidentally happen to be similar, or are very different?

[26:42] Maël Nison: We don't share a roadmap with... I should rephrase that. Sometimes we come up with ideas about our package manager, and then there are two different options. Either the feature is purely for Yarn, in the client itself; for example, that would be workspaces, which only affect your own project. Then we don't need to mention it to the npm team, because it will not affect them directly. However, we can sometimes come up with proposals that would affect the ecosystem. So for example, that would be a proposal that I made for a way to have different variants of the same package for different architectures, so there are prebuilds on all architectures at the same time.

Something like that will affect the other package managers. So when that happens, we typically ping the maintainers of the other package managers, so npm, but also pnpm, in order to get their feedback. Historically, it's been a bit difficult to always be aligned on the features that we each want to implement, so that's why we implement more fun features that are in the client itself rather than ecosystem-wide. But it happens; one example of such a collaboration is the optional peer dependencies that we proposed two or three years ago, which we discussed with pnpm and npm and which eventually made their way into all package managers.

[28:24] Nathaniel Okenwa: So all of that makes so much sense. And also, I think it's really important that there are times and spaces where that collaboration is what we need, just for everyone to move forward and to have better products. I think being able to collaborate matters, and to be honest, as JavaScript developers we see this all the time on the web, where proposals are made, everyone is kind of made aware of them, and everyone is able to sort of implement these new best practices or these new good things into the way they're building things. So, I really, really like the answer. I know you kind of answered this question in your talk; the question was, what features are coming in the future for Yarn? I would also like to add a little extra spice to that question. What are the features coming in the future for Yarn that you are most excited about?

[29:12] Maël Nison: The one that I'm most excited about, but that will take a long time to really be there, is being able to install packages from different languages. Right now, Yarn is able to install JavaScript packages, but we are looking a bit beyond that. We want Yarn to be a package manager that you can use in order to prepare a project, regardless of the language that you're using. It may happen to be JavaScript, it may happen to be Python, it may happen to be something else. So one large part of the architecture that we decided on for Yarn 2.0 was meant to make it possible, later, to start R&D on installing packages from other languages. Of course, it's not easy, neither technically nor politically, because those languages have their own package managers, so we don't want to ruffle any feathers, or this kind of thing. It will be a slow process, but we really want to find a way to make Yarn universal in terms of package management.

[30:22] Nathaniel Okenwa: Man, that blows my mind, just being able to use packages from different languages. Now I'm already beginning to think, what would that look like? That must be so ambitious. And actually, a good question that just popped into my mind is, how do you come up with the ideas for features? So what is usually the biggest inspiration for you? Is it things that happen to yourselves as devs, or is it things that you're hearing about from the community? Where do you get some of the ideas that turn into features in Yarn?

[30:50] Maël Nison: It's a bit of both. For example, when we started working on Yarn 2.0, we started by making our repository a monorepo. So I mentioned in the talk that before that it wasn't a monorepo. We made it a monorepo and started using it, and we quickly noticed that some things were a bit lacking because we needed them ourselves. And if we need them, then it's likely that a lot of people actually need them. So in this case, we are able to really spot a missing piece and implement it, because we are going to benefit from it ourselves, and then it will spread to the community.

In some other cases, it comes from the community, who submit their own use cases. One thing that we did in version 2.0 is to make Yarn pluggable in order to let them implement this kind of feature. One problem that we had in Yarn 1.0 was really that we were forced to implement everything that was proposed to us, because otherwise we were blocking the work of various people. We didn't want that to happen with Yarn 2.0, so that's why it has this plugin system, which allows the community to do their own R&D, to come up with their own concepts. If they actually are useful to a large number of people, then we can bring them into the core fairly easily. So I think it's a bit of both.

[32:20] Nathaniel Okenwa: Nice, I love that. I know we're running out of time, but I really, really want to find out: if there was a developer who's been thinking about starting with Yarn, what would you say to them? Or maybe, is there anything you really want all developers to start doing when they're working with packages, something that would just make the dev world better? What's one good practice you want people to take away?

[32:43] Maël Nison: Properly list your dependencies, don't just assume that something works because it's meant to work. Sometimes things work by chance, and it's important to also be open to the fact that you really have to respect the standards in order to have a properly working program.

[33:05] Nathaniel Okenwa: That makes so much sense. Thank you so much for spending time with us today. I really appreciate you. Thank you and goodbye.

[33:11] Maël Nison: Thanks.

Maël Nison
34 min
