1. Introduction to Yarn
Hello, everyone! I'm Mael, and I've been leading the development for Yarn. Today, I'll talk about Yarn's core values, our goals for each release, and the future of Yarn. Yarn is a package manager similar to NPM, emphasizing consistency, stability, and good performance. We released the first version of Yarn six years ago, and now we are working on the 4.0 release.
Hello, everyone, so my name is Mael. I work at Datadog. And I've been leading the development of Yarn for the past few years.
So today I'm going to talk to you a bit about Yarn, what are its core values, what we are aiming for, for each version that we release and show you a glimpse of the future.
Before we start, what is Yarn? So Yarn is a package manager that you may know, similar to npm, that allows you to install packages on your system and to resolve dependencies. And it favors consistency and stability while still attempting to provide good performance and high modularity to your projects.
It's been a long adventure, the first version of Yarn got released almost six years ago, with 0.15, one year later we released the first stable release with 1.0, and two years and a half later we decided that it was time to make a change and to decide for sure what we wanted to do in the future of Yarn, and with that came the 2.0.
At the time, there was a lot of discussion about some of the core aspects, which we kept working on in the subsequent 3.0 release and which we are going to keep refining in the 4.0.
2. Yarn's Priorities and Unique Features
Why another package manager? Yarn brings unique properties and priorities to the table. Stability is a core tenet, ensuring consistent and predictable experiences. Maintainability and future-proofing are key considerations. Yarn is designed to be modular, allowing for custom logic and specific use cases. Security is also a focus to prevent future attacks. Performance is deliberately left out: in 2022, package managers perform about the same.
Why another package manager? We already have npm, we also have pnpm, what does Yarn bring to the table? The thing to remember, and that's true for package managers, but also true for, say, bundlers, is that features and performance aside, each project in the open source ecosystem has different properties in terms of priorities, roadmap, governance model, maintainability, infrastructure. All those things are things that you should keep in mind each time you try to evaluate a project. Because for instance, npm is owned by GitHub, whereas Yarn is completely open source. In both cases, there are pros and cons, and that's the kind of thing that you don't see at the very first glance, but that makes sense when you're trying to invest in a tool in the long term.
So I talked about priorities. What are Yarn's priorities? We have four of them at the moment. The last one got added quite recently and we are going to talk about it in the future slides, but first, stability is the main core tenet of Yarn. We want all your installs, all your experience of using Yarn to be deterministic and predictable. If something works for you, then it should work for your colleagues. If something crashes for you, then it should also crash for your colleagues. And this last part is quite important because making sure that a program fails consistently allows you to make sure that it will also work consistently. If someone has a problem, you will be able to reproduce the issue and to help them get past it.
Maintainability. We are trying to set up the project not only so that it succeeds now, but also so that it succeeds in the future. The way we see Yarn, Yarn will still be there in ten years. How can we make sure that Yarn will still be in good shape in ten years? That's not so easy because it means we have to make choices in terms of governance, in terms of architecture of our own repository. How can we keep the code base healthy? So that's one of our priorities.
Modularity is another one. Back in Yarn 1, we noticed that a lot of you had very specific use cases. It was very difficult for us to implement all the features that you needed, sometimes features that only one company needed. So instead what we decided to do with the modern releases of Yarn is to make our core modular. Meaning that you can write plugins, you can write commands that hook into the core Yarn API that we provide and document. And you can write your own logic in very few simple lines of code. Almost all of the Yarn commands are implemented through this system. For instance, the install itself takes something like 50 lines to implement.
And finally, security. That's something that we are starting to introduce, because even though Yarn was safe before in that we tried to prevent packages from accessing your disk, there are other types of attacks. During the past few months you may have heard about attacks such as ua-parser-js or faker.js, these kinds of problems that are starting to rise, and we want to provide a solution so that it's not a problem in the future. You may notice that I didn't talk about performance. That's because we are in 2022.
3. Yarn's Unique Features
All package managers have similar performance for the same features. Yarn offers unique features like plug and play installation, support for node_modules, and the exec protocol. Yarn also allows installing workspaces from git and provides the patch protocol for applying changes to packages.
And truth be told, all package managers have the same performance for the same features. Sometimes one package manager will do more things than the other, for instance, with Yarn we are doing some validation. But overall, it's the same thing. We track benchmarks for pnpm, npm, and Yarn on an hourly basis in our infrastructure. And something I noticed by trying to see what the differences between package managers were is that even though pnpm has some edge on us on our automated benchmarks, that's something I couldn't reproduce on my laptop. I think it goes to show that the performance of package managers is very relative and, at the point where we are, it doesn't change things that much.
Anyway, enough about perf. I want you to learn something about Yarn. So, now we are going to discuss 14 things that you don't know about Yarn but that it actually does quite well. So, first, about installs, since that's the main thing that a package manager does. I don't know if you know that, but even modern versions support node_modules just fine. You may remember that in 2.0 we released a new install strategy called plug and play that allows you to not have a node_modules folder, to share your dependencies across all the projects on your machine, and to not be plagued by ghost dependencies. But we also support node_modules. If your goal is to migrate a project quickly, you can do that. It's very well supported.
We have a protocol called exec that allows you to create your own packages dynamically when running your own install. For instance, let's say you have a package that is on SVN. I don't know if other people are using SVN, but sometimes we do. If you wanted to fetch a package from there, then we would have to implement it inside the package manager, so inside Yarn itself. With the exec protocol, you can just define a JavaScript script that fetches these packages from any location you want. I used SVN as the example, but it could be from any other location.
Any workspace can be installed from git. The git protocol, when declaring a dependency as git, allows you to install one workspace from any Yarn or pnpm or npm project. And finally, the patch protocol is a protocol that allows you to apply changes to any package and to keep those changes inside your repository. That's a use case that is very common when you have a security issue on a project that you need to address and the fix has not been released yet. You can just use the patch protocol in order to fix whatever is problematic. Another use case is, for instance, when you're trying to migrate from CJS to ESM, and there's something that is breaking somewhere, just a line to change, but it's difficult to send a change upstream. You can use the patch protocol in order to just change the one line that is bugging you.
4. Yarn's Optional Features and Community Service
Yarn has optional features like installing through symlinks, a modular design that keeps implementations short, a version workflow for managing versions across workspaces, constraints for linting the project's package.json files, and automatic installation of TypeScript @types packages. Yarn is committed to being a good citizen in the open-source community.
So that was for installs. Now, Yarn has a lot of features and some of them are optional. You can opt into them and start using them, but you can also just completely ignore them. The first one is that Yarn can install symlinks, just like pnpm. You may know this strategy that pnpm has, where instead of generating a concrete node_modules where each file is actually a file and each folder is actually a folder, pnpm creates symlinks that point to a global store. That's something that Yarn can do. We introduced that in the 3.0 last year and a few companies have started using it and giving us their feedback.
The one thing I really like about Yarn is that we are very modular, as I mentioned. So we can implement a lot of things in a very few lines of code. So when we implemented the pnpm linker that allows you to do those symlinks, it actually took less than 100 lines to make the first iteration. That's the kind of thing that I was mentioning when I said that we want Yarn to be healthy so in ten years from now we can still keep adding features, fixing bugs without being encumbered by the past implementations.
We have a version workflow that allows you to manage versions across workspaces. So if the only reason you are using, for instance, Lerna is to manage versions, that's something that you can get for free just by using Yarn. We are using this workflow as part of the Yarn development itself, since all of our workspaces, and we have a lot of them, like 30 or something, are managed through this system. So we are improving it frequently.
Constraints allow you to lint your project's package.json files. Once you have a certain number of workspaces, like, for instance, Yarn with its 30-something workspaces, it becomes difficult to make sure that all of them satisfy some criteria that you have, for instance, that none of them depend on some dependency. Like, if you have both Lodash and Underscore, it will be a problem, so you might want to prevent one of them from being added somewhere. That's what constraints allow you to do. If you want to prevent two workspaces from depending on different versions of React, that's something that you can do too with constraints. So constraints are very powerful and allow you to define in very few lines of code, sometimes as few as two, what you want your workspaces to look like. And the nice thing is that it just works, it can just autofix all problems, so if you tell it what the state should be, it will just apply the changes to all your workspaces in one command line.
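To make the React example concrete, here is a minimal sketch of the idea behind such a rule. It is not Yarn's constraints API: the Workspace type, the helper functions and the sample data are all hypothetical, and a real constraint would operate on the actual package.json files of the project.

```typescript
// Illustrative only: this is NOT Yarn's constraints API. The Workspace type,
// both helper functions and the sample data are made up for this sketch.
type Workspace = {
  name: string;
  dependencies: Record<string, string>;
};

// Collect every distinct range declared for `dep` across the workspaces.
function rangesFor(workspaces: Workspace[], dep: string): Set<string> {
  const ranges = new Set<string>();
  for (const ws of workspaces) {
    const range = ws.dependencies[dep];
    if (range !== undefined) ranges.add(range);
  }
  return ranges;
}

// "Autofix": rewrite every workspace that declares `dep` so it uses `range`.
// Returns the names of the workspaces that had to be changed.
function enforceRange(workspaces: Workspace[], dep: string, range: string): string[] {
  const changed: string[] = [];
  for (const ws of workspaces) {
    if (dep in ws.dependencies && ws.dependencies[dep] !== range) {
      ws.dependencies[dep] = range;
      changed.push(ws.name);
    }
  }
  return changed;
}

const workspaces: Workspace[] = [
  { name: "app", dependencies: { react: "^18.0.0", lodash: "^4.17.21" } },
  { name: "admin", dependencies: { react: "^17.0.2" } },
  { name: "utils", dependencies: { lodash: "^4.17.21" } },
];

const before = rangesFor(workspaces, "react"); // two different React ranges
const changed = enforceRange(workspaces, "react", "^18.0.0");
const after = rangesFor(workspaces, "react"); // a single React range
console.log(before.size, changed, after.size);
```

The point of the sketch is the shape of the rule: describe the desired state once, detect the workspaces that deviate from it, and rewrite them in a single pass.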
Another one is that Yarn can auto-install the @types packages as needed. So it's always a little annoying for me when I'm doing some TypeScript development and I'm adding a dependency and then I see that the types don't show up in my editor. With Yarn, we can just check whether the package has types and, if it doesn't, we check whether it has DefinitelyTyped packages and then we install them automatically. This behavior can be disabled. In 3.0 it's opt-in. In 4.0, I think we are making it opt-out, but it's something that you can disable if that's not something you want.
In terms of community service, so open source is about community. We are writing this tool and making it available to you so that you can use it so that other projects can use it in order to maintain their own architectures and it means that we are trying to be good citizens and we are trying to work with the community in implementing features or making sure that projects can benefit from the changes that we make, this kind of thing. I listed a few of them.
5. Contributing to Fix Dependencies
We contribute to third parties to fix dependencies and address the problem of ghost dependencies (packages that are used without being declared), which cause subtle issues. Yarn aims to surface dependency information and collaborate with maintainers to resolve compatibility problems. This is crucial for Yarn, npm, and pnpm users.
The first one is that we actually contribute to third parties to fix dependencies. I mentioned earlier this problem of ghost dependencies, when you start relying on dependencies without declaring them in your package.json. While it may work in some cases, it usually leads to very subtle problems where the versions are not matching, which means that as you add or remove packages, you may suddenly end up in a state where your application doesn't work even if you didn't touch any related dependency in your project. In order to solve that, the proper fix is to list the dependencies, but most package managers don't really surface this information very well. With Yarn, we try to do that. And each time we notice something that doesn't seem quite right, we try to work with the maintainers in order to fix those issues. It is important not only for Yarn and its users, but also for npm and pnpm users. As I mentioned, those problems occur everywhere. Each time you have something where the version is not quite compatible, although it probably should be, that's a ghost dependency problem.
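As a rough illustration of what surfacing this information can mean, the sketch below compares the packages a project imports against the ones its manifest declares. It is not how Yarn detects the problem: the Manifest type, the function names and the sample data are hypothetical, and Node.js built-in modules are not special-cased.

```typescript
// Hypothetical sketch: the names and data are made up for illustration, and
// Node.js built-ins such as "fs" are not special-cased.
type Manifest = {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
  peerDependencies?: Record<string, string>;
};

// Reduce an import specifier to its package name:
// "lodash/merge" -> "lodash", "@scope/pkg/sub" -> "@scope/pkg".
// Relative and absolute paths are not packages, so they yield null.
function packageNameOf(specifier: string): string | null {
  if (specifier.startsWith(".") || specifier.startsWith("/")) return null;
  const parts = specifier.split("/");
  if (specifier.startsWith("@")) {
    return parts.length >= 2 ? `${parts[0]}/${parts[1]}` : null;
  }
  return parts[0];
}

// A ghost dependency is a package that is imported but never declared.
function findGhostDependencies(manifest: Manifest, imports: string[]): string[] {
  const declared = new Set<string>([
    ...Object.keys(manifest.dependencies ?? {}),
    ...Object.keys(manifest.devDependencies ?? {}),
    ...Object.keys(manifest.peerDependencies ?? {}),
  ]);
  const ghosts = new Set<string>();
  for (const specifier of imports) {
    const name = packageNameOf(specifier);
    if (name !== null && !declared.has(name)) ghosts.add(name);
  }
  return Array.from(ghosts).sort();
}

const ghosts = findGhostDependencies(
  { dependencies: { react: "^18.0.0" } },
  ["react", "lodash/merge", "./local", "@babel/core/lib"],
);
console.log(ghosts);
```

Here lodash and @babel/core are reported because the code imports them while the manifest only declares react. Such imports only work when another package happens to bring those dependencies in.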
6. Node.js Loaders and Corepack
We're part of the Node.js Loaders working group, working on making loaders powerful enough to be practical. We run end-to-end tests on open source projects to prevent dependency issues. We advocate for Corepack, a tool to manage package manager versions on a per-project basis. Corepack works for both Yarn and pnpm. We aim for cross-project pollination and maintain a compatibility database for problematic dependencies.
We're part of the Node.js Loaders working group. So loaders are the way that Node.js allows you to intercept the require calls and route them to different locations. For instance, that could be loading modules from HTTP instead of loading them from the disk. It could be loading modules from compiled archives instead of loading them from individual files. There are a lot of use cases for loaders.
For instance, you may know DESP, which is mocking your modules. That goes through loaders. So the loaders are very new. They didn't exist for CommonJS. They are starting to appear for ESM. We are part of the discussion in order to figure out how to make them powerful enough to be practical in our world.
We run end-to-end tests against many open source projects. Something we noticed by contributing to third parties is that it's easy for them to accidentally add another dependency, forget to list it, and then things start to break. In order to prevent that, on our side, inside Yarn itself, every three hours, we run a bunch of end-to-end tests by installing the latest version of all major open source projects, like Svelte, Gatsby, Webpack, all kinds of projects, really, and checking whether they work on simple tests. If they don't, then we can immediately go to the maintainers and speak to them and see what would be the best fix. So it's been quite helpful for both us and maintainers to track regressions.
And finally, we advocated for and implemented Corepack, a new Node.js tool that allows you to manage the version of your package manager on a per-project basis rather than a global basis. That's something I've been feeling very strongly about, because when you think about it, your package manager's job is to lock your dependencies. Going from there, it feels a bit weird that the package manager is the only dependency of your project that wouldn't be locked, right? So with Corepack, you can actually lock the package manager to a specific version so that you are entirely sure that everyone in your team will have the exact same behaviour.
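In practice the pin lives in the packageManager field of package.json, with a value such as yarn@3.2.0. The sketch below shows, under simplifying assumptions, what checking such a pin against the running tool could look like. It is not Corepack's implementation, and the helper names and version numbers are hypothetical.

```typescript
// Hypothetical sketch, not Corepack's code. Only the "packageManager" field
// itself (for example "yarn@3.2.0") is an existing convention.
type PackageManagerSpec = { name: string; version: string };

// Parse a value such as "yarn@3.2.0". An optional "+<hash>" suffix is dropped.
function parsePackageManager(field: string): PackageManagerSpec | null {
  const at = field.lastIndexOf("@");
  if (at <= 0) return null; // no "@", or nothing before it
  const name = field.slice(0, at);
  const version = field.slice(at + 1).split("+")[0];
  if (version === "") return null;
  return { name, version };
}

// True when the tool being run is exactly the one pinned by the project.
function matchesPin(field: string, runningName: string, runningVersion: string): boolean {
  const spec = parsePackageManager(field);
  return spec !== null && spec.name === runningName && spec.version === runningVersion;
}

const pinned = parsePackageManager("yarn@3.2.0");
console.log(pinned);
console.log(matchesPin("yarn@3.2.0", "yarn", "3.2.0"));
console.log(matchesPin("yarn@3.2.0", "yarn", "1.22.19"));
```

The exact-match comparison is the design point: a pin is only useful if every machine resolves it to the same release.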
One thing to note about Corepack is that it works for Yarn, so it's distributed with Node, and when you run corepack enable, you have Yarn inside your bin folder, but it also works for pnpm. So that's something that I also felt strongly about, that things should work not only for Yarn, because we are one of the other package managers, but also for pnpm, which is another one. We should recognise them and accept them inside the community.
And that brings me to my other point, which is cross-project pollination. We want Yarn to be kind of a platform that can be used in order to build your own package manager if you want to. We maintain a database of ghost dependencies. All those problematic dependencies that I mentioned, where if you are missing one, you may have different behaviours from one install to the other. That is a thing that we track and that we store inside a small database. And pnpm, for instance, leverages this database in order to fix problems as they are reported. So basically, it is like a compatibility database. pnpm itself leverages our code to generate node_modules, not the symlinked node_modules, but the concrete file that you may...
7. Yarn's Unique Features
Yarn has implemented a hoister, which computes the right layout for a set of packages. This code has been extracted into a package that other package managers can use. Yarn also publishes various libraries, such as Clipanion, that can be used in any application.
The concrete node_modules install, such as the one from npm. We implemented a hoister, which allows you to define the right layout given a set of packages. And we've been able to extract this code into a package so that other package managers could leverage it in order to implement this same kind of behaviour. And we have a bunch of libraries that we publish on their own.
For instance, Clipanion is the framework that we use in order to build our CLI. And instead of keeping it inside the Yarn code and just leaving it to live like that, we extracted it into a package that you can use inside your own application. So even if what you're doing has no relation whatsoever to package managers, you can still use code that is written for Yarn. That's a lot of things. And I'm sure that you didn't know at least one of them.
The thing with yarn and Parallels Package Managers in general is that those are diamond fields. There's a lot of very different things that we are doing, and it's sometimes difficult to be aware that they exist before you need them. So at first, you're like, yeah, but what's the difference between A and B? And then you're trying to dig, and you're starting to see that there are very small features that are making a whole.
8. Improving Yarn and Security Features
Yarn is focusing on addressing friction and improving user experience. Corepack is becoming the recommended tool for installing Yarn, making it closer to being part of Node.js. Yarn 4 includes all the features previously offered as plugins, such as automatic @types installation, constraints, and workspace tools. The local cache is now opt-in, reducing the number of generated files. Security flags like --check-resolutions and --refresh-lockfile prevent attacks that tamper with lockfile metadata. Yarn's hardened mode automatically adjusts trade-offs between speed and security. Stable resolution, inspired by other languages, mitigates viral attacks.
Okay, I talked about what Yarn does, but where is Yarn still not good enough? And indirectly, what are we doing for Yarn 4? In terms of friction, we are not exactly at the stage where I would want us to be. As I mentioned, it doesn't matter how good the buffet is if you can't find the door to the restaurant, right? So that's something that we want to address. We want to help you find the door.
So, for that, we are going all in on Corepack. Corepack will become the recommended way to install Yarn, and that makes it closer to being part of Node.js. So it's still experimental in Node itself, but as far as Yarn is concerned, that's the tool that we are going to recommend people use in order to install Yarn. It means that even though with Yarn 2 and 3 it was recommended to check in the built binary of Yarn inside your repository, that's no longer true starting from Yarn 4.
The CLI will become batteries-included. One of the changes that we made when we switched from 1 to 2 is that Yarn got a lot of new features, some a bit experimental, that we didn't include in the default binary, which required you to install and manage plugins that were written by us. That's something that we are changing in Yarn 4. We are graduating all the features that were previously plugins so that when you get the Yarn binary, it will contain all the features that we have been working on, so automatic @types installation, constraints, workspace tools like, for instance, yarn workspaces foreach, which allows you to run a script on all your workspaces. Those are things that will now be part of the default distribution of Yarn. Another one is that we are making the local cache opt-in. Yarn has two caches, one is global to the machine, another one is local, and with the new version, we are making the local one opt-in so that fewer files are generated when you are running an install.
Security. That's something that should be a default, not something you opt into. In order to do that, we have --check-resolutions and --refresh-lockfile. Those two flags completely prevent any attack carried out by modifying the metadata in your lockfile. You may have heard about supply chain attacks. That's something that is not possible with those flags. However, those are flags. You need to enable them, which brings me to my other point, which is hardened mode. Yarn will try to detect which mode it is being run under. If it detects that it's an unsafe environment, for instance, a public pull request, it will, by default, make different trade-offs between speed and security and, for instance, enable the two flags I mentioned so that you will not have to think about security when using Yarn.
Stable resolution. That's an alternative resolution strategy that we are working on that is actually used in other languages, for instance, Go, and that protects against attacks like ua-parser-js or faker.js. The thing with most attacks in JavaScript is that they are extremely viral. Once a package reaches the registry, it may be picked up by anything that depends on this package but also by all the transitive things that accidentally depend on the vulnerable package. With stable resolution, we remove the viral factor. It's not entirely ready.
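The talk does not spell out how stable resolution works. Go's module system, mentioned above, selects the lowest version that satisfies the declared requirements instead of the newest one. Assuming Yarn's alternative strategy works along similar lines, which is an assumption of this sketch and not something confirmed here, the difference between the two behaviours looks like this. The function names and version lists are hypothetical, and only plain major.minor.patch versions are handled.

```typescript
// Hypothetical sketch: none of these functions is a Yarn API, and the version
// data is made up. Only plain "major.minor.patch" versions are supported.
type Version = [number, number, number];

function parseVersion(text: string): Version {
  const [major, minor, patch] = text.split(".").map(Number);
  return [major, minor, patch];
}

// Negative if a < b, positive if a > b, zero if they are equal.
function compareVersions(a: string, b: string): number {
  const va = parseVersion(a);
  const vb = parseVersion(b);
  for (let i = 0; i < 3; i++) {
    if (va[i] !== vb[i]) return va[i] - vb[i];
  }
  return 0;
}

// Versions allowed by a caret-style request "^minimum": same major version
// and at least `minimum`. (Simplified: ignores the special 0.x caret rules.)
function candidates(available: string[], minimum: string): string[] {
  const major = parseVersion(minimum)[0];
  return available
    .filter((v) => parseVersion(v)[0] === major && compareVersions(v, minimum) >= 0)
    .sort(compareVersions);
}

// Usual strategy: take the newest version that satisfies the request.
function pickNewest(available: string[], minimum: string): string | undefined {
  const list = candidates(available, minimum);
  return list[list.length - 1];
}

// Assumed alternative: take the lowest version that satisfies the request.
function pickLowest(available: string[], minimum: string): string | undefined {
  return candidates(available, minimum)[0];
}

const published = ["1.2.0", "1.2.1", "1.3.0", "2.0.0"];
const withBadRelease = [...published, "1.3.1"]; // a compromised release appears

console.log(pickNewest(published, "1.2.0"), pickNewest(withBadRelease, "1.2.0"));
console.log(pickLowest(published, "1.2.0"), pickLowest(withBadRelease, "1.2.0"));
```

With the newest-wins strategy, a fresh install starts resolving to the compromised 1.3.1 as soon as it is published. With the lowest-wins strategy, the result stays at 1.2.0 until someone raises the requested minimum, which is the sense in which the viral factor is removed.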
9. Yarn's Experimental Phase and Documentation
Stable resolution is still experimental and breaks some old packages. Discussions with the community are ongoing to determine the best way to roll it out. The documentation is being rebuilt to provide clearer information and better content. Other improvements include the patch protocol, faster boot time, enhanced public APIs for writing Yarn plugins, and improved integration with Git. Due to time constraints, only a few topics were covered in this 20-minute talk. Feel free to ask questions on Slido or reach out to me. Thank you for your interest in Yarn!
It's still experimental. It breaks some old packages, for instance. Under this mode, things like Gulp don't work anymore. We are still discussing with the community to figure out the best way to roll this out, if we actually want to roll it out.
Finally, documentation. As I mentioned, there are a lot of things to discover in Yarn. You often have very little bandwidth to see them. We are rebuilding our websites in order to be more clear, to better present all the information and to have better content in general.
I only mentioned those three main topics, but there are a lot of things we are also improving in Yarn, like the patch protocol and faster boot time. We are improving the public APIs that you can use when writing Yarn plugins by adding new hooks and new functions. We are trying to make Yarn integrate better with Git in various commands, so that it knows how to best manipulate Git. There are a lot of things that change, but this is a 20-minute talk, so I can only go over a few of them. I hope you enjoyed this talk. If you have any questions, feel free to ask them on Slido or to ping me in the hall. I love talking about Yarn. Thanks for having me here.
Thank you very much, Mael. Very nice session. I learned a lot about Yarn. I will definitely give it a try.
10. Contributing to Yarn and Plug and Play
Yarn is open source. How to start contributing? We have good first issues labeled for easy fixes. Join us on Discord. We provide guidance and have contribution guidelines. We welcome anyone to help us with Yarn. One question from the audience was about switching to plug and play in a big organization. We have tools in Yarn to fix dependencies, such as the packageExtensions field in settings. We also have a database of compatibility problems that we automatically fix. It is possible to switch to plug and play.
I don't think that we have any relevant questions on Slido so far, so folks, wake up, use this tool for sending questions. That doesn't mean that I will let Mael go without a couple of questions. First one from myself. Yarn is open source. Let's change places, organize this, ask us. Your advice, how to start contributing to Yarn? Are there some issues that are easy to fix?
We have a label with good first issues. We are open to new people joining us on Discord. A few times someone asked, would it be possible for me to do this fix? Would it be merged? We give them ideas on how to make it in line with the type of contribution. We have written contribution guidelines. So, yeah, we are really welcoming anyone to help us work on Yarn.
That's amazing.
I think we can take one question or maybe two from the room. Any questions about Yarn? Yes? Let me run to you. Come closer.
Thank you. You just showed a lot of amazing things that you did with Yarn. And I recently tried to actually use Yarn 3 to improve mostly our CI performance and also, of course, get all the local benefits from it. And I checked the plug and play option. I don't know if you mentioned it today. But the question is how a big organization of hundreds of developers can just switch to this one, because there are issues with at least IDEs and, you know.
So that's actually what we did at my company. We are using plug and play. And the way that we do this is that you have various tools inside Yarn itself to fix dependencies. For instance, in your Yarn settings you can have a packageExtensions field that allows you to declare all the dependencies that are missing so that Yarn knows they are here and stops telling you that there is a problem there. As I mentioned, we also have this database of compatibility problems that we automatically fix when we are aware of them. So from time to time people come to the repository and make a PR just to add new entries into this database so that all the projects that adopt PnP are not affected by those issues. So it's definitely possible.
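The setting described in this answer appears to be Yarn's packageExtensions, which lets a project declare dependencies that a third-party package forgot to list. The sketch below only illustrates the merge idea. It is not Yarn's implementation: the types and function are hypothetical, the package names are invented, and extensions are keyed by plain package name here, whereas the real setting is keyed by a package descriptor that includes a version range.

```typescript
// Hypothetical sketch of the idea behind package extensions. Keys are plain
// package names here, which is a simplification of the real setting.
type Manifest = { name: string; dependencies: Record<string, string> };
type Extensions = Record<string, { dependencies?: Record<string, string> }>;

// Return a copy of the manifest with the missing dependencies filled in.
// Dependencies the package already declares take precedence.
function applyExtensions(manifest: Manifest, extensions: Extensions): Manifest {
  const extra = extensions[manifest.name]?.dependencies ?? {};
  return { ...manifest, dependencies: { ...extra, ...manifest.dependencies } };
}

// "some-plugin" uses "lodash" without declaring it (invented example).
const plugin: Manifest = { name: "some-plugin", dependencies: { chalk: "^4.0.0" } };
const extensions: Extensions = {
  "some-plugin": { dependencies: { lodash: "^4.17.21", chalk: "^5.0.0" } },
};

const fixed = applyExtensions(plugin, extensions);
console.log(fixed.dependencies);
```

The original manifest is left untouched and the package's own declaration of chalk wins over the extension, so an extension can only fill gaps in this sketch. Whether the real setting resolves such conflicts the same way is not covered by the talk.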
11. Switching to Yarn and Performance
Switching from NPM to Yarn in a big project depends on priorities and tradeoffs. Yarn aims to surface problems before they impact CI runs. Regarding performance, Yarn is fairly consistent, and using PNP can make it even faster. Further questions can be discussed in a special area.
It's a bit of work because you have to go over the missing deps and add them to those settings. But overall, it's in a pretty good state. Thanks for asking.
And yeah, we have some questions actually. Let's just read them aloud.
Would it be recommended to switch from npm to Yarn in a big, long-running project?
It really depends on what your priorities are, as I mentioned. I don't use npm, so I don't exactly know all the advantages that it has over Yarn. I know that Yarn works for me. I don't know if it would work for you. Clearly we are perhaps a bit more difficult to work with in that we have more friction at the moment, as I mentioned. But at the same time, we are trying to do more to surface problems before they suddenly appear in the middle of a CI run and you have to stop what you're doing to try to fix things. So it's all a matter of tradeoffs. See whether things work for you, whether you would want to see a specific thing fixed. See whether Yarn fixes it and make your choice.
Fair enough.
And so you started your presentation with a lot of statements about performance. Let's finish on the same note, and the question is exactly about that. What's your opinion on the non-deterministic performance results of package managers? Will Yarn become more consistent across different machines?
Overall I think we are fairly consistent. The timing difference I mentioned between CI and laptops was still very small. So it was more just two lines going from one to the other with pnpm. When you're using PnP it's even faster because then we are just writing a single file on the disk, so the link step basically doesn't exist anymore. So depending on your settings it can be extremely fast.
Awesome, awesome.
Folks, I'm sure you have more questions about Yarn, and you have a chance to continue this discussion because I'll kindly ask the organizers to guide Mael to the special area where you can sit and chat further. Thank you very much, Mael. Great presentation.