Node Congress 2022

The Secret Life of Package Managers

Tally Barak

Architect @ Yoobic

Ever wondered what happens after you hit npm install and go to grab a coffee? Let's take a deep dive into the npm and Yarn installation process, and see how you can put this knowledge into practice.

FAQ

During 'npm install', npm manages your project's dependencies. If you install a package (e.g., 'npm install foo'), npm adds it to your 'package.json', creates a 'node_modules' directory in your project, and places the package code there. If the package has dependencies, npm recursively installs and nests them in appropriate 'node_modules' directories within each package.

In earlier implementations, npm would create multiple copies of a common dependency for different packages needing it. To optimize, npm now deduplicates by placing common dependencies at the highest possible level in the 'node_modules' directory, allowing multiple packages to share a single instance of the dependency.

The original npm structure led to problems such as bloated 'node_modules' directories, circular dependencies that could cause infinite loops, and issues with package singletons that could result in bugs due to multiple instances.

npm ensures consistency across different environments using a lock file, originally known as 'shrinkwrap' ('npm-shrinkwrap.json'; newer npm versions generate 'package-lock.json'). This file captures the exact package structure and versions installed in one environment to ensure they are replicated in others, such as on different development machines or in continuous integration (CI) systems.

While npm has evolved its package handling to create a flatter 'node_modules' structure for efficiency, Yarn introduced a different approach with Yarn 2 (codenamed Berry). Yarn 2 uses a virtual file system and patches the way Node.js requests files, directly managing dependency resolutions and fetching the exact files needed from a mapped list of modules.

PNPM is a package manager that maintains the traditional nested 'node_modules' structure but optimizes storage by using a global cache and hard links. Unlike npm and Yarn, PNPM does not physically duplicate package files in each project; instead, it creates direct links to the cached versions, significantly reducing disk space usage.

npm node.js yarn packaging

9 min

18 Feb, 2022

Video Summary and Transcription

npm install can be a mysterious process, but understanding how package managers work is essential. NPM solved problems like large node_modules, circular dependencies, and multiple instances of the same package. Managing package versions and conflicts is crucial for consistency across projects. Alternative approaches to package management, like PNPM and Yarn2, provide insights into the hidden complexities of package managers.

1. The Secret Life of Package Managers

Short description:

npm install can be a mysterious process, but understanding how package managers work is essential. When you install a package, it creates a node_modules folder and adds the necessary code. However, this can lead to issues like large node_modules, circular dependencies, and multiple instances of the same package. NPM solved these problems by deduplicating packages and using a hierarchical structure. This ensures efficient package retrieval and eliminates the need for redundant copies.

So, you are running npm install and you go and grab a cup of coffee, and then you come back and you have no idea or maybe you don't even care what npm did during this time.

So, my name is Tally Barak. I work for Yoobic, and let me tell you about the secret life of package managers. This is what happens in your project; these are the basics of npm.

So, you have your project and you need a package called foo. So, you run npm install foo, and that adds foo to your package.json, creates a node_modules folder in your project, and puts the code of foo inside it. But what if foo requires buzz? Okay, no problem. It will create another node_modules folder under foo and put buzz there. And what if buzz requires bugs? Well, same thing. It will add it there. And then what happens if your foo requires buzz, but your bar also requires buzz? In this case, in the naive implementation of npm, you will have two copies of buzz in the same project.
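The naive nested layout described above can be sketched in a few lines. This is only an illustration, assuming a hypothetical in-memory registry; naiveLayout and demoRegistry are made-up names, not npm internals.

```javascript
// Hypothetical registry: package name -> names of the packages it requires.
const demoRegistry = {
  foo: ["buzz"],
  bar: ["buzz"],
  buzz: ["bugs"],
  bugs: [],
};

// Naive install: every package gets its own node_modules folder that holds
// private copies of its dependencies, recursively.
function naiveLayout(deps, registry, prefix = "node_modules") {
  const paths = [];
  for (const name of deps) {
    const dir = `${prefix}/${name}`;
    paths.push(dir);
    paths.push(...naiveLayout(registry[name], registry, `${dir}/node_modules`));
  }
  return paths;
}

const naivePaths = naiveLayout(["foo", "bar"], demoRegistry);
console.log(naivePaths.join("\n"));
```

Printing the result shows buzz (and bugs beneath it) installed twice, once under foo and once under bar, which is the duplication the talk goes on to discuss.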

And this actually creates a whole structure in your file system that replicates the package structure that you have in your project. And this was nice, but it created quite a few problems. For example, it made your node_modules huge. Also, it created a problem with circular dependencies. That means that if your foo needed bar, that needed buzz, that needed bar again, that needed buzz, and so on, you would go into an infinite loop. And this is, by the way, quite common. It's not as rare as you might think it is. Another common issue is with singletons. If you need a single instance of a certain package, like the debug package, for example, in this structure you would have multiple instances, and that can cause bugs when you execute the packages. And the last one is no longer with us. Thank God for that. It was a Windows problem: the file path was limited to about 260 characters. This is less common.
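The infinite loop comes from walking the same packages again and again. One way to catch it is to remember which packages are already on the current path. This is a minimal sketch, assuming a hypothetical in-memory registry; findCycle and cyclicRegistry are illustrative names and this is not how npm itself is implemented.

```javascript
// Hypothetical registry with a cycle: bar -> buzz -> bar.
const cyclicRegistry = {
  foo: ["bar"],
  bar: ["buzz"],
  buzz: ["bar"],
};

// Depth-first walk that carries the current dependency path ("trail").
// Seeing a package that is already on the trail means we have found a cycle.
function findCycle(name, registry, trail = []) {
  const seenAt = trail.indexOf(name);
  if (seenAt !== -1) {
    return [...trail.slice(seenAt), name];
  }
  for (const dep of registry[name] ?? []) {
    const cycle = findCycle(dep, registry, [...trail, name]);
    if (cycle) return cycle;
  }
  return null;
}

console.log(findCycle("foo", cyclicRegistry)); // [ 'bar', 'buzz', 'bar' ]
```

A naive nested installer without such a check would keep creating node_modules/bar/node_modules/buzz/node_modules/bar/... forever.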

So what did NPM do in order to solve that? They decided to dedupe, that is, de-duplicate the packages that appeared multiple times. So instead of having buzz twice, they would put it at the highest possible level and use it from there. And the reason this worked is the way Node resolves required packages. So if a package needed buzz, Node would look in that package's own node_modules folder. If it doesn't find it there, it goes one level up and searches there.

2. Managing Package Versions and Conflicts

Short description:

When packages have different versions and conflicts arise, the package structure becomes a graph. NPM and Yarn tackle this issue by taking a snapshot of the node modules, ensuring consistency across projects.

And then if it doesn't find it, it will go another level up. And there it is. It actually found buzz there and it will use it.
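The "go one level up and search again" lookup can be sketched as follows. It is a simplified model of the node_modules walk, assuming POSIX-style paths and a hypothetical set of installed directories; lookupDirs and resolveFrom are illustrative names, not Node.js APIs.

```javascript
const path = require("node:path");

// List the node_modules folders that are searched, starting next to the
// requiring package and walking up to the file system root.
function lookupDirs(fromDir) {
  const dirs = [];
  let current = path.posix.resolve(fromDir);
  while (true) {
    // Never look for node_modules/node_modules.
    if (path.posix.basename(current) !== "node_modules") {
      dirs.push(path.posix.join(current, "node_modules"));
    }
    const parent = path.posix.dirname(current);
    if (parent === current) break; // reached the root
    current = parent;
  }
  return dirs;
}

// Return the first location where the package is installed, or null.
function resolveFrom(fromDir, name, installed) {
  for (const dir of lookupDirs(fromDir)) {
    const candidate = path.posix.join(dir, name);
    if (installed.has(candidate)) return candidate;
  }
  return null;
}

// Hypothetical deduplicated layout: buzz was hoisted to the top level.
const installedDirs = new Set(["/app/node_modules/foo", "/app/node_modules/buzz"]);
console.log(lookupDirs("/app/node_modules/foo"));
console.log(resolveFrom("/app/node_modules/foo", "buzz", installedDirs));
```

Because the search falls back to parent folders, a single copy of buzz at the top level is enough for every package below it.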

Next, they decided, well, let's take it one step further. If we can move packages up the tree, why only the duplicated ones? We can actually do that for all the packages. And they made a very flat tree with all the packages. And this was good. This solved the problem. You now had a smaller node_modules and shorter paths, because it didn't go that deep. It was only unidirectional, with no circular dependencies. It made every package unique.
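Ignoring versions for the moment, the flattening step can be sketched like this. It is an illustration only, assuming a hypothetical in-memory registry; flatLayout and flatRegistry are made-up names.

```javascript
// Hypothetical registry: package name -> names of the packages it requires.
const flatRegistry = {
  foo: ["buzz"],
  bar: ["buzz"],
  buzz: ["bugs"],
  bugs: [],
};

// Flat install: visit every package once and place it at the top level.
// The "seen" set also makes circular dependencies harmless.
function flatLayout(rootDeps, registry) {
  const seen = new Set();
  const queue = [...rootDeps];
  while (queue.length > 0) {
    const name = queue.shift();
    if (seen.has(name)) continue;
    seen.add(name);
    queue.push(...registry[name]);
  }
  return [...seen].map((name) => `node_modules/${name}`);
}

const flatPaths = flatLayout(["foo", "bar"], flatRegistry);
console.log(flatPaths);
// [ 'node_modules/foo', 'node_modules/bar', 'node_modules/buzz', 'node_modules/bugs' ]
```

Every package appears exactly once and no path is deeper than one level.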

It was good, but then we had a problem. Each package might require a different version. So the tree that you need doesn't actually look like this one. You have different versions of the same packages required in different places in the tree. And even worse, in some cases, the versions could conflict. That means that your foo might require buzz in version one, but bar requires buzz in version two. And how do you flatten that? What do you put at the top level? In fact, we have an issue here: your file structure, your package structure, is no longer a tree. It is actually a graph.

Different versions of NPM solved this in different ways. Sometimes it would take the most popular version. Sometimes it would take the first one it encountered and put it at the top of the tree, while the other required version was left under the package that required it. Like in this case here, where you could promote version two or version one, depending on the order. And this is a problem, because now we get a very shaky and unpredictable tree. And the way that NPM, and also Yarn in version 1, solved it is by actually taking a snapshot of your whole node_modules. And this is the famous lock file. NPM has it as a shrinkwrap file, and then Yarn added the yarn.lock file. And this is the way for NPM to make sure that the package structure, the node_modules files, in one project is the same as the one on the CI, or the same as the one your colleagues run.
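The order dependence is easy to reproduce with a "first one wins" hoisting rule. This is a minimal sketch under that assumption; hoist and the shape of the request objects are hypothetical, and real npm versions used more involved strategies.

```javascript
// Each request says: `parent` requires `name` at exactly `version`.
// The first version seen for a name is promoted to the top level; any
// conflicting version stays nested under the package that asked for it.
function hoist(requests) {
  const top = new Map();
  const nested = [];
  for (const { parent, name, version } of requests) {
    if (!top.has(name)) {
      top.set(name, version);
    } else if (top.get(name) !== version) {
      nested.push(`node_modules/${parent}/node_modules/${name}@${version}`);
    }
  }
  return { top: Object.fromEntries(top), nested };
}

const fooNeedsV1 = { parent: "foo", name: "buzz", version: "1.0.0" };
const barNeedsV2 = { parent: "bar", name: "buzz", version: "2.0.0" };

const fooFirst = hoist([fooNeedsV1, barNeedsV2]);
const barFirst = hoist([barNeedsV2, fooNeedsV1]);

console.log(fooFirst); // buzz 1.0.0 on top, 2.0.0 nested under bar
console.log(barFirst); // buzz 2.0.0 on top, 1.0.0 nested under foo
```

The same two requirements give two different trees depending on processing order, which is why recording the resulting structure in a lock file and replaying it matters.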

3. Alternative Approaches to Package Management

Short description:

PNPM and Yarn2 offer alternative approaches to package management. PNPM keeps the original package structures in a cache, using hard links to point to the required versions. Yarn2, codenamed berry, maintains a map of all packages and their dependencies, intercepting file requests from Node to provide the required packages. These solutions provide insights into the hidden complexities of package managers.

But here is another solution. PNPM is actually doing something smart and different. PNPM keeps the original structures, the tree structures that we talked about, with all the packages and all of their versions. But they don't really install the packages into each project. They store all the packages and all the different versions in a cache. And then node_modules actually points to the cache at the operating system level, using hard links, so each place in the tree points to the exact version that is required there.
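The hard-link idea can be tried out with Node's fs module. This is a sketch only: a temporary directory stands in for both the cache and the projects, and the folder names and linkFromStore helper are hypothetical, not PNPM's real store layout.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// One physical copy of the package file lives in the "store".
const demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), "store-demo-"));
const storeFile = path.join(demoRoot, "store", "buzz@1.0.0", "index.js");
fs.mkdirSync(path.dirname(storeFile), { recursive: true });
fs.writeFileSync(storeFile, "module.exports = 'buzz 1.0.0';\n");

// "Installing" into a project creates a hard link instead of copying bytes.
function linkFromStore(projectDir) {
  const target = path.join(projectDir, "node_modules", "buzz", "index.js");
  fs.mkdirSync(path.dirname(target), { recursive: true });
  fs.linkSync(storeFile, target);
  return target;
}

const inProjectA = linkFromStore(path.join(demoRoot, "project-a"));
const inProjectB = linkFromStore(path.join(demoRoot, "project-b"));

// All three paths refer to the same file on disk.
const sameFile = fs.statSync(inProjectA).ino === fs.statSync(storeFile).ino;
const linkCount = fs.statSync(storeFile).nlink;
console.log({ sameFile, linkCount });
```

Both projects see a normal file inside their node_modules, yet the content is stored only once.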

So this is one solution that was used, and the other solution is what Yarn 2, which is codenamed Berry, introduced. They basically say: we are the package manager, and we know exactly what package is required by each package. So they have a map of all the node modules and all the packages and what they require. And they patch, they change, the way Node requests files. So if Node now requires a package, it will no longer go directly to the file system to get the package. Instead, it will access Yarn and request the specific package that it needs, and Yarn will look into this map of modules and return the exact files that are needed.
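The map-based lookup can be pictured like this. It is a toy model, not Yarn's actual file format or API: packageMap, its entries, the /cache/ paths, and resolveRequest are all hypothetical.

```javascript
// Hypothetical map: "name@version" -> where the package lives and which
// exact versions of its dependencies it is allowed to load.
const packageMap = new Map([
  ["app@1.0.0", { location: "/project/", deps: { foo: "1.0.0", bar: "1.0.0" } }],
  ["foo@1.0.0", { location: "/cache/foo-1.0.0/", deps: { buzz: "1.0.0" } }],
  ["bar@1.0.0", { location: "/cache/bar-1.0.0/", deps: { buzz: "2.0.0" } }],
  ["buzz@1.0.0", { location: "/cache/buzz-1.0.0/", deps: {} }],
  ["buzz@2.0.0", { location: "/cache/buzz-2.0.0/", deps: {} }],
]);

// Instead of walking node_modules folders, answer "issuer requires request"
// by a direct lookup in the map.
function resolveRequest(issuer, request) {
  const issuerEntry = packageMap.get(issuer);
  if (!issuerEntry) {
    throw new Error(`Unknown issuer: ${issuer}`);
  }
  if (!Object.hasOwn(issuerEntry.deps, request)) {
    throw new Error(`${issuer} does not declare a dependency on ${request}`);
  }
  const version = issuerEntry.deps[request];
  return packageMap.get(`${request}@${version}`).location;
}

console.log(resolveRequest("foo@1.0.0", "buzz")); // /cache/buzz-1.0.0/
console.log(resolveRequest("bar@1.0.0", "buzz")); // /cache/buzz-2.0.0/
```

With a map like this, conflicting versions are no longer a layout problem: foo and bar each get their own buzz, and a package that asks for something it never declared gets an error instead of an accidental match.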

So that was a very, very short and very quick peek into the secret life of package managers.
