Resolving dependencies when they are all bundled together is easy. Resolving dependencies when they are being loaded via script tags is much more challenging. The goal of this talk is to explain how Meltwater handles dependency resolution when building native Web Component-based applications that rely on packages published by many different teams.
Immutable Web Apps
From: JSNation 2022
Transcription
How's it going, JSNation? Today we're going to talk about immutable web apps and dependencies. First, I'm Andy Desmarais. You can find me on pretty much all the social medias at terodox, and my website is terodox.tech.

Here's what we're going to cover today. We'll start with the problem we were having at Meltwater, and then we'll move on to immutable web apps, what they are and why we're using them, dependencies without bundling them, referencing those dependencies, and then last we're going to talk about how we order those dependencies successfully.

So what's the problem? Well, the problem was dependencies. We had a lot of them, and they were being created by internal teams as libraries that we could use on a regular basis for things like authentication or company lookups or user lookups. Each of these things was compounding the amount of duplicated JavaScript that was being bundled together with all of the different applications we were shipping. So we needed to figure out a way to unbundle these dependencies and allow them to be shared more effectively between the many applications that were using them.

Immutable web apps were an important part of that journey, so we're going to cover those first. The basic philosophy of immutable web apps is that you want to build them once and deploy them many times. There are some really specific fundamentals that we need to accomplish in order to be an immutable web application. The first one is that all of our environment variables need to be outside of the bundle. This allows our bundles to be built one time per version and be immutable for that specific version. They'll never change after the first time they're built. It also means that we need to deploy them to a URL where the version that we just built is included as a part of that URL. So we want that fully qualified URL to have our version number in it somewhere. Now what benefits does this really buy us? There's a bunch.
The first is that when we're testing in staging and we're testing in production, they're using the exact same assets. The only difference will be the configuration that's changed between the two. This means that all of our assets can also be maximally cached. They can be cached for a full year, in fact, which is the current browser maximum. There's a huge benefit to our customers in that. They only ever need to round trip for that specific version of that specific asset once. After that, it's in their local disk cache, saving huge amounts of time on second and third loads of the site. We never have to worry about them constantly going back to origin for these sometimes large assets.

The other things that come with this: you always know which versions are deployed, because your index HTML page makes it very plain. Look at your script tag. What version's in the URL? That's the version you're dealing with. It also means that rollbacks are now really trivial. We're just flipping back from one specific version of a set of assets to another specific version of a set of assets. And if you're already a consumer, chances are good you already have that previous version of the assets in your disk cache, ready to go. So you won't even have to pay download costs for it again.

So we've talked about all of our assets, but what about the index page? Well, the index page is the one part of an immutable web app that you don't cache. It's the focal point where all of our changes are allowed to show up immediately. We like to break things down into three different pieces. You don't have to break it down this way. It's just a way to think about things. The first piece is our static web hosting, right? It hosts our index HTML page. That index HTML page is pretty static. There's not a whole lot of dynamic nature to it. And it dictates the versions of all of the static assets that we're going to deliver.
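
To make the index page idea concrete, here is a minimal TypeScript sketch of rendering such a page. It is an illustration under assumptions, not the page shown in the talk: the `window.env` global, the `assetUrl` and `renderIndexHtml` helpers, and every URL are hypothetical.

```typescript
// Hypothetical shapes: none of these names come from the talk.
interface VersionedAsset {
  name: string;    // asset folder on the static host
  version: string; // exact version baked into the URL
  file: string;    // file inside the versioned folder
}

interface EnvConfig {
  apiBaseUrl: string; // environment-specific value kept out of the bundle
}

// Build a fully qualified URL that contains the version.
function assetUrl(hostBase: string, asset: VersionedAsset): string {
  return hostBase + "/" + asset.name + "/" + asset.version + "/" + asset.file;
}

// Render the one document that is not cached: a configuration block
// followed by script tags that point at immutable, versioned assets.
function renderIndexHtml(hostBase: string, assets: VersionedAsset[], env: EnvConfig): string {
  const scripts = assets
    .map((a) => '    <script type="module" src="' + assetUrl(hostBase, a) + '"></script>')
    .join("\n");
  return [
    "<!doctype html>",
    "<html>",
    "  <head>",
    // JSON.stringify is enough for a sketch; real code should also escape "<".
    "    <script>window.env = " + JSON.stringify(env) + ";</script>",
    scripts,
    "  </head>",
    "  <body></body>",
    "</html>",
  ].join("\n");
}

// Staging and production differ only in the configuration block.
const appAssets: VersionedAsset[] = [{ name: "my-app", version: "1.2.3", file: "main.js" }];
const stagingHtml = renderIndexHtml("https://assets.example.com", appAssets, {
  apiBaseUrl: "https://api.staging.example.com",
});
const productionHtml = renderIndexHtml("https://assets.example.com", appAssets, {
  apiBaseUrl: "https://api.example.com",
});
```

In this sketch a rollback is just a re-render with the previous version string, since the older assets still live at their own URLs.
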
Those static assets are then delivered using script tags. And those script tags use the fully versioned URL that we were talking about, in this case version 1.2.3, to deliver those assets. Then we also have our APIs. Our APIs could be on a different server or on the same server, but we dictate which API we're going to hit with the configuration block you're seeing in the script tag. That configuration block tells our application where it should go to hit that API and what the route should be, removing that environment variable from our JavaScript code.

So I've glossed over immutable web apps pretty quickly here. There's some more nuance, a lot more detail that you could dig into. If that's interesting to you, please check out immutablewebapps.org.

Now let's dig back into the main crux of what we're talking about, which is dependencies. We have a lot of them, and they're being used by a lot of different applications. So how do we fix that problem? Well, we're going to need some tooling to do this successfully. The first bit of tooling we're going to use is UMDs, a specific type of bundle that we'll dig into. Then we're going to talk about HiMyNameIs, a tool for helping us discover dependency names. And then we'll talk about Orchard, which is really our tool for ordering those dependencies.

So the first tool I want to discuss is UMDs, the Universal Module Definition style of bundling. Pretty much every bundler out there supports it. It allows those bundlers to give a specific name to a package of code that will then end up on the globalThis scope in the browser. The benefit is that we can now reference that module using the globalThis namespace it occupies, bringing it in without having to bundle that dependency into our other code. The example that we're going to work with today is Meltwater Visualizations.
And you'll notice that when we make a name for this UMD bundle, we include a suffix of V plus the major version. In this case, that's major version 14 of Meltwater Visualizations. The reason is that referencing a major version allows us to potentially load multiple major versions of the same dependency in the page at the same time. Now you might be thinking that feels like a bad idea, and I agree with you. But there are also benefits to being able to do that. When you have a lot of intertwined dependencies, not all of them are going to be able to move at the same pace. Upgrading a dependency like visualizations to version 14 when other people are relying on version 13 might mean that everyone has to upgrade at the exact same time and do a very large forklift upgrade. But if you can load Meltwater Visualizations 13 and Meltwater Visualizations 14 at the same time, you give those other teams an opportunity to upgrade at their own pace, a reasonable pace that makes sense for the workload they're under. That means versions can move more slowly, and instead of doing a forklift upgrade, you can upgrade versions when it makes sense for your team. Now, this doesn't mean you shouldn't be mindful of a lot of duplication of code. That's how we ended up here in the first place. But it does make that upgrade path a lot smoother across a lot of different teams.

So how do we use UMDs within our bundle? Referencing them comes down to bundling tools. We'll start with Webpack. Webpack has a property called externals. Externals allow us to load those UMDs that have been put on the globalThis scope by referencing them using their module name. Using the module name tells Webpack: I don't want to bundle this node module anymore. Instead, I want any reference to that node module to resolve to this corresponding globalThis namespace. So a couple of things are happening here.
One, Webpack is being told not to bundle that module name. And two, that module name is being set up to correspond to a globalThis namespace object that should exist. The benefit of doing it this way is that when we're testing with something like Jest or Vitest, we can still use that node module to run our tests. It doesn't have to reference the browser-specific code if we don't want it to. This means that our testing and mocking is much more straightforward than if we were always relying on the globalThis namespace. The benefit there is easier test setups, easier debugging, and local code that runs the way you would expect it to. It's wonderful to know that we can just use the npm module for local testing and know that it will act the same as when we're deployed.

Now let's look at what a Rollup config could look like for this. In Rollup, those two things that Webpack handles in externals get divided into two separate configuration options. The first one is which modules are not going to be bundled together with us. That's what the external property is for in Rollup. It says: here are the modules I just don't want you to load. You don't need to bundle those into my bundle. Then, down in the output globals option, we can use the same type of map we saw in Webpack, where the left-hand side is the module name and the right-hand side is the globalThis namespace that we want to resolve for that module name while we're bundling. The output is very similar in the way that we reference these things, and it makes it very straightforward to reference UMDs from the window object.

But if we start thinking about when we get to 20 or 30 or 50 of these dependencies, maintaining that config is exhausting. There's so much to maintain. Every single one of them has a major version we have to keep track of.
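
The hand-maintained mapping described above might look like the following TypeScript sketch. The package names and global namespaces are hypothetical stand-ins, and the configs are plain objects rather than typed Webpack or Rollup configs; only the `externals`, `external`, and `output.globals` options themselves are real bundler options.

```typescript
// Hypothetical module names and global namespaces, for illustration only.
// Each namespace carries a major-version suffix that must be kept in sync
// with package.json by hand.
const moduleToGlobal: Record<string, string> = {
  "@example/visualizations": "ExampleVisualizationsV14",
  "@example/auth": "ExampleAuthV3",
};

// Webpack: a single `externals` map both skips bundling the module
// and names the global it should resolve to.
const webpackConfig = {
  entry: "./src/index.js",
  externals: moduleToGlobal,
};

// Rollup: the same information is split across two options.
const rollupConfig = {
  input: "src/index.js",
  external: Object.keys(moduleToGlobal), // modules left out of the bundle
  output: {
    file: "dist/app.js",
    format: "iife",
    globals: moduleToGlobal, // module name -> global namespace
  },
};
```
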
When we upgrade our package.json, that major version could change, and then we have to remember to update it in the config. I mean, it gets to be a bit of a mess, but luckily we can automate that. And we can automate it through a package called HiMyNameIs. This package is going to be open sourced, hopefully within the next month, and you'll be able to take advantage of it. HiMyNameIs is basically a library that builds out that module-to-namespace mapping we were talking about, without you having to maintain it yourself. And it does it with two very straightforward functions.

The first function is called generateNamespaceFromPackage. It allows you to build that globalThis namespace we were talking about, including the major version, so that we can populate it into a new package.json property. That property is called browserNamespace, and it gives us the ability to resolve that information later in the build. So once you've got a library that's using this and building out that property in its package.json, your consumers are set up to reference it successfully. Let's look at what that looks like.

The second function out of HiMyNameIs is getPackageNamespaceMapping. This function goes to the package.json for your project and looks at all of the dependencies. It ignores the dev dependencies and resolves those dependencies to look at their package.json files. If the package.json for a dependency has a browserNamespace property in it, the function uses that to build out the mapping of module name to globalThis namespace. So you don't have to maintain that anymore. As you're versioning, it'll just get versioned for you. Every time you build, this will resolve appropriately and very quickly, and give you the mapping that you need. So in Rollup, it's a little bit more complicated, but not much.
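
The two functions described above might behave roughly like the TypeScript sketch below. This is not the HiMyNameIs API: `namespaceFromPackage`, `namespaceMapping`, the `readPackage` callback, the naming rule, and the exact spelling of the `browserNamespace` property are all assumptions made for illustration.

```typescript
// Minimal slice of a package.json. `browserNamespace` is the custom property
// described in the talk; its exact spelling here is an assumption.
interface PackageJson {
  name: string;
  version: string;
  browserNamespace?: string;
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

// Hypothetical naming rule: PascalCase the package name, then append
// "V" + major version, e.g. "@example/visualizations" 14.2.0
// becomes "ExampleVisualizationsV14".
function namespaceFromPackage(pkg: PackageJson): string {
  const major = pkg.version.split(".")[0];
  const words = pkg.name.split(/[^A-Za-z0-9]+/).filter((w) => w.length > 0);
  const pascal = words.map((w) => w.charAt(0).toUpperCase() + w.slice(1)).join("");
  return pascal + "V" + major;
}

// Build module name -> global namespace for runtime dependencies only.
// `readPackage` stands in for resolving a dependency's package.json on disk.
function namespaceMapping(
  project: PackageJson,
  readPackage: (name: string) => PackageJson | undefined
): Record<string, string> {
  const mapping: Record<string, string> = {};
  for (const name of Object.keys(project.dependencies || {})) {
    const dep = readPackage(name);
    if (dep && dep.browserNamespace) {
      mapping[name] = dep.browserNamespace; // devDependencies are never visited
    }
  }
  return mapping;
}

// Usage with in-memory stand-ins for installed packages.
const installed: Record<string, PackageJson> = {
  "@example/visualizations": {
    name: "@example/visualizations",
    version: "14.2.0",
    browserNamespace: "ExampleVisualizationsV14",
  },
  "left-pad": { name: "left-pad", version: "1.3.0" }, // no browserNamespace: skipped
};
const project: PackageJson = {
  name: "my-app",
  version: "1.0.0",
  dependencies: { "@example/visualizations": "^14.0.0", "left-pad": "^1.3.0" },
  devDependencies: { "@example/test-utils": "^2.0.0" },
};
const mapping = namespaceMapping(project, (name) => installed[name]);
// For Rollup, the `external` array can be derived from the same mapping.
const rollupExternal = Object.keys(mapping);
```

Because the namespace is derived from the installed version at build time, bumping a dependency's major version in this sketch changes the mapping without any config edit.
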
As with Webpack, the getPackageNamespaceMapping function still gives you that globalThis mapping object, but then you also need to build out the external array we referenced earlier. Luckily, the mapping has all the information you need to do that.

Okay, but we still haven't tackled ordering, right? And that's a much harder problem, because dependencies can have dependencies that have their own dependencies, and they all need to be loaded in the right order for this globalThis resolution to work appropriately. But we can automate that too. That's where the Orchard CLI tool comes in. Very similarly to HiMyNameIs, it reads your package.json, looks only at your dependencies, and ignores dev dependencies. But instead of stopping at the shallow level, this time it uses a package directly from npm called Arborist to build out the full dependency tree. Arborist is basically what's behind the dependency tree resolution for things like npm install. It allows npm install to build out the full dependency tree and understand which new packages need to be put where in that tree. It helps build out the package-lock.json as well. So it's a very powerful tool that we did not have to build in order to make this system function.

Using that dependency tree, Orchard then builds out a set of script tags based on the tree resolution. And then we limit that using a curated set of YAML files. Why a curated list of dependencies? Well, there are a couple of reasons. The first one, mostly, is safety. We don't want just any package being loaded from npm. We really want to limit it to the things we actually want loaded into the browser. By limiting that, we restrict what can load to a very specific subset that we actually care about. It also shortens the dependency tree resolution by trimming branches that are no longer relevant.
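
The ordering step can be illustrated with a small TypeScript sketch. It is not Orchard's implementation and does not use Arborist: it assumes the dependency graph has already been resolved into a plain object, and `loadOrder` is a hypothetical helper that walks the graph depth-first so that each dependency is emitted before anything that depends on it, skipping anything outside the curated list.

```typescript
// Package name -> names of its runtime dependencies (assumed already resolved).
type DependencyGraph = Record<string, string[]>;

// Return curated packages ordered so every dependency precedes its dependents.
function loadOrder(graph: DependencyGraph, roots: string[], curated: string[]): string[] {
  const ordered: string[] = [];
  const state: Record<string, "visiting" | "done"> = Object.create(null);

  function visit(name: string): void {
    if (curated.indexOf(name) === -1) return; // trim branches not in the curated list
    if (state[name] === "done") return;       // already emitted
    if (state[name] === "visiting") throw new Error("Dependency cycle at " + name);
    state[name] = "visiting";
    for (const dep of graph[name] || []) visit(dep);
    state[name] = "done";
    ordered.push(name); // emitted only after all of its dependencies
  }

  for (const root of roots) visit(root);
  return ordered;
}

// Usage: the app depends on charts and auth, which both depend on dates.
const graph: DependencyGraph = {
  charts: ["dates", "build-helper"],
  auth: ["dates"],
  dates: [],
  "build-helper": [],
};
const order = loadOrder(graph, ["charts", "auth"], ["charts", "auth", "dates"]);
```

In this example `dates` comes first, `charts` and `auth` follow, and `build-helper` never appears because it is not in the curated list.
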
So with these tools in place, we now have this YAML file that we need to worry about with the Orchard. Luckily, it's a reasonably straightforward YAML file. Here's what one looks like for an internal dependency. At the top, we have ownership information. It's wonderful within a large organization to be able to know exactly who is responsible for the dependency that you're relying on in production. That's part of what the Orchard allows us to keep track of. Each of these YAML files has an owned-by, a repo, and a contact property that allow us to keep really close track of who is actually in control of these different dependencies.

Then, under the technical details, is where we actually build out the path for each of the dependencies that we're loading, either via script tag or link tag. We split it into three parts for a really specific reason. We wanted a prefix that is the base path these things will be loaded from. Then we wanted a version path that allows the version to be prefixed with a V or an at symbol, depending on where it's being loaded from. Then the suffix is the specific files that need to be loaded. In the case of what we're looking at, this would load a script tag with type equal to module and resolve the base path plus the version plus the ESM path as the source for that script tag. It would also create a link tag from, once again, the base path, the version, and the CSS path, allowing us to create a set of ordered dependencies.

There are two other properties in here that are worth calling out. One is conflicts with other major versions, which goes back to the earlier point that we try to make sure our internal libraries can run with multiple major versions at the same time. This is the spot where we would kill a build if two different versions of the same library that cannot interoperate were trying to be loaded.
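
The path composition and the conflict check described above can be sketched in TypeScript as follows. The `CuratedEntry` field names are guesses at what the YAML keys might be, not the real Orchard schema, and `tagsFor` and `assertNoConflict` are hypothetical helpers.

```typescript
// Hypothetical in-memory form of one curated entry.
interface CuratedEntry {
  basePath: string;      // prefix: where the package is hosted
  versionPrefix: string; // separator before the version, e.g. "/v" or "@"
  esmPath?: string;      // suffix loaded as <script type="module">
  cssPath?: string;      // suffix loaded as <link rel="stylesheet">
  conflictsWithOtherMajorVersions: boolean;
}

// Compose base path + version + file path into script and link tags.
function tagsFor(entry: CuratedEntry, version: string): string[] {
  const root = entry.basePath + entry.versionPrefix + version;
  const tags: string[] = [];
  if (entry.esmPath) {
    tags.push('<script type="module" src="' + root + "/" + entry.esmPath + '"></script>');
  }
  if (entry.cssPath) {
    tags.push('<link rel="stylesheet" href="' + root + "/" + entry.cssPath + '">');
  }
  return tags;
}

// Fail the build when a package that cannot coexist with itself
// is requested in more than one major version.
function assertNoConflict(entry: CuratedEntry, versions: string[]): void {
  const majors: string[] = [];
  for (const v of versions) {
    const major = v.split(".")[0];
    if (majors.indexOf(major) === -1) majors.push(major);
  }
  if (entry.conflictsWithOtherMajorVersions && majors.length > 1) {
    throw new Error("Conflicting major versions requested: " + majors.join(", "));
  }
}

// Usage with a made-up internal charts package.
const chartsEntry: CuratedEntry = {
  basePath: "https://assets.example.com/charts",
  versionPrefix: "/v",
  esmPath: "dist/index.esm.js",
  cssPath: "dist/styles.css",
  conflictsWithOtherMajorVersions: true,
};
const chartsTags = tagsFor(chartsEntry, "14.2.0");
```
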
The last one here is requires initialization. There are some internal libraries that require you to run some initialization code in order to be in an appropriate state. We like to call that out for our consumers so that they know exactly what they need to bootstrap.

An example for an external dependency looks very, very similar to an internal dependency. In this case, it's Moment.js. We can see that the ownership information has shifted a little bit. It's no longer an internal concern, so we don't have things like team names, but we do have things like the repo it came from and where I can go to contact the maintainers in case of an issue. The technical details are also very similar. In this case, you can see that we're using ES5 instead of ESM, because there isn't an ESM version of Moment.js. All that means is that instead of a type equal to module script tag, we'll now get a script tag with defer on it, which basically defers that script until after all the HTML has been parsed, and it follows the same ordering process as type equal to module when we're doing that resolution. You'll see here that we have conflicts with other major versions set to true. You can't load multiple major versions of Moment. It's just not allowed. They occupy the same globalThis namespace and would overwrite one another. Those are a couple of examples of the YAML files that are used to configure the Orchard.

I want to take a second to talk about some of the outcomes of doing this work, the benefits it has provided us at Meltwater. The big takeaway is that it's been a good thing. It's really helped us cut down on the customer issues we were experiencing. With staging and production relying on the exact same versions of the code, we no longer worry about logic changes that could sneak in between staging and production. We have also decreased the overall size of our application bundles by making those dependencies external.
Those dependencies can then be relied upon by many different applications across our application suite, since they're all loaded from the same domain. We've also seen that caches are getting hit much more often. We're seeing a lot fewer requests for upstream files, since they can be cached so heavily in the browser. It's really helped a lot with user load times.

A quick summary of everything we've talked about today: we had a problem with a lot of libraries being loaded in a lot of different applications, and with that duplicated JavaScript being downloaded. We took an immutable web apps approach to make sure that all of our assets were fully versioned and maximally cached. We build them as UMDs, which allows that globalThis namespace to be created, and they can be referenced from the bundlers that we're already using. Then the little bit of tooling for DevX is HiMyNameIs, which allows those dependencies to be resolved more quickly and more easily, and the Orchard, which is responsible for loading those things in a specific order.

So I covered a lot during this talk, and I know there were a lot of topics here. Thank you for your time, and thank you for listening. If you have any questions on any of these topics, definitely reach out to me on social media. I'll do my best to respond as quickly as I can and answer any questions you have. Thanks so much again for your time. It was great being here to speak.