The AsyncLocalStorage API is arguably one of the most important recent additions to Node.js. Today we are seeing implementations being added to other runtimes such as workerd, Deno, and Bun, and there is an effort underway in TC39 to introduce a new AsyncContext API to the language itself. This talk will introduce async context tracking with AsyncLocalStorage and AsyncContext and discuss how the model is evolving as it is implemented across multiple platforms.
The Road to Async Context
AI Generated Video Summary
This talk discusses the implementation and performance improvements of the async local storage API in Node and Cloudflare Workers. It explores the concept of continuations and execution contexts and how async local storage allows contextual data to be passed through async flows. It also highlights the challenges in implementing async local storage in Node and the need for a standardized API. The introduction of async context as a successor to async local storage is discussed, along with its benefits and the ongoing development of the async context API.
1. Introduction to Async Local Storage
I will talk about code in Node, Cloudflare Workers, and the async local storage API. We have been working on improving its implementation and performance in Workers. Additionally, I will discuss the differences between Node and Workers and the upcoming Async Context API. Async local storage creates a store that stays coherent through async operations by using execution contexts and continuations.
Good morning, how are you all doing? When they invited me to come out and speak, I was thinking about what I wanted to talk about. We could have talked about Node in general, kind of what's happening in the project, but I just wanted to talk about code.
So I want to talk about some of the code that we've been writing recently, not just in Node, but also in Cloudflare Workers: the async local storage API. If you're not familiar with it, it's been in Node for a couple of years, but it's not really well known what it does or how it works. As we started to work on this in Workers recently, as part of our Node compatibility story, we tried to figure out: how is it implemented, how does it perform, and is it written the way it should be? All right.
So in Workers, we decided to take a different design approach to how it works under the covers, and I want to talk about the differences between the way it works in Node, the way it works in Workers, and where things are going. Now, about the title of the talk, the Road to Async Context: async context is actually a newly proposed standard API that covers the exact same use case as async local storage, but it's going to be added to the language itself. I'll talk a little bit about that at the end. A bit more about me, for those of you that don't know me: I have been involved with Node for a number of years, and I'm also contributing to Cloudflare Workers. I'm on the Workers runtime team at Cloudflare. I'm not going to talk too much about Workers itself. My colleague Matt Alonzo is here, and he's going to be showing off some more specific details of Workers. If you're interested in that, I highly recommend you go over to his talk later on. All right, so let's get things going. What is async local storage? The Node documentation gives us this very helpful definition: creates a store that stays coherent through async operations. That's basically it, and then it gives you an example. It's an extremely unhelpful explanation of what it is, so we're going to break it down a little further. All right, so we have this notion of what's called an execution context. This is whatever code is running right now. That execution context can schedule a continuation. A continuation is any code that's scheduled to run later. So think of a timer, or a promise, any time you attach a then, a catch, or a finally to it. Or callbacks.
2. Understanding Continuations and Execution Context
When you call the FS API and the async callback is run later, that is a continuation: the generic term for things scheduled to run later. The execution context can schedule any number of these things, such as queueMicrotask, promises, callbacks, or process.nextTick.
So when you're calling the FS API, that async callback is run later. The continuation is the generic term for those things that are scheduled to run later. All right? As the execution context is running, it can schedule any number of these things: queueMicrotask, promises, callbacks, process.nextTick. All of these are things that the currently running code can schedule to run, either just a few moments from now or sometime later in the application. And these things can get stacked up.
3. Understanding Async Local Storage
Async local storage allows us to associate contextual data with continuations. We can log unique request IDs without passing them explicitly through every function. This greatly simplifies passing information down, especially when we don't control the API.
Async local storage is about being able to associate contextual data with the scheduling of that continuation, right? So when the current execution context is running, we want to be able to set a context value. And then when that function actually does run later, we want to be able to recover that exact same value, right?
An example makes this a lot easier to see. All right. So in this particular case, what we want, this log function at the top is our use case. What we want is to be able to log something to the console that has a unique request ID. All right? So imagine that this is a server. Any time that request comes in, we want to create a unique ID for that request. And any time we have the console log output, we want to actually include that unique ID in that output. Right?
So the way you had to do it, until async local storage existed, was to pass that ID in explicitly through all your functions, or attach it to something else. We've probably all seen the case where you take the request object, add properties to it, and pass that through. That's very gross: adding arbitrary properties to somebody else's object that you don't own is always a bit problematic. And having to pass this property down through every function, even if you're not going to use it, just so you can have something like logging, is very cumbersome, very difficult to do, especially if you don't control that code. If you don't control all those functions you have to pass through, it makes it a lot more difficult to do.
You know, this kind of shows a bit more of the complexity here. Like, everywhere that that thing goes, this do something and do something else function, we're not actually using the ID there. The only reason we're having to pass that in is so we can get it to that log. Right? With async local storage we have a much better model. Right? We can create this async local storage instance at a scope that is visible to both, you know, server and our log function. Right? The do something and do something else methods don't need to know anything about this ID. Right? As we set, as we, you know, go in and start dealing with our request, right? We tell the request ID what that value is going to be. This is setting up a counter that increments on every request. And then we call set timeout. We're just kind of, you know, simulating async activity here. It's going to wait a bit of time. Then when it actually calls that function, that do something function, which then does the log. Inside the log, we'll pull out that request ID stored value and go from there. So, it greatly simplifies how we actually pass this information down. Like I said, especially when we don't control that API that we're passing through.
4. Local Storage in Node.js
Async local storage allows you to pass contextual data through async flows. It is built on the async hooks and promise hooks APIs, which add complexity and performance loss. Each async local storage instance has a hidden key representing that API instance. In Node, there is a current execution resource representing the running code. Setting a value in async local storage sets the key in the current execution resource's map. This copying of values between resources can be expensive, resulting in performance loss. Node's implementation is based on async hooks, originally designed as a diagnostic API.
Right? So, if this do something is from some arbitrary third party module out on NPM that you don't control the API, but you still need that ID passed through and in your logs, async local storage is the thing that's going to make that possible for you to do. Right? It's going to allow you to get that information. This is what we mean by remaining coherent. It's there when you need it in that async flow. OK?
We can set multiple things. This async local storage is not just a single value. You can schedule as many continuations as you want at the same time, with different values. When those run, and they can run in any order, the appropriate value is going to be returned when you call als.getStore(). OK?
But how do we do this? Unfortunately, it's a bit complicated right now in Node. Async local storage is built on APIs called async hooks and promise hooks. We'll talk a little more about those in a bit, but they add a certain degree of complexity and performance loss when we actually go to use this. Every async local storage instance has a key, a kind of hidden key inside that instance that represents that particular instance of the API. And at any given time within Node, we have this thing called a current execution resource. That's the object that represents the code that's running right now. So when a timer fires, the current execution resource is a timer. When you're dealing with a continuation from a promise, the current execution resource is a promise. There's always an object that represents the currently running code within Node. Whenever you set a value, with als.run, what we do is set that instance's key to the value being set, and the current execution resource maintains the map of all of those values. Say we have 10 async local storage instances in our application: on that current execution resource there are going to be 10 keys with 10 different values. Every time we schedule a new thing, a new timer, a new promise continuation, every time you add a new then, every time you create a new promise, we take the full set of those properties and copy them to the new resource. Think about the performance cost of that. I've dealt with production applications that have created tens of thousands of promises within just a few seconds. Every time a new promise is created in this model, those key values are copied from one object to the next. That is an incredibly expensive operation.
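To make that copying model concrete, here is a toy simulation of the approach just described. This is illustrative only; ExecutionResource is a made-up name, not Node's actual internal class:

```javascript
// Toy model: every new async resource (promise, timer, ...) copies the
// full key/value map from the current resource, whether or not any
// context value ever changed.
class ExecutionResource {
  constructor(parent) {
    // The expensive part: a full map copy on every single new resource.
    this.store = new Map(parent ? parent.store : undefined);
  }
}

let current = new ExecutionResource(null);
current.store.set('alsKey1', 'request-42'); // one ALS instance's hidden key

// Creating three continuations means three full copies of the map.
const continuations = [];
for (let i = 0; i < 3; i++) {
  continuations.push(new ExecutionResource(current));
}

// Every continuation sees the value, but each one paid for a copy.
console.log(continuations.every(r => r.store.get('alsKey1') === 'request-42')); // true
```

Multiply those copies by tens of thousands of promises and the cost adds up quickly.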
I have seen cases where just turning on async local storage in a heavy application has been a 30 to 50% performance loss. And that's being fairly conservative. Right? We can do better. Like I said, Node's implementation is based on async hooks and promise hooks. Async hooks was originally intended as a diagnostic API. It's a way of kind of monitoring the lifecycle of these async resources that are created within Node.
5. Understanding Async Context and Implementation
But it is a very low-level, internal thing. Promise hooks is an API built into V8, really intended to help with this type of use case. A key problem here is how we are propagating that context. We recently implemented async local storage in Cloudflare Workers. We introduced this thing called an async context frame.
But it is a very low-level, internal thing. When we started looking at async local storage, we discovered that, hey, we could use this to implement this model. But the performance of it is actually rather poor.
Promise hooks is an API built into V8, really intended to help with this type of use case. What it does is set up a series of callbacks that can be fired when a promise is created, when it's resolved, on all the different lifecycle events of that promise. But that gets incredibly expensive when you're invoking that code every single time you create a promise and every single time you resolve one. And like I said, applications can create tens of thousands, even hundreds of thousands, of promises in just a few moments. So that code ends up being extremely expensive to run every single time.
A key problem here is how we are propagating that context. We are copying those key values from one execution resource to the next every time one of those continuations is created, not when the context actually changes. And that's the key thing, because the actual contextual information you're dealing with changes very, very rarely. You're only dealing with a few instances of these things, and the values are typically set once during the life of the application. Yet right now, every time we create those promises, we're copying that data every single time, and it gets very expensive very quickly.
So we recently implemented async local storage in Cloudflare Workers. This is a fun thing. It's not going to be full Node compatibility in Workers, but we will have things like node:net and node:crypto (not node:fs, though), and a lot of these modules, using the node: specifier. The specifier is required, just like in Deno. Async local storage is one of the first ones we added. It just got enabled, I think last month, and it's available for everyone to use. But we did this without using async hooks or promise hooks. At the API level, what code uses is very compatible with what Node has, but we've implemented it in a completely different way.
So how do we do it? We introduced this thing called an async context frame. Rather than storing all of those key value pairs on the actual execution resource, what we do is we create a frame only when the data actually changes. So initially when the application is running, there is no frame. We haven't set any values. The first time an ALS instance is used, we will create a frame and set that value and that frame actually maintains that map. The execution resource only maintains a reference to the current frame. So when that resource is created, we just will link it to whatever frame is current. We only create new frames when a new value is specified.
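The frame idea can be sketched in a few lines of plain JavaScript. This is a toy illustration of the concept described above, not the actual workerd implementation:

```javascript
// Frames are effectively immutable: a new one is created only when a
// value actually changes. Scheduling a continuation just captures a
// reference to the current frame -- a single pointer copy.
let currentFrame = null; // no frame until the first value is set

class AsyncContextFrame {
  constructor(parent, key, value) {
    this.map = new Map(parent ? parent.map : undefined);
    this.map.set(key, value); // the only place copying happens
  }
}

function run(key, value, fn) {
  const previous = currentFrame;
  currentFrame = new AsyncContextFrame(previous, key, value);
  try { return fn(); } finally { currentFrame = previous; }
}

function captureFrame() {
  const frame = currentFrame; // one pointer, no map copy
  return (fn) => {
    const previous = currentFrame;
    currentFrame = frame; // restore the captured frame around fn
    try { return fn(); } finally { currentFrame = previous; }
  };
}

let runLater;
run('id', 42, () => {
  runLater = captureFrame(); // "schedule" a continuation
});
// Later, from the event loop: the captured frame is current again.
console.log(runLater(() => currentFrame.map.get('id'))); // 42
```

Creating a million promises in this model costs a million pointer copies, not a million map copies.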
6. Implementation Challenges and Standardization
Propagating context only when the values actually change is much less expensive. This is how Workers does it today, and it is how it could work in Node, but that depends on an API that has been in V8 for years and that Chromium also uses, so changes in V8 are needed first. On standardization: Vercel and others have adopted async local storage, so runtimes like Deno, Workers, and Bun are implementing Node compatibility layers, but what is really wanted is a standard API. Fortunately, TC39 is actively developing an async context API, currently at stage two.
So it's much less expensive. We're only propagating that context when the values actually change, not when the individual continuations are scheduled. It results in a massive performance improvement. This is currently how we do it in Workers. This is how I want to do it in Node.
I skipped a few slides here, but there's a challenge with how we can implement this in Node right now. All of this depends on a new API. It's an API that's been in V8 for years that we are using. It's undocumented, very few people actually know about this API. It completely eliminates any dependency on async hooks and promise hooks for us to implement this entire model.
The problem is that Chromium also uses this API, and the way the API is currently designed, you can only use it for one purpose at a time. If Node were just Node, not depending on Chromium at all, that'd be fine; we could use it. But there are things like Electron which use both Node and Chromium with the same V8 instance. Since Chromium is using this API too, in its own way, we can't quite use it in Node yet, until we make changes in V8. The key change we need is to allow multiple uses of this API at the same time, to be able to set this contextual data with multiple keys. So it's something that's coming. What I'm hoping is that later this year we'll actually be able to make these changes in V8 and turn this into a significant performance improvement in Node for anyone using the async local storage API. Right now, performance is the number one issue you're going to run into if you're using this API. Okay?
So, what about standardization? This is a very good question. A little while back, Luca, working over on Deno, posted a comment on Twitter that I found quite amusing. Vercel, with Next.js, said it's going to be totally standards-based, using all web platform standards, but hey, it requires async local storage. Async local storage is a Node-specific API; there's no specification for it whatsoever other than the code and the documentation. But Vercel has adopted it, and other people are starting to adopt it. So what are these runtimes supposed to do? Do Deno and Workers and Bun just follow Node's lead and implement all of the Node-specific APIs? Well, it turns out yes, we're doing that. We are implementing that Node compatibility layer, but what we really want is a standard API for this. Fortunately, TC39 is working on one. There is an async context API being actively developed right now. It's in stage two.
7. Introduction to Async Context
And to give an idea of what this is going to look like: this is the example with async local storage now, and this is how it changes with async context. We just change new AsyncLocalStorage to new AsyncContext, and getStore just becomes get. It's going to be that simple.
There is some other, you know, there are some other smaller API differences depending on what you're doing, but for the most common use case, this is the extent of the change. Okay? So very, very straightforward. The QR code sends you to the GitHub repository where this proposal is being worked on. It is in active development right now. It's at stage two. It is expected to go to stage three a little bit later this year, just a lot of unanswered questions, things that they're still working through. Most of the API though is stable, but it is on track to be added soon.
If you have an interest in this, they are seeking more input on use cases and general questions about how this is going to be used. But like I said, the simple API is pretty close. There are some features of async local storage that will not be in async context, specifically things like enterWith and disable. What those APIs allow you to do is modify the contextual information in place: run will create a new frame, while enterWith modifies the current frame as it exists, so the context frame becomes fully mutable synchronously. That has a number of challenges. ALS currently allows it as an experimental feature, but it's not going to be carried through to AsyncContext, for a number of reasons. So just don't use those features.
So that's it. Hopefully it was useful.
Q&A on Async Local Storage
Questions cover the trade-offs of using async context as global state, whether security concerns could keep the stage-two proposal from landing, and why Node's async local storage copies key/value pairs on every continuation instead of using context frames: building on the async hooks API, with its many resource types to track, made copying the easiest approach in the current model.
Folks have questions about this, just let me know. Thank you. Thank you James. That was great. Do we have any questions?
So we have some questions for you. The first one is: how do you have such a cool beard? When I met my wife it was just short stubble, and she said don't shave it, and so I have it. It was quite a while ago, 2012. 2012, and you haven't shaved even once? I haven't shaved in a while; got to keep my wife happy.
The next question is, wouldn't async context create a global state with its own set of parameters to propagate? Is the trade-off worth it? The short answer's yes, it can. If you set the async context as global state, then yes, it carries all the same issues as global state in general. You can set that async local store, or the async context, at any scope you want, though. It just has to be accessible to the functions that are going to need it. So you don't have to make it a global; most of the examples just show it as a global. Cool. Well, keep asking your questions, please.
The next question is, async context being at stage two, is there any sort of pushback or security concern that could prevent this from landing? Yeah. In terms of pushback, the API is pretty stable. There have been some security questions in the committee, but from what I've understood, those have been dealt with and it's pretty well accepted. The key question being asked right now is whether it actually belongs in the language at all, versus being pushed through as a web platform API via WHATWG or W3C, one of those venues. That's the key debate happening right now, and there are still some really good arguments for it not going into the language. Cool. And why does async local storage copy key value pairs instead of doing the same thing as you did? It's hard to say. Since everything was built on top of the async hooks API, there are some limitations to how that works. There's quite a bit of complexity involved in tracking those resources, and the fact that we're dealing with so many different types of resources, promises and timers and all these kinds of things, meant it was pretty much the easiest thing to do in the current model.
Q&A on Async Context
The authors of the async context proposal have ruled out implementing the enterWith functionality. Use async local storage now and stick to the portable subset defined by WinterCG. The async hooks API is not deprecated, but it's best to avoid it for most application cases. There are similarities between Deno KV and Workers' KV2, and further use cases for async local storage include APMs, tracing, and logging. Whether async context is related to Java's async context is unknown.
It just wasn't the most performant way of doing it, but I think it just ended up being the easiest. Any chance enterWith could be implemented in the proposal? It's basically required for APMs, since sometimes we can't decide if we're sync or async. Yeah, probably not. At this point, the authors of the async context proposal have ruled it out entirely; they say they do not want to support that enterWith functionality. Whether or not it'll come back later on, I don't know. I don't foresee it.
We have more questions for you. We have domains, which are deprecated, and async local storage, which is experimental. Which one should I use now if I need it? So, use async local storage now, and stick to that portable subset that the WinterCG has defined. If you stick to that subset, then when async context is finished, Node will support both, Workers will support both, and hopefully Deno will support both. So you have a greater chance of being compatible if you just stick to that subset. The async hooks API is not deprecated so much as it's just permanently experimental, and nobody likes it. It has some really good cases for diagnostic purposes, and you can use it for those things, but for most application cases it's probably best to avoid it.
Cool. You just mentioned Deno. We have a question about Deno here. Do you see some similarities with Deno KV? Oh yeah, yeah. I'm very interested in Deno KV. In Workers, we have KV2, and I'm going to be digging into that in a lot more detail. Nice. But yeah, no comments on it yet. Surprise. Can you give us more cool use cases other than logging? That is interesting. The folks working on the async context proposal are actually struggling to find more use cases beyond things like tracing and logging. If you look at most of the applications using async local storage now, it's things like APMs, tracers, and logging. Those are the key use cases. We're looking for more, but those are the ones we've seen the most. Tracing is a big one. Is async context related to, or inspired by, Java's async context? You know, that's a good question. I don't know.
Benefits of Migrating to Async Context
Async context is inspired by async local storage and will benefit all applications. It is recommended to use ALS now and migrate to async context when available. The models and API differences are minimal. It is most useful for servers with a unique context for each request.
No? We'll find out later... It's mostly inspired by async local storage.
And what kinds of applications will benefit from migrating to async context from ALS? I think all of them. ALS is what you have available now; it's what you should use now. When async context is available, there won't be any reason not to use it and migrate to it. Like I said, the models are going to be very similar, very close, and the API differences are going to be very minimal. So there won't be any reason to avoid it.
The types of applications, servers, it's primarily going to be most useful for servers that have that unique context for every single request.