With concurrent mode coming in React 18, let's talk about the complexities behind providing declarative APIs for concurrent rendering. While implementing concurrent mode APIs from scratch for Brahmos.js, I came across a lot of use cases and variability, which made it one of the most interesting problems to solve and made me appreciate React's effort in advocating Concurrent UI. In this talk we will see what concurrent mode means for a web app, what the internal complexities are, and how I solved them for Brahmos.js.
Cracking the Concurrent Mode
AI Generated Video Summary
Sudhanshu Yadav discusses the incremental concurrent features in React 18 and the need for concurrent mode to provide a better user experience. Time slicing is the key pattern enabling concurrent features. Background rendering and units of work are used to achieve asynchronous rendering and eventual consistency. Concurrent mode introduces a new pattern called deferring, which renders immediately and adjusts based on available resources. React provides APIs for deferred updates and transitions. Implementing concurrent mode APIs can be complex, but it offers benefits like avoiding update starvation and reusing work. Scheduling and slots are used to control execution and dynamically tune FPS. Handling multiple transitions can be challenging, but the React 18 working group discussions provide insights into the team's efforts to improve the user experience.
1. Introduction to Concurrent Mode
In this talk, Sudhanshu Yadav discusses the incremental concurrent features in React 18 and the need for concurrent mode to provide a better user experience. Concurrent features allow for effective resource utilization, background rendering, and declarative APIs to control the rendering sequence. The key pattern enabling concurrent features is time slicing, which breaks a large piece of work into units and provides breathing space for the browser to handle other tasks.
In this talk, I'll be talking about concurrent mode, focusing on my mental model of concurrent features. There is no concurrent mode as such coming in React 18; it's more like a set of incremental concurrent features.
About me: I'm Sudhanshu Yadav, I work at Proficy as a front-end architect. I've authored Brahmos, a React-like library, and have open-sourced many other tools. I am an internals fanatic: I love discussing the internals of the libraries I use and making theories around them. You can find me on Twitter, or you can check my projects on GitHub.
Before we start, let's understand why we need concurrent mode and what it means for a React application, or any application in general. The most important reason concurrent features exist is to provide a better user experience. Now, a lot of libraries focus on improving the performance of the library itself, which should be a priority, but sometimes those improvements are not noticeable by users. And if a user doesn't perceive your application to be smooth, then your application is not smooth; users should perceive your application as smooth.
There is a problem with the current patterns for improving user experience. They have good context on the application, but they don't have control over your rendering phase, and because of that, those patterns can't use resources effectively. Concurrent features allow you to effectively use all the resources available while keeping the application responsive. Concurrent features also enable background rendering, which in turn enables a lot of things, like suspending a render, pausing and resuming it, or batching renders together. They also take a hard part off the developer's plate: orchestrating the whole rendering process. Concurrent mode provides declarative APIs that let you define the order, giving a hint to React (or a similar library) about how you want your rendering sequence to be, for example, to pre-render or lazy-load something. All of those things can be done in a more declarative way.
Now, concurrent features definitely have a lot of scope, but what enables them? The first and most important pattern, I would say, is time slicing. To understand time slicing, let's look at a React tree: each node can be thought of as a component that takes a little bit of time to process. Here's how an update happens: you change state in a component, which triggers a re-render of that component, which in turn triggers re-renders of its children, and all of this happens synchronously in one go. Once you find out what changes are required for the actual DOM, you commit those changes to the actual DOM.
2. Unit of Work and Background Rendering
So let's try to visualize it on our application. You process a unit of work, called a fiber in React, and check whether you should yield to the browser. If not, you process the next unit, and so on until you have to yield. Once the browser is done, you continue processing units until all changes are rendered and committed. This turns synchronous rendering into asynchronous rendering. To maintain consistency, background rendering is used. Instead of updating the current tree, a work-in-progress tree is created. The current tree remains consistent with the actual DOM while processing happens in the work-in-progress tree. Updates are prioritized based on their type, with browser tasks and input updates having the highest priority. React provides APIs to manually mark updates as deferred or synchronous. Eventual consistency is achieved at the end.
So let's try to visualize it on our application. You make a state change, and with that state change, let's not call it a component, let's call it a unit of work, a fiber in React terms. You process one fiber, and then you check whether you should yield to the browser. If not, you process the next unit, and you keep going until you have to yield. As soon as you have to yield, you stop your processing at that moment and schedule your next unit of work for after the browser has done its stuff. So basically you stop and let the browser do its work, and once the browser is done, you come back to processing units, and you continue until all the units are processed. Once everything is rendered and you know what changes are required, you commit all those changes.
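The loop described above can be sketched as follows. The names and the shape of `shouldYield` are illustrative, not the actual React or Brahmos internals:

```javascript
// Minimal sketch of a time-sliced work loop. Each "unit" stands in for a
// fiber; shouldYield tells us when to give the thread back to the browser.
function createWorkLoop(units, shouldYield) {
  let index = 0;
  const processed = [];
  // Returns true once every unit is processed; false means "yielded,
  // schedule me again after the browser has done its work".
  function workLoop() {
    while (index < units.length && !shouldYield()) {
      processed.push(units[index]); // process one unit of work
      index++;
    }
    return index >= units.length;
  }
  return { workLoop, processed };
}
```

When `workLoop` returns `false`, the caller schedules another call once the browser is done, and the commit happens only after a call finally returns `true`.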
Now, there is one more pattern which is very important for concurrent rendering: at the end, you want eventual consistency.
3. Concurrent Mode and Deferring in React
To understand this, let's take the example of a UI with a table and an input bar for filtering. When a user types, the input update needs to be synchronous, while the table update can lag behind. Debouncing is a common pattern, but with concurrent mode, a new pattern called deferring is introduced. This allows for immediate rendering of the table while interrupting it for new user input. It automatically adjusts based on available resources. Transitions in React provide a declarative way of defining deferred updates.
So, to understand this, let's go through an example. Let's say we have a UI with a table that has many rows and columns, and those cells take some time to render. And there is an input bar where you can type and filter things. So whenever you type in the input, you filter the table.
Now, while a user is typing, that update has to happen fast, because you have to give feedback to the user that something is being input. So it has to be done synchronously. But the table update can lag behind. It's fine if the table render lags by a few milliseconds; the user may not notice it at all, or may be fine with it.
So, there are existing patterns for solving this. One of the common ones is debouncing. With debouncing, you delay doing something by some number of milliseconds, and if anything happens in between, you just shift that boundary further. But concurrent mode brings a new pattern: deferring instead of debouncing. Debouncing is not very good because it just delays the problem; it doesn't actually solve it. You won't render for up to, say, 200 milliseconds, but what happens if a user is trying to type at that moment? With deferring, you don't have to wait for 200 milliseconds. You start rendering your table as soon as possible, you keep trying to render it, and you interrupt it whenever new user input comes in. Eventually the table becomes consistent with what the user has typed, so it's fine if it lags a little behind. On machines with more resources, this will happen fast; on machines without a lot of resources, it will lag further behind. So it automatically adjusts based on the resources available.
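For contrast, here is what classic debouncing looks like. The timer functions are injectable purely so this sketch is easy to test, and the 200 ms figure is just the example from above. Note how nothing runs until the timer fires, whereas deferring starts rendering immediately and merely restarts on new input:

```javascript
// Classic debounce: delay the work, and push the delay further on every
// new call. The work does not start until the user pauses.
function debounce(fn, waitMs, setT = setTimeout, clearT = clearTimeout) {
  let timerId = null;
  return (...args) => {
    if (timerId !== null) clearT(timerId); // shift the boundary
    timerId = setT(() => {
      timerId = null;
      fn(...args);
    }, waitMs);
  };
}
```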
Now, another pattern: if you are following React, you must have gone through transitions, and you must have seen the usual snippet. In functional components you can use the useTransition hook, which gives you a startTransition method. Transitions are essentially a declarative way of defining a deferred update.
4. Updates and Transitions
It's like saying your application is at state A and you want to move it to state B, and any update between state A and B should be clubbed together as a transition. All the updates happening in that transition are clubbed together and flushed together. Transitions can be interrupted by synchronous events, so timeouts are used to ensure updates are eventually flushed. Multiple transitions within one action can run independently and merge their results into the current tree.
It's like saying your application is at state A, you want to move it to state B. And any update between state A and B should be clubbed together as a transition. And it shouldn't block your browser, it shouldn't make your browser unresponsive, it can happen as a deferred update.
Also, all the updates happening in that transition are clubbed together: for example, a setState in the same component, while onClick might be calling setState in its parent component. All of those updates are clubbed together in the transition and flushed together. Because transitions can be interrupted by synchronous events, there is a chance that a transition doesn't happen at all.
So for that, transitions also have timeouts. You can provide a timeout to say that within this many milliseconds, at the latest, you want your update to be visible on the DOM. Whenever it reaches that timeout, it tries to flush the updates synchronously. There can also be multiple transitions within one action. Here in handleClick, we are calling startTransition A and startTransition B, and both transitions can run independently; it's like each executing in its own universe, and once they finish, they merge their changes into the current tree. Also, whenever you call the same startTransition method twice, all of the transitions started by that same startTransition method are clubbed together, so you don't see an inconsistent state.
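A toy model of that clubbing behavior, with assumed names (this is not React's or Brahmos's actual implementation): updates made inside a transition are only enqueued, and flushing applies them all at once, so no intermediate state is ever visible.

```javascript
// Toy model: setState inside a transition only enqueues the update;
// flush applies the whole batch in one go.
function createTransitionScope(initialState) {
  let state = initialState;
  const queue = [];
  const setState = (partial) => queue.push(partial);
  const startTransition = (fn) => fn(); // updates inside fn are collected
  const flush = () => {
    for (const partial of queue) state = { ...state, ...partial };
    queue.length = 0; // batch consumed
    return state;
  };
  return { setState, startTransition, flush, getState: () => state };
}
```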
5. Complexities in Implementing Concurrent APIs
Now, while concurrent mode is great and offers a lot of scope, implementing those APIs is not very straightforward. The first complexity is that your updates can become stale. To solve the stale-update problem, I keep a timestamp on those updates. For the mutation part, we create an update queue where all updates are stored. Those updates are applied lazily during the render phase, so the application of state only happens on your work-in-progress tree; it doesn't impact your current tree's references.
Now, while concurrent mode is great and offers a lot of scope, implementing those APIs is not very straightforward. There are a lot of complexities, and in the next part we'll discuss them. This will be based mostly on my experience building those concurrent APIs for Brahmos, but most of it relates to React as well.
The first complexity is that your updates can become stale. Because a deferred update happens asynchronously, it can be interrupted by a sync update, and the state changes from the deferred update can become stale. It can become stale because of a sync update, and it can also become stale because of another deferred update, since everything is happening asynchronously. The bigger issue is that they might be mutating the same references, and they can have side effects as well.
To solve the stale-update problem, I keep a timestamp on those updates. Whenever I'm in the render phase, I check whether an update is older than the update the current tree was built from; if it is newer, I take it, otherwise I discard the old update. The mutation part mostly affects class components, where state from every setState is stored on the component instance. Instead of applying the state on setState calls, we create an update queue where all updates are stored. Those updates are applied lazily during the render phase, so the application of state only happens on your work-in-progress tree; it doesn't impact your current tree's references.
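A sketch of that update queue, with assumed names: setState calls are stored rather than applied, the work-in-progress state is computed lazily at render time, and the current tree's state is only replaced on commit.

```javascript
// Updates are stored in a queue instead of being applied immediately.
function createComponentState(initialState) {
  let current = initialState; // state backing the current (committed) tree
  const updateQueue = [];
  let clock = 0; // stand-in for the timestamp used to detect stale updates
  const setState = (partial) => updateQueue.push({ partial, time: clock++ });
  // Render derives work-in-progress state without mutating `current`.
  const renderWip = () =>
    updateQueue.reduce((s, u) => ({ ...s, ...u.partial }), current);
  // Commit makes the work-in-progress state the current one.
  const commit = () => {
    current = renderWip();
    updateQueue.length = 0;
    return current;
  };
  return { setState, renderWip, commit, getCurrent: () => current };
}
```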
6. Update Starvation and Work Reuse
To fix update starvation, a retry count is added to low-priority updates, giving them progressively more time for deferred work. Transitions can be made synchronous after a timeout. Reusing work already done is crucial to avoid throwing away and redoing work. Updates on a deferred tree can be resumed from where they left off, checking for staleness and cloning fibers only where necessary.
The other problem is update starvation. Low-priority updates can be preempted by high-priority updates, so they can starve: if a lot of high-priority updates are happening very frequently, the low-priority update never gets a chance to happen. To fix this particular problem, I added a retry count to low-priority updates; with more retries, we start giving more time to the deferred update, so it has more time to process fibers in the work-in-progress tree.
Also, for transitions, having a timeout helps: if a transition keeps being interrupted by high-priority updates, after the timeout it becomes a synchronous update. And the most important piece is to reuse the work we have already done as much as possible. Because everything is happening asynchronously, there is a good chance we might throw away work and redo it. To solve that in Brahmos, what I did is perform all synchronous updates directly on the current tree, because they have to be flushed synchronously, and use the work-in-progress tree only for deferred updates. The capability we get with that is this: say an update is happening on the deferred, work-in-progress tree, and at that moment a sync update is triggered on the current tree. You finish that update and come back to the work-in-progress tree, where you can resume from where you left off, because you don't have to throw anything away. You just check whether the updates you made have become stale; if they aren't stale, you keep those fibers, and if they are, you clone the fibers again from the current tree.
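Two small helpers sketch these two ideas; the base slot size and the linear growth factor are assumptions for illustration, not the actual Brahmos values:

```javascript
const BASE_SLOT_MS = 5; // base processing slot, per the talk

// Starvation fix: a deferred update that has been interrupted more often
// gets a bigger time budget on its next attempt.
function slotBudget(retryCount) {
  return BASE_SLOT_MS * (1 + retryCount); // linear growth, illustrative
}

// Timeout fix: once a transition has waited past its timeout, it is
// promoted to a synchronous update and flushed immediately.
function shouldFlushSync(startedAt, timeoutMs, now) {
  return now - startedAt >= timeoutMs;
}
```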
7. Scheduling and Slots for Execution
The scheduling problem arises when deciding when to yield to the browser. To address this, I implemented a concept of slots, where each slot is 5 milliseconds long. After each slot, we yield to the browser and schedule the next slot using a MessageChannel. However, when close to the end of a frame, we schedule through requestIdleCallback and setTimeout to ensure a faster response. The slot size can vary based on how often a deferred update has been retried and the time available in a frame, allowing for dynamic control of the FPS.
The other interesting problem, and it was kind of a nightmare for me, was scheduling. The problem with scheduling is this: if you yield to the browser too little, you will execute your code faster, but you will trade off the responsiveness of the browser, because you will eat up some frames. At the same time, if you yield too much to the browser, the browser stays responsive, but your execution takes more time. So how do you know when to yield to the browser? Also, once you yield, how do you know the browser is done with its work so you can schedule the next unit of work accordingly? Browsers are working on APIs like isInputPending and the Scheduler API which can solve this problem, but until those land, I had to come up with my own solution. What I came up with is a concept of slots: I created 5-millisecond slots, and within a slot we process as many fibers as we can. After every slot, we yield to the browser and schedule the next slot with a MessageChannel. MessageChannel is good because it doesn't have a cool-off time; most other scheduling techniques, like setTimeout or requestIdleCallback, have bigger delays between two slots. But MessageChannel can also eat into browser tasks, so whenever we are close to the end of a frame, instead of scheduling via MessageChannel, we schedule through requestIdleCallback and also a setTimeout, and whichever responds first wins; the other is ignored. And the slots are not constant. The slot size can increase if a deferred update has been retried multiple times, and it can shrink if the slots have already taken up enough of a frame and there is little time remaining in that frame.
And by the way, a slot can even become bigger than one frame; it can span across two frames. The idea behind all this is that, dynamically, we can tune the FPS based on the execution speed and the resources available to the user.
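A sketch of the slot mechanism; the names and the 5 ms figure come from the talk, while the injectable clock exists only so the deadline helper is testable. `MessageChannel` is a browser global that is also available in recent Node versions:

```javascript
// A deadline for one slot: shouldYield() flips to true once the slot
// budget is spent.
function makeDeadline(slotMs, now = Date.now) {
  const end = now() + slotMs;
  return () => now() >= end;
}

// Schedule the next slot via MessageChannel, which fires with no
// cool-off time, unlike setTimeout or requestIdleCallback.
function scheduleSlot(work, slotMs = 5) {
  const channel = new MessageChannel();
  channel.port1.onmessage = () => {
    const shouldYield = makeDeadline(slotMs);
    // work() processes fibers until the deadline and reports completion.
    const done = work(shouldYield);
    if (!done) scheduleSlot(work, slotMs); // yield, then take the next slot
  };
  channel.port2.postMessage(null);
}
```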
8. Handling Multiple Transitions and Complexities
Handling multiple transitions can be a nightmare, especially when there are race conditions and conflicts between updates. Brahmos takes a simpler approach, running transitions independently if they come from different startTransition methods, and merging them if they start from the same method. Nested transitions follow the leaf transitions, and the parent keeps only its direct child state updates. There are many more complexities, but exploring the React 18 working group discussions and the older document on concurrent mode will provide valuable insights into the team's efforts to improve the user experience.
The other interesting problem: transitions are good, but handling multiple transitions is a nightmare. The thing is, because they are async in nature, there can be multiple transitions in a queue, and because they are async, they can have race conditions as well. Updates in one transition can become stale because of another transition.
Also, transitions can be nested: you can write a startTransition inside a startTransition, and you can compose transitions as well, like multiple startTransitions wrapped in another startTransition. You can do all sorts of nasty stuff with transitions.
The other thing that can happen is that two transitions try to update the same state; which one should take precedence? In Brahmos I took a simpler approach to solve this. React right now merges all transitions together, but their long-term plan is to run each transition individually in a different lane. In Brahmos, what I'm doing is: if two transitions come from different startTransition methods, always run them independently, and if multiple transitions start from the same startTransition method, always merge them. It could be that you're calling the same startTransition multiple times, or it could be that you press a button, the previous press is still not completely processed, and you press it again; the updates from the previous transition and the new transition get merged. Different transitions are kept in a transition queue and processed one by one. Instead of creating multiple work-in-progress trees to process every transition in its own space, we keep it simple by processing one transition at a time. Timed-out transitions can move themselves from the transition queue to the synchronous updates; that is handled by timeouts. And for nested transitions, I simply follow whatever leaf transitions are there; those will be respected. The parent transition only keeps the state updates that are its direct children.
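A minimal sketch of that transition queue, with assumed names; `handle` stands in for the identity of the startTransition method that created the transition:

```javascript
// Transitions from different startTransition methods stay independent in
// the queue; repeated transitions from the same method are merged.
function createTransitionQueue() {
  const queue = [];
  function enqueue(transition) {
    const existing = queue.find((t) => t.handle === transition.handle);
    if (existing) {
      existing.updates.push(...transition.updates); // same handle: merge
    } else {
      queue.push(transition); // different handle: keep independent
    }
  }
  // Process one transition at a time, applying its merged updates.
  function processNext(apply) {
    const next = queue.shift();
    if (next) next.updates.forEach(apply);
    return queue.length; // transitions still waiting
  }
  return { enqueue, processNext };
}
```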
Now, there are a lot more complexities, and I wish I could have explained more of them, but we have limited time. I would strongly recommend you go through the React 18 working group discussions; those are gold mines. Also read the older document on concurrent mode; you'll get a lot of ideas. After going through those documents and discussions, you will start appreciating the effort of the React team. They are trying to solve one important problem: a better user experience. And in the grand scheme of things, that is the only thing that matters. With that note, I would like to thank you all. Thanks.