We spend a lot of time discussing which state library we should use, and fair. There are quite a few, from the common one everyone uses and loves to hate on, to that one quirky alternative, to several up and comers. However, discussing which library is best puts the cart before the horse.
When figuring out how to handle state, we should first ask ourselves: what different categories of state do we need? What are the constraints of each category? How do they relate to each other? How do they relate to the outside world? How do we keep them from becoming a giant, brittle ball of yarn? And more.
This might sound overwhelming, but never fear! In this talk, I'll walk you through how to answer these questions, and how to craft an approachable, maintainable, and scalable state system. And yes, I will talk about how to pick a state management library too.
Taming the State Management Dragon
AI Generated Video Summary
This Talk discusses various aspects of state management in software development. It covers different types of state, such as bootstrap data, lazily loaded data, and reactive data. The Talk also explores the concept of locality in state management, including local, global, and regional state. It introduces libraries like Recoil and Jotai that challenge the single global store concept and provide better locality. The Talk emphasizes the importance of setting up state management systems for success and creating reliable systems to focus on user satisfaction.
1. Introduction to State Management
Hi, everyone. My name is Brian Hughes, and I'm a staff frontend engineer at Patreon. Today I want to talk about some things I've learned over the years of how to build successful state management solutions inside of applications. State is not just data, it's how we access, read, update, and synchronize data. It's more than just libraries, it's the entire house. When designing a state management solution, one important question to ask is when is state created? There are three categories of state: bootstrap data, lazily loaded data, and reactive data.
Hi, everyone. My name is Brian Hughes, and I'm a staff frontend engineer at Patreon. At Patreon, I'm on a team called frontend platform, and our team is tasked with managing basically all of the foundations for our frontend code base. And this includes architecture and state management. And today I want to talk about some things I've learned over the years of how to build successful state management solutions inside of applications. Because I think this is a really critical part to having a successful frontend code base. It's also one that tends to be under-invested in.
Let's start off with the definition. What is state? I define state as data and all of its related mechanisms. I think we know intrinsically that state isn't just data, right? Data's just a blob of, well, data. It's really how are we accessing that data? How are we reading that data? How are we updating it? How are we synchronizing it? So state is really data and all of those mechanisms associated with it. The mechanisms are more than just libraries. I think a lot of times when we think of state, we're like, well, we've got this API schema coming from our backend that tells us what data we have. And we pick the state management library, Redux, Recoil, Jotai, whatever, and that's it, right? But that's not it. Those are really important parts. The way I like to think of it is that our API schema, what data the backend returns, and the libraries that we use to access this data, that's like the foundation of a house. It's really critical, of course, it's the foundation. But we also need walls and a roof and things like that too. And for a real state management solution in a front-end application, especially any reasonably complex application, we've got to think about this holistically. We've got to think about the entire house.
So whenever I'm going through and I'm designing a state management solution, either if it's for a whole new application, rewriting an old one, or even if it's just adding a new page to an already existing system, there's a handful of questions that I really like to ask myself. And we only have so much time today, so I'm just going to stick to two of the really important ones, or at least that I think are important, and I also think aren't talked about as much as I wish they were. So the first question is, when is state created? If we think about a front-end application, it's not a snapshot in time, right? It exists over time and it changes and reacts to user input, right? And so when is the state coming into being? Understanding that really informs what kind of a testing strategy do we need for it, what kinds of safeguards do we need for accessing and just all sorts of things. And it really depends on the answer to this question. And I think there are roughly three categories of state when it comes to this temporal nature. The first category is state that is created before startup, otherwise known as bootstrap data. And when we think about this, we'll take, for example, a social media site. We'll use that as an example through this talk. On a social media site, we load the page, the first thing we do is we see a bunch of posts. That is something that exists before we can even begin interacting. And so this happens, or at least one way it can be done is we can do this as part of bootstrap data in a modern server-side rendering type setup.
2. Data Assembly and Gotchas
In Next.js, data for the first render is assembled by calling multiple back-end API endpoints and then starting the application. However, the creation and consumption of data are decoupled, and we need to keep them in sync. The data goes through multiple steps and is funneled along, passing through different places before being accessed. While implementing this is not complicated, it still requires writing and maintaining code, which comes with a cost.
Say with Next.js. And in this world, the idea is that we first, we assemble all of the data needed to do that first render. You know, we call our back-end, usually multiple back-end API endpoints, and we might do some hydration, massaging of data. We assemble all that together, and only then once we're done doing that do we start the application and do that first render.
And so there's a couple of gotchas to be aware of with this. The data creation and the data consumption are pretty decoupled. The way the data looks, for example, in Next.js inside of that getServerSideProps function, the way we work with it, is usually different than whenever we're reading it from, say, inside of Redux or something like that. And so we just need to keep in mind that these are decoupled, and thus we need to actually think about keeping them in sync. Now, it's actually not that complicated to do in practice. Usually TypeScript is a great solution to this, but it is still something that we have to think about, and thus we also have to maintain.
Another bit of a gotcha with this is this data goes through a couple of steps and it kind of gets funneled along. Like I said, we start off calling multiple endpoints to fetch different, oftentimes unrelated bits of data. We're fetching who is the current user, what is the list of most recent posts, what are the current trending topics, things like that. And we take all the disparate data and we kind of squeeze it together into one ball. And we pass it along through a couple of different places. It goes into our top level component, we pass it off to our libraries, to Redux say, whenever we're initializing our store, and only then do we break it apart and access it later. And again, this isn't particularly complicated to implement, but it is code. Anytime we implement something, we have to write code to do it. No matter how simple that code is, it is still code that has to be maintained and is still code that can have a bug in it, and thus it comes with a cost.
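As a rough illustration of keeping the creation side and the consumption side in sync with TypeScript, here is a minimal sketch. Every name in it (`BootstrapData`, `assembleBootstrapData`, `createInitialStoreState`, and the store shape) is a hypothetical stand-in and not something from the talk; the point is only that both sides depend on one shared type, so a change on one side becomes a compile error on the other.

```typescript
// Hypothetical shapes for the social media example; none of these names come from the talk.
interface User { id: string; name: string }
interface Post { id: string; authorId: string; body: string }

// The single contract shared by the code that creates bootstrap data
// and the code that consumes it.
interface BootstrapData {
  currentUser: User | null; // null when nobody is logged in
  recentPosts: Post[];
  trendingTopics: string[];
}

// Creation side: squeeze unrelated endpoint results into one object.
// In a server-rendered app this would run on the server before the first render.
function assembleBootstrapData(
  currentUser: User | null,
  recentPosts: Post[],
  trendingTopics: string[],
): BootstrapData {
  return { currentUser, recentPosts, trendingTopics };
}

// Consumption side: the (assumed) shape the client-side store wants.
interface StoreState {
  session: { user: User | null };
  feed: { postsById: Record<string, Post>; order: string[] };
  trending: string[];
}

// Break the ball apart again when initializing the store.
function createInitialStoreState(data: BootstrapData): StoreState {
  const postsById: Record<string, Post> = {};
  for (const post of data.recentPosts) {
    postsById[post.id] = post;
  }
  return {
    session: { user: data.currentUser },
    feed: { postsById, order: data.recentPosts.map((p) => p.id) },
    trending: data.trendingTopics,
  };
}
```

If a field is renamed or removed in `BootstrapData`, both functions stop compiling until they are updated, which is the kind of cheap safeguard described above. It is still code that has to be maintained.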
3. Lazily Loaded Data and Gotchas
The next type of data is lazily loaded data. It's data that isn't present for the initial rendering of the app but is necessary to show some UI. In a perfect world, we would load all the data at once, but in reality, we load it in chunks. This requires components to have extra intelligence and handle loading states. Lazily loaded data also has potential SEO implications, requiring special code for search engine optimization.
So the next type of data is lazily loaded data, which I kind of think of as deferred bootstrap data. This is data that isn't present for the initial rendering of the app, but is still necessary in order to even begin to show some bit of UI. And the way I like to think about this is, if we lived in a perfect world where servers had infinite memory, database queries always took zero milliseconds, stuff transferred over the internet also in zero milliseconds, no lag, none of that, lazily loaded data is data that would actually be bootstrap data. In our social media example, on that homepage, on that feed, we would load every single post for the feed for all time, all at once, right? And we'd just render that at the start. Of course, in practice, we can't do that. That is way too much data, especially if it's not going to be seen. So instead, we'll load a chunk at a time, the user will scroll some, then we'll load a chunk more and a chunk more. So this is really interesting in that conceptually, it really is exactly the same as bootstrap data, in that this is data that's needed to even begin showing the UI to the user. But mechanically, they look completely different. As a result, the gotchas are different too. In this world, components need some extra intelligence, right? In the bootstrap world, a component gets the data, it renders the data, it's done. But now components need to know, well, what kind of state are we in? Are we in the process of loading, and thus we need to show, you know, a spinner or a ghost component or something like that? Does this component also need to be the one to kick off the request? Does it need to manage that request? So all of a sudden, these components have to do a lot more work. Also, lazily loaded stuff isn't run in server-side rendering, kind of by definition. And this has potential SEO implications as well, right?
If web crawlers need that lazily loaded data in order to understand the site, so that we can get proper search engine optimization, well, now all of a sudden we have to write special code to handle this SEO case. And again, this code may be simple. But simple code is still code that has to be maintained, that can have bugs and go wrong, right? So there's still a cost to it.
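To make the "extra intelligence" concrete, here is a minimal, framework-free sketch of the state a lazily loaded feed might track. The names (`FeedState`, `FeedEvent`, `feedReducer`) and the cursor-based paging are assumptions for illustration only; the talk does not prescribe a particular shape.

```typescript
interface Post { id: string; body: string }

// A component showing the feed needs to know which of these situations it is in.
type FeedState =
  | { status: "idle"; posts: Post[]; nextCursor: string | null }
  | { status: "loading"; posts: Post[]; nextCursor: string | null }
  | { status: "error"; posts: Post[]; nextCursor: string | null; message: string };

type FeedEvent =
  | { type: "loadRequested" }
  | { type: "loadSucceeded"; posts: Post[]; nextCursor: string | null }
  | { type: "loadFailed"; message: string };

// Nothing is loaded yet; "start" is a made-up cursor for the first chunk.
const initialFeed: FeedState = { status: "idle", posts: [], nextCursor: "start" };

function feedReducer(state: FeedState, event: FeedEvent): FeedState {
  switch (event.type) {
    case "loadRequested":
      // Ignore duplicate requests and requests past the end of the feed.
      if (state.status === "loading" || state.nextCursor === null) return state;
      return { status: "loading", posts: state.posts, nextCursor: state.nextCursor };
    case "loadSucceeded":
      // A response only makes sense while a request is in flight.
      if (state.status !== "loading") return state;
      return {
        status: "idle",
        posts: [...state.posts, ...event.posts],
        nextCursor: event.nextCursor,
      };
    case "loadFailed":
      if (state.status !== "loading") return state;
      return {
        status: "error",
        posts: state.posts,
        nextCursor: state.nextCursor,
        message: event.message,
      };
  }
}
```

A component can then render a spinner for `"loading"`, a retry button for `"error"`, and dispatch `loadRequested` as the user scrolls. Bootstrap data needs none of this, which is the mechanical difference described above.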
4. Reactive Data and State Scope
The last bit of data is what I call reactive data. This is state that is in response to a user interacting with the website. The gotchas in state synchronization can become a real challenge, especially with modern posts that have rich content. Handling interactions with the server that have latency and can stack on top of each other requires careful code writing. Defining the type of data and identifying its scope are crucial in state management.
The last bit of data is what I call reactive data. And this isn't data that is needed to show the UI. This is state that is in response to a user interacting with the website. So in the social media example, I think the best case of this is composing a new post, right? We hit a button, it usually pops open an editor, and we have to maintain a whole bunch of state around that, right? We have to maintain, you know, what has the user typed? Well, there's no way we can know this in advance, right? The only thing we can do is wait for the user to tell us the bits about it, right? We don't have a time machine, after all.
So again, the gotchas look pretty different. This is where we start to get into state synchronization a lot more. This is where this can become a real challenge, especially with the way modern posts look, which have all kinds of rich content in them, right? We can attach files. Whenever we attach a file, we usually have to start uploading it to the server in the background. We can at-mention people. So you type @, start typing a few letters, and we'll get an auto completion. That involves multiple round trips to the server. Same thing if we want to add tags or anything else to it. So we get all of these kinds of interactions with the server that have latency, and they can even stack on top of each other. So we have to write our code in such a way that it can handle all these various different combinations.
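One concrete way these stacked interactions go wrong is a slow response for an older request arriving after a newer one. The sketch below shows one common guard against that: tag each request with a ticket and drop responses whose ticket is no longer the latest. `LatestRequestTracker`, `ComposerState`, and `applyMentionResponse` are hypothetical names, and this is only one of several possible approaches.

```typescript
// Tracks the most recent request per channel (for example "mention" or "tag")
// so that late responses from older requests can be dropped.
class LatestRequestTracker {
  private latest = new Map<string, number>();
  private counter = 0;

  // Call when a request is sent; returns a ticket to present with the response.
  begin(channel: string): number {
    const ticket = ++this.counter;
    this.latest.set(channel, ticket);
    return ticket;
  }

  // True only if no newer request has been started on the same channel.
  isCurrent(channel: string, ticket: number): boolean {
    return this.latest.get(channel) === ticket;
  }
}

// A tiny slice of the post editor's reactive state.
interface ComposerState {
  text: string;
  mentionSuggestions: string[];
}

// Apply an autocomplete response only if it belongs to the latest request.
function applyMentionResponse(
  tracker: LatestRequestTracker,
  state: ComposerState,
  ticket: number,
  suggestions: string[],
): ComposerState {
  if (!tracker.isCurrent("mention", ticket)) {
    return state; // stale response: the user has typed more since this was sent
  }
  return { ...state, mentionSuggestions: suggestions };
}
```

Each channel is tracked independently, so a file upload finishing does not invalidate a pending mention lookup. Testing still has to cover the orderings in which these responses can arrive.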
So once we get here, the testing surface becomes much, much larger. Because with lazily loaded and with bootstrap data, there's pretty much just one option. I mean, you might have a couple of variants of data, but they don't really interact with each other. Whereas in this world, we have to think about not just the basic things, but all the possible combinations that can occur at once. So this gets quite a bit more complicated. But it's really interesting to look back on these types, because they all have similarities and differences too. Conceptually speaking, lazily loaded and bootstrap data are the same thing and reactive data is really different. Mechanically speaking, lazily loaded data and reactive data are really similar and bootstrap data is different. So they all have similarities and they all have differences. And I think that's why it's really important when we're defining state to really identify what type of data we're working with and which of these three it is.
So the next question I like to ask myself is, where is state used? We started with when and now we're onto where. And another way of putting this is, what is the scope of state? So first let's talk about locality of data. This is the phrasing I like to use with it. I think we all know the concept of local data and global data. We use this all the time. I think there's a third type here that I call regional. Let's just put a pin in that though.
5. Types of State: Local, Global, and Regional
We're going to talk about local and global state. Local state is specific to one component and doesn't need to be known by the rest of the application. Global data, on the other hand, is accessible everywhere and is crucial for features like user authentication. There is also a third type called regional data, which sits between local and global. It's scoped to a specific part of the app, such as the home page, but is not limited to a single component.
We're going to talk about local and global first. So local state, this is state that only one component needs to know about. It doesn't make sense for the rest of the application to know. For example, in our post editor, as we are typing values in, it doesn't make any sense for the navigation bar or the list of trending topics or anything else to know about the current state of what the user has typed. That is really just local, just that one component needs it.
On the flip side, we have global data. I think the most common global and the best example of this is who is the current user? It's possible that value is null if no one's logged in, but that is still data. And this is something that everywhere across the site needs to know about it. The navigation bar needs to know it to show the user avatar with a logout option. A reply editor needs to know this as well, because we usually show that avatar there. Settings page needs to know about this some too. So this is data that really needs to be accessed everywhere. And so when we're designing global data, we kind of have to think about it a little differently, because it's no longer coupled. So we have to support all these possible different use cases.
There is a third type. And I had to come up with my own term for this. I call this regional data, mostly because I just don't see people talking about this. But maybe there's a better term out there for it. Regional data sits, as the name implies, somewhere in between. This is data that doesn't actually make sense for all of the app. Perhaps it even doesn't make sense for the majority of the app. But it doesn't make sense for just one component either. And I see this most in the modern world of applications, where we're not really a single page app ecosystem anymore. Like, this is what Next.js and React components and all of those mechanisms are. They're kind of multi-page apps, but they also kind of look like single page apps. So we can have some state, for example, that list of posts, that makes a lot of sense on the home page. But that doesn't make any sense if we're viewing the profile for a specific user. It especially doesn't make sense if we're on the settings page. So this is state that really is scoped to just the home page, but it's broad enough that it's not just scoped to a single component.
6. Challenges with Regional Data and Mechanisms
State managers often face challenges when dealing with regional data. The common approach of having all possible data on all pages and setting them to null if not needed is not convincing. The concept of locality applies to mechanisms as well, where local mechanisms are accessible by a single component, global mechanisms can be accessed by anyone, and regional mechanisms are limited to a subset of the codebase.
And this is where I actually see most of the issues crop up around state managers. What do we do with this regional data? The common answer, especially in the Redux world, which really, really likes global data is to just have all the possible pieces of data on all pages and just set them to null if we're not on that page. But I never found that a very convincing solution. I talked about locality of data, right, and I use that word on purpose because if we go back to our definition of state, state is data plus mechanism. And it turns out we actually have the same taxonomy for mechanisms, too. And the concept is the same, local is where the mechanism is only accessible by a single component. Global means anyone can access the mechanism. Regional means only a subset of the codebase can access that mechanism.
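To show why the "everything is global, null it out when unused" answer feels unconvincing, here is a small type-level comparison. The state shapes and names (`EverythingGlobalState`, `RegionState`, `AppState`) are made up for this sketch and are not taken from the talk or from any library.

```typescript
interface Post { id: string; body: string }
interface TwoFactorInfo { enabled: boolean }

// "Everything global" style: every page's data exists on every page,
// and is set to null when we are not on the page that owns it.
interface EverythingGlobalState {
  currentUser: { id: string } | null;
  homeFeed: Post[] | null; // only meaningful on the home page
  twoFactor: TwoFactorInfo | null; // only meaningful on the settings page
}

// Every reader has to handle null, even on the page where the data always exists.
function feedLengthFromGlobal(state: EverythingGlobalState): number {
  return state.homeFeed?.length ?? 0;
}

// Regional alternative: data is grouped by the region that owns it.
type RegionState =
  | { region: "home"; feed: Post[] }
  | { region: "settings"; twoFactor: TwoFactorInfo };

interface AppState {
  currentUser: { id: string } | null; // truly global
  page: RegionState; // regional: only the current page's data exists
}

function describePage(state: AppState): string {
  const page = state.page;
  switch (page.region) {
    case "home":
      // Inside this branch the feed is guaranteed to exist; no null checks.
      return `home with ${page.feed.length} posts`;
    case "settings":
      return `settings, 2FA ${page.twoFactor.enabled ? "on" : "off"}`;
  }
}
```

The second shape only models regional data. It does nothing to restrict which code can reach it, which is the separate question of regional mechanisms.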
7. Types of Locality and Libraries
Different libraries are meant for different types of locality. useState and useRef are used for local data, while Redux and Zustand embody the Flux architecture and provide a single global store for data. However, the relationship between data and mechanism is not always one-to-one.
And so, for the first question, you know, I didn't really talk much about specific libraries because in truth, all the various libraries out there support those different use cases. Well, once we start talking about locality, that's where things change a bit. This is where we actually start to see where different libraries are really kind of meant for different types of locality. So, let's dive into some examples to really kind of understand more what I'm talking about.
And then on the opposite end, we have Redux and Zustand. Both of these embody what's called the Flux architecture. Both of them have the concept of a single global store for all of our data. And it really is global. Both the data and the mechanisms are truly global. If we want access to that data, we import the useStore hook from Redux and we can access every piece of global data in that one store. And there really is just one store. Almost always, in fact. So this is like the epitome of global data. So far, with these two options, we have seen a one-to-one relationship between the data and the mechanism. So it might be tempting to think they always are one-to-one, but it turns out not always.
8. Recoil and Jotai: Micro Stores for Better Locality
Recoil and Jotai are two newer libraries in the state management world that challenge the single global store concept used by Redux and Zustand. Instead of a single global store, these libraries create micro stores called atoms. Each atom is responsible for a single piece of state. This allows for better locality in managing state.
And with the next two libraries I'm going to talk about, we finally start to see this break apart. And that is Recoil and Jotai. So, these two libraries are much newer in the state management world. So, if you're not familiar with them, just real quick, what they do is they said, you know, this whole single global store idea that Redux and Zustand use, maybe that's actually not the best way to do it. So, instead of having a single global store, we're going to create a bunch of basically micro stores. And they call these things atoms. And so, a single atom is responsible for basically a single piece of state. So, for example, there would be a current user atom. Right? And that's all that's there. And there could be a completely separate atom or set of atoms for, you know, all of the posts in the home feed. And yet again a separate set of atoms for the list of trending topics, so on and so forth. And in practice, there's often quite a bit more than that. So, each of these atoms is only responsible for that one piece of state and nothing else. Okay, so let's get back to locality now.
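To give a feel for the "micro store" idea, here is a deliberately tiny, framework-free sketch of an atom. This is not Recoil's or Jotai's actual API or implementation; `createAtom` and the `Atom` interface are hypothetical names invented for this example, and the real libraries do much more (derived atoms, React integration, and so on).

```typescript
type Listener = () => void;

// One atom owns exactly one piece of state.
interface Atom<T> {
  get(): T;
  set(next: T): void;
  subscribe(listener: Listener): () => void; // returns an unsubscribe function
}

function createAtom<T>(initial: T): Atom<T> {
  let value = initial;
  const listeners = new Set<Listener>();
  return {
    get: () => value,
    set(next: T): void {
      if (Object.is(next, value)) return; // nothing changed, nobody to notify
      value = next;
      listeners.forEach((listener) => listener());
    },
    subscribe(listener: Listener): () => void {
      listeners.add(listener);
      return () => {
        listeners.delete(listener);
      };
    },
  };
}

// Separate atoms for separate pieces of state, instead of one global store.
const currentUserAtom = createAtom<{ id: string; name: string } | null>(null);
const trendingTopicsAtom = createAtom<string[]>([]);
```

Updating `trendingTopicsAtom` notifies only the code subscribed to that atom; anything that only cares about `currentUserAtom` is untouched.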
9. Jotai: Local and Global Mechanisms
With Jotai, we can create local mechanisms for accessing shared data in components. Atoms hold state and business logic, allowing for side effects and network calls. Unused atoms are not included in the bundle, providing a regional-like mechanism. Jotai can also be used for global data, like a current user atom.
With that in mind, this is where things get, I think, very interesting. This is why personally I'm a really big fan of this. I've been using Jotai lately and I'm really enjoying it. Because with this, we can construct our code such that let's say we have a file with a component in it and it needs access to some data that's not local to that component. It exists across all components. There are use cases for this. But we can create an atom to maintain that state and we can stick it in that same file as the component and then never export it.
So, what that means is in practice, all of a sudden we have this atom that can only be accessed by one component, which is to say, we can actually have a local mechanism for accessing this data with Jotai, which is something we can't do with Redux. Now, we still don't get local data because that data, again, is shared across all components. But it is a local mechanism, which is really interesting. It's also the first time we start to see the appearance of anything starting to get close to regional data.
Another thing I think is just very clever with Recoil and Jotai is, we have these atoms, and not only do they hold state, they also have business logic for how we interact with it. We can have side effects and all sorts of other stuff. We can make network calls related to it. So we can have all of this logic, we have these mechanisms here for working with certain types of state. And if there is, say, a page, and no component on that page imports that atom, that code doesn't even show up in the bundle. It's not there at all. So if we have an atom to maintain, say, two-factor authentication information, this exists on the settings page. The homepage never imports it. That code, that's not even in the bundle. It's never even shipped to the client. So we actually have what kind of looks like a regional mechanism. I say this is only a regional-ish mechanism, because there's no enforcement of this. There is nothing stopping a component on the homepage from importing that 2FA atom. And enforcement is kind of what we really want to have. But still, it kind of gets us there. It's at least closer, which I think makes it really compelling. And of course, these can also be used for global data. You can have, again, a current user atom. Everyone imports it everywhere.
10. Regional Data and React Context
It just works. So this kind of begs the question, okay, we've talked about different stuff. We've seen good examples of local. We've seen good examples of global. These are starting to get into regional, but not really. So the question is, can anything actually even do regional data? Especially since, again, it's not really talked about.
Turns out, yes. And that is React context. So React context itself is used to power all those libraries. It sits under the hood of Recoil, Zustand, and a bunch of others. But the thing about React context that those libraries don't really exercise, that I find so compelling, is you can have as many React contexts as you want and stick them anywhere in your component tree.
So for that two-factor authentication data, we could put that inside of a React context, and we only initialize the context at a component for the settings page that's not used anywhere else. Sort of like a page wrapper where we start this up. So when that happens, that means you have to be inside of that component tree in order to access this context. It's just built in. If you try to access that context from outside of that component tree, you get an exception. So that means we can now create regional data by scoping it to our component tree, which is really cool.
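The scoping idea can be sketched without React at all. The example below is a framework-free analogue, not React's API: `createScope`, `provide`, and `use` are hypothetical names. One detail worth knowing is that React's own `useContext` returns the context's default value when there is no provider above it, so in practice the exception described here usually comes from a small wrapper hook that checks for a missing value and throws. The sketch models that wrapper behavior.

```typescript
// A scope is only readable while code is running "inside" its provider,
// loosely mirroring a context provider placed at one spot in a component tree.
interface Scope<T> {
  provide<R>(value: T, body: () => R): R;
  use(): T;
}

function createScope<T>(name: string): Scope<T> {
  const stack: T[] = [];
  return {
    provide<R>(value: T, body: () => R): R {
      stack.push(value);
      try {
        return body();
      } finally {
        stack.pop(); // leaving the region makes the data unreachable again
      }
    },
    use(): T {
      if (stack.length === 0) {
        throw new Error(`${name} was used outside of its provider`);
      }
      return stack[stack.length - 1] as T;
    },
  };
}

// Regional data for the settings page only (hypothetical shape).
interface TwoFactorInfo { enabled: boolean }
const twoFactorScope = createScope<TwoFactorInfo>("TwoFactorScope");

// Something rendered somewhere below the settings page wrapper.
function renderTwoFactorPanel(): string {
  return twoFactorScope.use().enabled ? "2FA on" : "2FA off";
}

// The page wrapper is the only place the scope is initialized.
function renderSettingsPage(info: TwoFactorInfo): string {
  return twoFactorScope.provide(info, () => renderTwoFactorPanel());
}
```

Calling `renderTwoFactorPanel()` from the home page fails immediately instead of silently reading a null, which is the enforcement that the unexported-atom approach lacks.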
11. Regional State Management and Best Practices
React contexts are a low-level mechanism not meant to be a featureful state management solution. They were invented to support other state management libraries. I created a library called ReactStrap for accessing regional bootstrap data. Regional state management is an area ripe for innovation. Make it easy to do the right thing and hard to do the wrong thing when setting up state management systems.
Now, there's another issue with React context. Maybe not exactly an issue, but something to be aware of. It's a pretty low-level mechanism. React contexts weren't meant to be a featureful state management solution. They were invented to help support these other state management libraries. They're meant to be built on top of.
I think there's a real opportunity here to start exploring this regional state area. I created a library to do exactly that. I just created and released it as I'm recording this, just a few days ago. I call it ReactStrap. It's very limited and special purpose. It is meant for accessing regional bootstrap data. I encourage folks to take a look at it. Whether you use it or not, I want you to think about what is it we're trying to do here. I think this regional state management concept is an area ripe for innovation. I'm hoping folks can start to invest more in libraries, tools, patterns, even testing mechanisms.
So, it is a critical part of most modern web applications. So, I want to end this talk with a little bit of advice to folks who are setting up state management systems. First, a mantra we learned from a different platform team, make it easy to do the right thing and hard to do the wrong thing. If we can do that, this is how we steer people onto the happy path. And we tend to get more reliable systems as a result. What do I mean by that? A good example of making it hard to do the wrong thing is the lint rule. When we have a lint rule, it says you're not allowed to do this. It makes it hard to do the wrong thing. It doesn't have anything to do with making it easy to do the right thing. It says don't do this. It doesn't say do this. Right? But an example of making it easy to do the right thing, is a pattern I'm a big fan of for common bootstrap data. Let's say a current user. So, I'll create a hook for it.
12. Final Thoughts on State Management
At Patreon, we have a hook called useCurrentUser. It takes no arguments. If you need to use the current user, you import the hook, you call it, and you've got it. Invest early and don't implement state systems ad hoc. Think about all the stuff I've talked about. Set engineers up for success and not failure. That's what a good state management system does. That's what good code does in general. When we're thinking about these systems, think about how can we help our fellow engineers succeed? Once we do that, that's when we create better systems, more reliable systems and we can focus on features and making our users happy.
At Patreon, we have a hook called useCurrentUser. It takes no arguments. If you need to use the current user, you import the hook, you call it, and you've got it. It is so brain dead simple to use this thing. There's almost no reason to use anything else.
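The shape of that idea, a zero-argument accessor in front of private storage, can be sketched as follows. This is not Patreon's implementation and it is not a real React hook; `initCurrentUser` and the module-level variable are assumptions made so the example runs on its own.

```typescript
interface User { id: string; name: string }

// Private storage. In a real module this variable would not be exported,
// so the functions below would be the only way to reach it.
let storedCurrentUser: User | null = null;

// Hypothetical startup wiring: called once with the bootstrap data.
function initCurrentUser(user: User | null): void {
  storedCurrentUser = user;
}

// The easy path: no arguments, no selectors, no setup at the call site.
function useCurrentUser(): User | null {
  return storedCurrentUser;
}

// What a call site looks like.
function renderNavigationBar(): string {
  const user = useCurrentUser();
  return user === null ? "Log in" : `Logged in as ${user.name}`;
}
```

Because the call site is a single line, there is little temptation to reach into the underlying store directly, which is what making the right thing easy means here.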
Next bit of advice I have is invest early and don't implement state systems ad hoc. Like I mentioned at the beginning, it seems most folks think I've got my API schema, I've got my library. I'm going to show all my data and not worry about it. Only once problems arise, do they add more structure. I encourage you to think about it early on. Think about all the stuff I've talked about.
Finally, at the most meta level of this talk, what I want everyone to take away is: set engineers up for success and not failure. That's what a good state management system does. That's what good code does in general. All the state management mechanisms can be used for all types of state. We can use useState and useRef for global data, but it's not going to be successful. It's not going to work. We're going to have so many growing pains. When we're thinking about these systems, think about how can we help our fellow engineers succeed? Once we do that, that's when we create better systems, more reliable systems and we can focus on features and making our users happy.