In this talk, I plan to discuss how the Apollo cache works in practice, how important IDs are to the process, and how one can leverage it (through the way they query and mutate, and through schema design). I also want to share some caching patterns and best practices used at Shopify and beyond to solve problems.
The Apollo Cache is Your Friend, if You Get To Know It
From: React Summit 2022
Transcription
So I'm Raman Lalli, I'm here from Shopify, and I'm giving a talk on befriending the Apollo cache. That was my only meme, I only had space for one. That's more of me. The reason this talk came about is that we've been using GraphQL forever, though I only started recently, and we're moving over to Apollo Client 3. People had run into these weird bugs, and I'm going to talk about one, but I wanted to talk about how we could avoid them, and getting to know how the cache works is the best way.
So someone had created this query that was pulling out product metadata. It looked like that; that's not it exactly. But there was something wrong in this query, and that second piece of data just wasn't coming in. They were querying it and nothing was there. We're going to come back to this in a minute and see how we could fix it.
So what's happening in the cache? What exactly is in there, and where is it? Is it a data object we're keeping somewhere? These are things I didn't know, and now you might know. It's in memory, as the name InMemoryCache might tell you, and that's exactly where it's stored. Every time you reload your application, the cache gets rebuilt. Every time you refresh the page, it starts over; it's not persisted anywhere unless you've persisted it yourself. And what's inside of it? It's not actually your data, it's a representation of your data. It takes whatever data you got back from your query and stores a version of it.
Before I talk about any of that, I'm going to talk about how we get that data, and that is the fetch policies. These essentially define when to get your data from the cache and when to get it from the network. There are six of them, and I'm going to go through them really quickly, mainly because this is one of the main things that would cause a bug in your application.
Let's say you're expecting to get data from the network right away, or you need a fresh version of your data and you're not expecting to get it from the cache: you would probably want to swap these around.
So here is our first one, cache-first. It's the default one, and it's really simple. Is all of your data in the cache? Golden. If it's not, we're going to go to the network. And the keyword there is all. If you have an identical query but you're asking for one extra field, it's going to go to the network regardless, because not all of that data is in the cache.
Very similar to this is cache-only. The same thing is true here: if all that data isn't in there, it's going to give you an error, and nothing is going to come back.
And we have a few others, like cache-and-network. This one is interesting, because it's going to get the data from your cache and then refill the cache from the network. So if you have a lot of data that's changing often, and you want it to stay consistent with the server, this would be the way to go. It will go to the network and refill your cache, but you'll always get the cached result first.
Network-only is very similar, except it only goes to the network and then updates your cache. So if you need the updated data first and you're willing to show a loading state and wait for it, the result is then saved in the cache, and a subsequent query can grab it from there.
And then finally, no-cache: just the network. Really simple. Nothing else there.
So we have our data, and we know when we're getting it from the network and when we're getting it from the cache. But what's in the cache? How did that data get stored? I said it's not exactly your data, it's just a representation of it, and normalization is how it got stored. It's normalized in steps, and there are essentially three you can break it down to.
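The fetch policies described above can be summarized in a minimal sketch. This is a toy model written for this transcript, not Apollo Client's implementation: `planFetch`, `FetchPlan`, and the `cacheHasAllFields` flag are hypothetical names, and only the five policies mentioned in the talk are covered. In Apollo Client itself the policy is chosen with the `fetchPolicy` option on a query.

```typescript
// Toy model of the fetch policies from the talk. Not Apollo's code.
type FetchPolicy =
  | "cache-first"
  | "cache-only"
  | "cache-and-network"
  | "network-only"
  | "no-cache";

interface FetchPlan {
  readFromCache: boolean; // a result is served from the cache
  hitNetwork: boolean;    // a request is sent to the server
  writeToCache: boolean;  // the network response is stored in the cache
  error: boolean;         // the policy cannot be satisfied
}

// `cacheHasAllFields` means every field the query asks for is already cached.
function planFetch(policy: FetchPolicy, cacheHasAllFields: boolean): FetchPlan {
  switch (policy) {
    case "cache-first":
      // Only a complete cache hit avoids the network.
      return cacheHasAllFields
        ? { readFromCache: true, hitNetwork: false, writeToCache: false, error: false }
        : { readFromCache: false, hitNetwork: true, writeToCache: true, error: false };
    case "cache-only":
      // Never touches the network; incomplete data is an error.
      return { readFromCache: cacheHasAllFields, hitNetwork: false, writeToCache: false, error: !cacheHasAllFields };
    case "cache-and-network":
      // Serves cached data if present, and always refreshes from the network.
      return { readFromCache: cacheHasAllFields, hitNetwork: true, writeToCache: true, error: false };
    case "network-only":
      // Skips the cache on read, but stores the response for later queries.
      return { readFromCache: false, hitNetwork: true, writeToCache: true, error: false };
    case "no-cache":
      // Network only, and the response is not stored.
      return { readFromCache: false, hitNetwork: true, writeToCache: false, error: false };
  }
}
```

The `cache-first` branch is where the "all" keyword matters: one missing field is enough to flip the plan from a cache read to a network request.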
So our first step: whatever data object comes in, we're going to split it into all the object entities it could be. At one time, I believed that every data object that exists in your application would get split and normalized, and that would be really cool. But that's not quite the case.
So imagine we had this really cool query. You'll see that this is from a demo app I wrote that I won't actually get to show you, but I'll link it at the end, and it's really ugly. Here we can see that the three objects that get broken up all have one thing in common: they're all uniquely identifiable.
That rolls us into the next point. The second step, after we break up all these objects, is to give them unique identifiers. The default way of doing that is by using __typename and id. So if your object has an id field, it would usually be normalized. But only usually, because you might not have an identifier that is exactly id. In that situation, you would use the keyFields API to define your own identifiers. Let's say you had uid as an identifier for an object: you could use that directly. You could also combine multiple fields, including nested fields, to create it. It's the same concept as using id and __typename: you generate your own, and this way those objects also get normalized. Otherwise they get stored nested under their parent object. At the very root, you'll have just the query. So if you had no identifier on any object, it would just be one giant object of whatever query came in.
So we have them all broken up into individual objects that are identifiable, and we're going to save them in this flattened data structure.
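As a rough illustration of the identifier step, here is a minimal sketch of how a cache ID could be derived. The helper `identify` and the `keyFieldsByType` map are hypothetical stand-ins written for this transcript; in Apollo Client the real configuration lives in the `keyFields` setting of a type policy. The sketch assumes flat key fields only.

```typescript
// Hypothetical helper, loosely modeled on how a cache ID is derived.
type Entity = { __typename: string; [field: string]: unknown };

// `keyFieldsByType` plays the role of a per-type keyFields setting.
function identify(
  obj: Entity,
  keyFieldsByType: Record<string, string[]> = {}
): string | undefined {
  const keyFields = keyFieldsByType[obj.__typename];
  if (keyFields) {
    // Custom identifier: build a key object from the configured fields.
    const key: Record<string, unknown> = {};
    for (const field of keyFields) {
      if (obj[field] === undefined) return undefined; // cannot identify
      key[field] = obj[field];
    }
    return `${obj.__typename}:${JSON.stringify(key)}`;
  }
  // Default identifier: __typename plus id.
  const id = obj["id"];
  if (typeof id === "string" || typeof id === "number") {
    return `${obj.__typename}:${id}`;
  }
  // No identifier: the object stays nested under its parent.
  return undefined;
}

const pixelId = identify({ __typename: "Pixel", id: "7" });
const userId = identify({ __typename: "User", uid: "u1" }, { User: ["uid"] });
const valueId = identify({ __typename: "Value", text: "no id here" });
```

Here `pixelId` uses the default rule, `userId` uses the custom `uid` key, and `valueId` comes back undefined, which is the case where an object cannot be normalized on its own.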
And we want that flattened data structure because it's easy to access, and we can make it as small as possible. We took that earlier query, and this is exactly what the normalized cache looks like. If you were to extract the cache and take a look at it, which I'm going to show you in a minute, this is what you would see. And you can see that all these objects that had identifiers aren't nested inside the other objects anymore. We're just holding a reference to them. Any time this query needs this object, it gets it through this reference. But the magic comes in when you have multiple queries that all use the same types of objects, because we reuse types everywhere. They'll all be referencing the same object in the cache. And that's really the power of normalization.
These are washing machines. This was my riff on something about automation, something automatic. I couldn't think of anything better than this. I thought maybe a car transmission would work, too. But the idea is that we have our data, and we're going to get new data afterward. I'm sure you've seen at some point that if you re-query, data gets updated automatically. That's really cool, and we love that, and automatic things are very cool. But sometimes automatic things don't work so well. So I'll throw this shirt into the wash, and it's got a stain on it. Usually all my stains get washed out, but this one didn't. This actually happened to this shirt before, and it was blueberry, and it was really ugly. So I was like, okay, yeah, the washing machine sucks. I probably should have done it by hand. But the washing machine doesn't suck, because I probably should have just done something to the shirt before it went in there to make the automatic part work better. So that's what we're going to talk about. Sometimes not-so-automatic updates will happen.
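To make the flattened structure and the shared references concrete, here is a minimal sketch of a normalizer. It is a toy written for this transcript, not Apollo's implementation: `normalize` and `writeResult` are hypothetical names, and only the default `__typename` plus `id` rule is modeled. The `{ __ref: ... }` shape and the `ROOT_QUERY` key mirror what an extracted Apollo cache looks like.

```typescript
type StoreObject = Record<string, unknown>;
type Store = Record<string, StoreObject>;

// Returns what the parent should hold: a reference for identifiable
// objects, or a nested copy for objects without an identifier.
function normalize(value: unknown, store: Store): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => normalize(item, store));
  }
  if (typeof value !== "object" || value === null) {
    return value; // scalars are stored as-is
  }
  const source = value as Record<string, unknown>;
  const fields: StoreObject = {};
  for (const key of Object.keys(source)) {
    fields[key] = normalize(source[key], store);
  }
  const typename = source["__typename"];
  const id = source["id"];
  if (typeof typename === "string" && (typeof id === "string" || typeof id === "number")) {
    const cacheId = `${typename}:${id}`;
    // Merge into any existing entry so repeated writes share one object.
    store[cacheId] = { ...store[cacheId], ...fields };
    return { __ref: cacheId };
  }
  return fields;
}

// Writes a query result under the root entry of the flat store.
function writeResult(store: Store, result: Record<string, unknown>): void {
  const fields = normalize(result, store) as StoreObject;
  store["ROOT_QUERY"] = { ...store["ROOT_QUERY"], ...fields };
}

const store: Store = {};
writeResult(store, {
  product: { __typename: "Product", id: "1", title: "Shirt", meta: { slug: "shirt" } },
});
writeResult(store, {
  favorite: { __typename: "Product", id: "1", isFavorite: true },
});
```

After both writes the store holds only two top-level entries, `ROOT_QUERY` and `Product:1`. Both root fields point at the same `Product:1` entry, and the `meta` object, which has no identifier, stays nested inside its parent.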
So whenever data comes into your application, gets normalized, and is being cached, one of two things will happen if there's already data there: it will either get merged, or it will get added. There are only two options.
So when does it get merged automatically? What are the scenarios where it works and we're really happy about it? The first one is if you're updating a single entity and you return that entity with its identifier and its updated fields. That's really simple to do. As we saw, all those objects are in this hash, so we can easily grab them by their identifier and merge the new fields in. This is the one that will happen most often. The second one is if you had a list or a collection of entities, and you returned all of them with all of their identifiers and all the fields that need to be updated. This doesn't work if you return, let's say, some of them or just one of them. You have to return the whole collection in order to update each of them.
So let's talk about when it doesn't work, because these are the cases where we run into trouble, and it's a really ugly situation. First, let's say the response data that's coming back isn't related to the update that you want to happen. I know there were some situations where we had a product that was being favorited. You favorite it, the mutation goes out, it comes back with the ID and the favorite status, and that product is updated wherever it's being used in the UI. The thing it's not going to update is how many products are favorited. Let's say you had a UI that was showing the number of favorited objects: it might be related, but it's not the same data. In this scenario, you would have to write your own update function and update that data yourself, even though it might seem related to you.
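The favorite example above can be sketched with a toy store. This is an illustration under stated assumptions, not Apollo's code: `writeEntity`, `updateFavoriteCount`, and the `favoriteCount` root field are hypothetical names. In Apollo Client the manual part would typically live in a mutation's `update` function.

```typescript
type StoreObject = Record<string, unknown>;
type Store = Map<string, StoreObject>;

// Incoming entities are merged into an entry with the same ID,
// or added as a new entry when no such ID exists yet.
function writeEntity(store: Store, id: string, fields: StoreObject): "merged" | "added" {
  const existing = store.get(id);
  store.set(id, { ...existing, ...fields });
  return existing ? "merged" : "added";
}

// Hypothetical manual update: the mutation response says nothing about
// the count, so related data has to be adjusted by hand.
function updateFavoriteCount(store: Store, delta: number): void {
  const root = store.get("ROOT_QUERY") ?? {};
  const raw = root["favoriteCount"];
  const current = typeof raw === "number" ? raw : 0;
  store.set("ROOT_QUERY", { ...root, favoriteCount: current + delta });
}

const favorites: Store = new Map();
favorites.set("ROOT_QUERY", { favoriteCount: 0 });
favorites.set("Product:1", { __typename: "Product", id: "1", isFavorite: false });

// The mutation returns the ID and the new status: merged automatically.
const outcome = writeEntity(favorites, "Product:1", { isFavorite: true });

// The count is untouched until the manual update runs.
const countBefore = favorites.get("ROOT_QUERY")?.["favoriteCount"];
updateFavoriteCount(favorites, 1);
const countAfter = favorites.get("ROOT_QUERY")?.["favoriteCount"];
```

The merge by ID needs no extra code, while the count only changes because `updateFavoriteCount` is called explicitly.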
The rest of these are about lists, because that's really the hardest thing to manage. Again, if you don't return the whole list of updated objects, you're not going to get that automatic update. The same is true if the order changes. Let's say you send out objects in the order 1, 2, 3, 4, and you change it to 1, 2, 4, 3. When it comes back, the objects are still the same; the only thing that changes is the order. It's not going to be reflected in your UI. That's something you'd have to write manually. And it's mainly because the cache makes no assumptions about how you want to store your data or what your data should look like inside the cache. Those objects are identical, and the only thing it holds is the references to those objects.
And then finally there's adding or removing things, which also really sucks. If I unfavorite something, I can update that object's favorite status, but I can't remove it from a list of favorited objects. That's something you'd have to write an update function for. Because, again, the cache doesn't know: you can't return something from your mutation that says, OK, now remove this for me. You can only return data. So the update functions exist to do that, but in these scenarios it can be a bug if you're expecting the update to happen automatically. We've definitely run into those.
So we'll come back to this, because now we've talked about a couple of the things that come into play if we want to solve this issue. The idea here was that the product metadata is the same type as this metadata type down here. This person was querying this, and they were like, oh, slug is undefined, I don't know why slug is undefined. And it's mainly because the identifier for the product metadata and the identifier for the metadata were the same. So those objects got normalized inside the cache.
And you would expect that their children would be normalized too. The issue is that this values object has an ID, and this value does not. The cache tried to normalize this value, but it couldn't. And the other values object was the one that got saved inside this normalized object. So the way you would solve this, boom, is to just add an ID to it. Now the cache can find it and update it, and both can be normalized properly.
So now we're on to the last part, and the part I was most excited about: garbage collection and how eviction works. I'm sure everyone's run into garbage collection at some point. I realized after the fact that this is recycling, not garbage. I was thinking about putting something over it, but I didn't. Generally, with garbage collection, we're trying to recover memory from our app. In JavaScript, that usually happens behind the scenes; we don't have to interact with it manually. In Apollo, we do have to interact with it manually, and this is how we would do it. I tried to make it as tiny as possible while still being readable, mainly because it's so small. You would call cache.gc() to clean up any unreferenced objects in your cache. We'll show what that looks like in a second. The other thing is that it returns the IDs of anything that got collected.
So here's that contrived app I was talking about, using the same queries as before. There's a GraphQL server, and we're querying it for all those pixels, including all the white space. They're each individually identifiable. You can see I just printed out the cache size. That's not the actual cache size; it's the number of keys in the cache, but it represents how big the cache might be at that point. And it's that many items big. Since all of these have IDs, they're all addressable, and they're all being normalized.
Let's say we go ahead and change Pikachu's color to orange. You might notice that the cache is now twice as big, or more. The issue is that we made this mutation (this is a very contrived example) and essentially all of those identifiers changed, or the vast majority of them. They came back and couldn't be merged. So what happened? They had to be added. All those other objects are still there; they're just not reachable from the root object. We don't need them. Our UI doesn't need them. But they're still there, and they're taking up space. That's really the main reason we want to get rid of these things.
So let's say we do run it. This is, again, a cruder version of all that other data. How do we collect all these items that aren't being referenced? The garbage collector takes a look at the normalized cache and recursively goes through each node, starting from the root, until it reaches all of your leaf nodes at the very end. Anything that wasn't visited will be removed. This is our new query, the one with the orange color and all those new identifiers. And this is our original query, with the original color. You can see it's not being referenced by the root anymore, because we're not using that data anymore. It's not the main query. So we go through and reach all these nodes. Boom. They're all good. And the ones that weren't visited get removed by the garbage collector.
Now let's say there was a specific object we wanted to keep, so that even if it's not reachable, it won't be removed. We would use the retention API, cache.retain, for this. What really happens inside the cache when you retain something is that it gets added to extraRootIds. So inside the normalized cache, there's a separate field for keeping track of all these identifiers.
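The traversal described above is essentially a mark-and-sweep pass. Here is a minimal sketch of it over a toy flat store, assuming references are stored as `{ __ref: id }` and the root entry is called `ROOT_QUERY`. The functions `collectRefs` and `gc` are written for this transcript and are not Apollo's implementation; the `extraRootIds` parameter stands in for the retained IDs. In Apollo Client the corresponding calls are `cache.gc()`, `cache.retain(id)`, and `cache.release(id)`.

```typescript
type Store = Map<string, Record<string, unknown>>;

// Collects every { __ref } found anywhere inside a stored value.
function collectRefs(value: unknown, out: string[]): void {
  if (Array.isArray(value)) {
    for (const item of value) collectRefs(item, out);
  } else if (typeof value === "object" && value !== null) {
    const obj = value as Record<string, unknown>;
    const ref = obj["__ref"];
    if (typeof ref === "string") {
      out.push(ref);
    } else {
      for (const key of Object.keys(obj)) collectRefs(obj[key], out);
    }
  }
}

// Marks everything reachable from the root (plus retained IDs),
// then sweeps the rest. Returns the IDs that were removed.
function gc(store: Store, extraRootIds: Set<string> = new Set()): string[] {
  const reachable = new Set<string>();
  const pending: string[] = ["ROOT_QUERY"].concat(Array.from(extraRootIds));
  while (pending.length > 0) {
    const id = pending.pop() as string;
    if (reachable.has(id)) continue;
    reachable.add(id);
    const entry = store.get(id);
    if (entry) collectRefs(entry, pending);
  }
  const removed: string[] = [];
  for (const id of Array.from(store.keys())) {
    if (!reachable.has(id)) {
      store.delete(id);
      removed.push(id);
    }
  }
  return removed;
}

const pixels: Store = new Map();
pixels.set("ROOT_QUERY", { pixels: [{ __ref: "Pixel:new1" }] });
pixels.set("Pixel:new1", { color: "orange" });
pixels.set("Pixel:old1", { color: "yellow", cheek: { __ref: "Cheek:1" } });
pixels.set("Cheek:1", { color: "red" });

// With the cheek retained, only the unreachable old pixel is swept.
const firstPass = gc(pixels, new Set(["Cheek:1"]));
// Once it is no longer retained, the cheek is swept as well.
const secondPass = gc(pixels);
```

The colors and IDs in the example data are made up for illustration. The first pass removes only `Pixel:old1`, because `Cheek:1` is treated as an extra root; the second pass, with nothing retained, removes `Cheek:1`.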
And the odd thing I noticed when I was working with this is that if you ever use writeFragment or writeQuery, the objects you write to directly also get retained automatically. So if you were writing to a whole bunch of random objects in your cache, running the garbage collector, and noticing they're not getting collected, it's because they were all retained by default, since you altered them directly. If you want to get rid of those, you can release them after the fact.
So this is more or less what it looks like. Let's say we wanted to retain the cheek. I think that's what I called it over here, right? Yeah. The garbage collector will go through and visit all the nodes it knows about, then it will visit all the nodes you've referenced in your retention, and boom, it will not get rid of those bad boys.
The last part is that the garbage collector isn't going to get rid of any object that is still reachable. But you might want to get rid of those manually. We talked about this before: adding and removing things is something you have to do manually, instead of it happening automatically when you re-query. That's where the eviction API comes in. You can evict a whole object from the cache directly if you want to. The only issue is, let's say you got rid of that root that was referencing all of those other nodes we saw before. Nothing has a reference to those objects anymore, theoretically, once you've removed that top-level object. So if you're ever going to use this, you have to run the garbage collector afterward to get rid of all the objects that can no longer be reached.
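The evict-then-collect sequence can be sketched on a toy flat store. As before, this is an illustration and not Apollo's code: `refsOf`, `evict`, and `sweep` are hypothetical helpers, and the `Picture` and `Pixel` entries are invented sample data. In Apollo Client the equivalent would be `cache.evict({ id })` followed by `cache.gc()`.

```typescript
type Store = Map<string, Record<string, unknown>>;

// Lists the IDs referenced (via { __ref }) inside a stored value.
function refsOf(value: unknown): string[] {
  if (Array.isArray(value)) {
    let found: string[] = [];
    for (const item of value) found = found.concat(refsOf(item));
    return found;
  }
  if (typeof value !== "object" || value === null) return [];
  const obj = value as Record<string, unknown>;
  const ref = obj["__ref"];
  if (typeof ref === "string") return [ref];
  let found: string[] = [];
  for (const key of Object.keys(obj)) found = found.concat(refsOf(obj[key]));
  return found;
}

// Evicting removes exactly one entry; its children are left untouched.
function evict(store: Store, id: string): boolean {
  return store.delete(id);
}

// Removes every entry that can no longer be reached from ROOT_QUERY.
function sweep(store: Store): string[] {
  const reachable = new Set<string>();
  const pending: string[] = ["ROOT_QUERY"];
  while (pending.length > 0) {
    const id = pending.pop() as string;
    if (reachable.has(id)) continue;
    reachable.add(id);
    const entry = store.get(id);
    if (entry) for (const ref of refsOf(entry)) pending.push(ref);
  }
  const removed: string[] = [];
  for (const id of Array.from(store.keys())) {
    if (!reachable.has(id)) {
      store.delete(id);
      removed.push(id);
    }
  }
  return removed;
}

const cache: Store = new Map();
cache.set("ROOT_QUERY", { picture: { __ref: "Picture:1" } });
cache.set("Picture:1", { pixels: [{ __ref: "Pixel:1" }, { __ref: "Pixel:2" }] });
cache.set("Pixel:1", { color: "yellow" });
cache.set("Pixel:2", { color: "black" });

const wasEvicted = evict(cache, "Picture:1");
const sizeAfterEvict = cache.size;   // the pixels are still stored
const orphans = sweep(cache);        // the pixels are now unreachable
```

Eviction alone leaves the two pixel entries in the store; they only disappear once the sweep runs, which is why the collector has to be called after evicting a parent object.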
And then you can go a step further and evict specific fields if you'd like to, which gives you a lot more power to update queries after a change comes in.
And there you have it. We went through the life cycle of an object, from fetching it, to normalizing it inside the cache, to how we would update it or how it would get updated automatically, and then finally to collecting and evicting it. I couldn't write the last line because it doesn't really work back into fetching, so I felt really awkward about that. I was going to show the demo, but I don't actually have time for that. I did link to all of the really ugly demos that I wrote, and they have examples of everything I was talking about in here, so if anyone wants to go check those out, you totally can. And then shameless plug. Thank you.
Thank you, Raman. Let's ask some questions. I call my Uber pickups garbage collection. All right. Oh, my God, so many. That was great. Thank you very much. All right. The people want to know: what fetch policy do you recommend using for a standard single-page app?
The default one. Cache-first, honestly. I think that's the best way to go.