But let's see how that works with DynamoDB. We want to store coffees, and we also want to store some things about our customers, such as profiles and maybe a cart for a customer who is shopping right now. And we want to store orders for our customers. We can have different tables for that: for example, one table for coffees, one for customer profiles, one for customer carts, and one for orders, and that will work fine. But in some cases that will not be efficient enough.
If we put everything in a single table, it can look something like this. This pk is basically our partition key. This is our sort key. And then we have some data, which is not important right now, and we can have some global and local secondary indexes. As I said, everything with the same partition key will be stored on the same partition, on the same physical disk. So coffee and coffee will be on the same machine, and the three customer 123 items will be on the same machine.
If I use a query to get all the coffees, I can just tell DynamoDB to return everything with the partition key coffee, and it will return these two items. But I can also really efficiently say, for example, return only the coffees from Brazil. The partition key is coffee (I always need to provide the full partition key), and for the sort key I can tell DynamoDB to give me everything that starts with country#Brazil. So it will give me just this one, but if I have multiple coffees from Brazil, it will give me all of them.
Again, if I want to get, for example, a customer profile page, I can easily say, give me customer 123, by providing just the partition key. It will return the profile information, cart, and orders. But if I want only the profile for one customer, I can say, give me customer 123 with the sort key profile, and it will get me just this one line. Or if I want orders, I can say, give me customer 123 with orders from a certain date, and so on. So as you can see, with different keys we can support really different access patterns. And this is cool because if I use a query to get a few different things, it will be able to return everything related to this one customer in milliseconds, which is really, really fast.
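To make the key conditions above concrete, here is a minimal in-memory sketch in Python. It is not the DynamoDB API, only a model of what a query does: an exact match on the partition key plus an optional "starts with" condition on the sort key. The item values, the `customer#123` key format, and the `order#<date>#<id>` sort key layout are assumptions made up for illustration, not the contents of the slide.

```python
# In-memory stand-in for the single table; all item values are illustrative.
TABLE = [
    {"pk": "coffee", "sk": "country#Brazil#santos"},
    {"pk": "coffee", "sk": "country#Colombia#supremo"},
    {"pk": "customer#123", "sk": "profile"},
    {"pk": "customer#123", "sk": "cart"},
    {"pk": "customer#123", "sk": "order#2021-03-01#1001"},
    {"pk": "customer#456", "sk": "profile"},
]


def query(pk, sk_prefix=""):
    """Model of a query: exact partition key, optional sort key prefix.

    Results come back ordered by sort key, as they would within one partition.
    """
    matches = [
        item for item in TABLE
        if item["pk"] == pk and item["sk"].startswith(sk_prefix)
    ]
    return sorted(matches, key=lambda item: item["sk"])


all_coffees = query("coffee")                          # every coffee
brazilian = query("coffee", "country#Brazil")          # only coffees from Brazil
customer_page = query("customer#123")                  # profile, cart and orders
customer_orders = query("customer#123", "order#")      # only the orders
```

Note what the sketch cannot do: there is no way to ask for coffees by an attribute that is not part of the key, because both conditions work only on `pk` and `sk`.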
But of course there are some limitations. I'm not able to tell DynamoDB, give me just the coffees named Colombia, or something like that. Does this sound clear? Does it sound weird? Actually, I have an idea. Let's see who has experience with DynamoDB. I think I have this one, yeah. Let's see. How do I zoom out of this? Yeah, like this. So most people have never used DynamoDB, and that's what I expected. Is there anyone who has used a single-table layout for DynamoDB? You can type in the chat, or feel free to unmute and tell me.
Oh, can we take a quick break?
Of course we can. Let's take a five-minute break and continue at 17:30 my time, so in seven minutes. Is that okay for everyone?
That's good.
Okay, I'll stay here. So feel free to ask any questions in the meantime, but we won't continue from here.
Can I ask you a quick question about the schema above? The slide above... Just scroll up a little bit, where you are deciding on your partition key and sort key. Because you mentioned that...
This one?
No, go up, just go up a bit.
Oh, this one.
That one, yeah. So let's say, for example, that in my front end I want to search by topic. Would that mean that I have to rebuild my database, because I haven't thought ahead and set topic as an index key?
So, long story. Actually, the short answer is yes. But you don't need to rebuild everything. You can't add local secondary indexes without rebuilding the database, but fortunately you can add global secondary indexes without rebuilding it. So you can easily add a global secondary index on, for example, album or whatever here, and that will become your index. But again, if you want a full-text search, you can do even that with DynamoDB. It's a bit more complex, but we do that for users in Vacation Tracker.
What we do right now for Vacation Tracker: the partition key is a company ID. We have a special table built for the user index because we use CQRS. We have an event table that just builds some read-only tables that are temporarily built for us. So the partition key is the company ID, and I'm not able to see any company's data except my own. The sort key is basically something like active users. Then I can say, give me company 123, let's say Netflix, give me the active users from Netflix, and then use a filter on one other field that we have, which basically contains the username, email, and a few other things. The filter does a text search on that field, so I can say, give me just the users with the name, let's say, Slobodan, or users that contain the letter S. It's a bit more complex, but it works really fast.
Because, you know, coming from a relational database, you don't really expect to have to guess ahead of time which columns you can search, because you should be able to...
Of course, of course.
So what you're saying is that with this pattern, once you've set up the data, you don't have to worry if a requirement comes down the line; you can just use this pattern to handle it. Is that what you're saying?
So basically, yeah. If you decide to use a single-table layout, which of course many people don't, it's really good practice to try to understand your access patterns a bit in advance, and even to build in some patterns that you're not using right now but might use at some point. But if you really need to add something down the line, you can easily create another table, or you can add a global secondary index. You can have several global secondary indexes that cover some of these situations.
I see. So you're able to patch everything down the line.
Of course, there's one more thing that you can do.
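The user search described above (company as partition key, a sort key prefix for active users, then a filter on a combined text field) could be expressed as a DynamoDB Query request roughly like this. It is only a sketch: the table name `user-index`, the `ACTIVE#` sort key prefix, the `searchText` attribute, and the `company#netflix` key are hypothetical names, not the actual Vacation Tracker schema. The function only builds the request parameters, so it runs without AWS access.

```python
def build_user_search_query(company_id, name_fragment,
                            table_name="user-index", status_prefix="ACTIVE#"):
    """Build keyword arguments for the low-level DynamoDB Query operation.

    All table and attribute names here are assumptions for illustration.
    """
    return {
        "TableName": table_name,
        # Key condition: exact partition key plus a sort key prefix.
        "KeyConditionExpression": "pk = :company AND begins_with(sk, :status)",
        # Filter: evaluated on the items the key condition matched,
        # before results are returned to the caller.
        "FilterExpression": "contains(searchText, :fragment)",
        "ExpressionAttributeValues": {
            ":company": {"S": company_id},
            ":status": {"S": status_prefix},
            ":fragment": {"S": name_fragment},
        },
    }


params = build_user_search_query("company#netflix", "slobodan")
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("dynamodb").query(**params)
```

Two things to keep in mind with this approach. `contains` is case-sensitive, so the sketch assumes the search field is stored in lowercase. And because the filter runs after the key condition, DynamoDB still reads every active user of that company; the filter only reduces what is sent back, which is why scoping by partition key first matters.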
You can rebuild the database and just fill it with the data again, which you don't want to do that often. So the easiest way... actually, it's not easy, but the best way to handle this is to spend some time trying to understand all the patterns here, and to design your database to support even patterns that you don't have at the moment. Then you'll be able to adjust easily to new situations. If there are situations that you're not able to adjust to, you can add a global secondary index without rebuilding the database. It will just take the data from this database and create a copy somewhere in the background, which is basically that global secondary index, and then you'll be able to use a different partition key and a different sort key for that index.
Right. So it would depend on the performance characteristics then, I suppose.
Yeah, yeah.
On how many times that route is used. That's probably how you would decide whether you have to rebuild the database or just add a global secondary index, I think.
Yeah, but a global secondary index is still really fast. It just gives you a different partition key and sort key. Just imagine a copy of this database in some other location. So it will allow you to do that. Sorry, there's one more question, from Vitaly. I hope Vitaly is still here. Actually, I'll wait two more minutes to be sure that Vitaly is here, and then I'll answer the question.
Yeah, I'm here.
Oh, you're here. Perfect. So the question is: what is the benefit of having one table for all data rather than different tables for each kind of data? Couldn't it be a performance issue if a single table grows too large? So basically, if we go back here, AWS guarantees us that as long as we have partition keys that are evenly distributed, for example UUIDs for customers or something like that, this will be really performant, because it will just make sure that each partition is on a separate machine.
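A rough way to see why evenly distributed partition keys matter: DynamoDB feeds the partition key into an internal hash function to decide where an item lives. The sketch below imitates that idea only. The MD5 hash and the fixed count of four partitions are stand-ins chosen for illustration, not how DynamoDB actually hashes or sizes partitions.

```python
import hashlib
import uuid
from collections import Counter


def partition_for(partition_key, partition_count=4):
    """Map a partition key to a partition number with a stand-in hash (MD5)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def spread(keys, partition_count=4):
    """Count how many items land on each partition."""
    return Counter(partition_for(key, partition_count) for key in keys)


# Many distinct keys (UUID-based customer IDs) spread across all partitions.
customer_keys = ["customer#" + str(uuid.uuid4()) for _ in range(4000)]
even_load = spread(customer_keys)

# One repeated key always hashes to the same partition.
hot_load = spread(["coffee"] * 4000)
```

In this model `even_load` ends up with roughly a quarter of the items on each partition, while `hot_load` puts all 4000 items on one. That is the trade-off behind the answer above: a single table stays fast as it grows when its traffic is spread over many partition key values.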