Relational Database Modeling for GraphQL


In this workshop we'll dig deeper into data modeling. We'll start with a discussion about various database types and how they map to GraphQL. Once that groundwork is laid out, the focus will shift to specific types of databases and how to build data models that work best for GraphQL within various scenarios.


Table of contents

Part 1 - Hour 1

      a. Relational Database Data Modeling

      b. Comparing Relational and NoSQL Databases

      c. GraphQL with the Database in mind

Part 2 - Hour 2

      a. Designing Relational Data Models

      b. Relationships, Building Multi-join Tables

      c. GraphQL & Relational Data Modeling Query Complexities


Prerequisites

      a. Data modeling tool. The trainer will be using dbdiagram

      b. Postgres, albeit no need to install this locally, as I'll be using a Postgres Docker image from Docker Hub for all examples

      c. Hasura


106 min
15 Jul, 2021


Video Summary and Transcription

This workshop covers relational modeling with GraphQL, including building a database schema, designing relationships, and using tools like dbDiagram and Hasura. It explores the concepts of relational databases, tables, and relationships, as well as the use of UUIDs and auto increments. The workshop also covers tracking foreign key relationships, setting up migrations, and optimizing queries for performance. Additionally, it discusses the capabilities of Postgres, tips for database design, and the use of schemas. Overall, the workshop provides valuable insights into relational modeling and database development with GraphQL.

1. Introduction to Relational Modeling with GraphQL

Short description:

This is relational modeling with GraphQL in mind. In part one we'll build out an actual database schema in dbDiagram, then take a short break before part two, which gets into relational data models, multi-join queries, and how a tool like Hasura maps the relationships in the underlying database into GraphQL queries. We'll work primarily in Hasura's GraphQL and data sections, with a brief look at the Hasura CLI for migrations and automation.

Welcome to today's workshop. I am Adrian. This is relational modeling with GraphQL in mind. So there are a few prerequisites that I pointed out. Of course, nobody has to follow along. However, I have this material very much oriented toward speaking about particular topics and then actually working through the content with all of you.

So there's gonna be some coding, there's gonna be some design, and then there's going to actually be some spinning up of services and things like that. So if you do want to follow along to get the most from this workshop, you can absolutely do that. And I put here that I'll be using a Postgres Docker image a little bit, but I have actually made it so that I don't even need to use that. The only thing that you'll need is an account, which you can sign up for even right now, it's pretty quick to be able to do so, with dbDiagram and Hasura and prospectively Heroku. Heroku will be providing the Postgres database backend for Hasura when I get to that part of the workshop. And there will be time to sign up for each of those quickly if you haven't done so yet.

I've split this into two parts. We'll go through the first part, get a database schema built, design out some things, thank you Daria, and then we will take a short break, basically a bio break, before getting into part two. So part one, again, relational modeling, we'll build ourselves out an actual schema. Something kind of like this. That's dbDiagram by the way. So that's what we'll be working in. And then in part two, we'll get more into the relational data models. What we get when we do like multi-joined queries and the relationships within the database, but also from what we would get using a tool like Hasura, which I'll be using to show how it sees relationships and how it builds those together for the GraphQL queries based on the relationships and the modeling of the underlying database. And that is what the Hasura tool looks like. We'll be using primarily the GraphQL section, which makes it super easy to build out the GraphQL queries and we'll look at some of the other parts of it, like the Explorer, the CodeExporter, things like that. We'll also be looking at the data section to actually interact directly with the database in some of the design and everything. And to execute some of our initial SQL that we'll write out. And I may use the Hasura CLI a little bit. It's a great tool to get, if you're gonna use Hasura and you're gonna build yourself the development workflow around it, definitely get the CLI too. It's got a lot of great features around doing migrations, filling out the metadata, and just kind of working with and interacting through an automated way, or being able to build an automated pathway for your tooling. And then we'll move on to the next part.

2. Introduction to Relational Concepts

Short description:

We'll start with relational concepts and briefly mention NoSQL databases. The focus will be on relational databases, but I'll touch on other types as well. If you're interested in learning more, follow me and sign up for future workshops and events. I have experience with various database types.

All right, we're gonna start off with relational concepts. I will mention a few things about NoSQL databases. Primarily though, this entire workshop is going to be relational database oriented. So focused on the ideas behind that. But I'll mention some of the others because they are quite prevalent these days. And there's a lot of shared concepts between a lot of these. And it helps to know the difference between the concepts there versus the concepts in the relational model. So I'll mention that in a few parts.

And those things that I mentioned primarily are only focused around the columnar and document based distributed databases like Mongo and Apache Cassandra. And I'll mention a little bit about graph databases like Neo4j or something like that. Neo4j, Titan and the other ones. Not gonna bring up Time Series. I'm not gonna bring up some of the other NoSQL niche databases at this point. However, if you are curious, be sure to follow me and sign up for future workshops and events at Hasura and under my accounts which I will provide at the end of this. So no reason to worry about those right now. I do a lot of streaming and other things like that around all sorts of database types. So I've done a lot of work in that, love to share it.

3. Relational Modeling and Data Model Building

Short description:

In relational databases, tables are used to store data and are connected through relationships. These relationships can be one-to-many, many-to-many, or recursive. The tool dbDiagram is used to visualize and build these relationships. It allows for the creation of primary and foreign keys to establish connections between tables. Recursive relationships can be simulated in a relational database by relating a table back to itself. Building a data model involves designing the structure of the tables and their relationships. In dbDiagram, sample diagrams can be loaded as a reference. The tool provides information on how to draw relationships and build tables. It also demonstrates the use of different data types and column names.

All right, so starting off in the relational world to cover the standard way in which things are designed for relational databases. We have a schema design which lays out the relations between tables. Now in a relational database, these are the tables, the core elements within the database that are going to be used to store data. You can think of a table almost like a spreadsheet. It's gonna store all of the things that we will put in and it does so by splitting things up into columns.

So like the source table here has an ID, a date stamp, name, URI, details. And they have a data type, timestamp, text, et cetera. The tables then have relationships which are of various types. In a relational database, it's primarily you have a one to many relationship or many to many relationship or you have a recursive relationship, i.e. a table back to itself. If you look at the NoteJot table there and you see on the far side that's actually a recursive relationship line being drawn back to it. If you're ever working with graph data, graph databases, you will notice that graph data is oriented toward the edges and nodes of the data and how it connects.

So the relationships vary and can have many different types of ways in which the relationships occur beyond just one to many and many to many, things like that. But in a relational database, it's focused around related one to many, many to many or many to one, depending on which way you look at the relationship. Here, I've connected these tables where the source ID, you'll see source ID and the source notes here. It's connected back to the ID here to draw that relationship. And in the tool that we'll be using, dbDiagram, the primary key side of the line is designated with a one and the foreign key side with an asterisk, which makes this a one to many relationship between the tables. There's another where I pointed out directly what that is, so this note jot goes one to many to source notes. And what I have created in this particular situation shows a many to many relationship. Because of that middle table, I can relate source to note jots and there can be one or many sources for each note jot or one or many note jots per source. And that's what a many to many is, where you can relate that back and forth, but there's no limit to how many of the thing can be related to the other thing. With the recursive relationship, drawn on note jot here, it means that a note, which I've called note jot, can be related back to another note, which would make whatever it's related back to the parent of that note. And the ones that are related to that parent would be the children of that note. And in that sense, you can draw a recursive relationship with that data. And the easiest way that I've always found that's shown in computers all the time of a recursive relationship, just think of a folder tree within your directory structure on your computer. That is a recursive relationship. You can have as many folders as you want, as many files as you want in the folders, well, generally speaking, and it just nests inwards as you go. So that's a recursive relationship. And you can simulate the same kind of thing in a relational database by doing the same thing there.
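To make those three shapes concrete, here's a minimal Postgres sketch of the same idea (the names follow the workshop's source, note jot, and source notes tables, pared down to just the keys):

```sql
CREATE TABLE source (
  id uuid PRIMARY KEY
);

CREATE TABLE note_jot (
  id      uuid PRIMARY KEY,
  -- Recursive: a note can point back at a parent note in the same table.
  note_id uuid REFERENCES note_jot (id)
);

-- The join table in the middle is what makes source <-> note_jot many to many:
-- each row pairs one source with one note, and there's no limit on the pairs.
CREATE TABLE source_notes (
  source_id uuid REFERENCES source (id),   -- one source, many source_notes rows
  note_id   uuid REFERENCES note_jot (id)  -- one note, many source_notes rows
);
```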

All right, so building a data model. This is what I'm gonna build, right? So somewhat of a minimal database. We were just looking at these three tables, source, sourcenotes, and notejot. But here there's a parent table called connection, and then three others called formatter, schema, and action. And I'm gonna show you in dbDiagram how to build a structure like this. And we'll talk through some of the particular pieces of this model and why too. I'll elaborate on why I picked certain data types, certain column names, et cetera, as we go through this. Here's what we're gonna do. All right, so open up dbDiagram if you haven't already. When you go to the site at first, you'll either end up here and you'll need to create an account. So go ahead and do that if you haven't already. If you have, when you click the Go To app, it should shoot you right over here like this. Generally speaking, you saw that like ghosted schema that it showed. A great way to work with dbDiagram is to use one of the reference diagrams that it gives you. And if you scroll over here to this dropdown and just click Load Sample, it'll load the sample for you. And it gives you information in the markup over on the left-hand side of the screen here about how to draw out all the relationships and build the tables here in this tool. You can see here's a primary key with an auto increment value. They're using a var char here, timestamp here. And I'll get to more of that in just a minute as I start to work through this. And it shows some of the references. This right here is a many to many relationship between, what is it? The country code to countries and then merchants to countries.

4. Building Source Table with UUID

Short description:

Merchants can have many countries, countries can have many merchants. We'll use a UUID as the primary key to ensure uniqueness. Adding a timestamp provides audit capabilities. We'll also include columns for name, URI, and details. Source notes will have a source ID that references another table. UUIDs are faster than auto increment for large volumes of data inserts.

So merchants can have many countries, countries can have many merchants and their respective country codes, et cetera. And there's some other reference points in here also. So this is a great tool and we're gonna be using it to build out our initial schema.

So let's do that. I'm gonna just click on the dropdown and click on new diagram. And then over here, I'm going to start my first table like this. And I'm just gonna call it source, open close brace. And then let's see if we'll go ID UUID. So a UUID is a 128-bit (16-byte) value. It looks, usually it's represented with dashes in it to make it a little bit more readable. Every database has it. It's one way to ensure that your ID is unique. And unlike the auto increment where you can run into conflicts because it will go one, two, three, four, five, and then if more than one person is trying to insert something, you may run into a conflict where somehow they get the five and then five gets overwritten, or five gets deleted, and then someone else tries to add it and there might be a collision between those because they're not particularly unique numbers. A UUID, however, if you use a UUID generator, which every database has a function or something that will allow you to create it automatically. And most codebases, whether you're using JavaScript or Python or whatever, there's something that will enable you to create a UUID. So you can insert without worry of conflict and you can reference without worry of running into, needing to deduplicate if for some reason like somebody turns off the key around the UUID that keeps it unique. So in this case, I'm gonna start off right off the bat, setting this UUID as the primary key, pk. Then I'm gonna add a date stamp. So there's something very important to data modeling here: it's a good idea to have a default set of columns that are basically duplicated across all tables that offer you audit capabilities. Especially when working in the enterprise environment where audits happen on a semi-regular basis, it's always great to have certain things that the audit will require, such as a timestamp to represent when the data was last changed and other pertinent information that may be required by the audit. Like the user who made the last change. And it gives a path, a trail that the audit can be done on so that you know all of the things that have changed or altered or manipulated the data in any way throughout the database. Some of the default things that I always add is timestamp and often, I'll put in the system user, not particularly the database user, but the system user. Because a lot of the time the database system will keep track of the user that's changing the data anyway. So if you wanna make it easy to be queryable in the tables themselves, you wanna add a timestamp, you wanna add a user, maybe the system user or the database user depending on whatever your requirements are. So in this case, I'm just keeping it simple and I'm just gonna have the timestamp, not gonna get into the users and everything. So that's what that is for. And again, the database has a function which we can use to have it timestamp it; it puts the timestamp in the column. So we don't have to like manually enter that or derive it from code or anything like that. It can be overridden, but the database can do it for us too. I think I answered the UUID question, Pravin. So if you have any other questions about that, feel free to ask and I will get into that more. And we'll also look at an example. So beyond just talking about this, you'll see the database generate one when we enter some data into the database shortly. All right, so the next thing in source, I wanna do name, so put text and then URI.
So like the name of the source, I'm thinking of this table as something that provides sources of where data is at or a reference to data, where data is at. Think of it as something where you had to look up in a dictionary or in a library or something like that, but in this case, you know, it's the internet, you've looked it up on the internet and you want the URI, so the actual location, the Uniform Resource Identifier, and you want a name to be able to give it a friendly name that us humans can read and understand what it is, and then finally, I'm gonna add a details column, and details will basically be something like just notes that we might keep about a particular source that we've collected for some reason. Like if I find an awesome page on GraphQL and React or something like that, I'd put the URI in there, add a friendly name, and then add more details about what it is specifically. So basically a very glorified table of bookmarks.
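As a rough Postgres equivalent of the source table being sketched in dbDiagram here (the defaults are my addition at this stage; the workshop wires them up later in the Hasura console):

```sql
CREATE TABLE source (
  -- gen_random_uuid() is built in from Postgres 13; older versions need pgcrypto.
  id      uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  stamp   timestamptz DEFAULT now(),  -- audit column: when the row last changed
  name    text,                       -- friendly, human-readable label
  uri     text,                       -- where the source actually lives
  details text                        -- free-form notes about the source
);
```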

Then let's add one for source notes like this, and I'll add a source ID. So that source ID, the reason I named it that, because it's not the ID of the table. It's the ID that references back to another table that this table will be related to. So I'll do a UUID because it needs to be the same as the table that it references back to, and then I'm gonna add a reference. In dbDiagram, that's ref, colon, chevron, the name of the table that it's related back to, .id, the column that it's related to. So there's that, and the details text, and then stamp, I'm gonna have timestamp for this table too. There was a question by Nebertec, what is faster, UUID or auto increment? If you're talking about a large volume of data being inserted, you're gonna win out with UUIDs, and I'll tell you a reason. If you have one system, only one system, with just that one system's throughput writing inserts to a database with auto increments, it can be pretty fast, and auto increments can be fast in all sense of reality. However, often if you have a lot of inserts, you also have a lot of clients trying to do the inserts. They're going to get very slow with auto increments because they are gonna run into collisions, and they're gonna have other conflicts come up as you work through these inserts.

5. UUIDs vs Auto Increments

Short description:

Auto increments can become slow and cause conflicts in inserts, updates, and deletes. UUIDs eliminate conflicts but may complicate the client-side code. Performance-wise, having the client generate UUIDs is the fastest option. For multiple clients writing data, UUIDs are recommended. The difference in performance between UUID and auto increment functions is negligible. Consider batch inserts for large data volumes.

They're going to get very slow with auto increments because they are gonna run into collisions, and they're gonna have other conflicts come up as you work through these inserts. If you also have updates and deletes, even more conflicts. UUIDs are not gonna have the conflicts. If you're really worried about inserts, though, don't have the database call any functions to create UUIDs or to do auto increments and create the unique identifier purely on the client side. Make the client do that work. However, again, that can be complicated in nature because you're again, taking something that is going to be a database reference specifically, the UUID, and primary key, foreign key relationships. And you're putting the onus onto the client, which is not the database. It doesn't know why it would be creating this. You remove a lot of the conflicts and issues at the database level, but you also remove some of the database's responsibility. I like to have the database use a database function to create a UUID, but sometimes you do need to have the client code or the client, whatever is creating this data, you need it to actually create the UUID. But as far as performance goes, the fastest way is to not have the database do it. If you have one client writing data only, auto increment is pretty fast. If you have multiple clients writing the data, you need to move to UUIDs regardless of the performance. The other thing I would add is the difference between a UUID function and an auto increment function in almost every single relational database is so negligible that it shouldn't harm your inserts. The other thing to do is to take into account like batch inserts, if you actually have large volumes of data to insert into the database. So that's some of the things to take into account there. Unfortunately, the answer isn't straightforward, it depends on how you're going to write that data into the database.
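A small sketch of the two styles being weighed here (assuming a source table with a uuid primary key; the values are made up):

```sql
-- Database-generated key: the client has to wait for the id to come back
-- (RETURNING) before it can insert any rows that reference it.
INSERT INTO source (name, uri)
VALUES ('Hasura docs', 'https://hasura.io/docs')
RETURNING id;

-- Client-generated key: the UUID is created in application code and sent
-- with the insert, so the client already knows it for any related rows.
INSERT INTO source (id, name, uri)
VALUES ('7f1d9a4e-2b7c-4a8f-9c3e-5d6a1b2c3d4e', 'Hasura docs', 'https://hasura.io/docs');
```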

6. Building Source and Connection Tables

Short description:

Here's the source and source nodes. We'll add note jot with an ID as the primary key. We'll also add a timestamp and the note ID. This creates a recursive relationship back to the table itself. We can add details to both the note and the source. Next, we'll create the connection. The connection will link sources to specific actions, such as FTP or REST API locations. We'll add tables for action, format, and schema. Finally, we'll relate the connection back to the source and note tables.

All right. So here's the source and source notes, and now I'm gonna get the note taker thing there. We're gonna add note jot like this. And again, we'll have an ID in there, UUID, and we'll make it the primary key, all right? And then add a stamp again. We'll make that timestamp and let's add the note ID. And this is that elephant ear, that recursive relationship back to the table itself, right? Note jot dot ID. And you see it pop into the diagram there to the right. Really cool feature of this is that it draws it as you do this. Let me add details. And then let's go back to source notes, and we'll relate this. Let's put note ID, UUID, reference, note jot dot ID, oops, ID. So now, let's move this around so we can look at it better. You can see we have that many to many relationship and that recursive one over here, right? So that was the first part of the schema that we looked at when I discussed the specific details. Well, the details for the note would be like the body of the note itself. Like if you wanted to have a recursive, wide range of notes around a particular source, right? And in the source, the details is specific to just the correlative thing that's held in that table, in this case the URI only. And also, it's just an example. I probably wouldn't have details in both unless there was very specific reason to have them tightly coupled to each of those entities like that. But in this case, it just seemed like an easy thing to add that could be of use as this is fleshed out. Like when someone starts to do application development against it, they could prospectively use the details in both of those related specifically to the note or the source, right? All right, so next thing, I wanna create the connection. And let's see, we'll do ID, UUID, make that primary key. Then we're gonna have a stamp again. Another timestamp, because, as again, with timestamps, it's for any updates or changes to the data at all, right? Stamp, and then let's see, here we want, we're gonna do two other things. So the connection, just to describe what I'm thinking with the connection, is these things here, these three tables, we'll change their color. Let's change them to this color. You can see in the markup, they change it over here like this, the head color. Just so we can see these differentiated from what I'm about to do. So this is just purely the source and the notes about the source that are collected. With the connection, I'm gonna connect the sources to particular actions. So the idea is, if the source is let's say, not just a webpage, but an FTP location, like a file transfer location, or some other REST API location or a GraphQL location, I could set up actions, right? And other things, like let's say, tell it what type of format it is. Is it XML or JSON or something like that? Or is there a schema to the data that it could commit an action against and pull down from that particular source? So with the connection, I'll add a stamp. And then there's gonna be an action, a format and a schema. So let's add the tables for that. And then we'll go back and we'll add the ID. So we'll go back and we'll add the IDs to this. So first, let's see, whoops, table, schema. And again, ID, just like everywhere. We'll go PK. And then we'll go table format, formatter. I'm just gonna copy this, put it in there since it's the same in all. And then table, what else did I say? Oh yeah, action. I wanna do an action. Like that, okay. So those are there. And now we can relate these back to here with...
Let's see, for the connection we wanna do... How do I wanna do this? I think what we'll do is we'll do an action ID and then we'll do a source ID. Let's, ID, we'll need that. Oh, changing the color's a pro feature. I must have been, what's the term? Grandfathered in or whatever. I did not, I don't, I'm not paying.

7. Exploring Audit Trail and Data Flow

Short description:

Yeah, I have the upgrade button. Nicks, this is true, however you can deduce that 41 likely exists, but it may not. It's not super reliable. UUID doesn't fix this, but in an audit, one needs to take that into account. The best way to get a data flow is to have a timestamp and additional data and information about the table and the data. Note jot is called so to remove conflicts with reserved words. We got these in there, schema, timestamp, connection ID, formatter map, JSON as the data type. And that's it.

Yeah, I have the upgrade button. So I don't know. I must've gotten pulled in there somehow. Sorry about that. Reading the questions real quick. Give me just a second. Nicks, this is true, however you can deduce that 41 likely exists, but as you stated, emphasis on likely exists, but it may not. It may not have even ever existed. This is a thing that we don't know because the database may have killed that and jumped to the next increment. So even though it seems like a good thing for an audit trail, it's not super reliable.

It's not super reliable, and I'm not saying that UUID fixes this as per your reference there, but in an audit, one definitely needs to take that into account from the audit perspective. That just because there's an auto increment doesn't mean that there's always the next succession of numbers in order. The database could have skipped things for various reasons, there could have been conflicts and it skipped them, and it never got that entered, a whole bunch of things. A database administrator could have gone in and said skip a bunch of things, start incrementing from 4,000 when it only got up to like 399 or something. So from the audit perspective, one definitely needs to take into account that there's not really a smooth back and forth, like there's no linked list of the data flow. The best way to get a data flow is you're gonna have to have a timestamp, you're gonna have to have additional data and information and logs about the table and the data in the table. But that's a good point to bring up, good call out there.

Alright, so let's see, I put that over here, here, and here and then let's get this in. So this is ref, this can be action.id. And this is gonna be ref. Source.id. Now as you can see, my naming generally follows an approach with IDs where I can easily ascertain what the relationship is without looking at the physical relationship drawn in the database. I can just look at the name and know that action.id points to the action table. I can look at source.id and know that it points at the source table. The only one that's a little weird is note jot. The reason I called it note jot is because one might jot down a note, which is kind of a slang way to say it, but note however is often in most databases a reserved word, I have found. So you can't really call something a note as the table name, at least in some databases. I don't know about all of them. In that case I call the table note jot to remove that conflict, but I routinely refer to that table when I'm calling a relationship just as note dot something. Because at the core, it's just intended to store notes that someone would write about a particular URI. Alright, so we got these in there, schema, let's see here, what else do we want in there? We want, basically we'll do the standard time stamp real quick. Woops, time stamp. And then we'll have the connection ID, we want to relate back to that. Like so. And then, let's do a formatter map. And then because this is gonna be backed by Postgres, we'll do JSON as the data type. So that's just an open-ended, it's gonna describe the format or the schema I mean in whatever way it needs to so that we can know what the schema is of the underlying data that's in the URI or that would be pulled from the URI. All right, what is it not like about my thing here? Nothing like errors while you're doing stuff live. Let's just release it for a second. And then connection. Connection ID. Oh, I know what I did. I didn't set the type derp. There we go, okay, all fixed. And then for Formatter, we'll go with stamp again. And then I think we'll need the connection ID there too. So I'll just copy that in and then we'll do Formatter. We'll do a map of that too, just make it a JSON map basically. Let's see here, and then action. Add Stamp.
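As a rough sketch, the connection side of the model at this point might export to Postgres along these lines (the json column names are my guesses at what's being typed; the exact dbDiagram export will differ in quoting and ordering):

```sql
CREATE TABLE action (
  id    uuid PRIMARY KEY,
  stamp timestamptz
);

CREATE TABLE connection (
  id        uuid PRIMARY KEY,
  stamp     timestamptz,
  action_id uuid REFERENCES action (id),  -- ref: > action.id
  source_id uuid REFERENCES source (id)   -- ref: > source.id
);

CREATE TABLE "schema" (
  id            uuid PRIMARY KEY,               -- quoted since schema is also a keyword
  stamp         timestamptz,
  connection_id uuid REFERENCES connection (id),
  map           json  -- open-ended description of the underlying data's shape
);

CREATE TABLE formatter (
  id            uuid PRIMARY KEY,
  stamp         timestamptz,
  connection_id uuid REFERENCES connection (id),
  map           json  -- open-ended description of the format (XML, JSON, etc.)
);
```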

QnA

Building Data Model and Answering Questions

Short description:

In this part, we discuss the need for an action ID and the one-to-many relationship between connections and actions. We also encourage participants to build their own data model and take a break before starting part two. We answer questions about naming conventions, ERD diagramming tools, and sharing and iterating on data models with the team.

Do we need the action ID? No, because multiple actions can take place on a connection. So, per the relationship in connection, it will be a one-to-many relationship to action. So that should give us what we need there. Hit an action. Oh yeah, I need to declare something to be the action. So we'll make it JSON too. So then we can define actions and really extend and elaborate on what that would be. Because I don't really know what the client system would be like here. This is just, again, kind of an idea of how to build out a model for a system that would prospectively do all these things. So there we go. We have, we have that data model that I had previously showed in the slides. So this is how you would build that out. Now, the next big thing is for all of you to build out a model yourselves that we will then use momentarily as we spin it up in the Hasura product. So everybody take 10 minutes, take a bio break, grab a drink real quick, and also begin to if you haven't already begin to build out a data model which has at least some primary key, foreign key and some many to many relationships as we have done here. You could, if you want basically just use this and I'll make it super easy. Let's see here, I will paste it in the discord. Boom, there it is. So if you want to grab that from discord you can have this exact data model or you can build out whatever you want. So take 10 minutes, it is currently 9.38. We will come back at 9.48 and begin part two. Also in this interlude, if you have any further questions or thoughts about this or questions about how to model these things please ask those questions and I'll cover anything that comes up whenever I get back. So I will be back in exactly 10 minutes at 9.48 or 48 after the hour wherever you're at, whatever time zone you're at. All right, so see you in just a few. All right, hope everybody was able to get a drink, take a break, et cetera. A few questions that I'll verbally go over right now. Nick's asked, why was I using PascalCase? Was it just a personal choice or some specific reason? In Discord I also posted a blog entry where I talked about that bit more, especially in the case of database naming conventions and such. In databases, I would strongly suggest always stick with Camel or PascalCase. It doesn't always matter which one you go with. However, I personally tend to stick with the case naming that is used in the tech stack. So if it's.NET, I often use PascalCase because a lot of.NET code generally tends to follow the convention of being PascalCase. With Java, for example, I often will switch to CamelCase so that then the database, the tables, the entities that might be generated through GraphQL or whatever generally tend to be cased. However, the coders that are writing code against these tables and these entities and other things that will be created from the database, they'll be using the same case that they already use in their code base. So then it will just look like whatever it is for them, their objects or their entities or their tuples or elements or whatever encourages familiarity from that perspective. Oh, interesting. You use, Nivertec mentions they use ERD diagramming CLI tools and VS Code extensions, but it doesn't generate SQL DDL. That almost seems odd. I would check into JetBrains tools, I would check into a number of other tools because many, many, many ERD diagramming tools that I've seen almost always have DDL spec that they'll export. So I would definitely check out a few new tools here. 
There's gotta be something somewhere. There was another question too from Melody Burger. It says, how generally do I share, communicate and iterate on the data model with the team? So the DB diagram tool, which we'll look at in just a second, actually does the export of the DDL. It gives you SQL, it'll give you images and then you can put that into the repo. Also on the tool itself, you can share these diagrams, like the actual diagram itself, in DB diagram out to the team via a link. And then they can even go, if you give them permissions, they can go and they can edit it or add to it whenever they need to. A lot of the other tools that I use tend to have this type of capability. So you can at least put an image in the repo or put the SQL in the repo. I mean, you need the SQL at some point anyway, especially when you're building out that initial design and you need to get it into the database. If you don't have that, you then need to recreate by hand all of the tables and everything in the database. That's quite frustrating. That would be a lot of redundancy in work.
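One Postgres-specific wrinkle worth knowing when settling on PascalCase or camelCase names: unquoted identifiers are folded to lowercase, so mixed-case names only survive if the DDL, and every later query, double-quotes them, which is what many ERD exports do. A quick illustration:

```sql
-- Unquoted identifiers are folded to lowercase by Postgres:
CREATE TABLE Merchants (id int PRIMARY KEY);
SELECT * FROM merchants;       -- works; the table is actually stored as "merchants"

-- Quoting preserves the case, but then every reference must quote it too:
CREATE TABLE "CountryCodes" (id int PRIMARY KEY);
SELECT * FROM "CountryCodes";  -- works
SELECT * FROM countrycodes;    -- error: relation "countrycodes" does not exist
```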

Setting up Hasura and Database Backend

Short description:

At this point, we'll export the SQL from db diagram into the Hastur GraphQL API server. We'll review the data model and discuss its applicability to GraphQL. If you have any NoSQL questions, feel free to ask. We'll also address the challenges of working with distributed databases in GraphQL. Before proceeding, sign up for a Hasura account and a Heroku database backend. In Hasura, we'll create a new project, use the free tier, and launch the console. The DB diagram code can be obtained from the Discord chat. We'll be using Postgres with Hasura. In Hasura, we'll paste the SQL code under the data tab and execute it. We'll explore the connections, relationships, and entities, and execute GraphQL queries. The query analysis will show the SQL executed by the GraphQL server, providing insights into the execution plan.

So at this point with db diagram, we'll get the SQL and then we'll just, we'll export it into the tool that we're gonna use, which is the Hasura GraphQL API server. With that, let's get rolling here. So, we have our database schema. So we're gonna get that running and actually get it built, executing some SQL. We're gonna look at some GraphQL queries around that. We're gonna review some of the queries and execution plans of those queries, put some data in the database, things like that. We'll further discuss the data model and how that would or wouldn't apply to the GraphQL. And also, be sure if you have any NoSQL questions, be sure to ask me too because I'm gonna leave that kind of open-ended. But one of the things I will broach a little bit toward the end here is how you'll deal with what you get from a distributed database from a GraphQL perspective because it can depend on the distributed database, can depend on the graph database or whatever it is. And you really have to do a lot of the heavy lifting for the GraphQL yourself to determine relationships or graphs back to things, like in the case of say Apache Cassandra, there's no relationships. There's also no joins in that database, vastly more scalable for many reasons because it's distributed, horizontally scalable, et cetera. But then in the GraphQL, to get the value and the power of being able to relate the entities and such, you'll need to make those connections in whatever that server is that you'll be using for your GraphQL. So a little tricky. With Hasura and a relational database, however, you can draw those, which makes it very, very powerful in that sense. So before doing anything else, if you don't have a Hasura account and the requisite database backend via Heroku, go sign up right now for an account on Hasura and Heroku. With Heroku, all you need to do is go create that sign in and then in Hasura, you'll be able to do an authentication against that to get the database. Very easy in that sense. Just like a click of a button. So if you don't have that, start signing up for those right now. I'm gonna continue just a little bit and we'll come back to that. This is what the cloud interface looks like in the Hasura solution. We'll be clicking on new project, which will create us a new project that looks just like that. And then we'll use the free tier. Again, Heroku and Hasura combined give you a free database and a free GraphQL API. And in it, you just enter the project name, region, and then you go through, there's a button to click. It will create the Heroku database for you as long as you have that account authenticated, which the button will do that. It will initiate that process and allow you to just easily finish it by clicking some buttons. Then you click create project and it spins up a project that will look just like that. And then we'll click on launch console and that's what we'll be working in through most of the remainder of this workshop. And again, this is what I built and prospectively you built, or you can grab the code from, the DB diagram code from in the Discord chat. In DB diagram, right up on the export, you can export to PDF, to Postgres, MySQL or SQL Server, or to PNG. Of course, PNG is an image format. So you'll get an image of just the diagram, I believe that's what that is. And Postgres, MySQL and SQL Server will all give you SQL that is specific to that relational database that you can execute against that particular database.
In our particular situation, we'll be using Postgres cause Hasura primarily works against Postgres at this time. And we don't even need to deal with the PDF. We don't need a PDF of any of this, but it's there if you need it. And I don't know if you can without a paid account or whatever account that I'm grandfathered in as I've been able to save the diagram. So that's another thing you can do. You can save it and you can share it and it'll give people the link that'll point back at the diagram that you just created. In Hasura when we get logged in, and don't worry I'll be navigating through this. I'm just showing you a quick overview of what we're about to do. We'll go in and we'll paste the SQL code that we get under the data tab under the SQL section of the data tab is a place where we can just post in, whoops, raw SQL and then execute that raw SQL. A little bit more about that in just a second. We'll do that and one time for the database, we'll be able to draw connections, look at the relationships, look at the entities and execute GraphQL queries against it and do all sorts of whatever we need to do at that point. We'll also take a look at the query analysis. As we execute some GraphQL queries, we'll be able to see those and look at the SQL that is actually being executed by the GraphQL server. This is very important as with any data schema, data diagram, you want to know how the server is actually executing the SQL, how it's executing the query, whatever that might be. In this particular case, it's just like a basic query against something that I've executed. Below the generated SQL, you see the execution plan because even if you have the SQL statement, you don't particularly know how a database server will execute something unless you look at the execution plan.

Execution Plan and Database Concepts

Short description:

Hasura provides the execution plan for SQL statements. Tables contain columns, and the data stored in columns is called a tuple. A tuple is a single row of data. The relation schema includes the relationships between tables. Cardinality refers to the total number of rows in a table. Primary keys and foreign keys receive indexes for performance optimization. Cardinality can also refer to the distinct values in a column. Understanding these concepts is important when working with databases. Now, let's create a new project in Hasura and select the free region, Ohio.

The execution plan shows you the timing and other characteristics around that. Postgres has it, Oracle has it, SQL server has it. Every relational database has some way to look at the execution plan. So, Hasura provides that capability to look at what SQL statement it's created and it provides the execution plan just below that. So we'll look at that, talk about that some more.
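If you want to poke at the same information straight from SQL outside the console, here's a minimal sketch (using the source table from the model above; the exact plan output depends on data volume and Postgres version):

```sql
-- Ask Postgres for the plan it chooses, plus actual timings:
EXPLAIN ANALYZE
SELECT * FROM source WHERE name = 'GraphQL spec';
-- With no index on name, expect a sequential scan over the whole table.

-- An index on the scanned column gives the planner a cheaper option
-- once the table is large enough for it to matter:
CREATE INDEX source_name_idx ON source (name);

EXPLAIN ANALYZE
SELECT * FROM source WHERE name = 'GraphQL spec';
```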

Alright, to the action. So, if we go back over here to export, Postgres. And, even if you didn't do any of the steps so far, I have an easy cheat for ya. It'll get you right up to where we are at. I'm gonna copy all of this SQL into the Discord chat. So, in two seconds, one, two, boom. All of it is now in the Discord chat. So, even if you didn't do any of that, you can still follow along from this point if you would like to.

Now I just wanna cover a few tidbits, a little pedantic detail around relational databases. Especially if you want to speak in great minute pedantic detail about a database. These are tables, as I mentioned. These things in the tables are columns. The data that is stored in those columns within a table is actually referred to as a tuple. It's a single row of a table. Within that, it's that single row of data. Each single row is considered a tuple. Now in programmer speak, that could be a little confusing because in some, a tuple may be a thing with two or three values or whatever. It kind of depends, varies just a little bit in the programming languages, but in databases, relational databases, a tuple is the row. The relation schema is the schema with all of these relationships here. A schema could be just tables without the relationship. So sometimes you hear very specifically stated relation schema versus just schema. The degree is the total number of attributes. The individual values within a row are the attributes, and the number of attributes in a relation is what's called the degree of that relation. The cardinality, that's a word that comes up quite often with relational databases. The cardinality is the total number of rows present in the table, okay? So high cardinality would provide a lot of data around various things. A column, which I've stated, that's these. Let's see what else. Making sure I haven't missed anything. I've gone through a little punchlist in my head here. The relation key, sometimes things are referred to specifically as this is the relation key, okay? And the primary key or the foreign key, this other thing that it's related to, are both referred to generally as a key, okay? Now from a performance perspective, these receive an index. When you make something a primary key or foreign key, they receive a type of index. You can add indexes at the database level to other columns. I'm not going to go into that, but it's something that's very important to know, especially in relational databases, because indexes can dramatically improve the performance of scans, which in turn, as we look at these queries, you'll see in the execution plans what the scans are doing. So if you see something taking a lot of time on a particular column, like a filter based on that column, you can improve that performance often by ensuring that the particular column that you're doing that scan against is indexed. Yes, Nevertech points out, there's also a cardinality of the distinct values in a specific column. Cardinality is one of those words that can mean a thing or many things depending on the specific context in which you're talking about, and it is quite often used. You'll see it in newspaper articles, you'll see it in database articles, et cetera. So it's important to note what it specifically means in a general sense, and then specifically what it's applied to in the particular context it's used for. That's the same thing with tuple.
And I bring that up because, as you go from the development tech stack in JavaScript or C sharp or Java or Python or whatever language you may be using to the database, and then back again, you may see these similar words, which do have either very distinctively different meanings or just slight variations on the meaning, it helps to know the difference so that you know what the context of it is with the particular, either the tech stack or the database. All right, I think that covers the basics of just some of that pedantic detail there. All right, so we have the SQL and hopefully you have created your Hasura account and your Heroku account, and I'm gonna go ahead and we'll create ourselves a new project, and I'll call this the Awesome Workshop Project. Just gonna click on Free Tier, and I'm gonna select the region. The free region is Ohio.

Setting up Database and Launching Console

Short description:

We'll go through the steps of setting up the database with Heroku and launching the console. We'll also discuss environment variables and the initial setup of the GraphQL API server. Finally, we'll run SQL commands to create tables and track them in the server.

There's a lot of other regions depending on where you're at that you might want to use, but I'll just go with the free tier just because then all of us can use it and we can do exactly the same thing and get it to work. I have made the name too long. I should know better than that. Let's just call it the Awesome Workshop. There we go, okay. So then go to Database Setup. I'm gonna click Try with Heroku. I'm gonna click the Heroku button. Right here, if you haven't done so already, this is where the auth would begin authentication. You'd click Okay, etc. and then it would bounce through and then create the Heroku app, install Postgres, get the database URL, put that into here. And then, oh, quick note. You can take this, this database link, and you can use it in a Postgres connector in any tool that you may be using against Postgres if you're doing that. So, it's helpful to know that that's what that is. I click Create Project. Did I just delete something? Well, I'm just gonna quickly refresh. I can do that over again. There we go. I don't know, I think when I highlighted that, I might have messed up the link there. But anyway, here we are. While we're in this, just some quick context while it's generating the server. It's useful to know where the environment variables are. If you do this, one of the first things you often want to do is to remove the console from the live view, because literally anybody, like everybody right now, could go to each other's console if you're creating this, because it's initially created wide open. But what you do is you just add a new environment variable and you go down. And well, I'm going to type it, because it will do it faster. Enable console, and you'd want to set that to false. I'm going to leave it enabled for now, just so we have something to work through here. Now it looks like we're all set, all good. And launch console. There we go. So right now, it is a GraphQL API. We have nothing in here. If you click on the data, it'll go and it'll show that we just have these two schemas in our database. The hdb_catalog schema is unrelated. Don't have to deal with that at all. Public is just our general public schema that we have. And of course, there's nothing in here because we've created nothing here. It's an empty database with a correspondingly empty GraphQL API server. But click on SQL and it will change all of that. So I'm going to literally copy in, so that if anything, if I goofed up on my copy and paste earlier into Discord, I'll get it too. I'm going to copy that SQL from there and I'm going to paste it into here. There we go. As you can see, it does a nice little color coding for us. We have all our notes and everything in our types, etc. So we should be all set. Another thing, you can unclick or click track this, click on track this so that it will initially track all these tables and start to add the requisite metadata into the server, so that it will know what kind of GraphQL queries and capabilities and other connections that we want. So I'm going to click run and in a few seconds, we'll have a database. Okay. SQL executed and you'll note to the left here that action, connection, formatter, note jot, schema, source and source notes all just popped up as they were created. Now click on data one more time and it'll leave the SQL screen and just go to the general data screen that you usually see when you log in.

Tracking Foreign Key Relationships

Short description:

Foreign key relationships are not currently tracked in the database. We need to add these relationships to ensure data consistency. Let's take a look at the existing relationships and track them in the metadata. After adding the relationships, we can edit them if needed. There are some additional tasks we need to perform in the database.

Now there's some foreign key relationships that are not currently tracked. So it tracked the tables but it did not track foreign key relationships, that's one thing that we're gonna want to do. But before we do that, let's take a look at the relationship to see what we get here. Action, for example, if I click on action and then go over here to relationship, you'll see that the relationship is not currently added. So don't click anything yet, we'll go back to that. The other thing you can do is like, let's say source, go to source, we've got insert row, modify. This modifies the table itself. We have, again, the relationships and the two connections that it believes we may want to infer, so the GraphQL entities will know that they're connected in that way, right? The database knows because we drew the database connections in the primary key, foreign key relationships in the SQL. But the server hasn't made that connection in its metadata to strongly connect those just yet because it's given us the option to do so. But it sees in the database that those are obviously there because they are actual connections within the database itself. So let's go back to just click on data and then click on track all. That's gonna add all those relationships. Should just take a second or two here. How's the font size, by the way, everybody? Is the font size okay? Should I bump up the font size? Any other qualms of that sort, just let me know. I'll see if I can tweak it up and down or anything. Cuz I know some of these fonts get a little small on the 1080p screen size. All right, so the relationships are added, so, we're gonna have a bit of a transitional element so, relationships are added. Now, if we go into, let's say, action, go to relationships, you see now it's an actual relationship, there's not an add button anymore, there's just an edit, so you click that and you could actually change the relationship if you wanted to, but you don't need to do that. However, there are a few other things that we do want to do in this database that were not done before.

Setting up Migrations and UUIDs for Batch Inserts

Short description:

Setting up migrations is important to automate database creation or changes. UUIDs enable batch inserts and minimize round trips to the database. Using UUIDs as primary key and foreign key relationships throughout the data model is crucial for performance optimization. Hasura provides auto-complete based on data types, making it easier to make quick changes. The default function 'now' is commonly used in various databases. Upserts behave differently depending on the database. In Apache Cassandra, upserts can be dangerous if the primary key is multi-column.

Okay. This is a point where if I was doing this within a team, from a workflow perspective, right, with trying to get everything into the repo, as Melanie mentioned earlier, how do you share all this stuff with a team? At this point, this would be the first point where I would immediately set up migrations to automate the creation or changes of the database. However, I'm not going to do that today because we're kind of focused on a different topic, but it's very important that that's started as soon as possible so you can have a clean flow of migrations from your version one migrations to your version two database, and they can stay in sync with your software development or your report writers or whoever's like using your database and executing GraphQL queries, et cetera, against the database. Like, you want to keep that in sync with them so you can always give them the version spec and they know exactly what they're getting in the database. So. If we look at relationships, that's good, but the other thing is, let's go to modify, we'll see this is UUID, primary key, and it is unique. That is what we want, but we want to change the default to the random UUID so that the database creates the UUID for us. Now, doing this in Postgres, as well as many other databases, you get the UUID, the database generates it for you. However, that doesn't prevent you on the client side, if you need to, from creating a UUID. And this is a very cool thing that you're enabled to do with UUIDs that you cannot do with auto increments, and it's very important, especially if you have a system with a lot of users reading and writing data and stuff that's gonna have relationships between the primary data created and the foreign key data created that may need to be written into multiple tables. But you don't want to do a bunch of trips to the database because you want to speed up the performance of the website. So the way to do that is you have the website create that initial primary key value. In this case, let's say source, okay. But as soon as they create that, it's set up, it's ready to be saved. But the user also wants to do some other things like create the action, the formatter or other information like the notes around the source at the time of the creation of the source itself. If that's the case and they're still creating that data. Like you have all of this stuff on a form where they can create these things. You may want them to create more of that information and then write all of it at one time to the database. The only way to do that is to use UUIDs. And then when you draw the primary foreign key relationship on the client, as they create that data, you have created that UUID to be able to pass to the database so that the database correctly draws these correlations between this data that's created on the client side. You can push all of it at one time. So you have basically a batch insert against a multitude of tables and UUIDs enable you to do that, whereas auto increments would not allow you to do that because you may run into conflicts as multiple users are trying to write things to the database. And it would necessitate a round trip every single time. You would write the insert and you would have to ask for that auto-incremented ID back and then take that for the foreign key related data and use it for that as you do those things and then send that data in. So there's that whole round trip that is required in those situations, whereas with UUIDs it's not necessary.
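A hedged sketch of both pieces just described, assuming the source and source_notes tables from the model above: wiring up the defaults, and then a client-generated UUID letting a parent row and its related rows go to the database together because the foreign key value is already known.

```sql
-- Let the database fill these in whenever the client doesn't supply them
-- (gen_random_uuid() is built in from Postgres 13; older versions need pgcrypto):
ALTER TABLE source ALTER COLUMN id    SET DEFAULT gen_random_uuid();
ALTER TABLE source ALTER COLUMN stamp SET DEFAULT now();

-- With a client-generated UUID, the related rows can be sent in one
-- transaction instead of a round trip per table:
BEGIN;
INSERT INTO source (id, name, uri)
VALUES ('3f8a6c1e-9b2d-4e7a-8c5f-1a2b3c4d5e6f', 'GraphQL spec', 'https://spec.graphql.org');

INSERT INTO source_notes (source_id, details)
VALUES ('3f8a6c1e-9b2d-4e7a-8c5f-1a2b3c4d5e6f', 'Notes jotted down about this source.');
COMMIT;
```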
This is also very, very important with GraphQL, because if you take a complex entity that has related entities nested within it and you try to write all of it back, you may just send the pertinent things you need, but you want to minimize the trips to the database and the looping back and forth that occur. GraphQL will hide that, but as the database administrator, the person who's managing the database, you want to make sure it's not doing any more round trips than necessary. And that is fixed by using UUIDs for your primary key and foreign key relationships throughout your data model. So I'm going to save that with the generate function there, so that way I don't always have to get the value from the client side if I don't want to. And then for the stamp, I'm going to take the default and go with now. Notice how Hasura gave you the auto-complete, and it gave it to you based on the data type. Very, very cool. It's partially from the data type; the naming kind of pulls off of that. So it helps a lot when you're trying to go through and make some quick changes, to get auto-completes that are specific to what you're trying to do. All right, we've got the ID and the stamp in there.

So, clock_timestamp. I think that's dependent on database type. I don't actually know the specific difference offhand, so do elaborate if you know, Bravan. I just use now because it tends to be the default function across other databases also, which I like.

So, upserts are interesting, and depending on the database, they behave differently. This is definitely a good NoSQL question. In your query, how do you store it? In your XAML docs, do you have a variable that defines a new item in every loop, or do you store a single item that's not meant to carry over to the next query? In Apache Cassandra, upserts can be dangerous, because you may have an ID which is your primary key, and it may be multi-column, because a primary key can have multiple columns.

Database Design and Development Considerations

Short description:

When using UUIDs in a database, inserting a new record with the same name and email address can result in a duplicate entry due to the unique UUID generated. Updates in databases are actually a delete-insert operation. When using timestamps, it's important to consider factors such as time zones, time servers, and the accuracy of the function. The Hasura CLI automates migrations and keeps track of versioning, making it easy to switch between different versions of the database schema. This simplifies project management and collaboration.

It's called, let's say, a composite key. So if you have that, and one column is a UUID, one column is a name, and one column is an email address, and then you try to insert a new record with the same name and email address, because you just want to update them, if you're using a generated UUID you end up with a duplicate. The UUID would be generated differently, it would be a different value, so even though the name and the email address are the same, it would make a new row in the database. That's one of the problems with upserts. And upserts aren't a distinct operation that occurs; there's still just an insert occurring, but it's referred to as an upsert because you specifically have to map what your identifying key is in the upsert to ensure that the update you're expecting actually occurs. It's also probably good to mention that updates are kind of a myth within databases, because what's actually happening on the database side whenever you do an update is a delete plus an insert. When you do an update, those two things occur. So keep in consideration that you're doing two actions every time you do an update in a database.
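In Postgres terms, here is a hedged sketch of what a safe upsert looks like, using a hypothetical people table that is not part of the workshop schema. The point is that the conflict target has to be the column that actually identifies the row, not the generated UUID.

```sql
-- Hypothetical table; gen_random_uuid() is built in from Postgres 13,
-- earlier versions need the pgcrypto extension.
CREATE TABLE IF NOT EXISTS people (
  id    uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  name  text NOT NULL,
  email text NOT NULL UNIQUE
);

-- Keying the conflict on email makes a repeat call an update.
-- If the conflict target were the generated id, every call would simply
-- insert a brand-new row.
INSERT INTO people (name, email)
VALUES ('Ada', 'ada@example.com')
ON CONFLICT (email) DO UPDATE
SET name = EXCLUDED.name;
```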

Oh, what else is there to mention about that? There's a lot. There's a lot to think about in database design and development. So yes, back to the database specifically, around these tables. In actions, I've updated that. I'm going to update source because I'm going to play around with source a little bit to show some things. So let's actually get this and add the UUID default there, and we'll add the stamp. All right, there we go. And then let's go ahead and do it for the others. Source notes doesn't need it, because in every situation you would always be taking the source ID and the note ID to draw the relationship as you do the insert into source notes, so no function is necessary there. However, on stamp you want to do now also. Excellent.

So what is the difference, as Previn asked, between now and clock_timestamp? Quick little side diversion here on good old Stack Exchange. Someone says they see no difference; however, the answer clarifies it: the SQL-standard functions return a value based on the start time of the current transaction. current_timestamp and transaction_timestamp() are equivalent, and now() is the traditional Postgres equivalent of transaction_timestamp(), whereas clock_timestamp() returns the actual current time, which keeps changing even within a single transaction. So there we go, good to know. So now is basically a wrapper for the transaction timestamp, just like now is often a wrapper in other databases for those things. It's a very good question, though. And to elaborate on the concern these questions are often derived from: whenever you use any type of timestamp, you need to make sure it's stamping it in the zone, in the format, and with the other details that you expect. For example, is it using a time server, or is it using the local database server's clock? Is the local server in your time zone, or is it on UTC, or is it even set to UTC but in a different location? There are a lot of different variables around times and timestamps you need to take into account, even more so if the timestamp is being used for a legal audit. In that situation, you'd want to standardize on something specific and well-documented, like UTC, and then do any conversions to other time zones based on that UTC value for the audit and the timestamp. Knowing what the function does, and how accurate or possibly inaccurate a timestamp can be, is hugely important.

We've got source_notes, schema, notejot. Let's look at that one. I'll pop UUID in there. I'm just going to do all of them real quick. Again, stamp. Now the cool thing is, if you use the Hasura CLI, all of this, as you do it in the console, would be automated and kept in a migration. So whenever you execute your DDL, your database schema, you can move back and forth between the versions that you keep. This would be version two that I'm working on now, with these extra default functions in place. So with migrations and using the CLI, you can automatically go back and forth between those with a simple hasura migrate apply. And in the console, you can execute from your migrations to bring things up and then make these changes, and it writes the SQL into your migrations folder within your project. So, very easy. As Melanie mentioned earlier, how do you keep up with these things in your project?
It's very easy then because it creates the SQL for you to have in your Migrations Directory that you can put into your repository and into whatever other collateral you need for your project, whether it's Jira cards or something else. Great way to be able to do that.
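As a rough sketch of what those console changes and the timestamp discussion above boil down to in SQL — the table name is assumed, and Hasura's generated DDL may differ in detail:

```sql
-- Database-side defaults for the id and stamp columns (table name assumed).
ALTER TABLE source
  ALTER COLUMN id    SET DEFAULT gen_random_uuid(),
  ALTER COLUMN stamp SET DEFAULT now();

-- now() and transaction_timestamp() are frozen at the start of the current
-- transaction; clock_timestamp() keeps moving while the transaction runs.
BEGIN;
SELECT now(), transaction_timestamp(), clock_timestamp();
SELECT pg_sleep(1);
SELECT now(), transaction_timestamp(), clock_timestamp();  -- only clock_timestamp changed
COMMIT;
```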

Exploring GraphQL Tools and Query Analysis

Short description:

With Hasura, you can easily avoid writing SQL you don't need to write. For the formatter table, you can pick from different default functions in the console. When working with GraphQL, Hasura introspects the database and infers entities based on the tables. It's important to choose the appropriate casing for familiarity. The Explorer, CodeExporter, and Voyager are useful tools for executing GraphQL statements, exporting code, and visualizing the database metadata. Voyager provides a map of the metadata, query routes, and data types, and is a valuable tool for troubleshooting and understanding the GraphQL API. The analyze feature provides insight into the execution of queries and the nested structure of the data, which is helpful for complex models and server-side execution.

It makes it very easy so that you don't have to write any SQL that you don't really need to be writing. So, really great option there. Formatter, let's take a quick last view here. You have a preference too, as mentioned, with the stamp. If you don't want to use now, there are all these other functions; you can go with another default. They all show up in the drop-down, which makes it easier to determine which one you want and go with that.

Alright, so we've got those. Now let's go into the world of GraphQL. Hasura introspects against the database, which in this case is the database we've just created with our particular tables; it looks at our relationships and infers what ought to be entities. In this case, the entities are generally the tables. So again, as was mentioned earlier, it's important to pick your casing. It helps bring familiarity to the concept of the thing you're bringing back and querying against for the people writing code against this particular API, right? So pick your camel case or your Pascal case accordingly.

In here, let's see here, we were going to do source. Let's do a basic query. Pick ID, name, stamp, and URI. Of course, we have nothing in here. We execute that, boom, nothing, right? However, we do have something, we've executed a query. If you look at the history, my query has been executed. If you change the name to another query, for example, and execute that, boom, shows up in the history. You can go back to these various queries and look at them. Like that, very easy, just clicking on things.

The Explorer, I'm going to close that for a second. I'm going to close the history for a second. Now let's say you need to write some code that's going to execute this GraphQL statement. Click on CodeExporter. You have options here, TypeScript, JavaScript. As you can see below, it provides the code that you need to execute that, basically in copy-and-paste format. You can change from fetch, or you can use React Apollo, and it shows the pertinent code for that. So, super-handy feature there. Let's get rid of CodeExporter for a second. Next, click on Voyager. Transmitting to space. Alright, this may be a little overwhelming at first, so we're going to zoom in. As you can see, this is a map of the metadata that was created about the database, with all the correlations, et cetera. Query routes for each of the tables we created: action, action aggregate, connection, formatter, notejot, schema, source, et cetera. And then that maps out to the nodes, the mins and maxes, the data types, and other things like that. You've got your connection, source, and the list goes on. So it's a great way to figure out what is happening, as far as the database is concerned, in the design of the GraphQL on top of the DDL-built database schema. OK. Very, very useful, very important place as you begin building more complex models and you need to troubleshoot what you want to do with your GraphQL API, and determine what you want to do as far as the model, but also your entities and things like that. Then let's do an analyze real quick and see what we get. That is the query analysis that I showed a little bit earlier in the slides. Pretty, pretty easy query, because it only executes against a single entity, and in this case, a single table. So there are some nested selects, which is how the default generation of these things occurs. As you get more complex in your queries, this becomes more complex, and you can go in here and see what you're getting and executing on the server side. So let's get into that a little bit.

Adding Records and Related Values

Short description:

Let's go back to Explorer and change the source to a mutation. We'll do an insert without the ID field, and only include the name and URI. The database will handle the ID and timestamp. By asking for the ID and stamp, we can confirm that the record was created and obtain the necessary information for future reference. Let's use my blog as a reference and add a record. Now, let's go to YouTube and add another record. We have a few records now. Next, let's add a related value by going to the source and selecting formatter. We'll add something to the formatter and include the connection ID.

First, let's sort that out. Go back to Explorer. This is one of the features that I love about Hasura and its GraphiQL integration. Let's go into source, and let's change this to a mutation. You'll see all the inserts, updates, deletes, et cetera, now. Let's go with source, do an insert, and I'm not going to insert the ID, so I'll take that out. I want to do name and I want to do URI. That's it, because the ID and the timestamp will be done by the database, right? So let's get rid of this thing. And then control-space, just like in normal GraphiQL, will give you the autocomplete for things. So I just want the return from that. I'm going to ask for the ID and the stamp. Okay. That will give me confirmation around several things. One, that the record was created when I asked for it to be created. It'll give me the ID and the stamp, so I'll know when it was done. And I'll also have the ID if I want to reference back to that particular record with whatever I'm doing in my code base. Pretty common query tactic that I use. All right, so the first source. What's a good web page? Anybody got a good web page? I'm going to go to my blog and use it as a reference. So compositecode.blog, put that there. And then the name of it is Composite Thrashing Code. And I'm going to add that. So there we go. We have the ID and the stamp coming back from that. We're good to go. Let's go to another website. Let's go to YouTube, to the HasuraHQ channel. Lots of good content out there, so big shout-out to our content on YouTube. I'm going to take that and just put it in here. So now we have a few records. And boom, new ID, new stamp. So that's great. So now let's take that ID here, and let's add a related value in. Let's go to the source. And then we want to do formatter. So let's say YouTube has an API, so we'd be adding something to the formatter. That's the update; I want the insert. We'll have the connection ID.
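Before moving on, for reference, here is a rough plain-SQL equivalent of that insert-and-return-the-ID pattern. It is not the exact statement Hasura generates; the values are just the ones typed above.

```sql
-- The database fills in id and stamp from their defaults, and RETURNING
-- hands them straight back, so no second query is needed.
INSERT INTO source (name, uri)
VALUES ('composite code', 'https://compositecode.blog')
RETURNING id, stamp;
```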

Adding Connection to Source

Short description:

We need to add the connection to the source first. The connection needs action ID and source ID. If you have foreign key items with required items, the primary key ID needs to be available when inserting the record. Let's go back and add the ID and Source ID. Then we'll go to Insert Connection. We'll skip creating the other items first and look at the execution again.

What did I draw? I'm thinking this is bad. Here we go. We need to do connection first, so we'll do source ID. And then, yep, there we go. Oops, type that in. Okay, that's good. Then we'll do, I thought I had called it, oh yeah, action. Let's see how we can do this. So, formatter. See where this starts to get complicated when you're doing a multi-insert?

Okay, I've actually confused myself. Let me look at this again. Okay, I put in the source, and I want to just add, I need to add the connection to the source first. So the connection needs what? It needs all these things. It needs action ID and source ID. I think I left those, let's make sure we left them nullable. Yeah, so you can add one or the other as it's being done. To note, that's another thing: if you have foreign key items that have required items, the primary key item's ID needs to be available to be inserted with this record when the record is finally put together with all its required items. And if that takes a minute, you may not have that unique identifier if you just sent the primary key item to the client, right, and for some reason you haven't gotten the auto-increment value back or something. So that's important to note. So let's see here. Connection. We need to go with the standard IDs. Let's go back and do that. So the ID will be generated for us, and then we want to do source ID, that's what we want to add. And I added it. So change this to a mutation. There we go. And I'm going to get rid of this. We'll go to Insert Connection. So your ID will be done for us. We want the source ID, that's what I copied. Okay, there we go. Then we won't need this stamp, and that's just going to provide the connection. It must have a selection of subfields. So let's see, source. Actually, let's do this. Mm. I feel like maybe we should create the other items first and then this one. So for that matter, I'm going to actually skip through that, and we're going to look at the execution again. So we have the source. Let's look at those. There we go.

Nested Queries and Performance Considerations

Short description:

We have a single record query where we filter by name. The result sets are returned and mapped to their corresponding fields. As the query becomes more complex, nested relationships are introduced, which can lead to performance issues due to the increased number of joins. It is important to keep the number of joins minimal to avoid decreased performance. Creating denormalized tables can help improve query performance by grouping together frequently accessed data sets.

So we have two. No details in there. One's a blog, one's the other one. So if we do that, but then we say where, let's go with name equals hasura.hq. Oops. This is called a single-record query, so we can look right at it. And then I'm going to pull up the history. So there it is, along with the others. I'm going to run an analysis of that and we'll look at what we've got.

So here you can see how the where is built. What I'm going to do is scroll down to where it is, and you can see how the where clause is built: public.source.name equals the text value we passed in. It's a little weird, because it might not be the SQL you would write yourself if you were to write this query, but it shows you how this is done and how these things are filled in, and you can make your own determination. And of course, here are the three pieces. That inner query is executed, then this query is executed on that result set to show the sets that are returned, i.e. the tuples that are returned, and it maps ID to ID, name to name, details to details, so that when you see it, it comes back just like that.
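If you want to see the same kind of information straight from Postgres, here is a simple sketch: the plain-SQL shape of that single-record query, not the exact nested-select statement Hasura builds, run through EXPLAIN ANALYZE.

```sql
-- Plain-SQL shape of the single-record query; Hasura wraps something like
-- this in JSON aggregation, but the filter and the plan are what matter here.
EXPLAIN ANALYZE
SELECT id, name, details
FROM public.source
WHERE name = 'hasura.hq';
```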

So as this gets more complicated, though, let's look at one of these crazy ones. Let's say we have connection, and you want to show various pertinent things. You want to see the action, and then you want the formatter, formatter map, stamp, ID. Then you want source, name, ID, stamp. You can see how this starts to nest. You have connection, and then action within that. Within the action you have connections, and then source. Now let's see if this even executes. It did. However, let's talk about this for a second. You can see that there are dramatically more steps to what just occurred, even though there's no data to execute against. It still has to go through these steps. There's a nested loop sequence scan, a nested loop index scan, another aggregate, a nested loop, a nested loop sequence scan inside of that, another index scan, et cetera. So now, if this is, say, one record on this join, the next one is then executed and queried against the set of all that. So you can have an exponential blow-up, or even a Cartesian product, in the query. You could technically kill your server with certain types of queries like this, and you need to be aware of that so you can take precautions. For example, this connection-into-action nesting: you most likely wouldn't want to query that deep into it. You'd want to keep it minimal, at a higher level; at most one, maybe two levels deep. The good rule to follow, as with relational design, is that you never really want to go too far down the path with too many joins. As joins are added, you see a huge spike in decreased performance because of the correlated scans that have to be done against the database. To minimize that, you keep things one level deep, two levels deep. And you can think of those as your joins, right? As you nest into an entity object like that, you increase your joins. So when you design your database schema, think about the GraphQL entities and how those will be queried. One of the things you can do to fix queries, or to change things so they can be faster or more easily queryable, is if you know certain sets of data are always needed together, you can create a denormalized table, and maybe either with code or something else have that filled at the same time that you fill your system tables of record. Now, one way to do that is, if we go back to data, let's say we want that connection with its requisite related data.
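In SQL terms, each level of that nesting is roughly another join. Here is a hedged sketch of what the connection/action/source selection amounts to; the column names are assumptions based on the tables built earlier.

```sql
-- Two levels of nesting already means two joins against connection.
SELECT c.id,
       c.stamp,
       a.name AS action_name,
       s.name AS source_name
FROM connection c
JOIN action a ON a.id = c.action_id
JOIN source s ON s.id = c.source_id;
```

Every additional nested entity adds another join, and the planner's nested loops multiply accordingly.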

Optimizing Queries for Performance

Short description:

We can combine the columns of the action ID and source ID into a single table for faster queries. This denormalized approach reduces redundancy and allows querying multiple elements within the entity. By minimizing joins and nesting, we can achieve faster results with only one level of nesting.

So we have ID stamp, but it has an action and source ID. So we might want a table that has all of the columns of the action ID and source ID combined. Very denormalized, a lot of redundant elements of data within that. So as was pointed out earlier about cardinality, your distinct cardinality decreases, and you have a lot of redundancy within a table. But you have a single table or a single entity in GraphQL to query against. So then you can query against multiple different elements within that entity, or in the SQL speak, multiple different columns to query against in that entity to be able to get faster results. And to also get results that are only nested one level deep to minimize the amount of joins and nesting within the query itself that has to be executed against the database.
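One way to sketch that denormalized, one-level-deep entity is a materialized view; that is just one option, not the only approach, since a plain table filled by application code works just as well. The column names here are assumptions.

```sql
-- Flattens connection, action, and source into one wide, denormalized relation.
CREATE MATERIALIZED VIEW connection_flat AS
SELECT c.id    AS connection_id,
       c.stamp AS connection_stamp,
       a.id    AS action_id,
       a.name  AS action_name,
       s.id    AS source_id,
       s.name  AS source_name,
       s.uri   AS source_uri
FROM connection c
LEFT JOIN action a ON a.id = c.action_id
LEFT JOIN source s ON s.id = c.source_id;

-- Refresh whenever the underlying tables of record change.
REFRESH MATERIALIZED VIEW connection_flat;
```

If tracked in Hasura, a view like this shows up as a single flat entity, so queries against it stay one level deep.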

Performance, Design, and Use Cases

Short description:

Normalization and denormalization of data play a crucial role in performance and design. Consider what parts of the entity are needed at a particular time and whether to expand the entity to include other things. Follow me on Twitter, Twitch, and YouTube for more content related to Hasura and GraphQL. Hasura is not limited to CRUD applications and offers features like custom validation, authentication, and remote schemas. It can be used for customer-facing mobile apps with roles-based permissions and authorization.

So, with that, any questions about that? Because that's where the bulk of performance and design work needs to come into play: around normalization or denormalization of the data, to change what we ask for in our GraphQL queries against the entities. What parts of the entity do we want at one particular time? Do we want more than is available in a single entity? Do we need to expand that entity to be more inclusive of other things? Those are the things we need to take into account.

So questions, comments, et cetera. As you think of any other questions or comments, I'm going to pull up my final slide, because that's basically the material that I have for this. To note, the slide has links; I will make these slides available and post them in the Discord within the next few minutes. So if you want the links in the slides, you can just get the slides and they have the links in them. It's all Google Slides, so no crazy format or apps needed for it. And I'm going to add a few things here. You can reach me on Twitter, and on Twitch and YouTube also. I also stream shows on HasuraHQ around this specific topic as it relates to the Hasura tool. There's a lot of stuff out there around migrations, and I'm going to be doing more. In the very near future, in 2021, we will be writing apps with React, Vue.js, things like that, around the Hasura API server using GraphQL. There are a lot of deep-dive GraphQL topics in there. So be sure to follow us on Twitch, subscribe on YouTube, and follow us on Twitter. I, myself, am available on Twitter at twitter.com slash Adrian, and on Twitch at twitch.tv slash, whoops, thrashing code, and on YouTube at the same. We also have GraphQL Asia coming up, so if you're interested, definitely check that out. And for further training on this specific topic, as well as many others around GraphQL and how to do data modeling around GraphQL, check out hasura.io slash events. There's a lot of good stuff in there. And that is pretty much it. I think I might have, let's see here. Oh no. Definitely follow me on Twitter and ping me if you're interested in credits, because I think I have some pretty cool credit options I can give you. But follow me and I'll get those to you, because I don't have the code with me right this second. What are the intended use cases for Hasura? Is this mostly for enterprise CRUD apps or for consumer-facing web and mobile apps? Anibertec asks. Yes. Isn't that the greatest answer, yes. It can definitely be used very, very easily for CRUD applications, because as you saw as I worked through it, you can go in and all the CRUD features are there by default, out of the box. None of this was canned at the beginning of the workshop; I did this as I would do it in any circumstance the first time I'm doing it, for all of you. One reason I do that is to make sure that there are no little hiccups in between. I like to make sure any material that I present is seamless from start to finish, so that your experience would be the same experience as what I'm showing you. I don't want any tricks in there. So, you saw it right there: boom, all the CRUD capabilities are there by default. However, Hasura does a ton more than just give you that. For instance, you can look at Actions, and you can add actions that do custom validation or custom authentication, like if you're using OAuth with Auth0 or something like that. It gives you the capability to add that and then execute GraphQL against it, or create against it that way. Remote schemas allow you to connect to another GraphQL endpoint, integrate it in, and even draw relationships between that technically unrelated GraphQL endpoint and your GraphQL endpoint here in Hasura, drawing correlations just like the relationships we saw it derive against the database it's directly pointed at.
With remote schemas, you can point out to those locations and basically pull in all sorts of other GraphQL APIs and centralize them, to provide a single reference point for your developers or report writers or whoever to access, with all the correlated relationships and everything they need to make life simpler as they do development from this single point. So that's far beyond mere CRUD, right? As far as customer-facing mobile apps, yes, I would definitely use it for that. You can set up role-based permissions, authorization, all types of things there.

Postgres Capabilities and Tips

Short description:

Postgres offers extended capabilities and can be scaled horizontally. JSON fields in Postgres can be inferred as strings unless specified otherwise. The types in Postgres are mapped accordingly in Hasura. If a type is not available, it can be inferred as is. A list of general tips related to GraphQL queries and the database perspective is provided. These tips include minimizing joins, using 'exists' instead of 'in' to check data existence, using appropriate data types, and creating indexes on columns selectively based on query patterns and cardinality.

Basically, what's available in Postgres is extended beyond that, and as we add databases like MySQL and SQL Server, which is happening pretty soon, those permissions will also map through accordingly, so you can have all the capabilities you need, especially in more complex enterprise scenarios, to make sure you have the right permissions, roles, and other things like that for the client software.

Other things that we have, you can basically scale this out horizontally, so for large, customer-facing websites or mobile apps that you would need some very high-scale capabilities for, we got options around that too. If you were to use some of the high-scale options for Postgres, you just point this at that high-scale option and boom, you're good to go. Like I believe Citus and some of the other ones, or some of the large RDS options within Amazon even.

The question, I suppose, is about the types of the JSON fields in Postgres through Hasura. Okay, I'm gonna ask you a question back because I'm not sure of the exact context. So what was it? We had action, and this is JSON, right? You can also do JSONB, right? And I think that adds more context for some scenarios around typing and such. But if you just do JSON, I'm not remembering exactly what we've done in the past specifically for that, to bring the JSON types across in that JSON object. I'm trying to think how PostgreSQL even does it, if it does it. Yeah, so that's kind of a tricky one. Because otherwise, as you probably know, when you have JSON, if you just get the JSON and there's nothing to infer the types within the JSON itself, it's generally just strings, right? If you had an array, it would be an array of strings unless designated otherwise, unless you can find some way to infer that it's an int or something like that. So I think you'd have to do that the same way you would infer it if you just got JSON that you didn't know about. That's a good question, though. I would have to check that out and see specifically what the detail is around that, because I'm not sure at this point. Somewhere in there, there's something, though, because I know that the types are mapped accordingly, so whatever's in Postgres ought to be available in some way, form, or manner. If it's not available to infer the types as is, and Postgres does it, then Hasura will at some point be able to do that, if it doesn't already.
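As a small illustration of that point, assuming a jsonb column (the column name config here is purely hypothetical): Postgres hands values back as text unless you extract and cast them explicitly.

```sql
-- Hypothetical jsonb column named config on the action table.
SELECT config->>'retries'              AS retries_text,  -- ->> always returns text
       (config->>'retries')::int       AS retries_int,   -- explicit cast to get an int
       jsonb_typeof(config->'retries') AS json_type      -- 'number', 'string', ...
FROM action;
```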

One more thing, right here at the end, of course I remember this right at the end. I created a list that I did want to go through. I'm just going to paste it right here: perf tips. These are somewhat arbitrary, not specifically related to schema, but just general. They do in some ways relate back to schema, but again, they're general. Let's see here, paste without formatting. Can I do that? Yes, there. There are just a few that I've been collecting over the last few weeks that relate to things I've noted specifically help the GraphQL side of queries and such, from the database perspective.

One, minimize joins; I mentioned that. Two, use EXISTS instead of IN to check for the existence of data. I've run into that scenario a few times, and it applies to pretty much anything you're gonna execute ANSI SQL against, so SQL Server, Postgres, et cetera; that helps. I feel like it's kinda silly to mention, but use the appropriate data type for a thing if it's a particular data type. Yes, I'll get them in there. That is a good call, probably. Also, I mentioned you can create indexes on columns in the database. Yes, you can. However, that doesn't mean you get a performance boost if you go create an index for every column. Depending on how the database does indexes, which is generally similar between relational databases, you want to pick the one or two, or maybe three at most, columns that you're gonna be querying against within the WHERE clause of your GraphQL or your SQL as it maps through. You want that WHERE clause to execute against indexed columns, ideally. But if you're just doing WHERE clauses against a whole bunch of different columns, then it's kinda hard to pick and choose. You wanna pick the columns that will have the highest distinct cardinality.
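Two of those tips sketched in SQL, with table and column names assumed from the workshop tables:

```sql
-- Prefer EXISTS over IN when checking whether related rows exist.
SELECT s.id, s.name
FROM source s
WHERE EXISTS (
  SELECT 1
  FROM connection c
  WHERE c.source_id = s.id
);

-- Index only the columns your WHERE clauses actually hit, ideally ones with
-- high distinct cardinality.
CREATE INDEX idx_source_name ON source (name);
```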

Optimizing Indexes and Using Schemas

Short description:

Indexes can be iffy if there's redundancy. Denormalizing data by combining columns into a single table can help, but it sacrifices distinct cardinality. Apache Cassandra works differently from the relational model and offers fast reads with horizontal scaling. Schemas in databases provide a way to organize and streamline performance. They act as a namespace to group tables by business domain. Other databases like SQL Server and Oracle have similar options. Any more questions? I'll be available for the next few minutes. Thanks for attending!

That's one thing, if there's a lot of redundancy, indexes can be iffy. So you really kinda have to trial and error through some of that and become more familiar with how the database streamlines and enhances the execution path and the scans of the queries. And I also mentioned denormalizing the data, which is taking that standard primary and foreign key and just munching a bunch of columns into a single table, that can help. But of course, you lose your distinct cardinality, and then you have to weigh which columns to index and how many things you should have together or shouldn't have together because of the redundancy within the table.

One caveat, however: if you have a server that uses Apache Cassandra and you run GraphQL on top of that, Apache Cassandra works very differently in how it does this. You have a table structure, but it's column-oriented storage, and your WHERE clauses can generally only be done against one or two columns at a time. But you can literally have petabytes of data in a table and get timely queries against it because of the horizontal scaling of Apache Cassandra, for example. Again, like I was saying, that's a very different database than the relational model, but it gives you reporting capability that is not even comparable, because the reads are just so fast.

Let's see here, oh yeah, schemas. In some of the databases, there's a little bit of redundancy in the term: there's the database model schema, and then there's the schema feature. In Postgres, you can name a schema and then put tables within that schema. There's the schema design, where all the tables are related to each other, but then there's the schema, almost as if it were a namespace that you can organize tables within. So then you have myschema.tablename whenever you write a query. Putting things into schemas helps organize, but it can also help you streamline for specific performance needs, or break things out per business domain or other needs. SQL Server has options like that, so does Oracle, et cetera. It's not always called a schema, but it's a namespacing element within the database to break out tables into groupings. It doesn't really change the table itself in any way; it just namespaces it and breaks it out into a domain organizational structure.
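A quick sketch of the Postgres schema-as-namespace idea; the schema and table names here are just illustrative.

```sql
-- Group tables by business domain without changing the tables themselves.
CREATE SCHEMA inventory;

CREATE TABLE inventory.source (
  id   uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  name text NOT NULL
);

-- Queries then reference schema.tablename.
SELECT id, name FROM inventory.source;
```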

All right, so yeah. Any other questions? I'm gonna be hanging out for the next five-plus minutes, at least. I'm going to share these slides real quick, and I will post them in the Discord right now. So if there are any more questions, feel free to ask them. I'll answer anything and everything I can, and if I can't, I'll help you dig up an answer. In the meantime, I'm gonna share these real quick and get them to you. So thanks for attending, I hope this was useful, and again, go follow me, follow Hasura's Twitch, et cetera, go read the blog, check it out. There's so much more out there that we are producing and will be producing, specifically around connecting database schema and modeling design to doing it well with GraphQL in mind, to make development faster, easier, more robust, and more capable as you do so. So again, thanks. I'm gonna unshare my screen, get those links to you, and hang out in chat.
