Real-Time Data Updates for Neo4j Using GraphQL Subscriptions


Join Thomas from the GraphQL Team at Neo4j as he talks about one of the newest features of the Neo4j GraphQL Library: GraphQL Subscriptions. Using this new feature, GraphQL API consumers can listen in real time to data changes made in Neo4j through the GraphQL Library. Following a high-level overview of the Neo4j GraphQL Library, he will demonstrate the new Subscriptions feature.


Hi all, I'm Thomas Wies, a software engineer at Neo4j on the GraphQL team. Today I'll be talking about real-time data updates for Neo4j using GraphQL subscriptions.

But first things first: we have to talk about the database that we use, which is Neo4j. Neo4j is also a company, but it's mostly known for its database. It's a database that can scale very heavily, and it doesn't have any schema. All you need is nodes and relationships, which form a graph. You can see one here: a movie graph with nodes and relationships. The nodes are the bubbles and the relationships are the arrows. The most important thing to note here is that both nodes and relationships can contain properties, so they're both first-class citizens.

Of course, if we have a database, we need a way to query it, a query language. That's what Neo4j built as well. It's called Cypher. Cypher took a lot of inspiration from SQL, but it adds more on top because you have to query graphs, so we have to do things like graph pattern matching. A simple example is here in the top row, where we match a person who acted in a movie, with the arrows indicating the relationship. This way we can read it quite intuitively. Believe it or not, it's actually very easy to learn, and it has a lot of the power that SQL already has built in: WHERE clauses, matching, et cetera. We call this ASCII art, or almost ASCII art.

So now we have a database, a really amazing database, and of course the topic of this conference, GraphQL. Wouldn't it be great if we could just get those two together? Could we do something like this here? Yes, of course we can. We already built it for you. Two years ago we started building an open-source library, written in TypeScript for your convenience.
And what the library does, the centerpiece of the library, is what you can see here in the image: it provides the base for a GraphQL API, where GraphQL clients can send their GraphQL queries to the API. In the background, the library translates each query on the fly to a Cypher query and sends that to the database, and the response goes back to the API and then back to the client. That's all handled for you automatically in the background. You don't have to do anything here.

But we need an API, right? We have to define the API somehow. Usually you would go with something like this: you create your types, your type Movie here with a field title. And then you have to do this cumbersome stuff. You have to write read queries. You have to write all these mutations: add, change, remove, et cetera. You even have to write the resolvers for all of that. How painful is that? No way, I'm not going to do that. I want automatic schema generation. You do the boilerplate; I do automatic schema generation.

And that's exactly what we did with the Neo4j GraphQL Library. The only thing you have to provide to make your GraphQL API ready to use is your type definitions. We take a very simple one here to get started: a Movie with a title, that's all you need. Then the library creates this for you in the background. You get the read queries out of the box. We haven't done anything, but it's all available. You can read movies with filtering, with where, for instance. We have aggregations ready for you, again with filtering built in. We have connections, an implementation of the Relay connections specification, where you can do cool things like cursor-based pagination, all out of the box. And do you know the best part? The resolvers are already written.
You don't have to write those either. And it gets better: mutations are also included. You can create, delete, and update movies, all these nice things. Again, filtering is included wherever you need it. And again, the resolvers are already built in. You don't have to do anything.

So let's have a look at how all of this works at the query level. We start with something you're familiar with: a GraphQL query where we want to get all the movies that were released in 1999. We want the title and the names of all the actors that acted in each one. So far, so good. When the Neo4j GraphQL Library gets this GraphQL request, it automatically translates it to this Cypher query. I'm going to walk you through it real quick. At the top, we match all the movies, then we filter to only the movies with a certain release date, 1999 in this case. Then, in this CALL subquery in the brackets, we do a relationship traversal from the movie to the actor and get all the names. And then we pack it all up, the actor names and everything, in the form that the GraphQL client expects.

And do you know what's the best part about this? One GraphQL query maps to one Cypher query. Do you see what's missing? The N+1 problem. It just doesn't exist. The library took care of it. We need one query, one call to the database, and it's all done. Isn't that cool?

So now that you know all the basics, let's go on to the actual cool stuff: subscriptions. On the highest possible level, what are GraphQL subscriptions? Two entities are involved, the client and the server, where the client subscribes at the server to a particular event. By doing that, they open up a persistent connection.
The GraphQL specification actually has no clear definition of what the protocol, on the transport level, is supposed to be here. It's up to the server. What we've seen is that most servers use the WebSocket protocol to do just that. In the demo that I'm going to show in a while, we do exactly that: we use WebSockets. Once the connection is established, the server can push events to the client without the client having to do anything. No busy polling, nothing like that. And we can send as many events as we want, which is pretty cool. So we have our real-time data updates, just what the title promised.

Let's go a level deeper and look at how it's implemented with the Neo4j GraphQL Library. The library is the instance in the center; there will be a couple more of these pictures, and it will always be the instance in the center. In this setup, we have a client on the right-hand side who subscribed to an event. Then another client, the one on the left-hand side, sends a mutation. Think of creating a movie. The mutation gets translated to a Cypher query on the fly in the Neo4j GraphQL Library and executed against the database, and the result comes back to the library, which then informs all the subscribed clients that this event happened. Pretty straightforward to this point.

There's one thing we have to keep in mind here: if someone were to make a change to the Neo4j database, there at the bottom of the image, from the outside, so not through the Neo4j GraphQL API, we wouldn't detect that. That's a limitation right now. We know about it and we are working on it, so stay tuned. It might be fixed soon. The other thing is that this is completely agnostic. There's absolutely nothing that prevents you from building on top of this.
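
The flow just described (a mutation is executed, then every subscribed client is informed) can be sketched in a few lines of TypeScript. This is a minimal in-memory illustration, not the library's implementation: `SubscriptionHub`, `createMovie`, and `MovieEvent` are hypothetical names, and a plain array stands in for Neo4j.

```typescript
// Minimal in-memory sketch of the publish/subscribe flow described above.
// All names here are hypothetical and not part of the Neo4j GraphQL Library.

type MovieEvent = {
  event: "CREATE";
  timestamp: number;
  movie: { title: string };
};

type Listener = (e: MovieEvent) => void;

class SubscriptionHub {
  private listeners: Listener[] = [];

  // A client subscribes; the returned function closes the subscription.
  subscribe(listener: Listener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  // Push the event to every currently subscribed client.
  publish(e: MovieEvent): void {
    this.listeners.forEach((listener) => listener(e));
  }
}

// Stand-in for the database: a plain array instead of Neo4j.
const database: { title: string }[] = [];
const hub = new SubscriptionHub();

// Stand-in for a mutation going through the API: write first, then notify.
function createMovie(title: string): { title: string } {
  const movie = { title };
  database.push(movie);
  hub.publish({ event: "CREATE", timestamp: Date.now(), movie });
  return movie;
}

// Two subscribed clients; client B unsubscribes after the first event.
const receivedByA: string[] = [];
const receivedByB: string[] = [];
hub.subscribe((e) => { receivedByA.push(e.movie.title); });
const unsubscribeB = hub.subscribe((e) => { receivedByB.push(e.movie.title); });

createMovie("The Matrix");
unsubscribeB();
createMovie("Hot Fuzz");

// A write that bypasses createMovie is never published, which mirrors the
// limitation mentioned in the talk.
database.push({ title: "Written directly" });

console.log(receivedByA); // [ 'The Matrix', 'Hot Fuzz' ]
console.log(receivedByB); // [ 'The Matrix' ]
```

The direct write at the end is in the database but reaches no subscriber, because only changes that go through the API layer produce events.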
But I think I've now given you enough context to see all of this in action. You can look at the demo code later on; it's from my colleague Andres. He built these amazing demos, not just the one I'm showing you now but several of them. So let's head over and look at some code.

As you saw before, the first thing we have to do is define the type definitions. We're going to use slightly bigger type definitions this time than just the type Movie. First we define two types, type Movie and type Person, and give them some fields as specified here. Then you see a directive that you haven't seen before, which is called relationship. It does exactly that: it specifies the relationship between the movie and the person. In the directive, we can specify the type of the relationship, ACTED_IN in this case, the direction, and the properties. Because remember, in Neo4j relationships are first-class citizens. They can hold data. This is how we specify that in GraphQL: we define an interface, and then we can add any data we want to it. Here we're just providing some roles, because as a person you have to have a role in a movie, right? That forms our base.

Let's briefly look at the server code, because that can be quite interesting for a lot of people as well. The first thing to note is that subscriptions are a plugin. We let people develop their own plugins. If someone in the community, maybe you, wants to develop their own plugin, you're free to do that. We have provided a bunch of them for you already, but more on that later. For the demo, we use something very simple: event emitters from Node.js, to get it done quickly. So you define a plugin. We have some for you, so there's no need to write them yourself, but this is a demo, and I want to point out a couple of things.
Then you have to get your type definitions, of course. You have to create a Neo4j driver to access your database. I started the database in the background for the demo, so that's all running. No worries here. The first important point is that we have to pass those to the Neo4j GraphQL class, which we instantiate, and also include the plugin, to tell the running Neo4j GraphQL instance that subscriptions are turned on.

Then we have to do the entire server setup. We need an Express server and a WebSocket server on top of that. I'm leaving out a bunch of details, which you can look at at your own convenience. They don't really add to the demo right now, and it's not really hard. Don't worry. Then we have to create the schema, including all the resolvers, and we pass that schema to an Apollo Server, because Apollo is just a convenient way to do that. Then we can start it. I did that already, so we're not losing any valuable time.

If we head over to the code, the first thing you're probably going to notice is the root types over here. I mentioned those before. We have Query and Mutation, where the auto-generated content is ready: create, update. But we're not interested in that right now. We want to look at subscriptions. Subscriptions come with a lot of this automatically generated goodness out of the box. We have, for instance, movie created, deleted, and updated. The same for person. And again, we have where clauses, filtering, already built in for you. We have the same for relationships, because they're first-class citizens. So if you create a connection between a movie and an actor, this is built in for you. You can just use it. And that's what we're going to do now. But I'll start with a simple case, so we can all get our heads around it easily. So we can create a subscription, right?
A movie created subscription, with an event and a timestamp. And then we say what exactly we want returned from the movie in that event. This is just standard GraphQL, so I can add whatever I want here. I know there's a tagline, so I'm going to add it. Then I hit subscribe, and I see it listening.

What do we have to do now? Create a movie, right? Let's do that. We have a mutation, create movie, Story of Toys. If we execute that, we see we already have a subscription event in, but we also have the regular response from the mutation that comes back. And we have the subscription event here: the event type, which is create, the timestamp, and the movie in exactly the form that we requested. And this is a subscription, so we want to see multiple of those. Look at that: two. And of course, if I add another one, they all come in a natural flow. I didn't have to do anything except listen.

So let's stop this example and hop over to the relationship one. For a movie relationship created event, we do the same again. We want to see which event it is, the relationship field name, which movie got created, which actor node got created, and the roles that actor had. Again, we can do whatever GraphQL offers us at this point. Let's hit subscribe. Again, it's listening. That's amazing. Now let's create a movie and an actor. We create Gump Forest with an actor called Kevin Bacon, and he apparently was a runner in that movie. We have to see that, right? So let's store it. And again, it's here: we have connect as the event type, the relationship field name, and Gump Forest is here. And of course, if I create another Kevin Bacon, look at that, he's here as well. So that all works out of the box. And it's extremely fast. As you can see, it's already here.
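
Since the demo screen is not visible in a transcript, the two kinds of events it shows can be sketched as TypeScript types. This is only an illustration built from the fields named in the talk (event, timestamp, relationship field name, movie, actor, roles); the type names, the nesting, and the value `"actors"` for the relationship field are assumptions, and the schema the library actually generates may differ.

```typescript
// Hypothetical payload shapes modelled on the fields named in the demo.
// The real generated schema may name or nest these fields differently.

type MovieCreated = {
  event: "CREATE";
  timestamp: number;
  movie: { title: string; tagline?: string };
};

type MovieRelationshipCreated = {
  event: "CONNECT";
  timestamp: number;
  relationshipFieldName: string;
  movie: { title: string };
  actor: { name: string };
  roles: string[];
};

type DemoEvent = MovieCreated | MovieRelationshipCreated;

// Narrow on the `event` discriminant and render one log line per event.
function describeEvent(e: DemoEvent): string {
  switch (e.event) {
    case "CREATE":
      return `created movie "${e.movie.title}"`;
    case "CONNECT":
      return (
        `${e.actor.name} connected to "${e.movie.title}" via ` +
        `${e.relationshipFieldName} as ${e.roles.join(", ")}`
      );
  }
}

// Sample events using the names from the demo; timestamps are placeholders.
const events: DemoEvent[] = [
  { event: "CREATE", timestamp: 1, movie: { title: "Story of Toys" } },
  {
    event: "CONNECT",
    timestamp: 2,
    relationshipFieldName: "actors", // assumed field name
    movie: { title: "Gump Forest" },
    actor: { name: "Kevin Bacon" },
    roles: ["Runner"],
  },
];

const lines = events.map(describeEvent);
console.log(lines.join("\n"));
// created movie "Story of Toys"
// Kevin Bacon connected to "Gump Forest" via actors as Runner
```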
So that concludes our demo part, and I'm going to show you some more details of what happens in the background. Let's head back to the slides. And now they're loaded.

Without subscriptions turned on, so just regular GraphQL queries, if you create a movie, Hot Fuzz, with the GraphQL mutation on the left-hand side, the Neo4j GraphQL Library will create this Cypher query for you: a CALL subquery, again in brackets, that creates the movie, sets the title, and returns it in a fashion that a GraphQL client can understand. But if we turn on subscriptions, as in the demo, we have to add some extra data. The good thing, and that's how it's supposed to be, is that the GraphQL mutation on the left-hand side didn't change at all. But we add some extra information to the Cypher query, which is this metadata here with the event, create, as we've seen before, the type name, the timestamp, all the things you've seen in action. We add that to the Cypher query because, as you see at the bottom in the RETURN statement, we return it after the query has hit the database.

Why do we do that? It means we can keep the database consistent without changing anything else. We have one mutation that changes the database, and then we can have a multitude of subscribers on the other end consuming this event. So we keep data consistency in the database, and we can send the event to as many subscribers as are subscribed, without any impact on the data or the database.

But that's actually the point, Thomas. Many thousands of subscribers, is that going to work? Do we not need to horizontally scale all of that? At some point a single instance of the GraphQL library cannot handle everything. So we have to scale horizontally, which means we will end up with situations like this one, where a client subscribes to an instance on which the mutation on the left-hand side didn't even occur.
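
The situation on this slide can be reproduced in a small in-memory sketch. Everything here is hypothetical (`ApiInstance`, `Broker`, and their methods are illustration names, not library APIs): each instance only knows its own subscribers, so an event raised on one instance never reaches a client connected to another unless something fans it out. The `Broker` class is a stand-in for that missing piece, which the talk addresses next.

```typescript
// Sketch of the horizontal-scaling situation: API instances that each keep
// their own local subscriber list. All names are hypothetical.

type Handler = (title: string) => void;

// Stand-in for a message broker that fans out to every connected instance.
class Broker {
  private instances: ApiInstance[] = [];

  connect(instance: ApiInstance): void {
    this.instances.push(instance);
  }

  broadcast(title: string): void {
    this.instances.forEach((instance) => instance.deliverLocally(title));
  }
}

class ApiInstance {
  private subscribers: Handler[] = [];

  constructor(private broker?: Broker) {
    if (broker) broker.connect(this);
  }

  subscribe(handler: Handler): void {
    this.subscribers.push(handler);
  }

  // Notify only the clients connected to this particular instance.
  deliverLocally(title: string): void {
    this.subscribers.forEach((handler) => handler(title));
  }

  // A mutation handled by this instance: hand the event to the broker if
  // there is one, otherwise notify local subscribers only.
  createMovie(title: string): void {
    if (this.broker) this.broker.broadcast(title);
    else this.deliverLocally(title);
  }
}

// Without a broker: the client on instance B misses a mutation sent to A.
const a1 = new ApiInstance();
const b1 = new ApiInstance();
const seenWithoutBroker: string[] = [];
b1.subscribe((title) => { seenWithoutBroker.push(title); });
a1.createMovie("Hot Fuzz");

// With a shared broker: the same mutation reaches the client on instance B.
const broker = new Broker();
const a2 = new ApiInstance(broker);
const b2 = new ApiInstance(broker);
const seenWithBroker: string[] = [];
b2.subscribe((title) => { seenWithBroker.push(title); });
a2.createMovie("Hot Fuzz");

console.log(seenWithoutBroker); // []
console.log(seenWithBroker); // [ 'Hot Fuzz' ]
```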
Isn't there a missing link here? How does this work? But rest assured, otherwise I wouldn't talk about it: this is fixed. We already built all of that for you. So this will work, which means your use case will scale. Both your Neo4j database and your Neo4j GraphQL API will scale with your use case. We do that, as you can see here, like this: after we do the mutation and the Cypher query and the result is back in the GraphQL library, we push an event, number three there, to a broker, which can be Kafka, Redis, or RabbitMQ. That broker in turn broadcasts, or fans out, the event to all the running instances, and those then send the event to their subscribers. So scaling solved, right?

Now you may say, Thomas, cool, that's a nice promise, but this must be very hard to implement and set up. Actually not. It's super easy. Look at that. Remember the plugin system from before? We already built a plugin for you that handles just that case. It's called subscriptions AMQP, and all you have to do is define the connection details and credentials for your AMQP broker. And yes, sorry, you still have to set up the broker, but there are many ways to do that. So you only have to provide the credentials to our plugin, pass that plugin to the Neo4j GraphQL instance as before, as you saw in the demo, and off you go. That's all you need to do. Currently subscriptions are still in beta, but they will very soon be generally available, and then you can get started with this.

To round things off, I want to talk about something really cool. You may have seen this somewhere before. It was on Reddit, on April Fools' Day in 2017. It's r/place. It allowed users to set one pixel to one color; then they had to wait 15 minutes before setting another pixel, and so on and so forth, for a lot of users.
This created really, really amazing community art, and we thought this is the coolest thing to showcase our GraphQL subscriptions, and we did just that. We created NeoPlace for you. You can access it right now under this link or the QR code that is visible here. We took the same ideas but built it with the things that I just highlighted: a Neo4j database in the cloud, an Aura database. We used a broker to make horizontal scaling a breeze. We of course used our Neo4j GraphQL Library and the GraphQL API, and of course GraphQL subscriptions for real-time updates when people set pixels. So go there, try it out, have fun, draw something amazing, and enjoy the power of GraphQL, GraphQL subscriptions, and Neo4j.

And that's all I have for you today. You can head over to the open-source library. It's on GitHub; you can see the link here where you can find us. We have all the plugins there, a lot of things to take in, open source and available to you all the time. This is also a great opportunity to say thank you to our community. You can always reach out to us on Discord under this link. We got a lot of time and inspiration from our community, and they helped us build the product that is there now. If we didn't have you as a community, we wouldn't be here right now presenting this wonderful product. So thank you very much, and see you next time. Bye-bye.
22 min
08 Dec, 2022
