The expansion of data size and complexity, broader adoption of ML, as well as the high expectations put on modern web apps all demand increasing compute power. Learn how the RAPIDS data science libraries can be used beyond notebooks, with GPU accelerated Node.js web services. From ETL to server side rendered streaming visualizations, the experimental Node RAPIDS project is developing a broad set of modules able to run across local desktops and multi-GPU cloud instances.
GPU Accelerating Node.js Web Services and Visualization with RAPIDS
AI Generated Video Summary
Welcome to GPU Accelerating Node.js Web Services and Visualization with Rapids. Rapids aims to bring high-performance data science capabilities to Node.js, providing a streamlined API to the Rapids platform without the need to learn a new language or environment. GPU acceleration in Node.js enables performance optimization and memory access without changing existing code. The demos showcase the power and speed of GPUs and Rapids in ETL data processing, graph visualization, and point cloud interaction. Future plans include expanding the library, improving developer UX, and exploring native Windows support.
1. Introduction to GPU Acceleration and Node Rapids
Hi, and welcome to GPU Accelerating Node.js Web Services and Visualization with Rapids. I'm Allan Enemark, and I am the lead of the Rapids Viz team here at NVIDIA.
So, Rapids is an open-source GPU-accelerated data science platform, and you can find more details at rapids.ai and nvidia.com. Node Rapids, which is the project I'm going to be talking about, is an open-source modular library of Rapids-inclusive bindings in Node.js, as well as some other complementary methods for supporting high performance browser-like visualizations. It's currently in technical preview, but you can find more details about it at github.com/rapidsai/node.
2. Introduction to Node Rapids
Rapids provides data science libraries, machine learning algorithms, and visualization tools. It is traditionally used with Python and C++ on Linux, but can also be used on Windows through WSL 2. In the Viz ecosystem, libraries like cuGraph and Datashader are used for creating dashboards and server-side rendering. Node Rapids aims to bring high-performance data science capabilities to Node.js, allowing developers to leverage existing JS viz libraries and accelerate their applications. It provides a streamlined API to the Rapids platform without the need to learn a new language or environment.
So what do you get with Rapids, which is traditionally Python and C++? You get these data science libraries, such as DataFrame operations in cuDF, you get cuML, which is a lot of GPU-accelerated machine learning algorithms, cuGraph for graph stuff, spatial, signal, and the like, with more being developed continuously, and these are continuously getting improved. The caveat is that these are mainly built around Linux-based systems, so if you want to use Windows with them, you can, but it has to be through WSL 2.
3. GPU Acceleration and Architecture
We also have a really nice SQL engine that we bind to, which enables us to do multi-node and multi-GPU stuff when needed. Then there's a whole data science wing, all in Rapids, so you have your cuDF and cuGraph stuff. And then this is the graphics column here, where we're taking advantage of the fact that WebGL is a subset of OpenGL. So really what we're doing with these bindings is, you can take your WebGL code, run it in OpenGL, and get the benefit of that performance increase from OpenGL. We're also doing things with GLFW bindings and Node processes. But again, you can now use your LumaGL, DeckGL, SigmaJS, and we're hoping to get to Three.js, and basically run them in OpenGL without much effort.
So, another component to this is that since you're already on the GPU, you get the benefit of GPU video encoding. By taking advantage of that and using WebRTC, you can do server-side rendering, stream that over to the browser, and have the browser-side JS interact with it like a video tag, so it's lightweight in that sense. It enables a lot more capabilities. Or you can just do stuff like interacting with all of this in a notebook.
So, what do we mean by architecture? GPU hardware is a little bit different. It's pretty straightforward when you have a single GPU and are just doing client-side rendering. All the compute happens on the GPU, you send those computed values over, and the client JS renders those few values. Pretty straightforward, and sort of where GPUs excel. It excels so well that you can have multiple users accessing the same GPU data and it's fast enough to handle that. Or if you have particularly large data, NVIDIA has NVLink, so you can link multiple GPUs together and get expanded memory and compute power that way. Or you can go with a more traditional cloud architecture, where you have multiple GPUs all separated, lots of child processes running on each GPU, and a load balancer running across all of those. So you have lots of instances of people accessing it, and basically whatever GPU is free at that moment is the one that is serving up the information to that user. So it's pretty straightforward. Again, you still tend to need a lot of GPUs, but it's not something terribly unfamiliar.
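The "whatever GPU is free serves the user" idea from the cloud architecture above can be sketched in plain Node.js. This is only an illustration under assumptions: the `GpuPool` class, its method names, and the in-flight counter are hypothetical, and the worker objects stand in for child processes that would each be pinned to one GPU. It is not how Node Rapids itself schedules work.

```javascript
// Hypothetical sketch: route each request to whichever GPU worker is least busy.
// Each worker object stands in for a child process pinned to a single GPU.
class GpuPool {
  constructor(gpuIds) {
    this.workers = gpuIds.map((id) => ({ id, inFlight: 0 }));
  }

  // Pick the worker with the fewest in-flight requests (ties go to the lowest index).
  acquire() {
    let best = this.workers[0];
    for (const w of this.workers) {
      if (w.inFlight < best.inFlight) best = w;
    }
    best.inFlight += 1;
    return best;
  }

  // Mark one request on this worker as finished.
  release(worker) {
    worker.inFlight = Math.max(0, worker.inFlight - 1);
  }

  // Run an async task on the chosen worker and always release it afterwards.
  async run(task) {
    const worker = this.acquire();
    try {
      return await task(worker.id);
    } finally {
      this.release(worker);
    }
  }
}

const pool = new GpuPool([0, 1, 2]);
const first = pool.acquire();  // all idle, so the first worker is chosen
const second = pool.acquire(); // first worker is busy, so the next idle one is chosen
pool.release(first);
const third = pool.acquire();  // the first worker is free again and is reused
console.log(first.id, second.id, third.id);
```

In a real service, `run` would wrap the message round trip to the child process, so a request never holds a GPU slot after it fails.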
4. ETL Data Processing in Node Rapids
We'll demonstrate basic ETL data processing in a Node.js notebook. Using a 1.2GB dataset of US car accidents, we'll load it into the GPU and perform operations like filtering, parsing temperature data, and applying regex operations with cuDF. The dataset has 2.8 million rows with 47 columns. The entire filtering operation took only 20 milliseconds, and the regex operation took 113 milliseconds on the GPU. This showcases the power and speed of GPUs and Rapids.
We're basically just going to show off some basic ETL data processing stuff in a notebook. It's a notebook that is running Node.js and JS syntax, but we're basically using it as a placeholder for very common Node services, like batch services, which you'd need for parsing logs, or serving up subsets of data, things like that.
So in this case we're going to do it live. Right now I have a Docker container running our Node Rapids instance, and a notebook. And so here I have a Jupyter notebook, in JupyterLab. If you're not familiar, it's sort of like the data science IDE. And you can see here all we're doing, like you would in any sort of Node app, is require our Rapids AI cuDF. We're going to be loading a 1.2 gigabyte dataset of US car accidents from Kaggle, and we're going to read this as a CSV into the GPU. And well, it was actually extra fast today. It took under a second, which is kind of wild considering this is a 1.2 gig file. And so how big is this? It is 2.8 million rows with 47 columns. So for us, you know, not that big of a dataset, but for people typically used to doing data in Node.js and stuff, it's pretty good. And how fast and responsive it is, is kind of impressive.
So we're going to just see, you know, what the headers are in this. It's sort of messy, and we're gonna go through this pretty quick, but basically we get rid of some of the ones we don't need. Then we're gonna see what the columns are and say, all right, there's some temperature data in here. Let's parse this to see what the ranges are, as a sanity check. And you say, all right, there are some really wonky values in there, as you always have with data stuff. Let's bracket it so it's sensible. So this whole operation of, again, 2 million rows, 2.8 million rows, took 20 milliseconds to filter out. It's kind of wild. So now this is looking better. We have basically 2.7 million now. And we can start doing some other operations. In this case, we're gonna stringify it, and then with cuDF you get some regex operations. And so we're gonna have a pretty complicated regex here where we sort of mask out commonalities between these terms. And again, 2.8 million, 2.7 million rows, it took 113 milliseconds. So you can see how powerful and fast it is on a GPU. And basically at the end of this, you're seeing that, yeah, when there are clouds and rain, the accidents get more severe. But this is just the first showcase of what you can do with GPUs and Rapids.
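To make the two notebook steps concrete (bracketing the temperature column to sensible values, then grouping weather strings with a regex), here is a minimal CPU-only sketch in plain JavaScript. It does not use cuDF at all: the sample rows, the field names `temperatureF` and `weather`, the bracket limits, and the regex patterns are all made up for illustration. In the demo the same logic runs on the GPU over millions of rows.

```javascript
// CPU stand-in for the notebook steps; the rows and field names are hypothetical.
const rows = [
  { temperatureF: 72.5, weather: 'Light Rain' },
  { temperatureF: -190.0, weather: 'Clear' },            // implausible reading
  { temperatureF: 41.0, weather: 'Mostly Cloudy' },
  { temperatureF: 35.6, weather: 'Heavy Rain / Windy' },
  { temperatureF: 203.0, weather: 'Overcast' },          // implausible reading
];

// Step 1: keep only rows whose temperature falls inside a sensible bracket.
// The limits are assumptions chosen for this example.
const MIN_F = -20;
const MAX_F = 120;
const sane = rows.filter((r) => r.temperatureF >= MIN_F && r.temperatureF <= MAX_F);

// Step 2: collapse related weather strings into a few shared labels with regexes.
// Patterns are checked in order, so the first match wins.
const patterns = [
  [/rain|drizzle|shower/i, 'rain'],
  [/cloud|overcast/i, 'cloud'],
  [/clear|fair/i, 'clear'],
];

function normalizeWeather(text) {
  for (const [re, label] of patterns) {
    if (re.test(text)) return label;
  }
  return 'other';
}

const labeled = sane.map((r) => ({ ...r, weatherLabel: normalizeWeather(r.weather) }));
console.log(labeled.map((r) => r.weatherLabel));
```

The regexes deliberately omit the `g` flag, since `test` on a global regex keeps state between calls and would give inconsistent matches when reused across rows.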
5. GPU Acceleration in Node.js
Now you can access GPU acceleration in Node.js, enabling performance optimization and memory access without changing existing code. We demonstrated this using Sigma JS, a graph rendering library, and a geospatial visualization with deck GL and an Uber data set of 40 million rows. With GPU memory loading and server-side compute, multiple instances can access the same GPU and data set, providing real-time results and unique views of the data.
And now you get access to it in Node.js, which we think is pretty neat, with the JS syntax, which is kind of cool. So we're gonna step it up a bit more. The next demo is using Sigma JS, which, if you're not familiar, is a graph rendering library that's client side only. It's pretty performant and pretty neat, and they have a lot of functionality. But what we did is basically take the graph loading out of system memory and load it into GPU memory instead, and then serve that up or stream it into the client-side app.
So in this case, ignore the numbers. It's actually 1 million nodes with 200,000 edges, and you can see it zooms in and pans buttery smooth, even though it's in WebGL. So you get that great interactivity, and all that. But this is technically loaded into GPU memory, so that means you're not bottlenecked by web browser limitations. Without using a GPU, you're basically limited to 500,000 nodes, and then the tab craps out on you. So as you can see here, it's using the GPU for the nodes and edges, and it's straightforward Sigma JS with very little changes. So what we're hoping to do in the future is enable lots of Sigma JS users to get that performance optimization and memory access without really needing to change much of how their code works.
For this next one, we're doing something similar. We're going to do server-side compute and client-side rendering of a geospatial visualization. It's using deck GL and an Uber data set of 40 million rows. So, you know, not bad. Still not that big for us. And you can see, basically, it's all computed into sources and destinations. So by clicking on each of these areas, you're computing how many of those trips went where and at what time. And you can see that clicking on each of these is essentially instantaneous, and you get the results from 40 million rows. It's so fast, you can basically just start scrubbing the values and getting that interaction back. So what this basically means is you can get away with having multiple instances accessing the same GPU and the same data set. And because this is a React app, the state is managed on the client side. So that means they all get their unique view of the data. But you're still querying that same GPU, and because it's so fast, it gets the values back basically still in real time. And so again, like I said, really where the GPU excels is doing the server-side compute and streaming over the values. So you can get away with quite a lot, even on a single GPU. So the next one is a bit more complicated. You can see we're sort of building up modules as we go.
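The pattern described here, one shared data set on the server with each client keeping its own selection state, can be sketched without any GPU code. Everything below is an assumption made for illustration: the tiny `trips` array, its field names, and the `countDestinations` function are hypothetical stand-ins for the GPU-resident table and the query the demo runs against it.

```javascript
// Hypothetical trip table shared by every client; in the demo this lives in GPU memory.
const trips = [
  { source: 'A', destination: 'B', hour: 8 },
  { source: 'A', destination: 'C', hour: 9 },
  { source: 'A', destination: 'B', hour: 17 },
  { source: 'B', destination: 'A', hour: 8 },
];

// Stateless query: everything that makes a view unique arrives with the request,
// so the server keeps no per-client state and the data is never copied per user.
function countDestinations(data, { source, hourRange: [from, to] }) {
  const counts = {};
  for (const t of data) {
    if (t.source === source && t.hour >= from && t.hour <= to) {
      counts[t.destination] = (counts[t.destination] || 0) + 1;
    }
  }
  return counts;
}

// Two clients hold different selections but query the same table.
const clientOne = countDestinations(trips, { source: 'A', hourRange: [0, 23] });
const clientTwo = countDestinations(trips, { source: 'A', hourRange: [6, 10] });
console.log(clientOne, clientTwo);
```

Because the selection travels with each request, duplicating a browser tab just creates another independent set of query parameters against the same data.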
6. GPU Acceleration and Visualization Demos
This part showcases a video-streamed, server-side rendered graph visualization using LumaGL, OpenGL, and NVENC video encoding. It demonstrates filtering and querying a dataset of 1 million edges and nodes in real time. The ForceAtlas2 layout algorithm is applied to compute the layout for each node and edge, streamed to the browser as a video. Additionally, a point cloud dataset is used to showcase the ability to pan, zoom, and interact with the data. There are more demos and examples available, including graph renderings, Deck GL demos, and a multi-GPU SQL engine query demo.
We're going to use cuDF for the data frame, and cuGraph for the graph layout compute. So you can see we're starting to wrap this in a little bit more of a GUI app. In this case we're just using a sample data set. We're selecting those edges and colors, and then we're going to render it. So you have this giant sort of hairball, but it is 1 million edges and 1 million nodes that you're able to filter and query in real time. So you can see how quickly you can query it, clear the filter, and you're back to a million nodes, a million edges. You can basically tooltip and hover over each of these.
Now visually it doesn't really make that much sense, but this is more, again, for a performance benchmark and to show how you can still get each individual node. And now we're going to run ForceAtlas2 in real time. ForceAtlas2 is basically a layout algorithm, and so it's going to be computing the layout for every single node and every single edge in real time, with all the forces being calculated. And you can see this is doing it on, again, a million edges and a million nodes, and it's all streamed as video to the browser. You can see it moving through the iterations in real time, which is pretty wild. Again, this is just a single GPU and you're able to interact with this much data. And I kind of love looking at this, it's sort of like looking at the surface of the sun, which is pretty neat.

And for the last one, it's basically very similar, except we're going to be using point cloud data instead. We don't need to do any graph layout and stuff, but it's a similar thing where we're doing video streams. So in this case, it's a point cloud of a building scan, so you can pan and zoom and interact with it. But again, because GPUs are pretty powerful, I'm just going to duplicate this tab and have multiple instances of the same data set from the same GPU. And I'm, again, able to interact with it uniquely for each instance, because of how we're handling the data. And so, again, this is a video stream, pretty cool. So you can use a single GPU for a lot.
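For a rough sense of what "all the forces are being calculated" means in the ForceAtlas2 demo above, here is a simplified, CPU-only sketch of one force-directed iteration in that style: repulsion between every pair of nodes weighted by degree, and attraction along each edge proportional to distance. It is an assumption-laden illustration, not the cuGraph implementation. It leaves out gravity, adaptive speed, and the approximations a real implementation needs at a million nodes, and the `layoutStep` name, its parameters, and the sample graph are hypothetical.

```javascript
// Simplified ForceAtlas2-style iteration: degree-weighted repulsion, linear attraction.
// Omits gravity, adaptive speed, and Barnes-Hut; O(n^2), for illustration only.
function layoutStep(positions, edges, { repulsion = 1.0, step = 0.01 } = {}) {
  const n = positions.length;

  // Node degree feeds the repulsion weight, so hubs push harder.
  const degree = new Array(n).fill(0);
  for (const [a, b] of edges) {
    degree[a] += 1;
    degree[b] += 1;
  }

  const forces = positions.map(() => [0, 0]);

  // Repulsion between every pair: magnitude k * (deg_a + 1) * (deg_b + 1) / distance.
  for (let a = 0; a < n; a++) {
    for (let b = a + 1; b < n; b++) {
      const dx = positions[a][0] - positions[b][0];
      const dy = positions[a][1] - positions[b][1];
      const dist = Math.max(Math.hypot(dx, dy), 1e-9); // avoid division by zero
      const mag = (repulsion * (degree[a] + 1) * (degree[b] + 1)) / dist;
      const fx = (dx / dist) * mag;
      const fy = (dy / dist) * mag;
      forces[a][0] += fx;
      forces[a][1] += fy;
      forces[b][0] -= fx;
      forces[b][1] -= fy;
    }
  }

  // Attraction along each edge: magnitude equal to the distance between endpoints.
  for (const [a, b] of edges) {
    const dx = positions[b][0] - positions[a][0];
    const dy = positions[b][1] - positions[a][1];
    forces[a][0] += dx;
    forces[a][1] += dy;
    forces[b][0] -= dx;
    forces[b][1] -= dy;
  }

  // Move every node a small fixed step along its net force; the input is not mutated.
  return positions.map(([x, y], i) => [x + step * forces[i][0], y + step * forces[i][1]]);
}

// Nodes 0 and 1 are linked and far apart; node 2 is unlinked and sits close to node 0.
const start = [[0, 0], [10, 0], [0, 0.5]];
const links = [[0, 1]];
const next = layoutStep(start, links);
console.log(next);
```

After one step the linked pair should drift toward each other while the unlinked neighbor is pushed away, which is the behavior the streamed video shows at scale over many iterations.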
That's a lot to talk about. We have even more demos and examples. This is the location of that. We have things where we're showing graph renderings with GLFW. We have all the Deck GL demos with locally rendered OpenGL. We have a really good multi-GPU SQL engine query demo, where we are querying all of English Wikipedia using multiple GPUs.
7. Future Plans and Community Engagement
We have examples showing the UMAP clustering algorithm and the spatial quadtree algorithm. So they're pretty cool. I recommend you check them out. Even if you don't install it, we have links to YouTube videos of the demos.
So what's next? Going forward, we have three main guiding roadmap ideas. We're going to continue doing these demos and working on external library bindings. Again, we're looking forward to doing Three.js and some more work with Sigma.js. From that, we're going to start doing more specialized applications, specifically around cyber and graph, geospatial, point cloud, stuff like that. Maybe something that makes it a little bit easier for turnkey usage. And then, hopefully, broader community adoption, specifically in non-Viz use cases. We're biased, being the Viz team, toward Viz stuff, but we know that providing these CUDA bindings opens up an opportunity for a lot more functionality that we're just not thinking of because we're not aware of it. So we'd love to hear back from the community about novel ways to use this.