Largely based on Free Association in the Metaverse (Avaer @Exokit, M3). Avaer will demo some of the ways that open standards enable open and free traversal of users and assets throughout the interconnected metaverse.
Building the Interconnected, Traversable Metaverse
AI Generated Video Summary
The Workshop discussed the potential of the Metaverse and the open virtual worlds on the web. They highlighted the use of AI integration, open-source projects, and standard tools to build an open and forward-thinking Metaverse. The focus was on performance optimization, rendering, and asynchronous loading, as well as the integration of 3D NFTs and metadata. The Workshop also explored procedural generation, avatar customization, and multiplayer synchronization. They emphasized the importance of interoperability and the potential of the web as the best gaming platform.
1. Introduction to the Metaverse
Let's take a look! Yeah, so another title for this room might have been Import the Metaverse. I have a few goals that I wanted to hopefully accomplish here. The main one is just to open your eyes to what's possible for open virtual worlds on the web. A stretch goal is to convince you that the Metaverse can be good; I know there's a lot of skepticism and controversy about that these days. And my personal goal is, I don't know, I hope I learn something today.
I wrote some script notes that I kind of wanted to cover, but I hope that anybody who has any questions about the crazy things that I'm going to be talking about, just kind of chime in. I'd love to kind of deep dive into whatever parts of this people find most interesting.
So, some background on me: I've been making Metaverse-adjacent technologies for about eight years now, and all of that has been on the web. I basically started on the web and I never left. And I've been constantly surprised by what I was able to do using basically what's available in the browser, the open-source tooling that's already out there, to make experiences, worlds, and even my own browsers.
I wrote a browser called Exokit. I was really surprised that all of this is possible on an open technology like the web, one that basically anybody can host and anybody can experience on any device. And it seemed like I was the only person doing that, every single time I would show off one of the cool new hacks I made, whether that was the browser running on Magic Leap, or an immersive N64 emulator I wrote that basically let you play The Legend of Zelda on your HTC Vive, in the browser no less, using WebVR, as it was called at the time.
So I basically started hanging around all the people who thought, wow, this is really cool, I didn't know this was possible. And we kind of started building up this technology toolkit of all these different things, which we've been releasing open source from day one and working on together. And eventually what it started to look like was that we basically had the ingredients to, quote unquote, import the metaverse, meaning you can use standard web browser APIs to load in assets from all over the web, from all kinds of different places, including, for example, IPFS, GitHub hosting, basically any website, or even things like NFTs.
So we basically started plugging all of this stuff together and realized that, hey, we're actually making possibly the world's first distributed video game, where all of these assets are just kind of plugged together using a client that we're all agreeing to use. The technology that makes all these translations happen and plugs into the game engine, we ended up calling Totem, because we're basically trying to develop a culture of creating art from the raw ingredients of the web. We're sticking things together; things are being imported, they're interacting with each other, and we're stacking art together. We're creating these virtual totems, in a sense. Everything in the game is an asset with a URL, and that can even be a local file if you're hosting locally, which I am right now. It could be something on a GitHub repo, IPFS, or, like I mentioned, even NFTs, if you just want to point it at an Ethereum address. And in the future, there are a lot of really crazy things we'll be able to do once we get this baseline game working. One of them is just providing simple services to recreate the software that a lot of people love, like VRChat, where you basically have your avatar, which can animate and has facial expressions. You have mirrors, and soon we'll have AR as well as VR support, as well as integration with Kalidokit.
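To make the idea concrete, here is a hedged sketch, not the actual Totem source, of how a client might decide which loader to use for a given asset URL. The function name and the categories are hypothetical; the point is just that everything reduces to a URL that can be dispatched on.

```javascript
// Illustrative only: classify an asset URL into the kind of loader
// a metaverse client might hand it to. All names are hypothetical.
function classifyAssetUrl(url) {
  if (/^ipfs:\/\//.test(url)) return 'ipfs';            // IPFS-hosted content
  if (/^ethereum:\/\//.test(url)) return 'nft';         // NFT pointed at an Ethereum address
  if (/\.vrm$/i.test(url)) return 'avatar';             // VRM avatar
  if (/\.glb$/i.test(url)) return 'model';              // glTF binary model
  if (/\.metaverse$/i.test(url)) return 'app';          // .metaverse JSON app descriptor
  return 'html';                                        // fall back to treating it as a website
}
```

A real implementation would then recursively resolve whatever the loader returns, since apps can import other apps.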
2. Building Worlds and Creating Spatial Relationships
This allows us to build worlds and traverse across them, creating spatial relationships. Scenes can be contained within each other, providing level of detail without background noise. The street is a directional representation of virtual worlds, with different content sections. Metadata defines game elements like quests. Our goal is to build the first distributed video game, using open-source technology. We use three.js for rendering and PhysX for physics. React is used for the UI. We support drag and drop and smooth animations. Our toolkit aims to make these elements work together seamlessly.
So that's basically a way where you can use your webcam to v-tube your avatar here on the side. Another thing this allows is for us to build worlds that we can traverse across and share, in the sense that I have some way to identify where I am in this quote-unquote metaverse, and I can go over to your place, and we can have a spatial relationship between those two places.
So this is still actually kind of buggy, but I'm going to make it a little more manageable. Because we're loading everything through these structured import statements, it allows us to have one scene contained within another scene, and for scenes to be a first-class concept that can, for example, be captured, so you can draw previews of a scene and basically get level of detail for a distant asset without having to deal with any of that scene's background.
And what we're also going to do to improve that is use marching cubes to give basically a full 3D mesh, with texture, of the virtual world that will be there when you go there. And of course, you can actually just go there, and now you're in one of these virtual worlds. Another thing this allows us to do is use metadata to define the kinds of things you'd want in a video game, like, for example, quests. There's actually a quest in this world. It says: destroy all Reavers in the area, and there's a reward of some virtual asset. So if I can find it... I'm not sure where it is, and this might actually be broken, but we also have a pathfinding system to geolocate you in the virtual world so you can locate your objective. And then if there are some Reavers, you can go and slay them and get your virtual silk, whatever that means.
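As a purely hypothetical illustration of the quest-metadata idea, a quest definition along the lines described (a description, a kill objective, and a virtual-item reward) might look something like this. The field names are invented, not the actual format:

```json
{
  "name": "Reaver Hunt",
  "description": "Destroy all Reavers in the area.",
  "objective": { "type": "kill", "target": "reaver", "count": 5 },
  "reward": { "type": "item", "name": "Silk", "amount": 1 },
  "marker": { "position": [0, 0, 50] }
}
```

Because this is plain metadata on an asset, the same pathfinding system that geolocates you can point you at the marker position.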
3. AI Integration and Open Metaverse
We have AI support for everything in the world, allowing us to build amazing experiences. By combining AI with open data, code, and metadata, we can generate character scripts and have avatars that are true AIs in the open virtual world. We can define how objects are worn and their abilities. Avatars have pathfinding implemented and can interact with objects. We also have features like speech-to-avatar mapping and speech-to-text conversion. Our goal is to build an open metaverse using standard tools and open-source projects, creating the next version of the internet.
The animations for the avatars are from Mixamo, so that's also basically just open stuff that we're reusing. And we even have AI support for basically everything in the world. This kind of shows one of the coolest reasons why we want to build things this way: it's not only that it lets us work together, which it definitely does, it's also that if you combine this with AI, you can build some really amazing experiences. The way AI works is that you need all the data to be structured, in the sense of open data that you can download, things that you can import, code that you can run, and metadata that you can parse, in order to generate, for example, a script between multiple characters talking. And we actually have that system integrated as well. You basically plug in an API key for any AI engine that you want, whether that's GooseAI, which is the open one, your own self-hosted engine, or even OpenAI. Essentially, it will write character scripts for things that happen in the world, which then actually happen in the world: we parse the script out, which basically allows you to have avatars that are true AIs in your open virtual world. I actually can't show you too much of that because it's still not approved for deployment by OpenAI, but I'll give you a sense of what is possible. By the way, if there are any questions, I'd be glad to hear them if I'm going too fast.
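To illustrate the "parse the script out" step, here is a minimal hedged sketch, assuming a simple "Name: line" script format. The real format and parser aren't shown in the talk; this just demonstrates turning generated text into structured events the engine could act on:

```javascript
// Hypothetical: parse a generated script of the form
// "Character: line of dialogue" into events for the engine.
function parseCharacterScript(text) {
  return text
    .split('\n')
    .map(line => line.match(/^(\w+):\s*(.+)$/))  // capture speaker and dialogue
    .filter(Boolean)                              // drop lines that don't match
    .map(([, speaker, dialogue]) => ({ speaker, dialogue }));
}
```

Each parsed event could then be dispatched to the matching avatar to speak, emote, or trigger an action.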
Here again, we're loading things in; all these avatars are just different VRM assets. Every object here is basically an interactive JSON file. I can even show you what this hat looks like in terms of the code. It's something we call a .metaverse file, but it's really just JSON. Essentially, the engine sees this and renders it into the world. You have your basic start URL, which references the GLB file; in this case, it's going to be relative to the location of the file, and this file can be hosted absolutely anywhere. Right now it's on GitHub Pages. And then you have the components defining how that hat can be worn. In this case, it will be attached to the head, 0.1 meters above it, and that looks about right. Basically, that gets completely synced with the avatar. We also have a mode of wearing where the item conforms to the avatar, but there's still work we need to do to make sure the displacement of the clothing is proper for the shape of whatever avatar you're wearing. And there are different abilities that objects can be defined to have. In this case, we're working on a spell-system, magic-combat type of thing, as well as things that will actually make sense in the context of an action game, because we probably want to explore that angle. So you probably want your slurp juice, or whatever this is supposed to be. And the different objects can, for example, be editable. The avatars do have pathfinding implemented, so they know to follow you and to interact with the rest of the objects. This is the part that I can't show you, because OpenAI is actually pretty strict about what you're allowed to publicly show, but there's a way we can literally talk to an avatar, have it go and fetch an object for us, and then, for example, help us in a fight.
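Based on the description above, the hat's .metaverse file might look roughly like this. The exact field names are an approximation of what's described, a start URL pointing at the GLB plus a wear component attaching it 0.1 meters above the head, not a verbatim copy of the real format:

```json
{
  "start_url": "hat.glb",
  "components": [
    {
      "key": "wear",
      "value": {
        "boneAttachment": "head",
        "position": [0, 0.1, 0]
      }
    }
  ]
}
```

Since `start_url` is relative, the whole directory can be hosted anywhere, GitHub Pages included, and still resolve.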
That's actually just a relatively simple integration between AI and traditional action-based character systems. We also have features that are, once again, inspired by VRChat, like lip sync for the avatar... sorry, I actually can't demo that because of the echo effect. But basically, when you talk, we run the audio through a speech phoneme engine, and that's mapped onto the avatar. Another cool thing you can do is speech-to-text, so you can turn your mic messages into chat messages. But we can even do one better than that, which is the opposite. So I'll give you a demo. Hey, Drake, what's up? And that's about all I can show you in terms of AI. But yeah, basically our idea is to allow all of these different assets to come together through the powers of AI, using standard tools and standard open-source projects to build out the best, quote-unquote, version of the metaverse. And it's an open one; I think that's the only way we get any sort of good version of the metaverse. I'm not talking about building the kinds of things that Mark Zuckerberg wants to build. I'm more interested in building open systems, basically the next version of the internet.
4. Building an Open and Forward-Thinking Metaverse
We can build an open and forward-thinking metaverse that is accessible on various devices. We're working on multiplayer synchronization using CRDTs for object state replication and WebCodecs for voice transcoding. Performance optimization is a priority, with systems to automatically configure graphics fidelity. We plan to standardize avatars and enable connections with thousands of people. React is used for rendering and UI, with canvas elements and React hooks for engine integration.
And I think we can honestly get there. I think the web has evolved far past the point where we can't do it. And the future actually seems pretty bright for us because, first of all, WebGL and WebXR have great support on Oculus, and it seems that Apple is also on the WebGPU and WebXR bandwagon. So everything we're building here is going to be accessible on everybody's VR devices. In fact, I mostly come from a VR background, and the only reason I'm not doing this in VR is that there's a lot of other stuff I have to do. But we're actually going to fix up the VR here, and basically all of this is going to be fully immersive as well. What we're trying to do here is show the way forward for a version of the metaverse that can be good, that can be open, and that's actually forward-thinking, where whatever we're building is going to be accessible a year from now in whatever form we want, whether that's on desktop, as content we're viewing on our phones, or as a place we're meeting in VR headsets.
So another thing we're working on is, of course, multiplayer, to make sure that everything here can be done in a synchronized fashion with other clients. I hope that's not used to create crazy bots, but you never know. The technologies behind it are again pretty simple: we're using CRDTs to replicate the object state in the world. Essentially, the positions, the transforms, and all the components of each object are in a global store; we get updates from that store, and the store is synchronized back and forth with the server. That's essentially how we do multiplayer: it's just synchronizing those states. And for voice, we actually don't use WebRTC. We're going to use WebCodecs, which essentially means doing the transcoding ourselves and sending it over a WebSocket, which tends to be more reliable than some of the crazy hoops you have to jump through with WebRTC. It's also more secure in the sense that this is something we want somebody to be able to host on their Raspberry Pi if they want: you should be able to set up your own private server that has your own private encrypted voice connections and runs the same content. What I hope for the future is that we start building on more open systems like this, because I think it's truly the way we're going to get to whatever a good version of the metaverse is. The more content there is, the better we're going to make this kind of crazy game we're making. If anybody wants me to deep dive into any particular technology, I know there's a lot here, just let me know and I'll see if I can show more of it.
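The talk uses YJS as the CRDT; as a toy illustration of the same idea, not the actual implementation, here is a minimal last-writer-wins store. Each client applies whichever write carries the newest timestamp, so after exchanging state, all replicas converge on the same object transforms:

```javascript
// Toy last-writer-wins replicated map (YJS is far more sophisticated,
// but the shape of the idea is the same: state in, state synced out).
class LwwStore {
  constructor() { this.entries = new Map(); }
  set(key, value, timestamp) {
    const prev = this.entries.get(key);
    if (!prev || timestamp >= prev.timestamp) {
      this.entries.set(key, { value, timestamp });
    }
  }
  get(key) {
    const e = this.entries.get(key);
    return e ? e.value : undefined;
  }
  // merge a remote replica's entries into this one
  merge(other) {
    for (const [k, e] of other.entries) this.set(k, e.value, e.timestamp);
  }
}
```

The engine only ever reads and writes the store; whether the store is local-only (single player) or synchronized with a server is invisible to it, which is why single player and multiplayer can share the same code path.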
Somebody in the chat asked earlier: what about performance? Right now the main problem we have is that our avatars are very high quality: there are 55 draw calls mostly in the avatars, and we're not only rendering them once, we're also rendering a depth pass over top of all of this. So the way we're rendering right now is actually extremely unoptimized. We're working on that, hopefully within the next week or two. But in terms of forward-thinking performance, we actually have systems for configuring the graphics fidelity automatically. For example, for the avatar, you can reduce the quality to medium, which changes what I said: it's no longer going to be 55 draw calls for the avatar, it's going to be one. I even disabled lighting here; it didn't have to be disabled, but now the entire avatar is basically automatically atlased to a single texture, a single draw call, and a single geometry. So we can actually make this super fast, and this basically fixes all the performance problems we ever had. But we're planning on going a bit further than just atlasing the avatar, because we also have a spritesheet system that takes screenshots of your avatar from all the necessary angles and basically renders you as a single plane. This is something that can allow for connections with thousands of people. I hope that answers the question. And despite being a sprite avatar, you still have full interactivity with the world. There are a few angles we need to fix, but overall this has worked well for us. This is also a technological path where we can be interoperable with other types of virtual worlds. There are other virtual worlds where everything is 2D, and if you standardize on your formats, it's pretty easy to create something like a sprite generator.
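As a sketch of the spritesheet idea: assuming the avatar has been pre-rendered from N evenly spaced angles, picking which frame to show for the current camera yaw is just a rounding problem. This is illustrative only, not the engine's actual code:

```javascript
// Pick the spritesheet frame for a camera yaw (in radians), given a
// sheet pre-rendered at N evenly spaced angles around the avatar.
function spriteFrameForYaw(yaw, numAngles = 8) {
  const TWO_PI = Math.PI * 2;
  const step = TWO_PI / numAngles;
  // normalize to [0, 2π), then round to the nearest pre-rendered angle
  const normalized = ((yaw % TWO_PI) + TWO_PI) % TWO_PI;
  return Math.round(normalized / step) % numAngles;
}
```

Every frame, the billboarded plane just swaps its UVs to the selected frame, so a whole crowd stays at one draw call per avatar (or less, with instancing).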
Somebody asked whether React is just rendering the scene, or whether the components are just buttons and UI over WebGL. Okay, so basically everything is React. It's actually using Vite.js for things like styling and importing. But there are canvas elements in the React tree, and we just use React hooks to hook into our engine. The canvas for the character here actually uses the same renderer as the main scene, so you basically get this for free; it doesn't even take any more memory to do it this way.
5. Rendering and Asynchronous Loading
We can render items in the inventory by using a separate render thread, obtained by iframing to a different domain. By running a second copy of the engine, we can draw asynchronously and post the image buffers back to the main thread. This allows for seamless exploration of the world without hitching or dropped frames. The speaker offers to zoom in on and explain any of the code mentioned.
I can show that code; I think that's in some character component. There's our canvas, and we're hooking it into our engine here: game.player.render.addCanvas. That pushes us onto the render loop, and then we're just copying frames over to the canvas. There it is: copyFrame. This canvas draws from the main scene, where we temporarily drew that window. Actually, there's an interesting quirk of the browser that we're taking advantage of to render items in your inventory. These are, once again, standard JSON files and standard GLB files that we're importing, but we do the previews on a separate render thread. It's a browser trick: as long as you change your domain name entirely, and by domain name I mean top plus one, so you have to be on .xyz rather than .com, otherwise this doesn't work, then when you iframe over to a URL on a completely different origin, you get a different render thread. And as long as you run a second copy of the engine, which in this case we do, you can do the drawing completely asynchronously and post the image buffers back to the main thread. That allows you to explore the world while things continue to load, with previews loading asynchronously, and you're not hitching, which is the common problem you get with a lot of three.js-based experiences: when you add something to the scene, there's a compilation process. In this case, a lot of that can be done off-thread, including on a separate GPU thread. As long as you have a fast GPU, you won't even drop any frames, and usually you don't. It can also be throttled. Can you zoom in on the code a little? It's hard for people to read. In fact, if anybody wants to talk about any of the code I just mentioned, I can probably find it and explain it.
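A simplified sketch of the condition described above: the iframe only gets its own process and render thread when the registrable domain ("top plus one") differs. A real check needs the Public Suffix List; this naive version just compares the last two host labels, which is enough to show the idea:

```javascript
// Naive registrable-domain ("top plus one") comparison. Real browsers
// consult the Public Suffix List; this simplification breaks on
// suffixes like .co.uk and is for illustration only.
function registrableDomain(hostname) {
  return hostname.split('.').slice(-2).join('.');
}

// Does iframing to iframeUrl from mainUrl plausibly get its own
// render thread? Only if the registrable domains differ entirely.
function getsOwnRenderThread(mainUrl, iframeUrl) {
  const a = registrableDomain(new URL(mainUrl).hostname);
  const b = registrableDomain(new URL(iframeUrl).hostname);
  return a !== b;
}
```

Hence the .xyz-versus-.com trick: a subdomain of the same site stays in the same process, but a wholly different registrable domain is isolated, so the second engine copy can compile and rasterize previews without blocking the main scene.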
6. Routing and Scene Abstractions
Somebody asked about the routing in React and how scenes are rendered when changing levels or areas. The speaker explains that they are not using a standard routing framework and are currently working on shader compilation issues. They mention that as you traverse the virtual world, the URL changes and the back button works. They also mention the concept of scenes as abstractions and offer to show a scene file for further explanation.
Somebody asked, that's interesting: the routing is only instructing the scene source, correct? In React, the routing is only instructing the scene source? I'm not sure where the routing starts at the moment. We're not actually using any standard routing framework; it's kind of hand-rolled, but it does eventually hook into React. Actually, I would have loved to show a lot more scene-traversal stuff, but we're still working on shader compilation issues there. The idea is that as you traverse the virtual world, the URL actually changes, and the back button actually works: it will just teleport you back to the previous URL, most likely modulo some coordinate. Kind of an automatic progress save in your URL bar? Yeah. So, I was asking about the routing because I saw you changed between different scenes, like the darkness scene and the grass scene; that's what I noticed. But I was also interested in how you render another scene when a level changes, like when you're going into a dungeon or another area. Do you still do that manually? I'm sure I can show you; this isn't hooked up yet, but I can explain how it works. A scene is an abstraction. I can even show you what a scene file looks like. I'll show you this file; it might help to explain.
7. Scenes, Optimization, and Shader Techniques
A scene is a file type that can import other file types, containing different assets. Multiple scenes can coexist, and switching between them is seamless. The most expensive operation is the skinned character mesh, but optimization opportunities exist. The team plans to rewrite the avatar kinematics in WebAssembly and to make the single-draw-call version of characters the default. They experiment with shader techniques, including SSAO and different lighting methods, aiming for results comparable to Unreal Engine. They do not rely on Redux for state management.
That's awesome, thank you. A scene is just another file type that you can import in Webaverse. The interesting thing about it is that it's a file type that can import other file types: it's a scene, and the scene contains other URLs. So in addition to URLs, like for example the street base, which I think is this object right here, the blue stuff, you also have the lights defined for the scene. This is inline content, which is going to be translated into a data URL; it's the same as if you loaded a data URL. But it's basically just all these different assets in the scene.
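A scene file along the lines described might look roughly like this. The fields are hypothetical: a list of object URLs plus an inline light definition, which the engine would translate into a data URL when loading:

```json
{
  "objects": [
    { "start_url": "https://example.github.io/street-base/" },
    { "start_url": "./mirror.glb", "position": [3, 0, -2] },
    {
      "type": "application/light",
      "content": {
        "lightType": "directional",
        "color": [255, 255, 255],
        "intensity": 2
      }
    }
  ]
}
```

Because a scene is itself just an importable asset with a URL, one scene can reference another, which is what makes the adjacent-scene previews and threshold-crossing traversal below possible.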
And so, with scenes being a first-class concept, you can actually have two scenes loaded at the same time. I have a system here where, if you get close to the edge of where the next scene should be, it's going to add that scene and then preview it by taking a 3D volumetric snapshot of it and rendering it in front of you. And this is just one draw call; there's zero performance hit, because that entire scene is paused. But it's still there. So now we basically have two scenes in the world at the same time, and when we cross the threshold, we can magically switch from one to the other. There's actually a glitch here: this should have rendered the full preview of the entire scene. But basically we can walk backwards and forwards, and these are just different scenes that have entirely different data, and they could have entirely different multiplayer rooms, for example. So if I'm over here, I can walk over to my friend's place and say hi.
The most expensive operation happening right now is just the skinned mesh, right? It's just the character itself; everything else is dynamically loaded, correct? That's right. Actually, everything is dynamically loaded, but there are plenty of opportunities to optimize a lot more. For example, we will probably rewrite our avatar system, at least the kinematics part, in WebAssembly. We're already using a lot of WebAssembly, by the way: this procedural generation is a WebAssembly algorithm, and the physics are WebAssembly. We're probably also going to take the bones code, which is the only really computationally heavy part of this app (the rest is just matrix multiplication), and write that in WebAssembly as well, as well as probably making the single-draw-call version the default for characters, which means we do an offline pass to generate the clean texture for the character and then swap that in directly for the 55-material version. Once we fix the lighting, you almost can't even notice a difference, but it's 55 times faster. Actually, it's more than that, more like 110 times faster, because there's also a secondary depth pass that we do for some effects, like SSAO. But this is pretty much the only expensive part of anything we're rendering here. Everything else is just basic three.js shaders; it can be as fast as any shader you can write.
So eventually, later in the project, are you gonna rely on something like real-time global illumination, like in Unreal Engine or something more powerful, or are you gonna just keep doing more geometry tricks, so to speak? Can you explain what you mean by geometry tricks? Like how, for example, you can place some blockers in the light to cheat the behavior of the environment rather than actually simulating the light properly, versus using something like Lumen, for example, to tackle that problem the way Unreal does.
No, we're experimenting with, essentially, the best shader techniques that we can implement. Most of this research has already been done, and most of it is open source in some form, whether that's GLSL or some other language, even open-source Unity code. We've actually already implemented a lot of that kind of stuff, for example SSAO. And then there's a whole bunch of different ways you can do shadows; you could do voxel-based lighting. That's all stuff we're experimenting with. It's actually one of my favorite things to do: just seeing all the algorithms that are possible and how awesome the result looks. I'm always surprised by the results, and then everybody says, oh wow, is this Unreal or whatever? And I'm like, no, it's not Unreal. But it's using the same technology as Unreal, so is there a difference?
And my last question, sorry, I don't mean to take time from the other participants, but I was also curious: are you using a store to keep the state in the general application itself? Or is it just a single page? So you are using something like Redux, or... It's not Redux.
8. UI and Engine Integration
The UI side uses useState and context API for UI-related stuff. The core engine content, including scene events and character states, is handled by YJS state, a CRDT that is replicated on the network. This enables multiplayer functionality, where characters can follow each other by updating their positions on the server. In this case, the server and client are the same entity.
On the UI side, it's mostly just useState, and there's some context in there. Oh, it's the Context API? Yes, but that's only for the UI stuff. The core of the engine content, like what's actually happening in the scene and the state of the characters, is all in YJS state, the CRDT that I mentioned. And that is the thing that's actually replicated on the network. That's why, even though everything here looks like it's single player, it's already basically running the same multiplayer code: it's just updating the state engine, and the engine is responding to state changes. That's how you can have these characters follow you: you're updating the position on the server, pretty much. So two people are able to see the position in the same plane? Yeah, except in this case, the server and the client are the same thing. Got it. Okay. Yeah, thank you so much. This is awesome. Can't wait to participate.
9. Exploring Avatars and Virtual Interactions
We're working with artists to explore the possibilities of avatars in a virtual world. We're adding combat systems and cool gadgets like bows and pistols. Avatars have an emotion system and can perform dances. We're treating avatar movement and interactions as game design problems. Our goal is to create a virtual world where people want to spend time with their friends. We're also working on enabling inventory and loadout storage in a database, allowing interaction with virtual objects from other interfaces. The pistol and explosion are separate apps, demonstrating a degree of isolation. The .metaverse file approach allows for dynamic imports and recursive loading. The shooting action is event-driven and efficient, allowing multiple particles in one draw call. The .metaverse format works by specifying an entry point and exporting components. Applications can set components on each other and send messages to each other, achieving a level of isolation.
So one of the coolest things we've been doing is working with artists to build out whatever this virtual world is, and just exploring all the things we can do with avatars. We're basically adding combat systems for the different items. I can show you, for example, bows and pistols, and all these kinds of cool gadgets that you'd want to play with in any sort of virtual world.
The actual world we're working on is mostly just the game world, and the avatars also have a full emotion system, so if you're in a multiplayer room and you want to show that you're angry, you can just increase the angriness, as well as the fun, so you can be fun-angry. As well as doing things like dances. I wanted to show some items, so we also have inverse kinematics implemented for a bow: you can shoot, and the arrows stick to things. We're also treating a lot of this as game design problems, like: how do you make the avatars move in interesting ways? What are the kinds of interactions you want to have in a virtual world that will actually make you want to be there and spend time with your friends there? I think the best virtual world for testing the pistol might be darkness. In fact, one of the features we're hopefully going to enable soon is that your entire inventory and loadout, and all the things you earn, are stored in a database that we're running, which allows things like being on Discord and interacting with all these virtual objects from interfaces besides the game. But, yeah, there are basic lighting effects for essentially everything. In fact, this pistol is a good example of multiple applications working together, because the pistol and the explosion are actually not the same app: the pistol imports the explosion. That allows for a degree of isolation, so if you ever need this really cool explosion somewhere else, you can just import it. You're doing more of a dependency-injection type of thing? Yep, essentially. Awesome. The back-end compiler is just Vite.js, so you can think of it as a dynamic import, and it is recursive: it'll go through and figure out what type of thing you're trying to load. In this case, I'm referencing a GitHub repo. I can even show you how that works and why.
But you really just have to specify the GitHub repo.
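The dependency-injection idea described above, where the pistol imports the explosion as its own app rather than copying it, can be sketched roughly like this. This is an illustrative mock, not the real Webaverse loader; `defineApp` and `loadApp` are hypothetical names standing in for the recursive dynamic-import mechanism.

```javascript
// Hypothetical sketch: a pistol app declaring the explosion app as a
// dependency, resolved recursively at load time. None of these names
// come from the actual engine.
const registry = new Map();

function defineApp(name, deps, factory) {
  registry.set(name, { deps, factory });
}

// Recursively instantiate an app and everything it imports.
function loadApp(name, cache = new Map()) {
  if (cache.has(name)) return cache.get(name); // each app loads only once
  const { deps, factory } = registry.get(name);
  const resolved = deps.map(d => loadApp(d, cache));
  const app = factory(...resolved);
  cache.set(name, app);
  return app;
}

// The explosion is its own isolated app, so any other app can import it.
defineApp('explosion', [], () => ({
  play: at => `boom at ${at}`,
}));

defineApp('pistol', ['explosion'], explosion => ({
  fire: at => explosion.play(at), // dependency injection, not a copy
}));

const pistol = loadApp('pistol');
console.log(pistol.fire('door')); // → "boom at door"
```

The key property is isolation: the explosion effect lives behind its own interface, so any other app that needs it just declares the dependency.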
Is the PixelExplosion working with some kind of object pooling because of the animations and how expensive the particles are for the shooting action, so to speak? It's event-driven. I'll show you the code in a second. But it's event-driven. And on the firing action, it's all based on state changes. It's a state machine. On the firing action happening, that's when we for example add the light and then we initialize a new particle. And then every kind of frame, we go through the particle's animation. And then eventually when the particle realizes that it's dead, it'll kill itself. It'll remove itself from the geometry and then the mesh becomes invisible. It's actually quite efficient. And that also allows multiple particles to be done in one draw call because we're just using instancing for a bunch of cubes in this case. But these effects are going to get way more advanced once we put some actual time into them. Even though I think that actually looks pretty cool already.
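The event-driven particle lifecycle just described, where a firing event spawns a particle, every frame advances it, and a dead particle removes itself, can be sketched as follows. This is a minimal illustration of the state-machine idea, not the engine's actual particle code.

```javascript
// Sketch of the event-driven particle lifecycle: a "fire" state change
// spawns a particle, each frame advances it, and dead particles remove
// themselves (in the real engine they'd all share one instanced draw call).
class ParticleSystem {
  constructor() {
    this.particles = [];
  }
  // State change: a fire event adds a light and spawns a new particle.
  onFire() {
    this.particles.push({ life: 3 }); // lives for 3 frames, illustratively
  }
  // Called once per frame; expired particles drop out of the geometry.
  update() {
    for (const p of this.particles) p.life--;
    this.particles = this.particles.filter(p => p.life > 0);
  }
  get visibleCount() {
    return this.particles.length; // all instances drawn together
  }
}

const fx = new ParticleSystem();
fx.onFire();
fx.onFire();
fx.update();
console.log(fx.visibleCount); // 2 — both still animating
fx.update();
fx.update();
console.log(fx.visibleCount); // 0 — both removed themselves
```

Because the particles are just instances of the same cube geometry, culling a dead one is a bookkeeping change rather than a new draw call.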
So the reason that PixelExplosion works is that any directory, any HTTP directory, that has a file called .metaverse in it will be loadable in Webaverse regardless of everything else. And what that allows you to do is essentially specify where your entry point is, plus some components describing your entry point. Actually, this is a bug, ignore that, this shouldn't have any components. But for example, this has an entry point of index.js, and then here it ultimately defines some shaders, some constants, some functions for itself. And then ultimately you're just exporting a default which is going to specify what your object is. In this case, it's going to do a dependency injection type thing with useApp, which is basically calling up to our engine. Here's where you import the engine: import metaversefile from 'metaversefile'. So it's getting basically these engine functions that allow it to ask: what application am I? The engine has already kind of set me up with components, set me up in the scene. And then all I have to do is create some additional meshes and animate them every frame. Here it is, explosionMesh.update, and this might be... I guess using the custom hook and the class itself is kind of a separation of concerns, because then whatever garbage collection needs to be done by the explosion isn't gonna be tied to any other asset, I guess. Yeah. That's right. And applications can set components on each other, and they can send messages to each other. And that's basically the level of isolation.
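The loading contract just described (an entry point whose default export asks the engine for its app, adds meshes, and animates them every frame) looks roughly like the sketch below. The real engine provides hooks like `useApp` via the `metaversefile` import; here a tiny mock engine stands in so the shape of an app module is visible. All names are illustrative, not the actual API.

```javascript
// Mock of the engine side of the ".metaverse" contract: useApp() hands the
// module its application, useFrame() registers per-frame animation.
const engine = {
  app: { children: [], add(m) { this.children.push(m); } },
  frameCallbacks: [],
  useApp() { return this.app; },
  useFrame(fn) { this.frameCallbacks.push(fn); },
  tick() { this.frameCallbacks.forEach(fn => fn()); },
};

// What an app's index.js default export roughly looks like:
const makeExplosionApp = () => {
  const app = engine.useApp();          // the engine has already set us up
  const mesh = { frames: 0, update() { this.frames++; } };
  app.add(mesh);                        // add our mesh to the scene
  engine.useFrame(() => mesh.update()); // animate it every frame
  return app;
};

const app = makeExplosionApp();
engine.tick();
engine.tick();
console.log(app.children[0].frames); // 2 — updated once per frame
```

The .metaverse file itself is just metadata pointing at index.js; everything the module needs from the engine arrives through these hooks, which is what keeps each app isolated.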
10. Interacting with Objects and Optimization
If you know like which kind of object that you're interacting with, then of course you could do more advanced things like expose your own API. But otherwise, these are just basically open objects on the web. It's up to you to define how you want to use them. And we just provide the primitives.
So when you're shooting or you're hitting something, are you also using, like, a class just for that, to handle the collision? Yes. So you have basically raw access to the PhysX API if you want it. So if you want to do a raycast, then there is a function for that. I wonder if it's here in the pistol repo. Oh, here we go. Here we go. We have a raycast from the gun's position, gun quaternion. I guess this is correcting for some error in the GLB where we have to flip it 180, but you're essentially taking the position and the direction that it's facing, and you get a result out of that. And it will tell you actually not only the point that you hit, but also the application that you hit. So this is how you can target a thing and then see, oh, this is the thing that I hit. And then I can call the .hit function on it. And that actually allows for damage tracking systems. This building here is not invincible. You can actually destroy it, if you shoot it enough. Let's see how long that takes. There you go. It's dead. Let's see. It also works for avatars, where you can define raycasts for where you're going to hit. And if that hits some sort of avatar inside of their bounding box, inside of their character controller, then they will take damage. And that allows for automatically defining gameplay mechanics over the lifetime of a character. Are there any other applications that somebody wants me to cover or deep-dive into? There's vehicles as well. Oh, awesome. And mount. This is actually a bug, he's not supposed to be going this crazy, but you can hop on. He's only supposed to be going crazy when you're riding him. I guess you would have some kind of problems with frame rates sometimes, or not at all, based on the object that you're interacting with? No, that's usually not the problem.
In fact, the assets are mostly just quite optimized, and it's pretty easy to get optimized assets these days with just KTX2 and compressed textures and just crunching everything into single meshes. The main problem that we actually have right now is almost exclusively related to animating the avatars, because we don't do any optimization on them and they're just really high quality. We basically need to do a runtime swap of the geometry to make sure that it's a single draw call. And then we'll mostly be limited by whatever other content there is inside the scene, because the avatar will be a relatively cheap operation, and it's just a matter of how expensive the shaders are that you're using: are you using fog, do you want to set up your render effects. Another thing that scenes do is define the render settings for how you want the scene to look, which is honored for every scene that you go to. I'll show that as well. So you get easy access to things like, we have your lighting, but then you have SSAO (screen-space ambient occlusion), depth of field, HDR, bloom, fog, type of fog. And as you go from world to world, these render settings will recompile your shaders, essentially, and basically make sure that every world looks the way that it looks, and it can even have its own skybox. So you could also work with something like a procedural mesh, right? You don't have to work with just a static one. Absolutely. Oh, we have a creature here. Do you mean for avatars or for something else? Just for assets in the world in general. I think for avatars it's pretty expensive, so... We actually need to work on that, I'm not sure how the actual world sort of handles it.
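The raycast-and-hit flow described a moment ago (cast from the gun, get back the application you hit, call `.hit()` on it so the target tracks its own damage) can be sketched in miniature. This is an illustrative stand-in, not the actual PhysX bindings; the 1D "ray" and the field names are invented for clarity.

```javascript
// Toy raycast: treats each target as an interval along a 1D ray so the
// hit-resolution flow is visible without a physics engine.
function raycast(origin, dir, targets) {
  for (const t of targets) {
    const d = t.position - origin;
    if (Math.sign(d) === Math.sign(dir) && Math.abs(d) <= t.range) {
      return { point: t.position, app: t }; // result names the app you hit
    }
  }
  return null; // nothing along the ray
}

// A destructible building: damage tracking lives on the application itself.
const building = {
  position: 5,
  range: 10,
  hp: 30,
  hit(damage) { this.hp -= damage; },
  get destroyed() { return this.hp <= 0; },
};

const result = raycast(0, +1, [building]);
if (result) result.app.hit(10); // the shooter only knows the .hit contract
console.log(building.hp); // 20
```

Because the raycast result carries the application, the shooter never needs to know what it hit; it just calls the shared `.hit` primitive and the target decides what damage means.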
11. Procedural Generation and Infinite Worlds
We're already doing procedural generation using noise functions to create geometry. Wooden structures, container systems, and terrain are all procedurally generated. We have plans for infinitely generated worlds with biomes, textures, shaders, and interactive elements. These techniques are well-known and have been used in games like Minecraft. We ensure efficient updating of geometry to prevent frame drops.
Uh-huh, go ahead. In terms of procedural generation, we're already doing a lot of that. This is actually a shader that's using noise functions inside the shader to draw all of this. Basically, this geometry is entirely done on the GPU. Also, in one of our main scenes, we have these buildings that we're trying to build out of Fortnite-style blocks that generate the geometry together. It doesn't look good yet because we haven't replaced all the different wall sections, but all of the different wooden structures that you see here are just procedurally generated. And it's properly built, it has an interior. We also have some kind of a container system up here, which is again procedurally generated. You can kind of go inside and have different sections of some sort of virtual living habitat. And some of this is using similar techniques where we're just using noise functions to generate a terrain, and we have much better stuff in the works as well, where it's infinitely generated and it has biomes. We apply some really great textures and shaders to it. And we also sprinkle it with some other geometries that we've baked. So trees with a canopy, and just a bunch of different items that you can interact with in the world. The techniques for doing this are well known. This was basically Minecraft's claim to fame, and we're just doing the exact same thing. We're using noise functions, biome maps, and basically efficient updating of the geometry in chunks as you progress through the world to make sure there are no frame drops. I've actually been doing that for years, so it's really fun for me to kind of go back to it again. But we're gonna be doing a lot of that for the actual gameplay.
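The chunked, noise-driven terrain idea above can be sketched with a toy heightmap. The hash-based `noise` function below is a common cheap stand-in for real gradient or simplex noise, and the chunk size is arbitrary; the point is that generation is deterministic and per-chunk, so only nearby geometry needs (re)building as the player moves.

```javascript
// Deterministic hash "noise" in [0, 1); a placeholder for real
// gradient/simplex noise, but with the same key property: the same
// coordinates always give the same value.
function noise(x, z) {
  const n = Math.sin(x * 12.9898 + z * 78.233) * 43758.5453;
  return n - Math.floor(n);
}

const CHUNK = 4; // 4x4 height samples per chunk (illustrative)

// Generate one chunk's heightmap from its chunk coordinates alone.
function generateChunk(cx, cz) {
  const heights = [];
  for (let z = 0; z < CHUNK; z++) {
    for (let x = 0; x < CHUNK; x++) {
      // Sample in world space so chunk edges line up seamlessly.
      heights.push(noise(cx * CHUNK + x, cz * CHUNK + z) * 32);
    }
  }
  return heights;
}

const a = generateChunk(2, 3);
const b = generateChunk(2, 3);
console.log(a.length, a.every((h, i) => h === b[i])); // 16 true
```

Determinism is what makes "sharing seeds" work: any client can regenerate exactly the same chunk from coordinates, so the world never has to be stored or transmitted wholesale.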
12. Integrating 3D NFTs and Metadata
13. End User Customization and Tooling
We have a native experience and in-world tools for inspecting everything in the world. We're working on tooling, including a Unity exporter, to generate files and scene files for basic creation. We plan to integrate the Fortnite build tools and provide a hosting service for virtual worlds. It's easy to use, with drag and drop functionality, and you can play around without external tools. However, using external tools like Blender is encouraged as they often provide better functionality.
The last question. Can my end user, like, have customization opportunities? Let's say my end user wants to scale or modify something, or does only the developer have that option? No, no, no, no. I wish I could show you more tools. There's bugs which I'm fixing. But if you know what I mean by the Fortnite build system, that's kind of what we have, as well as just kind of in-world tools for inspecting everything in the world. And if this was working, you'd see the transforms here. So there is a bit of a native experience. We're also working on tools to do that. We're working on tooling, for example, a Unity exporter, so that as long as you haven't done anything too crazy, it will generate all the files, including the scene file, and then it'll just work that way. But it's not only necessarily for developers. We wanna make sure that anybody can essentially have access to basic creation tools. So we're probably gonna be integrating all of that, including the Fortnite-style build tools. And actually, we're probably also gonna be providing a hosting service for a lot of this content, for people's virtual worlds. So it actually could even be simpler. In addition to being drag and drop, it's really just a matter of logging in. Then you can kind of play around without even using any external tools. But if you want to use external tools, then of course we encourage that, because in most cases they're better. Like, we actually use Blender for everything.
14. Integrating Technology and Performance
Are you thinking of integrating Web3 technology to improve processing? We're considering pre-loading worlds offline and providing tools for exploration and sharing. We don't have plans to charge gas fees. We'll provide hosting for first-party game content. Users can connect to our service or run their own advanced version. Building Webaverse in a browser allows for faster technology development. There's little difference in performance between browser-based apps and traditional desktop applications. The web has the potential to be the best gaming platform. You don't need Unity to make something good. Thank you, this app is impressive.
Awesome. Are you thinking of, like, integrating some, like, maybe Hashgraph or any other kind of Web3 technology in order to improve the processing of meshes and assets and everything that you have in your quote-unquote pipeline? We're probably... okay, so I mentioned the technique of pre-loading worlds. We're probably actually going to be doing that offline, as well as providing cool tools for people to explore each other's worlds without necessarily going to them, just kind of exploring the previews, or even creating topographical maps that you can share and say, like, oh, meet me over here. So a lot of that is in the pipeline, and we're going to be building a lot of backend infrastructure before doing that, but if you have any suggestions of specific technologies that we should be using, I'd love to dig into them. We want to use the best of what's available. If you actually have any ideas, probably the best way is to open an issue on GitHub. Some of the best feature requests and suggestions that we've gotten have come through that. Yeah. Yeah, we'll do that. We'll send you an email as well. Thank you.
Oh, thank you. As I just mentioned that people can drop their artworks. So do you charge any gas fees like many of the NFT websites, like they charge gas fees and all on the blockchain? We'll probably... I can't make any promises because I don't know how the... I can't predict the market, but we don't really have any plans for charging gas fees for like that. Actually, we just want to fund ourselves mostly by building the best game possible and giving people reasons to kind of get first party stuff or get integrated into the rest of the game. Because most of the costs for doing things like minting are actually quite low. So the way that we might restrict it is just like you need to have bought a copy of the game quote, unquote, which might be an NFT, and then that allows you for example, infinite minting or something like that. But if you kinda wanna make your pitch for what the best solution is, again, I think the GitHub and just joining the discord, joining the discussion is probably the best way. If you don't mind, can I share your discord link in the chat box? Yeah, that's cool. Yeah, thank you.
So you mentioned creating your own virtual world, and you're hosting a service for the next steps. Like, if someone wants to create their own quest and gaming in their virtual world, do you envision that they'll set up their own database and infrastructure for it, or do you provide that too? We're providing it, if only because that's how our first-party game content is gonna run. But I think the case that you're talking about is a very advanced one. I would hope that we're able to keep up with feature requests and demands from users, so that they'll just continue to say: it's better if we work together and just connect to this service, and then it's all handled. And then you're trusting us to run it. But the thing that keeps us honest, I think the best hope is that, the thing that keeps us honest is the fact that at any point anybody can run the more advanced version, which is: figure this out yourself. I'm coming from a background where I know how hard it is to make these things, because I've made them. So I'm actually trying to spare a lot of people a horrible fate by saying that, yeah, we'll probably try to get you to host with us. But yeah, at any point, anybody who has a better idea for Webaverse and what it can be, I encourage forks, because that's how we actually make everything better. Awesome. Hello. Yep. Hello, so I'm just curious about the performance capabilities of such a large-scale browser-based app like this. How does it compare with traditional desktop applications? Like, for example, Unity or Unreal apps? Are there any notable advantages or disadvantages to building Webaverse as a browser-based application like this? Building it this way allows basically everything else to happen. It allows the technology to move at the pace that users actually want, rather than at the pace of whatever game engine company.
I actually don't think there is much of a difference at all these days, because things like Chrome's implementation of WebGL are actually highly optimized. Especially if you know the power trick, where in Chrome you select the OpenGL drivers for NVIDIA cards. It's basically native at that point. In addition, we're gonna be moving to WebGPU, which is even more low-level than WebGL, which is what we're currently using. I actually don't think there's gonna be any appreciable... in the next couple of years, there's probably gonna be no appreciable difference between a Unity game and a game on the web, because ultimately this comes down to the computing power and the APIs available. And somebody actually having the passion to take the leap and make something good. I think that's pretty much the only thing that the web has been missing, in terms of being basically the best gaming platform. It's just for somebody to take it seriously, rather than constantly saying, well, yeah, Unity sucks for all these reasons, but we have a deadline to meet, so I'm gonna use Unity. And if everybody does that, the web stagnates. I think one of my goals is to show people that you don't need to use Unity to make something good. And in many cases, you can actually just make it way better with a drag and drop interface in the browser. Thank you. Honestly, this kind of app is impressive.
15. Performance Optimization and First Principles
We've taken a first principles approach to performance, considering factors like avatar loading, shaders, and context-dependent rendering. Many code bases lack performance optimization due to a lack of understanding of game engines and shader recompilation. Webaverse aims to address these issues and create a high-performance game engine.
We're just getting started. Yeah, so you might have mentioned this. What are the things you integrated which make this way more performant than other browser-based games, like a plain HTML file with WebGL? There's not much difference, other than... I consider myself a pretty hardcore engineer with all this stuff. The way that I started with all of this was compilers and optimizing code. So I think with Three.js especially, and with the culture of the web, where it's mostly copy-paste code that you're just trying to get to work, and then you paste something and it kind of works, so you ship it, I think that has built up a lot of code bases that just didn't take performance seriously at all, because they don't necessarily understand the underlying game engine, they don't understand shader recompilation. They don't understand why compressed textures are necessary, and so on. And if you pile on enough of these mistakes, you can build this monstrosity of an application that appears to work despite all of the mistakes that you've made. I think with Webaverse, we've just taken a very first-principles approach: what does it mean to load an avatar? What kind of shaders is it going to have? Are the shaders going to depend on the context of the world? Do we need to preload things? Do we need to offload this processing over to some sort of worker? I think a lot of people just haven't really had the willpower to ask those kinds of questions. But when you do ask yourself those kinds of questions, the same questions that the people who made Unity, people like my heroes, like Tim Sweeney, actually took seriously and thought about, then I would hope that Webaverse is ultimately considered something like: okay, when somebody took the web seriously, you actually got a pretty dang good game engine. Great job.
16. Publishing Tutorials and Mobile Compatibility
We plan to publish tutorials on how we engineered the system. Currently, the platform is mobile-friendly and works on both mobile and desktop devices. Although there may be bugs, we are actively working on resolving them. Our goal is to reach beta state, where all the necessary features are implemented. We are also exploring integration with ARKit and using webcams with TensorFlow and KaleidoKit to animate avatars. While using a webcam to create avatars with real faces is not our current focus, it's an interesting idea that others are exploring.
Do you plan to publish tutorials on how you engineered it? Could you repeat the question? Publish tutorials? Like, publish tutorials on how you engineered all of this. I wish I had more time to do that, honestly. I'm gonna say yes, because I know a lot of people want me to do that.
Yeah. Hi guys, can I ask you a question? Is it mobile friendly? Is it working on mobile or on the desktop? I don't know if it's currently working on iOS. Like, there might be bugs, but there's no reason that it actually wouldn't. If there is, it's probably some small bug. Because right now we haven't reached beta state, beta state being all the features we need are kind of there on the first version. And then we're going to basically iron out all these things, like: oh, is it working on iOS? Set up the testing pipelines, and so on. It's definitely in the works. We also want to integrate, hopefully, things like ARKit. Because we actually have a system already that's using webcams and just TensorFlow, I think it's PoseNet, with KaleidoKit to animate the avatars. So even if I don't have a VR device and I just have a webcam, I can join multiplayer and basically have an animated avatar in the world. So that's also kind of coming in. So can you use your cam and make your own avatar, like use your real face and... That's not what we're using that for right now. But that sounds like a cool idea. I know other people are doing it, like Ready Player Me. Like, it might be even better if you implement it, I guess. It's a lot of work. Yeah, I understand.
17. Procedural Generation and Engaging Gameplay
Virtual worlds often lack engaging activities beyond basic interactions like walking and shooting. However, we are addressing this issue through procedural generation. By combining elements from Minecraft and Fortnite, we are creating a world where everything is generated and destructible. This includes biomes to explore, NPCs with AI-controlled behavior, and procedurally generated quests. We have built the necessary infrastructure to easily implement these features, as NPCs and mobs are treated as objects with animations, hitboxes, and stats. This allows for the existence of combat mechanics and creates an engaging game loop. Additionally, players can customize their experience by adding objects to aid them in their quests.
I have more of a like, can you hear me? I have more of a like game design questions. So a very common problem I've seen with virtual worlds is there's usually not all that much to do. Like you can walk around, you can insert images, you can shoot the like. What else is there gonna be to do? How are we gonna entertain ourselves with it? That's an excellent question. The answer I think is procedural generation. That's actually kind of my specialty and like I said I'm really glad to kind of go back to it. But we're basically creating a Minecraft combined with Fortnite where everything is generated. Everything is destructible using all the physics systems that we already have and using a whole bunch of optimization techniques. So there's biomes that you can explore with both NPCs, AI controlled NPCs that are either gonna help or hinder your quest based on what the AI is returning and actual just procedurally generated quests like talk to this NPC and then people can kind of share interesting seeds like, hey did you know if you go over to this place there's this NPC that offers you a really good deal on this quest and you can really grind that. All of this stuff is actually pretty easy to hook up with all the infrastructure that we've built because NPCs are just objects, mobs are just objects, they're just things that you set animations on, it has its own hitbox and basically everything has stats as well. So it allows combat mechanics to exist. I actually think people are gonna be really surprised by what we come up with here because I basically love doing procedural generation stuff but I think that's all you really need in order to create that amazing game loop. And on top of that of course, if you want to drop in your object to help you in your quests that makes it even more interesting.
18. Synchronizing Positions with CRDTs
The server uses CRDTs to synchronize positions in almost real time. It's not data intensive and includes audio compression using WebCodecs. Latency exists, but smooth animations are achieved through interpolation.
Yeah I like that, thank you for your answer. So you kind of caught my attention when you were saying that you have a pretty heavy background on compilers and more like memory-based kind of problems. Sure. Right now, there is a matchmaking server or is there like something that you are using of like another third party kind of like game server to be able to synchronize positions in almost real time? That's just using CRDTs. So I might be able to show you that code. It's basically a replicated data type that the server reflects to all of the clients. It's probably WSRTC. And this can run on any server. I don't recommend running it on a home network or whatever. Yeah. But you theoretically could, it's actually not data intensive at all because even things like the audio are compressed using WebCodecs. But yeah, it's mostly just a quick data exchange. There is of course latency, we don't do any prediction right now but we could. We do actually interpolate though. So regardless of what the player's different frame rates are everybody still sees smooth animations for each other's characters. That's actually surprisingly not that hard of a problem.
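The interpolation just mentioned, where every client sees smooth motion regardless of the sender's frame rate, can be sketched by blending between a remote player's last two replicated snapshots. This is a generic illustration of the technique, not WSRTC's actual code; the names and the 1D position are invented for clarity.

```javascript
// Snapshot interpolation: blend between the last two replicated positions
// so remote avatars animate smoothly at any local frame rate.
function lerp(a, b, t) {
  return a + (b - a) * t;
}

class RemotePlayer {
  constructor() {
    this.prev = { x: 0, time: 0 };
    this.next = { x: 0, time: 0 };
  }
  // Called whenever the replicated state delivers a new transform.
  onSnapshot(x, time) {
    this.prev = this.next;
    this.next = { x, time };
  }
  // Called every local render frame, at whatever rate we run.
  positionAt(now) {
    const span = this.next.time - this.prev.time;
    if (span <= 0) return this.next.x;
    const t = Math.min(Math.max((now - this.prev.time) / span, 0), 1);
    return lerp(this.prev.x, this.next.x, t);
  }
}

const p = new RemotePlayer();
p.onSnapshot(0, 100);
p.onSnapshot(10, 200);
console.log(p.positionAt(150)); // 5 — halfway between the two snapshots
```

Clamping `t` to [0, 1] means a late snapshot just holds the player at the last known position rather than extrapolating; prediction, as noted above, would be the next step beyond this.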
19. Backend Code, Performance, and Optimization
Yeah, that's mostly correct. You're dealing with the JSON, essentially. Yeah. But that reading of JSON, is that the fastest that you can go? Wouldn't something that worked with, I don't know, buffers or serialized data be even more performant?
Yes. If our profiling tools ever say that this JSON stuff is a problem, we're gonna move it to WebAssembly because I think that's an excellent idea and I think it'd be really cool to kind of explore. Can we take all of these CRDTs and just write it in Rust or whatever. Yeah. And compile it to WebAssembly and then you have this awesome state engine and then you just have a clean interface of the events. Because it looks like a lot of the work is done from the front end, so there is not any server render being done or anything like that, correct? Because it's just like a static site, so to speak.
Yeah, mostly. It's not a static site in that we do use React State in order to draw, for example, the UI. But for example, when everything is closed, none of the React render loop is running. This is just the game engine and everything is just painted on top of that by the browser. But when the component is closed, it's pretty much like component on destroy for React, so it just gets garbage collected. In some cases, yes. In other cases where we don't wanna reinitialize things, then we keep it around and we just hide it. Okay, it's just a model, so to speak. Yep, but it really depends. We just make sure that it's efficient at whatever it is. Like for example, we don't need to render the avatar when it's off screen. Things like that. It also allows us to kind of do cool things like have a mini map. It's kind of hard to see probably, but there is a mini map that real time renders from the top view and it kind of caches the images as you're going through. So at all times, you have a perfect view of where you are. And in VR, we hope that this could actually be a good physical map that you can pull out. It's kind of rendered. Awesome. In terms of the backend code, do you think you can show like some interesting problems that you came across and kind of like explain how you solved that? Because I guess like not being a super backend intensive application, it shouldn't have so much load but I see a lot of interesting type of like, you know, quaternions like operations done. So I was just wondering if you could give us like a quick overview. Do you mean backend code literally because it's running on the backend or are you more interested in like technical geometry problems? Because most of the geometry stuff is handled entirely on the front end.
20. Sprite Generation and Animation
The engine can turn the user into a sprite by generating the sprite code on the backend. The code for generating the sprite can be moved across computers. The avatar cruncher code generates the full sprite sheet image for the avatar. A shader is used to draw the avatar based on the camera normal and the correct frame from the texture. This allows for the creation of simple characters and helps overcome network volume limitations. The system can detect and change the character's hair programmatically. The animation system allows for multiple actions to happen simultaneously and specifies the animations for different actions. The binding between objects and animations is done through bone offsets. The player state determines the current state of the avatar.
And then the engine based on that will respond by turning you into the sprite. But the code for generating the sprite is something that we could actually shepherd off to the backend. You saw there was a pause there, that's because it's doing it all on the front end. But this could just as easily be done off thread. I just haven't had time to optimize it. That's the nice thing though about the web is that you can basically move code across computers and it totally just works. It's just a buffer. If you have like node, yeah. If you had a little bit of a different backend framework, I guess it would have been a little bit more complicated. Yeah, but from the start we've tried to keep it as bare bones and compatible with everything. That's how we've been able to kind of glue this all together with pure open source components.
This is the avatar cruncher code, because it basically takes your avatar and crunches it. In this case, it's the code for atlasing. And then we have another function, which is the avatar spriter, which also does the math of positioning the cameras in the different places and animating the avatar, because you get the full walk cycle, you get the jump cycles, then you get the different actions, like for example crawling. So it poses the avatar in all of these different positions, positions the camera, snaps a snapshot, and then it actually generates the full sprite sheet image. And then we have a shader which takes the camera normal to the plane. This is actually just a single plane drawing, but it takes the camera normal and it finds the correct frame from the sprite sheet, from the texture, and then it just draws it. It's actually a super cheap operation, just a texture lookup. And basically that lets you create characters which are as simple as you can possibly get, which is hopefully going to be the temporary solution for how we can get thousands of people into these kinds of worlds, because the bottleneck at that point tends to not even be the graphics. It's just the network volume and how you pipe that correctly. So it doesn't really feel like at this point there are many limitations to what we'll be able to do. I wonder if that works. There we go. We also have a system for our characters to be transformed by additional shaders. Somebody has a question though, I'll take it. There we go, it worked. Basically, you can detect when it's a character's hair and then just change their hair programmatically. So this is one application interacting with another, just due to the fact that a standard file format is being used and you can detect what is hair and what is not. That's also what allows us to do things like: when you zoom into the character, you actually don't see their face, you don't see their hair, we just clip that out using vertex jitter.
We just kind of zero out those positions that were in that morph, and then when you scroll back out you can see the avatar's full head and it kind of makes the experience relatively seamless. It works for most avatars, you can just kind of drag and drop them in and you get either first person or third person. That's tied to the action of like the mouse wheel or something that switches the camera pretty much.
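The sprite sheet lookup described above, where the shader picks the frame whose capture angle best matches the camera's view direction, is essentially angle quantization. The same math in plain JS, assuming 8 capture angles (the real sprite sheet may use more):

```javascript
// Pick the sprite sheet frame for a given camera yaw angle (radians),
// assuming the avatar was snapshotted from NUM_ANGLES evenly spaced angles.
const NUM_ANGLES = 8;

function frameForYaw(yaw) {
  const TWO_PI = Math.PI * 2;
  // Normalize yaw into [0, 2π), handling negative angles.
  const n = ((yaw % TWO_PI) + TWO_PI) % TWO_PI;
  // Snap to the nearest captured angle; the shader does the equivalent
  // with the camera normal, then a single texture lookup.
  return Math.round(n / (TWO_PI / NUM_ANGLES)) % NUM_ANGLES;
}

console.log(frameForYaw(0));            // 0 — facing the camera
console.log(frameForYaw(Math.PI));      // 4 — seen from behind
console.log(frameForYaw(-Math.PI / 2)); // 6 — side view
```

Since the plane's geometry never changes, animating the sprite is purely a matter of which frame the shader samples, which is what makes it cheap enough for crowds.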
In terms of the animation, I saw that there is a switch case in there. So are you able to play different animations at the same time, or is there some kind of synchronicity between those kinds of actions? Multiple actions can happen at the same time. So for example, aiming is an action and it makes your avatar face a certain direction, but I'm also moving. So it's the intersection of these two different actions that creates the animation, which is the avatar walking sideways. And when you are wearing an object, that can change the functionality of what aiming means. These animations are actually still kind of rough. We're gonna do a clean transition, but our entire animation system is aware of all the different actions that you can have on your avatar, and we specify what the list of those is. Like, for example, this is the back sword type of aim. And then the sword just defines that: I am the kind of object that, when you aim me, should basically pose this way. And then when you click me, or when you trigger me, I should attack in this manner with these combo animations. And the avatar just performs that. The binding between the object and the animation it defines is mostly just an offset from one of the character's bones. In this case, I think it's the wrist, because that's the closest thing to the hand. And then we just offset it so that the character is holding it.
You have some kind of player state that says, hey, I'm in a vehicle, hey, I'm handling the sword, and all that. That's exactly right. And that's actually, I can show that as well.
21. Avatar Animation and Multiplayer Integration
The Aim action is handled by the IO manager and called into the game engine. The avatar system recognizes the player's actions and includes the corresponding animation into the character's animation stack. In a multiplayer environment, the same CRDT is used for connected players, allowing their avatars to have their own state and inventory. Changes in the CRDT are replicated across the shared world, enabling other players to see the items being dropped. Voice transfer is also supported, and optimizing avatar animation is a known challenge. The engine picks up replicated transforms and passes them to the application, ensuring all users see the same thing. The speaker mentions an Adobe product that can understand and apply mesh and skeletal mesh, making it easier to handle complex movements. The weapon is represented by metadata, which is interpreted by the avatar. The code for a specific sword is shown, highlighting its ability to carve things into the world.
Actions, probably game. Here's the Aim action. I think there's a better one. The Aim action, most likely when the user right clicks, is handled by the IO manager. And then the IO manager is gonna call into the game engine to say, oh, what should I do? In this case, it means add the Aim action. And then later the avatar system itself recognizes that, hey, this player has this Aim action. I can show that code as well. Which will then include its animation in the animation stack for the character. And there's a bunch of rules for how those animations stack and compose. And there's also a bunch of bugs that we need to fix. But the idea is that no matter what your character is doing, no matter how you're moving around, we want there to be a believable animation. In some cases that means you have to break the avatar into two parts, a top half and a bottom half, because you want the legs to go one way but they're holding something some other way.
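The action stack the speaker describes, where simultaneous actions compose and an upper-body action like aiming layers over a full-body action like walking, might look roughly like this. The class, action types, clip names, and masking rules are all illustrative assumptions, not the real engine's.

```javascript
// Minimal sketch of an avatar action stack. Actions are added and
// removed by the game engine; the avatar system composes the active
// actions into animation layers each frame.
class ActionStack {
  constructor() { this.actions = []; }
  add(action) {
    // Only one action of a given type is active at a time.
    this.remove(action.type);
    this.actions.push(action);
  }
  remove(type) {
    this.actions = this.actions.filter(a => a.type !== type);
  }
  // Aiming only overrides the upper body, so the walk cycle keeps
  // driving the legs underneath it (the "two halves" the talk mentions).
  composeLayers() {
    const layers = [];
    for (const action of this.actions) {
      if (action.type === 'move') {
        layers.push({clip: 'walk', mask: 'fullBody'});
      } else if (action.type === 'aim') {
        layers.push({clip: action.aimClip ?? 'swordAim', mask: 'upperBody'});
      }
    }
    return layers;
  }
}
```

A wearable item can then supply its own `aimClip`, which is how the sword changes what aiming means without the avatar system knowing about swords.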
So in a multiplayer environment, like would the other player know that you are in a certain player state? Yes. Yes, because yeah, I forgot to mention this is actually using the same CRDT when you're connected to a multiplayer room. It's also your own avatar that has its own kind of state. So your avatar has its own inventory and that's not related to the world inventory. It's bound to your avatar. And in fact, when you drop something into the world and pick it back up it's moving between those different app managers which also allows the possibility for when you go to a different world all the things that you're wearing also come with you. And all of this is replicated using the CRDT and basically the entire game engine is just responding to these changes in the CRDT. Like this object disappeared from over here and it appeared over here. So that means I need to move it. I need to reanimate it on the avatar because they just grabbed it. But in the shared world so to speak like when you do multiplayer the other people can also see what kind of item you're dropping for that reason.
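The pattern described here, where the engine only ever reacts to changes in the shared data store, can be sketched with a tiny observable map standing in for the real CRDT. The key scheme and field names are illustrative assumptions.

```javascript
// A toy stand-in for the CRDT: a map that notifies observers on writes.
// In the real system every connected client would receive the same
// replicated change events.
class SharedState {
  constructor() { this.data = new Map(); this.listeners = []; }
  observe(fn) { this.listeners.push(fn); }
  set(key, value) {
    this.data.set(key, value);
    for (const fn of this.listeners) fn({key, value});
  }
}

// Picking up an item is just two writes: it disappears from the world's
// app manager and appears in the avatar's inventory. Every client,
// local or remote, reacts to the same events, so everyone sees the move.
function pickUpItem(state, playerId, itemId) {
  state.set(`world/items/${itemId}`, null);
  state.set(`players/${playerId}/inventory/${itemId}`, {equipped: true});
}
```

Because the inventory is keyed to the avatar rather than the world, the same mechanism lets worn items travel with you between worlds.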
Yes, yes, because the single player and the multiplayer code is exactly the same. It's using the same data store. The only difference with multiplayer is that that state is replicated. And then in addition to that, we of course do the actual voice transfer so you can speak. In fact, the more efficient way to do this, once we've fixed VR support, is probably going to be to do the animation once on one client side and then push the actual bone transforms to the network. And then the rest of the clients won't actually need to animate it. Because as I mentioned, this is actually one of the most expensive parts of the entire engine, just animating avatars. It's a known problem. This is how VRChat works as well. You basically just want to animate it once on the client and then not on other remotes. And then the remotes are just playing back animations. And so that's an optimization that we can do on top of this. But the way that it currently works is that just the actions are replicated, and the engine, whether that's single player or some remote client, is just picking up those transforms, interpolating them, and then passing them to the application, so the user is essentially seeing the same exact thing. I guess there's something you might explore, like the product Adobe recently acquired, the one that's pretty much able to understand what kind of mesh and skeletal mesh it has to apply. Sounds cool, I haven't heard of that. I don't remember the name of it. I'll definitely send it to you through email as soon as I remember. But it's gonna make it a pretty much easier kind of problem, because as I was mentioning before, I guess the skeletal mesh is the most expensive one. So if you have a boss that has more intricate movement, or some more floating movement, or anything like that.
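The animate-once optimization described here, where one client animates the avatar and remotes only interpolate between the last two received bone snapshots, can be sketched with a small dependency-free helper. Quaternion slerp is approximated by nlerp (lerp then renormalize) to keep the sketch self-contained; this is an illustration, not the engine's code.

```javascript
// Remote-side bone playback: interpolate between two replicated
// snapshots of a bone at parameter t in [0, 1].
function lerp(a, b, t) { return a + (b - a) * t; }

function interpolateBone(prev, next, t) {
  // Positions interpolate linearly component-wise.
  const position = prev.position.map((p, i) => lerp(p, next.position[i], t));
  // Rotations: component-wise lerp of the quaternions, then renormalize
  // (nlerp). Good enough for small per-tick deltas.
  const q = prev.quaternion.map((p, i) => lerp(p, next.quaternion[i], t));
  const len = Math.hypot(...q);
  const quaternion = q.map(c => c / len);
  return {position, quaternion};
}
```

Shipping interpolated transforms instead of re-running the full animation stack on every client is what makes the per-avatar cost on remotes close to zero.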
I feel like, you know, that could be really expensive if you have 1,000 players at the same time. Oh man, you need a T630 for that. There you go. So you mentioned, when you were talking about animations, that the animation the avatar has sort of depends on the sword or the weapon that they're carrying. Do the weapons send signals to the avatar, so to say, or how does that work? The weapon is a bunch of metadata, I can actually show it, which is then interpreted. In fact, this is the code for this exact sword. So first of all, it's saying that it's index.js, and this is just because the sword has additional features, like it can actually carve things into the world.
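A weapon being "a bunch of metadata" might look roughly like the object below. The field names (`start_url`, `components`, `boneAttachment`, the clip names) are hypothetical, chosen to illustrate the idea; they are not Webaverse's actual schema.

```javascript
// Hypothetical metadata for a wearable sword. The avatar system
// interprets this; the weapon never calls into the avatar directly.
const swordMetadata = {
  start_url: 'index.js', // extra scripted behavior, e.g. carving the world
  components: [
    // How the avatar should pose while this item is aimed.
    {key: 'aim', value: {appAnimation: 'swordSideIdle'}},
    // What triggering the item does, and how it binds to the skeleton.
    {key: 'use', value: {
      animationCombo: ['swordSideSlash', 'swordTopDownSlash'],
      boneAttachment: 'leftWrist', // closest bone to the hand
      position: [0, 0.1, 0],       // offset so the grip sits in the palm
    }},
  ],
};
```

Because the binding is just a named bone plus an offset, any avatar using the standard humanoid skeleton can hold the same sword.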
22. Avatar Animation and Future Possibilities
The system allows for aiming animations and multiple item holding. The difference between local players, remote players, and NPCs is important for AI features. AI-generated selectable voices are available for characters. The future may bring immense variety and complex applications. AI engines are selectable in the application, and OpenAI integration is possible. The goal is to provide an awesome experience that blurs the line between website and video game. The style will likely be closer to Fortnite's cartoony style. Building in Blender and drag-and-drop functionality are supported.
Is the system aware if an NPC is person-controlled or if it's AI-controlled? Yes, there are different classes in the code. There's a local player, then there's a remote player, and then there's an NPC player, I believe. So that's an easy check. I can even show that. So based on that, you're able to understand how much rendering you have to do for an object placed in the world? This isn't generally how we estimate that. The rendering cost for an object tends to be more a property of how many materials it has and how big the geometries are. Those are the two big factors. So that generally is not related to the actual players. We handle that at a much lower level. But for example, the difference between a local player and a remote player is most important for the AI features that we have, because we need to know, when we're chatting with the system: is this something that's from the player? Is there an NPC that should move, perform some sort of action? Yeah, one thing that I haven't even shown is that we actually have AI-generated selectable voices for all of these different characters. So in addition to some of the characters that we're creating in this list, we're gonna be adding a whole bunch more. And they all have their own unique AI-generated voices. Right now it's actually easy to test that: you can go into the audio settings and just change the voice to somebody else, and then your character has an entirely different voice.
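The class split the speaker mentions makes the "who said this" check trivial. The class and function names below are assumptions based on the transcript, not the repo's actual identifiers.

```javascript
// Sketch of the player class hierarchy: one base class, three kinds.
class Player { constructor(name) { this.name = name; } }
class LocalPlayer extends Player {}
class RemotePlayer extends Player {}
class NpcPlayer extends Player {}

// The AI/chat layer can branch on who produced a message, e.g. to
// decide whether an NPC should react or perform an action.
function classifySpeaker(player) {
  if (player instanceof LocalPlayer) return 'local';
  if (player instanceof NpcPlayer) return 'npc';
  return 'remote';
}
```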
Is that a neural net that understands what's going on and can learn from the fighting style, or adapt the boss difficulty, anything like that? Not at the moment. We're getting started mostly with just procedural generation of quests using the AI system. In the future, once we figure out what's the most fun thing to do and what people enjoy the most, we'll probably find ways to make this even better, like maybe giving certain characters memory; that actually sounds like a really cool idea. But right now it's mostly: how can we make something that's compelling and really fun to play? And then the sky's the limit from there. I really wish I could show a lot more of the AI features because it blows my mind.
So what could the future hold? It could be an immense variety of objects. Like you said, there's a lot going on with really big studios with big GPUs and software, and all of a sudden these look like little toys, but there's not a really wide variety coming out of it. How do you design an application that has that level of complexity? In terms of AI engines, it's actually just a selectable option in the application. If you have your own API key, for example for OpenAI, you can enter it. We don't actually store that; that's entirely local. And then you can basically use OpenAI with the app. We're hoping to be able to expand that to basically everybody for free, so you don't even have to do that. But that's the thing that requires OpenAI's approval before we can quote-unquote ship it. We want this to essentially just be an awesome experience when you jump in. You feel that for the first time: you opened a website and you're confused. Is this a website or is this a serious video game? Are you gonna keep it low poly, or are you actually gonna apply a little bit higher graphics? No, we're not really trying to be realistic, but we're probably not gonna be low poly quote-unquote. It's probably gonna be closer to Fortnite's style. So like a cartoony style? Probably. We found that a lot of people want to build a lot of different things on these kinds of technologies, and it feels like the cartoony style is the thing that's most compatible. So when one character goes from one world to another, they don't necessarily feel out of place if it's a cartoon. So that's what we're probably gonna be doing for our first party stuff, but we encourage people to build whatever; there's been other stuff that's actually trying to be realistic. And it's all really pretty easy to set up. If you can make it in Blender, then it's just a drag and drop. I love that feature.
I guess, you can either help or confuse so many people during a quest by, I don't know, duplicating the quest item or something that doesn't have any interaction.
23. Game Techniques and Audio
We're using techniques inspired by games like Minecraft and No Man's Sky. The goal is to create a living, breathing world with many things to explore. Web audio allows for positional sound and convolutions in the game engine. The boss feature is not yet enabled.
I'm actually really interested in what the game is gonna come up with, because the techniques that we're using, we're just learning from the best. The techniques that we're using are inspired by the games that we grew up on, like Minecraft and No Man's Sky and so on. We're just reimplementing the best of that stuff. And if you've ever played something like No Man's Sky, you know how crazy it is, the feeling when you first drop onto an alien planet and there's so many things to explore, and you feel like there's a living, breathing world. Another one that I really enjoyed was Subnautica. So check it out if you haven't. Yeah, absolutely. I wonder where the cave entrance is. Oh, there it is. All right. It turns out that things like Web Audio are also quite good. So we can get convolutions in the game engine essentially for free, just with metadata defining that this is an area with this much reverb. And again, you can get positional sound for the things that your characters say. I actually don't know if the boss is enabled for this. Looks like not yet, so we might be safe.
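The metadata-driven reverb idea can be sketched as follows: generate a decaying-noise impulse response whose length comes from the zone's declared reverb time, then in the browser feed it to a Web Audio `ConvolverNode`. The metadata field names and the cubic decay curve are assumptions for illustration.

```javascript
// Build a synthetic impulse response (decaying white noise) from area
// metadata. Only the pure math is shown here so it runs anywhere; the
// Web Audio wiring is sketched in the comment below.
function makeImpulseResponse({reverbSeconds = 2, sampleRate = 44100} = {}) {
  const length = Math.floor(reverbSeconds * sampleRate);
  const samples = new Float32Array(length);
  for (let i = 0; i < length; i++) {
    const decay = Math.pow(1 - i / length, 3); // cubic falloff
    samples[i] = (Math.random() * 2 - 1) * decay;
  }
  return samples;
}

// In the engine this would be hooked up roughly like:
//   const convolver = audioContext.createConvolver();
//   convolver.buffer = bufferFromSamples(makeImpulseResponse(zone.audio));
//   voiceSource.connect(convolver).connect(audioContext.destination);
// Positional voices would additionally route through a PannerNode.
```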
24. AI and Procedural Quests
We're experimenting with a party system for switching controls between multiple characters. The AI scene in our system understands descriptions and generates quests based on the data. We're constantly exploring the limits of procedurally generated quests. If you're interested, try playing with OpenAI. The combination of AI and web technology allows a one-person dev team to create an entire universe. We have a team and we're hiring. The AI system handles matchmaking and generates new quests.
Another thing that we're experimenting with, since we're doing quests and procedurally generated stuff like that, is for example, a party system, where sometimes you don't just wanna be one character, you want to be a group of characters. So how do you kind of switch controls between them? That's mostly just a matter of tweaking a bit of code.
I'm curious to know how you architected the procedurally generated quests. Right now, it's just based on the lore, mostly; this is really early days. So the way that the OpenAI system works is, there's a virtual scene. In addition to the real scene, which you're rendering, there's two more things. There's the physics scene, which is the view of the physics engine, which is not the one that you're rendering, but there's also a third kind, which is the AI scene. And what the AI scene is doing is understanding all the different descriptions, even textual descriptions, and the parsed-out descriptions of the data that you've loaded. So whatever this thing is called, and whatever additional descriptions you set for the scene, get fed into the AI algorithm, which then outputs what it thinks would be the best quest, the most fun quest that we could have in this territory. And then that's, once again, parsed out by the engine based on a premade set of quests. We're expanding it, and there's tons of people who have many ideas, but one simple one is go enslave this boss, because it understands that this is boss.glb and the world says, oh, this is a cavern which has a dangerous creature living in it. When you plug something like that into a system like OpenAI and you do proper prompt generation, it will actually output a reasonable approximation of: hey, the best quest would be, go over here to the quest entrance, then follow this path to go enslave this boss, and your reward will be such and such. And once you parse that out in the engine, you have basically a procedurally generated quest. And we're constantly exploring what the limits of that are.
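The prompt-generation step described above, feeding the AI scene's parsed descriptions into a quest prompt, might be sketched like this. The template and field names are illustrative, and the actual OpenAI API call is omitted.

```javascript
// Assemble a quest-generation prompt from an AI-scene description:
// the scene's name and lore, plus one line per described object.
function buildQuestPrompt(scene) {
  const objectLines = scene.objects
    .map(o => `- ${o.name}: ${o.description}`)
    .join('\n');
  return [
    `Location: ${scene.name}. ${scene.description}`,
    `Objects present:`,
    objectLines,
    `Write the most fun quest for this location as: objective, path, reward.`,
  ].join('\n');
}
```

The model's completion would then be parsed back against the engine's premade quest templates, as the speaker describes.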
I don't know. That's super interesting. I want to learn more about it. Any reference you can point me to? I wish the answer that other people had for me was yes, but a lot of this stuff is just new, and I've mostly had to learn it all myself. But what I really encourage anybody, especially on the AI side, if you're really interested: just go and play with OpenAI. I think it's open now for everybody. You just try some prompts, and I think you're gonna be surprised. And if you're an engineer, it's pretty easy to convert people and make them realize what this can actually do for a medium like video games. And is there any part of that on your GitHub which I can look at for reference? Yeah, there's a lore, sorry, there's an AI folder in our repo. You're saying AI, but the main theme of this is the fact that you built a game, a pretty good game, with just JS that runs on the web. What does AI have to do with that? Isn't AI a smaller thing in this? Isn't the bigger thing the engine and the shaders and the performance optimization, et cetera? I think it's the combination of everything. Because if we talk about something like Minecraft, which, by the way, a lot of people ask for examples of games on the web that have actually had success: well, Minecraft actually started as one of those things. It was a Java applet initially. And it certainly didn't even have the best graphics at all. So it's actually really hard to say what the focus for the next Minecraft or Fortnite should look like, using all these open technologies. I think the AI part of it, though, is one of the key differentiators that is only possible on something like the web, where all of the metadata has been precisely defined and structured in the ways that make the web just such an awesome platform. You can just parse it out.
Once you combine that with something like AI, that's what allows a mostly one-person dev team, which for most of this project's lifetime is all it's been, to basically make an entire universe. And I kind of want to convince people that that's probably the way that we should be creating and self-empowering ourselves. And I'm just trying to show people and inspire them, I guess.
You probably talked about this in the beginning, but I was a little late to the workshop. So are you working on this all on your own? No, no, no, we have a team. Oh, okay. And we're hiring as well. I hope that one of the people on the team can share some links if you're interested. We need a lot of help to build this. Despite how far we've come, there's still a long way to go. That can get lost here. So is the AI doing matchmaking for quests, or is it more zero-shot, making new quests it hasn't seen before? Two different things. There are different systems that are mixed together here.
25. Procedural Generation and Interoperability
Procedural generation and AI-controlled systems allow for the creation of a livable and immersive world using web assets. The combination of AI models, such as text-to-image and text-to-3D, has transformative potential for industries like gaming. However, building on closed platforms and advertising-based foundations can limit progress. By utilizing the web as a platform for the metaverse, we can create a better and more inclusive experience. Interoperability is a key aspect, including the ability to carry avatars, assets, and identities across different metaverses. Cryptographic signing on the blockchain ensures security, and features like specifying someone's ENS in the URL and login with Metamask and Discord enhance user experience. A minting system is also in development, allowing users to define metadata for objects and mint them directly. The goal is to enable a 3D experience for anyone, regardless of their data source or profile picture.
Some of it is just procedural generation, which I don't know if that counts as AI, but that's the kind of thing that can, for example, place different classes of characters in the world. But then on top of that, you have the entire quest system and the personalities of all the different avatars. So that's where we're gonna be generating, for example, lores and bios and descriptions for these different avatars, which gives them different stats, different objects, and different behaviors. So actually, despite the system being AI-controlled, it's kind of interesting, because you can't directly tell the game to do anything. The game is, you have to convince the AIs to do it. And some of the AIs that you're gonna meet are not gonna do what you want them to do. Their personality might be really stubborn, or they're gonna rip you off. So basically, the AI layer allows us to mix our procedural generation techniques in entirely new ways that build a livable, breathable world, despite the fact that this is all just web assets. What you really have behind it is the knowledge of the entire internet, the entire history of the internet, trained into an AI that's telling you an immersive story with all of these assets. And then it's our job to make that entire experience really fun and interesting, and to add in the game mechanics that we just want to play. Like, hey, this is a really fun game, it's a really cool way to kill time. It's a really cool place to make art, maybe make some friends, I don't know. I have yet to see a really good game which uses multi-modal AI models, like text-to-image or text-to-3D assets. I think all that stuff is in its infancy, but once a lot more people start waking up to the different combinations that you can have there, it's probably gonna be transformative for a lot of industries, especially things like games.
I know this is exactly what Mark Zuckerberg wants to do. I think Mark Zuckerberg and I probably would agree on a lot of things. But the problem with what Meta is doing is they're trying to build all this stuff on the wrong foundations, essentially. It's all built on top of advertising and all of these closed-source platforms that are each extracting taxes from the user. And sometimes taxes that we don't even understand, like taxes on free speech. And I think we have the opportunity to actually make something way better by basically just using the web. The web that's always been there. The web that everybody is already using. If we really take it seriously as a platform for the metaverse. I was actually debugging HTML rendering features. Since we're in the browser, we can actually just drop HTML pages in here, but I still have to fix the bug there. It was just using iframe transforms. Are there any other interoperability features on your roadmap? Can you expand on that? So, there's interoperability at the avatar level, the assets level, and the level of people's identity, let's say on the blockchain or NFT level. Like, what amount of things can I carry from one game to another game in various metaverses? So we're probably gonna cryptographically sign all these things just on the blockchain. One of the things that we're going to do, one of the most asked-for features actually, is to be able to just specify somebody's ENS in the URL. That will load their own gallery that they've set up, which is just gonna be written to our contract or stored on some system like Ceramic. So there is actually login with both Metamask and Discord. And with the Metamask login, we'll just pull all those tokens. We also have a minting system in the works, so that anything that you're dragging and dropping will give you a wizard for defining all the different metadata for the object that you're dropping.
And then maybe you can even mint it right here, which will, once again, just be added to your Metamask. And there are no fees for that at all. Although I think we're probably gonna limit it to people who actually own one of our tokens, like a copy of the game, so that we can provide you reasonable hosting service without breaking our own bank. But it's also possible, since the stuff that we're doing is just open-source contracts, for anybody to load up whatever they want and mint their own things. It doesn't have to go through us. Interesting. And just out of curiosity, have you seen any algorithm or workflow where people are able to make a character out of their 2D NFT and generate an avatar, or whole storytelling? I've played around with it. There are some easy wins that you could do there, but I haven't seen anything that was really compelling for me. But I think that's probably gonna be the future, where we have some sort of asset and it's transformed through this AI process or something into some other form. The problem is that it requires a lot of guesswork most of the time, because if you have a 2D PFP, that really doesn't tell you everything you need to know; it doesn't have a full body that you could generate. But if anybody has any interesting techniques, I'd love to hear about them and test them out. Because ultimately we do wanna make sure that anybody who has some sort of data, whether that's on-chain or not, is able to have a good 3D experience, and they don't just have to be a 2D plain profile pic. All right, well, if that's all, thanks for having me. You can follow us, it's Webaverse on Twitter, or if you wanna follow me, I'm Web Mixed Reality. But I've been tweeting lately.
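The "ENS in the URL" feature mentioned above starts with pulling a name like `vitalik.eth` out of the page URL; the actual ENS resolution and gallery loading are omitted here. The query parameter name `u` and the validation pattern are assumptions for illustration.

```javascript
// Extract a candidate ENS name from a gallery URL, e.g.
// https://example.com/?u=vitalik.eth -> 'vitalik.eth'.
// Returns null if the parameter is missing or doesn't look like a
// simple .eth name. Resolving it on-chain would be a separate step.
function parseEnsFromUrl(url) {
  const name = new URL(url).searchParams.get('u');
  return name && /^[a-z0-9-]+\.eth$/.test(name) ? name : null;
}
```

Once the name is validated, the app could resolve it through an ENS provider and load whatever gallery contract or Ceramic document the user has associated with it.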