The open source PlayCanvas game engine is built specifically for the browser, incorporating 10 years of learnings about optimization. In this talk, you will discover the secret sauce that enables PlayCanvas to generate games with lightning fast load times and rock solid frame rates.
Optimizing HTML5 Games: 10 Years of Learnings
Transcription
Hi, my name is Will Eastcott. I'm the creator of PlayCanvas. I'm going to be talking to you today about optimizing HTML5 games based on 10 years of learnings working on the PlayCanvas game engine. So to begin with, I want to start by explaining a little bit about what PlayCanvas is. It's an open source game engine. It's written in JavaScript. It's based on WebGL. And you also get this visual editor that's browser-based. It's real-time collaborative. It's built in the cloud. So you can visually build your games in this editor. PlayCanvas actually powers Snap Games, which is the gaming platform in Snapchat. It's had over 200 million players. There's a large number of games in just about any genre, so I would encourage you to check them out. But PlayCanvas isn't just used by Snapchat-based game developers. It's used by game developers the world over to make all sorts of different types of games, from casual games to .io games. It's actually pretty popular with FPS game developers. And you can see several of those represented there. Now my personal journey in working on game optimization started in the early 2000s, working for a company called Criterion Software on a game engine called RenderWare. RenderWare was used to power about a third of the games in the PlayStation 2 generation. And day-to-day, I would be working on this type of hardware. So we're talking a T10000 PlayStation 2 developer kit. And yeah, if you wanted to do performance analysis on that kind of hardware, you would struggle. You would often need to go into Sony's HQ and work on special, very expensive hardware that was developed by them. It was incredibly inconvenient. Fast forward to 2022, and HTML5 game devs are living the dream, right? I mean, we've got incredibly powerful hardware in the palms of our hands. And we've got great tools that are built right into our web browsers. So is optimization still important? Well, spoiler alert, yes it is.
Now performance optimization, in my view, falls into two main areas. There's load time optimization and there's frame rate optimization. Let's start by talking about load times. Now this is something that we don't want our end users to see: loading bars. So why does it matter whether we present our users with loading bars? Well, as it turns out, after six seconds of waiting, we tend to lose something like 40% of our audience, who will just bounce, not prepared to wait for the page to load. So when we begin an investigation into load time, what kind of tools do we have available to help us? Well, like I said, built right into the browser, you have some pretty advanced tools. In Chrome DevTools, you have a couple of tabs. You have the network tab, which shows you what resources my game is loading. And then you have the performance tab, which shows how my game is loading those resources. So when you start your investigation, you typically look for the low hanging fruit. If you look in the bottom left of the network tab, you'll see the number of requests made by the browser. You'll see the amount of data that's transferred. And you'll see the time it takes to load your game. Now, the first thing that you'll want to do is sort the list of resources by size, because obviously, the bigger the resource, the bigger the opportunity for optimization. You can search for duplicate or redundant resources that your game shouldn't really be loading in the first place. And if we look at the biggest file that's being loaded into our game here, it's a 2.2 megabyte JPEG. So we can ask ourselves, hey, can we downsize these resources? Can we optimize them somehow? Now, as it turns out, in HTML5 games, most of the data typically tends to be texture data. And in our example on the previous slide, we had a 2.2 megabyte JPEG. Now, when the browser downloads this file, we need to decompress it into memory.
And that is 48 megabytes of data. Then we have to pass it to WebGL to be used as a texture. And a copy of it is made. Plus, we have to generate mipmaps, which makes that copy another 64 megabytes of data. So together, that's like 112 megabytes of data just for a single JPEG. Now, if you try to load about 10 of these into a browser tab on mobile, I guarantee you, you'll crash the tab. So we need some kind of solution for this. Moreover, for every JPEG we load, we have to pay a decode cost. It takes time to actually decompress the data from JPEG to an uncompressed format. And in this case, this 2 megabyte texture takes 160 milliseconds, which is just excessive because it's going to cause the main thread to block, and it's going to cause long load times. Luckily, we have hardware texture compression that we can use to load more optimized texture data. So if we take our original 2.2 megabyte JPEG, and we crunch that down to some natively supported hardware texture formats like DXT, PVR, and ETC, we find that the sizes are actually larger than the original JPEG. But because the format of this hardware texture compression data is native to the GPU, we can just supply it to the GPU with no decode cost, which is great. Also, the amount of GPU memory used by hardware compressed texture formats is between one fifth and one tenth of that used by the original JPEG. So at least we're sure we're not going to be crashing the browser tab anymore. But we are paying the cost of having to download large images. And that's a problem. So what can we do? Well, fortunately, there's another texture format called Basis, which is essentially an abstract compression format that is designed to be transcoded to any of the natively supported formats at runtime, at the load time of your game. So if we take that original JPEG, and we convert it to a Basis texture, it goes from 2.2 meg down to 1.7, which is great. But also, we get all of the benefits of these native formats.
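The memory arithmetic for that single JPEG can be reproduced with a short sketch. The talk does not state the image dimensions; the 4096 x 4096 size at 3 bytes per pixel used here is an assumption, chosen because it yields the quoted 48 and 64 megabyte figures. The function names are illustrative and are not part of PlayCanvas.

```typescript
// Bytes needed for one uncompressed mip level.
function levelBytes(width: number, height: number, bytesPerPixel: number): number {
  return width * height * bytesPerPixel;
}

// Bytes for the full mip chain, halving each dimension until 1x1 is reached.
function mipChainBytes(width: number, height: number, bytesPerPixel: number): number {
  let total = 0;
  let w = width;
  let h = height;
  for (;;) {
    total += levelBytes(w, h, bytesPerPixel);
    if (w === 1 && h === 1) break;
    w = Math.max(1, w >> 1);
    h = Math.max(1, h >> 1);
  }
  return total;
}

// Bytes for the top level of a block-compressed texture made of 4x4 pixel blocks.
// 8 bytes per block corresponds to formats such as DXT1 and ETC1.
function blockCompressedBytes(width: number, height: number, bytesPerBlock: number): number {
  return Math.ceil(width / 4) * Math.ceil(height / 4) * bytesPerBlock;
}

const MiB = 1024 * 1024;

// Assumed dimensions: 4096 x 4096, RGB at 3 bytes per pixel.
const decoded = levelBytes(4096, 4096, 3);              // decoded image in main memory
const gpuCopy = mipChainBytes(4096, 4096, 3);           // texture copy including mipmaps
const compressed = blockCompressedBytes(4096, 4096, 8); // top level at 8 bytes per block

console.log(`decoded: ${(decoded / MiB).toFixed(1)} MiB`);
console.log(`GPU copy with mipmaps: ${(gpuCopy / MiB).toFixed(1)} MiB`);
console.log(`block compressed: ${(compressed / MiB).toFixed(1)} MiB`);
```

Under these assumptions the block-compressed top level comes to one sixth of the decoded size, which sits inside the one fifth to one tenth range quoted in the talk.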
So again, that's between one fifth and one tenth of the GPU memory that the original JPEG occupied. So what does compressing textures to Basis look like in the PlayCanvas editor? Well, it's a very simple operation. You just select the textures that you're looking to compress. You can see we've got four 2K textures here, which are like five megabytes of PNG data. You just say, hey, I want these to be Basis compressed. You import the transcoder. And you hit the compress button. And there you have it. It's very simple. So after texture data, the next biggest contributor to download size in an HTML5 game is often mesh data, at least for 3D games. So let's talk a little bit about mesh optimization. Here we have something called the Stanford Dragon. It's a mesh that's often used in rendering experiments. And we're going to use it in some tests here. Now, you can see it's a very dense mesh. It's got hundreds of thousands of vertices and triangles. Probably not a typical game asset, but it should underline some of the points I'm making. So if you're going to start thinking about a format to load this data with, well, I think it's reasonable to start with JSON, because JSON works great with the browser. It's very easy to work with in JavaScript. You can just parse the data into a JavaScript object. So incredibly convenient. Now, it turns out that this Dragon mesh will serialize to 43 megabytes of JSON. That's fairly big. But it will gzip down relatively aggressively, to 12.8 meg. The big problem we have here, though, is that parsing that much JSON takes 1.25 seconds. And also, because we're having to throw around quite a large JSON file, the application has quite a high peak memory usage. So what can we do? Well, the Khronos Group owns an open standard called glTF, which is essentially designed to be the JPEG of 3D. A large ecosystem has grown up around glTF, with various companies providing tools, such as PlayCanvas.
And we now use glTF as the primary format for the engine. So let's examine how glTF performs with this particular data. GLB is the binary format of glTF. If we save out a GLB file containing this mesh, we find that it's less than half the size of the original JSON file. And when we gzip it, it's only marginally smaller than the gzipped JSON data. That's because JSON is text and it compresses very well. The key thing to notice is that the parse time for the GLB file is just 50 milliseconds, a tiny fraction of the time it takes to parse the JSON, which is pretty incredible. The reason for this is that the glTF format stores data in a GPU-ready format. So once it's copied out of the file, you can pass it directly to WebGL with no processing. So this is a huge win, and game engines and HTML5 games should ensure they're using glTF. Also, the peak memory usage is lower because, like I say, we're not throwing around large JSON datasets. We can do even better than this, though. We can compress the GLB file using some technology from Google called Draco. This is an extension of the glTF specification and allows you to compress the vertex data. So here we can see that the 21 meg GLB can be compressed down to 1.84 megabytes. And you can even gzip that slightly further, down to 1.79 meg. The only slight issue that you must be aware of is that this data needs to be decompressed at load time. Running the Draco decompressor for this mesh takes 0.4 seconds. But as we did with the Basis textures, we can offload that to a Web Worker thread, so that we don't stall the main thread and essentially hide that processing. I've mentioned a few times that it's important to use gzip or Brotli compression as part of your process to publish your games. It's critical that your infrastructure, your server, serves your game resources compressed.
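Whether a resource was served compressed can also be checked in code by inspecting the value of its Content-Encoding response header. The following is a minimal sketch; the function name `isCompressedEncoding` is hypothetical and does not come from the talk or from PlayCanvas.

```typescript
// Returns true when a Content-Encoding header value lists gzip or Brotli ("br").
// The header may be absent (null) or hold a comma-separated list of encodings.
function isCompressedEncoding(header: string | null): boolean {
  if (header === null) return false;
  return header
    .split(",")
    .map((token) => token.trim().toLowerCase())
    .some((token) => token === "gzip" || token === "br");
}

// Example header values as they might appear on a response.
const samples: Array<string | null> = ["gzip", "br", "identity", null];
for (const sample of samples) {
  console.log(sample, isCompressedEncoding(sample));
}
```

In a browser, the header value could be read from a fetch response with `response.headers.get("content-encoding")`. For cross-origin requests the header is only readable if the server exposes it, so the network tab remains the more reliable place to look.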
To verify that your server is compressing resources, you should be able to select any resource in the network tab, look at the Content-Encoding header, and check that it's set to either gzip or Brotli. Now, how you compress your data to put it on your infrastructure will depend on your backend services provider. The technique will be different per provider. But here, there's an example of how you would do it with Google Cloud. You would use gsutil, and you would specify which file types you want to be gzipped. So let's apply some of these techniques to the HTML5 game Swoop. We can see that the original game used JPEGs for textures and JSON for model data, and the original load time was 4.5 seconds in total. Just by crunching all of these JPEGs down to Basis and re-importing all of the artwork as GLB instead of JSON, we can shave an entire second off the load time. And that represents a few percent of your audience that you've managed to retain by reducing that load time. There are other techniques for improving your load times, one of which is actually thinking about your game design. One of my favorite games of all time is Metroid Prime. And they had an interesting technique where you would move from one large open area to another large open area through a tunnel. When you're moving through the tunnel, they unload the previous area and start asynchronously loading the next area. When you get to the end of the tunnel, you shoot the door. And as soon as the next area has finished loading, the door will open. It means that in the entire game, you don't see loading bars, and the entire environment seems completely seamless. This technique is used in many PlayCanvas games. So here we have Bitmoji Party, which loads a very minimal set of assets to show that initial menu. It maybe takes two seconds for that initial menu to load, be displayed, and be interactive.
And while the user is spending those two or three seconds just deciding what they want to select as an initial gameplay option, we can be streaming the first set of assets for that mini game on the right. And it means that in that particular game, you don't see loading bars. Let's move on to talk about frame rate optimization. Why do we care about frame rate these days? Surely, smartphones are pretty powerful these days, right? Well, it's interesting if you go and look at some of the benchmarking numbers for phones that are on the market today. So I took the iPhone 13 and the Samsung Galaxy A21s. And the iPhone 13, in terms of CPU benchmarks, outperforms the A21s by an order of magnitude. This is a huge difference. So it's important that your game can scale from the high end all the way down to some of these more budget devices, as well as considering some of the legacy devices that aren't even on the market today. What tools do we have in our arsenal to be able to investigate frame rate? Well, during the load time investigation, we looked at the performance tab. The performance tab is very powerful. You can also use it to run a hierarchical profiler for your code. So you can capture a trace over, say, 10 seconds. And you can then drill down into the call stack of your game, identify the hotspots that take up the most CPU time, and then focus your efforts there. We also have the timeline view in the performance tab, which allows you to zoom into an individual frame and have a visual representation of where time is spent in that frame. So here I can see that my game loop consists of an update and a render. Now, in this case, the render function takes almost three times as long to run as the update function. So I'm clearly going to want to focus my efforts there, because most of the performance wins can be found there. So to investigate rendering performance, there are many tools available that can be found as browser extensions.
One really great one that I recommend is Spector.js. It allows you to capture a single frame of rendering, and it shows how WebGL is being driven. So you can see individual draw calls being submitted, and you can even drill down and see the shader code that's being executed. Perhaps the biggest mistake I see game devs make today is picking the wrong resolution for their game. Now, I've picked a couple of devices here to make a point. The iPhone 13 Pro Max actually has 20% fewer physical pixels than the Samsung Galaxy S6, which is a seven-year-old device, whereas the S6, obviously, has a much weaker GPU in terms of fragment processing. So it wouldn't make sense for you to just render your game at full device resolution on every device. What you can do is either give the user an option to render at different resolutions, or better yet, you can detect the GPU. You can do that on Android, and you can make a decision about the rendering resolution based on the family of GPU that you've detected. Something else you should consider is limiting the graphical complexity of your game. So on the left, there's a game called Bitmoji Tennis, which is very simplistic. It uses a single emissive map for the environment. There are no dynamic shadows. Everything is baked. There's a single directional light for lighting the characters. And then on the right, you can see there's a much more complex demo, graphically speaking. It's a technical demo that the PlayCanvas team built, which uses physically based shading, shadow mapping, image-based lighting, and other effects as well. The shaders of the game on the left are far simpler, and therefore they put less strain on the GPU, and you're able to render at a higher frame rate. I also recommend that you be very careful about the number of draw calls that your game makes.
Now, a draw call is essentially submitting a graphical primitive to the GPU with some render state. Every draw call has an overhead in terms of CPU and GPU cost. And a typical HTML5 game might render between, say, 100 and 200 draw calls if you want to target some of these low-end devices. So one technique for optimizing draw calls is atlasing textures. For this environment that you see rendered there, there were several materials. There's wooden planks on the floor. There's wallpaper on the walls, et cetera. Now, these textures are atlased into a smaller set of textures, and this means that draw calls can be combined together. Another technique is batching. Here we have an environment that's rendered using seven distinct models. And some of these models are rendered multiple times. You can see, for example, that some of the buildings are duplicated. The cacti are duplicated. And when we render this scene normally, this is the way the scene is built up. You can see there's an individual draw call per object. But if we use batching, we can combine similar models into combined draw calls, if you like. And this means that we can render the entire scene in just six draw calls instead of the original 50, so almost an order of magnitude reduction, which pretty much maps to an order of magnitude lower overhead on the CPU and the GPU, because there are fewer draw calls. So I'm going to leave you with three key pieces of advice here. Do not leave optimization considerations until the end of your development process. Also, design your game with performance in mind right from the very beginning. And lastly, select what your baseline device is and test on that device early and throughout your development cycle. That's it for this talk. Thank you for listening.
So let's start with taking a look at the poll results here.
So it looks like there's an overwhelming majority of Chrome users here. Yeah, I guess this is sort of what I expected. I mean, I don't know about you, Omar, but I use Chrome mostly, though I have to dip into all of the different browsers because obviously it's important that I test the engine everywhere it's going to run. And yeah, so it's not uncommon that I spend a bit of time in Safari and Firefox and so on. Yeah. And especially, I was just going to say, the way the question is phrased, like, which browsers do you use to optimize your games? Because I also primarily use Chrome DevTools, but I was pleasantly surprised last time I was trying Firefox that their devtools have caught up, at least since five years ago when I last used it. So maybe, I mean, I don't know if it's comparable, but it is worth looking at, I think. I think the CPU profiler is really awesome. I use that quite a lot. Yeah. I'm trying to think, also I noticed there are some people who went for Opera here and, I mean, presumably Opera shares Chrome DevTools with... I was going to ask that. I don't know if they have their own engine. I've never used Opera. And Safari, because I haven't done much iOS debugging: I know you can do the thing where you connect the phone and then you can do this remote debugging. Does Safari have pleasantly good devtools? Well, yeah, the devtools are pretty great. Yeah. I mean, as you mentioned, the main reason you're going to use them is because you're connecting to an iOS device, and then you can debug either a WebView or the Safari browser. And that experience is pretty good. I think the only issue that we've had is that PlayCanvas is used to build quite a few hybrid applications, where a WebView is embedded in the application so that it can be shipped to an app store. And you can't actually connect to an app that's been signed for production, which is kind of frustrating.
But yeah, generally it works pretty well. And Chrome's the same with Android, right? Like the experience when you connect over, I mean, I normally do USB debugging, but that whole experience is pretty slick these days. Cool. That makes sense. And we have a couple of questions streaming in from the audience. So Dan says, great talk. Are there any tools for detecting device performance? So there are a lot of tools. I'm not quite sure if Dan means theoretical performance or whether you just want to get performance stats out of a running app. I mean, in PlayCanvas, we have something that we call the mini stats profiler. It's a panel that you automatically get as part of our launch page when you run your application. And it will show you CPU utilization, GPU utilization, the number of draw calls, and stuff like that. So you've got a real-time little HUD down there that shows you performance stats for the device that you're running on. And it's really important that that's easy to do, especially on mobile, right? Because most of us are targeting mobile these days. So having easy ways to figure out what performance you're getting on a mobile device is really, really important. The one slight problem that you have on mobile these days is that if you want to do GPU profiling, it's quite difficult in the browser. There's a WebGL extension for disjoint timer queries, EXT_disjoint_timer_query, which allows you to essentially do accurate timings on the GPU. So that's what our mini stats profiler uses. But unfortunately, it's not well supported on mobile. I don't think it's supported on iOS, for example. That makes sense.
I think it's actually a good segue into the next question, where Mark asks if you ever find it useful to do real-time profiling of the hardware a game is running on, and then make modifications based on that, like, okay, for this device, we'll have higher LODs or lower LODs or something like that. Do you ever do these micro-optimizations based on specific devices? Yeah, that's a really good question. I mean, there was a slide in my talk where I mentioned the huge disparity in power of mobile hardware today that you can actually just go out and buy in a shop. And so it would be a huge shame if you had to write a game which was just targeted towards the lowest common denominator hardware. So if you're going to give your players a better experience, you'll want to find a way to let your game scale depending on the performance of the device, right? That can be tricky, because in the browser, you're limited in being able to detect the device on which you're running. So for example, on iOS, you don't know which GPU is being used. You can do things like user agent sniffing to give you clues about the device. You can even make assumptions based on things like the reported window width and height, or window.devicePixelRatio. Those kinds of properties can give you clues to the age and power of the device. And then on Android, things are a bit easier because Chrome will report the actual GPU family that you're running on. So actually, I know many game developers that write their games with some kind of if statement in their code that says, if Android, and then they've got a switch statement with certain GPUs that they want to do something specific for. Normally, it's seven, eight, nine year old GPU families like the ARM Mali-400 or the Adreno 300 family, right?
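The GPU-family check described above might look something like the following sketch. It assumes the renderer string is read through the `WEBGL_debug_renderer_info` extension, whose availability varies by browser. The tier names, the regular expressions, and the per-tier settings are all hypothetical and would need tuning against real devices. `GlLike` is a minimal stand-in for a WebGL context, so that the sketch type-checks without DOM typings.

```typescript
// Minimal structural view of a WebGL context (stand-in for WebGLRenderingContext).
interface GlLike {
  getExtension(name: string): { UNMASKED_RENDERER_WEBGL: number } | null;
  getParameter(parameter: number): unknown;
}

// Reads the unmasked renderer string, or "" when the extension is unavailable.
function readRenderer(gl: GlLike): string {
  const ext = gl.getExtension("WEBGL_debug_renderer_info");
  if (ext === null) return "";
  const value = gl.getParameter(ext.UNMASKED_RENDERER_WEBGL);
  return typeof value === "string" ? value : "";
}

type GpuTier = "low" | "default";

// Hypothetical tiering: treat the Mali-4xx and Adreno 3xx families as low end.
function classifyGpu(renderer: string): GpuTier {
  if (/Mali-4\d\d/i.test(renderer)) return "low";
  if (/Adreno(\s*\(TM\))?\s*3\d\d/i.test(renderer)) return "low";
  return "default";
}

// Hypothetical settings a game might switch on per tier.
function settingsFor(tier: GpuTier): { particles: boolean; renderScale: number } {
  return tier === "low"
    ? { particles: false, renderScale: 0.5 }
    : { particles: true, renderScale: 1.0 };
}

// Example with a renderer string of the kind an older Android device might report.
console.log(settingsFor(classifyGpu("Adreno (TM) 330")));
```

Returning a neutral default when the renderer string is missing keeps the game usable on platforms, such as iOS, where the GPU cannot be identified this way.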
On those older devices, you know, you're probably going to want to limit the complexity of what you're rendering. So maybe, I don't know, you're going to turn off particles or use lower LODs, like you said. So yeah, there are techniques you can use to sniff characteristics of the hardware and then enable and disable certain aspects of the game. That's cool. I mean, it's actually sad that we have to do this micro-optimization based on the device. Because, I mean, in my mind, the reason I like the web is this whole publish once and it's the same thing everywhere. But it's also, I guess, nice that we can, at least when we need to, make these changes based on the device. We have one more question. It's a bit of a spicy question. This is Daniel asking: do you think Apple is purposely holding back Safari on iOS so as not to endanger their App Store profits? Or do you think there are other reasons why they're so far behind on progressive web apps and modern APIs in general? Yeah, that's an interesting one. I mean, I think there probably are some considerations about the App Store business, but I think we have to give Apple some credit here because they've really stepped up with their WebGL implementation. So WebGL 2 is now out in production on iOS and macOS. And in addition to that, we're also seeing a commitment from Apple to implement WebXR in WebKit. So I think we're in a slightly different world to where we were, say, two or three years ago, when there was quite a lot of frustration that we were still lingering on WebGL 1. But yeah, I think, let's give them some credit and say they're catching up very rapidly. And I think things are going to be really exciting once WebGPU lands. I think it's available as a toggle if you go into the settings of the browser on iOS. So yeah, I mean, things are looking pretty good now.
The other thing to say is that you can ship HTML5 games to the App Store using technologies like Cordova. We've got a guide in the PlayCanvas developer documentation that explains that process. So if you just Google PlayCanvas Cordova or something, it'll take you to that page. And you'd be surprised how easy the process is. You can do it in five minutes: build an IPA executable, which you can then test on your iOS device. And it's, yeah, super easy. That's awesome. That makes a lot of sense. Thank you so much, Will. It was so great having you here. Really appreciate it. And it's super cool hearing all your insights. Thank you so much. My pleasure. Bye-bye.