Hello everyone. My name's Jason Mayes. I'm a developer advocate for TensorFlow.js here at Google. And I'm excited to be talking to you about machine learning in the browser and beyond. So let's get started.
Pre-trained models
[02:15] Now, first up is object recognition. This is using COCO-SSD behind the scenes and is trained on 90 object classes. You can see this in action on the right-hand side, with the dogs being highlighted with their bounding boxes. And we even know that there are two dogs in this image, as both are returned to us. So let's see this live to see how it performs in the browser.
[02:37] Okay, so here's a web page I created that's running this code live in Chrome. And if I click on any one of these images, I can now get object detection working for any objects it finds in those images, even if they're different object classes. But we can do better than this: we can actually enable the webcam and do this live in real time. You can see me talking to you right now, and you can see how it's classifying both myself and, sometimes, the bed in the background as I speak to you.
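The detection step in that demo can be sketched in a few lines. This is a minimal sketch, not the speaker's actual page code; it assumes the TensorFlow.js and coco-ssd CDN `<script>` tags are on the page (exposing a global `cocoSsd`), and `imgElementId` refers to a hypothetical image element:

```javascript
// Sketch: object detection with COCO-SSD. Assumes the tfjs and coco-ssd
// CDN <script> tags are loaded, exposing a global `cocoSsd`.
async function detectObjects(imgElementId) {
  const model = await cocoSsd.load();            // downloads the weights once
  const img = document.getElementById(imgElementId);
  const predictions = await model.detect(img);   // [{bbox, class, score}, ...]
  console.log(describe(predictions));
  return predictions;
}

// Pure helper: turn raw predictions into readable strings.
function describe(predictions) {
  return predictions.map(p => `${p.class} (${Math.round(p.score * 100)}%)`);
}
```

For the two-dog image described above, `describe` would produce something like two "dog" entries with their confidence percentages, one per returned bounding box.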
[03:29] So next up we've got face mesh. This is just three megabytes in size and can recognize 468 facial landmarks on the human face. You can see this in action on the left-hand side of the slide right now. People are starting to use this for real-world use cases: L'Oreal, for example, has created an AR makeup try-on, which allows you to try on lipstick, in this case in real time, without even having to be physically present in the store.
You should note that the lady on the right-hand side is not wearing any lipstick. We're using face mesh to understand where her lips are, and then we use WebGL shaders to augment the color of lipstick she wants onto her face in real time. So this is super cool, and I'm sure we're going to see more stuff like this coming out in the future.
[04:16] So let's see face mesh in action to see how it performs in the real world. Let's switch to the demo. Okay. So now you can see me talking to you with face mesh running in real time in the web browser. At the same time, on the left-hand side here, you can see the machine learning in action, and there is indeed a mesh of my face being overlaid as I move my face around, and it's pretty robust. I can open and close my mouth and my eyes, and you can see that happening all in real time.
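Driving face mesh from a webcam looks roughly like the sketch below. This assumes the facemesh CDN `<script>` tag is loaded (exposing a global `facemesh`, as the model was published around the time of this talk), and `videoElement` is a webcam-backed `<video>` element:

```javascript
// Sketch: face landmark tracking with the facemesh model. Assumes the
// tfjs and facemesh CDN <script> tags are loaded (global `facemesh`).
async function trackFace(videoElement) {
  const model = await facemesh.load();
  const faces = await model.estimateFaces(videoElement);
  // Each detected face carries its 468 3D landmarks in `scaledMesh`.
  return faces;
}

// Pure helper: read one landmark [x, y, z] out of a mesh by index.
function landmarkAt(mesh, index) {
  return mesh[index];
}
```

Calling `trackFace` in a `requestAnimationFrame` loop and redrawing the returned mesh each frame is what produces the live overlay effect shown in the demo.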
[05:47] Okay, back to the slides. So next up is body segmentation. This allows you to distinguish 24 body areas across multiple bodies, all in real time. You can see this in action on the slide. On the right-hand side, you can see that the different colors represent different parts of each body. Even better, we've got pose estimation going on at the same time: those light blue lines contained within each of the bodies on the right-hand side allow us to estimate where the human skeleton is. And that can enable really powerful demos, such as the ability to recognize when you're in a certain pose or making a gesture or something like that.
We've already seen people in our community use this to make workout instructors or yoga instructors and this kind of stuff. So it's super cool to see the creative potential of this model. And in fact, with a bit of creativity, we can use things like BodyPix in a number of delightful ways. Here are just two examples I created in my spare time.
On the right-hand side, I also made a clothing size estimator. Now, I don't know about you, but I'm really terrible at knowing what size clothing I am when I buy clothes once a year, and for different brands I'm different sizes. In some brands I'm a small; in other brands I'm a medium. So I never know what to select at checkout. Here, in under 15 seconds, I can get an estimate of my body size for the key measurements that particular brand cares about, and automatically select the correct size for me at checkout. That saves me the time and money of having to return things when they don't fit, and it solved a problem I had in my daily life.
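The BodyPix model behind both examples returns a per-pixel part map each frame. Here's a rough sketch, assuming the body-pix CDN `<script>` tag is loaded (exposing a global `bodyPix`); the helper is a hypothetical building block, not the estimator's actual code:

```javascript
// Sketch: body part segmentation with BodyPix. Assumes the tfjs and
// body-pix CDN <script> tags are loaded, exposing a global `bodyPix`.
async function segmentParts(videoElement) {
  const net = await bodyPix.load();
  // Returns a per-pixel part map plus pose keypoints for each person found.
  return net.segmentPersonParts(videoElement);
}

// Pure helper: count pixels assigned to a given body part id
// (part ids run 0-23; -1 means background).
function countPartPixels(partMap, partId) {
  let count = 0;
  for (const id of partMap) if (id === partId) count++;
  return count;
}
```

A size estimator could, for instance, count torso-part pixels at a known camera distance to approximate a measurement, which illustrates why having the part map on-device is so useful.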
To give you superpowers!
[07:57] Next up, what about giving yourself superpowers? One guy from our community combined our face mesh model with WebGL shaders to create this Iron Man-like effect. Here you can see lasers coming from his eyes and mouth in a really beautiful, realistic kind of way, which could be great for an activation with a movie company for a new movie release.
Or what about combining it with other technologies? Here we see another member of the community using WebXR, WebGL, and TensorFlow.js together to extract an image of a body from a magazine and then bring that body into the real world, so they can inspect the fashion design in more detail. I've even seen this person go one step further and make the face animate and speak, which is really cool.
Add WebRTC to teleport!
[08:44] But why stop there? We can go one step further still. By adding WebRTC to this, which stands for Web Real-Time Communication, I can even teleport myself. Here, I can segment my body using BodyPix from my room, transmit that segmentation over the internet, and then reconstruct it in a real physical space using WebXR. And this allows me to speak to my friends and family, in the current times when we're not able to travel as much, in a more meaningful way than a rectangular video call. In fact, maybe my future presentations will be delivered to you in this form, who knows, but some very exciting stuff ahead.
[09:24] Now, the second way to use TensorFlow.js is transfer learning. And this allows you to retrain existing models to work with your own custom data. Now, of course, if you're a machine learning expert, you can do this all programmatically, but today I want to show you two easier ways to get started.
Now, the first is Teachable Machine. This is a website that can do both the training and the inference completely in the web browser. It's great for prototyping things like image recognition, pose estimation, and sound detection. I think more models will be supported in the future, so watch this space.
[10:00] But let's see it in action to give you a flavor of how it works. Okay. So if we head over to teachablemachine.withgoogle.com, you can follow along if you like. We can select one of three projects to choose from.
Today, we're going to go for an image project to recognize a custom object. So we click on that, and we're then presented with a screen like this. On the left, we've got a number of classes for the objects we want to recognize. If we want to recognize more than two things, we can click the nice add class button here if we choose to do so. But today we're just going to recognize my face or a deck of playing cards. So let's go ahead and give them some more meaningful names. I'm going to call the first one Jason, to represent me, and the second class I'm going to call cards, which represents the cards. Now, all we need to do is allow access to our webcam, and you'll see a live preview pop up on the left-hand side for the first class. Now I just need to record some samples of my face to make sure we have some training data for this class type.
[11:01] So let's go ahead and do that. I'm going to move my head around to get some variety. There we go. And we can see how many images I've got there: about 38 sample images. Perfect. I'm now going to do the same thing with class number two, the deck of cards.
And I've got here a nice set of playing cards. So what I'm going to do is hold record again, but this time I'm going to get roughly the same number of images, but of the cards. So I've got 42 there; that's close enough. All I need to do now is click on train model, and now, live in the web browser, this is going to attempt to learn to distinguish the training data I've presented to it, on top of what it was previously taught. You can see there, in under 30 seconds, it's already complete. And it's currently predicting Jason as the output with 99% confidence, which is pretty good. And if I bring my deck of playing cards up, you can see that switches to cards with a hundred percent confidence. So Jason, cards, Jason, cards. You can see how easy that was to make and how robust it is at actually detecting those two objects.
[12:11] Now, of course, this is a prototype. If this was good enough for what I needed, I can click on export model here, I can click on the download button, and I can then copy this code and use it on my own website if I choose to do so. So that's Teachable Machine, and it's great for prototyping. However, if you've got gigabytes of data, you might want to use something more robust for production-quality models. So let's go back to the slides and see how to do that.
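The exported Teachable Machine code follows roughly this shape. This is a sketch rather than the exact export: it assumes the tfjs and @teachablemachine/image CDN `<script>` tags are loaded (exposing a global `tmImage`), and `YOUR_MODEL_ID` is a placeholder for your own upload:

```javascript
// Sketch: using a Teachable Machine image-project export. Assumes the
// tfjs and @teachablemachine/image CDN <script> tags are loaded (global
// `tmImage`); YOUR_MODEL_ID is a placeholder for your own model.
const MODEL_URL = 'https://teachablemachine.withgoogle.com/models/YOUR_MODEL_ID/';

async function predictFrom(imageElement) {
  const model = await tmImage.load(MODEL_URL + 'model.json', MODEL_URL + 'metadata.json');
  // One {className, probability} entry per trained class.
  return model.predict(imageElement);
}

// Pure helper: pick the winning class.
function topClass(predictions) {
  return predictions.reduce((a, b) => (b.probability > a.probability ? b : a));
}
```

With the two-class demo above, `topClass` would flip between the "Jason" and "cards" entries as the webcam frame changes.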
Cloud AutoML
[12:38] So Cloud AutoML allows us to train custom vision models in the cloud and deploy to TensorFlow.js at the end, which is super useful.
So all you have to do is upload folders of images that you want to classify to Google Cloud Storage, as you can see here, and then click on the next button. Once you do that, you'll be asked if you want to optimize your model for higher accuracy, faster predictions, or some kind of trade-off between the two. You then set a budget and leave it training for hours, or a day, depending on how much data you've uploaded, and it'll come back to you with the best results. It's going to try many different hyperparameters and many different types of computer vision models, and figure out what works best with your data. Once it's ready, you can then click export and choose TensorFlow.js, as shown here in the circle, which will download the model.json files you need to run it in the web browser. And with that, you can then use it on your own webpage and add your own user experience and user interface, and so on and so forth.
How hard is it to actually use this production quality trained model?
[15:02] So next we grab a reference to the image we want to classify. In this case, we call document.getElementById('daisy'), which refers to the daisy image above. And now we've got a reference to that in memory. All we need to do now is call await model.classify() and pass it the image we want to classify.
And this again is an asynchronous operation, because it might take several milliseconds to execute, which of course in computer terms is a very long time. So we want to wait for that to finish, and then we'll have a JSON object assigned to this predictions constant here on the left, which you can then iterate through to go through all the things it thinks it found in the image. And with that, you can do whatever you like. You can trigger something to run, you could control a robot, you could do whatever you want, all with just a few lines of code. So super cool and super functional.
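Put together, the flow just described looks roughly like this sketch. It assumes `model` was already loaded from the exported files (for an AutoML export, e.g. via the tfjs-automl library) and that an `<img id="daisy">` exists on the page:

```javascript
// Sketch of the classify flow described above. Assumes `model` was
// already loaded and an <img id="daisy"> exists on the page.
async function classifyDaisy(model) {
  const img = document.getElementById('daisy');   // grab a reference in memory
  const predictions = await model.classify(img);  // async: takes milliseconds
  for (const p of predictions) {
    console.log(p.label, p.prob);                 // iterate over the results
  }
  return predictions;
}

// Pure helper: keep only the confident predictions.
function confident(predictions, threshold) {
  return predictions.filter(p => p.prob >= threshold);
}
```

From something like `confident(predictions, 0.9)` you could then trigger whatever you like: run some code, control a robot, and so on.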
Write your own code
[15:54] Now the third way to use TensorFlow.js is to write your own code. And of course, to go through that would be a whole different talk in itself. So today I'm going to focus on the superpowers and performance benefits of why you might want to consider using TensorFlow.js in the browser.
Now, first up, I want to give you an overview of how our API is structured. We've got two APIs. One is the high-level Layers API, which is very similar to Keras, if you're familiar with Python. In fact, if you use Keras, it's basically the same function signature, so you should feel very much at home. And then for those of you who want to go lower level, we have the ops API, which is the more mathematical layer that allows you to do things like linear algebra and so on and so forth.
[16:37] And you can see how it comes together in the following diagram. At the top there, we've got our pre-made models, which sit upon our Layers API; that Layers API sits on top of our ops API. And this understands how to talk to different environments, such as the client side. And by client side here, we mean things like the web browser.
Now those environments themselves can execute on different backends. In this case, we can execute on things like the CPU, which is the slowest form of execution; WebGL, to get graphics card acceleration; and WebAssembly, or WASM for short, for improved performance on the CPU across mobile devices.
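To make the two API levels concrete, here is a rough side-by-side sketch. The `tf` object would normally come from the @tensorflow/tfjs CDN `<script>` tag; it is passed in as a parameter here purely to keep the sketch self-contained:

```javascript
// High-level Layers API, Keras-style: describe a model layer by layer.
function buildLayersModel(tf) {
  const model = tf.sequential();
  model.add(tf.layers.dense({units: 1, inputShape: [1]}));
  model.compile({loss: 'meanSquaredError', optimizer: 'sgd'});
  return model;
}

// Low-level ops API: raw tensor math (here, an element-wise multiply).
function doubleWithOps(tf, values) {
  const xs = tf.tensor1d(values);
  return xs.mul(tf.scalar(2));
}
```

If you know Keras, `buildLayersModel` should read as a near-direct translation of `tf.keras.Sequential`; `doubleWithOps` is the kind of linear algebra the ops layer exposes underneath.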
[17:16] And the same is true for the server side as well. We can execute using Node.js on the server side, and this can talk to the same TensorFlow CPU and GPU bindings that Python has. So yes, that means you get the same AVX support and the same CUDA acceleration that you do in Python. And in fact, as we'll see later, this means the performance is pretty much exactly the same as well: we execute as fast as, and sometimes faster than, Python for certain use cases.
Now, if you choose to still develop your machine learning in Python, which many of you of course will, that's completely fine too. Our Node.js implementation supports loading Keras models and TensorFlow saved models without any kind of conversion. So as long as you're executing on the server side in Node, no conversion is required to use them and integrate with, say, a web team. Very convenient. And then if you choose to take your saved model and run it in the web browser, you'll have to use our TensorFlow.js command line converter to do so. That will convert the saved model format into the JSON format we need to run in the web browser. And that's only required if you want to run client side in the browser.
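In Node.js, that direct loading looks roughly like the sketch below. It assumes @tensorflow/tfjs-node is installed, and `./my_saved_model` is a placeholder path:

```javascript
// Sketch: running a TensorFlow SavedModel directly in Node.js, with no
// conversion step. Assumes @tensorflow/tfjs-node is installed;
// './my_saved_model' is a placeholder path.
async function runSavedModel(modelDir, inputTensor) {
  // Lazy require so the sketch only needs tfjs-node when actually run.
  const tf = require('@tensorflow/tfjs-node');
  const model = await tf.node.loadSavedModel(modelDir);
  return model.predict(inputTensor);
}

// For the browser instead, the model must first go through the command
// line converter, roughly:
//   tensorflowjs_converter --input_format=tf_saved_model \
//       ./my_saved_model ./web_model
// which emits the model.json (plus weight shards) that
// tf.loadGraphModel('web_model/model.json') can load client side.
```

So the conversion cost is paid only on the browser path; the Node path consumes the saved model as-is.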
Model inference performance only
5 client-side superpowers
[19:51] And on that note, if you are thinking about executing on the client side, there's also some superpowers to consider here as well. And these are hard or impossible to achieve on the server side in Node or Python.
So the first one is privacy. If you're executing completely on the client side, then none of the sensor data is going to a server for inference, and that means the client's privacy is completely preserved. That's very important for certain types of applications, like medical or legal, or if you're trying to comply with certain rules, such as GDPR, where you might not be allowed to transmit data to a different server.
[20:31] Second point: if no server is involved, you can achieve lower latencies. Typically, it might take a hundred milliseconds or more for a mobile device to talk to a server and get the results back. If you're using TensorFlow.js on-device, you can cut that middleman out and have lower latency for your inference times, resulting in higher frames per second to allow real-time applications.
The third point is lower cost. Because no servers are involved, you can save significant costs on renting GPUs, RAM, and CPUs, which might be running 24/7 for a busy machine learning application. By doing this all on the client side, you don't need to rent those pieces of hardware in the first place; you just need a standard web CDN to deliver the website.
And the fourth point is the reach and scale of the web. Anyone in the world can click a link and use your machine learning model in a web browser. The same is not true if you want to do this in Node or Python, because first of all, you have to understand how to install Linux. Secondly, you need to install TensorFlow, then you need to install the CUDA drivers, and then you need to clone the person's GitHub repository, read their readme, and if all of that works in your favor, then you might have a chance of running their machine learning model. So you can see how there's a much lower barrier to entry here, if your purpose is to get your research used by many people around the world. And that can be really great, because it can allow you to identify biases or bugs that might have gone overlooked if only 10 people were using it instead of 10,000.
[22:30] And of course, with TensorFlow.js in the browser, we can run on the GPU on 84% of devices thanks to WebGL. We're not limited to just NVIDIA graphics cards; we can run on AMD ones too, and so on and so forth.
5 server-side / Node.js benefits
[22:46] And if we look at the server side, we can also see some benefits of running Node.js. It allows us to use the TensorFlow saved model format without any kind of conversion or performance penalties. And we can run larger models than we can on the client side, where you might run into GPU memory limits if you try to push a gigabyte-sized model over the web to the client device.
[23:51] So with that, let's wrap up with some resources you can use to get started and learn more. If there's one slide you want to bookmark, let it be this one. Here you can see all the resources you need to get started with TensorFlow.js: our website at the top there, where you can find many resources and tutorials to help you on your way. We've got our models available at TensorFlow.org/js/models. I've only shown you three or four today, but there are many more on there, which you can also use out of the box to get started super fast.
We are completely open source, so we're available on GitHub as well, and we encourage contributions back to the project if you are feeling ambitious. We have a Google group for more advanced technical questions, which our group monitors. And we've even got CodePen and Glitch examples to help you get started with boilerplate code, to understand how to take data from a webcam and pass it to some of our models.
And with that, I encourage you to come join our community. If you check out the #MadeWithTFJS hashtag on Twitter or LinkedIn, you'll find hundreds of projects that people are creating every single week around the world. I can't show them all in the presentation today, but here's just a glimpse of some of the great things going on elsewhere in the community.
[25:28] So my last question for you is, "What will you make?"
Here's one final piece of inspiration from a guy in our community in Tokyo, Japan. He's a dancer by day, but he's used TensorFlow.js to make this really cool-looking hip hop video, as you can see on the slide. And my point in showing this is that machine learning really is now for everyone. I'm super excited to see how everyone else in the world will start to use machine learning now that it's becoming more accessible. Artists, musicians, creatives, everyone has a chance to use machine learning now. And if you do, please make use of that #MadeWithTFJS hashtag so we can have you featured in our future presentations and blog post writeups. Thank you very much for listening, and with that, feel free to stay in touch. I'm available on Twitter and LinkedIn for further questions, and I look forward to talking with you soon. Thank you.
[26:18] Jason Mayes: Cool. Thank you for having me today. Great to be here.
[26:21] Mettin Parzinski: Well, really happy to have you. My first question, and this question is from me, is: how many cameras do you have? It felt like you had like six cameras going on, all different angles.
[26:32] Jason Mayes: I've got definitely more than two cameras here. It's very good for demos and doing things in the web browser and I'm recording at the same time. So good stuff.
[26:43] Mettin Parzinski: It was nice to be able to enjoy your glowing microphone from six angles. Thanks a lot for your awesome gear setup.
[26:54] Jason Mayes: It makes the model train faster if I've got RGBs, right? So good stuff.
[29:59] Mettin Parzinski: Yeah. So basically people can make a cover album. So a follow up question to that is, "How important is it? If I want to start using TensorFlow.js to have a good understanding of math?"
[30:15] Jason Mayes: It's just the same as Python-based machine learning. Obviously, if you want to unpeel all the layers of what's going on behind the scenes, you're eventually going to find a lot of the linear algebra and statistics and all this kind of stuff that drives it all. So if you want to go lower level, then of course you start to need to learn some of the mathematics to start tweaking things. But if you are working with the pre-made models or other research people have made, a lot of this stuff can often be reused. I think maybe 5% of people will probably ever need to actually write their own custom model from scratch; most of the time you can reuse existing research. If I want to do, say, face landmark detection, we've got models for that already. I don't need to reinvent the wheel, unless I need to detect different points that aren't available in the current model, and at that point I would need to start diving in to retrain it on my custom data to learn to recognize those things. But if you don't need to do that, then you can do very well without going too deep into the math there.
[31:20] Mettin Parzinski: So it's easy to dip your toes in without it being all that scary.
[31:27] Jason Mayes: I think no matter what your level, everyone has a chance to play with TensorFlow.js, and for the people who want to go deeper, all the goodies that you are used to from Python land also exist there. We've got a very similar Keras-like API, but the lower-level mathematical API also exists, just like the original Python-based version of TensorFlow. So no matter your level, you can go fully into it or you can stay at the high levels and have a more abstract view of it all, you see?
[31:55] Mettin Parzinski: So basically, with anything you learn, right, you can kind of dip your toes in and it will be okay. And the more familiar you get, you can go deeper and deeper down the rabbit hole, basically.
[32:06] Jason Mayes: Exactly. And I think that's the way to go, honestly. A lot of people start off at the high level, they find something they're passionate about, they start tinkering with that, and then they want to optimize it in some way or adjust it slightly. And it's at that point they start to go a little deeper into that kind of field, and they make great things. Little by little, people go in the right direction as they need to, I think, and explore as they need.
[32:28] Mettin Parzinski: We have another question. How long did it take for you to become the machine learning expert you are?
[33:32] Mettin Parzinski: Learning in public, that's fun always.
Jason Mayes: Exactly.
[33:39] Mettin Parzinski: So the next question is actually from my co-MC, Sergey. Python has a big tool set for explorative data analysis: Pandas, Jupyter, Matplotlib. How is it for JS? What is the biggest bottleneck for the JS community to get into deep learning?
[35:05] Mettin Parzinski: And this is a question from me again: how is the job security? Let's say I drop everything and go all in on TensorFlow.js with, I don't know, a few months of studying. Job security, how is it?
Mettin Parzinski: So, okay.
[36:37] Jason Mayes: But yeah, learning the skills now is great for the next few years, when those jobs start to appear and people actually will need this. Because as JS developers we are in a unique position: when we make a website, that website could be for pretty much any industry out there. It could be for a farmer, it could be for a fashion brand, it could be for anything, and all of these different verticals have the potential to be optimized in some way using some kind of ML flow. For fashion, maybe something like the clothing size estimator; for farming, maybe it allows them to automatically send goods to the right place, categorizing apples and oranges or whatever it might be. There's potential for ML to be injected into their pipelines, and maybe there's room for something on the web platform too, for a more seamless, frictionless experience where you don't need to install an app. A lot of people don't have the apps of the brands they use; they just go to the website. So in that case, there's one touchpoint that might need to be optimized using machine learning. So things like that might come out in the future.
[39:19] Mettin Parzinski: We have time for one short answer. This is a question from the user Quest: "Very inspiring talk. We talked about the security benefits of running TensorFlow.js without a server. Is it possible to keep everything off the server, or are there things that wouldn't work without a server?" One minute, go!
[39:40] Jason Mayes: So yes, if you're running on the client side, then everything is completely off the server other than the initial delivery of those assets. So in that sense, it can run completely offline if you use something like a progressive web app. So yes, you can do it completely serverless if you choose to do so.
[39:56] Mettin Parzinski: Cool. Well, thanks a lot, Jason. Just as a reminder, Jason's going to go to the speaker room now. If you have any more questions or want to just hang out with Jason, go to the special chat; find the link below in the timetable. Thanks a lot for joining us, Jason, and I might just get my hands dirty this weekend.
[40:15] Jason Mayes: Excellent. I look forward to seeing what you see, what you make, sorry. Thanks for having me. Cheers.
Mettin Parzinski: Bye-Bye.