Intro to AI for JavaScript Developers with TensorFlow.js

Have you wanted to explore AI, but didn't want to learn Python to do it? TensorFlow.js lets you use AI and deep learning in JavaScript, no Python required!

We'll take a look at the different tasks AI can help solve, and how to use TensorFlow.js to solve them. You don't need to know any AI to get started. We'll start with the basics, but we'll still be able to see some neat demos, because TensorFlow.js has a bunch of functionality and pre-built models that you can use on the server or in the browser.

After this workshop, you should be able to set up and run pre-built TensorFlow.js models, or begin to write and train your own models on your own data.

81 min
18 Jun, 2021

AI Generated Video Summary

This workshop provides an introduction to AI for JavaScript developers using TensorFlow.js. It explores deep learning, its APIs and capabilities, and the benefits of using TensorFlow.js. The workshop covers topics such as representing data as numbers, GPU acceleration, creating tensors and training neural networks, using pre-existing models, and using a KNN classifier for image classification. It also discusses other techniques for tabular data, converting and loading models with Python, and different types of networks. Overall, this workshop aims to empower JavaScript developers to leverage AI in their projects using TensorFlow.js.

1. Introduction to AI for JavaScript Developers

Short description:

This is an intro to AI for JavaScript developers using TensorFlow.js. We'll explore deep learning, its APIs and capabilities, and the benefits of using TensorFlow.js in a browser or node. It allows for AI usage in JavaScript, offers portability, privacy, and faster inference speed by running models on devices.

Hi, my name is Chris and this is an intro to AI for JavaScript developers. We're going to be using TensorFlow.js, but it's really about AI in general, what problems you can solve with it, why you might want to use TensorFlow.js instead of some of the Python alternatives.

I have a presentation and during that, we are going to see some code examples, and then we have this GitHub link, which is in the chat now, and we're going to be looking at some examples through that. If you have questions, feel free to ask them in the Zoom chat. I don't have Discord open, I just have the Zoom chat open, so feel free to ask there and I will try to look at that throughout the presentation. Also, I will have a dedicated spot for questions a couple of times, so I'll try to catch up if I miss some questions there.

All right. So let's get started. So what are we going to do here? First of all, there's way too much information about AI for just one session. There are entire college classes about it, and so what I am going to give you first is a general idea of deep learning. This is an intro to AI, so maybe you've heard about it before, maybe you think it's something you should be aware of, right? But what is it really? And then through tensorflow.js' APIs and capabilities, we're going to look at what type of deep learning we can do, so why we might want to use it, the different things that we can do with it, and we'll be going through some code examples.

If we take a look at the overview of deep learning right now, basically everything is in Python except for TensorFlow.js. You may have heard of R, MATLAB, and Julia; those are all used as well, but more for statistical models. Most deep learning research and practice is done in Python, but we are JavaScript developers, right? So along comes TensorFlow.js. It is a port of TensorFlow, which was released by Google and is used primarily from Python. TensorFlow.js is basically all of that written for JavaScript.

The main benefits that we're gonna see today are, first of all, where you can use it. You can use TensorFlow.js in just a browser. The demo we're gonna have today runs just in a browser, which is pretty fantastic, and this is the one thing you can't do with the Python code. You can also use it anywhere that Node can run; it's just a package. Between the browser and Node, basically all the code is the same, it's just where you run it. So it has great portability.

We covered this a little, but why use it? First of all, we can use AI and use JavaScript, right? So if you don't know Python or you don't want to learn Python, but you think you want to learn AI, this is the reason that you might use TensorFlow.js. Also, like I mentioned, it works in the browser, on the server, on mobile with React Native, and on IoT devices. And because of all that, there are two major benefits. One is privacy, and the other is inference speed. Both have to do with the fact that you can run the models on the device itself, either in the browser or on your mobile device.

2. Benefits and Disadvantages of TensorFlow.js

Short description:

With TensorFlow.js, you can run models where the user is, ensuring data privacy and faster model inference. However, there are some disadvantages, such as potential slowness compared to Python and the lack of extensive scientific libraries in JavaScript. The majority of AI work is still done in Python, and tutorials and existing algorithms are primarily in Python. On the other hand, TensorFlow.js offers a large library of demos that take advantage of browser and webcam capabilities. It also provides prebuilt models with associated code.

With most AI today, you have to go all the way back to a server: send the data to the server, make the inference, and then pass the result back. With TensorFlow.js, you can take your models and run them where the user actually is. And so the two big things, even if you don't just want to use JavaScript, are data privacy and much faster model inference. Like I mentioned, it can be as fast or even faster than Python, because you don't have to go back to the server. We'll talk a little bit about GPUs. Yes, you can take advantage of GPUs with TensorFlow.js as well. But that's not the whole story, right? So why would you not want to use it? There are a few major disadvantages. One is that it can be slower. All of the major speed enhancements are done using Python code, so it can be slower, even though you can take advantage of the GPU. Also, there are extensive supporting libraries already in Python. There are lots of scientific libraries that JavaScript just doesn't have, and I would say that is primarily the reason people still use Python, even if they like JavaScript better. And then the majority of AI work is still done in Python. So if you're looking at tutorials, papers, or existing algorithms, it's all done in Python first and then ported to JavaScript. So while it is JavaScript, and we like JavaScript, sometimes you have to go to Python, because that's where the tutorials and the nice libraries are. One of the main benefits of TensorFlow.js specifically, though, is that they have a really big library of really neat demos. These demos all take advantage of things you can only do in the browser, or only with the webcam. This is an example where inference would be way too slow if you went all the way back to a server, but in the browser, using a webcam, you can make inferences on the webcam images in real time.
And so, if you take a look at tensorflow.org/js/demos, there is a large list of demos that are mostly only possible because we're able to run AI at the edge. Along with that, and we'll talk about what a model is and how to actually use one, TensorFlow.js has a long list of prebuilt models. Now, these exist in Python as well, but the demos are just particularly good, I think, and there's code associated with each one of them. We'll talk about why you might use a model and how these are actually created.

3. Understanding AI and Deep Learning

Short description:

In this part, we explore when and how to use AI and delve into the concept of deep learning. We define AI as deep learning, which involves teaching a computer to perform tasks using examples and neural networks. TensorFlow.js primarily focuses on deep learning. We also compare conventional algorithms with deep learning algorithms and highlight the power of deep learning in automatically determining code based on examples. Inputs for both conventional and deep learning algorithms can be any data type supported by the code.

Okay, so the big question that we're gonna look at is when and how can we use AI? And the goal for this whole presentation is we're gonna work kind of backwards to understand what TensorFlow.js is and what it can do. And we're gonna explore that API with code and examples. And so hopefully by the end of this, you will know what the different things you can do with it are, when you may want to or may not want to use it and get a flavor hopefully for what it can do. And then with the GitHub example that's in the chat, you can hopefully have a little bit of a starting point.

Also, I'll say the examples I've chosen are pulled almost directly from the examples on the TensorFlow website. They have great examples for all of the models. So I'd really encourage you to look at some of those.

So let's start with what AI is. This is a slide that has to be in basically every presentation about AI, because people mean a lot of different things when they say AI. So this is what I'm talking about when I use the word AI. AI in general started about when computers started, and in the broadest sense it just means essentially any program that's running. But that's not how people use it today. Some people mean machine learning. Machine learning specifically is a type of algorithm which learns from examples. So any algorithm which learns by you passing it examples (and we'll take a look at what that means) can be called machine learning. Deep learning is a specific type of machine learning that uses a neural network, and it's still learning by examples. When I talk about AI in this presentation I'm talking about deep learning, and I think when most people talk about the really neat applications of AI today, they're talking about deep learning too. So whenever I say AI, what I really mean is deep learning.

Alright, so what is deep learning? It is teaching a computer (and we'll look at how that works) to do certain tasks using specific examples, using a neural network. All these different parts are important. Neural network is important, and we'll take a look at what that is. Teaching is important, and we'll take a look at what that is. Deep neural network, we'll take a look at what that means. And that all equals deep learning. Now, TensorFlow.js, we'll see, can also do machine learning, but it's primarily used for deep learning. So what is deep learning? You've probably seen an image like this. I put this in here because it's gonna come up later, but I'm not gonna explain it right now, because I think this is a kind of confusing way to get started with deep learning. So know that this image exists and that we're gonna come back to it, but let's start with something else. Instead, let's start with what a conventional algorithm looks like and then what a deep learning algorithm looks like. In a conventional algorithm, you have code written by a human. It might have if statements, returns, maps, and filters: general, conventional kind of code. In deep learning, what we essentially have is a bunch of multiplications and additions, and then what I'm calling if statements. They're actually called activation functions, and we'll take a look at those, but basically you have a bunch of ifs, multiplications, and additions, and no one actually codes these. These are automatically determined by the examples we give it. We'll take a look at what that means, but imagine a conventional algorithm may have 10, 20, or 30 ifs, maps, and filters. Deep learning might have millions of multiplications, additions, and if statements, and that's where a lot of the power comes in, because it's not something that anyone could ever code by themselves; it's learned through the examples.
At the end of both of these, though, we still have an input going into an algorithm and we have an output. Same thing with a neural network: we have an input, you can think of it just like a function, and then we have an output. The difference is that the conventional algorithm is coded by a human, while the neural network is learned. Now, we do define the structure of the neural network first, but then it's learned by example, and we still have these inputs and outputs. Another way to think about this is that we have questions and we want answers. I like this framing better than inputs and outputs, because inputs and outputs can mean just about anything. But for a neural network, think of a question going in and an answer coming out. And the way we train it is we have lots of these question-answer pairs. This, by the way, if you are into AI a little bit, is all supervised learning. There are other ways that you might be thinking about this, but this is the supervised learning framework of deep neural networks.
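
To make the contrast concrete, here is a minimal sketch in plain JavaScript (it does not use TensorFlow.js). The function names, the temperature example, and the weight and bias values are all made up for illustration. In a real network the weight and bias would come out of training rather than being typed in by hand, and there would be many of them.

```javascript
// Conventional algorithm: a human writes the rule.
function isWarmConventional(tempCelsius) {
  if (tempCelsius > 20) {
    return 1;
  }
  return 0;
}

// Activation function: ReLU behaves like an "if" (negative values become 0).
function relu(x) {
  return x > 0 ? x : 0;
}

// One artificial neuron: a multiplication, an addition, then the activation.
// The weight and bias passed in below are illustrative, not learned.
function neuron(input, weight, bias) {
  return relu(input * weight + bias);
}

console.log(isWarmConventional(25)); // 1, from the hand-written rule
console.log(neuron(25, 0.5, -10));   // 2.5, from multiply, add, activate
```

Both are functions from a number to a number. The difference is only in where the numbers inside them come from.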

All right, so what kind of inputs can we have to both of these? Well, in a conventional setup we have just regular code, right? And so the inputs can be anything, it can be strings, numbers, data structures. It really can be anything that our code supports.

4. Representation of Data as Numbers

Short description:

In deep learning, we represent data as numbers. Deep learning includes supervised, unsupervised, and reinforcement learning. Tabular data, images, and text can all be represented as numbers. Representing data creatively can improve algorithm performance. TensorFlow.js handles numbers using tensors, which are multidimensional arrays. GPUs are used to handle the computational load.

In deep learning though, we have all these multiplications and additions. And we do have these if statements but they're actually like I mentioned, activation functions, which are really just fancy multiplications by functions. And so all we can have for deep learning is numbers. And so whenever you're thinking about the data that you have and what you can do with it, think, this has to be represented as numbers.

So what kind of data can we represent as numbers? Let's see, I have a question. The question was: is all deep learning supervised learning, or just what we're looking at today? Just the problems we're looking at today are supervised learning. Supervised learning specifically is where you have lots of question-answer pairs that you know the answer to, and that's your training set. It's called supervised because you actually know the answer. There's unsupervised learning, where you don't know the answer and you have other techniques that you can go through. You can do those with TensorFlow.js as well, but those would be mostly custom. Most of the models that you're going to look at were trained using supervised learning. There's also reinforcement learning, which is kind of like supervised learning, except the computer generates the question-answer pairs, and that's where you can see it play games. It'll play through rounds of a game, and then it knows at the end if it won or not. So it basically looks at all of its actions, and the answer is: did I win or not? So that's reinforcement learning, which is kind of like supervised learning. And there are a few other techniques as well.

Okay, back to data, what data can be represented as numbers? Let's look first at tabular data. This is basically anything you can put in Excel, right? So this is from Wikipedia, we have population, that's numbers we have percent of the world, that's numbers. We have dates and they have text in there but, you know, we can represent dates as for example, the Unix timestamp since 1970. So that's really a number. So tabular data is the first example of something where you just have lots of numbers. So, okay, we know how to represent that.
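
As a small sketch of that idea, the snippet below turns one table row into an array of numbers, converting the date to a Unix timestamp in seconds. The column names and the values in the row are hypothetical; they are not taken from the Wikipedia table shown in the talk.

```javascript
// Turn one table row into an array of numbers.
// The column names and values are hypothetical, for illustration only.
function rowToNumbers(row) {
  // Date.parse returns milliseconds since 1970-01-01 UTC; divide to get seconds.
  const unixSeconds = Date.parse(row.date) / 1000;
  return [row.population, row.percentOfWorld, unixSeconds];
}

const row = { population: 1000000, percentOfWorld: 0.5, date: '2021-06-18T00:00:00Z' };
console.log(rowToNumbers(row)); // [1000000, 0.5, 1623974400]
```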

What about images? Images, we have this puppy over here and he's going to be coming back a couple of times but we have this puppy and that's actually just numbers too. So in a computer, it's stored as red, green and blue channels and each of the values are zero to 255 or zero to one if it's floating point and so even though an image is kind of an abstract concept on disk it's just stored as numbers so we can handle that too. All right, what about text? Text throws a lot of people off because one it's variable length and we'll see that a little bit later but also because it's letters, right? Well, the computer stores it either as ASCII or Unicode, right? So the first thing we might be able to do is just say every character is its own number. So that's one way we can store it as numbers but we may also want to get a little more creative. For example, we might have a word lookup table so that same sentence, if we map each word to a number, the word the appears twice and so our number is, so our sentence is one, two, three, four, five, six, one cause the the appears here again, seven, eight. And so if we look back to the ASCII and then to here you may understand that the way you represent your data matters a lot because this sentence here is gonna look a lot different to a computer than this sentence here. An even further place we could take this is with something called word embeddings. We'll take a look at these later with image embeddings. But the idea is that you can represent every word in this case with more than one number. And these embeddings are learned. We'll explain what that means as well. But if we have these numbers, so we have three numbers for every word in the sentence, then the sentence becomes this. And this is where we can start maybe realizing that deep learning algorithms can really do a lot more than what humans can do. Because you can imagine, so this is for a short sentence and it already has this many numbers. 
Well, imagine if our embedding wasn't three numbers, but a thousand numbers, right? And our sentence wasn't just 10 words long, but it was a thousand words long. Suddenly deep learning and this learning from examples pair makes a lot more sense than say conventional algorithms where you don't have a hope, right, of creating an algorithm that could actually solve that. So the high level overview is any data can be represented as numbers, but you may have to get creative. And sometimes the more creative you get, the better your algorithm will do. For example, in the word embeddings case, if you want to represent the distinction and also the similarity between the words, then having three numbers is gonna give you a lot more value than just having one number. So any data can be represented as numbers, the type of representation you have in your algorithm matters a lot.
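
The three text representations can be sketched in plain JavaScript. The example sentence is an assumption (the transcript does not say which sentence was on the slide), and the embedding vectors are placeholders, since real embeddings are learned during training.

```javascript
// Option 1: one number per character (its Unicode code point).
function toCharCodes(text) {
  return Array.from(text, (ch) => ch.codePointAt(0));
}

// Option 2: one number per word, using a lookup table built on the fly.
// A repeated word gets the same number every time it appears.
function toWordIds(text) {
  const table = new Map();
  return text.toLowerCase().split(/\s+/).map((word) => {
    if (!table.has(word)) {
      table.set(word, table.size + 1);
    }
    return table.get(word);
  });
}

// Option 3: several numbers per word (an embedding).
// embeddingTable maps a word id to a vector; the values are placeholders.
function toEmbeddings(wordIds, embeddingTable) {
  return wordIds.map((id) => embeddingTable[id]);
}

const sentence = 'the quick brown fox jumps over the lazy dog'; // assumed example
console.log(toCharCodes('the'));  // [116, 104, 101]
console.log(toWordIds(sentence)); // [1, 2, 3, 4, 5, 6, 1, 7, 8]
console.log(toEmbeddings([1, 2], { 1: [0.1, 0.2, 0.3], 2: [0.9, 0.8, 0.7] }));
```

Note how the same sentence looks completely different under each option, which is why the choice of representation matters so much.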

All right, so let's take a look at our first TensorFlow.js code. How are these numbers handled in TensorFlow.js? The word TensorFlow comes from tensor, and a tensor is a multidimensional array of numbers. That sounds confusing, but it's just an array, like any other array. Here we have a four by five by two block, so this would be like a 3D array. And then we have three of those blocks, so this is a 4D tensor. Why do we call them tensors instead of just arrays? That's because of GPUs. TensorFlow, PyTorch, and the other libraries have one thing in common, and that's how they take the numbers and handle the GPU stuff for you. You may have heard that GPUs are fast.

5. GPU Acceleration in TensorFlow.js

Short description:

TensorFlow.js takes multi-dimensional arrays of numbers and feeds them to GPUs, enabling them to perform thousands of operations simultaneously. This results in a significant speedup, making deep learning much faster on GPUs compared to CPUs. By converting data into tensors, TensorFlow.js allows for efficient processing and takes advantage of GPU acceleration.

Why are they fast? It's because of something called tensor cores; well, not just tensor cores, but how they handle these numbers. Basically, a CPU can handle one set of instructions at a time. It goes in serial, and it may have multiple threads on a core, but generally you take your data in sequence. GPUs are built to handle thousands of operations all at once, but only very specific operations: matrix multiplications and additions. So what TensorFlow.js does for us is take these multi-dimensional arrays of numbers and feed them to GPUs in ways that let them do things all at once. And that speed-up is not trivial. It can be orders of magnitude faster than running on a CPU. And this is really why deep learning works the way it does: libraries like TensorFlow.js let us run our data through GPUs in this very specific way. If we have data like this, a bunch of text, on a CPU we'd have to process it one character at a time, for example. But if we can turn it into one of these tensors, then we can take advantage of all this GPU speed-up.

6. Creating Tensors and Training Neural Networks

Short description:

In TensorFlow.js, you can create tensors and perform various operations on them. Understanding the shape of tensors is crucial, as it determines the size of your data. The magic happens when tensors are processed in parallel on the GPU. Neural networks learn through training, where weights and biases are adjusted to achieve the desired output. Training involves running labeled data through the network and updating the weights and biases based on the errors. The initial guesses are random, but through the update step, the network learns to distinguish between different classes of inputs.

All right, so let's take a look at our first bit of code. This is how you create a tensor in TensorFlow.js. You almost never have to do this but I think it's important to show so that you understand the building blocks of what's actually going on. You can see in the middle here, it's just a 2D array. You can have as many dimensions as you want. In fact, in deep learning you almost always have a third dimension, which is a batch and we'll talk about a batch. The reason you do batches is so that you can take advantage of all of this parallel compute. But here we just have two and it turns that array into a tensor and it sets that as A.

And now we can do a few things with it. First of all, we can get the shape of a tensor. Here the shape will just return two by two, so it shows us rows and columns. Whenever you start doing TensorFlow.js or any deep learning, get really familiar with the shape command. The most important thing is that you know the size of your data all the time. The thing that really threw me off when I started was: what size are my arrays? What size do they have to be? How can I get from here to there? And so even though you almost never create tensors like this, because they always come from your data some way, I think it's worth spending probably a few hours just creating tensors, adding them together, concatenating them, and looking at their shape at all the different steps. That is really going to help you understand what's going on. We can also print it, and we can do other things to it as well. Like I mentioned, you can add tensors together, you can multiply them, and so on. You do have to put the data onto the GPU, and that's a different step, but once you do that, it all happens in parallel. And so that's where the real magic happens.
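
The slide code is not part of the transcript, so here is a dependency-free sketch of what "shape" means, using ordinary nested arrays. The helpers `shapeOf` and `add2d` are hypothetical names written for this example. In TensorFlow.js itself you would create the tensor with `tf.tensor(...)` and read its `.shape` property instead.

```javascript
// Shape of a rectangular nested array: the length at each nesting level.
// Assumes every sub-array at a given level has the same length.
function shapeOf(array) {
  const shape = [];
  let level = array;
  while (Array.isArray(level)) {
    shape.push(level.length);
    level = level[0];
  }
  return shape;
}

// Element-wise addition of two 2D arrays with the same shape.
function add2d(a, b) {
  return a.map((row, i) => row.map((value, j) => value + b[i][j]));
}

const a = [[1, 2], [3, 4]];
console.log(shapeOf(a));  // [2, 2]
console.log(add2d(a, a)); // [[2, 4], [6, 8]]

// Three blocks, each 4 x 5 x 2, give a 4D shape.
const blocks = Array.from({ length: 3 }, () =>
  Array.from({ length: 4 }, () =>
    Array.from({ length: 5 }, () => [0, 0])));
console.log(shapeOf(blocks)); // [3, 4, 5, 2]
```

Checking the shape after every operation, as suggested above, is the quickest way to catch a mismatch before it turns into a confusing error.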

Okay. So at the end of all that, what we have is this neural network, which takes inputs, which are tensors and gives us an output, which are tensors. But we never really described how the neural network knows what it's going to do. So how does it actually learn this stuff? All right. We took a look at this picture, right? In conventional programming, we just program it, but deep learning, it has all these multiplications and we haven't described it all, how we use our training data to actually create those. So let's refine this a little bit and I'm gonna make it look like this. So we have our inputs. We may have, so in this case, we have four. We may have, you know, a million. For the image example, we have one input for every channel of every pixel. And then inside we have these, what are called weights and what are called biases. And so that's the, we're multiplying by number, adding a number, and then we have these activations, which I called if statements. They're actually multiplication by a function. It doesn't really matter, you can think about them like if statements. And the learning, what learning actually means is that the computer will figure out what these weights and what these biases have to be in order to get the answer that you are telling it to get. And that happens through this process called training. Training is just learning by example. So let's say we are trying to distinguish a dog from a cat, which is the example that we're going to do in a little bit here. The first thing we need is training data, and that means we need a bunch of images labeled dog and a bunch of images labeled cat. And this is, again, this is supervised learning. And what we're gonna do is we're gonna run those dogs and cats through the neural network. And because of the power of GPUs, we can do those all at once. So it's determined by your GPU RAM size, but we can take all of our dogs and all of our cats, run it through all at once into the neural network. 
And the neural network by default starts as just random numbers. And so it's gonna give us a guess, but it's gonna be completely wrong because the weights and biases, all those multiplications and additions, are just random. But it's gonna give us the first guess. And then we go into the update step. And we say, that was completely wrong because it's all random. Actually, these rows are dogs, and these rows are cats. And then we tell the neural network to update.
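
Here is a minimal sketch of that first, untrained guess, in plain JavaScript without a GPU. It uses a single neuron with four inputs and a sigmoid activation. The sigmoid, the input values, and the function names are assumptions chosen for illustration; a real image model has many layers and millions of weights.

```javascript
// Sigmoid squashes any number into the range 0..1 (here 0 = "dog", 1 = "cat").
function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

// Start with random weights (one per input) and a random bias.
function randomParams(inputCount, random = Math.random) {
  return {
    weights: Array.from({ length: inputCount }, () => random() * 2 - 1),
    bias: random() * 2 - 1,
  };
}

// Forward pass: multiply each input by its weight, add the bias, apply the activation.
function predict(inputs, params) {
  const sum = inputs.reduce((acc, x, i) => acc + x * params.weights[i], params.bias);
  return sigmoid(sum);
}

// A batch of examples, each with four inputs (values are made up).
const batch = [
  [0.9, 0.1, 0.4, 0.7],
  [0.2, 0.8, 0.5, 0.1],
];
const params = randomParams(4);
console.log(batch.map((example) => predict(example, params))); // random guesses between 0 and 1
```

Because the weights are random, the output changes on every run and has nothing to do with the labels yet. The update step is what fixes that.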

7. Updating Weights and Biases

Short description:

The network uses stochastic gradient descent to update weights and biases, getting closer to the right answer with each iteration. It compares the output numbers to the training values, uses calculus to work out how to adjust the weights and biases, and repeats the process until the loss is low enough. The update step averages the adjustments over all training examples to improve accuracy.

And this is where I'm gonna gloss over things a little bit, but the network will use a process called stochastic gradient descent to update all those weights and biases. The exact mechanism is not important for us; all we need to know is that it's gonna get closer to the right answer the next time we go through the loop. It won't be perfect, though, so we have to do it again. It makes the next guess, and instead of being completely random, it's gonna be a little bit closer to the right answer. We go through another update step and say, these are dogs and these are cats. It tries to update its weights and biases, and we do it again. So this is the update cycle: first, we run the data through the network. Then we compare the neural network output to the training values, calculate the loss, tell it to update (that's stochastic gradient descent), and then repeat until the loss is low enough. So the question in the chat is: what does it update exactly? That's kinda what I skipped over, so I'll explain it a little bit. Stanford and MIT both have classes out that cover this, and so does fast.ai, which is another resource that covers stochastic gradient descent in much more detail. But what it does is take a look at the output numbers (these are all numbers, right?) and use calculus, basically, to look back toward the inputs and figure out how much each of the weights and biases affected the output for every one of the inputs. So say you're telling it that this is a cat, and it only predicted 0.5 cat. It has to adjust all of its weights and biases up or down a little bit to get that 0.5 closer to one, which would mean cat. And it does that for all the training examples at once, because it's on a GPU.
And so it kind of averages: it tries to adjust all these weights and biases so that it gets closer to the answer for each of the examples. That's what the update step does. There's a lot of math involved, but in TensorFlow.js and in other libraries, it's like three lines of code or so. You basically say, start the update, actually do the update, and then loop. Okay, let's see. If I don't answer any of these questions fully, let me know in the chat and I'll try to provide more information.
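
The run, compare, update, repeat cycle can be sketched in plain JavaScript for a single neuron. This is a simplification under stated assumptions: it uses plain full-batch gradient descent rather than the stochastic variant named in the talk, a sigmoid output with cross-entropy loss, made-up two-number "images", and weights that start at zero instead of random values so the run is reproducible. None of it is TensorFlow.js code.

```javascript
function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

// Forward pass for one example: multiply, add, activate.
function predict(inputs, params) {
  const sum = inputs.reduce((acc, x, i) => acc + x * params.weights[i], params.bias);
  return sigmoid(sum);
}

// Cross-entropy loss averaged over all examples (lower is better).
function loss(examples, labels, params) {
  const total = examples.reduce((acc, inputs, n) => {
    const p = predict(inputs, params);
    return acc - (labels[n] * Math.log(p) + (1 - labels[n]) * Math.log(1 - p));
  }, 0);
  return total / examples.length;
}

// One update step: average the gradient over every example, then nudge
// each weight and the bias a little in the direction that lowers the loss.
function updateStep(examples, labels, params, learningRate) {
  const gradW = params.weights.map(() => 0);
  let gradB = 0;
  examples.forEach((inputs, n) => {
    const error = predict(inputs, params) - labels[n]; // guess minus right answer
    inputs.forEach((x, i) => { gradW[i] += error * x; });
    gradB += error;
  });
  const count = examples.length;
  return {
    weights: params.weights.map((w, i) => w - (learningRate * gradW[i]) / count),
    bias: params.bias - (learningRate * gradB) / count,
  };
}

// Training loop: run, compare, update, repeat.
function train(examples, labels, params, learningRate, steps) {
  let current = params;
  for (let step = 0; step < steps; step++) {
    current = updateStep(examples, labels, current, learningRate);
  }
  return current;
}

// Made-up training data: label 1 = "cat", label 0 = "dog".
const examples = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]];
const labels = [1, 1, 0, 0];
const start = { weights: [0, 0], bias: 0 };
const trained = train(examples, labels, start, 0.5, 1000);
console.log(loss(examples, labels, start), loss(examples, labels, trained)); // loss before, loss after
console.log(examples.map((inputs) => predict(inputs, trained)));             // near 1 for cats, near 0 for dogs
```

Before training, every prediction is 0.5, like the cat example in the talk. Each update moves the predictions for cats toward 1 and for dogs toward 0.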

8. Using Pre-existing Models in TensorFlow.js

Short description:

When using TensorFlow.js, you can take advantage of pre-existing models that have been created by others. These models have been trained on large datasets and wrapped up in code examples, making it easy for you to use them. It's recommended to explore the available models before starting your own project, as you may find that someone has already solved a similar problem. This saves you time and effort in collecting data and training models.

All right, so when we're done with a conventional algorithm, we have something we call a function, right? It has an input and output. Exact same thing for the neural network, except we call it a model. So we have inputs and outputs. Now, I went over all this background information but what we're gonna see today is that one of the beauties of TensorFlow.js is that they have this model, this selection of models already created for you. So these are things that people have gone through all the trouble of collecting all the data, doing all the training, which can take a really long time, and wrapping them up in really nice code examples. And so one of the great things is that for these models, you don't have to do any of that work, you can just use them. And so that's what we're gonna start with today is using one of these models. If you go to the actual model page, you can see there's some for image, there's some images, there's poses, like you can do things with webcams, there is text, and so, audio. And so most of the things that you might want to do with a neural network, someone has probably done before, and there's probably a model for it. And so I'll repeat this again later, but one of the things that I always tell people to do first is to actually go look at what models exist. And you might be surprised because the problem you're trying to solve may be almost exactly solved for you already.

9. Demo and Model Creation

Short description:

We'll do a demo of what this looks like now. Most models are done for research purposes and are open source. The AI community has a willingness to turn over models to open source. The demo we're about to do was created from the ImageNet dataset. We'll go through transfer learning and making your own models.

Okay, so we're going to do a demo of what this looks like now. Okay, in the chat, yeah, there's an answer to that, great. What do the people who made the models earn? Kudos, that's what. So most of the models are done for research purposes. So they're done with grants, or they're done by big companies who want to get their research out into the world. And so this is all open source. And one of the great things about the AI community, I think, is this sort of willingness to turn over your models to open source and let everyone use them. So for example, the one that we're going to use today was created from the ImageNet dataset. And ImageNet is a huge collection of images, and there's a contest every year, so people can make these models, put them in the open, and compare them. And that's one of the reasons that we're able to do the demo we're about to do with all of the accuracy that it has. Making your own models, we'll go through some of that. We'll do transfer learning first. And then I'm going to gloss over making your own models a little bit, with sort of a caveat at the end. So existing models plus transfer learning gets you most of what you want, probably. And then I'll talk about our own models in a bit.

10. Demo: Using Models in TensorFlow.js

Short description:

This demo showcases the use of models in TensorFlow.js, specifically MobileNet. MobileNet is a small model with good top-1 accuracy on ImageNet, making it suitable for edge inference on devices. TensorFlow.js allows us to use these large models in a compact size, enabling us to easily incorporate them into our projects.

Okay, let's go to this demo. Oh, so the first thing I can show you is, so here's the models that you can use. These are all the ones that they have, plus then there's a lot more in GitHub. I think the documentation is really great. And so I would try to start with one. The ones we're going to look at today are image classification, and then we will look at a KNN classifier, but there's some really fun ones otherwise. This is what I was talking about, so this is MobileNet, the one we're going to look at today, and this was created using ImageNet. And so ImageNet is, I think it's 10 million real world images. I might have that wrong, I should look. And so anything that's kind of like a real world photograph will be able to be detected very well using this MobileNet. If we look at this, so this is some example. One of the things TensorFlow.js specifically is used for is, like I mentioned, edge inference. So that means like pushing it to IoT devices, pushing it to the browser, pushing it to React Native. And so this MobileNet here, this is the top-1 accuracy on ImageNet, it's meant to have good accuracy while being as small as possible. Some of these other ones, VGG-16 and now there's several others, are huge, huge models. And we would never want to download, for example, like a 200 megabyte model on a mobile phone. But these ImageNet models are, I think the one we're gonna use today is about 20 megabytes or 16 megabytes. And so 16 megabytes for 75% accuracy, and we'll see what this top-1 means, is fantastic. And so that's why TensorFlow.js is great, I think, because we can use these giant models, they're made small for us, and then we can just use them.

11. Using a Prebuilt Model for Image Classification

Short description:

Let's take a look at the code first. We import TensorFlow.js from a CDN and the MobileNet model for image classification. We use a prebuilt model to classify a cat image and print the predictions. WebGL is not supported in this demo, so the browser can't use the GPU and prediction is slower. Because loading and prediction are asynchronous, loading spinners should be used. The predictions include different types of cats, demonstrating the power of deep learning. Deep learning allows us to predict complex classifications that conventional algorithms cannot. The performance of running the model depends on factors such as downloading the model and the presence of a graphics card.

Okay, let's take a look at the code first. So this is the tfjs repository that I had out there. And the one we're looking at right now is images.html. So this is gonna be in a browser. And this is almost an exact copy of their first demo, we're gonna make it a little bit different but.

The first thing that you can see is we're importing TensorFlow.js from a CDN, right? So you don't have to do lots of fancy stuff. You can just get it from the CDN. And then the other thing we have to get, so this is TensorFlow, this is the base level operations that I showed you. So this is like how arrays are turned into things that the GPU can use. Then we also need our model, so this is MobileNet. What I just explained with the images. So this is like the function that we're importing. If we made our own model, then we would have our own model here instead.

For this image, we're going to have a picture. So I chose cat. I think we can look at the cat here. Yeah, so this is the cat we're going to try to classify. And then the beauty of using a prebuilt model is this is all the code we're gonna look at. So we get the image, we're going to load MobileNet, so it's asynchronous, so this is going to actually download it, and then we're going to use MobileNet to classify the image and print the predictions. And that's all we're going to do at first. So let's take a look at that.
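The three steps just described (get the image, load MobileNet, classify and print) can be sketched roughly as follows. This is a minimal sketch, not the workshop's actual source: `classifyImage` is a hypothetical helper name, and `mobilenetModule` is assumed to be the `@tensorflow-models/mobilenet` package (the `mobilenet` global when it is loaded from a CDN script tag).

```javascript
// Minimal sketch: classify one image with the pre-built MobileNet model.
// `mobilenetModule` is assumed to be the @tensorflow-models/mobilenet package;
// `classifyImage` is a hypothetical helper name, not part of the library.
async function classifyImage(mobilenetModule, imageElement) {
  // load() downloads the model weights, so this step can take a while.
  const model = await mobilenetModule.load();
  // classify() resolves to an array of { className, probability } objects.
  const predictions = await model.classify(imageElement);
  for (const p of predictions) {
    console.log(`${p.className}: ${(p.probability * 100).toFixed(1)}%`);
  }
  return predictions;
}
```

In a browser page this would be called with something like `classifyImage(mobilenet, document.getElementById('cat'))`, where the element id is an assumption for illustration.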

If we look at here, and see, it already ran but I'll refresh it. So the first thing you'll notice, WebGL is not supported. Don't worry about that too much right now, but WebGL will allow my browser, it would allow my browser to use the GPU, but it's not supported. So it took a little while to download. And then, well, so let's see, first of all, let's look at the network. So if we look at, okay, so all these groups right here, this is the actual model being downloaded. And so you can see it's 4.2 megabytes times four is the major bits of it. So that's like what 17 megabytes or so.

Now if we look at the console, so here, let's run that again and pay attention to the speed of this, and you'll see it's a little bit slow because WebGL is not supported. So it's downloading the model now. And then once it downloads, it will run predictions, and we'll be able to see this speed better later. And now it's predicted. So that's the first thing: since all this is asynchronous, so it's JavaScript callbacks, or promises I guess, you're going to want loading spinners, things like that. And so just pay attention to your loading speeds and your download speeds. Now for our predictions, we have a tiger cat at 28%, or an Egyptian cat at 20%, or a tabby cat at 18%, but it got that it was a cat, right? ImageNet, by the way, has a thousand different classes. And so that's why we're seeing multiple types of cats as the top ones. In the next example, we're gonna see how to make it just say cat instead of these ones. But I think that's amazing, right? We used someone else's model in the browser to predict that this is a cat. This is something that you just can't create a conventional algorithm to do. Like how do you determine that this is a tiger cat versus an Egyptian cat, conventionally? It's not really possible, I think. And that's why deep learning is super interesting.
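One way to handle the loading-spinner point is to wrap any slow promise in a small helper. This is only a sketch under assumptions: `withLoading` and the `setLoading` callback are hypothetical names for whatever shows and hides a spinner in your own page.

```javascript
// Sketch: show a loading state while an async task (such as a model download) runs.
// `setLoading` is a hypothetical callback that shows or hides your spinner.
async function withLoading(setLoading, task) {
  setLoading(true);
  try {
    // `task` is any function returning a promise, e.g. () => mobilenet.load()
    return await task();
  } finally {
    // Hide the spinner whether the task succeeded or failed.
    setLoading(false);
  }
}
```

The `finally` block matters here: if the download fails, the spinner still goes away instead of spinning forever.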

A question: how much of a difference would it make to run on a gaming computer versus your phone versus a computer without an actual graphics card? So there's two answers to that.

12. Model Download and Graphics Card Acceleration

Short description:

Downloading the model and utilizing a graphics card can significantly speed up the processing of multiple images. While it may not make a noticeable difference for single images, performing inference on a graphics card can greatly enhance performance, especially when processing large batches or streaming from a webcam.

The first is a lot of the time it takes is downloading the model. And then if you have a graphics card, it's actually moving the model from the CPU to the graphics card, then running it and then coming back. So for just single images, it doesn't matter that much. It would speed it up. So like if WebGL was supported, it would speed it up. The place where it gets really interesting is if you're doing like 1,000 images at a time. Then you can do those all on a graphics card in the same amount of time as it would take to do one image on the CPU, or even faster than that actually. So if you're just doing single inference, it's not that important that you get the graphics card working, although it would speed it up quite a bit in this case because I have this all disabled. But the really interesting thing is if you're doing like 1,000 at a time, or if you're streaming it, so like from a webcam, you can do it in near real time. So near 30 frames a second.

13. Using Transfer Learning for Image Classification

Short description:

If you can't find a pre-trained model for image recognition and you only need to classify cats or dogs, you can use fine-tuning or transfer learning. This involves taking a pre-existing neural network model and adjusting it to meet your specific needs. We'll explore how transfer learning works and how it can be applied to image classification tasks.

Okay, so that was our first demo. What if you can't find a pre-trained model? So this is kinda what someone asked. So in this case we just want cat or dog. So we're still doing image recognition. So image classification is still the base, the base is kind of the same. But we just want to know cat or dog. We don't want to know like Egyptian cat versus, you know, whatever. And this is where we're going to look at this kind of second tier, which is fine tuning or transfer learning. So the idea is that we have this neural network, we have this model that someone else created that sort of does what we want. We looked in the models, the model zoo is one thing that you can call it, it's like a bunch of models. We found one that is sort of similar to what we want. It takes images as input. But the output though is a thousand different classes of images. And this is where we're going to come back to this deep neural network picture and take a look at what's actually going on underneath the hood. And then we're going to see how we can maybe use transfer learning on this.

14. Neural Network Inputs and Hidden Layers

Short description:

We have inputs, which are images, and hidden layers in the neural network. These layers perform multiplications, additions, and activations. Each circle in the network multiplies the values from the inputs and adds a bias term. The activation function is like an if statement, passing positive values and zero otherwise. This fully-connected network is a good mental model. To run the example, I used the Node http-server package and accessed images.html. Other options include Python's simple HTTP server and npx serve.

All right, so we have our inputs and our inputs are the images. And then inside here, we have these neural network hidden layers. And these are the multiplications, additions and activations that I told you about. Actually each one of these circles multiplies the values from every single one of the inputs in front of it. And then it adds this bias term. And then inside of here is this activation function. You can think about it like an if statement. There are many activation functions, but the most common one is basically if it's positive then pass the positive value, otherwise pass zero. So that's kind of like an if statement, right? If it's positive, pass the value, otherwise zero. This is a specific kind of neural network called a fully-connected network, and we'll look at a couple of examples of other types later. But this is not a bad mental model for, oh yeah, sorry, I'm looking in the chat. The example I ran, so I can show you my terminal maybe. So how do you actually get this cat example to run, by running images.html? I ran the Node http-server. So if you npm install http-server globally, then you can just run http-server and then the path. And so I ran http-server and then dot slash in the directory I was in. And that's what allowed me to go to images.html, and then it loaded, like, cat.jpg, if I have that exactly right. And if I go to just the root, then this is what that http-server lets me do. So I see a couple other examples, yeah, Python's simple HTTP server does the same thing, or npx serve. So that's what I'm using to run that example.
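The neuron described at the start of this section (multiply every input by a weight, add a bias, then apply an if-statement-like activation) can be written out in plain JavaScript. This is an illustrative sketch with made-up weights, not code from the workshop repository; the activation shown is the one usually called ReLU.

```javascript
// ReLU activation: pass positive values through, otherwise output zero.
function relu(x) {
  return x > 0 ? x : 0;
}

// One "circle" in the diagram: weighted sum of all inputs, plus a bias, then activation.
function neuron(inputs, weights, bias) {
  let sum = bias;
  for (let i = 0; i < inputs.length; i++) {
    sum += inputs[i] * weights[i];
  }
  return relu(sum);
}

// A fully-connected layer: every neuron sees every input.
function denseLayer(inputs, weightRows, biases) {
  return weightRows.map((weights, j) => neuron(inputs, weights, biases[j]));
}

// Two inputs feeding two neurons; the weights and biases are made-up numbers.
const out = denseLayer([1, 2], [[0.5, -1], [1, 1]], [0, -0.5]);
console.log(out); // [0, 2.5]
```

The first neuron sums to -1.5, so the activation clamps it to zero; the second sums to 2.5 and passes through unchanged.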

15. Deep Neural Network and Intermediate Values

Short description:

The deep neural network consists of multiple layers, with the deep part referring to networks with more than two layers. Some networks, like ResNets, have hundreds of layers. However, training such large networks can be time-consuming. In our example, the final output layer has a thousand classes, representing different objects. We can also take intermediate values from other layers, which represent different aspects of the image. Understanding what these layers represent can be challenging due to the large number of connections. However, by running the values through our own machine learning algorithm, we can determine the classification. TensorFlow.js provides a way to perform inference on the model by passing an image to it. Each layer in the network focuses on different aspects of the classification task, with early layers learning simple representations like edges and later layers predicting more complex features.

All right back to our deep neural network. The neural network part is what we just covered. The deep part by the way, is just how many layers there are. And so like two or more layers is deep, right? This is another thing just like AI where deep is kind of an overloaded term. But generally it refers to more than two layers of a neural network.

Some of these neural networks by the way, so like VGG16, which we saw, that has 16 weight layers, and some of the ResNets and stuff, they have like hundreds of layers. And those are hundreds of layers that operate on entire images. And so they have a different structure and they have, you know, advanced things in them. But that's a network that's bigger than what we're going to want to train, especially if we have, you know, 10 million ImageNet examples. These types of networks take a long, long time to train on big GPUs. And that's why it's great that they're just provided for us.

Okay. So in our example, what we were looking at is this final output layer. And the final output layer in the ImageNet case has a thousand classes. And like one is Egyptian cat, one is tabby cat. You know, ImageNet, by the way, has a lot of dogs and cats and animals in it. For some reason, I think like 20 or 30% of it is just dogs. So that's something to keep in mind, but what we're looking at is this output layer. But what we can do is we can actually come in and take any of these other layers that we want. And what that gives us is not these thousand outputs, but it gives us some intermediate value. When a deep neural network is training, what really happens is it adjusts the weights and biases of all these intermediate layers, so each one of these comes to represent something. So if we cut it here, or we cut it here, or we cut it here, these will represent different aspects of the image. And so what we're actually gonna do is we're gonna just chop off this last output layer and we're gonna take these values here. And so this is just before it predicts tabby cat versus Egyptian cat, for example. And we can look at these and then put them through our own machine learning algorithm, which is what we're gonna do, and say, this is a cat and this is a dog. Now tell me if it's just cat or dog. There are some great blog posts you can look up for what these layers represent. A lot of research kind of goes into what they actually represent. And it's sort of difficult to figure out sometimes. That's where you may hear that neural networks are a black box, or that it's difficult to get out of one what is actually going on inside of it. And it's because, in these intermediate ones, you might have like a million connections. So how can you possibly know what those million weights and biases are all doing? 
And so that's where you might hear that it's kind of difficult to figure out what neural networks are doing. But we're gonna take it all. We're gonna run it through our own machine learning algorithm and then just tell it to figure it out. And that'll be great. This is how you do it in TensorFlow.js. So here's our network. So this is our model. And instead of doing a prediction, we're gonna do an inference, so .infer, passing our image to it. And there's a few different ways we can get out the things that we wanna get out.

Oh, let's see, and the question is, so does each of those layers focus on a different aspect of the classification task? So yes, they could, but we don't tell it at all what different aspect of the classification task to focus on. Instead, that's what's learned. If you look at, so this is not exactly an image network. This is just a regular feed-forward network. But if you look at some examples of image networks, what they show is that these early layers start learning representations like edges, because that's about the simplest thing that you could look for. And so that's just like an edge. And then the next one might be like corners. And the next one might be like circles and shapes. And so by the time we get to these last nodes, these nodes are predicting things like, is there an eyeball, are there two eyeballs? Are there ears? Are those ears floppy? Are there stripes? What we're gonna do is we're gonna take these last, so basically these early nodes are like primitives, so like edges and circles and shapes.

16. Using Embeddings and KNN Classifier

Short description:

We're inferring on the image using embeddings. Embeddings represent the last layer before the output, and they are learned numbers that represent features of the input. By using a KNN classifier, we group the embeddings into categories, such as dogs and cats. This custom model is added to the end of the deep learning model to separate the groups. In the fine-tuned HTML demo, we load TensorFlow, MobileNet, and the KNN classifier. We provide examples of dogs and cats to train the model.

And these last ones are some high-level representation, like are there two eyes and a nose? Are there wheels? Is this a car? But the tricky part is we can't exactly tell what each one of these nodes is doing. But generally, these early ones are primitives and these later ones are more advanced features.

Okay, so we're inferring now, inferring on the image. And this embedding, what this embedding does is, if we say embedding false, and these are the same embeddings that we were talking about, like the word embeddings, embedding false is gonna give us the actual last layer. So this is one by 1,000. So if you want the full 1,000, the printout of the 1,000 different probabilities for each one of the classes, that's what we do. But if we say embedding true, that's gonna get us the layer just before. And that's, in this case, 1,024. But in the general case, it's whatever you designed as the last layer of your model before the output. The way that these 1,000 non-embedding outputs are calculated is by multiplying and adding and doing an activation on these 1,024 numbers. And so these 1,024 numbers are some high level representation of what's actually in this image.
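The two uses of the embedding flag can be put side by side in a short sketch. `getOutputs` is a hypothetical helper name, and `model` is assumed to be a loaded MobileNet from `@tensorflow-models/mobilenet`, whose `infer(img, embedding)` method returns a tensor synchronously; the shapes in the comments are the ones quoted in the talk.

```javascript
// Sketch: ask a loaded MobileNet for either the final class scores or the embedding.
// `model` is assumed to be the object returned by mobilenet.load();
// `getOutputs` is a hypothetical helper name.
function getOutputs(model, imageElement) {
  // embedding = false: the final output layer, one value per class (1 x 1,000 in the talk).
  const classScores = model.infer(imageElement, false);
  // embedding = true: the layer just before the output (1 x 1,024 in the talk).
  const embedding = model.infer(imageElement, true);
  console.log('class scores shape:', classScores.shape);
  console.log('embedding shape:', embedding.shape);
  return { classScores, embedding };
}
```

Both return values are tensors, so in real code they should be disposed (or wrapped in `tf.tidy`) once they are no longer needed.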

Another way you can do it is, the layers all have names. So conv_preds, this stands for convolution predictions. That's the name of that last layer. And so we can also get it using conv_preds. To know that, you have to look in either the source or the documentation for each model that you're using. Or you can just use embedding equals true and that'll give you the last layer before the output. So let's go back and talk about what these embeddings are real quick. Remember that we have like these numbers. So in this case, there's three numbers for each word, and these embeddings represent the last layer, in this case the last layer before the output. And so each of these embeddings is going to be a learned three numbers, or in this case a learned 1024 numbers, that represent something about either the word or the image or whatever we're passing in as the input. So this image of a cat might look something like this. This is 1024 numbers for the image. And like this number over here might represent that it has two eyes. This number over here might represent how much fur it has. One of these numbers might be like, does it have four wheels? So like, is it a car? And we don't know at all what these numbers do. But these numbers all represent something about the image. So how can we possibly write this dog cat classifier? We're going to use something that's built into TensorFlow.js called a KNN. This is a machine learning technique, so not a deep learning technique. But generally what it's going to do is we're going to take those 1024 numbers as the outputs. And we're going to try to group them into groups. So this is in two dimensions. So imagine this in 1024 dimensions. But we're going to try to say, like, this group over here is all dogs. This group over here is all cats. And I'm gonna show you an example of a carrot and why you might want a third group. But that's what we're going to try to do. 
And then the KNN machine learning model is what we're going to sort of tack on to the end of the deep learning model, in order to use all of the MobileNet goodness that's given to us, plus this kind of additional custom model to try to separate things out into these groups.
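To make the grouping idea concrete, here is a tiny k-nearest-neighbors classifier in plain JavaScript, using two made-up numbers per example instead of 1024. It is a sketch of the general technique only: the function names are hypothetical, and the choice of Euclidean distance is an assumption for illustration, not a description of how the TensorFlow.js KNN classifier measures similarity internally.

```javascript
// Straight-line (Euclidean) distance between two equal-length vectors.
function euclidean(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += (a[i] - b[i]) ** 2;
  }
  return Math.sqrt(sum);
}

// Find the k stored examples closest to the query and let them vote on the label.
function knnPredict(examples, query, k = 3) {
  const nearest = examples
    .map((ex) => ({ label: ex.label, dist: euclidean(ex.vector, query) }))
    .sort((a, b) => a.dist - b.dist)
    .slice(0, k);

  const votes = {};
  for (const n of nearest) {
    votes[n.label] = (votes[n.label] || 0) + 1;
  }
  const labels = Object.keys(votes);
  const label = labels.reduce((best, l) => (votes[l] > votes[best] ? l : best));

  // Confidence here is simply the share of the k votes each label received.
  const confidences = {};
  for (const l of labels) {
    confidences[l] = votes[l] / nearest.length;
  }
  return { label, confidences };
}

// Made-up 2-D "embeddings": cats cluster in one corner, dogs in the other.
const examples = [
  { vector: [0.9, 0.1], label: 'cat' },
  { vector: [0.8, 0.2], label: 'cat' },
  { vector: [0.1, 0.9], label: 'dog' },
  { vector: [0.2, 0.8], label: 'dog' },
];
const result = knnPredict(examples, [0.85, 0.15], 3);
console.log(result.label); // cat
```

With k = 3 the two cat examples and one dog example are the nearest neighbors, so the query is labeled cat with two of the three votes. There is no training step beyond storing the examples, which is why adding examples is so cheap.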

All right, so this is the second demo at the same repository. And we're going to go here. Okay, so this is the fine-tuned HTML demo. If we look at the source for this. First of all, are there any other questions before we start jumping into this model? I meant to be stopping for questions along the way, but let me know in the chat if there is. Okay, so we load TensorFlow, just like before. We load MobileNet just like before because we're doing the same image classifications. But now we load this KNN-classifier as well. And this does not have any pre-trained weights like the MobileNet one that we're gonna use. So we're gonna have to tell it what our dogs and cats are. In order to do that, I have four examples.

17. Using a KNN Classifier for Image Classification

Short description:

In this part, we explore how to use TensorFlow.js to classify images using a KNN classifier. We start by pulling examples of cats and dogs and assigning them class IDs. We then load the model and add the examples to the classifier. When predicting, we infer the image and use the classifier to determine the class. This process is straightforward and can be applied to different datasets and classes.

I just pulled four examples of cats and four examples of dogs. And you can see on-click for these. So just for this example, we're doing it on-click. Really, though, you'd probably want to do this like in Node.js, for example, and then you could save your classifier and then just use the pre-trained classifier. Or if you go to the TensorFlow.js site, there's an example of how to do this with a webcam, which is kind of neat, because then you can like train it on hand signals. You can train on whatever you want through your webcam. So that's kind of a neat example. But for now, we have four dogs and four cats.

And when I'm gonna click on them, it's going to add the example. So this is the name of the ID, cat1, and then it's going to assign it, remember, a number. So our KNN classifier predicts numbers. It doesn't predict dog or cat, right? So we're calling cats, zeros and dogs, ones. Then we have the images that we're going to predict at the bottom. So we're going to say, so we have a cat and we're going to guess if it's a cat. Then we're gonna, we have a dog, we're gonna guess if it's a cat or dog. And we have a carrot, and we're gonna see what happens when we have a carrot. And then we load fine tune.js. So just so we know what's going on. So we have these examples here. These are our training examples. So we have all cats and all dogs here. Then these are our predictions. And we're gonna see what happens when we try to predict a cat, a dog, or carrots.

Let's look at what's gonna happen though. The first thing we do, so we have our network. So this is the same as before. And then we have our classifier. And so this is a brand new classifier that we're creating. And so you can use this KNN classifier whenever you want to separate things into groups of things. To load the model, the first thing we do is we load our MobileNet model. So that's gonna download the 17 megabyte model. And then it'll say that we're loaded. And then when we click on each one of those examples, what it's gonna do is it's going to grab the image and then it's gonna get that activation level, or the activation nodes, by inferring. So instead of predicting, it's gonna infer that image, and then this true is the same as embedding true. So it's gonna grab that last layer before the outputs and it's gonna get the 1024 numbers for that image. And then we're gonna add that example to our classifier. So our KNN classifier, we're gonna add the activation, which is just an array of numbers, and then we're going to give it a class ID. So this is the number zero for a cat, one for a dog. And that's all we do for adding examples. KNNs are super simple because you can just add the example. So in this case, it's the 1024 numbers for the image. But in your case, it could be a list of numbers from a table. It could be however you create something that represents your data. And then what class it belongs to. When we actually go to predict, we're gonna call cat or dog. It's gonna grab the image again. It is going to infer, in this case I did conv_preds. It's exactly the same as this, because conv_preds is the name of that last layer. And then we are going to call a classifier. So this is, sorry, this is on our big MobileNet network.

18. Calling the Classifier and Predicting the Class

Short description:

We call a classifier to predict the class of the activation. The result includes a zero or one, representing cat or dog, respectively. The classifier doesn't know about cats or dogs, only zeros and ones. We can also obtain the confidences, which are the probabilities of it being a dog or cat.

Then we're gonna call a classifier. So that's our KNN classifier that we created, and we're trying to predict the class of the activation. So this is when we're actually doing the prediction. And the result we get back is going to have a couple different things. First of all, remember, we're gonna get a zero or a one back, and so we're gonna say zero is cat, and one is dog. And so the classifier itself doesn't know anything about cats or dogs. It knows about zeros and ones. The label itself, that's gonna be the zero or one. So we can call classes on that. And then we can also get out the confidences, so we can get out the probability that it thinks that it's a dog or cat. And then we're just going to log that.
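Putting the add-example and predict steps together, a sketch might look like the following. The function names and the `CLASS_NAMES` array are hypothetical, not the workshop's source. `net` is assumed to be a loaded MobileNet and `classifier` the object returned by `knnClassifier.create()` from `@tensorflow-models/knn-classifier`, whose `predictClass` resolves to an object with a string `label` and a `confidences` map.

```javascript
// Hypothetical mapping from the numeric class IDs used in the talk to readable names.
const CLASS_NAMES = ['cat', 'dog'];

// Training step: embed the image with MobileNet, then store it under a class ID.
function addExample(net, classifier, imageElement, classId) {
  const activation = net.infer(imageElement, true); // true = embedding layer
  classifier.addExample(activation, classId);
}

// Prediction step: embed the new image and ask the KNN classifier which group it is in.
async function catOrDog(net, classifier, imageElement) {
  const activation = net.infer(imageElement, true);
  const result = await classifier.predictClass(activation);
  // result.label is the class ID as a string ("0" or "1");
  // result.confidences maps each class ID to a probability.
  const name = CLASS_NAMES[Number(result.label)];
  console.log(name, result.confidences);
  return name;
}
```

The classifier only ever sees the numbers 0 and 1; translating them back to "cat" and "dog" is the caller's job, which is what the lookup array does here.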

19. Running the Classifier and Fine Tuning

Short description:

We loaded the model, added four cats and four dogs, and ran them through a classifier. The predictions were accurate, demonstrating the power of fine tuning. With just a few examples, we were able to predict new dogs or cats. This highlights the importance of finding a model that aligns with your goals.

Okay, so let's see what that actually looks like. I can try to load this again. So if I load this again, so it's loading, and then we have to wait for a while because it's downloading. Okay, now it successfully loaded the model, and now we can click on each one of these to actually add it to the classifier. So it's adding, takes a little bit, now it's done. And so these images are actually really large, and so it would take less time if the images were smaller. But so all I'm gonna do is I'm gonna add four cats. Okay, and now I'm gonna add four dogs. And you can click on them, it'll do them in parallel, it'll just say done when it's done. Sorry, it'll do them in series. Okay, so now what we've done is, we've created a KNN classifier with just four examples of cats and four examples of dogs. And now let's click on our cat and see what it's gonna predict. Let's see, it's running through the classifier. And it predicts cat with a probability of one. So it really thinks that's a cat. Let's click on the dog. And it thinks it's a dog with a probability of one. And this I think is super amazing. And so let's take a moment to just appreciate how cool this is. The ImageNet model itself was trained on either a million or 10 million images, depending on how you count. And so, it learned something about the structure of dogs and cats. But what we did is with just four examples each, so four cats and four dogs, we were able to predict new dogs or cats. And so this is the power of fine tuning, and that's why I say really go and see if there's a model that does kind of what you want it to do. Because you can take that kind of what you want it to do and turn it into exactly what you want it to do with very limited examples.

20. Handling Different Size Inputs

Short description:

Neural networks can handle images of different sizes in two ways: resizing the images to the same size before passing them through the network or using a fully convolutional network that reduces the size of the image at each step. MobileNet handles different size inputs automatically, but if you were designing it yourself, you would need to manage the different sizes.

Let's see, a question is, would this process still work if all the images were of different sizes? Would it change the number of inputs? Would it complicate things? So, okay, so yeah, these are actually different sizes. Each of these images, I just resized it so it looks the same size on the page, but they're actually different. Okay, there's two ways that neural networks can handle that. The first is it could resize the image before it passes it through the network. That's a fairly common approach. So then you run into problems of what happens when you have a rectangular image, and there's a few ways to handle that. So you can put padding on the top, like white pixels or black pixels, you can do this mirror kind of padding. You can just crop the image. And so that's sort of an older way to handle it, to resize the image in some way so that all the images are the same size. Because remember, we're doing this all on GPUs, and GPUs expect the tensors to be just multi-dimensional arrays, and so they have to be the same size. Then the other way to do it though is a sort of newer way, which is kind of a fully convolutional network. And what that does is the network itself runs until it gets down to the same size. And we'll see this a little bit later, but basically each step of a convolutional network reduces the size of the image by half. And so you can just reduce the size by half until you get down to just like one pixel, and you have one pixel with a bunch of filters. And that's the other way that you can handle it. So yes, having different size inputs complicates things because you have to handle it differently. The beauty is MobileNet handles it for us. So these are actually different sized images. It gets completely handled for us. If you were designing it yourself, though, yeah, you would have to manage the different sizes.
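The resize-with-padding option can be worked out with a little arithmetic. The sketch below computes how a rectangular image would be scaled and padded to fit a square input; `letterbox` is a hypothetical helper, the 224 pixel target is an assumption (a common MobileNet input size), and centering the padding is one choice among the several the talk mentions.

```javascript
// Sketch: scale an image to fit inside a square target, then pad the short side.
// The 224 default is an assumption for illustration.
function letterbox(width, height, target = 224) {
  // Scale so the longer side exactly fits the target, preserving aspect ratio.
  const scale = target / Math.max(width, height);
  const newWidth = Math.round(width * scale);
  const newHeight = Math.round(height * scale);
  // Whatever is left over on each axis is filled with padding pixels.
  const padX = target - newWidth;
  const padY = target - newHeight;
  return {
    width: newWidth,
    height: newHeight,
    padLeft: Math.floor(padX / 2),
    padRight: Math.ceil(padX / 2),
    padTop: Math.floor(padY / 2),
    padBottom: Math.ceil(padY / 2),
  };
}

// A 640x480 photo becomes 224x168 with 28 pixels of padding above and below.
console.log(letterbox(640, 480));
```

Cropping instead of padding would use `Math.min(width, height)` for the scale and discard the overflow, which keeps every pixel meaningful but can cut off part of the subject.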

21. Choosing the Right Layer in MobileNet

Short description:

To get a different layer in MobileNet, refer to the documentation for a list of layer names. The layer you choose should depend on how similar your data is to the data used to train the network. If your data is similar, use a later layer. If your data is different, use an earlier layer.

Let's see, there's more questions. Would it be more accurate if you use 10 cats and 10 dogs? Yes, and we're gonna look at the carrot example and I'll kinda come back to that. Is the last layer always conv_preds? Is there a way you can get a different layer? Yes, so you can get a different layer. So if you go to the documentation for MobileNet, and I didn't do this, I don't exactly know where it is, but somewhere in there, it's going to have all the layer names. So you can get another layer if you wanted. Like if you want a really early layer, you can do that. I'll say it depends on how much your data looks like the data that was used to train the network. So we're using cats and dogs and there's lots of cats and dogs in ImageNet. So the more that your data looks like cats and dogs or cars or boats, whatever, the later the layer that you'll wanna use. If your data looks completely different, so it's images, but it's something completely different, like screenshots, you may wanna go to an earlier layer. If you have a screenshot and maybe you're trying to detect what application you're running, you don't care about eyes and ears. You care about maybe the edges and stuff. So you may go to an earlier layer. So yeah, look at the documentation for MobileNet and you can figure out what layer you wanna grab.

22. Exporting and Importing Models

Short description:

You can export and import a tuned model for future use. The KNN classifier is a TensorFlow model that can be saved and loaded for predictions. The accuracy of predictions can be improved by providing more input examples that are similar to the test examples. Consider including out-of-band examples in the input set. When training, there are no speed-ups based on different shapes or backgrounds in photos. Larger images may take longer to train and do inference. Instead of a KNN classifier, you can use other algorithms or even another neural network depending on the complexity of the desired result.

And then let's see, can you export a new tuned model and just import it next time? Yes, so that KNN classifier is just a regular model, a TensorFlow model, and we'll... I don't show how to save it, but I'll show you how to, like you can import some models. And so, yes, you can save the results of the KNN classifier. And then next time you'd load MobileNet to get the activations and then run it through your pre-trained KNN classifier. And that's how you can do predictions. So yeah, you don't have to train it every time.
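As a rough sketch of the idea of saving a trained KNN classifier and loading it next time, the stored examples can be written out as JSON and rebuilt later. This is plain JavaScript under stated assumptions: the `{ label, features }` format and the helper names `serializeExamples` and `restoreExamples` are made up for illustration and are not the TensorFlow.js save or load API.

```javascript
// Stored examples: one label plus the activation vector it was trained on.
// The format is hypothetical; real activations are much longer.
const trained = [
  { label: 'cat', features: Float32Array.from([0.5, 0.25]) },
  { label: 'dog', features: Float32Array.from([0.125, 0.75]) },
];

// Typed arrays do not round-trip through JSON as arrays,
// so convert them to plain arrays before saving.
function serializeExamples(examples) {
  return JSON.stringify(
    examples.map((ex) => ({ label: ex.label, features: Array.from(ex.features) }))
  );
}

// Rebuild the typed arrays when loading.
function restoreExamples(json) {
  return JSON.parse(json).map((ex) => ({
    label: ex.label,
    features: Float32Array.from(ex.features),
  }));
}

const saved = serializeExamples(trained); // this string could go to disk or localStorage
const restored = restoreExamples(saved);
console.log(restored.length, restored[1].label);
```

The restored examples can then be used for prediction without repeating the training step.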

Okay, so we trained it on dogs and cats, and we tried a dog and a cat and got a dog and a cat. What about carrots? So I'm gonna click on carrots. It's gonna predict. And it predicts that these carrots are a dog with 66% probability. And so this is gonna be the answer to the question of whether it would be more accurate if we used more examples. If we think about what we did, we have ImageNet, which has 1000 classes, but our KNN classification just has two classes, cat or dog. And so when I give it carrots, it has to predict either cat or dog. And so what we're gonna talk about is this kind of out of band data. Whenever we use networks, they aren't magic. They only predict what you passed in. And so here we had exactly four cats and dogs. And these cats and dogs that we predicted on, they look kind of like the images that we used here. If we didn't do that, so if we used a cat that looked a lot different, like just the side of a cat or something, then it might predict something strange. And so in this case, we have carrots, but we could have a cat here which doesn't look like our input cats, and it might predict dog, right? And so yes, the more input examples that you can give it, the better it will be at predicting the output. And especially you want your input examples to be as close as possible to the test examples you're gonna give it. And so, yeah, if you have 10 dogs and cats, that'll do better. If you have a hundred, that'll do even better. And when you're doing this, think about this carrot. Because it's gonna predict the carrots are dog or cat. So it might be a good idea, for example, if we know we might have carrots, to either have a bunch of carrots, right? So that'd be a third class. Or just out of band examples, which we could call unknown or something, right? And get a bunch of examples which are not dogs or cats, but might be part of your input set.
And so I kinda show this as a warning, basically, because you can get wrapped up in, oh, we used MobileNet, that's great. We got these dogs and cats predicting, that's great. And so we're done, right? Well, it's very important, even if you're not doing the actual deep learning AI creation itself, like you're not actually creating MobileNet. You still have to think about what your data is gonna look like and what you're gonna do inference on. So hopefully that helps a little bit.
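The carrot problem can be reproduced with a tiny nearest-neighbour classifier written in plain JavaScript. This is a sketch under stated assumptions: the two-number feature vectors stand in for MobileNet activations, and `distance` and `knnPredict` are hypothetical helpers, not the TensorFlow.js KNN classifier.

```javascript
// Euclidean distance between two feature vectors of equal length.
function distance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += (a[i] - b[i]) ** 2;
  }
  return Math.sqrt(sum);
}

// Return the majority label among the k nearest labelled examples.
function knnPredict(examples, query, k = 3) {
  const nearest = examples
    .map((ex) => ({ label: ex.label, dist: distance(ex.features, query) }))
    .sort((a, b) => a.dist - b.dist)
    .slice(0, k);

  const votes = {};
  for (const n of nearest) {
    votes[n.label] = (votes[n.label] || 0) + 1;
  }
  // Pick the label with the most votes.
  return Object.keys(votes).reduce((best, label) =>
    votes[label] > votes[best] ? label : best
  );
}

// Toy two-number "activations"; the values are made up.
const twoClasses = [
  { label: 'cat', features: [0.9, 0.1] },
  { label: 'cat', features: [0.8, 0.2] },
  { label: 'dog', features: [0.1, 0.9] },
  { label: 'dog', features: [0.2, 0.8] },
];
const carrot = [5.0, 5.2]; // looks like neither class

// With only cat and dog available, the carrot must be one of them.
console.log(knnPredict(twoClasses, carrot));

// Adding an "unknown" class gives the classifier a better option.
const threeClasses = twoClasses.concat([
  { label: 'unknown', features: [5.1, 5.0] },
  { label: 'unknown', features: [4.8, 5.3] },
]);
console.log(knnPredict(threeClasses, carrot));
```

With two classes the carrot is forced into one of them. Once a few out of band examples are added, the same query lands in the unknown class instead.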

Are there any rules regarding photos which speed up the training part, like different shapes, different background, etc? So no, not necessarily. When you're doing training, you have to look at the entire image no matter what that image is. And when you send it to the graphics card, it takes exactly the same time for an image which is, say, all one color versus lots of different colors. If you're transmitting that over a network, then it's better to transmit fewer colors, for example. But if you're doing training, it's exactly the same. So no, no speed-ups based on different shapes. Well, except that larger images will take longer, because you either have to resize them, and that's gonna take some time, or, if your network is one of the newer kinds of networks, it has to do more iterations to get it smaller. And so larger images generally will take longer to train and to do inference.

If you use the earlier layers to predict which application is running on a screenshot, would you want to attach another neural network at the end instead of a KNN classifier? Okay, so that's a good point. We added a KNN classifier to the end of this, but that's just because I knew that we were gonna have dog, cat, and carrot. You don't have to use a KNN classifier though. You could actually use anything you want. You could use a conventional algorithm, or you could use a whole other neural network, which is what Dennis is suggesting here. And so, it depends on basically the complexity of what you're getting out, the complexity of your final result. So here we just wanted three classes, so KNN is perfect. If you wanted something more specific, so let's say we had these inputs and we wanted to predict the position of the eyes. That's a regression problem, where you're actually getting the X, Y coordinates of the eyes, for example.

23. Using Neural Network for Position Detection

Short description:

To determine the position of the eyes, KNN is not suitable. Instead, a neural network would be a better choice, specifically using an earlier layer to identify the position of circles. The input would be 1024 values, and the output would be XY coordinates.

For that, we can't use KNN because KNN is just gonna give us, you know, dog, cat, carrot. So if we want the position of the eyes, we have to use something else, and a neural network would be a great choice for that. And probably, if you were doing position of eyes, you would not use the last layer, like we did for MobileNet. You would probably use an earlier layer, because you're looking for the position of circles, for example. So if you're doing something like that, yeah, I would use an earlier layer and then do something like a neural network on the output, where the inputs are, you know, 1024 values. Something I didn't say: earlier layers are often larger. In this case, they're all the same size, but usually these will be different sizes, so again, check .shape a lot to figure out the sizes your model is working with. But yeah, I'd use an earlier layer and then I'd run it through something like a neural network where the output was XY coordinates.

24. Exploring Other Techniques for Tabular Data

Short description:

If you can't find a model or fine-tune one for your specific task, consider exploring other machine-learning techniques or conventional techniques. Deep learning is not always necessary, especially for small datasets. TensorFlow.js provides options for using other machine-learning techniques, and there are other JavaScript libraries available as well. Don't overlook the power of conventional techniques for tabular data.

Okay. All right, so we looked at a model where it did exactly what we wanted, and we looked at a fine-tuned model where we used KNN, or you can use something else like a neural network. But what if you can't fine-tune a model? What if you can't find anything that represents what you're trying to do? The first thing that might look like is, again, this tabular data. Say we wanted to predict something on these three columns, right? There are gonna be no existing models for exactly that. The first thing that I would say, though, is that you might not need a neural network. If we look back at this, this is just three data points. And people are very quick right now to jump to deep learning, but number one, there are lots of other machine-learning techniques, and there are lots of conventional techniques that are really good. And so, if you are looking for something and can't find a model for it, it might be because there's a better technique than using a neural network. So, that's the first thing I would say: look towards other machine-learning techniques, also. And you can do those in TensorFlow.js, but there are also lots of other JavaScript libraries that can do machine learning on data, especially tabular data like this. So, you might not need a neural network.

25. Using Python to Convert and Load Models

Short description:

If you use Python, you can convert your models into something that can be read by TensorFlow.js. This allows you to easily run examples or tutorials written in Python and use the resulting model in TensorFlow.js. You can load the model in TensorFlow.js or save it for later use.

If we look at real options, though, the first one is to use Python. I know in a JavaScript talk maybe that's not the best example, but if you use Python, what you can do is convert your models into something that can be read by TensorFlow.js. This goes back to the fact that most examples are written in Python first. So you can probably find an example or tutorial online that uses Python, and you don't have to understand that much to run it and then get a model which you can then use in TensorFlow.js.

And this is how you load a model. So, this is what I was talking about, and how you're loading in TensorFlow.js. There is another method to save it. So, like a KNN classifier, you can just save the model and then you would load it this way. And so, this lets you load models from Python or from other things.

26. Types of Networks and Existing Models

Short description:

Option two is to create a new network. There are two types of networks: fully connected and Convolutional Neural Network (CNN). In a fully connected network, every node looks at every node in the previous layer, while CNNs use kernels to look at one part of the image at a time. CNNs are commonly used for image recognition and can detect edges. It is recommended to look for existing models when using TensorFlow.js.

Option two is create a new network. So, that's what we're gonna talk about a little bit. Like I mentioned, this is a fully connected network. Oh, sorry, that was a different question. So, this is a fully connected network. This is not the only kind of network. And this is the part where we're getting a little long into the workshop. And so, I'm going to gloss over a little bit how to do this. And I'll explain why later. But there are a couple types of networks. So, let's look at some of them.

The first is fully connected. That's also called dense. So, if you see dense and fully connected, that's the same thing. And the important thing for fully connected is how many nodes you're going to have. So, in this case, it's how many outputs, because we're looking at just the outputs. But this is how many nodes you have. The kernel initializer, which is variance scaling, don't worry about it too much. That's a fine default. And then, this activation, that is the type of, I called it an if statement before, right? But that's going to be important, too. So, this is an example of the output layer. And so, if you see an activation of softmax, that's going to be an output activation. Other activations you might see are ReLU and Leaky ReLU. You could see tanh. You could see sigmoid. So, there are lots of different activations. So, if you're creating your own dense layers, pay attention to activations. So, number of classes, activations. And the idea behind fully connected networks is that they work on one row at a time. So, it's just going to take these three numbers and it's going to predict something. And every node in an intermediate layer is going to look at every node in the previous layer. That's why it's called fully connected.
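To make the dense layer and its activations concrete, here is a forward pass written out by hand in plain JavaScript. It is a sketch, not the TensorFlow.js layers API: the `dense`, `relu`, and `softmax` helpers and all weight values are made up for illustration.

```javascript
// ReLU: the "if statement" activation, keep positives and zero out negatives.
function relu(values) {
  return values.map((v) => (v > 0 ? v : 0));
}

// Softmax: turn raw scores into probabilities that sum to 1.
function softmax(values) {
  const max = Math.max(...values);
  const exps = values.map((v) => Math.exp(v - max));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}

// A dense layer: every output node sees every input value.
// weights[j][i] connects input i to output node j.
function dense(inputs, weights, biases, activation) {
  const sums = weights.map((row, j) =>
    row.reduce((acc, w, i) => acc + w * inputs[i], biases[j])
  );
  return activation(sums);
}

// One row of three numbers in, two class probabilities out.
// The weights here are arbitrary; training would normally set them.
const row = [1, 2, 3];
const hidden = dense(row, [[1, 0, -1], [0.5, 0.5, 0.5]], [0, 0], relu);
const probs = dense(hidden, [[1, 0], [0, 1]], [0, 0], softmax);
console.log(hidden, probs);
```

The hidden layer uses ReLU as an internal activation, and the last layer uses softmax so the outputs can be read as class probabilities.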

Another type of network that's super common is called a CNN, or Convolutional Neural Network. And this is going to be for images, mostly. In a CNN, instead of every layer looking at the whole previous layer, what it will do is look through something called a kernel. And this is five, so it looks at a five-by-five part of the image at a time. So it's going to look at a five-by-five part of the image and it's going to turn that into eight different numbers. And it's going to do that by moving this kernel over the image one pixel at a time. And then, if we look at activations, this one has an activation as well. This is an internal one, so it uses ReLU. ReLU is that if statement I was talking about. And so, if we look at that, what it's going to do is take these numbers (these would be four-by-four, not the five-by-five) and turn them into something called filters, which are just numbers. And this is the early layers of a network. And this is where you can do things like detect edges. So for example, if you had big numbers here and low numbers here, that would be like a vertical edge, or a horizontal edge at the top. And so, this is how CNNs do things like detect edges.
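The sliding kernel can be sketched in plain JavaScript. This is an illustration under stated assumptions: it uses a single hand-picked 2 x 2 kernel instead of learned 5 x 5 kernels with eight filters, and `convolve2d` is a hypothetical helper rather than a TensorFlow.js call. As in most deep learning libraries, the kernel is applied without flipping.

```javascript
// Slide a small kernel over the image one pixel at a time
// (no padding) and apply ReLU to each result.
function convolve2d(image, kernel) {
  const kSize = kernel.length;
  const outH = image.length - kSize + 1;
  const outW = image[0].length - kSize + 1;
  const out = [];
  for (let y = 0; y < outH; y++) {
    const outRow = [];
    for (let x = 0; x < outW; x++) {
      let sum = 0;
      for (let ky = 0; ky < kSize; ky++) {
        for (let kx = 0; kx < kSize; kx++) {
          sum += image[y + ky][x + kx] * kernel[ky][kx];
        }
      }
      outRow.push(Math.max(0, sum)); // ReLU
    }
    out.push(outRow);
  }
  return out;
}

// Dark on the left, bright on the right: a vertical edge in the middle.
const picture = [
  [0, 0, 9, 9],
  [0, 0, 9, 9],
  [0, 0, 9, 9],
];
// Negative weights on the left, positive on the right.
const verticalEdge = [
  [-1, 1],
  [-1, 1],
];
console.log(convolve2d(picture, verticalEdge));
```

The output is large only where the kernel sits across the dark-to-bright boundary, which is the sense in which an early layer detects an edge.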

There's another type of network. I'm going over these really quickly; I just want to explain what they are. And then later I'm going to show that you should probably, especially if you're using TensorFlow.js, look for models that already exist.

27. Recurrent Neural Networks and Data Length

Short description:

RNNs are used for data with variable length, like sentences. However, tensor cores and GPUs expect arrays of the same size, so you may need to pad sentences to match the length. RNNs can be thought of as a special type of CNN, but with a different internal structure. In the end, data is just arrays of numbers, so as long as you convert it, you can use RNNs, CNNs, or fully connected networks.

Okay. So another type is RNN, or recurrent neural network. I'm going to jump to the example first. This is going to be for things that have a variable length. So these are like sentences, often. It's important though, to know that even though the sentence might have variable length, remember the tensor cores and the GPU expect arrays that are of the same size. So if you have lots of sentences that are all different lengths, you're going to have to do something to match all those lengths up. And so this trips people up with recurrent neural networks often, because it's not like a CPU where you can just go one word at a time. When you go on a GPU, you have to look at the entire thing all at once. And so one thing to keep in mind when you're doing these RNNs is that this data here is a regular tensor. And so it's often the case that you have to pad the sentence to get it to be the same size. And when you think about it that way, these recurrent neural networks are more like a special type of CNN, although this is an LSTM cell, which has a completely different internal structure. But the data is still just arrays of numbers. And so that tripped me up for a while with RNNs. It's like, how can you pass sentences that are different lengths? But it's really just arrays of numbers. So if you can convert it to an array of numbers, you can use RNNs, CNNs or fully connected networks.
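Padding sentences to a common length can be sketched in plain JavaScript. The helper name `padSentences`, the word ids, and the choice of 0 as the padding id are all assumptions for illustration.

```javascript
// Pad (or truncate) every tokenised sentence to the same length so the
// batch forms a rectangular array. 0 is used here as the padding id.
function padSentences(sentences, length, padId = 0) {
  return sentences.map((tokens) => {
    const clipped = tokens.slice(0, length);
    const padding = new Array(length - clipped.length).fill(padId);
    return clipped.concat(padding);
  });
}

// Word ids are made up; a real pipeline would use a vocabulary lookup.
const batch = padSentences([[4, 7], [3, 9, 2, 8, 5], [6]], 4);
console.log(batch);
```

After padding, every row has the same length, so the batch is a regular two-dimensional array of numbers that can be turned into a tensor.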

28. Using TensorFlow.js for AI in JavaScript

Short description:

Look for existing models in TensorFlow.js and use them without needing to understand the underlying algorithms. TensorFlow.js allows you to do AI in JavaScript, offering privacy and inference speed advantages over Python. It can be as fast, or even faster than Python for certain tasks. Check out the tutorial section for more information. As a JavaScript software engineer, you don't necessarily need to understand all the details of AI algorithms. Just find a model that suits your needs, follow the tutorials, and use it. There are also various APIs available for AI. Starting with models may spark your interest and lead you to dive deeper into AI. TensorFlow.js is a powerful tool that simplifies AI development in the browser or other platforms. As a software engineer with experience in web and mobile development, you can leverage TensorFlow.js to enhance your projects. Consider exploring AI further, as it can open up new opportunities for learning and growth.

All right, so things to remember, we're getting towards the end a little bit, so if you have more questions feel free to put them in the chat. And this is, remember, Zoom chat not Discord chat.

Okay, so number one, I've said this several times but look for existing models. TensorFlow.js has great models, they have loads of them. If you're trying to do something, remember TensorFlow.js is really useful on the web or on mobile. If you're trying to do something like that, there probably is a model that exists that is sort of like what you want. And you may have to read the documentation a lot to convert your data into the numbers that can go into the model that you can use. But once you do, you'll be able to use all of this great work that other people have done. And you won't have to redo that work.

Remember, data is just multidimensional arrays of numbers. It took me a long time to figure this out. But all of the data is just these tensors. And so, it's really important to understand how these tensors operate. If you can look at this and you can see, oh yeah, I can see that this is a four-dimensional tensor, then you're doing pretty well. And remember to use that .shape property in TensorFlow.js to really look at all of your inputs and outputs.
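To practise reading shapes without any library, a small plain JavaScript helper can report the shape of a nested array. `shapeOf` is a hypothetical helper for illustration, and it assumes the data is rectangular, which is what a tensor requires anyway.

```javascript
// Walk down the first element of each level to read off the shape,
// similar in spirit to inspecting .shape on a tensor.
// Assumes every level is rectangular.
function shapeOf(data) {
  const shape = [];
  let level = data;
  while (Array.isArray(level)) {
    shape.push(level.length);
    level = level[0];
  }
  return shape;
}

// Two 2 x 3 RGB images: [batch, height, width, channels].
const pixel = [0, 0, 0];
const image = [
  [pixel, pixel, pixel],
  [pixel, pixel, pixel],
];
const images = [image, image];
console.log(shapeOf(images));
```

A batch of colour images has four dimensions, which is the four-dimensional case mentioned above.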

Neural networks learn by taking training data and converting it into numbers. And so, any problem that you have that you can take a pile of training data, convert it to numbers, and get an output, which is a number, you're going to be able to use neural networks for if you want to. So, in this case, we have a pile of images, which are just numbers, and our output, remember, was a zero for a cat and a one for a dog. And so, whenever you're thinking about creating your own neural networks or creating your own algorithms this way, think about, what numbers do I have as inputs? What numbers do I need as outputs?

The training loop: we have our neural network, it starts completely random, and then we run this update step every time we pass in new input to get a new neural network. And this is, again, the beauty of the pre-existing models. Someone has already done all this work for us. So, you can look at the TensorFlow.js examples for how to train your own. But one of the great aspects of it is the models that already exist.
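The loop of starting random and repeatedly applying an update step can be sketched with the smallest possible model, one weight and one bias, in plain JavaScript. Everything here is an assumption for illustration: the target line, the learning rate, the step count, and the use of plain gradient descent on mean squared error.

```javascript
// Training data for the made-up target y = 2x + 1.
const xs = [0, 1, 2, 3];
const ys = [1, 3, 5, 7];

// The "network" is a single weight and bias, starting out random.
let weight = Math.random() * 2 - 1;
let bias = Math.random() * 2 - 1;
const learningRate = 0.05;

// One update step: measure the error and nudge the parameters downhill.
function updateStep() {
  let gradW = 0;
  let gradB = 0;
  for (let i = 0; i < xs.length; i++) {
    const error = weight * xs[i] + bias - ys[i];
    gradW += (2 / xs.length) * error * xs[i];
    gradB += (2 / xs.length) * error;
  }
  weight -= learningRate * gradW;
  bias -= learningRate * gradB;
}

for (let step = 0; step < 2000; step++) {
  updateStep();
}
console.log(weight.toFixed(3), bias.toFixed(3));
```

Whatever the random starting point, the repeated updates pull the weight and bias toward the values that fit the data. A real network does the same thing with many more parameters.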

All right, so why use TensorFlow.js again? Again, you can do AI in JavaScript, which is fantastic. It works in the browser, on the server, on mobile, and on internet of things devices. And the two big things it has over Python are privacy, because you don't have to send your data to some server, and inference speed, because you don't have to send your data to some server. You can do it all on the one device. And it can be as fast as, or faster than, Python. We saw a little bit how it can use the GPU, and because of the just-in-time compiler, it can actually be faster than Python for some things. There are some really neat benchmarks out there that show that.

All right, so head to the tutorial section if you want to check out more about TensorFlow.js; there are some great tutorials out there. The two things that we did, and that link is at the top of the chat, are basically taken from some of those examples. And yeah, so that is what I had prepared. Does anyone have questions? Let's see. I am a JavaScript software engineer. Do I need to understand how AI works? Or are we gonna use SaaS APIs of AI platforms in the end? What do I recommend for normal software engineers? Yeah, so this is one of the main reasons that I say look at those models, because you don't really have to understand exactly what's going on inside MobileNet. So all the stuff I glossed over at the end about RNNs and CNNs and all that, you don't really have to understand those. You can find a model that kind of looks like what you want, look at the tutorials, and then just use it. And I think that's one of the great things about TensorFlow.js: all that work is done for us and we can just use it in the browser or wherever. You talk about APIs. Yeah, there are lots of APIs as well. GPT-3 is one of the most famous ones probably, but there are a lot of different APIs we can use. Now, I will say, I started with TensorFlow.js because I'm a JavaScript developer, right? I started using some of the models. And then I wanted to dive deeper and actually change some of the models. That will probably happen to you as well. And now I'm on this whole path of learning a lot about AI. So if you're just doing, you know, one thing, look at models and just do it. But at least for me, what that looked like is that I really got interested and I started diving deep. Am I a software engineer? Yep, I'm a software engineer. I guess I didn't talk about myself at all. I've been doing web and mobile development for about 13 years, and I'm an independent contractor or consultant now. But just recently I got accepted to a master's program.

29. Master's in AI and Image Dimensions

Short description:

I'm going to be doing a master's in AI. TensorFlow.js supports distributed inference or training. Images don't have to have the same dimensions. For older style networks, you can resize the image, but be careful not to cut off important information. For newer networks, like fully convolutional networks, resizing is done within the model.

So I'm going to be doing a master's in AI now. So, yeah, I'm a web and mobile, mostly React and Node. But I'm doing a lot of Python now because that's what everything's written in.

Let's see, does TensorFlow support a distributed model, so that we can process data on multiple devices P2P? Sort of. Not out of the box, but you can do all the distributed stuff that regular TensorFlow can do. So yes, you can do distributed inference or training. I'll say it's kind of tricky and you sort of have to learn all the ins and outs. I haven't seen any good tutorials about it, but that is possible with TensorFlow.js, yes.

Should the images have all the same dimensions for training or recognition? They don't have to have all the same dimensions. Like I mentioned, there's two major ways of doing that. For older style networks, it will just resize the image. So you want to be smart about how you resize because like if you just crop, for example, then you may cut off information that's important. For newer networks that are fully convolutional networks, what it does is basically every convolution step, it looks at the image and then it does either a max pooling or something called a stride of two, which reduces it by half. And so it reduces, I'm trying to put this in the frame, it reduces by half every single iteration until it gets just down to one pixel by however many filters you have, which can be like thousands. And so it just does that until it's down to one pixel. And so you don't have to resize it because it will kind of resize it inside of the model.
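The repeated halving can be tabulated with a few lines of plain JavaScript. This is a sketch: `halvingSchedule` is a hypothetical helper, and rounding odd sizes up is an assumption (it corresponds to padding before the stride-2 step).

```javascript
// List the spatial size after each stride-2 step until one pixel remains.
// Odd sizes are rounded up, which assumes the layer pads its input.
function halvingSchedule(size) {
  const sizes = [size];
  while (size > 1) {
    size = Math.ceil(size / 2);
    sizes.push(size);
  }
  return sizes;
}

console.log(halvingSchedule(224));
console.log(halvingSchedule(300));
```

Under these assumptions a 224-pixel side needs eight halving steps and a 300-pixel side needs nine, which is one way to see why larger images take more work.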

QnA

Q&A and Resources

Short description:

Congrats on the masters! TensorFlow.js can be used for a product recommendation engine, but Python examples are also available. The model consists of weights and biases stored in a data structure. AI can be used to predict BTC exchange rates, but it's important to understand that it's not magic. Kaggle offers competitions and discussions on predicting stock prices. For beginner AI students, resources like TensorFlow.js documentation, the book 'Learning TensorFlow.js', fast.ai, Coursera, and online college courses are recommended.

Congrats on the masters, thanks. Yep, I'm super interested to see what I can learn by doing a masters. All right, did I miss anything? Or is there anyone else with questions? You can also, since it's just regular zoom, I suppose you could turn on your cameras if you wanted to ask a question.

Would I use TensorFlow.js for a product recommendation engine? You definitely could. It depends on what your inputs are. I would say that the first thing I'd probably look at for something like that is, so, there's collaborative filtering. That's kind of the keyword and TensorFlow.js can do that for sure. So, I would look for collaborative filtering examples and use that. And then, if you're gonna be running it on a server and you're already running Node, then you could definitely use that. But there are some great Python examples, of course, that can do it as well. And so, if you're running it on a server and you can run Python, then you may have more luck using some of that. But TensorFlow.js can do it.

So, the model is basically all the weights and biases stored in a unified data structure. Yep, exactly. So, if we look back, long way back here, yep. And so, these weights, so, I have like weight one, bias one, weight two, bias two, and there's, you know, there can be millions of them. That's what you talk about. If you talk about parameters of a network, that's what you talk about. And yes, TensorFlow, so Keras, PyTorch, they all have different ways of storing that. But, it's basically just a data structure. So, it's a data structure that stores the different layers and then the different weights and biases. And so, yeah, you can actually inspect the saved models and you'll just see a bunch of numbers. And it's these weights and biases which are stored.
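The idea that a model is a data structure of layers, weights, and biases can be sketched in plain JavaScript. The layout below is made up for illustration and is not the file format of TensorFlow.js, Keras, or PyTorch.

```javascript
// A saved model is essentially a list of layers, each holding its numbers.
// Layer names and values are hypothetical.
const model = [
  { name: 'hidden', weights: [[0.2, -0.5, 0.1], [0.7, 0.3, -0.4]], biases: [0.0, 0.1] },
  { name: 'output', weights: [[1.5, -1.1]], biases: [0.05] },
];

// Count every weight and bias: this is the network's parameter count.
function countParameters(layers) {
  return layers.reduce((total, layer) => {
    const weightCount = layer.weights.reduce((n, row) => n + row.length, 0);
    return total + weightCount + layer.biases.length;
  }, 0);
}

console.log(countParameters(model));
```

This toy model has 11 parameters. The same counting applies to networks with millions of them.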

Is it possible to predict the BTC exchange rate? Have you heard about projects like that in open source? So, yes, but if you're talking about Bitcoin or stock prices, and you want to predict future prices from past values, remember that AI is not magic. What it will do is try to find signal in whatever data you give it. And so for that to happen, first of all you have to believe that you can predict the future price from the past price. If you do believe that, then AI is one way to do it, yes. So if you go to Kaggle, this is the one I know about. Let's see, competitions... Okay, and you go to... Here, Jane Street Market Prediction. This is exactly that. So this is trying to predict, I think, stock prices based on past values, and I believe this is an ongoing competition. First of all, you can win prize money if you do it well, so that's nice. The other thing is that there are loads of discussions. So go to these discussion forums and you can look at what all these people are doing to try to use AI to predict prices. All right, any other questions?

Resources for beginner AI students. Yeah, let me point some out. If you want to do it in JavaScript, the TensorFlow.js docs are pretty good. There's also a book that's just out, Learning TensorFlow.js. I just got this in the mail and it looks pretty good. It's just out from O'Reilly, so that's great. If you want to do it in Python, one is fast.ai. Fast.ai is a great resource for beginners. It is in Python, but what they do is start at a very high level. There are videos, and then there are discussions and things as well, and I believe their first lesson is predicting dogs and cats with images. And so yeah, fast.ai is great. Like I mentioned before, you can also look at Coursera, where there's a deep learning specialization, and so that'd be a great place to start as well. And then if you want something more like a college course, I think Stanford and MIT and others have put their college courses online. So that's kind of what I'd recommend.

Object Detection in Videos

Short description:

TensorFlow.js can be used for finding objects in videos by breaking up the video into frames and using object detection code to localize and identify multiple objects. Video is just multiple frames of images in a row. There are examples, including a webcam example, that demonstrate object detection in videos.

Can, let's see, can it be used for finding objects in videos? Yes. So if you look at TensorFlow.js, and you go to the models, then this second model here is object detection. And so what I would do if I was trying to find objects in a video is I would break up the video into frames, and then look at this object detection code, and try to find it in there. So yeah, this is how you can localize and identify multiple objects in a single image. So for video, video is just multiple frames of images in a row. So I would do that and then you can use this. This is great. I think the example here, I think is a webcam example. Maybe not. But there's a webcam example that looks at video. Maybe it's just this one. One of the examples uses the webcam to detect things.

Okay. If there's any more questions let me know, otherwise let me get my contact information back on here. There you go.
