And while we're dealing with this, we do validation: first offline, and then online. And this is where the concept of A/B testing comes into play. Maybe you are familiar with this concept, but here is exactly what we want to do. We want to have at least two versions of the model, and we route most of the user traffic to one model while redirecting only a tiny part of the traffic to the other. Then we compare how these two models perform and we keep the model that performs better. You can call it A/B testing, you can call it canary deployment; there are different variants of the idea, and that's part of what makes online validation harder.
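As a minimal sketch, this is roughly what such a traffic split could look like; the variant names, the predict interface, and the 10% canary fraction are all assumptions for illustration, not something prescribed here:

```python
import random

CANARY_FRACTION = 0.1  # send roughly 10% of requests to the candidate model

def route_request(features, current_model, candidate_model):
    """Route one request to either the current or the candidate model,
    tagging the result so the two variants can be compared later."""
    if random.random() < CANARY_FRACTION:
        return {"variant": "candidate",
                "prediction": candidate_model.predict(features)}
    return {"variant": "current",
            "prediction": current_model.predict(features)}
```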
So what we usually do online, if we look at the online case, is the same as we've done before. We start with a base class for the model. In this case I call it OnlineModel, just to make sure I differentiate between batch and online. In the batch case, we had the class that we used for running batch inference. In the online case, what we are going to do is create an API, in order to use our model online. It's the same story: we initialize the model, we load the model, and then we run prediction. And how does prediction work here? It takes some input features, or data in our case, and returns us back a prediction, and we can as well define what we expect back from the model. If it's an online model and we want to interact with it, then usually we do it via an app, and it can be a web app or a mobile app, depending on what you want. However, there is a difference between deploying to a mobile phone and deploying for use in a web application. So for now, let's concentrate on the web application.
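A minimal sketch of what such an OnlineModel wrapper could look like, assuming a scikit-learn-style artifact serialized with joblib (the serialization format and method names are illustrative assumptions, not fixed here):

```python
import joblib  # assuming a scikit-learn-style artifact; adjust for your framework

class OnlineModel:
    """Minimal wrapper for serving a trained model online.
    Mirrors the batch version: initialize, load, predict."""

    def __init__(self):
        self.model = None  # nothing is loaded until load() is called

    def load(self, model_path: str) -> None:
        # read the serialized model from disk (or a bucket) once
        self.model = joblib.load(model_path)

    def predict(self, features):
        # take input features and return a prediction
        return self.model.predict(features)
```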
So let's assume we have this model and we want to expose it via a RESTful API. For this we can use Flask. Flask is a Python-based framework created exactly for this purpose: it strips a lot of complexity away from you and lets you define the steps of exposing the model. There are plenty of Flask tutorials on the internet telling you how to use it, but I want to make sure you're aware of one simple thing. So let's create a new file, call it for example app.py, that I will use to get predictions from our online model.

Let's see what will be in there. Of course I start with importing libraries. In this case, I start with my favorite library, logging. Then I want to use Flask, so from flask I import what I need: Flask itself and, for example, request. Then I initialize my app, passing it __name__. And when we start, I want to make sure that at the beginning we don't have any model loaded.

Here is the thing that is different from running batch inference. In batch, you just load the model and use it. When you expose your model via a RESTful API, it's important to load the model only once. The biggest mistake I really see in beginners' scripts, when they try to copy the Flask tutorials, is that somebody forgets to define that the model should be loaded only once. What they're actually doing is loading the model every time a client sends a request, which takes time and adds to latency. You need to load the model only once and then wait for incoming requests.

That's why, if you're working with Flask, what we usually do first is register a hook on the app, something like "before serving", to make sure we load the model only once. Then we define the function that will actually be responsible for loading the model. What's important here: first, we use Python's global keyword to make this variable visible outside the scope of this function, so it's not only visible inside load_model, but it will be visible and reusable for all the other functions that follow. And what exactly happens in there? You have your model path, which you can define, you add your logging, and then you instantiate your model, it was called OnlineModel here, and you call its load method to load the model from wherever it's located, for example a bucket.

So first we created a basic template to serve our predictions, we made sure the model is loaded only once before serving anything, and we made it visible outside the scope of that one particular function. Then we go on to define the routes. For example, we take the traditional /predict route, and we define the methods that will be supported; in our case it will be just POST.
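Putting that together, here is a hedged sketch of app.py. Recent Flask versions removed the before_first_request hook, so this sketch simply calls load_model() explicitly before app.run() to get the same "load once, before serving" behavior; the module name online_model, the model path, and the port are assumptions for illustration:

```python
import logging

from flask import Flask, jsonify, request

from online_model import OnlineModel  # hypothetical module holding the class above

app = Flask(__name__)
model = None  # at startup, no model is loaded yet

def load_model():
    # `global` makes `model` visible outside this function's scope,
    # so the route handler below can reuse the already-loaded model
    global model
    model_path = "models/online_model.joblib"  # hypothetical location
    logging.info("Loading model from %s", model_path)
    model = OnlineModel()
    model.load(model_path)  # load exactly once, then serve many requests

@app.route("/predict", methods=["POST"])
def predict():
    raw_data = request.get_json()         # parse the JSON request body
    prediction = model.predict(raw_data)  # reuse the model loaded at startup
    # feature parsing/validation is omitted in this sketch; the prediction
    # is assumed to be JSON-serializable
    return jsonify({"prediction": prediction})

if __name__ == "__main__":
    load_model()  # the "load once, before serving" step described above
    app.run(host="0.0.0.0", port=5000)
```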
Remember Google Translate: you type your text in, and when you press the button, this is actually what happens there, a POST request is invoked. Then you define exactly what you mean by predict. In your case, you reference your global model to make sure it's available here, and then you get your raw data. This is exactly what you get from JSON, if you're using the JSON format, via request.get_json(). So you say: this is my raw data, and then I want to get my prediction from my model, so the prediction is simply my model's predict called on that data.
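For example, a client could exercise this endpoint like so; the host, port, and feature names here are placeholders, not anything fixed by the example above:

```python
# Hypothetical client call, assuming the app above runs on localhost:5000
import requests

response = requests.post(
    "http://localhost:5000/predict",
    json={"feature_a": 1.0, "feature_b": 2.0},  # made-up feature names
)
print(response.json())  # e.g. {"prediction": ...}
```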