The Hitchhiker's Guide to the Machine Learning Engineering Galaxy


Are you a software engineer who has been tasked with deploying a machine learning or deep learning model for the first time in your life? Are you wondering what steps to take and how AI-powered software differs from traditional software? Then this is the right workshop to attend.

The internet offers thousands of articles and free courses showing how easy it is to train and deploy a simple AI model. In reality, it is difficult to integrate a real model into existing infrastructure and to debug, test, deploy, and monitor it properly. In this workshop, I will guide you through this process, sharing tips, tricks, and favorite open source tools that will make your life much easier. By the end of the workshop, you will know where to start your deployment journey, what tools to use, and what questions to ask.

113 min
19 Jul, 2021



AI Generated Video Summary

This workshop covers the difference between AI-powered software and traditional software, the theory of AI-powered software, MLOps, AI pipelines, and the ML platform. It also discusses the challenges of AI versioning and MLOps, the importance of model training and versioning, and the process of model evaluation and deployment. The workshop explores pipeline automation, the ML metadata store, AI platforms and deployments, and options for offline serving in batch mode. It also covers working with batch models, defining input and output schemas, and performing batch inference and deployment. The workshop concludes with insights on offline and online deployment, A/B testing for online deployments, and tools for data versioning and ML deployment.

1. Introduction to AI-powered Software and MLOps

Short description:

Welcome to the Hitchhiker's Guide to the Machine Learning Engineering Galaxy. Today's workshop will cover the difference between AI-powered software and traditional software, the theory of AI-powered software, MLOps, AI pipelines, and the ML platform. The speaker has a background in data and development, and is involved in nonprofit organizations. They will also discuss the popularity of AI and MLOps and the different tasks in machine learning, such as classification, regression, and sequences.

Welcome, everyone. I'm pretty happy to be here. I'm really fascinated by MLConf EU; it's a really well-organized conference, I should say. I really enjoyed it. And I'm also happy that those of us giving workshops got the opportunity not to have all the workshops during the conference, but to spread them over the whole week, so everyone has an opportunity to watch the conference talks and then, at their own pace, join the workshops or rewatch the videos.

So once again, welcome, everyone. Today's workshop is about a topic that is really dear to my heart. If you don't stop me within two hours, I can speak for ages about it, and I decided to present it in a slightly specific way, I should say. So let's start. First of all, welcome to the Hitchhiker's Guide to the Machine Learning Engineering Galaxy. So what are we going to discuss today? The agenda. First I want to have a really short round of introductions, just to get to know a little bit more about you, and also to let you know why I'm the one here talking to you about this topic. Then we'll briefly cover the theory: what does it actually mean to have AI-powered software, and how does it differ from traditional software? What's the difference in deployment? What is MLOps all about: is it DevOps, or is it DevOps with a twist? What is an AI pipeline, or machine learning pipeline? What is an ML platform, and should I use one or not? And then we'll do a hands-on with AI pipelines. I decided not to do it in Python; we will do it in pseudocode that looks like Python. I'll explain why later.

So, who am I to talk to you here, and what do I actually do? I always joke that during the day I have a full-time job. Before the 1st of October, I was a principal data solutions engineer. I managed to say all of that, awesome. And from the 1st of October, I was promoted to AI and data engineering lead at a company named Linked. It's a Dutch consultancy. And during the night, okay, I'm in the Netherlands, so a lot of you know what people do in the Netherlands at night. In my case, I volunteer for nonprofit organizations, currently for two. One is called Women in Artificial Intelligence, where I work as an AI mentor and tutor for AI-powered female startups. I'm also an organizer of the PyLadies Amsterdam chapter. And PyLadies, unfortunately, is not about pies. Or maybe not unfortunately, I don't know; it depends on you. As a part of the global nonprofit PyLadies, we are free to do whatever we want, but we need to make sure that at least 95% of our time is dedicated to Python. And that's exactly what the PyLadies Amsterdam chapter is doing. Before corona started, we were doing offline workshops every month, for beginners and for more advanced people, on any topic you can think of: data science, ML-related stuff, and non-ML-related stuff like Flask, Django, and Python. As soon as COVID started, we officially switched to online, and now we're doing all our workshops online. By the way, our next workshop is free of charge and it's coming up. So, for those who are interested in knowing what named entity recognition is and how to work with parts of speech, please feel welcome. I will provide the links at the end of the talk. Also a little bit about me. I have a mixed background: math and optimization problems in math, and on top of that, management. In total, I now have, I think, 16 years of experience. It was always somewhere around data and development. 
And four years ago, I switched completely to development. A lot of people go from development to management; I did the opposite. I don't know if it was a good or bad decision. The future will show. I also really love to do a lot of things with my own hands, and from time to time I build different sensors and do some stuff with them. On the picture, maybe somebody can spot an Arduino. I have a Raspberry Pi too. So I'm actually seeing what's possible and what I can do with the sensors. But I think that's enough about me. Let's switch to the topic itself. Okay. So, the first question was: what exactly is different between AI-powered software and traditional software? And why is everyone talking about MLOps and AI now? Why is it so popular now, and why was it not so popular many years ago, although we know that some of the algorithms were developed 50 years ago or even earlier? Before diving in, I want to say a few words about the different tasks that we are doing. We always talk about classification or regression tasks. I also separated out sequences, because sometimes we need to treat them a little bit differently. In a nutshell, classification, binary classification: yes, I have experience with machine learning model deployments; no, I don't have experience with machine learning model deployments. Spam, not spam. It's a tumor, it's not a tumor. Regression: based on some values, I want to predict the price of a house; I want to predict some number. Sequences: there are a lot of things to discuss, but basically it can be a time series, a timestamp with some value, where based on what happened back in time I want to predict further ahead, plus different NLP topics related to it. But I think a lot of you already know what these are, so let's move forward. While we're talking about machine learning, sometimes you hear such terms as AI, machine learning, deep learning.

2. Difference between ML and DL

Short description:

The difference between machine learning and deep learning lies in the process of feature engineering. In machine learning, features are engineered manually, while in deep learning, the model can compute features on its own. Different models have different deployment and training requirements, such as using CPUs or GPUs. It's important to consider these factors when deploying and generating predictions. Feature selection is crucial in building and training models, where important features are evaluated and used to predict the target variable. This part requires further exploration.

What exactly is the difference, and why is it so important to differentiate them? In a nutshell, they are the same. You have a model; in this case, let's treat it as a black box. You have an input, in our case the data, and you get an output, in our case predictions. So we know that the model is a big, big function that takes data as an input and returns predictions as an output. The thing is, usually the model is not able to just take data as it is, because the computer is not able to understand what an image is, what text is, or what different categorical values are, so we need to process it. When we work with machine learning or deep learning, what we differentiate is the process of feature engineering. With machine learning, we usually engineer features on our own. With deep learning, we usually let the model, let's say, engineer features on its own, compute features on its own. Why it's important to differentiate is that, based on the type of the model, the deployment will be different and the requirements for the deployment will be different. Also, if you look just at the training part, we know that simple machine learning models can be trained pretty well on CPUs, depending on what exactly the deadlines are, while for the majority of deep learning models we need GPU instances to train them on. It's also important to say that with deep learning, the question of running predictions comes up as well: okay, how do we generate those predictions? Are we going to use CPUs or are we going to use GPUs? And if we're going to use GPUs, what are the... So all these questions should be answered upfront, you know, to make the best decision for the model deployment at the end. And also, about the features, just a simple example. If we are talking about a regression problem, predicting the price of houses, imagine tabular data.
In the rows you have some values; in the columns you have the number of bedrooms, location, anything else. Then you have the price of the house, the target that you actually want to predict. You evaluate these features, and based on some techniques you pick out the most important ones, and then you build and train your model. So this part is still pretty fuzzy. Let's dive deeper.
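To make the "evaluate the features, then pick the important ones" step less fuzzy, here is a minimal sketch in plain Python. It is not the technique used in the workshop: it assumes absolute Pearson correlation with the target as the scoring rule, and the column names and house data are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / sqrt(var_x * var_y)


def rank_features(rows, target):
    """Rank feature columns by absolute correlation with the target column."""
    feature_names = [name for name in rows[0] if name != target]
    target_values = [row[target] for row in rows]
    scores = {
        name: abs(pearson([row[name] for row in rows], target_values))
        for name in feature_names
    }
    # Highest score first: the feature that moves most closely with the target.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: each row is one house.
houses = [
    {"bedrooms": 1, "distance_to_center_km": 12, "price": 150_000},
    {"bedrooms": 2, "distance_to_center_km": 3, "price": 210_000},
    {"bedrooms": 3, "distance_to_center_km": 9, "price": 290_000},
    {"bedrooms": 4, "distance_to_center_km": 6, "price": 360_000},
]
ranking = rank_features(houses, target="price")
```

Correlation only captures linear, one-feature-at-a-time relationships, so in a real project this would be one of several checks rather than the final word on which features matter.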

3. Challenges of AI Versioning and MLOps

Short description:

The biggest problem with AI is the need to version code, data, and models. Versioning data is challenging due to limitations in tools like Git. The importance of versioning becomes apparent when considering changes in features and their impact on model performance. Deployments require consistent performance and the ability to roll back models if necessary. Testing and monitoring AI models is more complex than for traditional software due to the need to test data and models. Monitoring includes tracking data distribution changes and other metrics. Companies investing in AI need to measure return on investment and understand the value of the model. MLOps involves AI pipelines, where data is preprocessed, models make predictions, and results are post-processed. The evolution of MLOps pipelines is illustrated by Google's MLOps level zero, which involves data preparation and model training by data scientists and engineers.

The biggest problem currently is that with AI, we face a new paradigm. Before that, for traditional software, we write the code, we version this code, we package this code, we release this code, and we are good to go. We update some features, we release again, we use CI/CD pipelines, and we are good to go. But what exactly is happening in the world of AI? Here we found that we need to version at least three things. We need to version code, we need to version data, and we need to version the model. Of course you can ask, what do you mean by versioning a model, or data? How is it possible to version such things? For example data, how can you version data? What does it mean? For example, if I want to use Git, GitHub does not accept files larger than 100 MB. Of course you can use Git LFS, which is able to store more than 100 MB, but what exactly is it and why is it so important? Let's pick up the same example of house price prediction. In the past, you knew that particular features were really important and your model was doing really well. As soon as COVID started, there was a shift in features, and maybe now some features are more important and other features are less important. Just think about features such as proximity to the supermarket or distance from other houses. So in this case we really need to know what's happening with incoming data. Are there any features that we were not seeing before? And how can these features actually make an impact on the performance of our model? And when we're talking about the model version, what do we mean by that? Okay, I can create a model, I can package it, and I can assign a version to it. Is that all that is necessary? Why do I need this? With the model we have a lot of moving pieces. If we're talking about simple machine learning models, you will have different parameters. If we're talking about more sophisticated deep learning models, we have even more parameters. 
And the bigger the model is, the bigger the number of parameters as well. So without versioning models properly, without registering models properly, it's really hard to keep control. And when we're talking about deployments, what we want is for the performance that was achieved before to be the same in production. This is pretty important. We also need to have a way to roll back the model if something happens to it. So, to sum up this part: we have machine learning and we have deep learning, where the difference is in compute and in feature generation, although for deep learning models you can engineer features as well. And we have traditional and, let's say, AI-powered software: you need to version code for the traditional one, and you need to version code, data, and model for the AI-powered one. Also, as you can guess, if we have so many things to version in the case of AI, it has a big influence on the testing part, and it has a big impact on the monitoring part as well. And if you look at the most problematic questions that pop up in the AI world, they are actually: how to test models, how to make sure that the model performs well, how to know when to retrain the model, how to know when the performance of the model degrades, how to make sure that all components are working properly, and is there anything extra that I need to do in comparison to traditional software? And yes, indeed, you need to do more, because if you version data, you need to somehow find a way to test data, and if you version a model, you need to somehow find a way to test this model. And there is even more because of the different types of deployments that we'll see today. And as I just said, there is a difference in testing and a difference in monitoring. That is the difference between MLOps and DevOps: MLOps is DevOps plus two extra things on top of it. 
In the case of DevOps, okay, let's assume that we are going to monitor latency, throughput, RAM usage, CPU usage, just the basic metrics that are expected to be monitored. And at the same time, what's going on with the model? We actually need to monitor the data. Remember, we are versioning data. So there is the data the model was trained on, and then we need to monitor what's coming into the model. There is a concept called input drift, when your data distribution changes over time, and you want to spot it. That was the example with the regression model, where you have features that were important before COVID and are unimportant now. Or, if you're talking about an NLP-related problem, it can be new words that appeared in the language that we have never seen before, or maybe we were oriented on one particular type of words and we don't see these words anymore. We see completely new words, so the model is unaware of what to do with them. We need to monitor a lot of other metrics as well. And on top of this, a lot of companies want to invest money in AI, as it is still a hot buzzword, but they want to measure their return on investment. And of course they want to invest as little as possible and get the results as soon as possible. So when you're working on a machine learning or deep learning project, it's always good to understand what you're going to do and what the expectations of the business are. What is the value of this model? How do we measure the value of this model? Because a lot of companies confuse it and say, let's measure it with accuracy or F1 score. It's a little bit different. So yeah, we see that a lot of things are happening. But now let's dive a little deeper into what MLOps is and what we actually mean by AI pipelines. So this is a really simplified AI pipeline in action. 
We know that we have a model somewhere, and we know that the model is not able to just digest data. The data should be preprocessed, and in the majority of cases it's just floats that go into the model. And when the model returns something, for classification or regression or any other problem, that result should be post-processed as well, and that's what is missing in a lot of cases. I was looking for the best way to represent the evolution of the MLOps pipeline, and I think the best option I found is the three pictures that you will see next. They come from the Google machine learning website. Personally, I didn't have any production experience with Google. I have production experience with Azure and AWS. I only played with Google a little bit, but at the same time, I think this information totally reflects the current state of the new machine learning projects that are happening in companies. So this is what Google calls MLOps level zero. This is the place where we have a lot of manual processes. What's happening there, from the left side: you have people, data scientists, data engineers, data analysts, or any other names can be there. They are working on getting some data and making some sense out of this data, solving different problems and answering the business questions. So they do some data preparation. It can be done in different ways. If you're familiar with them, it can be done with Spark. It can be done with pandas. It can be done with NumPy or with any other stack. It can be done in R. So different options are possible here. It can be done with just simple SQL. It depends on the situation. It depends on the types of the data. Then there is a model that is trained. Usually that happens on the local machine.
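The input drift idea from this section can be sketched in a few lines. This is only an illustration under stated assumptions: it scores drift as the shift of the incoming mean measured in training standard deviations, the threshold of 2.0 is an arbitrary choice, and the feature values are invented. Real monitoring setups typically compare whole distributions, not just means.

```python
from statistics import mean, stdev


def drift_score(training_values, incoming_values):
    """How many training standard deviations the incoming mean has moved."""
    spread = stdev(training_values)
    shift = abs(mean(incoming_values) - mean(training_values))
    if spread == 0:
        # Training data was constant: any change at all counts as drift.
        return 0.0 if shift == 0 else float("inf")
    return shift / spread


def has_drifted(training_values, incoming_values, threshold=2.0):
    """Flag a feature whose incoming mean moved past the (assumed) threshold."""
    return drift_score(training_values, incoming_values) > threshold


# Hypothetical feature: distance to the nearest supermarket, in km.
train_distance = [0.5, 0.8, 1.0, 1.2, 1.5]
similar_batch = [0.6, 0.9, 1.1, 1.3]
shifted_batch = [3.0, 3.5, 4.0, 4.2]

alerts = {
    "similar_batch": has_drifted(train_distance, similar_batch),
    "shifted_batch": has_drifted(train_distance, shifted_batch),
}
```

A check like this would run per feature on each incoming batch, and a positive result is the kind of signal that can later feed a retraining trigger.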

4. Model Training and Versioning

Short description:

Model training is an iterative process that requires multiple iterations. It can be done in various environments, such as local machines or cloud platforms like Azure and AWS. It's important for companies to invest in model versioning and data versioning to track experiments and store metrics and parameters.

So you're actually bound by the computation power of your local machine. And the problem with model training is that it's an iterative process. You're not able to just train the model once. I wish that were possible, so that I could train it once and then just do something with it. Usually this happens in different environments. It can happen on a local machine, in Jupyter notebooks. If you work with Azure or AWS or Databricks, then you can do it in their notebooks. So it can be anywhere. It's good if the company has invested some time and people are at least a little bit aware of the importance of model versioning, and even of versioning data, as well as of storing all metrics, all parameters, and all hyperparameters somewhere, let's say to track the experiments. I will talk a little bit about this part later.
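As a rough illustration of tracking experiments with versions, metrics, and hyperparameters stored together, here is a small sketch. Everything in it is an assumption for illustration: the record layout, the `record_run` helper, and the use of a truncated SHA-256 hash as a version identifier. In practice a model version would normally be derived from the serialized model artifact, and a dedicated experiment tracking tool would store the records.

```python
import hashlib
import json


def fingerprint(obj):
    """Stable short hash of any JSON-serializable object."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def record_run(code_version, training_rows, hyperparameters, metrics):
    """Build one experiment record that ties code, data, and model together."""
    data_version = fingerprint(training_rows)
    return {
        "code_version": code_version,
        "data_version": data_version,
        "hyperparameters": hyperparameters,
        # Assumption: same code + same data + same settings => same model.
        "model_version": fingerprint([code_version, data_version, hyperparameters]),
        "metrics": metrics,
    }


# Hypothetical rows: [bedrooms, area_m2, price].
rows = [[3, 120.0, 250_000], [2, 80.0, 180_000]]
params = {"learning_rate": 0.1, "max_depth": 3}

run_a = record_run("abc123", rows, params, {"rmse": 21_500.0})
# Same code and settings, but one more training row: a new data version.
run_b = record_run("abc123", rows + [[4, 150.0, 320_000]], params, {"rmse": 19_800.0})
```

With records like these, the question "which data and which settings produced the model that is in production?" can be answered by looking up a single entry, and that is also what makes a rollback possible.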

5. Model Evaluation and Deployment

Short description:

Model evaluation and validation are crucial in the development process. The ideal scenario involves delivering a packaged model rather than just a file. Collaboration between data scientists, engineers, and software developers is essential. However, miscommunication and difficulties in deploying and testing can arise. Companies with one or two models typically expose them through RESTful APIs using Flask. This process can be rudimentary and prone to issues.

And then there is model evaluation and validation, the standard situation. We always start with offline validation and evaluation. First, you take a look: okay, what are the possibilities? Does the model perform as expected? And then you see whether it was not evaluated properly or whether it needs much more investigation. What do you end up with? Worst case scenario, you have just a Jupyter notebook with some code. Best case scenario, you have Python scripts, or other types of scripts, that are perfectly modularized, that are dockerized, and that are, let's say, ready to be packaged, or it's already a package. That's the best, best, best case scenario. So if you're working for a company and your data scientists, data engineers, and machine learning engineers actually deliver not just a pickle or H5 format model, but a package, you're at the best place. If not, you're still at the best place, because you're at the best place to develop. You're at the best place because you can grow together and learn together how to improve the whole process. And then, if it's a good scenario, you end up with something like a model registry that says, okay, this was the model trained by this person at this time. And then the business part stops at this place. There is a big, big, huge fence, and behind it are the IT people, the software developers, who are actually responsible for operations, the operation of this model. We had this issue before with DevOps, and we see it now as well with MLOps, because usually the data scientists sit together with the business people, while the data engineers, machine learning engineers, and software engineers sit together with the IT people. 
So there is some miscommunication. And if the primary language of the company is not Python or R, then sometimes it's really hard to understand what exactly this binary artifact means: how can I deploy it, how can I test it, how can I prepare it for staging, how can I prepare it for production, and how do I monitor it? All these questions definitely should be answered. And in this case, it's usually about companies that have one, maximum two models. Usually they expose their models via RESTful APIs, usually with Flask powered by Gunicorn as the WSGI server, and that's how it works. But it's a really rudimentary and really buggy process, because a lot of things can happen along the way, and a lot of things can actually, yeah, get spoiled.
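To show what "exposing a model via a REST API" boils down to, here is a minimal sketch that uses only the standard library. It is a bare WSGI callable, which is the interface that Flask applications implement and that Gunicorn serves; it is not Flask itself. The `predict` function is a made-up stand-in for a real model, and the request shape is an assumption.

```python
import json


def predict(features):
    """Stand-in for a real model: a made-up linear price formula."""
    return 50_000 + 40_000 * features["bedrooms"]


def app(environ, start_response):
    """WSGI callable: POST a JSON body, get a JSON prediction back."""
    headers = [("Content-Type", "application/json")]
    if environ.get("REQUEST_METHOD") != "POST":
        start_response("405 Method Not Allowed", headers)
        return [json.dumps({"error": "use POST"}).encode("utf-8")]
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
        features = json.loads(environ["wsgi.input"].read(length))
        body = {"prediction": predict(features)}
        status = "200 OK"
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, a missing field, or a field of the wrong type.
        body = {"error": "expected a JSON object with a numeric 'bedrooms' field"}
        status = "400 Bad Request"
    start_response(status, headers)
    return [json.dumps(body).encode("utf-8")]
```

Assuming the file is saved as `serve.py`, a WSGI server such as Gunicorn could run it with `gunicorn serve:app`. Notice how much is still missing from this sketch: input validation beyond one field, model loading and versioning, logging, and monitoring, which is part of why this setup gets fragile as the number of models grows.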

6. Pipeline Automation and ML Metadata Store

Short description:

This part focuses on pipeline automation for companies with at least two models in production. It emphasizes the need for automatic steps in the experimentation development phase, where a service repository with best practices is used for pipeline deployment. The importance of an ML metadata store and model monitoring is highlighted, along with the possibility of a fully automated CI/CD pipeline for companies with 10 or more models in production.

So the second level; I know it's a huge, big, big picture. This is the pipeline automation part. This is where companies that have at least two models in production want to invest more time and effort to make some manual steps automatic. So this is what we are trying to do in this experimentation and development phase. We try not to have just a model serialized as it is, in pickle or H5 format. We want to have at least something that resembles a package, or something that at least has some pipeline. Remember, we need to pre-process data to get it digested by the model, and we have the post-processing of the model output. So in this case we have a source repository where we can have some best practices in place, and then we say: okay, now we are not talking about model deployment, now we're talking about pipeline deployment. And that's exactly what's happening here. The pipeline knows what to do, the pipeline knows how to pre-process data, how to do all this magic, and then return the data. Here we also start to speak about a really important thing, the ML metadata store. If we want to work with different models, we want to keep metadata about all the models that we have. This comes on top of the model registry that we use. And then, if we have trained models and we serve models, we want to monitor them, understand what's happening, and have some specific triggers that we can actually use to retrain the model. And the third level, MLOps level two, is when you have the CI/CD pipeline fully automated. Is it possible? Yes, of course it's possible. It depends on the tooling, and it depends on the knowledge that the company has. It's possible. And this is usually used by companies that have at least 10 models in production, because it requires some engineering effort to build it. 
Some build it from scratch, others reuse open source components; it depends on the situation.
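The move from "deploying a model" to "deploying a pipeline" described in this section can be sketched as one object that owns pre-processing, the model call, and post-processing. This is an illustrative sketch only: the class name, the toy model, the feature names, and the output format are all hypothetical.

```python
class PricePipeline:
    """Bundles pre-processing, the model, and post-processing in one unit."""

    def __init__(self, model, feature_order):
        self.model = model
        self.feature_order = feature_order

    def preprocess(self, raw):
        # The model consumes plain floats in a fixed order, not raw records.
        return [float(raw[name]) for name in self.feature_order]

    def postprocess(self, raw_output):
        # Turn the bare number into something a caller can use directly.
        return {"predicted_price": round(raw_output, -3), "currency": "EUR"}

    def predict(self, raw):
        return self.postprocess(self.model(self.preprocess(raw)))


def toy_model(vector):
    """Made-up stand-in for a trained regression model."""
    bedrooms, area_m2 = vector
    return 30_000.0 * bedrooms + 2_100.0 * area_m2


pipeline = PricePipeline(toy_model, feature_order=["bedrooms", "area_m2"])
# Raw input may arrive as strings or ints; the pipeline normalizes it.
result = pipeline.predict({"bedrooms": "3", "area_m2": 86})
```

Because the caller only ever sees `predict`, the pre-processing and post-processing are versioned, tested, and deployed together with the model instead of being reimplemented by whoever consumes it.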

7. AI Platform and Deployments

Short description:

Here we have the models that are packaged together with pipelines, with pre-processing and post-processing, and everything happens automatically. We discuss the difference between using different tools for ML deployments and the importance of an ML platform. There are millions of tools and processes involved in ML and AI infrastructure, but today we focus on deployments. All-in-one solutions like FBLearner, Uber's Michelangelo, Azure Machine Learning, and Amazon SageMaker are examples of platforms that streamline MLOps processes. The Babelfish concept is introduced as a metaphor for understanding and navigating the vast array of tools. The first rule is not to panic, and we will now explore AI pipelines in pseudocode. The materials and code for the session will be shared via a GitHub link. Additionally, a list of recommended projects and references will be provided.

So what's the difference here? Here we have models that are packaged together with pipelines, with pre-processing and post-processing, and everything happens not manually, but automatically. As soon as a model pipeline is added to the source repository, we spin up a continuous integration process. We build, test, and package the components. So we get a package, then we deploy it and we can test it. And then we know exactly when to retrain the model.

That being said, okay, it looks nice, but once again, should I build everything from scratch? Should I do everything on my own? What's the point of all of it? Or can I just reuse something? I was hearing questions like: why not use H2O? Why not use Azure Machine Learning services? Why not use AWS SageMaker services? Let's take a look at that. It will take us three to five minutes, not more than that. And then we'll switch to the Q&A part and then to the hands-on part.

Okay. So, the ML platform: what is it, and why do we need an ML platform? If you're a small company, maybe it's not a good idea to build one from scratch. You can just reuse what is there. If you're a big company with a lot of processes, and you have cascade deployments where the output of one model is fed into the input of another model, and the output of that model is fed into the input of the next model, and you really need to monitor a lot of things, then maybe you need it. But for now, if we take a look at the whole ML and AI infrastructure, it is a little bit hard to understand, because there are so many tools and so many processes. Now we are talking about deploying models, which is only a small part of a bigger pipeline. And here you have the tools, from the left side. You have tools for data preparation. You have tools to version data. You have tools for feature engineering and feature storage. You have tools for data labeling and quality checks, and then you have a whole bunch of tools dedicated to model building. And then you have a whole bunch of tools to monitor the models in production as well. So once again, there are millions of tools. The questions are: should I use this tool or that tool, what's the difference, what are, let's say, the pros and cons? It usually depends on the situation. And if we look at it from this side, today we're talking only about deployments. We have different options for where and how we can deploy the models. But once again, if you look at the picture by level of abstraction, we go from just basic hardware and cloud storage things up to the platforms. And on top, for example, you have examples of, let's say, all-in-one solutions, like FBLearner. 
This is a platform that was originally created at Facebook to streamline their MLOps processes: to make sure that the models are deployed on time, that the models are monitored properly, and that you can do different experiments with the deployments. Then you know exactly when to roll back the deployment of a model, and everything should be automated. Imagine you need to manage 100 models: if you don't have one system to rule them all, then you have a lot of code duplication and it's just unmaintainable. So you have FBLearner, as I mentioned, and then you have Uber; they built their own system, the Michelangelo system. They do online and offline model serving as well, and they noticed that in some situations they had a lot of issues. They tried to solve them, but also noticed that it's really hard to solve everything at once. And then you have, for example, it seems to me that this is an old picture, Azure Machine Learning and Amazon SageMaker, and then the rest. But if you look at all of this, it actually sounds like you need a Babelfish. If you didn't watch the movie or read the book, The Hitchhiker's Guide to the Galaxy, I'll just explain what it is. A Babelfish is actually a low-latency translator. You just put it in your ear and you can immediately understand what people around you are saying. And at the same time, if you want to say something to people, a person who has a Babelfish can understand you as well. So it's like a universal translator. So in this case, you saw that there are millions of tools and millions of things to do. And you have just reached the point where you have a model: I have a POC, I built a proof of concept, so I have this model to solve this particular business issue. I just want to try to build it further, to make an MVP out of it, and I want to try to deploy it. Where do I start? I'm lost. 
So the first rule of the engineering galaxy is: don't panic. What we're going to do now is go a little bit hands-on with AI pipelines. As I mentioned before, the hands-on will not be in Python; it will be in pseudocode, so that you can understand the concepts and the important things. But before doing that: these slides will be shared with you. You can find the slides here, and I will copy this GitHub link and send it to you. All the materials and all the live coding that I will be doing during today's session will be there, so you can get back to it anytime, rewatch it, or whatever you want. There was a really good question; I don't know whether it's visible to everyone. The question is: what do I suggest as follow-up, like projects and references to read after the workshop? I was prepared for this question, so it's pretty easy. I created a whole list for you of where to go next, as well as a list of my personal favorite open source tools that I love to work with, which I will briefly cover at the end. I think this full stack deployment course is one of the best courses ever taught. First it was taught at one of the biggest universities in the United States, and then they decided to make it public. Now it's there, and it's free of charge.

8. Introduction to Productionizing NLP Models

Short description:

The article on productionizing NLP models is highly recommended for those working with NLP models. It provides valuable insights from a skilled writer with deployment skills and lessons learned.

It doesn't require anything from you, only your time and effort. And then there are some useful topics to explore, and this specific article on productionizing NLP models. I'm not a big fan of Medium articles. I try to write some, but personally I find it a bit hard to write rather than just consume other people's content. But this one is really good. If you're working with NLP models, there are a lot of things you should think about, and this article is clearly written by a person who has deployment skills and has learned the lessons. So it's a really cool one. Take a look at it.

9. Deploying Machine Learning Models

Short description:

If you're a certified TensorFlow developer, find a suitable use case and showcase how you will solve it with TensorFlow. You can showcase this project on your GitHub, GitLab, BitBucket, and commit to the open source. Having a certificate and being an open-source committer for TensorFlow gives you extra pluses. Intermediate projects are crucial to applying the knowledge you've gained. Different people prefer different development environments, such as Visual Studio Code or PyCharm. When deploying a machine learning model, consider how the end user will interact with it. There are interactive and non-interactive use cases, depending on whether the model is accessed directly or runs in the background. Interactive use involves sending a prediction request to the model's API, while non-interactive use includes scheduled jobs that extract, transform, and load data.

Oh, I think it's a really good question, so I will just answer it live. The question was: how can I make money to save up for college if I'm a certified TensorFlow developer? Frankly speaking, I think it's not so important whether you are certified or not, because certificates are mostly a way for companies to earn money. I actually recommend doing something practical. For example, if you're a certified TensorFlow developer, okay, you know the specific concepts of TensorFlow. Find a suitable use case, for example something that is bothering you or something that is bothering, I don't know, your friends, relatives, or parents, and showcase how you would solve it with TensorFlow. What are you going to do? Maybe it will be a really cool deployment or something. That's one option, and you can showcase this project on your GitHub, GitLab, or Bitbucket. Another option: you can contribute to open source, to TensorFlow itself. Then, if you decide to start the job search, it is really good to be able to say in your resume not just "I have a certificate" but "I have a certificate and, on top of it, I'm one of the open source committers for TensorFlow, which you as a company are actually using in development and in production". So it gives you extra pluses, I should say. I think it's a really good question, and I hope that answers it. Okay, folks, two or three minutes to go and we will start with the second part. Also, what I want to say: at our company we run bootcamps, and we train data engineers, machine learning engineers, and, more recently, cloud data engineers. And I see a lot of newbies.
They spend a lot of time on learning and they say: okay, I will finish this course, then I will finish that course, then I will finish another course. But for me, the question is: what are your intermediate projects? If you don't try things on your own in practice, it will be really hard to do something with the knowledge that you gained. So, yeah. There was another question: what is the environment that you show us here? I assume you mean the one that is on the screen now. If you mean something else, just let me know, but this is just an IDE for development. I can't speak for everybody; different people prefer different things. I'm using Visual Studio Code because I develop not only in Python, and for me it's easy to connect and easy to use. I know colleagues who work intensively with Python and use PyCharm. I didn't use it because I didn't need it; for me, Visual Studio Code is enough. I have just the basic layout here, I'm running it on macOS right now, and I'm just using the terminal, which is built on top of Bash. So that's what I'm using right now. Okay, cool. So folks, let's start the second part. I called it hands-on, but once again, as I mentioned, we will do it in pseudocode. When somebody comes to us with a task to deploy a machine learning model, we usually say: tell me first how the end user is going to interact with this model. What do I mean by that? If you want to get a prediction from the model, what steps are you going to take? Based on that, there are a lot of other factors that we usually need to consider to determine how to deploy a machine learning model. When I say we, I mean machine learning engineers and data engineers.
There is still no clear division in who is doing what. So the first thing we look at is the usage. Are we going to interact with the model directly or not? What do I mean by that? An example of interactive use: you have a model exposed via an API, you send a prediction request to the model, the model does something, and then you get the prediction back. It can be a bounding box for a computer vision task, it can be something for an NLP task, or it can be the price of a house, for example. Non-interactive use is when your machine learning model is doing something in the background, for example as a part of ETL jobs (extract, transform, load) or ELT jobs (extract, load, transform). In this case the model, based on some particular schedule, gets data in, returns predictions, and saves the predictions somewhere. It can be any data store you can think of: a SQL database, a NoSQL database, or object storage like Azure Blob Storage or Amazon S3 buckets.

10. Options for Offline Serving in Batch Mode

Short description:

In this part, we discuss the options of using a machine learning model for single or multiple predictions, synchronous and asynchronous requests, and the differences between real-time, near real-time, and non real-time deployments. We focus on offline serving in batch mode, where prediction jobs are scheduled at regular intervals and the results are stored for further analysis. Throughput is important in this scenario, and model validation is performed offline. We will demonstrate working with offline predictions using pseudocode.

So there can be different options. Then we think about whether we are going to use this model for a single prediction or a bunch of predictions. What do we want? Do we want to send one request to the model, get a prediction back, and use it? Or do we want to send a lot of requests at the same time and get all the predictions back from the model?

Also, we are talking about synchronous and asynchronous requests. In the synchronous case, I send a request and I wait to make sure that the response gets back to me. In the asynchronous case, I can use a queue, for example: I send a request, it goes off, and I don't wait for what this request will return to me.
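To make the distinction concrete, here is a minimal sketch using only the Python standard library. The predict function is a stand-in for a real model, and the names request_queue, results, and submit_async are made up for illustration. In a real system the queue would more likely be a message broker than an in-process queue.

```python
import queue
import threading


def predict(features):
    """Stand-in for a real model (assumption): returns the sum of the features."""
    return sum(features)


# Synchronous: the caller blocks until the prediction comes back.
def predict_sync(features):
    return predict(features)


# Asynchronous: the caller only enqueues the request and moves on.
# A background worker picks requests up and stores the results.
request_queue = queue.Queue()
results = {}


def worker():
    while True:
        item = request_queue.get()
        if item is None:  # sentinel value: stop the worker
            request_queue.task_done()
            break
        request_id, features = item
        results[request_id] = predict(features)
        request_queue.task_done()


def submit_async(request_id, features):
    request_queue.put((request_id, features))  # returns immediately


thread = threading.Thread(target=worker, daemon=True)
thread.start()

sync_result = predict_sync([1.0, 2.0])   # we wait for the answer here
submit_async("req-1", [3.0, 4.0])        # we do not wait here

# Shut the worker down; joining is only needed to make the example deterministic.
request_queue.put(None)
request_queue.join()
thread.join()
```

The synchronous caller pays the full model latency on every call. The asynchronous caller returns right away and somebody reads the stored result later.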

There is also a difference between deploying a machine learning model in real time, near real time, and non real time. What's the difference between these three? Non real time means, for example, non-interactive batch use: once per day, the machine learning model gets information from the marketing department, clusters the clients, and returns the segmented clusters of customers in, I don't know, any format that you need. This is definitely not real time. Then you have real time and near real time, and it can be a little bit confusing, because you can say: wait a second, what's the difference? Near real time is when the model is deployed, for example, in a web application: you send a request and you wait a little bit, sometimes some milliseconds, for the model to do its work and get back to you. An example is Google Translate. You put text there, you click the button, and under the hood the system spins up the model, spins up the embeddings, and does everything that needs to be done to give you a result back in another language. This is near real time. Real time is when we talk about streaming. It can be Kafka, or if you're on AWS it can be Kinesis, or any other option. Imagine that we have IoT sensors streaming data, and these are sensors for a mission-critical environment. In this case you want to spot issues immediately. So, for example, if the reading of the humidity sensor drops, you need to send an alarm immediately. That is real time. Today we will do two types of deployments; once again, they will be pseudo-deployments. We'll try offline serving in batch, and we'll try online serving in near real time. Why are we starting with offline? Because offline is a little bit easier and simpler than online serving.
And for the majority of the cases you will be working with, I'm pretty sure it will be offline serving. So here we are talking about offline serving in batch. We have some prediction job scheduled. It can be based on your requirements: once a week, once a day, once an hour, sometimes more often, sometimes once every three days. For example, you get IoT data, I don't know, from some sensors in a plane, and it gets delivered to the data lake of your company. From this data lake, once every four days, you spin up the prediction job, it does the trick, and you get the predictions stored somewhere. These predictions will later be queried by BI analysts, using SQL for example. What's important here? Latency is not so important, so we don't need an immediate answer; we have more time. What's important here is throughput. We need to make sure that the model is able to handle a lot of requests simultaneously and to process data as efficiently as possible. As I mentioned before, in this case the end user does not interact with the model directly. We interact with the results of the model: we've got something stored, for example in the SQL database, and then we work with it. As for model validation, because the model runs offline, we validate it offline as well. So now I want to show you how you would work with offline predictions, and I will do it in pseudocode, live. After that, we'll take a short break, go through questions, and then continue with the online options.

11. Working with Batch Models and Model Loading

Short description:

If you work with batch models, you can modularize your code using classes. Initializing the class and loading the trained model are the first steps. The model can be stored in various places, such as S3 buckets or Azure Blob storage. The model should handle batch input features and return predictions. It's also important to handle errors when loading the model and define metadata and input/output schemes.

And also, if you look at these things: in this case, if it's offline, it's non-interactive use because we don't interact with the model. It's batch, so we process the information in big chunks and try to vectorize the processing as well. And it's non real time.

So for the batch, I have here a dummy document, a dummy class. It's a Python script, but once again, it's dummy code, and you don't even need to set up a Python environment to run it. In case you want to go further with it, here is what I highly recommend instead of having millions of virtual environments and then installing tools to manage those virtual environments.

I recommend using Conda, and it is listed right here. I don't know whether you're familiar with Conda or not. There are two options: you can install Anaconda or Miniconda. I prefer to work with Miniconda because it's easy, simple, and lightweight. The Anaconda distribution has all the necessary data science and data analysis libraries. For me, personally, it's too much. But Conda is really good, because if you work with pure Python packages you're good to go with plain Python tooling, but if you're going to use something like Facebook Prophet, if you've ever worked with this time series library, then you will be doomed, because you need to manage a lot of dependencies.

And some of the dependencies are not in Python. In my work experience, Conda environments have saved me a lot of time. What exactly does it do? It does package, dependency, and environment management, and it supports different languages. Instead of providing a requirements.txt file, you provide a Conda YAML file, and in the YAML you specify everything that is needed to run. But let's go back to our batch model. Imagine that you have a model, you trained it, and it's ready. Remember that when we want to work with a model, we need to remember the pre-processing step and the post-processing step.

So in our case, if we get raw data, we need to pre-process this data to make it digestible by our model. Then we want to return the predictions; in our case, we decided to write them to, for example, a SQL database, so the output should be compatible with this database and its schema. What I usually prefer to do is modularize the code and provide enough abstraction to make it understandable not only for the data scientists who created the model, but for the rest of the software engineers on the team, so they will know exactly what to do with this model in the future.

So for this reason, if you're working in Python, I highly encourage you to look at the object-oriented part of Python and start working with classes, because you can create a base model class and define exactly what you are going to do with the model. So that is the concept of classes. If I need to deploy a model in batch, I usually do three simple steps, starting with a simple batch model. I start by designing a class. For those who are familiar with Python, or those who know classes in theory, it's more or less the same as in other languages. The first step is to initialize the class, to make sure that you can create an instance of it in the future. Then you have a trained model, serialized in the specific format that you need. If I want to use this model, I need to take it and load it. And then, because we're talking about batches, I expect that my model will take the batch of input features, do some calculations for me, and return predictions. So in this case, I just added a basic docstring saying: okay, we'll predict the batch, we have these parameters, and this is what we are going to return. What can be really helpful here as well, because we are working with a batch model, is to add some extra layers. What I'm doing here is just defining class methods. If you're used to working only with functional programming, think of it as defining functions. If you're used to object-oriented programming, I'm just creating the methods, the things that your object can do. So it helps to think it through: okay, I need to initialize this model, good.
Then I need to load this model. The model can be somewhere in storage, in S3 buckets, in Azure Blob Storage, or any other place, so I need to find a way to load it safely. In this method, I will handle all the errors that can appear, for example if I try to load the model and the path doesn't work anymore. If there are any other issues, I want to handle them here. Then predict_batch. I just want to create a contract, an abstraction, so that I know this part handles taking raw data, processing it, and giving me something back. Or, another option, I can treat it simply as: these are the input features for the model, and I get predictions as the result. What else can be useful here? Because we mentioned that versioning of models is important, you can create a method that just returns the metadata. We can call it, for example, metadata of the model: it takes the model and returns its metadata. As easy as that. Why is it important? Remember when we were talking about the second stage of MLOps, when you have parts of the tasks automated. This is the place where you will be using the ML metadata store, and this method really comes in handy, because you can just call it and the model will tell you which version it has, or any other extra information that is needed in the metadata. What you can define here as well, and what can help you a lot: I prefer to always, if it's possible of course, define an input and output schema. It can be done inside of this method, or it can be done outside of it.
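The class described here could look roughly like the sketch below. Everything in it is an assumption made for illustration: the model is a toy linear model stored as JSON, and the method names load, predict_batch, and metadata simply follow the talk. With a real framework you would swap in that framework's own loading and prediction calls.

```python
import json
from pathlib import Path


class BatchModel:
    """Thin wrapper around a serialized model for batch (offline) inference."""

    def __init__(self):
        self.model = None  # nothing is loaded until load() is called

    def load(self, path_to_model):
        """Load the serialized model and fail loudly if the path or file is bad."""
        path = Path(path_to_model)
        if not path.is_file():
            raise FileNotFoundError(f"No model found at {path}")
        try:
            with path.open("r", encoding="utf-8") as handle:
                self.model = json.load(handle)
        except json.JSONDecodeError as exc:
            raise RuntimeError(f"Could not deserialize model at {path}") from exc
        return self

    def predict_batch(self, features):
        """Take a batch of feature rows and return one prediction per row."""
        if self.model is None:
            raise RuntimeError("Call load() before predict_batch()")
        weights, bias = self.model["weights"], self.model["bias"]
        # Toy linear model: dot product of each row with the weights, plus bias.
        return [sum(w * x for w, x in zip(weights, row)) + bias for row in features]

    def metadata(self):
        """Return what a metadata store would want to know about this model."""
        if self.model is None:
            return {"loaded": False}
        return {"loaded": True, "version": self.model.get("version", "unknown")}
```

Keeping the error handling inside load means the scheduled job gets one clear failure when the path is wrong, instead of a confusing error later during prediction.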

12. Defining Input and Output Scheme for Batch Model

Short description:

I prefer to always define an input and output scheme for the batch model. I instantiate the object, get the raw data, create the model instance, preprocess the data, and generate predictions. Finally, I write the predictions to a SQL database.

So for the input schema, I will just call it for now input data, or, to make it more verbose, schema input data. Here I will be describing the schema, and it can be JSON or any other format that we need. And then I can add the output schema here as well. It's just really, really basic. Now, what do I expect from this class? What will it be doing if I want to use it?
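One lightweight way to write such a schema down is a plain mapping from column name to expected type, checked before and after prediction. This is only a sketch: the column names and the validate helper are hypothetical, and a real project might use JSON Schema or a validation library instead.

```python
# Hypothetical schemas: column name -> expected Python type.
SCHEMA_INPUT_DATA = {"customer_id": int, "amount": float}
SCHEMA_OUTPUT_DATA = {"customer_id": int, "prediction": float}


def validate(rows, schema):
    """Raise ValueError if a row misses a column or a value has the wrong type."""
    for index, row in enumerate(rows):
        missing = schema.keys() - row.keys()
        if missing:
            raise ValueError(f"row {index}: missing columns {sorted(missing)}")
        for column, expected_type in schema.items():
            if not isinstance(row[column], expected_type):
                raise ValueError(
                    f"row {index}: column {column!r} should be {expected_type.__name__}"
                )
    return rows
```

Checking the output schema as well is what keeps the predictions compatible with the table they are written to.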

So I will instantiate an object of this batch model, and then I will do some extra steps; I will show you right now which ones. At the end, I will be able to schedule the run and get things actually going. So what I will do is create a separate file. Let's say it lives under workshop, source, batch, and let's call it, for example, batch_inference.py. Cool, we have the batch inference file. I will skip the logging part right now, but I want to say that it's pretty important to use it, so let me keep it here just to make sure that we have it. A lot of people forget the pretty simple principle of using logging instead of printing. Here I am going to define a function called run_batch. What I need is the path to the model, so let me call the argument path_to_model. This is the location of my serialized model. So what will I do? First of all, I will get the data; let's call it raw_data. What I'm doing now is just defining the logic, and this logic will be further developed and translated into functions. If you do it this way, you will be able to write proper unit tests to make sure that each component behaves as expected. So let's say raw_data comes from a function that is responsible for getting the raw data; let's call it get_raw_data. Then I add some logging just to make sure that it's there. Then I say: okay, now I need to create a model. In this case I will use my batch model, so I write BatchModel here. I need to import it as well, but for now let's leave it like this.
Then my batch model also needs a way to load the model. Here I have load defined, so let's use it and provide the path to the model. So the path to the model is provided; what else do I need? In order to run the batch and get predictions, first I need the raw data. Okay, I get the raw data. Then I need to create an instance of my model and load it. The next thing I want to do is preprocess this data. So before predicting the batch, we want to define a preprocess step as well; let's call it preprocess_data. It can be one function, it can be several functions. It takes data as an input and gives something back. So, okay: I get the raw data, I create the instance of the model, and then the next step is preprocessing. I have my model, so let's call the result X, and I call the model's preprocess method on my raw data. This is the transformation that I expect to happen. The next thing is the predictions that I want to get out of it: I call the model's predict_batch on my preprocessed data. As the last step, I want to write the predictions somewhere. Imagine that I created a function like write_to_sql_db, and I just write these predictions there. So, okay, I created this function.
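Put together, the flow described here might look like the sketch below. It is not the workshop's actual file. BatchModel is a tiny stand-in so the example runs on its own, get_raw_data returns placeholder values, and sqlite3 stands in for whatever SQL database the predictions would really go to.

```python
import logging
import sqlite3

logger = logging.getLogger("batch_inference")


class BatchModel:
    """Hypothetical stand-in for the real batch model class."""

    def load(self, path_to_model):
        logger.info("Loading model from %s", path_to_model)
        self.scale = 2.0  # pretend this value came from the serialized model

    def preprocess(self, raw_data):
        return [float(value) for value in raw_data]

    def predict_batch(self, features):
        return [self.scale * x for x in features]


def get_raw_data():
    """Placeholder for reading from a data lake, a file, or a database."""
    return ["1", "2", "3"]


def write_to_sql_db(connection, predictions):
    connection.execute("CREATE TABLE IF NOT EXISTS predictions (value REAL)")
    connection.executemany(
        "INSERT INTO predictions (value) VALUES (?)", [(p,) for p in predictions]
    )
    connection.commit()


def run_batch(path_to_model, connection):
    raw_data = get_raw_data()
    logger.info("Fetched %d raw records", len(raw_data))

    model = BatchModel()
    model.load(path_to_model)

    X = model.preprocess(raw_data)
    predictions = model.predict_batch(X)

    write_to_sql_db(connection, predictions)
    logger.info("Wrote %d predictions", len(predictions))
    return predictions
```

Because each step is its own small function, each one can be unit tested separately, which is the point made above about defining the logic first.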

13. Batch Inference and Deployment

Short description:

This part explains how to perform batch inference using a file responsible for batch inference. It covers loading the model, preprocessing the data, generating predictions, and storing the results in a SQL database. It also discusses the use of libraries like argparse for parsing arguments from the command line and the importance of logging and checkpoints. The part concludes by highlighting the majority of use cases for deploying models as simple batch jobs and the option of using visualization tools like Power BI to analyze the results. It emphasizes that most models are deployed in offline mode and mentions the importance of training models offline before deploying them for online inference.

So this file will be responsible for batch inference. We take the data as an input. We have our model loaded and ready to process the data. We pre-process the data based on the transformation logic we talked about. Then we ask the model to run the predictions on this batch. And then we want to dump the results, for example to a SQL database that will be further queried by our, I don't know, SQL analysts or other people.

And then we add the usual guard: if __name__ equals "__main__", then I just want to call run_batch, and I pass my path to the model here. That's it, it's ready. So now, first of all, I have my model encapsulated. I didn't do any checks on the schema, but that is advisable to do as well. Then I created a pretty simple batch inference file. The whole responsibility of this file is to get raw data, load the model, pre-process the data so that it is ready to be digested by our model, return predictions, and then drop these predictions, for example, into a SQL database. And then it says: okay, if this file is called directly, please run the batch.

Here is the thing: I don't want to manually add this path to the model every time. If you're working with Python, there are really cool libraries for this: argparse and ConfigArgParse. I'm a big fan of the argparse library. It allows you to parse arguments from your command line. If you use argparse, you need to spend some time initializing the argument parser; I will skip this part for now. But what it gives you is that you can call run_batch with args.path_to_model. What does that mean? While running this file from the command line, you can add a string, and this will be the path to the model. So you don't need to hardcode this path every time; you can just type it on the command line. It makes life easier. As I mentioned about logging, there are good points to note here. I wouldn't say add as many logs as possible, but if you are working in a local environment, or in an environment where you know that logging and monitoring practices are not the best, try to use logs wisely. What I recommend is to have some checkpoints. First, I want a checkpoint on whether the raw data was fetched, so I log the status of what's happening there. The next checkpoint is to make sure that the model was loaded properly. Then I also want a checkpoint on whether the preprocessing ran as expected. Then I want a checkpoint on the predictions, the model's predict_batch, and the last step, the write to the SQL database. So with these two files we start really simple: you encapsulate your batch model, and then you have this pretty simple batch inference file.
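The parser setup that was skipped, together with the logging checkpoints, could be sketched as follows using the standard argparse and logging modules. The flag name --path-to-model and the body of run_batch are assumptions made for illustration. The real run_batch would do the actual work between the log lines.

```python
import argparse
import logging

logger = logging.getLogger("batch_inference")


def run_batch(path_to_model):
    """Placeholder batch job that logs one checkpoint per step."""
    logger.info("checkpoint 1/5: raw data fetched")
    logger.info("checkpoint 2/5: model loaded from %s", path_to_model)
    logger.info("checkpoint 3/5: preprocessing finished")
    logger.info("checkpoint 4/5: predictions computed")
    logger.info("checkpoint 5/5: predictions written to the SQL database")
    return path_to_model


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Run batch inference.")
    parser.add_argument(
        "--path-to-model",
        required=True,
        help="Location of the serialized model",
    )
    # argparse turns --path-to-model into the attribute args.path_to_model.
    return parser.parse_args(argv)


def main(argv=None):
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    args = parse_args(argv)
    return run_batch(args.path_to_model)


# When saved as a script, the usual entry point would be:
# if __name__ == "__main__":
#     main()
```

With that entry point in place, the file can be run as python batch_inference.py --path-to-model followed by the model location, so nothing has to be hardcoded.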
The first pretty dumb and simple option is just to schedule cron jobs. That's it. You say that, based on this specific schedule, I want to run this. The issue with standard cron jobs is that batch jobs will definitely fail at some point, and then they are really hard to restart. That's why tooling such as Airflow exists. The beauty of Airflow is that it's an orchestrator: you can orchestrate and rerun your jobs, and you can pick up from specific points. Airflow is written in Python, and you define the tasks in the form of a DAG. So imagine a graph, and you say: the first step will be getting this raw data, the second step will be preprocessing this raw data, and the third step will be taking the predictions from the model and writing them to the database. And that's it for the batch. I should say that the majority of use cases will not involve building an API using Flask, FastAPI, or anything like that. They will be just simple batch jobs. I also want to ask you: if you're working with a model and somebody asks you to deploy it, first ask how they want to use this model. In the majority of use cases, people want to get a visualization. What do I mean by that? Not a Matplotlib or seaborn visualization. I mean that you run your batch jobs, you drop the results into a SQL database, for example, and then the business analysts do their magic: they query it as needed and perform the analysis they need. On top of it, we can connect Power BI, Tableau, or any other tool and just use it there. So this is a really simple option for deploying the model.
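To illustrate the idea of tasks arranged as a DAG that can be resumed after a failure, here is a toy sketch in plain Python. It is not Airflow's API. It uses graphlib from the standard library (Python 3.9 or newer), and the task names and the run_dag helper are invented for the example.

```python
from graphlib import TopologicalSorter


def fetch_raw_data(state):
    state["raw"] = [1, 2, 3]


def preprocess(state):
    state["features"] = [x * 10 for x in state["raw"]]


def predict_and_write(state):
    state["written"] = [x + 1 for x in state["features"]]


TASKS = {
    "fetch_raw_data": fetch_raw_data,
    "preprocess": preprocess,
    "predict_and_write": predict_and_write,
}

# Each task maps to the set of tasks that must finish before it.
DEPENDENCIES = {
    "fetch_raw_data": set(),
    "preprocess": {"fetch_raw_data"},
    "predict_and_write": {"preprocess"},
}


def run_dag(state, done):
    """Run tasks in dependency order, skipping tasks already recorded in `done`.

    `done` is updated in place, so after a failure the same set can be passed
    in again and the run resumes from the task that failed.
    """
    for name in TopologicalSorter(DEPENDENCIES).static_order():
        if name in done:
            continue
        TASKS[name](state)
        done.add(name)
    return done
```

A plain cron job has no record like done, which is why a failed run usually has to start again from the first step. An orchestrator keeps that bookkeeping for you.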
And here we can tweak different steps. We can say we want to pre-process data in a distributed manner, so we want to use Spark. There are different options for how to do it, but the majority of models will be deployed in offline mode. The last piece I want to add here: a lot of projects will not start with online inference immediately. For example, you can work on fraud detection, and first you will train the model offline and make sure that, based on historical data, you're able to spot fraud properly. And then, in a lot of financial institutions, before a model is allowed to be deployed in production, it needs to pass a lot of security and governance checkpoints.


14. Offline and Online Deployment and Minimum Skills

Short description:

Offline serving involves running the model on a schedule, where predictions are stored in a database. Online deployment allows users to interact with the model in real-time via an API. The minimum basic skills for an ML engineer include Bash and Python. Understanding the math behind algorithms is beneficial but not necessary. Docker knowledge is also important. Experimenting with simple models and learning how to deploy them is crucial. Courses on machine learning deployments, such as those offered by Google, can provide further guidance.

That's why it's always really hard to bring a model online immediately, and why a lot of financial institutions actually start small, with offline inference. So if you're able to do offline inference, please do it. You don't need to expose your models as a microservice via an API. It's not necessary; in this case it's over-engineering. So I want to stop with this case here.

Let's have a short break and time for questions. After the break, we will continue with online serving. This is where times get really, really interesting, because the size of the problem starts to grow exponentially. So first of all, I'll be happy to see your questions. Now it's 15 minutes. Let's have a five-minute break, and let's use this time for your questions as well. Then we'll continue with online serving in near real time.

Okay, I see there was a question: "Is the recording going to be shared later on? I would like to go over this amazing tutorial later on step by step." Sure, this video will be shared. I asked GitNation, who are responsible for this conference, and they said that for the people who were able to get to these workshops, the recordings will be shared with you.
Okay, next question: "Could you please quickly explain what you mean by online and offline? Sorry if I missed that info." No problem. So what do I mean here by online and offline? I mean how you are using the model. With offline serving, the model is doing something on a schedule. You do not directly interact with the model; you don't touch it. This is offline deployment. So let's say at one o'clock in the morning there is a prediction job, scheduled as a cron job, or Airflow started this job. We get some raw data, this raw data is preprocessed and fed into the model, the model gives back predictions, and these predictions are stored somewhere in a database or a data store. Let's say it's a SQL database, an imaginary SQL database. Then somebody who wants to use these predictions further will interact not with the model but with, for example, this database or any other interface that is linked to it.
If I'm talking about online, it can be, for example, that I'm right now opening Google Translate and I want to translate something. There is a Google algorithm that's responsible for machine translation. What's happening there? You type some text into it, and as soon as you press the button, you interact with the web application in that moment. So this is online deployment, because we interact with the model online, or you can call it interactive use.
Near-real-time interactive use of the model. And in the majority of cases the model will be exposed via an API. It can be a RESTful API or it can be gRPC, the Google protocol. So I do hope it helps; if not, ask further and I will try to do my best.
Next question: "What are the minimum basic skills required to be an ML engineer, and what separates an experienced ML engineer from a new ML engineer?" Oh, I love this question. Okay, I'll try to be short. The minimum basic skills required to be an ML engineer: Bash and Python, just to begin. There are always pros and cons about knowing the math behind the algorithms versus not knowing the math. I think in this case, start simple and small. So Bash and Python, why do you need them? Bash will help you to orchestrate your jobs and also to learn how to use the command-line terminal. Why Python? Because the majority of algorithms right now are, well, not written in Python, but they work really well with Python. I think those are the basics. There are different courses. As I mentioned before, in the README file that you get in my GitHub repo there is a "where to go next" section. I highly encourage you to go through it, because this is a really good piece of information that will help you at least to see some directions.
What else is needed? If you're an engineer, you also need to know how to work with Docker, and probably how to work with Kubernetes, but I think that may be overkill at the beginning. So concentrate on Bash and Python and on seeing what you can do with simple models. You get a model. Okay, what next if you want to deploy this model? How are you going to deploy it? And then you start asking questions. I'm not really sure about the courses available online specifically on machine learning deployments. It seems to me that there are some courses provided by Google. No, sorry.

Online Serving and Latency Optimization

Short description:

To become an experienced ML engineer, it is important to know what you're doing and understand the bottlenecks in the system. Communication skills and the ability to work with data scientists are also crucial. The level of maturity of a company can affect the role of a machine learning engineer, and there are different paths to explore, such as data scientist or data engineer. Online serving is more challenging than batch serving due to the need to optimize latency. Latency is crucial as users do not want to wait for results. Models that are too complex may not be suitable for online serving, and simpler models are often used. The goal is to optimize everything for latency and provide predictions with the lowest possible delay. GPU inference can be used to speed up predictions, but cost may be a limiting factor. In online serving, the user directly interacts with the model through an API, such as a RESTful API or GRPC.

It's not Google. It seems to me it's a collaboration of Google and DeepLearning.AI. So there was a course dedicated to TensorFlow. But once again, not all algorithms are in TensorFlow. Then there is fast.ai. Oh, I completely forgot to mention this. So fast.ai is more for prototyping and for starting with the basics. So to sum up the minimum basic skills: Bash, Python, Docker, and then a basic course on deployments.
Then you have: what separates an experienced ML engineer from a new ML engineer? I love this question. I usually try to find a way to test new machine learning engineers, for example for our bootcamp. For me, it's exactly about knowing what you're doing. And you see it immediately, because, for example, I can give you a TensorFlow model and give you the task to deploy it. And some people, without any questions, immediately start doing something like building a RESTful API, without asking things like: how are you going to use this model? Who are the users of this model? So I should say that an experienced ML engineer knows the bottlenecks in the system, and he or she knows how to incorporate the model into the current system in the best way. And you can learn it only by experience. For example, when I first came across machine learning deployments, I was given a Jupyter notebook with more than 1000 lines of code, with different cells that had been run in chaotic order. So if you tried to run it from top to bottom, it didn't work. There was no information about which version of Python was used or exactly which versions of the libraries were used. It was trained somewhere on a local machine. So it was really hard to understand what the next step was.
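The missing information in that notebook story (which Python, which library versions) can be captured with a few lines of standard-library code. The sketch below is one possible way to do it, not something shown in the workshop; the function name `describe_environment` and the example package names are assumptions.

```python
import json
import platform
from importlib import metadata


def describe_environment(packages):
    """Return the Python version and the installed versions of the given packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return {"python": platform.python_version(), "packages": versions}


if __name__ == "__main__":
    # The package names are examples only; list whatever the model was trained with.
    print(json.dumps(describe_environment(["numpy", "scikit-learn"]), indent=2))
```

Saving this dictionary next to the trained model gives whoever deploys it a starting point for rebuilding the environment.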
And this is also about communication skills, for me personally, because if you're able to get back to the data scientists and ask them nicely to explain what they mean, and ask them nicely to help you pick up the model that is there, then this skill will be highly appreciated. I hope this explanation is not going on too long, so please let me know if it's helping you.
And also, what I want to add, sorry, I forgot: it still depends on the level of maturity of the company. There are a lot of misconceptions about what exactly a machine learning engineer is doing. Some people say that a machine learning engineer is responsible for training the model, doing feature engineering, doing evaluation, and finding the best way to deploy the model, and other people say that deploying the model is not the work of the machine learning engineer. So it's a really good question. I used to work for the United States market, and it's a little bit more mature than the European market, I should say. You can also come across huge clients where you expect to have some level of maturity, and then you learn that they have only two machine learning models in production, and when you start asking why, it's because they're not able to find good engineers. So that's the thing.
But also make sure you know that this is the path you want to follow. If you are more interested in playing with data, visualizing the data, finding some insights, and trying different models, maybe machine learning engineering is not for you. Maybe you are a data scientist, and that's where you can shine. Or maybe you care more about taking the raw data and turning it into real value. Then maybe it's data engineering that is waiting for you. So there are different flavors and different things that you can try.
So folks, if I don't see any further questions, I think we can continue with the online part, because it's the longest one. And then at the end we'll have some time to go through it. Okay, so, online. Once again, one example of online is Google Translate. You have something running in the browser, you have some fields where you can give some information, you press maybe a button, and then you expect to get something back. For example, if you're doing a PowerPoint presentation, you now get this new feature. It's a design feature on the right-hand side, where you get design ideas. This is powered by AI. What's happening is that it's based on the input that you give to the slides: shapes, words. I don't assume that they process words, I think just shapes, lines, colors. Using GANs, generative adversarial networks, they generate some new layouts that you can try. So this is also one of the possible forms of online serving.
So we go into the online part. It's more challenging than batch, because when we're talking about batch, we're saying that to deploy something we need to optimize the throughput. In the case of online serving, we need to optimize the latency. Why is latency so important here? Because you don't want to wait for ages until you see the results. Imagine that you are playing with some fancy algorithm and you're using your webcam to recognize your face. If it takes 5 minutes to create a bounding box around your face, you will give up. It's the same story for a lot of people who want to actually use the model. If you need to wait for a long time, you will give up. And that is when you're just playing with it. In a real case, the reaction of the model should come within milliseconds. That's why we're calling this near real time: it's happening in real time with a little bit of delay.
So once again, it's more challenging than batch because we have these latency restrictions. What happens here is that somebody sends a request and wants us to give a prediction back with the lowest latency possible. In this case, we are optimizing everything for latency. That's why a lot of models, really beautiful deep learning models, never go into production: if you want to get predictions from these models, it will take ages. Or, the other option, if you want to speed it up, you can use a GPU for inference to get these predictions, but it will cost you so much money that in the end this model will be unprofitable for you. That's why in the majority of cases in production we end up with models that are simple to use and not really, let's say, big ones. And also in this case, as I mentioned before regarding interactivity, the user will directly interact with this model. So in this case we use an API, and it can be a RESTful API or it can be gRPC, for example, the Google protocol.
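Since online serving is optimized for latency, it helps to measure it before deciding whether a model is fit to serve. The following is a minimal sketch, assuming a hypothetical `fake_predict` function in place of a real model call; it times repeated calls and reports the median and the 95th percentile in milliseconds.

```python
import statistics
import time


def fake_predict(features):
    """Hypothetical stand-in for a model call; replace with the real predict function."""
    return sum(features) > 1.0


def measure_latency_ms(predict_fn, payload, n_requests=200):
    """Call predict_fn repeatedly and return the per-call latencies in milliseconds."""
    latencies = []
    for _ in range(n_requests):
        start = time.perf_counter()
        predict_fn(payload)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies):
    """Median and 95th percentile; the slow tail is what users tend to notice."""
    cut_points = statistics.quantiles(latencies, n=100)  # 99 cut points
    return {"p50_ms": statistics.median(latencies), "p95_ms": cut_points[94]}


if __name__ == "__main__":
    print(summarize(measure_latency_ms(fake_predict, [0.2, 0.9])))
```

This only times the model call itself. Behind an API, network and serialization time are added on top, so an end-to-end measurement from the client side is the number that matters to users.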

Deploying Model via RESTful API with Flask

Short description:

To deploy a model via a RESTful API, Flask can be used as a lightweight framework. The model should be loaded only once to avoid unnecessary latency. A function is defined to load the model and make it visible and reusable for other functions. Basic templates are created to serve predictions, and routes are defined for the API endpoints. The 'predict' route is defined with the 'post' method to receive data in JSON format. The model is then used to generate predictions based on the input data.

And while we're dealing with this, we do validation, first offline and then online. This is where the concept of A/B testing comes into play. Maybe you are familiar with this concept, but here is what we want to do. We want to have at least two versions of the model, and we want to redirect most of the user traffic to one model while we redirect only a tiny part of the traffic to the other model. Then we compare how these two models perform and we choose the model that performs better. You can call it A/B testing, you can call it canary deployment. There are different versions of it, and that's why it's harder.
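One common way to split traffic between two model versions is to hash a stable user identifier into a bucket. The sketch below illustrates that idea only; it is not the method used in the workshop, and `choose_variant`, `route`, and the 10% default share are assumptions.

```python
import hashlib


def choose_variant(user_id, share_b=0.1):
    """Deterministically assign a user to model 'A' or 'B'.

    share_b is the fraction of traffic (0.0 to 1.0) sent to the candidate model B.
    """
    digest = hashlib.sha256(str(user_id).encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # roughly uniform value in [0, 1)
    return "B" if bucket < share_b else "A"


def route(user_id, payload, model_a, model_b, share_b=0.1):
    """Send the request to the chosen model and record which variant answered."""
    variant = choose_variant(user_id, share_b)
    model = model_b if variant == "B" else model_a
    return {"variant": variant, "prediction": model(payload)}


if __name__ == "__main__":
    # Two toy "models" stand in for the current and the candidate version.
    print(route("user-42", 3, lambda x: x + 1, lambda x: x * 10))
```

Hashing the user id keeps each user on the same variant across requests, and logging the variant with every prediction is what later allows the two models to be compared.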

So what are we usually going to do with online? We will do the same as we've done before. We will start with the base class for the model. In this case I call it online model, just to make sure that I differentiate between batch and online. In the case of batch, we have the class that we used for running batch inference. In the case of online, we are going to create an API in order to use our model online. So it's the same story: we initialize the model, we load the model, and then we run prediction. And how does the prediction work? It takes some input features, or data in our case, and returns a prediction. We can also define what we expect from the model. If it's an online model and we want to interact with it, then usually we do it via an app, and it can be a web app or a mobile app, depending on what you want. However, there is a difference between deploying to a mobile phone and deploying for use in a web application. So for now let's concentrate on the web application.
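The initialize, load, predict shape of the online model class can be sketched as follows. The workshop's real class is not shown in the transcript, so everything here is an assumption: the toy linear model, the JSON file with a `weights` key, and the output dictionary are chosen only to keep the example self-contained.

```python
import json
import os
import tempfile


class OnlineModel:
    """Hypothetical base class for online serving: initialize, load once, predict per request."""

    def __init__(self):
        self.weights = None  # nothing is loaded until load() is called

    def load(self, path):
        """Read model parameters from disk (a JSON file stands in for a real model artifact)."""
        with open(path, encoding="utf-8") as fh:
            self.weights = json.load(fh)["weights"]
        return self

    def predict(self, features):
        """Take one list of numeric features and return one prediction."""
        if self.weights is None:
            raise RuntimeError("call load() before predict()")
        if len(features) != len(self.weights):
            raise ValueError(f"expected {len(self.weights)} features, got {len(features)}")
        score = sum(w * x for w, x in zip(self.weights, features))
        return {"score": score, "label": int(score > 0)}


if __name__ == "__main__":
    fd, model_path = tempfile.mkstemp(suffix=".json")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump({"weights": [1.0, -2.0]}, fh)
    print(OnlineModel().load(model_path).predict([3.0, 1.0]))
    os.remove(model_path)
```

Checking the input length inside `predict` is a small example of defining what the model expects, so that bad requests fail with a clear message instead of a silent wrong answer.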

So let's assume we have this model and we want to expose it via a RESTful API. For this we can use Flask. Flask is a Python-based framework that strips a lot of complexity away from you and allows you to define the steps for exposing the model. There are quite a lot of tutorials on the internet telling you how to use Flask, but I want to make sure that you're aware of one simple thing. So let's create a new file that I will use to get predictions from our model, let's say online. So I just created the app file. Let's see what will be there. Of course I start with importing libraries. In this case, I start with my favorite library, logging. Then I want to use Flask. So I say, from Flask, what do I need to import? I want to import Flask and, for example, request. That's what it needs. Then I initialize my app, let's say with name. And when we start, I want to make sure that at the beginning we don't have any model.
Here is the thing that is different. When you're running batch inference, you just load the model and you use it. When you're exposing your model via a RESTful API, it's important that you load the model only once. The biggest mistake I see in the scripts of beginners, when they're trying to copy the Flask tutorials, is that somebody forgets to define that the model should be loaded only once. What they're actually doing is loading the model every time a client sends a request. That takes time and adds to latency. So you need to load the model only once and then wait for incoming requests.
That's why, if you're working with Flask, the first thing we usually do is define on the app something we can call "before serving", to make sure that we load the model only once. Then we define the function that will be responsible for loading the model. So this is our function to load a model. What's important here is that we need to use global, which is specific to Python, to make sure that this variable is visible outside the scope of this function. So it will not only be visible to the load model function, but it will be visible and reusable for all other functions that follow. And then what exactly happens here? You have your model. You have your model path, which you can define. You add your logging. Then I instantiate my model. It was called online model here. And I use the load method to load my model from, for example, the path where it's located.
So first we are creating a basic template to serve our predictions. We start by making sure that we load the model only once, before any serving, and that the model is global, which means that it's visible outside the scope of this one particular function. Then, in the case of Flask, we define the routes. For example, we say it's a traditional predict route, and we define the methods that will be supported. In our case it will be just POST. Remember Google Translate: you type some information, and when you press the button, this is what's happening there, the POST method is invoked. And then you define exactly what you mean by predict.
So in your case, you have your global model, to make sure it's here. Then you start by getting some, let's say, raw data. Your raw data is exactly what you get from JSON, if you're using the JSON format: the request's JSON. Then you say, okay, this is my raw data, and I want to get my prediction from my model. And for the prediction, I say: this is my model,

Deploying Model via RESTful API with Flask

Short description:

To deploy a model via a RESTful API, Flask can be used as a lightweight framework. The model should be loaded only once to avoid unnecessary latency. The predict method takes input data, processes it, and returns the predictions in JSON format. This approach allows the model to be deployed as a microservice. For testing purposes, Flask can handle one request at a time, but for production, a web server like Gunicorn or uWSGI should be used. Different versions of models can be deployed by handling requests and sending the appropriate output. Batch models can be scheduled using job schedulers like Cron or Airflow. Online models can be deployed via RESTful APIs. For near real-time deployments, using a queue to handle requests can be beneficial. Different interchange formats like PMML or ONNX can be used to optimize online scoring. There are different formats like ONNX that allow interoperability between frameworks like TensorFlow and PyTorch. Open-source tools like Conda and DVC can be used for data versioning control.

please predict whatever is needed. Oops, sorry, just to make sure that it's here. Yeah, so this is my prediction, and I want to do the prediction on this raw data, which is exactly the JSON that we are waiting for. And then what do I want to get back? If we use a RESTful API and we decided to use JSON for it, then the simplest way is just to return the output that we want to send. We can define it as a dictionary, like this. We say: this is what came as an input, which is the raw data in our case, and then I want to send the prediction, which is exactly what the model gives us. So let me just return this output here. So this is a really, really simple one. And then at the end you say, if name equals main. I always forget whether I should use double quotes or single quotes. And then you just run your app with app.run, and you define a host there. I will assume that the host is somewhere, then we define the port, whatever we want to open. And debug, for example, I will set to false for now. That's it. And then we get an endpoint that we can use for getting predictions.
So what's important to remember in this particular part? If you're working with online, first you define, once again, the basic class for the online model with the methods that you're going to use. Then you create an app that you're going to use with Flask. You can use Flask, or you can use FastAPI. I really appreciate how FastAPI, it's a library, has developed pretty fast and has really good results, and it's integrated with Swagger. So you can immediately, visually test your API, or if you're more of a hardcore fan of Postman, you can use Postman for that. And that's exactly what you do. You create a Flask app or a FastAPI app; you can use one or the other.
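Put together, the walkthrough above corresponds roughly to the sketch below. It is a reconstruction under assumptions, not the workshop's actual file: the `OnlineModel` stub, the `model.bin` path, the host, and the port are placeholders. Older tutorials hook model loading into Flask's `before_first_request`, which was removed in Flask 2.3, so this sketch loads the model through a helper that does the work only once.

```python
import logging

from flask import Flask, jsonify, request

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

app = Flask(__name__)
model = None  # at startup there is no model yet


class OnlineModel:
    """Hypothetical stand-in for the online model class; sums the input features."""

    def load(self, path):
        self.path = path  # a real class would read the model artifact here
        return self

    def predict(self, data):
        return sum(data["features"])


def load_model(path="model.bin"):
    """Load the model into the module-level variable, but only the first time."""
    global model
    if model is None:
        logger.info("Loading model from %s", path)
        model = OnlineModel().load(path)
    return model


@app.route("/predict", methods=["POST"])
def predict():
    raw_data = request.get_json()  # the JSON body sent by the client
    prediction = load_model().predict(raw_data)
    return jsonify({"input": raw_data, "prediction": prediction})


if __name__ == "__main__":
    load_model()  # load once, before serving any request
    app.run(host="127.0.0.1", port=5000, debug=False)
```

With the app running, a POST to `/predict` with a body such as `{"features": [1, 2, 3]}` returns the input together with the prediction. The built-in server started by `app.run` is meant for local testing only.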
And then the first thing that you need to do is load the model, making sure that you load it only once before serving. Then you define the predict method that takes some data from the input. You process this data, predict, and return the output as JSON, as a dictionary that is really compatible with it. Oh, I just noticed a typo. Let's get back the data that you need. And then you can reuse this to deploy your model as a microservice. Whoever wants to use it just goes to the API, the model makes the prediction, and you get back the results that you wanted.
And if we're talking about A/B testing: what we have for now is really simple and lightweight. It's not ready for production in any way, because here Flask will be able to handle only one request at a time. If you want to run more, you need to use Gunicorn or a uWSGI server or any other web server you're going to work with. But just for testing and for seeing it for the first time, it's okay. If we're going to use it, then in the future we need to split the traffic between the two versions of the model. This can be done by taking the request and sending the corresponding output, and you can define here how you're going to do this. So what you should look at here: you want to make sure that if you're calling your model and want to get predictions back, you don't waste a lot of time getting them.
So in the case of batch, we have the batch model class and a batch inference file that can be scheduled with cron jobs or Airflow jobs or whatever you're using, or Spark jobs as well. If you have online, one of the options for deploying your model online is exposing it via a RESTful API. So that's what you have. And before going to the question section.
So that's exactly what I wanted to show you. I will close it now. And there was, I think, a really good point mentioned in the chat. Let me check it. "For online scoring in a stream, calling an API was still a bit too slow. You should not have the network latency. So it would be better if it is an interchange format like PMML or ONNX; this can be lazily loaded into memory directly at the streaming engine that processes the data." Yes. And what I also want to say: if there is a need to speed up a near-real-time deployment, and it depends once again on your requirements, there is a way to work with a queue. For example, when you get requests, you can spin up different web servers that will handle your requests. So there are different options. That's why I point out that there is a difference between real time and near real time, because for some use cases you're not even able to wait this long. And there are definitely different formats as well. I also love to work with ONNX. There are some backward incompatibilities when going from TensorFlow to PyTorch or vice versa via ONNX, but it's a really good standard. And I do hope that more people will help ONNX develop further as an open-source format.
So I think, for now, that's what I wanted to share with you about the tools. So this is where to go. We briefly covered offline and online. Once again, it's really brief and really simple, because in real-life use cases it can take more time. And sometimes even teaching about deployments can take more time. Where to go next?
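The queue idea mentioned above can be illustrated in miniature with the standard library. This is a simplified sketch, not a production design: worker threads stand in for the separate servers that would pull requests from a real message queue, and `worker`, `serve`, and the doubling "model" are hypothetical names.

```python
import queue
import threading


def worker(requests, results, predict_fn):
    """Pull requests off the queue until a None sentinel arrives."""
    while True:
        item = requests.get()
        if item is None:  # sentinel: no more work for this worker
            requests.task_done()
            break
        request_id, payload = item
        results[request_id] = predict_fn(payload)
        requests.task_done()


def serve(payloads, predict_fn, n_workers=4):
    """Queue all payloads, let the workers process them, return results in request order."""
    requests = queue.Queue()
    results = {}
    threads = [
        threading.Thread(target=worker, args=(requests, results, predict_fn))
        for _ in range(n_workers)
    ]
    for thread in threads:
        thread.start()
    for request_id, payload in enumerate(payloads):
        requests.put((request_id, payload))
    for _ in threads:
        requests.put(None)  # one sentinel per worker
    requests.join()
    for thread in threads:
        thread.join()
    return [results[i] for i in range(len(payloads))]


if __name__ == "__main__":
    print(serve([1, 2, 3], lambda x: x * 2))
```

The queue decouples accepting a request from computing its prediction, so bursts of traffic wait in line instead of overloading a single server.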
As I mentioned before, there are some useful links to see what you can do and what the issues are if you want to deploy deep learning models. They have a really good lab at the end where, it seems to me, you try to deploy a mobile app that looks at text using optical character recognition, finds the lines, and then does something with them. And then, this is just a bunch of my favorite open-source tools. I briefly mentioned Conda in the beginning. And then you also have Data Version Control, DVC.

Tools for Data Versioning and ML Deployment

Short description:

Data versioning is important, and tools like Pachyderm and MLflow can help with data lineage and pipeline deployment. Seldon Core is a YAML-based tool for deploying models as microservices on Kubernetes. TensorFlow Extended covers the entire ML pipeline. Kubeflow is another tool for production use, and the speaker is interested in hearing about others' experiences. They also mention the importance of showcasing work on GitHub or contributing to open source projects when seeking internships. The speaker provides their LinkedIn for contact and recommends newsletters, blogs, and conferences as resources for learning and staying connected with the ML community.

It's really cool for data versioning, although there are still some issues if you are, for example, working with distributed data. Then there is a really cool thing called Pachyderm. They have an open-source part and a closed part that's monetized. It can help you with data lineage and with pipelines, and it's built completely on top of Kubernetes.

I'm a big fan of MLflow. It's now running under the hood of Databricks, because MLflow is from the same people who created Apache Spark, and Databricks is just the monetized part of it. I highly recommend it; I use it really heavily for deployment as well as for experiment tracking.

Then Seldon Core, this is one of my favorites. If you're running things on Kubernetes and you want, for example, to expose your model as a REST or gRPC microservice, you don't even need to write this code. The beauty of Seldon is that you can define your models and graph in a YAML file. You can define all the steps, like what the pre-processing is and what the post-processing is, and it will do the deployment for you in a second. On top of that, it's really easy to scale. So if you need more instances, just scale it as it is.

And those of you who work extensively with TensorFlow can enjoy the benefits of using TensorFlow Extended. They're now trying to cover the whole pipeline, with different checks for the data and everything. So that's really cool. There is also a tool called Kubeflow. If any of you have used Kubeflow in production, I'm all ears, please reach out to me after this event. I would love to talk to you, because I have a lot of issues and I'm really curious how you handle all these issues in production. So that's what I wanted to mention. And I think, yes, let's now have some time for questions. There's not a lot left, so let's just go through it. And once again, if you're working with Kubeflow in production, please let me know. I would love to talk to you about this. And thank you, Bas, for sharing this experience, but I'm also really curious to hear your stories: what kind of deployments have you done before? Because I didn't cover anything about how to test each part, or how to deploy TensorFlow on the edge, or how to deploy your model on the edge or with other tools. So I would really love to hear something from you.
In the meantime, I saw the question: "How do I find internships at the beginning?" A good question, but I think it comes back to the value that you can give. Send as many resumes as possible, definitely, but you need to have something on your GitHub, GitLab, or Bitbucket to showcase what you're capable of doing. This is really important. Remember to do capstone projects in between to make sure that there is some visibility. And to get an internship, you can also contribute to open source, because there are a lot of small issues, like documentation and tests, and they are pretty important ones.
And if you start an internship, for example as a machine learning engineer, trust me, you will not be tasked with deploying a machine learning model from day one. What you will be doing is fixing documentation and writing tests to make sure that the models can be easily moved from one stage to another in the pipeline. Okay, so folks, I'm really curious to learn from your experience as well. If you want to find me, I don't have a Twitter handle. You have my GitHub right now, and all the links about who I am are there. I'm easily reachable via LinkedIn. That's what I'm using. So feel free to contact me if you need some help as well. But once again, if any of you have worked with Kubeflow in production, I really, really want to hear about it.
Oh, okay, I have a question: "Any recommended forums or communities that you use to communicate with other practitioners in this field and MLOps?" It's a really good question, and I just wanted to show you something. Wait a second, let me share another screen with you. There is a really cool part called "where to go next". I highly recommend you sign up for these newsletters. This is a really great one: The Batch is provided by DeepLearning.AI, so it's our famous Andrew Ng sharing the most important things every week. Then you've got these newsletters: Machine Learning in Production, Import AI, and the Machine Learning Engineer Newsletter. This one is really good. My two favorites, if it's about newsletters, are The Batch and the Machine Learning Engineer Newsletter. There are some blogs to follow as well, really interesting corporate blogs. Also, I find my inspiration here.
There are repositories and actually some posts on Reddit. But in the majority of cases, it is actually communication with my colleagues and the people that I used to work with, to learn from them. As well as watching different conferences. And now we have, let's say, on one side a really daunting, on the other side a really amazing opportunity to get access to almost all conferences online at lower cost. Sometimes it's almost free. Sometimes it's a really small investment. So this is a really good one. So yeah, let's say that's where you can do it. On top of that, off the top of my head, yeah, the local communities as well. I don't know, once again, you are from different parts of the country, some of the cities, in my case, because I'm leading the PyLadies.

PyData Global Conference and Resource Sharing

Short description:

I highly recommend PyData, a global conference for Python developers with a focus on deployments. The link to the conference and other shared information can be found in the GitHub repository. Additionally, explore different courses, including those offered by DeepLearning.AI. I also want to share a great resource called 'Papers with Code,' which provides research papers and code for new ideas and trends. It's a valuable source for exploring further possibilities. Thank you, Bas, for sharing the online learning to dev courses. I will definitely check them out.

I'm from Sudan community. I usually look out for different speakers. Sorry, not speakers, workshop givers. And that's why it's really important here as well to know who they are and what they're doing. And this is how we create these knowledge sessions to learn from each other. PyData, I highly recommend this event. As far as I know, it will happen really soon. It will be the PyData Global conference. It's exactly for Python developers, but with the emphasis that the biggest part is now dedicated to deployments. That's what it is. Can you please send us the link? So the link, I can send it. It's like this, but at the same time, I will send it here, the link. This link is also in the GitHub repo, together with all the information that was shared today. As I mentioned before, I will push the updates to the repo, so you'll have everything there, with all the clickable links. And also look at the different courses that come up. I think what can be useful as well, I don't know much about Udemy or Udacity, but regarding Coursera, it's everything that is offered by DeepLearning.AI. Yeah, and if any of you want to share something as well, please do. And by the way, there's another really cool thing I want to show you, I don't know whether you're aware of it or not. It's not about deployment, but this is where I get new ideas every day. Sorry, I forgot to add this link. So I'll do it right now. Here. So this is actually Papers with Code. It's all research papers with the code attached to them. Sometimes the code is more like, let's say, learning code, so it's not ready for production, but here you can find a lot of really cool ideas, new trends, and things that are happening. It's more about what you can do further, what the next step is.
And you have here the different trends, latest, greatest, for example, a person to try to explain or photograph neural network, generating image descriptions with sequential cross-modal alignment, or different options. It's really very useful. Or even, you see, there is BERT meets Construction Grammar. Awesome. So the link, I just sent it, and also, oh, cool. Thank you so much, Bas. Online learning to dev courses. Wow. Well, it's really interesting. I will take a look at this as well. Thank you for sharing it with me and with the rest of the people.

A-B Testing for Online Deployments

Short description:

A-B testing for online deployments involves splitting traffic between different model versions and measuring their performance. The choice of metrics and the cost of failure are important considerations. For mission-critical applications, a small percentage of traffic can be redirected to the new model, while for less critical cases, a 50-50 split can be used. It's essential to analyze feedback and measure performance at each stage.

And, okay. I see another question here as well. So the question is, how do you go about A-B testing for online deployments? Yes, really good question. I didn't spend a lot of time on it. I just briefly showed you what you can do with it. It's a topic that takes at least some hours to go through. I think the best way to show it will be to use maybe Seldon Core, wait a second, Seldon Core A-B testing, just to visualize it. I think it will be faster. And better. You have an A-B test deployment. Yeah, it's with Ambassador, but it doesn't matter. That's fine. No, they're using pigmented. Now, I want to find a visual way to show it. Okay, no, it will take too long. So, let me just get back here. So, if you are just working with a Flask app, what I will be doing is define exactly here how much traffic I want to split to each model. I also want to version the model here, to make sure I know how much traffic goes to model version 1 and how much to the same model, for example, version 1.1. And the question is how exactly to know which model performs better and how to measure these predictions. And what to do. It depends on your situation as well. For example, if you know exactly which metrics let you choose which model performs better than another, then you can set up a plan and decide how much traffic you want to redirect to this particular model. But what's also important here is to know: what is the cost of failure? What is the cost of a mistake here? So if it's something really, really mission critical, I recommend splitting the traffic like 2% to the new model and 98% to the old model. If it's something that's not so mission critical and you want to play with it, you can even do a 50-50 split and then just get feedback on which model performs better.
I hope that answers the question. If not, reach out to me and I will take a look at what the best options are or what the best information to share is, because once again, it's at least some hours of material to go through step by step, showing each stage, how to measure feedback and, yeah, how to do it exactly.
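
The traffic split described in this answer can be sketched roughly as follows. This is a minimal illustration in plain Python rather than the code shown in the workshop: the `ABRouter` class, the `choose_version` helper, the version labels and the stand-in models are all hypothetical names chosen for the example, and accuracy is used only as one possible metric. In a Flask app, `predict` would be called from the request handler and `record_feedback` from whatever endpoint or job collects the ground truth.

```python
import random
from collections import Counter

# Hypothetical split for a mission-critical case:
# 98% of traffic stays on the old model, 2% goes to the new one.
TRAFFIC_SPLIT = {"v1.0": 0.98, "v1.1": 0.02}


def choose_version(split, rng):
    """Pick a model version at random, proportionally to its traffic share."""
    versions = list(split)
    weights = [split[v] for v in versions]
    return rng.choices(versions, weights=weights, k=1)[0]


class ABRouter:
    """Routes requests to model versions and tallies feedback per version."""

    def __init__(self, models, split, seed=None):
        if set(models) != set(split):
            raise ValueError("models and split must name the same versions")
        if abs(sum(split.values()) - 1.0) > 1e-9:
            raise ValueError("traffic shares must sum to 1")
        self.models = models
        self.split = split
        self.rng = random.Random(seed)
        self.served = Counter()    # requests routed to each version
        self.feedback = Counter()  # feedback items received per version
        self.correct = Counter()   # feedback items marked correct

    def predict(self, features):
        """Return (version, prediction) so feedback can be attributed later."""
        version = choose_version(self.split, self.rng)
        self.served[version] += 1
        return version, self.models[version](features)

    def record_feedback(self, version, was_correct):
        self.feedback[version] += 1
        if was_correct:
            self.correct[version] += 1

    def accuracy(self, version):
        """Share of correct predictions among those with feedback, or None."""
        total = self.feedback[version]
        return self.correct[version] / total if total else None


if __name__ == "__main__":
    # Stand-in models: any callable taking features and returning a prediction.
    models = {"v1.0": lambda features: 0, "v1.1": lambda features: 1}
    router = ABRouter(models, TRAFFIC_SPLIT, seed=0)
    version, prediction = router.predict([0.3, 0.7])
    router.record_feedback(version, was_correct=True)
    print(version, prediction, router.accuracy(version))
```

For a less critical use case, the same sketch works with a split such as `{"v1.0": 0.5, "v1.1": 0.5}`. Returning the version together with the prediction is a deliberate choice: without it, later feedback cannot be attributed to the model that produced the answer.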


