JavaScript Beats Cancer


Skin cancer is a serious problem worldwide, but luckily treatment at an early stage can lead to recovery. JavaScript together with a machine learning model can help medical doctors increase the accuracy of melanoma detection. During the presentation, we show how to use TensorFlow.js, Keras and React Native to build a solution that can recognize skin moles and detect whether they are a melanoma or a benign mole. We also show issues that we faced during development. As a summary, we present the pros and cons of using JavaScript for machine learning projects.


Transcription


Hi, my name is Karol Przystalski, and today I will tell you more about how to use JavaScript to beat skin cancer. I have about 15 years of experience in machine learning. My background is machine learning and computer science. I did a PhD in artificial intelligence, on how to use it in medical imaging and in dermatoscopy as well. You can find some of my research papers on this topic on Google Scholar, for example. Here is one of the articles I have published, around five years ago, about the analysis of skin cancer on multispectral images. In that case I used Python, but because JavaScript has become more and more popular in recent years, including for this specific topic, I decided to prepare a presentation on our app for skin cancer analysis.
My background is not only scientific. In 2010, so 12 years ago, I founded a service company that works for Fortune 500 companies and also builds data science and machine learning solutions. Before that, I also did some other commercial work, for example at IBM. So, as I said, I have 15 years of experience in machine learning, and specifically in its applications in medical imaging.
So why did I decide to cover this topic and build solutions in this area? Well, as you can see, I'm not really in the risk group when it comes to skin cancer, because the biggest risk group is people with blond hair and blue eyes. This is phototype number one, with the highest risk of skin cancer, especially if your skin doesn't become brown when you're exposed to the sun but goes more in the direction of red. So the darker the skin is, and depending on how it reacts to the sun, the lower the probability of getting skin cancer.
So there are six phototypes of skin. I'm more or less in the third group because of my hair color, eye color, and so on. That's why the problem is biggest in countries like Germany, the Scandinavian and other Nordic countries, the US, and especially Australia. This is where the problem is becoming more and more important.
In the meantime, I have also partnered with some dermatoscopy companies, I mean the companies that develop the hardware. As you can see, here is one of the devices. This is a dermatoscope, a device that is used by a dermatologist. In this case, I have also used an iPhone here on the front, because this is an extension. So it's not a typical dermatoscope. Usually, it doesn't come with an iPhone or any kind of mobile phone. It's a standalone device. Some dermatologists also use this kind of extension case just to take the pictures in an easier way. And obviously, it's quite small, right? So you can put it in your pocket and even visit your patient to take a look at a mole like this. So this is where my solution is used. It is combined with special lenses and special light to get the best possible image of the skin mole.
When it comes to the data set: any kind of machine learning model has to be fed with some data. When I started my research, I started with 53 images or fewer. As you can imagine, that's not a big enough data set to do any kind of research. So I met, I guess, almost every company, public or private, that does anything with dermatology in the city where I live, Krakow, in Poland. Most of them declined to collaborate and build models. It was in 2007, 2008.
So the way people thought about machine learning was totally different compared to what's happening now. Machine learning, or AI, became a buzzword, and everyone wants to do AI. In the past, I mean 15 years ago, when I said AI, most people said, oh, no, thank you, I'm not interested. Now it's the total opposite. I need to explain to people why not to use machine learning rather than why to use it. So it changed dramatically, and the COVID pandemic increased the hype around AI even more.
When I reached out to the companies, I obtained a data set of around 5,000 images. Now you can easily download a data set of about 26,000 images of skin moles. It's available on the ISIC Archive website (isic-archive.com) and you can use it for your research. So now it's even easier to develop algorithms to find different kinds of skin diseases. It's not only cancer, which is not the most common illness when it comes to skin, and that's good.
For all of you who want to use some of the code samples that I have prepared, you can always download my Docker image that contains JupyterLab and JupyterHub, together with some kernels for JavaScript and some JavaScript libraries. It's a bit old, because I have been doing this for many years already. I might update it soon, but it's still working. So you can easily use it with the notebooks that I will show you next.
So, the architecture: how I combined machine learning with JavaScript. Because of this device, I decided to use one of the JavaScript solutions to build a mobile app, because mobile devices change every year. As you can see, this is an iPhone 6S, so quite old, and I will probably need to replace it with a newer one soon. Still, I'm able to use this phone to take good-quality pictures, because the quality is here, not here, right?
So it's not in the phone, but in the lens here. That's why I decided to use JavaScript instead of building a native application. In the past, I had to use different kinds of solutions, starting from Cordova and PhoneGap, and now I'm working with React Native. For the machine learning part, I use TensorFlow.js. You might ask: why TensorFlow.js and not, I don't know, Keras or Torch, for example? Well, there was one reason: it's the most robust library when it comes to JavaScript.
So why did I choose JavaScript in this case? It's not that I said, okay, let's do everything in JavaScript. That's not true. The truth is that the model is trained in Python, but it is used with JavaScript. So I use TensorFlow.js, but not really for the training. And to be honest, I don't know anyone who does that. I've been in the data science field for a long time and I know many people in this area, and they mostly work in Python or maybe Scala, some of them, especially for big data. So in this specific case, I use JavaScript exactly because TensorFlow.js makes it possible to load the model, use the model, retrain the model, and run it on a mobile phone.
In production, it is used together with Kubeflow and TensorFlow Serving. There are two models. The one that you see on the left can be bundled with the main app, and that's how I did it here on the phone. But in many cases, the app also calls a service that tries to find similar lesions. That service is a web app, so the mobile app connects to the web app, and that is also used for retraining.
So what does it look like? Let me just briefly move to some examples. Here we go.
I partially did this in the past for another conference, at Harky for Ukraine, so I will just use part of the notebook. You can easily find it on my GitHub account, on my company's GitHub account, along with a few repositories about machine learning, AI, and JavaScript.
When you start to do any kind of research related to data in Python, probably the first library that you will think about is Pandas. There is a port of this library in JavaScript. It's called Pandas JS, obviously. To some degree, it's very similar to the original Pandas from Python, but it's quite limited in comparison. Many features are not yet implemented. So you can use it in JavaScript to some degree, but it still lacks many advanced features from the original Pandas, especially the ones related to statistics. Other libraries that you can use in JavaScript for data analysis or data manipulation are DataFrame.js, Reclaim, DataForge, and so on. So there are plenty of such libraries. These are just a few examples of how to work with Series and with DataFrames. That's something normal for people who work with data, and probably not so typical for JavaScript engineers, but it's still easy to manipulate the data and easy to export it to JSON, for example. So this is just a notebook on how to use that. Just a few short examples.
When it comes to visualization, in my opinion, JavaScript does a better job than Python. There are libraries like Matplotlib, Seaborn, and some others in Python. When you compare them to the ones that are available in JavaScript, I think JavaScript has a huge advantage, because there are good libraries for visualization and for plotting charts that I think in many cases just do a better job than the Python ones. So that's good. That's one-one, let's say.
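The Series and DataFrame idea mentioned above can be sketched without any library. The following is a minimal illustration in plain JavaScript, not the Pandas JS API: a "DataFrame" is assumed to be an array of row objects and a "Series" a single column. The helper names (`column`, `mean`, `groupMean`) and the toy records are made up for this example.

```javascript
// Toy records, invented for illustration only (not real measurements).
const moles = [
  { id: 1, label: "benign", diameterMm: 4.0 },
  { id: 2, label: "melanoma", diameterMm: 7.5 },
  { id: 3, label: "benign", diameterMm: 5.0 },
];

// A "Series": one column pulled out of the row objects.
function column(rows, name) {
  return rows.map((row) => row[name]);
}

// Arithmetic mean of a numeric array.
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Group rows by `key` and average `value` inside each group.
function groupMean(rows, key, value) {
  const groups = {};
  for (const row of rows) {
    if (!groups[row[key]]) groups[row[key]] = [];
    groups[row[key]].push(row[value]);
  }
  const result = {};
  for (const name of Object.keys(groups)) {
    result[name] = mean(groups[name]);
  }
  return result;
}

const meanDiameter = mean(column(moles, "diameterMm"));
const meanByLabel = groupMean(moles, "label", "diameterMm");

// Exporting to JSON is a one-liner because the rows are plain objects.
const json = JSON.stringify(moles);

console.log(meanDiameter, meanByLabel, json.length);
```

A dedicated library adds indexing, missing-value handling and statistics on top of this, which is where the talk says the JavaScript ports still lag behind Pandas.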
And then there is Scikit-Learn, the most popular library for machine learning with shallow methods. TensorFlow is for building neural networks, whereas Scikit-Learn is more about shallow methods. Most of the problems that you want to deal with in machine learning can be solved, or should be solved, using shallow methods. So Scikit-Learn is the first library you would use, and it's easy to use in Python. In JavaScript, you have JSKitLearn and similar ports of Scikit-Learn, but they have not been updated in recent years. It looks like someone started something, but then dropped the maintenance. So there is not really a good library to use here. There are plenty of libraries that implement some specific shallow method like SVM, KNN, and so on. But if you want to have it all in one place, like in Python, then unfortunately you don't have such a library in JavaScript.
When it comes to TensorFlow, here's a very easy example of how you can build a linear regression: how to import the library, how to use it, how to implement a linear regression. And here's an image from the TensorFlow.js documentation. You have many kinds of APIs. The most popular one is in Python, but the backend is written in C++. You also have APIs for other languages like Java, Go, JavaScript, and so on. So JavaScript is not special here. It's just another way to get to the core of TensorFlow. Here again is an example of how to build a loss function, combine everything together, and train the model.
When it comes to neural networks, how do we do it in Python? Mostly with Keras or PyTorch. You can easily build the network from blocks: you have layers, you connect the layers together, and you can quickly build a very large network.
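The linear regression and loss function mentioned above are shown in the speaker's notebook, which the transcript does not reproduce. As a rough sketch of the same idea, the following plain-JavaScript version fits y = w * x + b by gradient descent on a mean squared error loss. It deliberately does not use TensorFlow.js, so it runs on its own. The function names, learning rate and epoch count are assumptions chosen for this example.

```javascript
// Model: a straight line y = w * x + b.
function predict(params, x) {
  return params.w * x + params.b;
}

// Loss function: mean squared error over the data set.
function mseLoss(params, xs, ys) {
  let sum = 0;
  for (let i = 0; i < xs.length; i++) {
    const err = predict(params, xs[i]) - ys[i];
    sum += err * err;
  }
  return sum / xs.length;
}

// Training: full-batch gradient descent on the MSE loss.
function train(xs, ys, options = {}) {
  const learningRate = options.learningRate ?? 0.05;
  const epochs = options.epochs ?? 2000;
  const n = xs.length;
  const params = { w: 0, b: 0 };
  for (let epoch = 0; epoch < epochs; epoch++) {
    let gradW = 0;
    let gradB = 0;
    for (let i = 0; i < n; i++) {
      const err = predict(params, xs[i]) - ys[i];
      gradW += (2 / n) * err * xs[i]; // d(loss)/dw
      gradB += (2 / n) * err;         // d(loss)/db
    }
    params.w -= learningRate * gradW;
    params.b -= learningRate * gradB;
  }
  return params;
}

// Synthetic data generated from y = 2x + 1.
const xs = [0, 1, 2, 3, 4];
const ys = xs.map((x) => 2 * x + 1);

const fitted = train(xs, ys);
console.log(fitted, mseLoss(fitted, xs, ys));
```

In TensorFlow.js the same three pieces (model, loss, training loop) are expressed with tensors and an optimizer, and the gradients are computed automatically instead of by hand.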
In JavaScript, we have Keras.js, but to be honest, it's far, far behind the one that is developed in Python, unfortunately. So that's one thing. Let's move back to the slides.
When it comes to skin cancer, as you can see, you have three images here, and only two of them are cancers: the one on the left and the one on the right. On the left, you can see some patterns, I mean the dots and the stripes, some vessels visible here, and all of that is not very symmetric, while the borders are quite smooth really. On the right, you have a pattern that is called the blue veil pattern. You can see whitish and bluish colors here. It means that the cancer is going deeper into the skin, trying to get to the vessels. That's bad for the patient, but it's also a pattern that tells us, oh, that's really bad. In the middle, you have an example of a suspicious case that is not confirmed to be cancer. All of these cases have been checked by pathology, I mean, confirmed to be cancer or not.
So how do medical doctors do that? They use some kind of scoring method like ABCD, the seven-point checklist, the seven-point score, Hunter's score, the three-point checklist, and so on. They use patterns like asymmetry, border sharpness, and number of colors. In ABCD, you have six colors that really count, and they check how many of them are present. When you write down one by one all the different patterns, you can easily count more than 30 patterns that a medical doctor can find on the image, or just by using the loupe directly.
What you can also do, and what I did in my research, is to use different wavelengths of light to get not only what you see in visible light, but also what's deeper in the skin, using, for example, infrared light.
I used four different wavelengths of light in total, and here you can see some of them, infrared and UV, which make the melanin and the vessels more visible. That's how you can do better research, because you have more details, more information. You can also do some image processing, like binarization in this case, to find out, oh, the border here is not smooth. You can, for example, use fractal methods to analyze that.
All right. So, another demo, because I would like to show you what kind of methods you can use. Here's one of the examples. You can find more in my GitHub repository. Here's a way to extract the asymmetry, or first of all, to extract the region. So we take this part, this region. I use the ISIC data here, as you can see. The next step is the calculation. In this case, as you can see, it's Python, because I developed the model in Python and then exported it to be used in JavaScript. Here you divide the picture into regions and calculate the symmetry of the blocks on opposite sides. This is how you easily compute the asymmetry, and this is how you can handle many patterns, I mean, with some basic image processing. It allows us to find some of the patterns. Others are more difficult, and you need to build sophisticated models based on neural networks. Please feel free to read more about that.
But there is something important, because I mentioned neural networks and also shallow models. Just to give you a better understanding: we have black box and white box models. The black boxes are neural networks, in most cases. Here you can see a very small network with three layers. It's easily trained to an accuracy of 98%. Very high. But if you print the weights, here you can see the printed weights of just one layer.
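The binarization and asymmetry steps described above can be illustrated with a small sketch. This is not the speaker's implementation (that one is in Python and not shown in the transcript). It assumes a grayscale image stored as a 2D array, a lesion that is darker than the surrounding skin and roughly centered in the crop, and a fixed threshold. The function names `binarize` and `mirrorMismatch` are hypothetical.

```javascript
// Binarization: pixels darker than the threshold become 1 (lesion), others 0 (skin).
function binarize(gray, threshold) {
  return gray.map((row) => row.map((v) => (v < threshold ? 1 : 0)));
}

// Asymmetry score: fraction of lesion pixels whose mirror image across the
// chosen central axis is NOT a lesion pixel. 0 means perfectly symmetric.
// axis "vertical" mirrors left/right, axis "horizontal" mirrors top/bottom.
function mirrorMismatch(mask, axis) {
  const h = mask.length;
  const w = mask[0].length;
  let lesion = 0;
  let mismatched = 0;
  for (let r = 0; r < h; r++) {
    for (let c = 0; c < w; c++) {
      if (mask[r][c] !== 1) continue;
      lesion++;
      const mirrored =
        axis === "vertical" ? mask[r][w - 1 - c] : mask[h - 1 - r][c];
      if (mirrored !== 1) mismatched++;
    }
  }
  return lesion === 0 ? 0 : mismatched / lesion;
}

// Two tiny synthetic "images" (values 0..255), invented for illustration.
const symmetricGray = [
  [200, 200, 200, 200],
  [200, 50, 50, 200],
  [200, 50, 50, 200],
  [200, 200, 200, 200],
];
const lopsidedGray = [
  [200, 200, 200, 200],
  [50, 50, 50, 200],
  [200, 50, 50, 200],
  [200, 200, 200, 200],
];

const symmetricMask = binarize(symmetricGray, 128);
const lopsidedMask = binarize(lopsidedGray, 128);

console.log(mirrorMismatch(symmetricMask, "vertical"));
console.log(mirrorMismatch(lopsidedMask, "vertical"));
```

On real dermatoscopic images the lesion would first have to be segmented and centered, and the axes are usually aligned with the lesion rather than with the image borders, so this sketch only shows the comparison of opposite sides.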
It looks like that. If you try to explain it, it's very hard, even impossible, to explain each of these numbers. What does it mean? There are some explainability methods to explain the weights in a neural network, but that's more complex. When it comes to shallow methods, the white box methods, they are easy to interpret and easy to explain. As you can see here, this is a decision tree. Even if you don't know what the Gini index is, or that X2 is a feature, just by looking at this simple tree you can easily convert it to a set of if statements. It is easily understandable for anyone, especially for software engineers, right? Now, going back to the slides. Again, because of the limited time, please find more examples in the GitHub repository.
This is what the first application looked like. The algorithm checks automatically. Here you have the ABC scoring method: you take a photo, and the algorithm is able to find the specific patterns.
So what are the pros of using JavaScript for skin cancer analysis, or for machine learning in general? It can easily be used for prototyping and also for mobile phones, right? It works with all of them, or most of them. You have a huge JavaScript community that can support you while you learn JavaScript. The entry barrier to JavaScript is quite low, so that's not a big deal. And if your app can also be used with the camera in your laptop, then you can easily convert it to a web app as well.
What is not so good? Well, there is no real MLOps support when it comes to JavaScript. Most of that still relies on robust solutions, mostly in Python, when it comes to MLOps.
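As an illustration of the decision-tree remark earlier in this part of the talk, here is what "converting a tree to a set of if statements" can look like. The tree on the speaker's slide is not available in the transcript, so the features, thresholds and class labels below are entirely hypothetical. Only the shape of the code is the point.

```javascript
// A small decision tree written out as nested if statements.
// x is a feature vector; x[2] and x[3] play the role of "X2" and "X3"
// on a typical tree diagram. All thresholds and labels are hypothetical.
function predictWithTree(x) {
  if (x[2] <= 2.45) {
    return "class_0";
  }
  if (x[3] <= 1.75) {
    if (x[2] <= 4.95) {
      return "class_1";
    }
    return "class_2";
  }
  return "class_2";
}

console.log(predictWithTree([5.1, 3.5, 1.4, 0.2]));
console.log(predictWithTree([6.0, 2.9, 4.5, 1.5]));
console.log(predictWithTree([6.3, 3.3, 6.0, 2.5]));
```

Every prediction can be traced to the exact comparisons that produced it, which is what makes such white box models easy to explain compared with a matrix of neural network weights.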
I mean, here you have Kubeflow, and obviously the cloud providers like AWS, Google Cloud, and Azure, which provide some MLOps tools. But when it comes to JavaScript, it's still limited. As for the community, it's huge, but still not so huge when it comes to machine learning topics. And there are many, many libraries. You know the jokes about JavaScript libraries, that every week a new JavaScript library pops up and technically you can use it. You can see it very well in JavaScript when it comes to machine learning: if you search for machine learning, you will find more machine learning libraries in JavaScript than in Python. But the thing is that even if there are more libraries for machine learning in JavaScript, there are fewer robust libraries that you can really use in production, compared to Python, R, Scala, Java, and probably some more languages.
Thank you for your attention. I hope you will not get skin cancer and that everyone will be healthy. In any case, please find the notebooks with some examples in my GitHub repository. Thank you.
25 min
20 Jun, 2022
