Introduction to Deep Learning and Applications in Image Processing
Overview
Are you a student or a researcher working with large datasets? Do you want to build Deep Learning Models? Join this webinar to explore Deep Learning concepts, use MATLAB Apps for automating your labelling, and generate CUDA code automatically.
Highlights
- Create, modify and analyze Deep Learning architectures
- Automate ground-truth labelling of image, audio and video data
- Accelerate training on GPUs/cloud platforms
About the Presenters
Shayoni Dutta, PhD, Senior Application Engineer, MathWorks
Shayoni Dutta is a Senior Application Engineer at MathWorks focusing on technical computing. Her core experience lies in computational biology modeling and simulation, advanced statistics, machine/deep learning, medical imaging, and clinical-trial analytics.
Prior to joining MathWorks, Shayoni worked as a data scientist at Bayer and, before that, as an imaging scientist at the Sun Pharma Advanced Research Centre. In parallel, she has served for the last six years as adjunct faculty at the National Institute of Forensic Science and Criminology under the Home Ministry.
She holds a PhD in Computational Biology from the Indian Institute of Technology, Delhi, and has published in and reviewed for numerous international conferences and journals.
Praful Pai, PhD, Education Technical Evangelist, MathWorks
Praful works with the Education Team at MathWorks India, where his focus is on collaborating with faculty, researchers, and students to make STEM education engaging and to accelerate research in science and engineering.
He completed his undergraduate studies in Biomedical Engineering at Manipal Institute of Technology, and his MS and PhD at the Department of Electronics & Electrical Communication Engineering, Indian Institute of Technology, Kharagpur. Prior to joining MathWorks, he worked as a Research Scientist with the National Brain Research Centre, Gurgaon, on developing an MRI brain template for the Indian population. He is passionate about learning, teaching, and research in multi-disciplinary domains involving instrumentation, signal/image processing, statistics, and machine learning.
Recorded: 23 Nov 2020
Welcome to the MathWorks India Science Webinar Series on Using Computational Tools in Science and Research. This is the seventh and the final edition in this series, and we'll be focusing on introduction to Deep Learning and Applications in Image Processing. My name is Praful, and I work with the education team in MathWorks India, wherein we primarily work with faculty, students, and researchers in supporting teaching and research with MATLAB and Simulink. Joining me today is my colleague Dr. Dhruv Chandel, who, again, works with me in the education team.
This is the agenda that we'll be looking at today. We'll first go over some introductory deep learning terminology. We'll then proceed to look at how we can automate labeling of image, audio, and video data. We'll look at creating, modifying, and analyzing deep learning architectures. And finally, Dhruv will take you through how you can accelerate your training on GPUs or cloud platforms. And we'll have a final Q&A at the very end.
Talking about deep learning, where does deep learning really fit in? So deep learning is a subset of artificial intelligence and machine learning. So artificial intelligence itself is any technique which enables machines to mimic human intelligence. And it came about with the advent of computers themselves.
Machine learning, on the other hand, involves statistical methods which enable machines to learn tasks from data without any explicit hard-coded programming. This came about in the 1980s, and various algorithms were developed then.
Deep learning, though, is a nascent field which has really come to the fore only in the last decade, wherein neural networks with many, many layers have been used to learn representations of tasks directly from the data. So, as we said, deep learning is a subset of machine learning which works with automatic feature extraction. That is, it learns features and tasks from the data, and its accuracy can surpass traditional machine learning algorithms.
Let's take this example of recognizing an image of a vehicle. In deep learning, you'll see that you just give all the images of vehicles to your deep neural network, and the neural network will be trained to arrive at a prediction, which could be the type of vehicle in the image. Contrast this with traditional machine learning, which looks like this.
If you're doing the same task using machine learning techniques, you'd have to extract some sort of features from these images, maybe edges or different textures in the image, and so on. So you'd need to extract some sort of characteristics, followed by giving these characteristics to a machine learning algorithm to arrive at your final decision or classification. In this case, the features that we want to extract are specified by us, and the type of model that we want to build is again specified by us.
In contrast, as I said, deep learning combines both the feature extraction and the classification bits of this task into a single operation performed by one vast neural network, in this case a convolutional neural network. So, in essence, it performs end-to-end learning by learning features, representations, and tasks directly from the data. The only thing needed from our end is to specify the architecture of the neural network.
So how does deep learning really work? We already know that deep learning uses neural networks, which work similarly to the human brain. Neural networks are simply a collection of neurons arranged in different layers. Each neuron can be considered a multiplication and an addition operation. There are a bunch of neurons arranged in layers, and the layers are connected to each other.
So, as you can see, there is an input layer which receives the input, be it a signal, an image, or whatever it might be. There are a bunch of hidden layers, through which the information is passed forward. And finally, at the output, we have a bunch of decisions or classes, which represent the task that we're trying to complete.
There are different layer combinations in neural networks that we'll see shortly, and each of these neurons has its own weights and biases. Information is passed in a forward pass through the neural network, while the weights in the neural network are adjusted in a backward pass, in a process known as backpropagation. This adjustment of the weights is what leads the neural network to learn the task that it is doing.
Most commonly, it can be used for classification, wherein the output is categorical in nature, or it can be used for regression tasks, wherein the output is numeric or continuous in nature. So why has deep learning been in the news lately, and why has this been such a buzzword?
This has happened because deep learning models can often match and surpass human accuracy in a bunch of tasks. This graph represents the performance of deep neural networks in identifying the class of images given to them. You can see that the error rate on the ImageNet data set has been going down over the years, and in 2015, the deep neural network which won that year's ImageNet Large Scale Visual Recognition Challenge actually performed better than the set of human raters who annotated the data set.
In addition to this, deep learning has also been enabled by a bunch of other factors. It has really grown in popularity due to three main reasons: the increased availability of labeled public data sets, like the one shown in this picture; increased GPU acceleration, or computational power, available to us; and a bunch of world-class neural network models made available by researchers such as yourselves.
Now, let's look at a few examples of deep learning in the real world. In this example from the University of Heidelberg, Germany, they looked at predicting survival from colorectal cancer histology slides using deep learning with semantic segmentation. That is, they isolated individual parts of the histopathology image and then used these classifications to predict survival rates. Another example, from an industry customer: Genentech used deep learning to annotate very large images, which we'll see shortly.
In this case, they used a particular network called U-Net to annotate images which were of size 25,000 pixels by 25,000 pixels. This task is known as semantic segmentation, and we'll see shortly how MATLAB enables that.
The third example, from Ritsumeikan University, trained CNNs, or convolutional neural networks, for reducing radiation exposure risk in CT imaging. What they did in this case was use neural networks to increase diagnostic accuracy while reducing radiation exposure at the same time. And finally, in this example from the University of Texas at Austin, researchers created a speech-driven brain-computer interface to enable ALS patients to communicate by thinking of the act of speaking specific phrases. In this case, they used wavelet scalograms to train the deep neural network for classification and achieved an accuracy of nearly 96%. They also accelerated their training times by a factor of 10. And all of these examples were done using MATLAB.
So MATLAB enables deep learning workflows across different domains, and it is designed specifically for scientists and engineers. The pictures shown below just indicate a bunch of these domains wherein deep learning using MATLAB can be applied. You can go anywhere from computer vision applications for, say, automated driving, to audio processing, to signal processing and sensor data analytics. Essentially, our toolboxes have capabilities beyond deep learning, so you can work on problems in many different domains all within one framework.
So how do you get started with deep learning in MATLAB? Any deep learning problem can essentially be broken down into four parts, the first being accessing and exploring data, wherein you're reading your particular files, or you're receiving data from a bunch of sensors, cameras, or whatever acquisition devices you have. Then you look at step two, that is, labeling and preprocessing this data, wherein you may look at augmenting the data that you already have, creating labeled data sets, or importing reference models which you can use to work with this data.
Next, you'll develop predictive models. You'll train these models to achieve the task that you are trying to do, tune the hyperparameters of these models, so to speak, and visualize the intermediate results achieved within the network.
Finally, you'll integrate your trained model with other systems. You may choose to create your own apps to share your deep learning algorithm, you may scale to an enterprise level by integrating with other programming languages or platforms, or you can deploy your deep learning models to embedded devices and hardware.
Let's just quickly look at how easy it is to get started with deep learning in MATLAB. If you were, say, doing some amount of deep learning with image frames, this is how you'd go about it. You would go ahead and define a deep neural network. In this case, we take a pretrained neural network called AlexNet, and we acquire images from a webcam, which can be interfaced directly with MATLAB.
Let's just create a loop to acquire images continuously, taking snapshots at regular intervals. What we'll do is resize each image to fit this particular network-- you'll see shortly why-- and classify what is in the image using the network. This classification step yields a label, which we can display along with the image for each frame. So let's just look at how these few lines help us in doing frame-by-frame image classification in videos.
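For reference, here is a minimal sketch of those few lines. It assumes the MATLAB Support Package for USB Webcams and the Deep Learning Toolbox Model for AlexNet Network are installed; treat it as an illustration, not the exact demo code.

```matlab
% A minimal sketch of the webcam classification demo described above.
camera = webcam;                              % connect to the default webcam
net = alexnet;                                % load the pretrained network
inputSize = net.Layers(1).InputSize(1:2);     % AlexNet expects 227x227x3 input

while true
    im = snapshot(camera);                    % grab a frame
    imResized = imresize(im, inputSize);      % resize to fit the network
    label = classify(net, imResized);         % predict a class label
    image(im); title(char(label));            % show the frame with its label
    drawnow
end
```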
So, as you can see, this deep learning application is acquiring images from a camera connected to the computer, and the deep neural network AlexNet is classifying the objects which are there in the image. In this case, since the camera is pointed at a cup, you can see the label provided by the deep neural network right up here, and you can see a bunch of other possible labels here as well.
As the image keeps changing, you can see that it is not a 100% accurate representation; there are changes in this as well. As you can see, while the display is mainly focused on a bag, the deep neural network first detects it as a soap dispenser, and then correctly classifies it.
What you see on the top is the label with the highest probability assigned by the deep neural network. So in this case, if, say, the class vacuum had a higher probability, say 99% or 99.9%, then vacuum would be the class shown here. We'll see how all of this works shortly. But let's move ahead.
Keeping that example in mind, if you were working with deep learning on images, you can essentially perform the following tasks. You can perform object recognition, which is what we were doing in the video that you just saw, wherein you give an image as input to the neural network, and the neural network gives out a label for what is in the image. You can perform object detection, which goes one step further and determines where exactly in the image an object is located. For example, in this case, a neural network called YOLO v2 has been used to detect specific objects in the image, like yachts, individuals, or an airplane.
You can perform image-to-image regression, say, wherein you're looking at improving the resolution of a low-res image and getting a higher-quality image from it. Or you can perform something called semantic segmentation, wherein you label individual pixels present in the image, as shown. The picture shown on your left shows a free-road detection algorithm, wherein all the pixels which indicate the road are highlighted in green, while objects which are on the road, say vehicles, are highlighted in blue.
On the right, though, you have a similar example on a medical image, wherein a tumor region in a brain scan, an MRI or a CT scan, is highlighted with different segments in different colors. Keeping this in mind, let's focus on the first step of the deep learning workflow, accessing and exploring data. Deep neural networks take in numeric data: images are taken in as numeric matrices, signals as numeric vectors, and text as numeric vectors as well. Since this webinar is focused primarily on images, we'll just talk about those for now.
So how do you load and access large amounts of data, say, the multiple images that you have? Instead of reading one image at a time, you can use something called an imageDatastore, which can access a whole bunch of images, say, stored in a folder, in one go.
This datastore will load images or signal data into memory as and when needed. You can also define your own custom datastore, depending upon the need that is there. Another feature is tall arrays. Tall arrays allow you to work with out-of-memory numeric data, and in the deep learning space, you can use them for training deep neural networks on numeric arrays.
And, finally, the last feature is big images. Herein, you can read in large TIFF image files: the Image Processing Toolbox contains a bigimage object that represents really big images, like we saw in some of the user stories that we highlighted earlier.
It represents these big images as smaller blocks of data that can be independently loaded and processed. The example shown here shows satellite data at three different resolutions. In each case, the size of each small red block shown is 1,024 by 1,024 pixels, and you can imagine the size of the individual images shown as well.
So let's focus on the bigimage part of it. How big is the "big" in big images? Big images are any images too large to fit in main memory for processing. The bigimage class wraps around very large image files, which can have multiple resolution levels. We routinely test with 90,000-by-90,000 RGB TIFF files, which can be a gigabyte in size or more, without reading them entirely into memory. You can go ahead and try it out yourself using a simple command, like bigimage, to read a sample TIFF file which is present in MATLAB.
Similar to the imageDatastore for regular images, you have a bigimageDatastore as well, which can be constructed from one or more bigimage objects. In this case, you use the read option to read mini-batches of the different tiles which are present in the image, and you can use options like readRelative to access a neighboring block.
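As a rough sketch of the block-based reading just described (this needs Image Processing Toolbox R2019b or later; the file name here is illustrative, taken from the toolbox's sample data):

```matlab
% A hedged sketch of block-based access to a very large image.
bim = bigimage('tumor_091R.tif');          % wraps the file; nothing loads yet

bimds = bigimageDatastore(bim, 1, ...      % datastore over resolution level 1
    'BlockSize', [1024 1024]);             % read one 1024x1024 block at a time
while hasdata(bimds)
    blocks = read(bimds);                  % returns a cell array of blocks
end
```

So that was it for accessing and exploring data. Let's look a little bit at labeling and preprocessing.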
So how do you label the data that you have? You can do that by going to the Apps tab in MATLAB, which has a bunch of GUI apps for easing the labeling process. In particular, you have image and video labelers. You also have audio and signal labelers here, and a Ground Truth Labeler for your automotive applications. In addition, we have recently introduced a Lidar Labeler as well.
Let's just look at where you can find these. If you go to MATLAB, onto the Apps tab, you can find all of these labelers in the apps over here, and you can simply type in "label" to access a bunch of them.
Going back to the presentation-- as I said, you have your Image Labeler for labeling a bunch of images, or different parts of an image, and so on. You have your Ground Truth Labeler for your automotive applications, your Signal Labeler, your Audio Labeler, and your Video Labeler. In addition to that, in case you need to label big images, you can find a Big Image Labeler on the File Exchange.
Let's just focus on the image and video labelers for now. This short video over here indicates how you can load images from a folder, say these two car images, open them, and browse through the images. You can create an ROI label yourself; say, in this case, we create a label called Vehicle.
We draw bounding boxes around that particular object in the image, and you can also automate this using a bunch of algorithms which are already provided. In this case, there are a bunch of vehicle detection and people detection algorithms, but you can come up with your own image processing algorithms and add them to this list for any other task that you may want to work with.
Beyond labeling, what if I don't have enough data to work with? Or what if there's a data imbalance, that is, one class has a lot more data than another? In this case, one approach is to use a datastore transformation in MATLAB, which will apply random transformations, that is, scaling, rotation, or translation, to your dataset. You can then write these files out as new images and add them to your dataset.
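A minimal sketch of such a transformed datastore is shown below; the folder name and the rotation range are purely illustrative.

```matlab
% A hedged sketch: apply a random rotation to each image as it is read.
imds = imageDatastore('myImages');        % hypothetical folder of images
tds  = transform(imds, @(img) imrotate(img, 60*(rand - 0.5), 'crop'));

img = read(tds);                          % each read applies a fresh rotation
imwrite(img, 'augmented_001.png');        % optionally write out a new image
```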
Another approach is to use something called Generative Adversarial Networks. These GANs are neural networks themselves, and they can create synthetic images from noise.
This video shows GANs creating new face images from random noise, using deep learning to improve your deep learning models themselves. Another option is to synthesize data outright.
You can synthesize a bunch of data depending upon what you know of the process. For example, you can use the phased array and communications toolboxes for synthesizing radar and wireless signals for training your networks. You can do similar things for, say, biosignals such as the ECG or the EEG.
Now that we have looked at labeling and preprocessing with augmentation, let's see how we can work with a bunch of neural network models. When working with any type of images or signals, you need to select your neural network architecture first.
If you are working with images, you would maybe select a CNN, or convolutional neural network, architecture. If you're working with signals, you could convert the signals to a two-dimensional domain and use CNNs, or you can use another architecture type called LSTMs, or long short-term memory networks.
Since we're focusing primarily on images, let's just look at CNNs. Convolutional neural networks consist of a bunch of layer combinations which can be used for, say, classification or regression.
So as we saw, deep neural networks consist of a bunch of layers. So you can generalize these initial layers to correspond to the feature-learning aspects of the tasks, while the final layers correspond to, say, the classification or regression aspects of the task at hand.
In this case of, say, recognizing the vehicle in an image, the initial layers, which can have a bunch of convolutional, ReLU, or pooling layers, learn the features which are then used by the final layers to assign a label to the image.
So how do you know which layers to use? You can break this down into several categories. For images, you can use feature-extraction layers like the convolutional layers we just saw, while for sequence data you may use LSTM layers.
For activation functions, which follow the convolutional or LSTM layers, you have your ReLU and hyperbolic tangent functions. And for normalization, we can use something called dropout, or batch normalization.
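To make these categories concrete, here is a hedged sketch of how such layers assemble into a small image-classification network; the filter sizes and counts are illustrative, not from the webinar.

```matlab
% A minimal sketch assembling the layer categories above.
layers = [
    imageInputLayer([227 227 3])                    % input
    convolution2dLayer(3, 16, 'Padding', 'same')    % feature extraction
    batchNormalizationLayer                         % normalization
    reluLayer                                       % activation
    maxPooling2dLayer(2, 'Stride', 2)               % pooling
    fullyConnectedLayer(3)                          % three output classes
    softmaxLayer
    classificationLayer];
analyzeNetwork(layers)   % check for errors before training
```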
If you're still unsure, the best guides would be research papers themselves, or documentation examples in different areas, which can provide you with guidelines for creating your architecture.
Now, if all of this sounds too tedious, then when selecting an architecture, you can use Deep Network Designer, which is an app for exactly that. You can use Deep Network Designer to quickly design and train deep neural networks, or to use or modify existing pretrained networks. You can find it in the Apps tab over here.
Now let's just jump into MATLAB and maybe look at what AlexNet looks like. In MATLAB, you can go to the Apps tab and simply type in Deep Network Designer; you'll find it in the machine learning and deep learning section. And this is what the app looks like when it opens up.
You can create a blank network from scratch, import a network which is there in the workspace, or load from a bunch of pretrained networks. So the example for video classification that we just saw earlier used something called AlexNet, right?
So let's browse through this list. AlexNet is right here, so let's go ahead and open it. This will import a pretrained network called AlexNet into Deep Network Designer. As you can see, this creates a bunch of these boxes which I can zoom into and see what they are.
Each of these individual rectangles is a layer in the neural network. So we have your input layer, a convolutional layer, what we call a ReLU layer, and a normalization layer. OK? Selecting each layer shows its properties over here.
In addition to that, you can introduce new layers from the list provided in the layer library over here. We have a bunch of different layer types to choose from: your input layers, your convolutional and fully connected layers, sequence layers.
Your activation layers, your normalization and other utility layers, your pooling layers, your combination layers, or your object detection layers-- you find them all here. Each of these layers performs specific tasks, and if you want to learn more about what each layer does, you can go through the documentation and read up on each one.
You can get quick access to the documentation right from this toolstrip. Now, getting back to the presentation-- when talking about deep learning, there are basically two approaches: you can either use a pretrained model like AlexNet to perform your tasks, which we just demonstrated earlier, or you can train your deep neural network from scratch.
If you're using a pretrained network-- so pretrained models have been trained for very specific tasks, and they have a bunch of predefined layer orders and parameter values. They can be used for inference without the need for any additional training.
There are a bunch of models in Deep Network Designer, as I already showed, and you can access the same. In addition to that, in case a model is not available in MATLAB, you can access pretrained models or import models from the web as well.
You can import models from Keras or Caffe using the importers which are available, or you can import models from other frameworks, such as TensorFlow or PyTorch, through the Open Neural Network Exchange (ONNX) format, bring them into MATLAB, and start working.
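For instance, an ONNX import might look something like this; the file name is hypothetical, and the importer requires its own support package.

```matlab
% A hedged sketch of importing an ONNX model (requires the Deep Learning
% Toolbox Converter for ONNX Model Format; 'model.onnx' is hypothetical).
net = importONNXNetwork('model.onnx', 'OutputLayerType', 'classification');
analyzeNetwork(net)   % inspect the imported layers
% Similarly, importKerasNetwork and importCaffeNetwork exist for those
% frameworks, each via its own support package.
```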
The other approach is to train a deep neural network from scratch. Training from scratch means designing the network: deciding how many layers there are, what types of layers they are, and how they are connected to each other.
This is used especially when the available pretrained models do not work. In this case, for the image shown on the right, you can see an image of a car, which we can distinguish clearly, but the object was detected to be a laptop.
So in case you get low accuracy on your dataset from your pretrained model, or you have completely different category definitions, or you have a completely different task, say classification versus regression-- in those cases, you may need to train your own models.
Furthermore, pretrained models might not be available for your data. Most available pretrained networks have been trained on natural images, that is, images of naturally occurring scenes.
However, if you were training your models on, say, paintings, or infrared images, or X-ray or MRI images, or even histopathology images, then you might not really come across a large number of pretrained networks which help you do that.
In that case, you would have to resort to modifying existing models to suit your own use case. What we would do here is something called transfer learning. In transfer learning, you use a pretrained model and modify it to suit the task that you want to do.
So in this case, you load your pretrained network. The early layers, which learn the features themselves, are retained, while the final layers, which use these feature values to perform the classification or regression task, are changed according to what you want to do.
So, as I said, the final layers are replaced and this new network is trained, essentially transferring information from the feature-learning layers to the new task. In this case, because you did not have to learn the weights and biases for the feature-learning layers from scratch, instead of training on millions or thousands of images, training with just hundreds of images might be sufficient for your task. Finally, once you have achieved a reasonable degree of accuracy, you can use the trained network for inference.
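Programmatically, the transfer-learning recipe just described looks roughly like this; imdsTrain is a hypothetical labeled imageDatastore, and the training options are illustrative.

```matlab
% A hedged sketch of transfer learning with AlexNet.
net = alexnet;
layersTransfer = net.Layers(1:end-3);        % keep the feature-learning layers
numClasses = numel(categories(imdsTrain.Labels));
layers = [
    layersTransfer
    fullyConnectedLayer(numClasses)          % new task-specific layers
    softmaxLayer
    classificationLayer];
options = trainingOptions('sgdm', ...
    'InitialLearnRate', 1e-4, ...            % small rate: features are pretrained
    'MaxEpochs', 8, ...
    'Plots', 'training-progress');
augTrain = augmentedImageDatastore([227 227], imdsTrain);  % resize to fit AlexNet
netTransfer = trainNetwork(augTrain, layers, options);
```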
Now let's look at a couple of examples which show all three of these steps in MATLAB. The first example that we'll take is classifying parasitic infections in blood smear images.
In this case, the goal is to have the deep neural network extract a bunch of features from blood smear images, that is, images of, say, a drop of blood smeared on a slide, put under a microscope, and captured using a camera, and to use these features to differentiate between different types of parasitic infections.
In this case, we'll be differentiating between babesiosis, Plasmodium, and trypanosomiasis. So let's just jump into MATLAB and see how we can do that. Now, as I said, we'll look at three things: accessing the data, training a deep neural network model, and finally running inference with the final model.
Let's go ahead and configure this. So what I'm opening up is a project which kind of combines all of the data as well as the network that we developed into one single easy to use interface.
Let's go ahead and access the dataset. In this case, our data is in this particular folder called Blood Smeared Images, which has three subfolders, one for each class of images. Each of these contains a bunch of images.
Now let's go ahead and access these using an imageDatastore. To do this, you simply need to specify the root path and the folder path. You can include subfolders and the images within them, and you can even use the folder names as your image labels.
This creates the image datastore called imgSet. And as you can see in this live script, you have the ImageDatastore with a bunch of files in your base folder, and your labels.
In this case, there are around 119 files in this particular datastore. And if you just look at the properties of this datastore, there are a bunch of methods to work on these.
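The datastore call behind this is essentially a one-liner; here is a hedged sketch, with the folder name approximate.

```matlab
% A hedged sketch of the datastore setup in this demo.
imgSet = imageDatastore('BloodSmearImages', ...
    'IncludeSubfolders', true, ...       % pick up the three class subfolders
    'LabelSource', 'foldernames');       % folder names become image labels
countEachLabel(imgSet)                   % tally of images per class
```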
Now let's just look at what each of these images looks like. You can sample images from each of the labels in the datastore randomly, and you can create a montage of these as well.
Let me just go ahead and rerun this, and you'll see that these images are picked randomly from the folders. Every time I run this particular piece of code, you can see that three images from each class, that is, babesiosis, Plasmodium, and trypanosomiasis, are selected.
Now, let's go ahead and see how we can partition this into training and testing sets. Before even partitioning, let's just look at how much data we have for each of these classes.
We can simply count each label from here and create bar plots. In this case, you'll see that, looking at the blood smear images, you have around 15 images each for the babesiosis and trypanosomiasis classes, while the Plasmodium class has around 85 or nearly 90 images.
Because one class has way more images than the others, you may need to either augment the smaller classes or cut the dataset down to similar sizes. In this case, what we did was find the label having the minimum number of images and then simply sample that minimum number of images from each of the labels.
And then, once we have selected the minimum number of images from each of these three labels, we can simply divide them into training and testing sets based on a percentage that we set. In this case, we have further split it in a 70:30 ratio for our training and testing.
If you just run this particular piece of code, you'll see that there were around 15 to 16 images in the babesiosis and trypanosomiasis categories; we cut down the Plasmodium class to have a similar number of images, and then further divided them into training and testing sets, OK?
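As a sketch, the balancing and the 70:30 split described here can be done with countEachLabel and splitEachLabel:

```matlab
% A hedged sketch of balancing the classes and splitting 70:30.
tbl = countEachLabel(imgSet);
minCount = min(tbl.Count);                                  % smallest class size
imgSetEq = splitEachLabel(imgSet, minCount, 'randomize');   % equalize classes
[trainSet, testSet] = splitEachLabel(imgSetEq, 0.7, 'randomize');
```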
The next task is to prototype or develop the deep learning model itself. In this case, we'll use transfer learning, more specifically with AlexNet, to classify the images into particular classes.
Let's just go ahead and define AlexNet. One way of defining the network was via importing it into Deep Network Designer, and we already have AlexNet imported there. We can simply export this network to the workspace and have it exported as something called layers_1.
If I go back to the workspace, you'll see that this particular layers_1 is a 25-by-1 array of network layers, which is available here, OK? Now, another option to define AlexNet is simply to type net = alexnet. This will create a series network with 25 layers, and there are input and output names as well.
If we just want to look at each of these layers, or say we want to look at the first layer, we can simply type net.Layers and look at the appropriate index. In this case, if we look at just the first layer, it's the image input layer, which has an input size of 227-by-227-by-3.
So AlexNet takes in 3D, or RGB, images of size 227-by-227. In addition to that, there are a bunch of hyperparameters, which are variables there as well. So let's just go ahead and run this. AlexNet is set up to accept images of a specific size.
If you just look at our parasitology images, you'll see that these images come in a bunch of different sizes. Yeah? If we look at them in the file explorer, you'll see that, well, this image is of a particular size; other images might not be 227-by-227.
Moreover, if you just look at the Plasmodium images, all of these might be at different resolution levels. So since AlexNet is set up to accept images of a particular size, you'll have to resize these images before feeding them to the network.
In this case, the original images are all of size 300-by-300-by-3, while the required input size is 227-by-227. So we'll simply resize the original images to the input size. Now, let's simply see what the neural network does in action.
In this case, we just read the 16th image from the ImageDatastore into a variable, looked at the size of the variable, and then resized the image to fit AlexNet.
Now let's see what happens if we classify this image using AlexNet. To classify using AlexNet, you simply need this particular function, classify: you specify the network which you want to use for classification, and your input data.
In this case, the input data was the image in the InputImage variable. So if I just run this, you can see that AlexNet predicts the answer, or the class-- it assigns the label honeycomb to the image.
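Put together, the few lines behind this step look roughly like the following sketch.

```matlab
% A hedged sketch of the inference steps just shown.
inputImage = readimage(imgSet, 16);               % read one image
inputSize  = net.Layers(1).InputSize(1:2);        % [227 227]
inputImage = imresize(inputImage, inputSize);     % resize to fit AlexNet
label = classify(net, inputImage)                 % e.g., returns 'honeycomb'
```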
And if you just look at what the image looks like, AlexNet possibly looks at these structures, which look similar to, say, a honeycomb cross-section, and classifies it as such.
But wait-- does AlexNet even have classes for these three diseases? Does AlexNet even have an output corresponding to babesiosis, and so on? In this case, it might not. So just to check what outputs AlexNet does give, you can simply go to the variable net, look at the layers, and look at the final layer, that is, the classification output layer.
And you can see, AlexNet provides around 1,000 classes. And what are these classes like? They correspond to, say, tench, goldfish, tiger shark; there are vultures, or bald eagles, or kites.
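You can check this yourself from the command line; a hedged sketch:

```matlab
% A hedged sketch of inspecting the output classes of the network.
classNames = net.Layers(end).Classes;   % the 1,000 ImageNet categories
classNames(1:5)                         % tench, goldfish, tiger shark, ...
any(classNames == "Plasmodium")         % returns false: no such class
```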
Plasmodium is not in there, and the network certainly did not recognize it. So in this case, we need to modify the network for our work, OK? Let's just go ahead and use Deep Network Designer to do that.
Yeah? So our objective was to classify between these three classes, that is, babesiosis, Plasmodium, and trypanosomiasis. Now, AlexNet itself has around 1,000 classes in its output layer, so we don't need the extra 997 classes. So what do we do?
What we can do is replace the final layers of AlexNet with ones that suit our task. In this case, we'll replace the final three layers, that is, the fully connected, softmax, and classification output layers. We define the new fully connected layer to have an output size of 3, for our particular classes, and let's just go ahead and delete the old one.
Let's go ahead and connect this. We can auto-arrange the network and preview it. And before we go ahead and do anything with this, we can analyze the network to see if there are any faults in it, or whether the outputs of one layer fit the inputs of the next.
There aren't any size mismatches at all. As you can see, I don't have any warnings or errors, and I can just see the details. From my final dropout layer, I have a fully connected layer which has a size of three, followed by a softmax and a classification layer. Yeah?
Now that this is done, I can export this network to the workspace. This will create another network called layers_2. You can also choose to generate code from this; this will generate a live script which has these layer combinations present.
And if I just look at the live script, you have all of your layers specified over here, and you can even plot this and see it in the form of a layer graph. So let me just go ahead and run this and show you the layer graph. This is what the layer graph looks like.
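The generated script essentially builds the layer array and plots it; a hedged sketch of that last step:

```matlab
% A hedged sketch of visualizing the modified network as a layer graph
% (layers_2 is the array exported from Deep Network Designer).
lgraph = layerGraph(layers_2);
plot(lgraph)                     % visualize layer connectivity
```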
Now, in addition to this, you can also go ahead and provide your data right here in Deep Network Designer. If you just go to the Data tab over here, you can import your image data, either from a datastore or directly from a folder.
So you can import your data directly from a folder, or choose to import raw data, which is what we just did. In this case, let me go ahead and import the image data which is there. You remember we divided the data into training and test sets earlier.
Let's just select the training dataset, and again, for the validation set, let's select the test dataset. Now you see your data augmentation options right here. You can have random reflections along the x- or y-axis; you can have random rotations, say between 0 and 5 degrees, or maybe more than that, say 30 degrees.
You can set random scaling, say, up to 1.5, horizontal translation, say, up to 0.75, and vertical translation, say, again, 0.75, something of that sort. And then go ahead and import it.
This will import images from these datastores and help you start training. Now you can see the training data by different classes; you have your babesiosis and your Plasmodium classes over here.
You can see a slice of random observations from each class. And finally, go on to the next tab for your training, and select a bunch of training options. These are basically the solver settings, which help you reduce the output error and adjust the learning rates of the neural network, your validation frequency, your epochs, and so on and so forth.
In this case, let me just go ahead and run it for a few epochs, OK? I'll go ahead and close this and start training. This will import the network for training and show you what the network itself looks like.
This, if you remember, will automatically resize the input data. From the image size of 300-by-300 which was there earlier, it will automatically resize to 227-by-227 for AlexNet, and then begin training.
It shows you two graphs, the first for the accuracy of your training, and the second for the loss. You can see the training progress over here, and a bunch of other parameters. As you can see, initially the training accuracy is low, and as you keep going further, the training accuracy reaches somewhere around 40%. We'll go ahead and let this run. Yep.
Now, let's go back and-- yeah. Finally, there is this particular option here. Maybe the training accuracy is not improving that much, so you can go ahead and stop training. Once training has been stopped, you have your final accuracy over here.
You can see that the validation accuracy is 33%, which is exactly the baseline, chance-level accuracy when classifying into three classes, OK? And you can generate code for whatever you just did in the GUI, right here in Deep Network Designer.
You can simply click on this and have the training setup copied to a MAT-file with your initial parameters. This will copy all of the parameters to the MAT-file and also generate a live script. Yeah.
This is an automatically generated live script; as you can see here, it was auto-generated just now. Whatever we did in the GUI has been replicated, and the code for it has been generated here. You can set your training options and your layers, and simply train the network, right?
Now, could we do this entire thing programmatically as well? Yes, you can. Doing this programmatically, you can augment an image dataset using something called imageDataAugmenter, and specify a bunch of options-- the same options that we specified earlier for reflection, translation, and rotation.
In this case, you'll see that we're reflecting the images, and we're translating them by a lot more than we did earlier, right? Since I'm augmenting my data a lot more, there is a lot more data to train the network itself, so it trains better.
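Programmatically, that augmentation setup looks roughly like this; the exact ranges used in the demo may differ.

```matlab
% A hedged sketch of programmatic augmentation; ranges are illustrative.
augmenter = imageDataAugmenter( ...
    'RandXReflection',  true, ...
    'RandRotation',     [-30 30], ...     % degrees
    'RandScale',        [1 1.5], ...
    'RandXTranslation', [-20 20], ...     % pixels
    'RandYTranslation', [-20 20]);
augTrain = augmentedImageDatastore([227 227], trainSet, ...
    'DataAugmentation', augmenter);       % also resizes to AlexNet's input
```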
In this case, I'm not showing you the training, but I have a trained network already, which I'm loading from a MAT-file into the workspace. And to run inference with the trained network, I would just use classify with my trained network and my augmented image datastore.
In this case, I get an accuracy of nearly 80% on babesiosis, 60% on Plasmodium, and 100% on trypanosomiasis. You can also go over the individual images; we can change this image and see what particular class it belongs to as well.
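A hedged sketch of that evaluation step, using the datastore names from earlier (trainedNet stands in for whatever the network loaded from the MAT-file is called):

```matlab
% A hedged sketch of evaluating the trained network on the test set.
augTest   = augmentedImageDatastore([227 227], testSet);  % resize only
predicted = classify(trainedNet, augTest);                % labels per image
accuracy  = mean(predicted == testSet.Labels)             % overall accuracy
confusionchart(testSet.Labels, predicted)                 % per-class view
```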
In this first case, the true class is babesiosis, but it is predicted as Plasmodium with a confidence of, say, 44%. In this other case, the true class is babesiosis and the predicted class is babesiosis as well, and you can see that the confidence is also quite high.
Similarly, whenever there is an error in classification, a mismatch between the true class and the predicted class, you can see that the confidence, the score assigned to that prediction, goes down, while whenever there is a correct classification, the scores are quite high. Yeah.
So I hope that outlined what is possible in MATLAB using deep learning, and gave you a flavor of what can be done. Now, let's look at the last two steps, that is, developing predictive models at scale and integrating a model with other systems. I'll ask Dhruv to start presenting now.
So we wanted to briefly talk about why you may want to use parallel computing or similar functionality to speed up your code. We know that deep learning networks have a lot of computational requirements in order to train them, and sometimes also to deploy them.
What parallel computing can help you with is reducing the time it takes to train these kinds of networks, and it can also be used for different kinds of numerical computing problems. Now, why might you want to do this with MATLAB and Simulink? The main reason is that you can accelerate your computation with very few changes in your code, and the same familiar syntax that you use on the desktop can scale up to clusters and clouds as well.
So next, please. The idea in MATLAB is that you prototype on the desktop. If you have Parallel Computing Toolbox installed, you can use multiple cores of your CPU, or your GPU. Then you can scale up the same code to be deployed onto a cluster using MATLAB Parallel Server. And we also have support for AWS and Azure if you want to go to the cloud platforms.
And of course, all of this runs directly from within MATLAB with minimal setup. One of the easiest ways to get started is just to use the automatic parallel support that comes built in with all the toolboxes you see on the screen.
If you're working with image processing, statistics and machine learning, deep learning, et cetera, you don't even have to write any special code or change anything. You just enable a flag for parallel computing, and that will by default allow you to use all the cores of your CPU.
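For deep learning specifically, that flag is set through the training options; a hedged sketch:

```matlab
% A hedged sketch: parallel/GPU execution is just a training option.
options = trainingOptions('sgdm', ...
    'ExecutionEnvironment', 'multi-gpu', ...  % or 'parallel' for a pool/cluster
    'MaxEpochs', 10);
% Many other toolbox functions expose a similar switch, e.g.
% statset('UseParallel', true) in Statistics and Machine Learning Toolbox.
```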
Could we go next, please? And if you have access to multiple GPUs, then you'll be pleased to find that the training time for deep learning network scales quite nicely with multiple GPUs.
For example, in what we've shown in this graph-- this is a little bit older, but the results are comparable-- if you have a single GPU, it takes around 162 minutes to train, let's say, a ResNet-50 network.
If you have two GPUs, that time is almost halved to 85 minutes. And then if you have four GPUs on your system, that time is further reduced to 45 minutes or so. So you can scale up your training very easily and utilize the full power of your hardware.
And very quickly-- we're not going to show all the setup steps-- if you have access to a MATLAB Parallel Server, this could be an HPC cluster inside your campus or your research institute, or this could be something on the cloud; you can see the list there on the right-hand side of all the different kinds of options there might be.
All you have to do is once you have done the setup, you just go to the parallel button at the top of MATLAB, change your default cluster from local to some other name, whichever one you've set up.
And then by default all the code that you run inside your parallel computing constructs is automatically scaled up to the cloud, and it can take advantage of the extra computation power. So it really is quite simple to accelerate your computations.
Right. So that was a little bit about hardware accelerated training and how you can train faster. Now what do you do with the networks once they are trained? So let's take a look at that next.
One of the most efficient and useful ways of using your deep learning networks is to just build an app out of them. Praful has already shown you one of the apps, which is the Deep Network Designer, inside the machine learning and deep learning section.
You can also build your own apps. By going to the same Apps tab, you can design your own app, put your code inside it, and then package it up and share it with other people.
So the work that you do doesn't remain just on your computer. You can send it out to other people, and they can use the results of your work. And this can be as secure or as flexible as you want it to be.
Could we go to the next, please?
Sort of. We are on the next slide.
Oh, I'm sorry. The slides are not updating on my screen. If you want to voice over, maybe.
Sure. Sure. Thanks, Dhruv. So deep learning in MATLAB has come a long way from the beginning. CNNs, pretrained models, and the Caffe importer were introduced in 2016.
Now, in 2020, as you can see, there have been a lot of additions to it, and we continue to make even more with each and every release. In 2020, you have, in addition to your Deep Network Designer, an Experiment Manager app for running multiple deep learning experiments and selecting the best one.
And you also have over 200 examples across different domains for you to start with. So in case you're wondering where to start, you can go over a bunch of these examples yourself and modify them or learn from them to begin work on your own.
MATLAB also interoperates with other coding environments. You can call other language libraries from MATLAB, or you can call MATLAB from other languages as well. Right?
Right.
You can also deploy your deep neural networks onto different hardware platforms. You can convert your deep learning code, which is there in MATLAB, into C, C++, or CUDA code according to the deployment target that you have in mind.
You can deploy, or you can generate code for, Intel or ARM CPUs; NVIDIA, AMD, or Intel GPUs; or even generate HDL code for a bunch of FPGA platforms.
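For example, generating CUDA code for a trained network with GPU Coder looks roughly like this; the entry-point function name is hypothetical.

```matlab
% A hedged sketch of CUDA code generation with GPU Coder; the
% entry-point function myPredict is hypothetical.
cfg = coder.gpuConfig('mex');                               % build a MEX target
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn'); % use the cuDNN library
codegen -config cfg myPredict -args {ones(227,227,3,'single')}
```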
In terms of deployment and scaling of AI for deep learning tasks, you can develop all of your algorithms in MATLAB, package them, and scale them up to enterprise systems: as standalone applications, as web apps which can be accessed through the browser, to HDFS distributed systems, or interfaced with multiple other programming languages using MATLAB Production Server, scaling your entire algorithm up as per the requirements that you have.
Or, on the other hand, you can deploy to different embedded hardware by generating C, C++, HDL, PLC, or CUDA code.
The final takeaways from the webinar: we showed you how you can work with different image formats and large image datasets using datastores, and we showed you a bunch of MATLAB apps for labeling and automation on image and video datasets.
We showed you how we could use pretrained neural networks and easily create and modify networks with Deep Network Designer and ONNX Importer.
Finally, we showed how you can modify and select optimal hyperparameter values with Experiment Manager, how you can accelerate your deep learning work with GPUs, clusters, and the cloud, and how you can use App Designer to deploy your image processing or deep learning applications.
Now, you can work on deep learning problems in MATLAB Online, with your files stored on MATLAB Drive, which you can access via a browser. This does not require any download or installation at all. It's ready to go, and you always get access to the latest version of MATLAB this way.
There are no minimum device specifications other than those of your web browser, so in case you are working on an older computer, you can definitely use this.
You can also get started by yourself by doing a bunch of online training courses. There are a bunch of free courses, such as the MATLAB Onramp, Deep Learning Onramp, and Image Processing Onramp, which are two to three hours long and available on the MATLAB Academy website, that is, matlabacademy.mathworks.com.
Also, there are a bunch of focused courses which are longer in duration, especially MATLAB Fundamentals and Deep Learning with MATLAB, which may be available to you via your license. In order to access these, simply log into matlabacademy.mathworks.com with your institutional email ID, and in case your institute has a license, you should automatically get access to these.
As I said, you can gain access to a campus-wide license by simply signing in on the MathWorks website with your institute email and password. And in case you have any immediate needs and do not have access to a license, please download a 30-day trial from the MathWorks website, or reach out to us and we'll be happy to help.