An Ultimate Road map to Computer Vision 2021 – Image Classification

Hello everyone! Welcome to my blog. In the previous two posts, I listed resources and road maps for data science and deep learning.

In this post, I discuss the road map to Computer Vision 2021 – Image Classification. It covers basic to advanced algorithms used in image classification tasks, the model development life cycle (training, testing, deployment), and a few other tools and frameworks.

I have kept this post beginner friendly, with illustrated resources, resources on the mathematics behind the algorithms, and hands-on tutorials. Feel free to post your comments and queries.

What is Computer Vision?

It is a field of machine learning that focuses on enabling machines to replicate the functionality of the human eye. Computer vision applications include image classification, localisation, segmentation and generation. These can be achieved with neural network algorithms whose architectures are designed to learn the features and patterns of images.

Let’s discuss the road-map in 4 different parts.

  1. Non-neural-network, machine learning based computer vision tasks
  2. Deep learning based computer vision: evolution of convolutional neural networks (CNN)
  3. ImageNet Large Scale Visual Recognition Challenge (ILSVRC) architectures
  4. Tools and frameworks
Image Classification using Non-Neural-Network Machine Learning Algorithms :

Many people start learning computer vision straight from deep learning, without even an introduction to the multi-layer perceptron. I would suggest starting with basic machine learning algorithms such as K-Nearest Neighbors, Support Vector Machine, Random Forest and XGBoost.

This doubles as a revision session: you relearn the basic machine learning algorithms and apply them to an image classification task.

  1. K Nearest Neighbor :
    1. https://towardsdatascience.com/machine-learning-basics-with-the-k-nearest-neighbors-algorithm-6a6e71d01761
    2. https://www.javatpoint.com/k-nearest-neighbor-algorithm-for-machine-learning
    3. https://www.pyimagesearch.com/2016/08/08/k-nn-classifier-for-image-classification/
    4. https://yearsofnolight.medium.com/intro-to-image-classification-with-knn-987bc112f0c2
    5. https://medium.com/swlh/image-classification-with-k-nearest-neighbours-51b3a289280
  2. Support Vector Machine – SVM was one of the most widely used ML algorithms for image classification before CNNs
    1. https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code/
    2. https://www.kdnuggets.com/2018/12/solve-image-classification-problem-quickly-easily.html/2
    3. https://towardsdatascience.com/svm-support-vector-machine-for-classification-710a009f6873
    4. https://www.kaggle.com/halien/simple-image-classifer-with-svm
    5. https://www.kaggle.com/ashutoshvarma/image-classification-using-svm-92-accuracy
  3. Random forest and Decision Tree
    1. https://www.robots.ox.ac.uk/~vgg/publications/papers/bosch07a.pdf
    2. https://towardsdatascience.com/understanding-random-forest-58381e0602d2
    3. https://towardsdatascience.com/a-beginners-guide-to-decision-tree-classification-6d3209353ea
    4. https://www.linkedin.com/pulse/decision-tree-satellite-image-classification-jo%C3%A3o-otavio/
    5. https://github.com/87surendra/Random-Forest-Image-Classification-using-Python
    6. https://github.com/PraveenDubba/Image-Classification-using-Random-Forest/blob/master/Random_Forest_latest.py
  4. XGBoost :
    1. https://towardsdatascience.com/https-medium-com-vishalmorde-xgboost-algorithm-long-she-may-rein-edd9f99be63d
    2. https://machinelearningmastery.com/gentle-introduction-xgboost-applied-machine-learning/
    3. https://setscholars.net/how-to-do-fashion-mnist-image-classification-using-xgboost-in-python/
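
To make the idea of applying a classical algorithm to images concrete, here is a minimal k-nearest-neighbour sketch. It is only an illustration under stated assumptions: the images are tiny synthetic 8x8 arrays generated by a hypothetical helper (make_toy_images), each image is flattened into a vector of raw pixel values, and plain NumPy is used instead of any of the libraries from the tutorials above.

```python
import numpy as np


def knn_predict(train_x, train_y, test_x, k=3):
    """Classify each test image by majority vote among its k nearest training images."""
    # Flatten each image into one feature vector of raw pixel intensities.
    train_flat = train_x.reshape(len(train_x), -1).astype(np.float64)
    test_flat = test_x.reshape(len(test_x), -1).astype(np.float64)

    # Squared Euclidean distance between every test vector and every training vector.
    dists = ((test_flat[:, None, :] - train_flat[None, :, :]) ** 2).sum(axis=2)

    # Indices of the k closest training images for each test image.
    nearest = np.argsort(dists, axis=1)[:, :k]

    # Majority vote over the neighbours' labels.
    votes = train_y[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])


def make_toy_images(n_per_class, rng):
    """Hypothetical toy data: class 0 has a bright top half, class 1 a bright bottom half."""
    images, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            img = rng.normal(0.0, 0.1, size=(8, 8))
            if label == 0:
                img[:4, :] += 1.0
            else:
                img[4:, :] += 1.0
            images.append(img)
            labels.append(label)
    return np.stack(images), np.array(labels)


rng = np.random.default_rng(0)
train_x, train_y = make_toy_images(20, rng)
test_x, test_y = make_toy_images(5, rng)

pred = knn_predict(train_x, train_y, test_x, k=3)
accuracy = (pred == test_y).mean()
print("accuracy:", accuracy)
```

On real datasets, raw pixels are a weak feature space, which is one reason the tutorials above often add feature extraction steps before the classifier.
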
Computer vision using Deep learning – Evolution of Convolutional Neural Networks

Brief History:

Check the previous blog : https://prabakaranchandran.com/2021/04/26/a-complete-road-map-to-deep-learning-2021-part-1/

1980 – The first “convolutional network” was the Neocognitron, introduced by the Japanese scientist Fukushima (1980) and used for handwritten character recognition. The Neocognitron was inspired by the work of Hubel and Wiesel on the visual cortex of animals. At that time, the back-propagation algorithm was not yet used to train neural networks. The Neocognitron introduced the fundamental ideas behind ConvNets.

1998 – Modern convolutional networks trained with gradient-based back-propagation, inspired by the Neocognitron (by Fukushima). Yann LeCun et al., in their paper “Gradient-Based Learning Applied to Document Recognition” (cited 17,588 times at the time of writing), demonstrated a CNN model for handwritten character recognition.

Major parts of a Convolutional Neural Network (LeCun’s base architecture): working principles and terminology

  1. Convolutional Layers :
    1. https://towardsdatascience.com/gentle-dive-into-math-behind-convolutional-neural-networks-79a07dd44cf9
    2. https://www.analyticsvidhya.com/blog/2020/02/mathematics-behind-convolutional-neural-network/
    3. https://hackernoon.com/the-full-story-behind-convolutional-neural-networks-and-the-math-behind-it-2j4fk3zu2
    4. https://www.programmersought.com/article/87541005859/
    5. https://poloclub.github.io/cnn-explainer/
  2. Pooling Layers :
    1. https://dev.to/sandeepbalachandran/machine-learning-max-average-pooling-1366
    2. https://medium.com/@bdhuma/which-pooling-method-is-better-maxpooling-vs-minpooling-vs-average-pooling-95fb03f45a9
    3. https://www.machinecurve.com/index.php/2020/01/30/what-are-max-pooling-average-pooling-global-max-pooling-and-global-average-pooling/
  3. Activation functions :
    1. https://machinelearningmastery.com/choose-an-activation-function-for-deep-learning/
    2. https://www.analyticsvidhya.com/blog/2020/01/fundamentals-deep-learning-activation-functions-when-to-use-them/
    3. https://conferences.computer.org/ictapub/pdfs/ITCA2020-6EIiKprXTS23UiQ2usLpR0/114100a429/114100a429.pdf
    4. https://towardsdatascience.com/comparison-of-activation-functions-for-deep-neural-networks-706ac4284c8a
  4. Fully connected layer :
    1. https://cs231n.github.io/convolutional-networks/#fc
    2. https://www.superdatascience.com/blogs/convolutional-neural-networks-cnn-step-4-full-connection
  5. Normalization Layer :
    1. https://cs231n.github.io/convolutional-networks/#norm
    2. https://analyticsindiamag.com/everything-you-should-know-about-dropouts-and-batchnormalization-in-cnn/
    3. https://www.baeldung.com/cs/batch-normalization-cnn
    4. https://machinelearningmastery.com/batch-normalization-for-training-of-deep-neural-networks/
    5. https://medium.com/techspace-usict/normalization-techniques-in-deep-neural-networks-9121bf100d8
  6. Dropout :
    1. https://towardsdatascience.com/dropout-on-convolutional-layers-is-weird-5c6ab14f19b2
  7. Multi class and Multi label classification:
    1. https://towardsdatascience.com/journey-to-the-center-of-multi-label-classification-384c40229bff
    2. https://cmci.colorado.edu/classes/INFO-4604/files/slides-7_multi.pdf
  8. Sigmoid and Softmax output layers:
    1. https://towardsdatascience.com/multi-layer-neural-networks-with-sigmoid-function-deep-learning-for-rookies-2-bf464f09eb7f
    2. https://glassboxmedicine.com/2019/05/26/classification-sigmoid-vs-softmax/
    3. https://medium.com/arteos-ai/the-differences-between-sigmoid-and-softmax-activation-function-12adee8cf322
  9. Weight Initialization in CNN :
    1. https://machinelearningmastery.com/weight-initialization-for-deep-learning-neural-networks/
    2. https://medium.com/@tylernisonoff/weight-initialization-for-cnns-a-deep-dive-into-he-initialization-50b03f37f53d
    3. https://towardsdatascience.com/weight-initialization-in-neural-networks-a-journey-from-the-basics-to-kaiming-954fb9b47c79
  10. Loss functions for Image classification:
    1. https://machinelearningmastery.com/how-to-choose-loss-functions-when-training-deep-learning-neural-networks/
    2. https://towardsdatascience.com/choosing-and-customizing-loss-functions-for-image-processing-a0e4bf665b0a
    3. https://towardsdatascience.com/understanding-different-loss-functions-for-neural-networks-dd1ed0274718
    4. https://medium.com/@zeeshanmulla/cost-activation-loss-function-neural-network-deep-learning-what-are-these-91167825a4de
    5. https://algorithmia.com/blog/introduction-to-loss-functions
  11. Back Propagation in CNN :
    1. https://towardsdatascience.com/backpropagation-in-a-convolutional-layer-24c8d64d8509
    2. https://medium.com/@2017csm1006/forward-and-backpropagation-in-convolutional-neural-network-4dfa96d7b37e
    3. https://becominghuman.ai/back-propagation-in-convolutional-neural-networks-intuition-and-code-714ef1c38199
    4. https://www.jefkine.com/general/2016/09/05/backpropagation-in-convolutional-neural-networks/
  12. Optimizers
    1. https://www.upgrad.com/blog/types-of-optimizers-in-deep-learning/
    2. https://towardsdatascience.com/optimizers-for-training-neural-network-59450d71caf6
    3. https://heartbeat.fritz.ai/exploring-optimizers-in-machine-learning-7f18d94cd65b
    4. https://medium.datadriveninvestor.com/overview-of-different-optimizers-for-neural-networks-e0ed119440c3
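
To tie several of the building blocks above together, here is a rough NumPy sketch of one forward pass: convolution, ReLU activation, max pooling, a fully connected layer, softmax and cross-entropy loss. It is a toy illustration, not how any framework implements these layers: the 6x6 input, the hand-picked edge filter, the 3 output classes and the random weights are all assumptions made for the example.

```python
import numpy as np


def conv2d(image, kernel):
    """Valid (no padding), stride-1 2D cross-correlation, as used in CNN layers."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def relu(x):
    """ReLU activation: negative values become zero."""
    return np.maximum(x, 0.0)


def max_pool2d(x, size=2):
    """Non-overlapping max pooling; trailing rows/columns that do not fit are dropped."""
    h, w = x.shape[0] // size, x.shape[1] // size
    x = x[:h * size, :w * size]
    return x.reshape(h, size, w, size).max(axis=(1, 3))


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = logits - logits.max()  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


def cross_entropy(probs, target_index):
    """Negative log-probability assigned to the true class."""
    return -np.log(probs[target_index])


rng = np.random.default_rng(0)
image = rng.random((6, 6))             # one grayscale 6x6 "image" (toy input)
kernel = np.array([[1.0, 0.0, -1.0],   # hand-picked vertical-edge filter
                   [1.0, 0.0, -1.0],
                   [1.0, 0.0, -1.0]])

# Convolution -> activation -> pooling: 6x6 -> 4x4 -> 4x4 -> 2x2
features = max_pool2d(relu(conv2d(image, kernel)))

# Fully connected layer with 3 output classes (random, untrained weights).
weights = rng.normal(0.0, 0.1, size=(3, features.size))
bias = np.zeros(3)
logits = weights @ features.ravel() + bias

probs = softmax(logits)
loss = cross_entropy(probs, target_index=1)  # pretend the true class is 1
print("probabilities:", probs, "loss:", loss)
```

Training would then back-propagate the loss and let an optimizer update the kernel and the fully connected weights, which is what the back-propagation and optimizer resources above explain.
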

At this stage, you should understand the main concepts around CNNs. Let’s move on to the different architectures from the ILSVRC competition.

ImageNet Large Scale Visual Recognition Challenge (ILSVRC) Architectures
ImageNet Dataset and ILSVRC competition :

ImageNet is an image database organized according to the WordNet hierarchy (currently only the nouns), in which each node of the hierarchy is depicted by hundreds and thousands of images. The project has been instrumental in advancing computer vision and deep learning research. The data is available for free to researchers for non-commercial use. https://paperswithcode.com/sota/image-classification-on-imagenet

Competition

The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) evaluates algorithms for object detection and image classification at large scale. One high level motivation is to allow researchers to compare progress in detection across a wider variety of objects, taking advantage of the quite expensive labeling effort. Another motivation is to measure the progress of computer vision for large scale image indexing for retrieval and annotation. https://paperswithcode.com/sota/image-classification-on-imagenet

Famous Benchmark Architectures:

After LeCun’s modern CNN paper, it took several years for the next state-of-the-art (SOTA) CNN paper to arrive. AlexNet was the first big milestone: it won the image recognition challenge (ILSVRC) in 2012. Check the resources below, where I have attached the model architecture, theory and implementation.

  1. AlexNet – 2012 :
    1. Paper : https://proceedings.neurips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf
    2. Explanation : https://d2l.ai/chapter_convolutional-modern/alexnet.html
    3. Code implementation : https://pytorch.org/hub/pytorch_vision_alexnet/ , https://towardsdatascience.com/implementing-alexnet-cnn-architecture-using-tensorflow-2-0-and-keras-2113e090ad98?gi=c45baf963fbc
Figure: AlexNet architecture
Figure: VGG16 and VGG19 architectures
Figure: ResNet architecture (residual block and bottleneck layer)
Figure: Inception v3 architecture
Figure: MobileNetV2 architecture
Figure: Different model scaling methods in EfficientNet

Even though hundreds of models exist, I have listed only a few that are fundamental and important for understanding the others. I did not list models like ViT and MLP-Mixer, since they require knowledge of attention and Transformers (I will give a road map to the Transformer architecture soon).

Tools, Frameworks and other Resources – Image classification :

Transfer Learning : https://machinelearningmastery.com/how-to-use-transfer-learning-when-developing-convolutional-neural-network-models/

Figure: Transfer learning

PyTorch – torchvision models : https://pytorch.org/vision/stable/models.html , https://pytorch.org/tutorials/beginner/finetuning_torchvision_models_tutorial.html

PyTorch Image Models by rwightman (timm) : https://paperswithcode.com/lib/timm , https://rwightman.github.io/pytorch-image-models/

PyTorch Image Models (TIMM) is a library for state-of-the-art image classification. With this library you can:

  • Choose from 300+ pre-trained state-of-the-art image classification models.
  • Train models afresh on research datasets such as ImageNet using provided scripts.
  • Finetune pre-trained models on your own datasets, including the latest cutting edge models.

Keras and TensorFlow V2.x

  1. https://keras.io/examples/vision/
  2. https://developers.google.com/codelabs/tensorflow-2-computervision#0
  3. https://medium.com/@rishit.dagli/computer-vision-with-tensorflow-part-2-57e95cd0551
  4. https://www.tensorflow.org/tutorials/images/transfer_learning_with_hub

PyTorch Lightning :

  1. https://wandb.ai/wandb/wandb-lightning/reports/Image-Classification-using-PyTorch-Lightning--VmlldzoyODk1NzY
  2. https://www.kaggle.com/xooca1/image-classification-pytorch-lightning
  3. https://medium.com/pytorch/introducing-lightning-flash-the-fastest-way-to-get-started-with-deep-learning-202f196b3b98

Another important resource : Josh Starmer’s StatQuest YouTube channel, with easy, illustrated explanations.

https://www.youtube.com/watch?v=HGwBXDKFk9I

So far, we have covered image classification in computer vision. In upcoming posts, we will cover object detection, segmentation, generation and other domains.

Take your time, don’t rush. Learn – Practice – Repeat!

Let me know if you have any queries or comments!

Thanks for reading and for your support. I hope this blog helps you!

Happy learning!

By

Prabakaran chandran – May 14 0 4.10 am
