Deep Learning

Chapter 5: Deep learning for Computer Vision

Mahmood Amintoosi

m.amintoosi @ hsu.ac.ir

Fall 1398 (2019)

Source book

Deep Learning with Python, by François Chollet
https://www.manning.com/books/deep-learning-with-python
LiveBook
GitHub: Jupyter Notebooks

Chapter 5

Deep learning for Computer Vision

This chapter covers:

  • Understanding convolutional neural networks
  • Using data augmentation to mitigate overfitting
  • Using a pretrained convnet to do feature extraction
  • Fine-tuning a pretrained convnet
  • Visualizing what convnets learn and how they make classification decisions

Understanding convolutional neural networks

  1. Convolution arithmetic tutorial
  2. Machine Learning and AI - Bangalore Chapter
  3. Counting No. of Parameters in Deep Learning Models by Hand

Classification with CNNs

  1. English Digit Classification
  2. Persian Digit Classification
  3. Classifying Cats vs Dogs
5.1 - Introduction to convnets
MNIST Classification (Included with Keras)
Overall Model:
MNIST Classification, TensorFlow Code
		 
import keras
from keras import layers
from keras import models

# Convolutional base: three 3x3 conv layers with 2x2 max pooling in between
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
# Classifier head: flatten the feature maps, then two dense layers
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
# 10-way softmax: one output per digit class
model.add(layers.Dense(10, activation='softmax'))
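The feature-map sizes this stack produces can be traced by hand. With the Keras defaults used above (no padding, stride 1), a 3x3 convolution shrinks each side by 2, and a 2x2 max pooling halves it, rounding down. The following is a minimal sketch in plain Python; the helper names (`conv_out`, `pool_out`, `trace_shapes`) are illustrative and not part of Keras.

```python
def conv_out(size, kernel=3):
    """Side length after a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Side length after non-overlapping max pooling (floor division)."""
    return size // pool


def trace_shapes(side, plan):
    """Return the (height, width, channels) shape after each layer in plan.

    plan is a list of ("conv", filters) or ("pool", None) tuples.
    """
    shapes = []
    channels = None
    for kind, filters in plan:
        if kind == "conv":
            side = conv_out(side)
            channels = filters
        else:
            side = pool_out(side)
        shapes.append((side, side, channels))
    return shapes


# Same layer order as the MNIST model above, for a 28x28 input.
mnist_plan = [("conv", 32), ("pool", None), ("conv", 64),
              ("pool", None), ("conv", 64)]
mnist_shapes = trace_shapes(28, mnist_plan)
for shape in mnist_shapes:
    print(shape)

# Flatten() turns the last feature map into a vector of h * w * c values.
h, w, c = mnist_shapes[-1]
flat_size = h * w * c
print(flat_size)
```

The last feature map is 3x3x64, so Flatten() hands a vector of 576 values to the first Dense layer.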
		 
		 
Number of Parameters
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 11, 11, 64)        18496     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 3, 3, 64)          36928     
_________________________________________________________________
flatten_1 (Flatten)          (None, 576)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 64)                36928     
_________________________________________________________________
dense_2 (Dense)              (None, 10)                650       
=================================================================
Total params: 93,322
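The numbers in the Param # column can be reproduced by hand. A Conv2D layer has one weight per kernel position, input channel, and filter, plus one bias per filter; a Dense layer has one weight per input/output pair, plus one bias per output. A minimal sketch in plain Python (the function names are illustrative, not Keras API):

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus biases of a 2D convolution layer."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return (inputs + 1) * units


# Layer names follow the summary table; pooling and flatten layers
# have no trainable parameters and are left out.
counts = {
    "conv2d_1": conv2d_params(3, 3, 1, 32),
    "conv2d_2": conv2d_params(3, 3, 32, 64),
    "conv2d_3": conv2d_params(3, 3, 64, 64),
    "dense_1": dense_params(3 * 3 * 64, 64),
    "dense_2": dense_params(64, 10),
}
total = sum(counts.values())

for name, n in counts.items():
    print(name, n)
print("total", total)
```

conv2d_3 and dense_1 happen to have the same count because both see 576 inputs per output unit: 3x3x64 kernel weights in one case, 576 flattened features in the other.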
		 
Overall Architecture of a sample CNN
More about architecture and number of parameters:
Source: Counting No. of Parameters in Deep Learning Models by Hand
Persian Digits Classification (Not included with Keras)
		 
import keras
from keras import layers
from keras import models

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))	
		 
		 
5.2 - Using convnets with small datasets
Classify Dogs vs Cats (Not included with Keras)
Classify Dogs vs Cats (Building from scratch)
  • Download the images from Kaggle
  • Download the images from fastai (845 MB; needs some manipulation)
		 
from keras import layers
from keras import models

# Four conv/pool blocks; the number of filters grows as the maps shrink
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
# Classifier head; a single sigmoid unit for the binary cat/dog decision
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
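A hand calculation shows where the weights of this network sit. The sketch below assumes the Keras defaults used above (no padding, stride 1 for the 3x3 convolutions, 2x2 pooling) and uses a hypothetical helper, `summarize`, that is not part of Keras.

```python
def summarize(side, in_channels, conv_filters, dense_units):
    """Trace a stack of [3x3 conv -> 2x2 max pool] blocks followed by dense layers.

    Returns (flatten_size, list of per-layer parameter counts).
    """
    params = []
    channels = in_channels
    for filters in conv_filters:
        # 3x3 kernel weights for every input channel, plus one bias per filter
        params.append((3 * 3 * channels + 1) * filters)
        # 'valid' 3x3 conv removes 2 pixels per side, pooling halves (floor)
        side = (side - 2) // 2
        channels = filters
    flatten_size = side * side * channels
    inputs = flatten_size
    for units in dense_units:
        params.append((inputs + 1) * units)
        inputs = units
    return flatten_size, params


# 150x150 RGB input, four conv/pool blocks, then Dense(512) and Dense(1)
flatten_size, params = summarize(150, 3, [32, 64, 128, 128], [512, 1])
total = sum(params)
dense_share = params[4] / total  # share held by the Dense(512) layer

print(flatten_size)
print(params)
print(total)
print(round(dense_share, 3))
```

Under these assumptions the feature maps shrink from 150x150 to 7x7, Flatten() yields 6272 values, and the Dense(512) layer alone holds about 93% of the 3,453,121 parameters.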
		 
		 
Several well-known CNN architectures are available:
LeNet, AlexNet, VGGNet, GoogLeNet, ResNet, ZFNet
VGG16 Architecture
VGG16 is a convolutional neural network model proposed by K. Simonyan and A. Zisserman of the University of Oxford in the paper “Very Deep Convolutional Networks for Large-Scale Image Recognition”. The model achieves 92.7% top-5 test accuracy on the 1000-class ImageNet classification benchmark (ILSVRC); the full ImageNet dataset contains over 14 million images. VGG16 was trained for weeks on NVIDIA Titan Black GPUs.
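To make the "16" concrete, the sketch below counts the weight layers and parameters of the standard VGG16 layout for a 224x224x3 input: thirteen 3x3 convolutions in five blocks, each block followed by 2x2 max pooling, then three fully connected layers. The layout is written out from the paper's configuration D rather than taken from these slides, so treat it as an assumption to check against the paper; the names are illustrative.

```python
# Filters per 3x3 conv layer in each block; a 2x2 max pool follows every block.
VGG16_BLOCKS = [
    [64, 64],
    [128, 128],
    [256, 256, 256],
    [512, 512, 512],
    [512, 512, 512],
]
VGG16_DENSE = [4096, 4096, 1000]


def vgg16_params(side=224, in_channels=3):
    """Return (conv parameter count, dense parameter count)."""
    conv_total = 0
    channels = in_channels
    for block in VGG16_BLOCKS:
        for filters in block:
            conv_total += (3 * 3 * channels + 1) * filters
            channels = filters
        side //= 2  # padded convs keep the size; pooling halves it
    dense_total = 0
    inputs = side * side * channels  # 7 * 7 * 512 for a 224x224 input
    for units in VGG16_DENSE:
        dense_total += (inputs + 1) * units
        inputs = units
    return conv_total, dense_total


conv_total, dense_total = vgg16_params()
weight_layers = sum(len(block) for block in VGG16_BLOCKS) + len(VGG16_DENSE)

print(weight_layers)
print(conv_total, dense_total, conv_total + dense_total)
```

With this layout, most of the roughly 138 million parameters sit in the three fully connected layers, not in the convolutional base.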
ML vs DL
Some Outputs
Our Outputs

- Questions? -


m.amintoosi @ gmail.com

webpage : http://mamintoosi.ir

webpage in github : http://mamintoosi-cs.github.io

github : mamintoosi