Image Classification With Deep Learning

1. The goal of the Image Classification Project

Our team participated in the Kaggle Image Classification Competition. We improved the performance of Deep Learning models to classify the CIFAR-10 dataset more accurately. We mainly used three Deep Learning models (DenseNet, PyramidNet, ResNet) for this project.

I enhanced the performance of DenseNet and achieved a classification accuracy of 87.11%.

2. Why this project?

Even though Deep Learning models can classify the CIFAR-10 dataset with almost 99% accuracy, those models contain more than six hundred million parameters.

So through this project, we investigated ways to classify the CIFAR-10 dataset with fewer than two million parameters.

3. Relevant Research and Experiment

3.1 What is DenseNet?

A DenseNet is a type of convolutional neural network that utilizes dense connections between layers, through Dense Blocks, where we connect all layers (with matching feature-map sizes) directly with each other. To preserve the feed-forward nature, each layer obtains additional inputs from all preceding layers and passes on its own feature maps to all subsequent layers.
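The dense connection described above can be sketched as a single layer inside a Dense Block: each layer produces a small number of new feature maps (the growth rate) and concatenates them onto everything it received. This is a minimal sketch, not the full DenseNet architecture; the channel sizes and the BN-ReLU-Conv ordering follow the paper, but the class name and parameters here are illustrative.

```python
import torch
import torch.nn as nn

class DenseLayer(nn.Module):
    """One layer inside a Dense Block: BN -> ReLU -> 3x3 Conv,
    then concatenate the input with the new feature maps."""
    def __init__(self, in_channels, growth_rate):
        super().__init__()
        self.bn = nn.BatchNorm2d(in_channels)
        self.conv = nn.Conv2d(in_channels, growth_rate,
                              kernel_size=3, padding=1, bias=False)

    def forward(self, x):
        out = self.conv(torch.relu(self.bn(x)))
        # Dense connection: later layers see all earlier feature maps.
        return torch.cat([x, out], dim=1)
```

Stacking such layers is what makes the parameter count grow slowly: each layer only adds `growth_rate` channels instead of doubling the width.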

3.2 Hyperparameters
  • Learning rate

In machine learning and statistics, the learning rate is a tuning parameter in an optimization algorithm that determines the step size at each iteration while moving toward a minimum of a loss function.

In the paper ‘Densely Connected Convolutional Networks’ (Huang et al., 2017), the authors used a learning rate scheduler during training. A learning rate scheduler changes the learning rate over the course of training.
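A minimal sketch of that schedule in PyTorch, assuming the paper's setup of dividing the learning rate by 10 at 50% and 75% of the total epochs (300 here); the `nn.Linear` model is only a placeholder:

```python
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 10)  # placeholder model
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
# Divide the learning rate by 10 at epochs 150 and 225 (50% and 75% of 300).
scheduler = optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[150, 225], gamma=0.1
)

for epoch in range(300):
    # ... one epoch of training would go here ...
    optimizer.step()
    scheduler.step()
```

After both milestones the learning rate has dropped from 0.1 to 0.001.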

  • Weight decay

While training a model, overfitting can occur. We can reduce it by enlarging the training dataset, but when additional training data is hard to obtain, we instead use weight decay to reduce the complexity of the model.

Weight decay penalizes large weights: the larger a weight, the higher its penalty, which helps avoid overfitting.
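In PyTorch, weight decay is a single optimizer argument. A minimal sketch, assuming the value 1e-4 reported in the DenseNet paper (the `nn.Linear` model is a placeholder):

```python
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 10)  # placeholder model
# weight_decay adds an L2 penalty: every step shrinks the weights a little,
# so larger weights pay a larger penalty.
optimizer = optim.SGD(model.parameters(), lr=0.1,
                      momentum=0.9, weight_decay=1e-4)
```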

  • Momentum

The term ‘momentum’ comes from physics: it literally adds inertia to gradient descent, so each update carries over part of the previous step's direction.
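The inertia can be seen in a few lines of plain Python on a toy objective. This is a sketch of the classic momentum update on f(w) = w², where the gradient is 2w; the learning rate and momentum coefficient are assumed values:

```python
# Minimal sketch of SGD with momentum on f(w) = w**2.
lr, beta = 0.1, 0.9   # learning rate and momentum coefficient (assumed)
w, v = 5.0, 0.0       # parameter and velocity

for _ in range(200):
    grad = 2 * w
    v = beta * v + grad   # velocity accumulates past gradients (inertia)
    w = w - lr * v        # the update uses the velocity, not the raw gradient
```

The velocity keeps the iterate moving in a consistent direction, which speeds up progress along shallow, consistent slopes.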

  • Result

3.3 Activation Function

In artificial neural networks, the activation function of a node defines the output of that node given an input or set of inputs.

  • PReLU

  • ELU

  • LeakyReLU

  • ReLU6

  • SELU

  • Result
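The activation functions listed above differ mainly in how they treat negative inputs and whether they cap large positive inputs. A small sketch that evaluates each of them on the same sample tensor (the input values are arbitrary):

```python
import torch
import torch.nn as nn

x = torch.tensor([-2.0, -0.5, 0.0, 1.0, 8.0])
activations = {
    "ReLU":      nn.ReLU(),           # zero for negatives
    "ReLU6":     nn.ReLU6(),          # like ReLU, but clipped at 6
    "LeakyReLU": nn.LeakyReLU(0.01),  # small fixed negative slope
    "PReLU":     nn.PReLU(),          # learnable negative slope
    "ELU":       nn.ELU(),            # smooth exponential for negatives
    "SELU":      nn.SELU(),           # scaled ELU (self-normalizing)
}
for name, fn in activations.items():
    print(f"{name:10s}", fn(x))
```

Swapping the activation in a model is a one-line change, which is what makes this comparison cheap to run.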

3.4 Data Augmentation

I used ‘torchvision.transforms’ for data augmentation, transforming the images in various ways with the transformation functions it provides.

The given code already applied data augmentation techniques such as RandomCrop, RandomHorizontalFlip, and ToTensor. So I kept those three as a base and combined them with other transforms to analyze the results.

  • basic code

  • transforms.Resize((10,32))

  • transforms.RandomVerticalFlip(p=0.5)

  • transforms.CenterCrop()

  • transforms.RandomResizedCrop()

  • transforms.RandomRotation()

  • Result

4. References

Densely Connected Convolutional Networks (Huang et al., 2017)

PyTorch

 
