
Deep Learning

BCSE-332L
Module 3:
Convolutional Neural Network
Dr. Saurabh Agrawal
Faculty Id: 20165
School of Computer Science and Engineering
VIT, Vellore-632014
Tamil Nadu, India
Outline
Foundations of Convolutional Neural Networks
CNN Operations
Architecture
Simple Convolution Network
Deep Convolutional Models
ResNet
AlexNet
InceptionNet
Others



Foundations of Convolutional Neural Networks
What is Convolutional Neural Network(CNN)?
A Convolutional Neural Network (CNN) is a type of deep learning algorithm that is particularly
well-suited for image recognition and processing tasks.
It is made up of multiple layers, including convolutional layers, pooling layers, and fully
connected layers.
The architecture of CNNs is inspired by the visual processing in the human brain, and they are
well-suited for capturing hierarchical patterns and spatial dependencies within images.



Foundations of Convolutional Neural Networks
Key components of a Convolutional Neural Network include:
1. Convolutional Layers: These layers apply convolutional operations to input images, using filters
(also known as kernels) to detect features such as edges, textures, and more complex patterns.
Convolutional operations help preserve the spatial relationships between pixels.
2. Pooling Layers: Pooling layers downsample the spatial dimensions of the input, reducing the
computational complexity and the number of parameters in the network. Max pooling is a common
pooling operation, selecting the maximum value from a group of neighboring pixels.
3. Activation Functions: Non-linear activation functions, such as Rectified Linear Unit (ReLU),
introduce non-linearity to the model, allowing it to learn more complex relationships in the data.
4. Fully Connected Layers: These layers are responsible for making predictions based on the high-
level features learned by the previous layers. They connect every neuron in one layer to every
neuron in the next layer.
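
As a concrete illustration of these four components, here is a minimal PyTorch sketch; the layer sizes and the 32x32 RGB input are illustrative choices, not values from the slides:

import torch
import torch.nn as nn

# A tiny CNN combining the four components listed above:
# convolution -> activation -> pooling -> fully connected layer.
class TinyCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # 1. convolutional layer
            nn.ReLU(),                                   # 3. activation function
            nn.MaxPool2d(kernel_size=2),                 # 2. pooling layer
        )
        self.classifier = nn.Linear(16 * 16 * 16, num_classes)  # 4. fully connected

    def forward(self, x):
        x = self.features(x)      # (N, 16, 16, 16) for a 32x32 RGB input
        x = torch.flatten(x, 1)   # flatten each feature map to a vector
        return self.classifier(x)

logits = TinyCNN()(torch.randn(4, 3, 32, 32))  # a batch of four 32x32 RGB images
print(logits.shape)                            # torch.Size([4, 10])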

Foundations of Convolutional Neural Networks
CNNs are trained using a large dataset of labeled images, where the network learns to recognize patterns and
features that are associated with specific objects or classes.
CNNs have proven to be highly effective in image-related tasks, achieving state-of-the-art performance in various
computer vision applications.
Their ability to automatically learn hierarchical representations of features makes them well-suited for tasks
where the spatial relationships and patterns in the data are crucial for accurate predictions.
CNNs are widely used in areas such as image classification, object detection, facial recognition, and medical
image analysis.
The convolutional layers are the key component of a CNN, where filters are applied to the input image to
extract features such as edges, textures, and shapes.
The output of the convolutional layers is then passed through pooling layers, which are used to down-sample
the feature maps, reducing the spatial dimensions while retaining the most important information.
The output of the pooling layers is then passed through one or more fully connected layers, which are used to
make a prediction or classify the image.
Foundations of Convolutional Neural Networks
Convolutional Neural Network Design
A convolutional neural network is constructed as a multi-layered feed-forward neural network,
made by stacking many hidden layers on top of each other in a particular order.
It is this sequential design that allows the CNN to learn hierarchical features.
In a CNN, the hidden layers are typically convolutional layers followed by activation layers,
some of them followed by pooling layers.
The connectivity pattern in a ConvNet is akin to the related pattern of neurons in the
human brain and was motivated by the organization of the visual cortex.



Foundations of Convolutional Neural Networks
Convolutional Neural Network Training: CNNs are trained using a supervised learning approach.
This means that the CNN is given a set of labeled training images. The CNN then learns to map the input
images to their correct labels.
The training process for a CNN involves the following steps:
1. Data Preparation: The training images are preprocessed to ensure that they are all in the same format
and size.
2. Loss Function: A loss function is used to measure how well the CNN is performing on the training
data. The loss function is typically calculated by taking the difference between the predicted labels and
the actual labels of the training images.
3. Optimizer: An optimizer is used to update the weights of the CNN in order to minimize the loss function.
4. Backpropagation: Backpropagation is a technique used to calculate the gradients of the loss function
with respect to the weights of the CNN. The gradients are then used to update the weights of the CNN
using the optimizer.
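
As a sketch, one training epoch in PyTorch that follows steps 1-4; model and train_loader are assumed to be defined elsewhere (any CNN and a DataLoader of preprocessed, uniformly sized labeled images):

import torch
import torch.nn as nn

def train_one_epoch(model, train_loader):
    criterion = nn.CrossEntropyLoss()                         # 2. loss function
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # 3. optimizer
    for images, labels in train_loader:                       # 1. prepared data
        optimizer.zero_grad()
        loss = criterion(model(images), labels)  # difference between predictions and labels
        loss.backward()                          # 4. backpropagation computes gradients
        optimizer.step()                         # the optimizer updates the weights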
Foundations of Convolutional Neural Networks
CNN Evaluation: After training, the CNN can be evaluated on a held-out test set.
The test set is a collection of images that the CNN has not seen during training.
How well the CNN performs on the test set is a good predictor of how well it will perform on real data.
The performance of a CNN on image classification tasks can be evaluated using a variety of metrics.
Among the most popular metrics are:
1. Accuracy: Accuracy is the percentage of test images that the CNN correctly classifies.
2. Precision: Precision is the percentage of test images that the CNN predicts as a particular class and
that are actually of that class.
3. Recall: Recall is the percentage of test images that are of a particular class and that the CNN
predicts as that class.
4. F1 Score: The F1 Score is a harmonic mean of precision and recall. It is a good metric for evaluating
the performance of a CNN on classes that are imbalanced.
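
These metrics can be computed, for example, with scikit-learn; the label vectors below are toy values for illustration:

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [0, 1, 1, 2, 2, 2]   # actual labels of the test images
y_pred = [0, 1, 0, 2, 2, 1]   # labels predicted by the CNN

print(accuracy_score(y_true, y_pred))                    # 1. fraction classified correctly
print(precision_score(y_true, y_pred, average="macro"))  # 2. precision, averaged over classes
print(recall_score(y_true, y_pred, average="macro"))     # 3. recall, averaged over classes
print(f1_score(y_true, y_pred, average="macro"))         # 4. harmonic mean of 2 and 3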



Foundations of Convolutional Neural Networks
Why Convolutional Neural Networks?
[Motivating examples shown as figures.]


CNN Operations
A Convolutional Neural Network is a type of neural network used for the analysis of visual inputs.
Popular applications of CNNs include image classification, image recognition, natural language
processing, etc.
There are certain steps/operations involved in a CNN; these can be categorized as follows:
1. Convolution operation
2. Pooling
3. Padding
4. Flattening
5. Fully connected layers



CNN Operations
CONVOLUTION OPERATION: The convolution operation is the first and one of the most important steps in
the functioning of a CNN.
The convolution operation focuses on extracting/preserving important features from the input (an image, etc.).
To understand this operation, let us consider an image as input to our CNN. When an image is given as
input, it arrives in the form of a matrix of pixel values.
If the image is grayscale, it is represented as a single matrix, with each value in the range 0-255.
We can also normalize these values, say to the range 0-1; conventionally, 0 then represents black and 1 represents white.
If the image is colored, it is represented by three matrices, one for each RGB channel, with each value in the range 0-255.

CNN Operations
CONVOLUTION OPERATION: Consider an input of size 6x6 convolved with a kernel of size 3x3.
The feature map obtained is of size 4x4.
To increase non-linearity, a rectifier (ReLU) function can be applied to the feature map.
Finally, after the convolution step is completed and the feature map is obtained, this map is given as input to
the pooling layer.
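
A minimal NumPy sketch of this step (strictly speaking, the sliding-window product below is cross-correlation, which is what deep learning libraries implement as "convolution"); the input and kernel values are random placeholders:

import numpy as np

def convolve2d_valid(image, kernel):
    # Slide the kernel over every position where it fully fits,
    # producing an (n-m+1) x (n-m+1) feature map.
    n, m = image.shape[0], kernel.shape[0]
    out = np.zeros((n - m + 1, n - m + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+m, j:j+m] * kernel)
    return out

image = np.random.rand(6, 6)       # 6x6 input
kernel = np.random.rand(3, 3)      # 3x3 kernel
feature_map = convolve2d_valid(image, kernel)
print(feature_map.shape)           # (4, 4)
rectified = np.maximum(feature_map, 0)   # ReLU for non-linearity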



CNN Operations
POOLING:
This step helps the network maintain spatial invariance and deal with other kinds of distortion in the
process.
Suppose we are performing image recognition, say checking whether an image contains a dog: the image
might not be straight (it may be tilted), the texture may differ, or the object in the image may be small.
Such variations should not cause our model to produce incorrect output.
This is what pooling is all about. There are different types of pooling, such as min pooling, max
pooling, etc.
Pooling helps to preserve the essential features.
What we obtain is called the pooled feature map. Here the size is reduced, features are preserved, and
distortions are dealt with.



CNN Operations
POOLING:
Mentioned below are some types of pooling that are used; a short code sketch follows the list.
1. Max Pooling: the maximum value is taken from each patch of the feature map.
2. Min Pooling: the minimum value is taken from each patch of the feature map.
3. Average Pooling: the average of the values in each patch is taken.
4. Adaptive Pooling: we only need to define the required output size of the feature map; parameters
such as stride are calculated automatically.
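
A short PyTorch sketch of these variants on a toy feature map (values arbitrary); PyTorch has no dedicated min-pooling layer, so min pooling is expressed by negating the input around max pooling:

import torch
import torch.nn as nn

x = torch.randn(1, 1, 6, 6)   # a toy 6x6 single-channel feature map

max_pool = nn.MaxPool2d(kernel_size=2, stride=2)(x)     # 1. max pooling -> 3x3
min_pool = -nn.MaxPool2d(kernel_size=2, stride=2)(-x)   # 2. min pooling via negation
avg_pool = nn.AvgPool2d(kernel_size=2, stride=2)(x)     # 3. average pooling
adaptive = nn.AdaptiveMaxPool2d(output_size=(2, 2))(x)  # 4. only the output size is given
print(max_pool.shape, adaptive.shape)   # (1, 1, 3, 3) and (1, 1, 2, 2)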



CNN Operations
POOLING: Let us take an example to understand pooling better.
Consider a feature map of size 6x6 to which max pooling is applied with stride 2 and a filter of 2,
i.e. a 2x2 window.
This operation reduces the size of the data and preserves the most essential features.
The output obtained after the pooling layer is called the pooled feature map.
CNN Operations
Padding: CNNs have offered a lot of promising results, but some issues arise when applying
convolution layers. There are two significant problems:
1. When we apply the convolution operation, the size of the resultant processed image shrinks,
depending on the sizes of the image and the filter, according to the following rule:
Let image size: nxn
Let filter size: mxm
Then, resultant image size: (n-m+1)x(n-m+1)
2. Another problem concerns the pixel values at the edges of the image matrix. The values at the edges of
the matrix do not contribute as much as the values in the middle. For example, a pixel at the corner
position (0,0) is covered only once by the sliding filter, while values in the middle are covered multiple times.
To overcome these problems, padding is the solution. In padding, we add one or more layers of values,
typically zeros, around the input matrix; the output-size arithmetic is sketched below.
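
The output-size arithmetic above can be checked with a small helper function that also accounts for p layers of padding (stride 1 assumed):

# Output size for an n x n image, an m x m filter, and p layers of padding.
def conv_output_size(n, m, p=0):
    return n - m + 2 * p + 1

print(conv_output_size(6, 3))        # 4 -> the 6x6 input with a 3x3 kernel seen earlier
print(conv_output_size(9, 3, p=1))   # 9 -> one layer of padding preserves a 9x9 image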



CNN Operations
Padding:
If we take an image of size 9x9 and a filter of size
3x3, and we add 1 layer of padding, then the image
after applying the convolution operation is still of
size 9x9.
Hence the problem of reduced image size after
convolution is taken care of, and because of padding,
the pixel values at the edges are now effectively
shifted towards the middle, so they contribute to
more filter positions.



CNN Operations
Flattening
This step is used in CNNs, but not always; whether it is involved depends on the layers that follow.
What happens here is that the pooled feature map (i.e. the matrix) is converted into a vector.
This vector then plays the role of the input layer of the subsequent fully connected network.

CNN Operations
Fully Connected Layers
After several layers of convolution and pooling operations are completed, the final output is given to
the fully connected layer.
This is basically a neural network in which each neuron is connected to every neuron in the
previous layer.
All the recognition and classification work is done by this neural network.



CNN Operations
Fully Connected Layers

The output after flattening is given to the fully connected layer, and this network then classifies the
image as, for example, either cat or dog.



CNN Operations
Fully Connected Layers
These layers are in the last layer of the convolutional neural network, and their inputs correspond to the
flattened one-dimensional matrix generated by the last pooling layer.



CNN Operations (Examples)
[Worked examples of the above operations shown as figures.]

CNN Architecture
A Convolutional Neural Network, also known as CNN or ConvNet, is a class of neural networks that
specializes in processing data that has a grid-like topology, such as an image.
A digital image is a binary representation of visual data. It contains a series of pixels arranged in a
grid-like fashion that contains pixel values to denote how bright and what color each pixel should be.



CNN Architecture
The human brain processes a huge amount of information the second we see an image.
Each neuron works in its own receptive field and is connected to other neurons in a way that they
cover the entire visual field.
Just as each neuron responds to stimuli only in the restricted region of the visual field called the
receptive field in the biological vision system, each neuron in a CNN processes data only in its
receptive field as well.
The layers are arranged in such a way so that they detect simpler patterns first (lines, curves, etc.)
and more complex patterns (faces, objects, etc.) further along.
By using a CNN, one can enable computers to see.



CNN Architecture
Convolutional Neural Network Architecture: A CNN typically has three types of layers: a convolutional
layer, a pooling layer, and a fully connected layer.



CNN Architecture
Convolutional Neural Network Architecture: Convolution Layer
The convolution layer is the core building block of the CNN. It carries the main portion of the
network's computational load.
This layer performs a dot product between two matrices, where one matrix is the set of learnable
parameters otherwise known as a kernel, and the other matrix is the restricted portion of the receptive
field.
The kernel is spatially smaller than an image but is more in-depth.
This means that, if the image is composed of three (RGB) channels, the kernel height and width will
be spatially small, but the depth extends up to all three channels.

CNN Architecture
Convolutional Neural Network Architecture: Convolution Layer
During the forward pass, the kernel slides across the height and width of the image, producing the
image representation of that receptive region.
This produces a two-dimensional representation of the image known as an activation map that
gives the response of the kernel at each spatial position of the image.
The sliding step size of the kernel is called the stride.
If we have an input of size W x W x D and Dout kernels with a spatial size of F, stride S, and amount
of padding P, then the size of the output volume is determined by the following formula:

Wout = (W - F + 2P)/S + 1

This will yield an output volume of size Wout x Wout x Dout.
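
As a sketch, the same formula as a small Python helper (the example values are assumptions consistent with architectures discussed later):

# Wout = (W - F + 2P) / S + 1; with P = 0 this also covers the pooling
# layer formula given in the next subsection.
def output_size(W, F, S=1, P=0):
    assert (W - F + 2 * P) % S == 0, "kernel does not tile the input evenly"
    return (W - F + 2 * P) // S + 1

print(output_size(227, 11, S=4))      # 55, e.g. AlexNet's first convolution layer
print(output_size(28, 5, S=1, P=2))   # 28, the Fashion-MNIST design shown later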

CNN Architecture
Convolutional Neural Network Architecture: Pooling Layer
The pooling layer replaces the output of the network at certain locations by deriving a summary
statistic of the nearby outputs.
This helps in reducing the spatial size of the representation, which decreases the required amount
of computation and weights.
The pooling operation is processed on every slice of the representation individually.
There are several pooling functions such as the average of the rectangular neighborhood, L2 norm
of the rectangular neighborhood, and a weighted average based on the distance from the central
pixel.
However, the most popular process is max pooling, which reports the maximum output from the
neighborhood.



CNN Architecture
Convolutional Neural Network Architecture: Pooling Layer
If we have an activation map of size W x W x D, a pooling kernel of spatial size F, and stride S, then
the size of the output volume is determined by the following formula:

Wout = (W - F)/S + 1

This will yield an output volume of size Wout x Wout x D. In all cases, pooling provides some
translation invariance, which means that an object would be recognizable regardless of where it
appears in the frame.



CNN Architecture
Convolutional Neural Network Architecture: Fully Connected Layer
Neurons in this layer have full connectivity with all neurons in the preceding and succeeding layer
as seen in regular FCNN.
This is why it can be computed as usual by a matrix multiplication followed by a bias offset.
The FC layer helps to map the representation between the input and the output.



CNN Architecture
Convolutional Neural Network Architecture: Non-Linearity Layers (Activation Function)
Since convolution is a linear operation and images are far from linear, non-linearity layers are often placed directly
after the convolutional layer to introduce non-linearity to the activation map.
An activation function decides whether a neuron should be activated or not.
This means that it decides whether the neuron's input to the network is important or not in the process of
prediction.
There are several commonly used activation functions, such as ReLU, Softmax, tanh, and Sigmoid.
Each of these functions has a specific usage:
Sigmoid: used for binary classification in the CNN model.
tanh: very similar to the sigmoid function; the only difference is that it is symmetric around the origin,
so its range of values is from -1 to 1.
Softmax: used in multinomial logistic regression; it is often the last activation function of a neural
network, normalizing the output to a probability distribution over the predicted output classes.
ReLU: the main advantage of using the ReLU function over other activation functions is that it does not activate all
the neurons at the same time.
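
A quick PyTorch sketch comparing these functions on the same inputs (values arbitrary):

import torch

x = torch.tensor([-2.0, -0.5, 0.0, 1.0, 3.0])

print(torch.sigmoid(x))          # squashes to (0, 1); suits binary classification
print(torch.tanh(x))             # like sigmoid but symmetric about the origin, range (-1, 1)
print(torch.relu(x))             # zeroes negatives, so not all neurons activate at once
print(torch.softmax(x, dim=0))   # normalizes to a probability distribution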
CNN Architecture
Designing a Convolutional Neural Network:
Now that we understand the various components, we can build a convolutional neural network.
We will be using Fashion-MNIST, which is a dataset of Zalando's article images consisting of a training
set of 60,000 examples and a test set of 10,000 examples.
Each example is a 28x28 grayscale image, associated with a label from 10 classes.
Our convolutional neural network has architecture as follows:
[INPUT]
→[CONV 1] → [BATCH NORM] → [ReLU] → [POOL 1]
→[CONV 2] → [BATCH NORM] → [ReLU] → [POOL 2]
→[FC LAYER] → [RESULT]
For both conv layers, we will use kernel of spatial size 5 x 5 with stride size 1 and padding of 2.
For both pooling layers, we will use max pool operation with kernel size 2, stride 2, and zero padding.
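
A sketch of this architecture in PyTorch; the channel counts (16 and 32) are illustrative assumptions, while the kernel, stride, padding, and pooling settings follow the values stated above:

import torch
import torch.nn as nn

# [INPUT] -> [CONV1 -> BN -> ReLU -> POOL1] -> [CONV2 -> BN -> ReLU -> POOL2] -> [FC]
class FashionCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=1, padding=2),   # 28x28 -> 28x28
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),                  # 28x28 -> 14x14
        )
        self.layer2 = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=5, stride=1, padding=2),  # 14x14 -> 14x14
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),                  # 14x14 -> 7x7
        )
        self.fc = nn.Linear(32 * 7 * 7, 10)   # one output per Fashion-MNIST class

    def forward(self, x):
        x = self.layer2(self.layer1(x))
        return self.fc(torch.flatten(x, 1))

print(FashionCNN()(torch.randn(1, 1, 28, 28)).shape)   # torch.Size([1, 10])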

Simple Convolutional Network
Experiential Learning: Students need to complete the topic with reference to previous slides and
other available material.



ResNet
Deep residual networks were a breakthrough idea which enabled the development of much deeper
networks (hundreds of layers as opposed to tens of layers).
It is a generally accepted principle that deeper networks are capable of learning more complex
functions and representations of the input, which should lead to better performance.
However, many researchers observed that adding more layers eventually had a negative effect on
the final performance.
This behavior was not intuitively expected, as explained further.



ResNet
Let us consider a shallower architecture and its deeper counterpart that adds more layers onto it.
There exists a solution by construction to the deeper model: the added layers are identity mapping,
and the other layers are copied from the learned shallower model.
The existence of this constructed solution indicates that a deeper model should produce no higher
training error than its shallower counterpart.
But experiments show that our current solvers on hand are unable to find solutions that are
comparably good or better than the constructed solution.



ResNet
This phenomenon is referred to as the degradation problem:
Alluding to the fact that although better parameter initialization techniques and batch normalization
allow for deeper networks to converge, they often converge at a higher error rate than their shallower
counterparts.
In the limit, simply stacking more layers degrades the model's ultimate performance.
The researchers proposed a remedy to this degradation problem by introducing residual blocks, in which
the intermediate layers of a block learn a residual function with reference to the block input.
You can think of this residual function as a refinement step in which we learn how to adjust the input
feature map for higher quality features.
This compares with a "plain" network in which each layer is expected to learn new and distinct feature
maps.
In the event that no refinement is needed, the intermediate layers can learn to gradually adjust their
weights toward zero such that the residual block represents an identity function.
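
A minimal sketch of such a residual block in PyTorch, assuming the input and output dimensions match so that no projection is needed:

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    # Two 3x3 convolutions learn a residual F(x); the block outputs F(x) + x,
    # so driving the weights toward zero recovers the identity mapping.
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.bn2(self.conv2(self.relu(self.bn1(self.conv1(x)))))
        return self.relu(out + x)   # the skip connection adds the block input

x = torch.randn(1, 64, 56, 56)
print(ResidualBlock(64)(x).shape)   # shape preserved: (1, 64, 56, 56)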
ResNet

It was later discovered that a slight modification to the originally proposed unit offers better
performance by more efficiently allowing gradients to propagate through the network during training.

ResNet
Wide residual networks: Although the original ResNet paper focused on creating a network
architecture to enable deeper structures by alleviating the degradation problem, other researchers have
since pointed out that increasing the network's width (channel depth) can be a more efficient way of
expanding the overall capacity of the network.



ResNet
Each colored block of layers represents a series of convolutions of the same dimension.
The feature mapping is periodically downsampled by strided convolution accompanied by an
increase in channel depth to preserve the time complexity per layer.
Dotted lines denote residual connections in which we project the input via a 1x1 convolution to
match the dimensions of the new block.
This layout corresponds to the ResNet-34 architecture.
For the ResNet 50 model, we simply replace each two layer residual block with a three layer
bottleneck block which uses 1x1 convolutions to reduce and subsequently restore the channel depth,
allowing for a reduced computational load when calculating the 3x3 convolution.
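
A sketch of such a bottleneck block; the 4x reduction ratio (256 -> 64 -> 256 channels) is the commonly used choice, stated here as an assumption:

import torch
import torch.nn as nn

class BottleneckBlock(nn.Module):
    # A 1x1 convolution reduces the channel depth, the 3x3 convolution works
    # on the reduced depth, and a final 1x1 convolution restores it.
    def __init__(self, channels, reduced):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(channels, reduced, kernel_size=1, bias=False),   # reduce
            nn.BatchNorm2d(reduced), nn.ReLU(),
            nn.Conv2d(reduced, reduced, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(reduced), nn.ReLU(),
            nn.Conv2d(reduced, channels, kernel_size=1, bias=False),   # restore
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.block(x) + x)   # residual connection as before

block = BottleneckBlock(channels=256, reduced=64)
print(block(torch.randn(1, 256, 14, 14)).shape)   # (1, 256, 14, 14)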

AlexNet
This was the first architecture to use GPUs to boost training performance.
AlexNet consists of 5 convolution layers, 3 max-pooling layers, 2 normalization layers, 2 fully
connected layers, and 1 SoftMax layer.
Each convolution layer consists of a convolution filter and a non-linear activation function called
“ReLU”.
The pooling layers are used to perform the max-pooling function and the input size is fixed due to
the presence of fully connected layers.
The input size is mentioned in most places as 224x224x3, but due to the padding involved it
effectively works out to 227x227x3.
In all, AlexNet has over 60 million parameters.
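
For experimentation, a pretrained AlexNet can be loaded from torchvision; a minimal sketch (torchvision's published variant expects 224x224 inputs):

import torch
from torchvision.models import alexnet

model = alexnet(weights="IMAGENET1K_V1")   # pretrained ImageNet weights
model.eval()
with torch.no_grad():
    out = model(torch.randn(1, 3, 224, 224))
print(out.shape)   # torch.Size([1, 1000]), one logit per ImageNet class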

AlexNet
Key Features:
ReLU is used as the activation function rather than tanh.
Batch size of 128.
SGD with momentum is used as the learning algorithm.
Data augmentation is carried out, such as flipping, jittering, cropping, colour normalization, etc.
AlexNet was trained on a GTX 580 GPU with only 3 GB of memory, which couldn't fit the entire
network.
So the network was split across 2 GPUs, with half of the neurons (feature maps) on each GPU.



AlexNet
Max pooling is a feature commonly built into Convolutional Neural Network (CNN) architectures.
The main idea behind a pooling layer is to "accumulate" features from the maps generated by convolving a
filter over an image.
Formally, its function is to progressively reduce the spatial size of the representation to reduce the
number of parameters and computations in the network.
The most common form of pooling is max pooling.



AlexNet
ReLU Non-Linearity: AlexNet
demonstrates that non-saturating activation
functions such as ReLU train deep CNNs
much more quickly than saturating ones
like tanh or sigmoid.
The original paper shows AlexNet reaching
a training error rate of 25% six times
faster with ReLUs than an equivalent
network using tanh.
This was evaluated on the CIFAR-10 dataset.



AlexNet
Data Augmentation: Overfitting can be reduced by showing the neural network varied versions of the same
image. This also produces more data and compels the network to learn the salient features rather than
memorizing the training set.
Augmentation by Mirroring: Suppose our training set contains a picture of a cat. A cat can also be
seen as its mirror image. This means that by simply flipping the image about the vertical axis, we can
double the size of the training dataset.



AlexNet
Data Augmentation:
Augmentation by Random Cropping of Images: Randomly cropping the original image will also
produce additional data that is simply the original data shifted.
For the network's inputs, the creators of AlexNet selected random crops with dimensions of 227 by 227
from within the 256 by 256 image boundary. They multiplied the size of the data by a factor of 2048 using this
technique.
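
Both augmentations can be sketched with torchvision transforms; the original pipeline is richer, and these two lines cover only mirroring and random cropping:

from torchvision import transforms

# Random mirroring plus random 227x227 crops from 256x256 images, applied on the fly.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),   # mirror about the vertical axis
    transforms.RandomCrop(227),               # crop within the 256x256 boundary
    transforms.ToTensor(),
])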



AlexNet
Dropout: A neuron is removed from the neural network during dropout with a probability of 0.5. A
dropped neuron does not make any contribution to either forward or backward propagation. In effect,
each input is processed by a differently thinned network architecture, so the learned weight parameters
are more robust and less prone to overfitting.
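
A minimal dropout sketch in PyTorch; note that dropout is active only in training mode and is disabled by model.eval():

import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)   # each unit is zeroed with probability 0.5
x = torch.ones(8)

drop.train()
print(drop(x))   # roughly half the entries zeroed; survivors scaled by 1/(1-p) = 2
drop.eval()
print(drop(x))   # identity at evaluation time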



InceptionNet (GoogleNet)
In 2014, researchers at Google introduced the Inception network which took first place in the 2014
ImageNet competition for classification and detection challenges.
The model is comprised of a basic unit referred to as an "Inception cell" in which we perform a
series of convolutions at different scales and subsequently aggregate the results.
In order to save computation, 1x1 convolutions are used to reduce the input channel depth.
For each cell, we learn a set of 1x1, 3x3, and 5x5 filters which can learn to extract features at
different scales from the input.
Max pooling is also used, albeit with "same" padding to preserve the dimensions so that the output
can be properly concatenated.
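
A sketch of such an Inception cell; the branch channel counts are illustrative, not the exact GoogLeNet values:

import torch
import torch.nn as nn

class InceptionCell(nn.Module):
    # Parallel 1x1, 3x3, and 5x5 branches plus a pooling branch; 1x1
    # convolutions reduce the channel depth before the costly convolutions.
    def __init__(self, in_ch):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, 16, kernel_size=1)
        self.b3 = nn.Sequential(nn.Conv2d(in_ch, 16, 1),
                                nn.Conv2d(16, 24, 3, padding=1))
        self.b5 = nn.Sequential(nn.Conv2d(in_ch, 16, 1),
                                nn.Conv2d(16, 24, 5, padding=2))
        self.pool = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),  # "same" pooling
                                  nn.Conv2d(in_ch, 16, 1))

    def forward(self, x):
        # Every branch preserves height and width, so the outputs
        # can be concatenated along the channel dimension.
        return torch.cat([self.b1(x), self.b3(x), self.b5(x), self.pool(x)], dim=1)

print(InceptionCell(32)(torch.randn(1, 32, 28, 28)).shape)   # (1, 80, 28, 28)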

InceptionNet (GoogleNet)
These researchers published a follow-up paper which introduced more efficient alternatives to the
original Inception cell.
Convolutions with large spatial filters (such as 5x5 or 7x7) are beneficial in terms of their expressiveness
and ability to extract features at a larger scale, but the computation is disproportionately expensive.
The researchers pointed out that a 5x5 convolution can be more cheaply represented by two stacked
3x3 filters.



InceptionNet (GoogleNet)
Whereas a 5×5×c filter requires 25c parameters, two stacked 3×3×c filters
require only 18c parameters.
In order to most accurately represent a 5x5 filter, we shouldn't use
any nonlinear activations between the two 3x3 layers.
However, it was discovered that "linear activation was always inferior
to using rectified linear units in all stages of the factorization."
It was also shown that 3x3 convolutions could be further
deconstructed into successive 3x1 and 1x3 convolutions.
Generalizing this insight, we can more efficiently compute
an n×n convolution as a 1×n convolution followed by
a n×1 convolution.
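
A sketch of this factorization for n = 7 (the channel count is arbitrary): a 1x7 convolution followed by a 7x1 convolution covers the same 7x7 receptive field with far fewer parameters:

import torch
import torch.nn as nn

factorized = nn.Sequential(
    nn.Conv2d(32, 32, kernel_size=(1, 7), padding=(0, 3)),   # 1 x n
    nn.Conv2d(32, 32, kernel_size=(7, 1), padding=(3, 0)),   # n x 1
)
full = nn.Conv2d(32, 32, kernel_size=7, padding=3)           # plain n x n

x = torch.randn(1, 32, 28, 28)
print(factorized(x).shape == full(x).shape)   # True: identical output size
# Weights per filter position: 7*7 = 49 versus 1*7 + 7*1 = 14 per input channel.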

InceptionNet (GoogleNet)
In order to improve overall network performance, two auxiliary outputs are added throughout the
network.
It was later discovered that the earliest auxiliary output had no discernible effect on the final quality
of the network.
The addition of auxiliary outputs primarily benefited the model toward the end of training, converging
at a slightly better value than the same network architecture without an auxiliary branch.
It is believed the addition of auxiliary outputs had a regularizing effect on the network.
A revised, deeper version of the Inception network takes advantage of these more efficient
Inception cells.


Note for Students
This PowerPoint presentation is intended to accompany the lectures; students are therefore
advised to also utilize the textbooks and lecture notes.
