
Neural Network and Deep Learning

Ahmed Elhelbawy
Naxcen Quantum Society Research Community
December 2024

Neural Network
• Mimics the functionality of the brain.
• A neural network is a graph of neurons (nodes, units, etc.) connected by weighted links.
Neural Network: Neuron

Neural Network: Perceptron

• A network with only a single layer.
• No hidden layers.
Neural Network: Perceptron

Exercise: choose weights and a threshold so that a single perceptron (output a = 1 when the weighted sum exceeds t) computes each gate:
• AND gate: inputs x1, x2 with weights w1 = ?, w2 = ?, threshold t = ?
• OR gate: inputs x1, x2 with weights w1 = ?, w2 = ?, threshold t = ?
• NOT gate: input x1 with weight w1 = ?, threshold t = ?

Neural Network: Perceptron

Solution (a runnable sketch follows below):
• AND gate: w1 = 1, w2 = 1, t = 1.5
• OR gate: w1 = 1, w2 = 1, t = 0.5
• NOT gate: w1 = -1, t = -0.5
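Below is a minimal Python sketch (not from the slides) checking that these weights and thresholds realize the gates; the step activation (output 1 when the weighted sum exceeds t) is the assumption:

# Step-activation perceptron: fire (1) when the weighted sum exceeds threshold t.
def perceptron(inputs, weights, t):
    s = sum(x * w for x, w in zip(inputs, weights))
    return 1 if s > t else 0

# Truth tables for the three gates with the weights from the solution above.
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2,
              "AND:", perceptron((x1, x2), (1, 1), 1.5),
              "OR:",  perceptron((x1, x2), (1, 1), 0.5))
print("NOT 0:", perceptron((0,), (-1,), -0.5), "NOT 1:", perceptron((1,), (-1,), -0.5))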
Neural Network: Multi-Layer Perceptron (MLP) or Feed-Forward Network (FNN)
• A network with n + 1 layers: one output layer and n hidden layers.

Training: Backpropagation algorithm

• Gradient descent algorithm


Training: Backpropagation algorithm (a numpy sketch follows the steps)
1. Initialize the network with random weights.
2. For all training cases (called examples):
   a. Present the training inputs to the network and calculate the output.
   b. For all layers (starting with the output layer, back to the input layer):
      i.  Compare the network output with the correct output (error function).
      ii. Adapt the weights in the current layer.
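As an illustration of these steps, here is a toy numpy sketch (not from the slides): a 2-2-1 network trained on XOR with sigmoid units. The learning rate, epoch count, and seed are assumptions, and convergence can require a different seed or more epochs:

import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Step 1: initialize the network with random weights.
W1 = rng.normal(size=(2, 2)); b1 = np.zeros(2)
W2 = rng.normal(size=(2, 1)); b2 = np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for epoch in range(5000):
    # Step 2a: present the inputs and calculate the output.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Step 2b-i: compare with the correct output (squared-error derivative).
    d_out = (out - y) * out * (1 - out)
    # Step 2b-ii: adapt weights layer by layer, from the output back to the input.
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= 0.5 * h.T @ d_out;  b2 -= 0.5 * d_out.sum(axis=0)
    W1 -= 0.5 * X.T @ d_h;    b1 -= 0.5 * d_h.sum(axis=0)

print(np.round(out, 2))  # should approach [0, 1, 1, 0]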

Deep Learning
What is Deep Learning?
• A family of methods that use deep architectures to learn high-level feature representations.

Example 1: [figure illustrating learned high-level features; label "MAN"]

Example 2: [figure]

Why are Deep Architectures hard to train?

• Vanishing/exploding gradient problem in backpropagation.
Layer-wise Pre-training
• First, train one layer at a time, optimizing the data-likelihood objective P(x).

Layer-wise Pre-training
• Then, train the second layer, optimizing the data-likelihood objective P(h).

Layer-wise Pre-training
• Finally, fine-tune on the labelled objective P(y|x) by backpropagation.

Deep Belief Nets

• Use Restricted Boltzmann Machines (RBMs).
• Hinton et al. (2006), "A fast learning algorithm for deep belief nets".
Restricted Boltzmann Machine (RBM)

• RBM is a simple energy-based model:

  P(x, h) = exp(-E(x, h)) / Z,  where  E(x, h) = -h^T W x - b^T x - d^T h
  and Z is the partition function that normalizes the distribution.

Example:
• Let the weights w(h1, x1) and w(h1, x3) be positive, all others zero, and b = d = 0.
• Calculate p(x, h).
• Answer: the highest-probability configuration is (x1 = 1, x2 = 0, x3 = 1, h1 = 1, h2 = 0, h3 = 0).

Restricted Boltzmann Machine (RBM)

• P(x, h) = P(h|x) P(x)
• P(h|x): easy to compute.
• P(x): hard to compute, because the partition function Z sums over exponentially many configurations.

Contrastive Divergence: an approximate training procedure that avoids computing Z (a sketch follows below).
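Below is a minimal numpy sketch of one CD-1 update for a binary RBM; the function name, sizes, and learning rate are illustrative assumptions, not from the slides:

import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def cd1_step(x, W, b, d, lr=0.1):
    # Positive phase: sample hidden units from p(h|x), which is easy to compute.
    ph = sigmoid(x @ W + d)
    h = (rng.random(ph.shape) < ph).astype(float)
    # Negative phase: one Gibbs step back to a reconstruction of x.
    px = sigmoid(h @ W.T + b)
    x_neg = (rng.random(px.shape) < px).astype(float)
    ph_neg = sigmoid(x_neg @ W + d)
    # Update: data statistics minus reconstruction statistics (no Z needed).
    W += lr * (np.outer(x, ph) - np.outer(x_neg, ph_neg))
    b += lr * (x - x_neg)
    d += lr * (ph - ph_neg)

# Toy usage with 3 visible and 3 hidden units, as in the example above.
W = rng.normal(scale=0.1, size=(3, 3)); b = np.zeros(3); d = np.zeros(3)
cd1_step(np.array([1.0, 0.0, 1.0]), W, b, d)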
Deep Belief Nets (DBN) = Stacked RBM

Auto-Encoders: A Simpler Alternative to RBMs
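A minimal numpy auto-encoder sketch (the sizes, learning rate, and linear decoder are assumptions for illustration): encode to a smaller code, decode, and reduce the reconstruction error by gradient descent.

import numpy as np

rng = np.random.default_rng(0)
X = rng.random((32, 4))                      # toy dataset: 32 samples, 4 features
We = rng.normal(scale=0.1, size=(4, 2))      # encoder weights: 4-D -> 2-D code
Wd = rng.normal(scale=0.1, size=(2, 4))      # decoder weights: 2-D code -> 4-D

for _ in range(2000):
    code = np.tanh(X @ We)                   # encoder
    recon = code @ Wd                        # decoder (reconstruction)
    err = recon - X                          # reconstruction error
    Wd -= 0.01 * code.T @ err                # gradient step on squared error
    We -= 0.01 * X.T @ ((err @ Wd.T) * (1 - code**2))

print(np.mean(err**2))                       # mean squared reconstruction error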
Deep Learning - Architectures
• Recurrent Neural Network (RNN)
• Convolutional Neural Network (CNN)

Recurrent Neural Network (RNN)
• Enables networks to do temporal processing and learn sequences.

• Character-level language model. Vocabulary: [h, e, l, o]
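A minimal untrained vanilla-RNN forward pass in numpy over this vocabulary; the hidden size is an assumption, and the weight names U (input-to-hidden), W (recurrent), V (hidden-to-output) match the BPTT figure below. With random weights the predictions are meaningless until trained:

import numpy as np

vocab = ['h', 'e', 'l', 'o']
rng = np.random.default_rng(0)
H = 8                                    # hidden size (illustrative)
U = rng.normal(scale=0.1, size=(H, 4))   # input -> hidden
W = rng.normal(scale=0.1, size=(H, H))   # hidden -> hidden (recurrence)
V = rng.normal(scale=0.1, size=(4, H))   # hidden -> output scores

h = np.zeros(H)
for ch in "hell":                        # feed "hell"; each step predicts the next char
    x = np.eye(4)[vocab.index(ch)]       # one-hot encoding of the input character
    h = np.tanh(U @ x + W @ h)           # recurrent state update
    scores = V @ h
    probs = np.exp(scores) / np.exp(scores).sum()   # softmax over the vocabulary
    print(ch, '->', vocab[int(np.argmax(probs))])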

Training of RNN: BPTT (Backpropagation Through Time)

[figure: RNN unrolled through time, with input-to-hidden weights U, recurrent weights W, and hidden-to-output weights V; predicted outputs are compared against actual outputs at each step]

Training of RNN: BPTT

• One to many: sequence output (e.g. image captioning takes an image and outputs a sentence of words).
• Many to one: sequence input (e.g. sentiment analysis, where a given sentence is classified as expressing positive or negative sentiment).
• Many to many: sequence input and sequence output (e.g. machine translation: an RNN reads a sentence in English and then outputs a sentence in French).
• Many to many (synced): synced sequence input and output (e.g. language modelling, where we wish to predict the next word).

RNN Extensions
• Bidirectional RNNs
• Deep (Bidirectional) RNNs

RNN (Cont.)
• "the clouds are in the sky": the relevant context ("the clouds are in the") sits right next to the word to predict ("sky"), so a simple RNN can learn it.

RNN (Cont.)
• "India is my home country. I can speak fluent Hindi.": predicting "Hindi" requires remembering "India" from much earlier in the sequence.

It is very hard for an RNN to learn such "long-term dependencies".


LSTM (Long Short-Term Memory)
• Capable of learning long-term dependencies.

[figures: simple RNN cell vs. LSTM cell]

LSTM
• LSTMs remove or add information to the cell state, carefully regulated by structures called gates.

• Cell state: the conveyor belt of the cell.


LSTM
• Gates (a step-by-step sketch follows after this list):
  – Forget gate
  – Input gate
  – Output gate
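A minimal numpy sketch of one LSTM step using the standard gate equations (the names and sizes are illustrative assumptions):

import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, b):
    # W maps the concatenated [h, x] to the four gate pre-activations.
    z = W @ np.concatenate([h, x]) + b
    f, i, o, g = np.split(z, 4)
    f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)   # forget, input, output gates
    g = np.tanh(g)                                 # candidate cell update
    c = f * c + i * g       # cell state ("conveyor belt"): forget old, add new
    h = o * np.tanh(c)      # output gate decides what the cell reveals
    return h, c

# Toy usage: 4-D input, 8-D hidden/cell state.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4 * 8, 8 + 4)); b = np.zeros(4 * 8)
h, c = np.zeros(8), np.zeros(8)
h, c = lstm_step(rng.random(4), h, c, W, b)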
LSTM Variants

Convolutional Neural Network (CNN)

• A special kind of multi-layer neural network.
• Implicitly extracts relevant features.
• A fully-connected architecture does not take the spatial structure into account.
• In contrast, a CNN tries to take advantage of the spatial structure.

Convolutional Neural Network (CNN)

1. Convolutional layer
2. Pooling layer
3. Fully connected layer
Convolutional Neural Network (CNN)

1. Convolutional layer (a worked computation follows below)

Convolution filter:
1 0 1
0 1 0
1 0 1

Image:
1 1 1 0 0
0 1 1 1 0
0 0 1 1 1
0 0 1 1 0
0 1 1 0 0
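A short numpy sketch that slides the 3x3 filter over the 5x5 image above (stride 1, no padding) and prints the resulting feature map:

import numpy as np

image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]])
kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]])

# Multiply each 3x3 patch elementwise by the filter and sum the products.
out = np.zeros((3, 3), dtype=int)
for i in range(3):
    for j in range(3):
        out[i, j] = np.sum(image[i:i+3, j:j+3] * kernel)
print(out)   # [[4 3 4]
             #  [2 4 3]
             #  [2 3 4]]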

Convolutional Neural Network (CNN)

1. Convolutional layer
• Local receptive fields
• Shared weights

Convolutional Neural Network (CNN)

2. Pooling layer
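A minimal max-pooling sketch (2x2 windows, stride 2; the feature-map values are made up for illustration):

import numpy as np

fmap = np.array([[1, 3, 2, 1],
                 [4, 6, 5, 2],
                 [3, 1, 1, 0],
                 [1, 2, 2, 4]])
# Take the maximum of each non-overlapping 2x2 block.
pooled = fmap.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)   # [[6 5]
                #  [3 4]]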
Convolutional Neural Network (CNN)

3. Fully connected layer


Convolutional Neural Network (CNN)

Putting it all together (a Keras sketch follows below):

Input matrix → 3 convolution filters → convolution features → pooling → pooled features → flatten → fully-connected layers → labels
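One way to express this pipeline is sketched below with tf.keras (assuming TensorFlow is installed; the input shape and layer sizes are illustrative assumptions, not from the slides):

import tensorflow as tf

# Input matrix -> 3 convolution filters -> pooling -> flatten -> fully-connected -> labels.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),                     # input matrix
    tf.keras.layers.Conv2D(3, (3, 3), activation='relu'),  # 3 convolution filters
    tf.keras.layers.MaxPooling2D((2, 2)),                  # pooling
    tf.keras.layers.Flatten(),                             # flatten
    tf.keras.layers.Dense(10, activation='softmax'),       # fully-connected -> labels
])
model.summary()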


Example 1: CNN for Images

Example 2: CNN for Text
