Chapter 6: Deep Learning for Natural Language
Adama Science and Technology University
School of Electrical Engineering and Computing
Department of CSE
Dr. Mesfin Abebe Haile (2022)
Outline

 Introduction
 Why deep learning for NLP
 Overview of NLP
 Basic Structure of NN
 Different types of layers
 Activation function
 Types of Neural Network
 Convolutional NN
 Recurrent NN

Deep Learning for Natural Language

 Deep learning is an extended field of machine learning that has proven to be highly useful, primarily in the domains of text, image, and speech processing.
 The collection of algorithms implemented under deep learning bears similarities to the relationship between stimuli and neurons in the human brain.
 Deep learning has extensive applications in computer vision, language translation, speech recognition, image generation, and so forth.
 These algorithms can learn in both a supervised and an unsupervised fashion.

Deep Learning for Natural Language

 A majority of deep learning algorithms are based on the concept of artificial neural networks, and training such algorithms has been made easier in today's world by the availability of abundant data and sufficient computational resources.
 With additional data, the performance of deep learning models just keeps on improving.

 The term deep in deep learning refers to the depth of the artificial neural network architecture, and learning stands for learning through the artificial neural network itself.

Deep Learning for Natural Language

 The figure below illustrates the difference between a deep and a shallow network, and why the term deep learning gained currency.

Deep Learning for Natural Language

 Representation of deep and shallow networks (figure).

Deep Learning for Natural Language

 Deep neural networks are capable of discovering latent structures (feature learning) from unlabeled and unstructured data, such as images (pixel data), documents (text data), or files (audio, video data).
 What differentiates a deep neural network from an ordinary artificial neural network is the way we use backpropagation.
 In an ordinary artificial neural network, backpropagation trains later (or end) layers more efficiently than it trains initial (or earlier) layers.
 Thus, as we travel back into the network, errors become smaller and more diffused.

Deep Learning for Natural Language

 How deep is "deep"?
 A deep neural network is simply a feedforward neural network with multiple hidden layers.
 If there are many layers in the network, then we say that the network is deep.

 Neural networks are a biologically inspired paradigm that enables a computer to learn human faculties from observational data.

Deep Learning for Natural Language

 Multiple open source platforms and libraries are available for deep learning (figure).

Basic Structure of Neural Network

 The basic principle behind a neural network is a collection of basic elements, the artificial neuron or perceptron, first developed in the 1950s by Frank Rosenblatt.

 They take several binary inputs, x1, x2, ..., xN, and produce a single binary output if the sum is greater than the activation potential.
 The neuron is said to "fire" whenever the activation potential is exceeded, and it behaves as a step function.

Basic Structure of Neural Network

 Biological analogy:
 An ANN is a computational model that simulates some properties of the human brain.

Biological Neural Network vs. Artificial Neural Network (figure).

Basic Structure of Neural Network

 The neurons that fire pass the signal along to other neurons connected to their dendrites, which, in turn, will fire if the activation potential is exceeded, thus producing a cascading effect.

Basic Structure of Neural Network

 As not all inputs have the same emphasis, weights are attached to each of the inputs, xi, to allow the model to assign more importance to some inputs.

 Thus, the output is 1 if the weighted sum is greater than the activation potential (or bias), i.e., output = 1 if w1*x1 + w2*x2 + ... + wN*xN > b, and 0 otherwise.

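 To make the rule above concrete, the following is a minimal Python/NumPy sketch of a single perceptron (not taken from the slides; the inputs, weights, and threshold values are illustrative only):

import numpy as np

def perceptron(x, w, b):
    # Fire (output 1) only if the weighted sum of the inputs exceeds the threshold b.
    return 1 if np.dot(w, x) > b else 0

# Illustrative values: three binary inputs, hand-picked weights and threshold.
x = np.array([1, 0, 1])
w = np.array([0.6, 0.2, 0.3])
print(perceptron(x, w, b=0.5))   # weighted sum is 0.9 > 0.5, so the neuron fires: 1
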
Basic Structure of Neural Network

 Multilayer perceptrons (MLPs) belong to the category of feedforward neural networks and are made up of three types of layers: an input layer, one or more hidden layers, and a final output layer.
 A normal MLP has the following properties:
 Hidden layers with any number of neurons,
 An input layer using linear functions,
 Hidden layer(s) using an activation function, such as sigmoid,
 An output layer giving any number of outputs through an activation function,
 Properly established connections between the input layer, hidden layer(s), and output layer.

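 As a rough sketch of the three-layer structure just described (input layer, one hidden layer with a sigmoid activation, output layer), the forward pass below uses NumPy with randomly initialized weights; the layer sizes are arbitrary illustrative choices, not something the slides prescribe:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Arbitrary sizes: 4 input features, 8 hidden neurons, 3 output values.
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)   # input layer -> hidden layer
W2, b2 = rng.normal(size=(3, 8)), np.zeros(3)   # hidden layer -> output layer

def mlp_forward(x):
    h = sigmoid(W1 @ x + b1)   # hidden layer with sigmoid activation
    return W2 @ h + b2         # output layer (raw scores)

print(mlp_forward(rng.normal(size=4)))
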
Basic Structure of Neural Network

 MLPs are also known as universal approximators, as they can find the relationship between the input values and the targets by using a sufficient number of neurons in the hidden layer.

 This does not even require a significant amount of prior information about the mapping between input and output values.
 Often, given this degree of freedom, an MLP can outperform the basic MLP network by introducing more hidden layers, with fewer neurons in each of the hidden layers, and optimum weights.

Basic Structure of Neural Network

 The following are a few of the features of network architecture that have a direct impact on its performance.
 Hidden layers: These contribute to the generalization factor of the network. In most cases, a single layer is sufficient to encompass the approximation of any desired function, supported by a sufficient number of neurons.

 Hidden neurons: The number of neurons present across the hidden layer(s), which can be selected using any suitable formulation.

Basic Structure of Neural Network

 The following are a few of the features of network architecture that have a direct impact on its performance (continued).
 Output nodes: The count of output nodes is usually equal to the number of classes into which we want to classify the target value.

 Activation functions: These are applied to the inputs of individual nodes.

Basic Structure of Neural Network

 In practice, this simple form is difficult to work with, owing to the abrupt nature of the step function.
 So, a modified form was created to behave more predictably, i.e., small changes in weights and bias cause only a small change in output. There are two main modifications.

Basic Structure of Neural Network

 The inputs can take on any value between 0 and 1, instead of being binary.
 To make the output behave more smoothly for given inputs x1, x2, …, xN, weights w1, w2, …, wN, and bias b, use the following sigmoid function: output = 1 / (1 + exp(-(w1*x1 + w2*x2 + ... + wN*xN + b))).

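 A minimal sketch of such a sigmoid neuron, assuming the weighted-sum-plus-bias form given above (the input, weight, and bias values are illustrative only):

import numpy as np

def sigmoid_neuron(x, w, b):
    # Smooth version of the perceptron: the output varies continuously in (0, 1).
    z = np.dot(w, x) + b
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.25, 0.9, 0.4])        # inputs may take any value, not just 0 or 1
w = np.array([0.6, -0.3, 0.8])
print(sigmoid_neuron(x, w, b=0.1))    # small changes in w or b change this output only slightly
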
Basic Structure of Neural Network

 Other activation functions can be better choices than sigmoid for deep networks.
 Hyperbolic tangent function (pronounced "tanch"): tanh(z) = (exp(z) - exp(-z)) / (exp(z) + exp(-z)), which squashes values into the range (-1, 1).

Basic Structure of Neural Network

 In addition to the usual sigmoid function, other nonlinearities that are more frequently used include the following.
 ReLU: Rectified linear unit. This keeps the activation clamped at zero for negative inputs. It is computed as f(x) = max(0, x).
 The graph of the ReLU function has the value 0 for all x <= 0, and a linear slope of 1 for all x > 0.

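 The short NumPy sketch below compares the tanh and ReLU nonlinearities described above (the sample inputs are illustrative only):

import numpy as np

def tanh(z):
    # Hyperbolic tangent: squashes values into the range (-1, 1).
    return np.tanh(z)

def relu(z):
    # ReLU: 0 for z <= 0, identity (slope 1) for z > 0.
    return np.maximum(0.0, z)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(tanh(z))   # approximately [-0.96 -0.46  0.    0.46  0.96]
print(relu(z))   # [0.  0.  0.  0.5 2. ]
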
Basic Structure of Neural Network

 ReLUs quite often face the issue of dying, especially when the learning rate is set to a higher value, as this triggers weight updates that prevent the activation of specific neurons, thereby making the gradient of those neurons forever zero.
 Another risk with ReLU is explosion of the activation, as for positive inputs the input value, xj, is itself the output and is unbounded.

Basic Structure of Neural Network

 ReLU offers other benefits as well, such as the introduction of sparsity in cases where xj is below 0, leading to sparse representations; and because the gradient it returns is constant where ReLU is active, it results in faster learning, accompanied by a reduced likelihood of the gradient vanishing.

 LReLUs (Leaky ReLUs): These mitigate the issue of dying ReLUs by introducing a small slope (~0.01) for values of x less than 0.
 LReLUs do offer successful results in some scenarios, although not always.

Basic Structure of Neural Network

 ELU (Exponential Linear Unit): These offer negative values that push the mean unit activations closer to zero, thereby speeding up the learning process by bringing the gradient closer to the unit natural gradient.
 Softmax: Also referred to as the normalized exponential function, this transforms a set of given real values into the range (0, 1), such that their combined sum is 1.

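 A sketch of these remaining nonlinearities (leaky ReLU with the ~0.01 slope mentioned above, ELU, and softmax); the alpha values are common defaults assumed here, not values prescribed by the slides:

import numpy as np

def leaky_relu(z, alpha=0.01):
    # A small slope alpha for z < 0 mitigates "dying" ReLUs.
    return np.where(z > 0, z, alpha * z)

def elu(z, alpha=1.0):
    # Saturating negative values push mean activations toward zero.
    return np.where(z > 0, z, alpha * (np.exp(z) - 1.0))

def softmax(z):
    # Normalized exponential: outputs lie in (0, 1) and sum to 1.
    e = np.exp(z - np.max(z))   # subtract the max for numerical stability
    return e / e.sum()

z = np.array([2.0, 1.0, -1.0])
print(leaky_relu(z))
print(elu(z))
print(softmax(z))   # roughly [0.71, 0.26, 0.04], summing to 1
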
Basic Structure of Neural Network

 As in the mammalian brain, individual neurons are organized in layers, with connections within a layer and to the next layer, creating an artificial neural network (ANN), or multilayer perceptron (MLP).

 The layers between input and output are referred to as hidden layers, and the density and type of the connections between layers is the configuration.
 For example, a fully connected configuration has all the neurons of layer L connected to those of layer L + 1.

Basic Structure of Neural Network

 An illustration of two hidden layers with dense connections.

Types of Neural Network

 Feedforward neural networks constitute the basic units of the neural network family.

 Data movement in any feedforward neural network is from the input layer to the output layer, via the hidden layers present, without any kind of loops.
 Output from one layer serves as input to the next layer, with restrictions on any kind of loops in the network architecture.

Types of Neural Network

 Worked example (figure, shown step by step over several slides): a small feedforward network takes the independent variables Age = 34, Gender = 2, and Stage = 4 as inputs, passes them through a weighted hidden layer, and produces a single output of 0.6, interpreted as the "Probability of being Alive" (the dependent variable, i.e., the prediction).

Types of Neural Network

 So far (feedforward, backpropagation), the structure of our neural network treats all inputs interchangeably.
 No relationships between the individual inputs.
 Just an ordered set of variables.

 We want to incorporate domain knowledge into the architecture of a neural network.

Types of Neural Network

 Image data has important structures, such as:
 "Topology" of pixels,
 Issues of lighting and contrast,
 Knowledge of the human visual system,
 Nearby pixels tend to have similar values,
 Edges and shapes,
 Scale invariance – objects may appear at different sizes in the image.
 From a dimensionality standpoint, taking advantage of these structures means far fewer parameters (see the parameter-count sketch below), which is what the convolutional version, the Convolutional Neural Network (CNN), exploits.

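 A quick back-of-the-envelope calculation of the parameter savings (the image and filter sizes below are illustrative assumptions, not from the slides):

# Fully connected: every pixel of a 28x28 grayscale image connects to 100 hidden units.
fc_params = 28 * 28 * 100 + 100        # weights + biases = 78,500

# Convolutional: 32 filters of size 3x3, shared across all pixel positions.
conv_params = 32 * (3 * 3 * 1) + 32    # weights + biases = 320

print(fc_params, conv_params)          # 78500 vs. 320
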
Convolutional Neural Network

 Convolutional neural networks are well adapted for image recognition and handwriting recognition.
 Their structure is based on sampling a window or portion of an image, detecting its features, and then using the features to build a representation.

 This leads to the use of several layers; thus, these models were the first deep learning models.

Convolutional Neural Network

 A CNN is a neural network with some convolutional layers (and some other layers).
 A convolutional layer has a number of filters that perform the convolution operation, as sketched below.

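 A minimal sketch of the convolution operation that a single filter performs, written directly in NumPy (a real convolutional layer would apply many such filters followed by a nonlinearity; the sizes below are illustrative):

import numpy as np

def convolve2d(image, kernel):
    # Slide the filter over the image and take a dot product at each position
    # (valid padding, stride 1).
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.random.default_rng(0).random((6, 6))   # toy 6x6 "image"
edge_filter = np.array([[1, 0, -1],
                        [1, 0, -1],
                        [1, 0, -1]])              # a simple vertical-edge detector
print(convolve2d(image, edge_filter).shape)       # (4, 4) feature map
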
Convolutional Neural Network

 From a memory and capacity standpoint, the CNN is not much bigger than a regular two-layer network.

 At runtime, the convolution operations are computationally expensive and take up about 67% of the time.
 CNNs are about 3x slower than their fully connected equivalents (size-wise).

Motivation of Sequential Model

Recurrent Neural Network

 Recurrent neural networks (RNNs) are used when a data pattern changes over time.
 RNNs can be thought of as being unrolled over time.
 An RNN applies the same layer to the input at each time step, using the output (i.e., the state) of previous time steps as input.

 RNNs have feedback loops in which the output from the previous firing, at time index T, is fed as one of the inputs at time index T + 1.
 There might be cases in which the output of the neuron is fed to itself as input.

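 The sketch below shows the feedback just described: the same weights are reused at every time step, and the previous hidden state is fed back in alongside the current input (dimensions and initialization are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size = 5, 8                    # arbitrary illustrative sizes
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)

def rnn_step(x_t, h_prev):
    # One recurrent step: the new state depends on the current input and the previous state.
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

h = np.zeros(hidden_size)                         # initial state
for x_t in rng.normal(size=(10, input_size)):     # a toy sequence of 10 time steps
    h = rnn_step(x_t, h)                          # the same layer is applied at every step
print(h.shape)
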
Recurrent Neural Network

 These are well suited for applications involving sequences; they are widely used in problems related to video, which is a time sequence of images, and for translation purposes, where understanding the next word is based on the context of the previous text.
 Following are various types of RNNs:
 Encoding recurrent neural networks: This set of RNNs enables the network to take an input in sequence form.

Recurrent Neural Network

 Following are various types of RNNs:
 Generating recurrent neural networks: Such networks basically output a sequence of numbers or values, like the words in a sentence.

Recurrent Neural Network

 Following are various types of RNNs:
 General recurrent neural networks: These networks are a combination of the preceding two types of RNNs.
 General RNNs are used to generate sequences and, thus, are widely used in NLG (natural language generation) tasks.

Recurrent Neural Network

 With images, we forced them into a specific input dimension.
 It is not obvious how to do this with text.
 We will use a new network structure called a "Recurrent Neural Network".
 Issue: variable-length sequences of words.

 We want to do better than "bag of words" implementations.
 Ideally, each word is processed or understood in the appropriate context.
 We need to have some notion of "context".

Recurrent Neural Network

 We have focused on text/words as the RNN application.
 But RNNs can be used for other sequential data:
 Time-series data,
 Speech recognition,
 Sensor data,
 Genome sequences.

 The nature of the state transition means it is hard to keep information from the distant past in current memory without reinforcement.

Recurrent Neural Network

 Issue: Standard RNNs have poor memory.
 The transition matrix necessarily weakens the signal.
 We need a structure that can leave some dimensions unchanged over many steps.
 This is the problem addressed by so-called Long Short-Term Memory RNNs (LSTMs).

 LSTMs define a more complicated update mechanism for changing the internal state, sketched below.
 By default, LSTMs remember the information from the last step.

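 A compact sketch of the gated update an LSTM cell uses to leave parts of its state unchanged across many steps (these are the standard gate equations; sizes and initialization are illustrative assumptions):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_in, n_hid = 4, 6                                         # illustrative sizes
W = rng.normal(scale=0.1, size=(4 * n_hid, n_in + n_hid))  # weights for the four gates, stacked
b = np.zeros(4 * n_hid)

def lstm_step(x_t, h_prev, c_prev):
    z = W @ np.concatenate([x_t, h_prev]) + b
    f, i, o, g = np.split(z, 4)
    f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)  # forget, input, and output gates
    g = np.tanh(g)                                # candidate values
    c = f * c_prev + i * g                        # cell state: passes through unchanged when f ~ 1 and i ~ 0
    h = o * np.tanh(c)                            # hidden state exposed to the next step/layer
    return h, c

h = c = np.zeros(n_hid)
for x_t in rng.normal(size=(5, n_in)):            # a toy sequence
    h, c = lstm_step(x_t, h, c)
print(h.shape, c.shape)
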
Recurrent Neural Network

 There are many different "flavors" of LSTM:
 Gated Recurrent Unit (GRU),
 Depth-Gated RNN.

 LSTMs have considerably more parameters than plain RNNs.

 Most of the big performance improvements in NLP have come from LSTMs, not plain RNNs.

Question & Answer

Thank You !!!

Individual Assignment - Four

 Write a short note on the following topics (not more than 8 pages):
 RNNs,
 LSTM,
 Gated Recurrent Unit (GRU),
 Depth-Gated RNN,
 Bidirectional Recurrent Neural Network.

Individual Assignment - Four

 Explain the following terms by writing a few paragraphs for each of them (not more than 8 pages):
 GPT-2 and GPT-3,
 BERT,
 Hugging Face,
 Transformers,
 Attention Models,
 Word Embeddings,
 Word2vec,
 Gensim.
