1. Deep Learning

Uploaded by

yitej21617

Introduction to Deep Learning
Agenda

❑ What is AI, ML, DL?

❑ Real-World DL
❑ Types of Artificial Intelligence
❑ DL Basics
❑ DL Challenges
❑ NLP
❑ Computer Vision
Artificial Intelligence
What is AI?

Artificial intelligence (AI) is the ability of a digital computer or
computer-controlled robot to perform tasks commonly associated with
intelligent beings.
“AI began with an ancient wish to forge the gods.”
- Pamela McCorduck, Machines Who Think, 1979
Frankenstein (1818)

Ex Machina (2015)

Visualized here are 3% of the neurons and 0.0001% of the synapses in the brain.
Thalamocortical system visualization via DigiCortex Engine.
https://deeplearning.mit.edu 2019
For the full list of references visit:
https://hcai.mit.edu/references
[286]
History of Deep Learning Ideas and Milestones*
• 1943: Neural networks
• 1957: Perceptron
• 1974-86: Backpropagation, RBM, RNN
• 1989-98: CNN, MNIST, LSTM, Bidirectional RNN
• 2006: “Deep Learning”, DBN
• 2009: ImageNet
• 2012: AlexNet, Dropout
• 2014: GANs
• 2014: DeepFace
• 2016: AlphaGo
• 2017: AlphaZero, Capsule Networks
• 2018: BERT

Perspective:
• Universe created 13.8 billion years ago
• Earth created 4.54 billion years ago
• Modern humans 300,000 years ago
• Civilization 12,000 years ago
• Written record 5,000 years ago

* Dates are for perspective and not as definitive historical record of invention or credit

History of DL Tools*
• Mark 1 Perceptron – 1960
• Torch – 2002
• CUDA – 2007
• Theano – 2008
• DistBelief – 2011
• Caffe – 2014
• TensorFlow 0.1 – 2015
• PyTorch 0.1 – 2017
• TensorFlow 1.0 – 2017
• PyTorch 1.0 – 2018
• TensorFlow 2.0 – 2019

* Truncated for clarity over completeness

Neuron: Biological Inspiration for Computation
(Artificial) Neuron: computational building
block for the “neural network”

Neuron: computational building

block for the brain

For the full updated list of references visit:

https://selfdrivingcars.mit.edu/references
[18, 143] https://deeplearning.mit.edu 2019
Biological and Artificial Neural Networks

Human Brain
• Thalamocortical system:
3 million neurons
476 million synapses
• Full brain:
100 billion neurons
1,000 trillion synapses

Artificial Neural Network

• ResNet-152:
60 million synapses

Human brains have ~10,000,000 times more synapses
than artificial neural networks.
[286] https://deeplearning.mit.edu 2019
Neuron: Biological Inspiration for Computation
• Neuron: computational building block for the brain
• (Artificial) Neuron: computational building block for the “neural network”
Key Differences:
• Parameters: Human brains have ~10,000,000 times more synapses than
artificial neural networks.
• Topology: Human brains have no “layers”.
• Async: The human brain works asynchronously, ANNs work synchronously.
• Learning algorithm: ANNs use gradient descent for learning. We don’t know
what human brains use.
• Power consumption: Biological neural networks use very little power
compared to artificial networks.
• Stages: Biological networks usually never stop learning. ANNs first train
then test.
[18, 143] https://deeplearning.mit.edu 2019
Neuron: Forward Pass

[78] https://deeplearning.mit.edu
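The forward pass of a single artificial neuron can be sketched as follows. This is a minimal illustration assuming the usual formulation (a weighted sum of the inputs plus a bias, passed through an activation function); the function name `neuron_forward`, the sigmoid activation, and the example numbers are assumptions and do not come from the slide.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_forward(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, followed by an activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)

# Hypothetical inputs and weights, chosen only for illustration.
output = neuron_forward([0.5, -1.0, 2.0], [0.4, 0.3, 0.1], bias=0.1)
print(output)
```

The weights and the bias are the quantities that training adjusts; the inputs come from the data or from the previous layer.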
Combining Neurons in Hidden Layers:
The “Emergent” Power to Approximate

Universality: For any continuous function f(x), there exists a neural
network that closely approximates it for any input x in a bounded range
[62] https://deeplearning.mit.edu
Neural Networks are Parallelizable
[Figure: Steps 1-5 of the computation, followed by an animated version]

[273] https://deeplearning.mit.edu
Compute Hardware

• CPU – serial, general purpose, everyone has one

• GPU – parallelizable, still general purpose
• TPU – custom ASIC (Application-Specific Integrated Circuit) by
Google, specialized for machine learning, low precision

[273] https://deeplearning.mit.edu
Real-World Deep Learning
Autonomous and semi-autonomous cars
Real-World Deep Learning
ChatGPT
Neural Networks
Origins: Algorithms that try to mimic the brain.
Were very widely used in the 80s and early 90s; popularity
diminished in the late 90s.
Recent resurgence: State-of-the-art technique for many
applications
Neuron in the brain
“input wires”

“output wires”
Neurons in the brain

[Credit: US National Institutes of Health, National Institute on Aging]

DEEP LEARNING

Deep learning is a specific subfield of machine learning: a new take on

learning representations from data that puts an emphasis on learning
successive layers of increasingly meaningful representations. The deep
in deep learning isn’t a reference to any kind of deeper understanding
achieved by the approach; rather, it stands for this idea of successive layers
of representations. How many layers contribute to a model of the data is
called the depth of the model. Other appropriate names for the field could
have been layered representations learning and hierarchical
representations learning. Modern deep learning often involves tens or even
hundreds of successive layers of representations—and they’re all learned
automatically from exposure to training data. Meanwhile, other approaches
to machine learning tend to focus on learning only one or two layers of
representations of the data; hence, they’re sometimes called shallow
learning. In deep learning, these layered representations are (almost always)
learned via models called neural networks, structured in literal layers
stacked on top of each other.
AI-ML-DL RELATIONSHIP
GEOMETRIC INTERPRETATION OF DL
Neural networks consist entirely of chains of tensor
(generalized matrix) operations, and all of these tensor
operations are just geometric transformations of the
input data.
It follows that you can interpret a neural network as a very
complex geometric transformation in a high-dimensional
space, implemented via a long series of simple steps.
In 3D, the following mental image may prove useful. Imagine
two sheets of colored paper: one red and one blue. Put one on
top of the other. Now crumple them together into a small ball.
That crumpled paper ball is your input data, and each sheet of
paper is a class of data in a classification problem. What a
neural network (or any other machine-learning model) is meant
to do is figure out a transformation of the paper ball that would
uncrumple it, so as to make the two classes cleanly separable
again. With deep learning, this would be implemented as a
series of simple transformations of the 3D space, such as
those you could apply on the paper ball with your fingers, one
movement at a time.
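The idea of a network as a series of simple geometric transformations can be sketched in a few lines. This is an illustration only, assuming 2D points and a hypothetical two-layer "network" whose matrices (`rotate_90`, `scale_2`) are fixed by hand rather than learned; each layer is an affine map followed by an element-wise nonlinearity.

```python
def affine(point, matrix, offset):
    """Apply a linear map (2x2 matrix) followed by a translation to a 2D point."""
    x, y = point
    (a, b), (c, d) = matrix
    return (a * x + b * y + offset[0], c * x + d * y + offset[1])

def relu(point):
    """Element-wise nonlinearity: clip negative coordinates to zero."""
    return tuple(max(0.0, v) for v in point)

def layer(point, matrix, offset):
    """One layer = one simple geometric step: affine map, then nonlinearity."""
    return relu(affine(point, matrix, offset))

# Hypothetical hand-picked transformations (a trained network would learn them).
rotate_90 = ((0.0, -1.0), (1.0, 0.0))
scale_2 = ((2.0, 0.0), (0.0, 2.0))

def network(point):
    """Chain of two simple steps: rotate and shift, then scale."""
    hidden = layer(point, rotate_90, (1.0, 0.0))
    return layer(hidden, scale_2, (0.0, 0.0))

print(network((0.5, 0.25)))
```

Training amounts to finding matrices and offsets such that the chained transformation separates the classes, which is the "uncrumpling" described above.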
Deep Learning is Representation Learning
(aka Feature Learning)

[Figure: nested sets, from innermost to outermost: Deep Learning, Representation Learning, Machine Learning, Artificial Intelligence]

[20] https://deeplearning.mit.edu
Representation Matters

Task: Draw a line to separate the green triangles and blue circles.

[20] https://deeplearning.mit.edu
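The figure for this task is not reproduced here, so the following sketch makes an assumption about the data: one class lies on a circle of radius 1 and the other on a circle of radius 3. In Cartesian coordinates (x, y) no straight line separates them, but after changing the representation to polar coordinates a single threshold on the radius does. The data, the `to_polar` helper, and the threshold are all illustrative.

```python
import math

def to_polar(x, y):
    """Change of representation: Cartesian (x, y) to polar (radius, angle)."""
    return math.hypot(x, y), math.atan2(y, x)

# Hypothetical data: two classes arranged as concentric rings.
angles = [k * math.pi / 4 for k in range(8)]
inner = [(1.0 * math.cos(a), 1.0 * math.sin(a)) for a in angles]
outer = [(3.0 * math.cos(a), 3.0 * math.sin(a)) for a in angles]

# In the polar representation, the "line" is simply radius = 2.
threshold = 2.0
inner_ok = all(to_polar(x, y)[0] < threshold for x, y in inner)
outer_ok = all(to_polar(x, y)[0] > threshold for x, y in outer)
print(inner_ok, outer_ok)
```

Here the better representation is chosen by hand; representation learning aims to discover such a change of coordinates from the data.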
Deep Learning is Representation Learning
(aka Feature Learning)

Task: Draw a line to separate the blue curve and red curve

[146] https://deeplearning.mit.edu 2019
Representation Matters

Sun-Centered Model Earth-Centered Model

(Formalized by Copernicus in 16th century)

“History of science is the history of compression progress.”

- Jürgen Schmidhuber
[20] https://deeplearning.mit.edu 2019
Why Deep Learning? Scalable Machine Learning

[283, 284] https://deeplearning.mit.edu 2019
Gartner Hype Cycle

Deep Learning
Self-Driving Cars

© deeplearning.ai
Andrew Ng
Intuition about deep representation
[Figure: network with output ŷ]
Why Deep Learning and Why Now?
https://www.prowesscorp.com/whats-the-difference-between-artificial-intelligence-ai-machine-learning-and-deep-learning/
Machine Learning definition

Arthur Samuel (1959). Machine Learning: Field of study that gives

computers the ability to learn without being explicitly
programmed.
Machine Learning definition

Herbert Simon.
Learning is any process by which a system improves performance
from experience.
Machine Learning is concerned with computer
programs that automatically improve their
performance through experience.
Machine Learning definition

Tom Mitchell (1998). A computer program is said to learn from

experience E with respect to some task T and some performance
measure P, if its performance on T, as measured by P, improves
with experience E.
Example: Spam Filtering
Question.
Learning to detect credit card fraud.
What are T, P and E?

Task T: Assign label of fraud or not fraud to credit card transaction

Performance measure P: Accuracy of fraud classifier
Training experience E: Historical credit card transactions labeled as
fraud or not
Question.
Suppose we feed a learning algorithm a lot of historical weather
data, and have it learn to predict weather. What are E, P and T?
Key ML Terminology

Labels
A label is the thing we're predicting.
Features
A feature is an input variable.
Examples or samples
An example is a particular instance of data, x.
Labeled examples: A labeled example includes both feature(s) and the label.
Unlabeled examples: An unlabeled example contains features but not the label.
Models
A model defines the relationship between features and label.
Training means creating or learning the model.
Inference means applying the trained model to unlabeled examples.
Key types of Machine Learning problems

Supervised machine learning: Learn to predict target values from labeled data.

Classification (target values are discrete classes)

Regression (target values are continuous values)

Unsupervised machine learning: Find structure in unlabeled data.

Find groups of similar instances in the data (clustering)

Find unusual patterns (outlier detection)
Supervised learning vs. unsupervised learning
[Figure: example training sets for supervised and unsupervised learning]

Applications of clustering
• Organize computing clusters
• Social network analysis
• Market segmentation
• Astronomical data analysis

Image credit: NASA/JPL-Caltech/E. Churchwell (Univ. of Wisconsin, Madison)
Supervised or Unsupervised?

• Examine the statistics of two football teams and predict which team will
win tomorrow's match (given historical data of teams' wins/losses to learn
from).

• This can be addressed using supervised learning, in which we learn from
historical records to make win/loss predictions.
Supervised or Unsupervised?

Take a collection of 1000 customers of ASAN Pay, and find a way to

automatically group these customers into a small number of groups of
customers that are somehow "similar" or "related".

This is an unsupervised learning/clustering problem.

Housing price prediction.
[Figure: scatter plot of price (in $1000s, 0 to 400) against size (in feet², 0 to 2500)]

Supervised Learning: “right answers” given
Regression: Predict continuous valued output (price)
Classification or Regression?

The amount of rain that falls in a day is usually measured

in either millimeters (mm) or inches. Suppose you use a
learning algorithm to predict how much rain will fall
tomorrow. Would you treat this as a classification or a
regression problem?
Regression is appropriate when we are trying to predict a
continuous-valued output, such as the amount of rainfall
measured in inches or mm.
Classification or Regression?

Suppose you are working on weather prediction, and your

weather station makes one of three predictions for each
day's weather: Sunny, Cloudy or Rainy. You'd like to use a
learning algorithm to predict tomorrow's weather. Would
you treat this as a classification or a regression problem?
Classification is appropriate when we are trying to predict
one of a small number of discrete-valued outputs, such as
whether it is Sunny (which we might designate as class 0),
Cloudy (say class 1) or Rainy (class 2).
Regression vs Classification

[288] https://deeplearning.mit.edu
Multi-Class vs Multi-Label

[288] https://deeplearning.mit.edu
Types of Artificial Intelligence

Types of Artificial Intelligence:

➢ Narrow AI
➢ General AI
➢ Super AI
Types of Artificial Intelligence
Narrow AI:
➢ It is also known as weak AI
➢ Only narrowly defined special tasks can be
performed
➢ The machine has no thinking ability
➢ It performs a set of predetermined functions.
Types of Artificial Intelligence
General AI
• General AI is a type of intelligence that could perform
any intellectual task as efficiently as a human.
• In a March 2023 open letter citing potential risks to
society, Elon Musk and a group of artificial intelligence
experts and industry executives called for a six-month pause
in developing systems more powerful than OpenAI's then
newly launched GPT-4.
General AI
➢ Hawking cautioned against an extreme form of AI
➢ Thinking machines would “take-off” on their own,
modifying themselves and independently designing
and building ever more capable systems.
➢ Humans, bound by the slow pace of biological
evolution, would be tragically outwitted.
Super AI
➢ Super AI is a level of system intelligence
at which machines could surpass human
intelligence and perform any cognitive task
better than a human
➢ Currently, super AI does not exist
Artificial Intelligence Quiz

Check: Which of the following can AI do now?

Can play a millionaire game?
Can beat anyone at chess?
Can beat any person at Go?
Can play table tennis?
Can pick up a glass and put it in the closet?
Can fully replace a person in housework?
Can drive safely on the highway?
Can drive in 20 Yanvar?
Can do the weekly shopping?
Can fully translate from one language to another?
Can discover a mathematical theory?
Can perform heart surgery?
Can read a person’s brain?
Can determine whether given feedback is negative or positive?
Can write a funny story?
NLP and Its Applications
Speech to Text and Text to Speech
➢ Speech recognition
➢ Text-to-speech synthesis (TTS)

Natural Language Processing

➢ Question Answering Systems
➢ Chatbots
➢ Machine Translation
➢ Surfing on the Web
➢ Text classification
➢ Content categorization.

General Purpose NLP:

➢ GPT-3 OpenAI:
https://www.youtube.com/watch?v=r2dQgdktUJg
➢ Jukebox by OpenAI:
https://openai.com/blog/jukebox/
NLP and Its role in Digital Inclusion
Personal Assistants for Digital Inclusion
Computer Vision and Its Applications

Image Segmentation

https://phenaki.github.io/
Image to Text Image Generation
VisionEye
With object detection and voice delivery systems,
the project aims to help individuals with visual
impairments navigate easily and safely. Our
students promote an inclusive society where
everyone has access to the resources that they
need.
Computer Vision and Its role in Digital Inclusion
Autonomous Car Driving
• Path Planning
• Laser Terrain Mapping
• Learning from Human Drivers
• Adaptive Vision
[Figure labels: Sebastian, Stanley]
Face Recognition
pixels → edges → object parts (combination of edges) → object models
Image Segmentation
Deep Learning in One Slide

• What is it:
Extract useful patterns from data.
• How:
Neural network + optimization
• How (Practical):
Python + TensorFlow & friends
• Hard Part:
Good Questions + Good Data
• Why now:
Data, hardware, community, tools, investment
• Where do we stand?
Most big questions of intelligence have not been answered nor
properly formulated

Exciting progress:
• Face recognition
• Image classification
• Speech recognition
• Text-to-speech generation
• Handwriting transcription
• Machine translation
• Medical diagnosis
• Cars: drivable area, lane keeping
• Digital assistants
• Ads, search, social recommendations
• Game playing with deep RL
First Steps: Start Simple
Input Image: [image]
TensorFlow Model: Neural Network
Output: 5 (with 87% confidence)

Why Deep Learning? Real World Applications

Why Not Deep Learning? Unintended Consequences
[Figure: Human vs. AI (Deep RL Agent)]

Player gets reward based on:

1. Finishing time
2. Finishing position
3. Picking up “turbos”

[285] https://deeplearning.mit.edu
The Challenge of Deep Learning
• Ask the right question and know what the answer means:
image classification ≠ scene understanding

• Select, collect, and organize the right data to train on:

photos ≠ synthetic ≠ real-world video frames

Pure Perception is Hard

[66] https://deeplearning.mit.edu
Visual Understanding is Harder

Examples of what we can’t do well:

• Mirrors
• Sparse information
• 3D Structure
• Physics
• What’s on people’s minds?
• What happens next?
• Humor

[211] https://deeplearning.mit.edu
Deep Learning:
Our intuition about what’s “hard” is flawed (in complicated ways)

Visual perception: 540,000,000 years of data

Bipedal movement: 230,000,000 years of data
Abstract thought: 100,000 years of data

[Figure: Prediction: Dog + Distortion → Prediction: Ostrich]

“Encoded in the large, highly evolved sensory and motor portions of the human brain is a billion
years of experience about the nature of the world and how to survive in it.… Abstract thought,
though, is a new trick, perhaps less than 100 thousand years old. We have not yet mastered it. It
is not all that intrinsically difficult; it just seems so when we do it.”
- Hans Moravec, Mind Children (1988)
[6, 7, 11, 68] https://deeplearning.mit.edu
Measuring Progress: Einstein vs Savant

Max Tegmark’s rising sea visualization of

Hans Moravec’s landscape of human competence
[281] https://deeplearning.mit.edu
Special Purpose Intelligence:
Estimating Apartment Cost

[65] https://deeplearning.mit.edu
(Toward) General Purpose Intelligence:
Pong from Pixels
Policy Network:

• 80x80 image (difference image)

• 2 actions: up or down
• 200,000 Pong games

This is a step towards general purpose

artificial intelligence!
Andrej Karpathy. “Deep Reinforcement
Learning: Pong from Pixels.” 2016.

[63] https://deeplearning.mit.edu
Deep Learning from Human and Machine
“Teachers” → “Students”
Human → Supervised Learning
Human + Machine → Augmented Supervised Learning
Human + Machine → Semi-Supervised Learning
Human + Machine → Reinforcement Learning
Machine → Unsupervised Learning

https://deeplearning.mit.edu 2019
Data Augmentation
[Figure: one example image per augmentation]
• Crop
• Flip
• Scale
• Rotate
• Noise
• Translation

[294] https://deeplearning.mit.edu 2019
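Three of the augmentations listed above (flip, crop, translation) can be sketched on a tiny image stored as a list of pixel rows. The function names and the 3x3 example image are illustrative assumptions; real pipelines usually rely on an image library and apply the operations with random parameters.

```python
def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]

def crop(image, top, left, height, width):
    """Keep only a height x width window starting at (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]

def translate_right(image, shift, fill=0):
    """Shift pixels to the right, padding the vacated columns with `fill`."""
    return [[fill] * shift + row[:len(row) - shift] for row in image]

# Hypothetical 3x3 grayscale image.
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

print(flip_horizontal(image))
print(crop(image, top=0, left=1, height=2, width=2))
print(translate_right(image, shift=1))
```

Each transformed copy keeps the original label, so a single labeled image yields several training examples.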
The Challenge of Deep Learning:
Efficient Teaching + Efficient Learning
• Humans can learn from very few examples
• Machines (in most cases) need thousands/millions of examples

[291] https://deeplearning.mit.edu 2019
Deep Learning: Training and Testing

Training Stage:
Input Data → Learning System → Correct Output (aka “Ground Truth”)

Testing Stage:
New Input Data → Learning System → Best Guess

https://deeplearning.mit.edu
How Neural Networks Learn: Backpropagation

Forward Pass:
Input Data → Neural Network → Prediction

Backward Pass (aka Backpropagation):
Neural Network ← Measure of Error
Adjust to Reduce Error

https://deeplearning.mit.edu
What can we do with Deep Learning?

Input Data → Learning System → Correct Output

Both the input data and the correct output can be:
• Number
• Vector of numbers
• Sequence of numbers
• Sequence of vectors of numbers

Key Concepts: Activation Functions
Sigmoid
• Vanishing gradients
• Not zero centered

Tanh
• Vanishing gradients

ReLU
• Not zero centered

[148] https://deeplearning.mit.edu
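The three activation functions and their derivatives can be compared numerically. This is a minimal sketch using the standard definitions; the helper names are illustrative. It shows why sigmoid and tanh suffer from vanishing gradients (their derivatives approach zero for large inputs) while ReLU keeps a gradient of 1 for positive inputs.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def tanh_grad(z):
    return 1.0 - math.tanh(z) ** 2

def relu(z):
    return max(0.0, z)

def relu_grad(z):
    return 1.0 if z > 0 else 0.0

# Print value and derivative of each activation at a few sample inputs.
for z in (-10.0, 0.0, 10.0):
    print(z,
          sigmoid(z), sigmoid_grad(z),
          math.tanh(z), tanh_grad(z),
          relu(z), relu_grad(z))
```

Sigmoid outputs lie in (0, 1) and ReLU outputs in [0, ∞), so neither is zero centered; tanh outputs lie in (-1, 1) and are centered on zero.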
Loss Functions

• Loss function quantifies gap between

prediction and ground truth
• For regression:
• Mean Squared Error (MSE)
• For classification:
• Cross Entropy Loss

[Figure: Mean Squared Error compares each prediction with its ground truth value; Cross Entropy Loss compares the prediction for each class with a ground truth in {0,1}]

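The two loss functions named above can be sketched as follows, assuming their standard definitions: mean squared error averages the squared differences between predictions and ground truth, and cross entropy sums the negative log of the predicted probability over classes weighted by a one-hot ground truth. The function names, the small `eps` guard against log(0), and the sample numbers are assumptions for illustration.

```python
import math

def mean_squared_error(predictions, targets):
    """Regression loss: average squared gap between prediction and ground truth."""
    n = len(predictions)
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n

def cross_entropy(probabilities, one_hot_target, eps=1e-12):
    """Classification loss: negative log-probability assigned to the true class."""
    return -sum(t * math.log(p + eps)
                for p, t in zip(probabilities, one_hot_target))

# Hypothetical regression predictions and targets.
mse = mean_squared_error([2.5, 0.0, 2.0], [3.0, -0.5, 2.0])

# Hypothetical class probabilities; the true class is the first one.
ce = cross_entropy([0.7, 0.2, 0.1], [1, 0, 0])

print(mse, ce)
```

Both losses are zero for a perfect prediction and grow as the prediction moves away from the ground truth.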
Backpropagation

Task: Update the weights and biases to decrease loss function

Subtasks:
1. Forward pass to compute network output and “error”
2. Backward pass to compute gradients (numerical method: automatic differentiation)
3. A fraction of the weight’s gradient is subtracted from the weight. The fraction is the learning rate.

[63, 80, 100] https://deeplearning.mit.edu 2019
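The three subtasks can be sketched for the smallest possible model, a single linear neuron with a squared-error loss. This is an illustration under stated assumptions: the gradients are derived by hand with the chain rule, standing in for the automatic differentiation a framework would perform, and the function name `train_step`, the data point, and the learning rate are hypothetical.

```python
def train_step(w, b, x, target, learning_rate):
    # 1. Forward pass: compute the network output and the "error".
    prediction = w * x + b
    error = prediction - target
    loss = error ** 2

    # 2. Backward pass: gradients of the loss with respect to w and b
    #    (chain rule applied by hand: d(loss)/d(prediction) = 2 * error).
    grad_w = 2 * error * x
    grad_b = 2 * error

    # 3. Update: subtract a fraction (the learning rate) of each gradient.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b
    return w, b, loss

# Hypothetical single training example: input 2.0 should map to 4.0.
w, b = 0.0, 0.0
losses = []
for _ in range(50):
    w, b, loss = train_step(w, b, x=2.0, target=4.0, learning_rate=0.05)
    losses.append(loss)

print(losses[0], losses[-1])
```

In a multi-layer network the backward pass applies the same chain rule layer by layer, from the output back to the input.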
Learning is an Optimization Problem

Task: Update the weights and biases to decrease loss function

SGD: Stochastic Gradient Descent

References: [103] https://deeplearning.mit.edu 2019

Dying ReLUs
• If a neuron is initialized poorly, it might not fire for
the entire training dataset.
• Large parts of your network could be dead ReLUs!

Vanishing Gradients:
• Partial derivatives are small = Learning is slow

Hard to break symmetry
Vanilla SGD gets you there, but can be slow

References: [102, 104] https://deeplearning.mit.edu 2019

Mini-Batch Size

Mini-Batch size: Number of training instances the network

evaluates per weight update step.
• Larger batch size = more computational speed
• Smaller batch size = (empirically) better generalization

“Training with large minibatches is bad for your health. More importantly, it's
bad for your test error. Friends don’t let friends use minibatches larger than 32.”
- Yann LeCun
Revisiting Small Batch Training for Deep Neural Networks (2018)

[329] https://deeplearning.mit.edu 2019
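The role of the mini-batch size in stochastic gradient descent can be sketched on a one-parameter regression problem. This is a minimal illustration, assuming a model y = w * x with squared-error loss; the function name `minibatch_sgd`, the synthetic data (true w = 3), and the hyperparameters are all hypothetical.

```python
import random

def minibatch_sgd(data, batch_size, learning_rate, epochs, seed=0):
    """Fit y = w * x by SGD, updating w once per mini-batch."""
    rng = random.Random(seed)
    data = list(data)
    w = 0.0
    for _ in range(epochs):
        rng.shuffle(data)  # visit the examples in a new random order each epoch
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Average gradient of (w * x - y)^2 over the mini-batch.
            grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
            w -= learning_rate * grad
    return w

# Synthetic data generated from y = 3 * x (an assumption for this sketch).
data = [(k / 10.0, 3.0 * k / 10.0) for k in range(1, 21)]
w = minibatch_sgd(data, batch_size=4, learning_rate=0.1, epochs=50)
print(w)
```

With `batch_size=4` the 20 examples produce five weight updates per epoch; a batch size of 20 would produce one larger, less noisy update.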
Overfitting and Regularization

• Help the network generalize to data it hasn’t seen.

• Big problem for small datasets.
• Overfitting example (a sine curve vs 9-degree polynomial):

[24, 20, 140] https://deeplearning.mit.edu
Overfitting and Regularization

• Overfitting: The error decreases in the training set but

increases in the test set.

[24, 20, 140] https://deeplearning.mit.edu
Regularization: Early Stoppage

• Create “validation” set (subset of the training set).

• Validation set is assumed to be representative of the testing set.
• Early stoppage: Stop training (or at least save a checkpoint)
when performance on the validation set decreases

[20, 140] https://deeplearning.mit.edu
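Early stoppage can be sketched as a loop over per-epoch validation losses. This sketch adds one assumption not stated on the slide: a `patience` parameter, so that training stops only after the validation loss has failed to improve for several consecutive epochs. The function name and the list of losses are hypothetical.

```python
def early_stopping_epoch(validation_losses, patience=2):
    """Return the index of the best epoch seen before training is stopped."""
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(validation_losses):
        if loss < best_loss:
            # Validation performance improved: this is where a checkpoint is saved.
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # stop training
    return best_epoch

# Hypothetical validation losses, one per epoch.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.40]
best = early_stopping_epoch(val_losses)
print(best)
```

In this example training stops after epoch 5, so the later improvement at epoch 6 is never observed; a larger patience trades extra training time for a lower risk of stopping too soon.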
Regularization: Dropout

• Dropout: Randomly remove some nodes in the network (along

with incoming and outgoing edges)
• Notes:
• Usually p >= 0.5 (p is probability of keeping node)
• Input layers p should be much higher (and use noise instead of dropout)
• Most deep learning frameworks come with a dropout layer

[20, 140] https://deeplearning.mit.edu
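Dropout on a layer's activations can be sketched as follows, with p as the probability of keeping a node, as in the notes above. One assumption goes beyond the slide: the kept activations are divided by p ("inverted dropout"), a common convention that keeps the expected activation unchanged so nothing needs rescaling at test time. The function name and data are illustrative.

```python
import random

def dropout(activations, keep_prob, rng, training=True):
    """Randomly zero out activations; keep each one with probability keep_prob."""
    if not training:
        return list(activations)  # at test time every node is kept
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]

rng = random.Random(0)
hidden = [1.0] * 10000  # hypothetical hidden-layer activations
dropped = dropout(hidden, keep_prob=0.5, rng=rng)

kept_fraction = sum(1 for a in dropped if a != 0.0) / len(dropped)
print(kept_fraction)
```

A different random subset of nodes is removed on every training step, so the network cannot rely on any single node.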
Regularization: Weight Penalty (aka Weight Decay)

• L2 Penalty: Penalize squared weights. Result:

• Keeps weights small unless the error derivative is
very large.
• Prevents fitting sampling error.
• Smoother model (output changes more slowly as
the input changes).
• If the network has two similar inputs, it prefers to
put half the weight on each rather than all the
weight on one.

• L1 Penalty: Penalize absolute weights. Result:

• Allows a few weights to remain large.

[20, 140, 147] https://deeplearning.mit.edu 2019
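The difference between the two penalties can be checked on the "two similar inputs" case. This is a small sketch assuming the usual definitions (sum of squared weights for L2, sum of absolute weights for L1, each scaled by a strength factor); the function names and weight vectors are illustrative.

```python
def l2_penalty(weights, strength):
    """Penalize squared weights."""
    return strength * sum(w * w for w in weights)

def l1_penalty(weights, strength):
    """Penalize absolute weights."""
    return strength * sum(abs(w) for w in weights)

# Two ways to give a total weight of 1.0 to a pair of similar inputs.
split = [0.5, 0.5]          # half the weight on each input
concentrated = [1.0, 0.0]   # all the weight on one input

print(l2_penalty(split, 1.0), l2_penalty(concentrated, 1.0))
print(l1_penalty(split, 1.0), l1_penalty(concentrated, 1.0))
```

The L2 penalty is lower for the split weights, so it prefers to spread the weight; the L1 penalty is the same for both, so it does not discourage leaving one weight large and the other at zero.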
Normalization

• Network Input Normalization

• Example: Pixel to [0, 1] or [-1, 1] or according to mean and std.

• Batch Normalization (BatchNorm, BN)

• Normalize hidden layer inputs to mini-batch mean & variance
• Reduces impact of earlier layers on later layers

• Batch Renormalization (BatchRenorm, BR)

• Fixes difference b/w training and inference by keeping a moving
average asymptotically approaching a global normalization.

• Other options:
• Layer normalization (LN) – conceived for RNNs
• Instance normalization (IN) – conceived for Style Transfer
• Group normalization (GN) – conceived for CNNs

[289, 290] https://deeplearning.mit.edu 2019
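Batch normalization of one hidden-layer input can be sketched for a single feature across a mini-batch. This is a minimal illustration assuming the standard training-time formula: subtract the mini-batch mean, divide by the square root of the mini-batch variance plus a small `eps`, then apply a scale `gamma` and shift `beta` (learned in practice, fixed here). The function name and sample values are hypothetical.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature to the mean and variance of its mini-batch."""
    mean = sum(batch) / len(batch)
    variance = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [gamma * (x - mean) / math.sqrt(variance + eps) + beta
            for x in batch]

# Hypothetical values of one feature for a mini-batch of four examples.
normalized = batch_norm([2.0, 4.0, 6.0, 8.0])
print(normalized)
```

At inference time a framework replaces the mini-batch statistics with running averages collected during training, which is the train/inference difference that batch renormalization addresses.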
Neural Network Playground
http://playground.tensorflow.org

[154] https://deeplearning.mit.edu 2019
Convolutional Neural Networks: Image Classification

• Convolutional filters: take advantage of spatial invariance

[293] https://deeplearning.mit.edu 2019
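A convolutional filter can be sketched as a small kernel slid over the image, with the same weights reused at every position. The sketch assumes no padding and stride 1, and, like most deep learning frameworks, computes a cross-correlation (the kernel is not flipped); the function name `conv2d`, the 3x4 image, and the edge-detecting kernel are illustrative.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# Hypothetical image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Kernel that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, vertical_edge)
print(feature_map)
```

The filter responds identically wherever the edge appears along the column, which is the spatial invariance that lets one small set of weights cover the whole image.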
• AlexNet (2012): First CNN to win ImageNet (15.4% top-5 error)
• 8 layers
• 61 million parameters
• ZFNet (2013): 15.4% to 11.2%
• 8 layers
• More filters. Denser stride.
• VGGNet (2014): 11.2% to 7.3%
• Beautifully uniform:
3x3 conv, stride 1, pad 1, 2x2 max pool
• 16 layers
• 138 million parameters
• GoogLeNet (2014): 11.2% to 6.7%
• Inception modules
• 22 layers
• 5 million parameters
(throw away fully connected layers)
• ResNet (2015): 6.7% to 3.57%
• More layers = better performance
• 152 layers
• Human error (5.1%) surpassed in 2015
• CUImage (2016): 3.57% to 2.99%
• Ensemble of 6 models
• SENet (2017): 2.99% to 2.251%
• Squeeze and excitation block: network
is allowed to adaptively adjust the
weighting of each feature map in the
convolutional block.

References: [90] https://deeplearning.mit.edu 2019

Object Detection / Localization
Region-Based Methods | Shown: Faster R-CNN

[299] https://deeplearning.mit.edu
Object Detection / Localization
Single-Shot Methods | Shown: SSD

[299] https://deeplearning.mit.edu
Semantic Segmentation

[175] https://deeplearning.mit.edu
Transfer Learning

• Fine-tune a pre-trained model

• Effective in many applications: computer vision, audio, speech,
natural language processing

Autoencoders

• Unsupervised learning
• Gives embedding
• Typically better embeddings
come from discriminative task

http://projector.tensorflow.org/
[298] https://deeplearning.mit.edu 2019
Generative Adversarial Networks (GANs)

[302, 303, 304] https://deeplearning.mit.edu 2019
Word Embeddings (Word2Vec)

Skip Gram Model:

Word Vector

[297] https://deeplearning.mit.edu 2019
Recurrent Neural Networks

• Applications
• Sequence Data
• Text
• Speech
• Audio
• Video
• Generation

[299] https://deeplearning.mit.edu
Long-Term Dependency

• Short-term dependence:
Bob is eating an apple.
• Long-term dependence:
Bob likes apples. He is hungry and decided to
have a snack. So now he is eating an apple.

In theory, vanilla RNNs can handle arbitrarily
long-term dependence. In practice, it’s difficult.

[109] https://deeplearning.mit.edu
Long Short-Term Memory (LSTM) Networks: Pick
What to Forget and What to Remember

Conveyor belt for previous state and new data:

1. Decide what to forget (state)
2. Decide what to remember (state)
3. Decide what to output (if anything)

[109] https://deeplearning.mit.edu
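The three decisions above can be sketched as one step of an LSTM cell. To stay short, this sketch assumes scalar inputs and states (real cells use vectors and weight matrices) and hand-picked weights stored in a hypothetical `params` dictionary; the gate equations follow the standard LSTM formulation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One time step of a scalar LSTM cell; returns the new (h, c)."""
    def gate(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h_prev + b)

    forget = gate("forget", sigmoid)          # 1. what to forget from the old state
    remember = gate("input", sigmoid)         # 2. how much new information to store
    candidate = gate("candidate", math.tanh)  #    the new information itself
    output = gate("output", sigmoid)          # 3. what to output

    c = forget * c_prev + remember * candidate  # the "conveyor belt" state
    h = output * math.tanh(c)
    return h, c

# Hypothetical weights (w_x, w_h, bias), identical for every gate.
params = {name: (0.5, 0.5, 0.0)
          for name in ("forget", "input", "candidate", "output")}

h, c = 0.0, 0.0
for x in [1.0, 0.5, -1.0]:
    h, c = lstm_step(x, h, c, params)
print(h, c)
```

When the forget gate is close to 1 and the input gate close to 0, the cell state passes through a step almost unchanged, which is what lets information survive across long sequences.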
Bidirectional RNN

• Learn representations from both previous time

steps and future time steps

[109] https://deeplearning.mit.edu
Encoder-Decoder Architecture

The encoder RNN encodes the input sequence into a fixed-size vector,
which is then passed repeatedly to the decoder RNN.

Attention

Attention mechanism allows the network to refer back to the

input sequence, instead of forcing it to encode all information
into one fixed-length vector.
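Referring back to the input sequence can be sketched as a weighted average of the encoder states, with weights computed from the decoder's current query. The slide does not specify a scoring function, so this sketch assumes dot-product scoring followed by a softmax; the function names, the query, and the encoder states are hypothetical.

```python
import math

def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, encoder_states):
    """Score each encoder state against the query and return a weighted average."""
    scores = [sum(q * s for q, s in zip(query, state))
              for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * state[d] for w, state in zip(weights, encoder_states))
               for d in range(dim)]
    return weights, context

# Hypothetical encoder states, one vector per input position.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attend([0.0, 5.0], encoder_states)
print(weights, context)
```

The context vector is recomputed at every decoding step, so the decoder is not limited to a single fixed-length summary of the input.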
AutoML and Neural Architecture Search (NASNet)

[300, 301] https://deeplearning.mit.edu
Deep Reinforcement Learning

[306, 307] https://deeplearning.mit.edu
Toward Artificial General Intelligence
• Transfer Learning
• Hyperparameter Optimization
• Architecture Search
• Meta Learning

[286, 291] https://deeplearning.mit.edu 2019
Reading material:

Chapter 1 - Zhang, Aston; Lipton, Zachary; Li, Mu; Smola, Alexander (2023).
Dive into Deep Learning. Cambridge University Press.
https://d2l.ai/chapter_introduction/index.html

Chapter 1 - Deep Learning for Coders with Fastai and PyTorch: AI
Applications Without a PhD, 1st edition, 2020
https://colab.research.google.com/github/fastai/fastbook/blob/master/01_intro.ipynb

46 pages
Unit 2
No ratings yet
Unit 2
112 pages
CNN PPT Unit Iv
No ratings yet
CNN PPT Unit Iv
134 pages
Lecture 26-30 Unit 2
No ratings yet
Lecture 26-30 Unit 2
20 pages
2.neural Network
No ratings yet
2.neural Network
19 pages
RAG with math
No ratings yet
RAG with math
7 pages
Deep Learning: Prof:Naveen Ghorpade
No ratings yet
Deep Learning: Prof:Naveen Ghorpade
43 pages
02 ML Supervised Learning
No ratings yet
02 ML Supervised Learning
32 pages
ML Unit-Iv
No ratings yet
ML Unit-Iv
19 pages
Back Propagation
100% (1)
Back Propagation
27 pages
Artificial Neural Networks
No ratings yet
Artificial Neural Networks
25 pages
Deep Neural Network
No ratings yet
Deep Neural Network
12 pages
Artificial Neural Networks: Part 1/3
No ratings yet
Artificial Neural Networks: Part 1/3
25 pages
LSTM
No ratings yet
LSTM
42 pages
Regularization: Swetha V, Research Scholar
No ratings yet
Regularization: Swetha V, Research Scholar
32 pages
Deep Learning: - Course Code: - Unit 1
No ratings yet
Deep Learning: - Course Code: - Unit 1
21 pages
50 Most Important CNN Interview Questions
No ratings yet
50 Most Important CNN Interview Questions
18 pages
Deep Learning Cours
No ratings yet
Deep Learning Cours
165 pages
Gradient Descent
No ratings yet
Gradient Descent
15 pages
Notes On Backpropagation
No ratings yet
Notes On Backpropagation
14 pages
Advanced Information Retreival: Chapter 02: Modeling - Neural Network Model
No ratings yet
Advanced Information Retreival: Chapter 02: Modeling - Neural Network Model
31 pages
Neural
No ratings yet
Neural
35 pages
The Mostly Complete Chart of Neural Networks
100% (1)
The Mostly Complete Chart of Neural Networks
19 pages
Lab I TENSOR FLOW AND KERAS
No ratings yet
Lab I TENSOR FLOW AND KERAS
3 pages
Lesson 4 Gradient Descent
No ratings yet
Lesson 4 Gradient Descent
13 pages
Machine Learning Module-3
No ratings yet
Machine Learning Module-3
23 pages
L10 - Intro - To - Deep - Learning
No ratings yet
L10 - Intro - To - Deep - Learning
75 pages
Deep Learning and TensorFlow
No ratings yet
Deep Learning and TensorFlow
50 pages
Deep Learning With Tensorflow
No ratings yet
Deep Learning With Tensorflow
15 pages
Bidirectional RNN and RVNN
No ratings yet
Bidirectional RNN and RVNN
15 pages
Deep Learning Literature Review
100% (1)
Deep Learning Literature Review
8 pages
ML_LAB_Mannual-1
No ratings yet
ML_LAB_Mannual-1
79 pages
Computer Vision Unit 4
No ratings yet
Computer Vision Unit 4
186 pages
Ensemble Machine Learning With Python: 7-Day Mini-Course Jason Brownlee - The full ebook version is ready for instant download
100% (1)
Ensemble Machine Learning With Python: 7-Day Mini-Course Jason Brownlee - The full ebook version is ready for instant download
46 pages
A Practical Guide To Graph Neural Networks
No ratings yet
A Practical Guide To Graph Neural Networks
28 pages
Generative AI: - Lecture-1
100% (1)
Generative AI: - Lecture-1
21 pages
DLunit 4
No ratings yet
DLunit 4
16 pages
Artificial Intelligence in Mechanical Engineering: A Case Study On Vibration Analysis of Cracked Cantilever Beam
No ratings yet
Artificial Intelligence in Mechanical Engineering: A Case Study On Vibration Analysis of Cracked Cantilever Beam
4 pages
3 - ANN Part One PDF
No ratings yet
3 - ANN Part One PDF
30 pages
Gradient Descent Algorithms and Variations - PyImageSearch
No ratings yet
Gradient Descent Algorithms and Variations - PyImageSearch
21 pages
Hebbian Learning: Fundamentals and Applications for Uniting Memory and Learning
From Everand
Hebbian Learning: Fundamentals and Applications for Uniting Memory and Learning
Fouad Sabry
No ratings yet
Cambridge IGCSE: ACCOUNTING 0452/22
No ratings yet
Cambridge IGCSE: ACCOUNTING 0452/22
20 pages
Gcic NCP Seizure PICUOSMUN
100% (3)
Gcic NCP Seizure PICUOSMUN
2 pages
The Change Plan - w1
No ratings yet
The Change Plan - w1
2 pages
Deutz Allis Fl1011 Engine Operators Manual
No ratings yet
Deutz Allis Fl1011 Engine Operators Manual
7 pages
Full Listening Practice Test 3
No ratings yet
Full Listening Practice Test 3
6 pages
Explore The Ways Williams Portrays The Rise of A New Social Order
No ratings yet
Explore The Ways Williams Portrays The Rise of A New Social Order
2 pages
Self-Concept and Self-Awareness
No ratings yet
Self-Concept and Self-Awareness
4 pages
Neil Young
100% (2)
Neil Young
108 pages
inorganics-11-00070
No ratings yet
inorganics-11-00070
11 pages
Linear
No ratings yet
Linear
8 pages
What Is Tourism?
No ratings yet
What Is Tourism?
6 pages
Reflections On 'Abd Al-Ra'uf of Singkel (1615-1693)
No ratings yet
Reflections On 'Abd Al-Ra'uf of Singkel (1615-1693)
26 pages
Tausadi Mining Engineering Company Profile 2017
No ratings yet
Tausadi Mining Engineering Company Profile 2017
7 pages
A Note On The Availability of D'ailly's Writings On Astrology
No ratings yet
A Note On The Availability of D'ailly's Writings On Astrology
5 pages
Tối thứ 7
No ratings yet
Tối thứ 7
7 pages
Strategic Leadership: Managing The Strategy Process
100% (1)
Strategic Leadership: Managing The Strategy Process
50 pages
New Frontier College of Commerce Kohat
No ratings yet
New Frontier College of Commerce Kohat
34 pages
English Time
No ratings yet
English Time
4 pages
All Questions
No ratings yet
All Questions
5 pages
66baf511a969d
No ratings yet
66baf511a969d
1 page
IEEE Device Numbers
No ratings yet
IEEE Device Numbers
2 pages
Department of Computer Science and Engineering Internal Assessment Test-Iii
No ratings yet
Department of Computer Science and Engineering Internal Assessment Test-Iii
1 page
The Emirates Group
No ratings yet
The Emirates Group
18 pages
ISCC Self-Declaration For Points of Origin Generating Waste and Residues
No ratings yet
ISCC Self-Declaration For Points of Origin Generating Waste and Residues
1 page
The Literary Forms in Philippine Literature
No ratings yet
The Literary Forms in Philippine Literature
12 pages
Edible Cookie Dough - What Molly Made
No ratings yet
Edible Cookie Dough - What Molly Made
1 page
Missiology MAB
100% (1)
Missiology MAB
12 pages
1 Condidional Acceptance IRS 3176C Verdana
No ratings yet
1 Condidional Acceptance IRS 3176C Verdana
6 pages
Project 2 - Finlatics IBEP
No ratings yet
Project 2 - Finlatics IBEP
4 pages