
11-Nonlinear Models (Neural Networks)

Linear models
In previous topics, we mainly dealt with linear models
• Regression: $h(\boldsymbol{x}) = \boldsymbol{w}^\top \boldsymbol{x} + b$

• Classification:
$h(\boldsymbol{x}) = \arg\max_i \, \boldsymbol{w}_i^\top \boldsymbol{x} + b_i$

Category $i$ is preferred to category $j$ iff


$\boldsymbol{w}_i^\top \boldsymbol{x} + b_i \ge \boldsymbol{w}_j^\top \boldsymbol{x} + b_j$, i.e., $(\boldsymbol{w}_i - \boldsymbol{w}_j)^\top \boldsymbol{x} + (b_i - b_j) \ge 0$
Binary classification with logistic regression is a special case of the above.

Nonlinear models
• Non-linear features $\boldsymbol{\phi}(\boldsymbol{x})$
○ E.g., Gaussian discriminant analysis with different covariance
matrices, in which case we obtain quadratic features of $\boldsymbol{x}$.
• Non-linear kernel $k(\boldsymbol{x}_i, \boldsymbol{x}_j)$
○ A kernel is the inner product of two data samples after they are
transformed into a certain vector space. The vector space could be very
high-dimensional (possibly with infinitely many dimensions). A linear
classifier in such a high-dimensional space can be non-linear in the
original low-dimensional space (see the sketch after this list).
• Learnable non-linear mapping
○ We can probably stack a few layers of learnable non-linear functions
(e.g., logistic functions) to learn the non-linear feature $\boldsymbol{\phi}(\boldsymbol{x})$ or a
non-linear kernel that is appropriate to the task at hand.
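To make the feature/kernel connection above concrete, here is a minimal NumPy sketch (an illustration, not from the notes): for 2-D inputs, the quadratic polynomial kernel $k(\boldsymbol{x}, \boldsymbol{z}) = (\boldsymbol{x}^\top \boldsymbol{z})^2$ equals the inner product of explicit quadratic features $\boldsymbol{\phi}(\boldsymbol{x}) = (x_1^2, \sqrt{2}\,x_1 x_2, x_2^2)$.

```python
import numpy as np

def phi(x):
    """Explicit quadratic feature map for a 2-D input x."""
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

def k(x, z):
    """Quadratic polynomial kernel: the inner product in feature space,
    computed without ever forming the feature vectors."""
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])
print(np.dot(phi(x), phi(z)))  # 1.0
print(k(x, z))                 # 1.0 -- same value, cheaper to compute
```

A linear classifier on $\boldsymbol{\phi}(\boldsymbol{x})$ draws quadratic decision boundaries in the original space.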

Motivation: XOR, a nonlinear classification problem


• The problem cannot be solved by logistic regression, because the four XOR points are not linearly separable


• What if we stack multiple logistic regression classifiers?

• The XOR problem is solvable by three linear classifiers


○ One built upon the other two
○ But this is programming [HW]
§ There is some machinery that allows you to specify certain things
§ Programming means you specify these things (usually
heuristically, by human intelligence) so that they can be fed as
input to the machinery
§ The machinery accomplishes a certain task according to your
input (program).
○ Programming is very tedious and only feasible for simple tasks (a
hand-programmed solution is sketched below)
○ We want to learn the weights.
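To illustrate the "programming" view above, here is a minimal NumPy sketch with hand-chosen weights (one choice among many; the notes do not specify values): the first unit computes OR, the second computes AND, and the third, built upon the other two, fires when OR holds but AND does not, which is exactly XOR.

```python
import numpy as np

def step(z):
    """Binary thresholding activation: 1 if z >= 0, else 0."""
    return (z >= 0).astype(float)

def xor(x):
    # Hidden unit 1: x1 OR x2   (fires when x1 + x2 - 0.5 >= 0)
    h1 = step(x @ np.array([1.0, 1.0]) - 0.5)
    # Hidden unit 2: x1 AND x2  (fires when x1 + x2 - 1.5 >= 0)
    h2 = step(x @ np.array([1.0, 1.0]) - 1.5)
    # Output unit, built upon the other two: OR but not AND
    return step(1.0 * h1 - 2.0 * h2 - 0.5)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
print(xor(X))  # [0. 1. 1. 0.]
```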

• Can we learn the weights?


○ Yes, still by gradient descent.

• Can we compute the gradient?


○ Yes, it's still a differentiable function

○ Again, brute-force computation of the gradient is very tedious.


○ We need a systematic way of
§ Defining a deep architecture, and
§ Computing its gradient.

Artificial Neural Network


• A perceptron [Rosenblatt, 1958]

$z = \boldsymbol{w}^\top \boldsymbol{x} + b$
$y = f(z)$

where $f$ is a binary thresholding function

• A perceptron-like neuron, unit, or node


$z = \boldsymbol{w}^\top \boldsymbol{x} + b$
$y = f(z)$

$f$ is an activation function, e.g., sigmoid, tanh, ReLU

○ Usually we use a nonlinear activation (a sketch of the common
choices follows below)

○ A linear activation may be used for regression
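For concreteness, a minimal NumPy sketch of the three activation functions named above (the standard textbook definitions):

```python
import numpy as np

def sigmoid(z):
    # Squashes z into (0, 1); saturates for large |z|.
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    # Squashes z into (-1, 1); zero-centered, also saturating.
    return np.tanh(z)

def relu(z):
    # max(0, z): non-saturating for z > 0, cheap to compute.
    return np.maximum(0.0, z)

z = np.linspace(-3, 3, 7)
print(sigmoid(z), tanh(z), relu(z), sep="\n")
```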

• A multi-layer neural network, or a multi-layer perceptron

A common structure is layer-wise fully connected

For each node $j$ at layer $L$:

$z_j^{(L)} = \sum_i w_{ij}^{(L)} y_i^{(L-1)} + b_j^{(L)}, \qquad y_j^{(L)} = f\big(z_j^{(L)}\big)$

To simplify notation, we omit the layer index $L$ and call the output of the
current layer $y$ and the input of the current layer $x$, which is the output
of the lower layer. In the simplified notation,

$z_j = \sum_i w_{ij} x_i + b_j, \qquad y_j = f(z_j)$

Since we have multiple layers, we need a recursive algorithm that
computes the activations of all nodes automatically.

Forward propagation (FP)

○ Initialization: set the bottom layer's outputs to the input features,
$\boldsymbol{y}^{(0)} = \boldsymbol{x}$

○ Recursion: for each layer $l = 1, \dots, L$ in order, compute
$z_j^{(l)} = \sum_i w_{ij}^{(l)} y_i^{(l-1)} + b_j^{(l)}$ and $y_j^{(l)} = f\big(z_j^{(l)}\big)$

○ Termination: the top layer's output $\boldsymbol{y}^{(L)}$ is the network's prediction
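A minimal NumPy sketch of the FP recursion (the layer sizes and the use of a sigmoid activation at every layer are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    """Forward propagation through a layer-wise fully connected network.

    weights[l] has shape (n_out, n_in); biases[l] has shape (n_out,).
    Returns the activations of every layer (needed later by BP).
    """
    ys = [x]                      # initialization: y^(0) = x
    for W, b in zip(weights, biases):
        z = W @ ys[-1] + b        # recursion: z = W y + b
        ys.append(sigmoid(z))     # y = f(z)
    return ys                     # termination: ys[-1] is the output

# Example: a 2-3-1 network with random parameters.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(3, 2)), rng.normal(size=(1, 3))]
biases = [np.zeros(3), np.zeros(1)]
print(forward(np.array([1.0, 0.0]), weights, biases)[-1])
```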

Gradient of multi-layer neural networks


Main idea: if we can compute the gradient for one layer, we may use the
chain rule to compute the gradient for all layers.

Recursion on what?

We consider a local layer: the recursion is on the gradient of the loss with
respect to each layer's output. Given that gradient for the current layer,
we can compute the gradients with respect to the layer's inputs and
parameters.

Backpropagation (BP)
○ Initialization: compute the gradient of the loss $E$ with respect to
the network's output, $\partial E / \partial y_j^{(L)}$

○ Recursion: propagate the gradient backward one layer at a time with
the chain rule, $\frac{\partial E}{\partial x_i} = \sum_j \frac{\partial E}{\partial y_j} f'(z_j)\, w_{ij}$, accumulating along the way
$\frac{\partial E}{\partial w_{ij}} = \frac{\partial E}{\partial y_j} f'(z_j)\, x_i$ and $\frac{\partial E}{\partial b_j} = \frac{\partial E}{\partial y_j} f'(z_j)$

○ Termination: upon reaching the bottom layer, we have the gradient
with respect to every weight and bias
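A minimal NumPy sketch of the BP recursion, written to pair with the forward() sketch above (the sigmoid activation, for which $f'(z) = y(1-y)$, is an illustrative assumption):

```python
import numpy as np

def backward(ys, weights, dE_dy_top):
    """Backpropagation through the layers visited by forward().

    ys: activations from the forward pass, ys[0] = input.
    dE_dy_top: gradient of the loss w.r.t. the network output
               (the initialization step).
    Returns gradients w.r.t. every weight matrix and bias vector.
    """
    grads_W, grads_b = [], []
    dE_dy = dE_dy_top
    for W, y_in, y_out in zip(weights[::-1], ys[-2::-1], ys[:0:-1]):
        # For sigmoid, f'(z) = y (1 - y), so delta = dE/dy * f'(z).
        delta = dE_dy * y_out * (1.0 - y_out)
        grads_W.append(np.outer(delta, y_in))  # dE/dW = delta x^T
        grads_b.append(delta)                  # dE/db = delta
        dE_dy = W.T @ delta                    # recursion: layer below
    return grads_W[::-1], grads_b[::-1]        # termination: all gradients

# Usage with the forward() sketch above, squared-error E = 0.5*(y - t)^2:
#   ys = forward(x, weights, biases)
#   gW, gb = backward(ys, weights, dE_dy_top=ys[-1] - t)
```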

• A few more thoughts

○ Non-layerwise connection: process the nodes in topological order
(topological sort)
○ Multiple losses: BP is a linear system, so the gradients from
multiple losses simply add
○ Tied weights: the total derivative is the summation over all
occurrences of the shared weight

Auto-differentiation in general

Numerical gradient checking
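Numerical gradient checking verifies a BP implementation by comparing its analytic gradient against a centered finite-difference estimate, $\partial E / \partial \theta_i \approx \big(E(\boldsymbol{\theta} + \epsilon \boldsymbol{e}_i) - E(\boldsymbol{\theta} - \epsilon \boldsymbol{e}_i)\big) / (2\epsilon)$. A minimal sketch (step size and tolerance are conventional choices, not from the notes):

```python
import numpy as np

def numerical_gradient(f, theta, eps=1e-5):
    """Centered finite-difference estimate of df/dtheta.

    f: scalar-valued loss function of a parameter vector theta.
    Perturbs one coordinate at a time: (f(t+eps) - f(t-eps)) / (2 eps).
    """
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = eps
        grad[i] = (f(theta + e) - f(theta - e)) / (2 * eps)
    return grad

# Example check on a function with a known gradient: f(t) = sum(t^2).
theta = np.array([1.0, -2.0, 0.5])
analytic = 2 * theta  # what an analytic (BP-style) gradient would report
numeric = numerical_gradient(lambda t: np.sum(t ** 2), theta)
assert np.allclose(analytic, numeric, atol=1e-6)
```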
