ML
Discuss the given performance measures: MAE, MSE, RMSE, RSE, R2 score in
the context of Linear Regression. Justify the usability of each measure.
In the context of linear regression, which aims to model the relationship between
independent variables and a continuous dependent variable, various performance
measures are used to assess the quality of the model's predictions. Let's discuss
each of these measures and their usability:
Mean Absolute Error (MAE):
MAE measures the average absolute difference between the predicted values and
the actual values.
It provides a straightforward interpretation as it gives the average magnitude of
errors in the predictions.
Usability: MAE is useful when you want to understand the average error
magnitude without considering the direction of errors. It's less sensitive to outliers
compared to other measures like MSE.
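For reference, writing yᵢ for the actual values, ŷᵢ for the predictions, and n for the number of observations (notation reused in the formulas below): MAE = (1/n) Σ |yᵢ − ŷᵢ|.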
Mean Squared Error (MSE):
MSE measures the average squared difference between the predicted values and
the actual values.
Squaring the errors penalizes larger errors more heavily, making MSE sensitive to
outliers.
Usability: MSE is widely used as it's easy to compute and differentiable. However,
it's sensitive to outliers, which can skew the evaluation of the model's
performance.
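In the same notation: MSE = (1/n) Σ (yᵢ − ŷᵢ)².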
Root Mean Squared Error (RMSE):
RMSE is the square root of MSE, which brings the error metric back to the same
scale as the dependent variable.
It provides an interpretable measure of the average magnitude of errors, similar
to MAE but with sensitivity to outliers due to the squaring.
Usability: RMSE is useful for providing a measure of the typical deviation of the
predictions from the actual values, taking into account the scale of the dependent
variable. It's commonly used and easy to interpret.
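In the same notation: RMSE = √MSE = √[(1/n) Σ (yᵢ − ŷᵢ)²].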
Residual Standard Error (RSE):
RSE measures the standard deviation of the residuals (the differences between
observed and predicted values).
It's an absolute measure of lack of fit, providing a measure of the typical size of
the residuals.
Usability: RSE is useful for assessing the goodness of fit of the model. Lower values
indicate better fit, and it's particularly useful for comparing different models.
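The usual formula, for a model with p predictors, is RSE = √[RSS / (n − p − 1)], where RSS = Σ (yᵢ − ŷᵢ)² is the residual sum of squares; this degrees-of-freedom adjustment is what distinguishes RSE from RMSE.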
Coefficient of Determination (R-squared or R2 Score):
R2 score measures the proportion of the variance in the dependent variable that
is predictable from the independent variables.
It typically ranges from 0 to 1, where 1 indicates a perfect fit; it can be negative when the model fits worse than simply predicting the mean.
Usability: R2 score is widely used to evaluate the goodness of fit of the model. It
provides an intuitive interpretation of the proportion of variance explained by the
model. However, it can be misleading if used alone, especially when dealing with
complex models or multicollinearity.
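In the same notation: R² = 1 − RSS / TSS, where TSS = Σ (yᵢ − ȳ)² is the total sum of squares around the mean ȳ of the actual values.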
In summary, each of these performance measures has its own strengths and
weaknesses, and their usability depends on the specific context and requirements
of the problem at hand. It's often recommended to use a combination of these
measures to gain a comprehensive understanding of the model's performance.
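As a concrete illustration, the following sketch computes all five measures with NumPy on a small made-up set of actual and predicted values (y_true, y_pred, and the predictor count p are assumptions for the example, not from any particular dataset):

import numpy as np

# Toy values; in practice these come from a fitted linear regression model
y_true = np.array([3.0, 5.0, 7.5, 9.0, 11.0])
y_pred = np.array([2.8, 5.4, 7.0, 9.6, 10.5])
n = len(y_true)
p = 1  # number of predictors (assumed for this example)

residuals = y_true - y_pred
rss = np.sum(residuals ** 2)                 # residual sum of squares
tss = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares

mae = np.mean(np.abs(residuals))             # MAE
mse = np.mean(residuals ** 2)                # MSE
rmse = np.sqrt(mse)                          # RMSE
rse = np.sqrt(rss / (n - p - 1))             # RSE (degrees-of-freedom adjusted)
r2 = 1 - rss / tss                           # R2

print(f"MAE={mae:.3f} MSE={mse:.3f} RMSE={rmse:.3f} RSE={rse:.3f} R2={r2:.3f}")

The first three values and R2 can be cross-checked against scikit-learn's mean_absolute_error, mean_squared_error, and r2_score in sklearn.metrics.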
What is the need for feature scaling? Differentiate standard scaling and
min-max scaling.
Feature scaling is a crucial preprocessing step in many machine learning
algorithms, including linear regression, support vector machines (SVM), and
k-nearest neighbors (KNN). It involves transforming the features of the
dataset to a similar scale, which helps in improving the performance and
convergence of the machine learning algorithms. The need for feature
scaling arises due to the following reasons:
Different Scales: Features in a dataset often have different scales. For
example, one feature may range from 0 to 100 while another ranges from
1000 to 10000. These differences in scale can lead to biased or skewed
results in algorithms that rely on distance calculations or gradient descent
optimization.
Gradient Descent Optimization: Algorithms like linear regression and neural
networks use optimization techniques like gradient descent to minimize a
cost function. Features with larger scales can dominate the optimization
process, causing it to take longer to converge or to converge to suboptimal
solutions.
Distance-Based Algorithms: Algorithms like KNN and SVM use distance
metrics to make predictions. If features are not scaled, features with larger
scales will have a higher impact on the distance calculations, leading to
biased predictions.
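To make the distance point concrete, here is a minimal numeric sketch (all numbers are made up) showing how the larger-scale feature dominates a Euclidean distance until both features are rescaled:

import numpy as np

# Two samples: feature 1 on a 0-100 scale, feature 2 on a 1000-10000 scale
a = np.array([20.0, 3000.0])
b = np.array([80.0, 3100.0])

print(np.linalg.norm(a - b))  # ~116.6: feature 2's gap of 100 swamps feature 1's gap of 60

# After min-max scaling each feature to [0, 1] (ranges assumed for illustration)
a_scaled = np.array([20 / 100, (3000 - 1000) / 9000])
b_scaled = np.array([80 / 100, (3100 - 1000) / 9000])
print(np.linalg.norm(a_scaled - b_scaled))  # ~0.60: feature 1's gap now dominates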
Now, let's differentiate between two common methods of feature scaling:
Standard Scaling (Z-score normalization) and Min-Max Scaling.
Standard Scaling (Z-score normalization):
In standard scaling, each feature is scaled to have a mean of 0 and a
standard deviation of 1.
The formula for standard scaling is: z = (x − μ) / σ, where μ is the mean and σ is the standard deviation of the feature.
This method centers the data around 0 and scales it to have a standard
deviation of 1.
Standard scaling is useful when the distribution of the features is
approximately Gaussian (normal distribution).
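A minimal sketch using scikit-learn's StandardScaler on a made-up two-column feature matrix:

import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[10.0, 1000.0],
              [20.0, 3000.0],
              [30.0, 5000.0]])  # toy matrix; two columns on very different scales

scaler = StandardScaler()       # applies z = (x - mean) / std per column
X_std = scaler.fit_transform(X)
print(X_std.mean(axis=0))       # approximately [0, 0]
print(X_std.std(axis=0))        # approximately [1, 1]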
Min-Max Scaling:
In min-max scaling, each feature is scaled to a fixed range, usually between 0
and 1.
The formula for min-max scaling is: x_scaled = (x − min(x)) / (max(x) − min(x)).
This method scales the data to a specific range, making it suitable for
algorithms that require features to be on the same scale.
Min-max scaling preserves the shape of the original distribution, but it is
sensitive to outliers: a single extreme value stretches the min(x)-to-max(x)
range and compresses the rest of the data into a narrow band.
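The equivalent sketch with scikit-learn's MinMaxScaler (same made-up matrix as above):

import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[10.0, 1000.0],
              [20.0, 3000.0],
              [30.0, 5000.0]])

scaler = MinMaxScaler()        # maps each column to [0, 1] by default
X_mm = scaler.fit_transform(X)
print(X_mm.min(axis=0))        # [0, 0]
print(X_mm.max(axis=0))        # [1, 1]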
In summary, both standard scaling and min-max scaling are used to scale
features to a similar range, reducing the impact of feature scale differences
on the performance of machine learning algorithms. Standard scaling is
suitable when features are roughly normally distributed or when outliers are
present, while min-max scaling is useful when the algorithm expects features
in a fixed, bounded range. The choice between these methods depends on the
characteristics of the dataset and the requirements of the algorithm being
used.
Q…What is a Decision Tree? Briefly discuss the terminologies used in
Decision Trees.