
Model Performance Assessment

Model Performance

 Model evaluation is the process of using metrics to analyze how well a model performs.

 Model development is a multi-step process, so a check should be kept on how well the model generalizes to future predictions.

 Evaluating a model therefore plays a vital role, as it lets us judge the model's performance.

 Evaluation also helps to identify a model's key weaknesses.


Performance Metrics

 There are many metrics, such as Accuracy, Precision, Recall, F1 score, Area Under the Curve (AUC), the Confusion Matrix, and Mean Squared Error.

 Cross validation is a technique followed during the training phase, and it serves as a model evaluation technique as well.
Cross Validation and Holdout
Cross validation is a method in which we do not use the whole dataset for training. In this technique, part of the dataset is reserved for testing the model.

There are many types of cross validation, of which K-Fold Cross Validation is the most widely used.

In K-Fold Cross Validation the original dataset is divided into k subsets, known as folds.

The process is repeated k times: in each iteration, one fold is used for testing and the remaining k-1 folds are used for training the model. This technique is seen to generalize the model well and to reduce the error rate.
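
As a minimal sketch of K-Fold Cross Validation in Python with scikit-learn (the library choice and the synthetic dataset are assumptions; the slides do not name either):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical dataset: 200 samples, binary target
X, y = make_classification(n_samples=200, random_state=0)

model = LogisticRegression()

# cv=5 splits the data into 5 folds; each fold serves once as the
# test set while the remaining 4 folds train the model
scores = cross_val_score(model, X, y, cv=5)
print(scores)         # one accuracy score per fold
print(scores.mean())  # average score across the 5 folds

Averaging the per-fold scores gives a more stable estimate of generalization than a single train/test split.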
Holdout

 Holdout is the simplest approach. It is used with neural networks as well as with many other classifiers.

 In this technique, the dataset is divided into a training set and a test set.

 The dataset is usually divided in a ratio such as 70:30 or 80:20.

 Normally the larger portion of the data is used for training the model and the smaller portion is used for testing it.
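
A minimal sketch of a holdout split with scikit-learn (again an assumed library; the 80:20 ratio comes from the slide above):

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Hypothetical dataset: 200 samples, binary target
X, y = make_classification(n_samples=200, random_state=0)

# test_size=0.2 gives an 80:20 train/test split;
# random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

print(len(X_train), len(X_test))  # 160 40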
Accuracy

 Accuracy is defined as the ratio of the number of correct predictions to the total number of predictions. It is the most fundamental metric used to evaluate a model.

 The formula is given by:

Accuracy = (TP+TN)/(TP+TN+FP+FN)
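
As a worked example with hypothetical counts: a model with TP = 40, TN = 45, FP = 5, and FN = 10 makes 85 correct predictions out of 100 in total, so Accuracy = (40+45)/(40+45+5+10) = 0.85.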
Precision

 Precision is the ratio of true positives to the sum of true positives and false positives. It analyzes the quality of the positive predictions.

Precision = TP/(TP+FP)

 The drawback of precision is that it does not consider true negatives and false negatives.
Recall

 Recall is the ratio of true positives to the sum of true positives and false negatives. It analyzes how many of the actual positive samples the model identifies correctly.

Recall = TP/(TP+FN)

 The drawback of recall is that it ignores false positives, so optimizing for recall alone can lead to a higher false positive rate.
F1 score

 The F1 score is the harmonic mean of precision and recall. In the precision-recall trade-off, increasing precision tends to decrease recall and vice versa; the goal of the F1 score is to combine the two into a single metric.

F1 score = (2×Precision×Recall)/(Precision+Recall)
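
A minimal sketch computing precision, recall, and F1 with scikit-learn (the label vectors are hypothetical):

from sklearn.metrics import f1_score, precision_score, recall_score

# Hypothetical ground truth and predictions (1 = YES, 0 = NO);
# these give TP = 3, FP = 1, FN = 1, TN = 3
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

print(precision_score(y_true, y_pred))  # 3/(3+1) = 0.75
print(recall_score(y_true, y_pred))     # 3/(3+1) = 0.75
print(f1_score(y_true, y_pred))         # harmonic mean of the two = 0.75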


Confusion Matrix
A confusion matrix is an N x N matrix, where N is the number of target classes. It tabulates the actual outputs against the predicted outputs. Some terminology used in the matrix is as follows:

 True Positives (TP): outputs in which the actual and the predicted values are both YES.

 True Negatives (TN): outputs in which the actual and the predicted values are both NO.

 False Positives (FP): outputs in which the actual value is NO but the predicted value is YES.

 False Negatives (FN): outputs in which the actual value is YES but the predicted value is NO.
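
A minimal sketch of building a confusion matrix with scikit-learn (reusing the hypothetical labels from the previous example):

from sklearn.metrics import confusion_matrix

y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

# With labels ordered [0, 1], rows are actual classes and
# columns are predicted classes:
# [[TN FP]
#  [FN TP]]
print(confusion_matrix(y_true, y_pred))
# [[3 1]
#  [1 3]]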
Area Under the Curve (AUC) / The Receiver Operating Characteristic (ROC) Curve

 The Receiver Operating Characteristic (ROC) curve is built from the model's predicted probabilities and highlights the model's performance across classification thresholds. The curve has two parameters:

 TPR: True Positive Rate. It follows the same formula as recall: TPR = TP/(TP+FN).

 FPR: False Positive Rate. It is defined as the ratio of false positives to the sum of false positives and true negatives: FPR = FP/(FP+TN).

 This curve is useful because it helps us determine the model's capacity to distinguish between different classes.

 A model is considered good if its AUC score is greater than 0.5 and approaches 1. An AUC of 0.5 means the model does no better than random guessing, while an AUC near 0 means its predictions are consistently inverted.
AUC or ROC

 The Area Under the ROC Curve (AUC) measures how much better a machine learning model's classification predictions are than those of a random-guessing model.
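
A minimal sketch of computing the ROC curve and AUC with scikit-learn (the model and dataset are the same hypothetical setup as in the earlier sketches):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=200, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = LogisticRegression().fit(X_train, y_train)

# ROC and AUC need predicted probabilities, not hard class labels
probs = model.predict_proba(X_test)[:, 1]

fpr, tpr, thresholds = roc_curve(y_test, probs)  # points on the ROC curve
print(roc_auc_score(y_test, probs))              # 0.5 = random, 1.0 = perfect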
