
Binary classification performance measures cheat sheet

Damien François, v1.0 - 2009 ([email protected])


Confusion matrix for two possible outcomes p (positive) and n (negative):

                        Actual
                        p                 n                 Total
  Predicted   p'        true positive     false positive    P'
              n'        false negative    true negative     N'
  Total                 P                 N                 total

True positive rate: proportion of actual positives which are predicted positive
TP / (TP + FN)

True negative rate: proportion of actual negatives which are predicted negative
TN / (TN + FP)

Youden's index: summarises sensitivity and specificity in a single value
sensitivity - (1 - specificity), i.e. sensitivity + specificity - 1
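
As an illustration (not part of the original sheet), a minimal Python sketch that tallies the four counts from two 0/1 label sequences and derives the rates above; the function name and toy data are ours:

```python
def confusion_counts(actual, predicted):
    """Tally TP, FP, FN, TN from two sequences of 0/1 labels (1 = positive)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    return tp, fp, fn, tn

tp, fp, fn, tn = confusion_counts([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0])
tpr = tp / (tp + fn)       # true positive rate (sensitivity)
tnr = tn / (tn + fp)       # true negative rate (specificity)
youden = tpr + tnr - 1     # Youden's index: sensitivity - (1 - specificity)
print(tpr, tnr, youden)
```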
Classification accuracy
(TP + TN) / (TP + TN + FP + FN)

Error rate
(FP + FN) / (TP + TN + FP + FN)

Paired criteria

Precision (or positive predictive value): proportion of predicted positives which are actual positives
TP / (TP + FP)

Recall: proportion of actual positives which are predicted positive
TP / (TP + FN)

Sensitivity: proportion of actual positives which are predicted positive
TP / (TP + FN)

Specificity: proportion of actual negatives which are predicted negative
TN / (TN + FP)

Positive likelihood: likelihood that a predicted positive is an actual positive
sensitivity / (1 - specificity)

Negative likelihood: likelihood that a predicted negative is an actual negative
specificity / (1 - sensitivity)

Combined criteria

F-measure: harmonic mean between precision and recall
2 · (precision · recall) / (precision + recall)

F_β measure: weighted harmonic mean between precision and recall
(1 + β²) · TP / ((1 + β²) · TP + β² · FN + FP)

Matthews correlation: correlation between the actual and predicted labels
(TP · TN - FP · FN) / ((TP + FP) (TP + FN) (TN + FP) (TN + FN))^(1/2)
comprised between -1 and 1

BCR: Balanced Classification Rate
½ · (TP / (TP + FN) + TN / (TN + FP))

BER: Balanced Error Rate, or HTER: Half Total Error Rate
1 - BCR

Discriminant power: normalised likelihood index
sqrt(3)/π · (log(sensitivity / (1 - specificity)) + log(specificity / (1 - sensitivity)))
<1 = poor, >3 = good, fair otherwise

Graphical tools

ROC curve (receiver operating characteristic curve): 2-D curve in the true positive rate / false positive rate space, parametrized by one parameter of the classification algorithm, e.g. some threshold.

AUC: the area under the ROC curve, comprised between 0 and 1 (both are sketched in code after the references).

(Cumulative) lift chart: plot of the true positive rate as a function of the proportion of the population being predicted positive, controlled by some classifier parameter (e.g. a threshold).

Relationships

sensitivity = recall = true positive rate
specificity = true negative rate
accuracy = 1 - error rate
F-measure = F1 measure (the F_β measure with β = 1)
BCR = ½ · (sensitivity + specificity)
Youden's index = 2 · BCR - 1

The harmonic mean between specificity and sensitivity is also often used and sometimes referred to as F-measure.

References

Sokolova, M. and Lapalme, G. 2009. A systematic analysis of performance measures for classification tasks. Information Processing & Management 45(4), 427-437.

Demšar, J. 2006. Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research 7, 1-30.
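
The scalar measures above translate directly into code; a minimal sketch (ours, not from the sheet) deriving them from the four confusion-matrix counts:

```python
import math

def classification_scores(tp, fp, fn, tn, beta=1.0):
    """Scalar measures above, derived from the four confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                # = sensitivity = true positive rate
    f_beta = ((1 + beta ** 2) * tp /
              ((1 + beta ** 2) * tp + beta ** 2 * fn + fp))
    mcc = ((tp * tn - fp * fn) /
           math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return {"precision": precision, "recall": recall,
            "F%g" % beta: f_beta, "MCC": mcc}

# beta = 1 gives the usual F1 measure (harmonic mean of precision and recall).
print(classification_scores(tp=40, fp=10, fn=5, tn=45))
```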

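For the graphical tools, the ROC curve can be traced by sweeping a threshold over classifier scores, and the AUC then follows from the trapezoidal rule; a self-contained sketch with made-up scores (all names ours):

```python
def roc_points(scores, labels):
    """One (FPR, TPR) point per threshold; assumes both classes are present."""
    pos = sum(labels)
    neg = len(labels) - pos
    pts = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        pts.append((fp / neg, tp / pos))
    return pts

def auc(points):
    """Area under the ROC curve by the trapezoidal rule; lies between 0 and 1."""
    pts = sorted(points)
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(pts, pts[1:]))

scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]   # classifier scores (made up)
labels = [1,   1,   0,   1,   0,   0]     # actual classes
print(auc(roc_points(scores, labels)))    # 8/9 ≈ 0.89 on this toy data
```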
Regression performance measures cheat sheet

Damien François, v0.9 - 2009 ([email protected])

Let (x_i, y_i), i = 1..n, be a set of input/output pairs and f a function (the model) such that ŷ_i = f(x_i) for each i; the residuals are e_i = y_i - ŷ_i.

Absolute error

MAD Mean Absolute Deviation
(1/n) · Σ_i |y_i - ŷ_i|

MAPE Mean Absolute Percentage Error
(100/n) · Σ_i |(y_i - ŷ_i) / y_i|

Squared error

SSE Sum of Squared Errors, or RSS Residual Sum of Squares
Σ_i (y_i - ŷ_i)²

MSE Mean Squared Error
(1/n) · Σ_i (y_i - ŷ_i)²

RMSE Root Mean Squared Error
sqrt(MSE)

NMSE Normalised Mean Squared Error
MSE / var(y), where var is the empirical variance in the sample.

Robust error measures

Median Squared Error
median of the squared residuals (y_i - ŷ_i)²

α-trimmed MSE
mean of the set of squared residuals where α percent of the largest values are discarded.

M-estimators
Σ_i ρ(y_i - ŷ_i), where ρ is a non-negative function with a minimum in 0, like the parabola, the Huber function, or the bisquare function.

Predicted error

PRESS Predicted REsidual Sum of Squares
Σ_i (y_i - ŷ_(i))², where ŷ_(i) is the prediction of y_i by a model built without the i-th pair. For a linear model it can be computed without refitting as Σ_i (e_i / (1 - h_ii))², where the h_ii are the diagonal entries of the hat matrix H = X (XᵀX)⁻¹ Xᵀ, X is a matrix built by stacking the x_i in rows, and Y is the vector of y_i (so that Ŷ = H · Y).

GCV Generalised Cross Validation
n · SSE / (n - tr(H))², with H the hat matrix defined above.
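
A compact sketch (ours) computing the absolute, squared, and robust measures above on plain Python lists; it assumes no y_i is zero (MAPE is undefined there):

```python
import math

def regression_errors(y, yhat, alpha=10):
    """Absolute, squared, and trimmed error measures on plain lists."""
    n = len(y)
    res2 = sorted((yi - fi) ** 2 for yi, fi in zip(y, yhat))
    mad = sum(abs(yi - fi) for yi, fi in zip(y, yhat)) / n
    mape = 100.0 / n * sum(abs((yi - fi) / yi) for yi, fi in zip(y, yhat))
    sse = sum(res2)
    mse = sse / n
    ybar = sum(y) / n
    var = sum((yi - ybar) ** 2 for yi in y) / n
    # alpha-trimmed MSE: drop the largest alpha percent of squared residuals
    # (with only a handful of points, nothing may actually be dropped).
    keep = res2[: n - int(alpha / 100 * n)]
    return {"MAD": mad, "MAPE": mape, "SSE": sse, "MSE": mse,
            "RMSE": math.sqrt(mse), "NMSE": mse / var,
            "trimmed MSE": sum(keep) / len(keep)}

print(regression_errors([3.1, 2.0, 5.4, 4.2], [3.0, 2.5, 5.0, 4.0]))
```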

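For the predicted-error measures, a sketch for an ordinary least-squares model, assuming NumPy is available; the synthetic data and variable names are ours:

```python
import numpy as np

# Ordinary least-squares fit on synthetic data (the data are made up).
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(30), rng.normal(size=30)])  # x_i in rows, plus intercept
y = 2.0 + 3.0 * X[:, 1] + rng.normal(scale=0.5, size=30)

H = X @ np.linalg.inv(X.T @ X) @ X.T    # hat matrix: yhat = H @ y
e = y - H @ y                           # ordinary residuals
h = np.diag(H)
press = np.sum((e / (1 - h)) ** 2)      # leave-one-out residuals without refitting
n = len(y)
gcv = n * np.sum(e ** 2) / (n - np.trace(H)) ** 2
print(press, gcv)
```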
R-squared
R² = 1 - MSE / var(y) = 1 - NMSE, where var is the empirical variance in the sample.

Information criteria

AIC Akaike Information Criterion
n · log(SSE/n) + 2k (up to an additive constant), where k is the number of parameters in the model.

BIC Bayesian Information Criterion
n · log(SSE/n) + k · log(n) (up to an additive constant), where k is the number of parameters in the model.
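
In code, with the least-squares forms above (a sketch; the example numbers are made up):

```python
import math

def aic_bic(sse, n, k):
    """Least-squares forms of AIC and BIC (up to additive constants)."""
    aic = n * math.log(sse / n) + 2 * k
    bic = n * math.log(sse / n) + k * math.log(n)
    return aic, bic

# Lower is better; BIC penalises parameters more heavily as soon as log(n) > 2.
print(aic_bic(sse=12.5, n=100, k=3))
print(aic_bic(sse=11.9, n=100, k=6))
```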

Graphical tool

Plot of predicted value against actual value. A perfect model places all dots on the diagonal.

Resampling methods

LOO Leave-one-out: build the model on n - 1 data elements and test it on the remaining one. Iterate n times to collect all test errors and compute the mean error.

X-Val Cross validation: randomly split the data in two parts, use the first one to build the model and the second one to test it. Iterate to get a distribution of the test error of the model.

K-Fold: cut the data into K parts. Build the model on the first K - 1 parts and test it on the K-th one. Iterate from 1 to K, rotating which part is held out, to get a distribution of the test error of the model.

Bootstrap: draw a random subsample of the data with replacement and build the model on it. Compute the difference between its error on the whole dataset and its training error, and iterate to get a distribution of such values. The mean of the distribution is the optimism. The bootstrap error estimate is the training error on the whole dataset plus the optimism.
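
A sketch of K-fold and the bootstrap optimism estimate; fit and mse are hypothetical callbacks (any model constructor and any error function), not something from the original sheet:

```python
import random

def kfold_errors(xs, ys, fit, mse, K=5, seed=0):
    """Distribution of the test error over K folds; fit(xs, ys) returns a predictor."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::K] for i in range(K)]
    errs = []
    for fold in folds:
        train = [i for i in idx if i not in fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        errs.append(mse([ys[i] for i in fold], [model(xs[i]) for i in fold]))
    return errs

def bootstrap_error(xs, ys, fit, mse, B=200, seed=0):
    """Training error on the whole dataset plus the mean optimism over B resamples."""
    rnd = random.Random(seed)
    n = len(xs)
    optimism = []
    for _ in range(B):
        idx = [rnd.randrange(n) for _ in range(n)]   # draw with replacement
        model = fit([xs[i] for i in idx], [ys[i] for i in idx])
        err_all = mse(ys, [model(x) for x in xs])    # error on the whole dataset
        err_train = mse([ys[i] for i in idx], [model(xs[i]) for i in idx])
        optimism.append(err_all - err_train)
    model = fit(xs, ys)
    return mse(ys, [model(x) for x in xs]) + sum(optimism) / B

# Toy usage: a "model" that always predicts the training mean.
mse = lambda y, yhat: sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y)
fit_mean = lambda xs, ys: (lambda x, m=sum(ys) / len(ys): m)
xs = list(range(20))
ys = [2.0 * x + random.gauss(0.0, 1.0) for x in xs]
print(kfold_errors(xs, ys, fit_mean, mse, K=4))
print(bootstrap_error(xs, ys, fit_mean, mse, B=100))
```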
