ADS-EXP4

The document outlines various performance evaluation metrics for data models, including Accuracy, Error Rate, Precision, Sensitivity, Specificity, ROC Metric, F1 Score, and Geometric Mean. Each metric is defined, accompanied by its formula and evaluation focus, highlighting its importance in different scenarios, particularly in relation to balanced and imbalanced datasets. The conclusion emphasizes the importance of selecting the appropriate metric based on the specific problem domain and the costs associated with misclassification.

Class: BE-Computer; Semester: VIII

Subject: Applied Data Science Lab

Experiment No.: 04

Group-31

AIM: Implement and explore performance evaluation metrics for Data Models

Theory:

1. Accuracy (ACC)

 Formula: ACC = (TP + TN) / (TP + TN + FP + FN)

 Definition:
Accuracy measures the overall effectiveness of a classifier by calculating the proportion of
correctly classified instances (both positive and negative) out of the total number of
instances.

 Evaluation Focus:
Indicates how well the classifier performs across all classes but can be misleading if the
dataset is imbalanced.
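
As a minimal sketch (using made-up labels, not data from this experiment), accuracy can be computed from the confusion-matrix counts and cross-checked against scikit-learn's accuracy_score:

# Hypothetical ground-truth and predicted labels (illustrative only)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

from sklearn.metrics import confusion_matrix, accuracy_score

# For binary labels, ravel() returns the counts in the order TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# ACC = (TP + TN) / (TP + TN + FP + FN)
acc = (tp + tn) / (tp + tn + fp + fn)
print(acc, accuracy_score(y_true, y_pred))  # 0.75 0.75 for these labels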

2. Error Rate (ERR)

 Formula: ERR = (FP + FN) / (TP + TN + FP + FN)

 Definition:
Error rate represents the proportion of misclassified instances in the dataset. It is the
complement of accuracy.

 Evaluation Focus:
Used to understand the rate of incorrect predictions. A lower error rate is desirable.
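
A short sketch with the same hypothetical labels as above: the error rate follows directly from the confusion-matrix counts and equals one minus accuracy:

from sklearn.metrics import confusion_matrix, accuracy_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical labels, illustration only
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# ERR = (FP + FN) / (TP + TN + FP + FN), the complement of accuracy
err = (fp + fn) / (tp + tn + fp + fn)
print(err, 1 - accuracy_score(y_true, y_pred))  # 0.25 0.25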

3. Precision (PRC)

 Formula: PRC = TP / (TP + FP)

 Definition:
Precision (also called Positive Predictive Value) measures the proportion of true positive
predictions out of all predicted positives. It assesses how many of the predicted positive
cases are actually positive.

 Evaluation Focus:
Important in scenarios where false positives are costly, such as medical diagnosis or fraud
detection.
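
A minimal sketch, again with made-up labels: precision computed from the counts and checked against scikit-learn's precision_score:

from sklearn.metrics import confusion_matrix, precision_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical labels, illustration only
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# PRC = TP / (TP + FP): of everything predicted positive, how much truly is positive
prc = tp / (tp + fp)
print(prc, precision_score(y_true, y_pred))  # 0.75 0.75
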
4. Sensitivity (SNS) / Recall / True Positive Rate (TPR)

 Formula: SNS = TP / (TP + FN)

 Definition:
Sensitivity (also known as Recall or True Positive Rate) measures the ability of a classifier to
identify all actual positive cases.

 Evaluation Focus:
Critical in applications where false negatives are costly, such as detecting diseases or spam
filtering.
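
A minimal sketch: sensitivity (recall) from the confusion-matrix counts, checked against scikit-learn's recall_score (hypothetical labels again):

from sklearn.metrics import confusion_matrix, recall_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical labels, illustration only
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# SNS = TP / (TP + FN): of all actual positives, how many were found
sns = tp / (tp + fn)
print(sns, recall_score(y_true, y_pred))  # 0.75 0.75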

5. Specificity (SPC) / True Negative Rate (TNR)

 Formula: SPC = TN / (TN + FP)

 Definition:
Specificity (also known as True Negative Rate) measures how well the classifier identifies
actual negative cases.

 Evaluation Focus:
Useful in scenarios where distinguishing negative cases correctly is important, such as ruling
out non-fraudulent transactions.
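
scikit-learn has no dedicated specificity function, but specificity is simply recall of the negative class; a minimal sketch with the same hypothetical labels:

from sklearn.metrics import confusion_matrix, recall_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical labels, illustration only
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# SPC = TN / (TN + FP): of all actual negatives, how many were correctly rejected
spc = tn / (tn + fp)
print(spc, recall_score(y_true, y_pred, pos_label=0))  # 0.75 0.75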

6. Receiver Operating Characteristic (ROC) Metric

 Formula: ROC = √(SNS² + SPC²) / √2

 Definition:
The ROC metric combines sensitivity and specificity into a single measure to evaluate the
classifier's performance across different thresholds.

 Evaluation Focus:
Often used in ROC curves to compare classifiers and determine the optimal threshold for
decision-making.
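
Note that the formula given here is a single-number combination of sensitivity and specificity at one operating point; it is not the same as the area under the ROC curve (scikit-learn's roc_auc_score), which integrates performance over all thresholds. A minimal sketch of the formula as written, using the same hypothetical counts:

import math
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical labels, illustration only
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sns = tp / (tp + fn)  # sensitivity
spc = tn / (tn + fp)  # specificity

# ROC metric = sqrt(SNS^2 + SPC^2) / sqrt(2); reaches 1.0 only when SNS = SPC = 1
roc = math.sqrt(sns ** 2 + spc ** 2) / math.sqrt(2)
print(roc)  # 0.75 for these labels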

7. F1 Score

 Formula: F1 = 2 × (PRC × SNS) / (PRC + SNS)

 Definition:
The F1 Score is the harmonic mean of precision and sensitivity (recall). It provides a balanced
measure when there is an uneven class distribution.

 Evaluation Focus:
Useful when both false positives and false negatives need to be minimized. It is ideal for
imbalanced datasets.
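
A minimal sketch: the harmonic mean of precision and recall, cross-checked against scikit-learn's f1_score (same hypothetical labels):

from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical labels, illustration only
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

prc = precision_score(y_true, y_pred)
sns = recall_score(y_true, y_pred)

# F1 = 2 * PRC * SNS / (PRC + SNS), the harmonic mean of precision and recall
f1 = 2 * prc * sns / (prc + sns)
print(f1, f1_score(y_true, y_pred))  # 0.75 0.75
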
8. Geometric Mean (GM)

 Formula: GM = √(SNS × SPC)

 Definition:
Geometric Mean combines sensitivity and specificity into a single metric, particularly useful
for imbalanced datasets.

 Evaluation Focus:
Ensures that both positive and negative class predictions are balanced, making it effective
for datasets where one class is significantly larger than the other.
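
A minimal sketch: the geometric mean of sensitivity and specificity computed from the confusion-matrix counts (packages such as imbalanced-learn provide a ready-made geometric mean score, but plain math is enough here):

import math
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical labels, illustration only
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sns = tp / (tp + fn)  # sensitivity
spc = tn / (tn + fp)  # specificity

# GM = sqrt(SNS * SPC): high only when BOTH classes are predicted well
gm = math.sqrt(sns * spc)
print(gm)  # 0.75 for these labels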

Conclusion

Each metric serves a specific purpose in evaluating classifier performance. The choice of the right
metric depends on the problem domain, the cost of misclassification, and whether the dataset is
balanced or imbalanced. For example:

 Accuracy is useful for balanced datasets but misleading for imbalanced ones (see the sketch after this list).

 Precision is crucial when false positives are costly (e.g., fraud detection).

 Recall (Sensitivity) is vital when false negatives are costly (e.g., disease detection).

 F1 Score and Geometric Mean provide a balanced measure, especially for imbalanced
datasets.

 ROC helps compare classifiers across different thresholds.
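
To make the first points concrete, here is a small sketch on a hypothetical, heavily imbalanced label set (10 positives, 90 negatives, invented purely for illustration): a classifier that finds only 2 of the 10 positives still reports 92% accuracy, while recall, F1, and the geometric mean expose the weakness.

import math
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix

# Hypothetical imbalanced data: 10 positives, 90 negatives (illustration only)
y_true = [1] * 10 + [0] * 90
# A weak classifier that detects only 2 of the 10 positives and raises no false alarms
y_pred = [1] * 2 + [0] * 8 + [0] * 90

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
spc = tn / (tn + fp)                    # specificity = 1.00
sns = recall_score(y_true, y_pred)      # sensitivity = 0.20

print("Accuracy :", accuracy_score(y_true, y_pred))   # 0.92 -- looks good, but misleading
print("Precision:", precision_score(y_true, y_pred))  # 1.00
print("Recall   :", sns)                              # 0.20 -- most positives are missed
print("F1 Score :", f1_score(y_true, y_pred))         # ~0.33
print("G-Mean   :", math.sqrt(sns * spc))             # ~0.45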
