Report Logistic Regression
Introduction:
Logistic regression is used for classification and predictive analytics. It estimates the probability
of an event occurring (e.g. sleep or no sleep after eating) based on a data set of independent
variables. Because the outcome is a probability, the predicted value of the dependent variable
(e.g. the probability of sleep) is bounded between 0 and 1.
Log odds are difficult to interpret directly in a logistic regression analysis, so the exponent of
each beta estimate is taken to transform it into an odds ratio (OR), which is easier to interpret.
The OR compares the odds that an outcome will occur given a particular event with the odds of
the outcome occurring in the absence of that event. If the OR is greater than 1, the event makes
the outcome more likely; if the OR is less than 1, the event makes the outcome less likely.
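The transformation from a coefficient to an odds ratio can be sketched in a few lines of Python. The coefficient value below is hypothetical, chosen only to illustrate the calculation:

```python
import math

# Hypothetical logistic regression coefficient (log-odds) for one predictor
beta = 0.0885

# The odds ratio is the exponent of the coefficient
odds_ratio = math.exp(beta)

# OR > 1: a one-unit increase in the predictor makes the event more likely;
# OR < 1: it makes the event less likely.
print(round(odds_ratio, 4))
```

Here the odds ratio is slightly above 1, so each one-unit increase in this hypothetical predictor multiplies the odds of the event by about 1.09.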
Types of logistic regression:
There are three types of logistic regression models, which are defined based on the categorical
response: binary (two possible outcomes), multinomial (three or more unordered categories), and
ordinal (three or more ordered categories).
Churn prediction: For example, human resources and management teams may want to know if
there are high performers within the company who are at risk of leaving the organization; this
type of insight can prompt conversations to understand problem areas within the company,
such as culture or compensation. Alternatively, sales organizations may want to learn which
of their clients are at risk of taking their business elsewhere. This can prompt teams to set up a
retention strategy to avoid lost revenue.
This sets the values of validate to be randomly generated Bernoulli variates with
probability parameter 0.7.
You only intend to use validate with cases that could be used to create the model; that is,
previous customers. However, there are 150 cases corresponding to potential customers
in the data file.
8. To perform the computation only for previous customers, click If.
Figure 3. If Cases dialog box
9. Select Include if case satisfies condition.
10. Type MISSING(default) = 0 as the conditional expression.
This ensures that validate is only computed for cases with non-missing values for default;
that is, for customers who previously received loans.
Approximately 70 percent of the customers previously given loans will have a validate value of
1. These customers will be used to create the model. The remaining customers who were
previously given loans will be used to validate the model results.
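The steps above can be sketched outside SPSS as well. This is a minimal illustration with a tiny made-up default column (NaN marks potential customers with no loan history), not the actual data file: validate is drawn as a Bernoulli(0.7) variate, but only for cases where default is non-missing, mirroring the MISSING(default) = 0 condition:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Hypothetical data: default is NaN for potential customers (no loan
# history) and 0 or 1 for previous customers.
default = np.array([0, 1, 0, np.nan, 1, np.nan, 0, 1])

# validate = 1 with probability 0.7, computed only for previous customers
# (cases with a non-missing default value).
validate = np.where(np.isnan(default),
                    np.nan,
                    rng.binomial(1, 0.7, size=default.size))

print(validate)
```

Cases with validate = 1 (roughly 70 percent of previous customers) would build the model; the rest would validate it.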
12. Select Probabilities in the Predicted Values group, Cook's in the Influence group,
and Studentized in the Residuals group.
13. Click Continue.
14. Click Options in the Logistic Regression dialog box.Figure 5. Options dialog box
15. Select Classification plots and Hosmer-Lemeshow goodness-of-fit.
16. Click Continue.
17. Click OK in the Logistic Regression dialog box.
Model Diagnostics
After building a model, you need to determine whether it reasonably approximates the behavior
of your data.
Tests of Model Fit. The Binary Logistic Regression procedure reports the Hosmer-Lemeshow
goodness-of-fit statistic.
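The Hosmer-Lemeshow statistic groups cases into "deciles of risk" by predicted probability and compares observed to expected event counts with a chi-square test. A minimal sketch of that idea, run on simulated data (since the loan file is not reproduced here), might look like this:

```python
import numpy as np
from scipy import stats

def hosmer_lemeshow(y, p_hat, groups=10):
    """Hosmer-Lemeshow goodness-of-fit: chi-square over deciles of risk."""
    order = np.argsort(p_hat)
    y_s, p_s = y[order], p_hat[order]
    chi2 = 0.0
    for idx in np.array_split(np.arange(len(y_s)), groups):
        observed = y_s[idx].sum()      # observed events in the group
        expected = p_s[idx].sum()      # expected events in the group
        n_g = len(idx)
        mean_p = expected / n_g
        chi2 += (observed - expected) ** 2 / (n_g * mean_p * (1 - mean_p))
    # A large p-value means no evidence of lack of fit
    p_value = stats.chi2.sf(chi2, df=groups - 2)
    return chi2, p_value

# Sanity check on simulated data where the probabilities are correct
rng = np.random.default_rng(3)
x = rng.normal(size=1000)
true_p = 1 / (1 + np.exp(-(0.2 + 0.9 * x)))
y = rng.binomial(1, true_p)
chi2, p = hosmer_lemeshow(y, true_p)
print(round(chi2, 2), round(p, 3))
```

Because the simulated probabilities are well specified, the test should typically not reject; with real data, a small p-value would signal that the model fits poorly.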
Residual Plots. Using variables specified in the Save dialog box, you can construct various
diagnostic plots. Two helpful plots are the change in deviance versus predicted probabilities and
Cook's distances versus predicted probabilities.
The curve that extends from the lower left to the upper right corresponds to cases in
which the dependent variable has a value of 0. Thus, non-defaulters who have large
model-predicted probabilities of default are poorly fit by the model.
The curve that extends from the upper left to the lower right corresponds to cases in
which the dependent variable has a value of 1. Thus, defaulters who have small model-
predicted probabilities of default are poorly fit by the model.
By identifying the cases that are poorly fit by the model, you can focus on how those customers
are different, and hopefully discover another predictor that will improve the model.
Variable Selection
Figure 1. Variables not in the Equation, block 0
Forward stepwise methods start with a model that doesn't include any of the predictors.
At each step, the predictor with the largest score statistic whose significance value is less
than a specified value (by default 0.05) is added to the model.
Figure 2. Variables not in the Equation, block 1
The variables left out of the analysis at the last step all have significance values larger
than 0.05, so no more are added.
Figure 3. Model if Term Removed
The variables chosen by the forward stepwise method should all have significant changes
in -2 log-likelihood. The change in -2 log-likelihood is generally more reliable than the
Wald statistic. If the two disagree as to whether a predictor is useful to the model, trust
the change in -2 log-likelihood.
As a further check, you can build a model using backward stepwise methods. Backward
methods start with a model that includes all of the predictors. At each step, the predictor
that contributes the least is removed from the model, until all of the predictors in the
model are significant. If the two methods choose the same variables, you can be fairly
confident that it's a good model.
R-Squared Statistics
Figure 1. Model Summary
What constitutes a “good” R² value varies between different areas of application. While these
statistics can be suggestive on their own, they are most useful when comparing competing
models for the same data. The model with the largest R² statistic is “best” according to this
measure.
Summary
Using the Logistic Regression procedure, I have constructed a model for predicting the
probability that a given customer will default on their loan.
A critical issue for loan officers is the cost of Type I and Type II errors. That is, what is the cost
of classifying a defaulter as a non-defaulter (Type I)? What is the cost of classifying a non-
defaulter as a defaulter (Type II)? If bad debt is the primary concern, then you want to lower
your Type I error and maximize your sensitivity. If growing your customer base is the priority,
then you want to lower your Type II error and maximize your specificity. Usually both are major
concerns, so you have to choose a decision rule for classifying customers that gives the best mix
of sensitivity and specificity.
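The trade-off between the two error types comes down to the classification threshold. The sketch below uses a handful of hypothetical predicted probabilities (not output from the model above) and follows the text's convention that Type I is a missed defaulter and Type II a wrongly flagged non-defaulter:

```python
import numpy as np

# Hypothetical model-predicted default probabilities and true outcomes
p_hat = np.array([0.05, 0.20, 0.35, 0.55, 0.70, 0.90])
actual = np.array([0, 0, 1, 0, 1, 1])

def sens_spec(threshold):
    """Sensitivity and specificity when flagging cases with p >= threshold."""
    pred = (p_hat >= threshold).astype(int)
    tp = np.sum((pred == 1) & (actual == 1))
    tn = np.sum((pred == 0) & (actual == 0))
    fn = np.sum((pred == 0) & (actual == 1))  # Type I in the text's sense
    fp = np.sum((pred == 1) & (actual == 0))  # Type II in the text's sense
    sensitivity = tp / (tp + fn)  # defaulters correctly flagged
    specificity = tn / (tn + fp)  # non-defaulters correctly cleared
    return sensitivity, specificity

for t in (0.3, 0.5, 0.7):
    print(t, sens_spec(t))
```

Lowering the threshold catches more defaulters (higher sensitivity, less bad debt) at the cost of rejecting more good customers; raising it does the opposite. Choosing the decision rule amounts to picking the point on this trade-off that matches the lender's priorities.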