0% found this document useful (0 votes)

76 views

Chapter 3

This document discusses regression analysis techniques, including simple and multiple linear regression. It defines simple regression as using one independent variable to predict a dependent variable, while multiple regression uses two or more independent variables. The key assumptions of multiple regression are outlined, such as having a continuous dependent variable, independence of observations, linear relationships between variables, homoscedasticity, no multicollinearity, no outliers, and normally distributed residuals. Steps for conducting multiple regression are also summarized, including checking that the data meets these assumptions.

Uploaded by

feyisaabera19

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPT, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

76 views

Chapter 3

Uploaded by

feyisaabera19

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPT, PDF, TXT or read online on Scribd

You are on page 1/ 36

Regression Analysis

Multiple Regression
[ Cross-Sectional Data ]
Simple Regression
 A statistical model that utilizes one quantitative
independent variable “X” to predict the quantitative
dependent variable “Y.”
 i.e., considers the relation between a single explanatory
variable and response variable:
Multiple Regression
 A statistical model that utilizes two or more quantitative and
qualitative explanatory variables (x1,..., xp) to predict a
quantitative dependent variable Y.
Caution: have at least two or more quantitative
explanatory variables.
 Multiple regression simultaneously considers the influence of
multiple explanatory variables on a response variable Y:
Simple vs. Multiple
•  represents the unit change • i represents the unit change in
in Y per unit change in X . Y per unit change in Xi.
• Does not take into account
any other variable besides • Takes into account the effect of
single independent variable. other independent variable s.
• r2: proportion of variation in • R2: proportion of variation in Y
Y predictable from X. predictable by set of X’s
Goal
Develop a statistical model that can
predict the values of a dependent
(response)
response variable based upon the values
of the independent (explanatory)
explanatory variables.
Multiple Regression Models
M ultiple
Regression
M odels
Non-
Linear
Linear

Dummy Inter-
Linear action
Variable

Poly- Square
Log Reciprocal Exponential
Nomial Root
The Multiple Linear Regression Model building

Idea: Examine the linear relationship between 1 dependent (Y)

& 2 or more independent variables (Xi)

Multiple Regression Model with k Independent Variables:

Y-intercept Population slopes Random Error

Yi  β0  β1 X 1i  β2 X 2i    βk X ki  ε
• The coefficients of the multiple regression model
are estimated using sample data with k
independent variables
Estimated Estimated
(or predicted) Estimated slope coefficients
intercept
value of Y

Ŷi  b0  b1 X 1i  b2 X 2i    bk X ki
• Interpretation of the Slopes: (referred to as a Net
Regression Coefficient)
– b1=The change in the mean of Y per unit change in X1,
taking into account the effect of X2 (or net of X2)
– b0 Y intercept. It is the same as simple regression.
ASSUMPTIONS
• Linear regression model: The regression model is linear in the
parameters, though it may or may not be linear in variables.
• The X variable is independent of the error term. This means
that we require zero covariance between ui and each X variables.
cov.(ui , X1i) = cov(ui, X2i)=------- = cov(ui, Xki) = 0
• Zero mean value of disturbance ui. Given the value of Xi, the
mean, or the expected value of the random disturbance term ui is
zero.
E(ui)= 0 for each i
• Homoscedasticity or constant variance of ui . This implies that
the variance of the error term is the same, regardless of the value
of X.
var (ui) = σ2
• No auto-correlation between the disturbance terms.

cov ( ui, uj) = 0 i≠j

 This implies that the observations are sampled independently.

• the number of observations n must be greater than the number
of parameters to be estimated.
• There must be variation in the values of the X variables.
Technically, var(X) must be a positive number.
• No exact collinearity between the X variables. i.e.
 No multicollinearity: No exact linear relationship exists between
any of the explanatory variables.
Estimation of parameters and standard errors
The coefficient of determination and test of model adequacy
Are Individual Variables Significant?
• Use t-tests of individual variable slopes
• Shows if there is a linear relationship between the variable Xi
and Y; Hypotheses:
• H0: βi = 0 (no linear relationship)

• H1: βi ≠ 0 (linear relationship does exist between Xi and Y)

bi  0
• Test Statistic: tn  k  1 
Sb
i

• Confidence interval for the population slope βi

bi  tnk 1Sbi
Dummy independent Variables
Describing Qualitative Information
• In regression analysis the dependent variable can be
inﬂuenced by variables that are essentially qualitative in
nature,
such as sex, race, color, religion, nationality, geographical
region, political upheavals, and party affiliation.
• One way we could “quantify” such attributes is by
constructing artiﬁcial variables that take on values of 1 or 0,
 1 indicating the presence (or possession) of that attribute and 0
indicating the absence of that attribute.
• Variables that assume such 0 and 1 values are called dummy/
indicator/ binary/ categorical/ dichotomous variables.
Example 1:
where Y=annual salary of a college professor
Di  1 if male college professor

= 0 otherwise (i.e., female professor)

The Model may enable us to find out whether sex makes any
difference in a college professor’s salary, assuming, of course, that
all other variables such as age, degree attained, and years of
experience are held constant.
Mean salary of female college professor:
Mean salary of male college professor:
 tells by how much the mean salary of a male college professor
differs from the mean salary of his female counterpart.
A test of the null hypothesis that there is no sex discrimination
( Ho:  = 0) can be easily made and finding out whether on the basis
of the t test the estimated  is statistically significant.
Example 2:
Where: Xi = years of teaching experience
Mean salary of female college professor: E (Yi / X i , Di  0)   1  X i
Mean salary of male college professor: E (Yi / X i , Di  1)  (   2 )  X i

the male and female college professors’ salary functions in relation to

the years of teaching experience have the same slope () but different
intercepts.
 Male intercept = a1 +a2
 Female intercept = a1
 Difference = a2
Note: If a qualitative variable has ‘m’ categories, introduce only ‘m-1’
dummy variables.
The group, category, or classification that is assigned the value of 0 is
often referred to as the base, benchmark, control, comparison, reference,
or omitted category.
Example 3: qualitative variable with more than two classes
we want to regress the annual expenditure on health care by an
individual on the income and education of the individual.
Yi   1   2 D2i   3 D3i  X i  u i

Where Yi  annual expenditure on health care

X i  annual income

D2  1 if high school education

= 0 otherwise
D3  1 if college education

= 0 otherwise
“less than high school education” category as the base category.
Therefore, the intercept  will reflect the intercept for this
category.
• the mean health care expenditure functions for the three
levels of education, namely, less than high school, high
school, and college:
E (Yi | D2  0, D3  0, X i )   1  X i

E (Yi | D2  1, D3  0, X i )  ( 1   2 )  X i

E (Yi | D2  0, D3  1, X i )  ( 1   3 )  X i
Assumptions and Procedures to Conduct Multiple
Linear Regression
When you choose to analyse your data using multiple
regression, part of the process involves checking to make sure
that the data you want to analyse can actually be analysed using
multiple regression.
 You need to do this because it is only appropriate to use
multiple regression if your data "passes" eight assumptions that
are required for multiple regression to give you a valid result.
 In practice, checking for these eight assumptions just adds a
little bit more time to your analysis but it is not a difficult task.
 let's take a look at these eight assumptions:
Assumption #1:
 Your dependent variable should be measured on a continuous
scale .
Assumption #2:
 You should have two or more independent variables, which can
be either continuous or categorical or dummy.
Assumption #3:
 You should have independence of observations (i.e.,
independence of residuals), which you can easily check using the
Durbin-Watson statistic.
Assumption #4:
There needs to be a linear relationship between
 (a) the DV and each of your independent variables, and
 (b) the DV and the independent variables collectively.
Assumption #5:
 Your data needs to show homoscedasticity, which is where the
variances along the line of best fit remain similar as you move
along the line.
Assumption #6:
 Your data must not show multicollinearity, which occurs when
you have two or more independent variables that are highly
correlated with each other.
 This leads to problems with understanding which IV contributes
to the variance explained in the dependent variable
Assumption #7:
There should be no significant outliers, high leverage points or
highly influential points.
This can change the output that any Statistics produces and
reduce the predictive accuracy of your results as well as the
statistical significance.
Assumption #8:
Finally, you need to check that the residuals (errors) are
approximately normally distributed.
Two common methods to check this assumption include using:
(a) a histogram and a Normal P-P Plot; or (b) a Normal Q-Q Plot of
the studentized residuals.
You can check assumptions #3, #4, #5, #6, #7 and #8 using
SPSS Statistics.
Assumptions #1 and #2 should be checked first, before
moving onto assumptions #3, #4, #5, #6, #7 and #8.
 Just remember that if you do not run the statistical tests on
these assumptions correctly, the results you get when running
multiple regression might not be valid.
This is why we are concerned on the assumptions and
procedures of multiple regressions to help you get this right.
Given the assumptions and data on a dependent variable Y
and set of potential explanatory variables (X1,.., XK ) , the
following are a suggested procedures/steps to conduct multiple
linear regression:
1.Select variables that you believe are linearly related to the
dependent variable.
2.Use a computer and software to generate the coefficients and
the statistics used to assess the model.
3.Diagnose violations of required conditions/ assumptions.
 If there are problems, attempt to remedy them.
4. Assess the model’s fit.
 Three statistics that perform this function are
 the standard error of estimate,
 the coefficient of determination, and
 the F-test of the analysis of variance.
5. If we are satisfied with the model’s fit and that the required
conditions are met, we can interpret the coefficients and test
them.
6. We use the model to predict a value of the dependent variable
or estimate the expected value of the dependent variable.
Regression Output Interpretation
Example
 In a study of consumer demand (Qd), multiple regression analysis
is done to examine the relationship between quantity demanded and
four potential predictors.
The four independent variables are: price, income, tax and Price of
related goods.
The output for this example is interpreted as follows:
 Model Fit
Model Summary

Model R R Square Adjusted R Square Std. Error of the Estimate

1 .991a .982 .978 .334
a. Predictors: (Constant), price of related goods, income of the consumer, commodity tax, price of the
product
The multiple correlation coefficient is 0.991.
indicates that the correlation among the ID and DV is positive.
This statistic, which ranges from -1 to +1, does not indicate
statistical significance of this correlation.
The R2 is 0.982.
This means that the independent variables explain 98.2% of the
variation in the dependent variable.
The adjusted R-square, a measure of explanatory power, is 0.978.
 This statistic is not generally interpreted because it is neither a
percentage (like the R2), nor a test of significance (the F-statistic).
The standard error of the regression is 0.334, which is an estimate
of the variation of the observed quantity supplied, in quantity terms,
about the regression line.
ANOVAb
Model Sum of Squares(SS) df Mean Square F Sig.
1 Regression 93.124 4 23.281 208.359 .000a
Residual 1.676 15 .112
Total 94.800 19

a. Predictors: (Constant), price of related goods, income of the consumer, commodity tax, price
of the product

b. Dependent Variable: quantity demanded

The p value for the F statistic is < .01.

 This means that at least one of the independent variables is a
significant predictor of the dependent variable (quantity demanded).
This indicates rejection of the null hypothesis.
Interpreting Parameter Values (Model Coefficients)
The results of the estimated regression line include the estimated
coefficients, the standard error of the coefficients, the calculated t-
statistic, the corresponding p-value, and the bounds of both the 95%
and the 90% confidence intervals.
Coefficientsa
Model Unstandardized Standardized t Sig. 95% Confidence
Coefficients Coefficients Interval for B
B Std. Beta Lower Upper
Error Bound Bound
Constant 7.309 1.290 5.665 .000 4.559 10.060
price -.471** .167 -.361 -2.825 .013 -.827 -.116
income .027** .009 .275 2.844 .012 .007 .046
tax -.615* .159 -.375 -3.856 .002 -.954 -.275
price of related .013 .095 .010 .138 .892 -.189 .215
Finally, the above table will help us determine whether quantity
demanded and explanatory variables are significantly related, and
the direction and strength of their relationship.
 The prediction equation is based on the unstandardized coefficients
and written as:
Qd = 7.3 – 0. 47 price + 0.03 income – 0.62 tax
Results of the multiple linear regression model showed that out of
the 4 explanatory variables that were entered to the model, 3 of
them, namely price of the product, income of the consumer and
commodity tax were found to be statistically significant.
 Results of the statistically significant variables are discussed as
follows:
The Constant is the predicted value of quantity demanded when all
of the independent variables have a value of zero.
 The b coefficient associated with price (-0. 47) is negative,
indicating an inverse relationship in which higher price of the
product is associated with lower quantity demanded.
 For the independent variable price, the probability of the t
statistic (-2.825) for the b coefficient is less than the level of
significance of 0.05.
 We reject the null hypothesis that the slope associated with
price is equal to zero and conclude that there is a statistically
significant relationship between price and quantity demanded.
 A unit increase/decrease in the price of the product leads to a
0. 47 decrease/increase in quantity demanded, ceteris paribus.
 The income variable is found to be positively and
significantly (at 5% probability level) related to quantity
demanded.
 The result confirmed that holding other independent variables
constant, for every one unit increase/decrease in income,
quantity demanded will increase/decrease by 0.03 units.
 Tax coefficient is highly significant (at 1% probability level)
and carries negative sign.
 The slope of tax is -0.62. This means that for every one unit
increase/decrease in tax on a commodity, quantity demanded
will decrease/increase by -0.62 units, ceteris paribus.

Introduction To Econometrics, Tutorial
No ratings yet
Introduction To Econometrics, Tutorial
10 pages
Microeconomics Quiz
No ratings yet
Microeconomics Quiz
5 pages
Problem Set 2 (Econtwo)
No ratings yet
Problem Set 2 (Econtwo)
1 page
Lab Solutions
0% (1)
Lab Solutions
28 pages
Chapter 7 PDF
No ratings yet
Chapter 7 PDF
17 pages
1macroeconomics Under Graduate Course Chapters 123
No ratings yet
1macroeconomics Under Graduate Course Chapters 123
159 pages
Agricultural Economics Chapter 3
No ratings yet
Agricultural Economics Chapter 3
46 pages
Chapter 1 The Nature of Econometrics and Economic Data
No ratings yet
Chapter 1 The Nature of Econometrics and Economic Data
8 pages
Econometrics II Chapter One
No ratings yet
Econometrics II Chapter One
71 pages
Econometrics II
100% (1)
Econometrics II
4 pages
Quantitative Methods for Economic Analysis 1 Solved MCQs [set-7] McqMate.com
100% (1)
Quantitative Methods for Economic Analysis 1 Solved MCQs [set-7] McqMate.com
5 pages
Test Bank 2
No ratings yet
Test Bank 2
6 pages
MCQ
No ratings yet
MCQ
34 pages
Marginal Utility Quiz
No ratings yet
Marginal Utility Quiz
1 page
Practice Multiple Choice1
No ratings yet
Practice Multiple Choice1
11 pages
Econ 3010 Final Exam Multiple Choice (100 Points)
No ratings yet
Econ 3010 Final Exam Multiple Choice (100 Points)
9 pages
SEM 4 - 10 - BA-BSc - HONS - ECONOMICS - CC-10 - INTRODUCTORYECONOMETRI C - 10957
No ratings yet
SEM 4 - 10 - BA-BSc - HONS - ECONOMICS - CC-10 - INTRODUCTORYECONOMETRI C - 10957
3 pages
Multiple Linear Regression: y BX BX BX
No ratings yet
Multiple Linear Regression: y BX BX BX
14 pages
Exercises and Applications For Microeconomic Analysis PDF
0% (2)
Exercises and Applications For Microeconomic Analysis PDF
2 pages
CH 14 - Wage Determination - Practice Test W Ans and Diff Level
No ratings yet
CH 14 - Wage Determination - Practice Test W Ans and Diff Level
12 pages
Panel Data Regression Models
100% (1)
Panel Data Regression Models
25 pages
Pad118 Past Question
No ratings yet
Pad118 Past Question
8 pages
Worksheet Econometrics i
No ratings yet
Worksheet Econometrics i
7 pages
Exam Questions
No ratings yet
Exam Questions
3 pages
WARP&SARP (2) Notes PDF
No ratings yet
WARP&SARP (2) Notes PDF
39 pages
Heteroscedasticity: What Heteroscedasticity Is. Recall That OLS Makes The Assumption That
No ratings yet
Heteroscedasticity: What Heteroscedasticity Is. Recall That OLS Makes The Assumption That
20 pages
Chapter One
No ratings yet
Chapter One
5 pages
Supply and Demand
No ratings yet
Supply and Demand
3 pages
Assignment: Intermediate Microeconomics-II: Attempt Any 4 Questions. Each Question Carries 5 Marks
100% (1)
Assignment: Intermediate Microeconomics-II: Attempt Any 4 Questions. Each Question Carries 5 Marks
2 pages
Inflation: "Inflation Is Always and Everywhere A Monetary Phenomenon"
No ratings yet
Inflation: "Inflation Is Always and Everywhere A Monetary Phenomenon"
18 pages
Econometrics Chapter One
No ratings yet
Econometrics Chapter One
11 pages
Introduction To Introduction To Econometrics Econometrics Econometrics Econometrics (ECON 352) (ECON 352)
100% (2)
Introduction To Introduction To Econometrics Econometrics Econometrics Econometrics (ECON 352) (ECON 352)
12 pages
Webinar 1 Microeconomics Essentials 16 March 2024 (2)
No ratings yet
Webinar 1 Microeconomics Essentials 16 March 2024 (2)
33 pages
Hailemaram Dadi
100% (1)
Hailemaram Dadi
36 pages
Intermediate Microeconomics Syllabus
No ratings yet
Intermediate Microeconomics Syllabus
6 pages
92-Worksheet - Econometrics II
100% (1)
92-Worksheet - Econometrics II
4 pages
Chapt 2 MIC
No ratings yet
Chapt 2 MIC
13 pages
Managerial Economics Solution 1
No ratings yet
Managerial Economics Solution 1
10 pages
Model Final Exam - Principles of Macroeconomics
No ratings yet
Model Final Exam - Principles of Macroeconomics
10 pages
Chapter 23 Quiz
No ratings yet
Chapter 23 Quiz
15 pages
2010 AAU Entrance Exam - MSC in Economics
No ratings yet
2010 AAU Entrance Exam - MSC in Economics
4 pages
Introduction To Macroeconomics: Notes and Summary of Readings
No ratings yet
Introduction To Macroeconomics: Notes and Summary of Readings
27 pages
Jnu (SSS) 2019
No ratings yet
Jnu (SSS) 2019
14 pages
Chapter Two Introduction To The Theory of Consumer Behavior
100% (2)
Chapter Two Introduction To The Theory of Consumer Behavior
169 pages
Macro Aau
No ratings yet
Macro Aau
49 pages
Microeconomics - Econ - 101 PDF
No ratings yet
Microeconomics - Econ - 101 PDF
164 pages
Docx
No ratings yet
Docx
50 pages
Econometrics Chapter # 0: Introduction
No ratings yet
Econometrics Chapter # 0: Introduction
19 pages
Econometric S
No ratings yet
Econometric S
26 pages
QNT 565 Final Exam Question Answers
No ratings yet
QNT 565 Final Exam Question Answers
9 pages
Chapter 3
100% (1)
Chapter 3
28 pages
Caiib Macmillan Ebook Retail Banking
100% (1)
Caiib Macmillan Ebook Retail Banking
41 pages
Econometrics QP Calicut
No ratings yet
Econometrics QP Calicut
17 pages
Introduction To Econometrics
No ratings yet
Introduction To Econometrics
90 pages
CH 5 Time Series
No ratings yet
CH 5 Time Series
46 pages
Managerial Economics Ch3
No ratings yet
Managerial Economics Ch3
67 pages
LT 2 Econometrics
No ratings yet
LT 2 Econometrics
94 pages
MCQs On Supply Curve Concept
No ratings yet
MCQs On Supply Curve Concept
3 pages
Micro-Economics (Cmi Assign #1)
100% (1)
Micro-Economics (Cmi Assign #1)
6 pages
Session 1.3 Notes
No ratings yet
Session 1.3 Notes
39 pages
01 - Quantitative Methods
No ratings yet
01 - Quantitative Methods
28 pages
Chapter 13 Simple Regression
No ratings yet
Chapter 13 Simple Regression
44 pages
Jibson 2007
No ratings yet
Jibson 2007
10 pages
Ch.4 Correlation
No ratings yet
Ch.4 Correlation
1 page
Logistic Regression Via Excel Spreadsheets Mechani
No ratings yet
Logistic Regression Via Excel Spreadsheets Mechani
12 pages
Output - Group - Work - Project - 4652 - GWP1.ipynb - Colaboratory
No ratings yet
Output - Group - Work - Project - 4652 - GWP1.ipynb - Colaboratory
6 pages
Self-Quiz Unit 3 - Attempt Review
No ratings yet
Self-Quiz Unit 3 - Attempt Review
12 pages
Multiple Linear Regression Model: Pampanga State Agricultural University
No ratings yet
Multiple Linear Regression Model: Pampanga State Agricultural University
15 pages
02-Numerical Methods of Approximation-86
No ratings yet
02-Numerical Methods of Approximation-86
86 pages
Curve Fitting: University of Cebu Lapulapu & Mandaue
No ratings yet
Curve Fitting: University of Cebu Lapulapu & Mandaue
18 pages
Ridge and Lasso Regression in Python
No ratings yet
Ridge and Lasso Regression in Python
18 pages
Analysis of Population Growth of India and Estimation For Future
No ratings yet
Analysis of Population Growth of India and Estimation For Future
8 pages
2326 - EC2020 - Main EQP v1 - Final
No ratings yet
2326 - EC2020 - Main EQP v1 - Final
19 pages
InOpe - 1 - Forecasting
No ratings yet
InOpe - 1 - Forecasting
2 pages
NOTES - AMAT 110 Finals
No ratings yet
NOTES - AMAT 110 Finals
4 pages
UNIT-III Lecture Notes
No ratings yet
UNIT-III Lecture Notes
18 pages
Msf
No ratings yet
Msf
10 pages
Chapter 4 Interpolation Curve Fitting (Part 1)
No ratings yet
Chapter 4 Interpolation Curve Fitting (Part 1)
16 pages
Exercise Set 1 Solution Key
No ratings yet
Exercise Set 1 Solution Key
9 pages
Decision Science - Assignment
No ratings yet
Decision Science - Assignment
6 pages
CausalInference w7 Panel
No ratings yet
CausalInference w7 Panel
30 pages
8. Cross Model by Dr.zafar
No ratings yet
8. Cross Model by Dr.zafar
4 pages
Eviews 2
No ratings yet
Eviews 2
15 pages
Mechanical-Nonlin - 13.0 - App6A - Creep Curve Fitting in MAPDL
100% (2)
Mechanical-Nonlin - 13.0 - App6A - Creep Curve Fitting in MAPDL
17 pages
Lecture 5 31032024 123913pm
No ratings yet
Lecture 5 31032024 123913pm
7 pages
ch5
No ratings yet
ch5
8 pages
MAT523 MINI Project-Recycling Rates 1 MAT523 MINI Project - Recycling Rates 1
No ratings yet
MAT523 MINI Project-Recycling Rates 1 MAT523 MINI Project - Recycling Rates 1
29 pages
Logistic Regression & Practice
100% (1)
Logistic Regression & Practice
51 pages
Gr5205 Midterm Key
No ratings yet
Gr5205 Midterm Key
13 pages
DOC-20250427-WA0054.
No ratings yet
DOC-20250427-WA0054.
1 page

Uploaded by

Uploaded by

Regression Analysis

Idea: Examine the linear relationship between 1 dependent (Y)

Multiple Regression Model with k Independent Variables:

Y-intercept Population slopes Random Error

cov ( ui, uj) = 0 i≠j

 This implies that the observations are sampled independently.

• H1: βi ≠ 0 (linear relationship does exist between Xi and Y)

• Confidence interval for the population slope βi

= 0 otherwise (i.e., female professor)

the male and female college professors’ salary functions in relation to

Where Yi  annual expenditure on health care

D2  1 if high school education

Model R R Square Adjusted R Square Std. Error of the Estimate

b. Dependent Variable: quantity demanded

The p value for the F statistic is < .01.

You might also like