
Topic 3: Multiple Regression Analysis: Estimation

Multiple Regression Model
• Multiple Regression Model
The equation that describes how the dependent variable y is related to the independent variables x1,
x2, . . . , xp and an error term is:

y = β0 + β1x1 + β2x2 + . . . + βpxp + ε
Classical Linear Regression Model (CLRM)
• A-1: The regression model is linear in the parameters; it may or may not be linear in the
variables Y and the Xs.
• A-2: The regressors are assumed to be fixed, or nonstochastic, in the sense that their values
are fixed in repeated sampling (fixed X values, or X values independent of the error term).
Hence, we require zero covariance between ui and each X variable.

• A-3: Given the values of the X variables, the expected, or mean, value of the error term is
zero. That is, E(ui | X) = 0.

• A-4: The variance of each ui, given the values of X, is constant, or homoscedastic (homo
means equal and scedastic means variance). That is, var(ui | X) = σ².
Classical Linear Regression Model (CLRM)
• A-5: There is no correlation between two error terms. That is, there is no autocorrelation.
Symbolically, cov(ui, uj) = 0 for i ≠ j.

• A-6: There are no perfect linear relationships among the X variables. This is the
assumption of no multicollinearity. For example, an X variable that is an exact linear
combination of the other X variables is ruled out.
• A-7: The regression model is correctly specified. Alternatively, there is no specification
bias or specification error in the model used in empirical analysis. It is implicitly
assumed that the number of observations, n, is greater than the number of parameters
estimated.

• A-8: Although it is not a part of the CLRM, it is assumed that the error term follows the
normal distribution with zero mean and (constant) variance σ². Symbolically, ui ~ N(0, σ²).
Classical Linear Regression Model (CLRM)
• On the basis of Assumptions A-1 to A-7, it can be shown that the method of ordinary least
squares (OLS), the method most popularly used in practice, provides estimators of the
parameters of the population regression function (PRF) that have several desirable
statistical properties, such as:
1. The estimators are linear, that is, they are linear functions of the dependent variable Y.
Linear estimators are easy to understand and deal with compared to nonlinear estimators.
2. The estimators are unbiased, that is, in repeated applications of the method, on average,
the estimators are equal to their true values.
3. In the class of linear unbiased estimators, OLS estimators have minimum variance. As a
result, the true parameter values can be estimated with the least possible uncertainty; an
unbiased estimator with the least variance is called an efficient estimator.

• In short, under the assumed conditions, OLS estimators are BLUE: best linear unbiased
estimators. This is the essence of the well-known Gauss–Markov theorem, which
provides a theoretical justification for the method of least squares.
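To make the unbiasedness part of the Gauss–Markov result concrete, here is a minimal Python sketch (an illustration added to these notes, not part of the original slides). It simulates repeated samples from a model that satisfies A-1 to A-7 and shows that the OLS estimates average out to the true parameter values; the specific coefficients, sample size, and number of replications are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
beta = np.array([2.0, 1.5, -0.5])   # assumed "true" parameters: beta0, beta1, beta2
n, reps = 50, 5000                  # sample size and number of repeated samples

# Fixed regressors (A-2): the same X matrix is reused in every replication
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n), rng.uniform(0, 10, n)])

estimates = np.empty((reps, 3))
for r in range(reps):
    u = rng.normal(0, 1, n)                                # homoscedastic, uncorrelated errors (A-4, A-5)
    y = X @ beta + u
    estimates[r] = np.linalg.lstsq(X, y, rcond=None)[0]    # OLS estimates for this sample

print("true parameters:   ", beta)
print("average estimates: ", estimates.mean(axis=0))       # close to beta -> unbiasedness
```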
Multiple Regression Analysis: Estimation

• Motivation for multiple regression
  - Incorporate more explanatory factors into the model
  - Explicitly hold fixed other factors that otherwise would end up in the error term
  - Allow for more flexible functional forms

• Example: Wage equation
  wage = β0 + β1educ + β2exper + β3tenure + u
  (hourly wage explained by years of education, labor market experience, and tenure with the current employer)
Multiple Regression Analysis: Estimation

• Example: Average test scores and per-student spending
  avgscore = β0 + β1expend + β2avginc + u

  - Per-student spending is likely to be correlated with average family income at a given high school because of school financing.
  - Omitting average family income from the regression would lead to a biased estimate of the effect of spending on average test scores.
  - In a simple regression model, the effect of per-student spending would partly include the effect of family income on test scores.
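As an illustration of this bias (added here, not part of the original slides), the following Python sketch simulates data in which spending and family income are positively correlated and both raise test scores. The simple regression of scores on spending alone overstates the spending effect, while the multiple regression recovers it; all numbers are assumed purely for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

avginc = rng.normal(50, 10, n)                     # average family income (made-up units)
expend = 2 + 0.1 * avginc + rng.normal(0, 1, n)    # spending is correlated with income
score = 20 + 1.0 * expend + 0.5 * avginc + rng.normal(0, 5, n)   # true spending effect = 1.0

# Simple regression: score on spending only (income omitted)
X_simple = np.column_stack([np.ones(n), expend])
b_simple = np.linalg.lstsq(X_simple, score, rcond=None)[0]

# Multiple regression: score on spending and income
X_mult = np.column_stack([np.ones(n), expend, avginc])
b_mult = np.linalg.lstsq(X_mult, score, rcond=None)[0]

print("spending coefficient, income omitted :", b_simple[1])   # noticeably above 1.0
print("spending coefficient, income included:", b_mult[1])     # close to the true 1.0
```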
Multiple Regression Analysis: Estimation

• Example: Family income and family consumption
  cons = β0 + β1inc + β2inc² + u

  - The model has two explanatory variables: income and income squared
  - Consumption is explained as a quadratic function of income
  - One has to be very careful when interpreting the coefficients: the marginal effect of income,
    Δcons/Δinc ≈ β1 + 2β2inc, depends on the level of income already attained.
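A tiny numeric illustration of that point (the coefficient values below are purely hypothetical, not estimates from these slides):

```python
# Marginal effect of income in the quadratic consumption model cons = b0 + b1*inc + b2*inc^2
b1, b2 = 0.80, -0.002   # hypothetical values, for illustration only

def marginal_effect(inc):
    """Approximate change in consumption for one more unit of income at income level inc."""
    return b1 + 2 * b2 * inc

for inc in (10, 100):
    print(f"income = {inc:>3}: marginal effect = {marginal_effect(inc):.3f}")
# The effect shrinks as income rises, which is why a single-number interpretation is misleading.
```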
Multiple Regression Analysis: Estimation

• Example: CEO salary, sales, and CEO tenure
  log(salary) = β0 + β1log(sales) + β2ceoten + β3ceoten² + u

  - The model assumes a constant elasticity relationship between CEO salary and the sales of his or her firm.
  - The model assumes a quadratic relationship between CEO salary and his or her tenure with the firm.

• Meaning of "linear" regression
  - The model has to be linear in the parameters (not in the variables)
Multiple Regression Analysis: Estimation

• Example: Determinants of college GPA (colGPA explained by high school GPA and ACT score)

• Interpretation
  - Holding ACT fixed, another point of high school grade point average is associated with another .453 points of college grade point average.
  - Or: if we compare two students with the same ACT, but the hsGPA of student A is one point higher, we predict student A to have a colGPA that is .453 higher than that of student B.
  - Holding high school grade point average fixed, another 10 points on the ACT are associated with less than one point of college GPA.
Multiple Regression Equation
• Multiple Regression Equation
The equation that describes how the mean value of y is related to x1, x2, . . . , xk is:

E(y) = β0 + β1x1 + β2x2 + . . . + βkxk
Estimated Multiple Regression Equation
• Estimated Multiple Regression Equation

ŷ = b0 + b1x1 + b2x2 + . . . + bkxk

A simple random sample is used to compute the sample statistics b0, b1, b2, . . . , bk
that are used as the point estimators of the parameters β0, β1, β2, . . . , βk.
Estimation Process

Multiple Regression Model:
  y = β0 + β1x1 + β2x2 + . . . + βkxk + ε
Multiple Regression Equation:
  E(y) = β0 + β1x1 + β2x2 + . . . + βkxk
  Unknown parameters are β0, β1, β2, . . . , βk

Sample Data: observations on x1, x2, . . . , xk and y

Estimated Multiple Regression Equation:
  ŷ = b0 + b1x1 + b2x2 + . . . + bkxk
  Sample statistics b0, b1, b2, . . . , bk provide estimates of β0, β1, β2, . . . , βk
Least Squares Method
• Least Squares Criterion

  min Σ(yi − ŷi)²

• Computation of Coefficient Values

The formulas for the regression coefficients b0, b1, b2, . . . , bk involve the
use of matrix algebra. We will rely on computer software packages to
perform the calculations.
The emphasis will be on how to interpret the computer output rather
than on how to make the multiple regression computations.
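For readers who want to see the matrix-algebra step the slides delegate to software, here is a minimal Python sketch (an added illustration, not from the original deck) that computes b = (X'X)⁻¹X'y for an arbitrary data set; the small example arrays are made up.

```python
import numpy as np

def ols_coefficients(X, y):
    """Return least squares estimates b = (X'X)^(-1) X'y.

    X: (n, k) matrix of regressors WITHOUT a constant column.
    y: (n,) vector of responses.
    """
    n = len(y)
    X_design = np.column_stack([np.ones(n), X])   # prepend the intercept column
    # Solving the normal equations (X'X) b = X'y is numerically safer than an explicit inverse.
    return np.linalg.solve(X_design.T @ X_design, X_design.T @ y)

# Tiny made-up example: two regressors, five observations
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]])
y = np.array([3.1, 3.9, 7.2, 7.8, 10.1])
print(ols_coefficients(X, y))   # b0, b1, b2
```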
Multiple Regression Model
• Example: Programmer Salary Survey
A software firm collected data for a sample of 20 computer programmers. A
suggestion was made that regression analysis could be used to determine if
salary was related to the years of experience and the score on the firm’s
programmer aptitude test.
The years of experience, score on the aptitude test, and corresponding annual
salary ($1000s) for a sample of 20 programmers are shown on the next slide.
Multiple Regression Model
Exper. Test Salary Exper. Test Salary
(Yrs.) Score ($1000s) (Yrs.) Score ($1000s)
4 78 24.0 9 88 38.0
7 100 43.0 2 73 26.6
1 86 23.7 10 75 36.2
5 82 34.3 5 81 31.6
8 86 35.8 6 74 29.0
10 84 38.0 8 87 34.0
0 75 22.2 4 79 30.1
1 80 23.1 6 94 33.9
6 83 30.0 3 70 28.2
6 91 33.0 3 89 30.0
Multiple Regression Model
Suppose we believe that salary (y) is related to the years of experience
(x1) and the score on the programmer aptitude test (x2) by the following
regression model:

y = β0 + β1x1 + β2x2 + ε

where
y = annual salary ($1000s)
x1 = years of experience
x2 = score on programmer aptitude test
Solving for the Estimates of β0, β1, β2

Least Squares
  Input Data: the x1, x2, y values for the 20 programmers (4, 78, 24.0; 7, 100, 43.0; . . . ; 3, 89, 30.0)
  → Computer package for solving multiple regression problems
  → Output: b0, b1, b2, R², etc.
Solving for the Estimates of β0, β1, β2
• Regression Equation Output

Predictor     Coef      SE Coef   T        p
Constant      3.17394   6.15607   0.5156   0.61279
Experience    1.4039    0.19857   7.0702   1.9E-06
Test Score    0.25089   0.07735   3.2433   0.00478
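The output above can be reproduced with any regression routine. The following Python sketch (added for illustration, using numpy rather than the unnamed package in the slides) fits the model to the 20 observations listed earlier and should return coefficients close to 3.174, 1.404, and 0.251.

```python
import numpy as np

# Data from the programmer salary survey: years of experience, test score, salary ($1000s)
exper = np.array([4, 7, 1, 5, 8, 10, 0, 1, 6, 6, 9, 2, 10, 5, 6, 8, 4, 6, 3, 3], dtype=float)
score = np.array([78, 100, 86, 82, 86, 84, 75, 80, 83, 91,
                  88, 73, 75, 81, 74, 87, 79, 94, 70, 89], dtype=float)
salary = np.array([24.0, 43.0, 23.7, 34.3, 35.8, 38.0, 22.2, 23.1, 30.0, 33.0,
                   38.0, 26.6, 36.2, 31.6, 29.0, 34.0, 30.1, 33.9, 28.2, 30.0])

X = np.column_stack([np.ones(len(salary)), exper, score])
b, *_ = np.linalg.lstsq(X, salary, rcond=None)
print(b)   # approximately [3.174, 1.404, 0.251]

# R-squared, matching the ANOVA output shown later in these notes
resid = salary - X @ b
sse = resid @ resid
sst = ((salary - salary.mean()) ** 2).sum()
print("R^2 =", 1 - sse / sst)   # approximately 0.834
```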
Estimated Regression Equation

SALARY = 3.174 + 1.404(EXPER) + 0.251(SCORE)

(Note: Predicted salary will be in thousands of dollars.)
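For example, for the first programmer in the sample (4 years of experience, a test score of 78), the predicted salary is 3.174 + 1.404(4) + 0.251(78) ≈ 28.4, i.e., roughly $28,400, compared with the observed salary of $24,000.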
Interpreting the Coefficients
• In multiple regression analysis, we interpret each regression coefficient as
follows:
bi represents an estimate of the change in y corresponding to a one-unit
increase in xi when all other independent variables are held constant.
Interpreting the Coefficients

b1 = 1.404

Salary is expected to increase by $1,404 for each additional year of
experience (when the variable score on the programmer aptitude test is
held constant).
Interpreting the Coefficients

b2 = 0.251

Salary is expected to increase by $251 for each additional point scored on
the programmer aptitude test (when the variable years of experience is held
constant).
Multiple Coefficient of Determination
• Relationship Among SST, SSR, SSE

SST = SSR + SSE

where:
SST = total sum of squares
SSR = sum of squares due to regression
SSE = sum of squares due to error
Multiple Coefficient of Determination
• ANOVA Output

Analysis of Variance

SOURCE           DF   SS         MS        F       P
Regression        2   500.3285   250.164   42.76   0.000
Residual Error   17    99.45697    5.850
Total            19   599.7855
Multiple Coefficient of Determination

R² = SSR/SST

R² = 500.3285/599.7855 = .83418
Adjusted Multiple Coefficient of Determination
• Adding independent variables, even ones that are not statistically significant,
causes the prediction errors to become smaller, thus reducing the sum of
squares due to error, SSE.
• Because SSR = SST − SSE, when SSE becomes smaller, SSR becomes larger,
causing R² = SSR/SST to increase.
• The adjusted multiple coefficient of determination compensates for the number
of independent variables in the model.
Adjusted Multiple Coefficient of Determination

Ra² = 1 − (1 − R²)(n − 1)/(n − k − 1)

Ra² = 1 − (1 − .834179)(20 − 1)/(20 − 2 − 1) = .814671
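The same arithmetic in a small Python sketch (added here for illustration), using the SSR, SST, n, and k values from the ANOVA output above:

```python
# R-squared and adjusted R-squared from the ANOVA table of the salary regression
ssr, sst = 500.3285, 599.7855   # sum of squares due to regression, total sum of squares
n, k = 20, 2                    # observations and independent variables

r2 = ssr / sst
r2_adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)
print(f"R^2 = {r2:.5f}, adjusted R^2 = {r2_adj:.5f}")   # about 0.83418 and 0.81467
```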
Assumptions About the Error Term ε
• The error ε is a random variable with a mean of zero.
• The variance of ε, denoted by σ², is the same for all values of the
independent variables.
• The values of ε are independent.
• The error ε is a normally distributed random variable reflecting the deviation
between the y value and the expected value of y given by β0 + β1x1 + β2x2 + . . . + βkxk.
Components of OLS Variances

• 1) The error variance
  - A high error variance increases the sampling variance because there is more "noise" in the equation.
  - A large error variance doesn't necessarily make estimates imprecise.
  - The error variance does not decrease with sample size.

• 2) The total sample variation in the explanatory variable
  - More sample variation leads to more precise estimates.
  - Total sample variation automatically increases with the sample size.
  - Increasing the sample size is thus a way to get more precise estimates.
Components of OLS Variances

• 3) Linear relationships among the independent variables
  - Regress xj on all other independent variables (including a constant); see the sketch after this list.
  - The R-squared of this regression will be higher when xj can be better explained by the other
independent variables.
  - The sampling variance of the slope estimator for xj will be higher when xj can be better
explained by the other independent variables.
  - Under perfect multicollinearity, the variance of the slope estimator will approach infinity.
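Under assumptions A-1 to A-7, the sampling variance of a slope estimator can be written as Var(bj) = σ² / [SSTj(1 − Rj²)], where SSTj is the total sample variation in xj and Rj² is the R-squared from regressing xj on the other regressors; the three components above are exactly the three pieces of this formula. The Python sketch below (an added illustration using made-up data) computes Rj² and the corresponding variance inflation factor 1/(1 − Rj²) for each regressor.

```python
import numpy as np

def r_squared_on_others(X, j):
    """R-squared from regressing column j of X on the remaining columns (plus a constant)."""
    n = X.shape[0]
    y = X[:, j]
    others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
    coef, *_ = np.linalg.lstsq(others, y, rcond=None)
    resid = y - others @ coef
    return 1 - (resid @ resid) / ((y - y.mean()) ** 2).sum()

# Made-up regressors: x2 is strongly (but not perfectly) related to x1
rng = np.random.default_rng(2)
x1 = rng.normal(size=200)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=200)
x3 = rng.normal(size=200)
X = np.column_stack([x1, x2, x3])

for j in range(X.shape[1]):
    r2_j = r_squared_on_others(X, j)
    print(f"x{j+1}: R_j^2 = {r2_j:.3f}, variance inflation factor = {1 / (1 - r2_j):.1f}")
```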