Econometrics: Multicollinearity

Multicollinearity refers to a near-perfect linear relationship between two or more explanatory variables in a regression model. Near multicollinearity does not violate the assumptions of the classical linear regression model, but it does result in imprecise and unstable estimates with large standard errors. This makes it difficult to determine the individual impact of each variable and results in few or no variables being deemed statistically significant despite a good overall model fit. Polynomial regressions are also susceptible to multicollinearity between the higher-order terms. Multicollinearity can be identified through high correlation coefficients between variables and through a significant overall model fit combined with insignificant individual variables.


Econometrics

Multicollinearity

Multicollinearity
■ We start by considering the model with k − 1 explanatory variables,
Y = Xβ + e
with E[e | X] = 0, Var[e | X] = σ²Iₙ, and
X (n×k) = [x1 x2 … xk], where
x1 = (1, 1, …, 1)',  x2 = (x21, x22, …, x2n)',  …,  xk = (xk1, xk2, …, xkn)'

Econometrics Patrícia Cruz 8-2


Perfect Multicollinearity
■ Originally the term multicollinearity meant the existence
of a "perfect", or exact, linear relationship among
some of the explanatory variables of the model.

■ For example, in our model with k regressors (including the
constant term), we say that there is an exact linear relationship
between them if the following condition is satisfied:
λ1x1 + λ2x2 + … + λkxk = 0
where λ1, λ2, …, λk are constants such that at least one of
them is different from zero. That is, at least one of the
explanatory variables can be expressed as a linear
combination of the other variables (perfect multicollinearity).
Econometrics Patrícia Cruz 8-3

Perfect Multicollinearity
■ In the situation just described, we say that there is
perfect multicollinearity.
■ In this situation, the columns of the regressor matrix, X,
are linearly dependent, so
rank(X) < k ⇒ rank(X'X) < k ⇒ |X'X| = 0
and it is not possible to invert the X'X matrix (that is,
X'X is a singular matrix).
■ In this case, it is not possible to estimate the regression
coefficients using the OLS rule. In fact, there is no
unique solution to the normal equations
(X'X)β̂ = X'y
Econometrics Patrícia Cruz 8-4
Example
■ Example: Consider the model
yi = β1 + β2x2i + β3x3i + ei
and let us assume that
x3i = λx2i,  with λ ≠ 0.
Substituting in the model above, we obtain
yi = β1 + β2x2i + β3(λx2i) + ei
   = β1 + (β2 + λβ3)x2i + ei
   = β1 + αx2i + ei,  with α = β2 + λβ3.
Therefore, although the OLS rule allows us to estimate
α uniquely, there is no way to estimate the impact of x2
on y (β2) and the impact of x3 on y (β3) uniquely.
Econometrics Patrícia Cruz 8-5

Example
■ Example: Mathematically,
α̂ = β̂2 + λβ̂3
gives us only one equation in two unknowns (λ is given),
and there is an infinity of solutions to this equation for
given values of α̂ and λ.

In particular, if α̂ = 0.8 and λ = 2, we have
0.8 = β̂2 + 2β̂3 ⇔ β̂2 = 0.8 − 2β̂3
so there is no unique solution for β̂2.

Econometrics Patrícia Cruz 8-6
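The following is a minimal NumPy sketch of the example above (simulated data; λ = 2 as in the numerical example): with x3 an exact multiple of x2, X loses rank, X'X becomes singular, and the normal equations no longer have a unique solution.

```python
import numpy as np

# Perfect multicollinearity: x3 is an exact multiple of x2 (lambda = 2),
# so the columns of X are linearly dependent.
rng = np.random.default_rng(0)
n = 50
x2 = rng.normal(size=n)
x3 = 2.0 * x2                               # x3 = lambda * x2 with lambda = 2
X = np.column_stack([np.ones(n), x2, x3])
y = 1.0 + 0.8 * x2 + rng.normal(size=n)     # illustrative data-generating process

print(np.linalg.matrix_rank(X))             # 2 < k = 3
print(np.linalg.det(X.T @ X))               # |X'X| = 0 (up to floating-point rounding)

# The normal equations (X'X) b = X'y have infinitely many solutions;
# lstsq returns just one of them (the minimum-norm solution).
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)                                    # beta2 and beta3 are not separately identified
```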


Near Multicollinearity
■ A different situation arises when the explanatory variables
are strongly correlated but not perfectly so, as follows:
λ1x1 + λ2x2 + … + λkxk + v = 0
where v is a vector of random errors.
■ If we assume, for example, that λ2 ≠ 0, we can write the
equation above as
x2 = −(λ1/λ2)x1 − (λ3/λ2)x3 − … − (λk/λ2)xk − (1/λ2)v
which shows that x2 is not an exact linear combination of
the other explanatory variables because it is also
determined by the error term v.
Econometrics Patrícia Cruz 8-7

Near Multicollinearity
■ When there are nearly exact linear relationships between
the explanatory variables we say that there is near or
high multicollinearity.
■ In this case none of the assumptions of the classical
linear regression model is violated. In fact, X has
k linearly independent columns (rank(X) = k), so the X'X
matrix is nonsingular and the matrix (X'X)⁻¹ exists.
■ In the figure below we represent several degrees of
collinearity between x2 and x3. The circles represent the
variations in y (the dependent variable) and in x2 and x3
(the explanatory variables).

Econometrics Patrícia Cruz 8-8
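A quick numerical check of this point (simulated data; variable names are illustrative): under near multicollinearity X keeps full column rank and X'X remains invertible, but it is close to singular.

```python
import numpy as np

# Near (not perfect) multicollinearity: x3 is strongly but not exactly
# related to x2, so rank(X) = k and X'X is still invertible,
# it is simply close to singular (very large condition number).
rng = np.random.default_rng(6)
n = 100
x2 = rng.normal(size=n)
x3 = x2 + 0.01 * rng.normal(size=n)          # near-exact linear relation
X = np.column_stack([np.ones(n), x2, x3])

print(np.linalg.matrix_rank(X))              # 3 = k: no assumption is violated
print(np.linalg.cond(X.T @ X))               # huge: X'X is nearly singular
XtX_inv = np.linalg.inv(X.T @ X)             # the inverse still exists
print(np.diag(XtX_inv))                      # but some diagonal elements are very large
```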


Multicollinearity

[Figure: three diagrams of overlapping circles representing the variation in y, x2 and x3, illustrating no collinearity, low collinearity and high collinearity between x2 and x3.]
Econometrics Patrícia Cruz 8-9

Statistical Consequences of Near Multicollinearity
■ Statistical consequences of near multicollinearity

1) Although BLUE, the OLS estimators have large variances and
covariances. In fact, when there are nearly exact dependencies
among the explanatory variables, some elements of (X'X)⁻¹ will be
large and so some elements of Var(β̂ | X) = σ²(X'X)⁻¹ will be large.

2) Large standard errors for the OLS estimators imply that
confidence intervals tend to be much wider and that the
information provided by the sample data about the unknown
parameters is relatively imprecise.

Econometrics Patrícia Cruz 8-10
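To illustrate consequence 1), the sketch below (simulated data, illustrative names) computes σ²(X'X)⁻¹ for two regressors with increasing correlation; the variances of β̂2 and β̂3 explode as the correlation approaches 1.

```python
import numpy as np

# As corr(x2, x3) approaches 1, the diagonal of sigma^2 (X'X)^-1 blows up,
# i.e. Var(beta2_hat) and Var(beta3_hat) become very large.
rng = np.random.default_rng(1)
n, sigma2 = 200, 1.0
x2 = rng.normal(size=n)

for rho in (0.0, 0.5, 0.9, 0.99, 0.999):
    # build x3 with (approximately) the chosen correlation with x2
    x3 = rho * x2 + np.sqrt(1 - rho**2) * rng.normal(size=n)
    X = np.column_stack([np.ones(n), x2, x3])
    V = sigma2 * np.linalg.inv(X.T @ X)      # Var(beta_hat | X) = sigma^2 (X'X)^-1
    print(f"rho={rho:6.3f}  Var(b2)={V[1, 1]:.4f}  Var(b3)={V[2, 2]:.4f}")
```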


[Margin note on consequence 3): for each coefficient we test H0: βj = 0 against Ha: βj ≠ 0 with t = β̂j / se(β̂j).]
Statistical Consequences of Near Multicollinearity
3) Because of the large estimated standard errors, it is likely
that the usual t tests will lead to the conclusion that the
parameter values are not significantly different from zero
(we fail to reject H0, so the variables appear individually
insignificant).

4) The outcome described in 3) occurs despite possibly high
R² or F values indicating that the model is globally
significant. In fact, collinear variables do not provide
enough information to estimate their separate effects, even
though economic theory, and their total effect, may
indicate their importance in the relationship.
5) The OLS estimators may be very sensitive to the addition
or deletion of a few observations, or the deletion of an
apparently insignificant variable.
Econometrics Patrícia Cruz 8-11
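A small simulation illustrating consequences 3) and 4); the data-generating process and parameter values below are purely illustrative. The regression fits very well overall, yet the individual t ratios on the collinear slopes are small.

```python
import numpy as np

# Two highly collinear regressors: high R^2 and a huge F statistic,
# but small individual t ratios for the slope coefficients.
rng = np.random.default_rng(2)
n = 100
x2 = rng.normal(size=n)
x3 = x2 + 0.05 * rng.normal(size=n)          # near-exact linear relation with x2
y = 1.0 + 1.0 * x2 + 1.0 * x3 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x2, x3])
k = X.shape[1]
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                        # OLS estimates
resid = y - X @ b
s2 = resid @ resid / (n - k)                 # estimate of sigma^2
se = np.sqrt(s2 * np.diag(XtX_inv))          # standard errors
t = b / se                                   # individual t ratios

tss = np.sum((y - y.mean()) ** 2)
r2 = 1 - resid @ resid / tss
F = (r2 / (k - 1)) / ((1 - r2) / (n - k))    # overall significance test

print("t ratios :", np.round(t, 2))          # slope t's typically below 2 in absolute value
print("R^2      :", round(r2, 3), "  F :", round(F, 1))
```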


Multicollinearity in Polynomial Regressions

■ Multicollinearity, as we have defined it, refers only to
linear relationships among the explanatory variables. It
does not rule out nonlinear relationships among them.
■ For example, in polynomial regressions the explanatory
variable(s) appear with various powers. These terms are
going to be correlated, making it difficult to estimate the
various slope coefficients with precision (although the
assumption of no multicollinearity is not violated).

Econometrics Patrícia Cruz 8-12
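A quick numerical check of this point, with simulated data: for a positive regressor, x and x² are almost perfectly correlated; centering x first, a common way to attenuate (not eliminate) the problem, reduces the correlation sharply.

```python
import numpy as np

# Powers of the same regressor are themselves highly correlated.
# Centering the variable before squaring it usually reduces this correlation.
rng = np.random.default_rng(3)
x = rng.uniform(10, 20, size=500)            # a positive regressor, e.g. age or income

print(np.corrcoef(x, x**2)[0, 1])            # close to 1: x and x^2 nearly collinear
xc = x - x.mean()
print(np.corrcoef(xc, xc**2)[0, 1])          # much smaller after centering
```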


Identifying Multicollinearity
■ Identifying multicollinearity
1) High R2 and the F statistic indicating that the model is
globally significant, but the individual t tests showing that
none or very few of the partial slope coefficients are
statistically different from zero.

2) High sample correlation coefficients between pairs of
explanatory variables. A commonly used rule is that a
correlation coefficient greater than 0.8 indicates a strong
linear association and a potentially harmful collinear
relationship.

Econometrics Patrícia Cruz 8-13


Identifying Multicollinearity
Notice, however, that high correlation coefficients are a
sufficient but not a necessary condition for the existence of
multicollinearity, which can exist even when the correlation
coefficients are relatively low (of course, if there are only
two explanatory variables this measure suffices).
3) In order to find out which explanatory variable is related to
the other explanatory variables, we can regress each xj on the
remaining explanatory variables and compute the
corresponding R², which we can designate as Rj². Each of
these regressions is called an auxiliary regression. A rule
of thumb is that multicollinearity may be a problem only if
Rj² is greater than the R² of the initial model.
Econometrics Patrícia Cruz 8-14
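The sketch below implements the auxiliary regressions of point 3) with NumPy on simulated data (variable names are illustrative); it also reports the closely related variance inflation factor, VIFj = 1 / (1 − Rj²).

```python
import numpy as np

# Auxiliary regressions: regress each explanatory variable on the others
# and record R_j^2.  VIF_j = 1 / (1 - R_j^2) is the variance inflation factor.
rng = np.random.default_rng(4)
n = 200
x2 = rng.normal(size=n)
x3 = 0.95 * x2 + 0.3 * rng.normal(size=n)    # x3 strongly related to x2
x4 = rng.normal(size=n)                      # unrelated regressor
X = np.column_stack([np.ones(n), x2, x3, x4])
names = ["const", "x2", "x3", "x4"]

for j in range(1, X.shape[1]):               # skip the constant
    xj = X[:, j]
    Z = np.delete(X, j, axis=1)              # all other columns (incl. the constant)
    coef, *_ = np.linalg.lstsq(Z, xj, rcond=None)
    resid = xj - Z @ coef
    r2_j = 1 - resid @ resid / np.sum((xj - xj.mean()) ** 2)
    print(f"{names[j]}: R_j^2 = {r2_j:.3f}, VIF = {1 / (1 - r2_j):.1f}")
```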
Solutions to Collinear Data

1) Using a priori information, that is, introducing
nonsample information in the form of linear
restrictions on the parameters. This a priori
information could come from previous empirical work
in which the collinearity problem is less serious,
or from economic theory.
[Margin example: for a production function ln yi = β1 + β2 ln x2i + β3 ln x3i + ui, constant returns to scale imply β2 + β3 = 1; substituting β3 = 1 − β2 gives ln(yi / x3i) = β1 + β2 ln(x2i / x3i) + ui, a regression with a new dependent variable and a single regressor.]

2) Combining cross-sectional and time-series data, or
pooling the data.
[Margin example: in the time-series model ln yt = β1 + β2 ln Pt + β3 ln It + et, the coefficient β3 can first be estimated from cross-sectional data as β̂3 and then imposed, regressing the new dependent variable ln yt* = ln yt − β̂3 ln It on ln Pt.]

3) Dropping an explanatory variable(s) from the
model. Notice that in dropping a variable from a
model we may be committing a specification error.
[Margin note: only do this if you are sure the variable is not significant.]

Econometrics Patrícia Cruz 8-15
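A minimal sketch of solution 1), assuming the constant-returns example in the margin note above: the restriction β2 + β3 = 1 is imposed by substitution, leaving a single regressor. Data and variable names are simulated and illustrative.

```python
import numpy as np

# Impose the a-priori restriction beta2 + beta3 = 1: substitute
# beta3 = 1 - beta2 and estimate (lnY - lnL) = beta1 + beta2 (lnK - lnL) + e,
# a model with one regressor and hence no collinearity problem.
rng = np.random.default_rng(5)
n = 100
lnK = rng.normal(size=n)
lnL = 0.9 * lnK + 0.2 * rng.normal(size=n)                     # capital and labour move together
lnY = 0.5 + 0.6 * lnK + 0.4 * lnL + 0.1 * rng.normal(size=n)   # true beta2 + beta3 = 1

y_star = lnY - lnL                            # new dependent variable
x_star = lnK - lnL                            # new regressor
Z = np.column_stack([np.ones(n), x_star])
(b1, b2), *_ = np.linalg.lstsq(Z, y_star, rcond=None)
b3 = 1.0 - b2                                 # recovered from the restriction
print(f"beta1 = {b1:.3f}, beta2 = {b2:.3f}, beta3 = {b3:.3f}")
```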

Solutions to Collinear Data

4) Transformation of the variables in the initial model.


5) Additional or new data. Since multicollinearity is a
sample feature, it is possible that in another sample
involving the same variables collinearity may not be
as serious as in the first sample. Sometimes simply
increasing the size of the sample may solve (or at least
attenuate) the collinearity problem.

Econometrics Patrícia Cruz 8-16
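A small numerical illustration of solution 5), with simulated data: holding the degree of collinearity fixed, a larger sample shrinks the standard errors implied by σ²(X'X)⁻¹.

```python
import numpy as np

# Even with the same degree of collinearity, more observations make the
# elements of X'X larger, so sigma^2 (X'X)^-1 and the standard errors shrink.
rng = np.random.default_rng(7)
sigma2, rho = 1.0, 0.99

for n in (50, 500, 5000):
    x2 = rng.normal(size=n)
    x3 = rho * x2 + np.sqrt(1 - rho**2) * rng.normal(size=n)
    X = np.column_stack([np.ones(n), x2, x3])
    V = sigma2 * np.linalg.inv(X.T @ X)
    print(f"n={n:5d}  se(b2)={np.sqrt(V[1, 1]):.3f}  se(b3)={np.sqrt(V[2, 2]):.3f}")
```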

[Margin example for solution 4): the model yt = β1 + β2 x2t + β3 x3t + et, in which x2 and x3 are highly correlated, is re-estimated after transforming the variables, for example by taking first differences of y, x2 and x3.]
Is Multicollinearity Necessarily Bad?
■ Despite the difficulties in isolating the effects of individual
variables, if the only purpose of regression analysis is
prediction or forecasting, then multicollinearity is not a
serious problem.
■ In fact, as long as the model has good explanatory power
(a high R² and an F statistic indicating that the model is
globally significant) and the structure of the collinear
relationship remains the same within the new sample
observations, accurate forecasts may still be possible.
■ If the objective of the analysis is not only prediction but
also to obtain reliable estimates of the parameters, serious
multicollinearity will be a problem because of the large
variances of the OLS estimators.
Econometrics Patrícia Cruz 8-17
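A simulated illustration of this last point: the individual coefficients are estimated imprecisely, yet out-of-sample predictions remain accurate as long as the new observations share the same collinear structure. All numbers below are illustrative.

```python
import numpy as np

# Forecasting with collinear regressors: b2 and b3 are imprecise individually,
# but predictions are accurate when the collinear relationship between
# x2 and x3 is the same in the new observations.
rng = np.random.default_rng(8)

def make_data(n, rng):
    x2 = rng.normal(size=n)
    x3 = x2 + 0.05 * rng.normal(size=n)          # same collinear structure everywhere
    y = 1.0 + 1.0 * x2 + 1.0 * x3 + rng.normal(size=n)
    return np.column_stack([np.ones(n), x2, x3]), y

X, y = make_data(100, rng)                       # estimation sample
b, *_ = np.linalg.lstsq(X, y, rcond=None)        # imprecise b2, b3 individually

X_new, y_new = make_data(50, rng)                # new sample, same structure
rmse = np.sqrt(np.mean((y_new - X_new @ b) ** 2))
print("coefficients:", np.round(b, 2))
print("out-of-sample RMSE:", round(rmse, 2))     # close to the error s.d. of 1
```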
