C2-English
Multiple Regression
(Course: Econometrics)
Phuong Le
3 Model selection
• Information criteria
• Wald test
Introduction
The multiple regression model:
y = β0 + β1 x1 + β2 x2 + ... + βp xp + ε,
where
• β0 , β1 , . . . , βp are the parameters (there are k = p + 1 parameters),
• ε is a random variable called the error term.
Since E(ε) = 0, the multiple regression equation is
E(y) = β0 + β1 x1 + β2 x2 + ... + βp xp .
Matrix representation
Y = X β + ε,
where
\[
X = \begin{pmatrix}
1 & x_{11} & x_{21} & \cdots & x_{p1} \\
1 & x_{12} & x_{22} & \cdots & x_{p2} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{1n} & x_{2n} & \cdots & x_{pn}
\end{pmatrix}, \quad
Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad
\beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_p \end{pmatrix}, \quad
\varepsilon = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}.
\]
Matrix representation of the fitted model
Y = X β̂ + e,
where
\[
\hat{\beta} = \begin{pmatrix} \hat{\beta}_0 \\ \hat{\beta}_1 \\ \vdots \\ \hat{\beta}_p \end{pmatrix}, \quad
e = \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{pmatrix}.
\]
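The fitted values Xβ̂ and the residual vector e can be recovered with predict after regress. A minimal STATA sketch, using the built-in auto dataset purely for illustration (price, mpg and weight are stand-ins, not variables from these slides):

sysuse auto, clear
regress price mpg weight
predict yhat, xb              // fitted values, the entries of X*beta-hat
predict ehat, residuals       // residuals, the entries of e = Y - X*beta-hat
summarize ehat                // with an intercept, the residuals average to zero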
Some multiple regression functions
ln yi = β0 + β1 ln x1i + β2 ln x2i + εi .
yi = β0 + β1 xi + β2 xi² + εi .
This is a multiple regression model for y, x and x².
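Both forms are estimated by creating the transformed variables first and then running an ordinary regression on them. A minimal STATA sketch, using the built-in auto dataset purely for illustration (price, weight and mpg stand in for y, x1 and x2; they are not from the slides):

sysuse auto, clear
* log-log model: ln(price) on ln(weight) and ln(mpg)
gen ln_price  = ln(price)
gen ln_weight = ln(weight)
gen ln_mpg    = ln(mpg)
regress ln_price ln_weight ln_mpg
* quadratic model: price on weight and weight^2
gen weight2 = weight^2
regress price weight weight2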
Least squares method
Least squares criterion
\[
\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
= \sum_{i=1}^{n} \bigl( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{1i} - \hat{\beta}_2 x_{2i} - \cdots - \hat{\beta}_p x_{pi} \bigr)^2 \to \min.
\]
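Minimizing this criterion gives the closed-form estimator β̂ = (XᵀX)⁻¹XᵀY. A minimal STATA/Mata sketch that computes it by hand and checks it against regress, again using the built-in auto dataset purely for illustration (the variable names are an assumption, not the slides' example):

sysuse auto, clear
mata:
    y = st_data(., "price")               // response vector
    X = st_data(., ("mpg", "weight"))     // regressors
    X = J(rows(X), 1, 1), X               // prepend a column of ones
    b = invsym(X'X) * (X'y)               // OLS: (X'X)^{-1} X'Y
    b'
end
regress price mpg weight                  // reports the same coefficients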
Example. Consider the model
Salary = β0 + β1 Experience + β2 TestScore + ε,
where
• Salary: annual salary ($1000s),
• Experience: years of experience,
• TestScore: score on programmer aptitude test.
STATA code: regress Salary Experience TestScore
Result (the regression output itself is not reproduced here). In the output,
\[
MSE = \frac{ESS}{n-p-1}, \qquad MSR = \frac{RSS}{p},
\]
the residual degrees of freedom are n − p − 1, the total degrees of freedom are n − 1, and Root MSE = √MSE (the square root of MSE). The coefficient of determination and the adjusted coefficient of determination are
\[
R^2 = \frac{RSS}{TSS}, \qquad
R_a^2 = \bar{R}^2 := 1 - (1 - R^2)\,\frac{n-1}{n-p-1}.
\]
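All of these quantities can be read from the results Stata stores after regress. A minimal sketch, assuming the salary data used above are in memory (note that Stata's e(rss) is the residual sum of squares, i.e. ESS in the notation of these slides, while e(mss) is the regression sum of squares, RSS here):

regress Salary Experience TestScore
display "R-squared          = " e(r2)
display "adjusted R-squared = " e(r2_a)
display "Root MSE           = " e(rmse)
display "ESS (residual SS)  = " e(rss)
display "RSS (model SS)     = " e(mss)
display "TSS                = " e(mss) + e(rss)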
Let cii be the entry in cell (i, i) of the matrix (X^T X)^{-1}. Then the variance of β̂i is
\[
\sigma^2_{\hat{\beta}_i} = \sigma^2\, c_{ii},
\]
where σ² is estimated by s² = MSE = ESS/(n − p − 1). Hence we can estimate σβ̂i by
\[
se(\hat{\beta}_i) = s \sqrt{c_{ii}}.
\]
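After regress, the matrix s²(XᵀX)⁻¹ of estimated variances and covariances of the β̂i is stored as e(V), so the standard errors are the square roots of its diagonal entries. A minimal sketch, assuming the salary regression above has just been run (in e(V) the intercept appears last):

matrix V = e(V)               // s^2 * (X'X)^{-1}
matrix list V
display "se(b_Experience) = " sqrt(V[1,1])
display "se(b_TestScore)  = " sqrt(V[2,2])
display "se(b_0)          = " sqrt(V[3,3])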
Testing for significance
t Test for Significance of Individual Parameters
For a given number β∗:
1. Hypotheses:
H0 : βi = β∗,
Ha : βi ≠ β∗.
2. Test statistic:
\[
t = \frac{\hat{\beta}_i - \beta^*}{se(\hat{\beta}_i)}.
\]
3. Rejection rule: reject H0 at significance level α if |t| > tα/2 (n − p − 1).
Example. Testing H0 : β2 = 0:
\[
t = \frac{\hat{\beta}_2}{se(\hat{\beta}_2)} = \frac{.2508854}{.0773541} = 3.24.
\]
Example. Testing H0 : β1 = 1:
\[
t = \frac{\hat{\beta}_1 - 1}{se(\hat{\beta}_1)} = \frac{1.403902 - 1}{0.1985669} = 2.03.
\]
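In STATA the same hypotheses can be checked with the post-estimation test command, which for a single restriction reports an F statistic equal to t². A minimal sketch, assuming the salary regression above:

regress Salary Experience TestScore
test TestScore = 0       // H0: beta_2 = 0;  F = t^2 (about 3.24^2)
test Experience = 1      // H0: beta_1 = 1;  F = t^2 (about 2.03^2)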
• Interval estimate of y0 :
\[
\bigl( \hat{y}_0 - t_{\alpha/2}(n-p-1)\, se(\hat{y}_0),\; \hat{y}_0 + t_{\alpha/2}(n-p-1)\, se(\hat{y}_0) \bigr),
\]
where
\[
se(\hat{y}_0) = \sqrt{\sigma^2_{\hat{y}_0}} \quad \text{and} \quad
\sigma^2_{\hat{y}_0} \approx s^2\, X_0^T \left( X^T X \right)^{-1} X_0.
\]
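In STATA, ŷ0 and se(ŷ0) at every observation are produced by predict after regress. A minimal sketch, assuming the salary regression above has just been run (the stdp option gives the standard error of the linear prediction, matching the formula above, and invttail supplies the t critical value):

predict yhat, xb                                        // y-hat at each observation
predict se_yhat, stdp                                   // se(y-hat)
gen lower = yhat - invttail(e(df_r), 0.025) * se_yhat   // 95% interval limits
gen upper = yhat + invttail(e(df_r), 0.025) * se_yhat
list Salary yhat lower upper in 1/5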
Information criteria
• The adjusted multiple coefficient of determination
\[
R_a^2 = \bar{R}^2 := 1 - (1 - R^2)\,\frac{n-1}{n-p-1} = 1 - \frac{ESS/(n-p-1)}{TSS/(n-1)}
\]
(higher is better).
• Akaike information criterion
\[
AIC = \frac{ESS}{n}\, e^{2(p+1)/n}
\]
(smaller is better).
• Schwarz information criterion (BIC/SC)
\[
BIC = \frac{ESS}{n}\, n^{(p+1)/n}
\]
(smaller is better).
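These criteria can be computed directly from the results stored by regress. A minimal sketch using the formulas above, assuming the salary data are in memory (Stata's own estat ic reports log-likelihood-based AIC/BIC, which are on a different scale but rank models fit to the same data in the same order):

regress Salary Experience TestScore
scalar ESS = e(rss)                   // residual (error) sum of squares
scalar n   = e(N)
scalar k   = e(df_m) + 1              // k = p + 1 estimated parameters
scalar AIC = (ESS/n) * exp(2*k/n)
scalar BIC = (ESS/n) * n^(k/n)
display "adjusted R2 = " e(r2_a) "   AIC = " AIC "   BIC = " BIC
estat ic                              // Stata's log-likelihood-based versions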
Information criteria
Example 8. A real estate company investigates the prices of apartments for young families. They use the following regression model:
\[
PRICE = \beta_0 + \beta_1\, SQFT + \beta_2\, BEDRMS + \beta_3\, BATHS + \varepsilon,
\]
where
• PRICE: price of the apartment (in thousands of dollars),
• SQFT: area (in square feet),
• BEDRMS: number of bedrooms,
• BATHS: number of bathrooms.
Find the best linear model.
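A hedged sketch of the comparison, assuming a STATA dataset containing the variables PRICE, SQFT, BEDRMS and BATHS has been loaded (no such file ships with Stata, so the code is illustrative only): fit the nested candidate models and keep the one with the largest adjusted R² and the smallest AIC/BIC.

regress PRICE SQFT
estat ic                              // AIC/BIC for the one-regressor model
regress PRICE SQFT BEDRMS
estat ic
regress PRICE SQFT BEDRMS BATHS
estat ic
* compare e(r2_a) and the reported AIC/BIC across the three fits;
* the preferred specification has the largest adjusted R2 and the smallest AIC/BIC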