
Chapter 6: Multiple Regression Analysis: Further Issues
Introductory Econometrics: A Modern Approach
6.1 Effects of data scaling on OLS statistics
Dependent Variables

b̂wght = β̂0 + β̂1 cigs + β̂2 faminc

bwght = child birth weight, in ounces.

cigs = number of cigarettes smoked by the mother while pregnant, per day.

faminc = annual family income, in thousands of dollars.

If birth weight is instead measured in pounds (bwght/16), every OLS coefficient is simply divided by 16:

b̂wght/16 = β̂0/16 + (β̂1/16) cigs + (β̂2/16) faminc.

When variables are re-scaled, the coefficients, standard errors, confidence intervals, t statistics, and F statistics change in a way that preserves all testing outcomes.

Data scaling of independent variables

Define packs = cigs/20, the number of packs of cigarettes smoked per day. Then

b̂wght = β̂0 + (20β̂1)(cigs/20) + β̂2 faminc = β̂0 + (20β̂1) packs + β̂2 faminc.

the coefficient on packs is 20 times the coefficient on cigs

the standard error on packs is 20 times the standard error on cigs

this means the t statistic is the same in both cases


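These scaling results can be checked directly in R; the sketch below assumes the bwght data from the wooldridge package (bwght in ounces, cigs per day, faminc in thousands of dollars), which is not named on the slide.

data(bwght, package = 'wooldridge')
# Original regression: birth weight in ounces
m_oz    <- lm(bwght ~ cigs + faminc, data = bwght)
# Dependent variable rescaled to pounds: every coefficient is divided by 16
m_lbs   <- lm(I(bwght/16) ~ cigs + faminc, data = bwght)
# Regressor rescaled to packs per day: coefficient and standard error are 20 times larger
m_packs <- lm(bwght ~ I(cigs/20) + faminc, data = bwght)
# The t statistics (and p-values) are identical across all three versions
round(summary(m_oz)$coefficients[, "t value"], 3)
round(summary(m_packs)$coefficients[, "t value"], 3)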
Standardizing the variables (Beta coefficients)
Test scores are often used in wage equations, but the scale of these scores is arbitrary and not easy to interpret.

We are interested in how a particular individual's score compares with the population:

Instead of asking about the effect on hourly wage of a test score that is, say, 10 points higher, it makes more sense to ask what happens when the test score is one standard deviation higher.

Standardizing all variables:

1) Original form:

yi = β̂0 + β̂1 xi1 + β̂2 xi2 + … + β̂k xik + ûi

2) Subtract the means:

yi − ȳ = β̂1 (xi1 − x̄1) + β̂2 (xi2 − x̄2) + … + β̂k (xik − x̄k) + ûi

3) Let σ̂y and σ̂j be the sample standard deviations of y and xj, and divide through by σ̂y:

(yi − ȳ)/σ̂y = (σ̂1/σ̂y) β̂1 [(xi1 − x̄1)/σ̂1] + … + (σ̂k/σ̂y) β̂k [(xik − x̄k)/σ̂k] + (ûi/σ̂y)

It is useful to rewrite the equation above as:

zy = b̂1 z1 + b̂2 z2 + … + b̂k zk + error

where zy denotes the z-score of y, z1 is the z-score of x1 , and so on. The new coefficients are

b̂j = (σ̂j/σ̂y) β̂j for j = 1, …, k

The b̂j are traditionally called standardized coefficients or beta coefficients.

Beta coefficients receive their meaning from the equation above:

If x1 increases by one standard deviation, then ŷ changes by b̂1 standard deviations.

Thus, we are measuring effects not in terms of the original units of y or the xj , but in standard deviation
units.

Example 6.1: Effects of pollution on housing prices
Level-level model:

price = β0 + β1 nox +β2 crime +β3 rooms +β4 dist +β5 stratio +u

where crime is the number of reported crimes per capita

Standardized model:

ẑprice = −.340 znox − .143 zcrime + .514 zrooms − .235 zdist − .270 zstratio

Interpretation:

a one standard deviation increase in nox decreases price by .34 standard deviations

a one standard deviation increase in crime reduces price by .14 standard deviations

Whether we use standardized or unstandardized variables does not affect statistical significance: the t statistics are the same in both cases.

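The standardized equation can be reproduced by standardizing every variable with scale() and dropping the intercept; this is a minimal sketch assuming the hprice2 data from the wooldridge package (the data set is not named on the slide).

data(hprice2, package = 'wooldridge')
# Beta coefficients: regress the z-score of price on the z-scores of the regressors.
# After demeaning, the intercept is zero, so it is suppressed with 0 +.
m_std <- lm(scale(price) ~ 0 + scale(nox) + scale(crime) + scale(rooms) +
              scale(dist) + scale(stratio), data = hprice2)
# Coefficients are effects in standard-deviation units; the t statistics match
# those of the unstandardized regression
round(summary(m_std)$coefficients, 3)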
6.2 More on functional form
a) More on using logarithmic functional form

log(price) = β0 + β1 log(nox) + β2 rooms + u

Given the estimated elasticity of −.718, when nox increases by 1%, price falls by .718%, holding rooms fixed.

When rooms increases by one, price increases by approximately 100(.306) = 30.6%

To calculate the exact change:

%Δŷ = 100 · [exp(β̂2 Δx2) − 1]

When Δx2 = 1,

%Δŷ = 100 · [exp(β̂2) − 1]

Applied to the housing price example with x2 = rooms and β̂2 = .306,

%Δp̂rice = 100[exp(.306) − 1] ≈ 35.8%,

which is notably larger than the approximate percentage change of 30.6%.

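As a quick arithmetic check in R (the coefficient .306 is taken from the example above):

b2 <- 0.306
100 * b2               # approximate percentage change: 30.6%
100 * (exp(b2) - 1)    # exact percentage change: about 35.8%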
More on using logarithmic functional form
Reasons why logarithmic form appears in applied work:

coefficients have appealing interpretations

we can ignore the units of measurement of variables

strictly positive variables often have distributions that are heteroskedastic or skewed; taking the log can mitigate, if not eliminate, both problems

taking the log of a variable often narrows its range, which makes the OLS estimates less sensitive to outliers (extreme values)

on the other hand, taking the log can create extreme values. An example is a variable y between zero and one (such as a proportion) that takes on values close to zero; in this case, log(y) (which is necessarily negative) can be very large in magnitude, whereas the original variable y is bounded between zero and one.

Unwritten rules:

When a variable is a positive dollar amount, the log is often taken. We have seen this for variables such as
wages, salaries, firm sales, and firm market value.

Variables such as population, total number of employees, and school enrollment often appear in logarithmic
form; these have the common feature of being large integer values.

Review: distinction between a percentage change and a percentage point change

Remember, if unem goes from 8% to 9% , this is an increase of one percentage point, but a 12.5% increase
from the initial unemployment level. Using the log means that we are looking at the percentage change in the
unemployment rate:
log(9) − log(8) ≈ .118 or 11.8%, which is the logarithmic approximation to the actual 12.5% increase.
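A one-line check of both numbers in R:

100 * (9 - 8) / 8           # percentage change: 12.5%
100 * (log(9) - log(8))     # log approximation: about 11.8%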

b) Models with quadratics
Quadratic functions are also used quite often in applied economics to capture decreasing or increasing marginal
effects

Model:
y = β0 + β1 x + β2 x² + u.

Remember that β1 does not measure the change in y with respect to x; it makes no sense to hold x² fixed while changing x.

Estimated equation:

ŷ = β̂0 + β̂1 x + β̂2 x²

Δŷ ≈ (β̂1 + 2β̂2 x) Δx,

so Δŷ/Δx ≈ β̂1 + 2β̂2 x.

The estimated slope is therefore β̂1 + 2β̂2 x.

Example

Marginal effect of experience:

Δwage/Δexper = .298 − 2(.0061) exper

This estimated equation implies that exper has a diminishing effect on wage. The first year of experience is worth
roughly 30¢ per hour ($.298).

The second year of experience is worth less: .298 − 2(.0061)(1) ≈ .286, or about 28.6¢ per hour.

In going from 10 to 11 years of experience, wage is predicted to increase by about .298 − 2(.0061)(10) = .176, or about 17.6¢ per hour.

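These numbers come from a wage regression with a quadratic in experience; the sketch below assumes the wage1 data from the wooldridge package (the data set is not named on the slide).

data(wage1, package = 'wooldridge')
# Wage (dollars per hour) regressed on experience and experience squared
m_quad <- lm(wage ~ exper + I(exper^2), data = wage1)
round(coef(m_quad), 4)   # should give roughly .298 on exper and -.0061 on exper^2
# Estimated marginal effect of going from 10 to 11 years of experience
b <- coef(m_quad)
unname(b["exper"] + 2 * b["I(exper^2)"] * 10)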
When the coefficient on x is positive and the coefficient on x² is negative, the quadratic is concave (an inverted-U shape).

There is always a positive value of x where the effect of x on y is zero; before this point, x has a positive effect on y
; after this point, x has a negative effect on y.

How can we find the turning point?

x* = |β̂1/(2β̂2)|

In the wage example, the turning point is exper* = .298/[2(.0061)] ≈ 24.4 years.
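In R, this is a one-line calculation from the reported estimates:

abs(0.298 / (2 * 0.0061))   # about 24.4 years of experience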


Does this mean the return to experience becomes negative after 24.4 years?

Not necessarily. It depends on how many observations in the sample lie to the right of the turning point.

In the given example, about 28% of the observations lie to the right of the turning point, which is not negligible, so there may be a specification problem (e.g., omitted variables).

Example: Effects of pollution on housing prices
When the coefficient on x is negative and the coefficient on x² is positive, the quadratic is convex (a U shape): beyond the turning point, x has an increasing effect on y.

log(price) = β0 + β1 log(nox) + β2 log(dist) + β3 rooms + β4 rooms² + β5 stratio + u.

Does this mean that, at a low number of rooms, more rooms are associated with lower prices?

Δlog(price)/Δrooms ≈ %Δprice/(100 Δrooms) = −.545 + .124 rooms


Because the coefficient on rooms is negative and the coefficient on rooms² is positive, the equation literally implies that, at low values of rooms, an additional room has a negative effect on log(price).

Do we really believe that starting at three rooms and increasing to four rooms actually reduces a house’s expected
value? Probably not.

It turns out that only five of the 506 communities in the sample have houses averaging 4.4 rooms or less,
about 1% of the sample. This is so small that the quadratic to the left of 4.4 can, for practical purposes, be
ignored.

%Δprice ≈ 100[−.545 + 2(.062) rooms] Δrooms = (−54.5 + 12.4 rooms) Δrooms.
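A minimal sketch of this regression, again assuming the hprice2 data from the wooldridge package:

data(hprice2, package = 'wooldridge')
m_rooms <- lm(log(price) ~ log(nox) + log(dist) + rooms + I(rooms^2) + stratio,
              data = hprice2)
round(coef(m_rooms), 3)   # the rooms terms should be roughly -.545 and .062
# Turning point: the number of rooms at which the estimated effect switches sign
b <- coef(m_rooms)
unname(abs(b["rooms"] / (2 * b["I(rooms^2)"])))   # about 4.4 rooms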


c) Models with interaction terms
Sometimes it is natural for the partial effect of an explanatory variable on the dependent variable to depend on the magnitude of yet another explanatory variable.

price = β0 + β1 sqrft + β2 bdrms + β3 sqrft · bdrms + β4 bthrms + u

The partial effect of bdrms on price (holding all other variables fixed) is

Δprice/Δbdrms = β2 + β3 sqrft

If β3 > 0, then an additional bedroom yields a larger increase in housing price for larger houses.

In other words, there is an interaction effect between square footage and number of bedrooms.
In summarizing the effect of bdrms on price, we must evaluate the equation above at interesting values of sqrft,
such as the mean value, or the lower and upper quartiles in the sample

The parameters on the original variables can be tricky to interpret when we include an interaction term.

For example, in the housing price equation above, β2 is the effect of bdrms on price for a home with zero square feet.
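A minimal sketch of such an interaction, assuming the hprice1 data from the wooldridge package (which has no bathrooms variable, so bthrms is omitted here):

data(hprice1, package = 'wooldridge')
# sqrft * bdrms expands to sqrft + bdrms + sqrft:bdrms
m_int <- lm(price ~ sqrft * bdrms, data = hprice1)
b <- coef(m_int)
# Partial effect of bdrms evaluated at the average square footage
unname(b["bdrms"] + b["sqrft:bdrms"] * mean(hprice1$sqrft))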
Example 6.3: Effects of attendance on final exam performance

library(fixest)
data(attend, package='wooldridge')
# Regression with quadratics in priGPA and ACT and an attendance x prior-GPA interaction
model_1 <- feols(stndfnl ~ atndrte + priGPA + ACT + I(priGPA^2) + I(ACT^2) +
                   atndrte:priGPA, data = attend)
summary(model_1)

## OLS estimation, Dep. Var.: stndfnl
## Observations: 680
## Standard-errors: IID
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.050293 1.360319 1.507215 0.13222476
## atndrte -0.006713 0.010232 -0.656067 0.51200517
## priGPA -1.628540 0.481003 -3.385720 0.00075118 ***
## ACT -0.128039 0.098492 -1.299998 0.19404671
## I(priGPA^2) 0.295905 0.101049 2.928314 0.00352322 **
## I(ACT^2) 0.004533 0.002176 2.082939 0.03763374 *
## atndrte:priGPA 0.005586 0.004317 1.293817 0.19617261
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.868368 Adj. R2: 0.221777

If we add the term β7 ACT·atndrte to equation (6.18), what is the partial effect of atndrte on stndfnl?
The new model would be:

stndfnl = β0 + β1 atndrte + β2 priGPA + β3 ACT + β4 priGPA² + β5 ACT² + β6 priGPA · atndrte + β7 ACT · atndrte + u.

Therefore, the partial effect of atndrte on stndfnl is β1 + β6 priGPA + β7 ACT.

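For the model actually estimated above (without the ACT interaction), the partial effect of atndrte evaluated at the average prior GPA can be computed as follows; this is a sketch using lm for simplicity:

data(attend, package = 'wooldridge')
m <- lm(stndfnl ~ atndrte + priGPA + ACT + I(priGPA^2) + I(ACT^2) +
          atndrte:priGPA, data = attend)
b <- coef(m)
# beta1 + beta6 * priGPA, evaluated at the sample mean of priGPA
unname(b["atndrte"] + b["atndrte:priGPA"] * mean(attend$priGPA))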
library(ggplot2)
library(dplyr)
data(wage1, package='wooldridge')
# Simple regression of wage on education
model_1 <- lm(wage ~ educ, data = wage1)
# Store the fitted values and plot them against the observed wages
wage1 <- wage1 %>% mutate(wagehat1 = fitted(model_1))
ggplot(data = wage1, mapping = aes(x = educ)) +
  theme_bw() +
  geom_point(mapping = aes(y = wage, col = 'wage')) +
  geom_point(mapping = aes(y = wagehat1, col = 'linear prediction'))
