Multiple Regression
• y – response variable
• x1, x2, …, xk – a set of explanatory variables
• Prediction equation: ŷ = a + b1x1 + b2x2 + … + bkxk
Example: Mental impairment study
• y = mental impairment (summarizes extent of psychiatric
symptoms, including aspects of anxiety and depression, based
on questions in “Health opinion survey” with possible responses
hardly ever, sometimes, often)
Ranges from 17 to 41 in sample, mean = 27, s = 5.
• x1 = life events score (composite measure of number and
severity of life events in previous 3 years)
Ranges from 0 to 100, sample mean = 44, s = 23
• x2 = socioeconomic status (composite index based on
occupation, income, and education)
Ranges from 0 to 100, sample mean = 57, s = 25
SSE = 768.2, smaller than the SSE for either bivariate model or for
any other linear prediction equation with predictors x1, x2.
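The least-squares fit behind such a prediction equation can be sketched in plain Python by solving the two-predictor normal equations. The actual 40-case mental impairment dataset is not reproduced in the slides, so the small dataset below is hypothetical, constructed to satisfy y = 2 + 0.5x1 − 0.3x2 exactly so that the fit recovers those coefficients:

```python
def fit_two_predictor_ols(y, x1, x2):
    """Least-squares fit of yhat = a + b1*x1 + b2*x2 via the normal equations."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    # Centered sums of squares and cross-products
    S11 = sum((v - m1) ** 2 for v in x1)
    S22 = sum((v - m2) ** 2 for v in x2)
    S12 = sum((u - m1) * (v - m2) for u, v in zip(x1, x2))
    S1y = sum((u - m1) * (v - my) for u, v in zip(x1, y))
    S2y = sum((u - m2) * (v - my) for u, v in zip(x2, y))
    # Solve the 2x2 system for the partial slopes (Cramer's rule)
    det = S11 * S22 - S12 ** 2
    b1 = (S22 * S1y - S12 * S2y) / det
    b2 = (S11 * S2y - S12 * S1y) / det
    a = my - b1 * m1 - b2 * m2
    return a, b1, b2

# Hypothetical data generated from y = 2 + 0.5*x1 - 0.3*x2 (no error term),
# so the fit should return a = 2, b1 = 0.5, b2 = -0.3
x1 = [0, 10, 20, 30, 40, 50]
x2 = [5, 3, 8, 1, 9, 4]
y = [2 + 0.5 * u - 0.3 * v for u, v in zip(x1, x2)]
a, b1, b2 = fit_two_predictor_ols(y, x1, x2)
```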
Comments
• Partial effects in multiple regression refer to controlling other
variables in model, so differ from effects in bivariate models,
which ignore all other variables.
• Partial effect of x1 (controlling for x2) is same as bivariate
effect of x1 when correlation = 0 between x1 and x2
(as is true in most designed experiments).
• Partial effect of a predictor in this multiple regression model
is identical at all fixed values of other predictors in model
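The second comment above can be checked numerically with the standard formula that expresses the partial slope of x1 (controlling for x2) in terms of correlations and standard deviations; the correlation and standard deviation values used below are hypothetical, chosen only for illustration:

```python
def partial_slope_x1(r_yx1, r_yx2, r_x1x2, s_y, s_x1):
    # Standard formula for the partial slope of x1 in a two-predictor model
    return (r_yx1 - r_yx2 * r_x1x2) / (1 - r_x1x2 ** 2) * (s_y / s_x1)

def bivariate_slope_x1(r_yx1, s_y, s_x1):
    # Slope of the bivariate regression of y on x1 alone
    return r_yx1 * (s_y / s_x1)

# When x1 and x2 are uncorrelated (r_x1x2 = 0), the partial slope reduces
# to the bivariate slope (hypothetical values: r_yx1 = 0.4, r_yx2 = -0.3)
same = partial_slope_x1(0.4, -0.3, 0.0, 5, 23) == bivariate_slope_x1(0.4, 5, 23)
print(same)  # → True
```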
TSS = Σ(y − ȳ)²,  SSE = Σ(y − ŷ)²

R² = (TSS − SSE) / TSS = [Σ(y − ȳ)² − Σ(y − ŷ)²] / Σ(y − ȳ)²

   = (1162.4 − 768.2) / 1162.4 = 0.339
Software provides an ANOVA table with the sums
of squares used in R-squared and a Model
Summary table with values of R and R-squared.
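A quick sketch of the R² computation from the sums of squares reported in the slides:

```python
# Sums of squares for the mental impairment example
TSS = 1162.4
SSE = 768.2
SSR = TSS - SSE        # regression ("model") sum of squares
R2 = SSR / TSS
print(round(SSR, 1), round(R2, 3))  # → 394.2 0.339
```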
• R2 = 0.34, so there is a 34% reduction in error when
we use life events and SES together to predict
mental impairment (via the prediction equation),
compared to using ȳ to predict mental impairment.
• 0 ≤ R2 ≤ 1
• R = √R², so 0 ≤ R ≤ 1 (i.e., it can’t be negative)
• The larger their values, the better the set of
explanatory variables predicts y
• R2 = 1 when observed y = predicted y, so SSE = 0
• R2 = 0 when every predicted y = ȳ, so TSS = SSE.
When this happens, b1 = b2 = … = bk = 0 and the
correlation r = 0 between y and each x predictor.
• R2 cannot decrease when predictors added to model
• With single predictor, R2 = r2 , R = |r|
• The numerator of R2, which is TSS – SSE, is called
the regression sum of squares. This represents the
variability in y “explained” by the model.
Equivalent to testing
H0: population multiple correlation = 0 (or popul. R2 = 0)
vs. Ha: population multiple correlation > 0
• Test statistic (with k explanatory variables):

  F = (R²/k) / [(1 − R²)/(n − (k + 1))]

For the example:

  F = (0.339/2) / [(1 − 0.339)/(40 − (2 + 1))] = 9.5
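The arithmetic of the F statistic can be checked directly:

```python
# F test that all partial slopes = 0, using values from the example
R2, k, n = 0.339, 2, 40
F = (R2 / k) / ((1 - R2) / (n - (k + 1)))
print(round(F, 1))  # → 9.5
```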
Likewise for the test of H0: β1 = 0 (P-value = 0.003); life events has a
positive effect on mental impairment, controlling for SES.
A 95% CI for β2 is b2 ± t(se), which is
−0.097 ± 2.03(0.029), or (−0.16, −0.04)
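The confidence-interval arithmetic, using the estimate, standard error, and t critical value from the slides:

```python
# 95% CI for beta2: b2 ± t * se, with t = t_.025 for df = n - (k + 1) = 37
b2, se, t = -0.097, 0.029, 2.03
lower, upper = b2 - t * se, b2 + t * se
print(round(lower, 2), round(upper, 2))  # → -0.16 -0.04
```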
Comments about interaction model
Example: Compare
• Notation: ryx1·x2 denotes the partial correlation between y and x1
while controlling for x2.
The t test of
H0: population partial correlation = 0
is equivalent to the t test of
H0: population partial slope = 0,
i.e., the test of H0: β2 = 0.
Standardized Regression Coefficients
• Recall for a bivariate model, the correlation is a
“standardized slope,” reflecting what the slope
would be if x and y had equal standard dev’s.
• In multiple regression, there are also standardized
regression coefficients that describe what the
partial regression coefficients would equal if all
variables had the same standard deviation.
b1* = b1 (sx1 / sy )
b2* = b2 (sx2 / sy ), etc.
Note:
An alternative form of the prediction equation uses the
standardized regression coefficients as coefficients
of the standardized variables
(text, p. 352)
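The conversion can be sketched with the sample standard deviations given earlier (sy = 5, sx1 = 23, sx2 = 25). The slides report b2 = −0.097; the value of b1 below is an assumption used only for illustration:

```python
# Standardized coefficients: b* = b * (s_x / s_y)
b1, b2 = 0.103, -0.097          # b1 is a hypothetical value; b2 is from the slides
s_y, s_x1, s_x2 = 5, 23, 25     # sample standard deviations from the example
b1_star = b1 * (s_x1 / s_y)
b2_star = b2 * (s_x2 / s_y)
print(b1_star, b2_star)
```

Because the predictors are put on a common scale, |b1_star| > |b2_star| would indicate that life events has the stronger partial effect in standard-deviation units.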
Some multiple regression review questions
• (T/F) If you get a small P-value in the F test that all regression
coefficients = 0, then the P-value will be small in at least one of
the t tests for the individual regression coefficients.