Multiple Linear Regression Session 4

Multiple Linear Regression

 Introduction

 Assumptions

 Fitting the Model

 Standard Errors of the Coefficients Bj’s

 Confidence Interval for Regression Coefficients

 Testing of Hypothesis
 Multiple Regression Model
◦ A regression model that contains more than one regressor variable.

 Multiple Linear Regression Model
◦ A multiple regression model that is a linear function of the unknown parameters b0, b1, b2, and so on.
◦ Example: Y = b0 + b1X1 + b2X2 + e, which is linear in the parameters b0, b1, and b2.
Multiple regression is very popular among social scientists:
 Most social phenomena have more than one cause.
 It is very difficult to manipulate just one social variable through experimentation.
 Social scientists must attempt to model complex social realities to explain them.
Multiple regression allows us to:
 Use several variables at once to explain the variation in a continuous dependent variable.
 Isolate the unique effect of one variable on the continuous dependent variable while taking into consideration that other variables are affecting it too.
 Write a mathematical equation that tells us the overall effects of several variables together and the unique effects of each on a continuous dependent variable.
Simple vs. Multiple Regression

Simple regression:
 One dependent variable Y predicted from one independent variable X.
 One regression coefficient.
 r2: proportion of variation in the dependent variable Y predictable from X.

Multiple regression:
 One dependent variable Y predicted from a set of independent variables (X1, X2, ..., Xk).
 One regression coefficient for each independent variable.
 R2: proportion of variation in the dependent variable Y predictable from the set of independent variables (X's).
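In symbols, a standard statement of the two models (using the b notation of the slides):

Simple:   Y = b_0 + b_1 X + e
Multiple: Y = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_k X_k + e

where e is a random error with mean zero.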
Example: The yield of a crop per acre (y) may depend on the fertility of the soil (x1), the fertilizer used (x2), irrigation facilities (x3), weather conditions (x4), and so on. Whenever we are interested in studying the joint effect of a group of variables upon a variable not included in that group, our study is that of multiple regression and multiple correlation.
 Assumptions of Simple or Multiple Linear Regression

 Independence: the scores of any particular subject are independent of the scores of all other subjects.

 Normality: in the population, the scores on the dependent variable are normally distributed for each of the possible combinations of the levels of the X variables; each of the variables is normally distributed.

 Homoscedasticity: in the population, the variances of the dependent variable for each of the possible combinations of the levels of the X variables are equal.

 Linearity: in the population, the relation between the dependent variable and each independent variable is linear when all the other independent variables are held constant.
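As a rough illustration of how the normality and homoscedasticity assumptions can be checked in practice (not part of the original slides; the data below are purely hypothetical), one can test the residuals of a fitted model:

import numpy as np
import statsmodels.api as sm
from scipy import stats
from statsmodels.stats.diagnostic import het_breuschpagan

# Hypothetical data for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))                              # two independent variables
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=50)

model = sm.OLS(y, sm.add_constant(X)).fit()
resid = model.resid

# Normality of residuals: Shapiro-Wilk test (H0: residuals are normal).
print("Shapiro-Wilk p-value:", stats.shapiro(resid).pvalue)

# Homoscedasticity: Breusch-Pagan test (H0: constant error variance).
bp_stat, bp_pvalue, _, _ = het_breuschpagan(resid, sm.add_constant(X))
print("Breusch-Pagan p-value:", bp_pvalue)

Small p-values would suggest the corresponding assumption is violated.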
In general, for two independent variables
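A standard statement of the two-predictor model and its least-squares normal equations (in the b notation used above):

Y = b_0 + b_1 X_1 + b_2 X_2 + e

The least-squares estimates of b_0, b_1, and b_2 solve

\sum y = n b_0 + b_1 \sum x_1 + b_2 \sum x_2
\sum x_1 y = b_0 \sum x_1 + b_1 \sum x_1^2 + b_2 \sum x_1 x_2
\sum x_2 y = b_0 \sum x_2 + b_1 \sum x_1 x_2 + b_2 \sum x_2^2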
Example

a. Fit the regression equation.
b. Predict the value of weight loss for X1 = 6.5 and X2 = 0.35.
Solution
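The original data table is not reproduced here, so the sketch below only illustrates the fitting-and-prediction steps with hypothetical weight-loss data (the numbers are stand-ins, not the example's actual values):

import numpy as np
import statsmodels.api as sm

# Hypothetical data standing in for the original table.
x1 = np.array([4.0, 5.0, 6.0, 7.0, 8.0, 9.0])
x2 = np.array([0.20, 0.35, 0.25, 0.40, 0.30, 0.45])
y  = np.array([2.1, 2.8, 3.4, 4.0, 4.7, 5.2])      # weight loss

X = sm.add_constant(np.column_stack([x1, x2]))
fit = sm.OLS(y, X).fit()
print(fit.params)                                   # b0, b1, b2

# Predicted weight loss at X1 = 6.5, X2 = 0.35:
print(fit.predict([[1.0, 6.5, 0.35]]))

With the actual table, the same two calls give the fitted equation and the requested prediction.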
Exercise
Standard Errors of the Coefficients Bj’s

 The standard errors of the coefficients are vital in statistical inferences about the coefficients.

 We use the standard error of a coefficient for constructing the confidence intervals of the coefficients and to test the significance of the variable to which the coefficient is attached.

The error sum of squares can be written in terms of the fitted coefficients as

SSE = S_{yy} - \sum_j \hat{B}_j S_{x_j y}
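A standard expression for these standard errors, stated here for completeness in the matrix form of the model with k predictors and n observations:

\hat{\sigma}^2 = SSE / (n - k - 1), \qquad se(\hat{B}_j) = \sqrt{\hat{\sigma}^2 C_{jj}}

where C_{jj} is the j-th diagonal element of (X'X)^{-1}. A 100(1 - \alpha)% confidence interval for B_j is then \hat{B}_j \pm t_{\alpha/2, n-k-1} se(\hat{B}_j).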
Testing of Hypothesis

The overall significance test of a model is summarized in the analysis of variance (ANOVA) table as follows.
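A standard layout of this table, for k predictors and n observations:

Source       df          SS     MS                   F
Regression   k           SSR    MSR = SSR/k          F = MSR/MSE
Error        n - k - 1   SSE    MSE = SSE/(n-k-1)
Total        n - 1       SST

The null hypothesis H0: B1 = B2 = ... = Bk = 0 is rejected when F exceeds the critical value F(alpha; k, n-k-1).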
 Coefficient of determination R2

 Adjusted coefficient of determination

 Partial correlation coefficients

 Practical exercise
Coefficient of determination R2
 The coefficient of determination is often interpreted as the percent of variability in y that is explained by the regression equation.
 The quantity varies from 0 to 1; higher values indicate a better regression.
 Caution should be used in making general interpretations of R2, because a high value can result from a small SSE, a large SST, or both.
 R2 is also known as percent explained variability.
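In formula form (the standard definition):

R^2 = SSR / SST = 1 - SSE / SST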
Adjusted coefficient of determination
 There is a potential problem with using R2 as an overall measure of the quality of a fitted equation.

 As additional independent variables are added to a multiple regression model, the explained sum of squares SSR will increase even if the additional independent variable is not an important predictor variable.

 Thus we might find that R2 has increased spuriously after one or more non-significant predictor variables have been added to the multiple regression model.

 In such a case, the increased value of R2 would be misleading.

 To avoid this problem, the adjusted coefficient of determination can be computed as follows.
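A standard form of this adjustment, with k predictors and n observations:

\bar{R}^2 = 1 - \frac{SSE/(n-k-1)}{SST/(n-1)} = 1 - (1 - R^2) \frac{n-1}{n-k-1}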
 We use this measure to correct for the fact that non-relevant independent variables produce only a small reduction in the error sum of squares.

 Thus, the adjusted R2 provides a better comparison between multiple regression models with different numbers of independent variables.

 When the model contains only relevant predictors, the difference between R2 and adjusted R2 is not very large.

 However, if the regression model contains a number of independent variables that are not important conditional predictors, then the difference will be substantial.
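A quick simulation of this point (hypothetical, for illustration only): adding pure-noise predictors raises R2 but not the adjusted R2.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 40
x = rng.normal(size=n)
y = 2.0 + 1.5 * x + rng.normal(size=n)

# Model 1: the single relevant predictor.
m1 = sm.OLS(y, sm.add_constant(x)).fit()

# Model 2: same predictor plus five pure-noise columns.
noise = rng.normal(size=(n, 5))
m2 = sm.OLS(y, sm.add_constant(np.column_stack([x, noise]))).fit()

print(f"R2:     {m1.rsquared:.3f} -> {m2.rsquared:.3f}")          # always rises
print(f"adj R2: {m1.rsquared_adj:.3f} -> {m2.rsquared_adj:.3f}")  # typically does not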


Partial correlation coefficients
 The partial correlation coefficient measures the net correlation between the dependent variable and one independent variable after excluding the common influence of the other independent variables in the model.
 For example, r_{yx_1 \cdot x_2} is the partial correlation between y and x1 after removing the influence of x2 from both y and x1.
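The standard first-order formula, in terms of the simple correlations:

r_{yx_1 \cdot x_2} = \frac{r_{yx_1} - r_{yx_2} r_{x_1 x_2}}{\sqrt{(1 - r_{yx_2}^2)(1 - r_{x_1 x_2}^2)}}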
 The table below lists husbands’ hours of housework per week (Y), number of children (X), and husbands’ years of education (Z) for a sample of 12 dual-career households.
 Calculate the partial (first-order) correlation between husbands’ housework (Y) and number of children (X), controlling for husbands’ years of education (Z).
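Since the original data table is not reproduced above, the sketch below uses hypothetical stand-in values for Y, X, and Z; the helper partial_corr simply implements the first-order formula given earlier.

import numpy as np

def partial_corr(y, x, z):
    """First-order partial correlation r_{yx.z} from the simple correlations."""
    r_yx = np.corrcoef(y, x)[0, 1]
    r_yz = np.corrcoef(y, z)[0, 1]
    r_xz = np.corrcoef(x, z)[0, 1]
    return (r_yx - r_yz * r_xz) / np.sqrt((1 - r_yz**2) * (1 - r_xz**2))

# Hypothetical stand-in data for 12 households.
Y = np.array([1, 2, 3, 5, 3, 1, 5, 0, 6, 3, 7, 4])                # housework hours
X = np.array([1, 1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5])                # number of children
Z = np.array([12, 14, 16, 16, 18, 16, 12, 12, 10, 12, 14, 16])    # years of education

print(round(partial_corr(Y, X, Z), 3))

With the actual table substituted for the arrays above, the same call gives the required first-order partial correlation.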
Exercise
