Week 11-2 Lecture 15 Student
$\min \sum (y_i - \hat{y}_i)^2$
where
y = annual salary ($1000s)
x1 = years of experience
x2 = score on programmer aptitude test
SST = SSR + SSE
where:
SST = total sum of squares
SSR = sum of squares due to regression
SSE = sum of squares due to error
$R^2$ = SSR/SST
$R^2$ = 500.3285/599.7855 = 0.83418
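The arithmetic above can be checked with a few lines of Python. This is a minimal sketch that uses the SSR and SST values quoted on the slide; SSE is derived from the identity SST = SSR + SSE, and the variable names are illustrative.

```python
# Sums of squares quoted on the slide
ssr = 500.3285   # sum of squares due to regression
sst = 599.7855   # total sum of squares

# SSE follows from the identity SST = SSR + SSE
sse = sst - ssr

# Multiple coefficient of determination
r_squared = ssr / sst

print(f"SSE = {sse:.4f}")
print(f"R^2 = {r_squared:.6f}")
```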
$R_a^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}$
$R_a^2 = 1 - (1 - .834179)\,\frac{20 - 1}{20 - 2 - 1} = .814671$
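The adjusted coefficient of determination can be wrapped in a small helper. The sketch below assumes the formula 1 − (1 − R²)(n − 1)/(n − p − 1) with n observations and p independent variables, and reuses the slide's values (R² = .834179, n = 20, p = 2); the function name is hypothetical.

```python
def adjusted_r_squared(r_squared: float, n: int, p: int) -> float:
    """Adjusted R^2 for n observations and p independent variables."""
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1 for the adjustment to be defined")
    return 1 - (1 - r_squared) * (n - 1) / (n - p - 1)


# Values from the slide example
r2_adj = adjusted_r_squared(0.834179, n=20, p=2)
print(f"Adjusted R^2 = {r2_adj:.6f}")
```

Because (n − 1)/(n − p − 1) is greater than 1 whenever p ≥ 1, the adjusted value is always at or below the unadjusted R².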
RIT 2023 Spring 16
Assumptions About the Error Term
• F Test
Hypotheses H0: $\beta_1 = \beta_2 = \dots = \beta_p = 0$
Ha: One or more of the parameters is not equal to zero
• F Test: example
Hypotheses H0: $\beta_1 = \beta_2 = 0$
Ha: One or both of the parameters is not equal to zero.
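The test statistic itself is missing from the extracted slide text. Assuming the standard form F = MSR/MSE, with MSR = SSR/p and MSE = SSE/(n − p − 1), a minimal sketch using the sums of squares quoted earlier in the lecture (SSR = 500.3285, SST = 599.7855, n = 20, p = 2) looks like this. The function name is hypothetical.

```python
def f_statistic(ssr: float, sse: float, n: int, p: int) -> float:
    """F = MSR / MSE for a regression with p independent variables."""
    msr = ssr / p              # mean square due to regression
    mse = sse / (n - p - 1)    # mean square error
    return msr / mse


ssr = 500.3285
sst = 599.7855
f_value = f_statistic(ssr, sst - ssr, n=20, p=2)
print(f"F = {f_value:.2f}")
```

H0 is rejected when F exceeds the critical value of the F distribution with p numerator and n − p − 1 denominator degrees of freedom at the chosen significance level.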
• t Test
Test Statistic $t = \frac{b_i}{s_{b_i}}$
• t Test: example
Hypotheses H0: $\beta_i = 0$ Ha: $\beta_i \neq 0$
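The t statistic is the estimated coefficient divided by its standard error. The sketch below illustrates the calculation; the numbers are hypothetical and are not taken from the lecture's regression output.

```python
def t_statistic(b_i: float, s_b_i: float) -> float:
    """t = b_i / s_{b_i}: estimated coefficient over its standard error."""
    if s_b_i <= 0:
        raise ValueError("standard error must be positive")
    return b_i / s_b_i


# Hypothetical values for illustration only
b1 = 1.5      # estimated coefficient
s_b1 = 0.25   # estimated standard error of b1

t_value = t_statistic(b1, s_b1)
print(f"t = {t_value:.2f}")
```

The result is compared against a t distribution with n − p − 1 degrees of freedom.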
• For simple linear regression the residual plot against $\hat{y}$ and the residual
plot against x provide the same information.
• In multiple regression analysis it is preferable to use the residual plot
against $\hat{y}$ to determine if the model assumptions are satisfied.
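The first point can be seen numerically. In simple linear regression the fitted value is a linear function of x, so plotting residuals against the fitted values only rescales the horizontal axis. The sketch below uses invented data and the closed-form least squares estimates for one predictor.

```python
from statistics import mean

# Hypothetical data, invented for illustration
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

# Closed-form least squares estimates for one predictor
x_bar, y_bar = mean(x), mean(y)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar

# Fitted values and residuals
y_hat = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, y_hat)]

# Each residual is paired with both x and y_hat; y_hat = b0 + b1 * x,
# so the two horizontal axes differ only by a shift and a scale factor.
for xi, fi, ei in zip(x, y_hat, residuals):
    print(f"x = {xi:.1f}  y_hat = {fi:.3f}  residual = {ei:+.3f}")
```

With several independent variables there is no single x to plot against, which is why the plot against the fitted values is preferred in multiple regression.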
• Excel’s Chart tools can be used to develop a scatter diagram and fit a
straight line to bivariate data.
• The estimated regression equation and the coefficient of determination
for simple linear regression can also be developed.
• The results of using Excel’s Chart tools to fit a line to the data are shown
below.
$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon$
• Excel’s Chart tools can be used to fit a polynomial curve to the data.
(Dialog box is below.)
• To get the dialog box, position the mouse pointer over any data point in
the scatter diagram and right-click.
• The estimated multiple regression equation and multiple coefficient of
determination for this second-order model are also obtained.
• Excel’s Chart tools output does not provide any means for testing the
significance of the results, so we need to use Excel’s Regression tool.
• We will treat the values of $x^2$ as a second independent variable (called
MonthSq below).
• Second Independent Variable (MonthSq) Added
We should be pleased with the fit provided by the estimated multiple regression equation.
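The idea of treating the squared term as a second independent variable can be sketched outside Excel as well. The example below builds a MonthSq column and solves the normal equations (X'X)b = X'y with a small Gaussian elimination routine. The data are hypothetical, generated from a known quadratic so the fit can be checked, and the helper names are illustrative rather than part of any library.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes the system is non-singular; no check is made for a zero pivot.
    """
    n = len(b)
    # Augmented matrix, copied so the inputs are not modified
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def least_squares(rows, y):
    """Least squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data: generated from y = 10 + 4*Month - 0.5*Month^2
months = [1, 2, 3, 4, 5, 6]
sales = [10 + 4 * m - 0.5 * m * m for m in months]

# MonthSq is simply the square of Month, used as a second predictor
month_sq = [m ** 2 for m in months]
rows = [[1.0, float(m), float(m2)] for m, m2 in zip(months, month_sq)]

b0, b1, b2 = least_squares(rows, sales)
print(f"y_hat = {b0:.3f} + {b1:.3f} Month + {b2:.3f} MonthSq")
```

The model is still linear in its coefficients, which is why an ordinary multiple regression routine can fit a curve once the squared column is supplied.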
• Problem One
Regression analysis was applied between sales data (y in $1000s) and
advertising data (x in $100s) and the following information was obtained.
The F statistic computed from the above data is:
A. 3 B. 45
C. 48 D. 50
• Problem Two
Regression analysis was applied between sales data (y in $1000s) and
advertising data (x in $100s) and the following information was obtained.
The t statistic for testing the significance of the slope is:
A. 1.80 B. 1.96
C. 6.71 D. 0.56
• The simplest case is when we have collected data for just one variable x1
and want to estimate y by using a straight-line relationship. In this case
z1 = x1.
• This model is called a simple first-order model with one predictor
variable.
$y = \beta_0 + \beta_1 x_1 + \varepsilon$
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \varepsilon$
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_2^2 + \beta_5 x_1 x_2 + \varepsilon$
• In this model, the variable z5 = x1x2 is added to account for the potential
effects of the two variables acting together.
• This type of effect is called interaction.
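A short sketch shows what interaction means in practice: with a nonzero coefficient on x1x2, the change in E(y) for a one-unit increase in x1 depends on the level of x2. The coefficients below are hypothetical, and the ordering z1 = x1, z2 = x2, z3 = x1², z4 = x2², z5 = x1x2 is inferred from the second-order model in this lecture.

```python
def second_order_terms(x1: float, x2: float) -> list:
    """Return [1, z1, z2, z3, z4, z5] for the second-order model with interaction."""
    return [1.0, x1, x2, x1 ** 2, x2 ** 2, x1 * x2]


def expected_y(betas, x1: float, x2: float) -> float:
    """E(y) = beta0 + beta1*z1 + ... + beta5*z5."""
    return sum(b * z for b, z in zip(betas, second_order_terms(x1, x2)))


# Hypothetical coefficients beta0..beta5 (squared terms set to zero for clarity)
betas = [1.0, 2.0, 3.0, 0.0, 0.0, 0.5]

# Effect of raising x1 from 0 to 1 at two different levels of x2
effect_at_x2_0 = expected_y(betas, 1, 0) - expected_y(betas, 0, 0)
effect_at_x2_4 = expected_y(betas, 1, 4) - expected_y(betas, 0, 4)

print(effect_at_x2_0, effect_at_x2_4)
```

If the coefficient on x1x2 were zero, the two effects would be equal.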
$E(y) = \beta_0 \beta_1^{x}$
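Reading the last equation as the exponential model E(y) = β0 β1^x (an assumption about the garbled notation), taking logarithms gives log E(y) = log β0 + x log β1, which is linear in x. The sketch below fits that transformed model with the closed-form simple regression estimates and converts back. The data are hypothetical, generated exactly from β0 = 2 and β1 = 1.5 so the recovery can be checked.

```python
import math
from statistics import mean

# Hypothetical data generated from E(y) = 2 * 1.5**x
x = [0, 1, 2, 3, 4]
y = [2 * 1.5 ** xi for xi in x]

# Transform: log y is linear in x
log_y = [math.log(yi) for yi in y]

# Simple linear regression of log y on x
x_bar, ly_bar = mean(x), mean(log_y)
slope = sum((xi - x_bar) * (li - ly_bar) for xi, li in zip(x, log_y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
intercept = ly_bar - slope * x_bar

# Back-transform to the original parameters
beta0 = math.exp(intercept)
beta1 = math.exp(slope)
print(f"beta0 = {beta0:.4f}, beta1 = {beta1:.4f}")
```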