Lecture 10: THE LINEAR REGRESSION MODEL
The population regression model can be written as:
$$Y_i = A + B X_i + e_i$$
where $Y_i$ is the ith observed value of the dependent variable, $X_i$ is the ith observed value of the independent variable, and $e_i$ is a normally distributed random variable with mean zero and standard deviation $\sigma_e$. Because of the error term $e_i$, the observed values of $Y_i$ fall around the population regression line $A + B X_i$, not on it.
Figure 12.4 (page 457) illustrates the situation.
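The scatter of observations around the population line can also be seen in a small simulation. The sketch below is illustrative only: the values chosen for A, B and the error standard deviation are made up, and the function name simulate_population is hypothetical rather than taken from the textbook.

```python
import random

def simulate_population(A, B, sigma_e, xs, seed=0):
    """Draw Y_i = A + B*X_i + e_i, with e_i ~ Normal(0, sigma_e)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [A + B * x + rng.gauss(0.0, sigma_e) for x in xs]

# Made-up parameter values, chosen only for illustration
A, B, sigma_e = 2.0, 0.5, 1.0
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = simulate_population(A, B, sigma_e, xs)

for x, y in zip(xs, ys):
    on_line = A + B * x  # point on the population regression line
    print(f"X={x}  line={on_line:.2f}  observed Y={y:.2f}  error={y - on_line:+.2f}")
```

Each observed Y differs from the value on the line by its own random error, so the points fall around the line and not on it.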
The sample regression line:
To carry out a regression analysis, we must obtain the mathematical equation of a line that describes the average relationship between the dependent variable and the independent variable. This line is calculated from the sample of observations and is called the sample (or estimated) regression line.
The sample regression line is
$$\hat{Y}_i = a + b X_i,$$
where $\hat{Y}_i$ is the value of the dependent variable predicted by the regression line, and a and b are the estimated values of A and B.
Figure 12.5 gives the estimated regression line based on the data in Table 12.1:
$$\hat{Y}_i = 1.266 + 0.752 X_i.$$
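It is important to be able to interpret a (the intercept) and b (the slope): a is the predicted value of the dependent variable when $X_i = 0$, and b is the change in the predicted value for a one-unit increase in $X_i$.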
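A short Python sketch can make the two interpretations concrete. It uses the estimated coefficients 1.266 and 0.752 quoted above. The function name predict and the X values tried are illustrative assumptions, not part of the textbook example.

```python
A_HAT = 1.266  # estimated intercept a, as quoted in the notes
B_HAT = 0.752  # estimated slope b, as quoted in the notes

def predict(x):
    """Value of the dependent variable predicted by the sample regression line."""
    return A_HAT + B_HAT * x

# Intercept: the predicted value when X is zero
print("prediction at X=0:", predict(0))

# Slope: the change in the prediction when X increases by one unit
# (X=10 and X=11 are arbitrary illustrative values)
print("change from X=10 to X=11:", round(predict(11) - predict(10), 3))
```

Whether the intercept has a practical meaning depends on whether X = 0 lies within the range of the observed data.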
The Method of Least Squares:
The method of least squares chooses the line for which the sum of the squared vertical deviations of the observed points from the line is a minimum.
See Figure 2.6.
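The least-squares criterion can be sketched in Python by computing the sum of squared deviations for two candidate lines. The data and both candidate lines are hypothetical (they are not from the textbook tables), and the function name is an illustrative choice.

```python
def sum_squared_deviations(a, b, xs, ys):
    """Sum of squared vertical deviations of the points from the line a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical sample, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Two candidate lines, written as (intercept, slope)
candidates = [(1.0, 1.0), (2.2, 0.6)]

for a, b in candidates:
    total = sum_squared_deviations(a, b, xs, ys)
    print(f"a={a}, b={b}: sum of squared deviations = {total:.2f}")
```

Under the least-squares criterion, the candidate with the smaller sum is the better-fitting line.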
It can be shown that b and a can be calculated using the formulae:
$$b = \frac{n \sum X_i Y_i - (\sum X_i)(\sum Y_i)}{n \sum X_i^2 - (\sum X_i)^2}$$
$$a = \frac{\sum Y_i}{n} - b \, \frac{\sum X_i}{n}$$
See example in Table 12.2.
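The formulae for b and a translate directly into a few lines of Python. This is a minimal sketch: the function name least_squares is hypothetical, and the data are made up for illustration rather than taken from Table 12.2.

```python
def least_squares(xs, ys):
    """Return (a, b) for the line a + b*x using the sums-based formulae."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        # All X values are identical, so the slope is undefined
        raise ValueError("X values must not all be equal")

    b = (n * sum_xy - sum_x * sum_y) / denominator
    a = sum_y / n - b * sum_x / n
    return a, b

# Hypothetical sample, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares(xs, ys)
print(f"a = {a:.3f}, b = {b:.3f}")
```

Working through the sums by hand for a small data set like this one is a useful check before attempting the exercises.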
Do exercises 12.1, 12.2, 12.3, 12.4, 12.5, 12.6, and 12.7 on pages 464-467.