Lecture 4



 Measures of Dispersion:

The arithmetic average by itself is not an adequate summary measure. Often one also needs to know how much the individual measurements vary around the mean.

A comparison of the average yearly temperatures of Albany and San Francisco may not tell the complete story. Albany's temperature fluctuates a lot more than San Francisco's, even though the average temperatures may be very similar.
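A small numerical illustration of this point, using made-up numbers rather than actual temperature records for either city: the two hypothetical series below share the same mean, yet one swings far more widely than the other. The variable names are assumptions chosen for this sketch.

```python
from statistics import mean

# Hypothetical readings (illustrative only, not real climate data).
steady = [48, 50, 52, 50, 48, 52]
variable = [20, 35, 50, 65, 80, 50]

# Both series have the same arithmetic mean.
mean_steady = mean(steady)
mean_variable = mean(variable)

# Their spreads differ sharply; measured here as largest minus smallest value.
spread_steady = max(steady) - min(steady)
spread_variable = max(variable) - min(variable)

print("means:", mean_steady, mean_variable)
print("spreads:", spread_steady, spread_variable)
```

Both means come out to 50, while the spreads are 4 and 60, so the mean alone cannot distinguish the two series.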

Figure 2.4 (page 54): Histogram of grades

Alternative measures of Dispersion:

Distance measures:

Range = largest value – smallest value

Interquartile range: Defined as the difference between the third and the first quartiles. The third quartile is the value such that 75% of the observations lie below it; the first quartile is the value such that 25% of the observations lie below it. Thus, the interquartile range is the spread of the middle 50% of the observations.
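The range and the interquartile range can be sketched in a few lines of Python. This is only an illustration on hypothetical data: textbooks and software use slightly different conventions for locating quartiles, and the choice of `method="inclusive"` in the standard library's `statistics.quantiles` is an assumption made here, not necessarily the convention of the course text.

```python
from statistics import quantiles

# Hypothetical observations (illustrative only).
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Range: largest value minus smallest value.
data_range = max(data) - min(data)

# quantiles(..., n=4) returns the three cut points that split the data
# into four equal parts; only the first and third are needed here.
q1, _, q3 = quantiles(data, n=4, method="inclusive")

# Interquartile range: spread of the middle 50% of the observations.
iqr = q3 - q1

print("range:", data_range)
print("first quartile:", q1, "third quartile:", q3)
print("interquartile range:", iqr)
```

For this data the range is 8, the quartiles are 3 and 7, and the interquartile range is 4. Unlike the range, the interquartile range is unaffected by a single extreme value at either end.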

What is the second quartile?

Percentiles:

Other distance measures can be based on percentiles, e.g., the difference between the 90th and 10th percentiles.
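As a sketch of such a percentile-based measure, the snippet below computes the difference between the 90th and 10th percentiles of a hypothetical set of scores. It assumes the `"inclusive"` interpolation rule of Python's `statistics.quantiles`; other percentile conventions can give slightly different values.

```python
from statistics import quantiles

# Hypothetical scores (illustrative only).
scores = [35, 42, 50, 55, 61, 64, 70, 73, 78, 85, 96]

# n=10 returns the nine deciles; the first is the 10th percentile
# and the last is the 90th percentile.
deciles = quantiles(scores, n=10, method="inclusive")
p10 = deciles[0]
p90 = deciles[-1]

# Spread of the middle 80% of the observations.
spread_10_90 = p90 - p10

print("10th percentile:", p10)
print("90th percentile:", p90)
print("90th minus 10th:", spread_10_90)
```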

VARIANCE and STANDARD DEVIATION:

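Variance is the arithmetic mean of the squared deviations of the observations from their mean. The population variance is denoted by $\sigma^2$:

$$\sigma^2 = \frac{(X_1 - \mu)^2 + (X_2 - \mu)^2 + (X_3 - \mu)^2 + \dots + (X_N - \mu)^2}{N}$$

where $\mu$ is the population mean and $N$ is the number of observations in the population.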

 

Variance is a measure of dispersion.

The standard deviation, denoted $\sigma$, is the square root of the variance. It has the same unit as the measurements, whereas the variance is in squared units.
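The definitions above translate directly into code. The following minimal sketch computes the population variance and standard deviation step by step on hypothetical data, then checks the result against the standard library's `statistics.pvariance`. The function name `population_variance` is an assumption introduced for this illustration.

```python
from math import sqrt
from statistics import pvariance

def population_variance(xs):
    """Mean of the squared deviations from the mean (divides by N)."""
    n = len(xs)
    mu = sum(xs) / n
    return sum((x - mu) ** 2 for x in xs) / n

# Hypothetical observations (illustrative only).
data = [2, 4, 4, 4, 5, 5, 7, 9]

variance = population_variance(data)
std_dev = sqrt(variance)

# Cross-check against the library's population variance.
library_variance = pvariance(data)

print("variance:", variance)
print("standard deviation:", std_dev)
print("library variance:", library_variance)
```

Here the mean is 5, the squared deviations sum to 32, the variance is 32/8 = 4, and the standard deviation is 2.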

Calculation of Standard Deviation:

Table 2.4 (page 59)

Calculation of Standard Deviation from Grouped Data

 

Table 2.5 (page 60).
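For grouped data, a common approach (assumed here; the layout of Table 2.5 may differ) is to treat every observation in a class as if it sat at the class midpoint, weighting each midpoint by its frequency: $\mu \approx \sum f_i m_i / N$ and $\sigma^2 \approx \sum f_i (m_i - \mu)^2 / N$, with $N = \sum f_i$. The midpoints and frequencies below are hypothetical, and the names `midpoints` and `frequencies` are assumptions made for this sketch.

```python
from math import sqrt

# Hypothetical grouped data: class midpoints and class frequencies.
midpoints = [55, 65, 75, 85, 95]
frequencies = [2, 5, 8, 4, 1]

# Total number of observations.
n = sum(frequencies)

# Frequency-weighted mean of the class midpoints.
grouped_mean = sum(f * m for f, m in zip(frequencies, midpoints)) / n

# Frequency-weighted mean of squared deviations from the grouped mean.
grouped_variance = sum(
    f * (m - grouped_mean) ** 2 for f, m in zip(frequencies, midpoints)
) / n

grouped_std_dev = sqrt(grouped_variance)

print("N:", n)
print("mean:", grouped_mean)
print("variance:", grouped_variance)
print("standard deviation:", round(grouped_std_dev, 2))
```

With these numbers N is 20, the mean is 73.5, and the variance is 102.75. Because the midpoints stand in for the actual values, the result only approximates the standard deviation of the ungrouped observations.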