Boxplots and Outliers

A boxplot is a graph that shows central tendency, the spread and shape of the distribution, and the presence of extreme values (outliers). The boxplots drawn in the text are built from five numbers: the lowest score, the first quartile, the median (second quartile), the third quartile, and the largest score. Below is a boxplot for the five-mile race data that we have used before.

[Maple Plot]

The lower horizontal edge of the box corresponds to the first quartile, and the upper horizontal edge corresponds to the third quartile. The horizontal line inside the box marks the median (second quartile), which is just under 44. The largest value, 75.028, is visible on the vertical axis. The smallest value, 26.77, is not visible because it lies close to one of the short horizontal lines.
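
The five numbers behind a boxplot can be computed directly. The following is a minimal sketch in Python, not the Maple commands used to draw the plot above. The function name five_number_summary and the sample times are hypothetical and are not the race data. Quartile conventions differ between textbooks and software, so the "inclusive" method used here is only one reasonable choice and may not match the values in the text.

```python
import statistics

def five_number_summary(values):
    """Return (lowest, Q1, median, Q3, largest) for a list of numbers."""
    data = sorted(values)
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    # method="inclusive" is one of several quartile conventions (an assumption).
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, q2, q3, data[-1]

# Hypothetical times in minutes, made up for illustration only.
times = [27, 35, 38, 41, 44, 46, 49, 52, 75]
print(five_number_summary(times))
```

The second and fourth numbers in the result give the bottom and top edges of the box, and the middle number gives the line drawn inside it.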

Note that this graph has two short horizontal lines that do not appear on the graphs in the book. These two lines help identify outliers, or extreme values. The graph above shows several large outliers and no small outliers.

An outlier may be an exceptional value or a data error. In this case the data are not in error: the people with the exceptional times were actually race walkers.
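
One common way to place outlier lines like the two short horizontal lines described above is the 1.5 x IQR rule: a value counts as an outlier if it lies more than 1.5 times the interquartile range below the first quartile or above the third quartile. The text does not say which rule the plot uses, so treat the sketch below as an assumption. The names outlier_fences and find_outliers, the multiplier k, and the sample times are all hypothetical.

```python
import statistics

def outlier_fences(values, k=1.5):
    """Return (low, high) cutoffs using the k * IQR rule (assumed, k=1.5)."""
    q1, _, q3 = statistics.quantiles(sorted(values), n=4, method="inclusive")
    iqr = q3 - q1  # interquartile range: the height of the box
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(values, k=1.5):
    """Split the outliers into (small, large) lists."""
    low, high = outlier_fences(values, k)
    small = [v for v in values if v < low]
    large = [v for v in values if v > high]
    return small, large

# Hypothetical times in minutes, made up for illustration only.
times = [27, 35, 38, 41, 44, 46, 49, 52, 75]
print(outlier_fences(times))
print(find_outliers(times))
```

With these made-up times the only value flagged is the largest one, which mirrors the pattern in the race data: large outliers and no small ones. Whether a flagged value is an error or a genuine exceptional case still has to be checked against how the data were collected.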
