Last month we showed the X chart in figure 1. The four lowest values and the three highest values were seen to be “outliers” when we looked at the histogram. When we fitted a bell-shaped curve to the histogram, the outliers corrupted the model and resulted in a poor fit. Yet we used all the data to compute the limits seen in figure 1. How can the outliers corrupt one computation but not another?
The answer lies in how we compute limits for the X chart. The central line is commonly taken to be the average value. Now, while it’s true that the average may be influenced by extreme values, this effect is generally smaller than you might expect. In this case, deleting the seven outliers would change the average only from 595.4 to 595.6. The average value is a very robust measure of location. However, in cases where we think the average may have been unduly influenced by extreme values, we can always use the median value instead. In this case, the median is 596. Thus, one way or another, we’re going to have a reasonable estimate of location regardless of the outliers.
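To see how little a handful of extreme values moves the average, here is a minimal sketch in Python. The numbers are hypothetical, chosen only to mimic a data set with four low and three high outliers. They are not the values plotted in figure 1, so the results will not reproduce the 595.4, 595.6, or 596 quoted above. The name `location_summary` is ours, introduced only for illustration.

```python
from statistics import mean, median

# Hypothetical data, NOT the values plotted in figure 1.
core = [v for v in range(590, 601) for _ in range(8)]  # 88 routine values
low_outliers = [560, 565, 570, 572]   # four unusually low values
high_outliers = [620, 625, 630]       # three unusually high values
all_values = low_outliers + core + high_outliers


def location_summary(values):
    """Return the average and the median of a list of values."""
    return {"average": mean(values), "median": median(values)}


with_outliers = location_summary(all_values)
without_outliers = location_summary(core)

# How far do the seven outliers pull the average?
shift = abs(with_outliers["average"] - without_outliers["average"])

print(f"average with outliers:    {with_outliers['average']:.2f}")
print(f"average without outliers: {without_outliers['average']:.2f}")
print(f"shift in the average:     {shift:.2f}")
print(f"median with outliers:     {with_outliers['median']}")
```

With these made-up values the seven outliers shift the average by less than half a unit, and the median does not move at all. Either statistic gives a usable central line.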
…