## Boxplots and Outliers

### The Box Plot

• Here are the directions for drawing a box plot:

1. Compute Q1, Q2 and Q3. Also, compute the interquartile range IQR = Q3 - Q1.

Example:  Suppose that the dataset consists of these hypothetical test scores:

5  39  75  79  85  90  91  93  93  98

Taking Q1 and Q3 as the medians of the lower and upper halves of the data, Q1 = 75, Q2 = 87.5, Q3 = 93. IQR = 93 - 75 = 18.
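As a rough check of Step 1, the quartiles can be computed in a few lines of Python. This is only a sketch: it assumes the median-of-halves convention (Q1 and Q3 are the medians of the lower and upper halves of the sorted data), and the function name `quartiles` is made up for illustration. Other quartile conventions can give slightly different values.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves convention (an assumption)."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]            # values below the median position
    upper = xs[half + n % 2:]    # values above it (skips the middle value when n is odd)
    return median(lower), median(xs), median(upper)

scores = [5, 39, 75, 79, 85, 90, 91, 93, 93, 98]
q1, q2, q3 = quartiles(scores)
iqr = q3 - q1
print(q1, q2, q3, iqr)  # 75 87.5 93 18
```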

2. Draw three horizontal lines, all of the same length and all starting at the same x-value: one at height Q1, the second at Q2 and the third at Q3.

Example:  Here is the boxplot after Step 2.

3. Draw two vertical lines, one connecting the left endpoints of the horizontal lines and the other connecting their right endpoints.

Example:  Here is the updated boxplot after Step 3.

4. Compute the inner fences IF1 = Q1 - 1.5 * IQR and IF2 = Q3 + 1.5 * IQR.

Example:   The inner fences are

IF1 = 75 - 1.5 * 18 = 48 and IF2 = 93 + 1.5 * 18 = 120.

5. Draw a whisker downward from Q1 to IF1 or Q0 (the minimum), whichever comes first. Draw a whisker upward from Q3 to IF2 or Q4 (the maximum), whichever comes first.

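Example:   Here is the boxplot after adding the whiskers in Step 5. The lower whisker stops at IF1 because the minimum, 5, lies beyond it; the upper whisker stops at the maximum, 98, because it lies inside IF2.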
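The whisker rule in Step 5 can be sketched as follows. The helper name `whisker_ends` is hypothetical, and the quartiles Q1 = 75 and Q3 = 93 are taken from the median-of-halves convention for the example scores. Note that this follows the rule as stated here (stop at the fence or at the minimum/maximum); many software packages instead end each whisker at the most extreme observation inside the fence.

```python
def whisker_ends(data, q1, q3):
    """Return (low, high) whisker endpoints under the Step 5 rule."""
    iqr = q3 - q1
    if1 = q1 - 1.5 * iqr          # lower inner fence
    if2 = q3 + 1.5 * iqr          # upper inner fence
    low = max(min(data), if1)     # going down from Q1: fence or minimum, whichever comes first
    high = min(max(data), if2)    # going up from Q3: fence or maximum, whichever comes first
    return low, high

scores = [5, 39, 75, 79, 85, 90, 91, 93, 93, 98]
low, high = whisker_ends(scores, q1=75, q3=93)
print(low, high)  # 48.0 98
```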

6. Compute the outer fences OF1 = Q1 - 3 * IQR and OF2 = Q3 + 3 * IQR.

Example:   The outer fences are

OF1 = 75 - 3 * 18 = 21 and OF2 = 93 + 3 * 18 = 147.
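Both pairs of fences come from the same formula with a different multiplier, so the arithmetic is easy to check in code. The sketch below uses a hypothetical helper `fences` and assumes Q1 = 75 and Q3 = 93 (so IQR = 18) for the example scores.

```python
def fences(q1, q3, k):
    """Return (lower, upper) fences placed k * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

q1, q3 = 75, 93                    # assumed quartiles of the example scores
inner = fences(q1, q3, 1.5)        # IF1, IF2
outer = fences(q1, q3, 3)          # OF1, OF2
print(inner, outer)  # (48.0, 120.0) (21, 147)
```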

7. Extreme outliers are observations that are beyond one of the outer fences OF1 or OF2. Mark any extreme outliers on the boxplot with an asterisk (*).

Example:   The only observation less than OF1 = 21 is 5. Here is the boxplot after marking 5 with a *.

8. Mild outliers are observations that lie between an inner fence and the outer fence on the same side. Mark any mild outliers on the boxplot with a circle (O).

Example:   The only observation that is between an inner fence and an outer fence is 39, which is between OF1 = 21 and IF1 = 48. Here is the boxplot after marking 39 with an O.

• Compare your boxplot with one constructed by SPSS from the same data.

### Mild vs. Extreme Outliers

• Extreme outliers are data points that are less than Q1 - 3 * IQR or greater than Q3 + 3 * IQR.

• Extreme outliers are marked with an asterisk (*) on the boxplot.

• Mild outliers are data points that are less than Q1 - 1.5 * IQR or greater than Q3 + 1.5 * IQR, but are not extreme outliers.

• Mild outliers are marked with a circle (O) on the boxplot.
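The two definitions above translate directly into a small classifier. This is a sketch under stated assumptions: the function name `classify_outliers` is invented for illustration, the quartiles are passed in rather than computed, and a point lying exactly on a fence is treated as not beyond it (the text does not say how to handle that boundary case).

```python
def classify_outliers(data, q1, q3):
    """Split data into (mild, extreme) outliers using the 1.5 * IQR and 3 * IQR fences."""
    iqr = q3 - q1
    inner_lo, inner_hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outer_lo, outer_hi = q1 - 3 * iqr, q3 + 3 * iqr
    # Extreme: strictly beyond an outer fence (would be marked with *).
    extreme = [x for x in data if x < outer_lo or x > outer_hi]
    # Mild: beyond an inner fence but not beyond the outer fence (would be marked with O).
    mild = [x for x in data
            if outer_lo <= x < inner_lo or inner_hi < x <= outer_hi]
    return mild, extreme

scores = [5, 39, 75, 79, 85, 90, 91, 93, 93, 98]
mild, extreme = classify_outliers(scores, q1=75, q3=93)
print(mild, extreme)  # [39] [5]
```

With the example scores this reproduces the markings from Steps 7 and 8: 39 is a mild outlier and 5 is an extreme outlier.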