Box Plot For Excel
Excel 2013 and earlier versions don't offer a built-in box-and-whisker chart. You can, however, cajole
a type of Excel chart into boxes and whiskers. Rather than showing the mean and the standard error, the box-and-whisker plot shows
the minimum, first quartile, median, third quartile, and maximum of a set of data. Statisticians refer to
this set of statistics as a five-number summary.
You represent each five-number summary as a box with whiskers. The box is bounded on the top
by the third quartile, and on the bottom by the first quartile. The median divides the box. How you lay
out the chart determines the width of the box. The whiskers are error bars: One extends upward from
the third quartile to the maximum, and the other extends downward from the first quartile to the
minimum.
Notice that the median isn't necessarily in the middle of the box and the whiskers aren't necessarily
the same length.
The first order of business is to put data into a worksheet and start computing some statistics. The
following figure shows the worksheet and the statistics.
The next group of statistics holds the values for the five-number summary. You can use MIN to find
the minimum value for each year, and MAX to find the maximum value. QUARTILE.INC computes
the first quartile and the third quartile. Not surprisingly, MEDIAN determines the median.
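To check these worksheet values outside Excel, the same five-number summary can be sketched in Python. The function name and the sample data below are hypothetical and are not taken from the worksheet. The sketch assumes that statistics.quantiles with method="inclusive" interpolates the same way QUARTILE.INC does.

```python
from statistics import median, quantiles


def five_number_summary(values):
    """Return min, Q1, median, Q3, and max for a list of numbers."""
    # method="inclusive" treats the data as the whole population,
    # which is assumed here to mirror Excel's QUARTILE.INC.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),        # Excel: MIN
        "q1": q1,                  # Excel: QUARTILE.INC(range, 1)
        "median": median(values),  # Excel: MEDIAN
        "q3": q3,                  # Excel: QUARTILE.INC(range, 3)
        "max": max(values),        # Excel: MAX
    }


# Hypothetical sample standing in for one column (one year) of data.
sample = [2, 4, 4, 5, 7, 9, 10, 12]
summary = five_number_summary(sample)
print(summary)
```

Each dictionary entry corresponds to one row in the five-number-summary group of the worksheet, computed for a single column.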
The final group of statistics holds the values you put directly into the box-and-whisker plot. Why is
this group necessary?
You can turn a Stacked Column chart into a box-and-whisker plot. In a stacked column, each
segment's size is proportional to how much it contributes to the size of the column. In a box-and-whisker box, however, the size of a segment represents a difference between one value and another,
like the difference between the third quartile and the median, or between the median and the first
quartile.
So the box is really a stacked column with three segments. The first segment is the first quartile. The
second is the difference between the median and the first quartile. The third is the difference
between the third quartile and the median.
But wait. Won't that just look like a column that starts at the x-axis? Not after you make the first
segment disappear!
The other two differences, between the maximum and the third quartile and between the first
quartile and the minimum, become the whiskers.
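The arithmetic behind this final group of statistics can be sketched in Python as follows. The function name, the dictionary keys, and the input numbers are hypothetical illustrations. Only the subtractions themselves come from the description above.

```python
def plot_values(minimum, q1, med, q3, maximum):
    """Turn a five-number summary into stacked-column segments and whiskers."""
    return {
        # Stacked-column segments, bottom to top.
        "q1": q1,                        # first segment, later made invisible
        "median_minus_q1": med - q1,     # lower part of the box
        "q3_minus_median": q3 - med,     # upper part of the box
        # Error-bar lengths used for the whiskers.
        "q1_minus_min": q1 - minimum,    # lower whisker (negative error value)
        "max_minus_q3": maximum - q3,    # upper whisker (positive error value)
    }


# Hypothetical five-number summary: min, Q1, median, Q3, max.
segments = plot_values(2, 4.0, 6.0, 9.25, 12)
print(segments)
```

Adding the three segments gives the third quartile, which is why the top of the stacked column lands at the top of the box once the first segment is hidden.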
Follow these steps after you calculate all the statistics:
1.
2. Select INSERT | Recommended Charts, and then select the sixth option to add a
stacked column chart to the worksheet.
The fourth option in the Recommended Charts is also a stacked column chart. Don't select that
one. Its rows and columns are reversed.
The following figure shows what the stacked column chart looks like after you insert it, delete
the gridlines, move the legend, remove Chart Title, and reformat and title the axes. The figure
also shows the chart toolset to the right of the chart.
3.
This opens the Format Error Bars panel. Select the Minus radio button, the Cap radio button,
and the Custom radio button.
Then click the Specify Value button to open the Custom Error Bars dialog box. Leaving the
Positive Error Value as is, specify the cell range for the Negative Error Value. For this
worksheet, that's B20:D20 (Q1-Minimum).
4.
Clicking OK closes this dialog box, and clicking the Close symbol closes the Format
Error Bars panel.
Follow similar steps to add the upper whiskers. This time select the part of the stacked columns
corresponding to Q3-Median (the upper portion of each stacked column). Then as earlier, click
the Plus Sign in the chart toolset.
Again, select the box next to Error Bars in the pop-up menu, and the arrowhead to the right of
that option. This time in the Format Error Bars panel, select the Plus radio button, the Cap radio
button, and the Custom radio button.
Again, click the Specify Value button to open the Custom Error Bars dialog box. This time,
specify the cell range for the Positive Error Value. That cell range is B24:D24 (Max-Q3). Click
OK and Close.
5.
6.
Notice that after you finish working with the Format Data Series panel for one data series, you can
leave it open. Then select another data series in the chart and start formatting it. Unlike earlier
versions of Excel (which worked with dialog boxes rather than panels), you don't have to close the
formatting panel and reopen it each time you want to format a data series.