### Lowering the bar

^{3}) and they were something like:

92.5, 91, 93.5, 92

And here's what Excel produced:

The vertical axis starts at 89.5, so the height of each bar represents the density minus 89.5, which means ... ??
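To make the distortion concrete, here is a small Python sketch using the four values quoted above and the 89.5 axis start. The function names are my own, chosen for illustration; the point is only to compare the ratio of drawn bar heights with the ratio of the underlying values.

```python
# Values quoted in the post, and the axis start that Excel chose by default.
densities = [92.5, 91.0, 93.5, 92.0]
axis_start = 89.5


def bar_heights(values, baseline):
    """Height of each bar as drawn, measured from the axis baseline."""
    return [v - baseline for v in values]


def apparent_ratio(values, baseline):
    """Ratio of the tallest drawn bar to the shortest drawn bar."""
    heights = bar_heights(values, baseline)
    return max(heights) / min(heights)


truncated = apparent_ratio(densities, axis_start)  # 4.0 / 1.5
from_zero = apparent_ratio(densities, 0.0)         # 93.5 / 91.0

print(f"axis from 89.5: tallest bar is {truncated:.2f}x the shortest")
print(f"axis from 0:    tallest bar is {from_zero:.3f}x the shortest")
```

With the truncated axis the tallest bar is drawn about 2.67 times as tall as the shortest, although the largest value is under 3% bigger than the smallest.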

Junk Charts quotes Naomi Robbins, author of Creating More Effective Graphs thus: "all bar charts must include zero". Indeed—otherwise what do the bar heights represent? That Excel's defaults violate this rule is, ahem, unfortunate. (I've tried this using Excel 2000 and Excel on a Mac, but perhaps it's been fixed in newer versions? Maybe?)

Excel can be coerced into starting its vertical axis at 0, but it takes a fair bit of clicking and navigating. The result is:

Relative to a density of zero, there's very little variation. But perhaps this hides the message in these numbers. Doesn't that just bring us back to the first bar chart? Well ... no.

This graph shows the data, with the vertical axis zoomed in to where the action is. Unlike the original bar chart, it doesn't show bars with arbitrary heights.
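A graph of this kind can be sketched in Python with matplotlib, as one possible alternative to Excel. The sample labels "A" to "D", the 10% padding, and the output file name are all assumptions made for illustration; the original post does not say how its graph was produced.

```python
def zoomed_limits(values, pad_fraction=0.1):
    """Axis limits that hug the data, padded by a fraction of the data range."""
    lo, hi = min(values), max(values)
    pad = (hi - lo) * pad_fraction
    return lo - pad, hi + pad


def dot_plot(values, labels, path="dot_plot.png"):
    """Draw one marker per value; position, not bar height, carries the message."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(labels, values, "o")         # markers only, no bars and no connecting line
    ax.set_ylim(*zoomed_limits(values))  # zoom in to where the action is
    ax.set_ylabel("Density")
    fig.savefig(path)
    plt.close(fig)


densities = [92.5, 91.0, 93.5, 92.0]
print(zoomed_limits(densities))
```

Calling `dot_plot(densities, ["A", "B", "C", "D"])` would write the figure to `dot_plot.png`. Because only the position of each marker is read, a vertical axis that does not include zero misleads nobody.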

Again from Junk Charts:

> The "start-at-0" rule says that the vertical axis of any graph ought to start at value 0. The rule was mentioned in Huff's classic booklet, "How to Lie with Statistics": as the name implies, the rule is intended to eradicate mischievous graphs that exaggerate small differences by not starting at 0, which is to say, by choosing a misleading scale.
>
> Others, like Tufte and Wainer, have long realized that the start-at-0 rule is not absolute ... My own "anti-rule" stipulates that if all data appearing in a chart are far from 0, then don't start at 0.
>
> If, on the other hand, some of the plotted data are close to 0, then it is essential to start at 0.

This isn't too far from my view, but it doesn't address bar charts, which are a special case because they emphasize the heights of the bars, rather than the position of the tops of bars. Bar charts are only appropriate for variables that are measured on ratio scales. For such variables, there is a non-arbitrary zero, which means that you can calculate a meaningful ratio; e.g. for weight: one thing might weigh twice as much as another. But some variables aren't like that; e.g. IQ: an IQ of zero is meaningless, and so it doesn't make sense to say that someone with an IQ of 100 is twice as intelligent as someone with an IQ of 50. For variables of this kind bar charts make no sense at all.

So, if your variable isn't ratio scaled (in other words, there isn't a meaningful zero), don't use a bar chart. If it is ratio scaled and you decide to use a bar chart, make sure your axis starts at zero.
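That rule of thumb is simple enough to write down as a tiny Python function. This is only a sketch of the reasoning: the name `plot_advice` is hypothetical, and suggesting a dot plot as the fallback is my assumption about what to use in place of bars.

```python
def plot_advice(is_ratio_scale, wants_bar_chart):
    """Return (chart_kind, axis_must_include_zero) for a variable.

    Bars are allowed only for ratio-scaled variables, and then the axis
    must include zero. Otherwise fall back to a dot plot (an assumed
    alternative), whose axis is free to zoom in on the data.
    """
    if wants_bar_chart and is_ratio_scale:
        return ("bar", True)
    return ("dot", False)


# Weight has a meaningful zero, so bars are acceptable if the axis starts at 0.
print(plot_advice(is_ratio_scale=True, wants_bar_chart=True))
# IQ has no meaningful zero, so bars are ruled out even if requested.
print(plot_advice(is_ratio_scale=False, wants_bar_chart=True))
```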

Derek puts it well in a comment at Pictures of Numbers:

> There is a circumstance in which the would-be grapher absolutely must start with zero, and that's when creating a bar graph. If that causes problems, it's time to consider abandoning the bar graph and adopting something which doesn't need a zero on the scale. I've seen bar graphs where the designer recognised the problem with zero, adopted and defended the solutions, but without getting rid of the bar graph format. Those wavy gaps are the least bad of the abortive compromises resorted to by people who won't give up their bars.

In case anyone thinks this really isn't much of an issue, here are some examples I found quite easily:
