### To bar or not to bar?

Bar charts are sometimes used to show percentages, like this:

Looks like a trend, right? The top of each bar gives a point estimate of the percentage at each time. But to evaluate whether the apparent trend is more than the play of chance, we need to consider the precision of the estimates. So I often recommend showing confidence intervals around the point estimates, like this:

The overlap of the confidence intervals is considerable, which suggests that the data are consistent with there being no trend at all. In SPSS, the chi-square trend test ("Linear-by-Linear Association") gives a 2-sided p-value of 0.229, so it's clearly not statistically significant. (The test is not significant, and the intervals are wide, because the proportions are based on very small denominators.)
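
For anyone who wants to try this kind of check outside SPSS, here is a minimal Python sketch. It assumes Wilson score intervals (the post does not say which interval method the figure uses) and computes the linear-by-linear association statistic as M² = (N − 1)r² on 1 degree of freedom, with equally spaced time scores. The counts are hypothetical, chosen only to mimic small denominators; they are not the data behind the charts, and the function names are illustrative.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


def linear_by_linear(successes, totals, scores=None):
    """Trend test for a 2 x k table: M^2 = (N - 1) * r^2, chi-square on 1 df.

    r is the Pearson correlation between the time score and the 0/1 outcome.
    Returns (M^2, two-sided p-value). Assumes the outcome is not constant.
    """
    if scores is None:
        scores = list(range(1, len(totals) + 1))  # assumed equally spaced
    big_n = sum(totals)
    mean_x = sum(s * n for s, n in zip(scores, totals)) / big_n
    mean_y = sum(successes) / big_n
    sxx = sum(n * (s - mean_x) ** 2 for s, n in zip(scores, totals))
    syy = big_n * mean_y * (1 - mean_y)
    sxy = sum((s - mean_x) * (k - n * mean_y)
              for s, k, n in zip(scores, successes, totals))
    m2 = (big_n - 1) * sxy ** 2 / (sxx * syy)
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(m2 / 2))
    return m2, p_value


# Hypothetical counts (NOT the data from the post): events out of 5 per time
events = [1, 2, 2, 3]
totals = [5, 5, 5, 5]

for t, (k, n) in enumerate(zip(events, totals), start=1):
    lo, hi = wilson_interval(k, n)
    print(f"time {t}: {k}/{n} = {k / n:.0%}, 95% CI {lo:.0%} to {hi:.0%}")

m2, p = linear_by_linear(events, totals)
print(f"linear-by-linear M^2 = {m2:.3f}, two-sided p = {p:.3f}")
```

With denominators this small, the percentages climb steadily from 20% to 60% and yet the intervals overlap heavily and the trend test is far from significant, which is the same pattern described above.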

At a blog called Junkcharts there was a recent posting titled "When not to use bars". I added a comment (which I've edited a bit below):

"I think that one of the reasons the bar chart is so popular is that it paints broad strokes of ink (particularly striking when color is used), giving the figure a kind of visual punch. A bar chart can be seen from halfway across a room, whereas the traditional figure favoured by statisticians nearly disappears (admittedly I'm not wearing my glasses, but I think the point holds). However, one could achieve a similar visual effect to the bar chart using vertical colored boxes with light horizontal lines indicating the point estimates."

Here's what I meant:

Of course it's a rather unfamiliar display. I'd be interested to hear other people's thoughts on this.

## 5 Comments:

I'm not too fond of this banded-bar display; it suggests a lot of probability density across the entire confidence interval. Either keep it simple with the thin lines, or go all out with a raindrop graphic.

Thanks for your comments, Mike. Yes, I know what you mean. I was aiming to match the aesthetic appeal of a bar chart, while conveying the information in a confidence interval. But I'm not sure it was a success.

I'm delighted you mentioned the raindrop, which my PhD supervisor and I introduced. If anyone is interested, here's the citation:

The American Statistician, 2003, vol. 57, pp. 268-274

Raindrop Plots: A New Way to Display Collections of Likelihoods and Distributions

Nicholas J. Barrowman; Ransom A. Myers

D'oh! That's what I get for not checking the article before posting (excuse: my copy of American Statistician is in the office, I commented at home). As I recall, it was a delightful article.

You wouldn't happen to have a snippet of raindrop R code lying about, would you? I'm thinking of introducing my undergrads to R graphics next semester, and a good example is worth a dozen lessons.

Thanks, it was fun working on the article. I do have some R code for raindrops. I'll try to dig it up ...

Ok, here is some R code that defines 3 functions:

`plotraindrop` (to plot a single raindrop), `plotraindrops` (to plot a list of raindrops), and `profileglm` (to compute a profile likelihood for a generalized linear model). This last function (`profileglm`) is just to help with an example here. The example takes a little while to run (around 30 seconds on my not-particularly-new computer) because it computes profile likelihoods, but I think it's a nice illustration. Let me know how it works!