Wednesday, November 30, 2005

To bar or not to bar?

Bar charts are sometimes used to show percentages, like this:
Looks like a trend, right? The top of each bar gives a point estimate of the percentage at each time. But to evaluate whether the apparent trend is more than the play of chance, we need to consider the precision of the estimates. So I often recommend showing confidence intervals around the point estimates, like this:
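For readers who want to compute such intervals themselves, here is a minimal sketch in Python. It uses the Wilson score interval, which is one reasonable choice for proportions with small denominators and is not necessarily the method behind the figure. The counts and the function name `wilson_interval` are hypothetical, chosen only for illustration.

```python
import math
from statistics import NormalDist


def wilson_interval(successes, n, level=0.95):
    """Wilson score confidence interval for a binomial proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    # Two-sided normal quantile, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical counts (events, denominator) at three time points;
# these are NOT the data behind the charts in this post.
hypothetical_counts = [(2, 10), (3, 10), (5, 10)]
for events, total in hypothetical_counts:
    lo, hi = wilson_interval(events, total)
    print(f"{events}/{total}: {events / total:.0%} (95% CI {lo:.0%} to {hi:.0%})")
```

With denominators this small, the intervals come out very wide, which is exactly why the point estimates alone can mislead.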
The overlap of the confidence intervals is considerable, which suggests that the data are consistent with there being no trend at all. In SPSS, the chi-square trend test ("Linear-by-Linear Association") gives a 2-sided p-value of 0.229, so it's clearly not statistically significant. (The intervals are wide, and the test is non-significant, because the proportions are based on very small denominators.)
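As a rough sketch of what such a trend test does, the Python below computes a linear-by-linear association statistic as M² = (N − 1)r², where r is the Pearson correlation between the time score and the binary outcome, and refers it to a chi-square distribution with 1 degree of freedom. My understanding is that this is the form SPSS reports, but treat that as an assumption. The counts, the equally spaced scores, and the function name `linear_by_linear` are hypothetical, so the p-value will not reproduce the 0.229 quoted above.

```python
import math


def linear_by_linear(successes, totals, scores=None):
    """Linear-by-linear association test for a 2 x k table.

    Returns (M2, p_value), where M2 = (N - 1) * r**2 and the p-value
    comes from a chi-square distribution with 1 degree of freedom.
    """
    if scores is None:
        scores = list(range(len(totals)))  # assumed equally spaced scores
    n = sum(totals)
    sum_x = sum(t * x for t, x in zip(totals, scores))
    sum_xx = sum(t * x * x for t, x in zip(totals, scores))
    sum_y = sum(successes)  # outcome is 0/1, so sum of y**2 equals sum of y
    sum_xy = sum(s * x for s, x in zip(successes, scores))

    sxy = sum_xy - sum_x * sum_y / n
    sxx = sum_xx - sum_x**2 / n
    syy = sum_y - sum_y**2 / n
    if sxx == 0 or syy == 0:
        raise ValueError("scores or outcomes do not vary")

    m2 = (n - 1) * sxy**2 / (sxx * syy)
    # Upper tail of chi-square(1): P(Z**2 > m2) = erfc(sqrt(m2 / 2)).
    p_value = math.erfc(math.sqrt(m2 / 2))
    return m2, p_value


# Hypothetical data: events out of 5 at each of three time points.
m2, p = linear_by_linear([1, 2, 3], [5, 5, 5])
print(f"M2 = {m2:.3f}, two-sided p = {p:.3f}")
```

Even though the hypothetical percentages rise steadily from 20% to 60%, the small denominators leave the test well short of conventional significance.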

At a blog called Junkcharts there was a recent posting titled "When not to use bars". I added a comment (which I've edited a bit below):
"I think that one of the reasons the bar chart is so popular is that it paints broad strokes of ink (particularly striking when color is used), giving the figure a kind of visual punch. A bar chart can be seen from halfway across a room, whereas the traditional figure favoured by statisticians nearly disappears (admittedly I'm not wearing my glasses, but I think the point holds). However, one could achieve a similar visual effect to the bar chart using vertical colored boxes with light horizontal lines indicating the point estimates."
Here's what I meant:
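For anyone who wants to experiment with this kind of display, here is a minimal sketch in Python that draws it as an SVG string: one colored box per confidence interval, with a light horizontal line at the point estimate. The function name `banded_interval_svg`, the colors, the dimensions, and the intervals are all hypothetical and are not the data or code behind the original figure.

```python
def banded_interval_svg(estimates, width=90, height=200, box_width=40):
    """Return an SVG string with one colored box per (low, point, high) triple.

    All values are proportions on a 0-1 scale. Each box spans the interval
    and a light horizontal line marks the point estimate.
    """
    def y(value):
        # SVG's y axis points down, so flip the proportion scale.
        return round(height * (1 - value), 1)

    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width * len(estimates)}" height="{height}">'
    ]
    for i, (low, point, high) in enumerate(estimates):
        x = i * width + (width - box_width) / 2
        box_height = round(y(low) - y(high), 1)
        parts.append(
            f'<rect x="{x}" y="{y(high)}" width="{box_width}" '
            f'height="{box_height}" fill="steelblue"/>'
        )
        parts.append(
            f'<line x1="{x}" x2="{x + box_width}" y1="{y(point)}" '
            f'y2="{y(point)}" stroke="white" stroke-width="2"/>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


# Hypothetical (lower limit, point estimate, upper limit) triples.
hypothetical_intervals = [(0.06, 0.20, 0.51), (0.11, 0.30, 0.60), (0.24, 0.50, 0.76)]
print(banded_interval_svg(hypothetical_intervals))
```

Saving the printed output to a file with an `.svg` extension and opening it in a browser should show the boxes. Axes and labels are left out to keep the sketch short.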

Of course it's a rather unfamiliar display. I'd be interested to hear other people's thoughts on this.

5 Comments:

Anonymous Mike Anderson said...

I'm not too fond of this banded-bar display; it suggests a lot of probability density across the entire confidence interval. Either keep it simple with the thin lines, or go all out with a raindrop graphic.

8:31 AM, December 14, 2005  
Blogger Nick Barrowman said...

Thanks for your comments, Mike. Yes, I know what you mean. I was aiming to match the aesthetic appeal of a bar chart, while conveying the information in a confidence interval. But I'm not sure it was a success.

I'm delighted you mentioned the raindrop, which my PhD supervisor and I introduced. If anyone is interested, here's the citation:

The American Statistician, 2003, vol. 57, pp. 268-274
Raindrop Plots: A New Way to Display Collections of Likelihoods and Distributions
Nicholas J. Barrowman; Ransom A. Myers

9:26 PM, December 14, 2005  
Anonymous Mike Anderson said...

D'oh! That's what I get for not checking the article before posting (excuse: my copy of American Statistician is in the office, I commented at home). As I recall, it was a delightful article.

You wouldn't happen to have a snippet of raindrop R code lying about, would you? I'm thinking of introducing my undergrads to R graphics next semester, and a good example is worth a dozen lessons.

10:29 PM, December 15, 2005  
Blogger Nick Barrowman said...

Thanks, it was fun working on the article. I do have some R code for raindrops. I'll try to dig it up ...

10:42 PM, December 15, 2005  
Blogger Nick Barrowman said...

OK, here is some R code that defines 3 functions: plotraindrop (to plot a single raindrop), plotraindrops (to plot a list of raindrops), and profileglm (to compute a profile likelihood for a generalized linear model). This last function (profileglm) is just there to help with an example. The example takes a little while to run (around 30 seconds on my not-particularly-new computer) because it computes profile likelihoods, but I think it's a nice illustration. Let me know how it works!

10:43 AM, December 16, 2005  
