Saturday, April 29, 2006

Probably incorrect probability

A recent post on the blog Blackademic about the alleged rape of a black woman by white Duke University lacrosse players generated a flurry of comments. One anonymous commenter argued:
"Unfortunately, statistically a black women is significantly more likely to make a false accusation of rape than to have been raped by a white man. According to the National Crime Victimization Survey ( http://www.ojp.usdoj.gov/bjs/pub/pdf/cvus/current/cv0342.pdf ), less than .0004% of black rape victims were raped by whites. (The NCVS reports the percentage as 0% because there were less than 10 reported cases. I assumed 9 cases, to come up with an actual percentage) Even with the most conservative figure of 2% of rape allegations being false, this means in the case of the Duke Rape Case, the victim is 5000 times more likely to have made a false accusation than to have actually been raped."
There were some perplexed responses to this dramatic claim:
  • "yeah, cuz stats and figures are ALWAYS correct--whatever. it depends on who did the survery and for whom."
  • "How did you come up with the 5000 times more likely figure? That makes no sense at all. Using the figures you cited, the victim regrdless of race is likely to be lying only 2% of the time."
  • "Even if this study were accurate, and even if it were ethical to invoke the laws of probability to determine whether someone is believable-- two outsized ifs-- one coin, landing heads up, doesn't determine the likelihood of the next coin landing heads up. Neither does one woman, 30 years ago, have any bearing on the likelihood that another woman is telling the truth."
  • "the statistics and logic are just that - excercises in probability that tell us nothing about the case in question, because they are not equal to evidence."
But I think these responses miss the point: as far as I can see, the claim is simply incorrect. I think my reasoning is correct, but if I've slipped up please leave a comment.

First, Anonymous claimed that "less than .0004% of black rape victims were raped by whites." I followed the link to the National Crime Victimization Survey to check on this. The total number of rapes or sexual assaults of blacks listed was 24,010 and based on "about 10 or fewer sample cases" the perceived offender was white 0.0% of the time. I'm not entirely sure what this means, but Anonymous reasoned that as many as 9 of the offenders might be white. Now 9 out of 24,010 is about 0.04%, not 0.0004%. Anonymous then introduces "the most conservative figure of 2% of rape allegations being false". Dividing 2% by the incorrect figure of 0.0004%, Anonymous claims that "the victim is 5000 times more likely to have made a false accusation than to have actually been raped". If we divide 2% by the correct figure of 0.04%, we get 50 not 5000!
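The arithmetic above can be checked with a short sketch. The variable names are mine, and the count of 9 cases is Anonymous's assumption rather than a figure reported by the survey.

```python
# Figures quoted in the post. The 9 cases are Anonymous's assumption
# (the survey only says "about 10 or fewer sample cases").
total_victimizations = 24_010
assumed_white_offender_cases = 9
false_allegation_rate_pct = 2.0

# Share of cases with a white offender, expressed as a percentage.
share_pct = assumed_white_offender_cases / total_victimizations * 100
print(f"share of cases: {share_pct:.4f}%")

# The ratio Anonymous computed, with the decimal point in the right
# place (0.04%) and in the wrong place (0.0004%).
ratio_with_correct_share = false_allegation_rate_pct / 0.04
ratio_with_misplaced_decimal = false_allegation_rate_pct / 0.0004
print(round(ratio_with_correct_share), round(ratio_with_misplaced_decimal))
```

This only confirms the division; it says nothing about whether the ratio means what Anonymous says it means.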

But apart from this error, the interpretation of the ratio is wrong. For it to be right, we would have to know the probability that the woman was raped. But that's not what the 0.04% represents. Instead, it's an estimate of the probability that if a black person is raped, the offender is white.
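One way to write down the distinction, in notation that is mine rather than the survey's, and using Anonymous's assumed 9 cases:

```latex
% What the 0.04% estimates:
\[
P(\text{offender is white} \mid \text{victim is black and was raped})
  \approx \frac{9}{24{,}010} \approx 0.04\%
\]
% What the claim would need instead:
\[
P(\text{the woman was raped} \mid \text{what is known about the case})
\]
```

The first quantity takes a rape as given and asks about the offender; the second asks whether a rape occurred at all, so one cannot stand in for the other.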

So the whole thing is invalid. The problem isn't with probabilistic reasoning per se, it's with faulty probabilistic reasoning. And that's a shame when something so important is at stake.

2 Comments:

Blogger Zeno said...

Thanks for checking the original information, Nick, and doing the math. I wish that people who don't understand probability would refrain from trying to make probabilistic arguments. Mr. Anonymous, in this case, didn't even know how to place a decimal point.

3:27 PM, April 29, 2006  
Anonymous Anonymous said...

I am the anonymous commenter in question. My response:

"Now 9 out of 24,010 is about 0.04%, not 0.0004%."

True… obviously arithmetic is not my strong point.

"Dividing 2% by the incorrect figure of 0.0004%, Anonymous claims that "the victim is 5000 times more likely to have made a false accusation than to have actually been raped". If we divide 2% by the correct figure of 0.04%, we get 50 not 5000!"

Once again... true.

"But apart from this error, the interpretation of the ratio is wrong. For it to be right, we would have to know the probability that the woman was raped. But that's not what the 0.04% represents. Instead, it's an estimate of the probability that if a black person is raped, the offender is white."

I can go with this logic. Take my figure that 2% of all reported rapes are false allegations (a very, very conservative number). This means that out of 10,000 reported rapes, 9,800 are actual rapes. Using the corrected 0.04% figure (0.04% of actual rapes of black women are committed by white males), we end up with about 0.039% of the rapes reported by black females being actual rapes committed by white males. If we divide 2% (reported rapes that are false) by the corrected figure of 0.039% (reported rapes that are true and committed by white males), we get a figure of 51.

"So the whole thing is invalid. The problem isn't with probabilistic reasoning per se, it's with faulty probabilistic reasoning. And that's a shame when something so important is at stake."

Actually I have noted one other error in my argument as well.

1. The rape figures I used to make the calculations are "single offender" statistics. I was unable to find the corresponding multiple-offender figures.

Having conceded these points, I still think that the revised numbers adequately make my point: it is more likely that the AV in the Duke Rape Case made up the rape allegations than that she was actually raped by a white male.

I think, though, that in the future I will stay away from amateur statistical analysis.

9:43 AM, May 24, 2006  
