It is well known that correlation does not imply causation. But when scientific studies are reported in the media, this dictum is often forgotten. Professor Jon Mueller at North Central College in Naperville, Illinois has compiled a great set of links to news articles reporting scientific findings. Some of the headlines for these articles suggest causal relationships and some do not. Clicking through to the actual news articles shows that the purported causal relationships are often a stretch, to say the least. For example:
The researchers found children who watched two to four hours of TV were 2.5 times more likely to have high blood pressure compared with those who watched less than two hours of television a day. Those who watched more than 4 hours per day were 3.3 times more likely to have hypertension.
In other words, this was an observational study, which established a correlation between watching a lot of TV each day and having high blood pressure. Contrary to the headline, the study did not show that the TV watching was the cause of the high blood pressure. For convenience, let's rework the headline while preserving its causal sense:
TV watching increases the probability of high blood pressure. (1)
The causal implication can be removed by writing:
TV watchers have higher probability of high blood pressure. (2)
In a wiki entry on causal language, Gustavo Lacerda points out that action words often express causality. Note that in the present example, in order to remove the causal aspect of (1), it was necessary to change the verb "watching" into the noun "watchers" and the verb "increases" into the adjective "higher".
Interestingly, there is a Bayesian formulation that sounds closer to (1):
Being a TV watcher increases the probability that a child has high blood pressure.
Note that this version has the verb "increases", like (1), but not the verb "watching". Instead it's expressed as "being a TV watcher", which indicates group membership rather than action or choice. It is this information about group membership that is used to update the probability of high blood pressure, following the Bayesian recipe.
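To make the Bayesian recipe concrete, here is a minimal sketch of the update. The function name and all three input probabilities are hypothetical, chosen only for illustration; they are not taken from the study. The point is that learning a child belongs to the "TV watcher" group changes the probability we assign to high blood pressure, without saying anything about what causes what.

```python
def posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Bayes' rule: return P(H | E) from P(H), P(E | H) and P(E | not H)."""
    numerator = p_evidence_given_h * prior
    evidence = numerator + p_evidence_given_not_h * (1 - prior)
    return numerator / evidence


# Hypothetical inputs (assumptions, not data from the study):
p_hbp = 0.10                 # prior P(high blood pressure)
p_watcher_given_hbp = 0.60   # P(TV watcher | high blood pressure)
p_watcher_given_ok = 0.30    # P(TV watcher | no high blood pressure)

p_hbp_given_watcher = posterior(p_hbp, p_watcher_given_hbp, p_watcher_given_ok)
print(round(p_hbp_given_watcher, 3))  # 0.182
```

With these made-up numbers, knowing that a child is a TV watcher moves the probability of high blood pressure from 0.10 to about 0.18. The same arithmetic would apply whether or not TV watching has any causal effect.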
Prediction and causality
Prediction can sound a lot like causation. Consider this statement:
If you exercise, you're less likely to have a heart attack. (3)
Does this mean:
People who exercise are less likely than people who don't to have a heart attack. (4)
or does it mean:
The act of exercising reduces your chances of having a heart attack. (5)
It seems quite ambiguous. On the one hand, "if you exercise" sounds like a statement about your choice simply to exercise instead of not exercising, which supports interpretation (5). On the other hand, "if you exercise" identifies you as a person who exercises, and that may predict your risk of heart attack, perhaps due to another behaviour common among people who exercise, such as healthy eating. This would support interpretation (4).
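The gap between (4) and (5) can be illustrated with a toy model. In this sketch, which is an assumption-laden illustration and not a claim about real exercise or heart disease, healthy eating makes people both more likely to exercise and less likely to have a heart attack, while exercise itself is given no effect at all. Every number and name below is hypothetical.

```python
# Toy model: all probabilities are hypothetical, chosen only for illustration.
P_HEALTHY = 0.5                              # P(healthy eater)
P_EXERCISE_GIVEN = {True: 0.8, False: 0.2}   # P(exercises | healthy eater?)
P_ATTACK_GIVEN = {True: 0.05, False: 0.15}   # P(heart attack | healthy eater?)
# By construction, heart attack risk depends on diet only, never on exercise.


def p_healthy(healthy):
    return P_HEALTHY if healthy else 1 - P_HEALTHY


def p_exercise(exercises, healthy):
    p = P_EXERCISE_GIVEN[healthy]
    return p if exercises else 1 - p


def observed_risk(exercises):
    """Interpretation (4): P(heart attack | person is seen to exercise or not)."""
    joint = {h: p_healthy(h) * p_exercise(exercises, h) for h in (True, False)}
    total = sum(joint.values())
    return sum(P_ATTACK_GIVEN[h] * joint[h] / total for h in (True, False))


def interventional_risk(exercises):
    """Interpretation (5): P(heart attack) if exercise status were imposed."""
    # Imposing exercise does not change who eats healthily, and in this
    # model risk depends on diet alone, so the argument is unused.
    return sum(P_ATTACK_GIVEN[h] * p_healthy(h) for h in (True, False))


print(round(observed_risk(True), 2), round(observed_risk(False), 2))
# 0.07 0.13
print(round(interventional_risk(True), 2), round(interventional_risk(False), 2))
# 0.1 0.1
```

In this model statement (4) is true: exercisers have about half the heart attack risk of non-exercisers. Statement (5) is false: making someone exercise would leave their risk unchanged. Statement (3) could be read either way, which is exactly the ambiguity.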
Natural language allows ambiguities. It's convenient to leave things out because everyone knows what we mean. Don't they? Not necessarily. Certainly, when it comes to causality, ambiguity can lead to a mess of trouble. In ordinary speech, the distinction between correlation and causation is often blurred. Statement (3) above is ambiguous about the comparator: less likely than whom to have a heart attack? Less likely than people who don't exercise? Less likely than you would be if you chose not to exercise?
It seems to me that causal language is almost a worst-case scenario. Many people would see the concern as unimportant. And yet evidence and beliefs about causation are at the foundation of any intervention, whether in health care, education, social programs, economics, what have you. The media and politicians routinely use misleading causal language. But it's difficult even when we try to be clear!
Sweetness and life
One of my favorites among Mueller's examples is: