Friday, April 11, 2008

StatLinks

The Internet makes it possible to link a dispersed community of common interest. Now there are a number of blogs that focus entirely or in part on Statistics, but they seem not to be well connected.

So I've set up a social bookmarking website devoted to applied statistics, data analysis, and visualization. It's called StatLinks.

It lists links that users submit, and allows other users to vote on their relevance. Links are listed in order of popularity (or in chronological order, if you prefer).

I encourage people to visit StatLinks, to submit links that are likely to be of interest, and to pass the word! I've put a few links in to get things started. (Hat tip to Slinkset whose technology made it a breeze to set this up.)

Tuesday, March 04, 2008

Is it fair not to share?

Over at Adventures in Ethics and Science there's a very interesting post and follow-up comments on whether researchers should share data. It's based on a recent New York Times article by biostatistician Andrew Vickers (Cancer Data? Sorry, can't have it).

Monday, March 03, 2008

Paint by numbers

The map on the left labels countries as free (green), partly free (peach), and not free (red) for the year 2006. The classification is from Freedom House, a primarily US-government-funded organization. Each country is scored on political liberty and civil liberty; the combined average of these scores determines how the country is classified. Cuba, for example, receives the poorest rating on both political and civil liberties, and is thus categorized as "not free".

The Wikipedia entry for democracy lists two other measures of democracy, one from the Polity IV project and the Democracy Index from The Economist. Each measure has its own scheme for measuring and weighting different characteristics felt to characterize democracies. Since the different measures assess many of the same things, it is not surprising that they show some agreement.

But democracy is not a simple thing and it is far from clear which characteristics matter and how best to weight them. For example, does a free press count more than an independent judiciary? If so, by how much? (And presumably this depends on how "free" and how "independent" they are.) What about non-traditional characteristics that may be important measures of democracy? For example, should voter turnout be factored in? What about incarceration rate? Or media concentration? Or universal health care?

Although there may be some value in overall measures of democracy, individual characteristics still need to be examined and put in context. When we do that, we may find that the global canvas no longer looks like a paint by numbers kit.

Update: 05Mar2008

No sooner had I put up this post than I saw this Adbusters press release relating to media concentration and democracy:

Adbusters Demands Canwest, the CBC and the CRTC Stop Blocking Citizen-Produced Advertising

On Monday, February 18, Adbusters lost its court battle against two of Canada's television networks that refused to sell airtime for its commercials. Adbusters claimed the CBC and Canwest Global had violated its right to free speech under the Canadian Charter of Rights and Freedoms by refusing to sell air time, but the court decided that the Charter does not apply to private corporations.

"It's outrageous that the fast food, oil and automobile industries can buy as much TV time as they want in order to promote their agendas, but citizens are not allowed to talk back," said Adbusters Editor-in-Chief Kalle Lasn in response to the ruling. "Canadian democracy will not work properly until we the people have the same right to buy airtime as corporations do."

The rejected Adbusters ads pointed out that over 50 percent of the calories in a Big Mac come from fat, called for an end to the age of the automobile, and promoted Buy Nothing Day. While Court Justice William Ehrcke ruled that private broadcasters have the right to run whatever ads they like, Adbusters feels the case raises some troubling questions.

Firstly, why are Canwest and the CBC selling as much time as they possibly can to corporations, while fighting expensive legal actions to keep citizen-produced messages off the air? Why does the CBC call itself "Canada's Public Broadcaster" if they won't sell airtime to citizens?

Secondly, why is the CRTC not standing up for public access? When they grant licences to broadcasters, why is the right of Canadian citizens to access their own "public" airwaves not being guaranteed? Thirdly, why is our freedom of speech being suppressed? Why can corporations buy airtime while citizens cannot? Why doesn't the Canadian Charter apply to the most powerful social communications medium of our age - television?

"This case goes to the very heart of what our democracy is all about," says Lasn. "A healthy society allows its citizens to walk into their local TV stations and buy airtime under the same rules and conditions that corporations do. Adbusters has been given 30 days to challenge the ruling. This legal battle for media democracy will go on."

To talk to Kalle Lasn, or Ryan Dalziel, our lawyer, about the case please contact Lauren Bercovitch (lauren@adbusters.org)

EDITOR'S NOTES

For more information about Adbusters and the global media democracy movement visit www.mediacarta.org and www.adbusters.org

[1] Canadian Media facts:

Three corporations (CanWest, Quebecor and Torstar) control 70 per cent of the country's daily newspaper circulation.

Five major media acquisitions in Canada have been approved by CRTC in the past year: CHUM was purchased by CTVglobemedia for $1.4 billion, which then sold five CityTV stations to Rogers Communications for $375 million; CanWest purchased Alliance Atlantis for $2.3 billion; Astral Media bought Standard Broadcasting for $1.2 billion; and Quebecor bought the Osprey Media newspaper chain for $414 million.

[2] Facts about Media Democracy:

More than 30,000 people have signed the Media Carta www.mediacarta.org, to voice their concerns about the way information is distributed in our society.

In the past year, a growing number of grassroots media activist groups have been formed in Canada to express a dissatisfaction with the continued consolidation of the country's media:

DemocraticMedia.org

MediaReform.ca

MediaDemocracy.ca

Monday, February 25, 2008

Data and development

Here's a fascinating talk by Hans Rosling about international health and development. His presentation reminds me of Al Gore's An Inconvenient Truth.

Thoughts?

Thursday, January 31, 2008

LOLstats

Thursday, January 24, 2008

The princess and the outlier

In my continuing effort to eff the ineffable (consciousness), I today stumbled on an article by Jaron Lanier with the intriguing title "Death: The skeleton key of consciousness studies?" It's written in an entertaining manner with little in the way of technical jargon. Lanier makes some interesting points, but what struck me was the following piece near the beginning:
There is a popular story about a princess who complains that she cannot sleep comfortably because of a single pea buried under layers of mattresses. That pea is consciousness in the sciences.

To consider consciousness by itself is entirely undemanding. It is a pea. There is nothing to describe. An attempt to account for it in context, however, forces the construction of ever shifting, elaborate adventures of thought.

What a temptation it is to dispose of this erratic data point. That is what any first year student of statistics would be taught to do.
Excuse me? I was enjoying the metaphor until that last bit!

At this point I'm tempted to launch into an extended discussion of statistical approaches to outliers. Or an outraged defense of statisticians against the notion that we teach first year students to casually discard data points that seem aberrant. But I think I'll put it on my to-blog list. That's one more pea under my mattress!

Sunday, December 09, 2007

Things that (probably) don't exist

A recent article by philosopher Steven Hales is titled "You Can Prove a Negative" (a slightly different version of the article is available as a pdf file). Hales argues that the "principle of folk logic" saying you can't prove a negative is just plain wrong.

He points out that "any claim can be expressed as a negative, thanks to the rule of double negation." So it's easy to come up with examples of proving a negative. Hales goes on to say that "Some people seem to think that you can’t prove a specific sort of negative claim, namely that a thing does not exist." He counters this with an example of a valid proof that something doesn't exist:
1. If unicorns had existed, then there is evidence in the fossil record.
2. There is no evidence of unicorns in the fossil record.
3. Therefore, unicorns never existed.
Of course, the difficulty here is with the truth of the premises (1 and 2). In particular, it could be that we just haven't found unicorn fossils yet. Or perhaps, unicorns don't leave a fossil trace. Deductive arguments are so neat and tidy we may forget about what's been swept under the carpet: the truth (or otherwise) of the premises.

Finally Hales grasps the nettle:
Maybe people mean that no inductive argument will conclusively, indubitably prove a negative proposition beyond all shadow of a doubt. For example, suppose someone argues that we’ve scoured the world for Bigfoot, found no credible evidence of Bigfoot’s existence, and therefore there is no Bigfoot. A classic inductive argument. A Sasquatch defender can always rejoin that Bigfoot is reclusive, and might just be hiding in that next stand of trees. You can’t prove he’s not! (until the search of that tree stand comes up empty too).

And now we come to the heart of the matter:
The problem here isn’t that inductive arguments won’t give us certainty about negative claims (like the nonexistence of Bigfoot), but that inductive arguments won’t give us certainty about anything at all, positive or negative. All observed swans are white, therefore all swans are white looked like a pretty good inductive argument until black swans were discovered in Australia.
Well, hold on just a moment. We were talking about "a specific sort of negative claim, namely that a thing does not exist". And the swan argument hasn't been written that way. If we do write it that way, we get the inductive argument no observed swans are black, therefore all swans are non-black. So non-existence claims based on observation are uncertain.

But what about existence claims based on observation? Well, you only have to see one black swan to conclude that not all swans are white, and this inference is certain because it's deductive. (This is, of course, provided that we can trust that what we've seen really is a swan, and it really is black, and that we didn't just imagine the whole thing. There are some important issues here, but taking this too far can lead to radical skepticism, which is unproductive.)

My point is that when it comes to using observational evidence to argue for existence (a positive claim) or non-existence (a negative claim), you can't prove a negative, whereas you can prove a positive. (Here I'm using "prove" to mean "establish with certainty".) So, in this sense, I disagree with Hales. And I think that this is what people typically mean when they state that "you can't prove a negative". I also think that the imbalance in the difficulty of demonstrating non-existence compared to existence is a strong argument that the burden of proof should be on those who claim the existence of something.

I agree with Hales, however, in his defense of induction:
The very nature of an inductive argument is to make a conclusion probable, but not certain, given the truth of the premises. That's just what an inductive argument is. We’d better not dismiss induction because we’re not getting certainty out of it, though.
I believe we all crave certainty, but it's in pretty short supply—caveat emptor.

If we weren't so terrified of uncertainty, we might make much better decisions. When it comes to things that can be quantified, the field of statistics offers some very useful tools for dealing with uncertainty. Suppose, for example, we're trying to determine whether all swans are white. If we sample, at random, 100 swans, and each of them is white, then a very useful approximation, the "Rule of Three", tells us that we can have 95% confidence that the true proportion of non-white swans is less than 3/100 or 3%. Suppose we continue sampling swans and they stubbornly continue to be white. Having sampled 10,000 white swans, we can now have 95% confidence that the true proportion of non-white swans is less than 3/10,000 or 0.03%.

The notion of "95% confidence" can be made precise (but I won't get into the details here). It's also noteworthy that there are Bayesian analogues to the Rule of Three. Details are in Jovanovic and Levy, A Look at the Rule of Three, 1997, The American Statistician, 51: 137-139.
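For readers who want to check the arithmetic, here is a minimal Python sketch comparing the Rule of Three with the exact bound it approximates. The reasoning: if the true proportion of non-white swans were p, the chance that n randomly sampled swans are all white would be (1 - p)^n, and setting that equal to 0.05 gives p = 1 - 0.05^(1/n), which is roughly 3/n. The function names are my own, chosen for illustration.

```python
def rule_of_three(n):
    """Approximate 95% upper bound on a proportion after n trials with zero events."""
    return 3.0 / n


def exact_upper_bound(n, confidence=0.95):
    """Exact upper bound: the p that solves (1 - p)**n == 1 - confidence."""
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


if __name__ == "__main__":
    # Sample sizes taken from the swan example in the text.
    for n in (100, 10_000):
        print(n, rule_of_three(n), exact_upper_bound(n))
```

The approximation sits slightly above the exact value, so it errs on the cautious side.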

Unfortunately, there's a major difficulty in the application of the Rule of Three to the swan example: the assumption that the swans are randomly sampled! It turns out that the black swans were hiding out in Australia. But there's a message here: non-random samples can give very misleading information. That's one reason why anecdotal evidence is treated so skeptically by scientists.
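The effect of non-random sampling can be sketched with a toy simulation in Python. Everything in it is hypothetical: the population counts, the two region labels, and the variable names are invented for illustration and are not real swan data. It shows how applying the Rule of Three to a convenience sample can produce a bound that the true proportion actually exceeds.

```python
import random

# Hypothetical population, invented for illustration: 9,500 white swans in
# "europe" and 500 black swans in "australia" (true black proportion = 5%).
population = [("europe", "white")] * 9500 + [("australia", "black")] * 500
true_black = sum(colour == "black" for _, colour in population) / len(population)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Convenience sample: only swans an observer in Europe could have seen.
european = [swan for swan in population if swan[0] == "europe"]
convenience = rng.sample(european, 100)
black_seen = sum(colour == "black" for _, colour in convenience)

# Rule of Three applied (wrongly) to the non-random sample.
claimed_upper = 3 / len(convenience) if black_seen == 0 else None

# A simple random sample from the whole population, for comparison.
random_sample = rng.sample(population, 100)
black_in_random = sum(colour == "black" for _, colour in random_sample)

if __name__ == "__main__":
    print("true proportion of black swans:", true_black)
    print("black swans in convenience sample:", black_seen)
    print("Rule of Three bound from convenience sample:", claimed_upper)
    print("black swans in random sample:", black_in_random)
```

The convenience sample can never contain a black swan, so the "95% confidence" bound of 3% is meaningless here: the true proportion in this made-up population is 5%.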

For an atheist perspective on the "you can't prove a negative" idea, see here. And here's a page on burden of proof relating to claims of existence, from philosopher Philip Pecorino.

Update 12Dec2007: I sent a link to this post to Professor Hales and he kindly replied:
You write that you only have to see one black swan to know that not all swans are white, and that “this inference is certain because it is deductive.” But wait—the argument I gave about unicorns was also deductive, and you dismissed that as proving its conclusion. Therefore you can’t hold that the conclusion of your swan argument is certain because the argument form is deductive. If the conclusion of the swan argument is certain, then it is for some other reason. I suspect that you think it is certain because you are convinced of your premise that we have seen black swans. Of course, I’m rather convinced of my premises that if unicorns had existed, then there is evidence in the fossil record, and that there is no evidence of unicorns in the fossil record. Before you rejoin that we could find out that we are mistaken about the fossil record (as we would discover if we locate a unicorn skeleton), let me point out that we could also be mistaken about observing black swans. Maybe upon further study we’ll find out that they aren’t swans at all, but are merely related to swans. Or we could discover that they were phony, dyed white swans prepared to fool naïve naturalists. Or we might show that other even more skeptical hypotheses are true (mass hallucinations, dreaming, etc.). The real problem, as I see it, is your equation of proof with certainty. Most epistemologists don’t think we are certain of anything outside of logic, mathematics, and other things known a priori. There is always the possibility of error. But that doesn’t mean that we can’t prove things in some reasonable, real-world sense of prove.

Wednesday, March 28, 2007

In memoriam: Ram Myers

Goodbye, dear friend.

I learned of Ram's death on Tuesday evening, from the mother of Ram's wife (Rita, who is also a dear friend).

She added that Rita has requested that people not try to contact her.

It's now early Wednesday morning, and I am going to bed.
