Category Archives: statistics

Fun with probabilities

Here’s a real-life application of probability theory. It is similar to the sports betting company business model:

You get a mailing list (email, since it is cheaper), and you pick a big football game. You send half the list an advisory that one of the teams will win, and you send the other half an advisory that the other will win, along with the opportunity to subscribe to your newsletter for a low, low price.

To the half that got the correct prediction, you send another letter about another big game, again telling half of them that one team will win and the other half that the other team will win.

People who get two correct predictions in two weeks will be amazed and will be more likely to subscribe to your newsletter.

But you can split that group in half, and send out a third prediction, then split the people who got three weeks in a row right and send out a fourth letter, and so on. You might recycle some of the losers, too, so that some people see that your accuracy is 5 out of 6 weeks and so on.

Imagine getting a letter where the predictions were right five weeks in a row! What is the probability of that? This newsletter must be really good!

Or the newsletter writer knows some probability…
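To put rough numbers on it, here is a minimal sketch of the arithmetic. It assumes every game has exactly two outcomes (no ties) and that the list is split evenly each week. The list size of 100,000 and the function names are made up for illustration.

```python
from fractions import Fraction


def chance_of_streak(weeks: int) -> Fraction:
    """Probability that pure coin-flip picks are right `weeks` times in a row."""
    return Fraction(1, 2) ** weeks


def recipients_with_perfect_streak(list_size: int, weeks: int) -> int:
    """Number of people who have seen only correct picks after `weeks` mailings."""
    remaining = list_size
    for _ in range(weeks):
        # Only the half that was sent the winning pick gets the next letter.
        remaining //= 2
    return remaining


# Hypothetical list of 100,000 addresses, five weeks of mailings.
print(chance_of_streak(5))                         # 1/32
print(recipients_with_perfect_streak(100_000, 5))  # 3125
```

So a single guesser has only a 1 in 32 chance of a five-week streak, but the sender is guaranteed thousands of recipients who saw exactly that.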

NYTimes article about statistics

I have to link to this article, “For Today’s Graduate, Just One Word: Statistics,” mostly because of this quote:

“I keep saying that the sexy job in the next 10 years will be statisticians,” said Hal Varian, chief economist at Google. “And I’m not kidding.”

Works for me. There’s also a nice example at the end of the article about how just finding relationships in data is not always enough:

For example, in the late 1940s, before there was a polio vaccine, public health experts in America noted that polio cases increased in step with the consumption of ice cream and soft drinks, according to David Alan Grier, a historian and statistician at George Washington University. Eliminating such treats was even recommended as part of an anti-polio diet. It turned out that polio outbreaks were most common in the hot months of summer, when people naturally ate more ice cream, showing only an association, Mr. Grier said.
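The same pattern is easy to reproduce with made-up numbers. The sketch below is a toy simulation, not real polio or ice cream data: it assumes temperature drives both quantities, and that neither affects the other. All coefficients, noise levels, and names are invented for illustration. The two series come out strongly correlated, yet the correlation largely disappears once the temperature trend is subtracted from each.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, xs):
    """What is left of ys after removing its least-squares line on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic data: temperature is the hidden common cause.
temperature = [rng.uniform(0, 30) for _ in range(2000)]
ice_cream = [2.0 * t + rng.gauss(0, 5) for t in temperature]
polio = [0.5 * t + rng.gauss(0, 3) for t in temperature]

raw = pearson(ice_cream, polio)
adjusted = pearson(residuals(ice_cream, temperature),
                   residuals(polio, temperature))

print(f"raw correlation: {raw:.2f}")
print(f"after removing the temperature trend: {adjusted:.2f}")
```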

What do we know?

Science depends on statistics to allow us to “know” something. We experiment and show differences and try to show that those differences are not just random, but show some underlying effect.

Sometimes even statistics are not enough, when enough people in a field are convinced of the current theories (which remain theories because they can never be “proved”). An example is the struggle of two scientists whose data told them that ulcers were caused not by what “everyone” knew caused them, but by a bacterium. Their attempts to publish were frustrated because the reviewers “knew” they must be wrong.

Lucky for us they persevered, and eventually won the Nobel Prize.