Tuesday, June 2, 2009

What are the chances?

I cringe every time that I hear this phrase. I heard it most recently when my friend had her car stolen in San Juan. It was recovered in a semi-drivable state in Bayamon. She invested a couple thousand dollars to get rid of the "semi", only to have the car re-stolen a couple of months later. This time, when the car was recovered in Dorado, there was no "semi" to be had, so she's currently trying to sell it for parts. While the police were fairly understanding (though less than helpful) the first time her car got stolen, the second time was occasion for all sorts of raised eyebrows and skepticism. And in exasperation my friend uttered the phrase in question.

"What are the chances" is a Pandora's box of bad statistics. Statistics is about hedging your bets given incomplete information, but this phrase is always uttered after the fact, when we have (relatively) complete information: it happened. So unless you plan on repeating the experiment, the chances are one. It happened.

As an example, let's take the famous Goat/Car Puzzle. There are 3 doors; one has a car behind it and the other two have goats. After you pick a door, the game-show host opens one of the other doors and reveals a goat. You are then offered the option of switching your choice to the other door. If you play this game repeatedly, you'll win more often if you switch your choice (about two times in three, versus one in three if you stay). But in the instance just played out, the car was behind one of the doors, and if that was the door you picked, your chances of getting the car were 1. If you didn't, your chances were 0.
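
If you'd rather see the repeated-play claim than take it on faith, here is a minimal simulation sketch. It assumes the car is placed uniformly at random, your first pick is uniformly random, and the host picks at random when he has two goat doors to choose from. The function names play and win_rate are my own, purely for illustration.

```python
import random

def play(switch, rng):
    """Play one round; return True if the final pick is the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)    # assumption: car placed uniformly at random
    pick = rng.choice(doors)   # assumption: first pick is uniformly random
    # The host opens a door that is neither your pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def win_rate(switch, trials=100_000, seed=0):
    """Fraction of rounds won over many repeated plays."""
    rng = random.Random(seed)
    wins = sum(play(switch, rng) for _ in range(trials))
    return wins / trials

print("stay:  ", win_rate(switch=False))
print("switch:", win_rate(switch=True))
```

Over many rounds the two rates should settle near 1/3 and 2/3. Any single call to play, though, just returns True or False: it happened, or it didn't.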

You might object: "What were the chances beforehand, when I didn't know where the goats and car were?" But to do that, you need to make some assumptions. You need to assume that at each playing of the game, the car and goats are randomly assigned and/or you randomly pick doors. Otherwise, it might be the case that the car is always behind door #1, and you always pick door #2. Your chances of success wouldn't be so good in this case if you stuck with your pick. You might have decent prior knowledge of how cars, goats, and doors are picked in this example, but for everyday occurrences, we usually have much more limited prior knowledge. Are cars randomly stolen, or are certain brands targeted? Are certain areas targeted? People often assume that these processes are random, but they rarely are. With limited priors on these events, the question "what are the chances" can't be answered with any certainty and any answers given should be taken with a great big shaker of salt.

Furthermore, people have selective attention. We ignore whole heaps of ordinary outcomes and only pay attention to ones that strike us as interesting. As a friend of mine once said: "Low probability stuff happens pretty regularly because stuff is happening all the time." Even if the processes involved are random, unlikely outcomes are to be expected if the processes are repeated often enough. People tend to ignore the ordinary outcomes, exclaim at the extraordinary ones, and then assume that something deeper is afoot. In my friend's case, the police started wondering if she was being personally targeted or if she was really bad at locking her car. But even if we assume a random model of car thefts, some number of unlikely outcomes doesn't automatically imply that our random model is wrong.
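
To put a rough number on "stuff is happening all the time", here is a small sketch. It assumes the repetitions are independent and each has the same probability p, which real life rarely guarantees; the one-in-a-million figure and the function name prob_at_least_once are just illustrations of mine.

```python
def prob_at_least_once(p, n):
    """Chance that an event of probability p occurs at least once
    in n independent tries: 1 - (1 - p)^n."""
    return 1.0 - (1.0 - p) ** n

# A one-in-a-million event, given more and more opportunities to occur.
for n in (1, 1_000, 1_000_000, 10_000_000):
    print(n, prob_at_least_once(1e-6, n))
```

Under these assumptions, a one-in-a-million event given a million opportunities shows up at least once roughly 63% of the time, and with ten million opportunities it is all but guaranteed.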

Finally, we also need to keep in mind that in complex systems like real life, there may be a huge number of possible outcomes. But something has to happen. When you roll a die, each number only has a 1/6 chance of coming up. Would you roll a die once and then exclaim: "Wow, it came up six! What are the chances?" In real life, there might be millions of outcomes, each with a one-in-a-million chance of coming true, but the fact that one of them happens shouldn't be surprising.

Fighting against all of the pitfalls inherent in asking "What are the chances?", I've developed a reflexive response: "What are the chances?" One.