Saturday, September 22, 2018

Don't Judge a Book by its Cover

Situations or observations often have an "obvious" first-order explanation. Such explanations are attractive and seem complete.

Sometimes, however, a deeper and far more interesting second-order effect lurks under the surface.

Consider this military example of survivorship bias:
During World War II, the statistician Abraham Wald took survivorship bias into his calculations when considering how to minimize bomber losses to enemy fire. Researchers from the Center for Naval Analyses had conducted a study of the damage done to aircraft that had returned from missions, and had recommended that armor be added to the areas that showed the most damage. Wald noted that the study only considered the aircraft that had survived their missions—the bombers that had been shot down were not present for the damage assessment. The holes in the returning aircraft, then, represented areas where a bomber could take damage and still return home safely. Wald proposed that the Navy instead reinforce the areas where the returning aircraft were unscathed, since those were the areas that, if hit, would cause the plane to be lost. His work is considered seminal in the then-fledgling discipline of operational research.
Or perhaps this example of the law of large (or small) numbers from Statistics Done Wrong, which I brought up previously on the blog. The mean is not a reliable metric when the variance is large.
Suppose you’re in charge of public school reform. As part of your research into the best teaching methods, you look at the effect of school size on standardized test scores. Do smaller schools perform better than larger schools? Should you try to build many small schools or a few large schools?

To answer this question, you compile a list of the highest-performing schools you have. The average school has about 1,000 students, but the top-scoring five or ten schools are almost all smaller than that. It seems that small schools do the best, perhaps because of their personal atmosphere where teachers can get to know students and help them individually.

Then you take a look at the worst-performing schools, expecting them to be large urban schools with thousands of students and overworked teachers. Surprise! They’re all small schools too. 
Smaller schools have more widely varying average test scores, entirely because they have fewer students. With fewer students, there are fewer data points to establish the “true” performance of the teachers, and so the average scores vary widely. As schools get larger, test scores vary less, and in fact increase on average.
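The effect is easy to reproduce in a small simulation. The sketch below is not from the book; it assumes every student's score is drawn from the same normal distribution (mean 500, standard deviation 100, both made-up numbers), so all schools are equally good by construction. It deliberately ignores the book's remark that larger schools score slightly higher on average, and isolates only the variance effect.

```python
import random
import statistics


def simulate_school_means(n_schools, n_students, rng, mean=500.0, sd=100.0):
    """Return the average test score of each simulated school.

    Every student is drawn from the same distribution (an assumption made
    for illustration), so any difference between schools is pure chance.
    """
    return [
        statistics.fmean([rng.gauss(mean, sd) for _ in range(n_students)])
        for _ in range(n_schools)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
small = simulate_school_means(200, 50, rng)     # 200 schools of 50 students
large = simulate_school_means(200, 2000, rng)   # 200 schools of 2,000 students

# Spread of the school averages within each size class
spread_small = statistics.pstdev(small)
spread_large = statistics.pstdev(large)

print("spread of small-school averages:", round(spread_small, 1))
print("spread of large-school averages:", round(spread_large, 1))
print("best school is small: ", max(small) > max(large))
print("worst school is small:", min(small) < min(large))
```

Because the averages of small schools swing so much more, small schools populate both the top and the bottom of the ranking, even though no school here is better than any other.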
Or this example from Kahneman on whether praise or criticism improves outcomes, in which an Israeli air force instructor claimed:
“On many occasions I have praised flight cadets for clean execution of some aerobatic maneuver, and in general when they try it again, they do worse. On the other hand, I have often screamed at cadets for bad execution, and in general they do better the next time. So please don’t tell us that reinforcement works and punishment does not, because the opposite is the case.”
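The second-order principle at work was regression to the mean: an unusually good attempt is likely to be followed by a more ordinary one, and an unusually bad attempt by a better one, whatever the instructor says. Praise and criticism had nothing to do with the observations; they were merely nuisance variables.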
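Regression to the mean shows up in a simulation with no feedback at all. The sketch below assumes a toy model that is mine, not Kahneman's: each cadet's performance on an attempt is a fixed skill plus independent luck, both normally distributed with standard deviation 1.

```python
import random
import statistics


def two_attempts(n_cadets, rng, skill_sd=1.0, luck_sd=1.0):
    """Simulate two attempts per cadet as skill + luck (a toy model).

    Nobody is praised or criticised between the attempts.
    """
    pairs = []
    for _ in range(n_cadets):
        skill = rng.gauss(0.0, skill_sd)
        first = skill + rng.gauss(0.0, luck_sd)
        second = skill + rng.gauss(0.0, luck_sd)
        pairs.append((first, second))
    return pairs


rng = random.Random(1)  # fixed seed so the run is repeatable
pairs = sorted(two_attempts(10_000, rng))  # sorted by first-attempt score

worst = pairs[:1000]   # bottom 10% on the first attempt ("screamed at")
best = pairs[-1000:]   # top 10% on the first attempt ("praised")

# Average change from first to second attempt in each group
change_after_best = statistics.fmean([s - f for f, s in best])
change_after_worst = statistics.fmean([s - f for f, s in worst])

print("average change after a great first attempt:", round(change_after_best, 2))
print("average change after a poor first attempt: ", round(change_after_worst, 2))
```

The best performers get worse on average and the worst get better, which is exactly the pattern the instructor attributed to his praise and his screaming.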

Or perhaps the grisly observation that sales of ice-cream and the number of homicides in cities are strongly correlated. Here, the first-order explanation might be to dismiss the correlation as spurious.

However, a more careful look might point to an important hidden variable, warm weather, which helps us come up with a causal explanation. Warm weather makes people buy more ice-cream. Warm weather also brings people outdoors, which increases the odds of murders.
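One way to see a hidden variable at work is to hold it roughly fixed and check whether the correlation survives. The sketch below uses entirely made-up numbers: daily temperature drives both ice-cream sales and homicides, and neither affects the other. The coefficients and noise levels are assumptions chosen only for illustration.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(2)  # fixed seed so the run is repeatable
days = []
for _ in range(5000):
    temp = rng.uniform(0.0, 35.0)                    # hidden variable (deg C)
    ice_cream = 10.0 * temp + rng.gauss(0.0, 30.0)   # depends only on temp
    homicides = 0.2 * temp + rng.gauss(0.0, 1.0)     # depends only on temp
    days.append((temp, ice_cream, homicides))

# Correlation across all days
overall = pearson([d[1] for d in days], [d[2] for d in days])

# Correlation among days with nearly the same temperature
band = [d for d in days if 20.0 <= d[0] < 22.0]
within_band = pearson([d[1] for d in band], [d[2] for d in band])

print("correlation over all days:        ", round(overall, 2))
print("correlation at similar temperature:", round(within_band, 2))
```

Across all days the two series move together strongly; among days with similar weather the relationship mostly disappears, which is what a common cause would predict.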

Friday, September 7, 2018

Pascal and Fermat

Fermat and Pascal exchanged correspondence discussing the problem of points.

Here is a loose sketch of the problem:

Two players toss a fair coin. Player A gets a point if it comes up heads, while player B gets a point if it comes up tails.

They repeat this until one of the players reaches 10 points.

At the start of the game, each player wagers $50 for a total pot of $100.

Suppose the game is interrupted partway through for unavoidable reasons (say player A has 8 points and player B has 7 points).

How should the pot be divided?

At the start of the game, the odds are even, since the coin is fair. However, when the game is interrupted, player A has a higher chance of winning. How can we systematically take this condition into account?
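For the example above, the split can be worked out by counting, in the spirit of the argument usually attributed to Fermat. Player A needs 2 more points and player B needs 3, so the game must be settled within 2 + 3 - 1 = 4 further tosses. Imagine all 4 tosses are always played: A wins exactly when at least 2 of them come up heads. The function name and the idea of dividing the pot in proportion to each player's chance of winning are my framing of the sketch, not a quotation from the correspondence.

```python
from fractions import Fraction
from math import comb


def win_probability(a_needs, b_needs):
    """Probability that player A reaches the target first with a fair coin.

    The game is decided within a_needs + b_needs - 1 further tosses.
    If all of those tosses are (hypothetically) played, A wins exactly
    when at least a_needs of them are heads.
    """
    rounds = a_needs + b_needs - 1
    favourable = sum(comb(rounds, k) for k in range(a_needs, rounds + 1))
    return Fraction(favourable, 2 ** rounds)


pot = 100
p_a = win_probability(10 - 8, 10 - 7)  # A needs 2 points, B needs 3
share_a = pot * p_a
share_b = pot * (1 - p_a)

print(p_a, float(share_a), float(share_b))  # 11/16 68.75 31.25
```

Under this rule player A would take $68.75 and player B $31.25. At the start of the game both players need 10 points, and the same function gives 1/2, matching the even odds.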

A wonderful exploration of this problem written using "modern" terminology is available here. It outlines the problem, sketches Fermat's and Pascal's approaches, and generalizes the problem and solution.

The original correspondence (translated) is available here (pdf).