Thursday, March 25, 2010

Can most people be above average?

Can there be truth to the Lake Wobegon "effect": "where all the children are above average"?

The short (and correct) answer is no.

But could there be a town where "the number of above-average children exceeds the number of below-average children"?

Quite certainly possible!

Without trying to sound like a lawyer, it all hinges on the definition of the word "average". There are at least three commonly used measures of the average or "central tendency" of a bunch of data.

They are the mean, the median, and the mode.

The "mean" is the "average" you are used to, where you add up all the numbers and then divide by the number of data points. When you see somebody's batting average, you are considering the mean:

total number of runs/total number of "outs"

It is quite possible for more than half of the data points to lie above the mean (or below it), for instance when a few extreme values drag the mean away from the bulk of the data. A common but interesting example is the fact that most US mutual funds (70%) underperform the average benchmark (or passive index funds).

The "median" is the "middle" value in the list of numbers.

To find the median, you sort your numbers in ascending (or descending) order, and make a list. You then pick the guy in the middle of the list. A common example of this measure is "median income", which is really a fancy way of saying that you make everybody from Bill Gates to the poorest person on earth stand in order, in one straight line. The income of the guy at the center of the line is the median income.

Thus, if by average you mean median (no pun intended), then there are exactly as many people above average as there are below it.

Ergo, no Lake Wobegon effect.

The "mode" is the value that occurs most often. If no number is repeated, then there is no mode for the list. The mode also need not be unique: there can be two or more modes. Again, you can easily have more values above the mode than below it.

In summary, if average is defined as the mean or the mode, then the "weak form" of the Lake Wobegon Effect can certainly be true.
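
To make the mean-versus-median distinction concrete, here is a minimal C++ sketch. The function names (Mean, Median, CountAbove, CountBelow) and the test scores mentioned afterwards are made up for illustration; they are not from any real town.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Arithmetic mean of a non-empty list.
double Mean(const std::vector<double>& v) {
  assert(!v.empty());
  return std::accumulate(v.begin(), v.end(), 0.0) / v.size();
}

// Median of a non-empty list. The argument is taken by value so that
// the caller's data is not reordered. For an even count, the two
// middle values are averaged.
double Median(std::vector<double> v) {
  assert(!v.empty());
  std::sort(v.begin(), v.end());
  const std::size_t n = v.size();
  return (n % 2 == 1) ? v[n / 2] : 0.5 * (v[n / 2 - 1] + v[n / 2]);
}

// Number of entries strictly greater than ref.
int CountAbove(const std::vector<double>& v, double ref) {
  return static_cast<int>(
      std::count_if(v.begin(), v.end(), [ref](double x) { return x > ref; }));
}

// Number of entries strictly smaller than ref.
int CountBelow(const std::vector<double>& v, double ref) {
  return static_cast<int>(
      std::count_if(v.begin(), v.end(), [ref](double x) { return x < ref; }));
}
```

For the hypothetical scores {20, 80, 85, 90, 95}, the mean is 74, so four of the five children are above the mean and only one is below it. The median is 85, with two children on either side.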

Sunday, March 21, 2010

The Poincaré Conjecture

I spent an entire Sunday afternoon reading up on the Poincaré conjecture, thanks to this link from Abi, and a few "heads up"s from friends on Facebook.

One thing led to another, and I stumbled upon this New Yorker article called "Manifold Destiny", by Sylvia Nasar (of A Beautiful Mind fame) and David Gruber. If you like stories of big egos and monumental leaps in science, doused with a certain tabloid appeal, this is an amazing read.

I thoroughly enjoyed it.

Wednesday, March 17, 2010

An IIT wingmate wins the Alan T. Waterman Award

One of my old wingmates from IIT Bombay, Subhash Khot, recently won the Alan T. Waterman award. He was also JEE AIR 1 in 1995, so his genius predates my acquaintance ;).

Hearty congratulations, Khot!

The top few JEE students really tend to be a class apart. I will write something separately about that.

PS: I had the weird fortune of being in the same wing as JEE 1 (Khot), JEE 10 (Bansal, now at IBM) and JEE 14 (Devavrath, now faculty at MIT and also PGM in 1999) during my freshman year.

Saturday, March 13, 2010

Links

1. Why does a salad cost more than a Big Mac? (via Flowing Data) Short answer: subsidies. Reality is somewhat more nuanced - since subsidies for soy and corn have been lumped into meat/dairy in this graph.

2. Canada: The country that pees together.  They sure take their hockey seriously!

3. Economic forecasts are BS! I recall Warren Buffett once remarked "Forecasts may tell you something about the forecaster, but they tell you nothing about the future."

Friday, March 12, 2010

A Fascinating Puzzle!

A colleague alerted me to this fascinating puzzle that is doing the rounds on the internet. I spent an hour trying to figure it out. It sounds pretty innocuous (but like all great puzzles, it is very nuanced):
In a country in which people only want boys every family continues to have children until they have a boy. If they have a girl, they have another child. If they have a boy, they stop. What is the proportion of boys to girls in the country?

The "official" answer is 50-50, although you may be inclined to think that such a country would overproduce girls.

Spoiler: Here's a link to a great answer and discussion. Try not to peek at it before you give it a shot.
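
If you would rather check the 50-50 claim numerically (spoiler, again), here is a rough C++ simulation sketch. It assumes every birth is independently a boy with probability 1/2. The names BirthCount and SimulateStopAtFirstBoy and the seed parameter are mine, for illustration only. Each family ends up with exactly one boy, and the expected number of girls before that boy is 0(1/2) + 1(1/4) + 2(1/8) + ... = 1, so the two totals should come out nearly equal.

```cpp
#include <cassert>
#include <random>

// Totals across all simulated families.
struct BirthCount {
  long boys;
  long girls;
};

// Each family keeps having children until its first boy, then stops.
// Assumption: each birth is a boy with probability 1/2, independently.
BirthCount SimulateStopAtFirstBoy(long families, unsigned seed) {
  assert(families > 0);
  std::mt19937 gen(seed);
  std::bernoulli_distribution is_boy(0.5);

  BirthCount count{0, 0};
  for (long f = 0; f < families; ++f) {
    while (!is_boy(gen)) {
      ++count.girls;  // a girl: the family tries again
    }
    ++count.boys;     // the first boy: the family stops
  }
  return count;
}
```

With a large number of families, count.girls divided by count.boys should hover close to 1, although any single run will be off by a little.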

Wednesday, March 10, 2010

ImageMagick

If you use a Linux machine, you have probably heard of or used ImageMagick before. It is a program that lets you manipulate pictures, and convert them to about a hundred different formats.

You can also use it directly from the command line.

So if you had a directory full of jpeg images, say, which you wanted to resize, and convert to png, you could write a very simple shell script.

If you like GUIs, you could open an image by saying:


$ display pic1.jpg

and then interacting with the GUI (which looks like something from the 80s, but is quite powerful).

It also lets you take screenshots (partial or complete) with the least amount of effort.

$ import pic.jpg

Use the mouse to select the portion you want to capture, and it gets saved in "pic.jpg"

Saturday, March 6, 2010

Salman Rushdie

I read this article on the freedom of speech, in the context of MF Hussain "choosing" Qatar (via Abi). It mentions Salman Rushdie (and several others) who've been punished for exercising their right to free speech, and got me thinking about the man who once remarked:
What is freedom of expression? Without the freedom to offend, it ceases to exist.
I first heard about "The Satanic Verses", when I was a school kid. Since the book was promptly banned in India, I picked it up much later, while in grad school in Ann Arbor.

I must confess that I truly enjoyed it.

Parts of the book are probably offensive to some people. Parts of it, especially where he heads off on tangents, are disorienting (like the movie Mulholland Drive).

Significant parts of it, however, are sheer genius.

I remember reading the following lines, and being compelled to pencil them down.
Question: What is the opposite of faith?
Not disbelief. Too final, certain, closed. Itself is a kind of belief.
Doubt.
I also got a chance to see him in person, when he was giving a talk in Ann Arbor. His political views, especially on matters such as freedom of speech, cultural relativism, and the state of Islam, struck me as very thoughtful.

I sign off, with one more Rushdie quote:
'The only people who see the whole picture,' he murmured, 'are the ones who step out of the frame.'

Tuesday, March 2, 2010

Dispersing points on the surface of a sphere

Dispersing or choosing points, uniformly or randomly, on the surface of a sphere is a task that someone like me, who does molecular simulations, runs into every once in a while. In some shape or form, this problem is encountered, for example, while modeling a 3D random walk, the motion of a Brownian particle in space, decorating a nanoparticle with stabilizers, etc.

While the task is not hard, there are some elegant methods and potential pitfalls that a person doing this for the first time should be aware of.

As I mentioned earlier, one may choose points randomly, or uniformly on the surface (of a unit sphere, here, but can be generalized trivially).

1. Random:

The key point here is that it is incorrect to choose points by selecting angles phi and theta corresponding to spherical coordinates from uniform distributions. This incorrectly concentrates points near the poles.

The correct method is to select two random numbers u and v from a uniform distribution, and construct the two angles as follows:

u = rand();              // uniform random number between zero and one
v = rand();              // uniform random number between zero and one
phi = 2 * pi * u;        // azimuthal angle; pi is 3.141...
theta = acos(2*v - 1);   // polar angle; acos is inverse cosine

This is explained with figures here.
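
For completeness, here is how the recipe above might look as compilable C++. This is only a sketch: the Point3 struct, the function name RandomPointsOnSphere, and the seed parameter are illustrative assumptions, and I have used std::mt19937 from the standard library in place of the generic rand() above.

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

// Hypothetical helper type for a point in 3D.
struct Point3 {
  double x, y, z;
};

// Draw n random points on the unit sphere, using
//   phi = 2*pi*u,  theta = acos(2*v - 1),
// with u and v uniform in [0, 1).
std::vector<Point3> RandomPointsOnSphere(int n, unsigned seed) {
  assert(n >= 0);
  const double kPi = std::acos(-1.0);
  std::mt19937 gen(seed);
  std::uniform_real_distribution<double> unif(0.0, 1.0);

  std::vector<Point3> pts;
  pts.reserve(n);
  for (int i = 0; i < n; ++i) {
    const double u = unif(gen);
    const double v = unif(gen);
    const double phi = 2.0 * kPi * u;               // azimuthal angle
    const double theta = std::acos(2.0 * v - 1.0);  // polar angle
    pts.push_back({std::sin(theta) * std::cos(phi),
                   std::sin(theta) * std::sin(phi),
                   std::cos(theta)});
  }
  return pts;
}
```

A quick sanity check is that the z coordinate should itself be uniformly distributed between -1 and 1, so roughly half the points should have |z| > 0.5. Multiply each coordinate by the radius for a sphere that is not of unit size.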

2. Uniform:

This is trickier than I thought when I first attempted to do it. Here is an old link (circa 1998, but still good!) which touches upon some of the subtleties. The trouble starts with what "uniform" means, for an arbitrary number of points to be distributed.

If we decide that uniform means a distribution which "maximizes the minimum separation" between n points on the sphere, we can start going somewhere.

There are a number of algorithms that can try to solve this problem, as described on this excellent page. Here is C++ code that I wrote while implementing the golden spiral rule from that page.

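//
// Spray n points uniformly on the surface of a sphere 
// of radius "radius"
//
// The positions are returned as an array of Vectors
// where, struct Vector { double x; double y; double z; };
//
// Based on algorithm from 
// http://www.xsi-blog.com/archives/115
//
#include <cmath>   // sqrt, cos, sin

// PI was not defined in the original listing; it is defined here
// so that the snippet compiles on its own (given the Vector struct).
const double PI = 3.14159265358979323846;

void SprayPointsSphere(int n, double radius, Vector * p)
{
  double inc = PI * (3.0 - sqrt(5.0));  // golden angle, in radians
  double off = 2.0/n;                   // spacing between successive y values

  for(int k = 0; k < n; k++) {
    double y   = k * off - 1 + (off/2); // y steps evenly from -1 to 1
    double r   = sqrt(1 - y*y);         // radius of the circle at height y
    double phi = k * inc;               // turn by the golden angle each step

    p[k].x = cos(phi)*r * radius;
    p[k].y = y * radius;
    p[k].z = sin(phi)*r * radius;
  }
}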
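
Since "uniform" was defined above in terms of the minimum separation between points, one simple way to compare point sets is to measure that separation directly. The brute-force helper below is a sketch: the name MinSeparation is hypothetical, and the Vector struct is repeated only so that the snippet stands on its own.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Same layout as the Vector struct described in the listing above.
struct Vector {
  double x;
  double y;
  double z;
};

// Smallest straight-line distance between any two points in p.
// Checks every pair, so the cost grows as n squared.
double MinSeparation(const std::vector<Vector>& p) {
  assert(p.size() >= 2);
  double best = std::numeric_limits<double>::infinity();
  for (std::size_t i = 0; i < p.size(); ++i) {
    for (std::size_t j = i + 1; j < p.size(); ++j) {
      const double dx = p[i].x - p[j].x;
      const double dy = p[i].y - p[j].y;
      const double dz = p[i].z - p[j].z;
      best = std::min(best, std::sqrt(dx * dx + dy * dy + dz * dz));
    }
  }
  return best;
}
```

For the same number of points, a larger minimum separation means a more evenly spread set. The all-pairs loop is fine for a few thousand points, but a spatial grid or tree would be a better choice for much larger sets.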