Friday, August 30, 2013

Zen Pencils: Bill Watterson's advice

Bill Watterson has been my hero for a long time (Calvin and Hobbes is probably the only "complete collection" I possess). 

Here's a beautiful "comic tribute" from Zen Pencils (which is an amazing "comic" blog in its own right), on his famous 1990 speech at Kenyon College.

I think this part of the speech has bits that apply to academia:
We're not really taught how to recreate constructively. We need to do more than find diversions; we need to restore and expand ourselves. Our idea of relaxing is all too often to plop down in front of the television set and let its pandering idiocy liquefy our brains. Shutting off the thought process is not rejuvenating; the mind is like a car battery-it recharges by running. 
You may be surprised to find how quickly daily routine and the demands of "just getting by" absorb your waking hours. You may be surprised to find how quickly your life becomes a matter of habit rather than thought and inquiry. You may be surprised to find how quickly you start to see your life in terms of other people's expectations rather than issues. You may be surprised to find out how quickly reading a good book sounds like a luxury.
The end of the speech is, of course, great!
But having an enviable career is one thing, and being a happy person is another. 
Creating a life that reflects your values and satisfies your soul is a rare achievement. In a culture that relentlessly promotes avarice and excess as the good life, a person happy doing his own work is usually considered an eccentric, if not a subversive. Ambition is only understood if it's to rise to the top of some imaginary ladder of success. Someone who takes an undemanding job because it affords him the time to pursue other interests and activities is considered a flake. A person who abandons a career in order to stay home and raise children is considered not to be living up to his potential-as if a job title and salary are the sole measure of human worth. 
You'll be told in a hundred ways, some subtle and some not, to keep climbing, and never be satisfied with where you are, who you are, and what you're doing. There are a million ways to sell yourself out, and I guarantee you'll hear about them. 
To invent your own life's meaning is not easy, but it's still allowed, and I think you'll be happier for the trouble. 
Reading those turgid philosophers here in these remote stone buildings may not get you a job, but if those books have forced you to ask yourself questions about what makes life truthful, purposeful, meaningful, and redeeming, you have the Swiss Army Knife of mental tools, and it's going to come in handy all the time.

Tuesday, August 27, 2013

Lengthscales Animation

A very nice interactive animation of length scales called "the Scale of the Universe". You can even turn your speakers on, as you travel from the smallest to the largest length scales.

There used to be another video (I cannot seem to locate it right now) which essentially did the same thing, except it was not interactive. The cool part of that video was that it showed quite vividly how most of space is empty at both ends of the lengthscale spectrum (atoms and outer space), and how matter is just a tiny sprinkling of discrete dust, almost an afterthought.


Sunday, August 18, 2013

Block Averaging: Estimating Uncertainty (Part 2)

In the last post, we ended with a conundrum.

To summarize:

When we have a timeseries of independent samples, estimating the mean and the uncertainty of that estimate is straightforward. The central-limit theorem tells us everything we need to know:  \(\sigma_{\langle x \rangle} = \sigma_x/\sqrt{N}\).

However, when successive data points are correlated, they tell us "less" than uncorrelated data points would. If we ignore this diminished information content, and pretend the points are as good as uncorrelated samples (by lavishing the central-limit theorem on them), the computed \(\sigma_{\langle x \rangle}\) underestimates the true uncertainty in \(\langle x \rangle\).

In the previous example, we found  \(\langle x \rangle = 4.686 \pm 0.057\).

The problem with this result is not necessarily that the mean is not as close to "5" as we would like, but that the error-bar (standard deviation) gives us a false sense of security.

Note that in this example, we knew the true answer. In a molecular simulation, we don't know the true answer, and it is easier to be seduced by small error-bars.

In short, the real problem with treating correlated data as if it were uncorrelated is that the analysis above underestimates the true error-bar.

Now let us see what we can change. In "the real world" we cannot change the fact that the data are going to be correlated, or that the estimated mean is not as close to the "true" mean as we would like, unless we are prepared to carry out extraordinarily long simulations (or extraordinarily many independent simulations). Thus, we set ourselves a fairly modest task.

We want a more reliable estimate of the uncertainty in \(\langle x \rangle\). Here is where a trick called block averaging can be surprisingly helpful. This one slide from David Kofke's lecture notes on molecular simulation explains the whole idea.

The idea is to chop the data series into \(n\) chunks or "blocks" of size \(b\) each, so that \(nb = N\). We compute the mean of each block, \(\bar{m}_i, ~ i = 1, 2, ..., n\), and then the standard deviation of these block means, \(\sigma_{\bar{m}_b}\). The subscript "b" denotes that this standard deviation is for blocks of size \(b\); the whole calculation is repeated for a range of block sizes.

The simulation error is estimated as \(\sigma_{\langle x \rangle} = \sigma_{\bar{m}_b}/\sqrt{n-1}\). Luckily for you, I wrote an Octave script which does this calculation for you.
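
In case that link ever dies, here is a bare-bones sketch of the calculation (a minimal reconstruction, not the linked script itself; it assumes the time series x is a column vector):

% blockavg: block-averaging error estimate for a correlated time series.
% Returns the error estimate sigma_m(i) for each block size bsizes(i).
function [bsizes, sigma_m] = blockavg(x)
  N = length(x);
  bsizes = (1:floor(N/4))';                 % block sizes; keep at least 4 blocks
  sigma_m = zeros(size(bsizes));
  for i = 1:length(bsizes)
    b = bsizes(i);
    n = floor(N/b);                         % number of complete blocks
    m = mean(reshape(x(1:n*b), b, n), 1);   % row vector of the n block means
    sigma_m(i) = std(m)/sqrt(n-1);          % error estimate for this block size
  end
end

Calling [b, s] = blockavg(x) and plotting s against b (semilogx works well) produces the curve discussed next.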

If you plot this estimate versus the block size "b", you get a curve that gradually increases and then plateaus out. For the correlated dataset from the previous post, we get:

We get a plateau estimate of \(\sigma_{\langle x \rangle} \approx 0.296\), which implies a much more consistent "1 sigma" estimate of \(\langle x \rangle = 4.686 \pm 0.296\).

As you can see, block averaging reflects the true uncertainty in the data, which is larger than that implied by a mindless application of the central-limit theorem.

What happens if we subject the uncorrelated data from the previous post to the block-averaging treatment?

See for yourself:

The standard deviation doesn't change with block size.

Wednesday, August 14, 2013

Block Averaging: Estimating Uncertainty (Part 1)

When you periodically sample a property (e.g., thermodynamic attributes such as pressure, or static properties such as the radius of gyration of a polymer molecule) in "equilibrium" molecular dynamics or Monte Carlo simulations, you end up with a time series of that property.

For specificity, let us call this property \(x(t)\), where \(t = 1, 2, ..., N\) labels discrete time steps (it is easy to adapt this discussion to cases where time is not discrete). Usually, the sampling interval is shorter than the characteristic relaxation time of the system, which makes successive measurements of \(x\) "correlated" or "non-independent".

The weather or the stock market on successive days is correlated, because the timescale of a weather system or a business news item is often a few days; successive tosses of a fair coin or a pair of dice, however, can be expected to be uncorrelated.

Uncorrelated data points are nice and desirable because they save us a lot of trouble. We can not only find the average value \[\langle x \rangle = \frac{x(1) + x(2) + ... + x(N)}{N},\] of the property easily, but also associate an error bar or standard deviation with it.

Thus, if \(\sigma_x\) is the standard deviation of \(x\), then the central-limit theorem tells us that \(\sigma_{\langle x \rangle} = \sigma_x/\sqrt{N}\).

Let us consider a simple example. I generated a time series of N = 1000 samples with a mean value of "5" using the Octave command x = 5 + 2 * rand(N,1) - 1, which adds a uniform random number between -1 and 1 to a baseline of 5.
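
Put together, the little experiment looks something like this (a sketch; the exact numbers below came from one particular realization of the random numbers):

N = 1000;
x = 5 + 2*rand(N,1) - 1;   % uniform noise in [-1, 1] around a mean of 5
sigma_x = std(x)           % should be close to 1/sqrt(3) ~ 0.577
err = sigma_x/sqrt(N)      % central-limit estimate of the error in the mean
mean_x = mean(x)           % should be close to 5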

I plot the time-series (or vector, in Octave parlance) and its autocorrelation function, computed via this script, in the following figure.
As expected, \(\sigma_x = 0.575 \approx 1/\sqrt{3}\) (the standard deviation of a uniform random variable on [-1, 1]), and \(\langle x \rangle = 4.983 \pm 0.018\), where the standard deviation of the mean is 0.575/sqrt(1000). This gives a narrow "1 sigma" range for the mean, between 4.965 and 5.001, which brackets the true mean value of "5", and everything is hunky dory.

We also see the autocorrelation function decay quickly from 1 to zero. This is a hallmark of an uncorrelated process. If successive samples were correlated, the autocorrelation function would take its own sweet time to fall to zero (as we shall see in an example shortly). The time it takes to fall to zero, or "decorrelate", is a measure of how correlated the data are.
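
As an aside, the normalized autocorrelation plotted above can be computed along these lines (a minimal sketch, not necessarily the linked script):

% acf: normalized autocorrelation c(k+1) = C(k)/C(0) for lags k = 0..maxlag.
% Assumes maxlag < length(x).
function c = acf(x, maxlag)
  x = x - mean(x);           % work with fluctuations about the mean
  c = zeros(maxlag+1, 1);
  for k = 0:maxlag
    c(k+1) = sum(x(1:end-k) .* x(1+k:end)) / sum(x.^2);
  end
end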

Now let me make up a correlated process. In the following example, I set x(1) = 0 and x(t+1) = 0.95 * x(t) + (2 * rand() - 1), and then shifted all the values by "5". This is an autoregressive model, but we don't have to know anything about that here.
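
Spelled out as a loop, the recipe looks like this (a sketch of the prescription above; each call to rand() draws a single uniform number):

N = 1000;
x = zeros(N, 1);
for t = 1:N-1
  x(t+1) = 0.95 * x(t) + (2*rand() - 1);  % each value remembers 95% of the last
end
x = x + 5;                                % shift the series to fluctuate about 5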

The time-series and its autocorrelation function look as follows:

You can "see" the correlation in the data-series visually. The series fluctuates around "5" (not completely obvious) in a qualitatively different manner from the uncorrelated series. This is apparent in the autocorrelation function, which now decays to 0 around \(t = 50\). The red line is the autocorrelation of this particular dataset, and the black line is the average over 100 such datasets.

Here \(\sigma_x = 1.791\), and naively using \(\sigma_{\langle x \rangle} = \sigma_x/\sqrt{N}\), we get \(\langle x \rangle = 4.686 \pm 0.057\). This gives a "1 sigma" range for the mean between 4.629 and 4.743, which does not bracket the true mean value of "5". The "2 sigma" or 95% interval is not much better.

Clearly something is amiss here. We'll find out more in the next installment.


Saturday, August 10, 2013

Links

1. Should Humans Eat Meat? Vaclav Smil presents a thumbnail sketch in this Scientific American article.
... is it possible to come up with a comprehensive appraisal in order to contrast the positive effects of meat consumption with the negative consequences of meat production and to answer a simple question: are the benefits (health and otherwise) of eating meat greater than the undesirable cost, multitude of environmental burdens in particular, of producing it?
2. Is Dawkins hurting Atheism? Martin Robbins contends that the movement is maturing, and no longer needs the forceful and erratic professor.
@RichardDawkins is the increasingly erratic comedy creation of a bored Oxford Professor called Richard Dawkins. One of the best science writers of the last few decades, Dawkins has succeeded in crafting an online character that ironically parodies the more militant tendencies in capital-A Atheism, serving as a useful reminder for all of us to be more nuanced and tolerant.

Thursday, August 8, 2013

EconTalk, Pallotta, Charity etc.

I came across Dan Pallotta through an interview he did with Russ Roberts on EconTalk. Here is an interesting TED talk which presents his thesis on charity and the non-profit sector.

Interesting!

Sunday, August 4, 2013

Dangerous Code

Michael Lewis tells the chilling story of Sergey Aleynikov in Vanity Fair:
Then he explained what he knew, or thought he knew: in April 2009, Serge had accepted a job at a new high-frequency-trading shop called Teza Technologies, but had remained at Goldman for the next six weeks, until June 5, during which time he sent himself, through a so-called “subversion repository,” 32 megabytes of source code from Goldman’s high-frequency stock-trading system. The Web site Serge had used (which has the word “subversion” in its name) as well as the location of its server (Germany) McSwain clearly found highly suspicious. He also seemed to think it significant that Serge had used a site not blocked by Goldman Sachs, even after Serge tried to explain to him that Goldman did not block any sites used by its programmers, but merely blocked its employees from porn and social-media sites and suchlike. Finally, the F.B.I. agent wanted him to admit that he had erased his “bash history”—that is, the commands he had typed into his own Goldman computer keyboard. Serge tried to explain why he had done this, but McSwain had no interest in his story. “The way he did it seemed nefarious,” the F.B.I. agent would later testify. [Emphasis mine]
Wikipedia has some additional details on this extremely bizarre case, which I just don't seem to get!

Friday, August 2, 2013

Timescale is Everything!

In my last post, I waxed lyrical about my field of rheology in an attempt to show that the boundary between solids and liquids is fuzzier than you might think.

One other thing that rheology teaches you to appreciate is the importance of timescales. To rheologists the "pitch-drop experiment" is exciting; the difference between something that flows and something that doesn't is patience.

In fact, there is a widely used dimensionless number called the Deborah number, the ratio of a material's relaxation time to the timescale of observation. It is named after the prophetess Deborah, who said "The mountains flowed before the Lord".

Presumably "the Lord" works on much longer timescales.

All this is a long prelude to some amazing videos by the Slow Mo Guys. They film things squishing, popping, dropping, or exploding with a high-speed camera, and set the footage to music. The result is beauty, and a deep appreciation for everyday phenomena that we miss because we sense and process visual data at "faster" time scales.

Here are specific links to a few that I really enjoyed: bubble bursting, underwater bullets, exploding water-melons and rubber bands, droplet collisions, etc.

Heck! Check them all out.