Tuesday, December 30, 2014


(i) Vaccines work; and here are the facts in comic form (medium.com)

(ii) On a related note, Steve Novella discusses the evolution of pertussis (whooping cough) bacteria:
One can already argue that it is everyone’s duty to get vaccinated, not only to protect themselves but to contribute to herd immunity for everyone. We can now reasonably argue that this duty extends to minimizing resistance to the existing vaccines. Non-compliance can not only lead to outbreaks, but to diminishing effectiveness of the vaccines for everyone.
(iii) Yet another example Goodhart's law, "When a measure becomes a target, it ceases to be a good measure." Did Northeastern University game college rankings? Or did it just excel at something that every university tries to do?

(iv) NPR's story about fear-mongering by the Food Babe elicited a response at her site. David Gorski at science-based medicine offers a useful summary in his response to her response.

(v) While on the matter of food, here's Scott Adams on how you may want to think about atoning for the "eating season" we are currently in.

Monday, December 29, 2014

Python for Matlab Users

As part of our MCMC seminar series this semester, we've adopted Python as the language that we will all write code in. While I've spent plenty of time programming in Octave and Matlab, I have only dabbled with Python here and there.

This time, I thought I'd make a concerted effort to get past that initial hump.

You know, you got to roll a snowball to a certain size, before it launches an avalanche.

Here are resources that I found useful:
  1. Introduction to Python and Problem Solving with Python (pdf) by Sophia Coban (H/T Walking Randomly). The first presentation introduces you to Python in general, the second talks more about modules. More importantly it also compares array and matrix operations in Matlab and numpy.
  2. Another simple introduction (pdf) to numpy and scipy by M. Scott Shell at UCSB.
  3. If you don't want a PDF link, here is a tentative numpy tutorial in HTML.

Monday, December 15, 2014

Tyco Brahe: Who says Scientists are Boring?

An entertaining portrait of "history's strangest astronomer".
Still, his life seems almost dry and tedious compared to his mysterious death. He died of a sudden bladder disease in 1601 while at a banquet in Prague. He was unable to urinate except in the smallest of quantities, and after eleven days of excruciating agony he finally died. At least, that's the official story. It's possible he actually succumbed to mercury poisoning, as later researchers have detected toxic quantities of the substance on his mustache hairs. 
In order to shed some more light on this, his remains were recently exhumed for further medical study. Assuming the researchers find more evidence of mercury on his bone and hair samples, there are two possibilities. If there's evidence of longer-term exposure, then he likely ingested the mercury accidentally during the course of his experiments. If, on the other hand, the mercury can only be found right at the roots of his hair, then that would indicate he was given one big fatal dose of mercury. And that means...murder!
I am not sure how much to drama has been added in the article, though. I thought Tyco Brahe died a bizarre, but less malicious, death as echoed in the wikipedia entry:
Tycho suddenly contracted a bladder or kidney ailment after attending a banquet in Prague, and died eleven days later, on 24 October 1601. According to Kepler's first hand account, Tycho had refused to leave the banquet to relieve himself because it would have been a breach of etiquette.
The same entry entertains, but quickly dismisses, the mercury poisoning story, by suggesting that the mercury could have come from the metal noses he wore.

Further down,
In life, Brahe had jealously guarded his data, not even letting his prized pupil Johannes Kepler gain access.

That all changed upon his death, as Kepler took advantage of the confusion to take possession of the data, something he himself later admitted was not entirely ethical:

"I confess that when Tycho died, I quickly took advantage of the absence, or lack of circumspection, of the heirs, by taking the observations under my care, or perhaps usurping them." 
With that data in hand, Kepler was able to move astronomy further forward than anyone before him, becoming what Carl Sagan would later call "the first astrophysicist and the last scientific astrologer."

Thursday, December 11, 2014

What the Deal with Oil?

A fascinating animated infographic on how the domestic demand for oil (as a percentage of GDP) has fallen, even as production has increased, the average car has become more fuel efficient, and the shift to renewables has begun in earnest.

Check it out at Bloomberg

Monday, December 8, 2014


1. Formula versus Breast Feeding
The Reality Check explores this unnecessarily controversial topic. The show notes are extensive, and worth a look. The bottomline is that difference between breast milk and formula is usually overstated by its proponents, and exacts a heavy emotional toll from mothers who have trouble breast-feeding
2. Sandy, Issac and the Red Cross
ProPublica and NPR presents a damning expose'.

3. Scilab and Textbook Companion Project (H/T Michael Croucher)
Prof. Kannan Moudgalya, a former teacher or mine at IIT Bombay, and his team ported code from a gazillion engineering textbooks to Scilab. Impressive feat of endurance!

Thursday, December 4, 2014

Newcomb's paradox

I learnt about Newcomb's paradox recently. Wikipedia has a nice post on it.
The player of the game is presented with two boxes, one transparent (labeled A) and the other opaque (labeled B). The player is permitted to take the contents of both boxes, or just the opaque box B. Box A contains a visible $1,000. The contents of box B, however, are determined as follows: At some point before the start of the game, the Predictor makes a prediction as to whether the player of the game will take just box B, or both boxes. If the Predictor predicts that both boxes will be taken, then box B will contain nothing. If the Predictor predicts that only box B will be taken, then box B will contain $1,000,000.
The Predictor is almost infallible.

The range of possibilities are (from Wikipedia):

1. One can say that "A and B" is a superior choice, because given a predicted choice (which one can't control) it offers a better payout.

If the Predictor was not very reliable, then this would certainly be the better choice.

2. One can say that "B only" is a better choice, because the Predictor is almost always right. Thus, the probability of a mismatch between predicted and actual choices is so small that we might ignore it. Therefore, one should look at only the first and last rows in the table above, and "B only" offers a higher payout.

If the Predictor very perfectly reliable, then this would certainly be the better choice.

There is a lot of commentary and nuance to this topic, so go google it.

Sunday, November 23, 2014

Shooting at Florida State University

On Thursday morning, I woke up groggy in a hotel room in Atlanta, after my phone buzzed for the third time. It was 6am, and I was at the AIChE annual meeting.

In a few minutes, I found out about the shooting at Strozier Library on campus.

Shortly after midnight, a gunman, later identified as Myron May, had opened fire and injured three unsuspecting students, whose only fault was that they were in the wrong place at the wrong time. The library had been unusually packed, due to exams and project deadlines that mark a semester rolling to its end.

Obviously, this one struck really close. I stroll past Strozier library and Landis Green almost daily, because it is one of the most beautiful parts of the campus.

Even as the University is struggling to make peace with this senseless random act, a cloud of confusion envelops me.

Personally, I am firmly anti-gun. There are a few things, I am absolutely clear about: (i) assault style rifles have no place in a civilized society, (ii) some background checks/licensing are absolutely needed to prevent criminals and psychotics from picking up a gun from the nearest Walmart.

An outright national ban on guns would probably please me, but it is politically difficult. I also haven't sorted out this issue in my mind completely, and a number of discussions I've had with friends and colleagues have left me with a lot of nuance.

For example, I recently heard the argument that gun laws should be local: the rules in downtown New York City need not be the same as the rules in a rural Minnesota village, where hunting is woven into the fabric of the society. At face value, this certainly seems to be a reasonable proposition. Furthermore, banning firearms from a jurisdiction is probably not going to deter a criminal, who is bent on breaking the law in any case. Background checks can help, but someone could buy a weapon when they are sane, and retain the weapon, unless gun licenses are renewed annually.

But perhaps, the issue gnawing at me most uncomfortably is whether all this talk about banning guns lets us avoid looking at the issue of serious mental illness. In all of the recent school and campus shootings, guns and mental illness have co-conspired to create a deadly cocktail. Guns seem like an issue that can be divided into a neat binary position - you have them or you don't.

Mental illness, on the other hand, is a much more complicated. It is already stigmatized, and a part of me worries that when the spotlight is turned towards the issue, people will say "put all these loonies away", or "snatch away their guns", rather than having a honest discussion of how to restructure community and safety nets so that mentally ill people can get the help they deserve.

In summary, there are a few things I feel certain about. I would rather see "passionate incrementalism" (a phrase I learned recently and have grown to love) to move the issue forward in little steps, rather than attempting impossibly large leaps that circle us back to the beginning.

Wednesday, November 19, 2014

Longform Articles

1. The Empire of Edge: How  a doctor, a trader, and the billionaire Steven A. Cohen got entangled in a vast financial scandal. (New Yorker)

2. The Empire Reboots: Can CEO Satya Nadella save Microsoft? (Vanity Fair)

3. Beyond the Bell Curve: A beautiful exposition of the mystery of a universal statistical law. (Simons Foundation)

Sunday, November 9, 2014

New Links

1. A short history of important equations including the ideal gas law, Fourier transforms, and the wave equation (the Guardian)

2. The experiment Galileo would have loved to see:

Or perhaps not!

3. The problem with Nate Silver (Matt Yglesias)

4. An interesting puzzle (Michael Lugo): Given 5 digits construct a three digit and a two digit number so that the product is maximized.

Friday, November 7, 2014

Screening for Down's Syndrome

In our discussion on Bayes theorem in the seminar yesterday, I brought up a personal anecdote. During my wife's first pregnancy, she was offered the choice of taking an integrated test to screen for Down's syndrome in the fetus.

I looked up the numbers for the accuracy and false positive rates and found that they were about 95% and 5% respectively (somewhat of a coincidence that these numbers add up to 100%).

The baseline rate of the syndrome steadily increases with the age of the mother.

For a 25 year old mother, it is 0.0001 (1/1100).
For a 35 year old mother it is 0.004 (1/250).
For a 45 year old mother it is 0.05 (1/20).

You can run these numbers through one of the online calculators I wrote about yesterday.

If the test is positive, then the posterior probabilities are again a function of age:

For a 25 year old mother, it is 1.7%.
For a 35 year old mother it is 7.1%.
For a 45 year old mother it is 50%.

Thus, at that time I concluded that the taking the test would only have been meaningful if my spouse were around 45 years old.

For young mothers, even a positive test result is not of particularly great practical value.

Wednesday, November 5, 2014

Bayes Theorem: Interactive Modules

In our undergrad seminar, we have been reading about Bayes Theorem from a very nice post by Eliezer Yudkowsky. However, most of the Java applets on the page don't seem to work (or at least I couldn't not get them to work!).

Fortunately, thanks to Geogebra, there are multiple interactive HTML5 "applets" which work straight away in any modern browser. If you have Geogebra on your system, you can download and modify the applet as well.

Here is a link to "Exploring Bayes' Theorem"

If you like "tree" based descriptions better, here is another applet/worksheet.

Tuesday, November 4, 2014

Wason Selection Test

I learned about this interesting logic puzzle yesterday. Here's how wikipedia poses it:
You are shown a set of four cards placed on a table, each of which has a number on one side and a colored patch on the other side. The visible faces of the cards show 3, 8, red and brown. Which two cards must you turn over in order to test the truth of the proposition that if a card shows an even number on one face, then its opposite face is red?
 Apparently 90% of the people fail.

The same article goes on to mention that given the right social context most people make the correct choice.
For example, if the rule used is "If you are drinking alcohol then you must be over 18", and the cards have an age on one side and beverage on the other, e.g., "16", "drinking beer", "25", "drinking coke", most people have no difficulty in selecting the correct cards...

Friday, October 31, 2014

Compound Interest and Paying Down a Mortgage

Suppose, you take out a \(T\) period, fixed rate (rate = \(i\)) mortgage for a house worth \(B\). We want to find out how much your fixed monthly payment \(D\)?

To find the answer to this problem consider two independent "thought" buckets.

In one bucket, you have your loan $B grow without payment.

In the other bucket, you regularly contribute $D every year (I am using a yearly interest rate - modifications for monthly payments are straightforward).

Left unattended, at the end of \(T\) years, the first bucket balloons to \(S^- = B(1+i)^T\), and the second bucket to \(S^+ = D [(1+i)^T - 1]/i\).

If we set things up just right (choose the correct D), the amount we owe (bucket 1), might be just about equal to the amount we have in bucket 2.

Mathematically, we can impose \(S^+ = S^-\), to obtain this "just right" monthly payment. Solving for \(D\), we get,\[D^* = B \frac{i}{1-(1+i)^{-T}}.\] Suppose \(B\) = 100,000, \(i\) = 0.05/12 (monthy interest rate), and \(T = 360\) (30 years), we get \(D^* = $536.82\).

Bucket 1
Bucket 2
Net of Bucket 1 and 2
If you used a higher D, then you would end up with a net excess, and vice versa if you used a smaller D.

Tuesday, October 28, 2014

Compound Interest, Loans, and Saving for Retirement

Suppose you borrow $B at time t=0 at an annual interest rate i. If you don't make any payments, the loan compounds, and at the end of T years ballons to \(B (1 + i)^T\).

As a concrete example, if you borrow B=$10,000 at 5% (= 0.05) for 10 years, your debt expands to nearly $16,289 at the end of the loan term, if you don't make any payments.

Nobody likes to ponder sad eventualities, so lets consider the more optimistic problem of saving for retirement. In a subsequent post, we will connect these two problems: irresponsible borrowing, and prudent saving, for how to responsibly pay down a home mortgage, for example.

Consider you make constant regular payments of $D to your retirement account every year from t=1 to t=T.

If we assume that our money grows at the same constant rate of i=0.05, we want to figure out how much we end up with after T years.

The first installment has (T-1) periods to grow over. So it grows to $\(D (1+i)^{T-1}\).

Similarly the second installment has (T-2) periods to grow over; so it grows to $\(D (1+i)^{T-2}\).

So on and so forth.

Thus, the total amount of money at the end of T years is given by,\[S^+ = $D(1+i)^{T-1} + $D (1+i)^{T-2} + ... + $D.\] This is a simple geometrical series, which conveniently sums up to,\[S^+ = $D \frac{(1+i)^T-1}{i}.\] To finish off this post, and set up the stage for how to responsibly pay down loans, let me plot the increase in the loan amount as a function of time (I am going to call the number \(S^-\)), and the growth of the retirement account.

Again, using B = 10,000, i = 0.05, and T = 10:

Using D = 100, i = 0.05, and T = 10:
In this case, we end up with $1,257.5, out of which only $1,000 were contributions, and the rest was compound interest.

Saturday, October 25, 2014

Setting up SpellCheck in TeXmaker

As the official documentation suggests, you go to "Configure Texmaker" -> "Editor" -> "Spelling dictionary" and set up the location.

The default location on my Linux distribution was: /usr/share/myspell/en_GB.dic

As the wikipedia entry says:
MySpell was the former spell checker included with OOo Writer of the free OpenOffice.org office suite.
The current spell checker of choice for OpenOffice is Hunspell. We can configure TeXmaker to use this dictionary by pointing to the location: /usr/share/hunspell/en_US.dic

Note, you can try the command locate hunspell to figure out where the "dic" file rests on your installation.

Sunday, October 19, 2014


1. Alcohol consumption in America (Slate.com)
The numbers in the survey surprised me on the downside. 30% of Americans don't drink. The median American has one drink every two weeks. The top decile consumes 10 drinks/day.
2. Is common Common-Core really bad?
You may have seen Facebook posts of frustrated parents (for e.g). However, I think the standards themselves are very decent. Some of the ideas might seem foreign, but I have to admit I am impressed by them. Here is coverage from the American RadioWorks and Intelligence Squared US Debates.
3. 15 Questions to see how logical you are. For example,
No introverts are charismatic
All philosophers are introverts
Therefore, no philosophers are charismatic

Friday, October 17, 2014

Education, Second Chances, and Transformations

For some random reason today, I was reminded of a story that happened when I just started my current academic position.

I was a very eager assistant professor teaching thermodynamics to undergraduates. I had not yet been bathed in the richness of personal struggles that many students dragged with them into the classroom.

So after my first class, this kid - let's call him MJ - walked into my office. He delivered what seemed like a prepared two-minute talk on how he was going to focus, work hard, and turn things around that semester.

After he left, I looked up his past grades, and found that he was struggling with a GPA of less than 2.5, and an academic history that was littered with Cs and Ds. 

I did not think much about the incident, until the semester had picked up some steam. MJ showed up during office hours regularly. Although he was rusty in multiple areas, his dogged effort was palpable.

Over time, I got to know a little more about the social structure in which he had been surrounded - an ugly milieu that was riddled with gangs, drugs, and far worse - apathy.

What MJ lacked in mathematical ability, he tried to make up with logic, unconventional thinking, and tenacity.

He had that "born-again" zeal you can sometimes see in people who are given second chances, who have determined that this is it. It is now or never!

Thanks to the compounding effect of knowledge, he started making rapid progress. Within a month, I realized that he really had a good shot at turning this thing around, and actually found myself rooting for him. I really wanted it to end well for him.

He finished the course with a well-deserved A. I found out later that he had gotten As in most of his other classes that semester.

I watched him land a well-paying job a year later.

I saw him once more after that, when he came to recruit for "his" company, and caught up with him. He was doing well, and he had dramatically changed his family's trajectory.

The lesson: saving or helping one individual might not make a big difference to the world, but it does make a world of difference to that one individual.

Tuesday, October 14, 2014

An underflow problem in slice sampling

In a MCMC seminar that Peter and I run, we were recently discussing slice sampling. Here is an overview from Radford Neal himself (open access). Let me skip the details of the algorithm, and pose the particular problem we encountered.

Problem: Compute the energy \(U^*\) of a system. We don't have any idea of the magnitude of this energy, or its sign. Suppose the probability of the system to be in that state is \(\exp(-U^*)\).

This number is going to be extremely small if \(U^*\) is large (which is probable, if you don't work hard at estimating a good initial guess).

The slice sampling algorithm asks us to draw a number \[y \sim U(0, \exp(-U^*)),\] and compute the difference \[\Delta U_y = -\log(y) - U^*.\]
Straightforward Implementation:

The following Octave program computes 10,000 samples of \(\Delta U_y\) to produce the histogram

% Small Ustar = 10, no problems
Us  = 10;
num = 10000;
y   = rand(num,1) * exp(-Us);
Uy  = -log(y);
dUy = Uy - Us;
[hx, x] = hist(dUy,25);
hx = hx/num;
bar(x, hx)
Histrogram of dUy with U*=10 is an exponential distribution
Unfortunately if I set Us = 1000 in the program above, it fails. The failure is simple to parse. If Us is large, exp(-Us) is below "realmin" in double precision, and is approximated to zero, which spawns a cascade of disastrous events, ending in the logarithm of zero.

There are multiple ways of fixing this problem.

Solution 1:

Since we are really interested in \(\Delta U_y\), we can draw it directly from the correct distribution. In this case, a little analysis suggests that the distribution of \(\Delta U_y\) is an exponential distribution with parameter = 1. Hence, we can skip the auxillary variables \(y\) and \(U_y\) and go directly to the meat of the problem:

Us  = 1000;
num = 10000;
dUy = exprnd(1,num,1);

Solution 2:

Since we observe that \(U^*\) doesn't participate in the distribution, we can think of another method that does not directly involve sampling from an exponential random number generator. We define \(y' \sim U(0,1),\) and \(y = y' * \exp(-U^*)\).

We can write \(-\log(y) = -\log(y') + U^*\).

This implies  that  \(\Delta U_y = -\log(y) - U^* = -\log y'\).


yp  = rand(num,1);
dUy = -log(yp);

also works well. The advantage is that one doesn't need an special exponential random number generator.

On my computer, solution 2 was more efficient than solution 1 for num < 50000, after which solution 1 became more efficient.

Thursday, October 9, 2014


1. How Bayesian thinking might explain how children build their worldview (Massimo Fuggetta).

2. Fuck Yeah Fluid Dynamics Tumblr Page

3. So you aren't scared of the Yellowstone supervolcano, huh?

4. What college has the most billionaires (Barry Ritholtz)?

5. Academic Humilty (Agabus)

Monday, October 6, 2014

Kahan or Compensated Summation

Wikipedia describes this as follows:
In numerical analysis, the Kahan summation algorithm (also known as compensated summation [1]) significantly reduces the numerical error in the total obtained by adding a sequence of finite precision floating point numbers, compared to the obvious approach. This is done by keeping a separate running compensation (a variable to accumulate small errors).
Here are a couple of links (postscript document) which argue that this relatively simple method ought to be better known.

An Octave program to sum \(s = x_1 + x_2 + ... + x_n\) is (embedded):

Tuesday, September 30, 2014

Identify the Thief

We are terrible at remembering faces shown to us fleetingly. This video, for example, admirably demonstrates this effect: 4/5 people pick the wrong guy!

On a trip from Denver, we were unfortunate enough to see this play out in practice. We were returning from a very satisfying trip to Yellowstone National Park. After returning our rental van (4 adults and 2 kids), we boarded an Avis bus that was supposed to take us to the terminal. We were tired, anxious to get back home, and somewhat distracted.

There is always plenty of distraction when travelling with two little kids.

There were three other people in the same Avis bus: two guys and a lady. One of the guys (who turned out to be a thief) even dropped something during the trip. I watched him get out of his seat, pick it up, and get back in.

After the bus stopped, one of the guys and the lady got off, and I almost bumped into the thief, and politely gave him the right of way. He said, "after you, please", and I got off, and ran to a get a trolley to haul our luggage.

When I returned, I saw that one of our backpacks was missing. I tried running into the airport terminal to find this guy, but I'm embarrassed to confess that seriously I only had the vaguest idea what he looked like.

And I had two "good" looks at him.

This memory effect is especially not good when you're in a sea of people at Denver International Airport.

In any case, we lost a couple of passports, which led to some irritation, and an old digital camera.

More significantly, I  lost a little more trust in strangers, which unfortunately, is less easy to replace.

Friday, September 26, 2014

Interesting Links

1. How MOOCs may have the last laugh after all. (Aswath Damodaran)

2. Murray Gell-Mann: TED talk on Beauty, Truth and Physics

3. The magic of decaffeination: (YouTube video)

4. The link between artificial sweeteners and diabetes: Steve Novella explores problems in a recent study.

5. Online books on Ordinary Differential Equations

Wednesday, September 24, 2014

Interesting Questions

In the past month, I got asked two extremely inquisitive questions by an elementary school kid. I paraphrase them for context and clarity.

1. The upper floor of a two-level house is warmer than the lower floor. In fact, I wrote about this question a long time ago. Warm air is lighter (from ideal gas law, for example) and floats above dense cold air.

Q: If that is so, then why does it get colder as we climb or drive up a tall mountain?

2. What is the temperature of a mixture of ice and water? Until all the ice melts, the "classic" answer is zero centigrade.

Q: If there are icebergs floating in the earths oceans, why isn't the temperature of the oceans equal to zero centigrade?

Friday, September 19, 2014

DCT in 2D: Symmetric Functions

This is an extension of a previous post that sought to approximate a 1D function \(f(x), ~ x \in [0,1)\), that is symmetric about \(x = 1/2\). That is, the shape of the function above the line of symmetry is a mirror image of the function below it.

As as example, we considered the function:  \[f(x) = (1/2-x)^2 + \cos(2 \pi x).^2.\]
We sample the function at \(n\) equispaced points \(x_i = i/(n+1)\), where \(i = 0, 1, 2, ..., n\). Given the bunch of points \(\{x_i, f_i\}\), we approximate the function using a sum of discrete cosine functions, \[f(x) \approx \sum_{j=0}^{N-1} a_j v_j(x_i),\] where,  \(v_j(x) = \cos(2 \pi j x)\) is a typical orthogonal basis function, and \(N\) is the number of basis functions used.

In this post, we add a small wrinkle to the original problem.

Suppose we had a 2D function \(G(x,y) = f(x) f(y)\). This is certainly a special kind of 2D function. It has mirror symmetry about the centerline which it inherits from \(f(x)\). In addition, it has a very special symmetry along the \(x\) and \(y\) directions. For example, \(G(x,y) = G(y,x)\).

For concreteness, let us use the particular \(f(x)\) that we considered above, and build a \(G(x,y)\) out of it. It looks like:

clear all

% number of collocation points is (n+1)
n  = 20;  
% collocation points for discrete cosine series
xi = [0:n]'/(n+1);
% grid is equivalent in the x and y-directions
yi = xi;

% Sample Function
f = @(x) (1/2-x).^2 + cos(2*pi*x).^2;

[X, Y] = meshgrid(xi);
G      = f(X).*f(Y);


Since the "x" and "y" grid-points are at identical locations, we can assert\[G(x_i, y_j) =  \sum_{j=0}^{N-1} a_j v_j(x_i) \sum_{k=0}^{N-1} a_k v_k(y_j.)\] In the matrix language developed in the original post, this is equivalent to \[\mathbf{G} = \mathbf{V a}  (\mathbf{V a})^T = \mathbf{V A V^T},\] where \(\mathbf{A} = \mathbf{a a^T}\).

Using the property of orthogonality, we hit both sides with the appropriate factors to obtain \[\mathbf{V^T G V} =   \frac{(n+1)^2}{4} \begin{bmatrix} 2 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0& 0\\ 0 & 0& 0& 1 & 0\\ 0 & 0 & 0& 0& 1\end{bmatrix} \mathbf{A}  \begin{bmatrix} 2 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0& 0\\ 0 & 0& 0& 1 & 0\\ 0 & 0 & 0& 0& 1\end{bmatrix}.\]  If \(\mathbf{a} = (a_0, a_1, ..., a_{N-1})\) then this implies that the RHS is equal to, \[\frac{(n+1)^2}{4} \begin{bmatrix} 4 a_0^2 & 2 a_0 a_1 & 2 a_0 a_2 & ... & 2 a_0 a_{N-1}\\ 2 a_1 a_0 & a_1^2 & a_1 a_2 & ... & a_1 a_{N-1}\\ 2 a_2 a_0 & a_2 a_1 & a_2^2 & ...&  a_1 a_{N-1}\\  \vdots & \vdots & \vdots & \vdots & \vdots \\ 2 a_{N-1} a_0 & a_{N-1} a_1 & a_{N-1} a_2 & ... & a_{N-1}^2\end{bmatrix},\]

In the above equation, the LHS is known. The diagonal of the the computed matrix yields the vector \(\mathbf{a}\) that we seek.


nmodes   = 7;

% Build the matrix of basis vector V = [v_0 v_1 ...  v_N-1]

V     = zeros(n+1, nmodes);
V(:,1) = ones(n+1,1);

for j = 2:nmodes
  V(:,j) = cos(2*pi*(j-1)*xi);

% Compose the matrix: The key step
% Replace small negative values with 0 to prevent imaginary roots

M      = diag(V'*G*V);
M(M<0) = 0;                
M      = sqrt(M);

a    = zeros(nmodes,1);
a(1) = M(1)/(n+1);

a(2:nmodes) = 2*M(2:nmodes)/(n+1);

% Compute Norm on the Same Grid
g       = V*a;
Gapprox = g*g';
disp('Norm of Error');
norm(Gapprox - G)/(n+1)^2

This yields the coefficients for the basis vectors which can be used to reconstruct the function. Here are contour plots of the original data on the grid, and the smooth fitted function.

Wednesday, September 10, 2014


1. Why Arkansas is pronounced differently from Kansas

2. Mathematical Gifs my David Whyte

3. Tennis Ball + Soccer Ball = Interesting Physics Lesson

4. The Big History Project and Bill Gates

Monday, September 8, 2014

The Magic of Compounding

While Albert Einstein may never have said that compound interest is "the most powerful source in the universe", the exponential growth implied by the magic of compounding can lead to spectacular outcomes.

Example 1: As as example, consider the following: You start with a penny on day one. On day two, you double it. So you have $0.02. On day 3, you double it again ($0.04), and so on.

Without actually carrying out the math, can you guess how much money you will end up with at the end of a 30 day month?

The answer of course is 2^29 pennies, which is over 5 million dollars!

Example 2: The idea is also enshrined in legend. Consider, for example, the story of the chess-board and grains of rice. Essentially, a king was asked to set one grain of rice in the first square, two in the next, and to keep on doubling until all the 64 squares on the chessboard were used up.

A quick calculation shows that the total number of grains would be \[2^0 + 2^1 + ... + 2^{63} = 2^{64} - 1.\] Assuming each grain weights 25 mg, this corresponds to more than 450 billion tonnes of rice, which is about 1000 times larger than the annual global production.

Example 3: What makes Warren Buffett fabulously wealthy? If you start with an amount \(P\) and grow it at an annual growth rate of \(i\) for \(n\) years, you end up with,\[A = P (1 + i)^n.\] Two ways to get compounding to work its magic is to have large growth rates and/or long incubation times. In Buffett's case, he has managed both; he's compounded money at more than \(i = 0.20\), for a long time, \(n=60\) years. With this $100 becomes, \[A = $100  (1+0.2)^{60} = $5,634,514.\]
Example 4: This is in someways my favorite example, because it doesn't deal with material things. It is an insight that comes from this essay that I wrote about a while ago. I love the following quote:
What Bode was saying was this: "Knowledge and productivity are like compound interest.'' Given two people of approximately the same ability and one person who works ten percent more than the other, the latter will more than twice outproduce the former. The more you know, the more you learn; the more you learn, the more you can do; the more you can do, the more the opportunity - it is very much like compound interest. I don't want to give you a rate, but it is a very high rate. Given two people with exactly the same ability, the one person who manages day in and day out to get in one more hour of thinking will be tremendously more productive over a lifetime.

Wednesday, August 27, 2014


1. Steve Novella exposes weaknesses in Nassim Taleb argument on GMO crops.

2. Teaching "Inferencing" (Grant Wiggins)

3. How expensive is programmer time relative to computation time? John D Cook suggest it is about three orders of magnitude.

4. Interesting NYT portrait of James Simons.

Tuesday, August 26, 2014

Symmetric Functions And Discrete Cosine Approximation

Consider a periodic function \(f(x), ~ x \in [0,1)\), that is symmetric about \(x = 1/2\), so that \(f(x + 0.5) = f(x - 0.5)\). For example, consider the function \[f(x) = (1/2-x)^2 + \cos(2 \pi x).^2.\]
The form of the function \(f(x)\) does not have to be analytically available. Knowing the function at \(n\) equispaced collocation points \(x_i = i/(n+1)\), \(i = 0, 1, 2, ..., n\), is sufficient. Let us label the function at these collocation points \(f_i = f(x_i)\).

Due to its periodicity, and its symmetry, the discrete cosine series (DCS) is an ideal approximating function for such a data-series. The DCS family consists of members, \(\{1, \cos(2 \pi x), \cos(4 \pi x),... \cos(2 \pi j x), ...\},\) where \(v_j(x) = \cos(2 \pi j x)\) is a typical orthogonal basis function.

The members are orthogonal in the following sense. Let the inner product of two basis function be defined as,\[\langle v_i, v_j\rangle = \frac{2}{n+1} \sum_{i=0}^n v_i(x_i) v_j(x_j).\] Then we have, \[
\langle v_i, v_j\rangle = \begin{cases} 0, & \text{ if } j \neq k \\ 1 & \text{ if } j = k >0 \\ 2 & \text{ if } j = k = 0 \end{cases}.\] This can be verified easily by the following Octave commands:

This snippet yields \[\frac{2}{n+1} V^T V = \begin{bmatrix} 2 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0& 0\\ 0 & 0& 0& 1 & 0\\ 0 & 0 & 0& 0& 1\end{bmatrix}\]

The idea then is to approximate the given function in terms of the basis functions,\[f(x) = \sum_{j=0}^{N-1} a_j v_j(x_i),\] where \(N\) is the number of basis functions used.

From a linear algebra perspective we can think of the vector f and matrix V as, \[\mathbf{f} = \begin{bmatrix} f_0 \\ f_1 \\ \vdots \\ f_n \end{bmatrix}, ~~~ \mathbf{V} = \begin{bmatrix} | & | & ... & | \\  v_0(x_i) & v_1(x_i) & ... & v_{N-1}(x_i) \\  | & | & ... & | \\ \end{bmatrix}. \] The \((n+1) \times N\) matrix V contains the basis vectors evaluated at the collocation points.

We are trying to project f onto the column space of V, f = Va, where the column vector a specifies the linear combination of the matrix columns. In the usual case, the number of collocation points is greater than the number of DCS modes that we want to use in the approximating function.

Thus, we have to solve the problem in a least-squared sense by the usual technique, \[\mathbf{V^T f} = \mathbf{V^T V a}.\] Due to discrete orthogonality, we have already shown that, \[\mathbf{V^T V} = \frac{n+1}{2}  \begin{bmatrix} 2 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0& 0\\ 0 & 0& 0& 1 & 0\\ 0 & 0 & 0& 0& 1\end{bmatrix}\]

Therefore, \[\begin{bmatrix} 2 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0& 0\\ 0 & 0& 0& 1 & 0\\ 0 & 0 & 0& 0& 1\end{bmatrix} \mathbf{a} = \frac{2}{n+1} \mathbf{V^T f}.\] In Octave, we can write the following
This yields,

a =


and the plot:

Monday, August 25, 2014

Visualizing 3D Scalar and Vector Fields in Matlab

Visualizing 2D fields is relatively straightforward. You don't have to bring in the heavy artillery.

Its when you move to fully 3D models and try to visualize scalar S(x,y,z), or vector V(x,y,z) fields, that the task of visualization becomes a challenge

Doug Hull has a series of 9 short and crisp videos explaining how to use intrinsic commands in Matlab to exactly that. He many of the important tools listed and explained here (in text format) that Matlab makes available.

Thursday, July 31, 2014


1. Vax: Understanding the spread of infectious diseases via networks, and the role of tools such as vaccination and quarantine. (via Flowing Data)

Given the recent outbreak of pertussis, and yet another systematic review debunking the autism-vaccination link as profiled here, perhaps this game will help anti-vaccination folks understand how we, like wolves, derive our strength from the pack.

Also gotta check out Samantha Bee's piece (video) for the Daily Show.

2. At a recent conference somebody brought up Stigler's law of eponymy:
Stigler's law of eponymy is a process proposed by University of Chicago statistics professor Stephen Stigler in his 1980 publication "Stigler’s law of eponymy". In its simplest and strongest form it says: "No scientific discovery is named after its original discoverer." Stigler named the sociologist Robert K. Merton as the discoverer of "Stigler's law", so as to avoid this law about laws disobeying its very own decree.

Friday, July 25, 2014

Veusz: Awesome Plotting Software

As a working scientist, I do a lot of data plotting. Most of these plots are for internal consumption, as I try to tease meaning out of data.

I tend to use gnuplot a lot, because I've gotten extremely used to it.

However, every once in a while I have to make a plot for external consumption.

For the longest time, I've relied on Grace for my journal quality plots.

Last week, I discovered Veusz (pronounced "views"). It is a python based program for 2D plots, which feels truly modern.

Grace hasn't been updated in a while, and while it works fine for the most part, from an aesthetic standpoint, it feels like your friend from the eighties, who did not realize that bell-bottoms went out of fashion.

It is multiplatform (runs on Linux, MacOS and Windows), exports to a wide variety of useful formats (EPS, PDF, SVG, TIFF), and is unfettered by some of the legacy issues surrounding Grace (multiple plots) such as:
  • multiple plots/insets are a cinch
  • subscripts/superscripts use latex notation
  • presence of an undo button
  • more concise and readable scripts
  • import from a wide variety of formats
  • ability to link to data files instead of loading them in
It is also possible to write "script" files, and use the program from the command line. All in all, I think this is a program that I will use a lot more in the future. I will post about my experiences after I use it for a while.

Thursday, July 24, 2014


1. Math Porn: Good for a few laughs.

2. US National Archives to upload all content on Wikimedia. This is bigger than it sounds.

3. Steve Novella and Michael Fullerton argue about the 9/11 conspiracy in a four part series.

4. Unintended consequences of journal rank

Friday, July 18, 2014

The Top 10 Algorithms

I am teaching a senior undergrad seminar (for Scientific Computing majors) in the Fall semester, and thought it would be a good idea to pick some kind of a theme.

After some thought, I figured that "famous algorithms" may be a good idea. I tried to google "top algorithms" and came up with many lists. Let me begin for bad lists (in my opinion) to good ones.

A.  George Dvorsky has a list of "the 10 algorithms that dominate the world"

1. Google Search
2. Facebook's News Feed
3. OKCupid Date Matching
4. NSA Data Collection, Interpretation, and Encryption
5. "You May Also Enjoy..."
6. Google AdWords
7. High Frequency Stock Trading
8. MP3 Compression
10. Auto-Tune

While the list is interesting, it is somewhat disappointing. It conflates software products with actual algorithms.

HFT is not an algorithm; although the words algorithmic trading and HFT are often used synonymously. Sure, there are important algorithms lurking under many of these "products"

B. Marcos Otero has a better list of "the real 10 algorithms that dominate the world"

This list is a reaction to the one above. The author prefaces his comments with:
Now if you have studied algorithms the first thing that could come to your mind while reading the article is “Does the author know what an algorithm is?” or maybe “Facebook news feed is an algorithm?” because if Facebook news feed is an algorithm then you could eventually classify almost everything as an algorithm.
1. Merge Sort, Quick Sort and Heap Sort
2. Fourier Transform and Fast Fourier Transform
3. Dijkstra’s algorithm
4. RSA algorithm
5. Secure Hash Algorithm
6. Integer factorization
7. Link Analysis
8. Proportional Integral Derivative Algorithm
9. Data compression algorithms
10. Random Number Generation

While this is a better list, in the sense the the items listed are usually "real" algorithms, or something close, it has a strong computer science bias. For example, #4, #5, and #6 are all algorithms for encryption. While encryption is clearly important, it is probably not 30% by weight of the most important algorithms.

C. SIAM has its own list (pdf) of the "top 10 algorithms of the 20th century"

I like this more comprehensive list the best (although I still have some reservations), because the forest in which they hunt is the biggest. Also, the list is a collaboration of two people, which provides some balance on the topics that are touched.

1. Monte Carlo Method
2. Simplex Method for Linear Programming
3. Krylov Subspace Iteration Methods
4. Decompositional Approach to Matrix Computations
5. Fortran Optimizing Compiler
6. QR Algorithm
7. QuickSort
8. FFT
9. Integer Relation Detection Algorithm
10. Fast Multipole Method

Monday, July 7, 2014

Net Neutrality Explained

Here is a nice illustrated introduction to net-neutrality, why it matters, and what one can do about it (until mid-July)!

Along the same lines, and by the same folks, "What's going on with Social Security"

Wednesday, July 2, 2014

On Student Debt

The NYT has this by-now popular article asking people to take a chill-pill. The Reality of Student Debt Is Different From the Clich├ęs.

It is based largely based on a Brookings Institution study which essentially claims that the sky is not falling. The 3 main takeaways from that study (emphasis mine):
1. Roughly one-quarter of the increase in student debt since 1989 can be directly attributed to Americans obtaining more education, especially graduate degrees. The average debt levels of borrowers with a graduate degree more than quadrupled, from just under $10,000 to more than $40,000. By comparison, the debt loads of those with only a bachelor’s degree increased by a smaller margin, from $6,000 to $16,000. 
2. Increases in the average lifetime incomes of college-educated Americans have more than kept pace with increases in debt loads. Between 1992 and 2010, the average household with student debt saw an increase of about $7,400 in annual income and $18,000 in total debt. In other words, the increase in earnings received over the course of 2.4 years would pay for the increase in debt incurred. 
3. The monthly payment burden faced by student loan borrowers has stayed about the same or even lessened over the past two decades. The median borrower has consistently spent three to four percent of their monthly income on student loan payments since 1992, and the mean payment-to-income ratio has fallen significantly, from 15 to 7 percent. The average repayment term for student loans increased over this period, allowing borrowers to shoulder increased debt loads without larger monthly payments.
The NYT tries to shine a light on the real problem:
The vastly bigger problem is the hundreds of thousands of people who emerge from college with a modest amount of debt yet no degree. For them, college is akin to a house that they had to make the down payment on but can’t live in. In a cost-benefit calculation, they get only the cost. And they are far, far more numerous than bachelor’s degree holders with huge debt burdens.
Here is an attempted "takedown" of the report and the NYT article.

And here is a well-reasoned takedown of the takedown.

Tuesday, July 1, 2014

Passing arguments to sed

Suppose you have a variable var=25, and a file test.dat which contains the word foo.

$ var=25
$ cat test.dat 

You want to replace all instances of foo in the file to foo$var (=foo25) using sed.

You might think of trying the following:

$ sed 's/foo/foo$var/g' test.dat 

Clearly not what you expected. The fix is easy: use double quotes instead of single quotes.

$ sed "s/foo/foo$var/g" test.dat 

Wednesday, June 25, 2014


1. All you can eat sushi places should not exist. Why do they?
The point of this post: Economic reasoning can be used to "prove" things that are patently false, like the non-existence of all-you-can-eat sushi. And sometimes "inefficient" choices are actually reasoned responses to something missing from the economist's model.
2. Issac Newton: Man, Myth, and Mathematics (pdf link)
In roughly a year, without benefit of instruction, he mastered the entire achievement of seventeenth century analysis, and began to break new ground.

3. Simpson's and Fermat's Last Theorem (via Ontario Math Links)

\[3987^{12} + 4365^{12} = 4472^{12}\]

Wednesday, June 18, 2014

Proof Without Words

I was trying to explain the meaning and greater logic of integration by parts (beyond the mechanics) to someone today, and found this beautiful "proof without words" (PDF file).

Wikipedia reproduces this beautiful visual argument in its article on integration by parts. I like it, because the argument is more than a proof, it provides deep insight which a "normal" mathematical proof may not provide.

Upon googling, I found this nice related thread on Math Overflow, and this pdf (senior project?) document.

Monday, June 16, 2014

Text Editors

When I began coding in earnest as an undergraduate student, we had a few servers which had to be accessed using non-graphical "dumb" terminals. The only thing they could handle was text; even web-browsing was powered a non-graphical program called "lynx".

Boy, what fun it was!

Those times seem as old as dinosaurs.

Of course, in the technology world obsolescence always lurks around the bend; even the original iPhone looks somewhat clunky today.

During those good old days, the text-editor of choice for most programmers was either pico/nano or vi/vim. Since there were no "mice", one had to perform gymnastics with ones fingers on the keyboard to invoke commands. There are many key-bindings that are still deeply etched into residual muscle memory.

While these editors are still capable and retain large fan-bases (vim was the most popular editor among Linux Journal readers in 2006), after I moved to graduate school, I jumped over to the Emacs camp.

Emacs was awesome and I loved it.

It opened up whole new ways of doing standard tasks. It was very extensible, configurable, and greatly facilitated code development. Syntax highlighting, auto-indentation, regex search and replace - you name it! You could open multiple files in the same window, and have access to the command-line from within the program.

There were many instances in which entire days were spent in the confines of a single Emacs window.

I've used Emacs for almost a decade now. I've resisted the urge to "upgrade" to a full-scale IDE like Eclipse, because a subset of the primary languages in which I program (C++, Fortran 90 and GNU Octave)  tend to be poorly supported. Yes, there is a lot of support for C++ because of its use in traditional software development (as opposed to Scientific Computing where Fortran's influence is very persistent), but would like to develop all my code using the same editor.

Earlier this year, I decided to give Geany a try. It supports C++ and Fortran quite competently, and inherits most of the advantages of Emacs. Unlike Emacs, many of the keyboard shortcuts are more mainstream (Ctrl-C, Ctrl-V), which given my increasing propensity to forget things is convenient.

It makes moving around code a lot easier, auto-completes variable names, and allows code to be "folded", which I never imagined would be so useful. It also has a lot of plugins, and despite it capabilities does not feel "heavy" like Eclipse.

Overall, I find that my coding productivity has clearly gone up.

Thursday, June 12, 2014

Logarithm of Sum of Exponentials: Compendium

This post is just to collect a few recent entries on this topic under one umbrella.

1. This post set up the general problem

Numerically compute the following sum, for arbitrary \(x_i\) ,\[F = \log \left(e^{x_1} + e^{x_2} + ... + e^{x_N} \right).\] It also briefly discussed the major problem with doing this by brute force (overflow/underflow).

2.  We then made the problem specific, whose answer could be analytically computed. \[F = \log \left(e^{-1000} + e^{-1001} + ... + e^{-1100} \right) = -999.54.\]
3. We briefly looked at numerically evaluating  an even simpler model problem \[f(y) = \log(1 + e^y).\] While much simpler, this problem reflects all of the complexity in the original problem.

4. Equipped with the solution, we went back and solved our specific problem.

Tuesday, June 10, 2014


1. Should laptops be banned from classes? I struggle with similar issues.
Over time, a wealth of studies on students’ use of computers in the classroom has accumulated to support this intuition. Among the most famous is a landmark Cornell University study from 2003 called “The Laptop and the Lecture,” wherein half of a class was allowed unfettered access to their computers during a lecture while the other half was asked to keep their laptops closed. 
The experiment showed that, regardless of the kind or duration of the computer use, the disconnected students performed better on a post-lecture quiz. The message of the study aligns pretty well with the evidence that multitasking degrades task performance across the board.
I like the suggestion that the author makes:
I had one small suggestion, which I will implement the next time I teach (and for that class, I will generally continue to have the laptops closed): I will require my students to read some of the studies I’ve alluded to in this post, to help them understand why I’m doing what I’m doing and to get them to think critically about the use of technology in their lives and their education.
2. Why have female hurricanes more deadly? Or perhaps, why you should never accept easy explanations without adequate skepticism, even if they appear in PNAS.
For a start, they analysed hurricane data from 1950, but hurricanes all had female names at first. They only started getting male names on alternate years in 1979. This matters because hurricanes have also, on average, been getting less deadly over time.
The authors have rebuttal (not very convincing to me), but there are other holes. On the whole, the theory has a "compelling" narrative, but hangs on less compelling or ambiguous data.

Monday, June 9, 2014

Wisdom from Stoics

A wise thought from Mr. Money Moustache as he discusses making one's own wine:
I also resist, as is the nature of the ancient Stoics, becoming a connoisseur of material goods. Becoming the kind of person who can only enjoy the very finest and most expensive of anything, be it wine, automobile or speaker cable, is doubly wasteful. 
First, you are foolishly using your educational time while you become an expert and second, you’ll have to spend the rest of your life incurring expenses as the price of having achieved such expertise.
Couldn't agree more! 

Friday, June 6, 2014

Intelligence Squared Debate on Nature of Death

I'm a regular listener of the Intelligence Squared podcast, and its American counterpart, Intelligence Squared US. The podcast is heavily edited, so sometimes it is worthwhile to look up the unabridged video version of the debate on their website.

Recently, they debated the proposition: "Death is Not Final". On the against team (arguing that death is final) were Sean Carroll, and Steven Novella. You can find the full video on YouTube here.

I was against the proposition to begin with, and ended on the same side, with a stronger level of conviction. Here is Carroll's take on the debate, and here is Steve Novella's.

Overall, the debate was somewhat one-sided (that is my bias speaking, perhaps), and the against side articulated their case much more clearly.

There were two salient light-hearted moments.

When Dr. Alexander confronted Dr. Carroll with the potential link between consciousness and quantum mechanics, and the attraction of some of the fathers of quantum mechanics with mysticism, Dr. Carroll quoted Scott Aronson:
“Quantum mechanics is confusing and consciousness is confusing, so maybe they’re the same.” 
The other moment was when Dr. Alexander suggested that Carl Sagan thought that the evidence for reincarnation was overwhelming, and even asked him to look up page 302 of Sagan's book "Demon Haunted World".

Within seconds, the Twitter world exploded.

The skeptical community, when stripped of its predominantly atheistic clothes, treats Carl Sagan as God.

Here's Steve Novella's response on his blog:
Alexander specifically referenced Demon Haunted World page 302. The relevant section has already been posted by many others, including in the comments here, but here it is: 
“Perhaps one percent of the time, someone who has an idea that smells, feels, and looks indistinguishable from the usual run of pseudoscience will turn out to be right. Maybe some undiscovered reptile left over from the Cretaceous period will indeed be found in Loch Ness or the Congo Republic; or we will find artifacts of an advanced, non-human species elsewhere in the Solar System. At the time of writing there are three claims in the ESP field which, in my opinion, deserve serious study: 
(1) that by thought alone humans can (barely) affect random number generators in computers;
(2) that young children sometimes report the details of a previous life, which upon checking turn out to be accurate and which they could not have known about in any other way than reincarnation;
(3) that people under mild sensory deprivation can receive thoughts or images “projected” at them. 
I pick these claims not because I think they’re likely to be valid (I don’t), but as examples of contentions that might be true. The last three have at least some, although still dubious, experimental support. Of course, I could be wrong.” 
To put this in context, Sagan is arguing that we have to be open to even unlikely possibilities, and sometimes it is not unreasonable to gamble on low-probability ideas. I tend to agree, within the limits of practicality and resources. But if someone wants to spend their time researching very unlikely ideas, more power to them. Just expect to be held to a very high standard of scientific rigor. 
In the full quote Sagan clearly states that he does not think these propositions are likely to be valid, and the evidence so far for them is “dubious.” But – further research might be interesting. That’s pretty thin gruel on which Alexander is hanging his hat.

Sunday, June 1, 2014

Puzzle: Certainty amid Uncertainty

In a recent Michael Mauboussin talk that I linked to here, he raises an interesting puzzle, which may be paraphrased as:

"Person A (male), who is married, is looking at person B (female). In turn, person B is looking at person C (male), who is unmarried.  Is a married person looking at an unmarried person?"

There is uncertainty in the set up. We don't know if person B is married or not. If we knew, then the question would be trivial.

However, the question being asked of us, does not require complete information. In fact, we can answer the question, with certainty, without knowing the marital status of person B.

The answer is yes, there is a married person looking at an unmarried person.

We recognize that person B is either married or unmarried.
  • if she is married, then she is looking at person C, who is unmarried.
  • if she is unmarried, then person A, who is married, is looking at her.
In either case, there is someone who is married looking at someone who is not!

Friday, May 30, 2014

Logarithm of A Sum of Exponentials: Part 3

This post is a continuation of the Logarithm of A Sum of Exponentials series. Here are part 1 and part 2 of the series. I also talked about handling \(f(y) = \log (1 + \exp(y))\) in an exception-safe fashion, here.

Suppose we are given \(x_1\) and \(x_2\). Without any loss of generality let us suppose \(x_1 > x_2\).

We can write \(F = \log(e^{x_1} + e^{x_2})\) as \[F =  \log\left(e^{x_1} (1 + e^{x_2-x_1}) \right) = x_1 + \log\left(1 + e^{x_2-x_1}\right).\] Let \(y = x_2 - x_1 < 0\). Therefore \(e^y\) is a number smaller than 1, and we can use our numerical evaluation of the function \(f(y) = \log (1 + \exp(y)\) to enumerate it.

In Octave, one could accomplish this with:

% Compute F = log( exp(x1) + exp(x2) )
function F = LogSumExp(x1, x2)

  m = max(x1,x2);
  x = min(x1,x2) - m;
  F = m + ApproxLogP1(x);


This yields the following algorithm for solving the general problem, \[F = \log \left(e^{x_1} + e^{x_2} + ... + e^{x_N} \right),\] once one recognizes \(F = \log e^F\).

% Method: Protected Sum of Logs
F = x(1);
for i = 2:n
  F = LogSumExp(F, x(i));

When we evaluate the specific problem  \[F = \log \left(e^{-1000} + e^{-1001} + ... + e^{-1100} \right),\] we get F = -999.54 as expected.

Wednesday, May 28, 2014

Learning Regular Expressions

A really useful website to experiment and learn how regular expressions work.

You can cut and paste your text in the "Text" box, and try out various regular expressions.

The website also has a handy cheat sheet and examples, in addition to a comprehensive reference.

Thursday, May 22, 2014

Numerical Approximation of log( 1 + exp(y) )

Before we continue our journey to part 3 of this series, we need to figure out one small bit.

Consider the function \[f(y) = \log(1 + e^y).\] When \(y = 0, f(y) = \log 2 \approx 0.693147.\)

For large negative values of \(y\) the function asymptotically approaches zero. For large positive \(y\), the function is not bounded (plots at the end of the post).

Given a \(y\) direct numerical evaluation of \(f(y)\) may suffer from overflow or underflow problems.

Since the largest number that can be represented in double precision format is approximately \(10^{308}\), the maximum \(y\) for which \(e^{y}\) is within this threshold is approximately 710. That is \(e^{710} > 10^{308}\).

For large -ve values of \(y\), the "addition with 1" suffers from roundoff problems in which significant digits are lost. This is shown by the following Octave commands:

octave:1> format long e

octave:2> log(1+exp(0))   % log(2) as expected
ans =  6.93147180559945e-01

octave:4> log(1+exp(700)) % almost exactly 700 
ans =  7.00000000000000e+02

octave:4> log(1+exp(710)) % Should be 710 
ans =  Inf

octave:5> log(1+exp(-36))   % getting to threshold
ans =  2.22044604925031e-16

octave:6> log(1+exp(-37))   % actual answer is >0
ans =  0.00000000000000e+00

To improve the numerical precision of evaluating the function, we recognize that for large \(y, f(y) \approx y\). For \(y > 34\), using the approximation may be cheaper, and just as accurate.

octave:23> 34 - log(1+exp(34))
ans =  0.00000000000000e+00

For large -ve values of \(y\), we recognize that \[\lim_{x \rightarrow 0} \log(1+x) \approx x.\] Hence, for \(y \ll 0, \log(1+e^{y}) \approx e^{y}.\)

octave:24> log(1+exp(-37))
ans =  0.00000000000000e+00

octave:25> exp(-37)
ans =  8.53304762574407e-17

Thus, instead of evaluating \(f(y)\) directly, we define the following three cases,\[f(y) = \begin{cases} y & y > 35 \\ e^{y} & y < -10 \\\log(1 + e^y) & \text{otherwise} \end{cases}\]

With this definition, we protect from underflow and overflow. The pictures below illustrate the logic.

A blowup near y = 0.

A simple Octave program to compute the approximation is:

% Approximation for log(1+exp(x))
function f = ApproxLogP1(x)

  if (x < -10)      % protect against underflow

    f = exp(x);

  elseif (x > 35)    % protect against overflow 

    f = x;


    f = log( 1 + exp(x) );