## Friday, May 30, 2014

### Logarithm of A Sum of Exponentials: Part 3

This post is a continuation of the Logarithm of A Sum of Exponentials series. Here are part 1 and part 2 of the series. I also talked about handling $f(y) = \log (1 + \exp(y))$ in an overflow- and underflow-safe fashion, here.

Suppose we are given $x_1$ and $x_2$. Without loss of generality, suppose $x_1 > x_2$.

We can write $F = \log(e^{x_1} + e^{x_2})$ as $F = \log\left(e^{x_1} (1 + e^{x_2-x_1}) \right) = x_1 + \log\left(1 + e^{x_2-x_1}\right).$ Let $y = x_2 - x_1 < 0$. Therefore $e^y$ is a number smaller than 1, and we can use our numerical evaluation of the function $f(y) = \log (1 + \exp(y))$ to evaluate it.

In Octave, one could accomplish this with:

%
% Compute F = log( exp(x1) + exp(x2) )
%
function F = LogSumExp(x1, x2)
  m = max(x1, x2);
  x = min(x1, x2) - m;
  F = m + ApproxLogP1(x);
endfunction

This yields the following algorithm for solving the general problem, $F = \log \left(e^{x_1} + e^{x_2} + ... + e^{x_N} \right),$ once one recognizes $F = \log e^F$.

%
% Method: Protected Sum of Logs
%
F = x(1);
for i = 2:n
  F = LogSumExp(F, x(i));
endfor

When we evaluate the specific problem $F = \log \left(e^{-1000} + e^{-1001} + ... + e^{-1100} \right),$ we get $F = -999.54$, as expected.
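For readers who prefer Python, the pairwise scheme above can be sketched as follows. This is a hypothetical translation, not the post's original code; the function names mirror the Octave `LogSumExp` and `ApproxLogP1`:

```python
import math

def approx_log_p1(x):
    """Approximate log(1 + exp(x)), guarding against overflow/underflow."""
    if x < -10:
        return math.exp(x)       # log(1+e^x) ~ e^x for very negative x
    elif x > 35:
        return x                 # log(1+e^x) ~ x for large x
    return math.log(1.0 + math.exp(x))

def log_sum_exp(x1, x2):
    """Compute log(exp(x1) + exp(x2)) without forming huge/tiny exponentials."""
    m = max(x1, x2)
    return m + approx_log_p1(min(x1, x2) - m)

# Running sum over the stream x_i = -1000, -1001, ..., -1100
F = -1000.0
for i in range(1, 101):
    F = log_sum_exp(F, -1000.0 - i)
print(round(F, 2))  # -999.54
```

Note that the running result is folded in one term at a time, so this works even when the $x_i$ stream in one by one.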

## Wednesday, May 28, 2014

### Learning Regular Expressions

A really useful website for experimenting with regular expressions and learning how they work.

You can cut and paste your text in the "Text" box, and try out various regular expressions.

The website also has a handy cheat sheet and examples, in addition to a comprehensive reference.

## Thursday, May 22, 2014

### Numerical Approximation of log( 1 + exp(y) )

Before we continue our journey to part 3 of this series, we need to figure out one small bit.

Consider the function $f(y) = \log(1 + e^y).$ When $y = 0, f(y) = \log 2 \approx 0.693147.$

For large negative values of $y$ the function asymptotically approaches zero. For large positive $y$, the function is not bounded (plots at the end of the post).

Given a $y$, direct numerical evaluation of $f(y)$ may suffer from overflow or underflow problems.

Since the largest number that can be represented in double-precision format is approximately $10^{308}$, the maximum $y$ for which $e^{y}$ stays within this threshold is approximately 709; already $e^{710} > 10^{308}$.
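This cutoff is easy to verify; a quick Python check (not part of the original post, but the double-precision limits are the same everywhere):

```python
import math
import sys

# The largest double is about 1.8e308, so the largest safe exponent
# is log(DBL_MAX) ~ 709.78.
print(math.log(sys.float_info.max))  # ~709.78
print(math.exp(709))                 # ~8.2e307, still representable

try:
    math.exp(710)                    # exceeds DBL_MAX
except OverflowError:
    print("exp(710) overflows")
```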

For large negative values of $y$, the "addition with 1" suffers from roundoff problems in which significant digits are lost. This is shown by the following Octave commands:

octave:1> format long e

octave:2> log(1+exp(0))   % log(2) as expected
ans =  6.93147180559945e-01

octave:3> log(1+exp(700)) % almost exactly 700
ans =  7.00000000000000e+02

octave:4> log(1+exp(710)) % Should be 710
ans =  Inf

octave:5> log(1+exp(-36))   % getting to threshold
ans =  2.22044604925031e-16

octave:6> log(1+exp(-37))   % actual answer is >0
ans =  0.00000000000000e+00

To improve the numerical precision of evaluating the function, we recognize that for large $y, f(y) \approx y$. For $y > 34$, using the approximation may be cheaper, and just as accurate.

octave:23> 34 - log(1+exp(34))
ans =  0.00000000000000e+00

For large negative values of $y$, we recognize that $\log(1+x) \approx x$ as $x \rightarrow 0$. Hence, for $y \ll 0$, $\log(1+e^{y}) \approx e^{y}.$

octave:24> log(1+exp(-37))
ans =  0.00000000000000e+00

octave:25> exp(-37)
ans =  8.53304762574407e-17
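This small-argument situation is exactly what the standard log1p function is designed for; a quick Python check (Octave also provides log1p):

```python
import math

y = -37.0
naive = math.log(1.0 + math.exp(y))  # 1 + 8.5e-17 rounds to 1.0, so this is 0.0
better = math.log1p(math.exp(y))     # keeps the tiny term: ~8.53e-17

print(naive)   # 0.0
print(better)  # ~8.53e-17, matching exp(-37)
```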

Thus, instead of evaluating $f(y)$ directly, we define the following three cases: $f(y) = \begin{cases} y & y > 35 \\ e^{y} & y < -10 \\ \log(1 + e^y) & \text{otherwise.} \end{cases}$

With this definition, we protect from underflow and overflow. The pictures below illustrate the logic.

A blowup near y = 0.

A simple Octave program to compute the approximation is:

%
% Approximation for log(1+exp(x))
%
function f = ApproxLogP1(x)
  if (x < -10)       % protect against underflow
    f = exp(x);
  elseif (x > 35)    % protect against overflow
    f = x;
  else
    f = log( 1 + exp(x) );
  endif
endfunction

## Tuesday, May 20, 2014

### Logarithm of A Sum of Exponentials: Part 2

Specific Problem: For concreteness, let us consider a specific problem. Suppose $x_i = -1000 - (i-1)$, and $N = 101$. Thus, we are interested in the following summation, $F = \log \left(e^{-1000} + e^{-1001} + ... + e^{-1100} \right).$
Exact Solution: The term in the parenthesis is a geometric series, with $a = e^{-1000}$ and $r = 1/e$. The exact solution may therefore be obtained from the geometric series summation formula: $Z = \frac{a (1-r^N)}{1-r}.$ Thus, $F = \log Z = \log a + \log (1 - r^N) - \log (1-r),$ which can be evaluated, yielding $F = -999.54$.
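The closed form is easy to check numerically; a minimal Python sketch, keeping $\log a = -1000$ in log space so that $a$ itself never underflows:

```python
import math

# Geometric series: a = e^{-1000}, r = 1/e, N = 101 terms.
log_a = -1000.0
r = math.exp(-1.0)
N = 101

# F = log a + log(1 - r^N) - log(1 - r); r^N ~ 1.4e-44 is harmless here.
F = log_a + math.log(1.0 - r**N) - math.log(1.0 - r)
print(round(F, 2))  # -999.54
```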

Brute Force Solution: If we attempt a brute force evaluation using the following Matlab commands

a = -1000;
b = -1100;
y = [a:-1:b]';

SumX = log(sum(exp(y)))

We get -Inf as the answer, since the brute force calculation leads to underflow.

Analysis: If we knew all the $x_i$ before carrying out the summation, we could use the technique I describe next. Note that this technique would not work if we had to keep a running (logarithm of a) sum, that is, if the $x_i$ were streaming in one by one.

However, as I describe in a following post, we can repurpose the key idea from this method to tackle the general problem.

The key insight recognizes that we can rewrite the problem as $F = \log\left[e^{-1000} \left(1 + e^{-1} + ... + e^{-100} \right)\right].$ Using the product property of logarithms, this implies $F = -1000 + \log \left(1 + e^{-1} + ... + e^{-100} \right).$ The second term does not pose significant numerical problems if summed up directly.

The series does add up to -999.54, as expected.

When we pulled out the $e^{-1000}$, we essentially descaled the problem. If we knew all the $x_i$ beforehand, we could pull out the most significant (read: largest) term, and sum up the remaining lower-order terms without bumping into false infinities or zeros.
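The descaling trick amounts to a single max-shift; a minimal Python sketch (assuming nothing beyond the standard library) contrasting it with the brute force sum:

```python
import math

xs = [-1000.0 - k for k in range(101)]

# Brute force: every exp(x) underflows to 0.0, so the sum is exactly 0
s = sum(math.exp(x) for x in xs)
print(s)  # 0.0 -- complete underflow

# Descaled: pull out the largest term m, then sum the rescaled remainder
m = max(xs)
F = m + math.log(sum(math.exp(x - m) for x in xs))
print(round(F, 2))  # -999.54
```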

However, if we don't have the entire string of $x_i$ available, we don't know what the most significant term is. Right?

Wrong!

In a future post, I will discuss how we can view the summation pairwise, and repurpose this same idea to address the above shortcoming.

## Monday, May 19, 2014

### Logarithm of A Sum of Exponentials: Part 1

Problem: Consider a stream of real numbers $x_1, x_2, ..., x_N$. We don't know the magnitude or sign of these numbers.

We want to numerically compute the following sum, for arbitrary $x_i$ ,$F = \log \left(e^{x_1} + e^{x_2} + ... + e^{x_N} \right).$

Motivation: Versions of this problem occur in many places. In statistical mechanics, for example, $x_i = -\beta U_i$ might correspond to the energy of the system, scaled by the inverse temperature $\beta$.

The logarithm of the partition function $Z = \sum e^{x_i}$, corresponds to a free energy, $F = \log Z$.

Issue: The central issue at stake here can be articulated by considering a single term of the summation: $F = \log e^{x_1} = x_1.$ Suppose $x_1$ were 1000. Numerically computing $\log e^{1000}$ would be a problem, because $e^{1000}$ is greater than the largest number ($\sim 10^{308}$) that can be represented in double precision.

## Wednesday, May 14, 2014

### Trains of Thought and Words

At the beginning of the short story "The Resident Patient", Arthur Conan Doyle narrates the following exchange between Dr. Watson and Mr. Holmes. Dr. Watson is absorbed in contemplation when Holmes interrupts him with:
"You are right, Watson," said he. "It does seem a very preposterous way of settling a dispute."
"Most preposterous!" I exclaimed, and then, suddenly realizing how he had echoed the inmost thought of my soul, I sat up in my chair and stared at him in blank amazement.
Dr. Watson is startled by Holmes' telepathy. Holmes goes on to explain:
"Then I will tell you. After throwing down your paper, which was the action which drew my attention to you, you sat for half a minute with a vacant expression. Then your eyes fixed themselves upon your newly-framed picture of General Gordon, and I saw by the alteration in your face that a train of thought had been started. But it did not lead very far. Your eyes turned across to the unframed portrait of Henry Ward Beecher which stands upon the top of your books. You then glanced up at the wall, and of course your meaning was obvious. You were thinking that if the portrait were framed it would just cover that bare space and correspond with Gordon's picture over there."
"You have followed me wonderfully!" I exclaimed.
While we may lack Holmes' deductive prowess, we are all very familiar with such trains of thought.

You listen to a song on the radio. It reminds you of the last time you heard it, and who you were with then. That brings other memories of your companions, and within a few simple steps, you've embarked on a bumpy trajectory of thoughts that leads you far away from the original stimulus.

It doesn't have to be solitary contemplation. The same phenomenon occurs in conversations between friends, for example. You are talking about something. One thing leads to another, and very soon the original history of the train of thoughts gets obscured, unless one takes particular effort to pause, and reconstruct it.

The act of reconstruction can sometimes be insightful in a weird way.

On a recent episode of "A Way with Words", Grant was talking about his baldness, and Martha brought up the word "callow", which originally meant "bald".
The term callow goes back to Old English calu, meaning “bald.” The original sense of callow referred to young birds lacking feathers on their heads, then referred to a young man’s down cheek, and eventually came to mean “youthful” or “immature.”
The evolution from "bald" to "immature" does not seem direct, unless one appreciates the trajectory of the word.

## Friday, May 9, 2014

### "Prettify" Fortran Code

Some Fortran code that I wrote a long time ago looks downright ugly to my eyes today. Interestingly, the direction of this trend is the opposite of my handwriting, which has gotten progressively more illegible.

Anyway, wouldn't it be nice if one could pick up some old Fortran 90 code and deck it up with consistent indentation, properly aligned comments, and all keywords in the same case (upper or lower or whatever)?

Fortunately, there is a program which does just that. It is called f90ppr, which is itself a Fortran program, written by Michael Olagnon. It is also mirrored over here. You only need the f90ppr.f90 program, and perhaps f90ppr.1 (which is a man page that can be read by typing man f90ppr.1 on a unix-type system).

Compile the f90ppr.f90 program (gfortran f90ppr.f90 -o f90ppr), and you are ready to "prettify".

Note that you can "unindent" any inconsistently indented program on an editor like Geany by selecting all, and repeatedly pressing Shift+Tab.

The next step is to manually add the following lines to the top of the "unindented" program. These are lines that direct f90ppr to do the needful.

$define FPPR_MAX_LINE 100
$define FPPR_KWD_CASE FPPR_LOWER
$define FPPR_USR_CASE 0
$define FPPR_STP_INDENT 4
$define FPPR_FXD_IN 0
$define FPPR_FXD_OUT 0

These set up the maximum number of columns (100), the treatment of keywords (lowercase), the tab size (4 spaces), etc. The manual has details of other options.

With these lines pasted at the top of the ugly.f90 program, you say

./f90ppr < ugly.f90 > pretty.f90

And you are set. Here is the result.

It adds a "!" on all empty lines. If you find that pesky, you can clean it up easily with the following sed script:

sed 's/^!$//' pretty.f90

## Wednesday, May 7, 2014

A fascinating example of the seeming advantages, and the catastrophic unintended consequences, of changing the formal definition of a derivative from $f'(x) = \lim_{h \rightarrow 0} \frac{f(x+h) - f(x)}{h},$ to $f'(x) = \lim_{h \rightarrow 0} \frac{f(x+h) - f(x-h)}{2h}.$
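One classic consequence is easy to reproduce in a few lines of Python (a sketch, not from the linked post): $f(x) = |x|$ has no derivative at $x = 0$, yet the symmetric difference happily returns an answer there.

```python
def forward_diff(f, x, h):
    # One-sided difference, matching the first definition above
    return (f(x + h) - f(x)) / h

def central_diff(f, x, h):
    # Symmetric difference, matching the second definition above
    return (f(x + h) - f(x - h)) / (2 * h)

f = abs
for h in [1e-1, 1e-3, 1e-5]:
    # |x| is not differentiable at 0, yet the central difference
    # "converges" to 0 there, while the forward difference gives 1.
    print(h, forward_diff(f, 0.0, h), central_diff(f, 0.0, h))
```

The symmetric form averages the left and right slopes, so it can manufacture a finite "derivative" at a kink where none exists.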