Before we continue our journey to part 3 of this series, we need to figure out one small bit.

Consider the function \[f(y) = \log(1 + e^y).\] When \(y = 0, f(y) = \log 2 \approx 0.693147.\)

For large negative values of \(y\) the function asymptotically approaches zero. For large positive \(y\), the function is not bounded (plots at the end of the post).

Given a \(y\) direct numerical evaluation of \(f(y)\) may suffer from overflow or underflow problems.

Since the largest number that can be represented in double precision format is approximately \(10^{308}\), the maximum \(y\) for which \(e^{y}\) is within this threshold is approximately 710. That is \(e^{710} > 10^{308}\).

For large -ve values of \(y\), the "addition with 1" suffers from roundoff problems in which significant digits are lost. This is shown by the following Octave commands:

octave:1> format long e

octave:2> log(1+exp(0)) % log(2) as expected

ans = 6.93147180559945e-01

octave:4> log(1+exp(700)) % almost exactly 700

ans = 7.00000000000000e+02

octave:4> log(1+exp(710)) % Should be 710

ans = Inf

octave:5> log(1+exp(-36)) % getting to threshold

ans = 2.22044604925031e-16

octave:6> log(1+exp(-37)) % actual answer is >0

ans = 0.00000000000000e+00

Consider the function \[f(y) = \log(1 + e^y).\] When \(y = 0, f(y) = \log 2 \approx 0.693147.\)

For large negative values of \(y\) the function asymptotically approaches zero. For large positive \(y\), the function is not bounded (plots at the end of the post).

Given a \(y\) direct numerical evaluation of \(f(y)\) may suffer from overflow or underflow problems.

Since the largest number that can be represented in double precision format is approximately \(10^{308}\), the maximum \(y\) for which \(e^{y}\) is within this threshold is approximately 710. That is \(e^{710} > 10^{308}\).

For large -ve values of \(y\), the "addition with 1" suffers from roundoff problems in which significant digits are lost. This is shown by the following Octave commands:

octave:1> format long e

octave:2> log(1+exp(0)) % log(2) as expected

ans = 6.93147180559945e-01

octave:4> log(1+exp(700)) % almost exactly 700

ans = 7.00000000000000e+02

octave:4> log(1+exp(710)) % Should be 710

ans = Inf

octave:5> log(1+exp(-36)) % getting to threshold

ans = 2.22044604925031e-16

octave:6> log(1+exp(-37)) % actual answer is >0

ans = 0.00000000000000e+00

To improve the numerical precision of evaluating the function, we recognize that for large \(y, f(y) \approx y\). For \(y > 34\), using the approximation may be cheaper, and just as accurate.

octave:23> 34 - log(1+exp(34))

ans = 0.00000000000000e+00

For large -ve values of \(y\), we recognize that \[\lim_{x \rightarrow 0} \log(1+x) \approx x.\] Hence, for \(y \ll 0, \log(1+e^{y}) \approx e^{y}.\)

octave:24> log(1+exp(-37))

ans = 0.00000000000000e+00

octave:25> exp(-37)

ans = 8.53304762574407e-17

Thus, instead of evaluating \(f(y)\) directly, we define the following three cases,\[f(y) = \begin{cases} y & y > 35 \\ e^{y} & y < -10 \\\log(1 + e^y) & \text{otherwise} \end{cases}\]

With this definition, we protect from underflow and overflow. The pictures below illustrate the logic.

A blowup near y = 0.

A simple Octave program to compute the approximation is:

**%****% Approximation for log(1+exp(x))****%****function f = ApproxLogP1(x)**

**if (x < -10) % protect against underflow**

**f = exp(x);**

**elseif (x > 35) % protect against overflow**

**f = x;**

**else**

**f = log( 1 + exp(x) );**

**endif**

**endfunction**
## No comments:

Post a Comment