Before we continue our journey to part 3 of this series, we need to figure out one small bit.
Consider the function \[f(y) = \log(1 + e^y).\] When \(y = 0, f(y) = \log 2 \approx 0.693147.\)
For large negative values of \(y\) the function asymptotically approaches zero. For large positive \(y\), the function is not bounded (plots at the end of the post).
Given a \(y\) direct numerical evaluation of \(f(y)\) may suffer from overflow or underflow problems.
Since the largest number that can be represented in double precision format is approximately \(10^{308}\), the maximum \(y\) for which \(e^{y}\) is within this threshold is approximately 710. That is \(e^{710} > 10^{308}\).
For large -ve values of \(y\), the "addition with 1" suffers from roundoff problems in which significant digits are lost. This is shown by the following Octave commands:
octave:1> format long e
octave:2> log(1+exp(0)) % log(2) as expected
ans = 6.93147180559945e-01
octave:4> log(1+exp(700)) % almost exactly 700
ans = 7.00000000000000e+02
octave:4> log(1+exp(710)) % Should be 710
ans = Inf
octave:5> log(1+exp(-36)) % getting to threshold
ans = 2.22044604925031e-16
octave:6> log(1+exp(-37)) % actual answer is >0
ans = 0.00000000000000e+00
Consider the function \[f(y) = \log(1 + e^y).\] When \(y = 0, f(y) = \log 2 \approx 0.693147.\)
For large negative values of \(y\) the function asymptotically approaches zero. For large positive \(y\), the function is not bounded (plots at the end of the post).
Given a \(y\) direct numerical evaluation of \(f(y)\) may suffer from overflow or underflow problems.
Since the largest number that can be represented in double precision format is approximately \(10^{308}\), the maximum \(y\) for which \(e^{y}\) is within this threshold is approximately 710. That is \(e^{710} > 10^{308}\).
For large -ve values of \(y\), the "addition with 1" suffers from roundoff problems in which significant digits are lost. This is shown by the following Octave commands:
octave:1> format long e
octave:2> log(1+exp(0)) % log(2) as expected
ans = 6.93147180559945e-01
octave:4> log(1+exp(700)) % almost exactly 700
ans = 7.00000000000000e+02
octave:4> log(1+exp(710)) % Should be 710
ans = Inf
octave:5> log(1+exp(-36)) % getting to threshold
ans = 2.22044604925031e-16
octave:6> log(1+exp(-37)) % actual answer is >0
ans = 0.00000000000000e+00
To improve the numerical precision of evaluating the function, we recognize that for large \(y, f(y) \approx y\). For \(y > 34\), using the approximation may be cheaper, and just as accurate.
octave:23> 34 - log(1+exp(34))
ans = 0.00000000000000e+00
For large -ve values of \(y\), we recognize that \[\lim_{x \rightarrow 0} \log(1+x) \approx x.\] Hence, for \(y \ll 0, \log(1+e^{y}) \approx e^{y}.\)
octave:24> log(1+exp(-37))
ans = 0.00000000000000e+00
octave:25> exp(-37)
ans = 8.53304762574407e-17
Thus, instead of evaluating \(f(y)\) directly, we define the following three cases,\[f(y) = \begin{cases} y & y > 35 \\ e^{y} & y < -10 \\\log(1 + e^y) & \text{otherwise} \end{cases}\]
With this definition, we protect from underflow and overflow. The pictures below illustrate the logic.
A blowup near y = 0.
A simple Octave program to compute the approximation is:
%
% Approximation for log(1+exp(x))
%
function f = ApproxLogP1(x)
if (x < -10) % protect against underflow
f = exp(x);
elseif (x > 35) % protect against overflow
f = x;
else
f = log( 1 + exp(x) );
endif
endfunction
%
% Approximation for log(1+exp(x))
%
function f = ApproxLogP1(x)
if (x < -10) % protect against underflow
f = exp(x);
elseif (x > 35) % protect against overflow
f = x;
else
f = log( 1 + exp(x) );
endif
endfunction
No comments:
Post a Comment