Before we continue our journey to part 3 of this series, we need to figure out one small bit.
Consider the function f(y) = \log(1 + e^y). When y = 0, f(y) = \log 2 \approx 0.693147.
For large negative values of y the function asymptotically approaches zero. For large positive y, the function is not bounded (plots at the end of the post).
Given a y direct numerical evaluation of f(y) may suffer from overflow or underflow problems.
Since the largest number that can be represented in double precision format is approximately 10^{308}, the maximum y for which e^{y} is within this threshold is approximately 710. That is e^{710} > 10^{308}.
For large -ve values of y, the "addition with 1" suffers from roundoff problems in which significant digits are lost. This is shown by the following Octave commands:
octave:1> format long e
octave:2> log(1+exp(0)) % log(2) as expected
ans = 6.93147180559945e-01
octave:4> log(1+exp(700)) % almost exactly 700
ans = 7.00000000000000e+02
octave:4> log(1+exp(710)) % Should be 710
ans = Inf
octave:5> log(1+exp(-36)) % getting to threshold
ans = 2.22044604925031e-16
octave:6> log(1+exp(-37)) % actual answer is >0
ans = 0.00000000000000e+00
Consider the function f(y) = \log(1 + e^y). When y = 0, f(y) = \log 2 \approx 0.693147.
For large negative values of y the function asymptotically approaches zero. For large positive y, the function is not bounded (plots at the end of the post).
Given a y direct numerical evaluation of f(y) may suffer from overflow or underflow problems.
Since the largest number that can be represented in double precision format is approximately 10^{308}, the maximum y for which e^{y} is within this threshold is approximately 710. That is e^{710} > 10^{308}.
For large -ve values of y, the "addition with 1" suffers from roundoff problems in which significant digits are lost. This is shown by the following Octave commands:
octave:1> format long e
octave:2> log(1+exp(0)) % log(2) as expected
ans = 6.93147180559945e-01
octave:4> log(1+exp(700)) % almost exactly 700
ans = 7.00000000000000e+02
octave:4> log(1+exp(710)) % Should be 710
ans = Inf
octave:5> log(1+exp(-36)) % getting to threshold
ans = 2.22044604925031e-16
octave:6> log(1+exp(-37)) % actual answer is >0
ans = 0.00000000000000e+00
To improve the numerical precision of evaluating the function, we recognize that for large y, f(y) \approx y. For y > 34, using the approximation may be cheaper, and just as accurate.
octave:23> 34 - log(1+exp(34))
ans = 0.00000000000000e+00
For large -ve values of y, we recognize that \lim_{x \rightarrow 0} \log(1+x) \approx x. Hence, for y \ll 0, \log(1+e^{y}) \approx e^{y}.
octave:24> log(1+exp(-37))
ans = 0.00000000000000e+00
octave:25> exp(-37)
ans = 8.53304762574407e-17
Thus, instead of evaluating f(y) directly, we define the following three cases,f(y) = \begin{cases} y & y > 35 \\ e^{y} & y < -10 \\\log(1 + e^y) & \text{otherwise} \end{cases}
With this definition, we protect from underflow and overflow. The pictures below illustrate the logic.
A blowup near y = 0.
A simple Octave program to compute the approximation is:
%
% Approximation for log(1+exp(x))
%
function f = ApproxLogP1(x)
if (x < -10) % protect against underflow
f = exp(x);
elseif (x > 35) % protect against overflow
f = x;
else
f = log( 1 + exp(x) );
endif
endfunction
%
% Approximation for log(1+exp(x))
%
function f = ApproxLogP1(x)
if (x < -10) % protect against underflow
f = exp(x);
elseif (x > 35) % protect against overflow
f = x;
else
f = log( 1 + exp(x) );
endif
endfunction
No comments:
Post a Comment