I wrote the question below for our PhD qualifiers. It addresses a problem I have been thinking about for over a decade now - starting from my time as a graduate student: how to fit a sum of decaying exponentials to data?
The question explores a method called the Prony method. Here is the question:
A classical problem in data analysis involves fitting a sum of exponentials to a time series of uniformly sampled observations. Here, let us suppose we are given N observations \((t_i, f_i)\), where \(t_i = i \Delta t\) for \(i = 0, 1, ..., N-1\).
We want to fit the data to a sum of two exponentials. The model equation is, \[\hat{f}(t) = a_1 e^{b_1 t} + a_2 e^{b_2 t}.\] The general nonlinear regression problem to determine \(\{a_j, b_j\}\) becomes difficult as the number of exponentials in the sum increases. A number of quasi-linear methods have been developed to address this. In the question, we will explore one of these methods, and determine the fitting parameters.
(a) First, generate a synthetic dataset \((t_i, f_i)\) with true \(a_1^* = a_2^* = 1.0\), \(b_1^* = -2.0\), \(b_2^* = -0.2\). Use \(t_0 = 0\), \(\Delta t = 1\), and N = 20. Attach a plot of the synthetic dataset. Use this dataset for numerical calculations below.
(b) If \(b_1\) and \(b_2\) are known, then we can determine \(a_1\) and \(a_2\) by linear least squares. Set \(u_1 = e^{b_1 \Delta t}\) and \(u_2 = e^{b_2 \Delta t}\). Recognize that \(e^{b_i t_j} = e^{b_i j \Delta t} = u_i^j\). Hence from the model eqn, we can get a linear system:
\begin{align}
f_0 & = a_1 u_1^0 + a_2 u_2^0 \nonumber\\
f_1 & = a_1 u_1^1 + a_2 u_2^1 \nonumber\\
\vdots & = \vdots \nonumber\\
f_{N-1} & = a_1 u_1^{N-1} + a_2 u_2^{N-1}
\end{align}
Write a program to determine \(a_1\) and \(a_2\), given the data, \(b_1\) and \(b_2\).
(c) Consider the polynomial \(p(z)\), which has \(u_1\) and \(u_2\) as its roots, \(p(z) = (z-u_1)(z-u_2) = z^2 - d_1 z -d_2 = 0\). Express \(u_1\) and \(u_2\) in terms of \(d_1\) and \(d_2\).
(d) Now we seek to take linear combinations equations in the linear system above with the goal of eliminating \(a_j\). For example, consider the first three equations. If we multiply the first eqn by \(d_2\), the next by \(d_1\), and the third by -1 and sum them up.
\begin{align*}
d_2 f_0 & = a_1 d_2 + a_2 d_2\\
d_1 f_1 & = a_1 u_1 d_1 + a_2 u_2 d_1 \\
-1 f_2 & = -a_1 u_1^2 - a_2 u_2^2.
\end{align*}
We get \(-f_2 +d_1 f_1 + d_2 f_0 = -a_1(u_1^2 - d_1 u_1 - d_2) -\) \( a_2(u_2^2 -d_1 u_2 - d_2) = 0\), since \(p(u_i) = 0\).
We can pick the next set of three equations, and repeat the process (multiply by \(d_2\), \(d_1\), and -1 before summing up). Show that we end up with the following linear system:
\[\begin{bmatrix} f_{1} & f_0 \\ f_2 & f_1 \\
\vdots & \vdots \\
f_{N-2} & f_{N-3} \\
\end{bmatrix} \begin{bmatrix} d_1 \\ d_2 \end{bmatrix} = \begin{bmatrix} f_2 \\ f_{3} \\ \vdots \\ f_{N-1} \end{bmatrix}\]
Determine \(d_1\) and \(d_2\), and hence \(u_1\) and \(u_2\). From this, find the estimated \(b_1\) and \(b_2\).
(e) Once you know \(b_1\) and \(b_2\) find \(a_1\) and \(a_2\) by linear least squares solution of linear system.
The question explores a method called the Prony method. Here is the question:
A classical problem in data analysis involves fitting a sum of exponentials to a time series of uniformly sampled observations. Here, let us suppose we are given N observations \((t_i, f_i)\), where \(t_i = i \Delta t\) for \(i = 0, 1, ..., N-1\).
We want to fit the data to a sum of two exponentials. The model equation is, \[\hat{f}(t) = a_1 e^{b_1 t} + a_2 e^{b_2 t}.\] The general nonlinear regression problem to determine \(\{a_j, b_j\}\) becomes difficult as the number of exponentials in the sum increases. A number of quasi-linear methods have been developed to address this. In the question, we will explore one of these methods, and determine the fitting parameters.
(a) First, generate a synthetic dataset \((t_i, f_i)\) with true \(a_1^* = a_2^* = 1.0\), \(b_1^* = -2.0\), \(b_2^* = -0.2\). Use \(t_0 = 0\), \(\Delta t = 1\), and N = 20. Attach a plot of the synthetic dataset. Use this dataset for numerical calculations below.
(b) If \(b_1\) and \(b_2\) are known, then we can determine \(a_1\) and \(a_2\) by linear least squares. Set \(u_1 = e^{b_1 \Delta t}\) and \(u_2 = e^{b_2 \Delta t}\). Recognize that \(e^{b_i t_j} = e^{b_i j \Delta t} = u_i^j\). Hence from the model eqn, we can get a linear system:
\begin{align}
f_0 & = a_1 u_1^0 + a_2 u_2^0 \nonumber\\
f_1 & = a_1 u_1^1 + a_2 u_2^1 \nonumber\\
\vdots & = \vdots \nonumber\\
f_{N-1} & = a_1 u_1^{N-1} + a_2 u_2^{N-1}
\end{align}
Write a program to determine \(a_1\) and \(a_2\), given the data, \(b_1\) and \(b_2\).
(c) Consider the polynomial \(p(z)\), which has \(u_1\) and \(u_2\) as its roots, \(p(z) = (z-u_1)(z-u_2) = z^2 - d_1 z -d_2 = 0\). Express \(u_1\) and \(u_2\) in terms of \(d_1\) and \(d_2\).
(d) Now we seek to take linear combinations equations in the linear system above with the goal of eliminating \(a_j\). For example, consider the first three equations. If we multiply the first eqn by \(d_2\), the next by \(d_1\), and the third by -1 and sum them up.
\begin{align*}
d_2 f_0 & = a_1 d_2 + a_2 d_2\\
d_1 f_1 & = a_1 u_1 d_1 + a_2 u_2 d_1 \\
-1 f_2 & = -a_1 u_1^2 - a_2 u_2^2.
\end{align*}
We get \(-f_2 +d_1 f_1 + d_2 f_0 = -a_1(u_1^2 - d_1 u_1 - d_2) -\) \( a_2(u_2^2 -d_1 u_2 - d_2) = 0\), since \(p(u_i) = 0\).
We can pick the next set of three equations, and repeat the process (multiply by \(d_2\), \(d_1\), and -1 before summing up). Show that we end up with the following linear system:
\[\begin{bmatrix} f_{1} & f_0 \\ f_2 & f_1 \\
\vdots & \vdots \\
f_{N-2} & f_{N-3} \\
\end{bmatrix} \begin{bmatrix} d_1 \\ d_2 \end{bmatrix} = \begin{bmatrix} f_2 \\ f_{3} \\ \vdots \\ f_{N-1} \end{bmatrix}\]
Determine \(d_1\) and \(d_2\), and hence \(u_1\) and \(u_2\). From this, find the estimated \(b_1\) and \(b_2\).
(e) Once you know \(b_1\) and \(b_2\) find \(a_1\) and \(a_2\) by linear least squares solution of linear system.
 
 
No comments:
Post a Comment