Monday, March 31, 2014

Links

1. Interesting approximation to "e", and a nice way to check the claim that the approximation is correct to a septillion digits.

2. Years of border changes in Europe (video link) (H/T Flowing Data)

3. TED talk on mind control by parasites by Ed Wong; very graphic and very cool!

4. Veritasium recreates the original double slit experiment.

Tuesday, March 25, 2014

Floating Point Numbers: Links

1. An YouTube video which explains the basics of floating point numbers.

2. An interview with William Kahan (1998) and a paper by Cleve Model (1996) tell the story of how the IEEE standard was forged, and how it made computer arithmetic a lot more deterministic.

3. Did I say deterministic? Well, somethings are and others aren't.

4. And finally, David Goldberg's must read paper: "What Every Computer Scientist Should Know About Floating-Point Arithmetic".

5. An interesting related SMBC comic.

Tuesday, March 18, 2014

Fortran and Octave/Matlab: Variable File Names

Suppose you want to write to files with names that are not known before compile time, or loop over a set of related filenames. For example, you may want to write to files file0001.txt, file0002.txt, ... so on.

Doing this in Octave/Matlab is quite easy: For example:
In Fortran 90, you do something similar

It is easy to extend this idea to create files using loops.

Wednesday, March 12, 2014

McGowan on the PhD glut

John McGowan presents a well-sourced article (and somewhat old article) on the status of surplus PhDs we've been producing over the last thirty or so years. The appendices link to a number of interesting articles and reports on the topic.

By and large, I agree with the central thesis that the current academic model encourages the production of ever greater numbers of PhD students.

It is perhaps too much of a good thing.

Universities are driven increase the supply of PhDs. Students, on the other hand, are driven by the promise of more lucrative or meaningful employment. From a simplistic economic viewpoint, the demand for PhD students from universities exceeds the demand for doctorates. Thus, the incentives of universities are not necessarily aligned with the incentives of students, as I also remarked a while ago.

Having said that, and essentially murmured my agreement with the author, I think two of the arguments he lays down require additional qualification.
It is worth considering some basic arithmetic. A typical professor has at least two graduate students at any time, sometimes many more. A Ph.D. programs in the United States usually lasts 5-7 years. A professor will take on and “advise” students from about age 30 as a starting assistant professor to retirement or death (say 65). This means a typical professor produces at least ten Ph.D.’s during his or her academic career in his or her academic specialty. If all the students pursue an academic career, certainly the ambition of many, this means an additional nine (9) new positions must be created over 35 years on top of the professor’s own position. Even with a fifty percent drop out rate, at least four or five new positions must be created to absorb the new Ph.D.’s. Of course, nothing of the kind has been the case for over forty years.
From my anecdotal experience (chemical engineering in the  US) the fraction of PhD students seeking an academic position is much smaller than 50%. I would guess that it is closer to 10-20%. I am sure the numbers are different for theoretical physics, organic chemistry, petroleum engineering, etc., although I recognize that even these small levels would cause a "glut".

For many students, academia is not the endgame they seek.

There are more lucrative positions in the industry, that often offer starting salaries that are about 50% greater than in academia. I am sure a large fraction of the PhD quants on Wall Street, did not end up there because they could not find academic jobs. The menu of possible jobs for PhDs has certainly diversified in the past couple of decades.

The second is the argument about the unsustainability of the PhD production growth rate outstripping the GDP. Superficially, the argument makes sense. But the GDP is an aggregate of industries in decline (manufacturing) and in explosive growth phases (IT). It is quite possible that fraction of GDP due to PhD-level high-tech jobs is mounting a secular advance. I don't know if this is true (perhaps it is not), but unless that is ruled out, the "conclusion" is really a hypothesis to test.

Saturday, March 8, 2014

Fortran Example Code for Solving Cyclic Tridiagonal Matrix

I posted a short program which implements the algorithm presented here.

As an example it calculates the following cyclic tridiagonal system \[\begin{bmatrix}
 -2 &  1 &  0 &  0 &  2 \\
 1 &  -2 &  1 &  0 &  0 \\
 0 &  1 &  -2 &  1 &  0 \\
 0 &  0 &  1 &  -2 &  1 \\
 -1 &  0 &  0 &  1 &  -2 \\
\end{bmatrix} x = \begin{bmatrix}
 1 \\
 2 \\
 3 \\
 4 \\
 5 \\
\end{bmatrix}.\]

Wednesday, March 5, 2014

Earth and Asteroids

For some reason, I never thought that the odds of asteroid-earth collisions were as high as Ed Lu explains they are.


I heard about this on the Commonwealth Club podcast.

Monday, March 3, 2014

Cyclic Tridiagonal Matrices

In a recent post, I talked about how the Thomas algorithm can be used to solve a tridiagonal system of equations very efficiently. Recall, that a tridiagonal matrix looks like, \[A = \begin{bmatrix} b_1 & c_1 & 0 & ... & 0\\ a_2 & b_2 & c_2 & ... & 0 \\ 0 & & \ddots & & 0 \\ 0 & ... & a_{n-1} & b_{n-1} & c_{n-1} \\ 0 & & ... & a_n & b_n\\ \end{bmatrix}.\] In some applications with periodic boundaries, a variant called the cyclic tridiagonal matrix occurs, \[A' = \begin{bmatrix} b_1 & c_1 & 0 & ... & \beta\\ a_2 & b_2 & c_2 & ... & 0 \\ 0 & & \ddots & & 0 \\ 0 & ... & a_{n-1} & b_{n-1} & c_{n-1} \\ \alpha & & ... & a_n & b_n\\ \end{bmatrix}.\] The efficient Thomas algorithm, and any optimized program we may have written to implement it, cannot be applied.

However, thanks to the Sherman-Morrison formula, we are still in business!

Following the lead from Numerical Recipes, we can set \[u = \begin{bmatrix} \gamma \\ 0 \\ \vdots \\ 0 \\ \alpha \end{bmatrix}, \quad \quad v = \begin{bmatrix}1 \\ 0 \\ \vdots \\ 0 \\ \beta/\gamma \end{bmatrix},\] and modify the elements of the tridiagonal part \(A\) of the cyclic tridiagonal matrix \(A'\), \(b_1^\prime = b_1 - \gamma\), and \(b_n^{\prime} = b_n - \alpha \beta/\gamma\).

Here \(\gamma\) is arbitrary, but is set to \(-b_1\) by default.

Thus \(A' = A + uv^T\).

To solve \(A' x = b\), in addition to matrix multiplication, we need to make two calls to the tridiagonal system. Thus, generally speaking, the asymptotic cost remains \(\mathcal{O}(n)\), but is about twice as costly as a standard tridiagonal system.

Sunday, March 2, 2014

Links:

1. Interactive Visualizations at Setosa (and Victor Powell's blog)

2. Why "following your passion is bad career advice" (video)?

3. The story of R (H/T FlowingData).