1. The Psychology of Small Packages (H/T Farnam Street)

2. Wilkinson's Matrices

3. The Grammar of Anarchy

2. Wilkinson's Matrices

3. The Grammar of Anarchy

1. The Psychology of Small Packages (H/T Farnam Street)

2. Wilkinson's Matrices

3. The Grammar of Anarchy

2. Wilkinson's Matrices

3. The Grammar of Anarchy

The quintessential least-squares problem looks like \[A x = b,\] where \(A_{m \times n}\) is a "tall" matrix with \(m > n\), \(x_{n \times 1}\) is the vector we are trying to "regress", and \(b_{m \times 1}\) is a vector as tall as \(A\).

Since we have too many equations or constraints (m) compared with unknowns (n), we can't solve the problem in the usual sense. Hence we seek a least-squares solution, which can be obtained by hitting both sides of the original equation with the transpose \(A'\), to obtain the so called*normal* equation, \[A' A x = A' b.\]Now we have a linear system \(n\) equations and \(n\) unknowns, which can be solved for \(x\).

There are three methods one can use:

Since we have too many equations or constraints (m) compared with unknowns (n), we can't solve the problem in the usual sense. Hence we seek a least-squares solution, which can be obtained by hitting both sides of the original equation with the transpose \(A'\), to obtain the so called

There are three methods one can use:

**Direct solution of Normal Equations**: The idea is to factorize \(A' A\) using Cholesky decomposition, and solve the resulting system using forward and back substitution. This is straightforward, fast (asymptotic cost \(\sim m n^2 + n^3/3\)), but susceptible to round-off. We will explore this last part in a little bit.**QR Decomposition**: The idea is to first factorize \(A = QR\). Then, we can write the normal equations as \[A' A x = b \implies R' Q' Q R x = R' Q' b.\] Since \(Q\) is orthogonal, we can simplify the equations to the triangular system \[R x = Q' b.\] The asymptotic cost is \(2mn^2 - 2n^3/3\). This method is less susceptible to round-off.**SVD**: We can always bring out the ultimate thermonuclear weapon. We first factorize \(A = U S V'\). The normal equations can then be simplified as above into \[S V' x = U' b.\]The asymptotic cost is \(2mn^2 + 11 n^3\). This method is also less susceptible to round-off. An advantage is that, if required, we can throw in some regularization at a problem to make it less nasty.

Let us explore the relative merits of the three methods by considering a Matlab example.

So I was at a dinner this evening, where this guy Chris Jordan gave an amazing talk.

You really should check out his artwork. He specializes in "garbage" - using photographs to give us a sense of the maddening material consumption in our society. It is funny how pictures can make big mind-numbing numbers (of discarded plastic bottles, cellphones, cars etc.) spring to life; how they replace apathy with grief.

His remarkable ability to make us "feel" these big numbers is captured in this TED talk.

A particularly poignant story that he told revolves around the albatrosses on Midway, and the amount of plastic found in the guts of dead baby albatrosses. As a friend apparently said to him, "if I start crying, I am worried I will never stop."

You really should check out his artwork. He specializes in "garbage" - using photographs to give us a sense of the maddening material consumption in our society. It is funny how pictures can make big mind-numbing numbers (of discarded plastic bottles, cellphones, cars etc.) spring to life; how they replace apathy with grief.

His remarkable ability to make us "feel" these big numbers is captured in this TED talk.

A particularly poignant story that he told revolves around the albatrosses on Midway, and the amount of plastic found in the guts of dead baby albatrosses. As a friend apparently said to him, "if I start crying, I am worried I will never stop."

Consider a simple Matlab/Octave exercise.

I have an N by 2 vector**x**. Thus, **x(i,1)** and **x(i,2)** are the entries of the i-th row. Say I want to rearrange **x** so that the first entry in any row is larger than the second. That is, I want to switch the entries of a row, only if a certain condition is met.

It is easy to write this in a element-by-element fashion

**for i = 1:N**

** if x(i,2) > x(i,1)**

** c = x(i,1);**

** x(i,1) = x(i,2);**

** x(i,2) = c;**

** endif**

**endfor**

But we all know that element-by-element operations in Matlab or Octave suck! Here is another vectorized method:

**SwitchCond = x(:,2) < x(:,1);**

**x(SwitchCond,[1,2]) = x(SwitchCond,[2,1]);**

Besides being faster, it is also more succinct.

I have an N by 2 vector

It is easy to write this in a element-by-element fashion

But we all know that element-by-element operations in Matlab or Octave suck! Here is another vectorized method:

Besides being faster, it is also more succinct.

1. The Art of Splash: Beautiful pictures of fluids in motion!

2. Nice back and forth between Deborah Meier and Eric Hanushek at Bridging Differences.

3. Yet another writing guide, with some nice references.

2. Nice back and forth between Deborah Meier and Eric Hanushek at Bridging Differences.

3. Yet another writing guide, with some nice references.

Subscribe to:
Posts (Atom)