Thursday, January 28, 2016

Ornstein-Zernike Equation

I've been teaching myself some liquid state theory for a coarse-graining problem I am interested in.
  1. The classic "Theory of Simple Liquids" by Hansen and McDonald is of course a great textbook.
  2. Another freely available introduction, especially focused on the Ornstein-Zernike equation and its closures, is this readable chapter by Nagele called "Theories of Fluid Microstructure" (pdf). This presentation, which closely follows the material in the chapter, is also useful as a review.
  3. Finally, I found a very nice description of the equation, its closures, a review of numerical techniques to solve the OZ equation, and (wait, I am not done) a Python program, called pyOZ, to solve the OZ equations. While it is no longer actively maintained (the last release was in 2009), it is a very handy resource. Thank you, Lubos Vrbka!
  4. There is also a Fortran 90 OZ equation solver that is maintained as part of a larger project here.
  5. Per Linse has a better documented OZ Fortran 95 software.
  6. A reasonably straightforward explanation of the computational method used is described here.

Monday, January 25, 2016

Links

1. A great Python resource page by Andy Terrel, especially for HPC.

2. Why Spiderman can't exist, unless he shrinks himself to the size of a lizard.

3. Nick Moore gave a fantastic talk in our department last week. Here is a link to some of his work that has found its way into the popular press.

4. A great visualization of completing the square that Eddie Woo tweeted:


Thursday, January 21, 2016

Other MCMC Class Links

1. Bernd Berg's MCMC Class at FSU (link). The lectures are clear and concise, with a bias towards statistical physics.

2. Michael Mascagni's Monte Carlo class at FSU. The landing page has many useful links, and the lecture notes are available here. The class focuses on random number generation, direct Monte Carlo methods, and using MC to solve PDEs.

3. MCMC for computer vision at UCLA. Includes some advanced topics like reversible-jump MCMC.

4. Glenn Cowan at University of London has a great course (and accompanying text) on Statistical Data Analysis, which has a fair amount of overlap with some MC topics.

5. A very nice IPython/Jupyter notebook-powered MCMC class at Harvard.

6. Werner Krauth has an excellent Coursera class on Statistical Mechanics with lots of MC and MCMC computation.

7. Of course, my own course notes are here.

Tuesday, January 19, 2016

Links: A thing I learned recently

As I listened to a recent Surprisingly Awesome podcast entitled "Broccoli," I had a mini-spiritual "we are one" moment. For I learned something about the wild plant Brassica oleracea that really made me go "this is awesome!"

According to wikipedia:
Brassica oleracea is the species of plant that includes many common foods as cultivars, including cabbage, broccoli, cauliflower, kale, Brussels sprouts, collard greens, savoy, kohlrabi and kai-lan.
And here:
Enterprising farmers over the last several thousand years contributed to domesticating several distinct lineages of B. oleracea, each amplifying different parts of this wild plant to produce several vegetable varieties, or cultivar groups or subspecies (“ssp.”): kale and collard greens (ssp. acephala), Chinese broccoli (ssp. alboglabra), red and green cabbages (ssp. capitata), savoy cabbage (ssp. sabauda), kohlrabi (ssp. gongylodes), Brussels sprouts (ssp. gemmifera), broccoli (ssp. italica), and cauliflower (ssp. botrytis). These varieties look dramatically – sometimes comically – different but are nonetheless considered to be the same species because they are all still interfertile, capable of mating with one another and producing fertile offspring.
So all these different vegetables are like different breeds of dog. In principle, just as you can get a doberman and chihuahua to produce an offspring, you can get cabbage and broccoli to have sex with each other.

Isn't that wild!

Sunday, January 17, 2016

Robert Frank: Talk at Google

As I mentioned in the last post, I found out about Robert Frank's book "The Economic Naturalist" from a recent EconTalk podcast. Here is his talk at Google, when he was on a book tour.


Saturday, January 16, 2016

Of Tuxedos and Wedding Dresses

In a recent EconTalk podcast, Russ Roberts discussed the following puzzle with the guest Robert Frank:

Why do grooms typically rent tuxedos, while brides usually buy their wedding gowns?

As described here:
[Let's begin] with the assumption that women are more likely than men to want to make a fashion statement on big social occasions. It’s a strong assumption. But it’s also a plausible one. I’ve described this example in many different countries, and no one has yet objected. If we grant that assumption, one implication is that a rental company would have to carry a huge stock of distinctive gowns—perhaps forty or fifty in each size, to enable brides to achieve their goal. Because each garment would be rented only infrequently, perhaps just once every four or five years, the company would have to charge its rental customers more than the purchase price of the garment just to cover its costs. But if buying were cheaper than renting, why would anyone rent?

Conditions are markedly different in a rental market for tuxedos. Because grooms are willing to settle for a standard style, a rental company can serve this market with an inventory of only a few tuxedos in each size. So each suit gets rented several times a year, which enables a rental company to cover its costs by charging only a small fraction of the tuxedo’s purchase price.

Thursday, January 14, 2016

Kelly Criterion: Derivation

This is a follow-up to a previous post on the Kelly Criterion.

Setup

You may be familiar with the compound interest formula: \[a_n = a_0 (1+r)^n,\] where \(r\) is the constant annual (say) rate of return. Setting \(R = (1+r)\), we have \(a_n = a_0 R^n\).

If the returns per year are not constant (think stocks instead of bonds), then we have to modify the formula to, \[a_n = a_0 \prod_{i=1}^n R_i.\] Suppose the \(R_i\) come from a known (and "stable over time") random process. We know how to deal with sums and products of independent and identically distributed (iid) random variables.

As I demonstrated in the previous post, we don't want to maximize the arithmetic average \(\langle a_n \rangle\) because it is skewed by positive outcomes; instead, we use a geometric average.

We can do this using the properties of logarithms.
\[\begin{align}
    \log a_n & = \log a_0 + \sum_{i=1}^{n} \log R_i\\
    \log \frac{a_n}{a_0} & = \sum_{i=1}^{n} \log R_i\\
    \langle \log \frac{a_n}{a_0} \rangle & = n \langle \log R \rangle,\\
    \end{align}\] where I have used the result that the expected value of the sum of iid random variables, \(y = \sum_{i=1}^{n} x_i,\) is given by \(\langle y \rangle = n \langle x \rangle.\)

We want to maximize \(\langle \log \frac{a_n}{a_0} \rangle\), which we do by maximizing \(\langle \log R \rangle\).

Let us denote the geometric average return \(S = \exp(\langle \log R \rangle)\), so that we can crudely write \(a_n \sim a_0 S^{n}\) (compare this with the formula \(a_n = a_0 R^n\) for a constant rate of return).

Note that \(S\) is a function of \(f\). We want to choose \(f\) to maximize this growth rate.
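To make the role of \(\langle \log R \rangle\) concrete, here is a quick numerical sanity check (the return factors 1.5 and 0.6 are made-up numbers for illustration): the realized per-bet growth factor \((a_n/a_0)^{1/n}\) tracks \(S = \exp(\langle \log R \rangle)\), not the arithmetic mean \(\langle R \rangle\).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-bet return factors, equally likely: R = 1.5 or R = 0.6
R_values = np.array([1.5, 0.6])
n = 100_000

R = rng.choice(R_values, size=n)          # a long iid sequence of returns
realized = np.exp(np.log(R).mean())       # realized (a_n / a_0)^(1/n), via logs

S = np.exp(np.log(R_values).mean())       # exp(<log R>) = sqrt(1.5 * 0.6) ≈ 0.949
arithmetic = R_values.mean()              # <R> = 1.05

print(realized, S, arithmetic)
```

Even though the arithmetic mean return exceeds 1, the typical bankroll shrinks here, because \(S < 1\).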

Optimization

Back to our wager. Let us generalize it a little bit.

Suppose there are \(m\) possible outcomes (instead of just two), each with probability \(q_i\), and payoff \(b_i\). Note that the \(q_i\)s have to add to one. How much do we wager?

We want to maximize \(S\) (or \(\log S\), since it is a monotonic function). \[\begin{align}
\log S & = \langle \log R \rangle \\
       & = q_1 \log(1 + b_1 f) + ... + q_m \log(1 + b_m f) \\
       & = \sum_{i=1}^{m} q_i \log(1 + b_i f)
      \end{align}\] Setting the derivative with respect to \(f\) to zero:
\[\dfrac{d \log S}{df} = \sum_{i=1}^{m} \frac{q_i b_i}{1 + b_i f} = 0\] Solving this equation for \(f\) gives the optimal value \(f^*\), which corresponds to an average growth rate of \[S = \exp \left( \sum_{i=1}^{m} q_i \log(1 + b_i f^*) \right)\]
Special Case

When we have just two outcomes, \[\frac{q_1 b_1}{1 + b_1 f} + \frac{q_2 b_2}{1 + b_2 f} = 0.\] Following the example in the last post, setting \(b_1 = b\), \(b_2 = -1\), and \(q_1 = q\) (so \(q_2 = 1 - q\)), we get \[\frac{q b}{1 + b f} = \frac{1 - q}{1 - f},\] which simplifies to the familiar rule presented there.
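Cross-multiplying makes the simplification explicit:
\[\begin{align}
q b (1 - f) & = (1 - q)(1 + b f) \\
q b - q b f & = 1 - q + b f - q b f \\
q b - (1 - q) & = b f \\
f^* & = \frac{q b - (1 - q)}{b}.
\end{align}\]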

Python Code

Let's encode the equation to solve:

from scipy.optimize import brentq

def funcKelly(f, q, b):
    """equation to solve for f: sum_i q_i * b_i / (1 + b_i * f) = 0"""
    s = 0.
    for qi, bi in zip(q, b):
        s += (qi * bi) / (1. + bi * f)
    return s

def optimalKelly(q, b):
    """given probabilities q and payouts b,
       find the optimal Kelly fraction by
       solving funcKelly(f, q, b) = 0 for f"""
    # stop just short of f = 1: if any payout b_i = -1, the
    # term 1 + b_i * f vanishes there and funcKelly blows up
    fstar = brentq(funcKelly, 0.0, 1.0 - 1e-9, args=(q, b))
    return fstar
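As a quick sanity check, here is a self-contained sketch (folding the pieces above into one script) that reproduces the two-outcome answer from the earlier Kelly post:

```python
from scipy.optimize import brentq

def funcKelly(f, q, b):
    """d(log S)/df: sum_i q_i * b_i / (1 + b_i * f)."""
    return sum(qi * bi / (1.0 + bi * f) for qi, bi in zip(q, b))

# Two-outcome wager: win b = 2 per dollar with probability 0.5,
# lose the stake (b = -1) otherwise.
q = (0.5, 0.5)
b = (2.0, -1.0)

# The bracket stops just short of f = 1 to avoid the singularity
# at 1 + b_i * f = 0 when b_i = -1.
fstar = brentq(funcKelly, 1e-9, 1.0 - 1e-9, args=(q, b))
print(fstar)  # ≈ 0.25, matching f* = (q(b+1) - 1)/b
```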

Geometric Average Growth Rate

Finally, let me plot the average growth rate S versus fraction f, by taking the average of 1000 independent simulations (each a series of 100 bets) with q = (0.5, 0.5), b = (2, -1).

The "star" on the plot shows the optimality of \(f^*\).
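The plot itself is not reproduced here, but a sketch along these lines (my own reconstruction; the variable names and seed are arbitrary) generates the data behind it:

```python
import numpy as np

rng = np.random.default_rng(1)
q, b = (0.5, 0.5), np.array([2.0, -1.0])
n_bets, n_runs = 100, 1000

def growth_rate(f):
    """Average per-bet growth factor exp(<log R>) over n_runs series of bets."""
    outcomes = rng.choice(2, size=(n_runs, n_bets), p=q)  # 0 = win, 1 = loss
    R = 1.0 + b[outcomes] * f                             # R_i = 1 + b_i f
    return np.exp(np.log(R).mean())

fs = np.linspace(0.05, 0.60, 12)       # grid of betting fractions
S = [growth_rate(f) for f in fs]
print(fs[int(np.argmax(S))])           # the maximum should land near f* = 0.25
```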

Tuesday, January 12, 2016

MCMC Class Lecture Notes

In the fall of 2015, I taught an applied introductory course in Markov-Chain Monte Carlo Methods targeted to graduate students from scientific computing, engineering, math, and computer science.

Here's the blurb I put out for the course:
Markov Chain Monte Carlo (MCMC) is one of the most powerful and versatile methods developed in the 20th century. It uses a sequence of random numbers to solve important problems in physics, computational biology, econometrics, political science, Bayesian inference, machine learning, data science, optimization, etc. For many of these problems, simple Monte Carlo ("integration by darts") is inefficient. Often, MCMC is the answer.
Broadly speaking, MCMC is a collection of sampling methods that allows us to solve problems of integration, optimization, and simulation in high-dimensional spaces. In this course, we will look at the foundations of Monte Carlo and MCMC, introduce and implement different sampling algorithms, and develop statistical concepts and intuition to analyze convergence and error in the simulation. Assignments and labs will consider illustrative examples from statistics, material science, physics, economics, optimization, and Bayesian inference.
The complete set of lecture notes may be downloaded from my Google Sites webpage.

Wednesday, January 6, 2016

Links: Underbelly of Doing Science

1. John Horgan @ScientificAmerican
[...] taught me some lessons about science journalism that my subsequent experiences reinforced. First, researchers, when accused of hype, love to blame it on the media. But media hype can usually be traced back to the researchers themselves.
2.  Gary Marcus @NewYorker
But science is not just about conclusions, which are occasionally incorrect. It’s about a methodology for investigation, which includes, at its core, a relentless drive towards questioning that which came before. You can both love science and question it. As my father, who passed away earlier this year, taught me, there is no contradiction between the two.
3. Jesse Singal @NYMag
After all, what both the [...] cases have in common  [...] is the extent to which groups of progressive self-appointed defenders of social justice banded together to launch full-throated assaults on legitimate science, and the extent to which these attacks were abetted by left-leaning academic institutions and activists too scared to stand up to the attackers, often out of a fear of being lumped in with those being attacked, or of being accused of wobbly allyship.
4. Steve Novella @Neurologica
Skeptics are often in a tricky position. We simultaneously are cheerleaders for science, promoting science education, scientific literacy, and the power of science as the best method for understanding the universe. 
At the same time the skeptical approach requires that we explore and discuss all the various flaws, errors, and weaknesses in the institutions and process of science. Science in theory is fantastic, but it is practiced by flawed people with all their cognitive biases and perverse incentives (much like democracy or capitalism).

Sunday, January 3, 2016

Kelly Criterion

Suppose someone offers you the following deal: you put down a certain amount of money. A fair coin is tossed.  Heads - you triple your money; tails - you lose the money wagered.

Loss Aversion

Q1: Do you take the deal?
A1: You probably should; the expected payout is 0.5 * $2 + 0.5 * (-$1) = $0.50 > 0.

Q2: Suppose you have $1 million in life savings. Do you go all in?
A2: You probably shouldn't. There is a 50% chance your life-savings will be wiped out! The downside doesn't justify the upside.
[Figure: Payout based on fraction wagered, assuming a starting amount of $1, for a single bet]
Now say the terms of the deal are extended.

Q3: The deal will be offered, say, once every month for the rest of your life. What fraction of your savings "f" would you wager?

If \(f \approx 0\), you don't play the game, or bet small; you are probably not making the most of the opportunity offered. If \(f \approx 1\), you go all in; a single loss will wipe you out entirely.

The intuitive answer seems to be somewhere between 0 and 1.

Further let us suppose that this "fraction" is a constant over all the (monthly) bets.

Kelly Criterion

The Kelly criterion informs the size of the bet, given the payouts and probabilities of possible outcomes. From the wikipedia article, if

  • \(f^*\) is the fraction to bet;
  • \(b\) is the net earnings per dollar wagered, if you win;
  • \(q\) is the probability of winning;
  • \(1-q\) is the probability of losing,
then the optimal fraction to bet is \[f^{*} = \frac{bq - (1-q)}{b} = \frac{q(b + 1) - 1}{b}.\] This fraction may be thought of as: \[f^{*} = \frac{\text{expected payout}}{\text{payout if you win}}\]

We will derive this formula in a later post. Here, it is easy to see that \(f^* = (2 \times 0.5 - 0.5)/2 = 0.25\).
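In code, the formula is a one-liner (the function name is mine):

```python
def kelly_fraction(q, b):
    """Kelly fraction for a binary bet: win net b per dollar wagered
       with probability q, lose the stake with probability 1 - q."""
    return (q * (b + 1) - 1) / b

print(kelly_fraction(0.5, 2))  # 0.25 for our triple-or-nothing coin flip
```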

Let's simulate a series of such bets. Suppose we start with an amount \(a_0\). After \(n\) bets, we are left with \(a_n = a_0 R_1 R_2 ... R_n\), or \[a_n = a_0 \prod_{i=1}^{n} R_i,\] where \(R_i\) is the amount we are left with, for each available dollar, after bet \(i\).

[Figure: Starting with $1, betting f* = 0.25 of the bankroll on each of 100 bets]
The best one can do is to win all bets, leading to a payout of \[a_{n} = a_0 (1 + bf)^n.\]

The worst one can do is to lose all bets, leading to a payout of \[a_{n} = a_0 (1-f)^n.\] Notice that in theory you never go bankrupt if \(f < 1\), since you never wager all your assets. However, in practice there is usually a limit, since fractions of a penny may be unacceptable wagers.
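A single bankroll series like the one plotted above can be simulated in a few lines (a sketch; the seed and variable names are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)

a0, f, n = 1.0, 0.25, 100          # start with $1, bet f* of bankroll, 100 bets
b = np.array([2.0, -1.0])          # net payout per dollar: win / lose
outcomes = rng.choice(2, size=n)   # fair coin: index 0 (win) or 1 (loss)

R = 1.0 + b[outcomes] * f          # R_i = 1 + b_i f for each bet
a = a0 * np.cumprod(R)             # bankroll trajectory a_1 ... a_n

best = a0 * (1 + b[0] * f) ** n    # win every bet
worst = a0 * (1 - f) ** n          # lose every bet
print(a[-1], best, worst)
```

The trajectory stays strictly positive, and always sits between the two extreme payouts.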

Simulations

One series is an anecdote. Let's collect some data by repeating the experiment 10 times at different values of f: 0.25 (optimal), 0.05 (smaller than optimal), and 0.6 (larger than optimal).
[Figure: f = 0.25. Black and blue lines are arithmetic and geometric means, respectively.]
You notice that the arithmetic mean (the thick black line) is overly optimistic; it is heavily influenced by a lucky few outliers. The thick blue line is the geometric average, which seems more "reasonable". Geometric means weigh positive outliers less strongly.

To see this important concept in a more salient way, assume you took a series of 10 bets, and went all in (f = 1). Everything else about the problem is unchanged. Unless you win every round (probability of \(0.5^{10} = 1/1024\)) and end up with \(3^{10}\) dollars, you will end up with zero (probability = 1023/1024). However, the mean payout \(\langle a_n \rangle\) = 57.7 dollars, which obscures reality since the geometric mean and the median are essentially 0.
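The arithmetic checks out in two lines:

```python
p_win_all = 0.5 ** 10               # must win all 10 bets: 1/1024
mean_payout = p_win_all * 3 ** 10   # every other outcome pays 0
print(mean_payout)                  # 57.665..., the ~57.7 dollars quoted above
```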

This is like playing a lottery.

If we use a smaller \(f\), we trade return for smaller variance. We are being more conservative when we do this.

Let's pick \(f = 0.05 < f^*\) to demonstrate this. For ease of comparison, I've kept the scale of the plot the same as the previous one.
[Figure: f = 0.05]
Now \(f = 0.6 > f^*\) shows how the downside can catch up with us if we get too risk-on. Lots of disasters.
[Figure: f = 0.6]
This post has gone on too long, so I'll discuss the derivation of the criterion in a following post.

Friday, January 1, 2016

Math Links

1. Another take on the Pinker-Taleb "debate".
Narratives without statistics are blind, but statistics without narratives are empty. (Pinker)
2. Book review of Really Big Numbers
How big is 10^100 [a googol]? It’s bigger than the number of atoms that could in theory be packed into a volume the size of the observable universe, but not by much (that is to say, not by more than, oh, a dozen orders of magnitude — which is small potatoes when you’re talking about a number like a googol, which exceeds the number 1 by a hundred orders of magnitude). 
Schwartz points out that a googol is exactly equal to the number of ways to make a painting consisting of 100 colored squares, using a palette of 10 colors.

3. An impressive quartet of mathematicians attempt something very hard.
That is the case with these four mathematicians. They are all individually talented, and each has pursued his own research interests over the years. But they are also close friends with a shared background and a similar approach to mathematics. This has allowed them to prompt each other, teach each other, and make discoveries together that they might not otherwise have made so easily. These include several smaller papers they’ve written in tandem and now, most recently, their biggest collaborative discovery yet — a forthcoming result by Zhang and Yun that’s already being hailed as one of the most exciting breakthroughs in an important area of number theory in the last 30 years.