Thursday, March 30, 2017

Statistics and Gelman

Russ Roberts had a fantastic conversation with Andrew Gelman on a recent podcast. It covered a lot of issues and examples, some of which were familiar.

A particularly salient metaphor, "the Garden of Forking Paths", crystallized (for me) how people with integrity can end up p-hacking unintentionally.
In this garden of forking paths, whatever route you take seems predetermined, but that’s because the choices are done implicitly. The researchers are not trying multiple tests to see which has the best p-value; rather, they are using their scientific common sense to formulate their hypotheses in reasonable way, given the data they have. The mistake is in thinking that, if the particular path that was chosen yields statistical significance, that this is strong evidence in favor of the hypothesis.
This is why replication studies in which "researcher degrees of freedom" are taken away have more reliable scientific content. Unfortunately, they are unglamorous. Often, in the minds of the general population, they do not replace the flawed original study.
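The inflation that forking paths produce can be illustrated with a toy simulation. This is only a rough sketch under my own assumptions, not anything from the podcast or from Gelman's paper: the data contain no real effect, the covariate `group` is hypothetical, and "reporting whichever of three reasonable-looking comparisons is strongest" is a crude stand-in for the implicit, data-dependent choices described above. The p-values use a normal approximation to keep the example dependency-free.

```python
import numpy as np
from math import erfc, sqrt


def two_sided_p(a, b):
    """Two-sided p-value for a difference in means (normal approximation)."""
    se = sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    z = (a.mean() - b.mean()) / se
    return erfc(abs(z) / sqrt(2))


def simulate(n_sims=2000, n=100, seed=0):
    """Return (fixed_rate, forking_rate) of false positives at the 0.05 level."""
    rng = np.random.default_rng(seed)
    fixed_hits = 0
    forking_hits = 0
    for _ in range(n_sims):
        # Null world: treatment has no effect on the outcome.
        treated = rng.integers(0, 2, size=n).astype(bool)
        outcome = rng.normal(size=n)
        # Hypothetical binary covariate, unrelated to the outcome.
        group = rng.integers(0, 2, size=n).astype(bool)

        # Fixed path: one comparison on the full sample, chosen in advance.
        p_all = two_sided_p(outcome[treated], outcome[~treated])
        fixed_hits += p_all < 0.05

        # Forking paths: the full sample or either subgroup may end up
        # being "the" analysis, depending on how the data look.
        p_a = two_sided_p(outcome[treated & group], outcome[~treated & group])
        p_b = two_sided_p(outcome[treated & ~group], outcome[~treated & ~group])
        forking_hits += min(p_all, p_a, p_b) < 0.05
    return fixed_hits / n_sims, forking_hits / n_sims


fixed_rate, forking_rate = simulate()
print(f"fixed analysis:  {fixed_rate:.3f}")
print(f"forking paths:   {forking_rate:.3f}")
```

The fixed analysis should come out near the nominal 5%, while the forking version is noticeably higher even though only one result would ever be reported. A replication that fixes the analysis in advance corresponds to the first number.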

Gelman discusses numerous such examples on his blog. These include studies on "priming" and "power poses" that have failed to replicate. Sure, there is an element of schadenfreude, but what I find far more interesting is how scientists who championed a theory react to new disconfirming data. For instance, Daniel Kahneman recently admitted that he misjudged the strength of the scientific evidence on priming, and urged readers to disregard one of the chapters devoted to it in his best-seller "Thinking, Fast and Slow". Similarly, one of the coauthors of the original power poses work, Dana Carney, had the courage to publicly change her mind.

That is what good scientists do. They update their priors when new data instruct them to do so.

This brings me to another health and nutrition story doing the rounds on the internet. It suggests a 180-degree turn on how to deal with the rising incidence of peanut allergies. Instead of keeping infants away from nuts, it urges parents to incorporate them into the diet early and often. I haven't looked at the original study carefully, but my instincts on retractions and reversals of consensus tell me to take the findings seriously.


Monday, March 27, 2017

Logarithms of Negative Numbers

A plot of log(x) looks something like the following:

As x decreases to zero, log(x) approaches negative infinity. For negative values of real x, the real-valued log function is undefined. For example, consider the following numpy interaction:

>>> import numpy as np
>>> np.log(1)
0.0
>>> np.log(-1)
__main__:1: RuntimeWarning: invalid value encountered in log
nan

If I try to do the same in Octave, I get something different, and interesting.

octave:1> log(1)
ans = 0
octave:2> log(-1)
ans =  0.00000 + 3.14159i

The answer makes sense if we expand the scope of "x" from real to complex. We know Euler's famous identity, \(e^{i \pi} = -1\). Logarithms of negative numbers exist. They just exist in the complex plane, rather than on the real number line.

Octave's answer above just takes the logarithm of both sides of Euler's identity.

We can make Python behave similarly by explicitly specifying the complex nature of the argument. So while np.log(-1) returned nan above, the following works just as expected.

>>> np.log(-1+0j)
3.1415926535897931j
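More generally, for a negative real x the principal value of the complex logarithm is log|x| + iπ. A minimal sketch that checks this numerically follows; the sample values in `x` are my own, and `cmath.log` and `np.emath.log` are shown as two other standard routes to a complex result without writing `+0j` by hand.

```python
import cmath
import numpy as np

# A few negative real numbers, promoted to complex before taking the log.
x = np.array([-0.5, -1.0, -2.0, -10.0])
z = np.log(x.astype(complex))

# Real part is log|x|; imaginary part is pi for every negative real input.
real_matches = np.allclose(z.real, np.log(np.abs(x)))
imag_matches = np.allclose(z.imag, np.pi)
print(real_matches, imag_matches)

# The standard library and numpy's emath module switch to complex output
# automatically when the input is a negative real number.
z_cmath = cmath.log(-1)
z_emath = np.emath.log(-1)
print(z_cmath, z_emath)
```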

For x < 0, if we plot the real part of the complex logarithm, which equals log|x|, then we get a nice symmetric plot for log(x).


Notes:

  • In MATLAB, the command reallog is similar to np.log

Thursday, March 23, 2017

Housel on Writing

Morgan Housel is one of my favorite writers on the subject of economics and finance. He offers three pieces of writing advice in this column.

Paraphrasing,

1. Be direct
2. Connect fields
3. Rewrite

Tuesday, March 21, 2017

Try a Pod

I am an avid podcast listener; over the past 6 years, podcasts have enriched my commutes, workouts, and chores immeasurably. In the past few weeks, there has been a concerted call to evangelize for the platform ("try a pod"). In 2013, I shared what I was listening to then. Podcasts that I currently follow:

History/Politics
  • BackStory
  • My History Can Beat Up Your Politics
  • Hardcore History with Dan Carlin
  • CommonSense with Dan Carlin
  • Revisionist History
Science and Tech
  • Radiolab
  • The Skeptics' Guide to the Universe
  • Science Vs
  • a16z
  • Above Avalon
  • Full Disclosure
  • Note to Self
  • Recode Decode
  • Rationally Speaking
  • Reply All
  • 50 Things That Made the Modern World
Stories
  • Snap Judgment
  • The Moth
  • Criminal
  • This American Life
  • Found
  • 99% Invisible
Language
  • The Allusionist
  • And Eat it Too!
  • A Way with Words
Economics/Business
  • EconTalk
  • Five Good Questions
  • FT Alphachat
  • How I Built This
  • Invest like the Best
  • The Knowledge Project
  • Masters in Business
  • Rangeley Capital Podcast

Others
  • Audio Dharma
  • Philosophize This
  • Educate
  • Commonwealth Club of California
  • Fareed Zakaria GPS
  • Frontline audiocast
  • In Our Time
  • Intelligence Squared
  • Intelligence Squared US
  • Left Right and Center
  • Please Explain
  • More Perfect

Tuesday, March 14, 2017

Links:

1. Doug Natelson's compilation of "advice" blog-posts (nanoscale views)

2. Are Polar Coordinates Backwards? (John D. Cook)

3. Learning Styles are baseless? (Guardian)

4. 5 Unusual Proofs (PBS YouTube Channel)

Friday, March 10, 2017

QuickTip: Sorting Pairs of Numpy Arrays

Consider the two "connected" numpy arrays:

import numpy as np
x = np.array([1992,1991,1993])
y = np.array([15, 20, 30])

order = x.argsort()
x     = x[order]
y     = y[order]

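After sorting, x is in ascending order and each element of y has moved along with its partner in x:

x = array([1991, 1992, 1993])
y = array([20, 15, 30])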
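If this comes up often, the same idea can be wrapped in a small helper. The sketch below is one possible version, not part of the original tip: the name `sort_pairs` is made up, and it asks `argsort` for a stable sort so that entries with equal x keep their original relative order.

```python
import numpy as np


def sort_pairs(x, y):
    """Sort x in ascending order and reorder y with the same permutation."""
    x = np.asarray(x)
    y = np.asarray(y)
    # kind="stable" keeps ties in x in their original relative order.
    order = np.argsort(x, kind="stable")
    return x[order], y[order]


# Same data as above, plus a duplicate year to show the tie handling.
xs, ys = sort_pairs([1992, 1991, 1993, 1991], [15, 20, 30, 5])
print(xs)  # [1991 1991 1992 1993]
print(ys)  # [20  5 15 30]
```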

Wednesday, March 8, 2017

Perverse Incentives and Integrity

Edwards and Roy write about scientific integrity in the face of perverse incentive systems (full citation: Edwards Marc A. and Roy Siddhartha. Environmental Engineering Science. January 2017, 34(1): 51-61. doi:10.1089/ees.2016.0223.)

Here is a table from the paper, which grapples with incentives and unintended consequences.


Worth a look!