## Thursday, June 15, 2017

### Joint Distribution From Marginals

Consider two dependent random variables, $y_1$ and $y_2$, with a correlation coefficient $\rho$.

Suppose you are given the marginal distributions $\pi(y_1)$ and $\pi(y_2)$ of the two random variables. Is it possible to construct the joint probability distribution $\pi(y_1, y_2)$ from the marginals?

In general, the answer is no: many different joint distributions share the same pair of marginals, so there is no unique answer. The marginals are like shadows of a hill cast from two orthogonal directions. The shadows are not sufficient to specify the full 3D shape (the joint distribution) of the hill.
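To make the non-uniqueness concrete, here is a small sketch using two made-up joint distributions over a pair of binary variables (the tables are illustrative assumptions, not taken from the post). Both have identical marginals, yet one is independent and the other is perfectly coupled.

```python
import numpy as np

# Two hypothetical joint probability tables (rows index y1, columns index y2).
joint_independent = np.array([[0.25, 0.25],
                              [0.25, 0.25]])
joint_coupled = np.array([[0.5, 0.0],
                          [0.0, 0.5]])


def marginals(joint):
    """Return (marginal of y1, marginal of y2) by summing out the other variable."""
    return joint.sum(axis=1), joint.sum(axis=0)


for name, joint in [("independent", joint_independent), ("coupled", joint_coupled)]:
    m1, m2 = marginals(joint)
    print(name, "marginal of y1:", m1, "marginal of y2:", m2)
```

Both tables report the marginals `[0.5, 0.5]`, so the marginals alone cannot tell them apart.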

Let us simplify the problem a little, so that we can seek a solution.

Let us assume $y_1$ and $y_2$ have zero mean and unit standard deviation. We can always generalize later by shifting (different mean) and scaling (different standard deviation). Let us also stack them into a single random vector $Y = [y_1, y_2]$.
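As a minimal sketch of the shifting and scaling step, the helpers below (`standardize` and `rescale` are hypothetical names introduced here) map data to zero mean and unit standard deviation and back again. The correlation coefficient is unaffected by a shift or a positive rescaling, which is why nothing is lost by working with standardized variables.

```python
import numpy as np


def standardize(y):
    """Shift and scale samples to zero mean and unit standard deviation."""
    return (y - y.mean()) / y.std()


def rescale(z, mean, std):
    """Undo the standardization: give samples the requested mean and std."""
    return mean + std * z


y = np.array([2.0, 4.0, 6.0, 8.0])  # made-up data for illustration
z = standardize(y)
print(z.mean(), z.std())
print(rescale(z, y.mean(), y.std()))  # recovers the original samples
```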

The covariance matrix of two such random variables is given by $C(Y) = \begin{bmatrix} E(y_1 y_1) - \mu_1 \mu_1 & E(y_1 y_2) - \mu_1 \mu_2 \\ E(y_2 y_1) - \mu_2 \mu_1 & E(y_2 y_2) - \mu_2 \mu_2 \end{bmatrix} = \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix},$ where $\mu_i$ is the mean of $y_i$. The diagonal entries are the variances $\sigma_i^2 = 1$, and the off-diagonal entries are $\rho \sigma_1 \sigma_2 = \rho$, where $\sigma_i$ is the standard deviation of $y_i$.

#### Method

A particular method for sampling from the joint distribution of correlated random variables $Y$ begins by drawing samples of independent random variables $X = [x_1, x_2]$ which have the same distribution as the desired marginal distributions.

Note that the covariance matrix in this case is the identity matrix, $C(X) = I$, because independent variables are uncorrelated and each has unit variance.

Now we recognize that the covariance matrix $C(Y)$ is symmetric and, for $|\rho| < 1$, positive definite. We can use Cholesky decomposition $C(Y) = LL^T$ to find the lower triangular matrix $L$.
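For the $2 \times 2$ case the factor can be written in closed form. As a worked check (valid for $|\rho| < 1$), matching the entries of $LL^T$ against $C(Y)$ gives

$L = \begin{bmatrix} 1 & 0 \\ \rho & \sqrt{1 - \rho^2} \end{bmatrix}, \qquad LL^T = \begin{bmatrix} 1 & \rho \\ \rho & \rho^2 + (1 - \rho^2) \end{bmatrix} = \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix}.$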

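The recipe then says that we can draw the correlated random variables by simply setting $Y = L X$. The covariance of the result is $L \, C(X) \, L^T = LL^T$, which is the desired $C(Y)$. One caveat: the marginals are preserved exactly only when the distribution family is closed under linear combinations, as the normal distribution is. For other marginals, $y_2$ is a weighted sum of $x_1$ and $x_2$, so its distribution generally changes shape even though the means, variances, and correlation come out right.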
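The recipe can be sketched in a few lines of NumPy. This is a minimal illustration, assuming standard normal marginals; the function name `correlated_samples`, the seed, and the value $\rho = 0.5$ are arbitrary choices made here, not part of the original post.

```python
import numpy as np


def correlated_samples(rho, n_samples, rng):
    """Draw independent standard normals X and return (X, Y) with Y = L X."""
    C = np.array([[1.0, rho],
                  [rho, 1.0]])
    L = np.linalg.cholesky(C)                 # lower triangular, C = L @ L.T
    X = rng.standard_normal((2, n_samples))   # rows are x1 and x2
    Y = L @ X                                 # rows are y1 and y2
    return X, Y


rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility
X, Y = correlated_samples(rho=0.5, n_samples=100_000, rng=rng)

print("correlation of X:", np.corrcoef(X)[0, 1])  # close to 0
print("correlation of Y:", np.corrcoef(Y)[0, 1])  # close to rho
```

Storing the samples as rows of a `(2, n_samples)` array keeps the code in the same form as the formula $Y = LX$.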

#### Example

Suppose we seek two random variables whose marginals are normal distributions (zero mean, unit standard deviation) with a correlation coefficient 0.2.

The method above asks us to start with independent random variables $X$ such as those below.

Cholesky decomposition with $\rho = 0.2$ gives us $L = \begin{bmatrix} 1 & 0 \\ 0.2 & 0.9798 \end{bmatrix}.$ If we generate $Y = LX$ using the same data points used to create the scatterplot above, we get,

The result has the same marginal distributions, and a non-zero correlation coefficient, as is visible from the figure above.
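The example can be reproduced numerically. The sketch below is not the code behind the original figures; the seed and the sample size are assumptions chosen here for illustration.

```python
import numpy as np

rho = 0.2
C = np.array([[1.0, rho],
              [rho, 1.0]])
L = np.linalg.cholesky(C)  # lower triangular factor, C = L @ L.T
print(np.round(L, 4))

rng = np.random.default_rng(42)        # arbitrary seed
X = rng.standard_normal((2, 50_000))   # independent standard normal samples
Y = L @ X                              # correlated samples

print("sample means of Y:", Y.mean(axis=1))
print("sample standard deviations of Y:", Y.std(axis=1))
print("sample correlation of Y:", np.corrcoef(Y)[0, 1])
```

The second row of the computed factor is $[\rho, \sqrt{1 - \rho^2}] \approx [0.2, 0.9798]$. The sample means and standard deviations should come out close to 0 and 1, and the sample correlation close to 0.2, up to sampling noise.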
