Covariance and correlation
In probability theory and statistics, the mathematical concepts of covariance and correlation are very similar.[1][2] Both describe the degree to which two random variables or sets of random variables tend to deviate from their expected values in similar ways.
If X and Y are two random variables, with means (expected values) μX and μY, and standard deviations σX and σY, respectively, then their covariance and correlation are as follows:

covariance: cov(X, Y) = σXY = E[(X − μX)(Y − μY)]
correlation: corr(X, Y) = ρXY = E[(X − μX)(Y − μY)] / (σX σY),

so that

ρXY = σXY / (σX σY).
where E is the expected value operator. Notably, correlation is dimensionless, while covariance is expressed in units obtained by multiplying the units of the two variables. The covariance of a variable with itself (i.e., cov(X, X)) is called the variance and is more commonly denoted σX², the square of the standard deviation. The correlation of a variable with itself is always 1 (except in the degenerate case where the variance is zero, in which case the correlation is undefined).
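As a sketch of the relationship above, the following computes sample covariance and correlation directly from the definitions (the data values here are made up for illustration; population-style averaging with ddof=0 is used for simplicity):

```python
import numpy as np

# Hypothetical paired samples of X and Y.
x = np.array([2.1, 2.5, 3.6, 4.0, 5.2])
y = np.array([8.0, 10.1, 12.5, 14.2, 16.0])

# Sample means stand in for the expectations mu_X and mu_Y.
mu_x, mu_y = x.mean(), y.mean()

# cov(X, Y) = E[(X - mu_X)(Y - mu_Y)], estimated by averaging over the sample.
cov_xy = np.mean((x - mu_x) * (y - mu_y))

# corr(X, Y) = cov(X, Y) / (sigma_X * sigma_Y): dividing by the standard
# deviations removes the units, leaving a dimensionless quantity in [-1, 1].
corr_xy = cov_xy / (x.std() * y.std())

print(cov_xy, corr_xy)
```

The same estimates are available directly via `np.cov(x, y, ddof=0)` and `np.corrcoef(x, y)`; the hand-rolled version is shown only to mirror the formulas term by term.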
Time series analysis
In the case of a time series which is stationary in the wide sense, both the means and variances are constant over time (E(Xn+m) = E(Xn) = μX and so on). In this case the cross-covariance and cross-correlation are functions of the time difference:

cross-covariance: σXY(m) = E[(Xn − μX)(Yn+m − μY)]
cross-correlation: ρXY(m) = σXY(m) / (σX σY).
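A minimal sketch of estimating these lag-dependent quantities from data, assuming a synthetic stationary pair in which Y is X delayed by 3 steps plus small noise (all names and the lag value are illustrative, not from the source):

```python
import numpy as np

def cross_covariance(x, y, m):
    """Sample estimate of sigma_XY(m) = E[(X_n - mu_X)(Y_{n+m} - mu_Y)], m >= 0."""
    xc = np.asarray(x, float) - np.mean(x)
    yc = np.asarray(y, float) - np.mean(y)
    # Pair X_n with Y_{n+m} and average over all overlapping positions.
    return np.mean(xc[: len(xc) - m] * yc[m:])

def cross_correlation(x, y, m):
    """Normalise by the (time-constant) standard deviations to get rho_XY(m)."""
    return cross_covariance(x, y, m) / (np.std(x) * np.std(y))

# Hypothetical wide-sense-stationary pair: Y lags X by 3 steps,
# so the cross-correlation should peak at time difference m = 3.
rng = np.random.default_rng(42)
x = rng.normal(size=2000)
y = np.roll(x, 3) + 0.1 * rng.normal(size=2000)

rhos = [cross_correlation(x, y, m) for m in range(6)]
print(rhos)
```

Because the process is stationary, the estimates depend only on the lag m, not on the absolute time index, which is what lets a single average over n stand in for the expectation.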
Although the values of the theoretical covariances and correlations are linked in the above way, the probability distributions of sample estimates of these quantities are not linked in any simple way, and they generally need to be treated separately. These distributions depend on the joint distribution of the pair of random quantities (X, Y) when the sampled pairs are assumed independent of one another. In the case of a time series, the distributions depend on the joint distribution of the whole series.