Talk:Correlation and Mutual Information


 * A discussion of the "CORREL" function in Excel may be helpful.
 * There are also faster ways of doing this in Excel via Tools -> Data Analysis -> Correlation.
 * Anscombe's quartet is a good example, but it might require more explanation. A reader might not understand at this stage what it tells us about the correlation coefficient. Does it mean that the correlation coefficient is a bad measure, or something else?
 * Engineering applications: this could really use an example.
 * Relative entropy: when discussing this, you will need to define the marginal entropy. Currently it is not defined, but it should be H(A) = -sum[p(A) log(p(A))].
 * Mutual information example: the probability densities for P and T are neither calculated nor interpreted correctly. For a proper probability density, integrating over some range of a variable must give the probability of finding that variable in that range. Thus, if we integrate p(T) from 1 to 2, we should obtain the probability of finding the system at a temperature between 1 and 2. Currently that is not the case.
 * Note too that the joint probability function is a constant, which does not make sense.
 * An easier example would be to use a discrete case for mutual information.
 * I'm extracting the example and placing it below for future editing.
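
As a starting point for the suggested discrete case, here is a minimal sketch in Python; the function name and the joint distribution are hypothetical, chosen only for illustration:

```python
import math

def mutual_information(joint):
    """Mutual information I(X;Y) in bits for a discrete joint
    distribution given as a dict {(x, y): p(x, y)}."""
    # Accumulate the marginals p(x) and p(y)
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    # I(X;Y) = sum over (x, y) of p(x,y) * log2[ p(x,y) / (p(x) p(y)) ]
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

# Hypothetical joint distribution of two correlated binary variables
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
print(mutual_information(joint))  # ≈ 0.278 bits
```

If the two variables are independent (e.g. all four cells equal to 0.25), the function returns 0 bits, which is a useful sanity check for any discrete example.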

Mutual Information Example
Below is a typical packed bed reactor that allows a gas-phase reaction to occur via a catalyst. The following equation is an adaptation of the Ergun equation for multiple reactions and membrane reactors.



The characteristic equation includes both a pressure drop and a temperature gradient. In this example, we will explore the mutual dependency of these two parameters as they relate to the distance the gas has traveled through the packed bed. The characteristic equation is:

$$ z(T,P)= \left(\frac{T}{P}\right)\left(\frac{10}{21}\right) $$

where,

• P = pressure (atm); with the boundaries that 1 ≤ P ≤ 5.

• T = temperature (degrees Celsius); with the boundaries that 1 ≤ T ≤ 10.
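
For concreteness, the characteristic equation and its stated boundaries can be written as a small function; this is only a sketch, and the name `z` simply mirrors the symbol above:

```python
def z(T, P):
    """Characteristic equation z(T, P) = (T/P)(10/21), for
    1 <= P <= 5 atm and 1 <= T <= 10 degrees Celsius."""
    if not (1 <= P <= 5 and 1 <= T <= 10):
        raise ValueError("T or P outside the stated boundaries")
    return (T / P) * (10 / 21)

print(z(10, 1))  # largest value on the domain: 100/21 ≈ 4.762
print(z(1, 5))   # smallest value on the domain: 2/21 ≈ 0.095
```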

Next we will find the marginal probability densities of T and P. Integrating over T leaves a function of P alone, which is the marginal density of P:

$$ p(P) = \int_{1}^{10} \left(\frac{10}{21}\right) \left(\frac{T}{P}\right)dT $$

$$ p(P) = \frac {23.571}{P}$$

Probability Density Function of P

Similarly, integrating over P leaves the marginal density of T:

$$ p(T) = \int_{1}^{5} \left(\frac{10}{21}\right) \left(\frac{T}{P}\right)dP $$

$$ p(T) = 0.766\,T $$

Probability Density Function of T

We will now find the joint probability density function of P and T.

$$ p(T,P) = \int_{1}^{5} \int_{1}^{10} \left(\frac{10}{21}\right) \left(\frac{T}{P}\right) dT dP $$

$$ p(T,P) = 37.938 $$
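
The constants 23.571, 0.766, and 37.938 follow in closed form from the integrals of T and of 1/P over the stated ranges. The sketch below checks only that arithmetic, not the probabilistic interpretation questioned in the comments at the top of this page:

```python
import math

# Closed-form values of the integrals written above.
# Integrating (10/21)(T/P) over T in [1, 10] gives (10/21)*(99/2)*(1/P):
coef_T_integral = (10 / 21) * (99 / 2)          # ≈ 23.571 (coefficient of 1/P)
# Integrating (10/21)(T/P) over P in [1, 5] gives (10/21)*ln(5)*T:
coef_P_integral = (10 / 21) * math.log(5)       # ≈ 0.766  (coefficient of T)
# The double integral over both ranges factors into the product:
double_integral = coef_T_integral * math.log(5)  # ≈ 37.938
print(coef_T_integral, coef_P_integral, double_integral)
```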

The mutual information between P and T is calculated below.

$$ I(T;P) = \int_{1}^{5} \int_{1}^{10} 37.938 \, \log \left( \frac{37.938\,P}{23.571 \times 0.766\,T} \right) dT\, dP $$

$$ I(T;P) = 385\ \text{bits} $$

$$ I(T;P) = 267\ \text{nats} $$

$$ I(T;P) = 116\ \text{bans} $$
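
The value of the integral as written, and the conversions between units, can be checked numerically; again, this verifies only the arithmetic, not the interpretation flagged in the comments at the top of this page. The midpoint-rule helper below is our own, written just for this check:

```python
import math

def integrand(T, P):
    # Integrand of I(T;P) exactly as written above; natural log gives nats.
    return 37.938 * math.log(37.938 * P / (23.571 * 0.766 * T))

def midpoint_2d(f, t0, t1, p0, p1, n=300):
    """Midpoint-rule approximation of the double integral of f(T, P)."""
    dt, dp = (t1 - t0) / n, (p1 - p0) / n
    total = 0.0
    for i in range(n):
        T = t0 + (i + 0.5) * dt
        for j in range(n):
            P = p0 + (j + 0.5) * dp
            total += f(T, P)
    return total * dt * dp

I_nats = midpoint_2d(integrand, 1, 10, 1, 5)
print(I_nats)                   # close to the 267 nats reported above
print(I_nats / math.log(2))     # nats -> bits
print(I_nats / math.log(10))    # nats -> bans
```

The three reported values are the same quantity in different logarithm bases: dividing the result in nats by ln 2 gives bits, and dividing by ln 10 gives bans.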