Matrix normal distribution

Matrix normal
Notation: \mathcal{MN}_{n,p}(\mathbf{M}, \mathbf{U}, \mathbf{V})
Parameters:
\mathbf{M} location (real n\times p matrix)
\mathbf{U} scale (positive-definite real n\times n matrix)
\mathbf{V} scale (positive-definite real p\times p matrix)
Support: \mathbf{X} \in \mathbb{R}^{n \times p}
pdf: \frac{\exp\left( -\frac{1}{2} \, \mathrm{tr}\left[ \mathbf{V}^{-1} (\mathbf{X} - \mathbf{M})^{T} \mathbf{U}^{-1} (\mathbf{X} - \mathbf{M}) \right] \right)}{(2\pi)^{np/2} |\mathbf{V}|^{n/2} |\mathbf{U}|^{p/2}}
Mean: \mathbf{M}
Variance: \mathbf{U} (among-row) and \mathbf{V} (among-column)

In statistics, the matrix normal distribution is a probability distribution that is a generalization of the multivariate normal distribution to matrix-valued random variables.


Definition

The probability density function for the random matrix X (n × p) that follows the matrix normal distribution \mathcal{MN}_{n,p}(\mathbf{M}, \mathbf{U}, \mathbf{V}) has the form:


p(\mathbf{X}|\mathbf{M}, \mathbf{U}, \mathbf{V}) = \frac{\exp\left( -\frac{1}{2} \, \mathrm{tr}\left[ \mathbf{V}^{-1} (\mathbf{X} - \mathbf{M})^{T} \mathbf{U}^{-1} (\mathbf{X} - \mathbf{M}) \right] \right)}{(2\pi)^{np/2} |\mathbf{V}|^{n/2} |\mathbf{U}|^{p/2}}

where M is n × p, U is n × n and V is p × p.
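The density can be evaluated numerically on the log scale. The following is a minimal sketch, assuming NumPy is available; the function name matrix_normal_logpdf is illustrative and not part of any library. It uses linear solves and log-determinants rather than explicit inverses and determinants, which is a choice made here for numerical stability, not something prescribed by the definition.

```python
import numpy as np

def matrix_normal_logpdf(X, M, U, V):
    """Log-density of MN_{n,p}(M, U, V) at X, term by term from the formula above."""
    X, M, U, V = (np.asarray(a, dtype=float) for a in (X, M, U, V))
    n, p = X.shape
    R = X - M
    # tr[V^{-1} R^T U^{-1} R], computed with linear solves instead of inverses
    quad = np.trace(np.linalg.solve(V, R.T) @ np.linalg.solve(U, R))
    # slogdet returns (sign, log|det|); the sign is +1 for positive-definite input
    logdet_U = np.linalg.slogdet(U)[1]
    logdet_V = np.linalg.slogdet(V)[1]
    return -0.5 * (quad + n * p * np.log(2 * np.pi) + n * logdet_V + p * logdet_U)

# Usage: at X = M with identity scale matrices, only the (2*pi)^{np/2} term remains
value = matrix_normal_logpdf(np.zeros((2, 3)), np.zeros((2, 3)), np.eye(2), np.eye(3))
```

For n = p = 1 the function reduces to the log-density of a univariate normal with variance UV.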

There are several ways to define the two covariance matrices. One possibility is

\mathbf{U} = E[(\mathbf{X} - \mathbf{M})(\mathbf{X} - \mathbf{M})^{T}]
\mathbf{V} = E[(\mathbf{X} - \mathbf{M})^{T} (\mathbf{X} - \mathbf{M})] / c

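where c is a constant that depends on U and fixes the relative scale of the two matrices. Some such normalization is needed because the density determines U and V only up to a reciprocal scale factor: replacing U by sU and V by V/s for any s > 0 leaves the density unchanged.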
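The value of the constant can be worked out entrywise. The short derivation below assumes the entrywise covariance \mathrm{Cov}(X_{ik}, X_{jl}) = U_{ij} V_{kl}, which is not stated explicitly above but is equivalent to the Kronecker-product form of the covariance of \mathrm{vec}(\mathbf{X}):

\left( E[(\mathbf{X} - \mathbf{M})(\mathbf{X} - \mathbf{M})^{T}] \right)_{ij} = \sum_{k=1}^{p} \mathrm{Cov}(X_{ik}, X_{jk}) = U_{ij} \sum_{k=1}^{p} V_{kk} = U_{ij} \, \mathrm{tr}(\mathbf{V})
\left( E[(\mathbf{X} - \mathbf{M})^{T} (\mathbf{X} - \mathbf{M})] \right)_{kl} = \sum_{i=1}^{n} \mathrm{Cov}(X_{ik}, X_{il}) = V_{kl} \sum_{i=1}^{n} U_{ii} = V_{kl} \, \mathrm{tr}(\mathbf{U})

Under that assumption, the first definition holds exactly when \mathrm{tr}(\mathbf{V}) = 1, and the second then holds with c = \mathrm{tr}(\mathbf{U}).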

The matrix normal is related to the multivariate normal distribution in the following way:

\mathbf{X} \sim \mathcal{MN}_{n,p}(\mathbf{M}, \mathbf{U}, \mathbf{V}),

if and only if

\mathrm{vec}(\mathbf{X}) \sim \mathcal{N}_{np}(\mathrm{vec}(\mathbf{M}), \mathbf{V} \otimes \mathbf{U})

where \otimes denotes the Kronecker product and \mathrm{vec}(\mathbf{M}) denotes the vectorization of \mathbf{M}.
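The equivalence can be checked numerically at a single point. The sketch below assumes NumPy; the helper names random_spd and vec, the seed, and the sizes are illustrative choices. It evaluates the trace form of the matrix normal log-density and the multivariate normal log-density of the vectorized matrix, and compares them. Note that vec must stack columns (column-major order) for the covariance to be the Kronecker product in the order V, U.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 3, 2

def random_spd(k):
    # Hypothetical helper: a random symmetric positive-definite k x k matrix
    A = rng.standard_normal((k, k))
    return A @ A.T + k * np.eye(k)

def vec(A):
    # Stack the columns of A into one long vector (column-major order)
    return A.flatten(order="F")

M = rng.standard_normal((n, p))
U, V = random_spd(n), random_spd(p)
X = rng.standard_normal((n, p))

# Matrix normal log-density from the trace formula
R = X - M
quad_mn = np.trace(np.linalg.solve(V, R.T) @ np.linalg.solve(U, R))
logdet_mn = n * np.linalg.slogdet(V)[1] + p * np.linalg.slogdet(U)[1]
logpdf_mn = -0.5 * (quad_mn + n * p * np.log(2 * np.pi) + logdet_mn)

# Multivariate normal log-density of vec(X), covariance = Kronecker product of V and U
K = np.kron(V, U)
r = vec(X) - vec(M)
quad_vec = r @ np.linalg.solve(K, r)
logpdf_vec = -0.5 * (quad_vec + n * p * np.log(2 * np.pi) + np.linalg.slogdet(K)[1])

agree = bool(np.isclose(logpdf_mn, logpdf_vec))
```

Agreement at one point is a sanity check rather than a proof; the underlying identities are that the quadratic form in vec(X − M) equals the trace expression and that the determinant of the Kronecker product equals |V|^n |U|^p.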

Example

Consider a sample of n independent p-dimensional random vectors, identically distributed according to a multivariate normal distribution:

\mathbf{Y}_i \sim \mathcal{N}_p({\boldsymbol \mu}, {\boldsymbol \Sigma}) \text{ with } i \in \{1,\ldots,n\}.

If \mathbf{X} is the n × p matrix whose ith row is \mathbf{Y}_i^T, then

\mathbf{X} \sim \mathcal{MN}_{n,p}(\mathbf{M}, \mathbf{U}, \mathbf{V})

where each row of \mathbf{M} is equal to {\boldsymbol \mu}^T, that is, \mathbf{M}=\mathbf{1}_n {\boldsymbol \mu}^T; \mathbf{U} is the n × n identity matrix, because the rows are independent; and \mathbf{V} = {\boldsymbol \Sigma}.
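This example can be illustrated in code. The sketch below assumes NumPy; the seed, the sizes, and the function names mvn_logpdf and matrix_normal_logpdf are illustrative. It stacks n i.i.d. multivariate normal draws into a matrix, builds M, U and V as described, and compares the matrix normal log-density with the sum of the row-wise multivariate normal log-densities, which should agree because the rows are independent.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 4, 3
mu = rng.standard_normal(p)
B = rng.standard_normal((p, p))
Sigma = B @ B.T + p * np.eye(p)  # an arbitrary positive-definite covariance

# Draw n i.i.d. rows Y_i ~ N_p(mu, Sigma) and stack them into the n x p matrix X
X = rng.multivariate_normal(mu, Sigma, size=n)

# Matrix normal parameters from the example
M = np.outer(np.ones(n), mu)  # every row of M equals mu
U = np.eye(n)                 # independent rows
V = Sigma                     # shared column covariance

def mvn_logpdf(y, mean, cov):
    # Log-density of a multivariate normal at the vector y
    r = y - mean
    quad = r @ np.linalg.solve(cov, r)
    return -0.5 * (quad + len(y) * np.log(2 * np.pi) + np.linalg.slogdet(cov)[1])

def matrix_normal_logpdf(X, M, U, V):
    # Log-density of the matrix normal, from the trace formula
    n_rows, n_cols = X.shape
    R = X - M
    quad = np.trace(np.linalg.solve(V, R.T) @ np.linalg.solve(U, R))
    logdet = n_rows * np.linalg.slogdet(V)[1] + n_cols * np.linalg.slogdet(U)[1]
    return -0.5 * (quad + n_rows * n_cols * np.log(2 * np.pi) + logdet)

joint = matrix_normal_logpdf(X, M, U, V)
rowwise = sum(mvn_logpdf(X[i], mu, Sigma) for i in range(n))
```

With U equal to the identity, the trace in the exponent splits into a sum of n quadratic forms, one per row, which is why the two quantities coincide.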

Relation to other distributions

Dawid (1981) discusses the relation of the matrix-valued normal distribution to other distributions, including the Wishart distribution, the inverse Wishart distribution and the matrix t-distribution, but uses notation different from that employed here.
