User:Timothee Flutre/Notebook/Postdoc/2012/01/02
Project name | <html><img src="/images/9/94/Report.png" border="0" /></html> Main project page <html><img src="/images/c/c3/Resultset_previous.png" border="0" /></html>Previous entry<html> </html>Next entry<html><img src="/images/5/5c/Resultset_next.png" border="0" /></html> |
Learn about the multivariate Normal and matrix calculus(Caution, this is my own quick-and-dirty tutorial, see the references at the end for presentations by professional statisticians.)
[math]\displaystyle{ L(\theta) = \mathbb{P}(X/\theta) }[/math] As the observations are independent: [math]\displaystyle{ L(\theta) = \prod_{i=1}^N f(x_i / \theta) }[/math] It is easier to work with the log-likelihood: [math]\displaystyle{ l(\theta) = ln(L(\theta)) = \sum_{i=1}^N ln( f(x_i / \theta) ) }[/math] [math]\displaystyle{ l(\theta) = -\frac{NP}{2} ln(2\pi) - \frac{N}{2}ln(|\Sigma|) - \frac{1}{2} \sum_{i=1}^N (x_i-\mu)^T \Sigma^{-1} (x_i-\mu) }[/math]
[math]\displaystyle{ d(f(u)) = f'(u) du }[/math], eg. useful here: [math]\displaystyle{ d(ln(|\Sigma|)) = |\Sigma|^{-1} d(|\Sigma|) }[/math] [math]\displaystyle{ d(|U|) = |U| tr(U^{-1} dU) }[/math] [math]\displaystyle{ d(U^{-1}) = - U^{-1} (dU) U^{-1} }[/math] ... <TO DO> ...
[math]\displaystyle{ \sum_{i=1}^N (x_i-\mu)^T \Sigma^{-1} (x_i-\mu) = \sum_{i=1}^N tr( (x_i-\mu)^T \Sigma^{-1} (x_i-\mu) ) }[/math] As the trace is invariant under cyclic permutations ([math]\displaystyle{ tr(ABC) = tr(BCA) = tr(CAB) }[/math]): [math]\displaystyle{ \sum_{i=1}^N (x_i-\mu)^T \Sigma^{-1} (x_i-\mu) = \sum_{i=1}^N tr( \Sigma^{-1} (x_i-\mu) (x_i-\mu)^T ) }[/math] The trace is also a linear map ([math]\displaystyle{ tr(A+B) = tr(A) + tr(B) }[/math]): [math]\displaystyle{ \sum_{i=1}^N (x_i-\mu)^T \Sigma^{-1} (x_i-\mu) = tr( \sum_{i=1}^N \Sigma^{-1} (x_i-\mu) (x_i-\mu)^T ) }[/math] And finally: [math]\displaystyle{ \sum_{i=1}^N (x_i-\mu)^T \Sigma^{-1} (x_i-\mu) = tr( \Sigma^{-1} \sum_{i=1}^N (x_i-\mu) (x_i-\mu)^T ) }[/math] As a result: [math]\displaystyle{ l(\theta) = -\frac{NP}{2} ln(2\pi) - \frac{N}{2}ln(|\Sigma|) - \frac{1}{2} tr(\Sigma^{-1} Z) }[/math] with [math]\displaystyle{ Z=\sum_{i=1}^N(x_i-\mu)(x_i-\mu)^T }[/math] We can now write the first differential of the log-likelihood: [math]\displaystyle{ d_{\theta}l(\theta) = - \frac{N}{2} d(ln(|\Sigma|)) - \frac{1}{2} d(tr(\Sigma^{-1} Z)) }[/math] ... <TO DO> ... The first-order conditions are: [math]\displaystyle{ \hat{\Sigma}^{-1} (Z - n\hat{\Sigma}) \hat{\Sigma}^{-1} = 0 }[/math] and [math]\displaystyle{ \hat{\Sigma}^{-1} (\bar{x} - \hat{\mu}) = 0 }[/math] From which follow: [math]\displaystyle{ \hat{\mu} = \bar{x} = \frac{1}{n} \sum_{i=1}^N x_i }[/math] and: [math]\displaystyle{ \hat{\Sigma} = \frac{1}{n} \sum_{i=1}^N (x_i - \bar{x})(x_i - \bar{x})^T }[/math]
|