# User:Timothee Flutre/Notebook/Postdoc/2012/08/16


## Revision as of 14:49, 16 August 2012


## Variational Bayes approach for the mixture of Normals

• Motivation: I have described the basics of mixture models and the EM algorithm in a frequentist context on [another page](http://openwetware.org/wiki/User:Timothee_Flutre/Notebook/Postdoc/2011/12/14). It is worth reading before continuing. Here I am interested in the Bayesian approach as well as in a specific variational method (nicknamed "Variational Bayes").

• Data: we have N univariate observations, $y_1, \ldots, y_N$, gathered into the vector $\mathbf{y}$.

• Assumptions: we assume the observations to be exchangeable and distributed according to a mixture of K Normal distributions. The parameters of this model are the mixture weights ($w_k$), the means ($\mu_k$) and the precisions ($\tau_k$) of the mixture components, all gathered into $\Theta = \{w_1,\ldots,w_K,\mu_1,\ldots,\mu_K,\tau_1,\ldots,\tau_K\}$. There are two constraints: $\sum_{k=1}^K w_k = 1$ and $\forall k \; w_k > 0$.

• Observed likelihood: $p(\mathbf{y} | \Theta, K) = \prod_{n=1}^N p(y_n|\Theta,K) = \prod_{n=1}^N \sum_{k=1}^K w_k Normal(y_n;\mu_k,\tau_k)$
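As a rough illustration of the observed likelihood, here is a minimal NumPy sketch that evaluates its logarithm. The function names `normal_logpdf` and `observed_loglik` are hypothetical (they do not come from this notebook), and the Normal density is assumed to be parameterized by a precision, as in the assumptions stated earlier. The sum over components is done with a log-sum-exp for numerical stability.

```python
import numpy as np

def normal_logpdf(y, mu, tau):
    """Log-density of Normal(y; mu, tau), where tau is a precision (1/variance)."""
    mu = np.asarray(mu, dtype=float)
    tau = np.asarray(tau, dtype=float)
    return 0.5 * (np.log(tau) - np.log(2.0 * np.pi)) - 0.5 * tau * (y - mu) ** 2

def observed_loglik(y, w, mu, tau):
    """log p(y | Theta, K) = sum_n log sum_k w_k Normal(y_n; mu_k, tau_k)."""
    y = np.asarray(y, dtype=float)[:, None]           # shape (N, 1)
    w = np.asarray(w, dtype=float)                    # shape (K,)
    log_terms = np.log(w) + normal_logpdf(y, mu, tau) # shape (N, K)
    # log-sum-exp over the K components, row by row
    m = log_terms.max(axis=1, keepdims=True)
    return float(np.sum(m[:, 0] + np.log(np.exp(log_terms - m).sum(axis=1))))

# Illustrative values only (not taken from the notebook)
ll = observed_loglik(y=[-1.2, 0.3, 4.1], w=[0.4, 0.6], mu=[0.0, 4.0], tau=[1.0, 0.5])
```

Working on the log scale avoids underflow when N is large, since the product over observations becomes a sum.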

• Latent variables: let's introduce N latent variables, $z_1,\ldots,z_N$, gathered into the vector $\mathbf{z}$. Each $z_n$ is a vector of length K with a single 1 indicating the component to which the $n^{th}$ observation belongs, and K-1 zeroes.
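To make the encoding of the latent variables concrete, the sketch below builds the N x K indicator matrix from component labels. The helper name `one_hot` is hypothetical, and the labels are assumed to be 0-based (0 to K-1) to match Python indexing, whereas the text numbers components from 1 to K.

```python
import numpy as np

def one_hot(labels, K):
    """Return an (N, K) matrix whose n-th row has a single 1 at column labels[n]."""
    labels = np.asarray(labels)
    z = np.zeros((labels.size, K), dtype=int)
    z[np.arange(labels.size), labels] = 1
    return z

# Four observations assigned to components 0, 2, 1 and 2 (0-based labels)
z = one_hot([0, 2, 1, 2], K=3)
```

Each row of `z` then plays the role of one $z_n$, and the column sums give the number of observations assigned to each component.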

• Augmented likelihood: $p(\mathbf{y},\mathbf{z}|\Theta,K) = \prod_{n=1}^N p(y_n,z_n|\Theta,K) = \prod_{n=1}^N p(z_n|\Theta,K) p(y_n|z_n,\Theta,K) = \prod_{n=1}^N \prod_{k=1}^K w_k^{z_{nk}} Normal(y_n;\mu_k,\tau_k)^{z_{nk}}$
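On the log scale the augmented likelihood becomes a double sum, $\sum_n \sum_k z_{nk} (\ln w_k + \ln Normal(y_n;\mu_k,\tau_k))$, in which the indicators simply select one term per observation. A minimal sketch, assuming hypothetical function names and a precision-parameterized Normal density:

```python
import numpy as np

def normal_logpdf(y, mu, tau):
    """Log-density of Normal(y; mu, tau), where tau is a precision (1/variance)."""
    mu = np.asarray(mu, dtype=float)
    tau = np.asarray(tau, dtype=float)
    return 0.5 * (np.log(tau) - np.log(2.0 * np.pi)) - 0.5 * tau * (y - mu) ** 2

def augmented_loglik(y, z, w, mu, tau):
    """log p(y, z | Theta, K) = sum_n sum_k z_nk * (log w_k + log Normal(y_n; mu_k, tau_k))."""
    y = np.asarray(y, dtype=float)[:, None]            # shape (N, 1)
    z = np.asarray(z, dtype=float)                     # shape (N, K), one-hot rows
    w = np.asarray(w, dtype=float)                     # shape (K,)
    log_terms = np.log(w) + normal_logpdf(y, mu, tau)  # shape (N, K)
    return float(np.sum(z * log_terms))

# Illustrative values only: first observation in component 0, second in component 1
al = augmented_loglik(y=[0.0, 1.0], z=[[1, 0], [0, 1]],
                      w=[0.5, 0.5], mu=[0.0, 1.0], tau=[1.0, 1.0])
```

Unlike the observed likelihood, no sum appears inside a logarithm here, which is what makes the augmented form convenient for EM-style and variational updates.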

• Priors: we choose conjugate ones
  ◦ for the parameters: $\forall k \; \mu_k \sim Normal(\mu_0,\tau_0)$ and $\forall k \; \tau_k \sim Gamma(\alpha,\beta)$
  ◦ for the latent variables: $\forall n \; z_n \sim Multinomial_K(1,\mathbf{w})$ and $\mathbf{w} \sim Dirichlet(\gamma)$
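One way to check one's understanding of these priors is to draw from them and then simulate observations from the resulting mixture. The sketch below does this under several assumptions that the entry does not state: $\beta$ is taken to be a rate parameter (so NumPy's `scale` is $1/\beta$), $\tau_0$ is a precision, and $\gamma$ is a single scalar shared by all K components of a symmetric Dirichlet. The function name and the hyperparameter values are illustrative only.

```python
import numpy as np

def sample_from_priors(N, K, mu0, tau0, alpha, beta, gamma, rng):
    """Draw (w, mu, tau, z) from the priors, then y from the mixture."""
    w = rng.dirichlet(np.full(K, gamma))                         # w ~ Dirichlet(gamma)
    mu = rng.normal(loc=mu0, scale=1.0 / np.sqrt(tau0), size=K)  # mu_k ~ Normal(mu0, tau0)
    tau = rng.gamma(shape=alpha, scale=1.0 / beta, size=K)       # tau_k ~ Gamma(alpha, beta), beta as rate (assumed)
    z = rng.multinomial(1, w, size=N)                            # z_n ~ Multinomial_K(1, w), shape (N, K)
    comp = z.argmax(axis=1)                                      # component index of each observation
    y = rng.normal(loc=mu[comp], scale=1.0 / np.sqrt(tau[comp])) # y_n | z_n ~ Normal(mu_k, tau_k)
    return w, mu, tau, z, y

rng = np.random.default_rng(0)
w, mu, tau, z, y = sample_from_priors(N=100, K=3, mu0=0.0, tau0=0.1,
                                      alpha=2.0, beta=1.0, gamma=1.0, rng=rng)
```

A small $\tau_0$ spreads the component means widely, which tends to produce visibly separated clusters in the simulated `y`.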