<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/css" href="http://www.openwetware.org/skins/common/feed.css?164"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
	<channel>
		<title>User:Timothee Flutre/Notebook/Postdoc/2012/08/16 - Revision history</title>
		<link>http://www.openwetware.org/index.php?title=User:Timothee_Flutre/Notebook/Postdoc/2012/08/16&amp;action=history</link>
		<description>Revision history for this page on the wiki</description>
		<language>en</language>
		<generator>MediaWiki 1.13.2</generator>
		<lastBuildDate>Wed, 22 May 2013 13:58:54 GMT</lastBuildDate>
		<item>
			<title>Timothee Flutre: /* Variational Bayes approach for the mixture of Normals */ add part of updates for q_z</title>
			<link>http://www.openwetware.org/index.php?title=User:Timothee_Flutre/Notebook/Postdoc/2012/08/16&amp;diff=623555&amp;oldid=prev</link>
			<description>&lt;p&gt;&lt;span class=&quot;autocomment&quot;&gt;Variational Bayes approach for the mixture of Normals:&amp;#32;&lt;/span&gt; add part of updates for q_z&lt;/p&gt;

			&lt;table style=&quot;background-color: white; color:black;&quot;&gt;
			&lt;col class='diff-marker' /&gt;
			&lt;col class='diff-content' /&gt;
			&lt;col class='diff-marker' /&gt;
			&lt;col class='diff-content' /&gt;
			&lt;tr valign='top'&gt;
				&lt;td colspan='2' style=&quot;background-color: white; color:black;&quot;&gt;←Older revision&lt;/td&gt;
				&lt;td colspan='2' style=&quot;background-color: white; color:black;&quot;&gt;Revision as of 22:25, 4 September 2012&lt;/td&gt;
			&lt;/tr&gt;
		&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 79:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 79:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''Updates for &amp;lt;math&amp;gt;q_\mathbf{z}&amp;lt;/math&amp;gt;''':&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''Updates for &amp;lt;math&amp;gt;q_\mathbf{z}&amp;lt;/math&amp;gt;''':&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;We start by writing the functional derivative of &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; with respect to &amp;lt;math&amp;gt;q_{\mathbf{z}}&amp;lt;/math&amp;gt;:&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;math&amp;gt;\frac{\partial \mathcal{F}}{\partial q_{\mathbf{z}}} = \int_\Theta \, \mathrm{d}\Theta \; q_\Theta(\Theta) \; \left[ \frac{\partial}{\partial q_{\mathbf{z}}} \left( \int_\mathbf{z} \, \mathrm{d}\mathbf{z} \; \left( q_{\mathbf{z}}(\mathbf{z}) \mathrm{ln} \, p(\mathbf{y},\mathbf{z}|\Theta,K) - q_{\mathbf{z}}(\mathbf{z}) \mathrm{ln} \, q_{\mathbf{z}}(\mathbf{z}) \right) \right) \right] + C_{\mathbf{z}}&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;math&amp;gt;\frac{\partial \mathcal{F}}{\partial q_{\mathbf{z}}} = \int_\Theta \, \mathrm{d}\Theta \; q_\Theta(\Theta) \; \left[ \mathrm{ln} \, p(\mathbf{y},\mathbf{z}|\Theta,K) - \mathrm{ln} \, q_{\mathbf{z}}(\mathbf{z}) - 1 \right] + C_{\mathbf{z}}&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;Then we set this functional derivative to zero. We also make use of a frequent assumption, namely that the variational distribution fully factorizes over each individual latent variables (mean-field assumption):&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;math&amp;gt;\frac{\partial \mathcal{F}}{\partial q_{\mathbf{z}}} \bigg|_{q_{\mathbf{z}}^{(t+1)}} = 0 \Longleftrightarrow \forall \, n \; \mathrm{ln} \, q_{z_n}^{(t+1)}(z_n) = \left( \int_\Theta \, \mathrm{d}\Theta \; q_\Theta(\Theta) \; \mathrm{ln} \, p(\mathbf{y},\mathbf{z}|\Theta,K) \right) - 1 + C_{z_n}&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;TODO&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;TODO&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;!-- diff generator: internal 2013-05-22 13:58:54 --&gt;
&lt;/table&gt;</description>
			<pubDate>Tue, 04 Sep 2012 22:25:08 GMT</pubDate>			<dc:creator>Timothee Flutre</dc:creator>			<comments>http://www.openwetware.org/wiki/User_talk:Timothee_Flutre/Notebook/Postdoc/2012/08/16</comments>		</item>
		<item>
			<title>Timothee Flutre: /* Variational Bayes approach for the mixture of Normals */ add missing term</title>
			<link>http://www.openwetware.org/index.php?title=User:Timothee_Flutre/Notebook/Postdoc/2012/08/16&amp;diff=623543&amp;oldid=prev</link>
			<description>&lt;p&gt;&lt;span class=&quot;autocomment&quot;&gt;Variational Bayes approach for the mixture of Normals:&amp;#32;&lt;/span&gt; add missing term&lt;/p&gt;

			&lt;table style=&quot;background-color: white; color:black;&quot;&gt;
			&lt;col class='diff-marker' /&gt;
			&lt;col class='diff-content' /&gt;
			&lt;col class='diff-marker' /&gt;
			&lt;col class='diff-content' /&gt;
			&lt;tr valign='top'&gt;
				&lt;td colspan='2' style=&quot;background-color: white; color:black;&quot;&gt;←Older revision&lt;/td&gt;
				&lt;td colspan='2' style=&quot;background-color: white; color:black;&quot;&gt;Revision as of 21:53, 4 September 2012&lt;/td&gt;
			&lt;/tr&gt;
		&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 73:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 73:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;As we ultimately aim at inferring the parameters and latent variables that maximize the marginal log-likelihood, we will use the [http://en.wikipedia.org/wiki/Calculus_of_variations calculus of variations] to find the functions &amp;lt;math&amp;gt;q_\mathbf{z}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;q_\Theta&amp;lt;/math&amp;gt; that maximize the functional &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;As we ultimately aim at inferring the parameters and latent variables that maximize the marginal log-likelihood, we will use the [http://en.wikipedia.org/wiki/Calculus_of_variations calculus of variations] to find the functions &amp;lt;math&amp;gt;q_\mathbf{z}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;q_\Theta&amp;lt;/math&amp;gt; that maximize the functional &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;lt;math&amp;gt;\mathcal{F}(q_\mathbf{z}, q_\Theta) = \int_\Theta \, \mathrm{d}\Theta \; \left( \int_\mathbf{z} \, \mathrm{d}\mathbf{z} \; q(\mathbf{z}) \; \mathrm{ln} \, \frac{p(\mathbf{y}, \mathbf{z} | \Theta, K)}{q(\mathbf{z})} + \mathrm{ln} \, \frac{p(\Theta | K)}{q(\Theta)} \right) + C_{\mathbf{z}} + C_{\Theta}&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;lt;math&amp;gt;\mathcal{F}(q_\mathbf{z}, q_\Theta) = \int_\Theta \, \mathrm{d}\Theta &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;\; q_\Theta(\Theta) &lt;/ins&gt;\; \left( \int_\mathbf{z} \, \mathrm{d}\mathbf{z} \; q(\mathbf{z}) \; \mathrm{ln} \, \frac{p(\mathbf{y}, \mathbf{z} | \Theta, K)}{q(\mathbf{z})} + \mathrm{ln} \, \frac{p(\Theta | K)}{q(\Theta)} \right) + C_{\mathbf{z}} + C_{\Theta}&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;This naturally leads to a procedure very similar to the EM algorithm where, at the E step, we calculate the expectations of the parameters with respect to the variational distributions &amp;lt;math&amp;gt;q_\mathbf{z}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;q_\Theta&amp;lt;/math&amp;gt;, and, at the M step, we recompute the variational distributions over the parameters.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;This naturally leads to a procedure very similar to the EM algorithm where, at the E step, we calculate the expectations of the parameters with respect to the variational distributions &amp;lt;math&amp;gt;q_\mathbf{z}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;q_\Theta&amp;lt;/math&amp;gt;, and, at the M step, we recompute the variational distributions over the parameters.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;!-- diff generator: internal 2013-05-22 13:58:54 --&gt;
&lt;/table&gt;</description>
			<pubDate>Tue, 04 Sep 2012 21:53:52 GMT</pubDate>			<dc:creator>Timothee Flutre</dc:creator>			<comments>http://www.openwetware.org/wiki/User_talk:Timothee_Flutre/Notebook/Postdoc/2012/08/16</comments>		</item>
		<item>
			<title>Timothee Flutre: /* Variational Bayes approach for the mixture of Normals */ shorten + add principle variational bayes</title>
			<link>http://www.openwetware.org/index.php?title=User:Timothee_Flutre/Notebook/Postdoc/2012/08/16&amp;diff=623126&amp;oldid=prev</link>
			<description>&lt;p&gt;&lt;span class=&quot;autocomment&quot;&gt;Variational Bayes approach for the mixture of Normals:&amp;#32;&lt;/span&gt; shorten + add principle variational bayes&lt;/p&gt;

			&lt;table style=&quot;background-color: white; color:black;&quot;&gt;
			&lt;col class='diff-marker' /&gt;
			&lt;col class='diff-content' /&gt;
			&lt;col class='diff-marker' /&gt;
			&lt;col class='diff-content' /&gt;
			&lt;tr valign='top'&gt;
				&lt;td colspan='2' style=&quot;background-color: white; color:black;&quot;&gt;←Older revision&lt;/td&gt;
				&lt;td colspan='2' style=&quot;background-color: white; color:black;&quot;&gt;Revision as of 23:11, 2 September 2012&lt;/td&gt;
			&lt;/tr&gt;
		&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 11:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 11:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''Data''': &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;we have &lt;/del&gt;N univariate observations, &amp;lt;math&amp;gt;y_1, \ldots, y_N&amp;lt;/math&amp;gt;, gathered into the vector &amp;lt;math&amp;gt;\mathbf{y}&amp;lt;/math&amp;gt;&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;.&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''Data''': N univariate observations, &amp;lt;math&amp;gt;y_1, \ldots, y_N&amp;lt;/math&amp;gt;, gathered into the vector &amp;lt;math&amp;gt;\mathbf{y}&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;* '''Model''': mixture of K Normal distributions&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;Assumptions&lt;/del&gt;''': &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;we assume the observations to be exchangeable and distributed according to a mixture of &lt;/del&gt;K &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;Normal distributions. The parameters of this model are the &lt;/del&gt;mixture weights (&amp;lt;math&amp;gt;w_k&amp;lt;/math&amp;gt;), &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;the &lt;/del&gt;means (&amp;lt;math&amp;gt;\mu_k&amp;lt;/math&amp;gt;) and &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;the &lt;/del&gt;[http://en.wikipedia.org/wiki/Precision_%28statistics%29 precisions] (&amp;lt;math&amp;gt;\tau_k&amp;lt;/math&amp;gt;) &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;of each &lt;/del&gt;mixture &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;components, all gathered into &amp;lt;math&amp;gt;\Theta = \{w_1,\ldots,w_K,\mu_1,\ldots,\mu_K,\tau_1,\ldots,\tau_K\}&amp;lt;/math&amp;gt;. There are two constraints: &amp;lt;math&amp;gt;\sum_{k=1}^K w_k = 1&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\forall k \; w_k &amp;gt; 0&amp;lt;/math&amp;gt;.&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;Parameters&lt;/ins&gt;''': K mixture weights (&amp;lt;math&amp;gt;w_k&amp;lt;/math&amp;gt;), &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;K &lt;/ins&gt;means (&amp;lt;math&amp;gt;\mu_k&amp;lt;/math&amp;gt;) and &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;K &lt;/ins&gt;[http://en.wikipedia.org/wiki/Precision_%28statistics%29 precisions] (&amp;lt;math&amp;gt;\tau_k&amp;lt;/math&amp;gt;)&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;, one per &lt;/ins&gt;mixture &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;component&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;math&amp;gt;\Theta = \{w_1,\ldots,w_K,\mu_1,\ldots,\mu_K,\tau_1,\ldots,\tau_K\}&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;Observed likelihood&lt;/del&gt;''': &amp;lt;math&amp;gt;&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;p(&lt;/del&gt;\&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;mathbf&lt;/del&gt;{&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;y} | \Theta, K) = \prod_{n&lt;/del&gt;=1}^&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;N p(y_n|\Theta,&lt;/del&gt;K&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;) = \prod_{n&lt;/del&gt;=1&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;}^N &lt;/del&gt;\&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;sum_{&lt;/del&gt;k&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;=1}^K &lt;/del&gt;w_k &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;Normal(y_n;\mu_k,\tau_k^{-1})&lt;/del&gt;&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;Constraints&lt;/ins&gt;''': &amp;lt;math&amp;gt;\&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;sum_&lt;/ins&gt;{&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;k&lt;/ins&gt;=1}^K &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;w_k &lt;/ins&gt;= 1&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;&lt;/ins&gt;\&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;forall &lt;/ins&gt;k &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;\; &lt;/ins&gt;w_k &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&amp;gt; 0&lt;/ins&gt;&amp;lt;/math&amp;gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;* '''Observed likelihood''':&amp;nbsp; observations assumed exchangeable (independent and identically distributed given the parameters)&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;* '''Maximizing the observed log-likelihood''': as shown [http://openwetware.org/wiki/User:Timothee_Flutre/Notebook/Postdoc/2011/12/14 here]&lt;/del&gt;, &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;maximizing the likelihood of a mixture model is like doing a weighted likelihood maximization. However&lt;/del&gt;, &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;these weights depend on the parameters we want to estimate! That's why we now switch to the missing&lt;/del&gt;-&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;data formulation of the mixture model.&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&amp;lt;math&amp;gt;p(\mathbf{y} | \Theta&lt;/ins&gt;, &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;K) = \prod_{n=1}^N p(y_n|\Theta&lt;/ins&gt;,&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;K) = \prod_{n=1}^N \sum_{k=1}^K w_k Normal(y_n;\mu_k,\tau_k^{&lt;/ins&gt;-&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;1})&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;* '''Latent variables''': N hidden variables, &amp;lt;math&amp;gt;z_1,\ldots,z_N&amp;lt;/math&amp;gt;, each being a vector of length K with a single 1 indicating the component to which the &amp;lt;math&amp;gt;n^{th}&amp;lt;/math&amp;gt; observation belongs, and K-1 zeroes.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;Latent variables&lt;/del&gt;''': &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;let's introduce N latent variables, &amp;lt;math&amp;gt;z_1,\ldots,z_N&amp;lt;/math&amp;gt;, gathered into the vector &amp;lt;math&amp;gt;\mathbf{z}&amp;lt;/math&amp;gt;. Each &amp;lt;math&amp;gt;z_n&amp;lt;/math&amp;gt; is a vector of length K with a single 1 indicating the component to which the &amp;lt;math&amp;gt;n^{th}&amp;lt;/math&amp;gt; observation belongs, and K-1 zeroes. &lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;Augmented likelihood&lt;/ins&gt;''':&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;math&amp;gt;p(\mathbf{y},\mathbf{z}|\Theta,K) = \prod_{n=1}^N p(y_n,z_n|\Theta,K) = \prod_{n=1}^N p(z_n|\Theta,K) p(y_n|z_n,\Theta,K) = \prod_{n=1}^N \prod_{k=1}^K w_k^{z_{nk}} Normal(y_n;\mu_k,\tau_k^{-1})^{z_{nk}}&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;Augmented &lt;/del&gt;likelihood''': &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;&amp;lt;math&amp;gt;p(\mathbf{y},\mathbf{z}|\Theta,K) = \prod_{n=1}^N p(y_n,z_n|\Theta,K) = \prod_{n=1}^N p(z_n|\Theta,K) p(y_n|z_n,\Theta,K) = \prod_{n=1}^N \prod_{k=1}^K w_k^{z_{nk}} Normal(y_n;\mu_k,\tau_k^{-1})^{z_{nk}}&amp;lt;/math&amp;gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;Maximum-&lt;/ins&gt;likelihood &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;estimation&lt;/ins&gt;''': &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;integrate out the latent variables&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;math&amp;gt;\mathrm{ln} \, p(\mathbf{y} | \Theta, K) = \sum_{n=1}^N \mathrm{ln} \, \int_{z_n} \mathrm{d}{z_n} \; p(y_n, z_n | \Theta, K)&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;* '''Priors''': in the Bayesian paradigm, parameters and &lt;/del&gt;latent variables &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;are random variables for which we want to infer &lt;/del&gt;the &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;posterior distribution&lt;/del&gt;. &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;To make &lt;/del&gt;the &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;calculations possible, we choose for them prior distributions &lt;/del&gt;that &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;are conjuguate with the form of &lt;/del&gt;the likelihood.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;The &lt;/ins&gt;latent variables &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;induce dependencies between all &lt;/ins&gt;the &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;parameters of the model&lt;/ins&gt;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;** for the &lt;/del&gt;parameters&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;: &amp;lt;math&amp;gt;\forall k \; \mu_k | \tau_k \sim Normal(\mu_0&lt;/del&gt;,&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;(\tau_0 \tau_k)^{&lt;/del&gt;-&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;1})&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\forall k \; \tau_k \sim Gamma(\alpha,\beta)&amp;lt;/math&amp;gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;This makes it difficult to find &lt;/ins&gt;the &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;parameters &lt;/ins&gt;that &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;maximize &lt;/ins&gt;the likelihood.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;** for &lt;/del&gt;the &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;latent variables: &amp;lt;math&amp;gt;\forall n \; z_n \sim Multinomial_K(1,\mathbf{w})&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{w} \sim Dirichlet(\gamma)&amp;lt;/math&amp;gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;An elegant solution is to introduce a variational distribution of &lt;/ins&gt;parameters &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;and latent variables&lt;/ins&gt;, &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;which leads to a re&lt;/ins&gt;-&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;formulation of the classical EM algorithm.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;But let's show it directly in &lt;/ins&gt;the &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;Bayesian paradigm.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;* '''Priors''': conjuguate&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;** &amp;lt;math&amp;gt;\forall k \; \mu_k | \tau_k \sim Normal(\mu_0,(\tau_0 \tau_k)^{-1})&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\forall k \; \tau_k \sim Gamma(\alpha,\beta)&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;** &amp;lt;math&amp;gt;\forall n \; z_n \sim Multinomial_K(1,\mathbf{w})&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{w} \sim Dirichlet(\gamma)&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''Variational Bayes''': &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;our primary goal &lt;/del&gt;here &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;is to calculate &lt;/del&gt;the marginal log-likelihood of our data set:&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''Variational Bayes''': &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;let's focus &lt;/ins&gt;here &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;on calculating &lt;/ins&gt;the marginal log-likelihood of our data set &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;in order to perform model comparison&lt;/ins&gt;:&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;lt;math&amp;gt;\mathrm{ln} \, p(\mathbf{y} | K) = \mathrm{ln} \, \int_\mathbf{z} \int_\Theta \, \mathrm{d}\mathbf{z} \, \mathrm{d}\Theta \; p(\mathbf{y}, \mathbf{z}, \Theta | K)&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;lt;math&amp;gt;\mathrm{ln} \, p(\mathbf{y} | K) = \mathrm{ln} \, \int_\mathbf{z} \int_\Theta \, \mathrm{d}\mathbf{z} \, \mathrm{d}\Theta \; p(\mathbf{y}, \mathbf{z}, \Theta | K)&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;del style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;del style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;However the fact that there are latent variables induce dependencies between all the parameters of the model.&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;del style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;This makes it difficult to find the parameters that maximize the marginal likelihood.&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;del style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;An elegant solution is to introduce a &amp;quot;variational distribution&amp;quot; &amp;lt;math&amp;gt;q&amp;lt;/math&amp;gt; of the parameters and the latent variables &lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;lt;math&amp;gt;\mathrm{ln} \, p(\mathbf{y} | K) = \mathrm{ln} \, \left( \int_\mathbf{z} \int_\Theta \, \mathrm{d}\mathbf{z} \, \mathrm{d}\Theta \; q(\mathbf{z}, \Theta) \; \frac{p(\mathbf{y}, \mathbf{z}, \Theta | K)}{q(\mathbf{z}, \Theta)} \right) + C_{\mathbf{z}, \Theta}&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;lt;math&amp;gt;\mathrm{ln} \, p(\mathbf{y} | K) = \mathrm{ln} \, \left( \int_\mathbf{z} \int_\Theta \, \mathrm{d}\mathbf{z} \, \mathrm{d}\Theta \; q(\mathbf{z}, \Theta) \; \frac{p(\mathbf{y}, \mathbf{z}, \Theta | K)}{q(\mathbf{z}, \Theta)} \right) + C_{\mathbf{z}, \Theta}&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 46:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 53:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;The constant &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; is here to remind us that &amp;lt;math&amp;gt;q&amp;lt;/math&amp;gt; has the constraint of being a distribution, ie. of summing to 1, which can be enforced by a Lagrange multiplier.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;The constant &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; is here to remind us that &amp;lt;math&amp;gt;q&amp;lt;/math&amp;gt; has the constraint of being a distribution, ie. of summing to 1, which can be enforced by a Lagrange multiplier.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;The &lt;/del&gt;'''&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;crucial assumption&lt;/del&gt;''' is to &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;assume &lt;/del&gt;the &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;independence &lt;/del&gt;of the &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;parameters &lt;/del&gt;and the latent variables:&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;We can then use the concavity of the logarithm ([http://en.wikipedia.org/wiki/Jensen%27s_inequality Jensen&lt;/ins&gt;'&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;s inequality]) to derive a lower bound of the marginal log-likelihood:&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&amp;lt;math&amp;gt;\mathrm{ln} \, p(\mathbf{y} | K) \ge \int_\mathbf{z} \int_\Theta \, \mathrm{d}\mathbf{z} \, \mathrm{d}\Theta \; q(\mathbf{z}, \Theta) \; \mathrm{ln} \, \frac{p(\mathbf{y}, \mathbf{z}, \Theta | K)}{q(\mathbf{z}, \Theta)} + C_{\mathbf{z}, \Theta} = \mathcal{F}(q)&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;Let&lt;/ins&gt;'&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;s call this lower bound &amp;lt;math&amp;gt;\mathcal{F}(q)&amp;lt;/math&amp;gt; as it is a [http://en.wikipedia.org/wiki/Functional_%28mathematics%29 functional], ie. a &lt;/ins&gt;''&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;function of functions&lt;/ins&gt;''&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;. To gain some intuition about the impact of introducing &amp;lt;math&amp;gt;q&amp;lt;/math&amp;gt;, let's expand &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt;:&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&amp;lt;math&amp;gt;\mathcal{F}(q) = \int_\mathbf{z} \int_\Theta \, \mathrm{d}\mathbf{z} \, \mathrm{d}\Theta \; q(\mathbf{z}, \Theta) \; \mathrm{ln} \, p(\mathbf{y} | \mathbf{z}, \Theta, K) + \int_\mathbf{z} \int_\Theta \, \mathrm{d}\mathbf{z} \, \mathrm{d}\Theta \; q(\mathbf{z}, \Theta) \; \mathrm{ln} \, \frac{p(\mathbf{z}, \Theta | K)}{q(\mathbf{z}, \Theta)} + C_{\mathbf{z}, \Theta}&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&amp;lt;math&amp;gt;\mathcal{F}(q) = \mathrm{ln} \, p(\mathbf{y} | \mathbf{z}, \Theta, K) - D_{KL}(q || p)&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;From this, it &lt;/ins&gt;is &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;clear that &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; (ie. a lower-bound of the marginal log-likelihood) is the conditional log-likelihood minus the [http://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence Kullback-Leibler divergence] between the variational distribution &amp;lt;math&amp;gt;q&amp;lt;/math&amp;gt; and the joint posterior of latent variables and parameters.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;In practice, we have &lt;/ins&gt;to &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;make &lt;/ins&gt;the &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;following crucial assumption &lt;/ins&gt;of &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;independence on &amp;lt;math&amp;gt;q&amp;lt;/math&amp;gt; in order for &lt;/ins&gt;the &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;calculations to be analytically tractable:&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&amp;lt;math&amp;gt;q(\mathbf{z}, \Theta) = q_{\mathbf{z}}(\mathbf{z}) q_\Theta(\Theta)&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;This means that &amp;lt;math&amp;gt;q_\mathbf{z} q_\Theta&amp;lt;/math&amp;gt; approximates the joint posterior, &lt;/ins&gt;and &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;therefore &lt;/ins&gt;the &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;lower-bound will be tight if and only if this approximation is exact and the KL divergence is zero.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;As we ultimately aim at inferring the parameters and &lt;/ins&gt;latent variables &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;that maximize the marginal log-likelihood, we will use the [http://en.wikipedia.org/wiki/Calculus_of_variations calculus of variations] to find the functions &amp;lt;math&amp;gt;q_\mathbf{z}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;q_\Theta&amp;lt;/math&amp;gt; that maximize the functional &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt;.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&amp;lt;math&amp;gt;\mathcal{F}(q_\mathbf{z}, q_\Theta) = \int_\Theta \, \mathrm{d}\Theta \; \left( \int_\mathbf{z} \, \mathrm{d}\mathbf{z} \; q(\mathbf{z}) \; \mathrm{ln} \, \frac{p(\mathbf{y}, \mathbf{z} | \Theta, K)}{q(\mathbf{z})} + \mathrm{ln} \, \frac{p(\Theta | K)}{q(\Theta)} \right) + C_{\mathbf{z}} + C_{\Theta}&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;This naturally leads to a procedure very similar to the EM algorithm where, at the E step, we calculate the expectations of the parameters with respect to the variational distributions &amp;lt;math&amp;gt;q_\mathbf{z}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;q_\Theta&amp;lt;/math&amp;gt;, and, at the M step, we recompute the variational distributions over the parameters.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;* '''Updates for &amp;lt;math&amp;gt;q_\mathbf{z}&amp;lt;/math&amp;gt;'''&lt;/ins&gt;:&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;&amp;lt;math&amp;gt;q(\mathbf{z}, \Theta) = q(\mathbf{z}) q(\Theta)&amp;lt;/math&amp;gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;TODO&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;del style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;We can then use the concavity of the logarithm and Jensen's inequality to optimize a lower bound of the marginal log-likelihood:&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;lt;math&amp;gt;&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;\mathrm{ln} \, p(\mathbf{y} | K) \ge \int_\Theta \, \mathrm{d}\Theta \; \left( \int_\mathbf{z} \, \mathrm{d}\mathbf{z} \; q(\mathbf{z}) \; \mathrm{ln} \, \frac{p(\mathbf{y}, \mathbf{z} | \Theta, K)}{q(\mathbf{z})} + \mathrm{ln} \, \frac{p(\Theta | K)}{q(\Theta)} \right) + C_{\mathbf{z}} + C_{&lt;/del&gt;\Theta&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;}&lt;/del&gt;&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;* '''Updates for &lt;/ins&gt;&amp;lt;math&amp;gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;q_&lt;/ins&gt;\Theta&amp;lt;/math&amp;gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;''':&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;Now we have to optimize the right-hand side of the inequality. Let's name it &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; as it is a [http://en.wikipedia.org/wiki/Functional_%28mathematics%29 functional], ie. a ''function of functions''. Using the [http://en.wikipedia.org/wiki/Calculus_of_variations calculus of variations], we'll find the function &amp;lt;math&amp;gt;q&amp;lt;/math&amp;gt; that maximizes it.&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;TODO&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;lt;!-- ##### DO NOT edit below this line unless you know what you are doing. ##### --&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;lt;!-- ##### DO NOT edit below this line unless you know what you are doing. ##### --&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;!-- diff generator: internal 2013-05-22 13:58:54 --&gt;
&lt;/table&gt;</description>
			<pubDate>Sun, 02 Sep 2012 23:11:46 GMT</pubDate>			<dc:creator>Timothee Flutre</dc:creator>			<comments>http://www.openwetware.org/wiki/User_talk:Timothee_Flutre/Notebook/Postdoc/2012/08/16</comments>		</item>
		<item>
			<title>Timothee Flutre: /* Variational Bayes approach for the mixture of Normals */ add principle of Variational Bayes</title>
			<link>http://www.openwetware.org/index.php?title=User:Timothee_Flutre/Notebook/Postdoc/2012/08/16&amp;diff=621837&amp;oldid=prev</link>
			<description>&lt;p&gt;&lt;span class=&quot;autocomment&quot;&gt;Variational Bayes approach for the mixture of Normals:&amp;#32;&lt;/span&gt; add principle of Variational Bayes&lt;/p&gt;

			&lt;table style=&quot;background-color: white; color:black;&quot;&gt;
			&lt;col class='diff-marker' /&gt;
			&lt;col class='diff-content' /&gt;
			&lt;col class='diff-marker' /&gt;
			&lt;col class='diff-content' /&gt;
			&lt;tr valign='top'&gt;
				&lt;td colspan='2' style=&quot;background-color: white; color:black;&quot;&gt;←Older revision&lt;/td&gt;
				&lt;td colspan='2' style=&quot;background-color: white; color:black;&quot;&gt;Revision as of 19:29, 31 August 2012&lt;/td&gt;
			&lt;/tr&gt;
		&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 18:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 18:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''Observed likelihood''': &amp;lt;math&amp;gt;p(\mathbf{y} | \Theta, K) = \prod_{n=1}^N p(y_n|\Theta,K) = \prod_{n=1}^N \sum_{k=1}^K w_k Normal(y_n;\mu_k,\tau_k^{-1})&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''Observed likelihood''': &amp;lt;math&amp;gt;p(\mathbf{y} | \Theta, K) = \prod_{n=1}^N p(y_n|\Theta,K) = \prod_{n=1}^N \sum_{k=1}^K w_k Normal(y_n;\mu_k,\tau_k^{-1})&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;* '''Maximizing the observed log-likelihood''': as shown [http://openwetware.org/wiki/User:Timothee_Flutre/Notebook/Postdoc/2011/12/14 here], maximizing the likelihood of a mixture model is like doing a weighted likelihood maximization. However, these weights depend on the parameters we want to estimate! That's why we now switch to the missing-data formulation of the mixture model.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 26:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 29:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''Priors''': we choose conjuguate &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;ones&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''Priors''': &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;in the Bayesian paradigm, parameters and latent variables are random variables for which we want to infer the posterior distribution. To make the calculations possible, &lt;/ins&gt;we choose &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;for them prior distributions that are &lt;/ins&gt;conjuguate &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;with the form of the likelihood.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;** for the parameters: &amp;lt;math&amp;gt;\forall k \; \mu_k | \tau_k \sim Normal(\mu_0,(\tau_0 \tau_k)^{-1})&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\forall k \; \tau_k \sim Gamma(\alpha,\beta)&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;** for the parameters: &amp;lt;math&amp;gt;\forall k \; \mu_k | \tau_k \sim Normal(\mu_0,(\tau_0 \tau_k)^{-1})&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\forall k \; \tau_k \sim Gamma(\alpha,\beta)&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;** for the latent variables: &amp;lt;math&amp;gt;\forall n \; z_n \sim Multinomial_K(1,\mathbf{w})&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{w} \sim Dirichlet(\gamma)&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;** for the latent variables: &amp;lt;math&amp;gt;\forall n \; z_n \sim Multinomial_K(1,\mathbf{w})&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{w} \sim Dirichlet(\gamma)&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;* '''Variational Bayes''': our primary goal here is to calculate the marginal log-likelihood of our data set:&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;math&amp;gt;\mathrm{ln} \, p(\mathbf{y} | K) = \mathrm{ln} \, \int_\mathbf{z} \int_\Theta \, \mathrm{d}\mathbf{z} \, \mathrm{d}\Theta \; p(\mathbf{y}, \mathbf{z}, \Theta | K)&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;However the fact that there are latent variables induce dependencies between all the parameters of the model.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;This makes it difficult to find the parameters that maximize the marginal likelihood.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;An elegant solution is to introduce a &amp;quot;variational distribution&amp;quot; &amp;lt;math&amp;gt;q&amp;lt;/math&amp;gt; of the parameters and the latent variables &lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;math&amp;gt;\mathrm{ln} \, p(\mathbf{y} | K) = \mathrm{ln} \, \left( \int_\mathbf{z} \int_\Theta \, \mathrm{d}\mathbf{z} \, \mathrm{d}\Theta \; q(\mathbf{z}, \Theta) \; \frac{p(\mathbf{y}, \mathbf{z}, \Theta | K)}{q(\mathbf{z}, \Theta)} \right) + C_{\mathbf{z}, \Theta}&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;The constant &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; is here to remind us that &amp;lt;math&amp;gt;q&amp;lt;/math&amp;gt; has the constraint of being a distribution, ie. of summing to 1, which can be enforced by a Lagrange multiplier.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;The '''crucial assumption''' is to assume the independence of the parameters and the latent variables:&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;math&amp;gt;q(\mathbf{z}, \Theta) = q(\mathbf{z}) q(\Theta)&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;We can then use the concavity of the logarithm and Jensen's inequality to optimize a lower bound of the marginal log-likelihood:&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;math&amp;gt;\mathrm{ln} \, p(\mathbf{y} | K) \ge \int_\Theta \, \mathrm{d}\Theta \; \left( \int_\mathbf{z} \, \mathrm{d}\mathbf{z} \; q(\mathbf{z}) \; \mathrm{ln} \, \frac{p(\mathbf{y}, \mathbf{z} | \Theta, K)}{q(\mathbf{z})} + \mathrm{ln} \, \frac{p(\Theta | K)}{q(\Theta)} \right) + C_{\mathbf{z}} + C_{\Theta}&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;color: red; font-weight: bold; text-decoration: none;&quot;&gt;Now we have to optimize the right-hand side of the inequality. Let's name it &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; as it is a [http://en.wikipedia.org/wiki/Functional_%28mathematics%29 functional], ie. a ''function of functions''. Using the [http://en.wikipedia.org/wiki/Calculus_of_variations calculus of variations], we'll find the function &amp;lt;math&amp;gt;q&amp;lt;/math&amp;gt; that maximizes it.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;lt;!-- ##### DO NOT edit below this line unless you know what you are doing. ##### --&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;lt;!-- ##### DO NOT edit below this line unless you know what you are doing. ##### --&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;!-- diff generator: internal 2013-05-22 13:58:54 --&gt;
&lt;/table&gt;</description>
			<pubDate>Fri, 31 Aug 2012 19:29:55 GMT</pubDate>			<dc:creator>Timothee Flutre</dc:creator>			<comments>http://www.openwetware.org/wiki/User_talk:Timothee_Flutre/Notebook/Postdoc/2012/08/16</comments>		</item>
		<item>
			<title>Timothee Flutre: /* Variational Bayes approach for the mixture of Normals */ fix error prior \mu_k + add link precision</title>
			<link>http://www.openwetware.org/index.php?title=User:Timothee_Flutre/Notebook/Postdoc/2012/08/16&amp;diff=621828&amp;oldid=prev</link>
			<description>&lt;p&gt;&lt;span class=&quot;autocomment&quot;&gt;Variational Bayes approach for the mixture of Normals:&amp;#32;&lt;/span&gt; fix error prior \mu_k + add link precision&lt;/p&gt;

			&lt;table style=&quot;background-color: white; color:black;&quot;&gt;
			&lt;col class='diff-marker' /&gt;
			&lt;col class='diff-content' /&gt;
			&lt;col class='diff-marker' /&gt;
			&lt;col class='diff-content' /&gt;
			&lt;tr valign='top'&gt;
				&lt;td colspan='2' style=&quot;background-color: white; color:black;&quot;&gt;←Older revision&lt;/td&gt;
				&lt;td colspan='2' style=&quot;background-color: white; color:black;&quot;&gt;Revision as of 18:29, 31 August 2012&lt;/td&gt;
			&lt;/tr&gt;
		&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 14:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 14:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''Assumptions''': we assume the observations to be exchangeable and distributed according to a mixture of K Normal distributions. The parameters of this model are the mixture weights (&amp;lt;math&amp;gt;w_k&amp;lt;/math&amp;gt;), the means (&amp;lt;math&amp;gt;\mu_k&amp;lt;/math&amp;gt;) and the precisions (&amp;lt;math&amp;gt;\tau_k&amp;lt;/math&amp;gt;) of each mixture components, all gathered into &amp;lt;math&amp;gt;\Theta = \{w_1,\ldots,w_K,\mu_1,\ldots,\mu_K,\tau_1,\ldots,\tau_K\}&amp;lt;/math&amp;gt;. There are two constraints: &amp;lt;math&amp;gt;\sum_{k=1}^K w_k = 1&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\forall k \; w_k &amp;gt; 0&amp;lt;/math&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''Assumptions''': we assume the observations to be exchangeable and distributed according to a mixture of K Normal distributions. The parameters of this model are the mixture weights (&amp;lt;math&amp;gt;w_k&amp;lt;/math&amp;gt;), the means (&amp;lt;math&amp;gt;\mu_k&amp;lt;/math&amp;gt;) and the &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;[http://en.wikipedia.org/wiki/Precision_%28statistics%29 &lt;/ins&gt;precisions&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;] &lt;/ins&gt;(&amp;lt;math&amp;gt;\tau_k&amp;lt;/math&amp;gt;) of each mixture components, all gathered into &amp;lt;math&amp;gt;\Theta = \{w_1,\ldots,w_K,\mu_1,\ldots,\mu_K,\tau_1,\ldots,\tau_K\}&amp;lt;/math&amp;gt;. There are two constraints: &amp;lt;math&amp;gt;\sum_{k=1}^K w_k = 1&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\forall k \; w_k &amp;gt; 0&amp;lt;/math&amp;gt;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''Observed likelihood''': &amp;lt;math&amp;gt;p(\mathbf{y} | \Theta, K) = \prod_{n=1}^N p(y_n|\Theta,K) = \prod_{n=1}^N \sum_{k=1}^K w_k Normal(y_n;\mu_k,\tau_k)&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''Observed likelihood''': &amp;lt;math&amp;gt;p(\mathbf{y} | \Theta, K) = \prod_{n=1}^N p(y_n|\Theta,K) = \prod_{n=1}^N \sum_{k=1}^K w_k Normal(y_n;\mu_k,\tau_k&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;^{-1}&lt;/ins&gt;)&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 23:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 23:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''Augmented likelihood''': &amp;lt;math&amp;gt;p(\mathbf{y},\mathbf{z}|\Theta,K) = \prod_{n=1}^N p(y_n,z_n|\Theta,K) = \prod_{n=1}^N p(z_n|\Theta,K) p(y_n|z_n,\Theta,K) = \prod_{n=1}^N \prod_{k=1}^K w_k^{z_{nk}} Normal(y_n;\mu_k,\tau_k)^{z_{nk}}&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''Augmented likelihood''': &amp;lt;math&amp;gt;p(\mathbf{y},\mathbf{z}|\Theta,K) = \prod_{n=1}^N p(y_n,z_n|\Theta,K) = \prod_{n=1}^N p(z_n|\Theta,K) p(y_n|z_n,\Theta,K) = \prod_{n=1}^N \prod_{k=1}^K w_k^{z_{nk}} Normal(y_n;\mu_k,\tau_k&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;^{-1}&lt;/ins&gt;)^{z_{nk}}&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''Priors''': we choose conjuguate ones&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* '''Priors''': we choose conjuguate ones&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;** for the parameters: &amp;lt;math&amp;gt;\forall k \; \mu_k \sim Normal(\mu_0,\tau_0)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\forall k \; \tau_k \sim Gamma(\alpha,\beta)&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;** for the parameters: &amp;lt;math&amp;gt;\forall k \; \mu_k &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;| \tau_k &lt;/ins&gt;\sim Normal(\mu_0,&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;(&lt;/ins&gt;\tau_0 &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;\tau_k)^{-1}&lt;/ins&gt;)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\forall k \; \tau_k \sim Gamma(\alpha,\beta)&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;** for the latent variables: &amp;lt;math&amp;gt;\forall n \; z_n \sim Multinomial_K(1,\mathbf{w})&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{w} \sim Dirichlet(\gamma)&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;** for the latent variables: &amp;lt;math&amp;gt;\forall n \; z_n \sim Multinomial_K(1,\mathbf{w})&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{w} \sim Dirichlet(\gamma)&amp;lt;/math&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;!-- diff generator: internal 2013-05-22 13:58:54 --&gt;
&lt;/table&gt;</description>
			<pubDate>Fri, 31 Aug 2012 18:29:45 GMT</pubDate>			<dc:creator>Timothee Flutre</dc:creator>			<comments>http://www.openwetware.org/wiki/User_talk:Timothee_Flutre/Notebook/Postdoc/2012/08/16</comments>		</item>
		<item>
			<title>Timothee Flutre: /* Entry title */ first version</title>
			<link>http://www.openwetware.org/index.php?title=User:Timothee_Flutre/Notebook/Postdoc/2012/08/16&amp;diff=618574&amp;oldid=prev</link>
			<description>&lt;p&gt;&lt;span class=&quot;autocomment&quot;&gt;Entry title:&amp;#32;&lt;/span&gt; first version&lt;/p&gt;

			&lt;table style=&quot;background-color: white; color:black;&quot;&gt;
			&lt;col class='diff-marker' /&gt;
			&lt;col class='diff-content' /&gt;
			&lt;col class='diff-marker' /&gt;
			&lt;col class='diff-content' /&gt;
			&lt;tr valign='top'&gt;
				&lt;td colspan='2' style=&quot;background-color: white; color:black;&quot;&gt;←Older revision&lt;/td&gt;
				&lt;td colspan='2' style=&quot;background-color: white; color:black;&quot;&gt;Revision as of 18:49, 16 August 2012&lt;/td&gt;
			&lt;/tr&gt;
		&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 6:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 6:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;| colspan=&amp;quot;2&amp;quot;|&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;| colspan=&amp;quot;2&amp;quot;|&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;lt;!-- ##### DO NOT edit above this line unless you know what you are doing. ##### --&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;lt;!-- ##### DO NOT edit above this line unless you know what you are doing. ##### --&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;==&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;Entry title&lt;/del&gt;==&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;==&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;Variational Bayes approach for the mixture of Normals&lt;/ins&gt;==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;-&lt;/td&gt;&lt;td style=&quot;background: #ffa; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;Insert content here&lt;/del&gt;...&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;* &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;'''Motivation''': I have described on [http://openwetware&lt;/ins&gt;.&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;org/wiki/User:Timothee_Flutre/Notebook/Postdoc/2011/12/14 another page] the basics of mixture models and the EM algorithm in a frequentist context&lt;/ins&gt;. &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;It is worth reading before continuing&lt;/ins&gt;. &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;Here I am interested in the Bayesian approach as well as in a specific variational method (nicknamed &amp;quot;Variational Bayes&amp;quot;).&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;* '''Data''': we have N univariate observations, &amp;lt;math&amp;gt;y_1, \ldots, y_N&amp;lt;/math&amp;gt;, gathered into the vector &amp;lt;math&amp;gt;\mathbf{y}&amp;lt;/math&amp;gt;.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;* '''Assumptions''': we assume the observations to be exchangeable and distributed according to a mixture of K Normal distributions. The parameters of this model are the mixture weights (&amp;lt;math&amp;gt;w_k&amp;lt;/math&amp;gt;), the means (&amp;lt;math&amp;gt;\mu_k&amp;lt;/math&amp;gt;) and the precisions (&amp;lt;math&amp;gt;\tau_k&amp;lt;/math&amp;gt;) of each mixture components, all gathered into &amp;lt;math&amp;gt;\Theta = \{w_1,\ldots,w_K,\mu_1,\ldots,\mu_K,\tau_1,\ldots,\tau_K\}&amp;lt;/math&amp;gt;. There are two constraints: &amp;lt;math&amp;gt;\sum_{k=1}^K w_k = 1&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\forall k \; w_k &amp;gt; 0&amp;lt;/math&amp;gt;.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;* '''Observed likelihood''': &amp;lt;math&amp;gt;p(\mathbf{y} | \Theta, K) = \prod_{n=1}^N p(y_n|\Theta,K) = \prod_{n=1}^N \sum_{k=1}^K w_k Normal(y_n;\mu_k,\tau_k)&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;* '''Latent variables''': let's introduce N latent variables, &amp;lt;math&amp;gt;z_1,\ldots,z_N&amp;lt;/math&amp;gt;, gathered into the vector &amp;lt;math&amp;gt;\mathbf{z}&amp;lt;/math&amp;gt;. Each &amp;lt;math&amp;gt;z_n&amp;lt;/math&amp;gt; is a vector of length K with a single 1 indicating the component to which the &amp;lt;math&amp;gt;n^{th}&amp;lt;/math&amp;gt; observation belongs, and K-1 zeroes. &lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;* '''Augmented likelihood''': &amp;lt;math&amp;gt;p(\mathbf{y},\mathbf{z}|\Theta,K) = \prod_{n=1}^N p(y_n,z_n|\Theta,K) = \prod_{n=1}^N p(z_n|\Theta,K) p(y_n|z_n,\Theta,K) = \prod_{n=1}^N \prod_{k=1}^K w_k^{z_{nk}} Normal(y_n;\mu_k,\tau_k)^{z_{nk}}&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;* '''Priors''': we choose conjuguate ones&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;** for the parameters: &amp;lt;math&amp;gt;\forall k \; \mu_k \sim Normal(\mu_0,\tau_0)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\forall k \; \tau_k \sim Gamma(\alpha,\beta)&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;nbsp;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;background: #cfc; color:black; font-size: smaller;&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;** for the latent variables: &amp;lt;math&amp;gt;\forall n \; z_n \sim Multinomial_K(1,\mathbf{w})&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{w} \sim Dirichlet(\gamma)&amp;lt;/math&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background: #eee; color:black; font-size: smaller;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;!-- diff generator: internal 2013-05-22 13:58:54 --&gt;
&lt;/table&gt;</description>
			<pubDate>Thu, 16 Aug 2012 18:49:31 GMT</pubDate>			<dc:creator>Timothee Flutre</dc:creator>			<comments>http://www.openwetware.org/wiki/User_talk:Timothee_Flutre/Notebook/Postdoc/2012/08/16</comments>		</item>
		<item>
			<title>Timothee Flutre: Autocreate 2012/08/16 Entry for User:Timothee_Flutre/Notebook/Postdoc</title>
			<link>http://www.openwetware.org/index.php?title=User:Timothee_Flutre/Notebook/Postdoc/2012/08/16&amp;diff=618571&amp;oldid=prev</link>
			<description>&lt;p&gt;Autocreate 2012/08/16 Entry for User:Timothee_Flutre/Notebook/Postdoc&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;{|{{table}} width=&amp;quot;800&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
|style=&amp;quot;background-color: #EEE&amp;quot;|[[Image:owwnotebook_icon.png|128px]]&amp;lt;span style=&amp;quot;font-size:22px;&amp;quot;&amp;gt; Project name&amp;lt;/span&amp;gt;&lt;br /&gt;
|style=&amp;quot;background-color: #F2F2F2&amp;quot; align=&amp;quot;center&amp;quot;|&amp;lt;html&amp;gt;&amp;lt;img src=&amp;quot;/images/9/94/Report.png&amp;quot; border=&amp;quot;0&amp;quot; /&amp;gt;&amp;lt;/html&amp;gt; [[{{#sub:{{FULLPAGENAME}}|0|-11}}|Main project page]]&amp;lt;br /&amp;gt;{{#if:{{#lnpreventry:{{FULLPAGENAME}}}}|&amp;lt;html&amp;gt;&amp;lt;img src=&amp;quot;/images/c/c3/Resultset_previous.png&amp;quot; border=&amp;quot;0&amp;quot; /&amp;gt;&amp;lt;/html&amp;gt;[[{{#lnpreventry:{{FULLPAGENAME}}}}{{!}}Previous entry]]&amp;lt;html&amp;gt;&amp;amp;nbsp;&amp;amp;nbsp;&amp;amp;nbsp;&amp;amp;nbsp;&amp;amp;nbsp;&amp;amp;nbsp;&amp;lt;/html&amp;gt;}}{{#if:{{#lnnextentry:{{FULLPAGENAME}}}}|[[{{#lnnextentry:{{FULLPAGENAME}}}}{{!}}Next entry]]&amp;lt;html&amp;gt;&amp;lt;img src=&amp;quot;/images/5/5c/Resultset_next.png&amp;quot; border=&amp;quot;0&amp;quot; /&amp;gt;&amp;lt;/html&amp;gt;}}&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot;|&lt;br /&gt;
&amp;lt;!-- ##### DO NOT edit above this line unless you know what you are doing. ##### --&amp;gt;&lt;br /&gt;
==Entry title==&lt;br /&gt;
* Insert content here...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- ##### DO NOT edit below this line unless you know what you are doing. ##### --&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
__NOTOC__&lt;/div&gt;</description>
			<pubDate>Thu, 16 Aug 2012 18:19:09 GMT</pubDate>			<dc:creator>Timothee Flutre</dc:creator>			<comments>http://www.openwetware.org/wiki/User_talk:Timothee_Flutre/Notebook/Postdoc/2012/08/16</comments>		</item>
	</channel>
</rss>
