SoniaSandbox


Temp notes for Lecture 7

quick comment on upPass

  • not a popular algorithm (you won't be tested on it)
  • but here's the correct way to do it
    (from Peter Beerli's website: http://www.scs.fsu.edu/~beerli/BSC-5936/08-31-05/Lecture_Notes_2.pdf)

ancestor = a
parent = p, the node we're looking at
children = q, r

Fx: the upPass set we want to get to
Sx: the downPass set we got to
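
A sketch (in Python) of the standard Fitch up-pass rule, which is presumably what the linked notes give; this is my reconstruction using the names above, so check it against the notes:

def fitch_uppass_set(F_a, S_p, S_q, S_r):
    # F_a: final (upPass) set already assigned to the ancestor a
    # S_p: downPass set at the node p we're looking at
    # S_q, S_r: downPass sets of p's children q and r
    if F_a <= S_p:                         # ancestor's set fits entirely inside S_p
        return set(F_a)
    if S_q & S_r:                          # S_p came from an intersection on the downPass
        return S_p | (F_a & (S_q | S_r))
    return S_p | F_a                       # S_p came from a union on the downPass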

Revisit overall strategy

for all possible trees:
    compute score(tree)
return best tree
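
A sketch of that loop in Python; all_trees() and score() are hypothetical helpers standing in for the tree enumeration and for whichever scoring function (see the list below) we pick:

def best_tree(leaves, all_trees, score):
    # all_trees(leaves): hypothetical generator of every candidate topology
    # score(tree): one of the scoring functions below (lower = better, parsimony-style;
    #              flip the comparison if the score is a likelihood)
    best, best_score = None, float('inf')
    for tree in all_trees(leaves):       # for all possible trees
        s = score(tree)                  # compute score(tree)
        if s < best_score:
            best, best_score = tree, s
    return best                          # return best tree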

Scoring functions

  1. max parsimony (fewest mutations)
  2. generalized parsimony (Sankoff: weighted mutation costs)
  3. Maximum Likelihood

ML intro

  • examples of ML estimators:
    1. for a normally distributed random variable X, X-bar (the mean of the data you observe) is the ML estimator of the mean of the distribution the data were drawn from (see the sketch below)
    2. a best-fit (least-squares) line through data is a ML estimator
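
A quick sketch of example 1, with made-up numbers: scanning the normal log-likelihood over candidate means, the maximum lands on the sample mean X-bar:

import math

# made-up data points; assume they come from a normal distribution with known sigma
xs = [4.8, 5.1, 5.3, 4.9, 5.2, 5.0]
sigma = 1.0

def log_likelihood(mu):
    # log p(data | mu) under the normal model
    return sum(-0.5 * math.log(2 * math.pi * sigma ** 2)
               - (x - mu) ** 2 / (2 * sigma ** 2)
               for x in xs)

# scan candidate means from 4.00 to 6.00; the peak is at X-bar
candidates = [i / 100 for i in range(400, 601)]
best = max(candidates, key=log_likelihood)
print(best, sum(xs) / len(xs))   # both ~5.05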

Probability Refresher

total area of a box = 1
draw two circles in the box, call them sets A and B
    0.5 = area outside both circles
    0.2 = area in A but not B
    0.2 = area in B but not A
    0.1 = area of the intersection

p(A)   = 0.2 + 0.1 = 0.3
p(B)   = 0.2 + 0.1 = 0.3
p(A,B) = 0.1
p(A|B) = p(A,B) / p(B) = 0.1 / 0.3 ≈ 0.33

deriving Bayes' rule: p(A|B) = p(A,B)/p(B) and p(B|A) = p(A,B)/p(A), so p(A|B) = p(B|A) p(A) / p(B)
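
A short Python check of the numbers above, confirming Bayes' rule numerically:

# areas from the circles-in-a-box picture
outside, a_only, b_only, both = 0.5, 0.2, 0.2, 0.1
assert abs(outside + a_only + b_only + both - 1.0) < 1e-9   # total area of the box is 1

pA = a_only + both            # 0.3
pB = b_only + both            # 0.3
pAB = both                    # 0.1, the joint probability p(A,B)
pA_given_B = pAB / pB
pB_given_A = pAB / pA

# Bayes' rule: p(A|B) = p(B|A) * p(A) / p(B)
print(pA_given_B, pB_given_A * pA / pB)   # both ~0.333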


ML in trees

What is the best tree T given the data D we have? p(T|D) is what we want to maximize. It's not intuitively obvious how to do that directly, so use Bayes' law: p(T|D) = p(D|T) * p(T) / p(D).

p(D) is a constant ... don't have to worry about it

  • what is p(T), the a priori probability of the tree?
  • without looking at the data, do we have a way of saying any tree is more likely than another one?
    No... not really, so p(T) is effectively constant too

so what we're left maximizing is just p(D|T)


A tree now consists of topology AND distances (branch lengths).

p(D|T) = p(x->A|d_1) * p(x->y|d_2) * p(y->G|d_3) * p(y->G|d_4)

p(A ∪ B) = p(A) + p(B) - p(A,B)
p(A ∩ B) = p(A) * p(B) for independent events

the branch substitutions above are independent events (that's why you multiply)
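
A sketch of that product in code; branch_prob is a stand-in for whatever evolutionary model supplies p(x -> y | d) (Jukes-Cantor, next section), and the branch lengths and internal states here are made up for illustration:

def tree_likelihood(branch_prob):
    # p(D|T) for the example tree: root x above internal node y,
    # with the internal states fixed for illustration
    x, y = 'A', 'G'                      # hypothetical ancestral states
    d1, d2, d3, d4 = 0.1, 0.2, 0.3, 0.4  # hypothetical branch lengths
    return (branch_prob(x, 'A', d1) *    # p(x -> A | d_1)
            branch_prob(x, y, d2) *      # p(x -> y | d_2)
            branch_prob(y, 'G', d3) *    # p(y -> G | d_3)
            branch_prob(y, 'G', d4))     # p(y -> G | d_4)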

Jukes-Cantor

cost matrix

if x == y :
    [JC eqn]
if x != y :
    [JC eqn]

equations
draw the little graph thing
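
The [JC eqn] placeholders are presumably the standard Jukes-Cantor transition probabilities; a sketch, with d the branch length in expected substitutions per site (the loop prints the numbers behind the graph: P(same) decays toward 1/4 and P(different) rises toward 1/4):

import math

def jc_prob(x, y, d):
    # Jukes-Cantor: probability that a site starting as base x is base y after
    # a branch of length d (expected substitutions per site)
    if x == y:
        return 0.25 + 0.75 * math.exp(-4.0 * d / 3.0)
    return 0.25 - 0.25 * math.exp(-4.0 * d / 3.0)

for d in [0.0, 0.1, 0.5, 1.0, 5.0]:
    print(d, round(jc_prob('A', 'A', d), 3), round(jc_prob('A', 'G', d), 3))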


Evolutionary Model

gives us the likelihood of D|T (need branch lengths)


downPass for ML: compute L(p | q, r, d)

q, r = the likelihoods of the two subtrees; d = the distances (branch lengths) to them
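
A sketch of what computing L(p | q, r, d) might look like, assuming the Jukes-Cantor probabilities above; this is the standard Felsenstein pruning step (for each possible base at p, sum over the bases each child could have been, then multiply the two independent branches):

import math

BASES = 'ACGT'

def jc_prob(x, y, d):
    # Jukes-Cantor transition probability (same as the sketch above)
    same = 0.25 + 0.75 * math.exp(-4.0 * d / 3.0)
    return same if x == y else (1.0 - same) / 3.0

def node_likelihood(Lq, Lr, dq, dr):
    # L(p | q, r, d): for each possible base s at node p, sum over what each child
    # could have been, then multiply the two branches (independent events)
    Lp = {}
    for s in BASES:
        from_q = sum(jc_prob(s, x, dq) * Lq[x] for x in BASES)
        from_r = sum(jc_prob(s, x, dr) * Lr[x] for x in BASES)
        Lp[s] = from_q * from_r
    return Lp

# hypothetical leaves for one column: observed bases A and G
leaf_A = {b: (1.0 if b == 'A' else 0.0) for b in BASES}
leaf_G = {b: (1.0 if b == 'G' else 0.0) for b in BASES}
print(node_likelihood(leaf_A, leaf_G, 0.1, 0.2))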

Temp notes for Lecture 6

Exams and HWs

  • In-class midterm on 10/11
  • studying for the exam: don't memorize the lecture notes; it's more important to be able to work through the problems. Understand all the homeworks and you'll be prepared.

Homeworks coming up

  • HW5: due next Wednesday (downPass, maximum likelihood)
  • HW6: search tree space

Up Pass

  • If we know what the best answer is at the root, all of our other internal nodes aren't necessarily the best guess. We need an upPass algorithm that passes information from the root back up to the leaves.
  • For this example, we are dealing with one column of the sequence alignment.

def upPass(tree, parent):
    # tree: node dict with 'data' (its downPass set), 'left', 'right'
    # parent: the final (upPass) set already assigned to this node's parent
    if tree['left'] is None and tree['right'] is None:   # leaf: stop
        return
    left = tree['left']['data']
    right = tree['right']['data']
    i123 = parent & left & right
    i12 = left & right
    i13 = left & parent
    i23 = right & parent
    if i123:
        data = i123
    else:
        # empty intersections union in nothing, so no need to test each one
        data = i12 | i13 | i23
        if not data:
            data = parent | left | right   # union here
    tree['data'] = data
    upPass(tree['left'], data)    # keep pushing the corrected sets toward the leaves
    upPass(tree['right'], data)

  • We are doing this for the most general case, where we considered: what should we do when there is no intersection?
  • When you start upPass that can't happen, because every parent should have an intersection with its children if you've done downPass correctly... BUT this situation does arise as you traverse up the tree, because you eliminate possibilities at a parent before you test for intersection with the children.
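
A minimal driver for the upPass code above, on a hypothetical one-column tree whose downPass sets are already filled in; the root's downPass set is taken as final and pushed toward the leaves:

# uses upPass() defined above
def leaf(base):
    return {'data': {base}, 'left': None, 'right': None}

root = {
    'data': {'A'},                                   # downPass set at the root
    'left':  {'data': {'A', 'C'}, 'left': leaf('A'), 'right': leaf('C')},
    'right': {'data': {'A', 'G'}, 'left': leaf('A'), 'right': leaf('G')},
}

# the root's downPass set is taken as final; push it toward the leaves
upPass(root['left'], root['data'])
upPass(root['right'], root['data'])
print(root['left']['data'], root['right']['data'])   # {'A'} {'A'}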