User:R. Eric Collins/MBL/ML

From OpenWetWare

Maximum Likelihood

Paul Lewis


Derrick Zwickl

Large ML inference

  • GARLI
  • branch length optimization makes up the majority of ML inference time, because changing one branch length changes the optimal lengths of the other branches
    • this problem becomes untenable with large trees (e.g. >50 taxa)
  • how accurate does a tree likelihood need to be?
  • evolutionary (not technically genetic) algorithm
    • start with a large number of individuals (parameter sets)
    • compete them against each other; unfit ones die out (as do some fit ones, but never the fittest)
    • can supply a starting tree that you believe, even if it has polytomies
  • difficulty due to parameter estimation: nucleotide < amino acid < codon (because of transition probability estimation)
  • when to use codons
    1. mixture of divergent and closely related sequences
    • not always better, plus it takes longer
    2. when the models get better
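
The evolutionary search described in the notes above can be sketched roughly as follows. This is a toy illustration only, not GARLI's actual implementation: `toy_log_likelihood`, `mutate`, and `evolve` are hypothetical names, the "individuals" are just three positive numbers standing in for a parameter set, and the fitness function is a made-up stand-in for a tree log-likelihood. The one property it tries to mirror is that the best individual always survives, while the others survive by chance weighted toward fitter ones.

```python
import random


def toy_log_likelihood(params):
    # Hypothetical stand-in for a tree log-likelihood; it peaks at (0.1, 0.2, 0.3).
    target = (0.1, 0.2, 0.3)
    return -sum((p - t) ** 2 for p, t in zip(params, target))


def mutate(params, rng, step=0.05):
    # Perturb one randomly chosen parameter, keeping it positive (like a branch length).
    child = list(params)
    i = rng.randrange(len(child))
    child[i] = max(1e-6, child[i] + rng.gauss(0.0, step))
    return child


def evolve(pop_size=20, generations=200, seed=1):
    rng = random.Random(seed)
    # Start with a population of random parameter sets.
    population = [[rng.uniform(0.0, 1.0) for _ in range(3)] for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        # Every individual produces one mutated offspring.
        candidates = population + [mutate(ind, rng) for ind in population]
        candidates.sort(key=toy_log_likelihood, reverse=True)
        best = candidates[0]  # the fittest individual is never discarded
        rest = candidates[1:]
        # Rank-weighted sampling (with replacement): fitter individuals are more
        # likely to survive, but some fit ones can still be lost by chance.
        weights = [len(rest) - rank for rank in range(len(rest))]
        survivors = rng.choices(rest, weights=weights, k=pop_size - 1)
        population = [best] + survivors
        best_history.append(toy_log_likelihood(best))
    return population[0], best_history


if __name__ == "__main__":
    best, history = evolve()
    print("best parameters:", [round(p, 3) for p in best])
    print("best score, first vs last generation:", history[0], history[-1])
```

Because the fittest individual is carried over unchanged each generation, the best score can never get worse, which is the point of the "never best fit" rule in the notes.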

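One way to see the nucleotide < amino acid < codon ordering noted above is to count the entries in the transition probability matrix that must be computed for each branch. The sketch below assumes the standard genetic code and the usual convention that codon models exclude the three stop codons; `state_counts` and `matrix_entries` are hypothetical names used only for this illustration.

```python
from itertools import product

# Stop codons under the standard genetic code (an assumption of this sketch).
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Enumerate all 64 triplets, then drop the stop codons.
all_codons = ["".join(triplet) for triplet in product("ACGT", repeat=3)]
sense_codons = [c for c in all_codons if c not in STOP_CODONS]

state_counts = {
    "nucleotide": 4,
    "amino acid": 20,
    "codon": len(sense_codons),
}


def matrix_entries(n_states):
    # A transition probability matrix has one entry per (from, to) pair of states.
    return n_states * n_states


if __name__ == "__main__":
    for name, n in state_counts.items():
        print(f"{name}: {n} states, {matrix_entries(n)} matrix entries per branch")
```

The matrix grows from 16 entries for nucleotides to 400 for amino acids and 3721 for codons, which is consistent with codon analyses taking noticeably longer.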


Questions:

  1. which branches take longest to calculate? leaves? deep branches?
  2. setting constraints... are there any algorithms that 'fill in' the valleys (level the mountains)?
  3. when do you need a tree this good? use cases for depth-of-tree'ing would be useful
  4. ancestral state reconstruction... indel/gap models