Moore Notes 7 16 14

From OpenWetWare

Group Call

  • Participants: Katie, Tom, Jonathan, Dongying, Guillaume, Stacia, Stephen, Josh, Patrick
  • Eugene META symposium in August
    • Katie, Stephen, Tom will coordinate presentations
  • Shotmap paper
    • Stephen working on simulations
    • Processed all Spanish MetaHIT samples with read-length abundance thresholds
    • L4 samples all annotated and analyzed for temporal patterns
      • Some Gilbert patterns recapitulated, some not
      • Rapsearch uses a lot of disk space (for temp files)
      • Rapsearch uses a lot of memory if you don't employ multiple threads
      • Issues for local installations (60 million reads vs. KEGG creates a 1 TB footprint; 20 million reads vs. Sfams also creates a 1 TB footprint)
      • Reducing the number of hits might help, but need to think about impact on downstream analysis
    • Stephen evaluated different approaches to read translation and gene prediction
    • Planning for paper via email, then discuss on July 30 call
  • Dongying's new project about phylogenetic analysis of eukaryotic taxa in metagenomes (PHYECO eukaryotic markers)
    • Slides: http://edhar.genomecenter.ucdavis.edu/~dwu/presentation_dir/euktest.pdf
    • Laura Katz has identified 34 eukaryotic marker genes
      • Did not use same criteria as Dongying did for bacteria
      • Did not have very many genomes
    • Could also look for subclasses, such as diatoms or fungi
    • Challenges
      • Organelle derived genes
      • Low coverage
      • More copy number variants
        • Duplications and multiple copies of mitochondrial genome
        • Evenness not so useful
        • Universality and monophyly are more important
      • Huge volume of data
        • May need to divide into subgroups
    • Strategies
      • Expand bacterial/archaeal markers
      • Use 34 markers
      • Use tree to separate out the non-eukaryotic hits
        • Necessary because there is no score threshold that separates eukaryote from non-eukaryote hits
        • Works for a ribosomal protein, but not a mitochondrial protein
      • Focus on assignments to subgroups (lower down in trees), where they are monophyletic (i.e., do not need to resolve deep branches now)
      • Use metagenomic data to refine marker set (good data on universality, beyond what can be done with genomes)
  • Patrick: related question: is there some reason half of the archaeal ribosome would be variable across metagenomes (like the eukaryotic ribosome) and the other half would not (like the bacterial ribosome)?
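
A rough back-of-the-envelope check on the Rapsearch disk figures from the Shotmap item (60 million reads vs. KEGG and 20 million reads vs. Sfams, each about 1 TB) gives a per-read footprint for each database. This is only a sketch: it assumes 1 TB means 10^12 bytes and that the footprint scales linearly with read count, neither of which was stated on the call, and the function name is illustrative.

```python
TB = 10**12  # assumption: decimal terabyte (not stated in the notes)


def bytes_per_read(footprint_bytes, n_reads):
    """Average disk footprint per read, assuming linear scaling with read count."""
    return footprint_bytes / n_reads


# Figures from the notes: ~1 TB for 60M reads vs. KEGG, ~1 TB for 20M reads vs. Sfams
kegg = bytes_per_read(1 * TB, 60_000_000)
sfams = bytes_per_read(1 * TB, 20_000_000)

print(f"KEGG:  {kegg / 1000:.1f} kB per read")
print(f"Sfams: {sfams / 1000:.1f} kB per read")
print(f"Sfams/KEGG ratio: {sfams / kegg:.1f}x")
```

Under these assumptions the Sfams search leaves roughly three times as much on disk per read as the KEGG search, which is one way to gauge how much reducing the number of reported hits would need to save.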