Moore Notes 1 9 13

From OpenWetWare

Group Call

  • Participants: Katie, Jonathan, Stephen, Tom, Guillaume, Josh, Dongying
  • Progress report
  • Hiring
    • Sarah Hird, postdoc candidate for the Eisen lab
    • UCSF bioinformatics hire - final candidate interview Friday
  • Data opportunity from Declan Schroeder
    • 18S/16S data, but no shotgun metagenomics
  • Tom: SFam updating/fixes
    • Automating and improving the process for family construction
      • LAST vs. BLAST
      • Sift with LAST first and then use HMMs
      • Stephen's additional ideas
    • Do we need supercomputer time?
      • Cluster access probably OK for now (2x per year updates)
    • Incorporating metagenomic sequences in addition to genomes
      • Much more complex and larger data sets (so might need more compute resources)
      • Youssef has published methods
    • Future of SFams
      • Someone else (CAMERA, JGI) might be able to do the searches that we need
      • We care more about having good families than about maintaining a protein db
      • People from Banfield and Giovannini labs are using SFam HMMs
    • Targeting next SFams release in a month or so
  • Josh: beta diversity mapping
    • Method
      • Model community dissimilarity (difference in OTU abundance) as a function of distance in environmental variable space
      • Make predictions using raster data
      • Project predictions into geographic space
      • Average in a 100 km window around each grid cell
    • Preliminary results
      • Highly correlated with primary productivity
      • Tom: relative abundance metric might be affected by sampling issues
      • Katie: what about seasonality? Josh: all-subsets regression selected annual variables over month-specific ones, but he will try adding month to the model to see what month-specific predictions look like
      • Cross-validation is pretty good
      • Declan's data might be good for validation
    • Community classification
      • Clustered a random subset of grid cells, then applied a kNN algorithm to assign the remaining cells to clusters
      • Arbitrarily picked 6 clusters
      • Methods could be improved
      • Result: Arctic different from Antarctic
    • Working on plotting direction of greatest turnover at each grid point
    • What are good applications for this?
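
The community classification step noted above (cluster a random subset of grid cells, then assign the remaining cells by kNN) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used in the analysis: the notes do not say which clustering algorithm, distance, or k was used, so plain k-means with farthest-first initial centers, Euclidean distance, and k=3 are all assumptions here, and the function names (`kmeans`, `knn_label`, `classify_cells`) are hypothetical. Each grid cell is represented as a tuple of numeric predictor values.

```python
import math
import random
from collections import Counter


def kmeans(points, n_clusters, iters=50):
    """Plain k-means with farthest-first initial centers (an assumed choice)."""
    centers = [points[0]]
    while len(centers) < n_clusters:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))

    def nearest_center(p):
        return min(range(n_clusters), key=lambda c: math.dist(p, centers[c]))

    for _ in range(iters):
        labels = [nearest_center(p) for p in points]
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave an empty cluster's center where it is
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return [nearest_center(p) for p in points]


def knn_label(train_points, train_labels, query, k):
    """Majority vote among the k training points closest to `query`."""
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(query, train_points[i]))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def classify_cells(cells, n_clusters=6, subset_size=100, k=3, seed=0):
    """Cluster a random subset of cells, then label the rest by kNN."""
    rng = random.Random(seed)
    subset = rng.sample(range(len(cells)), subset_size)
    subset_points = [cells[i] for i in subset]
    subset_labels = kmeans(subset_points, n_clusters)

    labels = dict(zip(subset, subset_labels))
    for i, cell in enumerate(cells):
        if i not in labels:
            labels[i] = knn_label(subset_points, subset_labels, cell, k)
    return [labels[i] for i in range(len(cells))]
```

Calling `classify_cells(cells, n_clusters=6, subset_size=...)` returns one cluster label per grid cell, in input order. The default of 6 clusters mirrors the arbitrary choice recorded in the notes; clustering only a subset keeps the clustering step cheap, at the cost of results that depend on which cells happen to be sampled.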