Moore Notes 7 6 11

From OpenWetWare

Group Call

  • Protein Db
    • Update from Guillaume
      • Re-running crashed jobs
      • PFAM clustering
        • Which distance metric and clustering algorithm?
        • Compare to what? Homology similarities and families
        • Maybe save for a later paper
        • Need to compare to PFAM: it provides functional information and a baseline for the amount of clustering
        • Should also compare to COGs (they have phylogenetic context)
    • Why is this db/paper different from existing protein databases?
      • Full-length gene families
      • Derived from bacterial genomes
      • High-throughput, automated, easily updated with new genomes, open
    • Generation of full-length protein families and models
      • Description of workflow
      • Description of database
      • Database accessibility
    • Statistical assessment of the families
      • Family size distribution
      • Family PD distribution
      • Precision and recall distributions (local vs. global)
    • The relationship between families in homology space
      • Which families have models that recruit the same sequences?
      • Cytoscape-like network map of family homology (see attached image)
      • Clusters may represent superfamilies
    • The relationship between families in functional space
      • Hierarchical clustering of families by their PFAM annotations
      • Can clusters be partitioned into broad-based functional groups?
    • The overlap between these relationships
      • Can we quantify the amount of overlap between the homology clusters and the functional clusters?
      • What does this tell us about the evolution of function across superfamilies?
    • To do
      • Make an outline (Tom, Katie)
      • Introduction (Morgan)
      • Describe workflow and metrics (Dongying, Guillaume, Jonathan)
      • Compare to PFAM
      • Compare to COGs or describe differences
      • Finish statistical analyses
      • Search against metagenomes and/or new genomes (compare to PFAM or COGs?)
  • GBMF proposal request
    • Overhead issue
    • Katie will start outlining, Jonathan back next week
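
Sketch for the Protein Db items above ("Precision and recall distributions" and "Can we quantify the amount of overlap between the homology clusters and the functional clusters?"). This is only one possible way to compute these numbers, not the group's agreed method. The function names, the dictionary inputs (family ID mapped to cluster label), and the choice of a Jaccard index over co-clustered family pairs are all assumptions made for illustration.

```python
from itertools import combinations


def precision_recall(recruited, members):
    """Precision and recall for one family model (hypothetical helper).

    recruited: sequence IDs the model recruits
    members:   sequence IDs that actually belong to the family
    """
    recruited, members = set(recruited), set(members)
    true_pos = len(recruited & members)
    precision = true_pos / len(recruited) if recruited else 0.0
    recall = true_pos / len(members) if members else 0.0
    return precision, recall


def co_clustered_pairs(labels):
    """Unordered pairs of families that share a cluster label."""
    return {
        frozenset((a, b))
        for a, b in combinations(sorted(labels), 2)
        if labels[a] == labels[b]
    }


def pair_jaccard(homology, functional):
    """Jaccard index over co-clustered family pairs (assumed overlap metric).

    Both arguments map family ID -> cluster label. Only families present
    in both clusterings are compared.
    """
    shared = set(homology) & set(functional)
    h = co_clustered_pairs({f: homology[f] for f in shared})
    g = co_clustered_pairs({f: functional[f] for f in shared})
    union = h | g
    # No co-clustered pairs in either clustering: treated as identical.
    return len(h & g) / len(union) if union else 1.0


if __name__ == "__main__":
    # Toy data, invented for this example only.
    p, r = precision_recall({"s1", "s2", "s3", "s4"}, {"s1", "s2", "s5"})
    print(p, r)

    homology = {"fam1": "A", "fam2": "A", "fam3": "B", "fam4": "B"}
    functional = {"fam1": "x", "fam2": "x", "fam3": "x", "fam4": "y"}
    print(pair_jaccard(homology, functional))  # 0.25
```

A pair-based index is used here because it does not require the two clusterings to have matching labels or the same number of clusters. Other measures (for example an adjusted Rand index) would serve the same purpose.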