User:Matthew Whiteside/Notebook/Ortholuge Development/Tasks/Task2

From OpenWetWare

TASK2: Inparalog prediction

Notes

I reviewed the Inparanoid approach:

  1. Remm M, Storm CE, and Sonnhammer EL. Automatic clustering of orthologs and in-paralogs from pairwise species comparisons. J Mol Biol. 2001 Dec 14;314(5):1041-52. DOI:10.1006/jmbi.2000.5197 | PubMed ID:11743721

In Inparanoid, clusters of orthologs and inparalogs are merged when their orthologs or inparalogs overlap, so that every ortholog and inparalog ends up with a unique cluster assignment. Inparanoid follows 5 rules to decide whether overlapping clusters are i) merged, ii) reduced by deleting one of them, or iii) kept separate with the inparalogs divided between them. See the paper above for the rules.

Ortholuge is ortholog-centred. It merges clusters whenever orthologs overlap (a gene with equal reciprocal best BLAST hits, rbbh, to multiple genes). Inparalogs are accessory: we compute them using the basic Inparanoid definition and add them to the ortholog cluster. I do not try to assign each inparalog to a single cluster as Inparanoid does. If a gene satisfies the definitions below, it is added to the cluster regardless of whether it is already in another cluster. Ortholuge does not try to resolve the orthology of inparalogs.
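The ortholog merging step described above can be sketched with a union-find over rbbh pairs. This is only an illustration, not the Ortholuge code: the function name `merge_rbbh_clusters` and the input format (a list of `(gene_a, gene_b)` pairs with gene IDs unique across both genomes) are assumptions made for the example.

```python
def merge_rbbh_clusters(rbbh_pairs):
    """Group genes into clusters; any two rbbh pairs sharing a gene are merged.

    rbbh_pairs: iterable of (gene_a, gene_b) tuples (hypothetical input format).
    Returns a sorted list of sorted gene-ID lists, one per cluster.
    """
    parent = {}

    def find(gene):
        # Register unseen genes as their own root, then walk up to the root,
        # shortening the path along the way.
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]
            gene = parent[gene]
        return gene

    for gene_a, gene_b in rbbh_pairs:
        root_a, root_b = find(gene_a), find(gene_b)
        if root_a != root_b:
            parent[root_a] = root_b  # overlapping orthologs: merge clusters

    clusters = {}
    for gene in parent:
        clusters.setdefault(find(gene), set()).add(gene)
    return sorted(sorted(members) for members in clusters.values())


# A1 ties as rbbh with both B1 and B2, so the three genes share one cluster.
print(merge_rbbh_clusters([("A1", "B1"), ("A1", "B2"), ("A3", "B3")]))
# prints [['A1', 'B1', 'B2'], ['A3', 'B3']]
```

Inparalogs are not part of this step; they are attached to the resulting ortholog clusters afterwards and may appear in more than one cluster.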

Definition (using BLAST bit scores)*, **:

Here A and B are an rbbh ortholog pair from the two genomes, Aip is a candidate gene from the same genome as A, and X-Y denotes the bit score between X and Y.

  1. Lax: if A-Aip > A-B, Aip is an inparalog of A.
  2. Strict: if A-Aip > A-B and Aip's top hit is B, Aip is an inparalog of A.
* If there are multiple A-B rbbh relationships, we take the smallest A-B bit score, i.e. the most inclusive threshold.
** Orthologs are excluded, i.e. any orthologs that satisfy this inparalog definition are not added as inparalogs.
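A minimal sketch of the two definitions, to make the thresholds concrete. Everything named here is hypothetical: `predict_inparalogs`, the score dictionaries keyed by gene pairs, and the `top_hit` lookup are assumptions for illustration, not the Ortholuge implementation. When A has several rbbh partners, the strict check below accepts a top hit to any of them, which is one possible reading of the strict rule.

```python
def predict_inparalogs(a, b_orthologs, within, between, top_hit,
                       excluded=frozenset(), strict=False):
    """Return the genes from A's genome that qualify as inparalogs of `a`.

    a           -- the ortholog A
    b_orthologs -- rbbh partners of A in the other genome (one or more B)
    within      -- {(query, hit): bit score} for hits inside A's genome
    between     -- {(a, b): bit score} for A-B hits across genomes
    top_hit     -- {gene: best hit in the other genome}
    excluded    -- orthologs already in the cluster (never added as inparalogs)
    """
    # Multiple A-B relationships: use the smallest score (most inclusive).
    threshold = min(between[(a, b)] for b in b_orthologs)

    inparalogs = set()
    for (query, candidate), score in within.items():
        if query != a or candidate == a:
            continue
        if candidate in excluded:       # orthologs are excluded
            continue
        if score <= threshold:          # requires A-Aip > A-B
            continue
        if strict and top_hit.get(candidate) not in b_orthologs:
            continue                    # strict: Aip's top hit must be B
        inparalogs.add(candidate)
    return inparalogs


# Made-up scores for illustration only.
within = {("A1", "A2"): 300, ("A1", "A3"): 250, ("A1", "A4"): 100}
between = {("A1", "B1"): 200, ("A1", "B2"): 220}
top_hit = {"A2": "B1", "A3": "B9", "A4": "B1"}

print(sorted(predict_inparalogs("A1", ["B1", "B2"], within, between, top_hit)))
# prints ['A2', 'A3']
print(sorted(predict_inparalogs("A1", ["B1", "B2"], within, between, top_hit,
                                strict=True)))
# prints ['A2']
```

In this toy data the lax rule keeps A3 because its score to A1 (250) beats the smallest A-B score (200), while the strict rule drops it because its top hit in the other genome is not one of A1's orthologs.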

To Do List

  1. comparison of strict and lax inparalog prediction methods - ongoing