BioSysBio:abstracts/2007/Sanne Abeln: Difference between revisions

From OpenWetWare

Revision as of 08:16, 28 September 2006

Linking evolution of protein structures through fragments

Author(s): Sanne Abeln, Charlotte M. Deane
Affiliations: University of Oxford
Contact: abeln@stats.ox.ac.uk
Keywords: 'protein structure' 'evolution' 'fragments' 'completed genomes'


Summary

Here we use a structural fragment library to investigate evolutionary links between protein folds. We show that 'older' folds have relatively more such links than 'younger' folds.


Motivation

At present there is no universal understanding of how proteins can change topology during evolution, or of how such pathways can be determined in a systematic way. The ability to create links between fold topologies would have important consequences for structural classification, structure prediction and homology modelling. It has proven difficult, however, to show the evolutionary relevance of such links between topologies based on geometrical measures. Here we use our previously determined age measure for protein folds or superfamilies [1] to investigate the effect of structural fragments on protein structure evolution.

Results

  • show figure with links between set of old folds and new folds
  • need brief discussion of evolutionary model

Methods

Fragments

The fragment library generated for this study contains fragment pairs of length 10, 15, 20 and 30, with a maximum allowed gap length of 2, 3, 4 and 6 respectively. All fragments are based on pairwise comparisons between structural domains as defined by SCOP. The pairs are scored for similarity on purely structural grounds, using the coordinates of the C-alpha atoms only.
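The library parameters above can be written down as a small lookup table. The sketch below is illustrative only: the names `MAX_GAP_BY_LENGTH`, `fragment_windows` and `count_window_pairs` are hypothetical, and it assumes that a fragment is a contiguous window of residues, which the text does not state explicitly. It shows how many candidate window pairs of one length exist between two domains before any gaps are considered.

```python
# Fragment length -> maximum allowed gap length, as listed in the text.
MAX_GAP_BY_LENGTH = {10: 2, 15: 3, 20: 4, 30: 6}


def fragment_windows(domain_length, fragment_length):
    """Return (start, end) index pairs of all contiguous windows.

    Assumption: fragments are contiguous residue windows; 'end' is exclusive.
    """
    last_start = domain_length - fragment_length
    return [(s, s + fragment_length) for s in range(last_start + 1)]


def count_window_pairs(len_a, len_b, fragment_length):
    """Number of ungapped window pairs between two domains."""
    n_a = len(fragment_windows(len_a, fragment_length))
    n_b = len(fragment_windows(len_b, fragment_length))
    return n_a * n_b


if __name__ == "__main__":
    for length, max_gap in MAX_GAP_BY_LENGTH.items():
        pairs = count_window_pairs(100, 80, length)
        print(f"length {length:2d}, max gap {max_gap}: {pairs} window pairs")
```

In every case the maximum gap length is one fifth of the fragment length. The quadratic growth in window pairs is one reason a fast screening step before superposition is useful.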

All possible fragment pairs of the given lengths between two domains are first screened and aligned using a method similar to the prefilter used by MAMMOTH [2]. Each fragment pair with an alignment score above a threshold is then superimposed to obtain an RMSD score for the fragment pair.
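The superposition step can be illustrated with a short sketch. The text does not say which superposition algorithm was used, so the Kabsch algorithm is assumed here as a standard choice; the function name `superposed_rmsd` is hypothetical, and the input is taken to be two equal-length arrays of already aligned C-alpha coordinates.

```python
import numpy as np


def superposed_rmsd(coords_a, coords_b):
    """RMSD between two aligned coordinate sets after optimal superposition.

    Uses the Kabsch algorithm (an assumption; the source does not name
    the method). Both inputs must have shape (n, 3).
    """
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    if a.shape != b.shape or a.ndim != 2 or a.shape[1] != 3:
        raise ValueError("both inputs must have the same shape (n, 3)")

    # Remove translation by centring both sets on their centroids.
    a_centred = a - a.mean(axis=0)
    b_centred = b - b.mean(axis=0)

    # Covariance matrix and its singular value decomposition.
    covariance = a_centred.T @ b_centred
    u, _, vt = np.linalg.svd(covariance)

    # Force a proper rotation (determinant +1), never a reflection.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    # Rotate set A onto set B and measure the remaining deviation.
    diff = a_centred @ rotation.T - b_centred
    return float(np.sqrt((diff ** 2).sum() / len(a)))
```

Because reflections are excluded, a fragment and its mirror image do not superimpose to an RMSD of zero, which matters when comparing chiral protein backbones.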

Age estimates

Age estimates for protein folds or superfamilies are generated using fold recognition of structural domains on a set of completed genomes. The occurrence patterns of these predictions are analysed with a parsimony algorithm to estimate an age for a superfamily or fold; for more details see [1].

The age of a fold or superfamily is a score in the range [0.0, 1.0], with 0.0 indicating a last common ancestor at the leaves of the species tree (youngest) and 1.0 indicating presence at the root of the species tree (oldest). Here an 'old' fold is defined as a fold with an age of 1.0, and a 'young' fold as a fold with an age < 0.5.
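The two age classes defined above can be expressed as a simple rule. This is a minimal sketch: the function name `classify_fold_age` is hypothetical, and the label "intermediate" for ages from 0.5 up to (but not including) 1.0 is an assumption, since the text defines only 'old' and 'young'.

```python
def classify_fold_age(age):
    """Classify a fold by its age score in [0.0, 1.0].

    'old' and 'young' follow the definitions in the text; the
    'intermediate' label is an assumption for the remaining range.
    """
    if not 0.0 <= age <= 1.0:
        raise ValueError("age must lie in [0.0, 1.0]")
    if age == 1.0:
        return "old"        # present at the root of the species tree
    if age < 0.5:
        return "young"
    return "intermediate"   # hypothetical label, not used in the text


if __name__ == "__main__":
    for age in (0.0, 0.3, 0.5, 0.9, 1.0):
        print(age, classify_fold_age(age))
```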

Linking Folds

Since secondary structure is not taken into account, the number of shared fragments needs to be normalised by how often a fragment occurs in general. Friedberg and Godzik (2005) used a similar approach, although they also took sequence similarity into account, and used a fold-based normalisation to overcome this problem [3].
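The idea of normalising by how often a fragment occurs can be sketched as follows. The exact normalisation used in this study is not given in the text, so the inverse-occurrence weighting below is purely an assumption for illustration, as are the function name `normalised_shared_score` and the representation of each fold as a set of fragment identifiers.

```python
from collections import Counter


def normalised_shared_score(fold_fragments, fold_a, fold_b):
    """Score the link between two folds from their shared fragments.

    fold_fragments maps a fold name to a set of fragment identifiers
    (a hypothetical data layout). Each shared fragment contributes
    1 / (number of folds containing it), so that common fragments
    count for less. This weighting is an assumption, not the method
    described in the source.
    """
    # Count in how many folds each fragment occurs.
    occurrence = Counter()
    for fragments in fold_fragments.values():
        occurrence.update(set(fragments))

    shared = set(fold_fragments[fold_a]) & set(fold_fragments[fold_b])
    return sum(1.0 / occurrence[fragment] for fragment in shared)


if __name__ == "__main__":
    folds = {
        "fold_a": {"f1", "f2"},
        "fold_b": {"f1", "f2"},
        "fold_c": {"f1"},
        "fold_d": {"f1"},
    }
    print(normalised_shared_score(folds, "fold_a", "fold_b"))
```

Under this weighting a fragment that occurs in almost every fold, such as a single helix, adds little to a link, while a rare shared fragment adds more.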

Conclusion

We show that younger folds have relatively fewer shared fragments with other folds than old protein folds do. This might indicate that evolutionary links above the superfamily or fold level could be established through such shared fragments.

References

  1. Winstanley HF, Abeln S, and Deane CM. How old is your fold? Bioinformatics. 2005 Jun;21 Suppl 1:i449-58. DOI:10.1093/bioinformatics/bti1008. PubMed ID:15961490.
  2. Ortiz AR, Strauss CE, and Olmea O. MAMMOTH (matching molecular models obtained from theory): an automated method for model comparison. Protein Sci. 2002 Nov;11(11):2606-21. DOI:10.1110/ps.0215902. PubMed ID:12381844.
  3. Friedberg I and Godzik A. Fragnostic: walking through protein structure space. Nucleic Acids Res. 2005 Jul 1;33(Web Server issue):W249-51. DOI:10.1093/nar/gki363. PubMed ID:15980462.
