Wikiomics:Protein mass spectrometry
From OpenWetWare
Protein mass spectrometry can be divided into:
- identification of proteins/peptides
- quantification
Protein/peptide identification
Peptide Mass Fingerprinting (PMF) or (MS)
Old method, superseded by MS/MS
- algorithms:
- Mascot (gives probabilistic score)
- Aldente
- ProFound
- caveats
- no sequence information
- journals have started to require that at least one peptide of any protein identified by PMF be confirmed by MS/MS
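The "no sequence information" caveat can be illustrated with a minimal sketch: PMF compares only peptide masses, so two peptides with the same amino acid composition but a different residue order cannot be told apart. The function names below are hypothetical, the digest is simplified (cleavage after every K or R, no missed cleavages, no proline rule), and the masses are standard monoisotopic residue masses for unmodified peptides.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per peptide (free N- and C-termini)


def tryptic_digest(protein):
    """Simplified trypsin digest: cleave after every K or R."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        if aa in "KR":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, neutral peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


if __name__ == "__main__":
    # The theoretical "fingerprint" is just the list of peptide masses.
    for pep in tryptic_digest("MEGGAYGAGKAGGAFDPYTL"):
        print(pep, round(peptide_mass(pep), 4))
    # Same composition, different sequence: the masses are identical.
    print(abs(peptide_mass("GAGK") - peptide_mass("AGGK")) < 1e-9)
```

Real PMF engines additionally handle missed cleavages, modifications and mass tolerance, none of which are modelled here.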
Peptide fragment fingerprinting (PFF) or (MS/MS)
- algorithms (most commonly used):
- algorithms (other/new/experimental):
- Spectrum Mill $$$
- MASPIC
- this paper claims 5-15% more confident hits than Sequest: [1]
- InsPecT: a new variable-modification search tool from Pevzner & Tanner at UCSD (free?)
- filtering of the results
- Trans Proteomic Pipeline [2] (free?)
- download from Sourceforge (TPP Cygwin Setup for Windows or 'Trans-Proteomic Pipeline' for Linux)
- commercial offshoot: IPP
- wiki devoted to TPP TPP_Wiki
- dynamic newsgroup: spctools-discuss
- DTASelect: appears to be in a semi-frozen state (free for nonprofit use, but requires a signed MTA)
Databases
Use (if possible):
- IPI International Protein Index
- always use a target-decoy database (e.g. concatenated: human_IPI + reversed_human_IPI)
- decoy databases creation methods:
- protein reversal (simple)
- MEGGAYGAGKAGGAFDPYTL => LTYPDFAGGAKGAGYAGGEM
- peptide pseudo-reversal (used in Sorcerer by Sage-N Research)
- MEGGAYGAGKAGGAFDPYTL => (trypsin digest, done with MS-Digest) MEGGAYGAGK AGGAFDPYTL => GAGYAGGEMK-ALTYPDFAGG (each peptide is reversed, but the trypsin cleavage site is preserved; guessed from the Elias 2007 paper)
- shuffled
- MEGGAYGAGKAGGAFDPYTL => FYAGADEAGMGTYKGGAGLP (used SMS, results differ each time)
- random (i.e. creating a database of random proteins based on the amino acid frequencies in the source FASTA file)
- to create a decoy database, use DBToolkit (free, standalone Java application)
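The decoy creation methods listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation used by DBToolkit, Sorcerer or SMS: all function names and the `REV_` decoy prefix are hypothetical, the digest cleaves after every K or R, and the final peptide (which has no trailing K/R) is simply reversed in full, so the pseudo-reversed string differs from the guessed example in the list for that last peptide.

```python
import random


def reverse_protein(seq):
    """Protein reversal: read the whole sequence backwards."""
    return seq[::-1]


def tryptic_digest(seq):
    """Simplified trypsin digest: cleave after every K or R."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        if aa in "KR":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def pseudo_reverse(seq):
    """Reverse each tryptic peptide, keeping its C-terminal K/R in place."""
    out = []
    for pep in tryptic_digest(seq):
        if pep[-1] in "KR":
            out.append(pep[:-1][::-1] + pep[-1])
        else:
            out.append(pep[::-1])  # assumption: no K/R to preserve
    return "".join(out)


def shuffle_protein(seq, rng):
    """Shuffled decoy: same composition, random residue order."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


def random_protein(length, source_seqs, rng):
    """Random decoy drawn from the amino acid frequencies of the source."""
    pool = "".join(source_seqs)  # uniform draws from the pool follow its frequencies
    return "".join(rng.choice(pool) for _ in range(length))


def target_decoy(entries, prefix="REV_"):
    """Concatenate target entries (name, seq) with their reversed decoys."""
    return entries + [(prefix + name, reverse_protein(s)) for name, s in entries]


if __name__ == "__main__":
    seq = "MEGGAYGAGKAGGAFDPYTL"
    rng = random.Random(0)  # fixed seed so the shuffled decoy is reproducible
    print(reverse_protein(seq))
    print(pseudo_reverse(seq))
    print(shuffle_protein(seq, rng))
    print(random_protein(len(seq), [seq], rng))
    print(target_decoy([("P1", seq)]))
```

Passing a seeded `random.Random` avoids the "results differ each time" behaviour noted for shuffling, which helps when a search has to be repeated against exactly the same decoy database.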
de novo sequence determination algorithms
Spectral matching
- P3 (server) from Global Proteomics Machine (free)
- SpectraST from ISB, Seattle (not as many species/options as P3)
- BiblioSpec from MacCoss lab. (free for non-profit, online licence)
- command line only
Web sites
- UCSD (Pevzner)
- U. of Washington (MacCoss)
- Proteome Commons collection of tools & links
- GenePattern proteomics modules from Broad Inst.
Reviews
For a good review of programs and aspects of protein identification by mass spectrometry see:
Other tools to be sorted out
- DeNovoID web
- SPIDER (PDF) de novo + homology search in other species
- OpenSea (HTML) Java program available from authors
- MSQuant: parser for Mascot results for quantitation.
- ModifiComb (HTML) (available from authors?)
- MODi [6] web server for PTMs discovery
- UNIMOD [7] database of PTMs
- SILVER view your spectra with LOD scores
Credits
- Darek Kedra wrote this tutorial