After determining a list of genes involved in a given biological process, the next step is to map these genes to known pathways or Gene Ontology terms and determine, for example, which pathways are overrepresented in the gene set.
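As a rough illustration of what "overrepresented" means here, the sketch below scores each pathway with a one-sided hypergeometric test, one common choice for this kind of analysis. It is not the method of any particular tool listed on this page. The function names, the `pathways` dictionary layout, and the gene identifiers are all hypothetical.

```python
from math import comb


def enrichment_p_value(population, pathway_size, sample_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    population   -- number of genes in the background
    pathway_size -- background genes annotated to the pathway
    sample_size  -- genes in the list of interest
    overlap      -- genes of interest that fall in the pathway
    """
    upper = min(pathway_size, sample_size)
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, sample_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(population, sample_size)


def overrepresented_pathways(gene_list, background, pathways):
    """Return (pathway, hits, p-value) tuples sorted by ascending p-value.

    `pathways` is assumed to map a pathway name to an iterable of gene IDs.
    Genes outside the background are ignored.
    """
    bg = set(background)
    genes = set(gene_list) & bg
    results = []
    for name, members in pathways.items():
        members = set(members) & bg
        hits = genes & members
        p = enrichment_p_value(len(bg), len(members), len(genes), len(hits))
        results.append((name, len(hits), p))
    return sorted(results, key=lambda r: r[2])


if __name__ == "__main__":
    # Toy, made-up identifiers purely for illustration.
    background = [f"g{i}" for i in range(1, 11)]
    pathways = {"A": ["g1", "g2", "g3"], "B": ["g8", "g9", "g10"]}
    for row in overrepresented_pathways(["g1", "g2", "g3"], background, pathways):
        print(row)
```

The raw p-values from such a test would normally still need a multiple-testing correction when many pathways are scored at once.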
Recent review (January 2008): Nam, Dougu, and Seon-Young Kim. “Gene-set approach for expression pattern analysis.” Brief Bioinform (January 17, 2008): bbn001. See Table 1 of the review for a complete list of tools.
- g:Profiler a web-based toolset for functional profiling of gene lists from large-scale experiments. Easy to use web server
- KOBAS server, used e.g. for elucidating pathways in addiction
  - takes both FASTA files and lists of genes
  - excise gi| from a typical NCBI FASTA entry to get unique IDs
  - only about 1/3 of the genes get annotated in the first step
  - Li, Chuan-Yun, Xizeng Mao, and Liping Wei. “Genes and (Common) Pathways Underlying Drug Addiction.” PLoS Computational Biology 4, no. 1 (January 2008): e2.
- GSEA http://www.broad.mit.edu/gsea/software/software_index.html
  - Objections (Damian D, Gorfine M. "Statistical concerns about the GSEA procedure"): http://www.nature.com/ng/journal/v36/n7/full/ng0704-663a.html and the reply: http://www.nature.com/ng/journal/v36/n7/full/ng0704-663b.html
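The KOBAS note above about excising gi| from FASTA entries can be sketched as follows. This assumes the legacy NCBI header layout `>gi|<number>|<db>|<accession>| description`, and it assumes "excise" means dropping only the literal `gi|` prefix so that the header starts with the numeric ID. Both function names are hypothetical, and the sample records in the usage part are made up.

```python
def excise_gi_prefix(fasta_text):
    """Drop the leading 'gi|' from every FASTA header line.

    Sequence lines and headers without the prefix are left unchanged.
    """
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">gi|"):
            line = ">" + line[len(">gi|"):]
        out.append(line)
    return "\n".join(out)


def header_ids(fasta_text):
    """Return the first '|'-delimited field of each header line."""
    ids = []
    for line in fasta_text.splitlines():
        if not line.startswith(">"):
            continue
        tokens = line[1:].split("|", 1)[0].split()
        ids.append(tokens[0] if tokens else "")
    return ids


if __name__ == "__main__":
    # Made-up records, only to show the header rewrite.
    sample = (
        ">gi|111|ref|XX_000001.1| hypothetical protein\n"
        "MKV\n"
        ">gi|222|ref|XX_000002.1| another protein\n"
        "MAA\n"
    )
    cleaned = excise_gi_prefix(sample)
    print(cleaned)
    print(header_ids(cleaned))
```

Checking that the IDs returned by `header_ids` are unique before submitting the file is a cheap sanity test.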
Other tools to check
- GEPAT Genome Expression Pathway Analysis Tool. Performs standard microarray analyses and, per its description, uses the "Ensembl database and provides information about gene names, chromosomal location, GO categories and enzymatic activity for each probe on the chip." Complex installation (Java JARs, MySQL, etc.).
- ErmineJ Java stand-alone program "designed to be used by biologists with little or no informatics background", plus a command-line interface for expert users
- PAGE Parametric Analysis of Gene Set Enrichment
- CPath database and software suite for storing, visualizing, and analyzing biological pathways (demo page available)
- EASE (old?) http://www.pubmedcentral.gov/articlerender.fcgi?tool=pubmed&pubmedid=14519205
- Nonparametric multivariate analysis, Nettleton et al. R code available from the author.
- Cytoscape leader in the field
- ONDEX "enables data from diverse biological data sets to be linked, integrated and visualised through graph analysis techniques"
- PIANA (Protein Interactions And Network Analysis) integrates data from multiple sources in a centralized database, automating the analysis of protein-protein interaction networks.
- KEGG first choice for scope
- Reactome human + model organisms pathways. Expert annotations from literature.
- PID Pathway Interaction Database @NIH
- Cyclone - provides an open source Java API for easier access to BioCyc.
- RegulonDB E.coli K12 DB (operons/genes/regulatory elements)
Pathway specific languages
- BioPAX Biological Pathway Exchange Language
Stuff 2 check
- GenMAPP, Pathway Processor, GeneXPress; see:
Cavalieri D, De Filippo C. Bioinformatic methods for integrating whole-genome expression results into cellular networks. Drug Discov Today. 2005;10:727–734. doi: 10.1016/S1359-6446(05)03433-1
- Aittokallio, Tero, and Benno Schwikowski. “Graph-based methods for analysing networks in cell biology.” Brief Bioinform 7, no. 3 (September 1, 2006): 243-255.
- Li, Chuan-Yun, Xizeng Mao, and Liping Wei. “Genes and (Common) Pathways Underlying Drug Addiction.” PLoS Computational Biology 4, no. 1 (January 2008): e2.
- Nam, Dougu, and Seon-Young Kim. “Gene-set approach for expression pattern analysis.” Brief Bioinform (January 17, 2008): bbn001.
- “Resources for integrative systems biology: from data through databases to networks and dynamic system models -- Ng et al. 7 (4): 318 -- Briefings in Bioinformatics.” http://bib.oxfordjournals.org/cgi/content/full/7/4/318.
- Stromback, Lena, Vaida Jakoniene, He Tan, and Patrick Lambrix. “Representing, storing and accessing molecular interaction data: a review of models and tools.” Brief Bioinform 7, no. 4 (December 1, 2006): 331-338.
- “Tools for visually exploring biological networks -- Suderman and Hallett 23 (20): 2651 -- Bioinformatics.” http://bioinformatics.oxfordjournals.org/cgi/content/full/23/20/2651.