Richard Brous Week 9: Difference between revisions
From OpenWetWare
Revision as of 14:23, 29 October 2010
Richard Brous
Electronic Lab Notebook - Analysis of Vibrio cholerae microarray data project for Week 9
Getting Started
- Partnered with Zeb (older db)
- I have selected the newest Vc-Std_External10201022 gene database
- Extracted to Desktop
Starting GenMAPP and getting organized
- Launched GenMAPP (thank god it's not looking for the update server!!! =D)
- Ensure the correct gene database is loaded: Vc-Std_External10201022
- If not, load the correct one
- Data > Choose Gene Database menu item to select the Gene Database you need
- Select Data menu then Expression Dataset manager
- Select new dataset, which is the tab-delimited text file formatted for GenMAPP made during week 8.
- Since the tab-delimited file contains only data, DO NOT select any column to exclude from the Data Type Specification window
- Let the Expression dataset manager convert the data
- This may take a while, so don't assume the system has hung; there is only minimal screen feedback
- A complete conversion leaves the data active in the Expression Dataset Manager window AND creates a converted file named *.gex in the same location as your tab-delimited source file.
- Errors (almost certainly) occurred where the Expression Dataset Manager could not convert 1 or more lines of data
- An exception file is created which contains all the raw data with the addition of a column called "~Error~"
- The "~Error~" column holds an error message for each problem line, or a single space character if the program found no error on that line
- I specifically found 121 errors of 5221 records
- Zeb should get more errors since his gene database is older.
- Zeb found 772 errors of 5221 records
- Customize the new Expression Dataset by creating new Color Sets
- Color Sets contain the GenMAPP instructions for displaying data from an Expression Dataset on MAPPs.
- Create a color set by filling in the fields:
- Name for the Color Set: pathogenic_vs_lab
- gene value: Avg_LogFC_all
- Criteria that determine how a gene object is colored on the MAPP
- increased gene expression (red)
- [Avg_LogFC_all] > 0.25 AND [Pvalue] < 0.05
- decreased gene expression (green)
- [Avg_LogFC_all] < -0.25 AND [Pvalue] < 0.05
- The criterion expressions are written like the conditions in the queries performed in PostgreSQL
- After completing a new criterion, add the criterion entry (label, criterion, and color) to the Criteria List by clicking the Add button
- 2 criteria were created: "Increased" is [Avg_LogFC_all] > 0.25 AND [Pvalue] < 0.05 and "Decreased" is [Avg_LogFC_all] < -0.25 AND [Pvalue] < 0.05
- Can always add more criteria by following the previous steps
- Save the entire Expression Dataset by selecting Save from the Expression Dataset menu
- Exit Expression Dataset Manager to view the Color Sets on a MAPP and then close it
- Keep your *.gex file safe prior to wiki upload (save to usb drive or email it to yourself)
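The two Color Set criteria above can be written out as a small classifier. This is only a sketch in Python, not GenMAPP's own evaluation code: the function name `classify`, the fallback label, and the sample rows are hypothetical, while the column names and thresholds are the ones entered in the Color Set.

```python
def classify(avg_logfc_all, pvalue, cutoff=0.25, alpha=0.05):
    """Return the Color Set label for one gene (sketch of the two criteria)."""
    if pvalue < alpha and avg_logfc_all > cutoff:
        return "Increased"    # [Avg_LogFC_all] > 0.25 AND [Pvalue] < 0.05 (red)
    if pvalue < alpha and avg_logfc_all < -cutoff:
        return "Decreased"    # [Avg_LogFC_all] < -0.25 AND [Pvalue] < 0.05 (green)
    return "No criteria met"  # placeholder label for everything else

# Hypothetical rows: (gene ID, Avg_LogFC_all, Pvalue)
rows = [("geneA", 1.65, 0.01), ("geneB", -0.80, 0.03), ("geneC", 0.90, 0.20)]
labels = {gene: classify(fc, p) for gene, fc, p in rows}
```

Both comparisons are strict, so a gene sitting exactly on a threshold (for example Avg_LogFC_all = 0.25) meets neither criterion.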
MAPPFinder Procedure
- Zeb and I were in the front row, so we selected: INCREASED
- Launch the MAPPFinder program directly, or from within GenMAPP select Tools -> MAPPFinder
- Ensure the correct Gene Database is loaded!!!
- If not, choose File -> Choose Gene Database and select the correct one.
- Located in: C:\GenMAPP 2 Data\Gene Databases\
- Press "Calculate New Results" button
- Select your *.gex file you created previously.
- RAB_10_23_2010_Merrell_Compiled_Raw_Data_Vibrio.gex
- Click OK
- Choose the Color Set and Criteria with which to filter the data
- Chose Increased since that is what we are assigned
- Check Gene Ontology and Calculate p values
- Click the "Browse" button and create a meaningful filename for the results
- Click "Run MAPPFinder".
- The analysis will take several minutes.
- It may look like the computer has stalled, but the hourglass should be on the screen, indicating it's working
- When the results have been calculated, a Gene Ontology browser will open showing your results
- All of the Gene Ontology terms that have at least 3 genes measured and a p value of less than 0.05 will be highlighted yellow.
- A term with a p value less than 0.05 is considered a "significant" result.
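The yellow-highlight rule described above (at least 3 genes measured and a p value of less than 0.05) can be restated as a short filter. This is a minimal Python sketch of that rule only; the term names and numbers are made up for illustration and do not come from the MAPPFinder results.

```python
def is_highlighted(genes_measured, p_value, min_genes=3, alpha=0.05):
    """True when a GO term would be highlighted under the rule above."""
    return genes_measured >= min_genes and p_value < alpha

# Hypothetical terms: (name, genes measured, p value)
terms = [
    ("term A", 12, 0.001),  # enough genes, significant
    ("term B", 2, 0.010),   # significant, but fewer than 3 genes measured
    ("term C", 8, 0.200),   # enough genes, not significant
]
highlighted = [name for name, measured, p in terms if is_highlighted(measured, p)]
```

Only "term A" passes both conditions here; a low p value alone is not enough when fewer than 3 genes were measured.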
Evaluating MAPPFinder results
- List most significant Gene Ontology terms
- Click on "Show Ranked List" from Menu Bar for list ranked by Z score and p value
- Top 10 Gene Ontology terms:
- branched chain family amino acid metabolic process
- branched chain family amino acid biosynthetic process
- IMP biosynthetic process
- IMP metabolic process
- arginine metabolic process
- cellular nitrogen compound biosynthetic process
- leucine biosynthetic process
- leucine metabolic process
- amine biosynthetic process
- arginine biosynthetic process
- ZEB Top 10 Gene Ontology terms:
- localization
- cellular biopolymer biosynthetic process
- biopolymer biosynthetic process
- cellular macromolecule biosynthetic process
- macromolecule biosynthetic process
- cellular macromolecule metabolic process
- macromolecule metabolic process
- cell projection organization
- biopolymer metabolic process
- transporter activity
- MAPPFinder lets you find Gene Ontology (GO) terms with which a listed gene is associated.
- First collapse the tree
- Type the gene identifier into the gene ID search field
- Example - genes mentioned in Merrell et al. (2000)
- VC0028
- metal ion binding
- iron-sulfur cluster binding
- 4 iron, 4 sulfur cluster binding
- catalytic activity
- lyase activity
- dihydroxy-acid dehydratase
- VC0941
- pyridoxal phosphate binding
- catalytic activity
- glycine hydroxymethyltransferase
- VC0869
- nucleotide binding
- ATP binding
- catalytic activity
- ligase activity
- phosphoribosylformylglycinamidine synthase activity
- VC0051
- nucleotide binding
- ATP binding
- catalytic activity
- lyase activity
- carboxy-lyase activity
- phosphoribosylaminoimidazole carboxylase activity
- VC0647
- nucleotidyltransferase activity
- polyribonucleotide nucleotidyltransferase activity
- VC0468
- metal ion binding
- nucleotide binding
- ATP binding
- catalytic activity
- ligase activity
- glutathione synthase activity
- VC2350
- catalytic activity
- lyase activity
- deoxyribose-phosphate aldolase activity
- VCA0583
- outer membrane-bounded periplasmic space
- INSERT - Are they the same as your buddy who is using a different Gene Database? Why or why not?
- Click on one of the GO terms that are associated with one of the genes you looked up in the previous step.
- A MAPP will open listing all of the genes (as boxes) associated with that GO term. Moreover, the genes on the MAPP will be color-coded with the gene expression data from the microarray experiment.
- List in your journal entry the name of the GO term you clicked on and whether the expression of the gene you were looking for changed significantly in the experiment.
- Looking for VC0028
- GO term: 4 iron, 4 sulfur cluster binding
- ILVD_VIBCH increased expression 1.65
- LEUC_VIBCH increased expression 0.52
- Q9KM58_VIBCH increased expression 1.01
- RUMB_VIBCH increased expression 0.45
- THIC_VIBCH increased expression 1.61
- VC0028 = ILVD_VIBCH from UniProt db
- pathogenic strain 1.65 increased expression compared to 1.27 lab strain
- Links out to other db (VC0028)
- http://cmr.jcvi.org/tigr-scripts/CMR/shared/GenePage.cgi?locus=VC0028
- JCVI Annotation Display: VC_0028 - Function: dihydroxy-acid dehydratase
- 4 iron, 4 sulfur cluster binding.mapp
- Compare excel files:
My results:
- 339 probes met the [Avg_LogFC_all] > 0.25 AND [Pvalue] < 0.05 criteria.
- 338 probes meeting the filter linked to a UniProt ID.
- 219 genes meeting the criterion linked to a GO term.
- 5221 Probes in this dataset.
- 5100 Probes linked to a UniProt ID.
- 2475 Genes linked to a GO term.
- The z score is based on an N of 2475 and a R of 219 distinct genes in the GO.
Zeb_results:
- 339 probes met the [Avg_LogFC_all] > 0.25 AND [Pvalue] < 0.05 criteria.
- 291 probes meeting the filter linked to a UniProt ID.
- 184 genes meeting the criterion linked to a GO term.
- 5221 Probes in this dataset.
- 4449 Probes linked to a UniProt ID.
- 1990 Genes linked to a GO term.
- The z score is based on an N of 1990 and a R of 184 distinct genes in the GO.
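To see how the different N and R values feed into the z score, here is a rough Python sketch. It assumes the z score formula as I understand it from the MAPPFinder documentation, z = (r - n*R/N) / sqrt(n * (R/N) * (1 - R/N) * (1 - (n-1)/(N-1))), where n is the number of genes measured in a GO term and r is the number of those meeting the criterion. N and R are taken from the two summaries above; the function name and the n = 20, r = 8 example term are hypothetical.

```python
from math import sqrt

def mappfinder_z(N, R, n, r):
    """z score for one GO term (assumed MAPPFinder formula, see lead-in)."""
    expected = n * R / N  # genes expected to meet the criterion by chance
    variance = n * (R / N) * (1 - R / N) * (1 - (n - 1) / (N - 1))
    return (r - expected) / sqrt(variance)

# N and R from the two result summaries (newer vs. older gene database).
mine = {"N": 2475, "R": 219}
zeb = {"N": 1990, "R": 184}

# Hypothetical GO term: 20 genes measured, 8 of them meeting "Increased".
z_mine = mappfinder_z(mine["N"], mine["R"], n=20, r=8)
z_zeb = mappfinder_z(zeb["N"], zeb["R"], n=20, r=8)

# Fraction of the 5221 probes linked to a UniProt ID in each gene database.
coverage_mine = 5100 / 5221
coverage_zeb = 4449 / 5221
```

The coverage values work out to about 97.7% for the newer gene database and about 85.2% for the older one. Because N and R differ between the two databases, the same GO term with the same counts gets a different z score in each run, which is one reason the two ranked lists need not match.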