Richard Brous Week 9: Difference between revisions

Revision as of 14:14, 27 October 2010

Richard Brous

Electronic Lab Notebook - Analysis of Vibrio cholerae microarray data project for Week 9

Getting Started

  • Partnered with Zeb (who is using an older gene database)
  • I have selected the newest Vc-Std_External10201022 gene database
    • Extracted to Desktop

Starting GenMAPP and getting organized

  • Launched GenMAPP (thank god it's not looking for the update server!!! =D)
  • Ensure the correct gene database is loaded: Vc-Std_External10201022
    • If not, load the correct one
      • Data > Choose Gene Database menu item to select the Gene Database you need
  • Select the Data menu, then Expression Dataset Manager
    • Select the new dataset, which is the tab-delimited text file formatted for GenMAPP made during week 8.
    • Since the tab-delimited file contains only data, DO NOT select any column to exclude in the Data Type Specification window
    • Let the Expression Dataset Manager convert the data
      • This may take a while, so don't assume the system has hung; there is only minimal on-screen feedback
        • A completed conversion leaves the data active in the Expression Dataset Manager window AND creates a converted file named *.gex in the same location as your tab-delimited source file.
      • Errors (almost certainly) occurred where the Expression Dataset Manager could not convert one or more lines of data
        • An exception file is created which contains all the raw data plus an added column called "~Error~"
          • Each entry in that column is either an error message or, if the program found no error, a single space character
        • I found 121 errors out of 5221 records
        • Zeb should get more errors since his gene database is older.
    • Customize the new Expression Dataset by creating new Color Sets
      • Color Sets contain the GenMAPP instructions for displaying data from an Expression Dataset on MAPPs.
        • Create a color set by filling in the fields:
          • Name for the Color Set: pathogenic_vs_lab
          • gene value: Avg_LogFC_all
          • Criteria that determine how a gene object is colored on the MAPP
            • increased gene expression (red)
              • [Avg_LogFC_all] > 0.25 AND [Pvalue] < 0.05
            • decreased gene expression (green)
              • [Avg_LogFC_all] < -0.25 AND [Pvalue] < 0.05
            • Expressions are equivalent to queries performed in PostgreSQL
      • After completing a new criterion, add the criterion entry (label, criterion, and color) to the Criteria List by clicking the Add button
      • Two criteria were created: "Increased" is [Avg_LogFC_all] > 0.25 AND [Pvalue] < 0.05 and "Decreased" is [Avg_LogFC_all] < -0.25 AND [Pvalue] < 0.05
      • More criteria can always be added by following the previous steps
      • Save the entire Expression Dataset by selecting Save from the Expression Dataset menu
    • Exit Expression Dataset Manager to view the Color Sets on a MAPP and then close it
  • Keep your *.gex file safe prior to wiki upload (save it to a USB drive or email it to yourself)
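
The exception-file check and the two Color Set criteria above can be sketched outside GenMAPP in Python. This is only a rough illustration, not how GenMAPP itself works: the function names, the gene IDs, the sample values, and the "Gene not found" message are all made up for the example. Only the column names (Avg_LogFC_all, Pvalue, ~Error~) and the thresholds come from the notes above.

```python
import csv
import io

def count_errors(exception_text):
    """Count rows whose ~Error~ column holds a message rather than a blank/space."""
    reader = csv.DictReader(io.StringIO(exception_text), delimiter="\t")
    total = 0
    errors = 0
    for row in reader:
        total += 1
        # A single space (or nothing) means no error was reported for the row.
        if (row.get("~Error~") or "").strip():
            errors += 1
    return errors, total

def classify(avg_logfc, pvalue):
    """Apply the 'Increased' / 'Decreased' criteria from the Color Set."""
    if avg_logfc > 0.25 and pvalue < 0.05:
        return "Increased"   # colored red on the MAPP
    if avg_logfc < -0.25 and pvalue < 0.05:
        return "Decreased"   # colored green on the MAPP
    return "No criteria met"

# Hypothetical tab-delimited exception file contents (values are invented).
sample = (
    "ID\tAvg_LogFC_all\tPvalue\t~Error~\n"
    "GENE_A\t0.80\t0.01\t \n"
    "GENE_B\t-0.40\t0.03\t \n"
    "GENE_C\t0.10\t0.20\tGene not found\n"
)

errors, total = count_errors(sample)
print(errors, "errors out of", total, "records")

for row in csv.DictReader(io.StringIO(sample), delimiter="\t"):
    label = classify(float(row["Avg_LogFC_all"]), float(row["Pvalue"]))
    print(row["ID"], label)
```

Both comparisons are strict, so a gene with a log fold change of exactly 0.25 meets neither criterion. For the counts I recorded, 121 / 5221 works out to about 2.3% of records.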

MAPPFinder Procedure

  • Zeb and I sat in the front row, so our assigned criterion was: INCREASED
  • Launch the MAPPFinder program directly, or from within GenMAPP select Tools -> MAPPFinder
  • Ensure the correct Gene Database is loaded!!!
    • If not, choose File -> Choose Gene Database and select the correct one.
      • Located in: C:\GenMAPP 2 Data\Gene Databases\
  • Press "Calculate New Results" button
    • Select the *.gex file you created previously.
      • RAB_10_23_2010_Merrell_Compiled_Raw_Data_Vibrio.gex
    • Click OK
    • Choose the Color Set and Criteria with which to filter the data
      • Chose Increased since that is what we were assigned
      • Check Gene Ontology and Calculate p values
      • Click the "Browse" button and create a meaningful filename for the results
      • Click "Run MAPPFinder".
        • The analysis will take several minutes.
          • It may look like the computer has stalled, but the hourglass should be on the screen, indicating it's working
  • When the results have been calculated, a Gene Ontology browser will open showing your results
    • All of the Gene Ontology terms that have at least 3 genes measured and a p value of less than 0.05 will be highlighted yellow.
      • A term with a p value less than 0.05 is considered a "significant" result.
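
The highlighting rule described above (at least 3 genes measured and a p value below 0.05) can be written as a small Python filter. This is a sketch under assumptions: the function name, the term labels, and all the numbers are invented for illustration and are not my MAPPFinder results. Sorting by p value is just one convenient way to order the survivors and may not match how MAPPFinder ranks terms.

```python
def is_highlighted(genes_measured, p_value, min_genes=3, alpha=0.05):
    """Return True when a term meets both thresholds from the notes."""
    return genes_measured >= min_genes and p_value < alpha

# Made-up records: (term label, genes measured, p value)
terms = [
    ("term A", 12, 0.001),
    ("term B", 2, 0.010),   # too few genes measured
    ("term C", 8, 0.200),   # p value too high
    ("term D", 3, 0.049),
]

# Keep only the terms that would be highlighted, smallest p value first.
significant = sorted(
    (t for t in terms if is_highlighted(t[1], t[2])),
    key=lambda t: t[2],
)

for label, n, p in significant:
    print(label, n, p)
```

Note that the p value test is strict: a term with p exactly equal to 0.05 would not be highlighted under this reading of the rule.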

Evaluating MAPPFinder results

  • List most significant Gene Ontology terms
    • Click on "Show Ranked List" from Menu Bar
  • Top 10 Gene Ontology terms:
    • branched chain family amino acid metabolic process
    • branched chain family amino acid biosynthetic process
    • IMP biosynthetic process
    • IMP metabolic process
    • arginine metabolic process
    • cellular nitrogen compound biosynthetic process
    • leucine biosynthetic process
    • leucine metabolic process
    • amine biosynthetic process
    • arginine biosynthetic process