Dahlquist:Edge Software Protocol

From OpenWetWare
Revision as of 11:40, 14 October 2008 by Kam D. Dahlquist (talk | contribs) (→‎Analyzing Data with Edge)

This protocol is for running the Edge software for DNA microarray data analysis.

  • Click here to see the Edge manual.
  • Click here to link to the Dahlquist Lab Notebook for Microarray Data Analysis.

Notes on Getting Edge to Run

  • The Edge software would not load data files on Windows, so Kevin wrote to the discussion group for suggestions. Click here to see the thread. He was told to apply the following fix in a Linux installation:
    1. Open the file "knnimpute.r" in a text editor.
    2. Go to lines 129 and 168; each should read 'PACKAGE="impute"'.
    3. Change each to 'PACKAGE="knnimpute"'.
    4. Save the file, then start Edge and try again.
  • This fixed the problem, so we are now running Edge in a Linux environment.
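
The manual edit in steps 1 to 4 could also be scripted from R. The sketch below is only an illustration: patch_package_name is a hypothetical helper, not part of Edge, and it assumes the text appears exactly as PACKAGE="impute" on the lines named above. It writes a ".bak" copy first and leaves every other line alone.

```r
# Hypothetical helper (not part of Edge): change PACKAGE="impute" to
# PACKAGE="knnimpute" on the given lines of a file.
patch_package_name <- function(path, lines = c(129, 168)) {
  src <- readLines(path)
  # Keep a copy of the untouched file next to the original.
  file.copy(path, paste0(path, ".bak"), overwrite = TRUE)
  # Only touch requested lines that exist and contain the old string.
  lines <- lines[lines <= length(src)]
  hit <- lines[grepl('PACKAGE="impute"', src[lines], fixed = TRUE)]
  src[hit] <- sub('PACKAGE="impute"', 'PACKAGE="knnimpute"', src[hit], fixed = TRUE)
  writeLines(src, path)
  invisible(hit)  # line numbers that were actually changed
}

# Usage, from the directory that contains knnimpute.r:
# patch_package_name("knnimpute.r")
```

If the function returns fewer than two line numbers, the file probably differs from the version described here and is best checked by hand.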

Directions for Launching Edge at the Keck Lab

  • Log in to mason with your Keck lab username (the names of the machines are shown in the lower-left corner of the login screens).
  • Right-click on the green tabula rasa.
  • Choose Terminal.
  • Type:
cd Desktop/edge_1.1.290
R
  • At this point, the R prompt shows up. Type:
source("edge.r")
edge()
  • The Edge GUI should now appear.

Analyzing Data with Edge

  • Create two tab-delimited text files for "genes" and "covariates".
  • Load both into an Edge session.
  • Select "Impute Missing Data" from the menu. Calculate Percent Missing Data. The results are:
    • Percent of genes missing data:
    • Percent of arrays missing data:
    • Overall percent of missing data:
  • For KNN Parameters, set:
    • Percent of missing values to tolerate in a gene: 100 (so all genes included)
    • Number of nearest neighbors to use (maximum of 15): 15
    • Click GO to impute missing data.
  • Select "Identify Differentially Expressed Genes".
    • Class variable is:
    • Covariate giving time points is:
    • Covariate corresponding to individuals is:
    • Choose spline type: accept the default of Natural Cubic Spline, dimension 4.
    • Number of null iterations: set to 1000.
    • Choose a seed for reproducible results: set to 47.
    • 1000 permutations appears to take about 10 minutes.
    • Save results as:
  • To save the plots, run the following command in the R console window:
savePlot(filename = "PvalHistogram_wt-vs-dCIN5", type = c("png"), device = dev.cur())
  • This saves the active plot window under the file name you choose.
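
The three percent-missing figures reported in the "Impute Missing Data" step can be cross-checked in plain R. This is a minimal sketch, not Edge's own code. It assumes rows are genes, columns are arrays, missing values are NA, and that a gene or array counts as "missing data" when it has at least one NA; Edge's exact definitions may differ. The function name percent_missing and the toy matrix are made up for illustration.

```r
# Assumed layout: rows are genes, columns are arrays, missing values are NA.
percent_missing <- function(expr) {
  miss <- is.na(expr)
  c(
    genes   = 100 * mean(rowSums(miss) > 0),  # genes with at least one NA
    arrays  = 100 * mean(colSums(miss) > 0),  # arrays with at least one NA
    overall = 100 * mean(miss)                # NA cells out of all cells
  )
}

# Toy 4-gene x 3-array matrix, invented for illustration only.
toy <- matrix(c(1.2, NA,  0.3,
                0.8, 0.1, 0.5,
                NA,  NA,  1.1,
                0.4, 0.9, 0.7), nrow = 4, byrow = TRUE)
percent_missing(toy)

# With real data (hypothetical file name and layout):
# expr <- as.matrix(read.delim("genes.txt", row.names = 1))
# percent_missing(expr)
```

If these numbers disagree with what Edge reports, the likely cause is a different definition of a "missing" gene or array rather than an error in the data.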