J'aime C. Moehlman's Week 12

From OpenWetWare

Vibrio cholerae Data Analysis

Normalize the log ratios for the set of slides in the experiment

  • We added a new worksheet to our Excel file.
  • We pasted all of the compiled raw data into the scaled_centered worksheet.
  • We inserted two rows at the top of the worksheet (below the titles and above the data).
  • In cell A2 we typed "Average" and in cell A3 we typed "StdDev".
  • We computed the average log ratio for each chip (each column of data), starting with column B and the Excel formula "=AVERAGE(B4:B5224)".
  • Following that example, we computed the average for the rest of the columns.
  • We followed the same steps to compute the standard deviation of the log ratios, using the formula "=STDEV(B4:B5224)" for column B and then the equivalent formula for all of the other columns.
  • We inserted a new column to the right of each patient sample column (A1-A4, B1-B4, C1-C4) and labeled each one with the sample name followed by _scaled_centered (for example, A1_scaled_centered).
  • In cell C4 we entered the formula "=(B4-$B$2)/$B$3" and filled it down the rest of the column. We then did the same for each of the other new, empty columns.
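
The scaling and centering steps above can also be sketched outside Excel. The following is a minimal Python illustration, not the procedure we actually ran: the function name `scale_and_center` and the five sample values are assumptions made up for the example. It subtracts the column average from each log ratio and divides by the sample standard deviation, which is what "=(B4-$B$2)/$B$3" does for one chip.

```python
from statistics import mean, stdev


def scale_and_center(log_ratios):
    """Return (x - average) / standard deviation for every value in one column.

    Mirrors the spreadsheet formula =(B4-$B$2)/$B$3 applied down a column.
    """
    avg = mean(log_ratios)   # same role as =AVERAGE(B4:B5224)
    sd = stdev(log_ratios)   # sample standard deviation, like Excel's STDEV
    return [(x - avg) / sd for x in log_ratios]


# Hypothetical log ratios for a single chip (illustration only, not our data).
chip = [0.5, -1.0, 1.5, 0.0, -1.0]
scaled = scale_and_center(chip)

# After scaling and centering, the column has average 0 and standard deviation 1.
print(round(mean(scaled), 6), round(stdev(scaled), 6))
```

Applying the same function to every chip column puts all of the chips on a common scale, which is the point of the normalization step.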

Perform Statistical Analysis on the Ratios

  • We created a new worksheet called "statistics".
  • We copied all of the gene IDs into column A of this new worksheet.
  • We copied the values from the A1_scaled_centered worksheet.
  • We created 3 new columns to show the log values.
  • We created a new sheet formatted specifically for GenMAPP.

GenMAPP worksheets

Sanity Check

  • p values less than 0.05: 5
  • p values less than 0.01: 0
  • p values less than 0.001: 0
  • p values less than 0.0001: 0
  • Keeping the "Pvalue" filter at p < 0.05, filter the "Avg_LogFC_all" column to show all genes with an average log fold change greater than zero. How many are there?
    • 4
  • Keeping the "Pvalue" filter at p < 0.05, filter the "Avg_LogFC_all" column to show all genes with an average log fold change less than zero. How many are there?
    • 1
  • There were 1617 log fold changes between -0.25 and 0.25.
  • Merrell et al. used the p value as the criterion for determining a significant change in gene expression.
  • VC0028 has a p value of 0.325668
  • VC0941 has a p value of about 0.73
  • VC0869 has a p value of about 0.46
  • VC0051 has a p value of about 0.28
  • VC0647 has a p value of about 0.45
  • VC0468 has a p value of about 0.83
  • VC2350 has a p value of about 0.18
  • VCA0583 has a p value of about 0.29
    • These p values vary widely, from about 0.18 to about 0.83. None of them falls below the 0.05 cutoff used above, so by that cutoff none of these genes shows a significant change in our analysis.
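
The counting and filtering in this sanity check can be sketched in Python as well. This is an illustration under stated assumptions, not the Excel filtering we performed: the function names `count_below` and `split_by_sign` are made up for the example, and the three rows named geneA, geneB, and geneC are hypothetical. The eight p values are the ones quoted in the list above (rounded as quoted).

```python
def count_below(p_values, cutoffs=(0.05, 0.01, 0.001, 0.0001)):
    """Count how many p values fall strictly below each cutoff."""
    p_values = list(p_values)
    return {c: sum(1 for p in p_values if p < c) for c in cutoffs}


def split_by_sign(rows, alpha=0.05):
    """Among rows with p < alpha, separate positive from negative log fold changes.

    Each row is (gene_id, p_value, avg_log_fc).
    """
    significant = [r for r in rows if r[1] < alpha]
    increased = [gene for gene, _, fc in significant if fc > 0]
    decreased = [gene for gene, _, fc in significant if fc < 0]
    return increased, decreased


# p values quoted above for the genes we looked up.
listed = {
    "VC0028": 0.325668, "VC0941": 0.73, "VC0869": 0.46, "VC0051": 0.28,
    "VC0647": 0.45, "VC0468": 0.83, "VC2350": 0.18, "VCA0583": 0.29,
}
counts = count_below(listed.values())
print(counts)

# Hypothetical rows (made-up gene names and values, for illustration only).
rows = [("geneA", 0.03, 0.8), ("geneB", 0.04, -0.6), ("geneC", 0.20, 1.1)]
increased, decreased = split_by_sign(rows)
print(increased, decreased)
```

Running the same two functions over the full "Pvalue" and "Avg_LogFC_all" columns would reproduce the counts reported in the bullets, assuming the columns are exported from the statistics worksheet.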