BIOL368/F14:Chloe Jones Week 13: Difference between revisions

From OpenWetWare
Jump to navigationJump to search
(→‎Scaling and Centering the Data: fixed worksheet reference)
(→‎Scaling and Centering the Data: edited column names)
Line 28: Line 28:
* Excel will now do some work for you.  Copy these two equations (cells B2 and B3) and paste them into the empty cells in the rest of the columns.  Excel will automatically change the equation to match the cell designations for those columns.
* Excel will now do some work for you.  Copy these two equations (cells B2 and B3) and paste them into the empty cells in the rest of the columns.  Excel will automatically change the equation to match the cell designations for those columns.
* You have now computed the average and standard deviation of the log ratios for each chip.  Now we will actually do the scaling and centering based on these values.
* You have now computed the average and standard deviation of the log ratios for each chip.  Now we will actually do the scaling and centering based on these values.
* Copy the column headings for all of your data columns and then paste them to the right of the last data column so that you have a second set of headers above blank colums of cells.  Edit the names of the columns so that they now read:  A1_scaled_centered, A2_scaled_centered, etc.
* Copy the column headings for all of your data columns and then paste them to the right of the last data column so that you have a second set of headers above blank colums of cells.  Edit the names of the columns so that they now read:  <previous name>_scaled_centered, etc.
* In cell N4, type the following equation:
* In cell N4, type the following equation:
  =(B4-B$2)/B$3
  =(B4-B$2)/B$3

Revision as of 16:20, 19 November 2014

Calculating fold change

  • Here is the Overton_MicroarrayData_20141119.xlsx used for this analysis of MRSA with Ranalexin.
    • Start from the sheet called "SAR_only"
  • Step. 1 Rename the columns with the item isolated, biological replicate, and dye used. For each array data file there will be 2 columns: signal and background.
  • Step. 2 Minus the background from the signal for each data file. Then take the answers from each replicate and divide it by its corresponding dye (i.e. Cy5 Ranalexin(B1)/Cy3 MRSA252 (B1)). NOw, the fold change is calculated.
  • Step 3. Take the log2 of each fold change.
=LOG(number, base)
for example, =LOG(A1,2) takes the log2 of the number in cell A1.
  • Fix the dye-swapped samples so that the orientation is alwasys ranalexin/control by multiplying the log fold changes of the samples where the control was labeled with Cy5 by negative 1 (-1).
=(-1)*A1 would multiply the number in cell A1 by negative 1.
  • create a new worksheet. Copy and paste the following columns into the new sheet, ID, index, the log fold changes of the non-dye-swapped samples and the fold changes of the dye-swapped samples that have been flipped. Be sure to use the option, Paste special > paste values instead of a regular copy and paste.
  • there are probably error messages in certain cells after you have done these calculations. We need to replace all of these with a single space character using the Find/replace option. Record the number of replacements you make.

Scaling and Centering the Data

To scale and center the data (between chip normalization) perform the following operations:

  • Insert a new Worksheet into your Excel file, and name it "scaled_centered".
  • Go back to your previous worksheet with the log fold changes, Select All and Copy. Go to your new "scaled_centered" worksheet, click on the upper, left-hand cell (cell A1) and Paste.
  • Insert two rows in between the top row of headers and the first data row.
  • In cell A2, type "Average" and in cell A3, type "StdDev".
  • You will now compute the Average log ratio for each chip (each column of data). In cell B2, type the following equation:
=AVERAGE(B4:B5483)
and press "Enter". Excel is computing the average value of the cells specified in the range given inside the parentheses. Instead of typing the cell designations, you can click on the beginning cell, scroll down to the bottom of the worksheet, and shift-click on the ending cell.
  • You will now compute the Standard Deviation of the log ratios on each chip (each column of data). In cell B3, type the following equation:
=STDEV(B4:B5483)
and press "Enter".
  • Excel will now do some work for you. Copy these two equations (cells B2 and B3) and paste them into the empty cells in the rest of the columns. Excel will automatically change the equation to match the cell designations for those columns.
  • You have now computed the average and standard deviation of the log ratios for each chip. Now we will actually do the scaling and centering based on these values.
  • Copy the column headings for all of your data columns and then paste them to the right of the last data column so that you have a second set of headers above blank colums of cells. Edit the names of the columns so that they now read: <previous name>_scaled_centered, etc.
  • In cell N4, type the following equation:
=(B4-B$2)/B$3
In this case, we want the data in cell B4 to have the average subtracted from it (cell B2) and be divided by the standard deviation (cell B3). We use the dollar sign symbols in front of the "2" and "3" to tell Excel to always reference that row in the equation, even though we will paste it for the entire column. Why is this important?
  • Copy and paste this equation into the entire column.
  • Copy and paste the scaling and centering equation for each of the columns of data with the "_scaled_centered" column header. Be sure that your equation is correct for the column you are calculating.