Tessa A. Morris Electronic Lab Notebook

From OpenWetWare

Microarray Data Analysis (May 18-26, 2015)

Tessa A. Morris General Microarray Data Analysis

For the first week and a half of SURP 2015, the whole class went through the process of microarray data analysis. The methods and observations are recorded here.

16 Test Files from Dahlquist-data (May 26-28, 2015) (June 3, 2015) (June 22, 2015)

GRNmap Testing Report 16 Test Files from Dahlquist-data 2015-05-26 TM

  • Follow the protocol described by Dr. Dahlquist to prepare the data and select a list of candidate transcription factors to test. The process that was performed by this student is explained here.

Creating the General Input Excel Workbook for Each Model

  1. Download the general input sheet from Github.
  2. Naming convention is (#-genes)_(#-edges)_(data-source-model-used)_(forward vs. estimation)_(fixb-1 vs. fixb-0)_(fixP-1 vs. fixP-0)_(graph vs. no graph)
    • For example: Sigmoidal, Estimate + Forward, Estimate b (b=0) and Estimate P (P=0), Graph is named 22-genes_47-edges_Dahlquist-data_Sigmoidal_estimation_fixb-0_fixP-0_graph
  3. For all input sheets:
    • Copy the transposed matrix from your "network" sheet and paste it into the worksheets called "network" and "network_weights".
    • Note that the transcription factor names have to be in the same order and same format across the top row and first column. CIN5 does not match Cin5p, so the latter will need to be changed to CIN5 if you have not already done so.
    • It may be easier for you if you put the transcription factors in alphabetical order (using the sort feature in Excel), but whether you leave your list the same as it is from the YEASTRACT assignment or in alphabetical order, make sure it is the same order for all of the worksheets.
  4. The next worksheet to edit is the one called "degradation_rates".
    • Paste your list of transcription factors from your "network" sheet into the column named "StandardName". You will need to look up the "SystematicName" of your genes. YEASTRACT has a feature that will allow you to paste your list of standard names in to retrieve the systematic names here.
    • Next, you will need to look up the degradation rates for your list of transcription factors. These rates have been calculated from protein half-life data from a paper by Belle et al. (2006). Look up the rates for your transcription factors from this file and include them in your "degradation_rates" worksheet.
    • If a transcription factor does not appear in the file above, use the value "0.027182242" for the degradation rate.
  5. The next worksheet to edit is the one called "production_rates".
    • Paste the "SystematicName" and "StandardName" columns from your "degradation_rates" sheet into the "production_rates" sheet.
    • The initial guesses for the production rates we are using for the model are two times the degradation rate. Compute these values from your degradation rates and paste the values into the column titled "ProductionRate".
  6. Next you will input the expression data for the wild type strain and one other strain (dcin5, dgln3, dhap4, dhmo1, dzap1, or spar; note that we can't use dswi4 because it only has 2 cold shock timepoints). You need to include only the data for the genes in your network, in the same order as they appear in the other worksheets.
    • Put the wild type data in the sheet called "wt".
    • The sample spreadsheet has a worksheet named "dcin5". Change this name to match the strain you are using (listed above). The instructions below should be followed for each strain sheet.
    • Paste the SystematicName and StandardName columns from one of your previous sheets into this one.
    • The data in this sheet are the log fold changes for each replicate and each timepoint from the "Rounded_Normalized_Data" worksheet of the big Excel workbook in which you computed the statistics. We are only going to use the cold shock timepoints for the modeling. Thus your column headings for the data should be "15", "30", and "60". There will be multiple columns for each timepoint (typically 4) to represent the replicate data, but they will all have the same name. For example, you may have four columns with the header "15".
    • Copy and paste the data from your spreadsheet into this one. You need to include only the data for the genes in your network. Make sure that the genes are in the same order as in the other sheets.
  7. The last sheet that will be identical is "network_b".
    • Paste in the list of standard names for your transcription factors from one of your previous sheets. Note that this sheet does not have a column for the Systematic Name.
    • The "threshold" value for each gene should be "0".
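The rate bookkeeping in steps 4 and 5 can be sketched in Python as a cross-check on the values typed into the workbook. This is only an illustration: the function name, the dictionary of looked-up rates, and the example numbers are hypothetical; only the fallback value 0.027182242 and the "two times the degradation rate" rule come from the protocol above.

```python
# Fallback degradation rate given in the protocol for transcription
# factors that do not appear in the half-life file.
DEFAULT_DEGRADATION_RATE = 0.027182242


def build_rate_tables(standard_names, known_degradation_rates):
    """Return (degradation, production) dicts keyed by standard name.

    standard_names: gene names in the same order as on the "network" sheet.
    known_degradation_rates: hypothetical dict of rates looked up by hand.
    """
    degradation = {}
    production = {}
    for name in standard_names:
        # Use the looked-up rate when available, otherwise the fallback.
        rate = known_degradation_rates.get(name, DEFAULT_DEGRADATION_RATE)
        degradation[name] = rate
        # Initial guess for the production rate: twice the degradation rate.
        production[name] = 2 * rate
    return degradation, production


# Made-up example rates, for illustration only.
example_rates = {"CIN5": 0.05, "GLN3": 0.01}
deg, prod = build_rate_tables(["CIN5", "GLN3", "ZAP1"], example_rates)
```

Because the dictionaries keep insertion order, the rows come out in the same order as the gene list passed in, which matches the requirement that every worksheet uses the same gene order.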

Editing the Optimization Parameters for each Input

  • There are sixteen different input sheets that need to be tested. The "optimization parameters" worksheet is adjusted for each type. For all inputs, the following optimization parameters will be the same:
    • "alpha" should be 0.01
    • "kk_max" should be 1
    • "MaxIter" should be 1e08
    • "TolFun" should be 1e-6
    • "MaxFunEval" should be 1e08
    • "TolX" should be 1e-6
    • The parameter "time" (Cell A13) should have "15", "30", and "60"
    • For the parameter "Strain" (Cell A14), make sure it says "dcin5", making sure that the capitalization and spelling is the same as the worksheet containing that strain's expression data.
    • For the parameter "Sheet" (Cell A15), give the number of the worksheet from left to right that your "Strain" log2 expression data is in. This should be the fourth sheet.
    • For the parameter "Deletion", leave the zero in cell B15 (corresponding to wt). In cell C15, put a number corresponding to the position in the list of gene names that the gene that was deleted appears. This should be the number three in the list (disregard the column header in this count and only consider the actual gene names themselves).
    • For the parameter, "simtime", perform the forward simulation of the expression in five minute increments from 0 to 60 minutes. Thus, this row should read: simtime should be 0, 5, <...fill by steps of 5...>, 60, each number in a different cell.
  • The parameters "Sigmoid","estimateParams", "makeGraphs","fix_P", and "fix_b" will be different for each input sheet.
  1. Sigmoidal: Estimate + Forward, Estimate b and Estimate p, Graph
    • "Sigmoid" should be 1
    • "estimateParams" should be 1
    • "makeGraphs" should be 1
    • "fix_P" should be 0
    • "fix_b" should be 0
  2. Sigmoidal: Estimate + Forward, Estimate b and Estimate p, No Graph
    • "Sigmoid" should be 1
    • "estimateParams" should be 1
    • "makeGraphs" should be 0
    • "fix_P" should be 0
    • "fix_b" should be 0
  3. Sigmoidal: Estimate + Forward, Estimate b and Fix p, Graph
    • "Sigmoid" should be 1
    • "estimateParams" should be 1
    • "makeGraphs" should be 1
    • "fix_P" should be 1
    • "fix_b" should be 0
  4. Sigmoidal: Estimate + Forward, Estimate b and Fix p, No Graph
    • "Sigmoid" should be 1
    • "estimateParams" should be 1
    • "makeGraphs" should be 0
    • "fix_P" should be 1
    • "fix_b" should be 0
  5. Sigmoidal: Estimate + Forward, Fix b and Estimate p, Graph
    • "Sigmoid" should be 1
    • "estimateParams" should be 1
    • "makeGraphs" should be 1
    • "fix_P" should be 0
    • "fix_b" should be 1
  6. Sigmoidal: Estimate + Forward, Fix b and Estimate p, No Graph
    • "Sigmoid" should be 1
    • "estimateParams" should be 1
    • "makeGraphs" should be 0
    • "fix_P" should be 0
    • "fix_b" should be 1
  7. Sigmoidal: Estimate + Forward, Fix b and Fix p, Graph
    • "Sigmoid" should be 1
    • "estimateParams" should be 1
    • "makeGraphs" should be 1
    • "fix_P" should be 1
    • "fix_b" should be 1
  8. Sigmoidal: Estimate + Forward, Fix b and Fix p, No Graph
    • "Sigmoid" should be 1
    • "estimateParams" should be 1
    • "makeGraphs" should be 0
    • "fix_P" should be 1
    • "fix_b" should be 1
  9. Sigmoidal: Forward only, Graph
    • "Sigmoid" should be 1
    • "estimateParams" should be 0
    • "makeGraphs" should be 1
    • "fix_P" should be 0
    • "fix_b" should be 1
  10. Sigmoidal: Forward only, No Graph
    • "Sigmoid" should be 1
    • "estimateParams" should be 0
    • "makeGraphs" should be 0
    • "fix_P" should be 0
    • "fix_b" should be 1
  11. Michaelis Menten: Estimate + Forward, Fix p, Graph
    • "Sigmoid" should be 0
    • "estimateParams" should be 1
    • "makeGraphs" should be 1
    • "fix_P" should be 1
    • "fix_b" should be 0
  12. Michaelis Menten: Estimate + Forward, Fix p, No Graph
    • "Sigmoid" should be 0
    • "estimateParams" should be 1
    • "makeGraphs" should be 0
    • "fix_P" should be 1
    • "fix_b" should be 0
  13. Michaelis Menten: Estimate + Forward, Estimate p, Graph
    • "Sigmoid" should be 0
    • "estimateParams" should be 1
    • "makeGraphs" should be 1
    • "fix_P" should be 0
    • "fix_b" should be 0
  14. Michaelis Menten: Estimate + Forward, Estimate p, No Graph
    • "Sigmoid" should be 0
    • "estimateParams" should be 1
    • "makeGraphs" should be 0
    • "fix_P" should be 0
    • "fix_b" should be 0
  15. Michaelis Menten: Forward only, Graph
    • "Sigmoid" should be 0
    • "estimateParams" should be 0
    • "makeGraphs" should be 1
    • "fix_P" should be 0
    • "fix_b" should be 0
  16. Michaelis Menten: Forward only, No Graph
    • "Sigmoid" should be 0
    • "estimateParams" should be 0
    • "makeGraphs" should be 0
    • "fix_P" should be 1
    • "fix_b" should be 0
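As a rough aid for keeping the sixteen workbooks straight, the naming convention and the "simtime" row can be sketched in Python. The function below is hypothetical; the name pattern follows the example given with the naming convention, while dropping the fixb part for Michaelis Menten and both fix parts for forward-only runs is an assumption based on the file names used elsewhere in this notebook.

```python
def input_workbook_name(prefix, sigmoid, estimate_params, make_graphs, fix_p, fix_b):
    """Build a workbook name from the optimization flags (each 0 or 1)."""
    parts = [prefix, "Sigmoidal" if sigmoid else "MM"]
    if estimate_params:
        parts.append("estimation")
        if sigmoid:
            # Assumption: only sigmoidal names carry the fixb part.
            parts.append(f"fixb-{fix_b}")
        parts.append(f"fixP-{fix_p}")
    else:
        # Assumption: forward-only names carry no fixb/fixP parts.
        parts.append("forward")
    parts.append("graph" if make_graphs else "no-graph")
    return "_".join(parts)


# "simtime" row: 0 to 60 minutes in five minute increments.
simtime = list(range(0, 61, 5))

prefix = "22-genes_47-edges_Dahlquist-data"
name_1 = input_workbook_name(prefix, sigmoid=1, estimate_params=1,
                             make_graphs=1, fix_p=0, fix_b=0)
name_12 = input_workbook_name(prefix, sigmoid=0, estimate_params=1,
                              make_graphs=0, fix_p=1, fix_b=0)
```

Here name_1 corresponds to option 1 in the list above and name_12 to option 12.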

Running GRNmap

You will now finally run the GRNmap model on each input workbook created above.

  1. Download version 1.0.6 of GRNmap from GitHub.
    • Save it into a new folder called "GRNmap" on the Desktop.
    • Unzip the file by right-clicking on it and choosing 7-zip > Extract here.
  2. Open the "GRNmap-1.0.6" folder and open the "matlab" subfolder. Double-click on the file "GRNmodel.m" to open GRNmap in MATLAB 2014b.
  3. Click on the green triangle "Run" button to run the model.
    • You will be prompted by an Open dialog to find your input file that you created in the previous section. Browse and select this input file and click OK.
    • Note that the Open dialog will default to show files of *.xlsx only. If your file is saved as *.xls, you will need to select the drop-down menu to show all files.
    • A window called "Figure 1" will appear. The counter is showing the number of iterations of the least squares optimization algorithm. The top plot is showing the values of all the parameters being estimated. You should see some movement of the diamonds each time the counter iterates.
  4. Once the model has completed its run, plots showing the expression over time for all of the genes in the network will appear if "makeGraphs" was set to 1. The plots will automatically be saved as *.jpg files in the same folder as your input file. Compile the figures into a folder following the naming convention described earlier. Compress this folder by right clicking then selecting "7-zip" and "Add to archive...". Make sure the archive format is "zip."
  5. Upload the .xlsx output sheet and the zipped folder with the output plots onto openwetware.

Compare Input Sheets

  1. Create an empty Excel workbook and make a column for each of the sixteen options.
  2. Create three worksheets, one for the output production rates, one for the weights, and one for the "network_b".
  3. Create a bar chart in order to compare the weights.
    • Divide up the bar charts by chunks of controllers --> targets so it can be easier to visualize patterns.
    • Make sure the scale is the same for all plots (-3 to 3 and -6 to 6 were chosen)
  4. Create bar charts to compare the differences in the "network_b" which may be changed to "threshold_b"
  5. Create GRNsight maps for each, making sure the nodes are in approximately the same order. It is easiest to put well connected nodes towards the center.
  6. Add all maps and charts to a PowerPoint presentation.

Format Input Sheet to Work With Latest GRNmap Version (June 3, 2015)

  1. Change network_b to threshold_b
  2. Change (strain) to (strain)_log2_expression
  3. Delete concentration sigmas sheet (if applicable)
  4. Make sure all names are correct according to the naming convention

Update alpha and optimization parameters (June 22, 2015)

  • For all 16 input sheets change alpha to 0.001
  • Adjust the following parameters
    • MaxIter: 1.00E+06
    • TolFun: 1.00E-05
    • MaxFunEval: 1.00E+06
    • TolX: 1.00E-05
  • Run all 16 and upload their .mat, .xlsx input, .xlsx output, and plots

.xlsx vs. .xls (May 27, 2015)

GRNmap Testing Report .xlsx vs. .xls 2015-05-27

Preparing the input sheets

  1. Download 22-genes_47-edges_Dahlquist-data_MM_estimation_fixP-1_graph and 22-genes_47-edges_Dahlquist-data_Sigmoid_estimation_fixb-1_fixP-1_graph from a previous experiment comparing the sixteen possible options for the input sheet.
  2. Both of these files should be in .xlsx format. Open them in Microsoft Excel, enable editing, and save each as an .xls file.
  3. Upload the input sheets to openwetware.

Running GRNmap

  1. Download the current version of GRNmap from GitHub; in this case we are using version 1.0.6.
    • Save it into a new folder called "GRNmap" on the Desktop.
    • Unzip the file by right-clicking on it and choosing 7-zip > Extract here.
  2. Drag the four input sheets into the "matlab" subfolder of "GRNmap-1.0.6". To keep it organized I created two folders within the "matlab" subfolder called "Need to Run" and "Ran".
  3. Open the "GRNmap-1.0.6" folder and open the "matlab" subfolder. Double-click on the file "GRNmodel.m" to open GRNmap in MATLAB 2014b.
  4. Click on the green triangle "Run" button to run the model.
    • You will be prompted by an Open dialog to find your input file that you created in the previous section. Browse and select this input file and click OK.
    • Note that the Open dialog will default to show files of *.xlsx only. If your file is saved as *.xls, you will need to select the drop-down menu to show all files.
    • A window called "Figure 1" will appear. The counter is showing the number of iterations of the least squares optimization algorithm. The top plot is showing the values of all the parameters being estimated. You should see some movement of the diamonds each time the counter iterates.
  5. Once the model has completed its run, plots showing the expression over time for all of the genes in the network will appear if "makeGraphs" was set to 1. The plots will automatically be saved as *.jpg files in the same folder as your input file. Compile the figures into a folder following the naming convention described earlier. Compress this folder by right clicking then selecting "7-zip" and "Add to archive...". Make sure the archive format is "zip."
    • Note that because the file names in this case are essentially the same, differing only in the file extension, there may be some issues. I added _xls or _xlsx to the names of the folders so they would not overwrite each other. I did not run into any issue with MATLAB because I moved the .mat file into the folder "Ran" after the first run (.xls), but when I tried to move the second output (.xlsx) there was an issue because it had the exact same name. To rectify this, Windows added a (2) to the end of the second .mat file name. The .mat files will likely not be needed in this experiment, so it should not cause problems.
  6. Upload the output sheet and the zipped folder with the output plots onto openwetware.
  7. After the initial run, type the commands "clear all" and "close all" then "GRNmodel" and select the input sheet to run.

Analyzing the Results

  1. The production rate and threshold b values will not be used in the data analysis. The version of Matlab has the previously documented errors with P and b for the Sigmoidal. The Michaelis Menten model does not make use of the term b and also has an error with the production rate.
  2. The simplest way to understand if there is a difference in the weights between .xlsx and .xls is to subtract the matrices. This is done in excel by selecting a blank area with the same dimensions as the two matrices. In the formula bar type =(All of Matrix "xls")-(All of Matrix "xlsx") then press "Control" "Shift" "Enter". These should all be pressed at the same time.
  3. Find and report the maximum of the difference matrices. This is done by typing =MAX(Difference Matrix) in a new cell.
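The same subtraction and maximum can be cross-checked outside Excel. The sketch below uses plain Python lists of rows; the function names and the example weight values are made up for illustration and are not results from this test.

```python
def difference_matrix(a, b):
    """Element-wise a - b for two equally sized matrices (lists of rows)."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same dimensions")
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matrix_max(m):
    """Largest entry of a matrix, like =MAX(range) in Excel."""
    return max(max(row) for row in m)


def matrix_max_abs(m):
    """Largest absolute entry, so negative differences are not missed."""
    return max(max(abs(value) for value in row) for row in m)


# Made-up weight matrices standing in for the .xls and .xlsx outputs.
weights_xls = [[0.0, 1.5], [-0.5, 2.0]]
weights_xlsx = [[0.0, 1.5], [-0.5, 2.0]]

diff = difference_matrix(weights_xls, weights_xlsx)
largest = matrix_max(diff)
```

One caveat with a plain maximum: it only reports the largest signed entry, so a large negative difference would go unnoticed. Checking matrix_max_abs as well is a cheap safeguard.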

16 Test Files (June 1 - 2, 2015)

GRNmap Testing Page

GRNmap Testing Report 16 Test Files 2015-06-01

Selecting the GRNmap Version and Base Input Sheet

  1. Download the GRNmap-beta branch version from Github by downloading the zip file.
  2. Move it to the desktop and unzip by right-clicking on it and choosing 7-zip > Extract here.
  3. Open the unzipped "GRNmap-beta branch" folder and find the folder "test_files" then navigate to "estimation_tests" and open the file "4-genes_6-edges_artificial-data_Sigmoid_estimation.xls"
    • We compared "4-genes_6-edges_artificial-data_Sigmoid_estimation.xls" and "4-genes_6-edges_artificial-data_Sigmoid_forward.xlsx" and found them to contain different data points and different time points, so the "4-genes_6-edges_artificial-data_Sigmoid_estimation.xls" was chosen as the base for the 16 test files.

Editing the Optimization Parameters for each Input

  • There are sixteen different input sheets that need to be tested. The "optimization parameters" worksheet is adjusted for each type. For all inputs, the following optimization parameters will be left as they were originally entered
    • "alpha" should be 1.00E-10
    • "kk_max" should be 1
    • "MaxIter" should be 1e08
    • "TolFun" should be 1e-10
    • "MaxFunEval" should be 1e08
    • "TolX" should be 1e-10
    • The parameter "time" (Cell A13) should have "0.4", "0.8", "1.2", and "1.4"
    • For the parameter "Strain" (Cell A14), make sure it says "dcin5", making sure that the capitalization and spelling is the same as the worksheet containing that strain's expression data.
    • For the parameter "Sheet" (Cell A15), give the number of the worksheet from left to right that your "Strain" log2 expression data is in. This should be the fourth sheet.
    • For the parameter "Deletion", leave the zero in cell B15 (corresponding to wt). In cell C15, put a number corresponding to the position in the list of gene names that the gene that was deleted appears. This should be the number three in the list (disregard the column header in this count and only consider the actual gene names themselves).
    • For the parameter, "simtime", the row should read: 0, 0.1, 0.2, 0.3 <...fill by steps of 0.1...>, 2, each number in a different cell.
  • The parameters "Sigmoid","estimateParams", "makeGraphs","fix_P", and "fix_b" will be different for each input sheet.
  1. 4-genes_6-edges_artificial-data_Sigmoidal_estimation_fixb-0_fixP-0_graph
    • "Sigmoid" should be 1
    • "estimateParams" should be 1
    • "makeGraphs" should be 1
    • "fix_P" should be 0
    • "fix_b" should be 0
  2. 4-genes_6-edges_artificial-data_Sigmoidal_estimation_fixb-0_fixP-0_no-graph
    • "Sigmoid" should be 1
    • "estimateParams" should be 1
    • "makeGraphs" should be 0
    • "fix_P" should be 0
    • "fix_b" should be 0
  3. 4-genes_6-edges_artificial-data_Sigmoidal_estimation_fixb-0_fixP-1_graph
    • "Sigmoid" should be 1
    • "estimateParams" should be 1
    • "makeGraphs" should be 1
    • "fix_P" should be 1
    • "fix_b" should be 0
  4. 4-genes_6-edges_artificial-data_Sigmoidal_estimation_fixb-0_fixP-1_no-graph
    • "Sigmoid" should be 1
    • "estimateParams" should be 1
    • "makeGraphs" should be 0
    • "fix_P" should be 1
    • "fix_b" should be 0
  5. 4-genes_6-edges_artificial-data_Sigmoidal_estimation_fixb-1_fixP-0_graph
    • "Sigmoid" should be 1
    • "estimateParams" should be 1
    • "makeGraphs" should be 1
    • "fix_P" should be 0
    • "fix_b" should be 1
  6. 4-genes_6-edges_artificial-data_Sigmoidal_estimation_fixb-1_fixP-0_no-graph
    • "Sigmoid" should be 1
    • "estimateParams" should be 1
    • "makeGraphs" should be 0
    • "fix_P" should be 0
    • "fix_b" should be 1
  7. 4-genes_6-edges_artificial-data_Sigmoidal_estimation_fixb-1_fixP-1_graph
    • "Sigmoid" should be 1
    • "estimateParams" should be 1
    • "makeGraphs" should be 1
    • "fix_P" should be 1
    • "fix_b" should be 1
  8. 4-genes_6-edges_artificial-data_Sigmoidal_estimation_fixb-1_fixP-1_no-graph
    • "Sigmoid" should be 1
    • "estimateParams" should be 1
    • "makeGraphs" should be 0
    • "fix_P" should be 1
    • "fix_b" should be 1
  9. 4-genes_6-edges_artificial-data_Sigmoidal_forward_graph
    • "Sigmoid" should be 1
    • "estimateParams" should be 0
    • "makeGraphs" should be 1
    • "fix_P" should be 0
    • "fix_b" should be 1
  10. 4-genes_6-edges_artificial-data_Sigmoidal_forward_no-graph
    • "Sigmoid" should be 1
    • "estimateParams" should be 0
    • "makeGraphs" should be 0
    • "fix_P" should be 0
    • "fix_b" should be 1
  11. 4-genes_6-edges_artificial-data_MM_estimation_fixP-1_graph
    • "Sigmoid" should be 0
    • "estimateParams" should be 1
    • "makeGraphs" should be 1
    • "fix_P" should be 1
    • "fix_b" should be 0
  12. 4-genes_6-edges_artificial-data_MM_estimation_fixP-1_no-graph
    • "Sigmoid" should be 0
    • "estimateParams" should be 1
    • "makeGraphs" should be 0
    • "fix_P" should be 1
    • "fix_b" should be 0
  13. 4-genes_6-edges_artificial-data_MM_estimation_fixP-0_graph
    • "Sigmoid" should be 0
    • "estimateParams" should be 1
    • "makeGraphs" should be 1
    • "fix_P" should be 0
    • "fix_b" should be 0
  14. 4-genes_6-edges_artificial-data_MM_estimation_fixP-0_no-graph
    • "Sigmoid" should be 0
    • "estimateParams" should be 1
    • "makeGraphs" should be 0
    • "fix_P" should be 0
    • "fix_b" should be 0
  15. 4-genes_6-edges_artificial-data_MM_forward_graph
    • "Sigmoid" should be 0
    • "estimateParams" should be 0
    • "makeGraphs" should be 1
    • "fix_P" should be 1
    • "fix_b" should be 0
  16. 4-genes_6-edges_artificial-data_MM_forward_no-graph
    • "Sigmoid" should be 0
    • "estimateParams" should be 0
    • "makeGraphs" should be 0
    • "fix_P" should be 1
    • "fix_b" should be 0

Running GRNmap

You will now finally run the GRNmap model on each input workbook created above.

  1. Open MATLAB and navigate to the "matlab" subfolder in the "GRNmap-beta" folder on the desktop
  2. Move the input sheet that you would like to run into this folder.
  3. In MATLAB type the command "GRNmodel" and select the input sheet of study
  4. Once the model has finished running, plots showing the expression over time for all of the genes in the network will appear if "makeGraphs" was set to 1. The plots will automatically be saved as *.jpg files in the same folder as your input file. Compile the figures into a folder following the naming convention described earlier. Compress this folder by right clicking then selecting "7-zip" and "Add to archive...". Make sure the archive format is "zip."
  5. Upload the .xlsx output sheet and the zipped folder with the output plots onto openwetware.
  6. In between each run type "close all" and "clear all" into MATLAB

Updating GRNmap on Github

  1. Download "Git" from this link
  2. Go through the installation process (no changes needed)
  3. Once installed open "Git Bash" from the programs menu on the computer
  4. To have the GRNmap folder to edit on the desktop type cd Desktop then press enter
  5. To clone into GRNmap type git clone https://github.com/kdahlquist/GRNmap.git then press enter
  6. To edit GRNmap, type cd GRNmap then press enter
  7. To edit the Beta version, type git checkout beta
  8. There will now be a file on the desktop titled "GRNmap" where you can put the updated test files and then upload them to github
  9. Create two new folders in the "test_files" subfolder called "sixteen_tests" and "sixteen_tests_output"
    • The "sixteen_tests" contains the input sheets and the "sixteen_tests_output" contains the output sheet.
  10. To update the GRNmap on the web
    • Type git checkout beta then press enter
    • Type git status then press enter; this will show the changes that you made in red
    • Type git add test_files/s then press enter
    • Type git add test_files/s then press tab. The code should then read git add test_files/sixteen_tests/ then press space and type git add test_files/s again then tab so it reads git add test_files/sixteen_tests_output/ then type an underscore to indicate you're done.
    • Type git status then press enter
    • Type git commit -m "add sixteen combinations of input files and their output files" or a description of the change that was made then press enter
      • Unfortunately, I made a typo and wrote "coombinations" instead of "combinations"
    • In order to configure the computer to your user name, type git config --global user.name "github username" then press enter, and then type git config --global user.email "email used for github" then press enter
    • To update anything that was changed while you were working type git pull then press enter
    • To update the website with your changes type git push then press enter and enter your username and password when prompted.
  11. Comment on Issue 74 on Github explaining what was done and change the label to "review requested" and select Dr. Dahlquist.
  12. To remove files, type git rm test_files/sixteen_tests_output/*.xlsx

Adjust Input Sheets to Correct for Change to Sheet Name

  1. Change sheet names "wt" and "dcin5" to "wt_log2_expression" and "dcin5_log2_expression"
  2. Save and rerun on Matlab
    • When the new workbook with updated sheet names was run with the newest beta version (4:39 pm), the following MATLAB error appeared:
Error using *
Inner matrix dimensions must agree.
Error in general_least_squares_error (line 94)
L = L + alpha*(wts')*wts + alpha*(b-bp)'*(b-bp) + alpha*sum(proratep(:))^2;
Error in lse (line 53)
GRNstruct.GRNOutput.lse_0 = general_least_squares_error(initial_guesses);
Error in GRNmodel (line 32)
GRNstruct = lse(GRNstruct);
    • Alpha had been accidentally deleted from the code. Once that error has been fixed, run the model again.
  3. Upload new input and output sheets into github

Bug found in Optimization Parameters

  1. It was discovered that in the optimization parameters sheet of 4-genes_6-edges_artificial-data_Sigmoid_estimation.xls the times were listed as 0.4, 0.8, 1.2, and 1.4 instead of 0.4, 0.8, 1.2, and 1.6.
  2. Correct the bug and save as a new document 4-genes_6-edges_artificial-data_Sigmoid_estimation_updated.xlsx (this name may need to be adjusted later).
  3. Run this new file through MATLAB.
  4. There was also some problem with how MATLAB is reading the time points in K1 of the log2_expression sheets. Retyping "1.2" corrected this issue.
  5. Once the input sheets have been corrected run them in MATLAB and upload the input and output sheets to Github. Upload the input, output, .mat, and zipped plots to openwetware.
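A mismatch like the 1.4 versus 1.6 time point could be caught before running MATLAB with a small consistency check. The Python sketch below is hypothetical: it assumes the "time" row and the column headers of a log2_expression sheet have already been read into lists, that replicate columns repeat the same header, and that a header stored as text rather than a number is one possible cause of a misread time point. The replicate count in the example is made up.

```python
def unique_in_order(values):
    """Drop repeated values while keeping first-seen order."""
    seen = []
    for value in values:
        if value not in seen:
            seen.append(value)
    return seen


def check_time_points(parameter_times, expression_headers):
    """Return a list of problem descriptions; an empty list means no mismatch."""
    problems = []
    # Flag headers that are not numbers (for example a cell typed as text).
    for header in expression_headers:
        if isinstance(header, bool) or not isinstance(header, (int, float)):
            problems.append(f"non-numeric header: {header!r}")
    # Replicate columns share a header, so compare against the unique values.
    header_times = unique_in_order(expression_headers)
    if list(parameter_times) != header_times:
        problems.append(
            f"time row {list(parameter_times)} differs from headers {header_times}"
        )
    return problems


# Two replicate columns per time point, chosen only for illustration.
headers = [0.4, 0.4, 0.8, 0.8, 1.2, 1.2, 1.6, 1.6]
buggy = check_time_points([0.4, 0.8, 1.2, 1.4], headers)
fixed = check_time_points([0.4, 0.8, 1.2, 1.6], headers)
```

With these inputs, buggy contains one message about the differing time row and fixed is empty.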

Check Output Sheet

  1. To check that the output sheets are correct, make an Excel checklist with a column with the sixteen input sheet names and then rows with the following labels: wt_log2_optimized_expression, wt_sigmas, dcin5_log2_optimized_expression, dcin5_sigmas, optimized_production_rates, optimized_threshold_b, network_optimized_weights, and optimization_diagnostics.
  2. Check each output sheet and write "yes" or "no" when appropriate in the appropriate cell.
  3. Report findings both on the github issue and openwetware.
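The yes/no checklist can also be produced programmatically once the worksheet names of an output workbook are known. The sketch below only covers the comparison step; the function name is hypothetical, and how the sheet names are obtained is left open (one option, not used here, would be the sheetnames attribute of a workbook loaded with openpyxl).

```python
# Worksheet labels from the checklist described above.
EXPECTED_OUTPUT_SHEETS = [
    "wt_log2_optimized_expression",
    "wt_sigmas",
    "dcin5_log2_optimized_expression",
    "dcin5_sigmas",
    "optimized_production_rates",
    "optimized_threshold_b",
    "network_optimized_weights",
    "optimization_diagnostics",
]


def checklist_row(sheet_names):
    """Map each expected worksheet label to "yes" or "no"."""
    present = set(sheet_names)
    return {
        label: ("yes" if label in present else "no")
        for label in EXPECTED_OUTPUT_SHEETS
    }


# Made-up example: an output workbook that is missing two sheets.
example_sheets = [name for name in EXPECTED_OUTPUT_SHEETS
                  if name not in ("wt_sigmas", "optimized_threshold_b")]
row = checklist_row(example_sheets)
missing = [label for label, answer in row.items() if answer == "no"]
```

Running checklist_row once per output workbook gives one entry of the checklist for each of the sixteen input sheet names.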

Identical Runs with Same Code Version and Same Input Workbook (June 2, 2015)

GRNmap Testing Report

GRNmap Testing Report Identical Runs with Same Code Version and Same Input Workbook 2015-06-01

Selecting and Naming the Input Sheets

  1. There were four total input sheets compared for this test. They were taken from the "sixteen_tests" subfolder of the GRNmap-beta branch.
  2. The input sheets taken for study were
    • 4-genes_6-edges_artificial-data_Sigmoidal_estimation_fixb-0_fixP-0_graph.xlsx (& copy)
    • 4-genes_6-edges_artificial-data_MM_estimation_fixP-0_graph.xlsx (& copy)
  3. To simplify, the four files to run will be named:
    • 4-genes_6-edges_artificial-data_Sigmoidal_estimation_fixb-0_fixP-0_graph_copy-1.xlsx
    • 4-genes_6-edges_artificial-data_Sigmoidal_estimation_fixb-0_fixP-0_graph_copy-2.xlsx
    • 4-genes_6-edges_artificial-data_MM_estimation_fixP-0_graph_copy-1.xlsx
    • 4-genes_6-edges_artificial-data_MM_estimation_fixP-0_graph_copy-2.xlsx

Running the model in MATLAB

  1. Drag into a folder on the desktop named "GRNmap Testing Report Identical Runs with Same Code Version and Same Input Workbook 2015-06-01" then begin running in MATLAB


When run in MATLAB the following error occurred:
Error using inputcheck (line 40)
Multiple inputs that look like file names: 'C:\Users\Student\Desktop\GRNmap' and 'Testing'.
Error in print (line 156)
[pj, devices, options ] = inputcheck( pj, inputargs{:} );
Error in graphs (line 46)
eval(['print -djpeg ' directory 'figure_' num2str(kk)]);
Error in output (line 6)
GRNstruct = graphs(GRNstruct);
Error in GRNmodel (line 34)
GRNstruct = output(GRNstruct);

  • The problem was that there were spaces in the name of the folder. Rename it to GRNmap_Testing_Report_Identical_Runs_with_Same_Code_Version_and_Same_Input_Workbook_2015-06-01

  2. Make sure to save the input file, output file, .mat output sheet, plots, and counter ("figure 1") and upload to openwetware (change the names of the figures to the appropriate gene before zipping).
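Since the error came from spaces in the folder name, a space-free name can be generated up front. A minimal Python sketch (the function name is hypothetical) that reproduces the rename described above:

```python
def safe_folder_name(name):
    """Replace each run of whitespace with a single underscore."""
    return "_".join(name.split())


original = ("GRNmap Testing Report Identical Runs with Same Code Version "
            "and Same Input Workbook 2015-06-01")
renamed = safe_folder_name(original)
```

The value of renamed matches the underscore version of the folder name given above.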

Computing LSE and Penalty Term

  1. To get the LSE & the penalty term, type the following into MATLAB
Code for LSE:
GRNstruct.GRNOutput.lse_out

Code for Penalty
GRNstruct.GRNOutput.reg_out

  2. Record these values on the wiki, as well as the counter number (the counter figure should be saved manually, named "Counter.jpeg", and zipped into the folder with the other plots).

Analysis Workbook

  1. Use an analysis sheet that was constructed for a different test as a basis for how to compare the different copies.
  2. Delete the plots on all three sheets.
  3. Delete the column headers that you do not need for this comparison.
    • The headings should be Sigmoidal_estimation_fixb-0_fixP-0_graph_copy-1; Sigmoidal_estimation_fixb-0_fixP-0_graph_copy-2; MM_estimation_fixP-0_graph_copy-1; MM_estimation_fixP-0_graph_copy-2; Sigmoidal_estimation_fixb-0_fixP-0 _copy-1_vs_copy-2; MM_estimation_fixP-0 _copy-1_vs_copy-2; Sigmoidal_estimation_fixb-0_fixP-0 _copy-1_vs_copy-2_maximum; MM_estimation_fixP-0 _copy-1_vs_copy-2_maximum, starting at column B, leaving column A the same as in the basis analysis sheet.
    • The columns that have copy-1_vs_copy-2 should contain the difference between the copies (=B2-C2 or =D2-E2)
    • The columns that have maximum should take the maximum of the difference (=MAX(F:F) or =MAX(G:G))
  4. In this case the LSE, Penalty term, and number of iterations were exactly the same, but in the future if they are not, this same method can be used to compare them.
  5. Report findings on openwetware and comment results on Issue 99.

Wednesday Meeting Update (June 3, 2015)

Google Presentation Notes from meeting

  • Verify sigmas (all strains) Issue 102
    • Create a new test folder called "validation_test"
    • Folder should have: input, output, calculate standard deviations
  • Adjust "big" 16 input sheets to have the correct sheet names (etc.) and re-run

.xlsx vs. .xls (June 3, 2015)

GRNmap Testing Report .xlsx vs. .xls 2015-06-03 TM

Selecting and Naming the Input Sheets

  1. There were four total input sheets compared for this test. They were taken from the "sixteen_tests" subfolder of the GRNmap-beta branch.
  2. The input sheets taken for study were
    • 4-genes_6-edges_artificial-data_Sigmoidal_estimation_fixb-0_fixP-0_graph.xlsx (& xls version)
    • 4-genes_6-edges_artificial-data_MM_estimation_fixP-0_graph.xlsx (& xls version)
  3. For simplicity, the four files to run were named:
    • 4-genes_6-edges_artificial-data_Sigmoidal_estimation_fixb-0_fixP-0_graph_xlsx.xlsx
    • 4-genes_6-edges_artificial-data_Sigmoidal_estimation_fixb-0_fixP-0_graph_xls.xls
    • 4-genes_6-edges_artificial-data_MM_estimation_fixP-0_graph_xlsx.xlsx
    • 4-genes_6-edges_artificial-data_MM_estimation_fixP-0_graph_xls.xls
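The renaming rule used here (append the file extension as a suffix before the extension itself) can be written down as a small helper. This is only an illustrative Python sketch; the function name tagged_name is hypothetical and the renaming in this test was done by hand.

```python
from pathlib import PurePath


def tagged_name(file_name):
    """Append the extension as a suffix to the stem, e.g. model.xlsx -> model_xlsx.xlsx."""
    path = PurePath(file_name)
    extension = path.suffix.lstrip(".")
    return f"{path.stem}_{extension}{path.suffix}"


# Base names of the two input sheets selected for this test.
base_names = [
    "4-genes_6-edges_artificial-data_Sigmoidal_estimation_fixb-0_fixP-0_graph",
    "4-genes_6-edges_artificial-data_MM_estimation_fixP-0_graph",
]
run_names = [
    tagged_name(f"{base}.{ext}") for base in base_names for ext in ("xlsx", "xls")
]
for name in run_names:
    print(name)
```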

Running the model in MATLAB

  1. Drag the input sheets into a folder on the desktop named "GRNmap_testing_xlsx_xls_2015-06-03_TM"
  2. Name the counter figure "Counter" and save it as a .jpeg
  3. Rename the plot figures after their corresponding genes, move them to a folder named after the input sheet followed by _plots, and zip the folder by right-clicking it, choosing "7zip" and then "add to archive", and making sure "zip" is selected.
  4. Make sure to save the input file, output file, .mat output sheet, plots, and counter ("figure 1") and upload them to OpenWetWare.
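The zipping in step 3 was done by hand with 7zip. As an alternative, a minimal Python sketch using the standard zipfile module is shown below. The function name zip_plots, the folder name, and the placeholder file names are assumptions made for illustration only.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_plots(plots_folder):
    """Zip every file in plots_folder into <plots_folder>.zip and return the archive path."""
    plots_folder = Path(plots_folder)
    archive_path = plots_folder.parent / (plots_folder.name + ".zip")
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for file_path in sorted(plots_folder.iterdir()):
            if file_path.is_file():
                # Keep the folder name inside the archive so it unpacks into one folder.
                archive.write(file_path, arcname=f"{plots_folder.name}/{file_path.name}")
    return archive_path


# Demonstration in a temporary directory with placeholder files.
with tempfile.TemporaryDirectory() as work_dir:
    folder = Path(work_dir) / "example-input_plots"  # hypothetical "<input sheet>_plots" name
    folder.mkdir()
    for plot_name in ("CIN5.jpg", "Counter.jpeg"):  # placeholder file names
        (folder / plot_name).write_bytes(b"placeholder")
    archive_path = zip_plots(folder)
    with zipfile.ZipFile(archive_path) as archive:
        archived_names = sorted(archive.namelist())
    print(archived_names)
```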

Computing LSE and Penalty Term

  1. To get the LSE and the penalty term, type the following into MATLAB and record the results on the wiki.
    • Code for LSE: GRNstruct.GRNOutput.lse_out
    • Code for penalty term: GRNstruct.GRNOutput.reg_out

Analysis Workbook

  1. Use an analysis workbook that was constructed for a different test as a basis for how to compare the different copies.
  2. Delete the plots on all three sheets.
  3. Delete the column headers that you do not need for this comparison.
    • Starting at column B, and leaving column A the same as in the basis analysis sheet, the headings should be: Sigmoidal_estimation_fixb-0_fixP-0_graph_copy-1; Sigmoidal_estimation_fixb-0_fixP-0_graph_copy-2; MM_estimation_fixP-0_graph_copy-1; MM_estimation_fixP-0_graph_copy-2; Sigmoidal_estimation_fixb-0_fixP-0_copy-1_vs_copy-2; MM_estimation_fixP-0_copy-1_vs_copy-2; Sigmoidal_estimation_fixb-0_fixP-0_copy-1_vs_copy-2_maximum; MM_estimation_fixP-0_copy-1_vs_copy-2_maximum.
    • The columns that have copy-1_vs_copy-2 should contain the difference between the copies (=B2-C2 or =D2-E2)
    • The columns that have maximum should take the maximum of the difference (=MAX(F:F) or =MAX(G:G))
  4. In this case the LSE, Penalty term, and number of iterations were exactly the same, but in the future if they are not, this same method can be used to compare them.
  5. Report the findings on OpenWetWare and comment the results on Issue 99.

Sigmas (June 3, 2015)

GRNmap Testing Report Sigmas 2015-06-03 TM

  • All strains; any of the sixteen; "large"

Executable (June 8, 2015)

GRNmap Testing Report Executable 2015-06-08 TM

  • 16 "small"

GRNmap-beta-bgf (June 8, 2015)

GRNmap Testing Report GRNmap-beta-bgf 2015-06-08 TM

Wednesday Meeting Update (June 10, 2015)

  1. Updated the input sheets for the "large" sixteen test files
    • Changed the sheet names from (strain) to (strain)_log2_expression and from network_b to threshold_b
  2. Verified that GRNmap was calculating the sigmas properly
    • Ran wt, dcin5, dgln3, dhap4, dhmo1, and dzap1
    • Calculated the standard deviations of the log2 fold expression and compared them to the sigmas that GRNmap calculated
    • The largest difference was on the order of 1E-15
  3. Tested the Executable on a computer without MATLAB
    • The executable works as long as the computer has administrator privileges and none of the directory names (folder name or administrator name) have any spaces.
    • Anti-virus software also prevented the download of GRNmap from the GRNmap website (http://kdahlquist.github.io/GRNmap/index.html), so it must be disabled for the download to work. However, the anti-virus software did allow downloads from GitHub (https://github.com/kdahlquist/GRNmap).
    • Each step of the process involving the executable takes a few minutes to work, which may be because it was run on an older computer.
    • The executable still works after the laptop has been shut off, without the program being reinstalled.
    • GRNmap will work on a non-administrator account if it has been downloaded on an administrator account.
    • The executable is not compatible with OS X Yosemite.
    • The executable worked after being uninstalled and then reinstalled.
    • Think about:
      • Packaging a test file with the executable
      • Figuring out which test file to run
        • Took 29:07.18 for 22-genes_47-edges_Dahlquist-data_Sigmoidal_estimation_fixb-0_fixP-0_graph.xlsx
        • Took 3:34.52 for 4-genes_6-edges_Dahlquist-data_Sigmoidal_estimation_fixb-0_fixP-0_graph.xlsx
        • Took 3:43.75 for 22-genes_47-edges_Dahlquist-data_Sigmoidal_estimation_fixb-1_fixP-1_graph.xlsx
        • It was possible to run two at the same time without dramatically affecting the run time. The only observed issue was that the figures overwrite each other.
  4. Tested changes to GRNmap-beta from Dr. Fitzpatrick:
    • optimization_diagnostics sheet with the requested output data.
      • added code functionality to compute min LSE and SSEs of individual genes.
    • reordering of the strain sigma sheets according to Issue #107
    • renamed graph files to correspond to gene names.
    • saved the diagnostic graph to a jpg.
    • fixed a bug in the penalty computation of the production rates.
    • Results:
      • The first time I ran it, the counter got up to 5,000,000 before I force closed it.
      • Increasing the TolX and TolFun values to 1e-6 worked, but increasing them to 1e-8 did not.
      • Decreasing MaxIter and MaxFunEval to 1e5 worked, but decreasing them to 1e6 did not.
      • The optimization_diagnostics sheet had fairly different values between the run with TolX and TolFun increased to 1e-6 and the run with MaxIter and MaxFunEval decreased to 1e5.
      • For both of the experiments that worked (2 and 4):
        • The order of the sheets was correct (wt_log2_optimized_expression, dcin5_log2_optimized_expression, wt_sigmas, dcin5_sigmas, optimized_production_rates, optimized_threshold_b, network_optimized_weights, optimization_diagnostics)
        • The graph files were saved with names corresponding to the gene names
        • The diagnostic graph was automatically saved to a .jpg (named OptimizationDiagnostic.jpg)
      • I repeated the test with fixb-1 and fixP-0. Increasing the TolX and TolFun values to 1e-6 and to 1e-8 both worked, as did decreasing MaxIter and MaxFunEval to 1e6 and to 1e5. The output sheets all had very similar values, except that the optimization_diagnostics values differed between the runs with increased TolX and TolFun and the runs with decreased MaxIter and MaxFunEval.
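The sigma verification described in item 2 above (recomputing the standard deviations and comparing them with the sigmas GRNmap reports) can be sketched as follows. This is a minimal Python sketch under stated assumptions: it assumes the sigma is the sample standard deviation (n-1 denominator) of the replicate log2 fold changes at each time point, which is not confirmed here, and all numbers and the function name max_sigma_difference are hypothetical.

```python
import statistics


def max_sigma_difference(replicates_by_timepoint, reported_sigmas):
    """Return the largest absolute gap between recomputed and reported sigmas."""
    # Assumption: sigma is the sample standard deviation (n-1 denominator).
    recomputed = [statistics.stdev(values) for values in replicates_by_timepoint]
    return max(abs(mine - theirs) for mine, theirs in zip(recomputed, reported_sigmas))


# Hypothetical log2 fold change replicates for one gene at two time points.
replicates = [[0.1, 0.3, 0.2, 0.4], [1.0, 1.5, 1.25, 0.75]]
# Hypothetical sigmas standing in for values read from a (strain)_sigmas sheet;
# the second one is offset slightly to imitate floating-point round-off.
reported = [statistics.stdev(replicates[0]), statistics.stdev(replicates[1]) + 1e-15]

largest_gap = max_sigma_difference(replicates, reported)
print(largest_gap < 1e-12)
```

A gap on the order of 1E-15 is consistent with floating-point round-off rather than with a difference in the calculation itself.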

Setting estimateParams=0

GRNmap Testing Report Setting estimateParams=0 2015-06-22 TM

Perform Multiple Runs on the Same Computer

  1. Select cmd.exe from the start menu.
  2. Once the window appears, type in matlab -automation, which will launch a MATLAB command window.
  3. Once the MATLAB command window opens, press Ctrl+Alt+Delete.
  4. Select "Start Task Manager."
  5. When "Windows Task Manager" appears go to the "Processes" tab.
  6. Find and right click on "MATLAB.exe" and select "Set affinity." Deselect all processors except for the one processor (CPU #) you would like MATLAB to run on and press "OK."
  7. Navigate back to the "MATLAB Command Window" and direct it to the folder that contains the GRNmodel.m file by entering cd followed by the folder path, for example cd C:\Users\Student\Desktop\GRNmap-beta\matlab, and pressing Enter.
    • The path can be found by opening the folder that contains the GRNmodel.m file in Windows Explorer. At the top of the window there is a yellow folder icon next to the names of the folders and subfolders that lead to the file. Click the yellow folder icon and the path will be highlighted in blue, ready to be copied.
  8. Type in GRNmodel and press enter.
  9. It will then ask you to select an input sheet.
  10. Once the input sheet has been selected, MATLAB will start running and the optimization_diagnostic window will appear (named "Figure 1").
  11. Repeat steps 2-10 for each run, selecting a different CPU number each time. Make sure to note which input is running on which processor.
    • If all of the input sheets are in the same folder, the optimization_diagnostic figure and the gene plots from one run will overwrite those from another. To avoid this problem, make a new folder on the desktop for each input sheet. The MATLAB outputs will be saved in that folder, which eliminates the danger of any files being overwritten.
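To keep track of which input is running on which processor, the assignment can be planned before launching anything. The following Python sketch only builds such a plan and does not start MATLAB or change any affinity settings. The function name plan_runs and the dictionary layout are hypothetical; the affinity mask follows the usual convention that bit n corresponds to CPU n.

```python
def plan_runs(input_sheets, first_cpu=0):
    """Pair each input sheet with its own CPU index, affinity mask, and output folder name."""
    plan = []
    for offset, sheet in enumerate(input_sheets):
        cpu = first_cpu + offset
        plan.append({
            "input": sheet,
            "cpu": cpu,
            # Bit n set means CPU n is allowed; formatted as hexadecimal.
            "affinity_mask": format(1 << cpu, "X"),
            # One folder per input sheet so the figures do not overwrite each other.
            "folder": sheet.rsplit(".", 1)[0],
        })
    return plan


sheets = [
    "22-genes_47-edges_Dahlquist-data_Sigmoidal_estimation_fixb-0_fixP-0_graph.xlsx",
    "4-genes_6-edges_Dahlquist-data_Sigmoidal_estimation_fixb-0_fixP-0_graph.xlsx",
]
plan = plan_runs(sheets)
for run in plan:
    print(f"CPU {run['cpu']} (mask {run['affinity_mask']}): {run['input']}")
```

The hexadecimal mask is the form accepted by the /affinity option of the Windows start command, which may be an alternative to setting the affinity through Task Manager; that route was not tested in this notebook.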
