Natalie Williams: Electronic Notebook: Difference between revisions

Revision as of 15:23, 30 April 2015

Natalie Williams: Electronic Notebook

Protocol for MATLAB

This page will help you input data sets from your document into MATLAB, run them, and obtain the output.

Dahlquist:GRNmap

Electronic Notebook

Fall 2014

September 2014

September 18, 2014

Data Set Up
OpenWetWare familiarization: I became familiar with OpenWetWare code and programming.
MATLAB procedure: the written MATLAB procedure contains instructions for running the model to obtain the output with the optimized network weights of the system.

Network
Ten random networks were made from the original network.

The original network Excel file was used, and each cell on the network sheet had the following formula in it:
- =IF(RAND()<0.1134,1,0)
This procedure was done ten times to get ten random networks.
Each network was saved as rand#, where # runs from 1 to 10.
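The Excel step above can also be sketched in Python. This is only an illustration, not the procedure that was actually used: the function name, the fixed seeds, and the 21 x 21 size (inferred from the B2:V22 ranges that appear later in this notebook) are assumptions.

```python
import random

def random_network(n_genes, p=0.1134, seed=None):
    """Return an n_genes x n_genes matrix of 0s and 1s.

    Each cell is 1 with probability p, mirroring =IF(RAND()<0.1134,1,0).
    """
    rng = random.Random(seed)
    return [[1 if rng.random() < p else 0 for _ in range(n_genes)]
            for _ in range(n_genes)]

# Ten random networks, keyed like the saved files rand1 ... rand10.
# The size 21 and the seeds are assumptions made for this sketch.
networks = {f"rand{i}": random_network(21, seed=i) for i in range(1, 11)}
```

Unlike RAND() in Excel, the seeded generator gives the same ten networks on every run, which makes later comparisons repeatable.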

September 25, 2014

The random network Excel files were loaded into MATLAB and run to get the optimized weights of each network.

The output is saved under the input file name with _output.xls appended.

After opening the file, the weights of these networks were found on the optimized_network_weights sheet.

Visualization of the Networks
These files had to be re-saved as .xlsx in order to upload them to GRNsight. GRNsight visualizes the networks, and because the weights vary with how strongly one gene controls another, the resulting output has different colors. After each individual random network was visualized, it was compared to the original network. For better analysis, the proteins were kept in the same order so that the differences in connectivity could be seen.

October 2014

October 2, 2014

The following information was gathered by comparing the original network to the 10 random ones:
Nodes: the positive and the negative
Frequencies: the In and Out degrees. These were how often one gene controlled other genes. They were found through the following equations:

=COUNTIF(B2:B22,"<>"&0)
=COUNTIF(B3:V3,"<>"&0)
From this, the frequencies were found by looking at how often 0 appeared or 1 appeared, etc.
- For example, =COUNTIF(B23:V23,"=0") for In Degree to see how often 0 occurred
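The COUNTIF steps above can be sketched in Python as follows. The function names and the 3-gene matrix are made up for illustration. Which count is the In degree and which is the Out degree depends on the orientation of the matrix, so the sketch only speaks of row counts and column counts. It also assumes there are no blank cells.

```python
from collections import Counter

def nonzero_counts(matrix):
    """Count nonzero entries per row and per column.

    Roughly mirrors =COUNTIF(range,"<>"&0) applied to each row and column.
    """
    row_counts = [sum(1 for w in row if w != 0) for row in matrix]
    col_counts = [sum(1 for w in col if w != 0) for col in zip(*matrix)]
    return row_counts, col_counts

def degree_frequencies(counts):
    """Count how many genes have degree 0, 1, 2, and so on."""
    return Counter(counts)

# Made-up 3-gene weight matrix, for illustration only.
weights = [
    [0.0, 0.5, 0.0],
    [-1.2, 0.0, 0.0],
    [0.3, 0.0, 0.0],
]
row_counts, col_counts = nonzero_counts(weights)
col_frequencies = degree_frequencies(col_counts)
```

Here col_frequencies[0] plays the role of =COUNTIF(B23:V23,"=0") on the row that holds the column counts.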

Next, bar graphs were used to compare the weighted networks between a random network and the original network.
After that, the minimum and maximum values from each random network were found.

The minimum was found using: =MIN(B2:V22)
The maximum was found using: =MAX(B2:V22)

The sum of the entire optimized_network_weights worksheet was found.

=SUM(B2:V22)

The average of the worksheet was also taken for the entire matrix.

=AVERAGE(B2:V22)
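For reference, the four worksheet statistics above (MIN, MAX, SUM, AVERAGE over the whole matrix) can be sketched in Python. The function name and the small sample matrix are assumptions for illustration, and the sketch assumes every cell holds a number.

```python
def matrix_summary(matrix):
    """Return the min, max, sum and average of every cell in a weight matrix."""
    values = [w for row in matrix for w in row]
    total = sum(values)
    return {
        "min": min(values),
        "max": max(values),
        "sum": total,
        "average": total / len(values),
    }

# Made-up 2 x 2 weight matrix, for illustration only.
summary = matrix_summary([[0.0, 0.5], [-1.5, 1.0]])
```

Running the same function on the original network and on each random network gives one row of numbers per network, which is the kind of side-by-side comparison described in this entry.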

We used this information to see if any key factors explained why the original network was the one that we accepted. We hoped that it would shed some light on the key differences between the random networks and the original one.

October 9, 2014

I was out of town, so there was nothing I needed to do specifically for this week.

October 16, 2014

I began the forward simulations of the networks. I had to isolate the deletion strains and see if there was any resemblance between the wild type strain and the four deletion strains.

October 23, 2014

All the bugs in the system were noted and written down to be fixed.
The forward simulations were rerun. The production and degradation rates from the output were inserted into each of the individual strains. For the network weights of the individual strains, the output from the general workbook sheet, optimized_network_weights, was used.
The deletion strains needed hard 0's across their row on their worksheet. On the optimization_parameters sheet, the following things needed to be altered:

iestimate = 0.00E+00
fixed_b = 0
strains:
- wt/3/0
- dcin5/4/3, where the first number is the sheet, and the second number is the row of the gene within the sheet
- This controlled which strains would be shown after the workbook was run through MATLAB

For the network_b sheet of each individual strain, the optimized_network_b sheet from the general workbook output was used.
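The "hard 0's across their row" step for a deletion strain can be sketched in Python. This is a minimal illustration, not the MATLAB code: the function name and the weight values are made up, and the gene order is assumed to match the row order of the worksheet.

```python
import copy

def zero_gene_row(weights, genes, deleted_gene):
    """Return a copy of the weight matrix with the deleted gene's row set to 0."""
    result = copy.deepcopy(weights)
    row = genes.index(deleted_gene)
    result[row] = [0] * len(result[row])
    return result

# Made-up weights for three of the transcription factors, for illustration only.
genes = ["ARG80", "CIN5", "GLN3"]
wt_weights = [
    [0.0, 0.4, 0.0],
    [0.7, 0.0, -0.2],
    [0.0, 0.9, 0.0],
]
dcin5_weights = zero_gene_row(wt_weights, genes, "CIN5")
```

The wild type matrix is left untouched, so the same starting weights can be reused for each of the deletion strains.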

October 30, 2014

The Real WT individual strain was compared to the forward simulation WT and deletion strains.
I made a list of the transcription factors in the individual strains that did not compare well with the real WT individual strain. Those transcription factors were going to be looked at more closely and might be removed. The parts that I compared were the data points and the fit of the line.

November 2014

November 6, 2014

16 transcription factors were taken and run through YEASTRACT. However, the results have to be formatted so that GRNsight can visualize the resulting network.

 The network I used was created with the following transcription factors:
 ARG80
 CIN5
 GLN3
 HAP4
 HMO1
 NRG2
 RSF2
 RTG3
 STB4
 SWI4
 TBF1
 TOS8
 TYE7
 YHP1
 YOX1
 ZAP1

Navigate to Generate Regulation Matrix [[1]] on the YEASTRACT website.
Select the appropriate check boxes for the filters.
Paste the list of transcription factors into the appropriate field.
Paste a list of targets into the Target ORF/Genes field, or check the box to consider all ORF/Genes.
Click the Generate button.
In the results window that appears, click the link to download the Regulation matrix results file as a Semicolon Separated Values (CSV) file.
Once you have downloaded the file, launch Microsoft Excel.
Select the menu item, File > Open and select the file that you downloaded.
Select Column A.
Select the menu item, Data > Text to Columns...
In the first window of the wizard that appears, select the radio button for "Delimited" and click Next.
In the second window of the wizard that appears, check the box for "Other" under "Delimiters" and type a semicolon in the field to the right and click Finish.
Select the menu item, File > Save As. Save the file as an Excel Workbook (.xlsx).
The orientation of the matrix has to be flipped. A new worksheet must be created by clicking on the new worksheet icon at the bottom of the screen. Name this new worksheet "network".
Select the adjacency matrix from the first worksheet and copy it to the clipboard. Go to the "network" worksheet and click on cell A1. Select the menu item Edit > Paste Special. In the window that appears, check the box "Transpose" and click OK.
The labels for the genes in the columns and rows need to match. The "p" at the end of the gene names in the columns must be deleted.
Paste the following text into cell A1 "rows genes affected/cols genes controlling".
Save your work, which is now ready for loading into GRNsight. The original sheet can be deleted if you want.
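The Excel steps above (split on semicolons, transpose, delete the "p" from the column labels, and fill in cell A1) can be sketched in Python. This is only an illustration under assumptions: the function name is hypothetical, the sample text is made up and may not match the exact layout of a real YEASTRACT download, and the result is a list of rows rather than an .xlsx file.

```python
import csv
import io

def yeastract_to_grnsight(text):
    """Transpose a semicolon-separated regulation matrix and fix its labels."""
    # Split each line on semicolons, like Data > Text to Columns.
    rows = [r for r in csv.reader(io.StringIO(text), delimiter=";") if r]
    # Flip the orientation, like Paste Special > Transpose.
    # Assumes the matrix is rectangular.
    transposed = [list(col) for col in zip(*rows)]
    header = transposed[0]
    # Delete the trailing "p" so column labels match the row labels.
    header[1:] = [n[:-1] if n.endswith("p") else n for n in header[1:]]
    # Text for cell A1.
    header[0] = "rows genes affected/cols genes controlling"
    return transposed

# Made-up 2-gene sample, for illustration only.
sample = "\n".join([
    ";CIN5;GLN3",
    "CIN5p;0;1",
    "GLN3p;1;0",
])
network = yeastract_to_grnsight(sample)
```

The cell values stay as text, so they would need to be converted to numbers before being written to a workbook.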

Results
GRNsight v1.8 had to be used to visualize the networks. Only four of the input selection choices gave network connections among the listed transcription factors.
Documented DNA Binding Evidence: 15 genes, 58 edges
Documented DNA Expression: 15 genes, 31 edges
DNA Expression plus Binding: 15 genes, 58 edges
DNA Expression and Binding: 15 genes, 4 edges
Potential with Motifs: 0 genes, 0 edges
Potential without Motifs: 0 genes, 0 edges
Documented plus Potential: 0 genes, 0 edges
Documented and Potential: 0 genes, 0 edges

November 13, 2014

I reran MATLAB to see if I got the same results as Dr. Fitzpatrick, and I did. When each deletion strain was compared to the WT strain, the target genes that were supposed to be affected were indeed affected.

Spring 2015

January 2015

I met the other people who are working on this project: Juan, Trixie, and Grace. This month, we discussed where the project was heading and what parts of the code needed to be changed. During these meetings, Profs. Dahlquist and Fitzpatrick gave overviews of the research project and all the computational functions that the model requires.

February 2015

February 6, 2015

I reran the protocol for microarray data that I received from Dr. Dahlquist. The protocol can be found here.

First, I created folders on my desktop to host the Ontario and GCAT data
- Each folder contained the following:
  1. The script for either Ontario or GCAT
  2. The target files for those scripts --> Ontario_Targets and GCAT_Targets
  3. The .gpr files from the microarrays
I downloaded and unzipped the files that were listed under the protocol

Note that Ontario was saved under Ontario and GCAT with GCAT

The 32-bit version of R was used.

The working directory had to be changed to the folder into which the files to be corrected were extracted.

The Ontario script was run first, followed by GCAT.
There will be two different outputs from running GCAT. We want the Final_Normalized_Data.

My file was then sent to Grace J., who then began to compare the results that we got.

February 12/13, 2015

I spent this day searching the literature for data sets of transcription rates with Grace J. We wrote an abstract to submit to the Undergraduate Research Symposium to present what was done last semester. The abstract was submitted the following Friday.

February 19/20, 2015

Spring Break

February 26/27, 2015

We worked on the poster for our presentation at the 7th Annual Undergraduate Research Symposium. We compiled the data that we were going to present on our poster. For the random networks, we gathered the LSEs to compare to that of the literature-derived network. We chose Random Networks 1 and 4 because they had the lowest and the highest LSE output.

March 2015

March 5/6, 2015

We added more information to our poster.

March 12/13, 2015

We continued to edit our poster for the 7th Annual Research Symposium. We edited the results section and changed the layout of the poster. We also edited the abstract that we submitted for the symposium to enter ourselves in the WCBSURC (West Coast Biological Sciences Undergraduate Research Conference). We found out the next day that we were accepted to present on April 25, 2015.

For Friday, we continued to look for production and degradation rates in the literature.

March 19/20, 2015

This week, we finalized our poster for presenting at LMU's symposium. The Results section and some of the background information were edited. Images, graphs, and the layout of the poster were formatted to be clearer and easier to follow. The graph titles and scales were altered so that each graph had the same axes. Furthermore, some of the section headings were edited to summarize the main finding for each result. We printed out our posters to put them up on Friday morning.

On Friday, we continued to look for the various degradation and production rates of mRNA.

March 26/27, 2015

April 2015

=====April

