Matthew E. Jurek
Electronic Laboratory Notebook
- Began by opening the sample data file and renaming it Jurek_Week14 before making any changes.
- My network consists of 23 transcription factors based on the Week 12 Assignment. The Regulation Matrix has been copied onto the 3 appropriate worksheets: network, network_weights, and network_thresholds. The transcription factors are not in alphabetical order; rather, I will keep them in the order generated by Yeastract and make sure each worksheet uses the same order. Following the step flagged as important, HMO1 was deleted from my network, as that is the gene I was assigned. My worksheets have been adjusted accordingly.
- The standard names have been copied and pasted into Yeastract. I have now received the systematic names and have placed them in the appropriate column. The degradation rates are being looked up in the Belle et al. data. Only about 4 transcription factors are not on this list; these have been assigned a degradation rate of 0.027182242.
- Production rates have been found by creating a column next to the degradation rates and applying a formula that multiplies each degradation rate by 2. The doubled degradation rates are the production rates. These will be copied and pasted onto the production_rates worksheet.
- Both the systematic and standard names of the transcription factors have been pasted into the log2_concentrations sheet. I will now use the Ctrl+F (find) function to locate all 23 transcription factors in my network. This function is useful, as there are over 6000 entries on the Week 9 spreadsheet that I am using. After locating each transcription factor, I have been copying and pasting its average log fold change for the cold shock data (time points 30, 60, and 90). This will be repeated 23 times until the worksheet has all of this information.
- To calculate the standard deviations for the concentration_sigmas worksheet in a timely manner, I have set up 2 additional worksheets on the Week 9 spreadsheet. The first worksheet contains the scaled and centered data for the average log fold change for each of the four trials within a time point. This will be repeated for the 30, 60, and 90 time points. Once the 4 scaled and centered values are in place for a time point, the standard deviation is calculated across those 4 values. The second worksheet consists of just the standard deviation for each gene at each time point, and I have been copying and pasting the standard deviations into it as I complete each time point. The second sheet also contains the systematic names so that I can look up the appropriate transcription factors. After completing the second sheet, I have been looking up my network with the Ctrl+F function again. I can then copy the whole row, since this sheet only contains the 30, 60, and 90 standard deviations. These values are pasted into the concentration_sigmas worksheet. This will be repeated for the whole network.
- As suggested, the optimization_parameters worksheet was left alone.
- The values on the simulation_times worksheet exceeded the values given in the Dahlquist data, so I am adjusting them. The new range covers 0-60 minutes in 5-minute intervals.
- The network_b sheet lists all of the transcription factors in my network by standard name, with each threshold set at 0.000.
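
For reference, the spreadsheet calculations described above could be sketched in Python roughly as follows. This is only an illustration, not the procedure actually used (everything above was done by hand in Excel). The gene names and numeric inputs are hypothetical placeholders, the function names are my own, and the sketch assumes the standard deviation is the sample standard deviation (n - 1 denominator) across the 4 trials. Only the fallback degradation rate (0.027182242), the doubling rule for production rates, and the 0-60 minute range in 5-minute steps come from the notes above.

```python
from statistics import stdev

# Fallback degradation rate used when a gene is missing from the lookup table
# (value taken from the notebook entry above).
DEFAULT_DEGRADATION_RATE = 0.027182242


def degradation_rate(gene, known_rates, default=DEFAULT_DEGRADATION_RATE):
    """Return the tabulated rate for a gene, or the fallback if it is absent."""
    return known_rates.get(gene, default)


def production_rate(deg_rate):
    """Production rate is taken as twice the degradation rate."""
    return 2 * deg_rate


def sigmas_by_timepoint(replicates):
    """Sample standard deviation across the trials at each time point.

    `replicates` maps a time point to the list of scaled and centered
    values for one gene (one value per trial).
    """
    return {t: stdev(values) for t, values in replicates.items()}


def simulation_times(start=0, stop=60, step=5):
    """Evenly spaced simulation times, inclusive of the end point."""
    return list(range(start, stop + 1, step))


# --- Hypothetical placeholder inputs (not real data) ---
known_rates = {"GENE_A": 0.05}      # GENE_B is deliberately missing
network = ["GENE_A", "GENE_B"]

deg = {g: degradation_rate(g, known_rates) for g in network}
prod = {g: production_rate(r) for g, r in deg.items()}

# Four made-up trial values per time point for a single gene.
replicates = {
    30: [1.0, 2.0, 3.0, 4.0],
    60: [0.5, 0.5, 1.5, 1.5],
    90: [2.0, 2.0, 2.0, 2.0],
}
sigmas = sigmas_by_timepoint(replicates)
times = simulation_times()

if __name__ == "__main__":
    print("degradation:", deg)
    print("production:", prod)
    print("sigmas:", sigmas)
    print("times:", times)
```

Keeping the fallback rate as a named constant makes it easy to see which genes received the default value rather than a tabulated one.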