Moneil5 Dahlquist Lab Notebook: Difference between revisions

From OpenWetWare
(→‎Spring 2017: added 3.30.17 and poster)
(→‎March 30,2017: added technical report information)

Revision as of 16:36, 30 March 2017


Spring 2016

January 15, 2016

  • branch
  • date time downloaded
  • name file link to download
  • bug, functionality, priorities
  • priority level
    • 0-greatest priority
    • 0.5- next up to work on
    • 1- …
    • 2- least priority
  • data analysis- data not code
  • question- asking people questions
  • don't close issues on your own; write a comment “resolved because…” and add the label review requested
  • purely website ones
  • assignments- assign issues to people (sparingly assign things to him)
  • give updates when working in between meetings
  • make electronic lab notebook that describes what was done each day- use as repository for files and such
  • go through wiki checklist and edit user page to skills
  • assignment for data bases class
  • format like resume
  • alphabetize the genes - gonna take some time

January 22, 2016

  • p= production rate
  • w=weight
  • b=“threshold”
  • Can control any of these parameters
  • Production and threshold for every gene in network
  • weight for every edge in network
  • Number of timepoints vs number of parameters is out of whack
  • trying to find overall set of values to closest set of values- might never converge on an answer
  • LSE vs. penalty being plotted
  • “sweet spot” for alpha value found in “elbow” of curve
  • questions trying to answer:
  • ex. what happens if ….?
    • estimate w,p,b
    • estimate just p
    • estimate just w
    • estimate just b
  • compare sigmoid to mm
  • want to look at just wild type or wild type plus mutant
  • strain influence/ strain #
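
How p, w, and b fit together can be sketched in a few lines. This is only an illustration, assuming the usual logistic (sigmoid) form, in which a gene's production is p scaled by a sigmoid of the weighted regulator input minus the threshold b. The function name, weights, and expression values are hypothetical and are not taken from the GRNmap code.

```python
import math

def sigmoid_production(p, weights, expression, b):
    """Production term for one gene under an assumed logistic form.

    p          -- production rate of the target gene
    weights    -- weight of each incoming edge (one per regulator)
    expression -- current expression of each regulator (same order)
    b          -- threshold of the target gene
    """
    total_input = sum(w * x for w, x in zip(weights, expression))
    return p / (1.0 + math.exp(-(total_input - b)))

# Hypothetical gene with one activator (w > 0) and one repressor (w < 0).
rate = sigmoid_production(p=1.0, weights=[0.8, -0.5], expression=[1.0, 1.0], b=0.0)
print(round(rate, 3))
```

With no regulators and b = 0 the term sits at p/2, which is one way to see why genes with no inputs are controlled only by p and b.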

January 29th, 2016

  • abstracts for undergrad symposium due by the 12th
  • honors research grant also due by the 12th
  • with grace on poster for symposium
  • read trace paper
  • not separating transcription and translation
  • implementation verification
  • output- tough because where we’d be making predictions
  • change model to production_function in excel
  • l-curve function call it 0
  • put between production function and estimate_params
  • run l curve analysis this week
  • Do 4 runs this week- do largest and smallest networks
    • +/- deletion strains
    • generates 4 l curves
      • LSE on y-axis
      • penalty on x-axis
      • Should look like an L (put labels for alpha values)
  • make_graphs=0
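
Picking the "elbow" of an L-curve by eye can be cross-checked numerically. The sketch below uses a common heuristic (the point farthest from the straight line joining the two ends of the curve, after rescaling both axes). It is not the lab's procedure, and the alpha, penalty, and LSE values are made up for illustration.

```python
def find_elbow(points):
    """Return the alpha whose (penalty, LSE) point lies farthest from the
    straight line joining the first and last points of the curve.

    points -- list of (alpha, penalty, lse) tuples, ordered along the curve
    """
    def rescale(vals):
        # Map values onto [0, 1] so neither axis dominates the distance.
        lo, hi = min(vals), max(vals)
        span = (hi - lo) or 1.0
        return [(v - lo) / span for v in vals]

    xs = rescale([p[1] for p in points])  # penalty on the x-axis
    ys = rescale([p[2] for p in points])  # LSE on the y-axis
    x0, y0, x1, y1 = xs[0], ys[0], xs[-1], ys[-1]
    chord = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5 or 1.0

    best_alpha, best_dist = points[0][0], -1.0
    for (alpha, _, _), x, y in zip(points, xs, ys):
        # Perpendicular distance from (x, y) to the chord.
        dist = abs((x1 - x0) * (y0 - y) - (x0 - x) * (y1 - y0)) / chord
        if dist > best_dist:
            best_alpha, best_dist = alpha, dist
    return best_alpha

# Made-up runs: (alpha, penalty, LSE)
runs = [(0.0001, 10.0, 1.0), (0.001, 5.0, 1.1), (0.01, 1.0, 1.5),
        (0.1, 0.5, 5.0), (1.0, 0.1, 10.0)]
print(find_elbow(runs))
```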

February 5, 2016

  • figure out how to run multi-core processor
  • name of file- remove “dahlquist data” and put in initials of person running it instead
  • make sure everyone deleted the same strains
  • will be working with wild type data from beginning to understand process

February 11, 2016

  • meet up with Brandon in Dahlquist’s lab to work on project on some day next week
    • meeting next week is at 3:15
  • plot data from LSE runs
  • by next week- alpha selected, data collected
  • replace 41998 #VALUE!
  • 23 is correct # of data points wt
    • t15=4
    • t30=5
    • t60=4
    • t90=5
    • t120=5
    • total=23
  • 20 is correct # data points dCIN5
    • t15=4
    • t30=4
    • t60=4
    • t90=4
    • t120=4
    • total=20

February 17, 2016

  • Quantitate the fluorescence signal in each spot (GenePix Pro)
  • Calculate the ratio of red/green fluorescence (GenePix Pro)
  • Log transform the ratios (GenePix Pro)
  • Normalize the ratios on each microarray slide (within-chip normalization)
  • Normalize the ratios for a set of slides in an experiment (between-chip normalization)
  • Perform statistical analysis on the ratios
    • Within-strain ANOVA
    • Modified t test for each timepoint
    • Between-strain ANOVA
    • Benjamini & Hochberg and Bonferroni p value corrections for the above three tests
  • "Sanity Check" on above three tests
  • Determining candidate transcription factors and gene regulatory network (YEASTRACT)
  • Dynamical modeling with GRNmap; visualization with GRNsight
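
The two p value corrections named in the statistics step can be written out by hand as a check on spreadsheet results. This is a plain-Python sketch of the standard Bonferroni and Benjamini & Hochberg adjustments. The function names and example p values are illustrative, not taken from the lab's workbooks.

```python
def bonferroni(p_values):
    """Multiply each p value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def benjamini_hochberg(p_values):
    """Benjamini & Hochberg adjusted p values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that adjusted
    # values never increase as the raw p value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

example = [0.01, 0.04, 0.03, 0.005]  # made-up p values
print(bonferroni(example))
print(benjamini_hochberg(example))
```

Bonferroni is the stricter of the two, so a gene that passes it will also pass Benjamini & Hochberg at the same cutoff.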

February 19, 2016

  • Grace to finish honors ambassadorial grant for Experimental Biology Conference in April
  • Output parameter comparison for largest network with added strains for alpha values:
    • 0.01
    • 0.008
    • 0.005
    • 0.002
    • 0.001
  • To complete for the poster (two weeks after spring break)
  • Stick with subfamily with strains_added
  • "Production runs:" Evaluate (with graphs) the networks of 15, 34, 25, 20, and 30 genes.
  • Help Grace run these networks

February 26, 2016

  • Mistakes in a couple input sheets, need to look into helping Grace fix those
    • Explanation behind this error may be connected to the weird weight parameters output in the file on 2/23
  • Look into helping Grace redo run for largest network with alpha value of 0.002

March 10, 2016

  • Helped Grace in putting graph outputs of interesting genes on the poster for the spring/symposium. Also helped in adding MSE/ANOVA values, and parameter comparisons as well as in degree and out degree figures.
  • Helped re-run some of the networks to remedy an error in the first run.

March 27, 2016

  • Looked at/worked on editing the abstract for ASBMB conference with Grace
  • Told how the random networks might work, better understand the principles behind these networks

April 15, 2016

  • Only a few weeks left, will be working to help Grace compile a powerpoint containing the graphs, figures, and tables for the dHAP4 network analysis done for the semester.
  • Focusing on in-degree and out-degree distribution, small networks and whatever else Grace needs help on

Fall 2016

August 30, 2016

Mathematically modeling networks - how will graph theory help?

  • Mostly focusing on using gene regulatory networks in relation to graph theory
    • Are there papers that suggest you can determine what is happening in a system based on outputs of graph theory-based statistics?
    • What do strict numbers from stats tell you about what is happening in a system? Or is it a mostly visually based interpretation that’s needed?

Are the feed forward loops AND or OR type loops? (i.e. A and B needed for C activation, alternatively A or B is needed for C activation) How does suppression and activation play a role in these feed forward loops?

Cursory searches:

September 6, 2016

  • Searched “graph theory and yeast”
  • Network properties in “using graph theory to analyze biological networks” - don’t pay attention to clustering, probably doesn’t relate to what we’re doing
  • Pay attention to paragraph on gene regulatory networks
  • TRACE documentation of the model, see dry lab protocols
    • Look at papers referenced in the model
    • Issue #170 - goal is to get words on a page to describe GRNmap

September 12, 2016

  • Code of conduct:
    • Re-read, agree, post to issue saying read and agree
  • Need way to check calculations
  • Look for pre-packaged ways to compute betweenness centrality and shortest path
  • Start into mode of what we can get done in MATLAB
  • Start googling MATLAB documentation, implementation in MATLAB and use as independent check for GRNsight team
  • Systems biology package for MATLAB
  • Values computed for weighted and unweighted networks
  • Look for code to do analysis with, continue literature search
  • Looking for way to do degree distribution quickly and easily
  • Projection: mma deg rates, good random network -> proceed to run simulations
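
As an independent check on whatever package ends up computing betweenness centrality, the calculation can be done directly. The sketch below is Brandes' algorithm for an unweighted, directed network, returning unnormalised scores. It is a textbook method rather than the output of any specific package, and the toy edge list is made up.

```python
from collections import deque

def betweenness(edges):
    """Unnormalised betweenness centrality for a directed, unweighted graph.

    edges -- iterable of (source, target) pairs
    """
    adj = {}
    for s, t in edges:
        adj.setdefault(s, set()).add(t)
        adj.setdefault(t, set())
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back toward s.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score

print(betweenness([("A", "B"), ("B", "C")]))
```

In the toy chain A to B to C, only B lies on a shortest path between two other nodes, so it is the only node with a nonzero score.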

September 13, 2016

  • Look up and complete matlab tutorials
  • Do write-up of data
  • Find articles that focus on betweenness centrality and shortest distance models for graph theory
  • Worked with Kristen to make powerpoint with quick summations of the articles we read
  • Looked into using a systems biology toolbox for MATLAB that can be found online


September 19, 2016

  • R in degree out degree generator- random networks only?
  • Use bibliographic software to format references: Zotero, web and standalone. Can type in a DOI and get the fields filled in with everything. Export to whatever format (AP etc.), best thing to do
  • Find betweenness centrality program in matlab
  • First 4 points of TRACE documentation [2]
  • Double check everything about 5 15 gene networks and do production runs, generate random networks and collect data
  • Test same network on same computer twice, different computers, and other control experiments

September 20, 2016

  • Spent most of time getting familiar with basic Matlab functions using the Matlab tutorial found at this link: [3]
  • Reviewed matlab basic tutorials and looked into tutorials for systems biology package and graph theory. Nothing concrete found yet, but will look more into it at the next research session

October 3, 2016

  • Keep working on TRACE documentation
  • Get past admin block to download SBEToolbox on all of the Dahlquist Lab computers
  • Figure out formatting for toolbox, use networks of different size, see what works with the program and how its interface works
  • Kristen going to contact authors to ask for how we should format everything

October 17, 2016

  • Keep working with Kristen on exploring SBEToolbox capabilities and its application to the graph statistics of interest
  • Create a small graph first to test with
    • 4 nodes, 6 edges might be a good size to look at
  • Update Github more often with results
  • TRACE documentation can be found on the TRACE wiki

October 18, 2016

  • Worked on figuring how SBEToolbox works with Kristen
  • Runs as a program in matlab called SBEGUI
    • Once running, it offers different organisms to choose from. Kristen selected S. cerevisiae and notes that when MATLAB and SBEGUI are restarted, you are not asked about organisms again
  • Interface is pretty user-friendly for SBEGUI pop up
  • Some interesting features include:
    • Creation of random networks (small world, Erdos-Renyi, and Ring Lattice)
    • Can upload own networks in .txt (tab delimited) format
    • Allows you to select nodes and the program tells you its functionality
    • Lots of graph statistics that can be run, will look into each further moving forward
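
For comparison with the toolbox's random network generator, an Erdos-Renyi style network is simple to build by hand. This sketch draws a fixed number of distinct directed edges uniformly at random. It is not SBEToolbox's implementation, and allowing self-loops (self-regulation) is an assumption made here.

```python
import random

def erdos_renyi_directed(nodes, n_edges, seed=None):
    """Pick n_edges distinct directed edges uniformly at random.

    Self-loops are allowed (assumption: a gene may regulate itself).
    """
    rng = random.Random(seed)
    candidates = [(s, t) for s in nodes for t in nodes]
    return rng.sample(candidates, n_edges)

# Toy network: 4 nodes, 6 edges, with a fixed seed so runs are repeatable.
edges = erdos_renyi_directed(["A", "B", "C", "D"], 6, seed=1)
print(edges)
```

Fixing the seed makes it possible to rerun the same random network on different computers as a control.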

October 24, 2016

  • Convert to SIF files using GRNsight website
    • SIF instructions are on GRNsight website - also GRNsight able to convert from Excel to SIF format
  • Create our own documentation of the package tomorrow. Make note of everything that works and everything that doesn’t work
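
If a network ever needs converting without the website, the SIF layout is simple enough to write by hand: one "source, relationship, target" line per edge. The sketch below assumes the adjacency matrix has regulators in columns and targets in rows, and uses "pd" (the conventional Cytoscape label for protein to DNA) as the relationship. Whether GRNsight or SBEToolbox expects that exact label is an assumption, and the gene names are placeholders.

```python
def matrix_to_sif(genes, matrix, relationship="pd"):
    """Turn an adjacency matrix into tab-delimited SIF lines.

    genes  -- gene names, in the same order as the rows and columns
    matrix -- matrix[row][col] is nonzero when the column gene
              regulates the row gene (assumed orientation)
    """
    lines = []
    for row, target in enumerate(genes):
        for col, regulator in enumerate(genes):
            if matrix[row][col] != 0:
                lines.append(f"{regulator}\t{relationship}\t{target}")
    return lines

# Placeholder 3-gene network: A regulates B, B regulates C, C regulates C.
genes = ["A", "B", "C"]
matrix = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 1],
]
print("\n".join(matrix_to_sif(genes, matrix)))
```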

October 25, 2016

  • Worked with Kristen running a 4 node network through all programs/functions included in the package
  • Will run 21-gene network next week on more specific stats now that we know what works and what doesn’t
  • Powerpoint of our work can be found here Media:HorstmannOneil_SBEToolbox_Tests.pptx

October 31, 2016

  • Focus on running the more informative stats for our own networks, HAP4, GLN3 and ZAP1 and run visualization
  • Compile for a powerpoint to present next week

November 1, 2016

  • While working on powerpoint realized that the betweenness centrality and shortest path statistics are being run without taking directionality of the edges into account
  • Look into program features to figure out if there is a way to change this to view the networks as directed
  • Only ran unweighted networks because can’t find where the weighted networks live in GitHub
  • Powerpoint of findings can be found here: Media: SBEToolbox_TEST.ppt
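
Why directionality matters for shortest path can be seen on a toy network. The sketch below is a plain breadth-first search that can treat the same edge list as directed or undirected. The 4-node cycle is made up, not one of the lab's networks.

```python
from collections import deque

def shortest_path_length(edges, source, target, directed=True):
    """Number of edges on a shortest path, or None if target is unreachable."""
    adj = {}
    for s, t in edges:
        adj.setdefault(s, set()).add(t)
        adj.setdefault(t, set())
        if not directed:
            adj[t].add(s)  # undirected: the edge can be walked both ways
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return dist[node]
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return None

cycle = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
print(shortest_path_length(cycle, "A", "D", directed=True))
print(shortest_path_length(cycle, "A", "D", directed=False))
```

Following the arrows, A reaches D in three steps. Ignoring them, the D to A edge can be walked backwards and the distance drops to one, which is the kind of difference an undirected calculation would hide.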

November 7, 2016

  • Brandon and Natalie to try different types of motifs when creating random networks. This includes regulatory, feed-forward, etc.
  • Drop out grey connections- rescale and it might show some of the stronger connections actually become less important
  • SBEToolbox confirmed to use assumptions of undirected networks. Does every program do this?
  • Might be looking at math rather than comp sci packages
  • Start doing google searches of shortest path/betweenness centrality of directed networks
  • Look into other programs, yEd and Gephi, etc. and email Dahlquist about other programs

November 8, 2016

  • In looking at other programs, seems like Gephi is the top choice
  • yEd seems like it might just be a figure generator, can’t find anywhere on their website where stats might be done

November 28, 2016

  • After exploring and getting familiar with Gephi, now going to run all 6 graphs/directed networks through it
  • Run weighted and unweighted and directly compare results
  • Look up definitions and calculations of each statistic (my primary task)
  • Add GRNsight visualization for each network
  • Think about analysis and conclusions that can be drawn from this semester for next semester’s presentations and conferences. UCI systems biology conference might be a possibility, going to be coming up soon, start thinking about it.

November 29, 2016

December 5, 2016

  • In reviewing Gephi work from last week, most interesting stats seem to be strong component, closeness centrality, betweenness centrality and eigenvalues
  • Look into plotting closeness and harmonic against each other to see if there’s a major relationship between the two
  • Find strong component and why some numbers are so much greater than others
  • Work on table of contents and CD’s of work to wrap up the semester
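
Before plotting closeness against harmonic centrality, it helps to see how the two differ on a small example. The sketch below uses textbook definitions on outgoing paths: closeness as the number of reachable nodes divided by the total distance to them, and harmonic as the sum of 1/distance divided by (n - 1). Gephi may normalise these differently, so treat the numbers as illustrative only.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path distances from source to every other reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    del dist[source]
    return dist

def closeness_and_harmonic(edges):
    """Return {node: (closeness, harmonic)} for a directed, unweighted graph."""
    adj = {}
    for s, t in edges:
        adj.setdefault(s, set()).add(t)
        adj.setdefault(t, set())
    n = len(adj)
    result = {}
    for node in adj:
        dist = bfs_distances(adj, node)
        total = sum(dist.values())
        closeness = len(dist) / total if total else 0.0
        harmonic = sum(1.0 / d for d in dist.values()) / (n - 1) if n > 1 else 0.0
        result[node] = (closeness, harmonic)
    return result

print(closeness_and_harmonic([("A", "B"), ("B", "C")]))
```

In the chain A to B to C, node B gets the top closeness because it only "sees" one node at distance 1, while harmonic ranks A higher because A reaches more of the network. Under these definitions the two measures can disagree whenever some nodes are unreachable.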

December 6, 2016

  • Made CD with table of contents of work and reviewed Gephi stats
  • Turned CD in

Spring 2017

January 12, 2017

  • Attending the UCI Systems Biology conference on January 28. Abstracts due next week by the 20th
  • Moving forward only one computer will be used for running models with GRNmap to reduce variables

January 19, 2017

  • Worked with Kristen on writing the abstract for the conference
  • Focusing on old HAP4 results from last spring and what other information the statistics from Gephi can tell us about the smallest HAP4 network

January 24, 2017

  • Worked on finding old HAP4 input sheets and poster documents
  • Kristen uploaded Gephi results and HAP4 outputs

January 26, 2017

  • Worked with Kristen to get poster completed
  • Discussed poster during the meeting and what is important to keep/not keep.
  • Worked on and completed the poster after research meeting and uploaded it to repository

February 2, 2017

  • Ended up presenting on my own at the conference due to unforeseen circumstances for Kristen. No problem at all, very interesting conference to attend.
  • Worked on symposium abstract
  • MSE relationship with ANOVA
  • Do p < 0.05 genes have better fits?
    • Don't
    • No relationship
  • Are genes with no inputs modeled worse?
  • Compare list of genes no inputs to list with - which modeled better?
  • How does b value play into MSE
    • Number of inputs
  • Genes decreasing - expression worse fit?
  • How do centrality measures connect to MSE?

February 9, 2017

  • Continued writing abstract for undergrad symposium
  • Initially my project was the poster version of Kristen’s talk, but now have it changed to compare all 6 dbs in Gephi and figure out what those stats might be saying about the networks and specifically what is being said about the nodes. Can be found in GRNmap GitHub repository. Got help and editing from Dahlquist and Fitzpatrick

February 16, 2017

  • Working on compiling Gephi statistics for all dbs
  • Start thinking about comparing nodes across networks and what this might mean
  • Compile into single document similar to Brandon’s - make all node based

February 23, 2017

  • Working on compiling Gephi statistics based on new naming scheme for the different families of networks
    • db1 - wt
    • db2 - dCIN5 14 nodes
    • db3 - dCIN5 17 nodes
    • db4 - dGLN3
    • db5 - dHAP4
    • db6 - dZAP1
  • Talk to Brandon about stats that might be used

February 27, 2017

  • All raw files for Gephi outputs (db2, db3, and uploaded and completed to Dahlquist Repository in [4]
  • If time available between midterms this week, will try my best to upload the in degree and out degree totals (Issue 328) before Thursday but not looking likely

March 2, 2017

  • Downloaded all 6 db sheets with comment "uploaded output sheets from first round to modeling" to use for Github Issue #329
  • Created a compiled excel workbook for the network outputs with sums of rows and columns in each matrix/ sheet for all dbs
  • Ran average, median, min and max of all rows and columns sums

Meeting

  • Take column summation (out degree) and row summation (in degree) and compute average in degree for each gene in each network, excluding zeros in the average calculation
    • Use COUNTIF for this etc.
  • For next week try and have most of poster complete
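
The row and column sums and the zero-excluding average (the COUNTIF step) can be double-checked outside Excel. This sketch assumes an unweighted 0/1 network matrix with targets in rows and regulators in columns, matching the row = in degree, column = out degree convention above. The 3-gene matrix is made up.

```python
def degree_summary(matrix):
    """Return (in_degree, out_degree, mean_in, mean_out).

    Row sums give in degree, column sums give out degree, and each mean
    skips genes whose sum is zero (the same idea as dividing by COUNTIF
    of the nonzero cells).
    """
    in_degree = [sum(row) for row in matrix]
    out_degree = [sum(col) for col in zip(*matrix)]

    def mean_nonzero(values):
        nonzero = [v for v in values if v != 0]
        return sum(nonzero) / len(nonzero) if nonzero else 0.0

    return in_degree, out_degree, mean_nonzero(in_degree), mean_nonzero(out_degree)

# Made-up unweighted network: 1 means the column gene regulates the row gene.
network = [
    [0, 1, 1],
    [0, 0, 1],
    [0, 0, 0],
]
print(degree_summary(network))
```

For weighted matrices the sums mix positive and negative weights, so counting nonzero cells instead of summing would be needed to get degrees.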

March 30, 2017

  • Worked on symposium for the past couple weeks, completed poster can be found here
  • Created a file detailing how to upload files to Gephi, view graph statistics, and export the file to excel. A Word document version can be found here and a PDF version of the document can be found here

Meeting

  • Symposium good, make research for rest of the semester more robust.
  • 6 dbs
  • 30 random networks
  • LSE/minLSE ratios- bar chart (Natalie's presentation)
  • Degree distributions
    • Unweighted - R script done DB1-6, need to do random
    • Weighted - bar chart, cumulative plot done for db1-6. SPSS
  • Gephi - tables on in and out degree for both weighted and unweighted, and total degree; all db-derived or random
  • MSE/minMSE for db 5, Natalie will set up excel spreadsheet to facilitate
  • Gephi stats
    • I have db1-6
    • Kristen has random 20
  • Convert so each sheet is a statistic rather than just 1 sheet
  • Look into how Gephi is computing each statistic (both weighted and unweighted) so we know for sure how each statistic is being calculated
    • By-hand calculation for each statistic, come back to eccentricity later because it's hard to tell what's going on
  • Develop group report that sums up our work for the semester along with tables, graphs, etc.
    • Both as word doc and powerpoint, describe everything like technical report that points to specific files
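
For the by-hand checks, eccentricity is a reasonable one to script because it follows directly from shortest paths. The sketch below takes a node's eccentricity to be the distance to the farthest node it can reach along directed edges, with 0 for a node that reaches nothing. How Gephi treats unreachable nodes and edge weights is not confirmed here, so this is one possible reading to compare against its output. The toy edge list is made up.

```python
from collections import deque

def eccentricity(edges):
    """Eccentricity of every node in a directed, unweighted graph."""
    adj = {}
    for s, t in edges:
        adj.setdefault(s, set()).add(t)
        adj.setdefault(t, set())
    result = {}
    for source in adj:
        # Breadth-first search gives shortest distances from source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        result[source] = max(dist.values())
    return result

toy = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
print(eccentricity(toy))
```

Node D has no outgoing edges, so it scores 0 under this reading. If Gephi reports something else for such nodes, that difference is a clue to how it handles unreachable pairs.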