CAGEN: Robust Gene Response Challenge: Difference between revisions

From OpenWetWare

Revision as of 15:36, 23 May 2012

2012 CAGEN Challenge Problem


This challenge problem has been chosen as the 2012 CAGEN Challenge Problem. To participate in the 2012 CAGEN Challenge:

  1. (optional) Send e-mail to the chair of the CAGEN steering committee (murray-at-cds-dot-caltech-dot-edu), indicating that you plan on participating in the challenge problem
  2. Subscribe to the cagen-announce mailing list to receive information about the CAGEN challenge
  3. Submit a technical paper describing the results of your design by the deadline for the competition (15 June 2012)

More information on the CAGEN challenge motivation and principles is available on the CAGEN home page.

The competition itself will be broken into two phases: an initial phase in which a self assessment must be submitted (as described below) and a second phase in which a "critical assessment" will be performed for a set of finalists determined by the steering committee based on the self assessment. For the critical assessment, competitors will send their DNA inserted into a standard (commercially available) vector for characterization using standard protocols for transformation, growth and measurement. The idea for this second phase is to move to a state where we are providing *critical* assessments of different designs. We will work with participants to make sure that we can handle different organisms, if needed.

Additional notes:

  • For the critical assessment, an evaluation will be done for both the rise time and decay time (90% to 10%), with both scores reported but rise time being used as the primary score. If two teams have primary scores within 10% of each other, decay time will be used as a secondary measure and included in the final determination of the winner (to be determined by the steering committee).
  • To allow a common critical assessment, the CAGEN specification is modified to require that the inducer be IPTG at a fixed concentration (1 mM) and the output be a standard fluorescent protein (either GFP, RFP or YFP). The idea here is to ensure that the innovation in the design is not due to optimizing the input or output (which we often don't have control over), but rather the circuit between these species.
  • For the self assessment, competitors can submit their data in one of two forms: single cell microscopy data (as currently described) or flow cytometry data. For the flow cytometry data, at least 8 time points must be provided at each of the three temperatures. The score will be computed by using a linear interpolation of the mean and variance of the submitted data. The idea here is twofold: allow people without the ability to do single cell microscopy to submit a design and explore alternate measures of robustness.
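As a rough illustration of the flow cytometry option, the sketch below reduces each time point to a mean and variance and then interpolates both linearly between time points. The function names, the use of the population variance, and the clamping outside the sampled range are assumptions made for this example; the challenge text does not specify them, nor how the interpolated variance enters the score.

```python
from bisect import bisect_right
from statistics import fmean, pvariance


def summarize(samples_by_time):
    """Return a time-sorted list of (time, mean, variance) tuples.

    samples_by_time maps a time point to the list of per-cell fluorescence
    values measured at that time (hypothetical input layout).
    Population variance is an assumption; the source does not say which
    estimator is intended.
    """
    return [(t, fmean(v), pvariance(v)) for t, v in sorted(samples_by_time.items())]


def interpolate(summary, t):
    """Linearly interpolate (mean, variance) at time t.

    Values outside the sampled range are clamped to the nearest end point
    (an assumption made for this sketch).
    """
    times = [row[0] for row in summary]
    if t <= times[0]:
        return summary[0][1], summary[0][2]
    if t >= times[-1]:
        return summary[-1][1], summary[-1][2]
    k = bisect_right(times, t)          # index of the first time point > t
    t0, m0, v0 = summary[k - 1]
    t1, m1, v1 = summary[k]
    w = (t - t0) / (t1 - t0)            # fractional position between the two points
    return m0 + w * (m1 - m0), v0 + w * (v1 - v0)
```

A real submission would carry at least 8 time points per temperature; the same two functions would be applied once per temperature.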

Challenge Problem Description

Sample step response, taken from http://partsregistry.org/Part:BBa_F2620:Response_time.

Synopsis: The goal of this challenge is to design a circuit that can rapidly express a fluorescent protein at a controlled level upon the introduction of a chemical inducer, with minimal variation in expression between cells and in multiple contexts. At conditions yielding maximum expression, the circuit should quickly bring the volume-normalized fluorescence from 1X to 10X in response to the addition of an inducer of the designer's choice. The circuit must work at multiple temperatures, with minimal variation in the fluorescence over time, operating temperature and cell choice.

Motivation: Current synthetic circuits demonstrate large variability in expression level when operating in different contexts, and this limits the ability of synthetic biologists to build on designs performed by other groups. By designing circuits that demonstrate highly repeatable performance over a range of operating conditions, it will be possible to make better use of designs in a modular fashion.

Impact: Improved understanding of engineering processes for synthetic biologists will enable more rapid and pervasive development of synthetic circuits, with applications in materials processing, environmental science, agriculture and medicine.

Metric(s): The winner of this challenge will be determined based on the worst-case mean square error between the ideal step response and the experimental results, with evaluation over multiple temperatures. To be considered, data for the circuit must be submitted for steady state operating temperatures at a nominal value (chosen by the contestant), nominal + 5% and nominal - 5%, with measurements taken in at least 5 individual cells chosen from separate colonies at each temperature. This represents a set of 15 total time traces of data. At least one of these responses must demonstrate a step response that goes from 1X to 10X expression level in response to the addition of the inducer.
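The data requirement can be checked mechanically before submission. The sketch below is only an illustration: the function names and the dictionary layout are hypothetical, and it assumes the 5% offset is applied to the numeric temperature value in whatever unit the contestant uses, which the challenge text does not state.

```python
def temperature_set(nominal, fraction=0.05):
    """Return (low, nominal, high) operating temperatures.

    Assumption: the 5% offset is taken on the numeric value of the
    nominal temperature in the contestant's chosen unit.
    """
    return (nominal * (1 - fraction), nominal, nominal * (1 + fraction))


def check_submission(traces_by_temperature, min_cells=5):
    """Return a list of human-readable problems; an empty list means OK.

    traces_by_temperature maps each operating temperature to the list of
    single-cell time traces recorded at that temperature (hypothetical layout).
    """
    problems = []
    if len(traces_by_temperature) != 3:
        problems.append(
            f"expected 3 temperatures, found {len(traces_by_temperature)}"
        )
    for temp, traces in sorted(traces_by_temperature.items()):
        if len(traces) < min_cells:
            problems.append(
                f"temperature {temp}: {len(traces)} traces, need at least {min_cells}"
            )
    return problems
```

With three temperatures and five cells each, this accounts for the 15 traces mentioned above.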

Each participating entry is required to submit time traces at a nominal temperature and at temperatures 5% above and below this nominal value, with measurements from at least five individual cells from separate colonies at each temperature. This represents a total of 15 time traces of data (denoted [math]\displaystyle{ y_j (t) }[/math], [math]\displaystyle{ j }[/math] = 1, 2, 3 . . . 15). One of these traces must be designated as a reference trace, and its equilibrium value after induction ([math]\displaystyle{ M }[/math]) should be at least tenfold larger than its uninduced value ([math]\displaystyle{ M_0 }[/math]). The score for each time trace ([math]\displaystyle{ S_j }[/math]) is the integrated square error between the time trace and an idealized step response with amplitude [math]\displaystyle{ M }[/math],

[math]\displaystyle{ S_j = \int_{T_1}^{T_2} |y_j(t) - M|^2 \, dt }[/math]

Here, [math]\displaystyle{ T_1 }[/math] is the time where the reference trace reaches 10% of [math]\displaystyle{ M }[/math], and [math]\displaystyle{ T_2 }[/math] is the time after which the variation in the reference trace is within 10% of [math]\displaystyle{ M }[/math]. Additionally, the reference trace is required to stay within 10% of [math]\displaystyle{ M }[/math] for a time duration [math]\displaystyle{ T_2 - T_1 }[/math] after [math]\displaystyle{ T_2 }[/math].
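The window and the per-trace score can be sketched numerically as follows. This is a minimal illustration, not the scoring script used by the organizers: the function names are hypothetical, the traces are assumed to be sampled at discrete times, "within 10% of M" is read as |y - M| <= 0.1 M, and the integral is approximated with the trapezoid rule over the samples that fall inside [T_1, T_2].

```python
def window_from_reference(times, values, M, tol=0.10):
    """Return (T1, T2) for the reference trace; an entry is None if undefined.

    times and values are lists of equal length (sample times and readings).
    """
    # T1: first sample time at which the trace reaches tol * M.
    T1 = next((t for t, y in zip(times, values) if y >= tol * M), None)
    # T2: earliest sample time from which every later sample stays
    # within tol * M of M (scan backwards until the band is violated).
    T2 = None
    for t, y in zip(reversed(times), reversed(values)):
        if abs(y - M) > tol * M:
            break
        T2 = t
    return T1, T2


def holds_after_settling(times, T1, T2):
    """True if the record extends at least T2 - T1 beyond T2.

    Because T2 is chosen so that all later samples are inside the band,
    the hold requirement reduces to a check on the record length.
    """
    return times[-1] - T2 >= T2 - T1


def integrated_square_error(times, values, M, T1, T2):
    """Trapezoid-rule approximation of the integral of |y(t) - M|^2 over [T1, T2].

    Only samples with T1 <= t <= T2 are used (an assumption of this sketch;
    no interpolation is done at the window edges).
    """
    pts = [(t, (y - M) ** 2) for t, y in zip(times, values) if T1 <= t <= T2]
    return sum(
        0.5 * (e0 + e1) * (t1 - t0)
        for (t0, e0), (t1, e1) in zip(pts, pts[1:])
    )
```

The window is computed once from the reference trace and then reused for every trace, since the definition ties T_1 and T_2 to the reference trace only.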

The score of the design is based on worst-case analysis and will be the highest among the scores of the individual traces. Additionally, we normalize this metric by the equilibrium amplitude of the reference trace. With this consideration, the final score is,

[math]\displaystyle{ S = \max_j \frac{S_j}{M^2} }[/math]

This generates a single number that can be used to order the designs.
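A minimal sketch of the final step, assuming the per-trace scores have already been computed. The function names are hypothetical, and ranking lower scores first is an assumption based on the metric being an error measure.

```python
def final_score(trace_scores, M):
    """Worst-case per-trace score normalized by the squared reference amplitude."""
    if M <= 0:
        raise ValueError("reference amplitude M must be positive")
    return max(trace_scores) / M ** 2


def rank_designs(scores_by_design):
    """Order design names from lowest to highest final score.

    Assumption: a lower score is better, since the score measures error.
    """
    return sorted(scores_by_design, key=scores_by_design.get)
```

For example, a design whose worst trace has an integrated square error of 120 with M = 10 would receive a final score of 120 / 100.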

Contact: To provide feedback on this challenge, send e-mail to Richard Murray (murray-at-caltech-dot-edu), representing the steering committee.

Benchmark Data

Test implementations of baseline circuits in E. coli and S. cerevisiae have been built and characterized.

Tools for the Robust Gene Response Challenge

The specifications of the Robust Gene Response Challenge are to design a genetic circuit that ensures fast, robust expression of a fluorescent protein upon induction. For both metrics, we have worked through the specifications for a reference design to establish a baseline performance score, using both computational and experimental methods. This analysis has helped refine the specifications and will provide tools and protocols to help participating teams in their designs.

For the original metric, this analysis is presented in the following report (pdf), and also in the following presentation charts (ppt, pdf, movie1, movie2). The associated latex files, scripts, and data can be accessed here. Similar data has also been acquired using a microfluidic setup (CellASIC ONIX), which offers better control over the induction time and the possibility of pulsed induction (Movie, Preliminary Data).

For the alternative metric, the entire analysis has been repeated and is presented as a report and as presentation charts (ppt, pdf, movie1, movie2, movie3). The associated latex files, scripts, and data can be accessed here.