Reshma Shetty/FAQ and thoughts
Frequently asked questions and objections
Disclaimer: These are personal opinions developed as a result of my own thinking on this subject and interactions with others (which I have tried to note as applicable). They are highly likely to evolve over time. If you have a comment/question on something here, send me an email.
But life isn't digital!
The most common question I receive when talking about my work with others is "Biology isn't digital, so why concentrate on digital logic?" There's an answer on the Synthetic Biology FAQ, but I'll give my own two cents here. In my mind, there are two valid responses to this question.
Maybe not, but we are better at thinking digitally.
This is the answer I usually give. After thinking about this question a fair amount and talking with others in the MIT Synthetic Biology Working Group (especially Tom Knight), I came to the conclusion that the first "digital" devices in electrical engineering probably weren't very digital either. In fact, even today's devices aren't 100% digital. Calling a device "digital" is really an abstraction we place upon a physical object that behaves according to certain specifications. By carefully determining those behavior requirements and carefully engineering the device, the digital abstraction holds up sufficiently well for the device to work as desired. A lot of work has already been done in electrical engineering in terms of engineering digital circuits from analog electrical components. We should be able to leverage this expertise to design digital devices from analog, biological components.
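As a rough illustration of what "placing a digital abstraction on an analog device" can mean, here is a minimal Python sketch. The threshold values and the function names (`to_digital`, `noise_margins`) are hypothetical and chosen only for illustration; they are not measurements or specifications from any real electrical or biological device.

```python
def to_digital(level, low_max=0.2, high_min=0.8):
    """Interpret an analog level (arbitrary units) as a digital value.

    Returns 0 or 1 when the level falls inside a valid band, and None when
    it lands in the forbidden zone between the two (assumed) thresholds.
    """
    if level <= low_max:
        return 0
    if level >= high_min:
        return 1
    return None


def noise_margins(out_low_max, out_high_min, in_low_max, in_high_min):
    """Return (low margin, high margin) between one device's guaranteed
    output levels and the next device's accepted input levels.

    Positive margins mean the signal can pick up that much noise on the way
    and still be read correctly downstream.
    """
    return in_low_max - out_low_max, out_high_min - in_high_min


# Hypothetical analog readings from a device that is "mostly" digital.
readings = [0.05, 0.15, 0.5, 0.85, 0.97]
print([to_digital(r) for r in readings])
```

The point of the sketch is that the device itself never stops being analog. The abstraction holds only as long as its outputs stay inside the specified bands, which is exactly the kind of behavior requirement that has to be engineered for.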
Sure it is.
Information in biology is encoded by DNA which consists of strings of 4 kinds of nucleotides. Therefore life is fundamentally digital. I first heard this argument in a talk given by Leroy Hood at MIT in 2003 but I am sure that others use it as well. If biology at its core is digital, then it is no longer so unreasonable to design digital devices from biological parts.
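To make the "DNA is digital" argument concrete: with 4 kinds of nucleotides, each position carries 2 bits, so any sequence maps to an integer and back without loss. The sketch below assumes an arbitrary base-to-bits assignment (A=00, C=01, G=10, T=11); that mapping and the function names are illustrative choices, not a standard.

```python
# Assumed, arbitrary 2-bit code for each nucleotide.
BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BASES = {bits: base for base, bits in BITS.items()}


def encode(seq):
    """Pack a DNA string into an integer, 2 bits per nucleotide."""
    value = 0
    for base in seq.upper():
        value = (value << 2) | BITS[base]
    return value


def decode(value, length):
    """Unpack an integer produced by encode() back into a DNA string.

    The length is needed because leading A's encode as leading zero bits.
    """
    bases = []
    for _ in range(length):
        bases.append(BASES[value & 0b11])
        value >>= 2
    return "".join(reversed(bases))


seq = "GATTACA"
packed = encode(seq)
print(bin(packed), decode(packed, len(seq)))
```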
BioBricks assembly is too cumbersome.
Agreed. I would really like to be able to email an arbitrarily long DNA sequence to a synthesizer sitting in my lab and have it give me a tube with that DNA in it for very little cost. It would make my Ph.D. go much faster. Unfortunately, we aren't quite there yet (though some might justifiably disagree). So as a stopgap measure, BioBricks assembly allows us to construct systems from parts in a *relatively* cheap, efficient and reproducible manner.
However, in my mind there is a clear difference between the concept behind BioBricks itself (and by BioBricks, I am referring to the parts in the MIT Registry of Standard Biological Parts) and the method used to assemble BioBricks together. There are various ways that one could imagine to improve assembly or even eliminate it altogether by using DNA synthesis instead. Regardless of the technique used to fabricate a system, it is still useful to maintain a library of reusable, well-characterized biological parts from which systems can be made. This is how engineers avoid "reinventing the wheel," so to speak. In my mind, it is this concept (not the particular assembly technique) that is the key idea behind BioBricks. Thus, I think BioBricks and the Registry of Standard Biological Parts will still be useful even if/when long DNA synthesis is easily available.
What's the difference between parts, devices and systems?
What's the deal with PoPS?
All these ideas were formulated by the MIT Synthetic Biology Working Group ... most of them even before I joined. I am just discussing them here to clarify my thoughts and possibly help address others' questions. Feel free to contact me with problems, issues etc.
What is PoPS?
PoPS is an acronym for Polymerases per Second. It represents the rate at which RNA polymerase moves past a given point in the DNA. In some sense, it can be thought of as analogous to current flowing through a particular point in a wire. Previously, the MIT SBWG called this unit TIPS, for transcription initiations per second. However, this name is a bit misleading because there are locations on a piece of DNA where we care about the flow of PoPS even though transcription isn't technically initiating at that point. For instance, it is useful to know the rate at which RNA polymerase moves through a terminator. Such a rate is not really transcription initiations per second. PoPS is probably more appropriate.
Why use PoPS?
PoPS is a common signal carrier for transcription-based devices. (See the registry abstraction hierarchy page for a comparison of more traditional inverter devices in which signals are in protein concentrations versus PoPS based devices ... I won't repeat that discussion here.) Devices which have both an input and output in PoPS are composable. They can be arbitrarily hooked up to one another to create more complex devices and systems. It's only by having composable devices that we can begin to think about characterizing many devices independently and then combining them to create more complex circuits with predictable behavior.
What's the difference between PoPS and transcription rate?
Transcription rate is generally a parameter associated with a particular transcript and has units of transcripts per unit time. In contrast, PoPS is essentially transcription rate at a particular location on the DNA. The value of PoPS just downstream of a coding region should be equal to the coding region's transcription rate. However, there are certain positions at which a transcription rate doesn't necessarily make sense yet PoPS does. For instance, as biological engineers, we care about the rate at which RNA polymerase moves through a terminator (or the PoPS downstream of a terminator), yet most people don't talk about the transcription rate of a terminator.
What are some examples of PoPS devices?
A promoter is an example of a PoPS source (or a battery): it produces a steady output PoPS with no input. A terminator is like a PoPS sink or a connection to ground: it takes a PoPS input and gives no output. A PoPS-based inverter consists of an RBS, repressor coding region, terminator and cognate promoter. It takes an input signal in PoPS and inverts the signal. A high PoPS input leads to repressor expression and promoter binding, thereby producing a low output signal. A low PoPS input leads to little to no repressor expression, and therefore the promoter is free to generate PoPS. An RBS is kind of like a wire: it just permits PoPS signals to pass through it. Similarly, a coding region is also a wire, but it may have some resistance: its output PoPS may be less than its input PoPS.
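The device examples above, and the idea that devices with PoPS in and PoPS out can be hooked together, can be sketched in a few lines of Python. This is a toy model under stated assumptions: the inverter uses a Hill-type repression curve as a stand-in for steady-state behavior, and every name and parameter value (`strength`, `efficiency`, `max_out`, `k`, `n`) is hypothetical rather than a measured characteristic of any real part.

```python
def promoter(strength):
    """PoPS source: emits a steady output regardless of any input."""
    return lambda _pops_in=0.0: strength


def terminator(efficiency=1.0):
    """PoPS sink: passes on only the fraction of PoPS that reads through."""
    return lambda pops_in: pops_in * (1.0 - efficiency)


def inverter(max_out, k, n=2.0):
    """PoPS inverter: high input PoPS gives low output PoPS and vice versa.

    Assumption: a Hill-type repression curve stands in for the
    RBS + repressor + terminator + cognate promoter described in the text.
    """
    def device(pops_in):
        return max_out / (1.0 + (pops_in / k) ** n)
    return device


def chain(*devices):
    """Compose devices: each device's output PoPS feeds the next one's input."""
    def composite(pops_in=0.0):
        for device in devices:
            pops_in = device(pops_in)
        return pops_in
    return composite


# Two inverters in series should behave roughly like a buffer.
buffer = chain(inverter(1.0, 0.1), inverter(1.0, 0.1))
for pops_in in (0.0, 1.0):
    print(pops_in, round(buffer(pops_in), 3))
```

Because every device here takes one number in and returns one number out, `chain` can connect them in any order. That is the composability argument in miniature. Real devices would of course also need matched signal levels and characterization data for this to be predictive.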
But what about devices that propagate signals via other means like phosphorylation?
PoPS is just a common signal carrier, not a universal signal carrier. Other classes of devices may rely on other signal carriers to propagate information. For example, devices implemented in kinases may carry signals in other units. See Samantha Sutton's work on post-translational logic. It is also conceivable that there will be some devices that serve to translate signals from one signal carrier (like PoPS) to another signal carrier. Since biology takes advantage of multiple signal carriers, in all likelihood, so will synthetic biological devices.
Stay tuned for more.
Here are some thoughts that I decided to post. Feedback is welcome.
Diatribe on Escherichia coli strains
One thing that surprises me is how difficult it is to find an Escherichia coli strain that meets a certain set of specifications. For instance, I want a strain with the lactose permease knocked out and the arabinose permease under the control of a constitutive promoter so that I can get linear induction with both lactose and arabinose. I don't think one exists (if it does please email me!). I also can't find a strain that is lacIq and has the lactose permease deleted (again email me if you have this strain!). I find this situation mildly frustrating.
I also find the nomenclature of Escherichia coli genotype information to be unnecessarily confusing but I am willing to let it slide as a historical artifact. (See the attempt to decipher the code.)
However, in my mind what is truly astonishing is the dearth of information available on existing strains and the fact that some of this information is wrong! Case in point: I was interested in using a strain of Escherichia coli with the lacIq mutation. I found various strains that are supposed to have this mutation: D1210, JM109, BW26434. Then, since I was getting some anomalous experimental results, Tom suggested that I verify by sequencing that my strains were lacIq. So I did and, lo and behold, none of my sequences had the lacIq mutation on the genome. Now, based on my anomalous experimental results (which are no longer so anomalous) and reading of some papers, I think that D1210 really is lacIq but that it just has lacIq on the F plasmid rather than on the genome. But JM109 and BW26434 ... or at least the versions that I sequenced ... are not lacIq as documented. I don't understand how people use these strains without having correct genotype information. I also don't understand why, given all the sequencing centers that exist and how many people work on or with Escherichia coli, at least the common lab strains don't get sequenced. Some claim it is a combination of the lack of resources and the fact that this isn't an interesting thing to do. Quite possibly this is true, but nevertheless, I find this situation unbelievable. Anyway, it was these experiences that led me to populate the standard strain page.
Inducible promoters are obnoxious
I am so annoyed at the lack of inducible promoters that provide linear, single cell control of gene expression. I would vent on this more but Kathleen already presented an objective discussion of this topic at Titratable control of pBAD and lac promoters in individual E. coli cells.