Genome refactoring is a means by which biologists can recode, or 'refactor', the genomes of extant organisms so as to make them easier to study and engineer. Current synthetic biology approaches attempt to apply engineering principles in a biological context, but this is rarely as 'cut and dried' as a typical engineering problem such as building a bridge. Biological systems are highly evolved, complex systems with many interacting parts that we do not fully understand. With this in mind, genome refactoring aims to reorganize these systems so that we can both mitigate the complexity and facilitate greater understanding of the whole system. Below are two recent examples of genome refactoring from the literature.
T7 is a common bacteriophage that has been used extensively as a model organism in molecular biology for decades. Much of what we know about the basic systems of genetic regulation was discovered using T7 and similar bacteriophages. Classical genetics and biochemistry first allowed characterization of the elements that make up the T7 genome, which was further characterized by full sequencing of the genome in 1983. However, there are still regions of the genome with no annotated function. To facilitate characterization, and to demonstrate that their design principles could result in a viable phage, Chan et al. set out to completely refactor the genome of T7.
The refactoring was designed with six goals in mind:
1. Define a set of components relevant to the development and viability of the phage, and choose one exact DNA sequence to encode each component's function.
2. The DNA sequence encoding any function would not physically overlap with any other function-encoding DNA sequence.
3. Each function-encoding DNA sequence would only code for that function and no others.
4. Enable precise and independent manipulation of each encoding element.
5. Construct the refactored genome, T7.1.
6. The refactored genome must encode a viable bacteriophage.
The design process to refactor T7 followed a simple algorithm. The first step was to re-annotate the entire genome, defining boundaries for 57 ORFs, with 57 RBSs, that code for 60 proteins. The authors also defined 51 regulatory regions relating to gene expression, replication, and packaging of the genome into the phage. Next, these genetic elements were organized into 73 separate parts, each containing one or more elements. Within a part, the DNA sequences of elements could overlap, but the boundaries of separate parts could not. Parts were then organized into six contiguous sections, alpha through zeta, with boundaries defined by specific, single-cutting restriction endonuclease sites. This allows each section to be manipulated separately from the others and compartmentalizes changes to the genome.
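The rule that separate parts must not overlap can be checked mechanically. The following is a minimal sketch of such a check, not the authors' tooling: the `Part` type, the `find_overlaps` helper, and all coordinates are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from typing import NamedTuple


class Part(NamedTuple):
    name: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def find_overlaps(parts: list[Part]) -> list[tuple[str, str]]:
    """Return pairs of parts whose [start, end) intervals overlap."""
    ordered = sorted(parts, key=lambda p: (p.start, p.end))
    overlaps = []
    for i, left in enumerate(ordered):
        for right in ordered[i + 1:]:
            if right.start >= left.end:
                break  # sorted by start: no later part can overlap `left`
            overlaps.append((left.name, right.name))
    return overlaps


# Hypothetical coordinates, for illustration only (not taken from T7.1).
clean = [Part("part1", 0, 120), Part("part2", 120, 300), Part("part3", 300, 450)]
clashing = [Part("part1", 0, 120), Part("part2", 110, 300)]

print(find_overlaps(clean))
print(find_overlaps(clashing))
```

Elements inside a single part are deliberately not checked here, since the design allows them to overlap.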
Construction and testing
The group constructed only the first two sections, alpha and beta, which together contain the first 32 of 73 parts encoded in the T7.1 genome. The refactored alpha and beta replace the leftmost 11,515 bp of wild-type genome with 12,179 bp of redesigned DNA.
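As a quick check on what refactoring costs in sequence length, the two figures above can be compared directly. This is only arithmetic on the numbers quoted in the text; the variable names are illustrative.

```python
WILD_TYPE_BP = 11_515   # wild-type region replaced (figure from the text)
REFACTORED_BP = 12_179  # redesigned alpha + beta (figure from the text)

added_bp = REFACTORED_BP - WILD_TYPE_BP
percent_increase = 100 * added_bp / WILD_TYPE_BP

print(f"net change: +{added_bp} bp ({percent_increase:.1f}% longer)")
# prints "net change: +664 bp (5.8% longer)"
```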
Building each section took several steps. First, each section was reduced to a 'scaffold' sequence, which is essentially what remains of a section after all the parts it contains are removed. The scaffold retains all restriction sites required to add the parts in subsequent cloning steps; in practice, the alpha and beta scaffolds also had a small number of parts built in. The scaffolds were ordered from a DNA synthesis company (Blue Heron) and cloned into a plasmid to facilitate addition of parts. Construction of alpha was more difficult because the entire scaffold could not be built in one piece, so it was built as four separate fragments that were then combined into the full alpha section. Beta construction was more straightforward.
To build the chimeric refactored phage strains, the refactored sections were ligated to restriction-digested wild-type T7 DNA. The ligation products were transfected into IJ1127 E. coli cells, which can be transformed with T7 DNA and will produce virions. Single plaques were used to inoculate liquid cultures, and DNA isolated from the resulting lysates was tested by restriction digest to identify correct chimeric clones.
The two refactored sections were recombined with wild-type T7 to create three separate chimeric genomes: alpha-WT, beta-WT, and alpha-beta-WT. Upon testing, all three chimeric genomes produced viable phage that were able to infect E. coli. To test whether the parts were intact in the chimeras, restriction digests were performed: 30 of the 32 parts encoded by the alpha and beta designs could be cut out as designed. Sequencing of alpha and beta isolates revealed several differences from the designed sequences. Alpha included single-base deletions in gene 0.4 and in the E. coli terminator TE. Beta differences included single-amino-acid changes in genes 1.8 and 2 and a single-base deletion in gene 2.5. Additionally, there was an 82-base truncation in gene 2.8. The authors attribute all of these differences to construction errors or limitations. Growth assays also revealed differences in lysis times. The alpha, beta, and alpha-beta chimeras showed changes in half-lysis time of +20%, -1.4%, and +22%, respectively, relative to wild-type. Plaque sizes of the chimeras and wild-type were comparable at 30 °C, but at 37 °C the chimeras formed smaller plaques, and formed them more slowly.
Saccharomyces cerevisiae is a popular model organism for studying everything from basic genetics to bioengineering and large-scale production. Its importance also reaches into many industries, notably bread and (most importantly) alcohol. While its simplicity and tractability make it attractive for study, it is still a eukaryotic organism and much remains to be discovered about it. However, large-scale DNA synthesis and the principles of genome refactoring can allow researchers to redesign the genome to further facilitate characterization and manipulation. To that end, Dymond et al. have begun an ambitious project to design and construct a refactored synthetic yeast genome.
In a similar fashion to the T7 design group, the authors defined and followed three design principles throughout the project.
1. The refactored genome should result in near wild-type phenotype and fitness.
2. The refactored genome should lack any destabilizing elements. These include repeats, tRNA genes, and transposons.
3. It should have genetic flexibility in order to facilitate further studies.
Two separate regions of the yeast genome were chosen for design and construction. The authors began their project with the right arm of chromosome IX (IXR). This arm was chosen because it is the smallest in the genome and contains several features of interest. The region chosen began at ORF YIL002W and extended through the centromere and the remainder of IXR. This region is encoded by 89,299 bp of DNA. A tRNA gene and a Ty1 LTR element were removed, along with telomeric regions. Additionally, 43 loxPsym sites were added. The redesigned sequence measured 91,010 bp, slightly longer than the wild-type region. A 30 kb region of the left arm of chromosome VI was also redesigned similarly. This change replaced 15.7% of the chromosome. In all, 17% of the original sequences were changed by base substitutions, deletions, or insertions.
Certain changes were designed into the refactored chromosome arms to facilitate study and future manipulation of the refactored regions. First, all TAG stop codons in the regions were replaced with TAA stops, in order to free a codon for expansion of the genetic code. Second, PCRTags were added to the regions. These tags are pairs of short recoded sequences, unique to the refactored region, that allow rapid verification that the refactored region has been introduced. LoxPsym sites were also added throughout both refactored regions. These are loxP sites that can recombine in either orientation, so that recombination proceeds forward or backward with equal probability. These sites formed the basis for the 'SCRaMbLE' (synthetic chromosome rearrangement and modification by loxP-mediated evolution) system for generating stochastic genetic diversity. LoxPsym sites were introduced 3 bp after the stop codon of every nonessential gene in the refactored sequence and at major chromosomal landmarks: tRNA and LTR deletion sites, the flanks of CEN9, and positions adjacent to telomeres.
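Two of the design rules above, the TAG-to-TAA swap and the placement of loxPsym sites 3 bp after the stop codon of nonessential genes, are simple enough to sketch in code. The example below is an illustration under stated assumptions, not the authors' design software: the function names, the toy sequence, and the ORF coordinates are all hypothetical, and only forward-strand ORFs are handled (reverse-strand genes would need the complementary change).

```python
from __future__ import annotations


def recode_tag_stops(genome: str, orfs: list[tuple[int, int]]) -> str:
    """Swap TAG stop codons for TAA in forward-strand ORFs.

    Each ORF is (start, end) with end exclusive, so its stop codon
    occupies genome[end - 3:end].
    """
    seq = list(genome.upper())
    for _start, end in orfs:
        if "".join(seq[end - 3:end]) == "TAG":
            seq[end - 1] = "A"  # TAG -> TAA: only the third base changes
    return "".join(seq)


def loxpsym_insertion_points(
    orfs: list[tuple[int, int]], nonessential: set[int], offset: int = 3
) -> list[int]:
    """Positions `offset` bp past the stop codon of each nonessential ORF.

    `nonessential` holds indices into `orfs`.
    """
    return [end + offset for i, (_start, end) in enumerate(orfs) if i in nonessential]


# Toy sequence and coordinates, invented for illustration.
genome = "ATGAAATAG" + "CCCGGG" + "ATGCCCTAA" + "CC"
orfs = [(0, 9), (15, 24)]          # first ORF ends in TAG, second in TAA
recoded = recode_tag_stops(genome, orfs)
sites = loxpsym_insertion_points(orfs, nonessential={0})

print(recoded)
print(sites)
```

Because TAG and TAA differ only in the third base, the swap leaves the sequence length and all downstream coordinates unchanged, which is why the insertion points can be computed from the original ORF coordinates.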
Owing to the genetic tractability of yeast, the two separate refactored arms were introduced in different ways, as described below.
To construct arm-swap strains, synIXR was cloned as a circular BAC and transformed into a diploid yeast strain. Transformation of full-length synIXR succeeded in 10-15% of cases, as judged by PCRTag amplification. A truncation construct was then transformed into these strains so that one copy of wild-type IXR would be truncated. This strain was sporulated to produce haploid strains, and clones were obtained that contained the proper arrangement of truncated chromosomal IXR and the synIXR BAC. Ten correct strains were obtained, along with four strains that had chimeric BACs due to gene conversion during sporulation. Sequencing confirmed that synIXR contained no mutations or structural aberrations.
synVIL was constructed using a linear transformation strategy. Linear fragments carrying a selection marker and a short stretch of wild-type sequence at one end were transformed into the desired yeast strain. Recombination at the wild-type terminus of the introduced fragment then replaces the wild-type region with the synthetic region. With this method, 10 out of 12 tested clones carried all PCRTag sites. This method differs from the synIXR method in that it physically replaces the chromosomal region rather than encoding the synthetic sequence on an extra-chromosomal BAC. This is simpler and more amenable to further modification. For example, the use of alternating markers can facilitate the systematic, step-wise replacement of an entire chromosome with synthetically constructed sequence.
Testing of several aspects of the synthetic strains was undertaken in order to assess any changes introduced by synthetic replacement.
To satisfy the first design principle, the synthetic strains must retain wild-type or near-wild-type growth and fitness. To assess this, the authors measured colony size and morphology of synIXR strains under six different growth conditions and also performed transcript profiling. The authors were not able to distinguish the synthetic strains from wild-type in these studies, and fitness measurements of synVIL gave similar results. Transcript profiling revealed few changes, all of which were predictable. These included increased (approximately doubled) expression of genes present in two copies in synIXR, and increased expression of two genes normally located near the telomeres, which may be released from silencing when carried on the BAC. Genome-wide transcript measurements showed no significant compensatory changes outside the synthetic regions.
The SCRaMbLE system was designed to be active only when Cre recombinase is supplied and induced, and to have no effect before then. To achieve this, an engineered Cre was introduced, fused to a murine oestrogen-binding domain. This fusion is activated by oestradiol and has very low basal activity. Importantly, its expression is controlled by a daughter-cell-specific promoter. Recombination therefore occurs only in the presence of the activating compound, and only in daughter cells after division, so that each recombined daughter cell founds a clonal recombined strain. To test the system, Cre was activated in the synthetic strains by adding oestradiol. SynIXR-derived strains showed an average 100-fold drop in viability, owing to the presence of several essential genes in the region. SCRaMbLEd synIXR strains showed much greater growth-rate variability than wild-type control strains and exhibited many different phenotypes. In contrast, synVIL strains showed no effect on viability, because this segment contains no essential genes and only five loxPsym sites.
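The link between essential genes and the drop in viability can be illustrated with a toy simulation. This is a deliberately simplified sketch, not a model from the paper: it assumes a loxPsym site at every segment boundary, treats deletion and inversion as equally likely, applies a fixed number of events per cell, and calls a cell inviable if any essential segment is lost. The `scramble` function and all parameter values are hypothetical.

```python
from __future__ import annotations

import random


def scramble(
    segments: list[int], essential: set[int], n_events: int, rng: random.Random
) -> list[int] | None:
    """Apply random deletion or inversion events between pairs of sites.

    Segments are signed integers; a negative sign marks inverted orientation.
    Returns the rearranged list, or None if an essential segment was deleted.
    """
    current = list(segments)
    for _ in range(n_events):
        if not current:
            break
        # Sites sit at boundaries 0..len(current); pick two distinct ones.
        i, j = sorted(rng.sample(range(len(current) + 1), 2))
        if rng.random() < 0.5:
            del current[i:j]  # deletion of the segments between the sites
        else:
            # inversion: reverse the order and flip each orientation
            current[i:j] = [-s for s in reversed(current[i:j])]
    if not essential.issubset({abs(s) for s in current}):
        return None
    return current


rng = random.Random(0)          # fixed seed so runs are repeatable
segments = list(range(1, 11))   # ten hypothetical segments
essential = {3, 7}              # hypothetical essential segments
outcomes = [scramble(segments, essential, n_events=3, rng=rng) for _ in range(1000)]
viable = [o for o in outcomes if o is not None]

print(f"{len(viable)} of {len(outcomes)} simulated cells keep every essential segment")
```

Setting `essential` to an empty set makes every simulated cell viable, which mirrors qualitatively why a region without essential genes would show no viability effect.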
- Chan LY, Kosuri S, and Endy D. pmid:16729053.
- Dymond JS, Richardson SM, Coombes CE, Babatz T, Muller H, Annaluru N, Blake WJ, Schwerzmann JW, Dai J, Lindstrom DL, Boeke AC, Gottschling DE, Chandrasegaran S, Bader JS, and Boeke JD. pmid:21918511.
- Hoess RH, Wierzbicki A, and Abremski K. pmid:3457367.
- Lindstrom DL and Gottschling DE. pmid:19652178.