Yeast rebuild

From OpenWetWare

The big picture

  • We can now synthesize relatively long stretches of DNA
  • This makes it possible to consider rebuilding existing (small) chromosomes, or building new chromosomes, from scratch. For example, chromosomes I, III and VI in S. cerevisiae are < 350kb long, and are candidates for being rebuilt

This leads to the $64,000 question: If you were to rebuild a yeast chromosome, what changes would you make to it? Or, if you were to build a new yeast chromosome, what would you put on it?

If you're interested in understanding large-scale chromosome structure and its effects, there are (at least) a couple of different overarching goals that can drive how to answer this question:

  • The Science goal: Investigate how chromosomes are currently organized, and the importance of various elements of their organization.
  • The Engineering goal: Investigate how to build a chromosome with a particular set of capabilities, like a low overall recombination rate, that are independent of the actual genes on the chromosome.

Chromosome/genome organization in S. cerevisiae

List of genomic elements

Essential elements of chromosomes

  • Centromeres, origins of replication, telomeres
  • A length of at least 55kb for mitotic stability and some control of copy number. (Based on a brief skim of the paper; need to re-read this more carefully) (Murray and Szostak, '83)

Gene order and distribution

  • Overall: gene order and distribution aren't random. Good overview paper: Pal and Hurst, '04
  • Genes involved in the same metabolic pathway (as defined by KEGG) tend to "cluster" on chromosomes, where "cluster" means "large region of chromosome with high concentration of pathway members, although non-members may also be present". 98% of metabolic pathways in S. cerevisiae exhibit this kind of clustering, after controlling for tandem duplicates (Lee and Sonnhammer, '03)
  • Genes that are controlled by the same sequence-specific transcription factor tend to be regularly spaced along chromosome arms. Different periods are observed for different chromosome arms. Regularities are consistent with a genome-wide loop model of chromosomes, in which co-regulated genes dynamically co-localize in 3D. (Kepes, '03)
  • Adjacent pairs of genes show correlated expression independent of their origin. Correlated triplets, but not quadruplets, were also found more often than expected by chance. Correlation maps also revealed regularly-spaced groups of correlated genes along chromosomes that might be indicative of higher-order chromosome structure. (Cohen et al, '00)
  • A statistically significant fraction of genes coding for subunits of stable complexes are located within 10-30kb of each other. This clustering may ensure better coregulation and maintain the right stoichiometry of complexes upon duplication of chromosomal segments (Teichmann and Veitia, '04)
  • Gene orientation (i.e. whether a gene is on the plus or minus strand) can be modeled by a first-order Markov model, i.e. the orientation of a gene depends on the orientation of the gene that precedes it. (Note: Transition probabilities for yeast are pretty close to 0.5, i.e. close to a random coin-flipping model, but the authors claim that the coin-flip model is statistically improbable; I can’t really judge their statistics, but I still don’t put much trust in this model.) (Simons and Morton, '03)
  • Essential genes in yeast are clustered, independent of co-expression and tandem duplication. Clusters of essential genes are in regions of low recombination and larger clusters have lower recombination rates. (Pal and Hurst, '03)
  • There is a negative correlation between chromosome length and G+C content at (silent) third codon positions (GC3s) of ORFs. Chromosome III is abnormal in that it has strong clustering of GC3s; this could be because it contains the mating-type loci, so there’s selective pressure to keep mating-type switching an intrachromosomal reaction and thus to keep most of the chromosome (between HML and HMR) intact, leading to less structural disruption than on other chromosomes (which preserves existing clusters?) (Bradnam et al, '99)
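
To make the first-order Markov model of gene orientation concrete, here is a minimal sketch of how its transition probabilities could be estimated from the strand order of genes along a chromosome. The function name and the strand list are made up for illustration; this is not the procedure or data from Simons and Morton.

```python
from collections import Counter

def orientation_transitions(strands):
    """Estimate P(next strand | previous strand) from an ordered list of '+'/'-'."""
    # Count every adjacent (previous, next) pair along the chromosome.
    counts = Counter(zip(strands, strands[1:]))
    probs = {}
    for prev in "+-":
        total = counts[(prev, "+")] + counts[(prev, "-")]
        for nxt in "+-":
            # NaN if this previous-strand state was never observed.
            probs[(prev, nxt)] = counts[(prev, nxt)] / total if total else float("nan")
    return probs

# Hypothetical strand order for ten adjacent genes (made-up data).
strands = list("++-+--++-+")
probs = orientation_transitions(strands)
for (prev, nxt), p in sorted(probs.items()):
    print(f"P({nxt} | {prev}) = {p:.2f}")
```

Under a pure coin-flip model all four probabilities would be close to 0.5, so the interesting question for real data is whether the estimated values differ from 0.5 by more than sampling noise.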

Recombination frequency

  • There are hotspots and coldspots of meiotic recombination on every S. cerevisiae chromosome. Hotspots tend to cluster around regions with high G+C content, whereas coldspots are nonrandomly associated with centromeres and telomeres. Hotspots are also enriched near genes involved in metabolic pathways and ionic homeostasis; coldspots are over-represented near ORFs involved in transport facilitation and intracellular transport. Some types of hotspots require transcription factor binding in order to become active. Hotspots tend to be in intergenic regions. (Gerton et al, '00)
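
Since hotspots are associated with G+C-rich regions, a first step when designing a chromosome might be to profile G+C content along the sequence. The sketch below computes it in fixed windows. The function name, the window size and the toy sequence are assumptions for illustration, and a G+C-rich window is only correlated with hotspots, not a prediction of one.

```python
def gc_windows(seq, window, step):
    """Return (start, G+C fraction) for each full window along seq."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = sum(base in "GC" for base in chunk) / window
        profile.append((start, gc))
    return profile

# Toy sequence (made up); real use would take a chromosome sequence
# and a window of a few kb.
toy = "ATATATGCGCGCATAT"
profile = gc_windows(toy, window=4, step=4)
for start, gc in profile:
    print(start, gc)
```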

Transcription factor binding sites

  • Paper with lots o' data: Harbison et al, '04
  • Lots of high-scoring transcription factor binding sites in ORFs, some of which are actually bound to in vivo (but with lower average binding strength than sites in intergenic regions). (My 7.90 class project)
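
As a rough idea of what "high-scoring" means here, the sketch below scores every window of a sequence against a position weight matrix using log-odds against a uniform background. The matrix, the threshold and the sequence are all made up for illustration; this is not the scoring scheme used in the class project or in Harbison et al.

```python
import math

# Hypothetical 4-bp motif with consensus TGAC (probabilities are made up).
PWM = [
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]
BACKGROUND = 0.25  # assumed uniform base composition

def log_odds(site):
    """Sum of log2(motif probability / background) over the positions of site."""
    return sum(math.log2(col[base] / BACKGROUND) for col, base in zip(PWM, site))

def scan(seq, threshold):
    """Return (position, site, score) for windows scoring at least threshold.

    Assumes seq contains only uppercase A, C, G, T.
    """
    k = len(PWM)
    hits = []
    for i in range(len(seq) - k + 1):
        site = seq[i:i + k]
        score = log_odds(site)
        if score >= threshold:
            hits.append((i, site, score))
    return hits

hits = scan("AATGACTT", threshold=4.0)
print(hits)
```

Running the same scan separately over ORF and intergenic sequence would give the two sets of candidate sites that the bullet above compares; whether a site is actually bound in vivo still has to come from binding data.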

Chromosome replication

  • Survey paper: MacAlpine and Bell, '05
  • Autonomously replicating sequences (ARSs) are about 200 bp long and contain an ARS consensus sequence (ACS) that's ~11bp long. Sequence flanking the ACS is essential, but there are no obvious sequence similarities between flanking sequences in different ARSs.
  • There are 200-400 ARSs in the yeast genome (i.e. they occur every 30-40 kbp), but not all function as origins of replication.
  • For a given cell type under a given growth condition, each part of the genome replicates at a characteristic time within S phase.
  • Activation timing of each origin is related to its chromosomal position; origins near centromeres are activated earlier, origins near telomeres are activated later than other origins.
  • No correlation between steady-state transcription level of a gene and establishment/activation of an origin near the gene.
  • Pre-RC (complex that assembles at origins before replication) primarily assembles at pro-ARSs (ie possible origins of replication) in intergenic regions. However, significantly fewer pro-ARSs occur in intergenic sequences flanked by diverging transcripts than would be expected.
  • Bottom line:
  1. Still can't predict which ARSs will actually function as origins of replication, and the timing of their activation.
  2. Mechanism responsible for establishing the conserved, characteristic pattern of replication across the genome is still unknown.
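
For a rebuilt chromosome it may still be useful to list candidate ACS matches, even though a match does not predict origin activity. The sketch below scans both strands for the commonly cited 11-bp consensus WTTTAYRTTTW; that consensus string is an assumption brought in from the general literature, not taken from these notes, and the function names and test sequence are made up.

```python
import re

# Commonly cited ACS consensus in IUPAC codes (assumption, see lead-in).
ACS_IUPAC = "WTTTAYRTTTW"
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "[AT]", "Y": "[CT]", "R": "[AG]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# A lookahead lets overlapping matches be reported as well.
ACS_RE = re.compile("(?=(" + "".join(IUPAC[c] for c in ACS_IUPAC) + "))")

def revcomp(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def find_acs(seq):
    """Return sorted (start, strand) pairs; start is a 0-based forward-strand coordinate."""
    seq = seq.upper()
    k = len(ACS_IUPAC)
    hits = [(m.start(), "+") for m in ACS_RE.finditer(seq)]
    # Matches on the reverse strand are mapped back to forward coordinates.
    hits += [(len(seq) - m.start() - k, "-") for m in ACS_RE.finditer(revcomp(seq))]
    return sorted(hits)

toy = "GGGATTTATATTTACCC"  # made-up sequence with one ACS match
print(find_acs(toy))
print(find_acs(revcomp(toy)))
```

Because the flanking sequence is essential but shows no obvious conservation, hits from a scan like this are only candidates (pro-ARSs at best), not functional origins.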

Chromatin structure

(Random notes, not yet organized)

  • Histone variants: H2A.Z is associated with transcribed regions; inhibits repressive chromatin structures. CENP-A histone variant is associated with nucleosomes that include centromeric DNA.
  • Nucleosome remodeling complexes can be targeted to DNA by interaction with DNA-bound transcription factors. Alternatively, binding of some TFs to DNA is incompatible with association of the same DNA with a histone octamer. Since a nucleosome needs about 147bp of DNA to form, if two such TFs bind < 147bp apart, the DNA between them can't assemble into a nucleosome.
  • Nucleosomes assemble preferentially on A:T-rich DNA when minor groove faces the histone octamer, G:C-rich DNA when major groove faces octamer. Sequences that alternate between A:T-rich and G:C-rich stretches with a periodicity of ~5bp act as preferred nucleosome binding sites.
  • Modification of N-terminal tails of histones alters chromatin accessibility. Acetylated nucleosomes are typically associated with transcriptionally active chromatin, deacetylated nucleosomes with transcriptionally inactive chromatin. Methylation can have either effect, depending on the particular amino acid that is methylated. Histone methylation was long thought to be irreversible, but histone demethylases (e.g. LSD1 and the JmjC-domain family) have since been identified.
  • Proteins with bromodomains interact with acetylated histone tails, proteins with chromodomains with methylated histone tails. Bromo/chromodomain-containing proteins are often associated with acetylases/methylases and can thus participate in a positive feedback loop.
  • During DNA replication, H3:H4 tetramers are either transferred wholesale to the new strand or retained on the old strand. H2A:H2B dimers are released into soluble pool and then reassociate with the old and new strands.
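
The spacing rule for nucleosome-excluding TFs noted above can be sketched as a simple interval check: given the bound sites, report the gaps between neighbouring sites that are too short to hold a nucleosome. The function name, the half-open coordinate convention and the example sites are assumptions for illustration, and the sketch ignores overlapping sites and everything else that affects nucleosome positioning.

```python
NUCLEOSOME_BP = 147  # approximate DNA length needed for one nucleosome

def nucleosome_free_gaps(sites, min_len=NUCLEOSOME_BP):
    """Return gaps between consecutive bound sites that are shorter than min_len.

    sites: list of (start, end) half-open intervals of bound TF sites.
    """
    ordered = sorted(sites)
    gaps = []
    for (_, end1), (start2, _) in zip(ordered, ordered[1:]):
        gap = start2 - end1
        if 0 < gap < min_len:
            gaps.append((end1, start2))
    return gaps

# Hypothetical bound sites (coordinates are made up).
sites = [(100, 110), (200, 212), (500, 510)]
print(nucleosome_free_gaps(sites))
```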

Semi-random papers