In general, we take an integrative approach, combining genome-wide transcription factor binding, gene expression perturbation using genetic manipulations, comparative genomics, and physiological approaches to understand on a systems-wide basis how a tissue is defined. We are particularly interested in understanding how a phenotype such as cell-type-specific transcription can be maintained in the face of genomic changes driven by evolution or cancer.
Transcription and transcriptional regulatory evolution in mammals
Recent results suggest that few transcription factor-DNA interactions are evolutionarily maintained in mammals, yet most evidence suggests that the gene expression programs of particular tissues are highly conserved. My laboratory, in collaboration with a number of other laboratories, continues to explore the regulatory mechanisms that can maintain specific transcriptional programs in spite of genetic evolutionary drift and subsequent divergence of transcription factor binding in vivo.
Determinants of tissue-specific transcriptional regulation
Sets of conserved transcription factors are responsible for conserved tissue-specific transcription, yet transcription factor binding events diverge rapidly between closely related species. To decouple the distinct molecular mechanisms that direct transcription factor binding and gene expression, we investigated tissue-specific transcriptional regulation in a mouse containing human chromosome 21 (the Tc1 mouse). Gene expression and transcription initiation occur at similar syntenic genes in hepatocytes from humans, wild-type mice, and Tc1 mice; however, the transcription initiation occurring in other genomic regions is specified by species-specific genetic sequences. Characterization of transcription factor binding in the Tc1 mice reveals that tissue-specific transcriptional regulation is directed almost exclusively by species-specific genetic sequences. Divergent patterns of transcriptional regulation coded in genetic sequence can thus be transplanted between species to recapitulate conserved transcription in homologous tissues.
Origin and impact of CTCF binding in mammals
CCCTC-binding factor (CTCF) is a DNA-binding protein that can divide transcriptional and chromatin domains, help direct the location of cohesin, and orchestrate global enhancer-promoter looping. We are experimentally analyzing CTCF binding in tissues from many species, spanning a cross section of mammalian orders, to identify highly conserved and lineage-specific CTCF binding events. We and other labs have found that a major mechanism of CTCF binding evolution is carriage via SINE repeats. Newborn CTCF binding events often serve as both chromatin and gene expression barriers. Remarkably, we have already discovered that fossilized repeat elements surround more than a hundred deeply shared CTCF binding events, indicating they originated from similar repeat-driven expansions in a common mammalian ancestor hundreds of millions of years ago. Repeat-driven dispersal of CTCF binding is a fundamental, ancient, and still highly active mechanism of genome evolution in mammalian lineages.
Evolution of polymerase activity
The three major mammalian polymerases drive expression of most of a cell's content, and bind the genome in a highly tissue- and species-specific manner. For instance, RNA polymerase III (pol III) transcription of transfer RNA (tRNA) genes is essential for generating the tRNA adapter molecules that link genetic sequence and protein translation. We have mapped pol III occupancy genome-wide in the livers of mouse, rat, human, macaque, dog and opossum, and found that pol III binding to individual tRNA genes varies substantially in strength and location. However, once tRNA redundancies are taken into account by grouping pol III occupancy into 46 anticodon isoacceptor families or 21 amino acid-based isotype classes, occupancy is strongly conserved. Similarly, pol III occupancy of amino-acid isotypes is almost invariant among transcriptionally and evolutionarily diverse tissues in mouse. Thus, synthesis of functional tRNA isotypes has been highly constrained, though the usage of individual tRNA genes has evolved rapidly. The evolution of other polymerases is under active investigation, as is the relationship among the polymerases at genomic regions that they may transcribe in common.
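The grouping step described above can be illustrated with a minimal Python sketch. It is not the analysis pipeline used in the study: the gene names, signal values, and the `aggregate_occupancy` helper are all hypothetical and chosen only to show how per-gene pol III occupancy can differ between two species while the totals per isotype class stay the same.

```python
from collections import defaultdict

def aggregate_occupancy(genes, key):
    """Sum per-gene occupancy by `key` ("anticodon" or "isotype")
    and return each group's fraction of the total signal."""
    totals = defaultdict(float)
    for gene in genes:
        totals[gene[key]] += gene["signal"]
    grand_total = sum(totals.values())
    return {group: value / grand_total for group, value in totals.items()}

# Invented toy values (NOT measurements): two species use different
# individual alanine tRNA genes, but the same overall amount of alanine tRNA.
species_a = [
    {"gene": "Ala-AGC-1", "isotype": "Ala", "anticodon": "AGC", "signal": 8.0},
    {"gene": "Ala-AGC-2", "isotype": "Ala", "anticodon": "AGC", "signal": 2.0},
    {"gene": "Gly-GCC-1", "isotype": "Gly", "anticodon": "GCC", "signal": 10.0},
]
species_b = [
    {"gene": "Ala-AGC-1", "isotype": "Ala", "anticodon": "AGC", "signal": 1.0},
    {"gene": "Ala-AGC-2", "isotype": "Ala", "anticodon": "AGC", "signal": 9.0},
    {"gene": "Gly-GCC-1", "isotype": "Gly", "anticodon": "GCC", "signal": 10.0},
]

isotypes_a = aggregate_occupancy(species_a, "isotype")
isotypes_b = aggregate_occupancy(species_b, "isotype")
print(isotypes_a)
print(isotypes_b)
```

In this toy case the per-gene signals differ sharply between the two species, yet the isotype-level fractions are identical. That is the pattern the text describes for real pol III occupancy data, here reproduced only by construction.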