This item is under embargo and not available online per the author's request. For access information, please visit


Transcriptional Gene Regulation in Microbes: A Game of Positioning and Induction

Date of Award

Spring 5-15-2015

Author's School

School of Engineering & Applied Science

Author's Department

Biomedical Engineering

Degree Name

Doctor of Philosophy (PhD)

Degree Type



Knowing the specificity of transcription factors is critical to understanding regulatory networks in cells. The latest high-throughput sequencing technology makes it possible to quantify a transcription factor's specificity profile in a principled way, with both high accuracy and high throughput; we named this method Spec-seq. It has been successfully applied to two distinct protein families: the LacI family in prokaryotes and the glucocorticoid receptor in mammalian systems.

The lac repressor-operator system has been studied for many decades, but not with high-throughput methods capable of determining specificity comprehensively. Details of its binding interaction and its selection of an asymmetric binding site have been controversial. I therefore employed Spec-seq to accurately determine relative binding affinities to thousands of sequences simultaneously, which requires only sequencing of the bound and unbound fractions. An analysis of 2560 different DNA sequence variants, including both base changes and variations in operator length, provides a detailed view of lac repressor sequence specificity. The protein binds with nearly equal affinity to operators of three different lengths, but its sequence preference changes with the length, demonstrating alternative modes of interaction between the protein and DNA. The wild-type operator has an odd length, which causes the two monomers to bind in alternative modes and makes the asymmetric operator the preferred binding site.
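As an illustration of how bound and unbound read counts can yield relative affinities, here is a minimal sketch. It assumes that the affinity of each variant relative to a reference sequence is estimated as the ratio of its bound/unbound read counts divided by the same ratio for the reference, and that the relative binding energy (in units of kT) is the negative natural log of that value. The function name, sequence labels, and read counts are hypothetical and are not taken from the dissertation.

```python
import math

def relative_affinities(bound, unbound, reference):
    """Estimate relative affinity and relative binding energy per sequence.

    bound, unbound: dicts mapping a sequence label to its read count in the
    bound and unbound fractions (hypothetical inputs).
    reference: label of the sequence that all others are compared to.
    Returns a dict: label -> (relative affinity, relative energy in kT).
    """
    ref_ratio = bound[reference] / unbound[reference]
    result = {}
    for seq in bound:
        ratio = bound[seq] / unbound[seq]
        rel_affinity = ratio / ref_ratio
        # Weaker binders have rel_affinity < 1 and therefore positive energy.
        result[seq] = (rel_affinity, -math.log(rel_affinity))
    return result

# Made-up read counts, for illustration only.
bound = {"REF": 1000, "VAR1": 500, "VAR2": 100}
unbound = {"REF": 1000, "VAR1": 1000, "VAR2": 1000}
table = relative_affinities(bound, unbound, "REF")
```

Because only ratios of counts enter the estimate, differences in sequencing depth between the two fractions cancel when each variant is compared to the reference.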

However, two other LacI/GalR protein family members, PurR and YcjW, cannot bind their operators with variable spacing. A further comparison with known and predicted motifs suggests that lac repressor may be unique in this ability. I therefore used site-directed mutagenesis to build a series of lacI and purR mutants and tested their specificity profiles. The YQ recognition residues combined with the hinge helix loop of lac repressor are necessary and sufficient for this unique structural flexibility, even though this flexibility does not necessarily imply alternative motif recognition.

In addition to studying its structural flexibility, I used lac repressor as a model system to investigate how specificity changes under different salt concentrations and temperatures. Although ionic strength is known to modulate a protein's binding affinity for its target, it has limited impact on specificity, at least in the case of lac repressor. Moreover, Spec-seq can successfully quantify the change in specificity at different temperatures.

Given lac repressor's detailed binding specificity and other prior knowledge, it is possible to build a quantitative model of how the protein is positioned onto the lac operator. In a simplified model, high specific binding energy and an appropriate TF copy number, combined with DNA looping, are sufficient for this task, and the approach could be generalized to other gene regulatory systems.
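To make the roles of the three ingredients concrete, the following toy calculation sketches a generic two-state thermodynamic occupancy model. It is not the model from the dissertation. It assumes that the statistical weight of the bound state scales with TF copy number, with the Boltzmann factor of the specific binding energy (in kT, relative to nonspecific DNA), and inversely with the number of competing nonspecific sites, and that DNA looping acts as a simple multiplicative boost to that weight. All names and parameter values are hypothetical.

```python
import math

def occupancy(copy_number, binding_energy_kt, nonspecific_sites, loop_boost=0.0):
    """Probability that the operator is occupied in a two-state toy model.

    copy_number: TF molecules per cell (assumed value).
    binding_energy_kt: specific binding energy relative to nonspecific DNA,
        in units of kT; more negative means tighter binding.
    nonspecific_sites: number of competing nonspecific sites (assumed value).
    loop_boost: hypothetical extra weight contributed by DNA looping.
    """
    weight = (copy_number / nonspecific_sites) * math.exp(-binding_energy_kt)
    weight *= 1.0 + loop_boost
    return weight / (1.0 + weight)

# Illustrative numbers only: 10 copies, -15 kT, 5 million nonspecific sites.
no_loop = occupancy(10, -15.0, 5e6)
with_loop = occupancy(10, -15.0, 5e6, loop_boost=10.0)
```

In this sketch, tighter specific binding, more TF copies, or a larger looping boost each raise the occupancy, which is the qualitative behavior the simplified model in the abstract relies on.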


English (en)


Gary Dean Stormo

Committee Members

Jianmin Cui, Rohit Pappu, Rob Mitra, Gautam Dantas, Ting Wang


Permanent URL:
