Date of Award

Summer 8-15-2021

Author's School

Graduate School of Arts and Sciences

Author's Department

Biology & Biomedical Sciences (Computational & Systems Biology)

Degree Name

Doctor of Philosophy (PhD)

Degree Type



Regulation of transcription factor (TF) binding specificity lies at the heart of transcriptional control, which governs how cells divide, differentiate, and respond to their environments. TFs bind DNA in a sequence-specific manner, and the short sequence a TF recognizes is known as a transcription factor binding site (TFBS). However, regions bound by a TF in vivo do not always contain a TFBS, and regulatory regions often contain many non-functional TFBSs that have binding potential but remain unbound by a given TF. This dissertation focuses on understanding the principles of TF binding specificity and is divided into two chapters: 1) developing a novel high-throughput method to facilitate the study of TF binding regulation and its functional output; 2) analyzing the role of the local DNA context around a TFBS in specifying TF localization.
In the first chapter of this dissertation, we report a tool, Calling Cards Reporter Arrays (CCRA), that measures TF binding and its consequences for gene expression at hundreds of synthetic promoters in yeast. Using Cbf1p and MAX, we demonstrate that the CCRA method can detect small changes in binding free energy with a sensitivity comparable to in vitro methods, enabling the measurement of energy landscapes in vivo. We then demonstrate the quantitative analysis of cooperative interactions by measuring Cbf1p binding at synthetic promoters with multiple sites. We find that the cooperativity between Cbf1p dimers varies sinusoidally with a period of 10.65 bp and an energetic cost of 1.37 k_BT for sites that are positioned “out of phase”. Finally, we characterize the binding and expression of a group of TFs, Tye7p, Gcr1p, and Gcr2p, that act together as a “TF collective”, an important but poorly characterized model of TF cooperativity. We demonstrate that Tye7p often binds promoters without its recognition site because it is recruited by other collective members, whereas these other members require their recognition sites, suggesting a hierarchy in which these factors recruit Tye7p but not vice versa. Our experiments establish CCRA as a useful tool for quantitative investigations into TF binding and function.
In the second chapter of this dissertation, we set out to investigate whether predictive information is embedded in the local DNA context (LDC) for a large collection of TFs in Saccharomyces cerevisiae. We identify a general preference for TFs to bind at CG-rich sequences; we then analyze whether this preference is linked to intrinsic nucleosome binding preference and find that the CG preference in LDC for TF binding is independent of nucleosome regulation. We next examine two possible mechanisms by which LDC could influence TF binding site selection: recruiting ‘licensing’ factors or kinetically assisting the TF search for a target site. We show that high-CG LDC is also preferred by TFs in vitro, which suggests that the preference involves only TFs and DNA and points to a TF search kinetics mechanism. A CG-rich LDC may act as an energetic funnel that helps a TF recognize a target binding site, and we verify the theoretical validity of this hypothesis with Gillespie simulation. Finally, we show that the CG preference is also present in a large group of human TFs, indicating that the use of LDC is a general mechanism for TF binding specificity.
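
The energetic-funnel idea tested by Gillespie simulation can be illustrated with a minimal sketch. This is not the model used in the dissertation: the 1D lattice, the linear funnel shape, the hop-rate form, and every name and parameter below (`funnel_landscape`, `hop_rate`, `depth`, `width`, `k0`) are assumptions chosen only for illustration. A single TF hops between neighboring DNA positions, and the simulation records the time it takes to first reach the target site.

```python
import math
import random


def funnel_landscape(n_sites, target, depth, width):
    """Hypothetical energy landscape in units of k_BT.

    Energy is 0 far from the target and slopes linearly down to -depth
    at the target over `width` sites on each side (width must be >= 1).
    Setting depth=0 gives a flat landscape with no funnel.
    """
    return [-depth * max(0.0, 1.0 - abs(i - target) / width)
            for i in range(n_sites)]


def hop_rate(e_from, e_to, k0=1.0):
    """Assumed hop rate between neighboring sites.

    The symmetric exponential form satisfies detailed balance:
    rate(a -> b) / rate(b -> a) = exp(-(E_b - E_a)).
    """
    return k0 * math.exp(-(e_to - e_from) / 2.0)


def first_passage_time(energies, start, target, rng):
    """One Gillespie trajectory; returns the time to first reach `target`.

    Requires at least two sites. Lattice ends are reflecting.
    """
    pos, t = start, 0.0
    last = len(energies) - 1
    while pos != target:
        # Rates of the two possible events (zero at a reflecting boundary).
        left = hop_rate(energies[pos], energies[pos - 1]) if pos > 0 else 0.0
        right = hop_rate(energies[pos], energies[pos + 1]) if pos < last else 0.0
        total = left + right
        # Gillespie step: exponential waiting time with the total rate,
        # then pick one event with probability proportional to its rate.
        t += rng.expovariate(total)
        pos += -1 if rng.random() * total < left else 1
    return t


def mean_first_passage_time(energies, start, target, n_runs=200, seed=0):
    """Average first-passage time over independent trajectories."""
    rng = random.Random(seed)
    times = [first_passage_time(energies, start, target, rng)
             for _ in range(n_runs)]
    return sum(times) / n_runs
```

One way to use the sketch is to call `mean_first_passage_time` on a flat landscape (`depth=0`) and on a funnel landscape with the same `start` and `target`, then compare the two averages. In this toy model a downhill slope makes hops toward the target more likely than hops away from it, which is the sense in which a CG-rich context could kinetically assist the search.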


English (en)

Chair and Committee

Robi R. Mitra

Committee Members

Barak B. Cohen, Gary G. Stormo, Douglas D. Chalker, Alex A. Holehouse