ORCID

https://orcid.org/0009-0007-1288-3700

Date of Award

Spring 5-2025

Author's School

McKelvey School of Engineering

Author's Department

Computer Science & Engineering

Degree Name

Master of Science (MS)

Degree Type

Thesis

Abstract

A fundamental issue in mapping regulatory networks between transcription factors (TFs) and their target genes is the poor overlap between the set of genes bound by a given TF and the set of genes that are differentially expressed after that TF is knocked out or overexpressed. We began with the hypothesis that, to predict whether a gene will respond to perturbation of a TF, it is important to consider not only whether that TF is bound at the gene’s promoter, but also whether other TFs bind at the same promoter.

In this work, we propose a novel modeling procedure to better predict gene expression changes in Saccharomyces cerevisiae following TF overexpression by considering both the binding data of the perturbed TF and pairwise TF-TF interactions.

Using binding data from Calling Cards experiments and perturbation response data from the McIsaac ZEV overexpression dataset, we created 101 models, one per perturbed TF, for predicting which genes will respond to perturbation of that TF. The input features for these models, which are linear in their parameters, include the binding profile of the perturbed TF and 119 interaction terms between the perturbed TF and each of the other TFs. A three-step pipeline employing bootstrapping and nested cross-validated LASSO modeling was used to identify high-confidence predictors that affect the perturbation response in a consistent direction.
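The general shape of such a model can be illustrated with a minimal sketch. It is not the thesis pipeline: the abstract does not specify the three steps, so everything here is an assumption made for illustration. The data are synthetic, the TF names (`TF0`, `TF1`, ...) are hypothetical, a fixed LASSO penalty `alpha` stands in for the nested cross-validation, and the 95% sign-consistency threshold `min_frac` is an arbitrary choice. The sketch builds one main-effect feature plus one product term per other TF, fits a LASSO on bootstrap resamples of the genes, and keeps features whose coefficient has the same sign in nearly every resample.

```python
import random

def design_row(binding, perturbed, others):
    """Main effect of the perturbed TF plus one product term per other TF."""
    b_p = binding[perturbed]
    return [b_p] + [b_p * binding[j] for j in others]

def soft_threshold(value, penalty):
    """Shrink a value toward zero by the penalty (the LASSO proximal step)."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0

def lasso_fit(X, y, alpha, sweeps=50):
    """Coordinate-descent LASSO on centered data; alpha is fixed (assumption)."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    cols = [[row[j] - col_means[j] for row in X] for j in range(p)]
    resid = [v - y_mean for v in y]
    scale = [sum(v * v for v in col) / n for col in cols]
    beta = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            if scale[j] == 0.0:
                continue
            col = cols[j]
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(c * r for c, r in zip(col, resid)) / n + scale[j] * beta[j]
            new = soft_threshold(rho, alpha) / scale[j]
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - delta * c for r, c in zip(resid, col)]
                beta[j] = new
    return beta

def stable_features(X, y, names, alpha=0.01, n_boot=30, min_frac=0.95, seed=0):
    """Keep features whose coefficient sign agrees in >= min_frac of resamples."""
    rng = random.Random(seed)
    n = len(X)
    pos = [0] * len(names)
    neg = [0] * len(names)
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample genes with replacement
        beta = lasso_fit([X[i] for i in idx], [y[i] for i in idx], alpha)
        for j, b in enumerate(beta):
            pos[j] += b > 0
            neg[j] += b < 0
    return [name for j, name in enumerate(names)
            if max(pos[j], neg[j]) / n_boot >= min_frac]

def simulate(n_genes=150, n_tfs=4, seed=1):
    """Synthetic genes whose response depends only on the TF0:TF1 product."""
    rng = random.Random(seed)
    others = list(range(1, n_tfs))
    X, y = [], []
    for _ in range(n_genes):
        binding = [rng.random() for _ in range(n_tfs)]
        X.append(design_row(binding, 0, others))
        y.append(2.0 * binding[0] * binding[1] + rng.gauss(0.0, 0.05))
    names = ["TF0"] + [f"TF0:TF{j}" for j in others]
    return X, y, names

X, y, names = simulate()
selected = stable_features(X, y, names)
print(selected)
```

In a real analysis the penalty would be chosen by cross-validation rather than fixed, for example with a library routine such as scikit-learn's `LassoCV`, and the features would come from measured binding and expression data rather than a simulation.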

Our findings suggest that these TF interactions often provide more explanatory power than individual binding signals alone. Additionally, several recovered interaction terms align with known biological interactions, such as GCR2:TYE7 and FKH1:FKH2, supporting the validity of our approach in identifying both known and novel regulatory relationships. These results provide additional support for proposed regulatory mechanisms and offer directions for future exploration. Thus, our work introduces a robust procedure for identifying biologically meaningful TF-TF interactions and improving the predictability of gene expression from TF binding data.

Language

English (en)

Chair

Michael Brent

Committee Members

Tao Ju, Roman Garnett

Available for download on Saturday, May 02, 2026

Included in

Engineering Commons
