Date of Award

Spring 5-15-2022

Author's School

Graduate School of Arts and Sciences

Author's Department

Biology & Biomedical Sciences (Computational & Systems Biology)

Degree Name

Doctor of Philosophy (PhD)

Degree Type


Abstract

The human body contains tens of trillions of cells, encompassing distinct cell types that serve diverse functions. Understanding cell population heterogeneity is vital for uncovering different biological functions and mechanisms. In addition, cells in transition during continuous processes, such as development, reprogramming, and disease, are essential for mapping the full progression trajectory and highlighting its critical stages. For instance, cell fate engineering holds much promise for generating clinically valuable cell types from mature somatic cells. Nonetheless, current reprogramming protocols are inefficient, and charting the changes in cell identity during such processes can help design strategies to reduce off-target outcomes and increase efficiency. RNA-sequencing allows us to study transcript abundance and dissect different genetic features. Before single-cell sequencing, bulk transcriptomics demonstrated the power, at a lower resolution, to distinguish populations and identify differentially expressed marker genes. The advent of single-cell RNA-sequencing technologies has opened a new era of exploring individual cells via their transcriptome profiles. Single-cell RNA-sequencing takes a snapshot of individual cells, enabling the dissection of population composition and the capture of cells at different states in complex biological systems. Annotating cell types from gene expression profiles has been a long-standing interest in the study of cell identity. Yet manual annotation requires prior knowledge of cell-type-specific gene signatures and is labor-intensive and time-consuming. Automated annotation approaches are in demand for exponentially growing single-cell datasets. In response to this demand, many computational approaches have been developed. However, they classify cells in a discrete, categorical manner, limiting their application in continuous biological systems.
Focusing on continuous processes, we designed a computational tool, 'Capybara,' to measure cell identity as a continuum at single-cell resolution. This approach enables the classification of discrete cell identities and recognizes cells harboring hybrid identities, supporting a quantitative cell-fate transition metric. After benchmarking against other classifiers and validating with "ground-truth" lineage data, we apply Capybara to a diverse range of cellular programming and reprogramming protocols. Applied to direct cardiac reprogramming, Capybara uncovers a patterning bias and a hybrid state between atrial and ventricular cardiomyocytes. In motor neuron programming, it reveals previously uncharacterized patterning deficiencies, informing a new approach to alleviate the lack of proper patterning. Further, we apply Capybara to our in-house system, direct reprogramming of fibroblasts to induced endoderm progenitors, and find a putative in vivo correlate for this engineered cell type, which has, to date, remained poorly defined. These findings highlight the utility of Capybara for dissecting cell identity and fate transitions in development, reprogramming, and disease. Finally, we further explore the direct cardiac reprogramming system using the comprehensive set of tools developed in the lab. We resolve lineage relationships in this system using CellTagging, find key regulatory transcription factors using CellOracle, and evaluate the effect of small molecules on the patterning bias using Capybara. In summary, I have developed a tool to highlight cell fate transitions and provide insight into cellular heterogeneity in different continuous biological processes. Further investigation of the transition states through integration with other data modalities and experimental approaches may help pinpoint key checkpoints for successful reprogramming, allowing future interventions to improve the efficiency and fidelity of cell fate engineering.

Language

English (en)

Chair and Committee

Samantha A. Morris

Committee Members

Nancy Saccone