Date of Award

Winter 12-15-2018

Author's Department

Energy, Environmental & Chemical Engineering

Degree Name

Doctor of Philosophy (PhD)

Degree Type



Abstract

How can we get living cells to do what we want? What do they actually ‘want’? What ‘rules’ do they observe? How can we better understand and manipulate them? Answers to fundamental research questions like these are critical to overcoming bottlenecks in metabolic engineering and optimizing heterologous pathways for synthetic biology applications. Unfortunately, biological systems are too complex to be completely described by physicochemical modeling alone.

In this research, I developed and applied integrated mechanistic and data-driven frameworks to help uncover the mysteries of cellular regulation and control. These tools provide a computational framework for seeking answers to pertinent biological questions. Four major tasks were accomplished.

First, I developed tools for key areas in the genome-to-phenome mapping pipeline. I developed an efficient gap-filling algorithm (called BoostGAPFILL) that integrates mechanistic and machine learning techniques to refine genome-scale metabolic network reconstructions. Such reconstructions are finding ever-increasing applications in metabolic engineering for industrial, medical and environmental purposes.

Second, I designed a thermodynamics-based framework (called REMEP) for mutant phenotype prediction that integrates metabolomics, fluxomics and thermodynamics data. These tools are intended to improve the fidelity of model predictions for microbial cell factories.

Third, I designed a data-driven framework for characterizing and predicting the effectiveness of metabolic engineering strategies. This involved building a knowledgebase of historical microbial cell factory performance from published literature. Advanced machine learning concepts, such as ensemble learning and data augmentation, were employed in combination with standard mechanistic models to develop a predictive platform for important industrial biotechnology metrics such as yield, titer, and productivity.

Fourth, I applied these modeling tools in several case studies: analysis of fungal lipid metabolism, E. coli resource allocation balances, reconstruction of the genome-scale metabolic network for a non-model species, R. opacus, and rapid prediction of bacterial heterotrophic fluxomics.

In the long run, this integrated modeling approach will significantly shorten the “design-build-test-learn” cycle of metabolic engineering, as well as provide a platform for biological discovery.


Language

English (en)


Yinjie Tang

Committee Members

Pratim Biswas, Michael Brent, Tae Seok Moon, Roman Garnett


Permanent URL: https://doi.org/10.7936/e1y5-6592