Date of Award
Doctor of Philosophy (PhD)
Humans have harnessed microbiology to create fermented foods for over 10,000 years. Synthetic biology now allows chemicals and materials that were historically derived from fossil fuels to be produced via fermentation. These technologies will lessen our dependence on nonrenewable resources and reduce the rate at which CO2 is released into the atmosphere. Non-model microbial species with unique metabolic traits are promising platforms for bioproduction from renewable feedstocks, including lignin, agricultural waste, and CO2. Systems biology characterizations, metabolic models, and data-driven approaches are essential for deciphering and optimizing cell metabolism for these applications.
Rhodococcus opacus (R. opacus) is an aromatic-tolerant bacterium that can produce lipid-based biofuels from lignocellulose, and Clostridium carboxidivorans (C. carboxidivorans) is a syngas-consuming bacterium that can produce alcohol-based biofuels. These two species are promising non-model bacteria for advanced biofuel production, but because they have been characterized less rigorously than model organisms such as E. coli and S. cerevisiae, their fermentation outcomes are challenging to predict. The focus of this work is to use omics analyses, 13C-metabolic flux analysis (13C-MFA), metabolic modeling, data mining, and machine learning to characterize the metabolism of these bacterial species. This research has accomplished three independent goals. First, an integrative systems biology analysis was applied to R. opacus to delineate its central pathways and flux profiles when it utilizes aromatic and sugar substrates. A genome-scale model for R. opacus was developed and validated using 13C-MFA and transcriptomic data. This high-quality model can be used to link multi-omics analyses to R. opacus fermentation outcomes and can support computational strain design.
Second, machine learning (ML) methods were applied to predict C. carboxidivorans syngas fermentation behavior. Data from 17 syngas (CO, CO2, H2) fermentations were used to train models that took gas composition and fermentative metabolite concentrations as features and predicted the production rates of acetate, ethanol, butyrate, and butanol. Of the ML algorithms evaluated, random forests and support vector machines made the most accurate predictions. Additionally, ML-based feature importance analysis highlighted the significant impacts of CO and H2 on alcohol production, which offered guidance for model predictive control. Third, to facilitate data-driven modeling, a biomanufacturing database (ImpactDB) that collects and organizes information from published metabolic engineering papers was built. Through feature engineering and the integration of genome-scale modeling with ML, this ongoing platform development project will offer guidelines for strain development and bioprocess optimization. Together, the systems biology tools, mechanistic models, and data-driven approaches developed in this work can be extended to broader microbial systems for metabolic characterization and engineering.
Yinjie J. Tang
Douglas Allen, Gautam Dantas, Marcus Foston, Joshua Yuan
Available for download on Wednesday, May 15, 2024