Abstract

The growth of untargeted metabolomic profiling technology has prompted the launch of many large cohort studies. Metabolomics serves as an informative complement to other, more established ‘omics (e.g., genomics, transcriptomics, proteomics). Because it is a younger discipline, however, its experimental and computational workflows are non-standardized and at times inadequate for large-scale studies. Two primary limitations in the analysis of untargeted metabolomics data are 1) the poor performance and scalability of the algorithms applied to detect metabolite signals in the raw LC/MS data and 2) inefficient and arduous metabolite identification, the process of determining the biochemical structures that correspond to the detected LC/MS peaks. Identification of all metabolites is critical for enabling systems-level analyses and inferences into metabolic mechanisms. These challenges have limited large studies to targeted analyses, which have considerably lower computational overhead but restrict the biological insights that can be gleaned. Further, because metabolite identification is difficult, the traditional bioinformatic workflow of untargeted metabolomics attempts to identify only signals whose statistical significance exceeds a certain cutoff, thereby reducing the number of metabolites that need to be identified. However, systems-biology approaches like pathway mapping and multi-omics integration require structural identities of both statistically significant and non-significant metabolites for meaningful results. Here I describe an alternative workflow that leverages pooled-reference samples and computational tools to automate the processing of large-scale metabolomics data, including the steps of metabolite detection, curation, and identification. Through thorough analysis of a pooled-reference sample, the metabolite signals relevant to a particular study can be determined, curated, and identified. These metabolite signals can then be extracted from the raw data at a much lower computational cost, and the metabolite abundances can be normalized to remove batch effects and other sources of technical variability. Together, these advances enable an improved metabolomics workflow that scales to arbitrary sample numbers. Applying such a workflow to a study of COVID-19 severity revealed lipid metabolites that provide prognostic value for predicting disease course. Beyond this application, this dissertation details a computational approach to another advanced metabolomics workflow that involves stable isotopes and spatially resolved metabolite measurements. These examples demonstrate the importance of computation to the processing and interpretation of metabolomics data and highlight the biological insights into human health and disease that metabolomics offers.
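
The abstract does not specify how abundances are normalized against the pooled reference. As an illustration only, the sketch below assumes one common approach: for a single metabolite, each batch is rescaled so that the median of its pooled-reference injections matches the median across all reference injections. The function name, its arguments, and the median-ratio scheme are hypothetical and are not taken from the dissertation.

```python
from statistics import median


def normalize_to_pooled_reference(abundances, batches, is_reference):
    """Rescale one metabolite's abundances batch by batch (illustrative sketch).

    abundances   : measured abundance per injection
    batches      : batch label per injection
    is_reference : True where the injection is a pooled-reference sample

    Assumption: a simple median-ratio correction, not the author's method.
    """
    # Target level: median over every pooled-reference injection in the study.
    all_refs = [a for a, r in zip(abundances, is_reference) if r]
    if not all_refs:
        raise ValueError("no pooled-reference injections supplied")
    target = median(all_refs)

    # Median of the pooled-reference injections within each batch.
    batch_ref = {}
    for batch in set(batches):
        refs = [a for a, b, r in zip(abundances, batches, is_reference)
                if b == batch and r]
        if not refs or median(refs) == 0:
            raise ValueError(f"batch {batch!r} lacks a usable pooled reference")
        batch_ref[batch] = median(refs)

    # Scale each injection by target / (its batch's reference median).
    return [a * target / batch_ref[b] for a, b in zip(abundances, batches)]
```

In this sketch, a batch whose reference injections read twice as high as another batch's is scaled down accordingly, so study samples from the two batches become comparable. Real workflows often model within-batch drift as well, which this example does not attempt.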

Committee Chair

Gary Patti

Degree

Doctor of Philosophy (PhD)

Author's Department

Biology & Biomedical Sciences (Computational & Systems Biology)

Author's School

Graduate School of Arts and Sciences

Document Type

Dissertation

Date of Award

8-16-2023

Language

English (en)

DOI

https://doi.org/10.7936/96cg-y264

Author's ORCID

https://orcid.org/0000-0001-7286-0024

Recommended Citation

Stancliffe, Ethan, "Automated Processing of Metabolomics Data for Population-scale Studies" (2023). Arts & Sciences Theses and Dissertations. 3128.

The definitive version is available at https://doi.org/10.7936/96cg-y264

Included in

Biology Commons

Arts & Sciences Theses and Dissertations

Automated Processing of Metabolomics Data for Population-scale Studies
