This item is under embargo and not available online per the author's request. For access information, please visit


Date of Award

Spring 5-15-2021

Author's School

Graduate School of Arts and Sciences

Author's Department

Biology & Biomedical Sciences (Computational & Systems Biology)

Degree Name

Doctor of Philosophy (PhD)

Degree Type



Adverse drug reactions (ADRs) are a serious problem with increasing morbidity, mortality, and health care costs worldwide. In the U.S., ADRs are responsible for more than 50% of acute liver failure cases and are the fourth most common cause of death, costing 100,000 lives annually. Idiosyncratic adverse drug reactions (IADRs) are immune-mediated hypersensitivity ADRs that are difficult to foresee during drug development. IADRs are often caused by reactive metabolites produced during drug metabolism. These reactive metabolites covalently attach to cellular components, and the resulting conjugates may provoke a toxic immune response. Because reactive metabolites are short-lived, they can be difficult to detect. Tools to reliably predict whether a compound forms reactive metabolites would enable us to avoid drug candidates prone to causing IADRs and make new medicines safer. Unfortunately, due to inadequate modeling of metabolism, current experimental and computational approaches do not reliably identify drug candidates that form reactive metabolites. Bioactivation pathways leading to reactive metabolite formation are often composed of multiple steps. To accurately predict reactive metabolite formation, we must explicitly model the metabolic steps of bioactivation pathways. Therefore, we built models to predict specific metabolic transformations such as hydroxylation, epoxidation, dehydrogenation, quinonation, hydrolysis, reduction, glucuronidation, sulfuration, acetylation, and methylation. Using machine learning and literature-derived data, we trained models that can predict both the likelihood that a molecule undergoes a certain chemical transformation and the specific site(s) within the molecule where this transformation happens. Together, our metabolism models cover ∼95% of enzymatically driven chemical reactions in humans. Our models achieve high area under the receiver operating characteristic curve (AUC) scores of ∼90% in cross-validated tests.
Our mechanistic approach outperformed structural alerts, a common tool used to screen out candidate compounds during drug development. Structural alerts are chemical moieties that have frequently been observed to give rise to reactive metabolites upon bioactivation. However, many safe drugs also contain structural alerts that are not bioactivated and, conversely, many toxic drugs contain no structural alert. We combined models of metabolism, metabolite structure prediction, and reactivity to offer a better prediction of reactive metabolite formation in the context of structural alerts. Based on the known bioactivation pathway(s) of each structural alert, appropriate metabolism models were applied to evaluate whether drugs containing the structural alert actually form reactive metabolites. Our study focused on the furan, phenol, nitroaromatic, and thiophene alerts. Specifically, we used models of epoxidation, quinone formation, reduction, and sulfur-oxidation to predict the bioactivation of furan-, phenol-, nitroaromatic-, and thiophene-containing drugs. Our models separated bioactivated from non-bioactivated furan-, phenol-, nitroaromatic-, and thiophene-containing drugs with AUCs of 100%, 73%, 93%, and 88%, respectively. In addition, we used our models to uncover bioactivation mechanisms that were previously underappreciated. For example, N-dealkylation is the oxidation of an alkylated amine at the nitrogen-carbon bond, cleaving the parent compound into an amine and an aldehyde. Even though aldehydes can be toxic, metabolic studies usually neglect to report or investigate them because they are assumed to be efficiently detoxified into carboxylic acids and alcohols. Applying the N-dealkylation model to approved and withdrawn medicines, we found that aldehyde metabolites produced from N-dealkylation may explain the hepatotoxicity of several drugs: indinavir, piperacillin, verapamil, and ziprasidone.
These results demonstrated the utility of comprehensive bioactivation models that systematically consider constituent metabolic steps in gauging toxicity risks.
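
As a reading aid for the AUC figures quoted above, the following minimal sketch shows how an AUC can be computed as the fraction of (bioactivated, non-bioactivated) drug pairs that a model ranks correctly. It is not the dissertation's code: the function name `roc_auc` and the toy labels and scores are hypothetical, chosen only to illustrate the metric.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: probability that a randomly chosen positive
    receives a higher score than a randomly chosen negative.
    Tied scores count as half a correct ranking."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (NOT from the dissertation):
# 1 = drug known to be bioactivated, 0 = not bioactivated.
labels = [1, 1, 1, 0, 0, 0, 0]
# Hypothetical model outputs: predicted probability of bioactivation.
scores = [0.92, 0.81, 0.40, 0.55, 0.30, 0.22, 0.10]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

Under this reading, an AUC of 100% means every bioactivated drug was scored above every non-bioactivated one, while 50% corresponds to random ranking.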


Language

English (en)

Chair and Committee

Sanjay Joshua Swamidass

Committee Members

Michael Brent, Kristen Naegle, Greg Bowman, Mark Anastasio

Available for download on Sunday, May 15, 2022

Included in

Chemistry Commons