Combining computer simulations and deep learning to understand and predict protein structural dynamics
Date of Award
Doctor of Philosophy (PhD)
Molecular dynamics (MD) simulations provide a means to characterize the ensemble of structures that a protein adopts in solution. These structural ensembles provide crucial information about how proteins function, and they also reveal potential drug binding sites that are not observable from static protein structures (i.e., cryptic pockets). However, analyzing these high-dimensional datasets to understand protein function remains challenging. Additionally, finding cryptic pockets using simulation data is slow and expensive, which limits the appeal of computational screening for cryptic pockets to a narrow set of circumstances. In this thesis, I develop deep-learning-based methods to overcome these challenges. First, I develop a deep learning algorithm, called DiffNets, to deal with the high dimensionality of structural ensembles. DiffNets takes structural ensembles from similar systems with different biochemical properties and learns to highlight structural features that distinguish the systems, ultimately connecting structural signatures to their associated biochemical properties. Using DiffNets, I provide structural insights that explain how naturally occurring genetic variants of the oxytocin receptor alter signaling. Additionally, DiffNets helps reveal how a SARS-CoV-2 protein involved in immune evasion becomes activated. Next, I use MD simulations to hunt for cryptic pockets across the SARS-CoV-2 proteome, which led to the discovery of more than 50 new potentially druggable sites. Because this effort required an extraordinary amount of resources, I developed a deep learning approach to predict sites of cryptic pockets from single protein structures. This approach reduces the time needed to identify whether a protein has a cryptic pocket by ~10,000-fold compared to the next best method.
Chair and Committee
Gregory R. Bowman
Ward, Michael D., "Combining computer simulations and deep learning to understand and predict protein structural dynamics" (2022). Arts & Sciences Electronic Theses and Dissertations. 2729.