Arts & Sciences Graduate Student Theses and Dissertations

Learning Sequence Constraints Governing Conformational Ensembles and Function in Intrinsically Disordered Proteins

Jeffrey Lotthammer, Washington University in St. Louis

Abstract

Intrinsically disordered proteins and regions (IDRs) lack a stable three-dimensional structure under physiological conditions. Instead, IDRs are better described by a conformational equilibrium in which they rapidly interconvert between many distinct structural states. Although they lack a well-defined reference fold, IDRs are ubiquitous across the Tree of Life and play essential roles in virtually every biological process, including gene regulation, molecular recognition, and signal transduction. The absence of a well-defined fold, however, makes IDRs difficult to interpret and challenging to engineer. Traditional structure-based approaches that rely on tertiary structure or evolutionary conservation are poorly suited to handle the complexity of IDRs. My work advances two complementary paradigms for understanding and designing IDRs. In the first paradigm, sequence → ensemble, I interpret IDR sequences through a molecular biophysical lens: observables derived from the statistics of disordered conformational ensembles are used to generate mechanistic hypotheses about IDR function and to guide design. To do this at scale, I develop high-throughput sequence-to-ensemble predictors that make it possible to navigate disordered conformational landscapes directly from sequence. These models are implemented in robust, user-friendly software, making quantitative ensemble-based analysis accessible across large protein sets, not just individual case studies. In the second paradigm, sequence → function, I develop disorder-specific deep learning models to infer functional sequence constraints directly from the amino acid sequence. Instead of relying on biophysical models, this approach leverages generative modeling and learned representations to design disordered regions.
Building on recent advances in natural language processing, I introduce a diffusion-based protein language model tailored to intrinsically disordered regions that learns IDR-specific sequence representations and can condition on adjacent folded domains when present. This allows the model to capture how local sequence context constrains disordered regions, enabling the context-aware design of disordered protein sequences. A defining feature throughout this body of work is its high-throughput, software-first implementation. I design and implement robust, scalable tools that make these models easy to deploy, integrate, and extend within diverse protein bioinformatics and protein design workflows for both computational and experimental researchers. Collectively, these methods and tools are intended to enable a broad community of researchers to systematically probe, predict, and engineer intrinsically disordered proteins and protein regions.

Committee Chair

Alex Holehouse

Committee Members

Andrea Soranno; Eric Galburt; Joshua Rackers; Michael Brent; Roman Garnett

Degree

Doctor of Philosophy (PhD)

Author's Department

Biology & Biomedical Sciences (Computational & Systems Biology)

Author's School

Graduate School of Arts and Sciences

Document Type

Dissertation

Date of Award

4-28-2026

Language

English (en)

DOI

https://doi.org/10.7936/daz2-1a56

Author's ORCID

https://orcid.org/0000-0002-5022-7006

Recommended Citation

Lotthammer, Jeffrey, "Learning Sequence Constraints Governing Conformational Ensembles and Function in Intrinsically Disordered Proteins" (2026). Arts & Sciences Graduate Student Theses and Dissertations. 3741.

The definitive version is available at https://doi.org/10.7936/daz2-1a56

Download

Available for download on Thursday, April 27, 2028

Included in

Biophysics Commons
