Abstract
The rapid growth of medical imaging data highlights the need for intelligent systems that can understand both radiological language and image content. This dissertation presents two complementary research directions that advance machine learning in radiology through large-scale natural language understanding and unsupervised image representation learning.

The first part develops and benchmarks machine learning models for identifying follow-up recommendations in radiology reports, a task essential for improving patient care and reducing missed follow-ups. Using 49,769 reports across multiple modalities and three institutional datasets, along with external and temporal test sets, thirty-two classification methods were systematically evaluated on both the findings and impression sections. These methods span traditional machine learning, neural networks, and state-of-the-art large language models, including Meta's LLAMA3 and OpenAI's HIPAA-compliant GPT models. Results show that generative-discriminative and attention-based recurrent architectures achieved the best internal performance, while prefixed prompting with GPT-4 offered the strongest external and temporal generalization. This large-scale evaluation establishes a solid foundation for automated extraction of actionable clinical information from radiology reports.

The second part introduces the Sparse Coding–based Variational Autoencoder (SC-VAE), a new framework for unsupervised image representation learning. SC-VAE integrates sparse coding principles into the VAE architecture through a learnable Iterative Shrinkage-Thresholding Algorithm (ISTA) that enforces sparsity in the latent space. This design addresses key limitations of existing VAEs by learning compact yet expressive representations composed of a small number of orthogonal atoms. Experiments on two image datasets show that SC-VAE achieves superior reconstruction quality compared with state-of-the-art continuous and discrete VAE variants. The learned sparse representations also support effective downstream tasks, including image generation and unsupervised segmentation via patch-level clustering.

Together, these studies advance intelligent radiological analysis by improving text understanding and image representation learning. The proposed frameworks demonstrate how scalable machine learning—from clinical text classification to unsupervised generative modeling—may help enhance the interpretability, efficiency, and integration of AI in radiology.
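For readers unfamiliar with the ISTA step the abstract references, the following is a minimal, self-contained sketch of classical ISTA for sparse coding against a fixed dictionary. It is illustrative only: the dictionary, signal, and hyperparameters are invented for this toy example, and it is not the dissertation's learnable SC-VAE component, which unrolls ISTA inside a trained network.

```python
import numpy as np

def soft_threshold(x, t):
    # Proximal operator of the L1 norm: shrinks entries toward zero,
    # setting small ones exactly to zero (this is what induces sparsity).
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(D, y, lam=0.05, n_iter=3000):
    """Approximately solve min_z 0.5*||y - D z||^2 + lam*||z||_1 via ISTA.

    D: dictionary (n_features x n_atoms), y: signal (n_features,).
    """
    L = np.linalg.norm(D, ord=2) ** 2      # Lipschitz constant of the gradient
    z = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ z - y)           # gradient of the quadratic data term
        z = soft_threshold(z - grad / L, lam / L)  # gradient step + shrinkage
    return z

# Toy example: recover a sparse code from a random overcomplete dictionary.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
z_true = np.zeros(50)
z_true[[3, 17, 42]] = [1.5, -2.0, 1.0]     # a 3-sparse ground-truth code
y = D @ z_true
z_hat = ista(D, y)
print("nonzero atoms:", np.count_nonzero(np.abs(z_hat) > 0.1))
```

In SC-VAE, this iteration is "unrolled" into network layers with learnable parameters, so the sparse code of an image patch is produced by a fixed number of such gradient-plus-shrinkage steps.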
Committee Chair
Aristeidis Sotiras
Committee Members
Aimilia Gastounioti; Daniel Marcus; Hua Li; Thomas Kannampallil
Degree
Doctor of Philosophy (PhD)
Author's Department
Interdisciplinary Programs
Document Type
Dissertation
Date of Award
12-19-2025
Language
English (en)
DOI
https://doi.org/10.7936/aqp8-gp71
Recommended Citation
Xiao, Pan, "Learning from Images and Text (Clinical) Data: Toward Putting AI in Radiology Workflows" (2025). McKelvey School of Engineering Theses & Dissertations. 1324.
The definitive version is available at https://doi.org/10.7936/aqp8-gp71