Abstract
Colorectal cancer (CRC) is the third most commonly diagnosed cancer and the second leading cause of cancer-related deaths in the United States, with a concerning rise in incidence among younger adults. Clinical management is shifting toward organ-preserving ``watch-and-wait'' strategies for rectal cancer and precision endoscopic resection for early-stage polyps. However, current diagnostic standards, including white-light colonoscopy (WLC) and magnetic resonance imaging (MRI), lack the resolution and functional specificity to accurately assess the invasion depth of precancerous polyps or to reliably distinguish post-treatment fibrosis from residual viable tumor. This dissertation advances colorectal cancer diagnostics through three integrated contributions: CPU-optimized photoacoustic (PA) and optical coherence tomography (OCT) imaging software that enables real-time deployment on commodity hardware, the development of PA and OCT endoscopic systems, and the clinical validation of those systems. To address the computational barriers that limit clinical translation, we developed imaging software optimized for commodity CPU hardware. By implementing FFT-based convolution algorithms with SIMD vectorization and thread-level parallelism, we achieved real-time performance (544 MS/s) without discrete GPUs. The resulting platforms (\texttt{ArpamGui}, \texttt{OCTGui}, \texttt{USPAT\_OpenCV}) enable point-of-care deployment on standard computers. We then investigated the clinical utility of these systems across the CRC spectrum. For early detection, we demonstrated the first \textit{in vivo} application of vision transformer (ViT)-enhanced OCT during screening colonoscopy.
In 32 patients with complex polyps, the system achieved a benign-versus-malignant classification AUC of 0.984 (95\% CI: 0.972-0.996) and a Cohen's kappa of 0.845. It imaged subsurface architecture to a depth of approximately 800~$\mathrm{\mu m}$, providing information that is unavailable to surface-only modalities and that may inform the endoscopic-versus-surgical decision for complex polyps. For locally advanced rectal cancer (LARC), we evaluated coregistered acoustic-resolution photoacoustic microscopy and ultrasound (ARPAM-US) for treatment response assessment. Immunohistochemical analysis using ETS-related gene (ERG) staining revealed that complete responders exhibit significantly higher microvascular density than partial or non-responders ($p < 0.01$), indicating that functional vascular normalization is a reliable biomarker for pathologic complete response. In a prospective cohort of 25 patients, deep learning-assisted ARPAM-US achieved an AUC of 0.956 (95\% CI: 0.912-1.000) for predicting complete response. Exploratory T2-weighted MRI radiomics analysis revealed substantial cross-cohort domain shift: a model trained on a retrospective single-institution cohort (internal validation AUC 0.88) degraded to an AUC of 0.65 when applied to the prospective ARPAM cohort, where scanner hardware and acquisition parameters differed. Together, these contributions show that accessible CPU-based imaging software, combined with catheter-based OCT and ARPAM-US endoscopy, can provide clinically relevant information beyond standard-of-care modalities for rectal cancer treatment response and complex polyp assessment.
Committee Chair
Quing Zhu
Committee Members
Chao Zhou; Matthew Lew; Song Hu; William Chapman
Degree
Doctor of Philosophy (PhD)
Author's Department
Biomedical Engineering
Document Type
Dissertation
Date of Award
4-29-2026
Language
English (en)
DOI
https://doi.org/10.7936/971w-zs59
Recommended Citation
Nie, Haolin, "Deep Learning Enhanced Multimodal Endoscopy for Colorectal Cancer: From Real-Time Software Architecture to Clinical Response Assessment" (2026). McKelvey School of Engineering Graduate Student Theses & Dissertations. 1390.
The definitive version is available at https://doi.org/10.7936/971w-zs59