Date of Award
Summer 8-24-2023
Degree Name
Doctor of Philosophy (PhD)
Degree Type
Dissertation
Abstract
Machine learning has substantially improved the quality of predictions about patient health. However, developing the statistical and machine learning models that make those predictions still requires considerable specialized expertise. Automated machine learning, or AutoML, represents an opportunity to democratize the construction of machine learning models and enable stakeholders without that expertise to create them. In this dissertation, I explore the feasibility of implementing an AutoML pipeline for predicting the risk of patients’ admission to the hospital within 30 days of their first positive COVID-19 test by presenting a series of methods that (1) assess the extent to which patient clustering can be automated, (2) compare an AutoML framework against a more traditional model development approach, and (3) quantify model degradation over time and discuss whether and how to automatically update deployed models. This dissertation demonstrates that while substantial automation is possible, significant challenges remain at both ends of an automated pipeline, especially in data cleaning and preprocessing and after model deployment. Further, I discuss the need for AutoML-generated models to be trustworthy, validated, and trained such that they have minimal bias. It is essential that models be used with consideration for the patients on whose data they are trained and by professionals with appropriate domain expertise for the contexts in which the models are deployed.
Language
English (en)
Chair
Randi Foraker
Committee Members
Albert Lai