Date of Award
Master of Arts (AM/MA)
Multiclass classification with high-dimensional data is an applied topic in both statistics and machine learning, and the classification can be carried out in various ways. In this thesis, we review the theory of the Lasso, which yields a parameter estimator while simultaneously achieving dimension reduction, because the L1 penalty sets some coefficients exactly to zero. We also review the Lasso with the elastic net penalty and the sparse group lasso. Our data are high-dimensional proteomic measurements (iTRAQ ratios) from breast cancer patients with four subtypes of breast cancer. We train the classifier with multinomial logistic regression and compare models using the misclassification rates obtained from cross-validation.
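The way an L1 penalty produces dimension reduction can be illustrated with a minimal sketch of soft-thresholding, the proximal operator of the L1 penalty. This is an illustration only and not code from the thesis; the function name `soft_threshold`, the penalty level `lam`, and the coefficient values are all hypothetical.

```python
def soft_threshold(z, lam):
    """Proximal operator of lam * |x|: shrink z toward zero by lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0  # any coefficient with |z| <= lam is set exactly to zero


# Hypothetical unpenalized coefficients for five features (not thesis data).
coefs = [2.5, -0.3, 0.1, -1.75, 0.4]
lam = 0.5  # hypothetical penalty level

# Apply the shrinkage to each coefficient.
shrunk = [soft_threshold(c, lam) for c in coefs]

# Features whose coefficients survive the shrinkage are the "selected" ones.
selected = [i for i, c in enumerate(shrunk) if c != 0.0]

print(shrunk)
print(selected)
```

Raising `lam` zeroes out more coefficients, so fewer features are selected. A purely quadratic (ridge) penalty, by contrast, only rescales coefficients and does not set them exactly to zero.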
Chair and Committee
Jose Figueroa-Lopez, Nan Lin
Zhai, Hongxuan, "Variable selection via Lasso with high-dimensional proteomic data" (2018). Arts & Sciences Electronic Theses and Dissertations. 1295.