Abstract
Multiclass classification with high-dimensional data is an applied topic in both statistics and machine learning, and the classification can be carried out in various ways. In this thesis, we review the theory of the Lasso procedure, which yields a parameter estimator while simultaneously achieving dimension reduction, because the L1 penalty can shrink some coefficients exactly to zero. The Lasso with the elastic net penalty and the sparse group lasso are also reviewed. Our data are high-dimensional proteomic measurements (iTRAQ ratios) from breast cancer patients with four subtypes of breast cancer. We train our classifier using multinomial logistic regression and compare models using the misclassification rates obtained from cross-validation.
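The L1 property the abstract refers to can be illustrated with a minimal sketch. It assumes a simplified setting that is not the thesis's multinomial model: squared-error loss with an orthonormal design, where each Lasso coefficient is the soft-thresholded version of the corresponding unpenalized coefficient. The function name `soft_threshold`, the coefficient values, and the penalty level `lam` are all hypothetical and chosen only for illustration.

```python
def soft_threshold(z, lam):
    """Return the minimiser over b of 0.5 * (b - z)**2 + lam * abs(b)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    # Any coefficient with |z| <= lam is set exactly to zero.
    return 0.0


# Hypothetical unpenalised coefficients (illustrative, not from the thesis data).
unpenalised = [2.5, -0.3, 0.8, -1.7, 0.1]
lam = 1.0  # hypothetical penalty level

lasso_coefs = [soft_threshold(z, lam) for z in unpenalised]

# Variables with a nonzero coefficient are the ones the penalty "selects".
selected = [j for j, b in enumerate(lasso_coefs) if b != 0.0]

print(lasso_coefs)
print(selected)
```

With these illustrative values, only the first and fourth variables keep a nonzero coefficient; the other three are removed from the model, which is the sense in which estimation and dimension reduction happen simultaneously.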
Committee Chair
Todd Kuffner
Committee Members
Jose Figueroa-Lopez, Nan Lin
Degree
Master of Arts (AM/MA)
Author's Department
Mathematics
Document Type
Thesis
Date of Award
Spring 5-18-2018
Language
English (en)
DOI
https://doi.org/10.7936/K7GQ6X6Z
Recommended Citation
Zhai, Hongxuan, "Variable selection via Lasso with high-dimensional proteomic data" (2018). Arts & Sciences Theses and Dissertations. 1295.
The definitive version is available at https://doi.org/10.7936/K7GQ6X6Z