Date of Award
Spring 5-18-2018
Additional Affiliations
Statistics
Degree Name
Master of Arts (AM/MA)
Degree Type
Thesis
Abstract
Multiclass classification with high-dimensional data is a topic of applied interest in both statistics and machine learning, and the classification can be carried out in various ways. In this thesis, we review the theory of the Lasso procedure, which provides a parameter estimator while simultaneously achieving dimension reduction, because the L1 penalty shrinks some coefficients exactly to zero. The Lasso with an elastic net penalty and the sparse group lasso are also reviewed. Our data are high-dimensional proteomic data (iTRAQ ratios) from breast cancer patients with four subtypes of breast cancer. We train our classifier using multinomial logistic regression and compare models by the misclassification rates obtained from cross-validation.
Language
English (en)
Chair and Committee
Todd Kuffner
Committee Members
Jose Figueroa-Lopez, Nan Lin
Recommended Citation
Zhai, Hongxuan, "Variable selection via Lasso with high-dimensional proteomic data" (2018). Arts & Sciences Electronic Theses and Dissertations. 1295.
https://openscholarship.wustl.edu/art_sci_etds/1295
Comments
Permanent URL: https://doi.org/10.7936/K7GQ6X6Z