Abstract

Multiclass classification with high-dimensional data is an applied topic in both statistics and machine learning, and the classification can be carried out in various ways. In this thesis, we review the theory of the Lasso procedure, which produces a parameter estimate while simultaneously achieving dimension reduction, owing to a property of the L1 norm. The Lasso with an elastic net penalty and the sparse group lasso are also reviewed. Our data are high-dimensional proteomic data (iTRAQ ratios) from breast cancer patients with four subtypes of breast cancer. We use multinomial logistic regression to train our classifier and use the misclassification rates obtained from cross-validation to compare models.
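The dimension-reduction property of the L1 norm mentioned above can be illustrated with a minimal sketch. It is not taken from the thesis: it assumes the simplest textbook setting, a linear model with an orthonormal design, where the Lasso solution is the soft-thresholded least-squares estimate and the ridge (squared L2) solution is a rescaled one. The function names `soft_threshold`, `lasso_orthonormal`, and `ridge_orthonormal` and the coefficient values are hypothetical, chosen only for illustration; the multinomial logistic setting of the thesis is more involved.

```python
def soft_threshold(z, lam):
    """Proximal operator of lam * |z|: shrink z toward zero by lam, clipping at zero."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_orthonormal(ols_coefs, lam):
    """Lasso solution for an orthonormal design (assumed here for simplicity).

    Minimizes 0.5 * ||y - X b||^2 + lam * ||b||_1 when X^T X = I,
    given the least-squares coefficients ols_coefs = X^T y.
    """
    return [soft_threshold(b, lam) for b in ols_coefs]


def ridge_orthonormal(ols_coefs, lam):
    """Ridge solution under the same assumption.

    Minimizes 0.5 * ||y - X b||^2 + 0.5 * lam * ||b||_2^2 when X^T X = I.
    """
    return [b / (1.0 + lam) for b in ols_coefs]


# Hypothetical least-squares coefficients: two strong signals, three weak ones.
ols = [3.0, -0.4, 0.2, -2.5, 0.05]
lam = 0.5

lasso = lasso_orthonormal(ols, lam)
ridge = ridge_orthonormal(ols, lam)

print("lasso:", lasso)
print("ridge:", ridge)
print("nonzero lasso coefficients:", sum(b != 0 for b in lasso))
print("nonzero ridge coefficients:", sum(b != 0 for b in ridge))
```

In this sketch the L1 penalty sets every coefficient whose magnitude is at most `lam` exactly to zero, which is the variable-selection effect the abstract refers to, whereas the squared L2 penalty only shrinks coefficients and leaves all of them nonzero.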

Committee Chair

Todd Kuffner

Committee Members

Jose Figueroa-Lopez, Nan Lin

Comments

Permanent URL: https://doi.org/10.7936/K7GQ6X6Z

Degree

Master of Arts (AM/MA)

Author's Department

Mathematics

Author's School

Graduate School of Arts and Sciences

Document Type

Thesis

Date of Award

Spring 5-18-2018

Language

English (en)

Included in

Mathematics Commons
