Date of Award

Spring 5-18-2018

Author's School

Graduate School of Arts and Sciences

Author's Department

Mathematics

Additional Affiliations

Statistics

Degree Name

Master of Arts (AM/MA)

Degree Type

Thesis

Abstract

Multiclass classification with high-dimensional data is an applied topic in both statistics and machine learning, and the classification can be carried out in various ways. In this thesis, we review the theory of the Lasso, which yields a parameter estimate while simultaneously achieving dimension reduction, a consequence of the L1-norm penalty. The Lasso with an elastic net penalty and the sparse group lasso are also reviewed. Our data are high-dimensional proteomic data (iTRAQ ratios) from breast cancer patients with four subtypes of breast cancer. We train our classifier using multinomial logistic regression and compare models using the misclassification rates obtained from cross-validation.

Language

English (en)

Chair and Committee

Todd Kuffner

Committee Members

Jose Figueroa-Lopez, Nan Lin

Comments

Permanent URL: https://doi.org/10.7936/K7GQ6X6Z
