Date of Award

Spring 5-15-2018

Author's School

Graduate School of Arts and Sciences

Author's Department

Mathematics

Degree Name

Doctor of Philosophy (PhD)

Degree Type

Dissertation

Abstract

This dissertation develops novel methodologies for distributed quantile regression analysis of big data using a distributed optimization algorithm, the alternating direction method of multipliers (ADMM). Specifically, we first rewrite penalized quantile regression in a form that can be solved by the ADMM and propose numerical algorithms for solving the ADMM subproblems. This results in the distributed QR-ADMM algorithm. Then, to further reduce the computational time, we reformulate penalized quantile regression in another equivalent ADMM form in which all the subproblems have exact closed-form solutions, thereby avoiding iterative numerical methods. This results in the single-loop QPADM algorithm, which further improves on the computational efficiency of QR-ADMM. Both QR-ADMM and QPADM enjoy flexible parallelization by enabling data splitting across both sample space and feature space, which makes them especially appealing when both the sample size n and the feature dimension p are large.
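
To make the idea of closed-form ADMM subproblems concrete, the following is a minimal single-machine sketch of ADMM for L1-penalized quantile regression. It is not the dissertation's QR-ADMM or QPADM algorithm and does no data splitting. The splitting used here (constraints Xβ + r = y and β = z), the function names, the step parameter `sigma`, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def check_loss_prox(w, tau, sigma):
    """Elementwise proximal operator of the check loss rho_tau with step 1/sigma."""
    return np.maximum(w - tau / sigma, 0.0) + np.minimum(w + (1.0 - tau) / sigma, 0.0)

def soft_threshold(a, kappa):
    """Elementwise soft-thresholding, the proximal operator of kappa * |.|."""
    return np.sign(a) * np.maximum(np.abs(a) - kappa, 0.0)

def l1_quantile_admm(X, y, tau=0.5, lam=1.0, sigma=1.0, n_iter=500):
    """Illustrative ADMM for min sum_i rho_tau(y_i - x_i' beta) + lam * ||beta||_1.

    Assumed splitting: X beta + r = y and beta = z, with scaled duals u and v.
    Every update below is available in closed form.
    """
    n, p = X.shape
    r, u = np.zeros(n), np.zeros(n)
    z, v = np.zeros(p), np.zeros(p)
    A = X.T @ X + np.eye(p)  # fixed system matrix for the beta update
    for _ in range(n_iter):
        # beta update: a ridge-type linear solve
        beta = np.linalg.solve(A, X.T @ (y - r - u) + z - v)
        # r update: prox of the check loss applied to the current residuals
        r = check_loss_prox(y - X @ beta - u, tau, sigma)
        # z update: soft-thresholding enforces the L1 penalty
        z = soft_threshold(beta + v, lam / sigma)
        # scaled dual updates for the two constraints
        u = u + X @ beta + r - y
        v = v + beta - z
    return z

# Small simulated example (hypothetical data, median regression, no intercept)
rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.standard_normal((n, p))
beta_true = np.array([2.0, 0.0, -1.0, 0.0, 0.0])
y = X @ beta_true + 0.1 * rng.standard_normal(n)
beta_hat = l1_quantile_admm(X, y)
```

In this sketch the residual and penalty updates are separable given β, so the scheme is a two-block ADMM. A distributed variant would partition the rows or columns of `X` across workers, which this example does not attempt.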

Besides the QR-ADMM and QPADM algorithms for penalized quantile regression, we also develop a group variable selection method by approximating the Bayesian information criterion. Unlike existing penalization methods for feature selection, our proposed gMIC algorithm is free of parameter tuning and hence enjoys greater computational efficiency. Although the current version of gMIC focuses on the generalized linear model, it can be naturally extended to quantile regression for feature selection.
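
For reference, the standard Bayesian information criterion for a model with log-likelihood ℓ, sample size n, and coefficient vector β can be written as below. The abstract does not state the approximation that gMIC uses, so only the textbook criterion is shown. The notation is an assumption, not taken from the dissertation.

```latex
\mathrm{BIC}(\beta) \;=\; -2\,\ell(\beta) \;+\; \log(n) \sum_{j=1}^{p} \mathbf{1}\{\beta_j \neq 0\}
```

The penalty weight log(n) is fixed by the criterion rather than chosen by cross-validation, which is consistent with the abstract's description of a method free of parameter tuning. The indicator count is discontinuous in β, so a smooth surrogate for it is one natural reading of "approximating" the criterion.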

We provide theoretical analysis for our proposed methods. Specifically, we conduct numerical convergence analysis for the QR-ADMM and QPADM algorithms, and establish asymptotic theory and the oracle property of feature selection for the gMIC method. All our methods are evaluated with simulation studies and real data analysis.

Language

English (en)

Chair and Committee

Nan Lin

Committee Members

Yixin Chen, Jimin Ding, Jose Figueroa-Lopez, Todd Kuffner

Comments

Permanent URL: https://doi.org/10.7936/K7F47NKS
