ORCID
http://orcid.org/0000-0003-2131-1812
Date of Award
Spring 5-15-2023
Degree Name
Doctor of Philosophy (PhD)
Degree Type
Dissertation
Abstract
Over the past few decades, numerous optimization and machine learning (ML) algorithms have been proposed, many of which have demonstrated success in real-world applications and significantly impacted people's lives. Researchers have devoted considerable effort to understanding the theoretical underpinnings of these methods, improving their performance, and designing algorithms compatible with demanding real-world constraints. This dissertation investigates the statistical properties of several mainstream optimization and ML algorithms, enabling decisions with statistical guarantees.
First, we examine the classical stochastic gradient descent algorithm (SGD) in a general nonconvex setting. Utilizing the multiplier bootstrap technique, we design two inferential procedures that yield consistent covariance matrix estimators and asymptotically exact confidence intervals. Notably, our procedures can be executed online, aligning naturally with the streaming nature of SGD. We employ fundamentally different proof techniques from those used for inference with convex SGD, and we believe these techniques can be extended to other inferential procedures. Our results provide the first practical statistical inference procedure for SGD that goes beyond the convexity assumption.
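To make the multiplier-bootstrap idea concrete, the following Python sketch runs a main SGD path together with B perturbed paths whose gradients are rescaled by i.i.d. random multipliers; the spread of the perturbed averaged iterates is used to form confidence intervals. This is only an illustrative sketch, not the procedures developed in the dissertation: the exponential multipliers, step-size schedule, normal-style intervals, and the helper functions `grad` and `sample` are all assumptions made for the example.

```python
import numpy as np

def sgd_multiplier_bootstrap(grad, sample, theta0, n_iter=5000, B=200, lr0=0.5, seed=0):
    """Schematic online multiplier-bootstrap inference for SGD (illustrative only).

    Alongside the main SGD path, B perturbed paths are updated with the same
    stochastic gradient rescaled by i.i.d. positive random multipliers
    (mean 1, variance 1). The spread of the perturbed averaged iterates around
    the main averaged iterate yields normal-style confidence intervals.
    `grad(theta, x)` returns a stochastic gradient at theta for data point x;
    `sample(rng)` draws one data point.
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    paths = np.tile(theta, (B, 1))                    # perturbed iterates
    avg, avg_b = theta.copy(), paths.copy()           # Polyak-Ruppert averages
    for t in range(1, n_iter + 1):
        lr = lr0 / t ** 0.6                           # slowly decaying step size
        x = sample(rng)                               # one fresh data point
        theta = theta - lr * grad(theta, x)
        w = rng.exponential(1.0, size=(B, 1))         # multipliers, mean 1, var 1
        g_b = np.stack([grad(p, x) for p in paths])   # same data, perturbed weights
        paths = paths - lr * w * g_b
        avg += (theta - avg) / t                      # online averaging
        avg_b += (paths - avg_b) / t
    half = 1.96 * np.std(avg_b - avg, axis=0)         # 95% half-widths per coordinate
    return avg, np.column_stack([avg - half, avg + half])
```

Everything is updated in a single pass over the data stream, which is why such procedures can run online alongside SGD itself.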
Second, we explore the problem of testing conditional independence without assuming a specific regression model. In recent years, researchers have proposed numerous model-free statistical testing methods, which are favored for their robustness, particularly in high-dimensional data analysis. Building upon the existing Conditional Randomization Test (CRT), we introduce the Conditional Randomization Rank Test (CRRT). Compared to CRT, CRRT is applicable to a broader range of ML frameworks and offers superior computational efficiency. We demonstrate that CRRT guarantees the desired type I error control and prove its robustness to distribution misspecification. Through extensive simulations, we empirically validate the effectiveness and robustness of the method.
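For orientation, here is a minimal sketch of the model-X conditional randomization test that CRRT builds on; it is not the CRRT itself. The choice of a lasso coefficient as the test statistic, the function `sample_x_given_z`, and the number of resamples are illustrative assumptions; any ML-based importance measure could play the role of the statistic.

```python
import numpy as np
from sklearn.linear_model import LassoCV

def crt_pvalue(X, Y, Z, sample_x_given_z, n_resample=200, seed=0):
    """Schematic model-X conditional randomization test for X _||_ Y | Z.

    `sample_x_given_z(Z, rng)` draws from the (assumed known) conditional law
    of X given Z. The observed statistic is compared with statistics computed
    on resampled copies of X, yielding a finite-sample valid p-value.
    """
    rng = np.random.default_rng(seed)

    def stat(x):
        design = np.column_stack([x, Z])
        coef = LassoCV(cv=5).fit(design, Y).coef_
        return abs(coef[0])                      # importance of X given Z

    t_obs = stat(X)
    t_null = np.array([stat(sample_x_given_z(Z, rng)) for _ in range(n_resample)])
    return (1 + np.sum(t_null >= t_obs)) / (n_resample + 1)
```

The computational burden of refitting the model for every resample is one motivation for rank-based variants such as CRRT.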
Finally, we investigate a gradient-free extension of the renowned Expectation Maximization algorithm (EM). Although EM and its gradient version have achieved remarkable success in estimating mixture models and other latent variable models, they are not applicable when direct maximization or gradient evaluation is unavailable. To address this limitation, we propose the zeroth-order EM, which requires only function values, making it easily applicable to complex models. We analyze the convergence rate of the zeroth-order EM under both smooth and non-smooth conditions and demonstrate the effectiveness of this method using simulated data.
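As a rough illustration of the gradient-free idea (not the algorithm analyzed in the dissertation), the sketch below replaces the gradient M-step of gradient EM with a two-point random-direction estimate that uses only evaluations of the surrogate objective. The smoothing parameter, step size, number of directions, and the surrogate function `q_fn` are assumptions made for the example.

```python
import numpy as np

def zeroth_order_em(q_fn, theta0, n_iter=100, step=0.1, mu=1e-3, n_dirs=20, seed=0):
    """Schematic zeroth-order (gradient-free) EM update (illustrative only).

    `q_fn(theta, theta_t)` evaluates the E-step surrogate
    Q(theta; theta_t) = E[complete-data log-likelihood | data, theta_t].
    The usual gradient M-step is replaced by a two-point random-direction
    estimate of grad Q, so only function values are required.
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    d = theta.size
    for _ in range(n_iter):
        q = lambda th: q_fn(th, theta)            # E-step fixes the surrogate at theta_t
        g = np.zeros(d)
        for _ in range(n_dirs):                   # average several two-point estimates
            u = rng.standard_normal(d)
            g += (q(theta + mu * u) - q(theta - mu * u)) / (2 * mu) * u
        theta = theta + step * g / n_dirs         # ascent step on the surrogate
    return theta
```

Because only function values of the surrogate are needed, the same template applies to models where the gradient is intractable or unavailable.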
Language
English (en)
Chair and Committee
Todd Kuffner, Soumendra Lahiri
Committee Members
Ari Stern, Likai Chen, Robert Lunde
Recommended Citation
Zhong, Yanjie, "Essays on Statistical Inference, Nonconvex Optimization and Machine Learning" (2023). Arts & Sciences Electronic Theses and Dissertations. 2929.
https://openscholarship.wustl.edu/art_sci_etds/2929