Doctor of Philosophy (PhD)
Over the past decades, numerous optimization and machine learning (ML) algorithms have been proposed, many of which have demonstrated success in real-world applications and significantly impacted people's lives. Researchers have devoted considerable effort to understanding the theoretical underpinnings of these methods, to improving their performance, and to designing algorithms compatible with demanding real-world constraints. This dissertation investigates the statistical properties of several mainstream optimization and ML algorithms, enabling us to make decisions with statistical guarantees.
First, we examine the classical stochastic gradient descent (SGD) algorithm in a general nonconvex setting. Utilizing the multiplier bootstrap technique, we design two inferential procedures that yield consistent covariance matrix estimators and asymptotically exact confidence intervals. Notably, our procedures can be executed online, aligning with the streaming nature of SGD. We employ proof techniques fundamentally different from those used for inference with convex SGD, and we believe these techniques can be extended to other inferential procedures. To our knowledge, these results provide the first practical statistical inference with SGD beyond the convexity constraint.
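The flavor of an online multiplier bootstrap for SGD can be conveyed with a toy sketch: alongside the SGD iterate, run B perturbed copies whose gradient steps are scaled by random multiplier weights with mean one, then read a confidence interval off the spread of the perturbed (averaged) iterates. The example below estimates the mean of a Gaussian via SGD on a squared loss; the step-size schedule, exponential weights, and loss are illustrative choices, not the dissertation's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting: estimate the mean of N(2, 1) by SGD on the squared loss
# 0.5 * (theta - x)^2, whose stochastic gradient is (theta - x).
n, B = 5000, 200
theta, theta_bar = 0.0, 0.0
boot = np.zeros(B)       # B perturbed SGD trajectories, run in parallel
boot_bar = np.zeros(B)   # their running (Polyak) averages

for t in range(1, n + 1):
    x = rng.normal(2.0, 1.0)
    lr = 0.5 * t ** -0.7               # polynomially decaying step size
    theta -= lr * (theta - x)          # plain SGD step
    w = rng.exponential(1.0, size=B)   # multiplier weights, mean 1, variance 1
    boot -= lr * w * (boot - x)        # perturbed steps reuse the same sample
    theta_bar += (theta - theta_bar) / t   # incremental averaging, fully online
    boot_bar += (boot - boot_bar) / t

# 95% percentile interval from the bootstrap averages
lo, hi = np.quantile(boot_bar, [0.025, 0.975])
print(theta_bar, (lo, hi))
```

Because every quantity is updated with O(B) work per sample and nothing is stored, the procedure runs in a single pass, matching the online character emphasized above.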
Second, we explore the problem of testing conditional independence without assuming a specific regression model. In recent years, researchers have proposed numerous model-free statistical testing methods, which are favored for their robustness, particularly in high-dimensional data analysis. Building upon the existing Conditional Randomization Test (CRT), we introduce the Conditional Randomization Rank Test (CRRT). Compared to CRT, CRRT is applicable to a broader range of ML frameworks and offers superior computational efficiency. We demonstrate that CRRT guarantees the desired type I error control and prove its robustness to distribution misspecification. Through extensive simulations, we empirically validate the effectiveness and robustness of the method.
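The CRT that CRRT builds on can be illustrated in a few lines: to test Y ⟂ X | Z when the conditional law of X given Z is known, redraw X from that law many times, recompute a test statistic, and compare it with the observed value. Everything below (the Gaussian conditional, the residual-correlation statistic, the sample sizes) is a hypothetical setup for illustration, not the CRRT construction itself.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy CRT: test Y independent of X given Z, with X | Z ~ N(Z, 1) known.
# Data are generated under the null: Y depends on Z only.
n, M = 300, 500
Z = rng.normal(size=n)
X = Z + rng.normal(size=n)           # X | Z ~ N(Z, 1)
Y = Z + rng.normal(size=n)           # H0 holds: Y ⟂ X | Z

def stat(x, y, z):
    # absolute correlation of crude residuals (any statistic works for CRT)
    rx, ry = x - z, y - z
    return abs(np.corrcoef(rx, ry)[0, 1])

t_obs = stat(X, Y, Z)
# redraw X from its conditional law given Z and recompute the statistic
t_null = np.array([stat(Z + rng.normal(size=n), Y, Z) for _ in range(M)])
pval = (1 + np.sum(t_null >= t_obs)) / (1 + M)   # finite-sample valid p-value
print(pval)
```

The "+1" in the p-value formula gives finite-sample validity for any choice of statistic; the cost of recomputing the statistic M times is the computational burden that rank-based variants such as CRRT aim to reduce.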
Finally, we investigate a gradient-free extension of the renowned Expectation-Maximization (EM) algorithm. Although EM and its gradient-based variants have achieved remarkable success in estimating mixture models and other latent variable models, they are not applicable when direct maximization or gradient evaluation is unavailable. To address this limitation, we propose the zeroth-order EM, which requires only function values, making it easily applicable to complex models. We analyze the convergence rate of the zeroth-order EM under both smooth and non-smooth conditions and demonstrate the effectiveness of the method on simulated data.
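The key ingredient of methods that use only function values is a zeroth-order gradient estimate, typically a two-point finite difference along random directions. The sketch below shows the standard two-point Gaussian-smoothing estimator driving a simple ascent loop; the smoothing radius, direction count, and objective are illustrative assumptions, not the dissertation's zeroth-order EM update.

```python
import numpy as np

rng = np.random.default_rng(2)

def zo_grad(f, theta, mu=1e-4, k=20):
    """Two-point zeroth-order gradient estimate of f at theta."""
    d = theta.shape[0]
    g = np.zeros(d)
    for _ in range(k):
        u = rng.normal(size=d)                       # random direction
        g += (f(theta + mu * u) - f(theta - mu * u)) / (2 * mu) * u
    return g / k                                     # averaged estimate

# Maximize f(theta) = -||theta - target||^2 using function values only.
target = np.array([1.0, -2.0])
f = lambda th: -np.sum((th - target) ** 2)

theta = np.zeros(2)
for _ in range(200):
    theta += 0.05 * zo_grad(f, theta)                # gradient-free ascent
print(theta)  # approaches target
```

Replacing the M-step's exact maximization (or gradient step) with such an estimate is what makes a zeroth-order scheme applicable to models where gradients of the surrogate objective are unavailable.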
Chair and Committee
Todd Kuffner, Soumendra Lahiri, Ari Stern, Likai Chen, Robert Lunde
Zhong, Yanjie, "Essays on Statistical Inference, Nonconvex Optimization and Machine Learning" (2023). Arts & Sciences Electronic Theses and Dissertations. 2929.