Abstract
Over the last decade, motivated by applications in medicine, bioinformatics, genomics, brain imaging, and other fields, a growing body of statistical research has been devoted to large-scale multiple testing, where thousands of tests or more are conducted simultaneously. However, because real data sets are complex, the assumptions behind many existing multiple testing procedures, e.g., that the tests are independent and that the p-values have continuous null distributions, may not hold. This limits their performance, leading to low detection power and an inflated false discovery rate (FDR). In this dissertation, we study how to better handle multiple testing problems under complex data structures. In Chapter 2, we study multiple testing with discrete test statistics. In Chapter 3, we study discrete multiple testing that incorporates prior ordering information. In Chapter 4, we study multiple testing under complex dependency structures. We propose novel procedures for each scenario, based on the marginal critical functions (MCFs) of randomized tests, the conditional random field (CRF), or the deep neural network (DNN). The theoretical properties of our procedures are carefully studied, and their performance is evaluated through various simulations and real applications involving the analysis of genetic data from next-generation sequencing (NGS) experiments.
Committee Chair
Nan Lin
Committee Members
Jimin Ding, Jose E. Figueroa-Lopez, Edward Spitznagel, Ting Wang
Degree
Doctor of Philosophy (PhD)
Author's Department
Mathematics
Document Type
Dissertation
Date of Award
Spring 5-15-2018
Language
English (en)
DOI
https://doi.org/10.7936/K7GM86QQ
Recommended Citation
Dai, Xiaoyu, "Large-scale Multiple Hypothesis Testing with Complex Data Structure" (2018). Arts & Sciences Theses and Dissertations. 1523.
The definitive version is available at https://doi.org/10.7936/K7GM86QQ
Comments
Permanent URL: https://doi.org/10.7936/K7GM86QQ