Date of Award
Spring 5-15-2018
Degree Name
Doctor of Philosophy (PhD)
Degree Type
Dissertation
Abstract
In the last decade, motivated by applications in medicine, bioinformatics, genomics, brain imaging, and other fields, a growing amount of statistical research has been devoted to large-scale multiple testing, where thousands of tests or more are conducted simultaneously. However, due to the complexity of real data sets, the assumptions of many existing multiple testing procedures, e.g. that tests are independent and that p-values have continuous null distributions, may not hold. Violations of these assumptions limit the procedures' performance, with problems such as low detection power and an inflated false discovery rate (FDR). In this dissertation, we study how to better conduct multiple testing under complex data structures. In Chapter 2, we study multiple testing with discrete test statistics. In Chapter 3, we study discrete multiple testing that incorporates prior ordering information. In Chapter 4, we study multiple testing under complex dependency structures. We propose novel procedures for each scenario, based on the marginal critical functions (MCFs) of randomized tests, the conditional random field (CRF), or the deep neural network (DNN). We study the theoretical properties of our procedures and evaluate their performance through simulations and real applications to genetic data from next-generation sequencing (NGS) experiments.
Language
English (en)
Chair and Committee
Nan Lin
Committee Members
Jimin Ding, Jose E. Figueroa-Lopez, Edward Spitznagel, Ting Wang
Recommended Citation
Dai, Xiaoyu, "Large-scale Multiple Hypothesis Testing with Complex Data Structure" (2018). Arts & Sciences Electronic Theses and Dissertations. 1523.
https://openscholarship.wustl.edu/art_sci_etds/1523
Comments
Permanent URL: https://doi.org/10.7936/K7GM86QQ