Biostatistics Weekly Seminar

Set-based inference for integrative analysis of genetic compendiums

Ryan Sun, PhD
Harvard University

The increasing popularity of biobanks and other genetic compendiums has introduced exciting opportunities to extract knowledge from datasets that combine information from a variety of genetic, genomic, environmental, and clinical sources. To manage the large number of association tests that may be performed with such data, set-based inference strategies have emerged as a popular alternative to testing individual features. Set-based tests enjoy natural advantages, including a decreased multiplicity burden and improved interpretability in certain settings. However, existing methods often struggle to provide adequate power because of three issues in particular: sparse signals, weak effect sizes, and features exhibiting diverse correlation structures. Motivated by these challenges, we propose the Generalized Berk-Jones (GBJ) statistic, a set-based association test designed to detect rare and weak signals while explicitly accounting for arbitrary correlation patterns. Consistent with its formulation as a generalization of the Berk-Jones statistic, GBJ demonstrates improved power compared to other set-based tests over a variety of moderately sparse settings. We apply GBJ to perform inference on sets of genotypes and sets of phenotypes, and we also discuss strategies for situations where the global null is not the null hypothesis of interest.

2525WE, 10th Floor VICTR Conference Room
24 January 2019

Topic revision: r4 - 10 Jan 2019, TawannaPeters
