Biostatistics Weekly Seminar

A unified approach for inference on algorithm-agnostic variable importance

Brian Williamson, PhD
Fred Hutchinson Cancer Research Center

Assessing the relative contribution of subsets of features in predicting the response is often of interest in predictive modeling applications. The variable importance measure used is commonly determined by the prediction technique employed, which creates a tradeoff: valid statistical inference on the true importance often requires restrictive assumptions. Rather than considering importance as a summary of a specific prediction algorithm, it is useful to consider variable importance as a summary of the true data-generating mechanism. In this talk, I will focus on a notion of variable importance that captures the best-case predictive potential attributable to one variable or a set of variables. I will discuss general conditions under which a simple estimator of this importance is nonparametric efficient and under which valid statistical inference on the true importance can be obtained, even when flexible machine learning-based techniques are used as part of the estimation strategy. Finally, I will illustrate the use of the proposed framework with data from a study of an antibody against HIV-1 infection.

Zoom (Link to Follow)
21 October 2020

Speaker Itinerary

Topic revision: r2 - 29 Sep 2020, AndrewSpieker
