### Department of Biostatistics Seminar/Workshop Series

# Probability Machines

## James D. Malley, PhD

### Center for Information Technology, National Institutes of Health, Bethesda, MD

### Wednesday, May 18, 1:30-2:30pm, MRBIII Conference Room 1220

Many statistical learning machines can provide an optimal classification for binary outcomes. Personalized medicine, however, requires more than a class label: estimating an individual patient's risk from that patient's characteristics calls for a probability. This talk shows that statistical learning machines that are consistent for the nonparametric regression problem are also consistent for the probability estimation problem. These will be called probability machines.
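A short way to see the link between regression and probability estimation, assuming the binary outcome is coded as $Y \in \{0, 1\}$ (a coding the abstract does not state explicitly), is that the regression function and the conditional probability are then the same object:

$$
E[Y \mid X = x] = 1 \cdot P(Y = 1 \mid X = x) + 0 \cdot P(Y = 0 \mid X = x) = P(Y = 1 \mid X = x).
$$

Under that coding, any machine that consistently estimates $E[Y \mid X = x]$ is estimating $P(Y = 1 \mid X = x)$.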

Probability machines discussed include classification and regression random forests and two nearest-neighbor machines, all of which accept any collection of predictors with arbitrary statistical structure. Two simulated and two real data sets illustrate the use of these machines to estimate probabilities for an individual.
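As a rough illustration of the idea, the sketch below applies a plain k-nearest-neighbor regression to a simulated 0/1 outcome and reads the fitted value as a probability estimate. It is not the speaker's implementation or data: the logistic form of `true_probability`, the single uniform predictor, the sample size, and `k = 200` are all assumptions chosen for the example, and every function name is hypothetical.

```python
import math
import random


def true_probability(x):
    """Assumed true P(Y = 1 | X = x); the logistic form is illustrative only."""
    return 1.0 / (1.0 + math.exp(-(4.0 * x - 2.0)))


def simulate(n, rng):
    """Draw n pairs (x, y) with x uniform on [0, 1] and y a 0/1 outcome."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = 1 if rng.random() < true_probability(x) else 0
        data.append((x, y))
    return data


def knn_probability(data, x0, k):
    """k-NN regression on a 0/1 outcome.

    The mean outcome among the k training points nearest to x0 is a
    regression estimate of E[Y | X = x0], which for a 0/1 outcome is
    an estimate of P(Y = 1 | X = x0).
    """
    nearest = sorted(data, key=lambda pair: abs(pair[0] - x0))[:k]
    return sum(y for _, y in nearest) / k


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    train = simulate(5000, rng)
    for x0 in (0.1, 0.5, 0.9):
        estimate = knn_probability(train, x0, k=200)
        print(f"x = {x0:.1f}  estimated = {estimate:.3f}  "
              f"true = {true_probability(x0):.3f}")
```

The estimate is an average of 0/1 values, so it always lies in [0, 1]; how close it comes to the true probability depends on the sample size and on the choice of `k`.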