## Department of Biostatistics Seminar/Workshop Series

## Biostatistics Student Research Forum:

# A Modified Random Forest Kernel for Highly Nonstationary Gaussian Process Regression with Application to Clinical Data

## Jacob VanHouten, Ph.D. Candidate, Department of Biostatistics

## Vanderbilt University, School of Medicine

Nonstationary Gaussian process regression can be used to transform irregularly episodic and noisy measurements into continuous probability densities, making them more compatible with standard learning algorithms. However, current inference algorithms are time-consuming or struggle with the highly bursty, extremely nonstationary data that are common in the medical domain. One efficient and flexible solution uses a partition kernel based on random forests, but its current embodiment produces undesirable pathologies rooted in the piecewise-constant nature of its inferred posteriors. We present a modified random forest kernel that adds a new source of randomness to the trees, which overcomes the existing pathologies and produces good results for highly bursty, extremely nonstationary clinical laboratory measurements.
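
For readers unfamiliar with partition kernels, the general idea can be sketched as follows: the kernel value for two time points is the fraction of random partitions in which both points land in the same cell. This is only an illustration under stated assumptions, not the speaker's method. It substitutes uniformly random cut points on a one-dimensional time axis for fitted random forest trees, it does not include the modification presented in the talk, and the function name `random_partition_kernel` and its parameters are hypothetical.

```python
import numpy as np


def random_partition_kernel(x, n_partitions=200, n_cuts=5, seed=0):
    """Illustrative partition kernel on 1-D inputs (hypothetical helper).

    K[i, j] is the fraction of random partitions in which x[i] and x[j]
    fall into the same cell. Assumption: partitions come from uniformly
    random cut points, standing in for the leaves of random forest trees.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    K = np.zeros((x.size, x.size))
    for _ in range(n_partitions):
        # Draw one random partition of [lo, hi] into n_cuts + 1 cells.
        cuts = np.sort(rng.uniform(lo, hi, size=n_cuts))
        # Cell index of each point = number of cuts strictly below it.
        cells = np.searchsorted(cuts, x)
        # Add 1 wherever two points share a cell in this partition.
        K += cells[:, None] == cells[None, :]
    return K / n_partitions


# Bursty, irregular sampling times (made-up values for illustration only).
t = np.array([0.0, 0.1, 0.2, 5.0, 9.0, 9.1])
K = random_partition_kernel(t)
print(np.round(K, 2))
```

Each single partition contributes a block matrix of ones, so every draw from a kernel of this kind is piecewise constant between cut points. This is consistent with the abstract's remark that the pathologies of the existing kernel are rooted in piecewise-constant posteriors.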