## Department of Biostatistics Seminar/Workshop Series

# **Selection of Optimal Cut-Points to Dichotomize Continuous Predictors to Discriminate Disease Status**

Bethany J. Wolf, PhD, Assistant Professor of Biostatistics, Department of Public Health Science, Medical University of South Carolina

Continuous variables are often dichotomized to develop decision tools in clinical practice. However, development of complex diseases may be influenced by interactions between genetic and environmental factors. Appropriate clinical management of a patient requires consideration of these factors when assessing patient risk for disease. If true cut-points for one or more continuous variables exist, the challenge is identifying them. We first examine common methods for dichotomization to identify which methods recover a true cut-point. We provide mathematical and numerical proofs demonstrating that maximizing the odds ratio, Youden's statistic, Gini index, chi-square statistic, relative risk, and kappa statistic all theoretically recover a true cut-point. A simulation study evaluating how well these statistics recover a cut-point when sampling from a population indicates that maximizing the chi-square statistic or the Gini index yields the smallest bias and variability, while maximizing the odds ratio is the most variable and biased of the methods. Although these methods can be shown to recover a true cut-point, there is limited methodology for simultaneously optimizing cut-points for two or more variables. We therefore also propose a method for jointly dichotomizing two or more variables and conduct simulations comparing joint and marginal dichotomization in their ability to recover the correct cut-points. Our results show that, in most scenarios, cut-points selected jointly exhibit smaller mean square error and similar bias relative to those selected marginally.
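
As a rough illustration of what selecting a cut-point by maximizing a statistic can look like for a single predictor, the sketch below scans candidate cut-points and keeps the one with the largest Youden's statistic or chi-square statistic. It is not the speaker's implementation: the function names (`youden_j`, `chi_square`, `best_cut`), the simulated data, and the chosen true cut-point of 6.0 with disease probabilities 0.2 and 0.7 are all assumptions made for the example.

```python
import numpy as np

def youden_j(x, y, cut):
    """Youden's J = sensitivity + specificity - 1 for the rule 'x > cut'."""
    pred = x > cut
    sens = np.mean(pred[y == 1])    # fraction of diseased above the cut
    spec = np.mean(~pred[y == 0])   # fraction of non-diseased at or below the cut
    return sens + spec - 1.0

def chi_square(x, y, cut):
    """Pearson chi-square statistic of the 2x2 table (x > cut) by disease status."""
    pred = x > cut
    a = float(np.sum(pred & (y == 1)))
    b = float(np.sum(pred & (y == 0)))
    c = float(np.sum(~pred & (y == 1)))
    d = float(np.sum(~pred & (y == 0)))
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def best_cut(x, y, statistic):
    """Scan midpoints between sorted unique x values; return the maximizer."""
    xs = np.unique(x)
    candidates = (xs[:-1] + xs[1:]) / 2
    scores = np.array([statistic(x, y, c) for c in candidates])
    return candidates[np.argmax(scores)]

# Hypothetical simulated data: disease risk jumps at an assumed true cut-point.
rng = np.random.default_rng(0)
n = 2000
true_cut = 6.0
x = rng.uniform(0, 10, n)
y = rng.binomial(1, np.where(x > true_cut, 0.7, 0.2))

print("Youden's J cut-point:", best_cut(x, y, youden_j))
print("Chi-square cut-point:", best_cut(x, y, chi_square))
```

Repeating the simulation over many random samples and summarizing the estimated cut-points would give a simple way to compare the bias and variability of different statistics, in the spirit of the simulation study described above.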