Note that using the Gleason data is not completely straightforward. The key point is that in the PATHOLOG dataset each biopsy is represented as multiple rows corresponding to different slides/cores, and this must be accounted for when summarizing the data. Prostate cancer diagnosis (PROSCAR) and Gleason grade are provided for each row. There should be a primary Gleason grade (GSGRA), a secondary Gleason grade (GSGRB), and a Gleason score or sum (GLSC = GSGRA + GSGRB) for each post-baseline slide/core with a prostate cancer diagnosis (PROSCAR=Y). For summary purposes, GLSC can be used as the Gleason score for a given biopsy unless multiple slides/cores have PROSCAR=Y and GLSC differs across these slides/cores. In such cases the variable OVGLSM should be used instead, as this is an overall Gleason sum provided by Bostwick Labs for this purpose. When GLSC differs across slides/cores and OVGLSM is not available, the missing value of OVGLSM should be imputed as the largest (maximum) value of GLSC for that biopsy, and the summary tables should clearly indicate that imputation was done.
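
The rule above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the study's actual derivation program: the biopsy key columns SUBJID and VISIT are hypothetical names (the text does not say how a biopsy is identified in PATHOLOG), OVGLSM is assumed to be present on at least one row of the biopsy when it is available, and missing values are assumed to be represented as None. The function name and the "source" labels are also made up for this example.

```python
from collections import defaultdict


def summarize_biopsies(rows):
    """Collapse slide/core-level rows to one Gleason score per biopsy.

    rows: iterable of dicts with keys SUBJID, VISIT (hypothetical biopsy
    identifiers), PROSCAR, GLSC and OVGLSM.
    Returns {(SUBJID, VISIT): (score, source)}, where source is "GLSC",
    "OVGLSM", "IMPUTED_MAX", or None when no core has PROSCAR=Y.
    """
    # Group the slide/core rows belonging to the same biopsy.
    by_biopsy = defaultdict(list)
    for row in rows:
        by_biopsy[(row["SUBJID"], row["VISIT"])].append(row)

    result = {}
    for key, cores in by_biopsy.items():
        # Only cores with a prostate cancer diagnosis carry a Gleason score.
        positive = [r for r in cores if r["PROSCAR"] == "Y"]
        scores = {r["GLSC"] for r in positive if r["GLSC"] is not None}
        # Assumption: OVGLSM, when provided, appears on at least one row.
        overall = next(
            (r["OVGLSM"] for r in cores if r.get("OVGLSM") is not None), None
        )

        if not scores:
            result[key] = (None, None)              # no scored positive core
        elif len(scores) == 1:
            result[key] = (scores.pop(), "GLSC")    # GLSC agrees across cores
        elif overall is not None:
            result[key] = (overall, "OVGLSM")       # discordant: overall sum
        else:
            result[key] = (max(scores), "IMPUTED_MAX")  # discordant, no OVGLSM
    return result
```

Keeping the source label alongside each score makes it straightforward to footnote the imputed biopsies in the summary tables, as the text requires.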
