Biostatistics

Division of Biostatistics and Epidemiology

Department of Health Evaluation Sciences

924-8712

July 2001

Frank Harrell PhD

James Patrie MS

Jennifer Gibson MS

Mark Conaway PhD

Rosner, B.

Series of short articles about statistical concepts by JM Bland and DG Altman

Cohn V: A perspective from the press: how to help reporters tell the truth (sometimes).

Matthews JNS,

- Experimental design
- Various types of random variables
- Data distributions and descriptive statistics
- Graphical presentation of data and results
- Probability
- Data analysis for description, estimation, hypothesis testing, and prediction
- Linear regression models
- Dealing with repeated measurements in one patient and how to measure change
- Avoiding pitfalls in interpreting statistical analyses

- Presenter: Frank Harrell
- Topics and Readings:

General Overview (R1)

Descriptive Statistics and Graphics (AB8, R2)

- Objectives: To
- understand the role of biostatistics as a science, and biostatistical methods as tools of scientific inquiry
- understand the meaning of description, estimation, hypothesis testing, and prediction
- understand what is meant by *random variable*
- know advantages of using continuous variables and of preserving their continuous nature in the analysis
- understand distributions of random variables
- know characteristics of distributions (central tendency, variance (variability, spread), quantiles or percentiles)
- be able to choose graphs that are useful for depicting data distributions
- be able to choose graphs that are useful for summarizing results of studies
- be able to make informative tables

- Presenter: Jennifer Gibson
- Topics and Readings:

Probability (R3.1-3.6)

Estimation (R6.1-6.2,6.4-6.7.1)

- Objectives: To
- understand the meaning of probability
- understand what it means to say that two events are independent
- be able to compute the probability of the union of two events
- be able to compute the probability of the intersection of two independent events
- understand conditional probability
- know the meaning of population and a sample from that population
- know how to estimate population quantities such as mean, median and other quantiles, and standard deviation from sample values
- obtain an initial understanding of interval estimates and how to construct a confidence interval for the unknown mean of a normal-shaped population
- understand how to estimate a population probability from a sample of events and non-events
- know a simple approximate formula for a confidence interval for an unknown population probability
- memorize and understand the 3/*n* rule
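
As a small illustration of the last three objectives, the sketch below computes an approximate confidence interval for a population probability and compares the 3/*n* rule with the exact limit it approximates. It assumes that the "simple approximate formula" is the usual normal-approximation (Wald) interval and that the 3/*n* rule refers to the upper 95% confidence limit when no events are observed in *n* subjects; the function names are invented for this example and do not come from the course materials.

```python
import math


def wald_interval(events, n, z=1.96):
    """Approximate 95% interval for a probability: p_hat +/- z * SE(p_hat).

    Assumption: this is the normal-approximation (Wald) formula, which is
    unreliable when n is small or p_hat is close to 0 or 1.
    """
    p_hat = events / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    # Clip to [0, 1] because a probability cannot leave that range.
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


def rule_of_three_upper(n):
    """Approximate upper 95% limit for a probability when 0 of n had the event."""
    return 3 / n


def exact_upper_zero_events(n, alpha=0.05):
    """Exact one-sided upper limit: the p that solves (1 - p)**n = alpha."""
    return 1 - alpha ** (1 / n)


if __name__ == "__main__":
    print("20 events in 100:", wald_interval(20, 100))
    print("0 events in 100, 3/n rule:", rule_of_three_upper(100))
    print("0 events in 100, exact:   ", exact_upper_zero_events(100))
```

The rule works because -ln(0.05) is about 3, so the exact limit is close to 3/*n* once *n* is moderately large.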

- Presenter: Frank Harrell
- Topics and Readings:

Hypothesis Testing: One-sample inference (R7(except 7.4.1,7.8,7.9.2,7.10),BA8,AB¹)

Two-sample inference (R8(except 8.6,8.7,8.9,8.11),MA25)

- Objectives: To
- understand the fundamentals of hypothesis testing and assembling evidence using classical statistics
- know the meanings of type I and II errors, *P*-values, and power
- know the general structure of a *t* statistic
- know one basis for estimating the required sample size
- understand the construction and interpretation of a confidence interval for an unknown mean from a normal population
- know the relationship between confidence intervals and *P*-values
- know how to carry out and interpret a one-sample *t*-test for paired (R8.2) or unpaired data from a normal distribution
- understand how *P*-values are "backwards" and how to avoid errors in interpreting them
- learn how to compute and interpret confidence intervals for the difference in two population means when the data are normal
- understand the setup for a two-sample problem
- be able to carry out a two-sample (unpaired) *t*-test for normally distributed data
- be able to construct and interpret a confidence interval for the difference in two means
- know how to compute power or the sample size to achieve a given power for comparing two means
- know how to compute the sample size to achieve a given precision for estimating a probability, a mean, and a difference in two means
- understand pitfalls in interpreting *P*-values
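
The sample size objectives above can be sketched with the standard normal-approximation formulas. This is an illustration under stated assumptions, not necessarily the method presented in the session: equal group sizes, a common known standard deviation `sigma`, a two-sided test, and normal rather than *t* quantiles (so the answers are slightly smaller than *t*-based ones). The function names are made up for this example.

```python
import math
from statistics import NormalDist


def n_per_group_two_means(delta, sigma, alpha=0.05, power=0.80):
    """Subjects per group needed to detect a difference `delta` in two means.

    Uses n = 2 * (sigma * (z_{1-alpha/2} + z_{power}) / delta)**2.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * (sigma * (z_alpha + z_power) / delta) ** 2)


def n_for_mean_precision(margin, sigma, confidence=0.95):
    """Subjects needed so the confidence interval for one mean is +/- `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)


if __name__ == "__main__":
    # Detect a difference of half a standard deviation with 80% power.
    print(n_per_group_two_means(delta=0.5, sigma=1.0))
    # Estimate a mean to within +/- 2 units when sigma is about 10.
    print(n_for_mean_precision(margin=2, sigma=10))
```

Both functions round up, since a fractional subject cannot be enrolled.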

- Presenter: Frank Harrell
- Topics and Readings:

Comparing two proportions (R10.1-10.2,10.5.1)

Nonparametric methods (R9.1,9.3-9.6)

Hypothesis testing review (R7,R8)

- Objectives: To
- learn how to do an approximate test for the difference in two proportions by hand
- learn to use approximate methods for computing sample size or power for comparing two population probabilities
- learn the advantages of nonparametric tests for continuous responses without assuming a distribution
- understand the nonparametric counterpart of the one-sample *t*-test, the Wilcoxon signed-rank test
- understand the nonparametric counterpart of the two-sample *t*-test, the Wilcoxon-Mann-Whitney two-sample rank-sum test
- review "big picture" concepts of hypothesis testing and interval estimation
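
One way to carry out the approximate test for two proportions "by hand" is the pooled normal-approximation *z* test sketched below. It is assumed here that this is the kind of test intended; the version shown omits the continuity correction that some textbooks add, and the function name and example counts are invented for illustration.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled z test of H0: p1 = p2, without continuity correction.

    x1, x2 are event counts and n1, n2 are group sizes.
    Returns the z statistic and the two-sided P-value.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # estimate of the common probability under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical data: 30 of 100 events in one group, 15 of 100 in the other.
    z, p = two_proportion_z_test(30, 100, 15, 100)
    print(f"z = {z:.2f}, two-sided P = {p:.4f}")
```

The approximation is usually considered adequate only when the expected numbers of events and non-events in each group are not small.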

- Presenter: Jim Patrie
- Topics and Readings:

Regression and Correlation (R11.1-11.7,11.9-11.10)

- Objectives: To
- understand in detail the simple linear regression model and how its slope and intercept are estimated
- understand interval estimation of the slope and of a prediction
- know the assumptions made by regression
- understand multiple regression, especially interpreting regression coefficients and what it means to adjust for the effects of certain variables
- know what the linear correlation coefficient measures
- understand the correspondence between testing for nonzero correlation and testing for nonzero slope in simple regression
- be able to interpret *R*²
- know the assumptions made by standard linear multiple regression
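
The correspondence between testing for nonzero slope and testing for nonzero correlation can be checked numerically. The sketch below fits a simple linear regression by least squares using only the standard library and shows that the *t* statistic for the slope equals r·sqrt(n-2)/sqrt(1-r²). The data and the function name are hypothetical and serve only as an illustration.

```python
import math


def simple_regression(x, y):
    """Least-squares fit of y = intercept + slope * x, with related statistics."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)           # linear correlation coefficient
    sse = syy - slope * sxy                  # residual sum of squares
    se_slope = math.sqrt(sse / (n - 2) / sxx)

    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": r ** 2,                 # fraction of variation in y explained
        "t_slope": slope / se_slope,         # tests H0: slope = 0
        "t_corr": r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2),  # tests H0: rho = 0
    }


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    y = [2.1, 3.9, 6.2, 7.8, 10.1]           # made-up measurements
    for name, value in simple_regression(x, y).items():
        print(f"{name}: {value:.4f}")
```

Both statistics are referred to a *t* distribution with n-2 degrees of freedom, so the two tests give the same *P*-value.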

- Presenter: Frank Harrell
- Topics and Readings:

Regression Review (R11)

Rank correlation (R11.12)

One-way analysis of variance and the Kruskal-Wallis test (R12.1,AB20,R12.7)

Heterogeneity of effects (BA23,AM24,MA25,MA26,R12.6)

Analysis of covariance (R12.5.3)

Multiple significance tests (BA10)

- Objectives: To
- further understand the most important issues related to regression analysis, and hazards of multiple regression
- know how to estimate the sample size needed to estimate a correlation coefficient to a certain precision
- know the advantages of the nonparametric counterpart to the linear correlation coefficient and test
- understand principles involved in comparing *k* groups using analysis of variance
- know a method for pairwise comparisons of means
- understand how the Kruskal-Wallis test generalizes the Wilcoxon test from 2 to *k* samples
- understand advantages of the Kruskal-Wallis test over parametric analysis of variance
- know when a two-way ANOVA is appropriate
- be introduced to methods for assessing differential treatment effects
- know the purpose of analysis of covariance
- be introduced to methods (such as Bonferroni) for keeping the probability of a false positive result at an acceptable level when many hypotheses are tested
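
The Bonferroni method named in the last objective can be sketched in a few lines. The example also shows why a correction is needed, by computing the chance of at least one false positive when many independent true null hypotheses are each tested at level alpha. The function names and *P*-values are made up for illustration.

```python
def family_wise_error(alpha, m):
    """Chance of at least one false positive among m independent tests of true nulls."""
    return 1 - (1 - alpha) ** m


def bonferroni(p_values, alpha=0.05):
    """Bonferroni correction: test each hypothesis at level alpha / m.

    Returns the adjusted P-values (capped at 1) and the reject decisions.
    """
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    reject = [p <= alpha / m for p in p_values]
    return adjusted, reject


if __name__ == "__main__":
    print("Uncorrected error rate for 10 tests:", family_wise_error(0.05, 10))
    adjusted, reject = bonferroni([0.001, 0.02, 0.04, 0.30])
    print("Adjusted P-values:", adjusted)
    print("Reject at overall 0.05:", reject)
```

Bonferroni keeps the overall false positive probability at or below alpha without assuming independence, at the cost of reduced power for each individual test.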

- Presenters: Mark Conaway and Frank Harrell
- Topics, Readings, and Presenter:

Measuring change (Harrell) (TBD)

Repeated Measurements (Conaway) (BA1,BA12,BA13,Matthews *et al.*)

Experimental Design (Conaway)

- Objectives: To
- know problems with percent change
- understand one basis for choosing a measure of change
- understand some of the most common experimental designs used in experiments to compare therapies
- be introduced to factorial designs and their advantages and disadvantages
- know why multiple measurements from the same patient cannot be analyzed as if they were measurements from separate patients
- be introduced to simple methods for analyzing such serial data

- 1: "Absence of evidence" paper