How Should Change be Measured?
 Analysis of Paired Observations
 Frequently one makes multiple observations on the same experimental unit
 Can't analyze as if independent
 When two observations are made on each unit (e.g., pre-post), it is common to summarize each pair using a measure of effect and then to analyze the effects as if they were (unpaired) raw data
 Most common: simple difference, ratio, percent change
 Can't take effect measure for granted
 Subjects having large initial values may have largest differences
 Subjects having very small initial values may have largest post/pre ratios
 What's Wrong with Percent Change?
 First, we define percent change to be: % change = (first value - second value) / second value * 100
 The first value is often called the new value and the second value is called the old value, but this does not fit all situations
 Example:
 Treatment A: 0.05 proportion having stroke
 Treatment B: 0.09 proportion having stroke
 The point of reference (which term is used in the denominator?) will impact the answer
 Treatment A reduced proportion of stroke by 44%
 Treatment B increased proportion by 80%
 Two increases of 50% result in a total increase of 125%, not 100%
* Math details: If %$x$% is your original amount, two increases of 50% give %$x\times 1.5\times 1.5$%. Then % change = %$(1.5\times 1.5\times x - x) / x = x\times (1.5\times 1.5 - 1) / x = 1.25$%, or a 125% increase

 Percent change (or ratio) is not a symmetric measure
 A 50% increase followed by a 50% decrease results in an overall decrease (not no change)
 A 50% decrease followed by a 50% increase results in an overall decrease (not no change)
 Simple difference or log ratio are symmetric
 Unless percents represent proportions times 100, it is not appropriate to compute descriptive statistics (especially the mean) on percents. For example, the correct summary of a 100% increase and a 50% decrease, if they both started at the same point, would be 0%, not the arithmetic mean of 25%.
 Analysis of % change has lower power than other methods
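The arithmetic behind the points above can be checked with a short sketch. The function names `percent_change` and `log_ratio` are illustrative only and are not part of the original notes; the numbers are the stroke proportions and the 50% changes used in the list.

```python
import math

def percent_change(new, old):
    """Percent change of `new` relative to the reference value `old`."""
    return (new - old) / old * 100.0

def log_ratio(new, old):
    """Natural log of new/old; swapping the arguments only flips the sign."""
    return math.log(new / old)

# Stroke example: the answer depends on which treatment is the reference.
prop_a, prop_b = 0.05, 0.09
a_vs_b = percent_change(prop_a, prop_b)   # A relative to B (a reduction)
b_vs_a = percent_change(prop_b, prop_a)   # B relative to A (an increase)

# Two successive 50% increases compound multiplicatively.
start = 100.0
two_increases = percent_change(start * 1.5 * 1.5, start)

# A 50% increase followed by a 50% decrease does not return to the start.
up_then_down = percent_change(start * 1.5 * 0.5, start)

# The log ratio is symmetric: the two directions cancel exactly.
log_a_vs_b = log_ratio(prop_a, prop_b)
log_b_vs_a = log_ratio(prop_b, prop_a)

print(round(a_vs_b, 1), round(b_vs_a, 1))
print(two_increases, up_then_down)
print(round(log_a_vs_b + log_b_vs_a, 12))
```

The two percent changes for the same pair of proportions differ in size as well as sign, while the two log ratios differ only in sign.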
 Objective Method for Choosing Effect Measure
 Goal: Measure of effect should be as independent of baseline value as possible
 Note: Because of regression to the mean, it may be impossible to make the measure of change truly independent of the initial value. A high initial value may be that way because of measurement error. The high value will cause the change to be less than it would have been had the initial value been measured without error. Plotting differences against averages rather than against initial values will help reduce the effect of regression to the mean.
 Plot difference in pre and post values vs. the average of the pre and post values (Bland-Altman plot). If this shows no trend, the simple differences are adequate summaries of the effects, i.e., they are independent of initial measurements.
 If a systematic pattern is observed, consider repeating the previous step after taking logs of both the pre and post values. If this removes any systematic relationship between the average and the difference in logs, summarize the data using logs, i.e., take the effect measure as the log ratio.
 Other transformations may also need to be examined
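The steps above can be sketched numerically. This is only an illustration under stated assumptions: the data are simulated (not from any study) with a multiplicative treatment effect, the helper `bland_altman_trend` is a hypothetical name, and a correlation coefficient is used as a crude stand-in for inspecting the Bland-Altman plot by eye.

```python
import numpy as np

def bland_altman_trend(pre, post):
    """Correlation between (post - pre) and the pair average.

    A value near 0 suggests no linear trend in the Bland-Altman plot.
    """
    pre = np.asarray(pre, dtype=float)
    post = np.asarray(post, dtype=float)
    diff = post - pre
    avg = (pre + post) / 2.0
    return float(np.corrcoef(diff, avg)[0, 1])

# Simulated data (assumption): each subject has a true level, both
# measurements carry multiplicative noise, and the effect is a constant
# proportional reduction.
rng = np.random.default_rng(0)
n = 2000
true_level = np.exp(rng.normal(3.0, 0.5, n))
pre = true_level * np.exp(rng.normal(0.0, 0.1, n))
post = true_level * np.exp(-0.2 + rng.normal(0.0, 0.1, n))

# Step 1: differences vs. averages on the raw scale.
trend_raw = bland_altman_trend(pre, post)

# Step 2: repeat after taking logs of both the pre and post values.
trend_log = bland_altman_trend(np.log(pre), np.log(post))

print(f"raw scale trend: {trend_raw:.2f}")
print(f"log scale trend: {trend_log:.2f}")
```

With data generated this way, the raw differences depend on the average, while the differences in logs do not, which would point to the log ratio as the effect measure. In practice the plot itself should be examined, since a correlation only detects linear trends.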
Avoiding Change as a Response Variable in Parallel Designs
In a two-group parallel design, analysis of change is not recommended at all. The response variable should be the final measurement and the baseline measurement should be adjusted for as a covariate using analysis of covariance, with treatment assigned as one of the other variables. Besides the issues listed above, change scores are affected by regression to the mean, and they implicitly assume that the slope of the final measurement on the baseline value is 1.0, which may not hold.
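A minimal sketch of this analysis of covariance, fitted by ordinary least squares with NumPy. The data are simulated under assumptions chosen for illustration (a true baseline slope of 0.5 and a true treatment effect of 1.0); none of the values come from the original notes.

```python
import numpy as np

# Simulated two-group parallel trial (assumed values, for illustration only).
rng = np.random.default_rng(1)
n = 2000
treat = rng.integers(0, 2, n)              # 0 = control, 1 = treated
baseline = rng.normal(0.0, 1.0, n)
followup = 0.5 * baseline + 1.0 * treat + rng.normal(0.0, 1.0, n)

# ANCOVA: final measurement regressed on intercept, baseline, and treatment.
X = np.column_stack([np.ones(n), baseline, treat])
coef, *_ = np.linalg.lstsq(X, followup, rcond=None)
intercept, baseline_slope, treatment_effect = coef

print(f"estimated baseline slope:   {baseline_slope:.2f}")
print(f"estimated treatment effect: {treatment_effect:.2f}")
```

Here the baseline slope is estimated from the data rather than being fixed at 1.0, which is what analyzing follow-up minus baseline would implicitly do.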
Summary of Reasons to Avoid Change Scores (Change from Baseline)
 It is imperative to adjust for the baseline value anyway, using regression modeling (for reasons of bias reduction in observational studies and for maximizing power and precision in randomized trials).
 Summary statistics computed on change scores strongly assume that the change measure is valid (see above). For example, computing the mean change from baseline of a response variable Y that has been transformed by a function f (which may be f(Y)=Y, i.e., no transformation needed) assumes that f(follow-up) - f(baseline) has constant variance across subjects and that it has no correlation with f(baseline) + f(follow-up). The recommended 3-number summary (quartiles) of the response can change arbitrarily if different transformations are used, because different transformations can reorder change scores across patients. On the other hand, quartiles of Y at follow-up are always valid and are never misleading, as the median of f(Y) is just f(median Y) for a monotonic f. Instead of plotting changes from baseline, plot responses over time with baseline values located at t=0.
 Due to the natural history of disease and regression to the mean caused by measurement error, patients change over time. Such changes are not of interest with respect to the goals of the study, and may result in misleading interpretations of changes from baseline in all treatment groups.
 Using each patient as her own control through calculation of a change score is worse than using no control if the baseline is noisy. If the correlation between baseline and follow-up measurement is less than 0.5, subtracting the baseline is worse than just analyzing the follow-up measurement.
 A noisy baseline cannot hurt analysis of covariance except for spending one degree of freedom from the error variance (with a small chance of variance inflation due to loss of orthogonality). A baseline that should be ignored will receive an appropriately small regression coefficient.
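The 0.5 correlation threshold above follows from standard variance formulas, sketched here under the simplifying assumption that baseline and follow-up have the same variance `sigma2` and correlation `rho`. The function names are illustrative and do not appear in the original notes.

```python
def var_followup(sigma2, rho):
    """Variance when only the follow-up measurement is analyzed."""
    return sigma2

def var_change(sigma2, rho):
    """Variance of follow-up minus baseline: 2 * sigma2 * (1 - rho)."""
    return 2.0 * sigma2 * (1.0 - rho)

def var_ancova_residual(sigma2, rho):
    """Residual variance after linear adjustment for baseline: sigma2 * (1 - rho^2)."""
    return sigma2 * (1.0 - rho ** 2)

# Compare the three approaches at a low, threshold, and high correlation.
for rho in (0.2, 0.5, 0.8):
    print(
        f"rho={rho}: follow-up only={var_followup(1.0, rho):.2f}, "
        f"change score={var_change(1.0, rho):.2f}, "
        f"covariate-adjusted={var_ancova_residual(1.0, rho):.2f}"
    )
```

Under these assumptions the change score has larger variance than the follow-up measurement alone whenever `rho` is below 0.5, and the covariate-adjusted residual variance is never larger than either alternative (ignoring the one degree of freedom spent on the baseline coefficient).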