Discussion Board for Issues Relating to the Analysis of Serial, Longitudinal, and Repeated Measures Data
Karl Knoblick (karlknoblich@yahoo.de) asked several good questions on the R-help list on 17 May 2007.
Here are his questions and a few answers.
I have two groups (placebo/verum), and every subject is measured at 5 times: the first time, t0, is the baseline measurement, and t1 to t4 are the measurements after applying the medication (placebo or verum). The question is whether there is a significant difference between the two groups and how large the difference is (95% confidence intervals).
Let me give sample data:
# Data
ID <- factor(rep(1:50,each=5)) # 50 subjects
GROUP <- factor(c(rep("Verum", 115), rep("Placebo", 135)))
TIME <- factor(rep(paste("t",0:4,sep=""), 50))
set.seed(1234)
Y <- rnorm(250)
# to have an effect:
Y[GROUP=="Verum" & TIME=="t1"] <- Y[GROUP=="Verum" & TIME=="t1"] + 0.6
Y[GROUP=="Verum" & TIME=="t2"] <- Y[GROUP=="Verum" & TIME=="t2"] + 0.3
Y[GROUP=="Verum" & TIME=="t3"] <- Y[GROUP=="Verum" & TIME=="t3"] + 0.9
Y[GROUP=="Verum" & TIME=="t4"] <- Y[GROUP=="Verum" & TIME=="t4"] + 0.9
DF <- data.frame(Y, ID, GROUP, TIME)
I have heard of different ways to analyse the data:
1) Comparing the endpoint t4 between the groups (t-test), ignoring baseline
2) Comparing the difference t4 minus t0 between the two groups (t-test)
3) Comparing the endpoint t4 with t0 as a covariate between the groups (ANOVA - how can this model be calculated in R?)
4) Taking a summary score (I'm not sure, but this may be a suggestion of Altman) instead of t4
5) Repeated-measures ANOVA (times t0 to t4, groups placebo/verum), subject as random factor - interested in the interaction time*group (How to do this in R?)
6) As 5) but times t1 to t4, ignoring baseline (How to do this in R?)
7) As 6) but with baseline t0 as an additional covariate (How to do this in R?)
Which will be best? (Advantages / disadvantages?)
How can these models be analysed in R with nested and random effects (ID nested in group - at least I think so - and ID as the random parameter) and a possible covariate? Or is there a simpler possibility?
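As a rough illustration of the first three approaches in the list above, here is a minimal sketch in base R. It rebuilds the simulated data from the question and reshapes it to one row per subject; the names wide, Y0, Y4 and change are made up for this example and do not appear in the original post. The ANCOVA here treats the baseline as linear, which is an assumption of the sketch.

```r
# Recreate the simulated data from the question (same seed, same shifts)
ID    <- factor(rep(1:50, each = 5))
GROUP <- factor(c(rep("Verum", 115), rep("Placebo", 135)))
TIME  <- factor(rep(paste("t", 0:4, sep = ""), 50))
set.seed(1234)
Y <- rnorm(250)
shift <- c(t1 = 0.6, t2 = 0.3, t3 = 0.9, t4 = 0.9)
for (tm in names(shift)) {
  sel <- GROUP == "Verum" & TIME == tm
  Y[sel] <- Y[sel] + shift[[tm]]
}
DF <- data.frame(Y, ID, GROUP, TIME)

# One row per subject: baseline (Y0) and endpoint (Y4) side by side
base <- DF[DF$TIME == "t0", c("ID", "GROUP", "Y")]
end  <- DF[DF$TIME == "t4", c("ID", "Y")]
names(base)[3] <- "Y0"
names(end)[2]  <- "Y4"
wide <- merge(base, end, by = "ID")

# Endpoint t4 only, ignoring baseline (Welch t-test).
# Factor levels are alphabetical, so the reported difference is Placebo minus Verum.
tt_end <- t.test(Y4 ~ GROUP, data = wide)

# Change from baseline (t4 minus t0) compared between groups
wide$change <- wide$Y4 - wide$Y0
tt_change <- t.test(change ~ GROUP, data = wide)

# Endpoint t4 with baseline t0 as a covariate (ANCOVA via lm)
fit_ancova <- lm(Y4 ~ Y0 + GROUP, data = wide)
ci_ancova  <- confint(fit_ancova)["GROUPVerum", ]  # 95% CI, Verum minus Placebo

print(tt_end$conf.int)
print(tt_change$conf.int)
print(ci_ancova)
```

Note the sign convention: the two t-test intervals are for Placebo minus Verum, while the GROUPVerum coefficient in the linear model is Verum minus Placebo.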
FrankHarrell's response:
- Don't even consider t-tests ignoring baseline.
- Comparing differences from baseline over the two groups is not optimal.
- Using t0 as a covariate is the way to go. A question is whether to just use t4. Generally this is not optimal.
- It's not obvious that random effects are needed if you take the correlation into account in a good way. Generalized least squares using, for example, an AR(1) correlation structure (and there are many others) is something I often prefer. A detailed case study with R code (similar to your situation) is in FrankHarrellGLS. This includes details about why t0 is best considered as a covariate. One reason is that the t0 effect may not be linear.
- If you want to focus on t4 it is easy to specify a contrast (after fitting is completed) that tests t4. If time is modeled as continuous, this contrast would involve predicted values at the 4th time; otherwise it amounts to testing single parameters.
- Like mixed effects models, GLS is fairly robust to non-random dropouts and allows you to use all available data without the need to impute missing response values.
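The generalized least squares approach described above could be sketched as follows. This is only an illustration under stated assumptions, not the code from the case study mentioned in the response: it uses gls() and corAR1() from the nlme package, treats time as categorical, enters the baseline linearly (the response notes the t0 effect may not be linear), and uses a normal approximation for the confidence interval. The names follow, Y0, visit and k are invented for this example.

```r
library(nlme)

# Recreate the simulated data from the question (same seed, same shifts)
ID    <- factor(rep(1:50, each = 5))
GROUP <- factor(c(rep("Verum", 115), rep("Placebo", 135)))
TIME  <- factor(rep(paste("t", 0:4, sep = ""), 50))
set.seed(1234)
Y <- rnorm(250)
shift <- c(t1 = 0.6, t2 = 0.3, t3 = 0.9, t4 = 0.9)
for (tm in names(shift)) {
  sel <- GROUP == "Verum" & TIME == tm
  Y[sel] <- Y[sel] + shift[[tm]]
}
DF <- data.frame(Y, ID, GROUP, TIME)

# Follow-up rows (t1..t4) with each subject's baseline attached as covariate Y0
base <- DF[DF$TIME == "t0", c("ID", "Y")]
names(base)[2] <- "Y0"
follow <- merge(DF[DF$TIME != "t0", ], base, by = "ID")
follow$TIME  <- droplevels(follow$TIME)
follow$visit <- as.integer(follow$TIME)        # 1..4 for t1..t4
follow <- follow[order(follow$ID, follow$visit), ]

# GLS with AR(1) correlation among visits within subject
fit_gls <- gls(Y ~ Y0 + GROUP * TIME, data = follow,
               correlation = corAR1(form = ~ visit | ID))

# Contrast: Verum minus Placebo at t4
# (with treatment coding: group main effect plus the group-by-t4 interaction)
b <- coef(fit_gls)
V <- vcov(fit_gls)
k <- setNames(numeric(length(b)), names(b))
k[c("GROUPVerum", "GROUPVerum:TIMEt4")] <- 1
est <- sum(k * b)
se  <- sqrt(drop(t(k) %*% V %*% k))
ci  <- est + c(-1, 1) * qnorm(0.975) * se      # approximate 95% CI

print(c(estimate = est, lower = ci[1], upper = ci[2]))
```

All four follow-up measurements contribute to the fit, and subjects with some missing follow-up visits would still be used without imputing their responses. To allow a nonlinear baseline effect, Y0 could be replaced by a spline term such as splines::ns(Y0, 3); whether that is worthwhile depends on the data.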
Topic revision: r2 - 10 Jul 2008 - 15:55:35 - JohnHarrell