Preparing data for survival analysis

Our original data file, temp.csv, has repeated glucose values for each subject (DeID) at various time points (Time) from both before and after treatment (Pre.or.Post).


x <- csv.get("temp.csv")

We first want to divide x into two data frames -- one with only the pre-treatment records, and one with only the post.

xpre <- subset(x, subset=Pre.or.Post=="pre")
xpost <- subset(x, subset=Pre.or.Post=="post")

Now for xpre and xpost, we can record the first Time value at which each subject had a glucose value < 130.
  • NOTE: x, and therefore xpre and xpost, are sorted by subject and time. That's why we can easily pull off the first Time value among each subject's records with glucose < 130.

subxpre <- subset(xpre, subset=glucose < 130)
junk <- tapply(X=subxpre$Time, INDEX=subxpre$DeID, FUN=head, n=1)
newD <- data.frame(DeID=names(junk), frsttimepre = junk)

subxpost <- subset(xpost, subset=glucose < 130)
junk <- tapply(X=subxpost$Time, INDEX=subxpost$DeID, FUN=head, n=1)
newD1 <- data.frame(DeID=names(junk), frsttimepost = junk)
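To see what this step produces, here is a minimal sketch on a small made-up data set. The data frame toy and its values are hypothetical and are not taken from temp.csv. Note that a subject with no glucose value below 130 (subject B here) does not appear in the result at all.

```r
# Hypothetical toy data, already sorted by subject and time
toy <- data.frame(
  DeID    = c("A", "A", "A", "B", "B", "C", "C"),
  Time    = c(1, 2, 3, 1, 2, 1, 2),
  glucose = c(150, 125, 120, 140, 135, 128, 160),
  stringsAsFactors = FALSE
)

# Keep only the records with glucose below the threshold
below <- subset(toy, subset = glucose < 130)

# First Time value per subject; this relies on the sort order
first_time <- tapply(X = below$Time, INDEX = below$DeID, FUN = head, n = 1)

first_df <- data.frame(DeID = names(first_time),
                       frsttime = as.vector(first_time),
                       stringsAsFactors = FALSE)
print(first_df)
```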

Now we can merge these "first time" values to xpre and xpost.

newxpre <- merge(x=xpre, y=newD, by.x="DeID", by.y="DeID", all.x=TRUE)
newxpost <- merge(x=xpost, y=newD1, by.x="DeID", by.y="DeID", all.x=TRUE)
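The all.x=TRUE argument matters here: subjects who never had a glucose value < 130 are missing from newD and newD1, and all.x=TRUE keeps their records with NA as the "first time" value. A minimal sketch with made-up data (visits and firsts are hypothetical names):

```r
# Hypothetical records for two subjects
visits <- data.frame(DeID = c("A", "A", "B", "B"),
                     Time = c(1, 2, 1, 2),
                     stringsAsFactors = FALSE)

# Only subject A has a "first time" value
firsts <- data.frame(DeID = "A", frsttime = 2, stringsAsFactors = FALSE)

# Left join: every row of visits is kept; B gets NA for frsttime
merged <- merge(x = visits, y = firsts, by = "DeID", all.x = TRUE)
print(merged)
```

Without all.x=TRUE, the rows for subject B would be dropped from the merged data frame.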

Now we want to define eventtime and censored variables for newxpre and newxpost.

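# eventtime is filled in only on the row where Time equals the subject's first time
# censored = 0 on that same row (assumed condition: eventtime is not missing)
newxpre <- upData(newxpre,
   eventtime = ifelse(frsttimepre==Time, frsttimepre, NA),
   censored = ifelse(!is.na(eventtime), 0, NA))

newxpost <- upData(newxpost,
   eventtime = ifelse(frsttimepost==Time, frsttimepost, NA),
   censored = ifelse(!is.na(eventtime), 0, NA))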
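The row-by-row logic can be sketched in base R on made-up data. This is only an illustration: the toy values are hypothetical, and the condition used for censored (eventtime is not missing) is an assumption about what the step is meant to do.

```r
# Hypothetical merged data: subject A first drops below 130 at Time 2,
# subject B never does, so frsttime is NA for B
toy <- data.frame(
  DeID     = c("A", "A", "A", "B", "B"),
  Time     = c(1, 2, 3, 1, 2),
  frsttime = c(2, 2, 2, NA, NA),
  stringsAsFactors = FALSE
)

# eventtime is set only on the row where Time equals the first time
toy$eventtime <- ifelse(toy$frsttime == toy$Time, toy$frsttime, NA)

# Assumed rule: censored = 0 on the event row, NA everywhere else
toy$censored <- ifelse(!is.na(toy$eventtime), 0, NA)
print(toy)
```

Comparing NA with Time gives NA, so subjects without a "first time" value end up with NA in both new columns.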

Lastly, we put newxpre and newxpost back together.

newx <- data.frame(DeID=c(newxpre$DeID, newxpost$DeID),
   Time=c(newxpre$Time, newxpost$Time), 
   Pre.or.Post=c(newxpre$Pre.or.Post, newxpost$Pre.or.Post),
   frsttime=c(newxpre$frsttimepre, newxpost$frsttimepost),
   eventtime=c(newxpre$eventtime, newxpost$eventtime),
   censored=c(newxpre$censored, newxpost$censored))
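One thing to watch for: if DeID or Pre.or.Post are stored as factors, c() on factors returns the underlying integer codes in R versions before 4.1.0. An alternative that may be safer is to give the two "first time" columns a common name and stack the data frames with rbind. A minimal sketch with made-up data (pre and post are hypothetical stand-ins for newxpre and newxpost):

```r
# Hypothetical stand-ins for newxpre and newxpost
pre <- data.frame(DeID = c("A", "B"),
                  Pre.or.Post = c("pre", "pre"),
                  frsttimepre = c(2, NA),
                  stringsAsFactors = FALSE)
post <- data.frame(DeID = c("A", "B"),
                   Pre.or.Post = c("post", "post"),
                   frsttimepost = c(1, 3),
                   stringsAsFactors = FALSE)

# Give the two "first time" columns the same name
names(pre)[names(pre) == "frsttimepre"] <- "frsttime"
names(post)[names(post) == "frsttimepost"] <- "frsttime"

# rbind matches data frame columns by name and stacks the rows
stacked <- rbind(pre, post)
print(stacked)
```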

This problem was posted by Shirley Liu
