\chapter{Hierarchical normal / normal model}
\label{ch:hier.norm.norm}
This chapter provides a concrete example of Bayesian hierarchical (multilevel) normal / normal model for longitudinal data. From this example, we will get hands-on experience about how to draw MCMC samples from posterior distributions using WinBUGS and BRugs package in R. All the following chapters will incorporate these aspects: (1) a concrete example with data publicly available; (2) mathematical model formulae statisticians are familiar with; (3) brief Bayesian theory if necessary; (4) programs ready to run; (5) interpretation of modeling results; (6) references for further investigation.
Hierarchical normal / normal model is analogous to mixed model, however in Bayesian world, there are no fixed effects because all parameters are treated as random with distributions.
\section{Data}
\label{sec:rats.data}
Data are obtained from WinBUGS (Spielhalter et al. 2002) example volume I ({\footnotesize\texttt{http://www.mrc-bsu.cam.ac.uk/bugs}}), originally from Gelfand et al. (1990). 30 young rats' weights were measured weekly for five weeks (Figure \ref{fig:rats.data}). For illustration purpose in the later part of this chapter, we add an artificial treatment group variable \texttt{trt} and assign the first half (15) rats to the first treatment group and the other half to the second treatment group. Denote $Y_{ij}$ as the weight of the $i$th rat measured at age $x_j$. The data is available at the IBC homepage.
\texttt{http://biostat.mc.vanderbilt.edu/BayesianDataAnalysisWithOpenBUGSAndBRugs} and
\begin{figure}[htb]
\begin{center}
\includegraphics[height=3in,width=3in]{data/rats_data}
\caption{Rats data in hierarchical normal model}
\label{fig:rats.data}
\end{center}
\end{figure}
\section{Random effects model}
The data suggest a growing pattern with age with a little downward curvature. For now we assume a linear model (\ref{eq:rats.linear}) with random effects to account for the subject-specific growth pattern. You may want to model the nonlinear pattern using restricted cubic spline (Harrell 2001). The programming code is provided for the restricted cubic spline model at the end of the chapter.
\beq\label{eq:rats.linear}
Y_{ij} &\sim& N(a_i + \beta trt_i + b_i(x_j - \bar{x}), \tau_0^{-1})\noo
a_i &\sim & N(\mu_a, \tau_a^{-1})\noo
b_i &\sim& N(\mu_b,\tau_b^{-1})
\eeq
where $\bar{x} = 22$, the average of $x$, $trt_i$ is the group assignment for rat $i$, and $\tau_0, \tau_a, \tau_b$ are precisions (1/variance) for the corresponding normal distributions. For now, we standardize the $x_j$'s around their mean to reduce dependence between two random effects $a_i$ and $b_i$ in their likelihood. This model suggests that for each subject (i.e., fix random effects $a_i$ and $b_i$, and group $trt_i$), the growth curve is linear with noise precision $\tau_0$. The group effect can be captured by $\beta$.
\textbf{A little about Bayesian notation}: In Bayesian models, precisions or precision matrices are more commonly used than variances or covariance matrices. In BUGS language, \texttt{Normal(0, tau)} means $\tau$ is precision NOT variance, which is different from the common textbooks.
\section{Prior and hyperprior}
\label{sec:prior.hyperprior}
The above model is not a fully Bayesian model, because it can be treated as a typical mixed model with fixed effects \texttt{intercept}, \texttt{day}, \texttt{trt} and random effects \texttt{intercept}, \texttt{day}. This mixed model can be fitted using popular statistical software, e.g., SAS (\texttt{Mixed} procedure) and R (\texttt{nlme} library). A fully Bayesian model needs additional equipments, priors and / or hyperpriors. Bayesian inference is from the posterior distribution based on both prior beliefs $p(\btheta)$ and data-based likelihood $p(\by|\btheta)$. Now let's look at model (\ref{eq:rats.linear}) in Bayesian way. The first equation in model (\ref{eq:rats.linear}) specifies the likelihood and the other two specify priors for $\bm a$ and $\bm b$ through another level of parameters $\mu_a, \mu_b, \tau_a$, and $\tau_b$. The other priors need to specify are for the error precision $\tau_0$ and $\beta$. Because we do not have informative belief about them, vague priors are desired. One type of vague prior is Gamma$(\epsilon, \epsilon)$ (mean 1, variance 1/$\epsilon$) (Gelman's book) for $\epsilon$ fairly small, e.g., 0.001, and $\beta \sim N(0, 10^{-6})$.
After we specify all priors for parameters, we may also need to further specify the priors for the parameters in the priors, e.g., $\mu_a, \mu_b, \tau_a$, and $\tau_b$ in model (\ref{eq:rats.linear}), which are called hyperpriors. In most cases, the hyperpriors are vague. In this model, the vague hyperpriors are specified as follows $\mu_a, \mu_b \sim N(0, 10^{-6})$ and $\tau_a, \tau_b \sim $ Gamma$(\epsilon, \epsilon)$. As a summary the fully Bayesian model (\ref{eq:rats.linear}) consists of three levels: data-based likelihood level $p(\by|\btheta)$, prior level $p(\btheta|\bpsi)$, and hyperprior level $p(\bpsi)$. Complex models may involve more levels, but models with more than four levels are unusual and unhelpful. The higher the level is, the more contribution to the posterior inference, so the likelihood provides the most information, then the prior, then the hyperprior. In clinical trials, as data cumulates during the trial, the prior's effect on the posterior becomes less.
\section{BUGS program}
\label{sec:HNNbugs}
Throughout this course, we only focus on BUGS language for it is very convenient and easy to program. We recommend use it whenever possible. BUGS is a highly-structured language and users do not have a lot control unlike R and C. Both WinBUGS standalone and BRugs in R share the same code, however WinBUGS will be used first because of its user-friendly interface.
\linespread{1.0}
\footnotesize
\begin{verbatim}
model
{
#likelihood p(Y|theta)
for( i in 1 : N ) {
for( j in 1 : T ) {
Y[i , j] ~ dnorm(mu[i , j],tau.0)
mu[i , j] <- a[i] + beta * trt[i] + b[i] * (x[j] - xbar)
}
#Prior p(theta|Psi)
a[i] ~ dnorm(mu.a, tau.a)
b[i] ~ dnorm(mu.b, tau.b)
}
#prior
tau.0 ~ dgamma(0.001,0.001)
beta ~ dnorm(0.0,1.0E-6)
#hyper-priors
mu.a ~ dnorm(0.0,1.0E-6)
mu.b ~ dnorm(0.0,1.0E-6)
tau.a ~ dgamma(0.001,0.001)
tau.b ~ dgamma(0.001,0.001)
#parameters of interest
sigma <- 1 / sqrt(tau.0) #error sd
w0[1] <- mu.a - xbar * mu.b #weight at birth for 1st group
w0[2] <- mu.a + beta - xbar * mu.b #weight at birth for 2nd group
}
\end{verbatim}
\linespread{1.2}
\normalsize
After write the model structure, the next step is to provide data. The data can be written in two formats, a list or a table. The list format can be created from R using \texttt{dput} command. The data for this program is as follows.
\linespread{1.0}
\footnotesize
\begin{verbatim}
list(x = c(8.0, 15.0, 22.0, 29.0, 36.0), xbar = 22, N = 30, T = 5,
Y = structure(
.Data = c(151, 199, 246, 283, 320,
145, 199, 249, 293, 354,
147, 214, 263, 312, 328,
155, 200, 237, 272, 297,
135, 188, 230, 280, 323,
159, 210, 252, 298, 331,
141, 189, 231, 275, 305,
159, 201, 248, 297, 338,
177, 236, 285, 350, 376,
134, 182, 220, 260, 296,
160, 208, 261, 313, 352,
143, 188, 220, 273, 314,
154, 200, 244, 289, 325,
171, 221, 270, 326, 358,
163, 216, 242, 281, 312,
160, 207, 248, 288, 324,
142, 187, 234, 280, 316,
156, 203, 243, 283, 317,
157, 212, 259, 307, 336,
152, 203, 246, 286, 321,
154, 205, 253, 298, 334,
139, 190, 225, 267, 302,
146, 191, 229, 272, 302,
157, 211, 250, 285, 323,
132, 185, 237, 286, 331,
160, 207, 257, 303, 345,
169, 216, 261, 295, 333,
157, 205, 248, 289, 316,
137, 180, 219, 258, 291,
153, 200, 244, 286, 324),
.Dim = c(30,5)))
\end{verbatim}
\normalsize
\linespread{1.2}
It's very convenient to create data from R, but be careful about two issues: (1) list data obtained from R do not have the required \texttt{.Data} keyword for BUGS. Add this keyword for BUGS. (2) BUGS reads matrix in a different way from R. For example, there is a matrix $M: 5\times 3$ in R. In order to use it in BUGS, follow this procedure: (a) transpose $M$: \texttt{M <- t(M)}; (b) dump $M$: \texttt{dput(M, "M.dat")}; (c) open \texttt{M.dat}, add \texttt{.Data} keyword and change \texttt{.Dim = c(3,5)} to \texttt{.Dim = c(5,3)}.
Table data have the format:
\linespread{1.0}
\footnotesize
\begin{verbatim}
n[] x[]
47 0
148 18
119 8
END
\end{verbatim}\normalsize
\linespread{1.2}
MCMC algorithm needs to be initialized. The last step for programming is to initialize the model. BUGS may automatically generate initial values, but it is highly recommended to provide initial values for fixed effects. Good initial values potentially improve convergence. For this model, the fixed effects are $\mu_a, \mu_b, \beta, \tau_0, \tau_a,$ and $\tau_b$. So it is recommended to initialize at least these parameters. All the other parameters can be initialized by BUGS, in this model they are $\bm a$ and $\bm b$. The BUGS code and data are available at the IBC homepage.
\footnotesize
\begin{verbatim}
list(mu.a = 150, mu.b = 10, beta=0, tau.0 = 1, tau.a = 1, tau.b = 1)
\end{verbatim}\normalsize
\section{Procedure to run BUGS code}
\label{sec:procedure.bugs}
\subsection{WinBUGS}
\textbf{Launch WinBUGS}. The icon, which resembles a spider, is in the directory where WinBUGS was installed. After create a shortcut and place it on the desktop, double click the spider icon to launch WinBUGS. Read the license agreement and close it. Open a new file and save it as WinBUGS document (\texttt{.odc}).
\noindent
\textbf{Check code}. Download the above data and code from IBC homepage, then copy-paste or type them on your new WinBUGS document. Pull down the \texttt{Model} menu, then select \texttt{Specification}. Highlight or double click \texttt{list} in the model code, and click \texttt{check model} button on the \texttt{specification} tool. If the model is correct, ``\texttt{model is syntactically correct}" will appear at the bottom line of your document. If not correct, the cursor is positioned after the symbol that caused the error. Then highlight or double click \texttt{list} in the data code, and click \texttt{load data}. If load correctly, ``\texttt{data loaded}" will appear. Then compile your model by click \texttt{compile} button and select the number of MCMC chains. The last step is to initialize the model by click \texttt{initialize} button. If only part of parameters are initialized, WinBUGS can generate the other required initial values by clicking \texttt{gen inits} button.
\noindent
\textbf{Run the code}. MCMC needs burn-in period, i.e., samples before convergence. Pull down \texttt{Model} menu and click \texttt{Update}. On a small pop-up window, click \texttt{update} button. Choose the number of burn-in samples. The default number is 1000. Then pull down \texttt{specification} menu, click \texttt{Samples}. Type parameters of interest and click \texttt{set} button. These parameters can be monitored during the program run to check convergence. The commonly-used statistical inference for all parameters in the model is available on the \texttt{samples} menu.
\subsection{OpenBUGS}
\texttt{OpenBUGS} can be run both in Windows and in Unix systems, however the current version of \texttt{BRugs} package only work on Windows. The running procedure of using \texttt{BRugs} in R is pretty much the same as in WinBUGS, except that \texttt{BRugs} only read text files. Following's the steps to run the above BUGS code.
\be
\i Create three text files namely \texttt{ratsmodel.txt}, \texttt{ratsdata.txt}, \texttt{ratsinits.txt} and save the three pieces of code in these files, respectively.
\i loading BRugs: \texttt{library(BRugs)}
\i Check code: \texttt{modelCheck("ratsmodel.txt") }
\i Load data: \texttt{modelData("ratsdata.txt")}
\i Compile: \texttt{modelCompile(numChains=2) }
\i Initialize model: \texttt{modelInits(rep("ratsinits.txt", 2)) }
\i Burn-in: \texttt{modelUpdate(1000) }
\i Monitor samples: \texttt{samplesSet(c("w0", "beta"))}
\i More samples: \texttt{modelUpdate(1000)}
\i Statistical inference and plots are also available (see BRugs package information).
\ee
\section{Results and interpretation}
\label{sec:HNNresults}
Suppose we are particularly interested in two aspects in this data. One is treatment effect $\beta$ and another is the average birth weight $\bm w_0$ for two groups. In order to get inference about these two quantities, they need to be available in the BUGS code. The posterior densities of these parameters can be estimated by the MCMC samples after convergence. The statistical inference may be drawn from the posterior 95\% credible intervals (CI). Since 95\%CI of $\beta$ covers 0, there is no significant difference between these two groups at .05 level. As a conclusion, once we have the distribution of a parameter of interest, we completely know that parameter in statistical sense, so we can do whatever inference from it.
\section*{Reference}
\be
\i Gelfand, A.E., Hills, S., Racine-Poon, A., and Smith, A.F.M. (1990) Illustration of Bayesian Inference in Normal Data Models Using Gibbs Sampling. \emph{Journal Amer. Stat. Assoc.}, {\bf 85}:972-985.
\i Spielhalter, D., Thomas, A., Best, N., and Lunn, D. (2002) \emph{WinBUGS User Manual Version 1.4}, Cambridge, UK: MRC Biostatistics Unit.
\i Harrell, F.E. (2001) \emph{Regression modeling Strategies With Applications to Linear Models, Logistic Regression, and Survival Analysis}. Springer.
\i Gelman A., (2003) \emph{Bayesian Data Analysis} CRC press.
\ee