A related topic is Getting Started at Vanderbilt Biostatistics.


Vanderbilt University offers an ideal setting for its new Department of Biostatistics, established in September 2003. The department offers a full array of biostatistical support, with an emphasis on establishing long-term collaborative relationships with investigators to accomplish the research mission of the institution. The department currently has 33 full-time faculty biostatisticians, 17 Master's-level staff biostatisticians, 7 computer systems analysts, and 6 administrative staff members. The department has one division, the Division of Cancer Biostatistics, led by Division Chief Dr. Yu Shyr.

In addition to operating the Biostatistics Shared Resource for the Vanderbilt Ingram Cancer Center, the department operates the Design, Biostatistics, and Research Ethics Core for the Vanderbilt Institute for Clinical and Translational Research (VICTR), i.e., the CTSA; the Statistics and Methodology Core for the Vanderbilt Kennedy Center for Research on Human Development; and the Biostatistics Collaboration Center (BCC), a recharge center. The department also holds a walk-in Biostatistics Clinic each day of the week from noon to 1:15 pm. These clinics serve the entire Vanderbilt and Meharry Medical College community, providing assistance with study design, measurement refinement, statistical analysis, and statistical interpretation of investigators' own results and of journal articles. Clinic staff also help investigators navigate the quantitative resources available at Vanderbilt.

The department occupies approximately 12,624 square feet on the eleventh floor of the 2525 West End Avenue building, including two conference rooms, and an additional conference room centrally located in Medical Center North.

The conference rooms are equipped with projectors and screens. Each room has an Ubuntu Linux-based computer that may be used for presentations and live analyses. Presenters' laptops may also be connected to the projectors.

Department Wiki

The department operates a collaborative web site; the entire site is a wiki. We encourage department members to use the site as a repository for meeting notes, grant development, tutorials, and the like. The wiki allows new content to be hosted quickly and supports collaborative work, hosting of teaching materials, and management of clinics. We view the site as a constantly growing knowledge base with many contributors.

IT Team

The Department of Biostatistics boasts a strong IT support team consisting of 7 programmers and analysts with 75 combined years of experience. Led by Dale Plummer, the team collaborates with scientific investigators and statisticians to develop and leverage technology solutions that enable and support the collection of data and their analysis. IT team members have a broad range of skills and experiences that facilitate their integration into biomedical research teams. Past successful collaborations range from routine data capture support to the development of complex R packages to the development of standalone software.

The IT Team supports various hardware and software platforms, departmental networking, computer performance issues, software acquisition and installation, and task specific programming. The team has programming experience with a wide array of programming languages and techniques to support programming design, database design, statistical programming, web applications, and high performance computing. The team has extensive experience collecting data from disparate sources and assembling the data into clean, analysis-ready data sets.

Members of the group have experience with R Programming, building R packages, supporting package repositories, and making submissions to CRAN. Biostatistics programmers maintain the Hmisc and rms packages. The rApache, Rook, brew, RMySQL, canvas, datamap, gearman, redcap, rmemcache, and MingSWF packages are developed and/or maintained here. Work is being done with RStudio to make it possible to easily create interactive web interfaces from R, to improve HTML rendering of tables in R, and to facilitate the deployment of R web applications.

The popular PS: Power and Sample Size Calculation program was developed and is maintained by a member of the IT Team.

The Biostatistics IT Team understands the issues surrounding confidentiality and is committed to compliance with data privacy rules and regulations.

The IT Team members are Cole Beck, Thomas Dupont, Shawn Garbett, Zhouwen Liu, Dale Plummer, and Jeremy Stephens.

| Skill Set | IT Team Members |
| Statistical analysis (under the direction of statisticians) | All |
| R programming, building R packages, supporting package repositories, and submissions to CRAN | Cole Beck, Charles Dupont, Zhouwen Liu, Jeremy Stephens, Shawn Garbett |
| Contributions to the RStudio product | Jeffrey Horner |
| SAS programming | Dale Plummer, Zhouwen Liu |
| Stata programming, including contributions to the base Stata product | Dale Plummer |
| Web applications using R and rApache, Shiny, Ruby and Rails, PHP, and other web application toolsets | All |
| Web libraries: Sinatra, .NET, Maria (JavaScript MVC framework); Qt (desktop GUI library) | Jeremy Stephens |
| Acquisition and preparation of data for the production of analysis data sets ("data cleaning") | All |
| Programming in C, C++, Ruby, Perl, PHP, Python, Java, JavaScript, HTML, CSS, Go | Cole Beck, Charles Dupont, Zhouwen Liu, Jeremy Stephens, Shawn Garbett |
| Design and implementation of databases using MySQL, SQL Server, PostgreSQL, and other SQL-based database packages | All |
| Design, implementation, and support of REDCap databases | All |
| Support of Linux-based servers (installation, maintenance, scripting, backup, performance management) | Cole Beck, Charles Dupont, Dale Plummer, Jeremy Stephens, Shawn Garbett |
| Support of desktop computers (Linux, Macintosh, Windows; installation, maintenance, backup, help) | Cole Beck, Dale Plummer |
| Hardware work: selection, troubleshooting and repair, peripherals, networking, printers, etc. | Cole Beck, Dale Plummer |
| Production of desktop applications using various languages | All |
| Management of clinical trials data | All |
| Understanding of and compliance with data privacy rules and regulations | All |
| High performance computing and use of Vanderbilt's ACCRE cluster | Zhouwen Liu |
| LaTeX, knitr, Sweave | Cole Beck, Charles Dupont |

Desktop Computers

Each faculty and staff member is provided with a well-equipped desktop computer running Ubuntu Linux and a wide range of open source software.

Individual Linux workstations are produced by System76, Inc. Typically, these systems have Intel Core i5-750 processors or better, 8 GB of memory or more, and at least 500 GB of disk space. The R environment for statistical computing and graphics is installed on all analysts' workstations. The department supports approximately 50 Linux desktop workstations.

All computers that are connected to the Vanderbilt network are protected from insecure outside access by the Vanderbilt perimeter firewall. Access to internal computers from external locations requires the use of Vanderbilt’s VPN system or the SSH network protocol for secure data communication.

There are also a number of desktop computers running the Windows operating system. These computers participate in the Vanderbilt Active Directory domain.

Files on the desktop computers are backed up on a regular basis using a service provided by the Advanced Computing Center for Research and Education (ACCRE). ACCRE uses the TiBS enterprise backup system produced by Teradactyl, LLC. This backup scheme is designed to allow recovery after a device has failed or a file has been accidentally deleted. It is important to note that this is not an archival backup service: we cannot restore a file from an arbitrary point in the past. The ACCRE tape service should be able to restore from 3 to 6 months in the past, depending on the volume of data being backed up. If you delete a file and want it restored 3 years later, it will not be recoverable.

All computers are connected to the Vanderbilt campus network that provides high speed access to e-mail and the Internet.


The department operates a number of server computers:
  1. A custom-built system with two 64-bit Intel processors, 96 GB of main memory, and 1 TB of mirrored disk storage. This system acts as a compute server. Its primary purpose is to run jobs written in the R programming language that need fast processors and/or large amounts of memory. The server's operating system is 64-bit Ubuntu Linux.
  2. A custom-built system with two 64-bit AMD Opteron (2.0 GHz) processors, 4 GB of main memory, and 200 GB of mirrored disk storage. This system acts as a database and web server. The server's operating system is 64-bit Ubuntu Linux. A number of web applications are hosted on this system.
  3. The department's wiki is hosted on a virtual server provided by Vanderbilt's IT department and managed by the Biostatistics IT Team.

Files on the servers are also backed up on a regular basis using the TiBS service provided by ACCRE.

High Performance Computing

The Advanced Computing Center for Research and Education (ACCRE) also operates a high performance compute cluster that department members can use. See ACCRE's web site for details.

Secure File Transfer

VUMC provides a secure file transfer web application produced by Accellion that gives the ability to send large files securely.

The department also provides Data Hippo, a web application for secure transfer of data containing confidential information.

Wireless Networking

Vanderbilt operates a variety of wireless networks. A Wi-Fi Protected Access II (WPA2) network is available for Vanderbilt faculty and staff. An open network is provided for patients and visitors.

Cloud File Storage

Vanderbilt University provides each faculty and staff member with an account on an "in the cloud" file sharing service, much like Dropbox, to store, share, and access files online. Each user's account provides 50 GB of storage.


Printing services are provided by 3 color laser printers. All printers are connected to the Vanderbilt campus network.

The department also has fax, scanner, and copy machines and other standard office equipment. There is a secure document disposal system in our office suite.

Scientific Environment

Career development and methodologic mentoring are provided for all faculty. There is a meeting each month for all tenure-track faculty and another meeting for non-tenure-track faculty. Monthly faculty meetings are used to discuss ways to improve our research infrastructure and collaborations with biomedical researchers. These meetings are also used to seek opportunities for methodologic collaboration among faculty.

The department has a weekly seminar, daily biostatistics clinics, and weekly clinics for the R statistical computing language. Frequently, collaborators and former mentors are brought in to make presentations in the seminar series. The daily clinics, which are intended primarily for biomedical researchers and are staffed by an average of five biostatisticians per day, have also provided a significant amount of methodologic assistance for biostatisticians, in two ways: by allowing them to witness how other, often more senior, biostatisticians solve problems in biomedical research, and by providing time for biostatisticians to ask each other questions. In addition to these opportunities, the faculty have an informal luncheon two days per week to discuss statistical philosophy and theory and career development.

Early Stage Investigators

For early stage investigators, the department provides protected time for developing their research programs including the development of methodologic grant proposals. Especially for these early stage faculty, the department also provides funds for professional travel, collaboration with faculty at other institutions, and for books, journals, and professional fees.

Biostatistics Core Resource

Biostatistics Collaboration Center


Department of Biostatistics staff and faculty project effort is billed directly to each project as a percentage of salary and fringe benefit expenses. In addition, a scientific resource fee of $8,300 per 100% annual FTE (faculty and staff combined) is billed to each project for allowable expenses necessary to perform the work of biostatisticians and computer systems analysts. These expenses are directly related to ensuring that each biostatistician has access to appropriate tools and other scientific resources that support each project, including the array of technologies needed to manage and analyze many types of data across the spectrum of biomedical research.

The biostatistics scientific resource fee is administratively managed through the Biostatistics Collaboration Center, a VUMC-sponsored core resource. The VUMC Office of Research (OOR) reviews the BCC annually to ensure compliance with all applicable federal and state regulations, including uniform administrative requirements, cost principles, and audit requirements for federal awards. Rates are adjusted annually to ensure that the BCC operates on a strict non-profit cost recovery basis. The scientific resource fee is billed monthly via the OOR centralized core billing system, the Core Ordering and Reporting Enterprise System (CORES).
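As a rough illustration of the fee arithmetic, the sketch below prorates the stated $8,300 annual fee by the percentage of FTE charged to a project and splits it into twelve monthly amounts. The linear proration, the even monthly split, the rounding rule, and the function names are assumptions made for this example; they are not a description of how the BCC or CORES actually computes charges.

```python
from decimal import Decimal, ROUND_HALF_UP

# Stated scientific resource fee per 100% annual FTE.
ANNUAL_FEE_PER_FTE = Decimal("8300")
CENT = Decimal("0.01")


def annual_resource_fee(fte_percent):
    """Annual fee for a given percent effort (assumes linear proration)."""
    fee = ANNUAL_FEE_PER_FTE * Decimal(str(fte_percent)) / Decimal("100")
    return fee.quantize(CENT, rounding=ROUND_HALF_UP)


def monthly_resource_fee(fte_percent):
    """Monthly charge, assuming the annual fee is split evenly over 12 months."""
    fee = ANNUAL_FEE_PER_FTE * Decimal(str(fte_percent)) / Decimal("100") / Decimal("12")
    return fee.quantize(CENT, rounding=ROUND_HALF_UP)


# Hypothetical project supported at 25% effort.
print(annual_resource_fee(25))   # 2075.00
print(monthly_resource_fee(25))  # 172.92
```

Under these assumptions, a project supported at 25% effort would incur $2,075 per year in scientific resource fees, in addition to the salary and fringe benefit charges.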


In 2010 the Department of Biostatistics began allocating M.S. staff biostatistician time to projects using hourly billing based on a core-standard 1,500 billable hour work year. A change is being made to enhance the culture of collaboration and academically productive partnerships.

Beginning July 1, 2014, all biostatistics staff and faculty effort is being billed directly to each project as a percentage of salary and fringe benefit expenses. In addition, a scientific resource fee is being billed to each project for allowable costs related to providing cutting-edge biostatistics support. Changes to the model have reduced some of the costs of providing this service, so your overall costs will be lower under the new rate. The department uses robust computing technologies and innovative methodologies to manage complex analyses. The scientific resource fee covers costs necessary to perform the work of biostatisticians and computer systems analysts. These costs are directly related to the many technologies and types of data, generated across multiple disciplines, that biostatisticians must handle competently. The scientific resource fee will be billed monthly via the Core Ordering and Reporting Enterprise System (CORES).

The growth in the number of biostatisticians in the Department of Biostatistics comes solely from the growth of funded biomedical research requiring statistical design and analysis. The Department of Biostatistics faces real personnel support costs in meeting the demands from other departments and centers, and the School of Medicine has created a model described here that allows the costs to be recovered from the sources requesting the assistance.

These changes have been reviewed and approved by executive leadership of the School of Medicine.

Alfresco Community Edition

The department operates a server that runs Alfresco Community Edition. Alfresco is an open source Enterprise Content Management system that we use for project management tasks. The server is a secure host and repository for discussions, documents, data, and other materials related to a project.

Security Statement

The Alfresco server is a Linux server that is maintained with all security updates.

The server sits behind Vanderbilt's firewall, which actively scans for known attacks; suspicious activity is immediately investigated by Vanderbilt's network security team. The server's iptables rules do not allow packet routing, log and report any suspicious packets, and drop all malformed packets (usually the result of spoofing). Secure shell connections are allowed on a non-broadcast, non-standard port.

All data uploaded to and downloaded from Alfresco is encrypted in transit over SSL/TLS using AES-256 encryption, with SHA-256 with RSA used for digital signatures. It is therefore safe to use Alfresco from unsecured wireless connections or other public internet access points. SSL connections are further protected by the following technologies:

  • HTTP Strict Transport Security (HSTS) – This ensures that connections are only ever via encrypted HTTPS, never HTTP.
  • Perfect Forward Secrecy (PFS) – This further strengthens SSL to prevent later decryption of data even in the event that the SSL key is compromised. This precaution is considered essential to prevent third parties intercepting private communications.
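For readers unfamiliar with HSTS, the policy is delivered to the browser in a `Strict-Transport-Security` response header. The following is a minimal sketch of how such a header value can be parsed into its directives. The function name and the example header value are hypothetical and are not taken from the Alfresco server's actual configuration.

```python
def parse_hsts(header_value):
    """Parse a Strict-Transport-Security header value into a small dict.

    Only the two directives defined for HSTS are handled here:
    max-age (seconds the browser must insist on HTTPS) and includeSubDomains.
    Unknown directives are ignored.
    """
    policy = {"max_age": None, "include_subdomains": False}
    for directive in header_value.split(";"):
        directive = directive.strip()
        if not directive:
            continue
        name, _, value = directive.partition("=")
        name = name.strip().lower()          # directive names are case-insensitive
        value = value.strip().strip('"')     # the value may be a quoted string
        if name == "max-age":
            policy["max_age"] = int(value)
        elif name == "includesubdomains":
            policy["include_subdomains"] = True
    return policy


# Hypothetical header value: insist on HTTPS for one year, subdomains included.
example = "max-age=31536000; includeSubDomains"
print(parse_hsts(example))
```

Once a browser has seen a header like this over HTTPS, it refuses to contact the host over plain HTTP until `max-age` seconds have elapsed.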

The Alfresco server allows only password logins, and access to documents is controlled via permissions. Content in Alfresco is secured using granular permissions and role-based security so that authenticated users can access only the content they are authorized to see. Owners can hide folders or files from specific users, or provide read-only access to certain content. Every file and folder can have individual permissions set, allowing precise control of content. Permission assignments are made in Access Control Lists (ACLs), which are lists of Access Control Entries (ACEs). An ACE associates an authority (group or user) with a permission or set of permissions, and defines whether the permission is denied or allowed for that authority. Every node has a related ACL. When you create a node, it automatically inherits an ACL from its parent. You can alter this behavior after node creation by breaking inheritance or modifying the ACL. We have disabled guest access. We have integrated Vanderbilt's LDAP system for internal logins, and allow user/password logins for remote users.
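The ACL and inheritance behavior described above can be pictured with a small model. This is only an illustrative sketch, not Alfresco's API or its actual evaluation algorithm: the class names, the group name, the "first matching entry wins" rule, and the default-deny fallback are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class ACE:
    authority: str    # a user or group name
    permission: str   # e.g. "Read" or "Write"
    allowed: bool     # True = allow, False = deny


@dataclass
class Node:
    name: str
    parent: Optional["Node"] = None
    aces: List[ACE] = field(default_factory=list)
    inherits: bool = True  # a new node inherits its parent's ACL by default

    def effective_aces(self):
        """Own entries first, then the parent's while inheritance is unbroken."""
        entries = list(self.aces)
        if self.inherits and self.parent is not None:
            entries.extend(self.parent.effective_aces())
        return entries


def has_permission(node, authorities, permission):
    """Assumed rule: the first matching entry decides; no match means deny."""
    for ace in node.effective_aces():
        if ace.authority in authorities and ace.permission == permission:
            return ace.allowed
    return False


# Hypothetical project folder readable by a team group.
project = Node("project", aces=[ACE("GROUP_team", "Read", True)])
report = Node("report.pdf", parent=project)

alice = {"alice", "GROUP_team"}
bob = {"bob", "GROUP_team"}

print(has_permission(report, alice, "Read"))  # True: inherited from the folder
report.aces.append(ACE("alice", "Read", False))  # hide the file from one user
print(has_permission(report, alice, "Read"))  # False: the file's own deny entry wins
print(has_permission(report, bob, "Read"))    # True: still inherited
report.inherits = False                        # break inheritance
print(has_permission(report, bob, "Read"))    # False: no matching entry remains
```

In this model, hiding a file from a specific user corresponds to adding a deny entry on that node, and breaking inheritance leaves only the node's own entries in force.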
