Course Goals

  • Learn how to use modern regression methods to answer scientific questions
  • Become familiar with statistical concepts including exploratory data analysis, estimation, and testing in linear, logistic, and survival models
  • Understand how the development of statistical methodology is motivated by biological and medical problems
  • Develop data analytic skills including familiarity with several statistical programs
  • Develop writing skills needed to communicate the results of a data analysis

Topics of Discussion

  • Introduction to Regression Models
  • Simple Linear Regression
  • A Review of Matrix Algebra and Important Results of Random Vectors
  • Precision, Effect Modification, and Confounding
  • Specification Issues in Regression Models: ANCOVA, ANOVA, Multicollinearity
  • Multivariable Regression
  • Model Selection
  • Case Studies in Linear Regression
  • Logistic Regression
  • Generalized Linear Models and Poisson Regression
  • Survival Models
  • Bayesian Regression
  • Model Checking: diagnostics, transformations, influential observations, lack-of-fit test

Software

R information

  • R 2.8 (optional, free from http://www.r-project.org): Powerful, versatile, and actively maintained. It has a steeper learning curve than Stata and SPSS, but the effort will pay off later on. To get a feel for it, look at one of the following: 1 (and try the commands in “A Sample Session”), 2, 3. The Department of Biostatistics has a free R Clinic every Thursday. Print the R reference card to get a list of the most commonly used commands.
  • To download R: Linux Mac Windows
  • RStudio is a powerful and easy-to-use interface for R. I highly recommend it.

Comments on Other Software Packages

  • Stata: Powerful, with good graphics and an SPSS-like menu system. A good support site is at http://www.ats.ucla.edu/stat/stata. Cost is $89 for a year and $145 for life, through GradPlan. Buy Small Stata for $45 if you are paying for it yourself. Stata is also available at the College of Arts & Science Microcomputer Labs.
  • SAS: The oldest survivor, with a strong legacy. Hard to learn and extend, with an outdated structure and the worst graphics of any major package.
  • SPSS: Has “standard” methods and a good graphical user interface. However, it is difficult to extend beyond the “standard” methods.
  • Honorable mention: Epi Info (free from CDC), S-Plus.
  • There is a long list of reasons not to use Excel. See ExcelProblems.
  • See StatComp for more information about statistical computing, including links to several online statistical and probability calculators.

-- ChrisSlaughter - 07 Dec 2009
Topic revision: r6 - 02 Jan 2013, ChrisSlaughter
 
