Statistical Modeling for Biomedical Researchers

This page provides the data sets that are used in Dupont, W. D. (2002, 2009). Statistical Modeling for Biomedical Researchers. Cambridge, U.K.: Cambridge University Press.

Lecture notes for a course based on this text are available. These notes contain extensive screen shots that explain how to execute Stata commands with the pull-down menus.

Reviews of this text

Ray Hoffmann. Teaching Statistics in the Health Sciences
Stanley P. Azen. American Journal of Epidemiology

Lecture notes
Data sets for 2nd Edition
Stata graphics schemes for the 2nd edition
Data sets for 1st Edition

Links related to this text:

William D. Dupont
Preface
Table of Contents
Log Files and Do Files
        1st Edition
        2nd Edition
Cambridge University Press
Stata Corporation
Purchase hardback: Cambridge University Press (U.S.A.), Cambridge University Press (U.K.)
Purchase paperback: Cambridge University Press (U.S.A.), Cambridge University Press (U.K.)
Amazon.com (U.S.A.)

Complete Stata log files and do files for all data analyses in this text and in my lecture notes are also provided. The sources of these data sets are as follows:

  1. Bernard, G. R., A. P. Wheeler, et al. (1997). "The effects of ibuprofen on the physiology and survival of patients with sepsis. The Ibuprofen in Sepsis Study Group." N Engl J Med 336: 912-8.
  2. Brent, J., K. McMartin, et al. (1999). "Fomepizole for the treatment of ethylene glycol poisoning. Methylpyrazole for Toxic Alcohols Study Group." N Engl J Med 340: 832-8.
  3. Breslow, N. E. and N. E. Day (1980). Statistical Methods in Cancer Research: Vol. 1 - The Analysis of Case-Control Studies. Lyon, France, IARC Scientific Publications.
    I have posted the data from Appendix I of this text, which is from the Ille-et-Vilaine study of esophageal cancer. (See also Tuyns et al. 1977)
  4. Cole, T. J. and P. J. Green (1992). "Smoothing reference centile curves: the LMS method and penalized likelihood." Statistics in Medicine 11: 1305-1319.
  5. Dupont WD, Page DL (1985). "Risk factors for breast cancer in women with proliferative breast disease." N Engl J Med 312:146-51.
  6. Eisenhofer, G., J. W. Lenders, et al. (1999). "Plasma normetanephrine and metanephrine for detecting pheochromocytoma in von Hippel-Lindau disease and multiple endocrine neoplasia type 2." N Engl J Med 340(24): 1872-9.
  7. Framingham Heart Study (1997). The Framingham Study - 40 Year Public Use Data Set. Bethesda, MD: National Heart, Lung, and Blood Institute, NIH.
  8. Green, T., Touchstone, J. "Urinary tract estriol: an index of placental function." Am J Obs Gyn. 1963; 85:1-9.
  9. Gross, C. P., G. F. Anderson, et al. (1999). "The relation between funding by the National Institutes of Health and the burden of disease." N Engl J Med 340(24): 1881-7.
  10. Knaus, W.A., Harrell, F.E., Jr., Lynn, J., Goldman, L., Phillips, R.S., Connors, A.F., Jr. et al. "The SUPPORT prognostic model. Objective estimates of survival for seriously ill hospitalized adults. Study to understand prognoses and preferences for outcomes and risks of treatments." Ann Intern Med. 1995; 122:191-203.
  11. Lang, C. C., C. M. Stein, et al. (1995). "Attenuation of isoproterenol-mediated vasodilatation in blacks." N Engl J Med 333: 155-60.
  12. Levy, D., National Heart Lung and Blood Institute, et al. (1999). 50 years of discovery: medical milestones from the National Heart, Lung, and Blood Institute's Framingham Heart Study. Hackensack, N.J., Center for Bio-Medical Communication Inc.
    I have posted a subset of the 40 year follow-up data from the Framingham Heart Study (see also Framingham Heart Study 1997).
  13. O'Donnell HC, Rosand J, Knudsen KA, Furie KL, Segal AZ, Chiu RI, et al. (2000). "Apolipoprotein E genotype and the risk of recurrent lobar intracerebral hemorrhage." N Engl J Med; 342:240-5.
  14. Parl FF, Cavener DR, Dupont WD (1989). "Genomic DNA analysis of the estrogen receptor gene in breast cancer." Breast Cancer Res Tr; 14:57-64.
  15. Rosner B. Fundamentals of Biostatistics. 6th ed. Belmont, CA: Duxbury, 2006.
  16. Royston, Patrick and Sauerbrei, Willi. Multivariable Model-Building: A Pragmatic Approach to Regression Analysis Based on Fractional Polynomials for Modelling Continuous Variables. Wiley, 1st edition, July 2008.
  17. Scholer SJ, Hickson GB, Mitchel EF, Jr., Ray WA (1997). "Persistently increased injury mortality rates in high-risk young children." Arch Pediatr Adolesc Med; 151:1216-9.
  18. Tuyns, A. J., G. Pequignot, et al. (1977). "Le cancer de l'oesophage en Ille-et-Vilaine en fonction des niveaux de consommation d'alcool et de tabac. Des risques qui se multiplient." Bull Cancer 64: 45-60.

Data Sets for the 2nd Edition

To download any of the following data sets, click on the data set name, then follow the instructions. The numbers at the beginning of these names indicate the chapter and subsection where the data set is first used in Dupont (2009).
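For readers who script their downloads, the naming convention described above can be unpacked mechanically. The following Python sketch is illustrative only: the function and field names are hypothetical, and reading "ex" as an end-of-chapter exercise marker is an assumption that this page does not state.

```python
import re
from typing import NamedTuple, Optional


class DataSetName(NamedTuple):
    chapter: int            # leading number, e.g. 1 in "1.3.2.Sepsis.dta"
    section: Optional[str]  # e.g. "3.2"; None when no subsection is given
    is_exercise: bool       # assumption: "ex" marks a chapter exercise
    label: str              # descriptive part, e.g. "Sepsis"
    extension: str          # "dta" or "csv"


_NAME = re.compile(
    r"^(?P<chapter>\d+)"                                 # chapter number
    r"(?:\.(?P<section>\d+(?:\.\d+)*)|\.(?P<ex>ex))?"    # optional subsection or "ex"
    r"\.(?P<label>[A-Za-z].*)"                           # descriptive label
    r"\.(?P<ext>dta|csv)$"                               # file extension
)


def parse_data_set_name(file_name: str) -> DataSetName:
    """Split a data set file name into chapter, subsection, and label."""
    match = _NAME.match(file_name)
    if match is None:
        raise ValueError(f"unrecognized data set name: {file_name!r}")
    return DataSetName(
        chapter=int(match["chapter"]),
        section=match["section"],
        is_exercise=match["ex"] is not None,
        label=match["label"],
        extension=match["ext"],
    )


print(parse_data_set_name("1.3.2.Sepsis.dta"))
```

Under these assumptions, 1.3.2.Sepsis.dta is read as chapter 1, subsection 3.2, while a name such as 11.AreaUnderCurve.dta carries a chapter number only.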


Stata Data Set                Data Source               Comma separated data sets
1.3.2.Sepsis.dta              Bernard et al. (1997)     1.3.2.Sepsis.csv
1.4.11.Sepsis.dta             Bernard et al. (1997)     1.4.11.Sepsis.csv
2.12.Poisson.dta              Brent et al. (1999)       2.12.Poisson.csv
2.18.Funding.dta              Gross et al. (1999)       2.18.Funding.csv
2.20.Framingham.dta           Levy (1999)               2.20.Framingham.csv
2.ex.vonHippelLindau.dta      Eisenhofer et al. (1999)  2.ex.vonHippelLindau.csv
3.25.2.SUPPORT.dta            Knaus et al. (1995)       3.25.2.SUPPORT.csv
3.ex.Funding.dta              Gross et al. (1999)       3.ex.Funding.csv
4.11.Sepsis.dta               Bernard et al. (1997)     4.11.Sepsis.csv
4.18.Sepsis.dta               Bernard et al. (1997)     4.18.Sepsis.csv
4.21.EsophagealCa.dta         Breslow & Day (1980)      4.21.EsophagealCa.csv
4.ex.Sepsis.dta               Bernard et al. (1997)     4.ex.Sepsis.csv
5.5.EsophagealCa.dta          Breslow & Day (1980)      5.5.EsophagealCa.csv
5.ex.InjuryDeath.dta          Scholer et al. (1997)     5.ex.InjuryDeath.csv
6.9.Hemorrhage.dta            O'Donnell et al. (2000)   6.9.Hemorrhage.csv
6.ex.Breast.dta               Dupont & Page (1985)      6.ex.Breast.csv
8.7.Framingham.dta            Levy (1999)               8.7.Framingham.csv
8.8.2.Person-Years.dta        (no ref)                  8.8.2.Person-Years.csv
8.8.2.Survival.dta            (no ref)                  8.8.2.Survival.csv
8.12.Framingham.dta           Levy (1999)               8.12.Framingham.csv
8.ex.InjuryDeath.dta          Scholer et al. (1997)     8.ex.InjuryDeath.csv
10.8.ERpolymorphism.dta       Parl et al. (1989)        10.8.ERpolymorphism.csv
11.2.Isoproterenol.dta        Lang et al. (1995)        11.2.Isoproterenol.csv
11.2.Long.Isoproterenol.dta   Lang et al. (1995)        11.2.Long.Isoproterenol.csv
11.AreaUnderCurve.dta         (no ref)                  11.AreaUnderCurve.csv
11.ex.Sepsis.dta              Bernard et al. (1997)     11.ex.Sepsis.csv

The Stata data sets can be opened directly from within Stata on any computer that is connected to the Internet.  For example, the Stata command

. use http://biostat.mc.vanderbilt.edu/dupontwd/wddtext/data/1.3.2.Sepsis.dta

will open the 1.3.2.Sepsis data set directly over the web.  When opening a Stata data file in this way, you must capitalize the web address and data file name exactly as shown.
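Readers working outside Stata can fetch the same file in a similar way. The sketch below is a minimal Python illustration, not part of the text: the helper names are hypothetical, it assumes the third-party pandas package is installed for the actual download, and it assumes the server still answers at the address used in the Stata command above.

```python
# Base address copied from the Stata command above.
# As the page warns, the capitalization of the path and file name matters.
BASE_URL = "http://biostat.mc.vanderbilt.edu/dupontwd/wddtext/data/"


def data_set_url(file_name: str) -> str:
    """Return the web address of a data file, leaving its capitalization untouched."""
    return BASE_URL + file_name


def load_stata_data_set(file_name: str):
    """Download a .dta file and return it as a pandas DataFrame.

    pandas is imported here, not at module level, so that building
    addresses with data_set_url() works even without pandas installed.
    """
    import pandas as pd  # third-party dependency (assumption: installed)

    return pd.read_stata(data_set_url(file_name))


print(data_set_url("1.3.2.Sepsis.dta"))
```

Calling load_stata_data_set("1.3.2.Sepsis.dta") would then attempt the download. A name typed with different capitalization produces a different address and may not be found.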

Stata Log Files and Do Files for the 2nd Edition

The log files given in the text and their corresponding do files are given below.  They have been modified slightly from those given in the text to take advantage of Stata's default s2color scheme. Versions of these files updated to Stata 11 are also provided where Stata 11 syntax differs from that of Stata 10. See the lecture notes for additional explanation.

To download any of the following Stata log files or do files, click on the file name, then follow the directions to save the file onto your computer.

Log and do files for the 2nd Edition


Log File Name (version 10)          Log File Name (version 11)  Data Source
1.3.2.Sepsis.log                    -                           Bernard et al. (1997)
1.3.9.Sepsis.log                    -                           Bernard et al. (1997)
1.4.11.Sepsis.log                   -                           Bernard et al. (1997)
1.4.14.Sepsis.log                   -                           Bernard et al. (1997)
2.12.Poisson.log                    -                           Brent et al. (1999)
2.16.Poisson.log                    -                           Brent et al. (1999)
2.18.Funding.log                    -                           Gross et al. (1999)
2.20.Framingham.log                 -                           Levy (1999)
2.22.Framingham.log                 -                           Levy (1999)
3.12.1.Framingham.log               -                           Levy (1999)
3.25.2.SUPPORT.log                  -                           Knaus et al. (1995)
4.11.Sepsis.log                     -                           Bernard et al. (1997)
4.13.1.Sepsis.log                   -                           Bernard et al. (1997)
4.18.Sepsis.log                     -                           Bernard et al. (1997)
4.22.EsophagealCa.log               -                           Breslow & Day (1980)
5.5.EsophagealCa.log                -                           Breslow & Day (1980)
5.9.EsophagealCa.log                -                           Breslow & Day (1980)
5.10.EsophagealCa.log               5.10.EsophagealCa.log       Breslow & Day (1980)
5.11.1.EsophagealCa.log             5.11.1.EsophagealCa.log     Breslow & Day (1980)
5.20.EsophagealCa.log               5.20.EsophagealCa.log       Breslow & Day (1980)
5.32.3.Sepsis.log                   -                           Bernard et al. (1997)
5.35.SUPPORT.log                    -                           Knaus et al. (1995)
6.9.Hemorrhage.log                  -                           O'Donnell et al. (2000)
6.16.Hemorrhage.log                 -                           O'Donnell et al. (2000)
7.6.Framingham.log                  7.6.Framingham.log          Levy (1999)
7.8.4.Framingham.log                -                           Levy (1999)
8.2.Framingham.log                  -                           Levy (1999)
8.7.Framingham.log                  -                           Levy (1999)
8.8.2.Survival_to_PersonYears.log   -                           (no ref)
8.9.Framingham.log                  -                           Levy (1999)
8.12.Framingham.log                 8.12.Framingham.log         Levy (1999)
9.3.Framingham.log                  9.3.Framingham.log          Levy (1999)
10.8.ERpolymorphism.log             10.8.ERpolymorphism.log     Parl et al. (1989)
11.2.Isoproterenol.log              -                           Lang et al. (1995)
11.5.Isoproterenol.log              -                           Lang et al. (1995)
11.11.Isoproterenol.log             -                           Lang et al. (1995)
11.AreaUnderCurve.log               -                           (no ref)
   

Do File Name (version 10)           Do File Name (version 11)   Data Source
1.3.2.Sepsis.do                     -                           Bernard et al. (1997)
1.3.9.Sepsis.do                     -                           Bernard et al. (1997)
1.4.11.Sepsis.do                    -                           Bernard et al. (1997)
1.4.14.Sepsis.do                    -                           Bernard et al. (1997)
2.12.Poisson.do                     -                           Brent et al. (1999)
2.16.Poisson.do                     -                           Brent et al. (1999)
2.18.Funding.do                     -                           Gross et al. (1999)
2.20.Framingham.do                  -                           Levy (1999)
2.22.Framingham.do                  -                           Levy (1999)
3.12.1.Framingham.do                -                           Levy (1999)
3.25.2.SUPPORT.do                   -                           Knaus et al. (1995)
4.11.Sepsis.do                      -                           Bernard et al. (1997)
4.13.1.Sepsis.do                    -                           Bernard et al. (1997)
4.18.Sepsis.do                      -                           Bernard et al. (1997)
4.22.EsophagealCa.do                -                           Breslow & Day (1980)
5.5.EsophagealCa.do                 -                           Breslow & Day (1980)
5.9.EsophagealCa.do                 -                           Breslow & Day (1980)
5.10.EsophagealCa.do                5.10.EsophagealCa.do        Breslow & Day (1980)
5.11.1.EsophagealCa.do              5.11.1.EsophagealCa.do      Breslow & Day (1980)
5.20.EsophagealCa.do                5.20.EsophagealCa.do        Breslow & Day (1980)
5.32.3.Sepsis.do                    -                           Bernard et al. (1997)
5.35.SUPPORT.do                     -                           Knaus et al. (1995)
6.9.Hemorrhage.do                   -                           O'Donnell et al. (2000)
6.16.Hemorrhage.do                  -                           O'Donnell et al. (2000)
7.6.Framingham.do                   7.6.Framingham.do           Levy (1999)
7.8.4.Framingham.do                 -                           Levy (1999)
8.2.Framingham.do                   -                           Levy (1999)
8.7.Framingham.do                   -                           Levy (1999)
8.8.2.Survival_to_PersonYears.do    -                           (no ref)
8.9.Framingham.do                   -                           Levy (1999)
8.12.Framingham.do                  8.12.Framingham.do          Levy (1999)
9.3.Framingham.do                   9.3.Framingham.do           Levy (1999)
10.8.ERpolymorphism.do              10.8.ERpolymorphism.do      Parl et al. (1989)
11.2.Isoproterenol.do               -                           Lang et al. (1995)
11.5.Isoproterenol.do               -                           Lang et al. (1995)
11.11.Isoproterenol.do              11.11.Isoproterenol.do      Lang et al. (1995)
11.AreaUnderCurve.do                -                           (no ref)
   

Stata schemes for the 2nd edition

One strength of Stata is that it gives users wide control over the default settings of their graphs.  This is done through the creation of Stata scheme files.  In my text the following schemes were used.

scheme-WDDtext.scheme
scheme-WDDtextBig.scheme
scheme-WDDtextReallyBig.scheme
scheme-WDDtextSmall.scheme

These schemes may be of mild interest to a few users as examples of how to modify standard Stata schemes.


Data Sets for the 1st Edition

To download any of the following data sets, click on the data set name, then follow the instructions. The numbers at the beginning of these names indicate the chapter and subsection where the data set is first used in Dupont (2002).


Stata Data Set                Data Source               Comma separated data sets
1.3.2.Sepsis.dta              Bernard et al. (1997)     1.3.2.Sepsis.csv
1.4.11.Sepsis.dta             Bernard et al. (1997)     1.4.11.Sepsis.csv
10.7.ERpolymorphism.dta       Parl et al. (1989)        10.7.ERpolymorphism.csv
11.2.Isoproterenol.dta        Lang et al. (1995)        11.2.Isoproterenol.csv
11.2.Long.Isoproterenol.dta   Lang et al. (1995)        11.2.Long.Isoproterenol.csv
11.AreaUnderCurve.dta         (no ref)                  11.AreaUnderCurve.csv
2.12.Poisson.dta              Brent et al. (1999)       2.12.Poisson.csv
2.18.Funding.dta              Gross et al. (1999)       2.18.Funding.csv
2.20.Framingham.dta           Levy (1999)               2.20.Framingham.csv
2.ex.vonHippelLindau.dta      Eisenhofer et al. (1999)  2.ex.vonHippelLindau.csv
3.ex.Funding.dta              Gross et al. (1999)       3.ex.Funding.csv
4.11.Sepsis.dta               Bernard et al. (1997)     4.11.Sepsis.csv
4.18.Sepsis.dta               Bernard et al. (1997)     4.18.Sepsis.csv
4.21.EsophagealCa.dta         Breslow & Day (1980)      4.21.EsophagealCa.csv
4.ex.Sepsis.dta               Bernard et al. (1997)     4.ex.Sepsis.csv
5.5.EsophagealCa.dta          Breslow & Day (1980)      5.5.EsophagealCa.csv
5.ex.InjuryDeath.dta          Scholer et al. (1997)     5.ex.InjuryDeath.csv
6.9.Hemorrhage.dta            O'Donnell et al. (2000)   6.9.Hemorrhage.csv
6.ex.Breast.dta               Dupont & Page (1985)      6.ex.Breast.csv
8.12.Framingham.dta           Levy (1999)               8.12.Framingham.csv
8.7.Framingham.dta            Levy (1999)               8.7.Framingham.csv
8.8.2.Person-Years.dta        (no ref)                  8.8.2.Person-Years.csv
8.8.2.Survival.dta            (no ref)                  8.8.2.Survival.csv
8.ex.InjuryDeath.dta          Scholer et al. (1997)     8.ex.InjuryDeath.csv
11.ex.Sepsis.dta              Bernard et al. (1997)     11.ex.Sepsis.csv

The Stata data sets can be opened directly from within Stata on any computer that is connected to the Internet.  For example, the Stata command

. use http://biostat.mc.vanderbilt.edu/dupontwd/wddtext/data/1.3.2.Sepsis.dta

will open the 1.3.2.Sepsis data set directly over the web.  When opening a Stata data file in this way, you must capitalize the web address and data file name exactly as shown.


Stata Log Files and Do Files for the 1st Edition

The log files given in the text and their corresponding do files are given below. Stata released Version 8 of their software soon after the publication of this text.  To obtain Version 8 editions of the text's log and do files, click Version 8.

To download any of the following Stata log files or do files, click on the file name, then follow the directions to save the file onto your computer.

Version 7 log and do files for the 1st Edition


Log File Name Data Source
1.3.2.Sepsis.log Bernard et al. (1997)
1.3.6.Sepsis.log Bernard et al. (1997)
1.4.11.Sepsis.log Bernard et al. (1997)
1.4.14.Sepsis.log Bernard et al. (1997)
10.7.ERpolymorphism.log Parl et al. (1989)
11.11.Isoproterenol.log Lang et al. (1995)
11.2.Isoproterenol.log Lang et al. (1995)
11.5.Isoproterenol.log Lang et al. (1995)
11.AreaUnderCurve.log (no ref)
2.12.Poisson.log Brent et al. (1999)
2.16.Poisson.log Brent et al. (1999)
2.18.Funding.log Gross et al. (1999)
2.20.Framingham.log Levy (1999)
3.11.1.Framingham.log Levy (1999)
4.11.Sepsis.log Bernard et al. (1997)
4.13.1.Sepsis.log Bernard et al. (1997)
4.18.Sepsis.log Bernard et al. (1997)
4.22.EsophagealCa.log Breslow & Day (1980)
5.10.EsophagealCa.log Breslow & Day (1980)
5.11.1.EsophagealCa.log Breslow & Day (1980)
5.12.EsophagealCa.log Breslow & Day (1980)
5.20.EsophagealCa.log Breslow & Day (1980)
5.32.2.Sepsis.log Bernard et al. (1997)
5.5.EsophagealCa.log Breslow & Day (1980)
5.9.EsophagealCa.log Breslow & Day (1980)
6.16.Hemorrhage.log O'Donnell et al. (2000)
6.9.Hemorrhage.log O'Donnell et al. (2000)
7.7.Framingham.log Levy (1999)
7.9.4.Framingham.log Levy (1999)
8.12.Framingham.log Levy (1999)
8.2.Framingham.log Levy (1999)
8.7.Framingham.log Levy (1999)
8.8.2.Survival_to_Person-Years.log (no ref)
8.9.Framingham.log Levy (1999)
9.3.Framingham.log Levy (1999)

Do File Name Data Source
1.3.2.Sepsis.do Bernard et al. (1997)
1.3.6.Sepsis.do Bernard et al. (1997)
1.4.11.Sepsis.do Bernard et al. (1997)
1.4.14.Sepsis.do Bernard et al. (1997)
10.7.ERpolymorphism.do Parl et al. (1989)
11.11.Isoproterenol.do Lang et al. (1995)
11.2.Isoproterenol.do Lang et al. (1995)
11.5.Isoproterenol.do Lang et al. (1995)
11.AreaUnderCurve.do (no ref)
2.12.Poisson.do Brent et al. (1999)
2.16.Poisson.do Brent et al. (1999)
2.18.Funding.do Gross et al. (1999)
2.20.Framingham.do Levy (1999)
3.11.1.Framingham.do Levy (1999)
4.11.Sepsis.do Bernard et al. (1997)
4.13.1.Sepsis.do Bernard et al. (1997)
4.18.Sepsis.do Bernard et al. (1997)
4.22.EsophagealCa.do Breslow & Day (1980)
5.10.EsophagealCa.do Breslow & Day (1980)
5.11.1.EsophagealCa.do Breslow & Day (1980)
5.12.EsophagealCa.do Breslow & Day (1980)
5.20.EsophagealCa.do Breslow & Day (1980)
5.32.2.Sepsis.do Bernard et al. (1997)
5.5.EsophagealCa.do Breslow & Day (1980)
5.9.EsophagealCa.do Breslow & Day (1980)
6.16.Hemorrhage.do O'Donnell et al. (2000)
6.9.Hemorrhage.do O'Donnell et al. (2000)
7.7.Framingham.do Levy (1999)
7.9.4.Framingham.do Levy (1999)
8.12.Framingham.do Levy (1999)
8.2.Framingham.do Levy (1999)
8.7.Framingham.do Levy (1999)
8.8.2.Survival_to_Person-Years.do (no ref)
8.9.Framingham.do Levy (1999)
9.3.Framingham.do Levy (1999)

 

Acknowledgement

I would like to thank the following people for generously allowing me to use their data in my text and on this web page.

Gordon R. Bernard, M.D.
Jeffrey Brent, M.D., Ph.D.
Norman E. Breslow, Ph.D.
Graeme Eisenhofer, Ph.D.
Frank E. Harrell, Jr., Ph.D.
Steven M. Greenberg, M.D., Ph.D.
Cary P. Gross, M.D.
Daniel Levy, M.D.
Fritz F. Parl, M.D., Ph.D.
Paul Sorlie, Ph.D.
Wayne A. Ray, Ph.D.
Alastair J.J. Wood, M.D.

Other data web sites

Professor Breslow has posted the Ille-et-Vilaine data set as well as other data sets from his text books at http://faculty.washington.edu/norm/datasets.html.

Disclaimer

The opinions expressed in my text are my own and do not necessarily reflect the views of the authors listed above, their employers or funding institutions. This includes the National Heart, Lung and Blood Institute, NIH, DHHS.

William D. Dupont Nashville, TN
  2008




Copyright © 2004, Vanderbilt University Medical Center
URL: http://biostat.mc.Vanderbilt.Edu/dupontwd/wddtext/index.htm
For More Information: <dale.plummer@Vanderbilt.Edu>