Datasets

Most of the datasets on this page are in the S dumpdata and R compressed save() file formats. Some are also available in Excel and ASCII (.csv) formats. Methods for retrieving and importing datasets may be found here. If you need one of the datasets we maintain converted to a non-S format, please e-mail charles.dupont@vanderbilt.edu to make a request.

R users of the prostate dataset should load the chron package with library(chron) to handle date variables. A simpler approach is to convert the one date variable to R's built-in Date class by running the command prostate$sdate <- as.Date(prostate$sdate).


| Dataset | Description | R | S-Plus | Excel | ASCII | contents() |
| Meningitis dataset | abm.html | abm.sav | abm.sdd | abm.xls | NA | Cabm.html |
| Cardiac catheterization diagnostic data | acath.html | acath.sav | acath.sdd | acath.xls.zip | NA | Cacath.html |
| WHO ARI Multicentre Study of clinical signs and etiologic agents | Description | ari.sav, ari_other.sav | NA | NA | ari.zip | ari.html |
| Rosner's estriol data | NA | birth.estriol.sav | NA | NA | NA | Cbirth.estriol.html |
| Boston neighborhood housing prices data | boston.html | boston.sav | boston.sdd | NA | NA | Cboston.html |
| Cervical Dystonia longitudinal dataset | cdystonia.html | cdystonia.sav | cdystonia.sdd | NA | NA | Ccdystonia.html |
| U.S. counties and 1992 presidential election dataset | counties.html | counties.sav | counties.sdd | countiesxls.zip | NA | Ccounties.html |
| Diabetes data | diabetes.html | diabetes.sav | diabetes.sdd | diabetes.xls | NA | Cdiabetes.html |
| Duchenne muscular dystrophy dataset | dmd.html | dmd.sav | dmd.sdd | NA | dmd.csv | Cdmd.html |
| German Breast Cancer Data | GermanBreastCa | gbsg.sav | NA | NA | gbsg_ba_ca.dat | Cgbsg.html |
| Hypertension data from the Dominican Republic | DominicanHTN.html | DominicanHTN.sav | NA | DominicanHTN.xls | NA | CDominicanHTN.html |
| Rosner's FEV data | FEV.html | FEV.sav | FEV.sdd | NA | FEV.csv | CFEV.html |
| Rosner's hospital data | NA | hospital.sav | NA | NA | NA | Chospital.html |
| Rat vaginal cancer data | kprats.html | kprats.sav | kprats.sdd | NA | NA | Ckprats.html |
| Rosner's lead data | NA | lead.sav, lead.rda (no labels) | NA | NA | NA | Clead.html |
| NHANES glycohemoglobin data | NhanesGh | nhgh.rda | NA | NA | NA | nhgh.html |
| 1996 Olympics medal counts | olympics.1996.html | olympics.1996.sav | olympics.1996.sdd | NA | olympics.1996.asc | Colympics.1996.html |
| Mayo Clinic primary biliary cirrhosis data | pbc.html | pbc.sav | pbc.sdd | pbc.xls | NA | Cpbc.html |
| Plasma Retinol/Beta-Carotene dataset | plasma.html | plasma.sav | plasma.sdd | NA | NA | Cplasma.html |
| Byar & Greene prostate cancer data | prostate.html | prostate.sav | prostate.sdd | prostate.xls | NA | Cprostate.html |
| Right heart catheterization dataset | rhc.html | rhc.sav | rhc.sdd | NA | rhc.csv | Crhc.html |
| 40-observation sex-age-response data | sex.age.response.html | sex.age.response.sav | sex.age.response.sdd | NA | NA | Csex.age.response.html |
| Stress Echocardiography Data | stressEcho.html | stressEcho.sav | stressEcho.sdd | NA | stressEcho.csv | CstressEcho.html |
| SUPPORT study datasets | Description | support.sav | support.sdd | support.xls | NA | Csupport.html |
| | | support2.sav | support2.sdd | NA | support2csv.zip | Csupport2.html |
| Data for Titanic passengers | titanic.html | titanic.sav | titanic.sdd | NA | titanic.txt | Ctitanic.html |
| | NA | titanic2.sav | titanic2.sdd | NA | NA | Ctitanic2.html |
| | NA | titanic3.sav | titanic3.sdd | titanic3.xls | titanic3.csv | Ctitanic3.html |
| VA lung cancer data | valung.html | valung.sav | valung.sdd | NA | valung.csv | Cvalung.html |
| Very low birth weight infant | vlbw.html | vlbw.sav | vlbw.sdd | NA | vlbw.zip | Cvlbw.html |
Data sets from Dupont, W. D. (2002). Statistical Modeling for Biomedical Researchers
| Description | R | S-Plus | Excel | ASCII | contents() |
| Bernard et al. (1997) | NA | NA | NA | 1.3.2.Sepsis.csv | NA |
| Bernard et al. (1997) | NA | NA | NA | 1.4.11.Sepsis.csv | NA |
| Parl et al. (1989) | NA | NA | NA | 10.7.ERpolymorphism.csv | NA |
| Lang et al. (1995) | NA | NA | NA | 11.2.Isoproterenol.csv | NA |
| Lang et al. (1995) | NA | NA | NA | 11.2.Long.Isoproterenol.csv | NA |
| (no ref) | NA | NA | NA | 11.AreaUnderCurve.csv | NA |
| Brent et al. (1999) | NA | NA | NA | 2.12.Poisson.csv | NA |
| Gross et al. (1999) | NA | NA | NA | 2.18.Funding.csv | NA |
| Levy (1999) | NA | NA | NA | 2.20.Framingham.csv | NA |
| Eisenhofer et al. (1999) | NA | NA | NA | 2.ex.vonHippelLindau.csv | NA |
| Gross et al. (1999) | NA | NA | NA | 3.ex.Funding.csv | NA |
| Bernard et al. (1997) | NA | NA | NA | 4.11.Sepsis.csv | NA |
| Bernard et al. (1997) | NA | NA | NA | 4.18.Sepsis.csv | NA |
| Breslow & Day (1980) | NA | NA | NA | 4.21.EsophagealCa.csv | NA |
| Bernard et al. (1997) | NA | NA | NA | 4.ex.Sepsis.csv | NA |
| Breslow & Day (1980) | NA | NA | NA | 5.5.EsophagealCa.csv | NA |
| Scholer et al. (1997) | NA | NA | NA | 5.ex.InjuryDeath.csv | NA |
| O'Donnell et al. (2000) | NA | NA | NA | 6.9.Hemorrhage.csv | NA |
| Dupont et al. (1985) | NA | NA | NA | 6.ex.Breast.csv | NA |
| Levy (1999) | NA | NA | NA | 8.12.Framingham.csv | NA |
| Levy (1999) | NA | NA | NA | 8.7.Framingham.csv | NA |
| (no ref) | NA | NA | NA | 8.8.2.Person-Years.csv | NA |
| (no ref) | NA | NA | NA | 8.8.2.Survival.csv | NA |
| Scholer et al. (1997) | NA | NA | NA | 8.ex.InjuryDeath.csv | NA |
| (no ref) | NA | NA | NA | 11.ex.Sepsis.csv | NA |

Note: To make CSV files from R save files, do the following:

load(url('http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/foo.sav'))
ls() # find name of data frame just loaded (here assumed 'foo')
write.table(foo, file='foo.csv', sep=',', col.names=NA) # col.names=NA leaves a blank header cell above the row names
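
The same conversion can be sketched end to end without a network connection. This is a minimal illustration, not the maintainers' procedure: the data frame foo, its columns, and the temporary file paths are hypothetical stand-ins for a downloaded dataset. It relies on the fact that load() invisibly returns the names of the objects it restores, so the data frame can be picked up by name instead of by inspecting ls().

```r
# Toy stand-in for a downloaded dataset; 'foo' and its columns are hypothetical.
foo <- data.frame(id = 1:3, score = c(2.5, 3.0, 4.5))
sav_path <- tempfile(fileext = ".sav")
save(foo, file = sav_path)
rm(foo)

# load() invisibly returns the names of the objects it restores.
# Loading into a fresh environment avoids overwriting workspace objects.
env <- new.env()
loaded <- load(sav_path, envir = env)
dat <- get(loaded[1], envir = env)

# Write the CSV without a row-name column, then read it back as a check.
csv_path <- tempfile(fileext = ".csv")
write.csv(dat, file = csv_path, row.names = FALSE)
back <- read.csv(csv_path)
```

Here row.names = FALSE drops the row names entirely, whereas col.names = NA in write.table keeps them under a blank header cell; which is preferable depends on whether the row names carry information.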

Other Datasets Available from the Web

Topic revision: r40 - 19 Mar 2012 - 09:31:49 - CharlesDupont
 
Copyright © 2011 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.