Using Stat/Transfer
You can use Stat/Transfer under Win4lin to convert any SAS dataset to Stata format (import into R using stata.get) or SPSS format (import into R using spss.get). Stata probably works best: it allows long variable names and does not put blanks at the end of variable labels. You can also copy the Stat/Transfer program files to Linux (e.g., under the directory ~/bin/st) and run them under Linux using the command st after defining the following alias in your .bashrc file:
alias st='wine ~/bin/st/st32w.exe &'
This works best if you do not have many files to translate.
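Once Stat/Transfer has written a Stata file, the R import step might look like the sketch below. It assumes the Hmisc package (which provides stata.get) is installed; the file name mydata.dta is hypothetical, and the code simply reports a message if either is missing.

```r
# Sketch only: "mydata.dta" is a hypothetical file written by Stat/Transfer.
stata_file <- "mydata.dta"

if (requireNamespace("Hmisc", quietly = TRUE) && file.exists(stata_file)) {
  d <- Hmisc::stata.get(stata_file)
  str(d)   # check variable names, types, and labels after the import
} else {
  message("Hmisc or ", stata_file, " not available; nothing imported")
}
```

For an SPSS file the same pattern would apply with Hmisc::spss.get in place of stata.get.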
Using the SAS Viewer
The SAS Viewer can read sas7bdat files and export the dataset to a csv file. See http://biostat.mc.vanderbilt.edu/twiki/bin/view/Restricted/RestrictedSoftware for more information about getting the viewer and using it under Linux. Instructions will be added here soon on how to use the SAS Viewer to export SAS metadata along with the data, and how to use both files when importing SAS data into R.
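Until those instructions exist, a csv file exported by the viewer can be read with base R. The sketch below is only an illustration: it writes a small made-up file (the column names id, visit_date, and sbp are invented) to stand in for the viewer's output, and shows why the missing metadata matters, since dates arrive as plain character strings and must be converted by hand.

```r
# Made-up stand-in for a csv file exported by the SAS Viewer
csv_file <- tempfile(fileext = ".csv")
writeLines(c("id,visit_date,sbp",
             "1,2004-01-15,120",
             "2,2004-02-03,135"), csv_file)

d <- read.csv(csv_file, stringsAsFactors = FALSE)

# Plain csv carries no type metadata, so the date column is read as character
d$visit_date <- as.Date(d$visit_date, format = "%Y-%m-%d")
str(d)
```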
If You Have Access to SAS
Creating a SAS Data Library Suitable for Importing into R
LIBNAME d engine "directoryname"; /* Use LIBNAME d SASV5XPT "foo.xpt"; to create transport files */
DATA d.dataset1; ....
DATA d.dataset2; ...
/* If the datasets are already created but you only want to export a
subset of them do something like the following instead */
LIBNAME old "olddirectoryname";
PROC COPY IN=old OUT=work; SELECT data1 data2 data3; RUN;
PROC FORMAT CNTLOUT=d.formats; RUN; /* If you used any PROC FORMAT ...; VALUE ...; */
* Can use CNTLOUT=formats if writing to work area;
* work can be the first argument given to the macro;
Note: If you used a newer version of SAS to create the dataset and used variable names longer than 8 characters, you'll need to specify the following option to SAS to allow truncation of names to 8 characters for creating a version 5 export file:
OPTIONS VALIDVARNAME=V6;
Running SAS Job to Create csv Files
%INCLUDE "foo\exportlib.sas"; * Define macro;
LIBNAME d ...; * E.g. LIBNAME d SASV5XPT "foo.xpt";
/* To use regular SAS datasets (non-transport files) use for example
LIBNAME d "olddirectory"; or LIBNAME d "."; (current working directory);*/
%exportlib(d, outdir, tempdir);
* Default outdir is . (current working directory);
* Default tempdir is C:/WINDOWS/TEMP;
This creates a .csv file in outdir for every SAS dataset in d (including the PROC FORMAT output, if any) plus a file called _contents_.csv containing PROC CONTENTS output for all datasets combined. _contents_.csv lets the SAS data import know about variable labels, formats, and types (including date, time, and date/time variables).
Under Windows, this SAS job will run much faster if you store the SAS commands in a file such as exportsas.sas, left click on exportsas.sas, and then click on BATCH SUBMISSION. After the job finishes you will see the file exportsas.log in the same directory as exportsas.sas. The only error messages you should see are related to missing formats; ignore these.
Here are simple examples in which a single SAS dataset is exported to directory C:my/sascsv and there are no PROC FORMAT value labels. First consider the case where the dataset is the only dataset in the permanent data library.
LIBNAME d "."; * SAS datasets are in current working directory;
%exportlib(d, C:my/sascsv);
If the permanent data library has more than one SAS dataset but you only want to export one of them, say ds1, use for example
LIBNAME d "projects/mydatasets"; * SAS datasets are somewhere else;
DATA ds1; SET d.ds1; RUN;
%exportlib(work, C:my/sascsv);
Importing Data into R
d <- sasxport.get(file, method='csv')
# file is name of directory containing all the .csv files created by exportlib
This will produce a single data frame d if only one .csv file existed, or a list of data frames whose major elements are named by lower-case versions of the SAS dataset names, with underscores replaced by periods.
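The naming rule can be checked with a few lines of base R. This is a minimal sketch: sas_to_r_name is a hypothetical helper (not part of Hmisc) that only mimics the rule as described here, and the dataset names are made up.

```r
# Hypothetical helper: predicts the list element name for a SAS dataset name
# by lower-casing it and replacing underscores with periods.
sas_to_r_name <- function(sasname) {
  gsub("_", ".", tolower(sasname), fixed = TRUE)
}

sas_names <- c("DEMOG", "LAB_RESULTS", "Adverse_Events")  # made-up names
r_names <- sas_to_r_name(sas_names)
print(r_names)

# After a real import the elements would then be reached as, for example:
#   names(d)             # list the imported data frames
#   d$lab.results        # or d[["lab.results"]]
```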