Main.SweaveConvert r1.11 - 23 Sep 2007 - 16:05 - FrankHarrell

Converting Documents Produced by Sweave

Converting from LaTeX to html

The recommended format for statistical reports to send to collaborators is pdf produced by pdflatex. Sometimes it is necessary to give collaborators a version of a report that can be edited outside of LaTeX, or to post a report on a web site. Experiments with latex2rtf and hevea have shown that these are not adequate for reports that incorporate advanced features such as latex(describe()) output. One of the most reliable approaches is to use TtH to convert from LaTeX to html.

A Linux script is needed to translate Sweave's LaTeX output for use with TtH (thanks to Ben Bolker for providing most of the code). Here are the steps needed to get going.

  1. As root, do the following. The second step installs netpbm, which provides the ppmtogif executable; it is needed only if your documents include LaTeX picture code (as produced by latex.describe).
    1. apt-get install tth
    2. apt-get install netpbm
  2. As user, copy sweave2html to ~/bin and make sure that ~/bin is in your search path
  3. Mark this script as executable
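
Before moving on, it can help to confirm that the pieces from the steps above are in place. The following is only a minimal sketch and is not part of sweave2html; the function names in_path and check_setup are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if directory $1 is an entry in $PATH.
in_path() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Hypothetical helper: report anything missing from the setup described above.
check_setup() {
    status=0
    for cmd in tth ppmtogif sweave2html; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing command: $cmd"
            status=1
        fi
    done
    if ! in_path "$HOME/bin"; then
        echo "$HOME/bin is not in PATH"
        status=1
    fi
    return $status
}
```

After sourcing these definitions, running check_setup prints one line per missing piece and returns non-zero if anything is missing.
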
To convert your .tex file to html and create all the needed graphics files, do the following.
  1. Create a graphics subdirectory in your project directory and have Sweave use it by putting the following command in your .Rnw file: \SweaveOpts{prefix.string=graphics/plot}
  2. Run Sweave, using for example a script named Sweave in ~/bin that is marked as executable and contains
#!/bin/sh
echo "Sweave(\"$1.Rnw\")" | R --no-save --no-restore
  3. If you ran Sweave foo, next run sweave2html foo.
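
The two commands above can be chained so that the conversion runs only if weaving succeeded. This is a sketch under assumptions: the names sweave_expr and weave_to_html are hypothetical, sweave2html is called exactly as described above, and R is assumed to exit with a non-zero status when Sweave fails in a non-interactive session.

```shell
#!/bin/sh
# Print the R expression that weaves <base>.Rnw
# (the same expression the Sweave script above pipes into R).
sweave_expr() {
    printf 'Sweave("%s.Rnw")\n' "$1"
}

# Hypothetical wrapper: weave, then convert, stopping at the first failure.
weave_to_html() {
    base=$1
    if [ ! -f "$base.Rnw" ]; then
        echo "cannot find $base.Rnw" >&2
        return 1
    fi
    mkdir -p graphics    # matches prefix.string=graphics/plot
    sweave_expr "$base" | R --no-save --no-restore || return 1
    sweave2html "$base"
}
```

With these definitions sourced, weave_to_html foo would take foo.Rnw through to foo.html in one step.
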

You can also view the output in a browser such as Konqueror, then copy and paste it into an OpenOffice document and save it in a variety of formats, including Word. Use Select All, then Copy and Paste, and all graphics and table formats will be preserved.

It is important to give your collaborator all the .pdf files in the graphics directory to use in manuscripts; do not let them use the lower resolution graphs that will be included in the foo.html document. Bundle all the necessary files to send to the collaborator, using for example

zip /tmp/z.zip foo.pdf foo.html *.gif graphics/*.pdf graphics/*.png
E-mail /tmp/z.zip as an attachment.
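
Because the high-resolution .pdf graphics are the important part of the bundle, it may be worth checking that they exist before zipping. The helper below is a hypothetical sketch (the name bundle_files is not from any package); it assumes file names without embedded newlines and a shell that leaves unmatched glob patterns untouched, as sh and bash do by default.

```shell
#!/bin/sh
# Hypothetical helper: print the files to bundle for report <base>, one per
# line, skipping patterns that match nothing. Returns 1 (with a warning) if
# the graphics directory holds no .pdf files.
bundle_files() {
    base=$1
    for f in "$base.pdf" "$base.html" *.gif graphics/*.pdf graphics/*.png; do
        [ -f "$f" ] && echo "$f"
    done
    for f in graphics/*.pdf; do
        [ -f "$f" ] && return 0
    done
    echo "warning: no .pdf graphics found in graphics/" >&2
    return 1
}
```

One possible use is bundle_files foo | zip /tmp/z.zip -@, since zip reads the list of file names from standard input when given -@.
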

Using TeX4ht

The TeX4ht package is a comprehensive LaTeX to html converter. It may be installed easily using apt-get. In one test it performed well (including Greek letters, superscripts, and LaTeX picture environments), although I did not see how to get postscript or pdf graphics to appear in the final output. Advanced summary.formula.reverse tables are handled nearly perfectly, including those that contain micro dot charts. TeX4ht is used as follows:
htlatex foo.tex            # produces foo.html
mk4ht oolatex foo.tex      # produces an OpenOffice .sxw file
Note that the tth package has to be installed for htlatex to run completely.

My test of the oolatex option resulted in output that was not as good as running htlatex and opening the resulting .html file in OpenOffice. See StatReport for more information and example output, and note its comment about turning off picture links in the OpenOffice document.

Using OpenOffice Exclusively

The odfWeave package by Max Kuhn can be used to produce reports directly in open document format, and the output can be saved in Word format. At present, graphics are somewhat low resolution. Source code is similar to what is used with Sweave. Here is how to run an example (in Linux), after installing the odfWeave package and the latest OpenOffice. The file can then be exported to open document or Word format.

R
library(odfWeave)
odfWeave('/usr/local/lib/R/site-library/odfWeave/examples/examples.odt', '/tmp/out.odt')
You can then open /tmp/out.odt in OpenOffice Writer. Note: On some systems the correct file name will be /usr/lib/R/site-library/odfWeave/examples/examples.odt.
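
Since the example file may live in either of the two locations mentioned above, a small wrapper can pick whichever exists. This is a sketch only: first_existing and run_odfweave_example are hypothetical names, and the wrapper assumes Rscript is on the search path.

```shell
#!/bin/sh
# Hypothetical helper: print the first argument that names an existing file.
first_existing() {
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Hypothetical wrapper: run the odfWeave example from whichever location exists.
run_odfweave_example() {
    src=$(first_existing \
        /usr/local/lib/R/site-library/odfWeave/examples/examples.odt \
        /usr/lib/R/site-library/odfWeave/examples/examples.odt) || {
        echo "examples.odt not found in either location" >&2
        return 1
    }
    Rscript -e "library(odfWeave); odfWeave('$src', '/tmp/out.odt')"
}
```
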

This approach does not allow you to use the advanced table making capabilities of Hmisc that rely on LaTeX.

Weaving with Raw HTML

Greg Snow has written a document (see the htmlWeave.pdf attachment) showing how to use raw HTML and the R2HTML package to produce .html reports.

Batch Conversion of Document Formats

cd /tmp
tar zxvf ooconvert-*
# You may need to edit line to change python2.3 to python
sudo chmod a+x ooconvert
sudo mv ooconvert /usr/local/bin   # or move it to ~/bin, which does not need sudo
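
The edit mentioned in the comment above can also be done non-interactively. The sketch below is an assumption-laden illustration: the note does not say which line needs changing, so this replaces the first python2.3 on every line where it appears, and the function name fix_python_name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: copy a script from stdin to stdout,
# replacing the interpreter name python2.3 with python.
fix_python_name() {
    sed 's/python2\.3/python/'
}
```

For example, fix_python_name < ooconvert > ooconvert.new writes an edited copy that can be inspected before it replaces the original.
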

One-step Conversion of LaTeX Documents to Word

  • Install ooconvert, tth, tex4ht
  • Put the following script in ~/bin and chmod +x to make it executable
  • Run it by saying ltx2doc foo to convert foo.tex to foo.doc
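#!/bin/sh
# ltx2doc: convert $1.tex to $1.doc by way of an OpenOffice document
mk4ht oolatex "$1.tex"
rm -f "$1.css" "$1.idv" "$1.lg" "$1.tmp" "$1.4tc" "$1.xref" "$1.4ct"
ooconvert "$1.odt" "$1.doc"
rm "$1.odt"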
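
A slightly more defensive variant of the ltx2doc script above might validate its argument and keep the intermediate .odt file when the final conversion fails. This is a sketch, not the author's script; strip_tex_ext and ltx2doc_checked are hypothetical names, and mk4ht and ooconvert are called exactly as in the script above.

```shell
#!/bin/sh
# Hypothetical helper: accept either "foo" or "foo.tex" and print "foo".
strip_tex_ext() {
    printf '%s\n' "${1%.tex}"
}

# Hypothetical variant of ltx2doc that checks its argument first.
ltx2doc_checked() {
    if [ $# -ne 1 ]; then
        echo "usage: ltx2doc_checked foo" >&2
        return 2
    fi
    base=$(strip_tex_ext "$1")
    if [ ! -f "$base.tex" ]; then
        echo "cannot find $base.tex" >&2
        return 1
    fi
    mk4ht oolatex "$base.tex" || return 1
    rm -f "$base.css" "$base.idv" "$base.lg" "$base.tmp" \
          "$base.4tc" "$base.xref" "$base.4ct"
    # Remove the intermediate .odt only if the conversion to .doc succeeded.
    ooconvert "$base.odt" "$base.doc" && rm "$base.odt"
}
```
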

Attachment      Size     Date                  Who            Comment
sweave2html     0.6 K    25 Jul 2006 - 21:59   FrankHarrell   Sweave LaTeX to html converter
htmlWeave.pdf   60.8 K   27 Jul 2006 - 03:59   FrankHarrell   Automating Reports with Sweave by Greg Snow

Vanderbilt Biostatistics
Copyright © 2003-2008 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.