RE: [Rd] Problem with read.xport() from foreign package (PR#7389)

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Fri 10 Dec 2004 - 01:42:35 EST

Have you looked at the latest version of foreign, 0.8-2? The issue has already been resolved, AFAIK.

On Thu, 9 Dec 2004, Werner Engl wrote:

> Dear R-devel list,
>
> This is to confirm Prof. Ripley's analysis of the
> read.xport issue.
>
> The section on missing data in TS140 is pertinent
> to numeric variables only. In SAS, character
> variables are of fixed length (between 1 and 200
> for the xport format). Shorter strings are padded
> with trailing blanks when assigned to a variable.
>
> An uninitialized character variable is stored as
> all blanks in the xport format file. This is the
> only representation of 'missing' data for SAS
> character variables. 'Special missing' codes
> (.A to .Z and ._) are available for numeric
> variables only.
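
A minimal sketch in C of the storage rule described above, for illustration only: a character value is held in a fixed-width field padded with trailing blanks, and a 'missing' value is simply a field that is all blanks. The helper names pad_field and is_all_blank are hypothetical and are not taken from SASxport.c or from SAS; truncating over-long input is an assumption made here to keep the sketch safe.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: store src in a fixed-width field of `width` bytes,
 * padding with trailing blanks. The field is NOT NUL-terminated.
 * Assumption for this sketch: input longer than the field is truncated. */
static void pad_field(char *field, size_t width, const char *src)
{
    size_t n = strlen(src);
    if (n > width)
        n = width;
    memcpy(field, src, n);
    memset(field + n, ' ', width - n);
}

/* Hypothetical helper: return 1 if every byte of the field is a blank,
 * i.e. the only 'missing' representation for a character variable. */
static int is_all_blank(const char *field, size_t width)
{
    for (size_t i = 0; i < width; i++) {
        if (field[i] != ' ')
            return 0;
    }
    return 1;
}
```

With an 8-byte field, pad_field(f, 8, "abc") leaves "abc" followed by five blanks, and pad_field(f, 8, "") leaves eight blanks, which is_all_blank then reports as missing.
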
>
> Please find enclosed a patch to the
> R-2.0.1/src/library/Recommended/foreign/SASxport.c
> file and an xport file that I used for testing. The
> xport file was created by SAS V8.2 on Linux, but
> should be platform and version independent (except
> for the header information). I have simply commented
> out the code lines that try to detect missing character
> values.
>
> The code in SASxport.c already does a good job of
> removing trailing blanks from character values.
> For missing character data (all blanks) the result
> is the empty string (""), which is fine for me.
> There is no equivalent to the R missing character
> representation in SAS (as far as I know).
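
A hedged sketch in C of the behaviour described above, not the actual code in SASxport.c: trailing blanks are dropped from a fixed-width field, so an all-blanks (missing) value comes out as the empty string. The function name trim_field and its signature are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy a fixed-width, blank-padded field into `out`
 * as a NUL-terminated string with trailing blanks removed.
 * `out` must have room for width + 1 bytes. Returns the trimmed length. */
static size_t trim_field(char *out, const char *field, size_t width)
{
    size_t n = width;
    /* Walk back from the end of the field over the blank padding. */
    while (n > 0 && field[n - 1] == ' ')
        n--;
    memcpy(out, field, n);
    out[n] = '\0';
    return n;
}
```

Embedded blanks are kept, so a field holding "a b" plus padding yields "a b", while a field of eight blanks yields "" with length 0.
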
>
> The enclosed gzipped tar file contains:
>
>   diff_SASxport_c.txt
>       diff for SASxport.c
>   xptchar1.xpt
>       test file in xport format
>   xptchar.sas
>       trivial SAS program used to generate xptchar1.xpt
>   xptchar_SAS_System_Viewer9_1.csv
>       xptchar1.xpt converted to a comma-separated file
>       using SAS System Viewer 9.1 (on Win XP)
>
> With the patch applied, read.xport produces the same
> data frame from xptchar1.xpt as read.csv does from
> xptchar_SAS_System_Viewer9_1.csv (tested on i386 Linux
> with R Version 2.0.1) except that read.csv converts empty
> strings to NAs. As explained above, the empty string is
> closer to the meaning of an all-blanks value in SAS.
>
> There is renewed interest in this old data format in
> the pharmaceutical industry, because the US Food and
> Drug Administration requests clinical and
> pre-clinical data to be submitted in this format. I
> spent some time analyzing the xport file format to
> be sure of what is actually submitted to FDA with
> these files.
>
> Thank you for considering this patch (and for the
> great R system, of course)!
>
>
> Best regards,
>
> Werner Engl
>
>
>
> _____________________________________
> Werner Engl, PhD, CStat
> Senior Manager, Biostatistics
> Baxter AG, Vienna, Austria
> e-mail: werner_engl@baxter.com

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Fri Dec 10 02:10:35 2004
