Re: [R] Problem Reading SPlus Dump Into R - Spaces Embedded in Data

From: Peter Dalgaard <p.dalgaard_at_biostat.ku.dk>
Date: Fri 30 Dec 2005 - 21:01:58 EST

allan miller <amiller@a2software.com> writes:

> Peter Dalgaard wrote:
>
> >allan miller <amiller@a2software.com> writes:
> >
> >
> >>Hello,
> >>
> >> I'm trying to source() an SPlus 6.x file created using dump(...,
> >> oldStyle=T) into R (version 2.01), following these
> >> instructions:
> >>
> >>
> >>> *If you have access to S-PLUS, it is usually more reliable to
> >>> |dump| the object(s) in S-PLUS and |source| the dumpfile in R. For
> >>> S-PLUS 5.x and 6.x you may need to use |dump(..., oldStyle=T)|,
> >>> and to read in very large objects it may be preferable to use the
> >>> dumpfile as a batch script rather than use the |source| function.*
> >>>
> >>(from "R Data Import/Export," pg. 15)
> >>
> >>An example:
> >>
> >> > source("testdump")
> >>Error in parse(file, n, text, prompt) : syntax error on line 1895
> >>
> >> where the data on line 1895 - and other lines causing this - have
> >> embedded spaces, such as the following:
> >>
> >>
> >>[line 1895] Johnson Partners LLC
> >>
> >>
> >> I can't seem to find any options for either the SPlus dump, or R
> >> source(), that relate to this problem. Any suggestions for how to
> >> either dump or source files containing data with embedded spaces?
> >>
> >
> >A bit more context might be helpful. What's in lines surrounding 1895?
> >Can you show a simple S-PLUS object displaying the behaviour? What
> >happens if you dput() the object? Will S-PLUS itself restore the
> >file?
>
> Unfortunately, I don't have access to S-PLUS :'( , the S-PLUS file
> dump was provided to me to load in R. Here are the lines in the
> S-PLUS dump surrounding 1895:
>
> > 1892 .Label
> > 1893 character
> > 1894 1
> > 1895 Johnson Partners LLC
> > 1896 class
> > 1897 character
> > 1898 1
> > 1899 factor
> > 1900 Protocol
>
> The problem is with the embedded space (whitespace?) characters on
> line 1895. If I remove the spaces, i.e., change it to:
>
> JohnsonPartnersLLC
>
> the line is successfully loaded, and the next error that comes up is
> another Label with embedded spaces.
>
> Thanks for your help.

As I suspected, your data are not in the format that you thought they were.

turmalin:~/>Splus
S-PLUS : Copyright (c) 1988, 2003 Insightful Corp.
S : Copyright Lucent Technologies, Inc.
Version 6.2.1 for Linux 2.4.18 : 2003
Working data will be in /home/bs/pd/MySwork
> x <- "Johnson Partners LLC"
> dump("x",file="testfile",oldStyle=TRUE)
[1] "testfile"
>

[1]+  Stopped                 Splus

turmalin:~/>cat testfile
"x" <-
"Johnson Partners LLC"

turmalin:~/>fg
Splus

> data.dump("x",file="test2")

>

[1]+  Stopped                 Splus

turmalin:~/>cat test2
## Dump S Version 4 Dump ##
x
character
character
1
Johnson Partners LLC

....

> data.dump("x",oldStyle=TRUE)

>

[2]+  Stopped                 Splus

turmalin:~/>cat dumpdata
x
character
1
Johnson Partners LLC

So what you have looks like the data.dump() format, probably the oldStyle variant (check line 1: the newer format starts with "## Dump S Version 4 Dump ##"), which is quite different from what dump() writes.
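
The first-line check can be sketched in shell. This is only a rough heuristic: the function name guess_s_format is made up for illustration, and the patterns are just the ones visible in the three transcripts above, so treat the answer as a hint rather than a verdict.

```shell
# Guess which S-PLUS routine wrote a file by looking at its first line.
# guess_s_format is a hypothetical helper; the patterns come only from
# the example files shown in this message.
guess_s_format() {
    first=$(head -n 1 "$1")
    case "$first" in
        "## Dump S Version 4 Dump ##")
            # header written by data.dump() without oldStyle
            echo "data.dump" ;;
        *"<-"*)
            # dump() writes S assignments, e.g.  "x" <-
            echo "dump" ;;
        *)
            # bare object name on line 1: possibly data.dump(oldStyle=TRUE)
            echo "data.dump oldStyle (or something else)" ;;
    esac
}
```

Calling guess_s_format testdump prints one of the three labels; a file that falls into the last case is the kind that data.restore() may be able to read.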

data.restore() from the foreign package can read oldStyle files, provided they contain only basic data objects. For the oldStyle=F format, you seem to be out of luck.

(And BTW, R will _parse_ almost any file consisting of lines with just a single word or a numeric constant. That doesn't mean it can do anything sensible with it...)

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)                  FAX: (+45) 35327907
