Re: [R] read.spss in R 2.1.0 & make basic dataframe

From: Thomas Lumley <tlumley_at_u.washington.edu>
Date: Fri 27 May 2005 - 00:08:09 EST

On Thu, 26 May 2005, Bliese, Paul D LTC USAMH wrote:
> On a related note, do other users routinely use read.spss with the
> defaults of "to.data.frame=F" or "use.value.labels=T"? My experience
> is that I am always using the non-default values in which case it would
> be helpful to change the defaults to "to.data.frame=T" and
> "use.value.labels=F". It would also probably make sense to change the
> default for "trim.factor.names=T". Interested in others' perspective.
>

Actually, most of this is me rather than Saikat.

I use use.value.labels=TRUE most of the time. The main point of to.data.frame=FALSE is that it is quite a lot faster for large files, especially if you are going to use only a few of the variables. I think Brian Ripley spoke up in favour of it for this reason last time the issue was raised.
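
For anyone comparing the two styles, here is a rough sketch of both calls. It assumes the example file electric.sav that ships with current versions of the foreign package; the choice of the first three variables is purely illustrative.

```r
library(foreign)

# Assumption: electric.sav is the example file bundled with foreign.
sav <- system.file("files", "electric.sav", package = "foreign")

# Default behaviour: returns a list of variables, with no conversion
# to a data frame.
dat_list <- read.spss(sav)

# Pick out only the variables you need, then convert just those.
# (Hypothetical selection: the first three variables in the file.)
small <- as.data.frame(dat_list[1:3])

# The non-default settings preferred in the quoted message:
# a data frame with raw codes instead of value labels.
dat_df <- read.spss(sav, to.data.frame = TRUE, use.value.labels = FALSE)
```

With the list form, the cost of building a data frame is paid only for the columns actually kept.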

The reason I made trim.factor.names=FALSE the default was backwards compatibility, but it probably makes sense to switch it at some point.
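
As a rough illustration of what trimming does, the sketch below strips trailing spaces from factor levels in plain R. It is not the code read.spss uses internally, and the padded levels are made up for the example.

```r
# Made-up factor whose levels carry trailing padding, as value labels
# read from a fixed-width source sometimes do.
f <- factor(c("yes  ", "no   ", "yes  "))

# Strip trailing spaces from each level; the observations keep their
# coding and only the level names change.
levels(f) <- sub(" +$", "", levels(f))

print(levels(f))
```

Without trimming, a comparison such as f == "yes" matches nothing, because the stored level is "yes  ".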

Incidentally, PSPP (the original source of the code) now has a version that reads long variable names from post-version 12 SPSS files. This confirms that the "unrecognised record type 7, subtype 13" message really is due to long variable names and so is harmless. It also means that anyone who wants long variable names badly enough could work out a patch.

         -thomas



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Fri May 27 00:17:25 2005
