Re: [Rd] importing explicitly declared missing values in read.spss (foreign)

From: Jeroen Ooms
Date: Tue, 05 Aug 2008 04:52:49 -0700 (PDT)

First of all, apologies if you feel misquoted; I was only trying to keep things clear. I have now installed and tried the new version of the package, and it works perfectly: it does exactly what it should. I tested it on some large SPSS sample files containing many variables with several types of missingness, and all missing values were correctly converted to R <NA> values. I find this a very big improvement, and it makes the transition from SPSS to R even easier. Thank you very much!
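
For anyone following the thread, here is a minimal sketch of how the 'use.missings' argument can be tried. The file name "survey.sav" and the missing-value codes 98 and 99 are hypothetical, and the small recode_missing() helper is only an illustration of the idea (user-declared missing codes become NA), not the code used inside foreign.

```r
# Hypothetical path; replace with a real SPSS .sav file.
sav_path <- "survey.sav"

if (requireNamespace("foreign", quietly = TRUE) && file.exists(sav_path)) {
  # use.missings = TRUE asks read.spss() to set user-declared
  # missing values to NA while importing.
  dat <- foreign::read.spss(sav_path, to.data.frame = TRUE,
                            use.missings = TRUE)
  # Count the NA values per variable to check the result.
  print(colSums(is.na(dat)))
}

# Illustration only: recode a set of declared missing codes to NA.
recode_missing <- function(x, missing_codes) {
  x[x %in% missing_codes] <- NA
  x
}

answers <- c(1, 2, 98, 3, 99)   # assume 98 and 99 were declared missing
recoded <- recode_missing(answers, missing_codes = c(98, 99))
```

Comparing the import with use.missings = TRUE and use.missings = FALSE on the same file is a quick way to see which values SPSS had declared as missing.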

Prof Brian Ripley wrote:
> I've put up an experimental version at
> See the new 'use.missings' argument. It does what I think should happen
> in your example and the other one I tried, but more experience would be
> helpful.
> On Mon, 4 Aug 2008, Jeroen Ooms wrote:
> Please don't silently excise context -- see the posting guide for the
> rights of posters to be quoted fairly (and your usage of my posting fails
> to be fair).

>> Prof Brian Ripley wrote:
>>> From the messages you get I do not believe this is a recent version of
>>> read.spss (message 2 no longer appears)...
>> I am sorry you are right here, I was using an outdated version of
>> foreign. I have updated my packages. My current version is now R version
>> 2.7.1 (2008-06-23) with foreign_0.8-28.
>> I have experimented importing some spss datafiles, mostly from the sample
>> data files that are included with SPSS. Most of these files do not
>> generate any warnings, so I am not sure this is related to the
>> missingness. However, the problem of read.spss() not returning any
>> information on missingness persists in all of these datafiles.
>> Prof Brian Ripley wrote:
>>> All that is 'harmful' is that you are not told that value labels NA and
>>> NAP were to be regarded as 'missing' in SPSS.  We've no idea whether it
>>> would be a more or less egregious choice to map them to R's NA, and
>>> certainly are not in a position to assert 'far less harmful' in
>>> general.
>> Of course the 'least harmful' behavior of the function completely
>> depends on the data and the user's intentions. I was explicitly
>> suggesting making the mapping of missing values to <NA>'s optional, to
>> give users who consider this appropriate the option to replace these
>> missings. I do not claim this to be the best default behavior, just a
>> very useful feature.

> --
> Brian D. Ripley,
> Professor of Applied Statistics,
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595

Received on Tue 05 Aug 2008 - 11:59:43 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 05 Aug 2008 - 12:35:54 GMT.