Re: [R] RODBC and NULL values

From: Mark Wardle <mark_at_wardle.org>
Date: Tue 17 Oct 2006 - 10:49:06 GMT

Prof Brian Ripley wrote:
> What sqltype(s) are your variables?
>

The variables are all numeric.

> For numeric types, RODBC merely maps values the ODBC driver says are
> NULL to NA. Since you appear not to have character data,
>
> nullstring: character string to be used when reading 'SQL_NULL_DATA'
> character items from the database.
>
> na.strings: character string(s) to be mapped to 'NA' when reading
> character data.
>
> are not relevant to you.

I thought that, but was grasping at straws because at that point I didn't know whether it was problem with the ODBC driver misinforming RODBC about the correct character types.

>
> At least on Windows and Linux the PostgreSQL 8.1 ODBC driver works
> correctly, and NULLs in numeric columns are mapped to NAs in R. (There
> is an example in my test suite.)

I'm using Actual's ODBC driver. In my previous email, I did a test with another ODBC client (Microsoft Excel/Query) and found it too was misinterpreting NULL values as zero, concluding it was an issue with the ODBC driver itself. However, I was wrong - using the iodbctest program, the ODBC driver *is* successfully returning NULLs. It is only Microsoft Excel/Query and R that I am having the problem with these empty spaces/NULL characters being converted to zeros.

>> ...
>> When I query the database manually using psql, it is clear there are
>> NULL values.
>>        3 |         3 |         18 |       |       27 |        1
>>          |           |            |       |       13 |        1
>>        1 |         5 |            |       |       10 |        0
>>       10 |        13 |         13 |       |       22 |        0

>
> No, it is not clear. It is clear that there are values which are
> printed as blank or empty strings.
>

I *think* postgresql is regarding them as NULL values. I don't know whether this proves it? [The first two must be functionally equivalent)

ataxia=#select count(calc_survival_bilateral_support) from clinical;  count


    53
(1 row)

ataxia=#select count(calc_survival_bilateral_support) from clinical where calc_survival_bilateral_support is NOT NULL;  count


    53
(1 row)

ataxia=# select count(*) from clinical;
 count


   140
(1 row)

> Find out what the types involved are. Perhaps try as.is=FALSE?
>
Have done, and I'm afraid it doesn't change anything.

> Since the mapping of NULLs to NAs works in other examples, I find it
> hard to see how this can be an RODBC issue.
>
Perhaps it is a peculiarity in my set-up, or I'm missing something obvious and making some assumption somewhere. I will retrace my steps! Perhaps I should use a different approach, but I always have difficulty giving up on a problem unsolved!

-- 
Dr. Mark Wardle
Clinical research fellow and Specialist Registrar in Neurology,
C2-B2 link, Cardiff University, Heath Park, CARDIFF, CF14 4XN. UK

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Tue Oct 17 20:52:25 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Tue 17 Oct 2006 - 15:30:11 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.