Re: [R] + and - in RODBC : no longer considered factors

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Wed, 30 Apr 2008 11:49:14 +0100 (BST)

It is nothing to do with RODBC, which follows read.table here:

% cat > foo.txt
x
+
-
...

> read.table("foo.txt", header=TRUE)

   x
1 0
2 0

and that uses

> type.convert(c("+", "-"))

[1] 0 0
> type.convert(c("+", "a"))

[1] + a
Levels: + a

Whereas 2.6.2 did

> type.convert(c("+", "-"))

[1] + -
Levels: + -
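
Since the result depends on the R version, a column that must be a factor is safer read with an explicit class, which bypasses the automatic conversion. A minimal sketch for the read.table case, assuming the same one-column file as above (written here to a temporary file; the names tmp, d1 and d2 are only for illustration):

```r
# Recreate the example file: a header line 'x' and the fields '+' and '-'.
tmp <- tempfile(fileext = ".txt")
writeLines(c("x", "+", "-"), tmp)

# Option 1: declare the column class, so no numeric guess is made.
d1 <- read.table(tmp, header = TRUE, colClasses = "factor")

# Option 2: read everything as character, then convert chosen columns.
d2 <- read.table(tmp, header = TRUE, colClasses = "character")
d2$x <- factor(d2$x)

str(d1)
str(d2)
```

For sqlQuery the analogous route is the as.is argument mentioned in the original post below, followed by factor() on the columns concerned.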

The difference comes from a change under which R itself (and not the OS) decides what a 'numeric field' is:

    o   Parsing and scanning of numerical constants is now done by R's
        own C code.  This ensures cross-platform consistency, and
        mitigates the effects of setting LC_NUMERIC (within base R it
        only applies to output -- packages may differ).

        The format accepted is more general than before and includes
        binary exponents in hexadecimal constants: see
        ?NumericConstants for details.

There's a comment in the sources that numeric fields with no digits should perhaps be regarded as non-numeric, so this can easily be changed.
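
As an illustration of that rule at the R level (not the C code in the sources), here is a sketch of a wrapper that refuses a numeric result when no field contains a digit. The function name convert_guarded is hypothetical, and the digit test is my own assumption about what 'no digits' would mean:

```r
# Hypothetical helper, not part of R: call type.convert() as usual, but
# if the result is numeric although no non-missing field contains a
# digit, return a factor instead.
# Limitation: a column holding only Inf/NaN would also become a factor.
convert_guarded <- function(x, na.strings = "NA") {
    out <- type.convert(x, na.strings = na.strings, as.is = FALSE)
    fields <- x[!is.na(x) & !(x %in% na.strings)]
    digitless <- length(fields) > 0 && !any(grepl("[0-9]", fields))
    if (is.numeric(out) && digitless) {
        x[x %in% na.strings] <- NA   # keep missing values missing
        out <- factor(x)
    }
    out
}

convert_guarded(c("+", "-"))   # a factor, whatever type.convert() decides
convert_guarded(c("1", "2"))   # still numeric
```

Checking the converted result rather than the input alone leaves logical columns untouched, since those are not numeric.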

On Wed, 30 Apr 2008, Dieter Menne wrote:

> I have a large Sweave report that reads data from a database file. Some of
> the columns are 1-character strings containing only +, - or NA. An example
> for such a table is shown below, and can be downloaded for easier testing
> from
>
> http://www.menne-biomed.de/uni/test.zip
>
> (For security reasons, the file is zipped)
>
> table test
>
> hp hp1
> + a
> - +
>
>
> library(RODBC)
> channel = odbcConnectAccess("test.mdb")
> ret = sqlQuery(channel,"select * from test")
> odbcClose(channel)
> str(ret)
> # 'data.frame': 2 obs. of 2 variables:
> # $ hp : num 0 0
> # $ hp1: Factor w/ 2 levels "+","a": 2 1
>
>
> Note that the column hp, which contains only "+" and "-", is read as
> numeric 0, but when another character is present, as in hp1, the column
> is converted to a factor.
>
> In R 2.6.2 (or was it an earlier version of RODBC?), column hp was treated
> as factor.
>
> Is this a new feature I have to live with, or an ... ahem ... issue? I know
> that with as.is I can get around this, but it needs a lot of explicit
> programming for the columns I don't want to be as.issed.
>
> Disclaimer:
> -- Yes, I know I should have reported this earlier, but the problem of
> having to re-create the report came up today.
> -- Yes, I should have reported this on the windows/devel r-help or directly
> to the author (of RODBC; or base?), so I feel guilty in advance that this is
> the wrong list.
> -- Yes, I have read the NEWS, and could not find something related.
> -- Yes, I cannot rule out this is a user error.
>
>
> Dieter
>
>
> ---------------------------
>
> R version 2.7.0 (2008-04-22)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=German_Germany.1252;LC_CTYPE=German_Germany.1252;LC_MONETARY=German_Germany.1252;LC_NUMERIC=C;LC_TIME=German_Germany.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] RODBC_1.2-3
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,                  ripley_at_stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Received on Wed 30 Apr 2008 - 10:52:34 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 30 Apr 2008 - 13:30:33 GMT.
