Re: [Rd] A couple of issues with colClasses/setAs

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Wed 08 Sep 2004 - 19:31:53 EST


From ?read.table (this is about read.table, despite the subject line, I
believe?)

colClasses: character. A vector of classes to be assumed for the columns.

"NULL" is not a class in my book (and certainly not one a column can have). So it is no wonder that it does not work, and failing in an undocumented case is not a bug.

We can look into making it work, but once you start skipping columns I think you should be using scan(). (I also suspect scan did not accept NULL when this was implemented.)
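
A rough sketch of the scan() route, assuming a current R in which a NULL entry in the `what` list skips that field (the temporary file stands in for test.dat, and the variable names are only illustrative):

```r
# Recreate the two-line test.dat from the example in a temporary file.
tmp <- tempfile()
writeLines(c("1 a", "2 b"), tmp)

# what = list(NULL, ""): skip the first field, read the second as character.
cols <- scan(tmp, what = list(NULL, ""), quiet = TRUE)

# Convert the column that was kept to a factor by hand.
second <- factor(cols[[2]])
print(levels(second))

unlink(tmp)
```

This returns a list rather than a data frame, so no row names are involved.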

On 8 Sep 2004, Peter Dalgaard wrote:

> Consider this:
>
> $ cat test.dat
> 1 a
> 2 b
>
> Now, we want to read the 2nd column as a factor and ignore the first
> (since it's just a sequential ID).

Well, you have to have row names, so that's not actually an advantage.

> We can't just put "factor" among
> the colClasses (would have been nice), so let's try this instead
>
> > setAs("character","factor",as.factor)
> Arguments in definition changed from (x) to (from)
> > read.table("test.dat",colClasses=c("numeric","factor"))
> Error in inherits(x, "factor") : Object "x" not found
>
> which is a bit peculiar: Why does it change the argument when that's
> going to create a function that doesn't work?? You do need to spell it
> out:
>
> > setAs("character","factor",function(from)as.factor(from))
>
> And now we get somewhere
>
> > read.table("test.dat",colClasses=c("numeric","factor"))
>   V1 V2
> 1  1  a
> 2  2  b

Might be a good idea to teach colClasses about "factor".

>
> but suppose we want to get rid of col.1:
>
> > read.table("test.dat",colClasses=c("NULL","factor"))
> Error in data[[i]] : subscript out of bounds
>
> which looks like a pretty clear bug. In contrast, this works fine
>
> > read.table("test.dat",colClasses=c("NULL","character"))
>   V2
> 1  a
> 2  b
>
> so the issue only arises when you have nontrivial coercions.
>
> Presumably, the issue is that the colClasses in those cases
> miscalculate indices by forgetting the columns that were skipped.
>
>
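
For comparison, a sketch of the same two reads under a current R, assuming a version whose ?read.table lists "NULL" (skip the column) and "factor" among the accepted colClasses values. The temporary file stands in for test.dat, and the object names are only illustrative:

```r
# Recreate the two-line test.dat from the example in a temporary file.
tmp <- tempfile()
writeLines(c("1 a", "2 b"), tmp)

# Read the second column as a factor without any setAs() call.
both <- read.table(tmp, colClasses = c("numeric", "factor"))

# Skip the first column and still coerce the second to a factor.
second_only <- read.table(tmp, colClasses = c("NULL", "factor"))

str(both)
str(second_only)

unlink(tmp)
```

If the second call succeeds, the skipped column no longer throws off the indices used for the coercion.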

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
