Re: [R] Bug in levels() function?

From: Thomas Lumley <tlumley_at_u.washington.edu>
Date: Mon, 28 Jan 2008 11:03:51 -0800 (PST)

This is not a bug; it is deliberately designed this way.

There are circumstances where you want to drop levels on subsetting and others where you don't, so the default behaviour can't make everyone happy. However, subsetting with drop=TRUE gives the behaviour you want:
> x <- as.factor(LETTERS)
> levels(x[1])
 [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S"
[20] "T" "U" "V" "W" "X" "Y" "Z"
> levels(x[1, drop=TRUE])
[1] "A"
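
The same idea applied to a logical index, as in the quoted question, might look like the sketch below. The data are made up for illustration (the names sibs and keep, and the P_1 ... P_4 labels, are hypothetical and not the poster's Sibships vector).

```r
# Hypothetical data: four sibships with two members each.
sibs <- factor(rep(c("P_1", "P_2", "P_3", "P_4"), each = 2))

# Hypothetical logical index selecting the first and last sibship.
keep <- c(TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE)

# Default subsetting keeps every level, used or not.
nlevels(sibs[keep])

# drop = TRUE discards the unused levels while subsetting.
nlevels(sibs[keep, drop = TRUE])

# Re-creating the factor from the subset also discards them.
levels(factor(sibs[keep]))
```

Here the first call should report 4 levels and the other two only the 2 levels still in use. In R 2.12.0 and later, droplevels() is another way to get the same result.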

On Mon, 28 Jan 2008, Groot, Philip de wrote:

> Hello all,
>
> I am not sure whether it actually is a bug, but it is not the behaviour I would expect. Please consider this:
>
>> Sibships
> [1] Patient_2400 Patient_2400 Patient_345 Patient_345 Patient_8901
> [6] Patient_8901 Patient_4008 Patient_4008 Patient_7991 Patient_7991
> [11] Patient_8353 Patient_8353 Patient_1212 Patient_1212 Patient_2168
> [16] Patient_2168 Patient_2760 Patient_2760 Patient_4726 Patient_4726
> [21] Patient_6699 Patient_6699 Patient_7641 Patient_7641 Patient_8263
> [26] Patient_8263 Patient_1389 Patient_1389 Patient_1618 Patient_1618
> [31] Patient_2410 Patient_2410 Patient_2612 Patient_2612 Patient_2721
> [36] Patient_2721 Patient_5053 Patient_5053 Patient_8458 Patient_8458
> [41] Patient_211 Patient_211 Patient_9004 Patient_9004 Patient_3423
> [46] Patient_3423 Patient_7413 Patient_7413 Patient_7815 Patient_7815
> [51] Patient_9232 Patient_9232 Patient_2267 Patient_2267 Patient_468
> [56] Patient_468
> 28 Levels: Patient_1212 Patient_1389 Patient_1618 Patient_211 ... Patient_9232
>
>> Comparison_Indices
> [1] TRUE TRUE TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
> [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> [25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE TRUE
> [49] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>
>> Sibships[Comparison_Indices]
> [1] Patient_2400 Patient_2400 Patient_345 Patient_345 Patient_8901
> [6] Patient_8901 Patient_7413 Patient_7413
> 28 Levels: Patient_1212 Patient_1389 Patient_1618 Patient_211 ... Patient_9232
>
> The problem with this last command is that I would expect 4 levels, because only 8 "Comparison_Indices" are TRUE, which corresponds to 4 sibships. So levels() does not take array indices into account; stated otherwise, if you subset an array (vector), the levels() are not properly updated, in my opinion.
>
> What I additionally found is the following:
>> small_test <- factor(x=c("a", "b", "c"))
>> typeof(small_test)
> [1] "integer"
>
> The same happens with the Sibships that I defined as a factor. Why is it of type integer?
>
> This is the version() output:
>> version
> _
> platform x86_64-unknown-linux-gnu
> arch x86_64
> os linux-gnu
> system x86_64, linux-gnu
> status
> major 2
> minor 6.1
> year 2007
> month 11
> day 26
> svn rev 43537
> language R
> version.string R version 2.6.1 (2007-11-26)
>>
>
> So: should I submit a Bug report?
>
> Regards,
>
> Dr. Philip de Groot
> Wageningen University
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
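
On the typeof() question in the quoted message: as far as I know, a factor is stored as a vector of integer codes with a "levels" attribute holding the labels, so typeof() reports the storage type while class() reports "factor". A minimal sketch, reusing the small_test example from the question:

```r
small_test <- factor(c("a", "b", "c"))

typeof(small_test)      # storage type of the underlying codes
class(small_test)       # the class that makes it behave as a factor

as.integer(small_test)  # the integer codes themselves
levels(small_test)      # the labels the codes index into

# Indexing the levels by the codes recovers the original labels.
levels(small_test)[small_test]
```

This is also why as.integer() on a factor returns codes rather than the printed labels; as.character() is the safer conversion when the labels are wanted.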

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley_at_u.washington.edu	University of Washington, Seattle

Received on Mon 28 Jan 2008 - 19:06:05 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 29 Jan 2008 - 09:30:09 GMT.
