Re: [R] factor : how does it work ?

From: Florence Combes <fcombes_at_gmail.com>
Date: Fri 07 Oct 2005 - 01:08:17 EST

> head(merged)
          ID                              Name Pcc_0h_A Pcc_0h_swapped_A
3302  301495 Q0010_01 |Q0010||Hypothetical ORF   12.276           11.716
6943  309175 Q0010_01 |Q0010||Hypothetical ORF   11.958           11.271
14065 298935 Q0017_01 |Q0017||Hypothetical ORF   14.098           13.122
6420  306615 Q0017_01 |Q0017||Hypothetical ORF   13.843           13.061
5066  296375 Q0032_01 |Q0032||Hypothetical ORF   12.451           11.467
12707 304055 Q0032_01 |Q0032||Hypothetical ORF   11.745           11.482
      Pcc_0h_M Pcc_0h_swapped_M
3302    -0.249            0.316
6943    -0.115            0.780
14065   -0.053            0.263
6420     0.009            0.323
5066     0.015            0.687
12707    0.074            0.768

> str(merged)
`data.frame': 12202 obs. of 6 variables:
 $ ID              : Factor w/ 12202 levels "295080","295081",..: 5076 11177 3046 9147 1009 7110 5136 11237 3106 9207 ...
  ..- attr(*, "names")= chr "3302" "6943" "14065" "6420" ...
 $ Name            : Factor w/ 6101 levels "Q0010_01 ..",..: 1 1 2 2 3 3 4 4 5 5 ...
  ..- attr(*, "names")= chr "3302" "6943" "14065" "6420" ...
 $ Pcc_0h_A        : Factor w/ 5386 levels "10.001","10.002",..: 1812 1547 3308 3114 1960 1370 NA NA NA NA ...
  ..- attr(*, "names")= chr "3302" "6943" "14065" "6420" ...
 $ Pcc_0h_swapped_A: Factor w/ 5082 levels "10.001","10.002",..: 1256 885 2533 2477 1051 1064 NA NA NA NA ...
  ..- attr(*, "names")= chr "3302" "6943" "14065" "6420" ...
 $ Pcc_0h_M        : Factor w/ 1940 levels " 0.000"," 0.001",..: 499 231 107 18 30 148 NA NA NA NA ...
  ..- attr(*, "names")= chr "3302" "6943" "14065" "6420" ...
 $ Pcc_0h_swapped_M: Factor w/ 2343 levels " 0.000"," 0.001",..: 632 1453 526 646 1319 1434 NA NA NA NA ...
  ..- attr(*, "names")= chr "3302" "6943" "14065" "6420" ...

> > a last question, and thanks a million for your patience and your
> > explanations ...
> >
> >
> > I tried with a df called "merged" and a column named "Pcc_0h_A" (which is
> > numeric values):
> >
> >> length(as.vector(merged$Pcc_0h_A))
> > [1] 12202
> >> as.numeric(as.vector(merged$Pcc_0h_A)[1:10])
> > [1] 12.276 11.958 14.098 13.843 12.451 11.745 NA NA NA NA
> >> ord<-ordered(merged$Pcc_0h_A)
> >> length(ord)
> > [1] 12202
> >> ord[1:10]
> > [1] 12.276 11.958 14.098 13.843 12.451 11.745 <NA> <NA> <NA> <NA>
> > 5386 Levels: 10.001 < 10.002 < 10.003 < 10.005 < 10.006 < 10.010 < ... <
> > 9.999
> >
> > here I have <NA> instead of NA because ord is a factor and the notation is
> > different ?
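
On the <NA> notation: as far as I know this is only a printing convention. A factor shows a missing value as <NA> so that it cannot be confused with a level that is literally the string "NA". A minimal sketch with a toy factor (the values are made up for illustration, not taken from merged):

```r
# Toy factor: one level really is the text "NA", one element is truly missing.
f <- factor(c("NA", NA, "12.276"))

# Printing shows the missing element as <NA> and the level as plain NA.
print(f)

# is.na() only flags the truly missing element.
missing_flags <- is.na(f)
print(missing_flags)

# The string "NA" is kept as an ordinary level; the missing value is not a level.
print(levels(f))
```

So the <NA> in ord[1:10] and the NA in the numeric vector should both mean "missing"; only the display differs.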

>

> >
> >> length(as.numeric(merged$Pcc_0h_A))
> > [1] 12202
> >> as.numeric(merged$Pcc_0h_A[1:10])
> > [1] 1812 1547 3308 3114 1960 1370 NA NA NA NA
> >
> > are these the level names converted into numbers ? I don't think so, because
> > levels are like 10.001, 10.002 etc and 1812, 1547 etc are not in this form.

From the str(merged) output, I guess that 1812, 1547, etc. are a sort of rank. Am I right?
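
A small sketch of what the numbers in the str() output appear to be: the integer codes of the factor, i.e. the position of each value in levels(). The vector below is a hypothetical stand-in for merged$Pcc_0h_A, not the real data. Note that the levels are sorted as text, which is why "9.999" comes after "10.001" in the ordered() output quoted above, so the codes are a rank in text order rather than in numeric order.

```r
# Hypothetical stand-in for merged$Pcc_0h_A: numbers stored as a factor.
x <- factor(c("12.276", "11.958", "9.999", NA, "10.001"))

# Levels are the unique values sorted as character strings, not as numbers.
print(levels(x))

# as.numeric()/as.integer() on a factor return the internal codes:
# the index of each element's level in levels(x).
codes <- as.integer(x)
print(codes)

# To recover the numbers, convert through character first ...
values <- as.numeric(as.character(x))
print(values)

# ... or convert the levels once and index them by the factor.
values2 <- as.numeric(levels(x))[x]
print(values2)
```

With the real data frame, as.numeric(as.character(merged$Pcc_0h_A)) would be the analogous call; it matches the as.numeric(as.vector(...)) result quoted above.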

>

> > thanks a million
> >
> > florence;
> >

>
>

Received on Fri Oct 07 01:12:14 2005