From: Duncan Murdoch <murdoch_at_stats.uwo.ca>

Date: Fri 07 Oct 2005 - 00:57:45 EST

> [1] 12202
> [1] 12.276 11.958 14.098 13.843 12.451 11.745 NA NA NA NA
> [1] 12202
> [1] 12.276 11.958 14.098 13.843 12.451 11.745 <NA> <NA> <NA> <NA>
> 5386 Levels: 10.001 < 10.002 < 10.003 < 10.005 < 10.006 < 10.010 < ... < 9.999
>
> here I have <NA> instead of NA because ord is a factor and the notation is
> different ?
>
> [1] 12202
> [1] 1812 1547 3308 3114 1960 1370 NA NA NA NA
>
> are these the levels names converted into numbers ? I don't think because
> levels are like 10.001, 10.002 etc and 1812, 1547 etc are not in this form.
>
> thanks a million
>
> florence;
>
> On 10/6/05, Duncan Murdoch <murdoch@stats.uwo.ca> wrote:

On 10/6/2005 10:50 AM, Florence Combes wrote:

> a last question, and thanks a million for your patience and your
> explanations ...
>
> I tried with a df called "merged" and a column named "Pcc_0h_A" (which is
> numeric values):
>

> > length(as.vector(merged$Pcc_0h_A))
> [1] 12202
> > as.numeric(as.vector(merged$Pcc_0h_A)[1:10])
> [1] 12.276 11.958 14.098 13.843 12.451 11.745 NA NA NA NA
> > ord <- ordered(merged$Pcc_0h_A)
> > length(ord)
> [1] 12202
> > ord[1:10]
> [1] 12.276 11.958 14.098 13.843 12.451 11.745 <NA> <NA> <NA> <NA>

I can't tell what's going on here. Since you are only showing me converted values of each column (as.vector(), as.numeric(), ordered(), etc.) I can't tell what the original looked like.

A useful way to get an overview of a dataframe is to look at the results of three function calls:

head(merged)     # list the first few rows
str(merged)      # describe the structure of the dataframe
summary(merged)  # summarize the data in each of the columns
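As a rough sketch of what that overview looks like, here is a small made-up data frame standing in for "merged" (the real one is not shown in this thread, and the choice to store Pcc_0h_A as a factor is purely an assumption for illustration):

```r
# Hypothetical stand-in for "merged"; values and column types are assumptions.
merged <- data.frame(
  id = 1:6,
  Pcc_0h_A = factor(c("12.276", "11.958", "14.098", "13.843", NA, NA))
)

head(merged)     # first few rows, as printed
str(merged)      # column types: shows whether Pcc_0h_A is numeric or a factor
summary(merged)  # per-column summary, including a count of NA values

# class() on a single column answers the "what is it really?" question directly.
col_class <- class(merged$Pcc_0h_A)
```

The output of str() is usually the quickest way to see whether a column that prints like numbers is actually stored as a factor.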

Duncan Murdoch

> > length(as.numeric(merged$Pcc_0h_A))
> [1] 12202
> > as.numeric(merged$Pcc_0h_A[1:10])
> [1] 1812 1547 3308 3114 1960 1370 NA NA NA NA

On 10/6/2005 10:20 AM, Florence Combes wrote:

> > 2d I can't manage to deal with factors, so when I have some, I transform
> > them in vectors (with levels()), but I think I miss the power and utility
> > of the factor type ?
>
> levels() is not the conversion you want.
>
>
> in fact I use
> 'as.numeric(levels(f))[f]'
> (from the ?factor description)

That will only work if the levels have names that can be converted to
numbers. In the example below, the levels are "a" and "b", so you'll
get NA values if you try this.
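
A minimal sketch of that difference, using two made-up factors (f_num and f_chr are illustrative names, not objects from this thread). Calling as.numeric() directly on a factor returns the integer level codes, while as.numeric(levels(f))[f] recovers the values only when the level names look like numbers:

```r
# Factor whose level names can be read as numbers (made-up values).
f_num <- factor(c("12.276", "11.958", "14.098", "11.958"))

codes  <- as.numeric(f_num)                  # positions in levels(f_num), not the values
values <- as.numeric(levels(f_num))[f_num]   # the values the level names spell out

# Factor whose level names are not numbers, as in the "a"/"b" example.
f_chr <- factor(c("a", "b", "a"))

# The coercion warns and yields NA for every element.
converted <- suppressWarnings(as.numeric(levels(f_chr))[f_chr])
```

Here codes holds small integers that index into the sorted level names, which is why they need not resemble the level names at all.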
>
> That lists all the levels, but it doesn't tell you how they correspond to
> individual observations. For example,
>
> > df <- data.frame(x=1:3, y=c('a','b','a'))
> > df
> x y
> 1 1 a
> 2 2 b
> 3 3 a
> > levels(df$y)
> [1] "a" "b"
>
> If you need to convert back to character values, use as.character():
>
> > as.character(df$y)
> [1] "a" "b" "a"
>
>
> got it.
>
>
> 1. You can't compare the levels of a factor unless you declared it to
> be ordered:
>
> > df$y[1] > df$y[2]
> [1] NA
> Warning message:
> > not meaningful for factors in: Ops.factor(df$y[1], df$y[2])
>
> but
>
> > df$y <- ordered(df$y)
> > df$y[1] > df$y[2]
> [1] FALSE
>
> However, you need to watch out here: the comparison is done by the order
> of the factors
>
>
> I am sorry I don't understand this.
> here you compare the position of a in the factor and the position of b in
> the factor ?

It's the position of "a" in the levels() vector that is being compared.
I declared that the factor had ordered levels, and R interprets that
to mean that the first level is less than the second level, etc. This
is useful if you want to use meaningful names for ordered categories.
Comparison will be by the order of the categories, not by the name you
chose.
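
A small sketch of that point, with a made-up ordered factor (the names size, by_level and by_name are illustrative only). The levels are declared in the order small < medium < large, so comparisons follow that order even though it disagrees with alphabetical order:

```r
# Ordered factor with an explicit level order (made-up categories).
size <- factor(c("small", "large", "medium"),
               levels = c("small", "medium", "large"),
               ordered = TRUE)

positions <- as.integer(size)     # position of each observation's level

by_level <- size[1] < size[2]     # compares level positions: small (1) vs large (3)
by_name  <- "small" < "large"     # compares the strings themselves
```

With these levels by_level is TRUE while by_name is FALSE, since "l" sorts before "s".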

Duncan Murdoch

>
> , not an alphabetic comparison of their names:
>
> > levels(df$y) <- c("before", "after")
> > df
> x y
> 1 1 before
> 2 2 after
> 3 3 before
> > df$y[1] > df$y[2]
> [1] FALSE
>
>
> best regards,
>
> florence.

