From: Duncan Murdoch <murdoch_at_stats.uwo.ca>

Date: Fri 07 Oct 2005 - 00:57:45 EST

On 10/6/2005 10:50 AM, Florence Combes wrote:
> a last question, and thanks a million for your patience and your
> explanations ...
>
> I tried with a df called "merged" and a column named "Pcc_0h_A" (which is
> numeric values):
>
>> length(as.vector(merged$Pcc_0h_A))
> [1] 12202
>> as.numeric(as.vector(merged$Pcc_0h_A)[1:10])
> [1] 12.276 11.958 14.098 13.843 12.451 11.745 NA NA NA NA
>> ord<-ordered(merged$Pcc_0h_A)
>> length(ord)
> [1] 12202
>> ord[1:10]
> [1] 12.276 11.958 14.098 13.843 12.451 11.745 <NA> <NA> <NA> <NA>
> 5386 Levels: 10.001 < 10.002 < 10.003 < 10.005 < 10.006 < 10.010 < ... < 9.999
>
> here I have <NA> instead of NA because ord is a factor and the notation is
> different ?

I can't tell what's going on here. Since you are only showing me converted
values of each column (as.vector(), as.numeric(), ordered(), etc.) I can't
tell what the original looked like.

A useful way to get an overview of a dataframe is to look at the results of
three function calls:

head(merged)     # list the first few rows
str(merged)      # describe the structure of the dataframe
summary(merged)  # summarize the data in each of the columns.

Duncan Murdoch

>
>> length(as.numeric(merged$Pcc_0h_A))
> [1] 12202
>> as.numeric(merged$Pcc_0h_A[1:10])
> [1] 1812 1547 3308 3114 1960 1370 NA NA NA NA
>
> are these the levels names converted into numbers ? I don't think because
> levels are like 10.001, 10.002 etc and 1812, 1547 etc are not in this form.
>
> thanks a million
>
> florence;
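
One possible reading of the numbers above, offered as a sketch and not as a diagnosis of the real data: the string-sorted levels (10.001 < ... < 9.999) suggest that Pcc_0h_A is stored as a factor, in which case as.numeric() returns the internal integer codes (positions in levels()) and not the values the labels spell out. The data frame merged_toy below is hypothetical and only stands in for merged; its values are borrowed from the output and levels shown above.

```r
# Hypothetical stand-in for the real column: numbers held as factor labels.
# factor() is called explicitly so the example does not depend on the
# stringsAsFactors default of the R version in use.
merged_toy <- data.frame(
  Pcc_0h_A = factor(c("12.276", "11.958", "9.999", "10.001", NA))
)

str(merged_toy)      # reports a Factor column, not a num column

f <- merged_toy$Pcc_0h_A
print(levels(f))     # labels are sorted as strings, so "9.999" comes last

codes   <- as.numeric(f)                # integer codes: positions in levels(f)
values  <- as.numeric(as.character(f))  # the numbers the labels spell out
values2 <- as.numeric(levels(f))[f]     # same result, the idiom from ?factor

print(codes)
print(values)

# An ordered factor compares by level position, not by numeric value:
# "9.999" is the last level here, so it ranks above "12.276".
ord <- ordered(f)
print(ord[3] > ord[1])
```

If str(merged) shows a factor where a numeric column was expected, converting with as.numeric(as.character(x)) or as.numeric(levels(x))[x] recovers the values; why the column became a factor in the first place can only be settled by looking at the original data.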

> On 10/6/05, Duncan Murdoch <murdoch@stats.uwo.ca> wrote:
>>
>> On 10/6/2005 10:20 AM, Florence Combes wrote:
>> >> > > 2d I can't manage to deal with factors, so when I have some, I transform
>> >> > > them in vectors (with levels()), but I think I miss the power and utility of
>> >> > > the factor type ?
>> >> >
>> >> > levels() is not the conversion you want.
>> >
>> >
>> > in fact I use
>> > 'as.numeric(levels(f))[f]'
>> > (from the ?factor description)
>>
>> That will only work if the levels have names that can be converted to
>> numbers. In the example below, the levels are "a" and "b", so you'll
>> get NA values if you try this.
>> >
>> > That lists all the levels, but
>> >> > it doesn't tell you how they correspond to individual observations. For
>> >> > example,
>> >> >
>> >> > > df <- data.frame(x=1:3, y=c('a','b','a'))
>> >> > > df
>> >> >   x y
>> >> > 1 1 a
>> >> > 2 2 b
>> >> > 3 3 a
>> >> > > levels(df$y)
>> >> > [1] "a" "b"
>> >> >
>> >> > If you need to convert back to character values, use as.character():
>> >> >
>> >> > > as.character(df$y)
>> >> > [1] "a" "b" "a"
>> >
>> >
>> > got it.
>> >
>> >
>> >> > 1. You can't compare the levels of a factor unless you declared it to
>> >> > be ordered:
>> >> >
>> >> > > df$y[1] > df$y[2]
>> >> > [1] NA
>> >> > Warning message:
>> >> > > not meaningful for factors in: Ops.factor(df$y[1], df$y[2])
>> >> >
>> >> > but
>> >> >
>> >> > > df$y <- ordered(df$y)
>> >> > > df$y[1] > df$y[2]
>> >> > [1] FALSE
>> >> >
>> >> > However, you need to watch out here: the comparison is done by the order
>> >> > of the factors
>> >
>> >
>> > I am sorry I don't understand this.
>> > here you compare the position of a in the factor and the position of b in
>> > the factor ?
>>
>> It's the position of "a" in the levels() vector that is being compared.
>> I declared that the factor had ordered levels, and R interprets that
>> to mean that the first level is less than the second level, etc. This
>> is useful if you want to use meaningful names for ordered categories.
>> Comparison will be by the order of the categories, not by the name you
>> chose.
>>
>> Duncan Murdoch
>>
>> >
>> > , not an alphabetic comparison of their names:
>> >> >
>> >> > > levels(df$y) <- c("before", "after")
>> >> > > df
>> >> >   x      y
>> >> > 1 1 before
>> >> > 2 2  after
>> >> > 3 3 before
>> >> > > df$y[1] > df$y[2]
>> >> > [1] FALSE
>> >
>> >
>> > best regards,
>> >
>> > florence.
>> >
>> >> >

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Fri Oct 07 01:07:04 2005

This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:40:38 EST