Re: [R] Sort problem with merge (again)

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Tue 26 Sep 2006 - 06:34:24 GMT

On Mon, 25 Sep 2006, Bruce LaZerte wrote:

> # R version 2.3.1 (2006-06-01) Debian Linux "testing"
>
> # Is the following behaviour a bug, feature or just a lack of
> # understanding on my part? I see that this was discussed here
> # last March with no apparent resolution.

Reference? It is the third alternative. A factor is sorted by its integer codes, that is, in the order of its levels, not alphabetically by its labels: consider

> x <- factor(1:3, levels=as.character(3:1))
> x
[1] 1 2 3
Levels: 3 2 1
> sort(x)
[1] 3 2 1
Levels: 3 2 1
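
To make the codes visible, here is a small sketch built on the same x. The names codes, by_codes and by_labels are introduced only for illustration; as.integer() on a factor returns its codes, and ordering by those codes reproduces what sort(x) gives, while ordering by as.character(x) sorts on the labels instead.

```r
x <- factor(1:3, levels = as.character(3:1))

# The integer codes give the position of each value within levels(x).
codes <- as.integer(x)

# Ordering by the codes reproduces sort(x).
by_codes <- x[order(codes)]

# Ordering by the labels sorts on the character strings instead.
by_labels <- x[order(as.character(x))]

print(codes)
print(as.character(by_codes))
print(as.character(by_labels))
```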

and that is what is happening here: for your example the levels of df$Date are

> levels(df$Date)
[1] "1970-04-04" "1970-08-11" "1970-10-18" "1970-06-04" "1970-08-18"

so the result is sorted correctly, that is, in the order of those levels.
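
As a rough illustration, the sketch below rebuilds a factor with the levels printed above (f is a hypothetical stand-in for df$Date, not the merged column itself) and shows that sorting follows the level order. If a factor has to be kept, one possible alternative, assumed here rather than taken from the original post, is to rebuild it with sorted levels.

```r
# Levels copied from the levels(df$Date) output above.
lev <- c("1970-04-04", "1970-08-11", "1970-10-18", "1970-06-04", "1970-08-18")

# Hypothetical stand-in for df$Date: one value per level, same level order.
f <- factor(lev, levels = lev)

# Sorting follows the level order, so the result is not chronological.
sorted_by_levels <- as.character(sort(f))

# Rebuilding the factor with sorted levels makes the level order
# agree with the lexicographic (here also chronological) order.
g <- factor(f, levels = sort(levels(f)))
sorted_relevelled <- as.character(sort(g))

print(sorted_by_levels)
print(sorted_relevelled)
```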

If you want to sort a character column in lexicographic order, don't make it into a factor. Similarly for a date column: use class "Date".
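
A minimal sketch of that advice, reusing the data from the quoted example but storing the dates with as.Date() rather than as.factor(). The explicit order() call is there only so that the row order does not depend on what merge() does internally; the names df and dd follow the original post.

```r
# Same data as in the quoted example, with the dates stored as class "Date".
ch <- data.frame(Date = as.Date(c("1970-04-04", "1970-08-11", "1970-10-18")),
                 X = c(9, 10, 11))
sp <- data.frame(Date = as.Date(c("1970-06-04", "1970-08-11", "1970-08-18")),
                 Y = c(109, 110, 111))

df <- merge(ch, sp, all = TRUE, by = "Date")

# Order explicitly on the Date column; Date values compare chronologically.
dd <- df[order(df$Date), ]
print(dd)
```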

> d <- as.factor(c("1970-04-04","1970-08-11","1970-10-18"))
> x <- c(9,10,11)
> ch <- data.frame(Date=d,X=x)
>
> d <- as.factor(c("1970-06-04","1970-08-11","1970-08-18"))
> y <- c(109,110,111)
> sp <- data.frame(Date=d,Y=y)
>
> df <- merge(ch,sp,all=TRUE,by="Date")
> # the rows with dates missing all ch vars are tacked on the end.
> # the rows with dates missing all sp vars are sorted in with
> # the row with a date with vars from both ch and sp
> # is.ordered(df$Date) returns FALSE
>
> # The rows of df are not sorted as they should be as sort=TRUE
> # is the default. Adding sort=TRUE does nothing.
> # So try this:
> # dd <- df[order(df$Date),]
> # But that doesn't work.
> # Nor does sort(df$Date)
> # But sort(as.vector(df$Date)) does work.
> # As does order(as.vector(df$Date)), so this works:
> dd <- df[order(as.vector(df$Date)),]
> # ?????

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Tue Sep 26 16:40:55 2006

