Re: [Rd] Expected behaviour of is.unsorted?

From: Matthew Dowle <mdowle_at_mdowle.plus.com>
Date: Thu, 24 May 2012 11:39:10 +0000

Duncan Murdoch <murdoch.duncan <at> gmail.com> writes:
>
> On 12-05-23 4:37 AM, Matthew Dowle wrote:
> >
> > Hi,
> >
> > I've read ?is.unsorted and searched. Have found a few items but nothing
> > close, yet. Is the following expected?
> >
> >> is.unsorted(data.frame(1:2))
> > [1] FALSE
> >> is.unsorted(data.frame(2:1))
> > [1] FALSE
> >> is.unsorted(data.frame(1:2,3:4))
> > [1] TRUE
> >> is.unsorted(data.frame(2:1,4:3))
> > [1] TRUE
> >
> > IIUC, is.unsorted is intended for atomic vectors only (description of x in
> > ?is.unsorted). Indeed the C source (src/main/sort.c) contains an error
> > message "only atomic vectors can be tested to be sorted". So that is the
> > error message I expected to see in all cases above, since I know that
> > data.frame is not an atomic vector. But there is also this in
> > ?is.unsorted: "except for atomic vectors and objects with a class (where
> > the >= or > method is used)" which I don't understand. Where >= or > is
> > used by what, and where?
>
> If you look at the source, you will see that the basic test for classed
> objects is
>
> all(x[-1L] >= x[-length(x)])
>
> (in the function base:::.gtn).
>
> This comparison doesn't really make sense for dataframes, but it does
> seem to be backwards: that tests that x[2] >= x[1], x[3] >= x[2], etc.,
> returning TRUE if all comparisons are TRUE: but that sounds like it
> should be is.sorted(), not is.unsorted(). Or is it my brain that is
> backwards?
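
To make the direction of that comparison concrete, here is a minimal sketch on a plain numeric vector. The helper name pairwise_nondecreasing is hypothetical (it is not base R's own code); it only mirrors the expression quoted above.

```r
# Hypothetical helper mirroring the quoted expression; not the actual base:::.gtn.
# It compares each element with its predecessor: x[2] >= x[1], x[3] >= x[2], ...
pairwise_nondecreasing <- function(x) all(x[-1L] >= x[-length(x)])

sorted_vec   <- c(1, 2, 2, 5)
unsorted_vec <- c(1, 3, 2)

pairwise_nondecreasing(sorted_vec)    # TRUE: every comparison holds
pairwise_nondecreasing(unsorted_vec)  # FALSE: 2 >= 3 fails

# For atomic vectors, is.unsorted() gives the opposite answers:
is.unsorted(sorted_vec)    # FALSE
is.unsorted(unsorted_vec)  # TRUE
```

So the quoted expression is TRUE exactly when the input is sorted, and an "unsorted" test would need its negation.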

Thanks. Yes, you're right. So is.unsorted() on a data.frame seems to be trying to tell us whether any row is unsorted.

> DF = data.frame(a=c(1,3,5),b=c(1,3,5))
> DF
  a b
1 1 1               # this row is sorted
2 3 3               # this row is sorted
3 5 5               # this row is sorted

> is.unsorted(DF) # going by row but should be !.gtn
[1] TRUE
> with(DF,is.unsorted(order(a,b))) # most people's natural expectation I guess
[1] FALSE
> DF[2,2]=2
> DF
  a b
1 1 1               # this row is sorted
2 3 2               # this row isn't sorted
3 5 5               # this row is sorted

> is.unsorted(DF) # going by row but should be !.gtn
[1] FALSE
> with(DF,is.unsorted(order(a,b))) # most people's natural expectation I guess
[1] FALSE

Since it seems to have a bug anyway (and, if so, can't be giving correct results in anyone's use of it), could is.unsorted on a data.frame either return the error that's already in the C code, "only atomic vectors can be tested to be sorted", for safety and to lessen confusion, or be changed to return the natural expectation proposed above? The easiest quick fix would of course be to negate the result of the .gtn call, but then you could never go back.
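
As a rough sketch of the two behaviours being contrasted, the following can be run on the same data. Both helper names (rowwise_test and df_is_unsorted) are hypothetical and exist only for illustration; the first reuses the comparison quoted from .gtn, the second is one possible way to express the order()-based expectation, not a proposed patch.

```r
# Hypothetical helpers for illustration only; neither is part of base R.

# The comparison quoted from .gtn. For a data.frame, length(x) is the number
# of columns, so with two columns this compares column b with column a
# element by element, i.e. it checks b >= a within each row.
rowwise_test <- function(x) all(x[-1L] >= x[-length(x)])

# The order()-based expectation: are the rows already in non-decreasing
# order by all columns? unname() avoids column names being matched to
# order()'s own arguments.
df_is_unsorted <- function(df) is.unsorted(do.call(order, unname(as.list(df))))

DF <- data.frame(a = c(1, 3, 5), b = c(1, 3, 5))
rowwise_test(DF)     # TRUE: every row has b >= a
df_is_unsorted(DF)   # FALSE: rows are in order by (a, b)

DF[2, 2] <- 2
rowwise_test(DF)     # FALSE: row 2 has b < a
df_is_unsorted(DF)   # FALSE: rows are still in order by (a, b)

df_is_unsorted(DF[3:1, ])  # TRUE: rows reversed
```

The row-wise result changes when a single cell changes within a row, while the order()-based result depends only on the ordering of the rows.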

Matthew

> Duncan Murdoch
>
> >
> > I understand why the first two are FALSE (1 item of anything must be
> > sorted). I don't understand the 3rd and 4th cases where length is 2:
> > do_isunsorted seems to call lang3(install(".gtn"), x, CADR(args)). Does
> > that fall back to TRUE for some reason?
> >
> > Matthew
> >
> >> sessionInfo()
> > R version 2.15.0 (2012-03-30)
> > Platform: x86_64-pc-mingw32/x64 (64-bit)
> >
> > locale:
> > [1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United
> > Kingdom.1252
> > [3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
> > [5] LC_TIME=English_United Kingdom.1252
> >
> > attached base packages:
> > [1] stats graphics grDevices utils datasets methods base
> >
> > other attached packages:
> > [1] data.table_1.8.0
> >
> > loaded via a namespace (and not attached):
> > [1] tools_2.15.0
> >
> > ______________________________________________
> > R-devel <at> r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
>



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Thu 24 May 2012 - 11:45:14 GMT
