Re: [Rd] Expected behaviour of is.unsorted?

From: Duncan Murdoch <murdoch.duncan_at_gmail.com>
Date: Thu, 24 May 2012 09:57:29 -0400

On 24/05/2012 9:15 AM, Matthew Dowle wrote:
> Duncan Murdoch <murdoch.duncan_at_gmail.com> writes:
> >
> > On 12-05-24 7:39 AM, Matthew Dowle wrote:
> > > Duncan Murdoch <murdoch.duncan_at_gmail.com> writes:
> > >>
> > >> On 12-05-23 4:37 AM, Matthew Dowle wrote:
> > > Since it seems to have a bug anyway (and if so, can't be correct in anyone's
> > > use of it), could either is.unsorted on a data.frame return the error that's
> > > in the C code already: "only atomic vectors can be tested to be sorted", for
> > > safety and to lessen confusion, or be changed to return the natural
> > > expectation proposed above? The easiest quick fix would be to negate the
> > > result of the .gtn call of course, but then you could never go back.
> >
> > I don't follow the last sentence. If the .gtn call needs to be negated,
> > why would you want to go back?
>
> Because then is.unsorted(DF) would work, but go by row, which you guessed above
> wasn't intended and isn't sensible. But once it worked in that way, users might
> start to depend on it; e.g., by writing is.unsorted(t(DF)). If I came
> along in future and suggested that was inefficient and wouldn't it be more
> natural and efficient if is.unsorted(DF) went by column, returning the same as
> with(DF,is.unsorted(order(a,b))) but implemented efficiently, you would fear
> that user code now depended on it going by row and say it was too late. I'd
> persist and highlight that it didn't seem in keeping with the spirit of
> is.unsorted()'s speed since it short circuits on the first unsorted item, which
> is why we love it. You'd reply that's not documented. Which it isn't. And that
> would be the end of that.

Okay, I'm going to fix the handling of .gtn results, and document the unsuitability of this
function for dataframes and arrays.

Duncan Murdoch

>
> > Duncan Murdoch
> >
> > >
> > > Matthew
> > >
> > >> Duncan Murdoch
> > >>
> > >>>
> > >>> I understand why the first two are FALSE (1 item of anything must be
> > >>> sorted). I don't understand the 3rd and 4th cases where length is 2:
> > >>> do_isunsorted seems to call lang3(install(".gtn"), x, CADR(args)). Does
> > >>> that fall back to TRUE for some reason?
> > >>>
> > >>> Matthew
> > >>>
> > >>>> sessionInfo()
> > >>> R version 2.15.0 (2012-03-30)
> > >>> Platform: x86_64-pc-mingw32/x64 (64-bit)
> > >>>
> > >>> locale:
> > >>> [1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United
> > >>> Kingdom.1252
> > >>> [3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
> > >>> [5] LC_TIME=English_United Kingdom.1252
> > >>>
> > >>> attached base packages:
> > >>> [1] stats graphics grDevices utils datasets methods base
> > >>>
> > >>> other attached packages:
> > >>> [1] data.table_1.8.0
> > >>>
> > >>> loaded via a namespace (and not attached):
> > >>> [1] tools_2.15.0
> > >>>
> > >>> ______________________________________________
> > >>> R-devel_at_r-project.org mailing list
> > >>> https://stat.ethz.ch/mailman/listinfo/r-devel
> > >>
> > >>
> > >



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Thu 24 May 2012 - 14:00:25 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 24 May 2012 - 15:31:48 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel. Please read the posting guide before posting to the list.