Re: [Rd] Expected behaviour of is.unsorted?

From: Matthew Dowle <mdowle_at_mdowle.plus.com>
Date: Thu, 24 May 2012 16:10:07 +0100

> On 24/05/2012 9:15 AM, Matthew Dowle wrote:

>> Duncan Murdoch<murdoch.duncan<at> gmail.com> writes:
>> >
>> > On 12-05-24 7:39 AM, Matthew Dowle wrote:
>> > > Duncan Murdoch<murdoch.duncan<at> gmail.com> writes:
>> > >>
>> > >> On 12-05-23 4:37 AM, Matthew Dowle wrote:
>> > > Since it seems to have a bug anyway (and if so, can't be correct in
>> > > anyone's use of it), could either is.unsorted on a data.frame return
>> > > the error that's in the C code already: "only atomic vectors can be
>> > > tested to be sorted", for safety and to lessen confusion, or be
>> > > changed to return the natural expectation proposed above? The easiest
>> > > quick fix would be to negate the result of the .gtn call of course,
>> > > but then you could never go back.
>> >
>> > I don't follow the last sentence. If the .gtn call needs to be negated,
>> > why would you want to go back?
>>
>> Because then is.unsorted(DF) would work, but go by row, which you guessed
>> above wasn't intended and isn't sensible. But once it worked in that way,
>> users might start to depend on it; e.g., by writing is.unsorted(t(DF)).
>> If I came along in future and suggested that was inefficient and wouldn't
>> it be more natural and efficient if is.unsorted(DF) went by column,
>> returning the same as with(DF,is.unsorted(order(a,b))) but implemented
>> efficiently, you would fear that user code now depended on it going by
>> row and say it was too late. I'd persist and highlight that it didn't
>> seem in keeping with the spirit of is.unsorted()'s speed since it short
>> circuits on the first unsorted item, which is why we love it. You'd reply
>> that's not documented. Which it isn't. And that would be the end of that.
>
> Okay, I'm going to fix the handling of .gtn results, and document the
> unsuitability of this function for dataframes and arrays.

But that leaves the door open to confusion later, whilst closing the door to a better solution: making is.unsorted() work by column for data.frame; i.e., making is.unsorted() _suitable_ for data.frame. If you just do the quick fix for the .gtn result, you can never go back. If making is.unsorted(DF) work by column is too hard for now, then it would be better to leave the door open by returning the error message already in the C code: "only atomic vectors can be tested to be sorted". That would be a better quick fix, since it leaves options for the future.
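
To make the two options concrete, here is a minimal sketch in R. Both helper names (is_unsorted_by_col and is_unsorted_strict) are hypothetical and not part of base R. The first reproduces the by-column result via order(), so it does not short-circuit on the first unsorted row the way is.unsorted() does on vectors. The second simply refuses non-atomic input with the error text quoted above. Neither is meant as the actual fix to the C code.

```r
# Hypothetical helper (not in base R): TRUE if the rows of df are NOT in
# non-decreasing order when compared column by column, left to right.
is_unsorted_by_col <- function(df) {
  stopifnot(is.data.frame(df))
  if (nrow(df) < 2L) return(FALSE)  # 0 or 1 rows are always sorted
  # order() leaves unresolved ties in their original order, so the rows are
  # already sorted exactly when the permutation it returns is 1..n.
  # unname() stops column names from matching order()'s own arguments
  # (for example a column called "decreasing").
  is.unsorted(do.call(order, unname(as.list(df))))
}

# Hypothetical helper (not in base R): fail loudly on anything that is not
# an atomic vector, using the message already present in the C code.
is_unsorted_strict <- function(x) {
  if (!is.atomic(x)) stop("only atomic vectors can be tested to be sorted")
  is.unsorted(x)
}

DF <- data.frame(a = c(1, 1, 2), b = c(1, 2, 1))
is_unsorted_by_col(DF)                 # FALSE: (1,1), (1,2), (2,1) is in order
is_unsorted_by_col(DF[c(2, 1, 3), ])   # TRUE: (1,2) now precedes (1,1)
```

The by-column sketch costs a full sort, which is exactly the inefficiency a proper implementation inside is.unsorted() would avoid; it is only meant to pin down what result "by column" would return.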

> Duncan Murdoch
>

>>
>> > Duncan Murdoch
>> >
>> > >
>> > > Matthew
>> > >
>> > >> Duncan Murdoch
>> > >>
>> > >>>
>> > >>> I understand why the first two are FALSE (1 item of anything must be
>> > >>> sorted). I don't understand the 3rd and 4th cases where length is 2:
>> > >>> do_isunsorted seems to call lang3(install(".gtn"), x, CADR(args))).
>> > >>> Does that fall back to TRUE for some reason?
>> > >>>
>> > >>> Matthew
>> > >>>
>> > >>>> sessionInfo()
>> > >>> R version 2.15.0 (2012-03-30)
>> > >>> Platform: x86_64-pc-mingw32/x64 (64-bit)
>> > >>>
>> > >>> locale:
>> > >>> [1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252
>> > >>> [3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
>> > >>> [5] LC_TIME=English_United Kingdom.1252
>> > >>>
>> > >>> attached base packages:
>> > >>> [1] stats graphics grDevices utils datasets methods base
>> > >>>
>> > >>> other attached packages:
>> > >>> [1] data.table_1.8.0
>> > >>>
>> > >>> loaded via a namespace (and not attached):
>> > >>> [1] tools_2.15.0
>> > >>>
>> > >>> ______________________________________________
>> > >>> R-devel<at> r-project.org mailing list
>> > >>> https://stat.ethz.ch/mailman/listinfo/r-devel
>> > >>
>> > >>
>> > >
>> >
>> >
>>
>
>

Received on Thu 24 May 2012 - 15:12:51 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 24 May 2012 - 16:51:41 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel. Please read the posting guide before posting to the list.
