Re: [Rd] [R] Semantics of sequences in R

From: Wacek Kusnierczyk <Waclaw.Marcin.Kusnierczyk_at_idi.ntnu.no>
Date: Mon, 23 Feb 2009 11:31:16 +0100

Berwin A Turlach wrote:
> On Mon, 23 Feb 2009 08:52:05 +0100
> Wacek Kusnierczyk <Waclaw.Marcin.Kusnierczyk_at_idi.ntnu.no> wrote:
>
>

>> Berwin A Turlach wrote:
>>     
>>> G'day Stavros,
>>>       
>> <snip>
>>     
>>>> In many cases, the orthogonal design is pretty straightforward.
>>>> And in the cases where the operation is currently an error (e.g.
>>>> sort(list(...))), I'd hope that wouldn't break existing code. [...]
>>>>     
>>>>         
>>> This could actually be an example that would break a lot of existing
>>> code.
>>>
>>> sort is a generic function, and for sort(list(...)) to work, it
>>> would have to dispatch to a function called sort.list; and as
>>> Patrick Burns' "The R Inferno" points out, such a function exists
>>> already and it is not for sorting list.  
>>>   
>>>       
>> and you mean that sort.list not being applicable to lists is a) good
>> design, and b) something that by no means should be fixed, right?
>>     
>

> I neither said nor meant this and I do not see how what I said could be
> interpreted in such a way. I was just commenting to Stavros that the
> example he picked, hoping that it would not break existing code, was
> actually a bad one which potentially will break a lot (?) of existing
> code.
>

would it, really? if sort.list were, in addition to sorting atomic vectors (can-be-considered-lists), able to sort lists, how likely would this be to break old code? can you give one concrete example, and suggest how to estimate how much old code would involve the same issue?

sort.list, to be applied to an atomic vector, must be called explicitly on the vector, because calling sort will not automatically dispatch to sort.list (right?). so allowing sort.list to sort lists does not change anything in this respect. the one exception is that, as i suggested, if sort.list required an explicit comparator, you'd have to add one wherever sort.list is called; but to accommodate old code, sort.list could check whether the argument is an atomic vector and require the comparator only when it is not.
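
a rough sketch of that backward-compatible check, under stated assumptions: sort_list_compat and its cmp argument are hypothetical names, not anything in base r. atomic input is passed straight to base::sort.list and yields the usual ordering permutation; only list input needs a comparator, applied here through a plain insertion sort over the indices.

```r
# Hypothetical wrapper, not part of base R: keeps the existing behaviour for
# atomic vectors and only asks for a comparator when given a list.
sort_list_compat <- function(x, cmp = NULL, ...) {
  if (is.atomic(x)) {
    # Old code path, unchanged: returns an ordering permutation.
    return(base::sort.list(x, ...))
  }
  if (!is.function(cmp)) {
    stop("a comparator is required to order a list")
  }
  # Insertion sort over indices; cmp(a, b) must return TRUE exactly when
  # a should come before b. Elements that compare equal keep their order.
  idx <- seq_along(x)
  for (i in seq_along(idx)[-1]) {
    j <- i
    while (j > 1 && cmp(x[[idx[j]]], x[[idx[j - 1]]])) {
      idx[c(j - 1, j)] <- idx[c(j, j - 1)]
      j <- j - 1
    }
  }
  idx
}

x <- c(3, 1, 2)
sort(x)               # the sorted values
sort_list_compat(x)   # the ordering permutation, same as order(x)
```

existing calls on atomic vectors would not need to change under this design; the insertion sort is only there to keep the sketch short and is quadratic in the length of the list.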

how much old code could be relying on the fact that sort.list raises an error when given a list? i suspect it's fairly unlikely that any single piece of code does; and if so, allowing sort.list to sort lists would not change anything here either.

> Also, until reading Patrick Burns' "The R Inferno" I was not aware of
> sort.list. That function had not registered with me since I hardly
> used it.

which hints that "potentially will break a lot (?) of existing code" is a rather unlikely event.

> And I also have no need of calling sort() on lists. For me a
> list is a flexible enough data structure such that defining a sort()
> command for them makes no sense; it could only work in very specific
> circumstances.
>

i don't understand the first part: "flexible enough data structure such that defining a sort() command for them makes no sense" makes no sense.

as to "it could only work in very specific circumstances" -- no, it would work for any list whatsoever, provided the user has a correctly implemented comparator. for example, i'd like to sort a list of vectors by the vectors' length -- is this a very exotic idea?
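
a minimal sketch of the length example, assuming a key function is an acceptable stand-in for a full comparator. sort_list_by is a hypothetical helper, not a base r function; it computes one numeric key per element and reorders the list with order().

```r
# Hypothetical helper: reorder a list by a numeric key computed per element.
sort_list_by <- function(x, key) {
  # vapply() insists that every key is a single number.
  keys <- vapply(x, key, numeric(1))
  x[order(keys)]
}

cc <- list(a = runif(4), b = rnorm(6), c = 1:2)
names(sort_list_by(cc, length))   # "c" "a" "b"
```

the same helper works with any per-element key, e.g. function(v) max(v), as long as the key returns exactly one number.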

>>> In fact, currently you get:
>>>
>>> R> cc <- list(a=runif(4), b=rnorm(6))
>>> R> sort(cc)
>>> Error in sort.list(cc) : 'x' must be atomic for 'sort.list'
>>> Have you called 'sort' on a list?
>>>   
>>>       
>> one of the most funny error messages you get in r.  note also that,
>> following rolf turner's lists and vectors unproven theorem, a vector
>> can be considered a list 
>>     
>

> I do not remember the exact context of Rolf's comments, but I believe
> he was talking in a more general sense and not in technical terms.

indeed, he was blurring the concepts instead of referring to concrete documentation with clearly specified meanings of the terms he used.

> I
> find it perfectly valid, even when talking about R, to say something
> like "vectors are stored as a list of numbers in consecutive memory
> locations in memory".

yes; and you can always say that 'vectors can be considered electrical charges', or better, 'vectors can be considered electrical charges, in some sense'.

what sense of 'list' are you using here? i'd rather use the term 'array', unless confusing the user is the real purpose. (and to be really picky, you do not store numbers.)

> Clearly, in a phrase like this, we are not
> talking about "vectors" and "list" as defined by the "R Language
> Definition" or "R Internals", or what functions like is.vector(),
> is.list() &c return for various R objects.
>

clearly, you can say anything you like, and then add 'i was not talking about x as defined by y'. the art is to talk about x as defined by y.
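
for reference, a small sketch of what the predicates mentioned above report in r's own technical sense; v and l are throwaway example objects, nothing more.

```r
v <- c(1.5, 2.5)      # an atomic (double) vector
l <- list(1.5, 2.5)   # a list

is.vector(v)   # TRUE
is.list(v)     # FALSE
is.atomic(v)   # TRUE

is.vector(l)   # TRUE
is.list(l)     # TRUE
is.atomic(l)   # FALSE
```

so, as far as these predicates go, a list counts as a vector, while an atomic vector does not count as a list.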

> BTW, as I mentioned once before, you might want to consider to lose
> these chips on your shoulders.
>

berwin, it's been a tradition on this list to discourage people from commenting on the design and implementation of r whenever they think it's wrong. you really should be doing the opposite. as a chinese proverb says, a gem cannot be polished without friction. friction seems to be what you fear a lot.

>> -- hence sort.list should raise the error on any vector input, no?
>>     
>

> You will have to take that up with the designers of sort.list.
>
>
>>> Thus, to make sort(list()) work, you would have to rename the
>>> existing sort.list and then change every call to that function to
>>> the new name. I guess this might break quite a few packages on CRAN.
>>>   
>>>       
>> scary!  it's much preferred to confuse new users.
>>     
>

> I usually learn a lot when I get confused about some issues/concept.
> Confusion forces one to sit down, think deeply and, thus, gain some
> understanding. So I am not so much concerned with new users being
> confused. It is, of course, a problem if the new user never comes out
> of his or her confusion.
>

the problem is, r users have to learn lots and lots of *bad* and *messy* design to get up and running without devils catching them around every corner. in principle, you're absolutely right; the problem lies in the amount of effort a user has to make to avoid confusion while using r (where 'using' means a bit more than simply fitting and plotting a model).

cheers,
vQ



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Mon 23 Feb 2009 - 09:33:20 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 23 Feb 2009 - 12:30:38 GMT.
