Re: [Rd] [R] Semantics of sequences in R

From: Berwin A Turlach <berwin_at_maths.uwa.edu.au>
Date: Mon, 23 Feb 2009 19:27:57 +0800

On Mon, 23 Feb 2009 11:31:16 +0100
Wacek Kusnierczyk <Waclaw.Marcin.Kusnierczyk_at_idi.ntnu.no> wrote:

> Berwin A Turlach wrote:
> > On Mon, 23 Feb 2009 08:52:05 +0100
> > Wacek Kusnierczyk <Waclaw.Marcin.Kusnierczyk_at_idi.ntnu.no> wrote:
[...]
> >> and you mean that sort.list not being applicable to lists is a)
> >> good design, and b) something that by no means should be fixed,
> >> right?
> >
> > I neither said nor meant this and I do not see how what I said
> > could be interpreted in such a way. I was just commenting to
> > Stavros that the example he picked, hoping that it would not break
> > existing code, was actually a bad one which potentially will break
> > a lot (?) of existing code.
>
> would it, really? if sort.list were, in addition to sorting atomic
> vectors (can-be-considered-lists), able to sort lists, how likely
> would this be to break old code?

Presumably not very likely.

> can you give one concrete example, and suggest how to estimate how
> much old code would involve the same issue?

Check out the svn source of R, run configure, make whatever change you want to sort.list, then run "make" and "make check FORCE=FORCE". That should give you an idea of how much would break.

Additionally, you could try to install all CRAN packages with your modified version and see how many of them break when their examples/demos/&c are run.

AFAIK, Brian is doing something like this on his machine. I am sure that if you ask nicely he will share his scripts with you.

If this sounds too time consuming, you might just want to unpack the sources and grep for "sort.list" on all .R files; I am sure you know how to use find and grep to do this.
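
The same search can also be sketched from within R instead of with find and grep. This is only an illustration: find_sort_list() is a hypothetical helper, not part of R, and it assumes the sources have been unpacked into a directory that is passed in as root.

```r
# find_sort_list() is a hypothetical helper written for this sketch.
# It scans every .R file below `root` for the literal text "sort.list".
find_sort_list <- function(root, needle = "sort.list") {
  files <- list.files(root, pattern = "\\.R$", recursive = TRUE,
                      full.names = TRUE)
  hits <- lapply(files, function(f) {
    lines <- readLines(f, warn = FALSE)
    idx <- grep(needle, lines, fixed = TRUE)  # literal match, not a regex
    if (length(idx) == 0L) return(NULL)
    data.frame(file = f, line = idx, text = lines[idx],
               stringsAsFactors = FALSE)
  })
  hits <- Filter(Negate(is.null), hits)       # drop files without a match
  if (length(hits) == 0L) return(NULL)
  do.call(rbind, hits)
}
```

Called on the top-level directory of the unpacked sources, this returns one row per matching line (file, line number, text), or NULL if nothing matches.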

> > Also, until reading Patrick Burns' "The R Inferno" I was not aware
> > of sort.list. That function had not registered with me since I
> > hardly used it.
>
> which hints that "potentially will break a lot (?) of existing code"
> is a rather unlikely event.

Only for code that I wrote; other people's needs and knowledge of R may vary.

> > And I also have no need of calling sort() on lists. For me a
> > list is a flexible enough data structure such that defining a
> > sort() command for it makes no sense; it could only work in very
> > specific circumstances.
> >
>
> i don't understand the first part: "flexible enough data structure
> such that defining a sort() command for them makes no sense" makes no
> sense.

Lists are a very flexible structure whose components need not be of equal type. So how do you want to compare components? How do you compare a vector of numbers to a vector of character strings? Or a list of lists?

Or should the sorting be on the length of the components? Or their names? Or should sort(myList) sort each component of myList? But for that case we have already lapply(myList, sort).
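
To make the ambiguity concrete, here is a rough sketch of the candidate meanings listed above. The list myList and its contents are made up for illustration, and the remark about sort() raising an error reflects the R versions I am aware of rather than a guarantee.

```r
# Hypothetical example list with components of different types.
myList <- list(b = c(3, 1, 2), a = c("pear", "apple"), c = 10)

# Calling sort() directly on the list; the error (if any) is captured
# so that the script keeps running.
direct <- tryCatch(sort(myList), error = function(e) e)

# Meaning 1: sort within each component.
within_sorted <- lapply(myList, sort)

# Meaning 2: reorder the components by their names.
by_name <- myList[order(names(myList))]

# Meaning 3: reorder the components by their lengths.
by_length <- myList[order(vapply(myList, length, integer(1)))]
```

Each of the three reorderings is a defensible reading of "sort this list", and each gives a different result for the same input.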

> as to "it could only work in very specific circumstances" -- no, it
> would work for any list whatsoever, provided the user has a correctly
> implemented comparator. for example, i'd like to sort a list of
> vectors by the vectors' length -- is this a very exotic idea?

No, not if that is what you want. And I guess it is one way of sorting a list. The question is: what should the default be?
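
For what it is worth, sorting a list of vectors by length can already be sketched in a few lines on top of order(). Note that sort_list_by() is a hypothetical helper, not part of base R, and that it takes a key function (one number per component) rather than a pairwise comparator, which is a simplification of the comparator idea.

```r
# sort_list_by() is a hypothetical helper, not an existing R function.
# `key` must return a single number for each component of `x`.
sort_list_by <- function(x, key, decreasing = FALSE) {
  stopifnot(is.list(x), is.function(key))
  keys <- vapply(x, function(el) as.numeric(key(el)), numeric(1))
  x[order(keys, decreasing = decreasing)]
}

# Made-up example data: vectors of lengths 5, 2, 1 and 3.
vecs <- list(1:5, c("a", "b"), 42, letters[1:3])
sorted <- sort_list_by(vecs, length)
```

Here the caller has to say what the ordering criterion is, which is exactly the piece that a default sort() for lists would have to guess.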

> > BTW, as I mentioned once before, you might want to consider losing
> > that chip on your shoulder.
> >
>
> berwin, it's been a tradition on this list to discourage people from
> commenting on the design and implementation of r whenever they think
> it's wrong.

I am not aware of any such tradition and I subscribed to R-help on 15 April 1998.

The point is rather that by commenting alone one will not achieve much, in particular if the comments look more like complaints and the same comments are made again and again (along with dragging up previous comments or comments received on previous comments).

R is open source. Check out the svn version, fix what you consider needs fixing, submit a patch, convince R core that the patch fixes a real problem/is an improvement/does not break too much. Then you have a better chance of achieving something.

Alternatively, if it turns out that something that bugs you cannot be changed without breaking too much existing code, start from scratch with a better design. Apparently the GAP project (http://www.gap-system.org/) is doing something like this, as someone closely associated with that project once told me. While developing a version of GAP they collect information on how to improve the design, data structures &c; then, at some point, they start to write the next version from scratch.

> >> scary! it's much preferred to confuse new users.
> >
> > I usually learn a lot when I get confused about some issues/concept.
> > Confusion forces one to sit down, think deeply and, thus, gain some
> > understanding. So I am not so much concerned with new users being
> > confused. It is, of course, a problem if the new user never comes
> > out of his or her confusion.
>
> the problem, is, r users have to learn lots [...]

Indeed, and I guess in this age of instant gratification that that is a real bummer for new users.

Best,

        Berwin



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Mon 23 Feb 2009 - 10:31:31 GMT

