Re: [Rd] setdiff bizarre

From: Stavros Macrakis <>
Date: Tue, 02 Jun 2009 15:11:02 -0400

On Tue, Jun 2, 2009 at 1:18 PM, William Dunlap <> wrote:

> %in% is a thin wrapper on a call to match().

Yes, as I mentioned in my email, all this is clearly documented in ?match.

> match() is not a generic function (and is not documented to be one),
> so it treats data.frames as lists, as their underlying representation is a
> list of columns.

Yes, I understand that this is the proximal cause of the current strange behavior. What I don't understand is why the current behavior is a good idea.
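
For concreteness, here is a small sketch of what "treats data.frames as lists" means in practice. The data frame df is made up for illustration, and stringsAsFactors = FALSE is set only so that the example does not depend on the R version's default.

```r
# Illustrative data frame (not from the original thread).
df <- data.frame(x = c(1, 2, 3), y = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

is.list(df)   # a data.frame is a list of columns
length(df)    # the number of columns, not the number of rows

# match() therefore works column by column: the result has one
# entry per column of df, not one entry per row.
res <- match(df, df)
length(res)
```

So df %in% df answers "is each column in the table?", not "is each row in the table?", which is what makes setdiff() on data frames look so odd.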

> match is documented to convert lists to character and to then run
> the character version of match on that character data

Yes, this peculiar behavior is documented. What I don't get is its rationale.
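
To show what I mean by peculiar, here is a minimal sketch of the list-to-character conversion. The values are made up; the point is only that a list element ends up compared by its deparsed text.

```r
# A one-element list holding a numeric vector (illustrative values).
lst <- list(c(1, 2, 3))

# The conversion match() relies on turns the element into its
# deparsed source text.
txt <- as.character(lst)
txt

# As a consequence, the list element "matches" a string that merely
# looks like its deparsed form.
hit <- match(lst, "c(1, 2, 3)")
hit
```

That a numeric vector inside a list can match a character string spelling out its source code is the part whose rationale I don't see.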

> match does not bail out if the types of the x and table arguments don't
> match (that would be undesirable in the integer/numeric mismatch case).

Why would it 'bail out'?
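
The integer/numeric case is indeed the uncontroversial one. A quick sketch with illustrative values shows the coercion to a common type that everyone presumably wants to keep:

```r
# An integer looked up in a double vector: both are coerced to a
# common type, so the lookup succeeds.
pos <- match(1L, c(1, 2, 3))
pos

found <- 2L %in% c(1, 2, 3)
found

# The same coercion rule also covers character vs. numeric, where
# the numeric side is converted to character.
pos_chr <- match("1", 1)
pos_chr
```

Coercing atomic vectors like this seems sensible; what is in question is extending the same rule to lists and data frames.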

> The related functions, duplicated() and unique(), do have
> row-wise data.frame methods. E.g.,
> > duplicated(data.frame(x=c(1,2,2,3,3),y=letters[c(1,1,2,2,2)]))
> Perhaps match() ought to have one also....
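
Spelling out the quoted example: the rows of that data frame are (1,a), (2,a), (2,b), (3,b), (3,b), so only the last row repeats an earlier one. The variable names d, dup and uniq below are just for illustration.

```r
# The data frame from the quoted example.
d <- data.frame(x = c(1, 2, 2, 3, 3), y = letters[c(1, 1, 2, 2, 2)])

# duplicated() compares whole rows: one logical value per row.
dup <- duplicated(d)
dup    # FALSE FALSE FALSE FALSE TRUE

# unique() drops the repeated row, leaving four rows.
uniq <- unique(d)
nrow(uniq)
```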

> I think that %in% and is.element() ought to remain calls to match()
> and that if you want them to work row-wise on data.frames then
> match should get a data.frame method.

After all that, it sounds like we agree...!
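
As a rough sketch of what row-wise matching could look like, here is a hypothetical helper. The name match_rows and the key-pasting approach are my own assumptions for illustration, not an existing base R function and not a proposal for the actual method. It assumes both data frames have the same column names and that no value contains the "\r" separator.

```r
# Hypothetical helper: match rows of x against rows of table.
match_rows <- function(x, table) {
  stopifnot(is.data.frame(x), is.data.frame(table),
            identical(names(x), names(table)))
  # Build one character key per row by pasting the columns together.
  # unname() keeps column names from being read as paste() arguments.
  key <- function(d) do.call(paste, c(unname(as.list(d)), sep = "\r"))
  match(key(x), key(table))
}

a <- data.frame(x = c(1, 2, 3), y = c("a", "b", "b"),
                stringsAsFactors = FALSE)
b <- data.frame(x = c(2, 3, 9), y = c("b", "b", "a"),
                stringsAsFactors = FALSE)

idx <- match_rows(a, b)
idx                            # NA 1 2

# A row-wise setdiff then falls out of the unmatched rows.
only_in_a <- a[is.na(idx), , drop = FALSE]
only_in_a
```

Pasting rows into strings is only one possible design; it inherits the usual caveats of converting numbers to character, so a real method might want to compare columns directly.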


Received on Tue 02 Jun 2009 - 19:32:14 GMT
