Re: [Rd] [R] S4 method dispatch matrixOrArray (fwd)

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Fri 14 Apr 2006 - 14:02:59 GMT

Right. That does not solve the whole problem (as I pointed out) but was just a suggestion intended to be reasonably consistent with how it works now. The best would be if the search order were a function only of the relationships among the classes and not dependent on the setMethod order.

On 4/14/06, John Chambers <jmc@r-project.org> wrote:
> Gabor Grothendieck wrote:
> I think a rule is needed that would require that a class not succeed any
> of
its parents on the method list. That way "array" could never come after
> "vector" and, in general, a parent would never be found prior
to any of its
> descendents.

>
> A good idea & it would help, we can certainly do something like this
> (probably for 2.4, not 2.3). But so long as the inheritance search is
> greedy, a "wrong" method can still be chosen.
>
> Suppose, say, class A contains B and C, and C contains D. Function foo has
> methods for B and D. We would agree, I think, that foo(a) where a is from
> class A should get the B method (parent) rather than the D method
> (grandparent). But classess B and D are unrelated so there's no obvious
> ordering.
>
> My preference would be for a full search. The ordering you suggest would at
> least help avoid some branches of the search. If we can manage at some
> point to re-implement important parts of the methods package, in C and more
> built-in, that should save much more computing time than the full search
> would add.
>
> Of course that still does not address the problem that the order
of the
> setMethod's rather than the relationships among the classes
can influence
> the search order but it would eliminate the situation
we saw here.

On
> 4/13/06, John Chambers <jmc@r-project.org> wrote:

> What a delicious example! (I'm taking the liberty of sharing it
> with
r-devel, since it raises some good issues.)

You have two questions,
> presumably:

1 - how could the order of the setMethod calls make a
> difference in the

results?
2 - what's causing the infinite loop & how could
> it be avoided, reliably?

Second question first. The danger sign is the
> "vector" method:

setMethod("foo",
 signature(A = "vector"),
 function(A,
> ...) {

 foo(matrix(A, nrow = 1), ...)
 })

This handles a vector by
> recalling the generic with a matrix. But "vector"
is a superclass of
> "matrix" (see getClass("matrix")) and you DON'T have an
explicit method
> defined for "matrix". So for sure if "vector" was the only
method, you would
> be in immediate trouble. Coercing down the inheritance
tree is potentially
> bad news.

Generally, if you're going to take over dispatch by coercing an
> argument &

then recalling the generic, the result is simplest if you have an
> exact

match for the new call, not relying on inheritance. After all, if
> you're

doing the work to coerce the argument, might as well pick one that
> works

right away. Not a universal rule, but other things being equal
> ...

There's an additional issue with methods for "matrix" and "array"
> because R

never allows 2-way arrays to have class "array", which means using
> array()

here with the same dimensions would not have helped. Also, matrix
> and array

objects are weird in that they are not basic vectors but also have
> no class

attribute, and is.object() is FALSE for them.

More interesting
> though--how can the order of the setMethod() calls matter?
To see that we
> need to look at the methods list object. (The discussion is
simplified to
> the case that only one argument is involved, which doesn't
affect the
> result.)

The MethodsList object has a slot containing a named list of
> methods, with

the names being those of the classes that appeared in the
> setMethod calls,

in the order that the calls occurred(note!). All classes
> are essentially

equal from the view of the generic function, so there's no
> ordering favoring

the "more relevant".

When method dispatch occurs, the
> code first looks for an exact match to the
class of the actual
> argument--that's a quick search in the names of the
list.

If the direct
> search fails, the code now looks for an inherited method. The
key point is
> that this second search is "greedy"--the first inherited method
found is
> used.

NOW it should be clear why the order of the setMethod() calls
> matters. You

have two potential inherited methods here for "matrix"; namely,
> "array" and

"vector". WE know that the "array" method is `closer', and the R
> dispatcher

could decide this also, if it were willing to look through all
> possible

inheritance paths and drop one possibility if a better one was
> found.

It currently doesn't do any further search, and doing so would be a
> modest

efficiency hit. I'm inclined to think the cost would be worth it
> to

eliminate unpleasant suprises like this one, but opinions may differ.
> (Once

found, the inherited method is stored directly in the list used for
> the

first lookup, so the hit is only the first time a particular
> signature

turns up.)

To make the difference clearer, I added to your
> example another generic

"bar" with the same methods as "foo", but with the
> order of the setMethod

calls reversed.
By looking at the "methods" slot in
> the two cases, we can see why the bad
("vector") method is selected for
> bar() but not for foo():

> names(getMethods("foo")@methods)

> [1] "array" "vector"

> names(getMethods("bar")@methods)

> [1] "vector" "array"

After running foo(1:10) and trying to run bar(1:10):

> showMethods("foo")

> Function "foo":

 A = "array"
 A = "vector"
 A = "integer"

 (inherited from A
> = "vector")

 A = "matrix"
 (inherited from A = "array")

> showMethods("bar")

> Function "bar":

 A = "vector"
 A = "array"
 A = "integer"

 (inherited from A
> = "vector")

 A = "matrix"
 (inherited from A = "vector")

But including
> setMethod("bar", "matrix", ...) in the source code makes bar()
work
> fine.

Paul Roebuck wrote:
Sorry to bother but could you shed some light
> on this?

> I don't understand

> why order of setMethod calls makes

> any difference. Since it obviously does,

> it has shaken

> the foundations of what I thought I understood about
S4

> methods. Even Gabor was surprised...

>

---------- Forwarded message

> ----------

> Date: Wed, 12 Apr 2006 18:24:46 -0400
From: Gabor Grothendieck

> <ggrothendieck@gmail.com>
To: Paul Roebuck <roebuck@mdanderson.org> Cc: R
> Help Mailing List

<r-help@stat.math.ethz.ch> Subject: Re: [R] S4 method
> dispatch matrixOrArray

On 4/12/06, Paul Roebuck <roebuck@mdanderson.org>
> wrote:

On Wed, 12 Apr 2006, Gabor Grothendieck wrote:

>

> On 4/12/06, Paul Roebuck <roebuck@mdanderson.org> wrote:

>

> I have some code where the primary dispatching is on

> other parameters so I'd

> like not to have to create a

> set of functions for "matrix" and another

> duplicate

> set for "array". But the class union technique isn't
working as

> implemented below and I don't have my Green

> book with me. How do I fix my

> infinite recursion

problem?

>

##--------------------------------------------------------
library(methods)

setGeneric("foo",
> function(A, ...) {

> cat("generic", match.call()[[1]], "\n")

> standardGeneric("foo")

> })

setMethod("foo",
 signature(A = "vector"),

> function(A, ...) {

> callGeneric(matrix(A, nrow = 1), ...)

> })

> setClassUnion("matrixOrArray", c("matrix", "array"))

setMethod("foo",

> signature(A = "matrixOrArray"),

> function(A, ...) {

 cat("A =", A, "\n")

> })

> ## Test

foo(1:4)
foo(matrix(1:4, 1, 4))
foo(array(1:4, c(1, 4, 1)))


> I think its good enough to just define an array method, i.e. you

> don't need

> the matrix method or the matrixOrArray class, and the

> vector method can call

> foo(matrix(A,1), ...) so:

> setGeneric("foo",

 function(A, ...) {

> cat("generic", match.call()[[1]], "\n")

> standardGeneric("foo")

> })

> setMethod("foo",

 signature(A = "array"),
 function(A, ...) {
 cat("A

> =", A, "\n")

> })

setMethod("foo",
 signature(A = "vector"),
 function(A,

> ...) {

> foo(matrix(A, nrow = 1), ...)

 })

> Something didn't seem right here. That was pretty close

> to what I had

> started with, before trying to go the

> classUnion route. Matter of fact, the

> vector method can

> retain use of callGeneric.

The solution has to do with

> the order in which calls to

> setMethod are made. Adding foo-vector after

> foo-array

> works fine; the other way around causes infinite recursion.

> This is surprising. I would have thought that the

> parent/child relationships

> determine the order that

> dispatched methods are invoked, not the order

> that

> the setMethod commands are issued in.

> ______________________________________________
R-devel@r-project.org
> mailing
> list

https://stat.ethz.ch/mailman/listinfo/r-devel

>



R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel Received on Sat Apr 15 00:34:16 2006

This archive was generated by hypermail 2.1.8 : Fri 14 Apr 2006 - 18:17:07 GMT