Re: [Rd] validObject() -> slow down ?! [was "package:Matrix handling ..."]

From: John Chambers <jmc_at_r-project.org>
Date: Mon 10 Jul 2006 - 12:40:08 GMT

If you want to avoid checking your objects for validity, create them as default objects and then set the slots.

validObject() is only called if new() gets optional arguments, and it's not called when a slot is assigned (because in some cases an invalid intermediate object will exist between two slot assignments).

 > setClass("foo", representation(x="numeric"))
 > trace(validObject)
 > xx = new("foo", x=pi)
trace: validObject(.Object)
 > x = new("foo")
 > x@x = pi
 >  ## no call to validObject
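
To make the consequence concrete, here is a minimal sketch of the same pattern with a validity method attached. The class "nonNeg" and its rule are made up for illustration and are not taken from any package.

```r
library(methods)

# Hypothetical class: slot 'x' must be non-negative.
setClass("nonNeg", representation(x = "numeric"),
         validity = function(object) {
           if (any(object@x < 0)) "x must be non-negative" else TRUE
         })

# new() with a slot argument runs the validity method, so this signals an error.
res1 <- tryCatch(new("nonNeg", x = -1), error = function(e) "error")

# Default object plus slot assignment: the validity method is not run.
obj <- new("nonNeg")
obj@x <- -1

# The invalid object is only caught by an explicit check.
res2 <- tryCatch(validObject(obj), error = function(e) "error")
```

Slot assignment still checks that the value has the declared class of the slot; it is only the class's validity method that is skipped.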

Martin Maechler wrote:

>[Diverted from R-help to R-devel]
>
>>>>>> "roger" == roger koenker <roger@ysidro.econ.uiuc.edu>
>>>>>>     on Sun, 9 Jul 2006 12:31:16 -0500 writes:
>
> roger> On 7/8/06, Thaden, John J <ThadenJohnJ@uams.edu>
> roger> wrote:
>
> >> As there is nothing inherent in either compressed,
> >> sparse, format that would prevent recognition and
> >> handling of duplicated index pairs, I'm curious why the
> >> dgCMatrix class doesn't also add x values in those
> >> instances?
>
> roger> why not multiply them? or take the larger one, or
> roger> ...? I would interpret this as a case of user
> roger> negligence -- there is no "natural" default behavior
> roger> for such cases.
>
> roger> On Jul 9, 2006, at 11:06 AM, Douglas Bates wrote:
>
> >> Your matrix Mc should be flagged as invalid. Martin and
> >> I should discuss whether we want to add such a test to
> >> the validity method. It is not difficult to add the test
> >> but there will be a penalty in that it will slow down all
> >> operations on such matrices
>
>hmm, maybe "all operations" is slightly pessimistic.
>The issue seems to be *when* (under what exact circumstances)
>the 'validity' method for a class will be called, i.e., when the
>equivalent of validObject(<obj>) should be called automatically.
>
>We (those from R-core present) discussed this question a
>bit last summer in Seattle, and we had a proposal by Robert Gentleman,
>that this should both be better defined and documented and also
>slightly changed -- such that validObject() is called less
>frequently.
>
>IIRC, one consequence of that is the 'complete = FALSE' default
>that validObject() has got in the meantime. But I don't know
>about the other issue, of ensuring (or not) that validObject()
>is not called too often.
>
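
For readers unfamiliar with the 'complete' argument, the following sketch shows what the default 'complete = FALSE' leaves unchecked. The classes "innerPart" and "container" are hypothetical, chosen only to illustrate the difference.

```r
library(methods)

# Hypothetical classes: 'container' has a slot whose class has its own validity method.
setClass("innerPart", representation(x = "numeric"),
         validity = function(object) {
           if (any(object@x < 0)) "x must be non-negative" else TRUE
         })
setClass("container", representation(part = "innerPart"))

obj <- new("container")
obj@part@x <- -1   # slot assignment does not run the validity method

# Default (complete = FALSE): only the class of each slot is checked.
shallow <- tryCatch(validObject(obj), error = function(e) "error")

# complete = TRUE: validity is also checked recursively on the slots.
deep <- tryCatch(validObject(obj, complete = TRUE), error = function(e) "error")
```

With the default, the invalid nested slot goes unnoticed; the recursive check finds it, at the cost of visiting every slot.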
>I wonder if we should consider a new optional argument to
>new(..) [ well actually, initialize() ] :
>
>the default new(....., .check.validity = TRUE)
>would call {the equivalent of} validObject() after object
>creation, but one could always explicitly use
> new(....., .check.validity = FALSE)
>for fast "but dangerous" object creation.
>
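
The proposal above can be tried out at the level of a single class. This is only a sketch: '.check.validity' is not an argument of new() or of the default initialize() method in R, and the class "checked" is hypothetical.

```r
library(methods)

setClass("checked", representation(x = "numeric"),
         validity = function(object) {
           if (any(object@x < 0)) "x must be non-negative" else TRUE
         })

# Hypothetical initialize() method taking the proposed '.check.validity' flag.
# Only named slot arguments are handled; unnamed arguments are ignored here.
setMethod("initialize", "checked",
          function(.Object, ..., .check.validity = TRUE) {
            args <- list(...)
            # Direct slot assignment does not call validObject().
            for (nm in names(args)) slot(.Object, nm) <- args[[nm]]
            if (.check.validity) validObject(.Object)
            .Object
          })

ok   <- new("checked", x = 1)
fast <- new("checked", x = -1, .check.validity = FALSE)  # invalid, but unchecked
slow <- tryCatch(new("checked", x = -1), error = function(e) "error")
```

The flag is passed through new() to initialize() via '...', so the caller decides per object whether to pay for the check.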
> >> and I'm not sure if we want to pay that price to catch a
 > >> rather infrequently occurring problem.
>
> roger> Elaborating the validity procedure to flag such
> roger> instances seems to be well worth the speed penalty in
> roger> my view. Of course, anticipating every such misstep
> roger> imposes a heavy burden on developers and constitutes
> roger> the real "cost" of more elaborate validity checking.
>
>At the moment I tend to agree with Roger that we (Matrix
>authors) should try to add more stringent testing even at some
>cost --- particularly if that penalty would only occur at object
>creation time. One important "use case" of our sparse matrices
>of course are lmer() calls. They shouldn't become noticeably slower.
>
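
As an illustration of the kind of test under discussion, here is a base-R sketch that detects duplicated (row, column) pairs in compressed column storage. It assumes 0-based row indices 'i' and column pointers 'p', as in the dgCMatrix slots; the function name is made up, and this is not the Matrix package's validity method.

```r
# Return TRUE if a compressed-column matrix stores some (row, column) pair twice.
# 'i': 0-based row indices of the stored entries.
# 'p': 0-based column pointers, of length ncol + 1.
has_dup_entries <- function(i, p) {
  # Expand the column pointers into one column index per stored entry.
  j <- rep.int(seq_len(length(p) - 1L), diff(p))
  anyDuplicated(cbind(i, j)) > 0L
}

# Two columns: column 1 stores row 0 twice, column 2 stores row 1 once.
dup   <- has_dup_entries(i = c(0L, 0L, 1L), p = c(0L, 2L, 3L))

# Same layout, but the entries in column 1 are in distinct rows.
clean <- has_dup_entries(i = c(0L, 1L, 1L), p = c(0L, 2L, 3L))
```

The check touches every stored entry, which is the penalty mentioned above; placed in a validity method, it would be paid only when validObject() actually runs.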
> roger> [My 2cents based on experience with SparseM.]
>
>Martin

R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Mon Jul 10 22:43:09 2006