Re: [Rd] S4 accessors

From: Henrik Bengtsson <hb_at_stat.berkeley.edu>
Date: Wed 27 Sep 2006 - 22:00:06 GMT

On 9/27/06, John Chambers <jmc@r-project.org> wrote:
> There is a point that needs to be remembered in discussions of accessor
> functions (and more generally).
>
> We're working with a class/method mechanism in a _functional_ language.
> Simple analogies made from class-based languages such as Java are not
> always good guides.
>
> In the example below, "a function foo that only operates on that class"
> is not usually a meaningful concept in R. Whereas in Java a method can
> only be invoked "on" an object, given the syntax of the Java language, R
> (that is, the S language) is different. You can intend a function to be
> used only on one class, but that isn't normally the way to think about R
> software.
>
> Functions are first-class objects and in principle every function should
> have a "function", a purpose. Methods implement that purpose for
> particular combinations of arguments.
>
> Accessor functions are therefore a bit anomalous. If they had a
> standard syntactic pattern, say get_foo(object), then it would be more
> reasonable to think that you're just defining a method for that function
> for a given class that happens to have a slot with the particular name,
> "foo".
>
> Also, slot-setting functions will be different in R because we deal with
> objects, not object references as in Java. An R-like naming convention
> would be something along the lines of
> set_foo(object) <- value
> but in any case one will need to use replacement functions to conform to
> the way assignments work.
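
To make the replacement-function point above concrete, here is a minimal S4 sketch. The class "A", its slot "foo", and the names get_foo/set_foo are hypothetical, taken only from the naming pattern suggested above; this is one way to write it, not a recommended standard.

```r
library(methods)

# Hypothetical class with a single slot named "foo".
setClass("A", representation(foo = "numeric"))

# Getter generic following the get_foo(object) pattern.
setGeneric("get_foo", function(object) standardGeneric("get_foo"))
setMethod("get_foo", "A", function(object) object@foo)

# Replacement generic: the function name ends in "<-" and the last
# argument is called 'value', so that  set_foo(x) <- v  works.
setGeneric("set_foo<-", function(object, value) standardGeneric("set_foo<-"))
setReplaceMethod("set_foo", "A", function(object, value) {
  object@foo <- value
  object  # a replacement function returns the modified object
})

a <- new("A", foo = 1)
b <- a            # 'b' is a copy of the object, not a reference to it
set_foo(b) <- 2
get_foo(a)        # still 1
get_foo(b)        # 2
```

Because R works with objects rather than references, modifying b leaves a unchanged, which is why the setter has to be a replacement function that returns the new object.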

In the Object class system of the R.oo package I have for years worked successfully with what I call virtual fields. I find them really useful and convenient to work with.

These work as follows: if there is a get<Field>(object) function, it is called whenever object$<field> is evaluated. If there is no such function, the internal field '<field>' is accessed (from the environment where all fields live). Similarly, object$<field> <- value checks for set<Field>(object, value), which is called if available. [I work with environments/references, so my set functions don't really have to be replacement functions, but nothing prevents them from being such.]
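
The dispatch rule just described can be sketched in base R roughly as follows. This is not the R.oo implementation; the constructor Circle(), the helper find_accessor(), and the accessor names getRadius.Circle, setRadius.Circle and getDiameter.Circle are all made up for illustration, and the class name is hard-coded to keep the sketch short.

```r
# Hypothetical constructor: fields live in an environment (reference semantics).
Circle <- function(radius = 1) {
  this <- new.env()
  assign(".radius", radius, envir = this)
  class(this) <- "Circle"
  this
}

# Look up get<Field>.Circle or set<Field>.Circle; returns NULL if absent.
find_accessor <- function(prefix, name) {
  fname <- paste0(prefix, toupper(substring(name, 1, 1)),
                  substring(name, 2), ".Circle")
  get0(fname, envir = environment(), mode = "function", inherits = TRUE)
}

# object$<field>: call the getter if one exists, else read the stored field.
`$.Circle` <- function(x, name) {
  getter <- find_accessor("get", name)
  if (is.null(getter)) get(name, envir = x, inherits = FALSE) else getter(x)
}

# object$<field> <- value: call the setter if one exists, else store the field.
`$<-.Circle` <- function(x, name, value) {
  setter <- find_accessor("set", name)
  if (is.null(setter)) assign(name, value, envir = x) else setter(x, value)
  invisible(x)
}

# Virtual fields 'radius' and 'diameter' on top of the stored '.radius'.
getRadius.Circle <- function(this) this$.radius
setRadius.Circle <- function(this, value) {
  if (value < 0) stop("Negative radius: ", value)
  this$.radius <- value
  invisible(this)
}
getDiameter.Circle <- function(this) 2 * this$radius

circle <- Circle(2)
circle$radius <- 5
circle$diameter   # 10
```

Note that '.radius' has no accessor, so reading or writing it falls through to the environment directly; that is what keeps the accessors themselves from recursing.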

There are several advantages to doing it this way. You can protect fields behind a set function, for instance to prevent assignment of negative values:

  circle$radius <- -5
  Error: Negative radius: -5

You can also provide redundant fields in your API, e.g.

  circle$radius <- 5
  print(circle$diameter)
  circle$area <- 4
  print(circle$radius)

and so on. How the circle is represented internally does not matter and may change over time. With such a design you, as a software developer, don't have to worry; the API is stable. I think this scheme carries over perfectly to S4 and '@'.
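
As a rough illustration of how the circle example might look in S4, here is a sketch using accessor generics and replacement methods. It does not intercept '@' itself; the class "Circle", the slot '.radius', and the generics radius, diameter and area are assumptions made for this example only.

```r
library(methods)

# Hypothetical S4 class; only the radius is stored.
setClass("Circle", representation(.radius = "numeric"))

setGeneric("radius", function(object) standardGeneric("radius"))
setGeneric("radius<-", function(object, value) standardGeneric("radius<-"))
setGeneric("diameter", function(object) standardGeneric("diameter"))
setGeneric("area<-", function(object, value) standardGeneric("area<-"))

setMethod("radius", "Circle", function(object) object@.radius)

# The setter guards the slot against invalid values.
setReplaceMethod("radius", "Circle", function(object, value) {
  if (value < 0) stop("Negative radius: ", value)
  object@.radius <- value
  object
})

# Redundant "fields" derived from the single stored slot.
setMethod("diameter", "Circle", function(object) 2 * radius(object))
setReplaceMethod("area", "Circle", function(object, value) {
  radius(object) <- sqrt(value / pi)
  object
})

circle <- new("Circle", .radius = 1)
radius(circle) <- 5
diameter(circle)   # 10
area(circle) <- 4
radius(circle)     # sqrt(4 / pi), about 1.128
```

If the internal representation later changed (say, to storing the area), only the method bodies would need to change; code calling radius(), diameter() and the replacement functions would be unaffected.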

FYI: I used the above naming convention because I set this up long before the '_' operator was redefined.

Comment: If you don't want the user to access a slot/field directly, I recommend naming the slot with a period prefix, e.g. '.radius'. This at least gives the user a chance to understand your design, although it does not prevent them from misusing it. The period prefix is also "standard" for "private" objects, cf. ls(all.names=FALSE/TRUE).
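
A small base-R illustration of the ls() behaviour referred to above; the environment and the object names in it are invented for the example.

```r
e <- new.env()
assign(".radius", 5, envir = e)           # "private" by convention only
assign("label", "my circle", envir = e)   # ordinary, visible name

visible <- ls(e)                      # lists "label" only
everything <- ls(e, all.names = TRUE) # also lists ".radius"

# The prefix hides the name from the default listing; it does not protect it.
get(".radius", envir = e)             # 5
```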

/Henrik

>
> Ross Boylan wrote:
> > On Tue, 2006-09-26 at 10:43 -0700, Seth Falcon wrote:
> >
> >> Ross Boylan <ross@biostat.ucsf.edu> writes:
> >>
> >
> >
> >>>> If anyone else is going to extend your classes, then you are doing
> >>>> them a disservice by not making these proper methods. It means that
> >>>> you can control what happens when they are called on a subclass.
> >>>>
> >>> My style has been to define a function, and then use setMethod if I want
> >>> to redefine it for an extension. That way the original version becomes
> >>> the generic.
> >>>
> >>> So I don't see what I'm doing as being a barrier to adding methods. Am
> >>> I missing something?
> >>>
> >> You are not, but someone else might be: suppose you release your code
> >> and I would like to extend it. I am stuck until you decide to make
> >> generics.
> >>
> > This may be easier to do concretely.
> > I have an S4 class A.
> > I have defined a function foo that only operates on that class.
> > You make a class B that extends A.
> > You wish to give foo a different implementation for B.
> >
> > Does anything prevent you from doing
> > setMethod("foo", "B", function(x) blah blah)
> > (which is the same thing I do when I make a subclass)?
> > This turns my original foo into the catchall method.
> >
> > Of course, foo is not appropriate for random objects, but that was true
> > even when it was a regular function.
> >
> >
> >>> Originally I tried defining the original using setMethod, but this
> >>> generates a complaint about a missing function; that's one reason I fell
> >>> into this style.
> >>>
> >> You have to create the generic first if it doesn't already exist:
> >>
> >> setGeneric("foo", function(x) standardGeneric("foo"))
> >>
> > I wonder if it might be worth changing setMethod so that it does this by
> > default when no existing function exists. Personally, that would fit the
> > style I'm using better.
> >
> >>>> For accessors, I like to document them in the methods section of the
> >>>> class documentation.
> >>>>
> >>> This is for accessors that really are methods, not my fake
> >>> function-based accessors, right?
> >>>
> >> Which might be a further argument not to have the distinction in the
> >> first place ;-)
> >>
> >> To me, simple accessors are best documented with the class. If I have
> >> an instance, I will read help on it and find out what I can do with
> >> it.
> >>
> >>
> >>> If you use foo as an accessor method, where do you define the associated
> >>> function (i.e., \alias{foo})? I believe such a definition is expected by
> >>> R CMD check and is desirable for users looking for help on foo (?foo)
> >>> without paying attention to the fact it's a method.
> >>>
> >> Yes you need an alias for the _generic_ function. You can either add
> >> the alias to the class man page where one of its methods is documented
> >> or you can have separate man pages for the generics. This is
> >> painful. S4 documentation, in general, is rather difficult and IMO
> >> this is in part a consequence of the more general (read more powerful)
> >> generic function based system.
> >>
> > As my message indicates, I too am struggling with an appropriate
> > documentation style for S4 classes and methods. Since "Writing R
> > Extensions" has said "Structure of and special markup for documenting S4
> > classes and methods are still under development." for as long as I can
> > remember, perhaps I'm not the only one.
> >
> > Some of the problem may reflect the tension between conventional OO and
> > functional languages, since R remains the latter even under S4. I'm not
> > sure if it's the tools or my approach that is making things awkward; it
> > could be both!
> >
> > Ross
> >
> > ______________________________________________
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> >
>



Received on Thu Sep 28 08:06:57 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Wed 27 Sep 2006 - 22:30:10 GMT.