Re: [R] eval(parse(text vs. get when accessing a function

From: Ramon Diaz-Uriarte <rdiaz02_at_gmail.com>
Date: Sat 06 Jan 2007 - 14:16:51 GMT

Dear Greg,

On 1/5/07, Greg Snow <Greg.Snow@intermountainmail.org> wrote:
> Ramon,
>
> I prefer to use the list method for this type of thing, here are a couple of reasons why (maybe you are more organized than me and would never do some of the stupid things that I have, so these don't apply to you, but you can see that the general suggestion applies to some of the rest of us).
>

Those suggestions do apply to me of course (no claim to being organized nor beyond idiocy here). And actually the suggestions on this thread have been very useful. I think, though, that I was not very clear on the context and my examples were too dumbed down. So I'll try to give more detail (nothing here is secret, I am just trying not to bore people).

The code is part of a web-based application, so there is no interactive user. The R code is passed the arguments (and optional user functions) from the CGI.

There is one "core" function (call it cvFunct) that, among other things, does cross-validation. So this is one way to do things:

cvFunct <- function(whatever, genefiltertype, whateverelse) {

      internalGeneSelect <- eval(parse(text = paste("geneSelect",
                                             genefiltertype, sep = ".")))

      ## do things calling internalGeneSelect,
}

and now define all possible functions as

geneSelect.Fratio <- function(x, y, z) {
      ## something
}
geneSelect.Wilcoxon <- function(x, y, z) {
      ## something else
}

If I want more geneSelect functions, adding them is simple. And I can even allow the user to pass her/his own functions, with the only restrictions being that the function takes three args, x, y, z, and that it is named "geneSelect." followed by a user-chosen string. (Yes, I need to make sure no calls to "system", etc., are in the user code, but that is another issue).

The general idea is not new of course. For instance, in package "e1071", a somewhat similar thing is done in function "tune", and David Meyer there uses "do.call". However, tune is a lot more general than what I had in mind. For instance, "tune" deals with arbitrary functions, with arbitrary numbers and names of parameters, whereas my functions above all take only three arguments (x: a matrix, y: a vector; z: an integer), so the neat functionality provided by "do.call", and passing the args as a list is not really needed.
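
For comparison, a minimal sketch of the do.call route with a fixed three-argument signature. The function body and the data are placeholders invented for illustration; they are not the real gene selection code.

```r
## Hypothetical stand-in for a real gene selection function:
## returns the indices of the z columns of x with the largest variance.
geneSelect.Variance <- function(x, y, z) {
    v <- apply(x, 2, var)
    order(v, decreasing = TRUE)[seq_len(z)]
}

## Toy data: three "genes" (columns), three samples (rows).
x <- cbind(a = c(1, 2, 3), b = c(1, 1, 1), c = c(0, 5, 10))
y <- c(0, 1, 0)

## do.call accepts the function name as a character string
## and the arguments as a list.
fname <- paste("geneSelect", "Variance", sep = ".")
sel <- do.call(fname, list(x, y, 2))
```

With only three fixed arguments the list of args buys little, which is the point made above.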

So, given that my situation is so structured, and I do not need "do.call", I think the approach via eval(parse(paste makes my life simple:

  1. the central function (cvFunct) uses something I can easily recognize: "internalGeneSelect"
  2. after the initial eval(parse(text I do not need to worry anymore about what the "true" gene selection function is called
  3. adding new functions and calling them is simple: function naming follows a simple pattern ("geneSelect." + postfix) and calling the user function only requires passing the postfix to cvFunct.
  4. notice also that the functions I define will of course not be named "f.1", etc., but rather things like "geneSelect.Fratio" or "geneSelect.namesThatStartWithCuteLetters".
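
A minimal sketch of the scheme in the four points above. The two selectors and the argument order of cvFunct are assumptions made up for the example. The get() line is only there to show that the same lookup can be done without parsing.

```r
## Placeholder selectors (assumptions for illustration only).
geneSelect.First <- function(x, y, z) seq_len(z)
geneSelect.Last  <- function(x, y, z) rev(seq_len(ncol(x)))[seq_len(z)]

cvFunct <- function(x, y, z, genefiltertype) {
    fname <- paste("geneSelect", genefiltertype, sep = ".")
    ## Lookup via eval(parse(text ...
    internalGeneSelect <- eval(parse(text = fname))
    ## ... and the same lookup without parsing; mode = "function"
    ## restricts the search to function objects.
    sameFunction <- get(fname, mode = "function")
    stopifnot(identical(internalGeneSelect, sameFunction))
    ## Real code would do the cross-validation here.
    internalGeneSelect(x, y, z)
}

x <- matrix(0, nrow = 2, ncol = 5)
res_first <- cvFunct(x, c(0, 1), 2, "First")
res_last  <- cvFunct(x, c(0, 1), 2, "Last")
```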

I hope this makes things more clear. I did not include this detail because this is probably boring (I guess most of you have stopped reading by now :-).

> Using the list forces you to think about what functions may be called and thinking about things before doing them is usually a good idea. Personally I don't trust the user of my functions (usually my future self who has forgotten something that seemed obvious at the time) to not do something stupid with them.
>
> With list elements you can have names for the functions and access them either by the name or by a number, I find that a lot easier when I go back to edit/update than to remember which function f.1 or f.2 did what.
>

But I don't see how having your functions as list elements is easier (especially if the function is longer than 2 to 3 lines) than having all functions systematically named things such as:

geneSelect.Fratio
geneSelect.Random
geneSelect.LetterA

etc

Of course, I could have a list with the components named "Fratio", "Random", "LetterA". But I fail to see what it adds. And it forces me to build the list, and probably rebuild it when (or not build it until) the user enters her/his own selection function. But the latter I do not need to do with the scheme above.
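
For reference, a sketch of what the list alternative would look like, including the step of adding a user function. All names (geneSelectors, callSelector, userFunction) and the function bodies are hypothetical.

```r
## Built-in selectors kept as named list components
## (placeholder bodies, for illustration only).
geneSelectors <- list(
    Fratio = function(x, y, z) "Fratio called",
    Random = function(x, y, z) "Random called"
)

## A user-supplied function is added as one more component.
userFunction <- function(x, y, z) "user function called"
geneSelectors[["LetterA"]] <- userFunction

callSelector <- function(type, x, y, z) {
    ## Only names present in the list can ever be called.
    if (!type %in% names(geneSelectors))
        stop("unknown gene selection method: ", type)
    geneSelectors[[type]](x, y, z)
}

out <- callSelector("LetterA", NULL, NULL, 1)
```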

> With your function, what if the user runs:
>
> > g(5,3)
>
> What should it do? (you have only shown definitions for f.1 and f.2). With my luck I would accidentally type that and just happen to have an f.3 function sitting around from a previous project that does something that I really don't want it to do now. If I use the list approach then I will get a subscript out of bounds error rather than running something unintended.
>
>

I see the general concern, but not how it applies here. If I pass argument "Fratio" then either I use geneSelect.Fratio or I get an error if "geneSelect.Fratio" does not exist. Similar to what would happen if I do

g1(2, 8)

when f.8 is not defined:

Error in eval(expr, envir, enclos) : object "f.8" not found

So even in more general cases, except for function redefinitions, etc., you are not able to call non-existent stuff.
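
A small sketch of that behaviour, using the f.1 and g names from the earlier example. The explicit exists() check is an addition for illustration; without it the lookup itself still fails with an "object not found" error.

```r
f.1 <- function(x) x + 1

g <- function(x, fpost) {
    fname <- paste("f.", fpost, sep = "")
    ## Optional explicit check, so the failure names the function.
    if (!exists(fname, mode = "function"))
        stop("no function called ", fname)
    calledf <- get(fname, mode = "function")
    calledf(x)
}

ok  <- g(2, 1)
## f.8 is not defined, so this call stops with an error;
## tryCatch keeps the message instead of aborting the script.
bad <- tryCatch(g(2, 8), error = function(e) conditionMessage(e))
```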

> 2nd, If I used the eval-parse approach then I would probably at some point redefine f.1 or f.2 to the output of a regression analysis or something, then go back and run the g function at a later time and wonder why I am getting an error, then once I have finally figured it out, now I need to remember what f.1 did and rewrite it again. I am much less likely to accidentally replace an element of a list, and if the list is well named I am unlikely to replace the whole list by accident.
>
>

Yes, that is true. Again, it does not apply to the actual case I have in mind, but of course, without the detailed info on context I just gave, you could not know that.
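
As a partial illustration of the redefinition problem, here is a sketch with two environments (the names outer and inner are made up). get() with mode = "function" skips a non-function object of the same name and keeps searching the enclosing environments, whereas evaluating the parsed name returns whatever is found first. This only helps while the function still exists somewhere up the chain; if f.1 is overwritten in the environment where the function lived, the function is gone.

```r
outer <- new.env()
assign("f.1", function(x) x + 1, envir = outer)

## A child environment where the name f.1 is reused for data.
inner <- new.env(parent = outer)
assign("f.1", c(0.5, 1.2), envir = inner)

## Plain lookup: finds the numeric vector in 'inner'.
plain  <- get("f.1", envir = inner)
## Function-only lookup: skips the vector, finds the function in 'outer'.
asFun  <- get("f.1", envir = inner, mode = "function")
## eval(parse(text ...: finds the numeric vector again.
parsed <- eval(parse(text = "f.1"), envir = inner)
```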

> 3rd, If I ever want to use this code somewhere else (new version of R, on the laptop, give to coworker, ...), it is a lot easier to save and load a single list than to try to think of all the functions that need to be saved.
>

Oh, sure. But all the functions above live in a single file (actually, a minipackage) except for the optional user function (which is read from a file).
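
A sketch of one way to read such a user function from a file, assuming the file defines a function called geneSelect.User. The file contents and the names userFile, userEnv and userSelect are invented for the example. It does not address the separate issue of unsafe calls in the user code.

```r
## Stand-in for the file the web application would receive.
userFile <- tempfile(fileext = ".R")
writeLines("geneSelect.User <- function(x, y, z) seq_len(z)", userFile)

## Evaluate the file in its own environment so its definitions
## do not overwrite anything already defined elsewhere.
userEnv <- new.env()
sys.source(userFile, envir = userEnv)

## Fetch the function from that environment by its full name.
userSelect <- get("geneSelect.User", envir = userEnv, mode = "function")
picked <- userSelect(NULL, NULL, 3)
unlink(userFile)
```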

>
> Personally I have never regretted trying not to underestimate my own future stupidity.
>

Neither have I. And actually, that is why I asked: if Thomas Lumley said, in the fortune, that I had better rethink it, then I should try rethinking it. But I asked because I failed to see what the problem is.

> Hope this helps,
>

It certainly does.

Best,

R.

> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.snow@intermountainmail.org
> (801) 408-8111
>
>
>
> > -----Original Message-----
> > From: r-help-bounces@stat.math.ethz.ch
> > [mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of Ramon
> > Diaz-Uriarte
> > Sent: Friday, January 05, 2007 11:41 AM
> > To: Peter Dalgaard
> > Cc: r-help; rdiaz02@gmail.com
> > Subject: Re: [R] eval(parse(text vs. get when accessing a function
> >
> > On Friday 05 January 2007 19:21, Peter Dalgaard wrote:
> > > Ramon Diaz-Uriarte wrote:
> > > > Dear All,
> > > >
> > > > I've read Thomas Lumley's fortune "If the answer is parse() you
> > > > should usually rethink the question.". But I am not sure if that
> > > > also applies (and why) to other situations (Lumley's comment
> > > > http://tolstoy.newcastle.edu.au/R/help/05/02/12204.html
> > > > was in reply to accessing a list).
> > > >
> > > > Suppose I have similarly called functions, except for a postfix. E.g.
> > > >
> > > > f.1 <- function(x) {x + 1}
> > > > f.2 <- function(x) {x + 2}
> > > >
> > > > And sometimes I want to call f.1 and some other times f.2 inside
> > > > another function. I can either do:
> > > >
> > > > g <- function(x, fpost) {
> > > > calledf <- eval(parse(text = paste("f.", fpost, sep = "")))
> > > > calledf(x)
> > > > ## do more stuff
> > > > }
> > > >
> > > >
> > > > Or:
> > > >
> > > > h <- function(x, fpost) {
> > > > calledf <- get(paste("f.", fpost, sep = ""))
> > > > calledf(x)
> > > > ## do more stuff
> > > > }
> > > >
> > > >
> > > > Two questions:
> > > > 1) Why is the second better?
> > > >
> > > > 2) By changing g or h I could use "do.call" instead; why would that
> > > > be better? Because I can handle differences in argument lists?
> >
> > Dear Peter,
> >
> > Thanks for your answer.
> >
> > >
> > > Who says that they are better? If the question is how to call a
> > > function specified by half of its name, the answer could well be to
> > > use parse(), the point is that you should rethink whether that was
> > > really the right question.
> > >
> > > Why not instead, e.g.
> > >
> > > f <- list("1"=function(x) {x + 1} , "2"=function(x) {x + 2})
> > > h <- function(x, fpost) f[[fpost]](x)
> > >
> > > > h(2,"2")
> > >
> > > [1] 4
> > >
> > > > h(2,"1")
> > >
> > > [1] 3
> > >
> >
> > I see, this is a direct way of dealing with the problem.
> > However, you first need to build the f list, and you might
> > not know about that ahead of time. For instance, if I build a
> > function so that the only thing that you need to do to use my
> > function g is to call your function "f.something", and then
> > pass the "something".
> >
> > I am still under the impression that, given your answer,
> > using "eval(parse(text" is not your preferred way. What are
> > the possible problems (if there are any, that is). I guess I
> > am puzzled by "rethink whether that was really the right question".
> >
> >
> > Thanks,
> >
> > R.
> >
> >
> >
> >
> >
> >
> >
> > > > Thanks,
> > > >
> > > >
> > > > R.
> >
> > --
> > Ramón Díaz-Uriarte
> > Centro Nacional de Investigaciones Oncológicas (CNIO)
> > (Spanish National Cancer Center) Melchor Fernández Almagro, 3
> > 28029 Madrid (Spain)
> > Fax: +-34-91-224-6972
> > Phone: +-34-91-224-6900
> >
> > http://ligarto.org/rdiaz
> > PGP KeyID: 0xE89B3462
> > (http://ligarto.org/rdiaz/0xE89B3462.asc)
> >
> >
> >
> >
> > ______________________________________________
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
>

-- 
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz

Received on Sun Jan 07 04:11:16 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sun 07 Jan 2007 - 21:30:25 GMT.
