From: E Hofstadler <e.hofstadler_at_gmail.com>

Date: Fri, 01 Apr 2011 15:54:08 +0300

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Fri 01 Apr 2011 - 12:59:09 GMT

2011/4/1 Nick Sabbe <nick.sabbe_at_ugent.be>:

> This should be a version that does what you want.

Indeed it does, thank you very much!

> Because you named the variable lvarname, I assumed you were already passing
> "lvar" instead of trying to pass lvar (without the quotes), which is in no
> way a 'name'.

Sorry about that, I can see how my variable names were somewhat confusing.

Many thanks once again!
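
Since the working version is not quoted in this message, here is a minimal sketch of the approach discussed below, assuming the column names are passed as character strings (for example "lvar") and looked up with [[ ]]. The function name myfunct.str, its argument names, and the toy data are illustrative assumptions, not the code Nick posted.

```r
# Toy data along the lines of the example quoted below (size is arbitrary)
set.seed(124)
n <- 200
Fulldf <- data.frame(
  xvar = factor(sample(0:3, n, replace = TRUE), levels = 0:3,
                labels = c("blue", "green", "red", "yellow")),
  yvar = factor(sample(0:1, n, replace = TRUE), levels = 0:1,
                labels = c("area1", "area2")),
  lvar = factor(sample(0:1, n, replace = TRUE), levels = 0:1,
                labels = c("yes", "no"))
)

# Hypothetical helper: column names arrive as strings, so no
# substitute()/deparse() is needed to locate them in the data frame.
myfunct.str <- function(dataframe, subgroup, lvarname, yvarname,
                        xvarname = "xvar") {
  # rows belonging to the requested subgroup (NA in lvarname counts as no match)
  keep <- !is.na(dataframe[[lvarname]]) & dataframe[[lvarname]] == subgroup
  # keep only the two columns of interest and drop incomplete rows
  Data.tmp <- na.omit(dataframe[keep, c(xvarname, yvarname)])
  table(Data.tmp[[xvarname]], Data.tmp[[yvarname]])
}

tab <- myfunct.str(Fulldf, "yes", lvarname = "lvar", yvarname = "yvar")
```

The same call works for any data frame that has the named columns, because nothing inside the function refers to a fixed data frame or column name other than the default for xvarname.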

> -----Original Message-----
> From: irene.prix_at_googlemail.com [mailto:irene.prix_at_googlemail.com] On Behalf
> Of E Hofstadler
> Sent: Friday, 1 April 2011 14:28
> To: Nick Sabbe
> Cc: r-help_at_r-project.org
> Subject: Re: [R] programming: telling a function where to look for the
> entered variables
>
> Thanks Nick and Juan for your replies.
>
> Nick, thanks for pointing out the warning in subset(). I'm not sure
> I understand the example you provided, though, because despite using
> subset() rather than bracket notation, the original function (myfunct)
> does what is expected of it. The problem I have is with the second
> function (myfunct.better), where variable names + dataframe are not
> fixed within the function but passed to the function when calling it
> -- and even with bracket notation I don't quite manage to tell R where
> to look for the columns that relate to the entered column names.
> (but then perhaps I misunderstood you)
>
> This is what I tried (using bracket notation):
>
> myfunct.better <- function(dataframe, subgroup, lvarname, yvarname){
>   Data.tmp <- dataframe[dataframe[,deparse(substitute(lvarname))]==subgroup,
>                         c("xvar",deparse(substitute(yvarname)))]
> }
>
> but this creates an empty contingency table only -- perhaps because my
> use of deparse() is flawed (I think what is converted into a string is
> "lvarname" and "yvarname", rather than the column names that these two
> function-variables represent in the dataframe)?
>
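
A small illustration of what deparse(substitute()) returns may help here. When the function is called directly, it captures the text of the expression the caller typed, so the result depends on whether the argument is given bare or quoted. show.name is a made-up helper for this sketch only.

```r
# Hypothetical helper: returns the text of the expression supplied by the caller
show.name <- function(lvarname) deparse(substitute(lvarname))

bare   <- show.name(lvar)    # bare name; lvar is never evaluated, so it need not exist
quoted <- show.name("lvar")  # string; the deparsed text keeps the quote characters

bare           # the string lvar
nchar(quoted)  # 6, because the two quote characters are part of the text
```

So if a string such as "lvar" is passed, the deparsed text contains literal quote characters and matches no column name, which is one possible reason for an empty or failing result.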
> 2011/4/1 Nick Sabbe <nick.sabbe_at_ugent.be>:
>> See the warning in ?subset.
>> Passing the column name of lvar is not the same as passing the 'contextual
>> column' (as I coin it in these circumstances).
>> You can solve it by indeed using [] instead.
>>
>> For my own comfort, here is the relevant line from your original function:
>> Data.tmp <- subset(Fulldf, lvar==subgroup, select=c("xvar","yvar"))
>> Which should become something like (untested but should be close):
>> Data.tmp <- Fulldf[Fulldf[,"lvar"]==subgroup, c("xvar","yvar")]
>>
>> This should be a lot easier to translate based on column names, as the
>> column names are now used as such.
>>
>> HTH,
>>
>>
>> Nick Sabbe
>> --
>> ping: nick.sabbe_at_ugent.be
>> link: http://biomath.ugent.be
>> wink: A1.056, Coupure Links 653, 9000 Gent
>> ring: 09/264.59.36
>>
>> -- Do Not Disapprove
>>
>> -----Original Message-----
>> From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org] On
>> Behalf Of E Hofstadler
>> Sent: Friday, 1 April 2011 13:09
>> To: r-help_at_r-project.org
>> Subject: [R] programming: telling a function where to look for the entered
>> variables
>>
>> Hi there,
>>
>> Could someone help me with the following programming problem?
>>
>> I have written a function that works for my intended purpose, but it
>> is quite closely tied to a particular dataframe and the names of the
>> variables in this dataframe. However, I'd like to use the same
>> function for different dataframes and variables. My problem is that
>> I'm not quite sure how to tell my function in which dataframe the
>> entered variables are located.
>>
>> Here's some reproducible data and the function:
>>
>> # create reproducible data
>> set.seed(124)
>> xvar <- sample(0:3, 1000, replace = TRUE)
>> yvar <- sample(0:1, 1000, replace = TRUE)
>> zvar <- rnorm(1000)
>> lvar <- sample(0:1, 1000, replace = TRUE)
>> Fulldf <- as.data.frame(cbind(xvar,yvar,zvar,lvar))
>> Fulldf$xvar <- factor(xvar, labels=c("blue","green","red","yellow"))
>> Fulldf$yvar <- factor(yvar, labels=c("area1","area2"))
>> Fulldf$lvar <- factor(lvar, labels=c("yes","no"))
>>
>> and here's the function in the form in which it currently works: from a
>> subset of the dataframe Fulldf, a contingency table is created (in my
>> actual data, several other operations are then performed on that
>> contingency table, but these are not relevant for the problem in
>> question, so I've deleted them).
>>
>> # function as it currently works: tailored to a particular dataframe (Fulldf)
>>
>> myfunct <- function(subgroup){
>>   # subgroup: the particular value of the factor lvar for which the
>>   # contingency table should be calculated
>>   # restrict dataframe to given subgroup and two columns of the original
>>   # dataframe
>>   Data.tmp <- subset(Fulldf, lvar==subgroup, select=c("xvar","yvar"))
>>   Data.tmp <- na.omit(Data.tmp) # exclude missing values
>>   indextable <- table(Data.tmp$xvar, Data.tmp$yvar) # make contingency table
>>   return(indextable)
>> }
>>
>> Since I need to use the function with different dataframes and
>> variable names, I'd like to be able to tell my function the name of
>> the dataframe and variables it should use for calculating the index.
>> This is how I tried to modify the first part of the function, but it
>> didn't work:
>>
>> # function as I would like it to work: independent of any particular
>> # dataframe or variable names (doesn't work)
>>
>> myfunct.better <- function(subgroup, lvarname, yvarname, dataframe){
>>   # enter the subgroup, the variable names to be used and the dataframe
>>   # in which they are found
>>   # trying to subset the given dataframe for the given subgroup of the
>>   # given variable. The variable "xvar" happens to have the same name in
>>   # all dataframes, but the variable yvarname has different names in the
>>   # different dataframes
>>   Data.tmp <- subset(dataframe, lvarname==subgroup,
>>                      select=c("xvar", deparse(substitute(yvarname))))
>>   Data.tmp <- na.omit(Data.tmp)
>>   # create the contingency table on the basis of the entered variables
>>   indextable <- table(Data.tmp$xvar, Data.tmp$yvarname)
>>   return(indextable)
>> }
>>
>> calling
>>
>> myfunct.better("yes", lvarname=lvar, yvarname=yvar, dataframe=Fulldf)
>>
>> results in the following error:
>>
>> Error in `[.data.frame`(x, r, vars, drop = drop) :
>>   undefined columns selected
>>
>> My feeling is that R doesn't know where to look for the entered
>> variables (lvar, yvar), but I'm not sure how to solve this problem. I
>> tried using with() and even attach() within the function, but that
>> didn't work.
>>
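
One way to see what goes wrong, sketched with made-up names (df, pick.subset, pick.bracket): inside a wrapper, subset() evaluates the condition using the value of the wrapper's argument, not the column that the argument names, so the comparison never touches the column. Indexing by name with [[ ]] avoids this. This is an illustration under those assumptions, not a diagnosis of the exact error above.

```r
# Toy data frame for the illustration
df <- data.frame(g = c("a", "b", "a"), v = 1:3)

# Hypothetical wrapper using subset(): colname is not a column of d, so it is
# found in the function's own frame and the test becomes "g" == "a".
pick.subset <- function(d, colname, value) {
  subset(d, colname == value)
}

# Hypothetical wrapper using bracket notation: the string is used to look up
# the column, so the test is applied to the column's values.
pick.bracket <- function(d, colname, value) {
  d[d[[colname]] == value, , drop = FALSE]
}

n.subset  <- nrow(pick.subset(df, "g", "a"))   # no rows are selected
n.bracket <- nrow(pick.bracket(df, "g", "a"))  # the two rows where g is "a"
```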
>> Any help is greatly appreciated.
>>
>> Best,
>> Esther
>>
>> P.S.:
>> Are there books that cover programming in R for beginners? I
>> mean things like how best to use vectorization instead of loops, and
>> general "best practice" tips for programming. Most of the books I've
>> been looking at focus on applying R for particular statistical
>> analyses, and deal only comparatively briefly with more general
>> programming aspects. I was wondering if there are any books or tutorials
>> out there that cover the latter aspects in a more elaborate and
>> systematic way.
>>

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.2.0, at Fri 01 Apr 2011 - 13:00:25 GMT.
