Date: Mon, 11 Feb 2008 13:29:42 -0400

I had a similar problem when trying to use lme inside a custom rpart function. I got around it by passing the data frame I needed through the parms argument of rpart, and then treating parms as the dataset inside the evaluation, init and split functions. It's not the most elegant solution, but it works.
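Here is a minimal sketch of the idea, not my actual code. The data frame, the column names (resp, id, x1, x2) and the function names are all made up for illustration. It also assumes one extra trick that is not part of what I described above: carrying a row index as a second column of the response, so that each node's rows can be looked up in the full data frame stored in parms. The split function is a plain anova-style rule for continuous predictors only, just to keep the example short.

```r
library(rpart)

# Hypothetical example data; 'id' is a row index carried along in the response
set.seed(1)
dat <- data.frame(x1 = runif(100), x2 = runif(100))
dat$resp <- 2 * (dat$x1 > 0.5) + rnorm(100, sd = 0.1)
dat$id <- seq_len(nrow(dat))

# init: keep the two-column y and hand parms on to the eval and split functions
init_fn <- function(y, offset, parms, wt) {
  list(y = y, parms = parms, numresp = 1, numy = 2,
       summary = function(yval, dev, wt, ylevel, digits)
         paste("mean =", format(signif(yval, digits))))
}

# eval: rebuild the node's rows from the full data frame stored in parms
eval_fn <- function(y, wt, parms) {
  node <- parms$data[y[, 2], , drop = FALSE]   # rows belonging to this node
  wmean <- sum(node$resp * wt) / sum(wt)
  list(label = wmean, deviance = sum(wt * (node$resp - wmean)^2))
}

# split: anova-style goodness for each cut point; continuous predictors only
split_fn <- function(y, wt, x, parms, continuous) {
  stopifnot(continuous)
  n <- nrow(y)
  yc <- y[, 1] - sum(y[, 1] * wt) / sum(wt)    # centred response
  temp <- cumsum(yc * wt)[-n]
  left.wt <- cumsum(wt)[-n]
  right.wt <- sum(wt) - left.wt
  lmean <- temp / left.wt
  rmean <- -temp / right.wt
  list(goodness = (left.wt * lmean^2 + right.wt * rmean^2) / sum(wt * yc^2),
       direction = sign(lmean))
}

fit <- rpart(cbind(resp, id) ~ x1 + x2, data = dat,
             method = list(init = init_fn, eval = eval_fn, split = split_fn),
             parms = list(data = dat),
             control = rpart.control(minsplit = 10, xval = 0))
```

Inside eval_fn, node is an ordinary data frame, so a model such as polr or lme could presumably be fitted to it there. Note also that rpart calls the evaluation function with (y, wt, parms) only, so an extra data argument is never supplied; that is most likely where the 'argument "data" is missing' error in your evalfunc comes from.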

Have you (or anyone else) figured out the details of the summary and text options in the init function? I know that they are used to fill out the summary of the model and the text.rpart plotting, but I can't seem to use any of the variables being passed to them efficiently (or at all).

Hope that helps,

Sam Stewart

On Feb 20, 2007 2:47 AM, Tobias Guennel <tguennel_at_vcu.edu> wrote:

> I have made some progress with the user defined splitting function and I got
> a lot of the things I needed to work. However, I am still stuck on accessing
> the node data. It would probably be enough if somebody could tell me, how I
> can access the original data frame of the call to rpart.
> So if the call is: fit0 <- rpart(Sat ~Infl +Cont+ Type,
> housing, control=rpart.control(minsplit=10, xval=0),
> method=alist)
> how can I access the housing data frame within the user defined splitting
> function?
>
> Any input would be highly appreciated!
>
> Thank you
> Tobias Guennel
>
> -----Original Message-----
> From: Tobias Guennel [mailto:tguennel_at_vcu.edu]
> Sent: Monday, February 19, 2007 3:40 PM
> To: 'r-help@stat.math.ethz.ch'
> Subject: [R] User defined split function in rpart
>
> Maybe I should explain my problem in a little more detail.
> The rpart package allows for user defined split functions. An example is
> given in the source/test directory of the package as usersplits.R.
> The comments say that three functions have to be supplied:
> 1. "The 'evaluation' function. Called once per node.
> Produce a label (1 or more elements long) for labeling each node,
> and a deviance."
> 2. The split function, where most of the work occurs.
> Called once per split variable per node.
> 3. The init function:
> fix up y to deal with offsets
> return a dummy parms list
> numresp is the number of values produced by the eval routine's "label".
>
> I have altered the evaluation function and the split function for my needs.
> Within those functions, I need to fit a proportional odds model to the data
> of the current node. I am using the polr() routine from the MASS package to
> fit the model.
> Now my problem is: how can I call the polr() function with only the data of
> the current node? That's what I tried so far:
>
> evalfunc <- function(y,x,parms,data) {
>
>   pomnode<-polr(data$y~data$x,data,weights=data$Freq)
>   parprobs<-predict(pomnode,type="probs")
>   dev<-0
>   K<-dim(parprobs)[2]
>   N<-dim(parprobs)[1]/K
>   for(i in 1:N){
>     tempsum<-0
>     Ni<-0
>     for(l in 1:K){
>       Ni<-Ni+data$Freq[K*(i-1)+l]
>     }
>     for(j in 1:K){
>       tempsum<-tempsum+data$Freq[K*(i-1)+j]/Ni*log(parprobs[i,j]*Ni/data$Freq[K*(i-1)+j])
>     }
>     dev=dev+Ni*tempsum
>   }
>   dev=-2*dev
>   wmean<-1
>   list(label= wmean, deviance=dev)
>
> }
>
> I get the error: Error in eval(expr, envir, enclos) : argument "data" is
> missing, with no default
>
> How can I use the data of the current node?
>
> Thank you
> Tobias Guennel

