Re: [Rd] [R] predict()

From: peter dalgaard <pdalgd_at_gmail.com>
Date: Fri, 15 Apr 2011 09:10:25 +0200

On Apr 14, 2011, at 16:52 , Ivo Shterev wrote:

> Dear Dr. Therneau,
> 
> Thank you for your response. Just to point out that we didn't experience any problems with the lm() function under R version 2.12.2 (2011-02-25):

I couldn't reproduce it from Terry's description either, but there _is_ an issue which parallels the coxph one: In model.frame.lm, we have

function (formula, ...)
{

    dots <- list(...)
    nargs <- dots[match(c("data", "na.action", "subset"), names(dots),

        0)]
    if (length(nargs) || is.null(formula$model)) {

        fcall <- formula$call
        fcall$method <- "model.frame"
        fcall[[1L]] <- as.name("lm")
        fcall[names(nargs)] <- nargs
        env <- environment(formula$terms)
        if (is.null(env)) 
            env <- parent.frame()
        eval(fcall, env, parent.frame())
    }
    else formula$model
}
<environment: namespace:stats>

So under some circumstances, we evaluate formula$call (where "formula" is actually an "lm" object --- and age-old design infelicity) with modified arguments. This evaluation takes place in the environment of the model formula, nested in the parent frame, but neither necessarily contains the arguments to formula$call:

> t3
function(dat,fm)
 {
   model.frame(lm(fm,data=dat),subset=1:5)  }
> testdat

        otime event          x
1  0.84345726     0 -0.4456620
2  0.57661027     0  1.2240818
3  1.32905487     0  0.3598138
4  0.03157736     0  0.4007715
5  0.05621098     0  0.1106827
6  0.31650122     1 -0.5558411
7  0.31422729     1  1.7869131
8  0.14526680     1  0.4978505
9  2.72623646     1 -1.9666172
10 0.02915345     1  0.7013559

> fm2
otime ~ x
> t3(testdat,fm2)
Error in model.frame(formula = fm, data = dat, subset = 1:5, drop.unused.levels = TRUE) :   object 'fm' not found

It looks more than a bit iffy to try and set this right. One suggestion could be to explicitly replace the formula part of fcall with the formula contained in the model object. Or, the original evaluation frame should be kept with the model object and used rather than parent.frame().

This should probably move to r-devel.

> 

>> set.seed(123)
>> testdat=data.frame(y=rexp(10),event=rep(0:1,each=5),x=rnorm(10))
>> testfm=as.formula('y~x')
>>
>> testfun=function(dat,fm)
> +   {
> +     predict(lm(fm,data=dat),newdata=dat)
> +   }

>> testfun(testdat,testfm)
> 1 2 3 4 5 6 > 1.00414971 0.07061335 0.55381658 0.53091759 0.69310319 1.06574974 > 7 8 9 10 > -0.24405980 0.47664172 1.85449993 0.36286396

>> sessionInfo()
> R version 2.12.2 (2011-02-25)
> Platform: x86_64-pc-linux-gnu (64-bit)
> 
> locale:
> [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
> [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
> [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8   
> [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
> [9] LC_ADDRESS=C               LC_TELEPHONE=C            
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
> 
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base     
> 
> loaded via a namespace (and not attached):
> [1] tools_2.12.2

>>
> 
> 
> Regards
> Ivo
> 
> 
> 
> --- On Thu, 4/14/11, Terry Therneau <therneau_at_mayo.edu> wrote:
> 

>> From: Terry Therneau <therneau_at_mayo.edu>
>> Subject: Re: predict()
>> To: "Ivo Shterev" <idc318_at_yahoo.com>
>> Cc: r-help_at_r-project.org
>> Date: Thursday, April 14, 2011, 2:59 PM
>> --- begin included message ---
>> I am experimenting with the function predict() in two
>> versions of R and
>> the R extension package "survival".
>>
>> library(survival)
>>
>> set.seed(123)
>> testdat=data.frame(otime=rexp(10),event=rep(0:1,each=5),x=rnorm(10))
>> testfm=as.formula('Surv(otime,event)~x')
>>
>> testfun=function(dat,fm)
>> {
>>
>> predict(coxph(fm,data=dat),type='lp',newdata=dat)
>> }
>>
>> -- end inclusion ----
>>
>> The question was: this works under survival 2.35-8, but
>> not survival
>> 2.36-5
>>
>> Answer: The predict and underlying model.frame functions
>> for coxph were
>> brought into line with lm and other models. The major
>> advantage is that
>> I now deal with factors and data dependent predictors like
>> ns()
>> correctly.
>> You've shown a disadvantage of which I was not
>> aware. Using your
>> example but replacing coxph() by lm() with otime ~x as the
>> model I get a
>> similar failure. I'd like to ask a wider audience of
>> R-devel since it
>> is bigger than coxph.
>>
>> Terry Therneau
>>
>>
>>
> 
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes_at_cbs.dk  Priv: PDalgd_at_gmail.com

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Fri 15 Apr 2011 - 07:15:27 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 15 Apr 2011 - 07:20:50 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel. Please read the posting guide before posting to the list.

list of date sections of archive