From: Christophe Dutang <dutangc_at_gmail.com>

Date: Wed, 27 Apr 2011 18:53:59 +0200

Among many solutions, I generally use the following code, which avoids the ideal average individual by taking the mean of the predicted values across all observations:

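averagingpredict <- function(model, varname, varseq, type, subset=NULL) {
    # use the data stored in the fitted model (e.g. a glm), optionally a subset of its rows
    if(is.null(subset))
        mydata <- model$data
    else
        mydata <- model$data[subset, ]
    # set varname to x for every observation, then average the predictions
    f <- function(x)
    {
        mydata[, varname] <- x
        mean(predict(model, newdata=mydata, type=type), na.rm=TRUE)
    }
    sapply(varseq, f)
}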

It is time-consuming, since it calls predict once per value in varseq, but it deals with non-numeric variables.
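
As a rough sketch of how the function might be called (the Poisson model, the built-in mtcars data, and the names cars, fit, wt_seq, avg_wt and avg_gear are illustrative assumptions, not part of the original message), one could average predictions over a numeric covariate and over a factor. The function is restated so that the block runs on its own:

```r
# Restatement of the function from this message, so the example is self-contained.
averagingpredict <- function(model, varname, varseq, type, subset = NULL) {
  # model$data is stored by glm fits; other model classes may not keep it
  mydata <- if (is.null(subset)) model$data else model$data[subset, ]
  f <- function(x) {
    mydata[, varname] <- x   # every observation gets the same value of varname
    mean(predict(model, newdata = mydata, type = type), na.rm = TRUE)
  }
  sapply(varseq, f)
}

# Illustrative data and model (assumption): mtcars with gear treated as a factor
cars <- transform(mtcars, gear = factor(gear))
fit <- glm(carb ~ wt + gear, data = cars, family = poisson)

# Numeric variable: average predicted count over a grid of weights
wt_seq <- seq(2, 5, by = 0.5)
avg_wt <- averagingpredict(fit, "wt", wt_seq, type = "response")

# Non-numeric variable: average predicted count for each level of gear
avg_gear <- averagingpredict(fit, "gear", levels(cars$gear), type = "response")

print(avg_wt)
print(avg_gear)
```

With a non-identity link such as the log link here, these averages generally differ from the prediction at the mean of the covariates, and for a factor such as gear there is no mean level to plug in at all.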

Christophe

2011/4/26 Paul Johnson <pauljohn32_at_gmail.com>

> Is anybody working on a way to standardize the creation of "newdata"
> objects for predict methods?
>
> When using predict, I find it difficult/tedious to create newdata data
> frames when there are many variables. It is necessary to set all
> variables at the mean/mode/median, and then for some variables of
> interest, one has to insert values for which predictions are desired.
> I was at a presentation by Scott Long last week and he was discussing
> the increasing emphasis in Stata on calculations of marginal
> predictions and "Spost" and several other packages, and,
> co-incidentally, I had a student visit who is learning to use R MASS's
> polr (W. Venables and B. Ripley) and we wrestled for quite a while to
> try to make the same calculations that Stata makes automatically. It
> spits out predicted probabilities for each independent variable, keeping
> other variables at a reference level.
>
> I've found R packages that aim to do essentially the same thing.
>
> In Frank Harrell's Design/rms framework, he uses a "data.dist"
> function that generates an object that the user has to put into the R
> options. I think many users trip over the use of "options" there. If
> I don't use that for a month or two, I completely forget the fine
> points and have to fight with it. But it does "work" to give plots
> and predict functions the information they require.
>
> In Zelig (by Kosuke Imai, Gary King, and Olivia Lau), a function
> "setx" does the work of creating "newdata" objects. That appears to be
> about right as a candidate for a generic "newdata" function. Perhaps
> it could directly generalize to all R regression functions, but right
> now it is tailored to the models in Zelig. It has separate methods for
> the different types of models, and that is a bit confusing to me, since
> the "newdata" in one model should be the same as the newdata in
> another, I'm guessing. But his code is all there, I'll keep looking.
>
> In Effects (by John Fox), there are internal functions to create
> newdata and plot the marginal effects. If you load effects and run,
> for example, "effects:::effect.lm" you see Prof Fox has his own way of
> grabbing information from model columns and calculating predictions.
>
> I think it is time the R Core Team would look at this and tell "us" what
> is the right way to do this. I think the interface to setx in Zelig is
> pretty easy to understand, at least for numeric variables.
>
> In R's termplot function, such a thing could be put to use. As far as
> I can tell now, termplot is doing most of the work of creating a
> newdata object, but not exactly.
>
> It seems like it would be a shame to proliferate more functions that
> do the same function, when it is such a common thing.
>
> --
> Paul E. Johnson
> Professor, Political Science
> 1541 Lilac Lane, Room 504
> University of Kansas
>
> ______________________________________________
> R-devel_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

--
Christophe DUTANG
Ph. D. student at ISFA, Lyon, France

[[alternative HTML version deleted]]

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Wed 27 Apr 2011 - 16:57:23 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.2.0, at Wed 27 Apr 2011 - 19:10:52 GMT.
