Re: [R] Using 'sapply' and 'by' in one function

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Sun, 10 Feb 2008 09:25:43 -0500

Actually thinking about this, not only do you not need sapply but you don't even need by:

new2 <- transform(new, sex = factor(sex))
coef(lm(as.matrix(new2[1:2]) ~ sex/Pred - 1, new2))
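
As a quick check, the sketch below compares the single nested-formula fit with the per-group fits from by(). It is a minimal illustration, not part of the original exchange: the seed and the object names by_fit and one_fit are assumptions added here, and the data only mirror the structure of 'new' from the original post.

```r
# Hypothetical data with the same layout as 'new' in the original post;
# the seed is an assumption added only for reproducibility.
set.seed(1)
new <- data.frame(Outcome.1 = rnorm(10), Outcome.2 = rnorm(10),
                  sex = rep(0:1, 5), Pred = rnorm(10))

# Per-group fits: one multi-response lm() for each level of sex.
by_fit <- by(new, new$sex, function(x) coef(lm(as.matrix(x[1:2]) ~ Pred, x)))

# Single fit: 'sex/Pred - 1' expands to 'sex + sex:Pred - 1', which gives
# a separate intercept and a separate Pred slope for each level of sex.
new2 <- transform(new, sex = factor(sex))
one_fit <- coef(lm(as.matrix(new2[1:2]) ~ sex/Pred - 1, new2))

# Rows of one_fit are sex0, sex1 (intercepts) and sex0:Pred, sex1:Pred (slopes).
# Each pair should match the (Intercept) and Pred rows of the per-group fit.
stopifnot(isTRUE(all.equal(unname(one_fit[c("sex0", "sex0:Pred"), ]),
                           unname(by_fit[[1]]))))
stopifnot(isTRUE(all.equal(unname(one_fit[c("sex1", "sex1:Pred"), ]),
                           unname(by_fit[[2]]))))
```

The coefficients agree because the nested design matrix is block diagonal across the two sexes, so each block is estimated from that group's rows only. Residual standard errors would differ, since the single fit pools the residual variance across groups.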

On Feb 10, 2008 8:43 AM, Gabor Grothendieck <ggrothendieck_at_gmail.com> wrote:
> By passing new to fxa via the second argument of fxa, new is not being
> subsetted hence the error. Try this:
>
> by(new, new$sex, function(x) sapply(x[1:2], function(y) coef(lm(y ~ Pred, x))))
>
> Actually, you can do the above without sapply as lm can take a matrix
> for the dependent variable:
>
> by(new, new$sex, function(x) coef(lm(as.matrix(x[1:2]) ~ Pred, x)))
>
>
> On Feb 10, 2008 8:19 AM, David & Natalia <3.14david_at_gmail.com> wrote:
> > Greetings,
> >
> > I'm having a problem with something that I think is very simple - I'd
> > like to be able to use the 'sapply' and 'by' functions in 1 function
> > to be able (for example) to get regression coefficients from multiple
> > models by a grouping variable. I think that I'm missing something
> > that is probably obvious to experienced users.
> >
> > Here's a simple (trivial) example of what I'd like to do:
> >
> > new <- data.frame(Outcome.1=rnorm(10),Outcome.2=rnorm(10),sex=rep(0:1,5),Pred=rnorm(10))
> > fxa <- function(x,data) { lm(x~Pred,data=data)$coef }
> > sapply(new[,1:2],fxa,new) # this yields coefficients for the predictor in separate models
> >
> > fxb <- function(x) {lm(Outcome.1~Pred,data=x)$coef}
> > by(new,new$sex,fxb) #yields the coefficient for Outcome.1 for each sex
> >
> > ## I'd like to be able to combine 'sapply' and 'by' to be able to get
> > the regression coefficients for Outcome.1 and Outcome.2 by each sex,
> > rather than running fxb a second time predicting 'Outcome.2' or by
> > subsetting the data - by sex - before I run the function, but the
> > following doesn't work -
> >
> > by(new,new$sex,FUN=function(x)sapply(x[,1:2],fxa,new))
> > 'Error in model.frame.default(formula = x ~ Pred, data = data,
> > drop.unused.levels = TRUE) :
> > variable lengths differ (found for 'Pred')'
> >
> > ##I understand the error message - the length of 'Pred' is 10 while
> > the length of each sex group is 5, but I'm not sure how to correctly
> > write the 'by' function to use 'sapply' inside it. Could someone
> > please point me in the right direction? Thanks very much in advance
> >
> > David S Freedman, CDC (Atlanta USA) [definitely not the well-known
> > statistician, David A Freedman, in Berkeley]
> >
> > ______________________________________________
> > R-help_at_r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>



Received on Sun 10 Feb 2008 - 14:29:24 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sun 10 Feb 2008 - 15:30:14 GMT.
