[R] Using 'sapply' and 'by' in one function

From: David & Natalia <3.14david_at_gmail.com>
Date: Sun, 10 Feb 2008 08:19:47 -0500


I'm having a problem with something that I think is very simple. I'd like to use the 'sapply' and 'by' functions together in one function to get, for example, regression coefficients from multiple models by a grouping variable. I think that I'm missing something that is probably obvious to experienced users.

Here's a simple (trivial) example of what I'd like to do:

new <- data.frame(Outcome.1=rnorm(10),Outcome.2=rnorm(10),sex=rep(0:1,5),Pred=rnorm(10))
fxa <- function(x,data) { lm(x~Pred,data=data)$coef }
sapply(new[,1:2],fxa,new) # this yields coefficients for the predictor in separate models

fxb <- function(x) { lm(Outcome.1~Pred,data=x)$coef }
by(new,new$sex,fxb) # yields the coefficients for Outcome.1 for each sex

## I'd like to combine 'sapply' and 'by' to get the regression coefficients for Outcome.1 and Outcome.2 by each sex, rather than running fxb a second time predicting 'Outcome.2' or subsetting the data by sex before I run the function, but the following doesn't work:

by(new,new$sex,FUN=function(x)sapply(x[,1:2],fxa,new))
Error in model.frame.default(formula = x ~ Pred, data = data, drop.unused.levels = TRUE) :
  variable lengths differ (found for 'Pred')

##I understand the error message - the length of 'Pred' is 10 while the length of each sex group is 5, but I'm not sure how to correctly write the 'by' function to use 'sapply' inside it. Could someone please point me in the right direction? Thanks very much in advance

David S Freedman, CDC (Atlanta USA) [definitely not the well-known statistician, David A Freedman, in Berkeley]
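
The length mismatch described in the post appears to come from passing the full 10-row data frame `new` as the `data` argument while `x` is a column of the 5-row per-sex subset. A minimal sketch of one possible fix, assuming the same `new` and `fxa` as in the post: pass the subset itself to `fxa` inside the function given to `by`. The argument name `d`, the `set.seed` call, and the use of `coef()` in place of `$coef` are illustrative choices, not part of the original code.

```r
set.seed(1)  # assumption: fixed seed, added only so the sketch is reproducible

new <- data.frame(Outcome.1 = rnorm(10), Outcome.2 = rnorm(10),
                  sex = rep(0:1, 5), Pred = rnorm(10))

# Same idea as fxa in the post: regress one outcome column on Pred
fxa <- function(x, data) coef(lm(x ~ Pred, data = data))

# Pass the per-group subset `d` (not the full `new`) as the data argument,
# so that x and Pred both have the length of the current sex group
res <- by(new, new$sex, function(d) sapply(d[, 1:2], fxa, data = d))
res
```

Each element of `res` should then be a matrix with one column per outcome and one row per coefficient (intercept and slope for `Pred`), one matrix per level of `sex`.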

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Sun 10 Feb 2008 - 13:25:57 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sun 10 Feb 2008 - 14:30:13 GMT.

