Re: [R] aggregate() function and na.rm = TRUE

From: Daniel Malter <daniel_at_umd.edu>
Date: Tue, 08 Jul 2008 17:59:57 -0400


This probably happens because you have "empty" groups: in your example, ALL rows with Hour=0 have Y2=NA. The following example may illustrate the point. The first two aggregate commands are run on data that contain NAs, but the NAs do not coincide exactly with any level of the variables you are grouping by. The second pair of commands reproduces the behavior in your example.

x1=rep(c(0,1),each=48)
x2=rep(c(0,1),48)
x1=c(x1,NA,NA,NA,NA)
x2=c(NA,NA,NA,NA,x2)
x3=rnorm(100,0,1)
x3=ifelse(x1==1,NA,x3) ##All x3=NA if x1=1
y=rnorm(100,0,1)
y=sort(y)

aggregate(y,by=list(x1,x2),FUN=mean)
aggregate(y,by=list(x1,x2),FUN=sd)

aggregate(list(y,x3),by=list(x1,x2),FUN=mean)
aggregate(list(y,x3),by=list(x1,x2),FUN=sd)
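
One possible workaround, offered as a sketch rather than a definitive fix: pass aggregate() a small wrapper that drops the NAs itself and returns NA when fewer than two values remain in a group. The name safe_sd is made up for this illustration and is not a base R function; the data frame is the one from the original post.

```r
# safe_sd is a hypothetical helper (not part of base R): it removes
# missing values and returns NA instead of calling sd() on a group
# that has fewer than two non-missing observations.
safe_sd <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) < 2) return(NA_real_)
  sd(x)
}

# Data from the original post: every Hour = 0 row has Y2 = NA.
dat <- data.frame(
  Hour = c(0, 0, 0, 0, 1, 1, 1, 1),
  Drug = factor(c("P", "D", "P", "D", "P", "D", "P", "D")),
  Y1   = rnorm(8, 0),
  Y2   = c(NA, NA, NA, NA, 1, 2, 3, 4)
)

# Same call as in the original post, with safe_sd in place of sd.
res <- aggregate(dat[c(3, 4)], dat[c(1, 2)], safe_sd)
print(res)
```

With this wrapper the all-NA groups (Hour = 0) should come back as NA in the Y2 column, while the Hour = 1 groups get an ordinary standard deviation.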

Best,
Daniel



cuncta stricte discussurus

-----Original Message-----
From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org] On Behalf Of David Afshartous
Sent: Tuesday, July 08, 2008 4:57 PM
To: r-help_at_r-project.org
Subject: [R] aggregate() function and na.rm = TRUE

All,

I've been using aggregate() to compute means and standard deviations at time/treatment combinations for a longitudinal dataset, using na.rm = TRUE for missing data.

This was working fine before, but now when I re-run some old code it isn't. I've backtracked my steps and can't seem to find out why it was working before but not now. In any event, below is a reproducible example of the current problem, viz., calculating the standard deviation via aggregate and employing na.rm = TRUE is not working.

Thanks,
David

dat = data.frame(
  Hour = c(0, 0, 0, 0, 1, 1, 1, 1),
  Drug = factor(c("P", "D", "P", "D", "P", "D", "P", "D")),
  Y1 = rnorm(8, 0),
  Y2 = c(NA, NA, NA, NA, 1, 2, 3, 4)
)

> aggregate(dat[c(3,4)], dat[c(1,2)], mean)
  Hour Drug          Y1 Y2
1    0    D -0.75534554 NA
2    1    D  0.27529835  3
3    0    P -0.03949923 NA
4    1    P  0.02627489  2

> aggregate(dat[c(3,4)], dat[c(1,2)], sd)
Error in var(x, na.rm = na.rm) : missing observations in cov/cor
> aggregate(dat[c(3,4)], dat[c(1,2)], sd, na.rm = TRUE)
Error in var(x, na.rm = na.rm) : no complete element pairs

> sessionInfo()

R version 2.7.1 (2008-06-23)
i386-apple-darwin8.10.1

locale:
en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] grid_2.7.1 lattice_0.17-8 nlme_3.1-89
>



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Tue 08 Jul 2008 - 22:05:06 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 09 Jul 2008 - 15:31:17 GMT.
