[R] (Newbie) Aggregate for NA values

From: Vivek Satsangi <vivek.satsangi_at_gmail.com>
Date: Sat 25 Feb 2006 - 02:16:29 EST


Folks,

Sorry if this question has been answered before or is obvious (or worse, statistically "bad"). I found one search result that seems somewhat related, but I did not understand what it said.

I use aggregate to get a quick summary of the data. Part of what I want from the summary is a sense of how much influence the NAs might have had if they were included, and whether excluding them from the means introduces some sort of bias. So I also want the summary statistic for the NA group.

Here is a simple example session (edited to remove the typos I made, comments added later):

> tmp_a <- 1:10
> tmp_b <- rep(1:5,2)
> tmp_c <- rep(1:2,5)
> tmp_d <- c(1,1,1,2,2,2,3,3,3,4)
> tmp_df <- data.frame(tmp_a,tmp_b,tmp_c,tmp_d);
> tmp_df$tmp_c[9:10] <- NA ;
> tmp_df
   tmp_a tmp_b tmp_c tmp_d
1      1     1     1     1
2      2     2     2     1
3      3     3     1     1
4      4     4     2     2
5      5     5     1     2
6      6     1     2     2
7      7     2     1     3
8      8     3     2     3
9      9     4    NA     3
10    10     5    NA     4

> aggregate(tmp_df$tmp_d,by=list(tmp_df$tmp_b,tmp_df$tmp_c),mean);
  Group.1 Group.2 x
1       1       1 1
2       2       1 3
3       3       1 1
4       5       1 2
5       1       2 2
6       2       2 1
7       3       2 3
8       4       2 2

# Only one row for each (tmp_b, tmp_c) combination; the rows where tmp_c is NA are dropped.

> aggregate(tmp_df$tmp_d,by=list(tmp_df$tmp_c),mean);
  Group.1 x
1       1 1.75
2       2 2.00

What I want from this last aggregate call is an additional row giving the mean of the tmp_d values whose tmp_c is NA. Similarly, perhaps there is a way to make the second-to-last call to aggregate also return rows for the observations where tmp_c is NA.

How can I achieve this?
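
One possible approach, offered only as a sketch: give the missing values an explicit text label before grouping, so that aggregate sees them as an ordinary group instead of dropping them. The helper column tmp_c_lab and the label "NA" are names made up for this illustration; they are not part of the session above, and this assumes it is acceptable to treat NA as a group of its own.

```r
# Rebuild the example data from the session above.
tmp_b <- rep(1:5, 2)
tmp_c <- rep(1:2, 5)
tmp_d <- c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4)
tmp_df <- data.frame(tmp_b, tmp_c, tmp_d)
tmp_df$tmp_c[9:10] <- NA

# Hypothetical helper column: tmp_c as text, with missing values
# recoded to the explicit label "NA" so they form their own group.
tmp_df$tmp_c_lab <- ifelse(is.na(tmp_df$tmp_c), "NA",
                           as.character(tmp_df$tmp_c))

# Mean of tmp_d per tmp_c group, now including the NA group.
by_c <- aggregate(tmp_df$tmp_d,
                  by = list(tmp_c = tmp_df$tmp_c_lab),
                  FUN = mean)

# Mean of tmp_d per (tmp_b, tmp_c) combination, NA rows kept.
by_bc <- aggregate(tmp_df$tmp_d,
                   by = list(tmp_b = tmp_df$tmp_b,
                             tmp_c = tmp_df$tmp_c_lab),
                   FUN = mean)

print(by_c)
print(by_bc)
```

With the data above, the "NA" group in by_c has mean 3.5 (from the tmp_d values 3 and 4), against 1.75 and 2.00 for the other two groups, and by_bc keeps all ten (tmp_b, tmp_c) combinations instead of eight.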

--
-- Vivek Satsangi
Student, Rochester, NY USA

Received on Sat Feb 25 02:32:37 2006
