> Just want to add that if you want to clean out the NA rows in a matrix
> or data frame, take a look at ?complete.cases. Can be handy to use
> with big datasets. I got curious, so I just ran the code given here
> on a big dataset, before and after removing NA rows. I have to be
> honest, this is surely an illustration of the power of rowMeans. I'm
> amazed myself.

I was too. The documentation (?rowMeans) wasn't joking:

"These functions are equivalent to use of 'apply' with 'FUN = mean' or 'FUN = sum' with appropriate margins, but are a lot faster."
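
To make the quoted equivalence concrete, here is a minimal sketch on a tiny made-up data frame. The name `small` and its values are assumptions for illustration only, not the data from the thread. It checks that `apply(..., 1, mean)` and `rowMeans()` agree, without making any claim about timings.

```r
# Hypothetical four-row example; each row has at least one reading
small <- data.frame(A = c(22.6, 102, 18.3, NA),
                    B = c(NA,   NA,  18.2, 7))

by_apply    <- apply(small, 1, mean, na.rm = TRUE)  # row-wise mean via apply
by_rowMeans <- rowMeans(small, na.rm = TRUE)        # same computation via rowMeans

# Drop any names before comparing so that only the values are checked
same <- isTRUE(all.equal(unname(by_apply), unname(by_rowMeans)))
same
```

Wrapping either call in `system.time()` on a larger frame, as in the quoted timings that follow, is one way to see how much the two differ in speed on a given machine.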

> DF <- data.frame(
>   A=rep(DF$A,10000),
>   B=rep(DF$B,10000)
> )

> > system.time(apply(DF,1,mean,na.rm=TRUE))
>    user  system elapsed
>   13.26    0.06   13.46
>
> > system.time(matrix(rowMeans(DF, na.rm=TRUE), ncol=1))
>    user  system elapsed
>    0.03    0.00    0.03
>
> > system.time(t(as.matrix(aggregate(t(as.matrix(DF)),list(rep(1:1,each=2)),mean,
> + na.rm=TRUE)[,-1]))
> + )
>
> Timing stopped at: 227.84 1.03 249.31 -- I got impatient and pressed the escape
>
> > DF <- DF[complete.cases(DF),]
>
> > system.time(apply(DF,1,mean,na.rm=TRUE))
>    user  system elapsed
>    0.39    0.00    0.39
>
> > system.time(matrix(rowMeans(DF, na.rm=TRUE), ncol=1))
>    user  system elapsed
>    0.01    0.00    0.02
>
> > system.time(t(as.matrix(aggregate(t(as.matrix(DF)),list(rep(1:1,each=2)),mean,
> + na.rm=TRUE)[,-1]))
> + )
>    user  system elapsed
>   10.01    0.07   13.40
>
> Cheers
> Joris
>
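
One caveat when reading the second set of timings: `complete.cases()` keeps only rows with no NA in any column. In the 30-row example data quoted below, only row 24 has both readings, so after replication the filtered frame should have 10,000 rows rather than 300,000. The "after" timings are therefore measured on a much smaller dataset. It also discards the single-reading rows that the original question wants to retain. A minimal sketch on a made-up frame (the name `readings` and its values are assumptions for illustration):

```r
# Hypothetical frame following the thread's pattern: the second reading is usually missing
readings <- data.frame(A = c(22.6, NA, 18.3, 8.44),
                       B = c(NA,   NA, 18.2, NA))

keep <- complete.cases(readings)  # TRUE only where every column is non-missing
kept <- readings[keep, ]          # here only the row with both readings survives

nrow(readings)  # rows before filtering
nrow(kept)      # rows after filtering
```

So `complete.cases()` is handy when rows with any missing value really should be dropped, but for this particular problem `na.rm=TRUE` on the unfiltered data is probably closer to what was asked.
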
> On Sat, Jun 26, 2010 at 1:08 AM, emorway <emorway_at_engr.colostate.edu> wrote:
>>
>> Forum,
>>
>> Using the following data:
>>
>> DF<-read.table(textConnection("A B
>> 22.60 NA
>> NA NA
>> NA NA
>> NA NA
>> NA NA
>> NA NA
>> NA NA
>> NA NA
>> 102.00 NA
>> 19.20 NA
>> 19.20 NA
>> NA NA
>> NA NA
>> NA NA
>> 11.80 NA
>> 7.62 NA
>> NA NA
>> NA NA
>> NA NA
>> NA NA
>> NA NA
>> 75.00 NA
>> NA NA
>> 18.30 18.2
>> NA NA
>> NA NA
>> 8.44 NA
>> 18.00 NA
>> NA NA
>> 12.90 NA"),header=T)
>> closeAllConnections()
>>
>> The second column is a duplicate reading of the first column, and when two
>> values are available, I would like to average column 1 and 2 (example code
>> below). But if there is only one reading, I would like to retain it, but I
>> haven't found a good way to exclude NA's using the following code:
>>
>> t(as.matrix(aggregate(t(as.matrix(DF)),list(rep(1:1,each=2)),mean)[,-1]))
>>
>> Currently, row 24 is the only row with a returned value. I'd like the
>> result to return column "A" if it is the only available value, and average
>> where possible. Of course, if both columns are NA, NA is the only possible
>> result.
>>
>> The result I'm after would look like this (row 24 is an avg):
>>
>> 22.60
>> NA
>> NA
>> NA
>> NA
>> NA
>> NA
>> NA
>> 102.00
>> 19.20
>> 19.20
>> NA
>> NA
>> NA
>> 11.80
>> 7.62
>> NA
>> NA
>> NA
>> NA
>> NA
>> 75.00
>> NA
>> 18.25
>> NA
>> NA
>> 8.44
>> 18.00
>> NA
>> 12.90
>>
>> This is a small example from a much larger data frame, so if you're
>> wondering what the deal is with list(), that will come into play for the
>> larger problem I'm trying to solve.
>>
>> Respectfully,
>> Eric
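
For the original question quoted above, one detail matters: `rowMeans(x, na.rm=TRUE)` returns NaN, not NA, for a row where every value is missing, whereas the requested output shows NA for those rows. A minimal sketch of one way to handle that, using a shortened made-up frame (the name `dup` and the choice of four rows are assumptions; the thread's `DF` could be substituted):

```r
# Hypothetical subset of the thread's pattern: single reading, no reading, two readings
dup <- data.frame(A = c(22.6, NA, 18.3, 8.44),
                  B = c(NA,   NA, 18.2, NA))

avg <- rowMeans(dup, na.rm = TRUE)  # rows with no readings come back as NaN
avg[is.nan(avg)] <- NA              # convert NaN to NA to match the requested output

result <- matrix(avg, ncol = 1)     # one-column matrix, as in the timed call above
result
```

This keeps the lone reading where only one exists, averages where both exist, and leaves NA where neither does. It does not address the grouping via `list()` mentioned for the larger problem.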
