Re: [R] "apply" question

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Tue 03 May 2005 - 01:19:59 EST

On 5/2/05, Christoph Scherber <Christoph.Scherber@uni-jena.de> wrote:
> Dear R users,
>
> I've got a simple question but somehow I can't find the solution:
>
> I have a data frame with columns 1-5 containing one set of integer
> values, and columns 6-10 containing another set of integer values.
> Columns 6-10 contain NA's at some places.
>
> I now want to calculate
> (1) the number of values in each row of columns 6-10 that were NA's

Supposing our data is called DF,

rowSums(is.na(DF[,6:10]))
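As a quick check, here is a small sketch that rebuilds the two example rows from the question and counts the NAs per row. The column names come from the example; the variable names na_count and non_na_count are only for illustration. Note that negating with !is.na() would count the non-missing values instead.

```r
# Example data from the question (two rows, ten columns)
DF <- data.frame(
  Col1 = c(1, 3), Col2 = c(2, 1), Col3 = c(5, 4), Col4 = c(2, 5), Col5 = c(3, 2),
  Col6 = c(NA, 6), Col7 = c(5, NA), Col8 = c(NA, 4), Col9 = c(1, NA), Col10 = c(4, 1)
)

# is.na() returns a logical matrix; rowSums() treats TRUE as 1,
# so this is the number of NAs in columns 6-10 of each row
na_count <- rowSums(is.na(DF[, 6:10]))

# The negated form counts the values that are NOT missing
non_na_count <- rowSums(!is.na(DF[, 6:10]))

print(na_count)
print(non_na_count)
```

For the example rows both counts add up to 5 per row, which is the number of columns inspected.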

> (2) the sum of all values on columns 1-5 for which there were no missing
> values in the corresponding cells of columns 6-10.

In the expression below, 1 + 0 * DF[,6:10] has the same shape as DF[,6:10], but every non-NA value is replaced by 1 while the NAs stay NA. Multiplying DF[,1:5] by it therefore turns each element of DF[,1:5] into NA wherever the corresponding element of DF[,6:10] is NA, and na.rm = TRUE then drops those elements from the sum.

rowSums( DF[,1:5] * (1 + 0 * DF[,6:10]), na.rm = TRUE )
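To see the masking at work, the sketch below applies it to the two example rows from the question. It also shows an equivalent, more explicit variant that sets the masked cells to NA by logical indexing. That second variant is not part of the expression above, only an alternative for comparison, and the names mask, masked_sum, x and masked_sum2 are illustrative.

```r
# Example data from the question (two rows, ten columns)
DF <- data.frame(
  Col1 = c(1, 3), Col2 = c(2, 1), Col3 = c(5, 4), Col4 = c(2, 5), Col5 = c(3, 2),
  Col6 = c(NA, 6), Col7 = c(5, NA), Col8 = c(NA, 4), Col9 = c(1, NA), Col10 = c(4, 1)
)

# Mask: 1 where columns 6-10 hold a value, NA where they hold NA
mask <- 1 + 0 * DF[, 6:10]

# Element-wise product propagates the NAs into columns 1-5;
# na.rm = TRUE then leaves them out of the row sums
masked_sum <- rowSums(DF[, 1:5] * mask, na.rm = TRUE)

# Alternative: blank out the cells directly with a logical index
x <- as.matrix(DF[, 1:5])
x[is.na(DF[, 6:10])] <- NA
masked_sum2 <- rowSums(x, na.rm = TRUE)

print(masked_sum)
print(masked_sum2)
```

For the first row this gives 2 + 2 + 3 = 7, matching the result expected in the question. One thing to keep in mind: any NAs already present in columns 1-5 are dropped by na.rm = TRUE as well.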

>
> Example: (let's call the data frame "data")
>
> Col1 Col2 Col3 Col4 Col5 Col6 Col7 Col8 Col9 Col10
> 1 2 5 2 3 NA 5 NA 1 4
> 3 1 4 5 2 6 NA 4 NA 1
>
> The result would then be (for the first row)
> (1) "There were 2 NA's in columns 6-10."
> (2) "The mean of Columns 1-5 was 2+2+3=7" (because there were NA's in the
> 1st and 3rd position in columns 6-10)

I guess you meant sum when you referred to mean in (2). If you really do want the mean, replace rowSums with rowMeans in the expression given above in the answer to (2).
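For completeness, a short sketch of the rowMeans variant on the example rows from the question. With na.rm = TRUE, rowMeans divides by the number of values that remain after masking, not by 5. The name masked_mean is illustrative.

```r
# Example data from the question (two rows, ten columns)
DF <- data.frame(
  Col1 = c(1, 3), Col2 = c(2, 1), Col3 = c(5, 4), Col4 = c(2, 5), Col5 = c(3, 2),
  Col6 = c(NA, 6), Col7 = c(5, NA), Col8 = c(NA, 4), Col9 = c(1, NA), Col10 = c(4, 1)
)

# Same masking as for the sum, but averaging the values that survive
masked_mean <- rowMeans(DF[, 1:5] * (1 + 0 * DF[, 6:10]), na.rm = TRUE)

print(masked_mean)
```

For the first row the surviving values are 2, 2 and 3, so the mean is 7/3 rather than 7/5.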



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help