From: Marc Schwartz <MSchwartz_at_MedAnalytics.com>

Date: Fri 22 Apr 2005 - 02:11:33 EST

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Fri Apr 22 02:16:26 2005

On Thu, 2005-04-21 at 16:31 +0100, jose silva wrote:

> I know this question is very simple, but I cannot figure it out

> test<- data.frame(year=c(2000,2000,2001,2001),x=c(54,41,90,15), y=c(29,2,92,22), z=c(26,68,46,51))

> > test

> I want to sum the vectors x, y and z within each year (2000 and 2001) to obtain this:

> I tried tapply but it did not work (or I am probably doing it wrong)

> Any suggestions?

tapply() is typically used against a single vector, subsetting by one or more factors.
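To make that concrete, here is a small sketch using the data frame from the question. The original tapply() call was not shown, so this only illustrates the single-vector usage described above; the names x.sums and all.sums are made up for the example. Wrapping tapply() in sapply() is one possible way to repeat it over several columns.

```r
# Data frame from the original question
test <- data.frame(year = c(2000, 2000, 2001, 2001),
                   x = c(54, 41, 90, 15),
                   y = c(29, 2, 92, 22),
                   z = c(26, 68, 46, 51))

# tapply() on a single vector, grouped by year
x.sums <- tapply(test$x, test$year, sum)

# Repeating the same call for each non-year column;
# sapply() simplifies the results to a matrix (rows = years, columns = variables)
all.sums <- sapply(test[, -1], function(col) tapply(col, test$year, sum))
```

Here x.sums holds one sum per year for x only, while all.sums collects the sums for x, y and z side by side.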

In this case, since you want to get the colSums for more than one column in the data frame, there are a few options:

1. Use by():

> by(test[, -1], test$year, colSums)
test$year: 2000
 x  y  z
95 31 94

test$year: 2001
  x   y   z
105 114  97

2. Use aggregate():

> aggregate(test[, -1], list(Year = test$year), sum)
  Year   x   y  z
1 2000  95  31 94
2 2001 105 114 97

3. Use split() and then lapply():

> test.s <- split(test, test$year)

> test.s
$"2000"
  year  x  y  z
1 2000 54 29 26
2 2000 41  2 68

$"2001"
  year  x  y  z
3 2001 90 92 46
4 2001 15 22 51

> lapply(test.s, function(x) colSums(x[, -1]))
$"2000"
 x  y  z
95 31 94

$"2001"
  x   y   z
105 114  97

Which you choose may depend upon how you need the output structured for subsequent use.
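As a rough illustration of that point, the sketch below checks what kind of object each of the three approaches returns for the same data. The object names (res.by, res.agg, res.lst, res.mat) are made up for the example, and binding the list into a matrix with do.call(rbind, ...) is only one assumed follow-up step, not something the question asked for.

```r
# Data frame from the original question
test <- data.frame(year = c(2000, 2000, 2001, 2001),
                   x = c(54, 41, 90, 15),
                   y = c(29, 2, 92, 22),
                   z = c(26, 68, 46, 51))

# by() returns an object of class "by", indexed by year
res.by <- by(test[, -1], test$year, colSums)

# aggregate() returns a data frame with one row per year
res.agg <- aggregate(test[, -1], list(Year = test$year), sum)

# split() plus lapply() returns a named list of numeric vectors
res.lst <- lapply(split(test, test$year), function(x) colSums(x[, -1]))

class(res.by)
class(res.agg)
class(res.lst)

# If a matrix is more convenient, the list elements can be stacked by row
res.mat <- do.call(rbind, res.lst)
```

The data frame from aggregate() is probably the easiest to merge or plot later, while the list form is handy if each year is processed separately.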

See ?by, ?aggregate, ?lapply and ?split for more information.

HTH,
Marc Schwartz

This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:31:21 EST