On Fri, 2006-02-24 at 08:18 -0800, Matt Crawford wrote:

> I am having trouble doing the following. I have a data.frame like
> this, where x and y are variables that I want to do calculations on:
>
> Name Year x y
> ab 2001 15 3
> ab 2001 10 2
> ab 2002 12 8
> ab 2003 7 10
> dv 2002 10 15
> dv 2002 3 2
> dv 2003 1 15
>
> Before I do all the other things I need to do with this data, I need
> to summarize or collapse the data by name and year. I've found that I
> can do things like
> nameyear<-interaction(name,year)
> dataframe$nameyear<-nameyear
> tapply(dataframe$x,dataframe$nameyear,sum)
> tapply(dataframe$y,dataframe$nameyear,sum)
> and then bind those together.
>
> But my problem is that I need to somehow retain the original Names in
> my collapsed dataset, so that later I can do analyses with the Name
> factors. All I can think of is something like
> tapply(dataframe$Name,dataframe$nameyear, somefunction?)
> but nothing seems to work.
>
> I'm actually trying to convert a SAS program, and I can't get out of
> that mindset. There, it's a simple Proc Means, By Name Year.
>
> Thanks for any help or suggestions on the right way to go about this.
>
> Matt Crawford

Matt,

Just use aggregate():

> aggregate(MyDF[, 3:4], list(Name = MyDF$Name, Year = MyDF$Year), sum)

  Name Year  x  y
1   ab 2001 25  5
2   ab 2002 12  8
3   dv 2002 13 17
4   ab 2003  7 10
5   dv 2003  1 15

See ?aggregate for more information.
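
For anyone who wants to reproduce the result, here is a minimal self-contained sketch. It rebuilds the data frame from the numbers posted above; the name MyDF and the explicit factor() call are assumptions made for illustration, since the original post does not show how the data was read in. The second call uses the formula interface of aggregate(), which is only available in more recent versions of R and should give the same sums.

```r
# Rebuild the example data from the original post (assumed layout).
MyDF <- data.frame(
  Name = factor(c("ab", "ab", "ab", "ab", "dv", "dv", "dv")),
  Year = c(2001, 2001, 2002, 2003, 2002, 2002, 2003),
  x    = c(15, 10, 12, 7, 10, 3, 1),
  y    = c(3, 2, 8, 10, 15, 2, 15)
)

# Sum x and y within each Name/Year combination.
# The names given in the 'by' list become the grouping columns of the
# result, so Name stays available for later analyses.
totals <- aggregate(MyDF[, c("x", "y")],
                    by  = list(Name = MyDF$Name, Year = MyDF$Year),
                    FUN = sum)
print(totals)

# Same idea with the formula interface (newer R versions only).
totals_f <- aggregate(cbind(x, y) ~ Name + Year, data = MyDF, FUN = sum)
print(totals_f)
```

Selecting the columns by name rather than by position (3:4) is just a precaution in case the column order changes.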

Marc Schwartz

