# Re: [R] Data Manipulations - Group By equivalent

From: Wensui Liu <liuwensui_at_gmail.com>
Date: Sun 02 Jul 2006 - 12:39:08 EST

Zubin,

Here is a piece of R code copied from my blog, shown side by side with the equivalent SAS code. You might need to tweak it a little to get what you need.

CALCULATE GROUP SUMMARY IN R

```
##################################################
# HOW TO CALCULATE GROUP SUMMARY IN R
# DATE : DEC-13, 2005
##################################################
# EQUIVALENT SAS CODE:
#
# DATA DATA;
#   DO I = 1 TO 2;
#     DO J = 1 TO 4;
#       GROUP = 'TREATMENT_'||PUT(I, 1.);
#       X = RANNOR(1);
#       OUTPUT;
#     END;
#   END;
#   KEEP GROUP X;
# RUN;
#
# PROC SQL;
#   CREATE TABLE COMBINE AS
#   SELECT *, MEAN(X) AS MEAN_X, SUM(X) AS SUM_X
#   FROM DATA
#   GROUP BY GROUP;
# QUIT;
##################################################

# GENERATE A TREATMENT GROUP
group <- as.factor(paste("treatment", rep(1:2, 4), sep = "_"))

# CREATE A SERIES OF RANDOM VALUES
x <- rnorm(length(group))

# CREATE A DATA FRAME TO COMBINE THE ABOVE TWO
data <- data.frame(group, x)

# CALCULATE SUMMARY FOR X (ONE VALUE PER GROUP)
x.mean <- tapply(data$x, data$group, mean, na.rm = TRUE)
x.sum <- tapply(data$x, data$group, sum, na.rm = TRUE)

# CREATE A DATA FRAME TO COMBINE SUMMARIES
summ <- data.frame(x.mean, x.sum, group = names(x.mean))

# COMBINE DATA AND SUMMARIES TOGETHER
combine <- merge(data, summ, by = "group")
```
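A more compact variant, offered only as a sketch: `ave()` returns a group-level statistic aligned to the original rows, so the `tapply()` and `merge()` steps can be collapsed into one call per summary. The `set.seed()` call and the column names `mean_x` and `sum_x` are additions for illustration, chosen to mirror the PROC SQL output.

```r
# Sketch: attach the group mean and sum to every row without merge().
set.seed(1)  # assumption: fixed seed so the run is reproducible
group <- as.factor(paste("treatment", rep(1:2, 4), sep = "_"))
x <- rnorm(length(group))
data <- data.frame(group, x)

# ave() applies FUN within each level of group and returns a vector
# of the same length as x, in the original row order.
data$mean_x <- ave(data$x, data$group, FUN = mean)
data$sum_x <- ave(data$x, data$group, FUN = sum)

print(data)
```

Unlike `merge()`, which sorts the result by the key column, this keeps the rows in their original order.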

On 7/1/06, zubin <binabina@bellsouth.net> wrote:

```
>
> Hello, a beginner R user - boy i wish there was a book on just data
> manipulations for SAS users learning R (equivalent to the SAS DATA
> STEP)..  Okay, my question:
>
> I have a panel data set, hotel data occupancy by month for 12 months,
> 1000 hotels.  I have a field labeled 'year' and want to consolidate the
> monthly records using an average into 1000 occupancy numbers - just a
> simple average of the 12 months by hotel.  In SQL this operation is
> pretty easy, a group by query (group by hotel where year = 2005, avg
> occupancy) - how is this done in R? (in R language not SQL).  Thx!
>
> -zubin
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> http://www.R-project.org/posting-guide.html
>

```
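For the hotel question quoted above, a minimal sketch could look like the following. The data frame `hotels` and its columns `hotel`, `month`, `year`, and `occupancy` are made-up names, and the values are simulated (3 hotels rather than 1000), so they would need to be adapted to the real data.

```r
# Hypothetical panel: 3 hotels x 12 months x 2 years (simulated values).
set.seed(42)
hotels <- expand.grid(hotel = paste("hotel", 1:3, sep = "_"),
                      month = 1:12,
                      year = c(2004, 2005))
hotels$occupancy <- runif(nrow(hotels), min = 0.4, max = 0.95)

# WHERE year = 2005
h2005 <- subset(hotels, year == 2005)

# SELECT hotel, AVG(occupancy) ... GROUP BY hotel
avg_occ <- aggregate(occupancy ~ hotel, data = h2005, FUN = mean)
print(avg_occ)

# The same averages as a named vector, using tapply()
tapply(h2005$occupancy, h2005$hotel, mean)
```

`aggregate()` returns a data frame with one row per hotel, which is usually the more convenient form if the averages are to be merged back or analysed further.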
```
--
WenSui Liu
(http://spaces.msn.com/statcompute/blog)
Senior Decision Support Analyst
Health Policy and Clinical Effectiveness
Cincinnati Children Hospital Medical Center
```
