From: Gabor Grothendieck <ggrothendieck_at_gmail.com>

Date: Mon 30 May 2005 - 13:42:12 EST

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Mon May 30 13:51:06 2005


On 5/29/05, McClatchie, Sam (PIRSA-SARDI)
<mcclatchie.sam@saugov.sa.gov.au> wrote:

> Background:
> OS: Linux Mandrake 10.1
> release: R 2.0.0
> editor: GNU Emacs 21.3.2
> front-end: ESS 5.2.3
> ---------------------------------
> Colleagues
>
> I am having some trouble extracting results from the function by, used to
> average variables in a data.frame first by one factor (depth) and then by a
> second factor (station). The real data.frame is quite large
> > dim(data.2001)
> [1] 32049 11
>
> Here is a snippet of code:
>
> ## bin density data for each station into 1 m depth bins, containing means
> data.2001.test$integer.Depth <- as.factor(round(data.2001.test$Depth,
> digits=0))
> attach(data.2001.test)
> binned.data.2001 <- by(data.2001.test[,5:11], list(depth=integer.Depth,
> station=Station), mean)
>
> and here is a snippet of the data.frame
>
> > dim(data.2001.test)
> [1] 150 11
> > dump("data.2001.test", file=stdout())
> data.2001.test <-
> structure(list(Cruise = structure(as.integer(c(1, 1, 1, 1, 1,

Try the following. To keep this short, let's just take a subset of rows called dd. We also drop the Station levels that are not being used, since this test only uses 2 of the 288 Station levels. The function that we apply using by returns a vector consisting of the integer.Depth, the Station and the column means of columns 5 to 10. (Asking for just the mean, as in your example, would pool all the numbers in all the columns passed to mean and give back a single grand mean rather than one mean per column.) Finally we rbind it all back together.
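As a self-contained illustration of the grand-mean versus per-column-mean distinction, here is a minimal sketch on a made-up data frame. The names toy, depth.bin, station, temp and sal and all the values are hypothetical and are not the poster's data. The grand mean is taken on as.matrix() of the columns so that the pooling is explicit, and the depth label is converted with as.numeric(as.character()) so that it is kept as a value rather than as a factor code; both are choices made for this illustration only.

```r
# Toy data (hypothetical values, not the poster's data)
toy <- data.frame(
  depth.bin = factor(c(1, 1, 2, 2, 1, 1)),
  station   = factor(c("A", "A", "A", "A", "B", "B")),
  temp      = c(10, 12, 14, 16, 20, 22),
  sal       = c(35.0, 35.2, 35.4, 35.6, 36.0, 36.2)
)

# Grand mean: every number in both columns is pooled into a single value
grand <- mean(as.matrix(toy[, c("temp", "sal")]))

# One mean per column within each depth.bin x station cell
cells <- by(toy, list(toy$depth.bin, toy$station), function(x)
  c(depth.bin = as.numeric(as.character(x$depth.bin[1])),
    colMeans(x[, c("temp", "sal")])))

# Cells with no rows (here depth.bin 2 at station B) are NULL,
# and rbind simply skips them
binned <- do.call("rbind", cells)
print(grand)
print(binned)
```

The result is one row per non-empty depth/station cell with a separate mean for each column, whereas grand is a single number that mixes temperature and salinity values together.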

> # data.2001.test is your data frame including the integer.Depth column
> dd <- data.2001.test[50:60,]
> dd$Station <- dd$Station[drop = TRUE]
> dd.bin <- by(dd, list(dd$integer.Depth, dd$Station), function(x)
+ c(integer.Depth = x$integer.Depth[1], Station = x$Station[1],
+ colMeans(x[,5:10])))
> do.call("rbind", dd.bin)

     integer.Depth Station    Depth Temperature.oC Salinity Fluoresence.Volts
[1,]            20       1 23.90167       17.67420 35.47650          1.107433
[2,]            21       1 24.75350       17.33355 35.59050          1.060400
[3,]             1       2  5.19000       19.61510 35.54870          0.726500
[4,]             2       2  5.82950       19.61305 35.55025          0.719200
[5,]             3       2  6.81250       19.61300 35.58345          0.741150
[6,]             4       2  7.55000       19.61180 35.60460          0.754600
     Density.kg.m3 Brunt.Vaisala.Freq.cycl.h
[1,]      25.82400                 -5.095467
[2,]      25.99820                 16.030975
[3,]      25.30560                 -6.261240
[4,]      25.31015                  4.051561
[5,]      25.33985                  8.893225
[6,]      25.35960                 -8.167610

______________________________________________
R-help@stat.math.ethz.ch mailing list


This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:32:15 EST