[R] Calculation of group summaries

From: <Seeliger.Curt_at_epamail.epa.gov>
Date: Wed 13 Jul 2005 - 03:51:03 EST


I know R has a steep learning curve, but from where I stand the slope looks like a sheer cliff. I'm pawing through the available docs and have come across examples which come close to what I want but are proving difficult for me to modify for my use.

Calculating simple group means is fairly straightforward:

  data(PlantGrowth)
  attach(PlantGrowth)
  stack(mean(unstack(PlantGrowth)))

I'd like to do something slightly more complex, using a data frame and groups identified by unique combinations of three id variables. There may be thousands of such combinations in the data. This is easy in SQL:

  select year,
         site_id,
         visit_no,
         mean(undercut) AS meanUndercut,
         count(undercut) AS nUndercut,
         std(undercut) AS stdUndercut
  from channelMorphology
  group by year, site_id, visit_no
  ;

Reading a CSV written by SAS and selecting only the records expected to have values is also straightforward in R, but getting those summary values for each site visit is currently beyond me:

  sub<-read.csv('c:/data/channelMorphology.csv'
               ,header=TRUE
               ,na.strings='.'
               ,sep=','
               ,strip.white=TRUE
               )

  undercut<-subset(sub
                  ,TRANSDIR %in% c('LF','RT')
                  ,select=c('YEAR','SITE_ID','VISIT_NO','TRANSECT','TRANSDIR'
                           ,'UNDERCUT'
                           )
                  ,drop=TRUE
                  )


Thanks all for your help.
cur

--
Curt Seeliger, Data Ranger
CSC, EPA/WED contractor
541/754-4638
seeliger.curt@epa.gov

Received on Wed Jul 13 03:56:20 2005
