Re: [R] Tapply for Group Specific Means and Proportions

From: jim holtman <jholtman_at_gmail.com>
Date: Mon, 03 Mar 2008 18:37:20 -0500

Here is how you can get the proportions from your data frame (called 'x' here):

> prop.table(table(paste(x$testdate, x$testtime), x$Behavior),margin=1)

                       EA         FL         HO         MA         OS         PE         SI
  28Mar96 1014 0.00000000 0.00000000 0.50000000 0.00000000 0.00000000 0.50000000 0.00000000
  28Mar96 752  0.00000000 0.00000000 0.00000000 0.00000000 0.25000000 0.75000000 0.00000000
  28Mar96 924  0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.83333333 0.16666667
  28Mar96 954  0.00000000 0.00000000 0.50000000 0.00000000 0.00000000 0.50000000 0.00000000
  29Mar96 835  0.03225806 0.09677419 0.16129032 0.03225806 0.00000000 0.67741935 0.00000000
>
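The original post also asked for the mean height alongside the proportions, one row per date-time combination. The following is a minimal sketch of one way to combine the two summaries; it is not part of the original reply. It assumes the data frame is named Final as in the question and rebuilds only the first 14 rows of the posted data (a single date) so that it runs on its own. The names grp, props, mean_ht and summary_df are illustrative.

```r
# Stand-in for the poster's data frame `Final`: first 14 rows of the posted
# data, reduced to the columns used below (an assumption for illustration).
Final <- data.frame(
  testdate = factor(rep("28Mar96", 14)),
  testtime = factor(rep(c("0752", "0924", "0954", "1014"),
                        times = c(4, 6, 2, 2))),
  Behavior = factor(c("PE", "OS", "PE", "PE",
                      "PE", "PE", "PE", "SI", "PE", "PE",
                      "HO", "PE",
                      "PE", "HO")),
  TreeHt   = c(6, 6, 6, 6, 8, 8, 7, 7, 7, 7, 10, 10, 7, 7)
)

# One grouping factor per date-time combination; drop = TRUE discards
# combinations that never occur in the data.
grp <- interaction(Final$testdate, Final$testtime, drop = TRUE)

# Row-wise proportions of each Behavior within a date-time group
# (margin = 1 makes every row sum to 1).
props <- prop.table(table(grp, Final$Behavior), margin = 1)

# Mean height per group; tapply and table both follow the factor-level
# order of grp, so the rows line up.
mean_ht <- tapply(Final$TreeHt, grp, mean, na.rm = TRUE)

# as.data.frame.matrix keeps the wide layout (one column per Behavior).
summary_df <- data.frame(MeanHt = as.vector(mean_ht),
                         as.data.frame.matrix(props),
                         row.names = names(mean_ht))
print(summary_df)
```

Building the group with interaction() on the factor columns, as in the original tapply call, keeps row labels such as 28Mar96.0752 with their leading zero, provided testtime is stored as a factor or character rather than a number.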

On Mon, Mar 3, 2008 at 5:27 PM, Bret Collier <bacollier_at_ag.tamu.edu> wrote:
> UseRs,
>
> I am working on a dataset (see small example below) where individuals
> were followed on a specific date-time combo and multiple repeated
> measurements were taken (e.g., height in meters, behavior class in 2
> letter code). The number of observations varied between individuals
> (ranging from 1 observation per date-time combo to >50).
>
> I am trying to summarize the data into 1 row per individual-date-time
> combination. I used tapply to pull mean height (TreeHt) out for each
> date-time combo. However, all my attempts to get the proportion of
> times a specific behavior category occurs within the same date-time
> combo have failed thus far; I have tried tapply, aggregate, table
> (because Behavior is a factor), etc. Most likely I did not search for
> the right word combination in the help archives.
>
> If anyone can point me in the right direction toward streamlining my
> code to output the summaries along these general lines (column headers
> being the Behavior categories, 0.xx being the proportion per date-time)
> I would appreciate it:
>
> Date-Time MeanHt PE OS SI ...
> 28Mar96.0752 6.000000 0.xx 0.xx 0.xx ...
> 28Mar96.1014 7.000000 0.xx 0.xx 0.xx ...
>
>
> TIA,
> Bret (R 2.6.1 on I386-pc-mingw32)
> Texas A&M
>
> > Final
> Sequence testdate testtime Behavior Substrate TreeHt
> 1 1 28Mar96 0752 PE TW 6
> 2 2 28Mar96 0752 OS <NA> 6
> 3 3 28Mar96 0752 PE TW 6
> 4 4 28Mar96 0752 PE TW 6
> 5 1 28Mar96 0924 PE TW 8
> 6 2 28Mar96 0924 PE BR 8
> 7 3 28Mar96 0924 PE TW 7
> 8 4 28Mar96 0924 SI TW 7
> 9 5 28Mar96 0924 PE TW 7
> 10 6 28Mar96 0924 PE TW 7
> 11 1 28Mar96 0954 HO BR 10
> 12 2 28Mar96 0954 PE BR 10
> 13 1 28Mar96 1014 PE TW 7
> 14 2 28Mar96 1014 HO TW 7
> 15 1 29Mar96 0835 PE TW 4
> 16 2 29Mar96 0835 EA BR 4
> 17 3 29Mar96 0835 MA BR 4
> 18 4 29Mar96 0835 PE TW 5
> 19 5 29Mar96 0835 PE TW 5
> 20 6 29Mar96 0835 PE TW 13
> 21 7 29Mar96 0835 PE TW 13
> 22 8 29Mar96 0835 PE TW 13
> 23 9 29Mar96 0835 PE BR 13
> 24 10 29Mar96 0835 PE TW 13
> 25 11 29Mar96 0835 HO TW 12
> 26 12 29Mar96 0835 HO TW 12
> 27 13 29Mar96 0835 HO TW 12
> 28 14 29Mar96 0835 HO TW 12
> 29 15 29Mar96 0835 PE TW 13
> 30 16 29Mar96 0835 PE TR 13
> 31 17 29Mar96 0835 FL <NA> NA
> 32 18 29Mar96 0835 PE BR 12
> 33 19 29Mar96 0835 FL <NA> NA
> 34 20 29Mar96 0835 PE TW 13
> 35 21 29Mar96 0835 PE TW 13
> 36 22 29Mar96 0835 FL <NA> NA
> 37 23 29Mar96 0835 HO TW 4
> 38 24 29Mar96 0835 PE BR 5
> 39 25 29Mar96 0835 PE BR 5
> 40 26 29Mar96 0835 PE BR 5
> 41 27 29Mar96 0835 PE TW 4
> 42 28 29Mar96 0835 PE TW 5
> 43 29 29Mar96 0835 PE TW 5
> 44 30 29Mar96 0835 PE TW 13
> 45 31 29Mar96 0835 PE TW 5
> > str(Final)
> 'data.frame': 45 obs. of 6 variables:
> $ Sequence : num 1 2 3 4 1 2 3 4 5 6 ...
> $ testdate : Factor w/ 2 levels "28Mar96","29Mar96": 1 1 1 1 1 1 1 1 1 1 ...
> $ testtime : Factor w/ 5 levels "0752","0835",..: 1 1 1 1 3 3 3 3 3 3 ...
> $ Behavior : Factor w/ 7 levels "EA","FL","HO",..: 6 5 6 6 6 6 6 7 6 6 ...
> $ Substrate: Factor w/ 3 levels "BR","TR","TW": 3 NA 3 3 3 1 3 3 3 3 ...
> $ TreeHt : num 6 6 6 6 8 8 7 7 7 7 ...
> > test<-sort((tapply(Final$TreeHt, INDEX=interaction(Final$testdate,
> Final$testtime), FUN=mean, na.rm=TRUE)))
> > data.frame(test)
> test
> 28Mar96.0752 6.000000
> 28Mar96.1014 7.000000
> 28Mar96.0924 7.333333
> 29Mar96.0835 8.928571
> 28Mar96.0954 10.000000
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

Received on Mon 03 Mar 2008 - 23:55:26 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 04 Mar 2008 - 00:30:18 GMT.