Re: [R] Stymied by plyr

From: Dennis Murphy <djmuser_at_gmail.com>
Date: Thu, 21 Apr 2011 00:44:56 -0700

Hi:

The example below uses both the plyr and reshape2 packages. I'm presuming that you expect the table proportions to comprise columns of the output data frame - to do that, we'll use the dcast() function from reshape2 (the counterpart of cast() in the original reshape package).

Writing functions to pass into a plyr function needs to be done with a bit of care. When writing a function for ddply(), the output of the function must be one of (a) a scalar; (b) a named vector; or (c) a data frame. A simplified version of your problem is illustrated below; the ratings variable is coerced to a factor with a fixed set of levels so that the output of the (relative) frequency table has the same length in each subgroup. This will matter when it comes to reshaping the data frame.
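To make the point about fixed factor levels concrete, here is a minimal base R sketch. The two small data frames and the helper name rel_freq() are made up for illustration; they are not part of the original problem.

```r
# Hypothetical sub-data frames: neither subgroup uses all five ratings
piece1 <- data.frame(rating = c(1, 1, 3, 5))
piece2 <- data.frame(rating = c(2, 2, 4, 4))

# Without fixed levels, each table only has cells for the values it saw,
# so the two subgroups produce tables of different lengths
raw1 <- table(piece1$rating)
raw2 <- table(piece2$rating)
length(raw1)
length(raw2)

# Hypothetical helper: coerce to a factor with a fixed set of levels first,
# then return the relative frequencies as a data frame
rel_freq <- function(piece, levels = 1:5) {
  as.data.frame(prop.table(table(factor(piece$rating, levels = levels))))
}

out1 <- rel_freq(piece1)
out2 <- rel_freq(piece2)

# Both results now have one row per level, with zeros for unused ratings
nrow(out1)
nrow(out2)
names(out1)
```

Because every subgroup now returns the same number of rows with the same labels, the pieces can be stacked and later spread into columns without gaps.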

library(plyr)
library(reshape2)   # more recent version of the original reshape package

# Example data frame: three schools, two components, 100 observations per school
df <- data.frame(school = rep(LETTERS[1:3], each = 100),
                 component = rep(rep(1:2, each = 50), 3),
                 rating = factor(sample(1:5, 300, replace = TRUE),
                                 levels = 1:5))

# Function to compute the relative frequencies - the input is a generic
# sub-data frame, the output is a data frame representation of prop.table()
mktab <- function(df) as.data.frame(prop.table(table(df$rating)))

# Apply the function to the input data frame by school/component subgroups.
# Notice that the output has one row for each rating in each
# school/component subgroup
(dftab <- ddply(df, .(school, component), mktab))

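# Reshape dftab to display the proportion of each rating in columns instead.
# Var1 and Freq are the column names that as.data.frame() gives a table;
# reshape2 provides dcast() in place of reshape's cast()
dcast(dftab, school + component ~ Var1, value.var = "Freq")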

# My result:
  school component    1    2    3    4    5
1      A         1 0.24 0.14 0.34 0.14 0.14
2      A         2 0.12 0.20 0.10 0.28 0.30
3      B         1 0.24 0.22 0.22 0.14 0.18
4      B         2 0.10 0.26 0.16 0.32 0.16
5      C         1 0.24 0.18 0.24 0.20 0.14
6      C         2 0.22 0.10 0.16 0.22 0.30
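The same pattern could be adapted to the column names in your question roughly as follows. This is only a sketch: the test.school data frame here is a made-up stand-in, since I don't have your data, and I'm assuming the ratings run from 1 to 4 as you describe. The key change from your attempts is that ddply() calls the function with each sub-data frame as its first argument, so calc.pct() takes the piece and pulls the rating column out of it.

```r
library(plyr)
library(reshape2)

# Hypothetical stand-in for test.school; the values are invented
test.school <- data.frame(
  unit    = 77777,
  area    = 11,
  ext.obs = rep(0:1, each = 8),
  comp    = rep(rep(1:2, each = 4), 2),
  rating  = c(3, 4, 3, 3,  4, 4, 2, 3,  1, 2, 2, 4,  3, 3, 3, 3)
)

# Input: one sub-data frame per area/ext.obs/comp group
# Output: a data frame with one row per rating level (columns Var1, Freq)
calc.pct <- function(piece) {
  as.data.frame(prop.table(table(factor(piece$rating, levels = 1:4))))
}

# Long form: one row per rating within each group
pcts <- ddply(test.school, .(area, ext.obs, comp), calc.pct)

# Wide form: one column per rating
wide <- dcast(pcts, area + ext.obs + comp ~ Var1, value.var = "Freq")
wide
```

Each row of wide should sum to 1 across the rating columns, which is a quick sanity check on the result.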

See

http://www.jstatsoft.org/v21/i12
http://www.jstatsoft.org/v40/i01
http://had.co.nz/plyr/
http://had.co.nz/reshape/

Re the last link, the original reshape package has been reworked and released as the package reshape2. Since you're new to all of this, you're better off learning how reshape2 works in conjunction with plyr and several of Hadley's other packages.

HTH,
Dennis

2011/4/20 Stuart Luppescu <slu_at_ccsr.uchicago.edu>:
> Hello, This is my first time trying to use plyr, and I'm getting
> nowhere. I have teacher ratings data (1:4), on 10 components, by
> external observers and internal observers, in schools in areas. I want
> to calculate the percentage of each rating given on each component, by
> each type of observer, within each school, within each area. The data
> look like this:
>
>     unit area ext.obs rating comp
> 11 77777   11       0      3    1
> 12 77777   11       0      4    2
> 13 77777   11       0      3    3
> 14 77777   11       0      4    4
> 15 77777   11       0      3    5
> 16 77777   11       0      3    6
> 17 77777   11       0      3    7
> 18 77777   11       0      3    8
> 19 77777   11       0      3    9
> 20 77777   11       0      3   10
>
> I thought this would be a perfect application for plyr. I tried this:
>
> calc.pct <- function(x) {
> table(x)/sum(table(x))
> }
>
> pcts <- ddply(test.school, .(area, ext.obs, comp), calc.pct, x=rating)
> Error in .fun(piece, ...) : unused argument(s) (piece)
>
> Then I tried this:
> pcts <- ddply(test.school, .(area, ext.obs, comp), .(calc.pct(rating)))
> Error in .fun(piece, ...) : attempt to apply non-function
>
> I tried all kinds of other variations but with no success. Can someone
> give me some pointers?
>
> Thanks.
> --
> Stuart Luppescu -=- slu .at. ccsr.uchicago.edu
> University of Chicago -=- CCSR
> Father of 才文 and 智奈美 -=- Kernel 2.6.36-gentoo-r5
> Lars Strand: Will R run under Windows Pocket PC? Brian D. Ripley: We
> don't know! There are no binary versions of R for that platform, but
> perhaps you could find a suitable compiler and manage to build the
> sources. Outside pure mathematics it is usually very hard to establish
> that something cannot be done (and it can be very hard in pure
> mathematics, too). -- Lars Strand and Brian D. Ripley R-help (November
> 2004)
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Received on Thu 21 Apr 2011 - 07:46:57 GMT
