Re: [R] error for ttest

From: Dennis Murphy <djmuser_at_gmail.com>
Date: Thu, 14 Apr 2011 06:57:10 -0700

Hi:

It's hard to diagnose the problem without an illustrative example. Perhaps the following might help:

(1) When writing a function to use in ddply(), make a generic data frame the input argument
     to the function and refer to the variables within the function either with the $ notation
     or inside with(dataframe, ...). This is because you want to apply the function to
     each sub-data frame indexed by combinations of the grouping factors.
(2) The function in (1) should return either a scalar quantity or a data frame.
(3) If you're computing groupwise scalar summaries, make sure the third argument of
      ddply() is summarise, as in
      ddply(mydf, .(grp1, grp2), summarise, mean = mean(y, na.rm = TRUE),
            sd = sd(y, na.rm = TRUE))

I don't think as.data.frame.function(f) ... is going to work. Data frames and functions are two quite different types of objects. If you're trying to write a function that returns a data frame, then see point (2) above.
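
As a rough sketch of point (2) applied to the t-test in the question: the function below takes a sub-data frame and returns a one-row data frame. The data frame xxm here is made up (only the column names come from the original post), and tfun is a hypothetical name. The guard rests on an assumption about the cause of the error, namely that at least one plateid/cytokine combination contains observations from only one level of Self_T1D, even though the factor as a whole has two.

```r
library(plyr)  # assumes the plyr package is installed

set.seed(1)
# Made-up stand-in for the poster's data; only the column names come from the post.
xxm <- data.frame(
  plateid  = rep(c("P1", "P2"), each = 12),
  cytokine = rep(rep(c("IL2", "IL6"), each = 6), 2),
  Self_T1D = rep(c("N", "Y"), 12),
  conc     = rnorm(24, mean = 10, sd = 2)
)
# Force one group to contain a single level so that the guard is exercised.
xxm$Self_T1D[xxm$plateid == "P2" & xxm$cytokine == "IL6"] <- "N"

# tfun (hypothetical name): takes a sub-data frame, returns a one-row data frame.
tfun <- function(df) {
  ok <- !is.na(df$conc) & !is.na(df$Self_T1D)
  n_per_level <- table(as.character(df$Self_T1D[ok]))
  # Skip the test unless both levels are present with at least 2 values each.
  if (length(n_per_level) != 2 || any(n_per_level < 2)) {
    return(data.frame(statistic = NA_real_, df = NA_real_, p.value = NA_real_))
  }
  tt <- t.test(conc ~ Self_T1D, data = df)
  data.frame(statistic = unname(tt$statistic),
             df        = unname(tt$parameter),
             p.value   = tt$p.value)
}

u2 <- ddply(xxm, .(plateid, cytokine), tfun)
u2
```

Groups where the test cannot be run come back as a row of NAs, so they are easy to spot in the result. Running table() on the grouping variables and Self_T1D in the real data would show whether this is in fact what is happening.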

Here's an example with a few different versions of what is basically the same function. Observe how they are handled in ddply().

library(plyr)

mydf <- data.frame(grp1 = rep(LETTERS[1:3], each = 20),
                   grp2 = rep(rep(letters[1:2], each = 10), 3),
                      w = rpois(60, 10),
                      x = rpois(60, 5),
                      y = rbinom(60, 1, 0.5))

# One can use either with() to temporarily attach a data frame for the
# purpose of the calculation or use the $ notation to refer to components
# of a data frame. Either works, as shown below.
f <- function(df) {

    u <- with(df, (w + x)/2 + y)
    v <- df$x + df$w * df$y
    data.frame(u = u, v = v)
  }

# In this function, the data frame argument is never referenced.
h <- function(df) {

    u <- (w + x)/2 + y
    v <- x + w * y
    data.frame(u = u, v = v)
  }

# This returns both the original and newly created variables
g <- function(df) {

     df <- transform(df,
                      u = (w + x)/2 + y,
                      v = x + w * y
                    )
     df

  }

# Returns only the variables u and v + grouping variables; the originals w, x, y are gone
ddply(mydf, .(grp1, grp2), f)
# Returns the original data frame; the new variables u and v are not added. In this case,
# ddply silently ignores the function f
ddply(mydf, .(grp1, grp2), transform, f)
# This gets it right
ddply(mydf, .(grp1, grp2), g)
# What happens when you use variable names without accessing the referent data frame
ddply(mydf, .(grp1, grp2), h)
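
To see the last case in isolation, here is a small base-R sketch (no plyr needed) that calls a function like h directly on a tiny made-up data frame. It assumes there are no objects named w, x or y in your workspace; if there were, h would quietly use those instead of the columns of df. h_fixed is a hypothetical name for the repaired version.

```r
# Tiny made-up data frame with the same column names as in the example above.
mydf <- data.frame(w = c(10, 12), x = c(4, 6), y = c(0, 1))

# Same body as h: w, x and y are looked up outside df, not in its columns.
h <- function(df) {
  u <- (w + x)/2 + y
  v <- x + w * y
  data.frame(u = u, v = v)
}

# Catch the error rather than stopping, and keep its message for inspection.
res <- tryCatch(h(mydf), error = function(e) conditionMessage(e))
res

# Repaired version: with() evaluates the expressions inside df.
h_fixed <- function(df) {
  with(df, data.frame(u = (w + x)/2 + y, v = x + w * y))
}
h_fixed(mydf)
```

The same lookup failure occurs for each piece when h is passed to ddply(), which is why the variables have to be reached through df.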

HTH,
Dennis

On Wed, Apr 13, 2011 at 12:40 PM, 1Rnwb <sbpurohit_at_gmail.com> wrote:

> Hello all,
>
> I have arranged my data as per Dennis's suggestion in this post
> http://www.mail-archive.com/r-help@r-project.org/msg107156.html.
> the posted code works fine, but when I try to apply it to my data, I get:
> > u2 <- ddply(xxm, .(plateid, cytokine), as.data.frame.function(f))
> Error in t.test.formula(conc ~ Self_T1D, data = df, na.rm = T) :
>   grouping factor must have exactly 2 levels
> Self_T1D has two levels "N" and "Y"
>
> I have used the ddply function to do the mean and sd for the same dataframe
> without any issues.
> I would appreciate help to solve this.
> Thanks
> Sharad
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/error-for-ttest-tp3448056p3448056.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>




Received on Thu 14 Apr 2011 - 13:59:08 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 14 Apr 2011 - 15:50:32 GMT.
