From: Gabor Grothendieck <ggrothendieck_at_gmail.com>

Date: Wed 13 Sep 2006 - 04:32:28 GMT

R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Wed Sep 13 14:39:18 2006

If I understand this correctly, we want to sum the mass over each combination of the first 6 variables and display the result with the 6th, prey, along the top and the others along the side.

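library(reshape)
# Melt to long form: the first 6 columns are identifiers, mass is the measured value
testm <- melt(test, id = 1:6)
# Cast to wide form, summing mass within each cell; prey codes become the columns
cast(testm, nbpc + trip + set + tagno + depth ~ prey, sum)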
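For comparison, the same sum-then-spread idea can be sketched in base R with xtabs, which sums the left-hand side of its formula and returns 0 for combinations that never occur. This is only a sketch under simplifying assumptions: the data frame `toy` and its values are made up for illustration, and a single identifier (tagno) stands in for the full nbpc/trip/set/tagno key.

```r
# Made-up toy data (not the poster's): two stomachs, one prey repeated in stomach 1
toy <- data.frame(
  tagno = c(1, 1, 1, 2),
  depth = c(100, 100, 100, NA),
  prey  = c(187, 187, 438, 792),
  mass  = c(1.5, 2.0, 0.5, 3.0)
)

# Sum mass per stomach and prey type; absent combinations are filled with 0
tab <- xtabs(mass ~ tagno + prey, data = toy)

# Turn the contingency table into a data frame with one column per prey code
wide <- as.data.frame.matrix(tab)
names(wide) <- paste0("mass.", names(wide))
wide$tagno <- as.numeric(rownames(wide))

# Re-attach the accompanying variable; its NA is left untouched
ids <- unique(toy[c("tagno", "depth")])
result <- merge(ids, wide, by = "tagno")
print(result)
```

Because depth never enters the xtabs formula, a missing depth stays NA while the prey columns get true zeros. With several identifying variables, one would presumably need to build a combined key first; that step is not shown here.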

On 9/12/06, Denis Chabot <chabotd@globetrotter.net> wrote:

> Hi,
>
> I'm trying to move to R the last few data handling routines I was
> performing in SAS.
>
> I'm working on stomach content data. In the simplified example I
> provide below, there are variables describing the origin of each prey
> item (nbpc is a ship number, each ship may have been used on
> different trips, each trip has stations, and individual fish (tagno)
> can be caught at each station).
>
> For each stomach the number of lines corresponds to the number of
> prey items. Thus a variable identifies prey type, and others (here
> only one, mass) provide information on prey abundance or size or
> digestion level.
>
> Finally, there can be accompanying variables that are not used but
> that I need to keep for later analyses (e.g. depth in the example
> below).
>
> At some point I need to transform such a dataset into another format
> where each stomach occupies a single line, and there are columns for
> each prey item.
>
> The "reshape" function works really well, my program is in fact
> simpler than the SAS equivalent (not shown, don't want to bore you,
> but available on request), except that I need zeros when prey types
> are absent from a stomach instead of NAs, a problem for which I only
> have a shaky solution at the moment:
>
> 1) creation of a dummy dataset:
> #######
> nbpc <- rep(c(20,34), c(110,90))
> trip <- c(rep(1:3, c(40, 40, 30)), rep(1:2, c(60,30)))
> set <- c(rep(1:4, c(10, 8, 7, 15)), rep(c(10,12), c(25,15)),
>          rep(1:3, rep(10,3)),
>          rep(10:12, c(20, 10, 30)), rep(7:8, rep(15,2)))
> depth <- c(rep(c(100, 150, 200, 250), c(10, 8, 7, 15)),
>            rep(c(100,120), c(25,15)), rep(c(75, 50, 200), rep(10,3)),
>            rep(c(200, 150, 50), c(20, 10, 30)), rep(c(100, 250), rep(15,2)))
> tagno <- rep(round(runif(42,1,200)),
>              c(7,3, 4,4, 2,2,3, 5,5,5, 4,6,4,3,5,3, 7,8, 4,6, 5,5, 7,3,
>                6,6,4,4, 4,6, 3,3,4,5,5,6,4, 5,5,5, 8,7))
> prey.codes <- c(187, 438, 792, 811)
> prey <- sample(prey.codes, 200, replace=TRUE)
> mass <- runif(200, 0, 10)
>
> test <- data.frame(nbpc, trip, set, depth, tagno, prey, mass)
> ########
>
> Because there are often multiple occurrences of the same prey in a
> single stomach, I need to sum them for each stomach before using
> "reshape". Here I use summaryBy (from the doBy package) because my
> understanding of the many variants of "apply" is not very good:
>
> ########
> library(doBy)
> test2 <- summaryBy(mass~nbpc+trip+set+tagno+prey, data=test, FUN=sum,
>                    keep.names=TRUE, id=~depth)
>
> # this messes up the sorting order, so I fix it
> k <- order(test2$nbpc, test2$trip, test2$set, test2$tagno)
> test3 <- test2[k,]
> result <- reshape(test3, v.names="mass",
>                   idvar=c("nbpc", "trip", "set", "tagno"),
>                   timevar="prey", direction="wide")
> #########
>
> I'm quite happy with this, although you may know of better ways of
> doing it.
> But my problem is with prey types that are absent from a stomach. In
> later analyses, I need them to have zero abundance instead of NA.
> My shaky solution is:
> #########
> empties <- is.na(result)
> result[empties] <- 0
> #########
>
> which did the job in this example, but it won't always. For instance
> there could have been NAs for "depth", which I do not want to become
> zero.
>
> Is there a way to transform NAs into zeros for multiple columns of a
> dataframe in one step, while ignoring some columns?
>
> Or maybe there is another way to achieve this that would have put
> zeros where I need them (i.e. something other than "reshape")?
>
> Thanking you in advance,
>
> Denis Chabot
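On the narrower question of turning NAs into zeros in only some columns, a minimal base R sketch follows. It assumes the prey columns can be recognised by the "mass." prefix that reshape(direction="wide") produces from v.names="mass"; the small `result` data frame and the helper name `mass.cols` are made up for illustration.

```r
# Made-up wide result: depth has a genuine NA that must be preserved
result <- data.frame(
  tagno    = c(1, 2),
  depth    = c(100, NA),
  mass.187 = c(3.5, NA),
  mass.438 = c(0.5, NA),
  mass.792 = c(NA, 3.0)
)

# Pick out only the prey columns by name (hypothetical helper variable)
mass.cols <- grep("^mass\\.", names(result))

# Replace NA with 0 inside those columns only; all other columns are untouched
result[mass.cols][is.na(result[mass.cols])] <- 0
print(result)
```

Selecting columns by name pattern rather than by position keeps the replacement from leaking into depth or any other accompanying variable, even if the column order changes.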


Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.1.8, at Wed 13 Sep 2006 - 05:30:04 GMT.
