Re: [R] conversion of data for use within barchart

From: Deepayan Sarkar <deepayan.sarkar_at_gmail.com>
Date: Wed, 02 Jul 2008 13:50:29 -0700

On 7/2/08, Karin Lagesen <karinlag_at_studmed.uio.no> wrote:
>
>
> I have a data matrix like this:
>
>
> > data[1:10,]
>    aaname grp   cluster count
> 1     Ala All Singleton   432
> 2     Arg All Singleton  1239
> 3     Asn All Singleton   396
> 4     Asp All Singleton   152
> 5     Cys All Singleton   206
> 6     Gln All Singleton   370
> 7     Glu All Singleton   211
> 8     Gly All Singleton   594
> 9     His All Singleton   213
> 10    Ile All Singleton    44
>
> where the cluster column has three levels.
>
> > levels(data$cluster)
> [1] "Array" "Singleton" "rRNA"
> >
>
> Now, I would like to plot this like this:
>
> barchart(aaname~count|grp, group = cluster, data = data, stack = TRUE)
>
> I am thus using the cluster as the grouping.
>
> I would like to plot the relative abundance within each grouping, such
> that the max level in my plot always is one (or 100). This would for
> instance mean for the Ala in the All grp that the Singleton cluster
> constitutes, let's say, 60% of the Ala in the All grp, whereas the Array
> and rRNA clusters make up 20% each. In this case I would get in my plot a
> Singleton stretching to 60%, whereas the other two would be 20% each,
> all in all making 100%.
>
> I am uncertain of whether I am managing to describe what I want, so I
> hope somebody understands what I want!

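So you basically need to compute sum(count) within each aaname/grp combination (that is, across the clusters that make up one bar), and divide each count by that sum. Consider using ave(). For example, with a single grouping factor g: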

> foo <- data.frame(g = gl(3, 3), count = rpois(9, lambda=20))
> foo
  g count
1 1    14
2 1    16
3 1    20
4 2    21
5 2    24
6 2    16
7 3    15
8 3    24
9 3    12
> with(foo, ave(count, g, FUN = sum))
[1] 50 50 50 61 61 61 51 51 51
> foo$gsum <- with(foo, ave(count, g, FUN = sum))
> foo
  g count gsum
1 1    14   50
2 1    16   50
3 1    20   50
4 2    21   61
5 2    24   61
6 2    16   61
7 3    15   51
8 3    24   51
9 3    12   51
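
To carry the idea through to the plot in the question, something like the following might work. It is only a sketch: the data frame `dat` is simulated as a stand-in for the poster's data (the second grp level "Other" and the columns `total` and `prop` are made up for illustration), and it assumes the totals should be taken over each aaname/grp combination.

```r
library(lattice)

## Simulated stand-in for the poster's data (hypothetical values):
## 3 amino acids x 2 groups x 3 clusters, one count per row.
set.seed(1)
dat <- expand.grid(aaname  = c("Ala", "Arg", "Asn"),
                   grp     = c("All", "Other"),
                   cluster = c("Array", "Singleton", "rRNA"))
dat$count <- rpois(nrow(dat), lambda = 20)

## Total count within each aaname/grp combination, repeated on every row,
## then each count expressed as a fraction of its total.
dat$total <- with(dat, ave(count, aaname, grp, FUN = sum))
dat$prop  <- dat$count / dat$total

## Stacked bars of the proportions; each bar should now reach 1.
p <- barchart(aaname ~ prop | grp, groups = cluster, data = dat,
              stack = TRUE, auto.key = TRUE)
print(p)
```

Multiply prop by 100 if the axis should run to 100 instead of 1.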

-Deepayan



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Wed 02 Jul 2008 - 21:14:29 GMT
