Re: [R] extracting index list when using tapply()

From: Charles C. Berry <cberry_at_tajo.ucsd.edu>
Date: Tue, 08 Jul 2008 16:33:53 -0700

On Tue, 8 Jul 2008, hesicaia wrote:

>
> Hello,
> The quick version of my question is how can I extract a matrix instead of
> a vector using tapply()? I would like to be able to access both the results
> of tapply() and also the index variables.
>
> In case further explanation would help: I am analyzing a large (3million
> rows x 9 columns) spatial/temporal dataset and am attempting to calculate
> the number of unique years containing any data within each geographic area
> (10 degree cells in this case). I can do this, but I also want to extract a
> subset vector of the index variable (area).
>

It really would help to provide a working example, as another poster suggested. We cannot test our suggestions without a trial dataset.

> My script to calculate the number of unique years containing any data for
> each area is:
> x<-tapply(years, area, function(x) length(unique(x)))
>

or

 	tab <- table( area, years )
 	x <- rowSums ( tab !=0  )
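
As a quick check that the two approaches agree, here is a minimal sketch on made-up data. The values of 'area' and 'years' below are hypothetical stand-ins, not the poster's dataset:

```r
# Hypothetical toy data: three areas, a handful of records each
area  <- c("A", "A", "A", "B", "B", "C", "C", "C", "C")
years <- c(1990, 1990, 1991, 2000, 2000, 1985, 1986, 1987, 1987)

# Original approach: count the distinct years within each area
x_tapply <- tapply(years, area, function(y) length(unique(y)))

# Alternative: cross-tabulate, then count the non-empty year cells per row
tab     <- table(area, years)
x_table <- rowSums(tab != 0)

print(x_tapply)
print(x_table)
```

Both results are indexed by area name, so names(x_tapply) and names(x_table) give the index values that go with the counts.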


> Now, I want to extract the vector of areas where the number of unique years
> containing any data is >20, but tapply() only returns a vector of unique
> years and I want a matrix.

         x <- rownames(tab)[ rowSums( tab !=0 ) > 20 ]

unless, perhaps, you meant

         x <- rownames(tab)[ rowSums( tab > 20 ) !=0 ]
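
The two expressions answer different questions, which a small sketch may make clearer. The data are made up, and the threshold 'k' is a hypothetical stand-in for the 20 above so that the toy example stays short:

```r
# Hypothetical toy data; k stands in for the threshold of 20
area  <- c("A", "A", "A", "B", "B", "C", "C", "C")
years <- c(1990, 1991, 1992, 2000, 2000, 1985, 1985, 1985)
tab   <- table(area, years)
k     <- 2

# Areas with more than k distinct years containing data
many_years <- rownames(tab)[rowSums(tab != 0) > k]

# Areas where at least one single year holds more than k records
dense_year <- rownames(tab)[rowSums(tab > k) != 0]

# The first selection can also be read straight off the tapply() result,
# because the area values are kept as its names
x <- tapply(years, area, function(y) length(unique(y)))
from_tapply <- names(x)[x > k]

print(many_years)
print(dense_year)
print(from_tapply)
```

Here area "A" has three distinct years with one record each, while area "C" has a single year with three records, so the two selections pick different areas.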

>
> I could use a looping function to do this, but tapply() is much faster with
> large datasets and so I would like to use it if possible.
>

Depending on the size of the dataset and the number of different years and areas, there may be better ways to do this (since 'tab' could be very big and sparse). For a start in that direction, see

         http://finzi.psych.upenn.edu/R/Rhelp02a/archive/118816.html

and perhaps library(Matrix) (on CRAN).
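
One possible way to avoid building the full area-by-year table, sketched in base R only: reduce the data to its distinct (area, year) pairs first, then count pairs per area. This is an illustration under the assumption that the distinct pairs fit comfortably in memory; it is not the method from the post linked above, and it does not use the Matrix package. The data and the threshold 'k' are again hypothetical:

```r
# Hypothetical toy data; k stands in for the threshold of 20
area  <- c("A", "A", "A", "B", "B", "C", "C", "C")
years <- c(1990, 1991, 1992, 2000, 2000, 1985, 1985, 1985)
k     <- 2

# Keep one row per distinct (area, year) combination
pairs <- unique(data.frame(area = area, years = years))

# Each remaining row is one distinct year for its area, so a one-way
# table on area gives the number of distinct years per area
n_years <- table(pairs$area)

# Areas with more than k distinct years
sel <- names(n_years)[n_years > k]
print(sel)
```

The one-way table has one cell per area rather than one per area-year combination, which is the point when the cross-tabulation would be mostly zeros.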

HTH, Chuck

> Any help is appreciated.
> Thanks.
> --
> View this message in context: http://www.nabble.com/extracting-index-list-when-using-tapply%28%29-tp18345794p18345794.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Charles C. Berry                            (858) 534-2098
                                             Dept of Family/Preventive Medicine
E mailto:cberry_at_tajo.ucsd.edu	            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901

Received on Tue 08 Jul 2008 - 23:43:14 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 09 Jul 2008 - 02:31:46 GMT.