Re: [R] apply on large arrays

From: <Bill.Venables_at_csiro.au>
Date: Thu, 14 Feb 2008 10:41:05 +1000

Hmm. I think this could be faster still:

	tab1 <- with(pisa1, table(CNT,GENDER,ISCOF,ISCOM))
	tab3 <- rowSums(tab1 == 1)

but check it...
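
One way to run that check is on a small synthetic data set. This is only a sketch: the `pisa1` built here is a made-up stand-in (three countries, five codes per ISCO variable), not the real PISA data, and the names `slow` and `fast` are introduced purely for the comparison.

```r
# Hypothetical stand-in for pisa1: small synthetic data, not the real PISA file
set.seed(1)
n <- 200
pisa1 <- data.frame(
  CNT    = factor(sample(c("AUT", "DEU", "FRA"), n, replace = TRUE)),
  GENDER = factor(sample(c("F", "M"), n, replace = TRUE)),
  ISCOF  = factor(sample(1:5, n, replace = TRUE)),
  ISCOM  = factor(sample(1:5, n, replace = TRUE))
)

tab1 <- with(pisa1, table(CNT, GENDER, ISCOF, ISCOM))

# Original three-step approach from the question
tab2 <- apply(tab1, 1:4, function(x) ifelse(sum(x) == 1, 1, 0))
slow <- apply(tab2, 1, sum)

# One-step version: for an array, rowSums() with the default dims = 1
# sums over every dimension except the first
fast <- rowSums(tab1 == 1)

# Both give one count per level of CNT, with the same names
stopifnot(all(slow == fast), identical(names(slow), names(fast)))
```

If the two results agree on a reduced table like this, the one-liner should be safe to use on the full 60*2*500*500 table, where the comparison `tab1 == 1` is done in a single vectorised pass.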

Bill Venables
CSIRO Laboratories
PO Box 120, Cleveland, 4163
AUSTRALIA

Office Phone (email preferred): +61 7 3826 7251
Fax (if absolutely necessary):  +61 7 3826 7304
Mobile:                         +61 4 8819 4402
Home Phone:                     +61 7 3286 7700
mailto:Bill.Venables_at_csiro.au
http://www.cmis.csiro.au/bill.venables/

-----Original Message-----
From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org] On Behalf Of Venables, Bill (CMIS, Cleveland)
Sent: Thursday, 14 February 2008 10:30 AM
To: erich.neuwirth_at_univie.ac.at; r-help_at_stat.math.ethz.ch
Subject: Re: [R] apply on large arrays

Your code is

	tab1 <- with(pisa1, table(CNT,GENDER,ISCOF,ISCOM))
	tab2 <- apply(tab1, 1:4, 
			function(x) ifelse(sum(x) == 1, 1, 0))
	tab3 <- apply(tab2, 1, sum)

As far as I can see, step 2 (the problematic one) merely replaces any entries in tab1 that are not equal to one with zeros. I think this would do the same job a bit faster:

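	# Copy the table so that tab2 has the same shape and dimnames
	tab2 <- tab1 <- with(pisa1, table(CNT,GENDER,ISCOF,ISCOM))
	tab2[] <- 0	# zero every cell, keeping the dimensions
	# Set to 1 exactly those cells whose count in tab1 is 1
	tab2[which(tab1 == 1, arr.ind = TRUE)] <- 1
	tab3 <- rowSums(tab2)	# sums over all dimensions except the first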

If you don't need to keep tab1, you can improve things further by removing it once tab2 has been built.
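
The indexing step can be tried out on a tiny array. The sketch below uses a hypothetical 3 x 2 x 2 x 2 array of random counts in place of the real table; `slow` and `idx` are names introduced here for illustration only.

```r
# Hypothetical 3 x 2 x 2 x 2 array of counts standing in for tab1
set.seed(42)
tab1 <- array(rpois(24, lambda = 1), dim = c(3, 2, 2, 2))

# Cell-by-cell version: with MARGIN = 1:4 the function is called once per cell
slow <- apply(tab1, 1:4, function(x) ifelse(sum(x) == 1, 1, 0))

# Indexed version: start from zeros, then set the cells where tab1 is 1
tab2 <- tab1
tab2[] <- 0
idx <- which(tab1 == 1, arr.ind = TRUE)  # one row of (i, j, k, l) per matching cell
tab2[idx] <- 1

# Same indicator array, and therefore the same per-row totals
stopifnot(identical(dim(slow), dim(tab2)), all(slow == tab2))
stopifnot(all(rowSums(tab2) == apply(slow, 1, sum)))
```

Because `idx` is a matrix with one column per dimension, `tab2[idx]` addresses individual cells rather than whole slices, so the assignment touches only the cells that hold a count of 1.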

Bill Venables.         

-----Original Message-----
From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org] On Behalf Of Erich Neuwirth
Sent: Thursday, 14 February 2008 9:52 AM
To: r-help
Subject: [R] apply on large arrays

I have a big contingency table, approximately of size 60*2*500*500, and I need to count the number of cells containing a count of 1 for each of the factor values defining the first dimension. Here is my attempt:

tab1 <- with(pisa1, table(CNT, GENDER, ISCOF, ISCOM))
tab2 <- apply(tab1, 1:4, function(x) ifelse(sum(x) == 1, 1, 0))
tab3 <- apply(tab2, 1, sum)

Computing tab2 is very slow.
Is there a faster and/or more elegant way of doing this?

-- 
Erich Neuwirth, University of Vienna
Faculty of Computer Science
Computer Supported Didactics Working Group
Visit our SunSITE at http://sunsite.univie.ac.at
Phone: +43-1-4277-39464 Fax: +43-1-4277-39459


______________________________________________
R-help_at_r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Received on Thu 14 Feb 2008 - 00:49:20 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 14 Feb 2008 - 02:30:14 GMT.
