Subject: [R] performance problem
From: Alexandre Fayolle (Alexandre.Fayolle@logilab.fr)
Date: Fri 15 Dec 2000 - 01:13:43 EST
Message-ID: <Pine.LNX.4.21.0012141608300.3514-100000@leo.logilab.fr>
Hello,
I needed a function like table(), but one that sums the values of a column
instead of counting occurrences. I could not find anything like that in the
built-in packages (maybe I missed it...), so I decided to write my own, and
I came up with the following:
table.ponderate <- function(arow, acol, aweight) {
    matrix(data = 0, nrow = length(levels(arow)), ncol = length(levels(acol)),
           byrow = TRUE, dimnames = list(levels(arow), levels(acol))) -> m
    aweight[is.na(aweight)] <- 0
    for (a in seq(length(arow))) {
        prev <- m[as.integer(arow[a]), as.integer(acol[a])]
        m[as.integer(arow[a]), as.integer(acol[a])] <- prev + aweight[a]
    }
    m
}
The problem is that the performance is very poor. I have not had time to
benchmark it properly, but it takes several seconds to process 1000 lines,
and I need to process a few dozen data sets of 10000+ lines each.
Is there a way to write this differently that would speed things up?
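To make the goal concrete, here is a minimal sketch of what a loop-free version might look like. It is an illustration, not a benchmarked solution. The name table.weighted is hypothetical, and the sketch assumes that arow and acol are factors of the same length as aweight, as in the function above. It relies on tapply() to sum the weights within each pair of levels, then fills the empty cells with 0 so the result has the same shape as the loop version.

```r
# Hypothetical vectorized variant (the name table.weighted is an assumption).
# arow, acol: factors of equal length; aweight: numeric vector of weights.
table.weighted <- function(arow, acol, aweight) {
    # Treat missing weights as zero, as in the loop version.
    aweight[is.na(aweight)] <- 0
    # Sum the weights within every (row level, column level) combination.
    m <- tapply(aweight, list(arow, acol), sum)
    # Combinations that never occur come back as NA; report them as 0.
    m[is.na(m)] <- 0
    m
}

# Small made-up example: three row levels, three column levels.
arow    <- factor(c("a", "b", "a", "c"), levels = c("a", "b", "c"))
acol    <- factor(c("x", "y", "x", "y"), levels = c("x", "y", "z"))
aweight <- c(1, 2, NA, 4)
print(table.weighted(arow, acol, aweight))
```

The grouping happens inside a single tapply() call rather than in an element-by-element loop, which is where one would hope to gain speed; whether it is fast enough for 10000+ lines would still need to be measured.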
Alexandre Fayolle
--
http://www.logilab.com
Narval is the first software agent available as free software (GPL).
LOGILAB, Paris (France).