[R] Ranking within factor subgroups

From: maneesh deshpande <dmaneesh_at_hotmail.com>
Date: Wed 22 Feb 2006 - 13:44:47 EST

Hi,

I have a data frame, x, of the following form:

Date      Symbol   A   B   C
20041201  ABC     10  12  15
20041201  DEF      9   5   4
...
20050101  ABC      5   3   1
20050101  GHM     12   4   2
...

Here A, B and C are properties of a set of symbols recorded on a given date. I want to decile the symbols for each date and property, and create another set of columns "bucketA", "bucketB", "bucketC" containing the decile rank
for each symbol. The following non-vectorized code does what I want:

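bucket <- function(data, nBuckets) {
  # Bucket boundaries: nBuckets + 1 equally spaced quantiles of the data
  q <- quantile(data, seq(0, 1, length.out = nBuckets + 1), na.rm = TRUE)
  q[1] <- q[1] - 0.1  # need to do this to ensure there are no extra NAs
  # Integer bucket index (1..nBuckets) for each value
  cut(data, q, include.lowest = TRUE, labels = FALSE)
}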
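To make the two steps inside bucket() concrete, here is a small sketch on a made-up vector (the values 1:20 are an assumption chosen for illustration, not data from the post). It also suggests that, at least in this case, include.lowest = TRUE already keeps the minimum in the first bucket without shifting q[1].

```r
# Made-up input: 20 distinct values, to be split into 10 buckets.
v <- 1:20

# Step 1: 11 break points at the 0%, 10%, ..., 100% quantiles.
q <- quantile(v, seq(0, 1, length.out = 11))

# Step 2: map each value to the index of the interval it falls in.
# Intervals are (lower, upper], so without include.lowest the minimum,
# which equals the first break, gets no bucket and becomes NA.
withLowest    <- cut(v, q, include.lowest = TRUE, labels = FALSE)
withoutLowest <- cut(v, q, labels = FALSE)

print(withLowest)
print(withoutLowest)
```

With distinct values like these, each of the 10 buckets receives two values. If many values are tied, the quantile breaks may not be unique and cut() will stop with an error, so real data may need extra handling.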

calcDeciles <- function(x, colNames) {
  nBuckets <- 10
  dates <- unique(x$Date)
  for (date in dates) {
    iVec <- x$Date == date  # rows belonging to this date
    xx <- x[iVec, ]
    for (colName in colNames) {
      data <- xx[, colName]
      bColName <- paste("bucket", colName, sep = "")
      # Write the bucket indices back into the matching rows of x
      x[iVec, bColName] <- bucket(data, nBuckets)
    }
  }
  x
}

x <- calcDeciles(x,c("A","B","C"))

I was wondering if it is possible to vectorize the above function to make it more efficient.
I tried

rlist <- tapply(x$A, x$Date, bucket, nBuckets = 10)

but I am not sure how to assign the contents of "rlist" to their appropriate slots in the original
data frame.
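One possible way to get the per-date results back into the right rows is base R's ave(), which applies a function within groups and returns a vector in the original row order. The sketch below is only an illustration under stated assumptions: the data frame is made up (two dates, 20 hypothetical symbols, random values), and bucket() is given a default of nBuckets = 10 so it can be passed directly as FUN.

```r
# Same idea as bucket() in the post; include.lowest = TRUE keeps the
# group minimum in bucket 1. The default nBuckets = 10 is an assumption.
bucket <- function(data, nBuckets = 10) {
  q <- quantile(data, seq(0, 1, length.out = nBuckets + 1), na.rm = TRUE)
  cut(data, q, include.lowest = TRUE, labels = FALSE)
}

# Made-up example data: 2 dates x 20 symbols, three numeric properties.
set.seed(1)
x <- data.frame(
  Date   = rep(c(20041201, 20050101), each = 20),
  Symbol = rep(paste0("S", 1:20), times = 2),
  A = rnorm(40), B = rnorm(40), C = rnorm(40)
)

# ave() runs FUN on the values of each Date group and puts the results
# back in the original row order, so they can be assigned directly.
for (colName in c("A", "B", "C")) {
  x[[paste0("bucket", colName)]] <- ave(x[[colName]], x$Date, FUN = bucket)
}

head(x)
```

This still loops over the property columns, but the loop over dates and the row bookkeeping are handled by ave(). Whether it is noticeably faster on a large data frame would need to be measured.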

Thanks,

Maneesh



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Wed Feb 22 13:56:26 2006
