From: Huntsinger, Reid
Date: Sat 18 Jun 2005 - 04:35:22 EST

You can use tapply() to compute the median income for each education level, as in

meds <- tapply(mydata$income, INDEX = mydata$education, FUN = median, na.rm = TRUE)

then create a new column by looking up each row's median, as

mydata$md <- meds[mydata$education]
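A minimal sketch of the two steps on a small made-up data frame may make the lookup clearer. The toy values are assumptions for illustration only. The as.character() and as.vector() calls are extra precautions, so that the lookup goes by level name and the stored column is a plain vector. The ave() call at the end is an alternative from base R that does both steps at once.

```r
# Toy data; the values are made up for illustration only.
mydata <- data.frame(
  income    = c(10, 20, 30, 40, NA, 60),
  education = factor(c("low", "low", "high", "high", "high", "mid"))
)

# Step 1: one median per education level, ignoring missing incomes.
# na.rm = TRUE is passed through tapply() to median().
meds <- tapply(mydata$income, INDEX = mydata$education,
               FUN = median, na.rm = TRUE)

# Step 2: look up each row's level by name and store it as a plain column.
mydata$md <- as.vector(meds[as.character(mydata$education)])

# Alternative: ave() returns a vector as long as the input, with each
# element replaced by the summary of its group.
md_ave <- ave(mydata$income, mydata$education,
              FUN = function(x) median(x, na.rm = TRUE))
```

Both approaches compute each group's median once instead of once per row, which is where the loop spends its time.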

Reid Huntsinger

Hi there,

I have a data frame (mydata) with 1 numeric variable (income) and 1 factor (education). I want a new column in this data with the median income for each education level. An obviously inefficient way to do this is

for (k in 1:nrow(mydata)) {
  l <- mydata$education[k]
  mydata$md[k] <- median(mydata$income[mydata$education == l], na.rm = TRUE)
}


Since mydata has nearly 30,000 rows, this will not finish until the end of this month. I thus need some help vectorizing this, please.



