From: Peter Dalgaard <p.dalgaard_at_biostat.ku.dk>

Date: Thu 08 Dec 2005 - 20:34:00 EST

"Rau, Roland" <Rau@demogr.mpg.de> writes:

> Dear all,

> Suppose I have data in a data.frame that give the number of people in
> a specific year at a specific age:
>
> n <- 10
> mydf <- data.frame(yr=sample(1:10, size=n, replace=FALSE),
>                    age=sample(1:12, size=n, replace=FALSE),
>                    no=sample(1:10, size=n, replace=FALSE))
>
> Now I would like to make a matrix with (in this simple example)
> 10 columns (for the years) and 12 rows (for the ages). In each cell,
> I would like to put the correct number of individuals.
> So far I have been doing this as follows:
>
> mymatrix <- matrix(0, ncol=10, nrow=12)
> for (year in unique(mydf$yr)) {
>   for (age in unique(mydf$age)) {
>     if (length(mydf$no[mydf$yr==year & mydf$age==age]) > 0) {
>       mymatrix[age,year] <- mydf$no[mydf$yr==year & mydf$age==age]
>     } else {
>       mymatrix[age,year] <- 0
>     }
>   }
> }
>
> This is fairly fast in such a simple setting.
> But with more years and ages (and for roughly 300 datasets) this becomes
> pretty slow. In addition, it is not really elegant R code.
> Can somebody point me in the right direction to do this in a more
> elegant way, possibly avoiding the loops?

This almost gets you there:

with(mydf, tapply(no,list(age,yr), sum))

except that it puts NA where you want 0, which you could fix with

m <- with(mydf, tapply(no,list(age,yr), sum))
m[is.na(m)] <- 0

m
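One thing to watch with the tapply() route: it only creates rows and columns for the ages and years that actually occur in the data, so with the example data the result has 10 rows rather than 12. A minimal sketch of one way around that, assuming ages run 1..12 and years 1..10 as in the question, is to pass factors with explicit levels (the set.seed() call is only there to make the example reproducible and is not part of the original post):

```r
# Example data as in the question; the seed is an addition for reproducibility
set.seed(1)
n <- 10
mydf <- data.frame(yr  = sample(1:10, size = n, replace = FALSE),
                   age = sample(1:12, size = n, replace = FALSE),
                   no  = sample(1:10, size = n, replace = FALSE))

# Declare every possible age and year as a factor level, so that
# ages or years absent from the data still get a row or column.
m <- with(mydf, tapply(no,
                       list(age = factor(age, levels = 1:12),
                            yr  = factor(yr,  levels = 1:10)),
                       sum))

# Empty cells come back as NA; turn them into 0
m[is.na(m)] <- 0

dim(m)
```

The dimnames of m then carry the age and year labels, which is handy when the ages do not start at 1.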

Other options include matrix indexing:

M <- with(mydf, {
    M <- matrix(0,12,10)
    M[cbind(age,yr)] <- no
    M    # return the filled matrix; otherwise with() only returns 'no'
})
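To spell out the matrix indexing idea: indexing a matrix by a two-column matrix treats each row of the index as one (row, column) pair, so all cells are filled in a single vectorised assignment. The sketch below, using the same example data (seed added only for reproducibility), checks it against a plain row-by-row loop. It assumes ages and years can be used directly as row and column numbers, and that each (age, yr) pair occurs at most once; with duplicates the last value wins, whereas the tapply() version would sum them.

```r
set.seed(1)
n <- 10
mydf <- data.frame(yr  = sample(1:10, size = n, replace = FALSE),
                   age = sample(1:12, size = n, replace = FALSE),
                   no  = sample(1:10, size = n, replace = FALSE))

# Two-column index matrix: one (row, column) = (age, yr) pair per data row
idx <- cbind(mydf$age, mydf$yr)

M <- matrix(0, nrow = 12, ncol = 10)
M[idx] <- mydf$no          # fill all cells in one assignment

# Reference version: one pass over the rows of the data frame
M2 <- matrix(0, nrow = 12, ncol = 10)
for (i in seq_len(nrow(mydf))) {
    M2[mydf$age[i], mydf$yr[i]] <- mydf$no[i]
}

identical(M, M2)
```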

or (tada...) the reshape() function, esp. if you want a data frame as output.
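For the reshape() route, a rough sketch could look like the following. The choice of idvar, timevar and v.names is my reading of the example data, not something spelled out above, and the seed is again only for reproducibility. The result is a data frame with one row per observed age and one "no.<year>" column per observed year:

```r
set.seed(1)
n <- 10
mydf <- data.frame(yr  = sample(1:10, size = n, replace = FALSE),
                   age = sample(1:12, size = n, replace = FALSE),
                   no  = sample(1:10, size = n, replace = FALSE))

# Long to wide: one row per age, one 'no.<yr>' column per year
wide <- reshape(mydf, idvar = "age", timevar = "yr",
                v.names = "no", direction = "wide")

# Rows come out in order of first appearance; sort them by age
wide <- wide[order(wide$age), ]

# Combinations that do not occur are NA; replace them with 0
wide[is.na(wide)] <- 0

wide
```

As with plain tapply(), ages that never occur in the data get no row here, so this fits best when a data frame of the observed ages is what you are after.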

