[R] Avoiding loops in creating a coinvestment matrix

From: Daniel Malter <daniel_at_umd.edu>
Date: Sun, 03 Apr 2011 17:16:04 -0500 (CDT)


Hi, I am working on a dataset in which a number of venture capitalists invest in a number of firms. What I am creating is an asymmetric matrix M in which m(ij) is the volume (sum) of coinvestments of VC i with VC j (i.e., how much has VC i invested in companies that VC j also has investments in). The output should look like the "coinvestments" matrix produced with the code below. If possible I would like to avoid loops and optimize the code for speed because the real data is huge. If anybody has suggestions, I would be grateful.

invest=c(20,50,40,30,10,20,20,30,40)
vc=rep(c('A','B','C'),each=3)
company=c('E','F','G','F','G','H','G','H','I')
data=data.frame(vc,company,invest)

data #data

inv.mat=tapply(invest,list(vc,company),sum)
inv.mat=replace(inv.mat,which(is.na(inv.mat)==T),0)

inv.mat #investment matrix

exist.mat=inv.mat>0

coinvestments<-matrix(0,nrow=length(unique(vc)),ncol=length(unique(vc)))

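for(i in unique(vc)){
	for(j in unique(vc)){
		# positions of VC i and VC j in the output matrix
		i.is=which(unique(vc)==i)
		j.is=which(unique(vc)==j)
		# companies in which each VC holds an investment
		which.i=which(exist.mat[i,])
		which.j=which(exist.mat[j,])
		# companies in which both i and j invest
		i.invests.with.j=which.i[which.i%in%which.j]
		# amount i invested in those shared companies (row selected by name)
		coinvestments[i.is,j.is]=sum(inv.mat[i,i.invests.with.j])
	}
}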

coinvestments
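
The double loop above can, as far as I can tell, be collapsed into a single matrix product: entry (i,j) is the sum over companies k of inv.mat[i,k] times an indicator that VC j invests in company k. A minimal sketch on the toy data, assuming every investment is positive (so that inv.mat > 0 marks participation); exist.num and coinv.fast are names introduced here for illustration only:

```r
# Toy data from the post
invest <- c(20,50,40,30,10,20,20,30,40)
vc <- rep(c('A','B','C'), each=3)
company <- c('E','F','G','F','G','H','G','H','I')

# VC-by-company investment matrix, with missing combinations set to 0
inv.mat <- tapply(invest, list(vc, company), sum)
inv.mat[is.na(inv.mat)] <- 0

# 0/1 indicator of whether a VC invests in a company
# (assumption: all invested amounts are strictly positive)
exist.num <- (inv.mat > 0) * 1

# tcrossprod(x, y) computes x %*% t(y), so entry [i, j] is
# sum over k of inv.mat[i, k] * exist.num[j, k]
coinv.fast <- tcrossprod(inv.mat, exist.num)
coinv.fast
```

On the toy data this should agree with the loop result, e.g. the A row is 110, 90, 40. Rows and columns follow the sorted VC names that tapply produces.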

system.time(
for(i in unique(vc)){
	for(j in unique(vc)){
		# positions of VC i and VC j in the output matrix
		i.is=which(unique(vc)==i)
		j.is=which(unique(vc)==j)
		# companies in which each VC holds an investment
		which.i=which(exist.mat[i,])
		which.j=which(exist.mat[j,])
		# companies in which both i and j invest
		i.invests.with.j=which.i[which.i%in%which.j]
		# amount i invested in those shared companies (row selected by name)
		coinvestments[i.is,j.is]=sum(inv.mat[i,i.invests.with.j])
	}
}
)
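
Since the real data is described as huge, a dense VC-by-company matrix may not fit in memory. One possible alternative, sketched here under the assumptions that the Matrix package is available and that all investments are positive, is to build the investment matrix in sparse form and take the same cross-product. The names vc.f, co.f, inv.sp, exist.sp and coinv.sp are made up for this illustration:

```r
library(Matrix)  # recommended package, normally installed alongside R

# Toy data from the post
invest <- c(20,50,40,30,10,20,20,30,40)
vc <- rep(c('A','B','C'), each=3)
company <- c('E','F','G','F','G','H','G','H','I')

vc.f <- factor(vc)
co.f <- factor(company)

# Sparse VC-by-company matrix; repeated (i, j) pairs are summed,
# which plays the role of tapply(..., sum) in the dense version
inv.sp <- sparseMatrix(i = as.integer(vc.f), j = as.integer(co.f), x = invest,
                       dimnames = list(levels(vc.f), levels(co.f)))

# Indicator matrix: set every stored entry to 1
# (assumption: stored entries are strictly positive investments)
exist.sp <- inv.sp
exist.sp@x[] <- 1

# Same product as in the dense case: inv.sp %*% t(exist.sp)
coinv.sp <- tcrossprod(inv.sp, exist.sp)
coinv.sp
```

Whether this is actually faster than the dense product depends on how sparse the real data is, so it is worth timing both with system.time on a realistic subset.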

Thanks much,
Daniel


______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Sun 03 Apr 2011 - 22:18:51 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sun 03 Apr 2011 - 22:40:28 GMT.
