[R] retrieving large columns using RODBC

From: Tamas K Papp <tpapp_at_princeton.edu>
Date: Sat 13 Aug 2005 - 18:03:14 EST


Hi,

I have a large table in PostgreSQL (the result of an MCMC simulation, with 1 million rows) and I would like to retrieve columns (corresponding to variables) using RODBC. The table has a column called "index" which is used to order the rows.

Unfortunately, sqlQuery can't return all the values from a column at once (RODBC complains about lack of memory). So I am using the following code:

getcolumns <- function(channel, tablename, colnames, totalrows,
                       ordered = TRUE, chunksize = 1e5) {
  r <- matrix(double(0), totalrows, length(colnames))
  for (i in 1:ceiling(totalrows/chunksize)) {
    cat(".")
    ## rows of r filled by this chunk (clamped so the last chunk fits)
    rows <- ((i - 1) * chunksize + 1):min(i * chunksize, totalrows)
    r[rows, ] <- as.matrix(
      sqlQuery(channel, paste("SELECT", paste(colnames, collapse = ", "),
                              "FROM", tablename,
                              "WHERE index <=", i * chunksize,
                              "AND index >", (i - 1) * chunksize,
                              if (ordered) "ORDER BY index;" else ";")))
  }
  cat("\n")
  drop(r)  # convert to a vector if a single column was requested
}

to retrieve the data in chunks. However, this is very slow -- it takes about 15 minutes on my machine. Is there a way to speed it up?
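For concreteness, here is an (untested) sketch of the kind of alternative I have been wondering about: submit the query once with odbcQuery() and then fetch the rows back in blocks with sqlGetResults(), instead of re-running a SELECT for every chunk. getcolumns2 is just an illustrative name and I have not benchmarked it. Would something along these lines be expected to be faster?

library(RODBC)  ## assumes an open channel from odbcConnect()

getcolumns2 <- function(channel, tablename, colnames, totalrows,
                        ordered = TRUE, chunksize = 1e5) {
  ## submit the SELECT once; rows are fetched from the open result set below
  query <- paste("SELECT", paste(colnames, collapse = ", "),
                 "FROM", tablename,
                 if (ordered) "ORDER BY index" else "")
  if (odbcQuery(channel, query) < 0)
    stop(odbcGetErrMsg(channel))
  r <- matrix(as.numeric(NA), totalrows, length(colnames))
  got <- 0
  while (got < totalrows) {
    cat(".")
    ## fetch the next block of at most 'chunksize' rows
    chunk <- sqlGetResults(channel, max = min(chunksize, totalrows - got))
    if (!is.data.frame(chunk) || nrow(chunk) == 0) break  # stop if nothing came back
    m <- as.matrix(chunk)
    r[(got + 1):(got + nrow(m)), ] <- m
    got <- got + nrow(m)
  }
  cat("\n")
  drop(r)  # convert to a vector if a single column was requested
}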

I am running Linux on a PowerBook, RODBC version 1.1-4, R 2.1.1. The machine has only 512 MB of RAM.

Thanks,

Tamas



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Sat Aug 13 18:11:16 2005
