Re: [R] Fast update of a lot of records in a database?

From: Steve Miller <steve.miller_at_jhu.edu>
Date: Sat 20 May 2006 - 22:15:39 EST


Though I'd question the prudence of doing mass database changes through R:

Other things equal, single updates in a loop should be slower than a correlated update due to the performance costs of bind and execute (and perhaps even parse). Also, updates are generally slower than inserts because of rollback and logging, and the potential of introducing chained blocks to hold the updates. The correlated temporary table update should be faster than the original, provided indexing is appropriate (i.e an index on bigtable.id).

Any chance you could create a new bigtable as a join select of old bigtable and updates? This would be faster still.

Steve Miller

-----Original Message-----
From: r-help-bounces@stat.math.ethz.ch
[mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of hadley wickham Sent: Friday, May 19, 2006 2:20 PM
To: Duncan Murdoch
Cc: Robert.McGehee@geodecapital.com; r-help@stat.math.ethz.ch; br44114@gmail.com
Subject: Re: [R] Fast update of a lot of records in a database?

> put the updates into a temporary table called updates
>
> UPDATE bigtable AS a
> FROM updates AS b
> WHERE a.id = b.id
> SET a.col1 = b.col1

I don't think this will be any faster - why would creating a new table be faster than updating existing rows?

I've never had a problem with using large numbers of SQL update statements (in the order of hundreds of thousands) to update a table and having them complete in a reasonable time (a few minutes). How heavily indexed is the field you are updating? You may be able to get some speed improvements by turning off indices before the update and back on again afterwards (highly dependent on your database system though).

I would strongly suspect your bottleneck lies elsewhere (eg. when generating the statements in R, or using ODBC to send them)

Hadley



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Sun May 21 00:56:35 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Mon 22 May 2006 - 08:10:17 EST.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.