Re: [R] MySql Versus R

From: Prof Brian Ripley
Date: Fri, 01 Apr 2011 12:15:09 +0100 (BST)

On Fri, 1 Apr 2011, Henri Mone wrote:

> Dear R Users,
> For my data crunching I use a combination of MySQL and GNU R. I have
> to handle huge/medium-sized data which is stored in a MySQL
> database; R executes an SQL command to fetch the data and does the
> plotting with the built-in R plotting functions.
> The (low-level) calculations like summing, dividing, grouping, sorting
> etc. can be done either with the SQL command on the MySQL side or on
> the R side.
> My question is: which is faster for these low-level calculations / data
> rearrangement, MySQL or R? Is there a general rule of thumb for what to
> shift to the MySQL side and what to the R side?

Data transfer costs almost always dominate here. Since such low-level computations are almost always a trivial part of the total cost, you should do anything that reduces the size of the data (e.g. summarization) in the DBMS.
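
A minimal sketch of that rule of thumb, under stated assumptions: Python's built-in sqlite3 module stands in for the MySQL server and the R client so that the example runs on its own, and the table name `measurements` and its columns `grp` and `value` are hypothetical. It compares how many rows cross the database boundary when the grouping and summing are done on the client with how many cross when they are done in SQL.

```python
import sqlite3

# SQLite (in memory) is only a stand-in for MySQL; table and column
# names are hypothetical and chosen for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (grp TEXT, value REAL)")
rows = [(f"g{i % 4}", float(i)) for i in range(10_000)]
conn.executemany("INSERT INTO measurements VALUES (?, ?)", rows)

# Option A: transfer every row, then group and sum on the client side.
raw = conn.execute("SELECT grp, value FROM measurements").fetchall()
client_sums = {}
for grp, value in raw:
    client_sums[grp] = client_sums.get(grp, 0.0) + value

# Option B: group and sum in the database, transfer one row per group.
summary = conn.execute(
    "SELECT grp, SUM(value) FROM measurements GROUP BY grp ORDER BY grp"
).fetchall()
db_sums = dict(summary)

print(len(raw), "rows transferred for client-side aggregation")
print(len(summary), "rows transferred for in-database aggregation")
```

Both options give the same sums, but the second moves only one row per group. With a real MySQL server reached over a network, that difference in transferred volume is the cost the advice above is aimed at.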

I do wonder what you think the R-sig-db list is for if not questions such as this one. Please subscribe and use it next time.

> Thanks
> Henri

Brian D. Ripley,        
Professor of Applied Statistics,
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Received on Fri 01 Apr 2011 - 11:19:27 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.