Re: [R] Parallel Processing and Linear Regression

From: Martin Morgan <mtmorgan_at_fhcrc.org>
Date: Fri, 25 Jul 2008 06:26:29 -0700

Hi Alan --

"Alan Spearot" <acspearot_at_gmail.com> writes:

> Does anybody have any suggestions regarding applying standard regression
> packages lm(), hccm(), and others within a parallel environment? Most of
> the packages I've found only deal with iterative processes (bootstrap) or
> simple linear algebra. While the latter might help, I'd rather not program
> the estimation code. I'm currently using an IA-64 Teragrid system through UC
> San Diego.

If you mean that you have a single regression that takes a long time to compute, then linking R against a parallel BLAS on a single machine (as described in the R Installation and Administration manual) might help, though I have no direct experience with this. Otherwise, I think you're out of luck in terms of parallelizing without writing code.

If you mean that you've got many data sets on which you'd like to perform regressions, then the general strategy is a coarse-grained, lapply-like solution, as available in the usual suspects: snow, Rmpi, nws, and others.
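
As a rough illustration of the coarse-grained approach, here is a minimal sketch. It assumes the 'parallel' package that ships with current versions of R (its makeCluster() / parLapply() interface derives from snow) rather than snow itself, and the data sets, the y ~ x model, and the two local workers are made up for the example.

```r
# Sketch only: uses the 'parallel' package bundled with current R,
# whose makeCluster()/parLapply() interface derives from snow.
library(parallel)

# Hypothetical example data: a list of four small data sets.
set.seed(1)
datasets <- lapply(1:4, function(i) {
  x <- rnorm(100)
  data.frame(x = x, y = 2 * x + rnorm(100))
})

# Start two local worker processes (adjust for a real cluster).
cl <- makeCluster(2)

# Fit one regression per data set; each data set is an independent task.
fits <- parLapply(cl, datasets, function(d) coef(lm(y ~ x, data = d)))

stopCluster(cl)

# One row of coefficients per data set.
do.call(rbind, fits)
```

Each task here is a whole lm() call, so no estimation code has to be rewritten; only the loop over data sets is distributed.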

If you've got really big data that doesn't fit easily in memory, then the available solutions are to get more memory (on a single 64-bit machine) or to use a package such as biglm, which is designed to work with large data sets.
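
For the biglm route, something along these lines might work. It is only a sketch, assuming the biglm package is installed; the simulated data frame, the three-way split, and the y ~ x model are hypothetical stand-ins for a large file read in pieces.

```r
# Sketch only: assumes the biglm package is installed.
library(biglm)

# Hypothetical data, split into chunks to stand in for a file read in pieces.
set.seed(1)
n <- 3000
big <- data.frame(x = rnorm(n))
big$y <- 1 + 2 * big$x + rnorm(n)
chunks <- split(big, rep(1:3, each = n / 3))

# Fit on the first chunk, then fold in the remaining chunks one at a time,
# so only one chunk needs to be in memory at once.
fit <- biglm(y ~ x, data = chunks[[1]])
for (chunk in chunks[-1]) {
  fit <- update(fit, chunk)
}

coef(fit)
```

The coefficients should agree with those from lm() on the full data, up to numerical precision.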

Hope that helps,

Martin

> Alan
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793

Received on Fri 25 Jul 2008 - 13:29:10 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 25 Jul 2008 - 14:32:31 GMT.
