Re: R-beta: Sun performance library

Rob Creecy (rcreecy@census.gov)
Fri, 20 Mar 1998 13:31:12 -0500


Message-Id: <3512B5F0.5C12FCE0@census.gov>
Date: Fri, 20 Mar 1998 13:31:12 -0500
From: Rob Creecy <rcreecy@census.gov>
To: David Clayton <david.clayton@mrc-bsu.cam.ac.uk>
Subject: Re: R-beta: Sun performance library

I looked at this a little last fall after noticing
that matrix multiplication performance in R was far below
the performance achievable in FORTRAN or C using the
BLAS and LAPACK. I've been looking for notes on this but
can't find them. My recollection is that, roughly speaking,
I could get 100 MFLOPS on a single-processor SPARC Ultra 2300
for matrix multiply on matrices bigger than 100x100 using
FORTRAN, while I was getting less than 10 MFLOPS in R (or SPLUS).
These numbers may not be accurate, so take them with a grain of salt.
I then did a very quick test of linking the BLAS DGEMM matrix
multiply routine with R. I only remember the bottom line: it wasn't
worth pursuing further because, even though the matrix multiply in
FORTRAN was sped up considerably, there was still a lot of overhead
passing data back and forth between R and the FORTRAN routine.
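
For readers who want to reproduce this kind of figure, here is a minimal
sketch of how an MFLOPS rate for matrix multiply is usually derived. It is
not the original test code: the function names (matmul, mflops, time_matmul)
are made up for illustration, and it assumes the common convention of
counting 2*n^3 floating-point operations for an n x n product.

```python
import time


def matmul(a, b):
    """Naive dense matrix product of two lists-of-rows (no BLAS involved)."""
    b_cols = list(zip(*b))  # transpose b once so its columns are easy to walk
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols]
            for row in a]


def mflops(n, seconds):
    """Rate in MFLOPS, assuming 2*n^3 operations for an n x n multiply."""
    return 2.0 * n ** 3 / seconds / 1e6


def time_matmul(n):
    """Time one n x n multiply; returns (MFLOPS rate, elapsed seconds)."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    start = time.perf_counter()
    matmul(a, b)
    # Guard against a zero reading from the timer on very small problems.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return mflops(n, elapsed), elapsed


if __name__ == "__main__":
    rate, elapsed = time_matmul(100)
    print("n=100: %.3f s, %.1f MFLOPS" % (elapsed, rate))
```

Under that operation count, 100 MFLOPS on a 100x100 problem corresponds to
about 0.02 seconds per multiply, and 10 MFLOPS to about 0.2 seconds.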

It is possible that for very large problems there may still be an
advantage in linking an optimized BLAS or LAPACK routine, since execution
time should be O(n^3) while time spent passing matrices as arguments
should be O(n^2) - it just wasn't worth it for the problems I was trying.
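
The scaling argument above can be sketched with a toy cost model. Everything
numeric here is an assumption for illustration only: the per-flop costs are
loosely taken from the rough 10 and 100 MFLOPS figures recalled earlier, and
the per-element transfer cost (SEC_PER_ELEMENT) is a hypothetical value, not
something that was measured.

```python
# Illustrative constants only; none of these were measured.
SLOW_SEC_PER_FLOP = 1e-7   # about 10 MFLOPS
FAST_SEC_PER_FLOP = 1e-8   # about 100 MFLOPS
SEC_PER_ELEMENT = 1e-6     # hypothetical cost to pass one matrix element


def total_time(n, sec_per_flop, sec_per_element=SEC_PER_ELEMENT):
    """Modelled time for one n x n multiply: O(n^3) compute + O(n^2) transfer."""
    compute = 2 * n ** 3 * sec_per_flop
    transfer = 3 * n ** 2 * sec_per_element  # two inputs and one result
    return compute + transfer


def overhead_fraction(n, sec_per_flop, sec_per_element=SEC_PER_ELEMENT):
    """Share of the modelled time spent passing data rather than computing."""
    transfer = 3 * n ** 2 * sec_per_element
    return transfer / total_time(n, sec_per_flop, sec_per_element)


def speedup(n):
    """Modelled end-to-end gain from swapping in the faster multiply."""
    return total_time(n, SLOW_SEC_PER_FLOP) / total_time(n, FAST_SEC_PER_FLOP)


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, round(speedup(n), 2),
              round(overhead_fraction(n, FAST_SEC_PER_FLOP), 3))
```

In this model the transfer share shrinks like 1/n, so the end-to-end speedup
only approaches the raw ratio of the two multiply rates as n grows; for small
n the fixed O(n^2) cost dominates and most of the gain disappears.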

Another point - I found that compiling the BLAS and LAPACK source code
available from NETLIB gave about the same results as the "optimized"
SUN Performance Library routines, so using the SUN routines just saves
time figuring out how to compile them. I never did finish trying to do
the link with R since it didn't seem worth the effort.

Rob Creecy
Census Bureau

David Clayton wrote:
> 
> Has anyone tried to build R on Suns by linking to the versions of BLAS and
> LINPACK in the Sun Performance Library rather than the standard versions
> included in the distribution? On the face of it this would seem to be a highly
> desirable thing to do since these versions of the routines are more efficient
> and can exploit parallelism on multiprocessor machines.
>