Re: [Rd] ATLAS threaded 64 bit Opteron build for R: need -fPIC

From: Prof Brian Ripley <>
Date: Fri 10 Feb 2006 - 11:14:17 GMT

On Fri, 10 Feb 2006, Amit Aronovitch wrote:

You set the reply address to Martin Maechler! That's antisocial.

> Hi,
> Sorry for sending such a late reply, and for being a bit OT.
> I've been trying to compile 64 bit ATLAS for numpy
> ( ), and so far this thread is the most useful
> one I could google up - thanks!
> I encountered similar problems, and so far could not get a .a linkable
> to numpy (comparing to your post - it seems I might have forgotten to
> add the -fPIC for the F77FLAGS or MMFLAGS).

Yes, that _is_ in the R-admin manual. I guess you have not read that - it describes how to install R. You can get it in the R tarball from

> Also, I'm having trouble with the ATLAS lapack. To get a usable lib, one
> has to merge it with a full lapack implementation (as described in the
> ATLAS errata). However, I'm using RHEL4, and their installed liblapack.a
> seems to have been compiled without -fPIC, so the merged library is
> unlinkable to numpy's .so. Is there a way to use Redhat's installed

No, nor should you want to. If RHEL4 is like FC3/4, watch out: RH have managed to put BLAS routines in liblapack rather than libblas, and they use incorrect patches to LAPACK 3.0. (Again, see the latest R-admin manual.)
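
One rough way to see whether a liblapack.a carries BLAS routines is to look at its defined symbols. The sketch below is only an illustration: the function name blas_symbols_in_nm is made up, the archive path in the comment is an assumption, and the pattern assumes the usual lower-case, trailing-underscore Fortran symbol names and checks only a handful of well-known BLAS routines.

```shell
#!/bin/sh
# blas_symbols_in_nm (hypothetical helper): read `nm` output on stdin and
# print the defined text symbols (type T) that look like BLAS routines.
# Only a few routine names are checked; this is not a complete BLAS list.
blas_symbols_in_nm() {
    grep -E '[[:space:]]T[[:space:]]+(dgemm|dgemv|daxpy|ddot|dscal)_$'
}

# Against a real archive (the path is an assumption; adjust for your system):
#   nm -g --defined-only /usr/lib64/liblapack.a | blas_symbols_in_nm

# Demo on made-up nm output: dgetrf_ is LAPACK, daxpy_ is only referenced (U),
# so just the dgemm_ line is a BLAS routine defined in the library.
printf '%s\n' \
    '0000000000000000 T dgetrf_' \
    '0000000000000000 T dgemm_' \
    '                 U daxpy_' | blas_symbols_in_nm
```

Any line printed for a real liblapack.a would suggest that BLAS routines have ended up in the LAPACK library.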

> Few questions about your compiler flags:
> 1) Is there a reason to compile with -O rather than -O3?
> (did you try and encounter some problem, or found no major performance
> difference)

ATLAS chose that. Since the real work is done by hand-tuned assembler code it should not matter.

> 2) I see you use -mfpmath=387 - does this work better than sse2 (which
> seems to be
> the default)? How about the "sse,387" option - should I try that?

Depends on your ATLAS version. Again, ATLAS chose those.

As it happens, I have been trying to build ATLAS on my new dual Opteron box this morning. The latest devel version (3.7.11) does not build, as at some point it says it expects the GNU x86-32 assembler. If it did build, it would use SSE3 and so be faster.

Both 3.6.0 and 3.7.11 fail because my machine is too fast, and I had to increase the number of replications (1000) in make/Make.{mv,r1}tune and in tune/blas/level1/*.c. Even then I do not entirely trust the results (and the two versions report different L1 cache sizes ...).

I got pretty exasperated with this (it needed about ten builds to get one that succeeded). Both ACML and the Goto BLAS work well out of the box on Opterons, but do have licence issues. (Again, see the R-admin manual for details.)

> Martin Maechler wrote:
>>>>>>> "PD" == Peter Dalgaard <p.dalgaard at>
>>>>>>> on 26 Feb 2004 15:44:16 +0100 writes:
>> PD> Douglas Bates <bates at> writes:
>> >> Have you tried configuring R with Goto's BLAS
>> >>
>> >>
>> >> I haven't worked with Opteron or Athlon64 computers but I understand
>> >> that Goto's BLAS are very effective on those machines. Furthermore
>> >> Goto's BLAS are (only) available as .so libraries so you don't need to
>> >> mess with creating the .so version.
>> PD> I tried it, yes. Somewhat to my surprise, it seemed to be not quite as
>> PD> fast as the threaded ATLAS, but I wasn't very systematic about the
>> PD> benchmarking.
>> PD> (and the Goto items have license issues, which get in the way for
>> PD> binary distributions.)
>> Thanks a lot, Peter, Brian, Doug, for your feedback!
>> In the meantime, I have three running versions of R(-devel) on
>> the 64-Opteron
>> - "plain"
>> - linked against threaded GOTO
>> - linked against threaded (static) ATLAS (using -fPIC for compilation;
>> "large" Rlapack)
>> and I find that GOTO is faster than ATLAS
>> consistently (between ~ 5-20%) for several tests
>> (square matrices; %*% and solve).
>> ATLAS is still an order of magnitude faster than "plain" for
>> 3000x3000 matrices.
>> Here are somewhat repeatable "ATLAS for R" build instructions:
>> 1. get ATLAS source; unpack
>> 2. make : use defaults and "express" installation
>> 3. Before "make install ...", edit the Make.<ARCHITECTURE> file:
>> add "-fPIC" to three places, namely F77FLAGS, CCFLAG0, and MMFLAGS:
>> which in case of the "threaded Opteron" architecture, leads to
>> the three new lines
>> F77FLAGS = -fPIC -fomit-frame-pointer -O -m64
>> CCFLAG0 = -fPIC -fomit-frame-pointer -O -mfpmath=387 -m64
>> MMFLAGS = -fPIC -fomit-frame-pointer -O -mfpmath=387 -m64
>> in the file Make.Linux_HAMMER64SSE2_2
>> 4. make install arch=Linux_HAMMER64SSE2_2
>> 5. link the ATLAS libraries into /usr/local/lib:
>> cd /usr/local/lib
>> ln -s <ATLAS_build_dir>/lib/Linux_HAMMER64SSE2_2/lib* .
>> 6. (needed for runtime!):
>> Use environment variable LD_LIBRARY_PATH=/usr/local/lib
>> Note that I haven't built *.so (shared) libraries yet.
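
Step 3 of the quoted instructions can be scripted instead of done by hand. This is a minimal sketch, assuming a sed that supports -E (GNU or BSD) and flag lines shaped like the ones quoted above. The function name add_fpic and the demo file are made up for illustration and are not part of ATLAS; the "before" flag values are simply the quoted lines with -fPIC removed.

```shell
#!/bin/sh
# add_fpic FILE (hypothetical helper): prepend -fPIC to the values of
# F77FLAGS, CCFLAG0 and MMFLAGS in an ATLAS Make.<ARCHITECTURE> file.
# Lines that already mention -fPIC are left alone, so it is safe to re-run.
add_fpic() {
    makefile=$1
    tmp="$makefile.tmp"
    sed -E '/-fPIC/!s/^([[:space:]]*(F77FLAGS|CCFLAG0|MMFLAGS)[[:space:]]*=[[:space:]]*)/\1-fPIC /' \
        "$makefile" > "$tmp" && mv "$tmp" "$makefile"
}

# Demo on a throwaway file rather than a real ATLAS build tree.
demo=$(mktemp)
cat > "$demo" <<'EOF'
   F77FLAGS = -fomit-frame-pointer -O -m64
   CCFLAG0 = -fomit-frame-pointer -O -mfpmath=387 -m64
   MMFLAGS = -fomit-frame-pointer -O -mfpmath=387 -m64
EOF
add_fpic "$demo"
cat "$demo"      # each of the three lines now starts its value with -fPIC
rm -f "$demo"
```

In a real build tree one would call add_fpic on the architecture file (Make.Linux_HAMMER64SSE2_2 in the instructions above) before running step 4.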

Brian D. Ripley,        
Professor of Applied Statistics,
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Received on Fri Feb 10 22:44:19 2006
