Re: [Rd] NaN and linear algebra

From: Bill Northcott <w.northcott_at_unsw.edu.au>
Date: Thu 24 Mar 2005 - 11:59:16 GMT

On 24/03/2005, at 10:04 PM, Andy wrote:
> I just tested this problem on a Windows 2000 Pro P4-Xeon system. I
> tried
> this in patched 2.0.1 (2005-2-28) and 2.1.0-alpha (2005-3-23) with both
> generic BLAS and precompiled ATLAS BLAS dll (downloaded from CRAN
> windows
> binaries). In all 4 combinations I get
>
>> d<-matrix(NaN,3,3)
>> f<-solve(d)
>> f
> [,1] [,2] [,3]

> [1,] NaN NaN NaN
> [2,] NaN NaN NaN
> [3,] NaN NaN NaN
>> det(d)
> [1] NaN
>>
>>
I just want to clarify some of the discussion here for my own benefit.

As I understand it, there are two software libraries involved (three if you count R). BLAS is the lowest layer providing basic matrix operations. It comes in vanilla and platform optimised (ATLAS or Goto) flavours. Presumably R's built in version is vanilla.

Relying on BLAS is LAPACK which provides the functions which R actually calls like dgesv. LAPACK relies on BLAS to provide the platform optimisations. As far as I can see the issue(s) here is(are) with LAPACK not BLAS. It is LAPACK that issues the error and halts the program. (The MacOS X Accelerate Framework contains BLAS and LAPACK as distinct libraries libBlas.dylib and libLapack.dylib)

If, as reported, the behaviour is fundamentally similar on MacOS X, Linux and IRIX then it would appear that this is a feature(bug?) of the LAPACK reference implementation and that whoever wrote the Windows version saw fit to improve on it. I don't think LAPACK ever pretended to properly handle NaNs.

It seems to me that the Windows version is an improvement, because it complies with the robust arithmetic principles that programs should not halt but propagate NaN or other indicators that can tested as appropriate in the application. So it might be good idea to press other implementers of LAPACK to do the same.

My view for what it is worth is that R should also be robust in this sense and not halt on NaNs. For preference this should be handled in the LAPACK implementations, but if not R should should trap them. As I see it, it is incorrect to say a matrix containing NaNs is singular, not positive definite or whatever, because since it contains NaNs the computations to make those determinations cannot be performed.

Is this the wrong way to look at it?

Bill Northcott

PS This discussion has given me another thought. On MacOS X, it would be much faster to use floats instead of doubles because they can be directly processed by the Altivec section of the CPU. I can see no way to have R use floats. Would that be possible as a future enhancement or is it just too hard?



R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel Received on Thu Mar 24 23:02:39 2005

This archive was generated by hypermail 2.1.8 : Mon 20 Feb 2006 - 03:21:02 GMT