# Re: [R] Large determinant problem

Date: Mon, 10 Dec 2007 10:08:32 -0500

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

-----Original Message-----
From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org] On Behalf Of maj_at_stats.waikato.ac.nz
Sent: Sunday, December 09, 2007 2:45 AM
To: Prof Brian Ripley
Cc: r-help_at_r-project.org
Subject: Re: [R] Large determinant problem

I tried crossprod(S) but the results were identical. The term -0.5*log(det(S)) is a complexity penalty meant to make it unattractive to include too many components in a finite mixture model. This case was for a 9-component mixture. At least up to 6 components the determinant behaved as expected and increased with the number of components.

> Hmm, S'S is numerically singular. crossprod(S) would be a better way to
> compute it than crossprod(S,S) (it does use a different algorithm), but
> look at the singular values of S, which I suspect will show that S is
> numerically singular.
>
> Looks like the answer is 0.
>
>
> On Sun, 9 Dec 2007, maj_at_stats.waikato.ac.nz wrote:
>
>> I thought I would have another try at explaining my problem. I think
>> that
>> last time I may have buried it in irrelevant detail.
>>
>> This output should explain my dilemma:
>>
>>> dim(S)
>> [1] 1455 269
>>> summary(as.vector(S))
>> Min. 1st Qu. Median Mean 3rd Qu. Max.
>> -1.160e+04 0.000e+00 0.000e+00 -4.132e-08 0.000e+00 8.636e+03
>>> sum(as.vector(S)==0)/(1455*269)
>> [1] 0.8451794
>> # S is a large moderately sparse matrix with some large elements
>>> SS <- crossprod(S,S)
>>> (eigen(SS,only.values = TRUE)\$values)[250:269]
>> [1] 9.264883e+04 5.819672e+04 5.695073e+04 1.948626e+04
>> 1.500891e+04
>> [6] 1.177034e+04 9.696327e+03 8.037049e+03 7.134058e+03
>> 1.316449e-07
>> [11] 9.077244e-08 6.417276e-08 5.046411e-08 1.998775e-08
>> -1.268081e-09
>> [16] -3.140881e-08 -4.478184e-08 -5.370730e-08 -8.507492e-08
>> -9.496699e-08
>> # S'S fails to be non-negative definite.
>>
>> I can't show you how to produce S easily but below I attempt at a
>> reproducible version of the problem:
>>
>>> set.seed(091207)
>>> X <- runif(1455*269,-1e4,1e4)
>>> p <- rbinom(1455*269,1,0.845)
>>> Y <- p*X
>>> dim(Y) <- c(1455,269)
>>> YY <- crossprod(Y,Y)
>>> (eigen(YY,only.values = TRUE)\$values)[250:269]
>> [1] 17951634238 17928076223 17725528630 17647734206 17218470634
>> 16947982383
>> [7] 16728385887 16569501198 16498812174 16211312750 16127786747
>> 16006841514
>> [13] 15641955527 15472400630 15433931889 15083894866 14794357643
>> 14586969350
>> [19] 14297854542 13986819627
>> # No sign of negative eigenvalues; phenomenon must be due
>> # to special structure of S.
>> # S is a matrix of empirical parameter scores at an approximate
>> # mle for a model with 269 paramters fitted to 1455 observations.
>> # Thus, for example, its column sums are approximately zero:
>>> summary(apply(S,2,sum))
>> Min. 1st Qu. Median Mean 3rd Qu. Max.
>> -1.148e-03 -2.227e-04 -7.496e-06 -6.011e-05 7.967e-05 8.254e-04
>>
>> I'm starting to think that it may not be a good idea to attempt to
>> compute
>> large information matrices and their determinants!
>>
>> Murray Jorgensen
>>
>> ______________________________________________
>> R-help_at_r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> --
> Brian D. Ripley, ripley_at_stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
>
>

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Mon 10 Dec 2007 - 15:15:57 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 10 Dec 2007 - 15:30:18 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.