From: Peter Dalgaard <p.dalgaard_at_biostat.ku.dk>

Date: Sun, 09 Dec 2007 11:15:03 +0100

Date: Sun, 09 Dec 2007 11:15:03 +0100

Prof Brian Ripley wrote:

> Hmm, S'S is numerically singular. crossprod(S) would be a better way to

*> compute it than crossprod(S,S) (it does use a different algorithm), but
**> look at the singular values of S, which I suspect will show that S is
**> numerically singular.
**>
**> Looks like the answer is 0.
**>
**>
*

Another possibility is that there is a scaling issue. Linear algebra
routines are unhappy about having regressors on widely different scales,
and the squaring of eigenvalues impied by taking crossproducts is no
help. So what is the diagonal of S'S like? How about svd(S)? (Maybe
after column normalization)

> On Sun, 9 Dec 2007, maj@stats.waikato.ac.nz wrote:

*>
**>
**>> I thought I would have another try at explaining my problem. I think that
**>> last time I may have buried it in irrelevant detail.
**>>
**>> This output should explain my dilemma:
**>>
**>>
**>>> dim(S)
**>>>
**>> [1] 1455 269
**>>
**>>> summary(as.vector(S))
**>>>
**>> Min. 1st Qu. Median Mean 3rd Qu. Max.
**>> -1.160e+04 0.000e+00 0.000e+00 -4.132e-08 0.000e+00 8.636e+03
**>>
**>>> sum(as.vector(S)==0)/(1455*269)
**>>>
**>> [1] 0.8451794
**>> # S is a large moderately sparse matrix with some large elements
**>>
**>>> SS <- crossprod(S,S)
**>>> (eigen(SS,only.values = TRUE)$values)[250:269]
**>>>
**>> [1] 9.264883e+04 5.819672e+04 5.695073e+04 1.948626e+04 1.500891e+04
**>> [6] 1.177034e+04 9.696327e+03 8.037049e+03 7.134058e+03 1.316449e-07
**>> [11] 9.077244e-08 6.417276e-08 5.046411e-08 1.998775e-08 -1.268081e-09
**>> [16] -3.140881e-08 -4.478184e-08 -5.370730e-08 -8.507492e-08 -9.496699e-08
**>> # S'S fails to be non-negative definite.
**>>
**>> I can't show you how to produce S easily but below I attempt at a
**>> reproducible version of the problem:
**>>
**>>
**>>> set.seed(091207)
**>>> X <- runif(1455*269,-1e4,1e4)
**>>> p <- rbinom(1455*269,1,0.845)
**>>> Y <- p*X
**>>> dim(Y) <- c(1455,269)
**>>> YY <- crossprod(Y,Y)
**>>> (eigen(YY,only.values = TRUE)$values)[250:269]
**>>>
**>> [1] 17951634238 17928076223 17725528630 17647734206 17218470634 16947982383
**>> [7] 16728385887 16569501198 16498812174 16211312750 16127786747 16006841514
**>> [13] 15641955527 15472400630 15433931889 15083894866 14794357643 14586969350
**>> [19] 14297854542 13986819627
**>> # No sign of negative eigenvalues; phenomenon must be due
**>> # to special structure of S.
**>> # S is a matrix of empirical parameter scores at an approximate
**>> # mle for a model with 269 paramters fitted to 1455 observations.
**>> # Thus, for example, its column sums are approximately zero:
**>>
**>>> summary(apply(S,2,sum))
**>>>
**>> Min. 1st Qu. Median Mean 3rd Qu. Max.
**>> -1.148e-03 -2.227e-04 -7.496e-06 -6.011e-05 7.967e-05 8.254e-04
**>>
**>> I'm starting to think that it may not be a good idea to attempt to compute
**>> large information matrices and their determinants!
**>>
**>> Murray Jorgensen
**>>
**>> ______________________________________________
**>> R-help_at_r-project.org mailing list
**>> https://stat.ethz.ch/mailman/listinfo/r-help
**>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
**>> and provide commented, minimal, self-contained, reproducible code.
**>>
**>>
**>
**>
*

-- O__ ---- Peter Dalgaard ุster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard_at_biostat.ku.dk) FAX: (+45) 35327907 ______________________________________________ R-help_at_r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.Received on Sun 09 Dec 2007 - 10:23:53 GMT

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.2.0, at Sun 09 Dec 2007 - 10:30:18 GMT.

*
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help.
Please read the posting
guide before posting to the list.
*