Re: [Rd] [R] factanal AIC?

From: Spencer Graves <spencer.graves_at_pdf.com>
Date: Fri, 04 May 2007 19:20:04 -0700

Dear R Developers:

      What is the proper log(likelihood) for 'factanal'? I believe it should be something like the following:

      (-n/2)*(k*(log(2*pi)+1)+log(det(S)))

or without k*(log(2*pi)-1):

      (-n/2)*log(det(S)),

where n = the number of (multivariate) observations.

      The 'factanal' help file say the factanal component "criteria: The results of the optimization: the value of the negative log-likelihood and information on the iterations used."

      However, I'm not able to get this. If it's a log(likelihood), then replacing a data set m1 by rbind(m1, m1) should double the log(likelihood). However it has no impact on it. Consider the following modification of the first example in the 'factanal' help page:

 >    v1 <- c(1,1,1,1,1,1,1,1,1,1,3,3,3,3,3,4,5,6)
 >      v2 <- c(1,2,1,1,1,1,2,1,2,1,3,4,3,3,3,4,6,5)
 >      v3 <- c(3,3,3,3,3,1,1,1,1,1,1,1,1,1,1,5,4,6)
 >      v4 <- c(3,3,4,3,3,1,1,2,1,1,1,1,2,1,1,5,6,4)
 >      v5 <- c(1,1,1,1,1,3,3,3,3,3,1,1,1,1,1,6,4,5)
 >      v6 <- c(1,1,1,2,1,3,3,3,4,3,1,1,1,2,1,6,5,4)
 >      m1 <- cbind(v1,v2,v3,v4,v5,v6)
 > tst <- factanal(m1, factors=3)
 > fit1 <- factanal(m1, factors=3)
 > fit2 <- factanal(rbind(m1, m1), factors=3)
 > fit1$criteria["objective"]

objective
0.4755156
 > fit2$criteria["objective"]
objective
0.4755156

      From the following example, it seems that the "k*(log(2*pi)-1) term is omitted:

 > x2 <- c(-1,1)
 > X2.4 <- as.matrix(expand.grid(x2, x2, x2, x2))
 > factanal(X2.4, 1)$criteria
      objective counts.function counts.gradient
              0               7               7

      However, I can't get the 'fit1$criteria' above, even ignoring the 
sample size. Consider the following:

      > S3 <- tcrossprod(fit1$loadings)+diag(fit1$uniquenesses)  > log(det(S3))
[1] -6.725685

      Shouldn't this be something closer to the 0.4755 for fit1$criteria?

      Thanks,
      Spencer Graves

JENS: For your purposes, I suggest you try to compute (-n/2)*log(det(tcrossprod(fit$loadings)+diag(fit$uniquenesses)). If you do this, you might get more sensible results with your examples.

      Hope this helps. 
      Spencer Graves

Jens Oehlschlägel wrote:
> Dear list members,

>

> Could any expert on factor analysis be so kind to explain how to calculate AIC on the output of factanal. Do I calculate AIC wrong or is factanal$criteria["objective"] not a negative log-likelihood?
>

> Best regards
>
>

> Jens Oehlschlägel
>
>
>

> The AIC calculated using summary.factanal below don't appear correct to me:
>

> n items factors total.df rest.df model.df LL AIC AICc BIC
> 1 1000 20 1 210 170 40 -5.192975386 90.38595 93.80618 286.6962
> 2 1000 20 2 210 151 59 -3.474172303 124.94834 132.48026 414.5059
> 3 1000 20 3 210 133 77 -1.745821627 157.49164 170.51984 535.3888
> 4 1000 20 4 210 116 94 -0.120505369 188.24101 207.97582 649.5700
> 5 1000 20 5 210 100 110 -0.099209921 220.19842 247.66749 760.0515
> 6 1000 20 6 210 85 125 -0.072272695 250.14455 286.18574 863.6140
> 7 1000 20 7 210 71 139 -0.054668588 278.10934 323.36515 960.2873
> 8 1000 20 8 210 58 152 -0.041708051 304.08342 358.99723 1050.0622
> 9 1000 20 9 210 46 164 -0.028686298 328.05737 392.87174 1132.9292
> 10 1000 20 10 210 35 175 -0.015742783 350.03149 424.78877 1208.8887
> 11 1000 20 11 210 25 185 -0.007007901 370.01402 454.55947 1277.9487
> 12 1000 20 12 210 16 194 -0.005090725 388.01018 481.99776 1340.1147
> >

> summary.factanal <- function(object, ...){
> if (inherits(object, "try-error")){
> c(n=NA, items=NA, factors=NA, total.df=NA, rest.df=NA, model.df=NA, LL=NA, AIC=NA, AICc=NA, BIC=NA)
> }else{
> n <- object$n.obs
> p <- length(object$uniquenesses)
> m <- object$factors
> model.df <- (p*m) + (m*(m+1))/2 + p - m^2
> total.df <- p*(p+1)/2
> rest.df <- total.df - model.df # = object$dof
> LL <- -as.vector(object$criteria["objective"])
> k <- model.df
> aic <- 2*k - 2*LL
> aicc <- aic + (2*k*(k+1))/(n-k-1)
> bic <- k*log(n) - 2*LL
> c(n=n, items=p, factors=m, total.df=total.df, rest.df=rest.df, model.df=model.df, LL=LL, AIC=aic, AICc=aicc, BIC=bic)
> }
> }
>

> multifactanal <- function(factors=1:3, ...){
> names(factors) <- factors
> ret <- lapply(factors, function(factors){
> try(factanal(factors=factors, ...))
> })
> class(ret) <- "multifactanal"
> ret
> }
>

> summary.multifactanal <- function(object,...){
> do.call("rbind", lapply(object, summary.factanal))
> }
>

> print.multifactanal <- function(x,...){
> ret <- summary.multifactanal(x)
> print(ret, ...)
> invisible(ret)
> }
>

> # simulate a true 4-factor model
> n <- 1000
> ktrue <- 4
> kfac <- 5
> true <- matrix(rnorm(n*ktrue), ncol=ktrue)
> x <- matrix(rep(true, kfac)+rnorm(n*ktrue*kfac), ncol=ktrue*kfac)
> dimnames(x) <- list(NULL, paste(rep(letters[1:ktrue], kfac), rep(1:kfac, rep(ktrue, kfac)), sep=""))
> covmat <- cov.wt(x)
> # run factanal for several numbers of factors
> mf <- multifactanal(factors=1:12, covmat=covmat)
> mf
>
>

> version
> _
> platform i386-pc-mingw32
> arch i386
> os mingw32
> system i386, mingw32
> status
> major 2
> minor 5.0
> year 2007
> month 04
> day 23
> svn rev 41293
> language R
> version.string R version 2.5.0 (2007-04-23)
>

R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel Received on Sat 05 May 2007 - 02:24:45 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sat 05 May 2007 - 09:33:47 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel. Please read the posting guide before posting to the list.