Re: [R] PCA problem in R

From: Liaw, Andy <andy_liaw_at_merck.com>
Date: Tue 16 Aug 2005 - 03:14:07 EST

> From: Dennis Shea
>
> [SNIP]
> >>>On Sat, 13 Aug 2005, Alan Zhao wrote:
> >>>
> >>>>When I have more variables than units, say a 195*10896 matrix which has
> >>>>10896 variables and 195 samples. prcomp will give only 195 principal
> >>>>components. I checked in the help, but there is no explanation that why
> >>>>this happen.
>
> [SNIP]
>
> >Sincerely,
> >Zheng Zhao
> >Aug-14-2005
> >______________________________________________
>

> Just yesterday I subscribed to r-help because I am planning
> on learning the basics of R ... today. :-)
> Thus, I am not sure about the history of this question.
>
> The above situation, more variables than samples,
> is commonly encountered in climate studies.
> Consider annual mean temperatures for 195 years
> on a coarse 72 [lat] x 144 [lon] grid [72*144=10368
> spatial variables].
>
> Let S be the number of grid points and T be the number
> of years. I think there is a theorem (?Eckart-Young?)
> which states that the maximum number of unique eigenvalues
> is min(S,T). In your case 195 eigenvalues is correct.

> I speculate that the underlying function transposes the
> input data matrix and computes the TxT [rather than SxS]
> covariance matrix and solves for the eigenvalues/vectors.
> It then uses a linear transformation to get the results
> for the original input data matrix.
>
> Computationally, the above is much faster and uses less memory.

It is usually a good idea to consult the help page before speculating. ?prcomp has, in its `Details' section:

The calculation is done by a singular value decomposition of the (centered and possibly scaled) data matrix, not by using eigen on the covariance matrix. This is generally the preferred method for numerical accuracy.
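
A minimal sketch of what that means in practice, using simulated data rather than the poster's matrix (the sizes n = 20 and p = 100 and all variable names are made up for illustration). It checks that prcomp returns min(n, p) components, that the same standard deviations come out of an explicit svd() of the centered matrix, and that the small n x n cross-product matrix has the same eigenvalues, which is why a shortcut like the one speculated about would give the same answer:

```r
set.seed(1)
n <- 20    # samples (rows); hypothetical size
p <- 100   # variables (columns); hypothetical size
X <- matrix(rnorm(n * p), nrow = n, ncol = p)

# prcomp centers the columns by default and does not scale them
pc <- prcomp(X)
n_comp <- length(pc$sdev)   # number of components returned: min(n, p)

# The same standard deviations from an explicit SVD of the centered matrix
Xc <- scale(X, center = TRUE, scale = FALSE)
s <- svd(Xc)
sdev_svd <- s$d / sqrt(n - 1)
same_as_svd <- isTRUE(all.equal(pc$sdev, sdev_svd))

# Eigenvalues of the small n x n matrix Xc %*% t(Xc) / (n - 1) agree with
# the component variances, so the large p x p covariance matrix is never needed
ev_small <- eigen(tcrossprod(Xc) / (n - 1), symmetric = TRUE,
                  only.values = TRUE)$values
same_as_eigen <- isTRUE(all.equal(pc$sdev^2, ev_small))
```

Because the columns are centered, the last of the min(n, p) standard deviations is zero up to rounding, so at most n - 1 components carry any variance.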

Andy

> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> http://www.R-project.org/posting-guide.html