Re: [R] number of effective tests

From: Daniel Malter <daniel_at_umd.edu>
Date: Thu, 10 Jul 2008 20:34:59 -0700 (PDT)

Hi, what do you mean by "effective number of tests"? How you approach this also depends on the research tradition in your field. Some fields simply enter the variables one at a time in alternative regressions and then include them jointly. However, since your variables are so highly correlated (i.e., they convey almost the same information), you will almost certainly have to reduce the dimensionality of your data if you want to include them "jointly" (basically, you make 2 variables out of your 6, or whatever number is appropriate). PCA, as Moshe suggested, is a good way to do this. It is typically used when your variables are measured without error (that is, when each of them is a hard-fact number). If the variables are measured with error (e.g., subject responses on a survey), you would typically perform factor analysis instead.
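To make "convey almost the same information" concrete, here is a minimal sketch on simulated data. Everything in it is made up for illustration (the latent signals f1 and f2, the column names, the noise level); it is not Georg's data. Six variables are built from two underlying signals, and the eigenvalues of their correlation matrix show how much of the total variance the first few components carry.

```r
set.seed(1)
n  <- 200
f1 <- rnorm(n)               # hypothetical signal behind the first three variables
f2 <- 0.5 * f1 + rnorm(n)    # second signal, moderately correlated with the first

# Six observed variables: three noisy copies of each signal (names are made up)
X <- cbind(A1 = f1 + rnorm(n, sd = 0.2),
           A2 = f1 + rnorm(n, sd = 0.2),
           A3 = f1 + rnorm(n, sd = 0.2),
           B1 = f2 + rnorm(n, sd = 0.2),
           B2 = f2 + rnorm(n, sd = 0.2),
           B3 = f2 + rnorm(n, sd = 0.2))

# Eigenvalues of the correlation matrix; they always sum to the number of variables
ev <- eigen(cor(X), symmetric = TRUE, only.values = TRUE)$values
round(ev, 3)

# Cumulative share of the total variance carried by the first k components
share <- cumsum(ev) / sum(ev)
round(share, 3)
```

If the first two entries of share are already close to 1, the six variables behave like roughly two independent pieces of information. Where to draw that line is a judgment call.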

You may want to standardize each of the six variables before performing PCA or factor analysis so that they are all on the same scale. Otherwise the variables with the greater variance will be much more influential than the others (that's not the best description of it, but I hope it makes the point).
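A small sketch of the scaling point, again on made-up data (the column names small and big and the factor of 100 are assumptions for illustration). Both columns carry the same kind of signal, but one is recorded on a scale 100 times larger. Without scaling, the first component is essentially just the large-variance column; with scale. = TRUE in prcomp(), both columns contribute equally.

```r
set.seed(2)
n <- 200
z <- rnorm(n)   # hypothetical common signal

X <- cbind(small = z + rnorm(n, sd = 0.5),           # standard deviation around 1
           big   = 100 * (z + rnorm(n, sd = 0.5)))   # same kind of signal, 100x the scale

p_raw <- prcomp(X)                   # centered, but not scaled
p_std <- prcomp(X, scale. = TRUE)    # each column scaled to unit variance first

p_raw$rotation[, 1]   # loadings of the first component, unscaled data
p_std$rotation[, 1]   # loadings of the first component, standardized data
```

Calling scale(X) yourself before prcomp() should give the same result as scale. = TRUE.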

Look at prcomp() or princomp() for PCA and at factanal() for factor analysis (there are packages available for factor analysis too, I think).
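A rough sketch of how the three functions are called, using simulated data in place of your six variables. The data, the column names, and the choice of two components/factors are assumptions for illustration only.

```r
set.seed(3)
n  <- 300
f1 <- rnorm(n)   # hypothetical latent signal 1
f2 <- rnorm(n)   # hypothetical latent signal 2

# Made-up data: three noisy copies of each latent signal
X <- cbind(A1 = f1, A2 = f1, A3 = f1, B1 = f2, B2 = f2, B3 = f2) +
     matrix(rnorm(n * 6, sd = 0.4), n, 6)

# PCA on standardized variables
pc <- prcomp(X, scale. = TRUE)
summary(pc)                  # proportion of variance per component

# The same analysis with princomp(), based on the correlation matrix
pc2 <- princomp(X, cor = TRUE)

# Scores on the first two components, e.g. to use as regressors instead of the six variables
scores <- pc$x[, 1:2]

# Maximum-likelihood factor analysis with two factors
fa <- factanal(X, factors = 2)
fa$loadings
```

prcomp() and princomp() should agree on the component standard deviations here; they differ mainly in the numerical method and in what they return. How many components or factors to keep has to be decided from the output (e.g. from summary(pc)); the 2 used above is only an assumption built into the simulated data.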

Best,
Daniel

Georg Ehret wrote:
>
> Dear R community,
> I am using 6 variables to test for an effect (by linear regression).
> These 6 variables are strongly correlated with each other, and I would
> like to find out the number of independent tests that I perform in this
> calculation. For this I calculated a matrix of correlation coefficients
> between the variables (see below). But finding the rank of the table in R
> is not the right approach... What else could I do to find the effective
> number of independent tests?
> Any suggestion would be very welcome!
> Thanking you and with my best regards, Georg.
>

> > for (a in 1:6){
> +   for (b in 1:6){
> +     r[a,b]<-summary(lm(unlist(d[a])~unlist(d[b])),na.action="na.exclude")$adj.r.squared
> +   }
> + }
> > r
>           SR        SU        ST        DR        DU        DT
> SR 1.0000000 0.9636642 0.9554952 0.2975892 0.3211303 0.3314694
> SU 0.9636642 1.0000000 0.9101678 0.3324979 0.3331389 0.3323826
> ST 0.9554952 0.9101678 1.0000000 0.2756876 0.3031676 0.3501157
> DR 0.2975892 0.3324979 0.2756876 1.0000000 0.9981733 0.9674843
> DU 0.3211303 0.3331389 0.3031676 0.9981733 1.0000000 0.9977780
> DT 0.3314694 0.3323826 0.3501157 0.9674843 0.9977780 1.0000000
>
> *************************
> Georg Ehret
> Johns Hopkins University
> Baltimore, US
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
-- 
View this message in context: http://www.nabble.com/number-of-effective-tests-tp18395271p18395867.html
Sent from the R help mailing list archive at Nabble.com.

Received on Fri 11 Jul 2008 - 03:40:54 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 11 Jul 2008 - 05:31:54 GMT.
