Re: [R] Higher Dimensional Matrices

From: <>
Date: Tue 26 Dec 2006 - 00:59:45 GMT

Lars from space [sic] asks:
> -----Original Message-----

> From:
> [] On Behalf Of downunder
> Sent: Monday, 25 December 2006 12:31 PM To:
> Subject: [R] Higher Dimensional Matrices
> Hi all.
> I want to calculate partial correlations while controlling for one
> or more variables. That works already fine when I control for
> example just for x[,1] and x[,2] that gives me one single
> correlation matrix and i have to to it for x [,1]...x[,10]. That
> would give me out 10 matrices. Controlling for 2 Variables 100
> matrices. how can I run a loop to get f.e the 10 or 100 matrices at
> once? I appreciate for every hint. have nice holidays.

I don't quite understand this. You have 10 variables and you want to find the partial correlations controlling for two of them at a time. If you take each possible set of two variables to condition upon at a time, this would give you choose(10, 2) = 45 matrices, wouldn't it? Where do you get '10 or 100' matrices from?

> greetings lars
> x <- read.table("C:/.....dat")
> dim(x) #200x10
> a <- matrix(0,200,10)
> for (i in 1:10)
> a[,i] <- residuals(lm(x[,i]~1+x[,1]+x[,2]))
> b <- matrix(0,200,10)
> for (i in 1:10)
> b[,i] <- residuals(lm(x[,i]~1+x[,1]+x[,2]))
> #a=round(a,5)
> #b=round(b,5)
> d <- cor(a,b)
> d

But a and b are the same, aren't they? Why do you need to compute them twice? Why not just use cor(a, a) [which is the same as cor(a), of course]?

There is a more serious problem here, though. Your residuals are after regression on x[, 1:2] so if you *select* x[, 1:2] as the y-variables in your regression then the residuals are going to be zero, but in practice round-off error. so the first two rows and columns of d will be correlations with round-off error, i.e. misleading junk. It doesn't make sense to include the conditioning variables in the correlation matrix *conditioning on them*. Only the 8 x 8 matrix of the others among themselves is defined, really.

So how do we do it? Here are a few pointers.

To start, here is a function that uses a somewhat more compact way of finding the partial correlations than your method. Sorting out how it works should be a useful exercise.

partialCorr <- function (cond, mat)

        cor(qr.resid(qr(cbind(1, mat[, cond])), mat[, -cond]))

To find the matrix of partial correlations conditioning on x[, 1:2] you would use

d <- partialCorr(c(1,2), x)

So how to do it for all possible conditioning pairs of variables. Well you could do it in an obvious loop:

cmats <- array(0, c(8,8,45))
k <- 0
for(i in 1:9) for(j in (i+1):10) {

    k <- k+1
    cmats[, , k] <- partialCorr(c(i, j), x) }

Now the results are in a 3-way array, but without any kind of labels. Perhaps you should think about how to fix that one yourself...

With more than two conditioning variables you should probably use a function to generate the subsets of the appropriate size rather than trying to write ever more deeply nested loops. There are plenty of functions around to do this.

Bill Venables. mailing list PLEASE do read the posting guide and provide commented, minimal, self-contained, reproducible code. Received on Tue Dec 26 12:05:01 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Tue 26 Dec 2006 - 06:30:23 GMT.

Mailing list information is available at Please read the posting guide before posting to the list.