Re: [R] faster GLS code

From: Charles C. Berry <cberry_at_tajo.ucsd.edu>
Date: Thu, 07 Jan 2010 11:42:02 -0800

On Thu, 7 Jan 2010, Ravi Varadhan wrote:

> Try this:
>
>
> X <- kronecker(diag(1,3),x)
> Y <- c(y) # stack the y in a vector
>
> # residual covariance matrix for each observation
> covar <- kronecker(sigma,diag(1,N))
>
> csig <- chol2inv(covar)
> betam2 <- ginv(csig %*% X) %*% csig %*% Y
>
> This is more than 2 times faster than your code (however, it doesn't compute `betav') .
>

Faster still (by a wide margin) if X will truly be of that form:

> B <- coef(lm(y~0+.,as.data.frame(x)))
> all.equal( as.vector((B)), as.vector(betam))
[1] TRUE When X is of that form, the covariance matrix drops out of the computation.

:)

Chuck

> Here is a timing comparison:
>
> # Your method
> # GLS betas covariance matrix
> system.time({
> inv.sigma <- solve(covar)
> betav <- solve(t(X)%*%inv.sigma%*%X)
>
> # GLS mean parameter estimates
> betam <- betav%*%t(X)%*%inv.sigma%*%Y
> })
>
> # New method
> system.time({
> csig <- chol2inv(covar)
> betam2 <- ginv(csig %*% X) %*% csig %*% Y
> })
>
> all.equal(betam, betam2)
>
>> # GLS betas covariance matrix
>> system.time({
> + inv.sigma <- solve(covar)
> + betav <- solve(t(X)%*%inv.sigma%*%X)
> +
> + # GLS mean parameter estimates
> + betam <- betav%*%t(X)%*%inv.sigma%*%Y
> + })
> user system elapsed
> 1.14 0.51 1.76
>>
>> system.time({
> + csig <- chol2inv(covar)
> + betam2 <- ginv(csig %*% X) %*% csig %*% Y
> + })
> user system elapsed
> 0.47 0.08 0.61
>>
>> all.equal(betam, betam2)
> [1] TRUE
>>
>
>
> Hope this helps,
> Ravi.
> ____________________________________________________________________
>
> Ravi Varadhan, Ph.D.
> Assistant Professor,
> Division of Geriatric Medicine and Gerontology
> School of Medicine
> Johns Hopkins University
>
> Ph. (410) 502-2619
> email: rvaradhan_at_jhmi.edu
>
>
> ----- Original Message -----
> From: Carlo Fezzi <c.fezzi_at_uea.ac.uk>
> Date: Thursday, January 7, 2010 12:13 pm
> Subject: [R] faster GLS code
> To: r-help_at_r-project.org
>
>
>> Dear helpers,
>>
>> I wrote a code which estimates a multi-equation model with generalized
>> least squares (GLS). I can use GLS because I know the covariance
>> matrix of
>> the residuals a priori. However, it is a bit slow and I wonder if anybody
>> would be able to point out a way to make it faster (it is part of a bigger
>> code and needs to run several times).
>>
>> Any suggestion would be greatly appreciated.
>>
>> Carlo
>>
>>
>> ***************************************
>> Carlo Fezzi
>> Senior Research Associate
>>
>> Centre for Social and Economic Research
>> on the Global Environment (CSERGE),
>> School of Environmental Sciences,
>> University of East Anglia,
>> Norwich, NR4 7TJ
>> United Kingdom.
>> email: c.fezzi_at_uea.ac.uk
>> ***************************************
>>
>> Here is an example with 3 equations and 2 exogenous variables:
>>
>> ----- start code ------
>>
>>
>> N <- 1000 # number of observations
>> library(MASS)
>>
>> ## parameters ##
>>
>> # eq. 1
>> b10 <- 7; b11 <- 2; b12 <- -1
>>
>> # eq. 2
>> b20 <- 5; b21 <- -2; b22 <- 1
>>
>> # eq.3
>> b30 <- 1; b31 <- 5; b32 <- 2
>>
>> # exogenous variables
>>
>> x1 <- runif(min=-10,max=10,N)
>> x2 <- runif(min=-5,max=5,N)
>>
>> # residual covariance matrix
>> sigma <- matrix(c(2,1,0.7,1,1.5,0.5,0.7,0.5,2),3,3)
>>
>> # residuals
>> r <- mvrnorm(N,mu=rep(0,3), Sigma=sigma)
>>
>> # endogenous variables
>>
>> y1 <- b10 + b11 * x1 + b12*x2 + r[,1]
>> y2 <- b20 + b21 * x1 + b22*x2 + r[,2]
>> y3 <- b30 + b31 * x1 + b32*x2 + r[,3]
>>
>> y <- cbind(y1,y2,y3) # matrix of endogenous
>> x <- cbind(1,x1, x2) # matrix of exogenous
>>
>>
>> #### MODEL ESTIMATION ###
>>
>> # build the big X matrix needed for GLS estimation:
>>
>> X <- kronecker(diag(1,3),x)
>> Y <- c(y) # stack the y in a vector
>>
>> # residual covariance matrix for each observation
>> covar <- kronecker(sigma,diag(1,N))
>>
>> # GLS betas covariance matrix
>> inv.sigma <- solve(covar)
>> betav <- solve(t(X)%*%inv.sigma%*%X)
>>
>> # GLS mean parameter estimates
>> betam <- betav%*%t(X)%*%inv.sigma%*%Y
>>
>> ----- end of code ----
>>
>> ______________________________________________
>> R-help_at_r-project.org mailing list
>>
>> PLEASE do read the posting guide
>> and provide commented, minimal, self-contained, reproducible code.
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Charles C. Berry                            (858) 534-2098
                                             Dept of Family/Preventive Medicine
E mailto:cberry_at_tajo.ucsd.edu	            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Thu 07 Jan 2010 - 19:46:46 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 07 Jan 2010 - 20:40:08 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.

list of date sections of archive