Re: [R] How to make a lagged variable in panel data?

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Sun 14 Aug 2005 - 11:26:50 EST

On 8/13/05, Ila Patnaik <ila@mayin.org> wrote:
> Suppose we observe N individuals, for each of which we have a
> time-series. How do we correctly create a lagged value of the
> time-series variable?
>
> As an example, suppose I create:
>
> A <- data.frame(year=rep(c(1980:1984),3),
> person= factor(sort(rep(1:3,5))),
> wage=c(rnorm(15)))
>
> > A
>    year person        wage
> 1  1980      1  0.17923212
> 2  1981      1  0.25610292
> 3  1982      1  0.50833655
> 4  1983      1 -0.42448395
> 5  1984      1  0.49233532
> 6  1980      2 -0.49928025
> 7  1981      2  0.06842660
> 8  1982      2  0.65677575
> 9  1983      2  0.15947390
> 10 1984      2 -0.46585116
> 11 1980      3 -0.29052635
> 12 1981      3 -0.27109203
> 13 1982      3 -0.76168164
> 14 1983      3  0.02294361
> 15 1984      3  2.22828032
>
> What I'd like to do is to make a lagged wage for each person, i.e., I
> should get an additional variable A$wage.lag1:
>
> > A
>    year person        wage   wage.lag1
> 1  1980      1  0.17923212          NA
> 2  1981      1  0.25610292  0.17923212
> 3  1982      1  0.50833655  0.25610292
> 4  1983      1 -0.42448395  0.50833655
> 5  1984      1  0.49233532 -0.42448395
> 6  1980      2 -0.49928025          NA
> 7  1981      2  0.06842660 -0.49928025
> 8  1982      2  0.65677575  0.06842660
> 9  1983      2  0.15947390  0.65677575
> 10 1984      2 -0.46585116  0.15947390
> 11 1980      3 -0.29052635          NA
> 12 1981      3 -0.27109203 -0.29052635
> 13 1982      3 -0.76168164 -0.27109203
> 14 1983      3  0.02294361 -0.76168164
> 15 1984      3  2.22828032  0.02294361
>

We can use 'by' to split data frame A by person and apply the function f to each such subset of rows. Function f turns the portion of wage that corresponds to a single person into a ts class time series so that we can use lag with it, and then cbinds wage together with its lag. Because the lagged series runs one year past the end of the original, the cbind'ed result has one extra row, so we extract only those times that correspond to the original series, since the example output only includes those. Note that this extraction has the side effect of turning wages into a plain matrix rather than a time series.

We then put everything together using cbind(...) once again, and once the 'by' is complete we rbind all the rows together. This relies on each person's rows being sorted by year with no missing years, since ts assumes regularly spaced observations.

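	f <- function(x) {
		# one person's wages as an annual series starting in their first year
		wage <- ts(x$wage, start = x$year[1])
		idx <- seq_along(wage)
		# lag(wage, -1) shifts the series forward one year; keep the original rows.
		# stats::lag is spelled out in case another package masks lag.
		wages <- cbind(wage, wage.lag1 = stats::lag(wage, -1))[idx,]
		cbind(x[,1:2], wages)
	}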

	result <- do.call("rbind", by(A, A$person, f))
	result
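
As a cross-check on the by/ts approach, here is a minimal sketch of the same per-person lag in base R without the ts class, using ave(). It assumes, as above, that the rows are sorted by year within each person and that no years are missing. The helper name lag1 is hypothetical, and the wage values are made up (fixed rather than rnorm) so that the result can be checked by eye.

```r
# Small deterministic panel: 3 persons, 5 consecutive years each.
# The wage values are made up purely for illustration.
A <- data.frame(year   = rep(1980:1984, 3),
                person = factor(rep(1:3, each = 5)),
                wage   = (1:15) / 10)

# Hypothetical helper: shift a vector down by one position, padding with NA.
lag1 <- function(v) c(NA, v[-length(v)])

# ave() applies lag1 separately within each person and returns the values
# in the original row order, so they can be assigned as a new column.
A$wage.lag1 <- ave(A$wage, A$person, FUN = lag1)

A
```

Each person's first year comes out as NA and every other row carries that person's wage from the previous row, which is the layout asked for in the question. Unlike the ts version, this shifts by row position rather than by year, so a gap in the years would silently pair non-adjacent years.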

Received on Sun Aug 14 12:34:07 2005