Re: [R] subsetting with by() or other function??

From: Brian S Cade <brian_cade_at_usgs.gov>
Date: Fri 14 Oct 2005 - 01:48:16 EST


Fair enough. To clarify what I'm trying to achieve, I've pasted below a small piece of the larger data frame. It has a hierarchical structure of factors (LOCID nested within POPULATION), with YEAR in ascending order within each LOCID. I would like to transform the variable DBC into a new variable that is a lag of the previous year's DBC (call it LAG1DBC) within LOCID within POPULATION. The desired outcome is shown in the second example data set, pasted below the first. This setup is needed for some first-order autoregressive analyses (not using the time series library). All the examples I've tried using by() only seem to work for outputting results, not for creating new variables in an existing data frame. I suspect people do this kind of hierarchical subgroup data manipulation all the time in R (I know how to do it easily in SYSTAT), so I'm sure I'm missing some obvious, simple trick. My search of the R-help list archives and other R documentation has not yielded a solution yet. Suggestions are gratefully welcomed.

       LOCID  POPULATION  YEAR        DBC
1      algb-1           A 1992 0.70451575
2      algb-1           A 1993 0.59506851
3      algb-1           A 1997 0.84837544
4      algb-1           A 1998 0.50283182
5      algb-1           A 2000 0.91242707
6      algb-2           A 1992 0.09747155
7      algb-2           A 1993 0.84772253
8      algb-2           A 1997 0.43974081
9      algb-2           A 1998 0.83108544
10     algb-2           A 2000 0.22291192
11     algb-3           A 1992 0.44234175
12     algb-3           A 1993 0.54089534
5680 taylr-73           B 2001 0.43918082
5681 taylr-73           B 2002 0.34694427
5682 taylr-73           B 2003 3.35619190
5683 taylr-73           B 2004 0.71575815
5684 taylr-73           B 2005 0.42038506
5685 taylr-74           B 1992 3.88410354
5686 taylr-74           B 1993 3.32472557
5687 taylr-74           B 1994 3.29861501
5688 taylr-74           B 1996 0.48153827
5689 taylr-74           B 1997 3.63570636
5690 taylr-74           B 1998 1.94630194

       LOCID  POPULATION  YEAR        DBC    LAG1DBC
1      algb-1           A 1992 0.70451575         NA
2      algb-1           A 1993 0.59506851 0.70451575
3      algb-1           A 1997 0.84837544 0.59506851
4      algb-1           A 1998 0.50283182 0.84837544
5      algb-1           A 2000 0.91242707 0.50283182
6      algb-2           A 1992 0.09747155         NA
7      algb-2           A 1993 0.84772253 0.09747155
8      algb-2           A 1997 0.43974081 0.84772253
9      algb-2           A 1998 0.83108544 0.43974081
10     algb-2           A 2000 0.22291192 0.83108544
11     algb-3           A 1992 0.44234175         NA
12     algb-3           A 1993 0.54089534 0.44234175
5680 taylr-73           B 2001 0.43918082         NA
5681 taylr-73           B 2002 0.34694427 0.43918082
5682 taylr-73           B 2003 3.35619190 0.34694427
5683 taylr-73           B 2004 0.71575815 3.35619190
5684 taylr-73           B 2005 0.42038506 0.71575815
5685 taylr-74           B 1992 3.88410354         NA
5686 taylr-74           B 1993 3.32472557 3.88410354
5687 taylr-74           B 1994 3.29861501 3.32472557
5688 taylr-74           B 1996 0.48153827 3.29861501
5689 taylr-74           B 1997 3.63570636 0.48153827
5690 taylr-74           B 1998 1.94630194 3.63570636
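
For illustration, the transformation from the first table to the second could be sketched in base R with ave(), which applies a function within groups and returns a vector in the original row order. This is only a minimal sketch under stated assumptions: the data frame is assumed to be named csss3, only a toy subset of the rows above is used, the rows are assumed to be already sorted by YEAR within LOCID within POPULATION, and the helper name lag1 is hypothetical.

```r
# Toy subset of the rows shown above (data frame name csss3 is assumed)
csss3 <- data.frame(
  LOCID      = factor(c(rep("algb-1", 3), rep("algb-2", 3), rep("taylr-74", 2))),
  POPULATION = factor(c(rep("A", 6), rep("B", 2))),
  YEAR       = c(1992, 1993, 1997, 1992, 1993, 1997, 1992, 1993),
  DBC        = c(0.70451575, 0.59506851, 0.84837544,
                 0.09747155, 0.84772253, 0.43974081,
                 3.88410354, 3.32472557)
)

# Hypothetical helper: shift a vector down one position, padding with NA.
# Returns a vector of the same length as x (including length zero).
lag1 <- function(x) c(NA, x)[seq_along(x)]

# ave() applies lag1 within each POPULATION x LOCID group and puts the
# results back in the original row order
csss3$LAG1DBC <- ave(csss3$DBC, csss3$POPULATION, csss3$LOCID, FUN = lag1)

print(csss3)
```

Because ave() returns one value per input row, the result can be assigned directly as a new column, which is the part that by() does not do on its own. Note that this lags by row, so a gap such as 1993 to 1997 is treated the same as consecutive years, matching the desired output above.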

Brian

Brian S. Cade

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO 80526-8818

email: brian_cade@usgs.gov
tel: 970 226-9326

From: Florence Combes <fcombes@gmail.com>
Date: 10/13/2005 05:34 AM
To: Brian S Cade <brian_cade@usgs.gov>
cc:
Subject: Re: [R] subsetting with by() or other function??

Maybe an example of the data you have and the data you want would help the people on the list understand, and so be able to help you?

best regards,

Florence.

On 10/12/05, Brian S Cade <brian_cade@usgs.gov> wrote:

I think I must be missing something obvious, but I'm having trouble getting a data transformation to work on groupings of data within a data frame (csss3) as defined by 2 factors (population, locid). The data are sorted by year within locid within population, and I want to lag another variable (dbc), i.e., shift its values down by 1 row, replacing the first row with NA, within groups defined by locid nested within population. I thought I could do something using by(csss3, list(locid, population), function), but I don't seem to be having any success. Any suggestions?
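
As a sketch of how the by() idea described here could be made to yield a new column: the function passed to by() can return each group's sub-data-frame with the lagged variable added, and the pieces can then be bound back together. This is a minimal illustration, not a tested solution for the full data. The toy values, the column name lag1dbc, and the result name csss3.lagged are assumptions made for the example.

```r
# Toy stand-in for csss3, using the lower-case names from the message above
csss3 <- data.frame(
  population = factor(c("A", "A", "A", "B", "B")),
  locid      = factor(c("algb-1", "algb-1", "algb-2", "taylr-74", "taylr-74")),
  year       = c(1992, 1993, 1992, 1992, 1993),
  dbc        = c(0.70451575, 0.59506851, 0.09747155, 3.88410354, 3.32472557)
)

# by() hands each group's sub-data-frame to the function. Returning the
# modified piece (rather than a summary) lets the full data frame be rebuilt.
pieces <- by(csss3, list(csss3$locid, csss3$population), function(d) {
  d <- d[order(d$year), ]                       # make sure years ascend
  d$lag1dbc <- c(NA, d$dbc)[seq_len(nrow(d))]   # shift down one row, NA first
  d
})

# Empty locid x population combinations are NULL and are skipped by rbind()
csss3.lagged <- do.call(rbind, pieces)

# Restore the population / locid / year ordering
csss3.lagged <- csss3.lagged[order(csss3.lagged$population,
                                   csss3.lagged$locid,
                                   csss3.lagged$year), ]
print(csss3.lagged)
```

The design point is that by() on its own returns a list-like object of per-group results, so the do.call(rbind, ...) step is what turns those results back into a single data frame.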

Brian

R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Received on Fri Oct 14 02:06:30 2005

This archive was generated by hypermail 2.1.8 : Sun 23 Oct 2005 - 18:50:33 EST