Re: [R] within group sequential subtraction

From: Joshua Wiley <jwiley.psych_at_gmail.com>
Date: Thu, 10 Mar 2011 10:19:01 -0800

Dear Natalie,

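I am sure there are other ways, but one way to do this is to apply diff() to each group using tapply() or by(). Because those return one result per group (a list-like object), you can wrap the whole call in unlist() if you want to add the result back into your data frame. Note that unlist() returns the values in group order, so this relies on the rows already being sorted by group, as they are in your data. Here is an example: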

dat <- structure(list(
  group = c("IND1", "IND1", "IND2", "IND2", "IND2",
            "IND3", "IND4", "IND5", "IND6", "IND6"),
  date_obs = structure(c(6468, 7063, 9981, 14186, 14372,
                         5129, 9767, 11168, 10243, 10647), class = "Date")),
  .Names = c("group", "date_obs"),
  row.names = c(NA, 10L), class = "data.frame")

## calculate differences using diff() by each group
## note the prepended NA
dat$time <- unlist(tapply(dat$date_obs, dat$group, function(x) diff(c(NA, x))))

dat ## updated data frame
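
If the rows might not be sorted by group, a variant using ave() may be safer, because ave() returns its result in the original row order. The following is only a minimal sketch: the helper name lag_diff and the column name time2 are made up for illustration, and it assumes the dates are already in chronological order within each group.

```r
## rebuild the example data so this block runs on its own
dat <- data.frame(
  group = c("IND1", "IND1", "IND2", "IND2", "IND2",
            "IND3", "IND4", "IND5", "IND6", "IND6"),
  date_obs = as.Date(c(6468, 7063, 9981, 14186, 14372,
                       5129, 9767, 11168, 10243, 10647),
                     origin = "1970-01-01"),
  stringsAsFactors = FALSE)

## hypothetical helper: difference from the previous value, NA for the first
lag_diff <- function(x) c(NA, diff(x))

## ave() applies the helper within each group and keeps the original row order;
## as.numeric() turns the dates into day counts so the result is plain numeric
dat$time2 <- ave(as.numeric(dat$date_obs), dat$group, FUN = lag_diff)

dat ## data frame with the new column
```

Groups with a single observation simply get NA, since diff() of one value has length zero.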

HTH, Josh

On Thu, Mar 10, 2011 at 6:56 AM, natalie.vanzuydam <nvanzuydam_at_gmail.com> wrote:
> Hi Everyone,
>
> I would like to do sequential subtractions within a group so that I know the
> time between separate observations for a group of individuals.
>
> My data:
>
> data <- structure(list(group = c("IND1", "IND1", "IND2",
> "IND2", "IND2", "IND3", "IND4", "IND5",
> "IND6", "IND6"), date_obs = structure(c(6468,
> 7063, 9981, 14186, 14372, 5129, 9767, 11168, 10243, 10647), class =
> "Date")), .Names = c("group",
> "date_obs"), row.names = c(NA, 10L), class = "data.frame")
>
> So I start with:
>
>    group   date_obs
> 1   IND1 1987-09-17
> 2   IND1 1989-05-04
> 3   IND2 1997-04-30
> 4   IND2 2008-11-03
> 5   IND2 2009-05-08
> 6   IND3 1984-01-17
> 7   IND4 1996-09-28
> 8   IND5 2000-07-30
> 9   IND6 1998-01-17
> 10  IND6 1999-02-25
>
> what I would like:
>
>    group   date_obs time
> 1   IND1 1987-09-17   NA
> 2   IND1 1989-05-04  595
> 3   IND2 1997-04-30   NA
> 4   IND2 2008-11-03 4205
> 5   IND2 2009-05-08  186
> 6   IND3 1984-01-17   NA
> 7   IND4 1996-09-28   NA
> 8   IND5 2000-07-30   NA
> 9   IND6 1998-01-17   NA
> 10  IND6 1999-02-25  404
>
> So that if there is one entry/individual a 0/NA would be acceptable and if
> there is more than one entry/individual the sequential difference would be
> calculated.
>
> I started with some code but I cannot edit it appropriately.
>
> x <- do.call(rbind, lapply(split(data, data$group),
>        function(dat) {
>                        dat <- dat[order(dat$date_obs), ]
>                        d<-diff(dat$date_obs)
>                         dat <- rbind(dat,d)
>                        }))
>
> I get this error: "Error in as.Date.numeric(value) : 'origin' must be
> supplied" so I'm not sure if it does what I need it to do.  In addition to
> this the vector lengths won't match up as the first date in the sequence
> won't be subtracted from itself.
>
> I'm not sure if anyone knows an easier way to achieve this.
>
> Thanks for the help,
> Natalie
>
> -----
> Natalie Van Zuydam
>
> PhD Student
> University of Dundee
> nvanzuydam_at_dundee.ac.uk
> --
> View this message in context: http://r.789695.n4.nabble.com/within-group-sequential-subtraction-tp3346033p3346033.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

Received on Thu 10 Mar 2011 - 18:23:00 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 10 Mar 2011 - 18:30:20 GMT.
