Re: [R] from list to dataframe

From: Stephen D. Weigand
Date: Thu 19 May 2005 - 13:13:27 EST

On May 18, 2005, at 5:39 PM, wrote:

> I was wondering if someone can help me figure out the following:
> I have two patient datasets, ds1 and ds2. ds1 has fields "patid",
> "date", and "lab1". ds2 has "patid", "date", and "lab2". I want to
> find all the patids that have at least 2 dated records for each lab.
> I started by splitting each dataset by patid, to create ds1.list and
> ds2.list. Then I did some processing (with sapply) to each list to
> get the lengths of each patient list item. Then I kind of lost my way
> and things got messy as I tried to extract just the patids of those
> with lengths >= 2, convert them to dataframes (which I didn't have
> much success with), and then merge the two dataframes to get a vector
> of the desired patids. Any help would be much appreciated.
> Thanks,
> Steven


I might not exactly understand your problem, but for what it's worth, you could try to identify the patients in ds1 who appear at least twice and identify the patients in ds2 who appear at least twice via

ptid1 <- c("A", "A", "B", "C", "D", "D")
keep1 <- names(table(ptid1))[table(ptid1) >= 2]
keep1

or if ptid is numeric

ptid1 <- c(1, 1, 2, 3, 4, 4)
keep1 <- as.numeric(names(table(ptid1))[table(ptid1) >= 2])
keep1
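
As a rough sketch, the counting step can be wrapped in a small helper so the same code serves both data sets. The function name ids_with_min_records and its min_n argument are made up for illustration; the toy vectors are the ones used in this message.

```r
# Hypothetical helper: return the ids that occur at least `min_n` times.
ids_with_min_records <- function(ids, min_n = 2) {
  counts <- table(ids)                     # one count per distinct id
  keep <- names(counts)[counts >= min_n]   # names() are always character
  # table() stores the ids as character names, so convert back if needed
  if (is.numeric(ids)) keep <- as.numeric(keep)
  keep
}

ids_with_min_records(c("A", "A", "B", "C", "D", "D"))
ids_with_min_records(c(1, 1, 2, 3, 4, 4))
```

Computing table() once and reusing the result avoids tabulating the ids twice, which may matter a little on large data sets.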

then, after computing keep2 from ds2 in the same way, subset the respective data sets on the patid column via

ds1.keep <- subset(ds1, patid %in% intersect(keep1, keep2))
ds2.keep <- subset(ds2, patid %in% intersect(keep1, keep2))

then use merge().
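
For what it's worth, here is a minimal end-to-end sketch of those steps on toy data. The values are invented; only the column names (patid, date, lab1, lab2) follow the question. Merging on both patid and date with all = TRUE is an assumption about what is wanted, so adjust the by and all arguments as needed.

```r
# Toy data: the values are invented, only the column names follow the question.
ds1 <- data.frame(
  patid = c("A", "A", "B", "C", "D", "D"),
  date  = as.Date("2005-01-01") + 0:5,
  lab1  = c(5.1, 5.3, 4.8, 6.0, 5.5, 5.7)
)
ds2 <- data.frame(
  patid = c("A", "A", "B", "B", "D"),
  date  = as.Date("2005-01-01") + c(0, 2, 3, 4, 5),
  lab2  = c(10, 12, 9, 11, 13)
)

# Patients with at least 2 records in each data set
tab1 <- table(ds1$patid)
tab2 <- table(ds2$patid)
keep1 <- names(tab1)[tab1 >= 2]
keep2 <- names(tab2)[tab2 >= 2]

# Patients meeting the criterion in both data sets
both <- intersect(keep1, keep2)

ds1.keep <- subset(ds1, patid %in% both)
ds2.keep <- subset(ds2, patid %in% both)

# all = TRUE keeps dates present in only one data set (NA for the other lab)
merged <- merge(ds1.keep, ds2.keep, by = c("patid", "date"), all = TRUE)

both
merged
```

If all that is needed is the vector of patids, both already holds it and the merge() step can be skipped.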

Good luck!

Stephen

Received on Thu May 19 13:21:00 2005
