[R] Reshaping a dataframe with conditional summary of columns - apply or reshape?

From: Dave Hewitt <dhewitt_at_vims.edu>
Date: Wed, 23 Jan 2008 23:31:15 -0500


I need to do a somewhat complex rearrangement of an existing dataframe that involves an "on-the-fly" conditional summary as well. I've tried to make the example as simple as possible. Code to produce the object being acted on is at the end of the message. I hope this isn't too much for a single posting. Any guidance will be much appreciated.

The input example looks like this:

survey	year	count	location	var1	var2
1	1	2	A1	21.2	1
2	1	0	A1	15.6	1
3	1	29	A1	12.1	1
1	1	11	B3	9	2
3	1	5	B3	4.8	2
1	2	7	A2	20.1	4
2	2	2	A2	19.2	4

I need to rearrange it to look like this (the order of the columns is not important):

location	year	survey1	survey2	survey3	var1summ	var2
A1	1	1	0	2	16.3	1
B3	1	2	NA	1	6.9	2
A2	2	1	1	NA	19.7	4

The operations/conditions that I need are:

(A) The observations in "count" for each survey at each location in a given year need to be reduced according to the following rules and then placed in new columns "survey1", "survey2", "survey3":

-- when  count = 0,  surveyX = 0
-- when  0 < count < 11,  surveyX = 1
-- when  count >= 11,  surveyX = 2
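
One possible way to express these rules is with base R's `findInterval()`, shown here as a sketch rather than a definitive answer. It assumes counts are non-negative whole numbers and treats a count of exactly 11 as class 2, which matches the expected output for location B3, survey 1. The function name `recode_count` is hypothetical.

```r
# Counts from the example input, in row order.
count <- c(2, 0, 29, 11, 5, 7, 2)

# findInterval(x, c(1, 11)) counts how many break points are <= x:
#   x = 0      -> 0
#   1 <= x < 11 -> 1
#   x >= 11    -> 2   (assumption: 11 itself belongs to class 2)
# Assumes non-negative integer counts; NA counts stay NA.
recode_count <- function(x) {
  findInterval(x, c(1, 11))
}

score <- recode_count(count)
print(score)
```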

(B) The observations in var1 need to be summarized (here, by taking the mean) across all surveys for a given location in a given year and placed in "var1summ"
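
A sketch of step (B) using base R's `aggregate()`, assuming the input is held in a data frame called `dat` (a hypothetical name) with the columns shown in the example:

```r
# Example input re-created by hand; `dat` is an assumed name.
dat <- data.frame(
  survey   = c(1, 2, 3, 1, 3, 1, 2),
  year     = c(1, 1, 1, 1, 1, 2, 2),
  count    = c(2, 0, 29, 11, 5, 7, 2),
  location = c("A1", "A1", "A1", "B3", "B3", "A2", "A2"),
  var1     = c(21.2, 15.6, 12.1, 9, 4.8, 20.1, 19.2),
  var2     = c(1, 1, 1, 2, 2, 4, 4),
  stringsAsFactors = FALSE
)

# Mean of var1 within each location/year combination.
var1summ <- aggregate(var1 ~ location + year, data = dat, FUN = mean)

# Rename the summarized column to match the desired output.
names(var1summ)[names(var1summ) == "var1"] <- "var1summ"
print(var1summ)
```

Swapping `mean` for `median` or another function changes the summary without touching the rest of the call.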

(C) Observations in var2 need to be summarized in a similar way, except that they don't change across surveys for a given location in a given year (so mean, median, etc. would work, but a more general solution would be nice for variables that are categorical)
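
For step (C), one option that does not rely on arithmetic is to return the single distinct value in each group and stop if the group is not constant. This is only a sketch: the helper name `only_value` and the data frame name `dat` are hypothetical.

```r
# Example input re-created by hand; `dat` is an assumed name.
dat <- data.frame(
  survey   = c(1, 2, 3, 1, 3, 1, 2),
  year     = c(1, 1, 1, 1, 1, 2, 2),
  location = c("A1", "A1", "A1", "B3", "B3", "A2", "A2"),
  var2     = c(1, 1, 1, 2, 2, 4, 4),
  stringsAsFactors = FALSE
)

# Return the one distinct value of x, or fail loudly if x is not constant.
# Works for numeric and character vectors alike.
only_value <- function(x) {
  u <- unique(x)
  if (length(u) != 1) {
    stop("values are not constant within the group")
  }
  u
}

# Apply the helper within each location/year combination.
var2summ <- aggregate(dat["var2"],
                      by = dat[c("location", "year")],
                      FUN = only_value)
print(var2summ)
```

Failing on a non-constant group is a design choice; it surfaces data-entry problems instead of silently averaging them away.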

(D) Surveys (rows) can be missing for a given location in a given year. I'd like to retain a column for each survey in the new dataframe and denote surveys that were missing as NA. E.g., survey2 is missing for location B3 in year 1.
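
Requirement (D) can be sketched in base R by first expanding the data to every location/year/survey combination, so that missing surveys become NA rows, and then calling `reshape()` to go wide. The names `dat`, `all_surveys`, `score`, `full`, `wide`, `summ` and `result` are assumptions made for illustration, and the recoding treats a count of 11 as class 2 to match the expected output.

```r
# Example input re-created by hand; `dat` is an assumed name.
dat <- data.frame(
  survey   = c(1, 2, 3, 1, 3, 1, 2),
  year     = c(1, 1, 1, 1, 1, 2, 2),
  count    = c(2, 0, 29, 11, 5, 7, 2),
  location = c("A1", "A1", "A1", "B3", "B3", "A2", "A2"),
  var1     = c(21.2, 15.6, 12.1, 9, 4.8, 20.1, 19.2),
  var2     = c(1, 1, 1, 2, 2, 4, 4),
  stringsAsFactors = FALSE
)

# Recode counts: 0 -> 0, 1..10 -> 1, 11 and above -> 2 (assumed boundary).
dat$score <- findInterval(dat$count, c(1, 11))

# Every survey that should get a column, whether or not it was observed.
all_surveys <- c(1, 2, 3)

# Cross join of observed location/year pairs with all surveys.
groups <- unique(dat[c("location", "year")])
grid   <- merge(groups, data.frame(survey = all_surveys), by = NULL)

# Left join: surveys that were not carried out get score = NA.
full <- merge(grid, dat[c("location", "year", "survey", "score")],
              all.x = TRUE)

# Long to wide: one row per location/year, columns score1..score3.
wide <- reshape(full, v.names = "score", idvar = c("location", "year"),
                timevar = "survey", direction = "wide", sep = "")
names(wide) <- sub("^score", "survey", names(wide))

# Group summaries; the mean is adequate here because var2 is constant
# within each location/year.
summ <- aggregate(cbind(var1, var2) ~ location + year, data = dat, FUN = mean)
names(summ)[names(summ) == "var1"] <- "var1summ"

result <- merge(wide, summ)  # joins on location and year
print(result)
```

Building the full grid first means a survey column is kept even if that survey is absent from every location, which `reshape()` alone would not guarantee.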

Thanks,
Dave Hewitt
Fisheries Science
Virginia Institute of Marine Science, Gloucester Point, VA, USA



Received on Thu 24 Jan 2008 - 04:42:03 GMT