Re: [R] How to do multi-factor stratified sampling in R

From: David Winsemius <dwinsemius_at_comcast.net>
Date: Sat, 08 Mar 2008 20:54:17 +0000 (UTC)

"Robert A. LaBudde" <ral_at_lcfltd.com> wrote in news:0JXF00LSO864ATE0_at_vms040.mailsrvcs.net:

> Given a set of data with a number of variables plus a response, I'd
> like to obtain a randomized subset of the rows such that the
> marginal proportions of each variable in the subset closely match
> those of the full dataset, possibly also preserving the two-factor
> interaction marginal proportions for some pairs.
>
> This must be a common problem in data mining, but I don't seem to be
> able to locate the proper library or function for doing this in R.
>
> Thanks for any help.

Have you looked at the "sampling" package? I have never used it, but its strata() function appears to support stratified sampling on several stratification variables at once, which is what you describe.
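If it helps to see the idea without installing anything, here is a minimal base-R sketch of proportional stratified sampling. It does not use the "sampling" package, so check that package's own documentation for its arguments. The data frame `dat`, the column names `a`, `b`, `resp`, the helper `stratified_subset()` and the 25% fraction are all made up for illustration.

```r
# Hypothetical example data: two factors plus a response.
set.seed(1)
dat <- data.frame(
  a    = sample(c("x", "y"), 200, replace = TRUE),
  b    = sample(c("u", "v", "w"), 200, replace = TRUE),
  resp = rnorm(200)
)

# Proportional stratified sampling (illustrative helper, not a package API):
# draw roughly the same fraction of rows from every non-empty cell of the
# cross-classification of the chosen factors.
stratified_subset <- function(data, factors, frac) {
  # Row indices grouped by cell; empty cells are dropped.
  cells <- split(seq_len(nrow(data)), data[factors], drop = TRUE)
  picked <- lapply(cells, function(idx) {
    n <- max(1, round(frac * length(idx)))   # at least one row per cell
    idx[sample.int(length(idx), n)]          # sample without replacement
  })
  data[sort(unlist(picked, use.names = FALSE)), , drop = FALSE]
}

sub <- stratified_subset(dat, c("a", "b"), frac = 0.25)

# Compare marginal proportions in the full data and in the subset.
round(prop.table(table(dat$a)), 2)
round(prop.table(table(sub$a)), 2)
```

Because the strata here are the cells of the a-by-b table, the one-way margins and the two-factor proportions are both preserved up to rounding. With many factors the cells become sparse, and forcing at least one row per cell will over-represent the rare ones, so you may want to stratify only on the pairs you care about.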

-- 
David Winsemius

Received on Sat 08 Mar 2008 - 20:58:18 GMT