From: Thomas Lumley <tlumley_at_u.washington.edu>

Date: Thu, 03 Jun 2010 09:05:26 -0700 (PDT)

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Thu 03 Jun 2010 - 16:10:57 GMT

Date: Thu, 03 Jun 2010 09:05:26 -0700 (PDT)

On Thu, 3 Jun 2010, Roni Kobrosly wrote:

> Hello,

*>
**> I'm using a complex survey dataset and my goal is to simply spit out a bunch of probability-weighted outcome variable means for the different levels of covariate. So I first define the structure of the study design (I'm using the CDC's NHANES data):
**>
**> dhanes <- svydesign(id=~PSU, strat=~STRATA, weight=~lab_weight, data=final, nest=TRUE)
**>
**> No problem there.
**> Now I use the "svyby" function as follows:
**>
**> svyby(~outcome, ~covariate, design=dhanes, svymean, na.rm=T) -> haha
**> print(haha)
**>
**> covariate outcome se.outcome
**> 1 1 0.4961189 0.08828457
**> 2 2 0.4474706 0.22214557
**> 3 3 0.5157026 0.12076008
**> 4 4 0.6773910 0.20605025
**> NA NA 0.8728167 0.15622274
**>
**> ...and it works just fine. I get a nice table of the mean and standard error for each level of the covariate. I started writing a custom function to automate this and I had problems. Consider this really basic custom function (that does not seem very different from the above code):
**>
**> this_is_a_test <-function(outcome, covariate)
**> {
**>
**> svyby(~outcome, ~covariate, design=dhanes, svymean, na.rm=T) -> haha
**>
**> print(hah)
**>
**>
**> }
**>
*

You are asking for the mean of a variable called 'outcome', divided up according to a variable called 'covariate'. Presumably you don't have variables with either of those names, so R is getting confused.

Formulas don't work the way you want them to. As a simpler example with nothing to do with the survey package

this_is_a_simpler_example<-function(outcome){

~outcome

}

*> this_is_a_simpler_example(test)
*

~outcome

If you want to substitute a variable into a formula, you need to do it yourself. In your case, you probably want to use make.formula(), from the survey package

*> make.formula("test")
*

~test

*> make.formula(c("fred","barney","wilma"))
*

~fred + barney + wilma

Presumably you want to do something like

approach_that_works <-function(outcome, covariate, design=dhanes,...) svyby(make.formula(outcome), make.formula(covariate), design,...)

some_outcomes <- colnames(dhanes)[47:63]

some_covariates <- colnames(dhanes)[83:95]

lapply( some_outcomes,

function(an_outcome) lapply(some_covariates, approach_that_works, outcome=an_outcome) )

For another recent thread using another approach to a related question, see http://tolstoy.newcastle.edu.au/R/e10/help/10/05/5676.html

-thomas

Thomas Lumley Assoc. Professor, Biostatistics tlumley_at_u.washington.edu University of Washington, Seattle ______________________________________________R-help_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Thu 03 Jun 2010 - 16:10:57 GMT

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.2.0, at Thu 03 Jun 2010 - 16:20:28 GMT.

*
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help.
Please read the posting
guide before posting to the list.
*