[R] pulling items out of a lm() call

From: Andrew Gelman <gelman_at_stat.columbia.edu>
Date: Mon 01 May 2006 - 20:46:59 EST

I want to write a function to standardize regression predictors, which will require me to do some character-string manipulation to parse the variables in a call to lm() or glm().

For example, consider the call
lm (y ~ female + I(age^2) + female:black + (age + education)*female).

I want to be able to parse this to pick out the input variables ("female", "age", "black", "education"). Then I can transform these as appropriate (to get "z.female", "z.age", etc), feed them back into the lm() function, and go from there.
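The rename-and-refit step described above could be sketched roughly as follows. Everything here is an assumption made for illustration: the toy data frame `dat`, the hard-coded `inputs` vector, and the choice to standardize by centering and dividing by one standard deviation. The symbol swap uses `substitute()` on the formula itself rather than on its text.

```r
# Hypothetical toy data; column names follow the example formula.
set.seed(1)
n <- 50
dat <- data.frame(
  female    = rbinom(n, 1, 0.5),
  black     = rbinom(n, 1, 0.3),
  age       = rnorm(n, 40, 10),
  education = sample(1:4, n, replace = TRUE)
)
dat$y <- rnorm(n)

f <- y ~ female + I(age^2) + female:black + (age + education) * female

# Input variables, written out by hand for this sketch.
inputs <- c("female", "age", "black", "education")

# Add standardized copies named z.<name>
# (centering and scaling by one sd is an assumed definition).
z.dat <- dat
for (v in inputs) {
  z.dat[[paste0("z.", v)]] <- (dat[[v]] - mean(dat[[v]])) / sd(dat[[v]])
}

# Replace each input symbol in the formula with its z.* counterpart.
swap <- setNames(lapply(paste0("z.", inputs), as.name), inputs)
z.f <- as.formula(do.call("substitute", list(f, swap)),
                  env = environment(f))

# Refit with the standardized predictors.
z.fit <- lm(z.f, data = z.dat)
```

Because the swap operates on symbols in the parsed formula, `I(age^2)` becomes `I(z.age^2)` without any special handling of parentheses or operators.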

Does anyone know an easy way to pull out the variables? I basically have to parse out the symbols "+", ":", "*", and " ", but there's also the problem of handling parentheses and the I() operator.
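One possible route that avoids string parsing altogether: a formula is already a parsed language object, and base R's `all.vars()` returns the variable names it contains. The sketch below applies it to the example formula; the object names `f` and `inputs` are illustrative only.

```r
f <- y ~ female + I(age^2) + female:black + (age + education) * female

# Every variable in the formula, response included.
all.vars(f)

# Predictors only: in a two-sided formula the right-hand side
# is the third element of the underlying call.
inputs <- all.vars(f[[3]])
inputs
# [1] "female"    "age"       "black"     "education"
```

Operators, parentheses, and `I()` are function calls in the parse tree rather than variables, so they are skipped. For an already fitted model `fit`, the same idea should apply to `formula(fit)`.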


Andrew Gelman
Professor, Department of Statistics
Professor, Department of Political Science

Received on Mon May 01 20:53:25 2006