[R] linear models and colinear variables...

From: Peter Gaffney <petertgaffney_at_yahoo.com>
Date: Thu 01 Jul 2004 - 09:32:59 EST


Hi!

I'm having some trouble, on both the conceptual and technical levels, selecting the right combination of variables for a model I'm working on. The basic, all-inclusive form looks like

lm(mic ~ B * D * S * U * V * ICU)

where mic, U, V, and ICU are numeric values and B, D, and S are factors with about 16, 16, and 2 levels respectively. In short, there's a ton of actual explanatory variables (one coefficient per expanded term) that look something like this:

Bstaph.aureus:Dvan:Sr:U:ICU

There are a good number of hits, but there's also a staggering number of complete misses, due to a combination of scarce data in that particular niche and an actual lack of deviation from the categorical mean. My suspicion is that there's a large degree of colinearity among some of these variables, which reduces the apparent effect of each member of a nearly colinear pair to an insignificant level; my hope is that removing one variable from a mostly colinear group would allow the other variables' possibly significant effects to be measured.

Question 1) Is this legitimate at all? Can I do regression using the entire data set over only selected factors while ignoring others?
(Admittedly, I only just got my Bachelor's in math; the gaps in my knowledge here are profound and aggravating.)

Question 2) How do I go about selecting possible colinear explanatory variables?
I had originally thought I'd just make a matrix of coefficients of colinearity for each pair of variables and iteratively re-run the model until I got the results I wanted, but I can't really figure out how to do this. In addition, I'm not sure how to do this in the model syntax once I've actually decided on some variables to exclude.
For instance, suppose I wanted to run the model as above without the variable Bstaph.aureus:Dvan:Sr:U:ICU. What I tried was

lm(mic ~ B * D * S * U * V * ICU - Bstaph.aureus:Dvan:Sr:U:ICU)

Obviously this doesn't work because the variable name Bstaph.aureus:Dvan:Sr:U:ICU hasn't been recognized yet. How do I do this? My best guess so far is to build and define each of the variables like Bstaph.aureus:Dvan:Sr:U:ICU by hand with some imperative/iterative style programming using some kind of string generation system. This sounds like a royal pain, and is something I'd rather avoid doing if at all possible.
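For reference, here is a minimal sketch of one way both steps might be approached without building the variables by hand. It assumes base R's model.matrix() is used to expand the formula into its numeric columns. The data frame dat, its toy values, the factor levels, the 0.9 cutoff, and the names X, cors, high, unwanted, and X_reduced are all hypothetical stand-ins rather than anything from the real data, and the model is cut down to four variables to keep it small. Pairwise correlations will also only flag columns that are nearly colinear in pairs, not larger colinear groups.

```r
# Hypothetical toy data standing in for the real data set (values are made up).
set.seed(1)
n <- 200
dat <- data.frame(
  B   = factor(sample(c("e.coli", "kleb", "staph.aureus"), n, replace = TRUE)),
  S   = factor(sample(c("i", "r"), n, replace = TRUE)),
  U   = rnorm(n),
  ICU = rnorm(n)
)
dat$mic <- 1 + 0.5 * dat$U + rnorm(n)

# Expand the formula into its numeric design matrix. The column names are the
# same expanded names that lm() reports, e.g. "Bstaph.aureus:Sr:U:ICU".
X <- model.matrix(~ B * S * U * ICU, data = dat)

# Pairwise correlations between design columns. The intercept column is
# constant, so it is left out of the correlation matrix.
cors <- cor(X[, colnames(X) != "(Intercept)"])

# List each pair once (upper triangle) whose absolute correlation exceeds an
# arbitrary cutoff; 0.9 is an assumption, not a recommendation.
high <- which(abs(cors) > 0.9 & upper.tri(cors), arr.ind = TRUE)
data.frame(a = rownames(cors)[high[, 1]],
           b = colnames(cors)[high[, 2]],
           r = cors[high])

# Drop one expanded column by name and refit on what is left.
unwanted  <- "Bstaph.aureus:Sr:U:ICU"
X_reduced <- X[, colnames(X) != unwanted, drop = FALSE]

# X_reduced still contains the intercept column, so "- 1" stops lm() from
# adding a second one.
fit <- lm(dat$mic ~ X_reduced - 1)
summary(fit)
```

Fitting on the matrix means the coefficient names in summary(fit) carry an "X_reduced" prefix, and dropping a single column this way gives a model that no longer respects the usual hierarchy of main effects and interactions.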

Any suggestions? :-D

-petertgaffney



R-help@stat.math.ethz.ch mailing list
Received on Thu Jul 01 09:38:51 2004
