[R] Automated Fixed Order Stepwise Regression Function

From: Tyler Rinker <tyler_rinker_at_hotmail.com>
Date: Thu, 07 Apr 2011 09:03:56 -0400

Greetings,  

I am interested in creating a stepwise fixed order regression function. There's a function for this already called add1( ). The F statistics are calculated using type 2 anova (the SS and the F changes don't match SPSS's). You can see my use of this at the very end of the email.  

What I want: a function to make an anova table with f changes and delt R^2.  

I ran into 10 snags to making this a fully automated function using the full linear model (order matters here). Each snag is marked with a Comment #. Some snags are repeated because I couldn't do the first time and certainly couldn't do it after that. Help with any/all snags would be very appreciated.  

I'm a 2 1/2 month [R] user who's reading everything online (incl. manuals) & ordering every book I can (looking at Dalgaard, Crawly's and Teetor's very helpful books right now). Loops and their usage is a foreign thing to me, despite studying them, and unfortunately I think that my function may call for them. I also realize that beyond 10 predictors may make this function way to bulky.  

I'm running the latest version of R (2.12.2)on a windows 7 machine.

DATASET   mtcars
full.model<-lm(mpg~cyl+disp+hp+drat, data=mtcars)  

CODE stepFO<-function(model)
{
m<-data.frame(model.frame(model))
num.of.var<-length(colnames(m))

mod1<-lm(m[,1]~m[,2])
mod2<-lm(m[,1]~m[,2]+m[,3])
mod3<-lm(m[,1]~m[,2]+m[,3]+m[,4])
mod4<-lm(m[,1]~m[,2]+m[,3]+m[,4]+m[,5])

#Comment 1--I don't know how to automated this process(above) of adding
#...additional variables. Probably a loop is needed but I don't understand
#...how to apply it here. Maybe update.model [1:num.ofvar]?
a1<-anova(mod1)
a2<-anova(mod2)
a3<-anova(mod3)
a4<-anova(mod4)

#Comment 2--SAME AS COMMENT 1 except applied to the anova tables. How do I make
#...[R] add a5, a6...an as necessary?

rb<-rbind(a1,a2,a3,a4)
#Comment 3--again I can't automate this to make the addition of a's automated

anova1<-rbind(rb[1,],rb[4,],rb[8,],rb[13,],rb[14,])
#Comment 4--the rb's follow a pattern of 1+3+4+5...+n variables
#then I row bind these starting with 1 and rowbind one more after the last
#...rb to include the bottom piece of the anova table that tells
#...about residuals (how do I aoutomate this?)

anova<-anova1[,1:num.of.var]
anova.table<-data.frame(anova)
#Comment 5--Something that bugs me here is that I have to turn it into a dataframe to
#...add the totals row and the delta R^2 (tried playing w/ tkinsert to no avail)
#...I miss the stuff that's at the bottom of the anova table (sig values)
#Comment 6--I'd love to turn the place value to round to 2 after the decimal.
#...I've worked this many ways including changing my options but this does
#...not seem to apply to a data frame

Total<-c(sum(anova[,1]),sum(anova[,2])," ", " ", " ") anova.table<-rbind(anova.table,Total)

R1<-summary(mod1)[[8]][[1]]
R2<-summary(mod2)[[8]][[1]]
R3<-summary(mod3)[[8]][[1]]
R4<-summary(mod4)[[8]][[1]]

#Comment 7--SAME AS COMMENT 2. How do I make
#...[R] add R5, R6...Rn as necessary?
deltaR.1<-R1
deltaR.2<-R2-R1
deltaR.3<-R3-R2
deltaR.4<-R4-R3

#Comment 8--SAME AS COMMENT 7. How would I aoutomate this process?

Delta.R.Squared<-c(deltaR.1,deltaR.2,deltaR.3,deltaR.4," ","")
#Comment 9--I need a way to add as many deltaR's as
#...necessary(n of R = n of predictors)

anova.table<-cbind(anova.table, Delta.R.Squared) colnames(anova.table)<-c("df","Sum Sq","Mean Sq","F-change", "P-value","Delta.R.Squared")
rownames(anova.table)<-c("X1", "X2 elminating for X1",
"X3 eliminating for X2 & X3", "X4 eliminating for X1,X2, & X3","Residuals",  "Total")
anova.table
}
#Comment 10--Again I would need [R] to automate the list for row names as we
#...add more predictors.
#See the final product below I'm after (with two places after the decimal)
> anova.table

                                df           Sum Sq          Mean Sq         F-change            P-value     Delta.R.Squared
X1                               1 817.712952354614 817.712952354614 79.5610275293349 6.112687142581e-10   0.726180005093805
X2 elminating for X1             1 37.5939529324851 37.5939529324851  4.0268283172755 0.0541857201845285  0.0333857704630917
X3 eliminating for X2 & X3       1 9.37092926438942 9.37092926438942 1.00388976918267  0.324951851250774 0.00832196853596723
X4 eliminating for X1,Xx2, & X3  1 16.4674349222492 16.4674349222492 1.81550535203668  0.189048514740917  0.0146241073243205
Residuals                       27 244.901918026262 9.07044140838007             <NA>               <NA>                    
Total                           31     1126.0471875  
 
 

USING THE ADD1() FUNCTION> NOT WHAT I WANT>   mod1<-lm(mpg~cyl, data=mtcars)
add1(mod1,~cyl+disp+hp+drat, data=mtcars, test="F")

Model:
mpg ~ cyl

       Df Sum of Sq    RSS    AIC F value   Pr(F)  
<none>              308.33 76.494                  
disp    1    37.594 270.74 74.334  4.0268 0.05419 .
hp      1    16.360 291.98 76.750  1.6249 0.21253  
drat 1 15.841 292.49 76.807 1.5706 0.22012
---
Signif. codes:  0 *** 0.001 ** 0.01 * 0.05 . 0.1   1 


 		 	   		  
	[[alternative HTML version deleted]]


______________________________________________ R-help_at_r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

Received on Thu 07 Apr 2011 - 13:06:40 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 07 Apr 2011 - 21:30:28 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.

list of date sections of archive