From: Maura E Monville <maura.monville_at_gmail.com>

Date: Tue, 18 Dec 2007 18:23:37 -0600

colnames(cosmat) <- coscol

colnames(sinmat) <- sincol

stepmod <- step(FTmod, direction="both") summary(stepmod)

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

rs <- summary(Regmod)

which.max(rs$adjr)

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Wed 19 Dec 2007 - 00:28:32 GMT

Date: Tue, 18 Dec 2007 18:23:37 -0600

Thank you very much for the example. I think interactively I could get
something.

But my obstacle is to write an R script that processes my set of data
automatically.

My difficulty is to extract the information that appears on the screen, when
R is operated interactively, from a scripts.

Let me go over some steps to make sure I am doing things right. Assume my data have been read into the matrix xx. Such a matrix contains a number of curve samples. I call each curve a cycle. I am processing one cycle at a time and using regression analysis to find the frequencies in the single cycle. Since I have many, I have to come up with an automatic way to do that. The main phases are:

- run a regression analysis including all possible frequencies
- use "step" to cut off the non significant frequencies
- input remaining frequencies from phase (2) to "regsubsets"

In the following I have cut and pasted some code and the information I get
from phase (2) and phase (3). To start with I would like to make sure that I
got the output from (3) right. The ouput of (3) tells me that the highest
R^2 value was reached after 8 iterations and there are only 8 significant
predictors in this model ??? In addition the only significant frequencies

(predictors) left are:

cos1, cos2, cos4, cos7, sin1,sin2, sin3, sin5
I got this information interactively. But I'm in troubles at extracting it
automatically.

Any suggestion ?

Question: Do I have to run "step" in advance of "regsubsets" for a first-pass model pruning or may I run "regsubsets" on the original model bringing in all possible frequencies (88 in all as sums of sinuoids) ?

Thank you so much.

Kind regards,

Maura

T <- xx$timestamp[end] - xx$timestamp[start]
nsamples <- end +1 - start

nfr <- ceiling(nsamples/2)

yy <- xx[start:end,"amplitude"]

tt <- xx[start:end,"timestamp"]

cosmat <- matrix(nrow=nsamples,ncol=nfr) coscol <- NULL sinmat <- matrix(nrow=nsamples,ncol=nfr) sincol <- NULL for(i in 1:nfr){ cosmat[,i] <- cos(tt*2*pi*i/T) coscol <- c(coscol,paste("cos",i,sep="")) sinmat[,i] <- sin(tt*2*pi*i/T) sincol <- c(sincol,paste("sin",i,sep=""))}

colnames(cosmat) <- coscol

colnames(sinmat) <- sincol

xnam1 <- NULL xnam1 <- paste(sep="","cosmat[,",1:nfr,"]",collapse="+") xnam2 <- NULL xnam2 <- paste(sep="","sinmat[,",1:nfr,"]",collapse="+")fmla <- as.formula(paste("yy ~ ", paste(xnam1,"+",xnam2,sep=""))) FTmod <- lm(fmla)

stepmod <- step(FTmod, direction="both") summary(stepmod)

*Coefficients:

Estimate Std. Error t value Pr(>|t|)

(Intercept) 0.237808 0.008728 27.248 < 2e-16 ***

cosmat[, 1] -1.011932 0.012390 -81.675 < 2e-16 *** cosmat[, 2] -0.417265 0.012329 -33.844 < 2e-16 *** cosmat[, 3] 0.020143 0.012318 1.635 0.106604 cosmat[, 4] 0.081425 0.012397 6.568 8.43e-09 *** cosmat[, 6] -0.042804 0.012388 -3.455 0.000951 *** cosmat[, 7] -0.044927 0.012340 -3.641 0.000526 *** cosmat[, 9] 0.039322 0.012411 3.168 0.002298 ** cosmat[, 11] -0.020710 0.012375 -1.673 0.098831 . cosmat[, 14] 0.016393 0.012390 1.323 0.190245 cosmat[, 17] -0.016482 0.012374 -1.332 0.187319 cosmat[, 19] 0.016537 0.012387 1.335 0.186306 sinmat[, 1] -0.460705 0.012296 -37.469 < 2e-16 *** sinmat[, 2] -0.289316 0.012356 -23.416 < 2e-16 *** sinmat[, 3] -0.049610 0.012367 -4.011 0.000153 *** sinmat[, 4] 0.023937 0.012289 1.948 0.055551 . sinmat[, 5] 0.045542 0.012406 3.671 0.000476 *** sinmat[, 6] -0.017782 0.012298 -1.446 0.152790 sinmat[, 12] -0.016093 0.012325 -1.306 0.196038 ---

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.08135 on 68 degrees of freedom Multiple R-Squared: 0.9932, Adjusted R-squared: 0.9914 F-statistic: 549.8 on 18 and 68 DF, p-value: < 2.2e-16*

newmat <- cosmat[,-c(5,8,10,12,13,15,16,18,20:44)] newmat <- cbind(newmat,sinmat[,-c(7:11,13:44)]) Regmod <- regsubsets(newmat, yy)

rs <- summary(Regmod)

which.max(rs$adjr)

*[1] 8

*rs$which[which.max(rs$adjr), ]

*(Intercept) cos1 cos2 cos3 cos4 cos6 TRUE TRUE TRUE FALSE TRUE FALSE cos7 cos9 cos11 cos14 cos17 cos19 TRUE FALSE FALSE FALSE FALSE FALSE sin1 sin2 sin3 sin4 sin5 sin6 TRUE TRUE TRUE FALSE TRUE FALSE sin12 FALSE* [[alternative HTML version deleted]] ______________________________________________R-help_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Wed 19 Dec 2007 - 00:28:32 GMT

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.2.0, at Wed 19 Dec 2007 - 00:30:19 GMT.

*
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help.
Please read the posting
guide before posting to the list.
*