[R] two questions about regression models and clustering routines

From: Maura E Monville <maura.monville_at_gmail.com>
Date: Thu, 05 Jun 2008 14:46:01 -0500


 I managed to adapt an example of clever regression routines (see attachment) to suit my needs. The initial model I try to fit consists of the first 10 powers of time (the time at which the observation was recorded) and the first 10 powers of the phase. My files record patients' breathing signals as a sequence of breathing cycles, and the sampled phase of every cycle (inhale - exhale) is mapped to an angle in the range [0, 2*pi]. I have two questions:

  1. Surprisingly (for me), for some files the summary of the regular lm command shows a number of non-significant coefficients (those whose "Pr(>|t|)" value is > 0.05). But after running the step command on the model output from lm, I see that all 20 coefficients have become significant. This astonishes me, because I have always thought that step would prune the model by stripping it of the non-significant coefficients. So I am thinking of submitting the model output from "step" to the Cp test anyway. As it is implemented right now, the Cp stage is run only if the model output from "step" still has some non-significant coefficients. Your thoughts?
  2. The regression model coefficients, stored in the first 20 columns of matrix rg, are used to calculate a distance matrix that is then input to clustering routines. I am writing a more sophisticated clustering algorithm that uses PAM. The 21st column of matrix rg stores the file ID, which is obviously not used in the distance evaluation. I would like to attach the file IDs as labels that are visible in each cluster. The description of the dist command mentions some "Labels", but it does not explain clearly how the observation labels can be saved in the distance matrix and then displayed in the cluster plot. Can you please help me with that?
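
Both questions can be sketched in a few lines of R on simulated data. This is only an illustration under stated assumptions, not the code from the attachment: the data frame dat, the cubic (rather than tenth-degree) polynomial, the made-up file IDs 101 to 112, the choice of k = 3, and the use of the cluster package for pam() are all hypothetical. It relies on two documented behaviours: step() compares models by AIC by default, so it need not drop a term just because its p-value exceeds 0.05, and dist() stores the row names of its input in the "Labels" attribute.

```r
# Illustrative sketch only: simulated data, hypothetical names, reduced degree.
set.seed(1)

## Item 1: step() ranks models by AIC, not by the Pr(>|t|) column
n      <- 200
time   <- seq(0, 1, length.out = n)
phase  <- seq(0, 8 * pi, length.out = n) %% (2 * pi)   # angle in [0, 2*pi)
signal <- sin(phase) + 0.3 * time + rnorm(n, sd = 0.1)
dat    <- data.frame(signal, time, phase)

full <- lm(signal ~ time + I(time^2) + I(time^3) +
                    phase + I(phase^2) + I(phase^3), data = dat)
reduced <- step(full, trace = 0)   # backward search, default penalty k = 2 (AIC)

# p-values are recomputed for whichever terms survive the search
print(summary(full)$coefficients[, "Pr(>|t|)"])
print(summary(reduced)$coefficients[, "Pr(>|t|)"])
print(c(full = AIC(full), reduced = AIC(reduced)))

## Item 2: carry the file IDs into the distance matrix as labels
n_files <- 12
# Made-up stand-in for rg: 20 coefficient columns, file ID in column 21
rg <- cbind(matrix(rnorm(n_files * 20), nrow = n_files), 101:112)

coefs <- rg[, 1:20]
rownames(coefs) <- rg[, 21]        # dist() takes its labels from the row names
d <- dist(coefs)
print(attr(d, "Labels"))           # the file IDs, stored as character strings

fit <- cluster::pam(d, k = 3)      # k = 3 is an arbitrary choice for the sketch
print(split(attr(d, "Labels"), fit$clustering))   # file IDs grouped by cluster

hc <- hclust(d)
plot(hc)                           # dendrogram leaves are labelled with the file IDs
```

Comparing the two p-value tables shows that the coefficients of the retained terms are re-estimated after the search, which is one way previously non-significant terms can become significant. In the second part the ID column never enters the distance calculation; it only supplies the row names.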

Thank you in advance
Best regards,

--
Maura E.M

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Thu 05 Jun 2008 - 21:39:14 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 05 Jun 2008 - 22:30:41 GMT.
