From: Uwe Ligges <ligges_at_statistik.uni-dortmund.de>

Date: Wed 06 Jul 2005 - 04:42:37 EST

michael watson (IAH-C) wrote:

> Dear All
>
> This is more of a statistics question than a question about help for R,
> so forgive me.
>
> I am using lda from the MASS package to perform linear discriminant
> function analysis. I have 14 cases belonging to two groups and have
> measured each of 37 variables. I want to find those variables that best
> discriminate between the two groups, and I want to visualise that and
> create a classification function. Please note at this stage it is a
> proof of concept problem - I realise that I must follow this up with a
> much more robust analysis involving cross-validation.
>
> 1) First problem, I got this error message:
>
> > z <- lda(C0GRP_NA ~ ., dpi30)
> Warning message:
> variables are collinear in: lda.default(x, grouping, ...)
>
> I guess this is not a good thing, however, I *did* get a result and it
> discriminated perfectly between my groups. Can anyone explain what this
> means? Does it invalidate my results?

Well, 14 cases and 37 variables mean that not that many degrees of
freedom are left.... ;-)

Of course, you get a perfect fit - with arbitrary data.
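
To see why a perfect fit says little here, one can run lda on pure noise with the same shape as the data in the question (14 cases, 37 variables). This is only a sketch on simulated data: `x`, `grp` and the seed are made up for illustration, and the hand-written leave-one-out loop is just one simple stand-in for the cross-validation mentioned in the question.

```r
library(MASS)

set.seed(1)                                    # arbitrary seed, illustration only
n <- 14; p <- 37
x <- matrix(rnorm(n * p), n, p)                # pure noise, no group signal
colnames(x) <- paste0("V", seq_len(p))
grp <- factor(rep(c("A", "B"), each = n / 2))  # arbitrary group labels

# Collect warnings from lda instead of printing them
msgs <- character(0)
fit <- withCallingHandlers(
  lda(x, grp),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

# Resubstitution accuracy: predict the same cases that were used for fitting
train_acc <- mean(predict(fit, x)$class == grp)

# Leave-one-out accuracy: refit without case i, then predict case i
loo_class <- character(n)
for (i in seq_len(n)) {
  fit_i <- suppressWarnings(lda(x[-i, , drop = FALSE], grp[-i]))
  loo_class[i] <- as.character(predict(fit_i, x[i, , drop = FALSE])$class)
}
loo_acc <- mean(loo_class == grp)

print(msgs)
cat("resubstitution accuracy:", train_acc, "\n")
cat("leave-one-out accuracy: ", loo_acc, "\n")
```

If the resubstitution accuracy comes out much higher than the leave-one-out figure, that gap is the overfitting being pointed at above: the labels here carry no information at all.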

> 2) My analysis came up with one discriminant variable. How do I control
> how many are produced? I currently assume this is the only significant
> discriminant variable found. Can I insist it finds more?

Well, if projection into one dimension is already perfect, it's hard to find a second one that improves the result...
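
There is also a structural limit that applies regardless of how good the separation is: with g groups and p variables, lda returns at most min(p, g - 1) discriminant functions, so a two-group problem yields exactly one. A small sketch using the built-in iris data, chosen only for illustration (it is not the data from this thread):

```r
library(MASS)

# Three groups: at most 3 - 1 = 2 discriminant functions
fit3  <- lda(Species ~ ., data = iris)
n_ld3 <- ncol(fit3$scaling)

# Two groups: at most 2 - 1 = 1 discriminant function
iris2 <- droplevels(subset(iris, Species != "setosa"))
fit2  <- lda(Species ~ ., data = iris2)
n_ld2 <- ncol(fit2$scaling)

print(c(three_groups = n_ld3, two_groups = n_ld2))
```

The columns of the `scaling` component are the discriminant coefficient vectors, so counting them gives the number of discriminants that were produced.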

> 3) More of a tip - when my analysis only finds one significant variable,
> what is a good way to visualise this graphically?

Depends on the amount of data: either all data on one line, maybe jittered, or maybe even better two boxplots, given there would be really perfect (and sensible) separation ....
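
Both suggestions can be sketched with base graphics on the scores of the single discriminant. The iris subset below is an assumption made purely to have two groups to plot; with the data from the question, the fitted object and the grouping variable would take its place.

```r
library(MASS)

# Two-group example data (illustration only)
iris2 <- droplevels(subset(iris, Species != "setosa"))
fit   <- lda(Species ~ ., data = iris2)

# Scores on the single discriminant axis, one value per case
ld1 <- predict(fit, iris2)$x[, 1]

op <- par(mfrow = c(1, 2))
# All cases on one line per group, jittered to reduce overplotting
stripchart(ld1 ~ iris2$Species, method = "jitter", pch = 1,
           xlab = "LD1 score", main = "Jittered strip chart")
# One boxplot per group on the same scale
boxplot(ld1 ~ iris2$Species, horizontal = TRUE,
        xlab = "LD1 score", ylab = "", main = "Boxplots")
par(op)
```

With only 14 cases, the strip chart is probably the more honest display, since every observation stays visible.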

> 4) Can I work out from the coefficients which sub groups of my variable
> are better at discriminating than others? I guess I could simply
> perform a t-test first to select the best variables...?

No, because you ignore possible projections in this case.
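
A constructed example may make the point about projections concrete. In the sketch below (simulated data; all names, effect sizes and the seed are invented for illustration), neither variable looks impressive on its own, because a large shared component `u` swamps the small group shift, yet the contrast between the two variables carries the group information.

```r
library(MASS)

set.seed(2)                          # arbitrary seed, illustration only
n_per <- 20
grp <- factor(rep(c("A", "B"), each = n_per))
g   <- as.numeric(grp) - 1           # 0 for group A, 1 for group B

u  <- rnorm(2 * n_per)               # large shared variation, unrelated to group
x2 <- u                              # no group difference on its own
x1 <- u + 0.5 * g + rnorm(2 * n_per, sd = 0.05)  # small shift, masked by u
d  <- data.frame(grp, x1, x2)

# Marginal view: one t-test per variable
p_x1 <- t.test(x1 ~ grp, data = d)$p.value
p_x2 <- t.test(x2 ~ grp, data = d)$p.value

# Joint view: lda can use the contrast between x1 and x2
fit_joint <- lda(grp ~ x1 + x2, data = d)
fit_x1    <- lda(grp ~ x1, data = d)
acc_joint <- mean(predict(fit_joint, d)$class == d$grp)
acc_x1    <- mean(predict(fit_x1, d)$class == d$grp)

print(c(p_x1 = p_x1, p_x2 = p_x2))
print(c(acc_x1 = acc_x1, acc_joint = acc_joint))
```

Screening by t-test would judge `x2` useless, although it is exactly what lets `x1` discriminate once the two are combined.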

> 5) How do I turn my discriminant function into a classification
> function? i.e. when I plot the scores for the groups I can see
> graphically that all the values for one group are below 0.1 and all the
> values for the other group are above 1. But how do I turn my
> discriminant function into a classification function?

What about looking for the point where it has the value 0.5 for the posterior?
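
For two groups this cut point can be written down directly on the discriminant score. The sketch below assumes the scores come from `predict()` on a MASS lda fit, where the within-group variance along LD1 is scaled to 1; under that assumption the posterior is 0.5 at the midpoint of the two group mean scores, shifted by a term that depends on the priors. The iris subset and the names `cutoff`, `by_cut` and `by_post` are illustrative only.

```r
library(MASS)

# Two-group example data (illustration only)
iris2 <- droplevels(subset(iris, Species != "setosa"))
fit   <- lda(Species ~ ., data = iris2)
pred  <- predict(fit, iris2)

score <- pred$x[, 1]                                  # LD1 score per case
d  <- as.numeric(tapply(score, iris2$Species, mean))  # mean score of each group
pr <- unname(fit$prior)                               # priors used by lda

# Score at which both posterior probabilities equal 0.5
cutoff <- (d[1] + d[2]) / 2 + log(pr[2] / pr[1]) / (d[1] - d[2])

# Assign the second group on the side of the cutoff where its mean lies
lev    <- levels(iris2$Species)
by_cut <- ifelse((score - cutoff) * (d[2] - d[1]) > 0, lev[2], lev[1])

# The same rule expressed through the posterior
by_post <- ifelse(pred$posterior[, lev[2]] > 0.5, lev[2], lev[1])

print(cutoff)
print(table(by_cut, predicted = pred$class))
```

In practice `predict(fit, newdata)$class` already applies this rule, so the explicit cutoff is mainly useful for drawing a line on a plot of the scores.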

Uwe Ligges

> Many thanks in advance for your help
>
> Mick

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Wed Jul 06 04:52:44 2005
