Re: [R] logistic discrimination: which chance performance??

From: Frank E Harrell Jr <f.harrell_at_vanderbilt.edu>
Date: Sat 12 Aug 2006 - 01:33:36 EST

Bruno L. Giordano wrote:

> Well,
> If posting a possible solution to one's own problem is not part of the
> netiquette of this list please correct me.
> 
> Following Titus et al. (1984) one might use Cohen's kappa to have a
> chance-corrected measure of agreement between the original and reproduced
> classification:
> 
> Kappa() in package vcd
> kappa2() in package irr
> ckappa() in package psy
> cohen.kappa() in package concord ...
> 
>     Bruno
> 
> Titus, K., Mosher, J. A., and Williams, B. K. (1984). Chance-corrected
> classification for use in discriminant analysis: ecological applications.
> American Midland Naturalist, 111(1), 1-7.
> 
> 
> ----- Original Message ----- 
> From: "Bruno L. Giordano" <bruno.giordano@music.mcgill.ca>
> To: <r-help@stat.math.ethz.ch>
> Sent: Thursday, August 10, 2006 6:18 PM
> Subject: [R] logistic discrimination: which chance performance??
> 
> 
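
For readers following the kappa suggestion above: Cohen's kappa compares the observed agreement p_obs with the agreement p_exp expected from the marginal class frequencies alone, kappa = (p_obs - p_exp) / (1 - p_exp). The following is a minimal base-R sketch of that formula, not a replacement for the packaged functions listed above. The function name cohen_kappa and the toy vectors y_obs and y_pred are made up for illustration.

```r
# Hypothetical helper: Cohen's kappa from two classifications of the same objects.
cohen_kappa <- function(obs, pred) {
  # Use a common set of levels so the confusion table is square.
  lev <- union(levels(factor(obs)), levels(factor(pred)))
  tab <- table(factor(obs, levels = lev), factor(pred, levels = lev))
  n <- sum(tab)
  p_obs <- sum(diag(tab)) / n                       # observed agreement
  p_exp <- sum(rowSums(tab) * colSums(tab)) / n^2   # agreement expected by chance
  (p_obs - p_exp) / (1 - p_exp)
}

# Toy data (illustrative only): 6 of 8 objects are classified the same way.
y_obs  <- c("a", "a", "a", "a", "b", "b", "b", "b")
y_pred <- c("a", "a", "a", "b", "b", "b", "b", "a")
k <- cohen_kappa(y_obs, y_pred)
print(k)  # 0.5: p_obs = 0.75, p_exp = 0.5
```

Kappa is 0 when agreement equals what the marginals alone would produce and 1 for perfect agreement.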

>> Hello,
>> I am using logistic discriminant analysis to check whether a known
>> classification Yobs can be predicted by a few continuous variables X.
>>
>> What I do is predict class probabilities with multinom() in the nnet
>> package, obtain a predicted classification Ypred, and then compute the
>> percentage P(obs) of objects classified the same in Yobs and Ypred.
>>
>> My problem now is to figure out whether P(obs) is significantly higher
>> than chance.

The most powerful approach, and one that is automatically corrected for chance, is to use the likelihood ratio test of the global null hypothesis for the whole model, i.e., that none of the predictors X is associated with class membership.

With classification proportions you not only lose power and have trouble correcting for chance, but you also have arbitrariness in what constitutes a positive prediction, because the probability cutoff used to assign a predicted class is arbitrary.

Frank Harrell
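
A minimal sketch of such a global likelihood ratio test with multinom() follows. It is an illustration under stated assumptions, not the code used in this thread: the built-in iris data stand in for Yobs and X, and the object names (fit_full, fit_null, lr_stat, lr_df, p_value) are made up. The test compares the fitted model with an intercept-only model and refers the drop in deviance to a chi-square distribution.

```r
library(nnet)  # provides multinom(); a recommended package shipped with R

# Illustrative data only: iris$Species plays the role of Yobs,
# two sepal measurements play the role of the continuous predictors X.
dat <- iris

# Full model: class membership predicted from the continuous variables.
fit_full <- multinom(Species ~ Sepal.Length + Sepal.Width,
                     data = dat, trace = FALSE)

# Global null model: intercepts only, i.e. no predictor matters.
fit_null <- multinom(Species ~ 1, data = dat, trace = FALSE)

# Likelihood ratio statistic = drop in deviance (deviance = -2 * log-likelihood).
lr_stat <- deviance(fit_null) - deviance(fit_full)

# Degrees of freedom = number of extra parameters in the full model.
lr_df <- fit_full$edf - fit_null$edf

# Chi-square reference distribution for the global test.
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

print(c(LR = lr_stat, df = lr_df, p = p_value))
```

No cutoff on the predicted probabilities is needed here, which is the point made above. One caveat worth labelling as an assumption of this sketch: the chi-square approximation may be unreliable when classes are perfectly or nearly perfectly separated, as in the P(obs)=1 case described in the original question.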

>>
>> I opted for a crude permutation approach: compute P(perm) over 10000
>> random permutations of Yobs (i.e., refit the multinom() model 10000
>> times, randomly permuting Yobs each time) and consider P(obs)
>> significantly higher than chance if it exceeds the 95th percentile of
>> the P(perm) distribution.
>>
>> Now, the problem is that the mode of P(perm) is always really close to
>> P(obs), e.g., if P(obs)=1 (perfect discrimination) also the most likely
>> P(perm) value is 1!!!
>>
>> I figured out that this is due to the fact that, with my data, randomly
>> permuted classifications are highly likely to strongly agree with the
>> observed classification Yobs, but, probably since my machine learning
>> background is almost 0, I am kind of lost about how to proceed at this
>> point.
>>
>> I would greatly appreciate a comment on this.
>>
>> Thanks
>> Bruno
>>
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> Bruno L. Giordano, Ph.D.
>> CIRMMT
>> Schulich School of Music, McGill University
>> 555 Sherbrooke Street West
>> Montréal, QC H3A 1E3
>> Canada
>> http://www.music.mcgill.ca/~bruno/
>>
>> ______________________________________________
>> R-help@stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>



-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University

Received on Sat Aug 12 01:40:08 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sat 12 Aug 2006 - 02:20:41 EST.
