Re: [R] Calculate Specificity and Sensitivity for a given threshold value

From: Frank E Harrell Jr <f.harrell_at_vanderbilt.edu>
Date: Thu, 13 Nov 2008 13:17:47 -0600

Pierre-Jean-EXT.Breton_at_sanofi-aventis.com wrote:
> Hi Frank,
>
> Thank you for your answer.
> In fact, I don't use this for clinical research practice.
> I am currently testing several scoring methods and I'd like
> to know which one is the most effective and which threshold
> value I should apply to discriminate positives and negatives.
> So, do you have any ideas for my problem?

Using thresholds gets in the way of finding a good solution, because some predictor values will always fall in the "gray zone". I tend to rank methods by the most sensitive index available, such as the log likelihood in the binary logistic model. You can extend ordinary logistic models to allow for nonlinear effects on the log odds scale using regression splines.
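
A minimal sketch of that ranking, under stated assumptions: the data are simulated purely for illustration (nothing below comes from the thread), `score_a` and `score_b` are hypothetical scoring methods, and base R's `glm()` with `splines::ns()` stands in for whichever spline-capable logistic fitter you prefer. Both fits use the same degrees of freedom, so their log likelihoods can be compared directly.

```r
library(splines)  # ships with R; provides ns() for natural cubic splines

# Simulated data for illustration only (assumption, not from the thread)
set.seed(1)
n <- 500
latent <- rnorm(n)
y <- rbinom(n, 1, plogis(latent))        # binary outcome
score_a <- latent + rnorm(n, sd = 0.5)   # hypothetical method A: little noise
score_b <- latent + rnorm(n, sd = 2)     # hypothetical method B: more noise

# Binary logistic models; the spline lets each score act
# nonlinearly on the log odds scale
fit_a <- glm(y ~ ns(score_a, df = 3), family = binomial)
fit_b <- glm(y ~ ns(score_b, df = 3), family = binomial)

# Higher (less negative) log likelihood means a more informative score
ll <- c(A = as.numeric(logLik(fit_a)), B = as.numeric(logLik(fit_b)))
print(ll)
best <- names(which.max(ll))
```

If the candidate models differ in the number of parameters, a penalized index such as `AIC()` may be a fairer basis for comparison than the raw log likelihood.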

Frank

>
> Pierre-Jean
>
> -----Original Message-----
> From: Frank E Harrell Jr [mailto:f.harrell_at_vanderbilt.edu]
> Sent: Thursday, November 13, 2008 5:00 PM
> To: Breton, Pierre-Jean-EXT R&D/FR
> Cc: r-help_at_r-project.org
> Subject: Re: [R] Calculate Specificity and Sensitivity for a given
> threshold value
>
> Kaliss wrote:
>> Hi list,
>>
>> I'm new to R and I'm currently using the ROCR package.
>> My input data look like this:
>>
>> DIAGNOSIS	SCORE
>> 1	0.387945
>> 1	0.50405
>> 1	0.435667
>> 1	0.358057
>> 1	0.583512
>> 1	0.387945
>> 1	0.531795
>> 1	0.527148
>> 0	0.526397
>> 0	0.372935
>> 1	0.861097
>>
>> And I run the following simple code:
>> library(ROCR)
>> d <- read.table("inputFile", header=TRUE)
>> pred <- prediction(d$SCORE, d$DIAGNOSIS)
>> perf <- performance(pred, "tpr", "fpr")
>> plot(perf)
>>
>> So building the curve works easily.
>> My question is: can I get the specificity and the sensitivity at a given
>> score threshold, for example 0.5? How do I compute this?
>>
>> Thank you in advance
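
For reference, the quantity asked about can be computed directly in base R. This is only a sketch under assumptions: it uses the eleven rows shown above (which appear to be an excerpt of the input file), and it assumes a case is called positive when `SCORE >= threshold`; the direction of that rule is not stated in the post.

```r
# The eleven rows quoted in the post, typed in by hand
d <- data.frame(
  DIAGNOSIS = c(1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1),
  SCORE = c(0.387945, 0.50405, 0.435667, 0.358057, 0.583512, 0.387945,
            0.531795, 0.527148, 0.526397, 0.372935, 0.861097)
)

threshold <- 0.5
predicted_pos <- d$SCORE >= threshold   # assumed rule: high score means positive
actual_pos <- d$DIAGNOSIS == 1

# Sensitivity: fraction of true positives among actual positives
sensitivity <- sum(predicted_pos & actual_pos) / sum(actual_pos)
# Specificity: fraction of true negatives among actual negatives
specificity <- sum(!predicted_pos & !actual_pos) / sum(!actual_pos)

print(c(sensitivity = sensitivity, specificity = specificity))
```

On these eleven rows, 5 of the 9 positives score at or above 0.5 and 1 of the 2 negatives scores below it, so the sketch gives a sensitivity of 5/9 and a specificity of 1/2. With only two negatives the specificity estimate is very unstable.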

>
> Beware of the utility/loss function you are implicitly assuming with
> this approach. It is quite oversimplified. In clinical practice the
> cost of a false positive or false negative (which comes from a cost
> function and the simple forward probability of a positive diagnosis,
> e.g., from a basic logistic regression model if you start with a cohort
> study) varies with the type of patient being diagnosed.
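
To make the implicit loss function concrete, here is a small sketch under assumptions: `cost_fp` and `cost_fn` are hypothetical costs of a false positive and a false negative, and `p` is the predicted probability of disease for one patient. Calling the patient positive has expected cost `(1 - p) * cost_fp` and calling the patient negative has expected cost `p * cost_fn`, so the positive call is cheaper when `p > cost_fp / (cost_fp + cost_fn)`.

```r
# Probability cutoff implied by a pair of misclassification costs
# (hypothetical helper, for illustration only)
optimal_cutoff <- function(cost_fp, cost_fn) {
  cost_fp / (cost_fp + cost_fn)
}

# TRUE when calling the patient positive has the lower expected cost
decide_positive <- function(p, cost_fp, cost_fn) {
  p > optimal_cutoff(cost_fp, cost_fn)
}

p <- 0.30  # hypothetical predicted probability for one patient

# Equal costs imply a cutoff of 0.5, so this patient is called negative
equal_costs <- decide_positive(p, cost_fp = 1, cost_fn = 1)

# A false negative four times as costly implies a cutoff of 0.2,
# so the same patient is now called positive
costly_miss <- decide_positive(p, cost_fp = 1, cost_fn = 4)

print(c(equal_costs = equal_costs, costly_miss = costly_miss))
```

The same predicted probability leads to different decisions once the costs change, which is why a single fixed score threshold bakes in one particular cost ratio for every patient.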
>
> Frank
>
-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Thu 13 Nov 2008 - 19:21:01 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 13 Nov 2008 - 19:31:23 GMT.
