Re: [R] cross validation and parameter determination

From: Ramon Diaz-Uriarte <rdiaz_at_cnio.es>
Date: Wed 20 Apr 2005 - 17:56:29 EST

On Wednesday 20 April 2005 00:17, array chip wrote:
> Hi all,
>
> In Tibshirani's PNAS paper about nearest shrunken
> centroid analysis of microarrays (PNAS vol 99:6567),
> they used cross validation to choose the amount of
> shrinkage used in the model, and then test the
> performance of the model with the cross-validated
> shrinkage in separate independent testing set. If I
> don't have the luxury of having independent testing
> set, can I just use the cross validation performance
> as the performance estimate? In other words, can I use
> the same single cross-validation to both choose the
> value of the parameter (amount of shrinkage in this
> case) and estimate the performance which was based on
> the value of the parameter chosen by the same
> cross-validation? I kind of feel awkward by getting
> both on a single cross validation, because it seems
> like I used the dataset in training set manner. Am I
> wrong/right?

That error rate is probably optimistic because, as you say, the same data were used both to choose the shrinkage and to estimate the error.

However, you can easily wrap the whole pam procedure, including the choice of shrinkage, within an outer loop of cross-validation or bootstrap, and then assess the error rate of that whole procedure. (This problem is not that different from, say, using knn and selecting k by cross-validation, or selecting the number of genes to use by cross-validation, etc.)
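
A minimal sketch of such an outer loop may help. It uses the knn analogy rather than pam itself, is written in plain Python rather than R, and every name in it (knn_predict, make_folds, fold_errors, select_k, nested_cv_error) is hypothetical and made up for illustration. The point is only that the inner cross-validation that picks k is rerun inside each outer training set, so the held-out fold never influences the choice of k.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training points nearest to x."""
    def sq_dist(i):
        return sum((a - b) ** 2 for a, b in zip(train_X[i], x))
    nearest = sorted(range(len(train_X)), key=sq_dist)[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def make_folds(n, n_folds, rng):
    """Shuffle indices 0..n-1 and deal them into n_folds disjoint folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[f::n_folds] for f in range(n_folds)]


def fold_errors(X, y, test_idx, choose_k):
    """Fit on everything outside test_idx, count mistakes on test_idx."""
    held_out = set(test_idx)
    train_X = [X[i] for i in range(len(X)) if i not in held_out]
    train_y = [y[i] for i in range(len(X)) if i not in held_out]
    k = choose_k(train_X, train_y)  # selection sees training data only
    return sum(knn_predict(train_X, train_y, X[i], k) != y[i] for i in test_idx)


def select_k(X, y, candidates, n_folds, rng):
    """Inner loop: return the candidate k with the lowest CV error."""
    folds = make_folds(len(X), n_folds, rng)  # same folds for every candidate
    def cv_errors(k):
        return sum(fold_errors(X, y, f, lambda _X, _y: k) for f in folds)
    return min(candidates, key=cv_errors)


def nested_cv_error(X, y, candidates=(1, 3, 5), outer_folds=5, inner_folds=5, seed=0):
    """Outer loop: error rate of the whole procedure, selection of k included."""
    rng = random.Random(seed)
    def choose(train_X, train_y):
        return select_k(train_X, train_y, candidates, inner_folds, rng)
    folds = make_folds(len(X), outer_folds, rng)
    return sum(fold_errors(X, y, f, choose) for f in folds) / len(X)
```

Calling nested_cv_error(X, y) with X as a list of numeric feature vectors and y as a list of class labels returns the fraction of held-out samples misclassified. The selected k may differ from one outer fold to the next; what is being assessed is the procedure "choose k by cross-validation, then fit", not any single value of k. The same wrapping would apply to the shrinkage threshold in pam.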

R.
>
> Thanks!
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html

-- 
Ramón Díaz-Uriarte
Bioinformatics Unit
Centro Nacional de Investigaciones Oncológicas (CNIO)
(Spanish National Cancer Center)
Melchor Fernández Almagro, 3
28029 Madrid (Spain)
Fax: +-34-91-224-6972
Phone: +-34-91-224-6900

http://ligarto.org/rdiaz
PGP KeyID: 0xE89B3462
(http://ligarto.org/rdiaz/0xE89B3462.asc)





Received on Wed Apr 20 18:32:55 2005
