Re: [R] pls -- crossval vs plsr(..., CV=TRUE)

From: Bjørn-Helge Mevik <bhx2_at_mevik.net>
Date: Thu 12 May 2005 - 23:34:52 EST

martin peters writes:

> $ library(pls)
> $ data(NIR)
>
> $ testing.plsNOCV <- plsr(y ~ X, 6, data = NIR, method="kernelpls",
> validation="none")
> $ NIR.plsCV <- plsr(y ~ X, 6, data = NIR, CV=TRUE, method="kernelpls")
> $ testing.plsCV <- crossval(testing.plsNOCV)
> $ R2(NIR.plsCV)
> (Intercept)  1 comps  2 comps  3 comps  4 comps  5 comps  6 comps
>      0.0000   0.9812   0.9825   0.9964   0.9997   0.9999   0.9999
> $ R2(testing.plsCV)
> (Intercept)  1 comps  2 comps  3 comps  4 comps  5 comps  6 comps
>      0.0000   0.9678   0.9782   0.9941   0.9991   0.9996   0.9997

[...]

> If the above result is correct can someone explain the difference to me.

There are two reasons:

  1. The call

       plsr(y ~ X, 6, data = NIR, CV=TRUE, method="kernelpls")

     is incorrect. The `CV' argument of the superseded `pls.pcr' package
     has been replaced by the `validation' argument, so the correct call
     would be

       NIR.plsCV <- plsr(y ~ X, 6, data = NIR, validation="CV",
                         method="kernelpls")

     (If you had done R2(testing.plsNOCV), you would have gotten exactly
     the same as with the R2(NIR.plsCV) above: since CV=TRUE has no
     effect here, both are R2 values for the fitted data, not
     cross-validated ones.)
  2. plsr(..., validation = "CV") and crossval(...) both by default use
     10-fold CV with _randomly selected_ segments, which means that each
     time you run the cross-validation, you will get slightly different
     results. (Try running R2(crossval(testing.plsNOCV)) a couple of
     times.)

     If you want the same segments in two separate calls, either add the
     argument segment.type = "consecutive" or "interleaved", or specify
     the segments explicitly with the `segments' argument (see
     ?crossval or ?mvrCv for how).

     The segments actually used in a cross-validation are stored in the
     $validation$segments component of the object,
     i.e. testing.plsCV$validation$segments.
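
Both points can be checked with a short session. The following is only a sketch under stated assumptions: it uses simulated data instead of the NIR data set, and the object names (d, fit, fit_cv, cvA, ...) are made up for the illustration.

```r
library(pls)

## Simulated stand-in data (an assumption; the thread uses the NIR data)
set.seed(1)
n <- 40; p <- 15
X <- matrix(rnorm(n * p), n, p)
y <- drop(X %*% rnorm(p)) + rnorm(n, sd = 0.1)
d <- data.frame(y = y, X = I(X))   # keep X as one matrix column

## Point 1: without a validation, R2() reports the training estimate;
## the same numbers are available from a cross-validated fit on request.
fit    <- plsr(y ~ X, ncomp = 6, data = d, validation = "none")
fit_cv <- plsr(y ~ X, ncomp = 6, data = d, validation = "CV")
same_train <- all.equal(R2(fit)$val, R2(fit_cv, estimate = "train")$val)

## Point 2: the default random segments are redrawn on every call,
## so these two results will generally differ slightly.
r1 <- R2(crossval(fit))$val
r2 <- R2(crossval(fit))$val

## Deterministic segments: repeated runs use the same segments
c1 <- crossval(fit, segment.type = "consecutive")
c2 <- crossval(fit, segment.type = "consecutive")
same_consecutive <- all.equal(R2(c1)$val, R2(c2)$val)

## Reuse the (random) segments stored in one object in a second call
cvA <- crossval(fit)
cvB <- crossval(fit, segments = cvA$validation$segments)
same_reused <- all.equal(R2(cvA)$val, R2(cvB)$val)
```

Calling set.seed() with the same value right before each cross-validation is another way to get repeatable random segments within one function, since the segments are drawn with R's random number generator.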

(By the way, `method = "kernelpls"' is not needed, as it is the default fit method for plsr (and mvr).)
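
As a quick check of that remark, the fit with and without the explicit method argument can be compared. This is a sketch on simulated data (an assumption, as are the object names), and it presumes the default algorithm has not been changed through pls.options().

```r
library(pls)

## Simulated stand-in data (an assumption, not the NIR data)
set.seed(1)
X <- matrix(rnorm(30 * 8), 30, 8)
y <- drop(X %*% rnorm(8)) + rnorm(30, sd = 0.1)
d <- data.frame(y = y, X = I(X))

## Same model, once with the default fit method, once spelled out
fit_default  <- plsr(y ~ X, ncomp = 3, data = d)
fit_explicit <- plsr(y ~ X, ncomp = 3, data = d, method = "kernelpls")

## The regression coefficients should agree
same_fit <- all.equal(coef(fit_default), coef(fit_explicit))
```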

-- 
Bjørn-Helge Mevik
