Re: [R] help: pls package

From: Bjørn-Helge Mevik <bhx2_at_mevik.net>
Date: Fri 22 Jul 2005 - 20:50:22 EST

wu sz writes:

> trainSet = as.data.frame(scale(trainSet, center = T, scale = T))
> trainSet.plsr = mvr(formula, ncomp = 14, data = trainSet, method = "kernelpls",
> CV = TRUE, validation = "LOO", model = TRUE, x = TRUE,
> y = TRUE)

[Two side notes here:

  1. Scaling of the data (with its sd) should be performed inside the cross-validation. In the current version of 'pls', one can use

       cvplsr <- crossval(plsr(y ~ scale(X), ncomp = 14, data = mydata), length.seg = 1)

     (However, 'crossval' is slower than the built-in cross-validation in 'mvr'/'plsr'. In the development version of the package, scaling within the cross-validation has been implemented in the built-in cross-validation. This will hopefully be published shortly.)
  2. The 'CV' argument is from the earlier 'pls.pcr' package, and is no longer used. It is silently ignored.]
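
To make the first side note concrete, here is a minimal base-R sketch of what "scaling inside the cross-validation" means: the centre and sd are computed from the training samples of each segment only, and then applied to the held-out sample. Everything in it is an assumption for illustration: the simulated data, the variable names, and the use of lm() as a stand-in for the PLSR fit. (lm() predictions do not actually depend on the scaling, so the sketch only shows the mechanics, not the effect on PLSR.)

```r
# Sketch: leave-one-out CV with scaling recomputed inside each segment.
# lm() is only a stand-in for the PLSR fit; all names are hypothetical.
set.seed(1)
n <- 20
X <- matrix(rnorm(n * 3), n, 3, dimnames = list(NULL, c("x1", "x2", "x3")))
y <- drop(X %*% c(1, -2, 0.5)) + rnorm(n, sd = 0.3)

loo_pred <- numeric(n)
for (i in seq_len(n)) {
  # Centre and scale using the training samples only
  Xtrain <- scale(X[-i, , drop = FALSE], center = TRUE, scale = TRUE)
  ctr <- attr(Xtrain, "scaled:center")
  sds <- attr(Xtrain, "scaled:scale")
  # Apply the training-set centre and sd to the held-out sample
  Xtest <- scale(X[i, , drop = FALSE], center = ctr, scale = sds)

  fit <- lm(y ~ ., data = data.frame(y = y[-i], Xtrain))
  loo_pred[i] <- predict(fit, newdata = data.frame(Xtest))
}

# Cross-validated MSEP based on the per-segment scaling
msep_loo <- mean((loo_pred - y)^2)
```

Scaling the whole data set once before the loop would let the held-out sample influence its own centre and sd, which is what the side note warns against.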

> i = 1; msep_element = c()
> while(i <= length(p)){
> msep_element[,i] = (p[i]-y)^2
> i = i + 1
> }

Hmm... I don't see how you got that code to run. This should work, though:

msep_element <- (p - y)^2
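
A small sketch of why the one-liner is enough, assuming p is a matrix of predictions with one row per sample and one column per number of components (the data and column names below are made up): R recycles y down each column of p, so every column is compared with y element by element.

```r
# Hypothetical data: 5 samples, predictions for 1 to 3 components
y <- c(1.0, 2.0, 3.0, 4.0, 5.0)
p <- cbind(comp1 = y + 0.5, comp2 = y - 0.2, comp3 = y + 0.1)

# y is recycled down each column of p (nrow(p) must equal length(y))
msep_element <- (p - y)^2

# One MSEP value per column, i.e. per number of components
msep <- colMeans(msep_element)
```

If p is a plain vector of the same length as y, the same expression gives a vector of squared residuals, and mean() replaces colMeans().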

> msep = colMeans(msep_element)
> msep_sd = sd(msep_element)

You will get much closer to the true value with

sd(msep_element) / sqrt(length(y))

However, this will not produce an unbiased estimate of the sd of the estimated MSEP, because it ignores the dependencies between the residuals. E.g., the residual when sample 1 is predicted is not independent of the residual when sample 2 is predicted. In general, I think, it will produce underestimated sds. The effect should be largest for small data sets.
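
The naive estimate discussed above can be sketched as follows. The data are simulated and the names are hypothetical; the sd is taken per column with apply(), which is an assumption about the intent of the original sd() call on a matrix of squared residuals.

```r
set.seed(2)
n <- 30
y <- rnorm(n)
# Hypothetical cross-validated predictions for 1 and 2 components
p <- cbind(comp1 = y + rnorm(n, sd = 0.5), comp2 = y + rnorm(n, sd = 0.3))

msep_element <- (p - y)^2
msep <- colMeans(msep_element)

# Naive standard error: per-column sd of the squared residuals over sqrt(n).
# This treats the residuals as independent, which they are not under
# cross-validation, so it is likely to be too small.
msep_se_naive <- apply(msep_element, 2, sd) / sqrt(n)
```

The division by sqrt(n) is what turns the spread of the individual squared residuals into a (naive) spread of their mean.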

This is the reason the pls package currently doesn't estimate the se of cross-validated MSEPs. There is also the question of what the estimate should be conditioned on: for leave-one-out cross-validation, sd(MSEP | trainData) = 0.
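
One way to read the last point, sketched under assumptions: given a fixed training set, leave-one-out has no randomness left (the segments are always the single samples), so repeating it gives exactly the same MSEP, whereas cross-validation with randomly drawn segments gives a different value each time. cv_msep() below is a hypothetical helper, the data are simulated, and lm() again stands in for the PLSR fit.

```r
# cv_msep is a hypothetical helper; lm() stands in for the PLSR fit.
# 'segments' is a list of integer vectors of held-out row indices.
cv_msep <- function(dat, segments) {
  sq_err <- numeric(nrow(dat))
  for (seg in segments) {
    fit <- lm(y ~ x, data = dat[-seg, , drop = FALSE])
    pred <- predict(fit, newdata = dat[seg, , drop = FALSE])
    sq_err[seg] <- (pred - dat$y[seg])^2
  }
  mean(sq_err)
}

set.seed(3)
n <- 24
dat <- data.frame(x = rnorm(n))
dat$y <- 2 * dat$x + rnorm(n)

# Leave-one-out: the segments are fixed, so their order changes nothing
loo_a <- cv_msep(dat, as.list(seq_len(n)))
loo_b <- cv_msep(dat, as.list(sample(n)))

# Random 4-fold CV: the segment assignment is random, so the estimate varies
kfold <- replicate(20, cv_msep(dat, split(sample(n), rep(1:4, length.out = n))))
```

Here loo_a and loo_b agree, while sd(kfold) is positive; any variability one wants to attach to a leave-one-out MSEP therefore has to come from the sampling of the training data itself.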

[If someone knows how to calculate unbiased estimates of cross-validated MSEPs, please let me know. :-)]

-- 
Bjørn-Helge Mevik

Received on Fri Jul 22 20:54:42 2005