[R] Using cv.tree to assign cases to specific cv-groups

From: <jshuter_at_uoguelph.ca>
Date: Fri, 08 Feb 2008 16:07:34 -0500


Hello,

I would like to use cv.tree to run a 10-fold cross-validation experiment on a tree object to help me choose a tree size.

Many users seem to let their cases be assigned to CV groups at random, but I have assigned each case to one of 10 CV groups myself, so that all the data from any one experimental unit falls in a single CV group.

According to the manual for the tree package (Ripley 2007), the "rand" argument of cv.tree [cv.tree(object, rand, FUN = prune.tree, K = 10)] lets the user specify an “integer vector of the length the number of cases used to create object, assigning the cases to different groups for cross-validation” (Ripley 2007). However, after searching the R archives and various online sources, I have been unable to find any example code in which someone has used this option, so I am unsure how to proceed.

Specifically, should I:

1. Create a one-column data frame, with each case containing a number from 1 to 10, in the same order as the cases in the original dataset used to generate the tree object, and then

2. Call that data frame using the “rand” argument when I run the full syntax for cv.tree

OR should I:

1. List the integers used for case assignment directly in the syntax for cv.tree, following the “rand” argument?
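
For concreteness, the second approach might look like the minimal sketch below. It assumes, going only by the manual's wording, that "rand" takes a plain integer vector with one entry per case, in the same order as the rows used to fit the tree, and that no rows were dropped for missing values. The data frame "dat", its columns, and the names "fold_of_unit" and "cv_group" are hypothetical and simulated purely for illustration.

```r
library(tree)

set.seed(1)

# Hypothetical data: 30 experimental units with 5 cases each (simulated)
dat <- data.frame(
  unit = rep(1:30, each = 5),
  x1   = rnorm(150),
  x2   = rnorm(150)
)
dat$y <- 2 * dat$x1 + rnorm(150)

# Assign each unit (not each case) to one of 10 groups,
# so that all cases from a unit share the same group
fold_of_unit <- sample(rep(1:10, length.out = 30))
cv_group     <- fold_of_unit[dat$unit]

# Fit the full tree on all cases
fit <- tree(y ~ x1 + x2, data = dat)

# Pass the group assignments as an integer vector via "rand"
# (assumption: one entry per case, in the row order of dat)
cv_fit <- cv.tree(fit, rand = cv_group, FUN = prune.tree)

# Cross-validated deviance for each candidate tree size
print(data.frame(size = cv_fit$size, dev = cv_fit$dev))
```

If the group numbers were instead stored in a one-column data frame, as in the first approach, the column would presumably have to be extracted as a vector (for example with the $ operator) before being passed to "rand", since the manual asks for an integer vector rather than a data frame.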

If anyone has any experience using cv.tree (or another function) to assign specific cv-groups, any advice would be greatly appreciated!

Jen Shuter
University of Guelph



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Received on Fri 08 Feb 2008 - 21:35:13 GMT

