[R] predict() question

From: Weiwei Shi <helprhelp_at_gmail.com>
Date: Wed 18 May 2005 - 04:09:33 EST

Hi, there:
Following up on yesterday's question (a new level of a categorical variable appeared in my validation dataset and predict() complained about it; I wrote some Python code to work around the problem), I am now just curious about the details of the mechanism:

I believe rpart follows CART, so for a categorical variable the splitting criterion should look like:
is it A or not?

   --yes, go to left branch
   --no, go to right
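
To make the rule above concrete, here is a minimal Python sketch of the test as I picture it. It is only an illustration of my assumption, not rpart's actual code; the function name route and the level names are made up for the example.

```python
def route(value, left_level="A"):
    """Return the branch a value takes under an 'is it A or not?' test.

    Hypothetical helper: it models the split as I imagine it,
    not the way rpart stores or evaluates its splits.
    """
    return "left" if value == left_level else "right"


# Levels seen during training: anything that is not "A" goes right.
branches = [route(v) for v in ["A", "B", "A"]]
```

Under this picture the test only ever asks about "A", so every other value takes the right branch.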

So, at prediction time, if there is a new level C, for example, predict() should not need to complain about the occurrence of "C" (of course, if there are many "C"s in the validation set, it should complain). Maybe, for robustness, predict() has to check first whether there is a new level or not.
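
The kind of check I have in mind could be sketched in Python roughly as follows. This is an assumption about what such a check might look like, not what predict() really does internally; the helper names new_levels and split_by_known_levels and the column name "x" are hypothetical.

```python
def new_levels(training_values, validation_values):
    """Return levels present in the validation data but never seen in training."""
    return sorted(set(validation_values) - set(training_values))


def split_by_known_levels(rows, column, training_levels):
    """Separate rows whose `column` value was seen in training from the rest."""
    known = set(training_levels)
    scorable = [r for r in rows if r[column] in known]
    held_out = [r for r in rows if r[column] not in known]
    return scorable, held_out


# Toy data: "C" appears only in the validation set.
train = ["A", "B", "A", "B"]
validation = [{"x": "A"}, {"x": "C"}, {"x": "B"}]

unseen = new_levels(train, [r["x"] for r in validation])
scorable, held_out = split_by_known_levels(validation, "x", train)
```

With something like this, the rows in scorable could be passed on for prediction, while those in held_out would be counted and reported, so that many "C"s in the validation set would still be noticed.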

I am not sure whether my understanding is right; please advise!


Weiwei Shi, Ph.D

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III
Received on Wed May 18 04:15:29 2005
