Re: [R] library(rpart) or library(tree)

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Wed, 19 Dec 2007 22:09:45 +0000 (GMT)

You appear to have fitted a regression tree, which does not seem to be what your interpretation of 'pnV22' requires.
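
A minimal sketch of the distinction, using made-up data (only the column names come from the post; the values and the object names 'd', 'fit_reg' and 'fit_cls' are assumptions for illustration). With a numeric 0/1 response rpart defaults to a regression ("anova") tree; with a factor response it defaults to a classification tree:

```r
library(rpart)

# Hypothetical stand-in data: only the column names are taken from the post.
set.seed(1)
d <- data.frame(JTemp = runif(500, 8, 12),
                SNied = runif(500, 300, 600))
d$pnV22 <- as.integer(d$JTemp > 10.4 & d$SNied < 450)

# Numeric response: rpart picks method = "anova", i.e. a regression tree.
fit_reg <- rpart(pnV22 ~ JTemp + SNied, data = d)

# Factor response: rpart picks method = "class", i.e. a classification tree.
d$pnV22f <- factor(d$pnV22, levels = c(0, 1), labels = c("no", "yes"))
fit_cls <- rpart(pnV22f ~ JTemp + SNied, data = d)

fit_reg$method
fit_cls$method
```

Passing method = "class" explicitly is an alternative to relying on the default chosen from the response type.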

I have little idea what you actually did, but am confident that it is not what you claim you did.

Also, note fortune("dog"):

Firstly, don't call your matrix 'matrix'. Would you call your dog 'dog'? Anyway, it might clash with the function 'matrix'.

On Wed, 19 Dec 2007, Ingo Holz wrote:

> Hi,
>
> I have a problem with library (rpart) (and/or library(tree)).
>
> I use a data.frame with variables
> "pnV22" (observation: 1, 0 or yes, no)
> "JTemp" (mean temperature)
> "SNied" (summer rain)
>
> I used function "rpart" to build a model:
>
> library(rpart)
> attach(data.frame)
> result <- rpart(pnV22 ~ JTemp + SNied)
>
> I got the following tree:

I don't believe that: how could rpart know about 'punkte'?

> n=55518 (50 observations deleted due to missingness)
>
> node), split, n, deviance, yval
> * denotes terminal node
>
>  1) root 55518 668.744500 0.0121942400
>    2) punkte[["JTemp"]]< 10.35 51251 18.992960 0.0003707245 *
>    3) punkte[["JTemp"]]>=10.35 4267 556.532000 0.1542067000
>      6) punkte[["SNied"]]>=450 3136 291.318600 0.1036352000 *
>      7) punkte[["SNied"]]< 450 1131 234.954900 0.2944297000
>       14) punkte[["JTemp"]]>=10.55 723 113.502100 0.1950207000 *
>       15) punkte[["JTemp"]]< 10.55 408 101.647100 0.4705882000
>         30) punkte[["JTemp"]]< 10.45 48 4.479167 0.1041667000 *
>         31) punkte[["JTemp"]]>=10.45 360 89.863890 0.5194444000 *
>
> I constructed a simple new.data.frame:
>
> new.data.frame <- data.frame
> new.data.frame[,"JTemp"] <- 10.5
> new.data.frame[,"SNied"] <- 430
>
> Then I used predict() to predict values for "pnV22" in the following way:
>
> pred <- predict(result, data.frame)
> pred2 <- predict(result, new.data.frame)

It is not finding the new values from the new data frame: they do not have names like 'punkte[["JTemp"]]'.
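
One way this could be avoided, sketched with made-up data: refer to the variables by their plain column names in the formula, supply the data frame via 'data=', and give predict() a 'newdata' whose columns carry those same names. The name 'punkte' is taken from the printed tree; the values and the name 'newd' are assumptions for illustration.

```r
library(rpart)

# Hypothetical stand-in for the original data frame.
set.seed(1)
punkte <- data.frame(JTemp = runif(500, 8, 12),
                     SNied = runif(500, 300, 600))
punkte$pnV22 <- as.integer(punkte$JTemp > 10.4 & punkte$SNied < 450)

# Plain column names in the formula, looked up in 'data' (no attach(),
# no punkte[["JTemp"]] terms).
result <- rpart(pnV22 ~ JTemp + SNied, data = punkte)

# newdata columns are named exactly as the variables in the formula,
# so predict() can match them.
newd <- data.frame(JTemp = 10.5, SNied = 430)
pred  <- predict(result)                  # fitted values for the training rows
pred2 <- predict(result, newdata = newd)  # prediction for the single new row

length(pred)
length(pred2)
```

If the model terms are written as punkte[["JTemp"]], predict() evaluates them against the original object rather than the new data frame, which would account for pred and pred2 being identical.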

> The results are the same, which I checked by plotting the values of pred and pred2 and by
>
> table(pred ==pred2) which is true for all values.
>
> Looking at the tree I would expect that pred2 has the same high value for all elements of the
> vector. Did I make a mistake?
>
> Thanks, Ingo
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,                  ripley_at_stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Received on Wed 19 Dec 2007 - 22:13:48 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 19 Dec 2007 - 22:30:19 GMT.
