[R] Gini's Importance Value Variable = Inf

From: Melanie Vida <mvida_at_mitre.org>
Date: Thu 24 Mar 2005 - 07:58:35 EST


Hi All,

In the script below, the importance measure in column 4 (i.e. MeanDecreaseGini) is reported as "Inf" for V7. Running the getTree command showed that V7 had been selected at least twice in one of the trees of the random forest, so the "Inf" value was not produced by dividing the sum of the decreases by 0.

Any suggestions on what may be causing the Inf for V7 would be helpful. Thanks in advance,
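
One way to narrow this down is to list exactly which cells of the importance matrix are non-finite before rounding. The sketch below is only an illustration: `find_nonfinite` is a hypothetical helper name, and `imp_demo` is a small made-up matrix standing in for `importance(credit.rf)`; its numbers are placeholders, not results from the credit data.

```r
# Hypothetical helper: report every non-finite cell (Inf, -Inf, NaN, NA)
# of a numeric matrix as one row of variable / measure / value.
find_nonfinite <- function(m) {
  idx <- which(!is.finite(m), arr.ind = TRUE)
  data.frame(variable = rownames(m)[idx[, "row"]],
             measure  = colnames(m)[idx[, "col"]],
             value    = m[idx],
             stringsAsFactors = FALSE)
}

# Placeholder values only; substitute importance(credit.rf) in practice.
imp_demo <- matrix(c(1.2, 0.8,
                     3.1, Inf),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(c("V2", "V7"),
                                   c("MeanDecreaseAccuracy", "MeanDecreaseGini")))

bad <- find_nonfinite(imp_demo)
print(bad)
```

Running the same helper on the unrounded matrix would show whether V7 is the only affected variable and whether any other importance column is affected as well.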

-Melanie

---------

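library(randomForest)

# "?" marks missing values; stringsAsFactors = TRUE keeps V16 and the other
# text columns as factors (needed on R >= 4.0.0, where the default is FALSE)
credit <- read.csv(url("ftp://ftp.ics.uci.edu/pub/machine-learning-databases/credit-screening/crx.data"),
                   header = FALSE, na.strings = "?", stringsAsFactors = TRUE)

# Fit the forest with importance measures, dropping rows with missing values
credit.rf <- randomForest(V16 ~ ., data = credit, importance = TRUE,
                          do.trace = 100, na.action = na.omit)

imp <- round(importance(credit.rf), 2)
imp

getTree(credit.rf, 1)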

      left daughter right daughter split var split point status prediction
  [1,]             2              3        15    492.0000      1          0
  [2,]             4              5        11      2.5000      1          0
  [3,]             6              7         2     38.5000      1          0
  [4,]             8              9        14     83.0000      1          0
  [5,]            10             11         7    207.0000      1          0
  [6,]            12             13        11      0.5000      1          0
  [7,]             0              0         0      0.0000     -1          2
  [8,]            14             15         7    117.0000      1          0
  [9,]            16             17         8      3.0625      1          0
 [10,]            18             19         3      0.2700      1          0
 [11,]             0              0         0      0.0000     -1          2
 [12,]            20             21        15   4753.0000      1          0
 [13,]            22             23         2     37.0850      1          0
 [14,]            24             25        14      8.5000      1          0
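
The check described in the message (how often V7 is actually used as a split variable) can be done programmatically instead of by reading the printout. The following is a minimal sketch, assuming a matrix laid out like the getTree output above; `count_splits` is a hypothetical helper name, and `tree1` simply re-enters the first nine rows of the printed tree by hand.

```r
# Hypothetical helper: tabulate how often each variable index is used as a
# split in one tree. Terminal nodes have split var 0 and are skipped.
count_splits <- function(tree) {
  vars <- tree[, "split var"]
  table(vars[vars > 0])
}

# First nine rows of the tree printed above, re-entered by hand.
tree1 <- matrix(c( 2,  3, 15, 492.0000,  1, 0,
                   4,  5, 11,   2.5000,  1, 0,
                   6,  7,  2,  38.5000,  1, 0,
                   8,  9, 14,  83.0000,  1, 0,
                  10, 11,  7, 207.0000,  1, 0,
                  12, 13, 11,   0.5000,  1, 0,
                   0,  0,  0,   0.0000, -1, 2,
                  14, 15,  7, 117.0000,  1, 0,
                  16, 17,  8,   3.0625,  1, 0),
                ncol = 6, byrow = TRUE,
                dimnames = list(NULL,
                                c("left daughter", "right daughter", "split var",
                                  "split point", "status", "prediction")))

counts <- count_splits(tree1)
print(counts)
```

For the whole forest, the same helper could presumably be applied to `getTree(credit.rf, k)` for each `k` in `seq_len(credit.rf$ntree)` and the per-tree tables added up.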

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Thu Mar 24 08:10:35 2005
