Re: [R] Problems using rfImpute

From: James Reilly <reilly_at_stat.auckland.ac.nz>
Date: Tue, 06 May 2008 01:31:21 +1200

NA and "NA" are different values. The first is R's missing-value marker; the second is an ordinary character string, so factor() keeps it as a level. For example,
 > table(factor(c(NA,"0","1","NA","NA")))

  0  1 NA
  1  1  2

I suspect you have "NA" where you want NA, and this is causing your problem.
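
If that is the case, one way to repair it is to turn the "NA" level into a real missing value before calling rfImpute. The following is only a minimal sketch: the data frame toy and the helper na_level_to_missing are made up for illustration (toy stands in for Subset5), and it assumes the affected columns are factors, as in the str() output quoted below.

```r
# Hypothetical stand-in for Subset5: InfType holds the string "NA" as a level
toy <- data.frame(
  Sex     = factor(c("0", "1", "1", "0")),
  InfType = factor(c("0", "NA", "1", "NA"))
)

# Turn the literal level "NA" into a real missing value,
# then drop the level that is no longer used
na_level_to_missing <- function(x) {
  if (is.factor(x)) {
    x[x %in% "NA"] <- NA
    x <- droplevels(x)
  }
  x
}

fixed <- toy
fixed[] <- lapply(toy, na_level_to_missing)

str(fixed)             # InfType should now have only the levels "0" and "1"
colSums(is.na(fixed))  # number of real missing values per column
```

Applied to the real data, the same conversion should leave rfImpute with genuine NAs to fill in rather than a third factor level.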

James

-- 
James Reilly
Department of Statistics, University of Auckland
Private Bag 92019, Auckland, New Zealand

On 6/5/08 1:04 AM, Birgit Lemcke wrote:

> Hello R-user!
>
> I am running R 2.7.0 on a PowerBook (Tiger). (I am still a beginner in R
> and statistics.)
>
> I tried rfImpute (randomForest) and, as far as I understood, it should
> replace NAs using a proximity matrix:
>
> > set.seed(100000)
> > Subset5Imputed<-rfImpute(Sex~., data=Subset5)
> ntree OOB 1 2
> 300: 11.78% 12.36% 11.21%
> ntree OOB 1 2
> 300: 12.07% 12.64% 11.49%
> ntree OOB 1 2
> 300: 11.49% 11.21% 11.78%
> ntree OOB 1 2
> 300: 12.50% 12.93% 12.07%
> ntree OOB 1 2
> 300: 12.07% 12.36% 11.78%
> > str(Subset5Imputed)
>
> 'data.frame': 696 obs. of 24 variables:
> $ Sex : Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ...
> $ InfSpath_caducuous : Factor w/ 3 levels "0","1","NA": 1 1 1 1 1 1 1 1 1 1 ...
> $ InfType_sparsely_paniculate: Factor w/ 3 levels "0","1","NA": 1 1 1 3 1 1 1 1 1 1 ...
>
> But there are still NAs in the data frame. Sorry if the reason is only
> my stupidity, and thanks in advance for answering.
>
> B.
>
>
> Birgit Lemcke
> Institut für Systematische Botanik
> Zollikerstrasse 107
> CH-8008 Zürich
> Switzerland
> Ph: +41 (0)44 634 8351
> birgit.lemcke_at_systbot.uzh.ch
Received on Mon 05 May 2008 - 13:44:31 GMT