Re: [R] missing handling

From: Weiwei Shi <helprhelp_at_gmail.com>
Date: Wed 05 Oct 2005 - 04:51:11 EST

Hi, Jim:
I tried your code and got the following error:

trn1<-read.table('trn1.svm', header=F, na.string='.', sep='|')
Med<-apply(trn1, 2, median, na.rm=T)
Ind<-which(is.na(trn1), arr.ind=T)
trn1[Ind]<-Med[Ind[,'col']]
Error in "[<-.data.frame"(`*tmp*`, Ind, value = c(1.00802124455, 1.00802124455, :
        only logical matrix subscripts are allowed in replacement

I cannot figure out why.
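
A likely reading of the message: x.1 in Jim's example is a matrix, while read.table returns a data frame, and the error says that this version of "[<-.data.frame" accepts a matrix subscript only when it is logical. which(..., arr.ind=T) returns a numeric two-column matrix, so the replacement is rejected. A minimal sketch of two workarounds follows. The small trn1 below is a made-up stand-in for the real file, and fill_median is a hypothetical helper name, not something from the thread.

```r
# Hypothetical stand-in for trn1; the real data comes from 'trn1.svm'.
trn1 <- data.frame(V1 = c(1, NA, 3, 4),
                   V2 = c(10, 20, NA, NA),
                   V3 = c(5, 6, 7, 8))

# Option 1: fill each column in place, so no matrix subscript is needed.
fill_median <- function(df) {
  for (j in seq_along(df)) {
    miss <- is.na(df[[j]])
    if (any(miss)) df[[j]][miss] <- median(df[[j]], na.rm = TRUE)
  }
  df
}
trn2 <- fill_median(trn1)

# Option 2: convert to a numeric matrix first, then index with the
# (row, col) matrix from which(..., arr.ind = TRUE) as in Jim's code.
m   <- as.matrix(trn1)
Med <- apply(m, 2, median, na.rm = TRUE)
Ind <- which(is.na(m), arr.ind = TRUE)
m[Ind] <- Med[Ind[, "col"]]
```

Option 1 keeps the result as a data frame and preserves each column's type. Option 2 assumes every column is numeric, since as.matrix would otherwise coerce the whole table to character.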

Thanks for help,

On 9/27/05, jim holtman <jholtman@gmail.com> wrote:
>
> Use 'which(...arr.ind=T)'
> > x.1
>       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>  [1,]    6   10    3    4   10    7    9    8    4    10
>  [2,]    8    7    4    7    4    8    3   NA    3     4
>  [3,]    7    7   10   10    3    5    3    2    2     2
>  [4,]    3    4    5   10   10    2    6    9    4     5
>  [5,]    3    5    9    5    6   NA    3   NA    6     7
>  [6,]    9    6   10    5   10    4    2   10   NA     5
>  [7,]    5    2    5   10    3    7    6    4    6     8
>  [8,]    2    6    1    8    9    2    7    8    3     8
>  [9,]    9    1    4    9    8   10    2   NA    1     7
> [10,]    2    4    8    7   NA    4    3   NA    5     5
> > x.4
> [1] 5.5 5.5 5.0 7.5 8.0 5.0 3.0 8.0 4.0 6.0
> > Med <- apply(x.1, 2, median, na.rm=T) # get median
> > Ind <- which(is.na(x.1), arr.ind=T) # determine which are NA
> > x.1[Ind] <- Med[Ind[,'col']] # replace with median
> > x.1
>       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>  [1,]    6   10    3    4   10    7    9    8    4    10
>  [2,]    8    7    4    7    4    8    3    8    3     4
>  [3,]    7    7   10   10    3    5    3    2    2     2
>  [4,]    3    4    5   10   10    2    6    9    4     5
>  [5,]    3    5    9    5    6    5    3    8    6     7
>  [6,]    9    6   10    5   10    4    2   10    4     5
>  [7,]    5    2    5   10    3    7    6    4    6     8
>  [8,]    2    6    1    8    9    2    7    8    3     8
>  [9,]    9    1    4    9    8   10    2    8    1     7
> [10,]    2    4    8    7    8    4    3    8    5     5
> >
>
>
> On 9/27/05, Weiwei Shi <helprhelp@gmail.com> wrote:
>
> > Hi,
> > I have the following code to replace missing values with the column
> > median, assuming missing values only occur in continuous variables:
> >
> > trn1<-read.table('trn1.fv', header=F, na.string='.', sep='|')
> >
> > # median
> > m.trn1<-sapply(1:ncol(trn1), function(i) median(trn1[,i], na.rm=T))
> >
> > #replace
> > trn2<-trn1
> > for (each in 1:nrow(trn1)){
> > index.missing=which(is.na(trn1[each,]))
> > trn2[each,]<-replace(trn1[each,], index.missing, m.trn1[index.missing])
> > }
> >
> >
> > Can anyone suggest ways to improve it? Replacing 10 rows takes 1.5 sec:
> > > system.time(for (each in 1:10){index.missing=which(is.na(trn1[each,]));
> > trn2[each,]<-replace(trn1[each,], index.missing, m.trn1[index.missing]);})
> > [1] 1.53 0.00 1.53 0.00 0.00
> >
> >
> > Another general question:
> > are there any packages in R for handling missing values?
> >
> > Thanks,
> >
> > --
> > Weiwei Shi, Ph.D
> >
> > "Did you always know?"
> > "No, I did not. But I believed..."
> > ---Matrix III
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide!
> > http://www.R-project.org/posting-guide.html
> >
>
>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 247 0281
>
> What is the problem you are trying to solve?

--
Weiwei Shi, Ph.D

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III

Received on Wed Oct 05 04:56:07 2005
