From: Weiwei Shi <helprhelp_at_gmail.com>

Date: Wed 22 Jun 2005 - 03:25:36 EST

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Wed Jun 22 03:30:10 2005

Even before trying it, I realized it must be true as soon as I read this reply! Great job! Thanks, Andy.

> str(z)

`data.frame': 235 obs. of 2 variables:
$ CLAIMNUM : Factor w/ 1907 levels "0","10000001849",..: 1083 1083 1083 1582 1582 1084 1681 1681 1391 1391 ...
$ SIU.SAVED: int 475 3000 3000 0 0 4352 0 0 4500 3000 ...
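
The str() output above shows the cause: CLAIMNUM still carries all 1907 levels from y, although z has only 235 rows. A minimal sketch of the same effect, using made-up toy data (the column names ID, AMT and KEY below are illustrative assumptions, not the original dataset):

```r
# Hypothetical toy data standing in for the original y and x.
y <- data.frame(ID  = factor(c("a", "a", "b", "c", "c")),
                AMT = c(1, 2, 3, 4, 5))
x <- data.frame(KEY = c("a", "c"))

# Same kind of subsetting as in the thread: keep rows of y whose ID occurs in x.
z <- y[y$ID %in% x$KEY, ]

# The subset has no "b" rows, but the factor still remembers level "b".
nlevels(z$ID)

# tapply builds one cell per level, so the empty group "b" comes back as NA.
res <- tapply(z$AMT, z$ID, sum)
res
```

So z does keep a "trace" of y, but only through the level set of the factor, not through any hidden rows.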

So I have another general question: how can I avoid this when I do the matching? In my case, claimnum does not have to be a factor, so I think I can use as.integer on it to de-factor it. But I would also like to know how to do it while keeping it as a factor. By the way, what's your way of dropping those levels? :)
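
One possible way to handle both questions, sketched with hypothetical toy data (f and amt are assumed names, not from the original dataset). Re-applying factor() rebuilds the factor from the values actually present, and droplevels() does the same in current versions of R. As a caveat, as.integer() on a factor returns the internal level codes rather than the labels, so going through as.character() is usually the safer way to de-factor.

```r
# Hypothetical factor that has an unused level after subsetting.
f   <- factor(c("10", "10", "20", "30"))[c(1, 2, 4)]   # level "20" is now empty
amt <- c(475, 3000, 4352)

levels(f)                    # all three original levels are still listed

# Keep it a factor, but drop the empty levels (two equivalent ways).
f1 <- factor(f)
f3 <- droplevels(f)          # available in current versions of R

# With the empty level gone, tapply reports only the groups that have data.
sums <- tapply(amt, f1, sum)

# De-factoring: as.integer() yields level codes, as.character() yields labels.
codes  <- as.integer(f)
labels <- as.character(f)
```

If the labels are numeric and small enough, as.numeric(as.character(f)) recovers the values themselves; whether that suits the claim numbers in this thread is not something the sketch can settle.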

weiwei

On 6/21/05, Liaw, Andy <andy_liaw@merck.com> wrote:

> What does str(z) say? I suspect the second column is a factor, which, after
> the subsetting, has some empty levels. If so, just drop those levels.
>
> Andy
>
> > From: Weiwei Shi
> >
> > Hi,
> > I tried all the methods suggested above: ave and rowsum, used with the
> > "with" function, work for my situation. I think the problem might not
> > be due to tapply.
> > My data z comes from
> > z<-y[y[[1]] %in% x[[2]], c(1,9)]
> >
> > so z is supposed to have no entries for the values not matched
> > between x and y.
> >
> > However, when I run tapply, the result also includes those
> > non-matched entries. I used is.na to remove those entries from z
> > first and then ran tapply again, but the result is the same: the
> > NAs and the non-matched results are still there. That's what I meant
> > by "it doesn't work".
> >
> > Is there something I missed here, so that z "implicitly" keeps some
> > "trace" back to the y dataset?
> >
> > thanks,
> >
> > On 6/20/05, Gabor Grothendieck <ggrothendieck@gmail.com> wrote:
> > > On 6/20/05, Weiwei Shi <helprhelp@gmail.com> wrote:
> > > > Hi,
> > > > I have another question on tapply.
> > > > I have a dataset z like this:
> > > > 5540 389100307391 2600
> > > > 5541 389100307391 2600
> > > > 5542 389100307391 2600
> > > > 5543 389100307391 2600
> > > > 5544 389100307391 2600
> > > > 5546 381300302513 NA
> > > > 5547 387000307470 NA
> > > > 5548 387000307470 NA
> > > > 5549 387000307470 NA
> > > > 5550 387000307470 NA
> > > > 5551 387000307470 NA
> > > > 5552 387000307470 NA
> > > >
> > > > I want to sum column 3 by column 2.
> > > > I tried to remove the NAs by calling:
> > > > tapply(z[[3]], z[[2]], sum, na.rm=TRUE)
> > > > but it does not work.
> > > >
> > > > Then I used
> > > > z1<-z[!is.na(z[[3]]),]
> > > > and repeated the call, but it
> > > > still doesn't work.
> > > >
> > > > Please help.
> > > >
> > >
> > > Depending on what you want you may be able to use rowsum:
> > >
> > > - display only groups that have at least one non-NA with the sum
> > > being the sum of the non-NAs:
> > >
> > > with(na.omit(z), rowsum(V3, V2))
> > >
> > > - display all groups with the sum being NA if any member is NA:
> > >
> > > rowsum(z$V3, z$V2)
> > >
> >
> > --
> > Weiwei Shi, Ph.D
> >
> > "Did you always know?"
> > "No, I did not. But I believed..."
> > ---Matrix III
> >

This archive was generated by hypermail 2.1.8: Fri 03 Mar 2006 - 03:32:56 EST