From: Tiago R Magalhaes <tiago17_at_socrates.Berkeley.EDU>

Date: Sat 19 Mar 2005 - 05:21:44 EST

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Sat Mar 19 05:26:20 2005

Thank you very much to Andy Liaw, Rob J Goedman and Marc Schwartz for taking the time to solve my problem. I've learned on many other occasions from useful tips from all three of them, and it just happened once again. You've got to love this mailing list...

subset(x, a %in% a[duplicated(a)])

works in all cases and is the simplest, but as always, all of the solutions helped me understand R's concepts and functions a little better.
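For anyone reading this later, here is a small sketch of why that one-liner picks up every row of a repeated group, using the example data frame from my original question. The intermediate names dup_flag, dup_vals and all_dups are only for illustration and are not from any of the replies.

```r
# Example data from the original question
x <- data.frame(a = c(1, 2, 2, 3, 3, 3), b = 10)

# duplicated() flags only the second and later occurrences of a value,
# so on its own it misses the first row of each repeated group
dup_flag <- duplicated(x$a)        # FALSE FALSE TRUE FALSE TRUE TRUE

# The values that occur more than once
dup_vals <- unique(x$a[dup_flag])  # 2 3

# %in% matches every row carrying one of those values,
# including the first occurrence that duplicated() skipped
all_dups <- x[x$a %in% dup_vals, ]
print(all_dups)
```

This is the same selection as the subset() call, just split into steps.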

I would suggest including this in the help page for duplicated(). Also useful might be:

subset(x, !a %in% a[duplicated(a)])

which gives all rows whose value in a has no duplicates.
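A minimal sketch of the complement, again on the example data. The second variant assumes a version of R in which duplicated() accepts the fromLast argument, which was not part of this thread; the names singles, in_dup_group and singles2 are only illustrative.

```r
# Example data from the original question
x <- data.frame(a = c(1, 2, 2, 3, 3, 3), b = 10)

# Rows whose value of 'a' occurs exactly once.
# Note that !a %in% b is parsed as !(a %in% b).
singles <- subset(x, !a %in% a[duplicated(a)])
print(singles)

# Alternative (assumption: duplicated() supports 'fromLast'):
# a row belongs to a repeated group if it is flagged when scanning
# from the front or when scanning from the back
in_dup_group <- duplicated(x$a) | duplicated(x$a, fromLast = TRUE)
singles2 <- x[!in_dup_group, ]
print(singles2)
```

Dropping the ! from either variant gives back the rows that do have duplicates.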

Again, thanks for all the help on this mailing list.

>Here's one more possibility:

>
> > subset(x, a %in% a[duplicated(a)])
>   a  b
> 2 2 10
> 3 2 10
> 4 3 10
> 5 3 10
> 6 3 10
>
>HTH,
>
>Marc Schwartz
>
>
>On Thu, 2005-03-17 at 22:25 -0500, Liaw, Andy wrote:
>> OK, strike one...
>>
>> Here's my second try:
>>
>> > cnt <- table(x[,1])
>> > v <- as.numeric(names(cnt[cnt > 1]))
>> > v
>> [1] 2 3
>> > x[x[,1] %in% v, ]
>>   a  b
>> 2 2 10
>> 3 2 10
>> 4 3 10
>> 5 3 10
>> 6 3 10
>>
>> Andy
>>
>> > From: Liaw, Andy
>> >
>> > Does this work for you?
>> >
>> > > x[table(x[,1]) > 1,]
>> >   a  b
>> > 2 2 10
>> > 3 2 10
>> > 5 3 10
>> > 6 3 10
>> >
>> > Andy
>> >
>> > > From: Tiago R Magalhaes
>> > >
>> > > Hi
>> > >
>> > > I want to extract all the rows in a data frame that have duplicates
>> > > for a given column.
>> > > I would expect this question to come up pretty often but I have
>> > > researched the archives and surprisingly couldn't find anything.
>> > > The best I can come up with is:
>> > >
>> > > x <- data.frame(a=c(1,2,2,3,3,3), b=10)
>> > > xdup1 <- duplicated(x[,1])
>> > > xdup2 <- duplicated(x[,1][nrow(x):1])[nrow(x):1]
>> > > xAllDups <- x[(xdup1+xdup2)!=0,]
>> > >
>> > > This seems to work, but it's so convoluted that I'm sure there's a
>> > > better method.
>> > > Thanks for any help and enlightenment
>> > > [[alternative HTML version deleted]]

