Re: [R] subset

From: Marc Schwartz (via MN)
Date: Wed 17 May 2006 - 04:49:33 EST

On Tue, 2006-05-16 at 14:37 -0400, Guenther, Cameron wrote:
> Hello everyone,
> I have a large dataset (x) with some rows that have duplicate variables
> that I would like to remove. I find which rows are the duplicates with
> x1 <- which(duplicated(x)). That gives me the rows with duplicated
> variables. Now, how can I remove just those rows from the original data
> frame? I think I can create a new data frame without the duplicates
> using subset. I have tried:
> subset(x,!x1) and subset(x,!x[x1,])
> I can't seem to find the correct syntax. Any advice?
> Thanks in advance

Even easier would be to use unique():

  NewDF <- unique(x)

NewDF will contain rows from 'x' with duplicates removed.

See ?unique for more information.
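
As a quick illustration, here is a minimal sketch on a made-up data frame (the columns 'id' and 'grp' are hypothetical, not taken from the original data):

```r
# Hypothetical example data: rows 2 and 4 repeat rows 1 and 3
x <- data.frame(id  = c(1, 1, 2, 2, 3),
                grp = c("a", "a", "b", "b", "c"))

# unique() keeps the first occurrence of each distinct row
NewDF <- unique(x)

nrow(x)      # 5 rows before
nrow(NewDF)  # 3 rows after
```

Note that the surviving rows keep their original row names (1, 3 and 5 here), so the row names of NewDF will generally not be consecutive.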

unique(), which has a data.frame method, is basically:

  x[!duplicated(x), , drop = FALSE]

where 'drop = FALSE' ensures that the result remains a data frame, even when 'x' has only a single column or the result contains only a single row.
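
To connect this to the which() approach in the original question, here is a small sketch on made-up data (the data frames 'x' and 'y' below are hypothetical). It shows the logical index, the equivalent negative integer index, and what 'drop = FALSE' protects against:

```r
# Hypothetical example data: rows 2 and 4 repeat rows 1 and 3
x <- data.frame(id  = c(1, 1, 2, 2, 3),
                grp = c("a", "a", "b", "b", "c"))

dup <- duplicated(x)            # logical vector, TRUE for repeated rows

# Logical indexing: works even when nothing is duplicated
a <- x[!dup, , drop = FALSE]

# Integer positions, as in the original question
x1 <- which(dup)

# Negative indexing needs a guard: with no duplicates, x1 is integer(0)
# and x[-x1, ] would return zero rows rather than all of them
b <- if (length(x1) > 0) x[-x1, , drop = FALSE] else x

all.equal(a, b)

# Single-column case: without drop = FALSE the result is a plain vector
y <- data.frame(v = c(1, 1, 2))
y[!duplicated(y), ]                 # numeric vector
y[!duplicated(y), , drop = FALSE]   # still a data frame
```

The logical form is the safer of the two, since it needs no special handling when there are no duplicates at all.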

Note that the above presumes that you want to test all columns in 'x' for dups.
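
If only some of the columns should decide what counts as a duplicate, one possible approach is to call duplicated() on just those columns. This is a sketch under assumptions: the column names 'id', 'grp' and 'val' and the choice of key columns are made up for illustration.

```r
# Hypothetical example data: rows 1 and 2 agree on 'id' and 'grp'
# but differ in 'val'; rows 3 and 4 agree on every column
x <- data.frame(id  = c(1, 1, 2, 2, 3),
                grp = c("a", "a", "b", "b", "c"),
                val = c(10, 11, 20, 20, 30))

# Testing all columns: only row 4 is a full duplicate
all_cols <- unique(x)

# Testing only the (assumed) key columns: rows 2 and 4 are dropped
key <- c("id", "grp")
by_key <- x[!duplicated(x[, key, drop = FALSE]), , drop = FALSE]

nrow(all_cols)  # 4
nrow(by_key)    # 3
```

Here the first row of each 'id'/'grp' combination is the one that is kept, so the order of the rows in 'x' matters.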

HTH,

Marc Schwartz

Received on Wed May 17 04:59:44 2006