Re: [R] Fast Removing Duplicates from Every Column

From: Petr Pikal <petr.pikal_at_precheza.cz>
Date: Tue 16 Jan 2007 - 10:47:07 GMT

Hi

I have no idea what the Test data look like. However, the help pages of the functions

data.frame()
as.data.frame()
str()

and maybe a few others can help you find out how to convert objects to data frames.
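
As a rough sketch of one way to do that, assuming Test holds the unique values of each column as in the apply() call quoted below: pad every element with NA up to the longest length, then pass the result to as.data.frame(). The small my_data here is made up for illustration, and lapply() is swapped in for apply() so that the result is always a list.

```r
# Hypothetical stand-in for the real 160-column data frame.
my_data <- data.frame(
  Col1 = c("0", "HD", "0", "LD", "HD"),
  Col2 = c("0", "0", "HD", "HD", "HD"),
  Col3 = c("LD", "0", "HD", "HD", "LD"),
  stringsAsFactors = FALSE
)

# Unique values per column; lapply() always returns a list.
Test <- lapply(my_data, unique)

# Pad every element with NA up to the longest one, so that all
# columns end up with the same number of rows.
n_max <- max(sapply(Test, length))
padded <- lapply(Test, function(x) c(x, rep(NA, n_max - length(x))))

# Equal-length list elements convert cleanly to a data frame.
result <- as.data.frame(padded, stringsAsFactors = FALSE)
print(result)
```

Whether NA is the right filler depends on what the shortened columns are used for afterwards; an empty string would work the same way.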

HTH
Petr

On 16 Jan 2007 at 10:36, Bert Jacobs wrote:

From: "Bert Jacobs" <b.jacobs@pandora.be>
To: "'Petr Pikal'" <petr.pikal@precheza.cz>
Subject: RE: [R] Fast Removing Duplicates from Every Column
Date sent: Tue, 16 Jan 2007 10:36:42 +0100

> Hi Petr,
>
> Thx for answering my question below.
> Actually I could use this line of code to get my problem solved.
>
> Test = apply(X=my_data, MARGIN=2, FUN=unique)
>
> Now I was wondering how to transform 'Test' into a dataframe, given
> that the columns now have different numbers of rows.
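
To illustrate why 'Test' is not already rectangular: apply() simplifies its result to a matrix only when every column yields the same number of unique values; otherwise it returns a list of vectors of different lengths. A small sketch, with toy data invented for illustration:

```r
# Toy data, invented for illustration only.
equal_counts <- data.frame(a = c("x", "y", "x"), b = c("p", "p", "q"),
                           stringsAsFactors = FALSE)
ragged_counts <- data.frame(a = c("x", "y", "z"), b = c("p", "p", "q"),
                            stringsAsFactors = FALSE)

# Two unique values in each column: apply() simplifies to a 2 x 2 matrix.
m <- apply(X = equal_counts, MARGIN = 2, FUN = unique)

# Three unique values in 'a', two in 'b': apply() returns a list.
l <- apply(X = ragged_counts, MARGIN = 2, FUN = unique)

print(is.matrix(m))
print(is.list(l))
print(sapply(l, length))
```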
>
> Thx,
> Bert
>
> _____________________________
>
> Bert Jacobs
> Marketing Intelligence Engineer
> Plasveldlaan 5
> 9400 Ninove
> Tel: 0477/68.74.07
> Fax: 054/25.00.35
> E-mail: b.jacobs@pandora.be
>
> -----Original Message-----
> From: Petr Pikal [mailto:petr.pikal@precheza.cz]
> Sent: 05 January 2007 11:51
> To: Bert Jacobs; 'R help list'
> Subject: Re: [R] Fast Removing Duplicates from Every Column
>
> Hi
>
> I am not sure I understand how you want to select unique items.
>
> With
> sapply(DF, function(x) !duplicated(x))
> you get a logical matrix with TRUE where an item is the first
> occurrence of its value in that column and FALSE otherwise. You then
> need to choose which rows to keep or discard,
>
> e.g.
>
> DF[rowSums(sapply(DF, function(x) !duplicated(x)))>1,]
>
> selects all rows containing 2 or more values that are first occurrences
> in their column.
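
A small sketch of what these two expressions compute, using a made-up DF. Note that !duplicated(x) marks the first occurrence of each value, so a value that shows up again further down is still TRUE on the row where it first appears.

```r
# Made-up data frame for illustration.
DF <- data.frame(
  Col1 = c("0", "HD", "0", "LD"),
  Col2 = c("0", "0", "HD", "HD"),
  stringsAsFactors = FALSE
)

# Logical matrix: TRUE where a value appears for the first time in its column.
first_seen <- sapply(DF, function(x) !duplicated(x))

# Number of first occurrences in each row.
n_first <- rowSums(first_seen)

# Keep only the rows with 2 or more first occurrences.
kept <- DF[n_first > 1, ]
print(kept)
```

With this toy DF only the first row is kept, since it is the only row in which both columns show a value for the first time.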
>
> HTH
> Petr
>
>
> On 5 Jan 2007 at 9:54, Bert Jacobs wrote:
>
> From: "Bert Jacobs" <b.jacobs@pandora.be>
> To: "'R help list'" <r-help@stat.math.ethz.ch>
> Date sent: Fri, 5 Jan 2007 09:54:17 +0100
> Subject: Re: [R] Fast Removing Duplicates from Every Column
>
> > Hi,
> >
> > I'm looking for some lines of code that do the following:
> >
> > I have a dataframe with 160 columns and a number of rows (max 30):
> >
> >          Col1  Col2  Col3  ...  Col 159  Col 160
> > Row 1    0     0     LD    ...  0        VD
> > Row 2    HD    0     0     ...  0        MD
> > Row 3    0     HD    HD    ...  0        LD
> > Row 4    LD    HD    HD    ...  0        LD
> > ...
> > LastRow  HD    HD    LD    ...  0        MD
> >
> >
> > Now I want a dataframe that looks like this. As you see all
> > duplicates are removed. Can this dataframe be constructed in a fast
> > way?
> >
> >          Col1  Col2  Col3  ...  Col 159  Col 160
> > Row 1    0     0     LD    ...  0        VD
> > Row 2    HD    HD    0     ...  0        MD
> > Row 3    LD    0     HD    ...  0        LD
> >
> > Thx for helping me out.
> >
> > Bert
> >
> > ______________________________________________
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html and provide commented,
> > minimal, self-contained, reproducible code.
>
> Petr Pikal
> petr.pikal@precheza.cz
>
>

Petr Pikal
petr.pikal@precheza.cz



Received on Tue Jan 16 21:53:15 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Tue 16 Jan 2007 - 11:30:28 GMT.