Re: [R] unique problem

From: Assa Yeroslaviz <frymor_at_gmail.com>
Date: Mon, 14 Jun 2010 18:32:21 +0200

I thought unique() deleted the whole line.
I don't really need the row names; I only thought of them as a way of getting the unique items.

Is there a way of deleting whole lines completely according to their identifiers?

What I really need are unique values in the first column.
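
A minimal sketch of filtering on the identifier column alone, in case it helps later readers. The data frame `wb`, its column names, and its values are made up for illustration; only the function calls are base R. It contrasts unique() on whole rows with duplicated() on the first column, and shows both readings of the request: keep one row per identifier, or drop every identifier that occurs more than once.

```r
# Toy stand-in for Workbook5: column names and values are hypothetical.
wb <- data.frame(
  ProbeID = c("A_51_P102339", "A_51_P102339", "A_51_P102518",
              "A_51_P103435", "A_51_P103435"),
  Value   = c("x", "y", "z", "w", "w"),
  stringsAsFactors = FALSE
)

# unique() compares whole rows, so only the exact repeat (row 5) is dropped;
# "A_51_P102339" survives twice because its second column differs.
by_row <- unique(wb)

# Option 1: keep the first row seen for each identifier in column 1.
first_per_id <- wb[!duplicated(wb[[1]]), ]

# Option 2: drop every row whose identifier occurs more than once.
dup_ids    <- wb[[1]][duplicated(wb[[1]])]
singletons <- wb[!(wb[[1]] %in% dup_ids), ]

# With one row per identifier, the row-name assignment no longer fails.
rownames(first_per_id) <- first_per_id[[1]]
```

Which row option 1 keeps depends on row order; duplicated(..., fromLast = TRUE) keeps the last occurrence instead of the first.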

Assa

On Mon, Jun 14, 2010 at 18:04, jim holtman <jholtman_at_gmail.com> wrote:

> Your process does remove all the duplicate entries based on the
> content of the two columns. After you do this, there are still
> duplicate entries in the first column that you are trying to use as
> rownames and therefore the error. Why do you want to use non-unique
> entries as rownames? Do you really need the row names, or should you
> only be keeping unique values for the first column?
>
> On Mon, Jun 14, 2010 at 8:54 AM, Assa Yeroslaviz <frymor_at_gmail.com> wrote:
> > Hello everybody,
> >
> > I have a matrix with 2 columns and over 27k rows.
> > Some of the rows are duplicated, so I tried to remove them with the command
> > unique():
> >
> >> Workbook5 <- read.delim(file = "Workbook5.txt")
> >> dim(Workbook5)
> > [1] 27748 2
> >> Workbook5 <- unique(Workbook5)
> >> dim(Workbook5)
> > [1] 20101 2
> >
> > It removed a lot of lines, but unfortunately not all of them. I wanted to add
> > the row names to the matrix and got this error message:
> >> rownames(Workbook5) <- Workbook5[,1]
> > Error in `row.names<-.data.frame`(`*tmp*`, value = c(1L, 2L, 3L, 4L, 5L, :
> > duplicate 'row.names' are not allowed
> > In addition: Warning message:
> > non-unique values when setting 'row.names': ‘A_51_P102339’,
> > ‘A_51_P102518’, ‘A_51_P103435’, ‘A_51_P103465’,
> > ‘A_51_P103594’, ‘A_51_P104409’, ‘A_51_P104718’,
> > ‘A_51_P105869’, ‘A_51_P106428’, ‘A_51_P106799’,
> > ‘A_51_P107176’, ‘A_51_P107959’, ‘A_51_P108767’,
> > ‘A_51_P109258’, ‘A_51_P109708’, ‘A_51_P110341’,
> > ‘A_51_P111757’, ‘A_51_P112427’, ‘A_51_P112662’,
> > ‘A_51_P113672’, ‘A_51_P115018’, ‘A_51_P116496’,
> > ‘A_51_P116636’, ‘A_51_P117666’, ‘A_51_P118132’,
> > ‘A_51_P118168’, ‘A_51_P118400’, ‘A_51_P118506’,
> > ‘A_51_P119315’, ‘A_51_P120093’, ‘A_51_P120305’,
> > ‘A_51_P120738’, ‘A_51_P120785’, ‘A_51_P121134’,
> > ‘A_51_P121359’, ‘A_51_P121412’, ‘A_51_P121652’,
> > ‘A_51_P121724’, ‘A_51_P121829’, ‘A_51_P122141’,
> > ‘A_51_P122964’, ‘A_51_P123422’, ‘A_51_P123895’,
> > ‘A_51_P124008’, ‘A_51_P124719’, ‘A_51_P125648’,
> > ‘A_51_P125679’, ‘A_51_P125779 [... truncated]
> >
> > Is there a better way to discard the duplications in the text file (an Excel
> > file is the origin)?
> >
> >> R.version
> >                _
> > platform       x86_64-apple-darwin9.8.0
> > arch           x86_64
> > os             darwin9.8.0
> > system         x86_64, darwin9.8.0
> > status         Patched
> > major          2
> > minor          11.1
> > year           2010
> > month          06
> > day            03
> > svn rev        52201
> > language       R
> > version.string R version 2.11.1 Patched (2010-06-03 r52201)
> >
> > THX
> >
> > Assa
> >
> > ______________________________________________
> > R-help_at_r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> >
>
>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?
>


Received on Mon 14 Jun 2010 - 16:34:26 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 14 Jun 2010 - 17:30:32 GMT.
