Re: [R] unique/subset problem

From: Sarah Goslee <sarah.goslee_at_gmail.com>
Date: Fri 26 Jan 2007 - 16:17:27 GMT

Without knowing more about your data, it is hard to say for certain, but might you be confusing unique _values_ with _factor levels_?

> mydata <- as.factor(sort(rep(1:5, 2)))
# mydata has 10 values, 5 unique values, and 5 factor levels
> mydata
 [1] 1 1 2 2 3 3 4 4 5 5
Levels: 1 2 3 4 5
> unique(mydata)
[1] 1 2 3 4 5
Levels: 1 2 3 4 5
> mydata.subset <- mydata[1:4]
# the subset now has only 2 unique values, but the output
# still lists all five factor levels
> unique(mydata.subset)
[1] 1 2
Levels: 1 2 3 4 5

# try drop=TRUE as an argument to [ when subsetting the factor
> mydata.subset <- mydata[1:4, drop=TRUE]
> unique(mydata.subset)
[1] 1 2
Levels: 1 2
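
The same thing can be sketched for a data frame shaped like the one in your question. The data below are made up for illustration (the genome names, distances and scores are all hypothetical), and this assumes genome1 is stored as a factor. Note that unique() itself returns only the values present; it is the Levels line that still shows everything.

```r
# Hypothetical stand-in for the poster's data; all values are invented.
dataset <- data.frame(
  genome1 = factor(c("gA", "gA", "gB", "gC", "gD")),
  genome2 = factor(c("gB", "gC", "gC", "gD", "gA")),
  dist    = c(0.1, 0.4, 0.2, 0.9, 0.7),
  score   = c(-8, -6, -2, -1, -9)
)

prunedrelatives <- subset(dataset, score < -5)

# The unused factor levels survive the subset ...
kept_levels <- levels(prunedrelatives$genome1)

# ... but unique() returns only the values actually present,
# so length() of that result is the count you want.
present <- unique(prunedrelatives$genome1)
n_present <- length(present)

# Rebuilding the factor discards the levels that are no longer used.
prunedrelatives$genome1 <- factor(prunedrelatives$genome1)
new_levels <- levels(prunedrelatives$genome1)
```

Calling factor() on an existing factor simply rebuilds it from the values that remain. More recent versions of R also provide droplevels(), which should do the same for every factor column of a data frame at once.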

Alternatively, if this is the problem and you don't need those data to be factors, you could always convert them to a more appropriate form.
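
A minimal sketch of such a conversion, assuming the factor labels are really numbers (the values here are made up). The one trap is that as.numeric() applied directly to a factor returns the internal level codes rather than the original values, so go through as.character() first.

```r
# Hypothetical factor whose labels happen to be numeric.
f <- factor(c(10, 10, 20, 30))
f.subset <- f[1:2]

# Character vector: no levels attribute to carry around.
as_char <- as.character(f.subset)

# Numeric vector holding the original values.
as_num <- as.numeric(as.character(f.subset))

# Internal level codes, NOT the original values.
codes <- as.numeric(f.subset)
```

After either conversion, unique() has no levels left to print, so what you see is exactly what is in the subset.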

Sarah

> > On 1/25/07, lalitha viswanath
> > <lalithaviswanath@yahoo.com> wrote:
> > > Hi
> > > I am new to R programming and am using subset to
> > > extract part of a data as follows
> > >
> > > names(dataset) =
> > > c("genome1","genome2","dist","score");
> > > prunedrelatives <- subset(dataset, score < -5);
> > >
> > > However when I use unique to find the number of unique
> > > genomes now present in prunedrelatives I get results
> > > identical to calling unique(dataset$genome1) although
> > > subset has eliminated many genomes and records.
> > >
> > > I would greatly appreciate your input about using
> > > "unique" correctly in this regard.
> > >
> > > Thanks
> > > Lalitha
> > >

-- 
Sarah Goslee
http://www.functionaldiversity.org

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Sat Jan 27 03:25:43 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.