From: Kenn Konstabel <lebatsnok_at_gmail.com>

Date: Sat, 17 May 2008 23:28:26 +0300

R-help_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Sat 17 May 2008 - 20:33:18 GMT

Can it be this:

foo<-tapply(d$tt, d$v, min)

data.frame(v=names(foo), tt=foo)
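
For reference, here is a self-contained sketch of the tapply idea run on the example data from the original post. The names `res` and `res2` are only illustrative, and the aggregate() call is an alternative added here, not something proposed in the thread. Both variants return just the group label and its minimum, so any further columns in a wider data frame would be dropped.

```r
# Example data from the original post
v  <- c(rep("v1", 3), rep("v2", 4), rep("v3", 2), "v4", rep("v5", 6))
tt <- c(1, 2, 3, 3, 1, 2, 3, 4, 5, 2, 7, 9, 2, 3, 1, 4)
d  <- data.frame(v, tt)

# tapply() returns a named vector: one minimum per distinct value of d$v
foo <- tapply(d$tt, d$v, min)

# Build the result; as.vector() drops the names so they do not become row names
res <- data.frame(v = names(foo), tt = as.vector(foo), row.names = NULL)

# Alternative (an addition, not from the thread): aggregate() with a formula
# returns a data frame directly
res2 <- aggregate(tt ~ v, data = d, FUN = min)
```

Printing `res` or `res2` should show one row per value of v together with its smallest tt.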

On Sat, May 17, 2008 at 10:56 PM, jim holtman <jholtman_at_gmail.com> wrote:

> Is this what you want:

>
> > v<-c(rep("v1",3), rep("v2",4), rep("v3",2),"v4",rep("v5",6))
> >
> > tt<-c(1,2,3,3,1,2,3,4,5,2,7,9,2,3,1,4)
> > d<-data.frame(v,tt)
> > do.call(rbind, lapply(split(d, d$v), function(x){
> +     x[which.min(x$tt),]
> + }))
>     v tt
> v1 v1  1
> v2 v2  1
> v3 v3  4
> v4 v4  2
> v5 v5  1
> >
>
> On Sat, May 17, 2008 at 3:48 PM, souvik banerjee <bansouvik_at_gmail.com> wrote:
>
> > Hi,
> > I am facing a problem in data manipulation. Suppose a data frame
> > contains two columns. The first column consists of some repeated
> > characters and the second consists of some numerical values. The
> > problem is to extract and create a new data frame consisting of, for
> > each unique character of the first column, the row with the minimum
> > second-column entry. For example, if "d" is the data frame created
> > with the following R code
> >
> > v<-c(rep("v1",3), rep("v2",4), rep("v3",2),"v4",rep("v5",6))
> >
> > tt<-c(1,2,3,3,1,2,3,4,5,2,7,9,2,3,1,4)
> > d<-data.frame(v,tt)
> >
> > then the answer would be
> >
> >  v tt
> > v1  1
> > v2  1
> > v3  4
> > v4  2
> > v5  1
> >
> > I have written a small piece of R code, given below, that does the job
> > (assuming "d" to be the initial data frame)
> >
> > b<-data.frame(NULL)
> > i<-1
> > x<-d[1,]
> > while(i<dim(d)[1])
> > {
> >   if(length(unique(x[,1]))==1)
> >   {
> >     x<-rbind(x,d[i+1,])
> >     i=i+1
> >   }
> >   if(length(unique(x[,1]))>1)
> >   {
> >     y<-x[1:(nrow(x)-1),]
> >     z<-which(y[,2]==min(y[,2]))
> >     b<-rbind(b,y[z,])
> >     x<-d[i,]
> >   }
> > }
> > z<-which(x[,2]==min(x[,2]))
> > b<-rbind(b,x[z,])
> > b
> >
> > The code works properly and gives me the desired result, but the
> > problem is that I have to repeat this procedure for many data frames,
> > and nearly all of the data frames contain approximately 15,000 repeated
> > characters with more than 12,500 unique characters. Using the above
> > code in a loop takes a considerable amount of time to compute.
> > Can anybody suggest a faster approach?
> >
> > Regards
> >
> > Souvik Bandyopadhyay
> > Research Fellow,
> > Dept Of Statistics
> > Calcutta University
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem you are trying to solve?
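
The loop in the original post grows data frames with rbind() on every pass, which is probably where much of the time goes with around 12,500 groups, though that has not been timed here. A vectorized sketch, added here rather than taken from the thread, is to sort by group and value and keep the first row of each group. The names `o`, `sorted`, `res` and `res_ties` are illustrative only. Unlike tapply(), this keeps whole rows, so it also works when the data frame has more than two columns.

```r
# Example data from the original post
v  <- c(rep("v1", 3), rep("v2", 4), rep("v3", 2), "v4", rep("v5", 6))
tt <- c(1, 2, 3, 3, 1, 2, 3, 4, 5, 2, 7, 9, 2, 3, 1, 4)
d  <- data.frame(v, tt)

# Sort by group, then by value within each group
o      <- order(d$v, d$tt)
sorted <- d[o, ]

# After sorting, the first row of each group holds that group's minimum;
# duplicated() flags every later row of the same group
res <- sorted[!duplicated(sorted$v), ]

# Variant that keeps ALL rows tied for the minimum, as the original
# which(... == min(...)) loop does: ave() repeats each group's minimum
# along the rows of d
res_ties <- d[d$tt == ave(d$tt, d$v, FUN = min), ]
```

With the example data there are no ties within a group, so `res` and `res_ties` should contain the same five rows. On data with ties, `res` keeps only the first tied row per group while `res_ties` keeps them all.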

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.2.0, at Sat 17 May 2008 - 21:30:56 GMT.
