Re: [R] Unique?

From: Francisco J. Zagmutt <gerifalte28_at_hotmail.com>
Date: Fri 12 May 2006 - 03:10:15 EST


Hi Cameron

You need to be more specific when you ask a question so you can get a better answer. Anyhow, when you say that you want to retain all the other variables, do you mean that you want to add a new column to the dataset that contains the calculated sum? If that is the case, you can use a construction like:

set.seed(1)
step4<-data.frame(TRIPID=rep(c(111,222,333),3),CONVUNIT=rpois(9,40))
result<-tapply(step4$CONVUNIT,INDEX=step4$TRIPID,FUN=sum)
step4[,"SUM"]<-result[match(step4[,"TRIPID"],names(result))]
step4
  TRIPID CONVUNIT SUM
1    111       36 122
2    222       48 121
3    333       48 129
4    111       42 122
5    222       30 121
6    333       43 129
7    111       44 122
8    222       43 121
9    333       38 129
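
As a possible shortcut, ave() can produce the same per-row group sum in one step. This is only a sketch on the same toy data, not a claim about how it behaves on the full data set; the cross-check at the end compares it with the tapply()/match() construction.

```r
# Same toy data as in the example above
set.seed(1)
step4 <- data.frame(TRIPID = rep(c(111, 222, 333), 3),
                    CONVUNIT = rpois(9, 40))

# ave() applies FUN within each TRIPID group and returns a vector
# aligned with the original rows, so no match() step is needed.
# FUN must be passed by name, otherwise it is taken as a grouping variable.
step4$SUM <- ave(step4$CONVUNIT, step4$TRIPID, FUN = sum)

# Cross-check against the tapply()/match() construction
result <- tapply(step4$CONVUNIT, INDEX = step4$TRIPID, FUN = sum)
stopifnot(all(step4$SUM == result[match(step4$TRIPID, names(result))]))
```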

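If instead the goal is the one in your first message (one row per TRIPID, with the other variables kept and CONVUNIT replaced by the group total), something along these lines might work. It is a sketch only: the small data frame is made up for illustration (SPL stands in for "all the other variables"), and it assumes the other columns of the first row of each TRIPID are the ones you want to keep. I have not tried it on 446,000 rows.

```r
# Hypothetical toy data: TRIPID and CONVUNIT follow the thread;
# SPL is a stand-in for the remaining columns
Step4 <- data.frame(
  SPL      = c("SP0073928", "SP0073928", "SP0052652", "SP0072357", "SP0072357"),
  TRIPID   = c("02163399054", "02163399054", "02163399057",
               "02163399059", "02163399059"),
  CONVUNIT = c(161, 8, 85, 15, 20),
  stringsAsFactors = FALSE
)

# Group totals computed in one pass; rowsum() returns a one-column
# matrix whose row names are the TRIPID values
totals <- rowsum(Step4$CONVUNIT, Step4$TRIPID)

# Keep the first row of each TRIPID, with all other columns as they are
Step5 <- Step4[!duplicated(Step4$TRIPID), ]

# Replace CONVUNIT with the matching group total, looked up by row name
Step5$CONVUNIT <- unname(totals[as.character(Step5$TRIPID), 1])
Step5
```

Unlike by() followed by do.call(rbind, ...), this does not build one small data frame per TRIPID, which is the part I would expect to be slow on a large data set.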

Cheers

Francisco

>From: "Guenther, Cameron" <Cameron.Guenther@MyFWC.com>
>To: "Francisco J. Zagmutt" <gerifalte28@hotmail.com>
>Subject: RE: [R] Unique?
>Date: Thu, 11 May 2006 12:08:31 -0400
>
>It is close but not quite what I want. I need to retain all of the
>other variables as well.
>
>
>Cameron Guenther, Ph.D.
>Associate Research Scientist
>FWC/FWRI, Marine Fisheries Research
>100 8th Avenue S.E.
>St. Petersburg, FL 33701
>(727)896-8626 Ext. 4305
>cameron.guenther@myfwc.com
>-----Original Message-----
>From: Francisco J. Zagmutt [mailto:gerifalte28@hotmail.com]
>Sent: Wednesday, May 10, 2006 6:06 PM
>To: Guenther, Cameron; r-help@stat.math.ethz.ch
>Subject: RE: [R] Unique?
>
>If you only care about the sum of CONVUNIT by each TRIPID then you can
>use tapply i.e.:
>
>step4<-data.frame(TRIPID=rep(c(111,222,333),3),CONVUNIT=rpois(9,40))
>result<-tapply(step4$CONVUNIT,INDEX=step4$TRIPID,FUN=sum)
>result
>111 222 333
>115 107 123
>
>Is this what you wanted to do? I can't think of anything faster than
>tapply for your problem.
>
>I hope this helps
>
>Francisco
>
>
>
>
> >From: "Guenther, Cameron" <Cameron.Guenther@MyFWC.com>
> >To: <r-help@stat.math.ethz.ch>
> >Subject: [R] Unique?
> >Date: Wed, 10 May 2006 17:02:33 -0400
> >
> >
> >Hello,
> >I have a sample data set that looks like:
> >
> >YEAR MONTH DAY CONTINUE SPL       TIMEFISH TIMEUNIT AREA COUNTY DEPTH DEPUNIT GEAR    TRIPID      CONVUNIT
> >1992 1     26  1        SP0073928 8        H        7    25     4     NA      1000000 02163399054 161
> >1992 1     26  1        SP0073928 8        H        7    25     4     NA      1000000 02163399054 8
> >1992 1     26  2        SP0004228 8        H        7    25     4     NA      1000000 02163399054 161
> >1992 1     26  2        SP0004228 8        H        7    25     4     NA      1000000 02163399054 8
> >1992 1     25  NA       SP0052652 8        H        7    25     4     NA      1000000 02163399057 85
> >1992 1     26  NA       SP0037940 8        H        7    25     4     NA      1000000 02163399058 70
> >1992 1     27  NA       SP0072357 8        H        7    25     4     NA      1000000 02163399059 15
> >1992 1     27  NA       SP0072357 8        H        7    25     4     NA      1000000 02163399059 20
> >1992 1     27  NA       SP0026324 8        H        7    25     4     NA      1000000 02163399060 8
> >1992 1     28  1        SP0072357 8        H        7    25     4     NA      1000000 02163399062 200
> >
> >
> >How can I use unique to extract only the rows that have repeated TRIPIDs,
> >that is, rows unique on TRIPID alone rather than on every variable? I then
> >want to condense the unique values by summing the CONVUNIT for each
> >unique value of TRIPID. I posted a similar question last week and
> >received a sufficient answer of how to do this without using unique.
> >The solution below worked just fine on this sample data set, but the
> >full data set has 446,000 rows of data, and my computer and R simply
> >cannot handle the following code on data this large.
> >
> >conds<-by(Step4,Step4$TRIPID,function(x)
> >replace(x[1,],"CONVUNIT",sum(x$CONVUNIT)))
> >Step5<-do.call(rbind,conds)
> >
> >Thank you,
> >
> >Cameron Guenther, Ph.D.
> >Associate Research Scientist
> >FWC/FWRI, Marine Fisheries Research
> >100 8th Avenue S.E.
> >St. Petersburg, FL 33701
> >(727)896-8626 Ext. 4305
> >cameron.guenther@myfwc.com
> >
> >______________________________________________
> >R-help@stat.math.ethz.ch mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide!
> >http://www.R-project.org/posting-guide.html
>
>



Received on Fri May 12 03:18:10 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Fri 12 May 2006 - 04:10:08 EST.
