Re: [R] How to delete rows based on replicate values in one column with some extra calculation

From: Yi <liuyi.feier_at_gmail.com>
Date: Tue, 29 Jun 2010 16:56:36 -0700

Great help. It works when the first and the second columns are ordered the same way. But aggregate does not work for the following case:

z=c('ab','ah','bc','ah','dv')
x=substr(z,start=1,stop=1)
y=substr(z,start=2,stop=2)
v1=5:9
v2=7:11
data=data.frame(x,y,z,v1,v2)
> data

  x y  z v1 v2
1 a b ab  5  7
2 a h ah  6  8
3 b c bc  7  9
4 a h ah  8 10
5 d v dv  9 11

## I want to aggregate with respect to z, summing v1 and v2 within each group. The expected output is:

  x y  z v1 v2
1 a b ab  5  7
2 a h ah 14 18
3 b c bc  7  9
4 d v dv  9 11

### I do this almost manually. As you see here:

newdata=aggregate(data$v1,by=list(data$z),sum)
newdata2=aggregate(data$v2,by=list(data$z),sum)
x=substr(newdata$Group.1,start=1,stop=1)
y=substr(newdata$Group.1,start=2,stop=2)
data.frame(x,y,newdata$Group.1,newdata$x,newdata2$x)
new=data.frame(x,y,newdata$Group.1,newdata$x,newdata2$x)
names(new)=c('x','y','z','v1','v2')
new

I do it this way because I do not think 'aggregate' can group by z and at the same time keep x and y for each z.
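
One possible simplification, offered as a sketch rather than a definitive answer: if x and y are fully determined by z (as they are here, being its first and second characters), all three columns can be passed to aggregate as grouping variables, so x and y are carried through while v1 and v2 are summed. The use of column subsetting for both the data and the `by` argument is an assumption about how one might arrange this, not something taken from the replies in this thread.

```r
# Rebuild the example data from the message above.
z <- c('ab', 'ah', 'bc', 'ah', 'dv')
data <- data.frame(x = substr(z, start = 1, stop = 1),
                   y = substr(z, start = 2, stop = 2),
                   z = z,
                   v1 = 5:9,
                   v2 = 7:11)

# Sum v1 and v2 within each distinct (x, y, z) combination.
# Because x and y depend only on z, this yields one row per z.
new <- aggregate(data[c("v1", "v2")],
                 by = data[c("x", "y", "z")],
                 FUN = sum)
new
```

Because `by` is given as a named list of columns, the grouping columns keep their names in the result, so no renaming step is needed afterwards.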

Any tips? My way seems too 'silly'.

Thanks all in advance!

Yi

On Mon, Jun 28, 2010 at 7:58 PM, Nikhil Kaza <nikhil.list_at_gmail.com> wrote:

>
> aggregate(data$third, by=list(data$first), sum)
>
> or
>
> require(reshape)
> cast(melt(data), ~first, sum)
>
>
>
> On Jun 28, 2010, at 9:30 PM, Yi wrote:
>
>
>> first=c('u','b','e','k','j','c','u','f','c','e')
>> second=c('usa','Brazil','England','Korea','Japan','China','usa','France','China','England')
>> third=1:10
>> data=data.frame(first,second,third)
>>
>
>




R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Wed 30 Jun 2010 - 02:26:23 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 30 Jun 2010 - 02:40:43 GMT.
