Re: [R] Does SQL group by have a heavy duty equivalent in R

From: Farrel Buchinsky <fjbuch_at_gmail.com>
Date: Sun 31 Dec 2006 - 21:16:55 GMT

I converted the whole data frame to character using as.matrix, and then followed a posting that explained how to restore the names that had been lost in the conversion to a matrix.

However, melt insisted on including every column that I did not list among the ids with the measured variables. In other words, it would not let me drop columns, despite this call:

melted<-melt(BigDF, id=c("SAMPLE_ID","ASSAY_ID"), measured=c("GENOTYPE_ID","DESCRIPTION"))

unique(melted$variable)
 [1] CUSTOMER PROJECT PLATE EXPERIMENT CHIP WELL_POSITION GENOTYPE_ID DESCRIPTION ENTRY_OPERATOR
[10] INTERACT PLATEc
Levels: CUSTOMER PROJECT PLATE EXPERIMENT CHIP WELL_POSITION GENOTYPE_ID DESCRIPTION ENTRY_OPERATOR INTERACT PLATEc

I should have got only GENOTYPE_ID and DESCRIPTION.
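
One workaround that does not depend on how melt treats unlisted columns is to drop the unwanted columns before reshaping. The sketch below uses only base R and a made-up toy data frame that borrows the column names from this post; it is an illustration under those assumptions, not the reshape package's own method. A possible cause of the behaviour above, which I have not verified against the reshape version in use, is that `measured=` is not the argument name melt expects and is therefore silently ignored.

```r
# Toy stand-in for BigDF: column names follow the post, values are invented.
BigDF <- data.frame(
  SAMPLE_ID   = c("s1", "s1", "s2"),
  ASSAY_ID    = c("a1", "a2", "a1"),
  GENOTYPE_ID = c("AA", "AG", "GG"),
  DESCRIPTION = c("ok", "ok", "low"),
  PLATE       = c("p1", "p1", "p2"),
  stringsAsFactors = FALSE
)

id_cols      <- c("SAMPLE_ID", "ASSAY_ID")
measure_cols <- c("GENOTYPE_ID", "DESCRIPTION")

# Keep only the id and measured columns, so nothing else can be melted.
small <- BigDF[, c(id_cols, measure_cols)]

# Base-R long format: one row per (id combination, measured variable).
melted <- do.call(rbind, lapply(measure_cols, function(v) {
  data.frame(small[id_cols], variable = v, value = small[[v]],
             stringsAsFactors = FALSE)
}))

unique(melted$variable)
```

The same subsetting step, `BigDF[, c(id_cols, measure_cols)]`, could also be applied before calling melt itself.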

"hadley wickham" <h.wickham@gmail.com> wrote in message news:f8e6ff050612310758p11f96c0dl256ac5b15d11dc2c@mail.gmail.com...
>> nr.attempts <- aggregate(RawSeq$GENOTYPE_ID, list(sample=RawSeq$SAMPLE_ID, assay=RawSeq$ASSAY_ID), length)
>> This was simply to figure out how many times the same piece of information
>> had been obtained. I ran out of patience. It took beyond forever and tapply
>> did not perform much better. The reshape package did not help - it implied
>> one was out of luck if the data was not numeric. All of my data is character
>> or factor.
>
> The reshape package will work if all your data is numeric, or all of
> it is character - it doesn't work with a mix. I will try and make
> this more clear in the documentation.
> However, depending on the size and structure of your data it may not
> be any faster than tapply or aggregate.
>
> Hadley
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
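
For the count of attempts per sample/assay pair discussed in the quoted message, one alternative to aggregate and tapply is to build a single key per pair and tabulate it. This is a minimal base-R sketch on invented data; `key`, `counts` and `first` are names introduced here for illustration, and I have not benchmarked it against the original call.

```r
# Toy stand-in for RawSeq: column names follow the post, values are invented.
RawSeq <- data.frame(
  SAMPLE_ID   = c("s1", "s1", "s1", "s2"),
  ASSAY_ID    = c("a1", "a1", "a2", "a1"),
  GENOTYPE_ID = c("AA", "AA", "AG", "GG"),
  stringsAsFactors = FALSE
)

# One string key per (sample, assay) pair; assumes no tab inside the ids.
key <- paste(RawSeq$SAMPLE_ID, RawSeq$ASSAY_ID, sep = "\t")

# Count how many rows share each key.
counts <- table(key)

# Report each pair once, in order of first appearance.
first <- !duplicated(key)
nr.attempts <- data.frame(
  sample = RawSeq$SAMPLE_ID[first],
  assay  = RawSeq$ASSAY_ID[first],
  x      = as.vector(counts[key[first]]),
  stringsAsFactors = FALSE
)

nr.attempts
```

Like `length` in the aggregate call, this counts rows, so rows with a missing GENOTYPE_ID are included in the count.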



Received on Mon Jan 01 08:20:55 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Mon 01 Jan 2007 - 01:30:24 GMT.
