Re: [R] data summerization etc...

From: ajay ohri <ohri2007_at_gmail.com>
Date: Sat, 12 Jul 2008 04:26:35 +0530

Hello,

Have you tried the Rattle GUI from www.rattle.togaware.com? It works pretty well for summarization.
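
The grouped sums and means asked about below can also be computed in base R without a GUI. What follows is only a minimal sketch, not code from the thread. It assumes fid is meant to be a discrete group identifier; the 1,000 integer IDs drawn with sample() are an illustrative assumption. In the quoted example fid comes from rnorm(), which gives essentially 70,000 distinct values, so every record ends up in its own group, and that may well contribute to the memory error.

```r
set.seed(1)                                   # reproducible illustration
n    <- 70000
fid  <- sample(1:1000, n, replace = TRUE)     # assumed: discrete group IDs
pti  <- rnorm(n, 10)
finc <- rnorm(n, 1000)

# Sum of pti within each fid group (named vector, one entry per group)
sum_pti <- tapply(pti, fid, sum)

# Mean of finc within each fid group (same group order as above)
mean_finc <- tapply(finc, fid, mean)

# Combine into one summary data frame, one row per group
summary_df <- data.frame(fid       = as.integer(names(sum_pti)),
                         sum_pti   = as.vector(sum_pti),
                         mean_finc = as.vector(mean_finc))
head(summary_df)
```

tapply() works on plain vectors and returns one value per group, so it avoids building the intermediate matrix that cbind() creates inside aggregate().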

Regards,

Ajay
www.decisionstats.com

On Sat, Jul 12, 2008 at 4:14 AM, sj <ssj1364_at_gmail.com> wrote:
>
> Hello,
>
> I am trying to do some fairly straightforward data summarization, i.e., the
> kind you would do with a pivot table in Excel or by using SQL queries. I
> have a moderately sized data set of ~70,000 records and I am trying to
> compute some group averages and sum values within groups. The code example
> below shows how I am trying to go about doing this:
>
> pti <-rnorm(70000,10)
> fid <- rnorm(70000,100)
> finc <- rnorm(70000,1000)
>
>
> ### compute the sums of pti within fid groups
> sum_pinc <-aggregate(cbind(fid,pti),list(fid),FUN=sum)
>
> #### compute mean finc within fid groups
> tot_finc <- aggregate(cbind(fid,finc),list(fid),FUN=mean)
>
> When I try to do it this way I get an error message telling me that enough
> memory cannot be allocated (I am using R 2.7.1 on Windows XP with 2 GB of
> memory). I figure that there must be a more efficient way to go about doing
> this. Please suggest.
>
> I would typically do this kind of task in a database and use SQL to push the
> data around. I know RODBC allows you to write SQL to query external DBs. Is
> there any mechanism that allows you to write SQL queries against datasets
> internal to R? E.g., in the case above
>
>
> I could do something like
>
> set <- cbind(fid,pti,finc)
>
> select fid, sum(pti)
> from set
> group by fid
>
> that would be handy!
>
> Thanks,
>
> Spencer
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
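
On the question of writing SQL against datasets internal to R: one possibility is the sqldf package from CRAN, which runs SQL statements against data frames. The sketch below is not from the thread and assumes sqldf is installed. The data frame name dat and the integer group IDs are illustrative assumptions; dat is used instead of "set" because SET is an SQL keyword.

```r
# Assumes the sqldf package is installed; it is not mentioned in the thread.
if (suppressWarnings(require("sqldf", quietly = TRUE))) {
  set.seed(1)
  # Hypothetical data: 1,000 discrete group IDs across 70,000 records
  dat <- data.frame(fid = sample(1:1000, 70000, replace = TRUE),
                    pti = rnorm(70000, 10))

  # sqldf() finds the data frame 'dat' by name and queries it like a table
  res <- sqldf("select fid, sum(pti) as sum_pti from dat group by fid")
  head(res)
}
```

The query mirrors the select / group by statement in the question, with the result returned as an ordinary data frame.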



Received on Fri 11 Jul 2008 - 23:01:15 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sat 12 Jul 2008 - 00:31:44 GMT.
