Re: [Rd] split.data.frame

From: Matthew Dowle <mdowle_at_mdowle.plus.com>
Date: Thu, 17 Dec 2009 12:39:27 +0000

This seems very similar to the data.table package.

The 'by' argument splits the data.table by that value then executes the j expression within each subset. The package documentation talks about 'subset' and 'with' in some detail. See ?"[.data.table".

dt = data.table(x=1:20, y=rep(1:4,each=5))
dt[,sum(x),by="y"]
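
For comparison, here is a rough base R sketch of the same grouped sum. It is only meant to illustrate the split-then-evaluate idea behind 'by'; it is not how data.table implements it, and the name 'df' is just an illustrative stand-in for the table above.

```r
# Same data as the data.table example, held in a plain data frame.
df <- data.frame(x = 1:20, y = rep(1:4, each = 5))

# Split x by the values of y, then evaluate sum() within each subset.
res <- sapply(split(df$x, df$y), sum)

print(res)
```

The group sums should come out as 15, 40, 65 and 90 for y = 1, 2, 3 and 4.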

> and x has a variable called grp, what do you get?
In data.table that choice is given to the user via the argument 'with', which by default is TRUE, meaning you get the x inside dt.
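
To make the capture question concrete, here is a minimal base R sketch. The helper 'split_with' is hypothetical (it exists neither in base R nor in data.table); it just evaluates the splitting expression inside the data frame first, the way with() and subset() do, so that the two possible answers can be seen side by side.

```r
# Hypothetical helper, for illustration only: look up the splitting
# expression in the data frame first, then in the caller's environment.
split_with <- function(df, f) {
  f_val <- eval(substitute(f), df, parent.frame())
  split(df, f_val)
}

df <- data.frame(val = 1:4, grp = c("a", "a", "b", "b"))

g <- function(x) {
  grp <- c("u", "v", "u", "v")  # computed splitting factor, local to g()
  list(
    standard = names(split(x, grp)),      # standard evaluation: local grp
    captured = names(split_with(x, grp))  # column x$grp shadows the local grp
  )
}

res <- g(df)
print(res)
```

With standard evaluation the groups are "u" and "v"; with the 'with'-style lookup the column wins and the groups are "a" and "b", which is the inadvertent capture being asked about.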

"Romain Francois" <romain.francois_at_dbmail.com> wrote in message news:4B288645.3010602_at_dbmail.com...
> On 12/16/2009 12:14 AM, Peter Dalgaard wrote:

>> Romain Francois wrote:
>>> Hello,
>>>
>>> I very much enjoy "with" and "subset" semantics for data frames and
>>> was wondering if we could have something similar with split, basically
>>> by evaluating the second argument "with" the data frame :
>>
>> I seem to recall that this idea was considered and rejected when the
>> current split.data.frame was written (10 years ago!). The main reasons
>> were that
>>
>> - it's not really THAT hard to evaluate a single splitting expression
>> using with() or eval()
>

> Sure, this is just about convenience and laziness.
>
>> - not all applications will have the splitting factor inside the df to
>> split ( split(df[-1], df[[1]]) for a simple case)
>

> this still works
>
>> - if you need a computed splitting factor, there's a risk of inadvertent
>> variable capture. I.e., if you inside a function do
>>
>> ....
>> grp <- ...whatever...
>> spl <- split(x, grp)
>> ....
>>
>> and x has a variable called grp, what do you get?
>

> this is a problem indeed.
>

> thanks for the reply.
>

> Romain
>

> --
> Romain Francois
> Professional R Enthusiast
> +33(0) 6 28 91 30 30
> http://romainfrancois.blog.free.fr
> |- http://tr.im/HlX9 : new package : bibtex
> |- http://tr.im/Gq7i : ohloh
> `- http://tr.im/FtUu : new package : highlight
>

R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Thu 17 Dec 2009 - 12:48:15 GMT
