Re: [Rd] Lightweight data frame class

From: Gabor Grothendieck <ggrothendieck_at_myway.com>
Date: Sat 27 Nov 2004 - 10:41:47 EST

Vadim Ogranovich <vograno <at> evafunds.com> writes:

:
: Don't know whether it will suffice. Lm() was just an example. Are you
: going to re-write lm(), e.g. lm.zoo(), to accept lists?

A previous unreleased version of zoo did hack lm, but the current test version interfaces to lm without making any changes to lm at all.
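
As a side note on lm and lists: this is not a description of how zoo interfaces to lm, only a minimal sketch showing that, in current versions of R, lm already evaluates the formula against a plain named list passed as data. The variable names dat and fit are made up for illustration.

```r
# A plain named list, not a data.frame (hypothetical example data).
dat <- list(x = 1:10, y = 1 + 2 * (1:10))

# lm evaluates the formula variables in the list.
fit <- lm(y ~ x, data = dat)
coef(fit)
```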

: I am more thinking of a general purpose class that would pass wherever
: data.frame is expected.

Yes, I figured so. The lightweight data frame idea seems neat, but I thought I would mention this as well in case it's germane.
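
For concreteness, a minimal sketch of what such a class might look like, assuming a list of equal-length columns with S3 methods for "[", dim and dimnames. The constructor name and all method bodies below are hypothetical and not taken from any existing package; "[<-" and as.data.frame methods are left out, and nothing here has been checked against every place a data.frame is expected.

```r
# Hypothetical constructor: a list of equal-length columns plus a class tag.
lightweight.data.frame <- function(...) {
  cols <- list(...)
  stopifnot(length(unique(lengths(cols))) <= 1)
  structure(cols, class = c("lightweight.data.frame", "list"))
}

# dim() is computed from the columns, so nrow() and ncol() work.
dim.lightweight.data.frame <- function(x) {
  c(if (length(x)) length(x[[1L]]) else 0L, length(x))
}

# Row names are generated on demand rather than stored, so rownames()
# returns "1", "2", ... without a row.names attribute.
dimnames.lightweight.data.frame <- function(x) {
  list(as.character(seq_len(nrow(x))), names(x))
}

# x[i, j] subsets rows and columns; x[j] with a single index selects
# columns, as for a list.
"[.lightweight.data.frame" <- function(x, i, j, ...) {
  cols <- unclass(x)
  if (nargs() < 3L) {
    if (!missing(i)) cols <- cols[i]
    return(structure(cols, class = class(x)))
  }
  if (!missing(j)) cols <- cols[j]
  if (!missing(i)) cols <- lapply(cols, function(col) col[i])
  structure(cols, class = class(x))
}

x <- lightweight.data.frame(a = 1:5, b = letters[1:5])
sub <- x[2:3, ]
dim(sub)
rownames(x)
```

Generating the row names in the dimnames method instead of storing them is the design choice that matters here: subsetting never has to carry or recompute a row.names attribute.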

:
: Probably I need to wait until the new version of zoo comes out. At the
: very least it could be a good prototype for what I have in mind.

If you want it before then, contact me offlist and I can send you the beta test version.

:
: Thanks for the info,
: Vadim
:
: > -----Original Message-----
: > From: r-devel-bounces <at> stat.math.ethz.ch
: > [mailto:r-devel-bounces <at> stat.math.ethz.ch] On Behalf Of Gabor
: > Grothendieck
: > Sent: Thursday, November 25, 2004 7:42 PM
: > To: r-devel <at> stat.math.ethz.ch
: > Subject: Re: [Rd] Lightweight data frame class
: >
: > Vadim Ogranovich <vograno <at> evafunds.com> writes:
: >
: > :
: > : Hi,
: > :
: > : As far as I can tell data.frame class adds two features to those of
: > : lists:
: > : * matrix structure via [,] and [,]<- operators (well, I know these are
: > : actually "["(i, j, ...), not "[,]").
: > : * row names attribute.
: > :
: > : It seems that the overhead of the support for the row names, both
: > : computational and RAM-wise, is rather non-trivial. I frequently
: > : subscript from a data.frame, i.e. use [,] on data frames, and my timing
: > : shows that the equivalent list operation is about 7 times faster, see
: > : below.
: > :
: > : On the other hand, at least in my usage pattern, I really rarely benefit
: > : from the row names attribute, so as far as I am concerned row names is
: > : just an overhead. (Of course the speed difference may be due to other
: > : factors, the only thing I can tell is that subscripting is very slow in
: > : data frames relative to in lists).
: > :
: > : I thought of writing a new class, say lightweight.data.frame, that would
: > : be polymorphic with the existing data.frame class. The class would
: > : inherit from "list" and implement [,], [,]<- operators. It would also
: > : implement the "rownames" function that would return seq(nrow(x)), etc.
: > : It should also implement as.data.frame to avoid the overhead of
: > : conversion to a full-blown data.frame in calls like lm(y ~ x,
: > : data=myLightweightDataframe).
: >
: > The next version of zoo (currently in
: > test) supports lists in the data argument of lm and can also
: > merge zoo series into a list (or to another zoo series, as it
: > does now).
: > Would that be a sufficient alternative?
: >
: > ______________________________________________
: > R-devel <at> stat.math.ethz.ch mailing list
: > https://stat.ethz.ch/mailman/listinfo/r-devel
: >

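The timing comparison referred to in the quoted message is not included in this thread. A rough sketch of how such a comparison could be run is shown here, assuming made-up data sizes and repetition counts; the names df, lst and idx are illustrative only, and the ratio observed will depend on the R version and the machine.

```r
set.seed(1)
n <- 1e5
df <- data.frame(a = rnorm(n), b = rnorm(n))
lst <- as.list(df)        # same columns, without the data.frame class
idx <- sample(n, 1000)    # rows to extract

# Row subscripting on the data frame.
df.time <- system.time(for (k in 1:200) df[idx, ])

# Equivalent operation on the list: subscript each column.
lst.time <- system.time(for (k in 1:200) lapply(lst, function(col) col[idx]))

# Elapsed seconds for each approach.
c(data.frame = df.time[["elapsed"]], list = lst.time[["elapsed"]])
```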


Received on Sat Nov 27 10:47:52 2004

This archive was generated by hypermail 2.1.8 : Sat 27 Nov 2004 - 11:12:02 EST