[Rd] interactive graphics for R: was Google Summer of Code 2009

From: Sklyar, Oleg (London) <osklyar_at_maninvestments.com>
Date: Thu, 19 Feb 2009 17:27:45 +0000

Dear Simon,

Thanks for the comments.

I had better give a bit of background first. We analyse time series of financial data, often multivariate and with, say, 200K samples. Quite frequently one needs to display a multivariate time series of, say, 200K rows and 10 columns over the whole time range and be able to zoom in to look for effects of interest. The obvious choice of plot is a multiplot window with a shared x-axis (time, in this case), where zooming is done simultaneously in all the displayed series.
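
To make the shared-axis idea concrete, here is a minimal static sketch in base R graphics. It is not the gtkdatabox prototype and it is not interactive: "zooming" here just means redrawing every panel with the same narrower window. The names plot_shared and zoom_window are hypothetical helpers invented for this illustration.

```r
# Hypothetical helper: draw each column of `values` in its own panel,
# with every panel restricted to the same time window `xlim`.
plot_shared <- function(time, values, xlim = range(time)) {
  keep <- time >= xlim[1] & time <= xlim[2]    # samples inside the window
  old <- par(mfrow = c(ncol(values), 1), mar = c(2, 4, 1, 1))
  on.exit(par(old))                            # restore the previous layout
  for (j in seq_len(ncol(values))) {
    plot(time[keep], values[keep, j], type = "l", xlim = xlim,
         xlab = "", ylab = colnames(values)[j])
  }
  invisible(sum(keep))                         # samples drawn per panel
}

# Hypothetical helper: shrink (factor < 1) or widen (factor > 1) a window
# around its centre. Works for numeric and POSIXct limits.
zoom_window <- function(xlim, factor) {
  half <- diff(as.numeric(xlim)) * factor / 2
  mean(xlim) + c(-half, half)
}
```

Calling plot_shared(time, values, xlim = zoom_window(range(time), 0.1)) redraws all panels on the central 10% of the range. A real interactive implementation would trigger the same redraw from mouse events instead of explicit calls.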

I do understand this is a very specific example, but I am sure similar problems arise in other disciplines: think of a genomic browser, sequencing, or any other non-financial time series data.

Essentially, whatever graphing or rendering technology is used underneath (GTK, Qt or anything else), my requirements (admittedly somewhat subjective, but also quite generic) would be the possibility to produce multiplot windows (similar to, say, setting mfrow in par) with two simple features: zooming and panning, either simultaneously on all plots or independently. Support for Axis/pretty method callbacks is required because those are the methods that provide correct axis labelling regardless of the class of the data. This is essentially the only thing that is not supported by the gtkdatabox widget, as its rulers can only display numbers.
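
As a rough illustration of why class-aware callbacks matter, the sketch below recomputes ticks for a new visible window by calling the pretty() generic, so the labels follow the class of the data. It assumes a current R in which pretty() has Date and POSIXt methods that attach a "labels" attribute. The function name ticks_for_window is hypothetical.

```r
# Hypothetical rescale callback: given the new visible window `xlim`,
# return tick positions and labels chosen by the class-specific
# pretty() method (numeric, Date, POSIXct, or a user-defined class).
ticks_for_window <- function(xlim, n = 5) {
  at <- pretty(xlim, n = n)
  labels <- attr(at, "labels")          # set by the Date/POSIXt methods
  if (is.null(labels)) labels <- format(at)
  list(at = at, labels = labels)
}

# After a zoom or pan, the x-axis of a plot drawn with xaxt = "n" could
# then be refreshed with something like:
#   tk <- ticks_for_window(new_xlim)
#   axis(1, at = tk$at, labels = tk$labels)
```

A widget whose rulers only accept numbers cannot make use of the labels component, which is the limitation described above.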

On the other issues of interactivity, I agree it is quite a broad term, but the functionality I describe above is pretty much basic.

As for the Java objections: it is not that Java is slow on its own, but that the interface is not native, requires a huge JVM for a fairly simple task, and is relatively slow and cumbersome. As soon as I see a package demonstrating good performance via rJava, I will be happy to say I was wrong. Essentially the same problem applies to the 'playwith' package mentioned earlier: it uses RGtk and gWidgets and is therefore slow. It is not GTK that is slow, but the complex binding from R via RGtk to GTK. Used natively, GTK is very fast.

> As for iPlots, the development has shifted a while ago from the 'old'
> iPlots to the new ones which are in development stage (as I said they
> are announced for the useR! conference). My point was not about
> telling you to use a specific software, it was rather about making you
> aware of the fact that what you describe already exists (ggobi
> definitely is IG in GTK) and/or is worked on (iPlots 3.0) with
> possibly better approach.

Where can I find it to have a look? Even if it is still in development, if it fits our needs I will be only too happy to contribute what I can.

>
> > 3) I have a prototype using gtkdatabox for very fast interactive
> > plots in R using GTK, but it is limited by the capabilities of the
> > gtkdatabox widget, not that of R or GTK as such.
> >
>
> I don't know about your prototype, so I cannot really comment on that,
> but gtkdatabox is not IG, either.
>

I cannot send you an example of an R package using gtkdatabox from the office, but I will create a small demo package at home and send it to you separately to indicate what I am looking into. Possibly it is not IG, but it is essentially what I described above, although quite primitive (it was a one-day project for me, not a three-month one).

>
> > I do think there is a need for an interactive graphics package for R.
> >
>
> I do completely agree with that, but interactive means it satisfies
> basic requirements on IG such as the availability of selection,
> highlighting, queries, interactive change of parameters etc. This is
> not about 2d/3d clouds at all - that we have for decades already. Also
> this is not about "hacks" to glue on interactivity to existing
> graphics systems with a chewing gum. We need a versatile (possible
> extensible) set of interactive statistical plots -- at least that's
> what our experience shows.

Agree completely.

>
> Cheers,
> Simon
>
>
> >
> >> -----Original Message-----
> >> From: Simon Urbanek [mailto:simon.urbanek_at_r-project.org]
> >> Sent: 19 February 2009 14:34
> >> To: Sklyar, Oleg (London)
> >> Cc: Friedrich Leisch; r-devel_at_r-project.org;
> >> Manuel.Eugster_at_stat.uni-muenchen.de
> >> Subject: Re: [Rd] Google Summer of Code 2009
> >>
> >>
> >> On Feb 19, 2009, at 6:38 , Sklyar, Oleg (London) wrote:
> >>
> >>> Two ideas:
> >>>
> >>> 1) A library for interactive plots in R
> >>>
> >>> R lacks functionality that would allow displaying of interactive
> >>> plots with two distinct functionalities: zooming and panning. This
> >>> functionality is extremely important for the analysis of large, high
> >>> frequency, data sets spanning over large ranges (in time as well).
> >>> The functionality should acknowledge Axis methods in callbacks on
> >>> rescale (so that it could be extended to user-specific classes for
> >>> axis generation) and should have a native C interface to R (i.e. no
> >>> Java, but such cross platform widgets like GTK or QT or anything
> >>> similar that does not require heavy-weight add-ons). GTK has been
> >>> used successfully from within R in many applications (RGtk, rgobby,
> >>> EBImage etc) on both *nix and Windows, and thus could be a
> >>> preferential option, it is also extremely easy to integrate into R.
> >>> The existing tools (e.g. iplots) are slow, unstable and lack support
> >>> for time/date plots (or actually any non-standard axes) and they are
> >>> all Java. We are looking into standard xy-plots as well as image and
> >>> 3D plots. Obviously one can think of further interactivity, but this
> >>> would be too much for the Summer of Code project. A good prototype
> >>> would already be a step forward.
> >>>
> >>
> >> If primitive 3d scatterplot interactivity is all you want, go with
> >> rggobi. It's GTK and has all this already and much more. However,
> >> ggobi also shows why GTK is not a good choice for general interactive
> >> graphics toolkit - it [GTK] is slow and lacks reasonable graphics
> >> support. OpenGL is IMHO a better way to go since IG don't really
> >> leverage any of the widgets (you get them for free via R widgets
> >> packages anyway) and OpenGL gives you excellent speed, alpha-support
> >> and anti-aliasing etc.
> >>
> >> As you can imagine I don't agree with most of your statements above
> >> and I'm happy to discuss them in a separate thread. Just as an aside
> >> iPlots 3.0 (announced for useR!/DSC) are no longer Java based and have
> >> a native C interface.
> >>
> >> Cheers,
> >> S
> >>
> >>
> >>> 2) Cross platform GUI debugger, preferably further Eclipse
> >>> integration (beyond StatET capabilities)
> >>>
> >>> Tibco has recently released the S+ workbench for eclipse which has a
> >>> reasonable support for non-command line debugging. In the R
> >>> community, the StatET eclipse plugin mimics a lot of code
> >>> development functionality of S+ workbench, but has poor support for
> >>> in-line execution of R sessions in eclipse and does not have
> >>> debugging capabilities. Supporting this project further, or
> >>> developing a GUI debugger independent of eclipse, are both
> >>> acceptable options. The debugger should allow breakpoints, variable
> >>> views etc.
> >>>
> >>> For both of the above, our interest is mostly on the Linux side, but
> >>> one should look into cross-platform solutions.
> >>>
> >>> Regards,
> >>> Oleg
> >>>
> >>> Dr Oleg Sklyar
> >>> Research Technologist
> >>> AHL / Man Investments Ltd
> >>> +44 (0)20 7144 3107
> >>> osklyar_at_maninvestments.com
> >>>
> >>>> -----Original Message-----
> >>>> From: r-devel-bounces_at_r-project.org
> >>>> [mailto:r-devel-bounces_at_r-project.org] On Behalf Of Friedrich Leisch
> >>>> Sent: 18 February 2009 22:54
> >>>> To: r-devel_at_r-project.org
> >>>> Cc: Manuel.Eugster_at_stat.uni-muenchen.de
> >>>> Subject: [Rd] Google Summer of Code 2009
> >>>>
> >>>>
> >>>> Hi,
> >>>>
> >>>> in approximately one month's time mentoring institutions can propose
> >>>> projects for the Google Summer of Code 2009, see
> >>>>
> >>>> http://code.google.com/soc/
> >>>>
> >>>> Last year the R Foundation successfully participated with 4 projects,
> >>>> see http://www.r-project.org/SoC08/ for details. We want to
> >>>> participate again this year. Our project proposals will be managed by
> >>>> Manuel Eugster (email address in CC). Manuel is one of my PhD students
> >>>> and mentored the Roxygen project last year. This mail is mainly
> >>>> intended to make you aware of the program, Manuel will send a followup
> >>>> email with more technical details in the next days.
> >>>>
> >>>> In this phase we are looking for potential mentors who can offer
> >>>> interesting projects to students. I don't think that we will get much
> >>>> more than 4-6 projects, so don't be disappointed if you propose
> >>>> something and don't get selected.
> >>>>
> >>>> There are two selection steps involved: (a) The R Foundation has to
> >>>> compile an official "ideas list" of projects, for which students can
> >>>> apply. Last year we had 8 of those. After that, we (b) get a certain
> >>>> number of slots from Google (4 last year) and all prospective project
> >>>> mentors can vote on which projects actually get funding.
> >>>>
> >>>> Currently we are looking for good ideas for phase (a). I give no
> >>>> guarantees that all ideas will get on our official ideas list, what we
> >>>> pick depends on the number of submissions and topics, respectively. We
> >>>> want to make sure to have a broad range of themes, it is unlikely,
> >>>> that we will, e.g., pick 10 database projects. Also keep in mind that
> >>>> students have only three months time. This is not a research exercise
> >>>> for the students, you should have a rough idea what needs to be done.
> >>>>
> >>>> Last year we had a majority of "infrastructure projects", and only few
> >>>> with focus on statistical algorithms. We got a lot of applications for
> >>>> the latter, so don't hesitate to formulate projects in that
> >>>> direction. Important infrastructure may get precedence over
> >>>> specialized algorithms, though, because the whole community can benefit
> >>>> from those. But that will be a decision in phase (b), and we are not
> >>>> there yet.
> >>>>
> >>>> Please don't send any ideas to me right now, wait for the above
> >>>> mentioned email by Manuel on the technical details for idea
> >>>> submission.
> >>>>
> >>>> Best,
> >>>> Fritz
> >>>>
> >>>> --
> >>>> -----------------------------------------------------------------------
> >>>> Prof. Dr. Friedrich Leisch
> >>>>
> >>>> Institut für Statistik                  Tel: (+49 89) 2180 3165
> >>>> Ludwig-Maximilians-Universität          Fax: (+49 89) 2180 5308
> >>>> Ludwigstraße 33
> >>>> D-80539 München
> >>>> http://www.statistik.lmu.de/~leisch
> >>>> -----------------------------------------------------------------------
> >>>> Journal Computational Statistics --- http://www.springer.com/180
> >>>> Münchner R Kurse --- http://www.statistik.lmu.de/R
> >>>>
> >>>> ______________________________________________
> >>>> R-devel_at_r-project.org mailing list
> >>>> https://stat.ethz.ch/mailman/listinfo/r-devel
> >>>>
> >>>
> >>>
> >>
> >>>
> >>>
> >>
> >>
> >
> >
> >
>
>



Received on Thu 19 Feb 2009 - 16:31:01 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sat 21 Feb 2009 - 02:30:34 GMT.
