Re: [Rd] Creating XML document extremely slow

From: Milan Bouchet-Valat <nalimilan_at_club.fr>
Date: Fri, 10 Feb 2012 18:43:05 +0100

On Friday 10 February 2012 at 17:36 +0100, Titus von der Malsburg wrote:
> On Fri, Feb 10, 2012 at 2:10 PM, Milan Bouchet-Valat <nalimilan@club.fr> wrote:
> > On Friday 10 February 2012 at 13:18 +0100, Titus von der Malsburg
> > wrote:
> > Just a guess, but I'd try creating all 'Marker' nodes first, storing
> > them in a 'markers' list, and then calling addChildren(markernode,
> > kids=markers).
>
> A good guess. I changed the code according to your suggestion and it
> reduced the processing time from ~25 to ~3 seconds. Better but still
> ridiculously slow. When I generate the same XML document by
> concatenating pre-fabricated strings, as suggested by Friedrich, the
> whole process takes just 10 ms according to system.time.
Doesn't sound so bad to me. I don't think you'll find a use case where 3s will really be a problem.

From what Rprof() says, xmlNode() doesn't seem to do anything obviously
wrong. It's just that you're calling it 500 times, so there's some
overhead. You'd need a vectorized version that would handle all the data
in one go, i.e. create all the children from the values of x and y, and
then add them to their respective parents, in one function call.
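
To make the "create everything first, attach once" idea concrete, here is a rough sketch. I don't know the exact shape of your document, so the data, the 'Markers' parent name and the choice of storing x and y as attributes are assumptions of mine; only xmlNode() and addChildren() come from the XML package.

```r
library(XML)  # assumes the XML package is installed

# Hypothetical data: 500 markers with x and y values.
x <- seq_len(500)
y <- rev(x)

# Build all 'Marker' nodes first, without touching the parent.
# Storing x and y as attributes is an assumption about the layout.
markers <- Map(
  function(xi, yi) {
    xmlNode("Marker", attrs = c(x = as.character(xi), y = as.character(yi)))
  },
  x, y
)

# Attach all the children to the parent in a single call.
markernode <- addChildren(xmlNode("Markers"), kids = markers)
```

This still calls xmlNode() once per marker, so it only removes the cost of growing the parent node by node; it is not the truly vectorized version described above.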

Actually, if you look at what xmlNode() does, you'll see that it can easily be re-implemented to do this. Though, since your data is simple, it might be as easy to write the output by hand as you said. The real benefit of libxml2 appears when you create complex documents, and even more when you need to parse them: the hardest part to implement is error checking and finding nodes in a structure you don't perfectly know in advance.
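
For the "by hand" route, something along these lines would probably do. It is only a sketch in base R: the element and attribute names are made up, and escape_xml() is a hypothetical helper that covers just the basic escapes you would otherwise get for free from libxml2.

```r
# Hypothetical data standing in for the real x and y values.
x <- c(1, 2, 3)
y <- c(10, 20, 30)

# Hypothetical helper: escape the characters that are special in XML
# attribute values. "&" must be replaced first.
escape_xml <- function(s) {
  s <- as.character(s)
  s <- gsub("&", "&amp;", s, fixed = TRUE)
  s <- gsub("<", "&lt;", s, fixed = TRUE)
  s <- gsub(">", "&gt;", s, fixed = TRUE)
  gsub('"', "&quot;", s, fixed = TRUE)
}

# sprintf() is vectorized, so all 'Marker' lines are built in one call.
marker_lines <- sprintf('  <Marker x="%s" y="%s"/>',
                        escape_xml(x), escape_xml(y))

doc <- c('<?xml version="1.0"?>', "<Markers>", marker_lines, "</Markers>")

# To write the result to a file (the file name is just an example):
# writeLines(doc, "markers.xml")
```

The catch is that nothing here checks that the result is well-formed, which is exactly the kind of error checking you give up by not going through libxml2.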

Cheers



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel Received on Fri 10 Feb 2012 - 17:53:04 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 10 Feb 2012 - 21:10:16 GMT.
