Re: [Rd] XML parsing under R / Extracting nodes’ values

From: Duncan Temple Lang <duncan_at_wald.ucdavis.edu>
Date: Tue, 15 May 2007 07:08:30 -0700

You can use getNodeSet() as Hin-Tak suggests, but you will need to call it once for each of the target nodes, so use sapply() to loop over them.
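A minimal sketch of that loop, assuming the XML package is installed. The sample string x and the names targets and vals are made up for illustration; they are not from the original question.

```r
library(XML)

# Hypothetical sample document, modelled on the nodes in the question.
x <- "<doc><nbRelations>2</nbRelations><nbActors>2</nbActors><nbRuns>5</nbRuns><nbStep>2000</nbStep></doc>"

# getNodeSet() needs an internal (C-level) document, hence useInternalNodes.
doc <- xmlTreeParse(x, asText = TRUE, useInternalNodes = TRUE)

# The node names we want to pull out.
targets <- c("nbRelations", "nbActors", "nbRuns", "nbStep")

# One XPath query per target; take the first match and convert its text.
vals <- sapply(targets, function(name) {
  nodes <- getNodeSet(doc, paste0("//", name))
  as.numeric(xmlValue(nodes[[1]]))
})
```

Because sapply() is given a character vector, vals comes back named by the target node names. This assumes each target occurs at least once in the document; nodes[[1]] fails if a query returns nothing.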

However, if these nodes are all children of the same XML node, you can get the values as

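library(XML)

# This is the document content.
z = "<doc><nbRelations>2</nbRelations>
<nbActors>2</nbActors>
<nbRuns>5</nbRuns>
<nbStep>2000</nbStep></doc>"

# Parse the document; asText = TRUE says z holds XML text, not a file name.
d = xmlRoot(xmlTreeParse(z, asText = TRUE))

# Convert each child's text to a number; the result keeps the node names.
xmlSApply(d, function(node) as.numeric(xmlValue(node)))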

and now you have a vector with named elements corresponding to nbRelations, etc.

If these are children of a sub-node in the tree, then you have to fetch that node first. Hopefully you can get at it easily by subsetting the document. (Otherwise, you can do that with getNodeSet(). But getNodeSet() only works with internal documents, so you need useInternalNodes = TRUE in the call to xmlTreeParse().)
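As a rough illustration of the subsetting route, here is a sketch that assumes the values sit under a sub-node called params. That element name, the sample string z2, and the variable names are hypothetical, chosen only for this example.

```r
library(XML)

# Hypothetical document: the parameters live under a <params> sub-node.
z2 <- "<sim><title>demo</title><params><nbRelations>2</nbRelations><nbActors>2</nbActors></params></sim>"

root <- xmlRoot(xmlTreeParse(z2, asText = TRUE))

# Subset the root by child name to reach the sub-node ...
params <- root[["params"]]

# ... then apply over its children, converting each value to a number.
vals2 <- xmlSApply(params, function(node) as.numeric(xmlValue(node)))
```

Here vals2 is a named numeric vector with one element per child of the params node. For deeper nesting the subsetting can be chained, e.g. root[["a"]][["b"]], again with made-up names.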

I suggest that you don't assign these to regular, top-level variables but access the values from the vector. But if you really need to assign them to individual variables,

xmlSApply(d, function(node)
                assign(xmlName(node), xmlValue(node), globalenv()))

will do the trick.
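To make the two options concrete, here is a small sketch. The shortened sample document and the names params and env are assumptions for illustration, and the use of a separate environment instead of globalenv() is a variation, not something the original code does.

```r
library(XML)

# Hypothetical, shortened sample document.
z <- "<doc><nbRelations>2</nbRelations><nbRuns>5</nbRuns></doc>"
d <- xmlRoot(xmlTreeParse(z, asText = TRUE))

# Option 1: keep everything in one named vector and index it by name.
params <- xmlSApply(d, function(node) as.numeric(xmlValue(node)))
params[["nbRuns"]]

# Option 2: one variable per node, but in a dedicated environment
# so the global workspace is not cluttered or overwritten.
env <- new.env()
invisible(xmlSApply(d, function(node)
  assign(xmlName(node), as.numeric(xmlValue(node)), envir = env)))
get("nbRuns", envir = env)
```

Both versions convert the text to numbers with as.numeric(), since xmlValue() returns character strings.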

Hin-Tak Leung wrote:

> - you should have posted to either R-help or (more appropriately) to
> the omega-help list.
> 
> That said, you need something like this:
> 
> root.node <- xmlTreeParse(x, useInternalNodes = TRUE)
> nbrelation.set <- getNodeSet(root.node, "//nbRelations")
> nbrelation.list <- sapply(nbrelation.set, function(x) { xmlValue(x) } )
> 
> and nbrelation.list now contains the "2" in
> nbRelations as text - you may want to do as.numeric() on it as well.
> 
> Abdelhakim z wrote:

>> Hi,
>> I have an XML file which contains among other nodes :
>>
>> ===myXMLfile.xml===
>> (…)
>> <nbRelations>2</nbRelations>
>> <nbActors>2</nbActors>
>> (...)
>> <nbRuns>5</nbRuns>
>> <nbStep>2000</nbStep>
>> (…)
>> ===End file===
>> I need to extract those values and to make them R variables such as:
>> nbRelations = 2
>> nbActors = 2
>>
>> nbRuns = 5
>> nbSteps = 2000
>>
>> I read the help and have seen the examples of the xml package, it
>> seems that I need to use xmlTreeParse() function but I don't know how
>> exactly as I'm not an R advanced programmer, please can anyone show me
>> how to do that explicitly ?
>>
>> Any help would be much appreciated
>>
>> Thanks,
>>
>> Abdel
>> University of Boumerdès
>> Algeria
>>

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Tue 15 May 2007 - 14:42:05 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 15 May 2007 - 16:33:45 GMT.