Re: [R] Relational Databases or XML?

From: Doran, Harold <>
Date: Thu, 10 Apr 2008 16:28:41 -0400

I'm not sure parsing an XML file directly in R is the way to go. It's certainly *possible* (the XML package you mention is meant for that), but it may not be the best way to do it. ElementTree in Python is an easy-to-use parser that you might use to first parse your XML file (or other hierarchically structured data), organize it any way you want, and then bring those data into R for subsequent analysis.

In fact, I have recently done just this. I have another statistical program that outputs data as an XML file. So, I wrote a Python program that parses that XML file and writes the data of interest to a text file, and then I bring those data into R for analysis.
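
A minimal sketch of that kind of workflow, for illustration only. The XML layout (a `results` root with `record` elements carrying `subj` and `block` attributes and an `obs` child) is an assumption made up for this example, not the format my other program actually produces, and the helper names `xml_to_rows` and `rows_to_tsv` are hypothetical. The column names mirror the toy data frame in the question below.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical XML layout, invented for this sketch; adapt the tag and
# attribute names to whatever your own file actually contains.
SAMPLE_XML = """\
<results>
  <record subj="1" block="-1"><obs>0.25</obs></record>
  <record subj="1" block="1"><obs>-1.10</obs></record>
  <record subj="2" block="-1"><obs>0.73</obs></record>
</results>
"""


def xml_to_rows(xml_text):
    """Return one (subj, block, obs) tuple of strings per <record> element."""
    root = ET.fromstring(xml_text)
    rows = []
    for rec in root.iter("record"):
        # Attributes come from rec.get(); child element text from findtext().
        rows.append((rec.get("subj"), rec.get("block"), rec.findtext("obs")))
    return rows


def rows_to_tsv(rows):
    """Serialize the rows as tab-delimited text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["Subj", "block", "obs"])
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # Print the table; redirect stdout to a file to get the text file for R.
    print(rows_to_tsv(xml_to_rows(SAMPLE_XML)), end="")
```

If the output is redirected to a file, say junk.txt, then read.delim("junk.txt") in R should bring it in as a data frame. Everything is kept as strings on the Python side, so the type conversion is left to R.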

> -----Original Message-----
> From: [] On Behalf Of Keith Alan Chamberlain
> Sent: Thursday, April 10, 2008 4:14 PM
> To:
> Subject: [R] Relational Databases or XML?
> Dear R-Help,
> I am working on a paper in an R course for large file support
> in R using scan(), relational databases, and XML. I have
> never used SQL or hierarchical document formats such as XML
> (except where it occurs without user interaction), and
> knowledge in RDBs and XML is lacking in my program. I have
> tried finding a working example for the novice's novice on the
> topic, read many postings, the R Data Import/Export manual several
> times, and descriptions of packages RODBC, DBI, XML, among
> others. I understand that RDBs are (assumed at least) used
> widely among the R community. I have not been able to put all
> of the pieces together, but assuming that RDB use is actually
> quite widespread, it should be quite easy to fill me in
> and/or correct my understanding where necessary.
> For a cross-platform solution (PC/OSX at least, or in part)
> my questions/problems are about what preliminary steps are
> needed to get an SQL or XML query "to work" in R to begin
> with, what the appropriate data-file formats are, and how to
> convert to them if starting out with data in, say, a
> delimited ASCII text file. Very basic examples should
> suffice, say, a table with 20 random observations, a grouping
> variable with 2 levels, and a factor with 2 levels.
> ## untested code
> set.seed(1024)
> junk <- data.frame(Subj=c(rep(1,10),rep(2,10)),
>                    block=rep(c(rep(-1,5),rep(1,5)),2),
>                    obs=rnorm(20,0,1))
> ## write.table() takes the data first, then the file name
> write.table(junk, "junk.txt")
> Specifically,
> 1- what are the minimum required non-R components that are
> needed to support SQL or XML functionality, which may or may
> not need to be installed?
> 2- what R packages need to be installed, at a minimum (also
> as a cross-PC/Mac solution if possible or at least as much as
> possible)
> 3- I keep seeing references to connections of a given name "if
> previously setup". What kind of setup is needed outside of R, if any?
> 4- what steps are needed in R to then connect to a file and
> import a subset based on a query?
> 5- Do I then use standard R routines (e.g. write()) to export
> as a DB, or an RDB/XML specific function?
> Sincerely,
> KeithC. [U.S]
> 1/k^c
> ______________________________________________
> mailing list
> PLEASE do read the posting guide
> and provide commented, minimal, self-contained, reproducible code.
Received on Thu 10 Apr 2008 - 20:31:00 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 10 Apr 2008 - 22:30:28 GMT.
