Re: [R] Tools For Preparing Data For Analysis

From: Barry Rowlingson
Date: Mon, 11 Jun 2007 12:21:29 +0100

Chris Evans wrote:

> Thanks Ted, great thread and I'm impressed with EpiData that I've
> discovered through this. I'd still like something that is even more
> integrated with R but maybe some day, if EpiData go fully open source as
> I think they are doing ("A full conversion plan to secure this and
> convert the software to open-source has been made (See complete
> description of license and principles)." at but
> the link to doesn't exactly clarify this
> I don't think. But I can hope.)
> Thanks, yet again, to everyone who creates and contributes to the R
> system and this list: wonderful!

  Perhaps what we need is an XML standard for describing record-oriented data and its validation? This could then be used to validate a set of records and possibly also to build input forms with built-in validation for new records.

  You could then write R code that did 'check this data frame against this XML description and tell me the invalid rows'. Or Python code.
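
  As a rough sketch of that idea in Python (not an existing standard: the <record>/<field> layout, the attribute names and the function names below are all invented for illustration, and a list of dicts stands in for a data frame), the check might look something like this:

```python
import xml.etree.ElementTree as ET

# Hypothetical record description. The element and attribute names are
# assumptions made up for this sketch, not part of any real standard.
DESCRIPTION = """
<record name="patient">
  <field name="id" type="int" required="true"/>
  <field name="age" type="int" min="0" max="120"/>
  <field name="sex" type="str" allowed="M,F"/>
</record>
"""

# Converters for the (assumed) type names used in the description.
CASTS = {"int": int, "float": float, "str": str}


def check_value(field, value):
    """Return an error message for one value, or None if it is valid."""
    name = field.get("name")
    if value is None or value == "":
        # Empty values are only an error for required fields.
        return f"{name}: missing" if field.get("required") == "true" else None
    type_name = field.get("type", "str")
    try:
        value = CASTS[type_name](value)
    except ValueError:
        return f"{name}: not a valid {type_name}"
    # Range checks are only meaningful for numeric fields.
    if field.get("min") is not None and value < float(field.get("min")):
        return f"{name}: below minimum"
    if field.get("max") is not None and value > float(field.get("max")):
        return f"{name}: above maximum"
    allowed = field.get("allowed")
    if allowed is not None and str(value) not in allowed.split(","):
        return f"{name}: not one of {allowed}"
    return None


def invalid_rows(description_xml, rows):
    """Map the index of each invalid row to its list of error messages."""
    fields = ET.fromstring(description_xml).findall("field")
    problems = {}
    for i, row in enumerate(rows):
        errors = []
        for field in fields:
            error = check_value(field, row.get(field.get("name")))
            if error is not None:
                errors.append(error)
        if errors:
            problems[i] = errors
    return problems


rows = [
    {"id": "1", "age": "34", "sex": "F"},
    {"id": "2", "age": "-5", "sex": "M"},
    {"id": "", "age": "abc", "sex": "X"},
]
print(invalid_rows(DESCRIPTION, rows))
```

  The same description could in principle drive a form builder as well, since each field carries its own type and constraints; that part is not sketched here.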

  This is the kind of thing that is traditionally built using a database front-end, but keeping the description in XML means that alternate interfaces (web forms, standalone programs using Qt or GTK libraries) can be used on the same description set.

  I had a quick search to see whether this kind of thing already exists, but Google searches for 'data entry verification' mostly suggest that I should really pay some people in India to do that kind of thing for me...

Barry

Received on Mon 11 Jun 2007 - 11:23:31 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 14 Jun 2007 - 12:31:57 GMT.
