Re: [R] How to extract following data

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Wed, 05 Nov 2008 06:22:40 -0500

As others have pointed out, it's close to XML but not quite there; however, you could use strapply in the gsubfn package to extract the data. It pulls out the substrings matching the regular expression, giving a character vector, vec, of the form: date price date price ... Pulling out the odd and even elements separately and converting them to Date and numeric, respectively, gives the resulting data.frame.

See
http://gsubfn.googlecode.com
for more on the gsubfn package and
the three vignettes in the zoo package for more on zoo.

Lines <- '- <Temp diffgr:id="Temp14" msdata:rowOrder="13">

 <Date>2005-01-17T00:00:00+05:30</Date>
 <SecurityID>10149</SecurityID>
 <PriceClose>1288.40002</PriceClose>
 </Temp>

- <Temp diffgr:id="Temp15" msdata:rowOrder="14">
 <Date>2005-01-18T00:00:00+05:30</Date>
 <SecurityID>10149</SecurityID>
 <PriceClose>1291.69995</PriceClose>
 </Temp>

- <Temp diffgr:id="Temp16" msdata:rowOrder="15">
 <Date>2005-01-19T00:00:00+05:30</Date>
 <SecurityID>10149</SecurityID>
 <PriceClose>1288.19995</PriceClose>
 </Temp>'

library(gsubfn)
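# extract each date (yyyy-mm-dd) and each decimal number, in order of appearance
vec <- strapply(Lines, "....-..-..|[0-9]+[.][0-9]+")[[1]]
# TRUE at odd positions (dates), FALSE at even positions (prices)
ix <- seq_along(vec) %% 2 == 1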
DF <- data.frame(date = as.Date(vec[ix]), price = as.numeric(vec[!ix]))

# or, instead of the last line, you could convert it to a zoo object so
# that it's in a more convenient form for time series manipulation:

library(zoo)
z <- zoo(as.numeric(vec[!ix]), as.Date(vec[ix]))
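
If you would rather not load gsubfn, roughly the same extraction can be sketched in base R. This is only an illustration under some assumptions: it relies on regmatches, which is available in more recent versions of R, it keys on the <Date> and <PriceClose> tags rather than on position in the vector, and the names txt, extract_group and DF2 are made up for this example.

```r
# Hypothetical sample input, shortened from the data in this thread
txt <- '- <Temp diffgr:id="Temp14" msdata:rowOrder="13">
 <Date>2005-01-17T00:00:00+05:30</Date>
 <SecurityID>10149</SecurityID>
 <PriceClose>1288.40002</PriceClose>
 </Temp>
- <Temp diffgr:id="Temp15" msdata:rowOrder="14">
 <Date>2005-01-18T00:00:00+05:30</Date>
 <SecurityID>10149</SecurityID>
 <PriceClose>1291.69995</PriceClose>
 </Temp>'

# Return the text captured by the first parenthesised group of `pattern`,
# once for every match found in the single string `x`
extract_group <- function(pattern, x) {
  hits <- regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]
  sub(pattern, "\\1", hits, perl = TRUE)
}

# Dates: keep only the yyyy-mm-dd part that follows <Date>
dates  <- extract_group("<Date>(\\d{4}-\\d{2}-\\d{2})", txt)
# Prices: keep the number between the PriceClose tags
prices <- extract_group("<PriceClose>([0-9.]+)</PriceClose>", txt)

DF2 <- data.frame(date = as.Date(dates), price = as.numeric(prices))
```

Because the patterns are tied to the tag names, SecurityID values are skipped even if they happen to contain a decimal point, at the cost of assuming every record has exactly one Date and one PriceClose.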

On Wed, Nov 5, 2008 at 1:22 AM, RON70 <ron_michael70_at_yahoo.com> wrote:
>
> Hi everyone,
>
> I have this kind of raw dataset :
>
> - <Temp diffgr:id="Temp14" msdata:rowOrder="13">
> <Date>2005-01-17T00:00:00+05:30</Date>
> <SecurityID>10149</SecurityID>
> <PriceClose>1288.40002</PriceClose>
> </Temp>
> - <Temp diffgr:id="Temp15" msdata:rowOrder="14">
> <Date>2005-01-18T00:00:00+05:30</Date>
> <SecurityID>10149</SecurityID>
> <PriceClose>1291.69995</PriceClose>
> </Temp>
> - <Temp diffgr:id="Temp16" msdata:rowOrder="15">
> <Date>2005-01-19T00:00:00+05:30</Date>
> <SecurityID>10149</SecurityID>
> <PriceClose>1288.19995</PriceClose>
> </Temp>
>
> I was looking for some R procedure to extract data from this, that should be
> in following format :
>
> 2005-01-17 1288.40002
> 2005-01-18 1291.69995
> 2005-01-19 1288.19995
>
> Can R help me to do this?
>
> --
> View this message in context: http://www.nabble.com/How-to-extract-following-data-tp20336690p20336690.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Received on Wed 05 Nov 2008 - 11:26:50 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 05 Nov 2008 - 13:30:23 GMT.
