Re: [R] Yearly statistics

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Mon, 28 May 2007 08:34:40 -0400

Here are a couple of solutions:

  1. using zoo package

First add Date to the header so there are the same number of column
headers as columns, then read the data in using read.zoo and aggregate
over years using mean. For more on zoo try library(zoo); vignette("zoo")
and for more on dates see the R News 4/1 help desk article.

# added Date to the header

Lines <- "Date open  high   low    close  hc  lc
2004-12-29 4135 4135 4106  4116  8 -21
2004-12-30 4120 4131 4115  4119 15  -1
2004-12-31 4123 4124 4114  4117  5  -5
2005-01-04 4106 4137 4103  4137 20 -14
2005-01-06 4085 4110 4085  4096 10 -15
2005-01-10 4133 4148 4122  4139 15 -11
2005-01-11 4142 4158 4127  4130 19 -12
2005-01-12 4113 4138 4112  4127  18  8

"

library(zoo)

# z <- read.zoo("myfile.dat", header = TRUE)
z <- read.zoo(textConnection(Lines), header = TRUE)

aggregate(z[,"hc"] > 0 & z[,"lc"] < 0, function(x) format(x, "%Y"), mean)
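The aggregate call relies on the fact that the mean of a logical vector is the proportion of TRUE values. A minimal base R sketch of that step, using the hc and lc values copied from the sample data (the names hit and overall are only for illustration):

```r
# hc and lc columns copied from the sample data
hc <- c(8, 15, 5, 20, 10, 15, 19, 18)
lc <- c(-21, -1, -5, -14, -15, -11, -12, 8)

# TRUE on days where hc is positive and lc is negative
hit <- hc > 0 & lc < 0

# TRUE counts as 1 and FALSE as 0, so the mean is the proportion of TRUE
overall <- mean(hit)
print(overall)
```

This gives the proportion over all eight days; grouping the same logical vector by year before taking the mean yields the yearly figures.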

  2. using data frames and tapply

Read in as a data frame, calculate year and tapply the mean by year:

# Lines is from above

# dat <- read.table("myfile.dat", header = TRUE)
dat <- read.table(textConnection(Lines), header = TRUE)

year <- as.numeric(format(as.Date(dat$Date), "%Y"))
tapply(dat$hc > 0 & dat$lc < 0, year, mean)
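To get the "Yr Prop" layout asked for in the original post, the tapply result can be wrapped in a data frame. The second question, about a subset B of A, is not covered by the two solutions; one possible approach is to test date membership with %in% and average that by year. This is a sketch under assumptions: A and B are taken to be data frames that each carry a Date column, the names A, B, in_B, res and prop_B are hypothetical, and B is built here simply as the rows of A that meet the condition.

```r
# Sample data from the post, with Date added to the header
Lines <- "Date open high low close hc lc
2004-12-29 4135 4135 4106 4116 8 -21
2004-12-30 4120 4131 4115 4119 15 -1
2004-12-31 4123 4124 4114 4117 5 -5
2005-01-04 4106 4137 4103 4137 20 -14
2005-01-06 4085 4110 4085 4096 10 -15
2005-01-10 4133 4148 4122 4139 15 -11
2005-01-11 4142 4158 4127 4130 19 -12
2005-01-12 4113 4138 4112 4127 18 8
"
A <- read.table(text = Lines, header = TRUE)
year <- format(as.Date(A$Date), "%Y")

# Yearly proportions arranged as a two-column data frame
prop <- tapply(A$hc > 0 & A$lc < 0, year, mean)
res <- data.frame(Yr = names(prop), Prop = as.vector(prop))
print(res)

# Hypothetical subset B: the rows of A that meet the condition
B <- A[A$hc > 0 & A$lc < 0, ]

# Share of A's dates that also occur in B, computed per year
in_B <- as.Date(A$Date) %in% as.Date(B$Date)
prop_B <- tapply(in_B, year, mean)
print(prop_B)
```

On the sample data both calculations give 1.0 for 2004 and 0.8 for 2005, matching the figures in the original post.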

On 5/27/07, Alfonso Sammassimo <cincinattikid_at_bigpond.com> wrote:
> Dear R-experts,
>
> Sorry if I've overlooked a simple solution here. I have calculated a
> proportion of the number of observations which meet a criteria, applied to
> five years of data. How can I break down this proportion statistic for each
> year?
>
> For example (data in zoo format):
>
> open high low close hc lc
> 2004-12-29 4135 4135 4106 4116 8 -21
> 2004-12-30 4120 4131 4115 4119 15 -1
> 2004-12-31 4123 4124 4114 4117 5 -5
> 2005-01-04 4106 4137 4103 4137 20 -14
> 2005-01-06 4085 4110 4085 4096 10 -15
> 2005-01-10 4133 4148 4122 4139 15 -11
> 2005-01-11 4142 4158 4127 4130 19 -12
> 2005-01-12 4113 4138 4112 4127 18 8
>
> Statistic of interest is proportion of times that sign of "hc" is positive
> and sign of "lc" is negative on any given day. Looking to return something
> like:
>
> Yr Prop
> 2004 1.0
> 2005 0.8
>
> Along these lines, if I have datasets A and B, where B is a subset of A, can
> I use the number of matching dates to calculate the yearly proportions in
> question?
>
> Thanks,
> Alfonso Sammassimo
> Melbourne Australia
>
> ______________________________________________
> R-help_at_stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Received on Mon 28 May 2007 - 12:44:47 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 28 May 2007 - 13:31:16 GMT.
