[R] Preparing data for display

From: Stavros Macrakis <macrakis_at_alum.mit.edu>
Date: Mon, 10 Nov 2008 16:28:38 -0500

I have a dataset of about 10^6 rows, each consisting of a timestamp, several factors, a string, some integers, and some floats.

I'd like to graph this data in various ways, including straightforward ones (how many events per week over the past year for each of 4 values of some factor), some less straightforward. I've managed to do this by brute force, but I'd like to learn how to do it in more elegant, more R-like code.

Consider, for example, the following, which graphs the 25th, 50th, and 75th percentile values of data$x per day:

perc <- function(code,data)
{ # select the part of the data with factor value
  slice <- data[data$factor == code,];
  # calc quartiles for each day
  quarts <- tapply(slice$x,

                             function(x) quantile(x,c(.25,.50,.75)));

  # returns a tagged list of tagged vectors
  # list("2008-10-07" = c("25%" = .05, "50%" = .47, ... ) , ...)

    # convert to a data frame -- is there some mapping function to do this?
    fr <- data.frame( day = to.time(names(quarts)), # strings back to dates (!)
                      "25%" = sapply(quarts, function(x) x[[1]] ),   # !!
                      "50%" = sapply(quarts, function(x) x[[2]] ),
                      "75%" = sapply(quarts, function(x) x[[3]] ) );

    # columns are now labelled "X25." etc. (!)
    for (i in 2:4) { plot( fr$day, fr[[i]], type="l", ylim= c( 0, max(pmax(fr[[2]],fr[[3]],fr[[4]] )) ));
                     par(new=TRUE); }
}

This works, but is pretty ugly in a variety of ways. What is the right way to do this?
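
For comparison, here is a minimal sketch of one more compact way to get the same per-day quartiles, offered as an illustration rather than a definitive answer. It assumes the data already has a Date column, here called day (a hypothetical name standing in for whatever the timestamp is rounded to), and it uses a small synthetic data frame because the real data is not shown. split() plus sapply() yields a matrix directly, check.names = FALSE keeps the "25%" style column names, and matplot() draws all three lines on shared axes without par(new=TRUE).

```r
# Synthetic stand-in data (assumption: the real data has a per-day Date column)
set.seed(1)
n <- 1000
data <- data.frame(
  day    = as.Date("2008-10-01") + sample(0:29, n, replace = TRUE),
  factor = sample(c("a", "b", "c", "d"), n, replace = TRUE),
  x      = runif(n)
)

perc <- function(code, data, probs = c(.25, .50, .75)) {
  # select the part of the data with the given factor value
  slice <- data[data$factor == code, ]
  # one quantile vector per day; sapply binds them as columns, t() makes days the rows
  q <- t(sapply(split(slice$x, slice$day), quantile, probs = probs))
  # check.names = FALSE preserves "25%", "50%", "75%" as column names
  data.frame(day = as.Date(rownames(q)), q, check.names = FALSE)
}

fr <- perc("a", data)

# plot every quantile column against day in a single call
y <- as.matrix(fr[, -1])
matplot(fr$day, y, type = "l", lty = 1,
        xlab = "day", ylab = "x", ylim = c(0, max(y)))
```

The day labels still make a round trip through strings (as.Date(rownames(q))), so this tidies the conversion to a data frame and the plotting loop but does not remove that step.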



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Mon 10 Nov 2008 - 21:31:30 GMT
