Re: [R] Combining a list of similar dataframes into a single data frame [Broadcast]

From: Liaw, Andy <andy_liaw_at_merck.com>
Date: Sun 09 Jul 2006 - 10:50:00 EST


A couple of suggestions:  

  1. This screams out for do.call. Try jj <- do.call("rbind", t1).
  2. Use rowSums() instead of apply(..., 1, sum).
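
A minimal sketch of both suggestions on made-up data. The list `pieces` and the matrix `m` are hypothetical stand-ins for illustration, not the poster's actual t1:

```r
# Hypothetical stand-in for t1: a list of data frames with identical columns.
pieces <- list(
  data.frame(server = "A", countervalue = c(NA, 938, 816)),
  data.frame(server = "B", countervalue = c(NA, 4213, 906))
)

# do.call() passes each list element as a separate argument to rbind(),
# so this is rbind(pieces[[1]], pieces[[2]]) without spelling it out.
combined <- do.call("rbind", pieces)

# rowSums() returns the same row totals as apply(m, 1, sum), but computes
# them in compiled code instead of one R-level function call per row.
m <- cbind(a = c(1, 2, 3), b = c(10, 20, 30))
totals_apply   <- apply(m, 1, sum)
totals_rowsums <- rowSums(m)
```

Because do.call() works from the list itself, the same line applies unchanged however many elements the list holds.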

Andy


From: r-help-bounces@stat.math.ethz.ch on behalf of Mike Nielsen
Sent: Sat 7/8/2006 7:20 PM
To: r-help@stat.math.ethz.ch
Subject: Re: [R] Combining a list of similar dataframes into a single dataframe [Broadcast]

Well, this worked, and rather more quickly than I had expected.

Many thanks to the dogs, who told me the answer in return for walking them and feeding them!

> jj <- eval(parse(text=paste(sep=" ","rbind(",paste(sep=" ","t1[[",1:length(t1),"]]",collapse=","),")")))
> str(jj)

`data.frame': 85644 obs. of 4 variables:
 $ server      : Factor w/ 122 levels "AB93-99","AMP93-1",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ ts          :'POSIXct', format: chr "2006-06-30 12:31:44" "2006-06-30 12:32:58" "2006-06-30 12:34:46" "2006-06-30 12:36:55" ...
 $ countername : Factor w/ 4 levels "Bytes Received/sec",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ countervalue: num NA 938 816 4213 906 ...
>
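
To see why the one-liner above works, here is a small sketch on a hypothetical three-element list (`t1_demo` is invented for illustration). It builds the text of an rbind() call, evaluates it, and compares the result with what do.call() produces from the same list:

```r
# Hypothetical three-element stand-in for t1.
t1_demo <- list(
  data.frame(x = 1:2),
  data.frame(x = 3:4),
  data.frame(x = 5:6)
)

# The inner paste() builds one "t1_demo[[i]]" term per element, joined by
# commas; the outer paste() wraps those terms in a call to rbind().
terms <- paste("t1_demo[[", seq_along(t1_demo), "]]", sep = "", collapse = ",")
call_text <- paste("rbind(", terms, ")", sep = "")

# Parsing and evaluating the text runs the call it spells out.
via_text <- eval(parse(text = call_text))

# do.call() constructs the equivalent call directly from the list.
via_do_call <- do.call("rbind", t1_demo)
```

Here `call_text` is the string `rbind(t1_demo[[1]],t1_demo[[2]],t1_demo[[3]])`, and the two results should match; the do.call() form simply skips the detour through text.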

On 7/8/06, Mike Nielsen <mr.blacksheep@gmail.com> wrote:
> I would be very grateful to anyone who could point to the error of my
> ways in the following.
>
> I have a dataframe called net1, as such:
>
> > str(net1)
> `data.frame': 114192 obs. of 9 variables:
> $ server         : Factor w/ 122 levels "AB93-99","AMP93-1",..: 1 1 1 1 1 1 1 1 1 1 ...
> $ ts             :'POSIXct', format: chr "2006-06-30 12:31:44" "2006-06-30 12:31:44" "2006-06-30 12:31:44" "2006-06-30 12:31:44" ...
> $ instance       : Factor w/ 22 levels "1","2","Compaq Ethernet_Fast Ethernet Adapter_Module",..: 4 4 4 4 4 4 4 4 4 4 ...
> $ instanceno     : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
> $ perftime       : num 3.16e+13 3.16e+13 3.16e+13 3.16e+13 3.16e+13 ...
> $ perffreq       : num 6.99e+08 6.99e+08 6.99e+08 6.99e+08 6.99e+08 ...
> $ perftime100nsec: num 1.28e+17 1.28e+17 1.28e+17 1.28e+17 1.28e+17 ...
> $ countername    : Factor w/ 4 levels "Bytes Received/sec",..: 1 3 2 4 1 3 2 4 1 3 ...
> $ countervalue   : num 6.08e+07 6.64e+07 5.58e+06 1.00e+08 6.09e+07 ...
> >
>
> What I am trying to do is subset this thing down by server, instance,
> instanceno, countername and then apply a function to each subsetted
> dataframe. The function performs a calculation on countervalue,
> essentially "collapsing" instanceno and instance down to a single
> value.
>
> Here is a snippet of my code:
> t1 <- by(net1,
>          list(net1$server,
>               factor(as.character(net1$countername))), # get rid of unused levels of countername for this server
>          function(x){
>            g <- by(x,
>                    list(factor(as.character(x$instance)),    # get rid of unused levels of instance for this server
>                         factor(as.character(x$instanceno))), # same with instanceno
>                    function(y){c(NA,mean(y$perffreq)*diff(y$countervalue)/diff(y$perftime))})
>            data.frame(server=x$server,
>                       ts=x$ts,
>                       countername = x$countername,
>                       countervalue =
>                         apply(sapply(g[!sapply(g,is.null)],I),1,sum))
>          })
>
> So t1 is then a list of dataframes, each with an identical set of columns:
>
> > str(t1[[1]])
> `data.frame': 149 obs. of 4 variables:
> $ server      : Factor w/ 122 levels "AB93-99","AMP93-1",..: 1 1 1 1 1 1 1 1 1 1 ...
> $ ts          :'POSIXct', format: chr "2006-06-30 12:31:44" "2006-06-30 12:32:58" "2006-06-30 12:34:46" "2006-06-30 12:36:55" ...
> $ countername : Factor w/ 4 levels "Bytes Received/sec",..: 1 1 1 1 1 1 1 1 1 1 ...
> $ countervalue: num NA 938 816 4213 906 ...
>
> What I'd dearly love to do, without looping or lapply-ing through t1
> and rbinding (too much data for this to finish quickly enough -- this
> is about 10% of what I'm eventually going to have to manage), is
> convert t1 to one big dataframe.
>
> On the other hand, I admit that I may be going about this wrongly from
> the start; perhaps there's a better approach?
>
> Any pointers would be most gratefully received.
>
> Many thanks!
>
>
> --
> Regards,
>
> Mike Nielsen
>

-- 
Regards, 

Mike Nielsen 

______________________________________________ 
R-help@stat.math.ethz.ch mailing list 
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

Received on Sun Jul 09 10:56:41 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sun 09 Jul 2006 - 12:16:25 EST.
