Re: [R] speed issues / pros & cons: dataframe vs. matrix

From: Duncan Murdoch <murdoch_at_stats.uwo.ca>
Date: Fri, 22 Jun 2007 18:59:38 -0400

On 22/06/2007 6:21 PM, Thomas Pujol wrote:
> I've read that certain operations performed on a matrix (e.g. rbind, cbind) are often much faster than the same operations performed on a data frame.
>
> Other than the "bind functions", what are the main operations that are significantly faster on a matrix?

Indexing (e.g. x[1,3]) is much faster on a matrix.
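
One rough way to check this on your own machine is sketched below. The object names, sizes, and repetition count are made up for illustration, and the timings will vary with hardware and R version, so treat it as a quick experiment rather than a benchmark.

```r
# Illustrative sizes only; nothing here comes from the original post.
n <- 1000
m <- matrix(rnorm(n * 10), nrow = n, ncol = 10)
df <- as.data.frame(m)

# Repeat the same single-element lookup on each structure.
reps <- 20000
t_matrix <- system.time(for (i in seq_len(reps)) m[1, 3])["elapsed"]
t_df     <- system.time(for (i in seq_len(reps)) df[1, 3])["elapsed"]

# Both lookups return the same value; only the cost differs.
same_value <- identical(m[1, 3], df[1, 3])
print(c(matrix = unname(t_matrix), data.frame = unname(t_df)))
```

The data frame lookup goes through the `[.data.frame` method, which does considerably more work per call than indexing into a plain matrix.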
>
> I know that data frames allow for column names and row names, and that each column in a data frame can have a different data type. Are there any other advantages of storing data in a data frame rather than a matrix?

Data frames are lists, so you can use things like df$columnname, with(df, expression), attach(df), etc. Data frame columns have names, but matrix columns don't necessarily.
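
A small sketch of what the list behaviour means in practice. The data and the column names (height, weight) are hypothetical, chosen only to have something to compute with.

```r
# Hypothetical example data, not from the original post.
df <- data.frame(height = c(1.6, 1.8, 1.7), weight = c(60, 80, 70))

is_list     <- is.list(df)                 # a data frame is a list of columns
col_by_name <- df$height                   # list-style access by column name
bmi         <- with(df, weight / height^2) # evaluate an expression using the columns

# A matrix holding the same numbers need not have column names at all.
m <- matrix(c(1.6, 1.8, 1.7, 60, 80, 70), ncol = 2)
matrix_names <- colnames(m)                # NULL, since no dimnames were set
```

Note that `$` does not work on a matrix; there you index with m[, 1] or, if column names were set, m[, "height"].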

I'd generally use data frames in any situation where the rows are cases and the columns are characteristics, until I found they were too slow; then I'd consider temporary conversion to a matrix to speed things up. As Knuth said, premature optimization is the root of all evil.
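
The temporary conversion might look something like the sketch below. It assumes an all-numeric data frame (as.matrix() would coerce mixed column types to character), and the column names a and b are hypothetical.

```r
# Hypothetical all-numeric data frame; names are illustrative only.
df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))

# Convert once, then do the index-heavy work on the matrix...
m <- as.matrix(df)
for (i in seq_len(nrow(m))) {
  m[i, "b"] <- m[i, "a"] + m[i, "b"]
}

# ...and convert back so later code can keep using df$b, with(df, ...), etc.
df <- as.data.frame(m)
```

A loop this simple would normally be vectorised instead; it stands in for element-by-element code that can't easily be rewritten that way.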

Duncan Murdoch



R-help_at_stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Fri 22 Jun 2007 - 23:04:39 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 22 Jun 2007 - 23:32:22 GMT.
