Re: [Rd] csv version of data in an R object

From: Barry Rowlingson <b.rowlingson_at_lancaster.ac.uk>
Date: Sat, 21 Apr 2012 20:59:50 +0100

On Sat, Apr 21, 2012 at 3:28 PM, Max Kuhn <mxkuhn_at_gmail.com> wrote:
> For a package, I need to write a csv version of a data set to an R
> object. Right now, I use:
>
>    out <- capture.output(
>                          write.table(x,
>                                      sep = ",",
>                                      na = "?",
>                                      file = "",
>                                      quote = FALSE,
>                                      row.names = FALSE,
>                                      col.names = FALSE))
>
> To me, this is fairly slow; 131 seconds for a data frame with 8100
> rows and 1400 columns.
>
> The data will be in a data frame; I know write.table() would be faster
> with a matrix. I was looking into converting the data frame to a
> character matrix using as.matrix() or, better yet, format() prior to
> the call above. However, I'm not sure what an appropriate value of
> 'digits' should be so that the character version of numeric data has
> acceptable fidelity.
>
> I also tried using a text connection and sink() as shown in
> ?textConnection but there was no speedup.
>

You could try looping over the rows, using paste() to join the elements of each row with commas, and then paste() again to join the resulting vector of rows with a '\n' character.

Something like this (untested):

    paste(apply(x, 1, paste, collapse = ","), collapse = "\n")

You probably also want to append a final '\n' to it.

Is it faster? I don't know!
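
A column-wise variant of the same paste() idea might look like the sketch below. It is only an illustration under some assumptions: df_to_csv_string is a hypothetical helper name, not something from this thread or from any package, and nothing here has been timed against the 8100 x 1400 case. It converts each column with as.character() (which keeps up to 15 significant digits for doubles), replaces missing values with "?" to mimic the na = "?" argument in the original call, and avoids apply(), which coerces a data frame through as.matrix() first.

```r
# Hypothetical helper, not part of the original thread: build the CSV text
# in memory without going through write.table() and capture.output().
df_to_csv_string <- function(x, na = "?") {
  # Convert column by column so each column is handled according to its
  # own type; as.character() keeps up to 15 significant digits for doubles.
  cols <- lapply(x, function(col) {
    out <- as.character(col)
    out[is.na(col)] <- na   # mimic write.table(..., na = "?")
    out
  })
  # paste() is vectorised: this joins the i-th element of every column
  # into one comma-separated string per row.
  rows <- do.call(paste, c(unname(cols), sep = ","))
  # Collapse the rows into a single newline-terminated string.
  paste0(paste(rows, collapse = "\n"), "\n")
}

# Small made-up data frame, just to exercise the function.
x <- data.frame(a = c(1.5, NA, 3),
                b = c("u", "v", NA),
                stringsAsFactors = FALSE)
csv <- df_to_csv_string(x)
cat(csv)
```

If a character vector with one element per row is wanted instead (which is what capture.output() returns), the intermediate `rows` vector could be returned directly.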

Barry



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Sat 21 Apr 2012 - 20:03:19 GMT


Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 23 Apr 2012 - 13:30:47 GMT.
