Re: [R] Seeking a more efficient way to read in a file

From: jim holtman <jholtman_at_gmail.com>
Date: Wed, 2 Jan 2008 20:31:39 -0500

After you read in the first line, read the rest of the file with a single scan:

rest <- scan(..., sep="\t", what=0, skip=1)
index <- 1  # used to march through 'rest'
for (i in 1:3000){
    for (j in 1:i){
        malt[i,j] <- rest[index]
        index <- index+1
    }
}

There are probably faster ways, but this should run much quicker, since most of your previous time was spent in the reading: each scan() call with skip = i has to read past the first i lines of the file again.
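
One possible faster way, offered as a sketch rather than a tested solution for the real file: the values read row by row from the lower triangle are in the same order as the upper triangle taken column by column, so a single vectorized assignment can replace the double loop. The small temporary file and the names `tmp`, `nms` and `upper` below are made up for illustration; it assumes the layout described in the question (a header line, then row k+1 holding k tab-separated numbers, diagonal included).

```r
# Hypothetical 3 x 3 demonstration file in the layout described in the question.
tmp <- tempfile(fileext = ".dat")
writeLines(c("a\tb\tc", "1", "2\t3", "4\t5\t6"), tmp)

# Read the header once, then everything else in a single scan.
nms  <- scan(tmp, sep = "\t", nlines = 1, what = "character", quiet = TRUE)
rest <- scan(tmp, sep = "\t", what = 0, skip = 1, quiet = TRUE)
n    <- length(nms)
stopifnot(length(rest) == n * (n + 1) / 2)

# Row-wise lower-triangle order equals column-wise upper-triangle order,
# so one assignment fills the upper triangle (diagonal included).
upper <- matrix(0, nrow = n, ncol = n, dimnames = list(nms, nms))
upper[upper.tri(upper, diag = TRUE)] <- rest

# Mirror across the diagonal, counting the diagonal only once.
mat <- upper + t(upper) - diag(diag(upper))
```

For the real data, the two scan() calls would point at the actual file instead of `tmp`; the rest should carry over unchanged if the file really has that layout.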

On Jan 2, 2008 6:05 PM, Talbot Katz <topkatz_at_msn.com> wrote:
>
> Hi.
>
> I have a matrix stored in a large, tab-delimited flat file. The first row contains column names. Because the matrix is symmetric, the file has lower triangular format, so the second row contains one number, the third row two numbers, etc. In general, row k+1 contains k numbers; the matrix has 3000 rows, so the file has 3001 rows. The file has variable length records, so each row ends with its last piece of data. I read in the file and produced the full symmetric matrix as follows:
>
> > mana01 <- scan( file = "C:/mat.dat", sep = "\t", nlines = 1, what = "character" )
> Read 3000 items
> > nco <- length( mana01 )
> > malt <- matrix(0, nrow = nco, ncol = nco )
> > colnames( malt ) <- mana01
> > rownames( malt ) <- mana01
> > for ( i in 1:3000 ) { malt[ i, (1:i) ] <- scan( file="C:/mat.dat", skip = i, n = i, quiet = TRUE ) }
> > mat <- malt + t( malt ) - diag( diag( malt ) )
>
> The for loop took a couple of hours to complete. I suspect there's a much faster way to do this. Any suggestions? Thanks!
>
> -- TMK --
> 212-460-5430 home
> 917-656-5351 cell
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

Received on Thu 03 Jan 2008 - 01:35:47 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 03 Jan 2008 - 02:30:05 GMT.
