Re: [R] reading long matrix

From: Liaw, Andy <andy_liaw_at_merck.com>
Date: Fri 23 Dec 2005 - 06:13:18 EST


Here's one possibility, if you know the number of species and the numbers of rows and columns beforehand, and the dimensions are the same for all species.

readSpeciesMap <- function(fname, nspecies, nr, nc) {

    spcnames <- character(nspecies)
    spcdata <- array(0, c(nc, nr, nspecies))
    ## open the file for reading, and close it upon exit.
    f <- file(fname, open="r")
    on.exit(close(f))
    for (i in seq(along=spcnames)) {

        ## read the name
        spcnames[i] <- readLines(f, 1)[[1]]
        ## read the grid
        spcdata[, , i] <- as.numeric(unlist(strsplit(readLines(f, nr), "")))
        ## pick up the empty line
        readLines(f, 1)

    }
    ## replace the 9s with NAs
    spcdata[spcdata == 9] <- NA
    dimnames(spcdata)[[3]] <- spcnames
    ## "transpose" the array in each species
    aperm(spcdata, c(2, 1, 3))
}

Using the example you supplied (saved in the file "species.txt"):

> readSpeciesMap("species.txt", 3, 6, 9)
, , SPECIES1

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]   NA   NA   NA    0    0    1    0   NA   NA
[2,]   NA    0    0    1    1    0    1    0   NA
[3,]    0    1    1    1    0    1    0    0    0
[4,]   NA    0    1    1    0    0    1    0    1
[5,]    1    1    0    1    0    0    0    1   NA
[6,]   NA    0    1    1    1    0    0    1   NA

, , SPECIES2

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]   NA   NA   NA    0    0    0    0   NA   NA
[2,]   NA    0    0    1    1    0    1    1   NA
[3,]    0    1    1    1    0    1    1    0    0
[4,]   NA    0    1    0    1    0    1    0    1
[5,]    1    1    0    0    0    0    0    1   NA
[6,]   NA    0    0    0    0    0    0    1   NA

, , SPECIES3

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]   NA   NA   NA    0    0    1    0   NA   NA
[2,]   NA    0    0    1    0    0    1    0   NA
[3,]    0    1    1    1    0    0    0    1    0
[4,]   NA    0    1    1    0    0    1    0    0
[5,]    1    1    0    1    0    0    0    1   NA
[6,]   NA    0    1    1    1    0    0    1   NA
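
If the dimensions are not known beforehand, something along the following lines might work instead. This is only a sketch: the name readSpeciesMapAll is made up for illustration, and it assumes that grid rows consist solely of the digits 0, 1 and 9, that every other non-empty line is a species name, and that the whole file fits in memory. It reads the file in a single readLines() call and infers the number of species, rows and columns from what it finds.

```r
## Hypothetical variant (not the function above): read everything at once
## and infer nspecies, nr and nc from the file layout.
readSpeciesMapAll <- function(fname) {
    lines <- readLines(fname)
    ## strip leading/trailing white space (names are preceded by spaces)
    lines <- gsub("^\\s+|\\s+$", "", lines)
    ## drop the empty separator lines
    lines <- lines[nzchar(lines)]
    ## ASSUMPTION: grid rows contain only 0, 1 and 9;
    ## any other non-empty line is taken to be a species name.
    isgrid <- grepl("^[019]+$", lines)
    spcnames <- lines[!isgrid]
    grid <- lines[isgrid]
    nspecies <- length(spcnames)
    nr <- length(grid) / nspecies
    nc <- nchar(grid[1])
    ## every species must have the same number of rows and columns
    stopifnot(nspecies > 0, nr == round(nr), all(nchar(grid) == nc))
    vals <- as.numeric(unlist(strsplit(grid, "")))
    ## replace the 9s with NAs
    vals[vals == 9] <- NA
    ## values arrive row by row, so fill an nc x nr x nspecies array
    ## and then "transpose" each species, as in readSpeciesMap
    spcdata <- array(vals, c(nc, nr, nspecies),
                     dimnames = list(NULL, NULL, spcnames))
    aperm(spcdata, c(2, 1, 3))
}
```

Calling readSpeciesMapAll("species.txt") should then give an array of the same shape as the one shown above, provided the assumptions about the file hold. Since strsplit() and as.numeric() are each called once on the whole file rather than once per species, this may also help a little with the larger datasets, though I have not timed it.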

Andy

From: Colin Beale
>
> Hi,
>
> I'm needing some help finding a function to read a large text
> file into an array in R. The data are essentially presence /
> absence / na data for many species and come as a grid with
> each species name (after two spaces) at the beginning of the
> matrix defining the map for that species. An excerpt could
> therefore be:
>
> SPECIES1
> 999001099
> 900110109
> 011101000
> 901100101
> 110100019
> 901110019
>
> SPECIES2
> 999000099
> 900110119
> 011101100
> 901010101
> 110000019
> 900000019
>
> SPECIES3
> 999001099
> 900100109
> 011100010
> 901100100
> 110100019
> 901110019
>
> where 9 is actually NA, 0 is absence and 1 presence. The
> final array I want to create should have dimensions that are
> the x and y coordinates and the number of species (known in
> advance). (In this example dim = c(9,6,3)). It would be sort
> of neat if the code could also read the species name into the
> appropriate names attribute, but this is a refinement that I
> could probably do if someone can help me read the data into R
> and into an array in the first place. I'm currently thinking
> a line by line approach using readLines might be the best
> option, but I've got a very long file - well over 100
> species, each a matrix of 70 x 100 datapoints, making this
> option rather time consuming, I expect - especially as the
> next dataset has 1300 species and a much larger grid...
>
> Any hints would be gratefully received.
>
> Colin Beale
> Macaulay Land Use Research Institute
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
>
>

Received on Fri Dec 23 06:53:40 2005

This archive was generated by hypermail 2.1.8 : Fri 23 Dec 2005 - 09:32:35 EST