Re: [R] more woes trying to convert a data.frame to a numerical matrix

From: Andrew Yee <andrewjyee_at_gmail.com>
Date: Wed, 16 May 2007 09:14:10 -0400

Thanks again to everyone for all your help.

I think I've figured out the solution to my dilemma. Instead of using data.matrix or sapply, this works for me:

sample.data <- read.csv("sample.csv")
sample.matrix.raw <- as.matrix(sample.data[-1, -1])
sample.matrix <- matrix(as.numeric(sample.matrix.raw),
    nrow = attributes(sample.matrix.raw)$dim[1],
    ncol = attributes(sample.matrix.raw)$dim[2])

With the above code, I get the desired matrix of:

1 2 3
4 5 6
7 8 9

(I'd like to be able to import the whole csv and then subset the relevant header and data sections, rather than creating one csv for the header and another for the data.)
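
One possible way to do that single import, offered as a sketch rather than a tested recipe: read the file with header = FALSE so that both label rows arrive as ordinary rows, then split the result by row index. In the example, csv_text stands in for the contents of sample.csv, and the names raw, header.rows and data.rows are illustrative assumptions, not anything from the thread.

```r
# csv_text stands in for sample.csv (assumption, so the example is self-contained).
csv_text <- paste(c("name,x,y,z",
                    "category,delta,gamma,epsilon",
                    "a,1,2,3",
                    "b,4,5,6",
                    "c,7,8,9"), collapse = "\n")

# Read everything as character data, with no row treated as a header.
raw <- read.csv(text = csv_text, header = FALSE, stringsAsFactors = FALSE)

header.rows <- raw[1:2, ]     # the "name" and "category" rows
data.rows   <- raw[-(1:2), ]  # the rows for a, b and c

# Convert each data column from character to numeric; sapply() returns a matrix.
sample.matrix <- sapply(data.rows[, -1], as.numeric)

# Label rows with the first column and columns with the first header row.
dimnames(sample.matrix) <- list(data.rows[[1]],
                                unname(unlist(header.rows[1, -1])))
print(sample.matrix)
```

The second header row (delta, gamma, epsilon) stays available as header.rows[2, -1] for later use.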

Of course, the above code seems kind of clunky, and I'd welcome any suggestions for improvement.
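
One less clunky option, as a sketch under stated assumptions: a character matrix can be coerced in place by assigning to storage.mode(), which keeps the dimensions and dimnames, so the explicit matrix() call with nrow and ncol is not needed. Here csv_text stands in for sample.csv, and stringsAsFactors = FALSE is an assumption made so the example behaves the same regardless of the R version's default.

```r
# csv_text stands in for sample.csv (assumption, so the example is self-contained).
csv_text <- paste(c("name,x,y,z",
                    "category,delta,gamma,epsilon",
                    "a,1,2,3",
                    "b,4,5,6",
                    "c,7,8,9"), collapse = "\n")

sample.data <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Drop the "category" row and the "name" column; the result is a character matrix.
sample.matrix <- as.matrix(sample.data[-1, -1])

# Change the storage type in place; dim and dimnames are preserved.
storage.mode(sample.matrix) <- "double"
print(sample.matrix)
```

Unlike the matrix(as.numeric(...)) version, this keeps the row and column labels on the result.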

Thanks,
Andrew

On 5/16/07, Andrew Yee <andrewjyee_at_gmail.com> wrote:
>
> Thanks for the suggestion.
>
> However, I've tried sapply and data.matrix.
>
> The problem is that while it returns a numeric matrix, it gives back:
>
> 1 1 1
> 2 2 2
> 3 3 3
>
> instead of
>
> 1 2 3
> 4 5 6
> 7 8 9
>
> The latter matrix is the desired result
>
> Thanks,
> Andrew
>
> On 5/16/07, Marc Schwartz <marc_schwartz_at_comcast.net> wrote:
> >
> > On Wed, 2007-05-16 at 08:40 -0400, Andrew Yee wrote:
> > > Thanks for the suggestion and the explanation for why I was running
> > > into these troubles.
> > >
> > > I've tried:
> > >
> > > as.numeric(as.matrix(sample.data[-1, -1]))
> > >
> > > However, this creates another vector rather than a matrix.
> >
> > Right. That's because I'm an idiot and need more caffeine... :-)
> >
> > > Is there a straightforward way to convert this directly into a
> > > numeric matrix rather than a vector?
> >
> > Yeah, Dimitris' approach below of using data.matrix().
> >
> > You could also use:
> >
> > mat <- sapply(sample.data[-1, -1], as.numeric)
> > rownames(mat) <- rownames(sample.data[-1, -1])
> >
> > > mat
> > x y z
> > 2 1 1 1
> > 3 2 2 2
> > 4 3 3 3
> >
> > Though, this is essentially what data.matrix() does internally.
> >
> > > Additionally, I've also considered:
> > >
> > > data.matrix(sample.data[-1, -1])
> > >
> > > but bizarrely, it returns:
> > >
> > > x y z
> > > 2 1 1 1
> > > 3 2 2 2
> > > 4 3 3 3
> >
> > That is a numeric matrix:
> >
> > > str(data.matrix(sample.data[-1, -1]))
> > int [1:3, 1:3] 1 2 3 1 2 3 1 2 3
> > - attr(*, "dimnames")=List of 2
> > ..$ : chr [1:3] "2" "3" "4"
> > ..$ : chr [1:3] "x" "y" "z"
> >
> > HTH,
> >
> > Marc
> >
> > >
> > > Thanks,
> > > Andrew
> > >
> > >
> > > On 5/16/07, Marc Schwartz <marc_schwartz_at_comcast.net> wrote:
> > > On Wed, 2007-05-16 at 08:10 -0400, Andrew Yee wrote:
> > > > I have the following csv file:
> > > >
> > > > name,x,y,z
> > > > category,delta,gamma,epsilon
> > > > a,1,2,3
> > > > b,4,5,6
> > > > c,7,8,9
> > > >
> > > > > I'd like to create a numeric matrix of just the numbers in this csv dataset.
> > > >
> > > > I've tried the following program:
> > > >
> > > > > sample.data <- read.csv("sample.csv")
> > > > > numerical.data <- as.matrix(sample.data[-1, -1])
> > > > >
> > > > > However, print(numerical.data) returns what appears to be a matrix of
> > > > > characters:
> > > >
> > > > x y z
> > > > 2 "1" "2" "3"
> > > > 3 "4" "5" "6"
> > > > 4 "7" "8" "9"
> > > >
> > > > How do I force it to be numbers rather than characters?
> > > >
> > > > Thanks,
> > > > Andrew
> > >
> > > > The problem is that you have two rows which contain alpha entries.
> > > >
> > > > The first row is treated as the header, but the second row is treated as
> > > > actual data, thus overriding the numeric values in the subsequent rows.
> > >
> > > You could use:
> > >
> > > > as.numeric(as.matrix(sample.data[-1, -1]))
> > > >
> > > > to coerce the matrix to numeric, or if you don't need the alpha entries,
> > > > you could modify the read.csv() call to something like:
> > > >
> > > > read.csv("sample.csv", header = FALSE, skip = 2, row.names = 1,
> > > >          col.names = c("name", "x", "y", "z"))
> > >
> > > > This will skip the first two rows, set the first column to the row names
> > > > and give you a data frame with numeric columns, which in most cases can
> > > > be treated as a numeric matrix and/or you could explicitly coerce it to
> > > > one.
> > >
> > > HTH,
> > >
> > > Marc Schwartz
> > >
> > >
> > >
> >
> >
>
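
A note on the 1 1 1 / 2 2 2 / 3 3 3 result reported in the thread, offered as a likely explanation rather than a confirmed diagnosis: because the "category" row is read as data, each column holds text, and read.csv() at the time stored text columns as factors. Calling as.numeric() on a factor returns its internal level codes, not the numbers shown in the labels, and data.matrix() is documented to replace factors by their internal codes as well. The sketch below reproduces this, with csv_text standing in for sample.csv and stringsAsFactors = TRUE set explicitly to mimic the old default (both are assumptions of the example).

```r
# csv_text stands in for sample.csv (assumption, so the example is self-contained).
csv_text <- paste(c("name,x,y,z",
                    "category,delta,gamma,epsilon",
                    "a,1,2,3",
                    "b,4,5,6",
                    "c,7,8,9"), collapse = "\n")

# Force factor columns, as read.csv() produced by default before R 4.0.0.
sample.data <- read.csv(text = csv_text, stringsAsFactors = TRUE)
numeric.part <- sample.data[-1, -1]

# as.numeric() on a factor yields level codes: every column becomes 1, 2, 3.
codes <- sapply(numeric.part, as.numeric)

# Going through as.character() first recovers the values printed in the file.
values <- sapply(numeric.part, function(col) as.numeric(as.character(col)))

print(codes)
print(values)
```

Here codes shows the unwanted matrix and values shows the desired one.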

R-help_at_stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Wed 16 May 2007 - 13:36:44 GMT
