Re: [R] Problem with rowMeans()

From: Erik Iverson <iverson_at_biostat.wisc.edu>
Date: Thu, 12 Jun 2008 18:48:25 -0500

ss wrote:
> It is:
>
> > data <-
> read.table('E-TABM-1-processed-data-1342561271_log2_with_symbols.txt',
> row.names = NULL ,header=TRUE, fill=TRUE)
> > class(data[3])
> [1] "data.frame"
> >
>

Oops, I should have said class(data[[3]]) and is.numeric(data[[3]]).

See ?Extract
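
As a minimal sketch of the difference (the data frame `df` below is made up for illustration and is not your file): single brackets return a one-column data frame, while double brackets return the column itself.

```r
# Hypothetical toy data frame standing in for the real file
df <- data.frame(id  = c("a", "b", "c"),
                 sym = c("x", "y", "z"),
                 val = c(-1.6, 0.18, 1.63))

class(df[3])          # "data.frame": a one-column data frame
is.numeric(df[3])     # FALSE, even though the column holds numbers

class(df[[3]])        # "numeric": the column vector itself
is.numeric(df[[3]])   # TRUE
```

So a FALSE from is.numeric(data[3]) says nothing about the contents of column 3; only the double-bracket form tests the column.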

>
> And if I try to use as.matrix(read.table()), I got:
>
> >data
> <-as.matrix(read.table('E-TABM-1-processed-data-1342561271_log2_with_symbols.txt',
> + row.names = NULL ,header=TRUE, fill=TRUE))
> > data[1:4,1:4]
> Probe_ID Gene_Symbol M16012391010920 M16012391010525
> [1,] "A_23_P105862" "13CDNA73" "-1.6" " 0.16"
> [2,] "A_23_P76435" "15E1.2" "0.18" " 0.59"
> [3,] "A_24_P402115" "15E1.2" "1.63" "-0.62"
> [4,] "A_32_P227764" "15E1.2" "-0.76" "-0.42"
>
> You see they are surrounded by "".
>
> I don't see such if I just use >read.table
>

That is because a matrix (an object of class 'matrix') holds a single type. Since your data frame contains character columns, as.matrix() converts every column to character, including the numbers, which you certainly do NOT want.
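
A small sketch of that coercion, using a made-up two-column data frame (`df`, `m` and `num` are names chosen for this example only):

```r
# Hypothetical mixed-type data frame: one character column, one numeric
df <- data.frame(id = c("a", "b"), val = c(-1.6, 0.18))

m <- as.matrix(df)
typeof(m)                 # "character": the numbers were coerced to strings
is.numeric(m[, "val"])    # FALSE

# Converting only the numeric columns keeps the matrix numeric
num <- as.matrix(df[sapply(df, is.numeric)])
typeof(num)               # "double"
rowMeans(num)
```

If you do need a matrix, dropping the identifier columns before the conversion is one way to keep the values numeric.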

You want a data.frame. Here is an example of what I think you are after.

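Try the following commands and see how they compare to your situation. For me they behave as noted in the comments.

test <- data.frame(x = factor(rep(c("A", "B"), each = 13)), y = rnorm(26), z = rnorm(26))

test

class(test)              # "data.frame"

is.numeric(test[[2]])    # TRUE

is.numeric(test[[3]])    # TRUE

rowMeans(test)           # error: column x is a factor, so 'x' must be numeric

rowMeans(test[2:3])      # works: only the numeric columns y and z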
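
If is.numeric(data[[3]]) turns out to be FALSE, one possible way to locate the offending entries is sketched below. The vector `col` is invented for illustration (I have not seen your file), and the mistyped value in it is only an assumption about what such a typo might look like.

```r
# Hypothetical column as read.table() might return it when one entry is mistyped
col <- c("-1.6", "0.18", "1,63", NA, "-0.76")

# as.numeric() yields NA for anything it cannot parse (with a warning)
parsed <- suppressWarnings(as.numeric(col))

# Entries that were not NA originally but failed to parse are the suspects
bad <- which(!is.na(col) & is.na(parsed))
bad        # 3
col[bad]   # "1,63"
```

On the real data the same check could be run on as.character(data[[3]]); the as.character() step matters if the column was read in as a factor, because as.numeric() on a factor returns the internal level codes rather than the printed values.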

> > data <-
> read.table('E-TABM-1-processed-data-1342561271_log2_with_symbols.txt',
> row.names = NULL ,header=TRUE, fill=TRUE)
> > data[1:4,1:4]
> Probe_ID Gene_Symbol M16012391010920 M16012391010525
> 1 A_23_P105862 13CDNA73 -1.6 0.16
> 2 A_23_P76435 15E1.2 0.18 0.59
> 3 A_24_P402115 15E1.2 1.63 -0.62
> 4 A_32_P227764 15E1.2 -0.76 -0.42
>
>
> Thanks,
> Allen
>
>
>
> On Thu, Jun 12, 2008 at 7:34 PM, Erik Iverson <iverson_at_biostat.wisc.edu
> <mailto:iverson_at_biostat.wisc.edu>> wrote:
>
>
>
> ss wrote:
>
> Hi Wacek,
>
> Yes, data is data frame not a matrix.
>
> is.numeric(data[3])
>
> [1] FALSE
>
>
> what is class(data[3])
>
>
> But I looked at column 3 and it looks okay. There are few NAs and
> I did not find anything strange.
>
> Any suggestions?
>
> Thanks,
> Allen
>
>
>
> On Thu, Jun 12, 2008 at 7:01 PM, Wacek Kusnierczyk <
> Waclaw.Marcin.Kusnierczyk_at_idi.ntnu.no
> <mailto:Waclaw.Marcin.Kusnierczyk_at_idi.ntnu.no>> wrote:
>
> ss wrote:
>
> Thank you very much, Wacek! It works very well.
> But there is a minor problem. I did the following:
>
> data <-
>
> read.table('E-TABM-1-processed-data-1342561271_log2_with_symbols.txt',
> +row.names = NULL ,header=TRUE, fill=TRUE)
>
> looks like you have a data frame, not a matrix
>
>
> dim(data)
>
> [1] 23963 85
>
> data[1:4,1:4]
>
> Probe_ID Gene_Symbol M16012391010920 M16012391010525
> 1 A_23_P105862 13CDNA73 -1.6 0.16
> 2 A_23_P76435 15E1.2 0.18 0.59
> 3 A_24_P402115 15E1.2 1.63 -0.62
> 4 A_32_P227764 15E1.2 -0.76 -0.42
>
> data1<-data[sapply(data, is.numeric)]
> dim(data1)
>
> [1] 23963 82
>
> data1[1:4,1:4]
>
>   M16012391010525 M16012391010843 M16012391010531 M16012391010921
> 1            0.16           -0.23           -1.40            0.90
> 2            0.59            0.28           -0.30            0.08
> 3           -0.62           -0.62           -0.22           -0.18
> 4           -0.42            0.01            0.28           -0.79
>
> You will notice that, after using 'data[sapply(data, is.numeric)]' to get
> data1, the first sample in data, called 'M16012391010920', is missing
> from data1.
>
> Any further suggestions?
>
> surely there must be an entry in column 3 that makes it non-numeric.
> what does is.numeric(data[3]) say? (NAs should not make a column
> non-numeric, unless there are only NAs there, which is not the case
> here.) check your data for non-numeric entries in column 3, there can
> be a typo.
>
> vQ
>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help_at_r-project.org <mailto:R-help_at_r-project.org> mailing list
>
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>



Received on Fri 13 Jun 2008 - 01:13:43 GMT
