From: Marc Schwartz <marc_schwartz_at_comcast.net>

Date: Fri, 07 Mar 2008 16:52:09 -0600

R-help_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Fri 07 Mar 2008 - 22:55:29 GMT

Date: Fri, 07 Mar 2008 16:52:09 -0600

Alexey Shipunov wrote:

> Dear list,

*>
**> I just made a very simple mistake, but it was hard to spot. And I
**> think that I should warn other people, because it is probably so
**> simple to make...
**>
**> === R code ===
**>
**> # Let us create a matrix:
**> (a <- cbind(c(0,1,1), rep(1,3)))
**>
**> # [,1] [,2]
**> # [1,] 0 1
**> # [2,] 1 1
**> # [3,] 1 1
**>
**> # That is a MISTAKE:
**> a/colSums(a)
**>
**> # [,1] [,2]
**> # [1,] 0.0000000 0.3333333
**> # [2,] 0.3333333 0.5000000
**> # [3,] 0.5000000 0.3333333
**> # I just wonder if some R warning should be issued here?
**>
**> # That is what I actually needed (column-wise frequencies):
**> t(t(a)/colSums(a))
**>
**> # [,1] [,2]
**> # [1,] 0.0 0.3333333
**> # [2,] 0.5 0.3333333
**> # [3,] 0.5 0.3333333
**>
**> === end of R code ===
**>
**> With best wishes and regards,
**> Alexey Shipunov
*

or more simply:

> prop.table(a, 2)

[,1] [,2]

[1,] 0.0 0.3333333 [2,] 0.5 0.3333333 [3,] 0.5 0.3333333

See ?prop.table

You need to recognize how matrices are stored in R. They are vectors with a 'dim' attribute. Thus 'a' is essentially:

> as.vector(a)

[1] 0 1 1 1 1 1

If I take that vector, as say 'x':

x <- as.vector(a)

> x

[1] 0 1 1 1 1 1

dim(x) <- c(3, 2)

> x

[,1] [,2]

[1,] 0 1 [2,] 1 1 [3,] 1 1

When you do the division, since colSums(a) contains two values, they are recycled as required to get the result. Thus colSums(a) effectively becomes:

> rep(colSums(a), 3)

[1] 2 3 2 3 2 3

so that it is equal in length to 'a'.

The result is then:

> as.vector(a) / rep(colSums(a), 3)

[1] 0.0000000 0.3333333 0.5000000 0.3333333 0.5000000 0.3333333

which is then returned as a matrix with the original dimensions:

> matrix(as.vector(a) / rep(colSums(a), 3), 3, 2)

[,1] [,2]

[1,] 0.0000000 0.3333333 [2,] 0.3333333 0.5000000 [3,] 0.5000000 0.3333333

Another option simply for the sake of example, is:

> t(apply(a, 1, function(x) x / colSums(a)))

[,1] [,2]

[1,] 0.0 0.3333333 [2,] 0.5 0.3333333 [3,] 0.5 0.3333333

prop.table() actually uses sweep(), so you might also want to look at that.

So the key to understanding the behavior is to understand how R objects are stored and how recycling rules work.

**HTH,
**
Marc Schwartz

R-help_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Fri 07 Mar 2008 - 22:55:29 GMT

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.2.0, at Fri 07 Mar 2008 - 23:30:20 GMT.

*
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help.
Please read the posting
guide before posting to the list.
*