Dimitris Rizopoulos

Biostatistical Centre

School of Public Health

Catholic University of Leuven

You could try a simple for() loop over the columns (rather than over the individual cells), e.g.,

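N <- 100   # number of rows
k <- 10    # number of columns (300 in your case)

set.seed(12345)
mat <- matrix(sample(0:1, N * k, TRUE), N, k)   # binary data matrix
key <- sample(letters[1:4], k, TRUE)            # one key letter per column

out <- matrix("", N, k)
unq.key <- unique(key)   # letters to draw the replacements from

for (i in 1:k) {
    ind <- mat[, i] == 1
    # 1's get the key letter of column i
    out[ind, i] <- key[i]
    # 0's get a random letter other than the key letter
    vals <- unq.key[!unq.key %in% key[i]]
    out[!ind, i] <- sample(vals, sum(!ind), TRUE)
}

out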
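
If even the loop over columns turns out to be too slow, a fully vectorised variant is possible. The following is only a sketch under one assumption: the letters always come from a fixed pool (here letters[1:4], as in the original question). The names pool, key.idx, col.idx, shift, alt.idx and out2 are illustrative and do not appear in the code above. The idea is to shift each key letter's position in the pool by a random offset of 1 to 3, wrapping around, so a 0 can never receive the key letter of its column.

```r
set.seed(12345)
N <- 100
k <- 10
pool <- letters[1:4]                              # assumed fixed set of letters
mat <- matrix(sample(0:1, N * k, TRUE), N, k)
key <- sample(pool, k, TRUE)

key.idx <- match(key, pool)                       # position of each column's key in pool
col.idx <- key.idx[col(mat)]                      # key position for every cell (column-major)
shift <- sample(length(pool) - 1, N * k, TRUE)    # random offset 1, 2 or 3 per cell
alt.idx <- (col.idx - 1 + shift) %% length(pool) + 1   # always differs from col.idx

# 1's get the key letter, 0's get one of the three other letters
out2 <- matrix(ifelse(mat == 1, pool[col.idx], pool[alt.idx]), N, k)
```

A quick sanity check is all(out2[mat == 0] != key[col(mat)][mat == 0]), which should return TRUE.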

Address: Kapucijnenvoer 35, Leuven, Belgium

Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm

From: "Doran, Harold" <HDoran_at_air.org>
Sent: Tuesday, March 04, 2008 4:33 PM
Subject: [R] Sampling letters

> I have a binary matrix of size N x 300. I then create the following:
>
> > set.seed(1234)
> > (key_file <- sample(letters[1:4], 300, replace=TRUE))
>   [1] "a" "c" "c" "c" "d" "c" "a" "a" "c" "c" "c" "c" "b" "d" "b" "d" "b" "b" "a" "a" "b" "b" "a"
>  [24] "a" "a" "d" "c" "d" "d" "a" "b" "b" "b" "c" "a" "d" "a" "b" "d" "d" "c" "c" "b" "c" "b" "c"
>  [47] "c" "b" "a" "d" "a" "b" "c" "c" "a" "c" "b" "d" "a" "d" "d" "a" "b" "a" "a" "c" "b" "c" "a"
>  [70] "c" "a" "d" "a" "d" "a" "c" "b" "a" "b" "c" "d" "b" "a" "c" "a" "d" "b" "b" "a" "d" "a" "d"
>  [93] "a" "a" "a" "c" "b" "a" "b" "c" "a" "c" "b" "a" "a" "b" "a" "a" "b" "a" "c" "a" "d" "a" "a"
> [116] "d" "d" "b" "a" "d" "c" "d" "d" "d" "b" "b" "b" "c" "b" "b" "d" "c" "a" "b" "d" "b" "d" "c"
> [139] "c" "d" "c" "d" "b" "b" "b" "c" "c" "c" "d" "c" "b" "a" "a" "d" "a" "d" "c" "d" "b" "c" "b"
> [162] "c" "b" "a" "a" "c" "b" "a" "d" "b" "d" "c" "c" "b" "b" "d" "b" "c" "a" "b" "b" "b" "c" "a"
> [185] "d" "a" "d" "c" "b" "c" "c" "d" "a" "d" "d" "d" "d" "c" "d" "c" "c" "c" "b" "d" "c" "c" "b"
> [208] "b" "a" "d" "c" "b" "a" "d" "c" "d" "c" "c" "b" "d" "b" "a" "b" "b" "a" "b" "d" "b" "c" "b"
> [231] "d" "c" "a" "d" "c" "a" "c" "b" "b" "d" "b" "a" "a" "c" "d" "b" "d" "c" "d" "d" "c" "c" "b"
> [254] "b" "a" "c" "b" "a" "c" "c" "d" "a" "c" "b" "a" "a" "c" "a" "a" "c" "b" "d" "b" "d" "a" "c"
> [277] "d" "c" "b" "b" "b" "b" "d" "d" "c" "b" "b" "b" "c" "d" "c" "b" "d" "a" "c" "d" "c" "a" "c"
> [300] "b"
>
> I now replace all 1's in column 1 with key_file[1], I replace all 1's in
> column 2 with key_file[2] and so on through column 300. This part is
> simple.
>
> Now, I want to replace the 0's in column 1 with either b, c, or d, but
> not with an a since that was used to replace the 1's. For column 2 I
> want to replace all 0's with either a, b, or d but not with c since that
> was used to replace the 1's.
>
> However, I do not want all 0's in column 1 to be the same letter. That
> is, I would not want them all to be replaced with a 'b'. Rather, I want
> to randomly recode the 0's as either b, c, or d. So, some 0's will be
> recoded as b, some as c, and some as d.
>
> If I were replacing the zeros with the same letter, this would be a
> simple ifelse command. But, because I want randomness I'm not sure how I
> can do this other than a costly loop that goes through the data matrix
> cell-by-cell and does some replacement. That would be fine, but very
> time consuming.
>
> Does anyone have thoughts on how else I could tackle this?
>
> Harold
