From: Dimitris Rizopoulos <dimitris.rizopoulos_at_med.kuleuven.be>

Date: Tue, 04 Mar 2008 16:51:58 +0100

Dimitris Rizopoulos

Biostatistical Centre

School of Public Health

Catholic University of Leuven

R-help_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-help

PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

Received on Tue 04 Mar 2008 - 15:57:46 GMT

You could try a simple for() loop over the columns, e.g.:

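N <- 100   # number of rows
k <- 10    # number of columns
set.seed(12345)

# example binary matrix and one key letter per column
mat <- matrix(sample(0:1, N * k, TRUE), N, k)
key <- sample(letters[1:4], k, TRUE)

out <- matrix("", N, k)
unq.key <- unique(key)

for (i in seq_len(k)) {
    ind <- mat[, i] == 1
    # 1's get the key letter of column i
    out[ind, i] <- key[i]
    # 0's get a random draw from the remaining letters
    vals <- unq.key[!unq.key %in% key[i]]
    out[!ind, i] <- sample(vals, sum(!ind), TRUE)
}

out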
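If the loop over columns ever becomes a bottleneck, the same recoding can probably be done without a loop. What follows is only a sketch under assumptions that are not part of the code above: the helper name recode_matrix() and its pool argument are hypothetical, and the replacement letters are drawn from the full pool letters[1:4] rather than from unique(key). It draws one random "other" letter for every cell and then overwrites the cells that hold a 1 with the column's key letter.

```r
# Hypothetical helper (an illustration, not the code from the post above):
# 1's become the column's key letter, 0's become a random *other* letter.
recode_matrix <- function(mat, key, pool = letters[1:4]) {
  stopifnot(ncol(mat) == length(key),
            all(key %in% pool),
            length(pool) >= 2)
  n <- nrow(mat)
  k <- ncol(mat)
  n_alt <- length(pool) - 1L

  # position of each column's key letter in the pool, repeated down the column
  key_pos <- as.vector(matrix(match(key, pool), n, k, byrow = TRUE))

  # one draw per cell from 1..n_alt; draws at or above the key's position
  # are shifted up by one, so the key letter itself is never chosen
  r <- sample.int(n_alt, n * k, replace = TRUE)
  alt_pos <- r + (r >= key_pos)

  out <- matrix(pool[alt_pos], n, k)

  # overwrite the cells that held a 1 with the key letter of their column
  is_one <- as.vector(mat == 1)
  out[is_one] <- pool[key_pos][is_one]
  out
}

# usage with the same kind of example data as above
set.seed(12345)
N <- 100
k <- 10
mat <- matrix(sample(0:1, N * k, TRUE), N, k)
key <- sample(letters[1:4], k, TRUE)
out2 <- recode_matrix(mat, key)
```

Because the random draws are made in a different order, out2 will not match the loop's result cell for cell, even with the same seed; it should only satisfy the same constraints. Drawing a letter for every cell wastes the draws for the 1's, but keeps everything in a few vector operations.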

I hope it helps.

Best,

Dimitris

Dimitris Rizopoulos

Biostatistical Centre

School of Public Health

Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium

Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm

----- Original Message -----
From: "Doran, Harold" <HDoran_at_air.org>
To: <r-help_at_r-project.org>
Sent: Tuesday, March 04, 2008 4:33 PM
Subject: [R] Sampling letters

> I have a binary matrix of size N x 300. I then create the following:
>
> > set.seed(1234)
> > (key_file <- sample(letters[1:4], 300, replace=TRUE))
> [1] "a" "c" "c" "c" "d" "c" "a" "a" "c" "c" "c" "c" "b" "d" "b" "d" "b" "b" "a" "a" "b" "b" "a"
> [24] "a" "a" "d" "c" "d" "d" "a" "b" "b" "b" "c" "a" "d" "a" "b" "d" "d" "c" "c" "b" "c" "b" "c"
> [47] "c" "b" "a" "d" "a" "b" "c" "c" "a" "c" "b" "d" "a" "d" "d" "a" "b" "a" "a" "c" "b" "c" "a"
> [70] "c" "a" "d" "a" "d" "a" "c" "b" "a" "b" "c" "d" "b" "a" "c" "a" "d" "b" "b" "a" "d" "a" "d"
> [93] "a" "a" "a" "c" "b" "a" "b" "c" "a" "c" "b" "a" "a" "b" "a" "a" "b" "a" "c" "a" "d" "a" "a"
> [116] "d" "d" "b" "a" "d" "c" "d" "d" "d" "b" "b" "b" "c" "b" "b" "d" "c" "a" "b" "d" "b" "d" "c"
> [139] "c" "d" "c" "d" "b" "b" "b" "c" "c" "c" "d" "c" "b" "a" "a" "d" "a" "d" "c" "d" "b" "c" "b"
> [162] "c" "b" "a" "a" "c" "b" "a" "d" "b" "d" "c" "c" "b" "b" "d" "b" "c" "a" "b" "b" "b" "c" "a"
> [185] "d" "a" "d" "c" "b" "c" "c" "d" "a" "d" "d" "d" "d" "c" "d" "c" "c" "c" "b" "d" "c" "c" "b"
> [208] "b" "a" "d" "c" "b" "a" "d" "c" "d" "c" "c" "b" "d" "b" "a" "b" "b" "a" "b" "d" "b" "c" "b"
> [231] "d" "c" "a" "d" "c" "a" "c" "b" "b" "d" "b" "a" "a" "c" "d" "b" "d" "c" "d" "d" "c" "c" "b"
> [254] "b" "a" "c" "b" "a" "c" "c" "d" "a" "c" "b" "a" "a" "c" "a" "a" "c" "b" "d" "b" "d" "a" "c"
> [277] "d" "c" "b" "b" "b" "b" "d" "d" "c" "b" "b" "b" "c" "d" "c" "b" "d" "a" "c" "d" "c" "a" "c"
> [300] "b"
>
> I now replace all 1's in column 1 with key_file[1], I replace all 1's
> in column 2 with key_file[2] and so on through column 300. This part
> is simple.
>
> Now, I want to replace the 0's in column 1 with either b, c, or d, but
> not with an a since that was used to replace the 1's. For column 2 I
> want to replace all 0's with either a, b, or d but not with c since
> that was used to replace the 1's.
>
> However, I do not want all 0's in column 1 to be the same letter. That
> is, I would not want them all to be replaced with a 'b'. Rather, I
> want to randomly recode the 0's as either b, c, or d. So, some 0's
> will be recoded as b, some as c, and some as d.
>
> If I were replacing the zeros with the same letter, this would be a
> simple ifelse command. But, because I want randomness I'm not sure how
> I can do this other than a costly loop that goes through the data
> matrix cell-by-cell and does some replacement. That would be fine, but
> very time consuming.
>
> Does anyone have thoughts on how else I could tackle this?
>
> Harold
