From: Marc Schwartz <MSchwartz_at_medanalytics.com>

Date: Thu 15 Jul 2004 - 04:49:30 EST

d <- permutations(3, 3, 10:12)

intra <- rbind(a[-1, ], b[-1, ], c[-1, ], d[-1, ])

[19,] 12 10 11

[20,] 12 11 10

}

R-help@stat.math.ethz.ch mailing list

https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Thu Jul 15 04:55:47 2004

Date: Thu 15 Jul 2004 - 04:49:30 EST

On Wed, 2004-07-14 at 08:06, Rolf Turner wrote:

> In respect of generating random ``restricted'' permutations, it

*> occurred to me as I was driving home last night .... If one is going
**> to invoke some kind of ``try again if it doesn't work procedure''
**> then one might as well keep it simple: Essentially use the rejection
**> method. Just generate a random permutation, and then check whether
**> it meets the restriction criterion. If yes, return that
**> permutation, if not, throw it away and try again.
**>
**> This will definitely (???) genererate a genuinely random restricted
**> permutation. I figured that since a very large fraction of permutations
**> are acutally restricted permutions one wouldn't reject much of the
**> time, and so the rejection method should be pretty efficient.
**>
**> I wrote the following code to do the work:
**>
**> restr.perm2 <- function () {
**> #
**> okay <- function (x) {
**> m1 <- cbind(1:12,rep(1:4,rep(3,4)))
**> m2 <- m1[x,]
**> all((m2[,1] == m1[,1]) | (m2[,2] != m1[,2]))
**> }
**> #
**> repeat{
**> x <- sample(12,12)
**> if(okay(x)) return(x)
**> }
**> }
**>
**> I'm bothered again, but: I did a little test to see how many tries
**> it would take on average. On 1000 attempts I got a mean of 8.44
**> tries, with a standard deviation of 7.7610 (standard error of the
**> mean = 0.2454). The value of 7.76 is roughly consistent with
**> sqrt(1-p.hat)/p.hat = 7.92 that one gets from the geometric
**> distribution.
**>
**> This would indicate that the fraction of ``restricted'' permutations
**> is something like p.hat = 1/8.44 = 0.1184834. Which is a lot less
**> than the rough estimate of (4.7 x 10^6)/12! approx. = 0.9853 from
**> Robert Baskin's back-of-the-envelope calculations.
**>
**> What's going on/wrong?
*

Rolf,

I have not quite figured out what the issues are, but I took your approach above and combined it with Gabor's f1a() function that was the result of the recent thread on matching/counting rows between matrices and came up with the following. The basic concept is that we know (or think that we know) what the non-allowable set of intra-block permutations are:

# Create non-allowable 'intra-block' permutations # Need a generalizable way to do this, but # good enough for now a <- permutations(3, 3, 1:3) b <- permutations(3, 3, 4:6) c <- permutations(3, 3, 7:9)

d <- permutations(3, 3, 10:12)

intra <- rbind(a[-1, ], b[-1, ], c[-1, ], d[-1, ])

'intra' looks like:

> intra

[,1] [,2] [,3]

[1,] 1 3 2 [2,] 2 1 3 [3,] 2 3 1 [4,] 3 1 2 [5,] 3 2 1 [6,] 4 6 5 [7,] 5 4 6 [8,] 5 6 4 [9,] 6 4 5 [10,] 6 5 4 [11,] 7 9 8 [12,] 8 7 9 [13,] 8 9 7 [14,] 9 7 8 [15,] 9 8 7 [16,] 10 12 11 [17,] 11 10 12 [18,] 11 12 10

[19,] 12 10 11

[20,] 12 11 10

With that in place, we can then take the randomly generated 'x', coerce it into a 3 column matrix by row and use Gabor's function to check for any matches of the rows of 3 in 'x' with the non-allowable permutations in 'intra'.

Thus, the function will assign 'x' to the appropriate row in 'results' (which is pre-allocated based upon the number of runs) if 'x' passes or sets the row to NA's if it does not.

I then use complete.cases() to return only the valid rows. Presumably, some of the randomly generated 'x' vectors could be duplicates, so I also use unique():

restr.perm3 <- function(runs)

{

results <- matrix(numeric(runs * 12), ncol = 12)

# use Gabor's function to check for row matches
# between 'x' and 'intra' to filter out in okay()
f1a <- function(a,b,sep=":")

{

f <- function(...) paste(..., sep=sep) a2 <- do.call("f", as.data.frame(a)) b2 <- do.call("f", as.data.frame(b))c(table(c(b2,unique(a2)))[a2] - 1)

}

okay <- function (x)

{

x.match <- matrix(x, ncol = 3, byrow = TRUE)
all(f1a(x.match, intra) == 0)

}

for (i in 1:runs)

{

x <- sample(12,12)

if (okay(x))

results[i, ] <- x

else

results[i, ] <- rep(NA, 12)

}

unique(results[complete.cases(results), ]) }

So, with all that in place, we can then do something like the following:

r <- replicate(10, restr.perm3(1000))

> str(r)

List of 10

$ : num [1:934, 1:12] 10 6 8 7 7 11 1 1 8 4 ... $ : num [1:944, 1:12] 7 4 4 11 1 11 8 3 3 9 ... $ : num [1:953, 1:12] 8 4 11 1 11 6 7 11 9 10 ... $ : num [1:951, 1:12] 1 2 1 4 4 10 8 10 3 12 ... $ : num [1:949, 1:12] 1 7 11 9 2 2 11 7 11 4 ... $ : num [1:937, 1:12] 3 3 11 10 4 8 12 10 3 2 ... $ : num [1:952, 1:12] 7 3 1 6 4 4 4 11 2 8 ... $ : num [1:946, 1:12] 1 9 3 10 11 6 1 7 8 11 ... $ : num [1:945, 1:12] 8 9 1 2 8 3 4 1 7 11 ... $ : num [1:933, 1:12] 9 1 12 3 4 1 10 1 1 8 ...

So the above would suggest that indeed, the restricted permutations are a fairly high proportion of the total.

I went ahead and did the following 50 replicates of 1000 each:

> system.time(r <- replicate(50, restr.perm3(1000)))

[1] 91.20 0.06 92.98 0.00 0.00

> mean(unlist(lapply(r, nrow)))

[1] 941.52

> sd(unlist(lapply(r, nrow)))

[1] 6.494079

There are likely to be some efficiencies in the function that can be brought to bear, but it is a start.

In either case, the restricted permutations appear to be around 94%, if all of the assumptions are correct.

**HTH,
**
Marc Schwartz

R-help@stat.math.ethz.ch mailing list

https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Thu Jul 15 04:55:47 2004

*
This archive was generated by hypermail 2.1.8
: Wed 03 Nov 2004 - 22:54:58 EST
*