Re: [R] Permutations

From: Marc Schwartz <MSchwartz_at_medanalytics.com>
Date: Thu 15 Jul 2004 - 04:49:30 EST

On Wed, 2004-07-14 at 08:06, Rolf Turner wrote:
> In respect of generating random ``restricted'' permutations, it
> occurred to me as I was driving home last night .... If one is going
> to invoke some kind of ``try again if it doesn't work procedure''
> then one might as well keep it simple: Essentially use the rejection
> method. Just generate a random permutation, and then check whether
> it meets the restriction criterion. If yes, return that
> permutation, if not, throw it away and try again.
>
> This will definitely (???) genererate a genuinely random restricted
> permutation. I figured that since a very large fraction of permutations
> are acutally restricted permutions one wouldn't reject much of the
> time, and so the rejection method should be pretty efficient.
>
> I wrote the following code to do the work:
>
> restr.perm2 <- function () {
> #
> okay <- function (x) {
> m1 <- cbind(1:12,rep(1:4,rep(3,4)))
> m2 <- m1[x,]
> all((m2[,1] == m1[,1]) | (m2[,2] != m1[,2]))
> }
> #
> repeat{
> x <- sample(12,12)
> if(okay(x)) return(x)
> }
> }
>
> I'm bothered again, but: I did a little test to see how many tries
> it would take on average. On 1000 attempts I got a mean of 8.44
> tries, with a standard deviation of 7.7610 (standard error of the
> mean = 0.2454). The value of 7.76 is roughly consistent with
> sqrt(1-p.hat)/p.hat = 7.92 that one gets from the geometric
> distribution.
>
> This would indicate that the fraction of ``restricted'' permutations
> is something like p.hat = 1/8.44 = 0.1184834. Which is a lot less
> than the rough estimate of (4.7 x 10^6)/12! approx. = 0.9853 from
> Robert Baskin's back-of-the-envelope calculations.
>
> What's going on/wrong?

Rolf,

I have not quite figured out what the issues are, but I took your approach above and combined it with Gabor's f1a() function that was the result of the recent thread on matching/counting rows between matrices and came up with the following. The basic concept is that we know (or think that we know) what the non-allowable set of intra-block permutations are:

  # Create non-allowable 'intra-block' permutations
  # Need a generalizable way to do this, but
  # good enough for now
  a <- permutations(3, 3, 1:3)
  b <- permutations(3, 3, 4:6)
  c <- permutations(3, 3, 7:9)

  d <- permutations(3, 3, 10:12)
  intra <- rbind(a[-1, ], b[-1, ], c[-1, ], d[-1, ])

'intra' looks like:

> intra

      [,1] [,2] [,3]

 [1,]    1    3    2
 [2,]    2    1    3
 [3,]    2    3    1
 [4,]    3    1    2
 [5,]    3    2    1
 [6,]    4    6    5
 [7,]    5    4    6
 [8,]    5    6    4
 [9,]    6    4    5
[10,]    6    5    4
[11,]    7    9    8
[12,]    8    7    9
[13,]    8    9    7
[14,]    9    7    8
[15,]    9    8    7
[16,]   10   12   11
[17,]   11   10   12
[18,]   11   12   10

[19,] 12 10 11
[20,] 12 11 10

With that in place, we can then take the randomly generated 'x', coerce it into a 3 column matrix by row and use Gabor's function to check for any matches of the rows of 3 in 'x' with the non-allowable permutations in 'intra'.

Thus, the function will assign 'x' to the appropriate row in 'results' (which is pre-allocated based upon the number of runs) if 'x' passes or sets the row to NA's if it does not.

I then use complete.cases() to return only the valid rows. Presumably, some of the randomly generated 'x' vectors could be duplicates, so I also use unique():

restr.perm3 <- function(runs)
{
  results <- matrix(numeric(runs * 12), ncol = 12)

  # use Gabor's function to check for row matches   # between 'x' and 'intra' to filter out in okay()   f1a <- function(a,b,sep=":")
  {

    f <- function(...) paste(..., sep=sep)
    a2 <- do.call("f", as.data.frame(a))
    b2 <- do.call("f", as.data.frame(b))
    c(table(c(b2,unique(a2)))[a2] - 1)
  }

  okay <- function (x)
  {
    x.match <- matrix(x, ncol = 3, byrow = TRUE)     all(f1a(x.match, intra) == 0)
  }

  for (i in 1:runs)
  {
    x <- sample(12,12)
    if (okay(x))
      results[i, ] <- x
    else
      results[i, ] <- rep(NA, 12)
  }

  unique(results[complete.cases(results), ]) }

So, with all that in place, we can then do something like the following:

r <- replicate(10, restr.perm3(1000))

> str(r)

List of 10

 $ : num [1:934, 1:12] 10 6 8 7 7 11 1 1 8 4 ...
 $ : num [1:944, 1:12] 7 4 4 11 1 11 8 3 3 9 ...
 $ : num [1:953, 1:12] 8 4 11 1 11 6 7 11 9 10 ...
 $ : num [1:951, 1:12] 1 2 1 4 4 10 8 10 3 12 ...
 $ : num [1:949, 1:12] 1 7 11 9 2 2 11 7 11 4 ...
 $ : num [1:937, 1:12] 3 3 11 10 4 8 12 10 3 2 ...
 $ : num [1:952, 1:12] 7 3 1 6 4 4 4 11 2 8 ...
 $ : num [1:946, 1:12] 1 9 3 10 11 6 1 7 8 11 ...
 $ : num [1:945, 1:12] 8 9 1 2 8 3 4 1 7 11 ...
 $ : num [1:933, 1:12] 9 1 12 3 4 1 10 1 1 8 ...


So the above would suggest that indeed, the restricted permutations are a fairly high proportion of the total.

I went ahead and did the following 50 replicates of 1000 each:

> system.time(r <- replicate(50, restr.perm3(1000)))
[1] 91.20 0.06 92.98 0.00 0.00

> mean(unlist(lapply(r, nrow)))

[1] 941.52

> sd(unlist(lapply(r, nrow)))

[1] 6.494079

There are likely to be some efficiencies in the function that can be brought to bear, but it is a start.

In either case, the restricted permutations appear to be around 94%, if all of the assumptions are correct.

HTH, Marc Schwartz



R-help@stat.math.ethz.ch mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Thu Jul 15 04:55:47 2004

This archive was generated by hypermail 2.1.8 : Wed 03 Nov 2004 - 22:54:58 EST