Re: [R] Multiset Permutations

From: Gavin Simpson <gavin.simpson_at_ucl.ac.uk>
Date: Sun, 06 Apr 2008 22:59:13 +0100

Hi,

If I understand you correctly, there is beta code within the development version of package 'vegan' on R forge to do this.

install.packages("vegan", repos="http://R-Forge.R-project.org")

Then:

> numPerms(5)

[1] 120
> perms <- allPerms(5, observed = TRUE)
> perms[1:10,]

      [,1] [,2] [,3] [,4] [,5]

 [1,]    1    2    3    5    4
 [2,]    1    2    4    3    5
 [3,]    1    2    4    5    3
 [4,]    1    2    5    3    4
 [5,]    1    2    5    4    3
 [6,]    1    3    2    4    5
 [7,]    1    3    2    5    4
 [8,]    1    3    4    2    5
 [9,]    1    3    4    5    2

[10,] 1 3 5 2 4

As you can see, it returns the ordering of 5 observations when permuted freely. perms is a matrix with some extra attributes.

To use this, loop over the rows of perms. To get the actual permuted vector, use something like the following:

> dat <- c(0,0,1,2,2)
> dat[perms[1,]]

[1] 0 0 1 2 2
> dat[perms[2,]]

[1] 0 0 2 1 2
> dat[perms[3,]]

[1] 0 0 2 2 1

This assumes that the entries in dat are freely exchangeable, which is what I take your output example to be representing.

Be careful to read the help page for allPerms, if n is large you'll need to increase argument 'max'. numPerms(n) where n is the number of entries in dat will tell you how big max needs to be.

Note that you should be able to do:

perms <- allPerms(dat)

but that isn't working at all (did I say this was beta code?) due to a silly bug I introduced by trying to be too clever for my own good. So you have to use allPerms(n) instead.

If this isn't exactly what you wanted, email me back as there may be a way to use the more complicated restricted permutations available in vegan to do what you want.

HTH G

On Sun, 2008-04-06 at 13:57 -0700, Stropharia wrote:
> Dear R users,
>
> I want to perform an exact permutation of multisets. I have looked at the
> coin package, but it doesn't seem to offer what I am looking for. I want to
> perform permutations (exact - without duplications) on multisets of scalars
> (e.g., the multiset 0,0,1,2,2). I want to output all the possible
> permutations into a data frame so that each multiset permutation occupies a
> row (or column). The output would look something like this:
>
> 00122
> 01220
> 01210
> 20201
> 10202
> 12200
> etc...
>
> There seem to be numerous algorithms in the primary statistical literature
> for doing this, but they haven't been implemented in R. I've been searching
> for weeks online and in books for a way of doing this, but the only thing
> i've come across is a Java script posted anonymously on a forum. I don't
> know Java at all, so can't make enough sense of it to try a translation into
> R, but i thought i'd include it below in case someone can glean from this a
> way of doing it in R. Thanks in advance for any help on this, it is greatly
> appreciated.
>
> Here are the original (anonymous) poster's comments and code:
>
> "We definitely need to use recursion. There can be up to n! permutations.
> For example we'll use recursion to generate a tree of n! leaves. Each node
> is a run of the recursive function. The root node has n children, the nodes
> at the second level have each n-1 children, the nodes at the third level
> have n-2 children, etc. Each leaf of this tree will be a permutation. The
> idea is to add an element to the permutation, remove it from the set, and
> call the recursive function on that set. Because it's a multiset i'm using a
> Hashtable to eliminate duplicate entries. A pointer to the Hashtable gets
> passed along with each recursive call.
> The recursive function will know that it's on a "leaf" when the size of the
> set is 0, at which point it will insert its permutation in the Hashtable.
> Since identical permutations map to the same "bucket" in the hashtable, they
> overwrite eachother to leave only unique permutations. Here's the algorithm
> in Java:"
>
> ----------------------------- JAVA CODE --------------------------------
> import java.util.*;
>
> public class Test {
> public static void PermutationsRecursive(char[] s, String p, Hashtable
> perms){
> for(int x=0; x<s.length; x++){
> String np = new String(p);
> char[] ns = new char[s.length-1];
> int y = 0;
> for(int z=0; z<s.length; z++){
> if(z != x) ns[y++] = s[z];
> }
> np = np + s[x];
> if(ns.length == 0) perms.put(np, new Boolean(true));
> else PermutationsRecursive(ns, np, perms);
> }
> }
> public static String[] GetPermutations(char[] in){
> int fact = Factorial(in.length);
> Hashtable perms = new Hashtable(fact);
> PermutationsRecursive(in, "", perms);
> Enumeration e = perms.keys();
> String[] out = new String[perms.size()];
> int i = 0;
> while(e.hasMoreElements()){
> out[i++] = (String) e.nextElement();
> }
> return out;
> }
> private static int Factorial(int n){
> int fact = 1;
> for(int i=2; i<=n; i++){
> fact *= i;
> }
> return fact;
> }
> public static void main(String[] args){
> char[] set = new char[]{'A', 'A', 'B', 'C'};
> String[] perms = GetPermutations(set);
> Arrays.sort(perms, String.CASE_INSENSITIVE_ORDER);
> for(int i=0; i<perms.length; i++){
> System.out.println(perms[i]);
> }
> }
> }
> ------------------------- END JAVA CODE -----------------------------
>
> This code prints the following:
> AABC
> AACB
> ABAC
> ABCA
> ACAB
> ACBA
> BAAC
> BACA
> BCAA
> CAAB
> CABA
> CBAA
>

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Sun 06 Apr 2008 - 23:03:27 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sun 06 Apr 2008 - 23:30:27 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.

list of date sections of archive