Re: [R] Generating groupings of ordered observations

From: Gavin Simpson <gavin.simpson_at_ucl.ac.uk>
Date: Sat, 21 Jun 2008 20:30:02 +0100

Dear Chuck,

That is, quite simply, superb! The members of this list never cease to amaze me, and your response is no exception.

As an aside, does this type of problem come under a particular name or topic in mathematics?

The ordering need not be exactly as I had it in my example --- that just represented the way I had tried to look at the problem using pen and paper (!)

For the record and the archives, the following achieves a slightly different ordering to the one in my example, but which is equally usable for my application; ordered in terms of number of groups.

> n <- 6
> base10 <- seq(0, length=2^(n-1) )
> base2.bits <- outer(0:(n-2), base10, function(y,x) ( x %/% (2^y)) %%2 )
> res <- sapply(apply( base2.bits==1, 2, which ), function(x) rep(1:(1+length(x)), diff(c(0,x,n))))
> ord <- order(apply(res, 2, function(x) length(unique(x))))
> res[,ord]

And your pointing me to choose(n-1, 0:(n-1)) as the basis to determine the number of possible combinations, allows me to break the result (res[,ord]) down into the chunks I need to work with.

Many thanks,

Gavin

On Sat, 2008-06-21 at 11:30 -0700, Charles C. Berry wrote:
> On Sat, 21 Jun 2008, Gavin Simpson wrote:
>
> > Dear List,
> >
> > I have a problem I'm finding it difficult to make headway with.
> >
> > Say I have 6 ordered observations, and I want to find all combinations
> > of splitting these 6 ordered observations in g groups, where g = 1, ...,
> > 6. Groups can only be formed by adjacent observations, so observations 1
> > and 4 can't be in a group on their own, only if 1,2,3&4 are all in the
> > group.
> >
>
> Right. And in the example below there are 32 distinct patterns.
>
> Which arises from sum( choose( 5, 0:5 ) ) different placements of 0:5
> split positions.
>
> You can represent the splits as a binary number with n-1 bits: 00000
> implies no splits, 10000 implies a split between 1 and 2, 10100 implies
> splits between 1 and 2 and between 3 and 4, et cetera.
>
> So, 32 arises as 2^5, too.
>
> Something like this:
>
> > base10 <- seq(0, length=2^(n-1) )
> > base2.bits <- outer(0:(n-2), base10, function(y,x) ( x %/% (2^y)) %%2 )
> > sapply(apply( base2.bits==1, 2, which ), function(x) rep(1:(1+length(x)), diff(c(0,x,n))))
>
> Getting this in the same column order as your example is left as an
> exercise for the reader.
>
> HTH,
>
> Chuck
>
> > For example, with 6 observations, the columns of the matrices below
> > represent the groups that can be formed by placing the 6 ordered
> > observations into 2-5 groups. Think of the columns of these matrices as
> > being an indicator of group membership. We then cbind these matrices
> > with the trivial partitions into 1 and 6 groups:
> >
> > mat2g <- matrix(c(1,1,1,1,1,
> > 2,1,1,1,1,
> > 2,2,1,1,1,
> > 2,2,2,1,1,
> > 2,2,2,2,1,
> > 2,2,2,2,2),
> > nrow = 6, ncol = 5, byrow = TRUE)
> >
> > mat3g <- matrix(c(1,1,1,1,1,1,1,1,1,1,
> > 2,2,2,2,1,1,1,1,1,1,
> > 3,2,2,2,2,2,2,1,1,1,
> > 3,3,2,2,3,2,2,2,2,1,
> > 3,3,3,2,3,3,2,3,2,2,
> > 3,3,3,3,3,3,3,3,3,3),
> > nrow = 6, ncol = 10, byrow = TRUE)
> >
> > mat4g <- matrix(c(1,1,1,1,1,1,1,1,1,1,
> > 2,2,2,2,2,2,1,1,1,1,
> > 3,3,3,2,2,2,2,2,2,1,
> > 4,3,3,3,3,2,3,3,2,2,
> > 4,4,3,4,3,3,4,3,3,3,
> > 4,4,4,4,4,4,4,4,4,4),
> > nrow = 6, ncol = 10, byrow = TRUE)
> >
> > mat5g <- matrix(c(1,1,1,1,1,
> > 2,2,2,2,1,
> > 3,3,3,2,2,
> > 4,4,3,3,3,
> > 5,4,4,4,4,
> > 5,5,5,5,5),
> > nrow = 6, ncol = 5, byrow = TRUE)
> >
> > cbind(rep(1,6), mat2g, mat3g, mat4g, mat5g, 1:6)
> >
> > I'd like to be able to do this automagically, for any (reasonable,
> > small, say n = 10-20) number of observations, n, and for g = 1, ..., n
> > groups.
> >
> > I can't see the pattern here or a way forward. Can anyone suggest an
> > approach?
> >
> > Thanks in advance,
> >
> > Gavin
> >
> > --
> > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> > Dr. Gavin Simpson [t] +44 (0)20 7679 0522
> > ECRC, UCL Geography, [f] +44 (0)20 7679 0565
> > Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
> > Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/
> > UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
> > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> >
> > ______________________________________________
> > R-help_at_r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
> Charles C. Berry (858) 534-2098
> Dept of Family/Preventive Medicine
> E mailto:cberry_at_tajo.ucsd.edu UC San Diego
> http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901
>
>

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Sat 21 Jun 2008 - 19:36:27 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sat 21 Jun 2008 - 22:30:54 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.

list of date sections of archive