From: <greg.kochanski_at_phon.ox.ac.uk>

Date: Sun 29 Jan 2006 - 18:16:24 GMT

R-devel@r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-devel Received on Mon Jan 30 05:24:27 2006

Full_Name: Greg Kochanski

Version: 2.2.1

OS: Debian Linux (testing)

Submission from: (NULL) (212.159.16.190)

This is really a feature request.

When you call mosaicplot() on a data set in which several adjacent rows have small probability, the labels for those rows are plotted on top of each other.

This situation can be improved by calling mosaicplot()
with a large value of "off", but sometimes even off=50
(the largest allowable value) isn't sufficient,
especially if the labels are several characters long.

The problem exists even if the labels don't overlap,
because one needs space between the labels to avoid
confusion. For instance, labels "L*H", "!H*", and
"L%" when too close together turn into
"L*H!H*L%", which is confusing to anyone.

The problem could be solved by breaking the assumption that
the label position need always be exactly matched to the
graphic. This is OK, especially for rows, because
(a) the graphical blocks that are part of a single row
    aren't aligned with each other anyway, and
(b) if you can read the labels, you can generally
    match things up by counting.

One reasonably nice way to do this is to position the labels so as to minimize the sum of squared errors between each label's center and the average position of the blocks on that row, subject to the constraint that the labels do not overlap.

This problem is actually not too hard to solve: it is essentially Kruskal's algorithm for finding a best-fit monotonic sequence, i.e. monotone (isotonic) regression (which probably exists on CRAN already).

Neglecting edge effects, assume you have a
vector of desired positions z, and
a vector of minimum widths for each label w.
Then, you can compute the space used up by
the labels: s[i] = -0.5*w[1] + sum(w[j] for j < i) + 0.5*w[i]
and compute y = M(z-s) + s
where M() gives the best-fit monotonically nondecreasing
fit to its argument. y should then be the correct
place to put each label.
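As a rough illustration of the recipe above, here is a minimal Python sketch (not an R patch). It assumes z is already in row order and that M() is an unweighted least-squares monotone fit, implemented here with the pool-adjacent-violators method. The names monotone_fit and place_labels are made up for this example. Since M(z-s) is nondecreasing, adjacent entries of y differ by at least s[i+1] - s[i] = 0.5*(w[i] + w[i+1]), which is exactly the non-overlap condition.

```python
from typing import List, Sequence


def monotone_fit(values: Sequence[float]) -> List[float]:
    """Least-squares nondecreasing fit (pool-adjacent-violators)."""
    # Each block is [sum of values, count]; its fitted value is the mean.
    blocks: List[List[float]] = []
    for v in values:
        blocks.append([float(v), 1])
        # Merge backwards while the previous block's mean exceeds this one's.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted: List[float] = []
    for total, count in blocks:
        fitted.extend([total / count] * int(count))
    return fitted


def place_labels(z: Sequence[float], w: Sequence[float]) -> List[float]:
    """Return label centres y = M(z - s) + s, ignoring edge effects."""
    if len(z) != len(w):
        raise ValueError("z and w must have the same length")
    # s[i] is the centre of label i when the labels are packed edge to
    # edge, measured from the centre of the first label (so s[0] == 0).
    s: List[float] = []
    left_edge = -0.5 * w[0] if w else 0.0
    for width in w:
        s.append(left_edge + 0.5 * width)
        left_edge += width
    fitted = monotone_fit([zi - si for zi, si in zip(z, s)])
    return [fi + si for fi, si in zip(fitted, s)]
```

For example, place_labels([0.0, 1.0, 1.5, 10.0], [2.0, 2.0, 2.0, 2.0]) spreads the first three crowded labels exactly two units apart around their common mean and leaves the fourth, which has plenty of room, at its desired position.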

If there's a likelihood of getting a patch accepted, I could probably supply one.

(Given the opportunity, I'd think about shifting the blocks
up and down also, to do an overall alignment.)

