Re: [R] Scatterplot Showing All Points

From: Antony Unwin <unwin_at_math.uni-augsburg.de>
Date: Tue, 18 Dec 2007 18:44:23 +0100

On 18 Dec 2007, at 4:49 pm, Duncan Murdoch wrote:

>> One good alternative here is the fluctuation diagram variant of a
>> mosaic plot:
>> xx<-as.factor(x)
>> yy<-as.factor(y)
>> imosaic(xx,yy, type="f")
>
> That plot is better than jittering, but there's the problem in the
> mosaic plot of understanding the scale of the rectangles: is it
> area or diameter that encodes the count?

Area is used.
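
To make the area encoding concrete, here is a minimal base-R sketch. It is not iplots code and does not claim to reproduce how imosaic draws its rectangles: it only shows that taking each square's side proportional to the square root of the count makes its area proportional to the count. The grid data and the helper name `fluct_sides` are made up for illustration.

```r
# Hypothetical helper (not part of iplots): side length per cell,
# chosen so that the AREA of each square is proportional to its count.
fluct_sides <- function(counts, max_side = 0.9) {
  max_side * sqrt(counts / max(counts))
}

set.seed(1)                                 # arbitrary seed
x <- sample(1:5, 500, replace = TRUE)       # made-up data on a regular grid
y <- sample(1:4, 500, replace = TRUE)
tab <- as.data.frame(table(x = x, y = y))   # one row per grid cell, count in Freq
tab$side <- fluct_sides(tab$Freq)

# Draw only in an interactive session; squares are in user coordinates.
if (interactive()) {
  symbols(as.integer(tab$x), as.integer(tab$y), squares = tab$side,
          inches = FALSE, bg = "grey40", xlab = "x", ylab = "y")
}
```

Doubling a count multiplies the side by sqrt(2), not by 2, which is what a reader needs to know to compare the rectangles.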

> With a jittered plot, you lose resolution when the number of points
> gets too high because you just see a mess of ink, but at least you
> only require the viewer to count in order to get a close numerical
> reading from the plot.

If someone needs a count, they should be given a table. Graphics are for qualitative conclusions, not details. Anyway, counting will only work for really small datasets.

> I could also claim that while imperfect, at least jittering is
> widely applicable. For example, if the data were not on a regular
> grid, perhaps because they had been generated like this:
>
> xloc <- rnorm(50)
> yloc <- rnorm(50)
> index <- sample(1:50, 5000, rep=TRUE, prob = abs(xloc))
> x <- xloc[index]
> y <- yloc[index]
>
> then jittering still works as well (or as poorly), but the imosaic
> would not work at all.

That's right, and that's (almost) the sort of example I was thinking of. For a limited number of locations like this, a bubble plot would be best (which has already been suggested in this thread, I think). For many locations and few replications, I would still go for varying point size and transparency.
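
A rough base-R sketch of the bubble plot idea, using the data-generating code quoted above (with `replace` spelled out and an arbitrary seed added). It uses plain `symbols()` and `plot()`, not iplots. The radius is taken proportional to the square root of the count so that circle area tracks the number of replications, and the alpha values are arbitrary choices.

```r
set.seed(42)                                   # arbitrary seed
xloc <- rnorm(50)
yloc <- rnorm(50)
index <- sample(1:50, 5000, replace = TRUE, prob = abs(xloc))
counts <- tabulate(index, nbins = 50)          # replications per location

# Radius ~ sqrt(count), so circle area ~ count.
radius <- sqrt(counts / max(counts))
keep <- counts > 0                             # skip locations never drawn

if (interactive()) {
  # Bubble plot: one semi-transparent circle per distinct location.
  symbols(xloc[keep], yloc[keep], circles = radius[keep], inches = 0.25,
          bg = rgb(0, 0, 1, alpha = 0.3), xlab = "x", ylab = "y")
  # Alternative: plot every point with strong transparency, so that
  # replicated locations show up darker.
  plot(xloc[index], yloc[index], pch = 16, col = rgb(0, 0, 0, alpha = 0.02),
       xlab = "x", ylab = "y")
}
```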

Incidentally, to check your suggestion I ran your code and discovered that the transparency in iplot does not seem to like replications. Very strange; we'll have to check why. I then looked closely at the numbers of replications generated and discovered that case 25 was picked 325 times and case 40 only once. Rather too extreme for my liking! Running it again gave very similar results, though not exactly the same: this time case 25 was again picked 325 times and case 40 was not picked at all. Other numbers varied slightly. This is not what I expected. Any ideas?
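
One possible reading, offered as a guess rather than a diagnosis: `sample()` treats `prob` as a vector of weights, so with `prob = abs(xloc)` each location is drawn in proportion to |xloc|. A location with an x value far from zero would then be picked often, and one with x close to zero hardly ever. If only the `sample()` line is rerun with the same `xloc`, the counts should come out similar each time. The sketch below compares observed counts with the expected counts 5000 * |xloc| / sum(|xloc|); the seed is arbitrary, so the actual numbers will differ from those reported above.

```r
set.seed(2007)                          # arbitrary seed, for reproducibility only
xloc <- rnorm(50)
w <- abs(xloc)                          # sampling weights, as in the quoted code
expected <- 5000 * w / sum(w)           # expected replications per location

index <- sample(1:50, 5000, replace = TRUE, prob = w)
observed <- tabulate(index, nbins = 50)

# Show the locations with the smallest and largest weight.
ord <- order(w)
print(round(cbind(weight = w, expected, observed)[ord[c(1, 50)], ], 2))
```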

> P.S. iplots 1.1-1 may have an init problem in Windows: in my first
> attempt, the plot made the boxes too large to fit in their cells,
> but it fixed itself when I resized the window, and the bug doesn't
> seem to be repeatable.

Thanks. This happens occasionally on the Mac too. Refreshing solves it in practice, but we need to find out why it can happen (and stop it happening!).

Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis, University of Augsburg, Germany

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Tue 18 Dec 2007 - 17:50:41 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 18 Dec 2007 - 18:30:21 GMT.