Re: [R] Scatterplot Showing All Points

From: Duncan Murdoch <murdoch_at_stats.uwo.ca>
Date: Tue, 18 Dec 2007 13:05:03 -0500

On 12/18/2007 12:44 PM, Antony Unwin wrote:
> On 18 Dec 2007, at 4:49 pm, Duncan Murdoch wrote:
>
>>> One good alternative here is the fluctuation diagram variant of a
>>> mosaic plot:
>>> xx<-as.factor(x)
>>> yy<-as.factor(y)
>>> imosaic(xx,yy, type="f")
>>
>> That plot is better than jittering, but there's the problem in the  
>> mosaic plot of understanding the scale of the rectangles:  is it  
>> area or diameter that encodes the count?
>
> Area is used.
>
>> With a jittered plot, you lose resolution when the number of points  
>> gets too high because you just see a mess of ink, but at least you  
>> only require the viewer to count in order to get a close numerical  
>> reading from the plot.
>
> If someone needs a count, they should be given a table. Graphics
> are for qualitative conclusions, not details. Anyway, counting will
> only work for really small datasets.
>
>> I could also claim that while imperfect, at least jittering is  
>> widely applicable.  For example, if the data were not on a regular  
>> grid, perhaps because they had been generated like this:
>>
>> xloc <- rnorm(50)
>> yloc <- rnorm(50)
>> index <- sample(1:50, 5000, replace=TRUE, prob = abs(xloc))
>> x <- xloc[index]
>> y <- yloc[index]
>>
>> then jittering still works as well (or as poorly), but the imosaic
>> would not work at all.
>
> That's right and that's (almost) the sort of example I was thinking
> of. For a limited number of locations like this a bubble plot would
> be best (which has already been suggested in this thread, I think).
> For many locations and few replications I would still go for varying
> pointsize and transparency.
>
> Incidentally, to check your suggestion I ran your code and discovered
> that the transparency in iplot does not seem to like replications.
> Very strange, we'll have to check why. I then looked closely at the
> numbers of replications generated and discovered that case 25 was
> picked 325 times and case 40 only once. Rather too extreme for my
> liking! Running it again gave very similar results, though not
> exactly the same: this time it was 325 times for case 25 and case 40
> was not picked at all. Other numbers varied slightly. This is not
> what I expected. Any ideas?

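The sampling probabilities are proportional to abs(xloc), so the
expected counts vary from case to case by the same ratio. abs(xloc)
typically varies by a factor of about 100 from smallest to largest,
but sometimes the small end is really small, and so the ratio is
really big. A case with a value that close to zero will hardly ever
be picked.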
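
As a rough check of this explanation, here is a minimal sketch in R. It
is an illustration only: the seed is arbitrary, and the names w,
expected, observed and ratios are made up for this example. It compares
the counts each case receives with the counts implied by weights
proportional to abs(xloc), and looks at how the largest-to-smallest
weight ratio behaves over repeated sets of 50 normal draws.

```r
# Sketch: how uneven are sampling weights proportional to abs(xloc)?
set.seed(1)          # arbitrary seed, used only for reproducibility
n_cases <- 50
n_draws <- 5000

xloc <- rnorm(n_cases)
w <- abs(xloc)       # unnormalised sampling weights, as in the thread

# Expected number of times each case is picked when prob = abs(xloc)
expected <- n_draws * w / sum(w)

# Observed counts from one run of the sampling step
index <- sample(seq_len(n_cases), n_draws, replace = TRUE, prob = w)
observed <- tabulate(index, nbins = n_cases)

# Ratio of the largest to the smallest weight in this particular draw
weight_ratio <- max(w) / min(w)

# The same ratio over many independent sets of 50 normal draws
ratios <- replicate(1000, {
  a <- abs(rnorm(n_cases))
  max(a) / min(a)
})

print(weight_ratio)
print(range(observed))
print(round(range(expected), 1))
print(quantile(ratios, c(0.1, 0.5, 0.9)))
```

Comparing range(observed) with range(expected) shows whether one very
frequent case and one almost absent case is simply what the weights
imply. The quantiles of ratios show how far the ratio can stray from
its typical size.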

Duncan Murdoch

>
>> P.S. iplots 1.1-1 may have an init problem in Windows: in my first  
>> attempt, the plot made the boxes too large to fit in their cells,  
>> but it fixed itself when I resized the window, and the bug doesn't  
>> seem to be repeatable.
>
> Thanks. This happens occasionally on the Mac too. Refreshing solves
> it in practice, but we need to find out why it can happen (and stop
> it happening!).
>
> Antony Unwin
> Professor of Computer-Oriented Statistics and Data Analysis,
> University of Augsburg,
> Germany


R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Tue 18 Dec 2007 - 18:14:21 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 18 Dec 2007 - 18:30:21 GMT.
