Re: [R] Scatterplot Showing All Points

From: Antony Unwin <unwin_at_math.uni-augsburg.de>
Date: Tue, 18 Dec 2007 16:01:40 +0100

On 18 Dec 2007, at 2:42 pm, Duncan Murdoch wrote:

>> (I must admit to being very surprised that jittering and
>> sunflower plots have been suggested for a dataset of 5000
>> points. Do those who mentioned these methods have examples on
>> that scale where they are effective?)
>
> Sure. The original post said there were about 50-60 unique
> locations. This plot:
>
> x <- rbinom(5000, 20, 0.15)
> y <- rbinom(5000, 20, 0.15)
> plot(x,y)
>
> has a few more unique locations; tune those probabilities if you
> want it closer. Due to the overlap, the distribution is very
> unclear. But this plot
>
> plot(jitter(x), jitter(y))
>
> makes the distribution quite clear.

No, it doesn't! It is only moderately clearer than the plot without jittering. One good alternative here is the fluctuation diagram variant of a mosaic plot:

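library(iplots)   # imosaic() comes from the iplots package

# treat the integer counts as categories
xx <- as.factor(x)
yy <- as.factor(y)

imosaic(xx, yy, type = "f")   # type "f" is the fluctuation diagram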
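
For readers without the package that provides imosaic(), the same idea can be roughed out in base graphics. This is only a sketch, not the iplots implementation: it draws one square per observed (x, y) combination, with area proportional to the count. The seed, the 0.9 scaling factor and the grey fill are assumptions chosen for illustration.

```r
# Simulated data as in the quoted message; the seed is an assumption
# added so the sketch is reproducible
set.seed(1)
x <- rbinom(5000, 20, 0.15)
y <- rbinom(5000, 20, 0.15)

# Count the observations at each observed (x, y) combination
counts <- as.data.frame(table(x = x, y = y), stringsAsFactors = FALSE)
counts <- counts[counts$Freq > 0, ]
counts$x <- as.numeric(counts$x)
counts$y <- as.numeric(counts$y)

# Side length proportional to sqrt(count), so that area is proportional
# to count; the largest square has side 0.9 and fits inside a unit cell
side <- 0.9 * sqrt(counts$Freq / max(counts$Freq))

# Empty plot first, then one filled square per location
plot(counts$x, counts$y, type = "n", xlab = "x", ylab = "y", asp = 1)
symbols(counts$x, counts$y, squares = side, inches = FALSE,
        add = TRUE, bg = "grey40", fg = "grey40")
```

Because the counts are encoded as area rather than as a cloud of overlapping points, the display does not get any more crowded when the number of observations goes up.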

Jittering is really not to be recommended for categorical data, and its effectiveness will certainly degrade as the dataset gets bigger.
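
To put rough numbers on the scaling argument, here is a small simulation sketch using the same binomial setup as the quoted example. The helper name max_overlap, the seed and the three sample sizes are assumptions for illustration only.

```r
# Hypothetical helper: for n simulated points, report how many distinct
# (x, y) locations occur and how many points pile up on the busiest one
max_overlap <- function(n) {
  x <- rbinom(n, 20, 0.15)
  y <- rbinom(n, 20, 0.15)
  tab <- table(x, y)
  c(n = n, locations = sum(tab > 0), max_per_location = max(tab))
}

set.seed(1)   # assumption: fixed seed so the run is reproducible
overlap <- t(sapply(c(500, 5000, 50000), max_overlap))
print(overlap)
```

The number of distinct locations is bounded (at most 21 x 21 here), while the number of points per location grows roughly in proportion to n. jitter() spreads each location's points over the same small area whatever n is, so the jittered clouds fill in and eventually stop distinguishing dense locations from very dense ones.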

Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis, University of Augsburg,
Germany


R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Tue 18 Dec 2007 - 15:09:57 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 18 Dec 2007 - 16:30:19 GMT.
