# Re: [R] Graph many points without hiding some

From: Peter Langfelder <peter.langfelder_at_gmail.com>
Date: Thu, 31 Mar 2011 00:25:39 -0700

On Wed, Mar 30, 2011 at 10:04 PM, Samuel Dennis <sjdennis3_at_gmail.com> wrote:
> I have a very large dataset with three variables that I need to graph using
> a scatterplot. However I find that the first variable gets masked by the
> other two, so the graph looks entirely different depending on the order of
> variables. Does anyone have any suggestions how to manage this?
>
> This code is an illustration of what I am dealing with:
>
> x <- 10000
> plot(rnorm(x,mean=20),rnorm(x),col=1,xlim=c(16,24))
> points(rnorm(x,mean=21),rnorm(x),col=2)
> points(rnorm(x,mean=19),rnorm(x),col=3)
>
> gives an entirely different looking graph to:
>
> x <- 10000
> plot(rnorm(x,mean=19),rnorm(x),col=3,xlim=c(16,24))
> points(rnorm(x,mean=20),rnorm(x),col=1)
> points(rnorm(x,mean=21),rnorm(x),col=2)
>
> despite being identical in all respects except for the order in which the
> variables are plotted.
>
> I have tried using pch=".", however the colours are very difficult to
> discern. I have experimented with a number of other symbols with no real
> solution.
>
> The only way that appears to work is to iterate the plot with a for loop,
> and progressively add a few numbers from each variable, as below. However
> although I can do this simply with random numbers as I have done here, this
> is an extremely cumbersome method to use with real datasets.
>
> plot(1,1,xlim=c(16,24),ylim=c(-4,4),col="white")
> x <- 100
> for (i in 1:100) {
> points(rnorm(x,mean=19),rnorm(x),col=3)
> points(rnorm(x,mean=20),rnorm(x),col=1)
> points(rnorm(x,mean=21),rnorm(x),col=2)
> }
>
> Is there some function in R that could solve this through automatically
> iterating my data as above, using transparent symbols, or something else? Is
> there some other way of solving this issue that I haven't thought of?

Assume you are plotting variables y1, y2, y3 of the same length against a common x, and you would like to give them the colors c(1,2,3). You can automate the randomization of the plotting order as follows:

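# x is the common x vector; y1, y2, y3 each have the same length as x
n <- length(y1)
y <- c(y1, y2, y3)
xx <- rep(x, 3)
colors <- rep(c(1, 2, 3), each = n)

# random permutation of all 3*n points ('ord' avoids masking base::order)
ord <- sample(3 * n)

plot(xx[ord], y[ord], col = colors[ord])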

I basically combine the y's into a single vector y, with the corresponding values of x stored in xx and the plotting colors in colors, then randomize the plotting order using the sample function.
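
For the data in the original question, where each group has its own x values rather than a common x, the same idea might look roughly like the sketch below. It is only an illustration: the data frame `dat`, the fixed seed, and the use of `adjustcolor()` with `alpha.f = 0.3` for semi-transparent points are assumptions added here, not part of the code above, and transparency only shows on graphics devices that support it.

```r
# Simulated data mirroring the question: three groups, each with its own x.
set.seed(1)                      # assumption: fixed seed so the sketch is reproducible
n <- 10000
dat <- data.frame(
  x   = c(rnorm(n, mean = 19), rnorm(n, mean = 20), rnorm(n, mean = 21)),
  y   = rnorm(3 * n),
  col = rep(c(3, 1, 2), each = n)  # same colour numbers as in the question
)

# Shuffle the rows so that no group is systematically drawn last.
shuffled <- dat[sample(nrow(dat)), ]

# Semi-transparent versions of the palette colours (alpha.f = 0.3 is arbitrary).
pt_col <- adjustcolor(palette()[shuffled$col], alpha.f = 0.3)

plot(shuffled$x, shuffled$y, col = pt_col, pch = 16, cex = 0.5,
     xlim = c(16, 24), xlab = "x", ylab = "y")
```

Keeping x, y and the colour in one data frame means a single row permutation keeps all three aligned, which may be easier to apply to a real dataset than three separate vectors.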

HTH, Peter

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Thu 31 Mar 2011 - 07:29:20 GMT

