From: Peter Langfelder <peter.langfelder_at_gmail.com>

Date: Thu, 31 Mar 2011 00:25:39 -0700

R-help_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Thu 31 Mar 2011 - 07:29:20 GMT


On Wed, Mar 30, 2011 at 10:04 PM, Samuel Dennis <sjdennis3_at_gmail.com> wrote:

> I have a very large dataset with three variables that I need to graph using
> a scatterplot. However I find that the first variable gets masked by the
> other two, so the graph looks entirely different depending on the order of
> variables. Does anyone have any suggestions how to manage this?
>
> This code is an illustration of what I am dealing with:
>
> x <- 10000
> plot(rnorm(x,mean=20),rnorm(x),col=1,xlim=c(16,24))
> points(rnorm(x,mean=21),rnorm(x),col=2)
> points(rnorm(x,mean=19),rnorm(x),col=3)
>
> gives an entirely different looking graph to:
>
> x <- 10000
> plot(rnorm(x,mean=19),rnorm(x),col=3,xlim=c(16,24))
> points(rnorm(x,mean=20),rnorm(x),col=1)
> points(rnorm(x,mean=21),rnorm(x),col=2)
>
> despite being identical in all respects except for the order in which the
> variables are plotted.
>
> I have tried using pch=".", however the colours are very difficult to
> discern. I have experimented with a number of other symbols with no real
> solution.
>
> The only way that appears to work is to iterate the plot with a for loop,
> and progressively add a few numbers from each variable, as below. However
> although I can do this simply with random numbers as I have done here, this
> is an extremely cumbersome method to use with real datasets.
>
> plot(1,1,xlim=c(16,24),ylim=c(-4,4),col="white")
> x <- 100
> for (i in 1:100) {
> points(rnorm(x,mean=19),rnorm(x),col=3)
> points(rnorm(x,mean=20),rnorm(x),col=1)
> points(rnorm(x,mean=21),rnorm(x),col=2)
> }
>
> Is there some function in R that could solve this through automatically
> iterating my data as above, using transparent symbols, or something else? Is
> there some other way of solving this issue that I haven't thought of?

Assume you are plotting variables y1, y2, y3 of the same length against a common x, and you would like to give them the colors c(1,2,3), say. You can automate the randomization of the plotting order as follows:

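n <- length(y1)                        # number of points per variable
y <- c(y1, y2, y3)                     # stack the three variables into one vector
xx <- rep(x, 3)                        # repeat the common x once per variable
colors <- rep(c(1, 2, 3), each = n)    # one color per variable, aligned with y
ord <- sample(3 * n)                   # random permutation of all point indices
plot(xx[ord], y[ord], col = colors[ord])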

In short, I stack the y's into a single vector y, keep the matching x values in xx and the matching plotting colors in colors, and then randomize the plotting order with the sample function, so that no single variable is consistently drawn on top of the others.
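For the simulated data in the original question, where each variable has its own x values rather than a common x, the same idea could be sketched roughly as below. This is only an illustration: shuffle_points() is a hypothetical helper name (not a base R function), and the list arguments and set.seed(1) are assumptions made so that the example is self-contained and reproducible.

```r
# Sketch only: pool several groups of points and return them in random order.
# shuffle_points() is a hypothetical helper, not part of base R.
shuffle_points <- function(x_list, y_list, cols) {
  n <- unname(lengths(x_list))               # number of points in each group
  stopifnot(all(n == unname(lengths(y_list))),
            length(cols) == length(n))
  pooled <- data.frame(
    x   = unlist(x_list, use.names = FALSE), # all x values, group after group
    y   = unlist(y_list, use.names = FALSE), # all y values, same ordering
    col = rep(cols, times = n),              # color of the group each point came from
    stringsAsFactors = FALSE
  )
  pooled[sample(nrow(pooled)), ]             # shuffle the rows (the drawing order)
}

set.seed(1)                                  # assumption: fixed seed for reproducibility
n  <- 10000
xs <- list(rnorm(n, mean = 20), rnorm(n, mean = 21), rnorm(n, mean = 19))
ys <- list(rnorm(n), rnorm(n), rnorm(n))

pooled <- shuffle_points(xs, ys, cols = c(1, 2, 3))
plot(pooled$x, pooled$y, col = pooled$col, xlim = c(16, 24))
```

Because the rows are shuffled before a single call to plot, the result should no longer depend on which variable is listed first. If overplotting is still heavy, semi-transparent colors (for example via adjustcolor with its alpha.f argument) can be combined with the shuffle on graphics devices that support transparency.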

HTH,
Peter


