Re: [R] Graph many points without hiding some

From: Greg Snow <Greg.Snow_at_imail.org>
Date: Thu, 31 Mar 2011 10:07:07 -0600

Just a note: base graphics does support transparency, as long as the device you are plotting to supports it.
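
As a rough sketch of what that can look like (not a definitive recipe): the code below redraws the three clouds from the original question with semi-transparent colours built by rgb(). The alpha value of 0.1, the solid symbol pch = 16, the specific RGB triples standing in for col=1, 2 and 3, and the names n, cols and out_file are all my own assumptions for illustration. It writes to a pdf() device, which supports semi-transparency; other devices may not.

```r
# Sketch only: alpha transparency in base graphics via rgb(..., alpha = ).
set.seed(1)
n <- 10000

# Approximate stand-ins for col = 1 (black), 2 (red), 3 (green), at 10% opacity.
cols <- c(rgb(0, 0, 0, alpha = 0.1),
          rgb(1, 0, 0, alpha = 0.1),
          rgb(0, 1, 0, alpha = 0.1))

out_file <- tempfile(fileext = ".pdf")  # hypothetical output location
pdf(out_file)                           # a device that supports transparency

plot(rnorm(n, mean = 20), rnorm(n), col = cols[1], pch = 16,
     xlim = c(16, 24), xlab = "x", ylab = "y")
points(rnorm(n, mean = 21), rnorm(n), col = cols[2], pch = 16)
points(rnorm(n, mean = 19), rnorm(n), col = cols[3], pch = 16)

invisible(dev.off())
```

With overlapping points drawn this lightly, dense regions build up colour while sparse regions stay faint, so the plotting order should matter much less. The alpha value probably needs tuning to the number of points.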

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow_at_imail.org
801.408.8111



> -----Original Message-----
> From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org] On Behalf Of Dennis Murphy
> Sent: Thursday, March 31, 2011 1:36 AM
> To: Samuel Dennis
> Cc: R-help_at_r-project.org
> Subject: Re: [R] Graph many points without hiding some
>
> Hi:
>
> I can think of a couple: (1) size reduction of the points; (2) alpha
> transparency; (3) (1) + (2)
>
> From your original plot in base graphics, I reduced cex to 0.2 and it didn't
> look too bad:
>
> plot(rnorm(x,mean=19),rnorm(x),col=3,xlim=c(16,24), cex = 0.2)
> points(rnorm(x,mean=20),rnorm(x),col=1, cex = 0.2)
> points(rnorm(x,mean=21),rnorm(x),col=2, cex = 0.2)
>
> AFAIK, base graphics doesn't have alpha transparency available, but the
> ggplot2 package does. One approach is to adjust the alpha transparency on
> default size points; another is to combine reduced point size with alpha
> transparency. Here is your example rehashed for ggplot2.
>
> require(ggplot2)
> require(reshape2) # provides melt(); older ggplot2 versions loaded reshape for you
> d <- data.frame(x1 = rnorm(10000, mean = 19), x2 = rnorm(10000, mean = 20),
>                 x3 = rnorm(10000, mean = 21), x = rnorm(10000))
> # Basically stacking x1 - x3, creating two new vars named variable and value
> dm <- melt(d, id.vars = 'x')
> # Alpha transparency is set to a low level with default point size,
> # but the colors in the legend are muted by the level of transparency
> ggplot(dm, aes(x = x, y = value, colour = variable)) + theme_bw() +
> geom_point(alpha = 0.05) +
> scale_colour_manual(values = c('x1' = 'black',
> 'x2' = 'red', 'x3' = 'green'))
>
> # A tradeoff is to reduce the point size and increase alpha a bit, but these
> # changes will also be reflected in the legend.
>
> ggplot(dm, aes(x = x, y = value, colour = variable)) + theme_bw() +
> geom_point(alpha = 0.15, size = 1) +
> scale_colour_manual(values = c('x1' = 'black',
> 'x2' = 'red', 'x3' = 'green'))
>
> You may well find the legend to be useless for this example, so to get rid
> of it,
>
> ggplot(dm, aes(x = x, y = value, colour = variable)) + theme_bw() +
> geom_point(alpha = 0.15, size = 1) +
> scale_colour_manual(values = c('x1' = 'black',
> 'x2' = 'red', 'x3' = 'green')) +
> theme(legend.position = 'none') # theme() replaces opts() in current ggplot2
>
> The nice thing about the ggplot2 graph is that you can adjust the point size
> and alpha transparency to your tastes. The default point size is 2 and the
> default alpha = 1 (no transparency).
>
> HTH,
> Dennis
>
> On Wed, Mar 30, 2011 at 10:04 PM, Samuel Dennis <sjdennis3_at_gmail.com> wrote:
>
> > I have a very large dataset with three variables that I need to graph using
> > a scatterplot. However I find that the first variable gets masked by the
> > other two, so the graph looks entirely different depending on the order of
> > variables. Does anyone have any suggestions how to manage this?
> >
> > This code is an illustration of what I am dealing with:
> >
> > x <- 10000
> > plot(rnorm(x,mean=20),rnorm(x),col=1,xlim=c(16,24))
> > points(rnorm(x,mean=21),rnorm(x),col=2)
> > points(rnorm(x,mean=19),rnorm(x),col=3)
> >
> > gives an entirely different looking graph to:
> >
> > x <- 10000
> > plot(rnorm(x,mean=19),rnorm(x),col=3,xlim=c(16,24))
> > points(rnorm(x,mean=20),rnorm(x),col=1)
> > points(rnorm(x,mean=21),rnorm(x),col=2)
> >
> > despite being identical in all respects except for the order in which the
> > variables are plotted.
> >
> > I have tried using pch=".", however the colours are very difficult to
> > discern. I have experimented with a number of other symbols with no real
> > solution.
> >
> > The only way that appears to work is to iterate the plot with a for loop,
> > and progressively add a few numbers from each variable, as below. However
> > although I can do this simply with random numbers as I have done here, this
> > is an extremely cumbersome method to use with real datasets.
> >
> > plot(1,1,xlim=c(16,24),ylim=c(-4,4),col="white")
> > x <- 100
> > for (i in 1:100) {
> > points(rnorm(x,mean=19),rnorm(x),col=3)
> > points(rnorm(x,mean=20),rnorm(x),col=1)
> > points(rnorm(x,mean=21),rnorm(x),col=2)
> > }
> >
> > Is there some function in R that could solve this through automatically
> > iterating my data as above, using transparent symbols, or something else? Is
> > there some other way of solving this issue that I haven't thought of?
> >
> > Thank you,
> >
> > Samuel Dennis
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help_at_r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
Received on Thu 31 Mar 2011 - 16:09:42 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 04 Apr 2011 - 03:30:26 GMT.
