RE: [R] Highlighting points in a scatter plot matrix

From: Mulholland, Tom <Tom.Mulholland_at_dpi.wa.gov.au>
Date: Tue 22 Mar 2005 - 19:20:48 EST


There are two issues here: identifying the outliers and highlighting them.

I have only a basic grasp of both of these concepts but will give what I have in case it helps. There appears to have been a move in the last two decades to refine the concept of what actually constitutes an outlier. Brian Ripley commented on this in 2003 when he said "That's the whole point of robust methods: compensate rather than reject." So I would suggest that you find a copy of an article cited by Brian last year: http://finzi.psych.upenn.edu/R/Rhelp02a/archive/35340.html

As Uwe has pointed out, if you are using pairs then you will have to write your own panel function unless someone has already written something. I have avoided using the panel function as it seems a bit cumbersome in comparison to writing your own using normal plots.
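For what it is worth, the panel-function route can be sketched roughly as below. This is only an illustration under assumptions: the names flag_outliers and panel_highlight are made up here, the 95% quantile of the squared Mahalanobis distance is an arbitrary cut-off, and the setosa rows of iris stand in for real data. Because the panel function sees only the two variables of its own panel, each panel marks its own outliers.

```r
# Hypothetical helper: TRUE for points whose squared Mahalanobis distance
# (within this pair of variables) exceeds the chosen quantile.
flag_outliers <- function(x, y, prob = 0.95) {
  xy <- cbind(x, y)
  d2 <- mahalanobis(xy, colMeans(xy), cov(xy))
  d2 > quantile(d2, prob)
}

# Hypothetical panel function for pairs(): red for flagged points, navy otherwise.
panel_highlight <- function(x, y, ...) {
  out <- flag_outliers(x, y)
  points(x, y, pch = 20, col = ifelse(out, "red", "navy"))
}

# Stand-in data: the four measurements of the setosa rows of iris.
setosa <- iris[1:50, 1:4]
pairs(setosa, panel = panel_highlight)
```

Any other rule could be substituted inside flag_outliers without touching the plotting code.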

I haven't used the lattice package for a while now but it is obvious that major improvements have been made recently and you may find that this is a better vehicle for plotting your data.

However, for a single plot there's no real problem:

plot(x, y, pch = 20, col = "navy")
points(x[outlier], y[outlier], pch = 20, col = "red")

where "outlier" indexes the observations you consider to be outliers.
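A self-contained version of the single-plot case might look like the following. The data are simulated purely for illustration, and the rule used to fill "outlier" (squared Mahalanobis distance above its 95% quantile) is just one assumption among many possible definitions.

```r
# Simulated stand-in data (not from the original question).
set.seed(1)
x <- rnorm(100)
y <- 0.5 * x + rnorm(100, sd = 0.5)

# One possible definition of "outlier": the points with the largest
# squared Mahalanobis distances from the centre of the (x, y) cloud.
xy <- cbind(x, y)
d2 <- mahalanobis(xy, colMeans(xy), cov(xy))
outlier <- which(d2 > quantile(d2, 0.95))

# Draw everything in navy, then overplot the flagged points in red.
plot(x, y, pch = 20, col = "navy")
points(x[outlier], y[outlier], pch = 20, col = "red")
```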

A crude example of what can be done, rather than what should be done, follows (I have used inappropriate data):

par(mfrow = c(4, 4))
# Just select setosa (work on a copy rather than overwriting iris)
setosa <- iris[1:50, ]

for (j in 1:4) {
  for (k in 1:4) {
    if (j == k) {
      # Leave the diagonal panels empty
      plot(5, axes = FALSE, type = "n", xlab = "", ylab = "")
    } else {
      xy <- setosa[, c(j, k)]
      # The centre must be the vector of column means, so colMeans(), not rowMeans()
      mah <- mahalanobis(xy, colMeans(xy), cov(xy))
      outlier <- which(mah > quantile(mah, .95))

      plot(setosa[, j], setosa[, k], pch = 20, col = "navy", axes = FALSE,
           xlab = names(setosa)[j], ylab = names(setosa)[k])
      points(setosa[outlier, j], setosa[outlier, k], pch = 20, col = "red")
    }
  }
}
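Since the original question was about highlighting only in selected panels, the same loop idea can be restricted to chosen variable pairs. The sketch below is an assumption-laden illustration: highlight_pairs and is_selected are hypothetical names, the two pairs listed are arbitrary, and the 95% quantile cut-off is carried over from the crude example.

```r
# Stand-in data: the four measurements of the setosa rows of iris.
setosa <- iris[1:50, 1:4]
vars <- names(setosa)

# Hypothetical choice of panels in which outliers should be highlighted.
highlight_pairs <- list(c("Sepal.Length", "Sepal.Width"),
                        c("Petal.Length", "Petal.Width"))

# TRUE if the two variable names match one of the chosen pairs, in either order.
is_selected <- function(xn, yn) {
  any(vapply(highlight_pairs, function(p) setequal(p, c(xn, yn)), logical(1)))
}

op <- par(mfrow = c(4, 4), mar = c(4, 4, 1, 1))
for (j in seq_along(vars)) {
  for (k in seq_along(vars)) {
    if (j == k) {
      # Diagonal panel: just print the variable name.
      plot.new()
      text(0.5, 0.5, vars[j])
    } else {
      plot(setosa[[j]], setosa[[k]], pch = 20, col = "navy",
           xlab = vars[j], ylab = vars[k])
      if (is_selected(vars[j], vars[k])) {
        xy <- setosa[, c(j, k)]
        d2 <- mahalanobis(xy, colMeans(xy), cov(xy))
        out <- d2 > quantile(d2, 0.95)
        points(setosa[out, j], setosa[out, k], pch = 20, col = "red")
      }
    }
  }
}
par(op)
```

Panels not listed in highlight_pairs are drawn in navy only, so points are marked just where you have decided they matter.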
    



> -----Original Message-----
> From: Brett Stansfield [mailto:brett@hbrc.govt.nz]
> Sent: Tuesday, 22 March 2005 6:09 AM
> To: R help (E-mail)
> Subject: [R] Highlighting points in a scatter plot matrix
>
>
> Dear R
> I recently did a scatterplot matrix using the following command
> pairs(sleep[c("SlowSleep", "ParaSleep", "logbw", "logbrw", "loglife",
> "loggest")],col=1+as.integer(ParaSleep > 5.5 | SlowSleep > 15.7))
> this highlighted outlying points for some of the x,y plots that I needed to
> identify. Unfortunately this highlights all the x,y plots some for which
> these points are not necessarily outliers. Is there a way to specify
> highlighting selected points at selected x,y plots within a matrix?
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
>



Received on Tue Mar 22 19:26:13 2005
