RE: [R] Highlighting points in a scatter plot matrix

From: Mulholland, Tom <>
Date: Tue 22 Mar 2005 - 19:20:48 EST

There are two issues here: identifying the outliers and highlighting them.

I have only a basic grasp of both of these, but will give what I have in case it helps. Over the last two decades there appears to have been a move to refine the idea of what actually constitutes an outlier. Brian Ripley commented on this in 2003 when he said, "That's the whole point of robust methods: compensate rather than reject." So I would suggest that you might like to find a copy of an article cited by Brian last year

As Uwe has pointed out, if you are using pairs then you will have to write your own panel function, unless someone has already written something suitable. I have avoided panel functions, as they seem a bit cumbersome compared with building the matrix yourself from ordinary plots.

I haven't used the lattice package for a while now, but it is obvious that major improvements have been made recently, and you may find that it is a better vehicle for plotting your data.

However, for a single plot there's no real problem:

plot(x,y,pch = 20, col = "navy")
points(x[outlier],y[outlier],pch = 20, col = "red")

where "outlier" indexes the observations you consider to be such.

A crude example of what can be done, rather than what should be done, is the following (I have used inappropriate data):

par(mfrow = c(4,4))
# Just select setosa (kept in a copy so the built-in iris is not overwritten)
setosa <- iris[1:50,]

for (j in 1:4){
  for (k in 1:4){
    if (j == k){
      # Empty diagonal panel
      plot(5,axes = FALSE,type = "n",xlab = "",ylab = "")
    } else {
      # Distances for this pair only; the centre must be the column means
      mah <- mahalanobis(setosa[,c(j,k)],colMeans(setosa[,c(j,k)]),cov(setosa[,c(j,k)]))
      outlier <- which(mah > quantile(mah,.95))

      plot(setosa[,j],setosa[,k],pch = 20, col = "navy",axes = FALSE,xlab = names(setosa)[j],ylab = names(setosa)[k])
      points(setosa[outlier,j],setosa[outlier,k],pch = 20, col = "red")
    }
  }
}
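
If you would rather stay with pairs, the same per-panel test can go into a panel function of the kind Uwe mentioned. What follows is only a sketch: the name panel.outlier, its cutoff argument and the 95% threshold are my own choices for illustration, and it is just as crude a rule for calling something an outlier as the loop above.

```r
# Hypothetical panel function (the name and the cutoff argument are
# illustrative). It flags points whose Mahalanobis distance, computed
# from the two variables shown in this panel only, exceeds the given
# quantile of those distances.
panel.outlier <- function(x, y, cutoff = 0.95, ...) {
  xy <- cbind(x, y)
  mah <- mahalanobis(xy, colMeans(xy), cov(xy))
  outlier <- mah > quantile(mah, cutoff)
  # Draw every point once, coloured by whether it was flagged
  points(x, y, pch = 20, col = ifelse(outlier, "red", "navy"))
  invisible(outlier)
}

# Same inappropriate data as before: the four measurements for setosa
setosa <- iris[1:50, 1:4]
pairs(setosa, panel = panel.outlier)
```

Because the distances are recomputed inside each panel, a point shown in red in one panel need not be red in another, which is what the original question was after.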


> -----Original Message-----
> From: Brett Stansfield []
> Sent: Tuesday, 22 March 2005 6:09 AM
> To: R help (E-mail)
> Subject: [R] Highlighting points in a scatter plot matrix
> Dear R
> I recently did a scatterplot matrix using the following command
> pairs(sleep[c("SlowSleep", "ParaSleep", "logbw", "logbrw", "loglife",
> "loggest")],col=1+as.integer(ParaSleep > 5.5 | SlowSleep > 15.7))
> this highlighted outlying points for some of the x,y plots that I needed to
> identify. Unfortunately this highlights all the x,y plots, some for which
> these points are not necessarily outliers. Is there a way to specify
> highlighting selected points at selected x,y plots within a matrix?
> ______________________________________________
> mailing list
> PLEASE do read the posting guide!
Received on Tue Mar 22 19:26:13 2005

This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:30:52 EST