From: Mulholland, Tom <Tom.Mulholland_at_dpi.wa.gov.au>

Date: Tue 22 Mar 2005 - 19:20:48 EST

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Tue Mar 22 19:26:13 2005

There are two issues here: identifying the outliers and highlighting them.

I have only a basic grasp of both, but will give what I have in case it helps. Over the last two decades there appears to have been a move to refine the idea of what actually constitutes an outlier. Brian Ripley commented on this in 2003 when he said "That's the whole point of robust methods: compensate rather than reject." So I would suggest that you find a copy of an article Brian cited last year: http://finzi.psych.upenn.edu/R/Rhelp02a/archive/35340.html
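One way to read the "compensate rather than reject" remark is to compare distances computed from the ordinary mean and covariance with distances computed from a robust estimate of centre and scatter. The sketch below is only my illustration, not something taken from the cited article: it assumes the MASS package (which ships with R) and its cov.rob() function, and the choice of the MCD method, the two columns and the name setosa2 are all arbitrary.

```r
library(MASS)  # recommended package shipped with R; provides cov.rob()

# Illustrative data only: two measurements for the 50 setosa flowers
setosa2 <- as.matrix(iris[1:50, c("Sepal.Length", "Sepal.Width")])

# Classical distances: the mean and covariance are themselves
# influenced by any extreme points
d_classical <- mahalanobis(setosa2, colMeans(setosa2), cov(setosa2))

# Robust distances: centre and scatter estimated by MCD (an assumption;
# other robust estimators would serve equally well for the illustration)
set.seed(42)  # cov.rob() uses random subsampling
rob <- cov.rob(setosa2, method = "mcd")
d_robust <- mahalanobis(setosa2, rob$center, rob$cov)

# Compare which observations look most extreme under each approach
head(order(d_classical, decreasing = TRUE))
head(order(d_robust, decreasing = TRUE))
```

The two orderings need not agree; where they differ, the robust version is the one less affected by the extreme points themselves.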

As Uwe has pointed out, if you are using pairs then you will have to write your own panel function, unless someone has already written something suitable. I have avoided the panel function myself, as it seems a bit cumbersome compared with building the display from ordinary plots.
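For anyone who does want to go the pairs route, a panel function along these lines might be a starting point. It is a rough sketch only: panel_outliers is a made-up name, the setosa subset of iris is stand-in data, and flagging the points beyond the 95th percentile of the Mahalanobis distance is an arbitrary rule chosen for illustration.

```r
# Stand-in data: the four measurements for the setosa flowers
setosa <- iris[iris$Species == "setosa", 1:4]

# Hypothetical panel function: pairs() calls it once per off-diagonal
# panel with the x and y values for that panel
panel_outliers <- function(x, y, ...) {
  xy <- cbind(x, y)
  # Distance of each point from the centre of this pair of variables
  d <- mahalanobis(xy, colMeans(xy), cov(xy))
  # Arbitrary cutoff: anything beyond the 95th percentile of the distances
  out <- d > quantile(d, 0.95)
  points(x, y, pch = 20, col = ifelse(out, "red", "navy"))
}

pairs(setosa, panel = panel_outliers)
```

Because the outliers are recomputed inside each panel, a point is highlighted only in the panels where it is extreme for that particular pair of variables.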

I haven't used the lattice package for a while now but it is obvious that major improvements have been made recently and you may find that this is a better vehicle for plotting your data.

However, for a single plot there's no real problem:

plot(x,y,pch = 20, col = "navy")
points(x[outlier],y[outlier],pch = 20, col = "red")

where "outlier" indexes the observations you consider to be outliers.
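A self-contained version of the same idea, using made-up data, might look like this. The simulated x and y and the rule used to fill "outlier" are assumptions purely for illustration; substitute whatever criterion you actually trust.

```r
set.seed(1)  # made-up data, for illustration only
x <- rnorm(100)
y <- x + rnorm(100, sd = 0.5)

# Arbitrary rule: flag points lying more than 1 unit from the line y = x
outlier <- which(abs(y - x) > 1)

# Draw everything first, then overplot the flagged points in red
plot(x, y, pch = 20, col = "navy")
points(x[outlier], y[outlier], pch = 20, col = "red")
```

"outlier" can be either a vector of row numbers, as here, or a logical vector of the same length as x.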

A crude example of what can be done, rather than what should be done, follows (I have used inappropriate data):

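par(mfrow = c(4,4))

# Just select setosa (work on a copy so the built-in iris is not overwritten)
setosa <- iris[1:50,]

for (j in 1:4){
  for (k in 1:4){
    if (j == k){
      # Leave the diagonal panels empty
      plot(5,axes = FALSE,type = "n",xlab = "",ylab = "")
    } else {
      # The centre must be the vector of column means, not the row means
      mah <- mahalanobis(setosa[,c(j,k)],colMeans(setosa[,c(j,k)]),cov(setosa[,c(j,k)]))
      # Flag the points beyond the 95th percentile of the distances
      outlier <- which(mah > quantile(mah,.95))
      plot(setosa[,j],setosa[,k],pch = 20, col = "navy",axes = FALSE,xlab = names(setosa)[j],ylab = names(setosa)[k])
      points(setosa[outlier,j],setosa[outlier,k],pch = 20, col = "red")
    }
  }
}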

> -----Original Message-----
> From: Brett Stansfield [mailto:brett@hbrc.govt.nz]
> Sent: Tuesday, 22 March 2005 6:09 AM
> To: R help (E-mail)
> Subject: [R] Highlighting points in a scatter plot matrix
>
> Dear R
> I recently did a scatterplot matrix using the following command
> pairs(sleep[c("SlowSleep", "ParaSleep", "logbw", "logbrw", "loglife",
> "loggest")],col=1+as.integer(ParaSleep > 5.5 | SlowSleep > 15.7))
> this highlighted outlying points for some of the x,y plots that I needed to
> identify. Unfortunately this highlights all the x,y plots some for which
> these points are not necessarily outliers. Is there a way to specify
> highlighting selected points at selected x,y plots within a matrix?
