Re: [R] NAs introduced by coercion in dist()

From: Petr PIKAL <petr.pikal_at_precheza.cz>
Date: Thu, 03 May 2007 08:47:54 +0200


r-help-bounces_at_stat.math.ethz.ch wrote on 02.05.2007 16:47:55:

>
> It was suggested that the 'NAs introduced by coercion' message might be
> warning me that my data are not what they should be. I checked this using
> str(PeaksMatrix), as suggested, and the data seem to be what I thought they
> were:
>
> 'data.frame': 335 obs. of 127 variables:
> $ Code : Factor w/ 335 levels "A1MR","A1MU",..: 1 2 3 4 5 6 7 8 9 10 ...
> $ P3.70 : num 0 0 0 0 0 0 0 0 0 0 ...
> $ P3.97 : num 0 0 0 0 0 0 0 0 0 0 ...
> $ P4.29 : num 0 0 0 0 0 0 0 0 0 0 ...
> $ P4.90 : num 0 0 0 0 0 0 0 0 0 0 ...
> $ P6.30 : num 0 0 0 0 0 0 0 0 0 0 ...
> $ P6.45 : num 7.73 0 0 0 0 0 4.03 0 0 0 ...
> $ P6.55 : num 0 0 0 0 0 0 0 0 0 0 ...
>
> ...
>
> I do have 335 observations, 127 variables that are named P3.70, P3.97, P4.29,
> etc. This was a relief, but I still don't know whether the distance matrix
> is what it should be. I tried 'str(dist.PxMx)', which is the name of my
> distance matrix, but I get something that has not much meaning to me, an
> inexperienced R user:
>
> Class 'dist' atomic [1:55945] 329.6 194.9 130.1 70.7 116.9 ...
> ..- attr(*, "Size")= int 335
> ..- attr(*, "Labels")= chr [1:335] "1" "2" "3" "4" ...
> ..- attr(*, "Diag")= logi FALSE
> ..- attr(*, "Upper")= logi FALSE
> ..- attr(*, "method")= chr "euclidean"
> ..- attr(*, "call")= language dist(x = PeaksMatrix, method = "euclidean",
> diag = FALSE, upper = FALSE, p = 2)
>
> Any more suggestions, please?

Well, it seems that you have the data you want, but it is not clear to me why you cannot see them.

I tried:

x<-sample(0:2, 100, replace=T)
dim(x)<-c(10,10)
x

      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    1    0    0    1    1    1    1    0     1
 [2,]    0    1    0    2    1    0    2    0    0     2
 [3,]    0    2    0    0    0    1    1    0    1     2
...
[10,]    1    2    0    0    1    2    0    2    1     0
xx<-data.frame(var=c("a", "b"),x)
xx

   var X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1    a  0  1  0  0  1  1  1  1  0   1
2    b  0  1  0  2  1  0  2  0  0   2
....
9    a  1  1  0  1  1  0  0  2  2   0
10   b  1  2  0  0  1  2  0  2  1   0

dist(xx, method='euclidean', diag=F,upper=F)

          1        2        3        4        5        6        7        8        9
2  2.966479
3  2.345208 3.146427
4  3.633180 3.633180 4.571652
5  4.195235 5.549775 4.571652 4.195235
6  4.195235 4.195235 4.062019 3.924283 3.924283
7  1.816590 3.781534 3.316625 3.781534 3.781534 4.806246
8  2.774887 4.571652 3.633180 4.062019 4.062019 4.806246 3.316625
9  3.316625 4.449719 4.062019 4.449719 3.316625 4.449719 2.774887 3.146427
10 2.774887 5.029911 3.633180 4.324350 3.146427 4.324350 2.569047 2.966479 2.774887

xxx<-dist(xx, method='euclidean', diag=F,upper=F)
Warning message:
NAs introduced by coercion
str(xxx)
Class 'dist' atomic [1:45] 2.97 2.35 3.63 4.20 4.20 ...
  ..- attr(*, "Size")= int 10
  ..- attr(*, "Diag")= logi FALSE
  ..- attr(*, "Upper")= logi FALSE
  ..- attr(*, "method")= chr "euclidean"
  ..- attr(*, "call")= language dist(x = xx, method = "euclidean", diag = F, upper = F)

This seems similar to what you get, so I wonder why you do not see your matrix. Try dist.PxMx[1:50] or head(dist.PxMx) to see whether you can get something from it.
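
Here is a minimal sketch of how one might check this on toy data. The data and the object names num, d, n and m are made up for illustration and are not taken from your PeaksMatrix. The warning most likely comes from the factor column, which dist() cannot turn into numbers. According to ?dist, when columns are excluded because of NAs the Euclidean sum is scaled up in proportion to the number of columns used, so it seems safer to pass only the numeric columns. A 'dist' object holds just the lower triangle, n*(n-1)/2 values, which would explain the 55945 numbers for your 335 rows. as.matrix() turns it into a square matrix that is easier to look at.

```r
# Toy stand-in for PeaksMatrix: one factor column plus numeric columns.
set.seed(1)
x <- matrix(sample(0:2, 100, replace = TRUE), nrow = 10)
xx <- data.frame(var = factor(rep(c("a", "b"), 5)), x)

# Keep only the numeric columns so that nothing has to be coerced.
num <- xx[sapply(xx, is.numeric)]
d <- dist(num, method = "euclidean")

# A 'dist' object stores only the lower triangle: n * (n - 1) / 2 values.
n <- attr(d, "Size")
length(d) == n * (n - 1) / 2

# Look at the first few distances, or at a corner of the full square matrix.
head(d)
m <- as.matrix(d)
m[1:5, 1:5]
```

If the corner of m shows sensible numbers, the distance matrix itself is probably fine and only the printing was confusing.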

Regards
Petr

>
>
>
> Silvia Lomascolo wrote:
> >
> > I work with Windows and use R version 2.4.1. I am JUST starting to learn
> > this program...
> >
> > I get this warning message 'NAs introduced by coercion' while trying to
> > build a distance matrix (to be analyzed with NMDS later) from a 336 x 100
> > data matrix. The original matrix has lots of zeros and no missing values,
> > but I don't think this should matter.
> >
> > I searched this forum and people have suggested that the warning should be
> > ignored but when I try to print the distance matrix I only get the row
> > numbers (the matrix seems to be 'empty') and I'm not being able to judge
> > whether the matrix worked or not.
> >
> > To get the distance matrix I wrote:
> > dist.PxMx <- dist (PeaksMatrix, method='euclidean', diag=FALSE,
> > upper=FALSE)
> >
> > I tried including the p argument (included in the help for dist()) and
> > leaving it out, but that didn't seem to change anything. I think that's
> > required for one distance measure though, not for euclidean dist.
> >
> > Should I really ignore this warning? If so, why am I not being able to see
> > the distance matrix?
> >
>
> --
> View this message in context: http://www.nabble.com/NAs-introduced-by-coercion-in-dist%28%29-tf3680727.html#a10286882
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help_at_stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Received on Thu 03 May 2007 - 07:03:57 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 03 May 2007 - 07:31:49 GMT.
