[R] strange behavior in data frames with duplicated column names

From: William Revelle <wr_at_revelle.net>
Date: Tue, 08 May 2007 09:26:43 -0500


Dear R gurus,

There is an interesting problem when accessing a specific item in a column of a data frame that has incorrectly been given a duplicate name, even when the item is addressed by row and column number. Although the whole column is returned correctly, an item addressed by row and column number comes from the correct row but from the original column, not the duplicated one.

Here are the commands that reproduce the problem:

x <- matrix(seq(1:12),ncol=3)
colnames(x) <- c("A","B","A") #a redundant name for column 3

x.df <- data.frame(x)
x.df        #the redundant name is corrected
x.df[,3]    #show the column -- this always works
x.df[2,3]   #this works here

#now incorrectly label the columns with a duplicate name
colnames(x.df) <- c("A","B","A") #the redundant name is not detected
x.df
x.df[,3]     #this works as above and shows the column
x.df[2,3]    #but this gives the value of the first column, not the third  <---
rownames(x.df) <- c("First","Second","Third","Third") #detects duplicate name
x.df
x.df[4,]     #correct fourth row and corrected column names!
x.df[4,3]    #wrong column
x.df         #still has the original names with the duplication


and the corresponding output:

> x <- matrix(seq(1:12),ncol=3)
> colnames(x) <- c("A","B","A") #a redundant name for column 3
> x.df <- data.frame(x)
> x.df #the redundant name is corrected
  A B A.1
1 1 5   9
2 2 6  10
3 3 7  11
4 4 8  12
> x.df[,3] #show the column -- this always works
[1]  9 10 11 12
> x.df[2,3] #this works here
[1] 10
> #now incorrectly label the columns with a duplicate name
> colnames(x.df) <- c("A","B","A") #the redundant name is not detected
> x.df
  A B  A
1 1 5  9
2 2 6 10
3 3 7 11
4 4 8 12
> x.df[,3] #this works as above and shows the column
[1]  9 10 11 12
> x.df[2,3] #but this gives the value of the first column, not the third <---
[1] 2
> rownames(x.df) <- c("First","Second","Third","Third") #detects duplicate name
Error in `row.names<-.data.frame`(`*tmp*`, value = c("First", "Second", :
        duplicate 'row.names' are not allowed
> x.df
  A B  A
1 1 5  9
2 2 6 10
3 3 7 11
4 4 8 12
> x.df[4,] #correct fourth row and corrected column names!
  A B A.1
4 4 8  12
> x.df[4,3] #wrong column
[1] 4
> x.df #still has the original names with the duplication

> unlist(R.Version())
platform        "i386-apple-darwin8.9.1"
arch            "i386"
os              "darwin8.9.1"
system          "i386, darwin8.9.1"
status          "Patched"
major           "2"
minor           "5.0"
year            "2007"
month           "04"
day             "25"
svn rev         "41315"
language        "R"
version.string  "R version 2.5.0 Patched (2007-04-25 r41315)"
>
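
A possible way to guard against this, sketched here only as an illustration: check the column names for duplicates before indexing by [row, column], and make them unique if any are found. This assumes base R's duplicated() and make.unique() behave as documented; the variable name `dup` is hypothetical and not part of the report above.

```r
# Rebuild the data frame from the report, with a deliberately duplicated name
x <- matrix(1:12, ncol = 3)
x.df <- data.frame(x)
names(x.df) <- c("A", "B", "A")

# Detect duplicated column names before relying on [row, column] indexing
has.dup <- any(duplicated(names(x.df)))       # TRUE for this example
dup <- names(x.df)[duplicated(names(x.df))]   # the offending name(s): "A"

# Repair: make.unique() appends ".1", ".2", ... to repeated names
if (has.dup) {
  names(x.df) <- make.unique(names(x.df))     # "A" "B" "A.1"
}

x.df[2, 3]   # with unique names this is row 2 of the third column: 10
```

With unique names the lookup by row and column number no longer depends on how duplicated names are resolved, which is the condition that triggers the behavior shown above.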

Bill

-- 
William Revelle		http://personality-project.org/revelle.html
Professor			http://personality-project.org/personality.html
Department of Psychology       http://www.wcas.northwestern.edu/psych/
Northwestern University	http://www.northwestern.edu/
Use R for statistics:                 http://personality-project.org/r

______________________________________________
R-help_at_stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Tue 08 May 2007 - 14:55:37 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 08 May 2007 - 18:31:35 GMT.
