[R] Difficulty with 'merge'

From: Michael Kubovy <kubovy_at_virginia.edu>
Date: Thu 05 Jan 2006 - 03:38:37 EST


Dear R-helpers,

Happy New Year to all the helpful members of the list.

Here is the behavior I'm looking for:
> v1 <- c("a","b","c")
> n1 <- c(0, 1, 2)
> v2 <- c("c", "a", "b")
> n2 <- c(0, 1 , 2)
> (f1 <- data.frame(v1, n1))
  v1 n1
1  a  0
2  b  1
3  c  2
> (f2 <- data.frame(v2, n2))
  v2 n2
1  c  0
2  a  1
3  b  2
> (m12 <- merge(f1, f2, by.x = "v1", by.y = "v2", sort = F))
  v1 n1 n2
1  c  2  0
2  a  0  1
3  b  1  2

Now to my data:
> summary(pL)

         pairL

a fondo   :  41
alto      :  41
ampio     :  41
angoloso  :  41
aperto    :  41
appoggiato:  41
(Other)   :1271

> pL$pairL[c(1,42)]

[1] appoggiato dentro
37 Levels: a fondo alto ampio angoloso aperto appoggiato asimmetrico complicato convesso davanti dentro destra ... verticale

> summary(oppN)

        pairL              pairR         subject           L
a fondo   :  41   a galla    :  41   S1     :  37   Min.   :0.3646
alto      :  41   acuto      :  41   S10    :  37   1st Qu.:0.5521
ampio     :  41   arrotondato:  41   S11    :  37   Median :0.6354
angoloso  :  41   basso      :  41   S12    :  37   Mean   :0.6403
aperto    :  41   chiuso     :  41   S13    :  37   3rd Qu.:0.7188
appoggiato:  41   compl      :  41   S14    :  37   Max.   :0.9375
(Other)   :1271   (Other)    :1271   (Other):1295

      LL                RR               M
Min.   :0.02083   Min.   :0.0010   Min.   :0.0000
1st Qu.:0.37500   1st Qu.:0.1771   1st Qu.:0.1042
Median :0.47917   Median :0.2708   Median :0.2292
Mean   :0.46452   Mean   :0.2760   Mean   :0.2598
3rd Qu.:0.55208   3rd Qu.:0.3750   3rd Qu.:0.3854
Max.   :0.92708   Max.   :0.6042   Max.   :0.7812
                  NA's   :3.0000   NA's   :3.0000

       asym             polar            polar_a1          clust
Min.   :-0.5555   Min.   :-1.2410   Min.   :-2.949e+00   c1:492
1st Qu.: 0.2091   1st Qu.: 0.4571   1st Qu.:-1.902e-01   c2:287
Median : 0.5555   Median : 1.1832   Median :-1.110e-16   c3: 82
Mean   : 0.6265   Mean   : 1.3428   Mean   :-5.745e-02   c4:246
3rd Qu.: 0.9383   3rd Qu.: 2.0712   3rd Qu.: 1.168e-01   c5: 82
Max.   : 2.7081   Max.   : 4.6151   Max.   : 4.218e+00   c6:328
                    NA's   : 3.0000   NA's   : 3.000e+00

> oppN$pairL[c(1,42)]

[1] spesso fine
37 Levels: a fondo alto ampio angoloso aperto appoggiato asimmetrico complicato convesso davanti dentro destra ... verticale

> unique(sort(oppM$pairL)) == unique(sort(pL$pairL))
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[26] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

In other words, I think that pL$pairL and oppN$pairL each consist of 37 blocks of 41 repetitions of a name, and that the blocks in one are a permutation of the blocks in the other.
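
The comparison above only shows that the two columns contain the same set of names; it says nothing about how often each name occurs. A minimal sketch of a more direct check, using small made-up factors `x` and `y` as stand-ins for pL$pairL and oppN$pairL (hypothetical data, not my real set):

```r
# Hypothetical stand-ins: 3 names, each repeated 4 times, blocks in different order
lev <- c("a fondo", "alto", "ampio")
x <- factor(rep(lev, each = 4))
y <- factor(rep(rev(lev), each = 4))

# Do both columns use the same set of names?
same_names <- setequal(levels(x), levels(y))

# Does every name occur the same number of times in both columns?
# (index the second table by name so the order of levels does not matter)
same_counts <- all(table(x) == table(y)[levels(x)])

same_names
same_counts
```

If both values are TRUE, the two columns are made of the same blocks in a possibly different order.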

However:

> summary(m1 <- merge(oppM, pairL, by.x = "pairL", by.y = "pairL",
sort = F))

        pairL               pairR          subject            L
a fondo   : 1681   a galla    : 1681   S1     : 1517   Min.   :0.3646
alto      : 1681   acuto      : 1681   S10    : 1517   1st Qu.:0.5521
ampio     : 1681   arrotondato: 1681   S11    : 1517   Median :0.6354
angoloso  : 1681   basso      : 1681   S12    : 1517   Mean   :0.6398
aperto    : 1681   chiuso     : 1681   S13    : 1517   3rd Qu.:0.7188
appoggiato: 1681   compl      : 1681   S14    : 1517   Max.   :0.9375
(Other)   :51988   (Other)    :51988   (Other):52972

      LL                RR               M
Min.   :0.02083   Min.   :0.0010   Min.   :0.0000
1st Qu.:0.37500   1st Qu.:0.1771   1st Qu.:0.1042
Median :0.47917   Median :0.2708   Median :0.2292
Mean   :0.46402   Mean   :0.2760   Mean   :0.2598
3rd Qu.:0.55208   3rd Qu.:0.3750   3rd Qu.:0.3854
Max.   :0.92708   Max.   :0.6042   Max.   :0.7812

       asym             polar            polar_a1          clust
Min.   :-0.5555   Min.   :-1.2410   Min.   :-2.949e+00   c1:20172
1st Qu.: 0.2091   1st Qu.: 0.4571   1st Qu.:-1.904e-01   c2:11644
Median : 0.5555   Median : 1.1832   Median :-1.110e-16   c3: 3362
Mean   : 0.6234   Mean   : 1.3428   Mean   :-5.745e-02   c4:10086
3rd Qu.: 0.9383   3rd Qu.: 2.0712   3rd Qu.: 1.169e-01   c5: 3362
Max.   : 2.7081   Max.   : 4.6151   Max.   : 4.218e+00   c6:13448

I was expecting each level of pairL to occur 41 times in the merged result, not 1681 = 41^2 times.
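
A toy reproduction of the same effect may help, assuming the cause is that the key is repeated in both data frames: merge() then returns every combination of matching rows. The frames `a` and `b`, the column names, and the helper column `rep` are all made up for illustration, and the second merge is only one possible workaround that assumes the repetitions should be paired up in order of appearance.

```r
# Hypothetical toy data: 2 keys, each repeated 3 times in both frames
a <- data.frame(key = rep(c("k1", "k2"), each = 3), x = 1:6)
b <- data.frame(key = rep(c("k2", "k1"), each = 3), y = 11:16)

# Merging on the repeated key pairs every row of a with every matching row of b
m <- merge(a, b, by = "key", sort = FALSE)
nrow(m)        # 18 = 2 keys * 3 * 3
table(m$key)   # each key occurs 3^2 = 9 times

# Possible workaround: number the repetitions within each key and
# merge on the key plus that running index
a$rep <- ave(seq_along(a$key), a$key, FUN = seq_along)
b$rep <- ave(seq_along(b$key), b$key, FUN = seq_along)
m2 <- merge(a, b, by = c("key", "rep"), sort = FALSE)
nrow(m2)       # 6, one row per (key, repetition) pair
```

With 41 repetitions per name on each side, the same mechanism would give 41 * 41 = 1681 rows per name, which matches the counts in the summary above.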



Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS:     P.O.Box 400400    Charlottesville, VA 22904-4400
Parcels:    Room 102        Gilmer Hall
         McCormick Road    Charlottesville, VA 22903
Office:    B011    +1-434-982-4729
Lab:        B019    +1-434-982-4751
Fax:        +1-434-982-4766

WWW: http://www.people.virginia.edu/~mk9y/

Received on Thu Jan 05 05:30:43 2006