Re: [R] a problem: factors, names, tables ..

From: Uwe Ligges <ligges_at_statistik.uni-dortmund.de>
Date: Mon 19 Jul 2004 - 01:39:18 EST

Adaikalavan Ramasamy wrote:

> Please give a reproducible example. Here is one way :
> 
> # generate example
> 

>>v1 <- rep( c(0, 2, 10, 11, 13, 14, 15), c(15, 6, 1, 3, 8, 15, 10) )
>>t1 <- table(v1)
>>t1
> 
> v1
>  0  2 10 11 13 14 15 
> 15  6  1  3  8 15 10 
> 
> 

>>v2 <- rep( c(0, 1, 2, 10, 11, 12, 13, 14, 15),
>            c(817, 119, 524, 96, 700, 66, 559, 358, 283) )
> 

>>t2 <- table(v2)
>>t2
> 
> v2
>   0   1   2  10  11  12  13  14  15 
> 817 119 524  96 700  66 559 358 283 
> 
> # find results
> 

>>merge(t1, t2, by=1, all.x=TRUE)
> 
>   v1 Freq.x Freq.y
> 1  0     15    817
> 2 10      1     96
> 3 11      3    700
> 4 13      8    559
> 5 14     15    358
> 6 15     10    283
> 7  2      6    524
> 
> Uwe's suggestion may need a slight modification, as the two tables have
> different labels/levels and hence are non-conformable for division

No!
Uwe's suggestion works perfectly well, since *the same* variable is used for calculating both tables, except that in one case it's a *subset*. Hence converting to a factor first is completely sufficient!!!

Reproducible example from your v2:
v2 <- rep(c(0, 1, 2, 10, 11, 12, 13, 14, 15),
           c(817, 119, 524, 96, 700, 66, 559, 358, 283))
v1 <- factor(v2)
set.seed(1)
v1 <- sample(v1, 10)

table(v1)
table(v2)
table(v1)/table(v2)
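
A deterministic variant of the same idea, as a minimal sketch: the vectors V32 and V48 below are made-up stand-ins for the original poster's columns (the real data are not shown in the thread). It illustrates that subsetting a factor keeps all of its levels, so table() reports zero counts and the two tables are conformable.

```r
# Hypothetical stand-ins for the poster's columns:
# V32 = event type, V48 = flag selecting the events of interest.
V32 <- c(0, 0, 1, 2, 2, 2, 10, 10)
V48 <- c(0, 1, 1, 0, 0, 1, 1, 0)

V32 <- factor(V32)           # levels "0", "1", "2", "10"
sub <- V32[V48 == 0]         # subsetting keeps every level

levels(sub)                  # still includes "1", which has no selected events
ratio <- table(sub) / table(V32)
ratio
```

Event type 1 never occurs in the subset here, yet it still appears in table(sub) with a count of 0, which is why the division works without any lookup.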

So, what's the problem???

Uwe Ligges

> 

>>t2.f <- table( v2.f <- factor(v2) )
>>t1.f <- table( v1.f <- factor(v1, levels=levels(v2.f)) )
> 
> 

>>cbind( t1.f, t2.f, ratio=t1.f / t2.f )
> 
>    t1.f t2.f       ratio
> 0    15  817 0.018359853
> 1     0  119 0.000000000
> 2     6  524 0.011450382
> 10    1   96 0.010416667
> 11    3  700 0.004285714
> 12    0   66 0.000000000
> 13    8  559 0.014311270
> 14   15  358 0.041899441
> 15   10  283 0.035335689
> 
> 
> Also have a look at this related posting
> http://tolstoy.newcastle.edu.au/R/help/04/06/0594.html
> 
> Regards, Adai.
> 
> 
> On Sun, 2004-07-18 at 13:05, Uwe Ligges wrote:
> 

>>PvR wrote:
>>
>>>Hi all,
>>>
>>>I am *completely* lost in trying to solve a relatively simple task.
>>>
>>>I want to compute the relative number of occurrences of an event, the
>>>data of which sits in a large table (read from file).
>>>
>>>I have the occurrences of the events in a table 'tt'
>>>
>>>0 2 10 11 13 14 15
>>>15 6 1 3 8 15 10
>>>
>>>.. meaning that event of type '0' occurs 15 times, type '2' occurs 6
>>>times etc.
>>>
>>>Now I want to divide the occurrence counts by the total number of events
>>>of that type, which is given in the table tt2:
>>>
>>> 0 1 2 10 11 12 13 14 15
>>>817 119 524 96 700 66 559 358 283
>>>
>>>Saying that event type '0' occurred 817 times, type '1' occurs 119
>>>times etc.
>>>
>>>The obvious problem is that not all events in tt2 are present in tt,
>>>which is the result of the experiment so that cannot be changed.
>>>
>>>What needs to be done is loop over tt, take the occurrence count, and
>>>divide that with the corresponding count in tt2. This corresponding
>>>tt2 count is *not* at the same index in tt2, so I need a reverse lookup
>>>of the type number. For example:
>>>
>>>event type 10:
>>>occurs 1 time (from table tt)
>>>occurs 96 times in total (from table tt2) <- this is found by looking
>>>up type '10' in tt2 and reading out 96
>>>
>>>result: 1/96
>>>
>>>
>>>
>>>I have tried programming this as follows:
>>
>>
>>It's *much* easier. Just make V32 a factor. After that, table() knows
>>all the levels and counts also the zeros:
>>
>>V32 <- factor(V32)
>>table(V32[V48 == 0]) / table(V32)
>>
>>Uwe Ligges
>>
>>
>>
>>
>>
>>>tt <- table(V32[V48 == 0]) # this is taking the events I want counted
>>>tt2 <- table(V32) # this is taking the total event count per type
>>>df <- as.data.frame(tt) # convert to data frame to allow access to type numbers .. ?
>>>df2 <- as.data.frame(tt2) #same here
>>>
>>>print(tt);
>>>print(df);
>>>
>>>print(tt2);
>>>print(df2);
>>>
>>>for( i in 1:length(tt) ) { #loop over smallest table tt
>>> print("i:"); #index
>>> print(i);
>>> print( "denominator "); #corresponds to the "1" in the example
>>> print( df$Freq[i] );
>>> denomtag = ( df$Var1[ i ] ); # corresponds to the "10" in the example, being the type number of the event
>>> print("denomtag ");
>>> print( denomtag );
>>> print( "nominator: " );
>>> print( df2[2][ df[1] == as.numeric(denomtag) ] ); #this fails ....
>>> #result would then be something like: denominator / nominator
>>>}
>>>
>>>The problem is that the factor names that are extracted in 'denomtag'
>>>are not usable as index in the dataframe in the last line. I have
>>>tried converting to numeric using 'as.numeric', but that fails since
>>>this returns the index in the factor rather than the factor name I need
>>>from the list.
>>>
>>>Any suggestions .. ? I am sure it's dead simple, as always.
>>>
>>>
>>>Thanks,
>>>
>>>
>>>Piet (Belgium)
>>>
>>>PS: please reply to pvremortNOSPAM@vub.ac.be
>>>
>>>______________________________________________
>>>R-help@stat.math.ethz.ch mailing list
>>>https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>>>PLEASE do read the posting guide!
>>>http://www.R-project.org/posting-guide.html
>>
>
>
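
The original question in this thread notes that as.numeric() on the factor returns the position of the level rather than the event type itself. A minimal sketch of that behaviour and of the usual conversion through the labels follows; the factor f is an illustrative stand-in for the Var1 column produced by as.data.frame(table(...)).

```r
# Event types stored as a factor, as in the Var1 column of as.data.frame(tt).
f <- factor(c(0, 2, 10, 11, 13, 14, 15))

codes  <- as.numeric(f)                 # internal level codes: 1, 2, ..., 7
labels <- as.numeric(as.character(f))   # the event types: 0, 2, 10, ...
same   <- as.numeric(levels(f))[f]      # equivalent conversion via the levels

codes
labels
```

With the factor approach suggested earlier in the thread this conversion is not needed at all, but it explains why the lookup in the loop did not find the expected rows.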

Received on Mon Jul 19 01:47:53 2004

This archive was generated by hypermail 2.1.8 : Fri 18 Mar 2005 - 02:36:42 EST