Re: [R] a problem: factors, names, tables ..

Date: Mon 19 Jul 2004 - 02:58:14 EST

Yes, you are right! I did not realise that the first variable is a subset of the second. I tried this instead:

> v1 <- rep( c(0, 2, 10, 11, 13, 14, 15), c(15, 6, 1, 3, 8, 15, 10) )
> v2 <- rep( c(0, 1, 2, 10, 11, 12, 13, 14, 15),
+            c(817, 119, 524, 96, 700, 66, 559, 358, 283) )
> table(factor(v1)) / table(factor(v2))
Error in table(factor(v1))/table(factor(v2)) :
        non-conformable arrays
>
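
The division fails because the two tables have 7 and 9 cells respectively. For what it is worth, a lookup by name sidesteps this. The following is only a sketch: it assumes every type in v1 also occurs in v2, and t1, t2 and ratio are illustrative names.

```r
v1 <- rep(c(0, 2, 10, 11, 13, 14, 15), c(15, 6, 1, 3, 8, 15, 10))
v2 <- rep(c(0, 1, 2, 10, 11, 12, 13, 14, 15),
          c(817, 119, 524, 96, 700, 66, 559, 358, 283))

t1 <- table(v1)   # 7 cells
t2 <- table(v2)   # 9 cells, hence the non-conformable error

# Assumption: every type name in t1 also exists in t2
stopifnot(all(names(t1) %in% names(t2)))

# Pick the t2 counts by type name, in the order of t1, then divide
ratio <- as.vector(t1) / as.vector(t2[names(t1)])
names(ratio) <- names(t1)
ratio
```

Unlike the factor approach, this keeps only the types that actually occur in v1, so the zero rows are dropped.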

On Sun, 2004-07-18 at 16:39, Uwe Ligges wrote:
>
> > Please give a reproducible example. Here is one way:
> >
> > # generate example
> >
> >>v1 <- rep( c(0, 2, 10, 11, 13, 14, 15), c(15, 6, 1, 3, 8, 15, 10) )
> >>t1 <- table(v1)
> >>t1
> >
> > v1
> >  0  2 10 11 13 14 15
> > 15  6  1  3  8 15 10
> >
> >
> >>v2 <- rep( c(0, 1, 2, 10, 11, 12, 13, 14, 15), c(817, 119, 524, 96,
> > 700, 66, 559, 358, 283) )
> >
> >>t2 <- table(v2)
> >>t2
> >
> > v2
> >   0   1   2  10  11  12  13  14  15
> > 817 119 524  96 700  66 559 358 283
> >
> > # find results
> >
> >>merge(t1, t2, by=1, all.x=TRUE)
> >
> >   v1 Freq.x Freq.y
> > 1  0     15    817
> > 2 10      1     96
> > 3 11      3    700
> > 4 13      8    559
> > 5 14     15    358
> > 6 15     10    283
> > 7  2      6    524
> >
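
In the merge output above the rows come out ordered by the key as text (type 2 lands after 15). Here is a sketch of one way to restore numeric order and add the ratio, assuming all type labels are numeric; d1, d2, m, type and ratio are illustrative names.

```r
v1 <- rep(c(0, 2, 10, 11, 13, 14, 15), c(15, 6, 1, 3, 8, 15, 10))
v2 <- rep(c(0, 1, 2, 10, 11, 12, 13, 14, 15),
          c(817, 119, 524, 96, 700, 66, 559, 358, 283))

d1 <- as.data.frame(table(v1))   # columns: v1, Freq
d2 <- as.data.frame(table(v2))   # columns: v2, Freq

# Left join on the first column of each data frame
m <- merge(d1, d2, by = 1, all.x = TRUE)   # Freq.x, Freq.y

# Convert the key labels (not the factor codes) to numbers and sort
m$type <- as.numeric(as.character(m[[1]]))
m <- m[order(m$type), ]
m$ratio <- m$Freq.x / m$Freq.y
m
```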
> > Uwe's suggestion may need a slight modification, as the two tables have
> > different labels/levels and are hence non-conformable for division.
>
> No!
> Uwe's suggestion works perfectly well, since *the same* variable is used
> for calculating both tables, except: in one case it's a *subset*. Hence
> converting to a factor first is completely sufficient!!!
>
> Reproducible example from your v2:
> v2 <- rep(c(0, 1, 2, 10, 11, 12, 13, 14, 15),
> c(817, 119, 524, 96, 700, 66, 559, 358, 283))
> v1 <- factor(v2)
> set.seed(1)
> v1 <- sample(v1, 10)
> table(v1)
> table(v2)
> table(v1)/table(v2)
>
> So, what's the problem???
>
> Uwe Ligges
>
>
>
> >
> >>t2.f <- table( v2.f <- factor(v2) )
> >>t1.f <- table( v1.f <- factor(v1, levels=levels(v2.f)) )
> >
> >
> >>cbind( t1.f, t2.f, ratio=t1.f / t2.f )
> >
> >    t1.f t2.f       ratio
> > 0    15  817 0.018359853
> > 1     0  119 0.000000000
> > 2     6  524 0.011450382
> > 10    1   96 0.010416667
> > 11    3  700 0.004285714
> > 12    0   66 0.000000000
> > 13    8  559 0.014311270
> > 14   15  358 0.041899441
> > 15   10  283 0.035335689
> >
> >
> > Also have a look at this related posting
> > http://tolstoy.newcastle.edu.au/R/help/04/06/0594.html
> >
> >
> >
> > On Sun, 2004-07-18 at 13:05, Uwe Ligges wrote:
> >
> >>PvR wrote:
> >>
> >>>Hi all,
> >>>
> >>>I am *completely* lost in trying to solve a relatively simple task.
> >>>
> >>>I want to compute the relative number of occurrences of an event, the
> >>>data of which sits in a large table (read from file).
> >>>
> >>>I have the occurrences of the events in a table 'tt'
> >>>
> >>> 0  2 10 11 13 14 15
> >>>15  6  1  3  8 15 10
> >>>
> >>>.. meaning that event of type '0' occurs 15 times, type '2' occurs 6
> >>>times etc.
> >>>
> >>>Now I want to divide the occurrence counts by the total number of events
> >>>of that type, which is given in the table tt2:
> >>>
> >>>  0   1   2  10  11  12  13  14  15
> >>>817 119 524  96 700  66 559 358 283
> >>>
> >>>Saying that event type '0' occurs 817 times, type '1' occurs 119
> >>>times etc.
> >>>
> >>>The obvious problem is that not all events in tt2 are present in tt,
> >>>which is the result of the experiment so that cannot be changed.
> >>>
> >>>What needs to be done is loop over tt, take the occurrence count, and
> >>>divide that by the corresponding count in tt2. This corresponding
> >>>tt2 count is *not* at the same index in tt2, so I need a reverse lookup
> >>>of the type number. For example:
> >>>
> >>>event type 10:
> >>>occurs 1 time (from table tt)
> >>>occurs 96 times in total (from table tt2) <- this is found by looking
> >>>up type '10' in tt2 and reading out 96
> >>>
> >>>result: 1/96
> >>>
> >>>
> >>>
> >>>I have tried programming this as follows:
> >>
> >>
> >>It's *much* easier. Just make V32 a factor. After that, table() knows
> >>all the levels and counts also the zeros:
> >>
> >>V32 <- factor(V32)
> >>table(V32[V48 == 0]) / table(V32)
> >>
> >>Uwe Ligges
> >>
> >>
> >>
> >>
> >>
> >>>tt <- table(V32[V48 == 0]) # this is taking the events I want counted
> >>>tt2 <- table(V32) # this is taking the total event count per type
> >>>df <- as.data.frame(tt) # ... type-numbers .. ?
> >>>df2 <- as.data.frame(tt2) #same here
> >>>
> >>>print(tt);
> >>>print(df);
> >>>
> >>>print(tt2);
> >>>print(df2);
> >>>
> >>>for( i in 1:length(tt) ) { #loop over smallest table tt
> >>> print("i:"); #index
> >>> print(i);
> >>> print( "denominator "); #corresponds to the "1" in the example
> >>> print( df$Freq[i] );
> >>> # corresponds to the "10" in the example, being the type number of the event
> >>> denomtag = ( df$Var1[ i ] );
> >>> print("denomtag ");
> >>> print( denomtag );
> >>> print( "nominator: " );
> >>> print( df2[2][ df[1] == as.numeric(denomtag) ] ); #this fails ....
> >>> #result would then be something like : denominator / nominator
> >>>}
> >>>
> >>>The problem is that the factor names that are extracted in 'denomtag'
> >>>are not usable as an index into the dataframe in the last line. I have
> >>>tried converting to numeric using 'as.numeric', but that fails since
> >>>this returns the index in the factor rather than the factor name I need
> >>>from the list.
> >>>
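
A minimal sketch of what goes wrong there (f is a made-up factor, not the original data): as.numeric() on a factor returns the internal level codes, so the labels have to go through as.character(), or through levels(), first.

```r
f <- factor(c(0, 2, 10, 11))

codes <- as.numeric(f)                 # level codes: 1 2 3 4
vals  <- as.numeric(as.character(f))   # the labels as numbers: 0 2 10 11
alt   <- as.numeric(levels(f))[f]      # same values, indexing the levels

print(codes)
print(vals)
```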
> >>>Any suggestions .. ? I am sure it's dead simple, as always.
> >>>
> >>>
> >>>Thanks,
> >>>
> >>>
> >>>Piet (Belgium)
> >>>
> >>>
> >>>______________________________________________
> >>>R-help@stat.math.ethz.ch mailing list
> >>>https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> >>>http://www.R-project.org/posting-guide.html
> >>