# Re: [R] A. Mani : Avoiding loops

From: Petr Pikal <petr.pikal_at_precheza.cz>
Date: Mon 22 Aug 2005 - 14:40:45 EST

On 20 Aug 2005 at 3:26, A. Mani wrote:

> On Friday 19 August 2005 11:54, Sean O'Riordain wrote:
> > Hi,
> > I'm not sure what you actually want from your email (following the
> > posting guide is a good way of helping you explain things to the
> > rest of us in a way we understand - it might even answer your
> > question!)
> >
> > I'm only a beginner at R so no doubt one of our expert colleagues
> > will help me...
> >
> > > fred <- data.frame()
> > > fred <- edit(fred)
> > > fred
> >
> > A B C D E
> > 1 1 2 X Y 1
> > 2 2 3 G L 1
> > 3 3 1 G L 5
> >
> > > fred[,3]
> >
> > [1] X G G
> > Levels: G X
> >
> > > fred[fred[,3]=="G",]
> >
> > A B C D E
> > 2 2 3 G L 1
> > 3 3 1 G L 5
> >
> > so at this point I can create a new dataframe with column 3 (C) ==
> > "G"; either explicitly or implicitly...
> >
> > and if I want to calculate the sum() of column E, then I just say
> > something like...
> >
> > > sum(fred[fred[,3]=="G",][,5])
> >
> > [1] 6
> >
> >
> > now naturally being a bit clueless at manipulating stuff in R, I
> > didn't know how to do this before I started... and you guys only get
> > to see the lines that I typed in and got a "successful" result...
> >
> > according to section 6 of the "Introduction to R" manual which comes
> > with R, I could also have said
> >
> > > sum(fred[fred$C=="G",]$E)
> >
> > [1] 6
> >
> > Hmmm.... I wonder would it be reasonable to put an example of this
> > type into section 2.7 of the "Introduction to R"?
> >
> >
> > cheers!
> > Sean
> >
> > On 18/08/05, A. Mani <a_mani_sc_gs@vsnl.net> wrote:
> > > Hello,
> > > I want to avoid loops in the following situation. There is a
> > > 5-col dataframe with col headers alone. two of the columns are
> > > non-numeric. The problem is to calculate statistics(scores) for
> > > each element of one column. The functions depend on matching in
> > > the other non-numeric column.
> > >
> > > A B C E F
> > > 1 2 X Y 1
> > > 2 3 G L 1
> > > 3 1 G L 5
> > > and so on ...30000+ entries.
> > >
> > > I need scores for col E entries which depend on conditional
> > > implications.
> > >
> > >
> > > Thanks,
> > >
> Hello,
> Sorry about the incomplete problem. Here is a better version for the
> problem: (the measure is not simple)
> The data frame is like
> col1 col2 col3 col4 col5
> <num> <nonum> <nonum> <num> <num>
> A B C E F
> There are repeated strings in col3, col2. Problem : Calculate
> Measure(Ci) = [No. of repeats of Ci *100] + [If (Bi, Ci) is same as
> (Bj, Cj) and 6>= Ej - Ei >=3 then add 100 else 10] .

Hi

I am not sure what exactly you would like to compute; a **working** example would help. But if you want to do some computation for row "i" which depends on row "j", I suppose that you cannot avoid loops.

Generally, you can use one of aggregate, tapply, by or ave for a computation split by a factor. See their help pages.

tapply(vector or data frame, list(factors), function)

is the standard form.
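
As a rough sketch only: the data below are the three example rows quoted earlier in the thread, not the real 30000-row data, and the names `sums_by_C`, `agg` and `repeats_term` are made up for illustration. The last line is one possible reading of the "No. of repeats of Ci * 100" term, not the full measure.

```r
# Toy data modelled on the three example rows quoted earlier in the thread
fred <- data.frame(
  A = c(1, 2, 3),
  B = c(2, 3, 1),
  C = factor(c("X", "G", "G")),
  D = factor(c("Y", "L", "L")),
  E = c(1, 1, 5)
)

# tapply: one value per level of C (here the sum of E within each group)
sums_by_C <- tapply(fred$E, list(fred$C), sum)

# aggregate: the same grouping, returned as a data frame
agg <- aggregate(fred$E, by = list(C = fred$C), FUN = sum)

# ave: one value per row, so the result lines up with the original rows.
# Here: how many rows share this row's value of C, times 100
# (an assumed reading of the "number of repeats" term).
repeats_term <- ave(fred$E, fred$C, FUN = length) * 100
```

tapply and aggregate return one result per group, while ave returns a vector as long as the input, which is probably what you need if each row is to get its own score.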

HTH
Petr

>
>
> Actually it is to be stretched further by adding similar blocks.
>
> How do we use *apply or
> something else in the situation ?
>
>
> In prolog it is extremely easy, but here it is not quite...
>
>
> A. Mani
> Member, Cal. Math. Soc
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help