[R] Re : A. Mani : Avoiding Loops

From: A Mani <a.manigs_at_gmail.com>
Date: Sat 20 Aug 2005 - 07:38:15 EST


On Friday 19 August 2005 11:54, Sean O'Riordain wrote:
> Hi,
> I'm not sure what you actually want from your email (following the
> posting guide is a good way of helping you explain things to the rest
> of us in a way we understand - it might even answer your question!).
>
> I'm only a beginner at R so no doubt one of our expert colleagues will
> help me...
>
> > fred <- data.frame()
> > fred <- edit(fred)
> > fred
>
>   A B C D E
> 1 1 2 X Y 1
> 2 2 3 G L 1
> 3 3 1 G L 5
>
> > fred[,3]
>
> [1] X G G
> Levels: G X
>
> > fred[fred[,3]=="G",]
>
>   A B C D E
> 2 2 3 G L 1
> 3 3 1 G L 5
>
> so at this point I can create a new dataframe with column 3 (C) ==
> "G"; either explicitly or implicitly...
>
> and if I want to calculate the sum() of column E, then I just say
> something like...
>
> > sum(fred[fred[,3]=="G",][,5])
>
> [1] 6
>
>
> now naturally being a bit clueless at manipulating stuff in R, I
> didn't know how to do this before I started... and you guys only get
> to see the lines that I typed in and got a "successful" result...
>
> according to section 6 of the "Introduction to R" manual which comes
> with R, I could also have said
>
> > sum(fred[fred$C=="G",]$E)
>
> [1] 6
>
> Hmmm.... I wonder would it be reasonable to put an example of this
> type into section 2.7 of the "Introduction to R"?
>
>
> cheers!
> Sean
>
> On 18/08/05, A. Mani <a_mani_sc_gs@vsnl.net> wrote:
> > Hello,
> > I want to avoid loops in the following situation. There is a
> > 5-col dataframe with col headers alone. two of the columns are
> > non-numeric. The problem is to calculate statistics(scores) for each
> > element of one column. The functions depend on matching in the other
> > non-numeric column.
> >
> > A B C E F
> > 1 2 X Y 1
> > 2 3 G L 1
> > 3 1 G L 5
> > and so on ...30000+ entries.
> >
> > I need scores for col E entries which depend on conditional implications.
> >
> >
> > Thanks,
> >
Hello,

Sorry about the incomplete problem statement. Here is a better version
(the measure is not simple). The data frame looks like this:

  col1    col2       col3       col4    col5
  <num>   <non-num>  <non-num>  <num>   <num>
  A       B          C          E       F

There are repeated strings in col2 and col3. The problem is to calculate

  Measure(Ci) = [No. of repeats of Ci * 100]
              + [if (Bi, Ci) is the same as (Bj, Cj) and 6 >= Ej - Ei >= 3
                 then add 100, else add 10].

Actually it is to be stretched further by adding similar blocks.

How do we use *apply or something else in this situation?

In Prolog it is extremely easy, but here it is not quite so...
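
For illustration, here is a minimal loop-free sketch of one possible reading
of the measure, using ave() and outer(). Several things in it are
assumptions and not part of the problem as stated: the toy values in df, the
helper name has_partner, reading "No. of repeats of Ci" as the number of rows
that share row i's C value, and reading the condition as "some other row j
has the same (B, C) pair and 3 <= Ej - Ei <= 6".

```r
# Toy data with the column names from the post; the values are made up.
df <- data.frame(
  A = 1:5,
  B = c("Y", "L", "L", "L", "Y"),
  C = c("X", "G", "G", "G", "X"),
  E = c(1, 1, 5, 9, 2),
  F = c(1, 1, 5, 2, 3),
  stringsAsFactors = FALSE
)

# Term 1: for each row, how many rows share its C value (assumed meaning
# of "No. of repeats of Ci").
n_rep <- ave(df$E, df$C, FUN = length)

# Hypothetical helper: within one (B, C) group, flag each row i for which
# some row j in the same group satisfies 3 <= E[j] - E[i] <= 6.
has_partner <- function(e) {
  d <- outer(e, e, function(ei, ej) ej - ei)  # d[i, j] = e[j] - e[i]
  rowSums(d >= 3 & d <= 6) > 0
}

# Term 2: apply the helper group-wise; ave() returns 0/1 here because the
# logical result is stored back into a numeric vector.
grp <- interaction(df$B, df$C, drop = TRUE)
hit <- ave(df$E, grp, FUN = has_partner) > 0

df$measure <- 100 * n_rep + ifelse(hit, 100, 10)
df$measure
# [1] 210 400 400 310 210
```

Note that outer() builds a k-by-k matrix for a group of k rows, so this is
only reasonable when the individual (B, C) groups are not too large; with
30000+ rows in a few big groups, a sort-based approach would be needed instead.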

-- 
A. Mani
Member, Cal. Math. Soc

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Sat Aug 20 07:43:31 2005
