From: A Mani <a.manigs_at_gmail.com>

Date: Mon 22 Aug 2005 - 23:54:31 EST


R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Mon Aug 22 23:59:02 2005


Re: A. Mani : Avoiding loops (Petr Pikal)

> Message: 9

> Date: Mon, 22 Aug 2005 06:40:45 +0200
> From: "Petr Pikal" <petr.pikal@precheza.cz>
> Subject: Re: [R] A. Mani : Avoiding loops
> To: "A. Mani" <a_mani_sc_gs@vsnl.net>, r-help
> <r-help@stat.math.ethz.ch>
>
> On 20 Aug 2005 at 3:26, A. Mani wrote:
>
> > On Friday 19 August 2005 11:54, Sean O'Riordain wrote:
> > > Hi,
> > > I'm not sure what you actually want from your email (following the
> > > posting guide is a good way of helping you explain things to the
> > > rest of us in a way we understand - it might even answer your
> > > question!)
> > >
> > > I'm only a beginner at R so no doubt one of our expert colleagues
> > > will help me...
> > >
> > > > fred <- data.frame()
> > > > fred <- edit(fred)
> > > > fred
> > >
> > >   A B C D E
> > > 1 1 2 X Y 1
> > > 2 2 3 G L 1
> > > 3 3 1 G L 5
> > >
> > > > fred[,3]
> > >
> > > [1] X G G
> > > Levels: G X
> > >
> > > > fred[fred[,3]=="G",]
> > >
> > >   A B C D E
> > > 2 2 3 G L 1
> > > 3 3 1 G L 5
> > >
> > > so at this point I can create a new dataframe with column 3 (C) ==
> > > "G"; either explicitly or implicitly...
> > >
> > > and if I want to calculate the sum() of column E, then I just say
> > > something like...
> > >
> > > > sum(fred[fred[,3]=="G",][,5])
> > >
> > > [1] 6
> > >
> > >
> > > now naturally being a bit clueless at manipulating stuff in R, I
> > > didn't know how to do this before I started... and you guys only get
> > > to see the lines that I typed in and got a "successful" result...
> > >
> > > according to section 6 of the "Introduction to R" manual which comes
> > > with R, I could also have said
> > >
> > > > sum(fred[fred$C=="G",]$E)
> > >
> > > [1] 6
> > >
> > > Hmmm.... I wonder would it be reasonable to put an example of this
> > > type into section 2.7 of the "Introduction to R"?
> > >
> > >
> > > cheers!
> > > Sean
> > >
> > > On 18/08/05, A. Mani <a_mani_sc_gs@vsnl.net> wrote:
> > > > Hello,
> > > > I want to avoid loops in the following situation. There is a
> > > > 5-col dataframe with col headers alone. Two of the columns are
> > > > non-numeric. The problem is to calculate statistics (scores) for
> > > > each element of one column. The functions depend on matching in
> > > > the other non-numeric column.
> > > >
> > > > A B C E F
> > > > 1 2 X Y 1
> > > > 2 3 G L 1
> > > > 3 1 G L 5
> > > > and so on ...30000+ entries.
> > > >
> > > > I need scores for col E entries which depend on conditional
> > > > implications.
> > > >
> > > >
> > > > Thanks,
> > > >
> > Hello,
> > Sorry about the incomplete problem. Here is a better version of the
> > problem: (the measure is not simple)
> > The data frame is like
> > col1 col2 col3 col4 col5
> > <num> <nonum> <nonum> <num> <num>
> > A B C E F
> > There are repeated strings in col3, col2. Problem : Calculate
> > Measure(Ci) = [No. of repeats of Ci *100] + [If (Bi, Ci) is same as
> > (Bj, Cj) and 6>= Ej - Ei >=3 then add 100 else 10] .
>
> Hi
>
> I am not sure what exactly you would like to compute,
> **working** example could help. But if you want to do some
> computation for row "i" which depends on row "j", I suppose that
> you can not avoid loops.
>
> Generally you can use one of aggregate, tapply, by or ave for some
> computation split by factor. See help pages.
>
> tapply(vector or data frame, list(factors), function)
>
> is the standard form.
>
> HTH
> Petr
>
>
> >
> >
> > Actually it is to be stretched further by adding similar blocks.
> >
> > How do we use *apply or
> > something else in the situation ?
> >
> >
> > In prolog it is extremely easy, but here it is not quite...
> >
> >
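As a rough sketch of the tapply/ave suggestion quoted above, the small data frame from Sean's message can be retyped by hand and summarised by group. The object names sums and reps are made up for this illustration; tapply gives one value per group, while ave gives one value per row.

```r
# Sean's example data frame, retyped by hand for illustration
fred <- data.frame(A = 1:3, B = c(2, 3, 1),
                   C = c("X", "G", "G"), D = c("Y", "L", "L"),
                   E = c(1, 1, 5))

# Sum of E within each level of C: one value per group
sums <- tapply(fred$E, fred$C, sum)

# ave() repeats the group result on every row; here it is the group size,
# i.e. the "number of repeats" of each C value
reps <- ave(fred$E, fred$C, FUN = length)

print(sums)
print(reps)
```

Multiplying reps by 100 would give the "No. of repeats of Ci * 100" part of the measure without an explicit loop.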

Here is some code and a little data

dat <- read.table("/home/project5R/datasplf.csv", header=TRUE,
sep=",", na.strings="NA", dec=".", strip.white=TRUE)
attach(dat)

showData(dat, placement='-20+200', font=.logFont, maxwidth=80, maxheight=30)
x <- as.matrix(dat)

x1 <- as.vector(x[,1])

xd1 <- as.Date(x1, format= "%m-%d-%Y")

n <- length(x1)

n

x2 <- as.vector(x[,2])

length(x2)

x3 <- as.vector(x[,3])

length(x3)

x4 <- as.vector(x[,4])

x5 <- as.numeric(x[,5])  # x is a character matrix, so convert the test values back to numbers

x5[is.na(x5)] <- 0

xd4 <- as.Date(x4, format= "%m-%d-%Y")

xd4

p6 <- (1-(abs(x5 - 6)/6))*100

p6
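To make the p6 formula concrete, here is a small worked example. The input values are made up for illustration and are not taken from the posted data.

```r
# Illustrative test values only (not from the posted data)
test_value <- c(6, 7, 3, 0)

# Same formula as p6: 100 at a value of 6, falling linearly on either side
p6_demo <- (1 - abs(test_value - 6) / 6) * 100
print(p6_demo)
```

A value of 6 scores 100 and a value of 7 scores (1 - 1/6) * 100. A missing test that was replaced by 0 scores 0, and any value above 12 would give a negative score.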

xd1 <- as.Date(x1, format= "%m-%d-%Y")

xd1

x23 <- cbind(x2,x3)

xp <- paste(x2,x3)

xp

y <- cbind(x23,xd4,xd1,xp)

# The score to be computed is for the doctors. It is
#   no. of patients * 100
#   + rate of decrease of diabetic score * 1000
#   + no. of tests at approx. 3 months * ... (see below)

# To be debugged (loops)

sc <- numeric(n)

# 100 points for every row j with the same ID (x2) and doctor (x3) as row i
for (i in 1:n) {
  for (j in 1:n) {
    if (identical(x2[[i]], x2[[j]]) && identical(x3[[i]], x3[[j]])) {
      sc[[i]] <- sc[[i]] + 100
    }
  }
}
sc
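The counting loop for sc can probably be replaced by a grouped count. The sketch below uses made-up toy vectors standing in for x2 (ID) and x3 (doctor), and a pasted key like the xp built earlier; it is one possible approach, not the only one.

```r
# Hypothetical toy columns standing in for x2 (ID) and x3 (doctor)
x2 <- c("2358368.261", "2358368.261", "2177532.174", "2358368.261")
x3 <- c("152N7R", "152N7R", NA, "SH7RM9N")

# paste() turns NA into the string "NA", so rows without a doctor still group
key <- paste(x2, x3)

# Number of rows sharing each (ID, doctor) pair, repeated on every row, times 100
sc <- 100 * ave(rep(1, length(key)), key, FUN = sum)
print(sc)
```

This needs one pass over the data instead of n * n comparisons, which matters with 30000+ rows.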

scf <- numeric(n)
# 100 points for each same-ID, same-doctor pair whose test dates meet the 90-day condition
for (i in 1:n) for (j in 1:n) {
  gap <- abs(as.numeric(xd4[[i]] - xd4[[j]]))  # days between the two tests
  if (identical(x2[[i]], x2[[j]]) && identical(x3[[i]], x3[[j]]) &&
      isTRUE(abs(1 - gap/90) <= 1.25)) scf[[i]] <- scf[[i]] + 100
}

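scr <- numeric(n)
# change in test value per day between same-ID, same-doctor tests, times 1000
for (i in 1:n) for (j in 1:n) {
  gap <- abs(as.numeric(xd4[[i]] - xd4[[j]]))
  # gap > 0 skips i == j and same-day pairs, which would divide by zero
  if (identical(x2[[i]], x2[[j]]) && identical(x3[[i]], x3[[j]]) && isTRUE(gap > 0))
    scr[[i]] <- scr[[i]] + abs(x5[[i]] - x5[[j]]) / gap * 1000
}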
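The two pairwise loops (scf and scr) compare row i with row j, but only within the same ID and doctor. One way to cut the work down, sketched here under that assumption, is to split the row indices by group and use outer() inside each group. The function name near_90_score and the toy data are hypothetical, and the 1.25 threshold is copied from the posted condition as is.

```r
# Hypothetical helper: per-row count (times 100) of same-group tests whose
# date gap satisfies abs(1 - gap/90) <= 1.25, as in the posted condition
near_90_score <- function(test_date, key) {
  days <- as.numeric(test_date)              # Date -> days since 1970-01-01
  out  <- numeric(length(days))
  for (idx in split(seq_along(days), key)) { # loop over groups, not row pairs
    gap <- abs(outer(days[idx], days[idx], "-"))  # all pairwise gaps in the group
    out[idx] <- 100 * rowSums(abs(1 - gap / 90) <= 1.25, na.rm = TRUE)
  }
  out
}

# Toy data for illustration only
key       <- c("a", "a", "a", "b")
test_date <- as.Date(c("01-01-2003", "04-01-2003", "01-01-2004", "01-01-2003"),
                     format = "%m-%d-%Y")
scf_demo <- near_90_score(test_date, key)
print(scf_demo)
```

The rate term in scr could be accumulated the same way from the per-group gap matrix. A single outer() over all 30000+ rows would need an n by n matrix, so working group by group is likely the safer choice.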

sce <- numeric(n)
for (i in 1:n) sce[[i]] <- (1 - abs(x5[[i]] - 6)/6) * 100  # same values as p6 above

se <- scf + sce + scr + sc

score <- data.frame(doctor = x3, score = se)  # cbind() would turn se into character

DATA

"DOB","ID","DOCTOR","DATE of TEST","TEST1"
12-23-1921,2177532.174,NA,01-20-2003,NA
NA,2358368.261,"152N7R",01-26-2003,NA
NA,2358368.261,"152N7R",01-27-2003,NA
07-24-1938,2174903.913,NA,01-31-2003,6.7
12-25-1924,2185493.043,NA,01-31-2003,NA
07-21-1943,2181658.696,"K9PL9N,L",01-28-2003,7
05-24-1938,2306571.304,"SH7RM9N",01-13-2003,NA
07-29-1949,2296516.522,"H3001FR9",01-20-2003,NA
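For anyone who wants to try the sample rows without the original file, the sketch below reads them from a string and converts the two date columns directly in the data frame, which avoids the detour through as.matrix(). The name csv_text is made up for this example.

```r
# The sample rows from this post, pasted into a string
csv_text <- '"DOB","ID","DOCTOR","DATE of TEST","TEST1"
12-23-1921,2177532.174,NA,01-20-2003,NA
NA,2358368.261,"152N7R",01-26-2003,NA
NA,2358368.261,"152N7R",01-27-2003,NA
07-24-1938,2174903.913,NA,01-31-2003,6.7
12-25-1924,2185493.043,NA,01-31-2003,NA
07-21-1943,2181658.696,"K9PL9N,L",01-28-2003,7
05-24-1938,2306571.304,"SH7RM9N",01-13-2003,NA
07-29-1949,2296516.522,"H3001FR9",01-20-2003,NA'

dat <- read.csv(text = csv_text, na.strings = "NA", strip.white = TRUE,
                stringsAsFactors = FALSE)

# read.csv makes the header syntactic, so "DATE of TEST" becomes DATE.of.TEST
dat$DOB          <- as.Date(dat$DOB, format = "%m-%d-%Y")
dat$DATE.of.TEST <- as.Date(dat$DATE.of.TEST, format = "%m-%d-%Y")
str(dat)
```

Note that the doctor code "K9PL9N,L" contains a comma, so the quotes around it in the file are needed for the row to parse into five fields.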

Thanks,

- Mani
Member, Cal. Math. Soc

