# [R] How to do a time-stratified case-crossover analysis for air pollution data?

From: Nilsson Fredrik X <Fredrik.X.Nilsson_at_skane.se>
Date: Fri, 07 Mar 2008 15:47:39 +0100

Dear Experts,

I am trying to do a time-stratified case-crossover analysis on air pollution data and number of myocardial infarctions. In order to avoid model selection bias, I started with a simple simulation.

I'm still not sure if my simulation is right. But the results I get from the "ts-case-crossover" are much more variable than those from a glm.

Is this:

1. Due to the simple relation of log-rate of mi cases being = alpha + beta*pm that the glm results are more precise?
2. Due to me using method="approximate" instead of default "exact" in the clogit?
3. Due to the fact that the "approximate" method in clogit use "breslow" for handling ties and not "efron"?
4. Due to that I've misunderstood how to arrange data when doing a case-crossover, such that there is a way of using "exact", that wouldn't go berserk due to the many ties (see #Berserk script at bottom).
5. The prize to pay for using a case-crossover analysis?

It should perhaps be noted that, because of the absence of individual data, the exposure to air-pollution (pm10) is assumed to be common to all individuals on a certain day.

I'd be most grateful for any help and ideas on this matter.

Best regards,

Fredrik Nilsson, PhD

PS. I am aware of the limitations that Whitaker et al. presented in Environmetrics 2007; 18: 157-171, but tried to use the time-stratified case-crossover as it could be a "simple", but fairly correct way of doing this type of analysis.

Simulation script:

library(survival)

n<-2*365

samp<-100

ti<-1:n

ar1<-rnorm(1)

for (i in 2:n){

ar1[i]<-0.2*ar1[i-1]+rnorm(1)

}

#pm10<-2 + .5*sin(2*pi*ti/365 + 127) + 0.1*ar1

startdate<-"1992-07-01"

date1<-as.Date(startdate)

dates<-date1 + (ti-1)

tyda<-weekdays(dates)

tymo<-months(dates)

tyyear<-as.character(dates)

for (i in 1:n)

{

}

moeff<-cumsum(c(1,31,28,31,30,31,30,31,31,30,31,30))

a<-as.Date("1994-12-31")

monthnames<-months(a+moeff)

tymofa<-match(tymo,monthnames)

moeff<-.5*sin(2*pi*moeff/365 + 127)

#within strata

pm10<-2 + .5*moeff[tymofa] + 0.1*ar1

al<- -2.5

be<- 1.4

qres<-numeric(samp)

glmres<-qres

gamres<-qres

for(q in 1:samp)

{

rate<-exp(al + be*pm10)

mi<-rpois(length(rate),lambda=rate)

date1<-as.Date(startdate)

dates<-date1 + (ti-1)

tyda<-weekdays(dates)

tymo<-months(dates)

tyyear<-as.character(dates)

for (i in 1:n)

{

}

#replicate cases

air<-data.frame(mi, tyda, tymo, tyyear, pm10, dates)

air\$stratn<-as.numeric(strata(air\$tyda, air\$tymo,air\$tyyear))

lest<-unique(air\$stratn)

air\$nctrl<-0

airbase<-air

for (i in 1:length(lest))

{

a<-which(air\$stratn==lest[i])

for (j in 1:length(a))

{

air\$nctrl[a[j]]<-sum(air\$mi[a[-j]])

}

}

#create cases and controls

cami<-rep(1,sum(mi))

ctmi<-rep(0,sum(air\$nctrl))

capm<-rep(pm10, mi)

ctpm<-rep(pm10, air\$nctrl)

cast<-rep(air\$stratn, mi)

ctst<-rep(air\$stratn, air\$nctrl)

ctdy<-rep(dates, air\$nctrl)

cases<-c(cami, ctmi)

stranu<-c(cast, ctst)

pmva<-c(capm, ctpm)

air2<-data.frame(cases, days, pmva,stranu)

air.cl<-clogit(cases~pmva + strata(stranu), method="approximate", data=air2)

air.glm<-glm(mi~pm10, family=poisson, data=air)

qres[q]<-as.numeric(coef(air.cl))

glmres[q]<-as.numeric(coef(air.glm)[2])

}

par(mfrow=c(2,1))

plot(density(qres))

plot(density(glmres))

fivenum(qres)

fivenum(glmres)

# Berserk script

#

#case<-c(5,0,2,1)

#airq<-c(10,2,3,3)

#

#nctrl<-0*case

#for (i in 1:length(case))

#{

# nctrl[i]<-sum(case[-i])

#}

#

#airca<-rep(airq,case)

#airct<-rep(airq,nctrl)

#

#ctrls<-rep(0,sum(nctrl))

#

#mi<-c(cases,ctrls)

#a<-Sys.time()

#air.cl<-clogit(mi~airq)

#b<-Sys.time()

#b-a

#a<-Sys.time()

#air.cl<-clogit(mi~airq, method="approximate")

#b<-Sys.time()

#b-a

[[alternative HTML version deleted]]

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Fri 07 Mar 2008 - 14:51:38 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 07 Mar 2008 - 15:30:20 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.