Re: [R] Defining reference category for a cph model summary inside of a "for" loop

From: Wells, Brian <WELLSB_at_ccf.org>
Date: Mon, 31 Mar 2008 10:07:55 -0400

Frank,

Thanks again. I didn't realize that continuous variables could be manipulated that way inside the summary function.

I realize that my code was kind of confusing.

The variables "A" through "F" are all categorical variables. Each has four levels, named "1st Quartile" through "4th Quartile".

I tried the code below with the same result.
>print(summary(f, eval(parse(text=paste(i,"='1st Quartile'", sep='')))))

In the output, the reference category is different for each of the variables.
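A likely reason the attempt above changes nothing: parsing and evaluating the text A='1st Quartile' on its own runs an assignment, so it creates a variable named A and hands back only the bare string, which then reaches summary without a name. The sketch below uses base R only to show the mechanism. show_args() is a hypothetical stand-in for summary(f, ...), not part of Design, and simply reports the names of the extra arguments it receives.

```r
# show_args() is a hypothetical stand-in for summary(f, ...): it returns
# the names of whatever extra arguments it was given.
show_args <- function(object, ...) names(list(...))

i <- "A"

# Evaluating the text "A='1st Quartile'" is an assignment: it creates a
# variable A in the workspace and returns the bare string.
val <- eval(parse(text = paste(i, "='1st Quartile'", sep = "")))
val                      # "1st Quartile"
show_args(NULL, val)     # NULL: the argument arrives without a name

# Option 1: build a named argument list and pass it with do.call().
extra <- setNames(list("1st Quartile"), i)
show_args_names <- do.call(show_args, c(list(NULL), extra))
show_args_names          # "A"

# Option 2: parse and evaluate the whole call, not just the argument.
call_text <- paste("show_args(NULL, ", i, "='1st Quartile')", sep = "")
eval(parse(text = call_text))   # "A"
```

Applied to the loop, the first option would read do.call(summary, c(list(f), extra)). That line is an untested assumption about how summary for a cph fit handles its named arguments, and it presumes the usual datadist setup is already in place.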

Brian
-----Original Message-----
From: Frank E Harrell Jr [mailto:f.harrell_at_vanderbilt.edu]
Sent: Sunday, March 30, 2008 9:14 AM
To: Wells, Brian
Cc: r-help_at_r-project.org
Subject: Re: [R] Defining reference category for a cph model summary inside of a "for" loop

Wells, Brian wrote:
> Dr. Harrell,
> Thanks for your help.
>
> I tried:
>
>> print(summary(f,parse(text=paste(i,'="1st Quartile"', sep=''))))
>
> Same result. No error, the reference category simply doesn't change.

That's good, because the default in summary is to compare the outer quartiles for a continuous variable. And as I said before, the string '1st Quartile' has no special meaning for R or Design.

First get what you are trying to do to work without parse (when you do use parse, you'll need eval() with it). When you want total control over a setting, say getting a hazard ratio for the .2 to the .8 quantile, do something like

summary(f, age=quantile(age,c(.2,.8),na.rm=TRUE))
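To make the argument in that call concrete, here is a small base R sketch of what quantile() returns for the .2 and .8 probabilities. The age vector is made up for illustration; it is not from the data in this thread.

```r
# Hypothetical ages; the NA is dropped because na.rm = TRUE.
age <- c(20, 30, 40, 50, 60, NA)

# Two-element named vector: the 20th and 80th percentiles of age.
q <- quantile(age, c(.2, .8), na.rm = TRUE)
names(q)     # "20%" "80%"
unname(q)    # about 28 and 52 with the default quantile type
```

Passed as age=q, these two values are the low and high settings between which the hazard ratio is computed.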

Frank

>
> Brian
>
> -----Original Message-----
> From: Frank E Harrell Jr [mailto:f.harrell_at_vanderbilt.edu]
> Sent: Friday, March 28, 2008 8:34 PM
> To: Wells, Brian
> Cc: r-help_at_r-project.org
> Subject: Re: [R] Defining reference category for a cph model summary inside of a "for" loop
>
> Wells, Brian wrote:
>> I have the following code.
>>
>>> f <- cph(formula = Surv(TimeToDeath, Dead == "Yes") ~ 1,
>>           data = single.dat, x = TRUE, y = TRUE, surv = TRUE)
>>
>>> for (i in c('A', 'B', 'C', 'D', 'E', 'F')) {
>>>   f <- update(f, as.formula(paste('Surv(TimeToDeath, Dead == "Yes") ~', i, sep = '')))
>>>   print(summary(f, paste(i, "=1st Quartile", sep = '')))
>>> }
>>
>> There is no error message generated in R, but R ignores the reference
>> category defined with paste in the summary function for the cph model.
>>
>> The output uses the "1st Quartile" as the reference category to
>> calculate hazards for some of the variables defined by i, but not all of
>> them.
>
>
> Your code is confusing. What is to the right of ~ in a formula is a
> predictor variable name, not a value. If your variables are named A, B,
> C, ... you are OK.
>
> '1st Quartile' has no special meaning to R or Design, and you can't pass
> a character string as a second argument to summary and expect it to
> work.
>
> You will need parse(text=paste(...)) to create an appropriate
> expression.
>
> But Design gives you inter-quartile range hazard ratios by default
> anyway.
>
> Beware of getting hazard ratios that are not adjusted for other
> variables needed in the model.
>
> Frank Harrell
>
>> Any help would be greatly appreciated.
>>
>> thanks
>>
>> Brian J. Wells, MD, MS
>> Research Associate
>> Quantitative Health Sciences
>> Cleveland Clinic
>>
>

-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University


______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Mon 31 Mar 2008 - 14:30:57 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 31 Mar 2008 - 15:00:26 GMT.