From: <khosoda_at_med.kobe-u.ac.jp>

Date: Wed, 20 Apr 2011 23:44:04 +0900

Dear Prof. Harrell,

Thank you very much for your quick advice. I will try the rms package.
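For readers following the thread, the idea behind the penalized maximum likelihood estimation recommended in the quoted reply can be sketched in base R. This is only an illustration on simulated data: it does not call `lrm` (which takes a `penalty` argument for this purpose), and the names `pen_nll` and `fit_pen`, the data, and the penalty value 5 are all assumptions made for the example.

```r
# Simulated stand-in data; nothing here comes from the real study.
set.seed(2)
n <- 104
X <- matrix(rnorm(n * 5), n, 5)                  # five made-up predictors
y <- rbinom(n, 1, plogis(-1.2 + 1.0 * X[, 1]))   # binary outcome

# Negative log-likelihood of a logistic model plus a quadratic (ridge)
# penalty on the slopes. The intercept, beta[1], is left unpenalized.
pen_nll <- function(beta, X, y, lambda) {
  eta <- beta[1] + X %*% beta[-1]
  -sum(y * eta - log1p(exp(eta))) + 0.5 * lambda * sum(beta[-1]^2)
}

# Hypothetical helper: minimize the penalized criterion for a given lambda.
fit_pen <- function(lambda) {
  optim(rep(0, ncol(X) + 1), pen_nll, X = X, y = y, lambda = lambda,
        method = "BFGS")$par
}

b_ml  <- fit_pen(0)   # lambda = 0 gives ordinary maximum likelihood
b_pen <- fit_pen(5)   # lambda = 5 is arbitrary; it would be tuned in practice
print(round(cbind(ML = b_ml, penalized = b_pen), 3))
```

The penalized slopes are pulled toward zero relative to the unpenalized fit, which is the shrinkage that is meant to compensate for a low number of events per variable.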

Regarding model reduction, is the method I used for model 2 (clustering and recoding, both blinded to the outcome) permissible?
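To make the question concrete, an outcome-blinded reduction of this kind could look roughly like the sketch below. It runs on simulated data and uses base R's `hclust` on a correlation-based distance as a stand-in for pvclust; the data frame `dat`, the choice of k = 2, and the four binary features are assumptions for illustration, not the actual analysis.

```r
# Simulated stand-in data; none of these values come from the real study.
set.seed(1)
n <- 104
z <- rnorm(n)                          # shared factor so x1, x2, x4 correlate
dat <- data.frame(
  x1 = z + rnorm(n, sd = 0.3),
  x2 = z + rnorm(n, sd = 0.3),
  x3 = rnorm(n),
  x4 = z + rnorm(n, sd = 0.3),
  HT = rbinom(n, 1, 0.5),
  DM = rbinom(n, 1, 0.3),
  IHD = rbinom(n, 1, 0.2),
  smoking = rbinom(n, 1, 0.4)
)

# Step 1: cluster the continuous predictors using only their mutual
# correlations. The outcome is never used.
cont <- dat[, c("x1", "x2", "x3", "x4")]
d <- as.dist(1 - abs(cor(cont)))
hc <- hclust(d, method = "average")
clusters <- cutree(hc, k = 2)          # k = 2 is an arbitrary choice here

# Step 2: unweighted count of binary clinical features, also outcome-blind.
dat$ClinicalScore <- rowSums(dat[, c("HT", "DM", "IHD", "smoking")])

# Step 3: keep one representative per cluster plus the summary score.
reduced <- dat[, c("x2", "x3", "ClinicalScore")]
print(clusters)
```

The point of the sketch is that every step depends only on the predictors, so the reduction cannot be tuned toward the outcome.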

Sincerely,

--
KH

Received on Wed 20 Apr 2011 - 14:47:38 GMT

(11/04/20 22:01), Frank Harrell wrote:

> Deleting variables is a bad idea unless you make that a formal part of the
> BMA so that the attempt to delete variables is penalized for. Instead of
> BMA I recommend simple penalized maximum likelihood estimation (see the lrm
> function in the rms package) or pre-modeling data reduction that is blinded
> to the outcome variable.
> Frank
>
>
> 細田弘吉 wrote:
>>
>> Hi everybody,
>> I apologize for long mail in advance.
>>
>> I have data of 104 patients, which consists of 15 explanatory variables
>> and one binary outcome (poor/good). The outcome consists of 25 poor
>> results and 79 good results. I tried to analyze the data with logistic
>> regression. However, the 15 variables and 25 events means events per
>> variable (EPV) is much less than 10 (rule of thumb). Therefore, I used R
>> package, "BMA" to perform logistic regression with BMA to avoid this
>> problem.
>>
>> model 1 (full model):
>> x1, x2, x3, x4 are continuous variables and others are binary data.
>>
>> > x16.bic.glm <- bic.glm(outcome ~ ., data=x16.df,
>>       glm.family="binomial", OR=20, strict=FALSE)
>> > summary(x16.bic.glm)
>> (The output below has been cut off at the right edge to save space)
>>
>> 62 models were selected
>> Best 5 models (cumulative posterior probability = 0.3606 ):
>>
>>                 p!=0    EV          SD         model 1   model 2
>> Intercept       100    -5.1348545  1.652424  -4.4688   -5.15
>>                                                        -5.1536
>> age             3.3     0.0001634  0.007258   .
>> sex             4.0
>>   .M                   -0.0243145  0.220314   .
>> side            10.8
>>   .R                    0.0811227  0.301233   .
>> procedure       46.9   -0.5356894  0.685148   .        -1.163
>> symptom         3.8    -0.0099438  0.129690   .         .
>> stenosis        3.4    -0.0003343  0.005254   .
>> x1              3.7    -0.0061451  0.144084   .
>> x2              100.0   3.1707661  0.892034   3.2221    3.11
>> x3              51.3   -0.4577885  0.551466  -0.9154    .
>> HT              4.6
>>   .positive             0.0199299  0.161769   .         .
>> DM              3.3
>>   .positive            -0.0019986  0.105910   .         .
>> IHD             3.5
>>   .positive             0.0077626  0.122593   .         .
>> smoking         9.1
>>   .positive             0.0611779  0.258402   .         .
>> hyperlipidemia  16.0
>>   .positive             0.1784293  0.512058   .         .
>> x4              8.2     0.0607398  0.267501   .         .
>>
>> nVar              2          2          1          3          3
>> BIC            -376.9082  -376.5588  -376.3094  -375.8468  -374.5582
>> post prob         0.104      0.087      0.077      0.061      0.032
>>
>> [Question 1]
>> Is it O.K to calculate odds ratio and its 95% confidence interval from
>> "EV" (posterior distribution mean) and "SD" (posterior distribution
>> standard deviation)?
>> For example, 95%CI of EV of x2 can be calculated as;
>> > exp(3.1707661)
>> [1] 23.82573 -----> odds ratio
>> > exp(3.1707661+1.96*0.892034)
>> [1] 136.8866
>> > exp(3.1707661-1.96*0.892034)
>> [1] 4.146976
>> ------------------> 95%CI (4.1 to 136.9)
>> Is this O.K.?
>>
>> [Question 2]
>> Is it permissible to delete variables with small value of "p!=0" and
>> "EV", such as age (3.3% and 0.0001634) to reduce the number of
>> explanatory variables and reconstruct new model without those variables
>> for new session of BMA?
>>
>> model 2 (reduced model):
>> I used R package, "pvclust", to reduce the model. The result suggested
>> x1, x2 and x4 belonged to the same cluster, so I picked up only x2.
>> Based on the subject knowledge, I made a simple unweighted sum, by
>> counting the number of clinical features. For 9 features (sex, side,
>> HT2, hyperlipidemia, DM, IHD, smoking, symptom, age), the sum ranges
>> from 0 to 9. This score was defined as ClinicalScore. Consequently, I
>> made up new data set (x6.df), which consists of 5 variables (stenosis,
>> x2, x3, procedure, and ClinicalScore) and one binary outcome
>> (poor/good). Then, for alternative BMA session...
>>
>> > BMAx6.glm <- bic.glm(postopDWI_HI ~ ., data=x6.df,
>>       glm.family="binomial", OR=20, strict=FALSE)
>> > summary(BMAx6.glm)
>> (The output below has been cut off at the right edge to save space)
>> Call:
>> bic.glm.formula(f = postopDWI_HI ~ ., data = x6.df, glm.family =
>>     "binomial", strict = FALSE, OR = 20)
>>
>>
>> 13 models were selected
>> Best 5 models (cumulative posterior probability = 0.7626 ):
>>
>>                 p!=0    EV          SD         model 1   model 2
>> Intercept       100    -5.6918362  1.81220   -4.4688   -6.3166
>> stenosis        8.1    -0.0008417  0.00815    .         .
>> x2              100.0   3.0606165  0.87765    3.2221    3.1154
>> x3              46.5   -0.3998864  0.52688   -0.9154    .
>> procedure       49.3    0.5747013  0.70164    .         1.1631
>> ClinicalScore   27.1    0.0966633  0.19645    .         .
>>
>> nVar              2          2          1          3          3
>> BIC            -376.9082  -376.5588  -376.3094  -375.8468  -375.5025
>> post prob         0.208      0.175      0.154      0.122      0.103
>>
>> [Question 3]
>> Am I doing it correctly or not?
>> I mean this kind of model reduction is permissible for BMA?
>>
>> [Question 4]
>> I still have 5 variables, which violates the rule of thumb, "EPV> 10".
>> Is it permissible to delete "stenosis" variable because of small value
>> of "EV"? Or is it O.K. because this is BMA?
>>
>> Sorry for long post.
>>
>> I appreciate your help very much in advance.
>>
>> --
>> KH
>>
>> ______________________________________________
>> R-help_at_r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
> -----
> Frank Harrell
> Department of Biostatistics, Vanderbilt University
> --
> View this message in context: http://r.789695.n4.nabble.com/BMA-logistic-regression-odds-ratio-model-reduction-etc-tp3462416p3462919.html
> Sent from the R help mailing list archive at Nabble.com.
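The arithmetic in Question 1 of the quoted message can be reproduced with a few lines of R. The sketch only restates that calculation, using the posterior mean and SD reported for x2 in the first summary; the variable names are made up for the example, and whether a normal-approximation interval is appropriate for a model-averaged posterior is exactly what the question asks, which the code does not settle.

```r
# Values copied from the x2 row of the first bic.glm summary in the thread.
post_mean <- 3.1707661   # "EV": posterior mean of the log odds ratio
post_sd   <- 0.892034    # "SD": posterior standard deviation

# Exponentiate the mean and the mean +/- 1.96 SD to move from the
# log-odds scale to the odds-ratio scale.
or_hat <- exp(post_mean)
lower  <- exp(post_mean - 1.96 * post_sd)
upper  <- exp(post_mean + 1.96 * post_sd)

print(round(c(OR = or_hat, lower = lower, upper = upper), 1))
```

Because the limits are computed on the log scale and then exponentiated, the resulting interval is asymmetric around the odds ratio, as in the figures quoted in the message.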

--
*************************************************
Kobe University Graduate School of Medicine
Department of Neurosurgery
細田 弘吉
7-5-1 Kusunoki-cho, Chuo-ku, Kobe 650-0017
Phone: 078-382-5966
Fax  : 078-382-5979
E-mail address
  Office: khosoda_at_med.kobe-u.ac.jp
  Home  : khosoda_at_venus.dti.ne.jp

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.2.0, at Wed 20 Apr 2011 - 23:20:31 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help.
Please read the posting guide before posting to the list.