Re: [R] [Not R question]: Better fit for order probit model

From: <>
Date: Sat, 16 Jun 2007 07:17:51 +0000 (GMT)

Thank you so much, Robert. I hadn't thought about the idea of lumping categories together. One reason is that these categories are bridge condition rating scores: they carry distinct meanings and serviceability conditions, and they range from 0 to 9. I have about 300,000 observations, but the first five labels (0-4) indicate bridges in bad condition, and together they account for fewer than 1,000 instances. In the worst case, score 0 (meaning the bridge is not operable) has only 60 instances, and score 1 about 100. I would appreciate any opinion on how to make the ordered probit fit better in this situation. Thank you so much in advance.

- adschai

----- Original Message -----
From: Robert A LaBudde
Date: Friday, June 15, 2007 9:52 pm
Subject: Re: [R] [Not R question]: Better fit for order probit model

> At 09:31 PM 6/15/2007, adschai wrote:
> >I have a model which tries to fit a set of data with 10-level
> >ordered responses. In my data, the majority of the observations
> >are from levels 6-10, leaving only about 1-5% of the total
> >observations in levels 1-5. As a result, my model tends to
> >perform badly on points below level 6.
> >
> >I would like to ask if there is any way to circumvent this
> >problem. I was thinking of the following ideas, but I am open to
> >any suggestions:
> >
> >1. Bootstrapping with a small sample each time. In each sample
> >basket, however, I would intentionally sample so that there is a
> >good mix of observations from each level, and repeat this many
> >times. But I don't know how to obtain the true standard error of
> >the estimated parameters after all the bootstrapping is done. Is
> >it simply the average of the standard errors estimated each time?
> >
> >2. Weighting points with levels 1-6 more heavily. But it is
> >unclear to me how to put this weight back into the maximum
> >likelihood when estimating parameters. It is unlike OLS, where
> >the objective is to minimize error or, if you like, a penalty
> >function; the MLE objective is obviously not a penalty function.
> >
> >3. Do a step-wise regression. I would segment the data into two
> >regions: points with responses below 6, and the rest. The first
> >step is a binary regression to determine which of the two groups
> >a point belongs to; the second step estimates an ordered probit
> >model for each group separately. The question then is why I
> >should choose 6 as the cutting point instead of some other value.
> >
> >Any suggestions would be really appreciated. Thank you.
>
> You could do the obvious, and lump categories such as 1-6 or 1-7
> together to make a composite category.
>
> You don't mention the size of your dataset. If there are 10,000
> data, you might live with a 1% category. If you only have 100
> data, you have too many categories.
>
> Also, next time plan your study and training better so that your
> categories are fully utilized. And don't use so many categories.
> People have trouble even selecting responses on a 5-level scale.
>
> ================================================================
> Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail:
> Least Cost Formulations, Ltd.            URL:
> 824 Timberlake Drive                     Tel: 757-467-0954
> Virginia Beach, VA 23464-3239            Fax: 757-467-2947
>
> "Vere scire est per causas scire"
>
> ______________________________________________
> mailing list
> PLEASE do read the posting guide http://www.R-
> and provide commented, minimal, self-contained, reproducible code.
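[Editor's note on idea 1 above: the usual bootstrap standard error is not the average of the per-resample standard errors; it is the standard deviation of the parameter estimates across the resamples. A minimal sketch (Python rather than R, with a made-up numeric sample and the sample mean standing in for a fitted probit coefficient):

```python
import random
import statistics

def bootstrap_se(data, estimator, n_boot=1000, seed=0):
    """Bootstrap standard error: resample with replacement, re-estimate,
    and take the standard deviation of the estimates across resamples
    (NOT the average of per-resample standard errors)."""
    rng = random.Random(seed)
    n = len(data)
    estimates = [
        estimator([rng.choice(data) for _ in range(n)])
        for _ in range(n_boot)
    ]
    return statistics.stdev(estimates)

# Made-up sample; in the real problem `estimator` would refit the
# ordered probit on the resample and return one coefficient per call.
sample = [5.1, 6.3, 7.2, 4.8, 6.9, 5.5, 7.7, 6.1]
se_of_mean = bootstrap_se(sample, statistics.mean)
```

One caution: deliberately balancing the levels within each resample, as the poster proposes, changes the sampling distribution, so the resulting spread no longer estimates the standard error under the original data.]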
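[Editor's note on idea 2: weights enter maximum likelihood much as they enter least squares: each observation's log-likelihood contribution is multiplied by its weight, and the weighted sum is maximized. A sketch for a one-predictor binary probit (Python, stdlib only; the data and the single coefficient `beta` are hypothetical):

```python
import math

def norm_cdf(x):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def weighted_probit_loglik(beta, xs, ys, ws):
    """Weighted log-likelihood for a binary probit P(y=1) = Phi(beta*x):
    each observation's log-likelihood term is scaled by its weight, so
    up-weighted (rare) cases pull harder on the fitted coefficient."""
    ll = 0.0
    for x, y, w in zip(xs, ys, ws):
        p = norm_cdf(beta * x)
        ll += w * (math.log(p) if y == 1 else math.log(1.0 - p))
    return ll

# Hypothetical data: rare y=0 cases get weight 5, common cases weight 1.
xs = [0.2, 1.1, -0.4, 0.9, -1.3]
ys = [1, 1, 0, 1, 0]
ws = [1.0, 1.0, 5.0, 1.0, 5.0]
```

Maximizing this weighted sum (over a grid of `beta`, or with any numeric optimizer) gives the weighted estimate; the same term-by-term scaling applies to the ordered-probit likelihood.]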
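[Editor's note on idea 3: the two-stage proposal amounts to a binary split at a cut point followed by separate ordinal models within each group, and as the poster notes the cut point is arbitrary. A minimal sketch of the data split (Python; the records are made up):

```python
# Hypothetical (predictor, rating) records on the 0-9 scale.
records = [(0.4, 3), (1.2, 7), (0.9, 6), (0.1, 2), (1.5, 9), (0.7, 5)]

CUT = 6  # the arbitrary cut point the poster questions

# Stage 1: binary response "is the rating at or above the cut?"
stage1_labels = [int(rating >= CUT) for _, rating in records]

# Stage 2: fit a separate ordered model within each group.
low_group = [r for r in records if r[1] < CUT]
high_group = [r for r in records if r[1] >= CUT]
```

One way to defuse the "why 6?" objection is to refit over several candidate cut points and compare out-of-sample fit, though with fewer than 1,000 low-rating bridges the low group stays thin whatever the cut.]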
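[Editor's note on Robert's lumping suggestion: with only about 60 bridges at score 0 and about 100 at score 1 out of 300,000, the sparse low scores can be collapsed into one composite "poor" category before fitting. A sketch (Python; the scores shown are made up):

```python
from collections import Counter

def lump(score, cut=5):
    """Collapse every score below `cut` into the single composite
    category `cut - 1`; scores at or above `cut` are kept as-is."""
    return max(score, cut - 1)

# Made-up scores on the 0-9 bridge rating scale: 0-4 all become 4.
scores = [0, 1, 3, 4, 6, 6, 7, 8, 9, 9]
lumped = [lump(s) for s in scores]
counts = Counter(lumped)
```

Whether scores 0-4 can honestly share one label is a subject-matter call, since the poster says each rating carries its own serviceability meaning.]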


Received on Sat 16 Jun 2007 - 07:25:27 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sat 16 Jun 2007 - 16:31:55 GMT.

Mailing list information is available online. Please read the posting guide before posting to the list.