From: Andrew Gelman <gelman_at_stat.columbia.edu>

Date: Tue 23 May 2006 - 01:12:34 EST

Date: Tue 23 May 2006 - 01:12:34 EST

Harold,

I think we use slightly different notation (I like to use variance parameters rather than covariance matrices). Let me try to write it in model form:

Data points y_i, i=1,...,800

800 x 84 matrix of predictors, X: for columns j=1,...,82, X_{i,j} is the amount of food j consumed by person i. X_{i,83} is an indicator (1 if male, 0 if female), and X_{i,84} is the age of person i.

Data-level model: Pr (y_i=1) = inverse.logit (X_i*beta), for i=1,...,800, with independent outcomes.

beta is a (column) vector of length 84.

Group-level model: for j=1,...,82: beta_j ~ Normal (gamma_0 + gamma_1 * u_j, sigma^2_{beta}).

u is a vector of length 82, where u_j = folate concentration in food j

gamma_0 and gamma_1 are scalar coefficients (for the group-level model), and sigma_{beta} is the sd of the group-level errors.

It would be hopeless to estimate all the betas using maximum likelihood: that's 800 data points and 84 predictors, the results will just be too noisy. But it should be ok using the 2-level model above. The question is: can I fit in lmer()?

Thanks again.

Andrew

Doran, Harold wrote:

> So, in the hierarchical notation, does the model look like this (for

*> the linear predictor):
**>
**> DV = constant + food_1(B_1) + food_2(B_2) + ... + food_82(B_82) +
**> sex(B_83) + age(B_84)
**> food_1 = gamma_00 + gamma_01(folic) + r_01
**> food_2 = gamma_10 + gamma_11(folic) + r_02
**> ...
**> food_82 = gamma_20 + gamma_21(folic) + r_82
**>
**> where r_qq ~ N(0, Psi) and Psi is an 82-dimensional covariance matrix.
**>
**> I usually need to see this in model form as it helps me translate this
**> into lmer syntax if it can be estimated. From what I see, this would
**> be estimating 82(82+1)/2 = 3403 parameters in the covariance matrix.
**>
**> What I'm stuck on is below you say it would be hopeless to estimate
**> the 82 predictors using ML. But, if I understand the model correctly,
**> the multilevel regression still resolves the predictors (fixed
**> effects) using ML once estimates of the variances are obtained. So, I
**> feel I might still be missing something.
**>
**>
**>
**> -----Original Message-----
**> From: Andrew Gelman [mailto:gelman@stat.columbia.edu]
**> Sent: Sun 5/21/2006 7:35 PM
**> To: Doran, Harold
**> Cc: r-help@stat.math.ethz.ch; reg26@columbia.edu
**> Subject: Re: [R] Can lmer() fit a multilevel model embedded in
**> a regression?
**>
**> Harold,
**>
**> I get confused by the terms "fixed" and "random". Our first-level model
**> (in the simplified version we're discussing here) has 800 data points
**> (the persons in the study) and 84 predictors: sex, age, and 82
**> coefficients for foods. The second-level model has 82 data points (the
**> foods) and two predictors: a constant term and folic acid concentration.
**>
**> It would be hopeless to estimate the 82 food coefficients via maximum
**> likelihood, so the idea is to do a multilevel model, with a regression
**> of these coefficients on the constant term and folic acid. The
**> group-level model has a residual variance. If the group-level residual
**> variance is 0, it's equivalent to ignoring food, and just using total
**> folic acid as an individual predictor. If the group-level residual
**> variance is infinity, it's equivalent to estimating the original
**> regression (with 84 predictors) using least squares.
**>
**> The difficulty is that the foods aren't "groups" in the usual sense,
**> since persons are not nested within foods; rather, each person eats many
**> foods, and this is reflected in the X matrix.
**>
**> Andrew
**>
**> Doran, Harold wrote:
**>
**> > OK, I'm piecing this together a bit, sorry I'm not familiar with the
**> > article you cite. Let me try and fully understand the issue if you
**> > don't mind. Are you estimating each of the 82 foods as fixed effects?
**> > If so, in the example below this implies 84 total fixed effects (1 for
**> > each food type in the X matrix and then sex and age).
**> >
**> > I'm assuming that food type is nested within one of the 82 folic acid
**> > concentrations and then folic acid is treated as a random effect.
**> >
**> > Is this accurate?
**> >
**> >
**> > -----Original Message-----
**> > From: Andrew Gelman [mailto:gelman@stat.columbia.edu]
**> > Sent: Sun 5/21/2006 9:17 AM
**> > To: Doran, Harold
**> > Cc: r-help@stat.math.ethz.ch; reg26@columbia.edu
**> > Subject: Re: [R] Can lmer() fit a multilevel model embedded in
**> > a regression?
**> >
**> > Harold,
**> >
**> > I'm confused now. Just for concretness, suppose we have 800 people, 82
**> > food items, and one predictor ("folic", the folic acid concentration) at
**> > the food-item level. Then DV will be a vector of length 800, foods is
**> > an 800 x 82 matrix, sex is a vector of length 800, age is a vector of
**> > length 800, and folic is a vector of length 82. The vector of folic
**> > acid concentrations in individual diets is then just foods%*%folic,
**> > which I can call folic_indiv.
**> >
**> > How would I fit the model in lmer(), then? There's some bit of
**> > understading that I'm still missing.
**> >
**> > Thanks.
**> > Andrew
**> >
**> >
**> > Doran, Harold wrote:
**> >
**> > > Prof Gelman:
**> > >
**> > > I believe the answer is yes. It sounds as though persons are partially
**> > > crossed within food items?
**> > >
**> > > Assuming a logit link, the syntax might follow along the lines of
**> > >
**> > > fm1 <- lmer(DV ~ foods + sex + age + (1|food_item), data, family =
**> > > binomial(link='logit'), method = "Laplace", control = list(usePQL=
**> > > FALSE) )
**> > >
**> > > Maybe this gets you partly there.
**> > >
**> > > Harold
**> > >
**> > >
**> > >
**> > > -----Original Message-----
**> > > From: r-help-bounces@stat.math.ethz.ch on behalf of Andrew Gelman
**> > > Sent: Sat 5/20/2006 5:49 AM
**> > > To: r-help@stat.math.ethz.ch
**> > > Cc: reg26@columbia.edu
**> > > Subject: [R] Can lmer() fit a multilevel model embedded in a
**> > > regression?
**> > >
**> > > I would like to fit a hierarchical regression model from Witte et al.
**> > > (1994; see reference below). It's a logistic regression of a health
**> > > outcome on quntities of food intake; the linear predictor has the
**> form,
**> > > X*beta + W*gamma,
**> > > where X is a matrix of consumption of 82 foods (i.e., the rows of X
**> > > represent people in the study, the columns represent different foods,
**> > > and X_ij is the amount of food j eaten by person i); and W is a matrix
**> > > of some other predictors (sex, age, ...).
**> > >
**> > > The second stage of the model is a regression of X on some food-level
**> > > predictors.
**> > >
**> > > Is it possible to fit this model in (the current version of) lmer()?
**> > > The challenge is that the persons are _not_ nested within food
**> items, so
**> > > it is not a simple multilevel structure.
**> > >
**> > > We're planning to write a Gibbs sampler and fit the model
**> directly, but
**> > > it would be convenient to be able to flt in lmer() as well to check.
**> > >
**> > > Andrew
**> > >
**> > > ---
**> > >
**> > > Reference:
**> > >
**> > > Witte, J. S., Greenland, S., Hale, R. W., and Bird, C. L. (1994).
**> > > Hierarchical regression analysis applied to a
**> > > study of multiple dietary exposures and breast cancer.
**> Epidemiology 5,
**> > > 612-621.
**> > >
**> > > --
**> > > Andrew Gelman
**> > > Professor, Department of Statistics
**> > > Professor, Department of Political Science
**> > > gelman@stat.columbia.edu
**> > > www.stat.columbia.edu/~gelman
**> > >
**> > > Statistics department office:
**> > > Social Work Bldg (Amsterdam Ave at 122 St), Room 1016
**> > > 212-851-2142
**> > > Political Science department office:
**> > > International Affairs Bldg (Amsterdam Ave at 118 St), Room 731
**> > > 212-854-7075
**> > >
**> > > Mailing address:
**> > > 1255 Amsterdam Ave, Room 1016
**> > > Columbia University
**> > > New York, NY 10027-5904
**> > > 212-851-2142
**> > > (fax) 212-851-2164
**> > >
**> > > ______________________________________________
**> > > R-help@stat.math.ethz.ch mailing list
**> > > https://stat.ethz.ch/mailman/listinfo/r-help
**> > > PLEASE do read the posting guide!
**> > > http://www.R-project.org/posting-guide.html
**> > >
**> > >
**> >
**> > --
**> > Andrew Gelman
**> > Professor, Department of Statistics
**> > Professor, Department of Political Science
**> > gelman@stat.columbia.edu
**> > www.stat.columbia.edu/~gelman
**> >
**> > Statistics department office:
**> > Social Work Bldg (Amsterdam Ave at 122 St), Room 1016
**> > 212-851-2142
**> > Political Science department office:
**> > International Affairs Bldg (Amsterdam Ave at 118 St), Room 731
**> > 212-854-7075
**> >
**> > Mailing address:
**> > 1255 Amsterdam Ave, Room 1016
**> > Columbia University
**> > New York, NY 10027-5904
**> > 212-851-2142
**> > (fax) 212-851-2164
**> >
**> >
**> >
**>
**> --
**> Andrew Gelman
**> Professor, Department of Statistics
**> Professor, Department of Political Science
**> gelman@stat.columbia.edu
**> www.stat.columbia.edu/~gelman
**>
**> Statistics department office:
**> Social Work Bldg (Amsterdam Ave at 122 St), Room 1016
**> 212-851-2142
**> Political Science department office:
**> International Affairs Bldg (Amsterdam Ave at 118 St), Room 731
**> 212-854-7075
**>
**> Mailing address:
**> 1255 Amsterdam Ave, Room 1016
**> Columbia University
**> New York, NY 10027-5904
**> 212-851-2142
**> (fax) 212-851-2164
**>
**>
**>
*

-- Andrew Gelman Professor, Department of Statistics Professor, Department of Political Science gelman@stat.columbia.edu www.stat.columbia.edu/~gelman Statistics department office: Social Work Bldg (Amsterdam Ave at 122 St), Room 1016 212-851-2142 Political Science department office: International Affairs Bldg (Amsterdam Ave at 118 St), Room 731 212-854-7075 Mailing address: 1255 Amsterdam Ave, Room 1016 Columbia University New York, NY 10027-5904 212-851-2142 (fax) 212-851-2164 ______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.htmlReceived on Tue May 23 01:29:38 2006

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.1.8, at Tue 23 May 2006 - 02:10:14 EST.

*
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help.
Please read the posting
guide before posting to the list.
*