# Re: [R] logistic regression weights problem

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Thu 14 Apr 2005 - 02:42:25 EST

On Wed, 13 Apr 2005, Federico Calboli wrote:

> I have a problem with weighted logistic regression. I have a number of
> SNPs and a case/control scenario, but not all genotypes are as
> "guaranteed" as others, so I am using weights to down-weight
> individuals whose genotype has been heavily "inferred".
>
> My data is quite big, but with a dummy example:
>
>> status <- c(1,1,1,0,0)
>> SNPs <- matrix( c(1,0,1,0,0,0,0,1,0,1,0,1,0,1,1), ncol =3)
>> weight <- c(0.2, 0.1, 1, 0.8, 0.7)
>> glm(status ~ SNPs, weights = weight, family = binomial)
>
> Call: glm(formula = status ~ SNPs, family = binomial, weights = weight)
>
> Coefficients:
> (Intercept)        SNPs1        SNPs2        SNPs3
>      -2.079       42.282      -18.964           NA
>
> Degrees of Freedom: 4 Total (i.e. Null); 2 Residual
> Null Deviance: 3.867
> Residual Deviance: 0.6279 AIC: 6.236
> Warning messages:
> 1: non-integer #successes in a binomial glm! in: eval(expr, envir, enclos)
> 2: fitted probabilities numerically 0 or 1 occurred in: glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart,
>
> NB I do not get warning (2) for my data so I'll completely disregard it.
>
> Warning (1) looks suspiciously like a multiplication of my C/C status by
> the weights... what exactly is glm doing with the weight vector?

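It is using it in the GLM definition. If you specify a response 0 <= y_i <= 1 and weights a_i, you are specifying a binomial with a_i trials and a_i*y_i successes, that is, a_i*y_i ~ Binomial(a_i, p_i). With a 0/1 response and non-integer weights, a_i*y_i is not an integer, hence warning (1). Look up any book on GLMs and see what it says about the binomial, e.g. MASS4 pp. 184, 190.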
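
A minimal sketch of that convention, using made-up grouped data (x, n and s are hypothetical and not from the original post): fitting the proportion y = s/n with weights = n should give the same coefficients as the two-column (successes, failures) form of the response. The last lines compute weight * status for the dummy data in the question, which is the quantity glm treats as the number of successes.

```r
# Hypothetical grouped data (not from the original post):
# n[i] trials and s[i] successes at covariate value x[i].
x <- c(0, 1, 2, 3, 4)
n <- c(10, 12, 8, 15, 9)
s <- c(2, 4, 4, 10, 8)
y <- s / n                      # observed proportions, 0 <= y <= 1

# Proportion response with weights = number of trials ...
fit_prop  <- glm(y ~ x, weights = n, family = binomial)
# ... specifies the same model as the two-column (successes, failures) response.
fit_count <- glm(cbind(s, n - s) ~ x, family = binomial)
all.equal(coef(fit_prop), coef(fit_count))

# Dummy data from the question: the implied "successes" are weight * status.
status <- c(1, 1, 1, 0, 0)
weight <- c(0.2, 0.1, 1, 0.8, 0.7)
implied_successes <- weight * status
implied_successes               # 0.2 and 0.1 are not integers
```

Because 0.2 and 0.1 are not whole numbers, the binomial family complains about non-integer successes when it is given these weights with a 0/1 response.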

