From: Bryan Hanson <hanson_at_depauw.edu>

Date: Thu, 19 Jun 2008 19:54:51 -0400

plot(x, y1, pch = 1, ylim = c(0,1), col = "red", main = "Logistic Regression w/Confidence Bounds", ylab = "y values", xlab = "x values") points(x, y2, pch = 3, col = "blue")

abline(h = 0.5, col = "gray")

fit1 <- glm(y1~x, family = binomial(link = "logit"), data, na.action = na.omit)

fit2 <- glm(y2~x, family = binomial(link = "logit"), data, na.action = na.omit)

lines(fit1$model$x, fit1$fitted.values, col = "red") lines(fit2$model$x, fit2$fitted.values, col = "blue")

>> [I've ommitted some of the conversation so far...]

*>>
*

*>>
*

>> I think I follow the above conceptually, but I don't know how to implement

*>> it, though I fooled around (unsuccessfully) with some of the variations on
*

*>> predict().
*

*>>
*

*>> I'm trying to learn this in response to a biology colleague who did
*

*>> something similar in SigmaPlot. I can already tell that SigmaPlot did a lot
*

*>> of stuff for him in the background. The responses are 0/1 of a particular
*

*>> observation by date. The following code simulates what's going on (note
*

*>> that I didn't use 0/1 since this leads to a recognized condition/warning of
*

*>> fitting 1's and 0's. I've requested Brian's Pattern Recognition book so I
*

*>> know what the problem is and how to solve it). My colleague is looking at
*

*>> two populations in which the "LD50" would differ. I'd like to be able to
*

*>> put the "pointwise confidence bounds" on each curve so he can tell if the
*

*>> populations are really different.
*

*>>
*

*>> By the way, this code does give a (minor?) error from glm (which you will
*

*>> see).
*

*>>
*

*>> Can you make a suggestion about how to get those "confidence bounds" on
*

*>> there?
*

*>>
*

*>> Also, is a probit link more appropriate here?
*

*>>
*

*>> Thanks, Bryan
*

*>>
*

*>> x <- c(1:40)
*

*>> y1 <- c(rep(0.1,10), rep(NA, 10), rep(0.9,20))
*

*>> y2 <- c(rep(0.1,15), rep(NA, 10), rep(0.9,15))
*

*>> data <- as.data.frame(cbind(x,y1,y2))
*

*>> plot(x, y1, pch = 1, ylim = c(0,1), col = "red")
*

*>> points(x, y2, pch = 3, col = "blue")
*

*>> abline(h = 0.5, col = "gray")
*

*>> fit1 <- glm(y1~x, family = binomial(link = "logit"), data, na.action =
*

*>> na.omit)
*

*>> fit2 <- glm(y2~x, family = binomial(link = "logit"), data, na.action =
*

*>> na.omit)
*

*>> lines(fit1$model$x, fit1$fitted.values, col = "red")
*

*>> lines(fit2$model$x, fit2$fitted.values, col = "blue")
*

*>> R-help_at_r-project.org mailing list
*

*>> https://stat.ethz.ch/mailman/listinfo/r-help
*

*>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
*

*>> and provide commented, minimal, self-contained, reproducible code.
*

R-help_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Fri 20 Jun 2008 - 00:37:04 GMT

Date: Thu, 19 Jun 2008 19:54:51 -0400

Thanks so much to all who offered assistance. I have to say it would have
taken me a long time to figure this out, so I am most grateful. Plus,
studying your examples greatly improves my understanding.

As a follow up, the fit process gives the following error:

Warning messages:

1: In eval(expr, envir, enclos) :

non-integer #successes in a binomial glm!

Is this something I should worry about? It doesn't arise from glm or glm.fit

For the record, a final complete working code is appended below.

y1 <- c(rep(0.1,10), rep(NA, 10), rep(0.9,20)) y2 <- c(rep(0.1,15), rep(NA, 10), rep(0.9,15)) data <- as.data.frame(cbind(x,y1,y2))

plot(x, y1, pch = 1, ylim = c(0,1), col = "red", main = "Logistic Regression w/Confidence Bounds", ylab = "y values", xlab = "x values") points(x, y2, pch = 3, col = "blue")

abline(h = 0.5, col = "gray")

fit1 <- glm(y1~x, family = binomial(link = "logit"), data, na.action = na.omit)

fit2 <- glm(y2~x, family = binomial(link = "logit"), data, na.action = na.omit)

lines(fit1$model$x, fit1$fitted.values, col = "red") lines(fit2$model$x, fit2$fitted.values, col = "blue")

## predictions on scale of link function
pred1 <- predict(fit1, se.fit = TRUE)

pred2 <- predict(fit2, se.fit = TRUE)

## constant for 95% confidence bands ## getting two t values is redundant here as fit1 and fit2 ## have same residual df, but the real world may be differentres.df <- c(fit1$df.residual, fit2$df.residual) ## 0.975 because we want 2.5% in upper and lower tail const <- qt(0.975, df = res.df)

## confidence bands on scale of link function

upper1 <- pred1$fit + (const[1] * pred1$se.fit) lower1 <- pred1$fit - (const[1] * pred1$se.fit) upper2 <- pred2$fit + (const[2] * pred2$se.fit) lower2 <- pred2$fit - (const[2] * pred2$se.fit)

## bind together into a data frame

bands <- data.frame(upper1, lower1, upper2, lower2)

## transform on to scale of response

bands <- data.frame(lapply(bands, binomial(link = "logit")$linkinv))

## plot confidence bands

lines(fit1$model$x, bands$upper1, col = "pink") lines(fit1$model$x, bands$lower1, col = "pink") lines(fit2$model$x, bands$upper2, col = "lightblue") lines(fit2$model$x, bands$lower2, col = "lightblue")

On 6/19/08 12:28 PM, "Gavin Simpson" <gavin.simpson_at_ucl.ac.uk> wrote:

> On Thu, 2008-06-19 at 10:42 -0400, Bryan Hanson wrote:

>> [I've ommitted some of the conversation so far...]

>>> E.g. in a logistic model, with (say) eta = beta_0 + beta_1*x one may >>> find, on the >>> linear predictor scale, A and B (say) such that P(A <= eta <= B) = 0.95. >>> >>> Then P(expit(A) <= expit(eta) <= expit(B)) = 0.95, which is exactly >>> what is wanted.

>> I think I follow the above conceptually, but I don't know how to implement

> > The point is to get predictions on the scale of the link function, > generate 95% confidence bands in the normal way and then transform the > confidence bands onto the scale of the response using the inverse of the > link function used to fit the model. > > [note, am doing this from memory, so best to check this is right -- I'm > sure someone will tell me very quickly if I have gone wrong anywhere!] > > ## predictions on scale of link function > pred1 <- predict(fit1, se.fit = TRUE) > pred2 <- predict(fit2, se.fit = TRUE) > > ## constant for 95% confidence bands > ## getting two t values is redundant here as fit1 and fit2 > ## have same residual df, but the real world may be different > res.df <- c(fit1$df.residual, fit2$df.residual) > ## 0.975 because we want 2.5% in upper and lower tail > const <- qt(0.975, df = res.df) > > ## confidence bands on scale of link function > upper1 <- pred1$fit + (const[1] * pred1$se.fit) > lower1 <- pred1$fit - (const[1] * pred1$se.fit) > upper2 <- pred2$fit + (const[2] * pred2$se.fit) > lower2 <- pred2$fit - (const[2] * pred2$se.fit) > > ## bind together into a data frame > bands <- data.frame(upper1, lower1, upper2, lower2) > > ## transform on to scale of response > bands <- data.frame(lapply(bands, binomial(link = "logit")$linkinv)) > > ## plot confidence bands > lines(fit1$model$x, bands$upper1, col = "red", lty = "dotted") > lines(fit1$model$x, bands$lower1, col = "red", lty = "dotted") > lines(fit2$model$x, bands$upper2, col = "blue", lty = "dotted") > lines(fit2$model$x, bands$lower2, col = "blue", lty = "dotted") > > HTH > > G >

>> ______________________________________________

R-help_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Fri 20 Jun 2008 - 00:37:04 GMT

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.2.0, at Fri 20 Jun 2008 - 09:31:01 GMT.

*
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help.
Please read the posting
guide before posting to the list.
*