[R] Using predict.glm for classification

From: Eleni Rapsomaniki <e.rapsomaniki_at_mail.cryst.bbk.ac.uk>
Date: Sun 29 Oct 2006 - 14:18:15 GMT

Dear R users,

I'm trying to understand how to derive the actual predictions (in terms of class) using predict.glm. Consider this example:

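mydf = data.frame(
  A = sample(rnorm(1000), size=1000, replace=TRUE),
  B = sample(rnorm(5), size=1000, replace=TRUE),
  C = sample(rnorm(10), size=1000, replace=TRUE),
  # factor() is required in R >= 4.0, where strings are not auto-converted
  class = factor(sample(c("a", "b"), size=1000, replace=TRUE)))
# fit on all rows (overwritten by the training-half fit below)
mydf.glm = glm(class ~ .^2, data=mydf, family=binomial)
# hold out half of the rows for prediction
ind = sample(1:nrow(mydf), size=0.5*nrow(mydf), replace=FALSE)
mydf.glm = glm(class ~ .^2, data=mydf[ind,], family=binomial)
mydf.pred = predict(mydf.glm, newdata=mydf[-ind,], type="response", se.fit=TRUE)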

My question is: what does the vector mydf.pred$fit indicate? If it has a value of, say, 0.42, does that mean the probability that the response is "a" is 0.42 and the probability that it is "b" is 1-0.42 (so that, for a threshold of 0.5, the class would be "b")?
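A quick way to check this empirically is to simulate data in which the direction of the effect is known. The sketch below is not from the original post: the names (x, cls, d, fit, p, pred), the seed, and the slope of 2 are assumptions chosen for illustration. It relies on the documented glm convention that, for a factor response with family=binomial, the first level denotes failure and all other levels success, so the fitted value is the probability of the second level ("b" here) rather than the first.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
n <- 1000
x <- rnorm(n)

# Simulate so that "b" becomes more likely as x increases
cls <- factor(ifelse(runif(n) < plogis(2 * x), "b", "a"), levels = c("a", "b"))
d <- data.frame(x = x, class = cls)

fit <- glm(class ~ x, data = d, family = binomial)
p <- predict(fit, newdata = d, type = "response")

# The first factor level is treated as "failure"; p estimates P(class == second level)
lev <- levels(d$class)

# Threshold at 0.5: above it predict the second level, otherwise the first
pred <- factor(ifelse(p > 0.5, lev[2], lev[1]), levels = lev)

# Cross-tabulate predicted against observed classes
print(table(predicted = pred, observed = d$class))

# Mean fitted value within each observed class; it should be higher for "b"
print(tapply(p, d$class, mean))
```

Under that convention, a fitted value of 0.42 would mean P("b") = 0.42 and P("a") = 0.58, so a 0.5 threshold would assign class "a".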

I would appreciate any comments or help on this.

Many thanks
Eleni Rapsomaniki
Birkbeck College, UK

R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Received on Mon Oct 30 01:24:28 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sun 29 Oct 2006 - 15:30:14 GMT.
