Re: [R] nontabular logistic regression

From: Jeffrey Stratford <stratja_at_auburn.edu>
Date: Fri 13 Oct 2006 - 17:57:22 GMT


Gavin,

That worked! I went through and I found a few missing cases where I had "." instead of "NA" - I'm still in SAS mode.
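For anyone hitting the same problem: a minimal sketch of how a SAS-style "." can be declared as a missing-value code at read time, so the column stays numeric. The three-row data set below is a made-up extract for illustration, not the real file, and it is read from a string rather than from disk.

```r
# Hypothetical three-row extract; "." marks a missing value, SAS-style.
csv_text <- "box,use,purbank
1,1,0.003813435
2,1,.
3,0,0.6080962"

# Without na.strings, the "." stops R from treating the column as numeric:
# it comes back as character (R >= 4.0) or as a factor (older versions).
raw <- read.csv(text = csv_text, header = TRUE)
class(raw$purbank)

# Declaring "." as a missing-value code keeps the column numeric,
# with NA in place of the ".".
fixed <- read.csv(text = csv_text, header = TRUE, na.strings = c("NA", "."))
str(fixed)
```

With the column read as numeric, glm() estimates a single slope for purbank instead of one coefficient per distinct value.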

Many thanks!



Jeffrey A. Stratford, Ph.D.
Postdoctoral Associate
331 Funchess Hall
Department of Biological Sciences
Auburn University
Auburn, AL 36849
334-329-9198
FAX 334-844-9234
http://www.auburn.edu/~stratja

>>> Gavin Simpson <gavin.simpson@ucl.ac.uk> 10/13/06 11:23 AM >>>
On Fri, 2006-10-13 at 09:28 -0500, Jeffrey Stratford wrote:
> Hi. I'm attempting to fit a logistic/binomial model so I can determine
> the influence of landscape on the probability that a box gets used by a
> bird. I've looked at a few sources (MASS text, Dalgaard, Fox and
> google) and the examples are almost always based on tabular predictor
> variables. My data, however are not. I'm not sure if that is the
> source of the problems or not because the one example that includes a
> continuous predictor looks to be coded exactly the same way. Looking at
> the output, I get estimates for each case when I should get a single
> estimate for purbank. Any suggestions?
>
> Many thanks,
>
> Jeff

Hi Jeff,

using the snippet of data you provided (copy/paste into a text file and read in with read.table) worked fine:

box.use <- read.table("~/tmp/tmp.txt", header = TRUE)
box.use
str(box.use)
'data.frame': 8 obs. of 10 variables:
 $ box        : int  1 2 3 4 5 6 7 8
 $ use        : int  1 1 1 1 0 1 1 0
 $ purbank    : num  0.00381 0.04429 0.04459 0.06072 0.60810 ...
 $ purban2    : num  0.0268 0.1611 0.0604 0.2081 0.6980 ...
 $ purban1    : num  0.069 0.172 0.000 0.069 0.690 ...
 $ pgrassk    : num  0.3282 0.1534 0.1628 0.0194 0.0317 ...
 $ pgrass2    : num  0.685 0.383 0.557 0.000 0.128 ...
 $ pgrass1    : num  0.759 0.655 0.759 0.000 0.241 ...
 $ grassdist  : num    0   0   0 323  30 ...
 $ grasspatchk: num  3.730 1.023 0.961 0.228 0.263 ...

Now I don't like attach, and you just don't need it so I deviate a little now. Replace box.use$use directly and make use of the data argument in glm. Also, your data didn't have any missing data so I'm not sure whether the response or predictor is missing and whether your na.omit is needed or not - I don't use it below.
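As a side illustration of why attach() can mislead in this kind of workflow, here is a small sketch with a made-up toy data frame (not the box data). It assumes only base R: attach() places a copy of the columns on the search path, so a later na.omit() on the data frame does not change what the bare column names refer to.

```r
# Hypothetical toy data frame with one missing value.
toy <- data.frame(use = c(1, 0, 1), purbank = c(0.1, NA, 0.6))

attach(toy)                     # puts a copy of the columns on the search path
toy <- na.omit(toy)             # drops the incomplete row from toy only

n_attached <- length(purbank)   # the attached copy still has every row
n_in_frame <- nrow(toy)         # the data frame itself is now shorter

detach("toy")                   # clean up the search path
c(attached = n_attached, data_frame = n_in_frame)
```

Passing the data frame through the data argument of glm() avoids this mismatch, because the formula variables are then looked up in the data frame itself.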

box.use$use <- factor(box.use$use, levels=0:1)
levels(box.use$use) <- c("unused", "used")
box.use
str(box.use)
glm1 <- glm(use ~ purbank, data = box.use, family = binomial())

summary(glm1)

Call:
glm(formula = use ~ purbank, family = binomial(), data = box.use)

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-1.61450  -0.03098   0.31935   0.45888   1.39194

Coefficients:

            Estimate Std. Error z value Pr(>|z|)
(Intercept)    3.223      2.225   1.448    0.147
purbank       -6.129      4.773  -1.284    0.199

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 8.9974  on 7  degrees of freedom
Residual deviance: 6.5741  on 6  degrees of freedom
AIC: 10.574

Number of Fisher Scoring iterations: 5

I suspect something got messed up in your reading of the data and R thought purbank was a factor or character. Always check your data after reading in, and str() is your friend here as printed representations are not always what they seem.
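If a column has already been read in as a factor, the following sketch shows one common way to recover the numbers. The vector is a made-up example with a stray "." entry, not the real purbank column.

```r
# Hypothetical column that was read as a factor because of a stray "." entry.
purbank_raw <- factor(c("0.003813435", "0.04429451", ".", "0.6080962"))

# as.numeric() applied directly to a factor returns the internal level
# codes, not the values the labels spell out.
codes <- as.numeric(purbank_raw)

# Going through as.character() recovers the numbers; the "." cannot be
# parsed and becomes NA (R warns about this, suppressed here).
purbank <- suppressWarnings(as.numeric(as.character(purbank_raw)))

is.numeric(purbank)
purbank
```

Fixing the missing-value codes in the file, or at read time, is usually cleaner than converting afterwards, but the conversion is handy for a quick check.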

HTH

G

>
>
> THE DATA: (200 boxes total, used [0 if unoccupied, 1 occupied], the rest
> are landscape variables).
>
> box use purbank purban2 purban1 pgrassk pgrass2 pgrass1 grassdist grasspatchk
> 1 1 0.003813435 0.02684564 0.06896552 0.3282487 0.6845638 0.7586207 0 3.73
> 2 1 0.04429451 0.1610738 0.1724138 0.1534174 0.3825503 0.6551724 0 1.023261
> 3 1 0.04458785 0.06040268 0 0.1628043 0.557047 0.7586207 0 0.9605769
> 4 1 0.06072162 0.2080537 0.06896552 0.01936052 0 0 323.1099 0.2284615
> 5 0 0.6080962 0.6979866 0.6896552 0.03168084 0.1275168 0.2413793 30 0.2627027
> 6 1 0.6060428 0.6107383 0.3448276 0.04077442 0.2885906 0.4482759 30 0.2978571
> 7 1 0.3807568 0.4362416 0.6896552 0.06864183 0.03355705 0 94.86833 0.468
> 8 0 0.3649164 0.3154362 0.4137931 0.06277501 0.1275168 0 120 0.4585714
>
> THE CODE:
>
> box.use<- read.csv("c:\\eabl\\2004\\use_logistic2.csv", header=TRUE)
> attach(box.use)
> box.use <- na.omit(box.use)
> use <- factor(use, levels=0:1)
> levels(use) <- c("unused", "used")
> glm1 <- glm(use ~ purbank, binomial)
>
> THE OUTPUT:
>
> Coefficients:
> Estimate Std. Error z value Pr(>|z|)
> (Intercept) -4.544e-16 1.414e+00 -3.21e-16 1.000
> purbank0 2.157e+01 2.923e+04 0.001 0.999
> purbank0.001173365 2.157e+01 2.067e+04 0.001 0.999
> purbank0.001466706 2.157e+01 2.923e+04 0.001 0.999
> purbank0.001760047 6.429e-16 2.000e+00 3.21e-16 1.000
> purbank0.002346729 2.157e+01 2.923e+04 0.001 0.999
> purbank0.003813435 2.157e+01 2.923e+04 0.001 0.999
> purbank0.004106776 2.157e+01 2.067e+04 0.001 0.999
> purbank0.004693458 2.157e+01 2.067e+04 0.001 0.999
>
>
> ****************************************
> Jeffrey A. Stratford, Ph.D.
> Postdoctoral Associate
> 331 Funchess Hall
> Department of Biological Sciences
> Auburn University
> Auburn, AL 36849
> 334-329-9198
> FAX 334-844-9234
> http://www.auburn.edu/~stratja
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson                 [t] +44 (0)20 7679 0522
 ECRC & ENSIS, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/cv/
 London, UK. WC1E 6BT.         [w] http://www.ucl.ac.uk/~ucfagls/
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Sat Oct 14 12:49:52 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sat 14 Oct 2006 - 03:30:10 GMT.
