Re: [Rd] segfault in glm.fit (PR#14154)

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Thu, 17 Dec 2009 14:03:52 +0000 (GMT)

I cannot reproduce this on our x86_64 Fedora systems (and I tried all the usual tricks such as gctorture and valgrind to provoke a problem). And I have fitted much larger GLMs many times over the last decade, so your 'bug summary' cannot be the whole story.

Your example is random and you haven't set a seed. To rule out the possibility that the failure depends on the particular data generated, can you set a seed and tell us which seed value fails?
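A minimal sketch of a seeded version of the reported example follows. The seed value 1 is an arbitrary assumption, not one known to trigger the crash. Because a segfault ends the R session, each candidate seed probably needs to be tried in a fresh session rather than in a loop.

```r
# Seeded variant of the reported example, so the same data are generated
# on every run. The seed value is an arbitrary placeholder (an assumption).
set.seed(1)

N <- 16400  # just above the 16384-row threshold given in the report
df <- data.frame(x = runif(N, min = 1, max = 2), y = rpois(N, 2))

# Same call as in the report; on an unaffected build this simply fits.
fit <- glm(y ~ x, family = poisson, data = df)
print(coef(fit))
```

If a particular seed does provoke the segfault, reporting that seed together with the output of sessionInfo() should let others regenerate exactly the same data.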

One possibility is a compiler optimization bug, so can you please tell us what compilers were used with what flags to build this version of R, and if you built it yourself try it without optimization. (The machines I used had GCC 4.3.2 and 4.4.1 with CFLAGS="-g -O3 -Wall -pedantic -mtune=core2" FFLAGS="-g -O -mtune=core2": higher levels of optimization have known problems with recent x86_64 versions of gfortran, and I am wondering if that is an underlying issue.)
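One way to look up the recorded compilers and flags from within R is sketched below. It assumes a Unix-alike installation where the build settings are stored in the Makeconf file under R's etc directory; the variable names matched (CC, CFLAGS, FC, F77, FFLAGS) are the usual ones, but which of them appear depends on the R version.

```r
# Locate the Makeconf written when this copy of R was configured.
# R_ARCH is empty on most Unix-alike builds and names a sub-architecture
# directory (e.g. "/x64") on others.
makeconf <- file.path(paste0(R.home("etc"), Sys.getenv("R_ARCH")), "Makeconf")

build_lines <- readLines(makeconf)

# Keep only the C and Fortran compiler and flag definitions.
wanted <- grep("^(CC|CFLAGS|FC|F77|FFLAGS)[[:space:]]*=", build_lines,
               value = TRUE)
print(wanted)
```

The printed lines can be pasted into a reply as they are. For a binary distribution they describe how the packager built R, not the local compiler setup.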

On Thu, 17 Dec 2009, adrian_at_maths.uwa.edu.au wrote:

> Bug summary:
> glm() causes a segfault if the argument 'data'
> is a data frame with more than 16384 rows.
>
> Bug demonstration:
>
> -------input ---------------
> N <- 16400
> df <- data.frame(x=runif(N, min=1,max=2),y=rpois(N, 2))
> glm(y ~ x, family=poisson, data=df)
>
> ------ output ---------------
> *** caught segfault ***
> address (nil), cause 'unknown'
>
> Traceback:
> 1: ifelse(y == 0, 1, y/mu)
> 2: dev.resids(y, mu, weights)
> 3: glm.fit(x = X, y = Y, weights = weights, start = start, etastart =
> etastart, mustart = mustart, offset = offset, family = family,
> control = control, intercept = attr(mt, "intercept") > 0)
> 4: glm(y ~ x, family = poisson, data = df)
>
> --------------------------------
>
> The code generates a segfault if the value of 'N' is greater than 16384.
>
> regards
> Adrian Baddeley
>
> ////////////////////////////////////////////////////////////
>
> --please do not edit the information below--
>
> Version:
> platform = x86_64-unknown-linux-gnu
> arch = x86_64
> os = linux-gnu
> system = x86_64, linux-gnu
> status =
> major = 2
> minor = 10.1
> year = 2009
> month = 12
> day = 14
> svn rev = 50720
> language = R
> version.string = R version 2.10.1 (2009-12-14)
>
> Locale:
> LC_CTYPE=en_AU.UTF-8;LC_NUMERIC=C;LC_TIME=en_AU.UTF-8;LC_COLLATE=en_AU.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_AU.UTF-8;LC_PAPER=en_AU.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_AU.UTF-8;LC_IDENTIFICATION=C
>
> Search Path:
> .GlobalEnv, package:stats, package:graphics, package:grDevices,
> package:utils, package:datasets, package:methods, Autoloads, package:base
>
> ______________________________________________
> R-devel_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Brian D. Ripley,                  ripley_at_stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Received on Thu 17 Dec 2009 - 14:07:06 GMT
