Re: [Rd] Using \u2030 in plot axis label -> stack smashing

From: Gavin Simpson <gavin.simpson_at_ucl.ac.uk>
Date: Tue 19 Sep 2006 - 10:48:03 GMT

On Tue, 2006-09-19 at 08:26 +0100, Prof Brian Ripley wrote:
> I didn't have access to my FC5 boxes yesterday (electrical testing).
>
> This does need the FC5-specific compilation options set
> (-Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector
> --param=ssp-buffer-size=4), so it is not surprising it is not
> reproducible elsewhere (including under valgrind, BTW).
>
> Ei-ji's patch works (and is incorporated now), but the buffer is used at
>
> strncpy(s, buf, sizeof(buf) - 1); /* ensure 0-terminated */
>
> and the 's' here should be big enough (\uxxxx can only expand to 3 bytes
> in UTF-8, so "\u2030" is four bytes in UTF-8 including the null
> terminator). Can Ei-ji explain?
>
> I can understand how Gavin saw this in the released FC5 RPM. What I don't
> understand is how he saw this in 2.4.0 alpha/R-devel without setting
> non-default CFLAGS he did not tell us about.

Thanks Prof. Ripley and Ei-ji. I should have mentioned that every version I reported on was self-compiled, using the same set of flags as the FC5 RPM. I will add that to my mental list of things to report.

>
> BTW, just applying this patch will not work: you need to rebuild gram.c
> in maintainer mode.

I'm not clear what you mean by maintainer mode; it is not something I have come across before. If I update the local source on my machine from the svn server and then run make clean, configure and make again, will that be sufficient? Or do I need to do something else?

Many thanks,

G

>
>
> On Tue, 19 Sep 2006, Ei-ji Nakama wrote:
>
> > This seems to be the mine which I contrived. m(_|_)m
> >
> > --- R-alpha.orig/src/main/gram.y 2006-09-04 23:41:33.000000000 +0900
> > +++ R-alpha/src/main/gram.y 2006-09-19 13:01:41.000000000 +0900
> > @@ -99,11 +99,12 @@
> > # endif
> > #endif
> > #include <errno.h>
> > +#define MB_BUF 16
> >
> > static size_t ucstomb(char *s, wchar_t wc, mbstate_t *ps)
> > {
> > char tocode[128];
> > - char buf[16];
> > + char buf[MB_BUF];
> > void *cd = NULL ;
> > wchar_t wcs[2];
> > char *inbuf = (char *) wcs;
> > @@ -1709,7 +1710,7 @@
> > error(_("\\uxxxx sequences not supported"));
> > #else
> > wint_t val = 0; int i, ext; size_t res;
> > - char buff[5]; Rboolean delim = FALSE;
> > + char buff[MB_BUF]; Rboolean delim = FALSE;
> > if((c = xxgetc()) == '{') delim = TRUE; else xxungetc(c);
> > for(i = 0; i < 4; i++) {
> > c = xxgetc();
> > @@ -1743,7 +1744,7 @@
> > #ifdef SUPPORT_MBCS
> > else {
> > wint_t val = 0; int i, ext; size_t res;
> > - char buff[9]; Rboolean delim = FALSE;
> > + char buff[MB_BUF]; Rboolean delim = FALSE;
> > if((c = xxgetc()) == '{') delim = TRUE; else xxungetc(c);
> > for(i = 0; i < 8; i++) {
> > c = xxgetc();
> >
> >
> > 2006/9/19, Gregor Gorjanc <gregor.gorjanc@bfro.uni-lj.si>:
> >> Gavin Simpson wrote:
> >>> On Mon, 2006-09-18 at 19:02 +0000, Gregor Gorjanc wrote:
> >>>> Gavin Simpson <gavin.simpson <at> ucl.ac.uk> writes:
> >>>>> Dear List
> >>>>>
> >>>>> I just noticed the following behaviour in R 2.3.1 Patched (2006-06-13
> >>>>> r38342) and confirmed similar behaviour in R 2.4.0 alpha (2006-09-18
> >>>>> r39383) & R 2.5.0 (2006-09-18 r39383) - which may actually be the same
> >>>>> thing?, that trying to plot the unicode character \u2030 (which should
> >>>>> be a ‰ [per mille] sign) in an axis label leads to the following
> >>>>> error:
> >>>>>
> >>>>> *** stack smashing detected ***: /home/gavin/R/R-devel/build/bin/exec/R
> >>>>> terminated
> >>>>> Aborted
> >>>>>
> >>>>> The simplest, reproducible example I have tried is:
> >>>>>
> >>>>> plot(1:10, ylab = "\u2030")
> >>>>>
> >>>> I can not reproduce this on my Debian GNU/Linux. I get something like "S
> >>>> for y label under 2.3.1 2006-06-01 and 2.5.0 2006-09-13 r39292 with the
> >>>> following locale
> >>>>
> >>>> [1] "LC_CTYPE=en_GB.UTF-8;LC_NUMERIC=C;LC_TIME=en_GB.UTF-8;
> >>>> LC_COLLATE=en_GB.UTF-8;LC_MONETARY=en_GB.UTF-8;LC_MESSAGES=en_GB.UTF-8;
> >>>> LC_PAPER=C;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=C;
> >>>> LC_IDENTIFICATION=C"
> >>>>
> >>>> It does not change if I set everything into en_GB.UTF-8. Is this valid
> >>>> unicode code?
> >>>>
> >>>> Gregor
> >>>
> >>> Cheers for the follow up Gregor,
> >>>
> >>> I was following advice given by Prof. Ripley in a posting on R-Help
> >>> about how to get the per mille character:
> >>>
> >>> http://finzi.psych.upenn.edu/R/Rhelp02a/archive/48709.html
> >>>
> >>> It should look like a "%" character but with two circles at the bottom.
> >>
> >> Perhaps I do not have appropriate font for this character.
> >>
> >> Gregor
> >>
> >> ______________________________________________
> >> R-devel@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-devel
> >>
> >
> >
> >
>
> --
> Brian D. Ripley, ripley@stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson                 [t] +44 (0)20 7679 0522
 ECRC & ENSIS, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Tue Sep 19 21:03:51 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.