Re: [Rd] memory leak in sub("[range]", ...) when #ifndef _LIBC (PR#11946)

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Thu, 07 Aug 2008 10:47:40 +0100 (BST)

For the record: this is now fixed.

On Thu, 7 Aug 2008, bill_at_insightful.com wrote:

> Full_Name: Bill Dunlap
> Version: R version 2.8.0 Under development (unstable) (2008-07-05 r46037)
> OS: Linux
> Submission from: (NULL) (76.28.245.14)
>
>
> valgrind finds some memory leaks in R when I use sub() with
> a range in the regular expression:
>
> % R --debugger=valgrind --debugger-args=--leak-check=full --quiet --vanilla
> ==28643== Memcheck, a memory error detector.
> ==28643== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
> ==28643== Using LibVEX rev 1658, a library for dynamic binary translation.
> ==28643== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
> ==28643== Using valgrind-3.2.1, a dynamic binary instrumentation framework.
> ==28643== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
> ==28643== For more details, rerun with: -v
> ==28643==
>> for(i in 1:1000)sub("[0-9]","*","17")
>> q()
> ==28643==
> ==28643== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 38 from 2)
> ==28643== malloc/free: in use at exit: 12,607,663 bytes in 7,918 blocks.
> ==28643== malloc/free: 61,907 allocs, 53,989 frees, 54,692,481 bytes allocated.
> ==28643== For counts of detected errors, rerun with: -v
> ==28643== searching for pointers to 7,918 not-freed blocks.
> ==28643== checked 12,620,188 bytes.
> ==28643==
> ==28643== 82 bytes in 4 blocks are definitely lost in loss record 15 of 44
> ==28643== at 0x40046EE: malloc (vg_replace_malloc.c:149)
> ==28643== by 0x3200FF9: xmalloc (in /a/devlnx3206.insightful.com/opt/builds/R/devel/LX/46036/dist/lib/R/lib/libreadline.so.4)
> ==28643== by 0x31EC0D5: readline_internal_teardown (in /a/devlnx3206.insightful.com/opt/builds/R/devel/LX/46036/dist/lib/R/lib/libreadline.so.4)
> ==28643== by 0x31FD992: rl_callback_read_char (in /a/devlnx3206.insightful.com/opt/builds/R/devel/LX/46036/dist/lib/R/lib/libreadline.so.4)
> ==28643== by 0x80E7634: Rstd_ReadConsole (sys-std.c:905)
> ==28643== by 0x8057EA9: Rf_ReplIteration (main.c:205)
> ==28643== by 0x80581C6: R_ReplConsole (main.c:306)
> ==28643== by 0x805845C: run_Rmainloop (main.c:966)
> ==28643== by 0x80566B5: main (Rmain.c:33)
> ==28643==
> ==28643==
> ==28643== 7,996 bytes in 1,999 blocks are definitely lost in loss record 35 of 44
> ==28643== at 0x40046EE: malloc (vg_replace_malloc.c:149)
> ==28643== by 0x4005B9A: realloc (vg_replace_malloc.c:306)
> ==28643== by 0x80A5E22: parse_expression (regex.c:5202)
> ==28643== by 0x80A5FDF: parse_branch (regex.c:4707)
> ==28643== by 0x80A60AA: parse_reg_exp (regex.c:4666)
> ==28643== by 0x80A64A8: Rf_regcomp (regex.c:4635)
> ==28643== by 0x8110AE0: do_gsub (character.c:1355)
> ==28643== by 0x80653BC: do_internal (names.c:1129)
> ==28643== by 0x815EF17: Rf_eval (eval.c:461)
> ==28643== by 0x8160BD3: do_begin (eval.c:1174)
> ==28643== by 0x815EF17: Rf_eval (eval.c:461)
> ==28643== by 0x816203C: Rf_applyClosure (eval.c:667)
> ==28643==
> ==28643== LEAK SUMMARY:
> ==28643== definitely lost: 8,078 bytes in 2,003 blocks.
> ==28643== possibly lost: 0 bytes in 0 blocks.
> ==28643== still reachable: 12,599,585 bytes in 5,915 blocks.
> ==28643== suppressed: 0 bytes in 0 blocks.
> ==28643== Reachable blocks (those to which a pointer was found) are not shown.
> ==28643== To see them, rerun with: --show-reachable=yes
>
> The flagged memory block is the range_ends component of mbcset.
> I think that range_starts was also being leaked, but valgrind was
> combining the two.
>
> It looks like the cpp macro _LIBC is not defined when I compile
> R on this Linux box. regex.c declares range_ends and range_starts
> with different types, depending on whether _LIBC is defined, and it
> allocates space for them in either case. However, free_charset() was
> only freeing them if _LIBC was defined. The following change
> to regex.c:free_charset() seems to take care of the problem.
>
> % svn diff regex.c
> Index: regex.c
> ===================================================================
> --- regex.c (revision 46038)
> +++ regex.c (working copy)
> @@ -6240,9 +6240,9 @@
> # ifdef _LIBC
> re_free (cset->coll_syms);
> re_free (cset->equiv_classes);
> +# endif
> re_free (cset->range_starts);
> re_free (cset->range_ends);
> -# endif
> re_free (cset->char_classes);
> re_free (cset);
> }
>
>
>> version
> _
> platform i686-pc-linux-gnu
> arch i686
> os linux-gnu
> system i686, linux-gnu
> status Under development (unstable)
> major 2
> minor 8.0
> year 2008
> month 07
> day 05
> svn rev 46037
> language R
> version.string R version 2.8.0 Under development (unstable) (2008-07-05 r46037)
>
> ______________________________________________
> R-devel_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Brian D. Ripley,                  ripley_at_stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Received on Thu 07 Aug 2008 - 09:53:51 GMT
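
The leak pattern described in the report above can be illustrated with a small standalone C sketch. This is not the code from regex.c: the struct charset_sketch, the macro FAKE_LIBC (standing in for _LIBC), the allocation counter, and all function names are hypothetical and exist only for illustration. It shows two arrays that are always allocated but are freed only under a conditional-compilation guard, and the same cleanup with the frees moved outside the guard, as in the posted diff.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the bracket-expression set in regex.c.
   Only the two range arrays matter for this sketch. */
typedef struct {
    int *range_starts;
    int *range_ends;
    int  nranges;
    int  range_alloc;
} charset_sketch;

/* Number of live heap blocks, so a leak is visible without valgrind. */
static int live_blocks = 0;

static void *tracked_realloc(void *p, size_t n) {
    void *q = realloc(p, n);
    if (q != NULL && p == NULL) live_blocks++;   /* a brand-new block */
    return q;
}

static void tracked_free(void *p) {
    if (p != NULL) { live_blocks--; free(p); }
}

static charset_sketch *new_charset(void) {
    charset_sketch *cs = tracked_realloc(NULL, sizeof *cs);
    if (cs != NULL) {
        cs->range_starts = NULL;
        cs->range_ends = NULL;
        cs->nranges = 0;
        cs->range_alloc = 0;
    }
    return cs;
}

/* Space for both arrays is allocated regardless of FAKE_LIBC. */
static int add_range(charset_sketch *cs, int start, int end) {
    assert(cs != NULL);
    if (cs->nranges == cs->range_alloc) {
        int new_alloc = 2 * cs->range_alloc + 1;
        int *s = tracked_realloc(cs->range_starts, new_alloc * sizeof *s);
        if (s == NULL) return -1;
        cs->range_starts = s;
        int *e = tracked_realloc(cs->range_ends, new_alloc * sizeof *e);
        if (e == NULL) return -1;
        cs->range_ends = e;
        cs->range_alloc = new_alloc;
    }
    cs->range_starts[cs->nranges] = start;
    cs->range_ends[cs->nranges] = end;
    cs->nranges++;
    return 0;
}

/* Leaky shape: the range arrays are released only if FAKE_LIBC is defined. */
static void free_charset_leaky(charset_sketch *cs) {
#ifdef FAKE_LIBC
    tracked_free(cs->range_starts);
    tracked_free(cs->range_ends);
#endif
    tracked_free(cs);
}

/* Fixed shape: the two frees sit outside the guard. */
static void free_charset_fixed(charset_sketch *cs) {
    tracked_free(cs->range_starts);
    tracked_free(cs->range_ends);
    tracked_free(cs);
}

/* Build one set with a single range, free it, and report blocks left over. */
static int leaked_after(int use_fixed) {
    int before = live_blocks;
    charset_sketch *cs = new_charset();
    if (cs == NULL) return -1;
    add_range(cs, '0', '9');
    if (use_fixed) free_charset_fixed(cs); else free_charset_leaky(cs);
    return live_blocks - before;
}
```

Compiled without FAKE_LIBC defined, leaked_after(0) reports two blocks left behind per set and leaked_after(1) reports none. Two small blocks per call would be in line with the roughly 2,000 lost blocks valgrind reported for 1,000 calls to sub(), although the sketch does not reproduce R's actual allocation sizes or call path.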
