Re: [Rd] Recent and upcoming changes to R-devel

From: Duncan Murdoch <murdoch.duncan_at_gmail.com>
Date: Tue, 05 Jul 2011 14:16:29 -0400

On 05/07/2011 1:45 PM, Tobias Verbeke wrote:
> On 07/05/2011 04:21 PM, Duncan Murdoch wrote:
> > On 05/07/2011 10:17 AM, Tobias Verbeke wrote:
> >> Dear Duncan,
> >>
> >> On 07/05/2011 03:25 PM, Duncan Murdoch wrote:
> >> > On 05/07/2011 6:52 AM, Tobias Verbeke wrote:
> >> >> L.S.
> >> >>
> >> >> On 07/05/2011 02:16 AM, Mark.Bravington_at_csiro.au wrote:
> >> >> > I may have misunderstood, but:
> >> >> >
> >> >> > Please could we have an optional installation that does *not*
> >> >> byte-compile base and recommended?
> >> >> >
> >> >> > Reason: it's not possible to debug byte-compiled code-- at least not
> >> >> with the 'debug' package, which is quite widely used. I quite often
> >> >> end up using 'mtrace' on functions in base/recommended packages to
> >> >> figure out what they are doing. And sometimes I (and others)
> >> >> experiment with changing functions in base/recommended to improve
> >> >> functionality. That seems to be harder with BC versions, and might
> >> >> even be impossible, as best I can tell from hints in the documentation
> >> >> of 'compile'.
> >> >> >
> >> >> > Personally, if I had to choose only one, I'd rather live with the
> >> >> speed penalty from not byte-compiling. But of course, if both are
> >> >> available, I could install both.
> >> >>
> >> >> I completely second this request. All speed improvements and the byte
> >> >> compiler in particular are leaps forward and I am very grateful and
> >> >> admiring towards the people that make this happen.
> >> >>
> >> >> That being said, 'moving away' from the sources (with the lazy loading
> >> >> files and byte-compilation) may be a step back for R package
> >> developers
> >> >> that (during development and maybe on separate development
> >> installations
> >> >> [as opposed to production installations of R]) require
> >> >> the sources of all packages to be efficient in their work.
> >> >>
> >> >> As many of you know there is an open source Eclipse/StatET visual
> >> >> debugger ready and for that application as well (similar to Mark's
> >> >> request) presence of non-compiled code is highly desirable.
> >> >>
> >> >> For the particular purpose of debugging R packages, I would even plead
> >> >> to go beyond the current options and support the addition of an
> >> >> R package install option that allows to include the sources (e.g. in
> >> >> a standard folder Rsrc/) in installed packages.
> >> >>
> >> >> I am fully aware that one can always fetch the source tarballs from
> >> >> CRAN for that purpose, but it would be much easier if a simple
> >> >> installation option could put the R sources of a package in a separate
> >> >> folder [or archive inside an existing folder] such that R development
> >> >> tools (such as the Eclipse/StatET IDE) can offer inspection of sources
> >> >> or display them (e.g. during debugging) out of the box.
> >> >>
> >> >> If one has the srcref, one can always load the absolutely correct
> >> source
> >> >> code this way, even if one doesn't know the parent function with
> >> >> the source attribute.
> >> >>
> >> >> Any comments?
> >> >
> >> > I think these requests have already been met. If you modify the body of
> >> > a closure (as trace() does), then the byte compiled version is
> >> > discarded, and you go back to the regular interpreted code. If you
> >> > install packages with the R_KEEP_PKG_SOURCE=yes environment variable
> >> > set, then you keep all source for all functions. (It's attached to the
> >> > function itself, not as a file that may be out of date.) It's possible

> Can you expand on when files put inside a package at install
> time will be out of date compared to the source information
> attached to a function ?

Suppose you're debugging. You change a function and source it: now it's no longer the same as the one in the package source; it's the one from your editor.

> I (naively) thought the source information was created and attached
> at install time as well and that it did not change afterwards either.

It won't change if the function doesn't change, but during debugging (or in some strange examples, during normal execution) the function might change.
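A minimal sketch of that point, assuming nothing beyond base R: the source reference travels with the closure itself, so an edited copy carries its own text rather than the text of the installed file. The names f_installed and f_edited are made up for illustration.

```r
# parse(keep.source = TRUE) records the text each expression was parsed from.
# 'f_installed' is a hypothetical stand-in for a function as installed.
f_installed <- eval(parse(text = "function(x) x + 1", keep.source = TRUE))

# While debugging, you edit the function and source the new version.
f_edited <- eval(parse(text = "function(x) x + 2", keep.source = TRUE))

# Each closure reports the text it was actually created from.
installed_src <- as.character(attr(f_installed, "srcref"))
edited_src <- as.character(attr(f_edited, "srcref"))
print(installed_src)
print(edited_src)
```

A file copied into the package directory at install time would still show the first version, which is the sense in which it can be out of date.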

> I guess the arguments for files is that they have precise
> locations and allow for easy indexing by development tools
> external to R (but may be corrected here as well).

As in versions before 2.13.0, R still keeps the locations and time stamps of the files, but we were finding it too unreliable not to have an actual copy of the contents, so 2.13.0 also keeps a copy of the file, and that copy is the main source of content to display.

> >> > that byte compiling turns off R_KEEP_PKG_SOURCE, but that is something
> >> > that is either easily fixed, or avoided by re-installing without byte
> >> > compiling.
> >>
> >> Many thanks for your reaction. Is the R_KEEP_PKG_SOURCE=yes environment
> >> variable also supported during R installation ?
> >
> > Yes, other than the error you saw below, which is a temporary problem.
> > Not sure which function exceeded the length limit, but the length limit
> > is going away before 2.14.0 is released.

> Thanks again, Duncan, for the clarification.
>

> Is it useful (or just whimsical) to have an R
> function that would allow for a given stock CRAN
> Windows R installation with stock Windows CRAN binary
> add-on packages to add the source information that
> would be useful e.g. for a debugger post factum?
>

> I can imagine something like
>
> update.packages(., checkSourcesKept = TRUE)

I suspect it would be hard to do that for base, tools and compiler, because those packages are handled specially during installation. The other base packages (and all contributed packages) are handled more uniformly, so install.packages() with the right arguments should do it, though I admit I haven't tried doing this with the other base packages. If you're debugging those, you'll often end up looking at R internals, and then you need to be able to build R...
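For contributed packages, here is a sketch of what "the right arguments" might look like, using the --with-keep.source install option mentioned elsewhere in this thread. The wrapper name reinstall_with_source is hypothetical, and this assumes a version of R whose R CMD INSTALL accepts that option, plus a machine with the tools needed to build packages from source.

```r
# Hypothetical helper: reinstall packages from source, keeping R source
# references so that debuggers can show the original code.
reinstall_with_source <- function(pkgs, ...) {
  utils::install.packages(
    pkgs,
    type = "source",                      # build locally, not a CRAN binary
    INSTALL_opts = "--with-keep.source",  # passed through to R CMD INSTALL
    ...
  )
}

# Example call (not run here; needs a CRAN mirror and build tools):
# reinstall_with_source("AER")
```

Setting R_KEEP_PKG_SOURCE=yes in the environment before installing is the other route described earlier in the thread.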

If you want to have binary copies of packages on CRAN that include the debug info, I suspect you'll get some resistance from CRAN, because they'll add a lot to the file size, processing time, etc., for relatively rare use.

> as I don't think this can currently be solved
> with a combination of INSTALL_opts="--with-keep.source"
> and type="source" given that there will not be a check
> for the presence of source information to determine
> which packages require being updated (or in this
> case 'completed' with source information).

> The alternative scenario would be to expect users
> that want this functionality to compile R and all
> add-on packages from source (also on Windows or
> Mac).

I don't think you need to compile R except for those 3 packages (and maybe the other base packages), but I think it's reasonable to expect people who want source level debugging to be ready to do source installs of packages. And back to the point above: if a user can do a source install, they can do one with debugging info, so there's no real point in CRAN doing it for them.
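To see whether a given installed package already carries that information, something like the following might do. It is only a rough check, under the assumption that source references, when kept, show up as a "srcref" attribute on the package's functions; has_kept_source is a made-up name, not an existing R function.

```r
# Hypothetical check: does any function in the package's namespace
# carry a "srcref" attribute?
has_kept_source <- function(pkg) {
  ns <- asNamespace(pkg)  # loads the namespace if it is not loaded yet
  for (name in ls(ns, all.names = TRUE)) {
    obj <- get(name, envir = ns, inherits = FALSE)
    if (is.function(obj) && !is.null(attr(obj, "srcref"))) {
      return(TRUE)
    }
  }
  FALSE
}

# Example: report the status of one installed package.
print(has_kept_source("stats"))
```

A tool could run this over the installed packages and reinstall from source only those that return FALSE.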

Duncan Murdoch

> Best,
> Tobias

> >> I hope I'm not overlooking anything, but when compiling
> >>
> >> ftp://ftp.stat.math.ethz.ch/Software/R/R-devel.tar.gz
> >>
> >> a few minutes ago I encountered the following issue:
> >>
> >> [...]
> >>
> >> building package 'tools'
> >> mkdir -p -- ../../../library/tools
> >> make[4]: Entering directory
> >> `/home/tobias/rAdmin/R-devel/src/library/tools'
> >> mkdir -p -- ../../../library/tools/R
> >> mkdir -p -- ../../../library/tools/po
> >> make[4]: Leaving directory
> >> `/home/tobias/rAdmin/R-devel/src/library/tools'
> >> make[4]: Entering directory
> >> `/home/tobias/rAdmin/R-devel/src/library/tools'
> >> make[5]: Entering directory
> >> `/home/tobias/rAdmin/R-devel/src/library/tools/src'
> >> making text.d from text.c
> >> making init.d from init.c
> >> making Rmd5.d from Rmd5.c
> >> making md5.d from md5.c
> >> gcc -std=gnu99 -I../../../../include -I/usr/local/include
> >> -fvisibility=hidden -fpic -g -O2 -c text.c -o text.o
> >> gcc -std=gnu99 -I../../../../include -I/usr/local/include
> >> -fvisibility=hidden -fpic -g -O2 -c init.c -o init.o
> >> gcc -std=gnu99 -I../../../../include -I/usr/local/include
> >> -fvisibility=hidden -fpic -g -O2 -c Rmd5.c -o Rmd5.o
> >> gcc -std=gnu99 -I../../../../include -I/usr/local/include
> >> -fvisibility=hidden -fpic -g -O2 -c md5.c -o md5.o
> >> gcc -std=gnu99 -shared -L/usr/local/lib64 -o tools.so text.o init.o
> >> Rmd5.o md5.o -L../../../../lib -lR
> >> make[6]: Entering directory
> >> `/home/tobias/rAdmin/R-devel/src/library/tools/src'
> >> make[6]: `Makedeps' is up to date.
> >> make[6]: Leaving directory
> >> `/home/tobias/rAdmin/R-devel/src/library/tools/src'
> >> make[6]: Entering directory
> >> `/home/tobias/rAdmin/R-devel/src/library/tools/src'
> >> mkdir -p -- ../../../../library/tools/libs
> >> make[6]: Leaving directory
> >> `/home/tobias/rAdmin/R-devel/src/library/tools/src'
> >> make[5]: Leaving directory
> >> `/home/tobias/rAdmin/R-devel/src/library/tools/src'
> >> make[4]: Leaving directory
> >> `/home/tobias/rAdmin/R-devel/src/library/tools'
> >> Error in parse(n = -1, file = file) :
> >> function is too long to keep source (at line 2967)
> >> Error: unable to load R code in package ‘tools’
> >> Execution halted
> >>
> >> [...]
> >>
> >> tobias_at_openanalytics:~/rAdmin$ echo $R_KEEP_PKG_SOURCE
> >> yes
> >>
> >> I do not have this issue when R_KEEP_PKG_SOURCE is set
> >> to 'false' during compilation.
> >>
> >> Best,
> >> Tobias
> >>
> >> >> P.S. One could even consider a post-install option e.g. to add 'real'
> >> >> R sources (and source references) to Windows packages (which are by
> >> >> definition already 'installed' and for which such information is not
> >> >> by default included in the CRAN binaries of these packages).
> >> >>
> >> >> >> > Prof Brian Ripley wrote:
> >> >> >> > There was an R-core meeting the week before last, and various
> >> >> planned
> >> >> >> > changes will appear in R-devel over the next few weeks.
> >> >> >> >
> >> >> >> > These are changes planned for R 2.14.0 scheduled for Oct 31.
> >> As we
> >> >> >> > are sick of people referring to R-devel as '2.14' or '2.14.0',
> >> that
> >> >> >> > version number will not be used until we reach 2.14.0 alpha. You
> >> >> >> > will be able to have a package depend on an svn version number
> >> when
> >> >> >> > referring to R-devel rather than using R (>= 2.14.0).
> >> >> >> >
> >> >> >> > All packages are installed with lazy-loading (there were 72 CRAN
> >> >> >> > packages and 8 BioC packages which opted out). This means that
> >> the
> >> >> >> > code is always parsed at install time which inter alia simplifies
> >> >> the
> >> >> >> > descriptions. R 2.13.1 RC warns on installation about packages
> >> which
> >> >> >> > ask not to be lazy-loaded, and R-devel ignores such requests
> >> (with a
> >> >> >> > warning).
> >> >> >> >
> >> >> >> > In the near future all packages will have a name space. If the
> >> >> >> > sources do not contain one, a default NAMESPACE file will be
> >> added.
> >> >> >> > This again will simplify the descriptions and also a lot of
> >> internal
> >> >> >> > code. Maintainers of packages without name spaces (currently
> >> 42% of
> >> >> >> > CRAN) are encouraged to add one themselves.
> >> >> >> >
> >> >> >> > R-devel is installed with the base and recommended packages
> >> >> >> > byte-compiled (the equivalent of 'make bytecode' in R 2.13.x, but
> >> >> >> > done less inefficiently). There is a new option R CMD INSTALL
> >> >> >> > --byte-compile to byte-compile contributed packages, but that
> >> >> remains
> >> >> >> > optional.
> >> >> >> > Byte-compilation is quite expensive (so you definitely want to
> >> do it
> >> >> >> > at install time, which requires lazy-loading), and relatively few
> >> >> >> > packages benefit appreciably from byte-compilation. A larger
> >> number
> >> >> >> > of packages benefit from byte-compilation of R itself: for
> >> example
> >> >> >> > AER runs its checks 10% faster. The byte-compiler technology is
> >> >> >> > thanks to Luke Tierney.
> >> >> >> >
> >> >> >> > There is support for figures in Rd files: currently with a
> >> >> first-pass
> >> >> >> > implementation (thanks to Duncan Murdoch).
> >> >> > ______________________________________________
> >> >> > R-devel_at_r-project.org mailing list
> >> >> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >> >> >
> >> >>
> >> >> ______________________________________________
> >> >> R-devel_at_r-project.org mailing list
> >> >> https://stat.ethz.ch/mailman/listinfo/r-devel
> >> >
> >>
> >
>

R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Tue 05 Jul 2011 - 18:18:43 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 05 Jul 2011 - 22:40:07 GMT.