Re: [Rd] Recent and upcoming changes to R-devel

From: <Mark.Bravington_at_csiro.au>
Date: Thu, 07 Jul 2011 17:38:15 +1000

Thank you Duncan for those reassurances. I still have a couple of questions, as below, but there was definitely some good news for me in your replies:

*If* that's all there is, then I can easily enough modify 'debug' to work with byte-compiled code. But I am still a bit concerned, because I can remember in the past reading somewhat scary things like this:

"... sealing the namespace means the byte compiler can make certain assumptions for efficiency..."  

Q1: So my concern is something like this: even if I can successfully 'mtrace' one function X in eg baseenv() for when X is called directly, maybe a byte-compiled version of another function Y (in baseenv() or elsewhere) that calls X will somehow still invoke the original byte-compiled version of X. EG if the BC knows that X has been reduced to piece-of-BC-code #924, then maybe the BC of Y contains "call BC#924 now" rather than "call X now". So only direct calls to X would pick up the modified version. Can someone confirm that "this sort of thing" won't happen?

Obviously this is a bit non-specific and is based on ignorance-- but that's inevitable because there isn't much information about the BC. The manuals for R-2.14-dev have almost none, for example.  

The Windows binary of R-devel on 6-7-2011 didn't yet seem to be the pre-byte-compiled version mentioned in Brian Ripley's announcement, so I haven't been able to test anything.  

Q2: From DM's first reply:  

"... but that is something
 that is either ..., or avoided by re-installing without byte compiling."

How would the latter be done? And would it be possible for the Windows binary distro, or just for the source distro? My reading of the first email about changes in R-2.14-devel was that the distro would be byte-compiled already.

Thanks

Mark Bravington
CSIRO CMIS
Marine Lab
Hobart
Australia



From: Duncan Murdoch [murdoch.duncan_at_gmail.com] Sent: 06 July 2011 21:43
To: Stephan Wahlbrink
Cc: tobias.verbeke_at_openanalytics.eu; Bravington, Mark (CMIS, Hobart); ripley_at_stats.ox.ac.uk; R-devel_at_r-project.org Subject: Re: [Rd] Recent and upcoming changes to R-devel

On 11-07-06 3:46 AM, Stephan Wahlbrink wrote:

> Duncan Murdoch wrote [2011-07-05 17:21]:
>> On 05/07/2011 11:20 AM, Stephan Wahlbrink wrote:

>>> Dear developers,
>>>
>>> Duncan Murdoch wrote [2011-07-05 15:25]:
>>>> On 05/07/2011 6:52 AM, Tobias Verbeke wrote:
>>>>> L.S.
>>>>>
>>>>> On 07/05/2011 02:16 AM, Mark.Bravington_at_csiro.au wrote:
>>>>>> I may have misunderstood, but:
>>>>>>
>>>>>> Please could we have an optional installation that does not*not*
>>>>> byte-compile base and recommended?
>>>>>>
>>>>>> Reason: it's not possible to debug byte-compiled code-- at least not
>>>>> with the 'debug' package, which is quite widely used. I quite often
>>>>> end up using 'mtrace' on functions in base/recommended packages to
>>>>> figure out what they are doing. And sometimes I (and others)
>>>>> experiment with changing functions in base/recommended to improve
>>>>> functionality. That seems to be harder with BC versions, and might
>>>>> even be impossible, as best I can tell from hints in the documentation
>>>>> of 'compile').
>>>>>>
>>>>>> Personally, if I had to choose only one, I'd rather live with the
>>>>> speed penalty from not byte-compiling. But of course, if both are
>>>>> available, I could install both.
>>>>>
>>>>> I completely second this request. All speed improvements and the byte
>>>>> compiler in particular are leaps forward and I am very grateful and
>>>>> admiring towards the people that make this happen.
>>>>>
>>>>> That being said, 'moving away' from the sources (with the lazy loading
>>>>> files and byte-compilation) may be a step back for R package
>>> developers
>>>>> that (during development and maybe on separate development
>>> installations
>>>>> [as opposed to production installations of R]) require
>>>>> the sources of all packages to be efficient in their work.
>>>>>
>>>>> As many of you know there is an open source Eclipse/StatET visual
>>>>> debugger ready and for that application as well (similar to Mark's
>>>>> request) presence of non-compiled code is highly desirable.
>>>>>
>>>>> For the particular purpose of debugging R packages, I would even plead
>>>>> to go beyond the current options and support the addition of an
>>>>> R package install option that allows to include the sources (e.g. in
>>>>> a standard folder Rsrc/) in installed packages.
>>>>>
>>>>> I am fully aware that one can always fetch the source tarballs from
>>>>> CRAN for that purpose, but it would be much more easy if a simple
>>>>> installation option could put the R sources of a package in a separate
>>>>> folder [or archive inside an existing folder] such that R development
>>>>> tools (such as the Eclipse/StatET IDE) can offer inspection of sources
>>>>> or display them (e.g. during debugging) out of the box.
>>>>>
>>>>> If one has the srcref, one can always load the absolutely correct
>>> source
>>>>> code this way, even if one doesn't know the parent function with
>>>>> the source attribute.
>>>>>
>>>>> Any comments?
>>>>
>>>> I think these requests have already been met. If you modify the body of
>>>> a closure (as trace() does), then the byte compiled version is
>>>> discarded, and you go back to the regular interpreted code. If you
>>>> install packages with the R_KEEP_PKG_SOURCE=yes environment variable
>>>> set, the you keep all source for all functions. (It's attached to the
>>>> function itself, not as a file that may be out of date.) It's possible
>>>> that byte compiling turns off R_KEEP_PKG_SOURCE, but that is something
>>>> that is either easily fixed, or avoided by re-installing without byte
>>>> compiling.
>>>
>>> I don’t know how the new installation works exactly, but would it be
>>> possible, to simply install both types, the old expression bodies and
>>> the new byte-compiled, as single package at the same time?
>>
>> Yes, that's what is done.
>
> Perfect! And which version (byte-compiled or expressions) is used at
> runtime under which condition?

The byte code is executed, the interpreted version is displayed. There are conditions under which the byte code is dropped. I don't know a comprehensive list, but the idea is that if the body changes, then the compiled version is no longer valid.

Duncan Murdoch

>
> Thanks,
> Stephan
>
>

>>> This would
>>> allow the R user and developer to simply use the variant which is the
>>> best at the moment. If he wants to debug code, he can switch of the use
>>> of byte-compiled code and use the old R expressions (with attached
>>> srcrefs). If debugging is not required, he can profit from the
>>> byte-compiled version. The best would be a toggle, to switch it at
>>> runtime, but a startup option would be sufficient too.
>>>
>>> I think direct access to the code is one big advantage of open source
>>> software. For developer it makes it easier to find and fix bugs if
>>> something is wrong. But it can also help users a lot to understand how a
>>> function or algorithm works and learn from code written by other persons
>>> – if the access to the sources is easy.
>>>
>>> As long byte-code doesn’t support the debugging features of R, it is
>>> required for best debugging support to run the functions completely
>>> without byte-complied code. If I understood it correctly, byte-code
>>> frames would disable srcrefs as well as features like “step return” to
>>> that frames. Therefore I ask for a way that it is easy to switch between
>>> both execution types.
>>
>> What gave you that impression?
>>
>> Duncan Murdoch
>>

>>> Best,
>>> Stephan
>>>
>>>
>>>>
>>>> Duncan Murdoch
>>>>
>>>>> Best,
>>>>> Tobias
>>>>>
>>>>> P.S. One could even consider a post-install option e.g. to add 'real'
>>>>> R sources (and source references) to Windows packages (which are by
>>>>> definition already 'installed' and for which such information is not
>>>>> by default included in the CRAN binaries of these packages).
>>>>>
>>>>>>>> Prof Brian Ripley wrote:
>>>>>>>> There was an R-core meeting the week before last, and various
>>>>> planned
>>>>>>>> changes will appear in R-devel over the next few weeks.
>>>>>>>>
>>>>>>>> These are changes planned for R 2.14.0 scheduled for Oct 31.
>>> As we

>>>>>>>> are sick of people referring to R-devel as '2.14' or '2.14.0',
>>> that
>>>>>>>> version number will not be used until we reach 2.14.0 alpha. You
>>>>>>>> will be able to have a package depend on an svn version number
>>> when
>>>>>>>> referring to R-devel rather than using R (>= 2.14.0).
>>>>>>>>
>>>>>>>> All packages are installed with lazy-loading (there were 72 CRAN
>>>>>>>> packages and 8 BioC packages which opted out). This means that
>>> the
>>>>>>>> code is always parsed at install time which inter alia simplifies
>>>>> the
>>>>>>>> descriptions. R 2.13.1 RC warns on installation about packages
>>> which
>>>>>>>> ask not to be lazy-loaded, and R-devel ignores such requests
>>> (with a
>>>>>>>> warning).
>>>>>>>>
>>>>>>>> In the near future all packages will have a name space. If the
>>>>>>>> sources do not contain one, a default NAMESPACE file will be
>>> added.
>>>>>>>> This again will simplify the descriptions and also a lot of
>>> internal
>>>>>>>> code. Maintainers of packages without name spaces (currently
>>> 42% of
>>>>>>>> CRAN) are encouraged to add one themselves.
>>>>>>>>
>>>>>>>> R-devel is installed with the base and recommended packages
>>>>>>>> byte-compiled (the equivalent of 'make bytecode' in R 2.13.x, but
>>>>>>>> done less inefficiently). There is a new option R CMD INSTALL
>>>>>>>> --byte-compile to byte-compile contributed packages, but that
>>>>> remains
>>>>>>>> optional.
>>>>>>>> Byte-compilation is quite expensive (so you definitely want to
>>> do it
>>>>>>>> at install time, which requires lazy-loading), and relatively few
>>>>>>>> packages benefit appreciably from byte-compilation. A larger
>>> number
>>>>>>>> of packages benefit from byte-compilation of R itself: for
>>> example
>>>>>>>> AER runs its checks 10% faster. The byte-compiler technology is
>>>>>>>> thanks to Luke Tierney.
>>>>>>>>
>>>>>>>> There is support for figures in Rd files: currently with a
>>>>> first-pass
>>>>>>>> implementation (thanks to Duncan Murdoch).
>>>
>>> --
>>> Stephan Wahlbrink
>>> Humboldtstr. 19
>>> 44137 Dortmund
>>> Germany
>>> http://www.walware.de/goto/opensource
>>>
>>
>>

R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel Received on Thu 07 Jul 2011 - 07:44:33 GMT

This quarter's messages: by month, or sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

All messages

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 07 Jul 2011 - 14:30:07 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel. Please read the posting guide before posting to the list.

list of date sections of archive