Re: [Rd] substitute creates an object which prints incorrectly (PR#9427)

From: Duncan Murdoch <murdoch_at_stats.uwo.ca>
Date: Sat 23 Dec 2006 - 17:18:44 GMT

On 12/23/2006 5:28 AM, Peter Dalgaard wrote:

> Bill.Venables@csiro.au wrote:

>> Thanks Peter. I see the dilemma. It is serious in my view, though,
>> even if I can't see an elegant way round it.
>>
>> I guess the only possibilities are
>>
>> 1. Only keep the source in printing or, much more seriously, dumping, if
>> the source attribute parses to an object structurally identical to the
>> function itself (even I can see this is going to be impractical)
>>
>>
> Yes, dumping is the serious issue here. One should also consider that 
> inconsistencies can be created by other means than substitute(), e.g. by 
> brute force.
> 

>> 2. Make the default keep.source option FALSE rather than TRUE and warn
>> people that switching it on can be unsafe in language manipulation.
>> This would be practical, I suggest, if comments were kept as part of the
>> function itself, as well as in the source attribute, but if comments are
>> only kept in the source attribute (as appears to be the case now) I
>> concede this becomes impractical.
>>
>>
> The whole source attribute construction was created because there was no 
> way (that we could find, anyway) to make comment and whitespace handling 
> sane without either changing the language or making the parser insanely 
> complicated. Old R hacks will remember how comments intended for the end 
> of a for loop could move to the top when the function was listed.
> 
> The basic issue is that S's syntactic sugar coat was modeled on C, but C 
> is a compiled language, so its parsers just skip comments and 
> whitespace. Trying to retain comments leads to complications.... Even in 
> a simple function call f(a=b), there are 7 places to stick in comments, 
> so at the very least you need to deal with pre-comments, post-comments, 
> and sometimes in-the-middle-comments. And what is worse, there are 
> ambiguities, which make automatic parser generators unhappy: Does a 
> comment between two expressions belong with the former or the latter? Etc...
> 

>> 3. Modify substitute() so that it strips source attributes (or anything
>> else apparently visible that it will not manipulate) from objects.
>> Sorry folks, too dangerous. (I concede this appears to be a bit of an
>> overkill, too.)
>>
>>
> Actually, this is the easier fix to my mind. Where do you see the danger?

I think we should get rid of source attributes completely, since they are no longer needed, but your comment still applies to source references. We should strip them when code gets modified.
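
For illustration only, here is a minimal sketch of what "strip them when code gets modified" could look like at the R level. strip_function_source() is a hypothetical helper, not an existing R function. It assumes the saved source (a source reference in current R, the "source" text in 2.4.x) sits in the 4th element of each call to `function`, as Peter describes in the quoted message below, and it makes no attempt to handle the extreme cases he mentions.

```r
# Hypothetical helper: walk a language object and drop the saved source
# (4th element) from every call to `function` found in it.
strip_function_source <- function(expr) {
  if (!is.call(expr)) return(expr)
  if (identical(expr[[1L]], as.name("function")) && length(expr) >= 4L) {
    expr[[4L]] <- NULL
  }
  for (i in seq_along(expr)) {
    # Only recurse into sub-calls; symbols, constants and formals are left alone.
    if (is.call(expr[[i]])) expr[[i]] <- strip_function_source(expr[[i]])
  }
  expr
}

# Parse with keep.source = TRUE so the example does not depend on the session default.
e <- parse(text = "Y <- function(x) FUN(x+1)", keep.source = TRUE)[[1]]
m <- do.call(substitute, list(e, list(Y = as.name("y"), FUN = as.name("sin"))))

m2 <- strip_function_source(m)
eval(m2)
print(y)      # printed from the deparsed code, since no saved source is attached
print(y(pi))
```

Evaluating m instead of m2 would leave the stale source attached to y, which is the mismatch between printed and executed code that Bill reported.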

Duncan Murdoch

> 

>> Perhaps the compromise has to be to warn people that keep.source=TRUE
>> can be dangerous in this way, both in the help information for options()
>> and for substitute().
>>
>> ?
>>
>> Bill Venables.
>>
>> -----Original Message-----
>> From: Peter Dalgaard [mailto:p.dalgaard@biostat.ku.dk]
>> Sent: Friday, 22 December 2006 9:47 PM
>> To: Venables, Bill (CMIS, Cleveland)
>> Cc: r-devel@stat.math.ethz.ch; R-bugs@biostat.ku.dk
>> Subject: Re: [Rd] substitute creates an object which prints incorrectly
>> (PR#9427)
>>
>> Bill.Venables@csiro.au wrote:
>>
>>> The function "substitute" seems to fail to make a genuine
>>> substitution, although the printed version seems fine. Here is an
>>> example.
>>>
>>>
>>>
>>> > m <- substitute(Y <- function(x) FUN(x+1),
>>> +                 list(Y = as.name("y"), FUN = as.name("sin")))
>>> > m
>>> y <- function(x) sin(x + 1)
>>> > eval(m)
>>> > y
>>> function(x) FUN(x+1)
>>>
>>> However the story doesn't end there. The substitution appears to have
>>> been made, even though the printed version, this time, suggests
>>> otherwise.
>>>
>>>
>>>
>>> > y(pi)
>>> [1] -0.841471
>>> > sin(pi+1)
>>> [1] -0.841471
>>>
>> Yes, this is (fairly) well known. It has to do with the retention of
>> function source.
>>
>> The thing to notice is that it is only the printing of y that is really
>> confused. If you do
>>
>> dput(y)
>> attr(y, "source")
>> attr(y, "source") <- NULL
>> y
>>
>> then you should see the point. The tricky bit is that the "source"
>> attribute exists in an intermediate form inside m. Notice that m
>> contains, not the function itself, but a call to the function `function`
>> which creates the function when eval'ed. This call contains the function
>> source as its 4th element (look at m[[3]][[4]] in your example), and you
>> might try setting it to NULL and see how things will clear up.
>>
>> The issue with substitute is that it cannot sensibly substitute into
>> character vectors, so it just leaves the source as is, which gives the
>> symptoms you see. It could, however, and probably should, recognize
>> calls to `function` and NULL out their 4th element. It cannot be done
>> completely failsafe though (`function` could result from a computation,
>> or even be part of the substitution), so one has to decide that the
>> extreme cases are too extreme to worry about.
>>
>> -pd
>>
>>
>>> Bill Venables
>>> CMIS, CSIRO Laboratories,
>>> PO Box 120, Cleveland, Qld. 4163
>>> AUSTRALIA
>>> Office Phone (email preferred): +61 7 3826 7251
>>> Fax (if absolutely necessary): +61 7 3826 7304
>>> Mobile (rarely used): +61 4 1963 4642
>>> Home Phone: +61 7 3286 7700
>>> mailto:Bill.Venables@csiro.au
>>> http://www.cmis.csiro.au/bill.venables/
>>>
>>>
>>> --please do not edit the information below--
>>>
>>> Version:
>>> platform = i386-pc-mingw32
>>> arch = i386
>>> os = mingw32
>>> system = i386, mingw32
>>> status =
>>> major = 2
>>> minor = 4.1
>>> year = 2006
>>> month = 12
>>> day = 18
>>> svn rev = 40228
>>> language = R
>>> version.string = R version 2.4.1 (2006-12-18)
>>>
>>> Windows XP Professional (build 2600) Service Pack 2.0
>>>
>>> Locale:
>>>
>>>
>>> LC_COLLATE=English_Australia.1252;LC_CTYPE=English_Australia.1252;LC_MONETARY=English_Australia.1252;LC_NUMERIC=C;LC_TIME=English_Australia.1252
>>>
>>> Search Path:
>>> .GlobalEnv, .R_Store, package:RODBC, package:xlsReadWrite,
>>> package:cluster, package:vegan, package:ASOR, package:stats,
>>> package:graphics, package:grDevices, package:utils, package:datasets,
>>> .R_Data, .R_Utils, package:svIDE, package:tcltk, package:methods,
>>> Autoloads, package:base
>>>
>>> ______________________________________________
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>>

Received on Sun Dec 24 04:22:21 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sat 23 Dec 2006 - 17:31:02 GMT.
