Re: [Rd] Wish there were a "strict mode" for R interpreter. What

From: Ted Harding <ted.harding_at_wlandres.net>
Date: Sat, 09 Apr 2011 22:08:13 +0100 (BST)


On 09-Apr-11 20:37:28, Duncan Murdoch wrote:

> On 11-04-09 3:51 PM, Paul Johnson wrote:

>> Years ago, I did lots of Perl programming. Perl will let you be lazy
>> and write functions that refer to undefined variables (like R does),
>> but there is also a strict mode so the interpreter will block anything
>> when a variable is mentioned that has not been defined. I wish there
>> were a strict mode for checking R functions.
>>
>> Here's why. We have a lot of students writing R functions around here
>> and they run into trouble because they use the same name for things
>> inside and outside of functions. When they call functions that have
>> mistaken or undefined references to names that they use elsewhere,
>> then variables that are in the environment are accidentally used. Know
>> what I mean?
>>
>> dat<- whatever
>>
>> someNewFunction<- function(z, w){
>> #do something with z and w and create a new "dat"
>> # but forget to name it "dat"
>> lm(y ~ x, data=dat)
>> # lm just used wrong data
>> }
>>
>> I wish R had a strict mode to return an error in that case. Users
>> don't realize they are getting nonsense because R finds things to fill
>> in for their mistakes.
>>
>> Is this possible? Does anybody agree it would be good?
>>
> 
> It would be really bad, unless done carefully.
> 
> In your function the free (undefined) variables are dat and lm.  You 
> want to be warned about dat, but you don't want to be warned about lm. 
> What rule should R use to determine that?
> 
> (One possible rule would work in a package with a namespace.  In that 
> case, all variables must be found in declared dependencies, the search 
> could stop before it got to globalenv().  But it seems unlikely that 
> your students are writing packages with namespaces.)
> 
> Duncan Murdoch

I'm with Duncan on this one! On the other hand, I can understand the issues that Paul's students might encounter.

I think the right thing to do is to introduce the students to the basics of scoping early in the process of learning R.

Thus, when there is a variable (such as 'lm' in the example) which you *expect* to already be out there (since 'lm' is in 'stats' which is pre-loaded by default), then you can go ahead and use it.
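
To make that lookup concrete, here is a small sketch. It assumes a default R session in which 'stats' is attached and nobody has defined their own 'lm'; the function name 'f' is made up for illustration.

```r
# Is 'lm' defined inside the function itself? It is not: it is a free
# variable that R has to find somewhere else.
f <- function() exists("lm", envir = environment(), inherits = FALSE)
f()                               # FALSE

# R walks outwards from the function, through the global environment,
# and along the search path until it finds a match.
find("lm")                        # "package:stats"
environmentName(environment(lm))  # "stats"
```

So using 'lm' as a free variable is safe precisely because its home is known in advance.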

But when your function uses a variable (e.g. 'dat') which just *happened* to be out there when you first wrote the function, things are likely to go wrong when you re-use the same function definition in a different context. So teach them that any variable occurring in a function whose meaning depends on the context of use should either be a named argument in the argument list or be explicitly defined within the function. It should not be assumed to exist already unless that is guaranteed in every context in which the function would be used.
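
A minimal sketch of the difference, along the lines of Paul's example. The names 'fitRisky', 'fitSafe' and 'newdat' and the toy data are my own inventions for illustration, not anything from his students' code.

```r
set.seed(1)
# A 'dat' that just happens to exist in the workspace.
dat <- data.frame(x = 1:10, y = rnorm(10))

# Risky: 'dat' is a free variable, so R silently picks up the global one.
fitRisky <- function(z, w) {
  newdat <- data.frame(x = z, y = w)  # meant to be used below, but is not
  lm(y ~ x, data = dat)
}

# Safer: everything the function needs is an argument or is defined inside.
fitSafe <- function(z, w) {
  newdat <- data.frame(x = z, y = w)
  lm(y ~ x, data = newdat)
}

z <- 1:10
w <- 2 * z + 1
coef(fitRisky(z, w))  # fitted to the global 'dat', not to z and w
coef(fitSafe(z, w))   # fitted to z and w: intercept 1, slope 2
```

Both calls run without complaint, which is exactly the trouble: only the second one answers the question that was asked.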

This is basic good practice which, once routinely adopted, should ensure that the right thing is done every time!
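
For anyone who wants a mechanical check of this practice, here is a rough sketch. It assumes the recommended package 'codetools' is installed (it normally comes with R). The helper 'skipGlobal' is a made-up name for a workaround in the spirit of Duncan's remark about stopping the search before globalenv(); it is not a built-in strict mode.

```r
library(codetools)

someNewFunction <- function(z, w) {
  newdat <- data.frame(x = z, y = w)
  lm(y ~ x, data = dat)   # 'dat' is the accidental free variable
}

# List the free names the function relies on, split into functions and
# other variables. 'dat' shows up among the variables, 'lm' among the
# functions, so it is still up to you to judge which ones are intended.
globals <- findGlobals(someNewFunction, merge = FALSE)
globals$variables
globals$functions

# Hypothetical helper: re-parent the function so that lookup skips the
# global environment and starts at the attached packages instead.
skipGlobal <- function(f) {
  environment(f) <- parent.env(globalenv())
  f
}

dat <- data.frame(x = 1:3, y = c(2, 4, 7))  # a stray 'dat' in the workspace
strictFn <- skipGlobal(someNewFunction)

# 'lm' is still found via the search path, but 'dat' no longer is,
# so the mistake becomes an error instead of a silent wrong answer.
result <- tryCatch(strictFn(1:5, 2 * (1:5)),
                   error = function(e) conditionMessage(e))
result
```

The re-parenting trick also hides any helper functions defined in the workspace, so it is more of a diagnostic than something to leave switched on.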

Ted.



E-Mail: (Ted Harding) <ted.harding_at_wlandres.net> Fax-to-email: +44 (0)870 094 0861
Date: 09-Apr-11                                       Time: 22:08:10
------------------------------ XFMail ------------------------------

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel Received on Sat 09 Apr 2011 - 21:14:03 GMT

