Re: [Rd] Changes to parser in R-devel

From: Yihui Xie <xie_at_yihui.name>
Date: Thu, 19 Jul 2012 16:41:12 -0400

I'm not sure if there is a bug somewhere; see this example:

getParseData(parse(text='function(x){}'))

  line1 col1 line2 col2 id parent          token terminal     text
1     1    1     1    8  1     11       FUNCTION     TRUE function
2     1    9     1    9  2     11            '('     TRUE        (
3     1   10     1   10  3      5 SYMBOL_FORMALS     TRUE        x
4     1   11     1   11  4     11            ')'     TRUE        )
5     1   12     1   12  6      8            '{'     TRUE        {
6     1   13     1   13  7      8            '}'     TRUE        }
7     1   12     1   12  5     11            '}'     TRUE        {
8     1   12     1   13  8     11           expr    FALSE
9     1    1     1   13 11      0           expr    FALSE

I get an additional { in the 7th row of the 'text' column.

Another problem: with the empty function below, there is a noticeable pause if you run it more than once:

getParseData(parse(text='function(){}'))

and you may get wild line/col numbers like this:

   line1 col1     line2 col2 id parent    token terminal     text
1      1    1         1    8  1      9 FUNCTION     TRUE function
2      1    9         1    9  2      9      '('     TRUE        (
3      1   10         1   10  3      9      ')'     TRUE        )
4      1   11         1   11  4      6      '{'     TRUE        {
5      1   12         1   12  5      6      '}'     TRUE        }
6 320024   11 140106360   11 11      9      '}'     TRUE
7      1   11         1   12  6      9     expr    FALSE
8      1    1         1   12  9     11     expr    FALSE
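
Both symptoms above (the extra "{" row and the out-of-range line numbers) could be checked programmatically. This is only a minimal sketch: check_parse_data is a hypothetical helper, not part of R, and it assumes that a correct parse of these inputs reports exactly one terminal "{" and only line numbers that lie within the parsed text.

```r
# Hypothetical helper (not part of R): flag the two symptoms described above.
check_parse_data <- function(code) {
  # Number of lines in the input; valid line1/line2 values must lie in 1..n_lines.
  n_lines <- length(strsplit(code, "\n", fixed = TRUE)[[1]])
  # keep.source = TRUE so parse data is recorded even in non-interactive sessions.
  pd <- utils::getParseData(parse(text = code, keep.source = TRUE))
  out_of_range <- pd$line1 < 1 | pd$line1 > n_lines |
                  pd$line2 < 1 | pd$line2 > n_lines
  list(
    # Count of terminal rows whose text is "{" (assumed to be 1 for these inputs).
    open_braces = sum(pd$terminal & pd$text == "{"),
    # Rows whose line numbers fall outside the parsed text.
    bad_rows = pd[out_of_range, ]
  )
}

res_x     <- check_parse_data("function(x){}")
res_empty <- check_parse_data("function(){}")
```

If res_x$open_braces is greater than 1, or res_empty$bad_rows has any rows, the parse data shows the problems reported here.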

Worse, it can crash R:

Traceback:

  1. parse(text = "function(){}")
  2. getSrcref(x)
  3. getSrcfile(x)
  4. getParseData(parse(text = "function(){}"))

> sessionInfo()

R Under development (unstable) (2012-07-18 r59904)
Platform: i686-pc-linux-gnu (32-bit)

locale:

 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

Regards,
Yihui

--
Yihui Xie <xieyihui_at_gmail.com>
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA


On Wed, Jul 18, 2012 at 2:31 PM, Duncan Murdoch
<murdoch.duncan_at_gmail.com> wrote:

> I have just committed (in r59883) some changes to the R parser based on
> Romain Francois' parser package. Packages that made use of parser will
> hopefully find that the information in base R gives them what they need to
> work with, but the data is not identical to
> what parser recorded (since it was not consistent with some things already
> in R). One reason for the change was that the parser in the parser package
> was slightly different than the one in R; the hope is that by providing the
> services in R, it will make maintenance easier for things like code
> analysis, pretty printing, etc.
>
> See ?getParseData for details, and if you are maintaining a package that
> depends on parser, feel free to ask me for help in the transition, or make
> suggestions for changes if I've done something that causes you too much
> trouble.
>
> Duncan Murdoch
>
> P.S. to Qiang Li: as mentioned privately, the goal for this change was to
> reproduce output equivalent to what parser did, so I have not incorporated
> your suggested change to outlaw expressions like "x[[1] ]" (with an
> embedded space where it shouldn't be). After things settle down we can
> consider that change and others.
>
> ______________________________________________
> R-devel_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Thu 19 Jul 2012 - 20:45:27 GMT


Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 19 Jul 2012 - 23:40:34 GMT.
