Re: [Rd] R_parseVector and syntax error [was: error messages while parsing with rniParse]

From: Romain Francois <romain.francois_at_dbmail.com>
Date: Fri, 19 Jun 2009 09:18:51 +0200

Duncan Murdoch wrote:
>
> Romain Francois wrote:
>> Duncan Murdoch wrote:
>>
>>> Simon Urbanek wrote:
>>>
>>>> On Jun 18, 2009, at 17:02 , Duncan Murdoch wrote:
>>>>
>>>>
>>>>> Romain Francois wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> [I'm redirecting this here from stats-rosuda-devel]
>>>>>>
>>>>>> When parsing R code through R_ParseVector and the code generates
>>>>>> an error (syntax error), is there a way to grab the error?
>>>>>> It looks like yyerror populates the buffer "R_ParseErrorMsg",
>>>>>> but then the variable is not part of the public API.
>>>>>>
>>>>>> Would it be possible to add yet another entry point to the
>>>>>> parser that would basically wrap R_ParseVector so that it would
>>>>>> have an extra char* argument that would bring back the error
>>>>>> message if there is an error?
>>>>>>
>>>>>>
>>>>>>
>>>>> I would oppose that. Suggest ways to reduce the complexity of
>>>>> the parser interface and I'd be interested. It's a nightmare to
>>>>> make any changes there.
>>>>>
>>>>> You can always call the R function wrapped in try(), so it's not
>>>>> as though this would give you anything that you don't already
>>>>> have access to.
>>>>>
>>>> I'm not quite following - we're talking about R_ParseVector in C
>>>> code so the point is that the C code gets access to the error
>>>> message so it can relay it to the user.
>>> I understood that. But the C code can get the error message by
>>> evaluating an R expression and looking at the result.
>>>
>>>
>>>> There are no R-level functions involved here. The issue here for
>>>> the moment is that this information is retrievable at R level but
>>>> not (officially) at the C level.
>>> I wouldn't mind exposing the underlying information in a clean way,
>>> but the string in R_ParseVector isn't all a front end should get.
>>>
>>
>> Great. Let's do that.
>> Is a function that simply returns some of the static variables used
>> by bison clean enough ?
>>
> It could be. I'd like a design that allows for the possibility of
> multiple syntax errors to be reported. I have parse_Rd doing that,
> though not committed yet. parse() is different because we have to be
> less tolerant of errors in R code than in Rd files. But we could
> still report multiple errors in one parse, not just stop at the first
> one.

This is an interesting problem. Just out of curiosity: how do you continue parsing after a syntax error in parse? Does it depend on the kind of syntax error? Do you use some of the recovery protocols of bison? At the moment the special "error" token only appears in the very top-level prog symbol:

prog    :    END_OF_INPUT            { return 0; }
        |    '\n'                    { return xxvalue(NULL,2,NULL); }
        |    expr_or_assign '\n'     { return xxvalue($1,3,&@1); }
        |    expr_or_assign ';'      { return xxvalue($1,4,&@1); }
        |    error                   { YYABORT; }
        ;
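
To make the question about recovery concrete, here is a minimal sketch in C of "panic mode" recovery: record the error, skip to the next statement terminator, and carry on. In bison this is roughly what a rule of the form `error '\n' { yyerrok; }` is for. Everything below is an assumption made for illustration: the toy language (sums of integers separated by newlines or ';') and all the Toy* names are hypothetical, and none of this is taken from R's gram.y.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical error record; none of these names exist in R. */
typedef struct {
    int line;          /* 1-based line of the offending character */
    int column;        /* 1-based column of the offending character */
    char found;        /* offending character, '\0' at end of input */
    char message[64];  /* what the parser expected instead */
} ToyParseError;

#define TOY_MAX_ERRORS 16

typedef struct {
    const char *src;   /* input text */
    size_t pos;        /* current offset into src */
    int line, column;  /* position of src[pos] */
    ToyParseError errors[TOY_MAX_ERRORS];
    int n_errors;
} ToyParser;

static char toy_peek(const ToyParser *p) { return p->src[p->pos]; }

static void toy_advance(ToyParser *p) {
    if (p->src[p->pos] == '\0') return;
    if (p->src[p->pos] == '\n') { p->line++; p->column = 1; }
    else p->column++;
    p->pos++;
}

static void toy_skip_blanks(ToyParser *p) {
    while (toy_peek(p) == ' ' || toy_peek(p) == '\t') toy_advance(p);
}

static int toy_at_terminator(const ToyParser *p) {
    char c = toy_peek(p);
    return c == '\n' || c == ';' || c == '\0';
}

/* Record an error, then skip to the next statement terminator. */
static void toy_recover(ToyParser *p, const char *expected) {
    if (p->n_errors < TOY_MAX_ERRORS) {
        ToyParseError *e = &p->errors[p->n_errors++];
        e->line = p->line;
        e->column = p->column;
        e->found = toy_peek(p);
        snprintf(e->message, sizeof e->message, "expected %s", expected);
    }
    while (!toy_at_terminator(p)) toy_advance(p);
}

/* number := digit+ ; returns 1 on success */
static int toy_number(ToyParser *p, long *out) {
    long v = 0;
    toy_skip_blanks(p);
    if (!isdigit((unsigned char)toy_peek(p))) return 0;
    while (isdigit((unsigned char)toy_peek(p))) {
        v = v * 10 + (toy_peek(p) - '0');
        toy_advance(p);
    }
    *out = v;
    toy_skip_blanks(p);
    return 1;
}

/* statement := number ('+' number)* ; returns 1 and sets *out on success */
static int toy_statement(ToyParser *p, long *out) {
    long sum, term;
    if (!toy_number(p, &sum)) { toy_recover(p, "a number"); return 0; }
    while (toy_peek(p) == '+') {
        toy_advance(p);
        if (!toy_number(p, &term)) {
            toy_recover(p, "a number after '+'");
            return 0;
        }
        sum += term;
    }
    if (!toy_at_terminator(p)) {
        toy_recover(p, "'+' or end of statement");
        return 0;
    }
    *out = sum;
    return 1;
}

/* Parse every statement in src; returns the number of valid statements.
   Errors are collected in p->errors instead of aborting the parse. */
static int toy_parse(ToyParser *p, const char *src) {
    int ok = 0;
    assert(p != NULL && src != NULL);
    p->src = src; p->pos = 0; p->line = 1; p->column = 1; p->n_errors = 0;
    for (;;) {
        long value;
        toy_skip_blanks(p);
        if (toy_peek(p) == '\0') break;
        if (toy_at_terminator(p)) { toy_advance(p); continue; }
        if (toy_statement(p, &value)) ok++;
    }
    return ok;
}
```

Calling toy_parse(&p, "1+2\n3+\n4 5;6+7\n") on a ToyParser p accepts two statements and leaves two entries in p.errors, one for each bad statement. Whether this kind of resynchronisation is safe for R code is exactly what I am asking about.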

Anyway, what about using the extra information to build a structured error message, for example as a custom condition class?
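
As a rough illustration of what such a structured error could carry on the C side, here is a minimal sketch. The ParseCondition struct, its fields and both helper functions are hypothetical names invented for this example; they are not part of the R API, and the message layout is only one possible choice.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical structured parse error; illustrative only, not R's API. */
typedef struct {
    const char *condition_class;  /* e.g. "parseError" */
    const char *filename;         /* source name, or "<text>" */
    int line;                     /* 1-based line of the bad token */
    int column;                   /* 1-based column of the bad token */
    const char *token;            /* text of the unexpected token */
    const char *token_type;       /* classification of the token */
} ParseCondition;

/* Format the fields into buf. Returns the number of characters that
   would have been written, as snprintf does. */
static int parse_condition_message(const ParseCondition *c,
                                   char *buf, size_t size) {
    assert(c != NULL && buf != NULL);
    return snprintf(buf, size, "%s:%d:%d: unexpected %s '%s'",
                    c->filename, c->line, c->column,
                    c->token_type, c->token);
}

/* Class test, so that a handler can react to parse errors only. */
static int parse_condition_inherits(const ParseCondition *c,
                                    const char *cls) {
    assert(c != NULL && cls != NULL);
    return strcmp(c->condition_class, cls) == 0;
}
```

For the "rnorm( 10 ))" example from the start of the thread, a front end could fill in line 1, column 12 and the token ")" and then either display the formatted message or inspect the individual fields, for instance to highlight the offending token in an editor.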

>
> Duncan Murdoch
>
>>> At the time of an R_ParseVector syntax error, the parser knows what
>>> token it couldn't handle, and it knows its classification, and the
>>> location in the file where it came from. Not all of that makes it
>>> through to the error message.
>>>
>>>> As for reducing complexity - technically, there is no complexity
>>>> added since all this is already in place ... [adding extra char *
>>>> argument to ParseVector may not be the best way but that's not
>>>> what I'm arguing for].
>>> It was what I was arguing against.
>>>
>>> Duncan Murdoch
>>>
>>>
>>>> Or am I missing something?
>>>> Cheers,
>>>> S
>>>>
>>>>
>>>>
>>>>
>>>>>> Romain
>>>>>>
>>>>>> Simon Urbanek wrote:
>>>>>>
>>>>>>
>>>>>>> On Jun 15, 2009, at 12:05 , Romain Francois wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> In JRI, is there a way to get the error message that is
>>>>>>>> generated by the parser through rniParse?
>>>>>>>> For example, if I have this :
>>>>>>>>
>>>>>>>> long y = re.rniParse( "rnorm( 10 ))", 1 ) ;
>>>>>>>>
>>>>>>>> this obviously generates a parse error, so y will be the same as
>>>>>>>> (R_NilValue) :
>>>>>>>>
>>>>>>>> long null_id = re.rniEval( re.rniParse( "NULL", 1 ), 0 ) ;
>>>>>>>>
>>>>>>>> I guess the underlying question is: "Is R_ParseErrorMsg
>>>>>>>> exposed to JRI?"
>>>>>>>>
>>>>>>>>
>>>>>>> AFAICT R_ParseErrorMsg and friends are not exposed by the R API
>>>>>>> - they are not accessible outside, so they cannot be used by
>>>>>>> JRI. It would be nice if there was a way of accessing that
>>>>>>> info, but R doesn't currently support that.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Simon
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> The reason is I would like to bring back the message as part of an
>>>>>>>> exception generated when the code does not parse.
>>>>>>>>
>>>>>>>> Romain
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>> ______________________________________________
>>>>> R-devel_at_r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>>
>>>>>
>>>>>
>>>
>>>
>>
>>
>>
>
>
>

-- 
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Fri 19 Jun 2009 - 07:26:06 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 19 Jun 2009 - 14:31:23 GMT.
