Re: [Rd] PROTECT and OCaml GC.

From: Guillaume Yziquel <guillaume.yziquel_at_citycable.ch>
Date: Mon, 30 Nov 2009 18:08:31 +0100

Simon Urbanek wrote:
>
> You're talking about two entirely different things -- bypassing the API
> is a very bad idea, but it has nothing to do with your last paragraph.

It's very good to hear that these are two different things. This had been quite unclear to me.

> The API gives you access to all user-visible aspects of R which is all you
> really need for any embedding -- that includes closure body, evaluation
> etc. I see no reason why you should ever go lower than the API

Because, for example, I have been unable to find out exactly what applyClosure or eval require of the structure of their LANGSXP argument.

> since
> that is unreliable and unsupported and thus you won't get any help with
> that (that is IMHO the main reason why you get no responses here - and I
> wouldn't expect any). All other functions are hidden on purpose since
> they cover internal aspects that should not be relied upon.

Please let me be clear on my intentions:

-1- I intend to use only the API if possible.

-2- If not possible, I will perhaps use #define USE_RINTERNALS, which as I understand is not part of the API.

-3- The libR.so with exposed symbols is intended only as a replacement for GDB during development. Unfortunately, as things are not going as easily as they could, I am, for gdb-like purposes, progressively writing a new eval / applyClosure duo in OCaml.

Option -3- will not appear in the interface I release.

In order to discriminate between option -1- and options -1- + -2-, could you please answer the following question, which I hope falls in the scope of legitimate questions on this mailing list:

Suppose I have an OCaml (or pure C if you wish) linked list of OCaml values wrapping SEXP values. Is it possible, using only the API, to create a LANGSXP / LISTSXP list out of these SEXPs?

I guess this is the crucial point where I hit the limits of the API. Please confirm or refute.
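
To make the shape of the question concrete, here is a minimal sketch in plain OCaml. It uses a toy `sexp` variant as a stand-in for R's SEXP (an assumption for illustration only; it calls no R function), and the names `pairlist_of_list` and `call_of_list` are hypothetical. It folds an OCaml list into a chain of LISTSXP-like cells and puts a LANGSXP-like cell at the head. On the C side, Rinternals.h declares Rf_cons and Rf_lcons, which build cells of this shape; the sketch does not use them and does not deal with PROTECT.

```ocaml
(* Toy stand-in for R's SEXP. The real type is opaque and lives in C. *)
type sexp =
  | Nil                    (* models R_NilValue *)
  | Sym of string          (* models a SYMSXP *)
  | Str of string          (* models a length-1 STRSXP *)
  | Pair of sexp * sexp    (* models a LISTSXP cell: (CAR, CDR) *)
  | Lang of sexp * sexp    (* models a LANGSXP cell: (CAR, CDR) *)

(* Chain the elements into a pairlist, building from the right. *)
let pairlist_of_list (args : sexp list) : sexp =
  List.fold_right (fun car cdr -> Pair (car, cdr)) args Nil

(* A call is one LANGSXP-like head cell whose CDR is the argument pairlist. *)
let call_of_list (f : sexp) (args : sexp list) : sexp =
  Lang (f, pairlist_of_list args)

let example = call_of_list (Sym "op") [Str "hello"; Str "world"]
```

Only the head cell is tagged as a call; the remaining cells are ordinary pairlist cells, which is why the question mentions both LANGSXP and LISTSXP.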

> So again, I just think you're operating on the wrong level here -- and
> this has nothing to do with the fact that you're binding to a functional
> language since the mechanisms are the same regardless of the languages
> (that's why Omegahat was used to bind into any random language that
> seemed useful).

Will look into Omegahat. Not yet very familiar with R userland.

> You get more headaches since you have to decide how to
> handle closures both ways, but I suspect the practical solution is to
> use evaluators on the side where the function is defined (especially for
> the R side since it includes non-S-language code so you simply cannot
> map it).

Ok. So suppose I have wrapped an anonymous R closure, called op.

This closure takes one argument, a string, and yields an integer.

I therefore need to write a function "eval_this_op" whose type would be:

eval_this_op : (string -> int) R.t -> string R.t -> int R.t

Essentially, eval_this_op takes two arguments, a wrapped anonymous R closure and an R string, and yields an R integer.

How could you write such an eval_this_op function without first solving the crucial issue in the above paragraph, which is basically constructing a LANGSXP out of an anonymous closure and an R string?
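
As an illustration of the signature only, here is a minimal, self-contained sketch. The module `R`, the type `raw`, and the helpers `wrap_closure`, `of_string` and `to_int` are hypothetical names introduced for the example; only `eval_this_op` and its type come from the text above. The closure is stubbed by an ordinary OCaml function, so nothing here builds a real LANGSXP or calls R. That missing step is exactly the open question.

```ocaml
module R : sig
  type 'a t  (* abstract: 'a records, at compile time, what the R value is *)
  val of_string : string -> string t
  val wrap_closure : (string -> int) -> (string -> int) t  (* stub for an R closure *)
  val eval_this_op : (string -> int) t -> string t -> int t
  val to_int : int t -> int
end = struct
  (* Untyped representation, standing in for SEXP. *)
  type raw = Str of string | Int of int | Clo of (string -> int)
  type 'a t = raw

  let of_string s = Str s
  let wrap_closure f = Clo f

  (* In a real binding, this is where the call would be built and evaluated. *)
  let eval_this_op op arg =
    match op, arg with
    | Clo f, Str s -> Int (f s)
    | _ -> assert false  (* unreachable through the signature above *)

  let to_int = function Int n -> n | _ -> assert false
end

let op = R.wrap_closure String.length
let result = R.to_int (R.eval_this_op op (R.of_string "hello"))
```

Because `'a R.t` is abstract, client code cannot pass an `int R.t` where a `string R.t` is expected; the runtime checks inside the module are never reached from well-typed callers.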

> If you have suggestions for extending the API, feel free to post them
> with exact explanations how in general those extensions could be useful
> (general is the key word here - I think so far it was rather to hack
> around your way of implementing it). [And FWIW tryEval *is* part of the
> API].

Please take into account that OCaml's type system is extremely strong. "My way of implementing it", as you call it, is essentially the most natural way to fit in the OCaml paradigm. I must satisfy both OCaml and R paradigms in order to write a correct binding.

Please note that this is not an embedding in a random application. It aims to be a full-blown, general-purpose binding.

In OCaml, values are immutable. Really, really, really immutable. Or they are signals, immutable abstractions describing a value that changes over time. Symbols, variables and such are not welcome. References (~pointers) are statically typed and *cannot* be cast to another type. The type checking is so strong that you should almost never have to throw an exception.

This means avoiding dynamic type-checking wherever it can be avoided. A function that takes a sexp and yields the underlying function should not have to raise an exception if the sexp is not a function, so it should not have to typecheck the sexp at runtime. You therefore have to enhance the type system to *statically* declare (or infer) that this sexp is a LANGSXP, which means using a polymorphic type system (somewhat like C++ templated types) to say "lang sxp", "list sxp", "sym sxp", etc. You get the idea?
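
A minimal sketch of that idea, using phantom types. Everything here is hypothetical and for illustration: the module `Sexp`, the tags `lang`, `pairlist` and `sym`, the type `raw`, and the functions are not part of R or of any existing binding, and `raw` is a toy stand-in for SEXP. The tag in `'a sxp` exists only at compile time.

```ocaml
module Sexp : sig
  type lang      (* tag for LANGSXP *)
  type pairlist  (* tag for LISTSXP, named pairlist to avoid shadowing OCaml's list *)
  type sym       (* tag for SYMSXP *)
  type 'a sxp    (* abstract: the tag is checked statically, never at runtime *)

  val nil : pairlist sxp
  val sym : string -> sym sxp
  val cons : 'a sxp -> pairlist sxp -> pairlist sxp
  val lcons : sym sxp -> pairlist sxp -> lang sxp
  val head : lang sxp -> sym sxp
  val name : sym sxp -> string
end = struct
  type lang
  type pairlist
  type sym
  (* Untyped representation, standing in for SEXP. *)
  type raw = Nil | Sym of string | Cell of raw * raw | Call of raw * raw
  type 'a sxp = raw

  let nil = Nil
  let sym s = Sym s
  let cons car cdr = Cell (car, cdr)
  let lcons f args = Call (f, args)
  (* The fallback cases cannot be reached through the signature above. *)
  let head = function Call (f, _) -> f | _ -> assert false
  let name = function Sym s -> s | _ -> assert false
end

let call = Sexp.lcons (Sexp.sym "op") (Sexp.cons (Sexp.sym "x") Sexp.nil)
let head_name = Sexp.name (Sexp.head call)
(* Sexp.head (Sexp.sym "op") is rejected by the compiler:
   a sym sxp is not a lang sxp. *)
```

The point of the design is that `head` needs no exception in its interface: a caller can only obtain a `lang sxp` from `lcons`, so the check has already happened in the type checker.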

This is not "my way". It's the OCaml way: they like to statically type-check *everything*, including HTML. Please have a look at the section "Static typing of XHTML with XHTML.M" of

        http://ocsigen.org/eliom/manual/1.2.0/1#p1baseprinciples

Do you know why the Swig module for OCaml is virtually unused? Because the OCaml community does not consider it type-safe enough. And much the same would go for Haskell.

The "general" aspect of my request therefore concerns bindings to languages with 'inferred polymorphic static typing'. Please understand what these languages are about before dismissing my remarks as "my way". You may not care; you wouldn't be the first.

From Wikipedia: http://en.wikipedia.org/wiki/Objective_Caml

> OCaml's static type system eliminates a large class of programmer errors that may cause problems at runtime. However, it also forces the programmer to conform to the constraints of the type system, which can require careful thought and close attention. A type-inferring compiler greatly reduces the need for manual type annotations (for example, the data type of variables and the signature of functions usually do not need to be explicitly declared, as they do in Java). Nonetheless, effective use of OCaml's type system can require some sophistication on the part of the programmer.

Please understand that I take no joy and no fun in being a pain.

If you force me to write a binding that is not type-safe, it will go unused. This is simply not acceptable to me: I am unfortunately not willing to waste my time, and I would then eventually have to bypass the API. Please help me avoid that as much as is possible within these constraints.

-- 
      Guillaume Yziquel
http://yziquel.homelinux.org/

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Mon 30 Nov 2009 - 17:30:53 GMT
