Re: [Rd] R strings, null-terminated or size delimited?

From: Duncan Murdoch <murdoch_at_stats.uwo.ca>
Date: Sat, 21 Nov 2009 19:25:00 -0500

On 21/11/2009 6:31 PM, Guillaume Yziquel wrote:
> Simon Urbanek wrote:

>> On Nov 21, 2009, at 4:12 PM, Guillaume Yziquel wrote:
>>
>>> Hello.
>>>
>>> I've been looking at vecsexps for my binding.
>>>
>>> Concerning strings, I'm wondering: are they supposed to be 
>>> null-delimited?
>> Yes, they are null-delimited when you create/access them.

>
> OK. Fair enough. But is it guaranteed that null-delimitation ends where the
> vecsxp field of the * VECSEXP tells where the R vector should end? Let
> me rephrase that:
>
> -1- Should I consider it a bug if the two pieces of information differ?
>
> -2- What's the "safest" way out of the two?
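
The two questions above can be illustrated with a minimal sketch in C. It assumes a toy record that keeps a length in its header and also appends a NUL after the payload; `toy_string`, `toy_string_new` and `toy_string_consistent` are hypothetical names, and the layout is not R's actual VECTOR_SEXPREC. In this model the header length and the first NUL can disagree only when the payload contains an embedded NUL, so the check is a plain comparison with `strlen`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string record that stores a length in its
   header and a terminating NUL after the payload. NOT R's real layout. */
typedef struct {
    size_t length;   /* length recorded in the header */
    char   data[];   /* payload bytes, followed by one NUL byte */
} toy_string;

/* Build a record from n bytes of src; a NUL is always appended. */
toy_string *toy_string_new(const char *src, size_t n)
{
    toy_string *s = malloc(sizeof *s + n + 1);
    assert(s != NULL);
    s->length = n;
    memcpy(s->data, src, n);
    s->data[n] = '\0';
    return s;
}

/* Returns 1 when the first NUL sits exactly at the recorded length,
   0 when the payload contains an earlier (embedded) NUL. */
int toy_string_consistent(const toy_string *s)
{
    return strlen(s->data) == s->length;
}
```

When the two disagree, trusting the header length never reads past the allocation in this model, whereas scanning for the NUL silently truncates the payload. With the R API itself, the corresponding accessors would be `CHAR()` for the bytes and `LENGTH()` for the length.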
>
>>> Are they delimited by the info in the SEXPHEADER macro in Rinternals.h?
>>
>> You should not be touching or reading that.

>
> I believe I should. I'd like the OCaml / R binding to be closely knit to
> R internals. One reason would be for speed, the other being that I'd
> like to make use of camlp4 to write syntax extensions to mix OCaml and R
> syntax. It's therefore important for me not to rely on the R interpreter
> to be active when building R values. Or when marshaling R values via
> OCaml. There are numerous other issues aside from this one.

You are probably not going to be able to do that. Take your example of the promise below: to evaluate a promise, you need to evaluate the expression attached to it in the R interpreter. (This is discussed in the R Language Definition.)
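
To make the point about promises concrete, here is a minimal sketch in C. It is a toy model, not R's implementation: `toy_promise`, `toy_force`, `toy_eval_int` and `toy_eval_calls` are hypothetical names, and the "expression" is just a pointer to an int. What it shows is that a promise holds an unevaluated expression plus an environment, and that obtaining its value requires handing both to an evaluator, which in R is the interpreter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical evaluator type: turns (expr, env) into a value. */
typedef int (*toy_evaluator)(const void *expr, const void *env);

/* Hypothetical model of a promise. NOT R's PROMSXP layout. */
typedef struct {
    const void *expr;    /* unevaluated expression */
    const void *env;     /* environment to evaluate it in */
    int         forced;  /* has the value been computed yet? */
    int         value;   /* cached result once forced */
} toy_promise;

/* Force the promise: evaluate on first use, reuse the cache afterwards. */
int toy_force(toy_promise *p, toy_evaluator eval)
{
    if (!p->forced) {
        assert(eval != NULL);   /* without an evaluator there is no value */
        p->value  = eval(p->expr, p->env);
        p->forced = 1;
    }
    return p->value;
}

/* Example evaluator: counts its calls and returns the int stored at expr. */
int toy_eval_calls = 0;

int toy_eval_int(const void *expr, const void *env)
{
    (void)env;                  /* unused in this toy evaluator */
    toy_eval_calls++;
    return *(const int *)expr;
}
```

Forcing the same promise twice calls the evaluator only once; the second call returns the cached value. Mapping a promise to a function in another language therefore still needs something that plays the evaluator's role at the moment of the first call.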

You can probably put together simple R objects like integer arrays without having R running, but anything substantial isn't going to be feasible.

Duncan Murdoch

>
> I'm already using #define USE_RINTERNALS in my .c file to inspect R values.
>

>>> Basically, what are the macros or functions to access the values of 
>>> the vecsexps?
>> VECTOR_ELT and SET_VECTOR_ELT (assuming that you're referring to VECSXP 
>> which are generic vectors).

>
> No. I'm referring to INTSXP for now. But I see what you mean:
>
>> #define INTEGER(x)      ((int *) DATAPTR(x))
>> #define VECTOR_ELT(x,i) ((SEXP *) DATAPTR(x))[i]

>
> VECTOR_ELT is not suitable for INTSXP arrays. I need to convert the
> INTSXP array to an OCaml list / array.
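
The conversion described above can be sketched in C under loud assumptions: the vector is modelled as a header holding the element count, immediately followed by the int payload. `toy_header`, `toy_int_vector_new`, `toy_integer` and `toy_copy_out` are hypothetical names, and `toy_integer` only mimics the shape of the quoted `INTEGER(x)` macro. It is not R's real header. The copy produces a plain C array that a foreign runtime could then wrap as its own list or array.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical vector header. NOT R's real SEXPREC header. */
typedef struct {
    int length;   /* element count kept in the header */
} toy_header;

/* Allocate header and payload as one block and copy n ints into it. */
toy_header *toy_int_vector_new(const int *src, int n)
{
    assert(n >= 0);
    toy_header *v = malloc(sizeof *v + (size_t)n * sizeof(int));
    assert(v != NULL);
    v->length = n;
    memcpy(v + 1, src, (size_t)n * sizeof(int));
    return v;
}

/* Analogue of the quoted INTEGER(x): the payload starts after the header. */
int *toy_integer(toy_header *v)
{
    return (int *)(v + 1);
}

/* Copy the payload into a fresh, independently owned C array. */
int *toy_copy_out(toy_header *v, int *n_out)
{
    int n = v->length;
    int *out = malloc((size_t)(n > 0 ? n : 1) * sizeof *out);
    assert(out != NULL);
    memcpy(out, toy_integer(v), (size_t)n * sizeof *out);
    *n_out = n;
    return out;
}
```

Copying, rather than sharing the payload pointer, keeps the result valid regardless of what later happens to the original vector, at the cost of one allocation per conversion.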
>
>>> I'm thinking of CHARSXPs and INTSXPs for the moment...
>> Those are entirely different - CHARSXP are not vectors but strings (see 
>> mkChar et al., CHAR, ...) and INTSXP are integer arrays (in C speak) 
>> accessed using INTEGER.

>
> OK. They're not vectors. They're VECTOR_SEXPRECs.
>
>> Please read R-exts - it's better than guessing.

>
> Funny, I have R-exts.pdf and R-ints.pdf opened. They're fine when it
> comes to writing R extensions. Not when writing bindings embedding R
> into OCaml so that you can beta-reduce isomorphically in R and OCaml.
>
>> Cheers,
>> Simon

>
> I'm already using heretic features in OCaml (namely Obj.magic) in order
> to do this binding. I do not mind using heretic features of the R API.
>
> I do not mean to be a pain, but I have to do what needs to be done. If I
> find on my way that #define USE_RINTERNALS is overkill, I'll gladly drop it.
>
> For instance, here's one of my issues: I've extracted the R SEXP for the
> "str" symbol. It's a promise. Now, how do I map such a SEXP to an OCaml
> function? Haven't found that in R-ints.pdf or R-exts.pdf. There's talk
> about functions, but promises are somewhat overlooked. However, such a
> mapping is crucial to me.
>
> I was not guessing when I was trying to look at the internal structure
> of R data. Simply trying to get a grip on how to execute promises, and
> therefore examining such a promise:
>
>> # R.Internal.Pretty.t_of_sexp (R.Raw.sexp_of_t (R.symbol "str"));;
>> - : R.Internal.Pretty.t =
>> PROMISE
>>  {value = SYMBOL None;
>>   expr =
>>    CALL (SYMBOL (Some ("lazyLoadDBfetch", BUILTIN)),
>>     [INT [105; 153119]; Unknown; Unknown; Unknown]);
>>   env = Unknown}

>
> Or, following structures in Rinternals.h:
>
>> # R.Internal.C.t_of_sexp (R.Raw.sexp_of_t (R.symbol "str"));;
>> - : R.Internal.C.t =
>> Val
>>  {content =
>>    PROMSXP
>>     {prom_value =
>>       Val
>>        {content =
>>          SYMSXP
>>           {pname = Val {content = NILSXP};
>>            sym_value = R.Internal.C.Recursive <lazy>;
>>            internal = Val {content = NILSXP}}};
>>      R.Internal.C.expr =
>>       Val
>>        {content =
>>          LANGSXP
>>           {carval =
>>             Val
>>              {content =
>>                SYMSXP
>>                 {pname = Val {content = CHARSXP "lazyLoadDBfetch"};
>>                  sym_value = Val {content = BUILTINSXP 687};
>>                  internal = Val {content = NILSXP}}};
>>            cdrval =
>>             Val
>>              {content =
>>                LISTSXP
>>                 {carval = Val {content = INTSXP [105; 153119]};
>>                  cdrval =
>>                   Val
>>                    {content =
>>                      LISTSXP
>>                       {carval =
>>                         Val
>>                          {content =
>>                            SYMSXP
>>                             {pname = Val {content = CHARSXP "datafile"};
>>                              sym_value =
>>                               Val
>>                                {content =
>>                                  SYMSXP
>>                                   {pname = Val {content = NILSXP};
>>                                    sym_value = R.Internal.C.Recursive <lazy>;
>>                                    internal = Val {content = NILSXP}}};
>>                              internal = Val {content = NILSXP}}};
>>                        cdrval =
>>                         Val
>>                          {content =
>>                            LISTSXP
>>                             {carval =
>>                               Val
>>                                {content =
>>                                  SYMSXP
>>                                   {pname =
>>                                     Val {content = CHARSXP "compressed"};
>>                                    sym_value =
>>                                     Val
>>                                      {content =
>>                                        SYMSXP
>>                                         {pname = Val {content = NILSXP};
>>                                          sym_value =
>>                                           R.Internal.C.Recursive <lazy>;
>>                                          internal = Val {content = NILSXP}}};
>>                                    internal = Val {content = NILSXP}}};
>>                              cdrval =
>>                               Val
>>                                {content =
>>                                  LISTSXP
>>                                   {carval =
>>                                     Val
>>                                      {content =
>>                                        SYMSXP
>>                                         {pname =
>>                                           Val {content = CHARSXP "envhook"};
>>                                          sym_value =
>>                                           Val
>>                                            {content =
>>                                              SYMSXP
>>                                               {pname = Val {content = NILSXP};
>>                                                sym_value =
>>                                                 R.Internal.C.Recursive <lazy>;
>>                                                internal =
>>                                                 Val {content = NILSXP}}};
>>                                          internal = Val {content = NILSXP}}};
>>                                    cdrval = Val {content = NILSXP};
>>                                    tagval = Val {content = NILSXP}}};
>>                              tagval = Val {content = NILSXP}}};
>>                        tagval = Val {content = NILSXP}}};
>>                  tagval = Val {content = NILSXP}}};
>>            tagval = Val {content = NILSXP}}};
>>      R.Internal.C.env = Val {content = ENVSXP}}}
>> # 

>
> For instance, an issue I'd like advice on is: what does such a symbol mean?
>
>>                            SYMSXP
>>                             {pname = Val {content = CHARSXP "datafile"};
>>                              sym_value =
>>                               Val
>>                                {content =
>>                                  SYMSXP
>>                                   {pname = Val {content = NILSXP};
>>                                    sym_value = R.Internal.C.Recursive <lazy>;
>>                                    internal = Val {content = NILSXP}}};
>>                              internal = Val {content = NILSXP}}};

>
> And how is it treated when "str" is executed?
>
> All the best.

>

R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Sun 22 Nov 2009 - 00:50:49 GMT
