Re: [Rd] wchar and wstring. (followup question)

From: James Bullard <bullard_at_berkeley.edu>
Date: Tue 30 Aug 2005 - 00:08:46 GMT

Thanks for all of the help with my previous posts. This question might expose much of my ignorance when it comes to R's memory management and the responsibilities of the programmer, but I thought I had better ask rather than continue in my ignorance.

I use the following code to create a multi-byte string in R from my wide character string in C.

int str_length;
char* cstr;

/* Worst case: MB_CUR_MAX bytes per wide character, plus the terminating NUL. */
str_length = cel.GetAlg().size() * MB_CUR_MAX + 1;
cstr = Calloc(str_length, char);
wcstombs(cstr, cel.GetAlg().c_str(), str_length);
SET_STRING_ELT(names, i, mkChar("Algorithm"));
SET_VECTOR_ELT(vals, i++, mkString(cstr));
Free(cstr);
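The conversion step on its own can be sketched in plain C, outside of R. This is only an illustration under stated assumptions: `wide_to_mb` is a hypothetical helper name, it uses the standard `calloc`/`free` rather than R's `Calloc`/`Free`, and it sizes the buffer for the worst case of `MB_CUR_MAX` bytes per wide character plus a terminator. It also checks the return value of `wcstombs`, which is `(size_t)-1` when a character cannot be represented in the current locale.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not part of R): convert a wide string to a newly
 * allocated multibyte string in the current locale. Returns NULL on
 * failure; the caller owns the result and must free() it. */
static char *wide_to_mb(const wchar_t *src)
{
    assert(src != NULL);

    /* Worst case: MB_CUR_MAX bytes per wide character, plus the NUL. */
    size_t cap = wcslen(src) * MB_CUR_MAX + 1;
    char *dst = calloc(cap, 1);
    if (dst == NULL)
        return NULL;

    size_t n = wcstombs(dst, src, cap);
    if (n == (size_t)-1) {   /* a character has no multibyte form */
        free(dst);
        return NULL;
    }
    dst[n] = '\0';           /* n is at most cap - 1, so this is in bounds */
    return dst;
}
```

A caller would convert, hand the result to whatever consumes it, and then release the buffer with `free`.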

My first question is: do I need the Free? I looked at some of the examples in main/character.c, but I could not decide whether or not I needed it. I imagined (I could not find the source for this function) that mkString made a copy so I thought I would clean up my copy, but if this is not the case then I would assume the Free would be wrong.

My second question is: It was pointed out to me that it would be more natural to use this code:

SET_STRING_ELT(vals, i++, mkChar(header.GetHeader().c_str()));

instead of:

SET_VECTOR_ELT(vals, i++, mkString(header.GetHeader().c_str()));

However, the first line creates the following list element in R:

<CHARSXP: "Percentile">

Whereas, I want it to create as the list element:

"Percentile"

Which the second example does correctly. I had previously posted about this problem and I believe that I was advised to use the second syntax, but maybe there is a different problem in my code. I am trying to construct a named list in R where my first line SET_STRING_ELT sets the name of the list element and the second sets the value where the value can be an INTEGER, STRING or whatever.

My third question is simply: why is wcrtomb preferred? The example I based my code on in main/character.c used wcstombs.
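For context, the preference stated in the quoted reply is for wcrtomb over wctomb, the single-character functions; wcstombs converts a whole string and is a separate case. wcrtomb is the restartable variant: it keeps the conversion state in an mbstate_t supplied by the caller rather than in hidden static state. A minimal sketch of a character-by-character conversion follows; `wide_to_mb_r` is a hypothetical helper name, and the caller-provided buffer and the error convention are assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: convert src into dst (capacity cap bytes) one wide
 * character at a time with wcrtomb. Returns the number of bytes written,
 * excluding the terminating NUL, or (size_t)-1 on failure. */
static size_t wide_to_mb_r(char *dst, size_t cap, const wchar_t *src)
{
    mbstate_t state;
    char buf[MB_LEN_MAX];
    size_t used = 0;
    size_t n;

    assert(dst != NULL && src != NULL && cap > 0);
    memset(&state, 0, sizeof state);    /* start in the initial state */

    for (; *src != L'\0'; src++) {
        n = wcrtomb(buf, *src, &state);
        if (n == (size_t)-1)
            return (size_t)-1;          /* unrepresentable character */
        if (used + n > cap)
            return (size_t)-1;          /* output buffer too small */
        memcpy(dst + used, buf, n);
        used += n;
    }

    /* Converting the terminator also emits any shift sequence needed to
     * return to the initial state; n counts the NUL byte itself. */
    n = wcrtomb(buf, L'\0', &state);
    if (n == (size_t)-1 || used + n > cap)
        return (size_t)-1;
    memcpy(dst + used, buf, n);
    return used + n - 1;
}
```

Because the state lives in a local variable, two conversions can run independently without interfering with each other.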

Thanks again for all of the help.

jim

Prof Brian Ripley wrote:

> On Fri, 26 Aug 2005, James Bullard wrote:
>
>> Hello all, I am writing an R interface to some C++ files which make use
>> of std::wstring classes for internationalization. Previously (when I
>> wanted to make R strings from C++ std::strings), I would do something
>> like this to construct a string in R from the results of the parse.
>>
>> SET_VECTOR_ELT(vals, i++, mkString(header.GetHeader().c_str()));
>
>
> That creates a list of one-element character vectors. It would be more
> usual to do
>
> SET_STRING_ELT(vals, i++, mkChar(header.GetHeader().c_str()));
>
>> However, now the call header.GetHeader().c_str() returns a pointer to
>> an array of wchar_t's. I was going to use wcstombs() to convert the
>> wchar_t* to char*, but I wanted to see if there was a similar
>> function in R for the mkString function which I had initially used
>> which deals with wchar_ts as opposed to chars.
>
>
> No (nor an analogue of mkChar). R uses MBCS and not wchar_t
> internally (and Unix-alike systems do externally). There is no
> wchar_t internal R type (a much-debated design decision at the time).
>
>> Also, since I have no experience with the wctombs() function I wanted
>> to ask if anyone knew if this will handle the internationalization
>> issues from within R.
>
>
> Did you mean wcstombs or wctomb (if the latter, wcrtomb is preferred)?
> There are tens of examples in the R sources for you to consult.
>
> Note that not all R platforms support wchar_t, hence this code is
> surrounded by #ifdef SUPPORT_MBCS macros (exported in Rconfig.h for
> package writers).
>



R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Tue Aug 30 10:12:59 2005
