Re: [R] How to get the length of an UTF-8 string

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Thu, 06 Nov 2008 11:43:53 +0000 (GMT)

On Thu, 6 Nov 2008, Fán Lóng wrote:

> Hi there,
>
> I want to get the length of a UTF-8 string that contains some
> Japanese characters (let's say, rstr) in R.
> I tried nchar(rstr) to get its length, but it returns NA because
> the string contains some multi-byte characters.
>
> Are there any alternatives that return the length of rstr?

Use a UTF-8 locale; then nchar() will get it right.

Or convert it to the encoding of the Japanese locale on your system (for example with iconv()), and use nchar() in that locale.
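
As a rough sketch of the first suggestion (this is not from the original reply, and it assumes a reasonably current R version in which strings marked as UTF-8 are counted correctly): the example string and the variable name raw_bytes are made up for illustration. The string is written with \u escapes so that the code does not depend on the encoding of the script file.

```r
# Which locale is R running in, and is it a UTF-8 one?
Sys.getlocale("LC_CTYPE")
l10n_info()[["UTF-8"]]

# Illustrative string: three Japanese characters, written with \u escapes.
rstr <- "\u65e5\u672c\u8a9e"

nchar(rstr, type = "chars")   # 3 characters
nchar(rstr, type = "bytes")   # 9 bytes in UTF-8

# If the bytes arrive without an encoding mark (e.g. read in a non-UTF-8
# locale), declaring them as UTF-8 lets nchar() count characters.
raw_bytes <- as.raw(c(0xe6, 0x97, 0xa5, 0xe6, 0x9c, 0xac, 0xe8, 0xaa, 0x9e))
x <- rawToChar(raw_bytes)
Encoding(x) <- "UTF-8"
nchar(x, type = "chars")      # 3 characters
```

Whether the session locale can be switched with Sys.setlocale() depends on the platform and on which locales are installed, so that step is left out here.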

>
> Any suggestion is appreciated.
>
> Long
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Please do: we are missing the 'at a minimum' information needed to answer this question. Locales do matter.

-- 
Brian D. Ripley,                  ripley_at_stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Received on Thu 06 Nov 2008 - 11:46:29 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.