Re: [R] RStem with portuguese language

From: Duncan Temple Lang <duncan_at_wald.ucdavis.edu>
Date: Mon, 28 Jul 2008 11:21:22 -0700

Hi Paulo.

 My development version has that warning turned off. However, the Rstem package predates the encoding in R, AFAIR. So when I call wordStem() with a string which has an Encoding() of UTF-8, the resulting string has Encoding() "unknown".
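
For illustration, here is a minimal sketch in base R of what a lost encoding mark looks like and how it can be restored by hand. It does not call wordStem(); the rawToChar(charToRaw()) round trip is only a stand-in for a string coming back from C code without a mark, which is an assumption about what happens inside Rstem. The names x, y and y_marked are made up for the example.

```r
# Base R only: wordStem() is deliberately not called here.
x <- "aberra\u00e7\u00f5es"          # "aberrações", written with \u escapes
Encoding(x)                           # the \u escapes mark the string as UTF-8

# Stand-in for a string handed back by C code without an encoding mark
# (an assumption about what happens inside Rstem).
y <- rawToChar(charToRaw(x))
Encoding(y)                           # no mark is attached to this copy

# Re-declare the bytes as UTF-8; this changes the mark, not the bytes.
y_marked <- y
Encoding(y_marked) <- "UTF-8"
Encoding(y_marked)
nchar(y_marked, type = "chars")       # counts characters
nchar(y_marked, type = "bytes")       # counts bytes
```

Re-marking only helps when the returned bytes are still valid UTF-8. It would not repair a result such as "aberraç\xc3", where the stem appears to have been cut in the middle of a multibyte character (\xc3 is the first byte of the two-byte UTF-8 sequence for "ã").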

I'll take a look and see if I can add support for it. I am traveling at present, so I am not certain precisely when.

Thanks,
  D.

Paulo Cortez wrote:
> Greetings,
>
> I have R 2.7.1 in MacOs and I believe UTF encoding is already installed.
> At least:
>
> > Sys.getenv()
>
> shows several variables, including:
> LANG "pt_PT.UTF-8"
>
> I installed the Rstem and tm packages and when I try the following code:
>
> > wordStem(c("aberração","aberrações"), language="portuguese")
> [1] "aberraç\xc3" "aberraçõ"
> Warning message:
> In wordStem(c("aberração", "aberrações"), language = "portuguese") :
> Currently, only 'english' is tested. You will need support for UTF
> characters.
>
> So my question is: am I using Rstem wrong, or do I not really have UTF
> support? What do I need to do?
>
> Best regards,
> --
> Paulo Alexandre Ribeiro Cortez (PhD, MSc)
> Lecturer (Prof. Auxiliar) at the Department of Information Systems (DSI)
> University of Minho, Campus de Azurém, 4800-058 Guimaraes, Portugal
> http://www.dsi.uminho.pt/~pcortez +351253510313 Fax:+351253510300
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
"There are men who can think no deeper than a fact" - Voltaire


Duncan Temple Lang                duncan_at_wald.ucdavis.edu
Department of Statistics          work:  (530) 752-4782
4210 Mathematical Sciences Bldg.  fax:   (530) 752-7099
One Shields Ave.
University of California at Davis
Davis, CA 95616, USA





Received on Mon 28 Jul 2008 - 18:26:53 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 28 Jul 2008 - 18:32:43 GMT.
