[R] RStem with portuguese language

From: Paulo Cortez <pcortez_at_dsi.uminho.pt>
Date: Mon, 28 Jul 2008 16:59:36 +0100


I have R 2.7.1 on Mac OS and I believe UTF-8 support is already installed. At least:

> Sys.getenv()

shows several variables, including:
  LANG "pt_PT.UTF-8"
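
The LANG variable alone does not show how the running R session handles strings. As a minimal sketch, the locale and string encoding can be checked from within R using only base functions. The variable names `words` and `utf8_locale` are illustrative and not part of Rstem or tm:

```r
# Locale that R itself is using for character handling
Sys.getlocale("LC_CTYPE")

# TRUE when the session runs in a UTF-8 locale
utf8_locale <- l10n_info()[["UTF-8"]]
print(utf8_locale)

# Illustrative input: the two words passed to wordStem()
words <- c("aberração", "aberrações")

# How R has marked each string ("UTF-8", "latin1" or "unknown")
print(Encoding(words))

# In a working UTF-8 session the byte count exceeds the character
# count for words that contain accented letters
print(nchar(words, type = "chars"))
print(nchar(words, type = "bytes"))
```

If the character and byte counts differ, R is reading the accented letters as multibyte characters, and the truncated stems more likely come from the stemmer than from the locale setup.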

I installed the Rstem and tm packages, and when I run the following code I get:

> wordStem(c("aberração","aberrações"), language="portuguese")
[1] "aberraç\xc3" "aberraçõ"
Warning message:
In wordStem(c("aberração", "aberrações"), language = "portuguese") :

   Currently, only 'english' is tested. You will need support for UTF characters.

So my question is: am I using Rstem incorrectly, or do I not really have UTF-8 support? What do I need to do?

Best regards,

Paulo Alexandre Ribeiro Cortez  (PhD, MSc)
Lecturer (Prof. Auxiliar) at the Department of Information Systems (DSI)
University of Minho, Campus de Azurém, 4800-058 Guimaraes, Portugal
http://www.dsi.uminho.pt/~pcortez +351253510313 Fax:+351253510300

Received on Mon 28 Jul 2008 - 17:05:02 GMT
