Re: [R] unicode&pdf font problem RESOLVED

From: Ben Madin <lists_at_remoteinformation.com.au>
Date: Tue, 01 Mar 2011 21:50:29 +0800

Just to add to this thread (I've been looking through the archive) on the problem of displaying Unicode characters in PDF documents produced by R:

If you can use the Cairo package to create the PDF on a Mac, it seems quite happy to push Unicode characters through (whether a given character actually displays probably still depends on the font family).

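	library(Cairo)  # provides the Cairo() device
	# mixed test labels: less-than-or-equal, less-than-but-not-equal, u-umlaut
	probstring <- c(' \u2264 0.2', ' \u2268 0.4', ' \u00FC 0.6', ' \u2264 0.8', ' \u2264 1.0')
	# the 'outputs' directory must already exist
	Cairo(type='pdf', file='outputs/demo.pdf', width=9, height=12, units='in', bg='transparent')
	plot(1:5, 1:5, type='n')
	text(1:5, 1:5, probstring)
	dev.off()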

?Cairo suggests encoding is ignored if you do try to set it.
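
For what it's worth, the 'mbcsToSbcs' conversion failures quoted further down look like what you would expect when a label contains a character that the pdf() device's single-byte encoding cannot represent. Here is a minimal sketch for checking a label before opening pdf(). It assumes that iconv() on your platform returns NA for a string it cannot convert, and can_encode() is a hypothetical helper name of mine, not part of R or the Cairo package:

```r
# Sketch only: can_encode() is a made-up helper, not an R or Cairo function.
# It asks iconv() whether a UTF-8 string survives conversion to a
# single-byte encoding such as the ones passed to pdf(encoding = ...).
can_encode <- function(x, encoding) {
  # iconv() returns NA for elements it cannot convert (sub = NA is the default)
  !is.na(iconv(enc2utf8(x), from = "UTF-8", to = encoding))
}

label <- "print \u0171"  # u with double acute, the character from this thread

in_cp1250 <- can_encode(label, "CP1250")      # Central European code page
in_latin1 <- can_encode(label, "ISO-8859-1")  # Latin-1 has no U+0171
print(c(CP1250 = in_cp1250, Latin1 = in_latin1))
```

If some iconv() implementation substitutes a character instead of failing, this check will be too optimistic, so treat it as a first filter rather than a guarantee that the glyph will appear in the PDF.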

cheers

Ben

On 14/01/2011, at 7:00 PM, r-help-request_at_r-project.org wrote:

> Date: Thu, 13 Jan 2011 10:47:09 -0500
> From: David Winsemius <dwinsemius_at_comcast.net>
> To: Sascha Vieweg <saschaview_at_gmail.com>
> Cc: r-help_at_r-project.org
> Subject: Re: [R] unicode&pdf font problem RESOLVED
> 
> 
> On Jan 13, 2011, at 10:41 AM, Sascha Vieweg wrote:
> 
>> I have many German umlauts in my data sets and encode them as UTF-8.
>> When it comes to plotting to pdf, I figured out that "CP1257" is a good
>> choice for outputting umlauts. I have no experience with "CP1250", but
>> maybe this small hint helps:
>> 
>> pdf(file = paste(sharepath, "/filename.pdf", sep = ""), 9, 6,
>>     pointsize = 11, family = "Helvetica", encoding = "CP1257")
> 
> Just an FYI for the archives, that encoding fails with  
> pdf(encoding="CP1257") on a Mac when printing that target umlaut.
> 
> David.
>> 
>> *S*
>> 
>> On 11-01-13 16:17, tdenes_at_cogpsyphy.hu wrote:
>> 
>>> Date: Thu, 13 Jan 2011 16:17:04 +0100 (CET)
>>> From: tdenes_at_cogpsyphy.hu
>>> To: David Winsemius <dwinsemius_at_comcast.net>
>>> Cc: r-help_at_r-project.org
>>> Subject: Re: [R] unicode&pdf font problem RESOLVED
>>> 
>>> Dear David,
>>> 
>>> Thank you for your efforts. Inspired by your remarks, I started a new
>>> google-search and found this:
>>> http://stackoverflow.com/questions/3434349/sweave-not-printing-localized-characters
>>> 
>>> SO HERE COMES THE SOLUTION (it works on both OSs):
>>> 
>>> pdf.options(encoding = "CP1250")
>>> pdf()
>>> plot(1,type="n")
>>> text(1,1,"\U0171")
>>> dev.off()
>>> 
>>> CP1250 should work for all Central-European languages:
>>> http://en.wikipedia.org/wiki/Windows-1250
>>> 
>>> 
>>> Thank you again,
>>> Denes
>>> 
>>> 
>>> 
>>>> 
>>>> On Jan 13, 2011, at 7:01 AM, tdenes_at_cogpsyphy.hu wrote:
>>>> 

>>>>>
>>>>> Hi!
>>>>>
>>>>> Sorry for the missing specs, here they are:
>>>>> > version
>>>>>                _
>>>>> platform       i386-pc-mingw32
>>>>> arch           i386
>>>>> os             mingw32
>>>>> system         i386, mingw32
>>>>> status
>>>>> major          2
>>>>> minor          12.1
>>>>> year           2010
>>>>> month          12
>>>>> day            16
>>>>> svn rev        53855
>>>>> language       R
>>>>> version.string R version 2.12.1 (2010-12-16)
>>>>>
>>>>> OS: Windows 7 (English version, 32 bit)
>>>>>
>>>>>
>>>> 
>>>> You are after what Adobe calls udblacute (0171). It is recognized in
>>>> the list of Adobe glyphs:
>>>>
>>>> > str(tools::Adobe_glyphs[371, ])
>>>> 'data.frame': 1 obs. of 2 variables:
>>>>  $ adobe  : chr "udblacute"
>>>>  $ unicode: chr "0171"
>>>>
>>>> Consulted the help pages:
>>>> points {graphics}
>>>> postscript {grDevices}
>>>> pdf {grDevices}
>>>> charsets {tools}
>>>> postscriptFonts {grDevices}
>>>>
>>>> I have tried a variety of the pdfFonts installed on my Mac without
>>>> success. You can perhaps make a list of fonts on your machines with
>>>> names(pdfFonts()). Perhaps the range of fonts and the glyphs they
>>>> contain is different on your machines. I consistently get warning
>>>> messages saying there is a conversion failure:
>>>>
>>>> > pdf("trial.pdf", family="Helvetica")
>>>> # also tried with font="Helvetica" but I think that is erroneous
>>>> > plot(1,type="n")
>>>> > text(1,1,"print \U0170\U0171")
>>>> Warning messages:
>>>> 1: In text.default(1, 1, "print Űű") :
>>>>  conversion failure on 'print Űű' in 'mbcsToSbcs': dot substituted for <c5>
>>>> 2: In text.default(1, 1, "print Űű") :
>>>>  conversion failure on 'print Űű' in 'mbcsToSbcs': dot substituted for <b0>
>>>> 3: In text.default(1, 1, "print Űű") :
>>>>  conversion failure on 'print Űű' in 'mbcsToSbcs': dot substituted for <c5>
>>>> 4: In text.default(1, 1, "print Űű") :
>>>>  conversion failure on 'print Űű' in 'mbcsToSbcs': dot substituted for <b1>
>>>> 5: In text.default(1, 1, "print Űű") :
>>>>  font metrics unknown for Unicode character U+0170
>>>> 6: In text.default(1, 1, "print Űű") :
>>>>  font metrics unknown for Unicode character U+0171
>>>> 7: In text.default(1, 1, "print Űű") :
>>>>  conversion failure on 'print Űű' in 'mbcsToSbcs': dot substituted for <c5>
>>>> 8: In text.default(1, 1, "print Űű") :
>>>>  conversion failure on 'print Űű' in 'mbcsToSbcs': dot substituted for <b0>
>>>> 9: In text.default(1, 1, "print Űű") :
>>>>  conversion failure on 'print Űű' in 'mbcsToSbcs': dot substituted for <c5>
>>>> 10: In text.default(1, 1, "print Űű") :
>>>>  conversion failure on 'print Űű' in 'mbcsToSbcs': dot substituted for <b1>
>>>> 
>>>> And this is despite my system saying that \U0170 and \U0171 are present
>>>> in the Helvetica font. I also tried family=URWHelvetica,
>>>> family=NimbusSan and a bunch of others without success; my last
>>>> best hope, after reading the "Families" section of help(postscript),
>>>> had been NimbusSan. That page also has information on encodings that
>>>> appears to be very machine specific.
>>>> 

>>>>>
>>>>> Note that \U0171 != ??. See
>>>>> http://www.fileformat.info/info/unicode/char/171/index.htm
>>>>> Anyway, I have no problem with ű (~u") and other special Hungarian
>>>>> characters in my R-Gui. It is correctly displayed in the console, in
>>>>> plots, etc. The problem is with the pdf conversion.
>>>>>
>>>>> The same holds for my Ubuntu Hardy Heron system*, with exactly the same
>>>>> error messages as reported in an earlier thread
>>>>> http://www.mail-archive.com/r-help@r-project.org/msg89792.html
>>>>> As far as I know, Hershey fonts do not contain \U0171.
>>>>>
>>>>>
>>>>> Regards,
>>>>> Denes
>>>>>
>>>>> * The specs of Ubuntu:
>>>>> > version
>>>>>                _
>>>>> platform       x86_64-pc-linux-gnu
>>>>> arch           x86_64
>>>>> os             linux-gnu
>>>>> system         x86_64, linux-gnu
>>>>> status
>>>>> major          2
>>>>> minor          12.0
>>>>> year           2010
>>>>> month          10
>>>>> day            15
>>>>> svn rev        53317
>>>>> language       R
>>>>> version.string R version 2.12.0 (2010-10-15)
>>>>>
>>>>>
>>>>>> 
>>>>>> On Jan 12, 2011, at 11:11 PM, tdenes_at_cogpsyphy.hu wrote:
>>>>>> 
>>>>>>> 
>>>>>>> Dear List,
>>>>>>> 
>>>>>>> I would like to print a plot into pdf. The problem is that the
>>>>>>> character \U0171 is replaced by a simple 'u' (i.e. without accents)
>>>>>>> in the pdf file.
>>>>>>> 
>>>>>>> Example:
>>>>>>> # this works fine
>>>>>>> plot(1,type="n")
>>>>>>> text(1,1,"print \U0171")
>>>>>>> 
>>>>>>> # this fails
>>>>>>> pdf("trial.pdf")
>>>>>>> plot(1,type="n")
>>>>>>> text(1,1,"print \U0171")
>>>>>>> dev.off()
>>>>>> 
>>>>>> Have you tried:
>>>>>> 
>>>>>> pdf("trial.pdf")
>>>>>> plot(1,type="n")
>>>>>> text(1,1,"print ??")
>>>>>> dev.off()
>>>>>> 
>>>>>> Your default screen fonts may not be the same as your default pdf
>>>>>> fonts. A lot depends on system specifics, none of which have you
>>>>>> provided.
>>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> I found an earlier post at
>>>>>>> http://www.mail-archive.com/r-help@r-project.org/msg65541.html,
>>>>>>> but it is too hard to understand at my R-level. Any help is appreciated.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> David Winsemius, MD
>>>>>> West Hartford, CT
>>>>>> 
>>>>>> 

>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>> 
>>>> David Winsemius, MD
>>>> West Hartford, CT
>>>> 
>>>> 
>>> 
>>> ______________________________________________
>>> R-help_at_r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>> 
>> 
>> -- 
>> Sascha Vieweg, saschaview_at_gmail.com
> 
> David Winsemius, MD
> West Hartford, CT

Received on Tue 01 Mar 2011 - 13:56:29 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 01 Mar 2011 - 14:30:17 GMT.
