
This queue is for tickets about the PDF-API2 CPAN distribution.

Report information
The Basics
Id: 57248
Status: open
Priority: 0
Queue: PDF-API2

People
Owner: Nobody in particular
Requestors: kuzvesov [...] list.ru
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 0.73
Fixed in: (no value)



Subject: Cyrillic letters
1. The following Cyrillic glyphs (names according to http://www.adobe.com/devnet/font/pdfs/5013.Cyrillic_Font_Spec.pdf) are not displayed when using TrueType fonts:

afii10047 (uppercase 'Э')
afii10049 (uppercase 'Я')
afii10095 (lowercase 'э')

I tried different encodings (CP1251, UTF8) with the same result.

2. When using core fonts, all Cyrillic characters are displayed overlapping each other with CP1251 encoding, and are not displayed at all with UTF8 encoding.

Perl version: v5.10.1 built for MSWin32-x86-multi-thread, Binary build 1007 [291969] provided by ActiveState
Operating system: Windows Vista Home Premium, Service Pack 1 (ver. 6.0.6001)
Subject: test-utf8.pdf
Download test-utf8.pdf
application/pdf 61.6k


Subject: test.pl
use locale;
use POSIX;
use PDF::Report;

my $encoding = 'cp1251';
POSIX::setlocale($encoding) or die 'cannot set locale';

my $pdf = new PDF::API2( );
$pdf->mediabox( 'A4' );
my $page = $pdf->page();
my $txt = $page->text;

# The Cyrillic test strings below are stored as CP1251 bytes in the original script file.
my $font = $pdf->ttfont('Times.ttf', '-encode' => $encoding );
my $fontsize = 12;
$txt->font($font,$fontsize);
$txt->translate(10,700);
$txt->text("ABCDEFGHIJKLMNOPQRSTUVWXYZ");
$txt->translate(10,650);
$txt->text("abcdefghijklmnopqrstuvwxyz");
$txt->translate(10,600);
$txt->text("абвгдеёжзийклмнопрстуфхцчшщьыъэюя");
$txt->translate(10,550);
$txt->text("АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯ");

my $font = $pdf->corefont('Times', '-encode' => $encoding );
my $fontsize = 12;
$txt->font($font,$fontsize);
$txt->translate(10,400);
$txt->text("ABCDEFGHIJKLMNOPQRSTUVWXYZ");
$txt->translate(10,350);
$txt->text("abcdefghijklmnopqrstuvwxyz");
$txt->translate(10,300);
$txt->text("абвгдеёжзийклмнопрстуфхцчшщьыъэюя");
$txt->translate(10,250);
$txt->text("АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯ");

my $font = $pdf->corefont('Times', '-encode' => $encoding );
my $fontsize = 12;
$txt->font($font,$fontsize);
$txt->translate(10,750);
$txt->text("Using true type font:");
$txt->translate(10,450);
$txt->text("Using core font:");

$pdf->saveas( 'test.pdf' );
Subject: test-cp1251.pdf
Download test-cp1251.pdf
application/pdf 62.1k


Subject: [rt.cpan.org #57248]
Date: Mon, 15 Feb 2016 16:40:51 -0500
To: bug-PDF-API2 [...] rt.cpan.org
From: Phil M Perry <philperry [...] hvc.rr.com>
I modified the example text file to display x40 through xFF for both TrueType and Core fonts. I ran it for CP1251 (Cyrillic), CP1252 (Latin 1), CP1253 (Greek), and CP1254 (Turkish). This is Windows XP SP3, PDF::API2 2.025, Adobe Reader 11.0.08. All four character sets have some variety of MS "Smart Quotes" in the x80 - x9F range. I have not yet tried UTF-8 encoded text.

In all cases, the TTF displays perfectly, even the unassigned characters in the Smart Quotes range. The three Cyrillic characters reported missing in the original bug report are present and in the right place.

All the CoreFont displays have problems with the Smart Quotes unassigned characters still displaying the empty box, but evidently having a near-zero width (so that the following character mostly overprints it).

Core Font only problems:

CP1251: All Cyrillic and possibly some other characters print correctly, but apparently have about 33% width and are overprinted by following characters.
CP1252: The unassigned characters in the Smart Quotes range get overprinted, but the rest of the Latin-1 characters look OK.
CP1253: The Greek letters behave just like the Cyrillic letters in 1251.
CP1254: The Turkish letters behave just like the Latin-1 letters in 1252.

The bottom line is that TTF looks OK from here (at least for CP125x encoding), but Core Fonts have trouble with unassigned ("box") characters and non-Latin characters, where the characters look OK, but the text location is not advanced far enough and we get overprinting. Perhaps the font data (especially character width) isn't being read correctly? Since it works for (e.g.) CP1252, it seems odd that it would fail for non-Latin sets (note that Turkish is Latin). That would imply that the font files themselves are defective or non-standard in some way.

To add a comment to this thread, just email bug-PDF-API2 [at] rt.cpan.org with subject line [rt.cpan.org #57248]. Note 1 space between org and #, and the [ ] around the whole subject. Nothing else. If you don't follow this format carefully, you will end up creating a new bug report! HTML formatting within the body does not work.
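
One way to check the character-width hypothesis raised in this message is to ask the font object for its widths directly. The following is only a minimal sketch and is not taken from the thread: avg_width is a hypothetical helper, the core font name 'Times-Roman' and the sample strings are illustrative, and it assumes that $font->width($text) returns the string width at font size 1.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: average advance width per character in 1/1000 em,
# assuming $font->width() returns the string width at font size 1.
sub avg_width {
    my ($font, $bytes) = @_;
    return 0 unless length $bytes;
    return 1000 * $font->width($bytes) / length($bytes);
}

# Only run the comparison if PDF::API2 is installed.
if (eval { require PDF::API2; 1 }) {
    my $pdf  = PDF::API2->new();
    my $font = $pdf->corefont('Times-Roman', -encode => 'cp1251');

    my $latin = 'ABCDEFGHIJ';
    # U+0410..U+0419 (first ten uppercase Cyrillic letters) as CP1251 bytes
    my $cyrillic = encode('cp1251', join('', map { chr($_) } 0x410 .. 0x419));

    printf "Latin:    %.0f units/char\n", avg_width($font, $latin);
    printf "Cyrillic: %.0f units/char\n", avg_width($font, $cyrillic);
}
```

If the Cyrillic average comes out far below the Latin one, that would be consistent with the overprinting described above.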


1: TTF does not appear to be missing any characters, including the three listed, when I tested it.

2: The characters overlap because every glyph not listed in PDF::API2::Resource::Font::CoreFont::[fontname].pm falls back to that file's "missingwidth" value of 250, which is as little as a quarter of what is needed. Only the standard Latin-1 glyphs, and their widths, are listed. Everything else is "missing". Possibly this could be fixed by extending the [fontname].pm glyph and width tables, but that will be quite a bit of work.

3: Core fonts do not support UTF-8 -- only single-byte encodings at this time. UTF-8 support for core and Type1 fonts would certainly be desirable, but I don't know if it's feasible to add it.

I think the best resolution of this is to switch to TTF (ttfont) rather than using core fonts.
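
The suggested switch to TTF could look roughly like the sketch below. It is an illustration under assumptions, not code from this thread: the helper name cp1251_to_chars and the output file name are hypothetical, 'Times.ttf' is the font file name used in the reporter's test.pl and must exist on your system, and the sketch assumes that a ttfont font accepts decoded Perl character strings passed to text().

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn CP1251 bytes into a Perl character string.
sub cp1251_to_chars {
    my ($bytes) = @_;
    return decode('cp1251', $bytes);
}

# The three glyphs from the original report (U+042D, U+042F, U+044D),
# given here as CP1251 bytes.
my $sample = cp1251_to_chars("\xDD\xDF\xFD");

# Assumption: path to a TrueType font that contains Cyrillic glyphs.
my $ttf_path = 'Times.ttf';

# Only build the PDF if PDF::API2 and the font file are available.
if (eval { require PDF::API2; 1 } && -e $ttf_path) {
    my $pdf = PDF::API2->new();
    $pdf->mediabox('A4');
    my $page = $pdf->page();
    my $txt  = $page->text();

    my $font = $pdf->ttfont($ttf_path);
    $txt->font($font, 12);
    $txt->translate(10, 700);
    $txt->text($sample);

    $pdf->saveas('test-ttf.pdf');
}
```

Decoding the input first keeps the script independent of the encoding of the source file, which is one reason to prefer character strings over raw CP1251 bytes here.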