This queue is for tickets about the PDF-API2 CPAN distribution.

Report information
The Basics
Id: 40648
Status: resolved
Priority: 0/
Queue: PDF-API2

People
Owner: Nobody in particular
Requestors: BARTL [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 0.71.001
Fixed in: 0.72



Subject: Unicode text prints text on top of text before it
Here's a demonstration of the bug: the test script (bug-demo.pl) produces a PDF file with the Polish text (part of an address) "Centrum Uslug Ksiegowych", with modifications on the "l", and on the "e" in the third word. Look at bug-demo.release.pdf for the PDF produced with the release version of PDF::API2 (0.71.001); look at bug-demo.patched.pdf for the PDF produced after my patch is applied.

    #!/usr/bin/perl -w
    use PDF::API2;
    use strict;

    gen_pdf("$0.pdf");

    sub gen_pdf {
        my($save_as) = @_;
        my $api = PDF::API2->new();
        my $uf = unifont($api, 'Times', 1);
        $api->mediabox(595,842);
        my $page = $api->page;
        my $text = $page->text;
        $text->font( $uf, 18 );
        $text->translate( 190, 400 );
        $text->paragraph("Centrum Us\x{0142}ug Ksi\x{0119}gowych", 220, 25);
        $api->saveas($save_as);
        $api->end;
    }

    sub unifont {
        my($api, $fontname, @blk) = @_;
        return $api->unifont(
            $api->corefont($fontname, -encode=>'latin1'),
            map([ $api->corefont($fontname, -encode=>"uni$_"), [$_] ], @blk ),
            -encode => 'latin1'
        );
    }

The patch (PDF-API2-Resource-Font.pm.patch) adds one line to the file PDF/API2/Resource/Font.pm

    $data->{firstchar} = 0;

to set this value to zero if $encoding matches /^uni\d+$/. You can also simply replace the existing module file with the one I attached (PDF-API2-Resource-Font.pm.tar.gz, for PDF::API2 0.71.001).

Some background: PDF::API2::Resource::UniFont uses a faked font for character sets with more than 256 characters (actually 224, when ignoring control characters). It works by mapping blocks of 256 code points in Unicode ("block", "page", "plane") to a single-byte font that contains just the characters of this block. For example, the Unicode range 0x100 to 0x1FF is remapped to the single-byte range 0x00 to 0xFF, in the pseudo-font associated with block 1.

The problem is that for the first 32 characters in these blocks, the print width is not stored, and as a result, the PDF rendering engine treats the widths of these characters as zero. That is the case for the "e" ("e ogonek"), which is chr(281) in Unicode and gets remapped to chr(25) in the single-byte font, and which (as 25 < 32) gets a zero width. That's why the following "g" is printed on top of it. The "l" ("l slash") is chr(322) and gets remapped to chr(66), so it behaves normally, as it has its proper width stored.

The patch simply tells PDF::API2 that for these remapped fonts, it should treat *every* character code from 0 to 255 as a normal character, instead of just the default limited range 32 to 255. As a result, the *complete* character width table, with 256 entries, now gets stored in the PDF file. And that fixes it.
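To make the arithmetic above concrete, here is a minimal sketch in plain Perl of the block remapping as this report describes it. It makes no PDF::API2 calls. The helper names remap_codepoint and has_width are hypothetical, chosen only for illustration, and the "first character code" argument stands in for the firstchar value the patch changes from 32 to 0.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of PDF::API2): split a Unicode code point
# into its 256-character block number and the byte code inside that block.
sub remap_codepoint {
    my ($codepoint) = @_;
    my $block = int($codepoint / 256);
    my $byte  = $codepoint % 256;
    return ($block, $byte);
}

# Hypothetical helper: true if a width entry would be stored for this byte
# code, given the first character code covered by the width table.
sub has_width {
    my ($byte, $firstchar) = @_;
    return $byte >= $firstchar ? 1 : 0;
}

# "l slash" (U+0142) and "e ogonek" (U+0119) from the demo script.
for my $cp (0x0142, 0x0119) {
    my ($block, $byte) = remap_codepoint($cp);
    printf "U+%04X -> block %d, byte %d, width stored: release=%s patched=%s\n",
        $cp, $block, $byte,
        has_width($byte, 32) ? 'yes' : 'no',   # width table starts at 32
        has_width($byte, 0)  ? 'yes' : 'no';   # width table starts at 0
}
```

With a starting code of 32, only the "e ogonek" (byte 25) falls outside the width table, which matches the overprinting seen in bug-demo.release.pdf.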
Subject: bug-demo.release.pdf
Download bug-demo.release.pdf
application/x-pdf 7.6k


Subject: PDF-API2-Resource-Font.pm.tar.gz


Subject: PDF-API2-Resource-Font.pm.patch
--- old/PDF/API2/Resource/Font.pm Sat Mar 10 14:05:42 2007
+++ PDF/API2/Resource/Font.pm Fri Oct 31 13:48:28 2008
@@ -73,6 +73,7 @@
         my $blk=$1;
         $data->{e2u}=[ map { $blk*256+$_ } (0..255) ];
         $data->{e2n}=[ map { nameByUni($_) || '.notdef' } @{$data->{e2u}} ];
+        $data->{firstchar} = 0;
     } elsif(defined $encoding) {
Subject: bug-demo.pl
#!/usr/bin/perl -w
use PDF::API2;
use strict;

gen_pdf("$0.pdf");

sub gen_pdf {
    my($save_as) = @_;
    my $api = PDF::API2->new();
    my $uf = unifont($api, 'Times', 1);
    $api->mediabox(595,842);
    my $page = $api->page;
    my $text = $page->text;
    $text->font( $uf, 18 );
    $text->translate( 190, 400 );
    $text->paragraph("Centrum Us\x{0142}ug Ksi\x{0119}gowych", 220, 25);
    $api->saveas($save_as);
    $api->end;
}

sub unifont {
    my($api, $fontname, @blk) = @_;
    return $api->unifont(
        $api->corefont($fontname, -encode=>'latin1'),
        map([ $api->corefont($fontname, -encode=>"uni$_"), [$_] ], @blk ),
        -encode => 'latin1'
    );
}
Subject: bug-demo.patched.pdf
Download bug-demo.patched.pdf
application/x-pdf 7.7k


try 0.72 and report again
On Tue Nov 18 17:43:22 2008, AREIBENS wrote:
> try 0.72 and report again
Yes, it's fixed now.

You do have another problem: PDF::API2 0.72 doesn't show up as the latest release in the CPAN index at http://www.cpan.org/modules/02packages.details.txt.gz which still lists 0.71.001 as the most recent version. Most likely you're suffering from the "world writable directories" problem as discussed at http://use.perl.org/~cosimo/journal/37554 and with a possible fix from Windows at http://use.perl.org/~Burak/journal/37599 . I checked the archive on Linux, and all files and directories are indeed world writable.

p.s. What's that about bug status "rejected"? I submit a bug report, a new version of PDF::API2 comes out a day after my bug report, with the exact same fix as I proposed, and then you reject my bug report??