
This queue is for tickets about the PDF-API2 CPAN distribution.

Report information
The Basics
Id: 113084
Status: rejected
Priority: 0/
Queue: PDF-API2

People
Owner: Nobody in particular
Requestors: dam [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



From: dam [...] cpan.org
Subject: Unreproducible internal font IDs
We have the following bug reported to the Debian package of PDF-API2 (https://bugs.debian.org/818363):

It doesn't seem to be a bug in the packaging, so you may want to take a look. Thanks!

------8<-----------8<-----------8<-----------8<-----------8<-----

The TrueType, BdFont, CoreFont and Postscript subclasses of PDF::API2::Resource::Font append '~'.time() to the resource identifiers they generate.

This is unnecessary because the identifier already includes the result of the pdfkey() function, which gives increasing numbers through a process lifetime.

Adding time() is in itself not enough to guarantee unique IDs, but only introduces unreproducible output and poses a slight performance penalty for the syscall needed to get the current time.

The patch below removes time() from the resource IDs, which is sufficient to get reproducible output.

------8<-----------8<-----------8<-----------8<-----------8<-----

Thanks for considering,
Damyan Ivanov,
Debian Perl Group

------8<-----------8<-----------8<-----------8<-----------8<-----

--- a/lib/PDF/API2/Resource/CIDFont/TrueType.pm
+++ b/lib/PDF/API2/Resource/CIDFont/TrueType.pm
@@ -38,7 +38,7 @@ sub new {
     my ($ff,$data)=PDF::API2::Resource::CIDFont::TrueType::FontFile->new($pdf,$file,@opts);
     $class = ref $class if ref $class;
-    my $self=$class->SUPER::new($pdf,$data->{apiname}.pdfkey().'~'.time());
+    my $self=$class->SUPER::new($pdf,$data->{apiname}.pdfkey());
     $pdf->new_obj($self) if(defined($pdf) && !$self->is_obj($pdf));
     $self->{' data'}=$data;
@@ -51,7 +51,7 @@ sub new {
     $de->{'FontDescriptor'} = $des;
     $de->{'Subtype'} = PDFName($self->iscff ? 'CIDFontType0' : 'CIDFontType2');
-    ## $de->{'BaseFont'} = PDFName(pdfkey().'+'.($self->fontname).'~'.time());
+    ## $de->{'BaseFont'} = PDFName(pdfkey().'+'.($self->fontname));
     $de->{'BaseFont'} = PDFName($self->fontname);
     $de->{'DW'} = PDFNum($self->missingwidth);
     if($opts{-noembed} != 1)
--- a/lib/PDF/API2/Resource/Font/BdFont.pm
+++ b/lib/PDF/API2/Resource/Font/BdFont.pm
@@ -56,7 +56,7 @@ sub new {
     my %opts=@opts;
     $class = ref $class if ref $class;
-    $self = $class->SUPER::new($pdf, sprintf('%s+Bdf%02i',pdfkey(),++$BmpNum).'~'.time());
+    $self = $class->SUPER::new($pdf, sprintf('%s+Bdf%02i',pdfkey(),++$BmpNum));
     $pdf->new_obj($self) unless($self->is_obj($pdf));
     # adobe bitmap distribution font
@@ -202,7 +202,7 @@ sub readBDF {
     $data->{bbox}{'.notdef'} = [0, 0, 0, 0];
     }
-    $data->{fontname}=pdfkey().pdfkey().'~'.time();
+    $data->{fontname}=pdfkey();
     $data->{apiname}=$data->{fontname};
     $data->{flags} = 34;
     $data->{fontbbox} = [ split(/\s+/,$data->{FONTBOUNDINGBOX}) ];
--- a/lib/PDF/API2/Resource/Font/CoreFont.pm
+++ b/lib/PDF/API2/Resource/Font/CoreFont.pm
@@ -164,7 +164,7 @@ sub new
     #}
     $class = ref $class if ref $class;
-    $self = $class->SUPER::new($pdf, $data->{apiname}.pdfkey().'~'.time());
+    $self = $class->SUPER::new($pdf, $data->{apiname}.pdfkey());
     $pdf->new_obj($self) unless($self->is_obj($pdf));
     $self->{' data'}=$data;
     $self->{-dokern}=1 if($opts{-dokern});
--- a/lib/PDF/API2/Resource/Font/Postscript.pm
+++ b/lib/PDF/API2/Resource/Font/Postscript.pm
@@ -28,7 +28,7 @@ sub new {
     }
     $class = ref $class if ref $class;
-    $self = $class->SUPER::new($pdf, $data->{apiname}.pdfkey().'~'.time());
+    $self = $class->SUPER::new($pdf, $data->{apiname}.pdfkey());
     $pdf->new_obj($self) unless($self->is_obj($pdf));
     $self->{' data'}=$data;
@@ -40,7 +40,7 @@ sub new {
     $self->{'FontDescriptor'}=$self->descrByData();
     if(-f $psfile) {
-        $self->{'BaseFont'} = PDFName(pdfkey().'+'.($self->fontname).'~'.time());
+        $self->{'BaseFont'} = PDFName(pdfkey().'+'.($self->fontname));
         my ($l1,$l2,$l3,$stream)=$self->readPFAPFB($psfile);

------8<-----------8<-----------8<-----------8<-----------8<-----
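The argument in the report can be illustrated with a small, self-contained sketch. It does not use PDF::API2. make_key_source() is a hypothetical stand-in for pdfkey() (whose real implementation is not shown in this ticket), and the clock is passed in as a coderef so the effect of the timestamp can be seen without depending on the wall clock.

```perl
use strict;
use warnings;

# Hypothetical stand-in for pdfkey(): a counter that increases over the
# lifetime of one process. This is an assumption made for illustration.
sub make_key_source {
    my $n = 0;
    return sub { return sprintf('K%03d', ++$n) };
}

# Build a resource ID. If a clock coderef is supplied, append '~' plus its
# value, mirroring the '~'.time() suffix discussed in the report.
sub resource_id {
    my ($apiname, $next_key, $clock) = @_;
    my $id = $apiname . $next_key->();
    $id .= '~' . $clock->() if $clock;
    return $id;
}

# Simulate one program run that registers the same font twice.
sub run_once {
    my ($clock) = @_;
    my $next_key = make_key_source();
    return map { resource_id('Helv', $next_key, $clock) } 1 .. 2;
}

# Two runs without a timestamp, and two runs one second apart with one.
my @run1_plain = run_once();
my @run2_plain = run_once();
my @run1_timed = run_once(sub { 1458140654 });
my @run2_timed = run_once(sub { 1458140655 });

print "plain: @run1_plain | @run2_plain\n";
print "timed: @run1_timed | @run2_timed\n";
```

In this model the counter alone already keeps the two IDs within a run distinct, and the plain IDs are identical across runs. The timestamp adds nothing within a run (both IDs get the same suffix) and is the only thing that differs between runs.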
Subject: [rt.cpan.org #113084]
Date: Wed, 16 Mar 2016 11:04:14 -0400
To: bug-PDF-API2 [...] rt.cpan.org
From: Phil M Perry <philperry [...] hvc.rr.com>
See also #105579. It sounds like it might be a very similar issue.
Subject: Re: [rt.cpan.org #113084]
Date: Wed, 16 Mar 2016 15:13:19 +0000
To: "philperry [...] hvc.rr.com via RT" <bug-PDF-API2 [...] rt.cpan.org>
From: Damyan Ivanov <dam [...] cpan.org>
-=| philperry@hvc.rr.com via RT, 16.03.2016 11:04:12 -0400 |=-
> <URL: https://rt.cpan.org/Ticket/Display.html?id=113084 >
>
> See also #105579. It sounds like it might be a very similar issue.
Yes, this seems to be the same issue. Interestingly, hash order seems not to be a problem -- after removing the time stamps, the PDF output gets completely deterministic in my tests.
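A determinism check of this kind can be sketched as follows. This is only an illustration of the idea, not the test that was actually run: is_reproducible() and the two toy generators are hypothetical names, and a real check would call a routine that builds the PDF and returns its bytes. Digest::SHA ships with Perl.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);

# Returns true if two invocations of the generator yield identical bytes.
# $generate is a hypothetical coderef returning the document as a string.
sub is_reproducible {
    my ($generate) = @_;
    my $first  = sha256_hex($generate->());
    my $second = sha256_hex($generate->());
    return $first eq $second;
}

# Toy generators standing in for a real PDF-producing routine.
my $stable = sub { return "/Font /HelvK001\n" };

my $tick = 0;    # simulated clock that advances on every call
my $unstable = sub { return "/Font /HelvK001~" . ++$tick . "\n" };

print "stable:   ", (is_reproducible($stable)   ? "yes" : "no"), "\n";
print "unstable: ", (is_reproducible($unstable) ? "yes" : "no"), "\n";
```

Comparing digests rather than raw strings keeps the check cheap to log and to compare across separate builds.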
Thanks for the suggestion. The version control history doesn't go all the way back, so I can't tell why the time() call was added to the IDs initially, but since it exists in more-commonly-used parts of the code and is missing in less-commonly-used parts of the code, I'm going to guess that it was added at some point to work around a name collision issue, possibly related to merging multiple PDFs created by PDF::API2. Using time() isn't as fancy as, say, using UUIDs, but it's a lot faster and likely good enough for most real-world applications. There are definitely benefits to having the same code produce the same PDF repeatedly, but there are also potential drawbacks, so I'm going to leave this part of the code as is.
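The name-collision scenario guessed at above can be modelled in a few lines. This is a sketch under stated assumptions, not a description of how PDF::API2 merges documents: names_from_process() and collisions() are hypothetical helpers, and each simulated process is assumed to number its resources from 1.

```perl
use strict;
use warnings;

# Resource names produced by one simulated process. Each process is
# assumed to start counting at 1; an optional suffix models '~'.time().
sub names_from_process {
    my ($count, $suffix) = @_;
    return map { 'F' . $_ . (defined $suffix ? "~$suffix" : '') } 1 .. $count;
}

# Report names that occur more than once when several lists are merged.
sub collisions {
    my %seen;
    $seen{$_}++ for @_;
    my @dupes = grep { $seen{$_} > 1 } keys %seen;
    return sort @dupes;
}

# Two documents created by separate processes, then merged.
my @plain = collisions(names_from_process(2), names_from_process(2));
my @timed = collisions(names_from_process(2, 1458140654),
                       names_from_process(2, 1458140700));
my @same_second = collisions(names_from_process(2, 1458140654),
                             names_from_process(2, 1458140654));

print "plain collisions:       @plain\n";
print "timed collisions:       @timed\n";
print "same-second collisions: @same_second\n";
```

In this model the timestamp suffix avoids collisions only when the two documents were created in different seconds, which matches both the "likely good enough" assessment and the reporter's point that time() alone does not guarantee uniqueness.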