
This queue is for tickets about the PDF-API2 CPAN distribution.

Report information
The Basics
Id: 107333
Status: resolved
Priority: 0/
Queue: PDF-API2

People
Owner: Nobody in particular
Requestors: mark [...] ghy.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: 2.026



Subject: Unable to parse '/' as Name
Date: Fri, 25 Sep 2015 16:01:51 -0500
To: bug-PDF-API2 [...] rt.cpan.org
From: "Mark Balitsky" <mark [...] ghy.com>
I've recently encountered some PDF files which have failed to parse with PDF::API2. The problem was due to an entry in an image dictionary which used the value '/' as a Name, e.g.:

    4 0 obj
    <<
    /DecodeParms [
    <<
    /Colors 1
    /Rows 3300
    /Columns 2550
    /K -1
    >>]
    /Width 2550
    /BitsPerComponent 1
    /Name /XImg
    /Height 3300
    /Intent /
    /Filter [/CCITTFaxDecode]
    /Subtype /Image
    /Length 23634
    /Type /XObject
    /ColorSpace /DeviceGray
    >>

The attempt to read the dictionary crashes with:

    Can't parse `/
    /Filter [/CCITTFaxDecode]
    /Subtype /Image
    (etc...)

I first noticed this with PDF::API2 version 2.019, then verified it is still present in 2.024.

As a workaround, I've made the following change in Basic/PDF/File.pm, in sub readval, between lines 550-555 (ver. 2.024):

from:

    # Name
    elsif ($str =~ m|^/($reg_char+)|s) {
        $value = $1;
        $str =~ s|^/($reg_char+)||s;
        $result = PDF::API2::Basic::PDF::Name->from_pdf($value, $self);
    }

to:

    # Name
    elsif ($str =~ m|^/($reg_char*)|s) {
        $value = $1;
        $str =~ s|^/($reg_char*)||s;
        $result = PDF::API2::Basic::PDF::Name->from_pdf($value, $self);
    }

This successfully consumes the '/' token without crashing.

--
Mark Balitsky
I.T. Department
GHY International
809-167 Lombard Avenue
Winnipeg, MB, R3B 3H8, Canada
------------------------------
web: http://www.ghy.com
Main Phone: (204) 947-6851
Fax: (204) 947-3306
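The difference between the two quantifiers in the patch can be seen in isolation with a small standalone script. This is only a sketch, not the code in File.pm: the `$reg_char` definition here is an assumed stand-in (any character that is not PDF whitespace or a delimiter), and `read_name` is a hypothetical helper written for this illustration.

```perl
use strict;
use warnings;

# Assumption: a stand-in for $reg_char, matching any character that is
# neither PDF whitespace nor a delimiter. The real definition in
# Basic/PDF/File.pm may differ.
my $reg_char = '[^][<>{}()/% \t\r\n\f\0]';

# The two patterns from the report: '+' requires at least one regular
# character after the slash, '*' also accepts none.
my $strict  = qr|^/($reg_char+)|s;
my $lenient = qr|^/($reg_char*)|s;

# Hypothetical helper: try to consume a name token from the front of $str.
# Returns (name, remaining string) on success, or an empty list on failure.
sub read_name {
    my ($pattern, $str) = @_;
    return unless $str =~ $pattern;
    my $name = $1;
    $str =~ s/$pattern//;
    return ($name, $str);
}

for my $input ("/XImg /Height 3300", "/ /Filter [/CCITTFaxDecode]") {
    for my $case (['+', $strict], ['*', $lenient]) {
        my ($label, $pattern) = @$case;
        my @result = read_name($pattern, $input);
        if (@result) {
            printf "'%s' with %s: name='%s', rest='%s'\n", $input, $label, @result;
        }
        else {
            printf "'%s' with %s: no match\n", $input, $label;
        }
    }
}
```

With the `+` pattern the bare slash does not match at all, so a parser would fall through to its error branch. With `*` it matches and yields an empty name, leaving the rest of the dictionary to be read as usual.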
PDF::API2 is right to crash there (though cleaner error handling wouldn't be a bad thing), as the line "/Intent /" isn't valid. The second "/" is a marker that says a name is going to follow, but one doesn't.

It's kind of like saying { my $intent = $ } in Perl, which would result in a syntax error.
Subject: Re: [rt.cpan.org #107333] Unable to parse '/' as Name
Date: Mon, 28 Sep 2015 11:04:03 -0500
To: bug-PDF-API2 [...] rt.cpan.org
From: "Mark Balitsky" <mark [...] ghy.com>
Hi Steve.

I agree "/Intent /" seems pointless at best, but it does seem to be supported by the PDF spec. I checked the 1.4 version of Adobe's PDF Reference. Section 3.2.4 includes:

    "Note: The token / (a slash followed by no regular characters) is a valid name."

-- 
Mark Balitsky

From: "Steve Simms via RT" <bug-PDF-API2@rt.cpan.org>
To: mark@ghy.com
Date: 25/09/2015 05:10 PM
Subject: [rt.cpan.org #107333] Unable to parse '/' as Name

<URL: https://rt.cpan.org/Ticket/Display.html?id=107333 >

PDF::API2 is right to crash there (though cleaner error handling wouldn't be a bad thing), as the line "/Intent /" isn't valid. The second "/" is a marker that says a name is going to follow, but one doesn't.

It's kind of like saying { my $intent = $ } in Perl, which would result in a syntax error.
Thanks for following up. I had looked at the PDF spec (1.7) before replying, and didn't see a reference to an empty string at the time, but I just re-looked at it and found that the empty string is indeed a valid name. How about that. Your fix has been committed, and it'll be in the next release (2.026).