
This queue is for tickets about the PostScript-Font CPAN distribution.

Report information
The Basics
Id: 8504
Status: open
Priority: 0/
Queue: PostScript-Font

People
Owner: jv [...] cpan.org
Requestors: roel-perl [...] st2x.net
Cc:
AdminCc:

Bug Information
Severity: Unimportant
Broken in: 1.09
Fixed in: (no value)



Subject: Typesetting a no-break space with ps_textbox results in incorrect PostScript output
Hi Leo

I'm afraid PostScript::BasicTypesetter 0.04 contains a bug regarding the handling of no-break spaces. I use a Latin1 encoding, and when I use the ps_textbox method, the resulting PostScript code contains missing text and syntax errors. The problem is due to the fact that the no-break space maps to 'space' in the EncodingVector. In the PostScript::FontMetrics::kstring it is assumed that typesetting a space always results in a kerning value being pushed onto the @res stack; not true however for a no-break space.

On line 381 `` $res[$#res] += $kw; '' you then add a numeric value to an array element containing text. There are several ways to solve this problem, and currently I use a work-around in my own code by manipulating the EncodingVector before calling ps_textbox.

Should you have any questions, or if you want me to make a patch, please contact me.

Regards

Roel
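A minimal sketch of the failure mode described above, not the actual kstring code: the contents of @res and the value of $kw are hypothetical stand-ins. It only shows what Perl does when `+=` is applied to an array element that holds text instead of a number, which would account for text going missing from the output.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the result list: text fragments alternate
# with numeric kerning adjustments. Here the last element is text, as
# happens when no kerning value was pushed after a "space".
my @res = ('Hello', -30, 'World');
my $kw  = -120;    # hypothetical extra word-space adjustment

{
    # A non-numeric string counts as 0 in numeric context, so the
    # addition replaces the text with the number.
    no warnings 'numeric';
    $res[$#res] += $kw;
}

print join(' | ', @res), "\n";
```

With these made-up values the last fragment is replaced by -120, so the word it held never reaches the output.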
Sorry, I meant to write: "Hi Johan". Of course. (Just call me whatever you like.)

[RST - Wed Nov 17 13:22:05 2004]:
> Hi Leo
>
> I'm afraid PostScript::BasicTypesetter 0.04 contains a bug regarding
> the handling of no-break spaces. I use a Latin1 encoding, and when
> I use the ps_textbox method, the resulting PostScript code contains
> missing text and syntax errors. The problem is due to the fact that
> the no-break space maps to 'space' in the EncodingVector. In the
> PostScript::FontMetrics::kstring it is assumed that typesetting a
> space always results in a kerning value being pushed onto the @res
> stack; not true however for a no-break space.
>
> On line 381 `` $res[$#res] += $kw; '' you then add a numeric value to
> an array element containing text. There are several ways to solve
> this problem, and currently I use a work-around in my own code by
> manipulating the EncodingVector before calling ps_textbox.
>
> Should you have any questions, or if you want me to make a patch,
> please contact me.
>
> Regards
>
> Roel
To: bug-PostScript-Font [...] rt.cpan.org
Subject: Re: [cpan #8504] Typesetting a no-break space with ps_textbox results in incorrect PostScript output
From: Johan Vromans <jvromans [...] squirrel.nl>
Date: Wed, 17 Nov 2004 20:54:13 +0100
RT-Send-Cc:

" via RT" <bug-PostScript-Font@rt.cpan.org> writes:

> I'm afraid PostScript::BasicTypesetter 0.04 contains a bug regarding
> the handling of no-break spaces. I use a Latin1 encoding,

Interesting. There's no such thing as a no-break space. It's a text processor peculiarity that should never get to the low level typesetting code.

> The problem is due to the fact that the no-break space maps to
> 'space' in the EncodingVector.

Character index 0240 maps to space, that is correct.

> In the PostScript::FontMetrics::kstring it is assumed that
> typesetting a space always results in a kerning value being pushed
> onto the @res stack; not true however for a no-break space.

Yes, the problem is that 0240 should be treated as a space. However, the processing of spaces has already taken place by the time the 0240 gets mapped to space.

> There are several ways to solve this problem, and currently I use a
> work-around in my own code by manipulating the EncodingVector before
> calling ps_textbox.

I think a more correct approach is to locate in the encoding vector the character codes that map to space (this needs to be done only once), and then for each text use a tr to change these 'characters' to spaces before processing.

Does that happen to be one of your workarounds?

-- Johan (responds when called by any name...)
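The approach suggested above could be sketched roughly as follows. This is an illustration under stated assumptions, not code from the distribution: the encoding vector is a hand-filled 256-entry list of glyph names, and normalize_spaces is a hypothetical helper. Because tr/// cannot take its character list from a variable, the sketch uses a precompiled character class with s///g instead.

```perl
use strict;
use warnings;

# Hypothetical encoding vector: 256 glyph names indexed by character
# code. Only the entries needed here are filled in; a real vector
# would come from the encoding in use.
my @encoding = ('.notdef') x 256;
$encoding[0040]   = 'space';    # ordinary space (octal 40)
$encoding[0240]   = 'space';    # no-break space in Latin1 (octal 240)
$encoding[ord 'A'] = 'A';

# Done once: build a character class of every code that maps to 'space'.
my $class = join '',
    map  { sprintf '\\x%02X', $_ }
    grep { $encoding[$_] eq 'space' } 0 .. $#encoding;
my $space_re = length $class ? qr/[$class]/ : undef;

# Done per text: turn all such codes into plain spaces before the
# string is handed to the typesetting code.
sub normalize_spaces {
    my ($text) = @_;
    $text =~ s/$space_re/ /g if defined $space_re;
    return $text;
}

print normalize_spaces("A\240A"), "\n";
```

The text returned by normalize_spaces would then be passed to ps_textbox in place of the original string.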
Hi Johan

[jvromans@squirrel.nl - Wed Nov 17 15:30:45 2004]:
> " via RT" <bug-PostScript-Font@rt.cpan.org> writes:
>
> > I'm afraid PostScript::BasicTypesetter 0.04 contains a bug regarding
> > the handling of no-break spaces. I use a Latin1 encoding,
>
> Interesting. There's no such thing as a no-break space. It's a text
> processor peculiarity that should never get to the low level
> typesetting code.

OK, I agree. I should have formulated this better, of course. But the fact is that octal 240 *is* a legal character code in ISOLatin1, and that ps_textbox produces syntactically incorrect PostScript code when it is passed text containing that character.

> I think a more correct approach is to locate in the encoding vector
> the character codes that map to space (this needs to be done only
> once), and then for each text use a tr to change these 'characters' to
> spaces before processing.
>
> Does that happen to be one of your workarounds?
No, your approach is a bit more sophisticated. In ISOLatin1 there are only two characters that map to a space, one being the code point octal 40, which doesn't give any problems -- so the only remaining "problem" character is character code 240. So I'll take the shortcut and focus on that character only.

Thanks. I'll bear in mind that I need to avoid alternative space characters when typesetting with PostScript::BasicTypesetter.

--Roel