
This queue is for tickets about the podlators CPAN distribution.

Report information
The Basics
Id: 73804
Status: stalled
Priority: 0/
Queue: podlators

People
Owner: Nobody in particular
Requestors: jkeenan [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Wishlist
Broken in: 1.00
Fixed in: (no value)



Subject: Pod2man creates wrong ROFF esc sequences for Latin-1 characters (RT #79410)
Since Pod::Man, as part of podlators, is now maintained on CPAN, I am forwarding this bug report from the Perl 5 RT queue. It was originally filed as https://rt.perl.org/rt3/Ticket/Display.html?id=79410 by Erwin Waterlander <waterlan@xs4all.nl> on 18 Nov 2010. Am attaching the tarball which Waterlander attached to RT #79410 as well as what I think was the relevant diff.

Thank you very much.
Jim Keenan

Original report:
##########

This is a bug report for perl from waterlan@xs4all.nl, generated with the help of perlbug 1.39 running under perl 5.10.1.

-----------------------------------------------------------------
[Please describe your issue here]

Hi,

I have a pod file encoded in Latin-1. The 8-bit Latin-1 characters are converted wrongly to ROFF.

For instance an a-acute is translated to
\*'
while it should be
\['a]

An e with dieresis is translated to \*: instead of \[:e]

In fact all characters with dieresis are translated to \*: and all characters with acute to \*' and with grave to \*` and so on.

best regards,
Erwin Waterlander

PS I have never used perlbug before, I hope I can attach a file after this.
...
Sending mail with perlbug failed. I'm now sending with Thunderbird and attach a test case.

[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
    category=utilities
    severity=high
---
Site configuration information for perl 5.10.1:

Configured by rurban at Sat Aug 28 20:14:06 CEST 2010.

Summary of my perl5 (revision 5 version 10 subversion 1) configuration:

  Platform:
    osname=cygwin, osvers=1.7.5(0.22553), archname=i686-cygwin-thread-multi-64int
    uname='cygwin_nt-5.1 reini 1.7.5(0.22553) 2010-04-12 19:07 i686 cygwin '
    config_args='-de -Dlibperl=cygperl5_10.dll -Dcc=gcc-4 -Dld=g++-4 -Dmksymlinks -Dusethreads -Dmad=y -Doptimize=-O3 -Accflags=-g3'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=undef, uselongdouble=undef
    usemymalloc=y, bincompat5005=undef
  Compiler:
    cc='gcc-4', ccflags ='-DPERL_USE_SAFE_PUTENV -U__STRICT_ANSI__ -g3 -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include',
    optimize='-O3',
    cppflags='-DPERL_USE_SAFE_PUTENV -U__STRICT_ANSI__ -g3 -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.3.4 20090804 (release) 1', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='g++-4', ldflags =' -Wl,--enable-auto-import -Wl,--export-all-symbols -Wl,--stack,8388608 -Wl,--enable-auto-image-base -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib /lib
    libs=-lgdbm -ldb -ldl -lcrypt -lgdbm_compat
    perllibs=-ldl -lcrypt
    libc=/usr/lib/libc.a, so=dll, useshrplib=true, libperl=cygperl5_10.dll
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' '
    cccdlflags=' ', lddlflags=' --shared -Wl,--enable-auto-import -Wl,--export-all-symbols -Wl,--stack,8388608 -Wl,--enable-auto-image-base -L/usr/local/lib -fstack-protector'

Locally applied patches:
    CYG11 no-bs
    CYG12 no archlib in otherlibdirs
    CYG14 Dynaloader
    CYG15 static-Win32CORE
    CYG17 utf8-paths
    CYG21 LibList-Kid.patch
    CYG22 cygwin-1.7 hints
    CYG23 544-stat
    CYG24 build man pages
    CYG25 rebase_privlib
    Module-Build-0.36_13
    Bug#55162 CYG18 File::Spec::case_tolerant performance
    disable ExtUtils::MakeMaker::Coverage in Sys-Syslog

---
@INC for perl 5.10.1:
    /usr/lib/perl5/5.10/i686-cygwin
    /usr/lib/perl5/5.10
    /usr/lib/perl5/site_perl/5.10/i686-cygwin
    /usr/lib/perl5/site_perl/5.10
    /usr/lib/perl5/vendor_perl/5.10/i686-cygwin
    /usr/lib/perl5/vendor_perl/5.10
    /usr/lib/perl5/vendor_perl/5.10
    /usr/lib/perl5/site_perl/5.8
    /usr/lib/perl5/vendor_perl/5.8
    .

---
Environment for perl 5.10.1:
    HOME=/cygdrive/c/Users/waterlan
    LANG=nl_NL.UTF-8
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/cygdrive/c/Users/waterlan/bin:/usr/local/bin:/usr/bin:/bin:/cygdrive/c/Program Files/PC Connectivity Solution/:/cygdrive/c/bin:/cygdrive/c/Windows/system32:/cygdrive/c/Windows:/cygdrive/c/Windows/System32/Wbem:/cygdrive/c/Windows/System32/WindowsPowerShell/v1.0/:/cygdrive/c/WATCOM18/BINNT:/cygdrive/c/WATCOM18/BINW
    PERL_BADLANG (unset)
    SHELL (unset)

Subject: pod2man.tar.gz
application/x-gzip 2.6k


From: jkeenan [...] cpan.org
On Fri Jan 06 21:31:06 2012, JKEENAN wrote:
> Am attaching what I think
> was the relevant diff.
>
> Thank you very much.
> Jim Keenan

Subject: 79410_pod_man.diff
142c142
< A\-dieresis A\*:
---
> A\-dieresis A\[:A]
144c144
< a\-acute a\*'
---
> a\-acute a\['a]
146c146
< e\-grave e\*`
---
> e\-grave e\[`e]
148c148
< e\-dieresis e\*:
---
> e\-dieresis e\[:e]

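The rewrite this diff is after can be sketched as a small text transformation. The Python below is only an illustration, not code from podlators or from the ticket. The name `legacy_to_groff` is made up. It assumes the legacy form is a base letter followed by `\*` and an accent mark, as on the `<` lines above, and that the base letter belongs inside the groff glyph name, as in the report's `\[:e]`.

```python
import re

# Legacy form assumed here: base letter, then \* and an accent mark,
# for example e\*: for e-dieresis. Only the accents ' ` : ^ ~ are handled.
LEGACY_ACCENT = re.compile(r"([A-Za-z])\\\*([':`^~])")


def legacy_to_groff(roff: str) -> str:
    """Rewrite letter + \\*accent pairs as groff named glyphs such as \\[:e]."""
    return LEGACY_ACCENT.sub(
        lambda m: "\\[" + m.group(2) + m.group(1) + "]", roff
    )


if __name__ == "__main__":
    print(legacy_to_groff(r"e\-dieresis e\*:"))  # prints: e\-dieresis \[:e]
```

A regex like this cannot tell an accent string from an unrelated user-defined roff string that happens to follow a letter, so it would need checking against real pod2man output before use.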
Subject: Re: [rt.cpan.org #73804] Pod2man creates wrong ROFF esc sequences for Latin-1 characters (RT #79410)
Date: Fri, 06 Jan 2012 19:00:53 -0800
To: bug-podlators [...] rt.cpan.org
From: Russ Allbery <rra [...] stanford.edu>
"James E Keenan via RT" <bug-podlators@rt.cpan.org> writes:

> Since Pod::Man, as part of podlators, is now maintained on CPAN, I am
> forwarding this bug report from the Perl 5 RT queue. It was originally
> filed as https://rt.perl.org/rt3/Ticket/Display.html?id=79410 by Erwin
> Waterlander <waterlan@xs4all.nl> on 18 Nov 2010. Am attaching the
> tarball which Waterlander attached to RT #79410 as well as what I think
> was the relevant diff.

[...]

> I have a pod file encoded in Latin-1. The 8-bit Latin-1 characters
> are converted wrongly to ROFF.

> For instance an a-acute is translated to
> \*'
> while it should be
> \['a]

> An e with dieresis is translated to \*: instead of \[:e]

I'm afraid those are groff-specific escapes and will not work with any other *roff implementation. Pod::Man exists to generate portable man pages that can be distributed and used, and I'm not willing to make it generate groff-specific code (particularly since the best thing to do these days is to just generate UTF-8 without any escapes at all).

Closing wontfix.

-- 
Russ Allbery (rra@stanford.edu) <http://www.eyrie.org/~eagle/>

Subject: Re: [rt.cpan.org #73804] Pod2man creates wrong ROFF esc sequences for Latin-1 characters (RT #79410)
Date: Fri, 06 Jan 2012 19:03:50 -0800
To: bug-podlators [...] rt.cpan.org
From: Russ Allbery <rra [...] stanford.edu>
"rra@stanford.edu via RT" <bug-podlators@rt.cpan.org> writes:

> I'm afraid those are groff-specific escapes and will not work with any
> other *roff implementation. Pod::Man exists to generate portable man
> pages that can be distributed and used, and I'm not willing to make it
> generate groff-specific code (particularly since the best thing to do
> these days is to just generate UTF-8 without any escapes at all).

Oh, and I also should have noted: you will almost certainly get the behavior you want by running pod2man -u instead of just pod2man, and I'm looking at making that the default in a later version.

-- 
Russ Allbery (rra@stanford.edu) <http://www.eyrie.org/~eagle/>

CC: <rra [...] stanford.edu>, <jkeenan [...] cpan.org>
Subject: Re: [rt.cpan.org #73804] Pod2man creates wrong ROFF esc sequences for Latin-1 characters (RT #79410)
Date: Mon, 09 Jan 2012 10:08:03 +0100
To: <bug-podlators [...] rt.cpan.org>
From: waterlan <waterlan [...] xs4all.nl>
> Russ Allbery writes:
>
> I'm afraid those are groff-specific escapes and will not work with any
> other *roff implementation. Pod::Man exists to generate portable man
> pages that can be distributed and used, and I'm not willing to make it
> generate groff-specific code (particularly since the best thing to do
> these days is to just generate UTF-8 without any escapes at all).
>
> Closing wontfix.

Hi,

This report was pending for more than a year, and now it is closed so fast that I had no time to respond...

I was not aware that these escape sequences are groff-specific. Actually, portability is a main concern of mine. My application (dos2unix) is also ported to DOS and Windows. DOS has no UTF-8 support and Windows has poor UTF-8 support. Therefore my idea was that using the escape sequences is the most portable, at least for Latin-based languages. I have seen only groff used on DOS/Windows, no other *roff.

> Oh, and I also should have noted: you will almost certainly get the
> behavior you want by running pod2man -u instead of just pod2man,

No, because I did not want UTF-8.

> and I'm looking at making that the default in a later version.

Please do not change the default behaviour. You cannot assume that everybody wants UTF-8. A better default behaviour would be to convert to the current locale encoding. And I still would like an output using groff escape sequences.

regards,

-- 
Erwin Waterlander
http://waterlan.home.xs4all.nl/

CC: <bug-podlators [...] rt.cpan.org>, <jkeenan [...] cpan.org>
Subject: Re: [rt.cpan.org #73804] Pod2man creates wrong ROFF esc sequences for Latin-1 characters (RT #79410)
Date: Mon, 09 Jan 2012 10:11:37 -0800
To: waterlan <waterlan [...] xs4all.nl>
From: Russ Allbery <rra [...] stanford.edu>
waterlan <waterlan@xs4all.nl> writes:

> And I still would like an output using groff escape sequences.

Well, I still think that UTF-8 is the best direction forward and that all *roff implementations need to move in that direction. I suppose if someone produced a patch that adds a non-default option to generate groff escapes, I would consider it, but it's not something I'm going to implement myself.

-- 
Russ Allbery (rra@stanford.edu) <http://www.eyrie.org/~eagle/>

CC: bug-podlators [...] rt.cpan.org, jkeenan [...] cpan.org
Subject: Re: [rt.cpan.org #73804] Pod2man creates wrong ROFF esc sequences for Latin-1 characters (RT #79410)
Date: Fri, 13 Jan 2012 22:40:50 +0100
To: Russ Allbery <rra [...] stanford.edu>
From: Erwin Waterlander <waterlan [...] xs4all.nl>
Russ Allbery wrote, on 9-1-2012 19:11:
> waterlan <waterlan@xs4all.nl> writes:
>
>> And I still would like an output using groff escape sequences.
>
> Well, I still think that UTF-8 is the best direction forward and that all
> *roff implementations need to move in that direction. I suppose if
> someone produced a patch that adds a non-default option to generate groff
> escapes, I would consider it, but it's not something I'm going to
> implement myself.

OK, then I'll stick to my workaround: some Perl post-processing to create groff escape sequences.

Please don't make the --utf8 behaviour the default. Keep the default behaviour as is.

best regards,

-- 
Erwin Waterlander
http://waterlan.home.xs4all.nl/
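
The Perl post-processing mentioned above is not included in the ticket. As a rough illustration of how groff glyph names for accented Latin-1 letters can be derived, here is a Python sketch. It is not Erwin's script and not part of podlators. The function name `latin1_to_groff` and the choice of accents (acute, grave, dieresis, circumflex, tilde) are assumptions made for the example. It works on decoded text rather than on pod2man output and leaves every other character untouched.

```python
import unicodedata

# Combining marks mapped to the accent character used in groff glyph
# names such as \['a] or \[:e]. Other marks (cedilla, ring, ...) are
# deliberately left out of this sketch.
GROFF_ACCENTS = {
    "\u0301": "'",  # acute
    "\u0300": "`",  # grave
    "\u0308": ":",  # dieresis
    "\u0302": "^",  # circumflex
    "\u0303": "~",  # tilde
}


def latin1_to_groff(text: str) -> str:
    """Replace accented Latin-1 letters with groff named-glyph escapes."""
    out = []
    for ch in text:
        # Split the character into base letter + combining mark, if any.
        parts = unicodedata.normalize("NFD", ch)
        if (
            ord(ch) <= 0xFF
            and len(parts) == 2
            and ord(parts[0]) < 128
            and parts[1] in GROFF_ACCENTS
        ):
            out.append("\\[" + GROFF_ACCENTS[parts[1]] + parts[0] + "]")
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    # Input is assumed to have been decoded from Latin-1 already.
    print(latin1_to_groff("caf\u00e9"))  # prints: caf\['e]
```

As Russ notes in the thread, escapes of this form are groff-specific, so output produced this way should not be expected to render under other *roff implementations.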