
This queue is for tickets about the Pod-Perldoc CPAN distribution.

Report information
The Basics
Id: 80527
Status: resolved
Priority: 0/
Queue: Pod-Perldoc

People
Owner: Nobody in particular
Requestors:
Cc: explorer [...] joaquinferrero.com
AdminCc:

Bug Information
Severity: Normal
Broken in: 3.17
Fixed in: 3.20



CC: explorer [...] joaquinferrero.com
Subject: perldoc cannot find functions sections when is called with -L switch
When perldoc is called using the '-L XX' and '-f' switches, it cannot find the corresponding section of the requested function if the string returned by POD2::XX::search_perlfunc_re() (i.e. the string that marks the beginning of the section in perlfunc.pod that contains the descriptions of the functions available) is not encoded as iso-8859-1.

    perldoc -L ES perlfunc

works, but

    perldoc -L ES -f chr

returns the following error message:

    No documentation for perl function 'chr' found

Interim solution: search_perlfunc_re() should return an iso-8859-1-encoded string fully (or partially) matching the same string that should appear in perlfunc.pod.

Proposed solution: The search process of the string returned by search_perlfunc_re() should consider the encoding used for perlfunc.pod.

In POD2::ES all the docs are UTF-8-encoded. As a temporary solution, we have fixed this issue by removing characters with diacritic marks:

    sub search_perlfunc_re {
        return 'Lista de funciones de Perl en orden';
    }

(removed ‘alfabético’)
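To illustrate the kind of mismatch described in this report, here is a minimal sketch. It is not taken from Pod::Perldoc; it assumes the heading line reaches the matcher as undecoded UTF-8 bytes while the pattern is a character string, and all variable names are made up for the example.

```perl
use strict;
use warnings;
use utf8;                       # this source is UTF-8, as POD2::ES is said to be
use Encode qw(encode decode);

# Pattern as a character string (assumed to be what a "use utf8" module returns).
my $re = 'Lista de funciones de Perl en orden alfabético';

# The heading as it would look when read from a filehandle that has no
# encoding layer: raw UTF-8 bytes rather than characters (an assumption).
my $raw_line = encode('UTF-8', "=head2 Lista de funciones de Perl en orden alfabético\n");

# Character pattern against raw bytes.
my $matches_raw     = $raw_line =~ /\Q$re\E/ ? 1 : 0;
# Character pattern against the decoded line.
my $matches_decoded = decode('UTF-8', $raw_line) =~ /\Q$re\E/ ? 1 : 0;
# ASCII-only prefix against raw bytes (the temporary workaround).
my $matches_ascii   = $raw_line =~ /Lista de funciones de Perl en orden/ ? 1 : 0;

print "raw bytes: $matches_raw, decoded: $matches_decoded, ASCII prefix: $matches_ascii\n";
```

Under these assumptions the comparison against raw bytes fails, while the decoded line and the ASCII-only prefix both match, which is consistent with the workaround of dropping 'alfabético'.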
Thanks for the report. We definitely need to fix this.
On Wed Oct 31 11:44:36 2012, explorer@joaquinferrero.com wrote:
> Proposed solution: The search process of the string returned by
> search_perlfunc_re() should consider the encoding used for
> perlfunc.pod.
There's no easy way to tell what encoding a given file is in reliably, so I am wondering if we should have a callback function in POD2::XX like search_perlfunc_re_encoding() which returns a string scalar like "latin1" or "utf8" or whatever is appropriate.

What do you think about that?

Thanks.

Mark
On 2013-01-29 04:49:33, mallen wrote:
> There's no easy way to tell what encoding a given file is in reliably,

This is irrelevant to the issue. Perl source encoding is specified with "use utf8" or "use encoding ...". POD source encoding is specified with "=encoding ...".

> so I am wondering if we should have a callback function in POD2::XX like
> search_perlfunc_re_encoding() which returns a string scalar like
> "latin1" or "utf8" or whatever is appropriate.

POD2::ES has "use utf8" at the beginning, so search_perlfunc_re_encoding() returns a Unicode string (which Perl internals call "utf8"; see the utf8 module). So POD2::ES seems fine. It is Perldoc that must be fixed.

Mark, are you sure that Perldoc correctly processes POD sections after having decoded them from bytes using the encoding specified by "=encoding"?

-- 
Olivier Mengué - http://perlresume.org/DOLMEN
Subject: Re: [rt.cpan.org #80527] perldoc cannot find functions sections when is called with -L switch
Date: Wed, 30 Jan 2013 22:39:56 +0100
To: bug-Pod-Perldoc [...] rt.cpan.org
From: Joaquin Ferrero <explorer [...] joaquinferrero.com>
On 29/01/13 04:49, Mark Allen via RT wrote:
> <URL: https://rt.cpan.org/Ticket/Display.html?id=80527 >
>
> There's no easy way to tell what encoding a given file is in reliably,
> so I am wondering if we should have a callback function in POD2::XX like
> search_perlfunc_re_encoding() which returns a string scalar like
> "latin1" or "utf8" or whatever is appropriate.
>
> What do you think about that?
Other solution:

1) Edit perlfunc.pod and search for the line

       =head2 Alphabetical Listing of Perl Functions

   and add one line below it:

       X<Alphabetical Listing of Perl Functions>

   or

       =for Pod::Functions Alphabetical Listing of Perl Functions

2) Modify the perldoc procedure to search for this line, and not for the =head2 line.

The =head2 heading will be displayed by perldoc, but not the =for line.

With this solution, the translation teams and perldoc don't need the search_perlfunc_re() function anymore :)

Best Regards,
JF^D
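The marker idea above could be sketched roughly as follows. This is only an illustration of the proposal, not existing perldoc behaviour: the helper name find_function_list_start and the sample lines are hypothetical, and the marker text is the one suggested in the message.

```perl
use strict;
use warnings;

# Hypothetical helper: return the index of the first line containing a fixed
# ASCII marker, or -1 if the marker is absent. Because the marker is pure
# ASCII, the comparison does not depend on how the translated heading is encoded.
sub find_function_list_start {
    my ($marker, @lines) = @_;
    for my $i (0 .. $#lines) {
        return $i if index($lines[$i], $marker) >= 0;
    }
    return -1;
}

my $marker = '=for Pod::Functions Alphabetical Listing of Perl Functions';

# Sample lines from an imagined translated perlfunc.pod, as raw UTF-8 bytes.
my @pod = (
    "=head2 Lista de funciones de Perl en orden alfab\xC3\xA9tico\n",
    "\n",
    "$marker\n",
    "\n",
    "=over\n",
);

my $start = find_function_list_start($marker, @pod);
print "marker found at line index $start\n";
```

The design point is that an untranslated ASCII marker matches the same way whether the file is read as bytes or as decoded characters, so the translated heading no longer takes part in the search.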
On Wed Jan 30 10:48:30 2013, DOLMEN wrote:
> So POD2::ES seems fine. It is Perldoc that must be fixed.
>
> Mark, are you sure that Perldoc correctly processes POD sections after
> having decoded them from bytes using the encoding specified by "=encoding"?
Perldoc *doesn't* interpret POD at all. Its jobs are:

1) Locate the appropriate file (or section of perlfunc, etc.)
2) Feed the file to the appropriate formatter
3) Dump the formatted output from step #2 to a pager

So perldoc has no way of correctly interpreting =encoding directives without parsing through the file and looking for them. It's arguable that it *ought* to do that, but historically it hasn't.

That's why I suggested making the encoding of the POD2::XX regex a callback. We could instead *assume* that POD2::XX is encoded in Latin1 unless Encode::is_utf8 returns true. Or vice versa (assume it's utf8 unless is_utf8 returns false).

Other thoughts? Thanks for your help on this.

Mark
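For reference, "parsing through the file and looking for" an =encoding directive could look something like the sketch below. The helper pod_encoding is hypothetical and not part of Pod::Perldoc, and the iso-8859-1 fallback is an assumption made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: return the encoding named by the first "=encoding"
# directive in a POD file, or a fallback when the file has none.
sub pod_encoding {
    my ($path, $fallback) = @_;
    $fallback = 'iso-8859-1' unless defined $fallback;   # assumed default

    # Read raw bytes; the directive itself is plain ASCII.
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    while (my $line = <$fh>) {
        if ($line =~ /^=encoding\s+(\S+)/) {
            close $fh;
            return lc $1;
        }
    }
    close $fh;
    return $fallback;
}
```

A caller could then reopen the file with a matching layer, for example open my $in, "<:encoding($enc)", $path, before searching it with a character-string regex.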
On Wed Oct 31 11:44:36 2012, explorer@joaquinferrero.com wrote:
> In POD2::ES all the docs are UTF-8-encoded. As a temporary
> solution, we have fixed this issue by removing characters with
> diacritic marks:
>
> sub search_perlfunc_re {
>     return 'Lista de funciones de Perl en orden';
> }
>
> (removed ‘alfabético’)
OK, I found the problem. When perldoc opens filehandles for "dynamic" POD files (like extracts from perlfunc.pod), it doesn't open them as UTF-8, so we now make sure to do so and add an '=encoding utf8' on top of that. This has the happy side effect of making the full regex with diacritical marks work properly (at least on my local Pod::Perldoc).

Somewhere in the tool chain, you need the latest Pod::Simple and Pod::Text distributions from CPAN, as they have much, much better UTF-8 support in them now.

This is fixed in Pod::Perldoc 3.20, which is headed to CPAN shortly.

Thanks.
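The general shape of such a fix might look like the sketch below. This is not the actual Pod::Perldoc 3.20 change; write_dynamic_pod is a hypothetical helper, and it assumes the extracted POD lines are already character strings.

```perl
use strict;
use warnings;
use utf8;
use File::Temp qw(tempfile);

# Hypothetical helper: write extracted POD lines (character strings) to a
# temporary file as UTF-8 and declare that encoding at the top of the file.
sub write_dynamic_pod {
    my (@pod_lines) = @_;
    my ($fh, $path) = tempfile(SUFFIX => '.pod', UNLINK => 1);
    binmode $fh, ':encoding(UTF-8)';      # encode characters on output
    print {$fh} "=encoding utf8\n\n";     # tell the formatter what to expect
    print {$fh} @pod_lines;
    close $fh or die "Cannot close $path: $!";
    return $path;
}

my $path = write_dynamic_pod("=over\n\n", "=item chr NÚMERO\n\n", "=back\n");
print "wrote $path\n";
```

With the directive in place, a formatter such as Pod::Text has what it needs to decode the extract correctly instead of guessing.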
From: Joaquin Ferrero <explorer [...] joaquinferrero.com>
On Sat Apr 27 01:31:05 2013, mallen wrote:
> This is fixed in Pod::Perldoc 3.20, which is headed to CPAN shortly.
Confirmed.

POD2/ES.pm:

    # String for perldoc with -L switch
    sub search_perlfunc_re {
        return 'Lista de funciones de Perl en orden alfabético';
    }

(I added the word "alfabético", with the "é" utf8 char)

Now, perldoc -f <function> works perfectly:

    $ perldoc -f chr
        chr NÚMERO
        chr     Devuelve el carácter representado por NÚMERO en el conjunto de caracteres. Por ejemplo, "chr(65)" es "A" tanto en ASCII como en Unicode, y chr(0x263a) es una cara sonriente en Unicode.

Thanks!
Subject: Re: [rt.cpan.org #80527] perldoc cannot find functions sections when is called with -L switch
Date: Thu, 06 Feb 2014 14:03:48 +0100
To: bug-Pod-Perldoc [...] rt.cpan.org
From: Joaquín Ferrero <explorer [...] joaquinferrero.com>
On 29/01/13 04:49, Mark Allen via RT wrote:
> There's no easy way to tell what encoding a given file is in reliably,
> so I am wondering if we should have a callback function in POD2::XX like
> search_perlfunc_re_encoding() which returns a string scalar like
> "latin1" or "utf8" or whatever is appropriate.
>
> What do you think about that?
Yes, it's true.

The Spanish PerlDoc team suggested changing all the original English pod documentation to utf8 encoding, but this proposal was not approved. The Spanish versions are all utf8 encoded. Other language translations will be. At the moment, 31 of 169 English pods have the encoding line :)

The good part of this problem is that finding out the encoding of pod documents is easy: all pods are ISO-8859-1, unless the pod has an =encoding tag stating the encoding.

The problem now is to make a regex compatible with these encodings, so perldoc can find the start of the list of functions in perlfunc.

The search_perlfunc_re_encoding() function would read the first lines of perlfunc.pod and show the encoding, but perldoc can perform this operation as well.

Other solution:

1) edit perlfunc.pod, and at line
2) remove all the code about

I will talk with the Spanish PerlDoc team, and we will send you another email.

Best Regards,
JF^D

-- 
Sent from my phone with K-9 Mail.