Subject: POD handling
Date: Mon, 25 Jun 2018 10:31:44 -0700
To: bug-ppr [...] rt.cpan.org
From: Adriano Ferreira <a.r.ferreira [...] gmail.com>
I have hit a couple of issues while handling documents with POD, which I
will try to illustrate below. I am not sure whether or how these issues
complicate the current code optimizations and the overall philosophy of the
module, which is why I am refraining from proposing complete patches.
First, PPR currently cannot handle POD blocks that terminate at end-of-file
rather than at a =cut.
That calls for a change to the <PerlPod> definition, something like
    (?<PerlPod>
        ^ = [^\W\d]\w*+             # A line starting with =<identifier>
        .*?                         # Up to...
        (?:
            ^ = cut \b [^\n]*+ $    # ...the first line starting with =cut
          |
            \z                      # or EOF
        )
    ) # End of rule
( If such a fix goes in, it also needs to apply to the corresponding
<PerlPod> definition in decomment(). )
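To make the effect of the extra \z alternative concrete, here is a minimal standalone sketch. It copies the proposed rule body into an ordinary qr// (outside of PPR, so it says nothing about how the change interacts with the rest of the grammar) and applies it to two made-up inputs. The variable names are only for illustration.

```perl
use strict;
use warnings;

# Standalone copy of the proposed rule body. The /m and /s flags are needed
# so that ^ and $ match at line boundaries and . matches newlines.
my $pod_re = qr{
    ^ = [^\W\d]\w*+             # A line starting with =<identifier>
    .*?                         # Up to...
    (?:
        ^ = cut \b [^\n]*+ $    # ...the first line starting with =cut
      |
        \z                      # or EOF
    )
}xms;

# Illustrative inputs: one POD block closed by =cut, one running to EOF.
my $with_cut    = "=pod\n\nDocs\n\n=cut\nmy \$y = 1;\n";
my $without_cut = "my \$x;\n=pod\n\nTrailing docs\n";

my ($closed) = $with_cut    =~ /($pod_re)/;
my ($open)   = $without_cut =~ /($pod_re)/;

print "closed POD ends at =cut\n"      if $closed =~ /=cut\z/;
print "open POD runs to end of file\n" if $open   =~ /Trailing docs\n\z/;
```

Because .*? is lazy, the =cut alternative still wins whenever a =cut line exists; \z is only reached when none does.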
Along with that, many PPR::GRAMMAR rules allow POD at most once at a given
position, as in
    (?<PerlStatement>
        (?: (?>(?&PerlPod)) (?&PerlOWS) )?+
        (?>
        ...
These should probably be extended to accept zero or more sequences of
<PerlPod> and <PerlOWS>, as in
    (?<PerlStatement>
        (?: (?>(?&PerlPod)) (?&PerlOWS) )*+
        (?>
        ...
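The difference between ?+ and *+ can be seen in a small self-contained sketch. The ToyOWS, ToyPod and ToyStmt rules below are hypothetical, heavily simplified stand-ins for the real PPR rules (they are not PPR's definitions); only the quantifier on the POD group differs between the two patterns.

```perl
use strict;
use warnings;

# Hypothetical toy rules, far simpler than the real PPR ones.
my $defs = qr{
    (?(DEFINE)
        (?<ToyOWS>  (?: \s++ | \# [^\n]*+ )*+ )     # whitespace and comments
        (?<ToyPod>
            ^ = [^\W\d]\w*+ .*?
            (?: ^ = cut \b [^\n]*+ $ | \z )
        )
        (?<ToyStmt> my \s++ \$ \w++ \s*+ ; )        # a single "my" declaration
    )
}xms;

# At most one POD block before the statement (current behaviour).
my $once = qr{
    \A (?&ToyOWS) (?: (?>(?&ToyPod)) (?&ToyOWS) )?+ (?&ToyStmt) (?&ToyOWS) \z
    $defs
}xms;

# Zero or more POD blocks before the statement (suggested behaviour).
my $many = qr{
    \A (?&ToyOWS) (?: (?>(?&ToyPod)) (?&ToyOWS) )*+ (?&ToyStmt) (?&ToyOWS) \z
    $defs
}xms;

my $src = "=pod\n\nPOD #1\n\n=cut\n\n# Comments\n\n=pod\n\nPOD #2\n\n=cut\n\nmy \$x;\n";

my $once_ok = $src =~ $once ? 1 : 0;
my $many_ok = $src =~ $many ? 1 : 0;

print "with ?+ : ", ($once_ok ? "match" : "no match"), "\n";
print "with *+ : ", ($many_ok ? "match" : "no match"), "\n";
```

With two consecutive POD blocks, the ?+ variant consumes only the first one and then fails at the second =pod, while the *+ variant consumes both.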
The most prominent places where this may cause PPR to reject valid Perl
code are <PerlStatement> (mentioned above), <PerlBlock> and
<PerlDocument>, where the current rules miss the fact that sequences of
<PerlPod> and <PerlOWS> are allowed both at the beginning and at the end
of the match. Examples of (possibly incomplete) updates to these rules
would be:
    (?<PerlBlock>
        \{
            # Leading sequences of space and POD
            (?>(?&PerlOWS)) (?: (?>(?&PerlPod)) (?&PerlOWS) )*+
            (?: (?>(?&PerlStatement))
                # Intervening sequences of space and POD
                (?&PerlOWS) (?: (?>(?&PerlPod)) (?&PerlOWS) )*+
            )*+
        \}
    ) # End of rule
    (?<PerlDocument>
        \x{FEFF}?+                          # Optional BOM marker
        # Leading sequences of space and POD
        (?>(?&PerlOWS)) (?: (?>(?&PerlPod)) (?&PerlOWS) )*+
        (?: (?>(?&PerlStatement))
            # Intervening sequences of space and POD
            (?&PerlOWS) (?: (?>(?&PerlPod)) (?&PerlOWS) )*+
        )*+
    ) # End of rule
Admittedly, my suggestions above make the PPR code a lot uglier. That
could be mitigated by creating and reusing a new rule such as the one below
(again, I am not sure about the impact on performance and optimizations).
    (?<PerlPodSeq>
        (?>(?&PerlOWS))
        (?: (?>(?&PerlPod)) (?&PerlOWS) )*+
    ) # End of rule
or maybe
    (?<PerlStatementSeq>
        # Leading sequences of space and POD
        (?>(?&PerlOWS)) (?: (?>(?&PerlPod)) (?&PerlOWS) )*+
        (?: (?>(?&PerlStatement))
            # Intervening sequences of space and POD
            (?&PerlOWS) (?: (?>(?&PerlPod)) (?&PerlOWS) )*+
        )*+
    ) # End of rule
    (?<PerlBlock>
        \{ (?>(?&PerlStatementSeq)) \}
    ) # End of rule
    (?<PerlDocument>
        \x{FEFF}?+                          # Optional BOM marker
        (?>(?&PerlStatementSeq))
    ) # End of rule
These suggestions are about making PPR as forgiving with whitespace and POD
as Perl itself currently is. They would allow PPR to accept text like the
sample below.
=pod
POD #1
=cut
# Comments
=pod
POD #2
=cut
my $x;
=pod
Trailing POD terminated by =cut or eof
=cut
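As a rough check of the refactored shape, the sketch below rebuilds the PerlStatementSeq idea with hypothetical toy rules (ToyOWS, ToyPod, ToyStatement and so on are simplified stand-ins, not PPR's real definitions) and runs it against the sample above, once with the final =cut and once with the trailing POD running to end of file. It only illustrates the structure of the rules and makes no claim about performance inside PPR.

```perl
use strict;
use warnings;

# Hypothetical toy grammar mirroring the suggested rule layout.
my $toy_grammar = qr{
    (?(DEFINE)
        (?<ToyOWS>  (?: \s++ | \# [^\n]*+ )*+ )     # whitespace and comments
        (?<ToyPod>
            ^ = [^\W\d]\w*+ .*?
            (?: ^ = cut \b [^\n]*+ $ | \z )
        )
        (?<ToyStatement>
            my \s++ \$ \w++ \s*+ ;                  # a single "my" declaration
          |
            (?&ToyBlock)                            # or a bare block
        )
        (?<ToyStatementSeq>
            (?>(?&ToyOWS)) (?: (?>(?&ToyPod)) (?&ToyOWS) )*+
            (?: (?>(?&ToyStatement))
                (?&ToyOWS) (?: (?>(?&ToyPod)) (?&ToyOWS) )*+
            )*+
        )
        (?<ToyBlock>    \{ (?>(?&ToyStatementSeq)) \} )
        (?<ToyDocument> \x{FEFF}?+ (?>(?&ToyStatementSeq)) )
    )
}xms;

my $is_document = qr{ \A (?&ToyDocument) \z $toy_grammar }xms;

# The sample text, ending in a POD block that is never closed by =cut.
my $sample_eof = join "\n",
    '=pod', 'POD #1', '=cut',
    '# Comments',
    '=pod', 'POD #2', '=cut',
    'my $x;',
    '=pod', 'Trailing POD terminated by =cut or eof', '';

# The same text with the closing =cut present.
my $sample_cut = $sample_eof . "=cut\n";

# POD inside a block, and an unterminated statement as a negative case.
my $in_block = "{\n=pod\nDoc\n=cut\nmy \$y;\n}\n";
my $broken   = "my \$x\n";

for my $case ([eof => $sample_eof], [cut => $sample_cut],
              [block => $in_block], [broken => $broken]) {
    my ($name, $text) = @$case;
    printf "%-6s %s\n", $name, ($text =~ $is_document ? "accepted" : "rejected");
}
```

Keeping the leading and intervening POD handling in one rule means the block and document rules each need only a single subrule call.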
So I leave this to your consideration. Thanks for PPR.