This queue is for tickets about the Pod-Simple CPAN distribution.

Report information
The Basics
Id: 69390
Status: rejected
Priority: 0/
Queue: Pod-Simple

People
Owner: Nobody in particular
Requestors: florent.angly [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: Wishlist
Broken in: 3.16
Fixed in: (no value)



Subject: POD string in variables not handled properly
I have been trying to parse Perl files that contain POD with Pod::Simple. I have noticed that Pod::Simple fails to properly handle certain cases. If the Perl file contains a variable whose content is POD-formatted, then this POD will be extracted by Pod::Simple although it should not be extracted since it is not a POD section but the content of a variable.

I attached a test file containing POD to reproduce the problem. Run the following and notice that a POD that should not be extracted is extracted:

    perl -MPod::Simple::Text -e "Pod::Simple::Text->filter('/home/floflooo/Desktop/test_euclid/tricky.pl')"

The POD parsing mechanism of perltidy seems to be more robust and able to detect this variable-embedded POD. Running the following properly keeps the POD string in the file:

    perltidy --delete-pod tricky.pl
Subject: tricky.pl
#! /usr/bin/env perl

use strict;
use warnings;

=head1 NAME

Tricky

=cut

print "Starting...\n--------\n";

my $var =<<EOS;

=head1 FAKE_POD_ENTRY_HERE

This should not be extracted as POD since it is the content of a variable

=cut

EOS

print $var;

print "--------\nDone!\n";

exit;

__END__

=head1 SYNOPSIS

Tricky file to test proper POD parsing
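The report can also be checked without the attachment or the one-liner. The following is a minimal sketch, assuming a Perl with Pod::Simple installed; it rebuilds tricky.pl as a string and uses the documented `output_string` and `parse_string_document` methods. The variable names (`$source`, `$text`, `$leaked`) are only illustrative.

```perl
use strict;
use warnings;
use Pod::Simple::Text;

# Rebuild the attached tricky.pl line by line. Each line is a quoted list
# element, so this script itself has no line that starts with "=".
my $source = join( "\n",
    '#! /usr/bin/env perl',
    'use strict;',
    'use warnings;',
    '',
    '=head1 NAME',
    '',
    'Tricky',
    '',
    '=cut',
    '',
    'print "Starting...\n--------\n";',
    '',
    'my $var =<<EOS;',
    '',
    '=head1 FAKE_POD_ENTRY_HERE',
    '',
    'This should not be extracted as POD since it is the content of a variable',
    '',
    '=cut',
    '',
    'EOS',
    '',
    'print $var;',
    'exit;',
) . "\n";

# Render the POD found in $source to plain text, collected in $text.
my $parser = Pod::Simple::Text->new;
my $text   = '';
$parser->output_string( \$text );
$parser->parse_string_document($source);

# If the heredoc body was treated as POD, its heading shows up in the output.
my $leaked = $text =~ /FAKE_POD_ENTRY_HERE/ ? 1 : 0;
print "heredoc content extracted as POD: ", ( $leaked ? "yes" : "no" ), "\n";
```

Checking for the fake heading in the rendered text keeps the sketch independent of how Pod::Simple::Text indents or wraps its output.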
Hi, I have managed to produce a working patch for this issue. Please merge it from https://github.com/theory/pod-simple/pull/25 Regards, Florent
Hi, I would appreciate some feedback on this. I am eager to continue the development of Getopt::Euclid using Pod::Simple for its POD needs, but I cannot go ahead until this bug is resolved. Cheers, Florent
Subject: Re: [rt.cpan.org #69390] [PATCH] POD string in variables not handled properly
Date: Mon, 23 Jan 2012 20:03:11 -0800
To: bug-Pod-Simple [...] rt.cpan.org
From: "David E. Wheeler" <david [...] kineticode.com>
On Jan 17, 2012, at 6:01 PM, Florent Angly via RT wrote:

> I would appreciate some feedback on this.
> I am eager to continue the development of Getopt::Euclid using
> Pod::Simple for its POD needs, but I cannot go ahead until this bug is
> resolved.
The simplest way to work around this issue is to escape the POD directives in the string, like this:

    my $var =<<EOS;

    \=head1 FAKE_POD_ENTRY_HERE

    This should not be extracted as POD since it is the content of a variable

    \=cut

    EOS

That will solve your problem.

I admire your attempt to work around this in code, but I gotta say, I don’t think I want to go there. You’re getting into parsing Perl code, and that’s absolutely fraught with pitfalls.

For example, your patch has this bit:

    + if ( $line =~ m/^\s*<<\s*["']?(\s*[a-z0-9]+)["']?/i ) {
    + # Catch heredocs
    + $ending_re = '^'.$1;

And yet there are also single-< heredocs that would be missed. And this code would catch something like:

    my $string = "<<'FOO";

Which would also be wrong.

So I really think that just escaping the =s in your string-embedded Pod is the way to go.

HTH,

David
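For reference, a small sketch of why the escaping workaround leaves the program's behaviour unchanged, assuming an interpolating heredoc (`<<EOS` or `<<"EOS"`): there, `\=` evaluates to a plain `=`, so the string's runtime value is the same, while the source line now starts with a backslash rather than `=`. In a single-quoted heredoc (`<<'EOS'`) the backslash would stay in the string, so this sketch does not apply there. The `$has_directive` variable is only illustrative.

```perl
use strict;
use warnings;

# Escaped directives: in the source file these lines begin with "\",
# so a line-based POD scanner does not see a command paragraph here.
my $var = <<EOS;

\=head1 FAKE_POD_ENTRY_HERE

This should not be extracted as POD since it is the content of a variable

\=cut

EOS

# At runtime the backslashes are gone and the string holds plain "=head1".
my $has_directive = $var =~ /^=head1 FAKE_POD_ENTRY_HERE$/m ? 1 : 0;
print "runtime value contains '=head1' at line start: $has_directive\n";
```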
Hi David and thanks for the reply.

> The simplest way to work around this issue is to escape the POD
> directives in the string, like this:
>
> my $var =<<EOS;
>
> \=head1 FAKE_POD_ENTRY_HERE
>
> This should not be extracted as POD since it is the content of a
> variable
>
> \=cut
>
> EOS
>
> That will solve your problem.
I agree that this would fix the problem. However, you and I do not necessarily have access to all code that might need this sort of workaround. You certainly agree that it would be more elegant to have Pod::Simple know how to deal with the issue properly instead of requiring authors to use a workaround.

> I admire your attempt to work around this in code, but I gotta say, I
> don’t think I want to go there. You’re getting into parsing Perl code,
> and that’s absolutely fraught with pitfalls.
>
> For example, your patch has this bit:
>
> + if ( $line =~ m/^\s*<<\s*["']?(\s*[a-z0-9]+)["']?/i ) {
> + # Catch heredocs
> + $ending_re = '^'.$1;
>
> And yet there are also single-< heredocs that would be missed. And
> this code would catch something like:
>
> my $string = "<<'FOO";
>
> Which would also be wrong.
Yes, you are entirely right that my patch does not cover all possible cases. Consider it a proof-of-concept that this can be done.

Parsing Perl code is difficult and I prefer to rely on modules dedicated to it when possible. Text::Balanced is great at detecting quoted strings and has methods to catch heredocs:
http://search.cpan.org/~adamk/Text-Balanced-2.02/lib/Text/Balanced.pm#extract_quotelike_and_%22here_documents%22

If it is acceptable for Pod::Simple to depend on Getopt::Euclid, using its features would prove more robust than using my proof-of-concept code.

These rogue POD strings are neither a frequent occurrence nor a trivial problem. So, I understand that you might be reluctant to fix it. However, considering that Pod::Simple is becoming the main Pod parser in Perl, I have high expectations for it.

Best,

Florent
Subject: Re: [rt.cpan.org #69390] [PATCH] POD string in variables not handled properly
Date: Tue, 24 Jan 2012 09:02:00 -0800
To: bug-Pod-Simple [...] rt.cpan.org
From: "David E. Wheeler" <david [...] justatheory.com>
On Jan 23, 2012, at 8:49 PM, Florent Angly via RT wrote:

> I agree that this would fix the problem. However, you and I do not
> necessarily have access to all code that might need this sort of
> workaround. You certainly agree that it would be more elegant to have
> Pod::Simple know how to deal with the issue properly instead of
> requiring authors to use a workaround.
It’s rare enough that I am not worried about it.

> Yes, you are entirely right that my patch does not cover all possible
> cases. Consider it a proof-of-concept that this can be done.
It could only really be done well by using PPI. And we are not about to depend on PPI.

> If it is acceptable for Pod::Simple to depend on Getopt::Euclid, using
> its features would prove more robust than using my proof-of-concept code.
It is not. Pod::Simple is a core-distributed module, and not really in a position to start using modules not distributed with the Perl core (like PPI and Getopt::Euclid).

> These rogue POD strings are neither a frequent occurrence nor a trivial
> problem. So, I understand that you might be reluctant to fix it. However,
> considering that Pod::Simple is becoming the main Pod parser in Perl, I
> have high expectations for it.
It has been the main Pod parser for many years already, and complaints such as this are quite rare, especially considering that all of search.cpan.org and metacpan.org have their docs generated via Pod::Simple.

In short, Pod::Simple is a Pod parser, not a Perl parser, and I think that’s the way it ought to stay.

Best,

David
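To illustrate the kind of pitfall raised earlier in this thread, here is a rough sketch of a naive line-based heredoc detector. The `naive_heredoc_tag` helper and its regex are hypothetical and written only for this illustration; they are not the code from the patch. The sketch suggests how easily such a detector mistakes a string literal or a shift expression for the start of a heredoc.

```perl
use strict;
use warnings;

# Hypothetical naive detector (an illustration, not the regex from the patch):
# anything that looks like <<TAG, <<'TAG' or <<"TAG" is taken as a heredoc.
sub naive_heredoc_tag {
    my ($line) = @_;
    return $line =~ /<<\s*["']?([A-Za-z0-9_]+)["']?/ ? $1 : undef;
}

my @lines = (
    q{my $var =<<EOS;},           # a real heredoc
    q{my $string = "<<'FOO";},    # only a string literal
    q{my $n = 1 << 3;},           # left-shift operator
);

# Record what the detector reports for each line.
my %tag_for;
$tag_for{$_} = naive_heredoc_tag($_) for @lines;

for my $line (@lines) {
    my $tag = $tag_for{$line};
    printf "%-28s => %s\n", $line,
        defined $tag ? "heredoc '$tag'" : 'no heredoc';
}
```

Only the first line is a heredoc, yet the detector reports a terminator for all three. Telling them apart requires knowing whether the `<<` sits inside a string or an expression, which is the job of a Perl parser.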
I can understand your reasoning... Thanks for pointing out PPI to me. It seems like a very powerful tool.

Best,

Florent