This queue is for tickets about the Pod-Simple CPAN distribution.

Report information
The Basics
Id: 43489
Status: resolved
Priority: 0/
Queue: Pod-Simple

People
Owner: Nobody in particular
Requestors: dwheeler [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Verbatim Indents
Date: Fri, 20 Feb 2009 22:22:04 -0800
To: bug-pod-simple [...] rt.cpan.org
From: "David E. Wheeler" <dwheeler [...] cpan.org>
Howdy,

I noticed that when I ran `perldoc -MPod::Simple::XHTML` on a script, verbatim blocks of code were incorrectly indented:

<pre><code>    -- Start transaction and plan the tests.
    BEGIN;
    SELECT plan(1);
    -- Run the tests.
    SELECT pass( &#39;My test passed, w00t!&#39; );
    -- Finish the tests and clean up.
    SELECT * FROM finish();
    ROLLBACK;</code></pre>

This is annoying, because, in HTML output at least, that space gets stuck in front of every single line. It seems to me that, unless someone is using VerbatimFormatted codes, every line of a verbatim block should have that extra spacing stripped out. The only reasonable way I can think of to determine how much white space should be stripped out is to base it off the number of spaces before the first line in a block. The attached patch does this, and updates all the tests to reflect the change.

I assume that anyone using VerbatimFormatted codes, however, knows what they're doing and can strip off such spacing themselves.

I looked in perlpod, and saw no mention of this, but it does seem to me to be the only sane way to handle this. What do you think?

Thanks,

David
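The first-line heuristic described above can be sketched in a few lines of plain Perl. This is only an illustration, not the attached patch: `strip_first_line_indent` is a hypothetical helper name, and it works on a plain list of lines rather than on Pod::Simple's internal paragraph structure.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Pod::Simple): take the leading
# whitespace of the first line and strip it from every line of the block.
sub strip_first_line_indent {
    my @lines = @_;
    return @lines unless @lines;

    # Everything before the first non-whitespace character of line one.
    my ($indent) = $lines[0] =~ /^([ \t]*)/;
    return @lines unless length $indent;

    # \Q...\E makes the indent match literally. Lines that do not start
    # with it (blank lines, for example) are left untouched.
    s/^\Q$indent\E// for @lines;
    return @lines;
}

my @block = (
    "    BEGIN;",
    "    SELECT plan(1);",
    "      -- more deeply indented",
    "",
    "    ROLLBACK;",
);
print "$_\n" for strip_first_line_indent(@block);
```

Lines indented more deeply than the first keep their extra indentation, so nesting inside the block survives.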

Message body is not shown because sender requested not to inline it.

Bump!
The perlpod spec for a Verbatim paragraph is "It should be reproduced exactly..." which means we can't automatically strip spaces for everyone. And, relying on the first line of the code example as an indicator of what to strip is a risky heuristic, someone might have broken a larger code example over several blocks between explanation paragraphs.

But, a feature to strip spaces from verbatim paragraphs can be added as an option that's off by default. Since most code-bases use a consistent indentation strategy, the option can specify how many spaces to remove, something like:

    $new->strip_verbatim_indent(4);

(With sensible checks that it's actually spaces being stripped.)

Allison
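A count-based option with the "sensible checks" mentioned above might look roughly like this. The helper name `strip_n_spaces` is hypothetical, and leaving the whole block alone when any non-blank line lacks the expected spaces is one assumed policy among several possible ones.

```perl
use strict;
use warnings;

# Hypothetical helper: remove exactly $count leading spaces from each line,
# but only if every non-blank line really starts with that many spaces.
sub strip_n_spaces {
    my ($count, @lines) = @_;
    my $prefix = ' ' x $count;

    for my $line (@lines) {
        next unless $line =~ /\S/;                    # blank lines never block stripping
        return @lines if index($line, $prefix) != 0;  # tabs or too few spaces: leave block as-is
    }

    s/^\Q$prefix\E// for @lines;
    return @lines;
}

print "$_\n" for strip_n_spaces(4, "    foo", "      bar", "", "    baz");
```

Checking the whole block before touching any line keeps the relative indentation of a block intact when the check fails.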
Subject: Re: [rt.cpan.org #43489] Verbatim Indents
Date: Thu, 20 Aug 2009 14:04:46 -0700
To: bug-Pod-Simple [...] rt.cpan.org
From: "David E. Wheeler" <dwheeler [...] cpan.org>
On Aug 19, 2009, at 8:33 PM, ARANDAL via RT wrote:

> The perlpod spec for a Verbatim paragraph is "It should be reproduced
> exactly..." which means we can't automatically strip spaces for
> everyone.
It's unfortunate that the spec forces one to use syntax without any semantics and then preserve that syntax as if it were semantic. :-( Is Perl 6's Pod like this?

> And, relying on the first line of the code example as an
> indicator of what to strip is a risky heuristic, someone might have
> broken a larger code example over several blocks between explanation
> paragraphs.
Yes, the code should preserve that number for consecutive verbatim blocks, I agree with you there.

> But, a feature to strip spaces from verbatim paragraphs can be added as
> an option that's off by default. Since most code-bases use a consistent
> indentation strategy, the option can specify how many spaces to remove,
> something like:
>
> $new->strip_verbatim_indent(4);
>
> (With sensible checks that it's actually spaces being stripped.)
Hrm. Ever worked on a project with a bunch of developers? Although Bricolage evolved some style standards for documentation, the number of spaces to use before verbatim code wasn't one of them, and different developers seem to prefer different numbers. It's irritating.

I support the idea of having it off by default and enable-able, with the ability to specify what to strip off. I was thinking of telling it what to strip, so you could do:

    $new->strip_verbatim_indent(' ');

Or

    $new->strip_verbatim_indent("\t");

For example. But I'd also like to be able to support a heuristic. What if I also allow a code ref to be passed that can calculate the heuristic? It would be passed the verbatim text and would return the bit to strip off. So in addition to the above approaches, you could do this to re-create the first-line heuristic in userland:

    $new->strip_verbatim_indent(sub {
        my $para = shift;
        (my $spaces = $para->[2]) =~ s/\S.*//;
        return $spaces;
    });

Does that make sense to you? It could also perhaps do something to determine whether the verbatim blocks are successive, and therefore fall back on the previous return value. The only downside is that the first two elements of the `$para` argument are not the paragraph data. I could shift that stuff off, though, I guess, and shift it back on afterwards.

Thoughts?

Best,

David
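The "string or code ref" idea above can be sketched outside Pod::Simple as follows. `apply_stripper` is a hypothetical name; unlike the `$para` structure discussed above, this sketch hands the callback only the lines themselves, which is an assumption about how the final interface might look.

```perl
use strict;
use warnings;

# Hypothetical helper: $stripper is either a literal string to remove from
# the start of each line, or a code ref that receives an array ref of the
# lines and returns that string.
sub apply_stripper {
    my ($stripper, @lines) = @_;

    my $indent = ref($stripper) eq 'CODE' ? $stripper->(\@lines) : $stripper;
    return @lines unless defined $indent && length $indent;

    s/^\Q$indent\E// for @lines;   # \Q...\E: treat the indent as literal text
    return @lines;
}

# Literal tab indent.
print "$_\n" for apply_stripper("\t", "\tfoo", "\t\tbar");

# First-line heuristic supplied from userland.
my $first_line = sub {
    my $lines = shift;
    (my $spaces = $lines->[0]) =~ s/\S.*//;
    return $spaces;
};
print "$_\n" for apply_stripper($first_line, "  foo", "    bar");
```

Keeping the heuristic in a callback leaves the default behavior untouched while letting each project decide how much guessing it is comfortable with.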
I went ahead and implemented this. Patch attached. I think it covers all the cases. Notably, Pod::Simple already includes successively-indented blocks in single verbatim "paragraphs", so a heuristic already handles blocks that are broken up that way.

What do you think? The code is actually quite simple, most of the patch being tests and documentation.

Best,

David
Index: t/strip_verbatim_indent.t
===================================================================
--- t/strip_verbatim_indent.t	(revision 0)
+++ t/strip_verbatim_indent.t	(revision 0)
@@ -0,0 +1,111 @@
+#!/usr/bin/perl -w
+
+# t/strip_verbatim_indent.t - check verbatim indent stripping feature
+
+BEGIN {
+    chdir 't' if -d 't';
+}
+
+use strict;
+use lib '../lib';
+#use Test::More tests => 71;
+use Test::More 'no_plan';
+
+use_ok('Pod::Simple::XHTML') or exit;
+use_ok('Pod::Simple::XMLOutStream') or exit;
+
+isa_ok my $parser = Pod::Simple::XHTML->new, 'Pod::Simple::XHTML';
+
+ok $parser->strip_verbatim_indent(' '), 'Should be able to set stripper to " "';
+ok $parser->strip_verbatim_indent(' '), 'Should be able to set stripper to " "';
+ok $parser->strip_verbatim_indent("\t"), 'Should be able to set stripper to "\\t"';
+ok $parser->strip_verbatim_indent(sub { ' ' }), 'Should be able to set stripper to coderef';
+
+for my $spec (
+    [
+        "\n=pod\n\n foo bar baz\n",
+        undef,
+        qq{<Document><Verbatim\nxml:space="preserve"> foo bar baz</Verbatim></Document>},
+        "<pre><code> foo bar baz</code></pre>\n\n",
+        'undefined indent'
+    ],
+    [
+        "\n=pod\n\n foo bar baz\n",
+        ' ',
+        qq{<Document><Verbatim\nxml:space="preserve">foo bar baz</Verbatim></Document>},
+        "<pre><code>foo bar baz</code></pre>\n\n",
+        'single space indent'
+    ],
+    [
+        "\n=pod\n\n foo bar baz\n",
+        '  ',
+        qq{<Document><Verbatim\nxml:space="preserve"> foo bar baz</Verbatim></Document>},
+        "<pre><code> foo bar baz</code></pre>\n\n",
+        'too large indent'
+    ],
+    [
+        "\n=pod\n\n  foo bar baz\n",
+        '  ',
+        qq{<Document><Verbatim\nxml:space="preserve">foo bar baz</Verbatim></Document>},
+        "<pre><code>foo bar baz</code></pre>\n\n",
+        'double space indent'
+    ],
+    [
+        "\n=pod\n\n foo bar baz\n",
+        sub { ' ' },
+        qq{<Document><Verbatim\nxml:space="preserve">foo bar baz</Verbatim></Document>},
+        "<pre><code>foo bar baz</code></pre>\n\n",
+        'code ref stripper'
+    ],
+    [
+        "\n=pod\n\n foo bar\n\n baz blez\n",
+        ' ',
+        qq{<Document><Verbatim\nxml:space="preserve">foo bar\n\nbaz blez</Verbatim></Document>},
+        "<pre><code>foo bar\n\nbaz blez</code></pre>\n\n",
+        'single space indent and empty line'
+    ],
+    [
+        "\n=pod\n\n foo bar\n\n baz blez\n",
+        sub { ' ' },
+        qq{<Document><Verbatim\nxml:space="preserve">foo bar\n\nbaz blez</Verbatim></Document>},
+        "<pre><code>foo bar\n\nbaz blez</code></pre>\n\n",
+        'code ref indent and empty line'
+    ],
+    [
+        "\n=pod\n\n foo bar\n\n baz blez\n",
+        sub { (my $s = shift->[0]) =~ s/\S.*//; $s },
+        qq{<Document><Verbatim\nxml:space="preserve">foo bar\n\nbaz blez</Verbatim></Document>},
+        "<pre><code>foo bar\n\nbaz blez</code></pre>\n\n",
+        'heuristic code ref indent'
+    ],
+    [
+        "\n=pod\n\n foo bar\n baz blez\n",
+        sub { s/^\s+// for @{ $_[0] } },
+        qq{<Document><Verbatim\nxml:space="preserve">foo bar\nbaz blez</Verbatim></Document>},
+        "<pre><code>foo bar\nbaz blez</code></pre>\n\n",
+        'militant code ref'
+    ],
+) {
+    my ($pod, $indent, $xml, $xhtml, $desc) = @$spec;
+    # Test XML output.
+    ok my $p = Pod::Simple::XMLOutStream->new, "Construct XML parser to test $desc";
+    $p->hide_line_numbers(1);
+    my $output = '';
+    $p->output_string( \$output );
+    is $indent, $p->strip_verbatim_indent($indent),
+        'Set stripper for XML to ' . (defined $indent ? qq{"$indent"} : 'undef');
+    ok $p->parse_string_document( $pod ), "Parse POD to XML for $desc";
+    is $output, $xml, "Should have expected XML output for $desc";
+
+
+    # Test XHTML output.
+    ok $p = Pod::Simple::XHTML->new, "Construct XHTML parser to test $desc";
+    $p->html_header('');
+    $p->html_footer('');
+    $output = '';
+    $p->output_string( \$output );
+    is $indent, $p->strip_verbatim_indent($indent),
+        'Set stripper for XHTML to ' . (defined $indent ? qq{"$indent"} : 'undef');
+    ok $p->parse_string_document( $pod ), "Parse POD to XHTML for $desc";
+    is $output, $xhtml, "Should have expected XHTML output for $desc";
+}
Index: lib/Pod/Simple.pm
===================================================================
--- lib/Pod/Simple.pm	(revision 370)
+++ lib/Pod/Simple.pm	(working copy)
@@ -67,7 +67,7 @@
 
   'hide_line_numbers', # For some dumping subclasses: whether to pointedly
                        # suppress the start_line attribute
-
+
   'line_count', # the current line number
   'pod_para_count', # count of pod paragraphs seen so far
 
@@ -87,6 +87,7 @@
                 # text up into several events
 
   'preserve_whitespace', # whether to try to keep whitespace as-is
+  'strip_verbatim_indent', # What indent to strip from verbatim
 
   'content_seen', # whether we've seen any real Pod content
   'errors_seen', # TODO: document. whether we've seen any errors (fatal or not)
@@ -98,7 +99,7 @@
   #Called like:
   # $code_handler->($line, $self->{'line_count'}, $self) if $code_handler;
   # $cut_handler->($line, $self->{'line_count'}, $self) if $cut_handler;
-
+
 );
 
 #@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Index: lib/Pod/Simple/BlackBox.pm
===================================================================
--- lib/Pod/Simple/BlackBox.pm	(revision 370)
+++ lib/Pod/Simple/BlackBox.pm	(working copy)
@@ -1369,8 +1369,19 @@
     DEBUG and print " giving verbatim treatment...\n";
 
     $para->[1]{'xml:space'} = 'preserve';
+
+    my $indent = $self->strip_verbatim_indent;
+    if ($indent && ref $indent eq 'CODE') {
+        my @shifted = (shift @{$para}, shift @{$para});
+        $indent = $indent->($para);
+        unshift @{$para}, @shifted;
+    }
+
     for(my $i = 2; $i < @$para; $i++) {
       foreach my $line ($para->[$i]) { # just for aliasing
+        # Strip indentation.
+        $line =~ s/^\Q$indent\E// if $indent
+          && !($self->{accept_codes} && $self->{accept_codes}{VerbatimFormatted});
         while( $line =~
           # Sort of adapted from Text::Tabs -- yes, it's hardwired in that
           # tabs are at every EIGHTH column. For portability, it has to be
Index: lib/Pod/Simple.pod
===================================================================
--- lib/Pod/Simple.pod	(revision 370)
+++ lib/Pod/Simple.pod	(working copy)
@@ -173,9 +173,52 @@
 This returns true if C<$parser> has read from a source, and come to the end
 of that source.
 
+=item C<< $parser->strip_verbatim_indent( I<SOMEVALUE> ) >>
+
+The perlpod spec for a Verbatim paragraph is "It should be reproduced
+exactly...", which means that the whitespace you've used to indent your
+verbatim blocks will be preserved in the output. This can be annoying for
+outputs such as HTML, where that whitespace will remain in front of every
+line. It's an unfortunate case where syntax is turned into semantics.
+
+If the POD you're parsing adheres to a consistent indentation policy, you can
+have such indentation stripped from the beginning of every line of your
+verbatim blocks. This method tells Pod::Simple what to strip. For two-space
+indents, you'd use:
+
+  $parser->strip_verbatim_indent('  ');
+
+For tab indents, you'd use a tab character:
+
+  $parser->strip_verbatim_indent("\t");
+
+If the POD is inconsistent about the indentation of verbatim blocks, but you
+have figured out a heuristic to determine how much a particular verbatim block
+is indented, you can pass a code reference instead. The code reference will be
+executed with one argument, an array reference of all the lines in the
+verbatim block, and should return the value to be stripped from each line. For
+example, if you decide that you're fine to use the first line of the verbatim
+block to set the standard for indentation of the rest of the block, you can
+look at the first line and return the appropriate value, like so:
+
+  $new->strip_verbatim_indent(sub {
+      my $lines = shift;
+      (my $indent = $lines->[0]) =~ s/\S.*//;
+      return $indent;
+  });
+
+If you'd rather treat each line individually, you can do that, too, by just
+transforming them in-place in the code reference and returning C<undef>. Say
+that you don't want I<any> lines indented. You can do something like this:
+
+  $new->strip_verbatim_indent(sub {
+      my $lines = shift;
+      s/^\s+// for @{ $lines };
+      return undef;
+  });
+
 =back
 
-
 =head1 CAVEATS
 
 This is just a beta release -- there are a good number of things still
Index: ChangeLog
===================================================================
--- ChangeLog	(revision 371)
+++ ChangeLog	(working copy)
@@ -7,6 +7,7 @@
 	Add support for an index (TOC) in the XHTML output from
 	David E. Wheeler.
 
+	Add strip_verbatim_indent() from David E. Wheeler.
 
 2009-07-16 Allison Randal <allison@perl.org>
 	* Release 3.08
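For readers who want to try the feature, here is a minimal usage sketch. It assumes an installed Pod::Simple release that includes this change, and it uses only methods that appear in the patch's own test file (`html_header`, `html_footer`, `output_string`, `parse_string_document`, `strip_verbatim_indent`). The sample POD and the four-space indent are made up for illustration.

```perl
use strict;
use warnings;
use Pod::Simple::XHTML;

# A tiny document whose verbatim block is indented by four spaces.
my $pod = "=pod\n\n    my \$x = 1;\n    print \$x;\n";

my $parser = Pod::Simple::XHTML->new;
$parser->html_header('');                 # suppress the surrounding HTML page
$parser->html_footer('');
$parser->strip_verbatim_indent('    ');   # strip four leading spaces per line

my $output = '';
$parser->output_string(\$output);
$parser->parse_string_document($pod);

print $output;
```

Without the `strip_verbatim_indent` call, the four spaces would be reproduced in front of every line inside the generated `<pre>` element.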
On Thu Aug 20 18:48:44 2009, DWHEELER wrote:

> I went ahead and implemented this. Patch attached. I think it covers
> all the cases.
Per IM discussion with Allison, I've committed this change in r372. David