This queue is for tickets about the HTML-Clean CPAN distribution.

Report information
The Basics
Id: 6772
Status: new
Priority: 0/
Queue: HTML-Clean

People
Owner: Nobody in particular
Requestors: cpanbughtmlclean [...] edbateman.com
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: (no value)
Fixed in: (no value)



Subject: Issue with <pre> tags.
When using HTML::Clean, I found that when I put some code inside preformatted tags (<pre></pre>), the module removed some of the line-return characters. Since the content is preformatted, this corrupts the way the page is supposed to be shown, and so it is not a valid optimisation.

In my example, I have:

&lt;code&gt;&lt;span class=&quot;linecomment&quot;&gt;# Perl code here&lt;/span&gt;<br> &lt;span class=&quot;category2&quot;&gt;print&lt;/span&gt; &quot;Hello world!&quot;;&lt;/code&gt;

and this then gets converted to:

&lt;code&gt;&lt;span class=&quot;linecomment&quot;&gt;# Perl code here&lt;/span&gt;&lt;span class=&quot;category2&quot;&gt;print&lt;/span&gt; &quot;Hello world!&quot;;&lt;/code&gt;

There's a line return missing between the comment and the next line. (NB: I added a break tag (br) to make sure the line return is visible.)

Clearly, showing code inside pre tags and then optimising the entire page is a big problem.

Suggested fix: turn off optimisations between pre tags.

Perl version: This is perl, v5.8.2 built for i386-freebsd
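To illustrate the suggested fix, here is a minimal sketch of whitespace stripping that leaves pre regions alone. It is not HTML::Clean's code: the helper name strip_whitespace_outside_pre is hypothetical, and the three substitutions are copied from the whitespace step visible in the patch context later in this ticket. It assumes every <pre> has a matching </pre>; an unclosed region would still be stripped.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::Clean): apply the whitespace
# substitutions everywhere except inside <pre>...</pre> regions.
sub strip_whitespace_outside_pre {
    my ($html) = @_;

    # split with a capturing group keeps the matched <pre> blocks in the
    # result: even indexes are ordinary markup, odd indexes are <pre> regions.
    my @parts = split m{(<pre\b[^>]*>.*?</pre\s*>)}is, $html;

    for my $i (0 .. $#parts) {
        next if $i % 2;                     # leave <pre> regions untouched
        $parts[$i] =~ s,[\r\n]+,\n,sg;      # runs of CR/LF -> single LF
        $parts[$i] =~ s,\s+\n,\n,sg;        # whitespace before a line end
        $parts[$i] =~ s,\n\s+<,\n<,sg;      # indentation before a tag
    }
    return join '', @parts;
}

# Usage: the blank line and indentation inside <pre> survive,
# while the markup around it is compacted.
my $page = "<p>a</p>\n\n   <p>b</p>\n<pre>x\n\n  y</pre>\n\n<p>c</p>";
print strip_whitespace_outside_pre($page), "\n";
```

Splitting on the regions instead of guarding each substitution keeps the existing regexes unchanged, which would make such a change easier to compare against the current behaviour.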
From: Allard Hoeve
Subject: Documenting it helps :)
From: Gunnar Wolf
Many people get bitten by this bug as it stands. I didn't fix the bug, but this patch at least adds a prominent warning that is hard to ignore, and mentions the problem in the documentation.
Index: lib/HTML/Clean.pm
===================================================================
--- lib/HTML/Clean.pm (revision 1171)
+++ lib/HTML/Clean.pm (revision 1172)
@@ -375,6 +375,16 @@
 
 =back
 
+Please note that if your HTML includes preformatted regions (this means, if
+it includes <pre>...</pre>, we do not suggest removing whitespace, as it will
+alter the rendered defaults.
+
+HTML::Clean will print out a warning if it finds a preformatted region and is
+requested to strip whitespace. In order to prevent this, specify that you don't
+want to strip whitespace - i.e.
+
+  $h->strip( {whitespace => 0} );
+
 =cut
 
 use vars qw/
@@ -435,6 +445,17 @@
   }
 
   if ($do_whitespace) {
+    if ($$h =~ /<pre/i) {
+      warn << 'EOF'
+Warning: Stripping whitespace will affect preformatted region\'s layout
+You have a <pre> region in your HTML, which depends on the whitespace not
+being modified. You requested to strip the whitespace - The rendered results
+will be affected.
+
+Hint: Use $h->strip({whitespace => 0}); instead.
+EOF
+    }
+
     $$h =~ s,[\r\n]+,\n,sg; # Carriage/LF -> LF
     $$h =~ s,\s+\n,\n,sg; # empty line
     $$h =~ s,\n\s+<,\n<,sg; # space before tag