
This queue is for tickets about the HTML-Zoom CPAN distribution.

Report information
The Basics
Id: 74964
Status: resolved
Priority: 0/
Queue: HTML-Zoom

People
Owner: Nobody in particular
Requestors: stratman@gmail.com (no email address)
Cc: mst [...] shadowcat.co.uk
AdminCc:

Bug Information
Severity: Normal
Broken in: 0.009006
Fixed in: (no value)



CC: mst [...] shadowcat.co.uk
Subject: (PATCH) HTML-Zoom with plain-text
$zoom->from_html("plain text")->to_html; # returns ""

But I (personally) would expect it to return "plain text". Using HTML::Zoom::Parser::HTML::Parser it returns "plain text" as expected.

Find attached a patch and test file. An identical test file was submitted to HTML::Zoom::Parser::HTML::Parser ( https://github.com/mphill22/HTML-Zoom-Parser-HTML-Parser/pull/1 ).

I was afraid (or too lazy?) to touch the regexp in the parser, so this patch is kind of brutish. There might be a more suitable solution.
Subject: html_zoom_plaintextfix.patch
diff -crB HTML-Zoom-0.009006/lib/HTML/Zoom/Parser/BuiltIn.pm HTML-Zoom-PlainTextFix/lib/HTML/Zoom/Parser/BuiltIn.pm
*** HTML-Zoom-0.009006/lib/HTML/Zoom/Parser/BuiltIn.pm 2011-03-27 09:23:14.000000000 -0500
--- HTML-Zoom-PlainTextFix/lib/HTML/Zoom/Parser/BuiltIn.pm 2012-02-13 13:18:46.000000000 -0600
***************
*** 18,36 ****
  sub _hacky_tag_parser {
    my ($text, $handler) = @_;
!   while (
!     $text =~ m{
!       (
!         (?:[^<]*) < (?:
!             ( / )? ( [^/!<>\s"'=]+ )
!             ( (?:"[^"]*"|'[^']*'|[^/"'<>])+? )?
!         |
!             (!-- .*? -- | ![^\-] .*? )
!         ) (\s*/\s*)? >
!       )
!       ([^<]*)
!     }sxg
!   ) {
      my ($whole, $is_close, $tag_name, $attributes, $is_special,
          $in_place_close, $content)
        = ($1, $2, $3, $4, $5, $6, $7, $8);
--- 18,35 ----
  sub _hacky_tag_parser {
    my ($text, $handler) = @_;
!   my $tag_match = qr{
!     (
!       (?:[^<]*) < (?:
!           ( / )? ( [^/!<>\s"'=]+ )
!           ( (?:"[^"]*"|'[^']*'|[^/"'<>])+? )?
!       |
!           (!-- .*? -- | ![^\-] .*? )
!       ) (\s*/\s*)? >
!     )
!     ([^<]*)
!   }sx;
!   while ($text =~ /$tag_match/g) {
      my ($whole, $is_close, $tag_name, $attributes, $is_special,
          $in_place_close, $content)
        = ($1, $2, $3, $4, $5, $6, $7, $8);
***************
*** 62,67 ****
--- 61,70 ----
        $handler->({ type => 'TEXT', raw => $content });
      }
    }
+   # Special case where you have plain-text (e.g. $text is 'Hello world')
+   if ($text !~ $tag_match && length $text) {
+     $handler->({ type => 'TEXT', raw => $text });
+   }
  }
  sub _hacky_attribute_parser {
Only in HTML-Zoom-PlainTextFix/t: plain_text.t
Subject: plain_text.t
use strictures 1;
use Test::More qw(no_plan);
use HTML::Zoom;
my $zoom = HTML::Zoom->new;
my $plain_text = 'Hello, World!';
is($zoom->from_html($plain_text)->to_html, $plain_text, 'Parser preserves plain-text input');
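
For readers who want to see why tagless input produces no events, here is a minimal standalone sketch that does not load HTML::Zoom. It reuses the pattern from the patch, which requires a literal '<' for every match, so the loop body never runs on plain text. The helper text_events is hypothetical (it is not part of HTML::Zoom) and collects only the trailing text captured in $7, plus the fallback the patch adds; it is an illustration of the reported behaviour, not the module's actual code path.

```perl
use strict;
use warnings;

# Pattern copied from the patch: each match must contain a literal '<'.
my $tag_match = qr{
  (
    (?:[^<]*) < (?:
        ( / )? ( [^/!<>\s"'=]+ )
        ( (?:"[^"]*"|'[^']*'|[^/"'<>])+? )?
    |
        (!-- .*? -- | ![^\-] .*? )
    ) (\s*/\s*)? >
  )
  ([^<]*)
}sx;

# Hypothetical helper, for illustration only: returns the raw text chunks
# that would be emitted as TEXT events for the trailing-content capture,
# plus the plain-text fallback proposed in the patch.
sub text_events {
  my ($text) = @_;
  my @events;
  while ($text =~ /$tag_match/g) {
    my $content = $7;    # text following the tag, up to the next '<'
    push @events, $content if length $content;
  }
  # Fallback from the patch: the input contains no tag at all.
  if ($text !~ $tag_match && length $text) {
    push @events, $text;
  }
  return @events;
}

# Without a '<' the loop never iterates; only the fallback emits anything.
print join('|', text_events('Hello, World!')), "\n";
# With markup present, the loop handles the text and the fallback is skipped.
print join('|', text_events('<p>Hi</p>')), "\n";
```

Removing the fallback block from the sketch makes the first call return an empty list, which corresponds to the empty string reported in this ticket.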