Subject: HTML5 Parsing
Date: Tue, 9 Apr 2013 09:54:21 -0300
To: bug-html-tree [...] rt.cpan.org
From: Cafe Avila Gratz <cafe01 [...] gmail.com>
First of all, thank you for this great module.
Now the issue.
I'm using HTML::TreeBuilder (version 5.03) to parse this HTML snippet:
<header><h1>foo</h1><p>bar</p></header>
The dump() of the resulting tree is:
<html> @0 (IMPLICIT)
  <head> @0.0 (IMPLICIT)
  <body> @0.1 (IMPLICIT)
    <h1> @0.1.0
      "foo"
    <p> @0.1.1
      "bar"
  <header> @0.2
The output of $tree->guts->as_HTML() is:
<div><h1>foo</h1><p>bar<header></header></div>
instead of the expected:
<div><header><h1>foo</h1><p>bar</header></div>
Tested with this code:
use strict;
use warnings;
use HTML::TreeBuilder;
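my $tree = HTML::TreeBuilder->new;
# Keep elements TreeBuilder does not know about, such as <header>.
$tree->ignore_unknown(0);
$tree->parse_content('<header><h1>foo</h1><p>bar</p></header>');
# Print the parsed tree structure, then the serialized content.
$tree->dump;
printf "HTML:\n%s\n", $tree->guts->as_HTML;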
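The same problem can be checked programmatically by listing the children of <header>. This is only a minimal sketch, using the look_down, content_list and tag methods; child_tags_of is a hypothetical helper name of mine, not part of the module:

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper (not part of HTML::TreeBuilder): parse a snippet and
# return the tag names of the element children of the first <$tag> found.
sub child_tags_of {
    my ( $html, $tag ) = @_;

    my $tree = HTML::TreeBuilder->new;
    $tree->ignore_unknown(0);
    $tree->parse_content($html);

    my $el = $tree->look_down( _tag => $tag );

    # content_list returns text nodes as plain strings; keep only elements.
    my @tags;
    @tags = map { $_->tag } grep { ref $_ } $el->content_list if $el;

    $tree->delete;
    return @tags;
}

my @got = child_tags_of( '<header><h1>foo</h1><p>bar</p></header>', 'header' );

# Expected: "h1, p". With the behaviour reported here, the list is empty.
printf "children of <header>: %s\n", join( ', ', @got ) || '(none)';
```

If the parser handled <header> as a container, this should list h1 and p.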
Thank you.
Carlos Fernando Avila Gratz.