Subject: resulting document prints with stray end tags
The following results in an invalid document with stray end tags:
use HTML::HTML5::Parser;
my $parser = HTML::HTML5::Parser->new;
my $doc = $parser->parse_string(<<'EOT');
<!DOCTYPE html>
<html>
<head>
<title>Thing</title>
<meta charset="utf-8">
<link rel="its-rules" href="blah.html">
</head>
<body></body>
</html>
EOT
print $doc->toStringHTML;
Result:
<html xmlns="http://www.w3.org/1999/xhtml"><head>
<title>Thing</title>
<meta charset="utf-8"></meta>
<link rel="its-rules" href="blah.html"></link>
</head>
<body>
With XML::LibXML alone, by contrast, the output is correct:
use XML::LibXML;
my $doc = XML::LibXML->load_html(string => <<'EOT');
<!DOCTYPE html>
<html>
<head>
<title>Thing</title>
<meta charset="utf-8">
<link rel="its-rules" href="blah.html">
</head>
<body></body>
</html>
EOT
print $doc->toStringHTML;
Result:
<!DOCTYPE html>
<html>
<head>
<title>Thing</title>
<meta charset="utf-8">
<link rel="its-rules" href="blah.html">
</head>
<body></body>
</html>
The output serialized from the document returned by HTML::HTML5::Parser does not validate because it contains stray </meta> and </link> end tags, whereas the output from the document returned by XML::LibXML does validate.
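
To check for the symptom mechanically, here is a minimal sketch that scans a serialized string for end tags on HTML void elements (elements such as meta and link that never take an end tag in HTML syntax). The helper name stray_void_end_tags and the sample strings are assumptions made for illustration; they are not part of HTML::HTML5::Parser or XML::LibXML, and a regex scan like this is only a rough check, not a substitute for a validator.

```perl
use strict;
use warnings;

# HTML void elements: these never take an end tag in HTML syntax.
my @VOID = qw(area base br col embed hr img input link meta source track wbr);

# Hypothetical helper (not from any module): return the names of void
# elements that appear with an explicit end tag in the given string.
sub stray_void_end_tags {
    my ($html) = @_;
    my $alt = join '|', @VOID;
    my (%seen, @found);
    while ($html =~ m{</($alt)\s*>}gi) {
        my $name = lc $1;
        # Report each offending element name once, in order of appearance.
        push @found, $name unless $seen{$name}++;
    }
    return @found;
}

# Sample strings modelled on the two results in this report.
my $bad  = '<head><meta charset="utf-8"></meta>'
         . '<link rel="its-rules" href="blah.html"></link></head>';
my $good = '<head><meta charset="utf-8">'
         . '<link rel="its-rules" href="blah.html"></head>';

print "bad:  ", join(', ', stray_void_end_tags($bad)),  "\n";
print "good: ", join(', ', stray_void_end_tags($good)), "\n";
```

In a real test, the string returned by $doc->toStringHTML could be passed to the helper in place of the sample strings.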