Subject: XML::LibXML->parse_* does not detect the BOM
XML::LibXML->parse_* detects encodings (the XML encoding declaration, or
meta charset= for parse_html_*), but it does not take a byte order mark
(BOM) into account.
Examples:
# $s is broken when the input starts with a BOM
my $doc = $parser->parse_html_string($bommed);
my $s = $doc->findvalue('//title');
# $s is a correct string once the BOM is stripped, but this only handles UTF-8, not UTF-16LE/BE
(my $unbommed = $bommed) =~ s/^\xEF\xBB\xBF//s;
my $doc = $parser->parse_html_string($unbommed);
my $s = $doc->findvalue('//title');
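As a possible interim workaround, here is a minimal sketch that also covers UTF-16LE/BE. It uses only the core Encode module; decode_bommed and @BOMS are hypothetical names, not part of XML::LibXML. It assumes the input is a byte string, and it does not distinguish a UTF-32LE BOM (which begins with the same two bytes as UTF-16LE).

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper, not part of XML::LibXML.
# Known BOM byte sequences and the encoding each one implies.
my @BOMS = (
    [ "\xEF\xBB\xBF", 'UTF-8'    ],
    [ "\xFF\xFE",     'UTF-16LE' ],
    [ "\xFE\xFF",     'UTF-16BE' ],
);

# Takes a byte string. If it starts with a known BOM, strips the BOM,
# decodes the rest, and returns (decoded text, encoding name).
# Otherwise returns (original bytes, undef).
sub decode_bommed {
    my ($bytes) = @_;
    for my $entry (@BOMS) {
        my ($bom, $encoding) = @$entry;
        if (index($bytes, $bom) == 0) {
            my $rest = substr($bytes, length $bom);
            return (decode($encoding, $rest), $encoding);
        }
    }
    return ($bytes, undef);
}

my ($text, $encoding) = decode_bommed("\xFF\xFE" . "h\x00i\x00");
print defined $encoding ? "$encoding: $text\n" : "no BOM found\n";
```

The decoded text could then be passed on to the parser. How parse_html_string treats an already decoded string should be checked against the XML::LibXML documentation.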
Could you fix it?