
This queue is for tickets about the XML-LibXML CPAN distribution.

Report information
The Basics
Id: 58024
Status: resolved
Priority: 0/
Queue: XML-LibXML

People
Owner: Nobody in particular
Requestors: milu71 [...] googlemail.com
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 1.70
Fixed in: (no value)



Subject: XML::LibXML->new, recover flag, suppress warnings
As reported today on the Perl-XML mailing list:

http://aspn.activestate.com/ASPN/Mail/Message/Perl-XML/3862042

In XML::LibXML, warnings are not suppressed when specifying the recover or recover_silently flags as per the following excerpt from the manpage:

--------
recover
    /parser, html, reader/

    recover from errors; possible values are 0, 1, and 2

    A true value turns on recovery mode which allows one to parse broken XML or HTML data. The recovery mode allows the parser to return the successfully parsed portion of the input document. This is useful for almost well-formed documents, where for example a closing tag is missing somewhere. Still, XML::LibXML will only parse until the first fatal (non-recoverable) error occurs, reporting recoverable parsing errors as warnings. To suppress even these warnings, use recover=>2.
--------

http://search.cpan.org/dist/XML-LibXML/lib/XML/LibXML/Parser.pod

Here's a test case to evidence the behaviour:

# use strict;
# use warnings;
# use utf8;
use XML::LibXML;

my $txt = <<'EOS';
<div>
<a href="/app/search?op=list&type=50">eins</a>
<!-- HTML parser error : htmlParseEntityRef: expecting ';' -->
</div>
EOS

my $prsr = XML::LibXML->new(
    # see perldoc XML::LibXML::Parser
    recover => 2, # makes parser go on despite errors
    # suppress_warnings => 1, # doesn't shut the warning off
    # suppress_errors => 1, # not either
);

my $dom = $prsr->load_html( string => $txt );
print $dom->toString( 1 );
print "$_\n" for
    XML::LibXML::LIBXML_DOTTED_VERSION,  # 2.7.6 in my case
    XML::LibXML::LIBXML_VERSION,         # 20706
    XML::LibXML::LIBXML_RUNTIME_VERSION; # 20707 yeah, I know ;-)

Subject: Re: [rt.cpan.org #58024] AutoReply: XML::LibXML->new, recover flag, suppress warnings
Date: Tue, 1 Jun 2010 23:03:56 +0200
To: Bugs in XML-LibXML via RT <bug-XML-LibXML [...] rt.cpan.org>
From: Michael Ludwig <milu71 [...] gmx.de>

Here's a better test case using Test::More:

use strict;
use warnings;
use utf8;
use XML::LibXML;
use Test::More tests => 2;

my $txt = <<'EOS';
<div>
<a href="milu?a=eins&b=zwei"> ampersand not URL-encoded </a>
<!-- HTML parser error : htmlParseEntityRef: expecting ';' -->
</div>
EOS

my %opt = (
    # see perldoc XML::LibXML::Parser
    recover => 1, # makes parser go on despite errors
    # suppress_warnings => 1, # doesn't shut the warning off
    # suppress_errors => 1, # not either
);

my( $fh, $buf );
{
    open $fh, '>', \$buf; # open filehandle to scalar variable
    local *STDERR = $fh;  # redirect STDERR there
    XML::LibXML->new( %opt )->load_html( string => $txt );
    close $fh;            # warning now in scalar variable
    like $buf, qr/htmlParseEntityRef:/, 'warning emitted';

    open $fh, '>', \$buf; # new filehandle, clears buffer
    $opt{recover} = 2;    # suppress warnings
    XML::LibXML->new( %opt )->load_html( string => $txt );
    close $fh;
    is $buf, '', 'no warning emitted';
}

--
Michael Ludwig

Hi.

Thanks for the report. I've integrated your test code into t/49_load_html.t and there's a fix here:

https://bitbucket.org/shlomif/perl-xml-libxml

It will be uploaded to CPAN later.

Regards,

-- Shlomi Fish