Subject: | Memory leak when parsing HTML documents with errors |
While investigating bug #79118, I found that XML::LibXML leaks memory when parsing HTML documents that contain errors, provided the recover option is *not* set. The HTML parsing code in LibXML.xs should call LibXML_will_die_ctx and destroy the result if it returns true, as the XML parsing code does.
See the attachment for a test script.
Subject: | leaky.pl |
#!/usr/bin/perl
# Feeding XML::LibXML with an invalid file, triggering memory leaks
use strict;
use warnings;
use Devel::Leak;
use XML::LibXML;
check_libxml_memory();
sub check_libxml_memory {
    make_trouble(); # Run once to initialize stuff.
    my $handle;
    my $leaveCount = 0;
    my $enterCount = Devel::Leak::NoteSV($handle);
    print STDERR "ENTER: $enterCount SVs\n";
    make_trouble(); # Trace how loading a bad doc affects memory
    $leaveCount = Devel::Leak::CheckSV($handle);
    print STDERR "\nLEAVE: $leaveCount SVs\n";
}
sub make_trouble {
    my $parser = XML::LibXML->new;
    eval {
        my $doc = $parser->load_html(
            string => '<lkj/>',
        );
    } or warn($@);
    $@ = undef;
}
1;
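
A single ENTER/LEAVE pair can be hard to read, because some growth may come from one-time initialization. As a rough sketch (not part of the original attachment), the variant below repeats the failing parse in batches and prints the SV growth per batch. If the growth scales with the number of runs, that points to a per-parse leak. The helper names parse_bad_html and sv_growth are made up for this illustration, and the sketch assumes the leak is visible as extra SVs, as the attached script does. Devel::Leak::CheckSV also dumps the new SVs to STDERR, so the output is noisy.

```perl
#!/usr/bin/perl
# Sketch: repeat the failing HTML parse and report SV growth per batch.
use strict;
use warnings;
use Devel::Leak;
use XML::LibXML;

# Hypothetical helper: parse the same invalid HTML string $runs times,
# silently discarding the parse errors.
sub parse_bad_html {
    my ($runs) = @_;
    my $parser = XML::LibXML->new;
    for ( 1 .. $runs ) {
        eval { $parser->load_html( string => '<lkj/>' ); 1 };
    }
    return;
}

# Hypothetical helper: return the change in the live SV count caused by
# $runs failing parses.
sub sv_growth {
    my ($runs) = @_;
    my $handle;
    my $before = Devel::Leak::NoteSV($handle);
    parse_bad_html($runs);
    my $after = Devel::Leak::CheckSV($handle);
    return $after - $before;
}

parse_bad_html(1);    # Run once to initialize stuff.
for my $runs ( 10, 100 ) {
    printf STDERR "\n%4d runs: %+d SVs\n", $runs, sv_growth($runs);
}
```

With a fixed parser, both batches would be expected to show roughly the same small growth; a tenfold difference between them would suggest that each failed parse leaves something behind.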