Hi André,
On Tue Aug 21 16:25:56 2012, andre.lang@webrausch.de wrote:
> Hi Shlomi,
>
> I'm using XML::LibXML in an application where I process a lot of HTML
> webpages. Some of these don't contain valid XML, so I set recover => 2.
>
Thanks for the bug report. I'm going to investigate it now.
Regards,
-- Shlomi Fish
> Setting this flag causes a lot of memory leaks when calling
> LibXML::Parser on an invalid file.
>
> I see this behaviour on
> Ubuntu 10.04 LTS
> XML::LibXML 2.004
> Perl 5.14.2, 5.16.1 and 5.17.3.
>
> Older XML::LibXML versions show the same behaviour. I also compiled
> XML::LibXML against different versions of libxml, same problem. On
> Windows, also the same.
>
> I attach a test case in which the leak occurs and is reported by
> Devel::Leak. I supply a snippet of a problematic webpage, containing
> unknown tags and higher Unicode characters (which LibXML doesn't like at
> all, but that may be a different bug to report; I worked around it by
> converting them to entities). Basically, I encounter the leak on any
> page that is not parsable.
>
> So, if you run the supplied libxml2.pl you will see several "new" lines
> indicating a lot of SVs have been leaked.
> If recover is set to 0, there is no leak.
>
> This bug may also be related to the similar bug #61507.
>
> If you need any further information, please let me know how I can help.
>
> Besides, thanks for your module - I switched from XML::XPath, as
> XML::LibXML doesn't usually leak :)
>
> Yours,
> André