Subject: | Parsing HTML files using too much memory |
Hello
A script that worked with an older version of XML::LibXML and libxml2
(unfortunately I can't tell you which versions yet, but I am trying to
find out) no longer works, because XML::LibXML now uses too much
memory.
The test case is attached. xmllint --html processes it very quickly.
But with
perl -MXML::LibXML -e '$d = XML::LibXML->load_html(location => shift,
recover => 2, encoding=>"UTF-8")' file.html
processing never finishes (Perl runs out of memory).
I'll add more information as soon as I can dig it up.
Thank you
Alberto
This is perl 5, version 12, subversion 2 (v5.12.2) built for
darwin-thread-multi-2level on Mac OS X, but I had the same problem on
Linux. I can provide more details if needed.
Subject: | agriculture.html |