Subject: Memory leak in PR parsing
Date: Sun, 29 May 2011 23:24:20 +0200
To: bug-Net-Gnats [...] rt.cpan.org
From: Yves Martin <ymartin1040 [...] gmail.com>
With Net-Gnats-0.06:
When browsing more than a few hundred PRs that include attachments (because
gnatsweb is in use) in the same Perl script, PR parsing becomes much
slower: the script pauses for a while at 100% CPU (probably garbage
collecting) and then continues once the process memory has grown.
Here is what dprofpp outputs:
$ dprofpp tmon.out
Total Elapsed Time = 9.202319 Seconds
User+System Time = 5.202319 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 94.1   4.897  4.895    100   0.0490 0.0489  Net::Gnats::PR::setFromString
 0.38   0.020  0.020      6   0.0033 0.0033  Net::Gnats::connect
 0.38   0.020  0.099     13   0.0015 0.0076  main::BEGIN
 0.38   0.020  0.020    324   0.0001 0.0001  Net::Gnats::_getGnatsdResponse
 0.37   0.019  0.038    318   0.0001 0.0001  Net::Gnats::_doGnatsCmd
 0.19   0.010  0.010      4   0.0025 0.0025  main::decode_all_attachments
 0.19   0.010  0.010      2   0.0050 0.0050  base::import
 0.19   0.010  0.010      3   0.0033 0.0033  Pod::Simple::LinkSection::BEGIN
 0.19   0.010  0.010      5   0.0020 0.0020  warnings::register::import
 0.19   0.010  0.010     17   0.0006 0.0006  Getopt::Long::ParseOptionSpec
 0.19   0.010  0.040      5   0.0020 0.0080  Pod::Text::BEGIN
 0.19   0.010  0.010     23   0.0004 0.0004  vars::import
 0.19   0.010  0.020     12   0.0008 0.0017  Pod::Simple::BEGIN
 0.19   0.010  0.010      7   0.0014 0.0014  Date::Parse::BEGIN
 0.19   0.010  0.010      8   0.0012 0.0012  Net::Gnats::BEGIN
So, to process large volumes (until now, up to 6000 tickets in a single
run), I have rewritten the parsing as a line-by-line loop over an input
stream instead of regex parsing.
First, memory consumption is now only twice the size of the PR; second, it
no longer leaks. I agree that it is strange for a regex to make Perl leak,
but that is the fact (or at least it was in October 2008).
Please find my modified version attached to this email.
I have not compared raw performance, but mine probably runs faster when
a PR spans hundreds of lines.
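To illustrate the idea only, here is a minimal sketch of a line-by-line
parse. It is not the attached code and not part of Net::Gnats: the helper
name parse_pr_by_line, the sample PR text, and the assumption that every
field starts with a ">Name:" line are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Net::Gnats: read GNATS-style PR text
# from a filehandle one line at a time and return a hash ref of fields.
sub parse_pr_by_line {
    my ($fh) = @_;
    my (%field, $current);
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^>([\w-]+):\s*(.*)$/) {
            # A line such as ">Synopsis: text" opens a new field.
            $current = $1;
            $field{$current} = $2;
        }
        elsif (defined $current) {
            # Any other line continues the field that is currently open.
            $field{$current} .= ($field{$current} eq '' ? '' : "\n") . $line;
        }
        # Lines seen before the first field (e.g. a mail header) are skipped.
    }
    return \%field;
}

# Assumed sample input; a real PR would be read from the gnatsd response.
my $text = join "\n",
    '>Number: 42',
    '>Synopsis: Memory leak in PR parsing',
    '>Description:',
    'first line',
    'second line',
    '';

# Open a read-only filehandle on the in-memory string.
open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
my $pr = parse_pr_by_line($fh);
close $fh;

print "$pr->{Number}: $pr->{Synopsis}\n";
```

Only one line is held at a time in addition to the fields collected so
far, so no regex ever has to match against the whole PR text at once.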
Regards
Yves Martin
Message body is not shown because sender requested not to inline it.