
This queue is for tickets about the Net-Gnats CPAN distribution.

Report information
The Basics
Id: 68535
Status: rejected
Priority: 0/
Queue: Net-Gnats

People
Owner: RICHE [...] cpan.org
Requestors: ymartin1040 [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Memory leak in PR parsing
Date: Sun, 29 May 2011 23:24:20 +0200
To: bug-Net-Gnats [...] rt.cpan.org
From: Yves Martin <ymartin1040 [...] gmail.com>
With Net-Gnats-0.06

When browsing more than a few hundred PRs that include attachments (because gnatsweb is in use) in the same Perl script, PR parsing becomes much slower: the script pauses for a while at 100% CPU (probably garbage collecting) and then continues once the process memory has grown.

Here is what dprofpp outputs:

    $ dprofpp tmon.out
    Total Elapsed Time = 9.202319 Seconds
      User+System Time = 5.202319 Seconds
    Exclusive Times
    %Time ExclSec CumulS #Calls sec/call Csec/c  Name
     94.1   4.897  4.895    100   0.0490 0.0489  Net::Gnats::PR::setFromString
     0.38   0.020  0.020      6   0.0033 0.0033  Net::Gnats::connect
     0.38   0.020  0.099     13   0.0015 0.0076  main::BEGIN
     0.38   0.020  0.020    324   0.0001 0.0001  Net::Gnats::_getGnatsdResponse
     0.37   0.019  0.038    318   0.0001 0.0001  Net::Gnats::_doGnatsCmd
     0.19   0.010  0.010      4   0.0025 0.0025  main::decode_all_attachments
     0.19   0.010  0.010      2   0.0050 0.0050  base::import
     0.19   0.010  0.010      3   0.0033 0.0033  Pod::Simple::LinkSection::BEGIN
     0.19   0.010  0.010      5   0.0020 0.0020  warnings::register::import
     0.19   0.010  0.010     17   0.0006 0.0006  Getopt::Long::ParseOptionSpec
     0.19   0.010  0.040      5   0.0020 0.0080  Pod::Text::BEGIN
     0.19   0.010  0.010     23   0.0004 0.0004  vars::import
     0.19   0.010  0.020     12   0.0008 0.0017  Pod::Simple::BEGIN
     0.19   0.010  0.010      7   0.0014 0.0014  Date::Parse::BEGIN
     0.19   0.010  0.010      8   0.0012 0.0012  Net::Gnats::BEGIN

So, to process large volumes (so far up to 6000 tickets in a single run), I have rewritten the parsing as a line-by-line loop over an input stream instead of regex parsing. First, memory consumption is only twice the size of the PR; second, it no longer leaks.

I agree it is strange that a regex makes Perl leak, but that is what happens (or at least did, as of October 2008).

Please find my modified version attached to this email. I have not compared raw performance, but mine probably runs faster when a PR spans hundreds of lines.

Regards
Yves Martin
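The attached patch is not shown on this page, so the following is only a rough sketch of the general idea described above: walking a PR one line at a time instead of running a regex over the whole text. It is written in Python rather than Perl purely for illustration, and it assumes a simplified PR layout in which each field starts on a line of the form `>Name:` and any following lines belong to that field. The function name `parse_pr_stream` and the sample PR are hypothetical and are not part of Net::Gnats.

```python
import io


def parse_pr_stream(stream):
    """Parse a simplified GNATS-style PR from an iterable of lines.

    Assumption: a field begins with '>', its name is made of letters,
    digits and hyphens, and it ends at the first ':'. Everything up to
    the next such line is treated as the field's value.
    """
    fields = {}
    current = None   # name of the field being accumulated
    buffer = []      # lines collected so far for that field

    for line in stream:
        line = line.rstrip("\r\n")
        name, sep, rest = "", "", ""
        if line.startswith(">"):
            name, sep, rest = line[1:].partition(":")
        if sep and name.replace("-", "").isalnum():
            # A new field starts: store the previous one first.
            if current is not None:
                fields[current] = "\n".join(buffer).strip()
            current, buffer = name, [rest.strip()]
        elif current is not None:
            # Continuation line of a multi-line field.
            buffer.append(line)
        # Lines before the first field (mail headers) are skipped.

    if current is not None:
        fields[current] = "\n".join(buffer).strip()
    return fields


# Hypothetical sample PR, for illustration only.
sample = (
    "From: someone\n"
    ">Number: 42\n"
    ">Synopsis: crash on start\n"
    ">Description:\n"
    "line one\n"
    "line two\n"
    ">Fix:\n"
    "Unknown\n"
)
fields = parse_pr_stream(io.StringIO(sample))
print(fields["Description"])
```

Because the loop only holds the current line and the field being built, its working memory stays proportional to the size of one PR, which is the property the report above relies on.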

Message body is not shown because sender requested not to inline it.


Hello,

Could you please let me know the size of the attachments? Or does it not matter? I would like to reproduce the case as a cross-check.

Thanks
Subject: Re: [rt.cpan.org #68535] Memory leak in PR parsing
Date: Sun, 07 Sep 2014 19:19:38 +0200
To: bug-Net-Gnats [...] rt.cpan.org
From: Yves Martin <ymartin1040 [...] gmail.com>
On Mon, 2014-08-18 at 23:43 -0400, Richard Elberger via RT wrote:
> Hello,
> Could you please let me know the sizeof the attachments? Or it doesn't matter?
> I would like to reproduce the case just for cross-check.
> Thanks
Hello,

Sizes of attachments really vary. I would say it does not matter...

My script was used to migrate content to another tracking system, and I remember that the slowness and memory consumption were irregular depending on the project, and not related to the number of tickets in each project. My opinion was that it depends on the number of lines processed.

Regards
Hello,

Thanks for the feedback.

Would it have been helpful to have the capability to serialize the PRs as they came from gnatsd? Right now it keeps everything in memory so it would eventually pop if there is a large query result.

On Sun Sep 07 13:19:49 2014, ymartin1040@gmail.com wrote:
> On Mon, 2014-08-18 at 23:43 -0400, Richard Elberger via RT wrote:
>
> > Hello,
> > Could you please let me know the sizeof the attachments? Or it
> > doesn't matter?
> > I would like to reproduce the case just for cross-check.
> > Thanks
>
> Hello,
>
> Sizes of attachments really vary. I would say it does not matter...
>
> My script was used to migrate content to another tracking system and I
> remember slowness and memory consumption was irregular depending of the
> project, and not related to the number of tickets in each project. My
> opinion was that it depends on the number of lines processed.
>
> Regards
Subject: Re: [rt.cpan.org #68535] Memory leak in PR parsing
Date: Sat, 13 Sep 2014 09:47:35 +0200
To: bug-Net-Gnats [...] rt.cpan.org
From: Yves Martin <ymartin1040 [...] gmail.com>
On Tue, 2014-09-09 at 18:58 -0400, Richard Elberger via RT wrote:
> <URL: https://rt.cpan.org/Ticket/Display.html?id=68535 >
>
> Hello,
> Thanks for the feedback.
> Would it have been helpful to have the capability to serialize the PRs as they came from gnatsd? Right now it keeps everything in memory so it would eventually pop if there is a large query result.
Hello,

I agree with you. My first concern was the Perl regex engine itself, as this code seems to make it really leak when used hundreds of times on thousands of lines in the same run. I have not looked for reports to determine whether it has already been fixed in Perl.

And I have to admit that I have not used Net::Gnats myself for two years. So the decision is up to you.

Regards
Yves
I cannot reproduce the problem. If it persists, please let me know. If you have a test dataset available, that would be great. Thanks.