
This queue is for tickets about the XML-Twig CPAN distribution.

Report information
The Basics
Id: 59683
Status: resolved
Priority: 0/
Queue: XML-Twig

People
Owner: Nobody in particular
Requestors: carcus88 [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 3.32
Fixed in: (no value)



Subject: Memory Leak
When trying to parse multiple XML files in one run I came across a memory leak. See the attached example script and data file. There seems to be no way to stop Perl from using more and more memory. Given enough data, this script will die from memory bloat. I have sought the wisdom of the Perl Monks and they were at a loss. They said to file a bug ticket, so here it is :) I may be doing this all wrong, in which case I would love to be corrected so I can get my program to work as I need it to. I am trying to parse about 85 files of 30+ MB each in a single run, with no success.
Subject: test_10000000_10099999.xml

Message body is not shown because it is too large.

Subject: memory_example.pl
#!/usr/bin/perl -w
use strict;
use XML::Twig;

my $inFile = 'test_10000000_10099999.xml';
if ( ! $inFile ) {
    die("No input file specified");
}
if ( ! -f $inFile ) {
    die("file '$inFile' not found");
}

for(my $x=0;$x<6;++$x) {
    print "Doing $x\n";
    process($inFile);
}
exit 0;

#
# Process the file
#
sub process {
    $inFile =~ /data_(\d+)_(\d+)/;
    my $t= new XML::Twig(
        TwigHandlers=> { BIOG => \&BIOG },
    );
    $t->parsefile( $inFile );
    $t->purge();
    $t->dispose(); # Try to Free memory but does not work...
}

#
# BIOG is XML element we are triggering
#
sub BIOG {
    my ($t, $BIOG)= @_;
    if ( ! checkBiog($BIOG->field('BIOG_NBR')) ) {
        print "Missing ". $BIOG->field('BIOG_NBR') . "\n";
    }
    $t->purge();
    $t->dispose();
    return 1;
}

#
# Check database for ID
#
sub checkBiog {
    my ($biog) = @_;
    return 1;
}
I can't reproduce the bug with perl 5.10.1 (with perl v5.10.1 (*) built for i486-linux-gnu-thread-multi).

From looking at the Perlmonks thread, you are using 5.10, and I suspect a bug in that version of Perl. I can't remember the bug reference; I'll look it up, and in the meantime I am installing perl 5.10.0 on my machine to test your code with it. If you can, please test the code with 5.10.1 or 5.12.1.

BTW, you went a little overboard with the purge/dispose ;--) You only need one call to purge, in the element handler. See attached code.

__
mirod
Subject: memory_example.pl
#!/usr/bin/perl -w
use strict;
use XML::Twig;

my $inFile = 'test_10000000_10099999.xml';
if ( ! $inFile ) {
    die("No input file specified");
}
if ( ! -f $inFile ) {
    die("file '$inFile' not found");
}

#for(my $x=0;$x<6;++$x) {
for my $x (1..10) {
    print "Doing $x\n";
    process($inFile);
}
exit 0;

#
# Process the file
#
sub process {
    $inFile =~ /data_(\d+)_(\d+)/;
    my $t= new XML::Twig(
        TwigHandlers=> { BIOG => \&BIOG },
    );
    $t->parsefile( $inFile );
    # $t->purge();
    # $t->dispose(); # Try to Free memory but does not work...
}

#
# BIOG is XML element we are triggering
#
sub BIOG {
    my ($t, $BIOG)= @_;
    if ( ! checkBiog($BIOG->field('BIOG_NBR')) ) {
        print "Missing ". $BIOG->field('BIOG_NBR') . "\n";
    }
    $t->purge();
    #$t->dispose();
    return 1;
}

#
# Check database for ID
#
sub checkBiog {
    my ($biog) = @_;
    return 1;
}
Indeed I was able to reproduce the bug with 5.10.0. It might be due to RT #56908 "A weak reference to a hash would leak"

__
mirod
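Since the leak in this ticket was reproduced on perl 5.10.0 and not on 5.10.1, a script that parses many files in one run could check the interpreter version at startup. The following is only a minimal sketch of such a guard: the helper name is_leaky_perl and the warning text are hypothetical and are not part of XML::Twig, and the check assumes that 5.10.0 is the only affected release, which this thread suggests but does not prove.

#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Twig): returns 1 when the given
# numeric perl version (same format as $]) is exactly 5.10.0, else 0.
# Defaults to the version of the running interpreter.
sub is_leaky_perl {
    my ($version) = @_;
    $version = $] unless defined $version;
    # $] is 5.010000 for perl 5.10.0 and 5.010001 for perl 5.10.1;
    # compare as a formatted string to avoid floating point surprises.
    return sprintf( '%.6f', $version ) eq '5.010000' ? 1 : 0;
}

# Assumption: warning is enough; a long batch run might prefer to die here.
warn "perl 5.10.0 detected: long XML::Twig runs may leak memory, "
   . "consider upgrading to 5.10.1 or later\n"
    if is_leaky_perl();

On any other perl version the script prints nothing and continues normally.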
Subject: Re: [rt.cpan.org #59683] Memory Leak
Date: Mon, 26 Jul 2010 10:45:39 -0400
To: bug-XML-Twig [...] rt.cpan.org
From: Mark Mitchell <carcus88 [...] gmail.com>
Upgraded to 5.10.1 from ActiveState and the problem has gone away. The script now holds a steady memory footprint around 18 MB where before it would increase a few hundred KB per second.

Thanks so much for your quick reply, I will post the fix to the Perlmonks thread.
Subject: Re: [rt.cpan.org #59683] Memory Leak
Date: Mon, 26 Jul 2010 18:18:24 +0200
To: bug-XML-Twig [...] rt.cpan.org
From: Michel Rodriguez <xmltwig [...] gmail.com>
I usually monitor perlmonks, but I was busy this week and missed the thread, sorry.

It's a good thing you mentioned the version of perl you used, I will add a warning to the docs in the next version of the module.

--
mirod
Subject: Re: [rt.cpan.org #59683] Memory Leak
Date: Mon, 26 Jul 2010 12:26:22 -0400
To: bug-XML-Twig [...] rt.cpan.org
From: Mark Mitchell <carcus88 [...] gmail.com>
It's cool, I'm just glad it was an easy fix. I've been going crazy for a week thinking it was my code or Twig.pm.

Thanks again for the quick response, you just saved me from another week of trying to figure this out.

- Mark