
This queue is for tickets about the XML-Twig CPAN distribution.

Report information
The Basics
Id: 83059
Status: resolved
Priority: 0/
Queue: XML-Twig

People
Owner: Nobody in particular
Requestors: k.tchernov [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: Critical
Broken in: 3.42
Fixed in: 3.45



Subject: Segmentation fault while parsing
Running on Mac OS X with perl v5.12 (I believe it also crashes with perl v5.16), when I parse a list of XML files with XML::Twig one after another, XML::Twig eventually segfaults.

I've attached some sample code with sample inputs.

To reproduce the crash, run "./parse.pl <input dir>", where <input dir> is the directory containing the contents of xmls.zip.

I've also tried the sample code with XML::Parser, and that worked fine without segfaulting.
Subject: xmls.zip
Download xmls.zip
application/zip 2.3m


Subject: parse.pl
#! /usr/bin/perl

use warnings;
use strict;

use XML::Twig;

my $xmlsdir = $ARGV[0];
print "Reading dir $xmlsdir\n";

opendir (my $dir, $xmlsdir);
while (my $file = readdir($dir)) {
    print "$file\n";
    next unless $file =~ /\.xml$/;
    my $t = XML::Twig->new();
    $t->parsefile("$xmlsdir/$file");
}
From: k.tchernov [...] gmail.com
On Wed Jan 30 18:54:25 2013, ktchernov wrote:
> Running on a Mac OS X with perl v5.12 (I believe it also crashes with
> perl v5.16), when I parse a
> list of XML::Twig files one after another, XML::Twig eventually
> segfaults.
>
> I've attached some sample code with sample inputs.
>
> To reproduce the crash, run "./parse.pl <input dir>" where <input dir>
> is the directory
> containing the contents of xmls.zip
>
> I've also tried the sample code with XML::Parser, and that worked fine
> and without segfaulting.
In the example code a new XML::Twig instance is created for each file. If I reuse the same instance (i.e. move the initialisation outside the loop and add a $t->purge at the end of the loop), the same problem occurs.
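For reference, the single-instance variant described above might look roughly like the sketch below. It is an illustration only, not the reporter's actual script: the helper xml_files is a hypothetical name introduced here, and the error check on opendir is an addition that parse.pl does not have.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the attached parse.pl): returns the names of
# the .xml files in a directory, sorted alphabetically.
sub xml_files {
    my ($dirname) = @_;
    opendir(my $dir, $dirname) or die "Cannot open $dirname: $!";
    my @files = sort grep { /\.xml$/ } readdir($dir);
    closedir($dir);
    return @files;
}

# Usage: ./parse_reuse.pl <input dir>
if (@ARGV) {
    require XML::Twig;

    my $xmlsdir = $ARGV[0];
    my $t = XML::Twig->new();    # one instance shared by every file

    for my $file (xml_files($xmlsdir)) {
        print "$file\n";
        $t->parsefile("$xmlsdir/$file");
        $t->purge;               # discard the parsed tree before the next file
    }
}
```

According to the report, this variant still segfaults on the affected perl versions, so it serves as a second reproduction case rather than a fix.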
Have you tried perl 5.16? See also https://rt.cpan.org/Ticket/Display.html?id=83037
From: k.tchernov [...] gmail.com
On Thu Jan 31 02:20:01 2013, ANDK wrote:
> Have you tried perl 5.16? See also
>
> https://rt.cpan.org/Ticket/Display.html?id=83037
That ticket is talking about a large XML file; these ones are not particularly large (5.6MB is the biggest one).

I've just tried ActivePerl 5.16 on Mac OS X and it does not crash with this example.

However, it does crash on Mac with v5.12 and on Linux with v5.14. Perl v5.16 is not present in many OS distributions. Is there any chance of a fix or workaround for the older perl versions?
From: k.tchernov [...] gmail.com
Also worth noting: I came up with this sample script based on a much bigger script that we're using. In that script, adding a $t->purge before $t went out of scope seems to have kept it from segfaulting, although it makes no difference in the sample script that I attached.
Subject: Re: [rt.cpan.org #83059] Segmentation fault while parsing
Date: Fri, 01 Feb 2013 13:42:58 +0100
To: bug-XML-Twig [...] rt.cpan.org
From: mirod <xmltwig [...] gmail.com>
On 02/01/2013 12:14 AM, Konstantin Tchernov via RT wrote:
> Queue: XML-Twig
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=83059 >
>
> On Thu Jan 31 02:20:01 2013, ANDK wrote:
>> Have you tried perl 5.16? See also
>>
>> https://rt.cpan.org/Ticket/Display.html?id=83037
>
> That ticket is talking about a large XML, these ones are not particularly large (5.6MB is the
> biggest one).
>
> I've just tried ActivePerl 5.16 on Mac OS X and it does not crash with this example.
>
> However, it does crash on Mac with v5.12 on and on Linux with v5.14. Perl v5.16 is not present
> in many OS distributions, is there any chance for a fix or workaround for the older perl versions?
If it doesn't crash in 5.16 then the problem is most certainly due to a known bug in Scalar::Util weaken. It seems that using too many weakrefs causes a segfault.

There is a(n undocumented) way of dodging the problem, by turning off weak references:

XML::Twig::_set_weakrefs( 0);

This may cause other problems, like running out of memory, and is really not kosher since that function is normally used only for testing the module itself. That said, it fixed the problem with your data in 5.14.2.

BTW the bug in weakrefs is demonstrated by the code below:

#!/usr/bin/perl

use strict;
use warnings;

use Scalar::Util 'weaken';

# the number of iteration that causes a segmentation fault varies
# on my machine, between 19K and 24K, depending on the version of
# perl I use (and it's not constant for a given version either)
# starting at 5.16 the bug disappears

my $ITER= $ARGV[0] || 24000;

my $head= {};
my $tail= $head;

foreach (1..$ITER)
  { my $new_tail= { p => $tail };
    weaken( $new_tail->{p});
    $tail->{n}= $new_tail;
    $tail= $new_tail;
  }

print "done\n";
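As an illustration, the workaround above could be applied to the reporter's parse.pl only on perls older than 5.16, roughly as sketched below. This is an assumption-laden sketch, not a tested fix: needs_weakref_workaround is a hypothetical helper introduced here, the 5.16 cutoff is taken from the observations in this ticket, and XML::Twig::_set_weakrefs is the undocumented, test-only function named above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Twig): true when the given perl
# version, in the numeric form of $] (e.g. 5.014002), predates 5.16, the
# first version reported in this ticket not to crash.
sub needs_weakref_workaround {
    my ($perl_version) = @_;
    return $perl_version < 5.016 ? 1 : 0;
}

# Usage: ./parse_workaround.pl <input dir>
if (@ARGV) {
    require XML::Twig;

    # Undocumented switch normally used only by the module's own tests.
    # Turning weak references off may increase memory use.
    XML::Twig::_set_weakrefs(0) if needs_weakref_workaround($]);

    my $xmlsdir = $ARGV[0];
    opendir(my $dir, $xmlsdir) or die "Cannot open $xmlsdir: $!";
    while (my $file = readdir($dir)) {
        next unless $file =~ /\.xml$/;
        my $t = XML::Twig->new();
        $t->parsefile("$xmlsdir/$file");
    }
    closedir($dir);
}
```

Gating on the perl version keeps weak references enabled on perls where the crash was not observed, which limits the memory cost mentioned above to the affected versions.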
This looks like the known weakrefs Perl bug. It is now documented in the docs for the module.

-- mirod
-- __ mirod