This queue is for tickets about the XML-Twig CPAN distribution.

Report information
The Basics
Id: 71636
Status: resolved
Priority: 0/
Queue: XML-Twig

People
Owner: Nobody in particular
Requestors: TEAM [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: 3.39
Fixed in: (no value)



Subject: Segfault with medium-sized document (30k elements)
Hi there,

The attached code raises a segfault with perl-5.10.1 and perl-5.14.2.

Seems fine for other XML parsers such as XML::LibXML::SAX and XML::TreeBuilder, and in XML::Twig for smaller documents (although significantly slower than the other parsers).

Output of script on Linux/x64:

$ perl xmlbench.pl
Document is 1350062 chars
Benchmark: timing 10 iterations of tree, twig...
      tree: 10.3706 wallclock secs (10.09 usr + 0.27 sys = 10.36 CPU) @ 0.97/s (n=10)
Segmentation fault

perl -V output:

Summary of my perl5 (revision 5 version 14 subversion 2) configuration:

  Platform:
    osname=linux, osvers=2.6.35-30-generic, archname=x86_64-linux
    uname='linux roku 2.6.35-30-generic #56-ubuntu smp mon jul 11 20:01:08 utc 2011 x86_64 gnulinux '
    config_args='-de -Dprefix=/home/tom/perl5/perlbrew/perls/perl-5.14.2'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.4.5', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /lib/../lib /usr/lib/../lib /lib /usr/lib /usr/lib/x86_64-linux-gnu /lib64 /usr/lib64
    libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
    libc=/lib/libc-2.12.1.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.12.1'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector'

Characteristics of this binary (from libperl):
  Compile-time options: PERL_DONT_CREATE_GVSV PERL_MALLOC_WRAP PERL_PRESERVE_IVUV USE_64_BIT_ALL USE_64_BIT_INT USE_LARGE_FILES USE_PERLIO USE_PERL_ATOF
  Built under linux
  Compiled at Oct 3 2011 02:24:00
  %ENV:
    PERL5LIB="/home/tom/.cpan:/home/tom/.cpan/lib:/home/tom/.cpan/lib/perl5:/home/tom/.cpan/lib/perl5/site_perl:/home/tom/.cpan/lib/perl5/5.10.0/i486-linux-gnu-thread-multi:"
    PERLBREW_HOME="/home/tom/.perlbrew"
    PERLBREW_PATH="/home/tom/perl5/perlbrew/bin:/home/tom/perl5/perlbrew/perls/perl-5.14.2/bin"
    PERLBREW_PERL="perl-5.14.2"
    PERLBREW_ROOT="/home/tom/perl5/perlbrew"
    PERLBREW_VERSION="0.29"
    PERL_LWP_SSL_VERIFY_HOSTNAME="0"
    PERL_MM_USE_DEFAULT="1"
  @INC:
    /home/tom/.cpan
    /home/tom/.cpan/lib
    /home/tom/.cpan/lib/perl5
    /home/tom/.cpan/lib/perl5/site_perl
    /home/tom/.cpan/lib/perl5/5.10.0/i486-linux-gnu-thread-multi
    /home/tom/perl5/perlbrew/perls/perl-5.14.2/lib/site_perl/5.14.2/x86_64-linux
    /home/tom/perl5/perlbrew/perls/perl-5.14.2/lib/site_perl/5.14.2
    /home/tom/perl5/perlbrew/perls/perl-5.14.2/lib/5.14.2/x86_64-linux
    /home/tom/perl5/perlbrew/perls/perl-5.14.2/lib/5.14.2

cheers,

Tom
Subject: xmlbench.pl
#!/usr/bin/perl
use strict;
use warnings;

use XML::Twig;
use XML::TreeBuilder;
use Benchmark qw(:hireswallclock);

my $xml = q{ <root> };
$xml .= q{ <child> <name>test name</name> </child> } for 0..30000;
$xml .= q{ </root> };
warn "Document is " . length($xml) . " chars\n";

timethese(10, {
    twig => sub {
        my $parsed = XML::Twig->parse($xml);
    },
    tree => sub {
        my $parsed = XML::TreeBuilder->new->parse($xml);
    }
});
Subject: Re: [rt.cpan.org #71636] Segfault with medium-sized document (30k elements)
Date: Thu, 13 Oct 2011 10:20:09 +0200
To: bug-XML-Twig [...] rt.cpan.org
From: mirod <xmltwig [...] gmail.com>
Interesting.

The problem kicks in around 18055 elements for me (32-bit perl, gobs of memory on the machine). The trigger is not always the same BTW, 18054 may or may not cause it. The number of children seems to be the only important factor BTW, adding sub elements within the children does not change the trigger.

You don't need a loop, the segfault happens during the DESTROY phase.

The minimum test case I have found is below.

#!/usr/bin/perl
use strict;
use warnings;

use XML::Twig;

my $xml = q{<root>} . q{<child><name>test name</name></child>} x 20000 . q{</root>};
warn "Document is " . length($xml) . " chars\n";

my $parsed = XML::Twig->new->parse($xml);
my $t= XML::Twig->new->parse( '<d/>');
warn "done\n";
exit;

The problem occurs _after_ "done" is printed. So it's in the DESTROY phase, and it doesn't happen unless the second parse is done.

It also doesn't happen when I purge the twig as I create it (i.e. if I add a handler on child that does $t->purge).

So at this point my guess is that either I messed up the cleanup of XML::Parser related objects, or it's a bug in the layers below, either XML::Parser or perl itself.

I'll keep on investigating and let you know.

Thanks for the report.

--
mirod

On 10/12/2011 10:11 PM, Thomas Edward Alexander Molesworth via RT wrote:
> Wed Oct 12 16:11:46 2011: Request 71636 was acted upon.
> Transaction: Ticket created by TEAM
> Queue: XML-Twig
> Subject: Segfault with medium-sized document (30k elements)
> Broken in: 3.39
> Severity: (no value)
> Owner: Nobody
> Requestors: TEAM@cpan.org
> Status: new
> Ticket<URL: https://rt.cpan.org/Ticket/Display.html?id=71636>
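Note: the purge variant mentioned in the reply above (a handler on child that calls purge) might look roughly like the sketch below. It uses the documented `twig_handlers` option and `purge` method of XML::Twig; the `$seen` counter is a hypothetical addition, only there to show that the handler ran. This is an illustration of the idea, not the exact script the maintainer used.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use XML::Twig;

# Same document shape as the minimal test case in the thread.
my $xml = q{<root>} . q{<child><name>test name</name></child>} x 20000 . q{</root>};

my $seen = 0;    # hypothetical counter, not part of the original report

my $twig = XML::Twig->new(
    twig_handlers => {
        # Called each time a <child> element has been completely parsed.
        child => sub {
            my ($t, $elt) = @_;
            $seen++;
            $t->purge;    # discard everything parsed so far, including $elt
        },
    },
);
$twig->parse($xml);

warn "handled $seen child elements\n";
```

Because each child is released as soon as it is parsed, the twig never holds the long run of sibling elements that has to be freed at exit.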
From: vcizek [...] suse.cz
On Thu Oct 13 04:20:43 2011, xmltwig@gmail.com wrote:
> Interesting.
>
> The problem kicks in around 18055 elements for me (32-bit perl, gobs of
> memory on the machine). The trigger is not always the same BTW, 18054
> may or may not cause it. The number of children seems to be the only
> important factor BTW, adding sub elements within the children does not
> change the trigger.
>

On my 64-bit machine the threshold is around 18695.

> You don't need a loop, the segfault happens during the DESTROY phase.
>
> The minimum test case I have found is below.
>
>
> #!/usr/bin/perl
> use strict;
> use warnings;
>
> use XML::Twig;
>
> my $xml = q{<root>} . q{<child><name>test name</name></child>} x 20000 .
> q{</root>};
> warn "Document is " . length($xml) . " chars\n";
>
> my $parsed = XML::Twig->new->parse($xml);
> my $t= XML::Twig->new->parse( '<d/>');
> warn "done\n";
> exit;
>
>
> The problem occurs _after_ "done" is printed. So it's in the DESTROY
> phase, and it doesn't happen unless the second parse is done.

I can reproduce it without the second parse.

> It also doesn't happen when I purge the twig as I create it (ie if I add
> a handler on child that does $t->purge.
>
> So at this point my guess is that either I messed up the cleanup of
> XML::Parser related objects, or its a bug in the layers below, either
> XML::Parser or perl itself.
>

A workaround that works for me is forcing the module not to use weak references, e.g. by setting weakrefs = 0.
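Note: the thread does not spell out where the weakrefs switch lives, so the sketch below does not touch XML::Twig at all. It only illustrates, with the core Scalar::Util module, what a weak reference is in a parent/child structure. The two hashes are a hypothetical stand-in, not XML::Twig's real element layout.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

# Hypothetical two-node tree, not XML::Twig's actual data structure.
my $parent = { name => 'root',  children => [] };
my $child  = { name => 'child', parent   => $parent };
push @{ $parent->{children} }, $child;

# parent -> child stays a strong reference; child -> parent is weakened,
# so the pair does not form a reference-counting cycle.
weaken( $child->{parent} );

print "parent link is weak: ", ( isweak( $child->{parent} ) ? "yes" : "no" ), "\n";

# Dropping the last strong reference frees the parent; perl then sets
# the weak back-link to undef automatically.
undef $parent;
print "parent link after free: ", ( defined $child->{parent} ? "still set" : "undef" ), "\n";
```

With weak references disabled, back-links like this would be ordinary strong references, which changes how and when the tree is freed.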
Subject: Re: [rt.cpan.org #71636] Segfault with medium-sized document (30k elements)
Date: Tue, 06 Dec 2011 19:33:43 +0100
To: bug-XML-Twig [...] rt.cpan.org
From: mirod <xmltwig [...] gmail.com>
On 12/05/2011 03:45 PM, Vita Cizek via RT wrote:
> A workaround that works for me is forcing the module not to use
> weak references. Eg. setting weakrefs = 0.
>

Indeed this would work, since I believe the problem comes from a bug in weaken.

The bug actually seems to be fixed in 5.15.5, so there is hope.

--
mirod
On Tue Dec 06 13:34:24 2011, xmltwig@gmail.com wrote:
> Indeed this would work, since I believe the problem comes from a bug in
> weaken
>
> The bug actually seems to be fixed in 5.15.5, so there is hope.
Do the nodes in an XML::Twig class have references to their siblings? If so, this may be related to <https://rt.perl.org/rt3/Ticket/Display.html?id=44225>, in which case turning off weak references only works by chance.

If that is the case, you could create a DESTROY method to break links between sub-nodes or even free them one-by-one, starting with the innermost, to work around the problem in 5.14 and earlier.

This is just a guess off the top of my head, without actually looking into the problem very far.
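Note: the "break links between sub-nodes" suggestion above can be sketched as an explicit, iterative teardown. Everything here is hypothetical: the `Node` class, its `first_child` and `next_sibling` fields, and the `teardown` function are stand-ins for illustration, and XML::Twig's real element class is organised differently. The same loop could be called from a DESTROY method on the top-level object.

```perl
use strict;
use warnings;

# Hypothetical minimal node class; XML::Twig's real element class differs.
package Node;

sub new {
    my ($class, $name) = @_;
    return bless { name => $name, next_sibling => undef, first_child => undef }, $class;
}

package main;

# Build a root with a long chain of siblings: root -> c0 -> c1 -> ...
my $root = Node->new('root');
my $prev;
for my $i (0 .. 49_999) {
    my $node = Node->new("child$i");
    if ($prev) { $prev->{next_sibling} = $node }
    else       { $root->{first_child}  = $node }
    $prev = $node;
}
undef $prev;

# Unlink nodes one at a time using an explicit work list, so that each
# node is released on its own instead of as part of one long chain.
sub teardown {
    my ($top) = @_;
    my @todo     = ($top);
    my $unlinked = 0;
    while (my $node = pop @todo) {
        for my $link (qw(first_child next_sibling)) {
            my $target = delete $node->{$link};
            push @todo, $target if $target;
        }
        $unlinked++;
    }
    return $unlinked;
}

my $count = teardown($root);
print "unlinked $count nodes\n";
```

The work list keeps the depth of any single free at one node, which is the property the suggestion is after on perls where freeing a long chain in one go is a problem.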
This is a Perl bug, fixed in 5.16+.

--
mirod