
This queue is for tickets about the XML-LibXML CPAN distribution.

Report information
The Basics
Id: 57085
Status: resolved
Priority: 0/
Queue: XML-LibXML

People
Owner: Nobody in particular
Requestors: triddle [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: (no value)
Fixed in: (no value)



Subject: byteConsumed() method of LibXML Reader wraps around 2 gigs of input XML
I got bug #56843 opened for MediaWiki::DumpFile which I've traced back to the LibXML reader used in the module. Specifically the value from byteConsumed() wraps around 2 gigabytes of input XML. Here is an example program:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use XML::LibXML::Reader;

    my $reader = XML::LibXML::Reader->new(location => shift(@ARGV));

    while(1) {
        if ($reader->byteConsumed < 0) {
            die "wrapped to " . $reader->byteConsumed;
        }
        last unless $reader->nextElement('page') == 1;
        print $reader->byteConsumed, "\n";
    }

which will output:

    2147463683
    2147472892
    2147473405
    -2147478169
    wrapped to -2147478169 at ./test.pl line 13.
    foodmotron:00-Playing tyler$

I ran across this issue with Parse::MediaWikiDump which was using XML::Parser - it wrapped in the same place but only on a 32 bit perl; using a 64 bit perl was a valid workaround in that instance. In this instance I'm using a 64 bit Perl but the wrap still happens.

Thanks for the great software! LibXML is fantastic. :-)

Cheers,

Tyler
Hi Tyler,

On Fri Apr 30 11:00:47 2010, TRIDDLE wrote:
> I got bug #56843 opened for MediaWiki::DumpFile which I've traced back
> to the LibXML reader used in the module. Specifically the value from
> byteConsumed() wraps around 2 gigabytes of input XML. Here is an
> example program:
>
> I ran across this issue with Parse::MediaWikiDump which was using
> XML::Parser - it wrapped in the same place but only on a 32 bit perl;
> using a 64 bit perl was a valid workaround in that instance. In this
> instance I'm using a 64 bit Perl but the wrap still happens.
Thanks for the report. In commit 039f4142e129, I have changed the signature of XML::LibXML::Reader::byteConsumed to use the “long” datatype instead of “int”, which should fix it where long is 64-bit. (“long” is what libxml2 returns there, so we cannot do better than that.) See:

https://bitbucket.org/shlomif/perl-xml-libxml

BTW, one can build perl 5 with 64-bit integers, even on 32-bit platforms.

So I'm closing this report. If you still have any problems, please comment on it and it will be re-opened.

Regards,

-- Shlomi Fish
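
For illustration, the wrap in the reported output is what one would expect if a byte offset passes through a signed 32-bit integer. The sketch below is in C because the fix concerns C datatypes. The names `as_int32_counter` and `check_reported_values` are hypothetical helpers, not part of XML::LibXML or libxml2, and the code assumes two's-complement wraparound. Under that assumption, a true offset of 2147489127 bytes would be displayed as the reported -2147478169.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from XML::LibXML or libxml2): returns the value
 * a signed 32-bit counter would show for a given true byte offset, assuming
 * two's-complement wraparound. Only 64-bit arithmetic is used, so no
 * out-of-range conversion takes place in this code itself.
 */
int64_t as_int32_counter(int64_t true_offset)
{
    const int64_t two32 = INT64_C(4294967296);   /* 2^32 */
    int64_t m = true_offset % two32;

    if (m < 0) {
        m += two32;              /* normalise into [0, 2^32) */
    }
    if (m > INT32_MAX) {
        m -= two32;              /* upper half maps to negative values */
    }
    return m;
}

/* Checks the helper against the values printed in the report. */
void check_reported_values(void)
{
    /* Last value printed before the wrap: still below 2^31, unchanged. */
    assert(as_int32_counter(INT64_C(2147473405)) == INT64_C(2147473405));

    /* 2147489127 exceeds INT32_MAX (2147483647), so it shows as negative. */
    assert(as_int32_counter(INT64_C(2147489127)) == INT64_C(-2147478169));
}
```

Calling `check_reported_values()` from a small `main` should return without tripping an assertion. Widening the return type to `long` removes the wrap only where `long` is 64 bits wide, which matches the caveat in the reply above.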