
This queue is for tickets about the MediaWiki-DumpFile CPAN distribution.

Report information
The Basics
Id: 56843
Status: resolved
Priority: 0/
Queue: MediaWiki-DumpFile

People
Owner: triddle [...] cpan.org
Requestors: rj [...] petamem.com
triddle [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 0.1.5
Fixed in: (no value)



Subject: Weird values from $pages->current_byte()
I've tried the drop-in replacement for Parse::MediaWikiDump. In my program, I print a kind of progress report: after every 5000 processed lines, the percentage of total progress so far is displayed:

    if (!(++$count % 5000)) {
        message("$processed_iso: " . $pages->current_byte() * $percent . "%\n", 1);
    }

where $percent is

    my $percent = 100 / $pages->size();

Normally, I get output like

    rus: 2.92736135448719%
    rus: 4.76936761045247%
    rus: 6.34968195870317%
    rus: 7.39906647470754%
    rus: 8.67535165465465%
    rus: 9.96669300660353%
    ...

until somewhere in the 80-90% range, but since switching to the replacement, I get (full paste):

    rus: 2.92736135448719%
    rus: 4.76936761045247%
    rus: 6.34968195870317%
    rus: 7.39906647470754%
    rus: 8.67535165465465%
    rus: 9.96669300660353%
    rus: 11.5383141558069%
    rus: 12.9918106949166%
    rus: 14.2415775333654%
    rus: 15.5382405643828%
    rus: 16.8191182953711%
    rus: 17.400673266953%
    rus: 18.1541610149901%
    rus: 19.0827814735421%
    rus: 20.3977862963271%
    rus: 21.7221259859246%
    rus: 23.1200299353174%
    rus: 24.4145362372213%
    rus: 25.6426070219686%
    rus: 26.9235146474011%
    rus: 27.8204560239691%
    rus: 29.316505714986%
    rus: 30.5565876895689%
    rus: 31.1838623230623%
    rus: 31.7616345651786%
    rus: 32.7580490902927%
    rus: 33.8641585187747%
    rus: 35.0382072349052%
    rus: 36.1364014609105%
    rus: 37.4376173592158%
    rus: 38.6678651147159%
    rus: 39.8824663170689%
    rus: 41.1398620499523%
    rus: -41.4477783735194%
    rus: -40.3617165531779%
    rus: -39.1789125675399%
    rus: -38.0397321226139%
    rus: -36.8653571221788%
    rus: -35.5748742263416%
    rus: -34.5499344538474%
    rus: -34.0245297505949%
    rus: -33.1066716039366%
    rus: -32.0365527071021%
    rus: -31.1068637708056%
    rus: -29.9182803681428%
    rus: -29.1744408033133%
    rus: -28.3387617035213%
    rus: -27.1526152148208%
    rus: -25.9275296428521%
    rus: -24.6540318809368%
    rus: -23.4576795304163%
    rus: -22.1732273560404%
    rus: -20.8804021437284%
    rus: -19.5962855359504%
    rus: -18.3459499425392%
    rus: -17.3420667985523%
    rus: -16.3653910113241%
    rus: -15.3328930146273%
    rus: -14.4566699698825%
    rus: -13.5474778969148%
    rus: -12.5259328891336%
    rus: -11.4827912417906%
    rus: -10.5284432697161%
    rus: -9.52768496291735%
    rus: -8.58304663999153%
    rus: -7.57122617982953%
    rus: -6.57308409515593%
    rus: -5.77433925931634%
    rus: -4.75446264622612%
    rus: -3.83490587892223%
    rus: -2.7434907105121%
    rus: -1.80863875215777%
    rus: -0.785296392112861%
    rus: 0.146929106292996%
    rus: 1.11648605719519%
    rus: 2.027814520134%
    rus: 2.97007417743716%
    rus: 3.46145223656532%
    rus: 4.34219786414162%
    rus: 5.30746231997887%
    rus: 6.20728701860753%
    rus: 6.9644882299226%
    rus: 8.05135000452814%
    rus: 9.00982539895428%
    rus: 9.90386174927619%
    rus: 10.7690126173411%
    rus: 11.6271145977892%
    rus: 12.5404598144208%
    rus: 13.3628079031948%
    rus: 14.2008484105644%
    rus: 15.0716734898314%
    rus: 15.997948219301%
    (finished)

Observe the change in sign around 41%. The file in question is the XML dump of the Russian wiki (20100331), which is around 4.8 GB in size. Probably some overflow?
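The pattern in the paste is what one would expect if the byte offset were held in a signed 32-bit integer somewhere along the way. The following is only a sketch of that arithmetic, not code from either module: `wrap_int32` is a hypothetical helper, and the file size of 5,200,000,000 bytes is an assumption chosen to approximate the roughly 4.8 GB dump. With that size, 2**32 bytes is about 82.6% of the file, which is close to the jump seen between 41.1% and -41.4%.

```perl
use strict;
use warnings;

my $TWO_31 = 2_147_483_648;    # 2**31
my $TWO_32 = 4_294_967_296;    # 2**32

# Hypothetical helper: reinterpret a non-negative offset as a signed
# 32-bit integer, the way a 32-bit C integer type would hold it.
sub wrap_int32 {
    my ($n) = @_;
    $n %= $TWO_32;                              # keep only the low 32 bits
    return $n >= $TWO_31 ? $n - $TWO_32 : $n;   # upper half reads as negative
}

my $size    = 5_200_000_000;   # ASSUMED dump size in bytes
my $percent = 100 / $size;

# Compare the true progress with what a wrapped counter would report.
for my $true_offset (2_100_000_000, 2_200_000_000, 4_300_000_000, 5_100_000_000) {
    printf "true: %5.1f%%  reported: %6.1f%%\n",
        $true_offset * $percent,
        wrap_int32($true_offset) * $percent;
}
```

Under these assumptions, offsets below 2**31 are reported correctly, offsets between 2**31 and 2**32 come out negative, and offsets past 2**32 start again from just above zero, which matches the shape of the output above.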
I've opened bug #57085 against XML-LibXML, which is where this bug originates. I can't find a workaround at this time: the problem persists on 64-bit Perl, and I can't find a way to manually feed the LibXML Reader data that I keep track of inside the module. I think this will have to be resolved upstream, but I'll leave the ticket open until that is done. I'll also add this to the known bugs list in the documentation of MediaWiki::DumpFile::Compat. Thank you for your bug report. Cheers, Tyler
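For a caller that only needs a progress figure, one possible stopgap is to compensate on the calling side. This is a sketch under stated assumptions, not a fix for the underlying problem and not something the module provides: `make_unwrapper` is a hypothetical helper, and it assumes the reported value behaves like a signed 32-bit counter, that the file is read strictly forward, and that fewer than 2**32 bytes are consumed between two readings.

```perl
use strict;
use warnings;

my $TWO_32 = 4_294_967_296;    # 2**32

# Hypothetical helper: returns a closure that converts successive raw
# readings of a wrapped signed 32-bit counter into a growing offset.
sub make_unwrapper {
    my $wraps = 0;    # how many times the counter has wrapped so far
    my $last;         # previous raw reading, undef before the first call
    return sub {
        my ($raw) = @_;
        # When reading forward, the raw value can only drop at a wrap
        # (from just under 2**31 to just above -2**31).
        $wraps++ if defined $last && $raw < $last;
        $last = $raw;
        return $raw + $wraps * $TWO_32;
    };
}

# Example with three raw readings as a wrapped counter might report them.
my $unwrap = make_unwrapper();
print $unwrap->($_), "\n" for (2_100_000_000, -2_094_967_296, 5_032_704);
```

In the reporter's loop, the value passed to the closure would be `$pages->current_byte()`, and the result would be multiplied by `$percent` as before.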
The upstream bug has been acted on and resolved, and the fix is waiting to be released in the next update to XML::LibXML. Details are here: https://rt.cpan.org/Ticket/Display.html?id=57085
XML::LibXML shipped version 1.77 some time ago and it included the type changes that resolve this problem. I set a requirement on this version in the META file as well.
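A script that depends on correct byte offsets could also check the installed version itself at startup. The sketch below is only an illustration: `has_min_version` is a hypothetical helper, built on Perl's standard `require` and `VERSION` method, and is not part of MediaWiki::DumpFile.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $module can be loaded and reports a
# version of at least $min, false if it is missing or too old.
sub has_min_version {
    my ($module, $min) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
    # VERSION() dies when the installed version is lower than $min.
    return eval { require $file; $module->VERSION($min); 1 } ? 1 : 0;
}

if (has_min_version('XML::LibXML', '1.77')) {
    print "XML::LibXML is at least 1.77\n";
} else {
    print "XML::LibXML is missing or older than 1.77\n";
}
```

When a hard failure is acceptable, the shorter form `use XML::LibXML 1.77;` performs the same version check at compile time.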