
This queue is for tickets about the Parse-MediaWikiDump CPAN distribution.

Report information
The Basics
Id: 46054
Status: resolved
Priority: 0/
Queue: Parse-MediaWikiDump

People
Owner: triddle [...] cpan.org
Requestors: niels [...] genomics.dk
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: MediaWikiDump bug? (script included)
Date: Thu, 14 May 2009 00:24:36 +0200 (CEST)
To: bug-Parse-MediaWikiDump [...] rt.cpan.org
From: niels [...] genomics.dk
Sorry I forgot the script.

/Niels

    #!/usr/bin/env perl
    # -*- perl -*-

    use strict;
    use warnings FATAL => qw ( all );

    use Parse::MediaWikiDump;

    my ( $pages, $page, $file, %cats, $key );

    $file = "/home/niels/Wikipedia/enwiki-20090306-pages-articles.xml";
    $pages = Parse::MediaWikiDump::Pages->new( $file );

    while( defined ($page = $pages->next) )
    {
        if ( $page->categories ) {
            map { $cats{ $_ }++ } @{ $page->categories };
        }
    }

    foreach $key ( sort keys %cats )
    {
        print "$key, $cats{ $key }\n";
    }

--------------------------- Original e-mail ----------------------------
Subject: MediaWikiDump bug?
From: niels@genomics.dk
Date: Thu, May 14, 2009 00:11
To: bug-Parse-MediaWikiDump@rt.cpan.org
--------------------------------------------------------------------------

Tyler Riddle,

The script below quickly exits with the error,

    Can't use string ("#REDIRECT [[American Samoa]]{{R ") as a SCALAR ref while "strict refs" in use at /home/biobase/GOFFICE/Software/Package_installs/Perl/lib/perl5/site_perl/5.10.0/Parse/MediaWikiDump.pm line 1066.

when run on the uncompressed version of pages-articles.xml.bz2 on this page: http://download.wikimedia.org/enwiki/20090306. I have perl 5.10 on a Linux system. I apologize if I am using your methods in a wrong way.

Niels Larsen
E-mail: niels@genomics.dk
Skype: niels_larsen_denmark
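For readers unfamiliar with this class of error, here is a minimal sketch of how Perl produces a "Can't use string ... as a SCALAR ref" message. It is not the module's code at line 1066, which is not shown in this ticket. The variable names and the shortened redirect string are made up for illustration. The sketch only shows that dereferencing a plain string where a scalar reference is expected dies under "strict refs".

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical stand-in for page text; not taken from the module.
my $text = "#REDIRECT [[American Samoa]]";

# Correct: dereference a real scalar reference.
my $ref = \$text;
print "via reference: $$ref\n";

# Incorrect: treat the plain string as if it were a reference.
# Under "strict refs" this dies at runtime instead of silently
# looking up a variable named after the string's contents.
my $error = '';
eval {
    my $oops = $$text;
    1;
} or $error = $@;

print "caught: $error";
```

If a value may be either a string or a reference, checking it with `ref()` before dereferencing avoids the fatal error.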
Thank you for the bug report! I'll take a look at this soon and let you know my findings. I may also have a test release that I'll send your way when I resolve the bug - if so it would help if you could test it out and let me know the results.

Tyler
I think it was a simple oversight on my part. Can you please test version 0.91 attached to the ticket and report back success or failure?

I was able to reproduce the error, and by testing a slightly modified version of your script I verified that it is resolved.

Thanks again for the bug report,

Tyler
Download Parse-MediaWikiDump-0.91.tar.gz
application/x-gzip 18.1k


Subject: Re: [rt.cpan.org #46054] MediaWikiDump bug? (script included)
Date: Thu, 14 May 2009 17:19:23 +0200 (CEST)
To: bug-Parse-MediaWikiDump [...] rt.cpan.org
From: niels [...] genomics.dk
Hi Tyler,

Yes, the attached version does not fail, thanks. I started testing all the accessors with a little script (see attached), but I am not sure how much point there is to that.

Btw, $pages->namespaces is mentioned twice in the documentation, with different explanations; probably a typo.

Niels
> <URL: https://rt.cpan.org/Ticket/Display.html?id=46054 >
>
> I think it was a simple oversight on my part. Can you please test version
> 0.91 attached to the ticket and report back success or failure?
>
> I was able to reproduce the error and testing a slightly modified version
> of your script I was able to verify I resolved it.
>
> Thanks again for the bug report,
>
> Tyler
Download parsewiki_test
application/octet-stream 1.8k


Thanks for the quick testing and additional bug report about the documentation. I've fixed that and another bug I found with tighter tests (now all the accessors are at least invoked, if not fully tested).

I'm going to release 0.91 with the additional changes - you may want to upgrade to the 0.91 that comes off CPAN in the next few days so that your locally installed 0.91 is not out of sync with the rest of the world.

I'm going to close this ticket as resolved, thanks again for the report and testing!

Tyler
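As a rough illustration of what "at least invoking" every accessor can look like, the sketch below calls each accessor on an object inside `eval` and records any that die. It does not reproduce the distribution's actual test suite or the attached parsewiki_test script. `My::StubPage` is a hypothetical stand-in class, and the accessor list is an assumption chosen for the example.

```perl
use strict;
use warnings;

# Hypothetical stub standing in for a page object; it is not the
# module's class, and the accessor list is illustrative only.
package My::StubPage;
sub new        { my ($class, %args) = @_; return bless {%args}, $class }
sub title      { $_[0]{title} }
sub categories { $_[0]{categories} }   # array ref, or undef when none

package main;

my @accessors = qw(title categories);
my $page = My::StubPage->new(title => 'Example');   # no categories set

my %failed;
for my $name (@accessors) {
    # Invoke each accessor inside eval so one failure does not hide the rest.
    eval { $page->$name; 1 } or $failed{$name} = $@;
}

if (%failed) {
    print "failed accessors: @{[ sort keys %failed ]}\n";
}
else {
    print "all accessors ran\n";
}
```

A smoke test like this only shows that each call completes without dying; it says nothing about whether the returned values are correct.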