Subject: MediaWikiDump bug? (script included)
Date: Thu, 14 May 2009 00:24:36 +0200 (CEST)
To: bug-Parse-MediaWikiDump [...] rt.cpan.org
From: niels [...] genomics.dk
Sorry, I forgot the script. /Niels
#!/usr/bin/env perl
# -*- perl -*-

use strict;
use warnings FATAL => qw ( all );

use Parse::MediaWikiDump;

my ( $pages, $page, $file, %cats, $key );

# Uncompressed English Wikipedia dump
$file = "/home/niels/Wikipedia/enwiki-20090306-pages-articles.xml";

$pages = Parse::MediaWikiDump::Pages->new( $file );

# Count how many pages fall into each category
while ( defined ( $page = $pages->next ) )
{
    if ( $page->categories ) {
        $cats{ $_ }++ for @{ $page->categories };
    }
}

# Print the categories and their page counts, sorted by name
foreach $key ( sort keys %cats )
{
    print "$key, $cats{ $key }\n";
}
---------------------------- Original e-mail -----------------------------
Subject: MediaWikiDump bug?
From: niels@genomics.dk
Date: Thu, May 14, 2009 00:11
To: bug-Parse-MediaWikiDump@rt.cpan.org
--------------------------------------------------------------------------
Tyler Riddle,
The script below quickly exits with the error

Can't use string ("#REDIRECT [[American Samoa]]{{R ") as a SCALAR ref while "strict refs" in use at /home/biobase/GOFFICE/Software/Package_installs/Perl/lib/perl5/site_perl/5.10.0/Parse/MediaWikiDump.pm line 1066.

when run on the uncompressed version of pages-articles.xml.bz2 from
this page: http://download.wikimedia.org/enwiki/20090306. I have
perl 5.10 on a Linux system. I apologize if I am using your methods
incorrectly.
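
For reference, this is the message Perl prints whenever a plain string is
dereferenced as a scalar reference under "use strict". The sketch below
reproduces that general behaviour in isolation. It says nothing about what
the module does at line 1066, and the variable names and the sample string
are made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical stand-in for a page's text; not taken from the module.
my $text = "#REDIRECT [[American Samoa]]";

# Dereferencing a plain string as a scalar reference dies under strict refs.
# eval catches the exception so the script can continue.
my $value = eval { ${ $text } };
my $error = $@;

print "died: $error" if $error;

# Dereferencing a real scalar reference works as expected.
my $ref = \$text;
print "ok: ", ${ $ref }, "\n";
```

So the error suggests that some code expected a reference to the page text
but received the text itself; where that happens would have to be checked in
the module.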
Niels Larsen
E-mail: niels@genomics.dk
Skype: niels_larsen_denmark