
This queue is for tickets about the Parse-MediaWikiDump CPAN distribution.

Report information
The Basics
Id: 17279
Status: resolved
Priority: 0/
Queue: Parse-MediaWikiDump

People
Owner: Nobody in particular
Requestors: jasonspiro [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Re: Feature request: parse dump files with history
Date: Wed, 25 Jan 2006 14:03:36 -0800
To: bug-Parse-MediaWikiDump [...] rt.cpan.org
From: Jason Spiro <jasonspiro [...] gmail.com>
No, don't worry, free software is a hobby, not an obligation, and it was kind of you to write your module in the first place, you don't have to apologize. :)

Instead, I'm going to try to write a PHP script with a (long) UPDATE statement that does the anti-vandalism script I mentioned in my earlier message, without parsing the dump and recreating a new one. Unless you know of anyone else who's interested in rewriting P::M.W.D... :)

Cheers,
Jason

On 1/25/06, via RT <bug-Parse-MediaWikiDump@rt.cpan.org> wrote:
> Hi Jason,
>
> Honestly, I'd love it too; unfortunately, it is not going to be an easy task. Parse::MediaWikiDump was my first experience with XML and I think the core of the XML handling logic needs to be re-implemented to properly handle the full history dump files. It might be possible to come up with a non-optimal solution using one of the other XML frameworks, one that might work well enough for your purposes. This could wind up slower than Parse::MediaWikiDump is, though.
>
> My time to implement new features right now is quite limited. XML::Twig is pretty easy to use; I'd recommend taking a look at that to see if you can come up with something easy that will do what you need. Otherwise, full history dump support is on the road map for Parse::MediaWikiDump, but I don't have a time frame on that one.
>
> Sorry I couldn't have been more help,
>
> Tyler Riddle
>
> On Wed Jan 25 03:58:43 2006, jasonspiro@gmail.com wrote:
> > I would love it if Parse::MediaWikiDump could deal with full dumps with history included.
> >
> > Why? Because I would like to do something like what http://wikipedia.org/wiki/German_Wikipedia did to remove vandalism from dumps automatically (I believe there are more details on their algorithm on that page) but Erwin Jurschitza of DirectMedia in Germany uses a closed-source app for that purpose; plus, it's written in Delphi. Parse::MediaWikiDump would be my only hope. :)
Hi Jason,

It is probably far after this feature would have been useful for you but it finally got implemented. Parse::MediaWikiDump 0.90 shipped today and has a new package called Parse::MediaWikiDump::Revisions which can handle the dump files that have more than one revision per article.

Tyler
Subject: Re: [rt.cpan.org #17279] Re: Feature request: parse dump files with history
Date: Thu, 7 May 2009 19:57:30 -0400
To: bug-Parse-MediaWikiDump [...] rt.cpan.org
From: "Jason A. Spiro" <jasonspiro [...] gmail.com>
Hi Tyler. I reread the ticket and I still don't remember why I requested the feature in the first place. But anyway, thank you very much for implementing it. Even though I don't think I ever will, I bet that many other Parse::MediaWikiDump users will use the feature.