Subject: add_module automatically for nonstandard modules/namespaces
One-liner!
perl -MXML::RSS -MLWP::Simple -MData::Dumper -le '$t = get("http://freshmeat.net/backend/fm-releases-themes.rdf"); $x = new XML::RSS; $x->parse($t); for (keys %{$x->{namespaces}}) { next if $_ eq "rdf" || $_ eq "#default" || exists $x->{modules}{$x->{namespaces}{$_}}; $x->add_module(prefix => $_, uri => $x->{namespaces}{$_}) }; $x = new XML::RSS; $x->parse($t); print Dumper $x->{items}[0]'
This registers every namespace declared in the feed as a module, under its own prefix. For example, this feed declares:
xmlns:fm="http://freshmeat.net/backend/fm-releases-0.1.dtd"
And normally, XML::RSS just uses the full URI, http://freshmeat.net/backend/fm-releases-0.1.dtd, as the hash key for the item elements belonging to that module. The code above registers the fm prefix automatically (and then both keys work).
I've not looked into this in a while, but it's something I wanted, so I've saved the code snippet since May, waiting for TODAY to give it to YOU!
More explanation:
I just want to have add_module() called automatically for every nonstandard module in the feed. The really important code is here:
# loop over the namespaces declared in the feed
for my $ns (keys %{$rss->{namespaces}}) {
    # skip the default namespaces and any module already registered
    # ({modules} is keyed by namespace URI)
    next if $ns eq "rdf"
         || $ns eq "#default"
         || exists $rss->{modules}{ $rss->{namespaces}{$ns} };
    $rss->add_module(prefix => $ns, uri => $rss->{namespaces}{$ns});
}
This does not need to invoke LWP. The only practical effect is that instead of someone needing to do this:
my $ns = 'http://freshmeat.net/backend/fm-releases-0.1.dtd';
for my $item ( @{$rss->{items}} ) {
print $item->{$ns}{screenshot_url};
}
They can do this:
for my $item ( @{$rss->{items}} ) {
print $item->{fm}{screenshot_url};
}
The one-liner:
1. fetches the RSS
2. parses the RSS
3. looks for nonstandard namespaces
4. adds each namespace with add_module (which sets up the fm => 'http://freshmeat.net/backend/fm-releases-0.1.dtd' mapping)
5. reparses the XML
6. uses the new RSS
A patch implementing the code near the top would ideally skip the 5th item, and do the 3rd and 4th inside the 2nd. I believe I asked once for this to be included, but the response back was that one should know in advance what modules one is expecting. That is, you can call add_module before parsing the RSS. But a lot of the time, you don't know what namespaces to expect. Maybe a flag in the new constructor, add_modules => 1, or something, would be in order.
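To make the selection step (items 3 and 4) easier to reason about, here is a rough sketch of it as a standalone function that works on plain hashes, so it runs without XML::RSS or a network fetch. The name unregistered_namespaces is made up for this example and is not part of XML::RSS, and the sample hashes are illustrative data, not output from a real parse. It assumes, as in the loop above, that {namespaces} maps prefix => URI and {modules} is keyed by URI.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::RSS): given the prefix => URI map
# of declared namespaces and the URI-keyed map of registered modules,
# return the prefix => URI pairs that still need add_module().
sub unregistered_namespaces {
    my ($namespaces, $modules) = @_;
    my %todo;
    for my $prefix (keys %$namespaces) {
        # skip the default namespaces
        next if $prefix eq 'rdf' || $prefix eq '#default';
        my $uri = $namespaces->{$prefix};
        # skip modules that are already registered
        next if exists $modules->{$uri};
        $todo{$prefix} = $uri;
    }
    return %todo;
}

# Illustrative data only, shaped like the two hashes used above.
my %namespaces = (
    'rdf'      => 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
    '#default' => 'http://purl.org/rss/1.0/',
    'dc'       => 'http://purl.org/dc/elements/1.1/',
    'fm'       => 'http://freshmeat.net/backend/fm-releases-0.1.dtd',
);
my %modules = ('http://purl.org/dc/elements/1.1/' => 'dc');

my %todo = unregistered_namespaces(\%namespaces, \%modules);
print "$_ => $todo{$_}\n" for sort keys %todo;
# prints: fm => http://freshmeat.net/backend/fm-releases-0.1.dtd
```

With a real feed, each pair in %todo would be passed to $rss->add_module(prefix => ..., uri => ...), as in the loop above. Whether that could happen inside parse(), so the reparse is not needed, depends on XML::RSS internals I have not checked.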