Skip Menu |

This queue is for tickets about the XML-RSS CPAN distribution.

Report information
The Basics
Id: 107231
Status: rejected
Priority: 0/
Queue: XML-RSS

People
Owner: Nobody in particular
Requestors: oleg [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: 1.56
Fixed in: (no value)



Subject: Can't parse RSS with xmlns defined
This test script use XML::RSS; use LWP::Simple; my $rss = XML::RSS->new; $rss->parse(get('http://news.ngs.ru/rss/')); __END__ produces error: Modification of non-creatable array value attempted, subscript -1 at /tmp/XML-RSS-1.56/lib/XML/RSS.pm line 914. And with this patch all works good (except tests in the distribution): --- RSS.pm.orig 2015-09-22 14:20:19.540350762 +0600 +++ RSS.pm 2015-09-22 14:20:49.563355439 +0600 @@ -1235,7 +1235,7 @@ my $self = shift; return XML::Parser->new( - Namespaces => 1, + Namespaces => 0, NoExpand => 1, ParseParamEnt => 0, Handlers => {
Hi Oleg, On Tue Sep 22 04:23:04 2015, OLEG wrote: Show quoted text
> This test script > > use XML::RSS; > use LWP::Simple; > > my $rss = XML::RSS->new; > $rss->parse(get('http://news.ngs.ru/rss/')); > __END__ >
Please provide a more self-contained script with strict and warnings and which does not need network access or depends on networked resources. Show quoted text
> produces error: Modification of non-creatable array value attempted, > subscript -1 at /tmp/XML-RSS-1.56/lib/XML/RSS.pm line 914. > > And with this patch all works good (except tests in the distribution): > > --- RSS.pm.orig 2015-09-22 14:20:19.540350762 +0600 > +++ RSS.pm 2015-09-22 14:20:49.563355439 +0600 > @@ -1235,7 +1235,7 @@ > my $self = shift; > > return XML::Parser->new( > - Namespaces => 1, > + Namespaces => 0, > NoExpand => 1, > ParseParamEnt => 0, > Handlers => {
One option is to make the Namespaces parameter overridable in sub new { ... } . A patch for doing that - with tests - will be welcome.
Втр Сен 22 05:46:08 2015, SHLOMIF писал: Show quoted text
> Hi Oleg, > > On Tue Sep 22 04:23:04 2015, OLEG wrote:
> > This test script > > > > use XML::RSS; > > use LWP::Simple; > > > > my $rss = XML::RSS->new; > > $rss->parse(get('http://news.ngs.ru/rss/')); > > __END__ > >
> > Please provide a more self-contained script with strict and warnings > and which does not need network access or depends on networked > resources. >
> > produces error: Modification of non-creatable array value attempted, > > subscript -1 at /tmp/XML-RSS-1.56/lib/XML/RSS.pm line 914. > > > > And with this patch all works good (except tests in the > > distribution): > > > > --- RSS.pm.orig 2015-09-22 14:20:19.540350762 +0600 > > +++ RSS.pm 2015-09-22 14:20:49.563355439 +0600 > > @@ -1235,7 +1235,7 @@ > > my $self = shift; > > > > return XML::Parser->new( > > - Namespaces => 1, > > + Namespaces => 0, > > NoExpand => 1, > > ParseParamEnt => 0, > > Handlers => {
> > One option is to make the Namespaces parameter overridable in sub new > { ... } . A patch for doing that - with tests - will be welcome.
Sorry for the report. Looks like this feed is not compatible with RSS 2.0 specification, which says: The elements defined in this document are not themselves members of a namespace, so that RSS 2.0 can remain compatible with previous versions in the following sense -- a version 0.91 or 0.92 file is also a valid 2.0 file. If the elements of RSS 2.0 were in a namespace, this constraint would break, a version 0.9x file would not be a valid 2.0 file.
Чтв Сен 24 07:21:00 2015, OLEG писал: Show quoted text
> Втр Сен 22 05:46:08 2015, SHLOMIF писал:
> > Hi Oleg, > > > > On Tue Sep 22 04:23:04 2015, OLEG wrote:
> > > This test script > > > > > > use XML::RSS; > > > use LWP::Simple; > > > > > > my $rss = XML::RSS->new; > > > $rss->parse(get('http://news.ngs.ru/rss/')); > > > __END__ > > >
> > > > Please provide a more self-contained script with strict and warnings > > and which does not need network access or depends on networked > > resources. > >
> > > produces error: Modification of non-creatable array value > > > attempted, > > > subscript -1 at /tmp/XML-RSS-1.56/lib/XML/RSS.pm line 914. > > > > > > And with this patch all works good (except tests in the > > > distribution): > > > > > > --- RSS.pm.orig 2015-09-22 14:20:19.540350762 +0600 > > > +++ RSS.pm 2015-09-22 14:20:49.563355439 +0600 > > > @@ -1235,7 +1235,7 @@ > > > my $self = shift; > > > > > > return XML::Parser->new( > > > - Namespaces => 1, > > > + Namespaces => 0, > > > NoExpand => 1, > > > ParseParamEnt => 0, > > > Handlers => {
> > > > One option is to make the Namespaces parameter overridable in sub new > > { ... } . A patch for doing that - with tests - will be welcome.
> > Sorry for the report. Looks like this feed is not compatible with RSS > 2.0 specification, which says: > The elements defined in this document are not themselves members of a > namespace, so that RSS 2.0 can remain compatible with previous > versions in the following sense -- a version 0.91 or 0.92 file is also > a valid 2.0 file. If the elements of RSS 2.0 were in a namespace, this > constraint would break, a version 0.9x file would not be a valid 2.0 > file.
I thought a little more about this issue. Actually all RSS clients which I tried can successfully parse RSS feed which I mentioned above, but XML::RSS can't. This is the main reason why they don't want to fix that feed. So, why XML::RSS can't just ignore default namespace as other clients do? This patch does just that. All tests passed successfully. Can you apply it and make new release?
Subject: XML-RSS.patch
diff -Naru old/XML-RSS-1.56/lib/XML/RSS.pm new/XML-RSS-1.56/lib/XML/RSS.pm --- old/XML-RSS-1.56/lib/XML/RSS.pm 2014-12-04 22:02:56.000000000 +0600 +++ new/XML-RSS-1.56/lib/XML/RSS.pm 2015-10-12 18:32:34.256889920 +0600 @@ -933,6 +933,12 @@ # beginning of RSS 0.91 if ($el eq 'rss') { + if (!$self->{rss_namespace} && grep { $_ eq '#default' } $parser->current_ns_prefixes) { + # actually RSS may not has default namespace + # so, try to ignore it + $self->{rss_namespace} = $parser->expand_ns_prefix('#default'); + } + if (exists($attribs{version})) { $self->{_internal}->{version} = $attribs{version}; } diff -Naru old/XML-RSS-1.56/t/2.0-parse-w-ns.t new/XML-RSS-1.56/t/2.0-parse-w-ns.t --- old/XML-RSS-1.56/t/2.0-parse-w-ns.t 1970-01-01 07:00:00.000000000 +0700 +++ new/XML-RSS-1.56/t/2.0-parse-w-ns.t 2015-10-12 18:27:02.394291637 +0600 @@ -0,0 +1,110 @@ +use strict; +use warnings; +use Test::More; + +use constant RSS_VERSION => "2.0"; +use constant RSS_CHANNEL_TITLE => "Example 2.0 Channel"; + +use constant RSS_DOCUMENT => qq(<?xml version="1.0"?> +<rss xmlns="http://backend.userland.com/rss2" version="2.0"> + <channel> + <title>Example 2.0 Channel</title> + <link>http://example.com/</link> + <description>To lead by example</description> + <language>en-us</language> + <copyright>All content Public Domain, except comments which remains copyright the author</copyright> + <managingEditor>editor\@example.com</managingEditor> + <webMaster>webmaster\@example.com</webMaster> + <docs>http://backend.userland.com/rss</docs> + <category domain="http://www.dmoz.org">Reference/Libraries/Library_and_Information_Science/Technical_Services/Cataloguing/Metadata/RDF/Applications/RSS/</category> + <generator>The Superest Dooperest RSS Generator</generator> + <lastBuildDate>Mon, 02 Sep 2002 03:19:17 GMT</lastBuildDate> + <ttl>60</ttl> + + <item> + <title>News for September the Second</title> + <link>http://example.com/2002/09/02</link> + <description>other things happened today</description> + <comments>http://example.com/2002/09/02/comments.html</comments> + <author>joeuser\@example.com</author> + <pubDate>Mon, 02 Sep 2002 03:19:00 GMT</pubDate> + <guid isPermaLink="true">http://example.com/2002/09/02</guid> + </item> + + <item> + <title>News for September the First</title> + <link>http://example.com/2002/09/01</link> + <description>something happened today</description> + <comments>http://example.com/2002/09/01/comments.html</comments> + <author>joeuser\@example.com</author> + <pubDate>Sun, 01 Sep 2002 12:01:00 GMT</pubDate> + <guid isPermaLink="true">http://example.com/2002/09/02</guid> + </item> + + </channel> +</rss>); + +plan tests => 7; + +use_ok("XML::RSS"); + +my $xml = XML::RSS->new(); +isa_ok($xml,"XML::RSS"); + +eval { $xml->parse(RSS_DOCUMENT); }; +is($@,'',"Parsed RSS feed"); + +cmp_ok($xml->{'_internal'}->{'version'},"eq",RSS_VERSION,"Is RSS version ".RSS_VERSION); +cmp_ok($xml->{channel}->{'title'},"eq",RSS_CHANNEL_TITLE,"Feed title is ".RSS_CHANNEL_TITLE); +cmp_ok(ref($xml->{items}),"eq","ARRAY","\$xml->{items} is an ARRAY ref"); + +my $ok = 1; + +foreach my $item (@{$xml->{items}}) { + + my $min = 0; + foreach my $el ("title","description") { + if (exists $item->{$el}) { + $min ||= 1; + } + } + + $ok = $min; + last if (! $ok); +} + +ok($ok,"All items have either a title or a description element"); + + +__END__ + +=head1 NAME + +2.0-parse-ns.t - tests for parsing RSS 2.0 data with default namespace (which is not actually valid) using XML::RSS.pm + +=head1 SYNOPSIS + + use Test::Harness qw (runtests); + runtests (./XML-RSS/t/*.t); + +=head1 DESCRIPTION + +Tests for parsing RSS 2.0 data with XML::RSS.pm + +=head1 VERSION + +$Revision: 1.2 $ + +=head1 DATE + +$Date: 2002/11/19 23:56:53 $ + +=head1 AUTHOR + +Aaron Straup Cope + +=head1 SEE ALSO + +http://backend.userland.com/rss2 + +=cut
Hi Oleg, sorry for the late response. On Mon Oct 12 08:46:00 2015, OLEG wrote: Show quoted text
> Чтв Сен 24 07:21:00 2015, OLEG писал:
> > Втр Сен 22 05:46:08 2015, SHLOMIF писал:
> > > Hi Oleg, > > > > > > On Tue Sep 22 04:23:04 2015, OLEG wrote:
> > > > This test script > > > > > > > > use XML::RSS; > > > > use LWP::Simple; > > > > > > > > my $rss = XML::RSS->new; > > > > $rss->parse(get('http://news.ngs.ru/rss/')); > > > > __END__ > > > >
> > > > > > Please provide a more self-contained script with strict and > > > warnings > > > and which does not need network access or depends on networked > > > resources. > > >
> > > > produces error: Modification of non-creatable array value > > > > attempted, > > > > subscript -1 at /tmp/XML-RSS-1.56/lib/XML/RSS.pm line 914. > > > > > > > > And with this patch all works good (except tests in the > > > > distribution): > > > > > > > > --- RSS.pm.orig 2015-09-22 14:20:19.540350762 +0600 > > > > +++ RSS.pm 2015-09-22 14:20:49.563355439 +0600 > > > > @@ -1235,7 +1235,7 @@ > > > > my $self = shift; > > > > > > > > return XML::Parser->new( > > > > - Namespaces => 1, > > > > + Namespaces => 0, > > > > NoExpand => 1, > > > > ParseParamEnt => 0, > > > > Handlers => {
> > > > > > One option is to make the Namespaces parameter overridable in sub > > > new > > > { ... } . A patch for doing that - with tests - will be welcome.
> > > > Sorry for the report. Looks like this feed is not compatible with RSS > > 2.0 specification, which says: > > The elements defined in this document are not themselves members of a > > namespace, so that RSS 2.0 can remain compatible with previous > > versions in the following sense -- a version 0.91 or 0.92 file is > > also > > a valid 2.0 file. If the elements of RSS 2.0 were in a namespace, > > this > > constraint would break, a version 0.9x file would not be a valid 2.0 > > file.
> > I thought a little more about this issue. Actually all RSS clients > which I tried can successfully parse RSS feed which I mentioned above, > but XML::RSS can't. This is the main reason why they don't want to fix > that feed. So, why XML::RSS can't just ignore default namespace as > other clients do? > This patch does just that. All tests passed successfully. > > Can you apply it and make new release?
I'd rather not apply a patch that makes XML-RSS less standards-compliant like that as it is. What you should do is: 1. Make the new behaviour opt-in - i.e: it should be off by default and explictly enabled using a documented switch. 2. Make sure you declare your constants in the test using "my" instead of "use constant", which I detest. 3. Use here-documents where appropriate: https://en.wikipedia.org/wiki/Here_document instead of multiline q(...) strings. If you do all that, then I can apply the patch. Sorry again. Regards, -- Shlomi Fish
On Mon Nov 30 11:58:02 2015, SHLOMIF wrote: Show quoted text
> Hi Oleg, > > sorry for the late response. > > On Mon Oct 12 08:46:00 2015, OLEG wrote:
> > Чтв Сен 24 07:21:00 2015, OLEG писал:
> > > Втр Сен 22 05:46:08 2015, SHLOMIF писал:
> > > > Hi Oleg, > > > > > > > > On Tue Sep 22 04:23:04 2015, OLEG wrote:
> > > > > This test script > > > > > > > > > > use XML::RSS; > > > > > use LWP::Simple; > > > > > > > > > > my $rss = XML::RSS->new; > > > > > $rss->parse(get('http://news.ngs.ru/rss/')); > > > > > __END__ > > > > >
> > > > > > > > Please provide a more self-contained script with strict and > > > > warnings > > > > and which does not need network access or depends on networked > > > > resources. > > > >
> > > > > produces error: Modification of non-creatable array value > > > > > attempted, > > > > > subscript -1 at /tmp/XML-RSS-1.56/lib/XML/RSS.pm line 914. > > > > > > > > > > And with this patch all works good (except tests in the > > > > > distribution): > > > > > > > > > > --- RSS.pm.orig 2015-09-22 14:20:19.540350762 +0600 > > > > > +++ RSS.pm 2015-09-22 14:20:49.563355439 +0600 > > > > > @@ -1235,7 +1235,7 @@ > > > > > my $self = shift; > > > > > > > > > > return XML::Parser->new( > > > > > - Namespaces => 1, > > > > > + Namespaces => 0, > > > > > NoExpand => 1, > > > > > ParseParamEnt => 0, > > > > > Handlers => {
> > > > > > > > One option is to make the Namespaces parameter overridable in sub > > > > new > > > > { ... } . A patch for doing that - with tests - will be welcome.
> > > > > > Sorry for the report. Looks like this feed is not compatible with > > > RSS > > > 2.0 specification, which says: > > > The elements defined in this document are not themselves members of > > > a > > > namespace, so that RSS 2.0 can remain compatible with previous > > > versions in the following sense -- a version 0.91 or 0.92 file is > > > also > > > a valid 2.0 file. If the elements of RSS 2.0 were in a namespace, > > > this > > > constraint would break, a version 0.9x file would not be a valid > > > 2.0 > > > file.
> > > > I thought a little more about this issue. Actually all RSS clients > > which I tried can successfully parse RSS feed which I mentioned > > above, > > but XML::RSS can't. This is the main reason why they don't want to > > fix > > that feed. So, why XML::RSS can't just ignore default namespace as > > other clients do? > > This patch does just that. All tests passed successfully. > > > > Can you apply it and make new release?
> > I'd rather not apply a patch that makes XML-RSS less standards- > compliant like that as it is. What you should do is: > > 1. Make the new behaviour opt-in - i.e: it should be off by default > and explictly enabled using a documented switch. > > 2. Make sure you declare your constants in the test using "my" instead > of "use constant", which I detest. > > 3. Use here-documents where appropriate: > https://en.wikipedia.org/wiki/Here_document instead of multiline > q(...) strings. > > If you do all that, then I can apply the patch. > > Sorry again. > > Regards, > > -- Shlomi Fish
Actually I am using XML::Feed, which uses XML::RSS inside. And it will not know anything about new options. I already patched XML::RSS locally and happy enough. So, feel free to close it.
On Mon Nov 30 12:10:23 2015, OLEG wrote: Show quoted text
> On Mon Nov 30 11:58:02 2015, SHLOMIF wrote:
> > Hi Oleg, > > > > sorry for the late response. > > > > On Mon Oct 12 08:46:00 2015, OLEG wrote:
> > > Чтв Сен 24 07:21:00 2015, OLEG писал:
> > > > Втр Сен 22 05:46:08 2015, SHLOMIF писал:
> > > > > Hi Oleg, > > > > > > > > > > On Tue Sep 22 04:23:04 2015, OLEG wrote:
> > > > > > This test script > > > > > > > > > > > > use XML::RSS; > > > > > > use LWP::Simple; > > > > > > > > > > > > my $rss = XML::RSS->new; > > > > > > $rss->parse(get('http://news.ngs.ru/rss/')); > > > > > > __END__ > > > > > >
> > > > > > > > > > Please provide a more self-contained script with strict and > > > > > warnings > > > > > and which does not need network access or depends on networked > > > > > resources. > > > > >
> > > > > > produces error: Modification of non-creatable array value > > > > > > attempted, > > > > > > subscript -1 at /tmp/XML-RSS-1.56/lib/XML/RSS.pm line 914. > > > > > > > > > > > > And with this patch all works good (except tests in the > > > > > > distribution): > > > > > > > > > > > > --- RSS.pm.orig 2015-09-22 14:20:19.540350762 +0600 > > > > > > +++ RSS.pm 2015-09-22 14:20:49.563355439 +0600 > > > > > > @@ -1235,7 +1235,7 @@ > > > > > > my $self = shift; > > > > > > > > > > > > return XML::Parser->new( > > > > > > - Namespaces => 1, > > > > > > + Namespaces => 0, > > > > > > NoExpand => 1, > > > > > > ParseParamEnt => 0, > > > > > > Handlers => {
> > > > > > > > > > One option is to make the Namespaces parameter overridable in > > > > > sub > > > > > new > > > > > { ... } . A patch for doing that - with tests - will be > > > > > welcome.
> > > > > > > > Sorry for the report. Looks like this feed is not compatible with > > > > RSS > > > > 2.0 specification, which says: > > > > The elements defined in this document are not themselves members > > > > of > > > > a > > > > namespace, so that RSS 2.0 can remain compatible with previous > > > > versions in the following sense -- a version 0.91 or 0.92 file is > > > > also > > > > a valid 2.0 file. If the elements of RSS 2.0 were in a namespace, > > > > this > > > > constraint would break, a version 0.9x file would not be a valid > > > > 2.0 > > > > file.
> > > > > > I thought a little more about this issue. Actually all RSS clients > > > which I tried can successfully parse RSS feed which I mentioned > > > above, > > > but XML::RSS can't. This is the main reason why they don't want to > > > fix > > > that feed. So, why XML::RSS can't just ignore default namespace as > > > other clients do? > > > This patch does just that. All tests passed successfully. > > > > > > Can you apply it and make new release?
> > > > I'd rather not apply a patch that makes XML-RSS less standards- > > compliant like that as it is. What you should do is: > > > > 1. Make the new behaviour opt-in - i.e: it should be off by default > > and explictly enabled using a documented switch. > > > > 2. Make sure you declare your constants in the test using "my" > > instead > > of "use constant", which I detest. > > > > 3. Use here-documents where appropriate: > > https://en.wikipedia.org/wiki/Here_document instead of multiline > > q(...) strings. > > > > If you do all that, then I can apply the patch. > > > > Sorry again. > > > > Regards, > > > > -- Shlomi Fish
> > > Actually I am using XML::Feed, which uses XML::RSS inside. And it will > not know anything about new options. I already patched XML::RSS > locally and happy enough. So, feel free to close it.
You can try overriding methods in sub-classes or monkey-patching or whatever to give it the option (or propagating an option over to XML-Feed as well). But I'm going to close the bug for the time being. Regards, -- Shlomi Fish