
This queue is for tickets about the XML-RSS CPAN distribution.

Report information
The Basics
Id: 2722
Status: resolved
Priority: 0
Queue: XML-RSS

People
Owner: KELLAN [...] cpan.org
Requestors: lars.nooden [...] ub.uit.no
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 1.02
Fixed in: (no value)



Subject: rss->parsefile
On two different platforms (Debian woody and Solaris 5.8), rss->parsefile cannot read a file generated by rss->save.

The error on the Debian machine is similar, but here is the one from the Solaris machine:

    undefined entity at line 18, column 57, byte 662 at /usr/local/lib/perl5/site_perl/5.8.0/sun4-solaris/XML/Parser.pm line 168

The version is RSS.pm,v 1.22 2003/02/20 19:19:07 kellan Exp, and I can reproduce the error with this much code:

    #!/usr/bin/perl -w
    use XML::RSS;
    my $rss = new XML::RSS;
    $rss->parsefile( 'foo.rdf' );

Attached is the file produced by XML::RSS, which can be read by newstickers but not by XML::RSS itself.

-Lars
<?xml version="1.0" encoding="iso-8859-1"?> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" > <channel rdf:about="http://www.ub.uit.no/"> <title>UB Troms&#248;</title> <link>http://www.ub.uit.no/</link> <description>Recently changed pages at the University Library</description> <dc:language>en-us</dc:language> <dc:date>2003-06-02T05:07+00:00</dc:date> <dc:publisher>The University Library, University in Troms&slash;</dc:publisher> <dc:creator>webmaster@ub.uit.no</dc:creator> <syn:updatePeriod>daily</syn:updatePeriod> <syn:updateFrequency>1</syn:updateFrequency> <syn:updateBase>2003-01-01T00:00+00:00</syn:updateBase> <items> <rdf:Seq> <rdf:li rdf:resource="http://www.ub.uit.no/intern/nytt_paa_ub.html" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/oppslagstavla/studiebibl-handlingsplanen-utkast.htm" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/new_urls.txt" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/sol2001.htm" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/registrer.htm" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/sistenytt/test.htm" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/sistenytt/registrer.htm" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/sistenytt/index.html" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/romres/whatsnew.txt" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/romres/null.htm" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/romres/index.html" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/indextekst.html" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/nymedarbeider.htm" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/nymedarbeidere.htm" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/romres/readme.htm" /> <rdf:li 
rdf:resource="http://www.ub.uit.no/intern/romres/index1.html" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/eudora.htm" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/lisendatabase.html" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/nyhet.htm" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/endringer.htm" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/mal-dok.htm" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/IT/batteri.htm" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/IT/faq.htm" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/IT/faq-xp.htm" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/IT/faq-pub.htm" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/IT/printerstatus.htm" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/index.html" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/oppslagstavla/trimkvarter.htm" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/oppslagstavla/Seniorpolitikk.htm" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/oppslagstavla/Program.htm" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/oppslagstavla/etiske_retningslinjer.htm" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/tdburl/index.html" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/littmidler/index.html" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/ub.html" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/oppslagstavla/visjonsbrev.htm" /> <rdf:li rdf:resource="http://www.ub.uit.no/intern/oppslagstavla/studiebibl-referater.htm" /> </rdf:Seq> </items> </channel> <item rdf:about="http://www.ub.uit.no/intern/nytt_paa_ub.html"> <title>Fil endret siden 2003/05/24</title> <link>http://www.ub.uit.no/intern/nytt_paa_ub.html</link> <dc:date>20030531</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/oppslagstavla/studiebibl-handlingsplanen-utkast.htm"> <title>UB Tromsø internsider handlingsplan studiebibliotek</title> 
<link>http://www.ub.uit.no/intern/oppslagstavla/studiebibl-handlingsplanen-utkast.htm</link> <dc:date>20030530</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/new_urls.txt"> <title>new_urls.txt</title> <link>http://www.ub.uit.no/intern/new_urls.txt</link> <dc:date>20030530</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/sol2001.htm"> <title>Untitled</title> <link>http://www.ub.uit.no/intern/sol2001.htm</link> <dc:date>20030529</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/registrer.htm"> <title>Registrere endringer</title> <link>http://www.ub.uit.no/intern/registrer.htm</link> <dc:date>20030529</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/sistenytt/test.htm"> <title>Registrere endringer</title> <link>http://www.ub.uit.no/intern/sistenytt/test.htm</link> <dc:date>20030529</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/sistenytt/registrer.htm"> <title>Registrere endringer</title> <link>http://www.ub.uit.no/intern/sistenytt/registrer.htm</link> <dc:date>20030529</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/sistenytt/index.html"> <title>Registrere endringer</title> <link>http://www.ub.uit.no/intern/sistenytt/index.html</link> <dc:date>20030529</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/romres/whatsnew.txt"> <title>whatsnew.txt</title> <link>http://www.ub.uit.no/intern/romres/whatsnew.txt</link> <dc:date>20030529</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/romres/null.htm"> <title>Uten titel</title> <link>http://www.ub.uit.no/intern/romres/null.htm</link> <dc:date>20030529</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/romres/index.html"> <title>UBs romreservasjon</title> <link>http://www.ub.uit.no/intern/romres/index.html</link> <dc:date>20030529</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/indextekst.html"> <title>UBs interne sider</title> <link>http://www.ub.uit.no/intern/indextekst.html</link> 
<dc:date>20030529</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/nymedarbeider.htm"> <title>Registrere ny medarbeider</title> <link>http://www.ub.uit.no/intern/nymedarbeider.htm</link> <dc:date>20030529</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/nymedarbeidere.htm"> <title>Registrer nymedarbeider</title> <link>http://www.ub.uit.no/intern/nymedarbeidere.htm</link> <dc:date>20030529</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/romres/readme.htm"> <title>PerlCal Help</title> <link>http://www.ub.uit.no/intern/romres/readme.htm</link> <dc:date>20030529</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/romres/index1.html"> <title>PerlCal Calendar</title> <link>http://www.ub.uit.no/intern/romres/index1.html</link> <dc:date>20030529</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/eudora.htm"> <title>Når e-post er klar MÅ DU HA ET NYTT PASSORD for innlogging</title> <link>http://www.ub.uit.no/intern/eudora.htm</link> <dc:date>20030529</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/lisendatabase.html"> <title>Lisenskommentarer</title> <link>http://www.ub.uit.no/intern/lisendatabase.html</link> <dc:date>20030529</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/nyhet.htm"> <title>Hva skjer på UB?</title> <link>http://www.ub.uit.no/intern/nyhet.htm</link> <dc:date>20030529</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/endringer.htm"> <title>Endringer på UBs interne sider</title> <link>http://www.ub.uit.no/intern/endringer.htm</link> <dc:date>20030529</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/mal-dok.htm"> <title>Din tittel</title> <link>http://www.ub.uit.no/intern/mal-dok.htm</link> <dc:date>20030529</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/IT/batteri.htm"> <title>Vedlikehold av batteri</title> <link>http://www.ub.uit.no/intern/IT/batteri.htm</link> <dc:date>20030528</dc:date> </item> <item 
rdf:about="http://www.ub.uit.no/intern/IT/faq.htm"> <title>Spørsmål og svar</title> <link>http://www.ub.uit.no/intern/IT/faq.htm</link> <dc:date>20030528</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/IT/faq-xp.htm"> <title>Spørsmål og svar</title> <link>http://www.ub.uit.no/intern/IT/faq-xp.htm</link> <dc:date>20030528</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/IT/faq-pub.htm"> <title>Spørsmål og svar</title> <link>http://www.ub.uit.no/intern/IT/faq-pub.htm</link> <dc:date>20030528</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/IT/printerstatus.htm"> <title>Sjekke printerstatus i nettleseren</title> <link>http://www.ub.uit.no/intern/IT/printerstatus.htm</link> <dc:date>20030528</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/index.html"> <title>UBs interne sider</title> <link>http://www.ub.uit.no/intern/index.html</link> <dc:date>20030528</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/oppslagstavla/trimkvarter.htm"> <title>TIL ALLE FAKULTET VED LEDERE/DIREKTØRER</title> <link>http://www.ub.uit.no/intern/oppslagstavla/trimkvarter.htm</link> <dc:date>20030528</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/oppslagstavla/Seniorpolitikk.htm"> <title>Seniorpolitiske tiltak i henhold til hovedtariffavtalens pkt 5</title> <link>http://www.ub.uit.no/intern/oppslagstavla/Seniorpolitikk.htm</link> <dc:date>20030528</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/oppslagstavla/Program.htm"> <title>FORSLAG TIL PROGRAM 4</title> <link>http://www.ub.uit.no/intern/oppslagstavla/Program.htm</link> <dc:date>20030528</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/oppslagstavla/etiske_retningslinjer.htm"> <title>Etiske retningslinjer for statsansatte mot kjøp og aksept av seksuelle tjenester</title> <link>http://www.ub.uit.no/intern/oppslagstavla/etiske_retningslinjer.htm</link> <dc:date>20030528</dc:date> </item> <item 
rdf:about="http://www.ub.uit.no/intern/tdburl/index.html"> <title>Alle tidsskriftene som har brukernavn og passordbelagte tjenester</title> <link>http://www.ub.uit.no/intern/tdburl/index.html</link> <dc:date>20030528</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/littmidler/index.html"> <title>Ledige litteraturmidler</title> <link>http://www.ub.uit.no/intern/littmidler/index.html</link> <dc:date>20030528</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/ub.html"> <title>Fil endret siden 2003/05/21</title> <link>http://www.ub.uit.no/intern/ub.html</link> <dc:date>20030528</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/oppslagstavla/visjonsbrev.htm"> <title>UB Tromsø internsider visjon</title> <link>http://www.ub.uit.no/intern/oppslagstavla/visjonsbrev.htm</link> <dc:date>20030527</dc:date> </item> <item rdf:about="http://www.ub.uit.no/intern/oppslagstavla/studiebibl-referater.htm"> <title>UB Tromsø internsider handlingsplan studiebibliotek</title> <link>http://www.ub.uit.no/intern/oppslagstavla/studiebibl-referater.htm</link> <dc:date>20030527</dc:date> </item> </rdf:RDF>
Well, most XML parsers would choke on the attached file. It contains an undefined entity reference (&slash;), which is invalid in XML. This is related to bug #2472 (http://rt.cpan.org/NoAuth/Bug.html?id=2472): XML::RSS encoding is fundamentally broken. (It was a quick hack, added after several years of not having any.)

In the meantime... I'm not sure I have a great suggestion. I'll try to get you an update soon.

[guest - Mon Jun 2 06:37:04 2003]:
> On two different platforms (Debian-woody and Solaris 5.8) the
> rss->parsefile cannot read a file generated by rss->save
>
> The error on the Debian machine is similar, but here is the one from
> the Solaris machine:
>
> undefined entity at line 18, column 57, byte 662 at
> /usr/local/lib/perl5/site_perl/5.8.0/sun4-solaris/XML/Parser.pm
> line 168
>
> The version is RSS.pm,v 1.22 2003/02/20 19:19:07 kellan Exp
> and I can reproduce the error with this much code:
>
> #!/usr/bin/perl -w
> use XML::RSS;
> my $rss = new XML::RSS;
> $rss->parsefile( 'foo.rdf' );
>
> Attached is the file produced by XML::RSS which can be read by
> newstickers, but not XML::RSS itself.
> -Lars
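For context on the "undefined entity" error: XML predefines only five named entities (&amp;, &lt;, &gt;, &quot; and &apos;), and the attached file has no DTD that declares any others, so a parser cannot resolve &slash;. The sketch below is not the encoding code in XML::RSS, only an illustration of an escaping scheme that avoids the problem by writing every non-ASCII character as a numeric character reference, as the file's own &#248; does. The helper name xml_escape is hypothetical, and the input is assumed to be a string of decoded characters rather than raw UTF-8 bytes.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of XML::RSS.
# Escapes text for XML element content or attribute values using only
# constructs every XML parser understands without a DTD.
sub xml_escape {
    my ($text) = @_;

    # The five entities that XML itself predefines.
    my %named = (
        '&' => '&amp;',
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
        "'" => '&apos;',
    );

    # Escape markup characters first, so the ampersands introduced by the
    # numeric references below are not escaped a second time.
    $text =~ s/([&<>"'])/$named{$1}/g;

    # Every character outside ASCII becomes a decimal character reference.
    $text =~ s/([^\x00-\x7F])/sprintf('&#%d;', ord $1)/ge;

    return $text;
}

# \x{F8} is LATIN SMALL LETTER O WITH STROKE, as in the channel title.
print xml_escape("University in Troms\x{F8} & friends"), "\n";
```

Numeric references are independent of the document's declared encoding, which is why this form parses where a named HTML-style entity does not.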
From: Sean <sean [...] ertw.com>
[KELLAN - Sun Nov 23 02:10:33 2003]:
> In the meantime... I'm not sure I have a great suggestion. I'll try to
> get you an update soon.
Has there been any update to this?

I ran into a similar problem: I add an item using add_item, with the text first fixed up by HTML::Entities::encode_entities. The output to disk has all the entities encoded properly, and I can read it back in with parsefile without a problem. If I save it again, most of the entities are unencoded, so I can no longer read it back in with parsefile.

Thanks,
Sean
On Wed Jun 16 10:42:18 2004, guest wrote:
> [KELLAN - Sun Nov 23 02:10:33 2003]:
>
> > In the meantime... I'm not sure I have a great suggestion. I'll try
> > to get you an update soon.
>
> Has there been any update to this?
>
> I ran into a similar problem, where I add an item using add_item but
> fixed up with HTML::Entities::encode_entities. The output to disk has
> all the entities encoded properly. I then read it back in with
> parsefile, no problem. If I save it again, most of the entities are
> unencoded such that I can no longer read it back in with parsefile.
Hi Sean,

I might have fixed this (by fixing something else) a few releases ago. If not, can you provide a test case we can include in the test suite?

- ask
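A test case along the lines requested here might look like the following. This is only a sketch, not a test from the XML::RSS distribution: the helper name round_trip_titles, the channel values and the sample title are all made up for illustration. It serializes a one-item feed with as_string, parses the result into a fresh object, and repeats, so a failure on the second pass would match the save, read, save, read sequence described in the ticket.

```perl
use strict;
use warnings;

# Hypothetical helper: serialize and re-parse a one-item feed $passes times.
# Returns the item title as seen after each pass; dies if a pass cannot
# be parsed or if XML::RSS is not installed.
sub round_trip_titles {
    my ( $title, $passes ) = @_;

    require XML::RSS;

    my $rss = XML::RSS->new( version => '1.0' );
    $rss->channel(
        title       => 'Example channel',
        link        => 'http://example.com/',
        description => 'Round-trip test feed',
    );
    $rss->add_item( title => $title, link => 'http://example.com/item1' );

    my @titles;
    for ( 1 .. $passes ) {
        my $xml  = $rss->as_string;    # the text that save() would write
        my $next = XML::RSS->new;
        $next->parse($xml);            # dies if the XML is not well-formed
        push @titles, $next->{items}[0]{title};
        $rss = $next;                  # the next pass starts from the parsed copy
    }
    return @titles;
}

# Sample title containing characters that must be escaped in XML.
my @seen = eval { round_trip_titles( 'Q & A <draft>', 2 ) };
if ($@) {
    print "round trip failed: $@";
}
else {
    printf "pass %d: %s\n", $_ + 1, $seen[$_] for 0 .. $#seen;
}
```

In a real test file the returned titles would be compared against the original with Test::More, and the same loop could be run over save and parsefile with a temporary file instead of as_string and parse.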
This bug has been inactive for quite a while and cannot be reproduced with a modern version of XML::RSS (at least not with ->{encoding} enabled). Resolving.

Regards,
Shlomi Fish