
This queue is for tickets about the XML-RSS-Feed CPAN distribution.

Report information
The Basics
Id: 50467
Status: resolved
Worked: 10 min
Priority: 0/
Queue: XML-RSS-Feed

People
Owner: jbisbee [...] cpan.org
Requestors: sven.knispel [...] pobox.com
Cc: Dan [...] DWright.Org
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: 2.4



Subject: Misbehaviour in XML::RSS::Feed, mixup in Headline id/guid
Date: Wed, 14 Oct 2009 01:18:28 +0200
To: bug-XML-RSS-Feed [...] rt.cpan.org
From: Sven Knispel <sven.knispel [...] pobox.com>
Dear Jeff,

after having spent the last two nights trying to find out about a difference in behavior of XML::RSS::Feed on my PC and on a friend's, I finally had a breakthrough. To sum it up: on version 2.212 everything is fine, on version 2.32 not anymore.

Let me elaborate a little on "everything", with the adapted example from the POD:

    use XML::RSS::Feed;
    use LWP::Simple qw(get);

    my $feed = XML::RSS::Feed->new(
        url    => "http://feeds.wired.com/wired/index",
        name   => "Wired",
        delay  => 10,
        debug  => 1,
        tmpdir => ".",
    );

    while (1) {
        $feed->parse(get($feed->url));
        print $_->headline . "\n" for $feed->late_breaking_news;
        sleep($feed->delay);
    }

OK, the expected behavior (with V2.212):
- first run: it fetches whatever is in the feed (30 items), and keeps going in the loop with no new items.
- second run: after having retrieved the cached items there is no breaking news, so it goes on telling "no headlines found".

And now the problem (with 2.32):
- first run: it fetches whatever is in the feed (30 items), and keeps going in the loop with no new items.
- second run: after having retrieved the cached items it still sees another 30 breaking news items and shows them again. At every run the number of initialized headlines from the cache increases by 30.

After a few hours and lots of coffee I narrowed the problem down to the Headlines. The newer 2.32 version of the headlines has the concept of a guid that didn't exist in the older version. I found that the "faulty" code is in Headlines.pm in "sub id", on the line "return $self->guid || $self->url;". For whatever reason $self->guid is not set prior to caching, or is not read back from the cache (at least that is my assumption). Anyway, always returning the URL solves the misbehavior. And finally, without modifying the code, doing a "$feed->init_headlines_seen;" in the calling program also works, as it obviously replaces the logic for setting/getting the Headline id.
The program working for me is:

    use XML::RSS::Feed;
    use LWP::Simple qw(get);

    my $feed = XML::RSS::Feed->new(
        url            => "http://feeds.wired.com/wired/index",
        name           => "Wired",
        delay          => 10,
        debug          => 1,
        headline_as_id => 1,    # <-- avoids getting the "real" headline id
        tmpdir         => ".",
    );

    while (1) {
        $feed->parse(get($feed->url));
        print $_->headline . "\n" for $feed->late_breaking_news;
        sleep($feed->delay);
    }

Now I suspect "sub _build_dump_structure" is the place where guid should be stored together with url to solve the problem, but I lack background on RSS, so please excuse me if I am completely wrong (it would be nice to read your opinion on this whole thing ;-) ).

Brgds
Sven
Hello,

I'm seeing a very similar issue when trying to use XML::RSS::Feed. The problem I'm seeing is that whenever you load headlines from the cache, they no longer contain the guid field. This can be fixed with a one-line change (adding the guid entry) inside _build_dump_structure():

    push @{ $cached->{items} }, {
        headline    => $headline->headline,
        url         => $headline->url,
        description => $headline->description,
        first_seen  => $headline->first_seen_hires,
        guid        => $headline->guid,
    };
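The effect described in the two reports above can be reproduced without the module itself. The following is only a minimal sketch under stated assumptions: headline_id, dump_items and count_unseen are hypothetical stand-ins written for this illustration (they are not part of XML::RSS::Feed), the headlines are plain hashes, and the example.com URLs and guids are made up. It shows that an id computed as "guid || url" no longer matches after a cache round trip that drops the guid, so every item looks new again.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the headline id: prefer guid, fall back to url.
sub headline_id {
    my ($h) = @_;
    return $h->{guid} || $h->{url};
}

# Hypothetical stand-in for the cache dump; $keep_guid toggles the one-line fix.
sub dump_items {
    my ($keep_guid, @headlines) = @_;
    return map {
        my %item = (headline => $_->{headline}, url => $_->{url});
        $item{guid} = $_->{guid} if $keep_guid;
        \%item;
    } @headlines;
}

# Count feed items whose id is not among the ids restored from the cache.
sub count_unseen {
    my ($cached, $fresh) = @_;
    my %seen = map { (headline_id($_) => 1) } @$cached;
    my $unseen = grep { !$seen{ headline_id($_) } } @$fresh;
    return $unseen;
}

# Made-up feed of 30 items, each with a guid that differs from its url.
my @feed = map {
    +{
        headline => "Story $_",
        url      => "http://example.com/story/$_",
        guid     => "tag:example.com,2009:$_",
    }
} 1 .. 30;

my @without_guid = dump_items(0, @feed);    # cache written without guid
my @with_guid    = dump_items(1, @feed);    # cache written with guid

printf "unseen after restore without guid: %d\n", count_unseen(\@without_guid, \@feed);    # 30
printf "unseen after restore with guid:    %d\n", count_unseen(\@with_guid,    \@feed);    # 0
```

Without the guid in the dump, the restored items are keyed by url while the freshly parsed items are keyed by guid, which matches the reported 30 repeated headlines per run. With the guid stored, both sides use the same key.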
Thanks for taking time to report these issues.

Fixed
* add =encoding utf-8 to pod to fix RT issue #78918
* add guid to serialization so we can properly restore it to fix RT issue #50467
* Fix blatantly broken test
* Suppress warnings on deprecated methods during tests
* Fix pod coverage issues with Feed::Factory
* I wrote this code so long ago it makes me throw up in my mouth just a little bit :P

--
Jeff Bisbee / jbisbee@cpan.org