Subject: SimpleRobot
Date: Thu, 24 Oct 2013 18:53:43 -0500
To: bug-perl5 [...] rt.cpan.org
From: Mike Flannigan <mikeflan [...] att.net>
Bug Report
WWW::SimpleRobot
our $VERSION = '0.07';
Copyright (c) 2001 Ave Wrigley
I use Perl v5.8.0
There's a bug in the way that WWW::SimpleRobot handles broken links.
If the link is in the original array that you pass, it recognizes the
broken link and calls the callback routine.
But when it is traversing a page and building a list of links, it
discards any link that fails a "head" request. So broken links found
during traversal are dropped before the callback routine ever sees them.
To troubleshoot this, I first ran it the way you did. Then, I looked
at the docs for WWW::SimpleRobot and didn't see anything useful there.
Next, I looked at the source (nicely formatted by metacpan:
https://metacpan.org/source/AWRIGLEY/WWW-SimpleRobot-0.07/SimpleRobot.pm).
On line 35, I noticed there is a VERBOSE option.
Looking down the code a little ways (lines 119-124), you can see that
verbose is used to print a "get $url" line before the
BROKEN_LINK_CALLBACK is called.
Running that way showed that the code never prints
"get http://www.ncgia.ucsb.edu/%7Ecova/seap.html".
Looking a little further shows lines 140-142, which discard the link
if head() fails.
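
To illustrate the behavior I would have expected, here is a minimal
standalone sketch. It is not the module's code and not a patch against
it. The check_links helper and its arguments are hypothetical names made
up for this example, and the "head" check is a stand-in coderef so the
script runs without network access. It only shows the idea: a link that
fails the head check is handed to a broken-link callback instead of
being thrown away.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of WWW::SimpleRobot. It walks the links
# found on one page, keeps the ones that pass the head check, and reports
# the ones that fail instead of silently dropping them.
sub check_links {
    my ( $links, $head_ok, $on_broken, $linked_from ) = @_;
    my @live;
    for my $link (@$links) {
        if ( $head_ok->($link) ) {
            push @live, $link;                     # still worth following
        }
        else {
            $on_broken->( $link, $linked_from );   # report, do not discard
        }
    }
    return @live;
}

my @broken;
my @live = check_links(
    [ 'http://example.com/ok.html', 'http://example.com/missing.html' ],
    sub { $_[0] !~ /missing/ },     # stand-in for a real HEAD request
    sub { push @broken, $_[0] },    # stand-in for BROKEN_LINK_CALLBACK
    'http://example.com/index.html',
);

print "live: @live\n";
print "broken: @broken\n";
```

In the real module the equivalent change would presumably go where the
head() check now drops the link, but I have not tested a patch there.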
The hdb debugging interface was really nice for this.
Thank you.
Mike Flannigan
Houston, TX
281-286-6869
http://www.mflan.com/index.htm