Subject: | invoke-on-link hook not called on all links |
The documentation states that for each link on the page, the
invoke-on-link hook is called and then the add-url-test hook is called.
I expected the invoke-on-link hook to be called for all links, even
those that add-url-test would later exclude. I believe this
expectation is consistent with the intended behaviour of the module.
I have attached a test with the expected behaviour based on my reading
of the documentation.
Inspecting the source shows that the add-url-test hook is called within
extract_links, which runs before the invoke-on-link hook is called, so
links rejected by add-url-test never reach invoke-on-link. I have also
attached a patch which moves the add-url-test hook call from
extract_links to immediately before addUrl during link processing.
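To illustrate the difference in ordering, here is a minimal standalone
sketch. It does not use WWW::Robot's actual internals: the %hooks table,
the subs process_current and process_proposed, and the sample links are
all hypothetical stand-ins chosen only to show how the two orderings
differ.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the two hooks; not WWW::Robot internals.
my %hooks = (
    'invoke-on-link' => sub { my ($link, $seen) = @_; push @$seen, $link },
    'add-url-test'   => sub { my ($link) = @_; $link !~ /skip/ },
);

# Ordering as described in this report: add-url-test filters links at
# extraction time, so invoke-on-link only sees links that passed it.
sub process_current {
    my @links = @_;
    my (@seen, @queued);
    my @extracted = grep { $hooks{'add-url-test'}->($_) } @links;
    for my $link (@extracted) {
        $hooks{'invoke-on-link'}->($link, \@seen);
        push @queued, $link;
    }
    return (\@seen, \@queued);
}

# Proposed ordering: invoke-on-link runs for every link, and
# add-url-test only decides whether the link is queued afterwards.
sub process_proposed {
    my @links = @_;
    my (@seen, @queued);
    for my $link (@links) {
        $hooks{'invoke-on-link'}->($link, \@seen);
        push @queued, $link if $hooks{'add-url-test'}->($link);
    }
    return (\@seen, \@queued);
}

my @links = (
    'http://www.example.com/a',
    'http://www.example.com/skip',
    'http://www.example.com/b',
);
my ($seen_cur, $queued_cur) = process_current(@links);
my ($seen_new, $queued_new) = process_proposed(@links);
printf "current:  invoked on %d, queued %d\n",
    scalar @$seen_cur, scalar @$queued_cur;
printf "proposed: invoked on %d, queued %d\n",
    scalar @$seen_new, scalar @$queued_new;
```

In this sketch both orderings queue the same two links, but only the
proposed one invokes the link hook on all three.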
Subject: | invoke-on-all-links.t |
use strict;
use warnings;
use Test::More tests => 1;
use WWW::Robot;
my $WEBSITE = 'http://www.example.com/'; # can be anything that allows robots
my $num_invoked_links = 0;
my $robot = WWW::Robot->new(
    NAME    => 'MyRobot',
    VERSION => 0,
    EMAIL   => 'example@example.com',
    DELAY   => 0, # we only follow 1 URL
);
$robot->addHook('follow-url-test', sub { 1 });
$robot->addHook('invoke-on-link', sub { $num_invoked_links++ });
$robot->addHook('continue-test', sub { 0 });
$robot->run($WEBSITE);
my $expected = $num_invoked_links;
$num_invoked_links = 0; # reset it
$robot = WWW::Robot->new(
    NAME    => 'MyRobot',
    VERSION => 0,
    EMAIL   => 'example@example.com',
    DELAY   => 0, # we only follow 1 URL
);
$robot->addHook('follow-url-test', sub { 1 });
$robot->addHook('invoke-on-link', sub { $num_invoked_links++ });
$robot->addHook('continue-test', sub { 0 });
$robot->addHook('add-url-test', sub { 0 }); # should not change $num_invoked_links
$robot->run($WEBSITE);
cmp_ok($num_invoked_links, '==', $expected, 'invoke-on-link hook on ALL links');