Skip Menu |

This queue is for tickets about the Scrappy CPAN distribution.

Report information
The Basics
Id: 95420
Status: new
Priority: 0/
Queue: Scrappy

People
Owner: Nobody in particular
Requestors: lubo.rintel [...] gooddata.com
Cc: lkundrak [...] v3.sk
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



CC: lkundrak [...] v3.sk
Subject: Tests are tragically broken
See the attached patch file.
Subject: 0001-Fix-Scrapy-Scraper-Control-and-its-test.patch
From 7b5f3f5fd2a3e94742a536d23ef9f289b0da49bc Mon Sep 17 00:00:00 2001 From: Lubomir Rintel <lkundrak@v3.sk> Date: Wed, 7 May 2014 00:37:08 +0200 Subject: [PATCH] Fix Scrapy::Scraper::Control and its test It's rather broken, possibly due to some careless copy & pasting: * in restrict() and allow(), "next" outside loops is used in place of function returns * in is_allowed(), argument is assumed to be an URI instance despite the function takes a string. Also, a chunk of code (apparently copied from lines above), mistakes allowed() for restricted() and apart from that leaves the logic reversed. * Moreover, the test does not pass a valid URI to a subroutine that expects one The silly condition is left as entertainment for future generations: if (keys %{$self->restricted}) { if (keys %{$self->restricted}) { --- lib/Scrappy/Scraper/Control.pm | 14 +++++++------- t/00_test_function_control.t | 2 +- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/lib/Scrappy/Scraper/Control.pm b/lib/Scrappy/Scraper/Control.pm index f5e3190..c08ba2e 100644 --- a/lib/Scrappy/Scraper/Control.pm +++ b/lib/Scrappy/Scraper/Control.pm @@ -28,7 +28,7 @@ sub allow { $target = URI->new($target); - next + return $i unless $target && ("URI::http" eq ref $target || "URI::https" eq ref $target); @@ -49,7 +49,7 @@ sub restrict { $target = URI->new($target); - next + return $i unless $target && ("URI::http" eq ref $target || "URI::https" eq ref $target); @@ -79,6 +79,7 @@ sub is_allowed { $url = $url->request->uri; } + $url = URI->new($url); return 0 unless ("URI::http" eq ref $url || "URI::https" eq ref $url); $url = $url->host; @@ -107,19 +108,18 @@ sub is_allowed { # is it explicitly restricted if (keys %{$self->restricted}) { if (keys %{$self->restricted}) { - # return 0 if $self->restricted->{$url}; if ($self->restricted->{$url}) { if ($self->restricted->{$url}->{if}) { - return $self->_check_constraints( - $self->allowed->{$url}->{if}, $http); + return ! $self->_check_constraints( + $self->restricted->{$url}->{if}, $http); } else { - return 1; + return 0; } } else { - return 0; + return 1; } } } diff --git a/t/00_test_function_control.t b/t/00_test_function_control.t index 2270d12..7cd3407 100644 --- a/t/00_test_function_control.t +++ b/t/00_test_function_control.t @@ -13,7 +13,7 @@ ok $s->control->is_allowed('http://search.cpan.org/recent'); ok $s->control->is_allowed('http://search.cpan.org/dist/Scrappy/lib/Scrappy.pm'); ok ! $s->control->is_allowed('http://www.google.com/'); ok 0 == keys %{$s->control->restricted}; # no restriction rules set -ok $s->control->restrict('search.cpan.org'); +ok $s->control->restrict('http://search.cpan.org'); ok ! $s->control->is_allowed('http://search.cpan.org/'); ok ! $s->control->is_allowed('http://search.cpan.org/recent'); ok ! $s->control->is_allowed('http://search.cpan.org/dist/Scrappy/lib/Scrappy.pm'); -- 1.9.0