
This queue is for tickets about the WWW-CheckSite CPAN distribution.

Report information
The Basics
Id: 15776
Status: resolved
Priority: 0/
Queue: WWW-CheckSite

People
Owner: abeltje [...] cpan.org
Requestors: SREZIC [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Wishlist
Broken in: 0.015
Fixed in: (no value)



Subject: Reverse match logic in WWW::CheckSite::Spider::filter_link
I think it would be better to reverse the logic in WWW::CheckSite::Spider::filter_link() so that it accepts only a few schemes, such as http, https, and maybe also ftp and file. Judging by the "SCHEME-SPECIFIC SUPPORT" section of the URI.pm documentation, it seems to me that there are only a few browsable schemes. An even better solution would be for the ua (i.e. WWW::Mechanize) to report which URI schemes it actually supports.

Regards,
Slaven
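A minimal sketch of the "ask the ua" idea, assuming the user agent is an LWP::UserAgent (which WWW::Mechanize subclasses) and relying on its is_protocol_supported() method. The helper name filter_link_by_ua is hypothetical and not part of WWW::CheckSite; the eval is a precaution in case the method rejects a scheme string outright instead of returning false.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use URI;

# Hypothetical helper (not part of WWW::CheckSite): keep a link if it is
# relative, or if the user agent reports that it can handle the scheme.
sub filter_link_by_ua {
    my ( $ua, $uri ) = @_;
    my $scheme = URI->new($uri)->scheme;    # undef for relative links
    return $uri if !defined $scheme;

    # Treat an exception from the check as "not supported".
    my $supported = eval { $ua->is_protocol_supported($scheme) };
    return $supported ? $uri : undef;
}

my $ua = LWP::UserAgent->new;
for my $link ( 'http://example.com/', '/relative/path', 'javascript:void(0)' ) {
    my $verdict = defined filter_link_by_ua( $ua, $link ) ? 'keep' : 'skip';
    print "$verdict  $link\n";
}
```

Which schemes come back as supported depends on the protocol modules installed alongside LWP, so the result can differ between machines (https, for example, needs its own protocol module).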
From: srezic [...] cpan.org
[SREZIC - Mon Nov 14 09:58:46 2005]:
Addition: and of course also accept all relative links. Maybe like this:

    sub filter_link {
        my( $self, $uri ) = @_;
        use URI;
        my $uri_object = URI->new($uri);
        my $uri_scheme = $uri_object->scheme;
        return $uri if !defined $uri_scheme; # accept everything relative
        return $uri_scheme =~ m!^(?:http|https|file|ftp)$!i # add more schemes here
            ? $uri
            : undef;
    }
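To illustrate how such an accept-list behaves on typical links, here is a small standalone sketch. It is not the module's code: filter_link_whitelist and %ACCEPTED_SCHEME are hypothetical names, the function takes no $self, and a hash lookup stands in for the regex so that schemes can be added in one place.

```perl
use strict;
use warnings;
use URI;

# Hypothetical accept-list; extend it to allow more schemes.
my %ACCEPTED_SCHEME = map { $_ => 1 } qw( http https file ftp );

# Standalone variant of the proposed filter: returns the link if it is
# relative or uses an accepted scheme, undef otherwise.
sub filter_link_whitelist {
    my ($uri) = @_;
    my $scheme = URI->new($uri)->scheme;    # undef for relative links
    return $uri if !defined $scheme;
    return $ACCEPTED_SCHEME{ lc $scheme } ? $uri : undef;
}

for my $link (
    'http://example.com/', 'HTTPS://example.com/', 'page.html',
    'mailto:someone@example.com', 'javascript:void(0)',
) {
    my $verdict = defined filter_link_whitelist($link) ? 'keep' : 'skip';
    print "$verdict  $link\n";
}
```

With an accept-list, any scheme that is not listed (mailto, javascript, news and so on) is skipped without having to be named explicitly.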
[SREZIC - Mon Nov 28 09:10:05 2005]:
Sorry I missed this one, but it will be in the next release.

Thanks,
-- Abe.