
Preferred bug tracker

Please visit the preferred bug tracker to report your issue.

This queue is for tickets about the W3C-LinkChecker CPAN distribution.

Report information
The Basics
Id: 18902
Status: resolved
Priority: 0/
Queue: W3C-LinkChecker

People
Owner: scop [...] cpan.org
Requestors: pascal.pignard [...] wanadoo.fr
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 4.2.1
Fixed in: (no value)



Subject: Unexpected lines.
When visiting some web sites with robots rules, LinkChecker reports "unexpected" lines, like:

RobotRules <http://dev.mysql.com/robots.txt>: Unexpected line: Crawl-delay: 20
RobotRules <http://www.google.fr/robots.txt>: Unexpected line: Allow: /searchhistory/

Are these correct lines in robots rules? If so, shouldn't LinkChecker accept them without errors?

Thanks for making LinkChecker free.
The link checker's robots.txt support comes directly from libwww-perl (LWP::RobotUA); the link checker doesn't implement it itself. The lines you cited belong to a robots.txt version not supported by current libwww-perl, and are thus not supported by the link checker. You can read more about this at http://search.cpan.org/src/SCOP/W3C-LinkChecker-4.2.1/docs/checklink.html#bot

I'll look into whether there's a clean way of suppressing those warnings in the link checker code.
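To illustrate why such lines get flagged, here is a minimal sketch in Python (not libwww-perl's actual code). It assumes a parser that recognizes only the two directives of the original robots.txt convention, User-agent and Disallow, and reports everything else. The function name find_unexpected_lines and the sample file are hypothetical.

```python
# Illustrative only: a parser that knows just the two original directives.
# This is an assumption for the sketch, not libwww-perl's implementation.
KNOWN_DIRECTIVES = {"user-agent", "disallow"}


def find_unexpected_lines(robots_txt):
    """Return the lines that a User-agent/Disallow-only parser would not recognize."""
    unexpected = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue  # blank lines only separate records
        name, sep, _value = line.partition(":")
        if not sep or name.strip().lower() not in KNOWN_DIRECTIVES:
            unexpected.append(line)
    return unexpected


# Hypothetical robots.txt combining the two directives cited in this ticket.
sample = """\
User-agent: *
Disallow: /private/
Crawl-delay: 20
Allow: /searchhistory/
"""

for line in find_unexpected_lines(sample):
    print(f"Unexpected line: {line}")
```

With this sample, the sketch flags the Crawl-delay and Allow lines, which matches the kind of message quoted in the report. Whether to warn about or silently skip unknown directives is a design choice of the parser.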
Version 4.3 tries to suppress these warnings.
Subject: Re: [rt.cpan.org #18902] Resolved: Unexpected lines.
Date: Sun, 5 Nov 2006 10:48:29 +0100
To: bug-W3C-LinkChecker [...] rt.cpan.org
From: Pascal <pascal.pignard [...] wanadoo.fr>
Hello.

Great and many thanks, these errors have disappeared with version 4.3.

Since then, I've got two other errors:

1) Parsing of undecoded UTF-8 will give garbage when decoding entities at /Library/Perl/5.8.6/LWP/Protocol.pm line 114.

2) https://libre2.adacore.com/
Line: 241
Code: 500 Can't locate object method "new" via package "LWP::Protocol::https::Socket"
To do: This is a server side problem. Check the URI.

Is there a way to fix them?

Thanks in advance for your answers, Pascal.
http://blady.perso.orange.fr

On 22 Oct 06 at 21:52, Ville Skyttä via RT wrote:
> <URL: http://rt.cpan.org/Ticket/Display.html?id=18902 >
>
> According to our records, your request has been resolved. If you have any
> further questions or concerns, please respond to this message.
From: SCOP [...] cpan.org
On Sun Nov 05 04:49:39 2006, pascal.pignard@wanadoo.fr wrote:
> 1) Parsing of undecoded UTF-8 will give garbage when decoding
> entities at /Library/Perl/5.8.6/LWP/Protocol.pm line 114.
This is a libwww-perl issue, see https://rt.cpan.org/Ticket/Display.html?id=20274
> 2) https://libre2.adacore.com/ Line: 241
> Code: 500 Can't locate object method "new" via package
> "LWP::Protocol::https::Socket"
> To do: This is a server side problem. Check the URI.
Read the link checker and libwww-perl documentation about installing modules required for SSL/HTTPS functionality.
Subject: Re: [rt.cpan.org #18902] Unexpected lines.
Date: Mon, 1 Jan 2007 19:30:17 +0100
To: bug-W3C-LinkChecker [...] rt.cpan.org
From: Pascal <pascal.pignard [...] wanadoo.fr>
Hello.

2) Oh yes, I got Crypt-SSLeay-0.51 and now the error is gone.

Thanks a lot and happy new year 2007, Pascal.
http://blady.perso.orange.fr

On 5 Nov 06 at 14:59, Ville Skyttä via RT wrote:
> <URL: http://rt.cpan.org/Ticket/Display.html?id=18902 >
>
> On Sun Nov 05 04:49:39 2006, pascal.pignard@wanadoo.fr wrote:
>
>> 1) Parsing of undecoded UTF-8 will give garbage when decoding
>> entities at /Library/Perl/5.8.6/LWP/Protocol.pm line 114.
>
> This is a libwww-perl issue, see
> https://rt.cpan.org/Ticket/Display.html?id=20274
>
>> 2) https://libre2.adacore.com/ Line: 241
>> Code: 500 Can't locate object method "new" via package
>> "LWP::Protocol::https::Socket"
>> To do: This is a server side problem. Check the URI.
>
> Read the link checker and libwww-perl documentation about installing
> modules required for SSL/HTTPS functionality.