
This queue is for tickets about the WWW-RobotRules CPAN distribution.

Report information
The Basics
Id: 99387
Status: new
Priority: 0/
Queue: WWW-RobotRules

People
Owner: Nobody in particular
Requestors: lindahl [...] pbm.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: 6.02
Fixed in: (no value)



Subject: Additional googlebot incompatibility
blekko got flamed by webmasters until we parsed robots.txt the way Google does. There's already a bug, 68219 (https://rt.cpan.org/Public/Bug/Display.html?id=68219), about * and Allow. The additional things we were flamed about are:

1) Blank lines should be ignored. Webmasters frequently have stuff like

    User-agent: googlebot

    Disallow: /

and expect the Disallow to be applied to googlebot and not to *. Same for

    User-agent: googlebot
    # a comment
    Disallow: /

2) Trailing $

    Disallow: .mp3$

should in fact disallow /foo.mp3

I would be happy to donate our test suite. I don't think anyone should be using a non-googlebot-compatible robots.txt parser these days. But if you want to keep a useless but standards-compliant mode around, it's easy enough to divide the tests up into the ones that obey the standard and the ones that obey the reality.
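To make the two behaviors concrete, here is a rough Python sketch of the parsing being asked for. It is not blekko's code and not WWW::RobotRules; the function names (parse_groups, pattern_to_regex, is_disallowed) are hypothetical. It makes two assumptions: a pattern that does not start with "/" or "*" may match anywhere in the path (chosen so that ".mp3$" blocks /foo.mp3 as described above), and Allow lines are collected but not used for precedence, since that is the subject of bug 68219.

```python
import re


def parse_groups(text):
    """Collect rules per user-agent; blank and comment-only lines never end a group."""
    groups = {}        # lowercased agent token -> list of (field, value)
    agents = []        # agents of the group currently being built
    seen_rule = False  # True once the current group has an Allow/Disallow line
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # strip comments, then whitespace
        if not line or ":" not in line:
            continue  # blank line: ignored instead of closing the group
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if seen_rule:  # a User-agent line after rules starts a new group
                agents, seen_rule = [], False
            agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow") and agents:
            seen_rule = True
            for agent in agents:
                groups[agent].append((field, value))
    return groups


def pattern_to_regex(pattern):
    """Translate a rule path: '*' is a wildcard, a trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    # Assumption: patterns without a leading "/" or "*" may match anywhere.
    if not pattern.startswith(("/", "*")):
        body = ".*" + body
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(groups, agent, path):
    """True if any Disallow rule for the agent (or, failing that, '*') matches."""
    rules = groups.get(agent.lower(), groups.get("*", []))
    return any(
        field == "disallow" and value and pattern_to_regex(value).match(path)
        for field, value in rules
    )


# The first example from this ticket: the blank line does not detach the rule.
groups = parse_groups("User-agent: googlebot\n\nDisallow: /\n")
print(is_disallowed(groups, "googlebot", "/index.html"))
```

A test suite split into "standard" and "reality" cases could drive the same functions, with a flag on parse_groups deciding whether a blank line closes the current group.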
Subject: Re: [rt.cpan.org #99387] AutoReply: Additional googlebot incompatibility
Date: Wed, 8 Oct 2014 14:02:09 -0700
To: Bugs in WWW-RobotRules via RT <bug-WWW-RobotRules [...] rt.cpan.org>
From: lindahl [...] pbm.com

Message body is not shown because it is too large.