
This queue is for tickets about the CGI-Application-Dispatch CPAN distribution.

Report information
The Basics
Id: 19029
Status: resolved
Priority: 0/
Queue: CGI-Application-Dispatch

People
Owner: Nobody in particular
Requestors: PURDY [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 2.00_04
Fixed in: (no value)



Subject: Spidering breaks app
For some reason, when search engine spidering hits my dispatch app, I get errors:

I got this error report just now:

----
ON QSR Site ERROR CODE 500 OCCURRED ON Tue May 2 09:07:24 2006 WHEN THE URL /resources/suppliers/wisconsin_built WAS REQUESTED FULL PATH BY A USER AT 65.214.44.145 THE BROWSER WAS Mozilla/2.0 (compatible; Ask Jeeves/Teoma; +http://sp.ask.com/docs/about/tech_crawling.html)
------------------------------------------------------------------------------
$VAR1 = {
  'REDIRECT_UNIQUE_ID' => 'RFdZjM-8S5AAAC-aQFE',
  'QUERY_STRING' => '500',
  'nokeepalive' => '1',
  'REDIRECT_SCRIPT_FILENAME' => '/var/www/qsr/web/resources',
  'REMOTE_PORT' => '54776',
  'HTTP_ACCEPT' => 'text/html, text/plain, application/x-shockwave-flash',
  'HTTP_USER_AGENT' => 'Mozilla/2.0 (compatible; Ask Jeeves/Teoma; +http://sp.ask.com/docs/about/tech_crawling.html)',
  'GATEWAY_INTERFACE' => 'CGI/1.1',
  'HTTP_HOST' => 'www.qsrmagazine.com',
  'REDIRECT_SERVER_ADDR' => '207.252.75.146',
  'REDIRECT_PATH' => '/bin:/usr/bin:/sbin:/usr/sbin',
  'SCRIPT_NAME' => '/cgi-bin/errorpage.cgi',
  'SERVER_NAME' => 'www.qsrmagazine.com',
  'HTTP_ACCEPT_ENCODING' => 'gzip, deflate',
  'REDIRECT_SERVER_PROTOCOL' => 'HTTP /1.0',
  'REDIRECT_PATH_INFO' => '/suppliers/wisconsin_built',
  'REDIRECT_STATUS' => '500',
  'REDIRECT_HTTP_ACCEPT_ENCODING' => 'gzip, deflate',
  'UNIQUE_ID' => 'RFdZjM-8S5AAAC-aQFE',
  'REDIRECT_REMOTE_ADDR' => '65.214.44.145',
  'SCRIPT_FILENAME' => '/var/www/qsr/web/cgi-bin/errorpage.cgi',
  'REDIRECT_PATH_TRANSLATED' => '/var/www/qsr/web/suppliers/wisconsin_built',
  'REDIRECT_SERVER_SOFTWARE' => 'Apache/1.3.26 (Unix) Debian GNU/Linux',
  'PATH' => '/bin:/usr/bin:/sbin:/usr/sbin',
  'REDIRECT_DOCUMENT_ROOT' => '/var/www/qsr/web',
  'REDIRECT_REQUEST_URI' => '/resources/suppliers/wisconsin_built',
  'SERVER_ADDR' => '207.252.75.146',
  'SERVER_PROTOCOL' => 'HTTP /1.0',
  'REDIRECT_SERVER_SIGNATURE' => '<ADDRESS>Apache/1.3.26 Server at www.qsrmagazine.com Port 80</ADDRESS> ',
  'SERVER_SIGNATURE' => '<ADDRESS>Apache/1.3.26 Server at www.qsrmagazine.com Port 80</ADDRESS> ',
  'REDIRECT_SERVER_ADMIN' => 'admin@journalistic.com',
  'REDIRECT_SERVER_PORT' => '80',
  'SERVER_SOFTWARE' => 'Apache/1.3.26 (Unix) Debian GNU/Linux',
  'SERVER_ADMIN' => 'admin@journalistic.com',
  'REDIRECT_nokeepalive' => '1',
  'REMOTE_ADDR' => '65.214.44.145',
  'DOCUMENT_ROOT' => '/var/www/qsr/web',
  'REQUEST_URI' => '/resources/suppliers/wisconsin_built',
  'REDIRECT_HTTP_HOST' => 'www.qsrmagazine.com',
  'REDIRECT_REMOTE_PORT' => '54776',
  'REDIRECT_HTTP_ACCEPT' => 'text/html, text/plain, application/x-shockwave-flash',
  'REDIRECT_REQUEST_METHOD' => 'GET',
  'REQUEST_METHOD' => 'GET',
  'REDIRECT_HTTP_USER_AGENT' => 'Mozilla/2.0 (compatible; Ask Jeeves/Teoma; +http://sp.ask.com/docs/about/tech_crawling.html)',
  'REDIRECT_URL' => '/resources/suppliers/wisconsin_built',
  'REDIRECT_SCRIPT_NAME' => '/resources',
  'REDIRECT_GATEWAY_INTERFACE' => 'CGI-Perl/1.1',
  'REDIRECT_QUERY_STRING' => '',
  'REDIRECT_SERVER_NAME' => 'www.qsrmagazine.com',
  'SERVER_PORT' => '80'
};
----

I dug into the web error log and found this:

----
[Tue May 2 09:07:24 2006] null: CGI::Application::Dispatch - ERROR Can't locate object method "_run_app" via package "QSR::Resources::ResourceDispatch" (perhaps you forgot to load "QSR::Resources::ResourceDispatch"?) at /usr/share/perl5/CGI/Application/Dispatch.pm line 298.
----

This is an error report I got from Google last week:

----
ON QSR Site ERROR CODE 500 OCCURRED ON Sat Apr 22 05:52:23 2006 WHEN THE URL /resources/suppliers/astute_solutions WAS REQUESTED FULL PATH BY A USER AT 66.249.66.36 THE BROWSER WAS Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
------------------------------------------------------------------------------
$VAR1 = {
  'REDIRECT_UNIQUE_ID' => 'REn818-8S5AAAEfiDro',
  'QUERY_STRING' => '500',
  'REDIRECT_HTTP_FROM' => 'googlebot(at)googlebot.com',
  'REDIRECT_SCRIPT_FILENAME' => '/var/www/qsr/web/resources',
  'REMOTE_PORT' => '54249',
  'HTTP_ACCEPT' => '*/*',
  'HTTP_USER_AGENT' => 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
  'GATEWAY_INTERFACE' => 'CGI/1.1',
  'HTTP_HOST' => 'www.qsrmagazine.com',
  'REDIRECT_SERVER_ADDR' => '207.252.75.146',
  'REDIRECT_PATH' => '/bin:/usr/bin:/sbin:/usr/sbin',
  'SCRIPT_NAME' => '/cgi-bin/errorpage.cgi',
  'SERVER_NAME' => 'www.qsrmagazine.com',
  'HTTP_ACCEPT_ENCODING' => 'gzip',
  'REDIRECT_SERVER_PROTOCOL' => 'HTTP /1.1',
  'REDIRECT_PATH_INFO' => '/suppliers/astute_solutions',
  'REDIRECT_STATUS' => '500',
  'REDIRECT_HTTP_ACCEPT_ENCODING' => 'gzip',
  'REDIRECT_HTTP_CONNECTION' => 'Keep-alive',
  'UNIQUE_ID' => 'REn818-8S5AAAEfiDro',
  'REDIRECT_REMOTE_ADDR' => '66.249.66.36',
  'SCRIPT_FILENAME' => '/var/www/qsr/web/cgi-bin/errorpage.cgi',
  'REDIRECT_PATH_TRANSLATED' => '/var/www/qsr/web/suppliers/astute_solutions',
  'REDIRECT_SERVER_SOFTWARE' => 'Apache/1.3.26 (Unix) Debian GNU/Linux',
  'PATH' => '/bin:/usr/bin:/sbin:/usr/sbin',
  'HTTP_FROM' => 'googlebot(at)googlebot.com',
  'REDIRECT_DOCUMENT_ROOT' => '/var/www/qsr/web',
  'REDIRECT_REQUEST_URI' => '/resources/suppliers/astute_solutions',
  'SERVER_ADDR' => '207.252.75.146',
  'SERVER_PROTOCOL' => 'HTTP /1.1',
  'HTTP_CONNECTION' => 'Keep-alive',
  'REDIRECT_SERVER_SIGNATURE' => '<ADDRESS>Apache/1.3.26 Server at www.qsrmagazine.com Port 80</ADDRESS> ',
  'SERVER_SIGNATURE' => '<ADDRESS>Apache/1.3.26 Server at www.qsrmagazine.com Port 80</ADDRESS> ',
  'REDIRECT_SERVER_ADMIN' => 'admin@journalistic.com',
  'REDIRECT_SERVER_PORT' => '80',
  'SERVER_SOFTWARE' => 'Apache/1.3.26 (Unix) Debian GNU/Linux',
  'SERVER_ADMIN' => 'admin@journalistic.com',
  'REMOTE_ADDR' => '66.249.66.36',
  'DOCUMENT_ROOT' => '/var/www/qsr/web',
  'REQUEST_URI' => '/resources/suppliers/astute_solutions',
  'REDIRECT_HTTP_HOST' => 'www.qsrmagazine.com',
  'REDIRECT_REMOTE_PORT' => '54249',
  'REDIRECT_HTTP_ACCEPT' => '*/*',
  'REDIRECT_REQUEST_METHOD' => 'GET',
  'REQUEST_METHOD' => 'GET',
  'REDIRECT_HTTP_USER_AGENT' => 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
  'REDIRECT_URL' => '/resources/suppliers/astute_solutions',
  'REDIRECT_SCRIPT_NAME' => '/resources',
  'REDIRECT_GATEWAY_INTERFACE' => 'CGI-Perl/1.1',
  'REDIRECT_QUERY_STRING' => '',
  'REDIRECT_ERROR_NOTES' => 'Can\'t locate object method "@‰`o»p£zH†‡W“ø¿•À1•`²—x{‹D”X×›Ð2“8¹– ½†À†Œ˜…’˜‡”ˆ#‘ ¿•h”å‘ø…ŽXe‰°rV”@›" via package "Apache" (perhaps you forgot to load "Apache"?) at /usr/share/perl5/CGI/Application/Dispatch.pm line 427. ',
  'REDIRECT_SERVER_NAME' => 'www.qsrmagazine.com',
  'SERVER_PORT' => '80'
};
----

(Note the REDIRECT_ERROR_NOTES part ... very odd!)

This is my dispatch module:

----
package QSR::Resources::ResourceDispatch;
use strict;
use base 'CGI::Application::Dispatch';
use CGI::Carp qw( fatalsToBrowser );
use lib '/var/www/lib';

sub dispatch_args {
    return {
        'prefix' => 'QSR::Resources',
        'table'  => [
            '/equipment/:category?'       => { 'app' => 'ResourceViewer', 'rm' => 'view_category_list' },
            '/suppliers'                  => { 'app' => 'ResourceViewer', 'rm' => 'view_company_list' },
            '/suppliers/page/:page'       => { 'app' => 'ResourceViewer', 'rm' => 'view_company_list' },
            '/suppliers/byletter/:letter' => { 'app' => 'ResourceViewer', 'rm' => 'view_company_list' },
            '/suppliers/search'           => { 'app' => 'ResourceViewer', 'rm' => 'search_companies' },
            '/suppliers/:company'         => { 'app' => 'ResourceViewer', 'rm' => 'view_company' },
            '/suppliers/:company/visit'   => { 'app' => 'ResourceViewer', 'rm' => 'log_clickthrough' },
            ''                            => { 'app' => 'ResourceViewer', 'rm' => 'start' },
        ],
    };
}

1;
----

Then in the httpd.conf, I have these relevant lines:

----
<Location /resources>
    SetHandler perl-script
    PerlHandler QSR::Resources::ResourceDispatch
</Location>
----

You can see it in action here: http://www.qsrmagazine.com/resources

I'm using the mod_perl approach. Let me know if I can provide any further information.

My guess is that spidering hits the thing pretty quickly and something's not able to keep up.

Thanks,

Jason
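The error-log entry in the report above says Perl could not resolve "_run_app" on QSR::Resources::ResourceDispatch and hints that the class may not have been loaded. One way to narrow that down outside of Apache is to load the class from the command line and ask whether it can resolve the methods involved. The following is only a minimal sketch: missing_methods is a hypothetical helper, not part of CGI::Application::Dispatch, and it assumes the script runs with the same library path as the web server (here /var/www/lib, taken from the "use lib" line in the dispatch module).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of CGI::Application::Dispatch): load $class
# and return the names from @methods that it cannot resolve, either directly
# or through inheritance.
sub missing_methods {
    my ($class, @methods) = @_;

    # Accept only plain package names before handing one to string eval.
    die "Bad class name: $class\n" unless $class =~ /\A\w+(?:::\w+)*\z/;
    eval "require $class; 1" or die "Could not load $class: $@";

    return grep { !$class->can($_) } @methods;
}

# Runs only when a class name is given on the command line, for example:
#   perl -I/var/www/lib check_dispatch.pl QSR::Resources::ResourceDispatch
if (my $class = shift @ARGV) {
    # 'handler' is the mod_perl entry point; '_run_app' is the method named
    # in the error log.
    my @missing = missing_methods($class, 'handler', '_run_app');
    print @missing
        ? "$class is missing: @missing\n"
        : "$class resolves all checked methods\n";
}
```

If the class resolves both methods from the command line, the problem is more likely specific to the running Apache/mod_perl process than to the module itself, although this check cannot show that on its own.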
On Tue May 02 09:23:14 2006, PURDY wrote:
> For some reason, search engine spidering hits my dispatch app, I get
> errors:
>
> I got this error report just now:
------------------------------------------------------------------------------
> $VAR1 = {
[snip]
> 'SERVER_PORT' => '80'
> };

Why are you getting this environment dump? Is this the dump from CGI::App?

> I dug into the web error log and found this:
>
> ----
> [Tue May 2 09:07:24 2006] null: CGI::Application::Dispatch - ERROR
> Can't locate object method "_run_app" via package
> "QSR::Resources::ResourceDispatch" (perhaps you forgot to load
> "QSR::Resources::ResourceDispatch"?) at
> /usr/share/perl5/CGI/Application/Dispatch.pm line 298.
[snip]
> package QSR::Resources::ResourceDispatch;

Everything looks ok there.

> Then in the httpd.conf, I have these relevant lines:

And there too.

> My guess is that spidering hits the thing pretty quickly and
> something's
> not able to keep up.

I seriously doubt that's the case. The test suite would be under similar circumstances as a site being spidered and I've never seen this. Can you reproduce it on your own, or is it only from spidering?
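One way to try reproducing this without waiting for a crawler is to replay the two failing paths with the User-Agent strings from the error reports. The script below is a rough sketch only: it sends requests one after another, so it does not imitate the concurrency of a real spider, and build_requests, the script name and the base URL argument are all made up for illustration. It uses HTTP::Tiny, which ships with Perl 5.14 and later.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# User-Agent strings and paths copied from the two error reports in this ticket.
my @agents = (
    'Mozilla/2.0 (compatible; Ask Jeeves/Teoma; +http://sp.ask.com/docs/about/tech_crawling.html)',
    'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
);
my @paths = (
    '/resources/suppliers/wisconsin_built',
    '/resources/suppliers/astute_solutions',
);

# Hypothetical helper: pair every agent with every path under $base and
# return a list of [ agent, url ] array references.
sub build_requests {
    my ($base) = @_;
    $base =~ s{/+\z}{};    # drop trailing slashes so the paths join cleanly
    return map { my $agent = $_; map { [ $agent, $base . $_ ] } @paths } @agents;
}

# Runs only when a base URL is given, for example against a test server:
#   perl spider_check.pl http://localhost 20
my ($base, $rounds) = @ARGV;
if (defined $base) {
    for my $round (1 .. ($rounds || 10)) {
        for my $req (build_requests($base)) {
            my ($agent, $url) = @$req;
            my $res = HTTP::Tiny->new(agent => $agent, timeout => 10)->get($url);

            # Report only failures, such as the 500 responses seen in the ticket.
            print "round $round: $res->{status} $res->{reason} for $url\n"
                unless $res->{success};
        }
    }
}
```

If every request succeeds here but the errors still show up in production, that points away from the User-Agent or the URL itself and toward something in the state of the long-running server processes.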
Any other information on this?
Subject: Re: [rt.cpan.org #19029] Spidering breaks app
Date: Tue, 11 Mar 2008 13:14:02 -0400
To: bug-CGI-Application-Dispatch [...] rt.cpan.org
From: "Jason Purdy" <jason [...] purdy.info>
No, not really. I haven't experienced this bug in a while. I'm running 2.01 on production and Apache 2.2. My guess is that when I originally filed the report, I was running Apache 1.3.

On 3/11/08, Michael Peters via RT <bug-CGI-Application-Dispatch@rt.cpan.org> wrote:
>
> <URL: http://rt.cpan.org/Ticket/Display.html?id=19029 >
>
> Any other information on this?
>