
This queue is for tickets about the WWW-Mechanize CPAN distribution.

Report information
The Basics
Id: 18365
Status: resolved
Priority: 0
Queue: WWW-Mechanize

People
Owner: Nobody in particular
Requestors: mattj [...] 3am-software.com
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 1.16
Fixed in: (no value)



Subject: Mech cannot fetch wikipedia
Mech cannot fetch Wikipedia. I can get to the same URL on the same machine via lynx or wget. A sample URL is: http://en.wikipedia.org/ In lynx or wget I get the Wikipedia front page. If I run this:

    use WWW::Mechanize;
    my $mech = WWW::Mechanize->new();
    $mech->get("http://en.wikipedia.org/");
    print "Status " . $mech->status . "\n";
    print $mech->content;

I get a status of 403 and a page saying access denied. I use Mech for other sites without a problem, so I assume this is not some kind of configuration error on my part. This seems to be specific to Wikipedia.
> I get a status of 403 and a page saying access denied. I use mech for
> other sites without a problem, so I assume this is not some kind of
> configuration error on my part. This seems to be specific to Wikipedia.
So it sounds like you need to send some kind of authentication.
Subject: Re: [rt.cpan.org #18365] Mech cannot fetch wikipedia
Date: Sat, 25 Mar 2006 22:39:38 -0800
To: bug-WWW-Mechanize [...] rt.cpan.org
From: MattJ <mattj [...] 3am-software.com>
Yes, that is what it sounds like. Why do I need to send authentication with Mech, but not with wget or lynx? Neither of these other methods has any kind of cookie or password. I only have this problem with Mech.

--
MattJ

On Mar 25, 2006, at 6:17 PM, Guest via RT wrote:
> <URL: http://rt.cpan.org/Ticket/Display.html?id=18365 >
>
>> I get a status of 403 and a page saying access denied. I use mech for
>> other sites without a problem, so I assume this is not some kind of
>> configuration error on my part. This seems to be specific to
>> Wikipedia.
>
> So it sounds like you need to send some kind of authentication.
Subject: Re: [rt.cpan.org #18365] Mech cannot fetch wikipedia
Date: Sun, 26 Mar 2006 00:45:30 -0600
To: bug-WWW-Mechanize [...] rt.cpan.org
From: Andy Lester <andy [...] petdance.com>
On Mar 26, 2006, at 12:40 AM, mattj@3am-software.com via RT wrote:
> Queue: WWW-Mechanize
> Ticket <URL: http://rt.cpan.org/Ticket/Display.html?id=18365 >
>
> Yes, that is what it sounds like. Why do I need to send
> authentication with Mech, but not with wget or lynx? Neither of these
> other methods have any kind of cookie or password. I only have this
> problem with Mech.
Don't know. I suspect wget and/or lynx are sending different HTTP request headers than Mech is.

--
Andy Lester => andy@petdance.com => www.petdance.com => AIM:petdance
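A quick way to test that suspicion is to dump the request Mech actually sends and compare its headers with what wget or lynx sends. This is a sketch using the `request_send` handler phase that Mech inherits from LWP::UserAgent (a later LWP feature than the version current at the time of this ticket):

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Print each outgoing request (method, URL, and all headers) just
# before it goes on the wire, so it can be compared with wget/lynx.
my $mech = WWW::Mechanize->new( autocheck => 0 );
$mech->add_handler(
    request_send => sub {
        my ($request) = @_;
        print $request->as_string;
        return;    # returning undef lets the request proceed normally
    }
);
$mech->get("http://en.wikipedia.org/");
```

Alternatively, after any `$mech->get`, the request that was actually sent is available as `$mech->response->request->as_string`.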
On Sun Mar 26 01:45:54 2006, andy@petdance.com wrote:
> Don't know. I suspect wget and/or lynx are sending different HTTP
> request headers than Mech is.
Mech's default user-agent string is specifically blocked. Change the user agent to just about anything else (I used WWW::Mechanize->new(agent => 'foo')) and the page will come back correctly.

Also, see the following for a list of user-agents that were blocked from Wikipedia in 2004:
http://mail.wikipedia.org/pipermail/wikitech-l/2004-February/020874.html
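The resolution above can be sketched as follows. The agent string 'MyApp/0.1' is just an illustrative value, not something from the ticket; any string that does not match the blocked "WWW-Mechanize/..." default should do:

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Work around the block by overriding Mech's default User-Agent
# ("WWW-Mechanize/<version>"), which Wikipedia rejects with a 403.
my $mech = WWW::Mechanize->new( agent => 'MyApp/0.1' );
print $mech->agent, "\n";    # the value that will be sent as User-Agent
$mech->get("http://en.wikipedia.org/");
print "Status ", $mech->status, "\n";    # should succeed where the default agent got 403
```

Note that sites which block by user-agent generally expect clients to identify themselves honestly; picking a descriptive agent string for your tool is friendlier than impersonating a browser.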