
This queue is for tickets about the WWW-Search-Pagesjaunes CPAN distribution.

Report information
The Basics
Id: 4833
Status: resolved
Priority: 0/
Queue: WWW-Search-Pagesjaunes

People
Owner: briac [...] cpan.org
Requestors: julien.mary [...] free.fr
Cc:
AdminCc:

Bug Information
Severity: Critical
Broken in:
  • 0.04
  • 0.07
Fixed in: 0.08



Subject: Severe Bug : functionality compromised
With the request:

pagesjaunes -departement 1 -activite "moto location" -separator ';'

I obtain:

Can't call method "value" on an undefined value at /usr/local/share/perl/5.6.1/WWW/Search/Pagesjaunes.pm line 50.
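For readers unfamiliar with this message: Perl raises it when a method is called on undef. A plausible reading, assuming line 50 calls value() on a form object picked by index from a parsed form list, is that the expected form was missing from the page. The sketch below uses a hypothetical helper, pick_form, which is not part of WWW::Search::Pagesjaunes, and a plain hash as a stand-in for a parsed form, to show how the failure could be reported at the lookup instead.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical helper, not part of WWW::Search::Pagesjaunes: fetch the form at
# $index, or die with a readable message instead of failing later on undef.
sub pick_form {
    my ( $forms, $index ) = @_;
    my $form = $forms->[$index];
    croak( "No form at index $index (page has " . scalar(@$forms) . " form(s))" )
      unless defined $form;
    return $form;
}

# Stand-in for the list HTML::Form->parse would return: a page with one form.
my @forms = ( { name => 'search' } );

my $first = pick_form( \@forms, 0 );
print "picked form: $first->{name}\n";

# Index 1 does not exist, so the problem is reported at the point of lookup.
my $error = eval { pick_form( \@forms, 1 ); 1 } ? '' : $@;
print "error: $error";
```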
This patch should correct this bug. It's a shame I had to modify the user-agent string; pagesjaunes.fr seems to block 'WWW::Search::Pagesjaunes' specifically... :-/ -- briac
--- /usr/lib/perl5/site_perl/5.8.0/WWW/Search/Pagesjaunes.pm  2003-12-05 20:00:45.000000000 +010~0
+++ lib/WWW/Search/Pagesjaunes.pm  2004-01-05 16:30:31.000000000 +0100
@@ -1,25 +1,26 @@
 package WWW::Search::Pagesjaunes;
 use strict;
-#use Carp qw(carp);
-#use locale;
+use Carp qw(carp croak);
 use HTML::Form;
 use HTML::TokeParser;
 use LWP::UserAgent;
-$WWW::Search::Pagesjaunes::VERSION = '0.07';
+$WWW::Search::Pagesjaunes::VERSION = '0.08_01';
 sub ROOT_URL() { 'http://www.pagesjaunes.fr' }
 sub new {
     my $class = shift;
     my $self = {};
-    my $ua = shift () || LWP::UserAgent->new(
+    my $ua = shift() || LWP::UserAgent->new(
         env_proxy => 1,
         keep_alive => 1,
         timeout => 30,
     );
-    $ua->agent( "WWW::Search::Pagesjaunes/$WWW::Search::Pagesjaunes::VERSION "
+
+    $ua->agent( "WXWW::Search::Pagesjaunes/$WWW::Search::Pagesjaunes::VERSION "
           . $ua->agent );
+
     $self->{ua} = $ua;
     $self->{limit} = 50;
@@ -36,15 +37,20 @@
     # Make the first request to pagesjaunes.fr
     $self->{URL} = ROOT_URL . ( $opt{activite} ? '/pj.cgi' : '/pb.cgi' );
-    my @forms = HTML::Form->parse(
-        $self->{ua}->request( HTTP::Request->new( 'GET', $self->{URL} ) )
-          ->content,
-        $self->{URL}
-    );
+    my $req = $self->{ua}->request( HTTP::Request->new( 'GET', $self->{URL} ) );
+
+    $DB::single = 1;
+    if ( !$req->content || !$req->is_success ) {
+        croak('Error while retrieving the HTML page');
+    }
-    my $form = $opt{activite} ? $forms[1] : $forms[0];
+    my @forms = HTML::Form->parse( $req->content, $self->{URL} );
+
+    #my $form = $opt{activite} ? $forms[1] : $forms[0];
+    my $form = $forms[0];
     {
+
         # HTML::Form complains when you change hidden fields values.
         local $^W;
         $form->value( 'lang', $self->{lang} );
@@ -71,7 +77,9 @@
     my $parser = HTML::TokeParser->new( \$result_page );
     # All the <br> tags are transformed to whitespace
-    $parser->{textify} = { 'br' => sub() { " " } };
+    $parser->{textify} = {
+        'br' => sub() { " " }
+    };
     my @results;
@@ -116,7 +124,7 @@
             my $phone = _trim( $parser->get_trimmed_text('/td') );
             $phone =~ s/^\W(.*)$/$1/g;
-            push (
+            push(
                 @results,
                 WWW::Search::Pagesjaunes::Entry->new(
                     $name, $address, $phone, 0
@@ -137,14 +145,17 @@
     }
     # If there was no result, we look for an error message in the HTML page
-    if ( !@results && $self->{error} ){
+    if ( !@results && $self->{error} ) {
         $parser = HTML::TokeParser->new( \$result_page );
-        while ( my $token = $parser->get_tag("font") ){
-            next unless $token->[1]
+        while ( my $token = $parser->get_tag("font") ) {
+            next
+              unless $token->[1]
               && $token->[1]{color}
               && $token->[1]{color} eq '#ff0000';
-            $parser->{textify} = { 'br' => sub() { " " } };
-            print STDERR _trim($parser->get_trimmed_text('/font')) . "\n";
+            $parser->{textify} = {
+                'br' => sub() { " " }
+            };
+            print STDERR _trim( $parser->get_trimmed_text('/font') ) . "\n";
         }
     }
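The second hunk of the patch checks the HTTP response before handing its content to HTML::Form->parse. As a rough sketch of that check in isolation, the code below uses a hypothetical FakeResponse class as a stand-in for HTTP::Response (only is_success and content are imitated) so that it runs without network access. The helper name checked_content is also an assumption. Unlike the patch, it tests the body with defined and length, so a body consisting of the single character "0" would not be treated as empty.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Minimal stand-in for HTTP::Response so the sketch runs without network access.
package FakeResponse;
sub new        { my ( $class, %args ) = @_; return bless {%args}, $class }
sub is_success { my $self = shift; return $self->{code} >= 200 && $self->{code} < 300 }
sub content    { my $self = shift; return $self->{content} }

package main;

# Hypothetical helper: return the body only if the request succeeded and the
# body is non-empty; otherwise croak with the same message the patch uses.
sub checked_content {
    my ($res) = @_;
    my $body = $res->content;
    croak('Error while retrieving the HTML page')
      unless $res->is_success && defined $body && length $body;
    return $body;
}

my $ok  = FakeResponse->new( code => 200, content => '<form></form>' );
my $bad = FakeResponse->new( code => 403, content => 'Forbidden' );

my $body = checked_content($ok);
print "body: $body\n";

# A refused request is reported here rather than surfacing later as undef.
my $error = eval { checked_content($bad); 1 } ? '' : $@;
print "error: $error";
```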
[BRIAC - Mon Jan 5 10:38:23 2004]:
> This patch should correct this bug.
> It's a shame I had to modify the user-agent string; pagesjaunes.fr
> seems to block 'WWW::Search::Pagesjaunes' specifically... :-/
The patch was not built against the original Pagesjaunes.pm, I mean the one shipped in the package, so the result is unreliable.

Also, for the request pagesjaunes -departement $i -activite "moto location", the result after applying the patch fucking carefully by hand is:

Error while retrieving the HTML page at /usr/local/bin/pagesjaunes line 44

When the patch is applied with patch:

noosphere:/usr/local/share/perl/5.6.1/WWW/Search# patch -p1 < diff.txt
(Stripping trailing CRs from patch.)
can't find file to patch at input line 3
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|--- /usr/local/share/perl/5.6.1/WWW/5.6.1/WWW/Search/Pagesjaunes.pm 2003-12-05 20:00:45.000000000 +010~0
|+++ lib/WWW/Search/Pagesjaunes.pm 2004-01-05 16:30:31.000000000 +0100
--------------------------
File to patch: Pagesjaunes.pm
patching file Pagesjaunes.pm
Hunk #2 FAILED at 37.
Hunk #3 FAILED at 77.
Hunk #5 FAILED at 145.
3 out of 5 hunks FAILED -- saving rejects to file Pagesjaunes.pm.rej
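The "can't find file to patch" prompt is consistent with the -p1 strip count not matching the file names in the patch header relative to the directory patch was run from. As a rough illustration of how the strip count transforms a name, the sketch below uses a hypothetical helper, strip_components, that imitates the documented behaviour of patch's -pNUM option (remove NUM leading path components, treating adjacent slashes as one). It does not reproduce how patch chooses between the old and new file names.

```perl
use strict;
use warnings;

# Hypothetical helper mimicking how patch's -pNUM option strips leading path
# components: each pass removes one component and the slash(es) after it.
sub strip_components {
    my ( $path, $num ) = @_;
    $path =~ s{^[^/]*/+}{} for 1 .. $num;
    return $path;
}

# The new-file name from the patch header.
my $new_name = 'lib/WWW/Search/Pagesjaunes.pm';
print "-p$_ => ", strip_components( $new_name, $_ ), "\n" for 0 .. 3;
```

Run from inside WWW/Search, only the bare Pagesjaunes.pm exists, which under this reading corresponds to a strip count of 3 for the new-file name rather than 1. The failed hunks are a separate matter: they point to the patch having been made against a different base file, as the comment above says.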
Date: Wed, 7 Jan 2004 15:08:25 +0100
From: Briac Pilpré <briac [...] pilpre.com>
To: Guest via RT <bug-WWW-Search-Pagesjaunes [...] rt.cpan.org>
Subject: Re: [cpan #4833] Severe Bug : functionality compromised
RT-Send-Cc:
I made a new tarball with the patched module. It is currently available here:

http://briac.net/WWW-Search-Pagesjaunes-0.08_01.tar.gz

It seems to work OK on my machine:

[15:05:03][briac@ledzep:~/src/perl/WWW-Search-Pagesjaunes]$ ./pagesjaunes -dep 1 -activite 'moto location'
La Bicyclette Bleue - 01800 Joyeux - 04 74 98 21 48
Twinner Giroud Sports Adhérent - Station Col de la Faucille 01170 Gex - 04 50 41 30 96 fax : .04 50 41 33 22
Aranc Evasion Cycles - r Principale 01110 Aranc - 04 74 38 57 79 fax : .04 74 38 59 20
Duraffour Jean Noël - Le Retord rte de Bérentin 01130 Poizat (Le) - 04 74 75 30 65 mobile : .06 70 07 25 29
Guichard Denis - Bourg BOISSEY - 03 85 51 82 34
Ain Canoë Dombes VTT - Chem de la Masse 01800 Villieu Loyes Mollon - mobile : .06 11 86 31 86

On Wed, Jan 07, 2004 at 08:15:46AM -0500, Guest via RT wrote:
> This message about WWW-Search-Pagesjaunes was sent to you by guest <> via rt.cpan.org
>
> Full context and any attached attachments can be found at:
> <URL: https://rt.cpan.org/Ticket/Display.html?id=4833 >
>
> [BRIAC - Mon Jan 5 10:38:23 2004]:
>
> > This patch should correct this bug.
> > It's a shame I had to modify the user-agent string; pagesjaunes.fr
> > seems to block 'WWW::Search::Pagesjaunes' specifically... :-/
>
> The patch hasn't been built with the original Pagesjaunes.pm, I mean the
> one given by the package, the result is hasardous.
>
> Also for the request pagesjaunes -departement $i -activite "moto
> location", the result is after have applied the patch fucking carrefully
> by hand :
>
> Error while retrieving the HTML page at /usr/local/bin/pagesjaunes line 44
>
> When the patch is applied :
> noosphere:/usr/local/share/perl/5.6.1/WWW/Search# patch -p1 < diff.txt
> (Stripping trailing CRs from patch.)
> can't find file to patch at input line 3
> Perhaps you used the wrong -p or --strip option?
> The text leading up to this was:
> --------------------------
> |--- /usr/local/share/perl/5.6.1/WWW/5.6.1/WWW/Search/Pagesjaunes.pm
> 2003-12-05 20:00:45.000000000 +010~0
> |+++ lib/WWW/Search/Pagesjaunes.pm 2004-01-05 16:30:31.000000000 +0100
> --------------------------
> File to patch: Pagesjaunes.pm
> patching file Pagesjaunes.pm
> Hunk #2 FAILED at 37.
> Hunk #3 FAILED at 77.
> Hunk #5 FAILED at 145.
> 3 out of 5 hunks FAILED -- saving rejects to file Pagesjaunes.pm.rej
-- briac << dynamic .sig on strike, we apologize for the inconvenience >>