This queue is for tickets about the WWW-Mechanize CPAN distribution.

Report information
The Basics
Id: 24116
Status: rejected
Priority: 0/
Queue: WWW-Mechanize

People
Owner: Nobody in particular
Requestors: aaronpdelong [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Bug Submission
Date: Wed, 27 Dec 2006 18:10:41 -0500
To: bug-WWW-Mechanize [...] rt.cpan.org
From: "Aaron Delong" <aaronpdelong [...] gmail.com>
Having trouble with the bug list interface; sorry in advance if this is a dupe.

I am using the most recent version of ActiveState (downloaded and installed a week ago) and the following code:

    #!/usr/bin/perl -w
    use strict;
    use WWW::Mechanize;

    my $outfile = "c:\LWPout.txt";
    my $agent = WWW::Mechanize->new();
    $agent->get("http://auditor.cuyahogacounty.us/REPI/default.asp");
    $agent->form_name('FormName4');
    $agent->field('streetName', "AVALON");
    $agent->field('City', "SHAKER HEIGHTS" );
    $agent->submit;

    my @links = $agent->find_all_links();
    for my $link ( @links ){
        my $url = $$link[0];
        my $myFile = $agent->content;
        if ($url =~ /General\.asp/) {
            if ($myFile =~ m/\Q$url\E.*?size='2'>(.*?)<\/font>/){
                print "$$link[0]\t$$link[1]\t$1\n";
            }
        }
    }

I get the following output:

    C:\Perl\pl>perl learnmechanize.pl
    General.asp?txtParcel=73509050  735-09-050  KARBLER, WILLIAM R.
    General.asp?txtParcel=73509078  735-09-078  TAYLOR, ANDRE
    General.asp?txtParcel=73509051  735-09-051  PTERSON, JENNIFER H. & JELOVSE
    General.asp?txtParcel=73509077  735-09-077  CULLERS, ROMNEY B.
    General.asp?txtParcel=73509052  735-09-052  GEBHARDT, DOUGLAS S. & KRISTEN
    General.asp?txtParcel=73509076  735-09-076  ROSS, ALLISON LORRAINE
    General.asp?txtParcel=73509053  735-09-053  JAFFE, MICHAEL G.
    General.asp?txtParcel=73509054  735-09-054  GILL WILLIAM A & MAUREEN K
    General.asp?txtParcel=73509075  735-09-075  CHMIELEWSKI, BENJAMIN K

This is fine, except that I would expect one line for "SINGERMAN MICHAEL" at the head of this list (see the snippet of HTML source below). The only explanation I can figure is that SINGERMAN's URL is split across lines.

Is it possible to modify "find_link" (and thus "find_all_links") to catch such situations? Is this a parameter that I am missing? I get how I would overcome this if parsing with a regex, but I assume this is more involved? If it is helpful for me to work on developing the fix, I would be happy to contribute.
Thanks,
Aaron

<tr bgcolor='#ffffff'><td valign='top'><font face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular' size='2'><a href=' General.asp?txtParcel=73509079'>735-09-079</a></font>&nbsp;</td><td valign='top'><font face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular' size='2'>SINGERMAN MICHAEL</font>&nbsp;</td><td valign='top'><font face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular' size='2'>03259</font>&nbsp;</td><td valign='top'><font face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular' size='2'></font>&nbsp;</td><td valign='top'><font face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular' size='2'>AVALON</font>&nbsp;</td><td valign='top'><font face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular' size='2'>SHAKER HEIGHTS</font>&nbsp;</td></tr>
<tr bgcolor='#dcdcdc'><td valign='top'><font face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular' size='2'><a href=' General.asp?txtParcel=73509050'>735-09-050</a></font>&nbsp;</td><td valign='top'><font face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular' size='2'>KARBLER, WILLIAM R.</font>&nbsp;</td><td valign='top'><font face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular' size='2'>03260</font>&nbsp;</td><td valign='top'><font face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular' size='2'></font>&nbsp;</td><td valign='top'><font face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular' size='2'>AVALON</font>&nbsp;</td><td valign='top'><font face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular' size='2'>SHAKER HEIGHTS</font>&nbsp;</td></tr>
<tr bgcolor='#ffffff'><td valign='top'><font face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular' size='2'>
Please send support requests to perlmonks.org or the libwww-perl list at lists.perl.org.
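
For readers who hit the same symptom, one possible explanation (not confirmed anywhere in this ticket) is that the link is found by find_all_links() but the follow-up regex in the report fails, because "." does not match a newline unless the /s modifier is given. The sketch below illustrates that on a hypothetical, shortened HTML fragment; the position of the line break in $html is an assumption made for illustration, not the actual page source.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical fragment: the "\n" between the link cell and the name cell
# is an assumption for illustration, not taken from the real page.
my $html = "<a href='General.asp?txtParcel=73509079'>735-09-079</a></font>&nbsp;</td>\n"
         . "<td valign='top'><font size='2'>SINGERMAN MICHAEL</font>";

my $url = 'General.asp?txtParcel=73509079';

# Same pattern as in the report. Without /s, "." stops at the newline,
# so ".*?" cannot reach the next size='2'> and the match fails.
my ($without_s) = $html =~ m/\Q$url\E.*?size='2'>(.*?)<\/font>/;

# With /s, "." also matches newlines and the name is captured.
my ($with_s) = $html =~ m/\Q$url\E.*?size='2'>(.*?)<\/font>/s;

print defined $without_s ? "without /s: $without_s\n" : "without /s: no match\n";
print defined $with_s    ? "with /s: $with_s\n"       : "with /s: no match\n";
```

If this is what happened, adding /s to the match in the original script would be enough, and no change to find_link would be needed; checking whether @links already contains the 73509079 entry is a quick way to tell the two cases apart.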