Subject: | Bug Submission |
Date: | Wed, 27 Dec 2006 18:10:41 -0500 |
To: | bug-WWW-Mechanize [...] rt.cpan.org |
From: | "Aaron Delong" <aaronpdelong [...] gmail.com> |
I'm having trouble with the bug list interface, so apologies in advance if this is a
duplicate.
I'm using the most recent version of ActiveState Perl (downloaded and installed a week ago) with the
following code:
#!/usr/bin/perl -w
use strict;
use WWW::Mechanize;

# Single quotes: inside double quotes, \L would be read as a lowercase escape.
my $outfile = 'c:\LWPout.txt';
my $agent = WWW::Mechanize->new();

# Fetch the search page and submit the form for one street.
$agent->get("http://auditor.cuyahogacounty.us/REPI/default.asp");
$agent->form_name('FormName4');
$agent->field('streetName', "AVALON");
$agent->field('City', "SHAKER HEIGHTS" );
$agent->submit;

# For each parcel link, pull the owner name from the cell that follows it.
my @links = $agent->find_all_links();
for my $link ( @links ){
    my $url = $$link[0];
    my $myFile = $agent->content;
    if ($url =~ /General\.asp/) {
        if ($myFile =~ m/\Q$url\E.*?size='2'>(.*?)<\/font>/){
            print "$$link[0]\t$$link[1]\t$1\n";
        }
    }
}
I get the following output:
C:\Perl\pl>perl learnmechanize.pl
General.asp?txtParcel=73509050 735-09-050 KARBLER, WILLIAM R.
General.asp?txtParcel=73509078 735-09-078 TAYLOR, ANDRE
General.asp?txtParcel=73509051 735-09-051 PTERSON, JENNIFER H. & JELOVSE
General.asp?txtParcel=73509077 735-09-077 CULLERS, ROMNEY B.
General.asp?txtParcel=73509052 735-09-052 GEBHARDT, DOUGLAS S. & KRISTEN
General.asp?txtParcel=73509076 735-09-076 ROSS, ALLISON LORRAINE
General.asp?txtParcel=73509053 735-09-053 JAFFE, MICHAEL G.
General.asp?txtParcel=73509054 735-09-054 GILL WILLIAM A & MAUREEN K
General.asp?txtParcel=73509075 735-09-075 CHMIELEWSKI, BENJAMIN K
This is fine, except that I would expect one more line, for "SINGERMAN MICHAEL", at
the head of this list (see the snippet of HTML source below). The only
explanation I can think of is that SINGERMAN's URL is split across lines. Is
it possible to modify "find_link" (and thus "find_all_links") to catch such
situations, or is there a parameter that I am missing? I understand how I would
work around this if I were parsing with a regex, but I assume a fix in the
module is more involved. If it would be helpful for me to work on developing
the fix, I would be happy to contribute.
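One way to check whether the line break affects the link extraction itself or only the regex in the script above is to run that regex against a small fixed string. The sketch below is an assumption-laden test case, not a diagnosis: the two-row `$html` string and the `owner_for` helper are made up for illustration, and the position of the line break in the real page is a guess. It relies only on the fact that `.` in a Perl regex does not match a newline unless the `/s` modifier is given.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical two-row page modelled on the HTML snippet in this report.
# Row 1 has a line break between the link cell and the name cell;
# row 2 sits on a single line.
my $html = join '',
    "<tr><td><font size='2'><a href='General.asp?txtParcel=73509079'>735-09-079</a></font></td>\n",
    "<td><font size='2'>SINGERMAN MICHAEL</font></td></tr>",
    "<tr><td><font size='2'><a href='General.asp?txtParcel=73509050'>735-09-050</a></font></td>",
    "<td><font size='2'>KARBLER, WILLIAM R.</font></td></tr>\n";

# Hypothetical helper: same pattern as the script, with /s switched on or off.
sub owner_for {
    my ($content, $url, $dot_matches_newline) = @_;
    my $re = $dot_matches_newline
        ? qr/\Q$url\E.*?size='2'>(.*?)<\/font>/s   # "." may cross line breaks
        : qr/\Q$url\E.*?size='2'>(.*?)<\/font>/;   # "." stops at a newline
    return $content =~ $re ? $1 : undef;
}

for my $url ('General.asp?txtParcel=73509079', 'General.asp?txtParcel=73509050') {
    for my $flag (0, 1) {
        my $owner = owner_for($html, $url, $flag);
        printf "%-32s %-10s %s\n", $url, ($flag ? 'with /s' : 'without /s'),
            defined $owner ? $owner : '(no match)';
    }
}
```

If the real page behaves like this sample, the link is still returned by find_all_links and only the second match in the script drops the row, in which case adding `/s` to that regex would be enough. If the row is still missing with `/s`, the link extraction is the more likely culprit.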
Thanks,
Aaron
<tr bgcolor='#ffffff'><td valign='top'><font
face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular' size='2'><a href='
General.asp?txtParcel=73509079'>735-09-079</a></font> </td><td
valign='top'><font face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular'
size='2'>SINGERMAN MICHAEL</font> </td><td valign='top'><font
face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular'
size='2'>03259</font> </td><td valign='top'><font
face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular'
size='2'></font> </td><td valign='top'><font
face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular'
size='2'>AVALON</font> </td><td valign='top'><font
face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular' size='2'>SHAKER
HEIGHTS</font> </td></tr><tr bgcolor='#dcdcdc'><td valign='top'><font
face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular' size='2'><a href='
General.asp?txtParcel=73509050'>735-09-050</a></font> </td><td
valign='top'><font face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular'
size='2'>KARBLER, WILLIAM R.</font> </td><td valign='top'><font
face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular'
size='2'>03260</font> </td><td valign='top'><font
face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular'
size='2'></font> </td><td valign='top'><font
face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular'
size='2'>AVALON</font> </td><td valign='top'><font
face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular' size='2'>SHAKER
HEIGHTS</font> </td></tr><tr bgcolor='#ffffff'><td valign='top'><font
face='Arial,Helvetica,Geneva,Swiss,SunSans-Regular' size='2'>