Subject: large mismatch on here-document
Date: Sun, 02 Aug 2009 07:57:22 +1000
To: bug-URI-Find [...] rt.cpan.org
From: Kevin Ryde <user42 [...] zip.com.au>
With the Debian-packaged URI::Find 20090319 and perl 5.10.0, the program
below prints a single long line beginning
%3CEOF$0:unabletoguess...
I think it has decided that everything from "<<EOF" through to the end of
"<config-patches@gnu.org>" is a <...> style URL. The text is a fragment
of the GNU config.guess script. I had hoped URI::Find would not match that
big chunk, and would instead find the http:// URLs within it.
The match appears to be more or less <.*?>. I think something tighter may
be needed so that a "<" used as less-than, or as some other operator in
program code, does not start a match.
An easy option would be to require $self->uri_re to match inside the
brackets. That might be tighter than strictly needed, but it would have
the advantage of letting subclasses influence the behaviour.
use strict;
use warnings;
use URI::Find;
my $str = '
cat >&2 <<EOF
$0: unable to guess system type
This script, last modified $timestamp, has failed to recognize
the operating system you are using. It is advised that you
download the most up to date version of the config scripts from
http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.guess;hb=HEAD
and
http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.sub;hb=HEAD
If the version you run ($0) is already up to date, please
send the following data and any information you think might be
pertinent to <config-patches@gnu.org> in order to provide the needed
information to handle your system.
';
my $finder = URI::Find->new (sub {
    my ($uri, $orig) = @_;
    print $uri, "\n";
    return '';
});
$finder->find (\$str);
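
To illustrate the difference, here is a rough standalone sketch that does not use URI::Find at all. It compares a loose <.*?> style match with a tighter one that requires a scheme immediately after the "<". The names $scheme and $tight_re, the shortened sample text and the example.org URLs are my own assumptions for illustration; they are not taken from the module, and the real pattern there may differ.

```perl
use strict;
use warnings;

# Shortened stand-in for the config.guess fragment (hypothetical sample text).
my $str = <<'END';
cat >&2 <<EOF
$0: unable to guess system type
download the scripts from http://example.org/config.guess
send the data to <config-patches@gnu.org> please
END

# Loose bracket match, roughly the <.*?> behaviour described in the report.
my @loose = $str =~ /<(.*?)>/sg;

# Assumed tighter form: the bracketed text must begin with a scheme
# ("http:", "ftp:", ...) and may not contain whitespace or angle brackets.
my $scheme   = qr/[A-Za-z][A-Za-z0-9+.-]*:/;
my $tight_re = qr/<((?:$scheme)[^<>\s]*)>/;
my @tight    = $str =~ /$tight_re/g;

# A genuine bracketed URL should still be found by the tighter form.
my @ok = 'see <http://example.org/> for details' =~ /$tight_re/g;

printf "loose: %d match(es), first is %d chars long\n",
    scalar @loose, length $loose[0];
printf "tight: %d match(es) in the here-document text\n", scalar @tight;
print  "tight on a real bracketed URL: @ok\n";
```

The loose form swallows everything from the here-document operator to the closing ">" of the mail address, while the tighter form ignores both the "<<EOF" and the bare mail address. Whether a bracketed mail address ought to be treated as a URI is a separate question that this sketch does not try to answer.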