Subject: _slurp_uri should take base URI into account
Date: Fri, 23 Feb 2007 17:56:21 -0800
To: bug-XML-Twig [...] rt.cpan.org
From: Dave Charness <dave [...] denali.com>

We have a script that uses XML::Twig. We run it from the top of a
source tree on a document that lies in a subdirectory. That document
uses an external entity declared in its internal DTD to include another
file in the same subdirectory. When I upgraded from XML::Twig 3.15 to
3.29, we started getting an error:
cannot open 'xxx': No such file or directory at /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/XML/Parser/Expat.pm line 469
As far as I've determined, this comes from _slurp_uri passing $uri
verbatim to _slurp when $uri doesn't have a scheme (that is, the else
branch). A simple filename (or any relative URI) should be
interpreted relative to the declaring document, but since the document
isn't in the current directory, the open in _slurp fails.
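
To make the intended resolution concrete, here is a minimal standalone sketch of the path handling I have in mind. It is only an illustration: resolve_against_base is a hypothetical helper name, not an XML::Twig function, and it assumes the base is the path that was passed to parsefile.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Twig): resolve $uri against the
# path of the declaring document, $base.
sub resolve_against_base {
    my ($uri, $base) = @_;

    # Nothing to resolve against: return the URI unchanged.
    return $uri if !defined $base;

    # Already absolute (leading slash or backslash) or prefixed with a
    # scheme or drive letter: return it unchanged.
    return $uri if $uri =~ m{^(?:[\\/]|\w+:)};

    # Replace the last path component of the base with the relative URI.
    (my $resolved = $base) =~ s{[^\\/:]*$}{$uri};
    return $resolved;
}

print resolve_against_base('ent.xml', 'xml/doc.xml'), "\n";       # xml/ent.xml
print resolve_against_base('/tmp/ent.xml', 'xml/doc.xml'), "\n";  # /tmp/ent.xml
```

With the layout from the example in this report, 'ent.xml' declared in 'xml/doc.xml' resolves to 'xml/ent.xml', which is the file the open should be looking for.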
I've recreated this with a simple example (tried with perl 5.6.1 and
perl 5.8.8, XML::Parser 2.34):
$ cat script.pl
use XML::Twig;
$t = new XML::Twig;
$t->parsefile('xml/doc.xml');
$ cat xml/doc.xml
<!DOCTYPE x [ <!ENTITY ent SYSTEM "ent.xml" > ]>
<x>&ent;</x>
$ cat xml/ent.xml
<foo/>
$ perl script.pl
cannot open 'ent.xml': No such file or directory at /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/XML/Parser/Expat.pm line 469
Attached is my temporary patch for our local use, which appears to be
working fine.
Thanks for the good work,
-Dave
dave@denali.com
650 461 7213
--- XML/Twig.pm.orig 2007-02-23 17:45:21.310537000 -0800
+++ XML/Twig.pm 2007-02-23 17:05:48.430731000 -0800
@@ -851,11 +851,20 @@
}
sub _slurp_uri
- { my( $uri)= @_;
+ { my( $uri, $base)= @_;
if( $uri=~ m{^\w+://})
{ _use( 'LWP::Simple'); return LWP::Simple::get( $uri); }
else
- { return _slurp( $uri); }
+ {
+ # cf. XML/Parser.pm's file_ext_ent_handler
+ if (defined($base)
+ and not ($uri =~ m!^(?:[\\/]|\w+:)!)) {
+ my $newpath = $base;
+ $newpath =~ s![^\\/:]*$!$uri!;
+ $uri = $newpath;
+ }
+ return _slurp( $uri);
+ }
}
sub _slurp
@@ -2153,7 +2162,7 @@
{
my( $p, $name, $val, $sysid, $pubid, $ndata)= @_;
my $t=$p->{twig};
- if( $sysid && !$ndata) { $val= _slurp_uri( $sysid); }
+ if( $sysid && !$ndata) { $val= _slurp_uri( $sysid, $p->base); }
my $ent=XML::Twig::Entity->new( $name, $val, $sysid, $pubid, $ndata);
$t->entity_list->add( $ent);
if( $parser_version > 2.27)