Subject: | Entity encoding of URLs in Pod::Simple::XHTML |
There is a minor issue with the way that entities are handled in links
by Pod::Simple::XHTML.
For example, consider the following Pod snippet:
=pod
Here is a link L<http://search.cpan.org/search?query=pod&mode=all>.
=cut
If this is processed by Pod::Simple::XHTML (program attached) the link
part of the output is like this:
<p>Here is a link <a
href="http://search.cpan.org/search?query=pod&mode=all">http://search.cpan.org/search?query=pod&amp;mode=all</a>.</p>
The ampersand is encoded in the link text but not in the destination
(the href attribute). This causes the resulting file to fail validation
(http://validator.w3.org/check) with an error like this:
"An entity reference was found in the document, but there is no
reference by that name defined. Often this is caused by misspelling the
reference name, unencoded ampersands, or by leaving off the trailing
semicolon (;). The most common cause of this error is unencoded
ampersands in URLs as described by the WDG in "Ampersands in URLs"
http://www.htmlhelp.com/tools/validator/problems.html#amp. Entity
references start with an ampersand (&) and end with a semicolon (;). If
you want to use a literal ampersand in your document you must encode it
as "&amp;" (even inside URLs!)."
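To make the encoding the validator asks for concrete, here is a minimal sketch in plain Perl. The helper name escape_attr is hypothetical and is not part of Pod::Simple or HTML::Entities; it only stands in for whatever entity encoder the module already uses.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Pod::Simple): escape the characters
# that are significant inside a double-quoted XHTML attribute value.
sub escape_attr {
    my ($text) = @_;
    my %entity = (
        '&' => '&amp;',
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
    );
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

my $url = 'http://search.cpan.org/search?query=pod&mode=all';

# Prints: http://search.cpan.org/search?query=pod&amp;mode=all
print escape_attr($url), "\n";
```

A browser decodes "&amp;" back to "&" when it reads the attribute, so the link target itself is unchanged.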
A suggested fix is shown below and attached.
sub start_L {
    my ($self, $flags) = @_;
    my ($type, $to, $section) = @{$flags}{'type', 'to', 'section'};
    my $url = $type eq 'url' ? $to
            : $type eq 'pod' ? $self->resolve_pod_page_link($to, $section)
            : $type eq 'man' ? $self->resolve_man_page_link($to, $section)
            :                  undef;

    # Add something like this to encode entities in URL..
    if (defined $url) {
        $url = encode_entities($url);
    }

    # If it's an unknown type, use an attribute-less <a> like HTML.pm.
    $self->{'scratch'} .= '<a' . ($url ? ' href="'. $url . '">' : '>');
}
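
One possible way to check the result without writing any files is sketched below. It assumes the installed Pod::Simple provides output_string and parse_string_document (both are documented Pod::Simple methods); the regular expressions are my own rough check, not a substitute for the validator.

```perl
use strict;
use warnings;
use Pod::Simple::XHTML;

# Same Pod as test.pod, held in a string.
my $pod = "=pod\n\n"
        . "Here is a link L<http://search.cpan.org/search?query=pod&mode=all>.\n\n"
        . "=cut\n";

# Collect the generated XHTML in a string instead of a file.
my $html   = '';
my $parser = Pod::Simple::XHTML->new();
$parser->output_string(\$html);
$parser->parse_string_document($pod);

# Pull out every href value and flag any ampersand that does not
# start an "&amp;" entity.
my @hrefs = $html =~ /href="([^"]*)"/g;
my @bad   = grep { /&(?!amp;)/ } @hrefs;

if (@bad) {
    print "Unencoded ampersand in href: $_\n" for @bad;
}
else {
    print "No unencoded ampersands found in href values.\n";
}
```

With the unpatched module I would expect the first branch to fire for the search.cpan.org link, and the second once the URL is passed through encode_entities.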
John.
--
Subject: | test.pod |
=pod
Here is a link L<http://search.cpan.org/search?query=pod&mode=all>.
=cut
Subject: | psx.pl |
#!/usr/bin/perl
use strict;
use warnings;
use Pod::Simple::XHTML;
my $in_file = "test.pod";
my $out_file = "test.xhtml";
open my $xhtml_fh, '>', $out_file or die "Can't write to $out_file: $!";
my $parser = Pod::Simple::XHTML->new();
$parser->output_fh($xhtml_fh);
$parser->parse_file($in_file);
__END__
Subject: | test.xhtml |
Message body not shown because it is not plain text.
Subject: | xhtml.patch |
--- C:\strawberry\perl\lib\Pod\Simple\XHTML.pm Thu Dec 17 10:52:00 2009 UTC
+++ C:\strawberry\perl\lib\Pod\Simple\XHTML_patch.pm Tue Aug 10 23:54:06 2010 UTC
@@ -505,6 +505,11 @@
             : $type eq 'man' ? $self->resolve_man_page_link($to, $section)
             :                  undef;
 
+    # Escape any XML entities in the URL. Mainly for &.
+    if (defined $url) {
+        $url = encode_entities($url);
+    }
+
     # If it's an unknown type, use an attribute-less <a> like HTML.pm.
     $self->{'scratch'} .= '<a' . ($url ? ' href="'. $url . '">' : '>');
 }