
This queue is for tickets about the XML-Entities CPAN distribution.

Report information
The Basics
Id: 128106
Status: new
Priority: 0/
Queue: XML-Entities

People
Owner: Nobody in particular
Requestors: run2000 [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: (no value)
Fixed in: (no value)



Subject: Entity name match should allow dots and hyphens in download-entities.pl
Hi, I was regenerating the Data module against the entity sets at https://www.w3.org/2003/entities/2007/ and noticed that some entities were missing. Most notably, the isogrk4 set came out empty, although I expected it to contain entity mappings. Looking through the source of download-entities.pl, the cause is that the entity name regex accepts only word characters (\w+), so it skips every declaration whose name contains a dot or a hyphen. The attached patch fixes the problem.
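The mismatch described above can be reproduced with a minimal sketch. It does not call download-entities.pl itself. The sample declarations, the `entity_names` helper, and the simplified surrounding pattern are assumptions made for illustration. Only the two name patterns (`\w+` and `\w[\w.\-]*`) come from the patch, and the `foo-bar` name is made up.

```perl
use strict;
use warnings;

# Illustrative input only: declarations written in the style of the W3C
# entity sets. "foo-bar" is a made-up name used to exercise the hyphen case.
my $sample = <<'END';
<!ENTITY alpha "&#x003B1;" >
<!ENTITY b.alpha "&#x1D6C2;" >
<!ENTITY foo-bar "&#x00041;" >
END

# Name patterns before and after the patch.
my $old_name = qr/\w+/;
my $new_name = qr/\w[\w.\-]*/;

# Hypothetical helper: return every entity name that the given name
# pattern allows to match inside a complete <!ENTITY ...> declaration.
sub entity_names {
    my ($text, $name_re) = @_;
    my @names = $text =~ /<!ENTITY \s+ ($name_re) \s+ "&\#[^"]+" \s* >/sgx;
    return @names;
}

my @old = entity_names($sample, $old_name);
my @new = entity_names($sample, $new_name);

print "old: @old\n";    # old: alpha
print "new: @new\n";    # new: alpha b.alpha foo-bar
```

With `\w+`, the match for `b.alpha` stops at the dot, the following `\s+` cannot match, and the whole declaration is dropped. This is consistent with a set made up entirely of dotted names coming out empty.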
Subject: download-entities.patch
diff --git a/bin/download-entities.pl b/bin/download-entities.pl
index 97ce11d..2336d41
--- a/bin/download-entities.pl
+++ b/bin/download-entities.pl
@@ -158,8 +158,8 @@ sub report_error {
 sub parse_ent {
     my ($ent_file_ref) = @_;
     if (not ref $ent_file_ref) { $ent_file_ref = \$ent_file_ref }
-    my @raw_defs = $$ent_file_ref =~ /(?<=<!ENTITY) \s* \w+ \s+ "&[^"]+" (?=\s*>)/sgx;
-    my @name_value_pairs = map {my ($n, $v) = /(\w+) \s* "&\# ([^"]+) "/sx; [$n, $v]} @raw_defs;
+    my @raw_defs = $$ent_file_ref =~ /(?<=<!ENTITY) \s* [\w\.\-]+ \s+ "&[^"]+" (?=\s*>)/sgx;
+    my @name_value_pairs = map {my ($n, $v) = /(\w[\w\.\-]*) \s* "&\# ([^"]+) "/sx; [$n, $v]} @raw_defs;
     for (@name_value_pairs) {
         my $v = $$_[1];
         # For some reason, some entities like &lt; are defined like &#38;#60; instead of &#60; - just get rid of 38;#