Subject: | Names of entity sets can be illegal Perl sub names |
Hi,
I encountered this problem while refreshing the Data.pm module by running the download-entities.pl script against the entity files at https://www.w3.org/2003/entities/2007/.
The filenames can contain hyphens, such as "html5-uppercase.ent", "xhtml1-lat1.ent", "xhtml1-special.ent", and "xhtml1-symbol.ent". These names are used to generate Perl sub names.
The generated module then fails to compile, because a hyphen is not a valid character in a Perl sub name.
My workaround is to replace hyphens with underscores, which are permitted in sub identifiers. The patch below shows my solution, but you may prefer to strip the hyphens out altogether, or to use some other workaround.
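As a rough illustration of the substitution (a sketch only, not the patch itself): the helper name sub_name_for is hypothetical, and stripping the ".ent" extension is an assumption about how the set name is derived from the filename.

```perl
use strict;
use warnings;

# Hypothetical helper, not taken from download-entities.pl:
# derive a legal Perl sub name from an entity filename.
sub sub_name_for {
    my ($filename) = @_;
    (my $name = $filename) =~ s/\.ent\z//;   # assumption: drop the extension
    $name =~ tr/-/_/;                        # hyphens become underscores
    return $name;
}

print sub_name_for($_), "\n"
    for qw(html5-uppercase.ent xhtml1-lat1.ent xhtml1-special.ent);
```

Stripping the hyphens instead would only need tr/-//d in place of tr/-/_/.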
The patch also skips entity sets that have no character mappings defined. Among the entity files at https://www.w3.org/2003/entities/2007/, "htmlmathml.ent" is an example of a file with no mappings.
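The skip could look roughly like this. It is a sketch under assumptions: the %sets structure (set name mapped to a hash of entity name and code point) is hypothetical and may not match what download-entities.pl builds internally.

```perl
use strict;
use warnings;

# Hypothetical data: set name => { entity name => code point }.
my %sets = (
    xhtml1_lat1 => { nbsp => 0xA0, iexcl => 0xA1 },
    htmlmathml  => {},   # a set with no character mappings
);

# Keep only the sets that define at least one mapping.
my @to_emit = grep { scalar keys %{ $sets{$_} } } sort keys %sets;

print "$_\n" for @to_emit;
```

Only the non-empty sets would then go on to have a sub generated for them.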
Cheers,
Nicholas.
Subject: | download-entities.patch2 |
Message body not shown because it is not plain text.