Subject: Checking fails for an element with 62 or more members
Distribution: XML-Checker-0.12
Perl version: 5.005_03
Operating system: Linux 2.2.14-5.0 (RedHat)
Bug details:
If a DTD has an element with 62 or more members and one or more of these members is optional or is permitted to occur multiple times, validation of this element always fails with the error message 154 "Bad order of Elements".
This appears to be related to the way in which the element is tokenised (sub _tokenize): if there are 62 or more members, the tokens are hex strings (for fewer than 62, a single character from $IDS is used).
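The regular expression generated out of these tokens (sub setModel) doesn't appear to be correctly formed for tokens that are more than just a single character, because a quantifier such as ? then applies only to the last character of the token rather than to the whole token.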
For example, if I have an element with 62 optional members:
<!ELEMENT FRED (MEMBER1?, MEMBER2?, MEMBER3?, ... MEMBER62?) >
Then the regular expression generated looks like:
(01?02?03?04?...3d?3e?)
Whereas it should, I think, be:
((01)?(02)?(03)?(04)?...(3d)?(3e)?)
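The difference between the two forms can be reproduced outside XML::Checker. The following is only a minimal sketch: the three tokens and the child sequence are made up for illustration, and the regexps are built by hand rather than by sub setModel.

```perl
use strict;
use warnings;

# Hypothetical two-character hex tokens for three optional members
# (illustrative only; not produced by XML::Checker's _tokenize).
my @tokens = ('01', '02', '03');

# Ungrouped form: each '?' binds only to the last character of its token.
my $broken = join '', map { $_ . '?' } @tokens;          # 01?02?03?

# Grouped form: each token is parenthesised before the quantifier.
my $fixed = join '', map { '(' . $_ . ')?' } @tokens;    # (01)?(02)?(03)?

# A child sequence in which only the first and third members occur.
my $children = '0103';

my $broken_ok = ($children =~ /^(?:$broken)$/) ? 1 : 0;
my $fixed_ok  = ($children =~ /^(?:$fixed)$/)  ? 1 : 0;

printf "%-16s matches %s: %d\n", $broken, $children, $broken_ok;  # 0
printf "%-16s matches %s: %d\n", $fixed,  $children, $fixed_ok;   # 1
```

With the ungrouped form the second member cannot be skipped as a whole, because only its last character is optional, so a valid child sequence is rejected.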
The attached patch file seems to fix this problem for me, though I'm not confident that I really understand the code well enough to be certain it's right.
Regards
Brian Mills.
--- Checker.pm 2002-04-23 00:36:42.000000000 +0100
+++ Checker.pm.1 2002-07-05 14:06:58.000000000 +0100
@@ -10,6 +10,14 @@
# - Implied handler?
# - Notation, Entity, Unparsed checks, Default handler?
# - check no root element (it's checked by expat) ?
+#
+# *****
+# Patched by B Mills, 05/07/2002:
+# sub setModel: Generation of regexp goes wrong if an element has 62 or more members and any
+# of these has cardinality other than 1:
+# Parentheses are required around each re token, because the tokens are encoded
+# as character pairs if there's 62 or more of them.
+# *****
package XML::Checker::Term;
use strict;
@@ -441,7 +449,7 @@
# cp := ( name | choice | seq ) ('?' | '*' | '+')?
$n++ while s/<[ncs](\d+)>([?*+]?)/_add
(C => 'a', N => $_n++,
- S => ($_map{$1}->re . $2))/eg;
+ S => ('('. $_map{$1}->re .')'. $2))/eg;
# choice := '(' ch_l ')'
$n++ while s/\(\s*<[ad](\d+)>\s*\)/_add