Bug #1239 for XML-Checker: Checking fails for an entity with 62 or more members

Fri Jul 05 09:35:53 2002 Guest - Ticket created

Subject:

Checking fails for an entity with 62 or more members

Distribution: XML-Checker-0.12
Perl version: 5.005_03
Operating system: Linux 2.2.14-5.0 (RedHat)

Bug details:

If a DTD has an element with 62 or more members, and one or more of these members is optional or is permitted to occur multiple times, validation of this element always fails with error message 154, "Bad order of Elements".

This appears to be related to the way in which the element is tokenised (sub _tokenize): if there are 62 or more members, the tokens are hex strings (for fewer than 62, a single character from $IDS is used). The regular expression generated from these tokens (sub setModel) doesn't appear to be correctly formed for tokens longer than a single character, because each quantifier then applies only to the last character of the token.

E.g. if I have an element with 62 optional members:

    <!ELEMENT FRED (MEMBER1?, MEMBER2?, MEMBER3?, ... MEMBER62?) >

then the regular expression generated looks like:

    (01?02?03?04?...3d?3e?)

whereas it should, I think, be:

    ((01)?(02)?(03)?(04)?...(3d)?(3e)?)

The attached patch file seems to fix this problem for me, though I'm not confident that I really understand the code well enough to be certain it's right.

Regards
Brian Mills.
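The quantifier problem described above can be reproduced in isolation. The following is a minimal sketch, not code from XML::Checker: the three-token regexes are shortened, hand-written versions of the two forms quoted in the report, and the anchors and the encoded child string are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hand-written, shortened versions of the two regex forms from the report.
# Tokens '01', '02', '03' stand in for MEMBER1, MEMBER2, MEMBER3 (assumed encoding).
my $unpatched = qr/^(01?02?03?)$/;        # '?' applies to the last character only
my $patched   = qr/^((01)?(02)?(03)?)$/;  # '?' applies to the whole two-character token

# Children MEMBER1 and MEMBER3 present, MEMBER2 omitted.
my $children = '0103';

printf "unpatched: %s\n", $children =~ $unpatched ? 'match' : 'no match';
printf "patched:   %s\n", $children =~ $patched   ? 'match' : 'no match';
```

In the unpatched form the leading `0` of every token is mandatory, so omitting any optional member makes the match fail; only the parenthesised form accepts the shortened child sequence.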

--- Checker.pm 2002-04-23 00:36:42.000000000 +0100
+++ Checker.pm.1 2002-07-05 14:06:58.000000000 +0100
@@ -10,6 +10,14 @@
 # - Implied handler?
 # - Notation, Entity, Unparsed checks, Default handler?
 # - check no root element (it's checked by expat) ?
+#
+# *****
+# Patched by B Mills, 05/07/2002:
+# sub setModel: Generation of regexp goes wrong if an element has more than 62 members and any
+#               of these has cardinality other than 1:
+#               Parentheses are required around each re token, because the tokens are encoded
+#               as character pairs if there's 62 or more of them.
+# *****
 
 package XML::Checker::Term;
 use strict;
@@ -441,7 +449,7 @@
 # cp := ( name | choice | seq ) ('?' | '*' | '+')?
 $n++ while s/<[ncs](\d+)>([?*+]?)/_add (C => 'a', N => $_n++,
-    S => ($_map{$1}->re . $2))/eg;
+    S => ('('. $_map{$1}->re .')'. $2))/eg;
 # choice := '(' ch_l ')'
 $n++ while s/$\s*<[ad](\d+)>\s*$/_add
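To see the effect of the added parentheses at the 62-member threshold, here is a rough sketch. `build_seq_re` is a hypothetical helper written for this illustration and is not part of XML::Checker. The two-digit hex tokens `01` to `3e` follow the form quoted in the report, and the anchoring is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper (not from XML::Checker): build a sequence regex in which
# every token is optional. With $wrap set, each token is parenthesised before
# the '?' is appended, mirroring what the patch does in setModel.
sub build_seq_re {
    my ($tokens, $wrap) = @_;
    my $body = join '', map { ($wrap ? "($_)" : $_) . '?' } @$tokens;
    return qr/^(?:$body)$/;
}

# 62 two-character hex tokens, '01' .. '3e' (assumed encoding, as in the report).
my @tokens = map { sprintf '%02x', $_ } 1 .. 62;

# Encoded children: MEMBER1, MEMBER3 and MEMBER62 only.
my $input = join '', @tokens[0, 2, 61];

my $plain   = build_seq_re(\@tokens, 0);
my $wrapped = build_seq_re(\@tokens, 1);

print "plain:   ", ($input =~ $plain   ? 'match' : 'no match'), "\n";
print "wrapped: ", ($input =~ $wrapped ? 'match' : 'no match'), "\n";
```

Note that the patch wraps every term in capturing parentheses, not only multi-character tokens, so it also changes the generated regex for elements with fewer than 62 members.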

Sat Jul 06 21:16:28 2002 TJMATHER [...] cpan.org - Correspondence added

I applied the patch, but got an error with t/chk_batch.t:

    t/chk_batch.........FAILED tests 2, 16, 30, 44
    Failed 4/56 tests, 92.86% okay

This test worked before I applied the patch, so either the patch broke something or the test is flawed. If you can resubmit a corrected patch, or a patch for the test, I will apply it.

Also, you may want to look into XML::LibXML; it has much better support for DTD validation.

Thanks

Sat Jul 06 21:16:29 2002 TJMATHER [...] cpan.org - Status changed from 'new' to 'open'