
This queue is for tickets about the Unicode-Collate CPAN distribution.

Report information
The Basics
Id: 102663
Status: resolved
Priority: 0/
Queue: Unicode-Collate

People
Owner: Nobody in particular
Requestors: JHI [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: 1.12



Subject: IRIX 6.5 failures with Unicode::Collate
In IRIX 6.5, most Unicode::Collate tests succeed, but three have failures:

t/ident.t
t/loc_cjk.t
t/loc_cjkc.t

Logs attached.

I don't know where to start debugging because the failures just say "not ok" and I'm not familiar with the code, but I have access to the system and can debug if given ideas of what to try.
Subject: uc-log.tgz
application/octet-stream 27k


Subject: Re: [rt.cpan.org #102663] IRIX 6.5 failures with Unicode::Collate
Date: Fri, 13 Mar 2015 06:56:02 +0900
To: bug-Unicode-Collate [...] rt.cpan.org
From: Sadahiro Tomoyuki <rsn10260 [...] nifty.com>
Thank you for your report!

{ take 1 }

The results of loc_cjk.t and loc_cjkc.t seem to be due to the change or removal of all the data with higher weight values in DUCET; I don't know why.

DUCET contains 16-bit values as weights, which are compared for sorting; they range over 0x0000..0x4CFC and 0x8000..0xFFFF. For the weights of CJK ideographic characters, a combination of 0xFB40..0xFB85 and 0x8000..0xFFFF is used. All of the latter may be changed or removed on your system.

<a reproduction...>

(1) delete all the lines with [.Fxxx.yyyy.zzzz] from DUCET.
>perl -ni.orig -e print()if!/\.F/ allkeys.txt
(2) then try to install

# These results are the same as uc-t-loc_cjk-t.log and uc-t-loc_cjkc-t.log.

Test Summary Report
-------------------
t/loc_cjk.t (Wstat: 0 Tests: 3589 Failed: 448)
  Failed tests: 6-453
t/loc_cjkc.t (Wstat: 0 Tests: 8025 Failed: 3140)
  Failed tests: 10-1011, 1017-1018, 1023, 1033, 1046, 1059-1060
                1062, 1080, 1103, 1120, 1123, 1125, 1142
                (snip)
                7874-7877, 7880-7883, 7885-7886, 7888-7890
                7892-7910, 7912-8025
Files=127, Tests=25923, 23 wallclock secs ( 2.98 usr + 0.30 sys = 3.28 CPU)
Result: FAIL
Failed 2/127 test programs. 3588/25923 subtests failed.

{ take 2 }

But I cannot guess why uc-t-ident-t.log came out that way.

# Its test 37 is this:
# ok($Collator->viewSortKey("\x{100000}"),
#    '[FBE0 8000 | 0020 | 0002 | FFFF FFFF | 0010 0000]');

What is the output of the following code? If something has a problem, a different value would appear.

#!perl
use Unicode::Collate;
print "Unicode::Collate $Unicode::Collate::VERSION, ",
    exists &Unicode::Collate::bootstrap ? "has XS\n" : "no XS\n";
my $c = Unicode::Collate->new(identical => 1);
for my $u (0x41, 0x3220, 0x4E00, 0xF967, 0x2B81D, 0x100000) {
    print $c->viewSortKey(chr $u), "\n";
}
__END__
# an example of output:
Unicode::Collate 1.11, has XS
[190C | 0020 | 0008 | FFFF | 0000 0041]
[FB40 CE00 | 0020 | 0004 | 030A FFFF FFFF 030B | 0000 3220]
[FB40 CE00 | 0020 | 0002 | FFFF FFFF | 0000 4E00]
[FB40 CE0D | 0020 | 0002 | FFFF FFFF | 0000 4E0D]
[FB85 B81D | 0020 | 0002 | FFFF FFFF | 0002 B81D]
[FBE0 8000 | 0020 | 0002 | FFFF FFFF | 0010 0000]

Regards,
SADAHIRO Tomoyuki
> Mon Mar 09 19:48:04 2015: Request 102663 was acted upon.
> Transaction: Ticket created by JHI
> Queue: Unicode-Collate
> Subject: IRIX 6.5 failures with Unicode::Collate
> Broken in: (no value)
> Severity: (no value)
> Owner: Nobody
> Requestors: JHI@cpan.org
> Status: new
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=102663 >
>
>
> In IRIX 6.5, most Unicode::Collate tests succeed, but three have failures:
>
> t/ident.t
> t/loc_cjk.t
> t/loc_cjkc.t
>
> Logs attached.
>
> I don't know where to start debugging because the failures just say "not ok" and I'm not
> familiar with the code, but I have access to the system and can debug if given ideas of
> what to try.
>
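The two-part primary weights in the example output above (FB40 CE00, FB85 B81D, FBE0 8000) are consistent with the implicit-weight derivation described in UTS #10. The following is a minimal sketch of that arithmetic in plain C, not the module's XS code. The names implicit_weights and derive_implicit are hypothetical, and the base value is assumed to be chosen by the caller (0xFB40, 0xFB80 and 0xFBC0 are the bases that match the three examples).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type, not part of Unicode::Collate. */
typedef struct {
    uint16_t aaaa;  /* first primary weight: base plus the high bits of the code point */
    uint16_t bbbb;  /* second primary weight: low 15 bits with the top bit set */
} implicit_weights;

/*
 * Derive the two 16-bit primary weights for a code point that has no
 * explicit entry in the table. The caller supplies the base (assumption:
 * 0xFB40, 0xFB80 or 0xFBC0, depending on the kind of character).
 */
static implicit_weights derive_implicit(uint32_t code, uint16_t base)
{
    implicit_weights w;

    assert(code <= 0x10FFFF);  /* valid Unicode code points only */
    w.aaaa = (uint16_t)(base + (code >> 15));
    w.bbbb = (uint16_t)((code & 0x7FFF) | 0x8000);
    return w;
}
```

Because bbbb always has its top bit set, it falls in 0x8000..0xFFFF, which is the range the reproduction in take 1 tries to disturb.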
Subject: Re: [rt.cpan.org #102663] IRIX 6.5 failures with Unicode::Collate
Date: Thu, 12 Mar 2015 19:43:42 -0400
To: bug-Unicode-Collate [...] rt.cpan.org
From: Jarkko Hietaniemi <jhi [...] iki.fi>
On Thursday-201503-12 17:56, Sadahiro Tomoyuki via RT wrote:
> #!perl
> use Unicode::Collate;
> print "Unicode::Collate $Unicode::Collate::VERSION, ",
>     exists &Unicode::Collate::bootstrap ? "has XS\n" : "no XS\n";
> my $c = Unicode::Collate->new(identical => 1);
> for my $u (0x41, 0x3220, 0x4E00, 0xF967, 0x2B81D, 0x100000) {
>     print $c->viewSortKey(chr $u), "\n";
> }
> __END__
> # an example of output:
> Unicode::Collate 1.11, has XS
> [190C | 0020 | 0008 | FFFF | 0000 0041]
> [FB40 CE00 | 0020 | 0004 | 030A FFFF FFFF 030B | 0000 3220]
> [FB40 CE00 | 0020 | 0002 | FFFF FFFF | 0000 4E00]
> [FB40 CE0D | 0020 | 0002 | FFFF FFFF | 0000 4E0D]
> [FB85 B81D | 0020 | 0002 | FFFF FFFF | 0002 B81D]
> [FBE0 8000 | 0020 | 0002 | FFFF FFFF | 0010 0000]

Irix says:

Unicode::Collate 1.11, has XS
[190C | 0020 | 0008 | FFFF | 0000 0041]
[FB40 CE00 | 0020 | 0004 | 030A FFFF FFFF 030B | 0000 3220]
[FB40 CE00 | | | FFFF FFFF | 0000 4E00]
[FB40 CE0D | | | FFFF FFFF | 0000 4E0D]
[FB85 B81D | | | FFFF FFFF | 0002 B81D]
[FBE0 8000 | | | FFFF FFFF | 0010 0000]
Subject: Re: [rt.cpan.org #102663] IRIX 6.5 failures with Unicode::Collate
Date: Thu, 12 Mar 2015 20:30:25 -0400
To: bug-Unicode-Collate [...] rt.cpan.org, JHI [...] cpan.org
From: Jarkko Hietaniemi <jhi [...] iki.fi>
I also checked: the generated Collate.c and ucatbl.h on IRIX are identical to what gets generated on my Mac.
Subject: Re: [rt.cpan.org #102663] IRIX 6.5 failures with Unicode::Collate
Date: Fri, 13 Mar 2015 21:39:30 +0900
To: bug-Unicode-Collate [...] rt.cpan.org
From: Sadahiro Tomoyuki <rsn10260 [...] nifty.com>
Thank you for your report.
I found that all the failures may be due to one change.
It is in part of an XS function named _derivCE_9().

The missing '| 0020 | 0002 |' is derived from "\x00\x20\x00\x02"
between two 16-bit sequences "\xFF\xFF" in U8 a[].
But I don't see what actually happens, or why.

Possibly all bytes of a[] are being filled with \0.
The other parts of the viewSortKey() output are derived from
the contents of DUCET or from calculation.
The missing '| 0020 | 0002 |' should be derived
from "\x00\x20\x00\x02" without any change.

<trial change>
--- Collate.xs~ Tue Feb 17 21:23:44 2015
+++ Collate.xs Fri Mar 13 21:15:26 2015
@@ -268,7 +268,7 @@
     _derivCE_24 = 5
   PREINIT:
     UV base, aaaa, bbbb;
-    U8 a[VCE_Length + 1] = "\x00\xFF\xFF\x00\x20\x00\x02\xFF\xFF";
+    U8 a[VCE_Length + 1] = "\x00\xFF\xFF\x00\x00\x00\x00\xFF\xFF";
     U8 b[VCE_Length + 1] = "\x00\xFF\xFF\x00\x00\x00\x00\xFF\xFF";
     bool basic_unified = 0;
   PPCODE:

<result>
Test Summary Report
-------------------
t/ident.t (Wstat: 0 Tests: 45 Failed: 1)
  Failed test: 37
t/loc_cjk.t (Wstat: 0 Tests: 3589 Failed: 448)
  Failed tests: 6-453
t/loc_cjkc.t (Wstat: 0 Tests: 8025 Failed: 3140)
  Failed tests: 10-1011, 1017-1018, 1023, 1033, 1046, 1059-1060
                1062, 1080, 1103, 1120, 1123, 1125, 1142
                (snip)
                7874-7877, 7880-7883, 7885-7886, 7888-7890
                7892-7910, 7912-8025
Files=127, Tests=25923, 23 wallclock secs ( 2.45 usr + 0.38 sys = 2.83 CPU)
Result: FAIL
Failed 3/127 test programs. 3589/25923 subtests failed.

Regards,
SADAHIRO
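The empty '| |' fields in the IRIX output are what one would expect if the level 2 and level 3 bytes of a[] were zero, since zero weights are left out when a sort key is formed (UTS #10). Below is a minimal sketch in plain C, not the module's code. It assumes a 9-byte element layout inferred from the initializers in the trial change (one flag byte, three 16-bit big-endian weights for levels 1 to 3, then two trailing bytes); CE_BYTES, weight_at and view_levels are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CE_BYTES 9  /* assumed: flag byte + three 16-bit levels + two trailing bytes */

/* Read the 16-bit big-endian weight for level 1..3 from one element. */
static unsigned weight_at(const unsigned char *ce, int level)
{
    return ((unsigned)ce[2 * level - 1] << 8) | ce[2 * level];
}

/*
 * Write a "L1 | L2 | L3" view of n consecutive elements into out.
 * Zero weights are skipped, so a level whose weights are all zero
 * shows up as an empty field between the bars.
 */
static void view_levels(const unsigned char *ces, int n,
                        char *out, size_t outsize)
{
    size_t used = 0;

    assert(out != NULL && outsize > 0);
    out[0] = '\0';
    for (int level = 1; level <= 3; level++) {
        if (level > 1 && used + 3 <= outsize)
            used += (size_t)snprintf(out + used, outsize - used, " |");
        for (int i = 0; i < n; i++) {
            unsigned w = weight_at(ces + (size_t)i * CE_BYTES, level);
            if (w == 0 || used + 6 > outsize)
                continue;  /* zero weight, or no room left in the buffer */
            used += (size_t)snprintf(out + used, outsize - used,
                                     "%s%04X", used ? " " : "", w);
        }
    }
}
```

With the two elements for U+4E00, intact bytes give "FB40 CE00 | 0020 | 0002", while zeroed level 2 and level 3 bytes give "FB40 CE00 | |", which matches the shape of the IRIX lines.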
Subject: Re: [rt.cpan.org #102663] IRIX 6.5 failures with Unicode::Collate
Date: Fri, 13 Mar 2015 21:42:39 +0900
To: bug-Unicode-Collate [...] rt.cpan.org
From: Sadahiro Tomoyuki <rsn10260 [...] nifty.com>
May such a patch fix it?
Would you please try it?

Regards,
SADAHIRO

--- Collate.xs~ Tue Feb 17 21:23:44 2015
+++ Collate.xs Fri Mar 13 21:38:00 2015
@@ -268,8 +268,8 @@
     _derivCE_24 = 5
   PREINIT:
     UV base, aaaa, bbbb;
-    U8 a[VCE_Length + 1] = "\x00\xFF\xFF\x00\x20\x00\x02\xFF\xFF";
-    U8 b[VCE_Length + 1] = "\x00\xFF\xFF\x00\x00\x00\x00\xFF\xFF";
+    U8 a[VCE_Length + 1] = "\x00\x00\x00\x00\x00\x00\x00\x00\x00";
+    U8 b[VCE_Length + 1] = "\x00\x00\x00\x00\x00\x00\x00\x00\x00";
     bool basic_unified = 0;
   PPCODE:
     if (CJK_UidIni <= code) {
@@ -299,6 +299,8 @@
     a[2] = (U8)(aaaa & 0xFF);
     b[1] = (U8)(bbbb >> 8);
     b[2] = (U8)(bbbb & 0xFF);
+    a[4] = '\x20'; /* second octet of level 2 */
+    a[6] = '\x02'; /* second octet of level 3 */
     a[7] = b[7] = (U8)(code >> 8);
     a[8] = b[8] = (U8)(code & 0xFF);
     EXTEND(SP, 2);
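The patch stops relying on a string-literal initializer with embedded "\x00" bytes and instead stores every meaningful byte explicitly. Why the original initializer misbehaved on IRIX is not established in this thread, so the following is only a sketch of the explicit-store approach in plain C, outside XS. The names VCE_BYTES and fill_derived are hypothetical, the 9-byte size is assumed from the length of the initializers, and aaaa and bbbb are taken as already computed.

```c
#include <assert.h>
#include <string.h>

#define VCE_BYTES 9  /* assumed: number of bytes in the patch's initializers */

/*
 * Fill the two derived elements a and b for one code point, writing each
 * byte explicitly rather than depending on a string-literal initializer.
 */
static void fill_derived(unsigned code, unsigned aaaa, unsigned bbbb,
                         unsigned char *a, unsigned char *b)
{
    assert(a != NULL && b != NULL);

    /* Start from all zeros, as the patched initializers do. */
    memset(a, 0, VCE_BYTES);
    memset(b, 0, VCE_BYTES);

    /* Level 1: the two derived primary weights, big-endian. */
    a[1] = (unsigned char)((aaaa >> 8) & 0xFF);
    a[2] = (unsigned char)(aaaa & 0xFF);
    b[1] = (unsigned char)((bbbb >> 8) & 0xFF);
    b[2] = (unsigned char)(bbbb & 0xFF);

    /* Levels 2 and 3 of the first element only, as in the patch. */
    a[4] = 0x20;  /* second octet of level 2 */
    a[6] = 0x02;  /* second octet of level 3 */

    /* Trailing bytes: low 16 bits of the code point, in both elements. */
    a[7] = b[7] = (unsigned char)((code >> 8) & 0xFF);
    a[8] = b[8] = (unsigned char)(code & 0xFF);
}
```

For U+4E00 with aaaa = 0xFB40 and bbbb = 0xCE00, this leaves a as 00 FB 40 00 20 00 02 4E 00 and b as 00 CE 00 00 00 00 00 4E 00, independent of how the compiler treats escaped NUL bytes in a literal.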
On Fri Mar 13 08:43:04 2015, rsn10260@nifty.com wrote:
> May such a patch fix it?
> Would you please try it?

The below change fixes loc_cjk.t and loc_cjkc.t AND ident.t in IRIX, thanks!

> Regards,
> SADAHIRO
>
> --- Collate.xs~ Tue Feb 17 21:23:44 2015
> +++ Collate.xs Fri Mar 13 21:38:00 2015
> @@ -268,8 +268,8 @@
>      _derivCE_24 = 5
>    PREINIT:
>      UV base, aaaa, bbbb;
> -    U8 a[VCE_Length + 1] = "\x00\xFF\xFF\x00\x20\x00\x02\xFF\xFF";
> -    U8 b[VCE_Length + 1] = "\x00\xFF\xFF\x00\x00\x00\x00\xFF\xFF";
> +    U8 a[VCE_Length + 1] = "\x00\x00\x00\x00\x00\x00\x00\x00\x00";
> +    U8 b[VCE_Length + 1] = "\x00\x00\x00\x00\x00\x00\x00\x00\x00";
>      bool basic_unified = 0;
>    PPCODE:
>      if (CJK_UidIni <= code) {
> @@ -299,6 +299,8 @@
>      a[2] = (U8)(aaaa & 0xFF);
>      b[1] = (U8)(bbbb >> 8);
>      b[2] = (U8)(bbbb & 0xFF);
> +    a[4] = '\x20'; /* second octet of level 2 */
> +    a[6] = '\x02'; /* second octet of level 3 */
>      a[7] = b[7] = (U8)(code >> 8);
>      a[8] = b[8] = (U8)(code & 0xFF);
>      EXTEND(SP, 2);