Skip Menu |

This queue is for tickets about the threads-shared CPAN distribution.

Report information
The Basics
Id: 37149
Status: resolved
Priority: 0/
Queue: threads-shared

People
Owner: Nobody in particular
Requestors: JGMYERS [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 1.14
Fixed in: 1.24



Subject: shared hashes fail to preserve utf-8ness of keys
Per the attached test case, shared hash tables fail to preserve the utf8::is_utf8()-ness of hash keys.
Subject: shared-utf8.t
#! /usr/bin/perl -w use strict; use warnings; use threads; use threads::shared; use Test::More tests => 2; binmode STDOUT, ":utf8"; my $name="\x{123}\x{84}\x{20F}\x{2C1}"; my @d; push @d, $name; my ($a); my %a:shared; unless ($a) { $a = &share({}); } %$a = (); %a = (); foreach (@d) { my $file="/tmp/$_"; $$a{$_} = $file; $a{$_} = $file; print "name=$_ file=$file\n"; } while (my ($key, $value) = each (%$a)) { print "key=$key value=$value\n"; ok( $key eq $name, "compare values in shared ref hash a: name and key"); } while (my ($key, $value) = each (%a)) { print "key=$key value=$value\n"; ok( $key eq $name, "compare values in shared hash a: name and key"); }
CC: bug-threads-shared [...] rt.cpan.org
Subject: [PATCH] [rt.cpan.org #37149] shared hashes fail to preserve utf-8ness of keys
Date: Mon, 30 Jun 2008 10:11:05 -0400
To: pp <perl5-porters [...] perl.org>
From: "Jerry D. Hedden" <jdhedden [...] cpan.org>
The attached patch fixes the subject bug. It also adds a new file to the test suite. In addition, t/wait.t and t/waithires.t have been modified to use the new watchdog() function that was added to test.pl.

Message body is not shown because sender requested not to inline it.

Testing the patched threads::shared as a standalone module compiling with perl 5.8.8, I get failures running the test harness: t/av_refs........Can't load '/u/jgmyers/src/threads-shared-1.23/blib/arch/auto/threads/shared/shared.so' for module threads::shared: /u/jgmyers/src/threads-shared-1.23/blib/arch/auto/threads/shared/shared.so: undefined symbol: DPPP_my_newSVpvn_flags at /u/jgmyers/perl/lib/5.8.8/i686-linux-thread-multi/DynaLoader.pm line 230. at t/av_refs.t line 38 and t/wait...........Can't locate ./t/test.pl in @INC (@INC contains: /u/jgmyers/src/threads-shared-1.23/blib/lib /u/jgmyers/src/threads-shared-1.23/blib/arch /u/jgmyers/perl/lib/5.8.8/i686-linux-thread-multi /u/jgmyers/perl/lib/5.8.8 /u/jgmyers/perl/lib/site_perl/5.8.8/i686-linux-thread-multi /u/jgmyers/perl/lib/site_perl/5.8.8 /u/jgmyers/perl/lib/site_perl .) at t/wait.t line 10. BEGIN failed--compilation aborted at t/wait.t line 16. t/wait...........dubious The fix seems a bit dubious in that it causes any iso-8859-1 keys to be utf8 upgradeed. I think a better approach is to copy the UTF8 bit into the shared hash table, such as done by the attached patch.
--- threads-shared-1.23/shared.xs 2008-06-30 11:17:45.000000000 -0700 +++ threads-shared-1.23-fix/shared.xs 2008-06-30 11:19:49.000000000 -0700 @@ -875,7 +875,9 @@ STRLEN len = mg->mg_len; assert ( mg->mg_ptr != 0 ); if (mg->mg_len == HEf_SVKEY) { - key = SvPVutf8((SV *)mg->mg_ptr, len); + key = SvPV((SV *) mg->mg_ptr, len); + if (SvUTF8((SV *) mg->mg_ptr)) + len = -len; } SHARED_CONTEXT; svp = hv_fetch((HV*) saggregate, key, len, 0); @@ -925,8 +927,11 @@ char *key = mg->mg_ptr; STRLEN len = mg->mg_len; assert ( mg->mg_ptr != 0 ); - if (mg->mg_len == HEf_SVKEY) - key = SvPVutf8((SV *)mg->mg_ptr, len); + if (mg->mg_len == HEf_SVKEY) { + key = SvPV((SV *) mg->mg_ptr, len); + if (SvUTF8((SV *) mg->mg_ptr)) + len = -len; + } SHARED_CONTEXT; svp = hv_fetch((HV*) saggregate, key, len, 1); } @@ -956,8 +961,11 @@ char *key = mg->mg_ptr; STRLEN len = mg->mg_len; assert ( mg->mg_ptr != 0 ); - if (mg->mg_len == HEf_SVKEY) - key = SvPVutf8((SV *)mg->mg_ptr, len); + if (mg->mg_len == HEf_SVKEY) { + key = SvPV((SV *) mg->mg_ptr, len); + if (SvUTF8((SV *) mg->mg_ptr)) + len = -len; + } SHARED_CONTEXT; hv_delete((HV*) saggregate, key, len, G_DISCARD); } @@ -1275,7 +1283,9 @@ exists = av_exists((AV*) sobj, SvIV(index)); } else { STRLEN len; - char *key = SvPVutf8(index, len); + char *key = SvPV(index,len); + if (SvUTF8(index)) + len = -len; SHARED_EDIT; exists = hv_exists((HV*) sobj, key, len); } @@ -1297,9 +1307,11 @@ hv_iterinit((HV*) sobj); entry = hv_iternext((HV*) sobj); if (entry) { + I32 utf8 = HeKWASUTF8(entry); key = hv_iterkey(entry,&len); CALLER_CONTEXT; - ST(0) = sv_2mortal(newSVpvn_utf8(key, len, 1)); + ST(0) = sv_2mortal(newSVpvn(key, len)); + if (utf8) SvUTF8_on(ST(0)); } else { CALLER_CONTEXT; ST(0) = &PL_sv_undef; @@ -1323,9 +1335,11 @@ SHARED_CONTEXT; entry = hv_iternext((HV*) sobj); if (entry) { + I32 utf8 = HeKWASUTF8(entry); key = hv_iterkey(entry,&len); CALLER_CONTEXT; - ST(0) = sv_2mortal(newSVpvn_utf8(key, len, 1)); + ST(0) = sv_2mortal(newSVpvn(key, len)); + if (utf8) SvUTF8_on(ST(0)); } else { CALLER_CONTEXT; ST(0) = &PL_sv_undef;
CC: bug-threads-shared [...] rt.cpan.org
Subject: [PATCH - revised] [rt.cpan.org #37149] shared hashes fail to preserve utf-8ness of keys
Date: Mon, 30 Jun 2008 15:07:20 -0400
To: pp <perl5-porters [...] perl.org>
From: "Jerry D. Hedden" <jdhedden [...] cpan.org>
Jerry D. Hedden wrote: Show quoted text
> The attached patch fixes the subject bug. It also adds a > new file to the test suite. > > In addition, t/wait.t and t/waithires.t have been modified > to use the new watchdog() function that was added to > test.pl.
JGMYERS wrote: Show quoted text
> Testing the patched threads::shared as a standalone module > compiling with perl 5.8.8, I get failures running the test > harness: > t/av_refs........Can't load > '/u/jgmyers/src/threads-shared-1.23/blib/arch/auto/threads/shared/shared.so' > for module threads::shared: > /u/jgmyers/src/threads-shared-1.23/blib/arch/auto/threads/shared/shared.so: > undefined symbol: DPPP_my_newSVpvn_flags at > /u/jgmyers/perl/lib/5.8.8/i686-linux-thread-multi/DynaLoader.pm line 230. > at t/av_refs.t line 38
Sorry. Forgot to run ppport. Revised patch adds: #define NEED_newSVpvn_flags

Message body is not shown because sender requested not to inline it.

Subject: Re: [rt.cpan.org #37149] shared hashes fail to preserve utf-8ness of keys
Date: Mon, 30 Jun 2008 15:09:40 -0400
To: bug-threads-shared [...] rt.cpan.org
From: "Jerry D. Hedden" <jdhedden [...] cpan.org>
JGMYERS wrote: Show quoted text
> Queue: threads-shared > Ticket <URL: http://rt.cpan.org/Ticket/Display.html?id=37149 > > > Testing the patched threads::shared as a standalone module compiling > with perl 5.8.8, I get failures running the test harness: > > t/wait...........Can't locate ./t/test.pl in @INC (@INC contains: > /u/jgmyers/src/threads-shared-1.23/blib/lib > /u/jgmyers/src/threads-shared-1.23/blib/arch > /u/jgmyers/perl/lib/5.8.8/i686-linux-thread-multi > /u/jgmyers/perl/lib/5.8.8 > /u/jgmyers/perl/lib/site_perl/5.8.8/i686-linux-thread-multi > /u/jgmyers/perl/lib/site_perl/5.8.8 /u/jgmyers/perl/lib/site_perl .) at > t/wait.t line 10. > BEGIN failed--compilation aborted at t/wait.t line 16. > t/wait...........dubious
You need an updated version of test.pl in the t/ dir to test with the blead patch. See attached.

Message body is not shown because sender requested not to inline it.

With those extra bits, I'm now able to pass the tests. I still think that copying the utf8 flag into the shared hash is a better approach than always upgrading the key, as the added side effect of upgrading strings can cause astonishing behavior. Unfortunately, my proposed patch fails the tests for enumerating the code ref key and I am unable to puzzle out what the issue is. Any clues as to the source of the code ref problem would be appreciated.
Ok, found my problem. Darn type-unsafe accessor macros. Your utf8.t test case doesn't report failures with "each" very well. Attached find my updated alternate proposal for shared.xs which copies the utf8 flag into the shared hash instead of always upgrading the key.
--- threads-shared-1.23/shared.xs 2008-05-13 14:35:45.000000000 -0700 +++ threads-shared-1.23-copyflag/shared.xs 2008-07-02 17:23:05.000000000 -0700 @@ -123,6 +123,7 @@ # define NEED_sv_2pv_flags # define NEED_vnewSVpvf # define NEED_warner +# define NEED_newSVpvn_flags # include "ppport.h" # include "shared.h" #endif @@ -876,6 +877,8 @@ assert ( mg->mg_ptr != 0 ); if (mg->mg_len == HEf_SVKEY) { key = SvPV((SV *) mg->mg_ptr, len); + if (SvUTF8((SV *) mg->mg_ptr)) + len = -len; } SHARED_CONTEXT; svp = hv_fetch((HV*) saggregate, key, len, 0); @@ -925,8 +928,11 @@ char *key = mg->mg_ptr; STRLEN len = mg->mg_len; assert ( mg->mg_ptr != 0 ); - if (mg->mg_len == HEf_SVKEY) + if (mg->mg_len == HEf_SVKEY) { key = SvPV((SV *) mg->mg_ptr, len); + if (SvUTF8((SV *) mg->mg_ptr)) + len = -len; + } SHARED_CONTEXT; svp = hv_fetch((HV*) saggregate, key, len, 1); } @@ -956,8 +962,11 @@ char *key = mg->mg_ptr; STRLEN len = mg->mg_len; assert ( mg->mg_ptr != 0 ); - if (mg->mg_len == HEf_SVKEY) + if (mg->mg_len == HEf_SVKEY) { key = SvPV((SV *) mg->mg_ptr, len); + if (SvUTF8((SV *) mg->mg_ptr)) + len = -len; + } SHARED_CONTEXT; hv_delete((HV*) saggregate, key, len, G_DISCARD); } @@ -1276,6 +1285,8 @@ } else { STRLEN len; char *key = SvPV(index,len); + if (SvUTF8(index)) + len = -len; SHARED_EDIT; exists = hv_exists((HV*) sobj, key, len); } @@ -1297,9 +1308,10 @@ hv_iterinit((HV*) sobj); entry = hv_iternext((HV*) sobj); if (entry) { + I32 utf8 = HeKUTF8(entry); key = hv_iterkey(entry,&len); CALLER_CONTEXT; - ST(0) = sv_2mortal(newSVpv(key, len)); + ST(0) = sv_2mortal(newSVpvn_utf8(key, len, utf8)); } else { CALLER_CONTEXT; ST(0) = &PL_sv_undef; @@ -1323,9 +1335,10 @@ SHARED_CONTEXT; entry = hv_iternext((HV*) sobj); if (entry) { + I32 utf8 = HeKUTF8(entry); key = hv_iterkey(entry,&len); CALLER_CONTEXT; - ST(0) = sv_2mortal(newSVpv(key, len)); + ST(0) = sv_2mortal(newSVpvn_utf8(key, len, utf8)); } else { CALLER_CONTEXT; ST(0) = &PL_sv_undef;
Subject: Re: [rt.cpan.org #37149] shared hashes fail to preserve utf-8ness of keys
Date: Thu, 3 Jul 2008 10:26:47 -0400
To: bug-threads-shared [...] rt.cpan.org
From: "Jerry D. Hedden" <jdhedden [...] cpan.org>
On Wed, Jul 2, 2008 at 8:26 PM, JGMYERS via RT <bug-threads-shared@rt.cpan.org> wrote: Show quoted text
> Queue: threads-shared > Ticket <URL: http://rt.cpan.org/Ticket/Display.html?id=37149 > > > Ok, found my problem. Darn type-unsafe accessor macros. Your utf8.t > test case doesn't report failures with "each" very well. > > Attached find my updated alternate proposal for shared.xs which copies > the utf8 flag into the shared hash instead of always upgrading the key.
Thanks. Patch applied and I made some upgrades to t/utf8.t. Update sent to blead perl. When that's applied, I'll release to CPAN.
Subject: Re: [rt.cpan.org #37149] shared hashes fail to preserve utf-8ness of keys
Date: Tue, 8 Jul 2008 08:34:03 -0400
To: bug-threads-shared [...] rt.cpan.org
From: "Jerry D. Hedden" <jdhedden [...] cpan.org>
Show quoted text
> Attached find my updated alternate proposal for shared.xs which copies > the utf8 flag into the shared hash instead of always upgrading the key.
Patch applied to blead. Smoke tests are reporting the following: Compiler messages(MSWin32): shared.xs(881) : warning C4146: unary minus operator applied to unsigned type, result still unsigned shared.xs(935) : warning C4146: unary minus operator applied to unsigned type, result still unsigned shared.xs(970) : warning C4146: unary minus operator applied to unsigned type, result still unsigned shared.xs(1292) : warning C4146: unary minus operator applied to unsigned type, result still unsigned These come from the following constructs: if (SvUTF8((SV *)mg->mg_ptr)) { len = -len; } Is this an issue? If it's completely harmless, is there a construct that will suppress the warnings?
Subject: Re: [rt.cpan.org #37149] shared hashes fail to preserve utf-8ness of keys
Date: Tue, 8 Jul 2008 08:34:03 -0400
To: bug-threads-shared [...] rt.cpan.org
From: "Jerry D. Hedden" <jdhedden [...] cpan.org>
Show quoted text
> Attached find my updated alternate proposal for shared.xs which copies > the utf8 flag into the shared hash instead of always upgrading the key.
Patch applied to blead. Smoke tests are reporting the following: Compiler messages(MSWin32): shared.xs(881) : warning C4146: unary minus operator applied to unsigned type, result still unsigned shared.xs(935) : warning C4146: unary minus operator applied to unsigned type, result still unsigned shared.xs(970) : warning C4146: unary minus operator applied to unsigned type, result still unsigned shared.xs(1292) : warning C4146: unary minus operator applied to unsigned type, result still unsigned These come from the following constructs: if (SvUTF8((SV *)mg->mg_ptr)) { len = -len; } Is this an issue? If it's completely harmless, is there a construct that will suppress the warnings?
The best way to suppress the warning would probably be to change the type of len to be I32.