
This queue is for tickets about the Encode CPAN distribution.

Report information
The Basics
Id: 65541
Status: resolved
Priority: 0/
Queue: Encode

People
Owner: Nobody in particular
Requestors: xliosha [...] gmail.com
Cc: IKEGAMI [...] cpan.org
pali [...] cpan.org
AdminCc:

Bug Information
Severity: (no value)
Broken in: 2.42
Fixed in: (no value)



CC: IKEGAMI [...] cpan.org
Subject: panic: sv_setpvn called with negative strlen
==== Original request submitted to perlbug follows ====

Hi!

I try to output unicode stream via ':encoding(cp1250)' layer.
Some symbols doesn't map to this encoding, so i get warnings:

"\x{0456}" does not map to cp1250 at C:\buf\osm\osm2mp.pl line 2637, <$_[...]> line 2460557.
"\x{043d}" does not map to cp1250 at C:\buf\osm\osm2mp.pl line 2637, <$_[...]> line 2460557.
"\x{043d}" does not map to cp1250 at C:\buf\osm\osm2mp.pl line 2637, <$_[...]> line 2460557.
"\x{0438}" does not map to cp1250 at C:\buf\osm\osm2mp.pl line 2637, <$_[...]> line 2460557.
"\x{0446}" does not map to cp1250 at C:\buf\osm\osm2mp.pl line 2637, <$_[...]> line 2460557.

and so on.

And _sometimes_ after few such warnings perl crashes with message:

panic: sv_setpvn called with negative strlen at C:\buf\osm\osm2mp.pl line 2375, <$_[...]> line 4001961.

==== end ====

The problem appears to be in Encode. (Tested with Encode 2.42.)

#0 Perl_sv_setpvn (sv=0x83b6898, ptr=0x83ab901 "\002\002", len=4294967295) at sv.c:4454
#1 0xb7755793 in encode_method (enc=0xb727089c, dir=0xb7272c60, src=0x83b6898, check=<value optimized out>, offset=0x0, term=0x0, retcode=0x0, fallback_cb=0x831c628) at Encode.xs:266
#2 0xb77562cf in XS_Encode__XS_encode (cv=0x834c408) at Encode.xs:657
#3 0x08138203 in Perl_pp_entersub () at pp_hot.c:2931
#4 0x080fe127 in Perl_runops_debug () at dump.c:2267
#5 0x08083288 in Perl_call_sv (sv=0x83b68a8, flags=130) at perl.c:2614
#6 0xb728492b in PerlIOEncode_flush (f=0x832bdd8) at encoding.xs:424
#7 0x082492e3 in PerlIOBuf_write (f=0x832bdd8, vbuf=0x834fc48, count=1) at perlio.c:4157
#8 0xb72861f3 in PerlIOEncode_write (f=0x832bdd8, vbuf=0x834fc48, count=2) at encoding.xs:593
#9 0x08211839 in Perl_do_print (sv=0x83b6838, fp=0x832bdd8) at doio.c:1257
#10 0x08145979 in Perl_pp_print () at pp_hot.c:773
#11 0x080fe127 in Perl_runops_debug () at dump.c:2267
#12 0x08084b7b in perl_run (my_perl=0x831d008) at perl.c:2332
#13 0x08062f25 in main (argc=3, argv=0xbfffe034, env=0xbfffe044) at
perlmain.c:120 if (check && !(check & ENCODE_LEAVE_SRC)){ sdone = SvCUR(src) - (slen+sdone); if (sdone) { sv_setpvn(src, (char*)s+slen, sdone); <---- } SvCUR_set(src, sdone); } Test case: binmode STDOUT, ':encoding(cp1250)'; print map chr, 1146, 627, 46, 891, 583, 542, 507, 1169, 1162, 663, 577, 518, 223, 526, 1016, 885, 1135, 1077, 16, 774, 802, 623, 1164, 235, 1136, 1027, 1, 502, 1222, 132, 1127, 738, 747, 115, 315, 23, 643, 455, 815, 1026, 140, 725, 405, 12, 208, 511, 680, 906, 816, 392, 103, 71, 1039, 926, 1163, 953, 38, 1175, 335, 1032, 950, 865, 992, 59, 575, 1263, 227, 216, 1265, 1036, 1189, 365, 667, 403, 1157, 548, 150, 415, 7, 1142, 621, 630, 668, 691, 435, 176, 1152, 396, 1015, 236, 1202, 296, 997, 1115, 1206, 910, 997, 621, 8, 173, 455, 481, 7, 342, 448, 744, 417, 46, 19, 280, 608, 466, 169, 1271, 195, 574, 1246, 1213, 777, 473, 169, 806, 382, 232, 304, 1088, 473, 612, 1011, 1248, 986, 284, 1149, 427, 353, 1110, 287, 957, 229, 378, 793, 48, 114, 1173, 767, 673, 769, 869, 368, 348, 663, 665, 1007, 1180, 871, 561, 1267, 501, 255, 734, 1194, 117, 317, 69, 525, 378, 391, 753, 128, 672, 772, 675, 250, 389, 153, 1245, 1141, 419, 1214, 581, 109, 371, 1000, 1241, 1106, 552, 163, 262, 511, 141, 240, 501, 705, 612, 1256, 432, 4, 28, 959, 381, 196, 567, 134, 722, 4, 40, 360, 603, 359, 518, 979, 189, 316, 1054, 1035, 161, 850, 343, 43, 487, 210, 275, 643, 707, 514, 826, 1213, 1123, 773, 1130, 322, 679, 203, 721, 837, 997, 140, 563, 803, 255, 890, 163, 48, 786, 637, 1048, 110, 942, 309, 1015, 398, 603, 903, 387, 449, 814, 700, 544, 477, 436, 794, 631, 1014, 774, 1104, 1164, 703, 1278, 1267, 1216, 678, 88, 932, 861, 629, 669, 772, 314, 880, 128, 263, 130, 739, 799, 790, 871, 1200, 151, 131, 677, 237, 363, 377, 1276, 1275, 69, 1067, 165, 710, 1011, 560, 1239, 316, 1061, 970, 1043, 1035, 241, 634, 1157, 5, 1091, 332, 1252, 1106, 381, 837, 942, 328, 1268, 452, 892, 796, 1183, 282, 666, 1151, 1123, 402, 1109, 1023, 804, 344, 1214, 722, 928, 870, 721, 308, 536, 1048, 820, 
217, 1028, 1252, 1054, 438, 66, 999, 1056, 275, 742, 931, 1213, 608, 224, 697, 358, 855, 132, 705, 477, 1222, 570, 424, 324, 28, 759, 963, 193, 150, 1098, 513, 607, 901, 449, 411, 75, 725, 1247, 982, 274, 752, 63, 179, 545, 617, 544, 436, 1086, 1001, 224, 149, 1054, 225, 66, 402, 364, 288, 1156, 76, 1105, 950, 421, 203, 172, 1091, 1230, 498, 632, 954, 296, 1067, 690, 391, 126, 251, 445, 466, 740, 843, 116, 216, 827, 924, 1113, 406, 1211, 1094, 522, 940, 304, 100, 286, 249, 888, 1175, 652, 184, 267, 1168, 231, 668, 323, 1087, 404, 736, 450, 969, 693, 4, 1082, 959, 321, 1017, 892, 16, 1162, 1166, 1271, 578, 209, 48, 913, 1116, 25, 661, 901, 854, 643, 827, 1142, 1261, 289, 998, 45, 743, 1245, 421, 1204, 472, 117, 345, 1013, 1239, 895, 278, 1235, 1097, 730, 539, 628, 863, 327, 137, 1083, 490, 871, 1021, 468, 938, 1022, 553, 903, 677, 109, 1239, 115, 627, 1188, 656, 986, 79, 730, 1270, 168, 1089, 1086, 759, 247, 794, 1210, 340, 138, 226, 1069, 46, 454, 447, 643, 840, 382, 493, 58, 968, 1263, 6, 1058, 567, 647, 747, 252, 888 ;
Simpler test case:
----- BEGIN TEST CODE -----
binmode STDOUT, ':encoding(cp1250)';
print( ( "a" x 1023 ) . "\x{0378}" );
----- END TEST CODE -----

----- BEGIN TEST OUTPUT -----
"\x{0340}" does not map to cp1250 at a.pl line 2.
panic: sv_setpvn called with negative strlen at a.pl line 2.
----- END TEST OUTPUT -----

Note that the problem occurs at the 1K mark.

Note that the warning identifies "\x{0340}" even though there is no such character in the string.
On Mon Feb 07 15:47:45 2011, ikegami wrote:
> Simpler test case:
>
> ----- BEGIN TEST CODE -----
> binmode STDOUT, ':encoding(cp1250)';
> print( ( "a" x 1023 ) . "\x{0378}" );
> ----- END TEST CODE -----
>
> ----- BEGIN TEST OUTPUT -----
> "\x{0340}" does not map to cp1250 at a.pl line 2.
> panic: sv_setpvn called with negative strlen at a.pl line 2.
> ----- END TEST OUTPUT -----
>
> Note that the problem occurs at the 1K mark.
>
> Note that the warning identifies "\x{0340}" even though there is no such
> character in the string.
That test case still fails for me up to 5.14.2 but interestingly it works in 5.16.0:

$ perlbrew use perl-5.14.2
martin@betdevel:~/bet/tools/modules$ perl -e 'binmode STDOUT, ":encoding(cp1250)";print( ( "a" x 1023 ) . "\x{0378}" );'
"\x{0340}" does not map to cp1250 at -e line 1.
panic: sv_setpvn called with negative strlen at -e line 1.

$ perlbrew use perl-5.16.0
martin@betdevel:~/bet/tools/modules$ perl -e 'binmode STDOUT, ":encoding(cp1250)";print( ( "a" x 1023 ) . "\x{0378}" );'
"\x{fffd}" does not map to cp1250 at -e line 1.
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaa\x{fffd}"\x{fffd}" does not map to cp1250.

My only interest was I used to get the same error under Log::Log4perl and found this rt.

Martin
--
Martin J. Evans
Wetherby, UK
From: ntyni [...] iki.fi
On Fri Sep 21 06:43:45 2012, MJEVANS wrote:
> On Mon Feb 07 15:47:45 2011, ikegami wrote:
> > Simpler test case:
> >
> > ----- BEGIN TEST CODE -----
> > binmode STDOUT, ':encoding(cp1250)';
> > print( ( "a" x 1023 ) . "\x{0378}" );
> > ----- END TEST CODE -----
> >
> > ----- BEGIN TEST OUTPUT -----
> > "\x{0340}" does not map to cp1250 at a.pl line 2.
> > panic: sv_setpvn called with negative strlen at a.pl line 2.
> > ----- END TEST OUTPUT -----
> >
> > Note that the problem occurs at the 1K mark.
> >
> > Note that the warning identifies "\x{0340}" even though there is no such
> > character in the string.
>
> That test case still fails for me up to 5.14.2 but interestingly it
> works in 5.16.0:
We're still seeing this with 5.22 in Debian. The crashes are sporadic, I have no way to reliably reproduce them. I did manage to get some gdb stack traces and valgrind output, though. It looks like it may end up using uninitialized memory.

See https://bugs.debian.org/835989

Combined with https://rt.cpan.org/Public/Bug/Display.html?id=106461 in ExtUtils::MakeMaker this makes the build process of some distributions crash when running Makefile.PL, which is rather unfortunate.

Let me know if I can help with this.

--
Niko Tyni ntyni@debian.org
RT-Send-CC: khw [...] cpan.org
No idea if this is a bug in Encode.xs or in utf8n_to_uvuni(), but utf8n_to_uvuni() sets the returned length higher than the input length, and Encode.xs does not handle this situation...

Here is a quick hotfix for that:

diff --git a/Encode.xs b/Encode.xs
index 8c990ea..7cf9d4b 100644
--- a/Encode.xs
+++ b/Encode.xs
@@ -193,6 +193,8 @@ encode_method(pTHX_ const encode_t * enc, const encpage_t * dir, SV * src,
                 UV ch =
                     utf8n_to_uvuni(s+slen, (SvCUR(src)-slen),
                                    &clen, UTF8_ALLOW_ANY|UTF8_CHECK_ONLY);
+                /* bug in utf8n_to_uvuni when it set clen higher then SvCUR(src)-slen */
+                if (clen > SvCUR(src)-slen) break;
                 /* if non-representable multibyte prefix at end of current buffer - break*/
                 if (clen > tlen - sdone) break;
                 if (check & ENCODE_DIE_ON_ERR) {

Now

perl -e 'binmode STDOUT, ":encoding(cp1250)";print( ( "a" x 1023 ) . "\x{0378}" );'

does not crash anymore.

CCing khw about this problem.
On Thu Sep 29 12:22:23 2016, PALI wrote:
> No idea if this is a bug in Encode.xs or in utf8n_to_uvuni(), but
> utf8n_to_uvuni() sets the returned length higher than the input length,
> and Encode.xs does not handle this situation...
I overhauled utf8n_to_uvuni() for 5.16 which fixed a number of bugs like this. When run under blead, it does set the return length properly. I do not know why Niko would find this in 5.22. Perhaps there is another bug. In the next month I intend to put the latest utf8_to_uvuni into Devel::PPPort, which would solve the problem that Pali found, and likely others.