
This queue is for tickets about the Crypt-CBC CPAN distribution.

Report information
The Basics
Id: 35239
Status: resolved
Worked: 1 min
Priority: 0/
Queue: Crypt-CBC

People
Owner: LDS [...] cpan.org
Requestors: ANDK [...] cpan.org
Cc: makamaka [...] cpan.org
NEELY [...] cpan.org
AdminCc:

Bug Information
Severity: Normal
Broken in: 2.28
Fixed in: (no value)



CC: NEELY [...] cpan.org, MAKAMAKA [...] cpan.org
Subject: Does not downgrade when UTF-8 flag is set
This bug was discovered during smoke testing of Data::Serializer.

I'm trying to CC Neil so he finally gets an explanation for all the failing tests I sent to cpan testers.

The bug is that Crypt::CBC does not deal well with data that have the UTF8 bit on. Demonstrated in this test case:

    perl -le '
    use Crypt::CBC;print $Crypt::CBC::VERSION;
    my $secret = "test";
    my $cipher = "Blowfish";
    my $digest = qq{deadbeef};
    $digest .= chr(256); chop $digest;
    my $cipher_obj = Crypt::CBC->new($secret,$cipher);
    print length $cipher_obj->encrypt($digest)==32 ? "ok\n" : "not ok\n";
    '
    2.28
    input must be 8 bytes long at /home/src/perl/repoperls/installed-perls/perl/pSMD0sR/perl-5.9.1@23966/lib/site_perl/5.9.2/i686-linux-64int/Crypt/Blowfish.pm line 56.

This fails on all perls since bleadperl@23966. That patch made unpack encoding-neutral, which means that a simple ASCII string that comes in with the UTF-8 bit set keeps that bit through unpack. Older perls lost it during the unpack. It then needed the new JSON 2.0, which correctly serializes with the UTF-8 bit set. And it took another few months to discover this problem, because Data::Serializer most of the time skips tests when JSON is not installed.

I'm pretty sure you can blame perl that it handles

    $iv ^ $asciistring

differently when the UTF8 bit is set on $asciistring. But I think it is too underdocumented to draw a firm conclusion.

Thanks,
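The `$digest .= chr(256); chop $digest;` step in the test case is what turns the UTF8 bit on without changing the string's content. A minimal sketch of that effect, using only the core `utf8::is_utf8` function (the variable names `$plain` and `$flagged` are illustrative and do not come from the report):

```perl
use strict;
use warnings;

# Two strings with identical content; only the internal UTF8 flag differs.
my $plain   = "deadbeef";
my $flagged = "deadbeef";

# Appending a character above 255 forces Perl to upgrade the string to its
# internal UTF-8 representation; chop removes that character again, but the
# flag stays on.
$flagged .= chr(256);
chop $flagged;

printf "plain:   length=%d utf8_flag=%d\n",
    length($plain), utf8::is_utf8($plain) ? 1 : 0;
printf "flagged: length=%d utf8_flag=%d\n",
    length($flagged), utf8::is_utf8($flagged) ? 1 : 0;
print "equal as strings: ", ($plain eq $flagged ? "yes" : "no"), "\n";
```

Both strings compare equal and have length 8, so code that treats its input as raw bytes only notices the difference when an operation is sensitive to the internal representation.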
Is there an infallible way to treat a Perl string as binary data so that bit-oriented operations work as expected regardless of the UTF8 bit? Otherwise I have no idea how to work around this.

Lincoln
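One possible answer to the question of treating a string as binary data is to downgrade a copy of the input before any bit-oriented work. The sketch below is not the fix that went into Crypt::CBC; `as_bytes` is a hypothetical helper name, and falling back to `utf8::encode` for characters above 255 is an assumption (a caller might prefer to die instead).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Crypt::CBC): return a copy of $data that
# is guaranteed to have the UTF8 flag off.
sub as_bytes {
    my ($data) = @_;
    my $copy = $data;

    # With a true second argument, utf8::downgrade returns false instead of
    # dying when the string holds characters above 255.
    unless (utf8::downgrade($copy, 1)) {
        # Assumption: represent wide characters as their UTF-8 octets.
        utf8::encode($copy);
    }
    return $copy;
}

# Same trick as in the report: ASCII content with the UTF8 flag switched on.
my $flagged = "deadbeef";
$flagged .= chr(256);
chop $flagged;

my $bytes = as_bytes($flagged);
printf "before: utf8_flag=%d\n", utf8::is_utf8($flagged) ? 1 : 0;
printf "after:  utf8_flag=%d length=%d\n",
    utf8::is_utf8($bytes) ? 1 : 0, length($bytes);
```

Working on a copy leaves the caller's string untouched, which matters when the same scalar is reused as character data elsewhere.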
From: neil [...] neely.cx
First: Thanks for the Cc - I've seen this choking on utf-8 but didn't see the why on it.

Unfortunately I don't run perl > 5.8.8 and can't reproduce this bug and provide a direct patch, but I believe the bytes pragma may be helpful here - from man bytes:

    As an example, when Perl sees "$x = chr(400)", it encodes the character in UTF-8 and stores it in $x. Then it is marked as character data, so, for instance, "length $x" returns 1. However, in the scope of the "bytes" pragma, $x is treated as a series of bytes - the bytes that make up the UTF8 encoding - and "length $x" returns 2:

Not sure what beyond that is needed, but that seemed like a useful starting point.
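The man page excerpt above can be tried directly. A minimal sketch of the `chr(400)` example it describes (the variable names `$char_length` and `$byte_length` are illustrative):

```perl
use strict;
use warnings;

my $x = chr(400);    # one character, stored internally as two UTF-8 bytes

my $char_length = length $x;    # character semantics

my $byte_length;
{
    use bytes;                  # lexically scoped: only affects this block
    $byte_length = length $x;   # byte semantics
}

print "characters: $char_length\n";    # prints "characters: 1"
print "bytes:      $byte_length\n";    # prints "bytes:      2"
```

Note that current perldoc for the bytes pragma strongly discourages its use for anything other than debugging, so this is best read as an illustration of the mechanism rather than a recommendation.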
CC: ANDK [...] cpan.org, NEELY [...] cpan.org, makamaka [...] cpan.org
Subject: Re: [rt.cpan.org #35239] Does not downgrade when UTF-8 flag is set
Date: Tue, 22 Apr 2008 08:05:12 +0200
To: bug-Crypt-CBC [...] rt.cpan.org
From: andreas.koenig.7os6VVqR [...] franz.ak.mind.de (Andreas J. Koenig)
Please note that I also filed a perlbug on the behaviour of ($iv^$string) when $string has the UTF8 flag:

http://rt.perl.org/rt3/Ticket/Display.html?id=53110

--
andreas
Adding "use bytes" to the CBC.pm file seems to fix the problem. I am committing a new release with this fix.
CC: NEELY [...] cpan.org
Subject: Re: [rt.cpan.org #35239] Does not downgrade when UTF-8 flag is set
Date: Tue, 22 Apr 2008 23:29:06 +0900
To: bug-Crypt-CBC [...] rt.cpan.org, ANDK [...] cpan.org
From: makamaka [...] donzoko.net
(I am not sure whom I should be asking, and this may not be related to Crypt::CBC.)

Does the decode method of JSON 2.x in Data::Serializer::JSON return a string with the UTF8 flag set? If so, I recommend using JSON 2.x with the utf8 option.

Regards,
---- Makamaka Hannyaharamitu makamaka@cpan.org
From: makamaka [...] cpan.org
> Does the decode method of JSON 2.x in Data::Serializer::JSON return
> a string with UTF8 flag? If so, I recommend using JSON 2.x with utf8
> option.
Sorry, I meant the encode method, not decode. I confirmed that a modified Data::Serializer::JSON works correctly, and I will report it to the Data::Serializer author.

Thanks,
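The effect of the utf8 option on encode can be sketched as follows. This uses JSON::PP, which ships with recent perls, as a stand-in for JSON 2.x; that substitution is an assumption, as is the sample data. It is not the actual change made to Data::Serializer::JSON.

```perl
use strict;
use warnings;
use JSON::PP;    # assumption: stands in for JSON 2.x, which has the same utf8 option

my $data = { digest => "deadbeef" };

# Without the utf8 option, encode returns a character string.
my $text = JSON::PP->new->encode($data);

# With the utf8 option, encode returns UTF-8 encoded octets, which is the
# form a byte-oriented consumer such as a cipher expects.
my $octets = JSON::PP->new->utf8->encode($data);

printf "without utf8: utf8_flag=%d\n", utf8::is_utf8($text)   ? 1 : 0;
printf "with utf8:    utf8_flag=%d\n", utf8::is_utf8($octets) ? 1 : 0;
```

For pure ASCII data the two results hold the same characters; only the internal flag differs, which is the condition that triggered the failure in this ticket.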
Can this bug be moved to the Data::Serializer queue? I think that the Crypt::CBC behavior is now resolved.