This queue is for tickets about the DBIx-Class-EncodedColumn CPAN distribution.

Report information
The Basics
Id: 78091
Status: resolved
Priority: 0/
Queue: DBIx-Class-EncodedColumn

People
Owner: Nobody in particular
Requestors: gbjk [...] thermeon.com
Cc: mark [...] repixl.com
AdminCc:

Bug Information
Severity: Critical
Broken in: 0.00011
Fixed in: (no value)



CC: mark [...] repixl.com
Subject: Unicode string causes error
Hi there,

If we use DBIx::Class with enable_utf8 then the strings we use in our RS will be perl internal strings. Bcrypt expects an octet sequence, though, and blows up on trying to encode the perl sequence with "input must contain only octets".

I figure the solution is just to utf8::encode what's flagged as utf8.

Patch below.

Regards
Gareth

--- perl/lib/site_perl/5.14.1/DBIx/Class/EncodedColumn/Crypt/Eksblowfish/Bcrypt.pm 2011-04-11 19:51:04.000000000 +0000
+++ lib/DBIx/Class/EncodedColumn/Crypt/Eksblowfish/Bcrypt.pm 2012-06-28 11:09:51.000000000 +0000
@@ -24,6 +24,11 @@
   my $encoder = sub {
     my ($plain_text, $settings_str) = @_;
+    if (utf8::is_utf8($plain_text)){
+      # Bcrypt expects octets. This dbi is probably going to encode later
+      # so we'll have to do this now
+      utf8::encode($plain_text);
+    }
     unless ( $settings_str ) {
       my $salt = join('', map { chr(int(rand(256))) } 1 .. 16);
       $salt = Crypt::Eksblowfish::Bcrypt::en_base64( $salt );
On Thu Jun 28 07:18:14 2012, gbjk@thermeoneurope.com wrote:
> Hi there,
> [snip]
> I figure the solution is just to utf8::encode what's flagged as utf8.
>
> Patch below.
> [snip]
Hi,

Thanks for the patch. Could you please provide an automated test case?

Cheers,
> Hi,
>
> Thanks for the patch. Could you please provide an automated test case?
>
> Cheers,
I'd just tack it onto the end of t/bcrypt.t:

# Test utf8 characters make it through Bcrypt okay.
use utf8; # Source code *is* utf8
$row->bcrypt_1("官话");
$row->update;

Though you might want to pretty that up because in a failing case, it'll explode on you. Maybe you want to catch explosions from Bcrypt better anyway, though...?

HTH. Sorry I can't do more.
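One way the "explode on you" problem could be tamed is to trap the exception with eval so the test fails with a report instead of dying. The following is only a sketch under stated assumptions: octets_only_hash is a hypothetical stand-in that mimics the "input must contain only octets" failure and is not the real Bcrypt encoder, and encode_then_hash is likewise a made-up name. A real test would go through $row as in t/bcrypt.t.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-in for an octets-only encoder such as bcrypt.
# It dies on any character above 0xFF, like the error in this ticket.
sub octets_only_hash {
    my ($input) = @_;
    die "input must contain only octets\n" if $input =~ /[^\x00-\xff]/;
    return unpack('H*', $input);    # placeholder output, NOT a real hash
}

# Hypothetical wrapper: encode the characters to UTF-8 octets first.
sub encode_then_hash {
    my ($plain_text) = @_;          # copy, so the caller's string is untouched
    utf8::encode($plain_text);
    return octets_only_hash($plain_text);
}

# The two characters from the test above, written as escapes
# (U+5B98 U+8BDD) so this file itself stays ASCII.
my $password = "\x{5b98}\x{8bdd}";

# Without encoding, the failure is trapped rather than fatal.
my $raw       = eval { octets_only_hash($password) };
my $raw_error = $@;
ok(!defined $raw, 'wide characters are rejected by the octets-only encoder');
like($raw_error, qr/only octets/, 'failure is reported, not fatal');

# With encoding first, the call succeeds.
my $encoded       = eval { encode_then_hash($password) };
my $encoded_error = $@;
is($encoded_error, '', 'no exception once the text is encoded');
ok(defined $encoded, 'encoded text makes it through');

done_testing;
```

The eval blocks keep a regression from aborting the whole test file, which makes the failing case readable in the harness output.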
On 2012-06-28 04:18:14, GBJK wrote:
> I figure the solution is just to utf8::encode what's flagged as utf8.
Not quite. "utf8::is_utf8" doesn't do what you think it does. It does *not* tell you whether the characters are ASCII or non-ASCII, nor whether the string has already been encoded into bytes. It merely reports the *internal-only* utf8 flag, which says how perl happens to store the string. (A character in the range 0x80-0xFF may be stored as a single byte, in which case the utf8 flag is off, or as a multi-byte UTF-8 sequence, in which case is_utf8 returns true. Both strings hold the same characters and compare equal.)

It CANNOT be used to determine whether a string should be run through utf8::encode or not -- doing that will result in mojibaked characters.

All you can do is clearly document whether the strings you receive will be run through utf8::encode and ::decode, or not -- that is, whether you expect *characters* or *bytes*.

Generally encoding/decoding is only done on the edges of an application, at the very boundary between physical representation and logical. It is reasonable to do the encoding right before passing to crypt() or encode_base64() (etc.), because that's the interface that requires bytes.
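To illustrate the point, here is a minimal sketch in core Perl. The helper names to_octets and encode_if_flagged are hypothetical and do not come from DBIx::Class::EncodedColumn. encode_if_flagged mirrors the conditional in the proposed patch, and to_octets encodes unconditionally at the boundary.

```perl
use strict;
use warnings;

# Two strings holding the same single character, U+00E9,
# forced into different internal representations.
my $downgraded = "\x{e9}";
utf8::downgrade($downgraded);    # stored as one byte, utf8 flag off
my $upgraded = "\x{e9}";
utf8::upgrade($upgraded);        # stored as UTF-8 internally, utf8 flag on

my $same_chars = $downgraded eq $upgraded      ? 1 : 0;    # 1: equal strings
my $flag_down  = utf8::is_utf8($downgraded)    ? 1 : 0;    # 0
my $flag_up    = utf8::is_utf8($upgraded)      ? 1 : 0;    # 1

# Hypothetical helper mirroring the patch: encode only when flagged.
sub encode_if_flagged {
    my ($text) = @_;             # copy keeps the internal representation
    utf8::encode($text) if utf8::is_utf8($text);
    return $text;
}

# Hypothetical helper: always encode characters to UTF-8 octets.
sub to_octets {
    my ($text) = @_;
    utf8::encode($text);
    return $text;
}

# Flag-conditional encoding yields different octets for equal strings,
# so the same password could hash two different ways.
my $cond_down = encode_if_flagged($downgraded);
my $cond_up   = encode_if_flagged($upgraded);

# Unconditional encoding yields the same octets regardless of the flag.
my $bytes_down = to_octets($downgraded);
my $bytes_up   = to_octets($upgraded);

printf "equal strings: %d, flags: %d vs %d\n", $same_chars, $flag_down, $flag_up;
printf "conditional:   %s vs %s\n", unpack('H*', $cond_down),  unpack('H*', $cond_up);
printf "unconditional: %s vs %s\n", unpack('H*', $bytes_down), unpack('H*', $bytes_up);
```

The design choice here is to treat the encoder's input as characters by contract and encode once, right at the interface that needs bytes, rather than inspecting the flag.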