Skip Menu |

This queue is for tickets about the DBD-mysql CPAN distribution.

Report information
The Basics
Id: 24738
Status: resolved
Priority: 0/
Queue: DBD-mysql

People
Owner: CAPTTOFU [...] cpan.org
Requestors: sven-bitcard [...] sven.de
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 4.001
Fixed in: (no value)



Subject: UTF8-Flag not set with flag mysql_enable_utf8 and column collation utf8_bin
Description: When retrieving data from a column with collation utf8_bin, the utf8-flag does not get set. It works correctly with the collations utf8_general_ci and utf8_unicode_ci. How to repeat: using this perl script: use DBI; my $dbh = DBI->connect($dsn, $user, $pass, { mysql_enable_utf8 => 1 }); $dbh->do("SET NAMES utf8"); $dbh->do("SET CHARACTER SET utf8"); my $sth = $dbh->prepare(q{ SELECT name /* column with collation utf8_bin */ FROM author }); $sth->execute(); my $r = $sth->fetch; print "not utf8" if ! utf8::is_utf8( $r->[0] ); $sth->finish; $dbh->disconnect; Suggested fix: alter behaviour so utf8-flag also gets set for Columns with utf8_bin collation. Ugly workaround: Pick one of the working collations during select: SELECT name COLLATE utf8_general_ci AS name FROM author
From: sven-bitcard [...] sven.de
This bug is also tracked at http://bugs.mysql.com/bug.php?id=26010
Subject: PATCH: UTF8-Flag not set with flag mysql_enable_utf8 and column collation utf8_bin
I can confirm this bug, which is still present in DBD-mysql v4.003 The problem is that the DBD only does a sv_utf8_decode if the result field has the BINARY_FLAG set. However, this appears to not be the correct way to detect binary fields (and apparently hasn't been so since mysql 4.1.0) From: http://www.mysql.org/doc/refman/5.0/en/c-api-datatypes.html "To distinguish between binary and non-binary data for string data types, check whether the charsetnr value is 63. If so, the character set is binary, which indicates binary rather than non-binary data. This is how to distinguish between BINARY and CHAR, VARBINARY and VARCHAR, and BLOB and TEXT." The attached patch follows this method for mysql versions 4.1.0 and up. Cheers, Joost Diepenmaat Zeekat Softwareontwikkeling.
diff -ru DBD-mysql-4.003/dbdimp.c DBD-mysql-4.003-patched/dbdimp.c --- DBD-mysql-4.003/dbdimp.c 2007-03-01 16:47:26.000000000 +0100 +++ DBD-mysql-4.003-patched/dbdimp.c 2007-03-12 15:31:01.000000000 +0100 @@ -3647,7 +3647,12 @@ sv_setpvn(sv, col, len); /* UTF8 */ #if defined(sv_utf8_decode) && MYSQL_VERSION_ID >=SERVER_PREPARE_VERSION + +#if MYSQL_VERSION_ID >= FIELD_CHARSETNR_VERSION + if (imp_dbh->enable_utf8 && fields[i].charsetnr != 63) /* see bottom of: http://www.mysql.org/doc/refman/5.0/en/c-api-datatypes.html */ +#else if (imp_dbh->enable_utf8 && !(fields[i].flags & BINARY_FLAG)) +#endif sv_utf8_decode(sv); #endif /* END OF UTF8 */ diff -ru DBD-mysql-4.003/dbdimp.h DBD-mysql-4.003-patched/dbdimp.h --- DBD-mysql-4.003/dbdimp.h 2007-03-01 16:47:26.000000000 +0100 +++ DBD-mysql-4.003-patched/dbdimp.h 2007-03-12 15:26:26.000000000 +0100 @@ -36,6 +36,7 @@ #define NEW_DATATYPE_VERSION 50003 #define SSL_VERIFY_VERSION 50023 #define MYSQL_VERSION_5_0 50001 +#define FIELD_CHARSETNR_VERSION 40101 /* should equivalent to 4.1.0 */ /* This is to avoid the ugly #ifdef mess in dbdimp.c */ #if MYSQL_VERSION_ID < SQL_STATE_VERSION #define mysql_sqlstate(svsock) (NULL) Only in DBD-mysql-4.003-patched: .#mysql.xs Only in DBD-mysql-4.003-patched: #mysql.xs# Only in DBD-mysql-4.003-patched/t: mysql.mtest diff -ru DBD-mysql-4.003/t/utf8.t DBD-mysql-4.003-patched/t/utf8.t --- DBD-mysql-4.003/t/utf8.t 2007-01-06 20:07:23.000000000 +0100 +++ DBD-mysql-4.003-patched/t/utf8.t 2007-03-12 14:20:11.000000000 +0100 @@ -102,7 +102,7 @@ # Create a new table; In an ideal world, it'd be more sensible to # make the whole database UTF8... # - $query = "CREATE TABLE $table (name VARCHAR(64) CHARACTER SET utf8, bincol BLOB, shape GEOMETRY)"; + $query = "CREATE TABLE $table (name VARCHAR(64) CHARACTER SET utf8, bincol BLOB, shape GEOMETRY, binutf VARCHAR(64) CHARACTER SET utf8 COLLATE utf8_bin)"; Test($state or $dbh->do($query)) or ErrMsgF("Cannot create table: Error %s.\n", $dbh->errstr); @@ -126,11 +126,11 @@ Test( $state or ( $dbh->{ mysql_enable_utf8 } ) ) or ErrMsg( "mysql_enable_utf8 didn't survive connect()\n" ); - $query = qq{INSERT INTO $table (name, bincol, shape) VALUES (?,?, GeomFromText('Point(132865 501937)'))}; - Test( $state or $dbh->do( $query, {}, $utf8_str,$blob ) ) + $query = qq{INSERT INTO $table (name, bincol, shape, binutf) VALUES (?,?, GeomFromText('Point(132865 501937)'), ?)}; + Test( $state or $dbh->do( $query, {}, $utf8_str,$blob, $utf8_str ) ) or ErrMsgF( "INSERT failed: query $query, error %s.\n", $dbh->errstr ); - $query = "SELECT name,bincol,asbinary(shape) FROM $table LIMIT 1"; + $query = "SELECT name,bincol,asbinary(shape), binutf FROM $table LIMIT 1"; Test( $state or ($sth = $dbh->prepare( $query ) ) ) or ErrMsgF( "prepare failed: query $query, error %s.\n", $dbh->errstr ); @@ -143,7 +143,13 @@ # Finally, check that we got back UTF-8 correctly. Test( $state or ($ref->[0] eq $utf8_str) ) - or ErrMsgF( "got back '$ref->[0]' instead of '$utf8_str'.\n" ); + or ErrMsgF( "got back '$ref->[0]' instead of '$utf8_str' (normal utf8 collation).\n" ); + + # same as above for utf8_bin column + Test( $state or ($ref->[3] eq $utf8_str) ) + or ErrMsgF( "got back '$ref->[3]' instead of '$utf8_str' (binary utf8 collation).\n" ); + + if (eval "use Encode;") { # Check for utf8 flag
CC: undisclosed-recipients:;
Subject: Re: [rt.cpan.org #24738] PATCH: UTF8-Flag not set with flag mysql_enable_utf8 and column collation utf8_bin
Date: Mon, 12 Mar 2007 15:20:44 -0400
To: bug-DBD-mysql [...] rt.cpan.org
From: Patrick Galbraith <patg [...] patg.net>
Joost Diepenmaat via RT wrote: Show quoted text
> Queue: DBD-mysql > Ticket <URL: http://rt.cpan.org/Ticket/Display.html?id=24738 > > >I can confirm this bug, which is still present in DBD-mysql v4.003 > >The problem is that the DBD only does a sv_utf8_decode if the result >field has the BINARY_FLAG set. However, this appears to not be the >correct way to detect binary fields (and apparently hasn't been so since >mysql 4.1.0) > >From: http://www.mysql.org/doc/refman/5.0/en/c-api-datatypes.html > >"To distinguish between binary and non-binary data for string data >types, check whether the charsetnr value is 63. If so, the character >set is binary, which indicates binary rather than non-binary data. This >is how to distinguish between BINARY and CHAR, VARBINARY and VARCHAR, >and BLOB and TEXT." > >The attached patch follows this method for mysql versions 4.1.0 and up. > >Cheers, >Joost Diepenmaat >Zeekat Softwareontwikkeling. > > > >
Joost, Thank you for this patch, which I will review and add to the source upon successful testing. Kind regards, Patrick -- Patrick Galbraith, Senior Programmer Grazr - Easy feed grazing and sharing http://www.grazr.com Satyam Eva Jayate - Truth Alone Triumphs Mundaka Upanishad
All tests pass, added to SVN, will be in impending 4.004 release.