
This queue is for tickets about the DBD-mysql CPAN distribution.

Report information
The Basics
Id: 120141
Status: resolved
Priority: 0/
Queue: DBD-mysql

People
Owner: Nobody in particular
Requestors: tanabe [...] fa2.so-net.ne.jp
Cc: pali [...] cpan.org
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: 4.041_02



Subject: UTF-8 column names and error messages are octet-streams, not strings
Date: Wed, 8 Feb 2017 12:42:41 +0900
To: bug-DBD-mysql [...] rt.cpan.org
From: Tanabe Yoshinori <tanabe [...] fa2.so-net.ne.jp>
Hello,

Column names and error messages should be treated as strings, but they are octet-streams in DBD-mysql-4.041.

The attached code creates a table with a column whose name contains a non-ASCII character. After issuing a SELECT statement and fetchrow_hashref, it tries to get a value using the column name at (1), but the result is undef. If you use the octet stream for the column name as a key, you get the value, at (2).

Also, when you use Japanese error messages by adding the line
lc_messages=ja_JP
in the [mysqld] section of my.ini, messages are not decoded in DBD::mysql. As a result, messages are unreadable in (3) and (4). We could explicitly decode them as in (5) for a caught message, but this cannot be applied to (3). Of course, it can be avoided by not using automatic encoding for STDERR at (6), but then we need to manually encode all other strings, a nightmare.

Finally, I noticed that when error messages are in Japanese, make test of DBD-mysql fails. It may be difficult to avoid (I do not know), but a warning message (lc_messages should not be changed) in make test would help.

DBD::mysql version: 4.041
Strawberry perl 64bit, v5.22.1
MariaDB
$dbh->{mysql_clientinfo, mysql_clientversion, mysql_serverversion} returns:
5.1.44, 50144, 50505, respectively.
Windows 7 Pro Service Pack 1

Regards,
Tanabe Yoshinori

Message body is not shown because sender requested not to inline it.
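Since the attached script is not shown, here is a minimal sketch in C of the octets-versus-characters distinction the report describes. It is not the reporter's script and does not use the Perl API. The column name "名前", the function `utf8_char_count` and the array `column_octets` are hypothetical, chosen only for illustration, and the validity check is simplified (overlong forms and surrogates are not rejected).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the code points in a UTF-8 byte sequence.
 * Returns -1 if the bytes are not well-formed UTF-8.
 * Simplified: overlong forms and surrogates are not rejected.
 */
long utf8_char_count(const unsigned char *s, size_t len)
{
    assert(s != NULL);
    long count = 0;
    size_t i = 0;

    while (i < len) {
        size_t extra;                              /* continuation bytes expected */

        if (s[i] < 0x80)                extra = 0; /* ASCII */
        else if ((s[i] & 0xE0) == 0xC0) extra = 1; /* 2-byte sequence */
        else if ((s[i] & 0xF0) == 0xE0) extra = 2; /* 3-byte sequence */
        else if ((s[i] & 0xF8) == 0xF0) extra = 3; /* 4-byte sequence */
        else return -1;                            /* invalid lead byte */

        if (extra > len - i - 1)
            return -1;                             /* truncated sequence */
        for (size_t k = 1; k <= extra; k++)
            if ((s[i + k] & 0xC0) != 0x80)
                return -1;                         /* bad continuation byte */

        i += extra + 1;
        count++;
    }
    return count;
}

/* Hypothetical column name "名前" as the six UTF-8 octets a client library hands back. */
const unsigned char column_octets[] = { 0xE5, 0x90, 0x8D, 0xE5, 0x89, 0x8D };
```

Here `sizeof column_octets` is 6, while `utf8_char_count(column_octets, sizeof column_octets)` gives 2. A hash key built from the six undecoded octets is therefore a different string from the two-character name written in the program source, which matches the undef at (1) and the successful lookup at (2).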

On Tue Feb 07 23:07:52 2017, tanabe@fa2.so-net.ne.jp wrote:
> Column names and error messages should be treated as strings, but
> they are octet-streams in DBD-mysql-4.041.
> [...]
Hello, please try development version 4.041_1 of DBD-mysql. That one has fixed UTF-8 support for passing statements and parameters.
CC: pali [...] cpan.org
Subject: Re: [rt.cpan.org #120141] UTF-8 column names and error messages are octet-streams, not strings
Date: Wed, 8 Feb 2017 20:19:20 +0900
To: bug-DBD-mysql [...] rt.cpan.org
From: Tanabe Yoshinori <tanabe [...] fa2.so-net.ne.jp>
On 2017/02/08 19:32, Pali via RT wrote:
> Hello, please try development version 4.041_1 of DBD-mysql. That one has fixed UTF-8 support for passing statements and parameters.
Hello,

I have just installed 4.041_01 ("print $DBD::mysql::VERSION" shows the number) and run the script again. The results are the same as in my first report.

Thank you.
Tanabe
On Wed Feb 08 06:20:34 2017, tanabe@fa2.so-net.ne.jp wrote:
> I have just installed 4.041_01 ("print $DBD::mysql::VERSION" shows the
> number) and run the script again. The results are the same as in my
> first report.
Hi! Can you try compiling DBD::mysql (either 4.041_01 or from git master) with these two attached patches? They should fix wide Unicode characters in column names and error messages. Note that DBI versions prior to 1.635 themselves have broken Unicode messages (see https://rt.cpan.org/Public/Bug/Display.html?id=102404).
On Sun Feb 12 07:52:30 2017, PALI wrote:
> Hi! Can you try compiling DBD::mysql (either 4.041_01 or from git
> master) with these two attached patches? [...]
Trying to attach patches again...
Subject: 0001-Fix-decoding-UTF-8-field-names-and-tables-names-when.patch
From 10980f96fb33c73c9b50bbee3c52d4875a0cc3e5 Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sun, 12 Feb 2017 13:02:08 +0100
Subject: [PATCH 1/2] Fix decoding UTF-8 field names and tables names when
 mysql_enable_utf8 is enabled

Attributes $sth->{NAME} and $sth->{mysql_table} should be properly UTF-8
decoded. Otherwise column and table names stay in UTF-8 octets instead of
correct wide Unicode strings.

Partially fixes bug: https://rt.cpan.org/Public/Bug/Display.html?id=120141
---
 dbdimp.c |    6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/dbdimp.c b/dbdimp.c
index 6b1e268..9e128b4 100644
--- a/dbdimp.c
+++ b/dbdimp.c
@@ -4869,8 +4869,10 @@ dbd_st_FETCH_internal(
 {
   dTHX;
   D_imp_sth(sth);
+  D_imp_dbh_from_sth;
   AV *av= Nullav;
   MYSQL_FIELD *curField;
+  bool enable_utf8 = (imp_dbh->enable_utf8 || imp_dbh->enable_utf8mb4);
 
   /* Are we asking for a legal value? */
   if (what < 0 || what >= AV_ATTRIB_LAST)
@@ -4896,10 +4898,14 @@ dbd_st_FETCH_internal(
         switch(what) {
         case AV_ATTRIB_NAME:
           sv= newSVpvn(curField->name, strlen(curField->name));
+          if (enable_utf8)
+            sv_utf8_decode(sv);
           break;
 
         case AV_ATTRIB_TABLE:
           sv= newSVpvn(curField->table, strlen(curField->table));
+          if (enable_utf8)
+            sv_utf8_decode(sv);
           break;
 
         case AV_ATTRIB_TYPE:
-- 
1.7.9.5
Subject: 0002-Fix-decoding-UTF-8-warning-and-error-messages-when-m.patch
From 132dba0df6e426262dacfd5236f5d0842b2419f0 Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Sun, 12 Feb 2017 13:27:04 +0100
Subject: [PATCH 2/2] Fix decoding UTF-8 warning and error messages when
 mysql_enable_utf8 is enabled

Warning and error messages can contains wide characters based on locale and
encoding settings. If we do not properly decode them then messages have UTF-8
octets instead of correct wide Unicode strings.

Note that at least DBI of version 1.635 is required. Otherwise this change
has no effect.

Partially fixes bug: https://rt.cpan.org/Public/Bug/Display.html?id=120141
---
 dbdimp.c |   37 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/dbdimp.c b/dbdimp.c
index 9e128b4..91cc1a8 100644
--- a/dbdimp.c
+++ b/dbdimp.c
@@ -1594,14 +1594,30 @@ void do_error(SV* h, int rc, const char* what, const char* sqlstate)
 {
   dTHX;
   D_imp_xxh(h);
+  imp_dbh_t* dbh;
   SV *errstr;
   SV *errstate;
+  bool enable_utf8;
+
+  if (DBIc_TYPE(imp_xxh) == DBIt_DB) {
+    D_imp_dbh(h);
+    dbh = imp_dbh;
+  } else {
+    D_imp_sth(h);
+    D_imp_dbh_from_sth;
+    dbh = imp_dbh;
+  }
+
+  enable_utf8 = (dbh->enable_utf8 || dbh->enable_utf8mb4);
 
   if (DBIc_TRACE_LEVEL(imp_xxh) >= 2)
     PerlIO_printf(DBIc_LOGPIO(imp_xxh), "\t\t--> do_error\n");
   errstr= DBIc_ERRSTR(imp_xxh);
   sv_setiv(DBIc_ERR(imp_xxh), (IV)rc); /* set err early */
+  SvUTF8_off(errstr);
   sv_setpv(errstr, what);
+  if (enable_utf8)
+    sv_utf8_decode(errstr);
 
 #if MYSQL_VERSION_ID >= SQL_STATE_VERSION
   if (sqlstate)
@@ -1626,10 +1642,26 @@ void do_warn(SV* h, int rc, char* what)
 {
   dTHX;
   D_imp_xxh(h);
+  imp_dbh_t* dbh;
+  bool enable_utf8;
+
+  if (DBIc_TYPE(imp_xxh) == DBIt_DB) {
+    D_imp_dbh(h);
+    dbh = imp_dbh;
+  } else {
+    D_imp_sth(h);
+    D_imp_dbh_from_sth;
+    dbh = imp_dbh;
+  }
+
+  enable_utf8 = (dbh->enable_utf8 || dbh->enable_utf8mb4);
 
   SV *errstr = DBIc_ERRSTR(imp_xxh);
   sv_setiv(DBIc_ERR(imp_xxh), (IV)rc); /* set err early */
+  SvUTF8_off(errstr);
   sv_setpv(errstr, what);
+  if (enable_utf8)
+    sv_utf8_decode(errstr);
   /* NO EFFECT DBIh_EVENT2(h, WARN_event, DBIc_ERR(imp_xxh), errstr);*/
   if (DBIc_TRACE_LEVEL(imp_xxh) >= 2)
     PerlIO_printf(DBIc_LOGPIO(imp_xxh), "%s warning %d recorded: %s\n",
@@ -2748,7 +2780,8 @@ SV* dbd_db_FETCH_attrib(SV *dbh, imp_dbh_t *imp_dbh, SV *keysv)
   STRLEN kl;
   char *key = SvPV(keysv, kl); /* needs to process get magic */
   SV* result = NULL;
-  dbh= dbh;
+  bool enable_utf8 = (imp_dbh->enable_utf8 || imp_dbh->enable_utf8mb4);
+  PERL_UNUSED_ARG(dbh);
 
   switch (*key) {
     case 'A':
@@ -2804,6 +2837,8 @@ SV* dbd_db_FETCH_attrib(SV *dbh, imp_dbh_t *imp_dbh, SV *keysv)
         /* Note that errmsg is obsolete, as of 2.09! */
         const char* msg = mysql_error(imp_dbh->pmysql);
         result= sv_2mortal(newSVpvn(msg, strlen(msg)));
+        if (enable_utf8)
+          sv_utf8_decode(result);
       }
       else if (kl == strlen("enable_utf8mb4") && strEQ(key, "enable_utf8mb4"))
         result = sv_2mortal(newSViv(imp_dbh->enable_utf8mb4));
-- 
1.7.9.5
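The order of operations the second patch adds to do_error() and do_warn() can be sketched as a self-contained toy model. This is not DBD::mysql or Perl API code: `toy_sv`, `toy_dbh`, `toy_is_multibyte_utf8` and `toy_set_error` are hypothetical stand-ins. It assumes, as I understand the Perl API, that sv_utf8_decode only turns the UTF-8 flag on when the bytes are valid UTF-8 and contain a multi-byte character, and that the SvUTF8_off call is there presumably because the error scalar is reused and could still carry the flag from an earlier message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a Perl scalar: a byte buffer plus a "this is character data" flag. */
struct toy_sv {
    char   buf[256];
    size_t len;
    bool   utf8;
};

/* Toy stand-in for the handle's mysql_enable_utf8 / mysql_enable_utf8mb4 settings. */
struct toy_dbh {
    bool enable_utf8;
    bool enable_utf8mb4;
};

/* True if the bytes are well-formed UTF-8 with at least one multi-byte character
 * (simplified: overlong forms and surrogates are not rejected). */
static bool toy_is_multibyte_utf8(const char *p, size_t len)
{
    const unsigned char *s = (const unsigned char *)p;
    bool multibyte = false;
    size_t i = 0;

    while (i < len) {
        size_t extra;                              /* continuation bytes expected */

        if (s[i] < 0x80)                extra = 0;
        else if ((s[i] & 0xE0) == 0xC0) extra = 1;
        else if ((s[i] & 0xF0) == 0xE0) extra = 2;
        else if ((s[i] & 0xF8) == 0xF0) extra = 3;
        else return false;                         /* invalid lead byte */

        if (extra > len - i - 1)
            return false;                          /* truncated sequence */
        for (size_t k = 1; k <= extra; k++)
            if ((s[i + k] & 0xC0) != 0x80)
                return false;                      /* bad continuation byte */

        if (extra > 0)
            multibyte = true;
        i += extra + 1;
    }
    return multibyte;
}

/* Mirrors the order of operations in the patched do_error(). */
void toy_set_error(struct toy_sv *errstr, const struct toy_dbh *dbh, const char *what)
{
    assert(errstr != NULL && dbh != NULL && what != NULL);
    bool enable_utf8 = dbh->enable_utf8 || dbh->enable_utf8mb4;

    errstr->utf8 = false;                          /* like SvUTF8_off(errstr) */

    errstr->len = strlen(what);                    /* like sv_setpv(errstr, what) */
    if (errstr->len >= sizeof errstr->buf)
        errstr->len = sizeof errstr->buf - 1;      /* toy buffer only: truncate */
    memcpy(errstr->buf, what, errstr->len);
    errstr->buf[errstr->len] = '\0';

    if (enable_utf8 && toy_is_multibyte_utf8(errstr->buf, errstr->len))
        errstr->utf8 = true;                       /* like sv_utf8_decode(errstr) */
}
```

In this model a Japanese message is flagged as character data only when one of the two settings is on. A plain ASCII message stored afterwards in the same `toy_sv` ends up unflagged, because the flag is cleared before every assignment.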
CC: pali [...] cpan.org
Subject: Re: [rt.cpan.org #120141] UTF-8 column names and error messages are octet-streams, not strings
Date: Mon, 13 Feb 2017 11:34:08 +0900
To: bug-DBD-mysql [...] rt.cpan.org
From: Tanabe Yoshinori <tanabe [...] fa2.so-net.ne.jp>
On 2017/02/12 21:52, Pali via RT wrote:
> Hi! Can you try compiling DBD::mysql (either 4.041_01 or from git master) with these two attached patches? [...]
Hello, I have confirmed that the problems are gone after applying the patches (and upgrading DBI to a later version). Thank you very much for the quick fix.
One concern is that the fix can break code currently running.
Best regards,
Tanabe
On Sun Feb 12 21:34:37 2017, tanabe@fa2.so-net.ne.jp wrote:
> [...]
> One concern is that the fix can break code currently running.
Thank you for testing. I will reuse your script to create tests for this issue. Unicode support has been broken in DBD::mysql for a long time, and the proper way forward is to fix the current code.
Reopening, fix was reverted in 4.043.