Skip Menu |

This queue is for tickets about the DBD-ODBC CPAN distribution.

Report information
The Basics
Id: 67994
Status: resolved
Priority: 0/
Queue: DBD-ODBC

People
Owner: Nobody in particular
Requestors: bitcard [...] mscha.org
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 1.29
Fixed in: 1.30_5



Subject: DBD::ODBC doesn't handle windows-1252 characters correctly
Using DBI 1.616 and DBD::ODBC 1.29 in ActivePerl 5.12.2 Build 1202 on Windows 7 to connect to an SQL Server database with default collation SQL_Latin1_General_CP1_CI_AS, I get data in varchar columns (normal, not wide) back in un-decoded windows-1252. (For instance a Euro sign, "€", is 0x80 instead of 0x20AC.) Workarounds are: - decode myself, using Encode::decode('windows-1252', $val) for each column after each fetch - explicitly set column type to SQL_WCHAR This shouldn't be necessary, though. If it isn't possible for DBD::ODBC to detect that data is encoded in windows-1252, perhaps a database handle attribute could be added to tell DBD::ODBC the encoding of the ODBC connection? See also: http://stackoverflow.com/questions/5912082/automatic-character-encoding-handling-in-perl-dbi-dbdodbc
On Fri May 06 11:41:59 2011, mscha wrote: Show quoted text
> Using DBI 1.616 and DBD::ODBC 1.29 in ActivePerl 5.12.2 Build 1202 on > Windows 7 to connect to an SQL Server database with default collation > SQL_Latin1_General_CP1_CI_AS, I get data in varchar columns (normal, > not > wide) back in un-decoded windows-1252. > (For instance a Euro sign, "€", is 0x80 instead of 0x20AC.) > > Workarounds are: > - decode myself, using Encode::decode('windows-1252', $val) for each > column after each fetch > - explicitly set column type to SQL_WCHAR > > This shouldn't be necessary, though. > > If it isn't possible for DBD::ODBC to detect that data is encoded in > windows-1252, perhaps a database handle attribute could be added to > tell > DBD::ODBC the encoding of the ODBC connection? > > See also: > http://stackoverflow.com/questions/5912082/automatic-character- > encoding-handling-in-perl-dbi-dbdodbc
See http://www.nntp.perl.org/group/perl.dbi.dev/2011/05/msg6559.html If I was to make the suggested change are you in a position to test it? Martin -- Martin J. Evans Wetherby, UK
On Wed May 11 03:45:13 2011, MJEVANS wrote: Show quoted text
> On Fri May 06 11:41:59 2011, mscha wrote:
> > Using DBI 1.616 and DBD::ODBC 1.29 in ActivePerl 5.12.2 Build 1202 on > > Windows 7 to connect to an SQL Server database with default collation > > SQL_Latin1_General_CP1_CI_AS, I get data in varchar columns (normal, > > not > > wide) back in un-decoded windows-1252. > > (For instance a Euro sign, "€", is 0x80 instead of 0x20AC.) > > > > Workarounds are: > > - decode myself, using Encode::decode('windows-1252', $val) for each > > column after each fetch > > - explicitly set column type to SQL_WCHAR > > > > This shouldn't be necessary, though. > > > > If it isn't possible for DBD::ODBC to detect that data is encoded in > > windows-1252, perhaps a database handle attribute could be added to > > tell > > DBD::ODBC the encoding of the ODBC connection? > > > > See also: > > http://stackoverflow.com/questions/5912082/automatic-character- > > encoding-handling-in-perl-dbi-dbdodbc
> > See > http://www.nntp.perl.org/group/perl.dbi.dev/2011/05/msg6559.html > > If I was to make the suggested change are you in a position to test it? > > Martin
Try 1.30_1 on CPAN now. Please let me know if this addresses your issue so I can write this rt off. Martin -- Martin J. Evans Wetherby, UK
From: bitcard [...] mscha.org
On Mon May 16 03:46:20 2011, MJEVANS wrote: Show quoted text
> On Wed May 11 03:45:13 2011, MJEVANS wrote:
> > On Fri May 06 11:41:59 2011, mscha wrote:
> > > Using DBI 1.616 and DBD::ODBC 1.29 in ActivePerl 5.12.2 Build 1202 on > > > Windows 7 to connect to an SQL Server database with default collation > > > SQL_Latin1_General_CP1_CI_AS, I get data in varchar columns (normal, > > > not > > > wide) back in un-decoded windows-1252. > > > (For instance a Euro sign, "€", is 0x80 instead of 0x20AC.) > > > > > > Workarounds are: > > > - decode myself, using Encode::decode('windows-1252', $val) for each > > > column after each fetch > > > - explicitly set column type to SQL_WCHAR > > > > > > This shouldn't be necessary, though. > > > > > > If it isn't possible for DBD::ODBC to detect that data is encoded in > > > windows-1252, perhaps a database handle attribute could be added to > > > tell > > > DBD::ODBC the encoding of the ODBC connection? > > > > > > See also: > > > http://stackoverflow.com/questions/5912082/automatic-character- > > > encoding-handling-in-perl-dbi-dbdodbc
> > > > See > > http://www.nntp.perl.org/group/perl.dbi.dev/2011/05/msg6559.html > > > > If I was to make the suggested change are you in a position to test it? > > > > Martin
> > Try 1.30_1 on CPAN now. > > Please let me know if this addresses your issue so I can write this rt
off. Show quoted text
>
Apologies for the late response. The chosen solution sounds like the right one to me. However, it does not appear to work... I installed 1.30_4 as described in README.windows: cpan MJEVANS/DBD-ODBC-1.30_4.tar.gz (The compiler is MS Visual Studio 2008.) I then wrote a little test script. Without specifying a bind type, I still get data back in the legacy encoding (Windows-1252, a Euro sign (€) is 0x80). If I bind explicitly as SQL_WCHAR, I do get a properly encoded Euro sign (0x8364). DBD::ODBC version 1.30_4 odbc_has_unicode: 1 odbc_old_unicode: Title with default bind type: 'eBookbon: ▒ 10' '▒': 128 Title with bind type SQL_WCHAR: 'eBookbon: € 10' '€': 8364 I'll attach the script (enctest.pl) to this bug report, together with the output using DBI_TRACE=15=enctest.log. (I also tried 1.30_1, in case it worked there and had been broken again in one of the later releases, but it behaves exactly the same.) Thanks, - Michael
From: bitcard [...] mscha.org
On Sun May 22 14:19:59 2011, mscha wrote: Show quoted text
> On Mon May 16 03:46:20 2011, MJEVANS wrote:
> > On Wed May 11 03:45:13 2011, MJEVANS wrote:
> > > On Fri May 06 11:41:59 2011, mscha wrote:
> > > > Using DBI 1.616 and DBD::ODBC 1.29 in ActivePerl 5.12.2 Build
1202 on Show quoted text
> > > > Windows 7 to connect to an SQL Server database with default
collation Show quoted text
> > > > SQL_Latin1_General_CP1_CI_AS, I get data in varchar columns (normal, > > > > not > > > > wide) back in un-decoded windows-1252. > > > > (For instance a Euro sign, "€", is 0x80 instead of 0x20AC.) > > > > > > > > Workarounds are: > > > > - decode myself, using Encode::decode('windows-1252', $val) for
each Show quoted text
> > > > column after each fetch > > > > - explicitly set column type to SQL_WCHAR > > > > > > > > This shouldn't be necessary, though. > > > > > > > > If it isn't possible for DBD::ODBC to detect that data is encoded in > > > > windows-1252, perhaps a database handle attribute could be added to > > > > tell > > > > DBD::ODBC the encoding of the ODBC connection? > > > > > > > > See also: > > > > http://stackoverflow.com/questions/5912082/automatic-character- > > > > encoding-handling-in-perl-dbi-dbdodbc
> > > > > > See > > > http://www.nntp.perl.org/group/perl.dbi.dev/2011/05/msg6559.html > > > > > > If I was to make the suggested change are you in a position to
test it? Show quoted text
> > > > > > Martin
> > > > Try 1.30_1 on CPAN now. > > > > Please let me know if this addresses your issue so I can write this rt
> off.
> >
> > Apologies for the late response. > > The chosen solution sounds like the right one to me. > However, it does not appear to work... > > I installed 1.30_4 as described in README.windows: > cpan MJEVANS/DBD-ODBC-1.30_4.tar.gz > (The compiler is MS Visual Studio 2008.) > > I then wrote a little test script. Without specifying a bind type, I > still get data back in the legacy encoding (Windows-1252, a Euro sign > (€) is 0x80). If I bind explicitly as SQL_WCHAR, I do get a properly > encoded Euro sign (0x8364). > > DBD::ODBC version 1.30_4 > odbc_has_unicode: 1 > odbc_old_unicode: > Title with default bind type: 'eBookbon: ▒ 10' > '▒': 128 > Title with bind type SQL_WCHAR: 'eBookbon: € 10' > '€': 8364 > > I'll attach the script (enctest.pl) to this bug report, together with > the output using DBI_TRACE=15=enctest.log.
And here they are... Show quoted text
> > (I also tried 1.30_1, in case it worked there and had been broken again > in one of the later releases, but it behaves exactly the same.) > > Thanks, > > - Michael
Subject: enctest.log
Download enctest.log
application/octet-stream 71.9k

Message body not shown because it is not plain text.

Subject: enctest.pl
#!/opt/perl/bin/perl use strict; use warnings; use 5.010; use utf8; use autodie; use DBI qw(:sql_types); use DBD::ODBC; ############################################################ my $DB_NAME = '(censored)'; my $DB = "dbi:ODBC:Driver={SQL Server Native Client 10.0};Server=localhost;Database=$DB_NAME;Trusted_Connection=yes;"; my $DB_USER = '(censored)'; my $DB_PASS = '(censored)'; my $DB_OPTIONS = { RaiseError => 1, AutoCommit => 1 }; ############################################################ #binmode STDOUT, ':utf8'; say "DBD::ODBC version $DBD::ODBC::VERSION"; my $dbh = DBI->connect($DB, $DB_USER, $DB_PASS, $DB_OPTIONS); foreach my $attr (qw(odbc_has_unicode odbc_old_unicode)) { say "$attr: ", $dbh->{$attr} // ''; } # Title is a varchar(2000). The collation is SQL_Latin1_General_CP1_CI_AS, better known as windows-1252. my $sth = $dbh->prepare("SELECT Title, Title FROM Product WHERE EAN = '3880060000010'"); $sth->execute(); my ($title1, $title2); $sth->bind_col(1, \$title1); $sth->bind_col(2, \$title2, SQL_WCHAR); while ($sth->fetch()) { printf("Title with default bind type: '%s'\n", $title1); my @high1 = $title1 =~ m{([^\x00-\x7F])}g; foreach my $h (@high1) { printf(" '%s': %d\n", $h, ord($h)); } printf("Title with bind type SQL_WCHAR: '%s'\n", $title2); my @high2 = $title2 =~ m{([^\x00-\x7F])}g; foreach my $h (@high2) { printf(" '%s': %d\n", $h, ord($h)); } } $dbh->disconnect();
On Sun May 22 14:22:02 2011, mscha wrote: Show quoted text
> > I installed 1.30_4 as described in README.windows: > > cpan MJEVANS/DBD-ODBC-1.30_4.tar.gz > > (The compiler is MS Visual Studio 2008.) > > > > I then wrote a little test script. Without specifying a bind type, I > > still get data back in the legacy encoding (Windows-1252, a Euro sign > > (€) is 0x80). If I bind explicitly as SQL_WCHAR, I do get a properly > > encoded Euro sign (0x8364). > > > > DBD::ODBC version 1.30_4 > > odbc_has_unicode: 1 > > odbc_old_unicode: > > Title with default bind type: 'eBookbon: ▒ 10' > > '▒': 128 > > Title with bind type SQL_WCHAR: 'eBookbon: € 10' > > '€': 8364 > > > > I'll attach the script (enctest.pl) to this bug report, together with > > the output using DBI_TRACE=15=enctest.log.
> > And here they are... >
> > > > (I also tried 1.30_1, in case it worked there and had been broken again > > in one of the later releases, but it behaves exactly the same.) > > > > Thanks, > > > > - Michael
>
Thanks Michael. Unfortunately I cannot reproduce your problem here which is making this difficult. Is there any way you could send me some SQL exported from your SQL Server which sets the table and columns up as you have it and a row of data? Martin -- Martin J. Evans Wetherby, UK
From: bitcard [...] mscha.org
On Tue May 24 04:36:30 2011, MJEVANS wrote: Show quoted text
> > Unfortunately I cannot reproduce your problem here which is making this > difficult. Is there any way you could send me some SQL exported from > your SQL Server which sets the table and columns up as you have it and a > row of data?
The attached script should do the trick. Tested on SQL Server 2008, works with enctest.pl which I posted earlier (if you change database, username and password accordingly). Of course, you should check if the encoding doesn't get corrupted somewhere along the way, and the row is inserted correctly. The SELECT at the end of the script should return: 3880060000010 eBookbon: € 10 (i.e. with a Euro sign between the colon and the number 10). Thanks, - Michael
Subject: create_enctest_db.sql
Download create_enctest_db.sql
application/octet-stream 321b

Message body not shown because it is not plain text.

On Tue May 24 06:49:33 2011, mscha wrote: Show quoted text
> On Tue May 24 04:36:30 2011, MJEVANS wrote:
> > > > Unfortunately I cannot reproduce your problem here which is making
this Show quoted text
> > difficult. Is there any way you could send me some SQL exported from > > your SQL Server which sets the table and columns up as you have it
and a Show quoted text
> > row of data?
> > The attached script should do the trick. Tested on SQL Server 2008, > works with enctest.pl which I posted earlier (if you change database, > username and password accordingly). > > Of course, you should check if the encoding doesn't get corrupted > somewhere along the way, and the row is inserted correctly. The
SELECT Show quoted text
> at the end of the script should return: > 3880060000010 eBookbon: € 10 > (i.e. with a Euro sign between the colon and the number 10). > > Thanks, > > - Michael
Michael, Could you try changing the lines around line 2239 in dbdimp.c to: if (!imp_sth->odbc_old_unicode && (fbh->ColSqlType == SQL_CHAR || fbh->ColSqlType == SQL_VARCHAR)) { fbh->ColSqlType = SQL_WCHAR; } note, I've added || fbh->ColSqlType == SQL_VARCHAR run nmake/dmake or whatever you did before to rebuild it and try again. You can either make install it or do something like: perl -Iblib\lib -Iblib\arch myscript.pl from the build directory. Martin -- Martin J. Evans Wetherby, UK
From: bitcard [...] mscha.org
On Tue May 24 08:50:53 2011, MJEVANS wrote: Show quoted text
> On Tue May 24 06:49:33 2011, mscha wrote:
> > On Tue May 24 04:36:30 2011, MJEVANS wrote:
> > > > > > Unfortunately I cannot reproduce your problem here which is making
> this
> > > difficult. Is there any way you could send me some SQL exported from > > > your SQL Server which sets the table and columns up as you have it
> and a
> > > row of data?
> > > > The attached script should do the trick. Tested on SQL Server 2008, > > works with enctest.pl which I posted earlier (if you change database, > > username and password accordingly). > > > > Of course, you should check if the encoding doesn't get corrupted > > somewhere along the way, and the row is inserted correctly. The
> SELECT
> > at the end of the script should return: > > 3880060000010 eBookbon: € 10 > > (i.e. with a Euro sign between the colon and the number 10). > > > > Thanks, > > > > - Michael
> > Michael, > > Could you try changing the lines around line 2239 in dbdimp.c to: > > if (!imp_sth->odbc_old_unicode && > (fbh->ColSqlType == SQL_CHAR || fbh->ColSqlType == SQL_VARCHAR)) { > fbh->ColSqlType = SQL_WCHAR; > } > > note, I've added || fbh->ColSqlType == SQL_VARCHAR > > run nmake/dmake or whatever you did before to rebuild it and try again. > > You can either make install it or do something like: > > perl -Iblib\lib -Iblib\arch myscript.pl > > from the build directory. > > Martin
DBD::ODBC version 1.30_4 odbc_has_unicode: 1 odbc_old_unicode: Title with default bind type: 'eBookbon: € 10' '€': 8364 Title with bind type SQL_WCHAR: 'eBookbon: € 10' '€': 8364 :-) Thanks, - Michael
On Tue May 24 12:19:09 2011, mscha wrote: Show quoted text
> On Tue May 24 08:50:53 2011, MJEVANS wrote:
> > On Tue May 24 06:49:33 2011, mscha wrote:
> > > On Tue May 24 04:36:30 2011, MJEVANS wrote:
> > > > > > > > Unfortunately I cannot reproduce your problem here which is
making Show quoted text
> > this
> > > > difficult. Is there any way you could send me some SQL exported
from Show quoted text
> > > > your SQL Server which sets the table and columns up as you have
it Show quoted text
> > and a
> > > > row of data?
> > > > > > The attached script should do the trick. Tested on SQL Server
2008, Show quoted text
> > > works with enctest.pl which I posted earlier (if you change
database, Show quoted text
> > > username and password accordingly). > > > > > > Of course, you should check if the encoding doesn't get corrupted > > > somewhere along the way, and the row is inserted correctly. The
> > SELECT
> > > at the end of the script should return: > > > 3880060000010 eBookbon: € 10 > > > (i.e. with a Euro sign between the colon and the number 10). > > > > > > Thanks, > > > > > > - Michael
> > > > Michael, > > > > Could you try changing the lines around line 2239 in dbdimp.c to: > > > > if (!imp_sth->odbc_old_unicode && > > (fbh->ColSqlType == SQL_CHAR || fbh->ColSqlType == SQL_VARCHAR))
{ Show quoted text
> > fbh->ColSqlType = SQL_WCHAR; > > } > > > > note, I've added || fbh->ColSqlType == SQL_VARCHAR > > > > run nmake/dmake or whatever you did before to rebuild it and try
again. Show quoted text
> > > > You can either make install it or do something like: > > > > perl -Iblib\lib -Iblib\arch myscript.pl > > > > from the build directory. > > > > Martin
> > DBD::ODBC version 1.30_4 > odbc_has_unicode: 1 > odbc_old_unicode: > Title with default bind type: 'eBookbon: € 10' > '€': 8364 > Title with bind type SQL_WCHAR: 'eBookbon: € 10' > '€': 8364 > > :-) > > Thanks, > > - Michael
\o/ I will release this as 1.30_5 development release later tonight. Thanks for sticking with this to the end and providing enough detail to find a solution. Martin -- Martin J. Evans Wetherby, UK