Bug #46438 for DBD-Oracle: Errors returned by procedures are not unicode strings

Thu May 28 04:21:52 2009 bohica [...] ntlworld.com - Ticket created

Subject:

Errors returned by procedures are not unicode strings

Hi, This probably applies to all errors but my example is from a procedure. We have a procedure which calls RAISE_APPLICATION_ERROR with a unicode string but when the error is retrieved from DBD::Oracle, the string is valid UTF-8 but not marked as such. This means when it is printed to a file handle with utf-8 encoding it is double encoded (unlike unicode data retrieved from a result-set due to a select). I've not looked into DBD::Oracle code as yet but this suggests to me data retrieved from selects is marked utf-8 but errors are not. Here is a small self contained example (with output): #!/usr/bin/perl use warnings; use strict; use Encode; use DBI; # Note: NLS_LANG=AMERICAN_AMERICA.AL32UTF8 binmode(STDOUT, ":utf8"); my $dbh = DBI->connect('dbi:Oracle:host=xxx.easysoft.local;sid=devel', 'xxx', 'xxx'); eval {$dbh->do(q/drop procedure p_martin/)}; my $proc = <<'EOT'; create procedure p_martin(arg IN VARCHAR2) IS BEGIN RAISE_APPLICATION_ERROR(-20001, 'Error ' || arg); END; EOT $dbh->do($proc); $dbh->{RaiseError} = 0; $dbh->{PrintError} = 0; my $arg = "\x{20ac}"; my $sth = $dbh->prepare(q{BEGIN p_martin(?); END;}); $sth->bind_param(1, $arg); if (!$sth->execute) { my $error = $dbh->errstr; print "Is utf8: " . (Encode::is_utf8($error, Encode::FB_CROAK) ? 'yes' : 'no') . "\n"; print "error: $error, \n"; print "Should be $arg\n"; my $utf8 = Encode::decode_utf8($error); print "Error decoded: $utf8\n"; } which outputs: $ perl dbd_oracle_errors_not_unicode_example.pl Is utf8: no error: ORA-20001: Error â¬ ORA-06512: at "BET.P_MARTIN", line 3 ORA-06512: at line 1 (DBD ERROR: OCIStmtExecute), Should be € Error decoded: ORA-20001: Error € ORA-06512: at "BET.P_MARTIN", line 3 ORA-06512: at line 1 (DBD ERROR: OCIStmtExecute) I would have expected "Error €" but the euro was double utf-8 encoded as it was not marked as utf-8 internally in perl. Martin -- Martin J. Evans Wetherby, UK

Fri Jun 05 13:07:09 2009 champoux [...] pythian.com - Correspondence added

Well spent a little time with this today and I do not have an exact answer to this. Seems the error is never returned as anUTF8 by OCI or it is being mangled by OCI before I get it. I do not know whitch exactly. for now I will leave this open and get back to it if I have time.

Fri Jun 05 13:07:10 2009 The RT System itself - Status changed from 'new' to 'open'

Fri Jun 05 13:15:08 2009 bohica [...] ntlworld.com - Correspondence added

On Fri Jun 05 13:07:09 2009, PYTHIAN wrote: Show quoted text

> Well spent a little time with this today and I do not have an exact > answer to this. > > Seems the error is never returned as anUTF8 by OCI or it is being > mangled by OCI before I get it. I do not know whitch exactly. > > for now I will leave this open and get back to it if I have time. >

It must be in UTF8 or how could I call utf8_decode() on them and get the unicode strings back. Martin -- Martin J. Evans Wetherby, UK

Fri Jun 05 15:02:06 2009 champoux [...] pythian.com - Correspondence added

I might be that my DB is not unicode so it does not react the same (or return the data in unicode) I will try it againg on another DB that I know is utf8 cheers John Scoles On Fri Jun 05 13:15:08 2009, MJEVANS wrote: Show quoted text

> On Fri Jun 05 13:07:09 2009, PYTHIAN wrote:

> > Well spent a little time with this today and I do not have an exact > > answer to this. > > > > Seems the error is never returned as anUTF8 by OCI or it is being > > mangled by OCI before I get it. I do not know whitch exactly. > > > > for now I will leave this open and get back to it if I have time. > >

> > It must be in UTF8 or how could I call utf8_decode() on them and get

the Show quoted text

> unicode strings back. > > Martin

Fri Jun 05 15:07:59 2009 bohica [...] ntlworld.com - Correspondence added

On Fri Jun 05 15:02:06 2009, PYTHIAN wrote: Show quoted text

> I might be that my DB is not unicode so it does not react the same (or > return the data in unicode) I will try it againg on another DB that I > know is utf8 >

We are using NLS_LANG=AMERICAN_AMERICA.AL32UTF8 exclusively. We are using 10g server and client but the same issue existed to XE server with XE client.

Tue Jun 09 03:41:49 2009 bohica [...] ntlworld.com - Correspondence added

Some discussion continued off this rt via email. The following solution works but may need tweaking if it is possible to detect whether Oracle is sending UTF-8 encoded errors back or not (similarly to what must happen when result-sets are read) - see comments. The following works so long as you use DBIh_SET_ERR_SV instead of DBIh_SET_ERR_CHAR macro like this: int oci_error_err(SV *h, OCIError *errhp, sword status, char *what, sb4 force_err) { dTHX; D_imp_xxh(h); sb4 errcode; SV *errstr_sv = sv_newmortal(); SV *errcode_sv = sv_newmortal(); errcode = oci_error_get(errhp, status, what, errstr_sv, DBIS->debug); /* The following should really know if Oracle is returning us UTF8 encoded errors or not. I don't know how you know this but I imagine DBD::Oracle must know this for result-set data. */ #ifdef sv_utf8_decode sv_utf8_decode(errstr_sv); #else SvUTF8_on(errstr_sv); #endif /* DBIc_ERR *must* be SvTRUE (for RaiseError etc), some */ /* errors, like OCI_INVALID_HANDLE, don't set errcode. */ if (force_err) errcode = force_err; if (status == OCI_SUCCESS_WITH_INFO) errcode = 0; /* record as a "warning" for DBI>=1.43 */ else if (errcode == 0) errcode = (status != 0) ? status : -10000; sv_setiv(errcode_sv, errcode); DBIh_SET_ERR_SV(h, imp_xxh, errcode_sv, errstr_sv, Nullsv, Nullsv); return 0; /* always returns 0 */ } NOTE: only using SvUTF8_on as a last resort - the proper way is to use sv_utf8_decode because just turning on the UTF8 flag on a scalar is dangerous if the scalar is not really UTF8 encoded (see http://www.perlmonks.org/?node_id=644786 and warning from Tim last year). If the perl is new enough (I think 5.8.1) it has a nice safe sv_utf8_decode else we fall back on dangerous svUTF8_on. NOTE: see comment in code above. I don't know how to know if Oracle is supposed to be returning UTF8 encoded error strings. I imagine it is like whatever you do with result-sets to know if the data is UTF8 encoded. It should really be in a test of: if (oracle_is_returning_utf8_encoded_data) This was the bit I thought you might know. Anyway, the above code works fine on my system. It is basically as it was except changed to use DBIh_SET_ERR_SV instead of DBIh_SET_ERR_CHAR and to decode the UTF8 encoded strings. BTW, I am having some email issues at the moment so it is probably better to keep this on rt (for me anyway) for now. Martin -- Martin J. Evans Wetherby, UK

Wed Jun 10 11:09:46 2009 Tim.Bunce [...] pobox.com - Correspondence added

Subject:	Re: [rt.cpan.org #46438] Errors returned by procedures are not unicode strings
Date:	Wed, 10 Jun 2009 16:09:19 +0100
To:	Martin J Evans via RT <bug-DBD-Oracle [...] rt.cpan.org>
From:	Tim Bunce <Tim.Bunce [...] pobox.com>

On Tue, Jun 09, 2009 at 03:41:50AM -0400, Martin J Evans via RT wrote: Show quoted text

> NOTE: see comment in code above. I don't know how to know if Oracle is > supposed to be returning UTF8 encoded error strings. I imagine it is > like whatever you do with result-sets to know if the data is UTF8 > encoded. It should really be in a test of: > > if (oracle_is_returning_utf8_encoded_data) > > This was the bit I thought you might know.

I think CSFORM_IMPLIES_UTF8(SQLCS_IMPLICIT) is what's needed. Tim.

Wed Jun 10 12:06:20 2009 bohica [...] ntlworld.com - Correspondence added

On Wed Jun 10 11:09:46 2009, Tim.Bunce@pobox.com wrote: Show quoted text

> I think CSFORM_IMPLIES_UTF8(SQLCS_IMPLICIT) is what's needed.

Thanks Tim. In that case my changed oci_error_err function is now: int oci_error_err(SV *h, OCIError *errhp, sword status, char *what, sb4 force_err) { dTHX; D_imp_xxh(h); sb4 errcode; SV *errstr_sv = sv_newmortal(); SV *errcode_sv = sv_newmortal(); errcode = oci_error_get(errhp, status, what, errstr_sv, DBIS->debug); if (CSFORM_IMPLIES_UTF8(SQLCS_IMPLICIT)) { #ifdef sv_utf8_decode sv_utf8_decode(errstr_sv); #else SvUTF8_on(errstr_sv); #endif } /* DBIc_ERR *must* be SvTRUE (for RaiseError etc), some */ /* errors, like OCI_INVALID_HANDLE, don't set errcode. */ if (force_err) errcode = force_err; if (status == OCI_SUCCESS_WITH_INFO) errcode = 0; /* record as a "warning" for DBI>=1.43 */ else if (errcode == 0) errcode = (status != 0) ? status : -10000; sv_setiv(errcode_sv, errcode); DBIh_SET_ERR_SV(h, imp_xxh, errcode_sv, errstr_sv, Nullsv, Nullsv); return 0; /* always returns 0 */ } I can confirm this passes the test code I sent originally to demonstrate the problem. Martin

Thu Jul 02 15:24:04 2009 champoux [...] pythian.com - Correspondence added

Wel I applied you changes and changed them in. I stil cannot get it to error for me though so I will have to trust you that it fixes the problems

Thu Jul 02 15:24:06 2009 champoux [...] pythian.com - Status changed from 'open' to 'resolved'

Wed Jul 15 04:20:47 2009 bohica [...] ntlworld.com - Correspondence added

On Thu Jul 02 15:24:04 2009, PYTHIAN wrote: Show quoted text

> Wel I applied you changes and changed them in. I stil cannot get it to > error for me though so I will have to trust you that it fixes the > problems

I can assure you it does and I will check again with the version in subversion when I see it go in. Martin -- Martin J. Evans Wetherby, UK

Wed Jul 15 04:20:48 2009 The RT System itself - Status changed from 'resolved' to 'open'

Thu Oct 29 13:57:46 2009 champoux [...] pythian.com - Correspondence added

The patch has been applied and tested on a unicode DB and is working you can get the code in Trunk http://svn.perl.org/modules/dbd-oracle/trunk it will be release with 1.24 version of the code

Thu Oct 29 13:57:48 2009 champoux [...] pythian.com - Status changed from 'open' to 'resolved'

Tue Dec 21 18:55:47 2010 champoux [...] pythian.com - Fixed in 1.24 added

Tue Dec 21 18:55:47 2010 champoux [...] pythian.com - Fixed in 1.24a added

Tue Dec 21 18:55:47 2010 champoux [...] pythian.com - Fixed in 1.24b added

Bug #46438 for DBD-Oracle: Errors returned by procedures are not unicode strings

Preferred bug tracker