Bug #31577 for DBD-Pg: Placeholders aren't properly detected in statements with unquoted non-ascii characters

Sun Dec 16 06:33:58 2007 xdrudis [...] tinet.cat - Ticket created

Subject:	Placeholders aren't properly detected in statements with unquoted non-ascii characters
Date:	Sun, 16 Dec 2007 12:25:56 +0100
To:	bug-DBD-Pg [...] rt.cpan.org
From:	Xavi Drudis Ferran <xdrudis [...] tinet.cat>

Hello. Thanks again for your work on this driver. Continuing my contributions to make the driver idiot-proof (none beats an idiot like me at it), here's a bug or something I've found:

If you have a postgres identifier (a table or column name, for instance) with a non-ascii character such as an accent or cedilla, and you call prepare with a statement that contains this identifier, placeholders after the non-ascii identifier may not be detected properly, and a later bind_param or execute may fail, complaining that the placeholder is not there. This does not happen (i.e., everything works fine) if the non-ascii identifier is surrounded by double quotes.

I'm not sure why it happens. It might be that dbd_st_split_statement in dbdimp.c is getting utf8 strings while it only works for ascii or iso-8859-1, but I can't tell, since my C is too rusty and I have too little time to test and learn.

I'm not even sure it needs to be fixed. If it's too much work, at least a note in the documentation should tell users to always double quote postgres identifiers containing non-ascii characters. What I know is that I forgot to quote one and spent almost a day finding the problem, because psql does not require such identifiers to be double quoted. So any explanation in the documentation, clearer error messages, warnings about unquoted non-ascii identifiers, or anything else that helps users understand the problem earlier would be welcome.

Btw, it would also help to note in the documentation that it is useful to include

$dbh->do("SET client_encoding TO 'UTF8'"); # or LATIN1 or whatever

whenever the perl script file is encoded in UTF8 (or latin1) and the locale it runs under is something else (it seems an infrequent case, but a script may be installed for several users with several locales, and it will only be encoded in one encoding).

Thanks again. I attach a little test about the placeholders after unquoted non-ascii identifiers.

Sun Dec 16 10:47:42 2007 xdrudis [...] tinet.cat - Correspondence added

CC:	xdrudis [...] tinet.cat
Subject:	Re: [rt.cpan.org #31577] AutoReply: Placeholders aren't properly detected in statements with unquoted non-ascii characters
Date:	Sun, 16 Dec 2007 16:40:55 +0100
To:	Bugs in DBD-Pg via RT <bug-DBD-Pg [...] rt.cpan.org>
From:	Xavi Drudis Ferran <xdrudis [...] tinet.cat>

On Sun, Dec 16, 2007 at 06:34:05AM -0500, Bugs in DBD-Pg via RT wrote:

> Thanks again. I attach a little test about the placeholders after unquoted
> non-ascii identifiers.

Sorry, I'm attaching it now.

Message body is not shown because sender requested not to inline it.

Sun Jan 06 20:31:18 2008 greg [...] turnstep.com - Correspondence added

Thanks for the report. I've made the character unsigned, which should solve the problem for this particular case. Long term, we may want to support UTF-16 as well, but since DBI uses a lot of basic C strings, that's a ways off. Made the fix in r10482; it will be in the next release as well.
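
For readers wondering why making the character unsigned matters, here is a minimal sketch. It is not the DBD::Pg source: the two functions below are hypothetical, and the sketch assumes a scanner that treats any byte value below 1 as the end of the statement. Under that assumption, a byte of 0x80 or above (as found in every UTF-8 encoded non-ascii character) is negative when read as a signed char, so the scan stops early and later placeholders are never seen. Quoting rules are ignored to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scanner (an assumption, not the DBD::Pg code):
 * counts '?' placeholders and reads each byte as a SIGNED char.
 * The "ch < 1" test is meant to catch the terminating NUL, but it is
 * also true for any byte >= 0x80, which is negative as a signed char. */
static int count_placeholders_signed(const char *sql)
{
    assert(sql != NULL);
    const signed char *p = (const signed char *)sql;
    int count = 0;
    for (;;) {
        signed char ch = *p++;
        if (ch < 1)        /* stops at NUL, and at the first high byte */
            break;
        if (ch == '?')
            count++;
    }
    return count;
}

/* Same loop, but each byte is read as an UNSIGNED char, so only the
 * terminating NUL (value 0) satisfies "ch < 1". */
static int count_placeholders_unsigned(const char *sql)
{
    assert(sql != NULL);
    const unsigned char *p = (const unsigned char *)sql;
    int count = 0;
    for (;;) {
        unsigned char ch = *p++;
        if (ch < 1)        /* stops only at NUL */
            break;
        if (ch == '?')
            count++;
    }
    return count;
}
```

Given a statement such as "SELECT * FROM pla\xC3\xA7" "a WHERE id = ?" (where 0xC3 0xA7 is the UTF-8 encoding of the cedilla, and the literal is split so the following "a" is not read as a hex digit), the signed variant reports no placeholders and the unsigned variant reports one.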

Sun Jan 06 20:31:21 2008 The RT System itself - Status changed from 'new' to 'open'

Sun Jan 06 20:32:02 2008 greg [...] turnstep.com - Severity Critical added

Mon Feb 11 10:02:46 2008 greg [...] turnstep.com - Status changed from 'open' to 'resolved'

Mon Feb 11 10:03:17 2008 greg [...] turnstep.com - Fixed in 2.0.0 added