Subject: Placeholders aren't properly detected in statements with unquoted non-ascii characters
Date: Sun, 16 Dec 2007 12:25:56 +0100
To: bug-DBD-Pg [...] rt.cpan.org
From: Xavi Drudis Ferran <xdrudis [...] tinet.cat>
Hello. Thanks again for your work in this driver.
Continuing in my contributions to make the driver idiot-proof (none
beats an idiot like me at it), here's a bug or something I've found:
If you have a postgres identifier (a table or column name, for instance)
with a non-ASCII character such as an accented letter or a cedilla, and you
call prepare with a statement that contains this identifier,
placeholders after the non-ASCII identifier may not be detected
properly, and a later bind_param or execute may fail, complaining that the
placeholder is not there. This does not happen (i.e., everything works
fine) if the non-ASCII identifier is surrounded by double quotes.
I'm not sure why it happens. It might be that dbd_st_split_statement
in dbdimp.c is getting UTF-8 strings while it only works for ASCII or ISO-8859-1,
but I can't confirm that, since my C is too rusty and I have too little time
to test and learn.
I'm not even sure it needs to be fixed. If it's too much work at least
a note in the documentation should tell users to always double quote
postgres identifiers containing non-ascii characters. What I know is
that I forgot to quote one and spent almost a day before I found
the problem, because psql does not require such identifiers to be double-quoted.
So any explanation in the documentation, or even clearer error messages,
warnings about unquoted non-ASCII identifiers, or anything else that helps
users understand the problem earlier would be welcome.
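The double-quoting workaround mentioned above can be sketched as follows. This is only an illustration: quote_if_non_ascii is a hypothetical helper (it is not part of DBI or DBD::Pg), the table and column names are made up, and the script file is assumed to be saved as UTF-8.

```perl
use strict;
use warnings;
use utf8;   # assumption: this source file is saved as UTF-8

# Hypothetical helper (not part of DBI or DBD::Pg): wrap an identifier in
# double quotes when it contains any character outside the ASCII range.
sub quote_if_non_ascii {
    my ($ident) = @_;
    return $ident unless $ident =~ /[^\x00-\x7F]/;
    (my $escaped = $ident) =~ s/"/""/g;   # double any embedded double quotes
    return qq{"$escaped"};
}

# Made-up table and column names, for illustration only.
my $table  = quote_if_non_ascii('preus');        # pure ASCII, left as is
my $column = quote_if_non_ascii('descripció');   # non-ASCII, gets quoted

# The non-ASCII identifier is double-quoted before the placeholders appear.
my $sql = "SELECT * FROM $table WHERE $column = ? AND id = ?";

# With a connected handle, the statement would then be prepared as usual:
# my $sth = $dbh->prepare($sql);
```

Note that quoted identifiers are case-sensitive in PostgreSQL, so this sketch assumes the name is spelled exactly as it is stored. DBI also offers a quote_identifier method on the database handle, which may be a better choice when a connection is already open.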
By the way, it would also help to note in the documentation that it is useful to include
$dbh->do("SET client_encoding TO 'UTF8'"); # or LATIN1 or whatever
whenever the perl script file is encoded in UTF-8 (or Latin-1) and the
locale it is run under uses something else (this seems an infrequent case,
but a script may be installed for several users with several locales,
and it will only be encoded in one encoding).
Thanks again. I attach a little test about the placeholders after unquoted
non-ascii identifiers.