Hi,
A Debian user reported[1] that the pg_enable_utf8 flag has no effect on
columns declared as "text[]". The returned data is not flagged as utf8,
even when it actually is UTF-8.
[1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=436693
The attached test.tar.gz contains an sql script for populating a
database and a perl script that demonstrates the problem.
As you can see, the data returned from the first select is flagged as
utf8, as confirmed by the warning about printing wide characters.
The data from the second statement, however, is not flagged as utf8 and
no warnings occur. Also, the is_utf8 function returns false for it.
I tried to see where the problem is, but failed. Given my complete lack
of knowledge about DBD::Pg internals, this is not surprising.
Anyway, I'll share my findings, hoping they may be of some use (if for
nothing else, then for bringing a smile to your face :))
I see that detection of utf8 data is done in dbdimp.c, around line 2309.
Adding a couple of warn()'s, I discovered that:
1) the type_id for the array column is INT4ARRAYOID, which is not
handled by the switch that follows.
2) even when adding INT4ARRAYOID to the switch, nothing changes, as
the data pointed to by *value has no high bits set. This is not
surprising, as value_len is only 7, suggesting that *value contains only
a pointer to the actual data. For comparison, value_len is 9 in the
first select.
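To make the two observations above concrete, here is roughly how I
understand the check. This is only a simplified sketch written from my
reading, not the actual dbdimp.c code: the function names
(has_high_bit, is_text_like, should_flag_utf8) are made up for
illustration, and the OID values are the ones I believe PostgreSQL
assigns in pg_type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Type OIDs as I understand them from PostgreSQL's pg_type catalog. */
#define TEXTOID       25
#define BPCHAROID     1042
#define VARCHAROID    1043
#define INT4ARRAYOID  1007

/* Hypothetical helper: true if any byte in value[0..len) has bit 7 set,
 * i.e. the buffer contains something other than plain ASCII. */
bool has_high_bit(const char *value, size_t len)
{
    assert(value != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if ((unsigned char)value[i] & 0x80)
            return true;
    }
    return false;
}

/* Hypothetical helper: the set of types the switch looks at.
 * Array types are deliberately absent, mirroring observation 1). */
bool is_text_like(int type_id)
{
    switch (type_id) {
    case TEXTOID:
    case BPCHAROID:
    case VARCHAROID:
        return true;
    default:
        return false;
    }
}

/* Flag as utf8 only when the type is text-like AND the bytes are
 * non-ASCII. Either condition failing leaves the data unflagged. */
bool should_flag_utf8(int type_id, const char *value, size_t len)
{
    return is_text_like(type_id) && has_high_bit(value, len);
}
```

With a check shaped like this, the array column fails twice: its
type_id is not in the switch, and even if it were, a 7-byte buffer with
no high bits set would still not be flagged.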
So here's where my knowledge stops. Perhaps the *value pointer is passed
on to DBI, which handles the array case, disregarding the pg_enable_utf8
flag; perhaps I am missing something obvious.
In any case, your comments would be much appreciated.
Thank you,
dam,
on behalf of the Debian Perl Group <debian-perl@lists.debian.org>