Subject: Compatibility with SQLite ICU extension
Hi. This is a feature request.
SQLite can be compiled with Unicode support via the ICU library:
http://www.sqlite.org/cvstrac/fileview?f=sqlite/ext/icu/README.txt
http://site.icu-project.org/
It would be extremely helpful if DBD-SQLite were compatible with this extension. I think there is only a minor step missing. Here is what I tried.
For ICU support, you configure SQLite 3.6.20 with CFLAGS=-DSQLITE_ENABLE_ICU and LIBS=-licuio. I tried to see whether DBD-SQLite 1.27 works with the ICU extension and configured it with
perl Makefile.PL LIBS="-licuio" CCFLAGS="-DSQLITE_ENABLE_ICU" PREFIX=/usr/local
Compilation works, and the resulting Perl module knows the ICU collations, for example,
SELECT icu_load_collation('de_DE','custom');
does not give an error. But when you try to use this collation, say,
CREATE TABLE something ( column VARCHAR(20) COLLATE custom );
DBD-SQLite says it does not know the collation:
can't install, unknown collation : custom at /usr/local/lib/perl5/site_perl/5.10.0/i586-linux-thread-multi/DBD/SQLite.pm line 141.
So it seems that, although the ICU extension is there and the ICU collation 'de_DE' is known, DBD-SQLite takes control of all requested collations and gives an error because it does not know about the ICU collation.
I would be most grateful if you could support ICU in a future version. I am also happy to invest some time and try out patches.
Finally, why might somebody want ICU support? Out of the box, SQLite only sorts ASCII strings correctly. Accented characters are sorted wrongly, and there is no Unicode or locale support whatsoever (an SQLite design choice, because they want to keep the library small). Of course, every client is free to add its own collations. So you added support for Perl 'cmp'. In Qt, it is easy to use its localized Unicode QString comparison, and so on. But as of today, this is done differently in each client (an SQLite design drawback, because there is no central server). So your database, or at least its indices, becomes inconsistent if you access the same database file in different ways, say a Perl DBD-SQLite script to do some data import versus a Qt frontend to query the database. Since SQLite already has the ICU extension, that would be the natural standard for Unicode and locale support.
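To make the point about per-client collations concrete, here is a small sketch using Python's standard sqlite3 module rather than DBD-SQLite or Qt. The comparison function, the sample words and the file name are my own illustrative assumptions. The collation is a crude accent-stripping stand-in and not ICU. The sketch only shows that the default byte-wise ordering puts accented words last, and that a collation registered by one client is unknown to a second client opening the same file.

```python
import os
import sqlite3
import tempfile
import unicodedata


def strip_accents(text):
    # Hypothetical helper: decompose, drop combining marks, fold case.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).casefold()


def compare(a, b):
    # Collation callback: negative, zero or positive, like Perl's cmp.
    ka, kb = strip_accents(a), strip_accents(b)
    return (ka > kb) - (ka < kb)


words = ["Zebra", "Ärger", "Apfel", "Öl", "Ofen"]
query = "SELECT w FROM t ORDER BY w COLLATE custom"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")

    # Client A creates the data and registers its own collation.
    client_a = sqlite3.connect(path)
    client_a.execute("CREATE TABLE t (w TEXT)")
    client_a.executemany("INSERT INTO t VALUES (?)", [(w,) for w in words])
    client_a.commit()

    # Default ordering compares the encoded bytes of each string.
    binary_order = [r[0] for r in client_a.execute("SELECT w FROM t ORDER BY w")]

    client_a.create_collation("custom", compare)
    custom_order = [r[0] for r in client_a.execute(query)]

    # Client B opens the same file but never registered "custom".
    client_b = sqlite3.connect(path)
    try:
        client_b.execute(query).fetchall()
        client_b_error = None
    except sqlite3.OperationalError as exc:
        client_b_error = str(exc)

    client_a.close()
    client_b.close()

print("binary:  ", binary_order)
print("custom:  ", custom_order)
print("client B:", client_b_error)
```

The collation lives in the connection, not in the database file, so every client has to bring its own, and two clients that bring different comparison functions under the same name will disagree about the order of an index built with it.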
Thanks for your attention and your help.
Bernhard