Subject: | postgres does not support multi-statement sql in "do" method |
Date: | Wed, 20 Aug 2008 13:43:03 -0700 |
To: | bug-DBIx-TextIndex [...] rt.cpan.org |
From: | Stephen Tu <stephen_tu [...] berkeley.edu> |
Hello DBIx-TextIndex team,
I am unable to utilize your text indexing package due to a limitation in
DBD::Pg's drivers. I'll get to that later, but let me give you some
relevant background information:
$ uname -a
FreeBSD [domain-name] 6.2-RELEASE-p10 FreeBSD 6.2-RELEASE-p10 #2: Thu
Jan 17 04:34:12 PST 2008
$ pkg_info | grep -i db [only showing relevant ones]
p5-DBD-Pg-2.9.0 Provides access to PostgreSQL databases through the DBI
p5-DBI-1.60.4 The perl5 Database Interface. Required for DBD::*
modules
$ perl -MDBIx::TextIndex\ 999
DBIx::TextIndex version 999 required--this is only version 0.28 .
$ pkg_info | grep postgres
postgresql-client-8.2.5_1 PostgreSQL database (client)
postgresql-server-8.2.5_2 The most advanced open-source database
available anywhere
$ perl -v
This is perl, v5.8.8 built for i386-freebsd-64int
So I tried to get this software to work on a very simple table:
user=> \d dbix_index_test
Table "public.dbix_index_test"
Column | Type | Modifiers
-------------+---------+-----------
id | integer | not null
description | text | not null
Indexes:
"dbix_index_test_pkey" PRIMARY KEY, btree (id)
user=> SELECT * FROM dbix_index_test ;
id | description
----+-----------------------
1 | this is my first test
2 | my sentence goes here
3 | windows linux etc
(3 rows)
so i have two scripts here. initially, in order to "initialize" the
index, I ran the following:
init_dbix_index.pl
#!/usr/bin/perl
use strict;
use warnings;
use DBIx::TextIndex;
use DBI;
my $index_dbh = DBI->connect(
'dbi:Pg:dbname=db',
'user',
'pass',
{ AutoCommit => 0,
RaiseError => 1,
pg_server_prepare => 0});
my $doc_dbh = DBI->connect(
'dbi:Pg:dbname=db',
'user',
'pass',
{ AutoCommit => 0,
RaiseError => 1,
pg_server_prepare => 0});
my $init_index = DBIx::TextIndex->new(
{
index_dbh => $doc_dbh,
collection => 'test_collection',
doc_dbh => $index_dbh,
doc_table => 'dbix_test_index',
doc_id_field => 'id',
doc_fields => ['description'],
}
);
$init_index->initialize();
$doc_dbh->commit();
$index_dbh->commit();
which created the following tables without error:
user=> \dt test_collection_
test_collection_all_docs_vector
test_collection_description_inverted test_collection_docweights
test_collection_delete_queue
test_collection_doc_key test_collection_mask
and now i run my second script to populate the index:
test-dbix-index.pl
#!/usr/bin/perl
use strict;
use warnings;
use DBIx::TextIndex;
use DBI;
my $index_dbh = DBI->connect(
'dbi:Pg:dbname=db',
'user',
'pass',
{ AutoCommit => 0,
RaiseError => 1,
pg_server_prepare => 0});
my $doc_dbh = DBI->connect(
'dbi:Pg:dbname=db',
'user',
'pass',
{ AutoCommit => 0,
RaiseError => 1,
pg_server_prepare => 0});
my $index = DBIx::TextIndex->new(
{
index_dbh => $index_dbh,
collection => 'test_collection',
doc_dbh => $doc_dbh,
print_activity => 3,
}
);
$index->add_doc([1,2]);
$index_dbh->commit();
$doc_dbh->commit();
and this is what happens:
$ ./test-dbix-index.pl
Checking if collection table upgrade required ...
... Collection table up-to-date
Adding 2 docs
1 - 15 description 6
2 - 16 description 5
DBD::Pg::db do failed: ERROR: cannot insert multiple commands into a
prepared statement at
/usr/local/lib/perl5/site_perl/5.8.8/mach/DBIx/TextIndex.pm line 1923.
DBD::Pg::db do failed: ERROR: cannot insert multiple commands into a
prepared statement at
/usr/local/lib/perl5/site_perl/5.8.8/mach/DBIx/TextIndex.pm line 1923.
Issuing rollback() for database handle being DESTROY'd without explicit
disconnect().
Now, if we look at /path/to/lib/DBIx/TextIndex/DBD/Pg.pm, the culprit
is, well in this case:
sub update_all_docs_vector {
my $self = shift;
return <<END;
DELETE FROM $self->{ALL_DOCS_VECTOR_TABLE} WHERE id = 1;
INSERT INTO $self->{ALL_DOCS_VECTOR_TABLE}
(id, all_docs_vector)
VALUES (1, ?)
END
}
which gets called from /path/to/lib/DBIx/TextIndex.pm:
sub all_doc_ids {
my $self = shift;
my @ids = @_;
# stuff omitted to save space
if (@ids) {
if ($self->{ALL_DOCS_VECTOR}->Size() < $self->max_indexed_id + 1) {
$self->{ALL_DOCS_VECTOR}->Resize($self->max_indexed_id + 1);
}
$self->{ALL_DOCS_VECTOR}->Index_List_Store(@ids);
$self->{INDEX_DBH}->do($self->{DB}->update_all_docs_vector, undef,
$self->{ALL_DOCS_VECTOR}->to_Enum);
}
else {
# FIXME: this is probably unnecessary, but older versions
# had this documented as a public method
return $self->{ALL_DOCS_VECTOR}->Index_List_Read;
}
}
The problem is, even with the latest DBD::Pg drivers, you can only issue
multiple statement SQL commands and have them prepared in this fashion:
my $sth = $dbh->prepare($multi_statement_sql);
$sth->execute(@values);
my $rows = $sth->rows;
and even that requires that you have $dbh->{pg_server_prepare} = 0
(which i did above in both of my dbh connections)
however, if you try to do this w/ $dbh->do, it won't work, regardless of
the value of your pg_server_prepare
i have filed a bug w/ the developers of DBD::Pg on this very issue.
however, depending on how long they take , and even if they would
acknowledge this as a bug, I would really like to be able to use your
software now to index a very large table in a database. if at all
possible, i'd greatly appreciate it if you could refactor the sql in
Pg.pm to not use multi-statement sql in one "do" method, or suggest a
workaround.
My apologies if this has become very long-winded. I hope that I have
been clear enough on what the issue is here. Thank you for your time,
and let me know if you have any further questions.
Regards,
Stephen Tu