Subject: All docs in an index must have the same fields - badly reported
It seems that all documents in an index must contain the same fields.
Perhaps that is a bug - I don't see it mentioned in the documentation
anywhere. If this is by design, it seems like a reasonable restriction,
but it is badly handled in the API, and badly reported when an error
occurs (see below).
If KinoSearch died or returned false when you tried to create this
situation, it would be much easier to handle the problem
programmatically. Perhaps if $invindexers created from existing
indexes already knew about the spec'ed fields, this wouldn't be
a problem? Perhaps calling $doc->set_value for a field that doesn't
exist in the invindexer should die?
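A caller-side check along these lines would approximate the "die early" behavior. This is only a minimal sketch: FieldGuard is a hypothetical helper, not part of KinoSearch, the field names are placeholders, and it makes no KinoSearch calls at all. It assumes that both an undeclared field and a missing declared field should be rejected before the document is handed to the indexer.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of KinoSearch: remembers which fields were
# declared for an index and refuses documents that do not match them.
package FieldGuard;

use Carp qw(croak);

sub new {
    my ( $class, @fields ) = @_;
    my %known = map { $_ => 1 } @fields;
    return bless { known => \%known }, $class;
}

# Die with a readable message if %$values names an undeclared field
# or leaves a declared field out. Returns 1 when the fields match.
sub check {
    my ( $self, $values ) = @_;
    my @unknown = sort grep { !$self->{known}{$_} } keys %$values;
    my @missing = sort grep { !exists $values->{$_} } keys %{ $self->{known} };
    croak "unknown field(s): @unknown" if @unknown;
    croak "missing field(s): @missing" if @missing;
    return 1;
}

package main;

# Placeholder field names; a real caller would list the spec'ed fields.
my $guard = FieldGuard->new(qw(title body));

# Passes: the document has exactly the declared fields.
$guard->check( { title => 'Survey 1', body => 'text' } );

# Fails early with a readable message, before anything reaches the indexer.
my $ok = eval { $guard->check( { title => 'Survey 2', author => 'x' } ); 1 };
print "rejected: $@" unless $ok;
```

Calling check() on the same hash that is about to be passed to $doc->set_value would turn the mismatch into an error raised at the offending document, rather than a read failure at finish time.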
Either way - the error I get when I try to have docs in an invindexer
that do not all have the same fields looks like this:
Error in function read_bytes at lib/KinoSearch/Store/InStream.pm:590:
read_bytes: tried to read 1 bytes, got 0
at /usr/local/lib/perl/5.8.7/KinoSearch/Index/NormsReader.pm
line 32
KinoSearch::Index::NormsReader::_ensure_read('KinoSearch::Index::NormsReader=HASH(0x9834a64)')
called at /usr/local/lib/perl/5.8.7/KinoSearch/Index/NormsReader.pm line 25
KinoSearch::Index::NormsReader::get_bytes('KinoSearch::Index::NormsReader=HASH(0x9834a64)')
called at /usr/local/lib/perl/5.8.7/KinoSearch/Index/SegWriter.pm line 138
KinoSearch::Index::SegWriter::_merge_norms('KinoSearch::Index::SegWriter=HASH(0x985cba4)',
'KinoSearch::Index::SegReader=HASH(0x98545f0)',
'KinoSearch::Util::IntMap=SCALAR(0x985a6ec)') called at
/usr/local/lib/perl/5.8.7/KinoSearch/Index/SegWriter.pm line 118
KinoSearch::Index::SegWriter::add_segment('KinoSearch::Index::SegWriter=HASH(0x985cba4)',
'KinoSearch::Index::SegReader=HASH(0x98545f0)') called at
/usr/local/lib/perl/5.8.7/KinoSearch/InvIndexer.pm line 289
KinoSearch::InvIndexer::finish('KinoSearch::InvIndexer=HASH(0x9828854)')
called at /usr/local/apache/htdocs/solstice/lib//Solstice/Model.pm line 322
Solstice::Model::storeSearchIndex('WebQ::Model::Survey=HASH(0x93419c4)')
called at /usr/local/apache/htdocs/apps/webq/lib//WebQ/Model/Survey.pm
line 83
WebQ::Model::Survey::index('WebQ::Model::Survey=HASH(0x93419c4)') called
at index_surveys.pl line 25
That is very difficult to understand and react to as a user of
the KinoSearch API.
This is broken in the 0.20_01 version of KinoSearch that you sent me
a while ago.