Bug #11658 for Plucene: Optimize runs wild, swamping stderr

Subject: Optimize runs wild, swamping stderr

To reproduce, call:

    xmlshell.pl < strange-error.xml

This adds a document to the index, then optimizes and closes the index. During optimization, Plucene runs wild, swamping stderr with two different character sequences (possibly UTF-8). See the image in the attached files. The apparent spaces are bytes 0x82 or 0x83; the terminal is in Latin-9 mode.

Some information from the Perl debugger:

    ÂÃÂÃÂÃÂÃÂÃÂÃÂenabstand (content lt content) at /usr/share/perl5/Plucene/Index/SegmentMerger.pm line 151
        Plucene::Index::TermInfosWriter::add('Plucene::Index::TermInfosWriter=HASH(0x8bce4c8)', 'Plucene::Index::Term=HASH(0x8bcd720)', 'Plucene::Index::TermInfo=HASH(0x8bd3490)') called at /usr/share/perl5/Plucene/Index/SegmentMerger.pm line 151
        Plucene::Index::SegmentMerger::_merge_term_info('Plucene::Index::SegmentMerger=HASH(0x8b6cc58)', 'Plucene::Index::SegmentMergeInfo=HASH(0x8bd15f0)') called at /usr/share/perl5/Plucene/Index/SegmentMerger.pm line 138
        Plucene::Index::SegmentMerger::_merge_term_infos('Plucene::Index::SegmentMerger=HASH(0x8b6cc58)') called at /usr/share/perl5/Plucene/Index/SegmentMerger.pm line 109
        Plucene::Index::SegmentMerger::_merge_terms('Plucene::Index::SegmentMerger=HASH(0x8b6cc58)') called at /usr/share/perl5/Plucene/Index/SegmentMerger.pm line 78
        Plucene::Index::SegmentMerger::merge('Plucene::Index::SegmentMerger=HASH(0x8b6cc58)') called at /usr/share/perl5/Plucene/Index/Writer.pm line 280
        Plucene::Index::Writer::_merge_segments('Plucene::Index::Writer=HASH(0x8b6ebd8)', 0) called at /usr/share/perl5/Plucene/Index/Writer.pm line 206
        Plucene::Index::Writer::optimize('Plucene::Index::Writer=HASH(0x8b6ebd8)') called at Midcom/Plucene/RequestProcessor.pm line 121
        Midcom::Plucene::RequestProcessor::close('Midcom::Plucene::RequestProcessor=HASH(0x8b315fc)') called at xmlshell.pl line 19
    Plucene::Index::TermInfosWriter::add(/usr/share/perl5/Plucene/Index/TermInfosWriter.pm:93):
    93:         carp "Frequency pointer out of order"
    94:           if $ti->freq_pointer < $self->{last_ti}->freq_pointer;

The size of this output grows exponentially(!) with the number of documents in the index, so for now I have had to disable the optimization step just to be able to *test* the system.

I do not know where this corrupt data comes from. It is definitely not part of the document I want to store in Plucene; you can easily verify this by enabling the $request->dump line in XMLComm.pl::_ParseIndex.

If you need any further information, please ask. I can also give you shell access to the box where this script is currently being developed, in case there are version inconsistencies.
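
One pattern that produces output of this shape is repeated re-encoding: bytes that are already UTF-8 get misread as Latin-1 and encoded as UTF-8 again. Nothing in this report establishes that Plucene does this, so the sketch below is only an illustration of the symptom, not a diagnosis. The helper reencode_rounds and the starting byte 0xE4 are hypothetical and chosen purely for the example; only the core Encode module is used.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper, not part of Plucene: repeatedly misread a byte
# string as Latin-1 and re-encode it as UTF-8, returning every stage.
sub reencode_rounds {
    my ($bytes, $rounds) = @_;
    my @stages;
    for (1 .. $rounds) {
        $bytes = encode('UTF-8', decode('ISO-8859-1', $bytes));
        push @stages, $bytes;
    }
    return @stages;
}

# Start from a single Latin-1 byte, 0xE4, picked only as an example.
my $round = 0;
for my $stage (reencode_rounds("\xE4", 4)) {
    $round++;
    my $hex = join ' ', map { sprintf '%02X', ord } split //, $stage;
    printf "round %d: %2d bytes: %s\n", $round, length($stage), $hex;
}
```

Each round doubles the byte count, and from the third round on the string consists mostly of 0xC3 and 0xC2 (shown as Ã and Â on a Latin-9 terminal) alternating with 0x83 and 0x82, which do not print there. That would be consistent with the two character sequences and the 0x82/0x83 "spaces" described above, if such re-encoding is in fact taking place.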

Attachment: plucene.tar.gz (application/x-gzip, 956.3k)
