Subject: OODoc::XPath::inputTextConversion doesn't handle Perl strings as input
OODoc::XPath::inputTextConversion always calls Encode::decode on the provided text, even if the text is already a valid Perl character string, as indicated by its UTF-8 flag being turned on. This is unnecessary and does not work: Encode::decode is documented to be called on octets only, not on Perl character strings, so the result is wrongly decoded data. If different options were passed to decode, such as Encode::FB_CROAK(), it would even die, e.g. with the following error message:
> utf8 "\xF6" does not map to Unicode
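The failure can be reproduced without OpenOffice::OODoc at all. The following is a minimal sketch, assuming the local encoding is 'UTF-8' and using a one-character sample string; the variable names are illustrative and not taken from the module:

```perl
use strict;
use warnings;
use Encode ();

# A character string containing U+00F6; utf8::upgrade forces the UTF-8 flag on.
my $chars = "\x{F6}";
utf8::upgrade($chars);

# Lenient decoding (no CHECK argument) does not die, but the text is altered.
my $lenient = Encode::decode('UTF-8', $chars);
print "changed by decode: ", ($lenient ne $chars ? "yes" : "no"), "\n";

# Strict decoding of the same character string dies instead.
my $strict = eval {
    Encode::decode('UTF-8', $chars, Encode::FB_CROAK() | Encode::LEAVE_SRC());
};
my $error = $@;
print "strict decode failed: $error" unless defined $strict;
```

Both calls hand Encode::decode a character string rather than octets, which is what inputTextConversion does when the caller's text is already decoded.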
The following implementation can be used to demonstrate and work around the problem:
use Carp ();
use Encode ();

*OpenOffice::OODoc::XPath::inputTextConversion = sub
{
    my $self = shift || Carp::croak('The method needs to be called with an instance.');
    my $text = shift;
    my $localEncoding = $self->{'local_encoding'};
    my $isUtf8 = utf8::is_utf8($text);

    return $text unless (defined($text));
    return $text unless (defined($localEncoding));
    # Skip decoding when the text is already flagged as a character string.
    #return $text if ($isUtf8);

    # Die on malformed input instead of silently substituting characters.
    my $onError = Encode::FB_CROAK() | Encode::LEAVE_SRC();
    my $retVal = Encode::decode($localEncoding, $text, $onError);
    return $retVal;
};
Uncomment the line "#return $text if ($isUtf8);" to switch from the failing to the non-failing behavior.
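The guard itself can also be tried in isolation. This is a sketch only: decode_unless_flagged is a hypothetical helper, not part of OpenOffice::OODoc, and 'UTF-8' is an assumed local encoding:

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of OpenOffice::OODoc): decode only text
# that is not already flagged as a Perl character string.
sub decode_unless_flagged {
    my ($text, $encoding) = @_;
    return $text unless defined $text && defined $encoding;
    return $text if utf8::is_utf8($text);    # already characters, leave as is
    return Encode::decode($encoding, $text,
        Encode::FB_CROAK() | Encode::LEAVE_SRC());
}

my $octets = "\xC3\xB6";    # U+00F6 encoded as UTF-8 octets
my $chars  = "\x{F6}";      # U+00F6 as a character string
utf8::upgrade($chars);      # force the UTF-8 flag on

my $from_octets = decode_unless_flagged($octets, 'UTF-8');
my $from_chars  = decode_unless_flagged($chars,  'UTF-8');
print(($from_octets eq $from_chars) ? "same text\n" : "different text\n");
```

Note that the UTF-8 flag is an internal detail of Perl's string storage: a character string containing only code points below 256 may have the flag turned off, so this check is a heuristic and cannot tell such a string apart from raw octets.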