Bug #117523 for OpenOffice-OODoc: OODoc::XPath::inputTextConversion doesn't handle Perlstrings as input

Subject:

OODoc::XPath::inputTextConversion doesn't handle Perlstrings as input

OODoc::XPath::inputTextConversion always calls Encode::decode on the provided text, even if it is already a valid Perl character string, as indicated by the UTF-8 flag being turned on. This is unnecessary and does not work: Encode::decode is documented to be called only on octets, not on Perl strings, so the result is wrongly encoded data. If you pass different options to decode, such as Encode::FB_CROAK(), it even dies, e.g. with the following error message:

> utf8 "\xF6" does not map to Unicode

The following implementation can be used to demonstrate and work around the problem:

    *OpenOffice::OODoc::XPath::inputTextConversion = sub {
        my $self = shift || Carp::croak('The method needs to be called with an instance.');
        my $text = shift;
        my $localEncoding = $self->{'local_encoding'};
        my $isUtf8 = utf8::is_utf8($text);

        return $text unless (defined($text));
        return $text unless (defined($localEncoding));
        #return $text if ($isUtf8);

        my $onError = Encode::FB_CROAK() | Encode::LEAVE_SRC();
        my $retVal = Encode::decode($localEncoding, $text, $onError);

        return $retVal;
    };

Uncomment "return $text if ($isUtf8);" to switch between failing and not failing behavior.
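
For readers without OpenOffice::OODoc installed, the same effect can probably be reproduced with Encode alone. The following is only a minimal sketch: decode_unless_flagged is a hypothetical helper name (not part of OODoc or Encode) that mirrors the guard proposed above, and 'UTF-8' stands in for whatever local_encoding is configured.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of OODoc): decode only if the input
# is not already a character string with the UTF-8 flag turned on.
sub decode_unless_flagged {
    my ($encoding, $text) = @_;
    return $text unless defined $text && defined $encoding;
    return $text if utf8::is_utf8($text);
    return Encode::decode($encoding, $text,
        Encode::FB_CROAK() | Encode::LEAVE_SRC());
}

my $octets = "\xC3\xB6";                        # UTF-8 octets for U+00F6
my $chars  = Encode::decode('UTF-8', $octets);  # one character, UTF-8 flag on

# Decoding the already decoded string a second time is expected to croak.
my $second = eval {
    Encode::decode('UTF-8', $chars, Encode::FB_CROAK() | Encode::LEAVE_SRC());
};
print "second decode failed: $@" unless defined $second;

# With the guard, both kinds of input yield the same one-character string.
my $from_chars  = decode_unless_flagged('UTF-8', $chars);
my $from_octets = decode_unless_flagged('UTF-8', $octets);
print "lengths: ", length($from_chars), " ", length($from_octets), "\n";
```

The guard relies on utf8::is_utf8, which only reports Perl's internal flag. A string containing nothing but ASCII characters may be unflagged and would still be passed to Encode::decode, which is harmless for ASCII-compatible encodings.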