CC: | turnermm02 [...] shaw.ca |
Subject: | UTF* |
A user of DokuWikiFck from Slovenia reported a problem with wide
characters. I am forwarding his email here:
OK, I managed to do some Windows testing, using Apache 2.2.14 + PHP
5.2.12 + ActivePerl 5.8.9, with the latest Dokuwiki and DokuwikiFCK.
fckgLite works out-of-the-box, fckg does not. Here are the details:
- just to see what happens I modified the Windows version of saveFCK.pl
like this (notice the UTF-8 options for both input and output streams):
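binmode(STDOUT, ":utf8");    # encode everything printed as UTF-8
my $html;
if (exists $options{'file'}) {
    # decode the input file from UTF-8 while reading it
    open(my $fh, "<:encoding(utf-8)", $options{'file'})
        or die "Cannot open $options{'file'}: $!";
    $html = join "", <$fh>;
    close $fh;
}
else {
    $html = join "", <>;
}
print $html;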
This way the script outputs properly encoded UTF8 text.
I then passed the same $html variable to WikiConverter.
my $wc = HTML::WikiConverter->new( 'dialect'  => "DokuWikiFCK",
                                   'base_uri' => $options{'base_uri'} );
print $wc->html2wiki($html);
This produces an error saying "Cannot decode string with wide characters
at C:/Perl/lib/Encode.pm line 170."
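
The error can be reproduced without WikiConverter. The following is a minimal sketch using only the core Encode module; the sample character and the variable names are illustrative and not taken from saveFCK.pl. It suggests that once a string has been decoded into characters (which is what the :encoding(utf-8) layer does on input), decoding it a second time fails if it contains a character above U+00FF.

```perl
use strict;
use warnings;
use Encode qw(decode);

# UTF-8 octets for the Slovenian letter "c with caron" (U+010D).
my $octets = "\xC4\x8D";

# First decode: octets become a one-character string.
my $chars = decode('UTF-8', $octets);

# Second decode of the already-decoded string; wrapped in eval
# because Encode is expected to die here.
my $ok  = eval { decode('UTF-8', $chars); 1 };
my $err = $ok ? '' : $@;

print $ok ? "second decode succeeded\n" : "second decode failed: $err";
```

If this reasoning holds, the decode call at line 224 of WikiConverter.pm fails because $html has already been decoded by the input layer.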
So I modified WikiConverter.pm and commented out these two lines:
(line 224)
#$html = decode( $self->encoding, $html );
(line 258)
#$output = encode( $self->encoding, $output );
And as a result I get properly encoded text!
Perhaps this is the way to go - Dokuwiki uses UTF8 by default, so
modifying saveFCK.pl as shown shouldn't cause any problems. But I need
to find a way to eliminate the encode/decode error message, so we can
use the original WikiConverter module (good for upgrades).
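
One possible way to keep the original module untouched, sketched under assumptions: since html2wiki decodes its input and encodes its output (the two lines commented out above), the script could hand it octets and print the returned octets on a raw STDOUT. The function convert_like_html2wiki below is a hypothetical stand-in that only mimics those two lines; it is not the real HTML::WikiConverter code, and the sample text is illustrative.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical stand-in for html2wiki: decodes on the way in and
# encodes on the way out, as lines 224 and 258 are reported to do.
sub convert_like_html2wiki {
    my ($input_octets) = @_;
    my $text = decode('UTF-8', $input_octets);
    # ... the real HTML-to-wiki conversion would happen here ...
    return encode('UTF-8', $text);
}

# A character string, as produced by reading through :encoding(utf-8).
my $html = "\x{10D}\x{161}\x{17E}";

# Turn characters back into octets before calling the converter,
# so that its own decode step sees what it expects.
my $octets = encode('UTF-8', $html);

my $wiki_octets = convert_like_html2wiki($octets);

# The result is already UTF-8 octets, so STDOUT must not add a
# second :utf8 layer on top of it.
binmode(STDOUT);
print $wiki_octets, "\n";
```

An equivalent variant would be to open the file without the :encoding layer and drop binmode(STDOUT, ":utf8"), so that octets pass through the script unchanged. Either variant would need testing against the real module.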
Thanks,
Myron Turner