This queue is for tickets about the WWW-Dict-Leo-Org CPAN distribution.

Report information
The Basics
Id: 84196
Status: resolved
Priority: 0/
Queue: WWW-Dict-Leo-Org

People
Owner: Nobody in particular
Requestors: g_ml2000-x [...] yahoo.de
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 1.35
Fixed in: (no value)



Subject: output latin1+utf-8 mixed
I normally use 'xterm' with an ISO-8859-15 locale, but I also tried other terminal emulators and locale settings, with and without "use_latin", and with and without the patch from https://rt.cpan.org/Ticket/Display.html?id=35543 ("leo script and utf8 terminals"). In every case I get a mixture of latin1 and utf-8, which naturally cannot be displayed correctly on any terminal. Apart from still using latin1, there is nothing unusual about my system. Until recently, I used a similar script based on http://www.perlmeister.com/scripts/leo.html without any problems ...
From: g_ml2000-x [...] yahoo.de
I looked into my problem a little closer and found several issues:

- The intended values for $conf{use_latin} seem to be 'yes' or 'no', but the setting is later checked as a boolean, which is true in both cases.

- Section titles are printed without decoding; insert something like:

    utf8::decode($section->{title}) if ($conf{use_latin} eq 'yes');

- The biggest problem: the HTML code returned by leo.org contains both '&nbsp;' and '&#160;', which should theoretically be equivalent. The former is translated to an ASCII space character by the HTML::TableParser option 'DecodeNBS', while the latter ends up as a single byte with value 160 (\xA0). Because the result is no longer valid utf-8, the 'utf8::decode' call fails, and the resulting string, which is otherwise utf8-encoded, is printed as is.

I am not sure where the best place to fix this issue would be (neither HTML::TableParser nor HTML::Entities really shines here). As a simple workaround for the moment, I inserted the following into WWW::Dict::Leo::Org right before passing $site to HTML::TableParser:

    $site=~s/&#160;/\&nbsp\;/g;

With these changes, everything seems to work for me ...
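
For anyone trying to reproduce this, here is a minimal, self-contained sketch of the two failure modes described above. It does not use WWW::Dict::Leo::Org or HTML::TableParser at all. The sample strings are made up for illustration and are not actual leo.org output, and it assumes the two entity forms in question are '&nbsp;' and '&#160;'.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a fragment of the fetched page (not real
# leo.org output). "\xC3\xA4" is the UTF-8 encoding of "ä".
my $site = "<td>K\xC3\xA4se&#160;(m)&nbsp;</td>";

# Workaround from the report: rewrite the numeric entity as the named
# one, so both forms take the same path through the parser.
(my $fixed = $site) =~ s/&#160;/&nbsp;/g;

# What goes wrong otherwise: a lone byte 0xA0 in the middle of UTF-8
# text is malformed, so utf8::decode refuses and leaves the bytes alone.
my $broken    = "K\xC3\xA4se\xA0(m)";
my $ok_broken = utf8::decode($broken);   # false

# With a plain ASCII space instead, decoding succeeds.
my $clean    = "K\xC3\xA4se (m)";
my $ok_clean = utf8::decode($clean);     # true

# The use_latin pitfall: 'no' is a non-empty string and therefore true.
my %conf       = (use_latin => 'no');
my $as_boolean = $conf{use_latin}          ? 1 : 0;   # 1
my $as_string  = $conf{use_latin} eq 'yes' ? 1 : 0;   # 0
```

Checking the return value of utf8::decode, as done here, makes the failure visible instead of silently printing undecoded bytes.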
Fixed in 1.36. Thanks!