On Mon Nov 13 04:50:27 2006, dma_k@mail.ru wrote:
> Finally, as I said above, some of perl installations work, some -- not,
> and I've come to the conclusion, it's a core Perl bug with unicode
> chars. What version of Perl do you use for testing?
I use Apple's Perl (5.8.6 on OS X), Debian sarge's Perl (5.8.4), and a
custom Perl (5.8.2) for release testing. I also have a 5.6 install
sitting around, where t/body.t fails on the Unicode escape tests. (I
should skip those tests on that platform.)
> Can you please, define more precisely the return value for
> "HTML::Entity->as_text()"? Should it return the UTF-8 text? Localized
> text?
It returns the text exactly as it's contained in each HTML::Element (not
HTML::Entity) and its children. Whether that's UTF-8, Unicode,
ISO-8859-1, or whatever has already been decided by HTML::Parser.
HTML::Element is just the middleman, doing simple concatenation.
If you could give a test case that shows the broken behavior on your
platform, I would appreciate it.
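
Something along these lines would be enough. This is only a sketch: the
markup in $html is a made-up placeholder, and the two checks are what I
would expect to hold on a working install, not a description of what
your platform does. Swap in the input that actually misbehaves for you.

```perl
use strict;
use warnings;
use Test::More tests => 2;
use HTML::TreeBuilder;

# Hypothetical input; replace with the markup that breaks on your platform.
my $html = '<p>caf&#233; <b>au</b> lait</p>';

my $tree = HTML::TreeBuilder->new_from_content($html);
my $text = $tree->as_text;    # text of the element and all its children
$tree->delete;                # free the tree; $text is a plain string

# Child text should be concatenated into the result.
like($text, qr/au/, 'text of child elements is included');

# The numeric entity should have been decoded to U+00E9 by the parser.
like($text, qr/\x{e9}/, 'entity decoded to U+00E9');

# Report versions so results from different installs can be compared.
diag("perl $], HTML::TreeBuilder $HTML::TreeBuilder::VERSION");
```

Run it with "perl -Ilib" or "prove" against each install and attach the
output, including the diag line with the version numbers.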