On Tue Jul 31 14:26:48 2012, BBYRD wrote:
> On Sun Jul 29 18:51:44 2012, VPIT wrote:
> > Thanks for your report.
> >
> > From perlunicode, perl is supposed to recognize UTF8, UTF16-LE and
> > UTF16-BE BOMs at the beginning of a Perl source file, so I think
> > Module::Metadata should decode the source file appropriately when it
> > sees the BOM.
>
> Nope. I talked with some of the guys on IRC about it, including doy,
> and there's an important distinction: Perl will decode a source file
> that it's actually reading/parsing, but reading a file that happens to
> be Perl source is a different matter. In the latter case, Perl will
> merely follow what binmode is doing.
Except that Module::Metadata is also supposed to be able to extract POD,
and handing back octet POD strings to the user is not really useful. For
that reason, I think that Module::Metadata should also honour "use utf8"
and "=encoding", but that's another matter.
> In the case of Module::Metadata, I would say to detect the BOM at the
> beginning, and if it exists, remove it. Not even Encode::Guess seems to
> remove BOMs if they appear in UTF-8 code.
Starting from version 1.000011, Module::Metadata->new_from_file and
->new_from_module look for a UTF-8/UTF-16LE/UTF-16BE BOM at the
beginning of the file, skip it, then decode the rest of the input
accordingly. Module::Metadata->new_from_handle is untouched. The decoding
part is easily removable if deemed harmful.
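
For illustration only, the detect/skip/decode step described above could be
sketched roughly as follows. This is Python, not Perl, and it is not the code
that ships in Module::Metadata; the function name strip_bom_and_decode and the
choice to return undecoded bytes when no BOM is found are assumptions made for
the example.

```python
import codecs

# The three BOMs mentioned in this ticket, paired with the codec used to
# decode whatever follows them. (Other BOMs, e.g. UTF-32, are deliberately
# not handled in this sketch.)
_BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def strip_bom_and_decode(raw: bytes):
    """Hypothetical helper: return (content, encoding).

    If `raw` starts with a known BOM, the BOM is skipped and the rest is
    decoded with the matching codec. Otherwise the bytes are returned
    untouched and the encoding is None.
    """
    for bom, encoding in _BOMS:
        if raw.startswith(bom):
            return raw[len(bom):].decode(encoding), encoding
    return raw, None
```

With this shape, dropping the decoding part would only mean returning
raw[len(bom):] instead of the decoded string, which is the "easily removable"
property mentioned above.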