On Tue, Oct 08, 2013 at 08:35:38PM -0400, rra@stanford.edu via RT wrote:
> $parser->output_string(\$output) will put all of the output into $output.
> I believe that Pod::Simple treats this as equivalent to a file handle and
> does output encoding before adding it to that string, but I'm not positive
> about that (I haven't tested). If so, it will be encoded in whatever
> encoding is declared by the =encoding string.
Right - what I'm saying is that sometimes we don't *want* the content
encoded yet, because we'll be doing some more processing of that text
before we write it out to a file ourselves -- so we'll need to know what
the encoding is, so we can apply the right layer to the $fh when we write
it. For example, Pod::Weaver... working on already-encoded octets isn't
optimal here.
> That said, I would recommend standardizing on UTF-8, because that will let
> you pass utf8 => 1 to Pod::Text's constructor, which will override all
> that and force the encoding to be in UTF-8. Then you can just decode the
> $output stream with decode('UTF-8', $output) and you should get back Perl
> internal strings.
This might be enough, but users upstream might be using encodings
other than UTF-8 (and I believe Dist::Zilla intends to support that).
However, as long as we know the actual encoding of the string (actually a
bytestring) we get back, we can decode it properly, do our munging, and
then re-encode without generating mojibake.
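
To make the round trip concrete, here is a rough sketch using only the core
Encode module. The helper name munge_octets and the sample data are made up
for illustration; it assumes we have somehow been told the encoding name,
which is exactly the piece of information I'm asking for.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not part of any Pod::* module): decode octets to
# characters, run a munging callback on the characters, then re-encode
# using the same encoding.
sub munge_octets {
    my ($octets, $encoding, $munge) = @_;

    # FB_CROAK makes malformed input die instead of being silently replaced;
    # LEAVE_SRC stops Encode from modifying the source buffer.
    my $check = Encode::FB_CROAK | Encode::LEAVE_SRC;

    my $text = decode($encoding, $octets, $check);
    $text = $munge->($text);
    return encode($encoding, $text, $check);
}

# "cafe" with an e-acute, as UTF-8 octets: 5 octets, 4 characters.
my $octets = "caf\xC3\xA9";

# Reversing is a munge that only works on characters; reversing the raw
# octets would split the two-byte sequence and produce mojibake.
my $result = munge_octets($octets, 'UTF-8', sub { scalar reverse $_[0] });
```

Here $result still holds valid UTF-8 octets, with the two-byte sequence for
the accented character kept intact. The same call works for any other
encoding name that Encode recognizes, as long as we pass the one the octets
were really produced with.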