Subject: ENCODING is misleading, xmlDecl() needs its own new() setter
Regarding these two realities:
1. An ENCODING of 'utf-8' double-encodes data that is already utf-8 encoded (see the “see double encoding” section below for details).
2. The encoding reported by xmlDecl() can only be set via ENCODING or by passing it as an argument.
It should be clearer that you only need ENCODING when you want your data massaged as it is written to an OUTPUT that is a file handle (i.e. when
writing to a scalar there is no IO layer involved, so it's not double encoded).
It'd be nice to be able to set the default encoding for xmlDecl() in new() without corrupting the data.
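To illustrate the mechanism, here is a minimal core-Perl sketch that does not use XML::Writer at all. It assumes that passing ENCODING => 'utf-8' ends up applying something like binmode($output, ':encoding(utf-8)') to the handle; the helper write_with_layer() and the in-memory handles are only for illustration.

```perl
use strict;
use warnings;

# UTF-8 encoded bytes for: Hello, "world"! with curly quotes.
# This is a byte string, not a decoded character string.
my $bytes = "Hello, \xE2\x80\x9Cworld\xE2\x80\x9D!";

# Hypothetical helper: print $bytes to an in-memory file handle,
# optionally applying an IO layer first, and return what was written.
sub write_with_layer {
    my ($layer) = @_;
    my $buffer = '';
    open my $fh, '>', \$buffer or die "cannot open in-memory handle: $!";
    binmode $fh, $layer if defined $layer;
    print {$fh} $bytes;
    close $fh or die "close failed: $!";
    return $buffer;
}

my $plain   = write_with_layer();                    # no encoding layer
my $layered = write_with_layer(':encoding(utf-8)');  # assumed effect of ENCODING => 'utf-8'

printf "plain:   %d bytes\n", length $plain;
printf "layered: %d bytes\n", length $layered;
```

With the layer in place, each byte above 0x7F is treated as a Latin-1 character and encoded again, so the six high bytes of the two curly quotes become twelve bytes.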
Suggestion:
If 'IOLAYER' was passed:
a. Treat its value the way ENCODING is currently treated as far as binmode goes.
b. Use ENCODING for xmlDecl() only.
or
add an option for new() that sets the default for xmlDecl() but does not touch OUTPUT's IO layer (would need to check for conflict, e.g. one is ascii
and the other is utf-8)
or
add an option to new() to not binmode() the handle even if ENCODING was passed (no conflict resolution).
Current workarounds:
a. do not pass ENCODING to new() && call xmlDecl() w/ an argument of "utf-8".
b. pass ENCODING so it is set for xmlDecl() && call “binmode $output;” after new() so the handle itself is set back to the :raw iolayer.
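Why workaround b helps can be sketched with core Perl only. The first binmode() call below is a stand-in for what new() is assumed to do when ENCODING => 'utf-8' is passed; the second is the workaround itself. An in-memory handle is used here instead of a real file.

```perl
use strict;
use warnings;

# UTF-8 encoded bytes for: Hello, "world"! with curly quotes.
my $bytes = "Hello, \xE2\x80\x9Cworld\xE2\x80\x9D!";

my $buffer = '';
open my $fh, '>', \$buffer or die "cannot open in-memory handle: $!";

binmode $fh, ':encoding(utf-8)';  # assumed effect of new(ENCODING => 'utf-8')
binmode $fh;                      # workaround b: put the handle back to :raw

print {$fh} $bytes;
close $fh or die "close failed: $!";

# Compare what reached the buffer with what was printed.
my $intact = $buffer eq $bytes;
print $intact ? "bytes unchanged\n" : "bytes re-encoded\n";
```

A bare binmode() is equivalent to applying :raw, which removes layers such as :encoding that would transform the bytes.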
[ -- “see double encoding” -- ]
1. Taking the example from the synopsis, adding an xmlDecl() call, and adding some utf8 bytes to the characters() call (i.e. curly quotes):
my $writer = XML::Writer->new(OUTPUT => $output);
$writer->xmlDecl();
…
$writer->characters("Hello, “world”!");
correctly results in:
<?xml version="1.0"?>
<greeting class="simple">Hello, “world”!</greeting>
2. Adding ENCODING=>"utf-8" to new() makes xmlDecl() work as expected but garbles the quotes:
<?xml version="1.0" encoding="utf-8"?>
<greeting class="simple">Hello, âworldâ!</greeting>
3. Not passing ENCODING but calling xmlDecl() w/ "UTF-8" works as expected:
<?xml version="1.0" encoding="UTF-8"?>
<greeting class="simple">Hello, “world”!</greeting>
4. Passing ENCODING=>"utf-8", calling 'binmode $output;' right after new(), and calling xmlDecl() w/ no arg gives the same expected and correct result as #3:
<?xml version="1.0" encoding="utf-8"?>
<greeting class="simple">Hello, “world”!</greeting>