
This queue is for tickets about the File-BOM CPAN distribution.

Report information
The Basics
Id: 102293
Status: new
Priority: 0/
Queue: File-BOM

People
Owner: Nobody in particular
Requestors: MORITZ [...] cpan.org
Cc: 20697743 [...] ticket.noris.net
AdminCc:

Bug Information
Severity: (no value)
Broken in: 0.14
Fixed in: (no value)



CC: 20697743 [...] ticket.noris.net
Subject: Spontaneous encoding change when used together with :crlf
$ cat test-utf8.pl
#!/usr/bin/env perl
use 5.014;
use utf8;
use File::BOM;
binmode STDOUT, ':encoding(utf-16le):crlf:via(File::BOM)' or die;
say 'Das ist der Rand von Ostermundigen.' for 1 .. 1000;

$ perl test-utf8.pl |& hexdump -C |head
00000000 ff fe 44 00 61 00 73 00 20 00 69 00 73 00 74 00 |..D.a.s. .i.s.t.|
00000010 20 00 64 00 65 00 72 00 20 00 52 00 61 00 6e 00 | .d.e.r. .R.a.n.|
00000020 64 00 20 00 76 00 6f 00 6e 00 20 00 4f 00 73 00 |d. .v.o.n. .O.s.|
00000030 74 00 65 00 72 00 6d 00 75 00 6e 00 64 00 69 00 |t.e.r.m.u.n.d.i.|
00000040 67 00 65 00 6e 00 2e 00 0d 00 0a 00 44 00 61 00 |g.e.n.......D.a.|
00000050 73 00 20 00 69 00 73 00 74 00 20 00 64 00 65 00 |s. .i.s.t. .d.e.|
00000060 72 00 20 00 52 00 61 00 6e 00 64 00 20 00 76 00 |r. .R.a.n.d. .v.|
00000070 6f 00 6e 00 20 00 4f 00 73 00 74 00 65 00 72 00 |o.n. .O.s.t.e.r.|
00000080 6d 00 75 00 6e 00 64 00 69 00 67 00 65 00 6e 00 |m.u.n.d.i.g.e.n.|
00000090 2e 00 0d 00 0a 00 44 00 61 00 73 00 20 00 69 00 |......D.a.s. .i.|

$ perl test-utf8.pl |& hexdump -C |tail
00011000 61 6e 64 20 76 6f 6e 20 4f 73 74 65 72 6d 75 6e |and von Ostermun|
00011010 64 69 67 65 6e 2e 0d 0a 44 61 73 20 69 73 74 20 |digen...Das ist |
00011020 64 65 72 20 52 61 6e 64 20 76 6f 6e 20 4f 73 74 |der Rand von Ost|
00011030 65 72 6d 75 6e 64 69 67 65 6e 2e 0d 0a 44 61 73 |ermundigen...Das|
00011040 20 69 73 74 20 64 65 72 20 52 61 6e 64 20 76 6f | ist der Rand vo|
00011050 6e 20 4f 73 74 65 72 6d 75 6e 64 69 67 65 6e 2e |n Ostermundigen.|
00011060 0d 0a 44 61 73 20 69 73 74 20 64 65 72 20 52 61 |..Das ist der Ra|
00011070 6e 64 20 76 6f 6e 20 4f 73 74 65 72 6d 75 6e 64 |nd von Ostermund|
00011080 69 67 65 6e 2e 0d 0a |igen...|
00011087

The output starts as UTF-16, and then spontaneously switches to UTF-8, and stays that way. The change occurs somewhere around byte 0xFFFF:

0000ffe0 74 00 20 00 64 00 65 00 72 00 20 00 52 00 61 00 |t. .d.e.r. .R.a.|
0000fff0 6e 00 64 00 20 00 76 00 6f 00 6e 00 20 4f 73 74 |n.d. .v.o.n. Ost|
00010000 65 72 6d 75 6e 64 69 67 65 6e 2e 0d 0a 44 61 73 |ermundigen...Das|
00010010 20 69 73 74 20 64 65 72 20 52 61 6e 64 20 76 6f | ist der Rand vo|
00010020 6e 20 4f 73 74 65 72 6d 75 6e 64 69 67 65 6e 2e |n Ostermundigen.|

I forgot to add: tested with both perl 5.14.2 and perl 5.20.1.
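
To pin down where the output stops being UTF-16LE, rather than eyeballing the hexdump, a small check along the following lines may help. This is only a sketch: first_non_utf16le_offset is a hypothetical helper name (not part of File::BOM), and it assumes the text is pure ASCII, as in the test script, so that every second byte of valid UTF-16LE output is 0x00.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of File::BOM): return the byte offset of the
# first 2-byte unit whose high byte is not 0x00, or undef if none is found.
# Assumption: the encoded text is pure ASCII, so valid UTF-16LE output has
# 0x00 in every second byte.
sub first_non_utf16le_offset {
    my ($bytes) = @_;
    my $pos = 0;
    $pos = 2 if substr($bytes, 0, 2) eq "\xFF\xFE";    # skip the BOM if present
    while ($pos + 1 < length $bytes) {
        return $pos if substr($bytes, $pos + 1, 1) ne "\x00";
        $pos += 2;
    }
    return;    # every unit looked like UTF-16LE-encoded ASCII
}

# Small in-memory sample: BOM, "Das" in UTF-16LE, then " ist" as single bytes.
my $sample = "\xFF\xFE" . "D\x00a\x00s\x00" . " ist";
my $offset = first_non_utf16le_offset($sample);
print defined $offset
    ? sprintf("encoding changes at offset 0x%x\n", $offset)
    : "stream looks like consistent UTF-16LE\n";
```

To check real output, redirect the test script to a file, read that file with binmode set on the handle, and pass its contents to the helper. Judging from the dump excerpt above, where the row at 0000fff0 ends in 20 4f 73 74, it would be expected to report an offset of 0xfffc.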