
This queue is for tickets about the Archive-Zip CPAN distribution.

Report information
The Basics
Id: 115723
Status: open
Priority: 0/
Queue: Archive-Zip

People
Owner: Nobody in particular
Requestors: paul.g.mckay [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Problem writing utf8 strings to file handles
Date: Wed, 29 Jun 2016 11:22:40 +0100
To: bug-Archive-Zip [...] rt.cpan.org
From: Paul McKay <paul.g.mckay [...] gmail.com>
Hi all,

Problem seen on version 1.57 on perl 5.16.2 on my mac, Darwin Kernel Version 13.4.0
Also reproduced on perl 5.18.4 on linux kernel 2.6.18-164.2.1.el5.plus

I've boiled down the problem to be reproduced with this script (although it was first seen writing data to a zip that was pulled from a database, where the utf8 pragma was not used; it's used here to illustrate the error):

    #!/usr/bin/env perl

    use strict;
    use Archive::Zip;
    use utf8;

    my $data = '€';

    my $zip = Archive::Zip->new();
    $zip->addString($data,'data.txt');
    $zip->writeToFileNamed('/tmp/data.zip');

This errors out with

    Wide character in Compress::Raw::Zlib::crc32 at /usr/local/perl/lib/site_perl/5.18.4/Archive/Zip.pm line 330.

Any help with this much appreciated. At the moment I'm having to look to work around Archive::Zip completely by using temporary files and system commands, not ideal.

Paul.
On Wed Jun 29 06:22:53 2016, paul.g.mckay@gmail.com wrote:
> This errors out with
>
> Wide character in Compress::Raw::Zlib::crc32 at
> /usr/local/perl/lib/site_perl/5.18.4/Archive/Zip.pm line 330.
Files in the filesystem contain sequences of bytes. If they contain Unicode text, that text must be in a specific encoding. Since addString expects the contents of a file, you need to encode your text before you pass it to addString. If you want to store it in UTF-8 encoding, for instance, you could do:

    utf8::encode $data;

before passing it to addString.
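To illustrate the reply above, here is a minimal sketch that uses only the core modules Encode and Compress::Raw::Zlib (the module named in the error message). It is not Archive::Zip's own code, and the variable names are made up for the example. It only shows that crc32 rejects a string containing a character above 0xFF and accepts the same text once it has been encoded to UTF-8 bytes.

```perl
use strict;
use warnings;
use Encode qw(encode);
use Compress::Raw::Zlib ();

# One character, U+20AC (the euro sign from the report), written as an
# escape so the example does not depend on the source file's encoding.
my $text = "\x{20AC}";

# Checksumming the character string fails: U+20AC does not fit in a byte.
my $crc_of_text = eval { Compress::Raw::Zlib::crc32($text) };
my $error       = $@;

# Encoding first yields three bytes (E2 82 AC), which can be checksummed
# and, by the same reasoning, passed to addString.
my $bytes        = encode('UTF-8', $text);
my $crc_of_bytes = Compress::Raw::Zlib::crc32($bytes);

print "character string: ", (defined $crc_of_text ? "ok" : "failed"), "\n";
print "error message:    $error" if $error;
printf "encoded bytes:    %d bytes, crc32 = %08x\n", length($bytes), $crc_of_bytes;
```

Encode::encode is used here instead of utf8::encode only because it returns a new byte string and leaves $text untouched; either should work for this purpose.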
Subject: Re: [rt.cpan.org #115723] Problem writing utf8 strings to file handles
Date: Wed, 29 Jun 2016 15:15:43 +0100
To: bug-Archive-Zip [...] rt.cpan.org
From: Paul McKay <paul.g.mckay [...] gmail.com>
Surely addString is expecting a string, rather than the contents of a file? If the string is marked internally as UTF-8, then writeToFileNamed fails.

Archive::Zip should be able to handle strings that are marked internally as UTF-8.

If not, then this should at least be mentioned in the documentation, and the examples, that you need to call encode on any strings that are passed in to addString.

On Wed, Jun 29, 2016 at 2:32 PM, Father Chrysostomos via RT <bug-Archive-Zip@rt.cpan.org> wrote:
> <URL: https://rt.cpan.org/Ticket/Display.html?id=115723 >
>
> Files in the filesystem contain sequences of bytes. If they contain
> Unicode text, that text must be in a specific encoding. Since addString
> expects the contents of a file, you need to encode your text before you
> pass it to addString. If you want to store it in UTF-8 encoding, for
> instance, you could do:
>
> utf8::encode $data;
>
> before passing it to addString.
Subject: Re: [rt.cpan.org #115723] Problem writing utf8 strings to file handles
Date: Wed, 29 Jun 2016 15:47:47 +0100
To: bug-Archive-Zip [...] rt.cpan.org
From: Paul McKay <paul.g.mckay [...] gmail.com>
In addition, the files that are created after encoding, in the zip, don't contain the UTF-8 BOM at the start. So you actually need to do

    my $data = '€';
    utf8::encode $data;

    my $zip = Archive::Zip->new();
    $zip->addString("\x{EF}\x{BB}\x{BF}" . $data,'data.csv');

On Wed, Jun 29, 2016 at 3:15 PM, Paul McKay <paul.g.mckay@gmail.com> wrote:
> Surely addString is expecting a string, rather than the contents of a
> file? If the string is marked internally as UTF-8, then writeToFileNamed
> fails.
>
> Archive::Zip should be able to handle strings that are marked internally
> as UTF-8.
>
> If not, then this should at least be mentioned in the documentation, and
> the examples, that you need to call encode on any strings that are passed
> in to addString.
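The BOM workaround in the message above can be sketched with the core Encode module alone. This is an illustration under assumptions, not part of Archive::Zip: the variable names are invented, and whether a BOM is wanted at all depends on the program that will read the file. The sketch compares prepending the three BOM bytes by hand with encoding U+FEFF together with the text.

```perl
use strict;
use warnings;
use Encode qw(encode);

# The euro sign from the report, written as an escape.
my $text = "\x{20AC}";

# Approach from the message above: encode the text, then prepend the
# three UTF-8 BOM bytes by hand.
my $manual = "\xEF\xBB\xBF" . encode('UTF-8', $text);

# Alternative: prepend U+FEFF as a character and encode everything at once.
my $encoded = encode('UTF-8', "\x{FEFF}" . $text);

print +($manual eq $encoded ? "same bytes" : "different bytes"), "\n";
print join(' ', map { sprintf('%02X', ord($_)) } split(//, $manual)), "\n";
```

Either byte string could then be passed to addString in place of the unencoded text.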
On 2016-06-29 07:15:59, paul.g.mckay@gmail.com wrote:
> Surely addString is expecting a string
Depends on what the documentation says. If it is silent on the matter, then the docs need to be amended.
> If the string is marked internally as UTF-8, then writeToFileNamed
> fails.
>
> Archive::Zip should be able to handle strings that are marked
> internally as UTF-8.
The internal representation of a string in perl has *nothing* to do with any expectations for how it should be encoded when it is written to a file. There is no connection at all, and you should *never* pay attention to the utf8 flag on a scalar variable. It is not relevant unless you are hacking on the guts of perl itself.
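The point about the utf8 flag can be illustrated with a short sketch. It assumes that Compress::Raw::Zlib::crc32 behaves as the error message in this ticket suggests, and it uses only core functionality; the variable names are invented for the example. The same one-character string is checksummed in both internal representations, and then a string containing a character above 0xFF is tried.

```perl
use strict;
use warnings;
use Compress::Raw::Zlib ();

# The same one-character string (U+00E9) in both internal representations.
my $downgraded = "\xE9";
my $upgraded   = "\xE9";
utf8::upgrade($upgraded);    # changes the internal representation only

my $flags_differ = utf8::is_utf8($upgraded) && !utf8::is_utf8($downgraded);
my $same_string  = $upgraded eq $downgraded;

# Both checksum fine: every character fits in a byte, flag or no flag.
my $crc_a = Compress::Raw::Zlib::crc32($downgraded);
my $crc_b = Compress::Raw::Zlib::crc32($upgraded);

# A character above 0xFF is what triggers the failure.
my $wide    = "\x{20AC}";
my $wide_ok = eval { Compress::Raw::Zlib::crc32($wide); 1 };

print "flags differ:  ", ($flags_differ      ? "yes" : "no"), "\n";
print "strings equal: ", ($same_string       ? "yes" : "no"), "\n";
print "crc32 equal:   ", ($crc_a == $crc_b   ? "yes" : "no"), "\n";
print "wide accepted: ", ($wide_ok           ? "yes" : "no"), "\n";
```

In this sketch the failure follows the content of the string (a character that cannot be represented as a single byte), not the state of the flag.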
Subject: Re: [rt.cpan.org #115723] Problem writing utf8 strings to file handles
Date: Mon, 4 Jul 2016 17:40:41 +0100
To: bug-Archive-Zip [...] rt.cpan.org
From: Paul McKay <paul.g.mckay [...] gmail.com>
The docs say

"Append a member created from the given string or string reference."

and the example given is

    my $member = $zip->addString( 'This is a test', 'test.txt' );

IMO, at the very least the documentation should mention that strings need to be encoded before being passed into 'addString', and that the example is inaccurate, as the passed-in string needs to be encoded first, or addString may bork. (I only mention the utf8 flag because this method borks if a string with the flag set is passed into the method, so it is relevant to the discussion.)

On Fri, Jul 1, 2016 at 7:54 PM, Karen Etheridge via RT <bug-Archive-Zip@rt.cpan.org> wrote:
> <URL: https://rt.cpan.org/Ticket/Display.html?id=115723 >
>
> On 2016-06-29 07:15:59, paul.g.mckay@gmail.com wrote:
> > Surely addString is expecting a string
>
> Depends on what the documentation says. If it is silent on the matter,
> then the docs need to be amended.
>
> > If the string is marked internally as UTF-8, then writeToFileNamed
> > fails.
> >
> > Archive::Zip should be able to handle strings that are marked
> > internally as UTF-8.
>
> The internal representation of a string in perl has *nothing* to do with
> any expectations for how it should be encoded when it is written to a file.
> There is no connection at all, and you should *never* pay attention to the
> utf8 flag on a scalar variable. It is not relevant unless you are hacking
> on the guts of perl itself.