CC: | wosch [...] freebsd.org |
Subject: | Interoperability problems with pbzip2 |
pbzip2 <http://compression.ca/pbzip2/> is a bzip2 variant which can
compress and decompress faster by making use of multiple CPUs in a
system. Unfortunately pbzip2 can create files which cannot be
uncompressed by IO-Compress. The attached file demonstrates the problem.
In my experiments the problem occurs when the uncompressed input is
larger than the block size used for compression, which is 900000 bytes
by default, or n*100000 bytes when pbzip2 is run with the option -bn.
Maybe the report really belongs to the Compress-Raw-Bzip2 queue, but my
test script is just using IO::Uncompress::Bunzip2. Feel free to move the
bug ticket.
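One possible explanation, not confirmed in this report: if pbzip2 writes
each block as an independent bzip2 stream and concatenates them, then an
input larger than one block yields a multi-stream file.
IO::Uncompress::Bunzip2 documents a MultiStream option that defaults to
off. The sketch below does not need pbzip2. It builds two concatenated
streams with IO::Compress::Bzip2 (the variable names and sizes are made
up for illustration) and decompresses them with and without MultiStream.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Compress::Bzip2 qw(bzip2 $Bzip2Error);
use IO::Uncompress::Bunzip2 qw(bunzip2 $Bunzip2Error);

# Two independent bzip2 streams written back to back. This stands in for
# what pbzip2 is ASSUMED to emit when the input spans two blocks.
my ($first, $second) = ("a" x 1000, "b" x 10);
my ($z1, $z2);
bzip2 \$first  => \$z1 or die "bzip2 failed: $Bzip2Error\n";
bzip2 \$second => \$z2 or die "bzip2 failed: $Bzip2Error\n";
my $concatenated = $z1 . $z2;

# Default settings (MultiStream is off).
my $single;
bunzip2 \$concatenated => \$single
    or die "bunzip2 failed: $Bunzip2Error\n";

# MultiStream => 1 asks for all concatenated streams to be read as one.
my $multi;
bunzip2 \$concatenated => \$multi, MultiStream => 1
    or die "bunzip2 failed: $Bunzip2Error\n";

printf "input:       %d bytes\n", length($first) + length($second);
printf "default:     %d bytes\n", length $single;
printf "MultiStream: %d bytes\n", length $multi;
```

If the two lengths differ here in the same way as in pbzip2test.pl,
passing MultiStream => 1 to bunzip2 in the test script would be a
workaround worth trying.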
Regards,
Slaven
Subject: | pbzip2test.pl |
#!/usr/bin/perl -w
use strict;
use IO::Uncompress::Bunzip2 qw(bunzip2 $Bunzip2Error) ;
use Digest::MD5 qw(md5_hex);
use File::Temp qw(tempfile);
use IPC::Run qw(run);
use Test::More qw(no_plan);
#my $pbzip2_exe = "/usr/local/src/work/pbzip2-1.1.6/pbzip2";
my $pbzip2_exe = "pbzip2";
#my $blocksize = 1;
my $blocksize = 9;
# Good: the input fits exactly into one block.
#my $len = $blocksize*100000;
# Bad: the input is one byte larger than the block size.
my $len = $blocksize*100000+1;
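# Compress $len spaces with pbzip2 into a temporary file.
my $input = do {
    my $input_uncompressed = " " x $len;
    my($tmpfh,$tmpfile) = tempfile(UNLINK => 1) or die $!;
    run [$pbzip2_exe, "-b$blocksize"], "<", \$input_uncompressed, ">", $tmpfh
        or die "$pbzip2_exe failed with exit status $?\n";
    $tmpfile;
};
# Decompress the file with IO::Uncompress::Bunzip2 ...
my $output_perl = do {
    my $output;
    bunzip2 $input => \$output
        or die "bunzip2 failed: $Bunzip2Error\n";
    $output;
};
# ... and with the system's bzcat as the reference.
my $output_system = do {
    my $output;
    run ["bzcat", $input], ">", \$output
        or die "bzcat failed with exit status $?\n";
    $output;
};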
is md5_hex($output_perl), md5_hex($output_system);
__END__