Skip Menu |

This queue is for tickets about the Audio-Digest-MP3 CPAN distribution.

Report information
The Basics
Id: 50036
Status: open
Priority: 0/
Queue: Audio-Digest-MP3

People
Owner: Nobody in particular
Requestors: abrasive [...] axdf.net
Cc:
AdminCc:

Bug Information
Severity: Critical
Broken in: 0.1
Fixed in: (no value)



Subject: Mishandling with ID3v2 tags
MPEG::Audio::Frame sometimes 'finds' apparently valid frames in the ID3v2 tags present at the beginnings of some files (courtesy of the ludicrously overcomplicated ID3v2 format). It's necessary to scan for and skip these tags first, or the final hash value can include them. Find attached: - two mp3s with equal audio data; one with id3v2 tag and one without - two scripts; one manually skips id3 tags, the other shows the output of Audio::Digest::MP3 (call with filename argument) Thanks for the handy module :)
Subject: test_tag.mp3
Download test_tag.mp3
application/octet-stream 10k

Message body not shown because it is not plain text.

Subject: hash_correct.pl
#!/usr/pkg/bin/perl use Digest::SHA1 qw(sha1_hex); use MPEG::Audio::Frame; my $sha1 = Digest::SHA1->new(); open IN, "<", $ARGV[0] || die("Can't open input!"); my $h; while (1) { my $n = read IN, $h, 10; unless ($n==10 && $h =~ /^ID3/) { seek(IN, -$n, 1); last; }; print "found id3 tag "; my @szb = unpack("C4", substr($h, 6, 4)); my $len = ((((($szb[0] << 7) | $szb[1]) << 7) | $szb[2]) << 7) | $szb[3]; # 7-bit ints, sigh print "len $len\n"; seek (IN, $len, 1); }; # print "starting parse at " . tell(IN) . "\n"; tie *MP3, 'MPEG::Audio::Frame', *IN; while (<MP3>) { $sha1->add($_); # print "off " . $_->offset . " length " . $_->length . " " . sha1_hex($_->asbin()) . "\n"; } print "SHA1-" . $sha1->hexdigest() . "\n";
Subject: hash_with_module.pl
#!/usr/pkg/bin/perl use Digest::SHA1; use Audio::Digest::MP3; my $info = Audio::Digest::MP3->scan(shift @ARGV, 'SHA1'); return unless $info; print "SHA1-" . $info->digest() . "\n";
Subject: test_tag2.mp3
Download test_tag2.mp3
application/octet-stream 12k

Message body not shown because it is not plain text.

Subject: [PATCH] Mishandling with ID3v2 tags
Attached patch to the .pm fixes the issue.
--- MP3.pm.orig 2009-09-26 21:04:02.000000000 +1000 +++ MP3.pm 2009-09-26 21:14:53.000000000 +1000 @@ -20,6 +20,23 @@ my $ctx = Digest->new(shift || 'MD5'); open my($fh), "<", $file or croak "Can't open file \"$file\": $!"; binmode $fh; + + my $header; + ID3: while (1) { + my $n = read $fh, $header, 10; + unless ($n==10 && $header =~ /^ID3/) { + seek $fh, -$n, 1; + last ID3; + } + my @septets = unpack("C4", substr($header, 6, 4)); + my $len = 0; + foreach (@septets) { + $len <<= 7; + $len += $_; + } + seek $fh, $len, 1; + } + my $frames = 0; my $seconds = 0; my $bytes = 0;
On Sat Sep 26 06:43:21 2009, abrasive wrote: Show quoted text
> MPEG::Audio::Frame sometimes 'finds' apparently valid frames in the > ID3v2 tags present at the beginnings of some files (courtesy of the > ludicrously overcomplicated ID3v2 format). It's necessary to scan for > and skip these tags first, or the final hash value can include them.
Thanks for the patch. However, I think your fix is in the wrong place. It should be in the MP3 file parsing module, MPEG::Audio::Frame, that should skip the ID3v2 tag. Since some time I have edited that module, making it skip an ID3v2 tag, if it sees one. It's still not perfect, as for some MP3 files it reports a playing time of 0 seconds. This usually happens for files that contain artwork images. So I guess I must have done something wrong. I'll mark this bug as "solved" once the new version of that module is uploaded. I now am co-maintainer of MPEG::Audio::Frame so I will upload a new version of that module in due time; If I can't fix the above error, I'll settle for the current, improved version.
Subject: Re: [rt.cpan.org #50036] Mishandling with ID3v2 tags
Date: Sat, 17 Oct 2009 23:34:08 +1000
To: BARTL via RT <bug-Audio-Digest-MP3 [...] rt.cpan.org>
From: Abrasive <abrasive [...] axdf.net>
On Fri, Oct 16, 2009 at 12:00:55PM -0400, BARTL via RT wrote: Show quoted text
> Thanks for the patch. However, I think your fix is in the wrong place. > It should be in the MP3 file parsing module, MPEG::Audio::Frame, that > should skip the ID3v2 tag.
Bit of a semantic argument as to whether the fix belongs in MPEG::Audio::Frame or not -- ID3 tags can occur at least at the beginning and end of the file, and can contain perfectly valid MPEG frames inside them; this is really only an issue specifically in MP3s... Show quoted text
> Since some time I have edited that module, making it skip an ID3v2 tag, > if it sees one. It's still not perfect, as for some MP3 files it reports > a playing time of 0 seconds. This usually happens for files that contain > artwork images. So I guess I must have done something wrong.
I too have had mixed results; there are rather a lot of variants on the format out there. I ended up handrolling a hasher for my projects. You can find it at http://axdf.net/~abrasive/pub/perl/hash_mp3.pm (pardon crufty code). I haven't manually verified its performance on corner cases. I'll see what I can find when your new version pops out. Show quoted text
> If I can't fix the above error, I'll > settle for the current, improved version.
After submitting I realised I don't handle ID3v2 tags that aren't at the beginning (ie. at the end) -- that's definitely a problem. See the above code for one workaround - but since it makes multiple passes over the file it's definitely more expensive... Thanks, JHL PS. what is your intended use for this module? My use case is still evolving, but centres on distributed backup between individual net-users, and automatic redundancy scanning - modulo ID3 tags.