Skip Menu |

Preferred bug tracker

Please visit the preferred bug tracker to report your issue.

This queue is for tickets about the Email-Folder CPAN distribution.

Report information
The Basics
Id: 31132
Status: resolved
Priority: 0/
Queue: Email-Folder

People
Owner: Nobody in particular
Requestors: whynot [...] pozharski.name
Cc:
AdminCc:

Bug Information
Severity: Critical
Broken in: (no value)
Fixed in: (no value)



Subject: Email::Folder::Mbox looses the last header if Content-Length or Lines header is next-to-last
Extensive (too verbose) context of my install is in bugcontext. Proposed fix against checked-out HEAD of repository is in bugfix. Mininimal(?) (next-to-extensive(?)) code that reproduces bug is in bugsample. If run in checked-out copy with C<-I lib>, then all headers are in place. If run against installed v.0.853, than #1 and #4 are broken. However, as far as I can grok symmetry in CL and L headers handling, #3 and #6 must be correct; in my install they are broken too; anyway bugfix fixes them. I must admit, that #2 is broken too: for some surprising reason fixed Mbox.pl still sucks the very first line of the body as valueles header. Why I consider that bug "critical". Simple speculation: a mail (with CL or L header) is placed in mbox; then read in E::Simple object; then one header is added (as the very last of course); then is placed in mbox again. The next time that pure mail is read, the very last header will be missing in E::Simple object.
Subject: bugfix
Index: t/mboxcl3 =================================================================== --- t/mboxcl3 (revision 0) +++ t/mboxcl3 (revision 0) @@ -0,0 +1,49 @@ +From daemon@orphan.zombinet +Return-path: <daemon@orphan.zombinet> +Envelope-to: root@orphan.zombinet +Delivery-date: Sun, 11 Nov 2007 06:58:35 +0200 +Received: from orphan.zombinet ([192.168.0.1]) + by orphan.zombinet with smtp (Exim 4.63) + (envelope-from <daemon@orphan.zombinet>) + id 1Ir4u3-0001Xb-MI + for root@orphan.zombinet; Sun, 11 Nov 2007 06:58:35 +0200 +From: <daemon@orphan.zombinet> +To: <root@orphan.zombinet> +Date: Sun, 11 Nov 2007 06:58:35 EET +Subject: [Samhain at orphan.zombinet] 11-11-2007 06:58:35: CRIT +Message-Id: <E1Ir4u3-0001Xb-MI@orphan.zombinet> +Content-Length: 442 +X-Test: Just a bwahaha + +-----BEGIN MESSAGE----- +[2007-11-11T06:58:35+0200] orphan.zombinet +CRIT : [2007-11-11T06:57:43+0200] msg=<POLICY [ReadOnly] --------T->, path=</etc/ppp/resolv.conf>, ctime_old=<[2007-11-10T22:39:57]>, ctime_new=<[2007-11-11T04:16:55]>, mtime_old=<[2007-11-10T22:39:57]>, mtime_new=<[2007-11-11T04:16:55]>, +-----BEGIN SIGNATURE----- +72EC87DCB8E1B67324DFC76E8D1BF4956F1E9D6E6D03B015 +000023 1194292332::orphan.zombinet +-----END MESSAGE----- + +From daemon@orphan.zombinet +Return-path: <daemon@orphan.zombinet> +Envelope-to: root@orphan.zombinet +Delivery-date: Sun, 11 Nov 2007 06:58:35 +0200 +Received: from orphan.zombinet ([192.168.0.1]) + by orphan.zombinet with smtp (Exim 4.63) + (envelope-from <daemon@orphan.zombinet>) + id 1Ir4u3-0001Xb-MI + for root@orphan.zombinet; Sun, 11 Nov 2007 06:58:35 +0200 +From: <daemon@orphan.zombinet> +To: <root@orphan.zombinet> +Date: Sun, 11 Nov 2007 06:58:35 EET +Subject: [Samhain at orphan.zombinet] 11-11-2007 06:58:35: CRIT +Message-Id: <E1Ir4u3-0001Xb-MI@orphan.zombinet> +Lines: 7 +X-Test: Just another bwahaha + +-----BEGIN MESSAGE----- +[2007-11-11T06:58:35+0200] orphan.zombinet +CRIT : [2007-11-11T06:57:43+0200] msg=<POLICY [ReadOnly] --------T->, path=</etc/ppp/resolv.conf>, ctime_old=<[2007-11-10T22:39:57]>, ctime_new=<[2007-11-11T04:16:55]>, mtime_old=<[2007-11-10T22:39:57]>, mtime_new=<[2007-11-11T04:16:55]>, +-----BEGIN SIGNATURE----- +72EC87DCB8E1B67324DFC76E8D1BF4956F1E9D6E6D03B015 +000023 1194292332::orphan.zombinet +-----END MESSAGE----- Index: t/01mbox.t =================================================================== --- t/01mbox.t (revision 847) +++ t/01mbox.t (working copy) @@ -3,7 +3,7 @@ BEGIN { %boxes = ( 't/testmbox' => "\x0a", 't/testmbox.mac' => "\x0d", 't/testmbox.dos' => "\x0d\x0a" ) } -use Test::More tests => 12 + 3 * keys %boxes; +use Test::More tests => 16 + 3 * keys %boxes; use strict; use_ok("Email::Folder"); @@ -72,4 +72,16 @@ ok( $r = Email::Folder->new('t/mboxcl2', seek_to => $offset), "reopened"); is( $r->next_message->header('Subject'), 'Re: Fifteenth anniversary of Perl.', 'second message' ); +undef $r; +$r = Email::Folder->new('t/mboxcl3'); +my $p = $r->next_message(); +is( $p->header('X-Test'), 'Just a bwahaha', + 'one more line after Content-Length' ); +is( $p->header_names(), 11, + 'with Content-Length all headers are in place' ); +$p = $r->next_message(); +is( $p->header('X-Test'), 'Just another bwahaha', + 'one more line after Lines' ); +is( $p->header_names(), 11, + 'with Lines all headers are in place' ); Index: lib/Email/Folder/Mbox.pm =================================================================== --- lib/Email/Folder/Mbox.pm (revision 847) +++ lib/Email/Folder/Mbox.pm (working copy) @@ -153,7 +153,7 @@ } # grab the next line (should be /^From / or undef) my $next = <$fh>; - return "$mail$/$read" + return "$mail$prev$/$read" if !defined $next || $next =~ /^From /; # seek back and scan line-by-line like the header # wasn't here @@ -168,7 +168,7 @@ for (1..$lines) { $read .= <$fh> } <$fh>; # trailing newline my $next = <$fh>; - return "$mail$/$read" + return "$mail$prev$/$read" if !defined $next || $next =~ /^From /; # seek back and scan line-by-line like the header # wasn't here
Subject: bugcontext
=== contents (perl -mEmail::Folder -e 'print $Email::Folder::VERSION, "\n"') goes below === 0.853 === contents (perl -mEmail::Folder -e 'print $Email::Folder::VERSION, "\n"') gone above === === contents (perl -mEmail::FolderType -e 'print $Email::FolderType::VERSION, "\n"') goes below === 0.812 === contents (perl -mEmail::FolderType -e 'print $Email::FolderType::VERSION, "\n"') gone above === === contents (perl -mEmail::Simple -e 'print $Email::Simple::VERSION, "\n"') goes below === 2.003 === contents (perl -mEmail::Simple -e 'print $Email::Simple::VERSION, "\n"') gone above === === contents (perl -mTest::More -e 'print $Test::More::VERSION, "\n"') goes below === 0.62 === contents (perl -mTest::More -e 'print $Test::More::VERSION, "\n"') gone above === === contents (perl -V) goes below === Summary of my perl5 (revision 5 version 8 subversion 8) configuration: Platform: osname=linux, osvers=2.6.18.3, archname=i486-linux-gnu-thread-multi uname='linux saens 2.6.18.3 #1 smp sat nov 25 13:39:52 est 2006 i686 gnulinux ' config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=i486-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.8 -Darchlib=/usr/lib/perl/5.8 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.8.8 -Dsitearch=/usr/local/lib/perl/5.8.8 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Uusenm -Duseshrplib -Dlibperl=libperl.so.5.8.8 -Dd_dosuid -des' hint=recommended, useposix=true, d_sigaction=define usethreads=define use5005threads=undef useithreads=define usemultiplicity=define useperlio=define d_sfio=undef uselargefiles=define usesocks=undef use64bitint=undef use64bitall=undef uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64', optimize='-O2', cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include' ccversion='', gccversion='4.1.2 20061115 (prerelease) (Debian 4.1.1-20)', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12 ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=4, prototype=define Linker and Libraries: ld='cc', ldflags =' -L/usr/local/lib' libpth=/usr/local/lib /lib /usr/lib libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt perllibs=-ldl -lm -lpthread -lc -lcrypt libc=/lib/libc-2.3.6.so, so=so, useshrplib=true, libperl=libperl.so.5.8.8 gnulibc_version='2.3.6' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E' cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib' Characteristics of this binary (from libperl): Compile-time options: MULTIPLICITY PERL_IMPLICIT_CONTEXT PERL_MALLOC_WRAP THREADS_HAVE_PIDS USE_ITHREADS USE_LARGE_FILES USE_PERLIO USE_REENTRANT_API Built under linux Compiled at Dec 6 2006 23:05:53 %ENV: PERLDOC_PAGER="pager -Cei" @INC: /etc/perl /usr/local/lib/perl/5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/site_perl . === contents (perl -V) gone above === === contents (uname -a) goes below === Linux orphan.zombinet 2.6.18-5-686 #1 SMP Fri Jun 1 00:47:00 UTC 2007 i686 GNU/Linux === contents (uname -a) gone above === === contents (ldd --version) goes below === ldd (GNU libc) 2.3.6 Copyright (C) 2005 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. Written by Roland McGrath and Ulrich Drepper. === contents (ldd --version) gone above ===
Subject: bugsample
#!/usr/bin/perl use strict; use warnings; use Email::Folder::Mbox; use Email::Simple; use Email::Simple::Creator; use Email::LocalDelivery; use File::Temp qw(tempfile); my $data; while(<main::DATA>) { $data .= $_; }; $data = eval $data; my $body = <<EOT; Perl is the only (in other words: the best) language that exists ever. The only problem left: can someone write kernel in Perl?\ EOT my(undef, $fn) = tempfile(q(/tmp/emailfolder-XXXXXX)); Email::LocalDelivery->deliver( Email::Simple->create(header => $_, body => $body)->as_string, $fn) foreach (@{$data}); my $p = Email::Folder::Mbox->new($fn); my $q = Email::Simple->new($p->next_message()); foreach my $r (@{$data}) { my @s = $q->header_pairs(); print( qq(Expected headers:\n), join(' ', map { $r->[$_]; } grep { not $_ % 2; } (0 .. $#{$r})), "\n", qq( Got headers:\n), join(' ', map { $s[$_]; } grep { not $_ % 2; } (0 .. $#s)), "\n", "\n"); } continue { $q = Email::Simple->new($p->next_message() || ''); }; unlink($fn); exit(0); __DATA__ [ [ q(From), q(foo), q(To), q(bar), q(Date), scalar(gmtime(time - 3600 * 6)), q(Message-Id), q(abcdef@uvwxyz), q(Content-Length), 129, q(X-Not-Found), q(bwahaha), ], [ q(From), q(foo), q(To), q(bar), q(Date), scalar(gmtime(time - 3600 * 5)), q(Message-Id), q(bcdefg@uvwxyz), q(Content-Length), 314, q(X-Found), q(bwahaha), ], [ q(From), q(foo), q(To), q(bar), q(Date), scalar(gmtime(time - 3600 * 4)), q(Content-Length), 129, q(Message-Id), q(cdefgh@uvwxyz), q(X-Found), q(bwahaha), ], [ q(From), q(foo), q(To), q(bar), q(Date), scalar(gmtime(time - 3600 * 3)), q(Message-Id), q(defghi@uvwxyz), q(Lines), 3, q(X-Not-Found), q(bwahaha), ], [ q(From), q(foo), q(To), q(bar), q(Date), scalar(gmtime(time - 3600 * 2)), q(Message-Id), q(efghij@uvwxyz), q(Lines), 4, q(X-Found), q(bwahaha), ], [ q(From), q(foo), q(To), q(bar), q(Date), scalar(gmtime(time - 3600)), q(Lines), 3, q(Message-Id), q(fghijk@uvwxyz), q(X-Found), q(bwahaha), ], ];
From: whynot [...] pozharski.name
On Sat Dec 01 14:49:13 2007, whynot wrote: Show quoted text
> Extensive (too verbose) context of my install is in bugcontext. > Proposed fix against checked-out HEAD of repository is in bugfix. > Mininimal(?) (next-to-extensive(?)) code that reproduces bug is in > bugsample. If run in checked-out copy with C<-I lib>, then all > headers are in place. If run against installed v.0.853, than #1 > and #4 are broken. However, as far as I can grok symmetry in CL > and L headers handling, #3 and #6 must be correct; in my install > they are broken too; anyway bugfix fixes them. I must admit, that > #2 is broken too: for some surprising reason fixed Mbox.pl still > sucks the very first line of the body as valueles header. > Why I consider that bug "critical". Simple speculation: a mail (with > CL or L header) is placed in mbox; then read in E::Simple object; > then one header is added (as the very last of course); then is > placed in mbox again. The next time that pure mail is read, the > very last header will be missing in E::Simple object.
The issue with F<bugsample> is investigated: C<$_> is in use in the innermost blocks (C<if $mail =~ m/Content-Length/> and C<if $mail =~ m/Lines/>); then reused at the end of the outermost C<while>); a proper bugfix is in F<bugfix2>. Thanks for your patience. p.s. Just one more speculation: What if value passed in CL or L header is hardly broken and points at the next-to-next mail? Is it possible to just clear all that mess out?
Index: t/mboxcl3 =================================================================== --- t/mboxcl3 (revision 0) +++ t/mboxcl3 (revision 0) @@ -0,0 +1,49 @@ +From daemon@orphan.zombinet +Return-path: <daemon@orphan.zombinet> +Envelope-to: root@orphan.zombinet +Delivery-date: Sun, 11 Nov 2007 06:58:35 +0200 +Received: from orphan.zombinet ([192.168.0.1]) + by orphan.zombinet with smtp (Exim 4.63) + (envelope-from <daemon@orphan.zombinet>) + id 1Ir4u3-0001Xb-MI + for root@orphan.zombinet; Sun, 11 Nov 2007 06:58:35 +0200 +From: <daemon@orphan.zombinet> +To: <root@orphan.zombinet> +Date: Sun, 11 Nov 2007 06:58:35 EET +Subject: [Samhain at orphan.zombinet] 11-11-2007 06:58:35: CRIT +Message-Id: <E1Ir4u3-0001Xb-MI@orphan.zombinet> +Content-Length: 442 +X-Test: Just a bwahaha + +-----BEGIN MESSAGE----- +[2007-11-11T06:58:35+0200] orphan.zombinet +CRIT : [2007-11-11T06:57:43+0200] msg=<POLICY [ReadOnly] --------T->, path=</etc/ppp/resolv.conf>, ctime_old=<[2007-11-10T22:39:57]>, ctime_new=<[2007-11-11T04:16:55]>, mtime_old=<[2007-11-10T22:39:57]>, mtime_new=<[2007-11-11T04:16:55]>, +-----BEGIN SIGNATURE----- +72EC87DCB8E1B67324DFC76E8D1BF4956F1E9D6E6D03B015 +000023 1194292332::orphan.zombinet +-----END MESSAGE----- + +From daemon@orphan.zombinet +Return-path: <daemon@orphan.zombinet> +Envelope-to: root@orphan.zombinet +Delivery-date: Sun, 11 Nov 2007 06:58:35 +0200 +Received: from orphan.zombinet ([192.168.0.1]) + by orphan.zombinet with smtp (Exim 4.63) + (envelope-from <daemon@orphan.zombinet>) + id 1Ir4u3-0001Xb-MI + for root@orphan.zombinet; Sun, 11 Nov 2007 06:58:35 +0200 +From: <daemon@orphan.zombinet> +To: <root@orphan.zombinet> +Date: Sun, 11 Nov 2007 06:58:35 EET +Subject: [Samhain at orphan.zombinet] 11-11-2007 06:58:35: CRIT +Message-Id: <E1Ir4u3-0001Xb-MI@orphan.zombinet> +Lines: 7 +X-Test: Just another bwahaha + +-----BEGIN MESSAGE----- +[2007-11-11T06:58:35+0200] orphan.zombinet +CRIT : [2007-11-11T06:57:43+0200] msg=<POLICY [ReadOnly] --------T->, path=</etc/ppp/resolv.conf>, ctime_old=<[2007-11-10T22:39:57]>, ctime_new=<[2007-11-11T04:16:55]>, mtime_old=<[2007-11-10T22:39:57]>, mtime_new=<[2007-11-11T04:16:55]>, +-----BEGIN SIGNATURE----- +72EC87DCB8E1B67324DFC76E8D1BF4956F1E9D6E6D03B015 +000023 1194292332::orphan.zombinet +-----END MESSAGE----- Index: t/01mbox.t =================================================================== --- t/01mbox.t (revision 847) +++ t/01mbox.t (working copy) @@ -3,7 +3,7 @@ BEGIN { %boxes = ( 't/testmbox' => "\x0a", 't/testmbox.mac' => "\x0d", 't/testmbox.dos' => "\x0d\x0a" ) } -use Test::More tests => 12 + 3 * keys %boxes; +use Test::More tests => 16 + 3 * keys %boxes; use strict; use_ok("Email::Folder"); @@ -72,4 +72,16 @@ ok( $r = Email::Folder->new('t/mboxcl2', seek_to => $offset), "reopened"); is( $r->next_message->header('Subject'), 'Re: Fifteenth anniversary of Perl.', 'second message' ); +undef $r; +$r = Email::Folder->new('t/mboxcl3'); +my $p = $r->next_message(); +is( $p->header('X-Test'), 'Just a bwahaha', + 'one more line after Content-Length' ); +is( $p->header_names(), 11, + 'with Content-Length all headers are in place' ); +$p = $r->next_message(); +is( $p->header('X-Test'), 'Just another bwahaha', + 'one more line after Lines' ); +is( $p->header_names(), 11, + 'with Lines all headers are in place' ); Index: lib/Email/Folder/Mbox.pm =================================================================== --- lib/Email/Folder/Mbox.pm (revision 847) +++ lib/Email/Folder/Mbox.pm (working copy) @@ -147,13 +147,14 @@ my $length = $1; print " Content-Length: $length\n" if debug; my $read = ''; + local $_; while (<$fh>) { last if length $read >= $length; $read .= $_; } # grab the next line (should be /^From / or undef) my $next = <$fh>; - return "$mail$/$read" + return "$mail$prev$/$read" if !defined $next || $next =~ /^From /; # seek back and scan line-by-line like the header # wasn't here @@ -165,10 +166,11 @@ my $lines = $1; print " Lines: $lines\n" if debug; my $read = ''; + local $_; for (1..$lines) { $read .= <$fh> } <$fh>; # trailing newline my $next = <$fh>; - return "$mail$/$read" + return "$mail$prev$/$read" if !defined $next || $next =~ /^From /; # seek back and scan line-by-line like the header # wasn't here
Fixed in svn. I'd love it if you could confirm that the fix I applied, which differs, works for you. I have incorporated your tests. -- rjbs
From: whynot [...] pozharski.name
On Sun Dec 23 14:14:36 2007, RJBS wrote: Show quoted text
> I'd love it if you could confirm that the fix I applied, which differs, > works for you.
It doesn't, sorry. Running F<bugsample> gives extra headers in 2nd and 5th samples (those with broken "Content-Length" and "Lines" headers). Issue is with adding C<$prev> twice. Bugfix (F<bugfix3>) added. p.s. I honestly wouldn't like to pretend to be an expert (either in Perl programming or secure programming); I'm not one. I don't want to appear abusing *yours* maintainer/developer powers (that's your project and your responcibility). What I really want to know, that you consciously rely on unreliable headers (RFC2076 in chapter 3.11 specifies "Content-Length" header as "discouraged"; yes, I know, RFC2076 is informational). Should that discussion be moved onto pep-list? Should I benchmark?
Index: t/01mbox.t =================================================================== --- t/01mbox.t (revision 883) +++ t/01mbox.t (working copy) @@ -61,6 +61,10 @@ ], "they're the messages we expected"); +is_deeply( [ sort map { scalar($_->header_pairs()); } @messages ], + [ 50, 54, 66 ], + "and they have expected number of headers"); + my $r = Email::Folder->new('t/mboxcl2'); is( $r->next_message->header('Subject'), 'Fifteenth anniversary of Perl.', 'iterate first message' ); Index: lib/Email/Folder/Mbox.pm =================================================================== --- lib/Email/Folder/Mbox.pm (revision 883) +++ lib/Email/Folder/Mbox.pm (working copy) @@ -146,6 +146,7 @@ # look for a content length header, and try to use that if ($mail =~ m/^Content-Length: (\d+)$/mi) { $mail .= $prev; + $prev = ''; my $length = $1; print " Content-Length: $length\n" if debug; my $read = ''; @@ -166,6 +167,7 @@ # much the same, but with Lines: if ($mail =~ m/^Lines: (\d+)$/mi) { $mail .= $prev; + $prev = ''; my $lines = $1; print " Lines: $lines\n" if debug; my $read = '';
The problem with -only- looking for /^From / is that traditionally Email::Folder::Mbox has tried to handle all the various types of mbox encoding, which means dealing with mboxcl, which allows the Content-Length header to skip you past unencoded body lines beginning with "From ". This is stupid, but it's how it has worked in the past. I am pretty tempted to abandon it, however, as it is needlessly complex, and not very reliable. -- rjbs
From: whynot [...] pozharski.name
On Mon Dec 24 15:00:05 2007, RJBS wrote: Show quoted text
> The problem with -only- looking for /^From / is that traditionally > Email::Folder::Mbox has tried to handle all the various types of mbox > encoding, which means dealing with mboxcl, which allows the > Content-Length header to skip you past unencoded body lines beginning > with "From ". This is stupid, but it's how it has worked in the past. > > I am pretty tempted to abandon it, however, as it is needlessly complex, > and not very reliable.
"Look, Ma! No-expert!" I've got it now. My fault.
From: whynot [...] pozharski.name
On Sun Sep 14 22:19:52 2008, RJBS wrote: Show quoted text
Yes. Header next to I<C-L:> or I<L:> isn't lost any more. The problem is, that if any that header has incorrect value, then a next header is passed twice. See F<bugfix3>.
applied in git. -- rjbs