
This queue is for tickets about the Mail-Box CPAN distribution.

Report information
The Basics
Id: 44439
Status: resolved
Worked: 2.3 hours (140 min)
Priority: 0/
Queue: Mail-Box

People
Owner: Nobody in particular
Requestors: reinpost [...] win.tue.nl
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 2.087
Fixed in: (no value)



Subject: the takemail script doesn't output From separators, which renders it useless
Looking for a script to scan through my Unix mbox files and return an mbox file with all messages meeting some criterion (e.g. from within the last month), I found Mail::Box and the takemail script, which looks like it's designed to do that. But it doesn't: it forgets to put the From-lines on the messages in the output, with the consequence that the output isn't a mbox file. Either this used to work in some previous version (and from a quick glance at the code it seems that it should) or takemail was never actually used on mbox files. Something is wrong either in Mail::Box itself or in the docs (they don't mention the problem or how to fix it). My workaround is to use a script that doesn't use Mail::Box at all, not my preferred solution.
Subject: Re: [rt.cpan.org #44439] the takemail script doesn't output From separators, which renders it useless
Date: Fri, 20 Mar 2009 12:13:05 +0100
To: Reinier Post via RT <bug-Mail-Box [...] rt.cpan.org>
From: Mark Overmeer <mark [...] overmeer.net>
* Reinier Post via RT (bug-Mail-Box@rt.cpan.org) [090320 11:03]:
> Fri Mar 20 07:03:41 2009: Request 44439 was acted upon.
> Transaction: Ticket created by rpost
> Queue: Mail-Box
> Subject: the takemail script doesn't output From separators
> Broken in: 2.087
> Severity: Important
> Owner: Nobody
> Requestors: reinpost@win.tue.nl
> Status: new
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=44439 >
>
> Looking for a script to scan through my Unix mbox files and return an
> mbox file with all messages meeting some criterion (e.g. from within the
> last month), I found Mail::Box and the takemail script, which looks like
> it's designed to do that. But it doesn't: it forgets to put the
> From-lines on the messages in the output, with the consequence that the
> output isn't a mbox file.
[Let's keep this in English, for the record.]

The takemail script was contributed a long time ago (somewhere in 2002), so I do not know the ins and outs of it. It seems (from the names of the parameters) that the author used MailDir. It probably depends on how you are calling it. Can you give me an example?

You can easily achieve the same thing in Mail::Box directly, without a script. That should work, because there are regression tests to protect that functionality.

    # untested
    use Mail::Box::Manager;

    my $mgr = Mail::Box::Manager->new(...);
    my $in  = $mgr->open($infn);
    my $out = $mgr->open($outfn, access => 'w', create => 1);
    foreach my $msg ($in->messages)
    {   if(...some condition...)
        {   # copyMessage() takes the destination folder first, then the message(s)
            $mgr->copyMessage($out, $msg);
        }
    }

-- 
Regards,
MarkOv
------------------------------------------------------------------------
Mark Overmeer MSc                 MARKOV Solutions
Mark@Overmeer.net                 solutions@overmeer.net
http://Mark.Overmeer.net          http://solutions.overmeer.net
Subject: Re: [rt.cpan.org #44439] the takemail script doesn't output From separators, which renders it useless
Date: Fri, 20 Mar 2009 13:34:25 +0100
To: Mark Overmeer via RT <bug-Mail-Box [...] rt.cpan.org>
From: rp [...] win.tue.nl (Reinier Post)
Oops, *that* is fast!!! So never mind the mail I just sent :)
-- 
Reinier
The attached script leaves the output file completely empty. The commands have been simplified from the takemail script, which *does* write the output, except the From separator lines. Tested with Perl 5.10, both with 2.087 on Cygwin and with 2.088 on Linux.
#!/usr/bin/env perl
use strict;
use warnings;
use Mail::Box::Manager;

warn "$Mail::Box::VERSION\n";

my $mgr = new Mail::Box::Manager;
my $arg = shift(@ARGV);
my $inbox  = $mgr->open(folder => $arg, access => 'r');
my $outbox = $mgr->open(folder => "out-$arg", access => 'w', create => 1);

$mgr->copyMessage($outbox, $_) for $inbox->messages;
$mgr->closeAllFolders;
Another version. Note that the messages *are* being copied, but the close() (implicit now) issues a reset which clears them (as perl -d shows). Why?!
#!/usr/bin/env perl
use strict;
use warnings;
use Mail::Box::Manager;

my $mgr = Mail::Box::Manager->new;
my $arg = shift(@ARGV);
my $inbox  = $mgr->open(folder => $arg, access => 'r');
my $outbox = $mgr->open(folder => "out-$arg", access => 'w', create => 1, save_on_exit => 1);

$mgr->copyMessage($outbox, $_) for $inbox->messages;
warn scalar($outbox->messages) . " messages have been copied, but they will be cleared when the file is closed\n";
Subject: Re: [rt.cpan.org #44439] the takemail script doesn't output From separators, which renders it useless
Date: Mon, 23 Mar 2009 11:53:12 +0100
To: Reinier Post via RT <bug-Mail-Box [...] rt.cpan.org>
From: Mark Overmeer <mark [...] overmeer.net>
* Reinier Post via RT (bug-Mail-Box@rt.cpan.org) [090320 20:16]:
> Queue: Mail-Box
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=44439 >
>
> The attached script leaves the output file completely empty.
> The commands have been simplified from the takemail script,
> which *does* write the output, except the From separator lines.
Simple script, hard bug ;-b

The script does not fail if you write with 'rw' or 'a' mode, only if you use 'w'. The problem is caused by the reopening of the folder after it has been written.

In Mail/Box/File.pm writeMessages() calls
    $self->parser->restart()
which closes the folder file and reopens it again... with the same mode as it was opened originally. So, if the file was opened with 'w', it will be reopened with 'w'... which implies that the file is emptied again.

Probably, the best solution is not to restart, for the simple reason that the folder "write" is usually the last action on the folder, before close.

-- 
MarkOv
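The truncation described above can be reproduced without Mail::Box at all. The following is a minimal sketch in plain Perl, not the Mail::Box code: the helper names are hypothetical, and it assumes that the folder access modes 'w', 'a' and 'rw' correspond roughly to Perl's open modes '>', '>>' and '+<'.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: reopen an already written file with the given mode,
# imitating a "restart" done as close(); open(). Returns the file size
# in bytes after the reopen.
sub reopen_with_mode {
    my ($path, $mode) = @_;
    open(my $fh, $mode, $path) or die "cannot reopen $path: $!";
    close($fh) or die "cannot close $path: $!";
    return (stat $path)[7];
}

# Hypothetical helper: write one small message to a fresh temporary file,
# then reopen that file with $mode and report what is left of it.
sub write_then_reopen {
    my ($mode) = @_;
    my ($tmp, $path) = tempfile(UNLINK => 1);
    close($tmp) or die "cannot close $path: $!";

    open(my $out, '>', $path) or die "cannot write $path: $!";
    print {$out} "From someone Fri Mar 20 12:00:00 2009\nSubject: test\n\nbody\n\n";
    close($out) or die "cannot close $path: $!";

    return reopen_with_mode($path, $mode);
}

# Assumed mapping of access modes: 'w' => '>', 'a' => '>>', 'rw' => '+<'
printf "reopened with '>'  : %d bytes left\n", write_then_reopen('>');
printf "reopened with '>>' : %d bytes left\n", write_then_reopen('>>');
printf "reopened with '+<' : %d bytes left\n", write_then_reopen('+<');
```

Only the '>' reopen leaves an empty file, which matches the observation that the failure shows up with access 'w' but not with 'a' or 'rw'.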
> In Mail/Box/File.pm writeMessages() calls
>     $self->parser->restart()
> which closes the folder file and reopens it again... with the same
> mode as it was opened originally. So, if the file was opened with 'w',
> it will be reopened with 'w'... which implies that the file is emptied
> again.
>
> Probably, the best solution is not to restart, for the simple reason
> that the folder "write" is usually the last action on the folder,
> before close.

Patch attached.

Restarting in this way is clearly wrong for files with access mode 'w' (or 'w+'). Perhaps a restart should use seek() instead of close(); open(). But I suppose this will be good enough.

Unfortunately this is a different bug: the problem with the takemail script leaving out its separators on output isn't addressed by it. So that remains open.
Download File.pm.diff-u
application/octet-stream 284b
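For illustration, the seek() idea mentioned above could look like the sketch below. It is plain Perl on a temporary file, not the attached patch and not the Mail::Box parser; the helper name is hypothetical. It only shows that rewinding a read/write handle keeps the data that a close(); open() in 'w' mode would discard.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: "restart" reading by rewinding the existing handle
# instead of closing and reopening the file, then return the first line.
sub rewind_and_read_first_line {
    my ($fh) = @_;
    seek($fh, 0, 0) or die "cannot seek: $!";
    return scalar readline($fh);
}

# File::Temp opens the temporary file for both reading and writing.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "From someone Fri Mar 20 12:00:00 2009\nSubject: test\n\nbody\n\n";

my $first = rewind_and_read_first_line($fh);
print "first line after rewind: $first";
close($fh) or die "cannot close $path: $!";
```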


As to the original problem: I had the strange idea to actually follow the references in the documentation and ran straight into this (in Mail::Message):

    In most cases, the result of "write" will be the same as with
    print(). The main exception is for Mbox folder messages, which
    will get printed with their leading 'From ' line and a trailing
    blank. Each line of their body which starts with 'From '
    will have an '>' added in front.

Aaargh!

The patch for takemail is attached. Yes, that is all it takes.
Download takemail.diff-u
application/octet-stream 344b
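The difference the quoted documentation describes can be sketched in plain Perl. This is an illustration only: it does not use Mail::Message and is not the attached patch, and both helper names, the sample address and the sample text are made up for the example.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper, not part of Mail::Box: render one message the way the
# quoted documentation describes an mbox write, that is with a leading
# 'From ' separator line, '>' in front of every body line that starts with
# 'From ', and a trailing blank line.
sub mbox_form {
    my ($from_line, $head, $body) = @_;
    (my $escaped = $body) =~ s/^From />From /mg;
    return "$from_line\n$head\n$escaped\n";
}

# Hypothetical helper: the plain form, with no separator and no escaping.
sub plain_form {
    my ($head, $body) = @_;
    return "$head\n$body";
}

my $from_line = 'From rp@example.org Fri Mar 20 12:00:00 2009';
my $head      = "From: rp\@example.org\nSubject: test\n";
my $body      = "Hello,\nFrom here on the body continues.\n";

print "--- plain ---\n", plain_form($head, $body);
print "--- mbox ---\n",  mbox_form($from_line, $head, $body);
```

Concatenating several messages in the plain form gives a file with no 'From ' separators between them, which is the symptom reported at the top of this ticket.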


CC: undisclosed-recipients: ;
Subject: Re: [rt.cpan.org #44439] the takemail script doesn't output From separators, which renders it useless
Date: Thu, 26 Mar 2009 15:25:31 +0100
To: Reinier Post via RT <bug-Mail-Box [...] rt.cpan.org>
From: Mark Overmeer <mark [...] overmeer.net>
* Reinier Post via RT (bug-Mail-Box@rt.cpan.org) [090326 09:46]:
> Queue: Mail-Box
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=44439 >
>
> As to the original problem: I had the strange idea to actually follow
> the references in the documentation and ran straight into this (in
> Mail::Message):
>
>     In most cases, the result of "write" will be the same as with
>     print(). The main exception is for Mbox folder messages, which
>     will get printed with their leading 'From ' line and a trailing
>     blank. Each line of their body which starts with 'From '
>     will have an '>' added in front.
>
> Aaargh!
>
> The patch for takemail is attached. Yes, that is all it takes.
Thanks. The patch will be attributed to you in the next release.

-- 
Regards,
MarkOv
Subject: Re: [rt.cpan.org #44439] the takemail script doesn't output From separators, which renders it useless
Date: Thu, 26 Mar 2009 15:27:38 +0100
To: Reinier Post via RT <bug-Mail-Box [...] rt.cpan.org>
From: Mark Overmeer <mark [...] overmeer.net>
* Reinier Post via RT (bug-Mail-Box@rt.cpan.org) [090326 08:49]:
> Queue: Mail-Box
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=44439 >
>
>> In Mail/Box/File.pm writeMessages() calls
>>     $self->parser->restart()
>> ...
>> Probably, the best solution is not to restart, for the simple reason
>> that the folder "write" is usually the last action on the folder,
>> before close.
>
> Patch attached.
>
> Restarting in this way is clearly wrong for files with access mode 'w'
> (or 'w+').
> Perhaps a restart should use seek() instead of close(); open().
The purpose of the "restart()" was to restart the administration of the file. But, for the moment, I prefer my solution: simply remove the restart.

-- 
Regards,
MarkOv
Subject: Re: [rt.cpan.org #44439] the takemail script doesn't output From separators, which renders it useless
Date: Fri, 27 Mar 2009 10:43:48 +0100
To: Mark Overmeer via RT <bug-Mail-Box [...] rt.cpan.org>
From: rp [...] win.tue.nl (Reinier Post)
> > The patch for takemail is attached. Yes, that is all it takes.
> > Thanks. The patch will be attributed to you in the next release.
:-)
fixed in 2.089