Subject: Possible better handling of "From" dates
I've been parsing some Mozilla Thunderbird mailbox files. Some of these
files appear to use different formats for the "From " separator lines, so
all mails get skipped in some mailboxes.
Example lines:
From - Wed May 04 09:13:23 2005
From - Mon, 17 Jan 2005 11:55:50
The first of these lines works as-is; a mailbox that uses the second
style is reported as containing 0 mails.
I understand the mailbox format probably varies more than a bit, but
here's a patch that works for both the styles I'm dealing with:
===
97c97
< my $from_date = qr/^From (.*)(\w+,\s*\d+\s*\w+\s*\d+\s*\d+:\d+:\d+|\w+\s*\w+\s*\d+\s*\d+:\d+:\d+\s*\d+)\015?$/;
---
> my $from_date = qr/^From (.*)\d{4}\015?$/;
===
yes, my version's got a _very_ ugly regex :-).
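For what it's worth, here is a rough sketch of the same pattern spelled out with the /x modifier so that each date style sits on its own line. It is only meant as an illustration: the names $time, $old_from_date and @samples are mine and not from the module, and I have only tried it against the two example lines above. It also shows why the second style fails with the current pattern, which expects the line to end in a four-digit year.

```perl
use strict;
use warnings;

# Pattern from the current source: the line must end in a 4-digit year.
my $old_from_date = qr/^From (.*)\d{4}\015?$/;

# Hypothetical helper (not in the module): hh:mm:ss
my $time = qr/\d+:\d+:\d+/;

# Same alternation as in the patch, laid out with /x for readability.
my $from_date = qr{
    ^From\ (.*)
    (
        \w+,\s*\d+\s*\w+\s*\d+\s*$time      # Mon, 17 Jan 2005 11:55:50
      |
        \w+\s*\w+\s*\d+\s*$time\s*\d+       # Wed May 04 09:13:23 2005
    )
    \015?$
}x;

# The two example separator lines from this report.
my @samples = (
    'From - Wed May 04 09:13:23 2005',
    'From - Mon, 17 Jan 2005 11:55:50',
);

for my $line (@samples) {
    my $old = $line =~ $old_from_date ? 'match' : 'no match';
    my $new = $line =~ $from_date     ? 'match' : 'no match';
    print "$line\n  old pattern: $old\n  new pattern: $new\n";
}
```

The old pattern should match only the first sample, while the /x version should match both. It is still quite loose (\w+ will happily accept digits), so it may well need tightening for other mailbox variants.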
Cheers, great module.