Skip Menu |

This queue is for tickets about the Mail-Box CPAN distribution.

Report information
The Basics
Id: 20747
Status: rejected
Priority: 0/
Queue: Mail-Box

People
Owner: Nobody in particular
Requestors: NUFFIN [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 2.065
Fixed in: (no value)



Subject: Mail::Box:::Thread::Mangager vs. invalid(?) References/In-Reply-To fields
Hi, Mail::Box::Thread::Manager parses references/in-reply-to headers assuming the message IDs are valid. However, some messages from nntp.perl.org's news server have headers that contain strings like "< >", where the space matches [^>]*, and is then scrubbed until the message ID is the empty string. Then it explodes when trying to create a dummy message with an empty message ID. I suggest the regex be changed to something like /<\s*(.*?)\s*>/. Attached is a patch with a slightly different regex coding style, more in line with yours. Cheers, -Yuval
Subject: thread_manager.patch
--- Manager.pm.orig 2006-07-28 22:18:38.000000000 +0300 +++ Manager.pm 2006-07-28 22:19:42.000000000 +0300 @@ -295,7 +295,7 @@ my $replies; if(my $irt = $head->get('in-reply-to')) - { for($irt =~ m/\<([^>]*)\>/) + { for($irt =~ m/\<\s*([^>]+)\s*\>/) { my $msgid = $1; $msgid =~ s/\s+//g; $replies = $self->{MBTM_ids}{$msgid} || $self->createDummy($msgid); @@ -304,7 +304,7 @@ my @refs; if(my $refs = $head->get('references')) - { while($refs =~ s/\<([^>]*)\>//s) + { while($refs =~ s/\<\s*([^>]+)\s*\>//s) { my $msgid = $1; $msgid =~ s/\s//gs; push @refs, $self->{MBTM_ids}{$msgid} || $self->createDummy($msgid);
Subject: Re: [rt.cpan.org #20747] Mail::Box:::Thread::Mangager vs. invalid(?) References/In-Reply-To fields
Date: Sat, 29 Jul 2006 11:26:21 +0200
To: יובל קוג'מן via RT <bug-Mail-Box [...] rt.cpan.org>
From: Mark Overmeer <solutions [...] overmeer.net>
* יובל קוג'מן via RT (bug-Mail-Box@rt.cpan.org) [060728 19:26]: Show quoted text
> Fri Jul 28 15:26:45 2006: Request 20747 was acted upon. > Transaction: Ticket created by NUFFIN > Queue: Mail-Box > Subject: Mail::Box:::Thread::Mangager vs. invalid(?) References/In-Reply-To fields > > Mail::Box::Thread::Manager parses references/in-reply-to headers assuming the message IDs > are valid.
I agree with you that the message-ids are treated too sloppy in this. Show quoted text
> I suggest the regex be changed to something like /<\s*(.*?)\s*>/. Attached is a patch with a > slightly different regex coding style, more in line with yours.
According to RFC2822, a message-id shall look like this: message-id = "Message-ID:" msg-id CRLF in-reply-to = "In-Reply-To:" 1*msg-id CRLF references = "References:" 1*msg-id CRLF msg-id = [CFWS] "<" id-left "@" id-right ">" [CFWS] id-left = dot-atom-text / no-fold-quote / obs-id-left id-right = dot-atom-text / no-fold-literal / obs-id-right no-fold-quote = DQUOTE *(qtext / quoted-pair) DQUOTE no-fold-literal = "[" *(dtext / quoted-pair) "]" qtext = NO-WS-CTL / ; Non white space controls %d33 / ; The rest of the US-ASCII %d35-91 / ; characters not including "\" %d93-126 ; or the quote character quoted-pair = ("\" text) / obs-qp obs-id-left = local-part obs-id-right = domain obs-qp = "\" (%d0-127) Which means that blanks are forbidden and the id should be at least \S\@\S so... this will be the new regexp: if(my $irt = $head->get('in-reply-to')) { for($irt =~ m/\<(\S+\@\S+)\>/) Of course, I could be more strict, but I assume that not all applications are strict enough in the characters they use. Fix in next release (soon!) -- Thanks for the report, MarkOv ------------------------------------------------------------------------ Mark Overmeer MSc MARKOV Solutions Mark@Overmeer.net solutions@overmeer.net http://Mark.Overmeer.net http://solutions.overmeer.net
CC: NUFFIN [...] cpan.org
Subject: Re: [rt.cpan.org #20747] Mail::Box:::Thread::Mangager vs. invalid(?) References/In-Reply-To fields
Date: Sat, 29 Jul 2006 13:45:26 +0300
To: Mark Overmeer via RT <bug-Mail-Box [...] rt.cpan.org>
From: Yuval Kogman <nothingmuch [...] woobling.org>
On Sat, Jul 29, 2006 at 05:26:39 -0400, Mark Overmeer via RT wrote: Show quoted text
> Which means that blanks are forbidden and the id should be at least \S\@\S > so... this will be the new regexp: > > if(my $irt = $head->get('in-reply-to')) > { for($irt =~ m/\<(\S+\@\S+)\>/)
You could also use this: m/\<\s*(\S+\@\S+)\s*\>/) and save the \s* stripping later. Thanks! -- Yuval Kogman <nothingmuch@woobling.org> http://nothingmuch.woobling.org 0xEBD27418
Download (untitled)
application/pgp-signature 191b

Message body not shown because it is not plain text.

Subject: Re: [rt.cpan.org #20747] Mail::Box:::Thread::Mangager vs. invalid(?) References/In-Reply-To fields
Date: Sat, 29 Jul 2006 13:05:07 +0200
To: "nothingmuch [...] woobling.org via RT" <bug-Mail-Box [...] rt.cpan.org>
From: Mark Overmeer <mark [...] overmeer.net>
* nothingmuch@woobling.org via RT (bug-Mail-Box@rt.cpan.org) [060729 10:46]: Show quoted text
> On Sat, Jul 29, 2006 at 05:26:39 -0400, Mark Overmeer via RT wrote:
> > Which means that blanks are forbidden and the id should be at least \S\@\S > > so... this will be the new regexp: > > > > if(my $irt = $head->get('in-reply-to')) > > { for($irt =~ m/\<(\S+\@\S+)\>/)
> > You could also use this: > > m/\<\s*(\S+\@\S+)\s*\>/) > > and save the \s* stripping later.
The specs say that blanks are not allowed between < and >, because that may cause folding. Even not tolerated within the possible quoted string left of the '@'. I will also remove that line from the id-handling. -- Regards, MarkOv ------------------------------------------------------------------------ Mark Overmeer MSc MARKOV Solutions Mark@Overmeer.net solutions@overmeer.net http://Mark.Overmeer.net http://solutions.overmeer.net