On 11/26/2012 11:27 AM, Mark Overmeer via RT wrote:
>> $ perl -Ilib -wE 'use Mail::Address; for my $s
>> (Mail::Address->parse(q[root@mx03.noris.net <root@mx03.noris.net>])) {
>> say $s->phrase }'
>> root @ mx03 . noris . net
>>
>> IMHO that's still too smart.
>
> As far as I can see, the problem is in line 97
>
> if( s/^("(?:[^"\\]+|\\.)*")\s*// # "..."
> || s/^(\[(?:[^\]\\]+|\\.)*\])\s*// # [...]
> || s/^([^\s()<>\@,;:\\".[\]]+)\s*// <---
> || s/^([()<>\@,;:\\".[\]])\s*//
> )
>
> The whole tokenizer does not follow the spec closely enough (the Mail::Box
> address parser does, but is probably considerably slower)
>
> Should we remove '.' from that regex? Would it break things? At least,
> it does break a regression test.
For my purposes, the \@ must also be removed from that character class.
And yes, it breaks a regression test, but IMHO that one is much more
obscure than my use case :-) (of course, since it's my use case; but
judge for yourself).
But I can't really comment on how it affects the overall parser; I'm
not very familiar with either the code or the relevant RFCs.
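
To make the effect concrete, here is a minimal standalone sketch of just the
tokenizer step, built around the four substitutions quoted above. It is not
the actual Mail::Address code: the tokenize() helper and the $atom_original /
$atom_relaxed names are made up for illustration, and the real parser does
more work after tokenizing.

```perl
use strict;
use warnings;

# Atom pattern as quoted from the ticket (excludes '.' and '@').
my $atom_original = qr/[^\s()<>\@,;:\\".[\]]+/;

# Hypothetical relaxed variant: '.' and '@' are no longer excluded,
# so a bare address stays together as a single token.
my $atom_relaxed = qr/[^\s()<>,;:\\"[\]]+/;

# Illustrative helper (not part of Mail::Address): split $input into
# tokens using the quoted alternation, with a pluggable atom pattern.
sub tokenize {
    my ($input, $atom) = @_;
    my @tokens;
    local $_ = $input;
    s/^\s+//;                                   # drop leading whitespace
    while (length $_) {
        s/^("(?:[^"\\]+|\\.)*")\s*//            # "..."
            || s/^(\[(?:[^\]\\]+|\\.)*\])\s*//  # [...]
            || s/^($atom)\s*//                  # atom
            || s/^([()<>\@,;:\\".[\]])\s*//     # single special character
            or last;                            # nothing matched: give up
        push @tokens, $1;
    }
    return @tokens;
}

my $addr = 'root@mx03.noris.net <root@mx03.noris.net>';

# Original class: the leading address is split at every '@' and '.'.
print join(' | ', tokenize($addr, $atom_original)), "\n";

# Relaxed class: the leading address survives as one token.
print join(' | ', tokenize($addr, $atom_relaxed)), "\n";
# prints: root@mx03.noris.net | < | root@mx03.noris.net | >
```

With the original class the input falls apart into 16 tokens, which matches
the space-separated phrase shown above; with '.' and '@' dropped from the
class, the leading address stays in one piece. Whether the rest of the parser
copes with such tokens is a separate question.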
Cheers,
Moritz