Subject: Suggested Improvements for Mail::VRFY
Date: Sun, 5 Jul 2015 15:38:31 -0700
To: bug-Mail-VRFY [...] rt.cpan.org
From: Steve James <4stevejames [...] gmail.com>
I am using Mail::VRFY to check email addresses for FreeToastHost, a free
CMS-style system that produces free websites for Toastmasters public
speaking clubs. (Over 10,000 clubs worldwide are registered in the
system.)
The system also provides email distribution lists and forwarding email
lists for the clubs that use it. I have long suspected that we have
outdated and junk email addresses in the system, and I regularly see the
resulting delivery problems in our email logs when people send messages to
the various lists.
I am trying to deploy Mail::VRFY as part of a more rigorous email address
checking strategy. My experience with it has been mixed, and I have had to
make some changes to it and use some workarounds to get the strategy to
work. I am not very steeped in SMTP, but I have tried to learn it quickly
out of necessity during the last year or so.
I have a few suggestions for improvements--I would be more than happy to
collaborate with you to make said improvements:
1. Noting that you state *"Email address syntax checking does not conform
to RFC2822"*, the regular expression used to validate email addresses is
flawed and could be improved with little effort. In my case, it would not
accept an email address with an apostrophe in it (an Irish name). Upon
further review, I also discovered that it allows email addresses whose
username part starts with a period, which is not valid. I believe the
regular expression on this page,
http://www.regular-expressions.info/email.html (the one that matches 99%),
is a better approach, as it is much closer to the standard. I modified my
copy of VRFY.pm to use this regular expression, and I have had fewer
headaches since.
2. The whole "misbehaving" result seems a bit arbitrary and ambiguous. It
appears to be a result that your code infers and assigns itself rather
than one tied to a specific SMTP reply code. What if your code is missing
some of the possible scenarios and therefore misinterprets what is going
on? More context variables (and methods to fetch them) are needed so that
the calling code can assess what is going on, *without* using a debug mode.
4. I would like to see a method to fetch the last SMTP reply code. This
would help a lot, and would provide additional context when needed, beyond
your simple result codes.
5. I would like to be able to specify a maximum number of retries (default
3) for connection attempts. I know from experience that establishing a
connection sometimes takes more than one try. In my email system code, I
have a retry loop for sending email in which I progressively double the
delay (via a sleep call) between connection attempts. This increases the
odds that I will get connected, even during heavy system loads--it works
really well for us. (We handle over 300,000 emails in a given week.) *Since
you do not try to connect more than once, it is possible that the code may
misinterpret a failed connection as a non-working email address.*
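
To make the retry suggestion concrete, here is a minimal sketch of a
connection loop that doubles its delay after each failure. It is written in
Python purely as a language-neutral illustration; it is not code from
Mail::VRFY or from our email system, and the names connect_with_retries,
max_retries, initial_delay and sleep are hypothetical.

```python
import time


def connect_with_retries(connect, max_retries=3, initial_delay=1.0,
                         sleep=time.sleep):
    """Call connect() up to max_retries times, doubling the delay each time.

    connect is any zero-argument callable that returns a connection or
    raises OSError on failure. All names here are hypothetical.
    """
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")

    delay = initial_delay
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            return connect()
        except OSError as exc:
            last_error = exc
            # Only wait if another attempt is still to come.
            if attempt < max_retries:
                sleep(delay)
                delay *= 2  # progressively double the wait

    # Every attempt failed: report that separately from "bad address".
    raise ConnectionError(
        f"gave up after {max_retries} attempts") from last_error
```

A caller could pass something like
`lambda: smtplib.SMTP(host, 25, timeout=10)` as `connect`. The point of the
final exception is that "could not connect after several tries" reaches the
calling code as its own outcome rather than being folded into an "invalid
address" result.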
I would be happy to contribute/collaborate on implementing code for any of
the above--let me know.
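
As an illustration of the syntax check described in suggestion 1, the
sketch below accepts an apostrophe in the username part and rejects a
leading, trailing, or doubled period. It is my own simplified pattern,
written in Python for brevity; it is not the expression from the page
linked above nor the one in VRFY.pm, and it deliberately ignores quoted
local parts and IP-literal domains. The names EMAIL_RE and
is_plausible_address are hypothetical.

```python
import re

# Characters allowed in an unquoted local-part "atom" (RFC 5322 atext).
_ATOM = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+"
# A DNS label: alphanumeric at both ends, hyphens allowed inside.
_LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"

# Local part: atoms separated by single dots.
# Domain: at least two labels separated by dots.
EMAIL_RE = re.compile(rf"{_ATOM}(?:\.{_ATOM})*@(?:{_LABEL}\.)+{_LABEL}")


def is_plausible_address(address):
    """Return True if the whole string has the simplified addr-spec shape."""
    return EMAIL_RE.fullmatch(address) is not None
```

With this pattern, `is_plausible_address("o'brien@example.com")` is True,
while `is_plausible_address(".leading@example.com")` is False. A syntax
check of this kind only filters obviously malformed input; whether the
mailbox exists is still a question for the SMTP conversation.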
Regards,
--
*Steve James*
*FreeToastHost System Developer*
http://support.toastmastersclubs.org
*Email:* 4stevejames@gmail.com