This queue is for tickets about the Email-Simple CPAN distribution.

Report information
The Basics
Id: 26298
Status: resolved
Priority: 0/
Queue: Email-Simple

People
Owner: Nobody in particular
Requestors: ddascalescu+perl [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: Critical
Broken in: 1.999
Fixed in: (no value)



Subject: Very slow parsing
Email::Simple seems very slow at building the headers from a normal, 25-line e-mail header: it clocks at about 5 messages parsed per second.

I tested this on a 3.2GHz Pentium running RHEL, and on a 1.8GHz Pentium M running Windows XP. Benchmark attached.

Hope that helps,
Dan Dascalescu
Subject: Email-Simple.pl
#!/usr/bin/perl -w
use strict;
use Email::Simple;
use Benchmark ':hireswallclock';

# A header-only message: note that no blank line follows the last header.
my $header = [
    'Delivered-To:',
    'Return-Path: <nospam@arcor.de>',
    'Received: from mail-in-04.arcor-online.net (mail-in-04.arcor-online.net [151.189.21.44])',
    ' by mrin1.yahoo.com (8.13.8/8.13.8/y.in) with ESMTP id l3D2Pbrm088969',
    ' for <nospam@yahoo.com>; Thu, 12 Apr 2007 19:25:37 -0700 (PDT)',
    'Received: from mail-in-01-z2.arcor-online.net (mail-in-11-z2.arcor-online.net [151.189.8.28])',
    ' by mail-in-04.arcor-online.net (Postfix) with ESMTP id 495C117F670',
    ' for <nospam@yahoo.com>; Fri, 13 Apr 2007 04:25:34 +0200 (CEST)',
    'Received: from mail-in-07.arcor-online.net (mail-in-07.arcor-online.net [151.189.21.47])',
    ' by mail-in-11-z2.arcor-online.net (Postfix) with ESMTP id 3D183347328',
    ' for <nospam@yahoo.com>; Fri, 13 Apr 2007 04:25:34 +0200 (CEST)',
    'Received: from webmail16 (webmail16.arcor-online.net [151.189.8.70])',
    ' by mail-in-07.arcor-online.net (Postfix) with ESMTP id 30E4D2C29E1',
    ' for <nospam@yahoo.com>; Fri, 13 Apr 2007 04:25:34 +0200 (CEST)',
    'Message-ID: <8531504.1176431134185.JavaMail.ngmail@webmail16>',
    'Date: Fri, 13 Apr 2007 04:25:34 +0200 (CEST)',
    'From: nospam@arcor.de',
    'To: nospam@yahoo.com',
    'Subject: Email::Simple speed test',
    'MIME-Version: 1.0',
    'Content-Type: text/plain; charset=ISO-8859-1',
    'Content-Transfer-Encoding: quoted-printable',
    'X-ngMessageSubType: MessageSubType_MAIL',
    'X-WebmailclientIP: 1.2.3.4',
    'X-Spam-Track: [cat=UK; info=ip:NN<ip=151.189.21.44,policy=n-w0,n100,g0,s>;sv:UK<ip=66.218.86.238>;sg:UK<size=8,cnt=0>]',
];

# Terminate every header line with a newline.
@$header = map { $_ .= "\n" } @$header;

# Time 10 rounds of parsing the message and reading one header.
my $t = timeit(10, sub {
    my $es = Email::Simple->new(join '', @{$header});
    my $s  = $es->header("Subject");
});

print "10 loops of other code took:", timestr($t), "\n";
Subject: Re: [rt.cpan.org #26298] Very slow parsing
Date: Fri, 13 Apr 2007 09:05:15 -0400
To: Dan Dascalescu via RT <bug-Email-Simple [...] rt.cpan.org>
From: Ricardo SIGNES <rjbs [...] cpan.org>
* Dan Dascalescu via RT <bug-Email-Simple@rt.cpan.org> [2007-04-12T23:31:34]
> Email::Simple seems very slow at building the headers from a normal,
> 25-line e-mail header: it clocks at about 5 messages parsed per second.
Interesting! You are correct, except when you say normal. A normal message looks like this:

  Header1: A
  Header2: B

  Body
  EOF

A bodyless message should look like this:

  Head1: A
  Head2: B

  EOF

You're testing this message:

  Head1: A
  Head2: B
  EOF

Your test on my laptop shows about what you said:

  10 loops of other code took:2.52888 wallclock secs ( 2.51 usr + 0.00 sys = 2.51 CPU) @ 3.98/s (n=10)

If I add a blank line after the last header:

  10 loops of other code took:0.00832796 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)

Feeding this "slow message" to Email::Simple::Header's new method is fast:

  10 loops of other code took:0.0050571 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)

That means that the problem is in Email::Simple::_split_head_from_body, and that it's probably affecting a pretty small subset of people. I'll give it further investigation later today.

-- rjbs
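The three message shapes described above differ only in whether a blank line follows the last header. As a minimal sketch of that distinction, the hypothetical helper below splits a raw message at the first blank line. It is an illustration only: the name is borrowed from the ticket, but this is not Email::Simple's actual `_split_head_from_body`, whose code is not shown here.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT Email::Simple's implementation): split a raw
# message into (head, body) at the first blank line. The pattern has no
# capture groups; the match offsets in @- and @+ are used instead.
sub split_head_from_body {
    my ($text) = @_;

    # Two consecutive line endings (LF or CRLF) mark the end of the head.
    if ($text =~ /(?:\r?\n){2}/) {
        my $head = substr($text, 0, $-[0]);   # text before the separator
        my $body = substr($text, $+[0]);      # text after the separator
        return ($head, $body);
    }

    # No blank line at all: treat the whole input as head, with no body.
    # In this case the head keeps its trailing newline, if it has one.
    return ($text, '');
}

# The three shapes from the ticket: normal, bodyless, and unterminated.
for my $message (
    "Header1: A\nHeader2: B\n\nBody\n",
    "Head1: A\nHead2: B\n\n",
    "Head1: A\nHead2: B\n",
) {
    my ($head, $body) = split_head_from_body($message);
    printf "head=%d bytes, body=%d bytes\n", length($head), length($body);
}
```

Only the third input takes the fallback branch, which corresponds to the message that triggered the slow path in the benchmark.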
I've fixed this in Subversion. The fix will be in 2.000, which should be out in the next two weeks. Basically, we no longer capture an unneeded part in a regex. -- rjbs
From: ddascalescu+perl [...] gmail.com
On Fri Apr 13 09:06:14 2007, RJBS wrote:
> Interesting! You are correct, except when you say normal.
[...]
> You're testing this message:
>
> Head1: A
> Head2: B
> EOF
[...]
> That means that the problem is in
> Email::Simple::_split_head_from_body, and
> that it's probably affecting a pretty small subset of people. I'll
> give it further investigation later today.
I was actually affected by the example at http://search.cpan.org/~cfaber/Net-IMAP-Simple-1.17/lib/Net/IMAP/Simple.pm :-)
The example fetches a bunch of header lines with $imap->top.

Glad it's fixed now,
Dan
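For anyone stuck on an affected release, one possible workaround is to terminate header-only input with a blank line before parsing, since the benchmark in this ticket was fast once that line was present. The sketch below assumes the header lines arrive as an array reference; `header_lines_to_message` is a hypothetical helper name, not part of Email::Simple or Net::IMAP::Simple.

```perl
use strict;
use warnings;

# Hypothetical helper: join header lines into message text that ends with
# the blank line separating the head from the (here empty) body.
sub header_lines_to_message {
    my ($lines) = @_;

    # Make sure every line ends with a newline, then join them.
    my $text = join '', map { /\n\z/ ? $_ : "$_\n" } @$lines;

    # Append the terminating blank line unless one is already there.
    $text .= "\n" unless $text =~ /\n\r?\n\z/;

    return $text;
}

my $message = header_lines_to_message([
    "From: nospam\@arcor.de\n",
    "Subject: Email::Simple speed test\n",
]);
print $message;

# With Email::Simple installed from CPAN, the result can be parsed as in
# the benchmark attached to this ticket:
#   my $es = Email::Simple->new($message);
#   my $subject = $es->header("Subject");
```

The helper leaves input alone if it already ends with a blank line, so it should be safe to apply to complete messages as well.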
Email::Simple 2.0 released -- rjbs