
This queue is for tickets about the Digest-SHA CPAN distribution.

Report information
The Basics
Id: 47768
Status: rejected
Worked: 25 min
Priority: 0/
Queue: Digest-SHA

People
Owner: Nobody in particular
Requestors: justincase [...] yopmail.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Performance compared to Digest::SHA1
For generating SHA-1 digests I've preferred Digest::SHA over Digest::SHA1, because it has seen more recent updates and is more comprehensive. But Gisle recently updated Digest::SHA1, so I ran a benchmark and found Digest::SHA to be less than half as fast:

                        Rate   Digest::SHA::sha  Digest::SHA1::sha1
Digest::SHA::sha    412559/s                 --                -60%
Digest::SHA1::sha1 1037900/s               152%                  --

----

use strict;
use warnings;
use Digest::SHA ();
use Digest::SHA1 ();
use Benchmark qw(cmpthese);

my $key = 1234567890;

cmpthese -1, {
    'Digest::SHA::sha'   => sub { Digest::SHA::sha1($key) },
    'Digest::SHA1::sha1' => sub { Digest::SHA1::sha1($key) },
};
My experience does not agree with yours: Digest::SHA is still considerably faster than Digest::SHA1 on all my platforms. For example, here are the results on Intel/Linux using Gisle's benchmark code:

$ digest-bench Digest::SHA1
419bd6d3ee9022c4c9e221494d757b737bc274f5
33554432/1.64555287361145
Digest::SHA1 2.12 19.45 MB/s

$ digest-bench Digest::SHA
419bd6d3ee9022c4c9e221494d757b737bc274f5
33554432/0.341594934463501
Digest::SHA 5.47 93.68 MB/s

The difference is even more dramatic on an old Mac/PPC.

So, for the time being, I'm rejecting this bug. I take performance very seriously, and strive to make the code as fast as possible within portability limits. If you can supply any more evidence to support your case, I'll be happy to work with you.

Regards, Mark
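For reference, the MB/s figures above come from hashing one large buffer in a single call, so per-call setup cost is negligible and the rate reflects raw transform throughput. A rough Perl sketch of that kind of measurement follows; it is not the actual digest-bench script, and the 32 MB buffer size is simply chosen to mirror the 33554432 bytes shown in the output above:

use strict;
use warnings;
use Time::HiRes ();
use Digest::SHA ();
use Digest::SHA1 ();

# One large buffer, hashed in a single call.
my $data = 'x' x (32 * 1024 * 1024);

for my $impl (['Digest::SHA',  \&Digest::SHA::sha1_hex],
              ['Digest::SHA1', \&Digest::SHA1::sha1_hex]) {
    my ($name, $func) = @$impl;
    my $t0     = Time::HiRes::time();
    my $digest = $func->($data);
    my $secs   = Time::HiRes::time() - $t0;
    printf "%-12s %s  %.2f MB/s\n",
        $name, $digest, (length($data) / (1024 * 1024)) / $secs;
}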
The benchmark you used creates a single digest and measures the throughput on a large piece of data. My benchmark repeatedly creates digests of smaller pieces of data, which is probably just as common a use case (think passwords). Digest::SHA1 seems to be more efficient in this case.
Gisle crafted his benchmark to use large data sets because that is the only case where relative digest performance is significant or even detectable. The time taken to compute the digest of a small password is not noticeable, and is almost certainly smaller than the margin of timing error.

So, even if the use-case of small inputs is far more common, it doesn't matter: the performance issue isn't particularly relevant in that case, and wouldn't likely vary much from implementation to implementation, or even from algorithm to algorithm.

That said, the Digest::SHA module still appears to be faster than Digest::SHA1, even on small inputs. Here's the result I get using your benchmark code on Intel/Linux:

                       Rate  Digest::SHA1::sha1  Digest::SHA::sha
Digest::SHA1::sha1 164549/s                  --              -36%
Digest::SHA::sha   258306/s                 57%                --

Mark
Ubuntu 10.04 64bit, Core i7-2600 (Sandy Bridge)
Digest::SHA 5.80
Digest::SHA1 2.13
gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5.1)

== For small inputs Digest::SHA1 is much faster

use strict;
use warnings;
use Digest::SHA ();
use Digest::SHA1 ();
use Benchmark qw(cmpthese);

my $key = 1234567890;

cmpthese -1, {
    'Digest::SHA::sha'   => sub { Digest::SHA::sha1($key) },
    'Digest::SHA1::sha1' => sub { Digest::SHA1::sha1($key) },
};

                        Rate   Digest::SHA::sha  Digest::SHA1::sha1
Digest::SHA::sha    811471/s                 --                -62%
Digest::SHA1::sha1 2123852/s               162%                  --

== For big inputs (i.e. more than 1 kB) Digest::SHA is faster

use strict;
use warnings;
use Digest::SHA ();
use Digest::SHA1 ();
use Benchmark qw(cmpthese);

my $key = 'x' x (1024*1024*100);

cmpthese -1, {
    'Digest::SHA::sha'   => sub { Digest::SHA::sha1($key) },
    'Digest::SHA1::sha1' => sub { Digest::SHA1::sha1($key) },
};

(warning: too few iterations for a reliable count)
                      Rate  Digest::SHA1::sha1  Digest::SHA::sha
Digest::SHA1::sha1  2.27/s                  --              -27%
Digest::SHA::sha    3.12/s                 38%                --

On Mon Jul 13 03:11:32 2009, MSHELOR wrote:
> The time taken to compute the digest of a small password is not
> noticeable, and is almost certainly smaller than the margin of timing error.
> the performance issue isn't particularly relevant in that case,

Not always:

1) If we write code like this

   cmpthese -1, {
       'Digest::SHA::sha'   => sub { Digest::SHA::sha1($key) for (1..100) },
       'Digest::SHA1::sha1' => sub { Digest::SHA1::sha1($key) for (1..100) },
   };

   there shouldn't be any timing error.

2) Performance on small inputs is important, for example:
   a) if you generate things like rainbow tables when writing security applications;
   b) a real example: I had to write an _integration_ (regression) test for a tree hash based on SHA-256 (CPAN Net::Amazon::TreeHash), so I used small chunks of about 10 bytes, and this test is the slowest part of my test suite. Most of the time is spent in SHA-256.
Also another observation:

1) Digest::SHA is slower than Digest::SHA1 on small inputs.
2) Digest::SHA is faster than Digest::SHA1 on large inputs.

So I might suggest that Digest::SHA's initialization time is something that can be improved. That would be good, because Amazon AWS Glacier now uses a 'TreeHash' algorithm that requires calculating SHA-256 for each 1 MiB chunk. So calculating the tree hash for 1 GB of data needs about 1000 initializations, and for 1 TB about one million.
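For illustration, here is a minimal sketch of the per-chunk hashing pattern described above. It is not Net::Amazon::TreeHash's actual code, just a rough rendering of the chunk-then-combine idea with placeholder data; the point is that every 1 MiB chunk needs its own SHA-256 computation, so any per-call setup cost is paid once per chunk:

use strict;
use warnings;
use Digest::SHA qw(sha256);

my $chunk_size = 1024 * 1024;               # tree hash is built over 1 MiB chunks
my $data       = 'x' x (16 * 1024 * 1024);  # placeholder payload: 16 MiB of dummy data

# Leaf level: one SHA-256 digest per 1 MiB chunk.
my @leaves;
for (my $offset = 0; $offset < length $data; $offset += $chunk_size) {
    push @leaves, sha256(substr $data, $offset, $chunk_size);
}

# Combine adjacent pairs until a single root digest remains;
# an odd leftover digest is carried up unchanged.
while (@leaves > 1) {
    my @next;
    while (@leaves) {
        my $left  = shift @leaves;
        my $right = shift @leaves;
        push @next, defined $right ? sha256($left . $right) : $left;
    }
    @leaves = @next;
}

printf "tree hash of %d chunks: %s\n", length($data) / $chunk_size, unpack 'H*', $leaves[0];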
There are many, many instances of the statement "Module A is faster than Module B on platform X using compiler Y with data Z." Changes made to improve performance in one particular instance run the risk of damaging performance in other instances.

Since you're using gcc, try applying the '-O1 -fomit-frame-pointer' options. I currently restrict these to 'i[3456]86' platforms, but they might improve your situation as well. Once you've more thoroughly analyzed the situation and can suggest specific patches (whether in initialization, transform functions, compiler settings, etc.), I'll be happy to consider integrating them into Digest::SHA.

Regards, Mark
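One way to probe whether per-call setup cost is what separates the two modules on small inputs is to compare the functional interface against a single reused object; Digest::SHA documents that digest() automatically resets the object, so reuse is cheap. This is only a measurement sketch, not a patch or a definitive analysis:

use strict;
use warnings;
use Benchmark qw(cmpthese);
use Digest::SHA qw(sha1);

my $key = 1234567890;

# digest() resets the object, so one object can serve many small inputs.
my $sha = Digest::SHA->new(1);

cmpthese -1, {
    'functional sha1()' => sub { my $d = sha1($key) },
    'reused OO object'  => sub { my $d = $sha->add($key)->digest },
};

If the reused object is markedly faster than the functional call, that would point at per-call setup rather than the transform code itself.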
Ok, thanks. I tried '-O1 -fomit-frame-pointer'; for small inputs I now get 8727/s instead of 8219/s, and Digest::SHA1::sha1 is still 147% faster.

I understand that you need a patch, so since I don't have one, let's leave this in the rejected state for now.