Bug #4584 for Digest-SHA: addfile() does not check for errors from read()

Fri Dec 05 08:02:13 2003 Guest - Ticket created

Subject:

addfile() does not check for errors from read()

I suggest that you should check the return value from read() like I do in Digest::base::addfile(). I also thought you concluded that it faster to read in smaller chunks?

Fri Dec 05 12:05:48 2003 JCDUQUE [...] cpan.org - Correspondence added

[guest - Fri Dec 5 08:02:13 2003]: Show quoted text

> I suggest that you should check the return value from > read() like I do in Digest::base::addfile().

Are you referring to the latest, version 4.0.4? If so, Mark reverted to using his old addfile(), which, indeed, doesn't have checks. In his Changes file, he writes: 4.0.4 Thu Dec 4 00:07:00 MST 2003 - made Digest::SHA into a self-contained module -- no longer depends on Digest::base -- more convenient for users -- no need to install Digest:: module I had hope that we should ALL use Digest::base to be consistent, but Mark had other plans. I'll release my own version of SHA-256/384/512 which uses Digest::base, and runs faster than Mark's. Show quoted text

> I also thought you concluded that it faster to read > in smaller chunks?

From the benchmarks I got, 4K is the optimum buffer. Either > 4K or < 4K is suboptimum. Best regards, Julius

Fri Dec 05 15:35:06 2003 JCDUQUE [...] cpan.org - Correspondence added

Show quoted text

> > I also thought you concluded that it faster to read > > in smaller chunks?

> > From the benchmarks I got, 4K is the optimum buffer. Either > 4K > or < 4K is suboptimum.

Oh, I get it. Mark's use of 1<<12 is the same as 4*1024, that is, 1 << 12 == 2^12 == 4096 == 4 * 1024 Best regards, Julius

Sat Dec 13 06:27:19 2003 Guest - Correspondence added

From:

mshelor [...] comcast.net

[guest - Fri Dec 5 08:02:13 2003]: Show quoted text

> I suggest that you should check the return value from > read() like I do in Digest::base::addfile(). > > I also thought you concluded that it faster to read > in smaller chunks?

I just now became aware of this trouble ticket, so my latest release (4.0.7) doesn't contain the fix. Regarding the buffer size used in "read": perhaps you're confusing me with someone else. 4096 appears to be a good value on most platforms, so I've always used that value (viz. 1<<12). Julius did a bit of experimentation on the performance of different buffer sizes, and came to the same conclusion. I'm a bit uncomfortable with Croak'ing on a file read error (would prefer Carp'ing), but it's not that big of an issue. So, I'll plan on making the change in my next release. Mark

Sat Dec 13 06:45:09 2003 Guest - Correspondence added

[JCDUQUE - Fri Dec 5 12:05:48 2003]: Show quoted text

> I had hope that we should ALL use Digest::base to be consistent, > but Mark had other plans. I'll release my own version of > SHA-256/384/512 which uses Digest::base, and runs faster than > Mark's.

I did in fact subclass from Digest::base in version 4.0.3. Since the first CPAN tester didn't have Digest::base installed, this caused a failure, which made me reconsider the wisdom of being dependent on other modules. I prefer Digest::SHA to be self-contained, so that users who download it won't have to deal with the frustration of going back to CPAN and installing another module. Regarding performance, the latest version of Digest::SHA is significantly better on Gisle's benchmark. It's now roughly 30% faster than SHA2 on little-endian, and more so on big-endian. It's also much faster than Digest::SHA1 on big-endian, but roughly the same on Intel. Regards, Mark

Sun Dec 14 02:29:00 2003 mshelor [...] cpan.org - Status changed from 'new' to 'resolved'

Sun Dec 14 02:29:00 2003 mshelor [...] cpan.org - Correspondence added 20 min

RT-Send-CC:

jcduque [...] lycos.com

I am marking this ticket as resovled since the addfile() method in Digest::SHA already exhibits correct behavior. There is no requirement in either Digest:: or Digest::base for the addfile() method to check for read errors, or to "croak" if a read error occurs. At present, the decision on whether to croak or not is implementation-dependent. The Digest::SHA1 module does indeed do this, but the Digest:: module does not require it. Here is the relevant documentation on "addfile()" from the Digest module: $ctx->addfile( $io_handle ) The $io_handle is read until EOF and the content is appended to the message we calculate the digest for. The return value is the $ctx object itself. If the Digest author decides to impose such a requirement, I believe that a call to "carp" would be more appropriate. This would prevent interactive applications based on the Digest modules from dying on read errors, and instead would allow the application developer to make a more graceful recovery from the error. Regards, Mark

Sun Dec 14 02:29:01 2003 mshelor [...] cpan.org - Given to MSHELOR