Bug #89207 for Net-SFTP-Foreign: rremove() returns no error if directory doesn't exist

Wed Oct 02 15:30:29 2013 tmetro [...] cpan.org - Ticket created

Subject:

rremove() returns no error if directory doesn't exist

I ran code like: my $sftp = Net::SFTP::Foreign->new($host); $sftp->error and die "$host: failed to connect: " . $sftp->error, "\n"; my $dir = 'testx'; $sftp->rremove($dir); $sftp->error and die "$dir: rmdir failed: " . $sftp->error, "\n"; where directory 'testx' doesn't exist, and no error was returned. rremove() documentation says, "This method tries to recover and continue when some error happens." So the above behavior may be intentional. If the intent is to remove a directory, and it doesn't already exist, having that be a no error condition may be desirable, providing all other failures are reported. The documentation should be clarified. (It would also be nice if you indicated whether this is a method you implemented using several SFTP primitives, or is it a supported SFTP command. I didn't check your source.) (If this is the case, what can be done to intentionally trigger a removal error to test the error handling? A permissions change? I tried having it remove textx/foo after doing a 'chmod 000 testx' and it still failed silently, so I think your error handling is buggy.) Documentation also says, "Returns the number of elements successfully removed." So presumably I could check for a zero return value, but if I know the code is doing the right thing, I won't bother doing that. (I was planning to use stat() to test for the presence of the directory first, to handle that as a special case before calling rremove().)

Wed Oct 02 17:38:47 2013 salva [...] cpan.org - Correspondence added

On Wed Oct 02 15:30:29 2013, TMETRO wrote: Show quoted text

> I ran code like: > > my $sftp = Net::SFTP::Foreign->new($host); > $sftp->error and die "$host: failed to connect: " . $sftp->error, > "\n"; > > my $dir = 'testx'; > $sftp->rremove($dir); > $sftp->error and die "$dir: rmdir failed: " . $sftp->error, "\n"; > > where directory 'testx' doesn't exist, and no error was returned. > > rremove() documentation says, "This method tries to recover and > continue when some error happens." So the above behavior may be > intentional. If the intent is to remove a directory, and it doesn't > already exist, having that be a no error condition may be desirable, > providing all other failures are reported. The documentation should be > clarified. (It would also be nice if you indicated whether this is a > method you implemented using several SFTP primitives, or is it a > supported SFTP command. I didn't check your source.) > > (If this is the case, what can be done to intentionally trigger a > removal error to test the error handling? A permissions change? I > tried having it remove textx/foo after doing a 'chmod 000 testx' and > it still failed silently, so I think your error handling is buggy.)

On recursive methods, the way to handle errors is using the on_error callback. Show quoted text

> > Documentation also says, "Returns the number of elements successfully > removed." So presumably I could check for a zero return value, but if > I know the code is doing the right thing, I won't bother doing that. > (I was planning to use stat() to test for the presence of the > directory first, to handle that as a special case before calling > rremove().)

Wed Oct 02 17:38:47 2013 The RT System itself - Status changed from 'new' to 'open'

Wed Oct 02 17:38:47 2013 salva [...] cpan.org - Status changed from 'open' to 'rejected'

Thu Oct 03 12:40:26 2013 tmetro [...] cpan.org - Correspondence added

From:

tmetro [...] cpan.org

On Wed Oct 02 17:38:47 2013, SALVA wrote: Show quoted text

> On recursive methods, the way to handle errors is using the on_error > callback.

And elsewhere you wrote: Show quoted text

> The methods accepting the on_error handler are those that may perform several > operations that may fail individually without aborting the main method. > Specifically, IIRC, that means recursive and glob'ing methods.

This is not at all apparent from your documentation. So if not a code bug, its a documentation bug. That aside, I'm not following the logic here. You're saying for any method that aggregates together multiple SFTP primitives, you can only catch errors if you use the on_error handler? Error handlers should be optional for users who want more fine-grained control of the error handling. It should give the user the option to intercept and choose to ignore specific errors. If the error handler is not used (or returns whatever the normal return code is for not ignoring an error), then the method by default should do the right thing and trap the error and return an error indication in a way consistent with the rest of the methods in the module. Take rremove() for example: there are really no errors it should be ignoring by default, other than the absence of the directory it was asked to remove, and even that is arguably an error condition. Everything else should cause an error exit from the method. (What errors do you think are suitable to ignore?) The whole freedom you (as a module author) gain from offering an error handler option is that you can implement error handling that does the expected thing for most users, and be confident that for the few who want different behavior, they can use the handler to implement it. If most users need to use the handler to get typically needed behavior, then the implementation is not ideal. It hardly seems likely that no error checking at all should be the default for any aggregate method. (And no, "it is too late to change the module API" is not a good answer. :-) Appropriate sometimes, but if that always applied, modules would never be improved, other than to fix egregious bugs.)

Fri Oct 04 05:22:12 2013 salva [...] cpan.org - Correspondence added

El Jue Oct 03 12:40:26 2013, TMETRO escribió: Show quoted text

> On Wed Oct 02 17:38:47 2013, SALVA wrote:

> > On recursive methods, the way to handle errors is using the on_error > > callback.

> > And elsewhere you wrote:

> > The methods accepting the on_error handler are those that may perform > > several > > operations that may fail individually without aborting the main > > method. > > Specifically, IIRC, that means recursive and glob'ing methods.

> > This is not at all apparent from your documentation. So if not a code > bug, its a documentation bug. > > That aside, I'm not following the logic here. You're saying for any > method that aggregates together multiple SFTP primitives, you can only > catch errors if you use the on_error handler? > > Error handlers should be optional for users who want more fine-grained > control of the error handling. It should give the user the option to > intercept and choose to ignore specific errors. > > If the error handler is not used (or returns whatever the normal > return code is for not ignoring an error), then the method by default > should do the right thing and trap the error and return an error > indication in a way consistent with the rest of the methods in the > module.

That makes sense, it may be useful. On the other hand, I am not sure I like the idea of dual methods that behave differently when "on_error" is passed or not. It doesn't seem obvious. I would have to think about it. Show quoted text

> (And no, "it is too late to change the module API" is not a good > answer. :-) Appropriate sometimes, but if that always applied, modules > would never be improved, other than to fix egregious bugs.)

The API can be changed when there is a good reason, I just refuse to change it for trivial things. I'm leaving this ticket open for now.

Fri Oct 04 05:22:13 2013 salva [...] cpan.org - Status changed from 'rejected' to 'open'

Fri Oct 04 05:22:13 2013 salva [...] cpan.org - Taken

Fri Oct 04 05:23:28 2013 salva [...] cpan.org - Severity Wishlist added

Fri Oct 04 05:23:28 2013 salva [...] cpan.org - Broken in 1.75 deleted

Fri Oct 04 13:40:20 2013 tmetro [...] cpan.org - Correspondence added

From:

tmetro [...] cpan.org

On Fri Oct 04 05:22:12 2013, SALVA wrote: Show quoted text

> El Jue Oct 03 12:40:26 2013, TMETRO escribió:

> > If the error handler is not used (or returns whatever the normal > > return code is for not ignoring an error), then the method by default > > should do the right thing and trap the error and return an error > > indication in a way consistent with the rest of the methods in the > > module.

> > That makes sense, it may be useful. > I would have to think about it.

Thanks for considering it. Show quoted text

> On the other hand, I am not sure I like the idea of dual methods that > behave differently when "on_error" is passed or not. It doesn't seem > obvious.

"Behave differently" is entirely up to the author of the on_error handler. More typically, an API using an error handler subroutine will look at the return code from the handler (if it does return) and use one value to signify proceeding normally with the default error handling behavior, and another value to signify short circuiting. So if I write: $sftp->rremove('testx/foo', on_error => sub { my $sftp = shift; my $obj = shift || {}; my $file = $obj->{filename} || '<unknown>'; warn "$file: " . $sftp->error . "\n"; return 1; } ); my warning will get printed, and normal error handling happens. (In this case, there is no behavior change in rremove().) While if I end that with 'return undef' it will cause the default error code in rremove() to be skipped and the method will return immediately with an error status (whatever is consistent for your API). This sort of an API is quite common. While that's most flexible, it's also OK to support a simpler form, where the user has to throw an exception to bail out. So to get the "normal" behavior I wanted from rremove(), I used: $sftp->rremove('testx/foo', on_error => sub { my $sftp = shift; my $obj = shift || {}; my $file = $obj->{filename} || '<unknown>'; die "$file: " . $sftp->error . "\n"; } ); and that indeed works: testx/foo: Couldn't stat remote link: Permission denied I would be less concerned with the potential for on_error to alter behavior and more concerned that rremove() and other aggregate functions are inconsistent from the rest of your API in that they don't return error conditions by default. Related to this, I noticed a few documentation bugs: Show quoted text

> $sftp->rremove(\@dirs, %opts)

[...] Show quoted text

> on_error => sub { ... } > This callback is called when some error is occurs. The arguments > passed are the $sftp object and the current entry (see ls docs for > more information).

It appears ls() is not documented as accepting on_error. Did you mean find(), which appears to give the most details on on_error? (The docs for glob() also reference ls instead of find().) You might want to flesh out the documentation for on_error in one spot, and consistently point to it from all the others (if the API is identical). (Some methods currently have a one or two sentence definition for on_error.) The definition should also cover what is passed as the 2nd parameter (partially explained under find()), which I had to figure out from a call to Data::Dumper. The structure appears to be similar, but not identical to the structure used by ls(). (I'm not sure what value an attributes object ('a' hash key) offers in the context of a rremove() or glob() error, but I'm assuming it is there just for consistency. I see it is set to undef. If most of your methods using on_error don't need it, might be better to switch to a simple scalar.)