Subject: No way to filter out inaccessible dirs in traverse
Neither traverse nor recurse is useful for analyzing a directory tree that contains
directories which are inaccessible because of permissions.
(To reproduce the problem, call traverse or recurse on a dir which has inaccessible
subdirectories. I did it on my home dir, which is mounted on a separate filesystem and so
has lost+found.)
Traverse case
=============
With traverse, I get an error like "Can't open directory: lost+found" before my
callback is run, so the callback has no chance to ignore this dir by not calling the
continuation. Traverse processes some items before failing; in fact, the failure happens just
before the given dir is to become traverse's first param. Here is an example partial backtrace:
Can't open directory /home/marcink/lost+found: Permission denied at
/usr/local/share/perl/5.10.1/Path/Class/Dir.pm line 197
Path::Class::Dir::children('Path::Class::Dir=HASH(0x9098b68)') called at
/usr/local/share/perl/5.10.1/Path/Class/Dir.pm line 144
Path::Class::Dir::traverse('Path::Class::Dir=HASH(0x9098b68)', 'CODE(0x9096538)',
'ARRAY(0x901bd68)') called at /usr/local/share/perl/5.10.1/Path/Class/Dir.pm line 148
Path::Class::Dir::__ANON__('ARRAY(0x901bd68)') called at
/home/marcink/DEV_hg/mercurial_utils/scripts/../modules/MekkHgRepoSet.pm line 186
Mekk::HgRepoSet::__ANON__('Path::Class::Dir=HASH(0x90920f8)', 'CODE(0x9095a58)',
'ARRAY(0x901bd68)') called at /usr/local/share/perl/5.10.1/Path/Class/Dir.pm line 151
Path::Class::Dir::traverse('Path::Class::Dir=HASH(0x90920f8)', 'CODE(0x9096538)',
'ARRAY(0x901bd68)') called at
/home/marcink/DEV_hg/mercurial_utils/scripts/../modules/MekkHgRepoSet.pm line 187
It is OK to fail on an inaccessible dir, but it is not OK that I can't ignore it.
Possible solution
-----------------
Do not call $dir->children on a directory before visiting it; instead, do it as the first
step of the continuation routine. Something like (untested):
    sub traverse {
        my $self = shift;
        my ($callback, @args) = @_;
        # DO NOT DO IT # my @children = $self->children;
        return $self->$callback(
            sub {
                my @inner_args = @_;
                # ADD IT
                my @children = $self->children;
                return map { $_->traverse($callback, @inner_args) } @children;
            },
            @args
        );
    }
Then the traverse callback would be able to test whether a dir is readable and prune it
(by not calling the continuation) if it is not.
Delaying the children call may also have the bonus of seeing filesystem changes made by the
callback (for example, when creating a directory tree with traverse).
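
A minimal sketch of how such a callback could look, assuming children are listed lazily as proposed above. The names traverse_lazy and readable_entries are hypothetical and exist only for this illustration; the lazy variant is installed under a separate name so the stock traverse is left alone. Also untested against real-world trees:

```perl
use strict;
use warnings;
use Path::Class qw(dir);

# Hypothetical lazy variant of traverse: children() runs only when the
# callback actually calls the continuation.
sub Path::Class::Dir::traverse_lazy {
    my $self = shift;
    my ($callback, @args) = @_;
    return $self->$callback(
        sub {
            my @inner_args = @_;
            my @children = $self->children;    # listed only on descent
            return map { $_->traverse_lazy($callback, @inner_args) } @children;
        },
        @args
    );
}

# Files have no children, so their continuation returns an empty list.
sub Path::Class::File::traverse_lazy {
    my $self = shift;
    my ($callback, @args) = @_;
    return $self->$callback(sub { () }, @args);
}

# Collect the paths of all entries, skipping dirs we cannot read or enter.
sub readable_entries {
    my ($root) = @_;
    return dir($root)->traverse_lazy(sub {
        my ($entry, $cont) = @_;
        # Prune: report nothing and never call $cont, so children() is not run.
        return () if $entry->is_dir && !(-r "$entry" && -x "$entry");
        return ("$entry", $cont->());
    });
}
```

Calling readable_entries('/home/marcink') would then skip lost+found instead of dying on it, because the permission check happens before any attempt to open the directory.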
Recurse case
============
->recurse has a similar problem for a different reason: the children call is delayed until
after the callback, but there is no way to avoid it. And even without the permissions problem,
recurse really lacks something like $File::Find::prune=1 ("please don't analyze the current
dir's subdirectories").
Possible solution
-----------------
Allow users to prune from the callback. One possibility is to handle a special return value
(currently the return values of the recurse callback are ignored), so for example the user could
    return Path::Class::PRUNE;
from the recurse callback. Of course it can't work in postorder, but neither does it in
File::Find. So introduce some such constant, and instead of the current
    $callback->($dir);
    unshift @queue, $dir->children;
try
    my $ret = $callback->($dir);
    unless( ($ret||'') eq PRUNE ) {
        unshift ...
    }
(and a similar change for the push case)
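
To illustrate the idea outside of Path::Class itself, here is a small standalone sketch of a depth-first preorder walk that honours such a return value. The function names recurse_with_prune and visit_except, and the string value chosen for PRUNE, are assumptions made for this example only; this is not the module's actual code:

```perl
use strict;
use warnings;
use Path::Class qw(dir);

# Hypothetical sentinel; Path::Class::PRUNE is only the name proposed above.
use constant PRUNE => 'Path::Class::PRUNE';

# Depth-first preorder walk (the "unshift" case). Every entry is passed to
# the callback; a dir's children are queued unless the callback returned PRUNE.
sub recurse_with_prune {
    my ($root, $callback) = @_;
    my @queue = (dir($root));
    while (@queue) {
        my $entry = shift @queue;
        my $ret = $callback->($entry);
        next unless $entry->is_dir;
        next if ($ret || '') eq PRUNE;    # skip this dir's subtree
        unshift @queue, $entry->children;
    }
    return;
}

# Example callback: visit everything except the contents of dirs
# named $skip_name, and return the basenames that were visited.
sub visit_except {
    my ($root, $skip_name) = @_;
    my @visited;
    recurse_with_prune($root, sub {
        my ($entry) = @_;
        push @visited, $entry->basename;
        return PRUNE if $entry->is_dir && $entry->basename eq $skip_name;
        return;
    });
    return @visited;
}
```

With this, visit_except($home, 'lost+found') would still report lost+found itself but never call children on it, so the permission error cannot occur.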
PS If you accept any of these ideas but would like to get them as a patch or pull request,
let me know.