
This queue is for tickets about the File-Find-Rule CPAN distribution.

Report information
The Basics
Id: 33790
Status: new
Priority: 0/
Queue: File-Find-Rule

People
Owner: Nobody in particular
Requestors: s.bhooshi [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Problem in globbing dotdirectories.
Date: Tue, 4 Mar 2008 02:20:54 +0000
To: "File::Find::Rule Bugtraq" <bug-File-Find-Rule [...] rt.cpan.org>
From: "Shalom Bhooshi" <s.bhooshi [...] gmail.com>
When globbing for dotfiles and dotdirectories (.*), the top directory is included too, because $_ is set to '.' for it. For example:

    $ perl -MFile::Find::Rule -Wle 'my $f=File::Find::Rule->new; print for $f->name(".*")->in("/tmp")'
    /tmp
    /tmp/foo
    /tmp/bar
    ...

While this might not be much of a problem when aggregating results, it does cause big and hard-to-find problems when pruning and discarding results, because the entire directory tree is likely to be pruned and then no results are returned, going against DWIM and leaving you well surprised.

    #!/usr/bin/perl -Wl
    use strict;
    use File::Find::Rule;

    my @filters;
    push @filters, File::Find::Rule    # don't descend into dotdirs
        ->directory
        ->name( ".*" )
        ->prune->discard;
    push @filters, File::Find::Rule->new;    # process everything else

    print join "\n", my @files = File::Find::Rule
        ->any( @filters )
        ->in("/tmp");

The above script will usually not return anything.

After some debugging, it turns out that $_ (or $shortname) is set to '.' for the root directory in all methods, and subsequent matches are done against $_. I believe this behaviour is present in and inherited from File::Find.

    $ perl -MFile::Find::Rule -Wle 'my$f=File::Find::Rule->new;print for $f->exec(sub{print join " | ", @_})->in("/tmp")'
    . | /tmp | /tmp
    foo | /tmp | /tmp/foo
    bar | /tmp | /tmp/bar

You can circumvent this 'problem' by changing the glob to something like name( '.*?' ) (or, with regexes, name( qr/^\..+/ )), but this really goes against how globs are performed (on Unix at least) and is not DWIM: you have to know that $_ is set differently (and inconsistently) for the top directory, and therefore you have to glob differently.

Consider the following GNU find command, which tries to achieve the same as the above script and works as expected:

    $ find /tmp \( -type d -name ".*" -prune \) -o \( -print \)

I'm not entirely sure that $_ within the various methods ought to be changed from within File::Find (and derived packages) itself. There might be a reason for this that I am not aware of (I personally think it should be set to basename($topdir), or undef at least), and changing it might cause problems with legacy code.

I think a remedy is necessary for the methods of File::Find::Rule at least, for the following reasons:

1. Follow accepted glob 'standards' and, more importantly, maintain DWIM (Do What I Mean), because in most instances it's the non-perl-file-find-rule-savvy user that is left baffled.
2. Bugs arising from this behaviour can be hard to find, especially when chaining (complex) rules (a subjective reason, but compelling nonetheless).

    $ perl -MFile::Find::Rule -Wle 'print $File::Find::Rule::VERSION'
    0.30

    $ perl -V
    Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
      Platform:
        osname=linux, osvers=2.6.15.7, archname=i486-linux-gnu-thread-multi
        uname='linux terranova 2.6.15.7 #1 smp thu jul 12 14:27:56 utc 2007 i686 gnulinux '
        config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=i486-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.8 -Darchlib=/usr/lib/perl/5.8 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.8.8 -Dsitearch=/usr/local/lib/perl/5.8.8 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Uusenm -Duseshrplib -Dlibperl=libperl.so.5.8.8 -Dd_dosuid -des'
        hint=recommended, useposix=true, d_sigaction=define
        usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
        useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
        use64bitint=undef use64bitall=undef uselongdouble=undef
        usemymalloc=n, bincompat5005=undef
      Compiler:
        cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
        optimize='-O2',
        cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include'
        ccversion='', gccversion='4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)', gccosandvers=''
        intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
        d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
        ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
        alignbytes=4, prototype=define
      Linker and Libraries:
        ld='cc', ldflags =' -L/usr/local/lib'
        libpth=/usr/local/lib /lib /usr/lib
        libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
        perllibs=-ldl -lm -lpthread -lc -lcrypt
        libc=/lib/libc-2.6.1.so, so=so, useshrplib=true, libperl= libperl.so.5.8.8
        gnulibc_version='2.6.1'
      Dynamic Linking:
        dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
        cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'

    Characteristics of this binary (from libperl):
      Compile-time options: MULTIPLICITY PERL_IMPLICIT_CONTEXT PERL_MALLOC_WRAP THREADS_HAVE_PIDS USE_ITHREADS USE_LARGE_FILES USE_PERLIO USE_REENTRANT_API
      Built under linux
      Compiled at Dec 4 2007 08:56:39
      @INC:
        /etc/perl
        /usr/local/lib/perl/5.8.8
        /usr/local/share/perl/5.8.8
        /usr/lib/perl5
        /usr/share/perl5
        /usr/lib/perl/5.8
        /usr/share/perl/5.8
        /usr/local/lib/site_perl
        .
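
The top-directory behaviour described in this report can probably be reproduced with core File::Find alone, which may help narrow down where a fix belongs. The following is only a minimal sketch under that assumption: it does not use File::Find::Rule at all, and the helper name walk_skipping and the test tree (.hidden, visible.txt) are made up for illustration. It prunes directories whose short name matches a pattern, once with a pattern equivalent to the ".*" glob and once with the qr/^\..+/ workaround mentioned above.

```perl
use strict;
use warnings;
use File::Find ();
use File::Temp qw(tempdir);
use File::Spec;

# Build a throwaway tree: one dot-directory (with a file in it) and one ordinary file.
# All names here are hypothetical and exist only for this demonstration.
my $top = tempdir( CLEANUP => 1 );
mkdir File::Spec->catdir( $top, '.hidden' ) or die "mkdir: $!";
for my $rel ( '.hidden/secret.txt', 'visible.txt' ) {
    my $path = File::Spec->catfile( $top, split m{/}, $rel );
    open my $fh, '>', $path or die "open $path: $!";
    close $fh;
}

# Walk $dir, pruning every directory whose short name ($_) matches $dot_re.
# Returns the full paths ($File::Find::name) of everything that was not pruned.
sub walk_skipping {
    my ( $dot_re, $dir ) = @_;
    my @seen;
    File::Find::find(
        sub {
            if ( -d $_ && $_ =~ $dot_re ) {
                $File::Find::prune = 1;    # do not descend into this directory
                return;
            }
            push @seen, $File::Find::name;
        },
        $dir
    );
    return @seen;
}

# Equivalent of the ".*" glob: also matches '.', which is what $_ holds
# when the callback runs for the top directory itself.
my @naive = walk_skipping( qr/^\./, $top );

# The workaround from the report: require at least one character after the dot.
my @fixed = walk_skipping( qr/^\..+/, $top );

print "naive: ", scalar(@naive), " entries\n";
print "fixed: $_\n" for sort @fixed;
```

With the first pattern the top directory itself is pruned, so the list comes back empty; with the second, the top directory and visible.txt are returned and only .hidden is skipped. An explicit `$_ ne '.'` check in the callback would be another way to express the same guard.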