
This queue is for tickets about the File-SearchPath CPAN distribution.

Report information
The Basics
Id: 97617
Status: rejected
Priority: 0/
Queue: File-SearchPath

People
Owner: Nobody in particular
Requestors: meir [...] guttman.co.il
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Search criteria to be a RegEx
Date: Wed, 30 Jul 2014 13:00:34 +0300
To: bug-File-SearchPath [...] rt.cpan.org
From: Meir Guttman <meir [...] guttman.co.il>
Dear Tim,

I would like to suggest two additions to File::SearchPath, in the form of two "hash-like" options:

- 'ext' => <file-extension-string-to-search-for>
- 'qr' => <a RegEx variable to use in searching>

Were I a more experienced Perl'er, I would gladly implement this and submit a patch. Unfortunately I don't feel qualified enough... :(

. Distribution name and version: File::SearchPath ver. 0.06
. Perl version: 5.18.2 (but IMHO not relevant)
. Operating system: Windows 7 Pro 64-bit Eng. (again, seems not to be relevant)
. Severity: Enhancement request

Meir
From: duff [...] pobox.com
On Wed Jul 30 06:03:10 2014, meir@guttman.co.il wrote:
> Dear Tim,
> I would like to suggest two additions to File::SearchPath: adding two
> "Hash-like options":
> - 'ext' => <file-extension-string-to-search-for>
> - 'qr' => <a RegEx variable to use in searching>
At first blush, these look like they might be good ideas, but I have some questions:

How would these interact with the filename specified? Would

    searchpath('foo', ext => '.pl');

look for "foo" and "foo.pl" (I would guess that this is what's desired) or only "foo.pl"? Also, should it be ext => '.pl' or ext => 'pl' (with a period automatically put between the filename and the "extension")? Also, what's the benefit of the option when you can just say

    searchpath('foo.pl'); # ???

Maybe an option to specify multiple filenames to search for would be a better way to satisfy the need that this option is trying to fill? Perhaps something like

    searchpath([ 'foo', 'foo.pl' ]); # ???

WRT using a regex ... it seems to me that for this to be useful it would have to replace the filename argument (otherwise, what is searchpath("foo", qr => qr/blah/) to mean?). Perhaps there should be some logic to check the type of the first arg and do something different if it's a regex:

    searchpath(qr/foo.*/); # finds foo, foo.pl, foo.py, libfoo.so, etc.

Or a way to specify that the "filename" is a regex:

    searchpath("foo.*", isregex => 1); # same as above.

Any comments or thoughts?

-Scott
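A minimal sketch of the multiple-filename idea, for illustration only. `first_existing` is a hypothetical helper written for this note and is not part of File::SearchPath. It assumes that directory order takes precedence over name order, as in an ordinary path search, and it uses only core modules.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (not the File::SearchPath API): look for any of
# several candidate names in an ordered list of directories.
# Assumption: directory order takes precedence over name order.
sub first_existing {
    my ($names, @dirs) = @_;
    for my $dir (@dirs) {
        for my $name (@$names) {
            my $path = File::Spec->catfile($dir, $name);
            # One existence test per candidate; the directory is never listed.
            return $path if -e $path;
        }
    }
    return;    # nothing found: undef in scalar context
}

# Usage: look for 'foo' or 'foo.pl' in the directories of PATH.
my $found = first_existing(['foo', 'foo.pl'], File::Spec->path);
print defined $found ? "found $found\n" : "not found\n";
```

Because every candidate is checked with a single `-e` test, this keeps the cost profile of the existing one-name search while covering the 'foo' versus 'foo.pl' case.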
Subject: RE: [rt.cpan.org #97617] Search criteria to be a RegEx
Date: Sun, 18 Jan 2015 13:59:18 +0200
To: bug-File-SearchPath [...] rt.cpan.org
From: Meir Guttman <meir [...] guttman.co.il>
> -----Original Message-----
> From: Jonathan Scott Duff via RT [mailto:bug-File-SearchPath@rt.cpan.org]
> Sent: Friday, 16 January 2015 20:00
> To: meir@guttman.co.il
> Subject: [rt.cpan.org #97617] Search criteria to be a RegEx

> How would these interact with the filename specified? Would
>
>     searchpath('foo', ext => '.pl');
>
> look for "foo" and "foo.pl" (I would guess that this is what's desired) or
> only "foo.pl"?

My intention was just 'foo.pl'.

> Also, should it be ext => '.pl' or ext => 'pl' (and a
> period is automatically put between the filename and the "extension")?

I would vote for the latter.

> Also, what's the benefit of the option when you can just say
>
>     searchpath('foo.pl'); # ???

Indeed, no benefit, if this works. But then, why do we need the 'exe' => 1 clause? Can't we just do:

    searchpath('foo.exe');

> Maybe an option to specify multiple filenames to search for would be a
> better way to satisfy the need that this option is trying to fill?
> Perhaps something like
>
>     searchpath([ 'foo', 'foo.pl' ]); # ???

That can work of course, but I am not sure that the complexity is justified. Aren't we better off with the RegEx option or, when this is not appropriate, calling 'searchpath' multiple times and appending the results to an array?

> WRT using a regex ... it seems to me that for this to be useful it would
> have to replace the filename argument (otherwise, what is
> searchpath("foo", qr => qr/blah/) to mean?), perhaps there should be some
> logic to check the type of the first arg and do something different if
> it's a regex:
>
>     searchpath(qr/foo.*/); # finds foo, foo.pl, foo.py, libfoo.so, etc.
>
> Or a way to specify that the "filename" is a regex:
>
>     searchpath("foo.*", isregex => 1); # same as above.

I imagined something like:

    my $criteria = qr/foo_\d\d\.pl/;
    searchpath($criteria, isregex => 1); # finds foo_00.pl, foo_01.pl, etc.

Regards,
Meir
On Sun Jan 18 07:22:53 2015, meir@guttman.co.il wrote:
> > How would these interact with the filename specified? Would
> >
> >     searchpath('foo', ext => '.pl');
> >
> > look for "foo" and "foo.pl" (I would guess that this is what's
> > desired) or only "foo.pl"?
>
> My intention was just 'foo.pl'
>
> > Also, should it be ext => '.pl' or ext => 'pl' (and a
> > period is automatically put between the filename and the
> > "extension")?
>
> I would vote for the latter.

I would support both (it's easy to see if you have a leading . or not). See File::Basename documentation which uses ".pl".

> > Also, what's the benefit of the option when you can just say
> >
> >     searchpath('foo.pl'); # ???
>
> Indeed, no benefit, if this works. But then, why do we need the 'exe'
> => 1 clause? Can't we just do:
>
>     searchpath('foo.exe');

On Unix, executable status is not indicated via a suffix.

> > Maybe an option to specify multiple filenames to search for would be
> > a better way to satisfy the need that this option is trying to fill?
> > Perhaps something like
> >
> >     searchpath([ 'foo', 'foo.pl' ]); # ???
>
> That can work of course, but I am not sure that the complexity is
> justified? Aren't we better off with the RegEx option or when this is
> not appropriate, calling 'searchpath' multiple times and attaching the
> results to the array?
>
> > WRT using a regex ... it seems to me that for this to be useful it
> > would have to replace the filename argument (otherwise, what is
> > searchpath("foo", qr => qr/blah/) to mean?), perhaps there should be
> > some logic to check the type of the first arg and do something
> > different if it's a regex:
> >
> >     searchpath(qr/foo.*/); # finds foo, foo.pl, foo.py, libfoo.so, etc.
> >
> > Or a way to specify that the "filename" is a regex:
> >
> >     searchpath("foo.*", isregex => 1); # same as above.
>
> I imagined something like:
>
>     my $criteria = qr/foo_\d\d\.pl/;
>     searchpath ($criteria, isregex => 1); # finds foo_00.pl, foo_01.pl, etc.

There is no need for the "isregex" flag as you can check that you have been given a regexp object instead of a plain file name. I wouldn't worry about people putting a glob into the string -- just support the regexp object. It should return the first matching file in scalar context and all matching files in list context.

--
Tim Jenness
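The behaviour described above (detect a regexp object, return the first match in scalar context and all matches in list context) could be sketched roughly as follows. `find_in_dirs` is a hypothetical name used for illustration and is not the File::SearchPath implementation. The sketch assumes an unblessed `qr//` object, for which `ref` returns 'Regexp', and it restricts regexp matches to plain files.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical sketch (not the File::SearchPath API): accept either a plain
# file name or a qr// object as the first argument.
sub find_in_dirs {
    my ($what, @dirs) = @_;
    my @hits;
    for my $dir (@dirs) {
        if (ref($what) eq 'Regexp') {
            # Regexp object: every entry of the directory must be examined.
            opendir(my $dh, $dir) or next;
            push @hits,
                map  { File::Spec->catfile($dir, $_) }
                grep { $_ =~ $what && -f File::Spec->catfile($dir, $_) }
                readdir $dh;
            closedir $dh;
        }
        else {
            # Plain name: a single existence test per directory.
            my $path = File::Spec->catfile($dir, $what);
            push @hits, $path if -e $path;
        }
        # Scalar context: stop at the first directory that yields a match.
        # Within that directory, the order is whatever readdir returned.
        return $hits[0] if @hits && !wantarray;
    }
    return wantarray ? @hits : undef;
}

# Usage: the calling context selects "first match" or "all matches".
my $first = find_in_dirs(qr/^foo_\d\d\.pl$/, File::Spec->path);
my @all   = find_in_dirs(qr/^foo_\d\d\.pl$/, File::Spec->path);
```

The pattern in the usage lines is anchored with `^` and `$`; an unanchored `qr/foo_\d\d\.pl/` would also match names such as `xfoo_00.plx`.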
Subject: RE: [rt.cpan.org #97617] Search criteria to be a RegEx
Date: Sun, 18 Jan 2015 18:05:43 +0200
To: bug-File-SearchPath [...] rt.cpan.org
From: Meir Guttman <meir [...] guttman.co.il>
Dear Tim,

You are obviously more knowledgeable and more experienced than I am, and your arguments here are very convincing. So my suggestion is that you take whatever ideas I presented here, and do whatever you consider best.

And thank you again for a very useful package.

Meir
From: duff [...] pobox.com
On Sun Jan 18 10:25:53 2015, TJENNESS wrote:
> There is no need for the "isregex" flag as you can check that you have
> been given a regexp object instead of a plain file name. I wouldn't
> worry about people putting a glob into the string -- just support the
> regexp object. It should return the first matching file in scalar
> context and all matching files in list context.
What does it mean to be the "first matching file"?

Under the current implementation, there is one file name and an ordered list of directories. Thus, "first" makes perfect sense because we look for one file name in each directory in the order specified. But, if we allow for regex filename matching, then "first" becomes ambiguous because in any given directory there may be several matching filenames. For one of them to be first implies that there must be some definite order *within* the directory. But what should that order be? For instance, readdir() will return entries in whatever order they are stored in the directory (which may be roughly the order that they were created), while higher level tools (such as the ls command) handily sort the entries alphabetically. So ... what should "first" mean?

Also, I'll note that adding an option to match filenames based on a regex changes the search significantly. Under the current implementation we need to look at no more than one file per directory (the filename we're searching for). If regex matching is allowed, we must generate a list of *ALL* files within the directory and then apply the regex to each item in that list. This will impact the execution time when the directories contain many files.

-Scott
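One possible answer to the ordering question, sketched under the assumption that plain string order is acceptable: sort the matching names so that "first" no longer depends on how the filesystem stores its entries. `sorted_matches` is a hypothetical helper, not part of File::SearchPath.

```perl
use strict;
use warnings;

# Hypothetical helper: names in one directory that match a pattern,
# returned in string order so that "first" has a defined meaning.
sub sorted_matches {
    my ($dir, $re) = @_;
    opendir(my $dh, $dir) or return;
    # readdir hands back entries in filesystem order, which is not specified.
    my @names = grep { $_ =~ $re } readdir $dh;
    closedir $dh;
    my @sorted = sort @names;    # default sort: string comparison
    return @sorted;
}

# Usage: the first match in the current directory, if any.
my ($first) = sorted_matches('.', qr/^foo/);
print defined $first ? "first match: $first\n" : "no match\n";
```

Sorting fixes the order, but the whole directory still has to be read, so the cost concern raised in the second paragraph remains.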
On Mon Jan 19 02:29:58 2015, duff wrote:
> Also, I'll note that adding an option to match filenames based on a
> regex changes the search significantly. Under the current
> implementation we need only look at most one file per directory (the
> filename we're searching for). If regex matching is allowed, we must
> generate a list of *ALL* files within the directory and then apply the
> regex to each item in that list. This will impact the execution time
> when the directories contain many files.

Right. There is a huge difference between "-e $file" and readdir(). Maybe regexp should not be allowed but a glob string should? Either way the performance hit should be documented and people using wildcards should expect performance issues.

As for "what does first mean", just document to say that in scalar context you'll get a file that matches the regexp/glob but there is no guarantee as to the order that the search will be done in a particular directory.

It would help if we had a compelling use case for the change.

--
Tim
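For comparison, the glob-string alternative could look roughly like the sketch below, which uses the core File::Glob module. `glob_in_dirs` is a hypothetical name and not part of File::SearchPath. The sketch assumes that directory names contain no glob metacharacters, and it filters the results with `-e` because `bsd_glob` with its default flags returns a wildcard-free pattern unchanged even when no such file exists.

```perl
use strict;
use warnings;
use File::Glob ();
use File::Spec;

# Hypothetical sketch: expand a glob pattern in each directory in turn.
# Assumption: directory names contain no glob metacharacters.
sub glob_in_dirs {
    my ($pattern, @dirs) = @_;
    my @hits;
    for my $dir (@dirs) {
        my $full = File::Spec->catfile($dir, $pattern);
        # Keep only paths that exist, since a pattern without wildcards
        # comes back as-is under the default flags.
        push @hits, grep { -e $_ } File::Glob::bsd_glob($full);
    }
    return @hits;
}

# Usage: everything named foo.<something> along PATH.
my @found = glob_in_dirs('foo.*', File::Spec->path);
print "$_\n" for @found;
```

Expanding a wildcard still requires reading the directory, so this changes the syntax offered to callers rather than the cost relative to a plain `-e` test.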
Subject: RE: [rt.cpan.org #97617] Search criteria to be a RegEx
Date: Tue, 20 Jan 2015 13:51:25 +0200
To: bug-File-SearchPath [...] rt.cpan.org
From: Meir Guttman <meir [...] guttman.co.il>
Hi all!

As the original poster of these so-called "enhancements", I would like to retract them.

I am now convinced that this module provides a path traversal that searches for a single file and retrieves its first occurrence, without any doubt about what 'first' means. Typical usage is a search for an executable such that a personal version will supersede a public one. And this module does this, period.

There are plenty of tree-traversal modules in which supporting RegEx and wildcards is a must. So we don't need one more at the price of uncertainty about its usage ("What does 'first' mean?") and a performance penalty.

Regards to all who contributed their time and effort to convince me of my folly... ;)

Meir
After some discussion it has been decided that a regular expression option is not within the remit of the module and the original submitter has pulled the request.