Subject: Document that result order is arbitrary and unpredictable
This is not very important, but having it documented somewhere would be
helpful to users.
A property of the underlying filesystem mechanisms is that the order in
which directory entries are returned is essentially arbitrary.
This arbitrariness, at present, starts at the driver level and propagates
up the entire call chain: through the kernel, through libc's readdir(),
through Perl's readdir() and through File::Find, ending up in
File::Find::Rule.
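To illustrate the lowest Perl-level step in that chain, here is a minimal sketch using only core modules. The scratch directory and the file names are made up for the example. It prints the entries as readdir() hands them back, then the same entries after an explicit sort. Only the second line is predictable across filesystems.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch directory, populated in non-alphabetical order.
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw(pear apple mango)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $dir/$name: $!";
    close $fh;
}

# Read the directory exactly as the filesystem returns it.
opendir my $dh, $dir or die "Cannot open $dir: $!";
my @raw = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
closedir $dh;

# An explicit sort is the only thing that makes the order deterministic.
my @sorted = sort { $a cmp $b } @raw;

print "raw:    @raw\n";       # order depends on the filesystem
print "sorted: @sorted\n";    # always: apple mango pear
```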
This usually doesn't matter to anyone, but it matters when people decide
to do directory comparisons.
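The following is not guaranteed to pass on any filesystem:
    use Test::More;
    use File::Find::Rule;
    # Trailing slashes make rsync mirror the contents of /a into /x and /y.
    system('rsync -avp /a/ /x/');
    system('rsync -avp /a/ /y/');
    # relative() strips the top directory, so only the order can differ.
    is_deeply(
        [ File::Find::Rule->relative->in('/a') ],
        [ File::Find::Rule->relative->in('/x') ],
        'Source vs Copy x'
    );
    is_deeply(
        [ File::Find::Rule->relative->in('/a') ],
        [ File::Find::Rule->relative->in('/y') ],
        'Source vs Copy y'
    );
    is_deeply(
        [ File::Find::Rule->relative->in('/x') ],
        [ File::Find::Rule->relative->in('/y') ],
        'Copy x vs Copy y'
    );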
You could get very lucky and it /could/ pass. However, your chances are
much worse when you're using two different filesystems, or comparing an
internal ordering with that of any real filesystem.
People have a tendency to assume the ordering is alphabetical, since that
is what 'ls' shows (ls sorts its output itself).
A good example of how to start a mess is as follows:
1. Filesystem X
* JFS
* JFS returns readdir() in alphabetical order
2. Filesystem Y
* TMPFS
* TMPFS returns readdir() in REVERSE INSERTION ORDER
In this case, comparing the listing from one directly with the other will
almost /certainly/ not work.
Thus, it should be stated that wherever one is comparing directory
structures, or anything that could derive from a directory structure (for
example:
http://www.nntp.perl.org/group/perl.cpan.testers/2009/08/msg4940109.html
), the results obtained from File::Find::Rule /must/ be sorted before
doing anything practical with them.
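    use Test::More;
    use File::Find::Rule;
    # Trailing slashes make rsync mirror the contents of /a into /x and /y.
    system('rsync -avp /a/ /x/');
    system('rsync -avp /a/ /y/');
    # relative() strips the top directory; sort makes the order deterministic.
    is_deeply(
        [ sort { $a cmp $b } File::Find::Rule->relative->in('/a') ],
        [ sort { $a cmp $b } File::Find::Rule->relative->in('/x') ],
        'Source vs Copy x'
    );
    is_deeply(
        [ sort { $a cmp $b } File::Find::Rule->relative->in('/a') ],
        [ sort { $a cmp $b } File::Find::Rule->relative->in('/y') ],
        'Source vs Copy y'
    );
    is_deeply(
        [ sort { $a cmp $b } File::Find::Rule->relative->in('/x') ],
        [ sort { $a cmp $b } File::Find::Rule->relative->in('/y') ],
        'Copy x vs Copy y'
    );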
For maximum user-friendliness I would have suggested sorting by default,
because it's what people tend to expect, but the penalties of doing that
are too high and it isn't needed in 90% of cases.
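Since sorting by default is off the table, callers who need a stable order can opt in themselves. The sketch below shows one way to do that. sorted_in() is a hypothetical helper, not part of File::Find::Rule, and the two temporary directories with their file names are invented for the demonstration. It assumes File::Find::Rule is installed.

```perl
use strict;
use warnings;
use File::Find::Rule;
use File::Temp qw(tempdir);

# Hypothetical helper (not provided by File::Find::Rule): run a rule against
# one directory and return the matches, relative to that directory, sorted.
# Note that relative() sets a flag on the rule object passed in.
sub sorted_in {
    my ( $rule, $dir ) = @_;
    my @found = sort { $a cmp $b } $rule->relative->in($dir);
    return @found;
}

# Two scratch directories with the same files, created in opposite orders.
my @names = qw(pear apple mango);
my $x     = tempdir( CLEANUP => 1 );
my $y     = tempdir( CLEANUP => 1 );
for my $pair ( [ $x, [@names] ], [ $y, [ reverse @names ] ] ) {
    my ( $dir, $order ) = @$pair;
    for my $name (@$order) {
        open my $fh, '>', "$dir/$name" or die "Cannot create $dir/$name: $!";
        close $fh;
    }
}

my @from_x = sorted_in( File::Find::Rule->file, $x );
my @from_y = sorted_in( File::Find::Rule->file, $y );

print "listings match\n" if "@from_x" eq "@from_y";
```

Keeping the sort in a wrapper like this means only the callers that compare listings pay for it.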
Thanks.