
This queue is for tickets about the File-Copy-Recursive CPAN distribution.

Report information
The Basics
Id: 77370
Status: resolved
Priority: 0/
Queue: File-Copy-Recursive

People
Owner: Nobody in particular
Requestors: Christopher.Knowlton [...] xerox.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Possible bug in File::Copy::Recursive
Date: Tue, 22 May 2012 15:46:26 -0400
To: <bug-File-Copy-Recursive [...] rt.cpan.org>
From: "Knowlton, Christopher" <Christopher.Knowlton [...] xerox.com>
In a VERY blended perl + Hudson + make system we are all done building and we are running a script that "publishes" the build to a release database and stages it in a pre-determined location. The target directories are getting made, but then we get into this mode where a bunch of stat calls are being queued by something and when they get to NFS mount points that are broken (a different problem), they never return from the STAT call.

Could this be File::Copy::Recursive?

This is an excerpt from a stack trace on a red-hat linux64 system running version

    DIR|0775, st_size=4096, ...}) = 0
    [pid 32193] 0.000079 stat64("/apps/scmrel_build/dp/artifacts/x2c/9.0.01/unpublished/DP9.0.01_05.22.2012/spool", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
    [pid 32193] 0.000098 stat64("/vobs/ImagePath/X2C/MainX2C/obj.x2c_L_publish", {st_mode=S_IFDIR|0775, st_size=264, ...}) = 0
    [pid 32193] 0.000178 statfs64("/apps/scmrel_build/dp/artifacts/x2c/9.0.01/unpublished/DP9.0.01_05.22.2012/spool", 84, {???}) = 0
    [pid 32193] 0.001059 stat64("/apps/scmrel_build/dp/artifacts/x2c/9.0.01/unpublished/DP9.0.01_05.22.2012/spool", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
    [pid 32193] 0.000086 open("/proc/mounts", O_RDONLY) = 4
    [pid 32193] 0.000088 fstat64(4, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
    [pid 32193] 0.000109 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffffffff7c38000
    [pid 32193] 0.000051 read(4, "rootfs / rootfs rw 0 0\n/dev/root"..., 4096) = 4072
    [pid 32193] 0.000298 read(4, "/dev/sda4 /home/ade ext3 rw,data"..., 4096) = 4086
    [pid 32193] 0.000261 read(4, "-hosts /net/galvatron/u03 autofs"..., 4096) = 2386
    [pid 32193] 0.000171 read(4, "", 4096) = 0
    [pid 32193] 0.000036 _llseek(4, 0, [0], SEEK_SET) = 0
    [pid 32193] 0.000026 read(4, "rootfs / rootfs rw 0 0\n/dev/root"..., 4096) = 4072
    [pid 32193] 0.000255 read(4, "/dev/sda4 /home/ade ext3 rw,data"..., 4096) = 4086
    [pid 32193] 0.000175 stat64("/apps/thvsw_bin", {st_mode=S_IFDIR|0777, st_size=2560, ...}) = 0
    [pid 32193] 0.000427 stat64("/net/usa0207spduls01.sdsp.mc.xerox.com/BinaryDataFiles", 0xffc4b94c) = ? ERESTARTSYS (To be restarted)
    Process 32178 resumed
    Process 32193 detached

This NFS mount point is broken (("/net/usa0207spduls01.sdsp.mc.xerox.com/BinaryDataFiles", 0xffc4b94c)

CTRL-C pressed to get the ERESTARTSYS to appear and the two process messages from strace.

CALL
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    publish:
         [echo] executing: /opt/rational/clearcase/bin/clearmake BUILD=NIGHTLY -C gnu TARGET=x2c RT9X=y VXWORKS=63 CONFIGURATION=DELPHINUS NO_BUILD_LOG=y publish
         [exec] /opt/rational/clearcase/bin/clearmake -f Makefile.install publish
         [exec] clearmake[1]: Entering directory `/vobs/ImagePath/X2C/MainX2C'
         [exec] **************** publish ******************
         [exec] /vobs/Toolkit/E6System/IntegrationTools/mr_publish_gen -p DP -v 9.0 -m x2c -r /vobs/ImagePath/X2C/MainX2C/obj.x2c_L
    Build was aborted

Where -p , -v, -m are strings identifying module and version data, and -r is the source of the material to be staged out to the NFS staging area. -r happens to be a soft link which points to /vobs/ImagePath/X2C/MainX2C/obj.x2c

mr_publish_gen SOURCE Code Excerpts
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    use strict;
    use ClearCase::ConfigSpecReader;
    use ClearCase::Simplified qw(cleartool);
    use File::Basename;
    use File::Copy::Recursive qw(dircopy fcopy);
    use File::Path qw(mkpath rmtree);
    use File::Spec;
    use File::Temp qw(tempfile :POSIX);
    use Filesys::Df;
    use Filesys::DiskUsage qw(du);
    use FindBin;
    use Getopt::Long;
    use IO::File;
    use Pod::Usage;
    use XML::Simple;
    use lib "$FindBin::Bin/../lib";
    use lib "$FindBin::Bin/../lib/perl";
    use SWRDB::DBIC;

    use constant CMD => basename($0);
    use constant DEBUG => "@{[ CMD ]}: Debug:";
    use constant ERROR => "@{[ CMD ]}: Error:";
    use constant INFO => "@{[ CMD ]}: Info:";
    use constant WARNING => "@{[ CMD ]}: Warning:";
    use constant BUFFER => " " x length(ERROR);
    use constant ADMIN_USER => <redacted>;
    use constant USER => (getpwuid($<))[0] || $ENV{USER} || getlogin;
    use constant ARTIFACT_VALIDATION_FILE => "CD.sdf";

    use vars qw($VERSION $ivfref);

    $File::Copy::Recursive::RMTrgFil = 1;

    ...LOTS OF Simple DBIC DATABASE STUFF THAT SUCCEEDS...

    # module repository directory is now optional...
    if ($moduleRepositoryDirectory) {
        # validate module artifact repository is a symlink and validate if the CD.sdf file exists
        $moduleRepositoryDirectory = -l $moduleRepositoryDirectory ? readlink $moduleRepositoryDirectory : $moduleRepositoryDirectory;
        if ($validateArtifactsRepository) {
            die ERROR." failed to validate artifact repository: \"" . ARTIFACT_VALIDATION_FILE . "\" does not exist in \"$moduleRepositoryDirectory\"\n"
                unless -e "$moduleRepositoryDirectory/" .ARTIFACT_VALIDATION_FILE;
        }
        my $numberOfFilesAndDirsCopied = dirCopy({SRC_DIR => $moduleRepositoryDirectory, DEST_DIR => $repository, CHECK_DISK_SPACE => 1, FOLLOW_SYMLINKS => $followSymlinks});
        die ERROR." failed to deploy artifacts: unable to copy \"$moduleRepositoryDirectory\" to \"$repository\" \n"
            unless $numberOfFilesAndDirsCopied;
        # touch a file to track when artifacts are deployed...could be use outside the scope of this tool
        system("/bin/touch $repository/.deployartifacts");
    }

    sub getDiskSpaceInfo {
        my $method = (caller(0))[3];
        $method =~ s/main:://;
        my $vars = defined $_[0] && UNIVERSAL::isa($_[0], 'HASH') ? shift : { @_ };
        foreach (qw/DF_DIR DU_DIR/) {
            die ERROR." $method: $_ is undefined\n" unless $$vars{$_};
        }
        $$vars{DF_DIR} = -l $$vars{DF_DIR} ? readlink($$vars{DF_DIR}) : $$vars{DF_DIR};
        die ERROR." $method: $$vars{DF_DIR} does not exist\n" unless -d $$vars{DF_DIR};
        die ERROR." $method: $$vars{DU_DIR} does not exist\n" unless -d $$vars{DU_DIR};
        my $ref = df($$vars{DF_DIR});
        my $available = $ref->{bavail};
        my $required = sprintf "%u", (du($$vars{DU_DIR}) / 1024);
        return wantarray ? (($available < $required),$available,$required) : ($available < $required);
    }

    sub dirCopy {
        my $method = (caller(0))[3];
        $method =~ s/main:://;
        my $vars = defined $_[0] && UNIVERSAL::isa($_[0], 'HASH') ? shift : { @_ };
        foreach (qw/SRC_DIR DEST_DIR/) {
            die ERROR." $method: $_ is undefined\n" unless $$vars{$_};
        }
        die ERROR." $method: $$vars{SRC_DIR} does not exist\n" unless -d $$vars{SRC_DIR};
        mkpath($$vars{DEST_DIR},0,0775) unless -d $$vars{DEST_DIR};
        $File::Copy::Recursive::CopyLink = $$vars{FOLLOW_SYMLINKS} ? 0 : 1;
        if ($$vars{CHECK_DISK_SPACE}) {
            my ($status,$available,$required) = getDiskSpaceInfo({DF_DIR => $$vars{DEST_DIR},DU_DIR => $$vars{SRC_DIR}});
            if ($status) {
                my $buffer = " " x length(ERROR." $method:");
                die ERROR." $method: there is not enough disk space available to stage \"$$vars{SRC_DIR} \"\n" .
                    " $buffer required disk space: $required\n" .
                    "$buffer available disk space: $available\n";
            }
        }
        my ($numOfFilesAndDirs,$numOfDirs,$depthTraversed) = dircopy($$vars{SRC_DIR},$$vars{DEST_DIR})
            or die ERROR." $method: unable to copy \"$$vars{SRC_DIR}\" to \"$$vars{DEST_DIR}\": $!\n";
        return wantarray ? ($numOfFilesAndDirs,$numOfDirs,$depthTraversed) : $numOfFilesAndDirs;
    }


Can you submit a “smallest test case” example of the problem in action to isolate the problem as much as possible? Doing so makes the problem clearer, and often the solution as well; at the very least it makes the problem easier to solve, since it narrows down the possible culprits and helps cut down on red herrings.

e.g.

    perl -MFile::Copy::Recursive=dircopy -e 'dircopy(@ARGV) or die $!;' src-in-question-here/ trg-in-question-here/

If the stat() calls it is doing hang, then we know where to look (e.g. put some debug statements in dircopy(), rerun, and see where it hangs, then narrow it down even further to, say, the call to stat() on a given path). If they don't, then you know to look elsewhere in your code, at least for the moment.

HTH!
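One way to narrow a hang down to "the call to stat() on a given path" is to probe suspect paths from a throwaway child process, so the main script cannot block. The following is only a rough sketch under stated assumptions: `stat_with_timeout` is a hypothetical helper name (it is not part of File::Copy::Recursive), the 50 ms polling interval is arbitrary, and a child stuck on a dead NFS mount may not respond to the kill until the mount recovers.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);
use Time::HiRes qw(sleep time);

# Hypothetical helper: stat() $path in a child process and wait at most
# $timeout seconds. Returns 'ok', 'failed' (stat returned an error), or
# 'timeout' (the child did not finish in time, e.g. a hung NFS mount).
sub stat_with_timeout {
    my ($path, $timeout) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: report the stat() result through the exit status.
        my @st = stat $path;
        POSIX::_exit(@st ? 0 : 1);
    }

    # Parent: poll without blocking so a stuck child cannot stall us.
    my $deadline = time() + $timeout;
    while (time() < $deadline) {
        my $done = waitpid($pid, WNOHANG);
        die "waitpid failed: $!" if $done == -1;
        return ($? >> 8) == 0 ? 'ok' : 'failed' if $done == $pid;
        sleep 0.05;
    }

    # Best effort: a child in uninterruptible sleep may not die right away,
    # so reap it only if it has already exited.
    kill 'KILL', $pid;
    waitpid($pid, WNOHANG);
    return 'timeout';
}

# Probe every path given on the command line with a 5 second limit.
for my $path (@ARGV) {
    printf "%-8s %s\n", stat_with_timeout($path, 5), $path;
}
```

Saved under a name of your choosing and run with the source directory, the target directory, and any suspect mount points as arguments, this reports which paths answer and which do not. Any path reported as `timeout` is a candidate for the hang, independent of dircopy().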
This is likely addressed in v0.41. If it happens again, we can revisit it fresh via GitHub.