
This queue is for tickets about the Proc-ParallelLoop CPAN distribution.

Report information
The Basics
Id: 81754
Status: open
Priority: 0/
Queue: Proc-ParallelLoop

People
Owner: Nobody in particular
Requestors: carl.asplund [...] sassa.nu
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 0.5
Fixed in: (no value)



Subject: Asymmetric process load causes crash
Proc::ParallelLoop 0.5, Perl 5.14.2, Linux 3.2.0-34-generic #53-Ubuntu SMP Thu Nov 15 10:48:16 UTC 2012 x86_64

If one child process takes long enough that at least another 510 processes are forked before it finishes, the pardo loop crashes with the error message "ParallelLoop::dispatch(): unable to fork." It is easy to demonstrate this behavior with a loop in which one of the iterations sleeps for a few seconds, as shown below. A necessary condition is Max_Workers > 1.

I have written a short program that demonstrates this bug. First a parallel loop is executed with Parallel::ForkManager (for reference), and then the same job is executed with Proc::ParallelLoop. With the settings below, the Proc::ParallelLoop version crashes. However, as soon as $i_additional_forks is reduced to 509, or $processes is reduced to 1, or $duration is set to 0 (no sleep, i.e. a symmetric load on all children), the job finishes properly.

#!/usr/bin/env perl
use warnings;
use strict;
use Carp;
use FindBin;
use lib "$FindBin::Bin/lib";
use Parallel::ForkManager;
use Proc::ParallelLoop;

# Proc::ParallelLoop always works, regardless of other parameter settings,
# whenever one or more of the following conditions are true:
#   $i_additional_forks < 510
#   $processes = 1
#   $duration = 0 (i.e. no sleep, symmetric processes)
#
# However, it does crash (saying "ParallelLoop::dispatch(): unable to fork.")
# if ALL three of the following conditions are true:
#   $i_additional_forks >= 510
#   $processes >= 2
#   $duration in seconds is long enough to fork at least another 510 processes
#     while sleeping. This is a hardware- and OS-dependent number. Typically
#     5 s should be more than enough on a fast GNU/Linux system, while an
#     MS Win system with slower, emulated forking might require more time.

my $i_long_job         = 30;  # sleep at iteration no. $i_long_job; simulates an asymmetric load on parallel processes
my $i_additional_forks = 510; # number of forks after the sleeping process started
my $processes          = 2;   # number of parallel processes
my $duration           = 2;   # time in seconds during which the long process sleeps before finishing

print("\n\$i_additional_forks: $i_additional_forks, \$i_long_job: $i_long_job, \$duration: $duration, \$processes: $processes \n");

##############################################################
print("Testing Parallel::ForkManager...\n");
my $pm = Parallel::ForkManager->new($processes);
for (my $i = 0; $i < ($i_long_job + $i_additional_forks); $i++) {
    $pm->start and next;  # forks and returns the pid for the child
    if ( $i == $i_long_job ) { sleep $duration }
    print("\$i: $i\n");
    $pm->finish;          # terminates the child process
}
$pm->wait_all_children;
print("Successfully finished loop with Parallel::ForkManager !!!!\n");

################################################################
print("Testing Proc::ParallelLoop...\n");
{
    my $i = 0;
    pardo sub { $i < ($i_long_job + $i_additional_forks) },
          sub { $i++ },
          sub {
              if ( $i == $i_long_job ) { sleep $duration }
              print("\$i: $i\n");
          },
          { Max_Workers => $processes };
}
print("Successfully finished loop with Proc::ParallelLoop !!!!\n");
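One possible reading of the 510 threshold (a hypothesis, not confirmed against the module's source): it is almost exactly half of 1024, the common default per-process open-file limit on Linux. If each dispatched iteration holds one pipe (two file descriptors) open until the loop drains, then with stdin, stdout, and stderr already occupying three descriptors, exhaustion would be expected after (1024 - 3) / 2 = 510 outstanding iterations. A minimal Python sketch of that arithmetic, with the constants and helper name being illustrative assumptions:

import resource

# Hypothesis (not confirmed in the module source): each dispatched iteration
# keeps one pipe open, i.e. two file descriptors, until the loop finishes.
FDS_PER_PIPE = 2
RESERVED = 3  # stdin, stdout, stderr

def max_outstanding_forks(soft_limit):
    """Pipes that fit before pipe()/fork() starts failing with EMFILE."""
    return (soft_limit - RESERVED) // FDS_PER_PIPE

# The reporter's crash threshold matches the common default limit of 1024:
print(max_outstanding_forks(1024))  # → 510

# The soft limit actually in force on this machine (varies by system):
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(max_outstanding_forks(soft))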
Subject: fork_test.pl
(Attachment content is identical to the script quoted in the report above.)
I ran into what appears to be a similar problem and found that the fork was failing due to a lack of file descriptors (I added a $! to the error message). Running lsof against the process shows an ever-growing number of read file descriptors on pipes. My particular code uses IPC::Run to run a two-process pipeline. Since I haven't run into this problem before, I'm guessing there is some characteristic of this code that causes Proc::ParallelLoop not to clean up file descriptors properly.
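The lsof observation can be reproduced in isolation: any loop that opens a pipe per iteration and never closes the descriptors will eventually hit the process's open-file limit, at which point pipe() (and typically the subsequent fork()) fails with EMFILE. A small Python sketch, assuming a POSIX system; the limit is lowered artificially so exhaustion happens quickly:

import os
import resource

# Lower the descriptor limit so exhaustion is quick (POSIX-only; the
# value 64 is arbitrary for demonstration).
resource.setrlimit(resource.RLIMIT_NOFILE, (64, 64))

pipes = []
try:
    while True:
        pipes.append(os.pipe())  # two descriptors per pipe, never closed
except OSError as e:
    # With a limit of 64 and stdin/stdout/stderr open, roughly
    # (64 - 3) // 2 = 30 pipes fit before this fires.
    print("pipe() failed after %d pipes: %s" % (len(pipes), e))
finally:
    for r, w in pipes:
        os.close(r)
        os.close(w)

The same arithmetic scales to the default limit of 1024, which is consistent with the 510-fork threshold in the original report.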