
This queue is for tickets about the threads CPAN distribution.

Report information
The Basics
Id: 61705
Status: resolved
Priority: 0/
Queue: threads

People
Owner: Nobody in particular
Requestors: bitcard_kiddm [...] ghctechnologies.com
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in:
  • 1.72
  • 1.74
  • 1.79
Fixed in: (no value)



Subject: join() hangs on Ubuntu when main thread has piped filehandle(s)
The join() method hangs when trying to join a joinable thread while the main thread has one or more open piped filehandles, whether for input or output.

I wrote a simple package to speed up the processing of large text files on multicore boxes. The user supplies the input filename, the output filename(s), and a reference to the worker subroutine. The package then reads the input file in large chunks (each ending at a newline), hands the chunks off to the user's worker subroutine (as strings), and caches the results (strings) in memory just long enough to write them to the output file(s) in the same order as they appear in the input file. To minimize I/O, a potential bottleneck, the package automatically opens gzip pipe(s) if the file extension is .gz.

This works great on Windows using ActivePerl 5.10 or 5.12, and it works fine on Ubuntu 10.04 so long as the input and output file(s) are uncompressed. But when any of the files is opened via a pipe, the main thread hangs when it tries to join a joinable worker thread.

I first saw this with threads 1.72, and again with 1.74 after updating via the Ubuntu package manager. Then I downloaded 1.79 from CPAN, compiled it, and made sure the include path was set to use the latest version (checked via print $threads::VERSION). The problem persisted.

I double-checked the documentation, which does mention an issue with "Spawning threads with open directory handles" but says nothing about pipes. I'll work on stripping the code down to a really simple example.
From: bitcard_kiddm [...] ghctechnologies.com
Here is the stripped-down example, which no longer involves a separate package, doesn't do anything meaningful, and is now horribly inefficient. Still, it demonstrates the issue.

I find that the following works:

    perl pipethreadtest.pl reads.txt.gz out

But the following hangs:

    perl pipethreadtest.pl reads.txt.gz out.gz

In this simplified case, reading the input from a pipe does not cause a problem (though it did in the original code), but writing to a pipe causes a hang when joining a joinable thread. Again, both examples work fine using ActivePerl on Windows, but the latter hangs under Ubuntu.
Subject: reads.txt.gz
application/x-gzip 1.2k


Subject: pipethreadtest.pl
#!/usr/bin/perl
#
# pipethreadtest.pl
#
# Demonstrate hang joining joinable thread when main thread has
# a piped filehandle.

use strict;
use threads;

if (scalar(@ARGV) < 2) {
    (my $bname = $0) =~ s/.*[\\\/]//;
    print <<"DONE";
Missing parameter(s)
Usage: $bname ifname ofname
DONE
}

splitwork($ARGV[0], $ARGV[1], \&worker, {ncpu => 2} );

sub worker {
    my ($chunk) = @_;
    my $out;

    open (my $fh, '<', \$chunk) or die "Unable to open/read from CHUNK variable.";
    open (my $ofh, '>', \$out);

    # Not much of a worker function! (just copy lines);
    while (my $fline = <$fh>) {
        print $ofh $fline;
    }
    close($fh);
    close($ofh);
    return($out);
}

sub splitwork {
    my ($ifname, $ofname, $func, $opt) = @_;
    my $MAX_BUFFERS = 64;
    my ($fh, $ofh);

    if ($ifname =~ /\.gz$/i) {
        my $pipe = "gzip -dc \"$ifname\" |";
        open($fh, $pipe);
    } else {
        open($fh, '<', $ifname) or die "Unable to open/write: $ifname";
    }

    if ($ofname =~ /\.gz$/i) {
        my $pipe = "| gzip -c > \"$ofname\"";
        open($ofh, $pipe);
    } else {
        open($ofh, '>', $ofname) or die "Unable to open/write: $ofname";
    }

    my %tids;
    my $nbuffalloc = 0;
    my $nlaunched = 0;
    my $nactive = 0;
    my $readall = 0;

    while (1) {
        if ($nactive < $opt->{'ncpu'} &&
            !$readall &&
            $nbuffalloc < $MAX_BUFFERS
        ) {
            # Read a chunk (just one line for demonstration purposes).
            my $chunk = <$fh>;
            if (!defined $chunk) {
                $readall = 1;
                next;
            }
            my $th = threads->create({'context' => 'list'}, $func, $chunk);
            $tids{ $th->tid() } = undef;
            printf "Launched thread: %d\n", $th->tid();
            $nactive++;
            next;
        }

        # See if any threads are joinable.
        my @thr = threads->list(threads::joinable);
        if ($#thr == -1) {
            # 1 second is rather long for one line chunks...
            ## sleep(1); next;
            next;
        }

        # Store result(s) from each thread.
        foreach my $th (@thr) {
            printf "joining thread %d\n", $th->tid();
            my ($out) = $th->join();
            printf "joined thread %d\n", $th->tid();
            $tids{ $th->tid() } = $out;
            $nbuffalloc++;
            $nactive--;
        }

        # See if we can write out any of the buffers.
        foreach my $tid (sort { $a <=> $b } keys %tids ) {
            if (!defined $tids{$tid}) {
                last;
            }
            print $ofh $tids{$tid};
            print "Wrote result for thread: $tid\n";
            delete $tids{$tid};
            $nbuffalloc--;
        }

        if ($nactive == 0 && $readall) {
            last;
        }
    }
    close($ofh);
}
This bug can be simplified to:

    use threads;
    open(my $OUT, '| cat') || die("ERROR: $!");
    threads->create(sub { })->join();   # <--- hangs

I have elevated this to a core Perl bug report [perl #78494]:
http://rt.perl.org/rt3/Ticket/Display.html?id=78494

This may also be related to [perl #63662]:
http://rt.perl.org/rt3/Ticket/Display.html?id=63662
On 2010-10-21 10:35:50, JDHEDDEN wrote:
> This bug can be simplified to:
>
> use threads;
> open(my $OUT, '| cat') || die("ERROR: $!");
> threads->create(sub { })->join(); # <--- hangs
>
> I have elevated this to a core Perl bug report [perl #78494]:
> http://rt.perl.org/rt3/Ticket/Display.html?id=78494
>
> This may also be related to [perl #63662]:
> http://rt.perl.org/rt3/Ticket/Display.html?id=63662
This has now been "fixed". See the core Perl bug report above.
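For Perl builds that predate the core fix, one possible workaround (not confirmed anywhere in this ticket) is to start the worker threads before opening any piped filehandle. This assumes the hang comes from the new thread inheriting a clone of the pipe handle. The sketch below is only an illustration under that assumption: the queue names, the single worker, and the uppercase "work" are hypothetical stand-ins, and it uses the core Thread::Queue module rather than one thread per chunk as in pipethreadtest.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical queues: one feeds chunks to the worker, one returns results.
my $work    = Thread::Queue->new();
my $results = Thread::Queue->new();

# Start the worker BEFORE any piped open, so (by assumption) the thread
# never holds a cloned copy of the pipe handle.
my $worker = threads->create(sub {
    while (defined(my $chunk = $work->dequeue())) {
        $results->enqueue(uc $chunk);   # stand-in for real work
    }
    return;
});

# Only now open the pipe in the main thread.
open(my $OUT, '|-', 'cat') or die "ERROR: $!";

$work->enqueue("hello\n", "world\n");
$work->enqueue(undef);                  # sentinel: tells the worker to stop
$worker->join();

# Drain the results; a single worker returns them in input order.
my @out;
while (defined(my $line = $results->dequeue_nb())) {
    push @out, $line;
}
print $OUT @out;
close($OUT) or die "close failed: $!";
```

With more than one worker, results can come back out of order, so each chunk would need a sequence number to restore the input order before writing.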