Subject: join() hangs on Ubuntu when main thread has piped filehandle(s)
The join() method hangs when trying to join a joinable thread when the
main thread has open piped filehandle(s), whether for input or output.
I wrote a simple package to speed up the processing of large text files
on multicore boxes. The user supplies the input filename, the output
filename(s), and a reference to the worker subroutine. The package then
reads the input file in large chunks (stopping at a newline), hands the
pieces off to the user's worker subroutine (as a string), and then
caches the results (strings) in memory long enough to preserve the same
order in the output file(s) as in the input file. To minimize
the I/O, a potential bottleneck, the package will automatically open
gzip pipe(s) if the file extension is .gz.
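The input side of that description can be sketched roughly as follows. This is a minimal illustration, not the package's actual code: the names open_input and read_chunk are hypothetical, and it assumes a gzip binary is on the PATH.

```perl
use strict;
use warnings;

# Hypothetical helper: open a file for reading, going through a gzip
# pipe when the name ends in .gz. Assumes gzip is on the PATH.
sub open_input {
    my ($path) = @_;
    my $fh;
    if ($path =~ /\.gz$/) {
        # List-form piped open, so no shell is involved.
        open($fh, '-|', 'gzip', '-dc', $path)
            or die "Cannot start gzip for $path: $!";
    }
    else {
        open($fh, '<', $path) or die "Cannot open $path: $!";
    }
    return $fh;
}

# Hypothetical helper: read about $size bytes, then extend to the next
# newline so that no line is split across two chunks.
# Returns undef (in scalar context) at end of file.
sub read_chunk {
    my ($fh, $size) = @_;
    my $n = read($fh, my $chunk, $size);
    die "Read failed: $!" unless defined $n;
    return if $n == 0;
    if (substr($chunk, -1) ne "\n") {
        my $rest = <$fh>;    # remainder of the current line
        $chunk .= $rest if defined $rest;
    }
    return $chunk;
}
```

Each chunk returned by read_chunk ends on a line boundary (except possibly the last one, if the file has no trailing newline), so it can be handed to a worker as a self-contained string.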
This works great on Windows using ActivePerl 5.10 or 5.12 and it works
fine on Ubuntu 10.04 so long as the input and output file(s) are
uncompressed. But when any of the files get opened via a pipe, the main
thread hangs when it tries to join a joinable worker thread. I first saw
this with threads 1.72 and again with 1.74 when updated via the Ubuntu
package manager. Then I downloaded 1.79 from CPAN, compiled it, and made
sure the include path was set to use the latest version (checked via
print $threads::VERSION). But the problem persisted.
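For reference, a version check along these lines also shows which copy of threads.pm was picked up. It is a small sketch that assumes a perl built with thread support; the %INC lookup is an addition beyond the $threads::VERSION print mentioned above.

```perl
use strict;
use warnings;
use threads;

# Report which threads.pm was loaded, and from which path, to confirm
# that the CPAN build is found ahead of the distribution's copy.
print "threads version: $threads::VERSION\n";
print "loaded from:     $INC{'threads.pm'}\n";
```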
I double-checked the documentation, which does mention an issue with
"Spawning threads with open directory handles" but nothing about pipes.
I'll work on stripping the code down to a really simple example.
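A stripped-down test case would probably have the following shape. This is a sketch only, not code from the package: the worker body and the temporary output path are made up for illustration, and it assumes gzip is on the PATH. Whether it actually hangs will depend on the perl and threads versions in use.

```perl
use strict;
use warnings;
use threads;
use File::Temp qw(tempdir);

# Hypothetical output location; cleaned up when the script exits.
my $dir      = tempdir(CLEANUP => 1);
my $out_path = "$dir/out.gz";

# Piped output handle opened in the main thread BEFORE the worker is
# created. The shell form is used for the redirect.
open(my $out, '|-', "gzip -c > '$out_path'")
    or die "Cannot start gzip: $!";

# Trivial worker: takes a string, returns a string.
my $thr = threads->create(sub { return uc $_[0] }, "hello\n");

# Wait until the worker has finished running.
sleep 1 until $thr->is_joinable();
print STDERR "worker is joinable, calling join()\n";

my $result = $thr->join();    # the reported hang would occur here
print {$out} $result;
close($out) or die "gzip failed: $! (status $?)";
print STDERR "done\n";
```

If the script prints "done", the join() returned normally; if it stops after the "worker is joinable" message, that matches the behaviour described above.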