Skip Menu |

This queue is for tickets about the POE CPAN distribution.

Report information
The Basics
Id: 87721
Status: resolved
Priority: 0/
Queue: POE

People
Owner: Nobody in particular
Requestors: nicko [...] phone.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Possible race condition in POE::Driver::SysRW
Date: Thu, 8 Aug 2013 12:25:55 -0700
To: bug-POE [...] rt.cpan.org
From: Nicolas Dehaine <nicko [...] phone.com>
Hi Rocco, I work on an application built using POE and using POEx::HTTP::Server. I just completed investigating a problem serving medium size files (10MB) through the server, where the connection would be closed unexpectedly by POEx::HTTP::Server (mid-download, and at random times) and think I traced this down to a problem in Driver::SysRW. The problem is complex to reproduce - it needs an active POE-based app with a good deal of I/O taking place and I can't narrow it down to simple sample code but I'm happy to provide you access to a VM if you would like to see it. Here is my undestanding of the problem. POEx::HTTP::Server detects the presence of the Sys::Sendfile module. If present, it uses sendfile(2) instead of write(2) when sending a file straight from disk as the HTTP response. The problem is only reproducible if Sys::Semdfile is installed. While downloading a large file (curl client => POEx HTTP server => Sendfile => RW Wheel) , the server occasionally closes the connection. Turning on debugging in HTTP Sever, I observed that the server is receiving a 'EAGAIN' error from the ReadWrite wheel that manages the HTTP socket. From the POE perspective, the stack is: POEx::HTTP::Server :: sendfile() function, calling sendfile on a socket => Wheel::ReadWrite for socket => Driver::SysRW for the flushing the socket I believe that Driver::SysRW's flush() function returns an error incorrectly. Because of the semantics of sendfile(), there is never any data in $self->[OUTPUT_QUEUE] (line 124 of SysRW.pm). As a result, the function is executed as: 117 sub flush { 118 my ($self, $handle) = @_; 121 BEGIN { eval { require bytes } and bytes->import; } 154 $self->[TOTAL_OCTETS_LEFT]; 155 } that is, the while() loop is never entered. As a result, the error handling done in the while loop (setting $!) is never executed (line 132 - 138). As a result, the caller of this function will see the value of $! set by whichever code it last, rather than by the flush() function. In my case, this is ReadWrite.pm line 188-187. 184 $$driver_buffered_out_octets = $driver->flush($handle); 185 186 # When you can't write, nothing else matters. 187 if ($!) { The value that is tested at line 187 is not set by flush(). This results in a race condition - when $! is not 0 when entering flush(), and with flush() behaving as a no-op, it returns that initial error unexpectedly, causing it to propagate back to the HTTP server, which closes the connection. In my case, the error is EAGAIN (which is not expected to be returned by flush()). Proposed patch: This patch addressed the issue; flush() doesn't return an error unexpectedly anymore. This seems safe, i.e. not changing the expected behavior of the function. *** /usr/share/perl5/POE/Driver/SysRW.pm.old 2011-12-15 12:52:49.000000000 -0800 --- f/usr/share/perl5/POE/Driver/SysRW.pm 2013-08-08 12:16:54.035492000 -0700 *************** *** 120,125 **** --- 120,128 ---- # Need to check lengths in octets, not characters. BEGIN { eval { require bytes } and bytes->import; } + # Reset errno in case there is nothing to write + $! = 0; + # syswrite() it, like we're supposed to while (@{$self->[OUTPUT_QUEUE]}) { my $wrote_count = syswrite( Environment Ubuntu 12.04 $ perl -v This is perl 5, version 14, subversion 2 (v5.14.2) built for x86_64-linux-gnu-thread-multi (with 55 registered patches, see perl -V for more detail) POE 1.54 Sys::Sendfile 0.11 POEx::HTTP::Server 0.0901 I am happy to help troubleshooting this further or testing alternate fixes. Thank you, Nicko Dehaine
Hi, Nicko. It seems weird that flush() is being called on that handle. Under normal circumstances, I would expect POE::Driver::SysRW to be disabled out of the system before delegating to Sys::Sendfile. Having both writers interested in the same file handle seems to be asking for trouble. I see that POEx::HTTP::Server calls select_resume_write(), but I don't see a corresponding select_pause_write() to disable the write-okay watcher. Maybe that's a bug, and the fix could resolve this ticket too. I've asked the author of POEx::HTTP::Server about it on irc.perl.org #poe. Meanwhile, your patch seems reasonable. I'll apply and test it soon.
The author of POEx::HTTP::Server appears to have fixed the problem in version 0.0902, which hasn't been released yet.
Thank you for the well-described bug report and patch. I've applied it, it passes all tests, and it will be in the next CPAN release: commit f42a26db921db2bbbb951a1becd02800d8ceb2d2 Author: Nicko Dehaine <nicko@phone.com> Date: Mon Aug 19 00:03:29 2013 -0400 [rt.cpan.org 87721] Clear $! in Driver::SysRW::flush(). Nothing prevents the user from calling flush() when there is nothing to flush. In these cases, $! remained unchanged, which could lead to detection of false error conditions. This change clears $! to avoid the problem.