Bug #79130 for Sys-SigAction: correct alarm handling is still mis-documented

Wed Aug 22 03:24:08 2012 eponymousalias [...] yahoo.com - Ticket created

Subject:

correct alarm handling is still mis-documented

The documentation for how to correctly handle alarms, protecting against race conditions, is still incorrect in the Description section of the CPAN page for this package (reference bug https://rt.cpan.org/Public/Bug/Display.html?id=50628 for the original complaint, in an earlier release). The second alarm(0) call has to be placed within the outer eval{};, not after it, so the local alarm handler is always still in play when the alarm either goes off or is disabled. (And you're missing a semicolon after the closing brace from the outer eval{};.) The attached alarm_test.pl script can be used to demonstrate the race condition, by stretching out the region in which the failure could occur so as to trigger the bug. Read the comments in the script, and try running it several times both with and without the second call to alarm(0) commented out. Observe the script failure half the time, as caught and printed by the global alarm handler whose usual absence would cause the script to die. The actual code in the timeout_call() routine seems to be correctly implemented, always turning off the alarm while the local alarm handler is still in play. But the doc should be upgraded to match, since users are likely to copy this pattern into their own code without analyzing it too closely. Also, in the Synopsis section of the doc, in the third code box, there is no path through the code that would result in the alarm still being active once the outer eval{}; is exited, so the third alarm(0); call is unnecessary. That is to say, you cannot get out of the outer eval{}; without the alarm either having gone off while before it was disabled, in which case the third alarm(0) won't do any good, or the alarm having been disabled by the second alarm(0) call, the one within the outer eval{};, in which case again the third alarm(0) is pointless.

Subject:

alarm_test.pl

#!/usr/bin/perl -w $SIG{ALRM} = sub { die "global timeout handler invoked" }; sub sleep_or_die { if (rand(2) > 1) { print "before nap\n"; sleep 4; print "after nap\n"; } else { die "random death"; } } eval { local $SIG{ALRM} = sub { die "local timeout handler invoked" }; eval{ alarm(2); sleep_or_die(); print "before first alarm(0)\n"; alarm(0); }; # This is where the second alarm(0) belongs, still within # the scope of the local SIGALRM handler, not outside of it. # Try running this program multiple times, both with and # without this alarm(0) in play. Without it, if you get # a random death above, you will get the global timeout # handler called during the sleep below -- which means the # entire process would have died without that global handler, # which we have in place only for test/debug purposes here. # With this alarm(0) in play, a random death exits the inner # eval{} quickly without turning off the alarm, but this # next alarm(0) call does so while the local SIGALRM handler # is still in play, so the global handler is never in danger # of being called (and the process won't die should it not # have such a handler in place). # alarm(0); if ($@) { chomp $@; print "inner eval exception: $@\n" } }; # Lengthen this small processing window long enough to trigger # the race condition if we got a random death above instead of # sleeping then, and the second alarm(0) above is not in play. print "sleeping to widen the race-condition window\n"; sleep 4; alarm (0); die "outer eval exception: $@" if $@; print "still alive at the end\n";

Sun Jul 21 19:17:27 2013 lab [...] lincolnbaxter.com.MAKE.ME.VALID - Correspondence added

On Wed Aug 22 03:24:08 2012, eponymousalias@yahoo.com wrote: Show quoted text

> The documentation for how to correctly handle alarms, protecting > against race conditions, is still incorrect in the Description > section of the CPAN page for this package (reference bug > https://rt.cpan.org/Public/Bug/Display.html?id=50628 for the > original complaint, in an earlier release). The second alarm(0) > call has to be placed within the outer eval{};, not after it, so > the local alarm handler is always still in play when the alarm > either goes off or is disabled. (And you're missing a semicolon > after the closing brace from the outer eval{};.) > > The attached alarm_test.pl script can be used to demonstrate the > race condition, by stretching out the region in which the failure > could occur so as to trigger the bug. Read the comments in the > script, and try running it several times both with and without the > second call to alarm(0) commented out. Observe the script failure > half the time, as caught and printed by the global alarm handler > whose usual absence would cause the script to die. > > The actual code in the timeout_call() routine seems to be > correctly implemented, always turning off the alarm while the > local alarm handler is still in play. But the doc should be > upgraded to match, since users are likely to copy this pattern > into their own code without analyzing it too closely. > > Also, in the Synopsis section of the doc, in the third code box, > there is no path through the code that would result in the alarm > still being active once the outer eval{}; is exited, so the > third alarm(0); call is unnecessary. That is to say, you cannot > get out of the outer eval{}; without the alarm either having > gone off while before it was disabled, in which case the third > alarm(0) won't do any good, or the alarm having been disabled by > the second alarm(0) call, the one within the outer eval{};, in > which case again the third alarm(0) is pointless.

Sun Jul 21 19:17:28 2013 The RT System itself - Status changed from 'new' to 'open'

Sun Jul 21 19:19:40 2013 lab [...] lincolnbaxter.com.MAKE.ME.VALID - Correspondence added

On Wed Aug 22 03:24:08 2012, eponymousalias@yahoo.com wrote: Show quoted text

> The documentation for how to correctly handle alarms, protecting > against race conditions, is still incorrect in the Description > section of the CPAN page for this package (reference bug > https://rt.cpan.org/Public/Bug/Display.html?id=50628 for the > original complaint, in an earlier release). The second alarm(0) > call has to be placed within the outer eval{};, not after it, so > the local alarm handler is always still in play when the alarm > either goes off or is disabled. (And you're missing a semicolon > after the closing brace from the outer eval{};.) > > The attached alarm_test.pl script can be used to demonstrate the > race condition, by stretching out the region in which the failure > could occur so as to trigger the bug. Read the comments in the > script, and try running it several times both with and without the > second call to alarm(0) commented out. Observe the script failure > half the time, as caught and printed by the global alarm handler > whose usual absence would cause the script to die. > > The actual code in the timeout_call() routine seems to be > correctly implemented, always turning off the alarm while the > local alarm handler is still in play. But the doc should be > upgraded to match, since users are likely to copy this pattern > into their own code without analyzing it too closely. > > Also, in the Synopsis section of the doc, in the third code box, > there is no path through the code that would result in the alarm > still being active once the outer eval{}; is exited, so the > third alarm(0); call is unnecessary. That is to say, you cannot > get out of the outer eval{}; without the alarm either having > gone off while before it was disabled, in which case the third > alarm(0) won't do any good, or the alarm having been disabled by > the second alarm(0) call, the one within the outer eval{};, in > which case again the third alarm(0) is pointless.

Sun Jul 21 19:19:40 2013 lab [...] lincolnbaxter.com.MAKE.ME.VALID - Taken

Mon Jul 22 20:35:43 2013 lab [...] lincolnbaxter.com.MAKE.ME.VALID - Correspondence added

On Sun Jul 21 19:19:40 2013, LBAXTER wrote: Show quoted text

> On Wed Aug 22 03:24:08 2012, eponymousalias@yahoo.com wrote:

> > The documentation for how to correctly handle alarms, protecting > > against race conditions, is still incorrect in the Description > > section of the CPAN page for this package (reference bug > > https://rt.cpan.org/Public/Bug/Display.html?id=50628 for the > > original complaint, in an earlier release). The second alarm(0) > > call has to be placed within the outer eval{};, not after it, so > > the local alarm handler is always still in play when the alarm > > either goes off or is disabled. (And you're missing a semicolon > > after the closing brace from the outer eval{};.) > > > > The attached alarm_test.pl script can be used to demonstrate the > > race condition, by stretching out the region in which the failure > > could occur so as to trigger the bug. Read the comments in the > > script, and try running it several times both with and without the > > second call to alarm(0) commented out. Observe the script failure > > half the time, as caught and printed by the global alarm handler > > whose usual absence would cause the script to die. > > > > The actual code in the timeout_call() routine seems to be > > correctly implemented, always turning off the alarm while the > > local alarm handler is still in play. But the doc should be > > upgraded to match, since users are likely to copy this pattern > > into their own code without analyzing it too closely. > > > > Also, in the Synopsis section of the doc, in the third code box, > > there is no path through the code that would result in the alarm > > still being active once the outer eval{}; is exited, so the > > third alarm(0); call is unnecessary. That is to say, you cannot > > get out of the outer eval{}; without the alarm either having > > gone off while before it was disabled, in which case the third > > alarm(0) won't do any good, or the alarm having been disabled by > > the second alarm(0) call, the one within the outer eval{};, in > > which case again the third alarm(0) is pointless.

>

Found the mis-documentation, and it is fixed in version 0.17.

Mon Jul 22 20:35:44 2013 lab [...] lincolnbaxter.com.MAKE.ME.VALID - Status changed from 'open' to 'resolved'