On Tue Dec 04 17:46:06 2012, LAJANDY wrote:
> Client had a runaway process that was not killed after WORKER_MAX_TTL
> seconds passed. They have seen Helios act appropriately to kill such
> processes before, but it apparently is not always enforced.
The WORKER_MAX_TTL parameter is currently enforced only if the MAX_WORKERS
limit has been reached or HOLD is in effect. This is apparently an
oversight by the <ahem> original developer.
The cleanest place to enforce WORKER_MAX_TTL in normal operation (at
least for now) is just after service registration. That way, helios.pl
checks on the running workers often, but not so frequently it ends up
spending more time checking for workers to kill than its primary job of
actually launching workers.
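
To illustrate the intended control flow, here is a rough sketch in Python. It is not the actual helios.pl code; the Worker class and the expired_workers and daemon_pass functions are hypothetical names made up for this example. The only point is that the TTL check runs on every pass, right after the spot where service registration would happen, whether or not MAX_WORKERS has been reached or HOLD is in effect.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    pid: int
    started_at: float  # epoch seconds when the worker was launched


def expired_workers(workers, max_ttl, now):
    """Return the workers that have been running longer than max_ttl seconds."""
    return [w for w in workers if now - w.started_at > max_ttl]


def daemon_pass(workers, max_workers, max_ttl, hold, now):
    """One pass of a hypothetical daemon loop.

    Returns (to_kill, may_launch): the workers past their TTL, and whether
    the daemon may launch new workers on this pass.
    """
    # (service registration would happen here)

    # Enforce the TTL unconditionally, not only when the worker limit
    # has been reached or a hold is in effect.
    to_kill = expired_workers(workers, max_ttl, now)

    # Launching is still gated by the hold flag and the worker limit.
    remaining = len(workers) - len(to_kill)
    may_launch = (not hold) and remaining < max_workers
    return to_kill, may_launch
```

For example, with a 60-second TTL, a worker started 100 seconds ago is flagged for killing even though the daemon is well under its worker limit and no hold is set.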
I have a working proof of concept in a forked repo on GitHub. I'll
get an official patched release together soon.