This queue is for tickets about the jmx4perl CPAN distribution.

Report information
The Basics
Id: 67899
Status: resolved
Priority: 0
Queue: jmx4perl

People
Owner: Nobody in particular
Requestors: work [...] paul.dubuc.org
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 0.90
Fixed in: (no value)



Subject: Should check_jmx4perl UNKNOWN status be CRITICAL instead?
The check_jmx4perl plugin will exit with an UNKNOWN error if it fails to find jolokia or can't find the particular JMX attribute it is looking for. Example:

UNKNOWN - Error: 500 Error while fetching http://host11:8180/jolokia/ :

This error comes out when the appserver is not running. We also get UNKNOWN errors when JMS queues don't deploy correctly, for example, so their JMX attributes don't exist.

On a production system (where the jolokia deployment and the check_jmx4perl configuration are assumed to be tested and correct) these should be CRITICAL errors. Could there be a way to override the plugin's use of the UNKNOWN exit status and change it to CRITICAL?

Thank you.
Hi Paul, 

I'm of the opinion that UNKNOWN is the proper state for these cases.
It is not the check that failed; rather, a precondition has not been met, so
the check could not be executed at all. That's the definition
of the UNKNOWN state as far as I understand it.

Just because the agent is not installed or not reachable doesn't mean
that the memory threshold is exceeded. It is somewhat like Schrödinger's cat:
as long as you don't measure, there is a probability for all states, hence the
only thing you can say for sure is that the memory check's result
is UNKNOWN. (For a missing MBean this is probably a different story, since
it seems that the whole cat is missing in this case ;-)

Since UNKNOWN states are dealt with in a similar way to CRITICAL
states, e.g. notifications are sent out for such a hard state change, I don't yet see
much benefit in forcing an UNKNOWN state into a CRITICAL one.

That said, I'm not totally against adding a per-check option for mapping
UNKNOWN errors to CRITICAL errors; I probably only need a few more
arguments to be convinced (or a specific use case where an UNKNOWN state
is a real problem) ;-)

bye ...
...roland
Subject: Re: [rt.cpan.org #67899] Should check_jmx4perl UNKNOWN status be CRITICAL instead?
Date: Mon, 02 May 2011 15:30:07 -0400
To: bug-jmx4perl [...] rt.cpan.org
From: "Paul M. Dubuc" <work [...] paul.dubuc.org>
Hello Roland,

Thanks for your reply. I agree with your definition of the UNKNOWN state. I think there is a gray area here, though. It's one thing to return UNKNOWN because of some configuration, invalid arguments, or deployment error where the test could not be run and the server under test cannot be contacted. For all we know the server is OK. But when the configuration and deployment are correct and tests have all functioned correctly in the past and then the test fails because the server is down or the MBean is missing, we know that the server is not OK.

So UNKNOWN means there is a problem with the test. CRITICAL means there is a problem with the server being tested. The problem with the jmx tests is that there is apparently no way for the test itself to distinguish between the two cases, because these same kinds of failures can happen for either reason. So it is preferable for the operations person to decide whether these failures are CRITICAL or not and configure them to be so.

In our operations, alarm notifications are sent for CRITICAL errors only. Someone will be called at any hour to fix the problem. WARNING or UNKNOWN are considered less serious and we use other notification methods for them. We like UNKNOWN to always mean that a test doesn't work and the problem is with the monitoring, not the monitored system. To be on the safe side we want these errors to be CRITICAL for production systems because those system monitors should not have configuration or deployment errors.

I can't think of a reason why we would need this to be a per-check option. A global command-line option would be just as good and would be easier to configure.

Thanks,
Paul Dubuc
Ok, convinced ;-)

I added an option '--unknown-is-critical' to check_jmx4perl which maps
every UNKNOWN event to a CRITICAL one. It is available in the latest
snapshot release 0.91_1 from CPAN.
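
For illustration only, the effect of such a mapping can be sketched in Python
as a small helper applied to a plugin's result. This is not the check_jmx4perl
implementation: the function name and the rewriting of the status label are
assumptions, and only the exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN)
follow the usual Nagios plugin convention.

```python
# Exit codes used by Nagios-style plugins.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def map_unknown_to_critical(exit_code, output, unknown_is_critical=False):
    """Return (exit_code, output), promoting UNKNOWN to CRITICAL on request.

    Hypothetical helper: a sketch of the mapping, not jmx4perl's own code.
    """
    if unknown_is_critical and exit_code == UNKNOWN:
        # Rewrite the leading status label so the text matches the exit code.
        if output.startswith("UNKNOWN"):
            output = "CRITICAL" + output[len("UNKNOWN"):]
        return CRITICAL, output
    # All other states, or the option switched off, pass through unchanged.
    return exit_code, output


if __name__ == "__main__":
    code, text = map_unknown_to_critical(
        UNKNOWN,
        "UNKNOWN - Error: 500 Error while fetching http://host11:8180/jolokia/",
        unknown_is_critical=True,
    )
    print(code, text)
```

With the flag off the result passes through untouched, so existing setups
that treat UNKNOWN as a monitoring problem keep their behaviour.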

0.91 is scheduled for this weekend.

bye ...
...roland
From: work [...] paul.dubuc.org
Thank you very much!
 Resolved already.