Subject: | Sys::Statistics::Linux::CpuStats - Negative zero idle-time |
Greetings,
Some of our servers get incorrect output from Sys::Statistics::Linux::CpuStats after long idle periods.
Details:
--------
We monitor our servers' CPU usage via the nrpe_cpu Icinga 2 plugin.
After our servers had been idle for 'some time', we noticed that nrpe_cpu started warning about:
'CPU CRITICAL : idle -0.00%'
This is clearly wrong, since the server is 100% idle. We confirmed this by looking at the output of the top command:
%Cpu(s): 0.0 us, 0.0 sy, 0.0 ni, 100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
We tracked this problem down inside the check_linux_stats.pl script used by the nrpe_cpu plugin:
[...]
my $lxs = Sys::Statistics::Linux->new(cpustats => 1);
my $stat = $lxs->get;
my $cpu = $stat->cpustats->{cpu};
my $cpu_used=sprintf("%.2f", (100-$cpu->{idle}));
The issue is that $cpu->{idle} returns '-0.0' on systems that have been idle for 'some time', while the correct value should be 100.0. This looks a lot like an overflow or floating-point precision problem.
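One way such a string can arise is sketched below. This is only an illustration under an assumed formula, not code taken from Sys::Statistics::Linux::CpuStats: it assumes each percentage is computed as 100 * delta / (sum of all counter deltas) between two samples. If that sum is negative, for example because one counter moved backwards between samples, a small positive idle delta gives a tiny negative quotient, which "%.2f" formats as "-0.00". The helper name idle_percent and all sample numbers are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: idle delta as a percentage of the sum of all counter
# deltas between two samples. The formula is an assumption for illustration.
sub idle_percent {
    my ($prev, $curr) = @_;
    my $total = 0;
    $total += $curr->{$_} - $prev->{$_} for keys %$curr;
    return if $total == 0;    # no ticks elapsed between the samples
    return sprintf('%.2f', 100 * ($curr->{idle} - $prev->{idle}) / $total);
}

# Made-up samples: idle advances by 100 ticks while steal drops sharply,
# so the sum of all deltas is negative.
my %prev = (user => 1000, system => 500, idle => 90_000, steal => 5_000_000_000);
my %curr = (user => 1000, system => 500, idle => 90_100, steal => 4_000_000_000);

my $idle = idle_percent(\%prev, \%curr);
print "idle $idle%\n";
```

With these made-up samples the script prints "idle -0.00%", the same shape as the plugin warning. Whether this is what actually happens on the affected servers is not confirmed.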
Additional Information:
-----------------------
* OS is Debian Jessie (confirmed) and Wheezy (unconfirmed)
* Sys::Statistics::Linux::CpuStats version is 0.66-1
* Seems to only happen with VMs hosted via Ganeti but not VMware (not 100% confirmed)
* For /proc/stat output, see attachment 1
* The full perl script can be found at https://exchange.nagios.org/directory/Plugins/Operating-Systems/Linux/check_linux_stats/details
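To make the attached /proc/stat dumps easier to read, the following sketch labels the columns of a "cpu" line with the field names documented in proc(5) (user, nice, system, idle, iowait, irq, softirq, steal, guest, guest_nice) and prints each counter's share of the total. The helper name parse_cpu_line is hypothetical; the input is the "cpu" line of Server 1 from the attachment.

```perl
use strict;
use warnings;

# Field order of the "cpu" lines as documented in proc(5).
my @FIELDS = qw(user nice system idle iowait irq softirq steal guest guest_nice);

# Hypothetical helper: map one "cpu" line from /proc/stat to named counters.
sub parse_cpu_line {
    my ($line) = @_;
    my ($label, @values) = split ' ', $line;
    my %stat;
    @stat{ @FIELDS[0 .. $#values] } = @values;
    return ($label, \%stat);
}

# "cpu" line of Server 1 from the attachment.
my ($label, $stat) = parse_cpu_line(
    'cpu 5237875 0 1793576 125738733 57900 7 17247 400985488789 0 0');

# Sum only the first eight fields: guest and guest_nice are already
# accounted for in user and nice.
my $total = 0;
$total += $stat->{$_} // 0 for @FIELDS[0 .. 7];

printf "%-10s %15s %8s\n", 'field', 'ticks', 'share';
for my $name (@FIELDS) {
    my $ticks = $stat->{$name} // 0;
    printf "%-10s %15s %7.2f%%\n", $name, $ticks, 100 * $ticks / $total;
}
```

For Server 1 this shows the eighth column (steal) at about 99.97% of all ticks since boot and idle at about 0.03%. The steal counter is several orders of magnitude larger than every other field on all six servers, which may be worth checking in relation to the reported idle value.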
Subject: | proc-stats.txt |
Server 1:
---------
cpu 5237875 0 1793576 125738733 57900 7 17247 400985488789 0 0
cpu0 5237875 0 1793576 125738733 57900 7 17247 400985488789 0 0
intr 87280124 69 10 0 0 0 0 37 0 1 0 0 0 144 0 0 1332878 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 10 19443200 92 0 2157006 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
ctxt 157389522
btime 1461584686
processes 5308074
procs_running 1
procs_blocked 0
softirq 141600284 1 59706870 0 20058503 666290 0 16 0 24709 61143895
Server 2:
---------
cpu 6689010 0 2764422 121984554 93607 6 25231 1688229435281 0 0
cpu0 6689010 0 2764422 121984554 93607 6 25231 1688229435281 0 0
intr 126876902 68 10 0 0 0 0 37 0 1 0 0 0 144 0 0 1332924 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3718903 10 22702415 154 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
ctxt 354138606
btime 1461584691
processes 8132214
procs_running 1
procs_blocked 0
softirq 177692670 1 69654374 0 23253334 666313 0 16 0 44914 84073718
Server 3:
---------
cpu 5238739 0 2104497 125144495 85309 3 19688 761459701703 0 0
cpu0 5238739 0 2104497 125144495 85309 3 19688 761459701703 0 0
intr 128429341 58 10 0 0 0 0 37 0 1 0 0 0 144 0 0 1332990 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 10 22627551 150 0 3945448 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
ctxt 347391550
btime 1461584694
processes 8096784
procs_running 1
procs_blocked 0
softirq 165901986 1 61674565 0 23166530 666346 0 16 0 45676 80348852
Server 4:
---------
cpu 4802066 0 1567795 126612640 73042 3 14956 355045411803 0 0
cpu0 4802066 0 1567795 126612640 73042 3 14956 355045411803 0 0
intr 100055222 68 10 0 0 0 0 37 0 1 0 0 0 144 0 0 1333010 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 10 18888912 84 0 3608981 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
ctxt 192808525
btime 1461584699
processes 4396853
procs_running 1
procs_blocked 0
softirq 131900828 1 56931739 0 19009427 666356 0 16 0 35904 55257385
Server 5:
---------
cpu 4357786 0 1275146 127874123 52442 3 13532 836771065724 0 0
cpu0 4357786 0 1275146 127874123 52442 3 13532 836771065724 0 0
intr 82165541 68 10 0 0 0 0 37 0 1 0 0 0 144 0 0 1333040 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 10 18307157 72 0 1307984 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
ctxt 146338623
btime 1461584703
processes 4076429
procs_running 1
procs_blocked 0
softirq 125428734 1 54328133 0 18426831 666371 0 16 0 21569 51985813
Server 6:
---------
cpu 7169468 0 2224136 121606078 505769 5 20419 627571562152 0 0
cpu0 7169468 0 2224136 121606078 505769 5 20419 627571562152 0 0
intr 122643452 69 10 0 0 0 0 39 0 1 0 0 0 144 0 0 1333100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 10 22287194 128 0 3489440 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
ctxt 236833737
btime 1461584709
processes 5768444
procs_running 2
procs_blocked 0
softirq 159175282 1 67765601 0 22407334 666393 0 16 0 33122 68302815