Acknowledgement sent
to Guillem Jover <[email protected]>:
New Bug report received and forwarded. Copy sent to Fabio Augusto De Muzio Tobich <[email protected]>.
(Tue, 20 Jul 2021 21:39:04 GMT)
Package: monit
Version: 1:5.27.2-1
Severity: serious
Forwarded: https://bitbucket.org/tildeslash/monit/pull-requests/110/use-an-epsilon-when-doing-the-reboot-boot
Hi!
On Linux the current method to retrieve the boot timestamp is racy,
which means that the reboot checks (the daemon start delay), and the
state machinery can be affected. The former is an annoyance as monit
will not respond to commands during the set amount of time. But the
latter means that services set to «onreboot nostart» and managed f.ex.
by a HA system will lose their state on «monit restart», which can be
rather bad.
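To illustrate the kind of race described above, here is a minimal sketch in C. It assumes the boot timestamp is derived by subtracting the system uptime from the current wall-clock time, with the two values sampled by separate calls. The helper name boot_time_from() is hypothetical and this is not monit's actual code.

```c
#include <time.h>

/* Hypothetical helper: boot time derived from two separately sampled values.
 * This illustrates the general technique only; it is an assumption, not
 * monit's actual _getStartTime() implementation. */
static time_t boot_time_from(time_t now, long uptime_seconds)
{
    /* If "now" and "uptime_seconds" are read on opposite sides of a
     * one-second tick, the result shifts by one second. */
    return now - (time_t)uptime_seconds;
}
```

With both samples taken inside the same second, boot_time_from(1000000, 500) yields 999500. If the wall clock ticks over between the two reads, boot_time_from(1000001, 500) yields 999501, and an exact comparison against a previously saved value would then look like a reboot.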
I notice that the Hurd patch modifies the Linux _getStartTime()
implementation making it need way more syscalls, which can exacerbate
this problem, and IMO should be either reverted for bullseye, or
modified so that the Hurd has its own sysdep file with that change.
I'm attaching the patch I've submitted upstream, which fixes the
problem for us.
To reproduce I added a Log_debug() entry to see the exact timestamps,
but this can be easily seen anyway by adding a start delay and
checking whether the delay gets skipped or taken into account after
each «service monit restart».
Thanks,
Guillem
Acknowledgement sent
to "Rowan Wookey" <[email protected]>:
Extra info received and forwarded to list. Copy sent to Fabio Augusto De Muzio Tobich <[email protected]>.
(Thu, 29 Jul 2021 09:51:02 GMT)
Severity: normal
This is not a serious bug, yet at its current severity it causes monit to be slated for removal from testing. Please check the severity definitions at https://www.debian.org/Bugs/Developer#severities
The patch has been rejected upstream; I'll let the maintainers here comment on whether it should be accepted or rejected in Debian.
Acknowledgement sent
to "Fabio A. De Muzio Tobich" <[email protected]>:
Extra info received and forwarded to list. Copy sent to Fabio Augusto De Muzio Tobich <[email protected]>.
(Thu, 29 Jul 2021 12:09:02 GMT)
First, thanks for reporting this, Guillem.
I was waiting for an upstream response to the pull request and, as Wookey
pointed out, it was rejected.
Let me quote the upstream reply here for the sake of information:
"Thanks for pointing the problem out and for the patch, i think the solution
is however shaky, as the time between the syscalls may vary and the hardcoded
epsilon may be insufficient. The patch also changes the behaviour on all
platforms, not just linux and may break cases where two reboots will occur
within consecutive seconds (not very likely, but possible in the future).
We need to fix the root cause (race condition in _getStartTime on linux)."
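The trade-off raised in the upstream reply can be sketched in C as follows. This is only an illustration of an epsilon-based comparison in general: the function name same_boot() and the tolerance of 2 seconds are hypothetical, since the value used in the actual patch is not quoted in this thread.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical tolerance in seconds; the real patch's value is not
 * quoted in this bug log. */
#define BOOT_TIME_EPSILON 2

/* Treat two boot timestamps as the same boot if they differ by at most
 * BOOT_TIME_EPSILON seconds. */
static bool same_boot(time_t saved, time_t current)
{
    long long delta = (long long)current - (long long)saved;
    if (delta < 0)
        delta = -delta;
    return delta <= BOOT_TIME_EPSILON;
}
```

A one-second jitter such as same_boot(1000, 1001) is absorbed, and a clearly different boot such as same_boot(1000, 1010) is detected. However, a jitter larger than the tolerance would still be misread as a reboot, and two genuine boots that land within the tolerance of each other would be treated as the same boot, which are the two weaknesses the reply points at.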
So, I will follow upstream here and I will not accept the patch, but I will
keep the bug open until a more appropriate solution to the problem is found.
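For reference, one possible direction for avoiding the two-sample computation on Linux is to read the "btime" line that the kernel exposes in /proc/stat. The sketch below is an assumption about what such an approach could look like, not something proposed by upstream or in the patch. The helper parse_btime() is hypothetical and works on text already read from that file. Note that the kernel may still adjust this value when the system clock is set.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return the "btime" value found in /proc/stat-style
 * text, or -1 if no such line is present. */
static long long parse_btime(const char *text)
{
    const char *line = text;
    while (line != NULL && *line != '\0') {
        long long value;
        /* Matches only when the line starts with "btime". */
        if (sscanf(line, "btime %lld", &value) == 1)
            return value;
        /* Advance to the start of the next line, if any. */
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return -1;
}
```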
Also, as pointed out by Wookey, this bug is not serious; the correct
severity for it is normal, so I changed the severity to normal.
Cheers.
--
⢀⣴⠾⠻⢶⣦
⣾⠁⢠⠒⠀⣿⡁ Fabio Augusto De Muzio Tobich
⢿⡄⠘⠷⠚⠋⠀ 9730 4066 E5AE FAC2 2683 D03D 4FB3 B4D3 7EF6 3B2E
⠈⠳⣄