If a SIGHUP/SIGUSR2 is sent immediately after a SIGTERM/SIGINT and
before wait() is notified, it will mask it since there's no queue,
only a copy of the last received signal. Let's add a special check
before overwriting the signal so that SIGTERM/SIGINT are not masked.
Some people report that sometimes there's a collection of old processes
after a restart of the systemd wrapper. It's not surprizing when reading
the code, the SIGTERM is propagated as a SIGINT which asks for a graceful
stop instead. If people ask for termination, we should terminate.
Commit c54bdd2 ("MINOR: Also accept SIGHUP/SIGTERM in systemd-wrapper")
added support for extra signals but did not adapt the debug message,
which continues to report incorrect signal types. Let's pass the signal
number to the do_restart() and do_shutdown() functions so that the output
matches the signal that was received.
There's a race condition in the systemd wrapper. People who restart it
too fast report old processes remaining there. Here if the signal is
caught before entering the "while" loop we block indefinitely on the
wait() call. Let's check the caught signal before calling wait(). It
doesn't completely close the race, a window still exists between the
test of the flag and the call to wait(), but it's much smaller. A
safer solution could involve pselect() to wait for the signal
delivery.
Gcc complains because the systemd wrapper mixed code and declaration :
"warning: ISO C90 forbids mixed declarations and code
[-Wdeclaration-after-statement]".
Kristoffer Grönlund reported that after my recent update to the
systemd-wrapper, I accidentely left the debugging code which
consists in disabling the fork :-(
The fix needs to be backported to 1.5 as well since I pushed it
there as well.
Having to use a hard-coded "haproxy" executable name next to the systemd
wrapper is not always convenient, as it's sometimes desirable to run with
multiple versions in parallel.
Thus this patch performs a minor change to the wrapper : if the name ends
with "-systemd-wrapper", then it trims that part off and what remains
becomes the target haproxy executable. That makes it easy to have for
example :
haproxy-1.5.4-systemd-wrapper haproxy-1.5.4
haproxy-1.5.3-systemd-wrapper haproxy-1.5.3
and so on, in a same directory.
This patch also fixes a rare bug caused by readlink() not adding the
trailing zero and leaving possible existing contents, including possibly
a randomly placed "/" which would make it unable to locate the correct
binary. This case is not totally unlikely as I got a \177 a few times
at the end of the executable names, so I could have got a '/' as well.
Back-porting to 1.5 is desirable.
My proposal is to let haproxy-systemd-wrapper also accept normal
SIGHUP/SIGTERM signals to play nicely with other process managers
besides just systemd. In my use case, this will be for using with
runit which has to ability to change the signal used for a
"reload" or "stop" command. It also might be worth renaming this
bin to just haproxy-wrapper or something of that sort to separate
itself away from systemd. But that's a different discussion. :)
Move all code out of the signal handlers, since this is potentially
dangerous. To make sure the signal handlers behave as expected, use
sigaction() instead of signal(). That also obsoletes messing with
the signal mask after restart.
Signed-off-by: Conrad Hoffmann <conrad@soundcloud.com>
Searching for the pid file in the list of arguments did not
take flags without parameters into account, like e.g. -de. Because
of this, the wrapper would use a different pid file than haproxy
if such an argument was specified before -p.
The new version can still yield a false positive for some crazy
situations, like your config file name starting with "-p", but
I think this is as good as it gets without using getopt or some
library.
Signed-off-by: Conrad Hoffmann <conrad@soundcloud.com>
Use HAProxy's exit status as the systemd wrapper's exit status instead
of always returning EXIT_SUCCESS, permitting the use of systemd's
`Restart = on-failure' logic.
Use standard error for logging messages, as it seems that this gets
messages to the systemd journal more reliably. Also use systemd's
support for specifying log levels via stderr to apply different levels
to messages.
Re-execute the systemd wrapper on SIGUSR2 and before reloading HAProxy,
making it possible to load a completely new version of HAProxy
(including a new version of the systemd wrapper) gracefully.
Since the wrapper accepts no command-line arguments of its own,
re-execution is signaled using the HAPROXY_SYSTEMD_REEXEC environment
variable.
This is primarily intended to help seamless upgrades of distribution
packages.
OpenBSD complains this way due to strncat() :
src/haproxy-systemd-wrapper.o(.text+0xd5): In function `spawn_haproxy':
src/haproxy-systemd-wrapper.c:33: warning: strcat() is almost always misused, please use strlcat()
In fact, the code before strncat() here is wrong, because it may
dereference a NULL if /proc/self/exe is not readable. So fix it
and get rid of strncat() at the same time.
No backport is needed.
There is a compiler warning after commit 1b6e75fa84 ("MEDIUM: haproxy-
systemd-wrapper: Use haproxy in same directory"):
src/haproxy-systemd-wrapper.c: In function ‘locate_haproxy’:
src/haproxy-systemd-wrapper.c:28:10: warning: ignoring return value of ‘readlink’, declared with attribute warn_unused_result [-Wunused-result]
Fix the compiler warning by checking the return value of readlink().
Send SIGINT to child processes when killed. This ensures that
the haproxy process managed by the systemd-wrapper is stopped
when "systemctl stop haproxy.service" is called.
Formerly, if A was replaced by B, and then B by C before
A finished exiting, we didn't wait for B to finish so it
ended up as a zombie process.
Fix this by waiting randomly every child we spawn.
Signed-off-by: Marc-Antoine Perennou <Marc-Antoine@Perennou.com>
Currently, to reload haproxy configuration, you have to use "-sf".
There is a problem with this way of doing things. First of all, in the systemd world,
reload commands should be "oneshot" ones, which means they should not be the new main
process but rather a tool which makes a call to it and then exits. With the current approach,
the reload command is the new main command and moreover, it makes the previous one exit.
Systemd only tracks the main program, seeing it ending, it assumes it either finished or failed,
and kills everything remaining as a grabage collector. We then end up with no haproxy running
at all.
This patch adds wrapper around haproxy, no changes at all have been made into it,
so it's not intrusive and doesn't change anything for other hosts. What this wrapper does
is basically launching haproxy as a child, listen to the SIGUSR2 (not to conflict with
haproxy itself) signal, and spawing a new haproxy with "-sf" as a child to relay the
first one.
Signed-off-by: Marc-Antoine Perennou <Marc-Antoine@Perennou.com>