BUG/MINOR: mworker: set a timeout on the worker socketpair read at startup

During a soft reload, a starting worker sends sock_pair[0] to the master
via send_fd_uxst(), then reads on sock_pair[1] waiting for the master to
acknowledge receipt. Because of a documented macOS sendmsg(2) bug, the
worker must keep sock_pair[0] open until the master confirms the fd was
received by the CLI applet. This means the read() on sock_pair[1] will
never return 0 (EOF), since the worker itself still holds a reference to
sock_pair[0]. The worker can only unblock when the master actively sends
a byte back. If the master crashes before doing so, the worker blocks
indefinitely in read().

Fix this by setting a 2-second SO_RCVTIMEO on sock_pair[1] before the
read(), so the worker can unblock and continue regardless of the master's
state.
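
The behaviour described above can be reproduced outside HAProxy with a
minimal sketch. The helper names wait_for_ack(), demo_silent_peer() and
demo_acking_peer() are hypothetical and exist only for this illustration;
the only calls taken from the patch are socketpair(), shutdown(),
setsockopt() with SO_RCVTIMEO, read() and close(). The process keeps both
ends of the pair open, as the worker does, so a silent peer can never
produce EOF and only the timeout unblocks the read:

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper, not HAProxy code: wait up to timeout_ms for one
 * byte on fd. Returns 1 if a byte arrived, 0 on EOF, -1 on timeout
 * (errno is EAGAIN or EWOULDBLOCK) or on any other error.
 * A timeout of 0 means "no timeout" for SO_RCVTIMEO, so pass > 0.
 */
static int wait_for_ack(int fd, long timeout_ms)
{
	struct timeval tv = {
		.tv_sec  = timeout_ms / 1000,
		.tv_usec = (timeout_ms % 1000) * 1000,
	};
	ssize_t r;
	char c;

	if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) == -1)
		return -1;

	/* Retry if a signal interrupts the read. */
	do {
		r = read(fd, &c, 1);
	} while (r == -1 && errno == EINTR);

	return (int)r;
}

/* The peer never answers and we still hold sp[0] ourselves, so read()
 * cannot see EOF: without the timeout it would block forever.
 * Returns -2 if the socketpair could not be created.
 */
static int demo_silent_peer(long timeout_ms)
{
	int sp[2];
	int r;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sp) == -1)
		return -2;

	shutdown(sp[1], SHUT_WR);
	r = wait_for_ack(sp[1], timeout_ms);
	close(sp[1]);
	close(sp[0]);
	return r;
}

/* The peer acknowledges with one byte: read() returns immediately.
 * Returns -2 on a setup failure.
 */
static int demo_acking_peer(long timeout_ms)
{
	int sp[2];
	int r;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sp) == -1)
		return -2;

	shutdown(sp[1], SHUT_WR);
	if (write(sp[0], "!", 1) != 1)
		r = -2;
	else
		r = wait_for_ack(sp[1], timeout_ms);
	close(sp[1]);
	close(sp[0]);
	return r;
}
```

With a short timeout such as 100 ms, demo_silent_peer() is expected to
return -1 once the timeout expires, while demo_acking_peer() returns 1
without waiting. Unlike the patch, the sketch checks the setsockopt()
return value; that is a choice made for the example, not a statement
about what the fix should do.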

The bug was introduced by d7f6819161c ("BUG/MEDIUM: mworker: fix startup
and reload on macOS").

This should be backported to 3.1 and later.
William Lallemand 2026-03-13 18:41:05 +01:00
parent cb51c8729d
commit 51d6f1ca4f


@@ -3765,6 +3765,7 @@ int main(int argc, char **argv)
char *msg = NULL;
char c;
int r __maybe_unused;
+struct timeval tv = { .tv_sec = 2, .tv_usec = 0 };
if (socketpair(PF_UNIX, SOCK_STREAM, 0, sock_pair) == -1) {
ha_alert("[%s.main()] Cannot create socketpair to update the new worker state\n",
@@ -3803,6 +3804,7 @@ int main(int argc, char **argv)
* we make sure that the fd is received correctly.
*/
shutdown(sock_pair[1], SHUT_WR);
+setsockopt(sock_pair[1], SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
r = read(sock_pair[1], &c, 1);
close(sock_pair[1]);
close(sock_pair[0]);