BUG/MEDIUM: mworker: fix startup and reload on macOS

Since the mworker rework in haproxy 3.1, the worker needs to tell the
master that it is ready. This is done using the sockpair protocol by
sending a _send_status message to the master.

It seems that the sockpair protocol is buggy on macOS because of a known
issue around fd transfer documented in sendmsg(2):

https://man.freebsd.org/cgi/man.cgi?sendmsg(2) BUGS section

  Because sendmsg() does not necessarily block until the data has been
  transferred, it is possible to transfer an open file descriptor across
  an AF_UNIX domain socket (see recv(2)), then close() it before it has
  actually been sent, the result being that the receiver gets a closed
  file descriptor. It is left to the application to implement an
  acknowledgment mechanism to prevent this from happening.

Indeed, the sending side closes the receiving end of the sockpair right
after send_fd_uxst(), which does not implement an acknowledgment
mechanism. So the master might never receive the _send_status message.
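
For readers unfamiliar with fd transfer, the sketch below shows how a file
descriptor travels over an AF_UNIX socket as SCM_RIGHTS ancillary data. It
is not HAProxy's send_fd_uxst(): send_fd(), recv_fd() and fd_passing_demo()
are hypothetical names used for illustration only. The demo deliberately
closes the sender's copy only after the receiver has obtained its own,
which is the ordering the BUGS section asks applications to guarantee.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_ctrl {
	struct cmsghdr align;
	char buf[CMSG_SPACE(sizeof(int))];
};

/* Hypothetical helper: send <fd> over the UNIX socket <sock> as
 * SCM_RIGHTS ancillary data. Returns 0 on success, -1 on error.
 */
static int send_fd(int sock, int fd)
{
	char byte = 0; /* one byte of regular data carries the ancillary data */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctrl ctrl;
	struct msghdr mh;
	struct cmsghdr *cmsg;

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&mh, 0, sizeof(mh));
	mh.msg_iov = &iov;
	mh.msg_iovlen = 1;
	mh.msg_control = ctrl.buf;
	mh.msg_controllen = sizeof(ctrl.buf);

	cmsg = CMSG_FIRSTHDR(&mh);
	assert(cmsg != NULL);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &mh, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one fd from <sock>. Returns it, or -1. */
static int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctrl ctrl;
	struct msghdr mh;
	struct cmsghdr *cmsg;
	int fd = -1;

	memset(&mh, 0, sizeof(mh));
	mh.msg_iov = &iov;
	mh.msg_iovlen = 1;
	mh.msg_control = ctrl.buf;
	mh.msg_controllen = sizeof(ctrl.buf);

	if (recvmsg(sock, &mh, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&mh);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}

/* Pass the write end of a pipe through a socketpair, then check that the
 * received fd really refers to the same pipe. Returns 0 on success.
 */
static int fd_passing_demo(void)
{
	int chan[2], p[2], got;
	char c = 0;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, chan) == -1 || pipe(p) == -1)
		return -1;
	if (send_fd(chan[0], p[1]) == -1)
		return -1;
	got = recv_fd(chan[1]);
	close(p[1]); /* closed only once the receiver holds its own copy */
	if (got == -1 || write(got, "x", 1) != 1 || read(p[0], &c, 1) != 1)
		return -1;
	close(got);
	close(p[0]);
	close(chan[0]);
	close(chan[1]);
	return c == 'x' ? 0 : -1;
}
```

Calling fd_passing_demo() from a test program should return 0. Moving the
close(p[1]) before recv_fd() would reproduce the ordering that the BUGS
section warns about.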

In order to implement an acknowledgment mechanism, a blocking read() is
done before closing the recv fd on the sending side, so we are sure that
the message was read on the other side.
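
The acknowledgment idea can be sketched outside of HAProxy as follows. This
is a simplified illustration, not the actual patch: it uses fork() instead
of fd transfer, and send_and_wait_ack() and ack_demo() are hypothetical
names. The sender writes its message, half-closes the socket, then blocks
in read() until the peer answers; the peer plays the master's role and
replies with a single "\n" once it has consumed the message.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sender side: send <msg>, half-close, then block until the peer reacts.
 * Returns 0 once the peer has answered (or closed), -1 on error.
 */
static int send_and_wait_ack(int fd, const char *msg)
{
	size_t len;
	char c;

	assert(msg != NULL);
	len = strlen(msg);
	if (send(fd, msg, len, 0) != (ssize_t)len)
		return -1;
	/* nothing more to send: the peer sees EOF right after the message */
	if (shutdown(fd, SHUT_WR) == -1)
		return -1;
	/* blocking read: returns only after the peer wrote a byte or closed,
	 * so the peer is known to hold a working fd at this point
	 */
	if (read(fd, &c, 1) < 0)
		return -1;
	return 0;
}

/* Runs the sender in a child process and the receiver in the parent.
 * Returns 0 if the message arrived intact and the child got its ack.
 */
static int ack_demo(void)
{
	const char *expected = "_send_status READY\n";
	char buf[64];
	size_t total = 0;
	ssize_t n;
	int sp[2], status;
	pid_t pid;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sp) == -1)
		return -1;
	pid = fork();
	if (pid == -1)
		return -1;
	if (pid == 0) {
		/* child: plays the worker */
		close(sp[0]);
		_exit(send_and_wait_ack(sp[1], expected) == 0 ? 0 : 1);
	}

	/* parent: plays the master, reads until the sender's half-close */
	close(sp[1]);
	while (total < sizeof(buf) - 1 &&
	       (n = read(sp[0], buf + total, sizeof(buf) - 1 - total)) > 0)
		total += (size_t)n;
	buf[total] = '\0';

	/* acknowledgment: wakes up the sender's blocking read() */
	if (write(sp[0], "\n", 1) != 1)
		return -1;
	close(sp[0]);

	if (waitpid(pid, &status, 0) != pid)
		return -1;
	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
		return -1;
	return strcmp(buf, expected) == 0 ? 0 : -1;
}
```

Calling ack_demo() should return 0. The blocking read() is what makes this
simple: in non-blocking mode the sender would have to keep state and wait
for the reply from its event loop.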

This was only reproduced on macOS, which means the master CLI is also
affected there. No fix was found for the master CLI, however:
implementing an acknowledgment mechanism would make the protocol too
complex in non-blocking mode.

The problem was reported in ticket #3045, reproduced and analyzed by
@cognet.

Must be backported as far as 3.1.
William Lallemand 2025-08-28 14:18:42 +02:00
parent 441cd614f9
commit d7f6819161


@@ -3590,6 +3590,7 @@ int main(int argc, char **argv)
 		struct mworker_proc *proc;
 		int sock_pair[2];
 		char *msg = NULL;
+		char c;
 		if (socketpair(PF_UNIX, SOCK_STREAM, 0, sock_pair) == -1) {
 			ha_alert("[%s.main()] Cannot create socketpair to update the new worker state\n",
@@ -3611,7 +3612,6 @@ int main(int argc, char **argv)
 			exit(1);
 		}
-		close(sock_pair[0]);
 		memprintf(&msg, "_send_status READY %d\n", getpid());
 		if (send(sock_pair[1], msg, strlen(msg), 0) != strlen(msg)) {
@@ -3619,7 +3619,17 @@ int main(int argc, char **argv)
 			exit(1);
 		}
+		/* in macOS, the sock_pair[0] might be received in the master
+		 * process after it was closed in the worker, which is a
+		 * documented bug in sendmsg(2). We need to close the fd only
+		 * after confirming receipt of the "\n" from the CLI applet, so
+		 * we make sure that the fd is received correctly.
+		 */
+		shutdown(sock_pair[1], SHUT_WR);
+		read(sock_pair[1], &c, 1);
 		close(sock_pair[1]);
+		close(sock_pair[0]);
 		ha_free(&msg);
 		/* at this point the worker must have his own startup_logs buffer */