BUG/MINOR: mworker: don't try to access an initializing process

In pcli_prefix_to_pid(), when resolving a worker by absolute pid
(@!<pid>) or by relative pid (@1), a worker that still has PROC_O_INIT
set (i.e. not yet ready, still initializing) could be returned as a
valid target.

During a reload, if a client connects to the master CLI and sends a
command targeting a worker (e.g. @@1 or @@!<pid>), the master resolves
the target pid and attempts to forward the command by transferring a fd
over the worker's sockpair. If the worker is still initializing and has
not yet sent its READY signal, its end of the sockpair is not usable,
causing send_fd_uxst() to fail with EPIPE. This results in the
following alert being repeated in a loop:

  [ALERT] (550032) : socketpair: Cannot transfer the fd 13 over sockpair@5. Giving up.

The situation is even worse if the initializing worker has already
exited (e.g. due to a bind failure) but has not yet been removed from
the process list: in that case the sockpair's remote end is already
closed, making the failure immediate and unrecoverable until the dead
worker is cleaned up.
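
The dead-peer case can be reproduced outside HAProxy. The sketch below is
not HAProxy code: send_fd_sketch() and transfer_to_dead_peer() are
hypothetical helpers that only mimic what a SCM_RIGHTS transfer over a
socketpair looks like. It assumes a Linux-like system and an AF_UNIX
SOCK_STREAM socketpair, where writing to a socket whose peer has been
closed reports EPIPE once SIGPIPE is ignored.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper (NOT HAProxy's send_fd_uxst()): pass fd_to_send over
 * the UNIX socket 'sock' as SCM_RIGHTS ancillary data.
 * Returns 0 on success, or the errno value reported by sendmsg(). */
static int send_fd_sketch(int sock, int fd_to_send)
{
	char byte = 0; /* at least one byte of payload must accompany the fd */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align; /* forces correct alignment of buf */
	} ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	assert(sock >= 0 && fd_to_send >= 0);

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));

	if (sendmsg(sock, &msg, 0) < 0)
		return errno;
	return 0;
}

/* Create a socketpair, close the "worker" end, then try to transfer a fd
 * from the "master" end. Returns the errno value of the failed transfer,
 * or 0 if the transfer unexpectedly succeeded. */
static int transfer_to_dead_peer(void)
{
	int sv[2];
	int err;

	signal(SIGPIPE, SIG_IGN); /* get an error code instead of a fatal signal */
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return errno;

	close(sv[1]);                        /* the "worker" side is gone */
	err = send_fd_sketch(sv[0], sv[0]);  /* any valid fd will do as payload */
	close(sv[0]);
	return err;
}
```

Calling transfer_to_dead_peer() is expected to return EPIPE on Linux.
Retrying does not help, since nothing can reopen the closed end of the pair.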

This was not possible before 3.1 because the master's polling loop only
started once all workers were fully ready, making it impossible to
receive CLI connections while a worker was still initializing.

Fix this by skipping workers with PROC_O_INIT set in both the absolute
and relative pid resolution paths of pcli_prefix_to_pid(), so that
only fully initialized workers can be targeted.
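
As an illustration only, the filtering rule can be modelled in isolation.
Everything in the sketch below is an assumption made for the example: the
struct, the flat array used in place of proc_list, the SK_* flag values
and the -1 "not found" return do not come from HAProxy's headers. Only the
condition itself mirrors the one added by this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder flags standing in for PROC_O_TYPE_WORKER and PROC_O_INIT.
 * The values are made up for this sketch. */
#define SK_PROC_O_TYPE_WORKER 0x1
#define SK_PROC_O_INIT        0x2

/* Minimal stand-in for a process list entry (hypothetical layout). */
struct child_sketch {
	int options;  /* SK_PROC_O_* flags */
	int pid;
	int reloads;  /* 0 for the most recent generation */
};

/* Same test as in the patch: only ready workers may be targeted. */
static int is_targetable(const struct child_sketch *c)
{
	if (!(c->options & SK_PROC_O_TYPE_WORKER) || (c->options & SK_PROC_O_INIT))
		return 0;
	return 1;
}

/* Absolute lookup ("@!<pid>"): returns the pid, or -1 if no ready worker
 * has that pid. */
static int resolve_absolute(const struct child_sketch *procs, size_t n, int pid)
{
	size_t i;

	assert(procs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (!is_targetable(&procs[i]))
			continue;
		if (procs[i].pid == pid)
			return procs[i].pid;
	}
	return -1;
}

/* Relative lookup of the current worker: the ready worker with the least
 * number of reloads (0), or -1 if there is none yet. */
static int resolve_current(const struct child_sketch *procs, size_t n)
{
	size_t i;

	assert(procs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (!is_targetable(&procs[i]))
			continue;
		if (procs[i].reloads == 0)
			return procs[i].pid;
	}
	return -1;
}
```

With a list holding a ready worker from the previous generation and a new
worker that still has the init flag set, both lookups skip the new worker.
The command is therefore rejected up front and no fd transfer is attempted.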

Must be backported to 3.1 and later.
William Lallemand 2026-03-18 16:53:43 +01:00
parent b93137ce67
commit c6221db375

@@ -2876,7 +2876,7 @@ static int pcli_prefix_to_pid(const char *prefix)
 		if (*errtol != '\0')
 			return -1;
 		list_for_each_entry(child, &proc_list, list) {
-			if (!(child->options & PROC_O_TYPE_WORKER))
+			if (!(child->options & PROC_O_TYPE_WORKER) || (child->options & PROC_O_INIT))
 				continue;
 			if (child->pid == proc_pid){
 				return child->pid;
@@ -2899,7 +2899,7 @@ static int pcli_prefix_to_pid(const char *prefix)
 		/* chose the right process, the current one is the one with the
 		   least number of reloads */
 		list_for_each_entry(child, &proc_list, list) {
-			if (!(child->options & PROC_O_TYPE_WORKER))
+			if (!(child->options & PROC_O_TYPE_WORKER) || (child->options & PROC_O_INIT))
 				continue;
 			if (child->reloads == 0)
 				return child->pid;