Commit Graph

23300 Commits

Author SHA1 Message Date
Valentine Krasnobaeva
babbcb047e MINOR: startup: encapsulate sock_get_old_sockets in a function
Let's encapsulate here the code, that calls sock_get_old_sockets() to obtain
listeners sockets from the previous process into a separate function. This
will make the code of main() more readable and we can move this new function
(if we might need so) in future.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
f4e73b4302 MINOR: startup: simplify check for calling sock_get_old_sockets
MODE_CHECK and MODE_CHECK_CONDITION are applied now very early in
step_init_1() and step_init_2() in order to check the configuration or to check
some condition provided via the command line. When these checks have
terminated, the main process exits. So, no longer need to verify these modes at
the moment, when the current process have already done its basic initialization
routines and is asking for listeners sockets from the previously started one.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
c4795e4019 MINOR: startup: split main: add step_init_3
The first part of main(), just after calling the former init() and before
trying to bind listeners, need to be also encapsulated into a separate
step_init_3() as it is. It contains important blocks to register signals, to
apply memory and nofile limits, etc. The order of these blocks should be also
preserved (especially the signals part).

For the moment step_init_3() must be also executed for all runtime modes.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
49772c55e3 MINOR: startup: split init() into separate initialization routines
This is the first commit in a series to add a support of the 5-th reload
use case, when the master process fails to read its new configuration. In this
case it just need to perform its initialization steps and keep the existed
worker.

To add the support for this last use case we need to split init() and main()
in a shorter steps in order to encapsulate necessary initialization routines
into separate functions.

Let's at first, make here progname as a global variable for haproxy.c, as it
will be used in error messages in the initialization functions. Then let's
split the init() into separate routines, which set and apply modes, write
process PID in a pidfile, etc.

The big part of the former init(), which called functions to allocate pools,
to initialize proxies, to calculate maxconn and to perform some post checks was
just encasulated as is, into step_init_2(). It will take some time to figure
out exactly which parts of this initialization block are really necessary for
the master process and which ones it could skip. So, for the moment
step_init_2() is called for all runtime modes.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
81dbc2c2e2 MEDIUM: startup: split sending oldpids_sig logic for standalone and mworker modes
Before refactoring the master-worker mode, in all runtime modes, when the new
process successfully parsed its configuration and bound to sockets, it sent
either SIGUSR1 or SIGTERM to the previous one in order to terminate it.

Let's keep this logic as is for the standalone mode. In addition, in standalone
mode we need to send the signal to old process before calling set_identity(),
because in set_identity() effective user or group may change. So, the order is
important here.

In case of master-worker mode after refactoring, master terminates the previous
worker by itself up to receiving "READY" status from the new one in
_send_status(). Master also sets at this moment HAPROXY_LOAD_SUCCESS env
variable and checks, if there are some other workers to terminate with
max_reloads exceeded.

So, now in master-worker mode we terminate old workers only, when the new one
has successfully done all initialization steps and has sent "READY" status to
master.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
b73a278df4 MINOR: mworker/cli: add _send_status to support state transition
In the new master-worker architecture, when a worker process is forked and
successfully initialized it needs somehow to communicate its "READY" state to
the master, in order to terminate the previous worker and workers, that might
exceeded max_reloads counter.

So, let's implement for this a new master CLI _send_status command. A new
worker can send its status string "READY" to the master, when it's about
entering to the run poll loop, thus it can start to receive data.

In _send_status() in the master context we update the status of the new worker:
PROC_O_INIT flag is withdrawn.

When TERM signal is sent to a worker, worker terminates and this triggers the
mworker_catch_sigchld() handler in master. This handler deletes the exiting
process entry from the processes list.

In _send_status() we loop over the processes list twice. At the first time, in
order to stop workers that exceeded the max_reloads counter. At the second time,
in order to stop the worker forked before the last reload. In the corner case,
when max_reloads=1, we avoid to send SIGTERM twice to the same worker by
setting sigterm_sent flag during the first loop.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
154848a314 MINOR: mworker: simplify the code that sets PROC_O_LEAVING
When master performs a reexec it should set for an already existed worker the
flag PROC_O_LEAVING. It means that existed worked is marked as the previous one
and will be terminated after the reload.

In the previous implementation master process was need to do the reexec
twice (the first time for parsing its configuration and the second time to free
unused ressources). So the logic of setting PROC_O_LEAVING was based on
comparing the number of reloads, performed by each process from the processes
list, except the master.

Now, as being mentioned before, reexec is performed only once. So, in this case
we need to set PROC_O_LEAVING flag, when we deserialize the list. It is done for
all processes, which have the number of reloads stricly positive.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
c8aac63893 MINOR: mworker: add support for case when new worker dies
The case, when the new worker fails while it parses its configuration or while
it tries to apply it, could be considered as the new one, because the master
process is no longer need to reexec again. The master simply keeps the previous
worker (forked before the reload) and it let the new one to exit with failure.

When the new worker exits, in the master process context (mworker_catch_sigchld)
we need to stop a MASTER proxy listener and we need to drop the server,
attached to new worker's CLI sockpair (it's inherited in master). Then we
explicitly delete master's end of this sockpair (child->ipc_fd[0]) from the
fdtab and we free the memory allocated for the worker process.
on_new_child_failure() is called before the clean up to signal systemd that
reload/load was failed.

If the new worker fails during the first start, so there is no any previous
worker, master process should exit immediately in order to keep the same
behaviour, as it was before this architecture change.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
2bb07b913d MINOR: startup: rename and adapt reexec_on_failure
Previously reexec_on_failure() was called in cases when the process has failed
after reload, while it was parsing its configuration or it was trying to apply
it. reexec_on_failure() has called mworker_reexec() and the master process has
been reexecuted.

With the new architecture in such cases there is no longer need to reexecute
the master process after its reload again. It simply keeps the previous worker,
forked before the reload, and it lets the new one to exit with an error. But we
still need the code, which increments the number of failed reloads and which
notifies systemd with new "Reload failed!" status. So, let's reuse and adapt
for this reexec_on_failure() and let's rename it to on_new_child_failure().
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
9b27f82da3 MINOR: mworker: mworker_catch_sigchld: use fd_delete instead of close
If the worker exits due to failure or due to receiving TERM signal, in the
master context, we can't now simply close the master's fd (ipc_fd[0]) of
the inherited master CLI sockpair.

When the worker is created, in the master process context MASTER proxy listener
is bound to ipc_fd[0]. When this worker fails or exits, master process is
always in its polling loop. So, closing some fd in its context immediately
triggers the BUG_ON(fd->owner), as the poller try to reinsert the "freed" fd
into fdtab and try to reuse it. We must call fd_delete in this case. This will
deinitializes fd auxilary data and closes its properly.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
cf150fd73d MEDIUM: mworker/cli: close child and parent fds, setup listeners
Basically, this is the continuation of the previous commits. So, here after the
fork, worker process closes the "master" end of the copied CLI sockpair and
binds its end, ipc_fd[1], to the GLOBAL proxy listener.
mworker_cli_global_proxy_new_listener() guarantees that GLOBAL proxy will be
created, if it wasn't the case before.

Master process, at first, allocates the MASTER proxy, creates master CLI listener
(-S command line option) and reload sockpair and then closes the "worker" end of
the copied CLI sockpair and binds its end, ipc_fd[0], to the created MASTER proxy.

Usage of the new PROC_O_INIT state helps to reduce test conditions to find the
newly forked worker.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
646299fc95 MINOR: mworker: add and set state PROC_O_INIT for new worker
Here, to distinguish between the new worker and the previous one let's add a
new process state PROC_O_INIT and let's set it, when the memory is allocated
for the new worker in the processes list.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
26ad5465cc MINOR: mworker/cli: create MASTER proxy before mcli listeners
For the master process we always need to create a MASTER proxy, even if
master cli settings were not provided via command line, because now we bind a
listener in the master process context at ipc_fd[0]. So, MASTER proxy should be
already allocated at this moment.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
6ec38c9a74 MINOR: mworker/cli: create master CLI sockpair before fork
The main idea here is to create a master CLI inherited sockpair just before the
master-worker fork. And only then after the fork let each process to bind a
needed listener to the its end of this sockpair.

Like this master and worker processes can close unused "ends" of its sockpair
copy (ipc_fd[0] for worker and and ipc_fd[1] for master).

When this sockpair creation happens inside the
mworker_cli_global_proxy_new_listener() is not possible for the master to
close ipc_fd[1] bound to the GLOBAL proxy listener, as this triggers a
BUG_ON(fd->owner) in fd_insert() in master context, because master process
has alredy entered in its polling loop and poller in its turn tries to reused
closed fd.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
cc1a631beb MINOR: mworker/cli: rename and clean mworker_cli_sockpair_new
Let's rename mworker_cli_sockpair_new() to
mworker_cli_global_proxy_new_listener() to outline that this function creates
the GLOBAL proxy, allocates the listener with "master-socket" bind conf and
attaches this listener to this GLOBAL proxy. Listener is bound to ipc_fd[1] of
the sockpair inherited in master and in worker (master CLI sockpair).
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
0fbf1973ad MINOR: mworker/cli: rename mworker_cli_proxy_new_listener
This is the first commit in a series to add the support of 4 primary reload
use-cases for the new master-worker architecture:

1. Newly forked worker process dies before any reload, due to some errors in
   the configuration. Newly forked worker process crashes before any reload
   after sending its "READY" state to master.

2. Newly forked worker process dies due to some errors in the new
   configuration. This happens after reload, when this new configuration was
   supplied, so the previous worker process is still here.

3. Newly forked worker process crashes after sending its "READY" state to
   master due to some bugs. This happens after reload, so the previous worker
   process is still here.

4. Newly forked worker process has sent its "READY" state to master and starts
   to receive traffic. This happens after reload, the old worker hasn't
   terminated yet, as it is waiting on some idle connection and it crashes.

Let's rename in this commit mworker_cli_proxy_new_listener() to
mworker_cli_master_proxy_new_listener() to outline, that this function creates
"master-socket" bind conf and allocates a listener. This listener is attached
to the MASTER proxy and it's bound to the ipc_fd[0] of the sockpair,
inherited in master and in worker processes (master CLI sockpair).
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
223caab96f MINOR: cfgparse: fix *thread keywords sensitive to global section position
*thread keywords parsers are sensitive to global section position. If they are
present there, the global section must be the first section in the
configuration. *thread parsers logic is based on non_global_section_parsed
counter. So, we need to reset it explicitly before the second configuration
read done by worker or in a standalone mode.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
0ed262d7bf MEDIUM: startup: introduce load_cfg and read_cfg
This commit is a part of the series to add a support of discovery mode in the
configuration parser and in initialization sequence.

In order to support discovery mode, we need to read the configuration twice.
So, we need to split the stage, when we load all configuration files, from
the stage when we parse it. To do this, let's encapsulate in read_cfg() the
part, where we load the configuration files in a separate function, load_cfg().
Like this we can call only the parsing part as many times as we need.

Before reading configuration at the first time we set MODE_DISCOVERY. After
the reading this mode is immediately unset, as the real runtime mode has been
already set by discovery keywords parsers.

Second read is performed when all primary runtime modes (daemon, master-worker)
are applied, because we should not read the configuration twice in the master
process.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
e2b4768224 MEDIUM: cfgparse: parse only "global" section in MODE_DISCOVERY
This commit is a part of the series to add a support of discovery mode in the
configuration parser and in initialization sequence.

So, in discovery mode, when we read the configuration the first time, we
parse for the moment only the "global" section. Unknown section names will be
ignored.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
699be6a55d MEDIUM: cfgparse-global: parse only KWF_DISCOVERY keywords in MODE_DISCOVERY
This commit is a part of the series to add a support of discovery mode in the
configuration parser and in initialization sequence.

Global section parser parses the majority of keywords in its function, so
those keywords don't have any dedicated parsers yet. Only after this parsing
block cfg_parse_global() starts to call dedicated parsers for any other
discovered keywords, which were not found in the block.

As all keywords, which should be parsed in MODE_DISCOVERY have its own parser
funtions, we can skip this block with goto discovery_kw and start directly from
the part, where we call parsers from the keywords list. KWF_DISCOVERY flag helps
to call in MODE_DISCOVERY only the parsers, which we are needed at this mode.

All unknown keywords and garbage will be ignored at this stage.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
48371e6a30 MEDIUM: cfgparse: call some parsers only in MODE_DISCOVERY
This commit is a part of the series to add a support of discovery mode in the
configuration parser and in initialization sequence.

Some keyword parsers tagged with KWF_DISCOVERY (for example those, which parse
runtime modes, poller types, pidfile), should not be called twice when
the configuration will be read the second time after the discovery mode.
It's redundant and could trigger parser's errors in standalone mode. In
master-worker mode the worker process inherits parsed settings from the master.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
f9123e2183 MEDIUM: cfgparse: add KWF_DISCOVERY keyword flag
This commit is a part of the series to add a support of discovery mode in the
configuration parser and in initialization sequence.

So, let's add here KWF_DISCOVERY flag to distinguish the keywords,
which should be parsed in "discovery" mode and which are needed for master
process, from all others. Keywords, that should be parsed in "discovery" mode
have its dedicated parser funtions. Let's tag these functions with
KWF_DISCOVERY flag in keywords list. Like this, only these keyword parsers
might be called during the first configuration read in discovery mode.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
6769745fe5 MINOR: global: add MODE_DISCOVERY flag
This is the first commit from a series to add a support of discovery mode
in the configuration parser and in initialization sequence.

Discovery mode is the mode, when we read the configuration at the first time
and we parse and set runtime modes: daemon, zero-warning, master-worker. In
this mode we also parse some parameters needed for the master process to start,
in case if we are in the master-worker mode. Like this the master process
doesn't allocate any additional resources, which it doesn't use and it quickly
finishes its initialization and enters to its polling loop. The worker process
after its fork reads the rest of the configuration.

So, let's add in this commit MODE_DISCOVERY flag to check it in
configuration parser functions.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
fe75c1e12d MEDIUM: startup: remove MODE_MWORKER_WAIT
MODE_MWORKER_WAIT becames redundant with MODE_MWORKER, due to moving
master-worker fork in init(). This change allows master no longer perform
reexec just after forking in order to free additional memory.

As after the fork in the master process we set 'master' variable, we can
replace now MODE_MWORKER_WAIT in some 'if' statements by simple check of this
'master' variable.

Let's also continue to get rid of HAPROXY_MWORKER_WAIT_ONLY environment
variable, as it's no longer needed as well.

In cfg_program_postparser(), which is used to check if cmdline is defined to
launch a program, we completely remove the check of mode for now, because
the master process does not parse the configuration for the moment. 'program'
section parsing will be reintroduced in master later in the next commits.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
fb7bef781d MINOR: defaults: update MASTER_MAXCONN description
This is a one of the commits to prepare the removal of MODE_MWORKER_WAIT
support, as it became redundant with MODE_MWORKER due to moving master-worker
fork in init().
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
3f5f57845b MINOR: startup: rename exit_on_waitmode_failure to exit_on_failure
As we no longer support MODE_MWORKER_WAIT for master (it became redundant with
MODE_MWORKER after moving master-worker fork in init()), let's rename
exit_on_waitmode_failure() callback in just exit_on_failure().
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
7795d49ae6 CLEANUP: haproxy: rm no longer used mworker_reexec_waitmode
This a first commit to prepare the removal of MODE_MWORKER_WAIT support. It has
became redundant with MODE_MWORKER, due to moving master-worker fork in init().
Master process does no longer perform reexec to free additional memory after
forking and does no longer changing its mode to MODE_MWORKER_WAIT, where it has
entered to its wait polling loop and has handled signals. Now, master enters in
this loop almost immediately after forking a worker and being always in mode
MODE_MWORKER.

So, we can remove mworker_reexec_waitmode() wrapper, which was used to set
HAPROXY_MWORKER_WAIT_ONLY variable and to call mworker_reexec(). But let's keep
for the moment the logic of reexec_on_failure() atexit callback for master in
order if in the future we will need to support this case again.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
cb0f1f42e1 MINOR: startup: only worker gets capabilities from bin
Due to moving the master-worker fork in init(), we need to protect
prepare_caps_from_permitted_set() call, which is executed after init(). This
call makes sense only for worker, daemon and for foreground mono process modes.

prepare_caps_from_permitted_set() allows to read Linux capabilities from
haproxy binary and to move some of them in process Effective set, if 'setcap'
keyword lists needed capabilities in the global section.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
fe04c2ad37 MEDIUM: startup: do set_identity() if needed in one place
There are two set_identity() calls, both under quite same:

    'if ((global.mode & (MODE_MWORKER|MODE_DAEMON...)...'

The first call serves to change uid/gid and set some needed Linux capabilities
only for process in the foreground mode. The second comes after master-worker
fork and allows to do the same in daemon and in worker modes.

Due to moving the master-worker fork in init() in some previous commit, the
second set_identity() now is no longer under the 'if'. So, it is executed
for all modes, except MODE_MWORKER. Now in MODE_MWORKER process enters in its
wait polling loop just after forking a worker and it terminates almost
immediately, if it exits this loop.

Worker, daemon and process in a foreground mode will perform set_identity() as
before, but now it will be called in a one place at main().

global.last_checks should be verified just after set_identity() call. As it's
stated in comments some configuration options may require full privileges or
some Linux capabilities need to be granted to process. set_identity() via
prepare_caps_for_setuid() may put configured capabilities in process Effective
set and, hence, remove respective flag from global.last_checks.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
02af1fe067 MEDIUM: startup: call chroot() if needed in one place
There are two 'chroot' code blocks, both under quite same:

	'if ((global.mode & (MODE_MWORKER|MODE_DAEMON...)...'

The first block serves to perform chroot only for process in the foreground
mode. The second comes after master-worker fork and allows to do chroot
in daemon and in worker modes.

Due to moving the master-worker fork in init() in some previous commit, the
second 'chroot' code block now is no longer under the 'if'. So, it is executed
for all modes, except MODE_MWORKER. Now in MODE_MWORKER process enters in its
wait polling loop just after forking a worker and it terminates almost
immediately, if it exits this loop.

Worker, daemon and process in a foreground mode will perform the chroot as
before, but now it will be done in a one place at main().
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
7a2ee10d71 REORG: mworker: move mworker_create_master_cli in master 'case'
Let's move mworker_create_master_cli() call in 'master' case just above and get
rid of redundant global.mode tests.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
e4c10a704d MINOR: init: check MODE_MWORKER before creating master CLI
mworker_create_master_cli() creates MASTER proxy and allocates listeners,
which are attached to this proxy. It also creates a reload sockpair.

So, it's more appropriate to do the check, that we are in a MODE_MWORKER, if
master CLI settings were provided via command line, just after the config
parsing. And only then, if runtime mode and command line settings are
coherent, try to perform master-worker fork and try to create master CLI.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
26e53e2e8c REORG: mworker: set nbthread=1 for master after fork
After moving master-worker fork into init() and reintroducing it into a
switch-case (see the previous commit), it is more appropriate to set
nbthread=1 and nbtgroups=1 immediately in the 'case' for the parent process.
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
ae84f06025 BUG/MINOR: mworker: fix memory leak due to master-worker fork
Before this fix, startup logs ring was duplicated before the fork(), so master
and worker had both the original startup_logs ring and the duplicated one. In
the worker context we freed the original ring and used a duplicated one. In
the master context we did nothing, but we still create a duplicated copy again
and again during the reload.

So, let's duplicate startup logs ring only in the worker context. Master
continues to use the original ring initialized in init() before its fork().
2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva
8dd4efe42f MAJOR: mworker: move master-worker fork in init()
This refactoring allows to simplify 'master-worker' logic. The master process
with this change will fork a worker very early at the initialization stage,
which allows to perform a configuration parsing only for the worker. In reality
only the worker process needs to parse and to apply the whole configuration.

Master process just polls master CLI sockets, watches worker status, catches
its termination state and handles the signals. With this refactoring there is
no longer need for master to perform re-execution after reading the whole
configuration file to free additional memory. And there is no longer need for
worker to register atexit callbacks, in order to free the memory, when it
fails to apply the new configuration. In contrast, we now need to set
proc_self pointer to the new worker entry in processes list just after
the fork in the worker process context. proc_self is dereferenced in
mworker_sockpair_register_per_thread(), which is called when worker enters in
its polling loop.

Following patches will try to gather more 'worker' and 'master' specific' code
in the dedicated cases of this new fork() switch, or in a separate functions.
2024-10-16 22:00:58 +02:00
Valentine Krasnobaeva
4cbfcc60f4 MEDIUM: startup: move PID handling in init()
Let's move PID handling in init() from the main() code. It is more appropriate
to open and to write the PID of the process just after daemonization fork. In
case of daemon monoprocess mode, we will simply write a PID of the process,
which is already in the background. In case of 'master-worker' mode, we keep
the previous behaviour and we write only a PID of the master process.

This allows to remove redundant tests of the process execution mode, tests of
the pidfd value and consequent writes to this pidfd. This patch prepares the
refactoring of master-worker fork by moving it in init() function as well.
2024-10-16 22:00:58 +02:00
Valentine Krasnobaeva
95c19be2ab MINOR: startup: refactor "daemonization" fork
Let's put "daemonization" fork into a switch-case. This is more readable and we
don't need to allocate memory for the fork() return value here.
2024-10-16 22:00:58 +02:00
Valentine Krasnobaeva
90b8181c0a MEDIUM: startup: move daemonization fork in init
Let's move daemonization fork in init(). We need to perform this fork always
before forking a worker process, in order to be able to launch master and then
its worker in daemon, i.e. background mode, if haproxy was started with '-D'
option.

This refactoring is a preparation step, needed for replacing then master-worker
fork in init() as well. This allows the master process not to read the whole
configuration file and not to do re-execution in order to free additional
memory, when worker was forked. In the new refactored design only the worker
process will read and apply a new configuration, while the master will arrive
very fast in its polling loop to wait worker's termination and to handle
signals. See more details in the following commits.
2024-10-16 22:00:58 +02:00
Valentine Krasnobaeva
df12791da3 MINOR: startup: add O_CLOEXEC flag to open /dev/null
As master process performs execvp() syscall to handle USR2 and HUP signals in
mworker_reexec(), let's add O_CLOEXEC flag, when we open '/dev/null' in order
to avoid fd leak.

This a preparation step to refactor master-worker logic. See more details in
the next commits.
2024-10-16 22:00:58 +02:00
Valentine Krasnobaeva
5bbcdc003a REGTESTS: cli: add delay 0.1 before connect to cli
When vtest starts haproxy process, it loops until the moment, when haproxy
pidfile is created. When pidfile is created, vtest considers that haproxy
process is ready and it starts to perform test commands, in particular, it
connects to CLI. It's not very reliable approach to base the check of the
process readiness on the PID file. After master-worker architecture
refactoring pidfile is created in the early init stage, but master and worker
are not yet finished its initialization routines. So, all mcli tests and some
tests where we sent commands to CLI start to fail regularly.

In vtest at the moment there is no any other approach to check that the
process is really ready. So let's add a delay 0.1s before connecting to CLI in
all mcli tests and in acl_cli_spaces test.
2024-10-16 22:00:58 +02:00
Willy Tarreau
2c2dac77aa DEBUG: mux-h2/flags: add H2_CF_DEM_RXBUF & H2_SF_EXPECT_RXDATA for the decoder
Both flags were recently added but missing from the decoders flags, so
they appeared in hex in dev/flags/flags output. No backport needed.
2024-10-16 18:32:52 +02:00
Willy Tarreau
ca275d99ce BUG/MEDIUM: queue: make sure never to queue when there's no more served conns
Since commit 53f52e67a0 ("BUG/MEDIUM: queue: always dequeue the backend when
redistributing the last server"), we've got two reports again still showing
the theoretically impossible condition in pendconn_add(), including a single
threaded one.

Thanks to the traces, the issue could be tracked down to the redispatch part.
In fact, in non-determinist LB algorithms (RR, LC, FAS), we don't perform the
LB if there are pending connections in the backend, since it indicates that
previous attempts already failed, so we directly return SRV_STATUS_FULL. And
contrary to a previous belief, it is possible to meet this condition with
be->served==0 when redispatching (and likely with maxconn not greater than
the number of threads).

The problem is that in this case, the entry is queued and then the
pendconn_must_try_again() function checks if any connections are currently
being served to detect if we missed a race, and tries again, but that
situation is not caused by a concurrent thread and will never fix itself,
resulting in the loop.

All that part around pendconn_must_try_again() is still quite brittle, and
a safer approach would involve a sequence counter to detect new arrivals
and dequeues during the pendconn_add() call. But it's more sensitive work,
probably for a later fix.

This fix must be backported wherever the fix above was backported. Thanks
to Patrick Hemmer, as well as Damien Claisse and Basha Mougamadou from
Criteo for their help on tracking this one!
2024-10-16 18:08:39 +02:00
Aurelien DARRAGON
85298189bf BUG/MEDIUM: server: server stuck in maintenance after FQDN change
Pierre Bonnat reported that SRV-based server-template recently stopped
to work properly.

After reviewing the changes, it was found that the regression was caused
by a4d04c6 ("BUG/MINOR: server: make sure the HMAINT state is part of MAINT")

Indeed, HMAINT is not a regular maintenance flag. It was implemented in
b418c122 a4d04c6 ("BUG/MINOR: server: make sure the HMAINT state is part
of MAINT"). This flag is only set (and never removed) when the server FQDN
is changed from its initial config-time value. This can happen with "set
server fqdn" command as well as SRV records updates from the DNS. This
flag should ideally belong to server flags.. but it was stored under
srv_admin enum because cur_admin is properly exported/imported via server
state-file while regular server's flags are not.

Due to a4d04c6, when a server FQDN changes, the server is considered in
maintenance, and since the HMAINT flag is never removed, the server is
stuck in maintenance.

To fix the issue, we partially revert a4d04c6. But this latter commit is
right on one point: HMAINT flag was way too confusing and mixed-up between
regular MAINT flags, thus there's nothing to blame about a4d04c6 as it was
error-prone anyway.. To prevent such kind of bugs from happening again,
let's rename HMAINT to something more explicit (SRV_ADMF_FQDN_CHANGED) and
make it stand out under srv_admin enum so we're not tempted to mix it with
regular maintenance flags anymore.

Since a4d04c6 was set to be backported in all versions, this patch must
be backported there as well.
2024-10-16 14:26:57 +02:00
Amaury Denoyelle
0918c41ef6 BUG/MEDIUM: quic: support wait-for-handshake
wait-for-handshake http-request action was completely ineffective with
QUIC protocol. This commit implements its support for QUIC.

QUIC MUX layer is extended to support wait-for-handshake. A new function
qcc_handle_wait_for_hs() is executed during qcc_io_process(). It detects
if MUX processing occurs after underlying QUIC handshake completion. If
this is the case, it indicates that early data may be received. As such,
connection is flagged with CO_FL_EARLY_SSL_HS, which is necessary to
block stream processing on wait-for-handshake action.

After this, qcc subscribs on quic_conn layer for RECV notification. This
is used to detect QUIC handshake completion. Thus,
qcc_handle_wait_for_hs() can be reexecuted one last time, to remove
CO_FL_EARLY_SSL_HS and notify every streams flagged as
SE_FL_WAIT_FOR_HS.

This patch must be backported up to 2.6, after a mandatory period of
observation. Note that it relies on the backport of the two previous
patches :
- MINOR: quic: notify connection layer on handshake completion
- BUG/MINOR: stream: unblock stream on wait-for-handshake completion
2024-10-16 11:51:35 +02:00
Amaury Denoyelle
73031e81cd BUG/MINOR: stream: unblock stream on wait-for-handshake completion
wait-for-handshake is an http-request action which permits to delay the
processing of content received as TLS early data. The action yields
as long as connection handshake is in progress. In the meantime, stconn
is flagged with SE_FL_WAIT_FOR_HS.

When the handshake is finished, MUX layer is responsible to woken up
SE_FL_WAIT_FOR_HS flagged stconn instances to restart the stream
processing. On sc_conn_process(), SE_FL_WAIT_FOR_HS flag is removed and
stream layer is woken up.

However, there may be a blocking after MUX notification. sc_conn_recv()
may return 0 due to no new data reception, which prevents
sc_conn_process() execution. The stream is thus blocked until its
timeout.

To fix this, checks in sc_conn_recv() about the handshake termination
condition. If true, explicitely returns 1 to ensure sc_conn_process()
will be executed.

Note that this bug is not reproducible due to various conditions related
to early data implementation in haproxy. Indeed, connection layer
instantiation is always delayed until SSL handshake completion, which
prevents the handling of early data as expected.

This fix will be necessary to implement wait-for-handshake support for
QUIC. As such, it must be backported with the next commit up to 2.6,
after a mandatory period of observation.
2024-10-16 11:44:31 +02:00
Amaury Denoyelle
5a5950e42d MINOR: quic: notify connection layer on handshake completion
Wake up connection layer on QUIC handshake completion via
quic_conn_io_cb. Select SUB_RETRY_RECV as this was previously unused by
QUIC MUX layer.

For the moment, QUIC MUX never subscribes for handshake completion.
However, this will be necessary for features such as the delaying of
early data forwarding via wait-for-handshake.

This patch will be necessary to implement wait-for-handshake support for
QUIC. As such, it must be backported with next commits up to 2.6,
after a mandatory period of observation.
2024-10-16 11:42:06 +02:00
Willy Tarreau
5091f90479 MINOR: activity/memprofile: always return "other" bin on NULL return address
It was found in a large "show profiling memory" output that a few entries
have a NULL return address, which causes confusion because this address
will be reused by the next new allocation caller, possibly resulting in
inconsistencies such as "free() ... pool=trash" which makes no sense. The
cause is in fact that the first caller had an entry->info pointing to the
trash pool from a p_alloc/p_free with a NULL return address, and the second
had a different type and reused that entry.

Let's make sure undecodable stacks causing an apparent NULL return address
all lead to the "other" bin.

While this is not exactly a bug, it would make sense to backport it to the
recent branches where the feature is used (probably at least as far as 2.8).
2024-10-15 08:12:34 +02:00
Willy Tarreau
93c9f19af7 REGTESTS: fix a reload race in abns_socket.vtc
This test issues a reload over the master CLI, but it is totally
possible that the master has not yet finished starting up the master
CLI when the command is issued, resulting in a failure. This was much
more visible on the new master-worker model, but definitely affects the
old one and could be the reason for this test to occasionally fail on
the CI.
2024-10-14 19:15:21 +02:00
William Lallemand
0302adf996 CI: cirrus-ci: bump FreeBSD image to 14-1
FreeBSD CI since to be broken for a while, try to upgrade the image to
the latest 14.1 version.
2024-10-14 14:28:26 +02:00
Willy Tarreau
e4cb0ad632 MINOR: mux-h2/traces: add buffer-related info to h2s and h2c
The traces currently don't contain any info about the amount of data
present in buffers, making it difficult to figure if an empty buffer
is the cause for not demuxing or if a full buffer is the cause for
not reading more data. Let's add them, with the head/tail info as
well.
2024-10-12 18:07:21 +02:00