615 Commits

Author SHA1 Message Date
Aurelien DARRAGON
d38b918da1 BUG/MINOR: server: ensure server is detached from proxy list before being freed
There remained some cases (on error paths) were a server could be freed
while still attached on the parent proxy server list. In 3.3 this can be
problematic because new_server() automatically adds the server to the
parent proxy list.

The bug is insignificant because it is on errors paths during init and
often haproxy exits right after. But let's fix that to ensure no UAF or
undefined behavior occurs because of that.

This patch depends on ("MINOR: cli: use srv_drop() when server was created using new_server()")

It must be backported in 3.3 with the above mentioned patch.
2026-01-19 14:24:04 +01:00
Aurelien DARRAGON
12dc9325a7 MINOR: cli: use srv_drop() when server was created using new_server()
Now that new_server() is becoming more and more complex, we need to
take care that servers created using new_server() must be released
using the corresponding release function srv_drop() which takes care
of properly de-initing the server and its members.
2026-01-19 14:23:58 +01:00
Olivier Houchard
5495c88441 MEDIUM: counters: Dynamically allocate per-thread group counters
Instead of statically allocating the per-thread group counters,
based on the max number of thread groups available, allocate
them dynamically, based on the number of thread groups actually
used. That way we can increase the maximum number of thread
groups without using an unreasonable amount of memory.
2026-01-13 11:12:34 +01:00
William Lallemand
5b19d95850 BUG/MEDIUM: mworker/listener: ambiguous use of RX_F_INHERITED with shards
The RX_F_INHERITED flag was ambiguous, as it was used to mark both
listeners inherited from the parent process and listeners duplicated
from another local receiver. This could lead to incorrect behavior
concerning socket unbinding and suspension.

This commit refactors the handling of inherited listeners by splitting
the RX_F_INHERITED flag into two more specific flags:

- RX_F_INHERITED_FD: Indicates a listener inherited from the parent
  process via its file descriptor. These listeners should not be unbound
  by the master.

- RX_F_INHERITED_SOCK: Indicates a listener that shares a socket with
  another one, either by being inherited from the parent or by being
  duplicated from another local listener. These listeners should not be
  suspended or resumed individually.

Previously, the sharding code was unconditionally using RX_F_INHERITED
when duplicating a file descriptor. In HAProxy versions prior to 3.1,
this led to a file descriptor leak for duplicated unix stats sockets in
the master process. This would eventually cause the master to crash with
a BUG_ON in fd_insert() once the file descriptor limit was reached.

This must be backported as far as 3.0. Branches earlier than 3.0 are
affected but would need a different patch as the logic is different.
2025-12-11 18:09:47 +01:00
Christopher Faulet
0629ce8f4b BUG/MEDIUM: cli: State the cli have no more data to deliver if it yields
A regression was introduced in the commit 2d7e3ddd4 ("BUG/MEDIUM: cli: do
not return ACKs one char at a time"). When the CLI is processing a command
line, we no longer send response immediately. It is especially useful for
clients sending a bunch of commands with very short response.

However, in that state, the CLI applet must state it has no more data to
deliver. Otherwise it will be woken up again and again because data are
found in its output buffer with no blocking conditions. In worst cases, if
the command rate is really high, this can trigger the watchdog.

This patch must be backported where the patch above is, so probably as far
as 3.0.
2025-11-21 10:00:15 +01:00
William Lallemand
709cde6d08 BUG/MEDIUM: mworker: signals inconsistencies during startup and reload
Since haproxy 3.1, the master-worker mode changed to let the worker
parse the configuration instead of the master.

Previously, signals were blocked during configuration parsing and
unblocked before entering the polling loop of the master. This way it
was impossible to start a reload during the configuration parsing.

But with the new model, the polling loop is started in the master before
the configuration parsing is finished, and the signals are still
unblocked at this step. Meaning that it is possible to start a reload
while the configuration is parsing.

This patch reintroduce the behavior of blocking the signals during
configuration parsing adapted to the new model:

- Before the exec() of the reload, signals are blocked.
- When entering the polling loop, the SIGCHLD is unblocked because it is
  required to get a failure during configuration parsing in the worker
- Once the configuration is parsed, upon success in _send_status() or
  upon failure in run_master_in_recovery_mode() every signals are unblocked.

This patch must be backported as far as 3.1.
2025-11-18 10:05:42 +01:00
Christopher Faulet
a1b5325a7a MINOR: channel: Remove total field from channels
The <total> field in the channel structure is now useless, so it can be
removed. The <bytes_in> field from the SC is used instead.

This patch is related to issue #1617.
2025-11-06 15:01:29 +01:00
Christopher Faulet
567df50d91 MINOR: stream: Remove bytes_in and bytes_out counters from stream
per-stream bytes_in and bytes_out counters was removed and replaced by
req.in and res.in. Coorresponding samples still exists but replies on new
counters.

This patch is related to issue #1617.
2025-11-06 15:01:29 +01:00
Christopher Faulet
ac9201f929 MINOR: stream: Add samples to get number of bytes received or sent on each side
req.in and req.out samples can now be used to get the number of bytes
received by a client and send to the server. And res.in and res.out samples
can be used to get the number of bytes received by a server and send to the
client. These info are stored in the logs structure inside a stream.

This patch is related to issue #1617.
2025-11-06 15:01:28 +01:00
Remi Tricot-Le Breton
a011683622 MINOR: init: Use devnullfd in stdio_quiet calls instead of recreating a fd everytime
Since commit "65760d MINOR: init: Make devnullfd global and create it
earlier in init" the devnullfd file descriptor pointing to /dev/null
is created regardless of the process's parameters so we can use it in
all 'stdio_quiet' calls instead or recreating an FD.
2025-10-29 10:54:17 +01:00
Willy Tarreau
2d7e3ddd4a BUG/MEDIUM: cli: do not return ACKs one char at a time
Since 3.0 where the CLI started to use rcv_buf, it appears that some
external tools sending chained commands are randomly experiencing
failures. Each time this happens when the whole command is sent as a
single packet, immediately followed by a close. This is not a correct
way to use the CLI but this has been working for ages for simple
netcat-based scripts, so we should at least try to preserve this.

The cause of the failure is that the first LF that acks a command is
immediately sent back to the client and rejected due to the closed
connection. This in turn forwards the error back to the applet which
aborts its processing.

Before 3.0 the responses would be queued into the buffer, then sent
back to the channel, and would all fail at once. This changed when
snd_buf/rcv_buf were implemented because the applets are much more
responsive and since they yield between each command, they can
deliver one ACK at a time that is immediately forwarded down the
chain.

An easy way to observe the problem is to send 5 map updates, a shutdown,
and immediately close via tcploop, and in parallel run a periodic
"show map" to count the number of elements:

  $ tcploop -U /tmp/sock1 C S:"add map #0 1 1; add map #0 2 2; add map #0 3 3; add map #0 4 4; add map #0 5 5\n" F K

Before 3.0, there would always be 5 elements. Since 3.0 and before
20ec1de214 ("MAJOR: cli: Refacor parsing and execution of pipelined
commands"), almost always 2. And since that commit above in 3.2, almost
always one. Doing the same using socat or netcat shows almost always 5...
It's entirely timing-dependent, and might even vary based on the RTT
between the client and haproxy!

The approach taken here consists in doing the same principle as MSG_MORE
or Nagle but on the response buffer: the applet doesn't need to send a
single ACK for each command when it has already been woken up and is
scheduled to come back to work. It's fine (and even desirable) that
ACKs are grouped in a single packet as much as possible.

For this reason, this patch implements APPCTX_CLI_ST1_YIELD, a new CLI
flag which indicates that the applet left in yielding condition, i.e.
it has not finished its work. This flag is used by .rcv_buf to hold
pending data. This way we won't return partial responses for no reason,
and we can continue to emulate the previous behavior.

One very nice benefit to this is that it saves huge amounts of CPU on
the client. In the test below that tries to update 1M map entries, the
CPU used by socat went from 100% to 0% and the total transfer time
dropped by 28%:

  before:
    $ time awk 'BEGIN{ printf "prompt i\n"; for (i=0;i<1000000;i++) { \
         printf "add map #0 %d %d\n",i,i,i }}' | socat /tmp/sock1 - >/dev/null

    real    0m2.407s
    user    0m1.485s
    sys     0m1.682s

  after:
    $ time awk 'BEGIN{ printf "prompt i\n"; for (i=0;i<1000000;i++) { \
         printf "add map #0 %d %d\n",i,i,i }}' | socat /tmp/sock1 - >/dev/null

    real    0m1.721s
    user    0m0.952s
    sys     0m0.057s

The difference is also quite visible on the number of syscalls during
the test (for 1k updates):

  before:
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
    100.00    0.071691           0    100001           sendmsg

  after:
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
    100.00    0.000011           1         9           sendmsg

This patch will need to be backported to 3.0, and depends on these two
patches to be backported as well:

    MINOR: applet: do not put SE_FL_WANT_ROOM on rcv_buf() if the channel is empty
    MINOR: cli: create cli_raw_rcv_buf() from the generic applet_raw_rcv_buf()
2025-10-27 16:57:07 +01:00
Willy Tarreau
f38ea2731b MINOR: cli: create cli_raw_rcv_buf() from the generic applet_raw_rcv_buf()
This is in preparation for a future fix. For now it's simply a pure
copy of the original function, but dedicated to the CLI. It will
have to be backported to 3.0.
2025-10-27 16:57:07 +01:00
Willy Tarreau
e63e98f1d8 BUG/MEDIUM: cli: also free the trash chunk on the error path
Since commit 20ec1de214 ("MAJOR: cli: Refacor parsing and execution of
pipelined commands"), command not returning any response (e.g. "quit")
don't pass through the free_trash_chunk() call, possibly leaking the
cmdline buffer. A typical way to reproduce it is to loop on "quit" on
the CLI, though it very likely affects other specific commands.

Let's make sure in the release handler that we always release that
chunk in any case. This must be backported to 3.2.
2025-10-20 14:58:53 +02:00
Willy Tarreau
1a95bc42c7 MEDIUM: proxy: index proxy ID using compact trees
The proxy ID is currently stored as a 32-bit int using an eb32 tree.
It's used essentially to find holes in order to automatically assign IDs,
and to detect duplicates. Let's change this to use compact trees instead
in order to save 24 bytes in struct proxy for this node, plus 8 bytes in
the root (which is static so not much relevant here). Now the proxy is
3088 bytes large.
2025-09-16 09:23:46 +02:00
Willy Tarreau
eab5b89dce MINOR: proxy: add proxy_index_id() to index a proxy by its ID
This avoids needlessly exposing the tree's root and the mechanics outside
of the low-level code.
2025-09-16 09:23:46 +02:00
Frederic Lecaille
47bb15ca84 MINOR: quic: get rid of ->target quic_conn struct member
The ->li (struct listener *) member of quic_conn struct was replaced by a
->target (struct obj_type *) member by this commit:

    MINOR: quic-be: get rid of ->li quic_conn member

to abstract the connection type (front or back) when implementing QUIC for the
backends. In these cases, ->target was a pointer to the ojb_type of a server
struct. This could not work with the dynamic servers contrary to the listeners
which are not dynamic.

This patch almost reverts the one mentioned above. ->target pointer to obj_type member
is replaced by ->li pointer to listener struct member. As the listener are not
dynamic, this is easy to do this. All one has to do is to replace the
objt_listener(qc->target) statement by qc->li where applicable.

For the backend connection, when needed, this is always qc->conn->target which is
used only when qc->conn is initialized. The only "problematic" case is for
quic_dgram_parse() which takes a pointer to an obj_type as third argument.
But this obj_type is only used to call quic_rx_pkt_parse(). Inside this function
it is used to access the proxy counters of the connection thanks to qc_counters().
So, this obj_type argument may be null for now on with this patch. This is the
reason why qc_counters() is modified to take this into consideration.
2025-09-11 09:51:28 +02:00
Amaury Denoyelle
bce29bc7a4 MINOR: cli: display failure reason on wait command
wait CLI command can be used to wait until either a defined timeout or a
specific condition is reached. So far, srv-removable is the only event
supported. This is tested via srv_check_for_deletion().

This is implemented via srv_check_for_deletion(), which is
able to report a message describing the reason if the condition is
unmet.

Previously, wait return a generic string, to specify if the condition is
met, the timer has expired or an immediate error is encountered. In case
of srv-removable, it did not report the real reason why a server could
not be removed.

This patch improves wait command with srv-removable. It now displays the
last message returned by srv_check_for_deletion(), either on immediate
error or on timeout. This is implemented by using dynamic string output
with cli_dynmsg/dynerr() functions.
2025-08-28 14:52:29 +02:00
Christopher Faulet
c6c2ef1f11 BUG/MEDIUM: cli: Report inbuf is no longer full when a line is consumed
When the command line parsing was refactored (20ec1de21 "MAJOR: cli: Refacor
parsing and execution of pipelined commands"), a regression was introduced.
When input data are consumed, information about the applet's input buffer
are no longer updated accordingly to state it is no longer full. So it is
possible to freeze the CLI applet. And a spinning loop may be encountered if
a client shutdown is detected in this state.

The fix is obivous. When data are consumed from the applet's input buffer,
APPCTX_FL_INBLK_FULL flag is removed to notify the input buffer is no longer
full and more data can be sent to the CLI applet.

This patch should fix the issue #3064. It must be backported to 3.2.
2025-08-20 16:01:50 +02:00
Willy Tarreau
a469356268 MEDIUM: server: introduce srv_alloc()/srv_free() to alloc/free a server
It happens that we free servers at various places in the code, both
on error paths and at runtime thanks to the "server delete" feature. In
order to switch to an aligned struct, we'll need to change the calloc()
and free() calls. Let's first spot them and switch them to srv_alloc()
and srv_free() instead of using calloc() and either free() or ha_free().
An easy trap to fall into is that some of them are default-server
entries. The new srv_free() function also resets the pointer like
ha_free() does.

This was done by running the following coccinelle script all over the
code:

  @@
  struct server *srv;
  @@
  (
  - free(srv)
  + srv_free(&srv)
  |
  - ha_free(&srv)
  + srv_free(&srv)
  )
  @@
  struct server *srv;
  expression e1;
  expression e2;
  @@
  (
  - srv = malloc(e1)
  + srv = srv_alloc()
  |
  - srv = calloc(e1, e2)
  + srv = srv_alloc()
  )

This is marked medium because despite spotting all call places, we can
never rule out the possibility that some out-of-tree patches would
allocate their own servers and continue to use the old API... at their
own risk.
2025-08-13 17:37:11 +02:00
Aurelien DARRAGON
c24de077bd OPTIM: stats: store fast sharded counters pointers at session and stream level
Following commit 75e480d10 ("MEDIUM: stats: avoid 1 indirection by storing
the shared stats directly in counters struct"), in order to minimize the
impact of the recent sharded counters work, we try to push things a bit
further in this patch by storing and using "fast" pointers at the session
and stream levels when available to avoid costly indirections and
systematic "tgid" resolution (which can not be cached by the CPU due to
its THREAD-local nature).

Indeed, we know that a session/stream is tied to a given CPU, thanks to
this we know that the tgid for a given session/stream will never change.

Given that, we are able to store sharded frontend and listener counters
pointer at the session level (namely sess->fe_tgcounters and
sess->li_tgcounters), and once the backend and the server are selected,
we are also able to store backend and server sharded counters
pointer at the stream level (namely s->be_tgcounters and s->sv_tgcounters)

Everywhere we rely on these counters and the stream or session context is
available, we use the fast pointers it instead of the indirect pointers
path to make the pointer resolution a bit faster.

This optimization proved to bring a few percents back, and together with
the previous 75e480d10 commit we now fixed the performance regression (we
are back to back with 3.2 stats performance)
2025-07-25 18:24:23 +02:00
Christopher Faulet
337768656b MINOR: applet: Add support for flags on applets with a flag about the new API
A new field was added in the applet structure to be able to set flags on the
applets The first one is related to the new API. APPLET_FL_NEW_API is set
for applets based on the new API. It was set on all HAProxy's applets.
2025-07-25 15:44:02 +02:00
Ilia Shipitsin
0ee3d739b8 CLEANUP: assorted typo fixes in the code, commits and doc
Corrected various spelling and phrasing errors to improve clarity and consistency.
2025-07-10 19:49:48 +02:00
Christopher Faulet
fb76655526 MINOR: cli: handle EOS/ERROR first
It is not especially a bug fixed. But APPCTX_FL_EOS and APPCTX_FL_ERROR
flags must be handled first. These flags are set by the applet itself and
should mark the end of all processing. So there is not reason to get the
output buffer in first place.

This patch could be backported as far as 3.0.
2025-06-16 16:47:59 +02:00
Christopher Faulet
396f0252bf BUG/MEDIUM: cli: Don't consume data if outbuf is full or not available
The output buffer must be available to process a command, at least to be
able to emit error messages. When this buffer is full or cannot be
allocated, we must wait. In that case, we must take care to notify the SE
will not consume input data. It is important to avoid wakeup in loop,
especially when the client aborts.

When the output buffer is available again and no longer full, and the CLI
applet is waiting for a command line, it must notify it will consume input
data.

This patch must be backported as far as 3.0.
2025-06-16 16:47:59 +02:00
Frederic Lecaille
b9703cf711 MINOR: quic-be: get rid of ->li quic_conn member
Replace ->li quic_conn pointer to struct listener member by  ->target which is
an object type enum and adapt the code.
Use __objt_(listener|server)() where the object type is known. Typically
this is were the code which is specific to one connection type (frontend/backend).
Remove <server> parameter passed to qc_new_conn(). It is redundant with the
<target> parameter.
GSO is not supported at this time for QUIC backend. qc_prep_pkts() is modified
to prevent it from building more than an MTU. This has as consequence to prevent
qc_send_ppkts() to use GSO.
ssl_clienthello.c code is run only by listeners. This is why __objt_listener()
is used in place of ->li.
2025-06-11 18:37:34 +02:00
Christopher Faulet
f5d41803d3 BUG/MEDIUM: cli: Properly parse empty lines and avoid crashed
Empty lines was not properly parsed and could lead to crashes because the
last argument was parsed outside of the cmdline buffer. Indeed, the last
argument is parsed to look for an eventual payload pattern. It is started
one character after the newline at the end of the command line. But it is
only valid for an non-empty command line.

So, now, this case is properly detected when we leave if an empty line is
detected.

This patch must be backported to 3.2.
2025-06-05 10:46:13 +02:00
Aurelien DARRAGON
889ef6f67b MEDIUM: server: automatically add server to proxy list in new_server()
while new_server() takes the parent proxy as argument and even assigns
srv->proxy to the parent proxy, it didn't actually inserted the server
to the parent proxy server list on success.

The result is that sometimes we add the server to the list after
new_server() is called, and sometimes we don't.

This is really error-prone and because of that hooks such as
REGISTER_POST_SERVER_CHECK() which as run for all servers listed in
all proxies may not be relied upon for servers which are not actually
inserted in their parent proxy server list. Plus it feels very strange
to have a server that points to a proxy, but then the proxy doesn't know
about it because it cannot find it in its server list.

To prevent errors and make proxy->srv list reliable, we move the insertion
logic directly under new_server(). This requires to know if we are called
during parsing or during runtime to either insert or append the server to
the parent proxy list. For that we use PR_FL_CHECKED flag from the parent
proxy (if the flag is set, then the proxy was checked so we are past the
init phase, thus we assume we are called during runtime)

This implies that during startup if new_server() has to be cancelled on
error paths we need to call srv_detach() (which is now exposed in server.h)
before srv_drop().

The consequence of this commit is that REGISTER_POST_SERVER_CHECK() should
not run reliably on all servers created using new_server() (without having
to manually loop on global servers_list)
2025-06-02 17:51:30 +02:00
Aurelien DARRAGON
976e0bd32f BUG/MINOR: cli: fix too many args detection for commands
d3f928944 ("BUG/MINOR: cli: Issue an error when too many args are passed
for a command") added a new check to prevent the command to run when
too many arguments are provided. In this case an error is reported.

However it turns out this check (despite marked for backports) was
ineffective prior to 20ec1de21 ("MAJOR: cli: Refacor parsing and
execution of pipelined commands") as 'p' pointer was reset to the end of
the buffer before the check was executed.

Now since 20ec1de21, the check works, but we have another issue: we may
read past initialized bytes in the buffer because 'p' pointer is always
incremented in a while loop without checking if we increment it past 'end'
(This was detected using valgrind)

To fix the issue introduced by 20ec1de21, let's only increment 'p' pointer
if p < end.

For 3.2 this is it, now for older versions, since d3f928944 was marked for
backport, a sligthly different approach is needed:

 - conditional p increment must be done in the loop (as in this patch)
 - max arg check must moved above "fill unused slots" comment where p is
   assigned to the end of the buffer

This patch should be backported with d3f928944.
2025-05-08 02:03:43 +02:00
Willy Tarreau
dc06495b71 MEDIUM: mcli: replicate the current mode when enterin the worker process
While humans can find it convenient to enter the worker process in prompt
mode, for external tools it will not be convenient to have to systematically
disable it. A better approach is to replicate the master socket's mode
there, since it has already been configured to suit the user: interactive,
prompt and timed modes are automatically passed to the worker process.
This makes the using the worker commands more natural from the master
process, without having to systematically adapt it for each new connection.
2025-04-28 20:21:06 +02:00
Willy Tarreau
c347cb73fa MEDIUM: mcli: make the prompt mode configurable between i/p
Support the same syntax in master mode as in worker mode in order to
configure the prompt. The only thing is that for now the master doesn't
have a non-interactive mode and it doesn't seem necessary to implement
it, so we only support the interactive and prompt modes. However the code
was written in a way that makes it easy to change this later if desired.
2025-04-28 20:21:06 +02:00
Willy Tarreau
e5c255c4e5 MEDIUM: cli: make the prompt mode configurable between n/i/p
Now the prompt mode can more finely be configured between non-interactive
(default), interactive without prompt, and interactive with prompt. This
will ease the usage from automated tools which are not necessarily
interested in having to consume '> ' after each command nor displaying
"+" on payload lines. This can also be convenient when coming from the
master CLI to keep the same output format.
2025-04-28 20:21:06 +02:00
Willy Tarreau
f25b4abc9b MINOR: cli: split APPCTX_CLI_ST1_PROMPT into two distinct flags
The CLI's "prompt" command toggles two distinct things:
  - displaying or hiding the prompt at the beginning of the line
  - single-command vs interactive mode

These are two independent concepts and the prompt mode doesn't
always cope well with tools that would like to upload data without
having to read the prompt on return. Also, the master command line
works in interactive mode by default with no prompt, which is not
consistent (and not convenient for tools). So let's start by splitting
the bit in two, and have a new APPCTX_CLI_ST1_INTER flag dedicated
to the interactive mode. For now the "prompt" command alone continues
to toggle the two at once.
2025-04-28 20:21:06 +02:00
Willy Tarreau
7a79f54c98 BUG/MINOR: master/cli: only parse the '@@' prefix on complete lines
The new adhoc parser for the '@@' prefix forgot to require the presence
of the LF character marking the end of the line. This is the reason why
entering incomplete commands would display garbage, because the line was
expected to have its LF character replaced with a zero.

The problem is well illustrated by using socat in raw mode:

   socat /tmp/master.sock STDIO,raw,echo=0

then entering "@@1 show info" one character at a time would error just
after the second "@". The command must take care to report an incomplete
line and wait for more data in such a case.
2025-04-25 09:05:00 +02:00
Willy Tarreau
931d932b3e Revert "BUG/MINOR: master/cli: properly trim the '@@' process name in error messages"
This reverts commit 0e94339eaf1c8423132debb6b1b485d8bb1bb7da.

This patch was in fact fixing the symptom, not the cause. The root cause
of the problem is that the parser was processing an incomplete line when
looking for '@@'. When the LF is present, this problem does not exist
as it's properly replaced with a zero. This can be verified using socat
in raw mode:

  socat /tmp/master.sock STDIO,raw,echo=0

Then entering "@@1 show info" one character at a time will immediately
fail on "@@" without going further. A subsequent patch will fix this.
No backport is needed.
2025-04-25 09:05:00 +02:00
Christopher Faulet
101cc4f334 BUG/MEDIUM: cli: Handle applet shutdown when waiting for a command line
When the CLI applet was refactord in the commit 20ec1de21 ("MAJOR: cli:
Refacor parsing and execution of pipelined commands"), a regression was
introduced. The applet shutdown was not longer handled when the applet was
waiting for the next command line. It is especially visible when a client
timeout occurred because the client connexion is no longer closed.

To fix the issue, the test on the SE_FL_SHW flag was reintroduced in
CLI_ST_PARSE_CMDLINE state, but only is there is no pending input data.

It is a 3.2-specific issue. No backport needed.
2025-04-25 08:47:05 +02:00
Willy Tarreau
0e94339eaf BUG/MINOR: master/cli: properly trim the '@@' process name in error messages
When '@@' alone is sent on the master CLI (no trailing LF), we get an
error that displays anything past these two characters in the buffer
since there's no room for a \0. Let's make sure to limit the length of
the process name in this case. No backport is needed since this was added
with 00c967fac4 ("MINOR: master/cli: support bidirectional communications
with workers").
2025-04-24 17:52:28 +02:00
Christopher Faulet
b734d7c156 MINOR: cli/applet: Move appctx fields only used by the CLI in a private context
There are several fields in the appctx structure only used by the CLI. To
make things cleaner, all these fields are now placed in a dedicated context
inside the appctx structure. The final goal is to move it in the service
context and add an API for cli commands to get a command coontext inside the
cli context.
2025-04-24 15:09:37 +02:00
Christopher Faulet
44ace9a1b7 MINOR: cli: Rename some CLI applet states to reflect recent refactoring
CLI_ST_GETREQ state was renamed into CLI_ST_PARSE_CMDLINE and CLI_ST_PARSEREQ
into CLI_ST_PROCESS_CMDLINE to reflect the real action performed in these
states.
2025-04-24 15:09:37 +02:00
Christopher Faulet
20ec1de214 MAJOR: cli: Refacor parsing and execution of pipelined commands
Before this patch, when pipelined commands were received, each command was
parsed and then excuted before moving to the next command. Pending commands
were not copied in the input buffer of the applet. The major issue with this
way to handle commands is the impossibility to consume inputs from commands
with an I/O handler, like "show events" for instance. It was working thanks
to a "bug" if such commands were the last one on the command line. But it
was impossible to use them followed by another command. And this prevents us
to implement any streaming support for CLI commands.

So we decided to refactor the command line parsing to have something similar
to a basic shell. Now an entire line is parsed, including the payload,
before starting commands execution. The command line is copied in a
dedicated buffer. "appctx->chunk" buffer is used for this purpose. It was an
unsed field, so it is safe to use it here. Once the command line copied, the
commands found on this line are executed. Because the applet input buffer
was flushed, any input can be safely consumed by the CLI applet and is
available for the command I/O handler. Thanks to this change, "show event
-w" command can be followed by a command. And in theory, it should be
possible to implement commands supporting input data streaming. For
instance, the Tetris like lua applet can be used on the CLI now.

Note that the payload, if any, is part of the command line and must be fully
received before starting the commands processing. It means there is still
the limitation to a buffer, but not only for the payload but for the whole
command line. The payload is still necessarily at the end of the command
line and is passed as argument to the last command. Internally, the
"appctx->cli_payload" field was introduced to point on the payload in the
command line buffer.

This patch is quite huge but it cannot easily be splitted. It should not
introduced significant changes.
2025-04-24 15:09:37 +02:00
Christopher Faulet
69a9ec5bef MINOR: cli: Use a full prompt command for bidir connections with workers
When a bidirection connection with no command is establisehd with a worker
(so "@@<pid>" alone), a "prompt" command is automatically added to display
the worker's prompt and enter in interactive mode in the worker context.
However, till now, an unfinished command line is sent, with a semicolon
instead of a newline at the end. It is not exactly a bug because this
works. But it is not really expected and could be a problem for future
changes.

So now, a full command line is sent: the "prompt" command finished by a
newline character.
2025-04-24 15:09:37 +02:00
Christopher Faulet
d3f9289447 BUG/MINOR: cli: Issue an error when too many args are passed for a command
When a command is parsed to split it in an array of arguments, by default,
at most 64 arguments are supported. But no warning was emitted when there
were too many arguments. Instead, the arguments above the limit were
silently ignored. It could be an issue for some commands, like "add server",
because there was no way to know some arguments were ignored.

Now an error is issued when too many arguments are passed and the command is
not executed.

This patch should be backported to all stable versions.
2025-04-24 14:58:24 +02:00
Willy Tarreau
23705564ae BUG/MINOR: debug: remove the trailing \n from BUG_ON() statements
These ones were added by mistake during the change of the cfgparse
mechanism in 3.1, but they're corrupting the output of "debug counters"
by leaving stray ']' on their own lines. We could possibly check them
all once at boot but it doens't seem worth it.

This should be backported to 3.1.
2025-04-14 19:02:13 +02:00
Willy Tarreau
00c967fac4 MINOR: master/cli: support bidirectional communications with workers
Some rare commands in the worker require to keep their input open and
terminate when it's closed ("show events -w", "wait"). Others maintain
a per-session context ("set anon on"). But in its default operation
mode, the master CLI passes commands one at a time to the worker, and
closes the CLI's input channel so that the command can immediately
close upon response. This effectively prevents these two specific cases
from being used.

Here the approach that we take is to introduce a bidirectional mode to
connect to the worker, where everything sent to the master is immediately
forwarded to the worker (including the raw command), allowing to queue
multiple commands at once in the same session, and to continue to watch
the input to detect when the client closes. It must be a client's choice
however, since doing so means that the client cannot batch many commands
at once to the master process, but must wait for these commands to complete
before sending new ones. For this reason we use the prefix "@@<pid>" for
this. It works exactly like "@" except that it maintains the channel
open during the whole execution. Similarly to "@<pid>" with no command,
"@@<pid>" will simply open an interactive CLI session to the worker, that
will be ended by "quit" or by closing the connection. This can be convenient
for the user, and possibly for clients willing to dedicate a connection to
the worker.
2025-04-11 16:09:17 +02:00
Aurelien DARRAGON
4194f756de MEDIUM: tree-wide: avoid manually initializing proxies
In this patch we try to use the proxy API init functions as much as
possible to avoid code redundancy and prevent proxy initialization
errors. As such, we prefer using alloc_new_proxy() and setup_new_proxy()
instead of manually allocating the proxy pointer and performing the
base init ourselves.
2025-04-10 22:10:31 +02:00
Willy Tarreau
6336b636f7 MINOR: cli/server: don't take thread isolation to check for srv-removable
Thanks to the previous commits, we now know that "wait srv-removable"
does not require thread isolation, as long as 3372a2ea00 ("BUG/MEDIUM:
queues: Stricly respect maxconn for outgoing connections") and c880c32b16
("MINOR: stream: decrement srv->served after detaching from the list")
are present. Let's just get rid of thread_isolate() here, which can
consume a lot of CPU on highly threaded machines when removing many
servers at once.
2025-03-18 17:36:02 +01:00
Willy Tarreau
1e99efccef MINOR: cli: export cli_io_handler() to ease symbol resolution
It's common to meet this function in backtraces, it's a bit annoying
that it's not resolved, so let's export it so that it becomes resolvable.
2025-03-13 17:30:48 +01:00
Christopher Faulet
49b7bcf583 BUG/MINOR: cli: Wait for the last ACK when FDs are xferred from the old worker
On reload, the new worker requests bound FDs to the old one. The old worker
sends them in message of at most 252 FDs. Each message is acknowledged by
the new worker. All messages sent or received by the old worker are handled
manually via sendmsg/recv syscalls. So the old worker must be sure consume
all the ACK replies. However, the last one was never consumed. So it was
considered as a command by the CLI applet. This issue was hidden since
recently. But it was the root cause of the issue #2862.

Note this last ack is also the first one when there are less than 252 FDs to
transfer.

This patch must be backported to all stable versions.
2025-02-17 15:31:07 +01:00
Christopher Faulet
972ce87676 BUG/MEDIUM: cli: Be sure to drop all input data in END state
Commit 7214dcd ("BUG/MEDIUM: applet: Don't pretend to have more data to
handle EOI/EOS/ERROR") revealed a bug with the CLI applet. Pending input
data when the applet is in CLI_ST_END state were never consumed or dropped,
leading to a wakeup loop.

The CLI applet implements its own snd_buf callback function. It is important
it consumes all pending input data. Otherwise, the applet is woken up in
loop until it empties the request buffer. Another way to fix the issue would
be to report an error. But in that case, it seems reasonnable to drop these
data.

The issue can be observed on reload, in master/worker mode, because of issue
about the last ACK message which was never consummed by the _getsocks()
command.

This patch should fix the issue #2862. It must be backported to 3.1 with the
commit above.
2025-02-17 15:31:07 +01:00
Willy Tarreau
561319bd1c BUG/MEDIUM: fd: mark FD transferred to another process as FD_CLONED
The crappy epoll API stroke again with reloads and transferred FDs.
Indeed, when listening sockets are retrieved by a new worker from a
previous one, and the old one finally stops listening on them, it
closes the FDs. But in this case, since the sockets themselves were
not closed, epoll will not unregister them and will continue to
report new activity for these in the old process, which can only
observe, count an fd_poll_drop event and not unregister them since
they're not reachable anymore.

The unfortunate effect is that long-lasting old processes are woken
up at the same rate as the new process when accepting new connections,
and can waste a lot of CPU. Accept rates divided by 8 were observed on
a small test involving a slow transfer on 10 connections facing a reload
every second so that 10 processes were busy dealing with them while
another process was hammering the service with new connections.

Fortunately, years ago we implemented a flag FD_CLONED exactly for
similar purposes. Let's simply mark transferred FDs with FD_CLONED
so that the process knows that these ones require special treatment
and have to be manually unregistered before being closed. This does
the job fine, now old processes correctly unregister the FD before
closing it and no longer receive accept events for the new process.

This needs to be backported to all stable versions. It only affects
epoll, as usual, and this time in combination with transferred FDs
(typically reloads in master-worker mode). Thanks to Damien Claisse
for providing all detailed measurements and statistics allowing to
understand and reproduce the problem.
2025-02-12 16:35:01 +01:00
Christopher Faulet
75e8c8ed33 BUG/MINOR: cli: Fix a possible infinite loop in _getsocks()
In _getsocks() functuoin, when we failed to set the unix socket in
non-blocking mode, a goto to "out" label led to loop infinitly. To fix the
issue, we must only let the function exit.

This patch should be backported to all stable versions.
2025-02-06 15:44:21 +01:00