8750 Commits

Author SHA1 Message Date
Frederic Lecaille
a88fdf8669 MINOR: quic/flags: add missing QUIC flags for flags dev tool.
Add missing QUIC_FL_CONN_XPRT_CLOSED quic_conn flags definition.
2025-11-20 08:10:58 +01:00
Amaury Denoyelle
d54d78fe9a BUG/MINOR: quic: fix FD usage for quic_conn_closed on backend side
On the frontend side, QUIC transfer can be performed either via a
connection owned FD or multiplex on the listener one. When a quic_conn
is freed and converted to quic_conn_closed instance, its FD if open is
closed and all exchanges are now multiplex via the listener FD.

This is different for the backend as connections only has the choice to
use their owned FD. Thus, special care care must be taken when freeing a
connection and converting it to a quic_conn_closed instance. In this
case, qc_release_fd() is delayed to the quic_conn_closed release.

Furthermore, when the FD is transferred, its iocb and owner fields are
updated to the new quic_conn_closed instance. Without it, a crash will
occur when accessing the freed quic_conn tasklet. A newly dedicated
handler quic_conn_closed_sock_fd_iocb is used to ensure access to
quic_conn_closed members only.
2025-11-19 16:02:22 +01:00
Amaury Denoyelle
e55bcf5746 BUG/MINOR: mux-quic: implement max-reuse server parameter
Properly implement support for max-reuse server keyword. This is done by
adding a total count of streams seen for the whole connection. This
value is used in avail_streams callback.
2025-11-19 16:02:22 +01:00
Amaury Denoyelle
c67a614e45 MINOR: quic: remove <ipv4> arg from qc_new_conn()
Remove <ipv4> argument from qc_new_conn(). This parameter is unnecessary
as it can be derived from the family type of the addresses also passed
as argument.
2025-11-17 10:20:54 +01:00
Amaury Denoyelle
133f100467 MINOR: quic: refactor qc_new_conn() prototype
The objective of this patch is to streamline qc_new_conn() usage so that
it is similar for frontend and backend sides.

Previously, several parameters were set only for frontend connections.
These arguments are replaced by a single quic_rx_packet argument, which
represents the INITIAL packet triggering the connection allocation on
the server side. For a QUIC client endpoint, it remains NULL. This usage
is consider more explicit.

As a minor change, <target> is moved as the first argument of the
function. This is considered useful as this argument determines whether
the connection is a frontend or backend entry.

Along with these changes, qc_new_conn() documentation has been reworded
so that it is now up-to-date with the newest usage.
2025-11-17 10:13:40 +01:00
Amaury Denoyelle
49edaca513 MINOR: quic: try to clarify quic_conn CIDs fields direction
quic_conn has two fields named <dcid> and <scid>. It may cause confusion
as it is not obvious how these fields are related to the connection
direction. Try to improve this by extending the documentation of these
two fields.
2025-11-17 10:11:04 +01:00
Amaury Denoyelle
8720130cc7 MINOR: quic: do not use quic_newcid_from_hash64 on BE side
quic_newcid_from_hash64 is an external callback. If defined, it serves
as a CID method generation, as an alternative to the default random
implementation.

This mechanism was not correctly implemented on the backend side.
Indeed, <hash64> quic_conn member is only setted for frontend
connections. The simplest solution would be to properly define it also
for backend ones. However, quic_newcid_from_hash64 derivation is really
only useful for the frontend side for now. Thus, this patch disables
using it on the backend side in favor of the default random generator.

To implement this, quic_cid_generate() is splitted in two functions, for
both methods of CIDs generation. This is the responsibility of the
caller to select the proper method. On backend side, only random
implementation is now used.
2025-11-17 10:11:04 +01:00
Christopher Faulet
fc6e3e9081 MINOR: stick-tables: Rename stksess shards to use buckets
The shard keyword is already used by the peers and on the server lines. And
it is unrelated with the session keys distribution. So instead of talking
about shard for the session key hashing, we now use the term "bucket".
2025-11-17 07:42:51 +01:00
Willy Tarreau
675c86c4aa DEBUG: add BUG_ON_STRESS(): a BUG_ON() implemented only when DEBUG_STRESS > 0
The purpose of this new BUG_ON is beyond BUG_ON_HOT(). While BUG_ON_HOT()
is meant to be light but placed on very hot code paths, BUG_ON_STRESS()
might be heavy and only used under stress-testing, to try to detect early
that something bad is starting to happen. This one is not even type-checked
when not defined because we don't want to risk the compiler emitting the
slightest piece of code there in production mode, so as to give enough
freedom to the developers.
2025-11-14 16:42:53 +01:00
Willy Tarreau
3d441e78e5 DEBUG: extend DEBUG_STRESS to ease testing and turn on extra checks
DEBUG_STRESS is currently used only to expose "stress-level". With this
patch, we go a bit further, by automatically forcing DEBUG_STRICT and
DEBUG_STRICT_ACTION to their highest values in order to enable all
BUG_ON levels, and make all of them result in a crash. In addition,
care is taken to always only have 0 or 1 in the macro, so that it can be
tested using "#if DEBUG_STRESS > 0" as well as "if (DEBUG_STRESS) { }"
everywhere.

The goal will be to ease insertion of extra tests for builds dedicated
to stress-testing that enable possibly expensive extra checks on certain
code paths that cannot reasonably be compiled in for production code
right now.
2025-11-14 16:38:04 +01:00
Amaury Denoyelle
d79295d89b Revert "BUG/MEDIUM: connections: permit to permanently remove an idle conn"
The target patch fixes a rare race condition which happen when a MUX IO
handler is working on a connection already moved into the purge list. In
this case, the handler will incorrectly moved back the connection into
the idle list.

To fix this, conn_delete_from_tree() was extended to remove flags along
with the connection from the idle list. This was performed when the
connection is moved into the purge list. However, it introduces another
issue related to the idle server connection accounting. Thus it is
necessary to revert it prior to the incoming newer fix.

This patch must be backported to every version where the original commit
is.
2025-11-14 16:06:34 +01:00
William Lallemand
3d15c07ed0 MINOR: cfgcond: add "awslc_api_atleast" and "awslc_api_before"
AWS-LC features are not easily tested with just the openssl version
constant. AWS-LC uses its own API versioning stored in the
AWSLC_API_VERSION constant.

This patch add the two awslc_api_atleast and awslc_api_before predicates
that help to check the AWS-LC API.
2025-11-14 11:01:45 +01:00
Amaury Denoyelle
8415254cea MINOR: check: clarify check-reuse-pool interaction with reuse policy
check-reuse-pool can only perform as expected if reuse policy on the
backend is set to aggressive or higher. Update the documentation to
reflect this and implement a server diag warning.
2025-11-14 10:44:05 +01:00
William Lallemand
2bdf5a7937 BUG/MEDIUM: acme: move from mt_list to a rwlock + ebmbtree
The current ACME scheduler suffers from problems due to the way the
tasks are stored:

- MT_LIST are not scalables when having a lot of ACME tasks and having
  to look for a specific one.
- the acme_task pointer was stored in the ckch_store in order to not
  passing through the whole list. But a ckch_store can be updated and
  the pointer lost in the previous one.
- when a task fails, the ptr in the ckch_store was not removed because
  we only work with a copy of the original ckch_store, it would need to
  lock the ckchs_tree and remove this pointer.

This patch fixes the issues by removing the MT_LIST-based architecture,
and replacing it by a simple ebmbtree + rwlock design.

The pointer to the task is not stored anymore in the ckch_store, but
instead it is stored in the acme_tasks tree. Finding a task is done by
doing a lookup on this tree with a RDLOCK.
Instead of checking if store->acme_task is not NULL, a lookup is also
done.

This allow to remove the stuck "acme_task" pointer in the store, which
was preventing to restart an acme task when the previous failed for this
specific certificate.

Must be backported in 3.2.
2025-11-13 15:18:12 +01:00
Frederic Lecaille
d84463f9f6 MINOR: quic-be: validate the 0-RTT transport parameters
During 0-RTT sessions, some server transport parameters are reused after having
been save from previous sessions. These parameters must not be reduced
when it resends them. The client must check this is the case when some early data
are accepted by the server. This is what is implemented by this patch.

Implement qc_early_tranport_params_validate() which checks the new server parameters
are not reduced.

Also implement qc_ssl_eary_data_accepted() which was not implemented for TLS
stack without 0-RTT support (for instance wolfssl). That said this function
was no more used. This is why the compilation against wolfssl could not fail.
2025-11-13 14:04:31 +01:00
Frederic Lecaille
6419b9f204 MEDIUM: quic-be: enable the use of 0-RTT
This patch allows the use of 0-RTT feature on QUIC server lines with "allow-0rtt"
option. In fact 0-RTT is really enabled only if ssl_sock_srv_try_reuse_sess()
successfully manages to reuse the SSL session and the chosen application protocol
from previous connections.

Note that, at this time, 0-RTT works only with quictls and aws-lc as TLS stack.

(0-RTT does not work at all (even for QUIC frontends) with libressl).
2025-11-13 14:04:31 +01:00
Frederic Lecaille
a4bbbc75db MINOR: quic-be: Send post handshake frames from list of frames (0-RTT)
This patch is required to make 0-RTT work. It modifies the prototype of
quic_build_post_handshake_frames() to send post handshake frames from a
list of frames in place of the application encryption level (used
as <qc->ael> local variable).

This patch does not modify at all the current QUIC stack behavior (even for
QUIC frontends). It must be considered as a preparation for the code
to come about 0-RTT support for QUIC backends.
2025-11-13 14:04:31 +01:00
Frederic Lecaille
6e14365a5b MEDIUM: quic-be: modify ssl_sock_srv_try_reuse_sess() to reuse backend sessions (0-RTT)
This function is called for both TCP and QUIC connections to reuse SSL sessions
saved by ssl_sess_new_srv_cb() callback called upon new SSL session creation.

In addition to this, a QUIC SSL session must reuse the ALPN and some specific QUIC
transport parameters. This is what is added by this patch for QUIC 0-RTT sessions.

Note that for now on, ssl_sock_srv_try_reuse_sess() may fail for QUIC connections
if it did not managed to reuse the ALPN. The caller must be informed of such an
issue. It must not enable 0-RTT for the current session in this case. This is
impossible without ALPN which is required to start a mux.

ssl_sock_srv_try_reuse_sess() is modified to always succeeds for TCP connections.
2025-11-13 14:04:31 +01:00
Frederic Lecaille
5309dfb56b MINOR: quic-be: Save the backend 0-RTT parameters
For both TCP and QUIC connections, this is ssl_sess_new_srv_cb() callback which
is called when a new SSL session is created. Its role is to save the session to
be reused for the next sessions.

This patch modifies this callback to save the QUIC parameters to be reused
for the next 0-RTT sessions (or during SSL session resumption).

The already existing path_params->nego_alpn member is used to store the ALPN as
this is done for TCP alongside path_params->tps new quic_early_transport_params
struct used to save the QUIC transport parameters to be reused for 0-RTT sessions.
2025-11-13 14:04:31 +01:00
Frederic Lecaille
41e40eb431 MINOR: quic-be: helper quic_reuse_srv_params() function to reuse server params (0-RTT)
Implement quic_reuse_srv_params() whose role is to reuse the ALPN negotiated
during a first connection to a QUIC backend alongside its transport parameters.
2025-11-13 14:04:31 +01:00
Frederic Lecaille
33564ca54c MINOR: quic-be: helper functions to save/restore transport params (0-RTT)
Define quic_early_transport_params new struct for QUIC transport parameters
in relation with 0-RTT. This parameters must be saved during a first session to
be reused for 0-RTT next sessions.

qc_early_transport_params_cpy() copies the 0-RTT transport parameters to be
saved during a first connection to a backend. The copy is made from
a quic_transport_params struct to a quic_ealy_transport_params struct.

On the contrary, qc_early_transport_params_reuse() copies the transport parameters
to be reused for a 0-RTT session from a previous one. The copy is made
from a quic_early_transport_params strcut to a quic_transport_params struct.

Also add QUIC_EV_EARLY_TRANSP_PARAMS trace event to dump such 0-RTT
transport parameters from traces.
2025-11-13 14:04:31 +01:00
Frederic Lecaille
80070fe51c MEDIUM: quic-be: Parse, store and reuse tokens provided by NEW_TOKEN
Add a per thread ist struct to srv_per_thread struct to store the QUIC token to
be reused for subsequent sessions.

Parse at packet level (from qc_parse_ptk_frms()) these tokens and store
them calling qc_try_store_new_token() newly implemented function. This is
this new function which does its best (may fail) to update the tokens.

Modify qc_do_build_pkt() to resend these tokens calling quic_enc_token()
implemented by this patch.
2025-11-13 14:04:31 +01:00
Frederic Lecaille
8f23d4d287 MINOR: quic-be: Parse the NEW_TOKEN frame
Rename ->data qf_new_token struct field to ->w_data to distinguish it from
->r_data new field used to parse the NEW_TOKEN frame. Indeed to build the
NEW_TOKEN we need to write it to a static buffer into the frame struct. To
parse it we only need to store the address of the token field into the
RX buffer.
2025-11-13 14:04:31 +01:00
Amaury Denoyelle
5a8728d03a MEDIUM/OPTIM: quic: alloc quic_conn after CID collision check
On Initial packet parsing, a new quic_conn instance is allocated via
qc_new_conn(). Then a CID is allocated with its value derivated from
client ODCID. On CID tree insert, a collision can occur if another
thread was already parsing an Initial packet from the same client. In
this case, the connection is released and the packet will be requeued to
the other thread.

Originally, CID collision check was performed prior to quic_conn
allocation. This was changed by the commit below, as this could cause
issue on quic_conn alloc failure.

  commit 4ae29be18c5b212dd2a1a8e9fa0ee2fcb9dbb4b3
  BUG/MINOR: quic: Possible endless loop in quic_lstnr_dghdlr()

However, this procedure is less optimal. Indeed, qc_new_conn() performs
many steps, thus it could be better to skip it on Initial CID collision,
which can happen frequently. This patch restores the older order of
operations, with CID collision check prior to quic_conn allocation.

To ensure this does not cause again the same bug, the CID is removed in
case of quic_conn alloc failure. This should prevent any loop as it
ensures that a CID found in the global tree does not point to a NULL
quic_conn, unless if CID is attach to a foreign thread. When this thread
will parse a re-enqueued packet, either the quic_conn is already
allocated or the CID has been removed, triggering a fresh CID and
quic_conn allocation procedure.
2025-11-10 12:10:14 +01:00
Amaury Denoyelle
2623e0a0b7 BUG/MEDIUM: quic: handle collision on CID generation
CIDs are provided by haproxy so that the peer can use them as DCID of
its packets. Their value is set via a random generator. It happens on
several occasions during connection lifetime:
* via ODCID derivation if haproxy is the server
* on quic_conn init if haproxy is the client
* during post-handshake if haproxy is the server
* on RETIRE_CONNECTION_ID frame parsing

CIDs are stored in a global tree. On ODCID derivation, a check is
performed to ensure the CID is not a duplicate value. This is mandatory
to properly handle multiple INITIAL packets from the same client on
different thread.

However, for the other cases, no check is performed for CID collision.
As _quic_cid_insert() is silent, the issue is not detected at all. This
results in a CID advertized to the peer but not stored in the global
one. In the end, this may cause two issues. The first one is that
packets from the client which use the new CID will be rejected by
haproxy, most probably with a STATELESS_RESET. The second issue is that
it can cause a crash during quic_conn release. Indeed, the CID is stored
in the quic_conn local tree and thus eb_delete() for the global tree
will be performed. As <leaf_p> member is uninit, this results in a
segfault.

Note that this issue is pretty rare. It can only be observed if running
with a high number of concurrent connections in parallel, so that the
random generator will provide duplicate values. Patch is still labelled
as MEDIUM as this modifies code paths used frequently.

To fix this, _quic_cid_insert() unsafe function is completely removed.
Instead, quic_cid_insert() can be used, which reports an error code if a
collision happens. CID are then stored in the quic_conn tree only after
global tree insert success. Here is the solution for each steps if a
collision occurs :
* on init as client: the connection is completely released
* post-handshake: the CID is immediately released. The connection is
  kept, but it will miss an extra CID.
* on RETIRE_CONNECTION_ID parsing: a loop is implemented to retry random
  generation. It it fails several times, the connection is closed in
  error.

A small convenience change is made to quic_cid_insert(). Output
parameter <new_tid> can now be NULL, which is useful as most of the
times caller do not care about it.

This must be backported up to 2.6.
2025-11-10 12:10:14 +01:00
Amaury Denoyelle
419e5509d8 MINOR: quic: split CID alloc/generation function
Split new_quic_cid() function into multiple ones. This patch should not
introduce any visible change. The objective is to render CID allocation
and generation more modular.

The first advantage of this patch is to bring code simplication. In
particular, conn CID sequence number increment and insertion into
connection tree is simpler than before. Another improvment is also that
errors could now be handled easier at each different steps of the CID
init.

This patch is a prerequisite for the fix on CID collision, thus it must
be backported prior to it to every affected version.
2025-11-10 12:10:14 +01:00
Christopher Faulet
ecc2c3a35d MEDIUM: peers: Remove commitupdate field on stick-tables
This stick-table field was atomically updated with the last update id pushed
and dumped on the CLI but never used otherwise. And all peer sessions share
the same id because it is a stick-table info. So the info in peers dump is
pretty limited.

So, let's remove it.
2025-11-07 12:17:53 +01:00
Ben Kallus
d5ca3bb3b4 IMPORT: cebtree: Replace offset calculation with offsetof to avoid UB
This is the same as the equivalent fix in ebtree:

The C standard specifies that it's undefined behavior to dereference
NULL (even if you use & right after). The hand-rolled offsetof idiom
&(((s*)NULL)->f) is thus technically undefined. This clutters the
output of UBSan and is simple to fix: just use the real offsetof when
it's available.

This is cebtree commit 2d08958858c2b8a1da880061aed941324e20e748.
2025-11-07 07:32:58 +01:00
Willy Tarreau
14087e48b9 MINOR: tools: add env_suggest() to suggest alternate variable names
The purpose here is to look in the environment for a variable whose
name looks like the provided one. This will be used to try to auto-
correct misspelled environment variables that would silently be turned
to an empty string.
2025-11-06 19:57:44 +01:00
Willy Tarreau
a4d78dd4f5 MINOR: tools: add support for ist to the word fingerprinting functions
The word fingerprinting functions are used to compare similar words to
suggest a correctly spelled one that looks like what the user proposed.
Currently the functions only support const char*, but there's no reason
for this, and it would be convenient to support substrings extracted
from random pieces of configurations. Here we're adding new variants
"_with_len" that take these ISTs and which are in fact a slight change
of the original ones that the old ones now rely on.
2025-11-06 19:57:44 +01:00
Willy Tarreau
0144426dfb BUG/MEDIUM: server: close a race around ready_srv when deleting a server
When a server is being disabled or deleted, in case it matches the
backend's ready_srv, this one is reset. However it's currently done in
a non-atomic way when the server goes down, and that could occasionally
reset the entry matching another server, but more importantly if in
parallel some requests are dequeued for that server, it may re-appear
there after having been removed, leading to a possible crash once it
is fully removed, as shown in issue #3177.

Let's make sure we reset the pointer when detaching the server from
the proxy, and use a CAS in both cases to only reset this server.

This fix needs to be backported to 3.2. There, srv_detach() is in
server.c instead of server.h. Thanks to Basha Mougamadou for the
detailed report and the useful backtraces.
2025-11-06 19:57:44 +01:00
Christopher Faulet
a1b5325a7a MINOR: channel: Remove total field from channels
The <total> field in the channel structure is now useless, so it can be
removed. The <bytes_in> field from the SC is used instead.

This patch is related to issue #1617.
2025-11-06 15:01:29 +01:00
Christopher Faulet
1effe0fc0a MINOR: applet: Add function to get amount of data in the output buffer
The helper function applet_output_data() returns the amount of data in the
output buffer of an applet. For applets using the new API, it is based on
data present in the outbuf buffer. For legacy applets, it is based on input
data present in the input channel's buffer. The HTX version,
applet_htx_output_data(), is also available

This patch is related to issue #1617.
2025-11-06 15:01:29 +01:00
Christopher Faulet
4991a51208 MINOR: stats: Add stats about request and response bytes received and sent
In previous patches, these counters were added per frontend, backend, server
and listener. With this patch, these counters are reported on stats,
including promex.

Note that the stats file minor version was incremented by one because the
shm_stats_file_object struct size has changed.

This patch is related to issue #1617.
2025-11-06 15:01:29 +01:00
Christopher Faulet
0084baa6ba MINOR: counters: Remove bytes_in and bytes_out counter from fe/be/srv/li
bytes_in and bytes_out counters per frontend, backend, listener and server
were removed and we now rely on, respectively on, req_in and res_in
counters.

This patch is related to issue #1617.
2025-11-06 15:01:29 +01:00
Christopher Faulet
567df50d91 MINOR: stream: Remove bytes_in and bytes_out counters from stream
per-stream bytes_in and bytes_out counters was removed and replaced by
req.in and res.in. Coorresponding samples still exists but replies on new
counters.

This patch is related to issue #1617.
2025-11-06 15:01:29 +01:00
Christopher Faulet
1c62a6f501 MINOR: counters: Add req_in/req_out/res_in/res_out counters for fe/be/srv/li
Thanks to the previous patch, and based on info available on the stream, it
is now possible to have counters for frontends, backends, servers and
listeners to report number of bytes received and sent on both sides.

This patch is related to issue #1617.
2025-11-06 15:01:29 +01:00
Christopher Faulet
ac9201f929 MINOR: stream: Add samples to get number of bytes received or sent on each side
req.in and req.out samples can now be used to get the number of bytes
received by a client and send to the server. And res.in and res.out samples
can be used to get the number of bytes received by a server and send to the
client. These info are stored in the logs structure inside a stream.

This patch is related to issue #1617.
2025-11-06 15:01:28 +01:00
Christopher Faulet
629fbbce19 MINOR: stconn: Add counters to SC to know number of bytes received and sent
<bytes_in> and <bytes_out> counters were added to SC to count, respectively,
the number of bytes received from an endpoint or sent to an endpoint. These
counters are updated for connections and applets.

This patch is related to issue #1617.
2025-11-06 15:01:28 +01:00
Willy Tarreau
5fe4677231 MINOR: server: move the lock inside srv_add_idle()
Almost all callers of _srv_add_idle() lock the list then call the
function. It's not the most efficient and it requires some care from
the caller to take care of that lock. Let's change this a little bit by
having srv_add_idle() that takes the lock and calls _srv_add_idle() that
is now inlined. This way callers don't have to handle the lock themselves
anymore, and the lock is only taken around the sensitive parts, not the
function call+return.

Interestingly, perf tests show a small perf increase from 2.28-2.32M RPS
to 2.32-2.37M RPS on a 128-thread system.
2025-11-06 13:16:24 +01:00
William Lallemand
546c67d137 MINOR: acme: generate a temporary key pair
This patch provides two functions acme_gen_tmp_pkey() and
acme_gen_tmp_x509().

These functions generates a unique keypair and X509 certificate that
will be stored in tmp_x509 and tmp_pkey. If the key pair or certificate
was already generated they will return the existing one.

The key is an RSA2048 and the X509 is generated with a expiration in the
past. The CN is "expired".

These are just placeholders to be used if we don't have files.
2025-11-06 11:56:27 +01:00
William Lallemand
1df55b441b MEDIUM: ssl/ckch: use ckch_store instead of ckch_data for ckch_conf_kws
This is an API change, instead of passing a ckch_data alone, the
ckch_conf_kws.func() is called with a ckch_store.

This allows the callback to access the whole ckch_store, with the
ckch_conf and the ckch_data. But it requires the ckch_conf to be
actually put in the ckch_store before.
2025-11-06 11:56:27 +01:00
Amaury Denoyelle
b9809fe0d0 MINOR: quic: remove <mux_state> field
This patch removes <mux_state> field from quic_conn structure. The
purpose of this field was to indicate if MUX layer above quic_conn is
not yet initialized, active, or already released.

It became tedious to properly set it as initialization order of the
various quic_conn/conn/MUX layers now differ between the frontend and
backend sides, and also depending if 0-RTT is used or not. Recently, a
new change introduced in connect_server() will allow to initialize QUIC
MUX earlier if ALPN is cached on the server structure. This had another
level of complexity.

Thus, this patch removes <mux_state> field completely. Instead, a new
flag QUIC_FL_CONN_XPRT_CLOSED is defined. It is set at a single place
only on close XPRT callback invokation. It can be mixed with the new
utility functions qc_wait_for_conn()/qc_is_conn_ready() to determine the
status of conn/MUX layers now without an extra quic_conn field.
2025-11-05 14:03:34 +01:00
Willy Tarreau
096999ee20 BUG/MEDIUM: connections: permit to permanently remove an idle conn
There's currently a function conn_delete_from_tree() which is used to
detach an idle connection from the tree it's currently attached to so
that it is no longer found. This function is used in three circumstances:
  - when picking a new connection that no longer has any avail stream
  - when temporarily working on the connection from an I/O handler,
    in which case it's re-added at the end
  - when killing a connection

The 2nd case above is quite specific, as it requires to preserve the
CO_FL_LIST_MASK flags so that the connection can be re-inserted into
the proper tree when leaving the handler. However, there's a catch.
When killing a connection, we want to be certain it will not be
reinserted into the tree. The flags preservation is causing a tiny
race if an I/O happens while the connection is in the kill list,
because in this case the I/O handler will note the connection flags,
do its work, then reinsert the connection where it believed it was,
then the connection gets purged, and another user can find it in the
tree.

The issue is very difficult to reproduce. On a 128-thread machine it
happens in H2 around 500k req/s after around 50M requests. In H1 it
happens after around 1 billion requests.

The fix here consists in passing an extra argument to the function to
indicate if the removal is permanent or not. When it's permanent, the
function will clear the associated flags. The callers were adjusted
so that all those dequeuing a connection in order to kill it do it
permanently and all other ones do it only temporarily.

A slightly different approach could have worked: the function could
always remove all flags, and the callers would need to restore them.
But this would require trickier modifications of the various call
places, compared to only passing 0/1 to indicate the permanent status.

This will need to be backported to all stable versions. The issue was
at least reproduced since 3.1 (not tested before). The patch will need
to be adjusted for 3.2 and older, because a 2nd argument "thr" was
added in 3.3, so the patch will not apply to older versions as-is.
2025-11-05 11:08:25 +01:00
Olivier Houchard
7d4aa7b22b BUG/MEDIUM: server: Add a rwlock to path parameter
Add a rwlock to control the server's path_parameter, to make sure
multiple threads don't set it at the same time, and it can't be seen in
an inconsistent state.
Also don't set the parameter every time, only set them if they have
changed, to prevent needless writes.

This does not need to be backported.
2025-11-04 18:47:34 +01:00
Amaury Denoyelle
efe60745b3 MINOR: quic: remove connection arg from qc_new_conn()
This patch is similar to the previous one, this time dealing with
qc_new_conn(). This function was asymetric on frontend and backend side,
as connection argument was set only in the latter case.

This was required prior due to qc_alloc_ssl_sock_ctx() signature. This
has changed with the previous patch, thus qc_new_conn() can also be
realigned on both FE and BE sides. <conn> member of quic_conn instance
is always set outside it, in qc_xprt_start() on the backend case.
2025-11-04 17:47:42 +01:00
Amaury Denoyelle
5a17cade4f MINOR: quic: do not set conn member if ssl_sock_ctx
ssl_sock_ctx is a generic object used both on TCP/SSL and QUIC stacks.
Most notably it contains a <conn> member which is a pointer to struct
connection.

On QUIC frontend side, this member is always set to NULL. Indeed,
connection is only created after handshake completion. However, this has
changed for backend side, where the connection is instantiated prior to
its quic_conn counterpart. Thus, ssl_sock_ctx member would be set in
this case as a convenience for use later in qc_ssl_do_hanshake().

However, this method was unsafe as the connection can be released,
without resetting ssl_sock_ctx member. Thus, the previous patch fixes
this by using on <conn> member through the quic_conn instance which is
the proper way.

Thus, this patch resets ssl_sock_ctx <conn> member to NULL. This is
deemed the cleanest method as it ensures that both frontend and backend
sides must not use it anymore.
2025-11-04 17:38:09 +01:00
Willy Tarreau
fd012b6c59 OPTIM: proxy: move atomically access fields out of the read-only ones
Perf top showed that h1_snd_buf() was having great difficulties accessing
the proxy's server_id_hdr_name field in the middle of the headers loop.
Moving the assignment out of the loop to a local variable moved the
problem there as well:

       |      if (!(h1m->flags & H1_MF_RESP) && isttest(h1c->px->server_id_hdr_n
  0.10 |20b0:   mov        -0x120(%rbp),%rdi
  1.33 |        mov        0x60(%rdi),%r10
  0.01 |        test       %eax,%eax
  0.18 |        jne        2118
 12.87 |        mov        0x350(%r10),%rdi
  0.01 |        test       %rdi,%rdi
  0.05 |        je         2118
       |        mov        0x358(%r10),%r11

It turns out that there are several atomically accessed fields in its
vicinity, causing the cache line to bounce all the time. Let's collect
the few frequently changed fields and place them together at the end
of the structure, and plug the 32-bit hole with another isolated field.
Doing so also reduced a little bit the cost of decrementing be->be_conn
in process_stream(), and overall the HTTP/1 performance increased by
about 1% both on ARM and x86_64.
2025-11-03 13:54:49 +01:00
Amaury Denoyelle
6bfabfdc77 OPTIM: backend: skip conn reuse for incompatible proxies
When trying to reuse a backend connection, a connection hash is
calculated to match an entry with similar parameters. Previously, this
operation was skipped if the stream content wasn't based on HTTP, as it
would have been incompatible with http-reuse.

With the introduction of SPOP backends, this condition was removed, so
that it can also benefit from connection reuse. However, this means that
now hash calcul is always performed when connecting to a server, even
for TCP or log backends. This is unnecessary as these proxies cannot
perform connection reuse.

Note also that reuse mode is resetted on postparsing for incompatible
backends. This at least guarantees that no tree lookup will be performed
via be_reuse_connection(). However, connection lookup is still performed
in the session via session_get_conn() which is another unnecessary
operation.

Thus, this patch restores the condition so that reuse operations are now
entirely skipped if a backend mode is incompatible. This is implemented
via a new utility function named be_supports_conn_reuse().

This could be backported up to 3.1, as this commit could be considered
as a performance regression for tcp/log backend modes.
2025-11-03 10:43:50 +01:00
Amaury Denoyelle
14a6468df5 MINOR: quic: reject conf with QUIC servers if not compiled
Ensure that QUIC support is compiled into haproxy when a QUIC server is
configured. This check is performed during _srv_parse_finalize() so that
it is detected both on configuration parsing and when adding a dynamic
server via the CLI.

Note that this changes the behavior of srv_is_quic() utility function.
Previously, it always returned false when QUIC support wasn't compiled.
With this new check introduced, it is now guaranteed that a QUIC server
won't exist if compilation support is not active. Hence srv_is_quic()
does not rely anymore on USE_QUIC define.
2025-10-31 11:32:20 +01:00