546 Commits

Author SHA1 Message Date
Willy Tarreau
2d6b5c7a60 MEDIUM: connection: reintegrate conn_hash_node into connection
Previously the conn_hash_node was placed outside the connection due
to the big size of the eb64_node that could have negatively impacted
frontend connections. But having it outside also means that one
extra allocation is needed for each backend connection, and that one
memory indirection is needed for each lookup.

With the compact trees, the tree node is smaller (16 bytes vs 40) so
the overhead is much lower. By integrating it into the connection,
We're also eliminating one pointer from the connection to the hash
node and one pointer from the hash node to the connection (in addition
to the extra object bookkeeping). This results in saving at least 24
bytes per total backend connection, and only inflates connections by
16 bytes (from 240 to 256), which is a reasonable compromise.

Tests on a 64-core EPYC show a 2.4% increase in the request rate
(from 2.08 to 2.13 Mrps).
2025-09-16 09:23:46 +02:00
Willy Tarreau
ceaf8c1220 MEDIUM: connection: move idle connection trees to ceb64
Idle connection trees currently require a 56-byte conn_hash_node per
connection, which can be reduced to 32 bytes by moving to ceb64. While
ceb64 is theoretically slower, in practice here we're essentially
dealing with trees that almost always contain a single key and many
duplicates. In this case, ceb64 insert and lookup functions become
faster than eb64 ones because all duplicates are a list accessed in
O(1) while it's a subtree for eb64. In tests it is impossible to tell
the difference between the two, so it's worth reducing the memory
usage.

This commit brings the following memory savings to conn_hash_node
(one per backend connection), and to srv_per_thread (one per thread
and per server):

     struct       before  after  delta
  conn_hash_nodea   56     32     -24
  srv_per_thread    96     72     -24

The delicate part is conn_delete_from_tree(), because we need to
know the tree root the connection is attached to. But thanks to
recent cleanups, it's now clear enough (i.e. idle/safe/avail vs
session are easy to distinguish).
2025-09-16 09:23:46 +02:00
Willy Tarreau
95b8adff67 MINOR: connection: pass the thread number to conn_delete_from_tree()
We'll soon need to choose the server's root based on the connection's
flags, and for this we'll need the thread it's attached to, which is
not always the current one. This patch simply passes the thread number
from all callers. They know it because they just set the idle_conns
lock on it prior to calling the function.
2025-09-16 09:23:46 +02:00
Amaury Denoyelle
687df405fe BUG/MINOR: connection: streamline conn detach from lists
Over their lifetime, connections are attached to different list. These
lists depends on whether connection is on frontend or backend side.
Attach point members are stored via a union in struct connection. The
next commit reorganizes them so that a proper frontend/backend
separation is performed :

  commit a96f1286a75246fef6db3e615fabdef1de927d83
  BUG/MINOR: connection: rearrange union list members

On conn_free(), connection instance must be removed from these lists to
ensure there is no use-after-free case. However code was still shaky
there, despite no real issue. Indeed, <toremove_list> was detached for
all connections, despite being only used on backend side only.

This patch streamlines the freeing of connection. Now, <toremove_list>
detach is performed in conn_backend_deinit(). Moreover, a new helper
conn_frontend_deinit() is defined. It ensures that <stopping_list>
detach is done. Prior it was performed individually by muxes.

Note that a similar procedure is performed when the connection is
reversed. Hence, conn_frontend_deinit() is now used here as well,
rendering reversal from FE to BE or vice versa symmetrical.

As mentionned above, no crash occured prior to this patch, but the code
was fragile, in particular access to <toremove_list> for frontend
connections. Thus this patch is considered as a bug fix worthy of a
backport along with above mentionned patch, currently up to 3.0.
2025-09-04 18:31:20 +02:00
Amaury Denoyelle
dcf2261612 BUG/MAJOR: mux-quic: fix crash on reload during emission
MUX QUIC restricts buffer allocation per connection based on the
underlying congestion window. If a QCS instance cannot allocate a new
buffer, it is put in a buf_wait list. Typically, this will cause stream
upper layer to subscribe for sending.

A BUG_ON() was present on snd_buf and nego_ff callback prologue to
ensure that these functions were not called if QCS is already in
buf_wait list. The objective was to guarantee that there is no wake up
on a stream if it cannot allocate a buffer.

However, this BUG_ON() is not correct, as it can be fired legitimely.
Indeed, stream layer can retry emission even if no wake up occured. This
case can happen on reload. Thus, BUG_ON() will cause an unexpected
crash.

Fix this by removing these BUG_ON(). Instead, snd_buf/nego_ff callbacks
ensure that QCS is not subscribed in buf_wait list. If this is the case,
a nul value will be returned, which is sufficient for the stream layer
to pause emission and subscribe if necessary.

Occurences for this crash have been reported on the mailing list. It is
also the subject of github issue #3080, which should be fixed with this
patch.

This must be backported up to 3.0.
2025-09-01 15:35:22 +02:00
Amaury Denoyelle
17a1daca72 MEDIUM: mux-quic: enforce thread-safety of backend idle conns
Complete QUIC MUX for backend side. Ensure access to idle connections
are performed in a thread-safe way. Even if takeover is not yet
implemented for this protocol, it is at least necessary to ensure that
there won't be any issue with idle connections purging mechanism.

This change will also be necessary to ensure that QUIC servers can
safely be removed via CLI "del server". This is not yet sufficient as
currently server deletion still relies on takeover for idle connections
removal. However, this will be adjusted in a future patch to instead use
idle connections standard purging mechanism.
2025-08-28 15:08:35 +02:00
Amaury Denoyelle
b3ce464435 BUG/MINOR: mux-quic: do not access conn after idle list insert
Once a connection is inserted into the server idle/safe tree during
stream detach, it is not accessed anymore by the muxes without
idle_conns_lock protection. This is because the connection could have
been already stolen by a takeover operation.

Adjust QUIC MUX detach implementation to follow the same pattern. Note
that, no bug can occur due to takeover as QUIC does not implement it.
However, prior to this patch, there may still exist race-conditions with
idle connection purging.

No backport needed.
2025-08-28 14:52:29 +02:00
Frederic Lecaille
ffa926ead3 BUG/MINOR: mux-quic: trace with non initialized qcc
This issue leads to crashes when the QUIC mux traces are enabled and could be
reproduced with -dMfail. When the qcc allocation fails (qcc_init()) haproxy
crashes into qmux_dump_qcc_info() because ->conn qcc member is initialized:

Program terminated with signal SIGSEGV, Segmentation fault.
    at src/qmux_trace.c:146
146             const struct quic_conn *qc = qcc->conn->handle.qc;
[Current thread is 1 (LWP 1448960)]
(gdb) p qcc
$1 = (const struct qcc *) 0x7f9c63719fa0
(gdb) p qcc->conn
$2 = (struct connection *) 0x155550508
(gdb)

This patch simply fixes the TRACE() call concerned to avoid <qcc> object
dereferencing when it is NULL.

Must be backported as far as 3.0.
2025-08-28 08:19:34 +02:00
Willy Tarreau
c264ea1679 MEDIUM: tree-wide: replace most DECLARE_POOL with DECLARE_TYPED_POOL
This will make the pools size and alignment automatically inherit
the type declaration. It was done like this:

   sed -i -e 's:DECLARE_POOL(\([^,]*,[^,]*,\s*\)sizeof(\([^)]*\))):DECLARE_TYPED_POOL(\1\2):g' $(git grep -lw DECLARE_POOL src addons)
   sed -i -e 's:DECLARE_STATIC_POOL(\([^,]*,[^,]*,\s*\)sizeof(\([^)]*\))):DECLARE_STATIC_TYPED_POOL(\1\2):g' $(git grep -lw DECLARE_STATIC_POOL src addons)

81 replacements were made. The only remaining ones are those which set
their own size without depending on a structure. The few ones with an
extra size were manually handled.

It also means that the requested alignments are now checked against the
type's. Given that none is specified for now, no issue is reported.

It was verified with "show pools detailed" that the definitions are
exactly the same, and that the binaries are similar.
2025-08-11 19:55:30 +02:00
Amaury Denoyelle
a6e67e7b41 BUG/MEDIUM: mux-quic: ensure Early-data header is set
QUIC MUX may be initialized prior to handshake completion, when 0-RTT is
used. In this case, connection is flagged with CO_FL_EARLY_SSL_HS, which
is notably used by wait-for-hs http rule.

Early data may be subject to replay attacks. For this reason, haproxy
adds the header 'Early-data: 1' to all requests handled as TLS early
data. Thus the server can reject it if it is deemed unsafe. This header
injection is implemented by http-ana. However, it was not functional
with QUIC due to missing CO_FL_EARLY_DATA connection flag.

Fix this by ensuring that QUIC MUX sets CO_FL_EARLY_DATA when needed.
This is performed during qcc_recv() for STREAM frame reception. It is
only set if QC_CF_WAIT_HS is set, meaning that the handshake is not yet
completed. After this, the request is considered safe and Early-data
header is not necessary anymore.

This should fix github issue #3054.

This must be backported up to 3.2 at least. If possible, it should be
backported to all stable releases as well. On these versions, the
current patch relies on the following refactoring commit :
  commit 0a53a008d032b69377869c8caaec38f81bdd5bd6
  MINOR: mux-quic: refactor wait-for-handshake support
2025-07-31 15:25:59 +02:00
Amaury Denoyelle
697f7d1142 MINOR: muxes: refactor private connection detach
Following the latest adjustment on session_add_conn() /
session_check_idle_conn(), detach muxes callbacks were rewritten for
private connection handling.

Nothing really fancy here : some more explicit comments and the removal
of a duplicate checks on idle conn status for muxes with true
multipexing support.
2025-07-30 16:14:00 +02:00
Amaury Denoyelle
dd9645d6b9 MINOR: session: do not release conn in session_check_idle_conn()
session_check_idle_conn() is called to flag a connection already
inserted in a session list as idle. If the session limit on the number
of idle connections (max-session-srv-conns) is exceeded, the connection
is removed from the session list.

In addition to the connection removal, session_check_idle_conn()
directly calls MUX destroy callback on the connection. This means the
connection is freed by the function itself and should not be used by the
caller anymore.

This is not practical when an alternative connection closure method
should be used, such as a graceful shutdown with QUIC. As such, remove
MUX destroy invokation : this is now the responsability of the caller to
either close or release immediately the connection.
2025-07-30 11:43:41 +02:00
Amaury Denoyelle
ec1ab8d171 MINOR: session: remove redundant target argument from session_add_conn()
session_add_conn() uses three argument : connection and session
instances, plus a void pointer labelled as target. Typically, it
represents the server, but can also be a backend instance (for example
on dispatch).

In fact, this argument is redundant as <target> is already a member of
the connection. This commit simplifies session_add_conn() by removing
it. A BUG_ON() on target is extended to ensure it is never NULL.
2025-07-30 11:39:57 +02:00
Amaury Denoyelle
cfe9bec1ea MINOR: mux-quic: release conn after shutdown on BE reuse failure
On stream detach on backend side, connection is inserted in the proper
server/session list to be able to reuse it later. If insertion fails and
the connection is idle, the connection can be removed immediately.

If this occurs on a QUIC connection, QUIC MUX implements graceful
shutdown to ensure the server is notified of the closure. However, the
connection instance is not freed. Change this to ensure that both
shutdown and release is performed.
2025-07-30 10:04:19 +02:00
Amaury Denoyelle
08d664b17c MEDIUM: mux-quic: support backend private connection
If a backend connection is private, it should not be reused outside of
its original attached session. As such, on stream detach operation, such
connection is never inserted into server idle/avail list. Instead, it is
stored directly on the session.

The purpose of this commit is to implement proper handling of private
backend connections via QUIC multiplexer.
2025-07-23 15:49:51 +02:00
Amaury Denoyelle
00d668549e MINOR: mux-quic: do not reuse connection if app already shut
QUIC connection graceful closure is performed in two steps. First, the
application layer is closed. In the context of HTTP/3, this is done with
a GOAWAY frame emission, which forbids opening of new streams. Then the
whole connection is terminated via CONNECTION_CLOSE which is the final
emitted frame.

This commit ensures that when app layer is shut for a backend
connection, this connection is removed from either idle or avail server
tree. The objective is to prevent stream layer to try to reuse a
connection if no new stream can be attached on it.

New BUG_ON checks are inserted in qmux_strm_attach() and h3_attach() to
ensure that this assertion is always true.
2025-07-23 15:45:18 +02:00
Amaury Denoyelle
3217835b1d MEDIUM: mux-quic: implement be connection reuse
Implement support for QUIC connection reuse on the backend side. The
main change is done during detach stream operation. If a connection is
idle, it is inserted in the server list. Else, it is stored in the
server avail tree if there is room for more streams.

For non idle connection, qmux_avail_streams() is reused to detect that
stream flow-control limit is not yet reached. If this is the case, the
connection is not inserted in the avail tree, so it cannot be reuse,
even if flow-control is unblocked later by the peer. This latter point
could be improved in the future.

Note that support for QUIC private connections is still missing. Reuse
code will evolved to fully support this case.
2025-07-23 15:45:09 +02:00
Amaury Denoyelle
3bf37596ba MINOR: mux-quic: store session in QCS instance
Add a new <sess> member into QCS structure. It is used to store the
parent session of the stream on attach operation. This is only done for
backend side.

This new member will become necessary when connection reuse will be
implemented. <owner> member of connection is not suitable as it could be
set to NULL, notably after a session_add_conn() failure.

Also, a single BE conn can be shared along different session instance,
in particular when using aggressive/always reuse mode. Thus it is
necessary to linked each QCS instance with its session.
2025-07-23 15:42:37 +02:00
Amaury Denoyelle
826f797bb0 MINOR: mux-quic: disable glitch on backend side
For now, QUIC glitch limit counter is only available on the frontend
side. Thus, disable incrementation on the backend side for now. Also,
session is only available as conn <owner> reliably on the frontend side,
so session_add_glitch_ctr() operation is also securised.
2025-07-23 14:39:18 +02:00
Amaury Denoyelle
89329b147d MINOR: mux-quic: correctly implement backend timeout
qcc_refresh_timeout() is the function called on QUIC MUX activity. Its
purpose is to update the timeout by selecting the correct value
depending on the connection state.

Prior to this patch, backend connections were mostly ignored by the
function. However, the default server timeout was selecting as a
fallback. This is incompatible with backend connections reuse.

This patch fixes timeout applied on backend connections. Only values
specific to frontend which are http-request and http-keep-alive timeouts
are now ignored for a backend connection. Also, fallback timeout is only
used for frontend connections.

This patch ensures that an idle backend connection won't be deleted due
to server timeout. This is necessary for proper connection reuse which
will be implemented in a future patch.
2025-07-23 14:36:48 +02:00
Amaury Denoyelle
95cb763cd6 MINOR: mux-quic: refactor timeout code
This commit is a small reorganization of condition used into
qcc_refresh_timeout(). Its objective is to render the code more logical
before the next patch which will ensure that timeout is properly set for
backend connections.
2025-07-23 14:36:48 +02:00
Amaury Denoyelle
558532fc57 BUG/MINOR: mux-quic: ensure close-spread-time is properly applied
If a connection remains on a proxy currently disabled or stopped, a
special spread timeout is set if active close is configured. For QUIC
MUX, this is set via qcc_refresh_timeout() as with all other timeout
values.

Fix this closing timeout setting : it is now used as an override to any
other timeout that may have been chosen if calculated spread time is
lower than the previously selected value. This is done for backend
connections as well.

This should be backported up to 2.6 after a period of observation.
2025-07-23 14:36:48 +02:00
Amaury Denoyelle
c5bcc3a21e BUG/MINOR mux-quic: apply correctly timeout on output pending data
When no stream is attached, mux layer is responsible to maintain a
timeout. The first criteria is to apply client/server timeout if there
is still data waiting for emission.

Previously, <hreq> qcc member was used to determine this state. However,
this only covers bidirectional streams. Fix this by testing if
<send_list> is empty or not. This is enough to take into account both
bidi and uni streams.

Theorically, this should be backported to every stable versions.
However, send-list is not available on 2.6 and there is no alternative
to quickly determine if there is waiting output data. Thus, it's better
to backport it up to 2.8 only.
2025-07-23 14:36:48 +02:00
Ilia Shipitsin
0ee3d739b8 CLEANUP: assorted typo fixes in the code, commits and doc
Corrected various spelling and phrasing errors to improve clarity and consistency.
2025-07-10 19:49:48 +02:00
Amaury Denoyelle
4527a2912b MEDIUM: mux-quic: implement attach for new streams on backend side
Implement attach and avail_streams mux-ops callbacks, which are used on
backend side for connection reuse.

Attach operation is used to initiate new streams on the connection
outside of the first one. It simply relies on qcc_init_stream_local() to
instantiate a new QCS instance, which is immediately linked to its
stream data layer.

Outside of attach, it is also necessary to implement avail_streams so
that the stream layer will try to initiate connection reuse. This method
reports the number of bidirectional streams which can still be opened
for the QUIC connection. It depends directly to the flow-control value
advertised by the peer. Thus, this ensures that attach won't cause any
flow control violation.
2025-06-18 17:25:27 +02:00
Amaury Denoyelle
81cfaab6b4 MINOR: mux-quic: abort conn if cannot create stream due to fctl
Prior to initiate first stream on the backend side, ensure that peer
flow-control allows at least that a single bidirectional stream can be
created. If this is not the case, abort MUX init operation.

Before this patch, flow-control limit was not checked. Hence, if peer
does not allow any bidirectional stream, haproxy would violate it, which
whould then cause the peer to close the connection.

Note that with the current situation, haproxy won't be able to talk to
servers which uses a 0 for initial max bidi streams. A proper solution
could be to pause the request until a MAX_STREAMS is received, under
timeout supervision to ensure the connection is closed if no frame is
received.
2025-06-18 17:25:27 +02:00
Amaury Denoyelle
06cab99a0e MINOR: mux-quic: support max bidi streams value set by the peer
Implement support for MAX_STREAMS frame. On frontend, this was mostly
useless as haproxy would never initiate new bidirectional streams.
However, this becomes necessary to control stream flow-control when
using QUIC as a client on the backend side.

Parsing of MAX_STREAMS is implemented via new qcc_recv_max_streams().
This allows to update <ms_uni>/<ms_bidi> QCC fields.

This patch is necessary to achieve QUIC backend connection reuse.
2025-06-18 17:25:27 +02:00
Amaury Denoyelle
805a070ab9 BUG/MINOR: mux-quic/h3: properly handle too low peer fctl initial stream
Previously, no check on peer flow-control was implemented prior to open
a local QUIC stream. This was a small problem for frontend
implementation, as in this case haproxy as a server never opens
bidirectional streams.

On frontend, the only stream opened by haproxy in this case is for
HTTP/3 control unidirectional data. If the peer uses an initial value
for max uni streams set to 0, it would violate its flow control, and the
peer will probably close the connection. Note however that RFC 9114
mandates that each peer defines minimal initial value so that at least
the control stream can be created.

This commit improves the situation of too low initial max uni streams
value. Now, on HTTP/3 layer initialization, haproxy preemptively checks
flow control limit on streams via a new function
qcc_fctl_avail_streams(). If credit is already expired due to a too
small initial value, haproxy preemptively closes the connection using
H3_ERR_GENERAL_PROTOCOL_ERROR. This behavior is better as haproxy is now
the initiator of the connection closure.

This should be backported up to 2.8.
2025-06-18 17:18:55 +02:00
Amaury Denoyelle
fc1a17f169 BUG/MINOR: mux-quic: check sc_attach_mux return value
On backend side, QUIC MUX needs to initialize the first local stream
during MUX init operation. This is necessary so that the first transfer
can then be performed.

sc_attach_mux() is used to attach the created QCS instance to its stream
data layer. However, return value was not checked, which may cause
issues on allocation error. This patch fixes it by returning an error on
MUX init operation and freeing the QCS instance in case of
sc_attach_mux() error.

This fixes coverity report from github issue #3007.

No need to backport.
2025-06-16 18:11:09 +02:00
Amaury Denoyelle
1efaca8a57 MINOR: mux-quic: instantiate first stream on backend side
Adjust qmux_init() to handle frontend and backend sides differently.
Most notably, on backend side, the first bidirectional stream is created
preemptively. This step is necessary as MUX layer will be woken up just
after handshake completion.
2025-06-12 11:28:54 +02:00
Amaury Denoyelle
f8d096c05f MINOR: mux-quic: set expect data only on frontend side
Stream data layer is notified that data is expected when FIN is
received, which marks the end of the HTTP request. This prepares data
layer to be able to handle the expected HTTP response.

Thus, this step is only relevant on frontend side. On backend side, FIN
marks the end of the HTTP response. No further content is expected, thus
expect data should not be set in this case.

Note that se_expect_data() invokation via qcs_attach_sc() is not
protected. This is because this function will only be called during
request headers parsing which is performed on the frontend side.
2025-06-12 11:28:54 +02:00
Amaury Denoyelle
e8775d51df MINOR: mux-quic: define flag for backend side
Mux connection is flagged with new QC_CF_IS_BACK if used on the backend
side. For now the only change is during traces, to be able to
differentiate frontend and backend usage.
2025-06-12 11:28:54 +02:00
Amaury Denoyelle
044ad3a602 BUG/MEDIUM: mux-quic: adjust wakeup behavior
Change wake callback behavior for QUIC MUX. This operation loops over
each QCS and notify their stream data layer on certain events via
internal helper qcc_wake_some_streams().

Previously, streams were notified only if an error occured on the
connection. Change this to notify streams data layer everytime wake
callback is used. This behavior is now identical to H2 MUX.

qcc_wake_some_streams() is also renamed to qcc_wake_streams(), as it
better reflect its true behavior.

This change should not have performance impact as wake mux ops should
not be called frequently. Note that qcc_wake_streams() can also be
called directly via qcc_io_process() to ensure a new error is correctly
propagated. As wake callback first uses qcc_io_process(), it will only
call qcc_wake_streams() if no error is present.

No known issue is associated with this commit. However, it could prevent
freezing transfer under certain condition. As such, it is considered as
a bug fix worthy of backporting.

This should be backported after a period of observation.
2025-06-12 11:12:49 +02:00
Amaury Denoyelle
9c751a3cc1 MINOR: mux-quic-be: allow QUIC proto on backend side
Activate QUIC protocol support for MUX-QUIC on the backend side,
additionally to current frontend support. This change is mandatory to be
able to implement QUIC on the backend side.

Without this modification, it is impossible to activate explicitely QUIC
protocol on a server line, hence an error is reported :
  config : proxy 'xxxx' : MUX protocol 'quic' is not usable for server 'yyyy'
2025-06-11 18:37:34 +02:00
Willy Tarreau
a1577a89a0 MINOR: glitches: add global setting "tune.glitches.kill.cpu-usage"
It was mentioned during the development of glitches that it would be
nice to support not killing misbehaving connections below a certain
CPU usage so that poor implementations that routinely misbehave without
impact are not killed. This is now possible by setting a CPU usage
threshold under which we don't kill them via this parameter. It defaults
to zero so that we continue to kill them by default.
2025-05-21 15:47:42 +02:00
Amaury Denoyelle
f286288471 MINOR: quic: refactor handling of streams after MUX release
quic-conn layer has to handle itself STREAM frames after MUX release. If
the stream was already seen, it is probably only a retransmitted frame
which can be safely ignored. For other streams, an active closure may be
needed.

Thus it's necessary that quic-conn layer knows the highest stream ID
already handled by the MUX after its release. Previously, this was done
via <nb_streams> member array in quic-conn structure.

Refactor this by replacing <nb_streams> by two members called
<stream_max_uni>/<stream_max_bidi>. Indeed, it is unnecessary for
quic-conn layer to monitor locally opened uni streams, as the peer
cannot by definition emit a STREAM frame on it. Also, bidirectional
streams are always opened by the remote side.

Previously, <nb_streams> were set by quic-stream layer. Now,
<stream_max_uni>/<stream_max_bidi> members are only set one time, just
prior to QUIC MUX release. This is sufficient as quic-conn do not use
them if the MUX is available.

Note that previously, IDs were used relatively to their type, thus
incremented by 1, after shifting the original value. For simplification,
use the plain stream ID, which is incremented by 4.
2025-05-21 14:26:45 +02:00
Amaury Denoyelle
e399daa67e BUG/MEDIUM: mux-quic: fix BUG_ON() on rxbuf alloc error
RX buffer allocation has been reworked in current dev tree. The
objective is to support multiple buffers per QCS to improve upload
throughput.

RX buffer allocation failure is handled simply : the whole connection is
closed. This is done via qcc_set_error(), with INTERNAL_ERROR as error
code. This function contains a BUG_ON() to ensure it is called only one
time per connection instance.

On RX buffer alloc failure, the aformentioned BUG_ON() crashes due to a
double invokation of qcc_set_error(). First by qcs_get_rxbuf(), and
immediately after it by qcc_recv(), which is the caller of the previous
one. This regression was introduced by the following commit.

  60f64449fbba7bb6e351e8343741bb3c960a2e6d
  MAJOR: mux-quic: support multiple QCS RX buffers

To fix this, simply remove qcc_set_error() invocation in
qcs_get_rxbuf(). On buffer alloc failture, qcc_recv() is responsible to
set the error.

This does not need to be backported.
2025-05-21 11:33:00 +02:00
Amaury Denoyelle
1ccede211c MINOR: mux-quic: account Rx data per stream
Add counters to measure Rx buffers usage per QCS. This reused the newly
defined bdata_ctr type already used for Tx accounting.

Note that for now, <tot> value of bdata_ctr is not used. This is because
it is not easy to account for data accross contiguous buffers.

These values are displayed both on log/traces and "show quic" output.
2025-05-13 15:41:51 +02:00
Amaury Denoyelle
a1dc9070e7 MINOR: quic: account Tx data per stream
Add accounting at qc_stream_desc level to be able to report the number
of allocated Tx buffers and the sum of their data. This represents data
ready for emission or already emitted and waiting on ACK.

To simplify this accounting, a new counter type bdata_ctr is defined in
quic_utils.h. This regroups both buffers and data counter, plus a
maximum on the buffer value.

These values are now displayed on QCS info used both on logline and
traces, and also on "show quic" output.
2025-05-13 15:41:41 +02:00
Amaury Denoyelle
14e4f2b811 BUG/MEDIUM: mux-quic: fix crash on invalid fctl frame dereference
Emission of flow-control frames have been recently modified. Now, each
frame is sent one by one, via a single entry list. If a failure occurs,
emission is interrupted and frame is reinserted into the original
<qcc.lfctl.frms> list.

This code is incorrect as it only checks if qcc_send_frames() returns an
error code to perform the reinsert operation. However, an error here
does not always mean that the frame was not properly emitted by lower
quic-conn layer. As such, an extra test LIST_ISEMPTY() must be performed
prior to reinsert the frame.

This bug would cause a heap overflow. Indeed, the reinsert frame would
be a random value. A crash would occur as soon as it would be
dereferenced via <qcc.lfctl.frms> list.

This was reproduced by issuing a POST with a big file and interrupt it
after just a few seconds. This results in a crash in about a third of
the tests. Here is an example command using ngtcp2 :

 $ ngtcp2-client -q --no-quic-dump --no-http-dump \
   -m POST -d ~/infra/html/1g 127.0.0.1 20443 "http://127.0.0.1:20443/post"

Heap overflow was detected via a BUG_ON() statement from qc_frm_free()
via qcc_release() caller :

  FATAL: bug condition "!((&((*frm)->reflist))->n == (&((*frm)->reflist)))" matched at src/quic_frame.c:1270

This does not need to be backported.
2025-05-09 11:07:11 +02:00
Aurelien DARRAGON
b39825ee45 BUG/MINOR: proxy: only use proxy_inc_fe_cum_sess_ver_ctr() with frontends
proxy_inc_fe_cum_sess_ver_ctr() was implemented in 9969adbc
("MINOR: stats: add by HTTP version cumulated number of sessions and
requests")

As its name suggests, it is meant to be called for frontends, not backends

Also, in 9969adbc, when used under h1_init(), a precaution is taken to
ensure that the function is only called with frontends.

However, this precaution was not applied in h2_init() and qc_init().

Due to this, it remains possible to have proxy_inc_fe_cum_sess_ver_ctr()
being called with a backend proxy as parameter. While it did not cause
known issues so far, it is not expected and could result in bugs in the
future. Better fix this by ensuring the function is only called with
frontends.

It may be backported up to 2.8
2025-05-06 11:01:39 +02:00
Amaury Denoyelle
df50d3e39f MINOR: mux-quic: limit emitted MSD frames count per qcs
The previous commit has implemented a new calcul method for
MAX_STREAM_DATA frame emission. Now, a frame may be emitted as soon as a
buffer was consumed by a QCS instance.

This will probably increase the number of MAX_STREAM_DATA frame
emission. It may even cause a series of frame emitted for the same
stream with increasing values under high load, which is completely
unnecessary.

To improve this, limit the number of MAX_STREAM_DATA frames built to one
per QCS instance. This is implemented by storing a reference to this
frame in QCS structure via a new member <tx.msd_frm>.

Note that to properly reset QCS msd_frm member, emission of flow-control
frames have been changed. Now, each frame is emitted individually. On
one side, it is better as it prevent to emit frames related to different
streams in a single datagram, which is not desirable in case of packet
loss. However, this can also increase sendto() syscall invocation.
2025-04-30 16:08:47 +02:00
Amaury Denoyelle
14a3fb679f MEDIUM: mux-quic: increase flow-control on each bufsize
Recently, QCS Rx allocation buffer method has been improved. It is now
possible to allocate multiple buffers per QCS instances, which was
necessary to improve HTTP/3 POST throughput.

However, a limitation remained related to the emission of
MAX_STREAM_DATA. These frames are only emitted once at least half of the
receive capacity has been consumed by its QCS instance. This may be too
restrictive when a client need to upload a large payload.

Improve this by adjusting MAX_STREAM_DATA allocation. If QCS capacity is
still limited to 1 or 2 buffers max, the old calcul is still used. This
is necessary when user has limited upload throughput via their
configuration. If QCS capacity is more than 2 buffers, a new frame is
emitted if at least a buffer was consumed.

This patch has reduced number of STREAM_DATA_BLOCKED frames received in
POST tests with some specific clients.
2025-04-30 16:08:47 +02:00
Olivier Houchard
15c5846db8 MEDIUM: quic: Make sure we return the tasklet from qcc_io_cb
In qcc_io_cb, return the tasklet to tell the scheduler the tasklet is
still alive, it is not yet needed, but will be soon.
2025-04-25 16:14:26 +02:00
Amaury Denoyelle
6c5030f703 BUG/MINOR: mux-quic: do not decode if conn in error
Add an early return to qcc_decode_qcs() if QCC instance is flagged on
error and connection is scheduled for immediate closure.

The main objective is to ensure to not trigger BUG_ON() from
qcc_set_error() : if a stream decoding has set the connection error, do
not try to process decoding on other streams as they may also encounter
an error. Thus, the connection is closed asap with the first encountered
error case.

This should be backported up to 2.6, after a period of observation.
2025-04-24 14:15:02 +02:00
Amaury Denoyelle
fbedb8746f BUG/MINOR: mux-quic: fix possible infinite loop during decoding
With the support of multiple Rx buffers per QCS instance, stream
decoding in qcc_io_recv() has been reworked for the next haproxy
release. An issue appears in a double while loop : a break statement is
used in the inner loop, which is not sufficient as it should instead
exit from the outer one.

Fix this by replacing break with a goto statement.

No need to backport this.
2025-04-24 14:15:02 +02:00
Ilia Shipitsin
27a6353ceb CLEANUP: assorted typo fixes in the code, commits and doc 2025-04-03 11:37:25 +02:00
Ilia Shipitsin
78b849b839 CLEANUP: assorted typo fixes in the code and comments
code, comments and doc actually.
2025-04-02 11:12:20 +02:00
Amaury Denoyelle
c5f8df8d55 BUG/MINOR: mux-quic: remove extra BUG_ON() in _qcc_send_stream()
The following patch fixed a BUG_ON() which could be triggered if RS/SS
emission was scheduled after stream local closure.
  7ee1279f4b8416435faba5cb93a9be713f52e4df
  BUG/MEDIUM: mux-quic: fix crash on RS/SS emission if already close local

qcc_send_stream() was rewritten as a wrapper around an internal
_qcc_send_stream() used to bypass the faulty BUG_ON(). However, an extra
unnecessary BUG_ON() was added by mistake in _qcc_send_stream().

This should not cause any issue, as the BUG_ON() is only active if <urg>
argument is false, which is not the case for RS/SS emission. However,
this patch is labelled as a bug as this BUG_ON() is unnecessary and may
cause issues in the future.

This should be backported up to 2.8, after the above mentionned patch.
2025-03-20 18:18:52 +01:00
Amaury Denoyelle
7ee1279f4b BUG/MEDIUM: mux-quic: fix crash on RS/SS emission if already close local
A BUG_ON() is present in qcc_send_stream() to ensure that emission is
never performed with a stream already closed locally. However, this
function is also used for RESET_STREAM/STOP_SENDING emission. No
protection exists to ensure that RS/SS is not scheduled after stream
local closure, which would result in this BUG_ON() crash.

This crash can be triggered with the following QUIC client sequence :
1. SS is emitted to open a new stream. QUIC-MUX schedules a RS emission
   by and the stream is locally closed.
2. An invalid HTTP/3 request is sent on the same stream, for example
   with duplicated pseudo-headers. The objective is to ensure
   qcc_abort_stream_read() is called after stream closure, which results
   in the following backtrace.

 0x000055555566a620 in qcc_send_stream (qcs=0x7ffff0061420, urg=1, count=0) at src/mux_quic.c:1633
 1633            BUG_ON(qcs_is_close_local(qcs));
 [ ## gdb ## ] bt
 #0  0x000055555566a620 in qcc_send_stream (qcs=0x7ffff0061420, urg=1, count=0) at src/mux_quic.c:1633
 #1  0x000055555566a921 in qcc_abort_stream_read (qcs=0x7ffff0061420) at src/mux_quic.c:1658
 #2  0x0000555555685426 in h3_rcv_buf (qcs=0x7ffff0061420, b=0x7ffff748d3f0, fin=0) at src/h3.c:1454
 #3  0x0000555555668a67 in qcc_decode_qcs (qcc=0x7ffff0049eb0, qcs=0x7ffff0061420) at src/mux_quic.c:1315
 #4  0x000055555566c76e in qcc_recv (qcc=0x7ffff0049eb0, id=12, len=0, offset=23, fin=0 '\000',
     data=0x7fffe0049c1c "\366\r,\230\205\354\234\301;\2563\335\037k\306\334\037\260", <incomplete sequence \323>) at src/mux_quic.c:1901
 #5  0x0000555555692551 in qc_handle_strm_frm (pkt=0x7fffe00484b0, strm_frm=0x7ffff00539e0, qc=0x7fffe0049220, fin=0 '\000') at src/quic_rx.c:635
 #6  0x0000555555694530 in qc_parse_pkt_frms (qc=0x7fffe0049220, pkt=0x7fffe00484b0, qel=0x7fffe0075fc0) at src/quic_rx.c:980
 #7  0x0000555555696c7a in qc_treat_rx_pkts (qc=0x7fffe0049220) at src/quic_rx.c:1324
 #8  0x00005555556b781b in quic_conn_app_io_cb (t=0x7fffe0037f20, context=0x7fffe0049220, state=49232) at src/quic_conn.c:601
 #9  0x0000555555d53788 in run_tasks_from_lists (budgets=0x7ffff748e2b0) at src/task.c:603
 #10 0x0000555555d541ae in process_runnable_tasks () at src/task.c:886
 #11 0x00005555559c39e9 in run_poll_loop () at src/haproxy.c:2858
 #12 0x00005555559c41ea in run_thread_poll_loop (data=0x55555629fb40 <ha_thread_info+64>) at src/haproxy.c:3075

The proper solution is to not execute this BUG_ON() for RS/SS emission.
Indeed, it is valid and can be useful to emit these frames, even after
stream local closure.

To implement this, qcc_send_stream() has been rewritten as a mere
wrapper function around the new internal _qcc_send_stream(). The latter
is used only by QMUX for STREAM, RS and SS emission. Application layer
continue to use the original function for STREAM emission, with the
BUG_ON() still in place there.

This must be backported up to 2.8.
2025-03-20 17:32:14 +01:00