546 Commits

Author SHA1 Message Date
Amaury Denoyelle
75027692a3 MEDIUM: mux-quic: handle too short data splitted on multiple rxbuf
Previous commit introduces support for multiple Rx buffers per QCS
instance. Contiguous data may be splitted accross multiple buffers
depending on their offset.

A particular issue could arise with this new model. Indeed, app_ops
rcv_buf callback can still deal with a single buffer at a time. This may
cause a deadlock in decoding if app_ops layer cannot proceed due to
partial data, but such data are precisely divided on two buffers. This
can for example intervene during HTTP/3 frame header parsing.

To deal with this, a new function is implemented to force data realign
between two contiguous buffers. This is called only when app_ops rcv_buf
returned 0 but data is available in the next buffer after the current
one. In this case, data are transferred from the next into the current
buffer via qcs_transfer_rx_data(). Decoding is then restarted, which
should ensure that app_ops layer has enough data to advance.

During this operation, special care is ensure to removed both
qc_stream_rxbuf entries, as their offset are adjusted. The next buffer
is only reinserted if there is remaining data in it, else it can be
freed.

This case is not easily reproducible as it depends on the HTTP/3 framing
used by the client. It seems to be easily reproduced though with quiche.
$ quiche-client --http-version HTTP/3 --method POST --body /tmp/100m \
  "https://127.0.0.1:20443/post"
2025-03-07 12:06:27 +01:00
Amaury Denoyelle
60f64449fb MAJOR: mux-quic: support multiple QCS RX buffers
Implement support for multiple Rx buffers per QCS instances. This
requires several changes mostly in qcc_recv() / qcc_decode_qcs() which
deal with STREAM frames reception and decoding. These multiple buffers
can be stored in QCS rx.bufs tree which was introduced in an earlier
patch.

On STREAM frame reception, a buffer is retrieved from QCS bufs tree, or
allocated if necessary, based on the data starting offset. Each buffers
are aligned on bufsize for convenience. This ensures there is no overlap
between two contiguous buffers. Special care is taken when dealing with
a STREAM frame which must be splitted and stored in two contiguous
buffers.

When decoding input data, qcc_decode_qcs() is still invoked with a
single buffer as input. This requires a new while loop to ensure
decoding is performed accross multiple contiguous buffers until all data
are decoded or app stream buffer is full.

Also, after qcs_consume() has been performed, the stream Rx channel is
immediately closed if FIN was already received and QCS now contains only
a single buffer with all remaining data. This is necessary as qcc_recv()
is unable to close the Rx channel if FIN is received for a buffer
different from the current readable offset.

Note that for now stream flow-control value is still too low to fully
utilizing this new infrastructure and improve clients upload throughput.
Indeed, flow-control max-stream-data initial values are set to match
bufsize. This ensures that each QCS will use 1 buffer, or at most 2 if
data are splitted. A future patch will increase this value to unblock
this limitation.
2025-03-07 12:06:26 +01:00
Amaury Denoyelle
7b168e356f MINOR: mux-quic: adapt return value of qcc_decode_qcs()
Change return value of qcc_decode_qcs(). It now directly returns the
value from app_ops rcv_buf callback. Function documentation is updated
to reflect this.

For now, qcc_decode_qcs() return value is ignored by callers, so this
patch should not have any functional change. However, it will become
necessary when implementing multiple Rx buffers per QCS, as a loop will
be implemented to invoke qcc_decode_qcs() on several contiguous buffers.
Decoding must be stopped however as soon as an error is returned by
rcv_buf callback. This is also the case in case of a null value, which
indicates there is not enough data to continue decoding.
2025-03-07 12:06:26 +01:00
Amaury Denoyelle
6b5607d66f MINOR: mux-quic: adjust Rx data consumption API
HTTP/3 data are converted into HTX via qcc_decode_qcs() function. On
completion, these data are removed from QCS Rx buffer via qcs_consume().

This patch adjust qcs_consume() API with several changes. Firstly, the
Rx buffer instance to operate on must now be specified as a new argument
to the function. Secondly, buffer liberation when all data were removed
from qcs_consume() is extracted up to qcc_decode_qcs() caller.

No functional change with this patch. The objective is to have an API
which can be better adapted to multiple Rx buffers per QCS instance.
2025-03-07 12:06:26 +01:00
Amaury Denoyelle
a4f31ffeeb MINOR: mux-quic: store QCS Rx buf in a single-entry tree
Convert QCS rx buffer pointer to a tree container. Additionnaly, offset
field of qc_stream_rxbuf is thus transformed into a node tree.

For now, only a single Rx buffer is stored at most in QCS tree. Multiple
Rx buffers will be implemented in a future patch to improve QUIC clients
upload throughput.
2025-03-07 12:06:26 +01:00
Amaury Denoyelle
cc3c2d1f12 MINOR: mux-quic: define rxbuf wrapper
Define a new type qc_stream_rxbuf. This is used as a wrapper around QCS
Rx buffer with encapsulation of the ncbuf storage. It is allocated via a
new pool. Several functions are adapted to be able to deal with
qc_stream_rxbuf as a wrapper instead of the previous plain ncbuf
instance.

No functional change should happen with this patch. For now, only a
single qc_stream_rxbuf can be instantiated per QCS. However, this new
type will be useful to implement multiple Rx buffer storage in a future
commit.
2025-03-07 12:06:26 +01:00
Amaury Denoyelle
7dd1eec2b1 MINOR: mux-quic: refine reception of standalone STREAM FIN
Reception of standalone STREAM FIN is a corner case, which may be
difficult to handle. In particular, care must be taken to ensure app_ops
rcv_buf() is always called to be notify about FIN, even if Rx buffer is
empty or full demux flag is set. If this is the case, it could prevent
closure of QCS Rx channel.

To ensure this, rcv_buf() was systematically called if FIN was received,
with or without data payload. This could called unnecessary invokation
when FIN is transmitted with data and full demux flag is set, or data
are received out-of-order.

This patches improve qcc_recv() by detecting explicitely a standalone
FIN case. Thus, rcv_buf() is only forcefully called in this case and if
all data were already previously received.
2025-03-07 12:06:26 +01:00
Amaury Denoyelle
20dc8e4ec2 MINOR/OPTIM: mux-quic: do not allocate rxbuf on standalone FIN
STREAM FIN may be received without any payload. However, qcc_recv()
always called qcs_get_ncbuf() indiscriminately, which may allocate a QCS
Rx buffer. This is unneeded as there is no payload to store.

Improve this by skipping qcs_get_ncbuf() invokation when dealing with a
standalone FIN signal. This should prevent superfluous buffer
allocation.
2025-03-07 12:06:26 +01:00
Amaury Denoyelle
a7645d7cd5 MINOR: mux-quic/h3: support temporary blocking on control stream sending
When HTTP/3 layer is initialized via QUIC MUX, it first emits a SETTINGS
frame on an unidirectional control stream. However, this could be
prevented if client did not provide initial flow control.

Previously, QUIC MUX was unable to deal with such situation. Thus, the
connection was closed immediately and no transfer could occur. Improve
this by extending QUIC MUX application layer API : initialization may
now return a transient error. This allows MUX to continue to use the
connection normally. Initialization will be retried periodically alter
until it can succeed.

This new API allows to deal with the flow control issue described above.
Note that this patch is not considered as a bug fix. Indeed, clients are
strongly advised to provide enough flow control for a SETTINGS frame
exchange.
2025-02-19 11:08:02 +01:00
Amaury Denoyelle
06e7674399 MINOR: mux-quic/h3: emit SETTINGS via MUX tasklet handler
Previously, QUIC MUX application layer was installed and initialized via
MUX init. However, the latter stage involve I/O operations, for example
when using HTTP/3 with the emission of a SETTINGS frame.

Change this to prevent any I/O operations during MUX init. As such,
finalize app_ops callback is now called during the first invokation of
qcc_io_send(), in the context of MUX tasklet. To implement this, a new
application state value is added, to detect the transition from NULL to
INIT stage.
2025-02-19 11:03:40 +01:00
Amaury Denoyelle
188fc45b95 MINOR: mux-quic: define a QCC application state member
Introduce a new QCC field to track the current application layer state.
For the moment, only INIT and SHUT state are defined. This allows to
replace the older flag QC_CF_APP_SHUT.

This commit does not bring major changes. It is only necessary to permit
future evolutions on QUIC MUX. The only noticeable change is that QMUX
traces can now display this new field.
2025-02-19 10:59:53 +01:00
Amaury Denoyelle
2715dbe9d0 BUG/MINOR: mux-quic: prevent crash after MUX init failure
qmux_init() may fail for several reasons. In this case, connection
resources are freed and underlying and a CONNECTION_CLOSE will be
emitted via its quic_conn instance.

In case of qmux_init() failure, qcc_release() is used to clean up
resources, but QCC <conn> member is first resetted to NULL, as
connection released must be delayed. Some cleanup operations are thus
skipped, one of them is the resetting of <ctx> connection member to
NULL. This may cause a crash as <ctx> is a dangling pointer after QCC
release. One of the possible reproducer is to activate QMUX traces,
which will cause a segfault on the qmux_init() error leave trace.

To fix this, simply reset <ctx> to NULL manually on qmux_init() failure.

This must be backported up to 3.0.
2025-02-18 11:02:46 +01:00
Amaury Denoyelle
2cdc4695cb BUG/MINOR: quic: prevent crash on conn access after MUX init failure
Initially, QUIC-MUX was responsible to reset quic_conn <conn> member to
NULL when MUX was released. This was performed via qcc_release().

However, qcc_release() is also used on qmux_init() failure. In this
case, connection must be freed via its session, so QCC <conn> member is
resetted to NULL prior to qcc_release(), which prevents quic_conn <conn>
member to also be resetted. As the connection is freed soon after,
quic_conn <conn> is a dangling pointer, which may cause crashes.

This bug should be very rare as first it implies that QUIC-MUX
initialization has failed (for example due to a memory alloc error).
Also, <conn> member is rarely used by quic_conn instance. In fact, the
only reproducible crash was done with QUIC traces activated, as in this
case connection is accessed via quic_conn under __trace_enabled()
function.

To fix this, detach connection from quic_conn via the XPRT layer instead
of the MUX. More precisely, this is performed via quic_close(). This
should ensure that it will always be conducted, either on normal
connection closure, but also after special conditions such as MUX init
failure.

This should be backported up to 2.6.
2025-02-18 10:43:56 +01:00
Amaury Denoyelle
b849ee5fa3 BUILD: quic: fix overflow in global tune
A new global option was recently introduced to disable pacing. However,
the value used (1<<31) caused issue with some compiler as options field
used for storage is declared as int. Move pacing deactivation flag
outside into the newly defined quic_tune to fix this.

This should be backported up to 3.1 after a period of observation. Note
that it relied on the previous patch which defined new quic_tune type.
2025-01-30 18:12:53 +01:00
Amaury Denoyelle
0c8b54b2d1 MINOR: quic: transform pacing settings into a global option
Pacing support was previously activated on each bind line individually,
via an optional argument of quic-cc-algo keyword. Remove this optional
argument and introduce a global setting to enable/disable pacing. Pacing
activation is still flagged as experimental.

One important change is that previously BBR usage automatically
activated pacing support. This is not the case anymore, so users should
now always explicitely activate pacing if BBR is selected. A new warning
message will be displayed if this is not the case.

Another consequence of this change is that now pacing_inter callback is
always defined for every quic_cc_algo types. As such, QUIC MUX uses
global.tune.options to determine if pacing is required.

This should be backported up to 3.1, after a period of observation.
2025-01-30 17:19:38 +01:00
Amaury Denoyelle
8098be1fdc MEDIUM: mux-quic: reduce pacing CPU usage with passive wait
Pacing algorithm has been revamped in the previous commit to implement a
credit based solution. This is a far more adaptative solution, in
particular which allow to catch up in case pause between pacing emission
was longer than expected.

This allows QMUX to remove the active loop based on tasklet wake-up.
Instead, a new task is used when emission should be paced. The main
advantage is that CPU usage is drastically reduced.

New pacing task timer is reset each time qcc_io_send() is invoked. Timer
will be set only if pacing engine reports that emission must be
interrupted. In this case timer is set via qcc_wakeup_pacing() to the
delay reported by congestion algorithm, or 1ms if delay is too short. At
the end of qcc_io_cb(), pacing task is queued if timer has been set.

Pacing task execution is simple enough : it immediately wakes up QCC I/O
handler.

Note that to have decent performance, it requires to have a large enough
burst defined in configuration of quic-cc-algo. However, this value is
common to every listener clients, which may cause too much loss under
network conditions. This will be address in a future patch.

This should be backported up to 3.1.
2025-01-23 17:40:22 +01:00
Amaury Denoyelle
4489a61585 MEDIUM: quic: implement credit based pacing
Implement a new method for QUIC pacing emission based on credit. This
represents the number of packets which can be emitted in a single burst.
After emission, decrement from the credit the number of emitted packets.
Several emission can be conducted in the same sequence until the credit
is completely decremented.

When a new emission sequence is initiated (i.e. under a new QMUX tasklet
invokation), credit is refilled according to the delay which occured
between the last and current emission context.

This new mechanism main advantage is that it allows to conduct several
emission in the same task context without having to wait between each
invokation. Wait is only forced if pacing is expired, which is now
equivalent to having a null credit.

Furthermore, if delay between two emissions sequence would have been
smaller than expected, credit is only partially refilled. This allows to
restart emission without having to wait for the whole credit to be
available.

On the implementation side, a new field <credit> is avaiable in
quic_pacer structure. It is automatically decremented on
quic_pacing_sent_done() invokation. Also, a new function
quic_pacing_reload() must be used by QUIC MUX when a new emission
sequence is initiated to refill credit. <next> field from quic_pacer has
been removed.

For the moment, credit is based on the burst configured via quic-cc-algo
keyword, or directly reported by BBR.

This should be backported up to 3.1.
2025-01-23 17:40:20 +01:00
Amaury Denoyelle
9d8589f0de MINOR: mux-quic: increment pacing retry counter on expired
A field <paced_sent_ctr> from quic_pacer structure is used to report the
number of occurences where emission has been interrupted due to pacing.
However, it was not incremented when QUIC MUX had to pause immediately
emission as pacing was still not yet expired.

Fix this by incrementing <paced_sent_ctr> in qcc_io_send() prior to
emission if pacing is expired. Note that incrementation is only done
once if the tasklet is then repeatdely woken up until the timer is
expired.

This should be backported up to 3.1.
2025-01-23 17:29:14 +01:00
Amaury Denoyelle
7c0820892f MINOR: quic: rename pacing_rate cb to pacing_inter
Rename one of the congestion algorithms pacing callback from pacing_rate
to pacing_inter. This better reflects that this function returns a delay
(in nanoseconds) which should be applied between each packet emission to
fill the congestion window with a perfectly smoothed emission.

This should be backported up to 3.1.
2025-01-23 14:49:35 +01:00
Amaury Denoyelle
801e39e1cc BUG/MINOR: mux-quic: handle closure of uni-stream
This commit is a direct follow-up to the previous one. As already
described, a previous fix was merged to prevent streamdesc attach
operation on already completed QCS instances scheduled for purging. This
was implemented by skipping app proto decoding.

However, this has a bad side-effect for remote uni-directional stream.
If receiving a FIN stream frame on such a stream, it will considered as
complete because streamdesc are never attached to a uni stream. Due to
the mentionned new fix, this prevent analysis of this last frame for
every uni stream.

To fix this, do not skip anymore app proto decoding for completed QCS.
Update instead qcs_attach_sc() to transform it as a noop function if QCS
is already fully closed before streamdesc instantiation. However,
success return value is still used to prevent an invalid decoding error
report.

The impact of this bug should be minor. Indeed, HTTP3 and QPACK uni
streams are never closed by the client as this is invalid due to the
spec. The only issue was that this prevented QUIC MUX to close the
connection with error H3_ERR_CLOSED_CRITICAL_STREAM.

This must be backported along the previous patch, at least to 3.1, and
eventually to 2.8 if mentionned patches are merged there.
2025-01-03 17:21:19 +01:00
Amaury Denoyelle
af00be8e0f MINOR: mux-quic: change return value of qcs_attach_sc()
A recent fix was introduced to ensure that a streamdesc instance won't
be attached to an already completed QCS which is eligible to purging.
This was performed by skipping application protocol decoding if a QCS is
in such a state. Here is the patch responsible for this change.
  caf60ac696a29799631a76beb16d0072f65eef12
  BUG/MEDIUM: mux-quic: do not attach on already closed stream

However, this is too restrictive, in particular for unidirection stream
where no streamdesc is never attached. To fix this behavior, first
qcs_attach_sc() API has been modified. Instead of returning a streamdesc
instance, it returns either 0 on success or a negative error code.

There should be no functional changes with this patch. It is only to be
able to extend qcs_attach_sc() with the possibility of skipping
streamdesc instantiation while still keeping a success return value.

This should be backported wherever the above patch has been merged. For
the record, it was scheduled for immediate backport on 3.1, plus merging
on older releases up to 2.8 after a period of observation.
2025-01-03 17:19:21 +01:00
Amaury Denoyelle
4f2554903b BUG/MINOR: mux-quic: fix wakeup on qcc_set_error()
The following patch was a major refactoring of QUIC MUX. It removes
pacing specific code path. In particular, qcc_wakeup() utility function
was removed and replaced by its tasklet_wakup() usage.
  41f0472d967b2deb095d5adc8a167da973fbee3d
  MEDIUM: mux-quic: remove pacing specific code on qcc_io_cb

However, an incorrect substitution was performed in qcc_set_error(). As
such, there was no explicit wakeup in case an error is detected by QUIC
MUX or the app protocol layer. This may lead to missing error reporting
to clients.

Fix this by re-add tasklet_wakup() usage into qcc_set_error().

This must be backported up to 3.1 where above patch is scheduled.
2025-01-03 10:39:49 +01:00
Amaury Denoyelle
caf60ac696 BUG/MEDIUM: mux-quic: do not attach on already closed stream
Due to QUIC packet reordering, a stream may be opened via a new
RESET_STREAM or STOP_SENDING frame. This would cause either Tx or Rx
channel to be immediately closed.

This can cause an issue with current QUIC MUX implementation with QCS
purging. QCS are inserted into QCC purge list when transfer could be
considered as completed. In most cases, this happens after full
request/response exchange. However, it can also happens after request
reception if RESET_STREAM/STOP_SENDING are received first.

A BUG_ON() crash will occur if a STREAM frame is received after. In this
case, streamdesc instance will be attached via qcs_attach_sc() to handle
the new request. However, QCS is already considered eligible to purging.
It could cause it to be released while its streamdesc instance remains.
A BUG_ON() crash detects this problem in qcc_purge_streams().

To fix this, extend qcc_decode_qcs() to skip app proto rcv_buf
invokation if QCS is considered completed. A similar condition was
already implemented when read was previously aborted after a
STOP_SENDING emission by QUIC MUX.

This crash was reproduced on haproxy.org. Here is the output of the
backtrace :
Core was generated by `./haproxy-dev -db -f /etc/haproxy/haproxy-current.cfg -sf 16495'.
Program terminated with signal SIGILL, Illegal instruction.
 #0  0x00000000004e442b in qcc_purge_streams (qcc=0x774cca0) at src/mux_quic.c:2661
2661                    BUG_ON_HOT(!qcs_is_completed(qcs));
[Current thread is 1 (LWP 1457)]
[ ## gdb ## ] bt
 #0  0x00000000004e442b in qcc_purge_streams (qcc=0x774cca0) at src/mux_quic.c:2661
 #1  0x00000000004e4db7 in qcc_io_process (qcc=0x774cca0) at src/mux_quic.c:2744
 #2  0x00000000004e5a54 in qcc_io_cb (t=0x7f71193940c0, ctx=0x774cca0, status=573504) at src/mux_quic.c:2886
 #3  0x0000000000b4f792 in run_tasks_from_lists (budgets=0x7ffdcea1e670) at src/task.c:603
 #4  0x0000000000b5012f in process_runnable_tasks () at src/task.c:883
 #5  0x00000000007de4a3 in run_poll_loop () at src/haproxy.c:2771
 #6  0x00000000007deb9f in run_thread_poll_loop (data=0x1335a00 <ha_thread_info>) at src/haproxy.c:2985
 #7  0x00000000007dfd8d in main (argc=6, argv=0x7ffdcea1e958) at src/haproxy.c:3570

This BUG_ON() crash can only happen since 3.1 refactoring. Indeed, purge
list was only implemented on this version. As such, please backport it
on 3.1 immediately. However, a logic issue remains for older version as
a stream could be attached on a fully closed QCS. Thus, it should be
backported up to 2.8, this time after a period of observation.
2025-01-02 11:25:40 +01:00
Amaury Denoyelle
4a997e5a93 MINOR: mux-quic: add traces on sd attach
Add traces into qcs_attach_sc(). This function is called when a request
is received on a QCS stream and a streamdesc instance is attached. This
will be useful to facilitate debugging.
2025-01-02 11:25:40 +01:00
Amaury Denoyelle
ddfd8031f8 BUG/MAJOR: mux-quic: properly fix BUG_ON on empty STREAM emission
Properly fix BUG_ON() occurence when QUIC MUX emits only empty STREAM
frames. This was addressed by a previous patch but it causes another
regression so a revert was needed.

BUG_ON() on qcc_build_frms() return value is invalid. Indeed,
qcc_build_frms() may return 0, but this does not imply that frame list
is empty, as encoded frames can have a zero length payload. As such,
simply remove this invalid BUG_ON().

This must be backported up to 3.1.
2025-01-02 11:25:40 +01:00
Amaury Denoyelle
85e27f1e92 Revert "BUG/MAJOR: mux-quic: fix BUG_ON on empty STREAM emission"
This reverts commit 98064537423fafe05b9ddd97e81cedec8b6b278d.

Above patch tried to fix a BUG_ON() occurence when MUX only emitted
empty STREAM frames via qcc_build_frms(). Return value of qcs_send() was
changed from the payload STREAM frame to the whole frame length.
However, this is invalid as this return value is used to ensure
connection flow-control is not exceeded on sending retry. This causes
occurence of BUG_ON() crash in qcc_io_send() as send-list is not
properly purged after QCS emission.

Reverts this incorrect fix. The original issue will be properly dealt in
the next commit.

This commit must be backported to 3.1 if reverted commit was already
applied on it.
2025-01-02 11:00:25 +01:00
Amaury Denoyelle
9806453742 BUG/MAJOR: mux-quic: fix BUG_ON on empty STREAM emission
A BUG_ON() is present in qcc_io_send() to ensure that encoded frame list
is empty if qcc_build_frms() previously returned 0.

This BUG_ON() may be triggered if empty STREAM frame is encoded for
standalone FIN. Indeed, qcc_build_frms() returns the sum of all STREAM
payload length. In case only empty STREAM frames are generated, return
value will be 0, despite new frames encoded and inserted into frame
list.

To fix this, change return value of qcs_send(). This now returns the
whole STREAM frame length, both header and payload included. This
ensures that qcc_build_frms() won't return a nul value if new frames are
encoded, even empty ones.

This must be backported up to 3.1.
2024-12-31 16:39:53 +01:00
Amaury Denoyelle
4490df57a6 CLEANUP: mux-quic: remove dead err label in qcc_build_frms()
STREAM frames emission in qcc_build_frms() has been splitted from
RESET_STREAM/STOP_SENDING into qcc_emit_rs_ss(). Now, the former cannot
fail, as such err label can be removed as it is unreachable.

This should be backported up to 3.1.

This should fix github issue #2824.
2024-12-19 16:36:33 +01:00
Amaury Denoyelle
7edb2ffae7 BUG/MEDIUM: mux-quic: prevent BUG_ON() by refreshing frms on MAX_DATA
QUIC MUX emission has been optimized recently by recycling STREAM frames
list between emission cycles. This is done via qcc frms list member. If
new data is available, frames list must be cleared before the next
emission to force the encoding of new STREAM frames.

If a refresh frames list is missed, it would lead to incomplete data
emission on the next transfer. In most cases, this is detected via a
BUG_ON() inside qcc_io_send(), as qcs instances remains in send_list
after a qcc_send_frames() full emission.

A bug was recently found which causes this BUG_ON() crash. This is
directly related to flow control. Indeed, when sending credit is
increased on the connection or a stream, frames list should be cleared
as new larger STREAM frames could be encoded. This was already performed
on MAX_DATA/MAX_STREAM_DATA reception but only if flow-control limit was
unblocked. However this is not the proper condition and it may lead to
insufficient frames refresh and thus this BUG_ON() crash.

Fix this by adjusting the condition for frames refresh on flow control
credit increase. Now, frames list is cleared if real offset is not
blocked and soft offset was equal or greater to the previous limit.
Indeed, this is the only case in which frames refreshing is necessary as
it would result in bigger encoded STREAM frames.

This bug was detected on QUIC interop with go-x-net client. It can also
be reproduced, albeit not systematically, using the following command :
  $ ngtcp2-client -q --no-quic-dump --no-http-dump \
    --exit-on-all-streams-close --max-data 10 \
    127.0.0.1 20443 -n10 "http://127.0.0.1:20443/?s=10k"

This bug appeared with the following patch. As it is scheduled for 3.1
backporting, the current fix should be backported with it.
  14710b5e6bf76834343d58db22e00b72590b16fe
  MEDIUM/OPTIM: mux-quic: do not rebuild frms list on every send
2024-12-19 16:36:28 +01:00
Amaury Denoyelle
53db43aff2 MINOR: mux-quic: hide traces when woken up on pacing only
Previous commit aligned default and pacing emission. This is a cleaner
and more robust code. However, it may disrupt traces analysis when
pacing is rescheduled until timer expiration.

Hide traces when qcc_io_cb() is woken up only due to pacing and timer is
not yet expired. This is implemented by using special TASK_WOKEN_IO for
pacing.

This should be backported up to 3.1.
2024-12-18 09:52:16 +01:00
Amaury Denoyelle
41f0472d96 MEDIUM: mux-quic: remove pacing specific code on qcc_io_cb
Pacing was recently implemented by QUIC MUX. Its tasklet is rescheduled
until next emission timer is reached. To improve performance, an
alternate execution of qcc_io_cb was performed when rescheduled due to
pacing. This was implemented using TASK_F_USR1 flag.

However, this model is fragile, in particular when several events
happened alongside pacing scheduling. This has caused some issue
recently, most notably when MUX is subscribed on transport layer on
receive for handshake completion while pacing emission is performed in
parallel. MUX qcc_io_cb() would not execute the default code path, which
means the reception event is silently ignored.

Recent patches have reworked several parts of qcc_io_cb. The objective
was to improve performance with better algorithm on send and receive
part. Most notable, qcc frames list is only cleared when new data is
available for emission. With this, pacing alternative code is now mostly
unneeded. As such, this patch removes it. The following changes are
performed :

* TASK_F_USR1 is now not used by QUIC MUX. As such, tasklet_wakeup()
  default invokation can now replace obsolete wrappers
  qcc_wakeup/qcc_wakeup_pacing

* qcc_purge_sending is removed. On pacing rescheduling, all qcc_io_cb()
  is executed. This is less error-prone, in particular when pacing is
  mixed with other events like receive handling. This renders the code
  less fragile, as it completely solves the described issue above.

This should be backported up to 3.1.
2024-12-18 09:49:20 +01:00
Amaury Denoyelle
14710b5e6b MEDIUM/OPTIM: mux-quic: do not rebuild frms list on every send
A newly introduced frames list member has been defined into QCC instance
with pacing implementation. This allowed to preserve STREAM frames built
between different emission scheduled by pacing, without having to
regenerate it if no new QCS data is available.

Generalize this principle outside of pacing scheduling. Now, the frames
list will be reused accross several qcc_io_send() usage. Frames list is
only cleared when necessary. This will force its refreshing in the next
qcc_io_send() via qcc_build_frms_list().

Frames list refreshing is performed in the following cases :
* on successful transfer from stream snd_buf / done_ff / shut
* on stream reset or read abort
* on max_data/max_stream_data reception with window increase

Note that the two first cases are in fact covered directly due to
qcc_send_stream() usage when QCS is (re)inserted into the send_list.

The main objective of this patch will be to remove QUIC MUX pacing
specific code path. It could also provide better performance as emission
of large frames may often be rescheduled due to transport layer, either
on congestion or full socket buffer. When QUIC MUX is rescheduled, no
new data is available and frames list can be reuse as-is, avoiding an
unecessary loop over send_list.

This should be backported up to 3.1.
2024-12-18 09:49:02 +01:00
Amaury Denoyelle
9ecc1a8e57 MINOR: mux-quic: split STREAM and RS/SS emission
This commit is a follow-up of the previous one which defines function
qcc_build_frms(). This function implements looping over qcc send_list,
to both encode and send individually any STOP_SENDING and RESET_STREAM,
but also encode STREAM frames as a preparator step. STREAM frames were
then sent as a list outside of qcc_build_frms() via qcc_send_frames().

Extract STOP_SENDING/RESET_STREAM encoding and emission step into a new
function qcc_emit_rs_ss(). The code is thus cleaner. In particular it
highlights that an error during STOP_SENDING/RESET_STREAM emission stage
is fatal and prevent any STREAM frames processing.

This should be backported up to 3.1.
2024-12-18 09:40:21 +01:00
Amaury Denoyelle
244dc00b09 MINOR: mux-quic: extract code to build STREAM frames list
Extracts code responsible to generate STREAM, RESET_STREAM and
STOP_SENDING frames for each qcs instances registered in qcc send_list.
It is moved from qcc_io_send() to its owned new function
qcc_build_frms().

This commit does not bring functional change. It is a preparatory step
to adapt QUIC MUX send mechanism to allow reusing of qcc frms list
accross qcc_io_send() invokation.

As a side change, qcc_tx_frms_free() is renamed to qcc_clear_frms().
This better highlights its relationship with qcc_build_frms().

This should be bkacported up to 3.1.
2024-12-18 09:38:19 +01:00
Amaury Denoyelle
e296585ae9 MEDIUM/OPTIM: mux-quic: implement purg_list
This commit is part of the current serie which aims to refactor and
improve overall performance of QUIC MUX I/O handler.

qcc_io_process() is responsible to perform some internal operations on
QUIC MUX after I/O completion. It is notably called on every qcc_io_cb()
tasklet handler.

The most intensive work on it is the purging of QCS instances after
transfer completion. This was implemented by looping on QCC streams tree
and inspecting the state of every QCS. The purpose of this commit is to
optimize this processing.

A new purg_list QCC member is defined. It is responsible to list every
QCS instances whose transfer has been completed. It is thus safe to
reuse <el_send> QCS list attach point. Stream purging will thus only
loop on purg_list instead of every known QCS.

This should be backported up to 3.1.
2024-12-18 09:33:52 +01:00
Amaury Denoyelle
4b42dd4ae0 MEDIUM/OPTIM: mux-quic: define a recv_list for demux resumption
This commit is part of the current serie which aims to refactor and
improve overall performance of QUIC MUX I/O handler.

Define a recv_list element into qcc structure. This is used to
registered every instance of qcs which are currently blocked on
demuxing, which happen on no more space in <rx.appbuf>.

The purpose of this patch is to reduce qcc_io_recv() CPU usage. Now,
only recv_list iteration is performed, instead of the previous looping
over every qcs instances. This is useful as qcc_io_recv() is called each
time qcc_io_cb() is scheduled, even if only sending condition was the
wakeup origin.

A qcs is not inserted into recv_list immediately after blocking on demux
full buffer. Instead, this is only done after unblocking via stream
rcv_buf callback, which ensure that new buffer space is available.

This should be backported up to 3.1.
2024-12-18 09:23:41 +01:00
Amaury Denoyelle
0a53a008d0 MINOR: mux-quic: refactor wait-for-handshake support
This commit refactors wait-for-handshake support from QUIC MUX. The flag
logic QC_CF_WAIT_HS is inverted : it is now positionned only if MUX is
instantiated before handshake completion. When the handshake is
completed, the flag is removed.

The flag is now set directly on initialization via qmux_init(). Removal
via qcc_wait_for_hs() is moved from qcc_io_process() to qcc_io_recv().
This is deemed more logical as QUIC MUX is scheduled on RECV to be
notify by the transport layer about handshake termination. Moreover,
qcc_wait_for_hs() is now called if recv subscription is still active.

This commit is the first of a serie which aims to refactor QUIC MUX I/O
handler and improves its overall performance. The ultimate objective is
to be able to stream qcc_io_cb() by removing pacing specific code path
via qcc_purge_sending().

This should be backported up to 3.1.
2024-12-18 09:23:41 +01:00
Amaury Denoyelle
9dcd2369e2 MINOR: quic: add traces
Add some traces to better follow QUIC MUX scheduling, in particular with
pacing interaction.

This should be backported up to 3.1.
2024-12-18 09:20:20 +01:00
Amaury Denoyelle
2e3542bec6 BUG/MEDIUM: mux-quic: do not mix qcc_io_send() return codes with pacing
With pacing implementation, qcc_send_frames() return code has been
extended to report emission interruption due to pacing limitation. This
is used only in qcc_io_send().

However, its invokation may be skipped using 'sent_done' label. This
happens on emission failure of a STOP_SENDING or RESET_STREAM (either
memory allocation failure, or transport layer rejection). In this case,
return values are mixed as qcs_send() is wrongly compared against pacing
interruption condition. This value corresponds to the length of the last
built STREAM frames.

If by mischance the last frame was 1 byte long, qcs_send() return value
is equal to pacing interruption condition. This has several effects. If
pacing is activated, it may lead to unneeded wakeup on QUIC MUX. Worst,
if pacing is not used, a BUG_ON() crash will be triggered.

Fix this by using a different variable dedicated to qcc_send_frames()
return value. By default it is initialized to 0. This ensures that
pacing code won't be activated in case qcc_send_frames() is not used.

This must be backported up to 3.1.
2024-12-18 09:18:48 +01:00
Amaury Denoyelle
7885a3b3e1 MINOR: mux-quic: clean up zero-copy done_ff callback
Recently, an issue was found with QUIC zero-copy forwarding on 3.0
version. A desynchronization could occur internally in QCS Tx bytes
counters which would cause a BUG_ON() crash on qcs_destroy() when the
stream is detached.

It was silently fixed in version 3.1 by the following patch. As it was
considered as an optimization, it was not scheduled yet for backport.

  6697e87ae5e1f569dc87cf690b5ecfc049c4aab0
  MINOR: mux-quic: Don't send an emtpy H3 DATA frame during zero-copy forwarding

This mistake has been caused due to some counter-intuitive manipulation
in QUIC zero-copy implementation. Try to streamline this in QUIC MUX
done_ff callback and its application protocol counterpart. Especially
for values exchanged between MUX and application on one side, and MUX
and stconn layer as done_fastfwd return value.

First, application done_ff callback now returns the length of the wholly
encoded frame. For HTTP/3, it means header length + payload length h3
frame. This value can then be reused as qcc_send_stream() argument to
increase QCS Tx soft offset.

As previously, special care has been taken to ensure that QUIC MUX
done_ff only return the transferred data bytes. Thus, any extra offset
for HTTP/3 header is properly excluded. This is mandatory for stconn
layer to consider the transfer has completed.

Secondly, remove duplicated code in application done_ff to reset iobuf
info. This is now factorize in QUIC MUX done_ff itself.

This patch is related to github issue #2678.
2024-12-05 16:57:31 +01:00
Amaury Denoyelle
08f557f0c4 BUG/MEDIUM: mux-quic: remove pacing status when everything is sent
TASK_F_USR1 is used by MUX tasklet when emission has been interrupted
due to pacing. When the tasklet runs again, only qcc_purge_sending()
will be called as an optimization.

Pacing status is only removed via qcc_wakeup(). Until then, TASK_F_USR1
is not cleared. This causes an issue after emission with pacing
completion if the MUX tasklet is woken up for a recv subscribe, as
qcc_wakeup() is not used by quic-conn layer. The tasklet will
incorrectly run only for pacing emission, without handling reception
process. Worst, a crash will occur if QCC tx frames list is empty, due
to a BUG_ON() in qcc_purge_sending().

Recv subscribe is only used for 0-RTT, when QUIC MUX is instantiated
before quic-conn handshake completion. Thus, this bug can only be
reproduced with 0-rtt. Furthermore, MUX must already have emitted at
least a few response bytes with pacing, before QUIC handshake
completion. It cannot easily be reproduced, at least with CLI clients
where the handshake is always already completed before MUX exchanges.

To fix this, remove TASK_F_USR1 when pacing emission has been completed.
At least, this prevents BUG_ON() on qcc_purge_sending() as it won't be
called with an empty QCC Tx frame list anymore. However, this bug has
revealed that MUX tasklet architecture is not suitable when both
handling reception and emission part. This will be improved in a future
serie of patches.

This should fix github issue #2796.

This must be backported up to 3.1.
2024-12-05 11:04:06 +01:00
Amaury Denoyelle
9d4c26ebaa BUG/MEDIUM: quic: prevent stream freeze on pacing
On snd_buf completion, QUIC MUX tasklet is scheduled if newly data has
been transferred from the stream layer. Thanks to qcc_wakeup(), pacing
status is removed from tasklet, which ensure next emission will reset Tx
frames and use the new data.

Tasklet is not scheduled if MUX is already subscribed on send due to a
previous blocking condition. This is an optimization to prevent an
unneeded IO handler execution. However, this causes a bug if an emission
is currently delayed due to pacing. As pacing status is not removed on
snd_buf, next emission process will continue emission with older data
without refreshing the newly transferred one.

This causes a transfer freeze. Unless there is some activity on the
connection, the transfer will be eventually aborted due to idle timeout.

To fix this, remove TASK_F_USR1 if tasklet wakeup is not called due to
send subscription. Note that this code is also duplicated in done_ff for
zero-copy transfer.

This must be backported up to 3.1.
2024-11-29 14:35:10 +01:00
Amaury Denoyelle
22bd92a87f MINOR: mux-quic: use sched call time for pacing
QUIC pacing was recently implemented to limit burst and improve overall
bandwidth. This is used only for MUX STREAM emission. Pacing requires
nanosecond resolution. As such, it used now_cpu_time() which relies on
clock_gettime() syscall.

The usage of clock_gettime() has several drawbacks :
* it is a syscall and thus requires a context-switch which may hurt
  performance
* it is not be available on all systems
* timestamp is retrieved multiple times during a single task execution,
  thus yielding different values which may tamper pacing calculation

Improve this by using task_mono_time() instead. This returns task call
time from the scheduler thread context. It requires the flag
TASK_F_WANTS_TIME on QUIC MUX tasklet to force the scheduler to update
call time with now_mono_time(). This solves every limitations listed
above :
* syscall invokation is only performed once before tasklet execution,
  thus reducing context-switch impact
* on non compatible system, a millisecond timer is used as a fallback
  which should ensure that pacing works decently for them
* timer value is now guaranteed to be fixed duing task execution
2024-11-25 11:21:45 +01:00
Amaury Denoyelle
3704e0e174 BUG/MINOR: mux-quic: fix show quic report of QCS prepared bytes
On show quic, each MUX streams are listed with their various indicator
for buffering on Rx and Tx. In particular, txoff displays in parenthesis
the current level of data prepared by the upper stream instance not yet
emitted by QUIC transport layer.

This value is only accessible after a substract operation. However,
there was a typo which caused the result to be always 0. Fix this by
reusing the correct offsets in the calculation.

This should be backported up to 3.0.
2024-11-25 11:21:28 +01:00
Amaury Denoyelle
5a29fd6c61 MINOR: mux_quic/pacing: display pacing info on show quic
To improve debugging, extend "show quic" output to report if pacing is
activated on a connection. Two values will be displayed for pacing :

* a new counter paced_sent_ctr is defined in QCC structure. It will be
  incremented each time an emission is interrupted due to pacing.

* pacing engine now saves the number of datagrams sent in the last paced
  emission. This will be helpful to ensure burst parameter is valid.
2024-11-19 16:21:05 +01:00
Amaury Denoyelle
796446a15e MAJOR: mux-quic: support pacing emission
Support pacing emission for STREAM frames at the QUIC MUX layer. This is
implemented by adding a quic_pacer engine into QCC structure.

The main changes have been written into qcc_io_send(). It now
differentiates cases when some frames have been rejected by transport
layer. This can occur as previously due to congestion or FD buffer full,
which requires subscribing on transport layer. The new case is when
emission has been interrupted due to pacing timing. In this case, QUIC
MUX I/O tasklet is rescheduled to run with the flag TASK_F_USR1.

On tasklet execution, if TASK_F_USR1 is set, all standard processing for
emission and reception is skipped. Instead, a new function
qcc_purge_sending() is called. Its purpose is to retry emission with the
saved STREAM frames list. Either all remaining frames can now be send,
subscribe is done on transport error or tasklet must be rescheduled for
pacing purging.

In the meantime, if tasklet is rescheduled due to other conditions,
TASK_F_USR1 is reset. This will trigger a full regeneration of STREAM
frames. In this case, pacing expiration must be check before calling
qcc_send_frames() to ensure emission is now allowed.
2024-11-19 16:16:48 +01:00
Amaury Denoyelle
ede4cd4c2e MINOR: mux-quic: encapsulate QCC tasklet wakeup
QUIC MUX will be responsible to drive emission with pacing. This will be
implemented via setting TASK_F_USR1 before I/O tasklet wakeup. To
prepare this, encapsulate each I/O tasklet wakeup into a new function
qcc_wakeup().

This commit is purely refactoring prior to pacing implementation into
QUIC MUX.
2024-11-19 16:16:48 +01:00
Amaury Denoyelle
4a94a018f0 MINOR: mux-quic: define a tx STREAM frame list member
For STREAM emission, MUX QUIC previously used a local list defined under
qcc_io_send(). This was suitable as either all frames were sent, or
emission must be interrupted due to transport congestion or fatal error.
In the latter case, the list was emptied anyway and a new frame list was
built on future qcc_io_send() invokation.

For pacing, MUX QUIC may have to save the frame list if pacing should be
applied across emission. This is necessary to avoid to unnecessarily
rebuilt stream frame list between each paced emission. To support this,
STREAM list is now stored as a member of QCC structure.

Ensure frame list is always deleted, even on QCC release, using newly
defined utility function qcc_tx_frms_free().
2024-11-19 16:16:48 +01:00
Amaury Denoyelle
8039fe43e6 MINOR: quic/pacing: support pacing emission on quic_conn layer
Pacing will be implemented for STREAM frames emission. As such,
qc_send_mux() API has been extended to add an argument to a quic_pacer
engine.

If non NULL, engine will be used to pace emission. In short, no more
than one datagram will be emitted for each qc_send_mux() invokation.
Pacer is then notified about the emission and a timer for a future
emission is calculated. qc_send_mux() will return PACING error value, to
inform QUIC MUX layer that it will be responsible to retry emission
after some delay.
2024-11-19 16:16:48 +01:00
Amaury Denoyelle
7fd48a5723 MINOR: quic: extend qc_send_mux() return type with a dedicated enum
This commit is part of a adjustment on QUIC transport send API to
support pacing. Here, qc_send_mux() return type has been changed to use
a new enum quic_tx_err.

This is useful to explain different failure causes of emission. For now,
only two values have been defined : NONE and FATAL. When pacing will be
implemented, a new value would be added to specify that emission was
interrupted on pacing. This won't be a fatal error as this allows to
retry emission but not immediately.
2024-11-19 16:16:48 +01:00