Inspect return code of qc_send_mux(). If quic-conn layer reports an
error, this will interrupt the current emission process.
This should be backported up to 2.6.
It is possible to receive a STOP_SENDING frame for a locally closed
stream. This was not properly managed as this would result in a BUG_ON()
crash from qcs_idle_open() call under qcc_recv_stop_sending().
Now, STOP_SENDING frames are ignored when received on streams already
locally closed. This has two consequences depending on the reason of
closure :
* if a RESET_STREAM was already emitted and closed the stream, this
patch prevents to emit a new RESET_STREAM. This behavior is thus
better.
* if stream was closed due to all data transmitted, no RESET_STREAM will
be built. This is contrary to the RFC 9000 which advice to transmit
it, even on "Data Sent" state. However, this is not mandatory so the
new behavior is acceptable, even if it could be improved.
This crash has been detected on haproxy.org. This can be artifically
reproduced by adding the following snippet at the end of qc_send_mux()
when doing a request with a small payload response :
qcc_recv_stop_sending(qc->qcc, 0, 0);
This must be backported up to 2.6.
xprt_quic module was too large and did not reflect the true architecture
by contrast to the other protocols in haproxy.
Extract code related to XPRT layer and keep it under xprt_quic module.
This code should only contains a simple API to communicate between QUIC
lower layer and connection/MUX.
The vast majority of the code has been moved into a new module named
quic_conn. This module is responsible to the implementation of QUIC
lower layer. Conceptually, it overlaps with TCP kernel implementation
when comparing QUIC and HTTP1/2 stacks of haproxy.
This should be backported up to 2.6.
MUX QUIC snd_buf operation whill return early if a qcs instance is
resetted. In this case, HTX is left untouched and the callback returns
the whole bufer size. This lead to an undefined behavior as the stream
layer is notified about a transfer but does not see its HTX buffer
emptied. In the end, the transfer may stall which will lead to a leak on
session.
To fix this, HTX buffer is now resetted when snd_buf is short-circuited.
This should fix the issue as now the stream layer can continue the
transfer until its completion.
This patch has already been tested by Tristan and is reported to solve
the github issue #1801.
This should be backported up to 2.6.
Factorize common code between h3 and hq-interop snd_buf operation. This
is inserted in MUX QUIC snd_buf own callback.
The h3/hq-interop API has been adjusted to directly receive a HTX
message instead of a plain buf. This led to extracting part of MUX QUIC
snd_buf in qmux_http module.
This should be backported up to 2.6.
Extract function dealing with HTX outside of MUX QUIC. For the moment,
only rcv_buf stream operation is concerned.
The main objective is to be able to support both TCP and HTTP proxy mode
with a common base and add specialized modules on top of it.
This should be backported up to 2.6.
QUIC MUX implements several APIs to interface with stream, quic-conn and
app-ops layers. It is planified to better separate this roles, possibly
by using several files.
The first step is to extract QUIC MUX traces in a dedicated source
files. This will allow to reuse traces in multiple files.
The main objective is to be
able to support both TCP and HTTP proxy mode with a common base and add
specialized modules on top of it.
This should be backported up to 2.6.
A qcs instance free may be postponed in stream detach operation if the
stream is not locally closed. This condition is there to achieve
transfering data still present in Tx buffer. Once all data have been
emitted to quic-conn layer, qcs instance can be released.
However, the stream is only closed locally if HTX EOM has been seen or
it has been resetted. In case the transfer finished without EOM, a
detached qcs won't be freed even if there is no more activity on it.
This bug was not reproduced but was found on code analysis. Its precise
impact is unknown but it should not cause any leak as all qcs instances
are freed with its parent qcc connection : this should eventually happen
on MUX timeout or QUIC idle timeout.
To adjust this, condition to mark a stream as locally closed has been
extended. On qcc_streams_sent_done() notification, if its Tx buffer has
been fully transmitted, it will be closed if either FIN STREAM was set
or the stream is detached.
This must be backported up to 2.6.
nb_hreq is a counter on qcc for active HTTP requests. It is incremented
for each qcs where a full HTTP request was received. It is decremented
when the stream is closed locally :
- on HTTP response fully transmitted
- on stream reset
A bug will occur if a stream is resetted without having processed a full
HTTP request. nb_hreq will be decremented whereas it was not
incremented. This will lead to a crash when building with
DEBUG_STRICT=2. If BUG_ON_HOT are not active, nb_hreq counter will wrap
which may break the timeout logic for the connection.
This bug was triggered on haproxy.org. It can be reproduced by
simulating the reception of a STOP_SENDING frame instead of a STREAM one
by patching qc_handle_strm_frm() :
+ if (quic_stream_is_bidi(strm_frm->id))
+ qcc_recv_stop_sending(qc->qcc, strm_frm->id, 0);
+ //ret = qcc_recv(qc->qcc, strm_frm->id, strm_frm->len,
+ // strm_frm->offset.key, strm_frm->fin,
+ // (char *)strm_frm->data);
To fix this bug, a qcs is now flagged with a new QC_SF_HREQ_RECV. This
is set when the full HTTP request is received. When the stream is closed
locally, nb_hreq will be decremented only if this flag was set.
This must be backported up to 2.6.
This fixes 4 tiny and harmless typos in mux_quic.c, quic_tls.c and
ssl_sock.c. Originally sent via GitHub PR #1843.
Signed-off-by: cui fliter <imcusg@gmail.com>
[Tim: Rephrased the commit message]
[wt: further complete the commit message]
A stream is considered as remotely closed once we have received all the
data with the FIN bit set.
The condition to close the stream was wrong. In particular, if we
receive an empty STREAM frame with FIN bit set, this would have close
the stream even if we do not have yet received all the data. The
condition is now adjusted to ensure that Rx buffer contains all the data
up to the stream final size.
In most cases, this bug is harmless. However, if compiled with
DEBUG_STRICT=2, a BUG_ON_HOT crash would have been triggered if close is
done too early. This was most notably the case sometimes on interop test
suite with quinn or kwik clients. This can also be artificially
reproduced by simulating reception of an empty STREAM frame with FIN bit
set in qc_handle_strm_frm() :
+ if (strm_frm->fin) {
+ qcc_recv(qc->qcc, strm_frm->id, 0,
+ strm_frm->len, strm_frm->fin,
+ (char *)strm_frm->data);
+ }
ret = qcc_recv(qc->qcc, strm_frm->id, strm_frm->len,
strm_frm->offset.key, strm_frm->fin,
(char *)strm_frm->data);
This must be backported up to 2.6.
Small cleanup on snd_buf for application protocol layer.
* do not export h3_snd_buf
* replace stconn by a qcs argument. This is better as h3/hq-interop only
uses the qcs instance.
This should be backported up to 2.6.
H3 SETTINGS emission has recently been delayed. The idea is to send it
with the first STREAM to reduce sendto syscall invocation. This was
implemented in the following patch :
3dd79d378c
MINOR: h3: Send the h3 settings with others streams (requests)
This patch works fine under nominal conditions. However, it will cause a
crash if a HTTP/3 connection is released before having sent any data,
for example when receiving an invalid first request. In this case,
qc_release will first free qcc.app_ops HTTP/3 application protocol layer
via release callback. Then qc_send is called to emit any closing frames
built by app_ops release invocation. However, in qc_send, as no data has
been sent, it will try to complete application layer protocol
intialization, with a SETTINGS emission for HTTP/3. Thus, qcc.app_ops is
reused, which is invalid as it has been just freed. This will cause a
crash with h3_finalize in the call stack.
This bug can be reproduced artificially by generating incomplete HTTP/3
requests. This will in time trigger http-request timeout without any
data send. This is done by editing qc_handle_strm_frm function.
- ret = qcc_recv(qc->qcc, strm_frm->id, strm_frm->len,
+ ret = qcc_recv(qc->qcc, strm_frm->id, strm_frm->len - 1,
strm_frm->offset.key, strm_frm->fin,
(char *)strm_frm->data);
To fix this, application layer closing API has been adjusted to be done
in two-steps. A new shutdown callback is implemented : it is used by the
HTTP/3 layer to generate GOAWAY frame in qc_release prologue.
Application layer context qcc.app_ops is then freed later in qc_release
via the release operation which is now only used to liberate app layer
ressources. This fixes the problem as the intermediary qc_send
invocation will be able to reuse app_ops before it is freed.
This patch fixes the crash, but it would be better to adjust H3 SETTINGS
emission in case of early connection closing : in this case, there is no
need to send it. This should be implemented in a future patch.
This should fix the crash recently experienced by Tristan in github
issue #1801.
This must be backported up to 2.6.
This is the ->finalize application callback which prepares the unidirectional STREAM
frames for h3 settings and wakeup the mux I/O handler to send them. As haproxy is
at the same time always waiting for the client request, this makes haproxy
call sendto() to send only about 20 bytes of stream data. Furthermore in case
of heavy loss, this give less chances to short h3 requests to succeed.
Drawback: as at this time the mux sends its streams by their IDs ascending order
the stream 0 is always embedded before the unidirectional stream 3 for h3 settings.
Nevertheless, as these settings may be lost and received after other h3 request
streams, this is permitted by the RFC.
Perhaps there is a better way to do. This will have to be checked with Amaury.
Must be backported to 2.6.
The following task/tasklet handlers often appear in "show profiling tasks"
but were not resolved since static:
qc_io_cb, quic_conn_app_io_cb, process_timer,
quic_accept_run, qc_idle_timer_task
This commit simply exports them so they can be resolved now. "process_timer"
which was a bit too generic and renamed to qc_process_timer.
It's very limited but at least provides the very basic info about QCS and
QCC when issuing "show sess all":
scf=0x7fa9642394a0 flags=0x00000080 state=EST endp=CONN,0x7fa9642351f0,0x02001001 sub=3
> qcs=0x7fa9642351f0 .flg=0x5 .id=396 .st=HCR .ctx=0x7fa9642353f0, .err=0
> qcc=0x7fa96405ce20 .flg=0 .nbsc=100 .nbhreq=100, .task=0x7fa964054260
co0=0x7fa96405cd50 ctrl=quic4 xprt=QUIC mux=QUIC data=STRM target=LISTENER:0x328c530
flags=0x00200300 fd=-1 fd.state=00 updt=0 fd.tmask=0x0
It will need to be improved but it's better than nothing already. This
should be backported to 2.6 if the other dumps are backported.
MUX notification on TX has been edited recently : it will be notified
only when sending its own data, and not for example on retransmission by
the quic-conn layer. This is subject of the patch :
b29a1dc2f4
BUG/MINOR: quic: do not notify MUX on frame retransmit
A new flag QUIC_FL_CONN_RETRANS_LOST_DATA has been introduced to
differentiate qc_send_app_pkts invocation by MUX and directly by the
quic-conn layer in quic_conn_app_io_cb(). However, this is a first
problem as internal quic-conn layer usage is not limited to
retransmission. For example for NEW_CONNECTION_ID emission.
Another problem much important is that send functions are also called
through quic_conn_io_cb() which has not been protected from MUX
notification. This could probably result in crash when trying to notify
the MUX.
To fix both problems, quic-conn flagging has been inverted : when used
by the MUX, quic-conn is flagged with QUIC_FL_CONN_TX_MUX_CONTEXT. To
improve the API, MUX must now used qc_send_mux which ensure the flag is
set. qc_send_app_pkts is now static and can only be used by the
quic-conn layer.
This must be backported wherever the previously mentionned patch is.
Adjust qc_send_app_pkts function : remove <old_data> arg and provide a
new wrapper function qc_send_app_probing() which should be used instead
when probing with old data.
This simplifies the interface of the default function, most notably for
the MUX which does not interfer with retransmission.
QUIC_FL_CONN_RETRANS_OLD_DATA flag is set/unset directly in the wrapper
qc_send_app_probing().
At the same time, function documentation has been updated to clarified
arguments and return values.
This commit will be useful for the next patch to differentiate MUX and
retransmission send context. As a consequence, the current patch should
be backported wherever the next one will be.
Complete some MUX traces by adding qcc or qcs instance as arguments when
this is possible. This will be useful when several connections are
interleaved.
Emit STREAM_LIMIT_ERROR if a client tries to open an unidirectional
stream with an ID greater than the value specified by our flow-control
limit. The code is similar to the bidirectional stream opening.
MAX_STREAMS_UNI emission is not implement for the moment and is left as
a TODO. This should not be too urgent for the moment : in HTTP/3, a
client has only a limited use for unidirectional streams (H3 control
stream + 2 QPACK streams). This is covered by the value provided by
haproxy in transport parameters.
This patch has been tagged with BUG as it should have prevented last
crash reported on github issue #1808 when opening a new unidirectional
streams with an invalid ID. However, it is probably not the main cause
of the bug contrary to the patch
commit 11a6f4007b
BUG/MINOR: quic: Wrong status returned by qc_pkt_decrypt()
This must be backported up to 2.6.
qc_detach() is used to free a qcs as notified by sedesc. If there is no
more stream active and the connection is considered as dead, it will
then be freed. This prevent to dereference qcc in TRACE macro. Else this
will cause a crash.
Use a different code-path on release for qc_detach() to fix this bug.
This will fix the last occurence of crash on github issue #1808.
This has been introduced by recent QUIC MUX traces rework. Thus, it does
not need to be backport.
Traces argument were incorrectly used in qcs_free(). A qcs was specified
as first arg instead of a connection. This will lead to a crash if
developer qmux traces are activated. This is now fixed.
This bug has been introduced with QUIC MUX traces rework. No need to
backport.
Add new traces to help debugging on QUIC MUX. Most notable, the
following functions are now traced :
* qcc_emit_cc
* qcs_free
* qcs_consume
* qcc_decode_qcs
* qcc_emit_cc_app
* qcc_install_app_ops
* qcc_release_remote_stream
* qcc_streams_sent_done
* qc_init
Change default devel level for some traces in QUIC MUX:
* proto : used to notify about reception/emission of frames
* state : modification of internal state of connection or streams
* data : detailled information about transfer and flow-control
Replace devel traces with error level on all errors situation. Also a
new event QMUX_EV_PROTO_ERR is used. This should help to detect invalid
situations quickly.
Considered a stream as opened when receiving a STOP_SENDING frame as the
first frame on the stream.
This patch is tagged as BUG because a BUG_ON may occur if only a
STOP_SENDING frame has been received for a frame. This will reset the
stream in respect with RFC9000 but internally it is considered invalid
transition to reset an idle stream.
To fix this, simply use qcs_idle_open() on STOP_SENDING parsing
function. This will mark the stream as OPEN before resetting it.
This was detected on haproxy.org with the following backtrace :
FATAL: bug condition "qcs->st == QC_SS_IDLE" matched at
src/mux_quic.c:383
call trace(12):
| 0x490dd3 [b8 01 00 00 00 c6 00 00]: main-0x1d0633
| 0x4975b8 [48 8b 85 58 ff ff ff 8b]: main-0x1c9e4e
| 0x497df4 [48 8b 45 c8 48 89 c7 e8]: main-0x1c9612
| 0x49934c [48 8b 45 c8 48 89 c7 e8]: main-0x1c80ba
| 0x6b3475 [48 8b 05 54 1b 3a 00 64]: run_tasks_from_lists+0x45d/0x8b2
| 0x6b4093 [29 c3 89 d8 89 45 d0 83]: process_runnable_tasks+0x7c9/0x824
| 0x660bde [8b 05 fc b3 4f 00 83 f8]: run_poll_loop+0x74/0x430
| 0x6611de [48 8b 05 7b a6 40 00 48]: main-0x228
| 0x7f66e4fb2ea5 [64 48 89 04 25 30 06 00]: libpthread:+0x7ea5
| 0x7f66e455ab0d [48 89 c7 e8 5b 72 fc ff]: libc:clone+0x6d/0x86
Stream states have been implemented in the current dev tree. Thus, this
patch does not need to be backported.
qc_send_app_pkts() has now a while loop implemented which allows to send
all possible frames even if the send buffer is full between packet
prepare and send. This is present since commit :
dc07751ed7
MINOR: quic: Send packets as much as possible from qc_send_app_pkts()
This means we can remove code from the MUX which implement this at the
upper layer. This is useful to simplify qc_send_frames() function.
As mentionned commit is subject to backport, this commit should be
backported as well to 2.6.
Implement http-request timeout for QUIC MUX. It is used when the
connection is opened and is triggered if no HTTP request is received in
time. By HTTP request we mean at least a QUIC stream with a full header
section. Then qcs instance is attached to a sedesc and upper layer is
then responsible to wait for the rest of the request.
This timeout is also used when new QUIC streams are opened during the
connection lifetime to wait for full HTTP request on them. As it's
possible to demux multiple streams in parallel with QUIC, each waiting
stream is registered in a list <opening_list> stored in qcc with <start>
as timestamp in qcs for the stream opening. Once a qcs is attached to a
sedesc, it is removed from <opening_list>. When refreshing MUX timeout,
if <opening_list> is not empty, the first waiting stream is used to set
MUX timeout.
This is efficient as streams are stored in the list in their creation
order so CPU usage is minimal. Also, the size of the list is
automatically restricted by flow control limitation so it should not
grow too much.
Streams are insert in <opening_list> by application protocol layer. This
is because only application protocol can differentiate streams for HTTP
messaging from internal usage. A function qcs_wait_http_req() has been
added to register a request stream by app layer. QUIC MUX can then
remove it from the list in qc_attach_sc().
As a side-note, it was necessary to implement attach qcc_app_ops
callback on hq-interop module to be able to insert a stream in waiting
list. Without this, a BUG_ON statement would be triggered when trying to
remove the stream on sedesc attach. This is to ensure that every
requests streams are registered for http-request timeout.
MUX timeout is explicitely refreshed on MAX_STREAM_DATA and STOP_SENDING
frame parsing to schedule http-request timeout if a new stream has been
instantiated. It was already done on STREAM parsing due to a previous
patch.
Try to reorganize qcc_refresh_timeout() to improve its readability. The
main objective is to reduce the indentation level and if sequences by
using goto statement to the end of the function. Also, backend and
frontend code path should be more explicit with this new version.
Refresh the MUX connection timeout in frame parsing functions. This is
necessary as these Rx operation are completed directly from the
quic-conn layer outside of MUX I/O callback. Thus, the timeout should be
refreshed on this occasion.
Note that however on STREAM parsing refresh is only conducted when
receiving the current consecutive data offset.
Timeouts related function have been moved up in the source file to be
able to use them in qcc_decode_qcs().
This commit will be useful for http-request timeout. Indeed, a new
stream may be opened during qcc_decode_qcs() which should trigger this
timeout until a full header section is received and qcs instance is
attached to sedesc.
Complete QUIC MUX timeout refresh function by using http-keep-alive
timeout. It is used when the connection is idle after having handle at
least one request.
To implement this a new member <idle_start> has been defined in qcc
structure. This is used as timestamp for when the connection became idle
and is used as base time for http keep-alive timeout
Add a new qcc member named <nb_hreq>. Its purpose is close to <nb_sc>
which represents the number of attached stream connectors. Both are
incremented inside qc_attach_sc().
The difference is on the decrement operation. While <nb_cs> is
decremented on sedesc detach callback, <nb_hreq> is decremented when the
qcs is locally closed.
In most cases, <nb_hreq> will be decremented before <nb_cs>. However, it
will be the reverse if a stream must be kept alive after detach callback.
The main purpose of this field is to implement http-keep-alive timeout.
Both <nb_sc> and <nb_hreq> must be null to activate the http-keep-alive
timeout.
Implement a new internal function qcc_refresh_timeout(). Its role will be
to reset QUIC MUX timeout depending if there is requests in progress or
not.
qcc_update_timeout() does not set a timeout if there is still attached
streams as in this case the upper layer is responsible to manage it.
Else it will activate the timeout depending on the connection current
status.
Timeout is refreshed on several locations : on stream detach and in I/O
handler and wake callback.
For the moment, only the default timeout is used (client or server). The
function may be expanded in the future to support more specific ones :
* http-keep-alive if connection is idle
* http-request when waiting for incomplete HTTP requests
* client/server-fin for graceful shutdown
Store a reference to proxy in the qcc structure. This will be useful to
access to proxy members outside of qcc_init().
Most notably, this change is required to implement timeout refreshing by
using the various timeouts configured at the proxy level.
Ensure via qcc_is_dead() that a connection is not released instance
until all of qcs streams are detached by the upper layer, even if an
error has been reported or the timeout has fired.
On the other side, as qc_detach() always check the connection status,
this should ensure that we do not keep a connection if not necessary.
Without this patch, a qcc instance may be freed with some of its qcs
streams not detached. This is an incorrect behavior and will lead to a
BUG_ON fault. Note however that no occurence of this bug has been
produced currently. This patch is mainly a safety against future
occurences.
This should be backported up to 2.6.
Timeout in QUIC MUX has evolved from the simple first implementation. At
the beginning, a connection was considered dead unless bidirectional
streams were opened. This was abstracted through an app callback
is_active().
Now this paradigm has been reversed and a connection is considered alive
by default, unless an error has been reported or a timeout has already
been fired. The callback is_active() is thus not used anymore and can be
safely removed to simplify qcc_is_dead().
This commit should be backported to 2.6.
A qcc instance may be freed in the middle of qc_io_cb() if all streams
were purged. This will lead to a crash as qcc instance is reused after
this step. Jump directly to the end of the function to avoid this.
Note that this bug has not been triggered for the moment. This is a
safety fix to prevent it.
This must be backported up to 2.6.
On H3 DATA frame transfer from the client, some streams are not properly
closed by the upper layer, despite all transfer operation completed.
Data integrity is not impacted but this will prevent the stream timeout
to fire and thus keep the owner session opened.
In most cases, sessions are closed on QUIC idle timeout, but it may stay
forever if a client emits PING frames at a regular interval to maintain
it.
This bug is caused by a missing EOI stream desc flag on certain
condition in qc_rcv_buf(). To be triggered, we have to use the
optimization when conn-stream buffer is empty and can be swapped with
qcs buffer. The problem is that it will skip the function body for
default copy but also a condition to check if EOI must be set. Thus this
bug does not happens for every H3 post requets : it requires that
conn-stream buffer is empty on last qc_rcv_buf() invocation.
This was reproduced more frequently when using ngtcp2 client with one or
multiple streams :
$ ngtcp2-client -m POST -d ~/infra/html/10K 127.0.0.1 20443 \
http://127.0.0.1:20443/post
This may fix at least partially github issue #1801.
This must be backported up to 2.6.
Stream data reception is incorrect when dealing with a partially new
offset with some data already consumed out of the RX buffer. In this
case, data length is adjusted but not the data buffer. In most cases,
ncb_add() operation will be rejected as already stored data does not
correspond with the new inserted offset. This will result in an invalid
CONNECTION_CLOSE with PROTOCOL_VIOLATION.
To fix this, buffer pointer is advanced while the length is reduced.
This can be reproduced with a POST request and patching haproxy to call
qcc_recv() multiple times by copying a quic_stream frame with different
offsets.
Must be backported to 2.6.
Call qc_send() on qc_release(). This is mostly useful for application
protocol with a connection closing procedure. Most notably, this will be
useful to implement HTTP/3 GOAWAY emission.
This change is purely cosmetic. qc_release() function is moved just
before qc_io_cb(). It's cleaner as it brings it closer where it is used.
More importantly, this will be required to be able to use it in
qc_send() function.
When MUX is released, a CONNECTION_CLOSE frame should be emitted. This
will ensure that the client does not use anymore a half-dead connection.
App protocol layer is responsible to provide the error code via release
callback. For HTTP/3 NO_ERROR is used as specified in RFC 9114. If no
release callback is provided, generic QUIC NO_ERROR code is used. Note
that a graceful shutdown is used : quic_conn must emit CONNECTION_CLOSE
frame when possible. This will be provided in another patch.
This change should limit the risk of browsers stuck on webpage loading
if MUX has been released. On CONNECTION_CLOSE reception, the client will
reopen a new QUIC connection.
Adjust qcc_emit_cc_app() to allow the delay of emission of a
CONNECTION_CLOSE. This will only set the error code but the quic-conn
layer is not flagged for immediate close. The quic-conn will be
responsible to shut the connection when deemed suitable.
This change will allow to implement application graceful shutdown, such
as HTTP/3 with GOAWAY emission. This will allow to emit closing frames
on MUX release. Once all work is done at the lower layer, the quic-conn
should emit a CONNECTION_CLOSE with the registered error code.
Define a new structure quic_err to abstract a QUIC error type. This
allows to easily differentiate a transport and an application error
code. This simplifies error transmission from QUIC MUX and H3 layers.
This new type is defined in quic_frame module. It is used to replace
<err_code> field in <quic_conn>. QUIC_FL_CONN_APP_ALERT flag is removed
as it is now useless.
Utility functions are defined to be able to quickly instantiate
transport, tls and application errors.
Implement support for STOP_SENDING frame parsing. The stream is resetted
as specified by RFC 9000. This will automatically interrupt all future
send operation in qc_send(). A RESET_STREAM will be sent with the code
extracted from the original STOP_SENDING frame.
Implement functions to be able to reset a stream via RESET_STREAM.
If needed, a qcs instance is flagged with QC_SF_TO_RESET to schedule a
stream reset. This will interrupt all future send operations.
On stream emission, if a stream is flagged with QC_SF_TO_RESET, a
RESET_STREAM frame is generated and emitted to the transport layer. If
this operation succeeds, the stream is locally closed. If upper layer is
instantiated, error flag is set on it.
Adjust condition to detach a qcs instance : if the stream is not locally
close it is not directly free. This should improve stream closing by
ensuring that either FIN or a RESET_STREAM is sent before destroying it.
Implement a basic state machine to represent stream lifecycle. By
default a stream is idle. It is marked as open when sending or receiving
the first data on a stream.
Bidirectional streams has two states to represent the closing on both
receive and send channels. This distinction does not exists for
unidirectional streams which passed automatically from open to close
state.
This patch is mostly internal and has a limited visible impact. Some
behaviors are slightly updated :
* closed streams are garbage collected at the start of io handler
* send operation is interrupted if a stream is close locally
Outside of this, there is no functional change. However, some additional
BUG_ON guards are implemented to ensure that we do not conduct invalid
operation on a stream. This should strengthen the code safety. Also,
stream states are displayed on trace which should help debugging.
MAX_STREAM_DATA can be used as the first frame of a stream. In this
case, the stream should be opened, if it respects flow-control limit. To
implement this, simply replace plain lookup in stream tree by
qcc_get_qcs() at the start of the parsing function. This automatically
takes care of opening the stream if not already done.
As specified by RFC 9000, if MAX_STREAM_DATA is receive for a
receive-only stream, a STREAM_STATE_ERROR connection error is emitted.
Improve return path for qcc_recv() on STREAM parsing. It returns 0 on
success. On error, a non-zero value is returned which indicates to the
caller that the packet containing the frame should not be acknowledged.
When qcc_recv() generates a CONNECTION_CLOSE or RESET_STREAM, either
directly or via qcc_get_qcs(), an error is returned which ensure that no
acknowledgement is generated. This required an adjustment on
qcc_get_qcs() API which now returns a success/error code. The stream
instance is returned via a new out argument.
Extend the function qcc_get_qcs() to be able to filter send/receive-only
unidirectional streams. A connection error STREAM_STATE_ERROR is emitted
if this new filter does not match.
This will be useful when various frames handlers are converted with
qcc_get_qcs(). Depending on the frame type, it will be easy to filter on
the forbidden stream types as specified in RFC 9000.
Implement a simple function to notify a possible subscriber or wake up
the upper layer if a special condition happens on a stream. For the
moment, this is only used to replace identical code in
qc_wake_some_streams().
Rename qc_release_detached_streams() to qc_purge_streams(). The aim is
to have a more generic name. It's expected to complete this function to
add other criteria to purge dead streams.
Also the function documentation has been corrected. It does not return a
number of streams. Instead it is a boolean value, to true if at least
one stream was released.
Rename both qcc_open_stream_local/remote() functions to
qcc_init_stream_local/remote(). This change is purely cosmetic. It will
reduces the ambiguity with the soon to be implemented OPEN states for
QCS instances.
QUIC MUX was not able to correctly deal with server response using
chunked transfer-encoding. All data will be transfered correctly to the
client but the FIN bit is missing. The transfer will never stop as the
client will wait indefinitely for the FIN bit.
This bug happened because the HTX message representing a chunked encoded
payload contains a final empty block with the EOM flag. However,
emission is skipped by QUIC MUX if there is no data to transfer. To fix
this, the condition was completed to ensure that there is no need to
send the FIN signal. If this is false, data emission will proceed even
if there is no data : this will generate an empty QUIC STREAM frame with
FIN set which will mark the end of the transfer.
To ensure that a FIN STREAM frame is sent only one time,
QC_SF_FIN_STREAM is resetted on send confirmation from the transport in
qcc_streams_sent_done().
This bug was reproduced when dealing with chunked transfer-encoding
response for the HTTP server.
This must be backported up to 2.6.
Add a check on stream size when the stream is in state Size Known. In
this case, a STREAM frame cannot change the stream size. If this is not
respected, a CONNECTION_CLOSE with FINAL_SIZE_ERROR will be emitted as
specified in the RFC 9000.
Rename QC_SF_FIN_RECV to the more generic name QC_SF_SIZE_KNOWN. This
better align with the QUIC RFC 9000 which uses the "Size Known" state
definition. This change is purely cosmetic.
Review the whole API used to access/instantiate qcs.
A public function qcc_open_stream_local() is available to the
application protocol layer. It allows to easily opening a local stream.
The ID is automatically attributed to the next one available.
For remote streams, qcc_open_stream_remote() has been implemented. It
will automatically take care of allocating streams in a linear way
according to the ID. This function is called via qcc_get_qcs() which can
be used for each qcc_recv*() operations. For the moment, it is only used
for STREAM frames via qcc_recv(), but soon it will be implemented for
other frames types which can also be used to open a new stream.
qcs_new() and qcs_free() has been restricted to the MUX QUIC only as
they are now reserved for internal usage.
This change is a pure refactoring and should not have any noticeable
impact. It clarifies the developer intent and help to ensure that a
stream is not automatically opened when not desired.
Implement a function <qcs_sc> to easily access to the stconn associated
with a QCS. This takes care of qcs.sd which may be NULL, for example for
unidirectional streams.
It is expected that in the future when implementing
STOP_SENDING/RESET_STREAM, stconn must be notify about the event. This
accessor will allow to easily test if the stconn is instantiated or not.
Adjust FIN signal on Rx path for the application layer : ensure that the
receive buffer has no gap.
Without this extra condition, FIN was signalled as soon as the STREAM
frame with FIN was received, even if we were still waiting to receive
missing offsets.
This bug could have lead to incomplete requests read from the
application protocol. However, in practice this bug has very little
chance to happen as the application layer ensures that the demuxed frame
length is equivalent to the buffer data size. The only way to happen is
if to receive the FIN STREAM as the H3 demuxer is still processing on a
frame which is not the last one of the stream.
This must be backported up to 2.6. The previous patch on ncbuf is
required for the newly defined function ncb_is_fragmented().
MINOR: ncbuf: implement ncb_is_fragmented()
Since a previous refactoring, application protocol layer is not require
anymore to call qcs_consume(). This function is now automatically used
by the MUX itself.
In GH #1760 (which is marked as being a feature), there were compilation
errors on MacOS which could be reproduced in Linux when building 32-bit code
(-m32 gcc option). Most of them were due to variables types mixing in QUIC_MIN macro
or using size_t type in place of uint64_t type.
Must be backported to 2.6.
LIST_ELEM macro was incorrectly used in the loop when purging
flow-control frames from qcc.lfctl.frms on MUX release. This caused a
segfault in qc_release() due to an invalid quic_frame pointer instance.
The occurence of this bug seems fairly rare. To happen, some
flow-control frames must have been allocated but not yet sent just as
the MUX release is triggered.
I did not find a reproducer scenario. Instead, I artificially triggered
it by inserting a quic_frame in qcc.lfctl.frms just before purging it in
qc_release() using the following snippet.
struct quic_frame *frm;
frm = pool_zalloc(pool_head_quic_frame);
LIST_INIT(&frm->reflist);
frm->type = QUIC_FT_MAX_DATA;
frm->max_data.max_data = 0;
LIST_APPEND(&qcc->lfctl.frms, &frm->list);
This should fix github issue #1747.
This must be backported up to 2.6.
When the MUX transfers a big amount of data to the client, the transport
layer may reject some of them because of the congestion controller
limit. Frames built by the MUX are thus dropped, even if streams transferred
data are kept in buffers for future new frames.
Thus, the MUX is required to free rejected frames. This fixes a memory
leak which may grow with important data transfers.
It should be backported to 2.6 after it has been tested and validated.
TX flow-control enforcing is not straightforward : it requires the usage
of several counters at stream and connection level, in part due to the
difficult sending API between MUX and quic-conn layers.
To strengthen this part and ensures it behaves as expected, some
existing BUG_ON statements were adjusted and new one were added. This
should help to catch errors as early as possible, as in the case with
github issue #1738.
The flow control enforced at connection level is incorrectly calculated.
There is a risk of exceeding the limit. In most cases, this results in a
segfault produced by a BUG_ON which is here to catch this kind of error.
If not compiled with DEBUG_STRICT, this should generate a connection
closed by the client due to the flow control overflow.
The problem is encountered when transfered payload is big enough to fill
the transport congestion window. In this case, some data are rejected by
the transport layer and kept by the MUX to be reemitted later. However,
these preserved data are not counted on the connection flow control when
resubmitted, which gradually amplify the gap between expected and real
consumed flow control.
To fix this, handle the flow-control at the connection level in the same
way as the stream level. A new field qcc.tx.offsets is incremented as
soon as data are transfered between stream TX buffers. The field
qcc.tx.sent_offsets is preserved to count bytes handled by the transport
layer and stop the MUX transfer if limit is reached.
As already stated, this bug can occur during transfers with enough
emitted data, over multiple streams. When using a single stream, the
flow control at the stream level hides it.
The BUG_ON crash is reproduced systematically with quiche client :
$ quiche-client --no-verify --http-version HTTP/3 -n 10000 https://127.0.0.1:20443/10K
This must be backported up to 2.6 when confirmed to work as expected.
This should fix github issue #1738.
Clean the API used by decode_qcs() and transcoder internal functions.
Parsing functions now returns a ssize_t which represents the number of
consumed bytes or a negative error code. The total consumed bytes is
returned via decode_qcs().
The API is now unified and cleaner. The MUX can thus simply use the
return value of decode_qcs() instead of substracting the data bytes in
the buffer before and after the call. Transcoders functions are not
anymore obliged to remove consumed bytes from the buffer which was not
obvious.
Slightly modify decode_qcs function used by transcoders. The MUX now
gives a buffer instance on which each transcoder is free to work on it.
At the return of the function, the MUX removes consume data from its own
buffer.
This reduces the number of invocation to qcs_consume at the end of a
full demuxing process. The API is also cleaner with the transcoders not
responsible of calling it with the risk of having the input buffer
freed if empty.
As a mirror to qcc/qcs types, add a h3c pointer into h3s struct. This
should help to clean up H3 code and avoid to use qcs.qcc.ctx to retrieve
the h3c instance.
Make the transport parameters be standlone as much as possible as
it consists only in encoding/decoding data into/from buffers.
Reduce the size of xprt_quic.h. Unfortunalety, I think we will
have to continue to include <xprt_quic-t.h> to use the trace API
into this module.
Function arguments and local variables called "cs" were renamed to "sc"
to avoid future confusion. The "nb_cs" stream-connector counter was
renamed to "nb_sc" and qc_attach_cs() was renamed to qc_attach_sc().
There's no more reason for keepin the code and definitions in conn_stream,
let's move all that to stconn. The alphabetical ordering of include files
was adjusted.
The function doesn't return a pointer to the mux but to the mux stream
(h1s, h2s etc). Let's adjust its name to reflect this. It's rarely used,
the name can be enlarged a bit. And of course s/cs/sc to accommodate for
the updated name.
For historical reasons (stream-interface and connections), we used to
require two independent fields for the application level callbacks and
the transport-level functions. Over time the distinction faded away so
much that the low-level functions became specific to the application
and conversely. For example, applets may only work with streams on top
since they rely on the channels, and the stream-level functions differ
between applets and connections. Right now the application level only
contains a wake() callback and the low-level ones contain the functions
that act at the lower level to perform the shutr/shutw and at the upper
level to notify about readability and writability. Let's just merge them
together into a single set and get rid of this confusing distinction.
Note that the check ops do not define any app-level function since these
are only called by streams.
This renames the "struct conn_stream" to "struct stconn" and updates
the descriptions in all comments (and the rare help descriptions) to
"stream connector" or "connector". This touches a lot of files but
the change is minimal. The local variables were not even renamed, so
there's still a lot of "cs" everywhere.
After some discussion we found that the cs_endpoint was precisely the
descriptor for a stream endpoint, hence the naturally coming name,
stream endpoint constructor.
This patch renames only the type everywhere and the new/init/free functions
to remain consistent with it. Future patches will address field names and
argument names in various code areas.
That's the "stream endpoint" pointer. Let's change it now while it's
not much spread. The function __cs_endp_target() wasn't yet renamed
because that will change more globally soon.
This changes all main uses of endp->flags to the se_fl_*() equivalent
by applying coccinelle script endp_flags.cocci. The se_fl_*() functions
themselves were manually excluded from the change, of course.
Note: 144 locations were touched, manually reviewed and found to be OK.
The script was applied with all includes:
spatch --in-place --recursive-includes -I include --sp-file $script $files
As specified by HTTP/3 draft, an unknown unidirectional stream can be
aborted. To do this, use a new flag QC_SF_READ_ABORTED. When the MUX
detects this flag, QCS instance is automatically freed.
Previously, such streams were instead automatically drained. By aborting
them, we economize some useless memcpy instruction. On future data
reception, QCS instance is not found in the tree and considered as
already closed. The frame payload is thus deleted without copying it.
The whole QUIC stack is impacted by this change :
* at quic-conn level, a single function is now used to handle uni and
bidirectional streams. It uses qcc_recv() function from MUX.
* at MUX level, qc_recv() io-handler function does not skip uni streams
* most changes are conducted at app layer. Most notably, all received
data is handle by decode_qcs operation.
Now that decode_qcs is the single app read function, the H3 layer can be
simplified. Uni streams parsing was extracted from h3_attach_ruqs() to
h3_decode_qcs().
h3_decode_qcs() is able to deal with all HTTP/3 frame types. It first
check if the frame is valid for the H3 stream type. Most notably,
SETTINGS parsing was moved from h3_control_recv() into h3_decode_qcs().
This commit has some major benefits besides removing duplicated code.
Mainly, QUIC flow control is now enforced for uni streams as with bidi
streams. Also, an unknown frame received on control stream does not set
an error : it is now silently ignored as required by the specification.
Some cleaning in H3 code is already done with this patch :
h3_control_recv() and h3_attach_ruqs() are removed as they are now
unused. A final patch should clean up the unneeded remaining bit.
Define a new enum h3s_t. This is used to differentiate between the
different stream types used in a HTTP/3 connection, including the QPACK
encoder/decoder streams.
For the moment, only bidirectional streams is positioned. This patch
will be useful to unify reception of uni streams with bidirectional
ones.
Remove the unneeded skip over unidirectional streams in qc_send(). This
unify sending for both uni and bidi streams.
In fact, the only local unidirectional streams in use for the moment is
the H3 Control stream responsible of SETTINGS emission. The frame was
already properly generated in qcs.tx.buf, but not send due to stream
skip in qc_send(). Now, there is no need to ignore uni streams so remove
this condition.
This fixes the emission of H3 settings which is now properly emitted.
Uni and bidi streams use the same set of funtcions for sending. One of
the most notable gain is that flow-control is now enforced for uni
streams.
Emit STREAM_STATE_ERROR connection error in two cases :
* if receiving data for send-only stream
* if receiving data on a locally initiated stream not open yet
For the moment the first case cannot be encoutered as uni streams
reception does not use qcc_recv(). However, this will be soon
implemented with the unification between bidi and uni streams.
Similar to sending, read operations are disabled when a CONNECTION_CLOSE
frame has been emitted.
Most notably, this prevents unneeded loop demuxing when the H3 layer has
issue an error and cannot process the buffer payload anymore.
Note that read is not prevented for unidirectional streams for the
moment. This will supported soon with the unification of bidir and uni
streams treatment.
Complete quic-conn API for error reporting. A new parameter <app> is
defined in the function quic_set_connection_close(). This will transform
the frame into a CONNECTION_CLOSE_APP type.
This type of frame will be generated by the applicative layer, h3 or
hq-interop for the moment. A new function qcc_emit_cc_app() is exported
by the MUX layer for them.
Do not allocate cs_endpoint for every QCS instances in qcs_new().
Instead, this is delayed to qc_attach_cs() function.
In effect, with H3 as app protocol, cs_endpoint will be allocated on
HEADERS parsing. Thus, no cs_endpoint is allocated for H3 unidirectional
streams which do not convey any HTTP data.
The wake handler detects if the frontend is closed. This can happen if
the proxy has been disabled individually or even on process soft-stop.
Before this patch, in this condition QCS instances were freed before
being detached from the cs_endpoint. This clearly violates the haproxy
connection architecture and cause a BUG_ON statement crash in cs_free().
To handle this properly, cs_endpoint is notified by setting RD_SH|WR_SH
on connection flags. The cs_endpoint will thus use the detach operation
which allows the QCS instance to be freed.
This code allows the soft-stop process to complete as soon as possible.
However, the client is not notified about the connection closing. It
should be done by emitting a H3 GOAWAY + CONNECTION_CLOSE. Sadly, this
is impossible at this stage because the listener sockets are closed so
the quic-conn cannot use it to emit new frames. At this stage the client
will most probably detect connection closing on its idle timeout
expiration.
Thus, to completely support proxy closing/soft-stop, important
architecture changes are required in QUIC socket management. This is
also linked with the reload feature.
In order to be able to check compatibility between muxes and transport
layers, we'll need a new flag to tag muxes that work on framed transport
layers like QUIC. Only QUIC has this flag now.
As specified by the RFC reception of different STREAM data for the same
offset should be treated with a CONNECTION_CLOSE with error
PROTOCOL_VIOLATION.
Use ncbuf API to detect this case : if add operation fails with
NCB_RET_DATA_REJ with add mode NCB_ADD_COMPARE.
Send a CONNECTION_CLOSE on reception of a STREAM frame for a STREAM id
exceeding the maximum value enforced. Only implemented for bidirectional
streams for the moment.
Send a CONNECTION_CLOSE if the peer emits more data than authorized by
our flow-control. This is implemented for both stream and connection
level.
Fields have been added in qcc/qcs structures to differentiate received
offsets for limit enforcing with consumed offsets for sending of
MAX_DATA/MAX_STREAM_DATA frames.
Define an API to easily set a CONNECTION_CLOSE. This will mainly be
useful for the MUX when an error is detected which require to close the
whole connection.
On the MUX side, a new flag is added when a CONNECTION_CLOSE has been
prepared. This will disable add future send operations.
Release the QCS RX buffer if emptied afer qcs_consume(). This improves
memory usage and avoids a QCS to keep an allocated buffer, particularly
when no data is received anymore. Buffer is automatically reallocated if
needed via qc_get_ncbuf().
qc_free_ncbuf() may now be used with a NCBUF_NULL buffer as parameter.
This is useful when using this function on a QCS with no allocated
buffer. This case was not reproduced for the moment, but it will soon
become more present as buffers will be released if emptied.
Also a call to offer_buffers() is added to conform with the dynamic
buffer management of haproxy.
This commit is similar to the previous one but deals with MAX_DATA for
connection-level data flow control. It uses the same function
qcc_consume_qcs() to update flow control level and generate a MAX_DATA
frame if needed.
Send MAX_STREAM_DATA frames when at least half of the allocated
flow-control has been demuxed, frame and cleared. This is necessary to
support QUIC STREAM with received data greater than a buffer.
Transcoders must use the new function qcc_consume_qcs() to empty the QCS
buffer. This will allow to monitor current flow-control level and
generate a MAX_STREAM_DATA frame if required. This frame will be emitted
via qc_io_cb().
Adjust the mechanism for MAX_STREAMS_BIDI emission. When a bidirectional
stream is removed, current flow-control level is checked. If needed, a
MAX_STREAMS_BIDI frame is generated and inserted in a new list in the
QCS instance. The new frames will be emitted at the start of qc_send().
This has no impact on the current MAX_STREAMS_BIDI behavior. However,
this mechanism is more flexible and will allow to implement quickly
MAX_STREAM_DATA/MAX_DATA emission.
Slightly change the interface for qcc_recv() between MUX and XPRT. The
MUX is now responsible to call qcc_decode_qcs(). This is cleaner as now
the XPRT does not have to deal with an extra QCS parameter and the MUX
will call qcc_decode_qcs() only if really needed.
This change is possible since there is no extra buffering for
out-of-order STREAM frames and the XPRT does not have to handle buffered
frames.
Previously, qc_io_cb() of mux-quic only dealt with TX. Add support for
RX in it. This is done through a new function qc_recv(qcc). It loops
over all QCS instances and call qcc_decode_qcs(qcs).
This has no impact from the quic-conn layer as qcc_decode_qcs(qcs) is
called directly. However, this allows to have a resume point when demux
is blocked on the upper layer HTX full buffer.
Note that for the moment, only RX for bidirectional streams is managed
in qc_io_cb(). Unidirectional streams use their own mechanism for both
TX/RX. It should be unified in the near future in a refactoring.
Rx has been simplified since the conversion of buffer to a ncbuf. The
old buffer can now be removed. The frms tree is also removed. It was
used previously to stored out-of-order received STREAM frames. Now the
MUX is able to buffer them directly into the ncbuf.
Add a ncbuf for data reception on qcs. Thanks to this, the MUX is able
to buffered all received frame directly into the buffer. Flow control
parameters will be used to ensure there is never an overflow.
This change will simplify Rx path with the future deletion of acked
frames tree previously used for frames out of order.
Fred & Amaury found that I messed up with qc_detach() in commit 4201ab791
("CLEANUP: muxes: make mux->attach/detach take a conn_stream endpoint"),
causing a segv in this case with endp->cs == NULL being passed to
__cs_mux(). It obviously ought to have been endp->target like in other
muxes.
No backport needed.
The mux ->detach() function currently takes a conn_stream. This causes
an awkward situation where the caller cs_detach_endp() has to partially
mark it as released but not completely so that ->detach() finds its
endpoint and context, and it cannot be done later since it's possible
that ->detach() deletes the endpoint. As such the endpoint link between
the conn_stream and the mux's stream is in a transient situation while
we'd like it to be clean so that the mux's ->detach() code can call any
regular function it wants that knows the regular semantics of the
relation between the CS and the endpoint.
A better approach consists in slightly modifying the detach() API to
better match the reality, which is that the endpoint is detached but
still alive and that it's the only part the function is interested in.
As such, this patch modifies the function to take an endpoint there,
and by analogy (or simplicity) does the same for ->attach(), even
though it looks less important there since we're always attaching an
endpoint to a conn_stream anyway. It is possible that in the future
the API could evolve to use more endpoints that provide a bit more
flexibility in the API, but at this point we don't need to go further.
At a few places the endpoint pointer was retrieved from the conn_stream
while it's safer and more long-term proof to take it from the qcs. Let's
just do that.
We rely on the largest ID which was used to open streams to know if the
stream we received STREAM frames for is closed or not. If closed, we return the
same status as the one for a STREAM frame which was a already received one for
on open stream.
This is required when the retransmitted frame types when the mux is released.
We add a counter for the number of streams which were opened or closed by the mux.
After the mux has been released, we can rely on this counter to know if the STREAM
frames are retransmitted ones or not.
If the request channel buffer is full, H3 demuxing must be interrupted
on the stream until some read is performed. This condition is reported
if the HTX stream buffer qcs.rx.app_buf is full.
In this case, qcs instance is marked with a new flag QC_SF_DEM_FULL.
This flag cause the H3 demuxing to be interrupted. It is cleared when
the HTX buffer is read by the conn-stream layer through rcv_buf
operation.
When the flag is cleared, the MUX tasklet is woken up. However, as MUX
iocb does not treat Rx for the moment, this is useless. It must be fix
to prevent possible freeze on POST transfers.
In practice, for the moment the HTX buffer is never full as the current
Rx code is limited by the quic-conn receive buffer size and the
incomplete flow-control implementation. So for now this patch is not
testable under the current conditions.
If the MUX cannot handle immediately nor buffer a STREAM frame, the
packet containing it must not be acknowledge. This is in conformance
with the RFC9000.
qcc_recv() return codes have been adjusted to differentiate an invalid
frame with an already fully received offset which must be acknowledged.
Modify qc_send_app_pkt() to distinguish the case where it sends new data
against the case where it sends old data during probing retransmissions.
We add <old_data> boolean parameter to this function to do so. The mux
never directly send old data when probing retransmissions are needed by
the connection.
We want to track the frames which have been duplicated during retransmissions so
that to avoid uselessly retransmitting frames which would already have been
acknowledged. ->origin new member is there to store the frame from which a copy
was done, ->reflist is a list to store the frames which are copies.
Also ensure all the frames are zeroed and that their ->reflist list member is
initialized.
Add QUIC_FL_TX_FRAME_ACKED flag definition to mark a TX frame as acknowledged.
Define 2 new callback for qcc_app_ops : attach and detach. They are
called when a qcs instance is respectively allocated and freed. If
implemented, they can allocate a custom context stored in the new
abstract field ctx of qcs.
For now, h3 and hq-interop does not use these new callbacks. They will
be soon implemented by the h3 layer to allocate a context used for
stateful demuxing.
This change is required to support the demuxing of H3 frames bigger than
a buffer.
Improve the reception for STREAM frames. In qcc_recv(), if the frame is
bigger than the remaining space in rx buffer, do not reject it wholly.
Instead, copy as much data as possible. The rest of the data is
buffered.
This is necessary to handle H3 frames bigger than a buffer. The H3 code
does not demux until the frame is complete or the buffer is full.
Without this, the transfer on payload larger than the Rx buffer can
rapidly freeze.
Add new qcs fields to count the sum of bytes received for each stream.
This is necessary to enforce flow-control for reception on the peer.
For the moment, the implementation is partial. No MAX_STREAM_DATA or
FLOW_CONTROL_ERROR are emitted. BUG_ON statements are here as a
remainder.
This means that for the moment we do not support POST payloads greater
that the initial max-stream-data announced (256k currently).
At least, we now ensure that we never buffer a frame which overflows the
flow-control limit : this ensures that the memory consumption per stream
should stay under control.
qcs and its field are not properly freed if the conn-stream allocation
fail in qcs_new(). Fix this by having a proper deinit code with a
dedicated label.
Comments were not properly edited since the splitting of functions for
stream emission. Also "payload" argument has been renamed to "in" as it
better reflects the function purpose.
Remove CS_EP_EOS set erroneously on qc_rcv_buf().
This fixes POST with abortonclose. Previously, request was preemptively
aborted by haproxy due to the incorrect EOS flag.
For the moment, EOS flag is not set anymore. It should be set to warn
about a premature close from the client.
Since previous patch
MINOR: mux-quic: split xfer and STREAM frames build
there is no way to report an error in qcs_xfer_data().
This should fix github issue #1669.
Do not initialize mux task timeout if timeout client is set to 0 in the
configuration. Check for the task before queuing it in qc_io_cb() or
qc_detach().
This fix a crash when timeout client is 0 or undefined.
Unsubscribe from lower layer on qc_release. This ensures that the lower
layer won't wake up a null tasklet after the MUX has been released and
may prevent a crash.
Complete qc_send function. After having processed each qcs emission, it
will now retry send on qcs where transfer can continue. This is useful
when qc_stream_desc buffer is full and there is still data present in
qcs buf.
To implement this, each eligible qcs is inserted in a new list
<qcc.send_retry_list>. This is done on send notification from the
transport layer through qcc_streams_sent_done(). Retry emission until
send_retry_list is empty or the transport layer cannot proceed more
data.
Several send operations are now called on two different places. Thus a
new _qc_send_qcs() function is defined to factorize the code.
This change should maximize the throughput during QUIC transfers.
MUX streams can now allocate multiple buffers for sending. quic-conn is
responsible to limit the total count of allowed allocated buffers. A
counter is stored in the new field <stream_buf_count>.
For the moment, the value is hardcoded to 30.
On stream buffer allocation failure, the qcc MUX is flagged with
QC_CF_CONN_FULL. The MUX is then woken up as soon as a buffer is freed,
most notably on ACK reception.
Complete the qc_stream_desc type to support multiple buffers on
emission. The main objective is to increase the transfer throughput.
The MUX is now able to transfer more data without having to wait ACKs.
To implement this feature, a new type qc_stream_buf is declared. it
encapsulates a buffer with a list element. New functions are defined to
retrieve the current buffer, release it or allocate a new one. Each
buffer is kept in the qc_stream_desc list until all of its data is
acknowledged.
On the MUX side, a qcs uses the current stream buffer to transfer data.
Once the buffer is full, it is released and a new one will be allocated
on a future qc_send() invocation.
Simplify the model qcs/qc_stream_desc. Each types has now its own tree
node, stored respectively in qcc and quic-conn trees. It is still
necessary to mark the stream as detached by the MUX once all data is
transfered to the lower layer.
This might improve slightly the performance on ACK management as now
only the lookup in quic-conn is necessary. On the other hand, memory
size of qcs structure is increased.
Regroup all type definitions and functions related to qc_stream_desc in
the source file src/quic_stream.c.
qc_stream_desc complexity will be increased with the development of Tx
multi-buffers. Having a dedicated module is useful to mix it with
pure transport/quic-conn code.
Split qcs_push_frame() in two functions.
The first one is qcs_xfer_data(). Its purpose is to transfer data from
qcs.tx.buf to qc_stream_desc buffer. The second function is named
qcs_build_stream_frm(). It generates a STREAM frame using qc_stream_desc
buffer as payload.
The trace events previously associated with qcs_push_frame() has also
been split in two to reflect the new code structure.
The purpose of this refactoring is first to better reflect how sending
is implemented. It will also simplify the implementation of Tx
multi-buffer per streams.
Implement qc_destroy. This callback is used to quickly release all MUX
resources.
session_free uses this callback. Currently, it can only be called if
there was an error during connection initialization. If not defined, the
process crashes.
For all muxes, the function responsible to release a mux is always called
with a defined mux. Thus there is no reason to test if it is defined or not.
Note the patch may seem huge but it is just because of indentation changes.
Several muxes (h2, fcgi, quic) don't support the protocol upgrade. For these
muxes, there is no reason to have code to support it. Thus in the destroy
callback, there is now a BUG_ON() and the release function is simplified
because the connection is always owned by the mux..
All old flags CS_FL_* are now moved in the endpoint scope and renamed
CS_EP_* accordingly. It is a systematic replacement. There is no true change
except for the health-check and the endpoint reset. Here it is a bit special
because the same conn-stream is reused. Thus, we must handle endpoint
allocation errors. To do so, cs_reset_endp() has been adapted.
Thanks to this last change, it will now be possible to simplify the
multiplexer and probably the applets too. A review must also be performed to
remove some flags in the channel or the stream-interface. The HTX will
probably be simplified too. Finally, there is now some place in the
conn-stream to move info from the stream-interface.
The conn-stream endpoint is now shared between the conn-stream and the
applet or the multiplexer. If the mux or the applet is created first, it is
responsible to also create the endpoint and share it with the conn-stream.
If the conn-stream is created first, it is the opposite.
When the endpoint is only owned by an applet or a mux, it is called an
orphan endpoint (there is no conn-stream). When it is only owned by a
conn-stream, it is called a detached endpoint (there is no mux/applet).
The last entity that owns an endpoint is responsible to release it. When a
mux or an applet is detached from a conn-stream, the conn-stream
relinquishes the endpoint to recreate a new one. This way, the endpoint
state is never lost for the mux or the applet.
Group the endpoint target of a conn-stream, its context and the associated
flags in a dedicated structure in the conn-stream. It is not inlined in the
conn-stream structure. There is a dedicated pool.
For now, there is no complexity. It is just an indirection to get the
endpoint or its context. But the purpose of this structure is to be able to
share a refcounted context between the mux and the conn-stream. This way, it
will be possible to preserve it when the mux is detached from the
conn-stream.
This change is only significant for the multiplexer part. For the applets,
the context and the endpoint are the same. Thus, there is no much change. For
the multiplexer part, the connection was used to set the conn-stream
endpoint and the mux's stream was the context. But it is a bit strange
because once a mux is installed, it takes over the connection. In a
wonderful world, the connection should be totally hidden behind the mux. The
stream-interface and, in a lesser extent, the stream, still access the
connection because that was inherited from the pre-multiplexer era.
Now, the conn-stream endpoint is the mux's stream (an opaque entity for the
conn-stream) and the connection is the context. Dedicated functions have
been added to attached an applet or a mux to a conn-stream.