Clean the API used by decode_qcs() and transcoder internal functions.
Parsing functions now returns a ssize_t which represents the number of
consumed bytes or a negative error code. The total consumed bytes is
returned via decode_qcs().
The API is now unified and cleaner. The MUX can thus simply use the
return value of decode_qcs() instead of substracting the data bytes in
the buffer before and after the call. Transcoders functions are not
anymore obliged to remove consumed bytes from the buffer which was not
obvious.
Slightly modify decode_qcs function used by transcoders. The MUX now
gives a buffer instance on which each transcoder is free to work on it.
At the return of the function, the MUX removes consume data from its own
buffer.
This reduces the number of invocation to qcs_consume at the end of a
full demuxing process. The API is also cleaner with the transcoders not
responsible of calling it with the risk of having the input buffer
freed if empty.
As a mirror to qcc/qcs types, add a h3c pointer into h3s struct. This
should help to clean up H3 code and avoid to use qcs.qcc.ctx to retrieve
the h3c instance.
Make the transport parameters be standlone as much as possible as
it consists only in encoding/decoding data into/from buffers.
Reduce the size of xprt_quic.h. Unfortunalety, I think we will
have to continue to include <xprt_quic-t.h> to use the trace API
into this module.
Function arguments and local variables called "cs" were renamed to "sc"
to avoid future confusion. The "nb_cs" stream-connector counter was
renamed to "nb_sc" and qc_attach_cs() was renamed to qc_attach_sc().
There's no more reason for keepin the code and definitions in conn_stream,
let's move all that to stconn. The alphabetical ordering of include files
was adjusted.
The function doesn't return a pointer to the mux but to the mux stream
(h1s, h2s etc). Let's adjust its name to reflect this. It's rarely used,
the name can be enlarged a bit. And of course s/cs/sc to accommodate for
the updated name.
For historical reasons (stream-interface and connections), we used to
require two independent fields for the application level callbacks and
the transport-level functions. Over time the distinction faded away so
much that the low-level functions became specific to the application
and conversely. For example, applets may only work with streams on top
since they rely on the channels, and the stream-level functions differ
between applets and connections. Right now the application level only
contains a wake() callback and the low-level ones contain the functions
that act at the lower level to perform the shutr/shutw and at the upper
level to notify about readability and writability. Let's just merge them
together into a single set and get rid of this confusing distinction.
Note that the check ops do not define any app-level function since these
are only called by streams.
This renames the "struct conn_stream" to "struct stconn" and updates
the descriptions in all comments (and the rare help descriptions) to
"stream connector" or "connector". This touches a lot of files but
the change is minimal. The local variables were not even renamed, so
there's still a lot of "cs" everywhere.
After some discussion we found that the cs_endpoint was precisely the
descriptor for a stream endpoint, hence the naturally coming name,
stream endpoint constructor.
This patch renames only the type everywhere and the new/init/free functions
to remain consistent with it. Future patches will address field names and
argument names in various code areas.
That's the "stream endpoint" pointer. Let's change it now while it's
not much spread. The function __cs_endp_target() wasn't yet renamed
because that will change more globally soon.
This changes all main uses of endp->flags to the se_fl_*() equivalent
by applying coccinelle script endp_flags.cocci. The se_fl_*() functions
themselves were manually excluded from the change, of course.
Note: 144 locations were touched, manually reviewed and found to be OK.
The script was applied with all includes:
spatch --in-place --recursive-includes -I include --sp-file $script $files
As specified by HTTP/3 draft, an unknown unidirectional stream can be
aborted. To do this, use a new flag QC_SF_READ_ABORTED. When the MUX
detects this flag, QCS instance is automatically freed.
Previously, such streams were instead automatically drained. By aborting
them, we economize some useless memcpy instruction. On future data
reception, QCS instance is not found in the tree and considered as
already closed. The frame payload is thus deleted without copying it.
The whole QUIC stack is impacted by this change :
* at quic-conn level, a single function is now used to handle uni and
bidirectional streams. It uses qcc_recv() function from MUX.
* at MUX level, qc_recv() io-handler function does not skip uni streams
* most changes are conducted at app layer. Most notably, all received
data is handle by decode_qcs operation.
Now that decode_qcs is the single app read function, the H3 layer can be
simplified. Uni streams parsing was extracted from h3_attach_ruqs() to
h3_decode_qcs().
h3_decode_qcs() is able to deal with all HTTP/3 frame types. It first
check if the frame is valid for the H3 stream type. Most notably,
SETTINGS parsing was moved from h3_control_recv() into h3_decode_qcs().
This commit has some major benefits besides removing duplicated code.
Mainly, QUIC flow control is now enforced for uni streams as with bidi
streams. Also, an unknown frame received on control stream does not set
an error : it is now silently ignored as required by the specification.
Some cleaning in H3 code is already done with this patch :
h3_control_recv() and h3_attach_ruqs() are removed as they are now
unused. A final patch should clean up the unneeded remaining bit.
Define a new enum h3s_t. This is used to differentiate between the
different stream types used in a HTTP/3 connection, including the QPACK
encoder/decoder streams.
For the moment, only bidirectional streams is positioned. This patch
will be useful to unify reception of uni streams with bidirectional
ones.
Remove the unneeded skip over unidirectional streams in qc_send(). This
unify sending for both uni and bidi streams.
In fact, the only local unidirectional streams in use for the moment is
the H3 Control stream responsible of SETTINGS emission. The frame was
already properly generated in qcs.tx.buf, but not send due to stream
skip in qc_send(). Now, there is no need to ignore uni streams so remove
this condition.
This fixes the emission of H3 settings which is now properly emitted.
Uni and bidi streams use the same set of funtcions for sending. One of
the most notable gain is that flow-control is now enforced for uni
streams.
Emit STREAM_STATE_ERROR connection error in two cases :
* if receiving data for send-only stream
* if receiving data on a locally initiated stream not open yet
For the moment the first case cannot be encoutered as uni streams
reception does not use qcc_recv(). However, this will be soon
implemented with the unification between bidi and uni streams.
Similar to sending, read operations are disabled when a CONNECTION_CLOSE
frame has been emitted.
Most notably, this prevents unneeded loop demuxing when the H3 layer has
issue an error and cannot process the buffer payload anymore.
Note that read is not prevented for unidirectional streams for the
moment. This will supported soon with the unification of bidir and uni
streams treatment.
Complete quic-conn API for error reporting. A new parameter <app> is
defined in the function quic_set_connection_close(). This will transform
the frame into a CONNECTION_CLOSE_APP type.
This type of frame will be generated by the applicative layer, h3 or
hq-interop for the moment. A new function qcc_emit_cc_app() is exported
by the MUX layer for them.
Do not allocate cs_endpoint for every QCS instances in qcs_new().
Instead, this is delayed to qc_attach_cs() function.
In effect, with H3 as app protocol, cs_endpoint will be allocated on
HEADERS parsing. Thus, no cs_endpoint is allocated for H3 unidirectional
streams which do not convey any HTTP data.
The wake handler detects if the frontend is closed. This can happen if
the proxy has been disabled individually or even on process soft-stop.
Before this patch, in this condition QCS instances were freed before
being detached from the cs_endpoint. This clearly violates the haproxy
connection architecture and cause a BUG_ON statement crash in cs_free().
To handle this properly, cs_endpoint is notified by setting RD_SH|WR_SH
on connection flags. The cs_endpoint will thus use the detach operation
which allows the QCS instance to be freed.
This code allows the soft-stop process to complete as soon as possible.
However, the client is not notified about the connection closing. It
should be done by emitting a H3 GOAWAY + CONNECTION_CLOSE. Sadly, this
is impossible at this stage because the listener sockets are closed so
the quic-conn cannot use it to emit new frames. At this stage the client
will most probably detect connection closing on its idle timeout
expiration.
Thus, to completely support proxy closing/soft-stop, important
architecture changes are required in QUIC socket management. This is
also linked with the reload feature.
In order to be able to check compatibility between muxes and transport
layers, we'll need a new flag to tag muxes that work on framed transport
layers like QUIC. Only QUIC has this flag now.
As specified by the RFC reception of different STREAM data for the same
offset should be treated with a CONNECTION_CLOSE with error
PROTOCOL_VIOLATION.
Use ncbuf API to detect this case : if add operation fails with
NCB_RET_DATA_REJ with add mode NCB_ADD_COMPARE.
Send a CONNECTION_CLOSE on reception of a STREAM frame for a STREAM id
exceeding the maximum value enforced. Only implemented for bidirectional
streams for the moment.
Send a CONNECTION_CLOSE if the peer emits more data than authorized by
our flow-control. This is implemented for both stream and connection
level.
Fields have been added in qcc/qcs structures to differentiate received
offsets for limit enforcing with consumed offsets for sending of
MAX_DATA/MAX_STREAM_DATA frames.
Define an API to easily set a CONNECTION_CLOSE. This will mainly be
useful for the MUX when an error is detected which require to close the
whole connection.
On the MUX side, a new flag is added when a CONNECTION_CLOSE has been
prepared. This will disable add future send operations.
Release the QCS RX buffer if emptied afer qcs_consume(). This improves
memory usage and avoids a QCS to keep an allocated buffer, particularly
when no data is received anymore. Buffer is automatically reallocated if
needed via qc_get_ncbuf().
qc_free_ncbuf() may now be used with a NCBUF_NULL buffer as parameter.
This is useful when using this function on a QCS with no allocated
buffer. This case was not reproduced for the moment, but it will soon
become more present as buffers will be released if emptied.
Also a call to offer_buffers() is added to conform with the dynamic
buffer management of haproxy.
This commit is similar to the previous one but deals with MAX_DATA for
connection-level data flow control. It uses the same function
qcc_consume_qcs() to update flow control level and generate a MAX_DATA
frame if needed.
Send MAX_STREAM_DATA frames when at least half of the allocated
flow-control has been demuxed, frame and cleared. This is necessary to
support QUIC STREAM with received data greater than a buffer.
Transcoders must use the new function qcc_consume_qcs() to empty the QCS
buffer. This will allow to monitor current flow-control level and
generate a MAX_STREAM_DATA frame if required. This frame will be emitted
via qc_io_cb().
Adjust the mechanism for MAX_STREAMS_BIDI emission. When a bidirectional
stream is removed, current flow-control level is checked. If needed, a
MAX_STREAMS_BIDI frame is generated and inserted in a new list in the
QCS instance. The new frames will be emitted at the start of qc_send().
This has no impact on the current MAX_STREAMS_BIDI behavior. However,
this mechanism is more flexible and will allow to implement quickly
MAX_STREAM_DATA/MAX_DATA emission.
Slightly change the interface for qcc_recv() between MUX and XPRT. The
MUX is now responsible to call qcc_decode_qcs(). This is cleaner as now
the XPRT does not have to deal with an extra QCS parameter and the MUX
will call qcc_decode_qcs() only if really needed.
This change is possible since there is no extra buffering for
out-of-order STREAM frames and the XPRT does not have to handle buffered
frames.
Previously, qc_io_cb() of mux-quic only dealt with TX. Add support for
RX in it. This is done through a new function qc_recv(qcc). It loops
over all QCS instances and call qcc_decode_qcs(qcs).
This has no impact from the quic-conn layer as qcc_decode_qcs(qcs) is
called directly. However, this allows to have a resume point when demux
is blocked on the upper layer HTX full buffer.
Note that for the moment, only RX for bidirectional streams is managed
in qc_io_cb(). Unidirectional streams use their own mechanism for both
TX/RX. It should be unified in the near future in a refactoring.
Rx has been simplified since the conversion of buffer to a ncbuf. The
old buffer can now be removed. The frms tree is also removed. It was
used previously to stored out-of-order received STREAM frames. Now the
MUX is able to buffer them directly into the ncbuf.
Add a ncbuf for data reception on qcs. Thanks to this, the MUX is able
to buffered all received frame directly into the buffer. Flow control
parameters will be used to ensure there is never an overflow.
This change will simplify Rx path with the future deletion of acked
frames tree previously used for frames out of order.
Fred & Amaury found that I messed up with qc_detach() in commit 4201ab791
("CLEANUP: muxes: make mux->attach/detach take a conn_stream endpoint"),
causing a segv in this case with endp->cs == NULL being passed to
__cs_mux(). It obviously ought to have been endp->target like in other
muxes.
No backport needed.
The mux ->detach() function currently takes a conn_stream. This causes
an awkward situation where the caller cs_detach_endp() has to partially
mark it as released but not completely so that ->detach() finds its
endpoint and context, and it cannot be done later since it's possible
that ->detach() deletes the endpoint. As such the endpoint link between
the conn_stream and the mux's stream is in a transient situation while
we'd like it to be clean so that the mux's ->detach() code can call any
regular function it wants that knows the regular semantics of the
relation between the CS and the endpoint.
A better approach consists in slightly modifying the detach() API to
better match the reality, which is that the endpoint is detached but
still alive and that it's the only part the function is interested in.
As such, this patch modifies the function to take an endpoint there,
and by analogy (or simplicity) does the same for ->attach(), even
though it looks less important there since we're always attaching an
endpoint to a conn_stream anyway. It is possible that in the future
the API could evolve to use more endpoints that provide a bit more
flexibility in the API, but at this point we don't need to go further.
At a few places the endpoint pointer was retrieved from the conn_stream
while it's safer and more long-term proof to take it from the qcs. Let's
just do that.
We rely on the largest ID which was used to open streams to know if the
stream we received STREAM frames for is closed or not. If closed, we return the
same status as the one for a STREAM frame which was a already received one for
on open stream.
This is required when the retransmitted frame types when the mux is released.
We add a counter for the number of streams which were opened or closed by the mux.
After the mux has been released, we can rely on this counter to know if the STREAM
frames are retransmitted ones or not.
If the request channel buffer is full, H3 demuxing must be interrupted
on the stream until some read is performed. This condition is reported
if the HTX stream buffer qcs.rx.app_buf is full.
In this case, qcs instance is marked with a new flag QC_SF_DEM_FULL.
This flag cause the H3 demuxing to be interrupted. It is cleared when
the HTX buffer is read by the conn-stream layer through rcv_buf
operation.
When the flag is cleared, the MUX tasklet is woken up. However, as MUX
iocb does not treat Rx for the moment, this is useless. It must be fix
to prevent possible freeze on POST transfers.
In practice, for the moment the HTX buffer is never full as the current
Rx code is limited by the quic-conn receive buffer size and the
incomplete flow-control implementation. So for now this patch is not
testable under the current conditions.
If the MUX cannot handle immediately nor buffer a STREAM frame, the
packet containing it must not be acknowledge. This is in conformance
with the RFC9000.
qcc_recv() return codes have been adjusted to differentiate an invalid
frame with an already fully received offset which must be acknowledged.