When no client timeout is defined in the configuration, QCC timeout task
is never allocated. However, a NULL timeout task is also used as a
criteria in qcc_is_dead() to consider that the MUX instance should be
released as timeout stroke earlier.
This bug causes every connection to be closed by haproxy side with a
CONNECTION_CLOSE. This is notable when using several streams per
connection with only the first stream completed and the others failed.
To fix this, change timeout task allocation policy. It is now always
allocated. This means that if no timeout is defined, it will never be
run. This is not considered a waste of resource as no timeout in the
configuration is considered as an exception case. However, this has the
advantage to simplify the rest of the code which can now check for the
task instance without having an extra check on the timeout value.
This bug is labelled as minor as it only occurs if no timeout client is
defined which reports warning on startup as it may caused unexpected
behavior.
This bug should be backported up to 2.6.
qcs_new() allocates several elements in intermediary steps. All elements
must first be properly initialized to be able to free qcs instance in
case of an intermediary failure.
Previously, qc_stream_desc allocation was done in the middle of
qcs_new() before some elements initializations. In case this fails, a
crash can happened as some elements are left uninitialized.
To fix this, move qc_stream_desc allocation at the end of qcs_new().
This ensures that all qcs elements are initialized first.
This should be backported up to 2.6.
Support stream opening with an initial max-stream-data of 0.
In normal case, QC_SF_BLK_SFCTL is set when a qcs instance cannot
transfer more data due to flow-control. This flag is set when
transfering data from MUX to quic-conn instance.
However, it's possible to define an initial value of 0 for
max-stream-data. In this case, qcs instance is blocked despite
QC_SF_BLK_SFCTL not set. No STREAM frame is prepared for this stream as
it's not possible to emit any byte, so QC_SF_BLK_SFCTL flag is never
set.
This behavior should cause no harm. However, this can cause a BUG_ON()
crash on qcc_io_send(). Indeed, when sending is retried, it ensures that
only qcs instance waiting for a new qc_stream_buf or with
QC_SF_BLK_SFCTL set is present in the send_list.
To fix this, initialize qcs with 0 value for msd and QC_SF_BLK_SFCTL.
The flag is removed only if transport parameter msd value is non null.
This should be backported up to 2.6.
When receiving a RESET_STREAM on a send-only stream, it is mandatory to
close the connection with an error STREAM_STATE error. However, this was
badly implemented as this caused two invocation of qcc_set_error() which
is forbidden by the mux-quic API.
To fix this, rely on qcc_get_qcs() to properly detect the error. Remove
qcc_set_error() usage from qcc_recv_reset_stream() instead.
This must be backported up to 2.7.
When rcv_buf stream callback is invoked, mux tasklet is woken up if
demux was previously blocked due to lack of buffer space. A BUG_ON() is
present to ensure there is data in qcs Rx buffer. If this is not the
case, wakeup is unneeded :
BUG_ON(!ncb_data(&qcs->rx.ncbuf, 0));
This BUG_ON() may be triggered if RESET_STREAM is received after demux
has been blocked. On reset, Rx buffer is purged according to RFC 9000
which allows to discard any data not yet consumed. This will trigger the
BUG_ON() assertion if rcv_buf stream callback is invoked after this.
To prevent BUG_ON() crash, just clear demux block flag each time Rx
buffer is purged. This covers accordingly RESET_STREAM reception.
This should be backported up to 2.7.
This may fix github issue #2293.
This bug relies on several precondition so its occurence is rare. This
was reproduced by using a custom client which post big enough data to
fill the buffer. It then emits a RESET_STREAM in place of a proper FIN.
Moreover, mux code has been edited to artificially stalled stream read
to force demux blocking.
h3_data_to_htx:
- return htx_sent;
+ return 1;
qcc_recv_reset_stream:
qcs_free_ncbuf(qcs, &qcs->rx.ncbuf);
+ qcs_notify_recv(qcs);
qmux_strm_rcv_buf:
char fin = 0;
+ static int i = 0;
+ if (++i < 2)
+ return 0;
TRACE_ENTER(QMUX_EV_STRM_RECV, qcc->conn, qcs);
A HTTP server may provide a complete response even prior receiving the
full request. In this case, RFC 9114 allows the server to abort read
with a STOP_SENDING with error code H3_NO_ERROR.
This scenario was notably reproduced with haproxy and an inactive
server. If the client send a POST request, haproxy may provide a full
HTTP 503 response before the end of the full request.
This patch is similar to the previous one but for QUIC mux functions
used inside the mux code itself or application layer. Replace all
occurences of qc_* prefix by qcc_* or qcs_*. This should help to better
differentiate code between quic_conn and MUX.
This should be backported up to 2.7.
Rename all QUIC mux function exposed through mux_ops structure. Use the
prefix qmux_* or qmux_strm_*. The objective is to remove qc_* prefix
which should only be used in quic_conn layer.
This should be backported up to 2.7.
Recently stconn flags were reviewed for QUIC mux to be conform with
other HTTP muxes. However, a mistake was made when dealing with a proper
stream FIN with both EOI and EOS set. This was done as RESET_STREAM
received after a FIN are ignored by QUIC mux and thus there is no
difference between EOI or EOI+EOS. However, analyzers may interpret EOS
as an interrupted request which result in a 400 HTTP error code.
To fix this, only set EOI on proper stream FIN. EOS is set when input is
interrupted (RESET_STREAM before FIN) or a STOP_SENDING is received
which prevent transfer to complete. In this last case, EOS must be
manually set too if FIN has been received before STOP_SENDING to go
directly from ERR_PENDING to final ERROR state.
This must be backported up to 2.7.
Remove nb_streams field from qcc. It was not used outside of a BUG_ON()
statement to ensure we never have a negative count of streams. However
this is already checked with other fields.
This should be backported up to 2.7.
A RESET_STREAM is emitted in several occasions :
- protocol error during HTTP/3.0 parsing
- STOP_SENDING reception
In both cases, if a stream-endpoint is attached we must set its ERR
flag. This was correctly done but after some delay as it was only when
the RESET_STREAM was emitted. Change this to set the ERR flag as soon as
one of the upper cases has been encountered. This should help to release
faster streams in error.
This should be backported up to 2.7.
A recent review was done to rationalize ERR/EOS/EOI flags on stream
endpoint. A common definition for both H1/H2/QUIC mux have been written
in the following documentation :
./doc/internals/stconn-close.txt
In QUIC it is possible to close each channels of a stream independently
with RESET_STREAM and STOP_SENDING frames. When a RESET_STREAM is
received, it indicates that the peer has ended its transmission in an
abnormal way. However, it is still ready to receive.
Previously, on RESET_STREAM reception, QUIC MUX set the ERR flag on
stream-endpoint. However, according to the QUIC mechanism, it should be
instead EOS but this was impossible due to a BUG_ON() which prevents EOS
without EOI or ERR. This BUG_ON was only present because this case was
never used before the introduction of QUIC. It was removed in a recent
commit which allows us to now properly set EOS alone on RESET_STREAM
reception.
In practice, this change allows to continue to send data even after
RESET_STREAM reception. However, currently browsers always emit it with
a STOP_SENDING as this is used to abort the whole H3 streams. In the end
this will result in a stream-endpoint with EOS and ERR_PENDING/ERR
flags.
This should be backported up to 2.7.
A recent review was done to rationalize ERR/EOS/EOI flags on stream
endpoint. A common definition for both H1/H2/QUIC mux have been written
in the following documentation :
./doc/internals/stconn-close.txt
Always set EOS with EOI flag to conform to this specification. EOI is
set whenever the proper stream end has been encountered : with QUIC it
corresponds to a STREAM frame with FIN bit. At this step, RESET_STREAM
frames are ignored by QUIC MUX as allowed by RFC 9000. This means we can
always set EOS at the same time with EOI.
This should be backported up to 2.7.
Complete each useful BUG_ON statements with a comment to explain its
purpose. Also convert BUG_ON_HOT to BUG_ON as they should not have a
big impact.
This should be backported up to 2.7.
When a full message is received for a stream, MUX is responsible to set
EOI flag. This was done through rcv_buf stream callback by checking if
QCS HTX buffer contained the EOM flag.
This is not correct for HTTP without body. In this case, QCS HTX buffer
is never used. Only a local HTX buffer is used to transfer headers just
as stream endpoint is created. As such, EOI is never transmitted to the
upper layer.
If the transfer occur without any issue, this does not seem to cause any
problem. However, in case the transfer is aborted, the stream is never
released which cause a memory leak and prevent the process soft-stop.
To fix this, also check if EOM is put by application layer during
headers conversion. If true, this is transferred through a new argument
to qc_attach_sc() MUX function which is responsible to set the EOI flag.
This issue was reproduced using h2load with hundred of connections.
h2load is interrupted with a SIGINT which causes streams to never be
closed on haproxy side.
This should be backported up to 2.6.
Uninline and move qc_attach_sc() function to implementation source file.
This will be useful for next commit to add traces in it.
This should be backported up to 2.7.
MUX is responsible to put EOS on stream when read channel is closed.
This happens if underlying connection is closed or a RESET_STREAM is
received. FIN STREAM is ignored in this case.
For connection closure, simply check for CO_FL_SOCK_RD_SH.
For RESET_STREAM reception, a new flag QC_CF_RECV_RESET has been
introduced. It is set when RESET_STREAM is received, unless we already
received all data. This is conform to QUIC RFC which allows to ignore a
RESET_STREAM in this case. During RESET_STREAM processing, input buffer
is emptied so EOS can be reported right away on recv_buf operation.
This should be backported up to 2.7.
Add traces to render each stream transition more explicit. Also, move
ERR_PENDING to ERROR transition after other stream flags are set, as
with the MUX H2 implementation. This is purely a cosmetic change and it
should have no functional impact.
This should be backported up to 2.7.
Since the following patch
commit 6c501ed23b
BUG/MINOR: mux-quic: differentiate failure on qc_stream_desc alloc
it is not possible to check if Tx buf allocation failed due to a
configured limit exhaustion or a simple memory failure.
This patch fixes it as the condition was inverted. Indeed, if buf_avail
is null, this means that the limit has been reached. On the contrary
case, this is a real memory alloc failure. This caused the flag
QC_CF_CONN_FULL to not be properly used and may have caused disruption
on transfer with several streams or large data.
This was detected due to an abnormal error QUIC MUX traces. Also change
in consequence trace for limit exhaustion to be more explicit.
This must be backported up to 2.6.
qc_init() is used to initialize a QUIC MUX instance. On failure, each
resources are released via a series of goto statements. There is one
issue if the app_ops.init callback fails. In this case, MUX task is not
freed.
This can cause a crash as the task is already scheduled. When the
handler will run, it will crash when trying to access qcc instance.
To fix this, properly destroy qcc task on fail_install_app_ops label.
The impact of this bug is minor as app_ops.init callback succeeds most
of the time. However, it may fail on allocation failure due to memory
exhaustion.
This may fix github issue #2154.
This must be backported up to 2.7.
qc_stream_buf_alloc() can fail for two reasons :
* limit of Tx buffer per connection reached
* allocation failure
The first case is properly treated. A flag QC_CF_CONN_FULL is set on the
connection to interrupt emission. It is cleared when a buffer became
available after in order ACK reception and the MUX tasklet is woken up.
The allocation failure was handled with the same mechanism which in this
case is not appropriate and could lead to a connection transfer freeze.
Instead, prefer to close the connection with a QUIC internal error code.
To differentiate the two causes, qc_stream_buf_alloc() API was changed
to return the number of available buffers to the caller.
This must be backported up to 2.6.
The function qc_get_ncbuf() is used to allocate a ncbuf content.
Allocation failure was handled using a plain BUG_ON.
Fix this by a proper error management. This buffer is only used for
STREAM frame reception to support out-of-order offsets. When an
allocation failed, close the connection with a QUIC internal error code.
This should be backported up to 2.6.
A convenience function qc_get_buf() is implemented to centralize buffer
allocation on MUX and H3 layers. However, allocation failure was not
handled properly with a BUG_ON() statement.
Replace this by proper error management. On emission, streams is
temporarily skip over until the next qc_send() invocation. On reception,
H3 uses this function for HTX conversion; on alloc failure the
connection will be closed with QUIC internal error code.
This must be backported up to 2.6.
Following previous patch, error notification from quic_conn has been
adjusted to rely on standard connection flags. Most notably, CO_FL_ERROR
on the connection instance when a fatal error is detected.
Check for CO_FL_ERROR is implemented by qc_send(). If set the new flag
QC_CF_ERR_CONN will be set for the MUX instance. This flag is similar to
the local error flag and will abort most of the futur processing. To
ensure stream upper layer is also notified, qc_wake_some_streams()
called by qc_process() will put the stream on error if this new flag is
set.
This should be backported up to 2.7.
When an error is detected at quic-conn layer, the upper MUX must be
notified. Previously, this was done relying on quic_conn flag
QUIC_FL_CONN_NOTIFY_CLOSE set and the MUX wake callback called on
connection closure.
Adjust this mechanism to use an approach more similar to other transport
layers in haproxy. On error, connection flags are updated with
CO_FL_ERROR, CO_FL_SOCK_RD_SH and CO_FL_SOCK_WR_SH. The MUX is then
notified when the error happened instead of just before the closing. To
reflect this change, qc_notify_close() has been renamed qc_notify_err().
This function must now be explicitely called every time a new error
condition arises on the quic_conn layer.
To ensure MUX send is disabled on error, qc_send_mux() now checks
CO_FL_SOCK_WR_SH. If set, the function returns an error. This should
prevent the MUX from sending data on closing or draining state.
To complete this patch, MUX layer must now check for CO_FL_ERROR
explicitely. This will be the subject of the following commit.
This should be backported up to 2.7.
Remove the unnecessary err label for qc_send(). Anyway, this label
cannot be used once some frames are sent because there is no cleanup
part for it.
This should be backported up to 2.7.
Factorize code for send subscribing on the lower layer in a dedicated
function qcc_subscribe_send(). This allows to call the lower layer only
if not already subscribed and print a trace in this case. This should
help to understand when subscribing is really performed.
In the future, this function may be extended to avoid subscribing under
new conditions, such as connection already on error.
This should be backported up to 2.7.
Do not built STREAM frames if MUX is already subscribed for sending on
lower layer. Indeed, this means that either socket currently encountered
a transient error or congestion window is full.
This change is an optimization which prevents to allocate and release a
series of STREAM frames for nothing under congestion.
Note that nothing is done for other frames (flow-control, RESET_STREAM
and STOP_SENDING). Indeed, these frames are not restricted by flow
control. However, this means that they will be allocated for nothing if
send is blocked on a transient error.
This should be backported up to 2.7.
Add traces for when an upper layer stream is woken up by the MUX. This
should help to diagnose frozen stream issues.
This should be backported up to 2.7.
When detach is conducted by stream endpoint layer, a stream is either
freed or just flagged as detached if the transfer is not yet finished.
In the latter case, the stream will be finally freed via
qc_purge_streams() which is called periodically.
A subscribe was done on quic-conn layer if a stream cannot be freed via
qc_purge_streams() as this means FIN STREAM has not yet been sent.
However, this is unnecessary as either HTX EOM was not yet received and
we are waiting for the upper layer, or FIN stream is still in the buffer
but was not yet transmitted due to an incomplete transfer, in which case
a subscribe should have already been done.
This should be backported up to 2.7.
MUX uses qc_send_mux() function to send frames list over a QUIC
connection. On network congestion, the lower layer will reject some
frames and it is the MUX responsibility to free them. There is another
category of error which are when the sendto() fails. In this case, the
lower layer will free the packet and its attached frames and the MUX
should not touch them.
This model was violated by MUX layer for RESET_STREAM and STOP_SENDING
emission. In this case, frames were freed every time by the MUX on
error. This causes a double free error which lead to a crash.
Fix this by always ensuring if frames were rejected by the lower layer
before freeing them on the MUX. This is done simply by checking if frame
list is not empty, as RESET_STREAM and STOP_SENDING are sent
individually.
This bug was never reproduced in production. Thus, it is labelled as
MINOR.
This must be backported up to 2.7.
Since recent modification of MUX error processing, shutw operation was
skipped for a connection reported as on error. However, this can caused
the stream layer to not be notified about error. The impact of this bug
is unknown but it may lead to stream never closed.
To fix this, simply skip over send operations when connection is on
error while keep notifying the stream layer.
This should be backported up to 2.7.
A recent series of commit have been introduced to rework error
generation on QUIC MUX side. Now, all MUX/APP functions uses
qcc_set_error() to set the flag QC_CF_ERRL on error. Then, this flag is
converted to QC_CF_ERRL_DONE with a CONNECTION_CLOSE emission by
qc_send().
This has the advantage of centralizing the CONNECTION_CLOSE generation
in one place and reduces the link between MUX and quic-conn layer.
However, we must now ensure that every qcc_set_error() call is followed
by a QUIC MUX tasklet to invoke qc_send(). This was not the case, thus
when there is no active transfer, no CONNECTION_CLOSE frame is emitted
and the connection remains opened.
To fix this, add a tasklet_wakeup() directly in qcc_set_error(). This is
a brute force solution as this may be unneeded when already in the MUX
tasklet context. However, it is the simplest solution as it is too
tedious for the moment to list all qcc_set_error() invocation outside of
the tasklet.
This must be backported up to 2.7.
A recent series of patch were introduced to streamline error generation
by QUIC MUX. However, a regression was introduced : every error
generated by the MUX was built as CONNECTION_CLOSE_APP frame, whereas it
should be only for H3/QPACK errors.
Fix this by adding an argument <app> in qcc_set_error. When false, a
standard CONNECTION_CLOSE is used as error.
This bug was detected by QUIC tracker with the following tests
"stop_sending" and "server_flow_control" which requires a
CONNECTION_CLOSE frame.
This must be backported up to 2.7.
With the change for QUIC MUX local error API, the new flag QC_CF_ERRL is
now checked on qc_detach(). If set, qcs instance is freed even though
transfer is not finished. This should help to quickly release qcs and
eventually all MUX instance resources.
To further accelerate this, a specific check has been added in
qc_shutw(). It is skipped if local error flag is set to prevent noisy
reset stream invocation. In the same way, QUIC MUX is not rescheduled on
qc_recv_buf() operation if local error flag set.
This should be backported up to 2.7.
If an error a detected at the MUX layer, all remaining stream endpoints
should be closed asap with error set. This is now done by checking for
QC_CF_ERRL flag on qc_wake_some_streams() and qc_send_buf(). To complete
this, qc_wake_some_streams() is called by qc_process() if needed.
This should help to quickly release streams as soon as a new error is
detected locally by the MUX or APP layer. This allows to in turn free
the MUX instance itself. Previously, error would not have been
automatically reported until the transport layer closure would occur
on CONNECTION_CLOSE emission.
This should be backported up to 2.7.
When a fatal error is detected by the QUIC MUX or H3 layer, the
connection should be closed with a CONNECTION_CLOSE with an error code
as the reason.
Previously, a direct call was used to the quic_conn layer to try to
close the connection. This API was adjusted to be more flexible. Now,
when an error is detected, the function qcc_set_error() is called. This
set the flag QC_CF_ERRL with the error code stored by the MUX. The
connection will be closed soon so most of the operations are not
conducted anymore. Connection is then finally closed during qc_send()
via quic_conn layer if QC_CF_ERRL is set. This will set the flag
QC_CF_ERRL_DONE which indicates that the MUX instance can be freed.
This model is cleaner and brings the following improvments :
- interaction with quic_conn layer for closure is centralized on a
single function
- CO_FL_ERROR is not set anymore. This was incorrect as this should be
reserved to errors reported by the transport layer to be similar with
other haproxy components. As a consequence, qcc_is_dead() has been
adjusted to check for QC_CF_ERRL_DONE to release the MUX instance.
This should be backported up to 2.7.
When HTX content is transferred from qcs instance to upper stream
endpoint, a wakeup is conducted for MUX tasklet. However, this is only
necessary if demux was interrupted due to a full QCS HTX buffer.
This should be backported up to 2.7.
Add a dedicated trace event QMUX_EV_QCC_ERR. This is used for locally
detected error when a CONNECTION_CLOSE should be emitted.
This should be backported up to 2.7.
When MUX performs a graceful shutdown, quic_conn error code is set to a
"no error" code which depends on the application layer used. However,
this may overwrite a previous error code if quic_conn layer has detected
an error on its side.
In practice, this behavior has not been seen on production. In fact, it
may have undesirable effect only if this error code modification happens
between the quic_conn error detection and the emission of the
CONNECTION_CLOSE, so it should be pretty rare. However, there is still a
tiny possibility it may happen.
To prevent this, first check that quic_conn error code is not set before
setting it. Ideally, transport layer API should be adjusted to be able
to set this without fiddling with the quic_conn directly.
This should be backported up to 2.6.
Before this patch, global sending rate was measured on the QUIC lower
layer just after sendto(). This meant that all QUIC frames were
accounted for, including non STREAM frames and also retransmission.
To have a better reflection of the application data transferred, move
the incrementation into the MUX layer. This allows to account only for
STREAM frames payload on their first emission.
This should be backported up to 2.6.
Sometimes it may be necessary to send an empty STREAM frame to signal
clean stream closure with FIN bit set. Prior to this change, a Tx buffer
was allocated unconditionnally even if no data is transferred.
Most of the times, allocation was not performed due to an older buffer
reused. But if data were already acknowledge, a new buffer is allocated.
No memory leak occurs as the buffer is properly released when the empty
frame acknowledge is received. But this allocation is unnecessary and it
consumes a connexion Tx buffer for nothing.
Improve this by skipping buffer allocation if no data to transfer.
qcs_build_stream_frm() is now able to deal with a NULL out argument.
This should be backported up to 2.6.
Previous patch fixes an issue occurring with empty STREAM frames without
payload. The crash was hidden in part because buf/data fields of
qf_stream were set even if no payload is referenced. This was not the
true cause of the crash but to ease future debugging, a STREAM frame
built with no payload now has its buf and data fields set to NULL.
This should be backported up to 2.6.
Since the following mentioned patch, a send-list mechanism was
implemented to improve streams priorization on sending.
commit 20f2a425ff
MAJOR: mux-quic: rework stream sending priorization
This is done to prevent the same streams to always be used as first ones
on emission. However there is still a flaw on the algorithm. Once put in
the send-list, a streams is not removed until it has sent all of its
content. When a stream transfers a large object, it will remain in the
send-list during all the transfer and will soon monopolize the first
place. the stream does never leave its position until the transfer is
finished and will monopolize the first place. Other streams behind won't
have the opportunity to advance on their own transfers due to a Tx
buffer exhaustion.
This situation is especially problematic if a small timeout client is
used. As some streams won't advance on their transfer for a long period
of time, they will be aborted due to a stream layer timeout client
causing a RESET_STREAM emission.
To fix this, during sending each stream with at least some bytes
transferred from its tx.buf to qc_stream_desc out buffer is put at the
end of the send-list. This ensures that on the next iteration streams
that cannot transfer anything will be used in priority.
This patch improves significantly h2load benchmarks for large objects
with several streams opened in parallel on a single connection. Without
it, errors may be reported by h2load for aborted streams. For example,
this improved the following scenario on a 10mbit/s link with a 10s
timeout client :
$ ./build/bin/h2load --npn-list h3 -t 1 -c 1 -m 30 -n 30 https://198.18.10.11:20443/?s=500k
This fix may help with the github issue #2004 where chrome browser stop
to use QUIC after receiving RESET_STREAM frames.
This should be backported up to 2.7.
Some HTX responses may not always contain a EOM block. For example this
is the case if content-length header is missing from the HTTP server
response. Stream termination is thus signaled to QUIC mux via shutw
callback. However, this is interpreted inconditionnally as an early
close by the mux with a RESET_STREAM emission. Most of the times, QUIC
clients report this as an error.
To fix this, check if htx.extra is set to HTX_UNKOWN_PAYLOAD_LENGTH for
a qcs instance. If true, shutw will never be used to emit a
RESET_STREAM. Instead, the stream will be closed properly with a FIN
STREAM frame. If all data were already transfered, an empty STREAM frame
is sent.
This fix may help with the github issue #2004 where chrome browser stop
to use QUIC after receiving RESET_STREAM frames.
This issue was reported by Vladimir Zakharychev. Thanks to him for his
help and testing. It was also reproduced locally using httpterm with the
query string "/?s=1k&b=0&C=1".
This should be backported up to 2.7.