Every once in a while, GCC reports issues with qc_release_lost_pkts()
function. It seems that its static analysis is foiled by the code
structuring. The latest warning reports the following issue :
CC src/quic_loss.o
src/quic_loss.c: In function ‘qc_release_lost_pkts’:
src/quic_loss.c:313:58: error: potential null pointer dereference [-Werror=null-dereference]
313 | unsigned int period = newest_lost->time_sent_ms - oldest_lost->time_sent_ms;
| ~~~~~~~~~~~^~~~~~~~~~~~~~
To fix definitely this, change slightly the code. <oldest_lost> and
<newest_lost> are now initialized on the first list entry outside of the
loop. This is enough to guarantee to GCC that they cannot be NULL for
the remainder of the function.
Some new compilers warn that <oldest_lost> variable can be null even this cannot be
the case as mentioned by the comment about an already present ASSUME_NONNULL()
call comment as follows:
src/quic_loss.c: In function ‘qc_release_lost_pkts’:
src/quic_loss.c:307:86: error: potential null pointer dereference [-Werror=null-dereference]
307 | unsigned int period = newest_lost->time_sent_ms - oldest_lost->time_sent_ms;
| ~~~~~~~~~~~^~~~~~~~~~~~~~
Move up this ASSUME_NONNULL() statement to please these compiler.
Must be backported as far as 2.6 to easy any further backport around this code part.
There were 4 instances of ALREADY_CHECKED() used to tell the compiler that
the argument couldn't be NULL by design. Let's change them to the cleaner
ASSUME_NONNULL(). Functions like qc_snd_buf() were slightly reduced in
size (-24 bytes).
Apparently gcc-13 sees a potential case that others don't see, and it's
likely a bug since depending what is masked, it will completely change
the output warnings to the point of contradicting itself. After many
attempts, it appears that just checking that CMSG_FIRSTHDR(msg) is not
null suffices to calm it down, so the strange warnings might have been
the result of an overoptimization based on a supposed UB in the first
place. At least now all versions up to 13.2 as well as clang are happy.
This patch fixes the loss of information when computing the delivery rate
(quic_cc_drs.c) on links with very low latency due to usage of 32bits
variables with the millisecond as precision.
Initialize the quic_conn task with TASK_F_WANTS_TIME flag ask it to ask
the scheduler to update the call date of this task. This allows this task to get
a nanosecond resolution on the call date calling task_mono_time(). This is enabled
only for congestion control algorithms with delivery rate estimation support
(BBR only at this time).
Store the send date with nanosecond precision of each TX packet into
->time_sent_ns new quic_tx_packet struct member to store the date a packet was
sent in nanoseconds thanks to task_mono_time().
Make use of this new timestamp by the delivery rate estimation algorithm (quic_cc_drs.c).
Rename current ->time_sent member from quic_tx_packet struct to ->time_sent_ms to
distinguish the unit used by this variable (millisecond) and update the code which
uses this variable. The logic found in quic_loss.c is not modified at all.
Must be backported to 3.1.
The per-packet delivery rate sample is applied to ack-eliciting packet only
calling ->drs_on_transmit() BBR callback. So, ->on_pkt_lost() which inspects the
delivery rate sampling information during packet loss detection must not be
called for non ack-eliciting packet. If not, it would be facing with non
initialized variables with big chance to trigger a BUG_ON().
As BBR is implemented in the current developement version, there is
no need to backport this patch.
qc_packet_loss_lookup() aim is to detect the packet losses. This is this function
which must called ->on_pkt_lost() BBR specific callback. It also set
<bytes_lost> passed parameter to the total number of bytes detected as lost upon
an ACK frame receipt for its caller.
Modify qc_release_lost_pkts() to call ->congestion_event() with the send time
from the newest packet detected as lost.
Modify qc_release_lost_pkts() to call ->slow_start() callback only if define
by the congestion control algorithm. This is not the case for BBR.
Upon loss detection, qc_release_lost_pkts() notifies congestion
controllers about the event and its final time. However it does not
pass the number of lost packets, that can provide useful hints for
some controllers. Let's just pass this option.
A packet is considered as reordered when it is detected as lost because its packet
number is above the largest acknowledeged packet number by at least the
packet reordering threshold value.
Add ->nb_reordered_pkt new quic_loss struct member at the same location that
the number of lost packets to count such packets.
Should be backported to 2.6.
Let's say that the largest packet number acknowledged by the peer is #10, when inspecting
the non already acknowledged packets to detect if they are lost or not, this is the
case a least if the difference between this largest packet number and and their
packet numbers are bigger or equal to the packet reordering threshold as defined
by the RFC 9002. This latter must not be less than QUIC_LOSS_PACKET_THRESHOLD(3).
Which such a value, packets #7 and oldest are detected as lost if non acknowledged,
contrary to packet number #8 or #9.
So, the packet loss detection is very sensitive to such a network characteristic
where non acknowledged packets are distant from each others by their packet number
differences.
Do not use this static value anymore for the packet reordering threshold which is used
as a criteria to detect packet loss. In place, make it depend on the difference
between the number of the last transmitted packet and the number of the oldest
one among the packet which are still in flight before being inspected to be
deemed as lost.
Add new tune.quic.reorder-ratio setting to apply a ratio in percent to this
dynamic packet reorder threshold.
Should be backported to 2.6.
Move all QUIC trace definitions from quic_conn.h to quic_trace-t.h. Also
remove multiple definition trace_quic macro definition into
quic_trace.h. This forces all QUIC source files who relies on trace to
include it while reducing the size of quic_conn.h.
Due to the fact that several variable values (rtt_var, srtt) were stored as multiple
of their real values, some calculations were less accurate as expected.
Stop storing 4*rtt_var values, and 8*srtt values.
Adjust all the impacted statements.
Must be backported as far as 2.6.
Add a pool to dynamically handle the memory used for the QUIC TLS packet number spaces.
Remove the static array of packet number spaces at QUIC connection level (struct
quic_conn) and add three new members to quic_conn struc as pointers to quic_pktns
struct, one by packet number space as follows:
->ipktns for Initial packet number space,
->hpktns for Handshake packet number space and
->apktns for Application packet number space.
Also add a ->pktns_list new member (struct list) to quic_conn struct to attach
the list of the packet number spaces allocated for the QUIC connection.
Implement ssl_to_quic_pktns() to map and retrieve the addresses of these pointers
from TLS stack encryption levels.
Modify quic_pktns_init() to initialize these members.
Modify ha_quic_set_encryption_secrets() and ha_quic_add_handshake_data() to
allocate the packet numbers and initialize the encryption level.
Implement quic_pktns_release() which takes pointers to pointers to packet number
space objects to release the memory allocated for a packet number space attached
to a QUIC connection and reset their address values.
Modify qc_new_conn() to allocation only the Initial packet number space and
Initial encryption level.
Modify QUIC loss detection API (quic_loss.c) to use the new ->pktns_list
list attached to a QUIC connection in place of a static array of packet number
spaces.
Replace at several locations the use of elements of an array of packet number
spaces by one of the three pointers to packet number spaces
Replace a "less than" comparison between two tick variable by a call to tick_is_lt()
in quic_loss_pktns(). This bug could lead to a wrong packet loss detection
when the loss time computed values could wrap. This is the case 20 seconds after
haproxy has started.
Must be backported as far as 2.6.
Add some statistical counters to quic_conn struct from quic_counters struct which
are used at listener level to handle them at QUIC connection level. This avoid
calling atomic functions. Furthermore this will be useful soon when a counter will
be added for the total number of packets which have been sent which will be very
often incremented.
Some counters were not added, espcially those which count the number of QUIC errors
by QUIC error types. Indeed such counters would be incremented most of the time
only one time at QUIC connection level.
Implement quic_conn_prx_cntrs_update() which accumulates the QUIC connection level
statistical counters to the listener level statistical counters.
Must be backported to 2.7.
qc_set_timer() function is used to rearm the timer for loss detection
and probing. Previously, timer was always rearm when congestion window
was free due to a wrong interpretation of the RFC which mandates the
client to rearm the timer before handshake completion to avoid a
deadlock related to anti-amplification.
Fix this by removing this code from quic_pto_pktns(). This allows
qc_set_timer() to be reentrant and only activate the timer if needed.
The impact of this bug seems limited. It can probably caused the timer
task to be processed too frequently which could caused too frequent
probing.
This change will allow to reuse easily qc_set_timer() after quic_conn
thread migration. As such, the new timer task will be scheduled only if
needed.
This should be backported up to 2.6.
Add the number of packet losts and the maximum congestion control window computed
by the algorithms to "show quic".
Same thing for the traces of existent congestion control algorithms.
Must be backported to 2.7 and 2.6.
As timestamps based on now_ms values are used to compute the probing timeout,
they may wrap. So, use ticks API to compared them.
Must be backported to 2.7 and 2.6.
In ->srtt quic_loss struct this is 8*srtt which is stored so that not to have to multiply/devide
it to compute the RTT variance (at least). This is where there was a bug in quic_loss_srtt_update():
each time ->srtt must be used, it must be devided by 8 or right shifted by 3.
This bug had a very bad impact for network with non negligeable packet loss.
Must be backported to 2.6 and 2.7.
Dump variables displayed by TRACE_ENTER() or TRACE_LEAVE() by calls to TRACE_PROTO().
No more variables are displayed by the two former macros. For now on, these information
are accessible from proto level.
Add new calls to TRACE_PROTO() at important locations in relation whith QUIC transport
protocol.
When relevant, try to prefix such traces with TX or RX keyword to identify the
concerned subpart (transmission or reception) of the protocol.
Must be backported to 2.7.
During the handshake and when the handshake has not been confirmed
the acknowledgement delays reported by the peer may be larger
than max_ack_delay. max_ack_delay SHOULD be ignored before the
handshake is completed when computing the PTO. But the current code considered
the wrong condition "before the hanshake is completed".
Replace the enum value QUIC_HS_ST_COMPLETED by QUIC_HS_ST_CONFIRMED to
fix this issue. In quic_loss.c, the parameter passed to quic_pto_pktns()
is renamed to avoid any possible confusion.
Must be backported to 2.7 and 2.6.
Implement quic_tls_secrets_keys_alloc()/quic_tls_secrets_keys_free() to allocate
the memory for only one direction (RX or TX).
Modify ha_quic_set_encryption_secrets() to call these functions for one of this
direction (or both). So, for now on we can rely on the value of the secret keys
to know if it was derived.
Remove QUIC_FL_TLS_SECRETS_SET flag which is no more useful.
Consequently, the secrets are dumped by the traces only if derived.
Must be backported to 2.6.
xprt_quic module was too large and did not reflect the true architecture
by contrast to the other protocols in haproxy.
Extract code related to XPRT layer and keep it under xprt_quic module.
This code should only contains a simple API to communicate between QUIC
lower layer and connection/MUX.
The vast majority of the code has been moved into a new module named
quic_conn. This module is responsible to the implementation of QUIC
lower layer. Conceptually, it overlaps with TCP kernel implementation
when comparing QUIC and HTTP1/2 stacks of haproxy.
This should be backported up to 2.6.
Clean up quic sources by adjusting headers list included depending
on the actual dependency of each source file.
On some occasion, xprt_quic.h was removed from included list. This is
useful to help reducing the dependency on this single file and cleaning
up QUIC haproxy architecture.
This should be backported up to 2.6.
Add a least as much as possible TRACE_ENTER() and TRACE_LEAVE() calls
to any function. Note that some functions do not have any access to the
a quic_conn argument when receiving or parsing datagram at very low level.
As the TX packets are ordered by their packet number and always sent
in the same order. their TX timestamps are inspected from the older to
the newer values when we look for the packet loss. So we can stop
this search as soon as we found the first packet which has not been lost.
Must be backported to 2.6
<loss_time_limit> is the loss time limit computed from <time_sent> packet
transmission timestamps in qc_packet_loss_lookup() to identify the packets which
have been lost. This latter timestamp variable was used in place of
<loss_time_limit> to distinguish such packets from others (still in fly packets).
Must be backported to 2.6
Add new counters to count the number of dropped packet upon parsing error, lost
sent packets and the number of stateless reset packet sent.
Take the oppportunity of this patch to rename CONN_OPENINGS to QUIC_ST_HALF_OPEN_CONN
(total number of half open connections) and QUIC_ST_HDSHK_FAILS to QUIC_ST_HDSHK_FAIL.
It seems this multiplier ended up in oblivion. Indeed a multiplier must be
applied to the loss delay expressed as an RTT multiplier: 9/8.
So, some packets were detected as lost too soon, leading to be retransmitted too
early!
Move all inline functions with trace from quic_loss.h to a dedicated
object file. This let to remove the TRACE_SOURCE macro definition
outside of the include file.
This change is required to be able to define another TRACE_SOUCE inside
the mux_quic.c for a dedicated trace module.