haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-12-06 02:01:01 +01:00

Author	SHA1	Message	Date
Frederic Lecaille	3f60891360	MEDIUM: quic-be: qc_send_mux() adaptation for 0-RTT When entering this function, a selection is done about the encryption level to be used to send data. For a client, the early data encryption level is used to send 0-RTT if this encryption level is initialized. The Initial encryption level is also registered to the send list for clients if there is Initial crypto data to send. This allows Initial and 0-RTT packets to be coalesced into the same datagrams.	2025-11-13 14:04:31 +01:00
Frederic Lecaille	a4bbbc75db	MINOR: quic-be: Send post handshake frames from list of frames (0-RTT) This patch is required to make 0-RTT work. It modifies the prototype of quic_build_post_handshake_frames() to send post handshake frames from a list of frames in place of the application encryption level (used as <qc->ael> local variable). This patch does not modify at all the current QUIC stack behavior (even for QUIC frontends). It must be considered as a preparation for the code to come about 0-RTT support for QUIC backends.	2025-11-13 14:04:31 +01:00
Frederic Lecaille	ac1d3eba88	MINOR: quic-be: allow the preparation of 0-RTT packets A QUIC server never sends 0-RTT packets contrary to the client. This very simple modification allows the preparation of 0-RTT packets with early data as encryption level (->eel).	2025-11-13 14:04:31 +01:00
Frederic Lecaille	80070fe51c	MEDIUM: quic-be: Parse, store and reuse tokens provided by NEW_TOKEN Add a per thread ist struct to srv_per_thread struct to store the QUIC token to be reused for subsequent sessions. Parse these tokens at packet level (from qc_parse_pkt_frms()) and store them by calling the newly implemented qc_try_store_new_token() function. This new function does its best (it may fail) to update the tokens. Modify qc_do_build_pkt() to resend these tokens by calling quic_enc_token(), implemented by this patch.	2025-11-13 14:04:31 +01:00
Amaury Denoyelle	b9809fe0d0	MINOR: quic: remove <mux_state> field This patch removes <mux_state> field from quic_conn structure. The purpose of this field was to indicate if MUX layer above quic_conn is not yet initialized, active, or already released. It became tedious to properly set it as initialization order of the various quic_conn/conn/MUX layers now differs between the frontend and backend sides, and also depending on whether 0-RTT is used or not. Recently, a new change introduced in connect_server() will allow QUIC MUX to be initialized earlier if ALPN is cached on the server structure. This adds another level of complexity. Thus, this patch removes <mux_state> field completely. Instead, a new flag QUIC_FL_CONN_XPRT_CLOSED is defined. It is set at a single place only, on close XPRT callback invocation. It can be mixed with the new utility functions qc_wait_for_conn()/qc_is_conn_ready() to determine the status of conn/MUX layers now without an extra quic_conn field.	2025-11-05 14:03:34 +01:00
Amaury Denoyelle	9bfe9b9e21	MINOR: quic: split Tx options for FE/BE usage This patch is similar to the previous one, except that it is focused on Tx QUIC settings. It is now possible to toggle GSO and pacing on frontend and backend sides independently. As with previous patch, options are renamed to use "fe/be" unified prefixes. This is part of the current series of commits which unify QUIC settings. Older options are deprecated and will be removed on 3.5 release.	2025-10-23 16:49:20 +02:00
Amaury Denoyelle	33a8cb87a9	MINOR: quic: split congestion controller options for FE/BE usage Various settings can be configured related to QUIC congestion controller. This patch duplicates them to be able to set independent values on frontend and backend sides. As with previous patch, options are renamed to use "fe/be" unified prefixes. This is part of the current series of commits which unify QUIC settings. Older options are deprecated and will be removed on 3.5 release.	2025-10-23 16:49:20 +02:00
Amaury Denoyelle	5b04a85bc7	TESTS: quic: fix uninit of quic_cc_path const member Fix quic_tx unittest module by adding an explicit define for <mtu> const member of quic_cc_path. This should fix coverity report from github issue #3162. This can be backported up to 3.2.	2025-10-17 09:29:01 +02:00
Frederic Lecaille	47bb15ca84	MINOR: quic: get rid of ->target quic_conn struct member The ->li (struct listener *) member of quic_conn struct was replaced by a ->target (struct obj_type *) member by this commit: MINOR: quic-be: get rid of ->li quic_conn member to abstract the connection type (front or back) when implementing QUIC for the backends. In these cases, ->target was a pointer to the obj_type of a server struct. This could not work with the dynamic servers contrary to the listeners which are not dynamic. This patch almost reverts the one mentioned above. ->target pointer to obj_type member is replaced by ->li pointer to listener struct member. As the listeners are not dynamic, this is easy to do. All one has to do is to replace the objt_listener(qc->target) statement by qc->li where applicable. For the backend connection, when needed, this is always qc->conn->target which is used only when qc->conn is initialized. The only "problematic" case is for quic_dgram_parse() which takes a pointer to an obj_type as third argument. But this obj_type is only used to call quic_rx_pkt_parse(). Inside this function it is used to access the proxy counters of the connection thanks to qc_counters(). So, this obj_type argument may be null from now on with this patch. This is the reason why qc_counters() is modified to take this into consideration.	2025-09-11 09:51:28 +02:00
Amaury Denoyelle	0b6908385e	BUG/MINOR: quic: properly support GSO on backend side Previously, GSO emission was explicitly disabled on backend side. This is not true since the following patch, thus GSO can be used, for example when transferring large POST requests to a HTTP/3 backend. commit e064e5d46171d32097a84b8f84ccc510a5c211db MINOR: quic: duplicate GSO unsupp status from listener to conn However, GSO on the backend side may cause a crash when handling EIO. In this case, GSO must be completely disabled. Previously, this was performed by flagging listener instance. On backend side, this would cause a crash as listener is NULL. This patch fixes it by supporting GSO disable flag for servers. Thus, in qc_send_ppkts(), EIO can be converted either to a listener or server flag depending on the quic_conn proxy side. On backend side, server instance is retrieved via <qc.conn.target>. This is enough to guarantee that server is not deleted. This does not need to be backported.	2025-09-08 16:18:05 +02:00
Amaury Denoyelle	f645cd3c74	MINOR: quic: restore QUIC_HP_SAMPLE_LEN constant The below patch fixes padding emission for small packets, which is required to ensure that header protection removal can be performed by the recipient. commit d7dea408c64c327cab6aebf4ccad93405b675565 BUG/MINOR: quic: too short PADDING frame for too short packets In addition to the proper fix, constant QUIC_HP_SAMPLE_LEN was removed and replaced by QUIC_TLS_TAG_LEN. However, it still makes sense to have a dedicated constant which represents the size of the sample used for header protection. Thus, this patch restores it. Special instructions for backport : above patch mentions that no backport is needed. However, this is incorrect, as bug is introduced by another patch scheduled for backport up to 2.6. Thus, it is first mandatory to schedule d7dea408c64c327cab6aebf4ccad93405b675565 after it. Then, this patch can also be used for the sake of code clarity.	2025-09-08 14:49:03 +02:00
Amaury Denoyelle	c20c71a079	TESTS: quic: add unit-tests for QUIC TX part Define a new "quic_tx" unit-test which is used to test QUIC TX module. For the moment, a single test is performed on qc_do_build_pkt(). It checks that PADDING is correctly added for HP sampling in case of a small packet.	2025-09-08 14:49:03 +02:00
Amaury Denoyelle	fb8c6e2030	CLEANUP: quic: fix typo in quic_tx trace Fix trace in qc_may_build_pkt(). This can be backported up to 3.0.	2025-09-08 14:49:03 +02:00
Frederic Lecaille	d7dea408c6	BUG/MINOR: quic: too short PADDING frame for too short packets This bug arrived with this commit: MINOR: quic: centralize padding for HP sampling on packet building What was missed is the fact that at the centralization point for the PADDING frame to add for too short packets, <len> payload length already includes <*pn_len>, the packet number field length value. So when computing the length of the PADDING frame, the packet number field length must not be considered and added to the payload length (<len>). This bug led to too short PADDING frames in too short packets. This was the case, most of the time, with Application level packets with a 1-byte packet number field followed by a 1-byte PING frame. A 1-byte PADDING frame was added in this case in place of a correct 2-byte PADDING frame. The header protection of such packets could not be removed by the clients, as for instance for ngtcp2 with such traces: I00001828 0x5a135c81e803f092c74bac64a85513b657 pkt could not decrypt packet number As the header protection could not be removed, the header keyupdate bit could also not be read by packet analyzers such as pyshark used during the keyupdate tests. No need to backport.	2025-09-05 16:17:11 +02:00
Frederic Lecaille	71336bdd08	MINOR: quic: add useful trace about padding params values When adding a PADDING frame for too short packets, add a trace about the values of the variables on which this PADDING frame length depends.	2025-09-05 16:17:11 +02:00
Amaury Denoyelle	36d28bfca3	MEDIUM: quic: strengthen BUG_ON() for unpad Initial packet on client To avoid anti-amplification limit, it is required that Initial packets are padded to be at least 1200 bytes long. On server side, this only applies to ack-eliciting packets. However, for client side, this is mandatory for every packet. This patch adjusts qc_txb_store() BUG_ON statement used to catch too small Initial packets. On QUIC client side, ack-eliciting flag is now ignored, thus every packet is checked. This is labelled as MEDIUM as this BUG_ON() is known to be easily triggered, as QUIC datagram encoding functions are complex. However, it's important that a QUIC endpoint respects it, else the peer will drop the invalid packet and could immediately close the connection.	2025-09-02 10:41:49 +02:00
Amaury Denoyelle	209a54d539	BUG/MINOR: quic: pad Initial pkt with CONNECTION_CLOSE on client Currently, when connection is closing, only CONNECTION_CLOSE frame is emitted via qc_prep_pkts()/qc_do_build_pkt(). Also, only the first registered encryption level is considered while the others are dismissed. This results in a single packet datagram. This can cause issues for QUIC client support, as padding is required for every Initial packet, contrary to server side where only ack-eliciting packets are eligible. Thus a client must add padding to a CONNECTION_CLOSE frame on Initial level. This patch adjusts qc_prep_pkts() to ensure such packet will be correctly padded on client side. It sets <final_packet> variable which instructs that if padding is necessary it must be applied immediately on the current encryption level instead of the last one. It could appear as unnecessary to pad a CONNECTION_CLOSE packet, as the peer will enter in draining state when processing it. However, RFC mandates that a client Initial packet too small must be dropped by the server, so there is a risk that the CONNECTION_CLOSE is simply discarded prior to its processing if stored in a too small datagram. No need to backport as this is a QUIC backend issue only.	2025-09-02 10:34:12 +02:00
Amaury Denoyelle	e9b78e3fb1	BUG/MINOR: quic: fix padding issue on INITIAL retransmit On loss detection timer expiration, qc_dgrams_retransmit() is used to reemit lost packets. Different code paths are present depending on the active encryption level. If Initial level is still initialized, retransmit is performed both for Initial and Handshake spaces, by first retrieving the list of lost frames for each of them. Prior to this patch, Handshake level was always registered for emission after Initial, even if it did not have any frame to reemit. In this case, most of the time it would result in a datagram containing Initial reemitted frames packet coalesced with a Handshake packet consisting only of a PADDING frame. This is because padding is only added for the last registered QEL. For QUIC backend support, this may cause issues. This is because contrary to QUIC server side, Initial and Handshake levels keys are not derived simultaneously for a QUIC client. Thus, if the latter keys are unavailable, Handshake packet cannot be encoded on sending, leaving a single Initial packet. However, this is now too late to add PADDING. Thus the resulting datagram is invalid : this triggers the BUG_ON() assert failure located in qc_txb_store(). This patch fixes this by amending qc_dgrams_retransmit(). Now, Handshake level is only registered for emission if there are frames to retransmit, which implies that Handshake keys are already available. Thus, PADDING will now either be added at Initial or Handshake level as expected. Note that this issue should not be present on QUIC frontend, due to Initial and Handshake keys derivation almost simultaneously. However, this should still be backported up to 3.0.	2025-09-02 10:31:32 +02:00
Amaury Denoyelle	34d5bfd23c	BUG/MINOR: quic: fix room check if padding requested qc_prep_pkts() activates padding when building an Initial packet. This ensures that resulting datagram will always be at least 1200 bytes, which is mandatory to prevent deadlock over anti-amplification. Prior to padding activation, a check is performed to ensure that output buffer is big enough for a padded datagram. However, this did not take into account previously built packets which would be coalesced in the same datagram. Thus this patch fixes this comparison check. In theory, prior to this patch, in some cases Initial packets could not be built despite a datagram of the proper size. Currently, this probably never happens as Initial packet is always the first encoded in a datagram, thus there is no coalesced packet prior to it. However, there is no hard requirement on this, so it's better to reflect this in the code. This should be backported up to 2.6.	2025-09-02 10:29:11 +02:00
Amaury Denoyelle	1529ec1a25	MINOR: quic: centralize padding for HP sampling on packet building The below patch has simplified INITIAL padding on emission. Now, qc_prep_pkts() is responsible to activate padding for this case, and there is no more special case in qc_do_build_pkt() needed. commit 8bc339a6ad4702f2c39b2a78aaaff665d85c762b BUG/MAJOR: quic: fix INITIAL padding with probing packet only However, qc_do_build_pkt() may still activate padding on its own, to ensure that a packet is big enough so that header protection decryption can be performed by the peer. HP decryption is performed by extracting a sample from the ciphered packet, starting 4 bytes after PN offset. Sample length is 16 bytes as defined by TLS algos used by QUIC. Thus, a QUIC sender must ensure that the length of packet number plus payload fields is at least 4 bytes. This is enough given that each packet is completed by a 16 bytes AEAD tag which can be part of the HP sample. This patch simplifies qc_do_build_pkt() by centralizing padding for this case in a single location. This is performed at the end of the function after payload is completed. The code is thus simpler. This is not a bug. However, it may be interesting to backport this patch up to 2.6, as qc_do_build_pkt() is a tedious function, in particular when dealing with padding generation, thus it may benefit greatly from simplification.	2025-08-25 08:48:24 +02:00
Amaury Denoyelle	7d554ca629	BUG/MINOR: quic: don't coalesce probing and ACK packet of same type Haproxy QUIC stack suffers from a limitation : it's not possible to emit a packet which contains probing data and an ACK frame in it. Thus, in case qc_do_build_pkt() is invoked with both values set to true, probing has the priority and ACK is ignored. However, this has the undesired side-effect of possibly generating two coalesced packets of the same type in the same datagram : the first one with the probing data and the second with an ACK frame. This is caused by qc_prep_pkts() loop which may call qc_do_build_pkt() multiple times with the same QEL instance. This case is normally used when a full datagram has been built but there is still content to emit on the current encryption level. To fix this, alter qc_prep_pkts() loop : if both probing and ACK are requested, force the datagram to be written after packet encoding. This will result in a datagram containing the packet with probing data as final entry. A new datagram is started for the next packet which can contain the ACK frame. This also has some impact on INITIAL padding. Indeed, if packet must be the last due to probing emission, qc_prep_pkts() will also activate padding to ensure final datagram is at least 1200 bytes long. Note that coalescing two packets of the same type is not invalid according to QUIC RFC. However it could cause issue with some shaky implementations, so it is considered as a bug. This must be backported up to 2.6.	2025-08-22 18:20:42 +02:00
Amaury Denoyelle	8bc339a6ad	BUG/MAJOR: quic: fix INITIAL padding with probing packet only A QUIC datagram that contains an INITIAL packet must be padded to 1200 bytes to prevent any deadlock due to anti-amplification protection. This is implemented by encoding a PADDING frame on the last packet of the datagram if necessary. Previously, qc_prep_pkts() was responsible to activate padding when calling qc_do_build_pkt(), as it knows which packet is the last to encode. However, this has the side-effect of preventing PING emission for probing with no data as this case was handled in an else-if branch after padding. This was fixed by the below commit 217e467e89d15f3c22e11fe144458afbf718c8a8 BUG/MINOR: quic: fix malformed probing packet building Above logic was altered to fix the PING case : padding was set to false explicitly in qc_prep_pkts(). Padding was then added in a specific block dedicated to the PING case in qc_do_build_pkt() itself for INITIAL packets. However, the fix is incorrect if the last QEL used to build a packet is not the initial one and probing is used with PING frame only. In this case, specific block in qc_do_build_pkt() does not add padding. This causes a BUG_ON() crash in qc_txb_store() which catches these packets as irregularly formed. To fix this while also properly handling PING emission, revert to the original padding logic : qc_prep_pkts() is responsible to activate INITIAL padding. To not interfere with PING emission, qc_do_build_pkt() body is adjusted so that PING block is moved up in the function and detached from the padding condition. The main benefit from this patch is that INITIAL padding decision in qc_prep_pkts() is clearer now. Note that padding can also be activated by qc_do_build_pkt(), as packets should be big enough for header protection decipher. However, this case is different from INITIAL padding, so it is not covered by this patch. This should be backported up to 2.6.	2025-08-22 18:12:32 +02:00
Amaury Denoyelle	0376e66112	BUG/MINOR: quic: do not emit probe data if CONNECTION_CLOSE requested If connection closing is activated, qc_prep_pkts() can only build a datagram with a single packet. This is because we consider that only a single CONNECTION_CLOSE frame is relevant at this stage. This is handled both by qc_prep_pkts() which ensures that only a single packet datagram is built and also qc_do_build_pkt() which prevents the invocation of qc_build_frms() if <cc> is set. However, there is an inconsistency for probing. First, qc_prep_pkts() deactivates it if connection closing is requested. But qc_do_build_pkt() may still emit probing frame as it does not check its <probe> argument but rather <pto_probe> QEL field directly. This can result in a packet mixing a PING and a CONNECTION_CLOSE frame, which is useless. Fix this by adjusting qc_do_build_pkt() : closing argument is also checked on PING probing emission. Note that there is still shaky code here as qc_do_build_pkt() should rely only on <probe> argument to ensure this. This should be backported up to 2.6.	2025-08-22 18:06:43 +02:00
Amaury Denoyelle	fc3ad50788	BUG/MEDIUM: quic: reset padding when building GSO datagrams qc_prep_pkts() encodes input data into QUIC packets in a loop into one or several datagrams. It supports GSO which requires building a series of multiple datagrams of the same length. Each packet encoding is performed via a call to qc_do_build_pkt(). This function has an argument to specify if output packet must be completed with a PADDING frame. This option is activated when qc_prep_pkts() encodes the last packet of a datagram with at least one INITIAL packet in it. Padding is reset each time a new datagram is started. However, this was not performed if GSO is used to build the next datagram. This patch fixes it by properly resetting padding in this case also. The impact of this bug is unknown. It may have several effects, one of the most obvious being the insertion of unnecessary padding in packets. It could also potentially trigger an infinite loop in qc_prep_pkts(), although this has never been encountered so far. This must be backported up to 3.1.	2025-08-22 16:22:01 +02:00
Frederic Lecaille	9a22770ac5	BUG/MINOR: quic-be: missing Initial packet number space discarding A QUIC client must discard the Initial packet number space as soon as it first sends a Handshake packet. This patch implements this packet number space discarding, which was missing.	2025-08-21 14:24:31 +02:00
Willy Tarreau	c264ea1679	MEDIUM: tree-wide: replace most DECLARE_POOL with DECLARE_TYPED_POOL This will make the pools size and alignment automatically inherit the type declaration. It was done like this: sed -i -e 's:DECLARE_POOL(\([^,]*,[^,]*,\s*\)sizeof(\([^)]*\))):DECLARE_TYPED_POOL(\1\2):g' $(git grep -lw DECLARE_POOL src addons) sed -i -e 's:DECLARE_STATIC_POOL(\([^,]*,[^,]*,\s*\)sizeof(\([^)]*\))):DECLARE_STATIC_TYPED_POOL(\1\2):g' $(git grep -lw DECLARE_STATIC_POOL src addons) 81 replacements were made. The only remaining ones are those which set their own size without depending on a structure. The few ones with an extra size were manually handled. It also means that the requested alignments are now checked against the type's. Given that none is specified for now, no issue is reported. It was verified with "show pools detailed" that the definitions are exactly the same, and that the binaries are similar.	2025-08-11 19:55:30 +02:00
Amaury Denoyelle	731b52ded9	MINOR: quic: prefer qc_is_back() usage over qc->target Previously quic_conn <target> member was used to determine if quic_conn was used on the frontend (as server) or backend side (as client). A new helper function can now be used to directly check flag QUIC_FL_CONN_IS_BACK. This reduces the dependency between quic_conn and its related listener/server instances.	2025-08-07 16:59:59 +02:00
Amaury Denoyelle	e064e5d461	MINOR: quic: duplicate GSO unsupp status from listener to conn QUIC emission can use GSO to emit multiple datagrams with a single syscall invocation. However, this feature relies on several kernel parameters which are checked on haproxy process startup. Even if these checks report no issue, GSO may still be unavailable due to the underlying network adapter. Thus, if an EIO occurred on sendmsg() with GSO, listener is flagged to mark GSO as unsupported. This allows every other QUIC connection to share the status and avoid using GSO when using this listener. Previously, listener flag was checked for every QUIC emission. This was done using an atomic operation to prevent races. Improve this by duplicating GSO unsupported status at the connection level. This is done on qc_new_conn() and also on thread rebinding if a new listener instance is used. The main benefit from this patch is to reduce the dependency between quic_conn and listener instances.	2025-08-07 16:36:26 +02:00
Frederic Lecaille	14d0f74052	MINOR: quic: Remove pool_head_quic_be_cc_buf pool This patch impacts the QUIC frontends. It reverts this patch MINOR: quic-be: add a "CC connection" backend TX buffer pool which adds <pool_head_quic_be_cc_buf> new pool to allocate CC (connection closed state) TX buffers with bigger object size than the one for <pool_head_quic_cc_buf>. Indeed the QUIC backends must be able to send at least 1200 bytes Initial packets. From now on, both the QUIC frontends and backends use the same pool with MAX(QUIC_INITIAL_IPV6_MTU, QUIC_INITIAL_IPV4_MTU)(1252 bytes) as object size.	2025-07-17 19:33:21 +02:00
Frederic Lecaille	838024e07e	MINOR: quic: Get rid of qc_is_listener() Replace all calls to qc_is_listener() (resp. !qc_is_listener()) by calls to objt_listener() (resp. objt_server()). Remove qc_is_listener() implementation and the QUIC_FL_CONN_LISTENER flag it relied on.	2025-07-16 16:42:21 +02:00
Ilia Shipitsin	0ee3d739b8	CLEANUP: assorted typo fixes in the code, commits and doc Corrected various spelling and phrasing errors to improve clarity and consistency.	2025-07-10 19:49:48 +02:00
Frederic Lecaille	87ada46f38	BUG/MINOR: quic-be: Malformed coalesced Initial packets This bug fix completes this patch which was not sufficient: MINOR: quic-be: Allow sending 1200 bytes Initial datagrams This patch could not allow the build of well formed Initial packets coalesced to other (Handshake) packets. Indeed, the <padding> parameter passed to qc_build_pkt() is deduced from a first value: <padding> value and must be set to 1 for the last encryption level. As a client, the last encryption level is always the Handshake encryption level. But <padding> was always set to 1 for a QUIC client, leading the first Initial packet to be malformed because considered as the second one into the same datagram. So, this patch sets <padding> value passed to qc_build_pkt() to 1 only when there is no last encryption level at all, to allow the build of Initial only packets (not coalesced) or when it has frames to send (coalesced packets). No need to backport.	2025-07-07 14:13:02 +02:00
Frederic Lecaille	194e3bc2d5	MINOR: quic-be: address validation support implementation (RETRY) - Add ->retry_token and ->retry_token_len new quic_conn struct members to store the retry tokens. These objects are allocated by quic_rx_packet_parse() and released by quic_conn_release(). - Add <pool_head_quic_retry_token> new pool for these tokens. - Implement quic_retry_packet_check() to check the integrity tag of these tokens upon RETRY packets receipt. quic_tls_generate_retry_integrity_tag() is called by this new function. It has been modified to pass the address where the tag must be generated - Add <resend> new parameter to quic_pktns_discard(). This function is called to discard the packet number spaces to which the already sent packets and frames are attached. <resend> allows the caller to prevent this function from releasing the in flight TX packets/frames. The frames are requeued to be resent. - Modify quic_rx_pkt_parse() to handle the RETRY packets. What must be done upon such packets receipt is: - store the retry token, - store the new peer SCID as the DCID of the connection. Note that the peer will modify again its SCID. This is why this SCID is also stored as the ODCID which must be matched with the peer retry_source_connection_id transport parameter, - discard the Initial packet number space without flagging it as discarded and prevent retransmissions calling qc_set_timer(), - modify the TLS cryptographic cipher contexts (RX/TX), - wakeup the I/O handler to send new Initial packets asap. - Modify quic_transport_param_decode() to handle the retry_source_connection_id transport parameter as a QUIC client. Then its caller is modified to check this transport parameter matches with the SCID sent by the peer with the RETRY packet.	2025-06-26 09:48:00 +02:00
Frederic Lecaille	8a25fcd36e	MINOR: quic-be: Allow sending 1200 bytes Initial datagrams This easy to understand patch is not intrusive at all and cannot break the QUIC listeners. The QUIC client MUST always pad its datagrams with Initial packets. A "!l" (not a listener) test OR'ed with the existing ones is added to satisfy the condition to allow the build of such datagrams.	2025-06-26 09:48:00 +02:00
Frederic Lecaille	c898b29e64	MINOR: quic: Useless TX buffer size reduction in closing state There is no need to limit the size of the TX buffer to QUIC_MIN_CC_PKTSIZE bytes when the connection is in closing state. After the removal of this useless test, there is still a test which limits the number of bytes to be used from this TX buffer. It limits this number of bytes to the size of the TX buffer itself: if (end > (unsigned char *)b_wrap(buf)) end = (unsigned char *)b_wrap(buf); This is exactly what is needed when the connection is in closing state. Indeed, the size of the TX buffers is limited to reduce the memory usage. The connection only needs to send short datagrams with at most 2 packets with CONNECTION_CLOSE* frames. They are built only one time and backed up into a small TX buffer allocated from a dedicated pool. The size of this TX buffer is QUIC_MAX_CC_BUFSIZE which depends on QUIC_MIN_CC_PKTSIZE: #define QUIC_MIN_CC_PKTSIZE 128 #define QUIC_MAX_CC_BUFSIZE (2 * (QUIC_MIN_CC_PKTSIZE + QUIC_DGRAM_HEADLEN)) This size is smaller than an MTU. This patch should be backported as far as 2.9 to ease further backports to come.	2025-06-26 09:48:00 +02:00
Frederic Lecaille	9cb2acd2f2	MINOR: quic-be: add a "CC connection" backend TX buffer pool A QUIC client must be able to close a connection sending Initial packets. But QUIC client Initial packets must always be at least 1200 bytes long. To reduce the memory use of TX buffers of a connection when in "closing" state, a pool was dedicated for this purpose but with a too small TX buffer size (QUIC_MAX_CC_BUFSIZE). This patch adds a "closing state connection" TX buffer pool with the same role for QUIC backends.	2025-06-26 09:48:00 +02:00
Frederic Lecaille	1e6d8f199c	BUG/MINOR: quic: wrong QUIC_FT_CONNECTION_CLOSE(0x1c) frame encoding This is an old bug which was there since this commit: MINOR: quic: Avoid zeroing frame structures It seems QUIC_FT_CONNECTION_CLOSE was confused with QUIC_FT_CONNECTION_CLOSE_APP which does not include a "frame type" field. This field was not initialized (so with a random value) which prevented the packet from being built because the packet builder assumes that packets with such frames are very short. Must be backported as far as 2.6.	2025-06-26 09:48:00 +02:00
Frederic Lecaille	b9703cf711	MINOR: quic-be: get rid of ->li quic_conn member Replace ->li quic_conn pointer to struct listener member by ->target which is an object type enum and adapt the code. Use __objt_(listener\|server)() where the object type is known. Typically this is where the code is specific to one connection type (frontend/backend). Remove <server> parameter passed to qc_new_conn(). It is redundant with the <target> parameter. GSO is not supported at this time for QUIC backend. qc_prep_pkts() is modified to prevent it from building more than an MTU. As a consequence, this prevents qc_send_ppkts() from using GSO. ssl_clienthello.c code is run only by listeners. This is why __objt_listener() is used in place of ->li.	2025-06-11 18:37:34 +02:00
Ilia Shipitsin	78b849b839	CLEANUP: assorted typo fixes in the code and comments code, comments and doc actually.	2025-04-02 11:12:20 +02:00
Amaury Denoyelle	a71007c088	MINOR: quic: move global tune options into quic_tune A new structure quic_tune has recently been defined. Its purpose is to store global options related to QUIC. Previously, only the tunable to toggle pacing was stored in it. This commit moves several QUIC related tunables from global to quic_tune structure. This better centralizes QUIC configuration options and gives room for future generic options.	2025-03-24 10:01:46 +01:00
Amaury Denoyelle	e2744d23be	MINOR: quic: refactor CRYPTO encoding and splitting This patch is the direct follow-up of the previous one which refactored STREAM frame encoding. Reuse the newly defined quic_strm_frm_fillbuf() and quic_strm_frm_split() functions for CRYPTO frame encoding. The code for CRYPTO and STREAM frames encoding should now be clearer as it is mostly identical.	2025-02-12 15:10:54 +01:00
Amaury Denoyelle	f96af8e463	MINOR: quic: refactor STREAM encoding and splitting CRYPTO and STREAM frames encoding is similar. If payload is too large, frame will be split and only the first payload part will be written in the output QUIC packet. This process is complicated by the presence of a variable-length integer Length field prior to the payload. This commit aims to refactor these operations. Define two functions to simplify the code : * quic_strm_frm_fillbuf() which is used to calculate the optimal frame length of a STREAM/CRYPTO frame with its payload in a buffer * quic_strm_frm_split() which is used to split the frame payload if buffer is too small With this patch, both functions are now implemented for STREAM encoding.	2025-02-12 15:10:03 +01:00
Amaury Denoyelle	731340afbd	MINOR: quic: simplify length calculation for STREAM/CRYPTO frames STREAM and CRYPTO frames have a similar encoding format. In particular, both of them have a variable-length integer Length field just before the frame payload. It is complex to determine the optimal Length value before copying the payload data in the remaining buffer space. As such, helper functions were implemented to calculate this. However, the CRYPTO and STREAM frames encoding implementations were not completely aligned, which makes the code harder to follow. The purpose of this commit is to simplify CRYPTO and STREAM frames encoding. First, a new helper quic_int_cap_length() is defined which is useful to determine the optimal buffer room available if prefixed by a variable-length integer as Length field. Then, processing of both CRYPTO and STREAM frames is now nearly identical, based on this new helper function. Functions max_available_room() and max_stream_data_size() are now unused and are removed.	2025-02-12 11:51:09 +01:00
Amaury Denoyelle	e6a223542a	BUG/MINOR: quic: fix CRYPTO payload size calcul for encoding Function max_stream_data_size() is used to determine the payload length of a CRYPTO frame. It takes into account that the CRYPTO length field is a variable length integer. The implemented calculation was incorrect as it reserved too much space as a frame header. This error is mostly because max_stream_data_size() reuses max_available_room() which also reserves space for a variable length integer. This results in CRYPTO frames 1 to 2 bytes shorter than the maximum achievable value, which in the end produces datagrams shorter than the MTU. Fix max_stream_data_size() implementation. It is now merely a wrapper on max_available_room(). This ensures that CRYPTO frame encoding is now properly optimized to use the MTU available. This should be backported up to 2.6.	2025-02-12 11:51:09 +01:00
Amaury Denoyelle	63747452a3	BUG/MINOR: quic: reserve length field for long header encoding Long header packets have a mandatory Length field, which contains the size of Packet number and payload, encoded as a variable-length integer. Its value can thus only be determined after the payload size is known, which depends on the remaining buffer space after this variable-length field. Packet payloads are encoded in two steps. First, a list of input frames is processed until the packet buffer is full. CRYPTO and STREAM frames payload can be split if needed to fill the buffer. Real encoding is then performed as a second stage operation, first with Length field, then with the selected frames themselves. Before this patch, no space was reserved in the buffer for Length field when attaching the frames to the packet. This could result in an error as the packet payload would be too large for the remaining space. In practice, this issue was rarely encountered, mostly as a side-effect from another issue linked to CRYPTO frame encoding. Indeed, a wrong calculation is performed on CRYPTO splitting, which results in frame payload shorter by a few bytes than expected. This however ensured there would be always enough room for the Length field and payload during encoding. As CRYPTO frames are the only big enough content emitted with a Long header packet, this renders the current issue mostly non reproducible. Fix the original issue by reserving some space for Length field prior to frame payload calculation, using a maximum value based on the remaining room space. Packet length is then reduced if needed when encoding is performed, which ensures there is always enough room for the selected frames. Note that the other issue impacting CRYPTO frame encoding is not yet fixed. This could result in datagrams with Long header packets not completely extended to the full MTU. The issue will be addressed in another patch. This should be backported up to 2.6.	2025-02-12 11:51:09 +01:00
Amaury Denoyelle	4489a61585	MEDIUM: quic: implement credit based pacing Implement a new method for QUIC pacing emission based on credit. This represents the number of packets which can be emitted in a single burst. After emission, decrement from the credit the number of emitted packets. Several emissions can be conducted in the same sequence until the credit is completely decremented. When a new emission sequence is initiated (i.e. under a new QMUX tasklet invocation), credit is refilled according to the delay which occurred between the last and current emission context. The main advantage of this new mechanism is that it allows several emissions to be conducted in the same task context without having to wait between each invocation. Wait is only forced if pacing is expired, which is now equivalent to having a null credit. Furthermore, if the delay between two emission sequences would have been smaller than expected, credit is only partially refilled. This allows emission to restart without having to wait for the whole credit to be available. On the implementation side, a new field <credit> is available in quic_pacer structure. It is automatically decremented on quic_pacing_sent_done() invocation. Also, a new function quic_pacing_reload() must be used by QUIC MUX when a new emission sequence is initiated to refill credit. <next> field from quic_pacer has been removed. For the moment, credit is based on the burst configured via quic-cc-algo keyword, or directly reported by BBR. This should be backported up to 3.1.	2025-01-23 17:40:20 +01:00
Frederic Lecaille	f8b697c19b	BUG/MINOR: improve BBR throughput on very fast links This patch fixes the loss of information when computing the delivery rate (quic_cc_drs.c) on links with very low latency due to usage of 32bits variables with the millisecond as precision. Initialize the quic_conn task with the TASK_F_WANTS_TIME flag to ask the scheduler to update the call date of this task. This allows this task to get a nanosecond resolution on the call date calling task_mono_time(). This is enabled only for congestion control algorithms with delivery rate estimation support (BBR only at this time). Store the send date of each TX packet with nanosecond precision into the new ->time_sent_ns quic_tx_packet struct member, thanks to task_mono_time(). Make use of this new timestamp by the delivery rate estimation algorithm (quic_cc_drs.c). Rename current ->time_sent member from quic_tx_packet struct to ->time_sent_ms to distinguish the unit used by this variable (millisecond) and update the code which uses this variable. The logic found in quic_loss.c is not modified at all. Must be backported to 3.1.	2024-11-28 21:39:05 +01:00
Amaury Denoyelle	2fffd85b97	BUG/MEDIUM: quic: prevent EMSGSIZE with GSO for larger bufsize A UDP datagram cannot be greater than 65535 bytes, as UDP length header field is encoded on 2 bytes. As such, sendmsg() will reject a bigger input with error EMSGSIZE. By default, this does not cause any issue as QUIC datagrams are limited to 1252 bytes and sent individually. However, with GSO support, values bigger than 1252 bytes are specified on sendmsg(). If using a bufsize equal to or greater than 65535, the syscall could reject the input buffer with EMSGSIZE. As this value is not expected, the connection is immediately closed by haproxy and the transfer is interrupted. This bug can easily be reproduced by requesting a large object on loopback interface and using a bufsize of 65535 bytes. In fact, the limit is slightly less than 65535, as extra room is also needed for IP + UDP headers. Fix this by reducing the count of datagrams encoded in a single GSO invocation via qc_prep_pkts(). Previously, it was set to 64 as specified by man 7 udp. However, with 1252-byte datagrams, this is still too many. Reduce it to a value of 52. Input to sendmsg will thus be restricted to at most 65104 bytes if the last datagram is full. If there is still data available for encoding in qc_prep_pkts(), it will be written in a separate batch of datagrams. qc_send_ppkts() will then loop over the whole QUIC Tx buffer and call sendmsg() for each series of at most 52 datagrams. This does not need to be backported.	2024-11-26 11:49:30 +01:00
Frederic Lecaille	96b2641fc8	BUG/MAJOR: quic: fix wrong packet building due to already acked frames If a packet build was asked to probe the peer with frames which have just been acked, the frames build run by qc_build_frms() could be cancelled by qc_stream_frm_is_acked() whose aim is to check that current frames to be built have not been already acknowledged. In this case the packet build run by qc_do_build_pkt() is not interrupted, leading to the build of an empty packet which should be ack-eliciting. This is a bug detected by the BUG_ON() statement in qc_do_build_pkt(): BUG_ON(qel->pktns->tx.pto_probe && !(pkt->flags & QUIC_FL_TX_PACKET_ACK_ELICITING)); Thank you to @Tristan971 for having reported this issue in GH #2709 This is an old bug which must be backported as far as 2.6.	2024-11-25 18:55:45 +01:00
Amaury Denoyelle	044452546e	BUG/MEDIUM: quic: fix sending performance due to qc_prep_pkts() return qc_prep_pkts() is a QUIC transport level function which encodes one or several datagrams in a buffer before sending them. It returns the number of encoded datagrams. This is especially important when pacing is used to limit packet bursts. This datagram accounting was not trivial as qc_prep_pkts() used several code paths depending on the condition of the current encoded packet. Thus, there were several places where the local variable dgram_cnt could have been incremented. This was implemented by the following commit : commit 5cb8f8a6224db96f4386277c41ddae4a29a4130d MINOR: quic: support a max number of built packet per send iteration However, there is a bug due to a missing increment when all frames from the current QEL have been encoded. In this case, the encoding continues in the same datagram to coalesce a future packet. However, if this is the last QEL, the encoding loop will then break. As first_pkt is not NULL, qc_txb_store() is called outside but dgram_cnt is still not incremented. In particular, this causes qc_prep_pkts() to return 0 when there are only small STREAM frames to emit for application QEL. In qc_send(), this is interpreted as a value which prevents further emission for the current invocation. Thus, it may hurt performance, both without and with pacing. To fix this, remove the multiple dgram_cnt increments. Now, it is modified only in a single place which should cover every case, and makes the code easier to validate. The most notable case where the bug is visible is when using cubic with pacing without any burst, with quic-cc-algo cubic(,1). First, transfer bandwidth on average was suboptimal, with significant variation. Worse, it could sometimes fall dramatically for a particular stream without recovering before returning to an expected level on the next one. No need to backport.	2024-11-25 11:21:28 +01:00
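
The arithmetic behind the GSO limit described in commit 2fffd85b97 can be checked with a small sketch. This is not HAProxy code: `gso_max_dgrams()` and `gso_batch_bytes()` are hypothetical helper names, and the constants are the figures quoted in the commit message (65535-byte UDP limit, 1252-byte QUIC datagrams). The header room is passed by the caller; 28 bytes is assumed below for IPv4 (20) plus UDP (8).

```c
#include <assert.h>

/* Figures quoted in the commit message, not HAProxy definitions. */
#define UDP_MAX_LEN     65535u /* largest value of the 16-bit UDP length field */
#define QUIC_DGRAM_SIZE  1252u /* QUIC datagram size quoted in the commit */

/* Hypothetical helper: largest number of <dgram_size>-byte datagrams that
 * fit in one sendmsg() call once <hdr_room> bytes are kept aside for the
 * IP + UDP headers.
 */
static unsigned int gso_max_dgrams(unsigned int dgram_size, unsigned int hdr_room)
{
	assert(dgram_size > 0 && hdr_room <= UDP_MAX_LEN);
	return (UDP_MAX_LEN - hdr_room) / dgram_size;
}

/* Hypothetical helper: bytes handed to one sendmsg() call when all
 * <dgram_cnt> datagrams are full.
 */
static unsigned int gso_batch_bytes(unsigned int dgram_cnt, unsigned int dgram_size)
{
	return dgram_cnt * dgram_size;
}
```

With 28 bytes of header room, `gso_max_dgrams(QUIC_DGRAM_SIZE, 28)` gives 52, and `gso_batch_bytes(52, QUIC_DGRAM_SIZE)` gives 65104, which matches the values in the commit message. The former count of 64 would yield 80128 bytes, well above the UDP limit.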

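Commits 731340afbd, e6a223542a and 63747452a3 all revolve around one problem: the Length field is a QUIC variable-length integer, so its own size depends on the payload length it announces. The following is a minimal sketch of that calculation, assuming the standard QUIC varint sizes (1, 2, 4 or 8 bytes). `varint_size()` and `varint_payload_cap()` are hypothetical names for illustration and are not the implementation of quic_int_cap_length().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to encode <val> as a QUIC variable-length integer,
 * or 0 if <val> does not fit in 62 bits.
 */
static size_t varint_size(uint64_t val)
{
	if (val < (1ULL << 6))
		return 1;
	if (val < (1ULL << 14))
		return 2;
	if (val < (1ULL << 30))
		return 4;
	if (val < (1ULL << 62))
		return 8;
	return 0;
}

/* Hypothetical helper: given <room> bytes left for a Length field followed
 * by its payload, return the largest payload length that still fits.
 * Returns 0 when not even a 1-byte Length field fits.
 */
static uint64_t varint_payload_cap(uint64_t room)
{
	static const struct { uint64_t sz, max; } enc[] = {
		{ 1, (1ULL << 6)  - 1 },
		{ 2, (1ULL << 14) - 1 },
		{ 4, (1ULL << 30) - 1 },
		{ 8, (1ULL << 62) - 1 },
	};
	uint64_t best = 0;
	size_t i;

	for (i = 0; i < sizeof(enc) / sizeof(enc[0]); i++) {
		uint64_t cand;

		if (room < enc[i].sz)
			break;
		/* Payload left after an <sz>-byte Length field, capped to
		 * the largest value such a field can express.
		 */
		cand = room - enc[i].sz;
		if (cand > enc[i].max)
			cand = enc[i].max;
		if (cand > best)
			best = cand;
	}
	/* Postcondition: Length field plus payload fit in <room>. */
	assert(room == 0 || varint_size(best) + best <= room);
	return best;
}
```

The boundary cases show why a naive reservation loses a byte or two: with 65 bytes of room the best payload is still 63 bytes, because a 64-byte payload would need a 2-byte Length field.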

126 Commits