Allocate a connection to connect to QUIC servers from qc_conn_init() which is the
->init() QUIC xprt callback.
Also initialize the ->prepare_srv and ->destroy_srv callbacks as this is done for TCP
servers.
For haproxy QUIC servers (or QUIC clients), the peer is considered validated.
This is a property which is more specific to QUIC servers (haproxy QUIC listeners).
No <odcid> is used for the QUIC client connection. It is used only on the QUIC server side.
The <token_odcid> is also not used on the QUIC client side. It must be embedded into
the transport parameters only on the QUIC server side.
The quic_conn is created before the socket allocation. So, the local address is
zeroed.
Initialize the transport parameters with qc_srv_params_init().
Stop hardcoding the value passed as the <server> parameter to qc_new_isecs() to
correctly initialize the Initial secrets.
Modify qc_alloc_ssl_sock_ctx() to pass the connection object as parameter. It is
NULL for a QUIC listener, not NULL for a QUIC server. This connection object is
set as the value of the ->conn quic_conn struct member. Initialise the SSL session object
from this function for QUIC servers.
qc_ssl_set_quic_transport_params() is also modified to pass the SSL object as parameter.
This is the only parameter this function needs; the <qc> parameter is used only for
the traces.
SSL_do_handshake() must be called as soon as the SSL object is initialized for
the QUIC backend connection. This triggers the TLS CRYPTO data delivery.
tasklet_wakeup() is also called to send these CRYPTO data asap.
Modify the QUIC_EV_CONN_NEW event trace to dump the potential errors returned by
SSL_do_handshake().
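For illustration, a minimal standalone sketch of this handshake kick-off using only
standard OpenSSL calls; the function name and error handling policy here are
assumptions, not the actual haproxy code:

    #include <openssl/ssl.h>

    /* Illustration only: run the first handshake step on a freshly
     * initialized client-side SSL object. Returns 0 on fatal TLS error,
     * 1 otherwise. The caller then wakes its I/O tasklet so the CRYPTO
     * data produced by the stack are emitted asap.
     */
    static int start_quic_client_handshake(SSL *ssl)
    {
        int ret = SSL_do_handshake(ssl);

        if (ret <= 0) {
            int err = SSL_get_error(ssl, ret);

            /* WANT_READ/WANT_WRITE only mean the stack waits for the peer */
            if (err != SSL_ERROR_WANT_READ && err != SSL_ERROR_WANT_WRITE)
                return 0; /* real failure, traced via QUIC_EV_CONN_NEW */
        }
        return 1;
    }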
According to the RFC, a QUIC client must encode the QUIC versions it supports
into the "Available Versions" field of the "Version Information" transport parameter,
ordered by descending preference.
This is done by defining two new variables, <quic_version_2> and <quic_version_draft_29>,
which point to the corresponding elements of the <quic_versions> array.
A client announces its available versions as follows: v1, v2, draft29.
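For illustration, a small standalone sketch of this encoding, assuming the
IANA-registered version numbers; the array and function names are hypothetical:

    #include <stdint.h>
    #include <string.h>
    #include <arpa/inet.h>

    /* Available Versions of the Version Information transport parameter:
     * 32-bit big-endian version numbers in descending preference order.
     */
    static const uint32_t quic_available_versions[] = {
        0x00000001, /* QUIC v1 (RFC 9000) */
        0x6b3343cf, /* QUIC v2 (RFC 9369) */
        0xff00001d, /* draft-29 */
    };

    static size_t encode_available_versions(unsigned char *buf, size_t len)
    {
        size_t i, pos = 0;

        for (i = 0; i < sizeof(quic_available_versions) / sizeof(*quic_available_versions); i++) {
            uint32_t v = htonl(quic_available_versions[i]);

            if (pos + sizeof(v) > len)
                return 0;
            memcpy(buf + pos, &v, sizeof(v));
            pos += sizeof(v);
        }
        return pos;
    }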
The quic-conn layer has to handle STREAM frames itself after MUX release. If
the stream was already seen, it is probably only a retransmitted frame
which can be safely ignored. For other streams, an active closure may be
needed.
Thus it's necessary that the quic-conn layer knows the highest stream ID
already handled by the MUX after its release. Previously, this was done
via the <nb_streams> member array in the quic-conn structure.
Refactor this by replacing <nb_streams> by two members called
<stream_max_uni>/<stream_max_bidi>. Indeed, it is unnecessary for
the quic-conn layer to monitor locally opened uni streams, as the peer
cannot by definition emit a STREAM frame on them. Also, bidirectional
streams are always opened by the remote side.
Previously, <nb_streams> was set by the quic-stream layer. Now, the
<stream_max_uni>/<stream_max_bidi> members are only set once, just
prior to QUIC MUX release. This is sufficient as the quic-conn layer does
not use them while the MUX is available.
Note that previously, IDs were used relative to their type, thus
incremented by 1, after shifting the original value. For simplification,
use the plain stream ID, which is incremented by 4.
Move general function to check if a stream is uni or bidirectional from
QUIC MUX to quic_utils module. This should prevent unnecessary include
of QUIC MUX header file in other sources.
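For reference, a minimal sketch of such helpers following the RFC 9000 stream ID
encoding; the exact names used in quic_utils may differ:

    #include <stdint.h>

    /* RFC 9000: bit 0x1 of a stream ID gives the initiator (0 = client,
     * 1 = server) and bit 0x2 gives the direction (0 = bidi, 1 = uni).
     * Successive stream IDs of the same type thus differ by 4.
     */
    static inline int quic_stream_is_uni(uint64_t id)
    {
        return id & 0x2;
    }

    static inline int quic_stream_is_bidi(uint64_t id)
    {
        return !(id & 0x2);
    }

    static inline int quic_stream_is_server(uint64_t id)
    {
        return id & 0x1;
    }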
The quic_conn struct is modified for two reasons. The first one is to store
the encoded version of the local transport parameters as this is done for
USE_QUIC_OPENSSL_COMPAT. Indeed, the local transport parameters "should remain
valid until after the parameters have been sent" as mentioned by the
SSL_set_quic_tls_cbs(3) manual. In our case, the buffer is a static buffer
attached to the quic_conn object. The qc_ssl_set_quic_transport_params() function,
whose role is to call SSL_set_tls_quic_transport_params() (aliased by
SSL_set_quic_transport_params()), sets these local transport parameters into
the TLS stack from the buffer attached to the quic_conn struct.
The second quic_conn struct modification is the addition of the new ->prot_level
(SSL protection level) member used to store "the most
recent write encryption level set via the OSSL_FUNC_SSL_QUIC_TLS_yield_secret_fn
callback (if it has been called)" as mentioned by the SSL_set_quic_tls_cbs(3) manual.
This patch finally implements the five remaining callbacks to make the haproxy
QUIC implementation work.
OSSL_FUNC_SSL_QUIC_TLS_crypto_send_fn() (ha_quic_ossl_crypto_send) is easy to
implement. It calls ha_quic_add_handshake_data() after having converted the
qc->prot_level TLS protection level value to the corresponding
ssl_encryption_level_t (boringSSL API/quictls) value.
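As an illustration, a possible shape for this conversion, assuming the OpenSSL 3.5
OSSL_RECORD_PROTECTION_LEVEL_* constants and a boringSSL/quictls-style
ssl_encryption_level_t enum are both in scope; this is a sketch, not the actual
helper:

    #include <stdint.h>
    #include <openssl/ssl.h>

    static enum ssl_encryption_level_t prot_level_to_enc_level(uint32_t prot_level)
    {
        switch (prot_level) {
        case OSSL_RECORD_PROTECTION_LEVEL_EARLY:
            return ssl_encryption_early_data;
        case OSSL_RECORD_PROTECTION_LEVEL_HANDSHAKE:
            return ssl_encryption_handshake;
        case OSSL_RECORD_PROTECTION_LEVEL_APPLICATION:
            return ssl_encryption_application;
        case OSSL_RECORD_PROTECTION_LEVEL_NONE:
        default:
            return ssl_encryption_initial;
        }
    }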
OSSL_FUNC_SSL_QUIC_TLS_crypto_recv_rcd_fn() (ha_quic_ossl_crypto_recv_rcd())
provides the non-contiguous addresses to the TLS stack, without releasing
them.
OSSL_FUNC_SSL_QUIC_TLS_crypto_release_rcd_fn() (ha_quic_ossl_crypto_release_rcd())
releases these non-contiguous buffers, relying on the fact that the list of
encryption levels (qc->qel_list) is correctly ordered by the order in which the
TLS stack establishes the SSL protection level secrets.
OSSL_FUNC_SSL_QUIC_TLS_yield_secret_fn() (ha_quic_ossl_got_transport_params())
is a simple wrapper over ha_quic_set_encryption_secrets() which is used
by the boringSSL/quictls API.
OSSL_FUNC_SSL_QUIC_TLS_got_transport_params_fn() (ha_quic_ossl_got_transport_params())
stores the transport parameters received from the peer. It simply calls
quic_transport_params_store() and sets them into the TLS stack by calling
qc_ssl_set_quic_transport_params().
Also add some comments for all the OpenSSL 3.5 QUIC API callbacks.
This patch has no impact on the other uses of the QUIC API provided by the other TLS
stacks.
If there is an alloc failure during qc_new_conn(), cleaning is done via
quic_conn_release(). However, since the below commit, an unchecked
dereferencing of <qc.path> is performed in the latter.
e841164a4402118bd7b2e2dc2b5068f21de5d9d2
MINOR: quic: account for global congestion window
To fix this, simply check <qc.path> before dereferencing it in
quic_conn_release(). This is safe as it is properly initialized to NULL
on qc_new_conn() first stage.
This does not need to be backported.
quic_conn_release() may, or may not, free the tasklet associated with
the connection. So make it return 1 if it was, and 0 otherwise, so that
if it was called from the tasklet handler itself, the said handler can
act accordingly and return NULL if the tasklet was destroyed.
This should be backported if 9240cd4a2771245fae4d0d69ef025104b14bfc23
is backported.
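For illustration, a rough sketch of the resulting calling pattern in a tasklet
handler, using the haproxy task handler signature; field and condition names here
are made up and do not reflect the actual code:

    struct task *quic_conn_app_io_cb(struct task *t, void *context, unsigned int state)
    {
        struct quic_conn *qc = context;
        int must_release = 0;

        /* ... regular I/O processing, possibly setting <must_release> ... */

        if (must_release) {
            /* quic_conn_release() now returns 1 if it freed the tasklet */
            if (quic_conn_release(qc))
                return NULL; /* tell the scheduler the tasklet is gone */
        }
        return t;
    }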
Use the newly defined cshared type to account for the sum of the congestion
windows of every QUIC connection. This value is stored in the global counter
quic_mem_global defined in the proto_quic module.
In quic_conn_app_io_cb, make sure we return NULL if the tasklet has been
destroyed, so that the scheduler knows. It is not yet needed, but will
be soon.
A new structure quic_tune has recently been defined. Its purpose is to
store global options related to QUIC. Previously, only the tunable to
toggle pacing was stored in it.
This commit moves several QUIC related tunables from the global structure to the
quic_tune structure. This better centralizes QUIC configuration options and gives
room for future generic options.
Pacing burst size is now dynamic. As such, the configuration value has been
removed and the related fields in the bind_conf and quic_cc_path structures can
be safely removed.
This should be backported up to 3.1.
Pacing was recently implemented by QUIC MUX. Its tasklet is rescheduled
until the next emission timer is reached. To improve performance, an
alternate execution of qcc_io_cb was performed when rescheduled due to
pacing. This was implemented using the TASK_F_USR1 flag.
However, this model is fragile, in particular when several events
happen alongside pacing scheduling. This has caused some issues
recently, most notably when the MUX is subscribed on the transport layer on
receive for handshake completion while pacing emission is performed in
parallel. The MUX qcc_io_cb() would not execute the default code path, which
means the reception event is silently ignored.
Recent patches have reworked several parts of qcc_io_cb. The objective
was to improve performance with better algorithms on the send and receive
parts. Most notably, the qcc frames list is only cleared when new data is
available for emission. With this, the pacing alternative code is now mostly
unneeded. As such, this patch removes it. The following changes are
performed :
* TASK_F_USR1 is now not used by QUIC MUX. As such, the default tasklet_wakeup()
invocation can now replace the obsolete wrappers
qcc_wakeup/qcc_wakeup_pacing
* qcc_purge_sending is removed. On pacing rescheduling, the whole qcc_io_cb()
is executed. This is less error-prone, in particular when pacing is
mixed with other events like receive handling. This renders the code
less fragile, as it completely solves the issue described above.
This should be backported up to 3.1.
This patch fixes the loss of information when computing the delivery rate
(quic_cc_drs.c) on links with very low latency due to the use of 32-bit
variables with millisecond precision.
Initialize the quic_conn task with the TASK_F_WANTS_TIME flag to ask the
scheduler to update the call date of this task. This allows this task to get
a nanosecond resolution on the call date by calling task_mono_time(). This is
enabled only for congestion control algorithms with delivery rate estimation
support (BBR only at this time).
Store the send date of each TX packet with nanosecond precision into the new
->time_sent_ns quic_tx_packet struct member, thanks to task_mono_time().
Make the delivery rate estimation algorithm (quic_cc_drs.c) use this new timestamp.
Rename the current ->time_sent member of the quic_tx_packet struct to ->time_sent_ms
to distinguish the unit used by this variable (milliseconds) and update the code
which uses it. The logic found in quic_loss.c is not modified at all.
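For illustration, a sketch of the resulting timestamping; the names follow the
description above but remain illustrative, not the actual code:

    /* The quic_conn task is created with TASK_F_WANTS_TIME so the scheduler
     * refreshes its call date; task_mono_time() then returns that date with
     * nanosecond resolution.
     */
    static inline void quic_tx_packet_stamp(struct quic_tx_packet *pkt)
    {
        pkt->time_sent_ns = task_mono_time(); /* nanoseconds, for quic_cc_drs.c */
        pkt->time_sent_ms = now_ms;           /* milliseconds, kept for quic_loss.c */
    }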
Must be backported to 3.1.
The aim of qc_packet_loss_lookup() is to detect packet losses. It is this function
which must call the BBR specific ->on_pkt_lost() callback. It also sets the
<bytes_lost> parameter to the total number of bytes detected as lost upon
an ACK frame receipt, for its caller.
Modify qc_release_lost_pkts() to call ->congestion_event() with the send time
from the newest packet detected as lost.
Modify qc_release_lost_pkts() to call the ->slow_start() callback only if defined
by the congestion control algorithm. This is not the case for BBR.
Define a new QUIC congestion algorithm token 'cubic-pacing' for
quic-cc-algo bind keyword. This is identical to default cubic
implementation, except that pacing is used for STREAM frames emission.
This algorithm supports an extra argument to specify a burst size. This
is stored into a new bind_conf member named quic_pacing_burst which can
be reused to initialize the quic path.
Pacing support is still considered experimental. As such, 'cubic-pacing'
can only be used with expose-experimental-directives set.
QUIC MUX will be responsible for driving emission with pacing. This will be
implemented by setting TASK_F_USR1 before I/O tasklet wakeup. To
prepare this, encapsulate each I/O tasklet wakeup into a new function
qcc_wakeup().
This commit is purely refactoring prior to pacing implementation into
QUIC MUX.
Extend the QUIC transport emission function to support a maximum datagram
argument. The purpose is to ensure that qc_send() won't emit more than
the specified value, unless it is 0 which is considered as unlimited.
In qc_prep_pkts(), a counter of built datagrams has been added to support
this. The packet building loop is interrupted if it reaches a specified
maximum value. Also, its return value has been changed to the number of
prepared datagrams. This is reused by qc_send() to interrupt its work if
the specified max datagram argument value is reached over one or several
iterations of prepared/sent datagrams.
This change is necessary to support pacing emission. Note that ideally,
the total length in bytes of emitted datagrams should be taken into
account instead of the raw number of datagrams. However, for a first
implementation, it was deemed easier to implement it with the latter.
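For illustration, a simplified sketch of the interruption logic; signatures are
reduced to the relevant arguments and do not match the actual ones:

    /* <max_dgrams> == 0 means unlimited */
    static int qc_send_sketch(struct quic_conn *qc, int max_dgrams)
    {
        int total = 0;

        while (1) {
            /* qc_prep_pkts() now returns the number of datagrams it built */
            int prepared = qc_prep_pkts(qc, max_dgrams ? max_dgrams - total : 0);

            if (prepared <= 0)
                break;
            qc_send_ppkts(qc);
            total += prepared;
            if (max_dgrams && total >= max_dgrams)
                break; /* datagram budget reached */
        }
        return total;
    }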
This issue came with this commit:
f627b92 BUG/MEDIUM: quic: always validate sender address on 0-RTT
and could be easily reproduced with the picoquic QUIC client with its -Q option,
which splits a big ClientHello TLS message into two Initial datagrams.
A second condition must be fulfilled to reproduce this issue: picoquic
must not send the token provided by haproxy (NEW_TOKEN). To do that,
haproxy must be patched to prevent it from sending such tokens.
Under these conditions, if haproxy has enough time to reply to the first Initial
datagram, when it receives the second Initial datagram it sends a Retry packet.
Then the client ignores the Retry packet as mentioned by RFC 9000:
17.2.5.2. Handling a Retry Packet
A client MUST accept and process at most one Retry packet for each connection
attempt. After the client has received and processed an Initial or Retry packet
from the server, it MUST discard any subsequent Retry packets that it receives.
On its side, haproxy has closed the connection. When it receives the second
Initial datagram, it opens a new connection but with Initial packets it
cannot decrypt (wrong ODCID), leaving the client without a response.
To fix this, as the aim of the token (NEW_TOKEN) sent by haproxy is to validate
the peer address, instead of closing the connection when no token was received
for a 0-RTT connection, this validation is left to the handshake process.
Indeed, the peer address is validated during the handshake when a valid handshake
packet is received by the listener. But as haproxy must not process
0-RTT data when no token was received, the connection is not accepted before
the successful handshake completion. In addition to this, the 0-RTT packets
are not released after successful handshake completion when no token was received,
to leave haproxy a chance to process these 0-RTT data in such a case (see
quic_conn_io_cb()).
Must be backported as far as 2.9.
Tokens are sent when opening a connection, just after the handshake, to
be possibly reused by the peer for the next connection. They are used
to validate the peer address during the 0RTT connection openings.
But there is no reason to reserve this feature to 0RTT connections.
This patch modifies quic_build_post_handshake_frames() to do so.
This bug came with this commit:
f627b92 BUG/MEDIUM: quic: always validate sender address on 0-RTT
If an error happens in quic_build_post_handshake_frames() during the
code executed for the NEW_TOKEN frame allocation, some frames could leak because
of the wrong label used to interrupt this function asap.
Replace the "goto leave" by "goto err" to deallocate such frames and fix
this issue.
Must be backported as far as 2.9.
Wake up connection layer on QUIC handshake completion via
quic_conn_io_cb. Select SUB_RETRY_RECV as this was previously unused by
QUIC MUX layer.
For the moment, QUIC MUX never subscribes for handshake completion.
However, this will be necessary for features such as the delaying of
early data forwarding via wait-for-handshake.
This patch will be necessary to implement wait-for-handshake support for
QUIC. As such, it must be backported with next commits up to 2.6,
after a mandatory period of observation.
It has been reported by Wedl Michael, a student at the University of Applied
Sciences St. Poelten, a potential vulnerability in haproxy as described below.
An attacker could have obtained a TLS session ticket after having established
a connection to an haproxy QUIC listener, using its real IP address. The
attacker does not even have to send an application level request (HTTP3). Then
the attacker could open a 0-RTT session with a spoofed IP address
trusted by the QUIC listener to bypass IP allow/block lists and send HTTP3 requests.
To mitigate this vulnerability, one decided to use a token which can be provided
to the client each time it successfully managed to connect to haproxy. These
tokens may be reused for future connections to validate the address/path of the
remote peer as this is done with the Retry token which is used for the current
connection, not the next one. Such tokens are transported by NEW_TOKEN frames
which were not used until now by haproxy.
So, each time a client connects to an haproxy QUIC listener with 0-RTT
enabled, it is provided with such a token which can be reused for the
next 0-RTT session. If no such token is presented by the client,
haproxy checks if the session is a 0-RTT one, i.e. with early-data presented
by the client. Contrary to the Retry token, the decision to refuse the
connection is made only when the TLS stack has been provided with
enough early-data from the Initial ClientHello TLS message and when
these data have been accepted. Hopefully, this event arrives fast enough
to allow haproxy to kill the connection if some early-data have been accepted
without a token presented by the client.
quic_build_post_handshake_frames() has been modified to build a NEW_TOKEN
frame with this newly implemented token to be transported inside.
quic_tls_derive_retry_token_secret() was renamed to quic_do_tls_derive_token_secret()
and modified to be reused and derive the secret for the new token implementation.
quic_token_validate() has been implemented to validate both the Retry and
the new token implemented by this patch. When this is a non-retry token
which could not be validated, the datagram received is marked as requiring
a Retry packet to be sent, and no connection is created.
When the Initial packet does not embed any non-retry token and if 0-RTT is enabled,
the connection is marked with this new flag: QUIC_FL_CONN_NO_TOKEN_RCVD. As soon
as the TLS stack detects that some early-data have been provided and accepted by
the client, the connection is marked to be killed (QUIC_FL_CONN_TO_KILL) from
ha_quic_add_handshake_data(). This is done by calling the new
qc_ssl_eary_data_accepted() function. The TLS handshake is interrupted as soon
as possible, returning 0 from ha_quic_add_handshake_data(). The connection is also
marked as requiring a Retry packet to be sent (QUIC_FL_CONN_SEND_RETRY) from
ha_quic_add_handshake_data(). Then the handshake I/O handler (quic_conn_io_cb())
knows how to behave: kill the connection after having sent a Retry packet.
Regarding TLS stack compatibility, this patch is supported by aws-lc. It is
disabled for wolfssl, which does not support 0-RTT at this time, thanks
to HAVE_SSL_0RTT_QUIC.
This patch depends on these commits:
MINOR: quic: Add trace for QUIC_EV_CONN_IO_CB event.
MINOR: quic: Implement qc_ssl_eary_data_accepted().
MINOR: quic: Modify NEW_TOKEN frame structure (qf_new_token struct)
BUG/MINOR: quic: Missing incrementation in NEW_TOKEN frame builder
MINOR: quic: Token for future connections implementation.
MINOR: quic: Implement quic_tls_derive_token_secret().
MINOR: tools: Implement ipaddrcpy().
Must be backported as far as 2.6.
Each QUIC MUX may allocate buffers for MUX stream emission. These
buffers are then shared with quic_conn to handle ACK reception and
retransmission. A limit on the number of concurrent buffers used per
connection has been defined statically and can be updated via a
configuration option. This commit replaces the limit to instead use the
current underlying congestion window size.
The purpose of this change is to remove the artificial static buffer
count limit, which may be difficult to choose. Indeed, if a connection
performs with a minimal loss rate, the buffer count would severely limit
its throughput. It could be increased to fix this, but it would also impact
other connections, even those with less optimal performance, causing too much
extra data buffering on the MUX layer. By using the dynamic congestion
window size, haproxy ensures that MUX buffering corresponds roughly to
the network conditions.
Using QCC <buf_in_flight>, a new buffer can be allocated if it is less
than the current window size. If not, QCS emission is interrupted and
haproxy stream layer will subscribe until a new buffer is ready.
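For illustration, the allocation condition roughly boils down to the following
check; field and function names are illustrative only:

    #include <stdint.h>

    /* A new Tx buffer may be allocated only while the data already buffered
     * for emission stays below the congestion window; otherwise emission is
     * interrupted and the stream layer subscribes for a later retry.
     */
    static inline int qcc_may_alloc_txbuf(const struct qcc *qcc, uint64_t cwnd)
    {
        return qcc->buf_in_flight < cwnd;
    }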
One of the critical parts is to ensure that the MUX layer previously
blocked on buffer allocation is properly woken up when sending can be
retried. This occurs on two occasions :
* after an already used Tx buffer is cleared on ACK reception. This case
is already handled by qcc_notify_buf() via the quic_stream layer.
* on congestion window increase. A new qcc_notify_buf() invocation is
added into qc_notify_send().
Finally, remove <avail_bufs> QCC field which is now unused.
This commit is labelled MAJOR as it may have unexpected effects and could
cause significant behavior changes. For example, in the previous
implementation QUIC MUX would be able to buffer more data even if the
congestion window is small. With this patch, data cannot be transferred
from the stream layer, which may cause more streams to be shut down on
client timeout. Another effect may be more CPU consumption as the
connection limit would be hit more often, causing more streams to be
interrupted and woken up in cycles.
A limit per connection is put on the number of buffers allocated by QUIC
MUX for emission across all its streams. This ensures memory
consumption remains under control. This limit is simply explained as a
count of buffers which can be concurrently allocated for each
connection.
As such, the quic_conn structure was used to account for currently allocated
buffers. However, a quic_conn never allocates new stream buffers. This
is only done at the QUIC MUX layer. As such, this commit moves buffer
accounting inside the QCC structure. This simplifies the API, most notably
qc_stream_buf_alloc() usage.
Note that this commit inverts the accounting. Previously, it was
initially set to 0 and incremented for each allocated buffer. Now, it is
set to the maximum value and decremented for each buffer usage. This is
considered clearer to use.
qc_stream_desc had a field <release> used as a boolean. Convert it with
a new <flags> field and QC_SD_FL_RELEASE value as equivalent.
The purpose of this patch is to be able to extend qc_stream_desc by
adding newer flags values. This patch is required for the following
patch
BUG/MEDIUM: quic: handle retransmit for standalone FIN STREAM
As such, it must be backported prior to it.
Received QUIC packets are stored in quic_conn Rx buffer after header
protection removal in qc_rx_pkt_handle(). These packets are then removed
after quic_conn IO handler via qc_treat_rx_pkts().
If HP cannot be removed, packets are still copied into quic_conn Rx
buffer. This can happen if encryption level TLS keys are not yet
available. The packet remains in the buffer until HP can be removed and
its content processed.
An issue occurs if a client emits a 0-RTT packet but haproxy does not have
the shared secret, for example after a haproxy process restart. In this
case, the packet is copied into the quic_conn Rx buffer but its HP won't ever
be removed. This prevents the buffer from being purged. After some time, if
the client has emitted enough packets, the Rx buffer won't have any space
left and received packets are dropped. This will cause the connection to
freeze.
To fix this, remove any 0-RTT buffered packets on handshake completion.
At this stage, 0-RTT packets are not needed anymore. The client is
expected to reemit their content in 1-RTT packets which are properly
deciphered.
This can easily be reproduced with HTTP/3 POST requests or by retrieving a big
enough object, which will fill the Rx buffer with ACK frames. Here is a
picoquic command to provoke the issue on haproxy startup :
$ picoquicdemo -Q -v 00000001 -a h3 <hostname> 20443 "/?s=1g"
Note that allow-0rtt must be present on the bind line to trigger the
issue. Else haproxy will reject any 0-RTT packets.
This must be backported up to 2.6.
This could be one of the reasons for github issue #2549 but it's unsure
for now.
This bug arrived with this commit:
b068e758f MINOR: quic: simplify rescheduling for handshake
This commit introduced a bad side effect. Haproxy always replied with an ACK-only
datagram when it received the first client Initial packet. Then it handled
the CRYPTO data inside it. And finally, it sent its own CRYPTO data. This broke
the packet coalescing rule whose aim is to optimally build and send as many
QUIC packets as possible per datagram.
To fix this, simply partially revert this commit, to make the low level I/O
task return again if some CRYPTO data were received. This will delay the
acknowledgement, which will be sent with the CRYPTO data in the same
datagram again.
Must be backported to 3.0.
This commit is the renaming counterpart of the previous one, this time
for the quic_conn module. Several elements related to TID affinity update
in quic_conn have been renamed : public functions, but also the flag,
renamed to QUIC_FL_CONN_TID_REBIND, and the trace event, renamed to
QUIC_EV_CONN_BIND_TID.
This should be backported with the same instruction as the previous
commit.
Add <gso_size> parameter to qc_snd_buf(). When non-null, this specifies
the value for the socket option SOL_UDP/UDP_SEGMENT. This allows sending
several datagrams in a single call by splitting data multiple times at
the <gso_size> boundary.
For now, <gso_size> remains set to 0 by caller, as such there should not
be any functional change.
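For reference, a standalone sketch of per-call UDP GSO on Linux via a UDP_SEGMENT
control message; this only illustrates the mechanism, not the actual qc_snd_buf()
code, and assumes a kernel/libc exposing SOL_UDP and UDP_SEGMENT:

    #include <stdint.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>
    #include <netinet/udp.h>   /* SOL_UDP, UDP_SEGMENT */

    /* Send <len> bytes in one sendmsg() call; the kernel splits them into
     * datagrams of <gso_size> bytes each.
     */
    static ssize_t send_gso(int fd, const void *data, size_t len, uint16_t gso_size)
    {
        struct iovec iov = { .iov_base = (void *)data, .iov_len = len };
        union {
            struct cmsghdr hdr;
            char buf[CMSG_SPACE(sizeof(uint16_t))];
        } ctl = { };
        struct msghdr msg = {
            .msg_iov = &iov, .msg_iovlen = 1,
            .msg_control = ctl.buf, .msg_controllen = sizeof(ctl.buf),
        };
        struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);

        cmsg->cmsg_level = SOL_UDP;
        cmsg->cmsg_type = UDP_SEGMENT;
        cmsg->cmsg_len = CMSG_LEN(sizeof(uint16_t));
        memcpy(CMSG_DATA(cmsg), &gso_size, sizeof(uint16_t));

        return sendmsg(fd, &msg, 0);
    }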
SSL status is reported each time quic_conn_io_cb() is finished via a
trace. Change the trace label from "ssl error" to "ssl status". This
allows searching for errors more easily without being distracted by this
trace.
Handshake for quic_conn instances runs on a single non-chosen thread. On
completion, listener_accept() is performed to select the least loaded
thread before initializing the connection instance. As such, the quic_conn
instance is migrated to the thread with its upper connection.
In case the accept queue is full, listener_accept() falls back to local accept
mode, which causes the connection to be assigned to the current thread.
However, this is not supported by QUIC as the quic_conn instance is left on
the previously selected thread. In most cases, this will cause a
BUG_ON() due to a task manipulation from an outside thread.
To fix this, handle quic_conn thread rebind in multiple steps using the
new extended protocol API. Several operations have been moved from
qc_set_tid_affinity1() to newly defined qc_set_tid_affinity2(), in
particular CID TID update. This ensures that quic_conn instance is not
prematurely accessed on the new thread until accept queue push is
guaranteed to succeed.
qc_reset_tid_affinity() is also newly defined to reassign the newly
created tasks and tasklets to the current thread. This is necessary to
prevent the BUG_ON() crash described above.
This must be backported up to 2.8 after a period of observation. Note
that it depends on previous patch :
MINOR: proto: extend connection thread rebind API
MINOR: listener: define callback for accept queue push
Extend the connection thread rebind API by replacing the single callback
set_affinity by three different ones. Each one of them is used at a
different stage of the operation :
* set_affinity1 is used similarly to previous set_affinity
* set_affinity2 is called directly from accept_queue_push_mp() when an
entry has been found in accept ring. This operation cannot fail.
* reset_affinity is called after set_affinity1 in case of failure from
accept_queue_push_mp() due to no space left in accept ring. This is
necessary for protocols which must reconfigure resources before
fallback on the current tid.
This patch does not have any functional changes. However, it will be
required to fix crashes for QUIC connections when accept queue ring is
full. As such, it must be backported with it.
On accept, a quic_conn instance is migrated from its original thread to a
new one. This operation is conducted in two steps, first on the original
thread, then on the new one. During the interval, the quic_conn is artificially
rendered inactive. It must never be accessed nor removed until migration
is completed via qc_finalize_affinity_rebind(). This new BUG_ON() will
enforce that removal is never conducted until migration is completed.
haproxy generates for each QUIC connection a set of CIDs. The peer must
reuse them as DCID for its emitted packets. On datagram reception, the DCID
field serves as an identifier to dispatch them on their correct thread.
These CIDs are stored in a global CID tree. Access to this data
structure must always be protected with CID_LOCK. This commit is a
refactoring to regroup all CID tree accesses in the quic_cid module. Several
code parts are adjusted :
* quic_cid_insert() is extended to check for insertion race-condition.
This is useful on quic_conn instantiation. Code where such race cannot
happen can use unsafe _quic_cid_insert() instead.
* on RETIRE_CONNECTION_ID frame reception, existing quic_cid_delete()
function is used.
* remove tree lookup from qc_check_dcid(), extracted in the new
quic_cmp_cid_conn() function. Ultimately, the latter should be removed
as CID lookup could be conducted on quic_conn owned tree without
locking.
Locking of the CID tree was extended in qc_check_dcid() by recent commit
05f59a5 ("BUG/MINOR: quic: fix race condition in qc_check_dcid()") but
there was a direct return from the middle of the function which was not
covered by the unlock, resulting in the function keeping the lock on
success return.
Let's just remove this return and replace it with a variable to merge all
exit paths.
This must be backported wherever the fix above is backported.
qc_check_dcid() is a function which checks that a DCID is associated with
the expected quic_conn instance. This is used by the quic_conn socket
receive handler as there is a tiny risk that a datagram for another
connection was received on this socket.
As with other operations on the global CID tree, a lock must be used to protect
against race conditions. However, as in the previous commit, the lock was not held
long enough as the CID tree node is accessed outside of the locked region. To
fix this, extend the critical section until the CID dereferencing is done.
The impact of this bug should be similar to the previous one. However, the
risk of a crash is even lower as it should be extremely rare to receive
datagrams for other connections on a quic_conn socket. As such,
most of the time the first check condition of qc_check_dcid() is enough.
This may fix the first crash of github issue #2607.
This must be backported up to 2.8.
quic_conn API for sending was recently refactored. The main objective
was to regroup the different functions present for both handshake and
application emission.
After this refactoring, an optimization was introduced to avoid calling
qc_send() if there was nothing new to emit. However, this prevents the Tx
buffer from being purged if a previous sending was interrupted, until new
frames are finally available.
To fix this, simply remove the optimization. qc_send() is thus now
always called in quic_conn IO handlers.
The impact of this bug should be minimal as it happens only on sending
temporary error. However in this case, this could cause extra latency or
even a complete sending freeze in the worst scenario.
This must be backported up to 3.0.
Ensure idle_timer task is allocated in qc_kill_conn() before waking it
up. It can be NULL if idle timer has already fired but MUX layer is
still present, which prevents immediate quic_conn release.
qc_kill_conn() is only used on send() syscall fatal error to notify
upper layer of an error and close the whole connection asap.
This crash occurrence is pretty rare as it relies on timing issues. It
happens only if the idle timer fires before the MUX release (a bigger
client timeout is thus required) and a send() syscall error is detected.
For now, it was only reproduced using GDB to interrupt haproxy longer
than the idle timeout.
This should be backported up to 2.6.
Rename the enum values used for the HTTP/3 and QPACK RFC defined codes. The first
ones use a H3_ERR_* prefix which serves as an identifier for them. Also
separate the QPACK values into a new dedicated enum qpack_err. This is deemed
cleaner.
qc_send() was systematically called by the quic_conn IO handlers with all
instantiated quic_enc_level. Change this to only register a quic_enc_level
for sending if needed. Do not call qc_send() at all if no qel is registered.
A new function qel_need_sending() is defined to detect if sending is
required. First, it checks if the quic_enc_level has prepared frames or if
probing is set. It can also return true if an ACK is required, either on the
quic_enc_level itself or because the quic_conn ack timer fired. Finally,
a CONNECTION_CLOSE emission for the quic_conn is also a valid case.
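For illustration, the decision roughly reduces to the following predicate; the
helper functions are hypothetical, the real code inspects quic_enc_level and
quic_conn fields directly:

    extern int qel_has_prepared_frames(const struct quic_enc_level *qel);
    extern int qel_is_probing(const struct quic_enc_level *qel);
    extern int qel_ack_required(const struct quic_enc_level *qel);
    extern int conn_ack_timer_fired(const struct quic_conn *qc);
    extern int conn_needs_connection_close(const struct quic_conn *qc);

    static int qel_need_sending(const struct quic_enc_level *qel,
                                const struct quic_conn *qc)
    {
        return qel_has_prepared_frames(qel) ||  /* frames waiting for emission */
               qel_is_probing(qel) ||           /* PTO probe to send */
               qel_ack_required(qel) ||         /* ACK pending on this level */
               conn_ack_timer_fired(qc) ||      /* delayed ACK timer expired */
               conn_needs_connection_close(qc); /* CONNECTION_CLOSE to emit */
    }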
This should reduce the number of invocations of qc_send(). This could
improve slightly performance, as well as simplify traces debugging.
A series of previous patches has cleaned up the sending functions for the
handshake case. Their new exposed API is now flexible enough to convert the
app case to use the same functions.
As such, qc_send_hdshk_pkts() is renamed qc_send() and becomes the single
entry point for QUIC emission. It is used during application packet
emission in quic_conn_app_io_cb() and qc_send_mux(). Also, the internal
function qc_prep_hpkts() is renamed qc_prep_pkts().
Remove the now unneeded qc_send_app_pkts() and qc_prep_app_pkts().
Also remove qc_send_app_probing(). It was a simple wrapper over other
application send functions. Now, the default qc_send() can be reused for such
cases with the <old_data> argument set to true.
An adjustment was needed when converting qc_send_hdshk_pkts() to the
general qc_send() version. Previously, only a single packet
encoding/emission cycle was performed. This was enough as handshake
packets are always smaller than Tx buffer. However, it may be possible
to emit more application data. As such, a loop is necessary to perform
multiple encoding/emission cycles, as this was already the case in
qc_send_app_pkts().
No functional difference should happen with this commit. However, as
these are critical functions with a lot of changes, this patch is
labelled as medium.
quic_conn_io_cb() manually implements emission by using the lower level
functions qc_prep_pkts() and qc_send_ppkts(). Replace this by using the
higher level function qc_send_hdshk_pkts() which notably handles buffer
allocation and purging.
This allows cleaning up the send API by flagging qc_prep_pkts() and
qc_send_ppkts() as static. They are now used in a single location inside
qc_send_hdshk_pkts().