After handshake completion, the QUIC server is responsible for emitting
a HANDSHAKE_DONE frame. Some clients wait for it before starting their
STREAM transfers.
Previously, there was no explicit tasklet_wakeup() after handshake
completion, which is necessary to emit post-handshake frames. In most
cases this went undetected, as most clients continue emitting packets,
which reschedules the tasklet. However, without the tasklet_wakeup()
this behavior is not consistent. When the bug occurs, it causes a
connection freeze which prevents the client from emitting any request;
the connection is finally closed on idle timeout.
To fix this, add an explicit tasklet_wakeup() after handshake
completion. It sounds simple enough, but in fact it is difficult to find
the correct location for the tasklet_wakeup() invocation, as
post-handshake processing is directly linked to connection accept, with
different possible orderings. Notably, if 0-RTT is used, the connection
can be accepted prior to handshake completion. Another major point is
that along with the HANDSHAKE_DONE frame, a series of NEW_CONNECTION_ID
frames is emitted. However, these new CIDs must be allocated only after
the connection has been migrated to its new thread, as they are tied to
it. A BUG_ON() is present to check this in qc_set_tid_affinity().
With all this in mind, two locations were selected for the necessary
tasklet_wakeup(), as sketched below:
* in qc_xprt_start(): this covers the standard case without 0-RTT. It
  ensures the wakeup is done only after the connection thread migration.
* in qc_ssl_provide_all_quic_data(): this covers handshake completion
  when 0-RTT is used. In this case only, the connection is already
  accepted and migrated, so tasklet_wakeup() is safe.
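A minimal sketch of the first location, with simplified internals (the
exact fields and state checks in haproxy may differ):

    static int qc_xprt_start(struct connection *conn, void *xprt_ctx)
    {
        struct quic_conn *qc = conn->handle.qc;

        /* ... existing initialization ... */

        /* Standard case (no 0-RTT): at this point the connection has
         * already been accepted and migrated to its definitive thread,
         * so it is safe to schedule the emission of the post-handshake
         * frames (HANDSHAKE_DONE, NEW_CONNECTION_IDs) from here.
         */
        if (qc->state >= QUIC_HS_ST_COMPLETE)
            tasklet_wakeup(qc->wait_event.tasklet);

        return 1;
    }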
Note that as a side change, the quic_accept_push_qc() API has evolved
to better reflect the differences between the standard and 0-RTT usages.
It is now forbidden to call it multiple times on a single quic_conn
instance; a BUG_ON() has been added to enforce this.
This issue is labelled as medium even though it seems pretty rare. It
was only reproducible using the QUIC interop runner, with haproxy
compiled against LibreSSL and quic-go as the client. However, the
affected code parts are pretty sensitive, which justifies the chosen
severity.
This should fix github issue #2418.
It should be backported up to 2.6, after a brief period of observation.
Note that the extra comment added in qc_set_tid_affinity() can be
removed in 2.6, as thread migration is not implemented in this version.
Other parts should apply without conflict.
Previously, the msghdr struct used for sendmsg() was memset to 0. For
performance reasons, this was changed so that each member is set
individually, by the following commit:
    commit 107d6d7546
    OPTIM: quic: improve slightly qc_snd_buf() internal
msg_flags is the only member left unset, as the sendmsg() manual page
reports that it is unused. However, this caused a coverity report. In
the end, it is better to explicitly set it to 0 to avoid any future
interrogation, compiler warning or even portability issue.
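As an illustration, the initialization now looks like this (a sketch
with illustrative variable names, not the exact haproxy code):

    struct iovec  iov = { .iov_base = data, .iov_len = len };
    struct msghdr msg;

    msg.msg_name       = dst_addr;   /* NULL for connected sockets */
    msg.msg_namelen    = dst_addrlen;
    msg.msg_iov        = &iov;
    msg.msg_iovlen     = 1;
    msg.msg_control    = NULL;
    msg.msg_controllen = 0;
    msg.msg_flags      = 0;          /* ignored by sendmsg() but zeroed
                                      * to keep coverity quiet */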
This should fix the coverity report from github issue #2455.
No need to backport unless the above patch is.
This patch is the direct followup of the previous one:
    MINOR: quic: remove sendto() usage variant
It finalizes the qc_snd_buf() simplification by removing the send()
syscall usage for the quic-conn owned socket. The syscall invocation is
now merged into a single code location with the sendmsg() variant.
The only difference for the owned socket is that the destination address
for sendmsg() is set to NULL. This usage is documented in man 2 sendmsg
as valid for connected sockets. It allows maximum performance by
avoiding unnecessary lookups in the kernel socket address tables.
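A hedged sketch of the merged emission path (qc_test_fd() follows the
helper named in the related commits; the other names are illustrative):

    struct iovec  iov = { .iov_base = data, .iov_len = len };
    struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
    ssize_t ret;

    if (qc_test_fd(qc)) {
        /* quic-conn owned (connected) socket: the kernel already knows
         * the peer, a NULL destination skips the address lookup */
        msg.msg_name    = NULL;
        msg.msg_namelen = 0;
    }
    else {
        /* listener socket: the destination must be given explicitly */
        msg.msg_name    = &qc->peer_addr;
        msg.msg_namelen = get_addr_len(&qc->peer_addr);
    }

    ret = sendmsg(fd, &msg, MSG_DONTWAIT | MSG_NOSIGNAL);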
As with the previous patch, no functional change should happen here.
However, it will be simpler to extend qc_snd_buf() for GSO usage.
qc_snd_buf() is a wrapper around the emission syscalls. Depending on
the QUIC configuration, a different variant is used. When using a
connection socket, send() is the only one used. For listener sockets,
sendmsg() and sendto() are both possible. The former is used only if the
local address has been retrieved beforehand: this allows pinning it on
emission to guarantee the source address selection. Finally, sendto() is
used on systems which do not support local address retrieval.
All of these variants render the code too complex. As such, this patch
simplifies it by removing the sendto() alternative. Now, sendmsg() is
always used for listener sockets, and the source address is specified
only if supported by the system.
This patch should not exhibit any functional behavior change. It will be
useful when implementing GSO, as the code is now simpler.
When using the listener socket, the source address for emission is
explicitly set using ancillary data for sendmsg(). This is useful to
guarantee the correct address is used when binding on a non-explicit
address.
This code was implemented directly in qc_snd_buf(). However, it is quite
complex due to portability issues. For IPv4, two parallel
implementations coexist, defined under IP_PKTINFO or IP_RECVDSTADDR. For
IPv6, another option is defined under IPV6_RECVPKTINFO. Each variant
uses its own distinct names, which increases the code complexity.
Extract the ancillary data filling into a dedicated function named
cmsg_set_saddr(). This greatly reduces the body of qc_snd_buf(). Such
functions can be replicated when other ancillary data types are
implemented; this will notably be useful for the GSO implementation.
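A hedged sketch of what such a helper may look like, covering the three
portability variants (the real haproxy prototype may differ):

    /* Append a cmsg conveying the source address <saddr> to <msg>.
     * The caller must have sized msg_control/msg_controllen beforehand.
     */
    static void cmsg_set_saddr(struct msghdr *msg, struct cmsghdr **cmsg,
                               const struct sockaddr_storage *saddr)
    {
        struct cmsghdr *c;

        c = *cmsg ? CMSG_NXTHDR(msg, *cmsg) : CMSG_FIRSTHDR(msg);
        if (saddr->ss_family == AF_INET) {
    #if defined(IP_PKTINFO)
            struct in_pktinfo pi = { 0 };

            pi.ipi_spec_dst = ((const struct sockaddr_in *)saddr)->sin_addr;
            c->cmsg_level = IPPROTO_IP;
            c->cmsg_type  = IP_PKTINFO;
            c->cmsg_len   = CMSG_LEN(sizeof(pi));
            memcpy(CMSG_DATA(c), &pi, sizeof(pi));
    #elif defined(IP_RECVDSTADDR)
            struct in_addr a = ((const struct sockaddr_in *)saddr)->sin_addr;

            c->cmsg_level = IPPROTO_IP;
            c->cmsg_type  = IP_SENDSRCADDR;
            c->cmsg_len   = CMSG_LEN(sizeof(a));
            memcpy(CMSG_DATA(c), &a, sizeof(a));
    #endif
        }
    #if defined(IPV6_RECVPKTINFO)
        else if (saddr->ss_family == AF_INET6) {
            struct in6_pktinfo pi6 = { 0 };

            pi6.ipi6_addr = ((const struct sockaddr_in6 *)saddr)->sin6_addr;
            c->cmsg_level = IPPROTO_IPV6;
            c->cmsg_type  = IPV6_PKTINFO;
            c->cmsg_len   = CMSG_LEN(sizeof(pi6));
            memcpy(CMSG_DATA(c), &pi6, sizeof(pi6));
        }
    #endif
        *cmsg = c;
    }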
qc_snd_buf() is a wrapper around the sendmsg() syscall (or its
derivatives) used for all QUIC emissions. This patch aims at removing
several non-optimal code sections:
* fd_send_ready() for connected sockets is only checked in the function
  preamble instead of inside the emission loop
* zeroing the msghdr structure for unconnected sockets is removed, as
  all fields are properly initialized afterwards anyway
* extra memcpy/memset invocations when using IP_PKTINFO/IPV6_RECVPKTINFO
  are removed by writing the address value directly into the cmsg buffer
Transient send errors are handled differently depending on whether a
connection or a listener socket is used for QUIC transfers. In the first
case, proper poller subscription is used via
fd_cant_send()/fd_want_send(). In the listener socket case, the error is
ignored by the qc_snd_buf() caller and the retransmission mechanism will
allow the data to be re-emitted.
For the listener socket, the transient error handling is buggy. It
blindly uses fd_cant_send() with the <qc.fd> member, which is set to -1
for listener socket usage. This results in an invalid fdtab access, with
a possible crash or a modification of a totally unrelated FD.
This bug is simply fixed by checking qc_test_fd() before using
fd_cant_send()/fd_want_send(). This ensures <qc.fd> is used only if
initialized, which is only the case when using a connection socket.
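A minimal sketch of the fix, assuming the error handling shape described
above:

    if (ret == 0 /* transient error */) {
        if (qc_test_fd(qc)) {
            /* connection socket only: <qc.fd> is valid, let the poller
             * wake us up once sending is possible again */
            fd_cant_send(qc->fd);
            fd_want_send(qc->fd);
        }
        /* listener socket (<qc.fd> == -1): nothing to do, the
         * retransmission mechanism will re-emit the data */
    }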
No crash was reported yet for this bug. However, it is reproducible by
using an ASAN build and the following strace command to inject errno
values on sendmsg():
 # strace -qq -yy -p $(pgrep haproxy) -f -e trace=%network \
      -e inject=sendto,sendmsg:error=EAGAIN:when=20+20
This must be backported up to 2.7.
Move quic_cid and quic_connection_id from quic_conn-t.h to the new
quic_cid-t.h header.
Move the definitions of quic_stateless_reset_token_init(),
quic_derive_cid(), new_quic_cid(), quic_get_cid_tid() and
retrieve_qc_conn_from_cid() to the new quic_cid.c C file.
QUIC connections are pushed manually into a dedicated listener queue
when they are ready to be accepted. This happens after handshake
finalization or on 0-RTT packet reception. The listener is then woken up
to dequeue them with listener_accept().
This patch counts the number of connections currently stored in the
accept queue. If a certain limit is reached, INITIAL packets are dropped
on reception to prevent further QUIC connection allocations. This should
help to preserve system resources.
This limit is automatically derived from the listener backlog. Half of
its value is reserved for handshakes and the other half for the accept
queue. By default, the backlog is equal to maxconn, which guarantees
that there can't be more than maxconn connections in handshake or
waiting to be accepted.
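A sketch of the reception-side check; the counter name is an assumption
made for this illustration:

    /* On INITIAL packet reception for an unknown connection: drop it
     * early when the accept queue already holds its half of the
     * backlog, so that no new quic_conn gets allocated.
     */
    if (HA_ATOMIC_LOAD(&l->rx.quic_curr_accept) >= listener_backlog(l) / 2)
        goto drop;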
Implement a per-listener limit on the concurrent number of QUIC
connections. When it is reached, INITIAL packets for new connections are
automatically dropped until the number of handshakes decreases.
The limit value is automatically based on the listener backlog, which
itself defaults to maxconn.
This feature is important to ensure CPU and memory resources are not
consumed if too many handshake attempts are started in parallel.
Special care is taken if a connection is released before handshake
completion. In this case, the counter must be decremented. This forces
us to ensure that the <qc.state> member is set early in qc_new_conn(),
before any quic_conn_release() invocation.
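For illustration, the release path may decrement the counter as follows
(counter and member names are assumptions):

    void quic_conn_release(struct quic_conn *qc)
    {
        /* <qc->state> is set early in qc_new_conn(), so this test is
         * reliable even on an early allocation failure path */
        if (qc->state < QUIC_HS_ST_COMPLETE)
            HA_ATOMIC_DEC(&qc->li->rx.quic_curr_handshake);
        /* ... rest of the release code ... */
    }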
Move all QUIC trace definitions from quic_conn.h to quic_trace-t.h.
Also deduplicate the multiple trace_quic macro definitions into
quic_trace.h. This forces all QUIC source files that rely on traces to
include it, while reducing the size of quic_conn.h.
Improve the EACCES permission errors encountered when using a QUIC
connection socket at runtime:
* the first occurrence of the error on the process will generate a log
  warning. This should prevent users from using a privileged port
  without the mandatory access rights.
* the socket mode will automatically fall back to the listener socket
  for the receiver instance. This requires duplicating the settings from
  the bind_conf to the receiver instance, to support configurations with
  multiple addresses on the same bind line.
This reverts commit c618ed5ff4.
The list iterator is broken. As found by Fred, running QUIC
single-threaded shows that only the first connection is accepted because
the accepter relies on the element being initialized once detached
(which is expected and matches what MT_LIST_DELETE_SAFE() used to do
before). However, while doing this in the quic_sock code seems to work,
doing it inside the macro shows total breakage and the unit test doesn't
work anymore (random crashes). Thus it looks like the fix is not
trivial; let's roll this back for the time it will take to fix the loop.
The new mt_list code supports exponential back-off on conflict, which
is important for use cases where there is contention on a large number
of threads. The API evolved a little bit and required some updates:
- mt_list_for_each_entry_safe() is now in upper case to explicitly
  show that it is a macro, only uses the back element, and doesn't
  require a secondary pointer for deletes anymore.
- MT_LIST_DELETE_SAFE() doesn't exist anymore; instead one just has to
  set the list iterator to NULL so that the element is not re-inserted
  into the list and the list is spliced there (see the sketch after
  this list). One must be careful because this was usually performed
  before freeing the element: now the iterator must be nulled before
  the continue/break.
- MT_LIST_LOCK_ELT() and MT_LIST_UNLOCK_ELT() have always been unclear.
  They were replaced by mt_list_cut_around() and mt_list_connect_elem(),
  which more explicitly detach the element and reconnect it into the
  list.
- MT_LIST_APPEND_LOCKED() was only in haproxy so it was left as-is in
  list.h. It may however possibly benefit from being upstreamed.
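A hedged usage sketch of the delete-by-nulling idiom (struct and field
names are illustrative):

    struct item {
        struct mt_list list;
        /* ... payload ... */
    };

    struct mt_list queue = MT_LIST_HEAD_INIT(queue);
    struct item *it;
    struct mt_list back;

    MT_LIST_FOR_EACH_ENTRY_SAFE(it, &queue, list, back) {
        if (must_remove(it)) {
            struct item *tmp = it;

            /* null the iterator first: the element will not be
             * re-inserted when the loop splices the list back */
            it = NULL;
            free(tmp);
        }
    }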
This required tiny adaptations to event_hdl.c and quic_sock.c. The test
case was updated and the API doc added. Note that in order to keep
include files small, the struct mt_list definition remains in list-t.h
(part of the internal API) and was ifdef'd out in mt_list.h.
A test on QUIC with both quictls 1.1.1 and wolfssl 5.6.3 on ARM64 with
80 threads shows a drastic reduction of CPU usage thanks to this and
the refined memory barriers. Please note that the CPU usage on OpenSSL
3.0.9 is significantly higher due to the excessive use of atomic ops
by openssl, but 3.1 is only slightly above 1.1.1 though:
- before: 35 Gbps, 3.5 Mpps, 7800% CPU
- after: 41 Gbps, 4.2 Mpps, 2900% CPU
It is possible to trigger a loop of tasklet calls if a QUIC connection
is interrupted abruptly by the client. This is caused by the following
interaction:
* the FD iocb is woken up for a read. This causes a wakeup on the
  quic_conn tasklet.
* quic_conn_io_cb is run and tries to read, but fails as the connection
  socket is closed (typically with ECONNREFUSED). FD reading is
  re-subscribed to the poller via qc_rcv_buf(), which causes the loop.
The looping only stops once the idle timeout expires and the connection
instance is finally released.
To fix this, ensure FD reading is re-subscribed only for transient error
cases (EAGAIN or similar). All other cases are considered fatal, so all
future read operations will fail. Note that for the moment nothing is
flagged on the quic_conn, so future reception attempts are not skipped;
this should be improved in a future commit to accelerate connection
closing.
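A sketch of the corrected read path (simplified, error handling only):

    ret = recv(qc->fd, buf, sz, 0);
    if (ret < 0) {
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            /* transient: rearm reading via the poller */
            fd_cant_recv(qc->fd);
            return 0;
        }
        /* fatal (e.g. ECONNREFUSED): do NOT resubscribe, otherwise the
         * tasklet and the iocb would wake each other up in a loop */
        return -1;
    }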
This bug can be reproduced fairly frequently by interrupting the
following command. QUIC traces should be activated on the haproxy side
to detect the loop:
$ ngtcp2-client --tp-file=/tmp/ngtcp2-tp.txt \
--session-file=/tmp/ngtcp2-session.txt \
-r 0.3 -t 0.3 --exit-on-all-streams-close 127.0.0.1 20443 \
"http://127.0.0.1:20443/?s=1024"
This must be backported up to 2.7.
GCC warns about a possible NULL dereference when requeuing a datagram
on the connection socket. This happens due to an MT_LIST_POP used to
retrieve an rxbuf instance.
In fact, this can never be NULL: there are enough rxbufs allocated for
each thread, and once a thread has finished working with one, it must
always reappend it.
This issue was introduced with the following patch:
    commit b34d353968
    BUG/MEDIUM: quic: consume contig space on requeue datagram
As such, it must be backported in every version with the above commit.
This should fix the github CI compilation error.
A thread must always reappend the rxbuf instance after finishing the
datagram reception treatment. This was not the case on one error code
path: when the fake datagram allocation fails during datagram requeuing.
This issue was introduced with the following patch:
    commit b34d353968
    BUG/MEDIUM: quic: consume contig space on requeue datagram
As such, it must be backported in every version with the above commit.
When handling UDP datagram reception, it is possible to receive a QUIC
packet for one connection on the socket attached to another connection.
To protect against this, an explicit comparison is done between the
packet DCID and the quic-conn CID. On no match, the datagram is requeued
and dispatched via rxbuf, and will be treated as if it arrived on the
listener socket.
One reason for this wrong reception is the small race condition that
exists between the bind() and connect() syscalls during connection
socket initialization. Another reason, which was not initially
considered, is clients reusing the same IP:PORT for different
connections. In this case the current FD attribution is not optimal and
can cause a substantial amount of requeuing.
This situation revealed a bug during requeuing. If the rxbuf contiguous
space was not big enough for the datagram, the incoming datagram was
dropped, even if there was space at the buffer origin. This could cause
several datagrams to be dropped in a row until the buffer head was
eventually moved when passing through the listener FD.
To fix this, allocate a fake datagram to consume the contiguous space.
This is similar to the handling of datagrams on the listener FD. It then
allows storing the datagram to requeue at the buffer head and
continuing.
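A hedged sketch of the requeue fix (pool and member names are
illustrative):

    if (b_contig_space(rxbuf) < dgram_len + sizeof(struct quic_dgram)) {
        /* not enough contiguous room at the tail: consume it with a
         * fake datagram so that the buffer wraps and the real datagram
         * can be stored at the buffer origin, as already done on the
         * listener FD path */
        struct quic_dgram *fake = pool_alloc(pool_head_quic_dgram);

        if (!fake)
            goto err;
        fake->buf = NULL;                  /* marks pure padding */
        fake->len = b_contig_space(rxbuf);
        /* append <fake>, then store the requeued datagram at head */
    }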
This can be reproduced by starting a lot of connections. To amplify the
phenomenon, POST requests are used to increase the number of dropped
datagrams:
$ while true; do curl -F "a=@~/50k" -k --http3-only -o /dev/null https://127.0.0.1:20443/; done
Move the TX part of the code to quic_tx.c.
Add quic_tx-t.h and quic_tx.h headers for this TX part of the code.
The definition of the quic_tx_packet struct has been moved from
quic_conn-t.h to quic_tx-t.h.
Same as for the TX part:
Move the RX part of the code to quic_rx.c.
Add quic_rx-t.h and quic_rx.h headers for this RX part of the code.
The definition of the quic_rx_packet struct has been moved from
quic_conn-t.h to quic_rx-t.h.
Add to the quic_conn struct some statistical counters from the
quic_counters struct, which is used at listener level, so that they can
be handled at QUIC connection level. This avoids calling atomic
functions. Furthermore, this will be useful soon when a counter is added
for the total number of packets sent, which will be incremented very
often.
Some counters were not added, especially those which count the number of
QUIC errors by QUIC error type. Indeed, such counters would most of the
time be incremented only once at QUIC connection level.
Implement quic_conn_prx_cntrs_update(), which accumulates the QUIC
connection level statistical counters into the listener level
statistical counters.
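A sketch of the accumulation helper described above (the exact counter
set is illustrative):

    static void quic_conn_prx_cntrs_update(struct quic_conn *qc)
    {
        /* called at connection release: flush the plain per-connection
         * counters into the shared listener-level ones, one atomic
         * operation per counter instead of one per event */
        HA_ATOMIC_ADD(&qc->prx_counters->sent_pkt, qc->cntrs.sent_pkt);
        HA_ATOMIC_ADD(&qc->prx_counters->lost_pkt, qc->cntrs.lost_pkt);
        HA_ATOMIC_ADD(&qc->prx_counters->dropped_pkt, qc->cntrs.dropped_pkt);
        /* ... one HA_ATOMIC_ADD() per accumulated counter ... */
    }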
Must be backported to 2.7.
It is possible to receive a datagram from another connection on a
dedicated quic-conn socket. This is due to a race condition between the
bind() and connect() system calls.
To handle this, an explicit check is done on each datagram. If the DCID
is not associated with the connection which owns the socket, the
datagram is redispatched as if it arrived on the listener socket.
This redispatch step was not properly done because the source address
passed to the redispatch function was incorrect. Instead of the datagram
source address, we used the address of the quic-conn socket which
received the datagram due to the above race condition.
Fix this simply by using the address from the recvmsg() system call.
The impact of this bug is minor, as redispatch on a connection socket
should be really rare. However, when it happens it can lead to several
kinds of problems, for example a connection initialized with an
incorrect peer address. It can also break the Retry token check, as it
relies on the peer address.
In fact, a Retry token check failure was the reason this bug was found.
When using h2load with thousands of clients, the counter of Retry token
failures was unusually high. With this patch, no failure is reported
anymore for Retry.
Must be backported to 2.7.
Before this patch, the global sending rate was measured in the QUIC
lower layer, just after sendto(). This meant that all QUIC frames were
accounted for, including non-STREAM frames and retransmissions.
To better reflect the application data transferred, move the
incrementation into the MUX layer. This allows accounting only for
STREAM frame payloads on their first emission.
This should be backported up to 2.6.
There's a li_per_thread array in each listener for use with QUIC
listeners. Since thread groups were introduced, this array can be
allocated too large, because global.nbthread entries are allocated for
each listener, while no more than MIN(nbthread, MAX_THREADS_PER_GROUP)
may be used by a single listener. This was because the global thread ID
was used as the index instead of the local ID (since a listener may only
be used by a single group). Let's just switch to the local ID and reduce
the allocated size.
When a quic_conn instance is rebound to a new thread, its tasks and
tasklet are destroyed and new ones created. Its socket is also migrated
to a new thread, which stops reception on it.
To properly reactivate a quic_conn after rebind, wake up its tasks and
tasklet if they were active before the thread rebind. Also reactivate
reading on the socket FD. These operations are implemented in a new
function, qc_finalize_affinity_rebind().
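A minimal sketch of what this function does (the flag tracking prior
activity is an assumption):

    void qc_finalize_affinity_rebind(struct quic_conn *qc)
    {
        /* wake the tasklet only if it was active before the rebind */
        if (qc->flags & QUIC_FL_CONN_IO_TASKLET_WAS_ACTIVE)
            tasklet_wakeup(qc->wait_event.tasklet);

        /* resume reception on the migrated socket FD */
        if (qc_test_fd(qc))
            fd_want_recv(qc->fd);
    }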
This should be backported up to 2.7 after a period of observation.
Each quic_conn is inserted into an accept queue to allocate the upper
layers. This is done through a listener tasklet in
quic_sock_accept_conn().
This patch interrupts the accept process for a quic_conn in
closing/draining state. Indeed, such a connection will soon be closed,
so it's unnecessary to allocate a complete stack for it.
This patch will become necessary when thread migration is implemented,
as it won't be allowed to proceed to thread migration for a closing
quic_conn.
This should be backported up to 2.7 after a period of observation.
TID encoding in the CID was removed by a recent change. It is now
possible to access the <tid> member stored in the quic_connection_id
instance.
For unknown CIDs, a quick solution was to redispatch to the thread
corresponding to the first CID byte. This ensured that an identical CID
would always be handled by the same thread, to avoid creating multiple
instances of the same connection. However, it forced an uneven load
distribution, which can be critical for the QUIC handshake operation.
To improve this, remove the above constraint. An unknown CID is now
handled by its receiving thread. However, this means that if multiple
packets are received with the same unknown CID, several threads will try
to allocate the same connection.
To prevent this race condition, the CID insertion into the global tree
is now conducted first, before creating the connection. This is a
thread-safe operation which can only succeed for a single thread. The
thread which has inserted the CID will then proceed to the quic_conn
allocation. The other threads won't be able to insert the same CID: this
stops the treatment of the current packet, which is redispatched to the
now owning thread.
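A hedged sketch of the insert-first logic (tree access and locking are
simplified):

    struct ebmb_node *node;

    /* try to reserve the CID: on a unique-keyed tree, ebmb_insert()
     * returns the already present node when the key exists */
    node = ebmb_insert(&tree->root, &conn_id->node, conn_id->cid.len);
    if (node != &conn_id->node) {
        /* lost the race: another thread owns this CID; stop the
         * treatment of this packet and redispatch it to the owning
         * thread found in the existing node */
        goto redispatch;
    }
    /* won the race: this thread proceeds with quic_conn allocation */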
This should be backported up to 2.7 after a period of observation.
Previously, quic_connection_id instances were stored in per-thread
trees. Datagrams were first dispatched to the correct thread using the
encoded TID before a tree lookup was done.
Remove these trees and replace them with a global array of 256 trees: a
CID uses the array index corresponding to its first byte. On datagram
dispatch, the CID is looked up in its tree and the TID is retrieved
using the new quic_connection_id.tid member. A read-write lock protects
each tree instance; with 256 entries, it is expected that contention
should be reduced.
A new structure, quic_cid_tree, serves as a tree container associated
with its read-write lock. An API is implemented to ensure lock safety
for the insert/lookup/delete operations.
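For illustration, the container may look like this (a sketch consistent
with the description above):

    struct quic_cid_tree {
        struct eb_root root;               /* CIDs for this entry */
        __decl_thread(HA_RWLOCK_T lock);   /* protects <root> only */
    };

    /* 256 entries: a CID uses the tree indexed by its first byte */
    static struct quic_cid_tree quic_cid_trees[256];

    #define QUIC_CID_TREE_IDX(cid)  ((cid)->data[0])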
This patch is a step forward towards breaking the affinity between a
CID and the thread encoded in it. This is required to be able to migrate
a quic_conn after accept, in order to select a thread based on its load.
This should be backported up to 2.7 after a period of observation.
Remove the <tid> member in quic_conn. It is moved to the
quic_connection_id instance.
For the moment, this change has no impact. Indeed, the qc.tid reference
could easily be replaced by tid, as all of this work was already done on
the connection thread. However, quic_conn thread migration is planned
for the future, so the removal of qc.tid will simplify it.
This should be backported up to 2.7.
This one is printed as the iocb in the "show fd" output, and arguably
this wasn't very convenient as-is:
293 : st=0x000123(cl heopI W:sRa R:sRA) ref=0 gid=1 tmask=0x8 umask=0x0 prmsk=0x8 pwmsk=0x0 owner=0x7f488487afe0 iocb=0x50a2c0(main+0x60f90)
Let's unstatify it and export it so that the symbol can now be resolved
from the various points that need it.
This bug was revealed by h2load tests run as follows:
 $ h2load -t 4 --npn-list h3 -c 64 -m 16 -n 16384 -v https://127.0.0.1:4443/
This opens (-c) 64 QUIC connections and issues (-n) 16384 h3 requests
from (-t) 4 threads, i.e. 256 requests per connection. Such tests could
not always pass and often ended with results like these displayed by
h2load:
  finished in 53.74s, 38.11 req/s, 493.78KB/s
  requests: 16384 total, 2944 started, 2048 done, 2048 succeeded,
            14336 failed, 14336 errored, 0 timeout
  status codes: 2048 2xx, 0 3xx, 0 4xx, 0 5xx
  traffic: 25.92MB (27174537) total, 102.00KB (104448) headers
           (space savings 1.92%), 25.80MB (27053569) data
  UDP datagram: 3883 sent, 24330 received
                         min       max       mean      sd     ± sd
  time for request:   48.75ms  502.86ms  134.12ms  75.80ms  92.68%
  time for connect:   20.94ms  331.24ms  189.59ms  84.81ms  59.38%
  time to 1st byte:  394.36ms  417.01ms  406.72ms   9.14ms  75.00%
  req/s           :      0.00    115.45     14.30     38.13  87.50%
The number of successful requests was always a multiple of 256.
Activating the traces also showed that some connections were blocked
after having successfully completed their handshakes, because the mux
was not started. The mux is started upon acceptance of the connection.
Under heavy load, some connections were never accepted. As soon as more
than 4 (MAXACCEPT) connections were enqueued before a listener could be
woken up to accept at most 4 of them, the remaining connections were
either not accepted, or only accepted later at a subsequent listener
tasklet wakeup.
Add a call to tasklet_wakeup() for the accept-list tasklet of the
listeners, to wake it up again if there are remaining connections to
accept after listener_accept() has been called. In this case the
listener must not be removed from this accept list, otherwise at the
next call it will not accept anything more.
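A sketch of the fix in the accept tasklet (queue and member names are
assumptions):

    struct task *quic_accept_run(struct task *t, void *ctx, unsigned int state)
    {
        /* ... dequeue and accept at most MAXACCEPT connections by
         * calling listener_accept() on the registered listeners ... */

        /* if some connections remain enqueued, reschedule ourselves,
         * otherwise nothing would ever dequeue them */
        if (!MT_LIST_ISEMPTY(&queue->listeners))
            tasklet_wakeup(queue->tasklet);

        return NULL;
    }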
Must be backported to 2.7 and 2.6.
This patch completes the previous one with a poller subscription of the
quic-conn owned socket on sendto() error. This ensures that mux-quic is
notified, when it is waiting to send, once a transient sendto() error is
cleared. As such, qc_notify_send() is called directly inside the socket
I/O callback.
The internal conditions of qc_notify_send() have thus been extended.
This prevents notifying the upper layer until all sending conditions are
fulfilled: room in the congestion window and no transient error on the
socket FD.
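A hedged sketch of the extended condition (the congestion window test is
simplified):

    static void qc_notify_send(struct quic_conn *qc)
    {
        if (qc->subs && qc->subs->events & SUB_RETRY_SEND) {
            /* wait for room in the congestion window */
            if (!quic_path_prep_data(qc->path))
                return;

            /* wait for any transient sendto() error to be cleared */
            if (qc_test_fd(qc) && !fd_send_ready(qc->fd))
                return;

            tasklet_wakeup(qc->subs->tasklet);
            qc->subs->events &= ~SUB_RETRY_SEND;
            if (!qc->subs->events)
                qc->subs = NULL;
        }
    }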
This should be backported up to 2.7.
On a sendto() transient error, prior to this patch, sending was
simulated and we relied on retransmission to retry sending. This could
hurt performance significantly.
Thanks to quic-conn owned socket support, it is now possible to improve
this. On a transient error, sending is interrupted and the quic-conn
socket FD is subscribed on the poller for sending. When sending becomes
possible again, quic_conn_sock_fd_iocb() will be in charge of restarting
it.
A consequence of this change concerns the return value of
qc_send_ppkts(). This function now returns 0 on transient error if the
quic-conn has its own socket. This is used to interrupt sending in the
calling function. The QUIC_FL_CONN_TO_KILL flag must be checked to
differentiate a fatal error from a transient one.
This should be backported up to 2.7.
EBADF on sendto() is considered a fatal error. As such, it is removed
from the list of transient errors. The connection will be killed when it
is encountered.
For the record, EBADF can be encountered on process termination with the
listener socket.
This should be backported up to 2.7.
Sending is conducted through qc_send_ppkts() for a QUIC connection.
There are two types of error which can be encountered on sendto() or
affiliated syscalls:
* transient error. In this case, sending is simulated with the remaining
  data and the retransmission process is used to get the opportunity to
  retry the emission
* fatal error. If this happens, the connection should be closed as soon
  as possible. This is done via the qc_kill_conn() function. Until this
  patch, only the ECONNREFUSED errno was considered fatal.
Modify the QUIC send API to be able to differentiate transient and fatal
errors more easily. This is done by fixing the return value of the
sendto() wrapper qc_snd_buf(), as illustrated after this list:
* on fatal error, a negative error code is returned. This is now the
  case for every errno except EAGAIN, EWOULDBLOCK, ENOTCONN, EINPROGRESS
  and EBADF.
* on a transient error, 0 is returned. This is the case for the errno
  values listed above and also when a partial send has been conducted by
  the kernel.
* on success, the return value of the sendto() syscall is returned.
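A sketch of how a caller consumes this contract:

    ssize_t ret = qc_snd_buf(qc, buf, sz, 0);

    if (ret < 0) {
        /* fatal error: close the connection as soon as possible */
        qc_kill_conn(qc);
    }
    else if (ret == 0) {
        /* transient error or partial send: simulate the emission and
         * rely on retransmission (or, with an owned socket, subscribe
         * to the poller, cf. the following commits) */
    }
    else {
        /* success: <ret> bytes were emitted */
    }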
This commit will be useful for handling transient errors with a
quic-conn owned socket. In that case, the socket should be subscribed to
the poller and no simulated send will be conducted.
This commit also allows errno management to be confined in the quic-sock
module, which is a nice cleanup.
On a final note, EBADF should be considered fatal. This will be the
subject of a next commit.
This should be backported up to 2.7.
The send*() syscalls which are responsible for reporting such ICMP
packet receptions fail with ECONNREFUSED as errno.
    man(7) udp
    ECONNREFUSED
        No receiver was associated with the destination address.
        This might be caused by a previous packet sent over the socket.
We must kill the underlying connection as soon as possible.
Must be backported to 2.7.
Some counters could potentially be incremented even if the send*()
syscall returned no error, when ret >= 0 and ret != sz. This could be
the case for instance if a first call to send*() returned -1 with errno
set to EINTR (or if any previous syscall set errno to a non-zero value)
and if the next call to send*() returned something positive and smaller
than <sz>.
Must be backported to 2.7 and 2.6.
When receiving a QUIC datagram, the destination address is retrieved
via recvmsg() and stored in the quic-conn as qc.local_addr. This address
is then reused when using the quic-conn owned socket.
When the listener socket mode is preferred, the send operation did not
specify the source address of the emitted datagram. If the listener
socket is bound on a wildcard address, the kernel is free to choose any
address assigned to the local machine. This may differ from the address
selected by the client on its first datagram, which will prevent the
client from emitting any further reply.
To address this, this patch fixes the UDP source address via sendmsg().
This process is similar to the reception path and relies on ancillary
messages, so the socket is left untouched by the operation. This is
heavily platform specific and may not be supported by some kernels.
This change only has an impact if the listener socket alone is used for
QUIC communications. This is the default behavior on the 2.7 branch, but
not anymore on 2.8. Set tune.quic.socket-owner to listener to enforce
it.
This should be backported up to 2.7.
During multiple tests we've already noticed that shared stats counters
have become a real bottleneck under large thread counts. With QUIC it's
pretty visible, with qc_snd_buf() taking 2.5% of the CPU on a 48-thread
machine at only 25 Gbps, and this CPU is entirely spent in the atomic
increment of the byte count and byte rate. It's also visible in H1/H2
but slightly less since we're working with larger buffers, hence less
frequent updates. These counters are exclusively used to report the
byte count in "show info" and the byte rate in the stats.
Let's move them to the thread_ctx struct and make the stats reader
just collect each thread's stats when requested. That's way more
efficient than competing on a single cache line.
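Roughly, the idea is as follows (a sketch; the counter field names in
thread_ctx are assumptions):

    /* fast path, per-thread, no atomic op nor shared cache line: */
    th_ctx->out_bytes += ret;

    /* slow path, only when stats are consulted ("show info"): */
    unsigned long long total = 0;
    int i;

    for (i = 0; i < global.nbthread; i++)
        total += HA_ATOMIC_LOAD(&ha_thread_ctx[i].out_bytes);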
After this, qc_snd_buf has totally disappeared from the perf profile
and tests made in h1 show roughly 1% performance increase on small
objects.
Shards were completely forgotten in commit f5a0c8abf ("MEDIUM: quic:
respect the threads assigned to a bind line"). The thread mask is
taken from the bind_conf, but since shards were introduced in 2.5,
the per-listener mask is held by the receiver and can be smaller
than the bind_conf's mask.
The effect here is that the traffic is not distributed to the
appropriate thread. At first glance it's not dramatic, since the thread
remains one of those eligible by the bind_conf, but it still means that
in some contexts such as "shards by-thread", some concurrency may
persist on listeners while they're expected to be alone. One identified
impact is that it requires more rxbufs than necessary, but there may
possibly be other not yet identified side effects.
This must be backported to 2.7 and everywhere the commit above is
backported.
UDP addresses may change over time for a QUIC connection. When using a
quic-conn owned socket, we have to detect an address change to break the
bind/connect association on the socket.
For the moment, on change detection, the QUIC connection socket is
closed and a new one is opened. In the future, we may improve this by
trying to keep the original socket and re-execute only the bind/connect
syscalls.
This change is part of quic-conn owned socket implementation.
It may be backported to 2.7 after a period of observation.
There is a small race condition, when a QUIC connection socket is
instantiated, between the bind() and connect() system calls. This means
that the first datagrams read on the socket may belong to another
connection.
To detect this rare case, we compare the DCID of each QUIC datagram read
on the QUIC socket. If it does not match the connection CID, the
datagram is requeued using quic_receiver_buf() so that it can be handled
on the correct thread.
This change is part of quic-conn owned socket implementation.
It may be backported to 2.7 after a period of observation.
This change is the second part for reception on the QUIC connection
socket. All operations inside the FD handler have been delayed to the
quic-conn tasklet via the new function qc_rcv_buf().
With this change, buffer management on reception has been simplified. It
is now possible to use a local buffer inside qc_rcv_buf() instead of
quic_receiver_buf().
This change is part of quic-conn owned socket implementation.
It may be backported to 2.7 after a period of observation.
Try to use the quic-conn socket for reception if it is allocated. For
this, the socket is inserted into the fdtab. This will call the new
handler quic_conn_io_cb(), which is responsible for processing the
recv() system call. It reuses the datagram dispatch for simplicity.
However, since it is guaranteed to run on the quic-conn thread, it would
be more efficient to use a dedicated buffer; this will be implemented in
another commit.
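A hedged sketch of the registration (the fd_insert() signature varies
across versions; shown here with a thread-group id and local thread
mask):

    /* attach the connection socket to the fdtab so that the new iocb
     * runs on the connection's thread, then enable reception */
    fd_insert(qc->fd, qc, quic_conn_io_cb, tgid, ti->ltid_bit);
    fd_want_recv(qc->fd);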
This patch should improve performance by reducing contention on the
receiver socket. However, more gain can be obtained once the datagram
dispatch operation is skipped.
The older quic_sock_fd_iocb() is renamed to quic_lstnr_sock_fd_iocb() to
emphasize its usage for the receiver socket.
This change is part of quic-conn owned socket implementation.
It may be backported to 2.7 after a period of observation.
If the quic-conn has a dedicated socket, use it for sending instead of
the listener socket. This should improve performance by reducing
contention on the shared listener socket.
This change is part of quic-conn owned socket implementation.
It may be backported to 2.7 after a period of observation.
Allocate a quic-conn owned socket if possible. This requires that it is
activated in the haproxy configuration. It is also done only if the
local address is known, so it depends on the support of IP_PKTINFO.
For the moment this socket is not used. This leaves QUIC support broken,
as received datagrams are not read. This commit will be completed by a
following patch to support the recv operation on the newly allocated
socket.
This change is part of quic-conn owned socket implementation.
It may be backported to 2.7 after a period of observation.