haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-09-23 14:51:27 +02:00

Author	SHA1	Message	Date
Frederic Lecaille	e12620a8a9	BUG/MINOR: quic: Too short datagram during 0-RTT handshakes (aws-lc only) By "aws-lc only", one means that this bug was first revealed by the aws-lc stack. This does not mean it will not appear with new versions of other TLS stacks which have never revealed this bug. This bug was reported by Ilya (@chipitsine) in GH #2657 where some QUIC interop tests (resumption, zerortt) could lead to crashes with haproxy compiled against the aws-lc TLS stack. These crashes were triggered by this BUG_ON() which detects that too short datagrams with at least one ack-eliciting Initial packet inside could be built. <0>2024-07-31T15:13:42.562717+02:00 [01\|quic\|5\|quic_tx.c:739] qc_prep_pkts(): next encryption level : qc@0x61d000041080 idle_timer_task@0x60d000006b80 flags=0x6000058 FATAL: bug condition "first_pkt->type == QUIC_PACKET_TYPE_INITIAL && (first_pkt->flags & (1UL << 0)) && length < 1200" matched at src/quic_tx.c:163 call trace(12): \| 0x563ea447bc02 [ba d9 00 00 00 48 8d 35]: main-0x1958ce \| 0x563ea4482703 [e9 73 fe ff ff ba 03 00]: qc_send+0x17e4/0x1b5d \| 0x563ea4488ab4 [85 c0 0f 85 00 f6 ff ff]: quic_conn_io_cb+0xab1/0xf1c \| 0x563ea468e6f9 [48 c7 c0 f8 55 ff ff 64]: run_tasks_from_lists+0x173/0x9c2 \| 0x563ea468f24a [8b 7d a0 29 c7 85 ff 0f]: process_runnable_tasks+0x302/0x6e6 \| 0x563ea4610893 [83 3d aa 65 44 00 01 0f]: run_poll_loop+0x6e/0x57b \| 0x563ea4611043 [48 8b 1d 46 c7 1d 00 48]: main-0x48d \| 0x7f64d05fb609 [64 48 89 04 25 30 06 00]: libpthread:+0x8609 \| 0x7f64d0520353 [48 89 c7 b8 3c 00 00 00]: libc:clone+0x43/0x5e That said, everything was correctly done by qc_prep_pkts() to prevent such a case. But this relied on the hypothesis that the list of encryption levels it used was always built in the same order as follows for 0-RTT sessions: initial, early-data, handshake, application But this order is determined by the order in which the TLS stack derives the secrets for these encryption levels. 
For aws-lc, this order is not the same but as follows: initial, handshake, application, early-data During 0-RTT sessions, the server may have to build three ack-eliciting packets (with CRYPTO data inside) to reply to the first client packet: initial, handshake, application. qc_prep_pkts() adds a PADDING frame to the last built packet for the last encryption level in the list. But after the application encryption level, there is the early-data encryption level. This prevented qc_prep_pkts() from building a padded application level last packet to send a 1200-byte datagram. To fix this, always insert the early-data encryption level after the initial encryption level into the encryption levels list when initializing this encryption level from quic_conn_enc_level_init(). Must be backported as far as 2.9.	2024-08-02 15:25:26 +02:00
Christopher Faulet	78b8b60030	BUG/MEDIUM: peer: Notify the applet won't consume data when it waits for sync When the peer applet is waiting for a synchronisation with the global sync task, we must notify that it won't consume data. Otherwise, if some data are already waiting in the input buffer, the applet will be woken up in a loop and this will trigger the watchdog. Once synchronized, the applet is woken up. In that case, the peer applet must indicate it is going to consume data again. This patch should fix the issue #2656. It must be backported to 3.0.	2024-08-02 08:42:29 +02:00
Christopher Faulet	184f16ded7	BUG/MEDIUM: mux-h2: Propagate term flags to SE on error in h2s_wake_one_stream When a stream is explicitly woken up by the H2 connection, if an error condition is detected, the corresponding error flag is set on the SE. So SE_FL_ERROR or SE_FL_ERR_PENDING, depending on whether the end of stream was reported or not. However, there is no attempt to propagate other termination flags. We must be sure to properly set SE_FL_EOI and SE_FL_EOS when appropriate to be able to switch a pending error to a fatal error. Because of this bug, the SE remains with a pending error and no end of stream, preventing the applicative stream from truly aborting it. It means that in some abort scenarios, it is possible to block a stream indefinitely. This patch must be backported at least as far as 2.8. No bug was observed on older versions although the same code is in use.	2024-08-02 08:42:28 +02:00
Christopher Faulet	6743e128f3	BUG/MEDIUM: h2: Only report early HTX EOM for tunneled streams For regular H2 messages, the HTX EOM flag is synonymous with the end of input. So SE_FL_EOI flag must also be set on the stream-endpoint descriptor. However, there is an exception. For tunneled streams, the end of message is reported on the HTX message just after the headers. But in that case, no end of input is reported on the SE. But here, there is a bug. The "early" EOM is also reported on the HTX messages when there is no payload (for instance a content-length set to 0). If there is no ES flag on the H2 HEADERS frame, it is an unexpected case. Because for the applicative stream and most probably for the opposite endpoint, the message is considered as finished. It is switched in its DONE state (or the equivalent on the endpoint). But, if an extra H2 frame with the ES flag is received, a TRAILERS frame or an empty DATA frame, an extra EOT HTX block is pushed to carry the HTX EOM flag. So an extra HTX block is emitted for a regular HTX message. It is totally invalid, it must never happen. Because it is an undefined behavior, it is difficult to predict the result. But it definitely prevents the applicative stream from properly handling aborts and errors because data remain blocked in the channel buffer. Indeed, the end of the message was seen, so no more data are forwarded. It seems to be an issue for 2.8 and upper. Harder to evaluate for older versions. This patch must be backported as far as 2.4.	2024-08-02 08:42:28 +02:00
Christopher Faulet	0ba6202796	BUG/MEDIUM: http-ana: Report error on write error waiting for the response When we are waiting for the server response, if an error is pending on the frontend side (a write error on client), it is handled as an abort and all regular response analyzers are removed, except the one responsible for releasing the filters, if any. However, while it is handled as an abort, the error is not reported, as usual, via the http_reply_and_close() function. It is an issue because in that case, the channel buffers are not reset. Because of this bug, it is possible to block a stream indefinitely. The request side is waiting for the response side and the response side is blocked because filters must be released and this cannot be done because data remain blocked in the channel buffers. So, in that case, calling http_reply_and_close() with no message is enough to unblock the stream. This patch must be backported as far as 2.8.	2024-08-02 08:42:28 +02:00
Amaury Denoyelle	7a5a30d28a	BUG/MINOR: h2: reject extended connect for h2c protocol This commit prevents forwarding of an HTTP/2 Extended CONNECT when the "h2c" or "h2" token is set as the targeted protocol. Contrary to the previous commit which deals with the HTTP/1 mux, this time the request is rejected and a RESET_STREAM is reported to the client. This must be backported up to 2.4 after a period of observation.	2024-08-01 18:23:44 +02:00
Amaury Denoyelle	7b89aa5b19	BUG/MINOR: h1: do not forward h2c upgrade header token haproxy supports tunnel establishment through the HTTP Upgrade mechanism. Since the following commit, extended CONNECT is also supported for HTTP/2 both on frontend and backend side. commit 9bf957335e2c385b74901481f7a89c9565dfce53 MEDIUM: mux_h2: generate Extended CONNECT from htx upgrade As specified by the HTTP/2 RFC, "h2c" can be used by an HTTP/1.1 client to request an upgrade to HTTP/2. In haproxy, this is not supported so it silently ignores this. However, Connection and Upgrade headers are forwarded as-is on the backend side. If using HTTP/1 on the backend side and the server supports this upgrade mechanism, haproxy won't be able to parse the HTTP response. If using HTTP/2, the mux backend incorrectly tries to convert the request to an Extended CONNECT with h2c protocol, which may also prevent the response from being transmitted. To fix this, flag HTTP/1 requests with an "h2c" or "h2" token in an upgrade header. On converting the header list to HTX, the upgrade header is skipped if any of these tokens is present and the H1_MF_CONN_UPG flag is removed. This issue can easily be reproduced using the curl --http2 argument to connect to an HTTP/1 frontend. This must be backported up to 2.4 after a period of observation.	2024-08-01 18:23:32 +02:00
Amaury Denoyelle	a7a2db4ad5	BUG/MINOR: quic: fix fc_lost Control layer callback get_info has recently been implemented for QUIC. However, fc_lost always returned 0. This is because quic_get_info() does not use the correct input argument value to identify the lost value. This does not need to be backported.	2024-08-01 11:35:27 +02:00
Amaury Denoyelle	522c3bea2c	BUG/MINOR: quic: fix fc_rtt/srtt values QUIC has recently implemented the get_info callback to return RTT/sRTT values. However, it uses milliseconds, contrary to TCP which uses microseconds. This causes smp fetch functions to return invalid values. Fix this by converting QUIC values to microseconds. This does not need to be backported.	2024-08-01 11:35:27 +02:00
Amaury Denoyelle	4b0bda42f7	MINOR: flags/mux-quic: decode qcc and qcs flags Decode QUIC MUX connection and stream elements via qcc_show_flags() and qcs_show_flags(). Flag definitions have been moved outside of USE_QUIC to ease compilation of the flags binary.	2024-07-31 17:59:35 +02:00
Frederic Lecaille	f7f76b8b0d	MINOR: quic: Define ->get_info() control layer callback for QUIC This low level callback may be called by several sample fetches for frontend connections like "fc_rtt", "fc_rttvar" etc. Define this callback for QUIC protocol as pointer to quic_get_info(). This latter supports these sample fetches: "fc_lost", "fc_reordering", "fc_rtt" and "fc_rttvar". Update the documentation consequently.	2024-07-31 10:29:42 +02:00
Frederic Lecaille	1733dff42a	MINOR: tcp_sample: Move TCP low level sample fetch function to control layer Add ->get_info() new control layer callback definition to protocol struct to retrieve statistical counters information at transport layer (TCPv4/TCPv6) identified by an integer into a long long int. Move the TCP specific code from get_tcp_info() to the tcp_get_info() control layer function (src/proto_tcp.c) and define it as the ->get_info() callback for TCPv4 and TCPv6. Note that get_tcp_info() is called for several TCP sample fetches. This patch is useful to support some of these sample fetches for QUIC and to keep the code simple and easy to maintain.	2024-07-31 10:29:42 +02:00
Amaury Denoyelle	bba6baff30	BUG/MEDIUM: quic: prevent conn freeze on 0RTT undeciphered content Received QUIC packets are stored in quic_conn Rx buffer after header protection removal in qc_rx_pkt_handle(). These packets are then removed after quic_conn IO handler via qc_treat_rx_pkts(). If HP cannot be removed, packets are still copied into quic_conn Rx buffer. This can happen if encryption level TLS keys are not yet available. The packet remains in the buffer until HP can be removed and its content processed. An issue occurs if the client emits a 0-RTT packet but haproxy does not have the shared secret, for example after a haproxy process restart. In this case, the packet is copied in quic_conn Rx buffer but its HP won't ever be removed. This prevents the buffer from being purged. After some time, if the client has emitted enough packets, the Rx buffer won't have any space left and received packets are dropped. This will cause the connection to freeze. To fix this, remove any 0-RTT buffered packets on handshake completion. At this stage, 0-RTT packets are no longer necessary. The client is expected to reemit its content in 1-RTT packets which are properly deciphered. This can easily be reproduced with HTTP/3 POST requests or by retrieving a big enough object, which will fill the Rx buffer with ACK frames. Here is a picoquic command to provoke the issue on haproxy startup : $ picoquicdemo -Q -v 00000001 -a h3 <hostname> 20443 "/?s=1g" Note that allow-0rtt must be present on the bind line to trigger the issue. Else haproxy will reject any 0-RTT packets. This must be backported up to 2.6. This could be one of the reasons for github issue #2549 but it's unsure for now.	2024-07-31 10:24:53 +02:00
William Lallemand	f76e8e50f4	BUILD: ssl: replace USE_OPENSSL_AWSLC by OPENSSL_IS_AWSLC Replace USE_OPENSSL_AWSLC by OPENSSL_IS_AWSLC in the source code, so we won't need to set USE_OPENSSL_AWSLC in the Makefile in the long term.	2024-07-30 18:53:08 +02:00
William Lallemand	1889b86561	BUG/MEDIUM: ssl: 0-RTT initialized at the wrong place for AWS-LC Revert patch fcc8255 "MINOR: ssl_sock: Early data disabled during SSL_CTX switching (aws-lc)". The patch was done in the wrong callback which is never built for AWS-LC, and applies options on the SSL_CTX instead of the SSL, which should never be done elsewhere than in the configuration parsing. This was probably triggered by successfully linking haproxy against AWS-LC without using USE_OPENSSL_AWSLC. The patch also reintroduced SSL_CTX_set_early_data_enabled() in the ssl_quic_initial_ctx() and ssl_sock_initial_ctx(). So the initial_ctx does have the right setting, but it still needs to be applied to the selected SSL_CTX in the clienthello, because we need it on the selected SSL_CTX. Must be backported to 3.0. (ssl_clienthello.c part was in ssl_sock.c)	2024-07-30 18:53:08 +02:00
William Lallemand	56eefd6827	BUG/MEDIUM: ssl: reactivate 0-RTT for AWS-LC Then reactivate HAVE_SSL_0RTT and HAVE_SSL_0RTT_QUIC for AWS-LC, which were wrongly deactivated in f5353f2c ("MINOR: ssl: add HAVE_SSL_0RTT constant"). Must be backported to 3.0.	2024-07-30 18:53:08 +02:00
Willy Tarreau	376b147fff	BUG/MINOR: stconn: bs.id and fs.id had their dependencies incorrect The backend depends on the response and the frontend on the request, not the other way around. In addition, they used to depend on L6 (hence contents in the channel buffers) while they should only depend on L5 (permanent info known in the mux). This came in 2.9 with commit 24059615a7 ("MINOR: Add sample fetches to get the frontend and backend stream ID") so this can be backported there. (cherry picked from commit 61dd0156c82ea051779e6524cad403871c31fc5a) Signed-off-by: Willy Tarreau <w@1wt.eu>	2024-07-30 18:39:29 +02:00
Christopher Faulet	d9f41b1d6e	BUILD: mux-pt: Use the right name for the sedesc variable A typo was introduced in 760d26a86 ("BUG/MEDIUM: mux-pt/mux-h1: Release the pipe on connection error on sending path"). The sedesc variable is 'sd', not 'se'. This patch must be backported with the commit above.	2024-07-30 10:44:00 +02:00
Christopher Faulet	760d26a862	BUG/MEDIUM: mux-pt/mux-h1: Release the pipe on connection error on sending path When data are sent using the kernel splicing, if a connection error occurred, the pipe must be released. Indeed, in that case, no more data can be sent and there is no reason to not release the pipe. But it is in fact an issue for the stream because the channel will appear as not empty. This may prevent the stream from being released. This happens on 2.8 when a filter is also attached to it. On 2.9 and upper, it seems there is no issue. But it is hard to be sure and the current patch remains valid in all cases. On 2.6 and lower, the code is not the same and, AFAIK, there is no issue. This patch must be backported to 2.8. However, on 2.8, there is no zero-copy data forwarding. The patch must be adapted. There are no done_ff/resume_ff callback functions for muxes. The pipe must be released in sc_conn_send() when an error flag is set on the SE, after the call to the snd_pipe callback function.	2024-07-30 09:05:25 +02:00
Christopher Faulet	5dc45445ff	BUG/MEDIUM: stconn: Report error on SC on send if a previous SE error was set When a send on a connection is performed, if a SE error (or a pending error) was already reported earlier, we leave immediately. No send is performed. However, we must be sure to report the error at the SC level if necessary. Indeed, the SE error may have been reported during the zero-copy data forwarding. So during receive on the opposite side. In that case, we may have missed the opportunity to report it at the SC level. The patch must be backported as far as 2.8.	2024-07-30 09:05:25 +02:00
Christopher Faulet	33c9562f07	DOC: config: Add documentation about spop mode for backends The SPOE was refactored. Now backends referenced by a SPOE filter must use the spop mode to be able to use the spop multiplexer for server connections. The "spop" mode was added to the list of supported modes for backends.	2024-07-30 09:05:25 +02:00
Willy Tarreau	5541d4995d	BUG/MEDIUM: queue: deal with a rare TOCTOU in assign_server_and_queue() After checking that a server or backend is full, it remains possible to call pendconn_add() just after the last pending request finishes, so that there's no more connection on the server for very low maxconn (typ 1), leaving new ones in queue till the timeout. The approach depends on where the request was queued, though: - when queued on a server, we can simply detect that we may dequeue pending requests and wake them up, it will wake our request and that's fine. This needs to be done in srv_redispatch_connect() when the server is set. - when queued on a backend, it means that all servers are done with their requests. It means that all servers were full before the check and all were empty after. In practice this will only concern configs with fewer servers than threads. It's where the issue was first spotted, and it's very hard to reproduce with more than one server. In this case we need to load-balance again in order to find a spare server (or even to fail). For this, we call the newly added dedicated function pendconn_must_try_again() that tells whether or not a blocked pending request was dequeued and needs to be retried. This should be backported along with pendconn_must_try_again() to all stable versions, but with extreme care because over time the queue's locking evolved.	2024-07-29 09:27:01 +02:00
Willy Tarreau	1a8f3a368f	MINOR: queue: add a function to check for TOCTOU after queueing There's a rare TOCTOU case that happens from time to time with maxconn 1 and multiple threads. Between the moment we see the queue full and the moment we queue a request, it's possible that the last request on the server or proxy ended and that no other one is left to offer it its place. Given that all this code path is performance-critical and we cannot afford to increase the lock duration, better recheck for the condition after queueing. For this we need to be able to check for the condition and cleanly dequeue a request. That's what this patch provides via the new function pendconn_must_try_again(). It will catch more requests than absolutely needed though it will catch them all. It may find that around 1/1000 of requests are at risk, though testing shows that in practice, it's around 1 per million that really gets stuck (other ones benefit from timing and finishing late requests). Maybe in the future some conditions might be refined but it's harmless. What happens to such requests is that they're dequeued and their pendconn freed, so that the caller can decide to try to LB or queue them again. For now the function is not used, it's just added separately for easier tracking.	2024-07-29 09:27:01 +02:00
Willy Tarreau	4316ef2eab	BUILD: cfgparse-quic: fix build error on Solaris due to missing netinet/in.h Since commit 35470d518 ("MINOR: quic: activate UDP GSO for QUIC if supported"), Solaris build fails due to netinet/udp.h being included without netinet/in.h. Adding it is sufficient to fix the problem. No backport is needed.	2024-07-28 14:59:23 +02:00
Christopher Faulet	46b1fec0e9	BUG/MEDIUM: jwt: Clear SSL error queue on error when checking the signature When the signature included in a JWT is verified, if an error occurred, one or more SSL errors are queued and never cleared. These errors may then be caught by the SSL stack and a fatal SSL error may be erroneously reported during an SSL receive or send. So we must take care to clear the SSL error queue when the signature verification fails. This patch should fix issue #2643. It must be backported as far as 2.6.	2024-07-26 16:59:00 +02:00
Frederic Lecaille	4abaadd842	MINOR: quic: Dump TX in flight bytes vs window values ratio. Display the ratio of the numbers of bytes in flight by packet number spaces versus the current window values in percent.	2024-07-26 16:42:44 +02:00
Frederic Lecaille	76ff8afa2d	MINOR: quic: Add information to "show quic" for CUBIC cc. Add ->state_cli() new callback to quic_cc_algo struct to define a function called by the "show quic (cc\|full)" commands to dump some information about the congestion algorithm internal state currently in use by the QUIC connections. Implement this callback for CUBIC algorithm to dump its internal variables: - K: (the time to reach the cubic curve inflexion point), - last_w_max: the last maximum window value reached before entering the last recovery period. This is also the window value at the inflexion point of the cubic curve, - wdiff: the difference between the current window value and last_w_max. So negative before the inflexion point, and positive after.	2024-07-26 16:42:44 +02:00
Willy Tarreau	2dab1ba84b	MEDIUM: h1: allow to preserve keep-alive on T-E + C-L In 2.5-dev9, commit 631c7e866 ("MEDIUM: h1: Force close mode for invalid uses of T-E header") enforced a recently arrived new security rule in the HTTP specification aiming at preventing a class of content-smuggling attacks involving HTTP/1.0 agents. It consists in handling the very rare T-E + C-L requests or responses in close mode. It happens that it does have an impact on a rare few and very old clients (probably running insecure TLS stacks by the way) that continue to send both with their POST requests. The impact is that for each and every request they'll have to reconnect, possibly negotiating a full TLS handshake that becomes harmful to the machine in terms of CPU computation. This commit adds a new option "h1-do-not-close-on-insecure-transfer-encoding" that does exactly what it says, it just asks not to close on such messages, even though the message continues to be sanitized and C-L dropped. It means that the risk is only between the sender and haproxy, which is limited, and might be the only acceptable solution for such environments having to deal with broken implementations. The cases are so rare that it should not need to be backported, or in the worst case, to the latest LTS if there is any demand.	2024-07-26 15:59:35 +02:00
Amaury Denoyelle	85131f91bf	BUG/MEDIUM: quic: fix invalid conn reject with CONNECTION_REFUSED quic-initial rules were implemented just recently. For some actions, a new flags field was added in quic_dgram structure. This is used to report the result of the rules execution. However, this flags field was left uninitialized. Depending on its value, it may cause the connection to be wrongly rejected via CONNECTION_REFUSED. Fix this by properly setting the flags value to 0. No need to backport.	2024-07-26 15:24:35 +02:00
Amaury Denoyelle	08515af9df	MINOR: quic: implement send-retry quic-initial rules Define a new quic-initial "send-retry" rule. This allows to force the emission of a Retry packet on an initial without token instead of instantiating a new QUIC connection.	2024-07-25 15:39:39 +02:00
Amaury Denoyelle	69d7e9f3b7	MINOR: quic: implement reject quic-initial action Define a new quic-initial action named "reject". Contrary to dgram-drop, the client is notified of the rejection by a CONNECTION_CLOSE with CONNECTION_REFUSED error code. To be able to emit the necessary CONNECTION_CLOSE frame, quic_conn is instantiated, contrary to dgram-drop action. quic_set_connection_close() is called immediately after qc_new_conn() which prevents the handshake startup.	2024-07-25 15:39:39 +02:00
Amaury Denoyelle	f91be2657e	MINOR: quic: pass quic_dgram as obj_type for quic-initial rules To extend quic-initial rules, pass quic_dgram instance to argument for the various actions. As such, quic_dgram is now supported as an obj_type and can be used in session origin field.	2024-07-25 15:39:39 +02:00
Amaury Denoyelle	1259700763	MINOR: quic: support ACL for quic-initial rules Add ACL condition support for quic-initial rules. This requires the extension of quic_parse_quic_initial() to parse an extra if/unless block. Only layer4 client samples are allowed to be used with quic-initial rules. However, due to the early execution of quic-initial rules prior to any connection instantiation, some samples are not supported. To be able to use the 4 described samples, a dummy session is instantiated before quic-initial rules execution. Its src and dst fields are set from the received datagram values.	2024-07-25 15:39:39 +02:00
Amaury Denoyelle	cafe596608	MEDIUM: quic: implement quic-initial rules Implement a new set of rules labelled as quic-initial. These rules are specific to QUIC. They are scheduled to be executed early on Initial packet parsing, prior to a new QUIC connection instantiation. Contrary to tcp-request connection, this allows to reject traffic earlier, most notably by avoiding unnecessary QUIC SSL handshake processing. A new module quic_rules is created. Its main function quic_init_exec_rules() is called on Initial packet parsing in function quic_rx_pkt_retrieve_conn(). For the moment, only "accept" and "dgram-drop" are valid actions. Both are final. The latter silently drops the Initial packet instead of allocating a new QUIC connection.	2024-07-25 15:39:39 +02:00
Amaury Denoyelle	a72e82c382	MINOR: quic: delay Retry emission on quic-force-retry Currently, quic Retry packets are emitted for two different reasons after processing an Initial without token : - quic-force-retry is set on bind-line - an abnormal number of half-open connections is currently detected Previously, these two conditions were checked separately in different functions during datagram parsing. Uniformize this by moving the quic-force-retry check into quic_rx_pkt_retrieve_conn() along with the second condition check. The purpose of this patch is to uniformize datagram parsing stages. It is necessary to implement quic-initial rules in quic_rx_pkt_retrieve_conn() prior to any Retry emission. This prevents emitting an unnecessary Retry if an Initial is subject to a reject rule.	2024-07-25 15:29:50 +02:00
Aurelien DARRAGON	e328056ddc	MEDIUM: sink: assume sft appctx stickiness As mentioned in b40d804 ("MINOR: sink: add some comments about sft->appctx usage in applet handlers"), there are few places in the code where it looks like we assumed that the applet callbacks such as sink_forward_session_init() or sink_forward_io_handler() could be executing an appctx whose sft is detached from the appctx (appctx != sft->appctx). In practise this should not be happening since an appctx sticks to the same thread its entire lifetime, and the only times sft->appctx is effectively assigned is during the session/appctx creation (in process_sink_forward()) or release. Thus if sft->appctx wouldn't point to the appctx that the sft was bound to after appctx creation, it would probably indicate a bug rather than an expected condition. To further emphasize that and prevent the confusion, and since 3.1-dev4 was released, let's remove such checks and instead add a BUG_ON to ensure this never happens. In _sink_forward_io_handler(), the "hard_close" label was removed since there are no more uses for it (no hard errors may be caught from the function for now)	2024-07-25 14:56:19 +02:00
William Lallemand	28cb01f8e8	MEDIUM: quic: implement CHACHA20_POLY1305 for AWS-LC With AWS-LC, the aead part is covered by the EVP_AEAD API which provides the correct EVP_aead_chacha20_poly1305(), however for header protection it does not provide an EVP_CIPHER for chacha20. This patch implements exceptions in the header protection code and uses EVP_CIPHER_CHACHA20 and EVP_CIPHER_CTX_CHACHA20 placeholders so we can use the CRYPTO_chacha_20() primitive manually instead of the EVP_CIPHER API. This requires checking whether we are using EVP_CIPHER_CTX_CHACHA20 when doing EVP_CIPHER_CTX_free().	2024-07-25 13:45:39 +02:00
William Lallemand	177c84808c	MEDIUM: quic: add key argument to header protection crypto functions In order to prepare the code for using Chacha20 with the EVP_AEAD API, both quic_tls_hp_decrypt() and quic_tls_hp_encrypt() need an extra key argument. Indeed Chacha20 does not exist as an EVP_CIPHER in AWS-LC, so the key won't be embedded into the EVP_CIPHER_CTX, so we need an extra parameter to use it.	2024-07-25 13:45:39 +02:00
William Lallemand	d55a297b85	MINOR: quic: rename confusing wording aes to hp Some of the crypto functions used for header protection in QUIC are named with an "aes" name even though they are not used for AES encryption only. This patch renames these "aes" to "hp" so it is clearer.	2024-07-25 13:45:38 +02:00
William Lallemand	31c831e29b	MEDIUM: ssl/quic: implement quic crypto with EVP_AEAD The QUIC crypto is using the EVP_CIPHER API in order to achieve authenticated encryption, this was the API which was used with OpenSSL. With libraries that are inspired by BoringSSL (LibreSSL and AWS-LC), the AEAD algorithms are implemented using the EVP_AEAD API. This patch converts the calls to the EVP_CIPHER API when called in the context of AEAD cryptography for QUIC. The patch defines some QUIC_AEAD macros that can be either EVP_CIPHER or EVP_AEAD depending on the library. This was mainly done for AWS-LC but this could be useful for other libraries. This should finally allow to use CHACHA20_POLY1305 with AWS-LC. This patch allows to use the following ciphers with the EVP_AEAD API: - TLS1_3_CK_AES_128_GCM_SHA256 - TLS1_3_CK_AES_256_GCM_SHA384 AWS-LC does not implement TLS1_3_CK_AES_128_CCM_SHA256 and TLS1_3_CK_CHACHA20_POLY1305_SHA256 requires some hack for header protection which will come in another patch.	2024-07-25 13:45:38 +02:00
Frederic Lecaille	a6d40e09f7	BUG/MINOR: quic: Lack of precision when computing K (cubic only cc) K cubic variable is stored in ms. But it was a formula with the second as unit for the window difference parameter which was used to compute K without considering the loss of information. Then the result was converted to ms (K *= 1000). This led to a lack of precision and multiples of 1000 as values. To fix this, use the same formula but with the window difference in ms as parameter passed to the cubic function and remove the conversion. Must be backported as far as 2.6.	2024-07-24 18:24:39 +02:00
Willy Tarreau	7eca16921b	[RELEASE] Released version 3.1-dev4 Released version 3.1-dev4 with the following main changes : - MINOR: limits: prepare to keep limits in one place - REORG: fd: move raise_rlim_nofile to limits - CLEANUP: fd: rm struct rlimit definition - REORG: global: move rlim_fd__at_boot in limits - MINOR: haproxy: prepare to move limits-related code - REORG: haproxy: move limits handlers to limits - MINOR: limits: add is_any_limit_configured - CLEANUP: quic: remove obsolete comment on send - MINOR: quic: extend detection of UDP API OS features - MINOR: quic: activate UDP GSO for QUIC if supported - MINOR: quic: define quic_cc_path MTU as constant - MINOR: quic: add GSO parameter on quic_sock send API - MAJOR: quic: support GSO when encoding datagrams - MEDIUM: quic: implement GSO fallback mechanism - MINOR: quic: add counters of sent bytes with and without GSO - BUG/MEDIUM: bwlim: Be sure to never set the analyze expiration date in past - CLEANUP: proto: rename TID affinity callbacks - CLEANUP: quic: rename TID affinity elements - BUG/MINOR: limits: fix license type in limits.h - BUG/MINOR: session: Eval L4/L5 rules defined in the default section - CLEANUP: stconn: Fix a typo in comments for SE_ABRT_SRC_ - MEDIUM: spoe: Remove fragmentation support - MEDIUM: spoe: Remove async mode support - MINOR: spoe: Use only a global engine-id per agent - MINOR: spoe: Remove debugging - MAJOR: spoe: Remove idle applets and pipelining support - MINOR: spoe: Remove the dedicated SPOE applet task - MEDIUM: proxy/spoe: Add a SPOP mode - MEDIUM: applet: Add a .shut callback function for applets - MINOR: connection: No longer include stconn type header in connection-t.h - MINOR: stconn: Use a dedicated function to get the opposite sedesc - MINOR: spoe: Rename some flags and constant to use SPOP prefix - MINOR: spoe: Dynamically alloc the message list per event of an agent - MINOR: spoe: Move all stuff regarding the filter/applet in the C file - MINOR: spoe: Move 
spoe_str_to_vsn() into the header file - MEDIUM: mux-spop: Introduce the SPOP multiplexer - MEDIUM: check/spoe: Use SPOP multiplexer to perform SPOP health-checks - MAJOR: spoe: Rewrite SPOE applet to use the SPOP mux - CLEANUP: spoe: Uniformize function definitions - MINOR: spoe: Add internal sample fetch to retrieve the SPOE engine ID - MEDIUM: spoe: Set a specific name for the connection pool of SPOP servers - MINOR: backend: Remove test on HTX streams to reuse idle connections on connect - MEDIUM: spoe: Force the reuse 'always' mode for SPOP backends - MINOR: mux-spop: Use a dedicated function to update the SPOP connection timeout - MAJOR: mux-spop: Make the SPOP connections reusable - MINOR: stats-html: Display reuse ratio for spop connections - MEDIUM: spoe: Directly xfer NOTIFY frame when SPOE applet is created - MEDIUM: spoe: Directly receive ACK frame in the SPOE context buffer - MEDIUM: mux-spop/spoe: Save negociated max-frame-size value in the mux - MINOR: spoe: Remove the spop version from the SPOE appctx context - MEDIUM: mux-spop: Add checks on received frames - MEDIUM: mux-spop: Announce the pipeling support if possible - MEDIUM: spoe: Forward SPOE context error to the SPOE applet - MEDIUM: spoe: Make the SPOE applet use its own buffers - DOC: spoe: Update SPOE documentation to reflect recent refactoring - BUILD: mux-spop: fix build failure on gcc 4-10 and clang - MINOR: fd: don't scan the full fdtab on all threads - MINOR: server: better mt_list usage for node migration (prev_deleted handling) - BUG/MINOR: do not close uninit FD in quic_test_socketops() - BUG/MEDIUM: debug/cli: fix "show threads" crashing with low thread counts - MINOR: debug: prepare feed_post_mortem_late - CLEANUP: debug: fix indents in debug_parse_cli_show_dev - MINOR: debug: store runtime uid/gid in postmortem - MINOR: debug: keep runtime capabilities in post_mortem - MINOR: debug: use LIM2A to show limits - MINOR: debug: prepare to show runtime limits - MINOR: debug: keep 
runtime limits in postmortem - DOC: install: don't reference removed CPU arg - BUG/MEDIUM: ssl_sock: fix deadlock in ssl_sock_load_ocsp() on error path - BUG/MAJOR: mux-h2: force a hard error upon short read with pending error - MEDIUM: sink: start applets asynchronously - OPTIM: sink: balance applets accross threads - MEDIUM: ocsp: fix ocsp when the chain is loaded from 'issuers-chain-path' - MEDIUM: ssl: add extra_chain to ckch_data - MINOR: ssl: change issuers-chain for show_cert_detail() - REGTESTS: ssl: test the issuers-chain-path keyword - DOC: configuration: issuers-chain-path not compatible with OCSP - DOC: configuration: issuers-chain-path is compatible with OCSP - BUG/MEDIUM: startup: fix zero-warning mode - BUILD: tree-wide: cast arguments to tolower/toupper to unsigned char (2) - MINOR: cfgparse-global: move mode's keywords in cfg_kw_list - MINOR: cfgparse-global: move no<poller_name> in cfg_kw_list - DOC: config: improve the http-keep-alive section - BUG/MINOR: stick-table: fix crash for src_inc_gpc() without stkcounter - BUG/MINOR: server: Don't warn fallback IP is used during init-addr resolution - BUG/MINOR: cli: Atomically inc the global request counter between CLI commands - MINOR: stream: Add a pointer to set the parent stream - MINOR: vars: Fill a description instead of hash and scope when a name is parsed - MINOR: vars: Use a description to set/unset a variable instead of its hash and scope - MEDIUM: vars: Be able to parse parent scopes for variables - MINOR: vars: Use a variable description to get variables of a specific scope - MEDIUM: vars: Be able to retrieve variable of the parent stream, if any - MEDIUM: spoe: Set the parent stream for SPOE streams - BUG/MINOR: quic: Non optimal first datagram. 
- DOC: config: Add a dedicated section about variables - DOC: config: Add info about variable scopes referencing the parent stream - DOC: config: Explicitly state the SPOE streams have a usable parent stream - MINOR: quic: Avoid cc priv buffer overflow. - MINOR: spoe: Add a function to validate a version is supported - MINOR: spoe: export the list of SPOP error reasons - MEDIUM: spoe/tcpcheck: Reintroduce SPOP check as a customized tcp-check - REGTESTS: check/spoe: Re-enable the script performing SPOP health-checks - BUG/MEDIUM: sink: properly init applet under sft lock - MINOR: sink: unify and sink_forward_io_handler() and sink_forward_oc_io_handler() - MINOR: sink: Remove useless test on SE_FL_SHR/SHW flags - MINOR: sink: merge sink_forward_io_handler() with sink_forward_oc_io_handler() - MINOR: sink: add some comments about sft->appctx usage in applet handlers - MINOR: sink: distinguish between hard and soft close in _sink_forward_io_handler() - MEDIUM: sink: don't set NOLINGER flag on the outgoing stream interface - MINOR: ring: count processed messages in ring_dispatch_messages() - MINOR: sink: add processed events counter in sft - MEDIUM: sink: "max-reuse" support for sink servers - OPTIM: sink: consider threads' current load when rebalancing applets v3.1-dev4	2024-07-24 18:20:24 +02:00
Aurelien DARRAGON	2513bd257f	OPTIM: sink: consider threads' current load when rebalancing applets In c454296f0 ("OPTIM: sink: balance applets accross threads"), we already made sure to balance applets across threads by picking a random thread to spawn the new applet. Also, thanks to the previous commit, we have the ability to destroy the applet once a certain number of messages have been processed, to help distribute the load during runtime. Let's improve that by trying up to 3 different threads in the hope of picking a non-overloaded one in the best scenario, and the least overloaded one in the worst case. This should help to better distribute the load over multiple threads when high loads are expected. The logic was greatly inspired by the thread migration logic used by server health checks, but it was simplified for sink's use case.	2024-07-24 17:59:18 +02:00
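The selection rule described in 2513bd257f (try a few candidate threads, take the first non-overloaded one, otherwise the least loaded) could be sketched as follows. This is only an illustration under assumptions: `pick_among()`, the `load` array and the `overloaded` threshold are hypothetical, and the candidates are expected to have been drawn at random by the caller. HAProxy's actual load metric and helpers are not shown in this log.

```c
#include <stddef.h>

/* Hypothetical helper, not HAProxy's implementation.
 * load[t]    : current load of thread t (any monotonic metric)
 * cand       : candidate thread ids, e.g. up to 3 random picks
 * ncand      : number of candidates, must be >= 1
 * overloaded : a thread whose load is >= this value counts as overloaded
 * Returns the first non-overloaded candidate, or the least loaded
 * candidate when all of them are overloaded. */
static int pick_among(const unsigned *load, const int *cand,
                      size_t ncand, unsigned overloaded)
{
    int best = cand[0];

    for (size_t i = 0; i < ncand; i++) {
        int t = cand[i];

        if (load[t] < overloaded)
            return t;          /* best scenario: stop at once */
        if (load[t] < load[best])
            best = t;          /* remember the least loaded so far */
    }
    return best;               /* worst case: least overloaded one */
}
```

Bounding the number of tries keeps the cost of spawning an applet constant, at the price of possibly missing an idle thread that was not among the candidates.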
Aurelien DARRAGON	237849c911	MEDIUM: sink: "max-reuse" support for sink servers Thanks to the previous commit, it is now possible to know how many events were processed for a given sft/server sink pair. As mentioned in commit c454296 ("OPTIM: sink: balance applets accross threads"), let's provide the ability to restart a server connection once a certain number of events have been processed, to help better balance the load over multiple threads. For this, we make use of the "max-reuse" server keyword, which was only relevant under "http" context so far. Under sink context, "max-reuse" corresponds to the number of times the tcp connection can be reused for sending messages, which in fact means that "max-reuse + 1" is the number of events (ie: messages) that are allowed to be sent using the same tcp server connection: when this threshold is met, the connection will be destroyed and a new one will be created on a random thread. The value is not strict: it is the minimum value above which the connection may be destroyed, since the value is checked after ring_dispatch_messages(), which may process multiple messages at once. By default, no limit is enforced (the connection will be reused for as long as it is available). The documentation was updated accordingly.	2024-07-24 17:59:14 +02:00
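The "max-reuse + 1" threshold from 237849c911 can be expressed as a small predicate. This is a sketch under assumptions, not the code from the commit: `sft_should_recycle()` is a hypothetical name, and a negative `max_reuse` is assumed here to encode the "no limit" default.

```c
#include <stddef.h>

/* Hypothetical predicate, not HAProxy's implementation.
 * e_processed : events already sent over the current connection
 * max_reuse   : configured "max-reuse" value; a negative value is
 *               ASSUMED to mean that no limit is enforced
 * Returns 1 when the connection may be destroyed and recreated. */
static int sft_should_recycle(size_t e_processed, int max_reuse)
{
    if (max_reuse < 0)
        return 0; /* no limit: keep reusing the connection */

    /* max-reuse reuses means max-reuse + 1 events on one connection */
    return e_processed >= (size_t)max_reuse + 1;
}
```

Because the check runs after a whole batch has been dispatched, `e_processed` may already exceed the threshold by the time the predicate returns 1, which matches the "not strict" wording of the commit message.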
Aurelien DARRAGON	709b3db941	MINOR: sink: add processed events counter in sft Add a new struct member to sft structure named e_processed in order to track the total number of events processed by sft applets. sink_forward_oc_io_handler() and sink_forward_io_handler() now make use of ring_dispatch_messages() optional value added in the previous commit in order to increase the number of processed events.	2024-07-24 17:59:08 +02:00
Aurelien DARRAGON	47323e64ad	MINOR: ring: count processed messages in ring_dispatch_messages() ring_dispatch_messages() now takes an optional argument <processed> which must point to a size_t counter when provided. When provided, the value is updated to the number of messages processed by the function.	2024-07-24 17:59:03 +02:00
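The optional out-parameter pattern described in 47323e64ad might look like the following sketch. `dispatch_batch()` is a hypothetical stand-in, not the real ring_dispatch_messages() signature, and the "ring" is reduced to a count of pending messages for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for a dispatch routine.
 * Consumes up to batch_max of the *pending messages. When <processed>
 * is non-NULL, it is set to the number of messages handled by this
 * call; callers that do not need the count pass NULL.
 * Returns the number of messages still pending. */
static size_t dispatch_batch(size_t *pending, size_t batch_max,
                             size_t *processed)
{
    size_t n = *pending < batch_max ? *pending : batch_max;

    *pending -= n;
    if (processed)
        *processed = n;
    return *pending;
}
```

A caller that tracks a running total, as the e_processed counter in 709b3db941 does, would add the reported value to its own counter after each call.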
Aurelien DARRAGON	0821460e3f	MEDIUM: sink: don't set NOLINGER flag on the outgoing stream interface Given that sink applets are responsible for conveying messages from the ring to the tcp server endpoint, there are no protocol timeouts or errors expected there: it is a unidirectional flow of data over TCP. As such, the NOLINGER flag, which was inherited from the peers applet, see dbd026792 ("BUG/MEDIUM: peers: set NOLINGER on the outgoing stream interface"), is not desirable under sink context: The reason why we have the NOLINGER flag set is to ensure the connection is closed right away and avoid the 60s TIME_WAIT delay on closed sockets. The downside is that messages sent right before closing the socket are not guaranteed to make it to the server, because closing with the NOLINGER flag set will result in an RST packet being emitted right away, which could prevent in-flight messages from being properly delivered. Unlike peers applets, the only cases where sink applets are expected to close the connection are upon unexpected error or upon stopping, which are relatively rare events. Thanks to the previous commit, the ERROR flag is already set in case of error, so the use of NOLINGER is not mandatory for the RST to be sent. Now for the stopping case, it only happens once in the process lifetime, so it's acceptable to close the socket using EOS+EOI flags without the NOLINGER option set. So in our case, it is preferable to ensure messages get properly delivered knowing that closed sockets should be piling up in TIME_WAIT, this means removing the NOLINGER flag on the outgoing stream interface for sink applets. It is a prerequisite for upcoming patches in order to cleanly shut the applet during runtime without risking sending the RST packet before all pending messages were sent to the endpoint.	2024-07-24 17:58:58 +02:00
Aurelien DARRAGON	c6ab0e14e2	MINOR: sink: distinguish between hard and soft close in _sink_forward_io_handler() Aborting the socket on soft-stop is not the same as aborting it due to unexpected error. As such, let's leverage the granularity offered by sedesc flags to better reflect the situation: abort during soft-stop is handled as a soft close thanks to EOI+EOS flags, while abort due to unexpected error is handled as hard error thanks to ERROR+EOS flags. Thanks to this change, hard error will always emit RST packet even if the NOLINGER option wasn't set on the socket.	2024-07-24 17:58:52 +02:00
Aurelien DARRAGON	b40d804c7f	MINOR: sink: add some comments about sft->appctx usage in applet handlers There seems to be an ambiguity in the code where sft->appctx would differ from the appctx that was assigned to it upon appctx creation. In practise, it doesn't seem this could be happening. Adding a few notes to come back to this later and try to see if we can remove this ambiguity.	2024-07-24 17:58:47 +02:00
Aurelien DARRAGON	10811fdfd6	MINOR: sink: merge sink_forward_io_handler() with sink_forward_oc_io_handler() Now that sink_forward_oc_io_handler() and sink_forward_io_handler() were unified again thanks to the previous commit, let's take a chance to merge code that is common to both functions in order to ease code maintenance. Let's add _sink_forward_io_handler() internal function which takes the applet and a message handler as argument: sink_forward_io_handler() and sink_forward_oc_io_handler() leverage this internal function by passing the correct message handler for the desired format.	2024-07-24 17:58:41 +02:00

22909 Commits