Commit Graph

17544 Commits

Aurelien DARRAGON
52f0b6edbe MINOR: vars: fix indentation in var_clear_buffer()
Fix indentation in var_clear_buffer() since it is exclusively using
spaces.

Could be backported if a fix depends on it.
2024-01-18 16:31:55 +01:00
Frederic Lecaille
0eaf42a2a4 BUG/MEDIUM: quic: keylog callback not called (USE_OPENSSL_COMPAT)
This bug impacts only the QUIC OpenSSL compatibility module (USE_QUIC_OPENSSL_COMPAT)
and it was introduced by this commit:

    BUG/MINOR: quic: Wrong keylog callback setting.

quic_tls_compat_keylog_callback() was no longer set when the SSL keylog was
enabled by the tune.ssl.keylog setting. This is the callback which sets the
TLS secrets into haproxy.

Set it again when the SSL keylog is not enabled by configuration.

Thank you to @Greg57070 for having reported this issue in GH #2412.

Must be backported as far as 2.8.
2024-01-16 10:17:27 +01:00
Willy Tarreau
7021a8c4d8 BUG/MINOR: mux-h2: also count streams for refused ones
There are a few places where we can reject an incoming stream based on
technical errors such as decoded headers that are too large for the
internal buffers, or memory allocation errors. In this case we send
an RST_STREAM to abort the request, but the total stream counter was
not incremented. That's not really a problem, until one starts trying
to enforce a total stream limit using tune.h2.fe.max-total-streams,
which will not count such faulty streams. Typically, a client that
has learned overly large cookies and tries to replay them in a way that
overflows the maximum buffer size would be rejected and, depending on
how it is implemented, might retry forever.

This patch removes the stream count increment from h2s_new() and moves
it to the calling functions instead, so that it reflects the decision to
process a new stream rather than only successfully decoded streams. The
result is that such a bogus client will now be blocked after reaching
the total stream limit.

This can be validated this way:

  global
        tune.h2.fe.max-total-streams 128
        expose-experimental-directives
        trace h2 sink stdout
        trace h2 level developer
        trace h2 verbosity complete
        trace h2 start now

  frontend h
        bind :8080
        mode http
        redirect location /

Sending this will fill frames with 15972 bytes of cookie headers that
expand to 16500 for storage+index once decoded, causing "message too large"
events:

  (dev/h2/mkhdr.sh -t p;dev/h2/mkhdr.sh -t s;
   for sid in {0..1000}; do
     dev/h2/mkhdr.sh  -t h -i $((sid*2+1)) -f es,eh \
       -R "828684410f7777772e6578616d706c652e636f6d \
           $(for i in {1..66}; do
             echo -n 60 7F 73 433d $(for j in {1..24}; do
               echo -n 2e313233343536373839; done);
            done) ";
   done) | nc 0 8080

Now it properly stops after sending 128 streams.

This may be backported wherever commit 983ac4397 ("MINOR: mux-h2:
support limiting the total number of H2 streams per connection") is
present, since without it, that commit is less effective.
2024-01-12 18:59:59 +01:00
William Lallemand
97832ab823 MEDIUM: ssl: implements 'default-crt' keyword for bind Lines
The 'default-crt' bind keyword allows one to specify multiple
default/fallback certificates, making it possible to have an RSA as well as
an ECDSA default.
2024-01-12 17:40:42 +01:00
William Lallemand
83a0cde207 REORG: ssl: move 'generate-certificates' code to ssl_gencert.c
A lot of code specific to the 'generate-certificates' option was left in
ssl_sock.c.

Move the code to 'ssl_gencert.c' and 'ssl_gencert.h'
2024-01-12 17:40:42 +01:00
William Lallemand
b80635a7e0 MEDIUM: ssl: does not use default_ctx for 'generate-certificate' option
The 'generate-certificates' option does not need its dedicated SSL_CTX
*, it only needs the default SSL_CTX.

Use the default SSL_CTX found in the sni_ctx to generate certificates.

This allows removing all the specific default_ctx initialization, as
well as the default_ssl_conf and 'default_inst'.
2024-01-12 17:40:42 +01:00
William Lallemand
0bf9d122a9 MEDIUM: ssl: generate '*' SNI filters for default certificates
This patch follows the previous one about default certificate selection
("MEDIUM: ssl: allow multiple fallback certificate to allow ECDSA/RSA
selection").

This patch generates '*" SNI filters for the first certificate of a
bind line, it will be used to match default certificates. Instead of
setting the default_ctx pointer in the bind line.

Since the filters are in the SNI tree, this allows having multiple
default certificates and restores the ECDSA/RSA selection with a
multi-cert bundle.

This configuration:
   # foobar.pem.ecdsa and foobar.pem.rsa
   bind *:8443 ssl crt foobar.pem crt next.pem

will use "foobar.pem.ecdsa" and "foobar.pem.rsa" as default
certificates.

Note: there is still cleanup needed around default_ctx.

This was discussed in github issue #2392.
2024-01-12 17:40:42 +01:00
William Lallemand
30592168e5 MEDIUM: ssl: allow multiple fallback certificate to allow ECDSA/RSA selection
This patch changes the default certificate mechanism.

Since the beginning of SSL in HAProxy, the default certificate was the first
certificate of a bind line. This allowed falling back on this certificate
when no servername extension was sent by the client, or when no SAN nor
CN was available in the certificate.

When using a multi-certificate bundle (ecdsa+rsa), it was possible to
have both certificates as the fallback one, letting openssl choose the
right one. This was possible because a multi-certificate bundle
was generating a unique SSL_CTX for both certificates.

When the haproxy and openssl architecture evolved, we decided to
use multiple SSL_CTX for a multi-cert bundle, in order to simplify the
code and allow updates over the CLI.

However only one default_ctx was allowed, so we lost the ability to
choose between ECDSA and RSA for the default certificate.

This patch allows using a '*' filter for a certificate, which makes it
possible to look up among multiple '*' filters and have one in RSA and
another one in ECDSA. It replaces the default_ctx mechanism in the
ClientHello callback and uses the standard algorithm to look for a default
cert and choose between ECDSA and RSA.

/!\ This patch breaks the automatic setting of the default certificate, which
will be reintroduced in the next patch. So the first certificate of a bind
line won't be used as a default anymore.

To use this feature, one could use crt-list with '*' filters:

$ cat foo.crtlist
foobar.pem.rsa   *
foobar.pem.ecdsa *

In order to test the feature, it's easy to send a request without
the servername extension and use ECDSA or RSA compatible ciphers:

$ openssl s_client -connect localhost:8443 -tls1_2 -cipher ECDHE-RSA-AES256-GCM-SHA384
$ openssl s_client -connect localhost:8443 -tls1_2 -cipher ECDHE-ECDSA-AES256-GCM-SHA384
2024-01-12 17:40:42 +01:00
Amaury Denoyelle
333f2cabab BUG/MINOR: mux-quic: do not prevent non-STREAM sending on flow control
Data emitted by QUIC MUX is restrained by the peer flow control. This is
checked on stream and connection level inside qcc_io_send().

The connection level check was placed early in the qcc_io_send() preamble.
However, this also prevents emission of other frames such as STOP_SENDING
and RESET_STREAM until the flow control limit is increased by a received
MAX_DATA. Note that local flow control frames are emitted earlier in
qcc_io_send() and so are not impacted.

In the worst case, if no MAX_DATA is received for some time, this could
significantly delay stream closure and resource freeing. However, this
should be rare as peers should anticipate the emission of MAX_DATA
before reaching the flow control limit. In the end, this is also covered by
the MUX timeout so the impact should be minimal.

To fix this, move the connection level check directly inside the QCS
sending loop. Note that this could cause unnecessary looping when the
connection flow control limit is reached and no STOP_SENDING/RESET_STREAM
frames are needed.

This should be backported up to 2.6.
2024-01-12 16:53:41 +01:00
Ilya Shipitsin
671f6cf36a CLEANUP: fix spelling of "occured" in src/h3.c 2024-01-12 08:34:53 +01:00
Willy Tarreau
4cc25f26f9 MEDIUM: http: add the ability to redefine http-err-codes and http-fail-codes
The new global keywords "http-err-codes" and "http-fail-codes" allow to
redefine which HTTP status codes indicate a client-induced error or a
server error, as tracked by stick-table counters. This is only done
globally, though everything was done so that it could easily be extended
to a per-proxy mechanism if there was a real need for this (but it would
eat quite more RAM then).

A simple reg-test was added (http-err-fail.vtc).
2024-01-11 15:10:08 +01:00
Willy Tarreau
9d827e1049 MEDIUM: http_act: check status codes against the bit fields for err/fail
This drops the hard-coded 4xx and 5xx status codes for err_cnt and
fail_cnt, in favor of the new bit fields that will soon be configurable.
There should be no difference at all since the bit fields are initialized
to the exact same sets (400-499 for err, 500-599 minus 501 and 505 for
fail).
2024-01-11 15:10:08 +01:00
Willy Tarreau
3c135569c5 MINOR: http: add infrastructure to choose status codes for err / fail
At the moment, http_err_cnt and http_fail_cnt are incremented on a
well-defined set of status codes, which are checked at various places.
Over time, there have been some complaints about 404, 401 or 407
triggering errors, or 500 triggering failures in SOAP environments
for example. With a small bit field that fits in a cache line we
can match the presence of a status code from 100 to 599, so that
remains cheap.

This patch adds two such bit fields, one per code class, and the
accompanying functions to set/clear/test the codes. The arrays are
preset at boot time. For now they are not used and it's not possible
to adjust them.
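
For illustration only, here is a minimal self-contained sketch of the idea
(names and layout are hypothetical, not the actual haproxy implementation):
one bit per status code from 100 to 599 needs 500 bits, which fits within a
single 64-byte cache line.

  #include <stdint.h>

  #define HTTP_CODE_FIRST 100
  #define HTTP_CODE_LAST  599
  #define HTTP_CODE_BITS  (HTTP_CODE_LAST - HTTP_CODE_FIRST + 1)

  /* one bit per status code between 100 and 599 */
  typedef struct {
          uint64_t bits[(HTTP_CODE_BITS + 63) / 64];
  } http_code_set;

  static inline void http_code_set_add(http_code_set *s, int status)
  {
          if (status >= HTTP_CODE_FIRST && status <= HTTP_CODE_LAST)
                  s->bits[(status - HTTP_CODE_FIRST) / 64] |=
                          1ULL << ((status - HTTP_CODE_FIRST) % 64);
  }

  static inline int http_code_set_test(const http_code_set *s, int status)
  {
          if (status < HTTP_CODE_FIRST || status > HTTP_CODE_LAST)
                  return 0;
          return (s->bits[(status - HTTP_CODE_FIRST) / 64] >>
                  ((status - HTTP_CODE_FIRST) % 64)) & 1;
  }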
2024-01-11 15:10:08 +01:00
Willy Tarreau
59c01f1091 CLEANUP: http: avoid duplicating literals in find_http_meth()
The function does the inverse of http_known_methods[]; better to rely on
that array and its indices, which makes the code clearer. Note that
we purposely don't use a loop because the compiler is able to build
an evaluation tree of the size checks and content checks that's very
efficient for the most common methods. Moving a few unimportant
entries even simplified the output code a little bit (they're now
grouped by size without changing anything for the first ones).
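
As a self-contained illustration of the table-driven idea (the table contents
and names below are simplified and hypothetical; haproxy's real
find_http_meth() purposely unrolls the comparisons instead of looping):

  #include <string.h>

  enum meth { METH_OPTIONS, METH_GET, METH_HEAD, METH_POST, METH_PUT,
              METH_DELETE, METH_OTHER };

  static const struct { const char *name; size_t len; } known_methods[] = {
          [METH_OPTIONS] = { "OPTIONS", 7 },
          [METH_GET]     = { "GET",     3 },
          [METH_HEAD]    = { "HEAD",    4 },
          [METH_POST]    = { "POST",    4 },
          [METH_PUT]     = { "PUT",     3 },
          [METH_DELETE]  = { "DELETE",  6 },
  };

  static enum meth find_meth(const char *str, size_t len)
  {
          enum meth m;

          for (m = 0; m < METH_OTHER; m++)
                  if (known_methods[m].len == len &&
                      memcmp(known_methods[m].name, str, len) == 0)
                          return m;
          return METH_OTHER;
  }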
2024-01-11 15:10:08 +01:00
Willy Tarreau
19def65228 OPTIM: http: simplify http_get_status_idx() using a hash
This function uses a large switch/case, but the problem is that due to the
numerous holes in the range, the compiler implemented a large jump table.
With a bit of experimentation, some trivial perfect-hash code works, and
since the number of entries is 19, it was enlarged to match the nearest
next power of two to avoid a large modulo operation, filling the holes
with the default return value (HTTP_ERR_500). Jumping to 32 also results
in a lot of valid keys and allows us to pick small values, resulting in
very fast and compact code. The new function, despite keeping a 32-bytes
table, saves slightly more than 800 bytes of code+data compared to the
previous code, and avoids table jumps that affect the CPU's branch history.

Note that another simple hash worked fine and produced exactly 19 codes
(hence no need to pad holes): ((status * 8675725) >> 13) % 19

But it's still about 24 bytes larger in code to save 13 bytes of data
that are aligned anyway, and it was a bit more expensive so that was
definitely not worth it.

The validity of the table was verified with this test code added just after
it:

  __attribute__((constructor)) void http_hash_test(void)
  {
	int i;
	for (i = 0; i <= 600; i++)
		printf("code %d => %d\n", i, http_get_status_idx(i));
	exit(0);
  }

And starting haproxy | grep -vw 14 correctly shows all ordered values
(except 500 of course, which is 14).

In case new codes are added, just play again with dev/phash to
update the table. As long as there are fewer than 32 effective entries
it will remain easy to update without having to modify phash.
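
For reference, here is a sketch of the general approach using the alternative
hash quoted above; the committed version uses different constants and a
32-entry table padded with HTTP_ERR_500, and the table values below are mere
placeholders (the real ones come from dev/phash):

  /* maps a status code to its index; filled offline with dev/phash */
  static const unsigned char status_to_idx[19] = { 0 /* placeholders */ };

  static inline int http_get_status_idx_sketch(unsigned int status)
  {
          return status_to_idx[((status * 8675725U) >> 13) % 19];
  }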
2024-01-11 15:10:08 +01:00
Aurelien DARRAGON
3b0bf5097b MINOR: map: mapfile ordering also matters for tree-based match types
Willy made me realize that tree-based matching may also suffer from
out-of-order mapfile loading, contrary to what is said in
b546bb6d ("BUG/MINOR: map: list-based matching potential ordering
regression") and the associated REGTEST.

Indeed, in case of duplicated keys, we want to be sure that only the key
that was first seen in the file will be returned (as long as it is not
removed). The above fix is still valid, and the list-based match regtest
will also prevent regressions for tree-based match since mapfile loading
logic is currently match-type agnostic.

But let's clarify that by making both the code comment and the regtest
more precise.
2024-01-11 11:13:54 +01:00
Aurelien DARRAGON
b546bb6d67 BUG/MINOR: map: list-based matching potential ordering regression
An unexpected side-effect was introduced by 5fea597 ("MEDIUM: map/acl:
Accelerate several functions using pat_ref_elt struct ->head list")

The above commit tried to use eb tree API to manipulate elements as much
as possible in the hope to accelerate some functions.

Prior to 5fea597, pattern_read_from_file() used to iterate over all
elements from the map file in the same order they were seen in the file
(using list_for_each_entry) to push them in the pattern expression.

Now, since the eb API is used to iterate over elements, the ordering is lost
very early.

This is known to cause behavior changes with existing setups (same conf
and map file) when compared with previous versions for some list-based
matching methods as described in GH #2400. For instance, the map_dom()
converter may return a different matching key from the one that was
returned by older haproxy versions.

For IP or STR matching, matching is based on tree lookups for better
efficiency, so in this case the ordering is lost in the name of
performance. The order in which they are loaded doesn't matter because
tree ordering is based on the content, it is not positional.

But with some other types, matching is based on list lookups (e.g.: dom),
and the order in which elements are pushed into the list can affect the
matching element that will be returned (in case of multiple matches, since
only the first matching element in the list will be returned).

Despite the documentation not officially stating that the file ordering
should be preserved for list-based matching methods, it's probably best
to be conservative here and stick to historical behavior. Moreover, there
was no performance benefit from using the eb tree API to iterate over
elements in pattern_read_from_file() since all elements are visited
anyway.

This should be backported to 2.9.
2024-01-10 18:02:13 +01:00
William Lallemand
0773826645 CLEANUP: ssl: fix indentation in smp_fetch_ssl_fc_ec() (part 2)
Fix indentation in smp_fetch_ssl_fc_ec() since it is using exclusively
spaces.

This should have been in the previous 9a21b4b43 patch but was missed by
accident.

Could be backported if a fix depends on it.
2024-01-09 17:27:31 +01:00
William Lallemand
9a21b4b435 CLEANUP: ssl: fix indentation in smp_fetch_ssl_fc_ec()
Fix indentation in smp_fetch_ssl_fc_ec() since it is using exclusively
spaces.

Could be backported if a fix depends on it.
2024-01-09 11:53:21 +01:00
Mariam John
25da2174c6 MINOR: ssl: Update ssl_fc_curve/ssl_bc_curve to use SSL_get0_group_name
The function `smp_fetch_ssl_fc_ec` gets the curve name used during the key
exchange. It currently uses `SSL_get_negotiated_group`, available
since OpenSSL 3.0, to get the NID and derive the short name of the curve
from it. In OpenSSL 3.2, a new function, `SSL_get0_group_name`, was
added that directly gives the curve name.

The function `smp_fetch_ssl_fc_ec` has been updated to use
`SSL_get0_group_name` when built with OpenSSL >= 3.2, and to keep using the
old `SSL_get_negotiated_group` for versions >= 3.0 and < 3.2. Another
change made is to normalize the return value, so that
`smp_fetch_ssl_fc_ec` returns the curve name in uppercase
(`SSL_get0_group_name` returns the curve name in lowercase while
`SSL_get_negotiated_group` + `OBJ_nid2sn` returns it in uppercase).
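
A rough sketch of the version-conditional lookup described above (error
handling and the uppercase normalization performed by the real
smp_fetch_ssl_fc_ec() are omitted here):

  #include <openssl/objects.h>
  #include <openssl/ssl.h>

  static const char *get_negotiated_curve_name(SSL *ssl)
  {
  #if (OPENSSL_VERSION_NUMBER >= 0x30200000L)
          /* OpenSSL >= 3.2: direct accessor, returns a lowercase name */
          return SSL_get0_group_name(ssl);
  #elif (OPENSSL_VERSION_NUMBER >= 0x30000000L)
          /* OpenSSL 3.0 / 3.1: derive the short name from the NID */
          int nid = SSL_get_negotiated_group(ssl);
          return nid ? OBJ_nid2sn(nid) : NULL;
  #else
          return NULL;
  #endif
  }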

Can be backported to 2.8.
2024-01-09 11:53:21 +01:00
Willy Tarreau
2b930aa7c3 [RELEASE] Released version 3.0-dev1
Released version 3.0-dev1 with the following main changes :
    - MINOR: channel: Use dedicated functions to deal with STREAMER flags
    - MEDIUM: applet: Handle channel's STREAMER flags on applets size
    - MINOR: applets: Use channel's field to compute amount of data received
    - MEDIUM: cache: Save body size of cached objects and track it on delivery
    - MEDIUM: cache: Add support for endp-to-endp fast-forwarding
    - MINOR: cache: Add global option to enable/disable zero-copy forwarding
    - MINOR: pattern: Use reference name as filename to read patterns from a file
    - MEDIUM: pattern: Add support for virtual and optional files for patterns
    - DOC: config: Add section about name format for maps and ACLs
    - DOC: management/lua: Update commands about map and acl
    - MINOR: promex: Add support for specialized front/back/li/srv metric names
    - MINOR: promex: Export active/backup metrics per-server
    - BUG/MINOR: ssl: Double free of OCSP Certificate ID
    - MINOR: ssl/cli: Add ha_(warning|alert) msgs to CLI ckch callback
    - BUG/MINOR: ssl: Wrong OCSP CID after modifying an SSL certficate
    - BUG/MINOR: lua: Wrong OCSP CID after modifying an SSL certficate (LUA)
    - DOC: configuration: typo req.ssl_hello_type
    - MINOR: hq-interop: add fastfwd support
    - CLEANUP: mux_quic: rename ffwd function with prefix qmux_strm_
    - MINOR: mux-quic: add traces for 0-copy/fast-forward
    - BUG/MINOR: mworker/cli: fix set severity-output support
    - CLEANUP: mworker/cli: add comments about pcli_find_and_exec_kw()
    - BUG/MEDIUM: quic: Possible buffer overflow when building TLS records
    - BUILD: ssl: update types in wolfssl cert selection callback
    - MINOR: ssl: activate the certificate selection callback for WolfSSL
    - CI: github: switch to wolfssl git-c4b77ad for new PR
    - BUG/MEDIUM: map/acl: pat_ref_{set,delete}_by_id regressions
    - BUG/MINOR: ext-check: cannot use without preserve-env
    - CLEANUP: mux-quic: remove unused prototype
    - MINOR: mux-quic: clean up qcs Rx buffer allocation API
    - MINOR: mux-quic: clean up qcs Tx buffer allocation API
    - CLEANUP: mux-quic: clean up app ops callback definitions
    - MINOR: mux-quic: factorize QC_SF_UNKNOWN_PL_LENGTH set
    - MINOR: h3: complete traces for sending
    - MINOR: h3: adjust zero-copy sending related code
    - MINOR: hq-interop: use zero-copy to transfer single HTX data block
    - BUG/MEDIUM: quic: QUIC CID removed from tree without locking
    - BUG/MEDIUM: stconn: Block zero-copy forwarding if EOS/ERROR on consumer side
    - BUG/MEDIUM: mux-h1: Cound data from input buf during zero-copy forwarding
    - BUG/MEDIUM: mux-h1: Explicitly skip request's C-L header if not set originally
    - CLEANUP: mux-h1: Fix a trace message about C-L header addition
    - BUG/MEDIUM: mux-h2: Report too large HEADERS frame only when rxbuf is empty
    - BUG/MEDIUM: mux-quic: report early error on stream
    - DOC: config: add arguments to sample fetch methods in the table
    - DOC: config: also add arguments to the converters in the table
    - BUG/MINOR: resolvers: default resolvers fails when network not configured
    - SCRIPTS: mk-patch-list: produce a list of patches
    - DEV: patchbot: add the AI-based bot to pre-select candidate patches to backport
    - BUG/MEDIUM: mux-h2: Switch pending error to error if demux buffer is empty
    - BUG/MEDIUM: mux-h2: Only Report H2C error on read error if demux buffer is empty
    - BUG/MEDIUM: mux-h2: Don't report error on SE if error is only pending on H2C
    - BUG/MEDIUM: mux-h2: Don't report error on SE for closed H2 streams
    - DOC: config: Update documentation about local haproxy response
    - DEV: patchbot: use checked buttons as reference instead of internal table
    - DEV: patchbot: allow to show/hide backported patches
    - MINOR: h3: remove quic_conn only reference
    - BUG/MINOR: server: Use the configured address family for the initial resolution
    - MINOR: mux-quic: remove qcc_shutdown() from qcc_release()
    - MINOR: mux-quic: use qcc_release in case of init failure
    - MINOR: mux-quic: adjust error code in init failure
    - MINOR: h3: add traces for connection init stage
    - BUG/MINOR: h3: properly handle alloc failure on finalize
    - MINOR: h3: use INTERNAL_ERROR code for init failure
    - BUG/MAJOR: stconn: Disable zero-copy forwarding if consumer is shut or in error
    - MINOR: stats: store the parent proxy in stats ctx (http)
    - BUG/MEDIUM: stats: unhandled switching rules with TCP frontend
    - MEDIUM: proxy: set PR_O_HTTP_UPG on implicit upgrades
    - MINOR: proxy: monitor-uri works with tcp->http upgrades
    - OPTIM: server: eb lookup for server_find_by_name()
    - OPTIM: server: ebtree lookups for findserver_unique_* functions
    - MINOR: server/event_hdl: add server_inetaddr struct to facilitate event data usage
    - MINOR: server/event_hdl: update _srv_event_hdl_prepare_inetaddr prototype
    - BUG/MINOR: server/event_hdl: propagate map port info through inetaddr event
    - MINOR: server: ensure connection cleanup on server addr changes
    - CLEANUP: server/event_hdl: remove purge_conn hint in INETADDR event
    - MEDIUM: server: merge srv_update_addr() and srv_update_addr_port() logic
    - CLEANUP: server: remove unused server_parse_addr_change_request() function
    - CLEANUP: resolvers: remove duplicate func prototype
    - MINOR: resolvers: add unique numeric id to nameservers
    - MEDIUM: server: make server_set_inetaddr() updater serializable
    - MINOR: server/event_hdl: expose updater info through INETADDR event
    - MINOR: server: add dns hint in server_inetaddr_updater struct
    - MEDIUM: server/dns: clear RMAINT when addr resolves again
    - BUG/MINOR: server/dns: use server_set_inetaddr() to unset srv addr from DNS
    - BUG/MEDIUM: server/dns: perform svc_port updates atomically from SRV records
    - MEDIUM: peers: use server as stream target
    - CLEANUP: peers: remove unused sock_init_arg struct member
    - CLEANUP: peers: remove unused "proto" and "xprt" struct members
    - MINOR: peers: rely on srv->addr and remove peer->addr
    - DOC: config: add context hint for server keywords
    - MINOR: stktable: add table_process_entry helper function
    - MINOR: stktable: use {show,set,clear} table with ptr
    - MINOR: map: add map_*_key converters to provide the matching key
    - DOC: fix typo for fastfwd QUIC option
    - BUG/MINOR: mux-quic: always report error to SC on RESET_STREAM emission
    - MEDIUM: mux-quic: add BUG_ON if sending on locally closed QCS
    - BUG/MINOR: mux-quic: disable fast-fwd if connection on error
    - BUG/MINOR: quic: Wrong keylog callback setting.
    - BUG/MINOR: quic: Missing call to TLS message callbacks
    - MINOR: h3: check connection error during sending
    - BUG/MINOR: h3: close connection on header list too big
    - BUG/MINOR: h3: close connection on sending alloc errors
    - BUG/MINOR: h3: disable fast-forward on buffer alloc failure
    - Revert "MINOR: mux-quic: Disable zero-copy forwarding for send by default"
    - MINOR: stktable: stktable_data_ptr() cannot fail in table_process_entry()
    - CLEANUP: assorted typo fixes in the code and comments
    - CI: use semantic version compare for determing "latest" OpenSSL
    - CLEANUP: server: remove ambiguous check in srv_update_addr_port()
    - CLEANUP: resolvers: remove unused RSLV_UPD_OBSOLETE_IP flag
    - CLEANUP: resolvers: remove some more unused RSLV_UDP flags
    - MEDIUM: server: simplify snr_set_srv_down() to prevent confusions
    - MINOR: backend: export get_server_*() functions
    - MINOR: tcpcheck: export proxy_parse_tcpcheck()
    - MEDIUM: udp: allow to retrieve the frontend destination address
    - MINOR: global: export a way to list build options
    - MINOR: debug: add features and build options to "show dev"
    - BUG/MINOR: server: fix server_find_by_name() usage during parsing
    - REGTESTS: check attach-srv out of order declaration
    - CLEANUP: quic: Remaining useless code into server part
    - BUILD: quic: Missing quic_ssl.h header protection
    - BUG/MEDIUM: h3: fix incorrect snd_buf return value
    - MINOR: h3: do not consider missing buf room as error on trailers
    - BUG/MEDIUM: stconn: Forward shutdown on write timeout only if it is forwardable
    - BUG/MEDIUM: stconn: Set fsb date if zero-copy forwarding is blocked during nego
    - BUG/MEDIUM: spoe: Never create new spoe applet if there is no server up
    - MINOR: mux-h2: support limiting the total number of H2 streams per connection
    - CLEANUP: mux-h2: remove the printfs from previous commit on h2 streams limit.
    - DEV: h2: add the ability to emit literals in mkhdr
    - DEV: h2: add the preface as well in supported output types
    - DEV: h2: support passing raw data for a frame
    - IMPORT: ebtree: implement and use flsnz_long() to count bits
    - IMPORT: ebtree: switch the sizes and offsets to size_t and ssize_t
    - IMPORT: ebtree: rework the fls macros to better deal with arch-specific ones
    - IMPORT: ebtree: make string_equal_bits turn back to unsigned char
    - IMPORT: ebtree: use unsigned ints for flznz()
    - IMPORT: ebtree: make string_equal_bits() return an unsigned
2024-01-06 14:09:35 +01:00
Willy Tarreau
e19334a343 CLEANUP: mux-h2: remove the printfs from previous commit on h2 streams limit.
After thinking about them all the time, at the end I managed to remove
them while editing the commit and forgot to push the result :-(
2024-01-05 19:19:10 +01:00
Willy Tarreau
983ac4397d MINOR: mux-h2: support limiting the total number of H2 streams per connection
This patch introduces a new setting: tune.h2.fe.max-total-streams. It
sets the HTTP/2 maximum number of total streams processed per incoming
connection. Once this limit is reached, HAProxy will send a graceful GOAWAY
frame informing the client that it will close the connection after all
pending streams have been closed. In practice, clients tend to close as fast
as possible when receiving this, and to establish a new connection for next
requests. Doing this is sometimes useful and desired in situations where
clients stay connected for a very long time and cause some imbalance inside a
farm. For example, in some highly dynamic environments, it is possible that
new load balancers are instantiated on the fly to adapt to a load increase,
and that once the load goes down they should be stopped without breaking
established connections. By setting a limit here, the connections will have
a limited lifetime and will be frequently renewed, with some possibly being
established to other nodes, so that existing resources are quickly released.

The default value is zero, which enforces no limit beyond those implied by
the protocol (2^30 ~= 1.07 billion). Values around 1000 were found to
already cause frequent enough connection renewal without causing any
perceptible latency to most clients. One notable exception here is h2load
which reports errors for all requests that were expected to be sent over
a given connection after it receives a GOAWAY. This is an already known
limitation: https://github.com/nghttp2/nghttp2/issues/981

The patch was made in two parts inside h2_frt_handle_headers():
  - the first one, at the end of the function, which verifies if the
    configured limit was reached and if it's needed to emit a GOAWAY ;

  - the second, just before decoding the stream frame, which verifies if
    a previously configured limit was ignored by the client, and closes
    the connection if this happens. Indeed, one reason for a connection
    to stay alive for too long definitely comes from a stupid bot that
    periodically fetches the same resource, scans lots of URLs or tries
    to brute-force something. These ones are more likely to just ignore
    the last stream ID advertised in GOAWAY than a regular browser, or
    a well-behaving client such as curl which respects it. So in order
    to make sure we can close the connection we need to enforce the
    advertised limit.

Note that a regular client will not face a problem with that because in
the worst case it will have max_concurrent_streams in flight and this
limit is taken into account when calculating the advertised last
acceptable stream ID.

Just a note: it may also be possible to move the first part above to
h2s_frt_stream_new() instead so that it's not processed for trailers,
though it doesn't seem to be more interesting, first because it has
two return points.

This is something that may be backported to 2.9 and 2.8 to offer more
control to those dealing with dynamic infrastructures, especially since
for now we cannot force a connection to be cleanly closed using rules
(e.g. github issues #946, #2146).
2024-01-05 18:49:11 +01:00
Christopher Faulet
72c23bd4cd BUG/MEDIUM: spoe: Never create new spoe applet if there is no server up
This test was already performed when a new message is queued into the
sending queue. However, it was not performed when the last applet is
released, in spoe_release_appctx(). It is a quite old bug. It was introduced
by commit 6f1296b5c7 ("BUG/MEDIUM: spoe: Create a SPOE applet if necessary
when the last one is released").

Because of this bug, new SPOE applets may be created and quickly released
because there is no server up, in a loop, as long as there is at least one
message in the sending queue, consuming all the CPU. It is pretty visible if
the processing timeout is high.

To fix the bug, the conditions to create or not a SPOE applet are now
centralized in spoe_create_appctx(). The tests on the max connections per
second and on the number of active servers are moved into this function.

This patch must be backported to all stable versions.
2024-01-05 17:28:50 +01:00
Christopher Faulet
7eb7ae2835 BUG/MEDIUM: stconn: Forward shutdown on write timeout only if it is forwardable
The commit b9c87f8082 ("BUG/MEDIUM: stconn/stream: Forward shutdown on write
timeout") introduced a regression. In sc_cond_forward_shut(), the write
timeout is considered too early to forward the shutdown. In fact, it is
always considered, even if the shutdown is not forwardable yet. It is of
course unexpected. It is especially an issue when a write timeout is
encountered on server side during the connection establishment. In this
case, if shutdown is forwarded too early on the client side, the connection
is closed before the 503 error sending.

So the write timeout must indeed be considered to forward the shutdown to
the underlying layer, but only if the shutdown is forwardable. Otherwise, we
should do nothing.

This patch should fix the issue #2404. It must be backported as far as 2.2.
2024-01-05 17:28:06 +01:00
Amaury Denoyelle
8df47442d2 MINOR: h3: do not consider missing buf room as error on trailers
Improve the h3_resp_trailers_send() return value to be similar to
h3_resp_data_send(). In particular, if the QCS Tx buffer does not have
enough space for trailer encoding, 0 is returned instead of an error value,
with QC_SF_BLK_MROOM set.

This unifies the HTTP/3 headers/data/trailers encoding functions. Negative
error codes are limited to fatal errors which should cause a connection
closure. Not enough output buffer space is only a transient condition
which is reflected by the QC_SF_BLK_MROOM flag.
2024-01-04 15:37:49 +01:00
Amaury Denoyelle
14673fe54d BUG/MEDIUM: h3: fix incorrect snd_buf return value
h3_resp_data_send() is used to transcode HTX data into H3 data frames.
If QCS Tx buffer is not aligned when first invoked, two separate frames
may be built, first until buffer end, then with remaining space in
front.

If buffer space is not enough for at least the H3 frame header, -1 is
returned with the flag QC_SF_BLK_MROOM set to wait for more room. An
issue arises if this occurs for the second frame: -1 is returned even
though HTX data was properly transcoded and removed on the first step.
This causes snd_buf callback to return an incorrect value to the stream
layer, which in the end will corrupt the channel output buffer.

To fix this, stop considering that not enough remaining space is an
error case. Instead, return 0 if this is encountered for the first frame
or the HTX removed block size for the second one. As QC_SF_BLK_MROOM is
set, this will correctly interrupt H3 encoding. The err label is thus
properly limited to fatal errors which should cause a connection closure.
A new BUG_ON() has been added which should prevent similar issues in the
future.

This issue was detected using the following client :
 $ ngtcp2-client --no-quic-dump --no-http-dump --exit-on-all-streams-close \
   127.0.0.1 20443 -n2 "http://127.0.0.1:20443/?s=50k"

This triggers the following CHECK_IF statement. Note that it may be
necessary to disable fast forwarding to enforce snd_buf usage.

Thread 1 "haproxy" received signal SIGILL, Illegal instruction.
0x00005555558bc48a in co_data (c=0x5555561ed428) at include/haproxy/channel.h:130
130             CHECK_IF_HOT(c->output > c_data(c));
[ ## gdb ## ] bt
 #0  0x00005555558bc48a in co_data (c=0x5555561ed428) at include/haproxy/channel.h:130
 #1  0x00005555558c1d69 in sc_conn_send (sc=0x5555561f92d0) at src/stconn.c:1637
 #2  0x00005555558c2683 in sc_conn_io_cb (t=0x5555561f7f10, ctx=0x5555561f92d0, state=32832) at src/stconn.c:1824
 #3  0x000055555590c48f in run_tasks_from_lists (budgets=0x7fffffffdaa0) at src/task.c:596
 #4  0x000055555590cf88 in process_runnable_tasks () at src/task.c:876
 #5  0x00005555558aae3b in run_poll_loop () at src/haproxy.c:3049
 #6  0x00005555558ab57e in run_thread_poll_loop (data=0x555555d9fa00 <ha_thread_info>) at src/haproxy.c:3251
 #7  0x00005555558ad053 in main (argc=6, argv=0x7fffffffddd8) at src/haproxy.c:3948

In case CHECK_IF is not activated, it may cause crashes or incorrect
transfers.

This was introduced by the following commit
  commit 2144d24186
  BUG/MINOR: h3: close connection on sending alloc errors

This must be backported wherever the above patch is.
2024-01-04 15:36:58 +01:00
Frédéric Lécaille
860028db47 CLEANUP: quic: Remaining useless code into server part
Remove some QUIC definitions of members from the server structure as the
haproxy QUIC stack does not support the server part (QUIC client) at all at
this time. Remove the statements related to their initialization.

This patch should be backported as far as 2.6 to save memory.
2024-01-04 11:16:06 +01:00
Amaury Denoyelle
b4db3be86e BUG/MINOR: server: fix server_find_by_name() usage during parsing
Since the commit below, server_find_by_name() now searches using the
proxy backend 'used_server_id' tree:
  4bcfe30414
  OPTIM: server: eb lookup for server_find_by_name()

This introduces a regression if server_find_by_name() is used via
check_config_validity() during post-parsing. Indeed, the used_server_id tree
is populated at the same stage, so it's possible to not find an existing
server. This can cause incorrect rejection of previously valid
configuration files.

To fix this, servers are now inserted in used_server_id tree during
parsing via parse_server(). This guarantees that server instances can be
retrieved during post parsing.

A known feature which uses server_find_by_name() during post parsing is
the attach-srv tcp-rule used for reverse HTTP. Prior to the current fix, a
config was wrongly rejected if the rule was declared before the server
line.

This should not be backported unless the mentioned commit is.
2024-01-02 15:52:47 +01:00
Willy Tarreau
9d869b10de MINOR: debug: add features and build options to "show dev"
The "show dev" CLI command is still missing useful elements such as the
build options, SSL version etc. Let's just add the build features and
the build options there so that it's possible to collect all of this
from a running process without having to start the executable with -vv.

This is still dumped all at once from the parsing function since the
output is small. If it were to grow, this would possibly need to be
reworked to support a context.

It might be helpful to backport this to 2.9 since it can help narrow
down certain issues.
2024-01-02 11:44:42 +01:00
Willy Tarreau
afba58f21e MINOR: global: export a way to list build options
The new function hap_get_next_build_opt() will iterate over the list of
build options. This will be used for debugging, so that the build options
can be retrieved from the CLI.
2024-01-02 11:44:42 +01:00
Dragan Dosen
96c1a61136 MEDIUM: udp: allow to retrieve the frontend destination address
A new flag RX_F_PASS_PKTINFO is now available, whose purpose is to mark
that the destination address is about to be retrieved on some listeners.

The address can be retrieved from the first received datagram, and
relies on IP_PKTINFO, IP_RECVDSTADDR and IPV6_RECVPKTINFO support.
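
For readers unfamiliar with the mechanism, here is a generic Linux sketch
(plain socket code, not haproxy's implementation) showing how IP_PKTINFO
exposes the destination address of a received datagram:

  #define _GNU_SOURCE
  #include <netinet/in.h>
  #include <string.h>
  #include <sys/socket.h>

  /* one-time setup after the UDP socket is created */
  static int enable_pktinfo(int fd)
  {
          int one = 1;
          return setsockopt(fd, IPPROTO_IP, IP_PKTINFO, &one, sizeof(one));
  }

  /* receive a datagram and report the address it was sent to (IPv4 only) */
  static ssize_t recv_with_dst(int fd, void *buf, size_t len, struct in_addr *dst)
  {
          char cbuf[CMSG_SPACE(sizeof(struct in_pktinfo))];
          struct iovec iov = { .iov_base = buf, .iov_len = len };
          struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1,
                                .msg_control = cbuf, .msg_controllen = sizeof(cbuf) };
          struct cmsghdr *cmsg;
          ssize_t ret = recvmsg(fd, &msg, 0);

          if (ret < 0)
                  return ret;
          for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg))
                  if (cmsg->cmsg_level == IPPROTO_IP && cmsg->cmsg_type == IP_PKTINFO) {
                          struct in_pktinfo pi;
                          memcpy(&pi, CMSG_DATA(cmsg), sizeof(pi));
                          *dst = pi.ipi_addr; /* frontend destination address */
                  }
          return ret;
  }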
2024-01-02 11:44:42 +01:00
Dragan Dosen
1582ccf9d3 MINOR: tcpcheck: export proxy_parse_tcpcheck()
Export proxy_parse_tcpcheck() in tcpcheck.h
2024-01-02 11:44:42 +01:00
Dragan Dosen
5b1609f9da MINOR: backend: export get_server_*() functions
This is in preparation for exposing more of the LB internals.
2024-01-02 11:44:42 +01:00
Aurelien DARRAGON
bdecff511c MEDIUM: server: simplify snr_set_srv_down() to prevent confusions
snr_set_srv_down() (formerly known as snr_update_srv_status()) is
still too ambiguous because it's not clear whether we will be putting
the server under maintenance or not. This is mainly due to the fact that
the function behaves differently depending on whether has_no_ip is set.

By reviewing the function callers, it has now become clear that
snr_resolution_cb() is always calling the function with a valid resolution
so we only want to put the server under maintenance if we don't have a
valid IP address. On the other hand snr_resolution_error_cb() always
calls the function on error, with either no resolution (for SRV requests)
or with failing resolution (all cases except RSLV_STATUS_VALID), so in
this case we decide whether to put the server under maintenance case by
case (ie: expired? timeout?)

As a result, let's simplify snr_set_srv_down() so that it is only called
when the caller really thinks that the server should be put under
maintenance, which means always for snr_resolution_error_cb(), and only
if the resolution didn't yield a usable IP for snr_resolution_cb().
2024-01-02 10:29:50 +01:00
Aurelien DARRAGON
689784ed91 CLEANUP: resolvers: remove some more unused RSLV_UDP flags
RSLV_UPD_CNAME and RSLV_UPD_NAME_ERROR flags have now become useless since
3cf7f987 ("MINOR: dns: proper domain name validation when receiving DNS
response") as they are never set, but we forgot to remove them.
2024-01-02 10:29:41 +01:00
Aurelien DARRAGON
3ebe7bef8d CLEANUP: server: remove ambiguous check in srv_update_addr_port()
A leftover check was left behind by the recent patch series about server
addr:svc_port propagation: a check on (msg) being set was performed
in srv_update_addr_port(), but msg is always set, so the check is not
needed and confuses Coverity (see GH #2399).
2024-01-02 10:29:24 +01:00
Ilya Shipitsin
8705e45964 CLEANUP: assorted typo fixes in the code and comments
This is the 38th iteration of typo fixes.
2024-01-02 10:19:48 +01:00
Aurelien DARRAGON
41b7193e3c MINOR: stktable: stktable_data_ptr() cannot fail in table_process_entry()
In table_process_entry(), stktable_data_ptr() result is dereferenced
without checking if it's NULL first, which may happen when bad inputs
are provided to the function.

However, data_type and ts arguments were already checked prior to calling
the function, so we know for sure that stktable_data_ptr() will never
return NULL in this case.

Still, some static code analyzers such as Coverity are confused
because they think that the result might possibly be NULL
(see GH #2398).

To make it explicit that we always provide good inputs and expect valid
result, let's switch to the __stktable_data_ptr() unsafe function.
2024-01-02 08:51:51 +01:00
Amaury Denoyelle
b7274e69ef Revert "MINOR: mux-quic: Disable zero-copy forwarding for send by default"
This reverts commit 18f2ccd244.

The issues found related to QUIC fast-forward were resolved (see github
issue #2372). Reenable it by default. If any issue arises, it can be
disabled using the global statement:
  tune.quic.zero-copy-fwd-send off

This can be backported to 2.9, but only after a sensible period of
observation.
2023-12-22 16:30:37 +01:00
Amaury Denoyelle
cfa6d4cdd0 BUG/MINOR: h3: disable fast-forward on buffer alloc failure
If the QCS Tx buffer cannot be allocated in the nego_ff callback, disable
fast-forward for this connection and return immediately. If snd_buf is
later used but still no buffer can be allocated, the
connection will be closed on error.

This should fix the Coverity report from github issue #2390.

This should be backported up to 2.9.
2023-12-22 16:14:23 +01:00
Amaury Denoyelle
2144d24186 BUG/MINOR: h3: close connection on sending alloc errors
When encoding new HTTP/3 frames, QCS Tx buffer must be allocated if
currently NULL. Previously, allocation failure was not properly checked,
leaving the connection in an unspecified state, or worse risking a
crash.

Fix this by setting <h3c.err> to H3_INTERNAL_ERROR each time the
allocation fails. This will stop sending and close the connection. In
the future, it may be better to put the connection on pause waiting for
allocation to succeed but this is too complicated to implement for now
in a reliable way.

Along with the current change, the return values of all HTX parsing
functions (h3_resp_*_send) were set to a negative value in case of error. A
new BUG_ON() in h3_snd_buf() ensures that if such a value is returned,
either a connection error is registered (via <h3c.err>) or the buffer is
temporarily full (flag QC_SF_BLK_MROOM).

This should fix github issue #2389.

This should be backported up to 2.6. Note that qcc_get_stream_txbuf()
does not exist in 2.9 and below. mux_get_buf() is its equivalent. An
explicit check b_is_null(&qcs.tx.buf) should be used there.
2023-12-22 16:02:49 +01:00
Amaury Denoyelle
d077f7ccf4 BUG/MINOR: h3: close connection on header list too big
When parsing a HTX response, if too many headers are present, stop
sending and close the connection with error code H3_INTERNAL_ERROR.

Previously, no error was reported despite the interruption of header
parsing. This caused an infinite loop. However, this is considered
minor as it happens on the response path from the backend side.

This should be backported up to 2.6.
It relies on previous commit
  "MINOR: h3: check connection error during sending".
2023-12-22 15:43:39 +01:00
Amaury Denoyelle
642016ce03 MINOR: h3: check connection error during sending
If an error occurs during HTX to H3 encoding, h3_snd_buf() should be
interrupted. This commit adds this possibility by checking the <h3c.err>
member value. If it is non-null, the sending loop is stopped and an error is
reported using qcc_set_error().

This commit does not change any behavior for now, as <h3c.err> is never
set during sending. However, this will change in future commits, most
notably to reject too many headers or handle buffer allocation failure.
As such, this commit should be backported along the following fixes.
Note that in 2.6 qcc_set_error() does not exist and must be replaced by
qcc_emit_cc_app().
2023-12-22 15:40:11 +01:00
Frédéric Lécaille
10e96fcd17 BUG/MINOR: quic: Missing call to TLS message callbacks
This bug impacts only the QUIC OpenSSL compatibility module (USE_QUIC_OPENSSL_COMPAT).

The TLS capture of information from the client hello enabled by
tune.ssl.capture-buffer-size could not work with USE_QUIC_OPENSSL_COMPAT. This
is due to the fact that the callback set for this feature was replaced by
quic_tls_compat_msg_callback(). In fact this callback must be registered by
ssl_sock_register_msg_callback(), as is done for the TLS client hello capture.
A call to this function appends the function passed as parameter to a list of
callbacks to be called when the TLS stack parses a TLS message.
quic_tls_compat_msg_callback() had to be modified to return if it is called
for a non-QUIC TLS session.

Must be backported to 2.8.
2023-12-21 16:33:06 +01:00
Frédéric Lécaille
b26f6fb0cb BUG/MINOR: quic: Wrong keylog callback setting.
This bug impacts only the QUIC OpenSSL compatibility module (USE_QUIC_OPENSSL_COMPAT).

To make this module work, the quic_tls_compat_keylog_callback() function must
be set as the keylog callback, or at least be called by another keylog
callback. This is what SSL_CTX_keylog() was supposed to do. In addition to
exporting the TLS secrets via sample fetches, the latter also calls
quic_tls_compat_keylog_callback() when compiled with USE_QUIC_OPENSSL_COMPAT
defined.

Before this patch, SSL_CTX_keylog() was replaced by
quic_tls_compat_keylog_callback() and the TLS secrets were no longer exported
by sample fetches.
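
As a rough sketch of the chaining pattern described here (the wrapper names
below are hypothetical, and quic_tls_compat_keylog_callback()'s signature is
assumed from its use as a keylog callback):

  #include <openssl/ssl.h>

  /* provided by the QUIC OpenSSL compatibility module */
  void quic_tls_compat_keylog_callback(const SSL *ssl, const char *line);

  static void keylog_capture(const SSL *ssl, const char *line)
  {
          /* ... store <line> so the keylog sample fetches can expose it ... */
  #ifdef USE_QUIC_OPENSSL_COMPAT
          quic_tls_compat_keylog_callback(ssl, line);
  #endif
  }

  static void setup_keylog(SSL_CTX *ctx)
  {
          /* only one keylog callback slot exists per SSL_CTX, hence the need
           * to chain rather than replace
           */
          SSL_CTX_set_keylog_callback(ctx, keylog_capture);
  }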

Must be backported to 2.8.
2023-12-21 16:26:31 +01:00
Amaury Denoyelle
19f4f4d890 BUG/MINOR: mux-quic: disable fast-fwd if connection on error
Add a check on nego_ff to ensure the connection is not on error. If this is
the case, fast-forward is disabled to prevent unnecessary sending. If
snd_buf is later called, the stconn will be notified of the error to
interrupt the stream.

This check is necessary to ensure snd_buf and nego_ff are consistent.
Note that previously, if fast-forward was conducted even on connection
error, no sending would occur as qcc_io_send() also checks these flags.
However, there is a risk that the stconn is never notified of the error
status, thus it is considered a bug.

Its impact is minimal for now as fast-forward is disabled by default on
QUIC. By fixing it, it should be possible to reactivate it soon.

This should be backported up to 2.9.
2023-12-21 15:42:08 +01:00
Amaury Denoyelle
235e8f1afd MEDIUM: mux-quic: add BUG_ON if sending on locally closed QCS
Previously, if a snd_buf operation was conducted despite the QCS already
being locally closed, the input buffer was silently dropped. This situation
could happen if a RESET_STREAM was emitted but its emission was not reported
to the stream layer. Silently resetting the buffer ensured the QUIC MUX
remained compliant with RFC 9000, which forbids emission after RESET_STREAM.

Since previous commit, it is now ensured that RESET_STREAM sending will
always be reported to stream-layer. Thus, there is no need anymore to
silently reset the buffer. A BUG_ON() statement is added to ensure this
assumption will remain valid.

The new code is deemed cleaner as it does not hide a missing error
notification on the stconn-layer. Previously, if an error was missing,
sending would continue unnecessarily with a false success status
reported for the stream.

Note that the BUG_ON() statement was also added into nego_ff callback.
This is necessary to ensure both sending path remains consistent.

This patch is labelled as MEDIUM as issues were already encountered in
the snd_buf/nego_ff implementation and it's not easy to cover all
occurrences during tests. If the BUG_ON() is triggered without any apparent
stream-layer issue, this commit should be reverted.
2023-12-21 15:42:08 +01:00
Amaury Denoyelle
0a69750a98 BUG/MINOR: mux-quic: always report error to SC on RESET_STREAM emission
On RESET_STREAM emission, the stream Tx channel is closed. This event
must be reported to stream-conn layer to interrupt future send
operations.

Previously, se_fl_set_error() was manually invoked before/after
qcc_reset_stream(). Change this by moving the se_fl_set_error() invocation
into the latter. This ensures that the notification won't be forgotten, most
notably in the HTTP/3 layer.

In most cases, behavior should be identical as both functions were
called together unless not necessary. However, there is one exception
which could cause a RESET_STREAM emission without error notification :
this happens on H3 trailers parsing error. All other H3 errors happen
before the stream-layer creation and thus the error is notified on
stream creation. This regression has been caused by the following patch :

  152beeec34
  MINOR: mux-quic: report error on stream-endpoint earlier

Thus it should be backported up to 2.7.

Note that the case described above did not cause any crash or protocol
error. This is because currently MUX QUIC snd_buf operation silently
reset buffer on transmission if QCS is already closed locally. This will
however be removed in a future commit so the current patch is necessary
to prevent an invalid behavior.
2023-12-21 15:42:08 +01:00
Aurelien DARRAGON
ca47583787 MINOR: map: add map_*_key converters to provide the matching key
All map_*_ converters now have an additional output type: key. Such
converters will return the matched entry's key (as found in the map file)
as a string instead of the value.

Consider this example map file:
 |example.com value1
 |haproxy value2

With the above map file:

str(test.example.com/url),map_dom_key(file.map) will return "example.com"
str(running haproxy),map_sub_key(file.map) will return "haproxy"

This should address GH #1446.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
9b2717e7bb MINOR: stktable: use {show,set,clear} table with ptr
This patch adds support for an optional ptr (0xffff form) instead of the key
argument to match against existing sticktable entries, i.e. if the key is
empty or cannot be matched on the cli due to incompatible characters.
Lookup is performed using a linear search, so it will be slower than the key
search which relies on an eb tree lookup.

Example:

set table mytable key mykey data.gpc0 1

show table mytable
> 0x7fbd00032bd8: key=mykey use=0 exp=86373242 shard=0 gpc0=1

clear table mytable ptr 0x7fbd00032bd8

This patch depends on:
 - "MINOR: stktable: add table_process_entry helper function"

It should solve GH #2118
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
6ee3923c52 MINOR: stktable: add table_process_entry helper function
Only keep the key-related logic in the table_process_entry_per_key()
function, and then use the table_process_entry() function, which takes an
entry pointer as argument, to process the entry.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
f6ae25858d MINOR: peers: rely on srv->addr and remove peer->addr
Similarly to the previous commit, we get rid of unused peer member.

peer->addr was only used to save a copy of the server's addr at parsing
time. But instead of relying on an intermediate variable, we can actually
use the server's address directly when initiating the peer session.

As with other streams created from server's settings (tcp/http, log, ring),
we should rely on srv->svc_port for the port part of the address. This
shouldn't change anything for peers since the address is fully resolved
at parsing time and runtime changes are not supported, but this should
help to make the code future-proof.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
372d3e2934 CLEANUP: peers: remove unused "proto" and "xprt" struct members
peer->proto and peer->xprt struct members are now pure legacy: they are
only set during parsing but never used afterwards.

This is due to commit 02efedac ("MINOR: peers: now remove the remote
connection setup code") which made some cleanup in the past, but the
unused proto and xprt members were probably left behind by mistake.

Since we don't have valid uses for them, we remove them.

Also, peer_xprt() helper function was removed since it was related to
peer->xprt struct member.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
334caefaaa CLEANUP: peers: remove unused sock_init_arg struct member
Since be0688c6 ("MEDIUM: stream_interface: remove the si->init"),
sock_init_arg is completely useless (set but never used later), thus
we remove it.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
c5cace3100 BUG/MEDIUM: server/dns: perform svc_port updates atomically from SRV records
This was the last missing bit from cd994407a ("BUG/MAJOR: server/addr:
fix a race during server addr:svc_port updates")

Indeed, despite the fix, svc_port updates from resolvers were still
directly performed on the server's struct.

Now they make proper use of the server_set_inetaddr() function so the port
change (+ optional addr change with AR) will be propagated atomically.

This patch depends on:
 - "MINOR: server: ensure connection cleanup on server addr changes"
 - "CLEANUP: server/event_hdl: remove purge_conn hint in INETADDR event"
 - "MEDIUM: server: merge srv_update_addr() and srv_update_addr_port() logic"
 - "MEDIUM: server: make server_set_inetaddr() updater serializable"
 - "MINOR: server/event_hdl: expose updater info through INETADDR event"
 - "MINOR: server: add dns hint in server_inetaddr_updater struct"
 - "MEDIUM: server/dns: clear RMAINT when addr resolves again"

While it could be backported in 2.9 with cd994407a ("BUG/MAJOR:
server/addr: fix a race during server addr:svc_port updates") to ensure
addr and svc_port updates performed by resolver's code comply with the
API taking care of pushing the update (and thus avoid any race), some
patch dependencies are quite sensitive so it's probably best to avoid
backporting for no good reason, or at least wait for it to be considered
stable to prevent any breakages.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
64c9c8ef39 BUG/MINOR: server/dns: use server_set_inetaddr() to unset srv addr from DNS
As seen before, server's addr and svc_port should not be updated directly
during runtime, because even if the update is performed under the lock,
some competing threads might be reading ->addr and ->svc_port without
the lock because they simply cannot afford it.

To prevent races with such competing threads, server's addr and port
should only be updated using server_set_inetaddr() function or similar.

This patch depends on:
 - "MINOR: server: ensure connection cleanup on server addr changes"
 - "CLEANUP: server/event_hdl: remove purge_conn hint in INETADDR event"
 - "MEDIUM: server: merge srv_update_addr() and srv_update_addr_port() logic"
 - "MEDIUM: server: make server_set_inetaddr() updater serializable"
 - "MINOR: server/event_hdl: expose updater info through INETADDR event"
 - "MINOR: server: add dns hint in server_inetaddr_updater struct"
 - "MEDIUM: server/dns: clear RMAINT when addr resolves again"

While it could be backported in 2.9 with cd994407a ("BUG/MAJOR:
server/addr: fix a race during server addr:svc_port updates") to ensure
addr and svc_port reset performed by resolver's code comply with the
API taking care of pushing the update (and thus avoid any race), some
patch dependencies are quite sensitive so it's probably best to avoid
backporting for no good reason, or at least wait for it to be considered
stable to prevent any breakages.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
334ebfa1a2 MEDIUM: server/dns: clear RMAINT when addr resolves again
snr_update_srv_status() and srvrq_update_srv_status() will both set or
clear the server RMAINT state depending on the result of the current dns
resolution.

This used to work pretty well in the past, but now that addr:svc_port
changes are applied atomically through a dedicated task, the change is
performed asynchronously, so this can cause some flapping issues if the
server is put out of maintenance while the server's address is still
unassigned.

To prevent errors, the resolver's code is now only allowed to put the
server under maintenance but not to remove it from maintenance:

the decision to remove a server from maintenance is performed by the task
responsible for updating the server's addr: if the addr resolves again
thanks to a valid DNS resolution and the server was previously under
RMAINT, then it is cleared from the RMAINT state.

srvrq_update_srv_status() was renamed srvrq_set_srv_down(), since it is
only called to put the server in maintenance as a result of a failing
SRV entry.

snr_update_srv_status() was renamed snr_set_srv_down() and slightly
modified so that it only takes care of putting the server under
maintenance when needed.

The cli command "set server x/y addr" does not need to remove the RMAINT
flag anymore.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
33cd676e9e MINOR: server/event_hdl: expose updater info through INETADDR event
Thanks to the previous commit, we can now expose updater info through
INETADDR event.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
3ac79b504a MEDIUM: server: make server_set_inetaddr() updater serializable
The server_set_inetaddr() updater argument is a simple char * string
containing info about the caller responsible for the update.

In this patch, we try to make this argument serializable, that is, make
it so that we can easily export it without having to keep the original
pointer passed by the caller or having to work with strings of variable
lengths.

This was a prerequisite for exposing more updater information through
SERVER_INETADDR event (upcoming patch).

Static strings were simply mapped to a fixed ID that can be converted back
to a string when needed using server_inetaddr_updater_by_to_str(). One
special case was made for the SERVER_INETADDR_UPDATER_DNS_RESOLVER updater
since in this case the updater hint has to be generated from the
corresponding resolver id / nameserver id combination. This was achieved
by saving the nameserver id within the updater struct. Knowing that the
resolver id can be guessed from the server struct directly, it was not
exposed through the updater struct.

This patch depends on:
 - "MINOR: resolvers: add unique numeric id to nameservers"

No functional change should be expected.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
2f6120d6d4 MINOR: resolvers: add unique numeric id to nameservers
When we want to avoid keeping pointers on a nameserver struct, it's not
always convenient to refer to a nameserver using its text-based unique
identifier, since it is not limited in length and thus cannot be serialized
and deserialized safely.

To address this limitation, we add a new ->puid member in dns_nameserver
struct which is a parent-unique numeric value that can be used to refer
to the dns nameserver within its parent resolver context.

To achieve this, we reused the resolver->nb_nameserver member, which was
previously unused. Each time we add a new nameserver to a resolver, we set ns->puid to
the current number of nameservers within the resolver and we increment
this number right away.

Public helper function find_nameserver_by_resolvers_and_id() was added to
help retrieve nameserver pointer from (resolver X nameserver puid)
combination.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
ab6fef4882 CLEANUP: server: remove unused server_parse_addr_change_request() function
server_parse_addr_change_request() was completely replaced by the newer
srv_update_addr_port() function. Considering the function doesn't offer
useful features that srv_update_addr_port() couldn't do, we simply
remove the function.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
f1f4b93a67 MEDIUM: server: merge srv_update_addr() and srv_update_addr_port() logic
Both functions perform similar tasks, except that the _port()
version does a bit more work.

In this patch, we add the server_set_inetaddr() function that works like
the srv_update_addr_port() but it takes parsed inputs instead of raw
strings as arguments.

Then, server_set_inetaddr() is used as underlying helper function for
both srv_update_addr() and srv_update_addr_port() to make them easier
to maintain.

Also, helper functions were added:
 - server_set_inetaddr_warn() -> same as server_set_inetaddr() but reports
   a warning on updates.
 - server_get_inetaddr() -> fills a struct server_inetaddr from srv

Since the feedback message generation part was slightly reworked, some
minor changes in the way addr:svc_port updates are reported in the logs
or cli messages should be expected (no loss of information though).
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
2d0c7f5935 CLEANUP: server/event_hdl: remove purge_conn hint in INETADDR event
Now that the purge_conn hint is ignored thanks to the previous commit,
we can simply get rid of it.
2023-12-21 14:22:27 +01:00
Aurelien DARRAGON
2e3a163e47 MINOR: server: ensure connection cleanup on server addr changes
Previously, in srv_update_addr_port(), we forced connection cleanup on
server changes.
This was done in 6318d33ce ("BUG/MEDIUM: connections: force connections
cleanup on server changes").

However, there is no reason we shouldn't have done the same in
srv_update_addr() function, because the end goal is the same: perform
runtime changes on server's address.

The purge_conn hint propagated through the INETADDR server event was
simply there to keep the original behavior (only purge the connection
for events originating from srv_update_addr_port()), but to ensure the
address change is handled the same way for both code paths, we simply
ignore this hint.
2023-12-21 14:22:26 +01:00
Aurelien DARRAGON
545e72546c BUG/MINOR: server/event_hdl: propagate map port info through inetaddr event
server addr:svc_port updates during runtime might set or clear the
SRV_F_MAPPORTS flag. Unfortunately, the flag update is still directly
performed by srv_update_addr_port() function while the addr:svc_port
update is being scheduled for atomic update. Given that existing readers
don't take server's lock to read addr:svc_port, they also check the
SRV_F_MAPPORTS flag right after without the lock.

So we could cause the readers to incorrectly interpret the svc_port from
the server struct because the mapport information is not published
atomically, resulting in inconsistencies between svc_port and the mapport
flag. (The MAPPORTS flag causes svc_port to be used differently by the reader.)

To fix this, we publish the mapport information within the INETADDR server
event and we let the task responsible for updating the server's addr and
port set or clear the flag depending on the mapport hint.

This patch depends on:
 - MINOR: server/event_hdl: add server_inetaddr struct to facilitate event data usage
 - MINOR: server/event_hdl: update _srv_event_hdl_prepare_inetaddr prototype

This should be backported in 2.9 with 683b2ae01 ("MINOR: server/event_hdl:
add SERVER_INETADDR event")
2023-12-21 14:22:26 +01:00
Aurelien DARRAGON
4e50c31eab MINOR: server/event_hdl: update _srv_event_hdl_prepare_inetaddr prototype
Slightly change _srv_event_hdl_prepare_inetaddr() function prototype to
reduce the input arguments by learning some settings directly from the
server. Also taking this opportunity to make the function static inline
since it's relatively simple and not meant to be used directly.
2023-12-21 14:22:26 +01:00
Aurelien DARRAGON
835263047e OPTIM: server: ebtree lookups for findserver_unique_* functions
4e5e2664 ("MINOR: proxy: add findserver_unique_id() and findserver_unique_name()")
added findserver_unique_id() and findserver_unique_name() functions that
were inspired by the historical findserver() function, so unfortunately
they don't perform well when used on large backend farms because they scan
the whole server list linearly.

I was about to provide a patch to optimize such functions when I stumbled
on Baptiste's work:
  19a106d24 ("MINOR: server: server_find functions: id, name, best_match")

It turns out Baptiste already implemented helper functions to supersede
the unoptimized findserver() function (at least at runtime when servers
have been assigned their final IDs and inserted in the lookup trees): they
offer more matching options and rely on eb lookups so they are much more
suitable for fast queries. I don't know how I missed that, but they are a
perfect base for the server rid matching functions.

So in this patch, we essentially revert 4e5e2664 to provide the optimized
equivalent functions named server_find_by_id_unique() and
server_find_by_name_unique(), then we force existing findserver_unique_*()
callers to switch to the new functions.

This patch depends on:
 - "OPTIM: server: eb lookup for server_find_by_name()"

This could be backported up to 2.8.
2023-12-21 14:22:26 +01:00
Aurelien DARRAGON
4bcfe30414 OPTIM: server: eb lookup for server_find_by_name()
server_find_by_name() function was added in 19a106d24 ("MINOR: server:
server_find functions: id, name, best_match").

At that time, only the used_server_id proxy tree was available, thus the
name lookup was performed as a linear search.

However, used_server_name proxy tree was added in 84d6046a ("MINOR: proxy:
Add a "server by name" tree to proxy."), so we may safely rely on it to
perform server name lookups now. This will hopefully make the function
noticeably faster, especially when performing lookups in huge backend farms.
2023-12-21 14:22:26 +01:00
Aurelien DARRAGON
e35fa36360 MINOR: proxy: monitor-uri works with tcp->http upgrades
Currently, we have a check in proxy_cfg_ensure_no_http() that generates a
warning if the monitor-uri is configured on a proxy that doesn't have
mode HTTP enabled.

However, when we take a look at the monitor-uri implementation, it's not
100% correct. Indeed, despite the warning message, the directive will
still be evaluated when an HTTP upgrade occurs from a TCP frontend.
Thus the warning is misleading.

To make the warning comply with the actual behavior, the check was
moved alongside other checks that accept both native HTTP mode and HTTP
upgrades in cfgparse.c.
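
For illustration only (names and addresses below are arbitrary), this is the
kind of setup concerned: a TCP frontend whose streams are upgraded to HTTP
via an HTTP backend, where monitor-uri is still evaluated despite the former
warning:

  backend back
    mode http
    server s1 127.0.0.1:8000
  frontend front
    mode tcp
    bind localhost:8080
    monitor-uri /health
    use_backend back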
2023-12-21 14:22:26 +01:00
Aurelien DARRAGON
8a6cc6e3ea MEDIUM: proxy: set PR_O_HTTP_UPG on implicit upgrades
When a TCP frontend uses an HTTP backend, the stream is automatically
upgraded and it results in a similar behavior as if a switch-mode http
rule was evaluated since stream_set_http_mode() gets called in both
situations and minimal HTTP analyzers are set.

In the current implementation, some postparsing checks are generating
errors or warnings when the frontend is in TCP mode with some HTTP options
set and no upgrade is expected (no switch-mode http rule). But as you can
guess, this unfortunately leads to issues when such HTTP-only options are
used in a frontend that has implicit switching rules (that is, when the
frontend uses an HTTP backend for example), because in this case the
PR_O_HTTP_UPG flag will not be set, so the postparsing checks will consider
that some options are not relevant and will raise some warnings.

Consider the following example:

  backend back
    mode http
    server s1 git.haproxy.org:80
  frontend front
    mode tcp
    bind localhost:8080
    http-request set-var(txn.test) str(TRUE),debug(WORKING,stderr)
    use_backend back

By starting an haproxy instance with the above example conf, we end up
having this warning:

  [WARNING]  (400280) : config : 'http-request' rules ignored for frontend 'front' as they require HTTP mode.

However, by making a request on the frontend, we notice that the request
rules are still executed, and that's because the stream is effectively
upgraded as a result of an implicit upgrade:

  [debug] WORKING: type=str <TRUE>

So this confirms the previous description: since implicit and explicit
upgrades result in approximately the same behavior on the frontend side,
we should consider them both when doing postparsing checks.

This is what we try to address in the following commit: PR_O_HTTP_UPG
flag is now more generic in the sense that it refers to either implicit
(through default_backend or use_backend rules) or explicit (switch-mode
rules) upgrades. Indeed, every time an HTTP or dynamic backend (where the
mode cannot be assumed during parsing) is encountered in default_backend
directive or use_backend rules, we explicitly position the upgrade flag
so that further checks that depend on the proxy being in HTTP context
don't report false warnings.
2023-12-21 14:22:26 +01:00
Aurelien DARRAGON
64b7d8e173 BUG/MEDIUM: stats: unhandled switching rules with TCP frontend
Consider the following configuration:

  backend back
    mode http

  frontend front
    mode tcp
    bind localhost:8080
    stats enable
    stats uri /stats
    tcp-request content switch-mode http if FALSE
    use_backend back

Firing a request to /stats on the haproxy process started with the above
configuration will cause a segfault in http_handle_stats().

The cause for the crash is that in this case, the upgrade doesn't simply
switch to HTTP mode, but also changes the stream backend (changing from
the frontend itself to the targeted HTTP backend).

However, there is an inconsistency in the stats logic between the check
for the stats URI and the actual handling of the stats page.

Indeed, http_stats_check_uri() checks uri parameters from the proxy
undergoing the http analyzers, whereas http_handle_stats() uses s->be
instead.

During stream analysis, from the frontend perspective: s->be defaults to
the frontend. But if the frontend is in TCP mode and the stream is
upgraded to HTTP via backend switching rules, then s->be will be assigned
to the actual HTTP-capable backend in stream_set_backend().

What this means is that when the http analyzer first checks if the current
URI matches the one from the "stats uri" directive, it will check against
the "stats uri" directive from the frontend, but later since the stats
handlers reads the uri from s->be it wil actually use the value from the
backend and the previous safety checks are thus garbage, resulting in
unexpected behavior. (In our test case since the backend didn't define
"stats uri" it is set to NULL, and http_handle_stats() dereferences it)

To fix this, we should ensure that prechecks and actual stats processing
always rely on the same proxy source for stats config directives.

This is what is done in this patch, thanks to the previous commit, since
we can make sure that the stats applet will use ->http_px as its parent
proxy. So here we simply propagate the current proxy being analyzed
through all the stats processing functions.

This patch depends on:
 - MINOR: stats: store the parent proxy in stats ctx (http)

It should be backported up to 2.4.
For 2.4: the fix is less trivial since stats ctx was directly stored
within the applet struct at that time, so this alternative patch must be
used instead (without "MINOR: stats: store the parent proxy in stats ctx
(http)" dependency):

    diff --git a/include/haproxy/applet-t.h b/include/haproxy/applet-t.h
    index 014e01ed9..1d9a63359 100644
    --- a/include/haproxy/applet-t.h
    +++ b/include/haproxy/applet-t.h
    @@ -121,6 +121,7 @@ struct appctx {
     		 * keep the grouped together and avoid adding new ones.
     		 */
     		struct {
    +			struct proxy *http_px;  /* parent proxy of the current applet (only relevant for HTTP applet) */
     			void *obj1;             /* context pointer used in stats dump */
     			void *obj2;             /* context pointer used in stats dump */
     			uint32_t domain;        /* set the stats to used, for now only proxy stats are supported */
    diff --git a/src/http_ana.c b/src/http_ana.c
    index b557da89d..1025d7711 100644
    --- a/src/http_ana.c
    +++ b/src/http_ana.c
    @@ -63,8 +63,8 @@ static enum rule_result http_req_restrict_header_names(struct stream *s, struct
     static void http_manage_client_side_cookies(struct stream *s, struct channel *req);
     static void http_manage_server_side_cookies(struct stream *s, struct channel *res);

    -static int http_stats_check_uri(struct stream *s, struct http_txn *txn, struct proxy *backend);
    -static int http_handle_stats(struct stream *s, struct channel *req);
    +static int http_stats_check_uri(struct stream *s, struct http_txn *txn, struct proxy *px);
    +static int http_handle_stats(struct stream *s, struct channel *req, struct proxy *px);

     static int http_handle_expect_hdr(struct stream *s, struct htx *htx, struct http_msg *msg);
     static int http_reply_100_continue(struct stream *s);
    @@ -428,7 +428,7 @@ int http_process_req_common(struct stream *s, struct channel *req, int an_bit, s
     		}

     		/* parse the whole stats request and extract the relevant information */
    -		http_handle_stats(s, req);
    +		http_handle_stats(s, req, px);
     		verdict = http_req_get_intercept_rule(px, &px->uri_auth->http_req_rules, s);
     		/* not all actions implemented: deny, allow, auth */

    @@ -3959,16 +3959,16 @@ void http_check_response_for_cacheability(struct stream *s, struct channel *res)

     /*
      * In a GET, HEAD or POST request, check if the requested URI matches the stats uri
    - * for the current backend.
    + * for the current proxy.
      *
      * It is assumed that the request is either a HEAD, GET, or POST and that the
      * uri_auth field is valid.
      *
      * Returns 1 if stats should be provided, otherwise 0.
      */
    -static int http_stats_check_uri(struct stream *s, struct http_txn *txn, struct proxy *backend)
    +static int http_stats_check_uri(struct stream *s, struct http_txn *txn, struct proxy *px)
     {
    -	struct uri_auth *uri_auth = backend->uri_auth;
    +	struct uri_auth *uri_auth = px->uri_auth;
     	struct htx *htx;
     	struct htx_sl *sl;
     	struct ist uri;
    @@ -4003,14 +4003,14 @@ static int http_stats_check_uri(struct stream *s, struct http_txn *txn, struct p
      * s->target which is supposed to already point to the stats applet. The caller
      * is expected to have already assigned an appctx to the stream.
      */
    -static int http_handle_stats(struct stream *s, struct channel *req)
    +static int http_handle_stats(struct stream *s, struct channel *req, struct proxy *px)
     {
     	struct stats_admin_rule *stats_admin_rule;
     	struct stream_interface *si = &s->si[1];
     	struct session *sess = s->sess;
     	struct http_txn *txn = s->txn;
     	struct http_msg *msg = &txn->req;
    -	struct uri_auth *uri_auth = s->be->uri_auth;
    +	struct uri_auth *uri_auth = px->uri_auth;
     	const char *h, *lookup, *end;
     	struct appctx *appctx;
     	struct htx *htx;
    @@ -4020,6 +4020,7 @@ static int http_handle_stats(struct stream *s, struct channel *req)
     	memset(&appctx->ctx.stats, 0, sizeof(appctx->ctx.stats));
     	appctx->st1 = appctx->st2 = 0;
     	appctx->ctx.stats.st_code = STAT_STATUS_INIT;
    +	appctx->ctx.stats.http_px = px;
     	appctx->ctx.stats.flags |= uri_auth->flags;
     	appctx->ctx.stats.flags |= STAT_FMT_HTML; /* assume HTML mode by default */
     	if ((msg->flags & HTTP_MSGF_VER_11) && (txn->meth != HTTP_METH_HEAD))
    diff --git a/src/stats.c b/src/stats.c
    index d1f3daa98..1f0b2bff7 100644
    --- a/src/stats.c
    +++ b/src/stats.c
    @@ -2863,9 +2863,9 @@ static int stats_dump_be_stats(struct stream_interface *si, struct proxy *px)
     	return stats_dump_one_line(stats, stats_count, appctx);
     }

    -/* Dumps the HTML table header for proxy <px> to the trash for and uses the state from
    - * stream interface <si> and per-uri parameters <uri>. The caller is responsible
    - * for clearing the trash if needed.
    +/* Dumps the HTML table header for proxy <px> to the trash and uses the state from
    + * stream interface <si>. The caller is responsible for clearing the trash if
    + * needed.
      */
     static void stats_dump_html_px_hdr(struct stream_interface *si, struct proxy *px)
     {
    @@ -3015,17 +3015,19 @@ static void stats_dump_html_px_end(struct stream_interface *si, struct proxy *px
      * input buffer. Returns 0 if it had to stop dumping data because of lack of
      * buffer space, or non-zero if everything completed. This function is used
      * both by the CLI and the HTTP entry points, and is able to dump the output
    - * in HTML or CSV formats. If the later, <uri> must be NULL.
    + * in HTML or CSV formats.
      */
     int stats_dump_proxy_to_buffer(struct stream_interface *si, struct htx *htx,
    -			       struct proxy *px, struct uri_auth *uri)
    +			       struct proxy *px)
     {
     	struct appctx *appctx = __objt_appctx(si->end);
    -	struct stream *s = si_strm(si);
     	struct channel *rep = si_ic(si);
     	struct server *sv, *svs;	/* server and server-state, server-state=server or server->track */
     	struct listener *l;
    +	struct uri_auth *uri = NULL;

    +	if (appctx->ctx.stats.http_px)
    +		uri = appctx->ctx.stats.http_px->uri_auth;
     	chunk_reset(&trash);

     	switch (appctx->ctx.stats.px_st) {
    @@ -3045,7 +3047,7 @@ int stats_dump_proxy_to_buffer(struct stream_interface *si, struct htx *htx,
     					break;

     				/* match '.' which means 'self' proxy */
    -				if (strcmp(scope->px_id, ".") == 0 && px == s->be)
    +				if (strcmp(scope->px_id, ".") == 0 && px == appctx->ctx.stats.http_px)
     					break;
     				scope = scope->next;
     			}
    @@ -3227,10 +3229,16 @@ int stats_dump_proxy_to_buffer(struct stream_interface *si, struct htx *htx,
     }

     /* Dumps the HTTP stats head block to the trash for and uses the per-uri
    - * parameters <uri>. The caller is responsible for clearing the trash if needed.
    + * parameters from the parent proxy. The caller is responsible for clearing
    + * the trash if needed.
      */
    -static void stats_dump_html_head(struct appctx *appctx, struct uri_auth *uri)
    +static void stats_dump_html_head(struct appctx *appctx)
     {
    +	struct uri_auth *uri;
    +
    +	BUG_ON(!appctx->ctx.stats.http_px);
    +	uri = appctx->ctx.stats.http_px->uri_auth;
    +
     	/* WARNING! This must fit in the first buffer !!! */
     	chunk_appendf(&trash,
     	              "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\"\n"
    @@ -3345,17 +3353,21 @@ static void stats_dump_html_head(struct appctx *appctx, struct uri_auth *uri)
     }

     /* Dumps the HTML stats information block to the trash for and uses the state from
    - * stream interface <si> and per-uri parameters <uri>. The caller is responsible
    - * for clearing the trash if needed.
    + * stream interface <si> and per-uri parameters from the parent proxy. The caller
    + * is responsible for clearing the trash if needed.
      */
    -static void stats_dump_html_info(struct stream_interface *si, struct uri_auth *uri)
    +static void stats_dump_html_info(struct stream_interface *si)
     {
     	struct appctx *appctx = __objt_appctx(si->end);
     	unsigned int up = (now.tv_sec - start_date.tv_sec);
     	char scope_txt[STAT_SCOPE_TXT_MAXLEN + sizeof STAT_SCOPE_PATTERN];
     	const char *scope_ptr = stats_scope_ptr(appctx, si);
    +	struct uri_auth *uri;
     	unsigned long long bps = (unsigned long long)read_freq_ctr(&global.out_32bps) * 32;

    +	BUG_ON(!appctx->ctx.stats.http_px);
    +	uri = appctx->ctx.stats.http_px->uri_auth;
    +
     	/* Turn the bytes per second to bits per second and take care of the
     	 * usual ethernet overhead in order to help figure how far we are from
     	 * interface saturation since it's the only case which usually matters.
    @@ -3629,8 +3641,7 @@ static void stats_dump_json_end()
      * a pointer to the current server/listener.
      */
     static int stats_dump_proxies(struct stream_interface *si,
    -                              struct htx *htx,
    -                              struct uri_auth *uri)
    +                              struct htx *htx)
     {
     	struct appctx *appctx = __objt_appctx(si->end);
     	struct channel *rep = si_ic(si);
    @@ -3650,7 +3661,7 @@ static int stats_dump_proxies(struct stream_interface *si,
     		px = appctx->ctx.stats.obj1;
     		/* skip the disabled proxies, global frontend and non-networked ones */
     		if (!px->disabled && px->uuid > 0 && (px->cap & (PR_CAP_FE | PR_CAP_BE))) {
    -			if (stats_dump_proxy_to_buffer(si, htx, px, uri) == 0)
    +			if (stats_dump_proxy_to_buffer(si, htx, px) == 0)
     				return 0;
     		}

    @@ -3666,14 +3677,12 @@ static int stats_dump_proxies(struct stream_interface *si,
     }

     /* This function dumps statistics onto the stream interface's read buffer in
    - * either CSV or HTML format. <uri> contains some HTML-specific parameters that
    - * are ignored for CSV format (hence <uri> may be NULL there). It returns 0 if
    - * it had to stop writing data and an I/O is needed, 1 if the dump is finished
    - * and the stream must be closed, or -1 in case of any error. This function is
    - * used by both the CLI and the HTTP handlers.
    + * either CSV or HTML format. It returns 0 if it had to stop writing data and
    + * an I/O is needed, 1 if the dump is finished and the stream must be closed,
    + * or -1 in case of any error. This function is used by both the CLI and the
    + * HTTP handlers.
      */
    -static int stats_dump_stat_to_buffer(struct stream_interface *si, struct htx *htx,
    -				     struct uri_auth *uri)
    +static int stats_dump_stat_to_buffer(struct stream_interface *si, struct htx *htx)
     {
     	struct appctx *appctx = __objt_appctx(si->end);
     	struct channel *rep = si_ic(si);
    @@ -3688,7 +3697,7 @@ static int stats_dump_stat_to_buffer(struct stream_interface *si, struct htx *ht

     	case STAT_ST_HEAD:
     		if (appctx->ctx.stats.flags & STAT_FMT_HTML)
    -			stats_dump_html_head(appctx, uri);
    +			stats_dump_html_head(appctx);
     		else if (appctx->ctx.stats.flags & STAT_JSON_SCHM)
     			stats_dump_json_schema(&trash);
     		else if (appctx->ctx.stats.flags & STAT_FMT_JSON)
    @@ -3708,7 +3717,7 @@ static int stats_dump_stat_to_buffer(struct stream_interface *si, struct htx *ht

     	case STAT_ST_INFO:
     		if (appctx->ctx.stats.flags & STAT_FMT_HTML) {
    -			stats_dump_html_info(si, uri);
    +			stats_dump_html_info(si);
     			if (!stats_putchk(rep, htx, &trash))
     				goto full;
     		}
    @@ -3733,7 +3742,7 @@ static int stats_dump_stat_to_buffer(struct stream_interface *si, struct htx *ht
     		case STATS_DOMAIN_PROXY:
     		default:
     			/* dump proxies */
    -			if (!stats_dump_proxies(si, htx, uri))
    +			if (!stats_dump_proxies(si, htx))
     				return 0;
     			break;
     		}
    @@ -4112,11 +4121,14 @@ static int stats_process_http_post(struct stream_interface *si)
     static int stats_send_http_headers(struct stream_interface *si, struct htx *htx)
     {
     	struct stream *s = si_strm(si);
    -	struct uri_auth *uri = s->be->uri_auth;
    +	struct uri_auth *uri;
     	struct appctx *appctx = __objt_appctx(si->end);
     	struct htx_sl *sl;
     	unsigned int flags;

    +	BUG_ON(!appctx->ctx.stats.http_px);
    +	uri = appctx->ctx.stats.http_px->uri_auth;
    +
     	flags = (HTX_SL_F_IS_RESP|HTX_SL_F_VER_11|HTX_SL_F_XFER_ENC|HTX_SL_F_XFER_LEN|HTX_SL_F_CHNK);
     	sl = htx_add_stline(htx, HTX_BLK_RES_SL, flags, ist("HTTP/1.1"), ist("200"), ist("OK"));
     	if (!sl)
    @@ -4166,11 +4178,14 @@ static int stats_send_http_redirect(struct stream_interface *si, struct htx *htx
     {
     	char scope_txt[STAT_SCOPE_TXT_MAXLEN + sizeof STAT_SCOPE_PATTERN];
     	struct stream *s = si_strm(si);
    -	struct uri_auth *uri = s->be->uri_auth;
    +	struct uri_auth *uri;
     	struct appctx *appctx = __objt_appctx(si->end);
     	struct htx_sl *sl;
     	unsigned int flags;

    +	BUG_ON(!appctx->ctx.stats.http_px);
    +	uri = appctx->ctx.stats.http_px->uri_auth;
    +
     	/* scope_txt = search pattern + search query, appctx->ctx.stats.scope_len is always <= STAT_SCOPE_TXT_MAXLEN */
     	scope_txt[0] = 0;
     	if (appctx->ctx.stats.scope_len) {
    @@ -4263,7 +4278,7 @@ static void http_stats_io_handler(struct appctx *appctx)
     	}

     	if (appctx->st0 == STAT_HTTP_DUMP) {
    -		if (stats_dump_stat_to_buffer(si, res_htx, s->be->uri_auth))
    +		if (stats_dump_stat_to_buffer(si, res_htx))
     			appctx->st0 = STAT_HTTP_DONE;
     	}

    @@ -4888,6 +4903,7 @@ static int cli_parse_show_stat(char **args, char *payload, struct appctx *appctx

     	appctx->ctx.stats.scope_str = 0;
     	appctx->ctx.stats.scope_len = 0;
    +	appctx->ctx.stats.http_px = NULL; // not under http context
     	appctx->ctx.stats.flags = STAT_SHNODE | STAT_SHDESC;

     	if ((strm_li(si_strm(appctx->owner))->bind_conf->level & ACCESS_LVL_MASK) >= ACCESS_LVL_OPER)
    @@ -4954,7 +4970,7 @@ static int cli_io_handler_dump_info(struct appctx *appctx)
      */
     static int cli_io_handler_dump_stat(struct appctx *appctx)
     {
    -	return stats_dump_stat_to_buffer(appctx->owner, NULL, NULL);
    +	return stats_dump_stat_to_buffer(appctx->owner, NULL);
     }

     static int cli_io_handler_dump_json_schema(struct appctx *appctx)
2023-12-21 14:21:53 +01:00
Aurelien DARRAGON
ef9d692544 MINOR: stats: store the parent proxy in stats ctx (http)
Some HTTP related stats functions need to know the parent proxy, mainly
to get a pointer on the related uri_auth set by the proxy or to check
scope settings.

The current design (probably historical, as only the http context existed
back then) took the other approach: it propagates the uri pointer from the
http context deep down the calling stack up to the relevant functions.
For non-http contexts (cli), the pointer is set to NULL.

Doing so is not very pretty and not easy to maintain. Moreover, there were
still some places in the code where the uri pointer was learned directly
from the stream proxy because it was not available as an argument in those
functions. This is error-prone, because if one day we decide to change the
source proxy in the parent function, we might still have some functions
down the stack that ignore the topmost argument and still do it on their
own, and we'll probably end up with inconsistencies.

So in this patch, we take a safer approach: the caller responsible for
creating the stats applet should set the http_px pointer so that any stats
function running under the applet that needs to know if it's running in
http context or needs to access parent proxy info may do so thanks to
the dedicated ctx->http_px pointer.
2023-12-21 14:20:03 +01:00
Amaury Denoyelle
9ab107b84b MINOR: h3: use INTERNAL_ERROR code for init failure
Consider that the application layer is responsible for setting a proper
error code on init or finalize operation failure. In the case of H3, use the
INTERNAL_ERROR application error code. This allows removing the
qcc_set_error() invocation from qmux_init().

If the application layer does not specify any error code, the fallback
INTERNAL_ERROR transport error code is used, thanks to the recent change
introduced for error management in qmux_init().
2023-12-20 15:40:02 +01:00
Amaury Denoyelle
7a3602a1f5 BUG/MINOR: h3: properly handle alloc failure on finalize
If the H3 control stream Tx buffer cannot be allocated, return a proper
error through h3_finalize(). This will cause the emission of a
CONNECTION_CLOSE with error H3_INTERNAL_ERROR and closure of the whole
connection.

This should be backported up to 2.6. Note that 2.9 has some difference
which will cause conflict. The main one is that qcc_get_stream_txbuf()
does not exist in this version. Instead the check in h3_control_send()
should be made after mux_get_buf(). Finally, it may be useful to first
pick previous commit (MINOR: h3: add traces for connection init stage)
to improve context similarity.
2023-12-20 15:39:51 +01:00
Amaury Denoyelle
a2dbd6d916 MINOR: h3: add traces for connection init stage
Add H3_EV_H3C_NEW traces. They are used in the h3_init() and h3_finalize()
functions.
2023-12-20 15:27:11 +01:00
Amaury Denoyelle
403492af8e MINOR: mux-quic: adjust error code in init failure
If the QUIC MUX cannot be initialized for any reason, the connection is shut
down with a CONNECTION_CLOSE frame. Previously, no error code was
explicitly specified, resulting in a "no error" code.

Change this by always setting an error code in case of QUIC MUX failure. Use
the already defined QUIC MUX error code, or "internal error" if unset.
Call quic_set_connection_close() on the error label to register it to the
quic_conn layer.

This should help to improve error reporting in case of MUX
initialization failure.
2023-12-20 15:27:11 +01:00
Amaury Denoyelle
bcade776c2 MINOR: mux-quic: use qcc_release in case of init failure
qmux_init() may fail at different stages. In this case, an error is
returned and the QCC allocated elements are freed. Previously, extra care
was taken, using different labels to only free already allocated elements.

This patch removes the multiple labels and uses qcc_release(). This makes it
simpler to ensure a QCC is always properly freed. The only important
thing is to ensure that mandatory fields are properly initialized to
NULL or equivalent to be able to use qcc_release() safely.
2023-12-20 15:27:11 +01:00
Amaury Denoyelle
3c38bb7ee1 MINOR: mux-quic: remove qcc_shutdown() from qcc_release()
Render qcc_release() more generic by removing qcc_shutdown(). This
prevents systematic graceful shutdown/CONNECTION_CLOSE emission if only
QCC resource deallocation is necessary.

For now, qcc_shutdown() is used before every qcc_release() invocation.
The only exception is the qmux_destroy stream layer callback.

This commit will be useful to reuse qcc_release() in other contexts to
simply deallocate a QCC instance.
2023-12-20 15:27:11 +01:00
Christopher Faulet
3811c1de25 BUG/MINOR: server: Use the configured address family for the initial resolution
A regression was introduced by the commit c886fb58eb ("MINOR: server/ip:
centralize server ip updates"). The configured address family is lost when
the server address is initialized during startup, for the resolution based
on the libc or on the server state-file. Thus, "ipv4@" and "ipv6@" prefixes
are ignored.

To fix the bug, we take care to use the configured address family before calling
str2ip2() in srv_apply_lastaddr() and srv_apply_via_libc() functions.

This patch should fix the issue #2393. It must be backported to 2.9.
2023-12-20 12:21:59 +01:00
Amaury Denoyelle
d2540b2f72 MINOR: h3: remove quic_conn only reference
H3 uses a direct reference to quic_conn to access the listener instance.
This can be replaced by using qcc->conn->target. This allows to remove
quic_conn-t.h header include from it.
2023-12-20 10:38:30 +01:00
Christopher Faulet
d9eb6d6680 BUG/MEDIUM: mux-h2: Don't report error on SE for closed H2 streams
An error on the H2 connection was always reported as an error to the
stream-endpoint descriptor, independently of the H2 stream state. But it is
a bug to do so for closed streams. And indeed, it leads to reporting a "SD--"
termination state for some streams while the response was fully received and
forwarded to the client, at least from the backend side's point of view.

Now, errors are no longer reported for H2 streams in closed state.

This patch is related to the three previous ones:

 * "BUG/MEDIUM: mux-h2: Don't report error on SE for closed H2 streams"
 * "BUG/MEDIUM: mux-h2: Don't report error on SE if error is only pending on H2C"
 * "BUG/MEDIUM: mux-h2: Only Report H2C error on read error if demux buffer is empty"

The series should fix a bug reported in issue #2388
(#2388#issuecomment-1855735144). The series should be backported to 2.9 but
only after a period of observation. In theory, older versions are also
affected but this part is pretty sensitive. So don't backport it further
except if someone ask for it.
2023-12-18 21:15:32 +01:00
Christopher Faulet
580ffd6123 BUG/MEDIUM: mux-h2: Don't report error on SE if error is only pending on H2C
In h2s_wake_one_stream(), we must not report an error on the stream-endpoint
descriptor if the error is not definitive on the H2 connection. A pending
error on the H2 connection means there are potentially remaining data to be
demuxed. It is important to not truncate a message for a stream.

This patch is part of a series that should fix a bug reported in issue #2388
(#2388#issuecomment-1855735144). Backport instructions will be shipped in
the last commit of the series.
2023-12-18 21:15:32 +01:00
Christopher Faulet
19fb19976f BUG/MEDIUM: mux-h2: Only Report H2C error on read error if demux buffer is empty
It is similar to the previous fix ("BUG/MEDIUM: mux-h2: Don't report H2C
error on read error if dmux buffer is not empty"), but on the receive side. If
the demux buffer is not empty, an error on the TCP connection must not be
immediately reported as an error on the H2 connection. We must be sure to
have tried to demux all data first. Otherwise, messages for one or more
streams may be truncated while all data were already received and are
waiting to be demuxed.

This patch is part of a series that should fix a bug reported in issue #2388
(#2388#issuecomment-1855735144). Backport instructions will be shipped in
the last commit of the series.
2023-12-18 21:15:32 +01:00
Christopher Faulet
5b78cbae77 BUG/MEDIUM: mux-h2: Switch pending error to error if demux buffer is empty
When an error on the H2 connection is detected when sending data, only a
pending error is reported, waiting for an error or a shutdown on the read
side. However if a shutdown was already received, the pending error is
switched to a definitive error.

At this stage, we must also wait to have flushed the demux
buffer. Otherwise, if some data must still be demuxed, messages for one or
more streams may be truncated. There is already the flag H2_CF_END_REACHED
to know a shutdown was received and we no longer progress on the demux side
(buffer empty or data truncated). On the sending side, we should use this flag
instead to report a definitive error.

This patch is part of a series that should fix a bug reported in issue #2388
(#2388#issuecomment-1855735144). Backport instructions will be shipped in
the last commit of the series.
2023-12-18 21:15:32 +01:00
William Lallemand
0d2ebb53f7 BUG/MINOR: resolvers: default resolvers fails when network not configured
Bug #1740 was opened again, this time a user is complaining about the
"can't create socket for nameserver". This can happen if the resolv.conf
file contains a class of address which was not configured on the
machine, for example IPv6.

The fix does the same as b10b1196b ("MINOR: resolvers: shut the warning
when "default" resolvers is implicit"), and uses the
"resolvers->conf.implicit" variable to emit the error.

Though it is not needed to convert the explicit behavior with an
ERR_WARN, because this is supposed to be an unrecoverable error, unlike
the connect().

Should fix issue #1740.

Must be backported wherever b10b1196b was backported (as far as 2.6).
2023-12-18 15:50:07 +01:00
Amaury Denoyelle
af297f19f6 BUG/MEDIUM: mux-quic: report early error on stream
On STOP_SENDING reception, an error is notified to the stream layer as
no more data can be sent in response. However, this is not done if the
stream instance is not allocated (for example, already freed).

The issue occurs if STOP_SENDING is received and the stream instance is
instantiated after it. It happens if a STREAM frame is received after it
with H3 HEADERS, which is valid in QUIC protocol due to UDP packet
reordering. In this case, stream layer is never notified about the
underlying error. Instead, response buffers are silently purged by the
MUX in qmux_strm_snd_buf().

This is suboptimal as there is no point in exchanging data from the
server if it cannot eventually be transferred back to the client.
However, aside from this consideration, no other issue occurred. This is
not the case with the QUIC mux-to-mux implementation: if mux-to-mux is
used, qmux_strm_snd_buf() is bypassed and the response is transferred
via the nego_ff/done_ff callbacks. However, these functions did not check
if the QCS is already locally closed. This causes a crash when
qcc_send_stream() is called via done_ff.

To fix this crash, there are several approaches; one of them would be to
adjust the nego_ff/done_ff QUIC callbacks. However, another method has been
chosen. Now the stream layer is flagged on error just after its
instantiation if the stream is already locally closed. This ensures that
mux-to-mux won't try to emit data, as se_nego_ff() checks that the opposite
SD is not on error before continuing.

Note that an alternative solution could be to not instantiate the stream
layer at all if the QCS is already locally closed. This is the most optimal
solution as it reduces unnecessary allocations and task processing.
However, it's not easy to implement so the easier bug fix has been
chosen for the moment.

This patch is labelled as MEDIUM as it can change the behavior of all QCS
instances, whether mux-to-mux is used or not, and thus could reveal other
architecture issues.

This should fix the latest crash occurrence on GitHub issue #2392.

It should be backported up to 2.6, after a necessary period of
observation.
2023-12-14 11:15:46 +01:00
Christopher Faulet
682f73b4fa BUG/MEDIUM: mux-h2: Report too large HEADERS frame only when rxbuf is empty
During HEADERS frame decoding, if a frame is too large to fit in a buffer,
an internal error is reported and a RST_STREAM is emitted. On the other
hand, we wait to have an empty rxbuf to decode the frame because we cannot
retry a failed HPACK decompression.

When we are decoding headers, it is valid to return an error if dbuf buffer
is full because no data can be blocked in the rxbuf (which hosts the HTX
message).

However, during the trailers decoding, it is possible to have some data not
sent yet for the current stream in the rxbuf and data for another stream
fully filling the dbuf buffer. In this case, we don't decode the trailers
but we must not return an error. We must wait to empty the rxbuf first.

Now, a HEADERS frame is considered as too large if the dbuf buffer is full
and if the rxbuf is empty (the HTX message to be accurate).

This patch should fix the issue #2382. It must be backported to all stable
versions.
2023-12-13 16:45:29 +01:00
Christopher Faulet
65ca444240 CLEANUP: mux-h1: Fix a trace message about C-L header addition
This fixes a copy-paste error in a trace message notifying that a
'Content-Length' header was added during the HTTP message formatting.
2023-12-13 16:45:29 +01:00
Christopher Faulet
966a18e2b4 BUG/MEDIUM: mux-h1: Explicitly skip request's C-L header if not set originally
Commit f89ba27caa ("BUG/MEDIUM: mux-h1; Ignore headers modifications about
payload representation") introduced a regression. The Content-Length is no
longer sent to the server for requests without payload but with a
'Content-Lnegth' header explicitly set to 0, like POST request with no
payload. It is of course unexpected. In some cases, depending on the server,
such requests are considered as invalid and a 411-Length-Required is returned.

The above commit is not directly responsible for the bug, it only reveals a
too lax condition for skipping the 'Content-Length' header of bodyless
requests. We must only skip this header if none was originally found during
parsing.
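
As an illustration, this is the kind of bodyless request affected; its
explicitly provided header must be preserved when forwarding to the server
(URI and host below are arbitrary):

  POST /api/reset HTTP/1.1
  Host: www.example.com
  Content-Length: 0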

This patch should fix the issue #2386. It must be backported to 2.9.
2023-12-13 16:45:29 +01:00
Christopher Faulet
eed1e8733c BUG/MEDIUM: mux-h1: Cound data from input buf during zero-copy forwarding
During zero-copy forwarding, we first try to forward data found in the input
buffer before trying to receive more data. These data must be removed from
the amount of data to forward (the count variable).

Otherwise, on an internal retry in h1_fastfwd(), we can be led to read
more data than expected. It is especially a problem at the end of a
chunk. An error is erroneously reported because more data than announced are
received.

This patch should fix the issue #2382. It must be backported to 2.9.
2023-12-13 16:45:29 +01:00
Frédéric Lécaille
dd58dff1e6 BUG/MEDIUM: quic: QUIC CID removed from tree without locking
This bug arrived with this commit:

   BUG/MINOR: quic: Wrong RETIRE_CONNECTION_ID sequence number check

Every connection ID manipulation against the by-thread trees used to store
the connection IDs must be done under the tree locks. These trees are
accessed by the low level connection identification code.

When receiving a RETIRE_CONNECTION_ID frame, the concerned connection ID
must be deleted from its underlying by-thread tree, but not without locking!
Add a WR lock around the ebmb_delete() call to do so.

Must be backported as far as 2.7.
2023-12-13 14:42:50 +01:00
Amaury Denoyelle
f8e095b058 MINOR: hq-interop: use zero-copy to transfer single HTX data block
Similarly to H3, hq-interop now uses zero-copy when dealing with an HTX
message with only a single data block. Exchange the HTX and QCS buffers, and
use the HTX data block for the HTTP payload. This is only possible if the
QCS buffer is empty. Contrary to HTTP/3, no extra frame header is needed
before transferring the HTTP payload.

hq-interop is only implemented for testing purposes so this change should
not be noticeable by users. However, it will be useful to be able to
test zero-copy transfers during QUIC interop testing.
2023-12-12 10:31:22 +01:00
Amaury Denoyelle
d3987b69c3 MINOR: h3: adjust zero-copy sending related code
Adjust HTTP/3 data emission. First, add the HTX as an argument to the
function, as is done for other frame emission functions. Keep the buffer
argument as it is mandatory for zero-copy. Extend the comments related to
this, in particular to explain the purposes of both the HTX and buffer
arguments.

No functional change here. This should however be useful to port equivalent
code to the hq-interop protocol.
2023-12-12 10:31:22 +01:00
Amaury Denoyelle
0e632fc9b4 MINOR: h3: complete traces for sending
Add data level traces for each encoded H3 frame. Of notable interest,
traces will be useful to detect if standard emission, zero-copy or fast
forward is used. Also add the generic filter H3_EV_TX_FRAME to be able
to filter these messages.
2023-12-12 10:31:22 +01:00
Amaury Denoyelle
1adadc4d3f MINOR: mux-quic: factorize QC_SF_UNKNOWN_PL_LENGTH set
When dealing with HTTP/1 responses without Content-Length nor chunked
encoding, the flag QC_SF_UNKNOWN_PL_LENGTH is set on the QCS. This prevents
the emission of a RESET_STREAM on shutw, instead resorting to a proper FIN
emission.
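
For illustration, this is the kind of HTTP/1 response concerned, where the
end of the payload is only known from the connection close (headers below
are arbitrary):

  HTTP/1.1 200 OK
  Content-Type: text/plain
  Connection: close

  <payload delimited by the connection close>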

This code was duplicated both in H3 and hq-interop. Move it to the common
qcs_http_snd_buf() to factorize it.
2023-12-12 10:14:22 +01:00
Amaury Denoyelle
e772d3f40f CLEANUP: mux-quic: clean up app ops callback definitions
qcc_app_ops is a set of callbacks used to unify the application protocols
running over QUIC. This commit introduces some changes to clarify its
API:
* write simple comment to reflect each callback purpose
* rename decode_qcs to rcv_buf as this name is more common and is
  similar to already existing snd_buf
* finalize is moved up as it is used during connection init stage

All these changes are ported to HTTP/3 layer. Also function comments
have been extended to highlight HTTP/3 special characteristics.
2023-12-11 16:15:13 +01:00
Amaury Denoyelle
f496c7469b MINOR: mux-quic: clean up qcs Tx buffer allocation API
This function is similar to the previous one, but this time for the QCS
sending buffer.

Previously, each application layer redefined its own version of
mux_get_buf() which was used to allocate <qcs.tx.buf>. Unify it under a
single function renamed qcc_get_stream_txbuf().
2023-12-11 16:08:51 +01:00
Amaury Denoyelle
b526ffbfb9 MINOR: mux-quic: clean up qcs Rx buffer allocation API
Replaces the qcs_get_buf() function, whose naming does not reflect its
purpose. Add a new function qcc_get_stream_rxbuf() which allocates
<qcs.rx.app_buf> if needed and returns the buffer pointer. This function is
reserved for the application protocol layer. This buffer is then accessed by
the stconn layer.

For the other qcs_get_buf() invocations, which were in effect used for a
local buffer, replace them with a plain b_alloc().
2023-12-11 16:02:30 +01:00
Aurelien DARRAGON
63282f3bfb BUG/MINOR: ext-check: cannot use without preserve-env
Since 1de44da ("MINOR: ext-check: add an option to preserve environment
variables"), it is now possible to provide an extra argument to
"external-check" directive. This allows to support the "preserve-env"
option which differs from the default behavior.

However, a mistake was made: the config parser doesn't allow the
default configuration anymore, so using external-check without an argument
will trigger an error:
  'external-check' only supports 'preserve-env' as an argument, found ''.

This is due to a small mistake in the code that makes the check
systematically report an error if the first argument is not equal to
"preserve-env". The check was modified so that the error is only reported
if an argument is provided, so that the default behavior is restored.
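
For illustration, both of the following forms are expected to parse again
after this fix (a minimal sketch of the global section):

  global
    external-check
    # or, to keep HAProxy's environment for the check command:
    external-check preserve-env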

This should fix GH #2380 and should be backported to 2.9 and potentially
further (anywhere 1de44da is, because a note about an optional backport
up to 2.6 was left in the original commit message).
2023-12-08 14:26:06 +01:00
Aurelien DARRAGON
d7964c52ce BUG/MEDIUM: map/acl: pat_ref_{set,delete}_by_id regressions
Some regressions were introduced by 5fea59754b ("MEDIUM: map/acl:
Accelerate several functions using pat_ref_elt struct ->head list")

pat_ref_delete_by_id() fails to properly unlink and free the removed
reference because it bypasses the pat_ref_delete_by_ptr() made for
that purpose. This function is normally used everywhere the target
reference is set for removal, such as the pat_ref_delete() function
that matches pattern against a string. The call was probably skipped
by accident during the rewrite of the function.

With the above commit also comes another undesirable change:
both pat_ref_delete_by_id() and pat_ref_set_by_id() directly use the
<refelt> argument as a valid pointer (they do dereference it).

This is wrong, because <refelt> is unsafe and should be handled as an
ID, not a pointer (hence the function name). Indeed, the calling function
may directly pass user input from the CLI as <refelt> argument, so we must
first ensure that it points to a valid element before using it, else it is
probably invalid and we shouldn't touch it.

What this patch essentially does, is that it reverts pat_ref_set_by_id()
and pat_ref_delete_by_id() to pre 5fea59754b behavior. This seems like
it was the only optimization from the patch that doesn't apply.

Fortunately, after reviewing the changes with Fred, it seems that the 2
functions are only involved in commands for manipulating maps or
acls on the CLI, so the "missed" opportunity to improve their performance
shouldn't matter much. Nonetheless, if we wanted to speed up the reference
lookup by ID, we could consider adding an eb64 tree for that specific
purpose that contains all pattern references IDs (ie: pointers) so that
eb lookup functions may be used instead of linear list search.

The issue was raised by Marko Juraga as he failed to perform an acl
removal by reference on the CLI on 2.9, which was known to work properly
on other versions.

It should be backported on 2.9.

Co-Authored-by: Frédéric Lécaille <flecaille@haproxy.com>
2023-12-08 14:26:06 +01:00
William Lallemand
86376f591e MINOR: ssl: activate the certificate selection callback for WolfSSL
The PR which allows choosing a certificate depending on the ciphers and
the signature algorithms was merged in WolfSSL. Let's activate this
code.

This could be backported in 2.9 only when the next WolfSSL release is
available (5.6.5). It will also need a check on the version.
2023-12-08 12:08:01 +01:00
William Lallemand
dbe9cea35b BUILD: ssl: update types in wolfssl cert selection callback
The types have changed in the PR for the wolfSSL_get_sigalg_info()
function, let's update them.

Must be backported in 2.9.
2023-12-08 12:03:11 +01:00
Frédéric Lécaille
c075e4f2fc BUG/MEDIUM: quic: Possible buffer overflow when building TLS records
This bug impacts only the OpenSSL QUIC compatibility module (USE_QUIC_OPENSSL_COMPAT).

This may happen only when the TLS stack has to be provided with more than
1024+1+5+16 bytes of CRYPTO data. In this case several TLS records have to be
built in one call to SSL_provide_quic_data(). A 5-byte header is created at
the head of these records. This header is used as AAD to cipher the record.
But the length of this AAD was counted twice: one time here in
quic_tls_compat_create_record() (initialization):

	 adlen = quic_tls_compat_create_header(qc, rec, ad, 0);

and a second time here in the same function after quic_tls_tls_seal() returns:

     ret = aad_len + outlen;

This addition is useless. Note that this bug could be reproduced when haproxy has
to authenticate the client.

Thank you to @vifino for having reported this issue in GH #2381.

Must be backported to 2.8.
2023-12-08 10:03:33 +01:00
William Lallemand
75a51dfc3f CLEANUP: mworker/cli: add comments about pcli_find_and_exec_kw()
Add a comment about the pcli_find_and_exec_kw().
2023-12-07 18:04:41 +01:00
William Lallemand
1c1bb8ef2a BUG/MINOR: mworker/cli: fix set severity-output support
"set severity-output" is one of these command that changes the appctx
state so the next commands are affected.

Unfortunately the master CLI works with pipelining and server close
mode, which means the connection between the master and the worker is
closed after each response, so for the next command this is a new appctx
state.

To fix the problem, 2 new flags are added, ACCESS_MCLI_SEVERITY_STR and
ACCESS_MCLI_SEVERITY_NB, which are used to prefix each command sent to
the worker with the right "set severity-output" command.
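
As an illustrative example only (socket path and server name are arbitrary,
"@1" targets the first worker), a pipelined master CLI session where the
chosen severity output must now be preserved for the subsequent commands:

  $ echo "@1; set severity-output number; enable server back/srv1" | socat - /var/run/haproxy-master.sock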

This patch fixes issue #2350.

It could be backported as far as 2.6.
2023-12-07 17:37:23 +01:00
Amaury Denoyelle
0338778c41 MINOR: mux-quic: add traces for 0-copy/fast-forward
Complete the qmux traces:
* add a trace when 0-copy is used for DATA transfer
* mark the FIN as detected when using fast forward
2023-12-07 17:06:55 +01:00
Amaury Denoyelle
f5b2870eab CLEANUP: mux_quic: rename ffwd function with prefix qmux_strm_
All QUIC MUX functions which are callbacks for the stream layer use the
qmux_strm_* prefix. This was not the case for the fast forward related
callback, which only used the qmux_* prefix.

Fix this by reusing the standard prefix to respect the QUIC MUX code
convention.
2023-12-07 17:06:55 +01:00
Amaury Denoyelle
de765a0058 MINOR: hq-interop: add fastfwd support
Implement callback for fast forwarding for hq-interop.

This change should not be considered as functionally important. Indeed,
HTTP/0.9 is reserved for QUIC interop testing and should not be used
outside of it. However, implementing fast forwarding in this context is
useful as this will allow testing MUX code sections for fast forward via
QUIC interop.
2023-12-07 17:06:52 +01:00
Frédéric Lécaille
917f7c74d3 BUG/MINOR: lua: Wrong OCSP CID after modifying an SSL certficate (LUA)
This bugfix is the same as the following one:
    "BUG/MINOR: ssl_ckch: Wrong OCSP CID after modifying an SSL certficate"
where the OCSP CID had to be reset when updating a certificate.

Must be backported to 2.8.
2023-12-06 16:12:08 +01:00
Frédéric Lécaille
75f5977ff4 BUG/MINOR: ssl: Wrong OCSP CID after modifying an SSL certficate
This bug could be reproduced with the "set ssl cert" CLI command to update
a certificate. The OCSP CID is duplicated by ckchs_dup() which calls
ssl_sock_copy_cert_key_and_chain(). It should be computed again by
ssl_sock_load_ocsp(). This may be accomplished by resetting the new ckch
OCSP CID returned by ckchs_dup().

This bug may be in relation with GH #2319.

Must be backported to 2.8.
2023-12-06 16:12:08 +01:00
Frédéric Lécaille
456ba6e95f MINOR: ssl/cli: Add ha_(warning|alert) msgs to CLI ckch callback
This patch allows the cli_io_handler_commit_cert() callback, called upon
a "commit ssl cert ..." command, to prefix the messages returned by the CLI
with the ones built by ha_warning() and ha_alert().

It could be interesting to backport this commit to 2.8.
2023-12-06 16:12:08 +01:00
Frédéric Lécaille
7dab3e8266 BUG/MINOR: ssl: Double free of OCSP Certificate ID
This bug could be reproduced by loading several certificates from a "bind"
line with "server_ocsp.pem" as argument to the "crt" setting and updating
the ECDSA certificate with the RSA one as follows:

echo -e "set ssl cert reg-tests/ssl/ocsp_update/multicert/server_ocsp.pem.ecdsa \
	     <<\n$(cat reg-tests/ssl/ocsp_update/multicert/server_ocsp.pem.rsa)\n" | socat - /tmp/stats
followed by an "commit ssl cert reg-tests/ssl/ocsp_update/multicert/server_ocsp.pem.ecdsa"
command. This could be detected by libasan as follows:

=================================================================
==507223==ERROR: AddressSanitizer: attempting double-free on 0x60200007afb0 in thread T3:
    #0 0x7fabc6fb5527 in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x54527)
    #1 0x7fabc6ae8f8c in ossl_asn1_string_embed_free (/opt/quictls/lib/libcrypto.so.81.3+0xd4f8c)
    #2 0x7fabc6af54e9 in ossl_asn1_primitive_free (/opt/quictls/lib/libcrypto.so.81.3+0xe14e9)
    #3 0x7fabc6af5960 in ossl_asn1_template_free (/opt/quictls/lib/libcrypto.so.81.3+0xe1960)
    #4 0x7fabc6af569f in ossl_asn1_item_embed_free (/opt/quictls/lib/libcrypto.so.81.3+0xe169f)
    #5 0x7fabc6af58a4 in ASN1_item_free (/opt/quictls/lib/libcrypto.so.81.3+0xe18a4)
    #6 0x46a159 in ssl_sock_free_cert_key_and_chain_contents src/ssl_ckch.c:723
    #7 0x46aa92 in ckch_store_free src/ssl_ckch.c:869
    #8 0x4704ad in cli_release_commit_cert src/ssl_ckch.c:1981
    #9 0x962e83 in cli_io_handler src/cli.c:1140
    #10 0xc1edff in task_run_applet src/applet.c:454
    #11 0xaf8be9 in run_tasks_from_lists src/task.c:634
    #12 0xafa2ed in process_runnable_tasks src/task.c:876
    #13 0xa23c72 in run_poll_loop src/haproxy.c:3024
    #14 0xa24aa3 in run_thread_poll_loop src/haproxy.c:3226
    #15 0x7fabc69e7ea6 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x7ea6)
    #16 0x7fabc6907a2e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0xfba2e)

0x60200007afb0 is located 0 bytes inside of 3-byte region [0x60200007afb0,0x60200007afb3)
freed by thread T3 here:
    #0 0x7fabc6fb5527 in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x54527)
    #1 0x7fabc6ae8f8c in ossl_asn1_string_embed_free (/opt/quictls/lib/libcrypto.so.81.3+0xd4f8c)

previously allocated by thread T2 here:
    #0 0x7fabc6fb573f in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x5473f)
    #1 0x7fabc6ae8d77 in ASN1_STRING_set (/opt/quictls/lib/libcrypto.so.81.3+0xd4d77)

Thread T3 created by T0 here:
    #0 0x7fabc6f84bba in pthread_create (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x23bba)
    #1 0xc04f36 in setup_extra_threads src/thread.c:252
    #2 0xa2761f in main src/haproxy.c:3917
    #3 0x7fabc682fd09 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x23d09)

Thread T2 created by T0 here:
    #0 0x7fabc6f84bba in pthread_create (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x23bba)
    #1 0xc04f36 in setup_extra_threads src/thread.c:252
    #2 0xa2761f in main src/haproxy.c:3917
    #3 0x7fabc682fd09 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x23d09)

SUMMARY: AddressSanitizer: double-free ??:0 __interceptor_free
==507223==ABORTING
Aborted

The OCSP CID stored in the impacted ckch data was freed but not reset to NULL,
leading to a subsequent double free.

Must be backported to 2.8.
2023-12-06 16:12:08 +01:00
Christopher Faulet
67c03508d6 MEDIUM: pattern: Add support for virtual and optional files for patterns
Before this patch, it was not possible to use a list of patterns, a map or a
list of ACLs without an existing file. However, it could be handy to just
use an ID, with no file on the disk. It is pretty useful for everyone
managing these lists dynamically. It could also be handy to try to load a
list from a file if it exists, without failing if not. This way, it becomes
possible to make a cold start without any file (instead of an empty file),
dynamically add and delete patterns, and periodically dump the list to the
file to reuse it on reload (via an external process).

In this patch, we use some prefixes to be able to use virtual or optional
files.

The default case remains unchanged: regular files are used. A filename, with
no prefix, is used as reference, and it must exist on the disk. With the
"file@" prefix, the same is performed. Internally this prefix is
skipped. Thus the same file, with or without the "file@" prefix, references
the same list of patterns.

To use a virtual map, "virt@" prefix must be used. No file is read, even if
the following name looks like a file. It is just an ID. The prefix is part
of ID and must always be used.

To use an optional file, i.e. a file that may or may not exist on the disk at
startup, the "opt@" prefix must be used. If the file exists, its content is
loaded. But HAProxy doesn't complain if not. The prefix is not part of the
ID. For a given file, optional files and regular files reference the same
list of patterns.

This patch should fix the issue #2202.
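
As an illustration only, a hedged configuration sketch using these prefixes
(the map name and ACL file path below are hypothetical):

  frontend fe
        bind :8080
        mode http
        # "virt@" map: never read from disk, the prefix is part of the ID
        http-request set-map(virt@blocked-ips) %[src] 1
        # "opt@" ACL file: loaded if it exists on disk, silently empty otherwise
        http-request deny if { src -f opt@/etc/haproxy/denylist.acl }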
2023-12-06 10:24:41 +01:00
Christopher Faulet
660e4185e1 MINOR: pattern: Use reference name as filename to read patterns from a file
It is only a small API refactoring. The filename is no longer used when
pat_ref_read_from_file_smp() or pat_ref_read_from_file() functions are
called. The filename was already used to create the reference on the list of
patterns. Thus, we now directly use info from this reference.
2023-12-06 10:24:41 +01:00
Christopher Faulet
533121a56e MINOR: cache: Add global option to enable/disable zero-copy forwarding
tune.cache.zero-copy-forwarding parameter can now be used to enable or
disable the zero-copy fast-forwarding for the cache applet only. It is
enabled ('on') by default. It can be disabled by setting the parameter to
'off'.
2023-12-06 10:24:41 +01:00
Christopher Faulet
ebead3c0a1 MEDIUM: cache: Add support for endp-to-endp fast-forwarding
It is now possible to directly forward data to the opposite side from the
cache applet. To do so, dedicated functions were added to fast-forward the
payload part of the cached objects. Of course headers and trailers are still
sent via the channel's buffer, using the HTX.

When an object is delivered from the cache, once the applet reaches the
HTX_CACHE_DATA state, it declares it can fast-forward data. From this point,
all data are directly transferred to the opposite side.
2023-12-06 10:24:41 +01:00
Christopher Faulet
5baa9ea168 MEDIUM: cache: Save body size of cached objects and track it on delivery
We now save the body size of cached objects in the cache entry structure. In
addition, the cache applet tracks the body part already sent.

This will be mandatory to add support of endpoint-to-endpoint
fast-forwarding in the cache applet.
2023-12-06 10:24:41 +01:00
Christopher Faulet
5f99a37ae6 MINOR: applets: Use channel's field to compute amount of data received
To be able to support endpoint-to-endpoint fast-forwarding (formerly called
mux-to-mux fast-forwarding), we cannot rely on data in the input channel to
compute the amount of data the applet has produced.

The applet API is not really designed to know how many bytes are produced or
received at each call. Till now, it was not a problem because data always
passed through the channels. With E2E fast-forwarding, input data may be
immediately consumed. From the caller's point of view (task_run_applet), there
is only the total field of the input channel that will change. So let's use
it now.
2023-12-06 10:24:41 +01:00
Christopher Faulet
52c84ab0e0 MEDIUM: applet: Handle channel's STREAMER flags on applets side
Till now, it was not possible to notify that a producing applet is streaming
data. It means it was not possible to set CF_STREAMER and CF_STREAMER_FLAGS
on the input channel of an applet streaming data.

While it is not a big deal for most applets, it is interesting for the
cache. Because there are now dedicated functions to deal with these flags,
we can use them in task_run_applet() to set/unset these flags on the input
channel.

This patch relies on "MINOR: channel: Use dedicated functions to deal with
STREAMER flags".
2023-12-06 10:24:41 +01:00
Christopher Faulet
a40321eb3b MINOR: channel: Use dedicated functions to deal with STREAMER flags
For now, CF_STREAMER and CF_STREAMER_FAST flags are set in the sc_conn_recv()
function. The logic is moved into dedicated functions.

First, the channel_check_idletimer() function is now responsible for checking
the channel's last read date against the idle timer value to be sure the
producer is still streaming data. Otherwise, it removes the STREAMER flags.

Then, the channel_check_xfer() function is responsible for checking the amount
of data transferred with a receive, to possibly update the STREAMER flags.

In sc_conn_recv(), we now use these functions.
2023-12-06 10:24:41 +01:00
Christopher Faulet
a7777bbf79 BUG/MEDIUM: peers: fix partial message decoding
peer_recv_msg() may return because the message is incomplete without
checking if a shutdown is pending for the SC. The function relies on
co_getblk() to detect shutdowns. However, the message length decoding may be
interrupted if the multi-byte integer is incomplete. In this case, the SC
is not checked for shutdowns.

When this happens, this leads to an appctx spinning loop.

This patch should fix the issue #2373. It must be backported to 2.8.
2023-12-05 09:28:53 +01:00
Christopher Faulet
18f2ccd244 MINOR: mux-quic: Disable zero-copy forwarding for send by default
There is at least one bug for now in this part and it is still unstable. Thus
it is better to disable it by default for now. It can be enabled by setting
tune.quic.zero-copy-fwd-send to 'on'.
2023-12-04 15:36:02 +01:00
Christopher Faulet
5c959336fd MINOR: mux-quic: Add global option to enable/disable zero-copy forwarding
tune.quic.zero-copy-fwd-send can now be used to enable or disable the
zero-copy fast-forwarding for the QUIC mux only, for sends. For now, there
is no option to disable it for receives because it is not supported yet.

It is enabled ('on') by default.
2023-12-04 15:33:52 +01:00
Christopher Faulet
6da0429e75 MINOR: mux-h2: Add global option to enable/disable zero-copy forwarding
tune.h2.zero-copy-fwd-send can now be used to enable or disable the
zero-copy fast-forwarding for the H2 mux only, for sends. For now, there is
no option to disable it for receives because it is not supported yet.

It is enabled ('on') by default.
2023-12-04 15:33:34 +01:00
Christopher Faulet
f5e73024e9 MINOR: mux-h1: Add global option to enable/disable zero-copy forwarding
tune.h1.zero-copy-fwd-recv and tune.h1.zero-copy-fwd-send can now be used to
enable or disable the zero-copy fast-forwarding for the H1 mux only, for
receives or sends. Unlike the PT mux, there are 2 options here because
client and server sides can use different muxes.

Both are enabled ('on') by default.
2023-12-04 15:33:07 +01:00
Christopher Faulet
eccef69137 MINOR: mux-pt: Add global option to enable/disable zero-copy forwarding
tune.pt.zero-copy-forwarding parameter can now be used to enable or disable
the zero-copy fast-forwarding for the PT mux only. It is enabled ('on') by
default. It can be disabled by setting the parameter to 'off'. In this case,
this disables both the receive and send sides.
2023-12-04 15:32:32 +01:00
Christopher Faulet
7732323cf3 MINOR: global: Use a dedicated bitfield to customize zero-copy fast-forwarding
The zero-copy fast-forwarding feature is quite new and a bit sensitive.
There is an option to disable it globally. However, not all protocols have
the same maturity. For instance, for the PT multiplexer, there is nothing
really new. The zero-copy fast-forwarding is only another name for the kernel
splicing. However, for QUIC/H3, it is pretty new, not really optimized,
and it will evolve. And soon, the support will be added for the cache
applet.

In this context, it is useful to be able to enable/disable zero-copy
fast-forwarding per protocol and per applet, and when it is applicable, on
sends or receives separately. So, instead of having one flag to disable it
globally, there is now a dedicated bitfield, global.tune.no_zero_copy_fwd.
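
As an illustration only, a configuration sketch combining the per-protocol
switches introduced in this series (the keywords are taken from the commits
above; the values chosen here are arbitrary):

  global
        # all of these default to 'on'; only the less mature one is turned off
        tune.pt.zero-copy-forwarding    on
        tune.h1.zero-copy-fwd-recv      on
        tune.h1.zero-copy-fwd-send      on
        tune.h2.zero-copy-fwd-send      on
        tune.cache.zero-copy-forwarding on
        tune.quic.zero-copy-fwd-send    off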
2023-12-04 15:31:47 +01:00
Willy Tarreau
db812f73af BUILD: http_htx: silence uninitialized warning on some gcc versions
Building on gcc 4.4 reports "start may be used uninitialized". This is
a classical case of dependency between two variables where the compiler
lost track of their initialization and doesn't know that if one is not
set, the other is. By just moving the second test in the else clause
of the assignment both fixes it and makes the code more efficient, and
this can be simplified as a ternary operator.

It's probably not needed to backport this, unless anyone reports build
warnings with more recent compilers (intermediary optimization levels
such as -O1 can sometimes trigger such warnings).
2023-12-01 20:46:24 +01:00
Aurelien DARRAGON
c2cd6a419c BUG/MINOR: server/event_hdl: properly handle AF_UNSPEC for INETADDR event
It is possible that a server's addr family is temporarily set to AF_UNSPEC
even if we're certain to be in INET context (ipv4, ipv6).

Indeed, as soon as IP address resolving is involved, srv->addr family will
be set to AF_UNSPEC when the resolution fails (could happen at any time).

However, _srv_event_hdl_prepare_inetaddr() wrongly assumed that it would
only be called with AF_INET or AF_INET6 families. Because of that, the
function will handle an AF_UNSPEC address as an IPV6 address: not only
could we risk reading from an uninitialized area, but we would then
propagate false information when publishing the event.

In this patch we make sure to properly handle the AF_UNSPEC family in
both the "prev" and the "next" part of the SERVER_INETADDR event, and that
every member is explicitly initialized.

This bug was introduced by 6fde37e046 ("MINOR: server/event_hdl: add
SERVER_INETADDR event"), no backport needed.
2023-12-01 20:43:42 +01:00
Tim Duesterhus
1dcc6a8a96 BUG/MINOR: sample: Make the word converter compatible with -m found
Previously an expression like:

    path,word(2,/) -m found

always returned `true`.

The bug has existed since the `word` converter was introduced. That is:
c9a0f6d023

The same bug was previously fixed for the `field` converter in commit
4381d26edc.

The fix should be backported to 1.6+.
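
For example, after the fix, an ACL like the following (a sketch; names are
arbitrary) only matches when the path really contains a second '/'-delimited
word:

  frontend fe
        bind :8080
        mode http
        acl has_second_word path,word(2,/) -m found
        http-request deny if !has_second_word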
2023-12-01 14:35:47 +01:00
Christopher Faulet
084db70ad1 DEBUG: stream: Report lra/fsb values for front end back SC in stream dump
REX and WEX dates are already reported. But if the corresponding SC cannot
expire on read or write, "<NEVER>" is reported instead. The same is reported
if no expiration date is set. It is not really convenient because we cannot
distinguish the two cases.

So, now, for each SC, the read and write timers (rto/wto) are also reported in
the dump, based on the .lra/.fsb dates and the current I/O timeout. The SC I/O
timeout is also reported.
2023-12-01 11:25:49 +01:00
Aurelien DARRAGON
eec3911e64 BUG/MINOR: cfgparse-listen: fix warning being reported as an alert
Since b40542000d ("MEDIUM: proxy: Warn about ambiguous use of named
defaults sections") we introduced a new error to prevent user from
having an ambiguous named default section in the config which is both
inherited explicitly using "from" and implicitly by another proxy due to
the default section being the last one defined.

However, despite the error message being presented as a warning by the
err_code, the commit message and the documentation, it is actually
reported as a fatal error because ha_alert() was used in place of
ha_warning().

In this patch we make the code comply with the documentation and the
intended behavior by using ha_warning() to report the error message.

This should be backported up to 2.6.
2023-12-01 09:09:45 +01:00
Willy Tarreau
822d45678f BUILD: server: shut a bogus gcc warning on certain ubuntu
On ubuntu 20.04 and 22.04 with gcc 9.4 and 11.4 respectively, we get
the following warning:

  src/server.c: In function 'srv_update_addr_port':
  src/server.c:4027:3: warning: 'new_port' may be used uninitialized in this function [-Wmaybe-uninitialized]
   4027 |   _srv_event_hdl_prepare_inetaddr(&cb_data.addr, &s->addr, s->svc_port,
        |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   4028 |                                   ((ip_change) ? &sa : &s->addr),
        |                                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   4029 |                                   ((port_change) ? new_port : s->svc_port),
        |                                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   4030 |                                   1);
        |                                   ~~

It's clearly wrong, port_change only changes from 0 to anything else
*after* assigning new_port. Let's just preset new_port to zero instead
of trying to play smart with the compiler.
2023-11-30 17:48:03 +01:00
Willy Tarreau
7f58e9f1e0 DEBUG: unstatify a few functions that are often present in backtraces
It's useful to be able to recognize certain functions that are often
present in backtraces as they call lower level functions, and for this
they must not be static. Let's remove "static" in front of these
functions:

  sc_notify, sc_conn_recv, sc_conn_send, sc_conn_process,
  sc_applet_process, back_establish, stream_update_both_sc,
  httpclient_applet_io_handler, httpclient_applet_init,
  httpclient_applet_release
2023-11-30 17:15:54 +01:00
Frédéric Lécaille
ff8db5a85d BUG/MINOR: config: Stopped parsing upon unmatched environment variables
When an environment variable could not be matched by getenv(), the
current character to be parsed by parse_line() from the <in> variable
is the trailing double quote. If nothing is done in such a case,
this character is skipped by parse_line(), then the following spaces
are parsed as an empty argument.

To fix this, skip the double quote character and the following spaces
to make the <in> variable point to the next argument to be parsed.

Thank you to @sigint2 for having reported this issue in GH #2367.

Must be backported as far as 2.4.
2023-11-30 16:48:41 +01:00
Amaury Denoyelle
0ce213d246 MINOR: quic_tp: use in_addr/in6_addr for preferred_address
preferred_address is a transport parameter specified by the server. It
specifies both an IPv4 and an IPv6 address. These addresses were defined as
plain arrays in <struct tp_preferred_address>.

Convert these addresses to use the common types in_addr/in6_addr. With
this change, dumping of preferred_address is extended. It now displays
the addresses using inet_ntop() and CID value.
2023-11-30 15:59:45 +01:00
Amaury Denoyelle
a9ad68aa74 BUG/MINOR: quic_tp: fix preferred_address decoding
quic_transport_param_dec_pref_addr() is responsible for decoding the
preferred_address from received transport parameters. There were two
issues with this function :
* address and port location as defined in RFC were inverted for both
  IPv4 and IPv6 during decoding
* an invalid check was done to ensure decoded CID length corresponds to
  remaining buffer size. It did not take into account the final field
  for stateless reset token.

These issues were never encountered as only a server can emit the
preferred_address transport parameter, so the impact of this bug is
invisible.

This should be backported up to 2.6.
2023-11-30 15:49:10 +01:00
Amaury Denoyelle
f31719edae CLEANUP: quic_cid: remove unused listener arg
retrieve_qc_conn_from_cid() requires a listener as argument whereas it is
unused. This is an artifact from the old architecture where CID trees
were stored on listener instances instead of globally. Remove it to
better reflect this change.
2023-11-30 15:04:27 +01:00
Amaury Denoyelle
86e5c607d1 MINOR: rhttp: mark reverse HTTP as experimental
Mark the reverse HTTP feature as experimental. This will allow adjusting
the configuration mechanism if needed with future developments, without
maintaining backward compatibility.

Concretely, each config directives linked to it now requires to specify
first global expose-experimental-directives before. This is the case for
the following directives :
- rhttp@ prefix uses in bind and server lines
- nbconn bind keyword
- attach-srv tcp rule

Each documentation section referring to these keywords is updated to
highlight this new requirement.

Note that this commit has duplicated in several places the code from the
global function check_kw_experimental(). This is because the latter only
works with the cfg_keyword type, which is not suited to the bind_kw or
action_kw types. This should be improved in a future patch.
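
For illustration only, a sketch of a configuration that now requires the
experimental directive first (the backend and server names are hypothetical,
and the rhttp@ address syntax is the one referred to above):

  global
        # mandatory before any reverse HTTP keyword below is accepted
        expose-experimental-directives

  frontend fe-rev
        mode http
        # active reverse bind using the experimental rhttp@ prefix
        bind rhttp@be-edge/srv1 nbconn 1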
2023-11-30 15:04:27 +01:00
Christopher Faulet
c9418366b4 BUG/MEDIUM: cli: Don't look for payload pattern on empty commands
A regression was introduced by commit 9431aa0bdf ("BUG/MEDIUM: cli: Don't
look for payload pattern on empty commands").

On empty commands (really empty or containing only spaces and tabs), the number
of arguments is set to 0. However we looked for the payload pattern without
checking it. The result was an access at index -1 in the argument array. It is
of course invalid.

To fix the issue, we just skip this part when there is no argument. Note that
the empty command is still sent to the worker.

This patch should solve the issue #2365. No backport needed.
2023-11-29 15:09:29 +01:00
Christopher Faulet
24059615a7 MINOR: Add sample fetches to get the frontend and backend stream ID
"fc.id" and "bc.id" sample fetches can now be used to get, respectively, the
frontend or the backend stream ID. They rely on ->sctl() callback function
on the mux attached to the corresponding SC.

It means these sample fetches work only for connections, not applets, and
only from the time a multiplexer is installed.
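
As an illustration (a sketch; the log-format line is arbitrary), the stream IDs
can be appended to the access logs:

  frontend fe
        bind :8080
        mode http
        log-format "%ci:%cp [%tr] %ft %b/%s %ST fs_id=%[fc.id] bs_id=%[bc.id]"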
2023-11-29 11:11:12 +01:00
Christopher Faulet
fd8ce788a5 MINOR: muxes: Implement ->sctl() callback for muxes and return the stream id
All muxes now implement the ->sctl() callback function and are able to
return the stream ID. For the PT multiplexer, it is always 0. For the H1
multiplexer it is the request count for the current H1 connection (added for
this purpose). For the FCGI, H2 and QUIC muxes, the stream ID is returned.

The stream ID is returned as a signed 64 bits integer.
2023-11-29 11:11:12 +01:00
Christopher Faulet
d982a37e4c MINOR: muxes: Rename mux_ctl_type values to use MUX_CTL_ prefix
Instead of the generic MUX_, we now use the MUX_CTL_ prefix for all mux_ctl_type
values. This will avoid any ambiguities with other enums, especially with a
new one that will be added to get information on mux streams.
2023-11-29 11:11:12 +01:00
Christopher Faulet
0b8e7d666e MINOR: stream: Expose the stream's uniq_id via a new sample fetch
"txn.id32" may now be used to get the stream's uniq_id. It is equivalent to
%rt in logs.
2023-11-29 11:11:12 +01:00
Christopher Faulet
b1eb3bc9a2 MINOR: stream: add a sample fetch to get the number of connection retries
"txn.conn_retries" can now be used to get the number of connection
retries. This value is only stable once the connection is fully
established. For HTTP sessions, L7-retries must also be passed.
2023-11-29 11:11:12 +01:00
Christopher Faulet
8f56552862 MINOR: stream: Expose session terminate state via a new sample fetch
It is now possible to retrieve the session termination state, using
"txn.sess_term_state". The sample fetch returns the 2-character session
termination state.

Of course, the result of this sample fetch is volatile. It is subject to
change. It is also most of the time useless because no termination state is set
except at the end. It should only be useful in http-after-response rule
sets. It may also be used to customize the logs using a log-format
directive.

This patch should fix the issue #2221.
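
A short sketch of the http-after-response use case mentioned above (the header
name is arbitrary):

  frontend fe
        bind :8080
        mode http
        # expose the 2-character termination state on responses
        http-after-response set-header X-Term-State %[txn.sess_term_state]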
2023-11-29 11:11:12 +01:00
Christopher Faulet
0fd25514d6 MEDIUM: http-ana: Set termination state before returning haproxy response
When, for any reason and at any step, HAProxy decides to interrupt an HTTP
transaction, it returns a dedicated response to the client (possibly empty)
and it sets the stream flags used to produce the session termination state.
Both operations were performed in no particular order, depending on the code
path. Most of the time, the HAProxy response was produced first.

With this patch, the stream flags for the termination state are now set
first. This way, these flags become visible from http-after-response rule
sets. Only errors when the HAProxy response is generated are reported later.
2023-11-29 11:11:12 +01:00
Christopher Faulet
2de9e3ae24 MINOR: http-fetch: Add a sample to get the transaction status code
It was already possible to get the status code of the HTTP response and the one
received from the server. Thanks to 'txn.status', it is now possible to get
the transaction status code. It is equivalent to '%ST' in log-format.

Most of the time, it is the same as 'status', except if the status code of the
HTTP reply does not match the one used to interrupt the transaction. For
instance, an error file mapped on 400 containing a 404.
2023-11-29 11:11:12 +01:00
Christopher Faulet
b2f82b2b51 MINOR: http-fetch: Add a sample to retrieve the server status code
The code returned by the "status" sample fetch is the one in the HTTP
response at the moment the sample is evaluated. It may be the status code in
the server response or the one of the HAProxy reply in case of error, deny,
redirect...

However, it could be handy to retrieve the status code returned by the
server, when an HTTP response was really received from it. It is the purpose
of the "server_status" sample fetch. The server status code itself is stored
in the HTTP txn.
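
For example (a sketch; the log-format line is illustrative only), both codes
can be logged side by side to spot responses rewritten by HAProxy:

  frontend fe
        bind :8080
        mode http
        # srv= original server code, final= code actually sent to the client
        log-format "%ci [%tr] %ft %b/%s srv=%[server_status] final=%[status]"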
2023-11-29 11:11:12 +01:00
Amaury Denoyelle
263f4e3d9c MINOR: h3: use correct error code for missing SETTINGS
Each received HTTP/3 frame is checked to ensure it is valid given the
type of stream and its current status. This was implemented via
h3_is_frame_valid().

Previously, no distinction was made for error code, so every failure
triggered a CONNECTION_CLOSE_APP with code H3_FRAME_UNEXPECTED. However,
this function also ensures that the first frame received on the control
stream is of type SETTINGS. If not, the error code to use is
H3_MISSING_SETTINGS.

To support this, adjust the function prototype. Instead of returning a
boolean, 0 is returned for success, or an HTTP/3 error code. The function
is renamed h3_check_frame_valid() to reflect the return type change.

This is not considered as a bug as previously the connection was
correctly closed on a missing SETTINGS, albeit with a non-conformant error
code. It's not deemed sufficient to be backported.
2023-11-29 09:24:20 +01:00
Amaury Denoyelle
74ba22b1ee BUG/MINOR: h3: always reject PUSH_PROMISE
The condition for checking PUSH_PROMISE was not correctly interpreted
from the RFC. Initially, it only rejected such a frame for streams
initiated from the client side.

In fact, the RFC indicates that PUSH_PROMISE frames are never sent by a client.
Thus, such a frame can be rejected in any case until HTTP/3 is implemented on
the backend side.

This should be backported up to 2.6.
2023-11-29 09:24:20 +01:00
Amaury Denoyelle
81a4cc666d BUG/MINOR: h3: fix TRAILERS encoding
HTTP/3 trailers encoding was never working as intended. It's because
h3_trailers_to_htx() manipulated a newly allocated buffer instead of the
already existing channel one. Thus, the HTX message handled by the stream
was incomplete as it lacked trailers and EOM.

Fix this by reusing the already allocated channel buffer in
h3_trailers_to_htx().

This bug was detected by simulating TRAILERS emission, which generates a
CL--- state due to the missing request-side termination signal. Its impact
is deemed minimal as trailers are pretty infrequent for now in
HTTP/3.

This must be backported up to 2.7.
2023-11-29 09:24:19 +01:00
Christopher Faulet
07691a2e7c CLEANUP: log: Fix %rc comment in sess_build_logline()
%rq was used instead of %rc.
2023-11-29 08:59:27 +01:00
Christopher Faulet
61749d7cb7 BUG/MEDIUM: mux-quic: Stop zero-copy FF during nego if input is not empty
When the producer negotiates with the QUIC mux to perform a zero-copy
fast-forward, data in the input buffer are first transferred into the H3
buffer. However, after the transfer, if the input buffer is not empty, the
data fast-forwarding must be stopped. In this case, qmux_nego_ff() must
return 0.

No backport needed.
2023-11-29 08:59:27 +01:00
Christopher Faulet
a053512a7f BUG/MEDIUM: master/cli: Properly pin the master CLI on thread 1 / group 1
A previous fix was pushed for that (13fb7170be "BUG/MEDIUM: master/cli: Pin
the master CLI on the first thread of the group 1" ). Unfortunately, instead
of the master CLI, it is the sockpairs between the master and the workers
that were pinned to the first thread of the group 1. So the crash is still
there.

So, again, to fix the bug the master CLI is now pinned on the first thread
of the first group.

This patch should fix the issue #2259 and must be backported to 2.8.
2023-11-29 08:59:27 +01:00
Aurelien DARRAGON
d3cbd36950 BUG/MINOR: compression: possible NULL dereferences in comp_prepare_compress_request()
This bug was introduced in ead43fe4f2 ("MEDIUM: compression: Make it so
we can compress requests as well.")

2 cases were not properly handled, resulting in 2 possible NULL
dereferences leading to crashes in the function at runtime:
 - when the backend didn't define any compression options so its comp
   pointer is NULL (ie: if only the frontend defines some comp options)
 - when both the frontend and the backend didn't set a compression algo
   but at least one of the two defined some other comp options (comp
   pointer set)

For the first case, we added the missing checks to make sure we don't
read ->comp pointer if it is NULL.
For the second case, we properly return from the function if no
compression algo is defined, because there is no default value that could
be used as a fallback.

This should be backported to 2.8.
2023-11-29 08:59:27 +01:00
Aurelien DARRAGON
2f2cb6d082 MEDIUM: log/balance: support FQDN for UDP log servers
In previous log backend implementation, we created a pseudo log target
for each declared log server, and we made the log target's address point
to the actual server address to save some time and prevent unecessary
copies.

But this was done without knowing that when FQDN is involved (more broadly
when dns/resolution is involved), the "port" part of server addr should
not be relied upon, and we should explicitly use ->svc_port for that
purpose.

With that in mind and thanks to the previous commit, some changes were
required: we allocate a dedicated addr within the log target when target
is in DGRAM mode. The addr is first initialized with known values and it
is then updated automatically by _srv_set_inetaddr() during runtime.
(the change is atomic so readers don't need to worry about it)

addr from server "log target" (INET/DGRAM mode) is made of the combination
of server's address (lacking the port part) and server's svc_port.
2023-11-29 08:59:27 +01:00
Aurelien DARRAGON
cd994407a9 BUG/MAJOR: server/addr: fix a race during server addr:svc_port updates
For inet families (IP4/IP6), it is expected that server's addr/port might
be updated at runtime from DNS, cli or lua for instance.

Such updates were performed under the server's lock.

Unfortunately, most readers such as backend.c or sink.c perform the read
without taking server's lock because they can't afford slowing down their
processing for a type of event which is normally rare. But this could
result in bad values being read for the server addr:svc_port tuple (ie:
during connection establishment) as a result of concurrent updates from
external components, which can obviously cause some undesirable effects.

Instead of slowing the readers down, as we consider server's addr changes
are relatively rare, we take another approach and try to update the
addr:port atomically by performing changes under full thread isolation
when a new change is requested. The changes are performed by a dedicated
task which takes care of isolating the current thread and doesn't depend
on other threads (independent code path) to protect against deadlocks.

As such, server's addr:port changes will now be performed atomically, but
they will not be processed instantly, they will be translated to events
that the dedicated task will pick up from time to time to apply the
pending changes.

This bug existed for a very long time and has never been reported so
far. It was discovered by reading the code during the implementation
of log backend ("mode log" in backends). As it involves changes in
sensitive areas as well as thread isolation, it is probably not
worth considering backporting it for now, unless it is proven that it
will help to solve bugs that are actually encountered in the field.

This patch depends on:
 - 24da4d3 ("MINOR: tools: use const for read only pointers in ip{cmp,cpy}")
 - c886fb5 ("MINOR: server/ip: centralize server ip updates")
 - event_hdl API (which was first seen on 2.8) +
   683b2ae ("MINOR: server/event_hdl: add SERVER_INETADDR event") +
   BUG/MEDIUM: server/event_hdl: memory overrun in _srv_event_hdl_prepare_inetaddr() +
   "MINOR: event_hdl: add global tunables"

   Note that the patch may be reworked so that it doesn't depend on
   event_hdl API for older versions, the approach would remain the same:
   this would result in a larger patch due to the need to manually
   implement a global queue of pending updates with its dedicated task
   responsible for picking updates and committing them. An alternative
   approach could consist in per-server, lock-protected, temporary
   addr:svc_port storage dedicated to "updaters" were only the most
   recent values would be kept. The sync task would then use them as
   source values to atomically update the addr:svc_port members that the
   runtime readers are actually using.
2023-11-29 08:59:27 +01:00
Aurelien DARRAGON
cb3ec978fd MINOR: event_hdl: add global tunables
The local variable "event_hdl_async_max_notif_at_once" which was
introduced with the event_hdl API was left as is but with a TODO note
telling that we should make it a global tunable.

Well, we're doing this now. To prepare for upcoming tunables related to
event_hdl API, we add a dedicated struct named event_hdl_tune which is
globally exposed through the event_hdl header file so that it may be used
from everywhere. The struct is automatically initialized in
event_hdl_init() according to defaults.h.

"event_hdl_async_max_notif_at_once" now becomes
"event_hdl_tune.max_events_at_once" with it's dedicated
configuation keyword: "tune.events.max-events-at-once".

We're also taking this opportunity to raise the default value from 10
to 100 since it seems quite reasonable given existing async event_hdl
users.

The documentation was updated accordingly.
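
A minimal configuration sketch of the new keyword (the value shown is
arbitrary):

  global
        # limit the number of events an async event_hdl handler processes per wakeup
        tune.events.max-events-at-once 50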
2023-11-29 08:59:27 +01:00
Aurelien DARRAGON
f638d4b1bc BUG/MEDIUM: server/event_hdl: memory overrun in _srv_event_hdl_prepare_inetaddr()
As reported in GH #2358, #2359, #2360, #2361 and #2362: ipv6 address
handling may cause memory overrun due to struct in6_addr being handled
as sockaddr_in6 which is larger. Moreover, source variable wasn't properly
read from since the raw value was used as a pointer instead of pointing to
the actual variable's address.

This bug was introduced by 6fde37e046
("MINOR: server/event_hdl: add SERVER_INETADDR event")

Unfortunately for us, gcc didn't catch this and this actually used to
"work" by accident since the in6_addr struct is made of an array, so not
passing the pointer explicitly still resolved to the proper starting address.
Hopefully this was caught by coverity so thanks to Ilya for that.

The fix is simple: we simply copy the whole in6_addr struct by accessing
it using a pointer and using the proper struct size for the copy.
2023-11-29 08:59:27 +01:00
William Lallemand
08f1e2bea2 MINOR: mworker/cli: implements the customized payload pattern for master CLI
Implements the customized payload pattern for the master CLI.

The pattern is stored in the stream in char pcli_payload_pat[8].

The principle is basically the same as the CLI one: it looks for '<<',
then stores what's between '<<' and '\n', and looks for it to exit
payload mode.
2023-11-28 19:13:49 +01:00
William Lallemand
dd38c37777 CLEANUP: mworker/cli: use a label to return errors
Remove the returns in the function to end directly at the end label.
2023-11-28 19:12:32 +01:00
William Lallemand
e3557c7d45 MEDIUM: cli: allow custom pattern for payload
The CLI payload syntax has a limitation: it can't handle payloads
with empty lines, which is a common problem when uploading a PEM file
over the CLI.

This patch implements a way to customize the ending pattern of the CLI,
so we can look for something other than empty lines.

A char cli_payload_pat[8] is used in the appctx to store the customized
pattern. The pattern can't be more than 7 characters and can still be empty
to match an empty line.

The cli_io_handler() identifies the pattern and stores it, and
cli_parse_request() identifies the end of the payload.

If the customized pattern between "<<" and "\n" is more than 7
characters, it is not considered as a pattern.

This patch only implements the parser for the 'stats socket', another
patch is needed for the 'master CLI'.
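
For illustration, a hedged example of how such a custom pattern could be used
to push a PEM file that contains empty lines (the socket and file paths are
hypothetical):

  # "MYEOF" becomes the payload terminator instead of an empty line
  echo -e "set ssl cert /etc/haproxy/site.pem <<MYEOF\n$(cat /tmp/site.pem.new)\nMYEOF\n" | \
        socat - /var/run/haproxy.sock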
2023-11-28 19:12:32 +01:00
Remi Tricot-Le Breton
23c810d042 BUG/MINOR: cache: Remove incomplete entries from the cache when stream is closed
When a stream is interrupted by the client before the full answer is
stored in the cache, we end up with an incomplete entry in the cache
that cannot be overwritten until it "naturally" expires. In such a case,
we call the cache filter's cache_store_strm_deinit callback without ever
calling cache_store_http_end which means that the 'complete' flag is
never set on the concerned cache_entry.
This patch adds a check on the 'complete' flag in the strm_deinit
callback and removes the entry from the cache if it is incomplete.

A way to exhibit this bug is to try to get the same "big" response on
multiple clients at the same time thanks to h2load for instance, and to
interrupt the client side before the answer can be fully stored in the
cache.

This patch can be backported up to 2.4 but it will need some rework
starting with branch 2.8 because of the latest cache changes.
2023-11-28 17:18:48 +01:00
Frédéric Lécaille
ad61a5dde3 REORG: quic: Move quic_increment_curr_handshake() to quic_sock
Move quic_increment_curr_handshake() from quic_conn.c to quic_sock.h to be inlined.
Also move all the inlined functions at the end of this header.
2023-11-28 15:47:18 +01:00
Frédéric Lécaille
3e16784dfc REORG: quic: Remove qc_pkt_insert() implementation
As this function only does a few things and has a not very well chosen name,
remove it and replace it by its statements at the unique location where it
is called.
2023-11-28 15:47:18 +01:00
Frédéric Lécaille
95e9033fd2 REORG: quic: Add a new module for retransmissions
Move several functions in relation with the retransmissions from TX part
(quic_tx.c) to quic_retransmit.c new C file.
2023-11-28 15:47:18 +01:00
Frédéric Lécaille
714d1096bc REORG: quic: Move qc_notify_send() to quic_conn
Move qc_notify_send() from quic_tx.c to quic_conn.c. Note that it was already
exported from both quic_conn.h and quic_tx.h. Modify this latter header
to fix the duplication.
2023-11-28 15:47:18 +01:00
Frédéric Lécaille
b5970967ca REORG: quic: Add a new module for QUIC retry
Add quic_retry.c, a new C file for the QUIC retry feature. The following
functions are moved into it: quic_saddr_cpy() (from quic_tx.c),
quic_generate_retry_token_aad(), quic_generate_retry_token(),
parse_retry_token() and quic_retry_token_check().
2023-11-28 15:47:18 +01:00
Frédéric Lécaille
43fbea0f38 REORG: quic: Move ncbuf related function from quic_rx to quic_conn
Move quic_get_ncbuf() and quic_free_ncbuf() from quic_rx.c to quic_conn.h
as static inlined functions.
2023-11-28 15:47:18 +01:00
Frédéric Lécaille
e0d3eb496b REORG: quic: Move NEW_CONNECTION_ID frame builder to quic_cid
Move qc_build_new_connection_id_frm() from quic_conn.c to quic_cid.c.
Also move quic_connection_id_to_frm_cpy() from quic_conn.h to quic_cid.h.
2023-11-28 15:47:18 +01:00
Frédéric Lécaille
795d1a57bf REORG: quic: Rename some (quic|qc)_conn* objects to quic_conn_closed
These objects could be confused with the ones defined by the congestion control
part (quic_cc.c).
2023-11-28 15:47:16 +01:00
Frédéric Lécaille
0b872e24cd REORG: quic: Move qc_may_probe_ipktns() to quic_tls.h
This function is in relation with the Initial packet number space which is
more linked to the QUIC TLS specifications. Let's move it to quic_tls.h
to be inlined.
2023-11-28 15:37:50 +01:00
Frédéric Lécaille
c93ebcc59b REORG: quic: Move quic_build_post_handshake_frames() to quic_conn module
Move quic_build_post_handshake_frames() from quic_rx.c to quic_conn.c. This
is a function which is also called from the TX part (quic_tx.c).
2023-11-28 15:37:50 +01:00
Frédéric Lécaille
3482455ddd REORG: quic: Move qc_handle_conn_migration() to quic_conn.c
This function manipulates only quic_conn objects. Its location is definitively
in quic_conn.c.
2023-11-28 15:37:50 +01:00
Frédéric Lécaille
581549851c REORG: quic: Move QUIC path definitions/declarations to quic_cc module
Move quic_path struct from quic_conn-t.h to quic_cc-t.h and rename it to quic_cc_path.
Update the code consequently.
Also move some inlined functions in relation with the QUIC path to quic_cc.h.
2023-11-28 15:37:50 +01:00
Frédéric Lécaille
f32fc26b62 REORG: quic: Rename some functions used upon ACK receipt
Rename some functions to reflect more their jobs.
Move qc_release_lost_pkts() to quic_loss.c
2023-11-28 15:37:50 +01:00
Frédéric Lécaille
f74d882ef0 REORG: quic: Move the QUIC DCID parser to quic_sock.c
Move quic_get_dgram_dcid() from quic_conn.c to quic_sock.c because it is
only used in this file, and define it as static.
2023-11-28 15:37:50 +01:00
Frédéric Lécaille
3b91756ebe REORG: quic: Move QUIC SSL BIO method related functions to quic_ssl.c
Move __quic_conn_init() and __quic_conn_deinit() from quic_conn.c to quic_ssl.c.
2023-11-28 15:37:50 +01:00
Frédéric Lécaille
09ab48472c REORG: quic: Move several inlined functions from quic_conn.h
Move quic_pkt_type(), quic_saddr_cpy(), quic_write_uint32(), max_available_room(),
max_stream_data_size(), quic_packet_number_length(), quic_packet_number_encode()
and quic_compute_ack_delay_us()	to quic_tx.c because only used in this file.
Also move quic_ack_delay_ms() and quic_read_uint32() to quic_tx.c because they
are used only in this file.

Move quic_rx_packet_refinc() and quic_rx_packet_refdec() to quic_rx.h header.
Move qc_el_rx_pkts(), qc_el_rx_pkts_del() and qc_list_qel_rx_pkts() to quic_tls.h
header.
2023-11-28 15:37:47 +01:00
Frédéric Lécaille
831764641f REORG: quic: Move QUIC CRYPTO stream definitions/declarations to QUIC TLS
Move quic_cstream struct definition from quic_conn-t.h to quic_tls-t.h.
Its pool is also moved from quic_conn module to quic_tls. Same thing for
quic_cstream_new() and quic_cstream_free().
2023-11-28 15:37:22 +01:00
Frédéric Lécaille
ae885b9b68 REORG: quic: Move CRYPTO data buffer defintions to QUIC TLS module
Move quic_crypto_buf struct definition from quic_conn-t.h to quic_tls-t.h.
Also move its pool definition/declaration to quic_tls-t.h/quic_tls.c.
2023-11-28 15:37:22 +01:00
Frédéric Lécaille
0fc0d45745 REORG: quic: Add a new module to handle QUIC connection IDs
Move quic_cid and quic_connection_id from quic_conn-t.h to the new quic_cid-t.h header.
Move definitions of quic_stateless_reset_token_init(), quic_derive_cid(),
new_quic_cid(), quic_get_cid_tid() and retrieve_qc_conn_from_cid() to quic_cid.c
new C file.
2023-11-28 15:37:22 +01:00
Frédéric Lécaille
1564ec0a93 REORG: quic: Move some QUIC CLI code to its C file
Move init_quic() from quic_conn.c to quic_cli.c and rename it to cli_quic_init().
2023-11-28 15:37:22 +01:00
Frédéric Lécaille
21615d4376 CLEANUP: quic: Remove dead definitions/declarations
Remove useless definitions and declarations.
2023-11-28 15:37:22 +01:00
Christopher Faulet
af733ef6e4 BUG/MEDIUM: mux-h2: Remove H2_SF_NOTIFIED flag for H2S blocked on fast-forward
When a H2 stream is blocked during data fast-forwarding, we must take care
to remove H2_SF_NOTIFIED flag. This was only performed when data
fast-forward was attempted. However, if the H2 stream was blocked for any
reason, this flag was not removed. During our tests, we found it was
possible to infinitely block a connection because one of its streams was in
the send_list with the flag set. In this case, the stream was no longer
woken up to resume the sends, blocking all other streams.

No backport needed.
2023-11-28 14:01:56 +01:00
Amaury Denoyelle
fe3726cb76 BUG/MINOR: quic: fix CONNECTION_CLOSE_APP encoding
CONNECTION_CLOSE_APP encoding is broken, which prevents the sending of
every packet with such a frame. This bug was always present in quic
haproxy. However, it was slightly hidden by the previous code
which always initialized all frame members to zero, which was sufficient
to ensure CONNECTION_CLOSE_APP encoding was ok. The below patch changes
this behavior by removing this costly initialization step.

  4cf784f38e
  MINOR: quic: Avoid zeroing frame structures

Now, frame members must always be initialized individually given the
type of frame used. However, for CONNECTION_CLOSE_APP this was not
done as qc_cc_build_frm() accessed the wrong union member, referring to a
CONNECTION_CLOSE instead.

This bug was detected when trying to generate a HTTP/3 error. The
CONNECTION_CLOSE_APP frame encoding failed due to a non-initialized
<reason_phrase_len> which was too big. This was reported by the
following trace :
  "frame building error : qc@0x5555561b86c0 idle_timer_task@0x5555561e5050 flags=0x86038058 CONNECTION_CLOSE_APP"

This must be backported up to 2.6. This is necessary even if above
commit is not as previous code is also buggy, albeit with a different
behavior.
2023-11-28 11:40:01 +01:00
Willy Tarreau
d656ac7e13 OPTIM: mux-h2/zero-copy: don't allocate more buffers per connections than streams
It's the exact same as commit 0a7ab7067 ("OPTIM: mux-h2: don't allocate
more buffers per connections than streams"), but for the zero-copy case
this time. Previously it was only done on the regular snd_buf() path, but
this one is needed as well. A transfer on 16 parallel streams now consumes
half of the memory, and a single stream consumes much less.

An alternate approach would be worth investigating in the future, based
on the same principle as the CF_STREAMER_FAST at the higher level: in
short, by monitoring how many mux buffers we write at once before refilling
them, we would get an idea of how much is worth keeping in buffers max,
given that anything beyond would just waste memory. Some tests show that
a single buffer already seems almost as good, except for single-stream
transfers, which is why it's worth spending more time on this.
2023-11-28 09:15:26 +01:00
Amaury Denoyelle
e97489a526 MINOR: trace: support -dt optional format
Add an optional argument for "-dt". This argument is interpreted as a
list of several trace statement separated by comma. For each statement,
a specific trace name can be specifed, or none to act on all sources.
Using double-colon separator, it is possible to add specifications on
the wanted level and verbosity.
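
As a hedged illustration of the format described above (a sketch, not taken
from the commit; the level and verbosity names follow the usual trace ones):

  # h2 traces at 'developer' level with 'complete' verbosity,
  # quic traces with the default error level
  ./haproxy -f /etc/haproxy/haproxy.cfg -d -dt h2:developer:complete,quic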
2023-11-27 17:15:14 +01:00
Amaury Denoyelle
670520cff8 MINOR: trace: parse verbosity in a function
This patch is similar to the previous one except that it handles trace
verbosity. Trace source must be specified unless "quiet" is used.
2023-11-27 17:11:14 +01:00
Amaury Denoyelle
ed9fbeed78 MINOR: trace: parse level in a function
Extract conversion of level string argument to integer value in a
dedicated internal function trace_parse_level(). This function is used
for CLI trace parsing and will also be useful for the "-dt" process
argument.
2023-11-27 17:11:14 +01:00
Amaury Denoyelle
cef29d3708 MINOR: trace: define simple -dt argument
Add '-dt' haproxy process argument. This will automatically activate all
trace sources on stderr with the error level. This could be useful to
troubleshoot issues such as protocol violations.
2023-11-27 17:10:18 +01:00
Amaury Denoyelle
eabe477ad2 BUILD: map: fix build warning
The <pattern> field pointer of the pat_ref_elt structure has been replaced by a
zero-length array. As such, it's now unneeded to check for a NULL address
before printing it.

This type conversion was done in the following commit :
  3ac9912837
  OPTIM: pattern: save memory and time using ebst instead of ebis

The current patch is mandatory to fix the following GCC warning :
  CC      src/map.o
  src/map.c: In function ‘cli_io_handler_map_lookup’:
  src/map.c:549:54: error: the comparison will always evaluate as ‘true’ for the address of ‘pattern’ will never be NULL [-Werror=address]
  549 |                                         if (pat->ref && pat->ref->pattern)
      |

No need to backport it unless the above commit is.
2023-11-27 15:03:41 +01:00
Willy Tarreau
3ac9912837 OPTIM: pattern: save memory and time using ebst instead of ebis
In the pat_ref_elt struct, the pattern string is stored outside of the
node element, using a pointer to an strdup(). Not only this needlessly
wastes at least 16-24 bytes per entry (8 for the pointer, 8-16 for the
allocator), it also makes the tree descent less efficient since both
the node and the string have to be visited for each layer (hence at least
two cache lines). Let's use an ebmb storage and place the pattern right
at the end of the pat_ref_elt, making it a variable-sized element instead.

The set-map test below jumps from 173 to 182 kreq/s/core, and the memory
usage drops from 356 MB to 324 MB:

  http-request set-map(/dev/null) %[rand(1000000)] 1

This is even more visible with large maps: after loading 16M IP addresses
into a map, the process uses this amount of memory:

  - 3.15 GB with haproxy-2.8
  - 4.21 GB with haproxy-2.9-dev11
  - 3.68 GB with this patch

So that's a net saving of 32 bytes per entry here, which cuts in half the
extra cost of the tree, and loading a large map takes about 20% less time.
2023-11-27 11:25:07 +01:00
Christopher Faulet
b4eaadae84 BUG/MEDIUM: mux-h1: Properly ignore trailers when a content-length is announced
It is not possible in H1, but in H2 (and probably H3) it is possible to have
trailers at the end of a message while a Content-Length was announced.

However, depending on whether the trailers are received with the last HTX DATA
block or the zero-copy forwarding is used, a processing error may be
triggered, leading to a 500-internal-error.

To fix the issue, when a content-length is announced and all the payload was
processed, we switch the message to H1_MSG_DONE state only if the
end-of-message was also reported (HTX_FL_EOM flag set). Otherwise, it is
switched to H1_MSG_TRAILERS state to be able to properly ignore the
trailers, if any.

The patch must be backported as far as 2.4. Be careful, this part was highly
refactored. The patch will have to be adapted to be backported.
2023-11-27 08:37:48 +01:00
William Lallemand
3dd55fa132 MINOR: mworker/cli: implement hard-reload over the master CLI
The mworker mode never had a proper 'hard-stop' (-st) for the reload,
this is a mode which was commonly used with the daemon mode, but it was
never implemented in mworker mode.

This patch fixes the problem by implementing a "hard-reload" command
over the master CLI. It does the same as the "reload" command, but
instead of waiting for the connections to stop in the previous process,
it immediately quits the previous process after binding.
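
For example (the master CLI socket path below is hypothetical):

  # reload and immediately kill the previous process once the new one has bound
  echo "hard-reload" | socat - /var/run/haproxy-master.sock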
2023-11-24 21:44:25 +01:00
William Lallemand
77a97536e8 MEDIUM: ssl: use ssl_sock_chose_sni_ctx() in the clienthello callback
This patch removes the code which selects the SSL certificate in the
OpenSSL Client Hello callback, to use the ssl_sock_chose_sni_ctx()
function which does the same.

The bigger part of the function which remains is the extraction of the
servername, ciphers and sigalgs, because it's done manually by parsing
the TLS extensions.

This is not supposed to change anything functionally.
2023-11-24 20:07:27 +01:00
William Lallemand
9f2e07bf7b MINOR: ssl: move certificate selection in a dedicate function
The certificate selection used in the WolfSSL cert_cb and in the OpenSSL
clienthello callback is the same, so the function was duplicated to achieve
the same result.

This patch moves the selection code to a common function called
ssl_sock_chose_sni_ctx().

The servername string is still lowered in the callback, however the
search for the first dot in the string (wildp) is done in
ssl_sock_chose_sni_ctx()

The function uses the same certificate selection algorithm as before, it
needs to know if you need rsa or ecdsa, the bind_conf to achieve the
lookup, and the servername string.

This patch moves the code for WolfSSL only.
2023-11-24 20:07:27 +01:00
William Lallemand
b900a3533c MINOR: ssl: replace 'trash.area' by 'servername' in ssl_sock_switchctx_cbk()
Replace 'trash.area' by 'servername' for more readability.
2023-11-24 20:07:27 +01:00
William Lallemand
3750442bc4 MEDIUM: ssl: implement rsa/ecdsa selection with WolfSSL
PR https://github.com/wolfSSL/wolfssl/pull/6963 implements primitives to
extract ciphers and algorithm signatures.

It allows choosing a certificate depending on the sigalgs and
ciphers presented by the client (RSA or ECDSA).

Since WolfSSL does not implement the clienthello callback, the patch
uses the certificate callback (SSL_CTX_set_cert_cb())

The callback is inspired by our clienthello callback, however the
extraction of client ciphers and sigalgs is simpler,
wolfSSL_get_sigalg_info() and wolfSSL_get_ciphersuite_info() are used.

This is not enabled by default yet as the PR was not merged.
2023-11-24 20:07:27 +01:00
Aurelien DARRAGON
20437b3e32 MINOR: log/balance: set lbprm tot_weight on server on queue/dequeue
Maintain a proper px->lbprm.tot_weight for log backends. The server's weight is
considered as 1 as long as the server is usable.

This will allow the stats page to correctly display the proxy status since
the check currently relies on proxy's lbprm.tot_weight variable.
2023-11-24 16:27:55 +01:00
Aurelien DARRAGON
661c079bc5 MINOR: log/backend: prevent "use-server" rules use with LOG mode
server_rules declared using "use-server" keyword within a proxy are not
supported inside a log backend (with "mode log" set), so we report a
warning to the user and reset the setting.
2023-11-24 16:27:55 +01:00
Aurelien DARRAGON
f2629ebd4e MINOR: proxy: add free_server_rules() helper function
Take the px->server_rules freeing part out of free_proxy() and make it
a dedicated helper function so that it becomes possible to use it from
anywhere.
2023-11-24 16:27:55 +01:00
Aurelien DARRAGON
481e9317e3 MINOR: proxy: add free_logformat_list() helper function
There are multiple places inside free_proxy() where we need to perform
the exact same operation: freeing a logformat list which includes freeing
every member.

To prevent code duplication, we add the free_logformat_list() function
that takes such list as parameter and does all the freeing job on its
own.
2023-11-24 16:27:55 +01:00
Aurelien DARRAGON
8f878d5969 Revert "MINOR: cfgparse-listen: warn when use-server rules is used in wrong mode"
This reverts commit 5884e46ec8c8231e73c68e1bdd345c75c9af97a0 since we
cannot perform the test during parsing as the effective proxy mode is
not yet known.
2023-11-24 16:27:55 +01:00
Aurelien DARRAGON
ffae3ca34b MINOR: backend: remove invalid mode test for "hash-balance-factor"
This is a leftover from 1e0093a317 ("MINOR: backend/balance: "balance"
requires TCP or HTTP mode").

Indeed, we cannot perform the test during parsing as the effective proxy
type is not yet known. Moreover, thanks to b61147fd ("MEDIUM: log/balance:
merge tcp/http algo with log ones") we could potentially benefit from
this setting even in log mode, but for now it is ignored by all log
compatible load-balancing algorithms.
2023-11-24 16:27:55 +01:00
Aurelien DARRAGON
c886fb58eb MINOR: server/ip: centralize server ip updates
Add a new helper function named _srv_update_inetaddr() to centralize ip
addr and port updates during runtime.
2023-11-24 16:27:55 +01:00
Aurelien DARRAGON
24da4d3ee7 MINOR: tools: use const for read only pointers in ip{cmp,cpy}
In this patch we fix the prototype for ipcmp() and ipcpy() functions so
that input pointers that are used exclusively for reads are used as const
pointers. This way, the compiler can safely assume that those variables
won't be altered by the function.
2023-11-24 16:27:55 +01:00
Aurelien DARRAGON
683b2ae013 MINOR: server/event_hdl: add SERVER_INETADDR event
In this patch we add the support for a new SERVER event in the
event_hdl API.

SERVER_INETADDR is implemented as an advanced server event.
It is published each time the server's ip address or port is
about to change. (ie: from the cli, dns, lua...)

SERVER_INETADDR data is an event_hdl_cb_data_server_inetaddr struct
that provides additional info related to the server inet addr change,
but can be cast as a regular event_hdl_cb_data_server struct if
additional info is not needed.
2023-11-24 16:27:55 +01:00
Christopher Faulet
671e07617c BUG/MINOR: global: Fix tune.disable-(fast-forward/zero-copy-forwarding) options
These options were not properly handled during configuration parsing. A wrong
bitwise operation was used.

No backport needed.
2023-11-24 09:33:56 +01:00
Christopher Faulet
8d46a2c973 MAJOR: h3: Implement zero-copy support to send DATA frame
When possible, we try to send DATA frames without copying data. To do so, we
swap the input buffer with the QCS tx buffer. It is only possible iff:

 * There is only one HTX block of data at the beginning of the message
 * Amount of data to send is equal to the size of the HTX data block
 * The QCS tx buffer is empty

In this case, both buffers are swapped. The frame metadata are written at
the beginning of the buffer, before data and where the HTX structure is
stored.
2023-11-24 07:42:43 +01:00
Christopher Faulet
1bcc0f8892 MEDIUM: mux-quic: Add consumer-side fast-forwarding support
The QUIC multiplexer now implements callbacks to consume fast-forwarded
data. It relies on the H3 stack to acquire the buffer and format the frame.
2023-11-24 07:42:43 +01:00
Willy Tarreau
cd352c0dbe MINOR: log/balance: rename "log-sticky" to "sticky"
After giving it some thought, it could pretty well happen that other
protocols benefit from the sticky algorithm that some used to emulate
using a "stick-on int(0)" or things like this previously. So better
rename it to "sticky" right now instead of having to keep that "log-"
prefix forever. It's still limited to logs, of course, only the algo
is renamed in the config.
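
A hedged sketch of what a log backend using the renamed algorithm could look
like (the addresses below are placeholders):

  backend be-syslog
        mode log
        balance sticky
        server s1 udp@192.0.2.1:514
        server s2 udp@192.0.2.2:514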
2023-11-23 18:21:31 +01:00
Amaury Denoyelle
71ed381249 MINOR: listener: allow thread kw for rhttp bind
Thanks to the previous commit, a reverse HTTP listener is able to distribute
actively opened connections across its threads. To be able to exploit
this, allow the "thread" keyword for such a listener.

An extra check is added to explicitly forbid a reverse bind from spanning
multiple thread groups. Without this, multiple listener instances would
be created, each with its own "nbconn" value. This may surprise users,
so for now, it is better to deactivate this possibility.
2023-11-23 17:46:00 +01:00
Amaury Denoyelle
3d0c7f2e2a MEDIUM: rhttp: support multi-thread active connect
Implement support for active HTTP reverse task migration on listener
threads. This operation is done each time a new reversible connection
is instantiated. Instead of directly allocating the connection, a
lookup is done among all the listener threads.

A comparison is done to select the thread with the smallest number of
current reverse connections. If the thread found is different from the
current one, the connection allocation is delayed and the task is
rescheduled on the chosen thread. The connection is then created
and pinned on the new thread. This mechanism allows reverse HTTP
connections to be balanced across different threads.

Note that rhttp_set_affinity is still defined to disable thread
migration on accept. This is necessary as it's unsafe to move an
existing connection to another thread. However, active reverse task
migration should be sufficient to distribute connections across several
threads. Better yet, this design allows standard frontend connections
and reversible connections to be differentiated. The latter are designed
to be long-lived, so it's useful to base their distribution solely on
the other reversed connections.
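
A trivial standalone sketch of the selection step described above (the array
and function names are illustrative, not the actual haproxy internals):

  #include <stddef.h>

  /* Return the index of the listener thread currently handling the fewest
   * reverse HTTP connections, i.e. the thread on which the next reversible
   * connection task would be rescheduled.
   */
  static size_t pick_least_loaded_thread(const unsigned int *nb_rhttp_conns,
                                         size_t nb_threads)
  {
          size_t best = 0;
          size_t i;

          for (i = 1; i < nb_threads; i++)
                  if (nb_rhttp_conns[i] < nb_rhttp_conns[best])
                          best = i;
          return best;
  }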
2023-11-23 17:45:56 +01:00
Amaury Denoyelle
a3187fe06c MINOR: rhttp: add count of active conns per thread
Add a new member <nb_rhttp_conns> to the thread_ctx structure. Its
purpose is to count the current number of opened reverse HTTP
connections based on their listener membership.

This patch will be useful to support multi-threading for active reverse
HTTP, in order to select the least loaded thread.

Note that even though <nb_rhttp_conns> is only accessed by the current
thread, atomic operations are used. This is because once multi-thread
support is added, other threads will also retrieve these values.
2023-11-23 17:43:01 +01:00
Amaury Denoyelle
55e78ff7e1 MINOR: rhttp: large renaming to use rhttp prefix
The previous commit renamed the 'proto_reverse_connect' module to
'proto_rhttp'. This commit follows up by replacing various custom
prefixes with 'rhttp_' to make the code uniform.

Note that the 'reverse_' prefix was kept in the connection module. This
is because if a new reversible protocol not based on HTTP is implemented,
it may be necessary to reuse the same connection functions, which are
protocol agnostic.
2023-11-23 17:40:01 +01:00
Amaury Denoyelle
e09af499b4 MINOR: rhttp: rename proto_reverse_connect
This commit renames the module proto_reverse_connect to proto_rhttp.
This name was selected as it is shorter and more precise.
2023-11-23 17:38:58 +01:00
Christopher Faulet
85da7116a9 BUG/MEDIUM: mux-h1: Don't set CO_SFL_MSG_MORE flag on last fast-forward send
In the mux-to-mux fast-forwarding, when end-of-input is reached on the producer
side, the consumer side must not set the CO_SFL_MSG_MORE flag on send. It means
the H1C_F_CO_MSG_MORE flag must be removed from the H1 connection.

No backport needed.
2023-11-23 17:30:18 +01:00
Willy Tarreau
1de44daf7d MINOR: ext-check: add an option to preserve environment variables
In Github issue #2128, @jvincze84 explained the complexity of using
external checks in some advanced setups due to the systematic purge of
environment variables, and expressed the desire to preserve the
existing environment. During the discussion an agreement was found
around having an option to "external-check" to do that and that
solution was tested and confirmed to work by user @nyxi.

This patch just cleans this up, implements the option as
"preserve-env" and documents it. The default behavior does not change,
the environment is still purged, unless "preserve-env" is passed. The
choice of not using "import-env" instead was made so that we could
later use it to name specific variables that have to be imported
instead of keeping the whole environment.

The patch is simple enough that it could be backported if needed (and
was in fact tested on 2.6 first).
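
For example, a configuration using it might look like this (placement of the
new option follows this commit's description of it as an option to
"external-check"; the script path, backend and address are only illustrative):

  global
        external-check preserve-env

  backend app
        option external-check
        external-check command /usr/local/bin/check_backend.sh
        server app1 192.0.2.20:8080 check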
2023-11-23 16:53:57 +01:00
Ilya Shipitsin
80813cdd2a CLEANUP: assorted typo fixes in the code and comments
This is the 37th iteration of typo fixes.
2023-11-23 16:23:14 +01:00
Willy Tarreau
45a9e4e24b MINOR: init: add info about the main program to the post_mortem struct
This way we'll still have haproxy's version, build options etc in core
dumps and centralized all at once.
2023-11-23 15:39:21 +01:00
Willy Tarreau
6455fd5024 MINOR: debug: add the ability to enter components in the post_mortem struct
Here the idea is to collect components' versions and build options. The
main component is haproxy, but the API is made so that any sub-system
can easily add a component there (for example the detailed version of a
device detection lib, or some info about a lib loaded from Lua).

The elements are stored as a pointer to an array of structs and its count
so that it's sufficient to issue this in gdb to list them all at once:

  print *post_mortem.components@post_mortem.nb_components

For now we collect name, version, toolchain, toolchain options, build
options and path. Maybe more could be useful in the future.
2023-11-23 15:39:21 +01:00
Willy Tarreau
a88a3482b5 MINOR: debug: dump the mapping of the libs into post_mortem
Having the libs and their addresses listed in the post_mortem struct
is also helpful. Sometimes it helps notice that one version is not the
expected one, e.g. due to some LD_LIBRARY_PATH. We don't emit it on
"show dev" however since that's already available via "show libs".
2023-11-23 15:39:21 +01:00
Willy Tarreau
37e3dd718c MINOR: debug: copy the thread info into the post_mortem struct
The last starting thread now copies the pthread ID and stack top of
each thread into post_mortem. That way it's as easy as issuing
"p post_mortem" in gdb to see all thread IDs and stack frames and more
easily map them to the threads met in a core.
2023-11-23 15:39:21 +01:00
Willy Tarreau
c0eec3a4aa MINOR: debug: collect some boot-time info related to the process
Here we collect the original uid/gid/rlimits for FD and RAM since these
ones do affect behavior and are sometimes different from expected in
containers or when starting as a service.
2023-11-23 15:39:21 +01:00
Willy Tarreau
ff9e06cd53 MINOR: debug: report any detected hypervisor in post_mortem
When the x86 CPU flags show the "hypervisor" flag, we know we're running
inside QEMU, VMware or possibly other flavors of hypervisors. In this
case we'll report either "qemu", "vmware" or "yes" for other ones in
the "virt_techno" field, based on the DMI hardware vendor name,
otherwise "no" when the flag is not found.
2023-11-23 15:39:21 +01:00
Willy Tarreau
0cc799bdd1 MINOR: debug: detect CPU model and store it in post_mortem
The CPU model and type has significant impact on certain bugs, such
as contention issues caused by CPUs having split L3 caches, or stricter
memory models that exhibit some barrier issues. It's complicated though
because the info about the model depends on the arch. For example, x86
reports an SKU name while ARM rather reports the CPU core types, families
and versions for each CPU core. There, the SoC will sometimes be reported
in the device tree or DMI info instead. But we don't really care, it's
essentially useful to know if the code is running on an armv8.0 such as
A53, a 8.2 such as A55/A76/Neoverse etc. For MIPS the model appears to
generally be there, and in addition the SoC is often present in the
"system type" field before the first CPU, and the type of machine in the
"machine" field, to replace the missing DMI and DT, so they are also
collected. Note that only the first CPU is checked and reported, that's
expected to be vastly sufficient, since we're just trying to spot known
incompatibilities or issues.
2023-11-23 15:39:21 +01:00
Willy Tarreau
2974f3e71b MINOR: debug: report in post_mortem if the container techno used is docker
If we detect we're running inside a container on Linux, let's check if
it seems to be docker. Docker usually creates a /.dockerenv file, which
is easy to check. It's uncertain whether it's always the case, but on the
few tested instances that was true, and we don't really care, what matters
is to place helpful debugging info for developers. When this file is
detected, we report "docker" instead of "yes" in the container techno.
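
A standalone sketch of that check (not the actual haproxy code):

  #include <unistd.h>

  /* The mere presence of /.dockerenv is taken as a hint that the process
   * is running inside a docker container.
   */
  static int looks_like_docker(void)
  {
          return access("/.dockerenv", F_OK) == 0;
  }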
2023-11-23 15:39:21 +01:00
Willy Tarreau
cf8be50a3d MINOR: debug: report in port_mortem whether a container was detected
Containers often cause significant trouble depending on how they're
set up, and they're not always trivial for their users to extract info
from. Here we're trying to detect if we're running inside a container
on Linux. There are plenty of approaches and none is perfectly clean
nor reliable, which makes sense since the goal is to remain transparent
enough.

One interesting approach is to rely on the observation that containers
generally do not expose most kernel threads, and that the very first
ones are extremely stable across all kernel versions: pid 2 was
called "keventd" in kernel 2.4, became "kthreadd" in kernel 2.6, and
has since not changed. This is true on all architectures tested, even
with highly stripped down kernels such as those found on 15 year-old
OpenWRT images. And this one doesn't appear inside containers. Thus
here we check if we find such a thread via /proc and whether it's
called keventd or kthreadd, to detect a container, and we set the
"cont_techno" variable to "yes" or "no" depending on what is found.
2023-11-23 15:39:21 +01:00
Willy Tarreau
4e3f9921de MINOR: debug: add OS/hardware info to the post_mortem struct
Let's extract some info about the system (board model, vendor etc),
this will indicate some hypervisors, some cloud instances or some
uncommon embedded boards etc. Typically, vmware, qemu and raspberry-pi
are visible here and can help during the troubleshooting session.
2023-11-23 15:39:21 +01:00
Willy Tarreau
0184597522 MINOR: debug: start to create a new struct post_mortem
The goal here is to accumulate precious debugging information in a
struct that is easy to find in memory. It's aligned to 256 bytes, which
also helps. We'll progressively add a lot of info about the
startup conditions, the operating system, the hardware and hypervisor
so as to limit the number of round trips between developers and users
during debugging sessions. Also, opening a core file with a hex editor
should often be sufficient to extract most of the info.

In addition, a new "show dev" command will show this information so
that it can be checked at runtime without having to wait for a crash
(e.g. if a limit is bad in a container, better know it early).

For now the struct only contains utsname that's fed at boot time.
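
For example, it can be queried at runtime like this (the socket path is
illustrative):

  $ echo "show dev" | socat stdio /var/run/haproxy.sock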
2023-11-23 15:39:21 +01:00
Willy Tarreau
2268f10dd6 DEBUG: tinfo: store the pthread ID and the stack pointer in tinfo
When debugging a core, it's difficult to match a given gdb thread number
against an internal thread. Let's just store the pthread ID and the stack
pointer in each tinfo. This could help in the future by allowing to just
glance over them and pick the right one depending on what info is found
first.
2023-11-23 14:32:55 +01:00
Willy Tarreau
53da8bfcb6 BUG/MINOR: server: do not leak default-server in defaults sections
When a default-server directive is used in a defaults section, it's never
freed and the "defaults" proxy gets reset without freeing the fields from
that default-server. Normally there are no allocations there, except for
the config file location stored in srv->conf.file from an strdup() since
commit 9394a9444 ("REORG: server: move alert traces in parse_server")
that appeared in 2.4. In addition, if a "default-server" directive
appears multiple times in a defaults section, one more entry will be
leaked per call.

This commit addresses this by checking that we don't overwrite the file
upon multiple calls, and by clearing it when resetting the default proxy.
This should be backported to 2.4.
2023-11-23 14:32:55 +01:00
Frédéric Lécaille
7fc52357cb BUG/MINOR: quic: Possible RX packet memory leak under heavy load
This bug could be reproduced with -dMfail and h2load generating plenty of connections.
A "show pools" CLI command showed that some memory in relation with RX packet pool
was never released. Furthermore, adding an RX packet counter to each connection
and a BUG_ON() in quic_conn_release() has proved that this unreleased memory
was in relation with RX packets which were not linked to a connection.

The culprit is quic_dgram_parse(), which does not release some RX packet
memory before exiting after the connection thread affinity has changed.

Must be backported as far as 2.7.
2023-11-22 18:03:26 +01:00
Frédéric Lécaille
cd225da46c BUG/MINOR: quic: Possible leak of TX packets under heavy load
This bug could be reproduced with -dMfail and detected by adding a counter of TX
packets to the QUIC connection. When released by calling quic_conn_release(), the
connection should have a null counter of TX packets. This was not always the case.
This could occur during the handshake step: a first packet was built, then another
one should have followed in the same datagram, but failed due to a memory allocation
issue. As the datagram length and first TX packet were not written in the TX
buffer, the latter could not really be purged by qc_purge_tx_buf() even if
called. This bug occurred only when building coalesced packets in the same datagram.

To fix this, write the packet information (datagram length and first packet
address) in the TX buffer before purging it.

Must be backported as far as 2.6.
2023-11-22 18:03:26 +01:00
Frédéric Lécaille
dc8a20b317 BUG/MEDIUM: quic: Possible crash during retransmissions and heavy load
This bug could be reproduced with -dMfail and detected by libasan as follows:

$ ASAN_OPTIONS=disable_coredump=0:unmap_shadow_on_exit=1:abort_on_error=f quic-freeze.cfg -dMfail -dMno-cache -dM0x55
=================================================================
==82989==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7ffc 0x560790cc4749 bp 0x7fff8e0e8e30 sp 0x7fff8e0e8e28
WRITE of size 8 at 0x7fff8e0ea338 thread T0
    #0 0x560790cc4748 in qc_frm_free src/quic_frame.c:1222
    #1 0x560790cc5260 in qc_release_frm src/quic_frame.c:1261
    #2 0x560790d1de99 in qc_treat_acked_tx_frm src/quic_rx.c:312
    #3 0x560790d1e708 in qc_ackrng_pkts src/quic_rx.c:370
    #4 0x560790d22a1d in qc_parse_ack_frm src/quic_rx.c:694
    #5 0x560790d25daa in qc_parse_pkt_frms src/quic_rx.c:988
    #6 0x560790d2a509 in qc_treat_rx_pkts src/quic_rx.c:1373
    #7 0x560790c72d45 in quic_conn_io_cb src/quic_conn.c:906
    #8 0x560791207847 in run_tasks_from_lists src/task.c:596
    #9 0x5607912095f0 in process_runnable_tasks src/task.c:876
    #10 0x560791135564 in run_poll_loop src/haproxy.c:2966
    #11 0x5607911363af in run_thread_poll_loop src/haproxy.c:3165
    #12 0x56079113938c in main src/haproxy.c:3862
    #13 0x7f92606edd09 in __libc_start_main ../csu/libc-start.c:308
    #14 0x560790bcd529 in _start (/home/flecaille/src/haproxy/haproxy+0x

Address 0x7fff8e0ea338 is located in stack of thread T0 at offset 1032 i
    #0 0x560790d29b52 in qc_treat_rx_pkts src/quic_rx.c:1341

  This frame has 2 object(s):
    [32, 48) 'ar' (line 1380)
    [64, 1088) '_msg' (line 1368) <== Memory access at offset 1032 is inable
HINT: this may be a false positive if your program uses some custom stacnism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-use-after-scope src/quic_frame.c:1222 i
Shadow bytes around the buggy address:
  0x100071c15410: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
  0x100071c15420: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
  0x100071c15430: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
  0x100071c15440: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
  0x100071c15450: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
=>0x100071c15460: f8 f8 f8 f8 f8 f8 f8[f8]f8 f8 f8 f8 f8 f8 f3 f3
  0x100071c15470: f3 f3 f3 f3 f3 f3 f3 f3 f3 f3 f3 f3 f3 f3 00 00
  0x100071c15480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100071c15490: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100071c154a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100071c154b0: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 04 f3 f3 f3
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==82989==ABORTING
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
Aborted (core dumped)

Note that a coredump could not always be produced with all compilers. This was
always the case with clang 11.

When allocating frames to be retransmitted from qc_dgrams_retransmit(), if they
could not be sent for any reason, they could remain attached to a list local to
qc_dgrams_retransmit() and trigger a crash with libasan when releasing the
original frames they were duplicated from.

To fix this, always release the frames which could not be sent during
retransmissions calling qc_free_frm_list() where needed.

Must be backported as far as 2.6.
2023-11-22 18:03:26 +01:00
Frédéric Lécaille
34bc100b8f MINOR: quic: Add traces to debug frames handling during retransmissions
It is really annoying not to know why some retransmissions could not be done
from qc_prep_hpkts(), which allocates frames, prepares packets and sends them,
and especially not to know whether frames remain allocated and attached to a
list on the stack. This patch already helped in diagnosing such an issue
during "-dMfail" tests.
2023-11-22 18:03:26 +01:00
Willy Tarreau
8f9e94ecff BUILD: log: silence a build warning when threads are disabled
Building without threads emits two warnings because the proxy pointer
is no longer used (only serves for the lock) since 2.9 commit 9a74a6cb1
("MAJOR: log: introduce log backends"). No backport is needed.
2023-11-22 11:21:07 +01:00
Amaury Denoyelle
89da4e9e5d MINOR: acl: define explicit HTTP_3.0
Some ACL shortcuts are defined to match HTTP requests by their version.
This exists for HTTP_1.0 to HTTP_2.0. This patch adds an HTTP_3.0
definition.
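
For example, it could be used like this (the bind line, certificate path and
header name are illustrative):

  frontend https-in
        mode http
        bind quic4@:443 ssl crt /etc/haproxy/site.pem alpn h3
        http-request set-header X-Request-Proto h3 if HTTP_3.0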
2023-11-20 18:01:07 +01:00
Amaury Denoyelle
decf29d06d MINOR: quic: remove unneeded QUIC specific stopping function
On CONNECTION_CLOSE reception/emission, QUIC connections enter CLOSING
state. At this stage, only CONNECTION_CLOSE can be reemitted and all
other exchanges are stopped.

Previously, on haproxy process stopping, if all QUIC connections were in
CLOSING state, they were released before their closing timer expiration
so as not to block the process shutdown. However, since a recent commit,
the closing timer has been shortened to a more reasonable delay. It is now
considered viable to respect the connections' closing state even on process
shutdown. As such, the stopping-specific code in the QUIC connection idle
timer task was removed.

A specific function quic_handle_stopping() was implemented to notify
QUIC connections on shutdown from the main() function. It should have been
deleted along with the removal from the QUIC idle timer task. This patch
just does this.
2023-11-20 17:59:52 +01:00
Frédéric Lécaille
756b3c5f7b BUG/MEDIUM: quic: Possible crash for connections to be killed
The connections are flagged as "to be killed" asap by qc_kill_conn() when the
peer has left (detected by a sendto() "Connection refused" errno). This
function has to wake up the idle timer task to release the connection (and
the idle timer and the idle timer task itself). Then, if in the meantime the
connection was flagged as having to process some retransmissions, some packets
could lead to sendto() errors again with a call to qc_kill_conn(), this time
with a released idle timer task.

This bug could be detected by libasan as follows:

.AddressSanitizer:DEADLYSIGNAL
=================================================================
==21018==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x    560b5d898717 bp 0x7f9aaac30000 sp 0x7f9aaac2ff80 T3)
==21018==The signal is caused by a READ memory access.
==21018==Hint: address points to the zero page.
.    #0 0x560b5d898717 in _task_wakeup include/haproxy/task.h:209
    #1 0x560b5d8a563c in qc_kill_conn src/quic_conn.c:171
    #2 0x560b5d97f832 in qc_send_ppkts src/quic_tx.c:636
    #3 0x560b5d981b53 in qc_send_app_pkts src/quic_tx.c:876
    #4 0x560b5d987122 in qc_send_app_probing src/quic_tx.c:910
    #5 0x560b5d987122 in qc_dgrams_retransmit src/quic_tx.c:1397
    #6 0x560b5d8ab250 in quic_conn_app_io_cb src/quic_conn.c:712
    #7 0x560b5de41593 in run_tasks_from_lists src/task.c:596
    #8 0x560b5de4333c in process_runnable_tasks src/task.c:876
    #9 0x560b5dd6f2b0 in run_poll_loop src/haproxy.c:2966
    #10 0x560b5dd700fb in run_thread_poll_loop src/haproxy.c:3165
    #11 0x7f9ab9188ea6 in start_thread nptl/pthread_create.c:477
    #12 0x7f9ab90a8a2e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0xfba2e)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV include/haproxy/task.h:209 in _task_wakeup
Thread T3 created by T0 here:
    #0 0x7f9ab97ac2a2 in __interceptor_pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cpp:214
    #1 0x560b5df4f3ef in setup_extra_threads src/thread.c:252
    #2 0x560b5dd730c7 in main src/haproxy.c:3856
    #3 0x7f9ab8fd0d09 in __libc_start_main ../csu/libc-start.c:308

==21018==ABORTING
AddressSanitizer:DEADLYSIGNAL
Aborted (core dumped)

To fix, simply reset the connection flag QUIC_FL_CONN_RETRANS_NEEDED to cancel
the retransmission when qc_kill_conn is called.

Note that this new bug arrived with the following fix, which is correct and
flagged to be backported as far as 2.6:
      BUG/MINOR: quic: idle timer task requeued in the past

Must be backported as far as 2.6.
2023-11-20 17:17:16 +01:00
Amaury Denoyelle
a8968701c0 BUG/MAJOR: quic: complete thread migration before tcp-rules
A quic_conn is instantiated and tied to the first thread which has
received the first INITIAL packet. After handshake completion,
listener_accept() is called. For each quic_conn, a new thread is
selected among the least loaded ones. Note that this occurs earlier if
handling 0-RTT data.

This thread connection migration is done in two steps :
* inside listener_accept(), on the origin thread, quic_conn
  tasks/tasklet are killed. After this, no quic_conn related processing
  will occur on this thread. The connection is flagged with
  QUIC_FL_CONN_AFFINITY_CHANGED.
* as soon as the first quic_conn related processing occurs on the new
  thread, the migration is finalized. This allows to allocate the new
  tasks/tasklet directly on the destination thread.

This last step on the new thread must be done prior to other quic_conn
access. There are two events which may trigger it:
* a packet is received on the new thread. In this case,
  qc_finalize_affinity_rebind() is called from quic_dgram_parse().
* the recently accepted connection is popped from accept_queue_ring via
  accept_queue_process(). This will call session_accept_fd() as the
  listener.bind_conf.accept callback. This instantiates a new session
  and starts the connection stack via conn_xprt_start(), which itself calls
  qc_xprt_start() where qc_finalize_affinity_rebind() is used.

A condition was recently found which could cause a closing connection to be
used with qc_finalize_affinity_rebind(), which is forbidden by a BUG_ON().

This last step was not compatible with layer 4 rules such as "tcp-request
connection reject" which close the connection early. In this case, most
of the body of session_accept_fd() is skipped, including
qc_xprt_start(), so thread migration is not finalized. At the end of the
function, conn_xprt_close() is then called which flags the connection as
CLOSING.

If a datagram is received for this connection before it is released,
this will call qc_finalize_affinity_rebind() which triggers its BUG_ON()
to prevent thread migration for CLOSING quic_conn.

FATAL: bug condition "qc->flags & ((1U << 29)|(1U << 30))" matched at src/quic_conn.c:2036
Thread 3 "haproxy" received signal SIGILL, Illegal instruction.
[Switching to Thread 0x7ffff794f700 (LWP 2973030)]
0x00005555556221f3 in qc_finalize_affinity_rebind (qc=0x7ffff002d060) at src/quic_conn.c:2036
2036            BUG_ON(qc->flags & (QUIC_FL_CONN_CLOSING|QUIC_FL_CONN_DRAINING));
(gdb) bt
 #0  0x00005555556221f3 in qc_finalize_affinity_rebind (qc=0x7ffff002d060) at src/quic_conn.c:2036
 #1  0x0000555555682463 in quic_dgram_parse (dgram=0x7fff5003ef10, from_qc=0x0, li=0x555555f38670) at src/quic_rx.c:2602
 #2  0x0000555555651aae in quic_lstnr_dghdlr (t=0x555555fc4440, ctx=0x555555fc3f78, state=32832) at src/quic_sock.c:189
 #3  0x00005555558c9393 in run_tasks_from_lists (budgets=0x7ffff7944c90) at src/task.c:596
 #4  0x00005555558c9e8e in process_runnable_tasks () at src/task.c:876
 #5  0x000055555586b7b2 in run_poll_loop () at src/haproxy.c:2966
 #6  0x000055555586be87 in run_thread_poll_loop (data=0x555555d3d340 <ha_thread_info+64>) at src/haproxy.c:3165
 #7  0x00007ffff7b59609 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
 #8  0x00007ffff7a7e133 in clone () from /lib/x86_64-linux-gnu/libc.so.6

To fix this issue, ensure quic_conn migration is completed earlier
inside session_accept_fd(), before any tcp rules processing. This is
done by moving qc_finalize_affinity_rebind() invocation from
qc_xprt_start() to qc_conn_init().

This must be backported up to 2.7.
2023-11-20 16:11:26 +01:00
Willy Tarreau
3e913909e7 BUILD: cache: fix build error on older compilers
pre-c99 compilers will fail to build the cache since commit 48f81ec09
("MAJOR: cache: Delay cache entry delete in reserve_hot function") due
to an int declaration in the for loop. No backport is needed.
2023-11-20 11:43:52 +01:00
Willy Tarreau
445fc1fe3a BUG/MINOR: sock: mark abns sockets as non-suspendable and always unbind them
In 2.3, we started to get a cleaner socket unbinding mechanism with
commit f58b8db47 ("MEDIUM: receivers: add an rx_unbind() method in
the protocols"). This mechanism rightfully refrains from unbinding
when sockets are expected to be transferrable to another worker via
"expose-fd listeners", but this is not compatible with ABNS sockets,
which do not support reuseport, unbinding nor being renamed: in short
they will always prevent a new process from binding.

It turns out that this is not very visible because, by pure accident,
GTUNE_SOCKET_TRANSFER is only set in the code dealing with master mode
and daemons, so it's never set in foreground mode nor in tests even if
present on the stats socket. However with master mode, it is now always
set even when not present on the stats socket, and will always conflict.

The only reasonable approach seems to consist in marking these abns
sockets as non-suspendable so that the generic sock_unbind() code can
decide to just unbind them regardless of GTUNE_SOCKET_TRANSFER.

This should carefully be backported as far as 2.4.
2023-11-20 11:38:26 +01:00
William Lallemand
ef9a195742 BUG/MINOR: startup: set GTUNE_SOCKET_TRANSFER correctly
This bug was forbidding the GTUNE_SOCKET_TRANSFER option to be set
when haproxy is neither in daemon mode nor in mworker mode. So it
basically only impacts the foreground mode.

The fix moves the code outside the 'if (global.mode & (MODE_DAEMON |
MODE_MWORKER | MODE_MWORKER_WAIT))' condition.

Bug was introduced with 7f80eb23 ("MEDIUM: proxy: zombify proxies only
when the expose-fd socket is bound").

Must be backported in every stable version.
2023-11-20 10:49:05 +01:00
Aurelien DARRAGON
82f4bcafae MINOR: log/backend: prevent "dynamic-cookie-key" use with LOG mode
It doesn't make sense to set "dynamic-cookie-key" inside a log backend,
thus we report a warning to the user and reset the setting.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
c7783fb32b MINOR: log/backend: prevent "http-send-name-header" use with LOG mode
It doesn't make sense to use the "http-send-name-header" directive inside
a log backend, so we report a warning in this case and reset the setting.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
4b2616f784 MINOR: log/backend: prevent stick table and stick rules with LOG mode
Report a warning and prevent errors if user tries to declare a stick table
or use stick rules within a log backend.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
5335618967 MINOR: log/backend: prevent tcp-{request,response} use with LOG mode
We start implementing some postparsing compatibility checks for log
backends.

Here we report a warning if user tries to use tcp-{request,response} rules
with log backend, and we properly ignore such rules when inherited from
defaults section.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
6a29888f60 MINOR: log/backend: ensure log exclusive params are not used in other modes
Add a proxy_cfg_ensure_no_log() function (similar to
proxy_cfg_ensure_no_http()) to ensure at the end of proxy parsing that
no log-exclusive options are found if the proxy is not in log mode.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
42d7d1bd47 Revert "MINOR: filter: "filter" requires TCP or HTTP mode"
This reverts commit f9422551cd since we
cannot perform the test during parsing as the effective proxy mode is
not yet known.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
c8948fb7ac Revert "MINOR: flt_http_comp: "compression" requires TCP or HTTP mode"
This reverts commit 225526dc16 since we
cannot perform the test during parsing as the effective proxy mode is
not yet known.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
33e5c4055f Revert "MINOR: http_htx/errors: prevent the use of some keywords when not in tcp/http mode"
This reverts commit b41b77b4cc since we
cannot perform the test during parsing as the effective proxy mode is
not yet known.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
0f9b475333 Revert "MINOR: fcgi-app: "use-fcgi-app" requires TCP or HTTP mode"
This reverts commit 0ba731f50b since we
cannot perform the test during parsing as the effective proxy mode is
not yet known.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
7d59730100 Revert "MINOR: cfgparse-listen: "http-reuse" requires TCP or HTTP mode"
This reverts commit 65f1124b5d since we
cannot perform the test during parsing as the effective proxy mode is
not yet known.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
f1a072d077 Revert "MINOR: cfgparse-listen: "dynamic-cookie-key" requires TCP or HTTP mode"
This reverts commit 0b09727a22 since we
cannot perform the test during parsing as the effective proxy mode is
not yet known.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
a0a7dd1ee7 Revert "MINOR: cfgparse-listen: "http-send-name-header" requires TCP or HTTP mode"
This reverts commit d354947365 since we
cannot perform the test during parsing as the proxy mode is not yet
known.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
c90d7dc46b Revert "MINOR: stktable: "stick" requires TCP or HTTP mode"
This reverts commit 098ae743fd since we
cannot perform the test during parsing as the effective proxy mode is
not yet known.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
8e20fdbb1c Revert "MINOR: tcp_rules: tcp-{request,response} requires TCP or HTTP mode"
This reverts commit 09b15e4163 since
we cannot perform the test during parsing as the effective proxy mode is
not yet known.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
b6e1e9ec8b Revert "MINOR: proxy: report a warning for max_ka_queue in proxy_cfg_ensure_no_http()"
This reverts commit 3934901 since it makes no sense to report a warning
in this case given that max-keepalive-queue will also work with TCP
backends.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
b61147fd2a MEDIUM: log/balance: merge tcp/http algo with log ones
"log-balance" directive was recently introduced to configure the
balancing algorithm to use when in a log backend. However, it is
confusing and it causes issues when used in a defaults section.

In this patch, we take another approach: first we remove the
"log-balance" directive, and instead we rely on existing "balance"
directive to configure log load balancing in log backend.

Some algorithms such as roundrobin can be used as-is in a log backend,
and for log-only algorithms, they are implemented as "log-$name" inside
the "balance" directive.

The documentation was updated accordingly.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
2c4943c18b BUG/MINOR: proxy/stktable: missing frees on proxy cleanup
In 1b8e68e ("MEDIUM: stick-table: Stop handling stick-tables as proxies.")
we forgot to free the table pointer which is now dynamically allocated.

Let's take this opportunity to also fix a missing free in the table itself
(the table expire task wasn't properly destroyed)

This patch depends on:
 - "MINOR: stktable: add stktable_deinit function"

It should be backported to every stable version.
2023-11-18 11:16:21 +01:00
Aurelien DARRAGON
e10cf61099 MINOR: stktable: add stktable_deinit function
Adding a stktable_deinit() helper function to properly clean up a stick table
that was initialized using stktable_init().
2023-11-18 11:16:21 +01:00
Willy Tarreau
6c7771f1b4 MINOR: stream/cli: add another filter "susp" to "show sess"
This one reports streams considered as "suspicious", i.e. those with
no expiration dates or dates in the past, or those without a front
endpoint. More criteria could be added in the future.
2023-11-17 19:30:07 +01:00
Willy Tarreau
3ffcf7beb1 MINOR: stream/cli: add an optional "older" filter for "show sess"
It's often needed to be able to refine "show sess" when debugging, and
very often a first glance at old streams is performed, but that's a
difficult task in large dumps, and it takes lots of resources to dump
everything.

This commit adds "older <age>" to "show sess" in order to specify the
minimum age of streams that will be dumped. This should simplify the
identification of blocked ones.
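
For example (the socket path is illustrative):

  $ echo "show sess older 30m" | socat stdio /var/run/haproxy.sock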
2023-11-17 19:30:04 +01:00
Willy Tarreau
ec76e0138b BUG/MINOR: stream/cli: report correct stream age in "show sess"
Since 2.4-dev2 with commit 15e525f49 ("MINOR: stream: Don't retrieve
anymore timing info from the mux csinfo"), we don't replace the
tv_accept (now accept_ts) anymore with the current request's, so that
it properly reflects the session's accept date and not the request's
date. However, since then we failed to update "show sess" to make use
of the request's timestamp instead of the session's timestamp, resulting
in nonsensical values in the "age" field of "show sess" for the task.

Indeed, the session's age is displayed instead of the stream's, which
leads to great confusion when debugging, particularly when it comes to
multiplexed inter-proxy connections which are kept up forever.

Let's fix this now. This must be backported as far as 2.4. However,
for 2.7 and older, the field was named tv_request and was a timeval.
2023-11-17 18:59:12 +01:00
Willy Tarreau
662565ddb4 MINOR: backend: without ->connect(), allow to pick another thread's connection
If fewer connections than threads are established on a reverse-http gateway
and these servers have a non-zero pool-min-conn, then conn_backend_get()
will refrain from picking available connections from other threads. But
this makes no sense for protocols for which there is no ->connect(),
since there's no way the current thread will manage to establish its own
connection. For such situations we should always accept to use another
thread's connection. That's precisely what this patch does.
2023-11-17 18:13:04 +01:00
Willy Tarreau
f592a0d5dd MINOR: rhttp: remove the unused outgoing connect() function
A dummy connect() function previously had to be installed for the log
server so that a reverse-http address could be referenced on a "server"
line, but after the recent rework of the server line parsing, this is
no longer needed, and this is actually annoying as it makes one believe
there is a way to connect outside, which is not true. Let's now get rid
of this function.
2023-11-17 18:10:16 +01:00
Willy Tarreau
d069825c5f BUG/MEDIUM: mux-fcgi: fail earlier on malloc in takeover()
This is the equivalent of the previous "BUG/MEDIUM: mux-h1: fail earlier
on malloc in takeover()".

Connection takeover was implemented for fcgi in 2.2 by commit a41bb0b6c
("MEDIUM: mux_fcgi: Implement the takeover() method."). It does have one
corner case related to memory allocation failure: in case the task or
tasklet allocation fails, the connection gets released synchronously.

Unfortunately the situation is bad there, because the lower layers are
already switched to the new thread while the tasklet is either NULL or
still the old one, and calling fcgi_release() will also result in
touching the thread-local list of buffer waiters, calling unsubscribe(), etc.
There are even code paths where the thread will try to grab the lock of
its own idle conns list, believing the connection is there while it has
no useful effect. However, if the owner thread was doing the same at the
same moment, and ended up trying to pick from the current thread (which
could happen if picking a connection for a different name), the two
could even deadlock.

No tests were made to try to reproduce the problem, but the description
above is sufficient to see that nothing can guarantee against it.

This patch takes a simple but radically different approach. Instead of
starting to migrate the connection before risking to face allocation
failures, it first pre-allocates a new task and tasklet, then assigns
them to the connection if the migration succeeds, otherwise it just
frees them. This way it's no longer needed to manipulate the connection
until it's fully migrated, and as a bonus this means the connection will
continue to exist and the use-after-free condition is solved at the same
time.

This should be backported to 2.2. Thanks to Fred for the initial analysis
of the problem!
2023-11-17 18:10:16 +01:00
Willy Tarreau
95fd2d6801 BUG/MEDIUM: mux-h1: fail earlier on malloc in takeover()
This is the h1 equivalent of previous "BUG/MEDIUM: mux-h2: fail earlier
on malloc in takeover()".

Connection takeover was implemented for H1 in 2.2 by commit f12ca9f8f1
("MEDIUM: mux_h1: Implement the takeover() method."). It does have one
corner case related to memory allocation failure: in case the task or
tasklet allocation fails, the connection gets released synchronously.

Unfortunately the situation is bad there, because the lower layers are
already switched to the new thread while the tasklet is either NULL or
still the old one, and calling h1_release() will call some unsubscribe
and and possibly other things whose safety is not guaranteed (and the
ambiguity here alone is sufficient to be careful). There are even code
paths where the thread will try to grab the lock of its own idle conns
list, believing the connection is there while it has no useful effect.
However, if the owner thread was doing the same at the same moment, and
ended up trying to pick from the current thread (which could happen if
picking a connection for a different name), the two could even deadlock.

Contrary to mux-h2, a few tests were not sufficient to try to crash the
process, but there's nothing that indicates it couldn't happen based on
the description above.

This patch takes a simple but radically different approach. Instead of
starting to migrate the connection before risking to face allocation
failures, it first pre-allocates a new task and tasklet, then assigns
them to the connection if the migration succeeds, otherwise it just
frees them. This way it's no longer needed to manipulate the connection
until it's fully migrated, and as a bonus this means the connection will
continue to exist and the use-after-free condition is solved at the same
time.

This should be backported to 2.2. Thanks to Fred for the initial analysis
of the problem!
2023-11-17 18:10:16 +01:00
Willy Tarreau
4f02e3da67 BUG/MEDIUM: mux-h2: fail earlier on malloc in takeover()
Connection takeover was implemented for H2 in 2.2 by commit cd4159f03
("MEDIUM: mux_h2: Implement the takeover() method."). It does have one
corner case related to memory allocation failure: in case the task or
tasklet allocation fails, the connection gets released synchronously.
Unfortunately the situation is bad there, because the lower layers are
already switched to the new thread while the tasklet is either NULL or
still the old one, and calling h2_release() will also result in
h2_process() and h2_process_demux() that may process any possibly
pending frames. Even the session remains the old one on the old thread,
so that some sess_log() that are called when facing certain demux errors
will be associated with the previous thread, possibly accessing a number
of elements belonging to another thread. There are even code paths where
the thread will try to grab the lock of its own idle conns list, believing
the connection is there while it has no useful effect. However, if the
owner thread was doing the same at the same moment, and ended up trying
to pick from the current thread (which could happen if picking a connection
for a different name), the two could even deadlock.

The risk is extremely low, but Fred managed to reproduce use-after-free
errors in conn_backend_get() after a takeover() failed by playing with
-dMfail, indicating that h2_release() had been successfully called. In
practise it's sufficient to have h2 on the server side with reuse-always
and to inject lots of request on it with -dMfail.

This patch takes a simple but radically different approach. Instead of
starting to migrate the connection before risking to face allocation
failures, it first pre-allocates a new task and tasklet, then assigns
them to the connection if the migration succeeds, otherwise it just
frees them. This way it's no longer needed to manipulate the connection
until it's fully migrated, and as a bonus this means the connection will
continue to exist and the use-after-free condition is solved at the same
time.

This should be backported to 2.2. Thanks to Fred for the initial analysis
of the problem!
2023-11-17 18:10:16 +01:00
Willy Tarreau
c7a90cc181 CLEANUP: haproxy: remove old comment from 1.1 from the file header
There was still a totally outdated comment speaking about issues
affecting Solaris on 1.1.8pre4 (April 2002, 21 years old)! This
proves that comments in headers are never read, so let's take this
opportunity to also remove the outdated one recommending reading
the "updated" RFC7230.
2023-11-17 18:10:16 +01:00
Frédéric Lécaille
888d1dc3dc MINOR: quic: Rename "handshake" timeout to "client-hs"
Use a more specific name for this timeout to distinguish it from a possible
future one on the server side.
Also update the documentation.
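
For example, assuming the renamed setting is used as "timeout client-hs", a
frontend could look like this (values and certificate path are illustrative):

  frontend tls-in
        mode http
        bind :443 ssl crt /etc/haproxy/site.pem
        timeout client     30s
        timeout client-hs  5s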
2023-11-17 18:09:41 +01:00
Frédéric Lécaille
373e40f0c1 MEDIUM: session: handshake timeout (TCP)
Adapt session_accept_fd(), called on accept(), to set the handshake timeout from
the "handshake-timeout" setting if set by configuration. If not set, continue to use
the "client" timeout setting.
2023-11-17 17:31:42 +01:00
Frédéric Lécaille
392640a61b BUG/MINOR: quic: Malformed CONNECTION_CLOSE frame
This bug arrived with this commit:
      MINOR: quic: Avoid zeroing frame structures
Before the latter, the CONNECTION_CLOSE frame was zeroed, in particular its
"reason phrase length".

Re-establish this behavior.

No need to backport.
2023-11-17 17:31:42 +01:00
Frédéric Lécaille
953c7dc2b9 MINOR: quic: Dump the expiration date of the idle timer task
This date is shared between the idle timer and the handshake timeout. So, it is
useful to dump the expiration date of the idle timer task itself, in place of the
idle timer expiration date. This way, the handshake timeout value will be visible
during the handshake from the CLI "show quic full" command.
2023-11-17 17:31:42 +01:00
Frédéric Lécaille
e3e0bb90ce MEDIUM: quic: Add support for "handshake" timeout setting.
The idle timer task may be used to trigger the client handshake timeout.
The handshake timeout expiration date (qc->hs_expire) is initialized when the
connection is allocated. Obviously, this timeout is taken into account only
during the handshake by qc_idle_timer_do_rearm(), whose job is to rearm the
idle timer.

The idle timer expiration date could be initialized only one time, then
never updated until the handshake completes. But this only works if the
handshake timeout is smaller than the idle timer task timeout. If the handshake
timeout is set greater than the idle timeout, the latter may expire before the
handshake timeout.

This patch may have an impact on the L1/C1 interop tests (with heavy packet loss
or corruption). This is probably why some implementations with handshake timeout
support set a big timeout during this test. This is at least the case for ngtcp2,
which sets a 180s handshake timeout! haproxy will certainly have to proceed the
same way if it wants to keep passing this test as it did before this handshake
timeout was introduced.
2023-11-17 17:31:42 +01:00
Frédéric Lécaille
b33eacc523 MINOR: proxy: Add "handshake" new timeout (frontend side)
Add a new timeout for the handshake, on the frontend side only. Such a timeout
will typically be used for TLS handshakes during client connections to TLS/TCP
or QUIC frontends.
2023-11-17 17:31:42 +01:00
Remi Tricot-Le Breton
d5cce92a46 BUG/MINOR: shctx: Remove old HA_SPIN_INIT
The shctx lock was changed from a SPINLOCK to a RWLOCK in commit ed35b94
"MEDIUM: cache: Switch shctx spinlock to rwlock and restrict its scope"
but a SPIN_INIT was left behind.

This patch does not need to be backported.
2023-11-17 16:56:18 +01:00
Christopher Faulet
7676a2cdf6 BUG/MINOR: stconn/applet: Report send activity only if there was output data
For applets and connections, when a send attempt is performed, we must be
sure not to report send activity if there was no output data at all before
the attempt.

It is not important for the <fsb> date itself but for the <lra> date for
non-independent stream.

This patch must be backported to 2.8.
2023-11-17 15:36:43 +01:00
Christopher Faulet
ab5ecaa2ea BUG/MINOR: stconn: Use HTX-aware channel's functions to get info on buffer
Some channel functions are used to check if the channel's buffer is full, not
empty, or if there are input data. However, the functions used are not
HTX-aware. So this is not accurate and may prevent some actions from being
performed (however, it is not certain there are real issues). Because HTX-aware
versions now exist, use them instead.

This patch may be backported as far as 2.2. It relies on

    * "MINOR: channel: Add functions to get info on buffers and deal with HTX streams"
    * "MINOR: htx: Use a macro for overhead induced by HTX"
2023-11-17 15:09:33 +01:00
Christopher Faulet
24409a5caa BUG/MINOR: stconn: Fix streamer detection for HTX streams
Since the HTX was introduced, the streamer detection is broken for HTX
streams because the HTX overhead was not counted in the test to set
CF_STREAMER and CF_STREAMER_FAST flags.

The consequence was that the consumer side was no longer able to send more
than tune.ssl.maxrecord at a time in SSL.

To fix the issue, we now count the HTX overhead of HTX streams to be able to
set CF_STREAMER/CF_STREAMER_FAST flags on a channel.

This patch relies on the following commits:

  * "MINOR: channel: Add functions to get info on buffers and deal with HTX streams"
  * "MINOR: htx: Use a macro for overhead induced by HTX"

The series must be backported as far as 2.2.
2023-11-17 15:09:17 +01:00
Christopher Faulet
b68c579eda BUG/MEDIUM: stconn: Update fsb date on partial sends
The first-send-blocked date was originally designed to save the date of the
first send of a series where some data remain blocked. It was relaxed
recently (3083fd90e "BUG/MEDIUM: stconn: Report a send activity everytime
data were sent") to save the date of the first fully blocked send. However,
it is not accurate.

When all data are sent, the fsb value must be reset to TICK_ETERNITY. When
nothing is sent and if it is not already set, it must be set. But when data
are partially sent, the value must be updated and not reset. Otherwise the
write timeout may be ignored because fsb date is never set.

So, the changes brought by the patch above are reverted and
sc_ep_report_blocked_send() was changed to know if some data were sent or
not. This way we are able to update the fsb value.

This patch must be backported to 2.8.
2023-11-17 12:13:00 +01:00
Remi Tricot-Le Breton
f1f8e2b3df DOC: cache: Specify when function expects a cache lock
Some functions are built on the fact that the cache lock must be already
taken by the caller. This patch adds this information in the functions'
descriptions.
2023-11-16 19:35:10 +01:00
Remi Tricot-Le Breton
45a2ff0f4a MINOR: shctx: Remove 'use_shared_mem' variable
This global variable was used to avoid using locks on shared_contexts in
the unlikely case of nbthread==1. Since the locks do not do anything
when USE_THREAD is not defined, it will be more beneficial to simply
remove this variable and the systematic test on its value in the shared
context locking functions.
2023-11-16 19:35:10 +01:00
Remi Tricot-Le Breton
4fe6c1365d MINOR: shctx: Remove redundant arg from free_block callback
The free_block callback does not get called on blocks that are not row
heads anymore, so we don't need two shared_block parameters.
2023-11-16 19:35:10 +01:00
Remi Tricot-Le Breton
48f81ec09d MAJOR: cache: Delay cache entry delete in reserve_hot function
A reference counter on the cache_entry was added in a previous commit.
Its value is atomically increased and decreased via the retain_entry and
release_entry functions.

This is needed because of the latest cache and shared_context
modifications that introduced two separate locks instead of the
preexisting single shctx_lock one.
With the new logic, we have two main blocks competing for the two locks:
- the one in the http_action_req_cache_use that performs a lookup in the
  cache tree (locked by the cache lock) and then tries to remove the
  corresponding blocks from the shared_context's 'avail' list until the
  response is sent to the client by the cache applet,
- the shctx_row_reserve_hot that traverses the 'avail' list and gives
  them back to the caller, while removing previous row heads from the
  cache tree
Those two blocks require the two locks but one of them would take the
cache lock first, and the other one the shctx_lock first, which would
end in a deadlock without the current patch.

The way this conflict is resolved in this patch is by ensuring that at
least one of those uses works without taking the two locks at the same
time.
The solution found was to keep taking the two locks in the cache_use
case. We first lock the cache to lookup for an entry and we then take
the shctx lock as well to detach the corresponding blocks from the
'avail' list. The subtlety is that between the cache lookup and the
actual locking of the shctx, another thread might have called the
reserve_hot function in which we only take the shctx lock.
In this function we traverse the 'avail' list to remove blocks that are
then given to the caller. If one of those blocks corresponds to a
previous row head, we call the 'free_blocks' callback that used to
delete the cache entry from the tree.
We now avoid deleting directly the cache entries in reserve_hot and we
rather set the cache entries 'complete' param to 0 so that no other
thread tries to work with this entry. This way, when we release the
shctx lock in reserve_hot, the first thread that had performed the cache
lookup and had found an entry that we just gave to another thread will
see that the 'complete' field is 0 and it won't try to work with this
response.

The actual removal of entries from the cache tree will now be performed
in the new 'reserve_finish' callback called at the end of the
shctx_row_reserve_hot function. It will iterate over all the row heads that
were inserted in a dedicated list in the 'free_block' callback and
perform the actual delete.
2023-11-16 19:35:10 +01:00
Remi Tricot-Le Breton
1cd91b4f2a MINOR: shctx: Add new reserve_finish callback call to shctx_row_reserve_hot
This patch adds a reserve_finish callback that can be defined by the
subsystems that require a shared_context. It is called at the end of
shctx_row_reserve_hot after the shared_context lock is released.
2023-11-16 19:35:10 +01:00
Remi Tricot-Le Breton
11df806c88 MEDIUM: shctx: Descend shctx_lock calls into the shctx_row_reserve_hot
Descend the shctx_lock calls into the shctx_row_reserve_hot so that the
cases when we don't need to lock anything (enough space in the current
row or not enough space in the 'avail' list) do not take the lock at
all.
In sh_ssl_sess_new_cb the lock had to be descended into
sh_ssl_sess_store in order not to cover the shctx_row_reserve_hot call
anymore.
2023-11-16 19:35:10 +01:00
Remi Tricot-Le Breton
a29b073f26 MEDIUM: cache: Add refcount on cache_entry
Add a reference counter on the cache_entry. Its value will be atomically
increased and decreased via the retain_entry and release_entry
functions.
The release_entry function has two distinct versions,
release_entry_locked and release_entry_unlocked that should be called
when the cache lock is already taken in write mode or not
(respectively). In the unlocked case the cache lock will only be taken
in write mode on the last reference of the entry (before calling
delete_entry). This allows to limit the amount of times when we need to
take the cache lock during a release operation.
2023-11-16 19:35:10 +01:00
Remi Tricot-Le Breton
ed35b9411a MEDIUM: cache: Switch shctx spinlock to rwlock and restrict its scope
Since a lock on the cache tree was added in the latest cache changes, we
do not need to use the shared_context's lock to lock more than pure
shared_context related data anymore. This already existing lock will now
only cover the 'avail' list from the shared_context. It can then be
changed to a rwlock instead of a spinlock because we might want to only
run through the avail list sometimes.

Apart from changing the type of the shctx lock, the main modification
introduced by this patch is to limit the amount of code covered by the
shctx lock. This lock does not need to cover any code strictly related
to the cache tree anymore.
2023-11-16 19:35:10 +01:00
Remi Tricot-Le Breton
a0d7c290ec MINOR: cache: Use dedicated trash for "show cache" cli command
After the latest changes in the cache/shared_context mechanism, the
cache and shared_context logic were decorrelated and in some unlikely
cases we might end up using the "show cache" command while some regular
cache processing is occurring (a response being stored in the cache for
instance). In such a case, because we used the same 'trash' buffer in
those two contexts, we could end up with the contents of a response in
the output of the "show cache" command.
This patch fixes this problem by allocating a dedicated trash for the
CLI command.
2023-11-16 19:35:10 +01:00
Remi Tricot-Le Breton
3831d8454f MEDIUM: shctx: Remove 'hot' list from shared_context
The "hot" list stored in a shared_context was used to keep a reference
to shared blocks that were currently being used and were thus removed
from the available list (so that they don't get reused for another cache
response). This 'hot' list does not ever need to be shared across
threads since every one of them only works on their current row.

The main need behind this 'hot' list was to detach the corresponding
blocks from the 'avail' list and to have a known list root when calling
list_for_each_entry_from in shctx_row_data_append (for instance).

Since we actually never need to iterate over all members of the 'hot'
list, we can remove it and replace the inc_hot/dec_hot logic by a
detach/reattach one.
2023-11-16 19:35:10 +01:00
Remi Tricot-Le Breton
bd24118212 MEDIUM: cache: Use rdlock on cache in cache_use
When looking for a valid entry in the cache tree in
http_action_req_cache_use, we do not need to delete an expired entry at
once because even if an expired entry exists, since the request will be
forwarded to the server, then the expired entry will be overwritten when
the updated response is seen. We can then use a simpler rdlock during
cache_use operation.
2023-11-16 19:35:10 +01:00
Remi Tricot-Le Breton
0dfb57bbf9 MINOR: cache: Add option to avoid removing expired entries in lookup function
Any lookup in the cache tree done through entry_exist or
secondary_entry_exist functions could end up deleting the corresponding
entry if it is expired, which prevents us from using an rdlock on code paths
that would just perform a lookup on the tree (in
http_action_req_cache_use for instance).
Adding a 'delete_expired' boolean as a parameter allows for "pure"
lookups and thus allows operations on the tree that only require an
rdlock instead of a "heavier" wrlock.
2023-11-16 19:35:10 +01:00
Remi Tricot-Le Breton
ff3cb6dad4 MINOR: cache: Remove expired entry delete in "show cache" command
The "show cache" CLI command iterates over all the entries of the cache
tree and it used this opportunity to remove expired entries from the
cache. This behavior was completely undocumented and does not seem that
necessary. By removing it we can take the cache lock in read mode only
which limits the impact on the other threads.
2023-11-16 19:35:10 +01:00
Remi Tricot-Le Breton
ac9c49b40d MEDIUM: cache: Use dedicated cache tree lock alongside shctx lock
Every use of the cache tree was covered by the shctx lock even when no
operations were performed on the shared_context lists (avail and hot).
This patch adds a dedicated RW lock for the cache so that blocks of code
that work on the cache tree only can use this lock instead of the
superseding shctx one. This is useful for operations during which the
concerned blocks are already in the hot list.
When the two locks need to be taken at the same time, in
http_action_req_cache_use and in shctx_row_reserve_hot, the shctx one
must be taken first.
A new parameter needed to be added to the shared_context's free_block
callback prototype so that cache_free_block can take the cache lock and
release it afterwards.
2023-11-16 19:35:10 +01:00
Remi Tricot-Le Breton
81d8014af8 MINOR: shctx: Remove explicit 'from' param from shctx_row_data_append
This parameter is not necessary since the first element of a row always
has a pointer to the row's tail.
2023-11-16 19:35:10 +01:00
Remi Tricot-Le Breton
610b67fd8b MEDIUM: shctx: Simplify shctx_row_reserve_hot loop
The shctx_row_reserve_hot relied on two loop levels in order to first
look for the first block of a preused row and then iterate on all the
blocks of this row to reserve them for the new row. This was neither the
simplest nor the most readable approach, so this logic was replaced by
a single iteration over the avail list members.
The two use cases of calling this function with or without a preexisting
"first" member were a bit cumbersome as well and were replaced by a more
straightforward approach.
2023-11-16 19:35:10 +01:00
Remi Tricot-Le Breton
eccb97f60e MEDIUM: shctx: Move list between hot and avail list in O(1)
Instead of iterating over all the elements of a given row when moving it
between the hot and available lists, we can make use of the last_reserved
pointer that already points to the last block of the list to perform the
move in O(1).
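
As a generic illustration of such an O(1) move on a circular doubly-linked
list, assuming both the first and the last element of the segment are already
known (simplified types, not HAProxy's list API):

  struct node {
      struct node *prev, *next;
  };

  /* Detach the segment [first..last] from its current circular list and
   * splice it right after <head>, without iterating over the segment.
   */
  static void splice_segment(struct node *head, struct node *first, struct node *last)
  {
      /* unlink the segment from its current list */
      first->prev->next = last->next;
      last->next->prev = first->prev;

      /* reattach it after <head> */
      last->next = head->next;
      head->next->prev = last;
      head->next = first;
      first->prev = head;
  }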
2023-11-16 19:35:10 +01:00
Remi Tricot-Le Breton
55fbf82080 MINOR: shctx: Set last_append to NULL when reserving block in hot list
Ensure that the last_append pointer is always set to NULL on first block
of rows reserved by the subsystems using the shctx (cache for instance).
This pointer will be used directly in shctx_row_data_append instead of
the 'from' param, which simplifies its use.
2023-11-16 19:35:10 +01:00
Amaury Denoyelle
560cb1332a MINOR: server: force add to idle on reverse
A backend connection is inserted in server idle list via
srv_add_to_idle_list(). This function has several conditions which may
cause the connection to be rejected instead.

One of these conditions is based on the current estimated count of needed
connections for the server. If the count of stored idle connections has
already reached this estimate, the new connection is rejected. This
conflicts with the purpose of reverse HTTP. On active reverse,
haproxy can instantiate several connections to properly serve the future
traffic. However, the opposite passive haproxy will have only a low
estimate of needed connections and will reject most of them.

To fix this, simply check CO_FL_REVERSED connection flag on
srv_add_to_idle_list(). If set, the connection is inserted without
checking for estimate count. Note that all other conditions are not
impacted, so it's still possible to reject a connection, for example if
process FD limit is reached.

This commit relies on a recent patch which changes the CO_FL_REVERSED flag
for connections after passive reverse.
2023-11-16 18:43:41 +01:00
Amaury Denoyelle
a1457296d5 BUG/MINOR: mux_h2: reject passive reverse conn if error on add to idle
On passive reverse, H2 mux is responsible to insert the connection in
the server idle list. This is done via srv_add_to_idle_list(). However,
this function may fail for various reasons, such as the FD usage limit
being reached.

Properly handle this error case. The H2 mux flags the connection on error,
which will cause its release. Prior to this patch, the connection was
only released on server timeout.

This bug was found inspecting server curr_used_conns counter. Indeed, on
connection reverse, this counter is first incremented. It is decremented
just after on srv_add_to_idle_list() if insertion is validated. However,
if insertion is rejected, the connection was not released, which caused
curr_used_conns to remain positive. This has the major downside of
breaking the reuse of idle connections on rhttp, causing spurious 503
errors.

No need to backport.
2023-11-16 18:43:32 +01:00
Amaury Denoyelle
8cc3fc73f1 MINOR: connection: update rhttp flags usage
Change the flags used for reversed connections:
* CO_FL_REVERSED is now set after reversal for passive connect. For
  active connect, it is delayed until accept is completed after reversal.
* CO_FL_ACT_REVERSING replaces the old CO_FL_REVERSED. It is set only for
  active connect on reversal and removed once accept is done.

This makes it possible to identify a connection as reversed during its whole
lifetime. This should be useful to extend reverse connect.
2023-11-16 17:53:31 +01:00
Christopher Faulet
691f4cf449 BUG/MEDIUM: stream: Don't call mux .ctl() callback if not implemented
The commit 5ff7d2276 ("BUG/MEDIUM: stream: Properly handle abortonclose when set
on backend only") introduced a regression. Not all multiplexer implement the
.ctl() callback function. Thus we must be sure this callback function is defined
first to call it.
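
A minimal sketch of the resulting defensive pattern, with simplified types
(the real mux_ops structure and .ctl() signature in HAProxy differ):

  struct mux_ops {
      int (*ctl)(void *ctx, int cmd, void *arg);   /* may be left NULL by a mux */
  };

  static int mux_ctl_safe(const struct mux_ops *ops, void *ctx, int cmd, void *arg)
  {
      /* not every multiplexer implements .ctl(): check before calling */
      if (!ops->ctl)
          return 0;
      return ops->ctl(ctx, cmd, arg);
  }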

This patch should fix a crash reported by Tristan in the issue #2095. It must be
backported as far as 2.2, with the commit above.
2023-11-14 19:21:52 +01:00
William Lallemand
d76fa37534 BUG/MEDIUM: mworker: set the master variable earlier
Since 2.7 and the mcli_reload_bind_conf (56f73b21a5), upon a reload
failure because of a bind error, the mcli_reload_bind_conf goes through a
sock_unbind(). This is not supposed to do anything when a listener is
RX_F_INHERITED in the master, but unfortunately this is done too early
and provokes an exit of the master.

We already suspected in the past that setting the 'master' variable this
late could have negative impact.

The fix sets the master variable earlier before the bind.

This must be backported at least to 2.7. This could be backported
earlier, but it is better to wait for feedback on the fix.
2023-11-14 14:32:39 +01:00
Willy Tarreau
a63e016d27 MINOR: activity: report profiling duration and age in "show profiling"
Seeing counters in "show profiling" is not always very helpful without
an indication of how long the analysis lasted nor if it's still active
or not. Let's add a pair of start/stop timers for tasks and memory so
that we can now indicate how long the measurements lasted and when they
ended (or 0 if still running).

Note that for tasks profiling set to "auto", the measurement is considered
enabled since it can automatically switch on and off on a per-thread
basis.
2023-11-14 11:46:37 +01:00
Christopher Faulet
ec3ea6f698 MINOR: stconn: Use SC to detect frontend connections in sc_conn_recv()
In sc_conn_recv(), instead of using the connection to know we are on the
frontend side, we now use the SC flags. It changes nothing but it is
cleaner.
2023-11-14 11:01:51 +01:00
Christopher Faulet
5ff7d22767 BUG/MEDIUM: stream: Properly handle abortonclose when set on backend only
Since the 2.2 and the commit dedd30610 ("MEDIUM: h1: Don't wake the H1 tasklet
if we got the whole request."), we avoid subscribing to reads if the H1
message is fully received. However, this broke the abortonclose option. To fix
the issue, a CO_RFL flag was added to instruct the mux it should still wait for
read events to properly handle read0. Only the H1 mux was concerned.

But since then, most of the time, the option is only handled if it is set on the
frontend proxy because the request is fully received before selecting the
backend. If the backend is selected before the end of the request there is no
issue. But otherwise, because the backend is not known yet, we are unable to
properly handle the option and we fail to subscribe for reads.

Of course the option cannot be set on a frontend proxy. So concretely it means
the option is properly handled if it is enabled in the defaults section (if
common to frontend and backend) or a listen proxy, but it is ignored if it is
set on backend only.

Thanks to previous patches, we can now instruct the mux it should subscribe for
reads if not already done. We use this mechanism in process_stream() when the
connection is set up, ie when backend SC is set to SC_ST_REQ state.

This patch relies on following patches:
  * MINOR: connection: Add a CTL flag to notify mux it should wait for reads again
  * MEDIUM: mux-h1: Handle MUX_SUBS_RECV flag in h1_ctl() and susbscribe for reads

This patch should fix the issue #2344. The whole series must be backported as far
as 2.2.
2023-11-14 11:01:51 +01:00
Christopher Faulet
450ff71c95 MEDIUM: mux-h1: Handle MUX_SUBS_RECV flag in h1_ctl() and susbscribe for reads
The H1 mux now handles the MUX_SUBS_RECV flag in h1_ctl(). If it is not already
subscribed for reads, it does so. This patch is mandatory to properly
handle the abortonclose option.
2023-11-14 11:01:51 +01:00
Christopher Faulet
9327e7efa7 BUG/MINOR: stconn: Handle abortonclose if backend connection was already set up
abortonclose is a backend option; it should not be handled on the frontend
side. Of course a frontend can also be a backend, but the option should not
be handled too early because it is not necessarily the selected backend
(think about a listen proxy routing requests to another backend).

It is especially an issue when the abortonclose option is enabled in the
defaults section and disabled by the selected backend, because in this case
the option may still be enabled while it should not be.

Thus, we now wait until the backend connection is set up to handle the option. To
do so, we check the backend SC state. The option is ignored if it is in
SC_ST_INI state. For all other states, it means the backend was already
selected.

This patch could be backported as far as 2.2.
2023-11-14 11:01:51 +01:00
Willy Tarreau
6a4591c3d0 BUG/MEDIUM: connection: report connection errors even when no mux is installed
An annoying issue was met when testing the reverse-http mechanism, by
which failed connection attempts would apparently not be attempted again
when there was no connect timeout. It turned out to be more generalized
than the rhttp system, and actually affects all outgoing connections
relying on NPN or ALPN to choose the mux, on which no mux is installed
and for which the subscriber (ssl_sock) must be notified instead.

The problem appeared during 2.2-dev1 development. First, commit
062df2c23 ("MEDIUM: backend: move the connection finalization step to
back_handle_st_con()") broke the error reporting by testing CO_FL_ERROR
only under CO_FL_CONNECTED. While it still worked OK for cases where a
mux was present, it did not for this specific situation because no
single error path would be considered when no mux was present. Changing
the CO_FL_CONNECTED test to also include CO_FL_ERROR did work, until
a few commits later with 477902bd2 ("MEDIUM: connections: Get ride of
the xprt_done callback.") which removed the xprt_done callback that was
used to indicate success or failure of the transport layer setup, since,
as the commit explains, we can report this via the mux. What this last
commit says is true, except when there is no mux.

For this, however, the sock_conn_iocb() function (formerly conn_fd_handler)
is called for such errors, evaluates a number of conditions, none of which
is matched in this error condition case, since sock_conn_check() instantly
reports an error causing a jump to the leave label. There, the mux is
notified if installed, and the function returns. In other error condition
cases, readiness and activity are checked for both sides, the tasklets
woken up and the corresponding subscriber flags removed. This means that
a sane (and safe) approach would consist in just notifying the subscriber
in case of error, if such a subscriber still exists: if still there, it
means the event hasn't been caught earlier, then it's the right moment
to report it. And since this is done after conn_notify_mux(), it still
leaves all control to the mux once it's installed.

This commit should be progressively backported as far as 2.2 since it's
where the problem was introduced. It's important to clearly check the
error path in each function to make sure the fix still does what it's
supposed to.
2023-11-14 08:49:23 +01:00
Frédéric Lécaille
3741e4bf90 BUG/MINOR: quic: maximum window limits do not match the doc
This bug arrived with this commit:
     MINOR: quic: Add a max window parameter to congestion control algorithms

The documentation was modified but the corresponding modifications in the code
part were missing or wrong. The 'g' suffix must be accepted to parse values in
gigabytes, and exactly 4g is also accepted.
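
For illustration only, a self-contained sketch of a size parser accepting such
suffixes (this is not the actual haproxy parsing code, and the bound checks are
left out):

  #include <stdint.h>
  #include <stdlib.h>

  /* Parse values such as "10m" or "4g" into bytes; returns 0 on error. */
  static uint64_t parse_window_size(const char *s)
  {
      char *end;
      uint64_t v = strtoull(s, &end, 10);

      switch (*end) {
      case 'g': case 'G': v <<= 30; end++; break;
      case 'm': case 'M': v <<= 20; end++; break;
      case 'k': case 'K': v <<= 10; end++; break;
      }
      return (*end == '\0') ? v : 0;
  }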

No need to backport.
2023-11-13 19:56:28 +01:00
Frédéric Lécaille
9021e8935e MINOR: quic: Maximum congestion control window for each algo
Make all the congestion control algorithms support the maximum congestion
window set by the configuration. There is nothing special to explain: for
each algo, each time the window is incremented it is also bounded.
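
A minimal illustration of that bounding step, with hypothetical field names
(the real quic_cc structures and increment paths differ per algorithm):

  #include <stdint.h>

  struct quic_cc {
      uint64_t cwnd;      /* current congestion window */
      uint64_t max_cwnd;  /* maximum window from the configuration */
  };

  static inline void quic_cc_cwnd_inc(struct quic_cc *cc, uint64_t inc)
  {
      cc->cwnd += inc;
      if (cc->cwnd > cc->max_cwnd)
          cc->cwnd = cc->max_cwnd;   /* bound every increment */
  }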
2023-11-13 17:53:18 +01:00
Frédéric Lécaille
028a55a1d0 MINOR: quic: Add a max window parameter to congestion control algorithms
Add a new ->max_cwnd member to bind_conf struct to store the maximum
congestion control window value for each QUIC binding.
Modify the "quic-cc-algo" keyword parsing to add an optional parameter
to its value: the maximum congestion window value between parentheses
as follows:

      ex: quic-cc-algo cubic(10m)

This value must be bounded, greater than 10k and smaller than 1g.
2023-11-13 17:53:18 +01:00
Frédéric Lécaille
840af0928b BUG/MEDIUM: quic: Non initialized CRYPTO data stream deferencing
This bug arrived with this commit:
   BUG/MINOR: quic: Useless use of non-contiguous buffer for in order CRYPTO data

Before this commit qc->cstream was tested before entering qc_treat_rx_crypto_frms().
This patch re-establishes this behavior. Furthermore, it simplifies
qc_ssl_provide_all_quic_data(), which was a little bit ugly: the CRYPTO data
frame may be freed asap in the list_for_each_entry_safe() block after
having stored its data pointer and length in local variables.
Also interrupt the CRYPTO data processing as soon as qc_ssl_provide_quic_data()
or qc_treat_rx_crypto_frms() fails.

No need to be backported.
2023-11-13 16:00:25 +01:00
Amaury Denoyelle
954b5b756a BUG/MEDIUM: quic: fix FD for quic_cc_conn
Since following commit, quic_conn closes its owned socket before
transition to quic_cc_conn for the closing state. This allows saving FDs as
quic_cc_conn can use the listener socket for its I/O.

  commit 150c0da889
  MEDIUM: quic: release conn socket before using quic_cc_conn

This patch is incomplete as it removes initialization of <fd> member for
quic_cc_conn. Thus, if sending is done on closing state, <fd> value is
undefined which in most cases will result in a crash. Fix this by simply
initializing <fd> member with qc_init_fd() in qc_new_cc_conn().

This patch should fix the recent issue from #2095. Thanks to Tristan for
reporting it and then testing this patch.

No need to backport.
2023-11-13 11:55:07 +01:00
Amaury Denoyelle
78d244e9f7 BUG/MINOR: quic: fix decrement of half_open counter on qc alloc failure
The half-open counter is used to account for QUIC connections waiting for
address validation. It was recently reworked to adjust its scope. With
each decrement operation, a BUG_ON() was added to ensure the counter
never wraps.

This BUG_ON() could be triggered if an allocation fails for one of
quic_conn members in qc_new_conn(). This is because the half-open counter is
incremented at the end of qc_new_conn(). However, in case of alloc
failure, quic_conn_release() is called immediately to ensure the counter
is decremented if a connection is freed before peer address has been
validated.

To fix this, increment the half-open counter early in qc_new_conn(), prior to
any quic_conn member allocation.

This issue was reproduced using -dMfail argument.

This issue has been introduced by
  commit 278808915b
  MINOR: quic: reduce half open counters scope

No need to backport.
2023-11-13 11:16:41 +01:00
Amaury Denoyelle
92da3accfd BUG/MINOR: quic: fix crash on qc_new_conn alloc failure
A new counter was recently introduced to account for the current number
of active QUIC handshakes. This counter is stored on the listener
instance.

This counter is incremented at the beginning of qc_new_conn() to check
if the limit is not reached prior to quic_conn allocation. If quic_conn or
one of its inner members fails to allocate, special care is taken to
decrement the counter as the connection instance is released. However,
it relies on the <l> variable, which is initialized too late to cover
pool_head_quic_conn allocation failure.

To fix this, simply initialize <l> at the beginning of qc_new_conn().

This issue was reproduced using -dMfail argument.

This issue was introduced by the following commit
  commit 3df6a60113
  MEDIUM: quic: limit handshake per listener

No need to backport.
2023-11-13 11:16:41 +01:00
Aurelien DARRAGON
76acde9107 BUG/MINOR: log: keep the ref in dup_logger()
This bug was introduced with 969e212 ("MINOR: log: add dup_logsrv() helper
function")

When duplicating an existing log entry, we must take care to inherit from
its original ->ref if it is set, because not doing so would make 28ac0999
("MINOR: log: Keep the ref when a log server is copied to avoid duplicate entries")
ineffective given that global log directives will lose their original
reference when duplicated recursively (at least twice), which is what
happens when global log directives are first inherited to defaults which
are then inherited to a regular proxy at the end of the chain.

This can be easily reproduced using the following configuration:

   |global
   |  log stdout format raw local0
   |
   |defaults
   |  log global
   |
   |frontend test
   |  log global
   |  ...

Logs from "test" proxy will be duplicated because test incorrectly
inherited from global "log" directives twice, which 28ac0999 would
normally detect and prevent.

No backport needed unless 969e212 gets backported.
2023-11-13 11:06:05 +01:00
Christopher Faulet
33a1fc883a BUG/MINOR: sample: Fix bytes converter if offset is bigger than sample length
When the bytes converter was improved to be able to use variables (915e48675
["MEDIUM: sample: Enhances converter "bytes" to take variable names as
arguments"]), the behavior of the sample slightly change. A failure is
reported if the given offset is bigger than the sample length. Before, a
empty binary sample was returned.

This patch fixes the converter to restore the original behavior. The
function was also refactored to properly handle failures by removing
SMP_F_MAY_CHANGE flag. Because the converter now handles variables, the
conversion to an integer may fail. In this case SMP_F_MAY_CHANGE flag must
be removed to be sure the caller will not retry.

This patch should fix the issue #2335. No backport needed except if commit
above is backported.
2023-11-13 11:06:05 +01:00
William Lallemand
a06f6212c9 MEDIUM: startup: 'haproxy -c' is quiet when valid
MODE_CHECK does not output "Configuration file is valid" by default
anymore. To display this message the -V option must be used with -c.

However, warnings and errors are still output by default if they
exist.

This allows cleaning the output of the systemd unit file, which is doing a
-c.
2023-11-13 09:59:34 +01:00
Willy Tarreau
cf07cb96be BUG/MEDIUM: proxy: always initialize the default settings after init
The proxy's initialization is rather odd. First, init_new_proxy() is
called to zero all the lists and certain values, except those that can
come from defaults, which are initialized by proxy_preset_defaults().
The default server settings are also only set there.

This results in these settings not being set for a number of internal
proxies that do not explicitly call proxy_preset_defaults() after
allocation, such as sink and log forwarders.

This was revealed by last commit 79aa63823 ("MINOR: server: always
initialize pp_tlvs for default servers") which crashes in log parsers
when applied to certain proxies which did not initialize their default
servers.

In theory this should be backported, however it would be desirable to
wait a bit before backporting it, in case certain parts would rely on
these elements not being initialized.
2023-11-13 09:17:05 +01:00
Willy Tarreau
79aa638238 MINOR: server: always initialize pp_tlvs for default servers
In commit 6f4bfed3a ("MINOR: server: Add parser support for
set-proxy-v2-tlv-fmt") a suspicious check for a NULL srv_tlv was placed
in the list_for_each_entry(), which should not be needed. In practice,
it's caused by the list head not being initialized, hence the first
element is NULL, as shown by Alexander's reproducer below which crashes
if the test in the loop is removed:

  backend dummy
    default-server send-proxy-v2 set-proxy-v2-tlv-fmt(0xE1) %[fc_pp_tlv(0xE1)]
    server dummy_server 127.0.0.1:2319

The right place to initialize this field is proxy_preset_defaults().
We'd really need a function to initialize a server :-/

The check in the loop was removed. No backport is needed.
2023-11-13 08:53:28 +01:00
Frédéric Lécaille
dfda884633 BUG/MINOR: quic: Useless use of non-contiguous buffer for in order CRYPTO data
This issue could be reproduced with a TLS client certificate verification to
generate enough CRYPTO data between the client and haproxy, and with dev/udp/udp-perturb
as network perturbator. Haproxy could crash due to a BUG_ON() call as soon as
out-of-order data were buffered into a non-contiguous buffer.

There is no need to pass a non-NULL non-contiguous buffer to qc_ssl_provide_quic_data()
from qc_ssl_provide_all_quic_data(), which handles in-order CRYPTO data which
has not been buffered. If we do, the first call to qc_ssl_provide_quic_data()
to process the first block of in-order data leads the non-contiguous buffer
head to be advanced to a wrong offset, by <len> bytes, which is the length of the
in-order CRYPTO frame. This is detected by a BUG_ON() as follows:

FATAL: bug condition "ncb_ret != NCB_RET_OK" matched at src/quic_ssl.c:620
  call trace(11):
  | 0x5631cc41f3cc [0f 0b 8b 05 d4 df 48 00]: qc_ssl_provide_quic_data+0xca7/0xd78
  | 0x5631cc41f6b2 [89 45 bc 48 8b 45 b0 48]: qc_ssl_provide_all_quic_data+0x215/0x576
  | 0x5631cc3ce862 [48 8b 45 b0 8b 40 04 25]: quic_conn_io_cb+0x19a/0x8c2
  | 0x5631cc67f092 [e9 1b 02 00 00 83 45 e4]: run_tasks_from_lists+0x498/0x741
  | 0x5631cc67fb51 [89 c2 8b 45 e0 29 d0 89]: process_runnable_tasks+0x816/0x879
  | 0x5631cc625305 [8b 05 bd 0c 2d 00 83 f8]: run_poll_loop+0x8b/0x4bc
  | 0x5631cc6259c0 [48 8b 05 b9 ac 29 00 48]: main-0x2c6
  | 0x7fa6c34a2ea7 [64 48 89 04 25 30 06 00]: libpthread:+0x7ea7
  | 0x7fa6c33c2a2f [48 89 c7 b8 3c 00 00 00]: libc:clone+0x3f/0x5a

Thank you to @Tristan971 for having reported this issue in GH #2095.

No need to backport.
2023-11-10 18:16:14 +01:00
Aurelien DARRAGON
078ebde870 CLEANUP: sink: useless leftover in sink_add_srv()
Removing a useless leftover which has been introduced with 31e8a003a5
("MINOR: sink: function to add new sink servers")
2023-11-10 17:49:57 +01:00
Aurelien DARRAGON
2694621151 CLEANUP: sink: bad indent in sink_new_from_logger()
Fixing bad indent in sink_new_from_logger() which was recently introduced
2023-11-10 17:49:57 +01:00
Aurelien DARRAGON
d710dfbacc BUG/MINOR: sink: don't learn srv port from srv addr
Since 04276f3d ("MEDIUM: server: split the address and the port into two
different fields") we should not use srv->addr to store server's port
and rely on srv->svc_port instead.

For sink servers, we correctly set ->svc_port upon server creation but
we didn't use it when initializing the address for the connection.

As a result, FQDN resolution will not work properly with sink servers.
Fortunately, this used to work by accident because sink servers were
resolved using the PA_O_RESOLVE flag in str2sa_range(), which made the
srv->addr contain the port in addition to the address.

But this will fail to work when FQDN resolution is postponed because only
->svc_port will contain the proper server port upon resolution.

For instance, FQDN resolution with servers from log backends (which are
resolved as regular servers, that is, without the PA_O_RESOLVE) will fail
to work because of this.

This may be backported as far as 2.2 even though the bug didn't have
noticeable effects for versions below 2.9.

[In 2.2, sink_forward_session_init() didn't exist it should be applied in
 sink_forward_session_create()]
2023-11-10 17:49:57 +01:00
Aurelien DARRAGON
64e0b63442 BUG/MEDIUM: server: invalid address (post)parsing checks
This bug was introduced with 29b76ca ("BUG/MEDIUM: server/log: "mode log"
after server keyword causes crash ")

Indeed, we cannot safely rely on addr_proto being set when str2sa_range()
returns in parse_server() (even if SRV_PARSE_PARSE_ADDR is set), because
proto lookup might be bypassed when FQDN addresses are involved.

Unfortunately, the above patch wrongly assumed that proto would always
be set when SRV_PARSE_PARSE_ADDR was passed to parse_server() (so when
str2sa_range() was called), resulting in invalid postparsing checks being
performed, which could as well lead to crashes with log backends
("mode log" set) because some postparsing init was skipped as a result of
proto not being set and this wasn't expected later in the init code.

To fix this, we now make use of the previous patch to perform server's
address compatibility checks on hints that are always set when
str2sa_range() successfully returns.

For log backends, we're also adding a complementary test to check that the
address family is of the expected type, else we report an error. We're also
moving the postinit logic into the log API since _srv_check_proxy_mode() is
only meant to check proxy mode compatibility and we were abusing it.

This patch depends on:
 - "MINOR: tools: make str2sa_range() directly return type hints"

No backport required unless 29b76ca gets backported.
2023-11-10 17:49:57 +01:00
Aurelien DARRAGON
12582eb8e5 MINOR: tools: make str2sa_range() directly return type hints
str2sa_range() already allows the caller to provide <proto> in order to
get a pointer on the protocol matching with the string input thanks to
5fc9328a ("MINOR: tools: make str2sa_range() directly return the protocol")

However, as stated in the commit message, there is a trick:
   "we can fail to return a protocol in case the caller
    accepts an fqdn for use later. This is what servers do and in this
    case it is valid to return no protocol"

In this case, we're unable to return protocol because the protocol lookup
depends on both the [proto type + xprt type] and the [family type] to be
known.

While family type might not be directly resolved when fqdn is involved
(because family type might be discovered using DNS queries), proto type
and xprt type are already known. As such, the caller might be interested
in knowing those address related hints even if the address family type is
not yet resolved and thus the matching protocol cannot be looked up.

Thus in this patch we add the optional net_addr_type (custom type)
argument to str2sa_range to enable the caller to check the protocol type
and transport type when the function succeeds.
2023-11-10 17:49:57 +01:00
Christopher Faulet
ebf90ca550 BUG/MEDIUM: applet: Remove appctx from buffer wait list on release
For now, the appctx is removed from the buffer wait list when it is
freed. However, when it is released, it is not necessarily freed
immediately. But it is detached from the SC. If it is still registered in
the buffer wait list, it could then be woken up to get a buffer. At this
stage it is totally unexpected, especially because we must access the SC.

The fix is obvious, the appctx must be removed from the buffer wait list on
release.

Note this bug exists because the appctx was moved at the mux level.

This patch must be backported as far as 2.6.
2023-11-10 17:49:57 +01:00
Amaury Denoyelle
150c0da889 MEDIUM: quic: release conn socket before using quic_cc_conn
After emission/reception of a CONNECTION_CLOSE, a connection enters the
CLOSING state. In this state, only minimal exchanges occur as only the
packets which contained the CONNECTION_CLOSE frame can be re-emitted. In
conformance with the RFC, most resources are released and quic_conn
instance is converted to the lighter quic_cc_conn.

Push further this optimization by closing quic_conn socket FD before
switching to a quic_cc_conn. This means that quic_cc_conn will rely on
listener socket for its send/recv operations. This should not impact
performance as, as stated, input/output is minimal in this state.

This patch should improve FD consumption as prior to this a socket FD
was kept during the closing delay which could cause maxsock to be
reached for other connections.

Note that fd member is kept in QUIC_CONN_COMMON and not removed from
quic_cc_conn. This is because quic_cc_conn relies on qc_snd_buf() which
access this field.

As a side-effect to this change, jobs accounting for quic_conn is also
updated. quic_cc_conn instances are now not counted as jobs. Indeed, the
main objective of jobs is to prevent the haproxy process from being stopped with
data truncation. However, this relies on the connection using its
own socket, as the listener socket is shut down unconditionally on
shutdown.

A consequence of the jobs handling change is that the haproxy process will
be closed if only quic_cc_conn instances are present, thus preventing the
closing state from being respected. In case of a reload, if a client missed a
CONNECTION_CLOSE frame just before process shutdown, it will probably
receive a Stateless Reset when it retries sending.

This change is considered safe as, for now, haproxy only emits
CONNECTION_CLOSE on error conditions (such as protocol violation or
timeout). It is considered as expected to suffer from data truncation
from this. However, if connection closing is reused by haproxy to
implement clean shutdown, it should be necessary to delay
CONNECTION_CLOSE frame emission to ensure no data truncation happens
here.
2023-11-10 15:27:45 +01:00
Amaury Denoyelle
f549eb2b34 MEDIUM: quic: respect closing state even on soft-stop
Prior to this patch, a special condition was set when the idle timer was
rearmed for closing connections during haproxy process stopping. In this
case, the timeout was ditched and the idle task woken up immediately.

The objective was to quickly release closing connections so that stopping
the process does not take too long. However, it does not conform to RFC
9000 recommendations and may cause some clients to miss a
CONNECTION_CLOSE in case of a packet loss.

A recent fix was set to use a shorter timeout for the closing state. Now a
connection should only be left in this state for one second or less.
This greatly reduces the importance of the stopping special condition. Thus,
this patch removes it completely.
2023-11-10 15:26:03 +01:00
Amaury Denoyelle
75e36c57f0 BUG/MINOR: quic: remove dead code in error path
In quic_rx_pkt_retrieve_conn(), err label is now only used if qc is
NULL. Thus, condition on qc can be removed.

No need to backport.

This issue was reported by coverity on github.
This should fix issue #2338.
2023-11-10 15:26:03 +01:00
Willy Tarreau
0a7ab7067f OPTIM: mux-h2: don't allocate more buffers per connections than streams
When an H2 mux works with a slow downstream connection and without the
mux-mux mode, it is possible that a single stream will allocate all 32
buffers in the connection. This is not desirable at all because 1) it
brings no value, and 2) it allocates a lot of memory per connection,
which, in addition to using a lot of memory, tends to degrade performance
due to cache thrashing.

This patch improves the situation by refraining from sending data frames
over a connection when more mbufs than streams are allocated. On a test
featuring 10k connections each with a single stream reading from the
cache, this patch reduces the RAM usage from ~180k buffers to ~20k bufs,
and improves the bandwidth. This may even be backported later to recent
versions to improve memory usage. Note however that it is efficient only
when combined with e16762f8a ("OPTIM: mux-h2: call h2_send() directly
from h2_snd_buf()"), and tends to slightly reduce the single-stream
performance without it, so in case of a backport, the two need to be
considered together.
2023-11-09 17:24:00 +01:00
Willy Tarreau
a13f8425f0 MINOR: task/debug: make task_queue() and task_schedule() possible callers
It's common to see process_stream() being woken up by wake_expired_tasks
in the profiling output, without knowing which timeout was set to cause
this. By making it possible to record the call places of task_queue()
and task_schedule(), and by making wake_expired_tasks() explicitly not
replace it, we'll be able to know which task_queue() or task_schedule()
was triggered for a given wakeup.

For example below:
  process_stream                51200   311.4ms   6.081us   34.59s    675.6us <- run_tasks_from_lists@src/task.c:659 task_queue
  process_stream                19227   70.00ms   3.640us   9.813m    30.62ms <- sc_notify@src/stconn.c:1136 task_wakeup
  process_stream                 6414   102.3ms   15.95us   8.093m    75.70ms <- stream_new@src/stream.c:578 task_wakeup

It's visible that it's the run_tasks_from_lists() which in fact applies
on the task->expire returned by the ->process() function itself.
2023-11-09 17:24:00 +01:00
Amaury Denoyelle
4dee110f56 BUG/MINOR: quic: fix retry token check inconsistency
A client may send multiple INITIAL packets if ClientHello is too big for
only one. In case a Retry token is used, the client must reuse it for
every INITIAL packets.

On the haproxy server side, there was an inconsistency in the handling of these
packets depending on the socket mode:
* when using listener socket, token is always revalidated.
* when using connection socket, token check is bypassed. This is because
  quic_conn instance is known through its socket and thus
  quic_rx_pkt_retrieve_conn() is not necessary.

RFC 9000 does not seem to mandate retry token validation after the
first INITIAL packet per connection. Thus, this patch chooses to bypass
the check every time the connection instance is known, as this indicates
that a previous token was already validated.

This should be backported up to 2.7.
2023-11-09 16:57:37 +01:00
Amaury Denoyelle
bb28215d9b MEDIUM: quic: define an accept queue limit
QUIC connections are pushed manually into a dedicated listener queue
when they are ready to be accepted. This happens after handshake
finalization or on 0-RTT packet reception. Listener is then woken up to
dequeue them with listener_accept().

This patch accounts for the number of connections currently stored in
the accept queue. If a certain limit is reached, INITIAL packets are
dropped on reception to prevent further QUIC connection allocations.
This should help to preserve system resources.

This limit is automatically derived from the listener backlog. Half of
its value is reserved for handshakes and the other half for accept
queues. By default, backlog is equal to maxconn which guarantee that
there can't be no more than maxconn connections in handshake or waiting
to be accepted.
2023-11-09 16:24:00 +01:00
Amaury Denoyelle
3df6a60113 MEDIUM: quic: limit handshake per listener
Implement a limit per listener for concurrent number of QUIC
connections. When reached, INITIAL packets for new connections are
automatically dropped until the number of handshakes is reduced.

The limit value is automatically based on listener backlog, which itself
defaults to maxconn.

This feature is important to ensure CPU and memory resources are not
consumed if too many handshake attempts are started in parallel.

Special care is taken if a connection is released before handshake
completion. In this case, the counter must be decremented. This forces us to
ensure that the <qc.state> member is set early in qc_new_conn() before any
quic_conn_release() invocation.
2023-11-09 16:23:52 +01:00
Amaury Denoyelle
278808915b MINOR: quic: reduce half open counters scope
Accounting is implemented for half open connections which represent QUIC
connections waiting for handshake completion. When reaching a certain
limit, Retry mechanism is automatically activated prior to instantiate
new connections.

The issue with this behavior is that two notions are mixed: the QUIC
connection handshake phase and Retry, which is a mechanism against
amplification attacks. As such, only peer address validation should be
taken into account to activate Retry protection.

This patch chooses to reduce the scope of half_open_conn. Only
connections waiting for peer address validation are now accounted for.
Most notably, connections instantiated with a validated Retry token
check are not accounted.

One impact of this patch is that it should prevent activating the Retry
mechanism too early, in particular if multiple handshakes are
too slow. Another limitation should be implemented to protect against
this scenario.
2023-11-09 16:23:52 +01:00
Amaury Denoyelle
d38bb7f8a7 MEDIUM: quic: adjust address validation
When a new QUIC connection is created, server considers peer address as
not yet validated. The server must limit its sending up to 3 times the
content already received. This is a defensive measure to avoid flooding
a remote host victim of address spoofing.

This patch adjusts the condition to consider the peer address as
validated. Two conditions are now considered:
* successful handling of a received HANDSHAKE packet. This was already
  done before although implemented in a different way.
* validation of a Retry token. This was not considered prior to this patch
  despite the RFC recommendation.

This patch also adjusts how a connection is internally labelled as using
a validated peer address. Before, the above conditions were checked via
quic_peer_validated_addr(). Now, the QUIC_FL_CONN_PEER_VALIDATED_ADDR flag
is set to label this. It already existed prior to this patch but was
only used for quic_cc_conn. This should now be more explicit.
2023-11-09 16:23:52 +01:00
Christopher Faulet
3a051ca0c8 BUG/MEDIUM: mux-h1: Exit early if fast-forward is not supported by opposite SC
The commit 4be0c7c65 ("MEDIUM: stconn/muxes: Loop on data fast-forwarding to
forward at least a buffer") introduced a regression. In h1_fastfwd(), if
data fast-forwarding is not supported by the opposite SC, we must exit
without calling se_done_ff(). Otherwise a BUG_ON() will be triggered because
the opposite mux has no .done_fastfwd() callback function.

No backport needed.
2023-11-09 15:18:43 +01:00
William Lallemand
3ac3a06963 MEDIUM: mworker: -W is mandatory when using -S
Defining a master CLI without the master-worker mode emits a warning
since version 1.8. This patch enforces the behavior by forbidding the
usage of the -S option without the master-worker mode.
2023-11-09 15:07:15 +01:00
William Lallemand
da24b462c3 MEDIUM: errors: move the MODE_QUIET test in print_message()
Move the MODE_QUIET and MODE_VERBOSE test in print_message() so we
always output in the startup-logs even with MODE_QUIET.

ha_warning(), ha_alert() and ha_notice() do not check MODE_QUIET
and MODE_VERBOSE anymore; this is done before the fprintf() in
print_message().
2023-11-09 14:39:11 +01:00
William Lallemand
59d699c0c4 MINOR: errors: does not check MODE_STARTING for log emission
ha_alert(), ha_warning() and ha_notice() shouldn't check MODE_STARTING
for log emission. Let's remove the check.

This shouldn't do much since the stdio_quiet() function mutes the output
in main().
2023-11-09 14:39:11 +01:00
William Lallemand
b959b752f9 MINOR: errors: ha_alert() and ha_warning() uses warn_exec_path()
Move the code to display the haproxy version and path during starting
mode, which is called by the first ha_alert() or ha_warning().
2023-11-09 14:39:11 +01:00
Christopher Faulet
78021ee9ef BUG/MEDIUM: stconn: Don't update stream expiration date if already expired
The commit 08d7169f4 ("MINOR: stconn: Don't queue stream task in past in
sc_notify()") tried to fix issues with epiration date set in past for the
stream in sc_notify(). However it remains some cases where the stream
expiration date may already be expired before recomputing it. This happens
when an event is reported by the mux exactly when a timeout is triggered. In
this case, depending on the scheduling, the SC may be woken up before the
stream. For these cases, we fall into the BUG_ON() preventing to queue in
the past.

So, it remains unexpected to queue a task in the past. The BUG_ON() is
correct at this place. We must just avoid recomputing the stream expiration
date if it is already expired. At worst, the stream will be woken up for
nothing. But it is not really a big deal because it will only happen on
timeouts from time to time. It is so sporadic that we can ignore it from a
performance point of view.

This patch must be backported to 2.8. Be careful to remove the BUG_ON() on
the 2.8.
2023-11-09 12:08:59 +01:00
Frédéric Lécaille
819690303d BUG/MEDIUM: quic: Avoid some crashes upon TX packet allocation failures
If a TX packet cannot be allocated (by qc_build_pkt()), as it can be coalesced
to another one, this leaves the TX buffer with prepared data that was never sent.
Then haproxy crashes upon a BUG_ON() triggered by the next call to qc_txb_release().
This may happen only during handshakes.

To fix this, qc_build_pkt() returns a new -3 error to detect such allocation
failures, which is from now on followed by a call to qc_purge_txbuf() to
send the prepared TX data and purge the TX buffer.

Must be backported as far as 2.6.
2023-11-09 10:32:31 +01:00
Frédéric Lécaille
b21e08cbd2 BUG/MEDIUM: quic: Possible crashes when sending too short Initial packets
This may happen during handshakes when Handshake packets cannot be coalesced
to a first Initial packet because of TX frame allocation failures (from
qc_build_frms()). This leads to too short (not padded) Initial packets being sent.
This is detected by a BUG_ON() in qc_send_ppkts().

To avoid this, a Handshake packet without the ack-eliciting frames which should have
been built by qc_build_frms() is built.

Must be backported as far as 2.6.
2023-11-09 10:32:31 +01:00
Frédéric Lécaille
c78cb49a3b BUG/MEDIUM: quic: Avoid trying to send ACK frames from an empty ack ranges tree
This may happen upon ack range allocation failures (from quic_update_ack_ranges_list()).
This can lead to empty trees of ack ranges being used to build ACK frames, which
is not good at all. Furthermore this is detected by a BUG_ON() (in qc_do_build_pkt()).

To avoid this, simply update the acknowledgement state of the connection only if
quic_update_ack_ranges_list() succeeds, as it fails only in case of memory
allocation failures.

Must be backported as far as 2.6.
2023-11-09 10:32:31 +01:00
Frédéric Lécaille
4e3b28e8b6 BUG/MEDIUM: quic: Too short Initial packet sent (enc. level allocation failed)
If the Handshake encryption level could not be allocated, this could lead
to Initial packets being sent because no Handshake CRYPTO frames were generated.

Furthermore in such an allocation failure case, the connection should be closed
as soon as possible. This is done by making ha_quic_set_encryption_secrets() return
0 upon an encryption level allocation failure.

Also fix a typo in the trace in relation to this allocation failure.

No need to be backported.
2023-11-09 10:32:31 +01:00
Frédéric Lécaille
4cf784f38e MINOR: quic: Avoid zeroing frame structures
Do not initialize the ->type field of quic_frame structures anymore, as
doing so leads to all the other fields being zeroed.
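
For reference, a small standalone illustration of the C behavior behind this
change (the structure here is simplified, not the real quic_frame):

  struct frame {
      int type;
      int len;
      void *data;
  };

  void example(void)
  {
      /* a designated initializer zero-initializes every member that is
       * not listed, which costs an implicit clear of the whole struct:
       */
      struct frame f1 = { .type = 1 };   /* f1.len == 0, f1.data == NULL */

      /* leaving the local uninitialized avoids that implicit zeroing;
       * only the members actually used are then assigned explicitly:
       */
      struct frame f2;
      f2.type = 1;

      (void)f1; (void)f2;
  }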
2023-11-09 10:32:31 +01:00
Frédéric Lécaille
f1be725474 CLEANUP: quic: Indentation fix in qc_do_build_pkt()
Modification without any functional impact.
2023-11-09 10:32:31 +01:00
Frédéric Lécaille
7ecf4b34b9 BUG/MINOR: quic: idle timer task requeued in the past
When the idle timer expired with a still present mux, this task was not freed
and even requeued with a timer in the past.

Fix this issue by calling task_destroy() in this case. As the task is freed,
its handler must return NULL, setting the local <t> variable to NULL in every case.

Also ensure that this timer task is not armed again after having been released,
with a <return> statement in qc_idle_timer_do_rearm() when this is the case.

Must be backported as far as 2.6.
2023-11-09 10:32:31 +01:00
Frédéric Lécaille
b48abf0beb MINOR: quic: Add idle timer task pointer to traces
Helpful to detect if this timer was freed or not.
2023-11-09 10:32:31 +01:00
Frédéric Lécaille
4cfae3ac01 MINOR: quic: release the TLS context asap from quic_conn_release()
There was no reason not to release the TLS/SSL QUIC connection context as soon
as possible from quic_conn_release(), before allocating a "closing connection"
connection (quic_cc_conn struct).
2023-11-09 10:32:31 +01:00
Frédéric Lécaille
3a8dd48e30 MEDIUM: quic: Heavy task mode with non contiguously bufferized CRYPTO data
This patch sets the handshake task in heavy task mode when receiving out-of-order
CRYPTO data which results in in-order buffered CRYPTO data. This is done
thanks to a non-contiguous buffer and from qc_handle_crypto_frm() after having
potentially buffered CRYPTO data in this buffer.
qc_treat_rx_crypto_frms() is no longer called from qc_treat_rx_pkts(); instead
this is where the task is set in heavy task mode. Consequently,
it is the job of qc_ssl_provide_all_quic_data() to directly call
qc_treat_rx_crypto_frms() to provide the in-order buffered CRYPTO data to the
TLS stack. As this function releases the non-contiguous buffer for the CRYPTO
data, if possible, there is no need to do that from qc_treat_rx_crypto_frms()
anymore.
2023-11-09 10:32:31 +01:00
Frédéric Lécaille
94d20be138 MEDIUM: quic: Heavy task mode during handshake
Add a new pool for the CRYPTO data frames received in order.
Add ->rx.crypto_frms list to each encryption level to store such frames
when they are received in order from qc_handle_crypto_frm().
Also set the handshake task (qc_conn_io_cb()) in heavy task mode from
this function after having received such frames. When this task
detects that it is set in heavy mode, it calls qc_ssl_provide_all_quic_data()
newly implemented function to provide the CRYPTO data to the TLS task.
Modify quic_conn_enc_level_uninit() to release these CRYPTO frames
when releasing the encryption level they are in relation with.
2023-11-09 10:32:31 +01:00
Christopher Faulet
84d26bcf3f MINOR: stconn/mux-h2: Use a iobuf flag to report EOI to consumer side during FF
IOBUF_FL_EOI iobuf flag is now set by the producer to notify the consumer
that the end of input was reached. Thanks to this flag, we can remove the
ugly hack in h2_done_ff() that tested the opposite SE flags.

Of course, for now, it works and it is good enough. But we must keep in mind
that EOI is always forwarded from the producer side to the consumer side in
this case. But if this changes, a new CO_RFL_ flag will have to be added to
instruct the producer whether it can forward EOI or not.
2023-11-08 21:14:07 +01:00
Christopher Faulet
4be0c7c655 MEDIUM: stconn/muxes: Loop on data fast-forwarding to forward at least a buffer
In the mux-to-mux data forwarding, we now try, as far as possible, to send at
least a buffer. Of course, if the consumer side is congested or if nothing
more can be received, we leave. But the idea is to retry fast-forwarding
data if less than a buffer was forwarded. It is only performed for buffer
fast-forwarding, not splicing.

The idea behind this patch is to optimise the forwarding, when a first
forward was performed to complete a buffer with some existing data. In this
case, the amount of data forwarded is artificially limited because we are
using a non-empty buffer. But without this limitation, it is highly probable
that a full buffer could have been sent. And indeed, with an H2 client, a
significant improvement was observed during our tests.

To do so, .done_fastfwd() callback function must be able to deal with
interim forwards. Especially for the H2 mux, to remove H2_SF_NOTIFIED flags
on the H2S on the last call only. Otherwise, the H2 stream can be blocked by
itself because it is in the send_list. IOBUF_FL_INTERIM_FF iobuf flag is
used to notify the consumer it is not the last call. This flag is then
removed on the last call.
2023-11-08 21:14:07 +01:00
Willy Tarreau
a57f2a5cfe BUG/MEDIUM: pool: try once to allocate from another bucket if empty
In order to limit inter-thread contention on the global pool, in 2.9-dev3
with commit 7bf829ace ("MAJOR: pools: move the shared pool's free_list
over multiple buckets"), it was decided that if the selected bucket had
an empty free list, we would simply give up and fall back to the OS
allocator.

But this causes allocations to be made from the OS for certain threads,
to be released to overloaded pools that are sent back to the OS. One
visible effect is that sending a lot of traffic using h2load with 100
parallel streams over 100 connections causes 5-10k buffers to be
allocated, then reducing the load to only 10 connections doesn't make
these allocations go down, just because some buckets are no longer
visited.

Tests show that giving a second chance to pick another bucket in this
case is sufficient to visit all other buckets and recycle their pending
objects. Now "show pools" that starts at 10k buffers at 100 connections
goes down to about 150 with 1 connection and 100 streams in a fraction
of a second.

No backport is needed, as the issue is only in 2.9.
2023-11-08 17:14:03 +01:00
Willy Tarreau
a9ae094b27 BUG/MINOR: pool: check one other random bucket on alloc conflict
Since 2.9-dev3 with commit 7bf829ace ("MAJOR: pools: move the shared
pool's free_list over multiple buckets"), the global pool supports
multiple heads to reduce inter-thread contention. However, when
grabbing a freelist head fails because another thread is already
picking from it, we just skip to the next one and try again.

Unfortunately, it still maintains a bit of contention between thread
pairs when, for some reason, only a few threads are used. This may
happen for example when running on a 4- or 8- thread system and
the two most active ones end up on adjacent buckets.

A better and much simpler solution consists in visiting a random bucket
instead of the current one. Tests show that the CPU usage spent in
pool_refill_local_from_shared() reduces at low number of connections
(hence threads).

No backport is needed, as the issue is only in 2.9.
2023-11-08 17:12:49 +01:00
Christopher Faulet
5705a6e3b7 BUG/MEDIUM: freq-ctr: Don't report overshoot for long inactivity period
The function returning the excess of events over the current period for a
target frequency (the overshoot) has a flaw if the inactivity period is too
long. In this case, the result may overflow. Instead of being negative, a very
high positive value is returned.

This function is used by the bandwidth limitation filter. It means after a
long inactivity period, a huge burst may be detected while it should not.

In fact, the problem arises from the moment we're past the current period. In
this case, we should not report any overshoot and just get the number of
remaining events as usual.

This patch should be backported as far as 2.7.
2023-11-08 16:38:06 +01:00
Christopher Faulet
2c9c2f9d77 BUG/MINOR: mux-h1: Properly handle http-request and http-keep-alive timeouts
It is now the turn of the H1 mux to be fixed to properly handle http-request
and http-keep-alive timeouts. It is quite surprising but it has been broken since
the 2.2. For idle connections on client side, the smallest value between the
client timeout and the http-request/http-keep-alive timeout is used while
the client timeout should only be used if other ones are not defined. So, if
the client timeout is the smallest value, the keep-alive timeout is not
respected.

It is only an issue for idle client connections. The http-request timeout is
respected from the moment part of the next request was received.

This patch should fix the issue #2334. It must be backported as far as 2.2. But
be careful during the backports. The H1 mux had evolved a lot since the 2.2.
2023-11-08 16:38:06 +01:00
Aurelien DARRAGON
8dae361f35 MINOR: stktable/cli: support v6tov4 and v4tov6 conversions
Add a special treatment for the IPV4 and IPV6 cases in the
table_process_entry_per_key() function so that the input string is parsed
in a best-effort way (STR to the pseudo type ADDR): the input format is first
considered over the table type, and then smp_to_stkey() does the type
conversion for us when needed.

This patch heavily depends on:
- "MEDIUM: stktable/cli: simplify entry key handling"

And optionally depends on:
- 72514a44 ("MEDIUM: tools/ip: v4tov6() and v6tov4() rework")
2023-11-08 16:38:06 +01:00
Aurelien DARRAGON
0a47e6bccc MEDIUM: stktable/cli: simplify entry key handling
Make use of smp_to_stkey() in table_process_entry_per_key() to simplify
key handling and leverage auto type conversions from sample API.

One noticeable side effect is that integer input checks will be relaxed
given that the c_str2int() sample conv is more permissive than the integrated
table_process_entry_per_key() integer parser.
2023-11-08 16:38:06 +01:00
Aurelien DARRAGON
c6826b9570 BUG/MINOR: stick-table/cli: Check for invalid ipv4 key
When an ipv4 key is used to filter a CLI command on a stick table
(clear/set/show table ...), the inetaddr_host+htonl combination was used
with no error checking.

Instead, we now use inet_pton(), which is what we use for ipv6 addresses
since b7c962b0c0 ("BUG/MINOR: stick-table/cli: Check for invalid ipv6 key")

Doing this allows us to easily check for parsing errors: we're trading off
some parsing efficiency to better catch input errors and ensure we get
similar behavior between ipv4 and ipv6 address handling.
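
As a minimal standalone sketch of the safer parsing pattern (plain libc calls
only; the surrounding stick-table code is not shown):

  #include <arpa/inet.h>
  #include <stdio.h>

  /* returns 1 on success, 0 on invalid input */
  static int parse_ipv4_key(const char *str, struct in_addr *out)
  {
      /* inet_pton() returns 1 on success and 0 (or -1) on error, so
       * invalid keys can be reported instead of being silently accepted
       * as they could be with a blind inetaddr_host()+htonl() call.
       */
      if (inet_pton(AF_INET, str, out) != 1) {
          fprintf(stderr, "invalid IPv4 key: %s\n", str);
          return 0;
      }
      return 1;
  }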

This patch may be backported to all supported versions.
2023-11-08 16:38:06 +01:00
Christopher Faulet
ba6ad4654e BUG/MINOR: mux-h1: Release empty ibuf during data fast-forwarding
We must take care to release H1 input buffer when it is emptied during the
fast-forwarding nego. Otherwise, it may be kept allocated for a while,
waiting for the next "normal" receive or the H1C release.

No backport needed.
2023-11-08 16:38:06 +01:00
Amaury Denoyelle
d434acd8bb MINOR: proto_reverse_connect: use connect timeout
Use the backend connect timeout when a new connection is instantiated for
rhttp. This ensures that if the connect operation fails after a certain
delay, the reverse_connect listener task is woken up. This allows freeing
the current connection and retrying a new connect.

As a consequence of this change, rev_process() may be woken up even if the
connection is not reported with CO_FL_ERROR. This happens if the timeout
fired before any network-reported issue. Connection freeing is adjusted
as in this case the MUX instance is already allocated. Use the destroy callback
to release the MUX context prior to the connection itself.

This patch is really useful as a side measure for a haproxy bug
impacting connect with SSL for both backend connections and active
reverse connect. This is caused by the delayed MUX allocation: an
asynchronous connect error detected at the socket layer is
not notified to upper layers. Currently, only the connect timeout allows
releasing this failed connection.
2023-11-08 10:17:43 +01:00
Christopher Faulet
7d7df1cf0a BUG/MEDIUM: mux-h1: Be sure xprt support splicing to use it during fast-forward
The commit d6d4abdc3 ("BUILD: mux-h1: Fix build without kernel splicing
support") introduced a regression. The kernel support for the underlying
XPRT is no longer checked. So it is possible to enable splicing for SSL
connections. This of course leads to a segfault.

This patch restores the test on the xprt rcv_pipe/snd_pipe functions.

This patch should fix a crash reported by Tristan in #2095
(#issuecomment-1788949014). No backport needed.
2023-11-07 18:23:00 +01:00
Amaury Denoyelle
6f9b65f952 BUG/MEDIUM: quic: fix sslconns on quic_conn alloc failure
QUIC connections are accounted inside global sslconns. As with QUIC
actconn, it suffered from a similar issue if an intermediary allocation
failed inside qc_new_conn().

Fix this similarly by moving increment operation inside qc_new_conn().
Increment and error path are now centralized and much easier to
validate.

The consequences are similar to the actconn fix : on memory allocation
global sslconns may wrap, this time blocking any future QUIC or SSL
connections on the process.

This must be backported up to 2.6.
2023-11-07 14:06:02 +01:00
Amaury Denoyelle
a7ba679fe7 BUG/MEDIUM: quic: fix actconn on quic_conn alloc failure
Since the following commit, quic_conn instances are accounted into
global actconn and compared against maxconn.

  commit 7735cf3854
  MEDIUM: quic: count quic_conn instance for maxconn

Increment is always done prior to real allocation to guarantee minimal
resource consumption. Special care is taken to ensure there will always
be one decrement operation for each increment. To help this, decrement
is centralized in quic_conn_release().

This behaves incorrectly in case of an intermediary allocation failure
inside qc_new_conn(). In this case, quic_conn_release() will decrement
actconn. Then, a NULL qc is returned in quic_rx_pkt_retrieve_conn()
which will also decrement the counter on its own error code path.

To properly fix this, actconn incrementation has been moved directly
inside qc_new_conn(). It is thus easier to cover all cases:
* if alloc failure before or on pool_head_quic_conn, actconn is
  decremented manually at the end of qc_new_conn()
* after this step, actconn will be decremented by quic_conn_release()
  either on intermediary alloc failure or on proper connection release

This bug happens on memory allocation failure so it should be rare.
However, its impact is not negligible: if the actconn counter wraps,
it will block any future connection allocation for both QUIC and TCP.

One small downside of this change is that a CID is now always allocated
before quic_conn even if maxconn will be reached. However, this is
considered as of minor importance compared to a more robust code.

This must be backported up to 2.6.
2023-11-07 13:50:07 +01:00
Christopher Faulet
e5fe2013a9 CLEANUP: htx: Properly indent htx_reserve_max_data() function
Spaces were used instead of tabs to indent htx_reserve_max_data()
function. Let's reindent the whole function.
2023-11-07 10:41:11 +01:00
Christopher Faulet
c57af8ebcd BUG/MINOR: stconn: Sanitize report for read activity
When a EOS or EOI is detected on the endpoint and when the event is reported
at the SC level, a read activity must be reported. It is not really a big
deal because these flags already inhibit any read timeout. But it is
consistent with the <lra> comment. In addition, no read activity is reported
on abort. It is an up-down event and it is not an event unblocking the
reads. So there is no reason to report a read activity.

This patch must be backported to 2.8.
2023-11-07 10:41:11 +01:00
Christopher Faulet
08d7169f42 MINOR: stconn: Don't queue stream task in past in sc_notify()
A task must never be queued in the past. However, in sc_notify(), the stream
task, if not woken up, is queued. Thanks to previous fixes, the stream task
expiration date should be correct. But to prevent any issue, a BUG_ON() is
added to be sure it never happens. I guess a good idea could be to remove it
or change it to BUG_ON_HOT() for the final release.
2023-11-07 10:32:25 +01:00
Christopher Faulet
4a2660aa45 BUG/MEDIUM: stconn: Don't report rcv/snd expiration date if SC cannot epxire
When receive or send expiration date of a stream-connector is retrieved, we
now automatically check if it may expire. If not, TICK_ETERNITY is returned.

The expiration dates of the frontend and backend stream-connectors are used
to compute the stream expiration date. This operation is performed at 2
places: at the end of process_stream() and in sc_notify() if the stream is
not woken up.

With this patch, there is no special change for process_stream() because it
was already handled. It makes things a little simpler. However, it fixes
sc_notify() by avoiding erroneously computing an expiration date in the
past. This greatly reduces the stream wakeups when there is contention on the
consumer side.

The bug was introduced with the commit 8073094bf ("NUG/MEDIUM: stconn:
Always update stream's expiration date after I/O"). It was an error to
unconditionally set the stream expiration date, without testing blocking
conditions on both SCs.

This patch must be backported to 2.8.
2023-11-07 10:30:01 +01:00
Christopher Faulet
141b489291 BUG/MEDIUM: stconn: Report send activity during mux-to-mux fast-forward
When data are directly forwarded from a mux to the opposite one, we must not
forget to report send activity when data are successfully sent, or to report
a blocked send when data are blocked. It is important because otherwise, if
the transfer is quite long, longer than the client or server timeout, an
error may be triggered because the write timeout is reached.

H1, H2 and PT muxes are concerned. To fix the issue, the done_fastfwd()
callback now returns the amount of data consumed. This way it is possible
to update/reset the FSB data accordingly.

No backport needed.
2023-11-07 10:30:01 +01:00
Tim Duesterhus
d7eaa0d553 CLEANUP: Re-apply xalloc_size.cocci (3)
This reapplies the xalloc_size.cocci patch across the whole `src/` tree.

see 16cc16dd82
see 63ee0e4c01
see 9fb57e8c17
2023-11-06 20:49:56 +01:00
Willy Tarreau
09eacb8b24 BUG/MINOR: server: remove some incorrect free() calls on null elements
In commit 6f4bfed3a ("MINOR: server: Add parser support for
set-proxy-v2-tlv-fmt") a few free() calls were made to an element on
the error path when it was detected to be NULL. It doesn't have any
effect, however there was one case of use-after-free at the end of
srv_settings_cpy() that was caught by gcc due to attempting to free
the element after freeing its holder.

No backport is needed.
2023-11-04 08:56:01 +01:00
Willy Tarreau
e16762f8a8 OPTIM: mux-h2: call h2_send() directly from h2_snd_buf()
This allows full buffers to be eliminated very quickly and recycled
much faster, resulting in higher transfer rates and lower memory usage
at the same time. We just wake the tasklet up if it succeeded so that
h2_process() and friends are called to finalize whatever needs to be.

For regular buffer sizes, the performance level becomes quite close to
the one obtained with the zero-copy mechanism (zero-copy remains much
faster with non-default buffer sizes). The memory savings are huge with
default buffer size: at 64c * 100 streams on a single thread, we used
to forward 4.4 Gbps of traffic using 10400 buffers. After the change,
the performance reaches 5.9 Gbps with only 22-24 buffers, since they
are quickly recycled. That's a saving of 160 MB of RAM.

A concern was an increase in the number of syscalls but this is not
the case, the numbers remained exactly the same before and after.

Some experimentations were made to try to cork data and not send
incomplete buffers, and that always voided these changes. One
explanation might be that keeping a first buffer with only headers
frames is sufficient to prevent a zero-copy of the data coming in
a next snd_buf() call. This still needs to be studied anyway.
2023-11-04 08:34:23 +01:00
Willy Tarreau
0fa5adee3b MINOR: mux-h2: always use h2_send() in h2_done_ff(), not h2_process()
By calling h2_process(), the code would theoretically make it possible
for a synchronous ->wake() call to provoke an indirect call to h2_snd_buf()
while we're in h2_done_ff(), which could be quite bad. The current
conditions do not permit it right now but this could easily break by
accident. Better use h2_send() and wake the task up if needed. Precise
performance tests showed no change.
2023-11-04 08:12:17 +01:00
Willy Tarreau
58185669d8 BUG/MEDIUM: pattern: don't trim pools under lock in pat_ref_purge_range()
There's a subtle issue that results from pat_ref_purge_range() trying
to release memory. Since commit 0d93a8186 ("MINOR: pools: work around
possibly slow malloc_trim() during gc") that was backported to 2.3,
trim_all_pools() now protects itself against concurrent malloc() and
free() by isolating itself. The problem is that pat_ref_purge_range()
must be called under a lock, which is precisely what's done in
cli_io_handler_clear_map(). Thus during a clearing of a map, if
another thread tries to access or update an entry in the same map, it
will wait for the ref->lock to be released, and trim_all_pools() will
wait for all threads to be harmless, thus causing a deadlock. Note
that disabling memory trimming cannot work around the problem here
because it's tested only under isolation.

The solution here consists in moving the call to trim_all_pools() to
the caller, out of the lock.

This must be backported as far as 2.4.
2023-11-04 07:55:37 +01:00
Alexander Stephan
ce7501de79 MINOR: connection: Send out generic, user-defined server TLVs
To follow up on the implementation of the new set-proxy-v2-tlv-fmt
keyword in the server, the connection is updated to use the previously
allocated TLVs. If no value was specified, we send out an empty TLV.
As the feature is fully working with this commit, documentation and a
test for the server and default-server are added as well.
2023-11-04 04:56:59 +01:00
Alexander Stephan
6f4bfed3a2 MINOR: server: Add parser support for set-proxy-v2-tlv-fmt
This commit introduces a generic server-side parsing of type-value pair
arguments and allocation of a TLV list via a new keyword called
set-proxy-v2-tlv-fmt.

This allows one to 1) forward any TLV type with the help of fc_pp_tlv, and
2) more generally, send out any TLV type and value via a log format
expression.
To have this fully working the connection will need to be updated in
a follow-up commit to actually respect the new server TLV list.

default-server support has also been implemented.
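
As an illustration, a hedged configuration sketch (addresses and TLV IDs are
arbitrary; the exact syntax should be checked against the keyword
documentation):

   |backend be_pp
   |    default-server send-proxy-v2
   |    # relay TLV 0xE1 received on the frontend side, and add a static TLV 0x20
   |    server srv1 192.168.0.10:8080 set-proxy-v2-tlv-fmt(0xE1) %[fc_pp_tlv(0xE1)]
   |    server srv2 192.168.0.11:8080 set-proxy-v2-tlv-fmt(0x20) my-static-value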
2023-11-04 04:56:59 +01:00
Aurelien DARRAGON
5158c0ff69 MEDIUM: stktable/peers: "write-to" local table on peer updates
In this patch, we add the possibility to declare on a table definition
("table" in peer section, or "stick-table" in proxy section) that we
want the remote/peer updates on that table to be pushed on a local
haproxy table in addition to the source table.

Consider this example:

  |peers mypeers
  |        peer local 127.0.0.1:3334
  |        peer clust 127.0.0.1:3333
  |        table t1.local type string size 10m store server_id,server_key expire 30s
  |        table t1.clust type string size 10m store server_id,server_key write-to mypeers/t1.local expire 30s

With this setup, we consider haproxy uses t1.local as cache/local table
for read and write operations, and that t1.clust is a remote table
containing data processed from t1.local and similar tables from other
haproxy peers in a cluster setup. The t1.clust table will be used to
refresh the local/cache one via the "write-to" statement.

What will happen is that every time haproxy sees entry updates for the
t1.clust table, it will overwrite the t1.local table with fresh data and
will update the entry expiration timer. If the t1.local entry doesn't exist
yet (key doesn't exist), it will automatically create it. Note that only
types that cannot be used for arithmetic ops will be handled, and this
to prevent processed values from the remote table from interfering with
computations based on values from the local table. (ie: prevent
cumulative counters from growing indefinitely).

"write-to" will only push supported types if they both exist in the source
and the target table. Be careful with server_id and server_key storage
because they are often declared implicitly when referencing a table in
sticking rules but it is required to declare them explicitly for them to
be pushed between a remote and a local table through "write-to" option.

Also note that the "write-to" target table should have the same type as
the source one, and that the key length should be strictly equal,
otherwise haproxy will raise an error due to the tables being
incompatible. A table that is already being written to cannot be used
as a source table for a "write-to" target.

Thanks to this patch, it will now be possible to use sticking rules in
peer cluster context by using a local table as a local cache which
will be automatically refreshed by one or multiple remote table(s).

This commit depends on:
 - "MINOR: stktable: stktable_init() sets err_msg on error"
 - "MINOR: stktable: check if a type should be used as-is"
2023-11-03 17:30:30 +01:00
Aurelien DARRAGON
db0cb54f81 MINOR: stktable: check if a type should be used as-is
stick table types now have an extra bit named 'as_is' that allows us to
check if such type should be used as-is or if it may be involved in
arithmetic operations such as counters. This can be useful since those
types are not common and may require specific handling.

e.g.: stktable_data_types[data_type].as_is will be set to 1 if the type
cannot be used in arithmetic operations.
2023-11-03 17:30:30 +01:00
Aurelien DARRAGON
b8c19f877a MINOR: stktable: stktable_init() sets err_msg on error
stktable_init() now sets err_msg when an error occurs so that the caller is
able to precisely report the cause of the failure.
2023-11-03 17:30:30 +01:00
Aurelien DARRAGON
b6a9eca88d BUG/MINOR: cfgparse/stktable: fix error message on stktable_init() failure
As a result of a copy-paste error in 1b8e68e ("MEDIUM: stick-table: Stop
handling stick-tables as proxies."), postparsing stktable_init() failures
were reported as such for named peer tables:

   "Proxy 'table_name': failed to initialize stick table."

Now they are correctly reported like this:

   "Parsing [file:line]: failed to initialize 'table_name' stick-table."

This should be backported to all stable versions.
2023-11-03 17:30:30 +01:00
Aurelien DARRAGON
6376fe9142 BUG/MINOR: stktable: missing free in parse_stick_table()
When "peers" keyword is encountered within a stick table definition,
peers.name hint gets replaced with a new copy of the provided name using
strdup(). However, there is no detection on whether the name was
previously set or not, so it is currently allowed to reuse the keyword
multiple time to overwrite previous value, but here we forgot to free
previous value for peers.name before assigning it to a new one.

This should be backported to all stable versions.
2023-11-03 17:30:30 +01:00
Aurelien DARRAGON
b9c0b039c8 MINOR: proxy/stktable: add resolve_stick_rule helper function
Simplify stick and store sticktable proxy rules postparsing by adding
a sticking rule entry resolve (postparsing) function.

This will ease code maintenance.
2023-11-03 17:30:30 +01:00
Amaury Denoyelle
d82a6d93e2 BUG/MINOR: proto_reverse_connect: support SNI on active connect
SNI may be specified on a server line for connecting to the remote host.
This requires manually setting it on the connection via
ssl_sock_set_servername().

This step was missing when a server line was used for active reverse
HTTP. Fix this by adding the missing ssl_sock_set_servername()
invocation inside new_reverse_conn().

Note that for the moment, no session is instantiated to carry an active
reverse connection. A direct consequence of this is that SNI sample
retrieval may crash if it depends on session parameters. This should be
fixed by a later commit. In the meantime, this patch is sufficient to
support simple SNI values such as constant expressions.

No need to backport.
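
For illustration, a hedged sketch of a server line carrying a constant SNI
for active reverse HTTP (names and address are hypothetical):

   |backend be_edge
   |    server edge1 edge.example.com:443 ssl verify none sni str(edge.example.com)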
2023-11-03 11:11:44 +01:00
Ruei-Bang Chen
7a1ec235cd MINOR: sample: Add fetcher for getting all cookie names
This new fetcher can be used to extract the list of cookie names from
Cookie request header or from Set-Cookie response header depending on
the stream direction. There is an optional argument that can be used
as the delimiter (which is assumed to be the first character of the
argument) between cookie names. The default delimiter is comma (,).

Note that we will treat the Cookie request header as a semi-colon
separated list of cookies and each Set-Cookie response header as
a single cookie and extract the cookie names accordingly.
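
A hedged usage sketch, assuming the fetcher is exposed as req.cook_names and
res.cook_names (the exact sample fetch names should be checked in the
documentation):

   |http-request  set-header X-Req-Cookie-Names %[req.cook_names]
   |http-response set-header X-Res-Cookie-Names %[res.cook_names(-)]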
2023-11-03 09:57:06 +01:00
Christopher Faulet
c72ab1cc6d BUG/MINOR: tcpcheck: Report hexstring instead of binary one on check failure
When an expect rule fails for a tcp-check, information about the expect
rule is dumped in the report. For a check on a binary string, a hexstring is
used in the configuration but the decoded string is dumped. It is a problem
because it can contain special characters. And it is not really handy
because there is no correspondence with the config.

So, now, the hexstring is dumped in the report. This way, we are sure there
is no special characters and it is easy to find it in the configuration.

This patch should solve the issue #2326. It must be backported as far as
2.2.
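
For illustration, a hedged sketch of such a check (hex values are arbitrary);
with this patch, the hexstring below is what gets echoed back in the failure
report:

   |backend be_custom_proto
   |    option tcp-check
   |    tcp-check connect
   |    tcp-check send-binary 50494e470d0a    # "PING\r\n"
   |    tcp-check expect binary 504f4e47      # "PONG"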
2023-10-31 08:02:44 +01:00
William Lallemand
e7bae7a0b6 BUG/MEDIUM: ssl: segfault when cipher is NULL
The patch which fixes the certificate selection uses
SSL_CIPHER_get_id() to skip the SCSV ciphers without checking if cipher
is NULL. This patch fixes the issue by skipping any NULL cipher in the
iteration.

Problem was reported in #2329.

Need to be backported where 23093c72f1 was
backported. No release was made with this patch so the severity is
MEDIUM.
2023-10-30 18:08:16 +01:00
Amaury Denoyelle
47ed1181f2 BUG/MINOR: mux-quic: fix early close if unset client timeout
When no client timeout is defined in the configuration, the QCC timeout task
is never allocated. However, a NULL timeout task is also used as a
criterion in qcc_is_dead() to consider that the MUX instance should be
released, as if the timeout had struck earlier.

This bug causes every connection to be closed by haproxy side with a
CONNECTION_CLOSE. This is notable when using several streams per
connection with only the first stream completed and the others failed.

To fix this, change timeout task allocation policy. It is now always
allocated. This means that if no timeout is defined, it will never be
run. This is not considered a waste of resources, as having no timeout in
the configuration is considered an exceptional case. However, this has the
advantage of simplifying the rest of the code, which can now check for the
task instance without an extra check on the timeout value.

This bug is labelled as minor as it only occurs if no client timeout is
defined, which already reports a warning on startup as it may cause
unexpected behavior.

This bug should be backported up to 2.6.
2023-10-27 17:51:08 +02:00
William Lallemand
23093c72f1 BUG/MINOR: ssl: suboptimal certificate selection with TLSv1.3 and dual ECDSA/RSA
When using TLSv1.3, the signature algorithms extension is used to choose
the right ECDSA or RSA certificate.

However there was an old test for previous version of TLS (< 1.3) which
was testing if the cipher is compatible with ECDSA when an ECDSA
signature algorithm is used. This test was relying on
SSL_CIPHER_get_auth_nid(cipher) == NID_auth_ecdsa to verify if the
cipher is still good.

Problem is, with TLSv1.3, all ciphersuites are compatible with any
authentication algorithm, but SSL_CIPHER_get_auth_nid(cipher) does not
return NID_auth_ecdsa, but NID_auth_any.

Because of this, with TLSv1.3 when both ECDSA and RSA certificates are
available for a domain, the ECDSA one is not chosen in priority.

This patch also introduces a test on the cipher IDs for the signaling
ciphersuites, because they would always return NID_auth_any, and are not
relevant for this selection.

This patch fixes issue #2300.

Must be backported in all stable versions.
2023-10-26 19:17:13 +02:00
Amaury Denoyelle
4a89dba6d5 MEDIUM: quic: count quic_conn for global sslconns
Similar to the previous commit which checks for maxconn before allocating
a QUIC connection, this patch checks for maxsslconn at the same step.
This is necessary as a QUIC connection cannot run without an SSL context.

This should be backported up to 2.6. It relies on the following patch :
  "BUG/MINOR: ssl: use a thread-safe sslconns increment"
2023-10-26 15:35:58 +02:00
Amaury Denoyelle
7735cf3854 MEDIUM: quic: count quic_conn instance for maxconn
Increment actconn and check maxconn limit when a quic_conn is
instantiated. This is necessary because prior to this patch, quic_conn
instances were not counted. Global actconn was only incremented after
the handshake had been completed and the connection structure was
allocated.

The increment is done using increment_actconn() on INITIAL packet
parsing if a new connection is about to be created. If the limit is
reached, the allocation is cancelled and the INITIAL packet is dropped.

The decrement is done under quic_conn_release(). This means that
quic_cc_conn instances are not taken into account. This seems safe
enough because quic_cc_conn are only used for minimal usage.

The counterpart of this change is that maxconn must not be checked a
second time when listener_accept() is done over a QUIC connection. For
this, a new bind_conf flag BC_O_XPRT_MAXCONN is set for listeners when
maxconn is already counted by the lower layer. For the moment, it is
positioned only for QUIC listeners.

Without this patch, the haproxy process could suffer from heavy memory/CPU
load if the number of concurrent handshakes is high.

This patch is not considered a bug fix per se. However, it has a major
benefit to protect against too many QUIC handshakes. As such, it should
be backported up to 2.6. For this, it relies on the following patch :
  "MINOR: frontend: implement a dedicated actconn increment function"
2023-10-26 15:35:56 +02:00
Amaury Denoyelle
350f8b0c07 BUG/MINOR: ssl: use a thread-safe sslconns increment
Each time a new SSL context is allocated, global.sslconns is
incremented. If global.maxsslconn is reached, the allocation is
cancelled.

This procedure was not entirely thread-safe due to the check and
increment operations being conducted at different stages. This could lead
to global.maxsslconn being slightly exceeded when several threads allocate
SSL contexts while sslconns is near the limit.

To fix this, use a CAS operation in a do/while loop. This code is
similar to the actconn/maxconn increment for connection.

A new function increment_sslconn() is defined for this operation. For
the moment, only SSL code is using it. However, it is expected that QUIC
will also use it to count QUIC connections as SSL ones.

This should be backported to all stable releases. Note that prior to the
2.6, sslconns was outside of global struct, so this commit should be
slightly adjusted.
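
For illustration, a minimal C sketch of this check-and-increment pattern using
C11 atomics (haproxy uses its own atomic wrappers; names here are
hypothetical):

  #include <stdatomic.h>

  static atomic_uint sslconns;            /* current number of SSL contexts */
  static unsigned int maxsslconn = 1000;  /* 0 would mean "no limit" */

  /* Returns 1 if a slot was reserved, 0 if the limit was reached. */
  static int try_increment_sslconn(void)
  {
      unsigned int count = atomic_load(&sslconns);

      do {
          if (maxsslconn && count >= maxsslconn)
              return 0;
          /* retry until no other thread changed the counter under us */
      } while (!atomic_compare_exchange_weak(&sslconns, &count, count + 1));

      return 1;
  }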
2023-10-26 15:25:07 +02:00
Amaury Denoyelle
fffd435bbd MINOR: frontend: implement a dedicated actconn increment function
When a new frontend connection is instantiated, the actconn global counter
is incremented. If the global maxconn value is reached, the connection is
cancelled. This ensures that system limits are under control.

Prior to this patch, the atomic check/increment operations were done
directly into listener_accept(). Move them in a dedicated function
increment_actconn() in frontend module. This will be useful when QUIC
connections will be counted in actconn counter.
2023-10-26 15:18:48 +02:00
Amaury Denoyelle
fe29dba872 BUG/MINOR: quic: do not consider idle timeout on CLOSING state
When entering closing state, a QUIC connection is maintained during a
certain delay. The principle is to ensure the other peer has received
the CONNECTION_CLOSE frame. In case of packet duplication/reordering,
CONNECTION_CLOSE is reemitted.

The QUIC RFC recommends using at least 3 times the PTO value. However,
prior to this patch, haproxy used instead the max value between 3 times
the PTO and the connection idle timeout. In the default case, the idle
timeout is set to 30s, which is most of the time much larger than the
PTO. This has the downside of keeping the connection in memory for
too long whereas all resources could be released much earlier.

Fix this behavior by using 3 times the PTO in closing or draining state.
This value is capped at 1s, which ensures that most connections
are covered. If a connection runs with a very high RTT, it must
not impact the whole process and should be released within a reasonable
delay.

This should be backported up to 2.6.
2023-10-26 15:14:36 +02:00
Willy Tarreau
96bb99a87d DEBUG: pools: detect that malloc_trim() is in progress
Now when calling ha_panic() with a thread still under malloc_trim(),
we'll set a new tainted flag to easily report it, and the output
trace will report that this condition happened and will suggest to
use no-memory-trimming to avoid it in the future.
2023-10-25 15:48:02 +02:00
Willy Tarreau
26a6481f00 DEBUG: lua: add tainted flags for stuck Lua contexts
William suggested that since we can detect the presence of Lua in the
stack, let's combine it with stuck detection to set a new pair of flags
indicating a stuck Lua context and a stuck Lua shared context.

Now, executing an infinite loop in a Lua sample fetch function with
yield disabled crashes with tainted=0xe40 if loaded from a lua-load
statement, or tainted=0x640 from a lua-load-per-thread statement.

In addition, at the end of the panic dump, we can check if Lua was
seen stuck and emit recommendations about lua-load-per-thread and
the choice of dependencies depending on the presence of threads
and/or shared context.
2023-10-25 15:48:02 +02:00
Willy Tarreau
46bbb3a33b DEBUG: add a tainted flag when ha_panic() is called
This will make it easier to know that the panic function was called,
for the occasional case where the dump crashes and/or the stack is
corrupted and not much exploitable. Now at least it will be sufficient
to check the tainted value to know that someone called ha_panic(), and
it will also be usable to condition extra analysis.
2023-10-25 15:48:02 +02:00
Aurelien DARRAGON
1822e8998b MINOR: server: add helper function to detach server from proxy list
Remove some code duplication by introducing a basic helper function
to detach a server from its parent proxy. It is supported to call
the function even if the server is not yet listed in the proxy list.

If the server is not yet listed in the proxy, the function will do
nothing. In delete_server(), we previously performed some BUG_ON()
to ensure that the detach always succeeded given that we were certain
that the server was in the proxy list because it was retrieved through
get_backend_server().

However this test is superfluous: we can safely assume that the operation
will always succeed if get_backend_server() returned != NULL (we're under
full thread isolation), and if it's not the case, then we have a bigger
API issue anyway.
2023-10-25 11:59:27 +02:00
Aurelien DARRAGON
e128fc7ce1 BUG/MEDIUM: server: "proto" not working for dynamic servers
In 304672320e ("MINOR: server: support keyword proto in 'add server' cli")
an improper use of the conn_get_best_mux_entry() function was made:

First, server's proxy mode was directly passed as "proto_mode" argument
to conn_get_best_mux_entry(), but this is strictly invalid because while
there is some relationship between proto modes and proxy modes, they
don't use the same storage mechanism and cannot be used interchangeably.

Because of this bug, conn_get_best_mux_entry() would not work at all for
TCP because PR_MODE_TCP equals 0, whereas PROTO_MODE_TCP normally equals 1.

Then another, less sensitive bug, remains:
as its name and description imply, conn_get_best_mux_entry() will try
its best to return something to the user, only using the keyword (mux_proto)
input as a hint to return the most relevant mux within the list of
muxes that are compatible with the proto_side and proto_mode values.

This means that even if mux_proto cannot be found or is not available
with the current proto_side and proto_mode values, conn_get_best_mux_entry()
will most probably fall back to a more generic mux.

However in cli_parse_add_server(), we directly check the result of
conn_get_best_mux_entry() and consider that it will return NULL if the
provided keyword hint for mux_proto cannot be found. This will result in
the function not raising errors as expected, because most of the time, if
the expected proto cannot be found, we'll silently switch to the
fallback one, despite the user providing an explicit proto.

To fix that, we store the result of conn_get_best_mux_entry() to compare
the returned mux proto name with the one we're expecting to get, as it
is originally performed in cfgparse during initial server keyword parsing.

This patch depends on
 - "MINOR: connection: add conn_pr_mode_to_proto_mode() helper func")

It must be backported up to 2.6.
2023-10-25 11:59:27 +02:00
Aurelien DARRAGON
66795bd721 MINOR: connection: add conn_pr_mode_to_proto_mode() helper func
This function allows to safely map proxy mode to corresponding proto_mode

This will allow for easier code maintenance and prevent mixups between
proxy mode and proto mode.
2023-10-25 11:59:27 +02:00
Aurelien DARRAGON
29b76cae47 BUG/MEDIUM: server/log: "mode log" after server keyword causes crash
In 9a74a6c ("MAJOR: log: introduce log backends"), a mistake was made:
it was assumed that the proxy mode was already known during server
keyword parsing in the parse_server() function, but this is wrong.

Indeed, "mode log" can be declared late in the proxy section. Due to this,
a simple config like this will cause the process to crash:

   |backend test
   |
   |  server name 127.0.0.1:8080
   |  mode log

In order to fix this, we relax some checks in _srv_parse_init() and store
the address protocol from str2sa_range() in the server struct, then we set
up a postparsing function that is to be called after config parsing to
finish the server checks/initialization that depend on the proxy mode
to be known. We achieve this by checking the PR_CAP_LB capability from
the parent proxy to know if we're in such case where the effective proxy
mode is not yet known (it is assumed that other proxies which are implicit
ones don't provide this possibility and thus don't suffer from this
constraint).

Only then, if the capability is not found, we immediately perform the
server checks that depend on the proxy mode, else the check is postponed
and it will automatically be performed during postparsing thanks to the
REGISTER_POST_SERVER_CHECK() hook.

Note that we remove the SRV_PARSE_IN_LOG_BE flag because it was introduced
in the above commit and it is no longer relevant.

No backport needed unless 9a74a6c gets backported.
2023-10-25 11:59:27 +02:00
Amaury Denoyelle
f76e94d231 MINOR: backend: refactor insertion in avail conns tree
Define a new function srv_add_to_avail_list(). This function is used to
centralize connection insertion in available tree. It reuses a BUG_ON()
statement to ensure the connection is not present in the idle list.
2023-10-25 10:33:06 +02:00
Amaury Denoyelle
394bd4eb39 BUG/MAJOR: backend: fix idle conn crash under low FD
Since the following commit, idle conns are stored in a list as secondary
storage to retrieve them in usage order :
  5afcb686b9
  MAJOR: connection: purge idle conn by last usage

The list usage has been extended wherever connections lookup are done
both on idle and safe trees. This reduced the code size by replacing a
two tree loops by a single list loop.

LIST_ELEM() is used in this context to retrieve the first idle list
element from the server list head. However, macro usage was wrong due to
an extra '&' operator which returns an invalid connection reference.
This will most of the time cause a crash on conn_delete_from_tree() or
affiliated functions.

This bug only occurs if the FD pool is exhausted and some idle
connections are selected to be killed.

It can be reproduced using the following config and h2load command :
$ h2load -t 8 -c 800 -m 10 -n 800 "http://127.0.0.1:21080/?s=10k"

global
	maxconn 100

defaults
	mode http
	timeout connect 20s
	timeout client  20s
	timeout server  20s

listen li
	bind :21080 proto h2
	server nginx 127.99.0.1:30080 proto h1

This bug has been introduced by the above commit. Thus no need to
backport this fix.

Note that LIST_ELEM() macro usage was also slightly adjusted in
srv_migrate_conns_to_remove(). The function used the toremove_list instead
of the idle_list connection list element. This is not a bug as they are
stored in the same union. However, the new code is clearer as it intends
to move connections from the idle_list only into the toremove_list
mt-list.
2023-10-25 10:30:45 +02:00
Amaury Denoyelle
b9fbbaf2a8 BUG/MINOR: backend: fix wrong BUG_ON for avail conn
Idle connections are both stored in an idle/safe tree and in an idle
list. The list is used as a secondary storage to be able to retrieve
them by usage order.

If a connection is moved into the available tree, it must not be present
in the idle list. A BUG_ON() was written to check this but was placed at
the wrong code section. Fix this by removing the misplaced one and write
new ones for avail_conns tree insertion and lookup.

The impact of this bug is minor as the misplaced BUG_ON() did not seem
to be triggered.

No need to backport.
2023-10-25 10:11:04 +02:00
Tristan
8da0e45382 MINOR: lua: change tune.lua.log.stderr default from 'on' to 'auto'
After making it configurable in the previous commit "MINOR: lua: Add flags
to configure logging behaviour", this patch changes the default value
of tune.lua.log.stderr from 'on' (unconditionally forward LUA logs to
stderr) to 'auto' (only forward LUA logs to stderr if logging via a
standard logger is disabled, or none is configured for the current
context).

Since this is a change in behaviour, it shouldn't be backported.
2023-10-25 07:49:03 +02:00
Tristan
97dacbbb86 MINOR: lua: Add flags to configure logging behaviour
Until now, messages printed from LUA log functions were sent both to
any logger configured for the current proxy and additionally to
stderr (in most cases).

This introduces two flags to configure LUA log handling:
- tune.lua.log.loggers to use standard loggers or not
- tune.lua.log.stderr to use stderr, or not, or only conditionally

This addresses github feature request #2316

This can be backported to 2.8 as it doesn't change previous behaviour.
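
For illustration, a hedged configuration sketch of these tunables (values
shown are examples):

   |global
   |    tune.lua.log.loggers on
   |    tune.lua.log.stderr  auto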
2023-10-25 07:48:48 +02:00
William Lallemand
b12613f0ac BUG/MINOR: ssl: load correctly @system-ca when ca-base is define
The configuration parser still adds the 'ca-base' directory when loading
the @system-ca, preventing it from being loaded correctly.

This patch fixes the problem by not adding the ca-base when a file
starts with '@'.

Fix issue #2313.

Must be backported as far as 2.6.
2023-10-23 22:03:55 +02:00
Willy Tarreau
380f115a4a BUG/MINOR: mux-h2: update tracked counters with req cnt/req err
Originally H2 would transfer everything to H1 and parsing errors were
handled there, so that if there was a track-sc rule in effect, the
counters would be updated as well. As we started to add more and more
HTTP-compliance checks at the H2 layer, then switched to HTX, we
progressively lost this ability. It's a bit annoying because it means
we will not maintain accurate error counters for a given source, for
example.

This patch adds the calls to session_inc_http_req_ctr() and
session_inc_http_err_ctr() when needed (i.e. when failing to parse
an HTTP request since all other cases are handled by the stream),
just like mux-h1 does. The same should be done for mux-h3 by the
way.

This can be backported to recent stable versions. It's not exactly a
bug, rather a missing feature in that we had never updated this counter
for H2 till now, but it does make sense to do it especially based on
what the doc says about its usage.
2023-10-20 21:09:12 +02:00
Willy Tarreau
250b630fb9 BUG/MINOR: mux-h2: commit the current stream ID even on reject
The H2 spec says that a HEADERS frame turns an idle stream to the open
state, and it may then turn to half-closed(remote) on ES, then to close,
all at once, if we respond with RST (e.g. on error). Due to the fact that
we process a complete frame at once since h2_dec_hdrs() may reassemble
CONTINUATION frames until everything is complete, the state was only
committed after the frame was completely valid (otherwise multiple passes
could result in subsequent frames being rejected as the stream ID would
be equal to the highest one).

However this is not correct because it means that a client may retry on
the same ID as a previously failed one, which technically is forbidden
(for example the client couldn't know which of them a WINDOW_UPDATE or
RST_STREAM frame is for).

In practice, due to the error paths, this would only be possible when
failing to decode HPACK while leaving the HPACK stream intact, thus
when the valid decoded HPACK stream cannot be turned into a valid HTTP
representation, e.g. when the resulting headers are too large for example.
The solution to avoid this consists in committing the stream ID on this
error path as well. h2spec continues to be happy.

Thanks to Annika Wickert and Tim Windelschmidt for reporting this issue.

This fix must be backported to all stable versions.
2023-10-20 21:09:12 +02:00
Willy Tarreau
08f3bb5bd5 MINOR: mux-h2/traces: clarify the "rejected H2 request" event
In h2_frt_handle_headers() all failures lead to a generic message saying
"rejected H2 request". It's quite inexpressive while there are a few
distinct tests that are made before jumping there:

  - trailers on closed stream
  - unparsable request
  - refused stream

Let's emit the traces from these call points instead so that we get more
info about what happened. Since these are user-level messages, we take
care of keeping them aligned as much as possible.

For example before it would say:

  [04|h2|1|mux_h2.c:2859] rejected H2 request : h2c=0x7f5d58036fd0(F,FRE)
  [04|h2|5|mux_h2.c:2860] h2c_frt_handle_headers(): leaving on error : h2c=0x7f5d58036fd0(F,FRE) dsi=1 h2s=0x9fdb60(0,CLO)

And now it says:

  [04|h2|1|mux_h2.c:2817] rcvd unparsable H2 request : h2c=0x7f55f8037160(F,FRH) dsi=1 h2s=CLO
  [04|h2|5|mux_h2.c:2875] h2c_frt_handle_headers(): leaving on error : h2c=0x7f55f8037160(F,FRE) dsi=1 h2s=CLO
2023-10-20 21:09:12 +02:00
Willy Tarreau
1deac6f99a MINOR: mux-h2/traces: explicitly show the error/refused stream states
Sometimes it's unclear whether a stream is still open or closed when
certain traces are emitted, for example when the stream was refused,
because the reported pointer and ID in fact correspond to the refused
stream. And for closed streams, no pointer/name is printed, leaving
some confusion about the state. This patch makes the situation easier
to analyse by explicitly reporting "h2s=CLO" on closed/error/refused
streams so that we don't waste time comparing pointers and we instantly
know the stream is closed. Now instead of emitting:

   [03|h2|5|mux_h2.c:2874] h2c_frt_handle_headers(): leaving on error : h2c=0x7fdfa8026820(F,FRE) dsi=201 h2s=0x9fdb60(0,CLO)

It will emit:

   [03|h2|5|mux_h2.c:2874] h2c_frt_handle_headers(): leaving on error : h2c=0x7fdfa8026820(F,FRE) dsi=201 h2s=CLO
2023-10-20 21:09:12 +02:00
Jens Popp
f66b9f6018 MINOR: sample: Added support for Arrays in sample_conv_json_query in sample.c
The method now returns the content of JSON arrays if it is specified in
the JSON path as String. The start and end character is a square bracket.
Any complex object in the array is returned as JSON, so you might get
arrays of arrays or objects. This is only recommended for arrays of simple
types (e.g., String or int), which will be returned as a CSV String. Also
updated the documentation and fixed an issue with parentheses and other
changes from review comments.

This patch was discussed in issue #2281.
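
A hedged usage sketch, assuming a request body such as {"ids":[1,2,3]}
(proxy, variable and header names are arbitrary):

   |frontend fe_api
   |    bind :8080
   |    mode http
   |    option http-buffer-request
   |    http-request set-var(txn.ids) req.body,json_query('$.ids')
   |    http-request set-header X-Ids %[var(txn.ids)]   # "1,2,3" as a CSV string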

Signed-off-by: William Lallemand <wlallemand@haproxy.com>
2023-10-20 18:42:05 +02:00
Amaury Denoyelle
f70cf28539 MINOR: listener: forbid most keywords for reverse HTTP bind
Reverse HTTP bind is very specific in that it relies on a server to
initiate the connection. All connection settings are defined on the server
line and ignored from the bind line.

Before this patch, most keywords were silently ignored. This could
result in a configuration doing unexpected things from the user's
point of view. To improve this situation, add a new 'rhttp_ok' field in
the bind_kw structure. If not set, the keyword is forbidden on a reverse
bind line and will cause a fatal config error.

For the moment, only the following keywords are usable with reverse bind
'id', 'name' and 'nbconn'.

This change is safe as it's already forbidden to mix reverse and
standard addresses on the same bind line.
2023-10-20 17:28:08 +02:00
Amaury Denoyelle
e05edf71df MINOR: cfgparse: rename "rev@" prefix to "rhttp@"
'rev@' was used to specify a bind/server used with reverse HTTP
transport. This notation was deemed not explicit enough. Rename it
'rhttp@' instead.
2023-10-20 14:44:37 +02:00
Amaury Denoyelle
9d4c7c1151 MINOR: server: convert @reverse to rev@ standard format
Remove the recently introduced '@reverse' notation for HTTP reverse
servers. Instead, reuse the 'rev@' prefix already defined for bind
lines.
2023-10-20 14:44:37 +02:00
Amaury Denoyelle
3222047a14 MINOR: listener: add nbconn kw for reverse connect
Previously, the maxconn keyword was reused for a specific usage on reverse
HTTP binds to specify the number of active connects to perform. To avoid
confusion, introduce a new dedicated keyword 'nbconn' which is specific
to reverse HTTP binds.

This new keyword is forbidden for non-reverse listeners. A fatal error is
emitted during config parsing if this rule is not respected. It's safe
because it's also forbidden to mix standard and reverse addresses on the
same bind line.

Internally, the nbconn value will be reassigned to the 'maxconn' member of
the bind_conf structure. This ensures that the listener layer will
automatically re-enable the preconnect task each time a connection is closed.
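
For illustration, a hedged sketch of a reverse HTTP bind using this keyword
(backend/server names are hypothetical; see the reverse HTTP documentation for
the full setup):

   |backend be_edge
   |    server edge1 edge.example.com:443 ssl verify none
   |
   |frontend fe_rev
   |    # pre-establish 4 connections via be_edge/edge1, then reverse them
   |    # so they serve incoming requests on this frontend
   |    bind rhttp@be_edge/edge1 nbconn 4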
2023-10-20 14:44:37 +02:00
Amaury Denoyelle
37d7e52cc6 MINOR: cfgparse: forbid mixing reverse and standard listeners
Reverse HTTP listeners are very specific and share only a very limited
subset of keywords with other listeners. As such, it is probably
meaningless to mix standard and reverse addresses on the same bind line.
This patch emits a fatal error during configuration parsing if this is
the case.
2023-10-20 14:44:37 +02:00
Christopher Faulet
60e7116be0 BUG/MEDIUM: peers: Fix synchro for huge number of tables
The number of updates sent at once was limited to not loop too long to emit
updates when the buffer size is huge or when the number of sync tables is
huge. The limit can be configured and is set to 200 by default. However,
this fix introduced a bug. It is impossible to synchronize two peers if the
number of tables is higher than this limit. Thus by default, it is not
possible to sync two peers if there are more than 200 tables to sync.

Technically speaking, a teaching process is finished if we loop on all tables
with no new update messages sent. Because we are limited at each call, the loop
is split across several calls. However the restart point for the next loop is
always the last table for which we emitted an update message. Thus with more
tables than the limit, the loop never reaches the end point.

Worse, in conjunction with the bug fixed by "BUG/MEDIUM: peers: Be sure to
always refresh recconnect timer in sync task", it is possible to trigger the
watchdog because the applets may be woken up in a loop and leave while
requesting more room even though their buffer is empty.

To fix the issue, restart conditions for a teaching loop were changed. If
the teach process is interrupted, we now save the restart point, called
stop_local_table. It is the last evaluated table on the previous loop. This
restart point is reset when the teach process is finished.

In addition, the updates_sent variable in peer_send_msgs() was renamed to
updates to avoid ambiguity. Indeed, the variable is incremented whether
messages were sent or not.

This patch must be backported as far as 2.6.
2023-10-20 14:32:12 +02:00
Christopher Faulet
cebeab3d20 BUG/MEDIUM: peers: Be sure to always refresh recconnect timer in sync task
A sync task, used to manage reconnects, session creation or shutdown and data
synchronization, is responsible for refreshing the reconnect and heartbeat
timers for each remote peer and triggering applet wakeups. These timers are
used to refresh the sync task timer itself. Thus it is important to take care
to always properly refresh them.

However, when there are some data to push, the reconnect timer is not
checked. It may be expired and not refreshed. In this case, an expired timer
may be used for the sync task, leading to a storm of wakeups. The sync task
is woken up in a loop because its timer is in the past, waking up Peer
applets each time.

To fix the issue, the peer's reconnect timer is now refreshed to the default
reconnect timeout, if necessary, when there are some data to push.

This patch must be backported to all stable versions.
2023-10-19 15:26:43 +02:00
Willy Tarreau
f08322b56c BUG/MINOR: trace: fix trace parser error reporting
Since traces were adapted to support being declared in the global section
in 2.7 with commit c11f1cdf4 ("MINOR: trace: split the CLI "trace" parser
in CLI vs statement"), the method used to return the error message was
unreliable. For example an invalid sink name in the global section would
produce:

  [ALERT]    (26685) : config : parsing [test-trace.cfg:51] : 'trace': No such sink
  [ALERT]    (26685) : config : parsing [test-trace.cfg:51] : (null)
  [ALERT]    (26685) : config : Error(s) found in configuration file : test-trace.cfg
  [ALERT]    (26685) : config : Fatal errors found in configuration.

The reason is that the trace is emitted manually using ha_error() in
cfg_parse_trace() and -1 is returned without setting the message, and
the caller also prints the empty message. That's quite awkward given
that the API originally comes from the CLI which does support dynamic
strings and that config keywords do as well.

This commit modifies both cli_parse_trace() and cfg_parse_trace() to
return a dynamically allocated message instead, and adapts the central
function trace_parse_statement() to do the same, replacing a few direct
assignments with strdup() or memprintf(). This way the alert is no
longer emitted by the parser function, it just passes the message to
the caller.

A few of the static messages switching to memprintf() also took this
opportunity to report the faulty word:

  [ALERT]    (26772) : config : parsing [test-trace.cfg:51] : No such trace sink 'stduot'
  [ALERT]    (26772) : config : Error(s) found in configuration file : test-trace.cfg
  [ALERT]    (26772) : config : Fatal errors found in configuration.

This may be backported to 2.8 and 2.7.
2023-10-19 14:45:07 +02:00
Willy Tarreau
3dd963b35f BUG/MINOR: mux-h2: fix http-request and http-keep-alive timeouts again
Stefan Behte reported that since commit f279a2f14 ("BUG/MINOR: mux-h2:
refresh the idle_timer when the mux is empty"), the http-request and
http-keep-alive timeouts don't work anymore on H2. Before this patch,
and since 3e448b9b64 ("BUG/MEDIUM: mux-h2: make sure control frames do
not refresh the idle timeout"), they would only be refreshed after stream
frames were sent (HEADERS or DATA) but the patch above that adds more
refresh points broke these so they don't expire anymore as long as
there's some activity.

We cannot just revert the fix since it also addressed an issue by which
sometimes the timeout would trigger too early and provoke truncated
responses. The right approach here is in fact to only refresh the
idle timer when the mux buffer was flushed of any such stream frames.

In order to achieve this, we're now setting a flag on the connection
whenever we write a stream frame, and we consider that flag when deciding
to refresh the idle timer after the buffer is emptied. This way we'll only clear that
flag once the buffer is empty and there were stream data in it, not if
there were no such stream data. In theory it remains possible to leave
the flag on if some control data is appended after the buffer and it's
never cleared, but in practice it's not a problem as a buffer will always
get sent in large blocks when the window opens. Even a large buffer should
be emptied once in a while as control frames will not fill it as much as
data frames could.

Given the patch above was backported as far as 2.6, this patch should
also be backported as far as 2.6.
2023-10-18 17:17:58 +02:00
Willy Tarreau
91ed52976c MINOR: dgram: allow to set rcv/sndbuf for dgram sockets as well
tune.rcvbuf.client and tune.rcvbuf.server are not suitable for shared
dgram sockets because they're per connection so their units are not the
same. However, QUIC's listener and log servers are not connected and
take per-thread or per-process traffic where a socket log buffer might
be too small, causing undesirable packet losses and retransmits in the
case of QUIC. This essentially manifests in listener mode with new
connections taking a lot of time to set up under heavy traffic due to
the small queues causing delays. Let's add a few new settings allowing
these shared socket sizes to be set on the frontend and backend side
(which serves as a reminder that these are per-front/back and not
per-client/server, hence not per connection).
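
A hedged sketch, assuming the new settings follow the existing
tune.rcvbuf/tune.sndbuf naming with frontend/backend variants (the exact
keyword names should be checked in the documentation):

   |global
   |    tune.rcvbuf.frontend 1048576   # shared datagram sockets, frontend side
   |    tune.sndbuf.frontend 1048576
   |    tune.rcvbuf.backend   262144   # shared datagram sockets, backend side
   |    tune.sndbuf.backend   262144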
2023-10-18 17:01:19 +02:00
Christopher Faulet
203211f4cb REORG: stconn/muxes: Rename init step in fast-forwarding
Instead of speaking of an initialisation stage for each data
fast-forwarding, we now use the negotiate term. Thus init_ff/init_fastfwd
functions were renamed nego_ff/nego_fastfwd.
2023-10-18 12:46:55 +02:00
Christopher Faulet
d6d4abdc31 BUILD: mux-h1: Fix build without kernel splicing support
Data fast-forwarding does not build without the kernel splicing support
because counters about splicing don't exist. To make the code more readable,
all code about splicing is disabled if kernel splicing is not supported.
2023-10-18 12:43:38 +02:00
Christopher Faulet
023564b685 MINOR: global: Add an option to disable the zero-copy forwarding
The zero-copy forwarding, or mux-to-mux forwarding, is a way to
fast-forward data without using the channel buffers. Data are transferred
from one mux to the other one. The kernel splicing is an optimization of the
zero-copy forwarding. But it can also use normal buffers (though not channel
ones). This way, it becomes possible to fast-forward data with muxes not
supporting the kernel splicing (H2 and H3 muxes) but also with applets.

However, this mode can introduce regressions or bugs in the future (just like
the kernel splicing). Thus, it could be useful to disable this optimization.
To do so, the global tune setting 'tune.disable-zero-copy-forwarding' may be
set in a global section, or the '-dZ' command line parameter may be used to
start HAProxy. Of course, this also disables the kernel splicing.
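
For illustration, a minimal configuration sketch; starting HAProxy with '-dZ'
is the command-line equivalent described above:

   |global
   |    tune.disable-zero-copy-forwarding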
2023-10-17 18:51:13 +02:00
Christopher Faulet
ec22d3102d MEDIUM: mux-pt: Add fast-forwarding support
The PT multiplexer now implements callback functions to produce and consume
fast-forwarded data. Only splicing is supported because the mux-pt does not
use its own buffers.
2023-10-17 18:51:13 +02:00
Christopher Faulet
169df3b3a8 CLEAN: mux-h1: Remove useless __maybe_unused attribute on h1_make_chunk()
This attribute was added during the dev stage. But it is useless now that
the function is used. So, just remove it.
2023-10-17 18:51:13 +02:00
Christopher Faulet
322d660d08 MINOR: tree-wide: Only rely on co_data() to check channel emptyness
Because the channel_is_empty() function now only checks the channel's
buffer, we can remove it and rely on co_data() instead. Of course, all tests
must be inverted.

channel_is_empty() is thus removed.
2023-10-17 18:51:13 +02:00
Christopher Faulet
20c463955d MEDIUM: channel: don't look at iobuf to report an empty channel
It is important to split channels and I/O buffers. When data are pushed in
an I/O buffer, we consider them as forwarded. The channel never sees
them. Fast-forwarded data are now handled in the SE only.
2023-10-17 18:51:13 +02:00
Christopher Faulet
11c05c516a MEDIUM: mux-h2: Add consumer-side fast-forwarding support
The H2 multiplexer now implements callbacks to consume fast-forwarded
data. It is the most useful case: an H2 client getting data from an H1
server. It is also the easiest case to implement. The producer side is
trickier because of multiplexing. It is not obvious this case would be
improved with data fast-forwarding.
2023-10-17 18:51:13 +02:00
Christopher Faulet
eb346074bb MINOR: h2: Set the BODYLESS_RESP flag on the HTX start-line if necessary
When message headers are parsed and an HTX start-line is created, if we
detect the response must not have any payload, a specific flag must be set
on the HTX start-line. It happens for instance for responses to HEAD
requests. This flag is used by the multiplexers to know that the response
payload, if any, must be silently skipped.

This was not performed when h2 HEADERS frames were decoded. This HTX flag
was specifically added to fix a bug when splicing is in use. Thus the H2
multiplexer was not concerned. Because mux-to-mux fast-forwarding will
be introduced, it is important to handle this flag in the H2 multiplexer too.
2023-10-17 18:51:13 +02:00
Christopher Faulet
2d80eb5b7a MEDIUM: mux-h1: Add fast-forwarding support
The H1 multiplexer now implements callback functions to produce and consume
fast-forwarded data.
2023-10-17 18:51:13 +02:00
Christopher Faulet
2db273a7b5 MEDIUM: mux-h1: Simplify payload formatting based on HTX blocks on sending path
Just like for the zero-copy, this patch tries to simplify the code
responsible for formatting the message payload before sending it. But here,
we take care to simplify the loop on the HTX blocks. The result should be
less error-prone.
2023-10-17 18:51:13 +02:00
Christopher Faulet
129787fb00 MEDIUM: mux-h1: Simplify zero-copy on sending path
In h1_make_data(), the function responsible for formatting the message
payload before sending it, the code dealing with zero-copy was slightly
simplified (at least for me :).

There is no real change but there is a better split between messages with a
content-length and chunked messages.
2023-10-17 18:51:13 +02:00
Christopher Faulet
6dff013fad MINOR: mux-h1: Add function to add size of a chunk to an outgoind message
This function should be used to send the chunk size, before appending the
chunk payload. It also takes care to add a CRLF to finish a previous chunk,
if necessary. This function will be used to fix the splicing for re-chunked
responses with an unknown length.
2023-10-17 18:51:13 +02:00
Christopher Faulet
91f1c5519a MEDIUM: raw-sock: Specifiy amount of data to send via snd_pipe callback
When data were sent using the kernel splicing, we tried to send all data
with no restriction. Most of the time this is valid. However, because the
payload representation may differ between the producer and the consumer, it
is important to be able to specify how much data to send via the splicing.

Of course, for performance reasons, it is important to maximize the amount of
data sent via splicing at each call. However, on edge cases, this can now be
limited.
2023-10-17 18:51:13 +02:00
Christopher Faulet
d57a66d63a MEDIUM: mux-h1: Properly handle state transitions of chunked outgoing messages
On the sending path, there are 3 states for chunked payload in H1:

  * H1_MSG_CHUNK_SIZE: the chunk size must be emitted
  * H1_MSG_CHUNK_CRLF: the end of the chunk must be emitted
  * H1_MSG_DATA: Chunked data must be emitted

However, some shortcuts were used on the sending path to avoid some
transitions. Especially, outgoing messages were never switched to the
H1_MSG_CHUNK_SIZE state.

However, it will be necessary to properly handle all transitions on the payload
to implement mux-to-mux forwarding, to be sure to always know when the chunk
size or the end of the chunk must be emitted.
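
As an illustration, a hedged mapping of these states onto the wire format of a
chunked body (annotations only, not taken from the code):

  5\r\n          <- H1_MSG_CHUNK_SIZE: the chunk size line must be emitted
  hello          <- H1_MSG_DATA: the chunk payload
  \r\n           <- H1_MSG_CHUNK_CRLF: the CRLF ending the chunk
  0\r\n\r\n      <- last chunk (size 0) terminating the message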
2023-10-17 18:51:13 +02:00
Christopher Faulet
117f9cc017 MINOR: mux-h1: Use HTX extra field only for responses with known length
For now, it is not an issue, but it is safer to explicitly ignore the HTX
extra field for responses with unknown length. This will be mandatory for
future fixes, to be able to re-chunk responses with an unknown length.
2023-10-17 18:51:13 +02:00
Christopher Faulet
799518e63f MEDIUM: stconn: Add mux-to-mux fast-forward support
Now that the kernel splicing support has been removed, we can add mux-to-mux
fast-forward support. Of course, the splicing support will be reintroduced
in the muxes themselves but this will be transparent.

Changes are mainly located into sc_conn_recv() and sc_conn_send().
2023-10-17 18:51:13 +02:00
Christopher Faulet
a500899601 MINOR: mux-h1: Temporarily remove splicing support
Because the kernel splicing support was removed from the stconn, it is
useless to keep it in muxes. In this patch, we remove the kernel splicing
support from the H1 multiplexer. It will be replaced by the mux-to-mux data
fast-forwarding.
2023-10-17 18:51:13 +02:00
Christopher Faulet
02ed7c0d0f MINOR: mux-pt: Temporarily remove splicing support
Because the kernel splicing support was removed from the stconn, it is
useless to keep it in muxes. In this patch, we remove the kernel splicing
support from the passthough multiplexer. It will be replaced by the
mux-to-mux data fast-forwarding.
2023-10-17 18:51:13 +02:00
Christopher Faulet
8b89fe3d8f MINOR: stconn: Temporarily remove kernel splicing support
mux-to-mux fast-forwarding will be added. To avoid mixing it with the splicing
and to simplify the commits, the kernel splicing support is removed from the
stconn. The CF_KERN_SPLICING flag is removed and the support is no longer tested
in process_stream().

In the stconn part, the rcv_pipe() callback function is no longer called.

Reg-tests scripts testing the kernel splicing are temporarily marked as
broken.
2023-10-17 18:51:13 +02:00
Christopher Faulet
1d68bebb70 MINOR: stconn: Extend iobuf to handle a buffer in addition to a pipe
It is unused for now, but the iobuf structure now owns a pointer to a
buffer. This buffer will be used to perform mux-to-mux fast-forwarding when
splicing is not supported or unusable. This pointer should be filled by an
endpoint to let the opposite one forward data.

Extra fields, in addition to the buffer, are mandatory because the buffer
may already contain some data. The ".offset" field may be used
as the position to start to copy data. Finally, the amount of data copied in
this buffer must be saved in the ".data" field.

Some flags are also added to prepare next changes. And helper stconn
functions are updated to also count data in the buffer. For a first
implementation, it is not planned to handle data in the buffer and in the
pipe at the same time. But it will be possible to do so.
2023-10-17 18:51:13 +02:00
Christopher Faulet
e52519ac83 MINOR: stconn: Start to introduce mux-to-mux fast-forwarding notion
Instead of talking about kernel splicing at stconn/sedesc level, we now try
to talk about mux-to-mux fast-forwarding. To do so, 2 functions were added
to know if there are fast-forwarded data and to retrieve this amount of
data. Of course, for now, there is only data in a pipe.

In addition, some flags were renamed to reflect this notion. Note the
channel's documentation was not updated yet.
2023-10-17 18:51:13 +02:00
Christopher Faulet
8bee0dcd7d MEDIUM: stconn/channel: Move pipes used for the splicing in the SE descriptors
The pipes used to put data when the kernel splicing is in use are moved into
the SE descriptors. For now, it is just a simple replacement but there is a
major difference with the pipes in the channel. The data are now pushed into
the consumer's pipe while they used to be pushed into the producer's pipe. It
means the request data are now pushed into the pipe of the backend SE
descriptor and response data are pushed into the pipe of the frontend SE
descriptor.

The idea is to hide the pipe from the channel/SC side and to be able to
handle fast-forwarding in a pipe but also in a buffer. To do so, the pipe is
inside a new entity, called iobuf. This entity will be extended.
2023-10-17 18:51:13 +02:00
Christopher Faulet
1fdfa4f9ba BUG/MEDIUM: mux-h2: Don't report an error on shutr if a shutw is pending
If a shutw is blocked because the mux is full or busy, we must defer the
shutr. In this case, the H2 stream is not in H2_SS_CLOSED state because the
shutw is also deferred. If the shutr is performed, this will lead to an
error.

Concretely, when the mux is unblocked, a RST_STREAM is sent while in some
cases, an empty DATA frame with the ES flag set could be sent.

This patch should be backported to all stable versions.
2023-10-17 18:51:13 +02:00
Christopher Faulet
d0b04920d1 BUG/MINOR: http-ana/stats: Specify that HTX redirect messages have a C-L header
Redirect responses sent during the HTTP analysis have no payload. However
there is still a "Content-Length" header. It is important to set the
corresponding flag on the HTX start-line to be sure to preserve this header
when the response is sent to the client. The same is true with the stats
applet, when it returns a redirect response.

It is especially important because we now ignore in-flight modifications of
"Content-Length" or "Transfer-Encoding" headers without updating the HTX
start-line flags.

This patch may be backported to all stable versions but it is probably
useless because only the 2.9-dev is affected by the bug.
2023-10-17 18:11:04 +02:00
Christopher Faulet
e9f6e8e7f6 BUG/MEDIUM: mux-h1: do not forget TLR/EOT even when no data is sent
Since commit 723c73f8a ("MEDIUM: mux-h1: Split h1_process_mux() to make code
more readable"), outgoing H1 chunked messages with no data at all get
delayed by 200ms. It is due to the fact that we end processing too early and
we don't have the opportunity to process trailers in this case.

This fix addresses it by verifying whether it is required to emit the EOT or
trailers, if any, when returning from h1_make_data().

No backport is needed, this was in 2.9-dev.
2023-10-17 18:11:04 +02:00
Christopher Faulet
2f9db80cc6 CLEANUP: hlua: Remove dead-code on error path in hlua_socket_new()
Since last fixes about the lua cosocket, the appctx is no longer initialized
in hlua_socket_new(). The code to deal with error at this stage can be
removed.

This patch should fix the issue #2308.
2023-10-17 18:11:04 +02:00
Willy Tarreau
4070e4042a BUG/MEDIUM: quic_conn: let the scheduler kill the task when needed
The two timer handlers qc_process_timer() and qc_idle_timer_task() would
inadvertently return NULL when they don't want to be requeued, instead
of just returning the task itself. The effect of returning NULL for the
scheduler is that it considers the task as freed, so it must not touch
it anymore. As such, the TASK_F_RUNNING flag is never removed from these
tasks, and when quic_conn_release() later tries to release these tasks
using task_destroy(), the latter sees the RUNNING flag and just sets
->process to NULL, hoping that the scheduler will kill them on return,
but they are no longer being executed so this never happens and they are
leaked.

Interestingly, this doesn't seem to happen as much when multi-queue is
set to off, but it's likely because the tasks are being replaced and the
first ones have already been woken up and leaked, while the latter might
only trigger on a timeout or timer renewal.

This should address github issue #2310. Thanks to @hpn0t0ad for the
numerous traces that helped understand this sequence.

This must be backported to 2.7 at least, and adapted for 2.6
(qc_idle_timer_task must return t there).
2023-10-17 17:14:06 +02:00
Willy Tarreau
5714aff4a6 DEBUG: pool: store the memprof bin on alloc() and update it on free()
When looking at "show pools", it's often difficult to know which alloc()
corresponds to which free() since it's not often 1:1. But sometimes we
have all elements available to maintain a link between alloc and free.
Indeed, when the caller is recorded in the allocated area, we can store
the pointer to the just created bin instead of the caller address itself,
since the caller address is already in the memprof bin. By doing so, we
permit the pool_free() call to locate the allocator bin and update its
free count when caller tracing is enabled. This for example allows to
produce outputs like this on "show profiling" and a process started with
-dMcaller:

  1391967  1391968  22805987328  22806003712|  0x59f72f process_stream+0x19f/0x3a7a p_alloc(0) [delta=-16384] [pool=buffer]
  1391936  1391937  22805479424  22805495808|  0x6e1476 task_run_applet+0x426/0xea2 p_alloc(0) [delta=-16384] [pool=buffer]
  1391925  1391925  22805299200  22805299200|  0x58435a main+0xdf07a p_alloc(0) [delta=0] [pool=buffer]
        0  2087930            0  34208645120|  0x59b519 stream_release_buffers+0xf9/0x110 p_free(-16384) [pool=buffer]
   695993   695992  11403149312  11403132928|  0x66018f main+0x1baeaf p_alloc(0) [delta=16384] [pool=buffer]
        0  1391957            0  22805823488|  0x59b47c stream_release_buffers+0x5c/0x110 p_free(-16384) [pool=buffer]
   695968   695970  11402739712  11402772480|  0x587b85 h1_io_cb+0x9a5/0xe7c p_alloc(0) [delta=-32768] [pool=buffer]
        0  1391923            0  22805266432|  0x57f388 main+0xda0a8 p_free(-16384) [pool=buffer]
   695959   695960  11402592256  11402608640|  0x586add main+0xe17fd p_alloc(0) [delta=-16384] [pool=buffer]
        0   695978            0  11402903552|  0x59cc58 stream_free+0x178/0x9ea p_free(-16384) [pool=buffer]
(...)

Here it's quickly visible that all of them got properly released.
2023-10-17 17:13:56 +02:00
Willy Tarreau
68d02e5fa9 BUG/MINOR: mux-h2: make up other blocked streams upon removal from list
An interesting issue was met when testing the mux-to-mux forwarding code.

In order to preserve fairness, in h2_snd_buf() if other streams are waiting
in send_list or fctl_list, the stream that is attempting to send also goes
to its list, and will be woken up by h2_process_mux() or h2_send() when
some space is released. But on rare occasions, there are only a few (or
even a single) streams waiting in this list, and these streams are just
quickly removed because of a timeout or a quick h2_detach() that calls
h2s_destroy(). In this case there's no event to wake up the other waiting
stream in its list, and this will possibly resume processing after some
client WINDOW_UPDATE frames or even new streams, so usually it doesn't
last too long and is not very noticeable, which is why it was left like
this for so long. In addition, measures have shown that in heavy network-bound
benchmark, this exact situation happens on less than 1% of the streams
(reached 4% with mux-mux).

The fix here consists in replacing these LIST_DEL_INIT() calls on
h2s->list with a function call that checks if other streams were queued
to the send_list recently, and if so, which also tries to resume them
by calling h2_resume_each_sending_h2s(). The detection of late additions
is made via a new flag on the connection, H2_CF_WAIT_INLIST, which is set
when a stream is queued due to other streams being present, and which is
cleared when this function is called.

It is particularly difficult to reproduce this case which is particularly
timing-dependent, but in a constrained environment, a test involving 32
conns of 20 streams each, all downloading a 10 MB object previously
showed a limitation of 17 Gbps with lots of idle CPU time, and now
filled the cable at 25 Gbps.

This should be backported to all versions where it applies.
2023-10-17 16:43:44 +02:00
Vladimir Vdovin
70d2d9aefc MINOR: support for http-response set-timeout
Added set-timeout action for http-response. Adapted reg-tests and
documentation.
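For illustration only, assuming the new action mirrors the syntax of the
existing "http-request set-timeout" action (the backend, server and condition
below are made up):

  backend be
        mode http
        # stretch the server timeout for responses flagged as slow to complete
        http-response set-timeout server 30s if { status 202 }
        server s1 127.0.0.1:8000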
2023-10-17 08:27:33 +02:00
Christopher Faulet
7629e82c6e BUG/MINOR: mux-h1: Send a 400-bad-request on shutdown before the first request
Except if we must silently ignore empty connections by enabling
http-ignore-probes or dontlognull options, when a client connection is
closed before the first request, a 400-bad-request response must be sent
with the corresponding log message. However, that is broken since the commit
fc473a6453 ("MEDIUM: mux-h1: Rely on the H1C to deal with shutdown for
reads").

The bug is subtle. Parsing errors are no longer reported on connection errors
before the first request while they should be.

This patch must be backported where the above commit is (as far as 2.7).
2023-10-13 17:16:43 +02:00
Christopher Faulet
2a51d5b6ea BUG/MEDIUM: applet: Report a send activity everytime data were sent
In the same way than for stream-connectors (see "BUG/MEDIUM: stconn: Report
a send activity everytime data were sent" for details), we now report a send
activity everytime something was consumed by an applet, even if some output
data remains blocked into the channel's buffer.

This patch must be backported to 2.8.
2023-10-13 10:35:32 +02:00
Christopher Faulet
3083fd90e1 BUG/MEDIUM: stconn: Report a send activity everytime data were sent
When read/write timeouts were refactored in 2.8, we decided to change when a
send activity had to be reported. Before, every time some data were sent, a
send activity was reported. At this time, the channel's wex timer was
updated. During the refactoring, we decided to limit send activity to sends
that empty the channel's buffer, consuming all outgoing data. The idea behind
this change was to protect haproxy against clients consuming data very
slowly.

However, it is too strict. Some congested muxes but still active can hit the
client or the server timeout. It seems a bit unfair. It is especially
visible with QUIC/H3 but it is probably also possible with H2 if the window
size is small.

The best option is to restore the old behavior.

This patch must be backported to 2.8.
2023-10-13 10:35:32 +02:00
Aurelien DARRAGON
94d0f77deb MINOR: server: introduce "log-bufsize" kw
"log-bufsize" may now be used for a log server (in a log backend) to
configure the bufsize of implicit ring associated to the server (which
defaults to BUFSIZE).
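Purely illustrative sketch (server name, address and size are arbitrary):

  backend mysyslog
        mode log
        server s1 udp@192.168.0.1:514 log-bufsize 65536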
2023-10-13 10:05:07 +02:00
Aurelien DARRAGON
b30bd7adba MEDIUM: log/balance: support for the "hash" lb algorithm
hash lb algorithm can be configured with the "log-balance hash <cnv_list>"
directive. With this algorithm, the user specifies a converter list with
<cnv_list>.

The produced log message will be passed as-is to the provided converter
list, and the resulting hash will be used to select the log server that
will receive the log message.
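A hypothetical example (the converter list and addresses are arbitrary):

  backend mylog
        mode log
        log-balance hash lower,crc32
        server s1 udp@10.0.0.1:514
        server s2 udp@10.0.0.2:514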
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
7251344748 MINOR: sample: add sample_process_cnv() function
split sample_process() in 2 parts in order to be able to only process
the converter part of a sample expression from an existing input sample
struct passed as parameter.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
08767e162d MINOR: lbprm: compute the hash avalanche in gen_hash()
Instead of systematically computing the avalanche hash right after the
gen_hash() call, do it inside the gen_hash() function directly to ensure
avalanche setting is always considered.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
a7563158f7 MINOR: lbprm: support for the "none" hash-type function
Allow the use of the "none" hash-type function so that the key resulting
from the sample expression is directly used as the hash.

This can be useful to do the hashing manually using available hashing
converters, or even custom ones, and then inform haproxy that it can
directly rely on the sample expression result, which is explicitly handled
as an integer in this case.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
e0b4660015 MINOR: log/balance: support for the "random" lb algorithm
In this patch we add basic support for the random algorithm:

random algorithm picks a random server using the result of the
statistical_prng() function as if it was a hash key to then compute the
related server ID.

There is no support for the <draw> parameter (which is implemented for
tcp/http load-balancing), because we don't have the required metrics to
evaluate server's load in log backends for the moment. Plus it would add
more complexity to the __do_send_log_backend() function so we'll keep it
this way for now but this might be needed in the future.
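Illustrative sketch only (names and addresses are made up):

  backend randomlog
        mode log
        log-balance random
        server s1 udp@10.0.0.1:514
        server s2 udp@10.0.0.2:514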
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
26f73dbcbb MINOR: log/balance: support for the "sticky" lb algorithm
sticky algorithm always tries to send log messages to the first server in
the farm. The server will stay in front during queue and dequeue
operations (no other server can steal its place), unless it becomes
unavailable, in which case it will be replaced by another server from
the tree.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
9a74a6cb17 MAJOR: log: introduce log backends
Using "mode log" in a backend section turns the proxy in a log backend
which can be used to log-balance logs between multiple log targets
(udp or tcp servers)

log backends can be used as regular log targets using the log directive
with "backend@be_name" prefix, like so:

  | log backend@mybackend local0

A log backend will distribute log messages to servers according to the
log load-balancing algorithm that can be set using the "log-balance"
option from the log backend section. For now, only the roundrobin
algorithm is supported and set by default.
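For illustration, the matching (hypothetical) backend section could look like:

  backend mybackend
        mode log
        server s1 udp@192.168.0.1:514
        server s2 udp@192.168.0.2:514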
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
e58a9b4baf MINOR: sink: add sink_new_from_srv() function
This helper function can be used to create a new sink from an existing
server struct (and thus existing proxy as well), in order to spare some
resources when possible.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
5c0d1c1a74 MEDIUM: sink: inherit from caller fmt in ring_write() when rings didn't set one
implicit rings were automatically forced to the parent logger format, but
this was done upon ring creation.

This is quite restrictive because we might want to choose the desired
format right before generating the log header (ie: when producing the
log message), depending on the logger (log directive) that is
responsible for the log message, and with current logic this is not
possible. (To this day, we still have dedicated implicit ring per log
directive, but this might change)

In ring_write(), we check if the sink->fmt is specified:
 - defined: we use it since it is the most precise format
   (ie: for named rings)
 - undefined: then we fallback to the format from the logger

With this change, implicit rings' format is now set to UNSPEC upon
creation. This is safe because the log header building function
automatically enforces the "raw" format when UNSPEC is set. And since
logger->format also defaults to "raw", no change of default behavior
should be expected.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
6dad0549a5 MEDIUM: log/sink: simplify log header handling
Introduce log_header struct to easily pass log header data between
functions and use that to simplify the logic around log header
handling.

While at it, some outdated comments were updated as well.

No change in behavior should be expected.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
ab914667da MINOR: log: remove the logger dependency in do_send_log()
do_send_log() now exclusively relies on explicit parameters to remove the
logger dependency in the low-level log sending chain.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
60c5821867 MINOR: log: support explicit log target as argument in __do_send_log()
__do_send_log() now takes an extra target parameter to pass an explicit
log target instead of getting it from logger->target.

This will allow __do_send_log() to be called multiple times within a
logger entry containing multiple log targets.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
cc3dfe89ed MEDIUM: sink/log: stop relying on AF_UNSPEC for rings
Since a5b325f92 ("MINOR: protocol: add a real family for existing FDs"),
we don't rely anymore on AF_UNSPEC for buffer rings in do_send_log.

But we kept it as a parsing hint to differentiate between implicit and
named rings during ring buffer postparsing.

However it is still a bit confusing and forces us to systematically rely
on target->addr, even for named buffer rings where it doesn't make much
sense anymore.

Now that target->addr was made a pointer in a recent commit, we can
choose not to initialize it when not needed (i.e.: named rings) and use
this as a hint to distinguish implicit rings during init since they rely
on the addr struct to temporarily store the ring's address until the ring
is actually created during postparsing step.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
a9b185f34e MEDIUM: log: introduce log target
log targets were immediately embedded in logger struct (previously
named logsrv) and could not be used outside of this context.

In this patch, we're introducing log_target type with the associated
helper functions so that it becomes possible to declare and use log
targets outside of loggers scope.
2023-10-13 10:05:06 +02:00
Aurelien DARRAGON
18da35c123 MEDIUM: tree-wide: logsrv struct becomes logger
When 'log' directive was implemented, the internal representation was
named 'struct logsrv', because the 'log' directive would directly point
to the log target, which used to be a (UDP) log server exclusively at
that time, hence the name.

But things have become more complex, since today 'log' directive can point
to ring targets (implicit, or named) for example.

Indeed, a 'log' directive no longer references the "final" server to
which the log will be sent, but instead it describes which log API and
parameters to use for transporting the log messages to the proper log
destination.

So now the term 'logsrv' is rather confusing and prevents us from
introducing a new level of abstraction because they would be mixed
with logsrv.

So in order to better designate this 'log' directive, and make it more
generic, we chose the word 'logger' which now replaces logsrv everywhere
it was used in the code (including related comments).

This is internal rewording, so no functional change should be expected
on user-side.
2023-10-13 10:05:06 +02:00
Amaury Denoyelle
89d685f396 BUG/MEDIUM: quic-conn: free unsent frames on retransmit to prevent crash
Since the following patch :
  commit 33c49cec987c1dcd42d216c6d075fb8260058b16
  MINOR: quic: Make qc_dgrams_retransmit() return a status.
retransmission process is interrupted as soon as a fatal send error has
been encountered. However, this may leave frames in a local list. This causes
several issues : a memory leak and a potential crash.

The crash happens because leaked frames are duplicated from an origin
frame via qc_dup_pkt_frms(). If an ACK arrives later for the origin
frame, all duplicated frames are also freed. During qc_frm_free(),
LIST_DEL_INIT() operation is invalid as it still references the local
list used inside qc_dgrams_retransmit().

This bug was reproduced using the following injection from another
machine :
  $ h2load --npn-list h3 -t 8 -c 10000 -m 1 -n 2000000000 \
      https://<host>:<port>/?s=4m

Haproxy was compiled using ASAN. The crash resulted in the following
trace :
==332748==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7fff82bf9d78 at pc 0x556facd3b95a bp 0x7fff82bf8b20 sp 0x7fff82bf8b10
WRITE of size 8 at 0x7fff82bf9d78 thread T0
    #0 0x556facd3b959 in qc_frm_free include/haproxy/quic_frame.h:273
    #1 0x556facd59501 in qc_release_frm src/quic_conn.c:1724
    #2 0x556facd5a07f in quic_stream_try_to_consume src/quic_conn.c:1803
    #3 0x556facd5abe9 in qc_treat_acked_tx_frm src/quic_conn.c:1866
    #4 0x556facd5b3d8 in qc_ackrng_pkts src/quic_conn.c:1928
    #5 0x556facd60187 in qc_parse_ack_frm src/quic_conn.c:2354
    #6 0x556facd693a1 in qc_parse_pkt_frms src/quic_conn.c:3203
    #7 0x556facd7531a in qc_treat_rx_pkts src/quic_conn.c:4606
    #8 0x556facd7a528 in quic_conn_app_io_cb src/quic_conn.c:5059
    #9 0x556fad3284be in run_tasks_from_lists src/task.c:596
    #10 0x556fad32a3fa in process_runnable_tasks src/task.c:876
    #11 0x556fad24a676 in run_poll_loop src/haproxy.c:2968
    #12 0x556fad24b510 in run_thread_poll_loop src/haproxy.c:3167
    #13 0x556fad24e7ff in main src/haproxy.c:3857
    #14 0x7fae30ddd0b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x240b2)
    #15 0x556facc9375d in _start (/opt/haproxy-quic-2.8/haproxy+0x1ea75d)

Address 0x7fff82bf9d78 is located in stack of thread T0 at offset 40 in frame
    #0 0x556facd74ede in qc_treat_rx_pkts src/quic_conn.c:4580

This must be backported up to 2.7.
2023-10-13 08:57:08 +02:00
Amaury Denoyelle
10dab4af98 BUG/MINOR: mux-quic: fix free on qcs-new fail alloc
qcs_new() allocates several elements in intermediary steps. All elements
must first be properly initialized to be able to free qcs instance in
case of an intermediary failure.

Previously, qc_stream_desc allocation was done in the middle of
qcs_new() before some elements' initialization. In case this fails, a
crash can happen as some elements are left uninitialized.

To fix this, move qc_stream_desc allocation at the end of qcs_new().
This ensures that all qcs elements are initialized first.

This should be backported up to 2.6.
2023-10-13 08:52:29 +02:00
Amaury Denoyelle
63a6f26a86 BUG/MINOR: quic: fix free on quic-conn fail alloc
qc_new_conn() allocates several elements in intermediary steps. If one
of them fails, a global free is done on the quic_conn and its elements.
This requires that most elements are first initialized to NULL or
equivalent to ensure the freeing operation is done only on proper values.

One of these elements is qc.tx.cc_buf_area. It was initialized too late,
which could cause crashes. This is introduced by
  9f7cfb0a56
  MEDIUM: quic: Allow the quic_conn memory to be asap released.

No need to backport.
2023-10-13 08:52:20 +02:00
Willy Tarreau
5798b5bb14 BUG/MAJOR: connection: make sure to always remove a connection from the tree
Since commit 5afcb686b ("MAJOR: connection: purge idle conn by last usage")
in 2.9-dev4, the test on conn->toremove_list added to conn_get_idle_flag()
in 2.8 by commit 3a7b539b1 ("BUG/MEDIUM: connection: Preserve flags when a
conn is removed from an idle list") becomes misleading. Indeed, now both
toremove_list and idle_list are shared by a union since the presence in
these lists is mutually exclusive. However, in conn_get_idle_flag() we
check for the presence in the toremove_list to decide whether or not to
delete the connection from the tree. This test now fails because instead
it sees the presence in the idle or safe list via the union, and concludes
the element must not be removed. Thus the element remains in the tree and
can be found later after the connection is released, causing crashes that
Tristan reported in issue #2292.

The following config is sufficient to reproduce it with 2 threads:

   defaults
        mode http
        timeout client 5s
        timeout server 5s
        timeout connect 1s

   listen front
        bind :8001
        server next 127.0.0.1:8002

   frontend next
        bind :8002
        timeout http-keep-alive 1
        http-request redirect location /

Sending traffic with a few concurrent connections and some short timeouts
suffices to instantly crash it after ~10k reqs:

   $ h2load -t 4 -c 16 -n 10000 -m 1 -w 1 http://0:8001/

With Amaury we analyzed the conditions in which the function is called
in order to figure a better condition for the test and concluded that
->toremove_list is never filled there so we can safely remove that part
from the test and just move the flag retrieval back to what it was prior
to the 2.8 patch above. Note that the patch is not reverted though, as
the parts that would drop the unexpected flags removal are unchanged.

This patch must NOT be backported. The code in 2.8 works correctly, it's
only the change in 2.9 that makes it misbehave.
2023-10-12 14:20:03 +02:00
Willy Tarreau
704f090b05 CLEANUP: connection: drop an unneeded leftover cast
In conn_delete_from_tree() there remains a cast of the toremove_list
to struct list while the introduction of the union precisely was to
avoid this cast. It's a leftover from the first version of patch
5afcb686b ("MAJOR: connection: purge idle conn by last usage") merged
in 2.9-dev4, let's fix that.

No backport is needed.
2023-10-12 14:16:59 +02:00
Amaury Denoyelle
dc750817c5 BUG/MINOR: h3: strengthen host/authority header parsing
HTTP/3 specification has several requirements when parsing the authority or
host header inside a request. However, it was until then only partially
implemented.

This commit fixes this by ensuring the following :
* reject an empty authority/host header
* reject a host header if an authority was found with a different value
* reject a request with neither an authority nor a host header present

This must be backported up to 2.6.
2023-10-11 14:21:30 +02:00
Amaury Denoyelle
9d905dfd73 BUG/MINOR: mux-quic: support initial 0 max-stream-data
Support stream opening with an initial max-stream-data of 0.

In the normal case, QC_SF_BLK_SFCTL is set when a qcs instance cannot
transfer more data due to flow-control. This flag is set when
transferring data from the MUX to the quic-conn instance.

However, it's possible to define an initial value of 0 for
max-stream-data. In this case, the qcs instance is blocked despite
QC_SF_BLK_SFCTL not being set. No STREAM frame is prepared for this stream as
it's not possible to emit any byte, so the QC_SF_BLK_SFCTL flag is never
set.

This behavior should cause no harm. However, this can cause a BUG_ON()
crash on qcc_io_send(). Indeed, when sending is retried, it ensures that
only qcs instances waiting for a new qc_stream_buf or with
QC_SF_BLK_SFCTL set are present in the send_list.

To fix this, initialize the qcs with a msd value of 0 and the QC_SF_BLK_SFCTL
flag set. The flag is removed only if the transport parameter msd value is
non-null.

This should be backported up to 2.6.
2023-10-11 14:15:31 +02:00
Amaury Denoyelle
d85f9f9d43 BUG/MEDIUM: mux-quic: fix RESET_STREAM on send-only stream
When receiving a RESET_STREAM on a send-only stream, it is mandatory to
close the connection with an error STREAM_STATE error. However, this was
badly implemented as this caused two invocations of qcc_set_error() which
is forbidden by the mux-quic API.

To fix this, rely on qcc_get_qcs() to properly detect the error. Remove
qcc_set_error() usage from qcc_recv_reset_stream() instead.

This must be backported up to 2.7.
2023-10-11 14:15:31 +02:00
Amaury Denoyelle
a4c59f5b9e BUG/MINOR: quic: reject packet with no frame
RFC 9000 indicates that a QUIC packet with no frame must trigger a
connection closure with PROTOCOL_VIOLATION error code. Implement this
via an early return inside qc_parse_pkt_frms().

This should be backported up to 2.6.
2023-10-11 14:15:31 +02:00
Amaury Denoyelle
f59f8326f9 REORG: quic: cleanup traces definition
Move all QUIC trace definitions from quic_conn.h to quic_trace-t.h. Also
move the trace_quic macro definition into quic_trace.h to avoid multiple
definitions. This forces all QUIC source files which rely on traces to
include it, while reducing the size of quic_conn.h.
2023-10-11 14:15:31 +02:00
Frédéric Lécaille
bd83b6effb BUG/MINOR: quic: Avoid crashing with unsupported cryptographic algos
This bug was detected when compiling haproxy against the aws-lc TLS stack
during QUIC interop runner tests. Some algorithms could be negotiated by haproxy
through the TLS stack but not fully supported by haproxy's QUIC implementation.
This led tls_aead() to return NULL (same thing for tls_md(), tls_hp()).
As these functions' return values were never checked, they could trigger
segfaults.

To fix this, one closes the connection as soon as possible with a
handshake_failure(40) TLS alert. Note that as the TLS stack successfully
negotiates an algorithm, it provides haproxy with CRYPTO data before entering
->set_encryption_secrets() callback. This is why this callback
(ha_set_encryption_secrets() on haproxy side) is modified to release all
the CRYPTO frames before triggering a CONNECTION_CLOSE with a TLS alert. This is
done calling qc_release_pktns_frms() for all the packet number spaces.
Modify some quic_tls_keys_hexdump to avoid crashes when the ->aead or ->hp EVP_CIPHER
are NULL.
Modify qc_release_pktns_frms() to do nothing if the packet number space passed
as parameter is not initialized.

This bug does not impact the QUIC TLS compatibility mode (USE_QUIC_OPENSSL_COMPAT).

Thank you to @ilia-shipitsin for having reported this issue in GH #2309.

Must be backported as far as 2.6.
2023-10-11 11:52:22 +02:00
William Lallemand
a62a2d8b48 MINOR: ssl: add an explicit error when 'ciphersuites' are not supported
Add an explicit error when the support for 'ciphersuites' was not enabled
in the build because of the SSL library.
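For reference, this is the kind of bind line that now triggers the explicit
error on such builds (certificate path and ciphersuite list are illustrative):

  frontend fe
        bind :443 ssl crt /etc/haproxy/site.pem ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384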
2023-10-09 14:46:09 +02:00
Aurelien DARRAGON
31e8a003a5 MINOR: sink: function to add new sink servers
Move the sft creation part out of sink_finalize() function so that it
becomes possible to register sink's servers without forward_px being
set.
2023-10-06 15:34:31 +02:00
Aurelien DARRAGON
205d480d9f MINOR: sink: refine forward_px usage
now forward_px only serves as a hint to know if a proxy was created
specifically for the sink, in which case the sink is responsible for it.

Everywhere forward_px was used in appctx context: get the parent proxy from
the sft->srv instead.

This permits to finally get rid of the double link dependency between sink
and proxy.
2023-10-06 15:34:31 +02:00
Aurelien DARRAGON
405567c125 MINOR: sink: don't rely on forward_px to init sink forwarding
Instead, we check if at least one sft has been registered into the sink,
if it is the case, then we need to init the forwarding for the sink.
2023-10-06 15:34:31 +02:00
Aurelien DARRAGON
3c53f6cb76 MINOR: sink: don't rely on p->parent in sink appctx
Removing unnecessary dependency on proxy->parent pointer in
sink appctx functions by directly using the sink sft from the
applet->svcctx to get back to sink related structs.

Thanks to this, proxy used for a ringbuf does not have to be exclusive
to a single sink anymore.
2023-10-06 15:34:31 +02:00
Aurelien DARRAGON
ec770b7924 MINOR: sink: remove useless check after sink creation
It's useless to check if sink has been created with BUF type after
calling sink_new_buf() since the goal of the function is to create
a new sink of BUF type.
2023-10-06 15:34:31 +02:00
Aurelien DARRAGON
cb01da8d12 MINOR: sink/log: fix some typos around postparsing logic
Fixing some typos that have been overlooked during the recent log/sink
API improvements. Using this patch to make sink_new_from_logsrv() static
since it is not used outside of sink.c
2023-10-06 15:34:31 +02:00
Aurelien DARRAGON
19a1210dcd MINOR: cfgparse-listen: warn when use-server rules is used in wrong mode
haproxy will report a warning when "use-server" keyword is used within a
backend that doesn't support server rules to inform the user that rules
will be ignored.

To this day, only TCP and HTTP backends can make use of it.
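For context, a typical (hypothetical) valid use of the rule in an HTTP backend
looks like:

  backend app
        mode http
        use-server static if { path_beg /static }
        server web 10.0.0.1:80
        server static 10.0.0.2:80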
2023-10-06 15:34:30 +02:00
Aurelien DARRAGON
3934901e51 MINOR: proxy: report a warning for max_ka_queue in proxy_cfg_ensure_no_http()
Display a warning when max_ka_queue is set (it is the case when the
"max-keep-alive-queue" directive is used within a proxy section) to inform
the user that this directive depends on the "http" mode to work and thus
will safely be ignored.
2023-10-06 15:34:30 +02:00
Aurelien DARRAGON
65f1124b5d MINOR: cfgparse-listen: "http-reuse" requires TCP or HTTP mode
Prevent the use of the "http-reuse" keyword in proxy section when neither
the TCP nor the HTTP mode is set.
2023-10-06 15:34:30 +02:00
Aurelien DARRAGON
403fdee6a4 MINOR: proxy: dynamic-cookie CLIs require TCP or HTTP mode
Prevent the use of "dynamic-cookie" related CLI commands if the backend
is not in TCP or HTTP mode.
2023-10-06 15:34:30 +02:00
Aurelien DARRAGON
0b09727a22 MINOR: cfgparse-listen: "dynamic-cookie-key" requires TCP or HTTP mode
Prevent the use of the "dynamic-cookie-key" keyword in proxy sections
when TCP or HTTP modes are not set.
2023-10-06 15:34:30 +02:00
Aurelien DARRAGON
d354947365 MINOR: cfgparse-listen: "http-send-name-header" requires TCP or HTTP mode
Prevent the use of the "http-send-name-header" keyword in proxy section
when neither TCP or HTTP mode is set.
2023-10-06 15:34:30 +02:00
Aurelien DARRAGON
0ba731f50b MINOR: fcgi-app: "use-fcgi-app" requires TCP or HTTP mode
Prevent the use of the "use-fcgi-app" keyword in proxy sections where
neither TCP nor HTTP mode is set.
2023-10-06 15:34:30 +02:00
Aurelien DARRAGON
b41b77b4cc MINOR: http_htx/errors: prevent the use of some keywords when not in tcp/http mode
Prevent the use of "errorfile", "errorfiles" and various errorloc options
in proxies that are neither in TCP or HTTP mode.
2023-10-06 15:34:30 +02:00
Aurelien DARRAGON
225526dc16 MINOR: flt_http_comp: "compression" requires TCP or HTTP mode
Prevent the use of "compression" keyword in proxy sections when the proxy
is neither in tcp or http mode.
2023-10-06 15:34:30 +02:00
Aurelien DARRAGON
1e0093a317 MINOR: backend/balance: "balance" requires TCP or HTTP mode
Prevent the use of "balance" and associated keywords when proxy is neither
in tcp or http mode.
2023-10-06 15:34:30 +02:00
Aurelien DARRAGON
f9422551cd MINOR: filter: "filter" requires TCP or HTTP mode
Prevent the use of "filter" when proxy is not in TCP or HTTP mode.
2023-10-06 15:34:30 +02:00
Aurelien DARRAGON
098ae743fd MINOR: stktable: "stick" requires TCP or HTTP mode
Prevent the use of "stick-table" and "stick *" when proxy is neither in
tcp or http mode.
2023-10-06 15:34:30 +02:00
Aurelien DARRAGON
09b15e4163 MINOR: tcp_rules: tcp-{request,response} requires TCP or HTTP mode
Prevent the use of tcp-{request,response} keywords in proxies that are
neither in TCP nor HTTP mode.
2023-10-06 15:34:30 +02:00
Willy Tarreau
90fa2eaa15 MINOR: haproxy: permit to register features during boot
The regtests are using the "feature()" predicate but this one can only
rely on build-time options. It would be nice if some runtime-specific
options could be detected at boot time so that regtests could more
flexibly adapt to what is supported (capabilities, splicing, etc).

Similarly, certain features that are currently enabled with USE_XXX
could also be automatically detected at build time using ifdefs and
would simplify the configuration, but then we'd lose the feature
report in the feature list which is convenient for regtests.

This patch makes sure that haproxy -vv shows the variable's contents
and not the macro's contents, and adds a new hap_register_feature()
to allow the code to register a new keyword.
2023-10-06 11:40:02 +02:00
Remi Tricot-Le Breton
a5e96425a2 MEDIUM: cache: Add "Origin" header to secondary cache key
This patch adds a hash of the Origin header to the cache's secondary key.
This makes it possible to store responses that have a "Vary: Origin" header
in the cache when vary support is enabled.
This cannot be considered as a means to manage CORS requests though, it
only processes the Origin header and hashes the presented value without
any form of URI normalization.
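
For illustration, a minimal (hypothetical) cache setup with vary processing
enabled could look like this:

  cache mycache
        total-max-size 64
        max-age 240
        process-vary on

  frontend fe
        mode http
        bind :8080
        http-request cache-use mycache
        http-response cache-store mycache
        default_backend be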

This need was expressed by Philipp Hossner in GitHub issue #251.

Co-Authored-by: Philipp Hossner <philipp.hossner@posteo.de>
2023-10-05 10:53:54 +02:00
Amaury Denoyelle
544e320f80 BUG/MINOR: hq-interop: simplify parser requirement
hq-interop should be limited to QUIC testing. As such, its code should
be kept plain and simple and not implement too many things.

This patch fixes issues which may cause rare QUIC interop failures :
- remove some unneeded BUG_ON() as parser should not be too strict
- remove support of partial message parsing
- ensure buffer data does not wrap as it was not properly handled. In
  any case, this should never happen as only a single message will be
  stored for each qcs buffer.

This should be backported up to 2.6.
2023-10-04 17:32:23 +02:00
William Lallemand
45174e4fdc BUILD: quic: allow USE_QUIC to work with AWSLC
This patch fixes the build with AWSLC and USE_QUIC=1, this is only meant
to be able to build for now and it's not feature complete.

The set_encryption_secrets callback has been split in set_read_secret
and set_write_secret.

Missing features:

- 0RTT was disabled.
- TLS1_3_CK_CHACHA20_POLY1305_SHA256, TLS1_3_CK_AES_128_CCM_SHA256 were disabled
- clienthello callback is missing, certificate selection could be
  limited (RSA + ECDSA at the same time)
2023-10-04 16:55:19 +02:00
Christopher Faulet
225a4d02e1 MINOR: h1-htx: Declare successful tunnel establishment as bodyless
Successful responses to a CONNECT or to an upgrade request have no payload.
Be explicit on this point by setting the HTX_SL_F_BODYLESS_RESP flag on the HTX
start-line.
2023-10-04 15:34:18 +02:00
Christopher Faulet
b6c32f1e04 BUG/MINOR: h1-htx: Keep flags about C-L/T-E during HEAD response parsing
When a response to a HEAD request is parsed, flags to know if the content
length is set or if the payload is chunked must be preserved. It is
important because of the previous fix. Otherwise, these headers will be
removed from the response sent to the client.

This patch must only be backported if "BUG/MEDIUM: mux-h1; Ignore headers
modifications about payload representation" is backported.
2023-10-04 15:34:18 +02:00
Christopher Faulet
f89ba27caa BUG/MEDIUM: mux-h1; Ignore headers modifications about payload representation
We now ignore modifications made during the message analysis to the payload
representation if only headers are updated and not the meta-data. It means a C-L
header removed to add a T-E one or the opposite via HTTP actions. This kind
of change is ignored because it is extremely hard to be sure the payload
will be properly formatted.

It has been an issue since the HTX was introduced but it was never reported. Thus,
there is no reason to backport this patch for now. It relies on the following commits:

  * MINOR: mux-h1: Add flags if outgoing msg contains a header about its payload
  * MINOR: mux-h1: Rely on H1S_F_HAVE_CHNK to add T-E in outgoing messages
  * BUG/MEDIUM: mux-h1: Add C-L header in outgoing message if it was removed
2023-10-04 15:34:18 +02:00
Christopher Faulet
c43742c188 BUG/MEDIUM: mux-h1: Add C-L header in outgoing message if it was removed
If a C-L header was found during parsing of a message but it was removed via
an HTTP action, it is re-added during the message formatting. Indeed, if
headers about the payload are modified, meta-data of the message must also
be updated. Otherwise, it is not possible to guarantee the message will be
properly formatted.

To do so, we rely on the flag H1S_F_HAVE_CLEN.

This patch should not be backported unless an issue is explicitly
reported. It relies on "MINOR: mux-h1: Add flags if outgoing msg contains a
header about its payload".
2023-10-04 15:34:18 +02:00
Christopher Faulet
accd3e911c MINOR: mux-h1: Rely on H1S_F_HAVE_CHNK to add T-E in outgoing messages
If a message is declared to have a known length but no C-L or T-E headers
are set, a "Transfer-Encoding; chunked" header is automatically added. It is
useful for H2/H3 messages with no C-L header. There is now a flag to know
this header was found or added. So we use it.
2023-10-04 15:34:18 +02:00
Christopher Faulet
e7964eac2d BUG/MEDIUM: h1: Ignore C-L value in the H1 parser if T-E is also set
In fact, during the parsing there is already a test to remove the
Content-Length header if a Transfer-Encoding one is found. However, in the
parser, the content-length value was still used to set the body length (the
final one and the remaining one). This value is thus also used to set the
extra field in the HTX message and is then used during the sending stage to
announce the chunk size.

So, Content-Length header value must be ignored by the H1 parser to properly
reformat the message when it is sent.

This patch must be backported as far as 2.6. Lower versions don't handle
this case.
2023-10-04 15:34:18 +02:00
Christopher Faulet
c367957851 BUG/MINOR: mux-h1: Ignore C-L when sending H1 messages if T-E is also set
In fact, it is already done but both flags (H1_MF_CLEN and H1_MF_CHUNK) are
set on the H1 parser. Thus it is error-prone when H1 messages are sent,
especially because most of the time, the "Content-Length" case is processed
before the "chunked" one. This may lead to computing the wrong chunk size and
to missing the last chunk.

This patch must be backported as far as 2.6. This case is not handled in 2.4
and lower.
2023-10-04 15:34:18 +02:00
Christopher Faulet
331241b084 BUG/MINOR: mux-h1: Handle read0 in rcv_pipe() only when data receipt was tried
In rcv_pipe() callback we must be careful to not report the end of stream
too early because some data may still be present in the input buffer. If we
report an EOS here, this will block the subsequent call to rcv_buf() to
process remaining input data. This only happens when we try a last
rcv_pipe() when the xfer length is unknown and all data was already received
in the input buffer. Concretely this happens with a payload larger than a
buffer but lower than 2 buffers.

This patch must be backported as far as 2.7.
2023-10-04 15:34:18 +02:00
Christopher Faulet
2225cb660c DEBUG: mux-h1: Fix event label from trace messages about payload formatting
The label used for in/out trace messages about payload formatting was not
the right one. Use H1_EV_TX_BODY, instead of H1_EV_TX_HDRS.
2023-10-04 15:34:18 +02:00
Christopher Faulet
751b59c40b BUG/MEDIUM: hlua: Initialize appctx used by a lua socket on connect only
The appctx used by a lua socket was synchronously initialized after the
appctx creation. The connect itself is performed later. However it is an
issue because the script may be interrupted between the two operations. In
this case, the stream attached to the appctx is woken up before any
destination is set. The stream will try to connect but without a destination,
it fails. When the lua script is rescheduled and the connect is performed,
the connection has already failed and an error is returned.

To fix the issue, we must be sure not to wake up the stream before the
connect. To do so, we must defer the appctx initialization. It is now performed
on connect.

This patch relies on the following commits:

  * MINOR: hlua: Test the hlua struct first when the lua socket is connecting
  * MINOR: hlua: Save the lua socket's server in its context
  * MINOR: hlua: Save the lua socket's timeout in its context
  * MINOR: hlua: Don't preform operations on a not connected socket
  * MINOR: hlua: Set context's appctx when the lua socket is created

All the series must be backported as far as 2.6.
2023-10-04 15:34:13 +02:00
Christopher Faulet
66fc9238f0 MINOR: hlua: Test the hlua struct first when the lua socket is connecting
It makes sense to first verify the hlua context is valid. It is probably
better than doing it after having updated the appctx.
2023-10-04 15:34:10 +02:00
Christopher Faulet
6f4041c75d MINOR: hlua: Save the lua socket's server in its context
For the same reason as the timeout, the server used by a lua socket is now
saved in its context. This will be mandatory to fix issues with the lua
sockets.
2023-10-04 15:34:06 +02:00
Christopher Faulet
0be1ae2fa2 MINOR: hlua: Save the lua socket's timeout in its context
When the lua socket timeout is set, it is now saved in its context. If there
is already a stream attached to the appctx, the timeout is then immediately
modified. Otherwise, it is modified when the stream is created, thus during
the appctx initialization.

For now, the appctx is initialized when it is created. But this will change
to fix issues with the lua sockets. Thus, this patch is mandatory.
2023-10-04 15:34:03 +02:00
Christopher Faulet
ee687aa18d MINOR: hlua: Don't preform operations on a not connected socket
There is nothing that prevents someone from creating a lua socket and trying to
receive or to write before the connection was established or after the
shutdown was performed. The same is true when info about the socket is
retrieved.

It is not an issue because this will fail later. But now, we check whether the
socket is connected or not earlier. It is more efficient but it will also be
mandatory to fix issues with the lua sockets.
2023-10-04 15:34:00 +02:00
Christopher Faulet
ed9333827a MINOR: hlua: Set context's appctx when the lua socket is created
The lua socket's context referenced the owning appctx. It was set when the
appctx was initialized. It is now performed when the appctx is created. It
is a small change but this will be required to fix several issues with the
lua sockets.
2023-10-04 15:33:57 +02:00
Christopher Faulet
b62d5689d2 BUILD: pool: Fix GCC error about potential null pointer dereference
In pool_gc(), GCC 13.2.1 reports an error about a potential null pointer
dereference:

src/pool.c: In function ‘pool_gc’:
src/pool.c:807:64: error: potential null pointer dereference [-Werror=null-dereference]
  807 |                         entry->buckets[bucket].free_list = temp->next;
      |                                                            ~~~~^~~~~~

There is no issue here because the "bucket" variable cannot be greater than
CONFIG_HAP_POOL_BUCKETS. But to make GCC happy, we now break the loop if it
is greater than or equal to CONFIG_HAP_POOL_BUCKETS.
2023-10-04 08:03:02 +02:00
Amaury Denoyelle
90873dc678 MINOR: proto_reverse_connect: support source address setting
Support backend configuration for explicit source address on
pre-connect. These settings can be specified via "source" backend
keyword or directly on the server line.

Previously, all source parameters triggered a BUG_ON() when binding a
reverse connect listener. This was done because some settings are
incompatible with reverse connect context : this is the case for all
source settings which do not specify a fixed address but rather rely on
a frontend connection. Indeed, in case of preconnect, connection is
initiated on its own without the existence of a previous frontend
connection.

This patch allows to use a source parameter with a fixed address. All
other settings (usesrc client/clientip/hdr_ip) are rejected on listener
binding. On connection init, alloc_bind_address() is used to set the
optional source address.
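As a generic illustration of the two supported forms (addresses are arbitrary,
and the reverse server address itself is omitted here):

  backend be
        source 192.168.1.10
        server srv1 192.168.1.100:443 source 192.168.1.10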
2023-10-03 17:50:36 +02:00
Amaury Denoyelle
bd001ff346 MINOR: backend: refactor specific source address allocation
Refactor alloc_bind_address() function which is used to allocate a
sockaddr if a connection to a target server relies on a specific source
address setting.

The main objective of this change is to be able to use this function
outside of backend module, namely for preconnections using a reverse
server. As such, this function is now exported globally.

For reverse connect, there is no stream instance. As such, the function
parts which relied on it were reduced to the minimal. Now, stream is
only used if a non-static address is configured which is useful for
usesrc client|clientip|hdr_ip. These options have no sense for reverse
connect so it should be safe to use the same function.
2023-10-03 17:49:12 +02:00
Amaury Denoyelle
2ac5d9a657 MINOR: quic: handle perm error on bind during runtime
Improve EACCES permission errors encountered when using the QUIC connection
socket at runtime :

* The first occurrence of the error on the process will generate a log
  warning. This should prevent users from using a privileged port
  without the mandatory access rights.

* Socket mode will automatically fall back to listener socket for the
  receiver instance. This requires duplicating the settings from the
  bind_conf to the receiver instance to support configurations with
  multiple addresses on the same bind line.
2023-10-03 16:52:02 +02:00
Amaury Denoyelle
3ef6df7387 MINOR: quic: define quic-socket bind setting
Define a new bind option quic-socket :
  quic-socket [ connection | listener ]

This new setting works in conjunction with the existing configuration
global tune.quic.socket-owner and reuse the same semantics.

The purpose of this setting is to allow to disable connection socket
usage on listener instances individually. This will notably be useful
when needing to deactivate it after encountering a fatal permission
error on bind() at runtime.
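For example, a bind line forcing the listener socket mode could look like this
(address and certificate are illustrative):

  frontend fe
        bind quic4@:443 ssl crt /etc/haproxy/site.pem alpn h3 quic-socket listener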
2023-10-03 16:49:26 +02:00
Remi Tricot-Le Breton
b019636cd7 DOC: sample: Add a comment in 'check_operator' to explain why 'vars_check_arg' should ignore the 'err' buffer
This extra comment ensures that we do not try to pass an 'err' argument
to 'vars_check_arg', otherwise some warnings will be raised if an
operator is given an integer directly in the configuration file.
2023-10-03 11:13:10 +02:00
Remi Tricot-Le Breton
6fe57303f7 Revert "MEDIUM: sample: Small fix in function check_operator for eror reporting"
This reverts commit d897d7da87.

The "check_operator" function is used for all the operator converters
such as "and", "or", "add"...
With such a converter that accepts a variable name as well as an
integer, the "vars_check_arg" call is expected to fail when an integer
is provided. Passing an "err" variable has the unwanted side effect of
raising a warning during init for a configuration such as the following:

    http-request set-query "s=%[rand,add(20)]"

which raises the following warning:
    [WARNING]  (33040) : config : parsing [hap.cfg:14] : invalid
    variable name '20'. A variable name must be start by its scope. The
    scope can be 'proc', 'sess', 'txn', 'req', 'res' or 'check'.
2023-10-03 11:13:10 +02:00
William Lallemand
c21ec3b735 BUG/MINOR: proto_reverse_connect: fix FD leak upon connect
new_reverse_conn() is creating its own socket with
sock_create_server_socket(). However the connect is done with
conn->ctrl->connect() which is tcp_connect_server().

tcp_connect_server() is also creating its own socket and sets it in the
struct conn, leaving the previous socket unclosed and leaking at each
attempt.

This patch fixes the issue by letting tcp_connect_server() handling the
socket part, and removes it in new_reverse_conn().
2023-09-30 00:53:43 +02:00
Amaury Denoyelle
c58fd4d1cc MINOR: tcp_act: remove limitation on protocol for attach-srv
This patch allows to specify "tcp-request session attach-srv" without
requiring that each associated bind line mandates HTTP/2 usage. If an
unsupported protocol is targeted by this rule, conn_install_mux_fe()
is responsible for rejecting it.

This change is mandatory to be able to mix attach-srv and standard
non-reversable connections on the same bind instances. An ACL can be used
to activate attach-srv only under some conditions.
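An illustrative (hypothetical) snippet mixing reversal and regular traffic on
the same bind line; names, addresses and the ACL are made up:

  frontend fe
        bind :8443 ssl crt /etc/haproxy/site.pem alpn h2,http/1.1
        acl from_edge src 203.0.113.0/24
        tcp-request session attach-srv be_reverse/edge if from_edge
        default_backend app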
2023-09-29 18:11:10 +02:00
Amaury Denoyelle
337c71423f MINOR: connection: define mux flag for reverse support
Add a new MUX flag MX_FL_REVERSABLE. This value is used to indicate that
MUX instance supports connection reversal. For the moment, only HTTP/2
multiplexer is flagged with it.

This allows to dynamically check if reversal can be completed during MUX
installation. This will allow to relax the requirement on config writing for
'tcp-request session attach-srv' which currently cannot be used mixed
with non-http/2 listener instances, even if used conditionally with an
ACL.
2023-09-29 18:09:08 +02:00
Amaury Denoyelle
ac1164de7c MINOR: connection: define error for reverse connect
Define a new error code for connections, CO_ER_REVERSE. This will be used
to report an issue which happens on a connection targeted for reversal
before the reverse process is completed.
2023-09-29 18:08:26 +02:00
Amaury Denoyelle
753fe2b9ac BUG/MINOR: tcp_act: fix attach-srv rule ACL parsing
Fix the parser for the tcp-request session attach-srv rule. Before this commit,
it was impossible to use an anonymous ACL with it. This was because
support for the optional name argument was badly implemented.

No need to backport this.
2023-09-29 18:07:52 +02:00
Amaury Denoyelle
6118590e95 BUG/MINOR: proto_reverse_connect: fix FD leak on connection error
Listener using "rev@" address is responsible to setup connection and
reverse it using a server instance. If an error occured before reversal
is completed, proper freeing must be taken care of by the listener as no
session exists for this.

Currently, there are two locations where a connection is freed on error
before reversal inside the reverse_connect protocol. Both of these were
incomplete as several functions must be used to ensure the connection is
properly freed. This commit fixes this by reusing the same cleaning
mechanism used inside H2 multiplexer.

One of the biggest drawback before this patch was that connection FD was
not properly removed from fdtab which caused a file-descriptor leak.

No need to backport this.
2023-09-29 18:02:36 +02:00
Willy Tarreau
b3dcd59f8d MINOR: stream: fix output alignment of stuck thread dumps
Since commit c185bc465 ("MEDIUM: stream: now provide full stream dumps
in case of loops"), the stuck threads show the stream's pointer in the
margin since it appears immediately after a line feed. Let's add it after
the prefix and "stream=" to make the output more readable.
2023-09-29 16:43:07 +02:00
Emeric Brun
3c250cb847 Revert "BUG/MEDIUM: quic: missing check of dcid for init pkt including a token"
This reverts commit 072e774939.

Doing h2load with h3 tests we notice this behavior:

Client ---- INIT no token SCID = a , DCID = A ---> Server (1)
Client <--- RETRY+TOKEN DCID = a, SCID = B    ---- Server (2)
Client ---- INIT+TOKEN SCID = a , DCID = B    ---> Server (3)
Client <--- INIT DCID = a, SCID = C           ---- Server (4)
Client ---- INIT+TOKEN SCID = a, DCID = C     ---> Server (5)

With (5) dropped by haproxy due to token validation.

Indeed the previous patch adds the SCID of the sent retry packet to the aad
used for token ciphering. It was useful to validate that the next INIT
packets including the token are sent by the client using the newly
provided SCID as DCID, as mentioned in RFC 9000.
But this stateless information is lost on received INIT packets
following the first outgoing INIT packet from the server because
the client is also supposed to re-use a second time the latest
received SCID for its new DCID. This will break the token validation
on those last packets and they will be dropped by haproxy.

It was discussed there:
https://mailarchive.ietf.org/arch/msg/quic/7kXVvzhNCpgPk6FwtyPuIC6tRk0/

To sum up: it is not the role of the server to verify the re-use of
the retry's SCID as DCID in further client INIT packets.

The previous patch must be reverted in all versions where it was
backported (supposed until 2.6)
2023-09-29 09:27:22 +02:00
Willy Tarreau
d956db6638 CLEANUP: stream: remove the now unused stream_dump() function
It was superseded by strm_dump_to_buffer() which provides much more
complete information and supports anonymizing.
2023-09-29 09:20:27 +02:00
Willy Tarreau
feff6296a1 MINOR: debug: use the more detailed stream dump in panics
Similarly upon a panic we'd like to have a more detailed dump of a
stream's state, so let's use the full dump function for this now.
2023-09-29 09:20:27 +02:00
Willy Tarreau
c185bc4656 MEDIUM: stream: now provide full stream dumps in case of loops
When a stream is caught looping, we produce some output to help figure
out its internal state and explain why it's looping. The problem is that
this debug output is quite old and the info it provides is quite
insufficient to debug a modern process, and since such bugs happen only
once or twice a year, the situation doesn't improve.

On the other hand the output of "show sess all" is extremely detailed
and kept up to date with code evolutions since it's a heavily used
debugging tool.

This commit replaces the call to the totally outdated stream_dump() with
a call to strm_dump_to_buffer(), and removes the filters dump since they
are already emitted there, and it now produces much more exploitable
output:

  [ALERT]    (5936) : A bogus STREAM [0x7fa8dc02f660] is spinning at 5653514 calls per second and refuses to die, aborting now! Please report this error to developers:
  0x7fa8dc02f660: [28/Sep/2023:09:53:08.811818] id=2 proto=tcpv4 source=127.0.0.1:58306
     flags=0xc4a, conn_retries=0, conn_exp=<NEVER> conn_et=0x000 srv_conn=0x133f220, pend_pos=(nil) waiting=0 epoch=0x1
     frontend=public (id=2 mode=http), listener=? (id=1) addr=127.0.0.1:4080
     backend=public (id=2 mode=http) addr=127.0.0.1:61932
     server=s1 (id=1) addr=127.0.0.1:7443
     task=0x7fa8dc02fa40 (state=0x01 nice=0 calls=5749559 rate=5653514 exp=3s tid=1(1/1) age=1s)
     txn=0x7fa8dc02fbf0 flags=0x3000 meth=1 status=-1 req.st=MSG_DONE rsp.st=MSG_RPBEFORE req.f=0x4c rsp.f=0x00
     scf=0x7fa8dc02f5f0 flags=0x00000482 state=EST endp=CONN,0x7fa8dc02b4b0,0x05004001 sub=1 rex=58s wex=<NEVER>
         h1s=0x7fa8dc02b4b0 h1s.flg=0x100010 .sd.flg=0x5004001 .req.state=MSG_DONE .res.state=MSG_RPBEFORE
          .meth=GET status=0 .sd.flg=0x05004001 .sc.flg=0x00000482 .sc.app=0x7fa8dc02f660
          .subs=0x7fa8dc02f608(ev=1 tl=0x7fa8dc02fae0 tl.calls=0 tl.ctx=0x7fa8dc02f5f0 tl.fct=sc_conn_io_cb)
          h1c=0x7fa8dc0272d0 h1c.flg=0x0 .sub=0 .ibuf=0@(nil)+0/0 .obuf=0@(nil)+0/0 .task=0x7fa8dc0273f0 .exp=<NEVER>
         co0=0x7fa8dc027040 ctrl=tcpv4 xprt=RAW mux=H1 data=STRM target=LISTENER:0x12840c0
         flags=0x00000300 fd=32 fd.state=20 updt=0 fd.tmask=0x2
     scb=0x7fa8dc02fb30 flags=0x00001411 state=EST endp=CONN,0x7fa8dc0300c0,0x05000001 sub=1 rex=58s wex=<NEVER>
         h1s=0x7fa8dc0300c0 h1s.flg=0x4010 .sd.flg=0x5000001 .req.state=MSG_DONE .res.state=MSG_RPBEFORE
          .meth=GET status=0 .sd.flg=0x05000001 .sc.flg=0x00001411 .sc.app=0x7fa8dc02f660
          .subs=0x7fa8dc02fb48(ev=1 tl=0x7fa8dc02feb0 tl.calls=2 tl.ctx=0x7fa8dc02fb30 tl.fct=sc_conn_io_cb)
          h1c=0x7fa8dc02ff00 h1c.flg=0x80000000 .sub=1 .ibuf=0@(nil)+0/0 .obuf=0@(nil)+0/0 .task=0x7fa8dc030020 .exp=<NEVER>
         co1=0x7fa8dc02fcd0 ctrl=tcpv4 xprt=RAW mux=H1 data=STRM target=SERVER:0x133f220
         flags=0x10000300 fd=33 fd.state=10421 updt=0 fd.tmask=0x2
     req=0x7fa8dc02f680 (f=0x1840000 an=0x8000 pipe=0 tofwd=0 total=79)
         an_exp=<NEVER> buf=0x7fa8dc02f688 data=(nil) o=0 p=0 i=0 size=0
         htx=0xc18f60 flags=0x0 size=0 data=0 used=0 wrap=NO extra=0
     res=0x7fa8dc02f6d0 (f=0x80000000 an=0x1400000 pipe=0 tofwd=0 total=0)
         an_exp=<NEVER> buf=0x7fa8dc02f6d8 data=(nil) o=0 p=0 i=0 size=0
         htx=0xc18f60 flags=0x0 size=0 data=0 used=0 wrap=NO extra=0
    call trace(10):
    |       0x59f2b7 [0f 0b 0f 1f 80 00 00 00]: stream_dump_and_crash+0x1f7/0x2bf
    |       0x5a0d71 [e9 af e6 ff ff ba 40 00]: process_stream+0x19f1/0x3a56
    |       0x68d7bb [49 89 c7 4d 85 ff 74 77]: run_tasks_from_lists+0x3ab/0x924
    |       0x68e0b4 [29 44 24 14 8b 4c 24 14]: process_runnable_tasks+0x374/0x6d6
    |       0x656f67 [83 3d f2 75 84 00 01 0f]: run_poll_loop+0x127/0x5a8
    |       0x6575d7 [48 8b 1d 42 50 5c 00 48]: main+0x1b22f7
    | 0x7fa8e0f35e45 [64 48 89 04 25 30 06 00]: libpthread:+0x7e45
    | 0x7fa8e0e5a4af [48 89 c7 b8 3c 00 00 00]: libc:clone+0x3f/0x5a

Note that the output is subject to the global anon key so that IPs and
object names can be anonymized if required. It could make sense to
backport this and the few related previous patches next time such an
issue is reported.
2023-09-29 09:20:27 +02:00
Willy Tarreau
b206504f43 MINOR: streams: add support for line prefixes to strm_dump_to_buffer()
Now the function can prepend every new line with a caller-fed prefix
that will later be used for indenting. The caller has to feed the
prefix for the first line itself though, which makes it possible to
append the first line at the end of an existing one.
2023-09-29 09:20:27 +02:00
Willy Tarreau
5743eeea88 MINOR: stream: make stream_dump() always multi-line
There used to be two working modes for this function, a single-line one
and a multi-line one, the difference being made on the "eol" argument
which could contain either a space or an LF (and with the prefix being
adjusted accordingly). Let's get rid of the single-line mode, as it is
what limits the output contents: it's difficult to produce exploitable
structured data this way. It was only used in the rare case of spinning
streams and applets, and these are precisely the ones lacking info. Now
a spinning stream produces:

[ALERT]    (3511) : A bogus STREAM [0x227e7b0] is spinning at 5581202 calls per second and refuses to die, aborting now! Please report this error to developers:
  strm=0x227e7b0,c4a src=127.0.0.1 fe=public be=public dst=s1
  txn=0x2041650,3000 txn.req=MSG_DONE,4c txn.rsp=MSG_RPBEFORE,0
  rqf=1840000 rqa=8000 rpf=80000000 rpa=1400000
  scf=0x24af280,EST,482 scb=0x24af430,EST,1411
  af=(nil),0 sab=(nil),0
  cof=0x7fdb28026630,300:H1(0x24a6f60)/RAW((nil))/tcpv4(33)
  cob=0x23199f0,10000300:H1(0x24af630)/RAW((nil))/tcpv4(32)
  filters={}
  call trace(11):
  (...)
2023-09-29 09:20:27 +02:00
Willy Tarreau
5ddeba7af3 MINOR: stream: make strm_dump_to_buffer() show the list of filters
That's one of the rare pieces of information that was not present in
the full dump and only in the short one, the list of filters the stream
is subscribed to (however the current filter was present and more
detailed).
2023-09-29 09:20:27 +02:00
Willy Tarreau
3e630a9871 MINOR: stream: make strm_dump_to_buffer() take an arbitrary buffer
We won't always want to dump into the trash, so let's make the function
accept an arbitrary buffer.
2023-09-29 09:20:27 +02:00
Willy Tarreau
6bc07103f8 CLEANUP: stream: make strm_dump_to_buffer() take a const stream
Now that we don't need a variable anymore, let's pass a const stream.
It will void any doubt about what can happen to the stream when the
function is called from inspection points (show sess etc).
2023-09-29 09:20:27 +02:00
Willy Tarreau
1a01ee4740 CLEANUP: stream: use const filters in the dump function
The strm_dump_to_buffer() function requires a non-const stream only
because a few functions in it do not take a const. strm_flt() is
one of them (and for good reasons since most call places want to
update filters). Here we know we won't modify the filter nor the
stream so let's directly access the strm_flt in the stream and assign
it to a const filter. This will also catch any future accidental change.
2023-09-29 09:20:27 +02:00
Willy Tarreau
77ecb3146a MINOR: stream: split stats_dump_full_strm_to_buffer() in two
The function only works with the CLI's appctx and does most of the
convenient work of dumping a stream into a buffer (well, the trash
buffer for now). Let's split it in two so that most of the work is
done in a generic function and that the CLI-specific function relies
on that one.

The diff looks huge due to the changed indent caused by the extraction
of the switch/case statement, but when looked at using diff -b it's
small.
2023-09-29 09:20:27 +02:00
Willy Tarreau
6c2af048d6 CLEANUP: stream: make the dump code not depend on the CLI appctx
The HA_ANON_CLI() helper relies on the CLI appctx and prevents the code
from being made more generic. Let's extract the CLI's anon key separately
and pass it via HA_ANON_STR() instead.
2023-09-29 09:20:27 +02:00
Amaury Denoyelle
7cf9cf705e BUG/MINOR: mux-quic: remove full demux flag on ncbuf release
When the rcv_buf stream callback is invoked, the mux tasklet is woken up
if demux was previously blocked due to lack of buffer space. A BUG_ON()
is present to ensure there is data in the qcs Rx buffer. If this is not
the case, the wakeup is unneeded:

  BUG_ON(!ncb_data(&qcs->rx.ncbuf, 0));

This BUG_ON() may be triggered if RESET_STREAM is received after demux
has been blocked. On reset, the Rx buffer is purged according to RFC 9000,
which allows discarding any data not yet consumed. This will trigger the
BUG_ON() assertion if the rcv_buf stream callback is invoked after this.

To prevent the BUG_ON() crash, just clear the demux block flag each time
the Rx buffer is purged. This accordingly covers RESET_STREAM reception.

This should be backported up to 2.7.

This may fix github issue #2293.

This bug relies on several preconditions, so its occurrence is rare. It
was reproduced by using a custom client which posts enough data to fill
the buffer. It then emits a RESET_STREAM in place of a proper FIN.
Moreover, the mux code has been edited to artificially stall the stream
read to force demux blocking.

h3_data_to_htx:
-       return htx_sent;
+       return 1;

qcc_recv_reset_stream:
        qcs_free_ncbuf(qcs, &qcs->rx.ncbuf);
+       qcs_notify_recv(qcs);

qmux_strm_rcv_buf:
        char fin = 0;
+       static int i = 0;
+       if (++i < 2)
+               return 0;
        TRACE_ENTER(QMUX_EV_STRM_RECV, qcc->conn, qcs);
2023-09-28 11:44:53 +02:00
Vladimir Vdovin
f8b81f6eb7 MINOR: support for http-request set-timeout client
Added set-timeout for the frontend side of the session, so it can be used
to set custom per-client timeouts if needed. Also added the
cur_client_timeout sample fetch to retrieve the current client timeout.
2023-09-28 08:49:22 +02:00
Amaury Denoyelle
b9bb3b932c MINOR: proto_reverse_connect: emit log for preconnect
Add reporting using send_log() for the preconnect operation. This is
minimal, to ensure we understand the current status of a listener in
active reverse connect.

To limit the logging quantity, only important transitions are considered.
This requires implementing a minimal state machine as a new field in the
receiver structure.

Here are the logs produced:
* Initiating: the first time preconnect is enabled on a listener
* Error: the last preconnect attempt was interrupted by a connection error
* Reaching maxconn: all necessary connections were reversed and are
  operational on a listener
2023-09-22 17:21:53 +02:00
Amaury Denoyelle
069ca55e70 MINOR: proto_reverse_connect: remove unneeded wakeup
No need to use task_wakeup() in rev_bind_listener() to bootstrap
preconnect. A similar call is done in rev_enable_listener(), which serves
both for bootstrap and also later to reinitiate attempts to maintain
maxconn when connections are freed.
2023-09-22 17:06:18 +02:00
Amaury Denoyelle
1f43fb71be MINOR: proto_reverse_connect: refactor preconnect failure
When a connection is freed during preconnect before reversal, the error
must be notified to the listener to remove any connection reference and
rearm a new preconnect attempt. Currently, this can occur through 2 code
paths:
* conn_free() called directly by the H2 mux
* error during conn_create_mux(). In this case, the connection is flagged
  with CO_FL_ERROR and the reverse_connect task is woken up. The process
  task handler is then responsible for calling conn_free() for such a
  connection.

Duplicated steps were done both in conn_free() and the process task
handler. These are now removed. To facilitate code maintenance, the
dedicated operations have been centralized in a new function,
rev_notify_preconn_err(), which is called by conn_free().
2023-09-22 16:43:36 +02:00
Amaury Denoyelle
a37abee266 BUG/MINOR: proto_reverse_connect: set default maxconn
If maxconn is not set for preconnect, it is assumed we want to establish a
single connection. However, this does not work properly in case the
connection is closed after reversal: the listener is not resumed by the
protocol layer to attempt a new preconnect.

To fix this, explicitly set maxconn to 1 in the listener instance if none
is defined. This ensures the behavior is consistent. A BUG_ON() has been
added to validate that we never try to use a listener with a 0 maxconn.
2023-09-22 16:40:58 +02:00
Emeric Brun
27b2fd2e06 MINOR: quic: handle external extra CIDs generator.
This patch adds the ability to externalize and customize the code
of the computation of extra CIDs after the first one was derived from
the ODCID.

This is to prepare interoperability with extra components such as
different QUIC proxies or routers for instance.

To do so, the patch defines two function callbacks:
- the first one computes a 64-bit hash from the first generated CID
  (itself still derived from the ODCID). The resulting hash is stored
  into the 'quic_conn', and 64 bits is chosen large enough to be able to
  store an entire haproxy CID.
- the second one reuses the previously computed hash to derive an extra
  CID using the custom algorithm. If not set, haproxy will continue to
  choose a randomized CID value.

Both functions also receive the 'cluster_secret' as an argument: this
way, it is usable for obfuscation or ciphering.
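
For illustration only, here is a hypothetical sketch of what such a pair
of callbacks could look like; the type names, parameters and comments
below are assumptions based on this description, not the actual API
added by the patch:

    #include <stdint.h>
    #include <stddef.h>

    /* hash the first generated CID (derived from the ODCID) into 64 bits,
     * optionally mixing in the cluster secret; the result is stored in
     * the connection and reused for every extra CID.
     */
    typedef uint64_t (*qc_cid_hash_cb)(const unsigned char *cid, size_t cid_len,
                                       const unsigned char *secret,
                                       size_t secret_len);

    /* derive one extra CID from the previously computed 64-bit hash;
     * return the CID length, or 0 to let haproxy fall back to a random CID.
     */
    typedef size_t (*qc_cid_derive_cb)(unsigned char *out, size_t out_sz,
                                       uint64_t hash,
                                       const unsigned char *secret,
                                       size_t secret_len);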
2023-09-22 10:32:14 +02:00
Lokesh Jindal
d897d7da87 MEDIUM: sample: Small fix in function check_operator for error reporting
When the function "check_operator" calls the function "vars_check_arg" to
decode a variable, it passes a NULL value for the pointer to the char
array meant for capturing the error message. This commit replaces NULL
with a pointer to a real char array. This should help with correct error
reporting.
2023-09-22 08:48:53 +02:00
Lokesh Jindal
915e48675a MEDIUM: sample: Enhances converter "bytes" to take variable names as arguments
Prior to this commit, the "bytes" converter took only integer values as
arguments. After this commit, it can also take variable names as inputs.
This allows us to dynamically determine the offset/length and capture
them in variables. These variables can then be used with the converter.
Example use case: parsing a token present in a request header.
2023-09-22 08:48:51 +02:00
Amaury Denoyelle
d3db96f11a MINOR: proto_reverse_connect: prevent transparent server for pre-connect
Prevent using transparent servers for pre-connect on startup by emitting
a fatal error. This is used to ensure we never try to connect to a
target with an unspecified destination address or port.
2023-09-21 16:58:08 +02:00
Amaury Denoyelle
9b6812d781 BUG/MINOR: proto_reverse_connect: fix preconnect with startup name resolution
The addr member of the server structure is not set consistently depending
on the server address type. When using the <IP:PORT> notation, its port is
properly set. However, when using <HOSTNAME:PORT>, only the IP address is
set after startup name resolution and the port is left to 0.

This behavior causes preconnect to not be functional when using a server
with a hostname for startup name resolution. Indeed, only srv.addr is used
as the connect argument through the function new_reverse_conn(). To fix
this, rely on srv.svc_port: this member is always set for servers using an
IP or a hostname. This is similar to connect_server() on the backend side.

This does not need to be backported.
2023-09-21 16:57:30 +02:00
Sébastien Gross
6a9ba85322 MINOR: hlua: Add support for the "http-after-res" action
This commit introduces support for the "http-after-res" action in
hlua, enabling the invocation of a Lua function in a
"http-after-response" rule. With this enhancement, a Lua action can be
registered using the "http-after-res" action type:

    core.register_action('myaction', {'http-after-res'}, myaction)

A new "lua.myaction" is created and can be invoked in a
"http-after-response" rule:

    http-after-response lua.myaction

This addition provides greater flexibility and extensibility in
handling post-response actions using Lua.

This commit depends on:
 - 4457783 ("MINOR: http_ana: position the FINAL flag for http_after_res execution")

Signed-off-by: Sébastien Gross <sgross@haproxy.com>
2023-09-21 16:31:20 +02:00
Aurelien DARRAGON
95c4d24825 BUG/MEDIUM: server/cli: don't delete a dynamic server that has streams
In cli_parse_delete_server(), we take care of checking that the server is
in MAINT and that the cur_sess counter is set to 0, in the hope that no
connection/stream resources continue to point to the server, else we
refuse to delete it.

As shown in GH #2298, this is not sufficient.

Indeed, when the server option "on-marked-down shutdown-sessions" is not
used, server streams are not purged when srv enters maintenance mode.

As such, there could be remaining streams that point to the server. To
detect this, a secondary check on srv->cur_sess counter was performed in
cli_parse_delete_server(). Unfortunately, there are some code paths that
could lead to cur_sess being decremented without the stream actually
being shut down. As such, if the delete_server CLI command is handled
right after cur_sess has been decremented, with streams still pointing to
the server, we could face some nasty bugs where stream->srv_conn points
to a garbage memory area, as described in the original github report.

To make the check more reliable prior to deleting the server, we don't
rely exclusively on cur_sess and directly check that the server is not
used in any stream through the srv_has_stream() helper function.

Thanks to @capflam who found out the root cause of the bug and greatly
helped to provide the fix.

This should be backported up to 2.6.
2023-09-21 14:57:01 +02:00
Aurelien DARRAGON
0189a4679e MINOR: pattern/ip: simplify pat_match_ip() function
pat_match_ip() has been updated several times over the last decade to
introduce new features, but it was never cleaned up.

The result is that the function is pretty hard to read, and there are
multiple duplicated code blocks so it becomes error-prone to maintain it,
plus it bloats the haproxy binary for nothing.

In this patch, we move the tree search (ip4 / ip6) logic into 2
dedicated helper functions. This allows us to refactor pat_match_ip()
without touching the original behavior.
2023-09-21 09:50:56 +02:00
Aurelien DARRAGON
f80122db26 MINOR: pattern/ip: offload ip conversion logic to helper functions
Now that v4tov6() and v6tov4() were reworked to match behavior from
pat_match_ip() function in ("MINOR: tools/ip: v4tov6() and v6tov4()
rework"), we can remove code duplication in pat_match_ip() by directly
using those dedicated functions where relevant.
2023-09-21 09:50:55 +02:00
Aurelien DARRAGON
72514a4467 MEDIUM: tools/ip: v4tov6() and v6tov4() rework
The v4tov6() and v6tov4() helper functions were initially implemented in
4f92d3200 ("[MEDIUM] IPv6 support for stick-tables").

However, since ceb4ac9c3 ("MEDIUM: acl: support IPv6 address matching"),
support for legacy ip6 to ip4 conversion formats was added, with the
parsing logic directly performed in acl_match_ip (which later became
pat_match_ip).

The issue is that the original v6tov4() function, which is used for
sample expression handling, lacks those additional formats, so we could
face inconsistencies depending on whether we rely on ip4/ip6 conversions
from an ACL context or an expression context.

To unify the ip4/ip6 automatic mapping behavior, we reworked the v4tov6()
and v6tov4() functions so that they now behave like the pat_match_ip()
function.

Note: '6to4 (RFC3056)' and 'RFC4291 ipv4 compatible address' formats are
still supported for legacy purposes despite being deprecated for a while
now.
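
For illustration, here is a small standalone sketch of the mappings
involved (this is not HAProxy's actual v4tov6()/v6tov4() code; the
handling of the deprecated formats is simplified):

    #include <netinet/in.h>
    #include <string.h>
    #include <stdint.h>

    /* map an IPv4 address to an IPv4-mapped IPv6 address (::ffff:a.b.c.d) */
    static void sketch_v4tov6(struct in6_addr *out, const struct in_addr *in)
    {
        memset(out, 0, sizeof(*out));
        out->s6_addr[10] = 0xff;
        out->s6_addr[11] = 0xff;
        memcpy(&out->s6_addr[12], &in->s_addr, 4);
    }

    /* extract an IPv4 address from an IPv6 one when possible:
     *   ::ffff:a.b.c.d   IPv4-mapped
     *   ::a.b.c.d        RFC4291 IPv4-compatible (deprecated)
     *   2002::/16        6to4, RFC3056 (deprecated)
     * returns 1 on success, 0 when the address cannot be converted.
     */
    static int sketch_v6tov4(struct in_addr *out, const struct in6_addr *in)
    {
        static const uint8_t zero10[10];

        if (memcmp(in->s6_addr, zero10, 10) == 0 &&
            in->s6_addr[10] == in->s6_addr[11] &&
            (in->s6_addr[10] == 0xff || in->s6_addr[10] == 0x00)) {
            memcpy(&out->s_addr, &in->s6_addr[12], 4); /* mapped/compatible */
            return 1;
        }
        if (in->s6_addr[0] == 0x20 && in->s6_addr[1] == 0x02) {
            memcpy(&out->s_addr, &in->s6_addr[2], 4);  /* 6to4: v4 in bytes 2..5 */
            return 1;
        }
        return 0;
    }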
2023-09-21 09:50:55 +02:00
Christopher Faulet
d3e379b3ce BUG/MEDIUM: http-ana: Try to handle response before handling server abort
In the request analyser responsible for forwarding the request, we try to
detect the server abort to stop the request forwarding. However, we must
be careful not to block the response processing, if any. Indeed, it is
possible to get the response and the server abort at the same time. In
this case, we must try to forward the response to the client first.

So to fix the issue, in the request analyser we no longer handle the server
abort if the response channel is not empty. In the end, the response
analyser is able to detect the server abort if it is relevant. Otherwise,
the stream will be woken up after the response forwarding and the server
abort should be handled at this stage.

This patch should be backported as far as 2.7 only because the risk of
breakage is high. And it is probably a good idea to wait a bit before
backporting it.
2023-09-21 09:36:37 +02:00
Willy Tarreau
cbbee15462 CLEANUP: ring: rename the ring lock "RING_LOCK" instead of "LOGSRV_LOCK"
The ring lock was initially mostly used for the logs and used to inherit
its name in lock stats. Now that it's exclusively used by rings, let's
rename it accordingly.
2023-09-20 21:38:33 +02:00
Willy Tarreau
cec8b42cb3 MEDIUM: logs: atomically check and update the log sample index
The log server lock is pretty visible in perf top when using log samples
because it's taken for each server in turn while trying to validate and
update the log server's index. Let's change this for a CAS, since we have
the index and the range at hand now. This allows us to remove the logsrv
lock.
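
As a rough standalone illustration of the idea (simplified, with assumed
names; this is not the actual process_send_log() code), the range and the
index can be packed into a single 64-bit word and advanced with a CAS:

    #include <stdatomic.h>
    #include <stdint.h>

    /* upper 32 bits: current range, lower 32 bits: index within the range */
    static _Atomic uint64_t curr_rg_idx;

    /* atomically consume one slot: returns the index that was used and
     * switches to the next range once the current one is exhausted.
     */
    static uint32_t take_log_slot(uint32_t range_sz, uint32_t nb_ranges)
    {
        uint64_t old = atomic_load_explicit(&curr_rg_idx, memory_order_relaxed);
        uint64_t new;
        uint32_t rg, idx;

        do {
            rg  = old >> 32;
            idx = (uint32_t)old;
            if (idx + 1 >= range_sz)                    /* range exhausted */
                new = ((uint64_t)((rg + 1) % nb_ranges)) << 32;
            else
                new = ((uint64_t)rg << 32) | (idx + 1);
        } while (!atomic_compare_exchange_weak_explicit(&curr_rg_idx, &old, new,
                                                        memory_order_relaxed,
                                                        memory_order_relaxed));
        return idx;
    }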

The test on 4 servers now shows a 3.7 times improvement thanks to much
lower contention. Without log sampling a test producing 4.4M logs/s
delivers 4.4M logs/s at 21 CPUs used, everything spent in the kernel.
After enabling 4 samples (1:4, 2:4, 3:4 and 4:4), the throughput would
previously drop to 1.13M log/s with 37 CPUs used and 75% spent in
process_send_log(). Now with this change, 4.25M logs/s are emitted,
using 26 CPUs and 22% in process_send_log(). That's a 3.7x throughput
improvement for a 30% global CPU usage reduction, but in practice it
mostly shows that the performance drop caused by having samples is much
less noticeable (each of the 4 servers has its index updated for each
log).

Note that in order to even avoid incrementing an index for each log srv
that is consulted, it would be more convenient to have a single index
per frontend and apply the modulus on each log server in turn to see if
the range has to be updated. It would then only perform one write per
range switch. However the place where this is done doesn't have access
to a frontend, so some changes would need to be performed for this, and
it would require to update the current range independently in each
logsrv, which is not necessarily easier since we don't know yet if we
can commit it.
2023-09-20 21:38:33 +02:00
Willy Tarreau
e00470378b MINOR: logs: use a single index to store the current range and index
By using a single long long to store both the current range and the
next index, we'll make it possible to perform atomic operations instead
of locking. Let's only regroup them for now under a new "curr_rg_idx".
The upper word is the range, the lower is the index.
2023-09-20 21:38:33 +02:00
Willy Tarreau
49ddc0138c CLEANUP: logs: rename a confusing local variable "curr_rg" to "smp_rg"
The variable curr_rg in process_send_log() is misleading because it is
not related to the integer curr_rg that's used to calculate it, instead
it's a pointer to the current smp_log_range from smp_rgs[], so let's call
it "smp_rg" as a singular for this "smp_rgs" and put an end to this
confusion.
2023-09-20 21:38:33 +02:00
Willy Tarreau
3f1284560f MINOR: log: remove the unused curr_idx in struct smp_log_range
This index is useless because it only serves to know when the global
index reached the end, while the global one already knows it. Let's
just drop it and perform the test on the global range.

It was verified with the following config that the first server continues
to take 1/10 of the traffic, the 2nd one 2/10, the 3rd one 3/10 and the
4th one 4/10:

    log 127.0.0.1:10001 sample 1:10 local0
    log 127.0.0.1:10002 sample 2,5:10 local0
    log 127.0.0.1:10003 sample 3,7,9:10 local0
    log 127.0.0.1:10004 sample 4,6,8,10:10 local0
2023-09-20 21:38:33 +02:00
Willy Tarreau
4351364700 MINOR: logs: clarify the check of the log range
The test of the log range is not very clear, in part due to the
reuse of the "curr_idx" name that happens at two levels. The call
to in_smp_log_range() applies to the smp_info's index to which 1 is
added: it verifies that the next index is still within the current
range.

Let's just have a local variable "next_index" in process_send_log()
that gets assigned the next index (current+1) and compare it to the
current range's boundaries. This makes the test much clearer. We can
then simply remove in_smp_log_range() that's no longer needed.
2023-09-20 21:38:33 +02:00
Aurelien DARRAGON
2c9bd3ae80 BUG/MINOR: server: add missing free for server->rdr_pfx
rdr_pfx was not being freed during server cleanup, leading to a small
memory leak when the "redir" argument was used on a server line (HTTP only).

This should be backported to all stable versions.

[For 2.6 and 2.7: the free should be performed in srv_drop() directly.
 For older versions: free in deinit() function near the free for the
 cookie string]
2023-09-15 17:46:49 +02:00
Willy Tarreau
6cbb5a057b Revert "MAJOR: import: update mt_list to support exponential back-off"
This reverts commit c618ed5ff4.

The list iterator is broken. As found by Fred, running QUIC single-
threaded shows that only the first connection is accepted because the
accepter relies on the element being initialized once detached (which
is expected and matches what MT_LIST_DELETE_SAFE() used to do before).
However, while doing this in the quic_sock code seems to work, doing it
inside the macro shows total breakage and the unit test doesn't work
anymore (random crashes). Thus it looks like the fix is not trivial,
let's roll this back for the time it will take to fix the loop.
2023-09-15 17:13:43 +02:00
William Lallemand
694889ac2d BUILD: quic: fix build on centos 8 and USE_QUIC_OPENSSL_COMPAT
When using USE_QUIC_OPENSSL_COMPAT=1 on centos-8, the build fails this
way:

In file included from src/quic_openssl_compat.c:11:
/usr/include/openssl/kdf.h:33:46: error: unknown type name 'va_list'
 int EVP_KDF_vctrl(EVP_KDF_CTX *ctx, int cmd, va_list args);

This is because openssl/kdf.h is included before openssl-compat.h.
2023-09-14 16:26:58 +02:00
Christopher Faulet
89e20033c7 BUG/MAJOR: mux-h2: Report a protocol error for any DATA frame before headers
If any DATA frame is received before all headers are fully received, a
protocol error must be reported. It is required by the HTTP/2 RFC, but it
is also important because the HTTP analyzers expect the first HTX block
to be a start-line. It leads to a crash if this requirement is not
respected.

For instance, it is possible to trigger a crash by sending an interim
message with a DATA frame (it may be an empty DATA frame with the ES
flag). AFAIK, only the server side is affected by this bug.

To fix the issue, a protocol error is reported for the stream.

This patch should fix the issue #2291. It must be backported as far as 2.2
(and probably to 2.0 too).
2023-09-14 11:39:39 +02:00
Frédéric Lécaille
3921bf80c7 BUG/MINOR: quic: Leak of frames to send.
In very rare cases, it is possible that packets are detected as lost,
their frames requeued, then the connection released for any reason
(killed because of a sendto() fatal failure, for instance). Such frames
are lost and never released because the function which releases their
packet number spaces does not release the frames which are still
enqueued to be sent.

Must be backported as far as 2.6.
2023-09-13 15:32:14 +02:00
William Lallemand
c7424a1bac MINOR: samples: implement bytes_in and bytes_out samples
%[bytes_in] and %[bytes_out] are equivalent to %U and %B tags in
log-format.
2023-09-13 14:54:50 +02:00
Willy Tarreau
5abbae2d3d CLEANUP: pools: simplify the pool expression when no pool was matched in dump
When dumping pool information, we make a special case of the condition
where the pool couldn't be identified and we consider that it was the
correct one. In the code arrangements brought by commit efc46dede ("DEBUG:
pools: inspect pools on fatal error and dump information found"), a
ternary expression for testing this depends on the "if" block condition
so this can be simplified and will make Coverity happy. This was reported
in GH #2290.
2023-09-13 13:31:41 +02:00
Willy Tarreau
c618ed5ff4 MAJOR: import: update mt_list to support exponential back-off
The new mt_list code supports exponential back-off on conflict, which
is important for use cases where there is contention on a large number
of threads. The API evolved a little bit and required some updates:

  - mt_list_for_each_entry_safe() is now in upper case to explicitly
    show that it is a macro, and only uses the back element, doesn't
    require a secondary pointer for deletes anymore.

  - MT_LIST_DELETE_SAFE() doesn't exist anymore, instead one just has
    to set the list iterator to NULL so that it is not re-inserted
    into the list and the list is spliced there. One must be careful
    because it was usually performed before freeing the element. Now
    instead the element must be nulled before the continue/break.

  - MT_LIST_LOCK_ELT() and MT_LIST_UNLOCK_ELT() have always been
    unclear. They were replaced by mt_list_cut_around() and
    mt_list_connect_elem() which more explicitly detach the element
    and reconnect it into the list.

  - MT_LIST_APPEND_LOCKED() was only in haproxy so it was left as-is
    in list.h. It may however possibly benefit from being upstreamed.

This required tiny adaptations to event_hdl.c and quic_sock.c. The
test case was updated and the API doc added. Note that in order to
keep include files small, the struct mt_list definition remains in
list-t.h (part of the internal API) and was ifdef'd out in mt_list.h.

A test on QUIC with both quictls 1.1.1 and wolfssl 5.6.3 on ARM64 with
80 threads shows a drastic reduction of CPU usage thanks to this and
the refined memory barriers. Please note that the CPU usage on OpenSSL
3.0.9 is significantly higher due to the excessive use of atomic ops
by openssl, but 3.1 is only slightly above 1.1.1 though:

  - before: 35 Gbps, 3.5 Mpps, 7800% CPU
  - after:  41 Gbps, 4.2 Mpps, 2900% CPU
2023-09-13 11:50:33 +02:00
Christopher Faulet
13fb7170be BUG/MEDIUM: master/cli: Pin the master CLI on the first thread of the group 1
There is no reason to start the master CLI on several threads and on several
groups. And in fact, it must not be done, otherwise the same FD is inserted
several times into the fdtab, leading to a crash during startup because of
a BUG_ON(). This happens when several groups are configured.

To fix the bug the master CLI is now pinned on the first thread of the first
group.

This patch should fix the issue #2259 and must be backported to 2.8.
2023-09-13 10:26:32 +02:00
Christopher Faulet
665703d456 BUG/MEDIUM: mux-fcgi: Don't swap trash and dbuf when handling STDERR records
Trash chunks are buffers, but they are not allocated from the buffers
pool. And the "trash" chunk is static and thread-local. These are two
reasons not to swap it with a regular buffer allocated from the buffers
pool.

Unfortunately, that is exactly what is performed in the FCGI mux when a
STDERR record is handled. b_xfer() is used to copy data from the demux
buffer to the trash to format the error message. A zero-copy via a swap
may be performed. In this case, this leads to a memory corruption and a
crash because, some time later, the demux buffer is released because it
is empty, and it is in fact the trash chunk.

b_force_xfer() must be used instead. This function forces the copy.
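
As a simplified illustration of the difference between the two transfer
modes (using a toy buffer type, not HAProxy's struct buffer nor the real
b_xfer()/b_force_xfer() implementations):

    #include <string.h>

    struct sbuf { char *area; size_t size; size_t data; };

    /* zero-copy transfer: when <dst> is empty, simply swap the storage.
     * This is fine between two pool-allocated buffers, but NOT when <dst>
     * is the static thread-local trash: the pool would later free the
     * trash's storage as if it were one of its own buffers.
     */
    static void xfer_swap(struct sbuf *dst, struct sbuf *src)
    {
        struct sbuf tmp = *dst;

        *dst = *src;
        *src = tmp;
    }

    /* forced copy: always copies the bytes, leaving each buffer's storage
     * where it belongs; this is the safe choice for the trash chunk.
     */
    static size_t xfer_copy(struct sbuf *dst, struct sbuf *src, size_t count)
    {
        if (count > dst->size - dst->data)
            count = dst->size - dst->data;
        memcpy(dst->area + dst->data, src->area, count);
        dst->data += count;
        memmove(src->area, src->area + count, src->data - count);
        src->data -= count;
        return count;
    }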

This patch must be backported as far as 2.2. For 2.4 and 2.2, b_force_xfer()
does not exist. For these versions, the following commit must be backported
too:

  * c7860007cc ("MINOR: buf: Add b_force_xfer() function")
2023-09-12 19:50:17 +02:00
Aurelien DARRAGON
1115fc348e BUG/MINOR: hlua/init: coroutine may not resume itself
It's not supported to call lua_resume with <L> and <from> designating
the same lua coroutine. It didn't cause visible bugs so far because
Lua 5.3 used to be more permissive about this, and moreover, yielding
is not involved during the hlua init state.

But this is wrong usage, and the doc clearly specifies that the <from>
argument can be NULL when there is no such coroutine, which is the case
here.

This should be backported to every stable version.
2023-09-12 19:50:17 +02:00
Aurelien DARRAGON
e7281f3f5d BUG/MEDIUM: hlua: don't pass stale nargs argument to lua_resume()
In hlua_ctx_resume(), we call lua_resume() function like this:

  lua_resume(lua->T, hlua_states[lua->state_id], lua->nargs)

Once the call returns, we may call the function again with the same
hlua context when E_YIELD is returned (the execution was interrupted
and may be resumed through another lua_resume() call).

The 3rd argument to lua_resume(), 'nargs', is a hint passed to Lua to
know how many (optional) arguments were pushed on the stack prior to
resuming the execution (arguments that Lua will then expose to the Lua
script).

But here is the catch: we never reset lua->nargs between successive
lua_resume() calls, meaning that next lua_resume() calls will still
inherit from the initial nargs value that was set in hlua ctx prior
to calling hlua_ctx_resume() (our wrapper function) for the first time.

This is problematic, because despite not being explicitly mentioned in
the Lua documentation, passed arguments (to which `nargs` refer to), are
already consumed once lua_resume() returns.

This means that we cannot keep calling lua_resume() with non-zero nargs
if we don't push new arguments on the stack prior to resuming lua after
the initial call: nargs is proper to a single lua_resume() invocation.

Despite improper use of lua_resume() for a long time, this didn't cause
visible issues in the past with Lua 5.3, but it is particularly sensitive
starting with Lua 5.4.3 due to debugging hooks improvements that led to
some internal changes (see: lua/lua@58aa09a). Not using nargs properly
now exposes us to undefined behavior when resuming after a yield triggered
from a debugging hook, which may cause running scripts to crash
unexpectedly: for instance with Lua raising errors and complaining about
values being NULL where it should not be the case.

For reference, this issue was initially raised on the Lua mailing list:
  http://lua-users.org/lists/lua-l/2023-09/msg00005.html

In this patch, we immediately reset nargs when lua_resume() returns to
prevent any misuse.
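
A minimal sketch of the idea, assuming the Lua 5.3 three-argument
lua_resume() and a simplified context structure (the field and function
names below are illustrative, not HAProxy's exact code):

    #include <lua.h>

    struct resume_ctx {
        lua_State *T;   /* the coroutine being driven */
        int nargs;      /* arguments pushed for the *next* resume only */
    };

    static int ctx_resume(struct resume_ctx *ctx, lua_State *from)
    {
        int ret = lua_resume(ctx->T, from, ctx->nargs);

        /* the pushed arguments are consumed by this resume: reset nargs so
         * that a later resume (e.g. after a yield) doesn't claim stale
         * arguments that are no longer on the stack.
         */
        ctx->nargs = 0;
        return ret;
    }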

It should be backported to every maintained versions.
2023-09-12 19:50:17 +02:00
Willy Tarreau
93c2ea0ec3 MEDIUM: pools: refine pool size rounding
The pool sizes were rounded up a little bit too much with commit
30f931ead ("BUG/MEDIUM: pools: fix the minimum allocation size"). The
goal was in fact to make sure they were always at least large enough to
store 2 list heads, and stuffing this into the alignment calculation
resulted in the size being always rounded up to this size. This is
problematic because it means that the appended tag at the end doesn't
always catch potential overflows since more bytes than needed are
allocated. Moreover, this test was later reinforced by commit b5ba09ed5
("BUG/MEDIUM: pools: ensure items are always large enough for the
pool_cache_item"), proving that the first test was not always sufficient.

This needs to be reworked to proceed correctly:
  - the two lists are needed when the object is in the cache, hence
    when we don't care about the tag, which means that the tag's size,
    if any, can easily cover for the missing bytes to reach that size.
    This is actually what was already being checked for.

  - the rounding should not be performed (beyond the size of a word to
    preserve pointer alignment) when pool tagging is enabled, otherwise
    we don't detect small overflows. It means that there will be less
    merging when proceeding like this. Tests show that we merge 93 pools
    into 36 without tags and 43 with tags enabled.

  - the rounding should not consider the extra size, since it's already
    done when calculating the allocated size later (i.e. don't round up
    twice). The difference is subtle but it's what makes sure the tag
    immediately follows the area instead of starting from the end.

Thanks to this, now when writing one byte too many at the end of a struct
stream, the error is instantly caught.
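
A standalone sketch of the intended sizing logic; the names, the tag
layout and the cache's two-list-head requirement below are written from
this description and are not the actual pool code:

    #include <stddef.h>

    /* minimum storage needed while an object sits in the cache:
     * two list heads, i.e. four pointers.
     */
    #define CACHE_MIN  (4 * sizeof(void *))

    static size_t pool_item_size(size_t user_sz, size_t tag_sz)
    {
        /* round up to the pointer size only, to preserve alignment;
         * rounding further would hide small overflows behind padding.
         */
        size_t sz = (user_sz + sizeof(void *) - 1) & ~(sizeof(void *) - 1);

        /* the cache stores its two list heads in the object itself; the
         * tag's storage, if any, may cover part of that need.
         */
        if (sz + tag_sz < CACHE_MIN)
            sz = CACHE_MIN - tag_sz;

        /* the tag, if any, is appended right after <sz> so that it
         * immediately follows the payload and catches off-by-one writes.
         */
        return sz + tag_sz;
    }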
2023-09-12 18:14:05 +02:00
Willy Tarreau
61575769ac DEBUG: pools: print the contents surrounding the expected tag location
When no tag matches a known pool, we can inspect the surrounding area to
help figure out what could have possibly overwritten memory. The contents are printed
one machine word per line in hex, then using printable characters, and
when they can be resolved to a pointer, either the pool's pointer name
or a resolvable symbol with offset. The goal here is to help recognize
what is easily identifiable in memory.

For example applying the following patch to stream_free():

  -	pool_free(pool_head_stream, s);
  +	pool_free(pool_head_stream, (void*)s+1);

Causes the following dump to be emitted:

  FATAL: pool inconsistency detected in thread 1: tag mismatch on free().
    caller: 0x59e968 (stream_free+0x6d8/0xa0a)
    item: 0x13df5c1
    pool: 0x12782c0 ('stream', size 888, real 904, users 1)
  Tag does not match (0x4f00000000012782). Tag does not match any other pool.
  Contents around address 0x13df5c1+888=0x13df939:
         0x13df918 [00 00 00 00 00 00 00 00] [........]
         0x13df920 [00 00 00 00 00 00 00 00] [........]
         0x13df928 [00 00 00 00 00 00 00 00] [........]
         0x13df930 [00 00 00 00 00 00 00 00] [........]
         0x13df938 [c0 82 27 01 00 00 00 00] [..'.....] [pool:stream]
         0x13df940 [4f c0 59 00 00 00 00 00] [O.Y.....] [stream_new+0x4f/0xbec]
         0x13df948 [49 46 49 43 41 54 45 2d] [IFICATE-]
         0x13df950 [81 02 00 00 00 00 00 00] [........]
         0x13df958 [df 13 00 00 00 00 00 00] [........]
  Other possible callers:
  (...)

We notice that the tag references pool_head_stream with the allocation
point in stream_new. Another benefit is that a caller may be figured
from the tag even if the "caller" feature is not enabled, because upon
a free() we always put the caller's location into the tag. This should
be sufficient to debug most cases that normally require gdb.
2023-09-12 18:14:05 +02:00
Willy Tarreau
0f9a10c7f1 DEBUG: pools: also print the value of the tag when it doesn't match
Sometimes the tag's value may reveal a recognizable pattern, so let's
print it when it doesn't match a known pool.
2023-09-12 18:14:05 +02:00
Willy Tarreau
96c1a24224 DEBUG: pools: also print the item's pointer when crashing
The item pointer is important to have when inspecting a core or trying to
recognize some values, but it was not provided.
2023-09-12 18:14:05 +02:00
Willy Tarreau
efc46dede9 DEBUG: pools: inspect pools on fatal error and dump information found
It's a bit frustrating sometimes to see pool checks catch a bug but not
provide exploitable information without a core.

Here we're adding a function "pool_inspect_item()" which is called just
before aborting in pool_check_pattern() and POOL_DEBUG_CHECK_MARK() and
which will display the error type, the pool's pointer and name, and will
try to check if the item's tag matches the pool, and if not, will iterate
over all pools to see if one would be a better candidate, then will try
to figure the last known caller and possibly other likely candidates if
the pool's tag is not sufficiently trusted. This typically helps better
diagnose corruption in use-after-free scenarios, or freeing to a pool
that differs from the one the object was allocated from, and will also
indicate calling points that may help figure where an object was last
released or allocated. The info is printed on stderr just before the
backtrace.

For example, the recent off-by-one test in the PPv2 changes would have
produced the following output in vtest logs:

  ***  h1    debug|FATAL: pool inconsistency detected in thread 1: tag mismatch on free().
  ***  h1    debug|  caller: 0x62bb87 (conn_free+0x147/0x3c5)
  ***  h1    debug|  pool: 0x2211ec0 ('pp_tlv_256', size 304, real 320, users 1)
  ***  h1    debug|Tag does not match. Possible origin pool(s):
  ***  h1    debug|  tag: @0x2565530 = 0x2216740 (pp_tlv_128, size 176, real 192, users 1)
  ***  h1    debug|Recorded caller if pool 'pp_tlv_128':
  ***  h1    debug|  @0x2565538 (+0184) = 0x62c76d (conn_recv_proxy+0x4cd/0xa24)

A mismatch in the allocated/released pool is already visible, and the
callers confirm it once resolved, where the allocator indeed allocates
from pp_tlv_128 and conn_free() releases to pp_tlv_256:

  $ addr2line -spafe ./haproxy <<< $'0x62bb87\n0x62c76d'
  0x000000000062bb87: conn_free at connection.c:568
  0x000000000062c76d: conn_recv_proxy at connection.c:1177
2023-09-11 15:46:14 +02:00
Willy Tarreau
f6bee5a50b DEBUG: pools: make pool_check_pattern() take a pointer to the pool
This will be useful to report detailed bug traces.
2023-09-11 15:19:49 +02:00
Willy Tarreau
e92e96b00f DEBUG: pools: pass the caller pointer to the check functions and macros
In preparation for more detailed pool error reports, let's pass the
caller pointers to the check functions. This will be useful to produce
messages indicating where the issue happened.
2023-09-11 15:19:49 +02:00
Willy Tarreau
baf2070421 DEBUG: pools: always record the caller for uncached allocs as well
When recording the caller of a pool_alloc(), we currently store it only
when the object comes from the cache and never when it comes from the
heap. There's no valid reason for this except that the caller's pointer
was not passed to pool_alloc_nocache(), so it used to set NULL there.
Let's just pass it down the chain.
2023-09-11 15:19:49 +02:00
Frédéric Lécaille
2dedbe76c9 BUG/MINOR: quic: fdtab array underflow access
When using the listener socket as the file descriptor, the qc->fd value is -1.
In this case, one must not access the fdtab[qc->fd] element to change its value.

This bug could have been detected by asan with such a backtrace:

=================================================================
==402222==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7fa8ecf417ex7fa8e915cf90 sp 0x7fa8e915cf88
WRITE of size 8 at 0x7fa8ecf417e8 thread T6
    #0 0x55707a0bf18a in qc_new_cc_conn src/quic_conn.c:838
    #1 0x55707a0c6dc0 in quic_conn_release src/quic_conn.c:1408
    #2 0x55707a10916f in quic_close src/xprt_quic.c:35
    #3 0x55707a0cec77 in conn_xprt_close include/haproxy/connection.h:153
    #4 0x55707a0ceed0 in conn_full_close include/haproxy/connection.h:197
    #5 0x55707a0ec253 in qcc_release src/mux_quic.c:2412
    #6 0x55707a0ec7d0 in qcc_io_cb src/mux_quic.c:2443
    #7 0x55707a63ff2a in run_tasks_from_lists src/task.c:596
    #8 0x55707a641cc9 in process_runnable_tasks src/task.c:876
    #9 0x55707a56f7b2 in run_poll_loop src/haproxy.c:2954
    #10 0x55707a5705fd in run_thread_poll_loop src/haproxy.c:3153
    #11 0x7fa8f9450ea6 in start_thread nptl/pthread_create.c:477
    #12 0x7fa8f936ea2e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0xfba2e)

0x7fa8ecf417e8 is located 24 bytes to the left of 134217728-byte region [0x7fa8e
allocated by thread T0 here:
    #0 0x7fa8f9a37037 in __interceptor_calloc ../../../../src/libsanitizer/asan/
    #1 0x55707a71a61d in init_pollers src/fd.c:1161
    #2 0x55707a56cdf1 in init src/haproxy.c:2672
    #3 0x55707a5714c2 in main src/haproxy.c:3298
    #4 0x7fa8f9296d09 in __libc_start_main ../csu/libc-start.c:308

Thread T6 created by T0 here:
    #0 0x7fa8f99e22a2 in __interceptor_pthread_create ../../../../src/libsanitizpp:214
    #1 0x55707a748a21 in setup_extra_threads src/thread.c:252
    #2 0x55707a5735c9 in main src/haproxy.c:3844
    #3 0x7fa8f9296d09 in __libc_start_main ../csu/libc-start.c:308

SUMMARY: AddressSanitizer: heap-buffer-overflow src/quic_conn.c:838 in qc_new_cc
Shadow bytes around the buggy address:
  0x0ff59d9e02a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ff59d9e02b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ff59d9e02c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ff59d9e02d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ff59d9e02e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0ff59d9e02f0: fa fa fa fa fa fa fa fa fa fa fa fa fa[fa]fa fa
  0x0ff59d9e0300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ff59d9e0310: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ff59d9e0320: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ff59d9e0330: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ff59d9e0340: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==402222==ABORTING
Aborted

Thank you to @Tristan971 for having reported this bug in GH #2247.

No need to backport.
2023-09-11 15:14:22 +02:00
Willy Tarreau
4a18d9e560 REORG: cpuset: move parse_cpu_set() and parse_cpumap() to cpuset.c
These ones were still in cfgparse.c but they're not specific to the
config at all and may actually be used even when parsing cpu list
entries in /sys. Better move them where they can be reused.
2023-09-08 16:25:19 +02:00
Willy Tarreau
5119109e3f MINOR: cpuset: dynamically allocate cpu_map
cpu_map is 8.2kB/entry and there's one such entry per group, that's
~520kB total. In addition, the init code is still in haproxy.c enclosed
in ifdefs. Let's make this a dynamically allocated array in the cpuset
code and remove that init code.

Later we may even consider reallocating it once the number of threads
and groups is known, in order to shrink it a little bit, as the typical
setup with a single group will only need 8.2kB, thus saving half a MB
of RAM. This would require that the upper bound is placed in a variable
though.
2023-09-08 16:25:19 +02:00
Willy Tarreau
b0f20ed79b MEDIUM: cfgparse: assign NUMA affinity to cpu-maps
Do not force affinity on the process, instead let's just apply it to
cpu-map, it will automatically be used later in the init process. We
can do this because we know that cpu-map was not set when we're using
this detection code.

This is much saner, as we don't need to manipulate the process' affinity
at this point in time, and just update the info that the user omitted to
set by themselves, which guarantees a better long-term consistency with
the documented feature.
2023-09-08 16:25:19 +02:00
Willy Tarreau
809a49da96 MINOR: cfgparse: use read_line_from_trash() to read from /sys
It's easier to use this function now to natively support variable
fields in the file's path. This also removes read_file_from_trash()
that was only used here and was static.
2023-09-08 16:25:19 +02:00
Willy Tarreau
1f2433fb6a MINOR: tools: add function read_line_to_trash() to read a line of a file
This function takes on input a printf format for the file name, making
it particularly suitable for /proc or /sys entries which take a lot of
numbers. It also automatically trims the trailing CR and/or LF chars.
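
A standalone sketch of such a helper (this is not the actual
read_line_to_trash() implementation; the name and the fixed-size path
buffer are assumptions):

    #include <stdio.h>
    #include <stdarg.h>
    #include <string.h>

    /* read the first line of the file whose path is built from a printf
     * format, strip the trailing CR/LF, and return the line length or -1.
     */
    static int read_line_from_fmt(char *dst, size_t dst_sz, const char *fmt, ...)
    {
        char path[256];
        va_list ap;
        FILE *f;
        size_t len;

        va_start(ap, fmt);
        vsnprintf(path, sizeof(path), fmt, ap);
        va_end(ap);

        f = fopen(path, "r");
        if (!f)
            return -1;
        if (!fgets(dst, (int)dst_sz, f)) {
            fclose(f);
            return -1;
        }
        fclose(f);
        len = strcspn(dst, "\r\n");   /* trim trailing CR and/or LF */
        dst[len] = 0;
        return (int)len;
    }

    /* e.g.:
     *   read_line_from_fmt(buf, sizeof(buf),
     *                      "/sys/devices/system/cpu/cpu%d/topology/core_id", cpu);
     */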
2023-09-08 16:25:19 +02:00
Willy Tarreau
5f10176e2c MEDIUM: init: initialize the trash earlier
More and more utility functions rely on the trash while most of the init
code doesn't have access to it because it's initialized very late (in
PRE_CHECK for the initial one). It's a pool, and it purposely supports
being reallocated, so let's initialize it in STG_POOL so that early
STG_INIT code can at least use it.
2023-09-08 16:25:19 +02:00
Frédéric Lécaille
e3e218b98e CLEANUP: quic: Remove useless free_quic_tx_pkts() function.
This function is defined but no longer used since this commit:
    BUG/MAJOR: quic: Really ignore malformed ACK frames.
2023-09-08 10:17:25 +02:00
Frédéric Lécaille
292dfdd78d BUG/MINOR: quic: Wrong cluster secret initialization
The function generate_random_cluster_secret(), which initializes the cluster
secret when it is not supplied by configuration, is buggy. There is a 1/256
chance that the cluster secret string is empty.

To fix this, the cluster secret is now stored in a reduced form: the first
128 bits of the SHA1 (160-bit) digest of the configured string, if defined
by configuration. If this is not the case, it is initialized with a 128-bit
random value. Thus, the cluster secret is always initialized.
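
A minimal standalone sketch of the resulting behavior using OpenSSL's
SHA1() and RAND_bytes() (the function name and the exact OpenSSL usage
are assumptions, not the patch's code):

    #include <openssl/sha.h>
    #include <openssl/rand.h>
    #include <string.h>

    /* always produce a 128-bit cluster secret: derived from the configured
     * string when present, randomly generated otherwise.
     * Returns 1 on success, 0 on failure of the random generator.
     */
    static int make_cluster_secret(unsigned char out[16], const char *configured)
    {
        if (configured && *configured) {
            unsigned char digest[SHA_DIGEST_LENGTH];   /* 160 bits */

            SHA1((const unsigned char *)configured, strlen(configured), digest);
            memcpy(out, digest, 16);                   /* keep the first 128 bits */
            return 1;
        }
        return RAND_bytes(out, 16) == 1;               /* random 128-bit secret */
    }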

As the cluster secret is always initialized, there are several tests which
are from now on useless. This patch removes such tests
(if (global.cluster_secret)) in the QUIC code and at parsing time: there is
no need to check that a cluster secret was initialized with the
"quic-force-retry" option.

Must be backported as far as 2.6.
2023-09-08 09:50:58 +02:00
William Lallemand
15e591b6e0 MINOR: ssl: add support for 'curves' keyword on server lines
This patch implements the 'curves' keyword on server lines as well as
the 'ssl-default-server-curves' keyword in the global section.

It also adds the keyword on the server line in the ssl_curves reg-test.

These keywords allow the configuration of the curves list for a server.
2023-09-07 23:29:10 +02:00
Willy Tarreau
28ff1a5d56 MINOR: tasks/stats: report the number of niced tasks in "show info"
We currently know the number of tasks in the run queue that are niced,
and we don't expose it. It's too bad because it can give a hint about
what share of the load is relevant. For example if one runs a Lua
script that was purposely reniced, or if a stats page or the CLI is
hammered with slow operations, seeing them appear there can help
identify what part of the load is not caused by the traffic, and
improve monitoring systems or autoscalers.
2023-09-06 17:44:44 +02:00
Remi Tricot-Le Breton
e03d060aa3 MINOR: cache: Change hash function in default normalizer used in case of "vary"
When building the secondary signature for cache entries when vary is
enabled, the referer part of the signature was a simple crc32 of the
first referer header.
This patch changes it to a 64-bit hash based on the xxhash algorithm with a
random seed built during init. This will prevent "malicious" hash
collisions between entries of the cache.
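
As a rough sketch of the approach, assuming the xxHash library's one-shot
XXH64() API (the variable and function names are illustrative):

    #include <stddef.h>
    #include <xxhash.h>

    /* seeded once at startup from a random source so that hash values
     * differ between processes and cannot easily be predicted by clients.
     */
    static XXH64_hash_t referer_seed;

    static XXH64_hash_t hash_referer(const char *value, size_t len)
    {
        /* 64-bit seeded hash instead of a bare crc32 of the header value */
        return XXH64(value, len, referer_seed);
    }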
2023-09-06 16:11:31 +02:00
Aurelien DARRAGON
7a71801af6 CLEANUP: log: remove unnecessary trim in __do_send_log
Since both sink_write and fd_write_frag_line take the maxlen parameter as
an argument, there is no added value in trimming the msg parameter before
passing it to those functions.
2023-09-06 16:06:39 +02:00
Aurelien DARRAGON
8e6339aa29 MEDIUM: sink: add sink_finalize() function
To further clean the code and remove duplication, some sink postparsing
and sink->sft finalization is now performed in a dedicated function
named sink_finalize().
2023-09-06 16:06:39 +02:00
Aurelien DARRAGON
b2879e3502 MEDIUM: sink/ring: introduce high level ring creation helper function
Introduce a high-level ring creation helper function to ease code
maintenance.
2023-09-06 16:06:39 +02:00
Aurelien DARRAGON
5a8755681d MINOR: sink: add helper function to deallocate sink struct
In this patch we move sink freeing logic outside of sink_deinit() function
in order to create the sink_free() helper function that could be used
on error paths for example.
2023-09-06 16:06:39 +02:00
Aurelien DARRAGON
6049a478e4 MEDIUM: spoe-agent: properly postresolve log rings
Now that we have sink_postresolve_logsrvs() function, we make use of it
for spoe-agent log postparsing logic.

This will allow this kind of config to work:
  |spoe-agent test
  |        log tcp@127.0.0.1:514 local0
  |        use-backend xxx

Plus, consistency checks will also be performed as for regular log
directives used from global, log-forward or proxy sections.
2023-09-06 16:06:39 +02:00
Aurelien DARRAGON
486aa01204 MEDIUM: fcgi-app: properly postresolve logsrvs
Now that we have postresolve_logsrv_list() function, we make use of it
for fcgi-app log postparsing logic.

This will allow this kind of config to work:
  |fcgi-app test
  |        docroot /
  |        log-stderr tcp@127.0.0.1:514 local0

Plus, consistency checks will also be performed as for regular log
directives used from global, log-forward or proxy sections.
2023-09-06 16:06:39 +02:00
Aurelien DARRAGON
d9b81e5b49 MEDIUM: log/sink: make logsrv postparsing more generic
We previously had postparsing logic, but only for logsrv sinks. Now we
need to perform this operation on logsrvs directly instead of sinks, to
prepare for additional postparsing logic that is not sink-specific.

To do this, we migrated post_sink_resolve() and sink_postresolve_logsrvs()
to their postresolve_logsrvs() and postresolve_logsrv_list() equivalents.

Then, we split postresolve_logsrv_list() so that the sink-only logic stays
in sink.c (sink_resolve_logsrv_buffer() function), and the "generic"
target part stays in log.c as resolve_logsrv().

Error messages formatting was preserved as far as possible but some slight
variations are to be expected.
As for the functional aspect, no change should be expected.
2023-09-06 16:06:39 +02:00
Aurelien DARRAGON
969e212c66 MINOR: log: add dup_logsrv() helper function
Ease code maintenance by introducing the dup_logsrv() helper function to
properly duplicate an existing logsrv struct.
2023-09-06 16:06:39 +02:00
Aurelien DARRAGON
7a12e2d369 MEDIUM: httpclient/logs: rely on per-proxy post-check instead of global one
httpclient used to register a global post-check function to iterate over
all known proxies and post-initialize httpclient related ones (mainly
for logs initialization).

But we currently have an issue: post_sink_resolve() function which is
also registered using REGISTER_POST_CHECK() macro conflicts with
httpclient_postcheck() function.

This is because post_sink_resolve() relies on proxy->logsrvs to be
correctly initialized already, and httpclient_postcheck() may create
and insert new logsrvs entries to existing proxies when executed.

So depending on which function runs first, we could run into trouble.

Fortunately, to this day, everything works "by accident" due to the
http_client.c file being loaded before the sink.c file when compiling the
source code.

But as soon as we would move one of the two functions to other files, or
if we rename files or make changes to the Makefile build recipe, we could
break this at any time.

To prevent post_sink_resolve() from randomly failing in the future, we now
make httpclient postcheck rely on per-proxy post-checks by slightly
modifying httpclient_postcheck() function so that it can be registered
using REGISTER_POST_PROXY_CHECK() macro.

As per-proxy post-check functions are executed right after config parsing
for each known proxy (vs global post-check which are executed a bit later
in the init process), we can be certain that functions registered using
global post-check macro, ie: post_sink_resolve(), will always be executed
after httpclient postcheck, effectively resolving the ordering conflict.

This should normally not cause visible behavior changes, and while it
could be considered as a bug, it's probably not worth backporting it
since the only way to trigger the issue is through code refactors,
unless we want to backport it to ease code maintenance of course,
in which case it should easily apply for >= 2.7.
2023-09-06 16:06:39 +02:00
Aurelien DARRAGON
e187361b52 MINOR: log: move log-forwarders cleanup in log.c
Move the log forwarders cleanup from the global deinit() function into the
log-dedicated deinit function.

No backport needed.
2023-09-06 16:06:39 +02:00
Aurelien DARRAGON
32f1db6d0d MEDIUM: sink: don't perform implicit truncations when maxlen is not set
maxlen now defaults to ~0 (instead of BUFSIZE) to make sure no implicit
truncation will be performed when the option is not specified, since the
doc doesn't mention any default value for maxlen. As such, if the payload
is too big, it will be dropped (this is the default expected behavior).
2023-09-06 16:06:39 +02:00
Aurelien DARRAGON
fdf82d058b MINOR: sink: inform the user when logs will be implicitly truncated
Consider the following example:

 |log ring@test-ring len 2000 local0
 |
 |ring test-ring
 |  maxlen 1000

This would result in emitted logs being silently truncated to 1000 because
test-ring maxlen is smaller than the log directive maxlen.

In this patch we're adding an extra check in post_sink_resolve() to detect
this kind of confusing setup and warn the user about the implicit
truncation when DIAG mode is on.

This commit depends on:
 - "MINOR: sink: simplify post_sink_resolve function"
2023-09-06 16:06:39 +02:00
Aurelien DARRAGON
ceaa1ddb06 MINOR: log/sink: detect when log maxlen exceeds sink size
To prevent logs from being silently (and unexpectedly) dropped at runtime,
we check that the maxlen parameter of the log directives is strictly
smaller than the targeted ring size.

      |global
      |     tune.bufsize 16384
      |     log tcp@127.0.0.1:514 len 32768
      |     log myring@127.0.0.1:514 len 32768
      |ring myring
      |     # no explicit size

On such configs, a diag warning will be reported.

This commit depends on:
 - "MINOR: sink: simplify post_sink_resolve function"
 - "MINOR: ring: add a function to compute max ring payload"
2023-09-06 16:06:39 +02:00
Aurelien DARRAGON
d499485aa9 MINOR: sink: simplify post_sink_resolve function
Simplify post_sink_resolve() function to reduce code duplication and
make it easier to maintain.
2023-09-06 16:06:39 +02:00
Aurelien DARRAGON
ddd8671b19 BUG/MEDIUM: ring: adjust maxlen consistency check
When the user specifies a maxlen parameter that is greater than the size of
a given ring section, a warning is emitted to inform that the max length
exceeds the size, and maxlen is then forced to the size.

The logic is good, but imprecise, because it doesn't take into account
the slight overhead from storing payloads into the ring.

In practice, we cannot store a single message whose length is exactly equal
to the size. Trying to do so will result in the message being dropped at
runtime.

Thanks to the ring_max_payload() function introduced in "MINOR: ring: add
a function to compute max ring payload", we can now deduce the maximum
value for the maxlen parameter before it could result in messages being
dropped.

When maxlen is set to an improper value, the warning is emitted and maxlen
is forced to the maximum "single" payload length that can fit in the ring
buffer, preventing messages from being dropped unexpectedly.
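
As a rough sketch of the clamping principle, consider the self-contained
example below; the real ring_max_payload() accounts for the ring's actual
per-record framing, the 16-byte overhead here is only an assumption:

  #include <stdio.h>

  #define MSG_OVERHEAD 16   /* assumed per-record overhead, not the real value */

  static size_t max_payload(size_t ring_size)
  {
          return ring_size > MSG_OVERHEAD ? ring_size - MSG_OVERHEAD : 0;
  }

  int main(void)
  {
          size_t ring_size = 16384;
          size_t maxlen    = 16384;   /* user asked for maxlen == size */
          size_t limit     = max_payload(ring_size);

          if (maxlen > limit) {
                  printf("warning: maxlen %zu exceeds the largest single "
                         "payload %zu, forcing it\n", maxlen, limit);
                  maxlen = limit;     /* messages would otherwise be dropped */
          }
          return 0;
  }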

This commit depends on:
 - "MINOR: ring: add a function to compute max ring payload"

This may be backported as far as 2.2
2023-09-06 16:06:39 +02:00
Aurelien DARRAGON
5b295ff409 MINOR: ring: add a function to compute max ring payload
Add a helper function to the ring API to compute the maximum payload
length that could fit into the ring based on ring size.
2023-09-06 16:06:39 +02:00
Aurelien DARRAGON
4457783ade MINOR: http_ana: position the FINAL flag for http_after_res execution
Ensure that the ACT_OPT_FINAL flag is always set when executing actions
from http_after_res context.

This will permit Lua functions to be executed as http_after_res actions
since hlua_ctx_resume() automatically disables "yielding" when this flag
is set: the hlua handler will only allow one-shot executions at this point
(Lua or not, we don't want to reschedule http_after_res actions).
2023-09-06 11:42:34 +02:00
Aurelien DARRAGON
967608a432 BUG/MINOR: hlua/action: incorrect message on E_YIELD error
When hlua_action error messages were reworked in d5b073cf1
("MINOR: lua: Improve error message"), an error was made for the
E_YIELD case.

Indeed, everywhere the E_YIELD error is handled, a "yield is not allowed"
or similar error message is reported to the user. But here we currently
have: "aborting Lua processing on expired timeout".

It is quite misleading because this error message often refers to the
HLUA_E_ETMOUT case.

Thus, we now report the proper error message thanks to this patch.

This should be backported to all stable versions.
[on 2.0, the patch needs to be slightly adapted]
2023-09-06 11:42:34 +02:00
Frédéric Lécaille
e7240a0ba6 BUG/MINOR: quic: Dereferenced unchecked pointer to Handshake packet number space
This issue was reported by the longrtt interop test with quic-go as client
and by @chipitsine in GH #2282 when haproxy is compiled against LibreSSL.

Add two checks to prevent a pointer to the Handshake packet number space
from being dereferenced if this packet number space was released.

Thank you to @chipitsine for this report.

No need to backport.
2023-09-06 10:13:40 +02:00
Christopher Faulet
700ca14fc1 BUG/MINOR: ring/cli: Don't expect input data when showing events
The "show events" command may wait for now events if "-w" option is used. In
this case, no timeout must be triggered. So we explicitly state no input
data are expected. This disables the read timeout on the client side.

This patch should be backported to 2.8. It is probably useless to backport
it further. In all cases, it depends on the commit "BUG/MINOR: applet:
Always expect data when CLI is waiting for a new command"
2023-09-06 09:36:29 +02:00
Christopher Faulet
2f1e0a0a46 BUG/MINOR: applet: Always expect data when CLI is waiting for a new command
There is a mechanism for applets to disable the read timeout on the opposite
side if it is not waiting for any data. Of course, there is also a way to
re-activate it. But it must explicitly be handled by applets.

For the CLI, some commands may state no input data are expected. So we must
be sure to reset its state when the applet is waiting for a new command. For
now, it is not a bug because no CLI command uses this mechanism.

This patch must be backported to 2.8.
2023-09-06 09:36:19 +02:00
Christopher Faulet
8073094bf1 BUG/MEDIUM: stconn: Always update stream's expiration date after I/O
This is a revert of the following patches:

  * d7111e7ac ("MEDIUM: stconn: Don't requeue the stream's task after I/O")
  * 3479d99d5 ("BUG/MEDIUM: stconn: Update stream expiration date on blocked sends")

Because the first one is reverted, the second one is useless and can be reverted
too.

The issue here is that I/O may be performed without stream wakeup. So if no
expiration date was set on the last call to process_stream(), the stream is
never rescheduled and no timeout can be detected. This especially happens on
TCP streams because fast-forward is enabled very early.

Instead of tracking all places where the stream's expiration date must be
updated, it is now centralized in sc_notify(), as it was performed before
the timeout refactoring.

This patch must be backported to 2.8.
2023-09-06 09:29:27 +02:00
Christopher Faulet
b9c87f8082 BUG/MEDIUM: stconn/stream: Forward shutdown on write timeout
The commit 7f59d68fe2 ("BUG/MEDIUM: stconn: Flush output data before
forwarding close to write side") introduced a regression. When a write
timeout is detected, the shutdown is no longer forwarded. Depending on the
channel's state, it may block the processing, waiting for the client or the
server to leave.

The commit above tries to avoid truncating messages on shutdown, but on a
write timeout, if the channel is not empty, there is nothing more we can do
to send this data. It means the endpoint is unable to send data. In this
case, we must forward the shutdown.

This patch should be backported as far as 2.2.
2023-09-06 09:29:27 +02:00
Christopher Faulet
d18657ae11 BUG/MEDIUM: applet: Report an error if applet request more room on aborted SC
If an abort was performed and the applet still requests more room, it means
the applet has not properly handled the error on its own. At least the CLI
applet is concerned. Instead of reviewing all applets, the error is now
handled in the task_run_applet() function.

Because of this bug, a session may be blocked infinitely and may also lead
to a wakeup loop.

This patch must only be backported to 2.8 for now, and only to lower
versions if a bug is reported, because it is a bit sensitive and the code
in older versions is very different.
2023-09-06 09:29:27 +02:00
Christopher Faulet
34645a6365 BUG/MEDIUM: stconn: Report read activity when a stream is attached to front SC
It only concerns the front SC. But it is important to report a read activity
when a stream is created and attached to the front SC, especially in TCP. In
HTTP, when this happens, the request was necessarily received. But in TCP,
the client may open a connection without sending anything. We must still
report a first read activity in this case to be able to properly report
client timeout.

This patch must be backported to 2.8.
2023-09-06 09:29:27 +02:00
Christopher Faulet
015fec6a29 BUG/MINOR: stconn: Don't inhibit shutdown on connection on error
In the SC function responsible for performing the shutdown, there is a
statement inhibiting the shutdown if an error was encountered on the SC.
This statement is inherited from a very old version and should in fact be
removed. The error may be set from the stream. In this case the shutdown
must be performed. In all cases, it is not a big deal if the shutdown is
performed twice because the underlying functions already handle multiple
calls.

This patch does not fix any bug. Thus there is no reason to backport it.
2023-09-06 09:29:27 +02:00
Frédéric Lécaille
fb4294be55 BUG/MINOR: quic: Wrong RTT computation (srtt and rrt_var)
Because several variables (rtt_var, srtt) were stored as multiples of their
real values, some calculations were less accurate than expected.

Stop storing 4*rtt_var values, and 8*srtt values.
Adjust all the impacted statements.
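
To make this concrete, here is a minimal sketch of plain (unscaled) RTT
smoothing in the spirit of RFC 9002; the structure and field names are
illustrative, not the quic_loss code itself. Keeping srtt and rtt_var as
their real values rather than 8*srtt and 4*rtt_var lets the usual 7/8 and
3/4 weights be applied directly:

  struct rtt_sketch {
          unsigned int srtt;      /* smoothed RTT, in ms */
          unsigned int rtt_var;   /* RTT variance, in ms */
          unsigned int min_rtt;   /* minimum observed RTT, in ms */
  };

  static void rtt_update(struct rtt_sketch *r, unsigned int sample,
                         unsigned int ack_delay)
  {
          unsigned int diff;

          if (!r->srtt) {                         /* first sample */
                  r->srtt    = sample;
                  r->rtt_var = sample / 2;
                  r->min_rtt = sample;
                  return;
          }
          if (sample < r->min_rtt)
                  r->min_rtt = sample;
          /* only deduce the peer's ack delay if the sample stays above the
           * minimum RTT; note ">=", the comparison discussed in the adjacent
           * "Wrong RTT adjustments" fix */
          if (sample >= r->min_rtt + ack_delay)
                  sample -= ack_delay;
          diff       = r->srtt > sample ? r->srtt - sample : sample - r->srtt;
          r->rtt_var = (3 * r->rtt_var + diff) / 4;
          r->srtt    = (7 * r->srtt + sample) / 8;
  }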

Must be backported as far as 2.6.
2023-09-05 17:14:51 +02:00
Frédéric Lécaille
cf768f7456 BUG/MINOR: quic: Wrong RTT adjusments
There was a typo in the test statement checking whether the RTT must be
adjusted (>= incorrectly replaced by >).

Must be backported as far as 2.6.
2023-09-05 17:14:51 +02:00
William Lallemand
6bc00a97da MINOR: httpclient: allow to configure the timeout.connect
When using the httpclient, one could be bothered with it returning after a
very long time when failing. By default the httpclient has 3 retries and a
connect timeout of 5s, which can result in a pause of 20s upon failure.

This patch allows the user to configure the "timeout connect" of the
httpclient so the time to return an error can be reduced.

This patch helps fix part of issue #2269.

Could be backported in 2.7 if needed.
2023-09-05 16:42:27 +02:00
William Lallemand
c52948bd2c MINOR: httpclient: allow to configure the retries
When using the httpclient, one could be bothered with it returning after a
very long time when failing. By default the httpclient has 3 retries and a
connect timeout of 5s, which can result in a pause of 20s upon failure.

This patch allows the user to configure the retries of the httpclient so
the time to return an error can be reduced.

This patch helps fix part of issue #2269.

Could be backported in 2.7 if needed.
2023-09-05 15:55:04 +02:00
William Lallemand
fcb080d8f9 MEDIUM: mworker: display a more accessible message when a worker crashes
Should fix issue #1034.

Display a more accessible message about what to do when a worker crashes.

Example:

	$ ./haproxy -W -f haproxy.cfg
	[NOTICE]   (308877) : New worker (308884) forked
	[NOTICE]   (308877) : Loading success.
	[NOTICE]   (308877) : haproxy version is 2.9-dev4-d90d3b-58
	[NOTICE]   (308877) : path to executable is ./haproxy
	[ALERT]    (308877) : Current worker (308884) exited with code 139 (Segmentation fault)
	[WARNING]  (308877) : A worker process unexpectedly died and this can only be explained by a bug in haproxy or its dependencies.
	Please check that you are running an up to date and maintained version of haproxy and open a bug report.
	HAProxy version 2.9-dev4-d90d3b-58 2023/09/05 - https://haproxy.org/
	Status: development branch - not safe for use in production.
	Known bugs: https://github.com/haproxy/haproxy/issues?q=is:issue+is:open
	Running on: Linux 6.2.0-31-generic #31-Ubuntu SMP PREEMPT_DYNAMIC Mon Aug 14 13:42:26 UTC 2023 x86_64
	[ALERT]    (308877) : exit-on-failure: killing every processes with SIGTERM
	[WARNING]  (308877) : All workers exited. Exiting... (139)
2023-09-05 15:31:04 +02:00
William Lallemand
d90d3bf894 MINOR: global: export the display_version() symbol
Export the display_version() function which can be used elsewhere than
in haproxy.c
2023-09-05 15:24:39 +02:00
Frédéric Lécaille
aeb2f28ca7 BUG/MINOR: quic: Unchecked pointer to Handshake packet number space
It is possible that there are still Initial crypto data in flight without
Handshake crypto data in flight. This is very rare but possible.

This issue was reported by the handshakeloss interop test with quic-go as
client and by @chipitsine in GH #2279.

No need to backport.
2023-09-05 11:38:33 +02:00
Frédéric Lécaille
3afe54ed5b BUILD: quic: Compilation issue on 32-bits systems with quic_may_send_bytes()
The quic_may_send_bytes() implementation arrived with this commit:

  MINOR: quic: Amplification limit handling sanitization.

It returns a size_t. So when it is compared to qc->path->mtu with
QUIC_MIN(), there is no need to cast the latter anymore because it is also
a size_t.

Detected when compiling with the gcc -m32 option.
2023-09-05 10:33:56 +02:00
Willy Tarreau
86854dd032 MEDIUM: threads: detect excessive thread counts vs cpu-map
This detects when there are more threads bound via cpu-map than CPUs
enabled in cpu-map, or when there are more total threads than the total
number of CPUs available at boot (for unbound threads) and configured
for bound threads. In this case, a warning is emitted to explain the
problems this will cause and how to address the situation.

Note that some configurations will not be detected as faulty because
the algorithmic complexity to resolve all arrangements grows in O(N!).
This means that having 3 threads on 2 CPUs and one thread on 2 CPUs
will not be detected as it's 4 threads for 4 CPUs. But at least configs
such as T0:(1,4) T1:(1,4) T2:(2,4) T3:(3,4) will not trigger a warning
since they're valid.
2023-09-04 19:39:17 +02:00
Willy Tarreau
8357f950cb MEDIUM: threads: detect incomplete CPU bindings
It's very easy to mess up with some cpu-map directives and to leave some
threads unbound. Let's add a test that checks that either all threads are
bound or none are bound, but that we do not face the intermediate situation
where some are pinned and others are left wandering around, possibly on the
same CPUs as bound ones.

Note that this should not be backported, or maybe turned into a
notice only, as it appears that it will easily catch invalid
configs and that may break updates for some users.
2023-09-04 19:39:17 +02:00
Willy Tarreau
e65f54cf96 MINOR: cpuset: centralize a reliable bound cpu detection
Till now the CPUs that were bound were only retrieved in
thread_cpus_enabled() in order to count the number of CPUs allowed,
and it relied on arch-specific code.

Let's slightly arrange this into ha_cpuset_detect_bound() that
reuses the ha_cpuset struct and the accompanying code. This makes
the code much clearer without having to carry along some arch-specific
stuff out of this area.

Note that the macOS-specific code used in thread.c only counts online CPUs
but does not retrieve a mask, so for now we can't infer anything from it
and can't implement it there.

In addition and more importantly, this function is reliable in that
it will only return a value when the detection is accurate, and will
not return incomplete sets on operating systems where we don't have
an exact list, such as online CPUs.
2023-09-04 19:39:17 +02:00
Willy Tarreau
d3ecc67a01 MINOR: cpuset: add ha_cpuset_or() to bitwise-OR two CPU sets
This operation was not implemented and will be needed later.
2023-09-04 19:39:17 +02:00
Willy Tarreau
eb10567254 MINOR: cpuset: add ha_cpuset_isset() to check for the presence of a CPU in a set
This function will be convenient to test for the presence of a given CPU
in a set.
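
As a sketch of what these two helpers boil down to on Linux, assuming the
ha_cpuset wrapper is backed by a cpu_set_t there (other OSes use their own
representation):

  #define _GNU_SOURCE
  #include <sched.h>

  struct ha_cpuset_sketch {
          cpu_set_t cpuset;       /* assumed Linux backing type */
  };

  /* dst |= src */
  static void ha_cpuset_or_sketch(struct ha_cpuset_sketch *dst,
                                  struct ha_cpuset_sketch *src)
  {
          CPU_OR(&dst->cpuset, &dst->cpuset, &src->cpuset);
  }

  /* returns non-zero if CPU <cpu> is present in the set */
  static int ha_cpuset_isset_sketch(struct ha_cpuset_sketch *set, int cpu)
  {
          return CPU_ISSET(cpu, &set->cpuset);
  }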
2023-09-04 19:39:17 +02:00
Willy Tarreau
fca3fc0d90 BUILD: checks: shut up yet another stupid gcc warning
gcc has always had hallucinations regarding value ranges, and this one is
interesting, affecting branches 4.7 to 11.3 at least. When building without
threads, the randomly picked new_tid is reduced to a multiplication by 1
shifted right by 32 bits, hence a constant output of 0, which shows this
warning:

  src/check.c: In function 'process_chk_conn':
  src/check.c:1150:32: warning: array subscript [-1, 0] is outside array bounds of 'struct thread_ctx[1]' [-Warray-bounds]
  In file included from include/haproxy/thread.h:28,
                   from include/haproxy/list.h:26,
                   from include/haproxy/action.h:28,
                   from src/check.c:31:

or this one when trying to force the test to see that it cannot be zero(!):

  src/check.c: In function 'process_chk_conn':
  src/check.c:1150:54: warning: array subscript [0, 0] is outside array bounds of 'struct thread_ctx[1]' [-Warray-bounds]
   1150 |         uint t2_act  = _HA_ATOMIC_LOAD(&ha_thread_ctx[thr2].active_checks);
        |                                         ~~~~~~~~~~~~~^~~~~~
  include/haproxy/atomic.h:66:40: note: in definition of macro 'HA_ATOMIC_LOAD'
     66 | #define HA_ATOMIC_LOAD(val)          *(val)
        |                                        ^~~
  src/check.c:1150:24: note: in expansion of macro '_HA_ATOMIC_LOAD'
   1150 |         uint t2_act  = _HA_ATOMIC_LOAD(&ha_thread_ctx[thr2].active_checks);
        |                        ^~~~~~~~~~~~~~~

Let's just add an ALREADY_CHECKED() statement there, no other check seems
to get rid of it. No backport is needed.
2023-09-04 19:38:51 +02:00
Andrew Hopkins
b3f94f8b3b BUILD: ssl: Build with new cryptographic library AWS-LC
This adds a new Makefile option, USE_OPENSSL_AWSLC, and updates the
documentation with instructions to use HAProxy with AWS-LC.

Update the type of the OCSP callback retrieved with
SSL_CTX_get_tlsext_status_cb with the actual type for
libcrypto versions greater than 1.0.2. This doesn't affect
OpenSSL which casts the callback to void* in SSL_CTX_ctrl.
2023-09-04 18:19:18 +02:00
Miroslav Zagorac
3cfc30416c MINOR: properly mark the end of the CLI command in error messages
In several places in src/ssl_ckch.c, in the message about the incorrect use
of a CLI command, the end of that CLI command is not correctly marked with
a closing quote (').
2023-09-04 18:13:43 +02:00
Willy Tarreau
8547f5cfa2 BUG/MINOR: stream: further protect stream_dump() against incomplete sessions
As found by Coverity in issue #2273, the fix in commit e64bccab2 ("BUG/MINOR:
stream: protect stream_dump() against incomplete streams") was still not
enough, as scf/scb are still dereferenced to dump their flags and states.

This should be backported to 2.8.
2023-09-04 15:32:17 +02:00
Chris Staite
3939e39479 BUG/MEDIUM: h1-htx: Ensure chunked parsing with full output buffer
A previous fix to ensure that there is sufficient space in the output buffer
to place parsed data (#2053) introduced an issue: if the output buffer is
filled on a chunk boundary, no data is parsed but the congested flag is not
set because the state is not H1_MSG_DATA.

The check to ensure that there is sufficient space in the output buffer is
actually already performed in all downstream functions before it is used.
This makes the early optimisation that avoids the state transition to
H1_MSG_DATA needless.  Therefore, in order to allow the chunk parser to
continue in this edge case we can simply remove the early check.  This
ensures that the state can progress and set the congested flag correctly
in the caller.

This patch fixes #2262. The upstream change that caused this logic error was
backported as far as 2.5, therefore it makes sense to backport this fix back
that far also.
2023-09-04 12:15:36 +02:00
Willy Tarreau
135c66f6cb BUG/MEDIUM: connection: fix pool free regression with recent ppv2 TLV patches
In commit fecc573da ("MEDIUM: connection: Generic, list-based allocation
and look-up of PPv2 TLVs") there was a tiny mistake: elements of length
<= 128 are allocated from pool_pp_128, but only those of length < 128 are
released to this pool; the other ones go to pool_pp_256. Because of this,
elements of size exactly 128 are allocated from 128 and released to 256.
It can be reproduced a few times by running sample_fetches/tlvs.vtc 1000
times with -DDEBUG_DONT_SHARE_POOLS -DDEBUG_MEMORY_POOLS -DDEBUG_EXPR
-DDEBUG_STRICT=2 -DDEBUG_POOL_INTEGRITY -DDEBUG_POOL_TRACING
-DDEBUG_NO_POOLS. Not sure why it doesn't reproduce more often though.
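
In essence, the fix boils down to using the same size-class test on the
allocation and the release paths; the names below are just labels for the
example, not the actual pool code:

  #include <stddef.h>

  enum tlv_pool_class { TLV_POOL_128, TLV_POOL_256 };

  static enum tlv_pool_class tlv_pool_for(size_t value_len)
  {
          /* use "<= 128" on both sides; the bug was "<= 128" on alloc
           * but "< 128" on free, so a 128-byte element was returned to
           * the wrong pool */
          return value_len <= 128 ? TLV_POOL_128 : TLV_POOL_256;
  }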

No backport is needed. This should address github issues #2275 and #2274.
2023-09-04 11:45:37 +02:00
Frédéric Lécaille
d52466726f BUG/MINOR: quic: Unchecked pointer to packet number space dereferenced
It is possible that there are still Initial crypto data in flight without
Handshake crypto data in flight. This is very rare but possible.

This issue was reported by the long-rtt interop test with quic-go as client
and by @chipitsine in GH #2276.

No need to backport.
2023-09-04 11:29:35 +02:00
Frédéric Lécaille
9077f20251 BUG/MAJOR: quic: Really ignore malformed ACK frames.
If not correctly parsed, an ACK frame must be ignored without any further
treatment. Before this patch an ACK frame could be partially parsed, then
some errors could be detected which led newly acknowledged packets to be
wrongly released by free_quic_tx_pkts(), called from qc_parse_ack_frm().
But there is no reason to release such packets because of a malformed ACK
frame.

This patch modifies qc_parse_ack_frm(). The handling of newly acknowledged
TX packets is now done in two steps. It first collects the newly
acknowledged packets by calling qc_newly_acked_pkts(), then proceeds the
same way as before for the treatment of haproxy TX packets acknowledged by
the peer. If the ACK frame could not be fully parsed, the newly acknowledged
packets are put back where they were detached from: the tree of TX packets
for their encryption level.

Must be backported as far as 2.6.
2023-09-04 11:29:35 +02:00
Frédéric Lécaille
7dad52bdbd MINOR: quic: Add a trace to quic_release_frm()
Display the address of the frame to be released as soon as we enter
quic_release_frm(), whose job is obviously to release the memory allocated
for the frame <frm> passed as a parameter.
2023-09-04 11:29:35 +02:00
Frédéric Lécaille
3c90c1ce6b BUG/MINOR: quic: Possible skipped RTT sampling
There are very few chances that this bug may occur. Furthermore the
consequences are not dramatic: an RTT sampling may be ignored. I guess this
may happen when the now_ms global value wraps.

Do not rely on the time a packet was sent to decide if it is a newly
acknowledged packet, but on its presence or not in the tx packet ebtree.

Must be backported as far as 2.6.
2023-09-04 11:29:35 +02:00
Christopher Faulet
0b93ff8c87 BUG/MEDIUM: stconn: Wake applets on sending path if there is a pending shutdown
An applet is not woken up on the sending path if it is not waiting for data
or if it states it will not consume data. However, it is important to still
wake it up if there is a pending shutdown. Otherwise, the event may be
missed and some data may remain blocked in the channel's buffer.

Because of this bug, it is possible to have a stream stuck if data are also
blocked on the opposite channel. It is for instance possible to hit the bug
with the stats applet and a client not consuming data.

This patch must slowly be backported as far as 2.2. It should partially fix
issue #2249.
2023-09-01 14:18:26 +02:00
Christopher Faulet
9e394d34e0 BUG/MINOR: stconn: Don't report blocked sends during connection establishment
The server timeout must not be handled during the connection establishment
so as not to supersede the connect timeout. To do so, we must not consider
outgoing data as blocked during this stage. Concretely, it means the fsb
time must not be updated during connection establishment.

It is not an issue with regular clients because the server timeout is only
defined when the connection is established. However, it may be an issue for the
HTTP client, when the server timeout is lower than the connect timeout. In this
case, an early 502 may be reported with no connection retries.

This patch must be backported to 2.8.
2023-09-01 14:18:26 +02:00
Christopher Faulet
3479d99d5f BUG/MEDIUM: stconn: Update stream expiration date on blocked sends
When outgoing data are blocked, we must update the stream expiration date
and requeue the task. It is important to properly handle write timeouts,
especially if the stream cannot expire on reads. This bug was introduced
when the handling of channel timeouts was refactored to be managed by the
stream-connectors.

It is an issue if there is no server timeout and the client does not consume
the response (or the opposite, but that is less common). It is also possible
to trigger the same scenario with applets on the server side because, most
of the time, there is no server timeout.

This patch must be backported to 2.8.
2023-09-01 14:18:26 +02:00
Christopher Faulet
49ed83e948 DEBUG: applet: Properly report opposite SC expiration dates in traces
The wrong label was used in traces to report the expiration dates of the
opposite SC: "sc" was used instead of "sco".

This patch should be backported to 2.8.
2023-09-01 14:18:26 +02:00
Willy Tarreau
b0031d9679 MINOR: checks: also consider the thread's queue for rebalancing
Let's also check for other threads when the current one is queueing,
let's not wait for the load to be high. Now this totally eliminates
differences between threads.
2023-09-01 14:00:04 +02:00
Willy Tarreau
844a3bc25b MEDIUM: checks: implement a queue in order to limit concurrent checks
The progressive adoption of OpenSSL 3 and its abysmal handshake
performance has started to reveal situations where it simply isn't
possible anymore to successfully run health checks on many servers,
because between the moment all the checks are started and the moment
the handshake finally completes, the timeout has expired!

This also has consequences on production traffic which gets
significantly delayed as well, all that for lots of checks. While it's
possible to increase the check delays, it doesn't solve everything as
checks still take a huge amount of time to converge in such conditions.

Here we take a different approach by permitting to enforce the maximum
concurrent checks per thread limitation and implementing an ordered
queue. Thanks to this, if a thread about to start a check has reached
its limit, it will add the check at the end of a queue and it will be
processed once another check is finished. This proves to be extremely
efficient, with all checks completing in a reasonable amount of time
and not being disturbed by the rest of the traffic from other checks.
They're just cycling slower, but at the speed the machine can handle.

One must understand however that if some complex checks perform multiple
exchanges, they will take a check slot for all the required duration.
This is why the limit is not enforced by default.

Tests on SSL show that a limit of 5-50 checks per thread on local
servers gives excellent results already, so that could be a good starting
point.
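
A self-contained sketch of the queuing principle (not HAProxy's scheduler):
when a thread already runs its maximum of concurrent checks, new checks are
appended to a FIFO and started only when a running one finishes:

  #include <stdio.h>

  #define MAX_RUNNING 2
  #define QUEUE_SZ    16

  static int running;
  static int queue[QUEUE_SZ], q_head, q_tail;

  static void start_check(int id)
  {
          if (running < MAX_RUNNING) {
                  running++;
                  printf("check %d started\n", id);
          } else {
                  queue[q_tail++ % QUEUE_SZ] = id;  /* park it for later */
                  printf("check %d queued\n", id);
          }
  }

  static void check_finished(void)
  {
          running--;
          if (q_head != q_tail)     /* promote the oldest queued check */
                  start_check(queue[q_head++ % QUEUE_SZ]);
  }

  int main(void)
  {
          for (int i = 0; i < 5; i++)
                  start_check(i);
          for (int i = 0; i < 5; i++)
                  check_finished();
          return 0;
  }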
2023-09-01 14:00:04 +02:00
Willy Tarreau
cfc0bceeb5 MEDIUM: checks: search more aggressively for another thread on overload
When the current thread is overloaded (more running checks than the
configured limit), we'll try more aggressively to find another thread.
Instead of just opportunistically looking for one half as loaded, now if
the current thread has more than 1% more active checks than another one,
or has more than a configured limit of concurrent running checks, it will
search for a more suitable thread among 3 other random ones in order to
migrate the check there. The number of migrations remains very low (~1%)
and the checks load very fair across all threads (~1% as well). The new
parameter is called tune.max-checks-per-thread.
2023-09-01 08:26:06 +02:00
Willy Tarreau
016e189ea3 MINOR: check: also consider the random other thread's active checks
When checking if it's worth transferring a sleeping check to another
random thread, let's also check if that random other thread has fewer
checks than the current one, which is another reason for transferring
the load there.

This commit adds a function "check_thread_cmp_load()" to compare two
threads' loads in order to simplify the decision taking.

The minimum active check count before starting to consider rebalancing
the load was now raised from 2 to 3, because tests show that at 15k
concurrent checks, at 2, 50% are evaluated for rebalancing and 30%
are rebalanced, while at 3, this is cut in half.
2023-09-01 08:26:06 +02:00
Willy Tarreau
00de9e0804 MINOR: checks: maintain counters of active checks per thread
Let's keep two check counters per thread:
  - one for "active" checks, i.e. checks that are no more sleeping
    and are assigned to the thread. These include sleeping and
    running checks ;

  - one for "running" checks, i.e. those which are currently
    executing on the thread.

By doing so, we'll be able to spread the health checks load a bit better
and refrain from sending too many at once per thread. The counters are
atomic since a migration increments the target thread's active counter.
These numbers are reported in "show activity", which makes it possible to
check per thread and globally how many checks are currently pending and
running on the system.

Ideally, we should only consider checks in the process of establishing
a connection since that's really the expensive part (particularly with
OpenSSL 3.0). But the inner layers are really not suitable to doing
this. However knowing the number of active checks is already a good
enough hint.
2023-09-01 08:26:06 +02:00
Willy Tarreau
3b7942a1c9 MINOR: check/activity: collect some per-thread check activity stats
We now count the number of times a check was started on each thread
and the number of times a check was adopted. This helps understand
better what is observed regarding checks.
2023-09-01 08:26:06 +02:00
Willy Tarreau
e03d05c6ce MINOR: check: remember when we migrate a check
The goal here is to explicitly mark that a check was migrated so that
we don't do it again. This will allow us to perform other actions on
the target thread while still knowing that we don't want to be migrated
again. The new READY bit combines with SLEEPING to form 4 possible states:

 SLP  RDY   State      Description
  0    0    -          (reserved)
  0    1    RUNNING    Check is bound to current thread and running
  1    0    SLEEPING   Check is sleeping, not bound to a thread
  1    1    MIGRATING  Check is migrating to another thread

Thus we set READY upon migration, and check for it before migrating; this
is sufficient to prevent a second migration. To make things a bit clearer,
the SLEEPING bit was switched with FASTINTER so that SLEEPING and READY are
adjacent.
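
Decoding these two bits could look like the sketch below; the flag names
and bit values are illustrative, only the SLEEPING/READY combination logic
matters:

  #define CHK_ST_SLEEPING  0x01
  #define CHK_ST_READY     0x02

  static const char *check_state_name(unsigned int flags)
  {
          switch (flags & (CHK_ST_SLEEPING | CHK_ST_READY)) {
          case CHK_ST_READY:                   return "RUNNING";
          case CHK_ST_SLEEPING:                return "SLEEPING";
          case CHK_ST_SLEEPING | CHK_ST_READY: return "MIGRATING";
          default:                             return "(reserved)";
          }
  }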
2023-09-01 08:26:06 +02:00
Willy Tarreau
3544c9f8a0 MINOR: checks: pin the check to its thread upon wakeup
When a check leaves the sleeping state, we must pin it to the thread that
is processing it. It's normally always the case after the first execution,
but initial checks that start assigned to any thread (-1) could be assigned
much later, causing problems with planned changes involving queuing. Thus
better do it early, so that all threads start properly pinned.
2023-09-01 08:26:06 +02:00
Willy Tarreau
7163f95b43 MINOR: checks: start the checks in sleeping state
The CHK_ST_SLEEPING state was introduced by commit d114f4a68 ("MEDIUM:
checks: spread the checks load over random threads") to indicate that
a check was not currently bound to a thread and that it could easily
be migrated to any other thread. However it did not start the checks
in this state, meaning that they were not redispatchable on startup.

Sometimes under heavy load (e.g. when using SSL checks with OpenSSL 3.0)
the cost of setting up new connections is so high that some threads may
experience connection timeouts on startup. In this case it's better if
they can transfer their excess load to other idle threads. By just
marking the check as sleeping upon startup, we can do this and
significantly reduce the number of failed initial checks.
2023-09-01 08:26:06 +02:00
Willy Tarreau
48442b8b15 BUG/MINOR: checks: do not queue/wake a bounced check
A small issue was introduced with commit d114f4a68 ("MEDIUM: checks:
spread the checks load over random threads"): when a check is bounced
to another thread, its expiration time is set to TICK_ETERNITY. This
makes it show as not expired upon first wakeup on the next thread,
thus being detected as "woke up too early" and being instantly
rescheduled. Only after this next wakeup will it be properly considered.

Several approaches were attempted to fix this. The best one seems to
consist in resetting t->expire and expired upon wakeup, and changing
the !expired test for !tick_is_expired() so that we don't trigger on
this case.

This needs to be backported to 2.7.
2023-09-01 08:26:06 +02:00
Willy Tarreau
338431ecb6 MINOR: activity: report the current run queue size
While troubleshooting the causes of load spikes, it appeared that the
length of individual run queues was missing, let's add it to "show
activity".
2023-09-01 08:26:06 +02:00
Willy Tarreau
2cb896c4b0 MEDIUM: server/ssl: pick another thread's session when we have none yet
The per-thread SSL context in servers causes a burst of connection
renegotiations on startup, both for the forwarded traffic and for the
health checks. Health checks have been seen to continue to cause SSL
rekeying for several minutes after a restart on large thread-count
machines. The reason is that the context is exclusively per-thread
and that the more threads there are, the more likely it is for a new
connection to start on a thread that doesn't have such a context yet.

In order to improve this situation, this commit ensures that a thread
starting an SSL connection to a server without a session will first
look at the last session that was updated by another thread, and will
try to use it. In order to minimize the contention, we're using a read
lock here to protect the data, and the first-level index is an integer
containing the thread number, that is always valid and may always be
dereferenced. This way the session retrieval algorithm becomes quite
simple:

  - if the last thread index is valid, then try to use the same session
    under a read lock ;
  - if any error happens, then atomically nuke the index so that other
    threads don't use it and the next one to update a connection updates
    it again

And for the ssl_sess_new_srv_cb(), we have this:

  - update the entry under a write lock if the new session is valid,
    otherwise kill it if the session is not valid;
  - atomically update the index if it was 0 and the new one is valid,
    otherwise atomically nuke it if the session failed.

Note that even if only the pointer is destroyed, the element will be
re-allocated by the next thread during ssl_sess_new_srv_cb().

Right now a session is picked even if the SNI doesn't match, because
we don't know the SNI yet during ssl_sock_init(), but that's essentially
a matter of API, since connect_server() figures the SNI very early, then
calls conn_prepare() which calls ssl_sock_init(). Thus in the future we
could easily imagine storing a number of SNI-based contexts instead of
storing contexts per thread.
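
A self-contained sketch of the retrieval logic described above is shown
below; the data layout, names and locking granularity are simplified
assumptions, not the ssl_sock code. A shared index publishes which thread
last stored a good session; readers copy that entry under a read lock and
atomically clear the index when it turns out to be unusable:

  #include <pthread.h>
  #include <stdatomic.h>
  #include <string.h>

  #define MAX_THREADS 64
  #define SESS_MAX    4096                /* illustrative serialized size */

  struct thread_sess {
          pthread_rwlock_t lock;
          unsigned char data[SESS_MAX];
          size_t len;                     /* 0 means "no session stored" */
  };

  static struct thread_sess sess_per_thread[MAX_THREADS];
  static atomic_int last_good_tid = -1;   /* -1: no known good session */

  static void sess_store_init(void)
  {
          for (int i = 0; i < MAX_THREADS; i++)
                  pthread_rwlock_init(&sess_per_thread[i].lock, NULL);
  }

  /* writer side: publish the session we just stored for our thread */
  static void publish_session(int my_tid)
  {
          int expected = -1;
          atomic_compare_exchange_strong(&last_good_tid, &expected, my_tid);
  }

  /* reader side: borrow the last known good session, if any */
  static size_t borrow_session(unsigned char *dst, size_t dst_sz)
  {
          int tid = atomic_load(&last_good_tid);
          size_t len = 0;

          if (tid < 0)
                  return 0;
          pthread_rwlock_rdlock(&sess_per_thread[tid].lock);
          len = sess_per_thread[tid].len;
          if (len && len <= dst_sz)
                  memcpy(dst, sess_per_thread[tid].data, len);
          else
                  len = 0;
          pthread_rwlock_unlock(&sess_per_thread[tid].lock);

          if (!len)   /* nuke the index so other threads stop using it */
                  atomic_compare_exchange_strong(&last_good_tid, &tid, -1);
          return len;
  }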

It could be worth backporting this to one LTS version after some
observation, though this is not strictly necessary. The current commit
depends on the following ones:

  BUG/MINOR: ssl_sock: fix possible memory leak on OOM
  MINOR: ssl_sock: avoid iterating realloc(+1) on stored context
  DOC: ssl: add some comments about the non-obvious session allocation stuff
  CLEANUP: ssl: keep a pointer to the server in ssl_sock_init()
  MEDIUM: ssl_sock: always use the SSL's server name, not the one from the tid
  MEDIUM: server/ssl: place an rwlock in the per-thread ssl server session
  MINOR: server/ssl: maintain an index of the last known valid SSL session
  MINOR: server/ssl: clear the shared good session index on failure
  MEDIUM: server/ssl: pick another thread's session when we have none yet
2023-08-31 09:27:14 +02:00
Willy Tarreau
777f62cfb7 MINOR: server/ssl: clear the shared good session index on failure
If we fail to set the session using SSL_set_session(), we want to quickly
erase our index from the shared one so that any other thread with a valid
session replaces it.
2023-08-31 08:50:01 +02:00
Willy Tarreau
52b260bae4 MINOR: server/ssl: maintain an index of the last known valid SSL session
When a thread creates a new session for a server, if none was known yet,
we assign the thread id (hence the reused_sess index) to a shared variable
so that other threads will later be able to find it when they don't have
one yet. For now we only set and clear the pointer upon session creation,
we do not yet pick it.

Note that we could have done it per thread-group, so as to avoid any
cross-thread exchanges, but it's anticipated that this is essentially
used during startup, at a moment where the cost of inter-thread contention
is very low compared to the ability to restart at full speed, which
explains why instead we store a single entry.
2023-08-31 08:50:01 +02:00
Willy Tarreau
607041dec3 MEDIUM: server/ssl: place an rwlock in the per-thread ssl server session
The goal will be to permit a thread to update its session while having
it shared with other threads. For now we only place the lock and arrange
the code around it so that this is quite light. For now only the owner
thread uses this lock so there is no contention.

Note that there is a subtlety in the openssl API regarding
i2s_SSL_SESSION() in that it fills the area pointed to by its argument
with a dump of the session and returns a size that's equal to the
previously allocated one. As such, it does modify the shared area even
if that's not obvious at first glance.
2023-08-31 08:50:01 +02:00
Willy Tarreau
95ac5fe4a8 MEDIUM: ssl_sock: always use the SSL's server name, not the one from the tid
In ssl_sock_set_servername(), we're retrieving the current server name
from the current thread, hoping it will not have changed. This is a
bit dangerous as strictly speaking it's not easy to prove that no other
connection had to use one between the moment it was retrieved in
ssl_sock_init() and the moment it's being read here. In addition, this
forces us to maintain one session per thread while this is not the real
need, in practice we only need one session per SNI. And the current model
prevents us from sharing sessions between threads.

This had been done in 2.5 via commit e18d4e828 ("BUG/MEDIUM: ssl: backend
TLS resumption with sni and TLSv1.3"), but as analyzed with William, it
turns out that a saner approach consists in keeping the call to
SSL_get_servername() there and instead to always assign the SNI to the
current SSL context via SSL_set_tlsext_host_name() immediately when the
session is retrieved. This way the session and SNI are consulted atomically
and the host name is only checked from the session and not from possibly
changing elements.

As a bonus the rdlock that was added by that commit could now be removed,
though it didn't cost much.
2023-08-31 08:49:15 +02:00
Willy Tarreau
335b5adf2c CLEANUP: ssl: keep a pointer to the server in ssl_sock_init()
We're using "__objt_server(conn->target)" about 6 times there; it's not
quite easy to read, so let's keep a pointer to the server.
2023-08-30 18:58:40 +02:00
Willy Tarreau
bc31ef0896 DOC: ssl: add some comments about the non-obvious session allocation stuff
The SSL session allocation/reuse part is far from being trivial, and
there are some necessary tricks such as allocating then immediately
freeing that are required by the API due to internal refcount. All of
this is particularly hard to grasp, even with the scarce man pages.

Let's document a little bit what's granted and expected along this path
to help the reader later.
2023-08-30 11:43:06 +02:00
Willy Tarreau
2c6fe24001 MINOR: ssl_sock: avoid iterating realloc(+1) on stored context
The SSL context storage in servers is per-thread, and the contents are
allocated for a length that is determined from the session. It turns out
that placing some traces there revealed that the realloc() that is called
to grow the area can be called multiple times in a row even for just
health checks, to grow the area by just one or two bytes. Given that
malloc() allocates in multiples of 8 or 16 anyway, let's round the
allocated size up to the nearest multiple of 8 to avoid this unneeded
operation.
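
The rounding is as simple as the generic helper below (not the ssl_sock
code): growing to the next multiple of 8 means a series of +1/+2 byte
growths no longer triggers a realloc() each time:

  #include <stddef.h>

  static size_t round_up_8(size_t needed)
  {
          return (needed + 7) & ~(size_t)7;
  }
  /* e.g. round_up_8(121) == 128 and round_up_8(123) == 128, so growing a
   * 121-byte session by one or two bytes reuses the existing allocation */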
2023-08-30 11:43:06 +02:00
Alexander Stephan
2cc53ecc8f MINOR: sample: Add common TLV types as constants for fc_pp_tlv
This patch adds common TLV types as specified in the PPv2 spec.
We will use the suffix of the type, e.g., PP2_TYPE_AUTHORITY becomes AUTHORITY.
2023-08-29 15:32:02 +02:00
Alexander Stephan
0a4f6992e0 MINOR: sample: Refactor fc_pp_unique_id by wrapping the generic TLV fetch
The fetch logic is redundant and can be simplified by simply
calling the generic fetch with the correct TLV ID set as an
argument, similar to fc_pp_authority.
2023-08-29 15:32:01 +02:00
Alexander Stephan
ece0d1ab49 MINOR: sample: Refactor fc_pp_authority by wrapping the generic TLV fetch
We already have a call that can retrieve a TLV with any value.
Therefore, the fetch logic is redundant and can be simplified
by simply calling the generic fetch with the correct TLV ID
set as an argument.
2023-08-29 15:31:51 +02:00
Alexander Stephan
f773ef721c MEDIUM: sample: Add fetch for arbitrary TLVs
Based on the new, generic allocation infrastructure, a new sample
fetch fc_pp_tlv is introduced. It is an abstraction for existing
PPv2 TLV sample fetches. It takes any valid TLV ID as argument and
returns the value as a string, similar to fc_pp_authority and
fc_pp_unique_id.
2023-08-29 15:31:28 +02:00
Alexander Stephan
fecc573da1 MEDIUM: connection: Generic, list-based allocation and look-up of PPv2 TLVs
In order to be able to implement fetches in the future that allow
retrieval of any TLVs, a new generic data structure for TLVs is introduced.

Existing TLV fetches for PP2_TYPE_AUTHORITY and PP2_TYPE_UNIQUE_ID are
migrated to use this new data structure. TLV-related pools are updated
to not rely on type, but only on size. Pools accommodate the TLV list
element with its associated value. For now, two pools for 128 B and
256 B values are introduced. More fine-grained solutions are possible
in the future, if necessary.
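
Such a generic TLV list element could look like the sketch below; the
struct and member names are illustrative, not the exact definition added by
the patch. The value is stored inline after the header, and the element is
sized for the 128 B or 256 B pool depending on the value length:

  #include <stddef.h>

  struct list_sketch { struct list_sketch *n, *p; };  /* stand-in list node */

  struct conn_tlv_sketch {
          struct list_sketch list;  /* linked into the connection's TLV list */
          unsigned char type;       /* PPv2 TLV type, e.g. PP2_TYPE_AUTHORITY */
          unsigned short len;       /* length of <value> */
          char value[];             /* flexible array member for the value */
  };

  /* pick the element size (hence the pool) from the value length */
  static size_t tlv_elem_size(size_t value_len)
  {
          return sizeof(struct conn_tlv_sketch) + (value_len <= 128 ? 128 : 256);
  }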
2023-08-29 15:15:47 +02:00
Alexander Stephan
c9d47652d2 CLEANUP/MINOR: connection: Improve consistency of PPv2 related constants
This patch improves readability by scoping HAProxy-related PPv2 constants
with an 'HA' prefix. Besides, a new constant for the length of a CRC32C
TLV is introduced. The length is derived from the PPv2 spec, i.e. 32 bits.
2023-08-29 15:15:47 +02:00
Willy Tarreau
bd84387beb MEDIUM: capabilities: enable support for Linux capabilities
For a while there has been the constraint of having to run as root for
transparent proxying, and we're starting to see some cases where QUIC is
not running in socket-per-connection mode due to the missing capability
that would be needed to bind a privileged port. It's not realistic to
ask all QUIC users on port 443 to run as root, so instead let's provide
a basic support for capabilities at least on linux. The ones currently
supported are cap_net_raw, cap_net_admin and cap_net_bind_service. The
mechanism was made OS-specific with a dedicated file because it really
is. It can be easily refined later for other OSes if needed.

A new keyword "setcaps" is added to the global section, to enumerate the
capabilities that must be kept when switching from root to non-root. This
is ignored in other situations though. HAProxy has to be built with
USE_LINUX_CAP=1 for this to be supported, which is enabled by default
for linux-glibc, linux-glibc-legacy and linux-musl.

A good way to test this is to start haproxy with such a config:

    global
        uid 1000
        setcap cap_net_bind_service

    frontend test
        mode http
        timeout client 3s
        bind quic4@:443 ssl crt rsa+dh2048.pem allow-0rtt

and run it under "sudo strace -e trace=bind,setuid", then connect there
from an H3 client. The bind() syscall must succeed despite the
user id having been switched.
2023-08-29 11:11:50 +02:00
Willy Tarreau
e64bccab20 BUG/MINOR: stream: protect stream_dump() against incomplete streams
If a stream is interrupted during its initialization by a panic signal
and tries to dump itself, it may cause a crash during the dump due to
scf and/or scb not being fully initialized. This may also happen while
releasing an endpoint to attach a new one. The effect is that instead
of dying on an abort, the process dies on a segv. This race is ultra-
rare but totally possible. E.g:

  #0  se_fl_test (test=1, se=0x0) at include/haproxy/stconn.h:98
  #1  sc_ep_test (test=1, sc=0x7ff8d5cbd560) at include/haproxy/stconn.h:148
  #2  sc_conn (sc=0x7ff8d5cbd560) at include/haproxy/stconn.h:223
  #3  stream_dump (buf=buf@entry=0x7ff9507e7678, s=0x7ff4c40c8800, pfx=pfx@entry=0x55996c558cb3 ' ' <repeats 13 times>, eol=eol@entry=10 '\n') at src/stream.c:2840
  #4  0x000055996c493b42 in ha_task_dump (buf=buf@entry=0x7ff9507e7678, task=<optimized out>, pfx=pfx@entry=0x55996c558cb3 ' ' <repeats 13 times>) at src/debug.c:328
  #5  0x000055996c493edb in ha_thread_dump_one (thr=thr@entry=18, from_signal=from_signal@entry=0) at src/debug.c:227
  #6  0x000055996c493ff1 in ha_thread_dump (buf=buf@entry=0x7ff9507e7678, thr=thr@entry=18) at src/debug.c:270
  #7  0x000055996c494257 in ha_panic () at src/debug.c:430
  #8  ha_panic () at src/debug.c:411
  (...)
  #23 0x000055996c341fe8 in ssl_sock_close (conn=<optimized out>, xprt_ctx=0x7ff8dcae3880) at src/ssl_sock.c:6699
  #24 0x000055996c397648 in conn_xprt_close (conn=0x7ff8c297b0c0) at include/haproxy/connection.h:148
  #25 conn_full_close (conn=0x7ff8c297b0c0) at include/haproxy/connection.h:192
  #26 h1_release (h1c=0x7ff8c297b3c0) at src/mux_h1.c:1074
  #27 0x000055996c39c9f0 in h1_detach (sd=<optimized out>) at src/mux_h1.c:3502
  #28 0x000055996c474de4 in sc_detach_endp (scp=scp@entry=0x7ff9507e3148) at src/stconn.c:375
  #29 0x000055996c4752a5 in sc_reset_endp (sc=<optimized out>, sc@entry=0x7ff8d5cbd560) at src/stconn.c:475

Note that this cannot happen on "show sess" since a stream never leaves
process_stream in such an uninitialized state, thus it's really only the
crash dump that may cause this.

It should be backported to 2.8.
2023-08-29 11:11:50 +02:00
William Lallemand
e7d9082315 BUG/MINOR: ssl/cli: can't find ".crt" files when replacing a certificate
Bug was introduced by commit 26654 ("MINOR: ssl: add "crt" in the
cert_exts array").

When looking for a .crt directly in the cert_exts array, the
ssl_sock_load_pem_into_ckch() function will be called with an argument
which does not have its ".crt" extension anymore.

If "ssl-load-extra-del-ext" is used this is not a problem since we try
to add the ".crt" when doing the lookup in the tree.

However, when directly using a ".crt" without this option, it will fail
to find the file in the tree.

The fix removes the "crt" entry from the array since it does not seem to
be really useful without a rework of all the lookups.

Should fix issue #2265

Must be backported as far as 2.6.
2023-08-28 18:20:39 +02:00
Willy Tarreau
0074c36dd2 BUILD: pools: import plock.h to build even without thread support
In 2.9-dev4, commit 544c2f2d9 ("MINOR: pools: use EBO to wait for
unlock during pool_flush()") broke the thread-less build by calling
pl_wait_new_long() without explicitly including plock.h which is
normally included by thread.h when threads are enabled.
2023-08-26 17:28:08 +02:00
Willy Tarreau
a7b9baa2cc BUG/MEDIUM: mux-h2: fix crash when checking for reverse connection after error
If the connection is closed in h2_release(), which is indicated by ret<0, we
must not dereference conn anymore. This was introduced in 2.9-dev4 by commit
5053e8914 ("MEDIUM: h2: prevent stream opening before connection reverse
completed") and detected after a few hours of runtime thanks to running with
pool integrity checks and caller enabled. No backport is needed.
2023-08-26 17:05:19 +02:00
Amaury Denoyelle
5afcb686b9 MAJOR: connection: purge idle conn by last usage
Backend idle connections are purged periodically during the process
lifetime. An estimated number of needed connections is calculated and the
excess is removed.

Before this patch, the purge was done directly using the idle then the safe
connection tree of a server instance. This has a major drawback: it takes
no account of any specific order, and it may remove functional connections
while leaving ones which will fail on the next reuse.

The problem can be worse when using criteria to differentiate idle
connections, such as the SSL SNI. In this case, the purge may remove
connections with a high reuse rate while leaving connections whose criteria
never matched once, thus drastically reducing the reuse rate.

To improve this, introduce an alternative storage for idle connections used
in parallel with the idle/safe trees. Now, each connection inserted in one
of these trees is also inserted in the new list at
`srv_per_thread.idle_conn_list`. This guarantees that recently used
connections are present at the end of the list.

During the purge, use this list instead of the idle/safe trees. Remove
first the connections at the front of the list, which were not reused
recently. This will ensure that connections that are frequently reused are
not purged and should increase the reuse rate, particularly if distinct
idle connection criteria are in use.
2023-08-25 15:57:48 +02:00
Amaury Denoyelle
61fc9568fb MINOR: server: move idle tree insert in a dedicated function
Define a new function _srv_add_idle(). This is a simple wrapper to insert
a connection in the server idle tree. This is reserved for simple usage
and requires the idle_conns lock. In most cases, srv_add_to_idle_list()
should be used.

This patch does not introduce any functional change. However, it will help
with the next patch, as idle connections will always be inserted in a list
as secondary storage along with the idle/safe trees.
2023-08-25 15:57:48 +02:00
Amaury Denoyelle
77ac8eb4a6 MINOR: connection: simplify removal of idle conns from their trees
Small change of API for conn_delete_from_tree(). Now the connection
instance is taken as argument instead of its inner node.

No functional change is introduced with this commit. This slightly
simplifies the invocation of conn_delete_from_tree(). The most useful
change is that this function will be extended in the next patch to be able
to remove the connection from its new idle list at the same time as from
its idle tree.
2023-08-25 15:57:48 +02:00
Frédéric Lécaille
81815a9a83 MEDIUM: map/acl: Replace map/acl spin lock by a read/write lock.
Replace ->lock type of pat_ref struct by HA_RWLOCK_T.
Replace all calls to HA_SPIN_LOCK() (resp. HA_SPIN_UNLOCK()) by HA_RWLOCK_WRLOCK()
(resp. HA_RWLOCK_WRUNLOCK()) when a write access is required.
Only one read access is needed. It is in the "show map" command callback,
cli_io_handler_map_lookup(), where a HA_SPIN_LOCK() call is replaced by
HA_RWLOCK_RDLOCK() (resp. HA_SPIN_UNLOCK() by HA_RWLOCK_RDUNLOCK()).
Replace HA_SPIN_INIT() calls by HA_RWLOCK_INIT() calls.
2023-08-25 15:42:03 +02:00
Frédéric Lécaille
5fea59754b MEDIUM: map/acl: Accelerate several functions using pat_ref_elt struct ->head list
Replace as much as possible list_for_each*() iterations over the ->head
list member of the pat_ref_elt struct by use of its ->ebpt_root member,
which is an ebtree.
2023-08-25 15:42:01 +02:00