haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2026-01-17 23:01:06 +01:00

Author	SHA1	Message	Date
William Lallemand	90c5618ed5	MEDIUM: systemd: implement directory loading Red Hat-based systems already use a CFGDIR variable to load configuration files from a directory; this patch implements the same feature. It now requires that /etc/haproxy/conf.d exists or the service won't be able to start.	2026-01-16 09:55:33 +01:00
Egor Shestakov	a3ee35cbfc	REORG/MINOR: cfgparse: eliminate code duplication by lshift_args() There were similar parts of the code in "no" and "default" prefix keywords handling. This duplication caused the bug once. No backport needed.	2026-01-16 09:09:24 +01:00
Egor Shestakov	447d73dc99	BUG/MINOR: cfgparse: fix "default" prefix parsing Fix the left shift of args when "default" prefix matches. The cause of the bug was the absence of zeroing of the right element during the shift. The same bug for "no" prefix was fixed by commit 0f99e3497, but missed for "default". The shift of ("default", "option", "dontlog-normal") produced ("option", "dontlog-normal", "dontlog-normal") instead of ("option", "dontlog-normal", "") As an example, a valid config line: default option dontlog-normal caused a parse error: [ALERT] (32914) : config : parsing [bug-default-prefix.cfg:22] : 'option dontlog-normal' cannot handle unexpected argument 'dontlog-normal'. The patch should be backported to all stable versions, since the absence of zeroing was introduced with "default" keyword.	2026-01-16 09:09:19 +01:00
Remi Tricot-Le Breton	362ff2628f	REGTESTS: jwe: Fix tests of algorithms not supported by AWS-LC Many tests use the A128KW algorithm which is not supported by AWS-LC but instead of removing those tests we will just have a hardcoded value set by default in this case.	2026-01-15 10:56:28 +01:00
Remi Tricot-Le Breton	aba18bac71	MINOR: jwe: Some algorithms not supported by AWS-LC AWS-LC does not have EVP_aes_128_wrap or EVP_aes_192_wrap so the A128KW and A192KW algorithms will not be supported for JWE token decryption.	2026-01-15 10:56:28 +01:00
Remi Tricot-Le Breton	39da1845fc	DOC: jwe: Add doc for jwt_decrypt converters Add doc for jwt_decrypt_secret and jwt_decrypt_cert converters.	2026-01-15 10:56:28 +01:00
Remi Tricot-Le Breton	4b73a3ed29	REGTESTS: jwe: Add jwt_decrypt_secret and jwt_decrypt_cert tests Test the new jwt_decrypt converters.	2026-01-15 10:56:27 +01:00
Remi Tricot-Le Breton	e3a782adb5	MINOR: jwe: Add new jwt_decrypt_cert converter This converter checks the validity and decrypts the content of a JWE token that has an asymmetric "alg" algorithm (RSA). In such a case, we must provide a path to an already loaded certificate and private key that has the "jwt" option set to "on".	2026-01-15 10:56:27 +01:00
Remi Tricot-Le Breton	416b87d5db	MINOR: jwe: Add new jwt_decrypt_secret converter This converter checks the validity and decrypts the content of a JWE token that has a symmetric "alg" algorithm. In such a case, we only require a secret as parameter in order to decrypt the token.	2026-01-15 10:56:27 +01:00
Remi Tricot-Le Breton	2b45b7bf4f	REGTESTS: ssl: Add tests for new aes cbc converters This test mimics what was already done for the aes_gcm converters. Some data is encrypted and directly decrypted and we ensure that the output was not changed.	2026-01-15 10:56:27 +01:00
Remi Tricot-Le Breton	c431034037	MINOR: ssl: Add new aes_cbc_enc/_dec converters Those converters allow to encrypt or decrypt data with AES in Cipher Block Chaining mode. They work the same way as the already existing aes_gcm_enc/_dec ones apart from the AEAD tag notion which is not supported in CBC mode.	2026-01-15 10:56:27 +01:00
Remi Tricot-Le Breton	f0e64de753	MINOR: ssl: Factorize AES GCM data processing The parameter parsing and processing and the actual crypto part of the aes_gcm converter are interleaved. This patch puts the crypto parts in a dedicated function for better reuse in the upcoming JWE processing.	2026-01-15 10:56:27 +01:00
Amaury Denoyelle	6870551a57	MEDIUM: proxy: force traffic on unpublished/disabled backends A recent patch has introduced a new state for proxies : unpublished backends. Such backends won't be eligible for traffic, thus use_backend/default_backend rules which target them won't match and content switching rules processing will continue. This patch defines a new frontend keyword 'force-be-switch'. This keyword allows ignoring the unpublished or disabled state. Thus, use_backend/default_backend will match even if the target backend is unpublished or disabled. This is useful to be able to test a backend instance before exposing it outside. This new keyword is converted into a persist rule of new type PERSIST_TYPE_BE_SWITCH, stored in persist_rules list proxy member. This is the only persist rule applicable to frontend side. Prior to this commit, pure frontend proxies persist_rules list were always empty. This new feature requires adjustment in process_switching_rules(). Now, when a use_backend/default_backend rule matches with a non-eligible backend, frontend persist_rules are inspected to detect if a force-be-switch is present so that the backend may be selected.	2026-01-15 09:08:19 +01:00
Amaury Denoyelle	16f035d555	MINOR: cfgparse: adapt warnif_cond_conflicts() error output Utility function warnif_cond_conflicts() is used when parsing an ACL. Previously, the function directly calls ha_warning() to report an error. Change the function so that it now takes the error message as argument. Caller can then output it as wanted. This change is necessary to use the function when parsing a keyword registered as cfg_kw_list. The next patch will reuse it.	2026-01-15 09:08:18 +01:00
Amaury Denoyelle	82907d5621	MINOR: stats: report BE unpublished status A previous patch defines a new proxy status : unpublished backends. This patch extends this by changing proxy status reported in stats. If unpublished is set, an extra "(UNPUB)" is added to the field. Also, HTML stats is also slightly updated. If a backend is up but unpublished, its status will be reported in orange color.	2026-01-15 09:08:18 +01:00
Amaury Denoyelle	797ec6ede5	MEDIUM: proxy: implement publish/unpublish backend CLI Define a new set of CLI commands publish/unpublish backend <be>. The objective is to be able to change the status of a backend to unpublished. Such a backend is considered ineligible to traffic : this allows to skip use_backend rules which target it. Note that contrary to disabled/stopped proxies, an unpublished backend still has server checks running on it. Internally, a new proxy flag PR_FL_BE_UNPUBLISHED is defined. CLI commands handler "publish backend" and "unpublish backend" are executed under thread isolation. This guarantees that the flag can safely be set or removed in the CLI handlers, and read during content-switching processing.	2026-01-15 09:08:18 +01:00
Amaury Denoyelle	21fb0a3f58	MEDIUM: proxy: do not select a backend if disabled A proxy can be marked as disabled using the keyword with the same name. The doc mentions that it won't process any traffic. However, this is not really the case for backends as they may still be selected via switching rules during stream processing. In fact, currently access to disabled backends will be conducted up to assign_server(). However, no eligible server is found at this stage, resulting in a connection closure or an HTTP 503, which is expected. So in the end, servers in disabled backends won't receive any traffic. But this is only because post-parsing steps are not performed on such backends. Thus, this can be considered as functional but only via side-effects. This patch clarifies the handling of disabled backends, so that they are never selected via switching rules. Now, process_switching_rules() will ignore disabled backends and continue rules evaluation. As this is a behavior change, this patch is labelled as medium. The documentation manual for use_backend is updated accordingly.	2026-01-15 09:08:18 +01:00
Amaury Denoyelle	2d26d353ce	REGTESTS: add test on backend switching rules selection Create a new test to ensure that switching rules selection is fine. Currently, this checks that dynamic backend switching works as expected. If a matching rule is resolved to a nonexistent backend, the default backend is used instead. This regtest should be useful as switching-rules will be extended in a future set of patches to add new abilities on backends, linked to dynamic backend support.	2026-01-15 09:08:18 +01:00
Amaury Denoyelle	12975c5c37	MEDIUM: stream: refactor switching-rules processing This commit rewrites process_switching_rules() function. The objective is to simplify backend selection so that a single unified stream_set_backend() call is kept, both for regular and default backends case. This patch will be useful to add new capabilities on backends, in the context of dynamic backend support implementation.	2026-01-15 09:08:18 +01:00
Amaury Denoyelle	2f6aab9211	BUG/MINOR: proxy: free persist_rules force-persist proxy keyword is converted into a persist_rule, stored in proxy persist_rules list member. Each new rule is dynamically allocated during parsing. This commit fixes the memory leak on deinit due to a missing free on persist_rules list entries. This is done via deinit_proxy() modification. Each rule in the list is freed, along with its associated ACL condition type. This can be backported to every stable version.	2026-01-15 09:08:18 +01:00
Olivier Houchard	a209c35f30	MEDIUM: thread: Turn the group mask in thread set into a group counter If we want to be able to have more than 64 thread groups, we can no longer use thread group masks as long. One remaining place where it is done is in struct thread_set. However, it is not really used as a mask anywhere, all we want is a thread group counter, so convert that mask to a counter.	2026-01-15 05:24:53 +01:00
Olivier Houchard	6249698840	BUG/MEDIUM: queues: Fix arithmetic when feeling non_empty_tgids Fix the arithmetic when pre-filling non_empty_tgids when we still have more than 32/64 thread groups left, to get the right index, we of course have to divide the number of thread groups by the number of bits in a long. This bug was introduced by commit 7e1fed4b7a8b862bf7722117f002ee91a836beb5, but hopefully was not hit because it requires to have at least as many thread groups as there are bits in a long, which is impossible on 64bits machines, as MAX_TGROUPS is still 32.	2026-01-15 04:28:04 +01:00
Olivier Houchard	1397982599	MINOR: threads: Eliminate all_tgroups_mask. Now that it is unused, eliminate all_tgroups_mask, as we can't use 64bits masks to represent thread groups if we want to be able to have more than 64 thread groups.	2026-01-15 03:46:57 +01:00
Olivier Houchard	7e1fed4b7a	MINOR: queues: Turn non_empty_tgids into a long array. In order to be able to have more than 64 thread groups, turn non_empty_tgids into a long array, so that we have enough bits to represent every thread group, and manipulate it with the ha_bit_* functions.	2026-01-15 03:46:57 +01:00
Aurelien DARRAGON	2ec387cdc2	BUG/MINOR: http_act: fix deinit performed on uninitialized lf_expr in release_http_map() As reported by GH user @Lzq-001 on issue #3245, the config below would cause haproxy to SEGFAULT after having reported an error: frontend 0000000 http-request set-map %[hdr(0000)0_ Root cause is simple, in parse_http_set_map(), we define the release function (which is responsible to clear lf_expr expressions used by the action), prior to initializing the expressions, while the release function assumes the expressions are always initialized. For all similar actions, we already perform the init prior to setting the related release function, but this was not the case for parse_http_set_map(). We fix the bug by initializing the expressions earlier. Thanks to @Lzq-001 for having reported the issue and provided a simple reproducer. It should be backported to all stable versions, note for versions prior to 3.0, lf_expr_init() should be replaced by LIST_INIT(), see 6810c41 ("MEDIUM: tree-wide: add logformat expressions wrapper")	2026-01-14 20:05:39 +01:00
Olivier Houchard	7f4b053b26	MEDIUM: counters: mostly revert da813ae4d7cb77137ed Contrarily to what was previously believed, there are corner cases where the counters may not be allocated, and we may want to make them optional at a later date, so we have to check if those counters are there. However, just checking that shared.tg is non-NULL is enough, we can then assume that shared.tg[tgid - 1] has properly been allocated too. Also modify the various COUNTER_SHARED_* macros to make sure they check for that too.	2026-01-14 12:39:14 +01:00
Amaury Denoyelle	7aa839296d	BUG/MEDIUM: quic: fix ACK ECN frame parsing ACK frames are either of type 0x02 or 0x03. The latter is an indication that it contains extra ECN related fields. In haproxy QUIC stack, this is considered as a different frame type, set to QUIC_FT_ACK_ECN, with its own set of builder/parser functions. This patch fixes ACK ECN parsing function. Indeed, the latter suffered from two issues. First, 'first ACK range' and 'ACK ranges' were inverted. Then, the three remaining ECN fields were simply ignored by the parsing function. This issue can cause desynchronization in the frames parsing code, which may result in various result. Most of the time, the connection will be aborted by haproxy due to an invalid frame content read. Note that this issue was not detected earlier as most clients do not enable ECN support if the peer is not able to emit ACK ECN frame first, which haproxy currently never sends. Nevertheless, this is not the case for every client implementation, thus proper ACK ECN parsing is mandatory for a proper QUIC stack support. Fix this by adjusting quic_parse_ack_ecn_frame() function. The remaining ECN fields are parsed to ensure correct packet parsing. Currently, they are not used by the congestion controller. This must be backported up to 2.6.	2026-01-13 15:08:02 +01:00
Olivier Houchard	82196eb74e	BUG/MEDIUM: threads: Fix binding thread on bind. The code to parse the "thread" keyword on bind lines was changed to check if the thread numbers were correct against the value provided with max-threads-per-group, if any were provided, however, at the time those thread keywords have been set, it may not yet have been set, and that breaks the feature, so revert to check against MAX_THREADS_PER_GROUP instead, it should have no major impact.	2026-01-13 11:45:46 +01:00
Olivier Houchard	da813ae4d7	MEDIUM: counters: Remove some extra tests Before updating counters, a few tests are made to check if the counters exist, but those counters should always exist at this point, so just remove them. This commit should have no impact, but can easily be reverted with no functional impact if various crashes appear.	2026-01-13 11:12:34 +01:00
Olivier Houchard	5495c88441	MEDIUM: counters: Dynamically allocate per-thread group counters Instead of statically allocating the per-thread group counters, based on the max number of thread groups available, allocate them dynamically, based on the number of thread groups actually used. That way we can increase the maximum number of thread groups without using an unreasonable amount of memory.	2026-01-13 11:12:34 +01:00
Willy Tarreau	37057feb80	BUG/MINOR: net_helper: fix IPv6 header length processing The IPv6 header contains a payload length that excludes the 40 bytes of IPv6 packet header, which differs from IPv4's total length which includes it. As a result, the parser was wrong and would only see the IP part and not the TCP one unless sufficient options were present to cover it. This issue came in 3.4-dev2 with recent commit e88e03a6e4 ("MINOR: net_helper: add ip.fp() to build a simplified fingerprint of a SYN"), so no backport is needed.	2026-01-13 08:42:36 +01:00
Aurelien DARRAGON	fcd4d4a7aa	BUG/MINOR: hlua_fcn: ensure Patref:add_bulk() is given a table object before using it As reported by GH user @kanashimia in GH #3241, providing anything else than a table to Patref:add_bulk() method could cause a segfault because we were calling lua_next() with the lua object without ensuring it actually is a table. Let's add the missing lua_istable() check on the stack object before calling lua_next() function on it. It should be backported up to 3.2 with 884dc62 ("MINOR: hlua_fcn: add Patref:add_bulk()")	2026-01-12 17:30:54 +01:00
Aurelien DARRAGON	04545cb2b7	BUG/MINOR: hlua_fcn: fix broken yield for Patref:add_bulk() In GH #3241, GH user @kanashimia reported that the Patref:add_bulk() method would raise a Lua exception when called with more than 101 elements at once. As identified by @kanashimia there was an error in the way the add_bulk() method was forced to yield after 101 elements precisely. The yield is there to ensure Lua doesn't eat too much resources at once and doesn't impact haproxy's core responsiveness, but the check for the yield was misplaced resulting in improper stack content upon resume. Thanks to user @kanashimia who even provided a reproducer which helped a lot to troubleshoot the issue. This fix should be backported up to 3.2 with 884dc62 ("MINOR: hlua_fcn: add Patref:add_bulk()") where the bug was introduced.	2026-01-12 17:30:52 +01:00
Olivier Houchard	b1cfeeef21	BUG/MINOR: stats-file: Use a 16bits variable when loading tgid Now that the tgid stored in the stats file has been increased to 16bits by commit 022cb3ab7fdce74de2cf24bea865ecf7015e5754, don't forget to increase the variable size when reading it from the file, too. This should have no impact given the maximum thread group limit is still 32.	2026-01-12 09:48:54 +01:00
Olivier Houchard	022cb3ab7f	MINOR: stats: Increase the tgid from 8bits to 16bits Increase the size of the stored tgid in the stat file from 8bits to 16bits, so that we can have more than 256 thread groups. 65536 should be enough for some time. This bumps the stat file minor version, as the structure changes.	2026-01-12 09:39:52 +01:00
Olivier Houchard	c0f64fc36a	MINOR: receiver: Dynamically alloc the "members" field of shard_info Instead of always allocating MAX_TGROUPS members, allocate them dynamically, using the number of thread groups we'll use, so that increasing MAX_TGROUPS will not have a huge impact on the structure size.	2026-01-12 09:32:27 +01:00
Tim Duesterhus	96faf71f87	CLEANUP: connection: Remove outdated note about CO_FL `0x00002000` being unused This flag is used as of commit dcce9369129f6ca9b8eed6b451c0e20c226af2e3 ("MINOR: connections: Add a new CO_FL_SSL_NO_CACHED_INFO flag"). This patch should be backported to 3.3. Apparently dcce9369129 has been backported to 3.2 and 3.1 already, with that change already applied, so no need for a backport there.	2026-01-12 03:22:15 +01:00
Willy Tarreau	2560cce7c5	MINOR: tcp-sample: permit retrieving tcp_info from the connection/session stage The fc_xxx info that are retrieved over tcp_info could currently not be accessed before a stream is created due to a test that verified the existence of a stream. The rationale here was that the function works both for frontend and backend. Let's always retrieve these info from the session for the frontend case so that it now becomes possible to set variables at connection/session time. The doc did not mention this limitation so this could almost be considered as a bug.	2026-01-11 15:48:20 +01:00
Willy Tarreau	880bbeeda4	MINOR: sample: also support retrieving fc.timer.handshake without a stream Some timers, like the handshake timer, are stored in the session and are only copied to the logs struct when a stream is created. But this means we can't measure it without a stream, nor store it once for all in a variable at session creation time. Let's extend the sample fetch function to retrieve it from the session when no stream is present. The doc did not mention this limitation so this could almost be considered as a bug.	2026-01-11 15:48:19 +01:00
Amaury Denoyelle	875bbaa7fc	MINOR: cfgparse: remove duplicate "force-persist" in common kw list "force-persist" proxy keyword is listed twice in common_kw_list. This patch removes the duplicated occurrence. This could be backported up to 2.4.	2026-01-09 16:45:54 +01:00
Willy Tarreau	46088b7ad0	MEDIUM: config: warn if some userlist hashes are too slow It was reported in GH #2956 and more recently in GH #3235 that some hashes are way too slow. The former triggers watchdog warnings during checks, the second sees the config parsing take 20 seconds. This is always due to the use of hash algorithms that are not suitable for use in low-latency environments like web. They might be fine for a local auth though. The difficulty, as explained by Philipp Hossner, is that developers are not aware of this cost and adopt this without suspecting any side effect. The proposal here is to measure the crypt() call time and emit a warning if it takes more than 10ms (which is already extreme). This was tested by Philipp and confirmed to catch his case. This is marked medium as it might start to report warnings on config suffering from this problem without ever detecting it till now.	2026-01-09 14:56:18 +01:00
akarl10	a203ce6854	BUG/MINOR: ech/quic: enable ech configuration also for quic listeners Patch dba4fd24 ("MEDIUM: ssl/ech: config and load keys") introduced ECH configuration for bind lines, but the QUIC configuration parser still suffers from not using the same code as the TCP/TLS one, so the init for QUIC was missed. Must be backported in 3.3.	2026-01-08 17:34:28 +01:00
William Lallemand	6e1718ce4b	CI: github: remove ERR=1 temporarly from the ECH job The ECH job still fails to compile since the openssl 4.0 deprecated functions were not removed yet. Let's remove ERR=1 temporarily. We do know that there's a regression in OpenSSL 4.0 with these reg-tests though: Error: # top TEST reg-tests/ssl/set_ssl_crlfile.vtc FAILED (0.219) exit=2 Error: # top TEST reg-tests/ssl/set_ssl_cafile.vtc FAILED (0.236) exit=2 Error: # top TEST reg-tests/quic/set_ssl_crlfile.vtc FAILED (0.196) exit=2	2026-01-08 17:32:27 +01:00
Christian Ruppert	dbe52cc23e	REGTESTS: ssl: Fix reg-tests curve check OpenSSL changed the output from "Server Temp Key" in prior versions to "Peer Temp Key" in recent ones. `a39dc27c25` It looks like it affects OpenSSL >=3.5.0 This broke the reg-test for e.g. Debian 13 builds, using OpenSSL 3.5.1 Fixes bug #3238 Could be backported in every branch. Signed-off-by: Christian Ruppert <idl0r@qasl.de>	2026-01-08 16:14:54 +01:00
William Lallemand	623aa725a2	BUG/MINOR: cli/stick-tables: argument to "show table" is optional Discussed in issue #3187, the CLI help is confusing for the "show table" command as it seems that the argument is mandatory. This patch adds the arguments between square brackets to remove the confusion.	2026-01-08 11:54:01 +01:00
Willy Tarreau	dbba442740	BUILD: sockpair: fix build issue on macOS related to variable-length arrays In GH issue #3226, Sergey Fedorov (@barracuda156) reported that since commit 10c14a1ed0 ("MINOR: proto_sockpair: send_fd_uxst: init iobuf, cmsghdr, cmsgbuf to zeros"), macOS 10.6.8 with gcc 14.3.0 doesn't build anymore: src/proto_sockpair.c: In function 'send_fd_uxst': src/proto_sockpair.c:246:49: error: variable-sized object may not be initialized except with an empty initializer 246 \| char cmsgbuf[CMSG_SPACE(sizeof(int))] = {0}; \| ^ src/proto_sockpair.c:247:45: error: variable-sized object may not be initialized except with an empty initializer 247 \| char buf[CMSG_SPACE(sizeof(int))] = {0}; \| ^ Upon investigation, it appears that the CMSG_SPACE() macro on this OS looks too complex for gcc to consider it as a constant, so it takes these buffers for variable-length arrays and cannot initialize them. Let's move to a simple memset() instead, which Sergey confirmed fixes the problem. This needs to be backported as far as 3.1. Thanks to Sergey for the report, the bisect and testing the fix.	2026-01-08 09:26:22 +01:00
Hyeonggeun Oh	c17ed69bf3	MINOR: cfgparse: Refactor "userlist" parser to print it in -dKall operation This patch covers issue https://github.com/haproxy/haproxy/issues/3221. The parser for the "userlist" section did not use the standard keyword registration mechanism. Instead, it relied on a series of strcmp() comparisons to identify keywords such as "group" and "user". This had two main drawbacks: 1. The keywords were not discoverable by the "-dKall" dump option, making it difficult for users to see all available keywords for the section. 2. The implementation was inconsistent with the parsers for other sections, which have been progressively refactored to use the standard cfg_kw_list infrastructure. This patch refactors the userlist parser to align it with the project's standard conventions. The parsing logic for the "group" and "user" keywords has been extracted from the if/else block in cfg_parse_users() into two new dedicated functions: - cfg_parse_users_group() - cfg_parse_users_user() These two keywords are now registered via a dedicated cfg_kw_list, making them visible to the rest of the HAProxy ecosystem, including the -dKall dump.	2026-01-07 18:25:09 +01:00
William Lallemand	91cff75908	BUG/MINOR: cfgparse: wrong section name upon error When an unknown keyword was used in the "userlist" section, the error was mentioning the "users" section, instead of "userlist". Could be backported in every branch.	2026-01-07 18:13:12 +01:00
William Lallemand	4aff6d1c25	BUILD: tools: memchr definition changed in C23 New gcc and clang versions from fedora rawhide seem to use the C23 standard by default. This version changes the definition of some string.h functions, which now return a const char * instead of a char *. src/tools.c: In function ‘fgets_from_mem’: src/tools.c:7200:17: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers] 7200 \| new_pos = memchr(position, '\n', size); \| ^ Strangely, -Wdiscarded-qualifiers does not seem to catch all the memchr. Should fix issue #3228. This could be backported in previous versions.	2026-01-07 14:51:26 +01:00
William Lallemand	5322bd3785	BUILD: ssl: strchr definition changed in C23 New gcc and clang versions from fedora rawhide seem to use the C23 standard by default. This version changes the definition of some string.h functions, which now return a const char * instead of a char *. src/ssl_sock.c: In function ‘SSL_CTX_keylog’: src/ssl_sock.c:4475:17: error: assignment discards ‘const’ qualifier from pointer target type [-Werror=discarded-qualifiers] 4475 \| lastarg = strrchr(line, ' '); Strangely, -Wdiscarded-qualifiers does not seem to catch all the strrchr. Should fix issue #3228. This could be backported in previous versions.	2026-01-07 14:51:26 +01:00
Willy Tarreau	71b00a945d	[RELEASE] Released version 3.4-dev2 Released version 3.4-dev2 with the following main changes : - BUG/MEDIUM: mworker/listener: ambiguous use of RX_F_INHERITED with shards - BUG/MEDIUM: http-ana: Properly detect client abort when forwarding response (v2) - BUG/MEDIUM: stconn: Don't report abort from SC if read0 was already received - BUG/MEDIUM: quic: Don't try to use hystart if not implemented - CLEANUP: backend: Remove useless test on server's xprt - CLEANUP: tcpcheck: Remove useless test on the xprt used for healthchecks - CLEANUP: ssl-sock: Remove useless tests on connection when resuming TLS session - REGTESTS: quic: fix a TLS stack usage - REGTESTS: list all skipped tests including 'feature cmd' ones - CI: github: remove openssl no-deprecated job - CI: github: add a job to test the master branch of OpenSSL - CI: github: openssl-master.yml misses actions/checkout - BUG/MEDIUM: backend: Do not remove CO_FL_SESS_IDLE in assign_server() - CI: github: use git prefix for openssl-master.yml - BUG/MEDIUM: mux-h2: synchronize all conditions to create a new backend stream - REGTESTS: fix error when no test are skipped - MINOR: cpu-topo: Turn the cpu policy configuration into a struct - MEDIUM: cpu-topo: Add a "threads-per-core" keyword to cpu-policy - MEDIUM: cpu-topo: Add a "cpu-affinity" option - MEDIUM: cpu-topo: Add a new "max-threads-per-group" global keyword - MEDIUM: cpu-topo: Add the "per-thread" cpu_affinity - MEDIUM: cpu-topo: Add the "per-ccx" cpu_affinity - BUG/MINOR: cpu-topo: fix -Wlogical-not-parentheses build with clang - DOC: config: fix number of values for "cpu-affinity" - MINOR: tools: add a secure implementation of memset - MINOR: mux-h2: add missing glitch count for non-decodable H2 headers - MINOR: mux-h2: perform a graceful close at 75% glitches threshold - MEDIUM: mux-h1: implement basic glitches support - MINOR: mux-h1: perform a graceful close at 75% glitches threshold - MEDIUM: cfgparse: acknowledge that proxy ID auto numbering starts at 2 - MINOR: cfgparse: remove useless checks on no server in backend - OPTIM/MINOR: proxy: do not init proxy management task if unused - MINOR: patterns: preliminary changes for reorganization - MEDIUM: patterns: reorganize pattern reference elements - CLEANUP: patterns: remove dead code - OPTIM: patterns: cache the current generation - MINOR: tcp: add new bind option "tcp-ss" to instruct the kernel to save the SYN - MINOR: protocol: support a generic way to call getsockopt() on a connection - MINOR: tcp: implement the get_opt() function - MINOR: tcp_sample: implement the fc_saved_syn sample fetch function - CLEANUP: assorted typo fixes in the code, commits and doc - BUG/MEDIUM: cpu-topo: Don't forget to reset visited_ccx. - BUG/MAJOR: set the correct generation ID in pat_ref_append(). - BUG/MINOR: backend: fix the conn_retries check for TFO - BUG/MINOR: backend: inspect request not response buffer to check for TFO - MINOR: net_helper: add sample converters to decode ethernet frames - MINOR: net_helper: add sample converters to decode IP packet headers - MINOR: net_helper: add sample converters to decode TCP headers - MINOR: net_helper: add ip.fp() to build a simplified fingerprint of a SYN - MINOR: net_helper: prepare the ip.fp() converter to support more options - MINOR: net_helper: add an option to ip.fp() to append the TTL to the fingerprint - MINOR: net_helper: add an option to ip.fp() to append the source address - DOC: config: fix the length attribute name for stick tables of type binary / string - MINOR: mworker/cli: only keep positive PIDs in proc_list - CLEANUP: mworker: remove duplicate list.h include - BUG/MINOR: mworker/cli: fix show proc pagination using reload counter - MINOR: mworker/cli: extract worker "show proc" row printer - MINOR: cpu-topo: Factorize code - MINOR: cpu-topo: Rename variables to better fit their usage - BUG/MEDIUM: peers: Properly handle shutdown when trying to get a line - BUG/MEDIUM: mux-h1: Take care to update <kop> value during zero-copy forwarding - MINOR: threads: Avoid using a thread group mask when stopping. - MINOR: hlua: Add support for lua 5.5 - MEDIUM: cpu-topo: Add an optional directive for per-group affinity - BUG/MEDIUM: mworker: can't use signals after a failed reload - BUG/MEDIUM: stconn: Move data from <kip> to <kop> during zero-copy forwarding - DOC: config: fix a few typos and refine cpu-affinity - MINOR: receiver: Remove tgroup_mask from struct shard_info - BUG/MINOR: quic: fix deprecated warning for window size keyword	2026-01-07 11:02:12 +01:00
Amaury Denoyelle	e061547d9d	BUG/MINOR: quic: fix deprecated warning for window size keyword QUIC configuration was cleaned up in the previous release. Several global keyword names were changed to unify the configuration. For each of them the older keyword is marked as deprecated, with a warning to mention the newer alternative. This patch fixes the warning for 'tune.quic.frontend.default-max-size' as the alternative proposed was not correct. The proper value now is 'tune.quic.fe.cc.max-win-size'. This must be backported up to 3.3.	2026-01-07 09:54:31 +01:00
Olivier Houchard	41cd589645	MINOR: receiver: Remove tgroup_mask from struct shard_info The only purpose from tgroup_mask seems to be to calculate how many tgroups share the same shard, but this is an information we can calculate differently, we just have to increment the number when a new receiver is added to the shard, and decrement it when one is detached from the shard. Removing thread group masks will allow us to increase the maximum number of thread groups past 64.	2026-01-07 09:27:12 +01:00
Willy Tarreau	c3fcdfaf5c	DOC: config: fix a few typos and refine cpu-affinity There were two typos in the recently updated parts about per-group. Also, change the commas to ':' after the options values, as sometimes it would be confusing. Last, place quotes around keyword names so that they're explicitly referred to as language keywords. No backport is needed.	2026-01-07 09:19:25 +01:00
Christopher Faulet	83457b9e38	BUG/MEDIUM: stconn: Move data from <kip> to <kop> during zero-copy forwarding The <kip> of producer was not forwarded to <kop> of consumer when zero-copy data forwarding was tried. Because of the issue, the chunking of emitted H1 messages could be invalid. To fix the bug, sc_ep_fwd_kip() must be called at this stage. This fix is related to the previous one (529a8dbfb "BUG/MEDIUM: mux-h1: Take care to update <kop> value during zero-copy forwarding"). Both are required to fully fix the issue #3230. This patch must be backported to 3.3.	2026-01-06 15:41:50 +01:00
William Lallemand	97490a7789	BUG/MEDIUM: mworker: can't use signals after a failed reload In issue #3229 it was reported that the master couldn't reload after a failed reload following a wrong configuration. It is still possible to do a reload using the "reload" command of the master CLI, but all signals are blocked. The problem was introduced in 709cde6d0 ("BUG/MEDIUM: mworker: signals inconsistencies during startup and reload") which fixes the blocking of signals during the reload. However the patch missed a case: run_master_in_recovery_mode() is not called when the worker fails to parse the configuration, it is only called when the master is failing. To handle this case, the mworker_unblock_signals() function must be called upon mworker_on_new_child_failure(). But since this is called in an haproxy signal handler it would mess with the signals. Instead, the patch adds a task which is started by the signal handler, and restores the signals outside of it. This must be backported as far as 3.1.	2026-01-06 14:27:53 +01:00
Olivier Houchard	56fd0c1a5c	MEDIUM: cpu-topo: Add an optional directive for per-group affinity When using per-group affinity, add an optional new directive. It accepts the values of "auto", where when multiple thread groups are created, the available CPUs are split equally across the groups, and is the new default, and "loose", where all groups are bound to all available CPUs, this is the old default.	2026-01-06 11:32:45 +01:00
Mike Lothian	1c0f781994	MINOR: hlua: Add support for lua 5.5 Lua 5.5 adds an extra argument to lua_newstate(). Since there are already a few other ifdefs in hlua.c checking for the Lua version, and there's a single call place, let's do the same here. This should be safe for backporting if needed. Signed-off-by: Mike Lothian <mike@fireburn.co.uk>	2026-01-06 11:05:02 +01:00
Olivier Houchard	853604f87a	MINOR: threads: Avoid using a thread group mask when stopping. Remove the "stopped_tgroup_mask" variable, that indicated which thread groups were stopping, and instead just use "stopped_tgroups", a counter indicating how many thread groups are stopping. We want to remove all thread group masks, so that we can increase the maximum number of thread groups past 64.	2026-01-06 08:30:55 +01:00
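The mask-to-counter change described in this commit can be illustrated with a small sketch. It is not HAProxy's code: `stop_state_sketch`, `group_stopping()` and the field names are hypothetical, and only the idea (counting the stopping thread groups instead of setting one bit per group in a 64-bit mask) comes from the commit message.

```c
#include <stdatomic.h>

/* Hypothetical sketch, not HAProxy's actual structures. A 64-bit mask can
 * track at most 64 thread groups (one bit each); a counter has no such cap.
 */
struct stop_state_sketch {
	atomic_uint stopped_tgroups; /* how many thread groups are stopping */
	unsigned int total_tgroups;  /* how many thread groups exist */
};

/* Called once by each thread group when it starts stopping. Returns 1 when
 * the caller is the last group to stop, 0 otherwise.
 */
static int group_stopping(struct stop_state_sketch *st)
{
	unsigned int before = atomic_fetch_add(&st->stopped_tgroups, 1);

	return before + 1 == st->total_tgroups;
}
```

One design consequence worth noting: setting a bit twice is harmless, while incrementing twice is not, so in a counter-based scheme each group has to report exactly once.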
Christopher Faulet	529a8dbfba	BUG/MEDIUM: mux-h1: Take care to update <kop> value during zero-copy forwarding Since the extra field was removed from the HTX structure, a regression was introduced in the forwarding of chunked messages. The <kop> value was not decreased as it should be when data were sent via the zero-copy forwarding. Because of this bug, it was possible to announce a chunk size larger than the chunk data sent. To fix the bug, a helper function was added to properly update the <kop> value when a chunk size is emitted. This function is now called when a new chunk is announced, including during zero-copy forwarding. As a workaround, "tune.disable-zero-copy-forwarding" or just "tune.h1.zero-copy-fwd-send off" can be set in the global section. This patch should fix the issue #3230. It must be backported to 3.3.	2026-01-06 07:39:05 +01:00
Christopher Faulet	0b29b76a52	BUG/MEDIUM: peers: Properly handle shutdown when trying to get a line When a shutdown was reported to a peer applet, the event was not properly handled if it failed to receive data. The function responsible for getting data was exiting too early if the applet buffer was empty, without testing the sedesc status. Because of this issue, it was possible to have frozen peer applets. For instance, it happened on client timeout. With too many frozen applets, it was possible to reach the maxconn. This patch should fix the issue #3234. It must be backported to 3.3.	2026-01-05 13:46:57 +01:00
Olivier Houchard	196d16f2b1	MINOR: cpu-topo: Rename variables to better fit their usage Rename "visited_tsid" and "visited_ccx" to "touse_tsid" and "touse_ccx". They are not there to remember which tsid/ccx we already visited, contrary to visited_ccx_set and visited_cl_set; they are there to know which tsid/ccx we should use, so make that clear.	2026-01-05 09:25:48 +01:00
Olivier Houchard	bbf5c30a87	MINOR: cpu-topo: Factorize code Factorize the code common to cpu_policy_group_by_ccx() and cpu_policy_group_by_cluster() into a new function, cpu_policy_assign_threads().	2026-01-05 09:24:44 +01:00
Alexander Stephan	e241144e70	MINOR: mworker/cli: extract worker "show proc" row printer Introduce cli_append_worker_row() to centralize formatting of a single worker row. Also, replace duplicated row-printing code in both current and old workers loops with the helper. Motivation: Reduces LOC and improves readability by removing duplication.	2026-01-05 08:59:45 +01:00
Alexander Stephan	4c10d9c70c	BUG/MINOR: mworker/cli: fix show proc pagination using reload counter After commit 594408cd612b5 ("BUG/MINOR: mworker/cli: 'show proc' is limited by buffer size"), related to ticket #3204, the "show proc" logic has been fixed to be able to print more than 202 processes. However, this fix can lead to the omission of entries in case they have the same timestamp. To fix this, we use the unique reload counter instead of the timestamp. On partial flush, set ctx->next_reload = child->reloads. On resume skip entries with child->reloads >= ctx->next_reload. Finally, we clear ctx->next_reload at the end of a complete dump so subsequent show proc starts from the top. Could be backported in all stable branches.	2026-01-05 08:59:34 +01:00
Alexander Stephan	a5f274de92	CLEANUP: mworker: remove duplicate list.h include Drop the second #include <haproxy/list.h> from mworker.c. No functional change; reduces redundancy and keeps includes tidy.	2026-01-05 08:59:34 +01:00
Alexander Stephan	c30eeb2967	MINOR: mworker/cli: only keep positive PIDs in proc_list Change mworker_env_to_proc_list() to if (child->pid > 0) before LIST_APPEND, avoiding invalid PIDs (0/-1) in the process list. This has no functional impact beyond stricter validation and it aligns with existing kill safeguards.	2026-01-05 08:59:14 +01:00
Willy Tarreau	6970c8b8b6	DOC: config: fix the length attribute name for stick tables of type binary / string The stick-table doc was reworked and moved in 3.2 with commit da67a89f3 ("DOC: config: move stick-tables and peers to their own section"), however the optional length attribute for binary/string types was mistakenly spelled "length" while it's "len". This must be backported to 3.2.	2026-01-01 10:52:50 +01:00
Willy Tarreau	a206f85f96	MINOR: net_helper: add an option to ip.fp() to append the source address The new value 4 will permit to append the source address to the fingerprint, making it easier to build rules checking a specific path.	2026-01-01 10:32:16 +01:00
Willy Tarreau	70ffae3614	MINOR: net_helper: add an option to ip.fp() to append the TTL to the fingerprint With mode value 1, the TTL will be appended immediately after the 7 bytes, making it a 8-byte fingerprint.	2026-01-01 10:19:48 +01:00
Willy Tarreau	2c317cfed7	MINOR: net_helper: prepare the ip.fp() converter to support more options It can make sense to support extra components in the fingerprint to ease configuration, so let's change the 0/1 value to a bit field. We also turn the current 1 (TCP options list) to 2 so that we'll reuse 1 for the TTL.	2026-01-01 10:19:20 +01:00
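Read together with the two commits listed just above it (TTL as value 1, source address as value 4), the ip.fp() mode argument ends up as a bit field. The sketch below only illustrates how such a mode can be tested. The enum and function names are hypothetical and do not come from HAProxy; only the numeric values are taken from the commit messages.

```c
/* Hypothetical names; only the numeric values come from the commit messages. */
enum ip_fp_mode_sketch {
	IP_FP_SKETCH_TTL      = 1, /* append the TTL */
	IP_FP_SKETCH_TCP_OPTS = 2, /* TCP options list (was value 1 before) */
	IP_FP_SKETCH_SRC_ADDR = 4, /* append the source address */
};

/* Returns non-zero when <bit> is requested in <mode>. */
static int ip_fp_mode_has(unsigned int mode, unsigned int bit)
{
	return (mode & bit) != 0;
}
```

With this layout a mode of 5 would request the TTL and the source address but not the TCP options list.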
Willy Tarreau	e88e03a6e4	MINOR: net_helper: add ip.fp() to build a simplified fingerprint of a SYN Here we collect all the stuff that depends on the sender's settings, such as TOS, IP version, TTL range, presence of DF bit or IP options, presence of DATA in the SYN, CWR+ECE flags, TCP header length, wscale, initial window, mss, as well as the list of TCP extension kinds. It's obviously fairly limited but can help avoid blacklisting certain valid clients sharing the same IP address as a misbehaving one. It supports both a short and a long mode depending on the argument. These can be used with the tcp-ss bind option. The doc was updated accordingly.	2025-12-31 17:17:38 +01:00
Willy Tarreau	6e46d1345b	MINOR: net_helper: add sample converters to decode TCP headers This adds the following converters, used to decode fields in an incoming TCP header: tcp.dst, tcp.flags, tcp.seq, tcp.src, tcp.win, tcp.options.mss, tcp.options.tsopt, tcp.options.tsval, tcp.options.wscale, tcp.options_list. These can be used with the tcp-ss bind option. The doc was updated accordingly.	2025-12-31 17:17:23 +01:00
Willy Tarreau	e0a7a7ca43	MINOR: net_helper: add sample converters to decode IP packet headers This adds a few converters that help decode parts of IP packets: - ip.data : returns the next header (typically TCP) - ip.df : returns the dont-fragment flags - ip.dst : returns the destination IPv4/v6 address - ip.hdr : returns only the IP header - ip.proto: returns the upper level protocol (udp/tcp) - ip.src : returns the source IPv4/v6 address - ip.tos : returns the TOS / TC field - ip.ttl : returns the TTL/HL value - ip.ver : returns the IP version (4 or 6) These can be used with the tcp-ss bind option. The doc was updated accordingly.	2025-12-31 17:16:29 +01:00
Willy Tarreau	90d2f157f2	MINOR: net_helper: add sample converters to decode ethernet frames This adds a few converters that help decode parts of ethernet frame headers: - eth.data : returns the next header (typically IP) - eth.dst : returns the destination MAC address - eth.hdr : returns only the ethernet header - eth.proto: returns the ethernet proto - eth.src : returns the source MAC address - eth.vlan : returns the VLAN ID when present These can be used with the tcp-ss bind option. The doc was updated accordingly.	2025-12-31 17:15:36 +01:00
Willy Tarreau	933cb76461	BUG/MINOR: backend: inspect request not response buffer to check for TFO In 2.6, do_connect_server() was introduced by commit 0a4dcb65f ("MINOR: stream-int/backend: Move si_connect() in the backend scope") and changed the approach to work with a stream instead of a stream-interface. However si_oc(si) was wrongly turned to &s->res instead of &s->req, which breaks TFO by always inspecting the response channel to figure whether there are data pending. This fix can be backported to all versions till 2.6.	2025-12-31 13:03:53 +01:00
Willy Tarreau	799653d536	BUG/MINOR: backend: fix the conn_retries check for TFO In 2.6, the retries counter on a stream was changed from retries left to retries done via commit 731c8e6cf ("MINOR: stream: Simplify retries counter calculation"). However, one comparison fell through the cracks in order to detect whether or not we can use TFO (only first attempt), resulting in TFO never working anymore. This may be backported to all versions till 2.6.	2025-12-31 13:03:53 +01:00
Maxime Henrion	51592f7a09	BUG/MAJOR: set the correct generation ID in pat_ref_append(). This fixes crashes when creating more than one new revision of a map or acl file and purging the previous version.	2025-12-31 00:29:47 +01:00
Olivier Houchard	54f59e4669	BUG/MEDIUM: cpu-topo: Don't forget to reset visited_ccx. We want to reset visited_ccx, as introduced by commit 8aef5bec1ef57eac449298823843d6cc08545745, each time we run the loop, otherwise the chances of its content being correct are very low, and will likely end up being bound to the wrong threads. This was reported in github issue #3224.	2025-12-26 23:55:57 +01:00
Ilia Shipitsin	f8a77ecf62	CLEANUP: assorted typo fixes in the code, commits and doc	2025-12-25 19:45:29 +01:00
Willy Tarreau	6fb521d2f6	MINOR: tcp_sample: implement the fc_saved_syn sample fetch function This function retrieves the copy of a SYN packet that the system has kept for us when bind option "tcp-ss" was set to 1 or above. It's recommended to copy it to a local variable because it will be freed after being read. It allows to inspect all parts of an incoming SYN packet, provided that it was preserved (e.g. not possible with SYN cookies). The doc provides examples of how to use it.	2025-12-24 18:39:37 +01:00
Willy Tarreau	52d60bf9ee	MINOR: tcp: implement the get_opt() function It relies on the generic sock_conn_get_opt() function and will permit sample fetch functions to retrieve generic TCP-level info.	2025-12-24 18:38:51 +01:00
Willy Tarreau	6d995e59e9	MINOR: protocol: support a generic way to call getsockopt() on a connection It's regularly needed to call getsockopt() on a connection, but each time the calling code has to do all the job by itself. This commit adds a "get_opt()" callback on the protocol struct, that directly calls getsockopt() on the connection's FD. A generic implementation for standard sockets is provided, though QUIC would likely require a different approach, or maybe a mapping. Due to the overlap between IP/TCP/socket option values, it is necessary for the caller to indicate both the level and the option. An abstraction of the level could be done, but the caller would nonetheless have to know the optname, which is generally defined in the same include files. So for now we'll consider that this callback is only for very specific use. The levels and optnames are purposely passed as signed ints so that it is possible to further extend the API by using negative levels for internal namespaces.	2025-12-24 18:38:51 +01:00
Willy Tarreau	44c67a08dd	MINOR: tcp: add new bind option "tcp-ss" to instruct the kernel to save the SYN This option enables TCP_SAVE_SYN on the listening socket, which will cause the kernel to try to save a copy of the SYN packet header (L2, IP and TCP are supported). This can permit to check the source MAC address of a client, or find certain TCP options such as a source address encapsulated using RFC7974. It could also be used as an alternate approach to retrieving the source and destination addresses and ports. For now setting the option is enabled, but sample fetch functions and converters will be needed to extract info.	2025-12-24 11:35:09 +01:00
Maxime Henrion	1fdccbe8da	OPTIM: patterns: cache the current generation This makes a significant difference when loading large files and during commit and clear operations, thanks to improved cache locality. In the measurements below, master refers to the code before any of the changes to the patterns code, not the code before this one commit. Timing the replacement of 10M entries from the CLI with this command which also reports timestamps at start, end of upload and end of clear: $ (echo "prompt i"; echo "show activity"; echo "prepare acl #0"; awk '{print "add acl @1 #0",$0}' < bad-ip.map; echo "show activity"; echo "commit acl @1 #0"; echo "clear acl @0 #0";echo "show activity") \| socat -t 10 - /tmp/sock1 \| grep ^uptim master, on a 3.7 GHz EPYC, 3 samples: uptime_now: 6.087030 uptime_now: 25.981777 => 21.9 sec insertion time uptime_now: 29.286368 => 3.3 sec commit+clear uptime_now: 5.748087 uptime_now: 25.740675 => 20.0s insertion time uptime_now: 29.039023 => 3.3 s commit+clear uptime_now: 7.065362 uptime_now: 26.769596 => 19.7s insertion time uptime_now: 30.065044 => 3.3s commit+clear And after this commit: uptime_now: 6.119215 uptime_now: 25.023019 => 18.9 sec insertion time uptime_now: 27.155503 => 2.1 sec commit+clear uptime_now: 5.675931 uptime_now: 24.551035 => 18.9s insertion uptime_now: 26.652352 => 2.1s commit+clear uptime_now: 6.722256 uptime_now: 25.593952 => 18.9s insertion uptime_now: 27.724153 => 2.1s commit+clear Now timing the startup time with a 10M entries file (on another machine) on master, 20 samples: Standard Deviation, s: 0.061652677408033 Mean: 4.217 And after this commit: Standard Deviation, s: 0.081821371548669 Mean: 3.78	2025-12-23 21:17:39 +01:00
Maxime Henrion	99e625a41d	CLEANUP: patterns: remove dead code Situations where we are iterating over elements and find one with a different generation ID cannot arise anymore since the elements are kept per-generation.	2025-12-23 21:17:39 +01:00
Maxime Henrion	545cf59b6f	MEDIUM: patterns: reorganize pattern reference elements Instead of a global list (and tree) of pattern reference elements, we now have an intermediate pat_ref_gen structure and store the elements in those. This simplifies the logic of some operations such as commit and clear, and improves performance in some cases - numbers to be provided in a subsequent commit after one important optimization is added. A lot of the changes are due to adding an extra level of indirection, changing many cases where we iterate over all elements to an outer loop iterating over the generation and an inner one iterating over the elements of the current generation. It is therefore easier to read this patch using 'git diff -w'.	2025-12-23 21:17:39 +01:00
Maxime Henrion	5547bedebb	MINOR: patterns: preliminary changes for reorganization Safe and non-functional changes that only add currently unused structures, field, functions and macros, in preparation of larger changes that alter the way pattern reference elements are stored. This includes code to create and lookup generation objects, and macros to iterate over the generations of a pattern reference.	2025-12-23 21:17:39 +01:00
Amaury Denoyelle	a4a17eb366	OPTIM/MINOR: proxy: do not init proxy management task if unused Each proxy has its own task for internal purposes. Currently, it is only used either by frontends or if a stick-table is present. This commit makes the task allocation optional, limiting it to the cases where it is required. Thus, it is not allocated anymore for backend-only proxies without a stick-table.	2025-12-23 16:35:49 +01:00
Amaury Denoyelle	c397f6fc9a	MINOR: cfgparse: remove useless checks on no server in backend A legacy check could be activated at compile time to reject backends without servers. In practice this is not used anymore and does not have much sense with the introduction of dynamic servers.	2025-12-23 16:35:49 +01:00
Amaury Denoyelle	b562602044	MEDIUM: cfgparse: acknowledge that proxy ID auto numbering starts at 2 Each frontend/backend/listen proxy is assigned a unique ID. It can either be set explicitly via the 'id' keyword, or automatically assigned on post parsing depending on the available values. It was expected that the first automatically assigned value would start at '1'. However, due to a legacy bug this is not the case as this value is always skipped. Thus, automatically assigned proxy IDs always start at '2' or more. To avoid breaking the current existing state, this situation is now acknowledged with the current patch. The code is rewritten with an explicit warning to ensure that this won't be fixed without knowing the current status. A new regtest also ensures this.	2025-12-23 16:35:49 +01:00
Willy Tarreau	5904f8279b	MINOR: mux-h1: perform a graceful close at 75% glitches threshold This avoids hitting the hard wall for connections with non-compliant peers that are accumulating errors. We recycle the connection early enough to permit to reset the counter. Example below with a threshold set to 100: Before, 1% errors: $ h1load -H "Host : blah" -c 1 -n 10000000 0:4445 # time conns tot_conn tot_req tot_bytes err cps rps bps ttfb 1 1 1039 103872 6763365 1038 1k03 103k 54M1 9.426u 2 1 2128 212793 14086140 2127 1k08 108k 58M5 8.963u 3 1 3215 321465 21392137 3214 1k08 108k 58M3 8.982u 4 1 4307 430684 28735013 4306 1k09 109k 58M6 8.935u 5 1 5390 538989 36016294 5389 1k08 108k 58M1 9.021u After, no more errors: $ h1load -H "Host : blah" -c 1 -n 10000000 0:4445 # time conns tot_conn tot_req tot_bytes err cps rps bps ttfb 1 1 1509 113161 7487809 0 1k50 113k 59M9 8.482u 2 1 3002 225101 15114659 0 1k49 111k 60M9 8.582u 3 1 4508 338045 22809911 0 1k50 112k 61M5 8.523u 4 1 5971 447785 30286861 0 1k46 109k 59M7 8.772u 5 1 7472 560335 37955271 0 1k49 112k 61M2 8.537u	2025-12-20 19:29:37 +01:00
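The two-level behaviour described here (graceful close at 75% of the threshold, hard kill at the threshold itself) can be sketched as follows. This is a hypothetical illustration, not the mux-h1 code: the enum, the function name and the assumption that a threshold of 0 disables the check are not taken from the commit, and the CPU usage condition mentioned for the hard limit is deliberately left out.

```c
/* Hypothetical sketch of a two-level glitches check; not HAProxy's code. */
enum glitch_action_sketch {
	GLITCH_SKETCH_NONE,           /* keep using the connection */
	GLITCH_SKETCH_GRACEFUL_CLOSE, /* recycle the connection early */
	GLITCH_SKETCH_KILL,           /* hard limit reached or passed */
};

static enum glitch_action_sketch
glitch_check_sketch(unsigned int glitches, unsigned int threshold)
{
	if (!threshold)
		return GLITCH_SKETCH_NONE; /* assumption: 0 means "no threshold" */

	if (glitches >= threshold)
		return GLITCH_SKETCH_KILL;

	/* 75% of the threshold, written this way to avoid overflowing
	 * threshold * 3 for very large values.
	 */
	if (glitches >= threshold - threshold / 4)
		return GLITCH_SKETCH_GRACEFUL_CLOSE;

	return GLITCH_SKETCH_NONE;
}
```

With a threshold of 100 as in the example above, a connection would be recycled gracefully once it reaches 75 glitches, well before hitting the hard wall at 100.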
Willy Tarreau	05b457002b	MEDIUM: mux-h1: implement basic glitches support We now count glitches for each parsing error, including those that have been accepted via accept-unsafe-violations-*. Front and back are considered and the connection gets killed on error once the threshold is reached or passed and the CPU usage is beyond the configured limit (0 by default). This was tested with: curl -ivH "host : blah" 0:4445{,,,,,,,,,} which sends 10 requests to a configuration having a threshold of 5. The global keywords are named similarly to H2 and quic: tune.h1.be.glitches-threshold xxxx tune.h1.fe.glitches-threshold xxxx The glitches count of each connection is also reported when non-null in the connection dumps (e.g. "show fd").	2025-12-20 19:29:33 +01:00
Willy Tarreau	0901f60cef	MINOR: mux-h2: perform a graceful close at 75% glitches threshold This avoids hitting the hard wall for connections with non-compliant peers that would be accumulating errors over long connections. We now permit to recycle the connection early enough to reset the connection counter. This was tested artificially by adding this to h2c_frt_handle_headers(): h2c_report_glitch(h2c, 1, "new stream"); or this to h2_detach(): h2c_report_glitch(h2c, 1, "detaching"); and injecting using h2load -c 1 -n 1000 0:4445 on a config featuring tune.h2.fe.glitches-threshold 1000: finished in 8.74ms, 85802.54 req/s, 686.62MB/s requests: 1000 total, 751 started, 751 done, 750 succeeded, 250 failed, 250 errored, 0 timeout status codes: 750 2xx, 0 3xx, 0 4xx, 0 5xx traffic: 6.00MB (6293303) total, 132.57KB (135750) headers (space savings 29.84%), 5.86MB (6144000) data min max mean sd +/- sd time for request: 9us 178us 10us 6us 99.47% time for connect: 139us 139us 139us 0us 100.00% time to 1st byte: 339us 339us 339us 0us 100.00% req/s : 87477.70 87477.70 87477.70 0.00 100.00% The failures are due to h2load not supporting reconnection.	2025-12-20 19:26:29 +01:00
Willy Tarreau	52adeef7e1	MINOR: mux-h2: add missing glitch count for non-decodable H2 headers One rare error case, which could produce a protocol error on the stream when response headers could not be decoded, wasn't being accounted as a glitch, so let's fix it.	2025-12-20 19:11:16 +01:00
Maxime Henrion	c8750e4e9d	MINOR: tools: add a secure implementation of memset This guarantees that the compiler will not optimize away the memset() call if it detects a dead store. Use this to clear SSL passphrases. No backport needed.	2025-12-19 17:42:57 +01:00
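One common way to obtain the guarantee described in this commit is to call memset() through a volatile function pointer, so the compiler cannot prove which function is called and therefore cannot drop the call as a dead store. The sketch below shows that general technique only. `secure_memset_sketch` and `memset_fn_sketch` are hypothetical names, and nothing here claims to match the implementation actually added to HAProxy's tools.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration of the technique, not HAProxy's code. */
typedef void *(*memset_fn_sketch)(void *, int, size_t);

/* The pointer is volatile: its value must be loaded at run time, so the
 * call through it cannot be optimized away even if the buffer is never
 * read again.
 */
static volatile memset_fn_sketch memset_ptr_sketch = memset;

static void *secure_memset_sketch(void *buf, int c, size_t len)
{
	return memset_ptr_sketch(buf, c, len);
}
```

A typical use would be clearing a passphrase buffer right before it goes out of scope, which is exactly the kind of store an optimizer may otherwise consider dead.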
Willy Tarreau	bd92f34f02	DOC: config: fix number of values for "cpu-affinity" It said "accepts 2 values" then goes on enumerating 5 since more were added one at a time. Let's fix it by removing the number. No backport is needed.	2025-12-19 11:21:09 +01:00
William Lallemand	03340748de	BUG/MINOR: cpu-topo: fix -Wlogical-not-parentheses build with clang src/cpu_topo.c:1325:15: warning: logical not is only applied to the left hand side of this bitwise operator [-Wlogical-not-parentheses] 1325 \| } else if (!cpu_policy_conf.flags & CPU_POLICY_ONE_THREAD_PER_CORE) \| ^ ~ src/cpu_topo.c:1325:15: note: add parentheses after the '!' to evaluate the bitwise operator first 1325 \| } else if (!cpu_policy_conf.flags & CPU_POLICY_ONE_THREAD_PER_CORE) \| ^ \| ( ) src/cpu_topo.c:1325:15: note: add parentheses around left hand side expression to silence this warning 1325 \| } else if (!cpu_policy_conf.flags & CPU_POLICY_ONE_THREAD_PER_CORE) \| ^ \| ( ) src/cpu_topo.c:1533:15: warning: logical not is only applied to the left hand side of this bitwise operator [-Wlogical-not-parentheses] 1533 \| } else if (!cpu_policy_conf.flags & CPU_POLICY_ONE_THREAD_PER_CORE) \| ^ ~ src/cpu_topo.c:1533:15: note: add parentheses after the '!' to evaluate the bitwise operator first 1533 \| } else if (!cpu_policy_conf.flags & CPU_POLICY_ONE_THREAD_PER_CORE) \| ^ \| ( ) src/cpu_topo.c:1533:15: note: add parentheses around left hand side expression to silence this warning 1533 \| } else if (!cpu_policy_conf.flags & CPU_POLICY_ONE_THREAD_PER_CORE) \| ^ \| ( ) No backport needed.	2025-12-19 10:15:17 +01:00
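The warning above comes from C operator precedence: unary '!' binds tighter than binary '&', so `!flags & MASK` is parsed as `(!flags) & MASK`. The short sketch below contrasts the two readings. The flag value and function names are hypothetical and only serve the illustration; the parenthesized form is presumably what the original test intended.

```c
#include <stdbool.h>

/* Hypothetical flag values, for illustration only. */
#define FLAG_SKETCH_OTHER          0x1u
#define FLAG_SKETCH_ONE_PER_CORE   0x2u

/* What the flagged expression computes: (!flags) is 0 or 1, and that
 * value is then masked. Written with explicit parentheses here so this
 * sketch itself compiles without the warning.
 */
static bool flag_is_clear_as_parsed(unsigned int flags)
{
	return ((!flags) & FLAG_SKETCH_ONE_PER_CORE) != 0;
}

/* Tests the bit first, then negates the result. */
static bool flag_is_clear_parenthesized(unsigned int flags)
{
	return !(flags & FLAG_SKETCH_ONE_PER_CORE);
}
```

With a mask of 0x2, the first form can never be true since `!flags` is only ever 0 or 1, while the second form is true whenever the bit is not set.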
Olivier Houchard	8aef5bec1e	MEDIUM: cpu-topo: Add the "per-ccx" cpu_affinity Add a new cpu-affinity keyword, "per-ccx". If used, each thread will be bound to all the hardware threads available in one CCX of the threads group.	2025-12-18 18:52:52 +01:00
Olivier Houchard	c524b181a2	MEDIUM: cpu-topo: Add the "per-thread" cpu_affinity Add a new cpu-affinity keyword, "per-thread". If used, each thread will be bound to only one hardware thread of the thread group. If used in conjunction with the "threads-per-core 1" cpu_policy, then each thread will be bound on a different core.	2025-12-18 18:52:52 +01:00
Olivier Houchard	7e22d9c484	MEDIUM: cpu-topo: Add a new "max-threads-per-group" global keyword Add a new global keyword, max-threads-per-group. It sets the maximum number of threads a thread group can contain. Unless the number of thread groups is fixed with "thread-groups", haproxy will just create more thread groups as needed. The default and maximum value is 64.	2025-12-18 18:52:52 +01:00
Olivier Houchard	3865f6c5c6	MEDIUM: cpu-topo: Add a "cpu-affinity" option Add a new global option, "cpu-affinity", which controls how threads are bound. It currently accepts three values, "per-core", which will bind one thread to each hardware thread of a given core, and "per-group" which will use all the available hardware threads of the thread group, and "auto", the default, which will use "per-group", unless "threads-per-core 1" has been specified in cpu_policy, in which case it will use per-core.	2025-12-18 18:52:52 +01:00
Olivier Houchard	3671652bc9	MEDIUM: cpu-topo: Add a "threads-per-core" keyword to cpu-policy Add a new, optional keyword to "cpu-policy", "threads-per-core". It takes one argument, "1" or "auto". If "1" is used, then only one thread per core will be created, no matter how many hardware threads each core has. If "auto" is used, then one thread will be created per hardware thread, as is the case by default. For example: cpu-policy performance threads-per-core 1	2025-12-18 18:52:52 +01:00
Olivier Houchard	58f04b4615	MINOR: cpu-topo: Turn the cpu policy configuration into a struct Turn the cpu policy configuration into a struct. Right now it just contains an int, that represents the policy used, but will get more information soon.	2025-12-18 18:52:52 +01:00
William Lallemand	876b1e8477	REGTESTS: fix error when no test are skipped Since commit 1ed2c9d ("REGTESTS: list all skipped tests including 'feature cmd' ones"), the script emits an error when trying to display the list of skipped tests when there are none. No backport needed.	2025-12-18 17:26:50 +01:00
Willy Tarreau	9a046fc3ad	BUG/MEDIUM: mux-h2: synchronize all conditions to create a new backend stream In H2 the conditions to create a new stream differ for a client and a server when a GOAWAY was exchanged. While on the server, any stream whose ID is lower than or equal to the one advertised in GOAWAY is valid, for a client it's forbidden to create any stream after receipt of a GOAWAY, even if its ID is lower than or equal to the last one, despite the server not being able to tell the difference from the number of streams in flight. Unfortunately, the logic in the code did not always reflect this specificity of the client (the backend code in our case), and most often considered that it was still permitted to create a new stream until the max_id was greater than or equal to the advertised last_id. This is for example what h2c_is_dead() and h2c_streams_left() do. In other places, such as h2_avail_streams(), the rule is properly taken into account. Very often the advertised last_id is the same, and this is also what haproxy does (which explains why it's impossible to reproduce the issue by chaining two haproxy layers), but a server may wish to advertise any ID including 2^31-1 as mentioned in the spec, and in this case the functions would behave differently. This discrepancy results in a corner case where a GOAWAY received on an idle connection will cause the next stream creation to be initially accepted but then rejected via h2_avail_streams(), and the connection left in a bad state, still attached to the session due to http-reuse safe, but not reinserted into idle list, since the backend code currently is not able to properly recover from this situation. Worse, the idle flags are no longer on it but TASK_F_USR1 still is, and this makes the recently added BUG_ON() rightfully trigger since this case is not supposed to happen. 
Admittedly more of the backend recovery code needs to be reworked, however the mux must consistently decide whether or not a connection may be reused or needs to be released. This commit fixes the affected logic by introducing a new function "h2c_reached_last_stream()" which says if a connection has reached its last stream, regardless of the side, and using this one everywhere max_id was compared to last_id. This is sufficient to address the corner case that be_reuse_connection() currently cannot recover from. This is in relation to GH issue #3215 and it should be sufficient to fix the issue there. Thanks to Chris Staite for reporting the issue and kudos to Amaury for spotting the events sequence that can lead to this situation. This patch must be backported to 3.3 first, then to older versions later. It's worth noting that it's much more difficult to observe the issue before 3.3 because the BUG_ON() is not there, and the possibly non-released connection might end up being killed for other reasons (timeouts etc). But one possible visible effect might be the impossibility to delete a server (which Chris observed in 3.3).	2025-12-18 17:01:32 +01:00
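The side-dependent rule explained in this commit message can be summarized with a small sketch. It is purely illustrative: the structure, its fields and the function name are hypothetical and do not reproduce HAProxy's h2c structure or the real h2c_reached_last_stream(). Only the rule itself (a client may not create any stream after a GOAWAY, a server compares max_id against the advertised last_id) comes from the text above.

```c
/* Hypothetical sketch; not HAProxy's h2c structure or its real function. */
struct h2_conn_sketch {
	int is_client;   /* 1 when we act as the client (backend side) */
	int goaway_seen; /* 1 once a GOAWAY was exchanged */
	int last_id;     /* last stream ID advertised in the GOAWAY */
	int max_id;      /* highest stream ID seen so far */
};

/* Returns 1 when no new stream may be created on this connection. */
static int reached_last_stream_sketch(const struct h2_conn_sketch *c)
{
	if (!c->goaway_seen)
		return 0;

	/* A client must not create any stream after a GOAWAY, whatever the
	 * advertised last_id is (it may be as high as 2^31-1).
	 */
	if (c->is_client)
		return 1;

	/* On the server side, IDs up to and including last_id stay valid. */
	return c->max_id >= c->last_id;
}
```

The point of funnelling every check through one such predicate is that all callers then agree on whether the connection can still be reused.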
William Lallemand	9c8925ba0d	CI: github: use git prefix for openssl-master.yml Uses the git- prefix in order to get the latest tarball for the master branch on github.	2025-12-18 16:13:04 +01:00
Olivier Houchard	40d16af7a6	BUG/MEDIUM: backend: Do not remove CO_FL_SESS_IDLE in assign_server() Back in the mists of time, commit e91a526c8f decided that if we were trying to stay on the same server as the previous request, and if there were a connection available in the session, we'd remove its CO_FL_SESS_IDLE. The reason for doing that has been long lost, probably it fixed a bug at some point, but it was most probably not the right place to do that. And starting with 3.3, this triggers a BUG_ON() because that flag is expected later on. So just revert the commit, if the ancient bug shows up again, it will be fixed another way. This should be backported to 3.3. There is little reason to backport it to previous versions, unless other patches depend on it.	2025-12-18 16:09:34 +01:00
William Lallemand	0c7a4469d2	CI: github: openssl-master.yml misses actions/checkout The job can't run setup-vtest because the actions/checkout use line is missing.	2025-12-18 16:03:20 +01:00
William Lallemand	38d3c24931	CI: github: add a job to test the master branch of OpenSSL vtest.yml only builds the releases of OpenSSL for now, there's no way to check if we still have issues with the API before a pre-release version is released. This job builds the master branch of OpenSSL. It is run everyday at 3 AM.	2025-12-18 15:43:06 +01:00
William Lallemand	a58f09b63c	CI: github: remove openssl no-deprecated job Remove the openssl no-deprecated job which was used for the 1.1.0 API. It's not useful anymore since it uses the OpenSSL version of the distributions. Checking deprecations in the API is still useful when using the newest version of the library. A job for the OpenSSL master branch would be more useful than that.	2025-12-18 15:22:27 +01:00
William Lallemand	1ed2c9da2c	REGTESTS: list all skipped tests including 'feature cmd' ones The script for running regression tests is modified to improve the visibility of skipped tests. Previously, the reasons for skipping tests were only visible during the test discovery phase when grepping the vtc (REQUIRE, EXCLUDE, etc). But reg-tests skipped by vtest with the 'feature cmd' keywords were not listed. This change introduces the following: - vtest does not remove the logs itself anymore, because it is not able to keep the log available when a test is skipped. So the -L parameter is now always passed to vtest - All skipped tests during the discovery phase are now logged to a 'skipped.log' file within the test directory - The script now parses vtest logs to find tests that were skipped due to missing features (via the 'feature cmd' in .vtc files) and adds them to the skipped list.	2025-12-17 15:54:15 +01:00
Frederic Lecaille	8523a5cde0	REGTESTS: quic: fix a TLS stack usage This issue was reported in GH #3214 where quic/tls13_ssl_crt-list_filters.vtc QUIC reg test was run without haproxy QUIC support due to the OPENSSL_AWSLC feature being enabled. This is due to the fact that when ssl/tls13_ssl_crt-list_filters.vtc was ported to QUIC, the feature(OPENSSL) was naively replaced by feature(QUIC), leading the script to be run even without QUIC support if the OR'ed OPENSSL_AWSLC feature is enabled. A good method to port these feature() commands to QUIC would have been to add a feature(QUIC) command separate from the one used for the supported TLS stacks identified by the original underlying ssl reg tests (in reg-tests/ssl). This is what is done by this patch. Thank you to @idl0r for having reported this issue.	2025-12-15 09:44:42 +01:00
Christopher Faulet	a25394b6c8	CLEANUP: ssl-sock: Remove useless tests on connection when resuming TLS session In ssl_sock_srv_try_reuse_sess(), the connection is always defined, for TCP and QUIC connections. No reason to test it. Because it is not so obvious for the QUIC part, a BUG_ON() could be added here. For now, just remove useless tests. This patch should fix a Coverity report from #3213.	2025-12-15 08:16:59 +01:00
Christopher Faulet	d6b1d5f6e9	CLEANUP: tcpcheck: Remove useless test on the xprt used for healthchecks The xprt used to perform a healthcheck is always defined and cannot be NULL. So there is no reason to test it. It could lead to wrong assumptions later in the code. This patch should fix a Coverity report from #3213.	2025-12-15 08:01:21 +01:00
Christopher Faulet	5c5914c32e	CLEANUP: backend: Remove useless test on server's xprt The server's xprt is always defined and cannot be NULL. So there is no reason to test it. It could lead to wrong assumptions later in the code. This patch should fix a Coverity report from #3213.	2025-12-15 07:56:53 +01:00
Olivier Houchard	a08bc468d2	BUG/MEDIUM: quic: Don't try to use hystart if not implemented Not every CC algo implements hystart, so only call the method if it is actually there. Failure to do so will cause crashes if hystart is on and the algo doesn't implement it. This should fix github issue #3218. This should be backported up to 3.0.	2025-12-14 16:46:12 +01:00
Christopher Faulet	54e58103e5	BUG/MEDIUM: stconn: Don't report abort from SC if read0 was already received SC_FL_ABRT_DONE flag should never be set when SC_FL_EOS was already set. Both flags were introduced to replace the old CF_SHUTR and to have a flag for shuts driven by the stream and a flag for the read0 received by the mux. So both flags must not be seen at the same time on a SC. It is especially important because some processing is performed when these flags are set, and wrong decisions may be made. This patch must be backported as far as 2.8.	2025-12-12 08:41:08 +01:00
Christopher Faulet	a483450fa2	BUG/MEDIUM: http-ana: Properly detect client abort when forwarding response (v2) The first attempt to fix this issue (c672b2a29 "BUG/MINOR: http-ana: Properly detect client abort when forwarding the response") was not fully correct and could be responsible for false reports of client abort during the response forwarding. I guess it is possible to truncate the response. Instead, we must also take care that the client closed on its side, by checking the SC_FL_EOS flag on the front SC. Indeed, if the client has aborted, this flag should be set. This patch should be backported as far as 2.8.	2025-12-12 08:41:08 +01:00
William Lallemand	5b19d95850	BUG/MEDIUM: mworker/listener: ambiguous use of RX_F_INHERITED with shards The RX_F_INHERITED flag was ambiguous, as it was used to mark both listeners inherited from the parent process and listeners duplicated from another local receiver. This could lead to incorrect behavior concerning socket unbinding and suspension. This commit refactors the handling of inherited listeners by splitting the RX_F_INHERITED flag into two more specific flags: - RX_F_INHERITED_FD: Indicates a listener inherited from the parent process via its file descriptor. These listeners should not be unbound by the master. - RX_F_INHERITED_SOCK: Indicates a listener that shares a socket with another one, either by being inherited from the parent or by being duplicated from another local listener. These listeners should not be suspended or resumed individually. Previously, the sharding code was unconditionally using RX_F_INHERITED when duplicating a file descriptor. In HAProxy versions prior to 3.1, this led to a file descriptor leak for duplicated unix stats sockets in the master process. This would eventually cause the master to crash with a BUG_ON in fd_insert() once the file descriptor limit was reached. This must be backported as far as 3.0. Branches earlier than 3.0 are affected but would need a different patch as the logic is different.	2025-12-11 18:09:47 +01:00
Willy Tarreau	aed953088e	[RELEASE] Released version 3.4-dev1 Released version 3.4-dev1 with the following main changes : - BUG/MINOR: jwt: Missing "case" in switch statement - DOC: configuration: ECH support details - Revert "MINOR: quic: use dynamic cc_algo on bind_conf" - MINOR: quic: define quic_cc_algo as const - MINOR: quic: extract cc-algo parsing in a dedicated function - MINOR: quic: implement cc-algo server keyword - BUG/MINOR: quic-be: Missing keywords array NULL termination - REGTESTS: ssl enable tls12_reuse.vtc for AWS-LC - REGTESTS: ssl: split tls*_reuse in stateless and stateful resume tests - BUG/MEDIUM: connection: fix "bc_settings_streams_limit" typo - BUG/MEDIUM: config: ignore empty args in skipped blocks - DOC: config: mention clearer that the cache's total-max-size is mandatory - DOC: config: reorder the cache section's keywords - BUG/MINOR: quic/ssl: crash in ClientHello callback ssl traces - BUG/MINOR: quic-be: handshake errors without connection stream closure - MINOR: quic: Add useful debugging traces in qc_idle_timer_do_rearm() - REGTESTS: ssl: Move all the SSL certificates, keys, crt-lists inside "certs" directory - REGTESTS: quic/ssl: ssl/del_ssl_crt-list.vtc supported by QUIC - REGTESTS: quic: dynamic_server_ssl.vtc supported by QUIC - REGTESTS: quic: issuers_chain_path.vtc supported by QUIC - REGTESTS: quic: new_del_ssl_cafile.vtc supported by QUIC - REGTESTS: quic: ocsp_auto_update.vtc supported by QUIC - REGTESTS: quic: set_ssl_bug_2265.vtc supported by QUIC - MINOR: quic: avoid code duplication in TLS alert callback - BUG/MINOR: quic-be: missing connection stream closure upon TLS alert to send - REGTESTS: quic: set_ssl_cafile.vtc supported by QUIC - REGTESTS: quic: set_ssl_cert_noext.vtc supported by QUIC - REGTESTS: quic: set_ssl_cert.vtc supported by QUIC - REGTESTS: quic: set_ssl_crlfile.vtc supported by QUIC - REGTESTS: quic: set_ssl_server_cert.vtc supported by QUIC - REGTESTS: quic: show_ssl_ocspresponse.vtc supported by QUIC 
- REGTESTS: quic: ssl_client_auth.vtc supported by QUIC - REGTESTS: quic: ssl_client_samples.vtc supported by QUIC - REGTESTS: quic: ssl_default_server.vtc supported by QUIC - REGTESTS: quic: new_del_ssl_crlfile.vtc supported by QUIC - REGTESTS: quic: ssl_frontend_samples.vtc supported by QUIC - REGTESTS: quic: ssl_server_samples.vtc supported by QUIC - REGTESTS: quic: ssl_simple_crt-list.vtc supported by QUIC - REGTESTS: quic: ssl_sni_auto.vtc code provision for QUIC - REGTESTS: quic: ssl_curve_name.vtc supported by QUIC - REGTESTS: quic: add_ssl_crt-list.vtc supported by QUIC - REGTESTS: add ssl_ciphersuites.vtc (TCP & QUIC) - BUG/MINOR: quic: do not set first the default QUIC curves - REGTESTS: quic/ssl: Add ssl_curves_selection.vtc - BUG/MINOR: ssl: Don't allow to set NULL sni - MEDIUM: quic: Add connection as argument when qc_new_conn() is called - MINOR: ssl: Add a function to hash SNIs - MINOR: ssl: Store hash of the SNI for cached TLS sessions - MINOR: ssl: Compare hashes instead of SNIs when a session is cached - MINOR: connection/ssl: Store the SNI hash value in the connection itself - MEDIUM: tcpcheck/backend: Get the connection SNI before initializing SSL ctx - BUG/MEDIUM: ssl: Don't reuse TLS session if the connection's SNI differs - MEDIUM: ssl/server: No longer store the SNI of cached TLS sessions - BUG/MINOR: log: Dump good %B and %U values in logs - BUG/MEDIUM: http-ana: Don't close server connection on read0 in TUNNEL mode - DOC: config: Fix description of the spop mode - DOC: config: Improve spop mode documentation - MINOR: ssl: Split ssl_crt-list_filters.vtc in two files by TLS version - REGTESTS: quic: tls13_ssl_crt-list_filters.vtc supported by QUIC - BUG/MEDIUM: h3: do not access QCS <sd> if not allocated - CLEANUP: mworker/cli: remove useless variable - BUG/MINOR: mworker/cli: 'show proc' is limited by buffer size - BUG/MEDIUM: ssl: Always check the ALPN after handshake - MINOR: connections: Add a new CO_FL_SSL_NO_CACHED_INFO flag - 
BUG/MEDIUM: ssl: Don't store the ALPN for check connections - BUG/MEDIUM: ssl: Don't resume session for check connections - CLEANUP: improvements to the alignment macros - CLEANUP: use the automatic alignment feature - CLEANUP: more conversions and cleanups for alignment - BUG/MEDIUM: h3: fix access to QCS <sd> definitely - MINOR: h2/trace: emit a trace of the received RST_STREAM type	2025-12-10 16:52:30 +01:00
Willy Tarreau	3ec5818807	MINOR: h2/trace: emit a trace of the received RST_STREAM type Right now we don't get any state trace when receiving an RST_STREAM, and this is not convenient because RST_STREAM(0) is not visible at all, except in developer level because the function is entered and left. Let's extract the RST code first and always log it using TRACE_PRINTF() (along with h2c/h2s) so that it's possible to detect certain codes being used.	2025-12-10 15:58:56 +01:00
Amaury Denoyelle	5b8e6d6811	BUG/MEDIUM: h3: fix access to QCS <sd> definitely The previous patch tried to fix access to QCS <sd> member, as the latter is not always allocated anymore on the frontend side. a15f0461a016a664427f5aaad2227adcc622c882 BUG/MEDIUM: h3: do not access QCS <sd> if not allocated In particular, access was prevented after HEADERS parsing in case h3_req_headers_to_htx() returned an error, which indicates that the stream-endpoint allocation was not performed. However, this is still not enough when the QCS instance is already closed at this step. Indeed, in this case, h3_req_headers_to_htx() returns OK but stream-endpoint allocation is skipped as an optimization as no data exchange will be performed. To definitely fix this kind of problem, add checks on qcs <sd> member before accessing it in H3 layer. This method is the safest one to ensure there is no NULL dereference. This should fix github issue #3211. This must be backported along with the above mentioned patch.	2025-12-10 12:04:37 +01:00
Maxime Henrion	6eedd0d485	CLEANUP: more conversions and cleanups for alignment - Convert additional cases to use the automatic alignment feature for the THREAD_ALIGN(ED) macros. This includes some cases that are less obviously correct where it seems we wanted to align only in the USE_THREAD case but were not using the thread specific macros. - Also move some alignment requirements to the structure definition instead of having it on variable declaration.	2025-12-09 17:40:58 +01:00
Maxime Henrion	bc8e14ec23	CLEANUP: use the automatic alignment feature - Use the automatic alignment feature instead of hardcoding 64 all over the code. - This also converts a few bare __attribute__((aligned(X))) to using the ALIGNED macro.	2025-12-09 17:14:58 +01:00
Maxime Henrion	74719dc457	CLEANUP: improvements to the alignment macros - It is now possible to use the THREAD_ALIGN and THREAD_ALIGNED macros without a parameter. In this case, we automatically align on the cache line size. - The cache line size is set to 64 by default to match the current code, but it can be overridden on the command line. - This required moving the DEFVAL/DEFNULL/DEFZERO macros to compiler.h instead of tools-t.h, to avoid namespace pollution if we included tools-t.h from compiler.h.	2025-12-09 17:05:52 +01:00
Olivier Houchard	420b42df1c	BUG/MEDIUM: ssl: Don't resume session for check connections Don't attempt to use stored sessions when creating new check connections, as the check SSL parameters might be different from the server's ones. This has not been proven to be a problem yet, but it doesn't mean it can't be, and this should be backported up to 2.8 along with dcce9369129f6ca9b8eed6b451c0e20c226af2e3 if it is.	2025-12-09 16:45:54 +01:00
Olivier Houchard	be4e1220c2	BUG/MEDIUM: ssl: Don't store the ALPN for check connections When establishing check connections, do not store the negotiated ALPN into the server's path_param, as a check connection may use different SSL parameters than the regular connections. To do so, only store it if CO_FL_SSL_NO_CACHED_INFO is not set. Otherwise, the check ALPN may be stored, and the wrong mux can be used for regular connections, which will end up generating 502s. This should fix Github issue #3207. This should be backported to 3.3.	2025-12-09 16:43:31 +01:00
Olivier Houchard	dcce936912	MINOR: connections: Add a new CO_FL_SSL_NO_CACHED_INFO flag Add a new flag to connections, CO_FL_SSL_NO_CACHED_INFO, and set it for checks. It lets the ssl layer know that it should not use cached information, such as the ALPN as stored in the server, or cached sessions. This will be used for checks, as checks may target different servers, or use a different SSL configuration, so we can't assume the stored information is correct. This should be backported to 3.3, and may be backported up to 2.8 if the attempts to do session resume by checks are proven to be a problem.	2025-12-09 16:43:31 +01:00
Olivier Houchard	260d64d787	BUG/MEDIUM: ssl: Always check the ALPN after handshake Move the code that is responsible for checking the ALPN, and updating the one stored in the server's path_param, from after we created the mux, to after we did a handshake. Once we did it once, the mux will not be created by the ssl code anymore, as when we know which mux to use thanks to the ALPN, it will be done earlier in connect_server(), so in the unlikely event it changes, we would not detect it anymore, and we'd keep on creating the wrong mux. This can be reproduced by doing a first request, and then changing the ALPN of the server without haproxy noticing (i.e. without haproxy noticing that the server went down). This should be backported to 3.3.	2025-12-09 16:43:31 +01:00
William Lallemand	594408cd61	BUG/MINOR: mworker/cli: 'show proc' is limited by buffer size In ticket #3204, it was reported that "show proc" is not able to display more than 202 processes. Indeed the bufsize is 16k by default in the master, and can't be changed anymore since 3.1. This patch allows 'show proc' to resume dumping when the buffer is full, based on the timestamp of the last PID it attempted to dump. Using pointers or counting the number of processes might not be a good idea since the list can change between calls. Could be backported to all stable branches.	2025-12-09 16:09:10 +01:00
William Lallemand	dabe8856ad	CLEANUP: mworker/cli: remove useless variable The msg variable is declared and freed but never used, this patch removes it.	2025-12-09 16:09:10 +01:00
Amaury Denoyelle	a15f0461a0	BUG/MEDIUM: h3: do not access QCS <sd> if not allocated Since the following commit, allocation of QCS stream-endpoint on FE side has been delayed. The objective is to allocate it only for QCS attached to an upper stream object. Stream-endpoint allocation is now performed on qcs_attach_sc() called during HEADERS parsing. commit e6064c561684d9b079e3b5725d38dc3b5c1b5cd5 OPTIM: mux-quic: delay FE sedesc alloc to stream creation Also, stream-endpoint is accessed through the QCS instance after HEADERS or DATA frames parsing, to update the known input payload length. The above patch triggered regressions as in some code paths, <sd> field is dereferenced while still being NULL. This patch fixes this by restricting access to <sd> field after newer conditions. First, after HEADERS parsing, known input length is only updated if h3_req_headers_to_htx() previously returned a success value, which guarantees that qcs_attach_sc() has been executed. After DATA parsing, <sd> is only accessed after the frame validity check. This ensures that HEADERS were already parsed, thus guaranteeing that stream-endpoint is allocated. This should fix github issue #3211. This must be backported up to 3.3. This is sufficient, unless above patch is backported to previous releases, in which case the current one must be picked with it.	2025-12-09 15:00:23 +01:00
Frederic Lecaille	18625f7ff3	REGTESTS: quic: tls13_ssl_crt-list_filters.vtc supported by QUIC ssl/tls13_ssl_crt-list_filters.vtc was renamed to ssl/tls13_ssl_crt-list_filters.vtci to produce a common part runnable both for QUIC and TCP listeners. Then tls13_ssl_crt-list_filters.vtc files were created both under ssl and quic directories to call this .vtci file with correct VTC_SOCK_TYPE environment values ("quic" for QUIC listeners and "stream" for TCP listeners);	2025-12-09 07:42:45 +01:00
Frederic Lecaille	c005ed0df8	MINOR: ssl: Split ssl_crt-list_filters.vtc in two files by TLS version Separate the section from ssl_crt-list_filters.vtc which supports TLS 1.2 and 1.3 versions to produce tls12_ssl_crt-list_filters.vtc and tls13_ssl_crt-list_filters.vtc.	2025-12-09 07:42:45 +01:00
Christopher Faulet	2fa3b4c3a3	DOC: config: Improve spop mode documentation The spop mode description was a bit confusing. So let's improve it. Thanks to @NickMRamirez. This patch should fix issue #3206. It could be backported as far as 3.1.	2025-12-08 15:24:05 +01:00
Christopher Faulet	e16dcab92f	DOC: config: Fix description of the spop mode It was mentioned that the spop mode turned the backend into a "log" backend. It is obviously wrong. It turns the backend into a spop backend. This patch should be backported as far as 3.1.	2025-12-08 15:22:01 +01:00
Christopher Faulet	3cf4e7afb9	BUG/MEDIUM: http-ana: Don't close server connection on read0 in TUNNEL mode It is a very old bug (2012), dating from the introduction of the keep-alive support to HAProxy. When a request is fully received, the SC on backend side is switched to NOHALF mode. It means that when the read0 is received from the server, the server connection is immediately closed. It is expected to do so at the end of a classical request. However, it must not be performed if the session is switched to the TUNNEL mode (after an HTTP/1 upgrade or a CONNECT). The client may still have data to send to the server. And abruptly closing the server connection this way will be handled as an error on client side. This bug is especially visible with an H2 connection on client side because a RST_STREAM is emitted and a "SD--" is reported in logs. Thanks to @chrisstaite. This patch should fix the issue #3205. It must be backported to all stable versions.	2025-12-08 15:22:01 +01:00
Christopher Faulet	5d74980277	BUG/MINOR: log: Dump good %B and %U values in logs When per-stream "bytes_in" and "bytes_out" counters were replaced in 3.3, the wrong counters were used for %B and %U values in logs. In the configuration manual and the commit message, it was specified that "bytes_in" was replaced by "req_in" and "bytes_out" by "res_in", but in the code, wrong counters were used. It is now fixed. This patch should fix the issue #3208. It must be backported to 3.3.	2025-12-08 15:22:01 +01:00
Christopher Faulet	be998b590e	MEDIUM: ssl/server: No longer store the SNI of cached TLS sessions Thanks to the previous patch, "BUG/MEDIUM: ssl: Don't reuse TLS session if the connection's SNI differs", it is now useless to store the SNI of cached TLS sessions. This SNI is no longer tested and new connections reusing a session must have the same SNI. The main change here is for the ssl_sock_set_servername() function. It is no longer possible to compare the SNI of the reused session with the one of the new connection. So, the SNI is always set, with no other processing. Mainly, the session is not destroyed when SNIs don't match. It means the commit 119a4084bf ("BUG/MEDIUM: ssl: for a handshake when server-side SNI changes") is implicitly reverted. It is good to note that it is unclear to me when and why the reused session should be destroyed, because I'm unable to reproduce any issue fixed by the commit above. This patch could be backported as far as 3.0 with the commit above.	2025-12-08 15:22:01 +01:00
Christopher Faulet	5702009c8c	BUG/MEDIUM: ssl: Don't reuse TLS session if the connection's SNI differs When a new SSL server connection is created, if no SNI is set, it is possible to inherit the one of the reused TLS session. The bug was introduced by the commit 95ac5fe4a ("MEDIUM: ssl_sock: always use the SSL's server name, not the one from the tid"). The mixup is possible between regular connections but also with health-check connections. But it is only the visible part of the bug. If the SNI of the cached TLS session does not match the one of the new connection, no reuse must be performed at all. To fix the bug, the hash of the SNI of the reused session is compared with the one of the new connection. The TLS session is reused only if the hashes are the same. This patch should fix the issue #3195. It must be slowly backported as far as 3.0. It relies on the following series: * MEDIUM: tcpcheck/backend: Get the connection SNI before initializing SSL ctx * MINOR: connection/ssl: Store the SNI hash value in the connection itself * MEDIUM: ssl: Store hash of the SNI for cached TLS sessions * MINOR: ssl: Add a function to hash SNIs * MEDIUM: quic: Add connection as argument when qc_new_conn() is called * BUG/MINOR: ssl: Don't allow to set NULL sni	2025-12-08 15:22:01 +01:00
Christopher Faulet	7e9d921141	MEDIUM: tcpcheck/backend: Get the connection SNI before initializing SSL ctx The SNI of a new connection is now retrieved earlier, before the initialization of the SSL context. So, concretely, it is now performed before calling conn_prepare(). The SNI is then set just after.	2025-12-08 15:22:01 +01:00
Christopher Faulet	28654f3c9b	MINOR: connection/ssl: Store the SNI hash value in the connection itself When a SNI is set on a new connection, its hash is now saved in the connection itself. To do so, a dedicated field was added into the connection structure, called sni_hash. For now, this value is only used when the TLS session is cached.	2025-12-08 15:22:01 +01:00
Christopher Faulet	92f77cb3e6	MINOR: ssl: Compare hashes instead of SNIs when a session is cached This patch relies on the commit "MINOR: ssl: Store hash of the SNI for cached TLS sessions". We now use the hash of the SNIs instead of the SNIs themselves to know if we must update the cached SNI or not.	2025-12-08 15:22:01 +01:00
Christopher Faulet	9794585204	MINOR: ssl: Store hash of the SNI for cached TLS sessions For cached TLS sessions, in addition to the SNI itself, its hash is now also saved. No changes are expected here because this hash is not used for now. This commit relies on: * MINOR: ssl: Add a function to hash SNIs	2025-12-08 15:22:00 +01:00
Christopher Faulet	d993e1eeae	MINOR: ssl: Add a function to hash SNIs This patch only adds the function ssl_sock_sni_hash() that can be used to get the hash value corresponding to an SNI. A global seed, sni_hash_seed, is used.	2025-12-08 15:22:00 +01:00
Christopher Faulet	a83ed86b78	MEDIUM: quic: Add connection as argument when qc_new_conn() is called This patch reverts the commit efe60745b ("MINOR: quic: remove connection arg from qc_new_conn()"). The connection will be mandatory when the QUIC connection is created on backend side to fix an issue when we try to reuse a TLS session. So, the connection is again an argument of qc_new_conn(), the 4th argument. It is NULL for frontend QUIC connections but there is no special check on it.	2025-12-08 15:22:00 +01:00
Christopher Faulet	3534efe798	BUG/MINOR: ssl: Don't allow to set NULL sni ssl_sock_set_servername() function was documented to support NULL sni to unset it. However, the man page of SSL_get_servername() does not mention whether it is supported or not. And it is in fact not supported by WolfSSL and leads to a crash if we do so. For now, this function is never called with a NULL sni, so it is better and safer to forbid this case. Now, if the sni is NULL, the function does nothing. This patch could be backported to all stable versions.	2025-12-08 15:22:00 +01:00
Frederic Lecaille	7872260525	REGTESTS: quic/ssl: Add ssl_curves_selection.vtc This reg test ensures the curves may be correctly set for frontend and backends by "ssl-default-bind-curves" and "ssl-default-server-curves" as global options or with "curves" options on "bind" and "server" lines.	2025-12-08 10:40:59 +01:00
Frederic Lecaille	90064ac88b	BUG/MINOR: quic: do not set first the default QUIC curves This patch impacts both the QUIC frontends and backends. Note that "ssl-default-bind-ciphersuites" and "ssl-default-bind-curves" are not ignored by QUIC on the frontend side. This is also the case for the backends with "ssl-default-server-ciphersuites" and "ssl-default-server-curves". These settings are set by ssl_sock_prepare_ctx() for the frontends and by ssl_sock_prepare_srv_ssl_ctx() for the backends. But ssl_quic_initial_ctx() first sets the default QUIC ciphersuites and curves (see <quic_ciphers> and <quic_groups>) before these ssl_sock.c functions are called, leading some TLS stacks to refuse them if they do not support them. This is the case for some OpenSSL 3.5 stacks with FIPS support. They do not support X25519. To fix this, set the default QUIC ciphersuites and curves only if not already set by the settings mentioned above. Rename <quic_ciphers> global variable to <default_quic_ciphersuites> and <quic_groups> to <default_quic_curves> to reflect the OpenSSL API naming. These options are taken into account by ssl_quic_initial_ctx() which inspects these four variables before calling SSL_CTX_set_ciphersuites() with <default_quic_ciphersuites> as parameter and SSL_CTX_set_curves() with <default_quic_curves> as parameter if needed, that is to say, if no ciphersuites and curves were set by "ssl-default-bind-ciphersuites", "ssl-default-bind-curves" as global options or "ciphersuites", "curves" as "bind" line options. Note that the bind_conf struct is not modified when no "ciphersuites" or "curves" options are used on "bind" lines. On backend side, rely on ssl_sock_init_srv() to set the server ciphersuites and curves. This function is modified to use respectively <default_quic_ciphersuites> and <default_quic_curves> if no ciphersuites and curves were set by "ssl-default-server-ciphersuites", "ssl-default-server-curves" as global options or "ciphersuites", "curves" as "server" line options. 
Thanks to @rwagoner for having reported this issue in GH #3194 when using an OpenSSL 3.5.4 stack with FIPS support. Must be backported as far as 2.6.	2025-12-08 10:40:59 +01:00
Frederic Lecaille	a2d2cda631	REGTESTS: add ssl_ciphersuites.vtc (TCP & QUIC) This reg test ensures the ciphersuites may be correctly set for frontend and backends by "ssl-default-bind-ciphersuites" and "ssl-default-server-ciphersuites" as global options or with "ciphersuites" options on "bind" and "server" lines.	2025-12-08 10:40:59 +01:00
Frederic Lecaille	062a0ed899	REGTESTS: quic: add_ssl_crt-list.vtc supported by QUIC ssl/add_ssl_crt-list.vtc was renamed to ssl/add_ssl_crt-list.vtci to produce a common part runnable both for QUIC and TCP listeners. Then add_ssl_crt-list.vtc files were created both under ssl and quic directories to call this .vtci file with correct VTC_SOCK_TYPE environment values ("quic" for QUIC listeners and "stream" for TCP listeners);	2025-12-08 10:40:59 +01:00
Frederic Lecaille	4214c97dd4	REGTESTS: quic: ssl_curve_name.vtc supported by QUIC ssl/ssl_curve_name.vtc was renamed to ssl/ssl_curve_name.vtci to produce a common part runnable both for QUIC and TCP listeners. Then ssl_curve_name.vtc files were created both under ssl and quic directories to call this .vtci file with correct VTC_SOCK_TYPE environment values ("quic" for QUIC listeners and "stream" for TCP listeners); Note that this script works by chance for QUIC because the curves selection matches the default ones used by QUIC.	2025-12-08 10:40:59 +01:00
Frederic Lecaille	c615b14fac	REGTESTS: quic: ssl_sni_auto.vtc code provision for QUIC ssl/ssl_sni_auto.vtc was renamed to ssl/ssl_sni_auto.vtci to produce a common part runnable both for QUIC and TCP listeners. Then ssl_sni_auto.vtc files were created both under ssl and quic directories to call this .vtci file with correct VTC_SOCK_TYPE environment values ("quic" for QUIC listeners and "stream" for TCP listeners); Mark the test as broken for QUIC	2025-12-08 10:40:59 +01:00
Frederic Lecaille	7bb7b26317	REGTESTS: quic: ssl_simple_crt-list.vtc supported by QUIC ssl/ssl_simple_crt-list.vtc was renamed to ssl/ssl_simple_crt-list.vtci to produce a common part runnable both for QUIC and TCP listeners. Then ssl_simple_crt-list.vtc files were created both under ssl and quic directories to call this .vtci file with correct VTC_SOCK_TYPE environment values ("quic" for QUIC listeners and "stream" for TCP listeners);	2025-12-08 10:40:59 +01:00
Frederic Lecaille	b87bee8e04	REGTESTS: quic: ssl_server_samples.vtc supported by QUIC ssl/ssl_server_samples.vtc was renamed to ssl/ssl_server_samples.vtci to produce a common part runnable both for QUIC and TCP listeners. Then ssl_server_samples.vtc files were created both under ssl and quic directories to call this .vtci file with correct VTC_SOCK_TYPE environment values ("quic" for QUIC listeners and "stream" for TCP listeners);	2025-12-08 10:40:59 +01:00
Frederic Lecaille	25529dddb6	REGTESTS: quic: ssl_frontend_samples.vtc supported by QUIC ssl/ssl_frontend_samples.vtc was renamed to ssl/ssl_frontend_samples.vtci to produce a common part runnable both for QUIC and TCP listeners. Then ssl_frontend_samples.vtc files were created both under ssl and quic directories to call this .vtci file with correct VTC_SOCK_TYPE environment values ("quic" for QUIC listeners and "stream" for TCP listeners);	2025-12-08 10:40:59 +01:00
Frederic Lecaille	5cf5f76a90	REGTESTS: quic: new_del_ssl_crlfile.vtc supported by QUIC ssl/new_del_ssl_crlfile.vtc was renamed to ssl/new_del_ssl_crlfile.vtci to produce a common part runnable both for QUIC and TCP listeners. Then new_del_ssl_crlfile.vtc files were created both under ssl and quic directories to call this .vtci file with correct VTC_SOCK_TYPE environment values ("quic" for QUIC listeners and "stream" for TCP listeners);	2025-12-08 10:40:59 +01:00
Frederic Lecaille	fc0c52f2af	REGTESTS: quic: ssl_default_server.vtc supported by QUIC ssl/ssl_default_server.vtc was renamed to ssl/ssl_default_server.vtci to produce a common part runnable both for QUIC and TCP listeners. Then ssl_default_server.vtc files were created both under ssl and quic directories to call this .vtci file with correct VTC_SOCK_TYPE environment values ("quic" for QUIC listeners and "stream" for TCP listeners);	2025-12-08 10:40:59 +01:00
Frederic Lecaille	4bff826204	REGTESTS: quic: ssl_client_samples.vtc supported by QUIC ssl/ssl_client_samples.vtc was renamed to ssl/ssl_client_samples.vtci to produce a common part runnable both for QUIC and TCP listeners. Then ssl_client_samples.vtc files were created both under ssl and quic directories to call this .vtci file with correct VTC_SOCK_TYPE environment values ("quic" for QUIC listeners and "stream" for TCP listeners);	2025-12-08 10:40:59 +01:00
Frederic Lecaille	47889154d2	REGTESTS: quic: ssl_client_auth.vtc supported by QUIC ssl/ssl_client_auth.vtc was renamed to ssl/ssl_client_auth.vtci to produce a common part runnable both for QUIC and TCP listeners. Then ssl_client_auth.vtc files were created both under ssl and quic directories to call this .vtci file with correct VTC_SOCK_TYPE environment values ("quic" for QUIC listeners and "stream" for TCP listeners);	2025-12-08 10:40:59 +01:00
Frederic Lecaille	b285f11cd6	REGTESTS: quic: show_ssl_ocspresponse.vtc supported by QUIC ssl/show_ssl_ocspresponse.vtc was renamed to ssl/show_ssl_ocspresponse.vtci to produce a common part runnable both for QUIC and TCP listeners. Then show_ssl_ocspresponse.vtc files were created both under ssl and quic directories to call this .vtci file with correct VTC_SOCK_TYPE environment values ("quic" for QUIC listeners and "stream" for TCP listeners);	2025-12-08 10:40:59 +01:00
Frederic Lecaille	c4d066e735	REGTESTS: quic: set_ssl_server_cert.vtc supported by QUIC ssl/set_ssl_server_cert.vtc was renamed to ssl/set_ssl_server_cert.vtci to produce a common part runnable both for QUIC and TCP listeners. Then set_ssl_server_cert.vtc files were created both under ssl and quic directories to call this .vtci file with correct VTC_SOCK_TYPE environment values ("quic" for QUIC listeners and "stream" for TCP listeners);	2025-12-08 10:40:59 +01:00
Frederic Lecaille	c1a818c204	REGTESTS: quic: set_ssl_crlfile.vtc supported by QUIC ssl/set_ssl_crlfile.vtc was renamed to ssl/set_ssl_crlfile.vtci to produce a common part runnable both for QUIC and TCP listeners. Then set_ssl_crlfile.vtc files were created both under ssl and quic directories to call this .vtci file with correct VTC_SOCK_TYPE environment values ("quic" for QUIC listeners and "stream" for TCP listeners);	2025-12-08 10:40:59 +01:00
Frederic Lecaille	83b3e2876e	REGTESTS: quic: set_ssl_cert.vtc supported by QUIC ssl/set_ssl_cert.vtc was renamed to ssl/set_ssl_cert.vtci to produce a common part runnable both for QUIC and TCP listeners. Then set_ssl_cert.vtc files were created both under ssl and quic directories to call this .vtci file with correct VTC_SOCK_TYPE environment values ("quic" for QUIC listeners and "stream" for TCP listeners);	2025-12-08 10:40:59 +01:00
Frederic Lecaille	cb1e9e3cd8	REGTESTS: quic: set_ssl_cert_noext.vtc supported by QUIC ssl/set_ssl_cert_noext.vtc was renamed to ssl/set_ssl_cert_noext.vtci to produce a common part runnable both for QUIC and TCP listeners. Then set_ssl_cert_noext.vtc files were created both under ssl and quic directories to call this .vtci file with correct VTC_SOCK_TYPE environment values ("quic" for QUIC listeners and "stream" for TCP listeners);	2025-12-08 10:40:59 +01:00
Frederic Lecaille	9c3180160d	REGTESTS: quic: set_ssl_cafile.vtc supported by QUIC ssl/set_ssl_cafile.vtc was renamed to ssl/set_ssl_cafile.vtci to produce a common part runnable both for QUIC and TCP listeners. Then set_ssl_cafile.vtc files were created both under ssl and quic directories to call this .vtci file with correct VTC_SOCK_TYPE environment values ("quic" for QUIC listeners and "stream" for TCP listeners);	2025-12-08 10:40:59 +01:00
Frederic Lecaille	3f5e73e83f	BUG/MINOR: quic-be: missing connection stream closure upon TLS alert to send This is the same issue as the one fixed by this commit: BUG/MINOR: quic-be: handshake errors without connection stream closure But this time this is when the client has to send an alert to the server. The fix consists in creating the mux after having set the handshake connection error flag and error_code. This bug was revealed by ssl/set_ssl_cafile.vtc reg test. Depends on this commit: MINOR: quic: avoid code duplication in TLS alert callback Must be backported to 3.3	2025-12-08 10:40:59 +01:00
Frederic Lecaille	e7b06f5e7a	MINOR: quic: avoid code duplication in TLS alert callback The OpenSSL QUIC API TLS alert callback ha_quic_ossl_alert() does exactly the same thing as the one for the quictls API, even though the parameters have different types. Call the ha_quic_send_alert() quictls callback from the ha_quic_ossl_alert() OpenSSL QUIC API callback to avoid such code duplication.	2025-12-08 10:40:59 +01:00
Frederic Lecaille	ad101dc3d5	REGTESTS: quic: set_ssl_bug_2265.vtc supported by QUIC ssl/set_ssl_bug_2265.vtc was renamed to ssl/set_ssl_bug_2265.vtci to produce a common part runnable both for QUIC and TCP listeners. Then set_ssl_bug_2265.vtc files were created both under ssl and quic directories to call this .vtci file with correct VTC_SOCK_TYPE environment values ("quic" for QUIC listeners and "stream" for TCP listeners);	2025-12-08 10:40:59 +01:00
Frederic Lecaille	2e7320d2ee	REGTESTS: quic: ocsp_auto_update.vtc supported by QUIC ssl/ocsp_auto_update.vtc was renamed to ssl/ocsp_auto_update.vtci to produce a common part runnable both for QUIC and TCP listeners. Then ocsp_auto_update.vtc files were created both under ssl and quic directories to call this .vtci file with correct VTC_SOCK_TYPE environment values ("quic" for QUIC listeners and "stream" for TCP listeners);	2025-12-08 10:40:59 +01:00
Frederic Lecaille	cdfd9b154a	REGTESTS: quic: new_del_ssl_cafile.vtc supported by QUIC ssl/new_del_ssl_cafile.vtc was renamed to ssl/new_del_ssl_cafile.vtci to produce a common part runnable both for QUIC and TCP connections. Then new_del_ssl_cafile.vtc files were created both under ssl and quic directories to call this .vtci file with correct VTC_SOCK_TYPE environment values ("quic" for QUIC connections and "stream" for TCP connections);	2025-12-08 10:40:59 +01:00
Frederic Lecaille	8c48a7798a	REGTESTS: quic: issuers_chain_path.vtc supported by QUIC ssl/issuers_chain_path.vtc was renamed to ssl/issuers_chain_path.vtci to produce a common part runnable both for QUIC and TCP connections. Then issuers_chain_path.vtc files were created both under ssl and quic directories to call this .vtci file with correct VTC_SOCK_TYPE environment values ("quic" for QUIC connections and "stream" for TCP connections);	2025-12-08 10:40:59 +01:00
Frederic Lecaille	94a7e0127b	REGTESTS: quic: dynamic_server_ssl.vtc supported by QUIC ssl/dynamic_server_ssl.vtc was renamed to ssl/dynamic_server_ssl.vtci to produce a common part runnable both for QUIC and TCP connections. Then dynamic_server_ssl.vtc files were created both under ssl and quic directories to call the .vtci file with the correct VTC_SOCK_TYPE environment value. Note that VTC_SOCK_TYPE may be resolved in haproxy -cli { } sections.	2025-12-08 10:40:59 +01:00
Frederic Lecaille	588d0edf99	REGTESTS: quic/ssl: ssl/del_ssl_crt-list.vtc supported by QUIC Extract from ssl/del_ssl_crt-list.vtc the common part to produce ssl/del_ssl_crt-list.vtci which may be reused by QUIC and TCP from respectively quic/del_ssl_crt-list.vtc and ssl/del_ssl_crt-list.vtc thanks to "include" VTC command and VTC_SOCK_TYPE special vtest environment variable.	2025-12-08 10:40:59 +01:00
Frederic Lecaille	6e94b69665	REGTESTS: ssl: Move all the SSL certificates, keys, crt-lists inside "certs" directory Move all these files and others for OCSP tests found in reg-tests/ssl to reg-tests/ssl/certs and adapt all the VTC files which use them. This patch is needed by other tests which have to include the SSL tests. Indeed, some VTC commands contain paths to these files which cannot be customized with environment variables, depending on the location the VTC file is run from, because VTC does not resolve the environment variables. Only macros such as ${testdir} can be resolved. For instance this command run from a VTC file from reg-tests/ssl directory cannot be reused from another directory, except if we add a symbolic link for each cert, key, etc. haproxy h1 -cli { send "del ssl crt-list ${testdir}/localhost.crt-list ${testdir}/common.pem:1" } This is not what we want. We add a symbolic link to reg-tests/ssl/certs to the directory and modify the command above as follows: haproxy h1 -cli { send "del ssl crt-list ${testdir}/certs/localhost.crt-list ${testdir}/certs/common.pem:1" }	2025-12-08 10:40:59 +01:00
Frederic Lecaille	21293dd6c3	MINOR: quic: Add useful debugging traces in qc_idle_timer_do_rearm() Traces were missing in this function. Also add information about the connection struct from qc->conn when initialized for all the traces. Should be easily backported as far as 2.6.	2025-12-08 10:40:59 +01:00
Frederic Lecaille	c36e27d10e	BUG/MINOR: quic-be: handshake errors without connection stream closure This bug was revealed on the backend side by reg-tests/ssl/del_ssl_crt-list.vtc when run with QUIC connections. As expected by the test, a TLS alert is generated on the server side. The latter sends a CONNECTION_CLOSE frame with a CRYPTO error (>= 0x100). In this case the client closes its QUIC connection. But the stream connection was not informed. This leads the connection to be closed only after the server timeout expiration. It should be closed asap. This is the reason why reg-tests/ssl/del_ssl_crt-list.vtc could succeed or fail, but only after a 5 seconds delay. To fix this, mimic the ssl_sock_io_cb() for TCP/SSL connections. Call the same code this patch implements with ssl_sock_handle_hs_error() to correctly handle the handshake errors. Note that some SSL counters were not incremented for both the backends and frontends. After such errors, ssl_sock_io_cb() starts the mux after the connection has been flagged in error. This has the side effect of closing the stream in conn_create_mux(). Must be backported to 3.3 only for backends. It is not sure at this time if this bug may impact the frontends.	2025-12-08 10:40:59 +01:00
Frederic Lecaille	63273c795f	BUG/MINOR: quic/ssl: crash in ClientHello callback ssl traces Such crashes may occur for QUIC frontends only when the SSL traces are enabled. ssl_sock_switchctx_cbk() ClientHello callback may be called without any initialized connection (<conn>) for QUIC connections, leading to crashes when passing conn->err_code to TRACE_ERROR(). Modify the TRACE_ERROR() statement to pass this parameter only when <conn> is initialized. Must be backported as far as 3.2.	2025-12-08 10:40:59 +01:00
Willy Tarreau	d2a1665af0	DOC: config: reorder the cache section's keywords Probably due to historical accumulation, keywords were in a random order that doesn't help when looking them up. Let's just reorder them in alphabetical order like other sections. This can be backported.	2025-12-04 15:44:38 +01:00
Willy Tarreau	4d0a88c746	DOC: config: mention clearer that the cache's total-max-size is mandatory As reported in GH issue #3201, it's easy to overlook this, so let's make it clearer by mentioning the keyword. This can be backported to all versions.	2025-12-04 15:42:09 +01:00
Willy Tarreau	cd959f1321	BUG/MEDIUM: config: ignore empty args in skipped blocks As returned by Christian Ruppert in GH issue #3203, we're having an issue with checks for empty args in skipped blocks: the check is performed after the line is tokenized, without considering the case where it's disabled due to outer false .if/.else conditions. Because of this, a test like this one: .if defined(SRV1_ADDR) server srv1 "$SRV1_ADDR" .endif will fail when SRV1_ADDR is empty or not set, saying that this will result in an empty arg on the line. The solution consists in postponing this check after the conditions evaluation so that disabled lines are already skipped. And for this to be possible, we need to move "errptr" one level above so that it remains accessible there. This will need to be backported to 3.3 and wherever commit 1968731765 ("BUG/MEDIUM: config: solve the empty argument problem again") is backported. As such it is also related to GH issue #2367.	2025-12-04 15:33:43 +01:00
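The ordering change described in the entry above can be illustrated with a minimal C sketch. It is not HAProxy's parser: the function name `must_report_empty_arg`, the `block_enabled` flag and the NULL-terminated argument array are hypothetical, chosen only to show that a line inside a disabled .if/.else block is skipped before any empty-argument check runs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from HAProxy: returns 1 when an
 * "empty argument" error must be reported for a tokenized line,
 * 0 otherwise. <args> is assumed to be NULL-terminated here.
 */
static int must_report_empty_arg(int block_enabled, const char *const *args)
{
    assert(args != NULL);

    /* Evaluate the condition state first: a line that sits in a
     * disabled block is skipped and never validated.
     */
    if (!block_enabled)
        return 0;

    /* Only lines that will really be parsed are checked. */
    for (size_t i = 0; args[i] != NULL; i++) {
        if (args[i][0] == '\0')
            return 1;
    }
    return 0;
}
```

With the arguments `{"server", "srv1", "", NULL}`, the helper reports an error only when `block_enabled` is non-zero, which mirrors the behaviour expected for an unset SRV1_ADDR inside a false `.if defined(SRV1_ADDR)` block.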
Willy Tarreau	b29560f610	BUG/MEDIUM: connection: fix "bc_settings_streams_limit" typo The keyword was correct in the doc but in the code it was spelled with a missing 's' after 'settings', making it unavailable. Since there was no other way to find this but reading the code, it's safe to simply fix it and assume nobody relied on the wrong spelling. In the worst case for older backports it can also be duplicated. This must be backported to 3.0.	2025-12-04 15:26:54 +01:00
William Lallemand	85689b072a	REGTESTS: ssl: split tls*_reuse in stateless and stateful resume tests Simplify ssl_reuse.vtci so it can be started with variables: - SSL_CACHESIZE allows specifying the size of the session cache for the frontend - NO_TLS_TICKETS allows specifying the "no-tls-tickets" option on bind It introduces these files: - ssl/tls12_resume_stateful.vtc - ssl/tls12_resume_stateless.vtc - ssl/tls13_resume_stateless.vtc - ssl/tls13_resume_stateful.vtc - quic/tls13_resume_stateless.vtc - quic/tls13_resume_stateful.vtc - quic/tls13_0rtt_stateful.vtc - quic/tls13_0rtt_stateless.vtc stateful files have "no-tls-tickets" + tune.tls.cachesize 20000 stateless files have "tls-tickets" + tune.tls.cachesize 0 This allows enabling AWS-LC on TCP TLS1.2 and TCP TLS1.3+tickets. TLS1.2+stateless does not seem to work on WolfSSL.	2025-12-04 15:05:56 +01:00
William Lallemand	c7b5d2552a	REGTESTS: ssl enable tls12_reuse.vtc for AWS-LC The TLS resume test was never started with AWS-LC because the TLS1.3 part was not working. Since we split the reg-tests with a TLS1.2 part and a TLS1.3 part, we can enable the tls1.2 part for AWS-LC.	2025-12-04 11:40:04 +01:00
Frederic Lecaille	cdca48b88c	BUG/MINOR: quic-be: Missing keywords array NULL termination This bug arrived with this commit: MINOR: quic: implement cc-algo server keyword where <srv> keywords list with a missing array NULL termination inside was introduced to parse the QUIC backend CC algorithms. Detected by ASAN during ssl/add_ssl_crt-list.vtc execution as follows: * h1 debug\|==4066081==ERROR: AddressSanitizer: global-buffer-overflow on address 0x5562e31dedb8 at pc 0x5562e298951f bp 0x7ffe9f9f2b40 sp 0x7ffe9f9f2b38 * h1 debug\|READ of size 8 at 0x5562e31dedb8 thread T0 ** dT 0.173 * h1 debug\| #0 0x5562e298951e in srv_find_kw src/server.c:789 * h1 debug\| #1 0x5562e2989630 in _srv_parse_kw src/server.c:3847 * h1 debug\| #2 0x5562e299db1f in parse_server src/server.c:4024 * h1 debug\| #3 0x5562e2c86ea4 in cfg_parse_listen src/cfgparse-listen.c:593 * h1 debug\| #4 0x5562e2b0ede9 in parse_cfg src/cfgparse.c:2708 * h1 debug\| #5 0x5562e2c47d48 in read_cfg src/haproxy.c:1077 * h1 debug\| #6 0x5562e2682055 in main src/haproxy.c:3366 * h1 debug\| #7 0x7ff3ff867249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 * h1 debug\| #8 0x7ff3ff867304 in __libc_start_main_impl ../csu/libc-start.c:360 * h1 debug\| #9 0x5562e26858d0 in _start (/home/flecaille/src/haproxy/haproxy+0x2638d0) * h1 debug\| * h1 debug\|0x5562e31dedb8 is located 40 bytes to the left of global variable 'bind_kws' defined in 'src/cfgparse-quic.c:255:28' (0x5562e31dede0) of size 120 * h1 debug\|0x5562e31dedb8 is located 0 bytes to the right of global variable 'srv_kws' defined in 'src/cfgparse-quic.c:264:27' (0x5562e31ded80) of size 56 * h1 debug\|SUMMARY: AddressSanitizer: global-buffer-overflow src/server.c:789 in srv_find_kw * h1 debug\|Shadow bytes around the buggy address: * h1 debug\| 0x0aacdc633d60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * h1 debug\| 0x0aacdc633d70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * h1 debug\| 0x0aacdc633d80: 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 * h1 debug\| 0x0aacdc633d90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * h1 debug\| 0x0aacdc633da0: 00 00 00 00 00 00 00 00 00 00 f9 f9 f9 f9 f9 f9 * h1 debug\|=>0x0aacdc633db0: 00 00 00 00 00 00 00[f9]f9 f9 f9 f9 00 00 00 00 * h1 debug\| 0x0aacdc633dc0: 00 00 00 00 00 00 00 00 00 00 00 f9 f9 f9 f9 f9 * h1 debug\| 0x0aacdc633dd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * h1 debug\| 0x0aacdc633de0: 00 00 00 00 00 00 00 00 f9 f9 f9 f9 f9 f9 f9 f9 * h1 debug\| 0x0aacdc633df0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 * h1 debug\| 0x0aacdc633e00: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 * h1 debug\|Shadow byte legend (one shadow byte represents 8 application bytes): This should be backported where the commit above is supposed to be backported.	2025-12-03 11:07:47 +01:00
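The class of bug fixed in the entry above can be sketched in a few lines of C. The lookup loop below stops only on a NULL name, so an array without that final sentinel makes the loop read past the end of the object, which is the kind of global-buffer-overflow ASAN reports. The `struct kw`, `demo_kws` and `find_kw` names are hypothetical and are not HAProxy's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical keyword descriptor: a NULL <name> marks the end. */
struct kw {
    const char *name;
    int id;
};

static const struct kw demo_kws[] = {
    { "quic-cc-algo", 1 },
    { NULL, 0 },            /* mandatory terminator */
};

/* Scans <kws> until the NULL terminator. Without the last entry
 * above, this loop would keep reading whatever follows the array
 * in memory.
 */
static const struct kw *find_kw(const struct kw *kws, const char *name)
{
    assert(kws != NULL && name != NULL);

    for (; kws->name != NULL; kws++) {
        if (strcmp(kws->name, name) == 0)
            return kws;
    }
    return NULL;
}
```

A lookup for an unknown keyword is the case that walks the whole array, so it is the one that depends on the terminator being present.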
Amaury Denoyelle	47dff5be52	MINOR: quic: implement cc-algo server keyword Extend QUIC server configuration so that congestion algorithm and maximum window size can be set on the server line. This can be achieved using quic-cc-algo keyword with a syntax similar to a bind line. This should be backported up to 3.3 as this feature is considered as necessary for full QUIC backend support. Note that this relies on the series of previous commits which should be picked first.	2025-12-01 15:53:58 +01:00
Amaury Denoyelle	4f43abd731	MINOR: quic: extract cc-algo parsing in a dedicated function Extract code from bind_parse_quic_cc_algo() related to pure parsing of quic-cc-algo keyword. The objective is to be able to quickly duplicate this option on the server line. This may need to be backported to support QUIC congestion control algorithm support on the server line in version 3.3.	2025-12-01 15:06:01 +01:00
Amaury Denoyelle	979588227f	MINOR: quic: define quic_cc_algo as const Each QUIC congestion algorithm is defined as a structure with callbacks in it. Every quic_conn has a member pointing to the configured algorithm, inherited from the bind-conf keyword or to the default CUBIC value. Convert all these definitions to const. This ensures that there never will be an accidental modification of a globally shared structure. This also requires to mark quic_cc_algo field in bind_conf and quic_cc as const.	2025-12-01 15:05:41 +01:00
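As an illustration of the pattern described in the entry above, here is a minimal C sketch of a shared algorithm definition made of callbacks, declared const and referenced through a pointer-to-const from each connection. The types and names (`struct cc_algo`, `demo_algo`, `struct demo_conn`) are hypothetical stand-ins, not the real quic_cc_algo layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical algorithm descriptor: a name plus one callback. */
struct cc_algo {
    const char *name;
    unsigned int (*on_ack)(unsigned int cwnd);
};

static unsigned int demo_grow(unsigned int cwnd)
{
    return cwnd + 1;
}

/* Globally shared and immutable: the compiler rejects any write. */
static const struct cc_algo demo_algo = { "demo", demo_grow };

/* Each connection only holds a pointer-to-const to the shared
 * definition, so it can call the callbacks but cannot modify them.
 */
struct demo_conn {
    const struct cc_algo *algo;
    unsigned int cwnd;
};

static void demo_conn_ack(struct demo_conn *c)
{
    assert(c != NULL && c->algo != NULL);
    c->cwnd = c->algo->on_ack(c->cwnd);
}
```

With this layout, a statement such as `c->algo->on_ack = NULL;` fails to compile, which is the accidental modification the const qualifier is meant to rule out.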
Amaury Denoyelle	acbb378136	Revert "MINOR: quic: use dynamic cc_algo on bind_conf" This reverts commit a6504c9cfb6bb48ae93babb76a2ab10ddb014a79. Each supported QUIC algo are associated with a set of callbacks defined in a structure quic_cc_algo. Originally, bind_conf would use a constant pointer to one of these definitions. During pacing implementation, this field was transformed into a dynamically allocated value copied from the original definition. The idea was to be able to tweak settings at the listener level. However, this was never used in practice. As such, revert to the original model. This may need to be backported to support QUIC congestion control algorithm support on the server line in version 3.3.	2025-12-01 14:18:58 +01:00
William Lallemand	c641ea4f9b	DOC: configuration: ECH support details Specify which OpenSSL branch is supported and that AWS-LC is not supported. Must be backported to 3.3.	2025-11-30 09:47:56 +01:00
Remi Tricot-Le Breton	2b3d13a740	BUG/MINOR: jwt: Missing "case" in switch statement Because of missing "case" keyword in front of the values in a switch case statement, the values were interpreted as goto tags and the switch statement became useless. This patch should fix GitHub issue #3200. The fix should be backported up to 2.8.	2025-11-28 16:36:46 +01:00
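The effect described in the entry above is easy to reproduce in plain C. In the sketch below, the first function omits the "case" keyword, so its identifiers are parsed as ordinary goto labels and every value falls into "default". The enum and function names are hypothetical and are not taken from the JWT code. Compilers typically emit only warnings (unused label, unreachable statement) for the broken form.

```c
#include <assert.h>

/* Hypothetical algorithm identifiers, for illustration only. */
enum alg { ALG_NONE, ALG_HS256, ALG_RS256 };

/* Broken: "ALG_HS256:" and "ALG_RS256:" are goto labels, not case
 * labels (labels live in their own namespace, so this compiles).
 * The switch has no matching case and always jumps to default.
 */
static int is_supported_broken(enum alg a)
{
    switch (a) {
    ALG_HS256:
    ALG_RS256:
        return 1;
    default:
        return 0;
    }
}

/* Fixed: with "case", the values are matched as intended. */
static int is_supported_fixed(enum alg a)
{
    switch (a) {
    case ALG_HS256:
    case ALG_RS256:
        return 1;
    default:
        return 0;
    }
}
```

Calling both functions with `ALG_HS256` returns 0 for the broken version and 1 for the fixed one.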
Willy Tarreau	36133759d3	[RELEASE] Released version 3.4-dev0 Released version 3.4-dev0 with the following main changes : - MINOR: version: mention that it's development again	2025-11-26 16:12:45 +01:00
Willy Tarreau	e8d6ffb692	MINOR: version: mention that it's development again This essentially reverts d8ba9a2a92.	2025-11-26 16:11:47 +01:00
Willy Tarreau	7832fb21fe	[RELEASE] Released version 3.3.0 Released version 3.3.0 with the following main changes : - BUG/MINOR: acme: better challenge_ready processing - BUG/MINOR: acme: warning ‘ctx’ may be used uninitialized - MINOR: httpclient: complete the https log - BUG/MEDIUM: server: do not use default SNI if manually set - BUG/MINOR: freq_ctr: Prevent possible signed overflow in freq_ctr_overshoot_period - DOC: ssl: Document the restrictions on 0RTT. - DOC: ssl: Note that 0rtt works fork QUIC with QuicTLS too. - BUG/MEDIUM: quic: do not prevent sending if no BE token - BUG/MINOR: quic/server: free quic_retry_token on srv drop - MINOR: quic: split global CID tree between FE and BE sides - MINOR: quic: use separate global quic_conns FE/BE lists - MINOR: quic: add "clo" filter on show quic - MINOR: quic: dump backend connections on show quic - MINOR: quic: mark backend conns on show quic - BUG/MINOR: quic: fix uninit list on show quic handler - BUG/MINOR: quic: release BE quic_conn on connect failure - BUG/MINOR: server: fix srv_drop() crash on partially init srv - BUG/MINOR: h3: do no crash on forwarding multiple chained response - BUG/MINOR: h3: handle properly buf alloc failure on response forwarding - BUG/MEDIUM: server/ssl: Unset the SNI for new server connections if none is set - BUG/MINOR: acme: fix ha_alert() call - Revert "BUG/MEDIUM: server/ssl: Unset the SNI for new server connections if none is set" - BUG/MINOR: sock-inet: ignore conntrack for transparent sockets on Linux - DEV: patchbot: prepare for new version 3.4-dev - DOC: update INSTALL with the range of gcc compilers and openssl versions - MINOR: version: mention that 3.3 is stable now	2025-11-26 15:55:57 +01:00
Willy Tarreau	d8ba9a2a92	MINOR: version: mention that 3.3 is stable now This version will be maintained up to around Q1 2027. The INSTALL file also mentions it.	2025-11-26 15:54:30 +01:00
Willy Tarreau	09dd6bb4cb	DOC: update INSTALL with the range of gcc compilers and openssl versions Gcc 4.7 to 15 are tested. OpenSSL was tested up to 3.6. QUIC support requires OpenSSL >= 3.5.2.	2025-11-26 15:50:43 +01:00
Willy Tarreau	22fd296a04	DEV: patchbot: prepare for new version 3.4-dev The bot will now load the prompt for the upcoming 3.4 version so we have to rename the files and update their contents to match the current version.	2025-11-26 15:35:22 +01:00
Willy Tarreau	e5658c52d0	BUG/MINOR: sock-inet: ignore conntrack for transparent sockets on Linux As reported in github issue #3192, in certain situations with transparent listeners, it is possible to get the incoming connection's destination wrong via SO_ORIGINAL_DST. Two cases were identified thus far: - incorrect conntrack configuration where NOTRACK is used only on incoming packets, resulting in reverse connections being created from response packets. It's then mostly a matter of timing, i.e. whether or not the connection is confirmed before the source is retrieved, but in this case the connection's destination address as retrieved by SO_ORIGINAL_DST is the client's address. - late outgoing retransmit that recreates a just expired conntrack entry, in reverse direction as well. It's possible that combinations of RST or FIN might play a role here in speeding up conntrack eviction, as well as the rollover of source ports on the client whose new connection matches an older one and simply refreshes it due to nf_conntrack_tcp_loose being set by default. TPROXY doesn't require conntrack, only REDIRECT, DNAT etc do. However the system doesn't offer any option to know how a conntrack entry was created (i.e. normally or via a response packet) to let us know that it's pointless to check the original destination, nor does it permit to access the local vs peer addresses in opposition to src/dst which can be wrong in this case. One alternate approach could consist in only checking SO_ORIGINAL_DST for listening sockets not configured with the "transparent" option, but the problem here is that our low-level API only works with FDs without knowing their purpose, so it's unknown there that the fd corresponds to a listener, let alone in transparent mode. A (slightly more expensive) variant of this approach here consists in checking on the socket itself that it was accepted in transparent mode using IP_TRANSPARENT, and skip SO_ORIGINAL_DST if this is the case. 
This does the job well enough (no more client addresses appearing in the dst field) and remains a good compromise. A future improvement of the API could permit to pass the transparent flag down the stack to that function. This should be backported to stable versions after some observation in latest -dev. For reference, here are some links to older conversations on that topic that Lukas found during this analysis: https://lists.openwall.net/netdev/2019/01/12/34 https://discourse.haproxy.org/t/send-proxy-not-modifying-some-traffic-with-proxy-ip-port-details/3336/9 https://www.mail-archive.com/haproxy@formilux.org/msg32199.html https://lists.openwall.net/netdev/2019/01/23/114	2025-11-26 13:43:58 +01:00
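The check described in the entry above can be sketched as follows, assuming a Linux system where IP_TRANSPARENT is readable through getsockopt() at the IPPROTO_IP level. The helper names `sock_is_transparent` and `may_use_original_dst` are hypothetical, the sketch covers IPv4 sockets only, and it deliberately leaves out the SO_ORIGINAL_DST lookup itself.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: returns 1 if the socket was accepted in
 * transparent mode, 0 if not, -1 if the state cannot be read
 * (bad fd, or IP_TRANSPARENT not available on this platform).
 */
static int sock_is_transparent(int fd)
{
#ifdef IP_TRANSPARENT
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, IPPROTO_IP, IP_TRANSPARENT, &val, &len) != 0)
        return -1;
    return val != 0;
#else
    (void)fd;
    return -1;
#endif
}

/* Hypothetical policy: skip the conntrack-based lookup only when
 * the socket is known to be transparent.
 */
static int may_use_original_dst(int fd)
{
    return sock_is_transparent(fd) != 1;
}
```

The design choice in this sketch is to keep the previous behaviour whenever the transparent state is unknown, so only sockets positively identified as transparent bypass the original-destination lookup.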
Christopher Faulet	7d9cc28f92	Revert "BUG/MEDIUM: server/ssl: Unset the SNI for new server connections if none is set" This reverts commit de29000e602bda55d32c266252ef63824e838ac0. The fix was in fact invalid. First, calling SSL_set_tlsext_host_name with a NULL hostname is not supported by WolfSSL. Then, it is not specified as supported by other SSL libraries. But, by reviewing the root cause of this bug, it appears there is an issue with the reuse of TLS sessions. It must not be performed if the SNI does not match. A TLS session created with a SNI must not be reused with another SNI. The side effects are not clear but functionally speaking, it is invalid. So, for now, the commit above was reverted because it is invalid and it crashes with WolfSSL. Then the init of the SSL connection must be reworked to get the SNI earlier, to be able to reuse or not an existing TLS session.	2025-11-26 12:05:43 +01:00
Maxime Henrion	d506c03aa0	BUG/MINOR: acme: fix ha_alert() call A NULL pointer was passed as the format string, so this alert message was never written. Must be backported to 3.2.	2025-11-25 20:20:25 +01:00
Christopher Faulet	de29000e60	BUG/MEDIUM: server/ssl: Unset the SNI for new server connections if none is set When a new SSL server connection is created, if no SNI is set, it is possible to inherit from the one of the reused TLS session. The bug was introduced by the commit 95ac5fe4a ("MEDIUM: ssl_sock: always use the SSL's server name, not the one from the tid"). The mixup is possible between regular connections but also with health-checks connections. To fix the issue, when no SNI is set, for regular server connections and for health-check connections, the SNI must explicitly be disabled by calling ssl_sock_set_servername() with the hostname set to NULL. Many thanks to Lukas for his detailed bug report. This patch should fix the issue #3195. It must be backported as far as 3.0.	2025-11-25 16:32:46 +01:00
Amaury Denoyelle	a70816da82	BUG/MINOR: h3: handle properly buf alloc failure on response forwarding Replace BUG_ON() for buffer alloc failure on h3_resp_headers_to_htx() by proper error handling. An error status is reported which should be sufficient to initiate connection closure. No need to backport.	2025-11-25 15:55:08 +01:00
Amaury Denoyelle	ae96defaca	BUG/MINOR: h3: do no crash on forwarding multiple chained response h3_resp_headers_to_htx() is the function used to convert an HTTP/3 response into a HTX message. It was introduced on this release for QUIC backend support. A BUG_ON() would occur if multiple responses are forwarded simultaneously on a stream without rcv_buf in between. Fix this by removing it. Instead, if QCS HTX buffer is not empty when handling with a new response, prefer to pause demux operation. This is restarted when the buffer has been read and emptied by the upper stream layer. No need to backport.	2025-11-25 15:52:37 +01:00
Amaury Denoyelle	a363b536a9	BUG/MINOR: server: fix srv_drop() crash on partially init srv A recent patch has introduced free operation for QUIC tokens stored in a server. These values are located in <per_thr> server array. However, a server instance may be released prior to its full initialization in case of a failure during "add server" CLI command. The mentioned patch would cause a srv_drop() crash due to an invalid usage of NULL <per_thr> member. Fix this by adding a check on <per_thr> prior to dereferencing it in srv_drop(). No need to backport.	2025-11-25 15:16:13 +01:00
Amaury Denoyelle	6c08eb7173	BUG/MINOR: quic: release BE quic_conn on connect failure If quic_connect_server() fails, quic_conn FD will remain unopened as set to -1. Backend connections do not have a fallback socket for future exchange, contrary to frontend one which can use the listener FD. As such, it is better to release these connections early. This patch adjusts such failure by extending quic_close(). This function is called by the upper layer immediately after a connect issue. In this case, release immediately a quic_conn backend instance if the FD is unset, which means that connect has previously failed. Also, quic_conn_release() is extended to ensure that such faulty connections are immediately freed and not converted into a quic_conn_closed instance. Prior to this patch, a backend quic_conn without any FD would remain allocated and possibly active. If its tasklet is executed, this resulted in a crash due to access to an invalid FD. No need to backport.	2025-11-25 14:50:23 +01:00
Amaury Denoyelle	346631700d	BUG/MINOR: quic: fix uninit list on show quic handler A recent patch has extended "show quic" capability. It is now possible to list a specific list of connections, either active frontend, closing frontend or backend connections. An issue was introduced as the list is local storage. As this command is reentrant, show quic context must be extended so that the currently inspected list is also saved. This issue was reported via GCC which mentions an uninitialized value depending on branching conditions.	2025-11-25 14:50:19 +01:00
Amaury Denoyelle	a3f76875f4	MINOR: quic: mark backend conns on show quic Add an extra "(B)" marker when displaying a backend connection during a "show quic". This is useful to differentiate them with the frontend side when displaying all connections.	2025-11-25 14:31:27 +01:00
Amaury Denoyelle	e56fdf6320	MINOR: quic: dump backend connections on show quic Add a new "be" filter to "show quic". Its purpose is to be able to display backend connections. These connections can also be listed using "all" filter.	2025-11-25 14:30:18 +01:00
Amaury Denoyelle	3685681373	MINOR: quic: add "clo" filter on show quic Add a new filter "clo" for "show quic" command. Its purpose is to filter output to only list closing frontend connections.	2025-11-25 14:30:18 +01:00
Amaury Denoyelle	49e6fca51b	MINOR: quic: use separate global quic_conns FE/BE lists Each quic_conn instance is stored in a global list. Its purpose is to be able to loop over all known connections during "show quic". Split this into two separate lists for frontend and backend usage. Another change is that closing backend connections do not move into quic_conns_clo list. They remain instead in their original list. The objective of this patch is to reduce the contention between the two sides. Note that this prevents backend connections to be listed in "show quic" now. This will be adjusted in a future patch.	2025-11-25 14:30:18 +01:00
Amaury Denoyelle	a5801e542d	MINOR: quic: split global CID tree between FE and BE sides QUIC CIDs are stored in a global tree. Prior to this patch, CIDs used on both frontend and backend sides were mixed together. This patch implements CID storage separation between FE and BE sides. The original tree quic_cid_trees is split as quic_fe_cid_trees/quic_be_cid_trees. This patch should reduce contention between frontend and backend usages. Also, it should reduce the risk of random CID collision.	2025-11-25 14:30:18 +01:00
Amaury Denoyelle	4b596c1ea8	BUG/MINOR: quic/server: free quic_retry_token on srv drop A recent patch has implemented caching of QUIC token received from a NEW_TOKEN frame into the server cache. This value is stored per thread into a <quic_retry_token> field. This field is an ist, first set to an empty string. Via qc_try_store_new_token(), it is reallocated to fit the size of the newly stored token. Prior to this patch, the field was never freed so this causes a memory leak. Fix this by using istfree() on <quic_retry_token> field during srv_drop(). No need to backport.	2025-11-25 14:30:18 +01:00
Amaury Denoyelle	cbfe574d8a	BUG/MEDIUM: quic: do not prevent sending if no BE token For QUIC client support, a token may be emitted along with INITIAL packets during the handshake. The token is encoded during emission via qc_enc_token() called by qc_build_pkt(). The token may be provided from different sources. First, it can be retrieved via <retry_token> quic_conn member when a Retry packet was received. If not present, a token may be reused from the server cache, populated from NEW_TOKEN received from a previous connection. Prior to this patch, the last method may cause an issue. If the upper connection instance is released prior to the handshake completion, this prevents access to a possible server token. This is considered an error by qc_enc_token(). The error is reported up to calling functions, preventing any emission from being performed. In the end, this prevented either the full quic_conn release or its downsizing into quic_conn_closed until the idle timeout completion (30s by default). With abortonclose set now by default on HTTP frontends, early client shutdowns can easily cause excessive memory consumption. To fix this, change qc_enc_token() so that if connection is closed, no token is encoded but also no error is reported. This allows emission to continue and permits early connection release. No need to backport.	2025-11-25 14:30:18 +01:00
Olivier Houchard	e27216b799	DOC: ssl: Note that 0rtt works fork QUIC with QuicTLS too. Document that one can use 0rtt with QUIC when using QuicTLS too.	2025-11-25 13:17:45 +01:00
Olivier Houchard	f867068dc7	DOC: ssl: Document the restrictions on 0RTT. Document that with QUIC, 0RTT only works with OpenSSL >= 3.5.2 and AWS-LC, and for TLS/TCP, it only works with OpenSSL, and frontends require that an ALPN be sent by the client to use the early data before the handshake.	2025-11-25 11:46:22 +01:00
Jacques Heunis	91eb9b082b	BUG/MINOR: freq_ctr: Prevent possible signed overflow in freq_ctr_overshoot_period All of the other bandwidth-limiting code stores limits and intermediate (byte) counters as unsigned integers. The exception here is freq_ctr_overshoot_period which takes in unsigned values but returns a signed value. While this has the benefit of letting the caller know how far away from overshooting they are, this is not currently leveraged anywhere in the codebase, and it has the downside of halving the positive range of the result. More concretely though, returning a signed integer when all intermediate values are unsigned (and boundaries are not checked) could result in an overflow, producing values that are at best unexpected. In the case of flt_bwlim (the only usage of freq_ctr_overshoot_period in the codebase at the time of writing), an overflow could cause the filter to wait for a large number of milliseconds when in fact it shouldn't wait at all. This is a niche possibility, because it requires that a bandwidth limit is defined in the range [2^31, 2^32). In this case, the raw limit value would not fit into a signed integer, and close to the end of the period, the `(elapsed * freq)/period` calculation could produce a value which also doesn't fit into a signed integer. If at the same time `curr` (the number of events counted so far in the current period) is small, then we could get a very large negative value which overflows. This is undefined behaviour and could produce surprising results. The most obvious outcome is flt_bwlim sometimes waiting for a large amount of time in a case where it shouldn't wait at all, thereby incorrectly slowing down the flow of data. Converting just the return type from signed to unsigned (and checking for the overflow) prevents this undefined behaviour. It also makes the range of valid values consistent between the input and output of freq_ctr_overshoot_period and with the input and output of other freq_ctr functions, thereby reducing the potential for surprise in intermediate calculations: now everything supports the full 0 - 2^32 range.	2025-11-24 14:10:13 +01:00
Amaury Denoyelle	2829165f61	BUG/MEDIUM: server: do not use default SNI if manually set A new server feature "sni-auto" has been introduced recently. The objective is to automatically set the SNI value to the host header if no SNI is explicitly set. 668916c1a2fc2180028ae051aa805bb71c7b690b MEDIUM: server/ssl: Base the SNI value to the HTTP host header by default There is an issue with it: server SNI is currently always overwritten, even if explicitly set in the configuration file. Adjust check_config_validity() to ensure the default value is only used if <sni_expr> is NULL. This issue was detected when a memory leak on <sni_expr> was reported with SNI explicitly set on a server line. This patch is related to github feature request #3081. No need to backport, unless the above patch is.	2025-11-24 11:45:18 +01:00
William Lallemand	5dbf06e205	MINOR: httpclient: complete the https log The httpsclient_log_format variable lacks a few values in the TLS fields that are now available as fetches. On the backend side we have: "%[fc_err]/%[ssl_fc_err,hex]/%[ssl_c_err]/%[ssl_c_ca_err]/%[ssl_fc_is_resumed] %[ssl_fc_sni]/%sslv/%sslc" We now have enough sample fetches to have this equivalent in the httpclient: "%[bc_err]/%[ssl_bc_err,hex]/%[ssl_c_err]/%[ssl_c_ca_err]/%[ssl_bc_is_resumed] %[ssl_bc_sni]/%[ssl_bc_protocol]/%[ssl_bc_cipher]" Instead of the current: "%[bc_err]/%[ssl_bc_err,hex]/-/-/%[ssl_bc_is_resumed] -/-/-"	2025-11-22 12:29:33 +01:00
William Lallemand	0cae2f0515	BUG/MINOR: acme: warning ‘ctx’ may be used uninitialized Please the compiler regarding the maybe-uninitialized warning src/acme.c: In function ‘cli_acme_chall_ready_parse’: include/haproxy/task.h:215:9: error: ‘ctx’ may be used uninitialized [-Werror=maybe-uninitialized] 215 | _task_wakeup(t, f, MK_CALLER(WAKEUP_TYPE_TASK_WAKEUP, 0, 0)) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ src/acme.c:2903:17: note: in expansion of macro ‘task_wakeup’ 2903 | task_wakeup(ctx->task, TASK_WOKEN_MSG); | ^~~~~~~~~~~ src/acme.c:2862:26: note: ‘ctx’ was declared here 2862 | struct acme_ctx *ctx; | ^~~ Backport to 3.2.	2025-11-21 23:04:16 +01:00
William Lallemand	d77d3479ed	BUG/MINOR: acme: better challenge_ready processing Improve the challenge_ready processing: - do a lookup directly instead of looping in the task tree - only do a task_wakeup when all challenges are ready to avoid starting the task and stopping it just after - Compute the number of remaining challenges to set up - Output a message giving the number of remaining challenges to set up and whether the task started again. Backport to 3.2.	2025-11-21 22:47:52 +01:00
Willy Tarreau	8418c001ce	[RELEASE] Released version 3.3-dev14 Released version 3.3-dev14 with the following main changes : - MINOR: stick-tables: Rename stksess shards to use buckets - MINOR: quic: do not use quic_newcid_from_hash64 on BE side - MINOR: quic: support multiple random CID generation for BE side - MINOR: quic: try to clarify quic_conn CIDs fields direction - MINOR: quic: refactor qc_new_conn() prototype - MINOR: quic: remove <ipv4> arg from qc_new_conn() - MEDIUM: mworker: set the mworker-max-reloads to 50 - BUG/MEDIUM: quic-be: prevent use of MUX for 0-RTT sessions without secrets - CLEANUP: startup: move confusing msg variable - BUG/MEDIUM: mworker: signals inconsistencies during startup and reload - BUG/MINOR: mworker: wrong signals during startup - BUG/MINOR: acme: P-256 doesn't work with openssl >= 3.0 - REGTESTS: ssl: split the SSL reuse test into TLS 1.2/1.3 - BUILD: Makefile: make install with admin tools - CI: github: make install-bin instead of make install - BUG/MINOR: ssl: remove dead code in ssl_sock_from_buf() - BUG/MINOR: mux-quic: implement max-reuse server parameter - MINOR: quic: fix trace on quic_conn_closed release - BUG/MINOR: quic: do not decrement jobs for backend conns - BUG/MINOR: quic: fix FD usage for quic_conn_closed on backend side - BUILD: Makefile: remove halog from install-admin - REGTESTS: ssl: add basic 0rtt tests for TLSv1.2, TLSv1.3 and QUIC - REGTESTS: ssl: also verify that 0-rtt properly advertises early-data:1 - MINOR: quic/flags: add missing QUIC flags for flags dev tool. 
- MINOR: quic: uneeded xprt context variable passed as parameter - MINOR: limits: keep a copy of the rough estimate of needed FDs in global struct - MINOR: limits: explain a bit better what to do when fd limits are exceeded - BUG/MEDIUM: quic-be/ssl_sock: TLS callback called without connection - BUG/MINOR: acme: alert when the map doesn't exist at startup - DOC: acme: add details about the DNS-01 support - DOC: acme: explain how to dump the certificates - DOC: acme: configuring acme needs a crt file - DOC: acme: add details about key pair generation in ACME section - BUG/MEDIUM: queues: Don't forget to unlock the queue before exiting - MINOR: muxes: Support an optional ALPN string when defining mux protocols - MINOR: config: Do proto detection for listeners before checks about ALPN - BUG/MEDIUM: config: Use the mux protocol ALPN by default for listeners if forced - DOC: config: Add a note about conflict with ALPN/NPN settings and proto keyword - MINOR: quic: store source address for backend conns - BUG/MINOR: quic: flag conn with CO_FL_FDLESS on backend side - ADMIN: dump-certs: let dry-run compare certificates - BUG/MEDIUM: connection/ssl: also fix the ssl_sock_io_cb() regarding idle list - DOC: http: document 413 response code - MINOR: limits: display the computed maxconn using ha_notice() - BUG/MEDIUM: applet: Fix conditions to detect spinning loop with the new API - BUG/MEDIUM: cli: State the cli have no more data to deliver if it yields - MINOR: h3: adjust sedesc update for known input payload len - BUG/MINOR: mux-quic: fix sedesc leak on BE side - OPTIM: mux-quic: delay FE sedesc alloc to stream creation - BUG/MEDIUM: quic-be: quic_conn_closed buffer overflow - BUG/MINOR: mux-quic: check access on qcs stream-endpoint - BUG/MINOR: acme: handle multiple auth with the same name - BUG/MINOR: acme: prevent creating map entries with dns-01	2025-11-21 14:13:44 +01:00
William Lallemand	548e7079cd	BUG/MINOR: acme: prevent creating map entries with dns-01 We don't need map entries with dns-01. The patch must be backported to 3.2.	2025-11-21 12:28:41 +01:00
William Lallemand	26093121a3	BUG/MINOR: acme: handle multiple auth with the same name In case of the dns-01 challenge, it is possible to have a domain "example.com" and "*.example.com" in the same request. This will create 2 different auth objects, which need 2 different challenges. However the associated domain is "example.com" for both auth objects. When doing a "challenge_ready", the algorithm will break at the first domain found. But since the same domain can appear multiple times in this case, breaking at the first one prevents all auth objects from reaching a ready state. This patch just removes the break so we can loop over every auth object. Must be backported to 3.2.	2025-11-21 12:28:41 +01:00
Amaury Denoyelle	bbd83e3de9	BUG/MINOR: mux-quic: check access on qcs stream-endpoint Since the following commit, allocation of stream-endpoint has been delayed. The objective is to allocate it only for QCS attached to an upper stream object. commit e6064c561684d9b079e3b5725d38dc3b5c1b5cd5 OPTIM: mux-quic: delay FE sedesc alloc to stream creation However, some MUX functions are unsafe as qcs->sd is dereferenced without any check on it which will result in a crash. Fix this by testing that qcs->sd is allocated before using it. This does not need to be backported, unless the above patch is.	2025-11-21 11:16:07 +01:00
Frederic Lecaille	91f479604e	BUG/MEDIUM: quic-be: quic_conn_closed buffer overflow This bug impacts only the backends. Recent commits have modified quic_rx_pkt_parse() for the QUIC backend to handle the retry token, and version negotiation. This function is called for the quic_conn even when it is in closing state (so for the quic_conn_closed struct). The quic_conn struct and quic_conn_closed struct share some members thanks to the leading QUIC_CONN_COMMON struct. The recent modification impacts some members which do not exist for the quic_conn_closed struct, leading to buffer overflows if modified. For the backends only, this patch: 1- silently drops the Retry packet (received/parsed only by backends) 2- silently drops the Initial packets received in closing state This is safe for the Initial packets because in closing state the datagrams are entirely skipped thanks to qc_rx_check_closing() in quic_dgram_parse(). No backport needed because the backend support arrived with the current dev.	2025-11-21 10:49:44 +01:00
Amaury Denoyelle	e6064c5616	OPTIM: mux-quic: delay FE sedesc alloc to stream creation On frontend side, a stream-endpoint is allocated on every qcs_new() invocation. However, this is only used for bidirectional request streams. This patch delays stream-endpoint allocation to qcs_attach_sc(), just prior to the instantiation of the upper stream object. This does not bring any behavior change but is a nice optimization.	2025-11-21 10:34:08 +01:00
Amaury Denoyelle	4fb8908605	BUG/MINOR: mux-quic: fix sedesc leak on BE side On backend side, streams are instantiated prior to their QCS MUX counterpart. Thus, QCS can reuse the stream-endpoint already allocated with the streams, either on qmux_init() or attach operation. However, a stream-endpoint is also always allocated in every qcs_new() invocation. For backend QCS, it is thus overwritten on qmux_init()/attach operation. This causes a memleak. Fix this by restricting allocation of stream-endpoint only for frontend connection. This does not need to be backported.	2025-11-21 10:34:08 +01:00
Amaury Denoyelle	9f16c64a8c	MINOR: h3: adjust sedesc update for known input payload len	2025-11-21 10:34:08 +01:00
Christopher Faulet	0629ce8f4b	BUG/MEDIUM: cli: State the cli have no more data to deliver if it yields A regression was introduced in the commit 2d7e3ddd4 ("BUG/MEDIUM: cli: do not return ACKs one char at a time"). When the CLI is processing a command line, we no longer send response immediately. It is especially useful for clients sending a bunch of commands with very short response. However, in that state, the CLI applet must state it has no more data to deliver. Otherwise it will be woken up again and again because data are found in its output buffer with no blocking conditions. In worst cases, if the command rate is really high, this can trigger the watchdog. This patch must be backported where the patch above is, so probably as far as 3.0.	2025-11-21 10:00:15 +01:00
Christopher Faulet	dfdccbd2af	BUG/MEDIUM: applet: Fix conditions to detect spinning loop with the new API There was a mixup between read/send events and ability for an applet to receive and send. The fix seems obvious by reading it. The call-rate must be incremented when nothing was received from the applet while it was allowed and nothing was sent to the applet while it was allowed. This patch must be backported as far as 3.0.	2025-11-21 09:41:05 +01:00
Willy Tarreau	4cbff2cad9	MINOR: limits: display the computed maxconn using ha_notice() The computed maxconn was only displayed in verbose or debug modes. This is too bad because lots of users just don't know what they're starting with and can be trapped when an environment changes. Let's use ha_notice() instead of a conditional fprintf() so that it gets displayed right after the other startup messages, hoping that users will get used to seeing it and more easily spot anomalies. See github issue #3191 for more context.	2025-11-20 18:38:09 +01:00
Lukas Tribus	a50c074b74	DOC: http: document 413 response code Considering that we only use a "413 Payload Too Large" response in a single situation with a specific config toggle (h1-accept-payload-with-any-method), add some text to make it easier to find. Should be backported to 2.6. Link: https://github.com/cbonte/haproxy-dconv/issues/46 Link: https://discourse.haproxy.org/t/haproxy-error-413-paylod-too-large/9831/3	2025-11-20 18:07:01 +01:00
Willy Tarreau	05c409f1be	BUG/MEDIUM: connection/ssl: also fix the ssl_sock_io_cb() regarding idle list The fix in commit 9481cef948 ("BUG/MEDIUM: connection: do not reinsert a purgeable conn in idle list") is also needed for ssl_sock_io_cb() which can also release an idle connection and must perform the same checks. This fix must be backported to all stable versions containing the fix above.	2025-11-20 17:19:50 +01:00
William Lallemand	6aa236e964	ADMIN: dump-certs: let dry-run compare certificates Let the --dry-run mode connect to the socket and compare the certificates. It exits the process just before trying to move the previous certificate and replace it. This allows the "[NOTICE] (1234) XXX is already up to date" message to be displayed with dry-run.	2025-11-20 16:50:20 +01:00
Amaury Denoyelle	b2664d4450	BUG/MINOR: quic: flag conn with CO_FL_FDLESS on backend side Connection struct defines a handle which can point to either a FD or a quic_conn. In the latter case, CO_FL_FDLESS must be set. This is already the case on frontend side. This patch fixes QUIC backend support. Before setting connection handle member to a quic_conn instance, ensure that CO_FL_FDLESS flag is set on the connection. Prior to this patch, a crash can occur in "show sess all". No need to backport.	2025-11-20 16:44:03 +01:00
Amaury Denoyelle	cd2962ee64	MINOR: quic: store source address for backend conns quic_conn has a local_addr member which is used to store the connection source address. On backend side, this member is initialized to NULL as the address is not yet known prior to connect. With this patch, quic_connect_server() is extended so that local_addr is updated after connect() success. Also, quic_sock_get_src() is completed for the backend side which now returns local_addr member. This step is necessary to properly support fetches bc_src/bc_src_port.	2025-11-20 16:44:03 +01:00
Christopher Faulet	a14b7790ad	DOC: config: Add a note about conflict with ALPN/NPN settings and proto keyword If a mux protocol is forced and incompatible ALPN or NPN settings are used, connection errors may be experienced. There is no check performed during HAProxy startup and it is not necessarily obvious. So a note is added to warn users about this usage.	2025-11-20 16:14:52 +01:00
Christopher Faulet	0a7f3954b5	BUG/MEDIUM: config: Use the mux protocol ALPN by default for listeners if forced Since the commit 5003ac7fe ("MEDIUM: config: set useful ALPN defaults for HTTPS and QUIC"), the ALPN is set by default to "h2,http/1.1" for HTTPS listeners. However, it is in conflict with the forced mux protocol, if any. Indeed, with "proto" keyword, the mux can be forced. In that case, some combinations with the default ALPN will trigger connection errors. For instance, by setting "proto h2", it will not be possible to use the H1 multiplexer. So we must take care to not advertise it in the ALPN. Worse, since the commit above, most modern HTTP clients will try to use the H2 because it is advertised in the ALPN. Setting "proto h1" on the bind line will cause all the traffic to be rejected in error. To fix the issue, and thanks to previous commits, if it is defined, we are now relying on the ALPN defined by the mux protocol by default. The H1 multiplexer (only the one that can be forced) defines it to "http/1.1" while the H2 multiplexer defines it to "h2". So by default, if one or another of these muxes is forced, and if no ALPN is set, the mux ALPN is used. Other multiplexers are not defining any default ALPN for now, because it is useless. In addition, only the listeners are concerned because there is no default ALPN on the server side. Finally, there are no tests performed if the ALPN is forced on the bind line. It is the user's responsibility to properly configure his listeners (at least for now). This patch depends on: * MINOR: config: Do proto detection for listeners before checks about ALPN * MINOR: muxes: Support an optional ALPN string when defining mux protocols The series must be backported as far as 2.8.	2025-11-20 16:14:52 +01:00
Christopher Faulet	2ef8b91a00	MINOR: config: Do proto detection for listeners before checks about ALPN The verification of any forced mux protocol, via the "proto" keyword, for listeners is now performed before any tests on the ALPN. It will be mandatory to be able to force the default ALPN, if not forced on the bind line. This patch will be mandatory for the next fix.	2025-11-20 16:14:52 +01:00
Christopher Faulet	8e08a635eb	MINOR: muxes: Support an optional ALPN string when defining mux protocols When a multiplexer protocol is defined, it is now possible to specify the ALPN it supports, in binary format. This info is optional. For now only the h2 and the h1 multiplexers define an ALPN because this will be mandatory for a fix. But this could be used in future for different purposes. This patch will be mandatory for the next fix.	2025-11-20 16:14:52 +01:00
Olivier Houchard	e9d34f991e	BUG/MEDIUM: queues: Don't forget to unlock the queue before exiting In assign_server_and_queue(), there's a rare case when the server was full, so we created a pendconn, another server was considered but in the meanwhile the pendconn was unqueued already, so we just left the function. We did so, however, while still holding the queue lock, which will ultimately lead to a deadlock, and ultimately the watchdog would kill the process. To fix that, just unlock the queue before leaving. This should be backported to 3.2.	2025-11-20 13:57:06 +01:00
William Lallemand	1b443bdec5	DOC: acme: add details about key pair generation in ACME section In 3.3 it is possible to generate a key pair without needing an existing certificate on the disk.	2025-11-20 12:48:22 +01:00
William Lallemand	d6e3e5b3a6	DOC: acme: configuring acme needs a crt file Configuring acme in 3.2 needs a certificate on the disk. To be backported to 3.2	2025-11-20 12:44:54 +01:00
William Lallemand	332dcaecba	DOC: acme: explain how to dump the certificates The certificates can be dumped with either the dataplaneapi or the haproxy-dump-certs scripts. Must be backported in 3.2 as well as the script.	2025-11-20 12:40:38 +01:00
William Lallemand	5ff4c066e7	DOC: acme: add details about the DNS-01 support DNS-01 is supported and was backported in 3.2. Backport to 3.2.	2025-11-20 12:37:48 +01:00
William Lallemand	e0665d4ffe	BUG/MINOR: acme: alert when the map doesn't exist at startup When configuring an acme section with the 'map' keyword, the user must use an existing map. If the map doesn't exist, a log will be emitted when trying to add the challenge to the map. This patch changes the behavior by checking at startup if the map exists, so haproxy would warn and won't start with a non-existing map. This must be backported in 3.2.	2025-11-20 12:22:19 +01:00
Frederic Lecaille	fab7da0fd0	BUG/MEDIUM: quic-be/ssl_sock: TLS callback called without connection Contrary to TCP, QUIC does not SSL_free() its SSL * object when its ->close() XPRT callback is called. As a side effect, this triggers some BUG_ON(!conn), with <conn> the connection, from TLS callbacks registered at configuration parsing time, so after this <conn> has been released. This is the case for instance with ssl_sock_srv_verifycbk() whose role is to add some checks to the built-in server certificate verification process. This patch prevents dereferencing the pointer to <conn> inside several callbacks shared between TCP and QUIC. Thank you to @InputOutputZ for their report in GH #3188. As the QUIC backend feature arrived with the current 3.3 dev, no need to backport.	2025-11-20 11:36:57 +01:00
Willy Tarreau	8438ca273f	MINOR: limits: explain a bit better what to do when fd limits are exceeded As shown in github issue #3191, the error message shown when FD limits are exceeded is not very useful as-is, since the current hard limit is not displayed, and no suggestion is made about what to change in the config. Let's explain about maxconn/ulimit-n/fd-hard-limit, suggest dropping them or setting them to a context-based value at roughly 49% of the current limit minus the known used FDs for listeners and checks. This allows common "large" hard limits to report mostly round maxconns. Example: [ALERT] (25330) : [haproxy.main()] Cannot raise FD limit to 4001020, current limit is 1024 and hard limit is 4096. You may prefer to let HAProxy adjust the limit by itself; for this, please just drop any 'maxconn' and 'ulimit-n' from the global section, and possibly add 'fd-hard-limit' lower than this hard limit. You may also force a new 'maxconn' value that is a bit lower than half of the hard limit minus listeners and checks. This results in roughly 1500 here.	2025-11-20 08:44:52 +01:00
Willy Tarreau	91d4f4f618	MINOR: limits: keep a copy of the rough estimate of needed FDs in global struct It's always a pain to guess the number of FDs that can be needed by listeners, checks, threads, pollers etc. We have this estimate in global.maxsock before calling set_global_maxconn(), but we lose it the line after. Let's copy it into global.est_fd_usage and keep it. This will be helpful to try to provide more accurate suggestions for maxconn.	2025-11-20 08:44:52 +01:00
Frederic Lecaille	2c6720a163	MINOR: quic: uneeded xprt context variable passed as parameter The quic_conn ->xprt_ctx is passed to qc_send_ppkts(); the quic_conn is retrieved from this context to be used inside this function, and the context itself is not used at all by this function. This patch simply passes the quic_conn directly to qc_send_ppkts(), which is all this function needs.	2025-11-20 08:17:44 +01:00
Frederic Lecaille	a88fdf8669	MINOR: quic/flags: add missing QUIC flags for flags dev tool. Add missing QUIC_FL_CONN_XPRT_CLOSED quic_conn flags definition.	2025-11-20 08:10:58 +01:00
Willy Tarreau	40687ebc64	REGTESTS: ssl: also verify that 0-rtt properly advertises early-data:1 This patch completes the 0-rtt test to verify that early-data:1 is properly emitted to the server in the relevant situations. We carefully compare it with the expected values that are computed based on the TLS version, the client and listener's support for 0-rtt and the resumption status. A response header "x-early-data-test" is set to OK on success, or KO on failure and the client tests this. The previous test is kept as well. This was tested with quictls-1.1.1 and quictls-3.0.1 for TCP, as well as aws-lc for QUIC.	2025-11-19 22:30:31 +01:00
Willy Tarreau	2dc4d99cd2	REGTESTS: ssl: add basic 0rtt tests for TLSv1.2, TLSv1.3 and QUIC These tests try all the combinations of {0,1}rtt <-> {0,1}rtt with stateless and stateful tickets. They take into consideration the TLS version to decide whether or not 0rtt should work. Since we cannot use environment variables in the client, the tests are run in haproxy itself where the frontends set a "x-early-rcvd-test" response header that the client checks. At this stage, the test only verifies that some early data were received. Note that the tests are a bit complex because we need 4 listeners for the various combinations of 0rtt/tickets, then we have to set expectations based on the TLS version (1.2 vs 1.3), as well as the session resumption status. We have to set alpn on the server lines because currently our frontends expect it for 0-rtt to work.	2025-11-19 22:30:21 +01:00
William Lallemand	f6373a6ca8	BUILD: Makefile: remove halog from install-admin The dependency on the halog build causes problems when changing CFLAGS and LDFLAGS, because you're supposed to have the same flags during the build and the install if there are still some things to build. We probably need to store the flags somewhere to reuse them at another step, but we need to do it cleanly. In the meantime it's better not to have this dependency.	2025-11-19 16:52:20 +01:00
Amaury Denoyelle	d54d78fe9a	BUG/MINOR: quic: fix FD usage for quic_conn_closed on backend side On the frontend side, QUIC transfer can be performed either via a connection owned FD or multiplexed on the listener one. When a quic_conn is freed and converted to quic_conn_closed instance, its FD if open is closed and all exchanges are now multiplexed via the listener FD. This is different for the backend as connections only have the choice to use their owned FD. Thus, special care must be taken when freeing a connection and converting it to a quic_conn_closed instance. In this case, qc_release_fd() is delayed to the quic_conn_closed release. Furthermore, when the FD is transferred, its iocb and owner fields are updated to the new quic_conn_closed instance. Without it, a crash will occur when accessing the freed quic_conn tasklet. A newly dedicated handler quic_conn_closed_sock_fd_iocb is used to ensure access to quic_conn_closed members only.	2025-11-19 16:02:22 +01:00
Amaury Denoyelle	46c5c232d7	BUG/MINOR: quic: do not decrement jobs for backend conns jobs is a global counter which serves to account activity through the whole process. Soft-stop procedure will wait until this counter is reset to zero. jobs is not used for backend connections. Thus, it is not incremented when a QUIC backend connection is instantiated as expected. However, decrement is performed on all sides during quic_conn_release(). This causes the counter to wrap. Fix this by decrementing jobs only for frontend connections. Without this patch, soft stop procedure will hang indefinitely if QUIC backend connections were in use.	2025-11-19 16:02:22 +01:00
Amaury Denoyelle	1a22caa6ed	MINOR: quic: fix trace on quic_conn_closed release Adjust leaving trace of quic_release_cc_conn() so that the end of the function is properly reported.	2025-11-19 16:02:22 +01:00
Amaury Denoyelle	e55bcf5746	BUG/MINOR: mux-quic: implement max-reuse server parameter Properly implement support for max-reuse server keyword. This is done by adding a total count of streams seen for the whole connection. This value is used in avail_streams callback.	2025-11-19 16:02:22 +01:00
William Lallemand	c8540f7437	BUG/MINOR: ssl: remove dead code in ssl_sock_from_buf() When haproxy is compiled in -O0, the SSL_get_max_early_data() symbol is used in the generated assembly, however -O2 seems to remove this symbol when optimizing the code. It happens because `if conn_is_back(conn)` and `if (objt_listener(conn->target))` are opposed conditions, which means we never use the branch when objt_listener(conn->target) is true. This patch removes the dead code. Bonus: SSL_get_max_early_data() is not implemented in rustls, and that's the only thing preventing startup with it. This can be backported to every stable branch.	2025-11-19 11:00:05 +01:00
William Lallemand	1f562687e3	CI: github: make install-bin instead of make install make install now has a dependency on install-admin which has a dependency on admin/halog/halog. halog links haproxy .o together with its own objects, but those objects when built with ASAN must also be linked with ASAN or it won't be possible to link the binary. We don't need an ASAN-ready halog, so let's just do an install-bin instead that will just install haproxy.	2025-11-18 20:11:23 +01:00
William Lallemand	c3a95ba839	BUILD: Makefile: make install with admin tools `make install` now installs some admin tools: - halog in SBINDIR - haproxy-dump-certs in SBINDIR - haproxy-reload in SBINDIR	2025-11-18 20:02:24 +01:00
Willy Tarreau	14cb3799df	REGTESTS: ssl: split the SSL reuse test into TLS 1.2/1.3 QUIC and TLS don't use the same tests because QUIC only supports TLS 1.3 while SSL tests both TLS 1.2 and 1.3, which complicates the tests scenarios. This change extracts the core of the test into a single generic ssl_reuse.vtci file and creates new high-level tests for TLSv1.2 over TCP, TLSv1.3 over TCP and TLSv1.3 over QUIC, which simply include this file and set two variables. The test is now cleaner and simpler.	2025-11-18 16:51:56 +01:00
William Lallemand	177816d2b8	BUG/MINOR: acme: P-256 doesn't work with openssl >= 3.0 When trying to use the P-256 curve in the acme configuration with OpenSSL 3.x, the generation of the account was failing because OpenSSL doesn't return a NIST or SECG curve name, but an ANSI X9.62 one. Since the ANSI X9.62 curve names were not in the list, it couldn't match anything supported. This patch fixes the issue by adding both the prime192v1 and prime256v1 names to the struct curve array which is used during curve parsing. Must be backported to 3.2.	2025-11-18 11:34:28 +01:00
William Lallemand	9bf01a0d29	BUG/MINOR: mworker: wrong signals during startup Since the new master-worker model in 3.1, signals are registered in step_init_3(). However, those signals were supposed to be registered only for the worker or the standalone mode. It would call the wrong callback in the master even during configuration parsing. The patch sets the signal handlers to NULL for the master so that they do nothing until they are really registered. Must be backported as far as 3.1.	2025-11-18 10:27:34 +01:00
William Lallemand	709cde6d08	BUG/MEDIUM: mworker: signals inconsistencies during startup and reload Since haproxy 3.1, the master-worker mode changed to let the worker parse the configuration instead of the master. Previously, signals were blocked during configuration parsing and unblocked before entering the polling loop of the master. This way it was impossible to start a reload during the configuration parsing. But with the new model, the polling loop is started in the master before the configuration parsing is finished, and the signals are still unblocked at this step. This means that it is possible to start a reload while the configuration is being parsed. This patch reintroduces the behavior of blocking the signals during configuration parsing, adapted to the new model: - Before the exec() of the reload, signals are blocked. - When entering the polling loop, SIGCHLD is unblocked because it is required to detect a failure during configuration parsing in the worker - Once the configuration is parsed, upon success in _send_status() or upon failure in run_master_in_recovery_mode(), all signals are unblocked. This patch must be backported as far as 3.1.	2025-11-18 10:05:42 +01:00
William Lallemand	b38405d156	CLEANUP: startup: move confusing msg variable Move the char *msg variable declared in main() into a sub-block since there are already multiple msg variables in other sub-blocks in this function. Also make it const.	2025-11-18 09:43:25 +01:00
Frederic Lecaille	37d01eea37	BUG/MEDIUM: quic-be: prevent use of MUX for 0-RTT sessions without secrets The QUIC backend crashes when its peer does not support 0-RTT. In this case, when the sessions are reused, no early-data level secrets are derived by the TLS stack. This leads to crashes from qc_send_mux() which does not expect that both early-data level (qc->eel) and application level (qc->ael) cipher levels could be uninitialized. To fix this: - prevent qc_send_mux() from sending data if these two encryption levels are not initialized. In this case it returns QUIC_TX_ERR_NONE; - avoid waking up the MUX from XPRT ->start() callback if the MUX is ready but without early-data level secrets to send them; - ensure the MUX is woken up by qc_ssl_do_handshake() after handshake completion if it is ready, by calling qc_notify_send() Thank you to @InputOutputZ for having reported this issue in GH #3188. No need to backport because QUIC backend support is a current 3.3 development feature.	2025-11-17 15:40:24 +01:00
William Lallemand	0367227375	MEDIUM: mworker: set the mworker-max-reloads to 50 There was no mworker-max-reloads value by default, it was set to INT_MAX so this was impossible to reach. The default value is now 50, which is still high, but no worker should undergo that many reloads. This means that a worker will be killed with SIGTERM if it reaches this many reloads.	2025-11-17 11:54:30 +01:00
Amaury Denoyelle	c67a614e45	MINOR: quic: remove <ipv4> arg from qc_new_conn() Remove <ipv4> argument from qc_new_conn(). This parameter is unnecessary as it can be derived from the family type of the addresses also passed as argument.	2025-11-17 10:20:54 +01:00
Amaury Denoyelle	133f100467	MINOR: quic: refactor qc_new_conn() prototype The objective of this patch is to streamline qc_new_conn() usage so that it is similar for frontend and backend sides. Previously, several parameters were set only for frontend connections. These arguments are replaced by a single quic_rx_packet argument, which represents the INITIAL packet triggering the connection allocation on the server side. For a QUIC client endpoint, it remains NULL. This usage is considered more explicit. As a minor change, <target> is moved as the first argument of the function. This is considered useful as this argument determines whether the connection is a frontend or backend entry. Along with these changes, qc_new_conn() documentation has been reworded so that it is now up-to-date with the newest usage.	2025-11-17 10:13:40 +01:00
Amaury Denoyelle	49edaca513	MINOR: quic: try to clarify quic_conn CIDs fields direction quic_conn has two fields named <dcid> and <scid>. It may cause confusion as it is not obvious how these fields are related to the connection direction. Try to improve this by extending the documentation of these two fields.	2025-11-17 10:11:04 +01:00
Amaury Denoyelle	035c026220	MINOR: quic: support multiple random CID generation for BE side When a new backend connection is instantiated, a CID is first randomly generated. It will serve as the first DCID for incoming packets from the server. Prior to this patch, if the generated CID caused a collision with an entry from another connection, an error was reported and the connection could not be allocated. This patch improves this procedure by implementing retries when a collision occurs. Now, at most three attempts will be performed before giving up. This is the same procedure already performed for CIDs instantiated after RETIRE_CONNECTION_ID frame parsing. Along with this functional change, qc_new_conn() is refactored for backend instantiation. The CID generation is extracted from it and the value is passed as an argument. This is considered cleaner as the code is more similar between frontend and backend sides.	2025-11-17 10:11:04 +01:00
Amaury Denoyelle	8720130cc7	MINOR: quic: do not use quic_newcid_from_hash64 on BE side quic_newcid_from_hash64 is an external callback. If defined, it serves as a CID generation method, as an alternative to the default random implementation. This mechanism was not correctly implemented on the backend side. Indeed, the <hash64> quic_conn member is only set for frontend connections. The simplest solution would be to properly define it also for backend ones. However, quic_newcid_from_hash64 derivation is really only useful for the frontend side for now. Thus, this patch disables using it on the backend side in favor of the default random generator. To implement this, quic_cid_generate() is split into two functions, one for each method of CID generation. It is the responsibility of the caller to select the proper method. On the backend side, only the random implementation is now used.	2025-11-17 10:11:04 +01:00
Christopher Faulet	fc6e3e9081	MINOR: stick-tables: Rename stksess shards to use buckets The shard keyword is already used by the peers and on the server lines. And it is unrelated to the session keys distribution. So instead of talking about shard for the session key hashing, we now use the term "bucket".	2025-11-17 07:42:51 +01:00
Willy Tarreau	e5dadb2e8e	[RELEASE] Released version 3.3-dev13 Released version 3.3-dev13 with the following main changes : - BUG/MEDIUM: config: for word expansion, empty or non-existing are the same - BUG/MINOR: quic: close connection on CID alloc failure - MINOR: quic: adjust CID conn tree alloc in qc_new_conn() - MINOR: quic: split CID alloc/generation function - BUG/MEDIUM: quic: handle collision on CID generation - MINOR: quic: extend traces on CID allocation - MEDIUM/OPTIM: quic: alloc quic_conn after CID collision check - MINOR: stats-proxy: ensure future-proof FN_AGE manipulation in me_generate_field() - BUG/MEDIUM: stats-file: fix shm-stats-file preload not working anymore - BUG/MINOR: do not account backend connections into maxconn - BUG/MEDIUM: init: 'devnullfd' not properly closed for master - BUG/MINOR: acme: more explicit error when BIO_new_file() - BUG/MEDIUM: quic-be: do not launch the connection migration process - MINOR: quic-be: Parse the NEW_TOKEN frame - MEDIUM: quic-be: Parse, store and reuse tokens provided by NEW_TOKEN - MINOR: quic-be: helper functions to save/restore transport params (0-RTT) - MINOR: quic-be: helper quic_reuse_srv_params() function to reuse server params (0-RTT) - MINOR: quic-be: Save the backend 0-RTT parameters - MEDIUM: quic-be: modify ssl_sock_srv_try_reuse_sess() to reuse backend sessions (0-RTT) - MINOR: quic-be: allow the preparation of 0-RTT packets - MINOR: quic-be: Send post handshake frames from list of frames (0-RTT) - MEDIUM: quic-be: qc_send_mux() adaptation for 0-RTT - MINOR: quic-be: discard the 0-RTT keys - MEDIUM: quic-be: enable the use of 0-RTT - MINOR: quic-be: validate the 0-RTT transport parameters - MINOR: quic-be: do not create the mux after handshake completion (for 0-RTT) - MINOR: quic-be: avoid a useless I/O callback wakeup for 0-RTT sessions - BUG/MEDIUM: acme: move from mt_list to a rwlock + ebmbtree - BUG/MINOR: acme: can't override the default resolver - MINOR: ssl/sample: expose 
ssl_*c_curve for AWS-LC - MINOR: check: delay MUX init when SSL ALPN is used - MINOR: cfgdiag: adjust diag on servers - BUG/MINOR: check: only try connection reuse for http-check rulesets - BUG/MINOR: check: fix reuse-pool if MUX inherited from server - MINOR: check: clarify check-reuse-pool interaction with reuse policy - DOC: configuration: add missing ssllib_name_startswith() - DOC: configuration: add missing openssl_version predicates - MINOR: cfgcond: add "awslc_api_atleast" and "awslc_api_before" - REGTESTS: ssl: activate ssl_curve_name.vtc for AWS-LC - BUILD: ech: fix clang warnings - BUG/MEDIUM: stick-tables: Always return the good stksess from stktable_set_entry - BUG/MINOR: stick-tables: Fix return value for __stksess_kill() - CLEANUP: stick-tables: Don't needlessly compute shard number in stksess_free() - MINOR: h1: h1_release() should return if it destroyed the connection - BUG/MEDIUM: h1: prevent a crash on HTTP/2 upgrade - MINOR: check: use auto SNI for QUIC checks - MINOR: check: ensure QUIC checks configuration coherency - CLEANUP: peers: remove an unneeded null check - Revert "BUG/MEDIUM: connections: permit to permanently remove an idle conn" - BUG/MEDIUM: connection: do not reinsert a purgeable conn in idle list - DEBUG: extend DEBUG_STRESS to ease testing and turn on extra checks - DEBUG: add BUG_ON_STRESS(): a BUG_ON() implemented only when DEBUG_STRESS > 0 - DEBUG: servers: add a few checks for stress-testing idle conns - BUG/MINOR: check: fix QUIC check test when QUIC disabled - BUG/MINOR: quic-be: missing version negotiation - CLEANUP: quic: Missing succesful SSL handshake backend trace (OpenSSL 3.5) - BUG/MINOR: quic-be: backend SSL session reuse fix (OpenSSL 3.5) - REGTEST: quic: quic/ssl_reuse.vtc supports OpenSSL 3.5 QUIC API	2025-11-14 19:22:46 +01:00
Frederic Lecaille	d8f3ed6c23	REGTEST: quic: quic/ssl_reuse.vtc supports OpenSSL 3.5 QUIC API This script is supported by the OpenSSL 3.5 QUIC API since this previous commit: BUG/MINOR: quic: backend SSL session reuse fix (HAVE_OPENSSL_QUIC) Should be backported where this commit is backported.	2025-11-14 18:06:47 +01:00
Frederic Lecaille	54eeda4b01	BUG/MINOR: quic-be: backend SSL session reuse fix (OpenSSL 3.5) This bug impacts only the QUIC backends when haproxy is compiled against OpenSSL 3.5 with QUIC API(HAVE_OPENSSL_QUIC). The QUIC clients could not reuse their SSL session because the TLS tickets received from the servers could not be provided to the TLS stack. This should be done when the stack calls ha_quic_ossl_crypto_recv_rcd() (OSSL_FUNC_SSL_QUIC_TLS_CRYPTO_RECV_RCD callback). According to OpenSSL team, an SSL_read() call must be done after the handshake completion. It seems the correct location is at the same level as for SSL_process_quic_post_handshake() for quictls. Thank you to @mattcaswell, @Sashan and @vdukhovni for having helped in solving this issue. Must be backported to 3.1	2025-11-14 17:50:49 +01:00
Frederic Lecaille	644bf585c3	CLEANUP: quic: Missing succesful SSL handshake backend trace (OpenSSL 3.5) This very minor issue impacts only the backend when compiled against OpenSSL 3.5 with QUIC API (HAVE_OPENSSL_QUIC). The "SSL handshake OK" trace was not dumped by a TRACE() call. This was very annoying when debugging. Modify the concerned code section which is a bit ugly and simplify it. The TRACE() call is done at a single location from now on. Should be backported to 3.2 to ease any further backport.	2025-11-14 17:50:49 +01:00
Frederic Lecaille	f0c52f7160	BUG/MINOR: quic-be: missing version negotiation This bug impacts only the QUIC clients (or backends). The version negotiation was not supported at all for them. This is an oversight. Contrary to the QUIC server which chooses the negotiated version after having received the transport parameters (in the ClientHello message), the client selects the negotiated version from the first Initial packet version field. Indeed, the server transport parameters are inside the ServerHello messages ciphered into Handshake packets. This non-intrusive patch does not impact the QUIC server implementation. It only selects the negotiated version from the first Initial packet received from the server and consequently initializes the TLS cipher context. Thank you to @InputOutputZ for having reported this issue in GH #3178. No need to backport because the QUIC backends support arrives with 3.3.	2025-11-14 17:37:34 +01:00
Willy Tarreau	0746aa68b8	BUG/MINOR: check: fix QUIC check test when QUIC disabled Latest commit ef206d441c ("MINOR: check: ensure QUIC checks configuration coherency") introduced a regression when QUIC is not compiled in. Indeed, not specifying a check proto sets mux_proto to NULL, which also happens to be the value of get_mux_proto("QUIC"), so it complains about QUIC. Let's add a non-null check in addition to this. No backport is needed.	2025-11-14 17:27:53 +01:00
Willy Tarreau	4a6dec7193	DEBUG: servers: add a few checks for stress-testing idle conns The latest idle conns fix 9481cef948 ("BUG/MEDIUM: connection: do not reinsert a purgeable conn in idle list") addresses a very hard-to-hit case which manifests itself when an attempt to reuse a connection fails because conn->mux is NULL: Program terminated with signal SIGSEGV, Segmentation fault. #0 0x0000655410b8642c in conn_backend_get (reuse_mode=4, srv=srv@entry=0x6554378a7140, sess=sess@entry=0x7cfe140948a0, is_safe=is_safe@entry=0, hash=hash@entry=910818338996668161) at src/backend.c:1390 1390 if (conn->mux->takeover && conn->mux->takeover(conn, i, 0) == 0) { However the condition that leads to this situation can be detected earlier, by the presence of the connection in the toremove_list, whose race window is much larger and easier to detect. This patch adds a few BUG_ON_STRESS() at selected places that can detect this condition. When built with -DDEBUG_STRESS and run under stress with two distinct processes communicating over H2 over SSL, under a stress of 400-500k req/s, the front process usually crashes in the first 10-30s triggering in _srv_add_idle() if the fix above is reverted (and it does not crash with the fix). This is mainly included to serve as an illustration of how to instrument the code for seamless stress testing.	2025-11-14 17:00:17 +01:00
Willy Tarreau	675c86c4aa	DEBUG: add BUG_ON_STRESS(): a BUG_ON() implemented only when DEBUG_STRESS > 0 The purpose of this new BUG_ON is beyond BUG_ON_HOT(). While BUG_ON_HOT() is meant to be light but placed on very hot code paths, BUG_ON_STRESS() might be heavy and only used under stress-testing, to try to detect early that something bad is starting to happen. This one is not even type-checked when not defined because we don't want to risk the compiler emitting the slightest piece of code there in production mode, so as to give enough freedom to the developers.	2025-11-14 16:42:53 +01:00
Willy Tarreau	3d441e78e5	DEBUG: extend DEBUG_STRESS to ease testing and turn on extra checks DEBUG_STRESS is currently used only to expose "stress-level". With this patch, we go a bit further, by automatically forcing DEBUG_STRICT and DEBUG_STRICT_ACTION to their highest values in order to enable all BUG_ON levels, and make all of them result in a crash. In addition, care is taken to always only have 0 or 1 in the macro, so that it can be tested using "#if DEBUG_STRESS > 0" as well as "if (DEBUG_STRESS) { }" everywhere. The goal will be to ease insertion of extra tests for builds dedicated to stress-testing that enable possibly expensive extra checks on certain code paths that cannot reasonably be compiled in for production code right now.	2025-11-14 16:38:04 +01:00
Amaury Denoyelle	9481cef948	BUG/MEDIUM: connection: do not reinsert a purgeable conn in idle list A recent patch was introduced to fix a rare race condition in idle connection code which would result in a crash. The issue occurs when a MUX IO handler runs on top of a connection moved to the purgeable list. The connection would be considered as present in the idle list instead, and reinserted in it at the end of the handler while still in the purge list. 096999ee208b8ae306983bc3fd677517d05948d2 BUG/MEDIUM: connections: permit to permanently remove an idle conn This patch solves the described issue. However, it introduces another bug as it may clear connection flags when removing a connection from its parent list. However, these flags now serve primarily as a status which indicates that the connection is accounted by the server. When a backend connection is freed, server idle/used counters are decremented according to these flags. With the above patch, an incorrect counter could be adjusted and thus wrapping would occur. The first impact of this bug is that it may distort the estimated number of connections needed by servers, which would result either in poor reuse rate or too many idle connections kept. Another noticeable impact is that it may prevent server deletion. The main problem of the original and current issues is that connection flags are misinterpreted as telling if a connection is present in the idle list. As already described here, in fact these flags are solely a status which indicates that the connection is accounted in server counters. Thus, here are the definitive conclusions that can be drawn : * (conn->flags & CO_FL_LIST_MASK) == 1: the connection is accounted by the server it may or may not be present in the idle list * (conn->flags & CO_FL_LIST_MASK) == 0 the connection is not accounted and not present in idle list The discussion above does not mention session list, but a similar pattern can be observed when CO_FL_SESS_IDLE flag is set.
To keep the original issue solved and fix the current one, the prologues of IO MUX handlers are rewritten. Now, flags are not checked anymore for list membership and the LIST_INLIST macro is used instead. This is definitely clearer with conn_in_list purpose here. At the end of IO MUX handlers, conn idle flags may be checked if conn_in_list was true, to reinsert the connection either in idle or safe list. This is considered safe as no function should modify idle flags when a connection is not stored in a list, except during conn_free() operation. This patch must be backported to all stable versions after revert of the above commit. It should be applicable up to 3.0 without any issue. On 2.8 and below, <idle_list> connection member does not exist. It should be safe to check <leaf_p> tree node as a replacement.	2025-11-14 16:06:34 +01:00
Amaury Denoyelle	d79295d89b	Revert "BUG/MEDIUM: connections: permit to permanently remove an idle conn" The target patch fixes a rare race condition which happens when a MUX IO handler is working on a connection already moved into the purge list. In this case, the handler will incorrectly move the connection back into the idle list. To fix this, conn_delete_from_tree() was extended to remove flags along with the connection from the idle list. This was performed when the connection is moved into the purge list. However, it introduces another issue related to the idle server connection accounting. Thus it is necessary to revert it prior to the incoming newer fix. This patch must be backported to every version where the original commit is.	2025-11-14 16:06:34 +01:00
Willy Tarreau	6b9c3d0621	CLEANUP: peers: remove an unneeded null check Coverity reported in GH #3181 that a NULL test was useless, in peers_trace(), which is true since the peer always belongs to a peers section and it was already dereferenced. Let's just remove the test to avoid the confusion.	2025-11-14 13:47:20 +01:00
Amaury Denoyelle	ef206d441c	MINOR: check: ensure QUIC checks configuration coherency QUIC is now supported on the backend side, thus it is possible to use it with server checks. However, checks configuration can be quite extensive, differing greatly from the server settings. This patch ensures that QUIC checks are always performed under a controlled context. Objectives are to avoid any crashes and ensure that there is no surprise for users with respect to the configuration. The first part of this patch ensures that QUIC checks can only be activated on QUIC servers. Indeed, QUIC requires dedicated initialization steps prior to its usage. The other part of this patch disables QUIC usage when one or multiple specific check connection settings are specified in the configuration, diverging from the server settings. This is the simplest solution for now and ensures that there is no hidden behavior for users. This means that it's currently impossible to perform QUIC checks on endpoints other than the server itself. However for now there is no real use-case for this scenario. Along with these changes, check-proto documentation is updated to clarify QUIC checks behavior.	2025-11-14 13:42:08 +01:00
Amaury Denoyelle	ca5a5f37a1	MINOR: check: use auto SNI for QUIC checks By default, check SNI is set to the Host header when an HTTPS check is performed. This patch extends this mode so that it is also active when QUIC checks are executed. This patch should improve reuse rate with checks. Indeed, SNI is also already automatically set for normal traffic. The same value must be used during check so that a connection hash match can be found.	2025-11-14 13:42:08 +01:00
Olivier Houchard	333deef485	BUG/MEDIUM: h1: prevent a crash on HTTP/2 upgrade Change h1_process() to return -2 when the mux is destroyed but the connection is not, so that we can differentiate between "both mux and connection were destroyed" and "only the mux was destroyed". It can happen that only the mux gets destroyed, and the connection is still alive, if we did upgrade it to HTTP/2. In h1_wake(), if the connection is alive, then return 0, as the wake methods should only return -1 if the connection is dead. This fixes a bug where the ssl xprt would consider the connection destroyed, and thus would consider its tasklet should die, and return NULL, and its TASK_RUNNING flag would never be removed, leading to an infinite loop later on. This would happen anytime an HTTP/2 upgrade was successful. This should be backported up to 2.8. While the bug was exposed by commit 00f43b7c8b136515653bcb2fc014b0832ec32d61, it was not triggered before only by chance, and exists in previous releases too.	2025-11-14 12:49:35 +01:00
Olivier Houchard	2f8f09854f	MINOR: h1: h1_release() should return if it destroyed the connection h1_release() is called to destroy everything related to the mux h1, usually even the connection. However, it handles upgrades to HTTP/2 too, in which case the h1 mux will be destroyed, but the connection will still be alive. So make it so it returns 0 if everything is destroyed, and -1 if the connection is still alive. This should be backported up to 2.8, as a future bugfix will depend on it.	2025-11-14 12:49:35 +01:00
Christopher Faulet	14a333c4f4	CLEANUP: stick-tables: Don't needlessly compute shard number in stksess_free() Since commit 0bda33a3e ("MINOR: stick-tables: remove the uneeded read lock in stksess_free()"), the lock on the shard is no longer acquired. So it is useless to still compute the shard number. The result is never used and can be safely removed.	2025-11-14 11:56:14 +01:00
Christopher Faulet	346d6c3ac7	BUG/MINOR: stick-tables: Fix return value for __stksess_kill() The commit 9938fb9c7 ("BUG/MEDIUM: stick-tables: Fix race with peers when killing a sticky session") introduced a regression. __stksess_kill() must always return 0 if the session cannot be released. But when the ref_cnt is tested under the update lock, a success is reported if the session is still in use. 0 must be returned in that case. This bug is harmless because callers never use the return value of __stksess_kill() or stksess_kill(). This fix must be backported as far as 3.0.	2025-11-14 11:56:14 +01:00
Christopher Faulet	bd4fff9a76	BUG/MEDIUM: stick-tables: Always return the good stksess from stktable_set_entry In stktable_set_entry(), the return value of __stktable_store() is not tested while it is possible to get an existing session with the same key instead of the one we want to insert. It happens when we fail to upgrade the read lock on the bucket to a write lock. In that case, we release the lock for a short time to get a write lock. So, to fix the bug, we must check the session returned by __stktable_store() and take care to return this one. The bug was introduced by the commit e62885237c ("MEDIUM: stick-table: make stktable_set_entry() look up under a read lock"). It must be backported as far as 2.8.	2025-11-14 11:56:12 +01:00
William Lallemand	bf639e581d	BUILD: ech: fix clang warnings No impact as the state is either SHOW_ECH_SPECIFIC or SHOW_ECH_ALL but never anything else. src/ech.c:240:6: error: variable 'p' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized] 240 \| if (ctx->state == SHOW_ECH_ALL) { \| ^~~~~~~~~~~~~~~~~~~~~~~~~~ src/ech.c:275:12: note: uninitialized use occurs here 275 \| ctx->pp = p; \| ^ src/ech.c:240:2: note: remove the 'if' if its condition is always true 240 \| if (ctx->state == SHOW_ECH_ALL) { \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ src/ech.c:228:17: note: initialize the variable 'p' to silence this warning 228 \| struct proxy p; \| ^ \| = NULL src/ech.c:240:6: error: variable 'bind_conf' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized] 240 \| if (ctx->state == SHOW_ECH_ALL) { \| ^~~~~~~~~~~~~~~~~~~~~~~~~~ src/ech.c:276:11: note: uninitialized use occurs here 276 \| ctx->b = bind_conf; \| ^~~~~~~~~ src/ech.c:240:2: note: remove the 'if' if its condition is always true 240 \| if (ctx->state == SHOW_ECH_ALL) { \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ src/ech.c:229:29: note: initialize the variable 'bind_conf' to silence this warning 229 \| struct bind_conf bind_conf; \| ^ \| = NULL 2 errors generated. make: *** [Makefile:1062: src/ech.o] Error 1	2025-11-14 11:35:38 +01:00
William Lallemand	e17881128b	REGTESTS: ssl: activate ssl_curve_name.vtc for AWS-LC It was difficult to test ssl_curve_name.vtc with AWS-LC without a way to check the AWS-LC API. Let's add awslc_api_atleast() in the start conditions.	2025-11-14 11:01:45 +01:00
William Lallemand	3d15c07ed0	MINOR: cfgcond: add "awslc_api_atleast" and "awslc_api_before" AWS-LC features are not easily tested with just the openssl version constant. AWS-LC uses its own API versioning stored in the AWSLC_API_VERSION constant. This patch adds the two awslc_api_atleast and awslc_api_before predicates that help to check the AWS-LC API.	2025-11-14 11:01:45 +01:00
William Lallemand	35d21a8bc0	DOC: configuration: add missing openssl_version predicates Add missing openssl_version_atleast() and openssl_version_before() predicates. The predicates exist since 3aeb3f9347 ("MINOR: cfgcond: implements openssl_version_atleast and openssl_version_before"). Must be backported to all stable versions.	2025-11-14 11:01:45 +01:00
William Lallemand	9ad018a3dd	DOC: configuration: add missing ssllib_name_startswith() Add the missing ssllib_name_startswith() predicate in the documentation. The predicate was introduced with b01179aa9 ("MINOR: ssl: Add ssllib_name_startswith precondition"). Must be backported as far as 2.6.	2025-11-14 11:01:45 +01:00
Amaury Denoyelle	8415254cea	MINOR: check: clarify check-reuse-pool interaction with reuse policy check-reuse-pool can only perform as expected if reuse policy on the backend is set to aggressive or higher. Update the documentation to reflect this and implement a server diag warning.	2025-11-14 10:44:05 +01:00
Amaury Denoyelle	52a7d4ec39	BUG/MINOR: check: fix reuse-pool if MUX inherited from server Check reuse is only performed if no specific check connect options are specified in the configuration. This ensures that reuse won't be performed if intending to use different connection parameters from the default traffic. This relies on tcpcheck_use_nondefault_connect() which indicates if the check has any specific connection parameters. One of them is the check <mux_proto> field. However, this field may be automatically set during init_srv_check() in some specific conditions without any explicit configuration, most notably when using http-check rulesets on an HTTP backend. Thus, it prevents connection reuse for these checks. This commit fixes this by adjusting tcpcheck_use_nondefault_connect(). Besides checking the check <mux_proto> field, it also detects if it is different from the server configuration. This is sufficient to know if the value is derived from the configuration or automatically calculated in init_srv_check(). Note that this patch introduces a small behavior change. Prior to it, check reuse was never performed if "check-proto" was explicitly configured. Now, check reuse will be performed if the configured value is identical to the server MUX protocol. This is considered acceptable as connection reuse is safe when using a similar MUX protocol. This must be backported up to 3.2.	2025-11-14 10:44:05 +01:00
Amaury Denoyelle	5d021c028e	BUG/MINOR: check: only try connection reuse for http-check rulesets In 3.2, a new server keyword "check-reuse-pool" has been introduced. It allows to reuse a connection for a new check, instead of always initializing a new one. This is only performed if the check does not rely on specific connection parameters differing from the server. This patch further restricts reuse for checks only when an HTTP ruleset is used at the backend level. Indeed, reusing a connection outside of HTTP is an undefined behavior. The impact of this bug is unknown and depends on the proxy/server configuration. In the case of an HTTP backend with non-HTTP checks, check-reuse-pool would probably cause a drop in reuse rate. Along this change, implement a new diagnostic warning on servers to report that check-reuse-pool cannot apply due to an incompatible check type. This must be backported up to 3.2.	2025-11-14 10:44:03 +01:00
Amaury Denoyelle	d92f8f84fb	MINOR: cfgdiag: adjust diag on servers Adjust code dealing with diagnostics performed on servers. The objective is to extract the check on duplicate cookies into a dedicated function outside of the proxies/servers loop. This does not have any noticeable impact. This patch is merely a code improvement to make it easier to implement future diagnostics on servers.	2025-11-14 10:00:26 +01:00
Amaury Denoyelle	d12971dfea	MINOR: check: delay MUX init when SSL ALPN is used When instantiating a new connection for check, its MUX may be initialized early. This was not performed though if SSL ALPN negotiation will be used, except if check MUX is already fixed. However, this method of initialization is problematic when QUIC MUX is used. Indeed, this multiplexer must only be instantiated after the above application protocol is known, which is derived from the ALPN negotiation. If this is not the case a crash will occur in qmux_init(). In fact, a similar problem was already encountered for normal traffic. Thus, a change was performed in connect_server() : MUX early initialization is now always skipped if SSL ALPN negotiation is active, even if MUX is already fixed. This patch introduces a similar change for checks. Without this patch, it is not possible to perform check on QUIC servers as expected. Indeed, when http-check ruleset is active a crash would occur prior to it.	2025-11-14 09:49:04 +01:00
Damien Claisse	1d46c08689	MINOR: ssl/sample: expose ssl_*c_curve for AWS-LC The underlying SSL_get_negotiated_group function has been backported into AWS-LC [1], so expose the feature for users of this TLS stack as well. Note that even though it was actually added in AWS-LC 1.56.0, we require AWSLC_API_VERSION >= 35 which was released in AWS-LC 1.57.0, because API version wasn't incremented after this change. As the delta is one minor version (less than two weeks), I consider this acceptable to avoid relying on a proxy constant like TLSEXT_nid_unknown which might be removed at some point. [1] `d6a37244ad`	2025-11-13 17:36:43 +01:00
William Lallemand	b9b158ea4c	BUG/MINOR: acme: can't override the default resolver httpclient_acme_init() was called in cfg_parse_acme() which is at section parsing. httpclient_acme_init() also calls httpclient_create_proxy() which could create a "default" resolvers section if it doesn't exist. If one tries to override the default resolvers section after an ACME section, the resolvers section parsing will fail because the section was already created by httpclient_create_proxy(). This patch fixes the issue by moving the initialization of the ACME proxy to a pre_check callback, which is called just before check_config_validity(). Must be backported in 3.2.	2025-11-13 17:17:11 +01:00
William Lallemand	2bdf5a7937	BUG/MEDIUM: acme: move from mt_list to a rwlock + ebmbtree The current ACME scheduler suffers from problems due to the way the tasks are stored: - MT_LISTs do not scale when there are a lot of ACME tasks and a specific one must be looked up. - the acme_task pointer was stored in the ckch_store in order to avoid walking the whole list. But a ckch_store can be updated and the pointer lost in the previous one. - when a task fails, the ptr in the ckch_store was not removed because we only work with a copy of the original ckch_store, it would need to lock the ckchs_tree and remove this pointer. This patch fixes the issues by removing the MT_LIST-based architecture, and replacing it by a simple ebmbtree + rwlock design. The pointer to the task is not stored anymore in the ckch_store, but instead it is stored in the acme_tasks tree. Finding a task is done by doing a lookup on this tree with a RDLOCK. Instead of checking if store->acme_task is not NULL, a lookup is also done. This allows removing the stuck "acme_task" pointer in the store, which was preventing an acme task from being restarted when the previous one failed for this specific certificate. Must be backported in 3.2.	2025-11-13 15:18:12 +01:00
Frederic Lecaille	c76e072e43	MINOR: quic-be: avoid a useless I/O callback wakeup for 0-RTT sessions For backends and 0-RTT sessions, this patch modifies the ->start() callback to wake up the I/O callback only if the connection (and the mux) is not ready. Note that connect_server() has been modified to call this xprt callback just after having created the mux and installed the mux. Contrary to 1-RTT session, for 0-RTT sessions, the connections are always ready before calling this ->start xprt callback.	2025-11-13 14:04:31 +01:00
Frederic Lecaille	92d2ab76e0	MINOR: quic-be: do not create the mux after handshake completion (for 0-RTT) This is required during connection with 0-RTT support, to prevent two mux creations. Indeed, for 0-RTT sessions, the QUIC mux is already started very soon from connect_server() (src/backend.c).	2025-11-13 14:04:31 +01:00
Frederic Lecaille	d84463f9f6	MINOR: quic-be: validate the 0-RTT transport parameters During 0-RTT sessions, some server transport parameters are reused after having been saved from previous sessions. The server must not reduce these parameters when it resends them. The client must check this is the case when some early data are accepted by the server. This is what is implemented by this patch. Implement qc_early_tranport_params_validate() which checks the new server parameters are not reduced. Also implement qc_ssl_eary_data_accepted() which was not implemented for TLS stack without 0-RTT support (for instance wolfssl). That said this function was no longer used. This is why the compilation against wolfssl could not fail.	2025-11-13 14:04:31 +01:00
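The "not reduced" check described in the commit above can be pictured with a small sketch. It is written in Python for illustration only and is not HAProxy's C implementation. The class and function names are hypothetical. The field names are borrowed from the QUIC transport parameter names in RFC 9000, and the subset HAProxy actually saves and compares is an assumption.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class EarlyTransportParams:
    """Hypothetical subset of the server limits remembered for 0-RTT."""
    initial_max_data: int
    initial_max_stream_data_bidi_remote: int
    initial_max_streams_bidi: int
    initial_max_streams_uni: int


def early_params_not_reduced(saved: EarlyTransportParams,
                             new: EarlyTransportParams) -> bool:
    """Return True when no limit in `new` is lower than the one in `saved`."""
    return all(getattr(new, f.name) >= getattr(saved, f.name)
               for f in fields(saved))


# Limits saved from a previous session, then two candidate new sets.
saved = EarlyTransportParams(1 << 20, 1 << 16, 100, 3)
same_or_larger = EarlyTransportParams(1 << 21, 1 << 16, 100, 3)
reduced = EarlyTransportParams(1 << 20, 1 << 16, 50, 3)
```

With these values, `early_params_not_reduced(saved, same_or_larger)` holds while `early_params_not_reduced(saved, reduced)` does not, because the bidirectional stream limit dropped from 100 to 50.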
Frederic Lecaille	6419b9f204	MEDIUM: quic-be: enable the use of 0-RTT This patch allows the use of 0-RTT feature on QUIC server lines with "allow-0rtt" option. In fact 0-RTT is really enabled only if ssl_sock_srv_try_reuse_sess() successfully manages to reuse the SSL session and the chosen application protocol from previous connections. Note that, at this time, 0-RTT works only with quictls and aws-lc as TLS stack. (0-RTT does not work at all (even for QUIC frontends) with libressl).	2025-11-13 14:04:31 +01:00
Frederic Lecaille	46d490f7c2	MINOR: quic-be: discard the 0-RTT keys This patch allows the discarding of the 0-RTT keys as soon as 1-RTT keys are available.	2025-11-13 14:04:31 +01:00
Frederic Lecaille	3f60891360	MEDIUM: quic-be: qc_send_mux() adaptation for 0-RTT When entering this function, a selection is done about the encryption level to be used to send data. For a client, the early data encryption level is used to send 0-RTT if this encryption level is initialized. The Initial encryption is also registered to the send list for clients if there is Initial crypto data to send. This allows Initial and 0-RTT packets to be coalesced in the same datagrams.	2025-11-13 14:04:31 +01:00
Frederic Lecaille	a4bbbc75db	MINOR: quic-be: Send post handshake frames from list of frames (0-RTT) This patch is required to make 0-RTT work. It modifies the prototype of quic_build_post_handshake_frames() to send post handshake frames from a list of frames in place of the application encryption level (used as <qc->ael> local variable). This patch does not modify at all the current QUIC stack behavior (even for QUIC frontends). It must be considered as a preparation for the code to come about 0-RTT support for QUIC backends.	2025-11-13 14:04:31 +01:00
Frederic Lecaille	ac1d3eba88	MINOR: quic-be: allow the preparation of 0-RTT packets A QUIC server never sends 0-RTT packets contrary to the client. This very simple modification allows the preparation of 0-RTT packets with early data as encryption level (->eel).	2025-11-13 14:04:31 +01:00
Frederic Lecaille	6e14365a5b	MEDIUM: quic-be: modify ssl_sock_srv_try_reuse_sess() to reuse backend sessions (0-RTT) This function is called for both TCP and QUIC connections to reuse SSL sessions saved by ssl_sess_new_srv_cb() callback called upon new SSL session creation. In addition to this, a QUIC SSL session must reuse the ALPN and some specific QUIC transport parameters. This is what is added by this patch for QUIC 0-RTT sessions. Note that from now on, ssl_sock_srv_try_reuse_sess() may fail for QUIC connections if it did not manage to reuse the ALPN. The caller must be informed of such an issue. It must not enable 0-RTT for the current session in this case. This is impossible without ALPN which is required to start a mux. ssl_sock_srv_try_reuse_sess() is modified to always succeed for TCP connections.	2025-11-13 14:04:31 +01:00
Frederic Lecaille	5309dfb56b	MINOR: quic-be: Save the backend 0-RTT parameters For both TCP and QUIC connections, this is ssl_sess_new_srv_cb() callback which is called when a new SSL session is created. Its role is to save the session to be reused for the next sessions. This patch modifies this callback to save the QUIC parameters to be reused for the next 0-RTT sessions (or during SSL session resumption). The already existing path_params->nego_alpn member is used to store the ALPN as this is done for TCP alongside path_params->tps new quic_early_transport_params struct used to save the QUIC transport parameters to be reused for 0-RTT sessions.	2025-11-13 14:04:31 +01:00
Frederic Lecaille	41e40eb431	MINOR: quic-be: helper quic_reuse_srv_params() function to reuse server params (0-RTT) Implement quic_reuse_srv_params() whose role is to reuse the ALPN negotiated during a first connection to a QUIC backend alongside its transport parameters.	2025-11-13 14:04:31 +01:00
Frederic Lecaille	33564ca54c	MINOR: quic-be: helper functions to save/restore transport params (0-RTT) Define quic_early_transport_params new struct for QUIC transport parameters in relation with 0-RTT. These parameters must be saved during a first session to be reused for 0-RTT next sessions. qc_early_transport_params_cpy() copies the 0-RTT transport parameters to be saved during a first connection to a backend. The copy is made from a quic_transport_params struct to a quic_early_transport_params struct. On the contrary, qc_early_transport_params_reuse() copies the transport parameters to be reused for a 0-RTT session from a previous one. The copy is made from a quic_early_transport_params struct to a quic_transport_params struct. Also add QUIC_EV_EARLY_TRANSP_PARAMS trace event to dump such 0-RTT transport parameters from traces.	2025-11-13 14:04:31 +01:00
Frederic Lecaille	80070fe51c	MEDIUM: quic-be: Parse, store and reuse tokens provided by NEW_TOKEN Add a per thread ist struct to srv_per_thread struct to store the QUIC token to be reused for subsequent sessions. Parse at packet level (from qc_parse_ptk_frms()) these tokens and store them calling qc_try_store_new_token() newly implemented function. This is this new function which does its best (may fail) to update the tokens. Modify qc_do_build_pkt() to resend these tokens calling quic_enc_token() implemented by this patch.	2025-11-13 14:04:31 +01:00
Frederic Lecaille	8f23d4d287	MINOR: quic-be: Parse the NEW_TOKEN frame Rename ->data qf_new_token struct field to ->w_data to distinguish it from ->r_data new field used to parse the NEW_TOKEN frame. Indeed to build the NEW_TOKEN we need to write it to a static buffer into the frame struct. To parse it we only need to store the address of the token field into the RX buffer.	2025-11-13 14:04:31 +01:00
Frederic Lecaille	64e32a0767	BUG/MEDIUM: quic-be: do not launch the connection migration process At this time the connection migration is not supported by QUIC backends. This patch prevents this process to be launched for connections to QUIC backends. Furthermore, the connection migration process could be started systematically when connecting a backend to INADDR_ANY, leading to crashes into qc_handle_conn_migration() (when referencing qc->li). Thank you to @InputOutputZ for having reported this issue in GH #3178. This patch simply checks the connection type (listener or not) before checking if a connection migration must be started. No need to backport because support for QUIC backends is available from 3.3.	2025-11-13 13:52:40 +01:00
William Lallemand	071e5063d8	BUG/MINOR: acme: more explicit error when BIO_new_file() Replace the error message of BIO_new_file() when the account-key cannot be created on disk by "acme: cannot create the file '%s'". It was previously "acme: out of memory." Which is unclear. Must be backported to 3.2.	2025-11-13 11:56:33 +01:00
Remi Tricot-Le Breton	1b19e4ef32	BUG/MEDIUM: init: 'devnullfd' not properly closed for master Since commit "1ec59d3 MINOR: init: Make devnullfd global and create it earlier in init" the devnullfd pointing towards /dev/null gets created early in the init process but it was closed after the call to "mworker_run_master". The master process never got to the FD closing code and we had an FD leak. This patch does not need to be backported.	2025-11-12 16:06:28 +01:00
Amaury Denoyelle	7927ee95f3	BUG/MINOR: do not account backend connections into maxconn Remove QUIC backend connections from global actconn accounting. Indeed, this counter is only used on the frontend side. This is required to ensure maxconn coherence.	2025-11-12 14:45:00 +01:00
Aurelien DARRAGON	3262da84ea	BUG/MEDIUM: stats-file: fix shm-stats-file preload not working anymore Due to recent commit 5c299dee ("MEDIUM: stats: consider that shared stats pointers may be NULL") shm-stats-file preloading suddenly stopped working. In fact preloading should be considered as an initializing step so the counters may be assigned there without checking for NULL first. Indeed they are supposed to be NULL because preloading occurs before counters_{fe,be}_shared_prepare() which takes care of setting the pointers for counters if they weren't set before. Obviously this corner-case was overlooked during 5c299dee writing and testing. Thanks to Nick Ramirez for having reported the issue. No backport needed, this issue is specific to 3.3.	2025-11-11 22:36:17 +01:00
Aurelien DARRAGON	a287841578	MINOR: stats-proxy: ensure future-proof FN_AGE manipulation in me_generate_field() Commit ad1bdc33 ("BUG/MAJOR: stats-file: fix crash on non-x86 platform caused by unaligned cast") revealed an ambiguity in me_generate_field() around FN_AGE manipulation. For now FN_AGE can only be stored as u32 or s32, but in the future we could also support 64bit FN_AGES, and the current code assumes 32bits types and performs and explicit unsigned int cast. Instead we group current 32 bits operations for FF_U32 and FF_S32 formats, and let room for potential future formats for FN_AGE. Commit ad1bdc33 also suggested that the fix was temporary and the approach must change, but after a code review it turns out the current approach (generic types manipulation under me_generate_field()) is legit. The introduction of shm-stats-file feature didn't change the logic which was initially implemented in 3.0. It only extended it and since shared stats are now spread over thread-groups since 3.3, the use of atomic operations made typecasting errors more visible, and structure mapping change from d655ed5f14 ("BUG/MAJOR: stats-file: ensure shm_stats_file_object struct mapping consistency (2nd attempt)") was in fact the only change to blame for the crash on non-x86 platforms. With ambiguities removed in me_generate_field(), let's hope we don't face similar bugs in the future. Indeed, with generic counters, and more specifically shared ones (which leverage atomic ops), great care must be taken when changing their underlying types as me_generate_field() solely relies on stat_col descriptor to know how to read the stat from a generic pointer, so any breaking change must be reflected in that function as well No backport needed.	2025-11-10 21:32:22 +01:00
Amaury Denoyelle	5a8728d03a	MEDIUM/OPTIM: quic: alloc quic_conn after CID collision check On Initial packet parsing, a new quic_conn instance is allocated via qc_new_conn(). Then a CID is allocated with its value derivated from client ODCID. On CID tree insert, a collision can occur if another thread was already parsing an Initial packet from the same client. In this case, the connection is released and the packet will be requeued to the other thread. Originally, CID collision check was performed prior to quic_conn allocation. This was changed by the commit below, as this could cause issue on quic_conn alloc failure. commit 4ae29be18c5b212dd2a1a8e9fa0ee2fcb9dbb4b3 BUG/MINOR: quic: Possible endless loop in quic_lstnr_dghdlr() However, this procedure is less optimal. Indeed, qc_new_conn() performs many steps, thus it could be better to skip it on Initial CID collision, which can happen frequently. This patch restores the older order of operations, with CID collision check prior to quic_conn allocation. To ensure this does not cause again the same bug, the CID is removed in case of quic_conn alloc failure. This should prevent any loop as it ensures that a CID found in the global tree does not point to a NULL quic_conn, unless if CID is attach to a foreign thread. When this thread will parse a re-enqueued packet, either the quic_conn is already allocated or the CID has been removed, triggering a fresh CID and quic_conn allocation procedure.	2025-11-10 12:10:14 +01:00
Amaury Denoyelle	a9d11ab7f3	MINOR: quic: extend traces on CID allocation Add new traces to detect the CID generation method and also when an Initial packet is requeued due to CID collision.	2025-11-10 12:10:14 +01:00
Amaury Denoyelle	2623e0a0b7	BUG/MEDIUM: quic: handle collision on CID generation CIDs are provided by haproxy so that the peer can use them as DCID of its packets. Their value is set via a random generator. It happens on several occasions during connection lifetime: * via ODCID derivation if haproxy is the server * on quic_conn init if haproxy is the client * during post-handshake if haproxy is the server * on RETIRE_CONNECTION_ID frame parsing CIDs are stored in a global tree. On ODCID derivation, a check is performed to ensure the CID is not a duplicate value. This is mandatory to properly handle multiple INITIAL packets from the same client on different thread. However, for the other cases, no check is performed for CID collision. As _quic_cid_insert() is silent, the issue is not detected at all. This results in a CID advertized to the peer but not stored in the global one. In the end, this may cause two issues. The first one is that packets from the client which use the new CID will be rejected by haproxy, most probably with a STATELESS_RESET. The second issue is that it can cause a crash during quic_conn release. Indeed, the CID is stored in the quic_conn local tree and thus eb_delete() for the global tree will be performed. As <leaf_p> member is uninit, this results in a segfault. Note that this issue is pretty rare. It can only be observed if running with a high number of concurrent connections in parallel, so that the random generator will provide duplicate values. Patch is still labelled as MEDIUM as this modifies code paths used frequently. To fix this, _quic_cid_insert() unsafe function is completely removed. Instead, quic_cid_insert() can be used, which reports an error code if a collision happens. CID are then stored in the quic_conn tree only after global tree insert success. Here is the solution for each steps if a collision occurs : * on init as client: the connection is completely released * post-handshake: the CID is immediately released. 
The connection is kept, but it will miss an extra CID. * on RETIRE_CONNECTION_ID parsing: a loop is implemented to retry random generation. If it fails several times, the connection is closed in error. A small convenience change is made to quic_cid_insert(). Output parameter <new_tid> can now be NULL, which is useful as most of the time callers do not care about it. This must be backported up to 2.6.	2025-11-10 12:10:14 +01:00
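The fix described in the commit above (an insert that reports collisions, local storage only after the global insert succeeds, and a bounded retry loop) can be sketched as follows. This is a Python illustration under stated assumptions, not HAProxy's C code. A dict stands in for the global CID tree. The names `cid_insert`, `alloc_cid_with_retry`, `max_tries` and the `conn` layout are hypothetical.

```python
import secrets


def cid_insert(global_tree: dict, cid: bytes, conn: dict) -> bool:
    """Insert `cid` into the global tree; report a collision instead of hiding it."""
    if cid in global_tree:
        return False
    global_tree[cid] = conn
    return True


def alloc_cid_with_retry(global_tree: dict, conn: dict,
                         gen=lambda: secrets.token_bytes(8),
                         max_tries: int = 3):
    """Draw random CIDs until one is inserted, or give up after `max_tries`.

    Returns the CID on success, or None so that the caller can close the
    connection in error.
    """
    for _ in range(max_tries):
        cid = gen()
        if cid_insert(global_tree, cid, conn):
            # Store in the connection-local list only after global success,
            # so a later release never touches an entry that was not inserted.
            conn["cids"].append(cid)
            return cid
    return None
```

The `gen` parameter exists only to make the sketch testable with a deterministic sequence of values; a real implementation would draw from its own random generator.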
Amaury Denoyelle	419e5509d8	MINOR: quic: split CID alloc/generation function Split new_quic_cid() function into multiple ones. This patch should not introduce any visible change. The objective is to render CID allocation and generation more modular. The first advantage of this patch is to bring code simplification. In particular, conn CID sequence number increment and insertion into connection tree is simpler than before. Another improvement is that errors can now be handled more easily at each step of the CID init. This patch is a prerequisite for the fix on CID collision, thus it must be backported prior to it to every affected version.	2025-11-10 12:10:14 +01:00
Amaury Denoyelle	0ef473ba6b	MINOR: quic: adjust CID conn tree alloc in qc_new_conn() Change qc_new_conn() so that the connection CID tree is allocated earlier in the function. This patch does not introduce a behavior change. Its objective is to facilitate future evolutions on CIDs handling. This patch is a prerequisite for the fix on CID collision, thus it must be backported prior to it to every affected version.	2025-11-10 12:10:14 +01:00
Amaury Denoyelle	73621adb23	BUG/MINOR: quic: close connection on CID alloc failure During RETIRE_CONNECTION_ID frame parsing, a new connection ID is immediately reallocated after the release of the previous one. This is done to ensure that the peer will never run out of DCID. Prior to this patch, a CID allocation failure was silently ignored. This prevented the emission of a new CID, which could prevent the peer from emitting packets if it had no other CIDs available for use. Now, such error is considered fatal to the connection. This is the safest solution as it's better to close connections when memory is running low. It must be backported up to 2.8.	2025-11-10 12:10:14 +01:00
Willy Tarreau	137d5ba93f	BUG/MEDIUM: config: for word expansion, empty or non-existing are the same Amaury reported a case where "${FOO[*]}" still produces an empty field. It happens if the variable is defined but does not contain any non-space characters. The reason is that we special-case word expansion only on non-existing vars. Let's change the ordering of operations so that word- expanded vars always pretend the current arg is not an empty quote, so that we don't make any difference between a non-existing var and an empty one. No backport is needed unless commit 1968731765 ("BUG/MEDIUM: config: solve the empty argument problem again") is.	2025-11-10 11:59:35 +01:00
Willy Tarreau	b26a6d50c6	[RELEASE] Released version 3.3-dev12 Released version 3.3-dev12 with the following main changes : - MINOR: quic: enable SSL on QUIC servers automatically - MINOR: quic: reject conf with QUIC servers if not compiled - OPTIM: quic: adjust automatic ALPN setting for QUIC servers - MINOR: sample: optional AAD parameter support to aes_gcm_enc/dec - REGTESTS: converters: check USE_OPENSSL in aes_gcm.vtc - BUG/MINOR: resolvers: ensure fair round robin iteration - BUG/MAJOR: stats-file: fix crash on non-x86 platform caused by unaligned cast - OPTIM: backend: skip conn reuse for incompatible proxies - SCRIPTS: build-ssl: allow to build a FIPS version without FIPS - OPTIM: proxy: move atomically access fields out of the read-only ones - SCRIPTS: build-ssl: fix rpath in AWS-LC install for openssl and bssl bin - CI: github: update to macos-26 - BUG/MINOR: quic: fix crash on client handshake abort - MINOR: quic: do not set conn member if ssl_sock_ctx - MINOR: quic: remove connection arg from qc_new_conn() - BUG/MEDIUM: server: Add a rwlock to path parameter - BUG/MEDIUM: server: Also call srv_reset_path_parameters() on srv up - BUG/MEDIUM: mux-h1: fix 414 / 431 status code reporting - BUG/MEDIUM: mux-h2: make sure not to move a dead connection to idle - BUG/MEDIUM: connections: permit to permanently remove an idle conn - MEDIUM: cfgparse: deprecate 'master-worker' keyword alone - MEDIUM: cfgparse: 'daemon' not compatible with -Ws - DOC: configuration: deprecate the master-worker keyword - MINOR: quic: remove <mux_state> field - BUG/MEDIUM: stick-tables: Make sure we handle expiration on all tables - MEDIUM: stick-tables: Optimize the expiration process a bit. 
- MEDIUM: ssl/ckch: use ckch_store instead of ckch_data for ckch_conf_kws - MINOR: acme: generate a temporary key pair - MEDIUM: acme: generate a key pair when no file are available - BUILD: ssl/ckch: wrong function name in ckch_conf_kws - BUILD: acme: acme_gen_tmp_x509() signedness and unused variables - BUG/MINOR: acme: fix initialization issue in acme_gen_tmp_x509() - BUILD: ssl/ckch: fix ckch_conf_kws parsing without ACME - MINOR: server: move the lock inside srv_add_idle() - DOC: acme: crt-store allows you to start without a certificate - BUG/MINOR: acme: allow 'key' when generating cert - MINOR: stconn: Add counters to SC to know number of bytes received and sent - MINOR: stream: Add samples to get number of bytes received or sent on each side - MINOR: counters: Add req_in/req_out/res_in/res_out counters for fe/be/srv/li - MINOR: stream: Remove bytes_in and bytes_out counters from stream - MINOR: counters: Remove bytes_in and bytes_out counter from fe/be/srv/li - MINOR: stats: Add stats about request and response bytes received and sent - MINOR: applet: Add function to get amount of data in the output buffer - MINOR: channel: Remove total field from channels - DEBUG: stream: Add bytes_in/bytes_out value for both SC in session dump - MEDIUM: stktables: Limit the number of stick counters to 100 - BUG/MINOR: config: Limit "tune.maxpollevents" parameter to 1000000 - BUG/MEDIUM: server: close a race around ready_srv when deleting a server - BUG/MINOR: config: emit warning for empty args when not in discovery mode - BUG/MEDIUM: config: solve the empty argument problem again - MEDIUM: config: now reject configs with empty arguments - MINOR: tools: add support for ist to the word fingerprinting functions - MINOR: tools: add env_suggest() to suggest alternate variable names - MINOR: tools: have parse_line's error pointer point to unknown variable names - MINOR: cfgparse: try to suggest correct variable names on errors - IMPORT: cebtree: Replace offset calculation with 
offsetof to avoid UB - BUG/MINOR: acme: wrong dns-01 challenge in the log - MEDIUM: backend: Defer conn_xprt_start() after mux creation - MINOR: peers: Improve traces for peers - MEDIUM: peers: No longer ack updates during a full resync - MEDIUM: peers: Remove commitupdate field on stick-tables - BUG/MEDIUM: peers: Fix update message parsing during a full resync - MINOR: sample/stats: Add "bytes" in req_{in,out} and res_{in,out} names - BUG/MEDIUM: stick-tables: Make sure updates are seen as local - BUG/MEDIUM: proxy: use aligned allocations for struct proxy - BUG/MEDIUM: proxy: use aligned allocations for struct proxy_per_tgroup - BUG/MINOR: acme: avoid a possible crash on error paths	2025-11-08 12:12:00 +01:00
Willy Tarreau	5574163073	BUG/MINOR: acme: avoid a possible crash on error paths In acme_EVP_PKEY_gen(), an error message is printed if *errmsg is set, however, since commit 546c67d13 ("MINOR: acme: generate a temporary key pair"), errmsg is passed as NULL in at least one occurrence, leading the compiler to issue a NULL deref warning at -O3. And indeed, if the errors are encountered, a crash will occur. No backport is needed.	2025-11-07 22:27:25 +01:00
Willy Tarreau	fb8edd0ce6	BUG/MEDIUM: proxy: use aligned allocations for struct proxy_per_tgroup In 3.2, commit f879b9a18 ("MINOR: proxies: Add a per-thread group field to struct proxy") introduced struct proxy_per_tgroup that is declared as thread_aligned, but is allocated using calloc(). Thus it is at risk of crashing on machines using instructions requiring 64-byte alignment such as AVX512. Let's use ha_aligned_zalloc_typed() instead of calloc(). For 3.2, we don't have aligned allocations, so instead the THREAD_ALIGNED() will have to be removed from the struct definition. Alternately, we could manually align it as is done for fdtab.	2025-11-07 22:22:55 +01:00
Willy Tarreau	df9eb2e7b6	BUG/MEDIUM: proxy: use aligned allocations for struct proxy Commit fd012b6c5 ("OPTIM: proxy: move atomically access fields out of the read-only ones") caused the proxy struct to be 64-byte aligned, which allows the compiler to use optimizations such as AVX512 to zero certain fields. However the struct was allocated using calloc() so it was not necessarily aligned, causing segv on startup on compatible machines. Let's just use ha_aligned_zalloc_typed() to allocate the struct. No backport is needed.	2025-11-07 22:22:55 +01:00
Olivier Houchard	c26bcfc1e3	BUG/MEDIUM: stick-tables: Make sure updates are seen as local In stktable_touch_with_exp, if it is a local update, add it to the pending update list even if it's already in the tree as a remote update, otherwise it will never be communicated to other peers. It used to work before 3.2 because of the ordering of operations, but it's been broken by adding an extra step with the pending update list, so we now have to explicitly check for that. This should be backported to 3.2.	2025-11-07 16:23:21 +01:00
Christopher Faulet	7d1787ba8e	MINOR: sample/stats: Add "bytes" in req_{in,out} and res_{in,out} names The number of bytes received or sent by a client or a server is now saved. Sample fetches and stats fields to retrieve this information are renamed to add "bytes" in names to avoid any ambiguity with number of requests and responses.	2025-11-07 14:09:48 +01:00
Christopher Faulet	f12252c7a5	BUG/MEDIUM: peers: Fix update message parsing during a full resync The commit 590c5ff2e ("MEDIUM: peers: No longer ack updates during a full resync") introduced a regression. During a full resync, the ID of an update message is not parsed at all. Thus, the parsing of the whole message is desynchronized. On full resync the update id itself is ignored, to not be acked, but it must be parsed. It is now fixed. It is a 3.3-specific bug, no backport needed.	2025-11-07 12:47:34 +01:00
Christopher Faulet	ecc2c3a35d	MEDIUM: peers: Remove commitupdate field on stick-tables This stick-table field was atomically updated with the last update id pushed and dumped on the CLI but never used otherwise. And all peer sessions share the same id because it is a stick-table info. So the info in peers dump is pretty limited. So, let's remove it.	2025-11-07 12:17:53 +01:00
Christopher Faulet	590c5ff2ed	MEDIUM: peers: No longer ack updates during a full resync ACK messages received by a peer sending updates during a full resync are ignored. So, on the other side, there is no reason to still send these ACK messages. Let's skip them. In addition, the received updates during this stage are not considered as to be acked. It is important to be sure to properly emit ACK messages once the full sync finished.	2025-11-07 11:50:13 +01:00
Christopher Faulet	383bf11306	MINOR: peers: Improve traces for peers Trace messages for peers were only protocol oriented and the information provided was quite light. With this patch, the traces are improved. Information about the peer, its applet and the section is dumped. Several verbosities are now available and messages are dumped at different levels depending on the context. It should be easier to track issues in the peers.	2025-11-07 11:50:13 +01:00
Olivier Houchard	25559e7055	MEDIUM: backend: Defer conn_xprt_start() after mux creation In connect_server(), defer the call to conn_xprt_start() until after we had a chance to create the mux. The xprt can behave differently depending on if a mux is or is not available at this point, as if it is, it may want to wait until some data comes from the mux. This does not need to be backported.	2025-11-07 11:40:52 +01:00
William Lallemand	3bc90d01d1	BUG/MINOR: acme: wrong dns-01 challenge in the log Since 861fe532046 ("MINOR: acme: add the dns-01-record field to the sink"), the dns-01 challenge is output in the dns_record trash, instead of the global trash. The send_log string was never updated with this change, and dumps some data from the global trash instead. Since the last data emitted in the trash seems to be the dns-01 token from the authorization object, it looks like the response to the challenge. This must be backported to 3.2.	2025-11-07 09:49:04 +01:00
Ben Kallus	d5ca3bb3b4	IMPORT: cebtree: Replace offset calculation with offsetof to avoid UB This is the same as the equivalent fix in ebtree: The C standard specifies that it's undefined behavior to dereference NULL (even if you use & right after). The hand-rolled offsetof idiom &(((s*)NULL)->f) is thus technically undefined. This clutters the output of UBSan and is simple to fix: just use the real offsetof when it's available. This is cebtree commit 2d08958858c2b8a1da880061aed941324e20e748.	2025-11-07 07:32:58 +01:00
Willy Tarreau	4c3351fd63	MINOR: cfgparse: try to suggest correct variable names on errors When an empty argument comes from the use of a non-existing variable, we'll now detect the difference with an empty variable (error pointer points to the variable's name instead), and submit it to env_suggest() to see if another variable looks likely to be the right one or not. This can be quite useful to quickly figure how to fix misspelled variable names. Currently only series of letters, digits and underscores are attempted to be resolved as a name. A typical example is: peer "${HAPROXY_LOCAL_PEER}" 127.0.0.1:10000 which produces: [ALERT] (24231) : config : parsing [bug-argv4.cfg:2]: argument number 1 at position 13 is empty and marks the end of the argument list: peer "${HAPROXY_LOCAL_PEER}" 127.0.0.1:10000 ^ [NOTICE] (24231) : config : Hint: maybe you meant HAPROXY_LOCALPEER instead ?	2025-11-06 19:57:44 +01:00
Willy Tarreau	49585049b9	MINOR: tools: have parse_line's error pointer point to unknown variable names When an argument is empty, parse_line() currently returns a pointer to the empty string itself. This is convenient, but it's only actionable by the user who will see for example "${HAPROXY_LOCALPEER}" and figure what is wrong. Here we slightly change the reported pointer so that if an empty argument results from the evaluation of an empty variable (meaning that all variables in string are empty and no other char is present), then instead of pointing to the opening quote, we'll return a pointer to the first character of the variable's name. This will allow to make a difference between an empty variable and an unknown variable, and for the caller to take action based on this. I.e. before we would get: log "${LOG_SERVER_IP}" local0 ^ if LOG_SERVER_IP is not set, and now instead we'll get this: log "${LOG_SERVER_IP}" local0 ^	2025-11-06 19:57:44 +01:00
Willy Tarreau	14087e48b9	MINOR: tools: add env_suggest() to suggest alternate variable names The purpose here is to look in the environment for a variable whose name looks like the provided one. This will be used to try to auto- correct misspelled environment variables that would silently be turned to an empty string.	2025-11-06 19:57:44 +01:00
Willy Tarreau	a4d78dd4f5	MINOR: tools: add support for ist to the word fingerprinting functions The word fingerprinting functions are used to compare similar words to suggest a correctly spelled one that looks like what the user proposed. Currently the functions only support const char*, but there's no reason for this, and it would be convenient to support substrings extracted from random pieces of configurations. Here we're adding new variants "_with_len" that take these ISTs and which are in fact a slight change of the original ones that the old ones now rely on.	2025-11-06 19:57:44 +01:00
Willy Tarreau	d9d0721bc9	MEDIUM: config: now reject configs with empty arguments As prepared during 3.2, we must error on empty arguments because they mark the end of the line and cause subsequent arguments to be silently ignored. It was too late in 3.2 to turn that into an error so it's a warning, but for 3.3 it needed to be an alert. This patch does that. It doesn't instantly break, instead it counts one fatal error per violating line. This allows to emit several errors at once, which can often be caused by the same variable being missed, or a group of variables sharing a same misspelled prefix for example. Tests show that it helps locate them better. It also explains what to look for in the config manual for help with variables expansion.	2025-11-06 19:57:44 +01:00
Willy Tarreau	1968731765	BUG/MEDIUM: config: solve the empty argument problem again This mostly reverts commit ff8db5a85 ("BUG/MINOR: config: Stopped parsing upon unmatched environment variables"). As explained in commit #2367, finally the fix above was incorrect because it causes other trouble such as this: log "192.168.100.${NODE}" "local0" being resolved to this: log 192.168.100.local0 when NODE does not exist due to the loss of the spaces. In fact, while none of us was well aware of this, when the user had: server app 127.0.0.1:80 "${NO_CHECK}" weight 123 in fact they should have written it this way: server app 127.0.0.1:80 "${NO_CHECK[]}" weight 123 so that the variable is expanded to zero, one or multiple words, leaving no empty arg (like in shell). This is supported since 2.3 with commit fa41cb6 so the right fix is in the config, let's revert the fix and properly address the issue. Some changes are necessary however, since after that patch, the in_arg checks were added and are now inserting an empty argument even for proper error reporting. For example, the following statement: acl foo path "/a" "${FOO[]}" "/b" would complain about an empty arg at FOO due to in_arg=1, while dropping this in_arg=1 with the following config: acl foo path "/a" "${FOO}" "/b" would silently stop after "/a" instead of complaining about an empty field. So the approach here consists in noting whether or not something was written since the quotes were emitted, in order to decide whether or not to produce an argument. This way, "" continues to be an explicitly empty arg, just like the same with an unknown variable, while "${FOO[]}" is allowed to prevent the creation of an argument if empty. This should be backported to some* versions, but the risk that some configs were altered to rely on the broken fix is not null. At least recent LTS should be reverted. 
Note that this requires previous commit: BUG/MINOR: config: emit warning for empty args when not in discovery mode otherwise this will break again configs relying on HAPROXY_LOCALPEER and maybe a few other variables set at the end of discovery.	2025-11-06 19:57:44 +01:00
Willy Tarreau	004e1be48e	BUG/MINOR: config: emit warning for empty args when not in discovery mode This actually reverses the condition of commit 5f1fad1690 ("BUG/MINOR: config: emit warning for empty args only in discovery mode"). Indeed, some variables are not known in discovery mode (e.g. HAPROXY_LOCALPEER), and statements like: peer "${HAPROXY_LOCALPEER}" 127.0.0.1:10000 are broken during discovery mode. It turns out that the warning is currently hidden by commit ff8db5a85d ("BUG/MINOR: config: Stopped parsing upon unmatched environment variables") since it silently drops empty args which is sufficient to hide the warning, but it also breaks other configs and needs to be reverted, which will break configs like above again. In issue #2995 we were not fully decided about discovery mode or not, and already suspected some possible issues without being able to guess which ones. The only downside of not displaying them in discovery mode is that certain empty fields on the rare keywords specific to master mode might remain silent until used. Let's just flip the condition to check for empty args in normal mode only. This should be backported to 3.2 after some time of observation.	2025-11-06 19:57:44 +01:00
Willy Tarreau	0144426dfb	BUG/MEDIUM: server: close a race around ready_srv when deleting a server When a server is being disabled or deleted, in case it matches the backend's ready_srv, this one is reset. However it's currently done in a non-atomic way when the server goes down, and that could occasionally reset the entry matching another server, but more importantly if in parallel some requests are dequeued for that server, it may re-appear there after having been removed, leading to a possible crash once it is fully removed, as shown in issue #3177. Let's make sure we reset the pointer when detaching the server from the proxy, and use a CAS in both cases to only reset this server. This fix needs to be backported to 3.2. There, srv_detach() is in server.c instead of server.h. Thanks to Basha Mougamadou for the detailed report and the useful backtraces.	2025-11-06 19:57:44 +01:00
Christopher Faulet	c6f68901cc	BUG/MINOR: config: Limit "tune.maxpollevents" parameter to 1000000 "tune.maxpollevents" global parameter was not limited. It was possible to set any integer value. But this value is used to allocate the array of events used by epoll. With a huge value, it seems the allocation silently fails, making haproxy totally unresponsive. So let's limit its value to 1 million. It is pretty high and it should not be an issue to forbid greater values. The documentation was updated accordingly. This patch could be backported to all stable branches.	2025-11-06 15:56:21 +01:00
Christopher Faulet	80edbad4f9	MEDIUM: stktables: Limit the number of stick counters to 100 "tune.stick-counters" global parameter was accepting any positive integer value. But the maximum value is incredibly high. Setting a huge value has a significant impact on memory and CPU usage. To avoid any issue, this value is now limited to 100. It should be large enough for all usages. It can be seen as a breaking change.	2025-11-06 15:01:29 +01:00
Christopher Faulet	949199a2f4	DEBUG: stream: Add bytes_in/bytes_out value for both SC in session dump It could be handy to have these infos in the full session dump. So let's dump it now.	2025-11-06 15:01:29 +01:00
Christopher Faulet	a1b5325a7a	MINOR: channel: Remove total field from channels The <total> field in the channel structure is now useless, so it can be removed. The <bytes_in> field from the SC is used instead. This patch is related to issue #1617.	2025-11-06 15:01:29 +01:00
Christopher Faulet	1effe0fc0a	MINOR: applet: Add function to get amount of data in the output buffer The helper function applet_output_data() returns the amount of data in the output buffer of an applet. For applets using the new API, it is based on data present in the outbuf buffer. For legacy applets, it is based on input data present in the input channel's buffer. The HTX version, applet_htx_output_data(), is also available. This patch is related to issue #1617.	2025-11-06 15:01:29 +01:00
Christopher Faulet	4991a51208	MINOR: stats: Add stats about request and response bytes received and sent In previous patches, these counters were added per frontend, backend, server and listener. With this patch, these counters are reported on stats, including promex. Note that the stats file minor version was incremented by one because the shm_stats_file_object struct size has changed. This patch is related to issue #1617.	2025-11-06 15:01:29 +01:00
Christopher Faulet	0084baa6ba	MINOR: counters: Remove bytes_in and bytes_out counter from fe/be/srv/li bytes_in and bytes_out counters per frontend, backend, listener and server were removed and we now rely on the req_in and res_in counters, respectively. This patch is related to issue #1617.	2025-11-06 15:01:29 +01:00
Christopher Faulet	567df50d91	MINOR: stream: Remove bytes_in and bytes_out counters from stream The per-stream bytes_in and bytes_out counters were removed and replaced by req.in and res.in. The corresponding samples still exist but rely on the new counters. This patch is related to issue #1617.	2025-11-06 15:01:29 +01:00
Christopher Faulet	1c62a6f501	MINOR: counters: Add req_in/req_out/res_in/res_out counters for fe/be/srv/li Thanks to the previous patch, and based on info available on the stream, it is now possible to have counters for frontends, backends, servers and listeners to report number of bytes received and sent on both sides. This patch is related to issue #1617.	2025-11-06 15:01:29 +01:00
Christopher Faulet	ac9201f929	MINOR: stream: Add samples to get number of bytes received or sent on each side req.in and req.out samples can now be used to get the number of bytes received from a client and sent to the server. And res.in and res.out samples can be used to get the number of bytes received from a server and sent to the client. This information is stored in the logs structure inside a stream. This patch is related to issue #1617.	2025-11-06 15:01:28 +01:00
Christopher Faulet	629fbbce19	MINOR: stconn: Add counters to SC to know number of bytes received and sent <bytes_in> and <bytes_out> counters were added to SC to count, respectively, the number of bytes received from an endpoint or sent to an endpoint. These counters are updated for connections and applets. This patch is related to issue #1617.	2025-11-06 15:01:28 +01:00
William Lallemand	094baa1cc0	BUG/MINOR: acme: allow 'key' when generating cert Allow to use the 'key' keyword when 'crt' was generated with both a crt and a key. No backport needed.	2025-11-06 14:11:43 +01:00
William Lallemand	05036180d9	DOC: acme: crt-store allows you to start without a certificate If your acme certificate is declared in a crt-store, and the certificate file does not exist on the disk, HAProxy will start with a temporary key pair.	2025-11-06 13:40:42 +01:00
Willy Tarreau	5fe4677231	MINOR: server: move the lock inside srv_add_idle() Almost all callers of _srv_add_idle() lock the list then call the function. It's not the most efficient and it requires some care from the caller to take care of that lock. Let's change this a little bit by having srv_add_idle() that takes the lock and calls _srv_add_idle() that is now inlined. This way callers don't have to handle the lock themselves anymore, and the lock is only taken around the sensitive parts, not the function call+return. Interestingly, perf tests show a small perf increase from 2.28-2.32M RPS to 2.32-2.37M RPS on a 128-thread system.	2025-11-06 13:16:24 +01:00
William Lallemand	a8498cde74	BUILD: ssl/ckch: fix ckch_conf_kws parsing without ACME Without ACME, the tmp_pkey and tmp_x509 functions are not available, the patch checks HAVE_ACME to use them.	2025-11-06 12:27:27 +01:00
William Lallemand	22f92804d6	BUG/MINOR: acme: fix initialization issue in acme_gen_tmp_x509() src/acme.c: In function ‘acme_gen_tmp_x509’: src/acme.c:2685:15: error: ‘digest’ may be used uninitialized [-Werror=maybe-uninitialized] 2685 \| if (!(X509_sign(newcrt, pkey, digest))) \| ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ src/acme.c:2628:23: note: ‘digest’ was declared here 2628 \| const EVP_MD *digest; \| ^~~~~~	2025-11-06 12:12:18 +01:00
William Lallemand	0524af034f	BUILD: acme: acme_gen_tmp_x509() signedness and unused variables Fix compilation issues in acme_gen_tmp_x509(). src/acme.c:2665:66: warning: pointer targets in passing argument 4 of ‘X509_NAME_add_entry_by_txt’ differ in signedness [-Wpointer-sign] 2665 \| if (X509_NAME_add_entry_by_txt(name, "CN", MBSTRING_ASC, "expired", \| ^~~~~~~~~ \| \| \| char * In file included from /usr/include/openssl/ssl.h:32, from include/haproxy/openssl-compat.h:19, from include/haproxy/acme-t.h:6, from src/acme.c:16: /usr/include/openssl/x509.h:1074:53: note: expected ‘const unsigned char *’ but argument is of type ‘char *’ 1074 \| const unsigned char *bytes, int len, int loc, \| ~~~~~~~~~~~~~~~~~~~~~^~~~~ src/acme.c:2630:23: warning: unused variable ‘i’ [-Wunused-variable] 2630 \| unsigned int i; \| ^ src/acme.c:2629:23: warning: unused variable ‘ctx’ [-Wunused-variable] 2629 \| X509V3_CTX ctx; \| ^~~	2025-11-06 12:08:04 +01:00
William Lallemand	a15d4f5b19	BUILD: ssl/ckch: wrong function name in ckch_conf_kws ckch_conf_load_pem does not exist anymore and ckch_conf_load_pem_or_generate must be used instead	2025-11-06 12:03:29 +01:00
William Lallemand	582a1430b2	MEDIUM: acme: generate a key pair when no file are available When an acme keyword is associated to a crt and key, and the corresponding files do not exist, HAProxy would not start. This patch allows to configure acme without pre-generating a keypair before starting HAProxy. If the files do not exist, it tries to generate a unique keypair in memory, that will be used for all ACME certificates that don't have a file on the disk yet.	2025-11-06 11:56:27 +01:00
William Lallemand	546c67d137	MINOR: acme: generate a temporary key pair This patch provides two functions acme_gen_tmp_pkey() and acme_gen_tmp_x509(). These functions generate a unique keypair and X509 certificate that will be stored in tmp_x509 and tmp_pkey. If the key pair or certificate was already generated they will return the existing one. The key is an RSA2048 and the X509 is generated with an expiration in the past. The CN is "expired". These are just placeholders to be used if we don't have files.	2025-11-06 11:56:27 +01:00
William Lallemand	1df55b441b	MEDIUM: ssl/ckch: use ckch_store instead of ckch_data for ckch_conf_kws This is an API change, instead of passing a ckch_data alone, the ckch_conf_kws.func() is called with a ckch_store. This allows the callback to access the whole ckch_store, with the ckch_conf and the ckch_data. But it requires the ckch_conf to be actually put in the ckch_store before.	2025-11-06 11:56:27 +01:00
Olivier Houchard	201971ec5f	MEDIUM: stick-tables: Optimize the expiration process a bit. In process_tables_expire(), if the table we're analyzing still has entries, and thus should be put back into the tree, do not put it in the mt_list, to have it put back into the tree the next time the task runs. There is no problem with putting it in the tree right away, as either the next expiration is in the future, or we handled the maximum number of expirations per task call and we're about to stop, anyway. This does not need to be backported.	2025-11-05 19:22:11 +01:00
Olivier Houchard	93f994e8b1	BUG/MEDIUM: stick-tables: Make sure we handle expiration on all tables In process_tables_expire(), when parsing all the tables with expiration set, to check if any entry expired, make sure we start from the oldest one, we can't just rely on eb32_first(), because of sign issues on the timestamp. Not doing that may mean some tables are not considered for expiration. This does not need to be backported.	2025-11-05 19:22:11 +01:00
Amaury Denoyelle	b9809fe0d0	MINOR: quic: remove <mux_state> field This patch removes <mux_state> field from quic_conn structure. The purpose of this field was to indicate if MUX layer above quic_conn is not yet initialized, active, or already released. It became tedious to properly set it as initialization order of the various quic_conn/conn/MUX layers now differs between the frontend and backend sides, and also depending on whether 0-RTT is used or not. Recently, a new change introduced in connect_server() will allow to initialize QUIC MUX earlier if ALPN is cached on the server structure. This adds another level of complexity. Thus, this patch removes <mux_state> field completely. Instead, a new flag QUIC_FL_CONN_XPRT_CLOSED is defined. It is set at a single place only on close XPRT callback invocation. It can be mixed with the new utility functions qc_wait_for_conn()/qc_is_conn_ready() to determine the status of conn/MUX layers now without an extra quic_conn field.	2025-11-05 14:03:34 +01:00
William Lallemand	99a2454e9d	DOC: configuration: deprecate the master-worker keyword Deprecate the 'master-worker' keyword in the global section. Split the configuration of the 'no-exit-on-failure' subkeyword into another section which is not deprecated yet and explains that it's only meant for debugging purposes.	2025-11-05 12:27:11 +01:00
William Lallemand	4f978325ac	MEDIUM: cfgparse: 'daemon' not compatible with -Ws Emit a warning when the 'daemon' keyword is used in master-worker mode for systemd (-Ws). This never worked and was always ignored by setting MODE_FOREGROUND during cmdline parsing.	2025-11-05 11:49:11 +01:00
William Lallemand	631233e9ec	MEDIUM: cfgparse: deprecate 'master-worker' keyword alone Warn when the 'master-worker' keyword is used without 'no-exit-on-failure'. Warn when the 'master-worker' keyword is used and -W and -Ws already set the mode.	2025-11-05 11:49:11 +01:00
Willy Tarreau	096999ee20	BUG/MEDIUM: connections: permit to permanently remove an idle conn There's currently a function conn_delete_from_tree() which is used to detach an idle connection from the tree it's currently attached to so that it is no longer found. This function is used in three circumstances: - when picking a new connection that no longer has any avail stream - when temporarily working on the connection from an I/O handler, in which case it's re-added at the end - when killing a connection The 2nd case above is quite specific, as it requires to preserve the CO_FL_LIST_MASK flags so that the connection can be re-inserted into the proper tree when leaving the handler. However, there's a catch. When killing a connection, we want to be certain it will not be reinserted into the tree. The flags preservation is causing a tiny race if an I/O happens while the connection is in the kill list, because in this case the I/O handler will note the connection flags, do its work, then reinsert the connection where it believed it was, then the connection gets purged, and another user can find it in the tree. The issue is very difficult to reproduce. On a 128-thread machine it happens in H2 around 500k req/s after around 50M requests. In H1 it happens after around 1 billion requests. The fix here consists in passing an extra argument to the function to indicate if the removal is permanent or not. When it's permanent, the function will clear the associated flags. The callers were adjusted so that all those dequeuing a connection in order to kill it do it permanently and all other ones do it only temporarily. A slightly different approach could have worked: the function could always remove all flags, and the callers would need to restore them. But this would require trickier modifications of the various call places, compared to only passing 0/1 to indicate the permanent status. This will need to be backported to all stable versions. 
The issue was at least reproduced since 3.1 (not tested before). The patch will need to be adjusted for 3.2 and older, because a 2nd argument "thr" was added in 3.3, so the patch will not apply to older versions as-is.	2025-11-05 11:08:25 +01:00
Willy Tarreau	59c599f3f0	BUG/MEDIUM: mux-h2: make sure not to move a dead connection to idle In h2_detach(), it looks possible to place a dead connection back to the idle list, and to later call h2_release() on it once detected as dead. It's not certain that it happens but nothing in the code shows it is not possible, so better make sure it cannot happen. This should be preventively backported to all versions.	2025-11-05 11:08:25 +01:00
Maximilian Moehl	0799fd1072	BUG/MEDIUM: mux-h1: fix 414 / 431 status code reporting The more detailed status code reporting introduced with bc967758a2 is checking against the error state to determine whether it is a too long URL or too large headers. The check used always returns true which results in a 414 as the error state is only set at a later point. This commit adjusts the check to use the current state instead to return the intended status code. This patch must be backported as far as 3.1.	2025-11-05 10:55:18 +01:00
Olivier Houchard	06821dc189	BUG/MEDIUM: server: Also call srv_reset_path_parameters() on srv up Also call srv_reset_path_parameters() when the server changed states, and got up. It is not enough to do it when the server goes down, because there's a small race condition, and a connection could get established just after we did it, and could have set the path parameters. This does not need to be backported.	2025-11-04 18:47:34 +01:00
Olivier Houchard	7d4aa7b22b	BUG/MEDIUM: server: Add a rwlock to path parameter Add a rwlock to control the server's path_parameter, to make sure multiple threads don't set it at the same time, and it can't be seen in an inconsistent state. Also don't set the parameter every time, only set them if they have changed, to prevent needless writes. This does not need to be backported.	2025-11-04 18:47:34 +01:00
Amaury Denoyelle	efe60745b3	MINOR: quic: remove connection arg from qc_new_conn() This patch is similar to the previous one, this time dealing with qc_new_conn(). This function was asymmetric on frontend and backend side, as connection argument was set only in the latter case. This was previously required due to qc_alloc_ssl_sock_ctx() signature. This has changed with the previous patch, thus qc_new_conn() can also be realigned on both FE and BE sides. <conn> member of quic_conn instance is always set outside it, in qc_xprt_start() on the backend case.	2025-11-04 17:47:42 +01:00
Amaury Denoyelle	5a17cade4f	MINOR: quic: do not set conn member if ssl_sock_ctx ssl_sock_ctx is a generic object used both on TCP/SSL and QUIC stacks. Most notably it contains a <conn> member which is a pointer to struct connection. On QUIC frontend side, this member is always set to NULL. Indeed, connection is only created after handshake completion. However, this has changed for backend side, where the connection is instantiated prior to its quic_conn counterpart. Thus, ssl_sock_ctx member would be set in this case as a convenience for use later in qc_ssl_do_hanshake(). However, this method was unsafe as the connection can be released, without resetting ssl_sock_ctx member. Thus, the previous patch fixes this by using the <conn> member through the quic_conn instance which is the proper way. Thus, this patch resets ssl_sock_ctx <conn> member to NULL. This is deemed the cleanest method as it ensures that both frontend and backend sides must not use it anymore.	2025-11-04 17:38:09 +01:00
Amaury Denoyelle	69de7ec14e	BUG/MINOR: quic: fix crash on client handshake abort On backend side, a connection can be aborted and released prior to handshake completion. This causes a crash in qc_ssl_do_hanshake() as <conn> member of ssl_sock_ctx is not reset in this case. To fix this, use <conn> member of quic_conn instead. This is safe as it is properly set to NULL when a connection is released. No impact on the frontend side as <conn> member is not accessed. Indeed, in this case connection is most of the time allocated after handshake completion. No need to be backported.	2025-11-04 17:33:42 +01:00
William Lallemand	3c578ca31c	CI: github: update to macos-26 macOS-15 images seem to have had difficulty running the reg-tests for a few days for an unknown reason. Doing a rollback of both VTest2 and haproxy doesn't seem to fix the problem so this is probably related to a change in github actions. This patch switches the image to the new macos-26 images which seem to fix the problem.	2025-11-03 16:17:36 +01:00
William Lallemand	0c34502c6d	SCRIPTS: build-ssl: fix rpath in AWS-LC install for openssl and bssl bin AWS-LC binaries were not linked correctly with an rpath, preventing the binaries from being usable without setting an LD_LIBRARY_PATH manually.	2025-11-03 15:04:57 +01:00
Willy Tarreau	fd012b6c59	OPTIM: proxy: move atomically access fields out of the read-only ones Perf top showed that h1_snd_buf() was having great difficulties accessing the proxy's server_id_hdr_name field in the middle of the headers loop. Moving the assignment out of the loop to a local variable moved the problem there as well: \| if (!(h1m->flags & H1_MF_RESP) && isttest(h1c->px->server_id_hdr_n 0.10 \|20b0: mov -0x120(%rbp),%rdi 1.33 \| mov 0x60(%rdi),%r10 0.01 \| test %eax,%eax 0.18 \| jne 2118 12.87 \| mov 0x350(%r10),%rdi 0.01 \| test %rdi,%rdi 0.05 \| je 2118 \| mov 0x358(%r10),%r11 It turns out that there are several atomically accessed fields in its vicinity, causing the cache line to bounce all the time. Let's collect the few frequently changed fields and place them together at the end of the structure, and plug the 32-bit hole with another isolated field. Doing so also reduced a little bit the cost of decrementing be->be_conn in process_stream(), and overall the HTTP/1 performance increased by about 1% both on ARM and x86_64.	2025-11-03 13:54:49 +01:00
William Lallemand	12aca978a8	SCRIPTS: build-ssl: allow to build a FIPS version without FIPS build-ssl.sh is always prepending a "v" to the version, which prevents building a FIPS version without FIPS enabled. This patch checks if FIPS is in the version string to choose whether to add the "v" or not. Example: AWS_LC_VERSION=AWS-LC-FIPS-3.0.0 BUILDSSL_DESTDIR=/opt/awslc-3.0.0 ./scripts/build-ssl.sh	2025-11-03 12:03:05 +01:00
Amaury Denoyelle	6bfabfdc77	OPTIM: backend: skip conn reuse for incompatible proxies When trying to reuse a backend connection, a connection hash is calculated to match an entry with similar parameters. Previously, this operation was skipped if the stream content wasn't based on HTTP, as it would have been incompatible with http-reuse. With the introduction of SPOP backends, this condition was removed, so that it can also benefit from connection reuse. However, this means that now the hash calculation is always performed when connecting to a server, even for TCP or log backends. This is unnecessary as these proxies cannot perform connection reuse. Note also that reuse mode is reset on postparsing for incompatible backends. This at least guarantees that no tree lookup will be performed via be_reuse_connection(). However, connection lookup is still performed in the session via session_get_conn() which is another unnecessary operation. Thus, this patch restores the condition so that reuse operations are now entirely skipped if a backend mode is incompatible. This is implemented via a new utility function named be_supports_conn_reuse(). This could be backported up to 3.1, as this commit could be considered as a performance regression for tcp/log backend modes.	2025-11-03 10:43:50 +01:00
Willy Tarreau	ad1bdc3364	BUG/MAJOR: stats-file: fix crash on non-x86 platform caused by unaligned cast Since commit d655ed5f14 ("BUG/MAJOR: stats-file: ensure shm_stats_file_object struct mapping consistency (2nd attempt)"), the last_state_change field in the counters is a uint (to match how it's reported). However, it happens that there are explicit casts in function me_generate_field() to retrieve the value, and which cause crashes on aarch64 and likely other non-x86 64-bit platforms due to atomically reading an unaligned 64-bit value, and may even randomly crash other 64-bit platforms when reading past the end of the structure. The fix for now adapts the cast to match the one used by the accessed type (i.e. unsigned int), but the approach must change, as there's nothing there which allows one to figure out whether or not the type is correct by just reading the code. At a minimum a typeof() on a named field is needed, but this requires more invasive changes, hence this temporary fix. No backport is needed, as stats-file is only in 3.3.	2025-11-03 07:33:11 +01:00
Damien Claisse	561dc127bd	BUG/MINOR: resolvers: ensure fair round robin iteration Previous fixes restored round robin iteration, but an imbalance remains when the response tree contains record types other than A or AAAA. Let's take the following example: the DNS answers two A records and a CNAME. The response "tree" (which is actually flat, more like a list) may look as follows, ordered by hash: - 1st item: first A record with IP 1 - 2nd item: second A record with IP 2 - 3rd item: CNAME record As a consequence, resolv_get_ip_from_response will iterate as follows, while the TTL is still valid: - 1st call: DNS request is done, response tree is created, iteration starts at the first item, IP 1 is returned. - 2nd call: cached response tree is used, iteration starts at the second item, IP 2 is returned. - 3rd call: cached response tree is used, iteration starts at the third item, but it's a CNAME, so we continue to the next item, which restarts iteration at the first item, and IP 1 is returned. - 4th call: cached response tree is used and iteration restarts at the beginning, returning IP 1 again. The 1-2-1-1-2-1-1-2 sequence will repeat, so IP 1 will be used twice as often as IP 2, creating a strong imbalance. Even with more IP addresses, the first one by hashing order in the tree will always receive twice the traffic of the others. To fix this, set the next iteration item to the one following the selected IP record, if any. This ensures we never use the same IP twice in a row. This commit should be backported where 3023e9819 ("BUG/MINOR: resolvers: Restore round-robin selection on records in DNS answers") is, so as far as 2.6.	2025-11-02 17:28:32 +01:00
William Lallemand	d1d2461197	REGTESTS: converters: check USE_OPENSSL in aes_gcm.vtc Check USE_OPENSSL as well as the haproxy version for the aes_gcm reg-test.	2025-10-31 12:43:00 +01:00
William Lallemand	1d859bdaa2	MINOR: sample: optional AAD parameter support to aes_gcm_enc/dec The aes_gcm_enc() and aes_gcm_dec() sample converters now accept an optional fifth argument for Additional Authenticated Data (AAD). When provided, the AAD value is base64-decoded and used during AES-GCM encryption or decryption. Both string and variable forms are supported. This enables use cases that require authentication of additional data.	2025-10-31 12:27:38 +01:00
Amaury Denoyelle	73b5d331cc	OPTIM: quic: adjust automatic ALPN setting for QUIC servers If a QUIC server is declared without ALPN, "h3" value is automatically set during _srv_parse_finalize(). This patch adjusts this operation. Instead of relying on ssl_sock_parse_alpn(), a plain strdup() is used. This is considered more efficient as the ALPN string is constant in this case. This method is already used for listeners on the frontend side.	2025-10-31 11:32:20 +01:00
Amaury Denoyelle	14a6468df5	MINOR: quic: reject conf with QUIC servers if not compiled Ensure that QUIC support is compiled into haproxy when a QUIC server is configured. This check is performed during _srv_parse_finalize() so that it is detected both on configuration parsing and when adding a dynamic server via the CLI. Note that this changes the behavior of srv_is_quic() utility function. Previously, it always returned false when QUIC support wasn't compiled. With this new check introduced, it is now guaranteed that a QUIC server won't exist if compilation support is not active. Hence srv_is_quic() does not rely anymore on USE_QUIC define.	2025-10-31 11:32:20 +01:00
Amaury Denoyelle	1af3caae7d	MINOR: quic: enable SSL on QUIC servers automatically Previously, QUIC servers were rejected if SSL was not explicitly activated using 'ssl' configuration keyword. Change this behavior: now SSL is automatically activated for QUIC servers when the keyword is missing. A warning is displayed as it is considered better to explicitly note that SSL is in use.	2025-10-31 11:32:14 +01:00
Willy Tarreau	0a14ad11be	[RELEASE] Released version 3.3-dev11 Released version 3.3-dev11 with the following main changes : - BUG/MEDIUM: mt_list: Make sure not to unlock the element twice - BUG/MINOR: quic-be: unchecked connections during handshakes - BUG/MEDIUM: cli: also free the trash chunk on the error path - MINOR: initcalls: Add a new initcall stage, STG_INIT_2 - MEDIUM: stick-tables: Use a per-shard expiration task - MEDIUM: stick-tables: Remove the table lock - MEDIUM: stick-tables: Stop if stktable_trash_oldest() fails. - MEDIUM: stick-tables: Stop as soon as stktable_trash_oldest succeeds. - BUG/MEDIUM: h1-htx: Don't set HTX_FL_EOM flag on 1xx informational messages - BUG/MEDIUM: h3: properly encode response after interim one in same buf - BUG/MAJOR: pools: fix default pool alignment - MINOR: ncbuf: extract common types - MINOR: ncbmbuf: define new ncbmbuf type - MINOR: ncbmbuf: implement add - MINOR: ncbmbuf: implement iterator bitmap utilities functions - MINOR: ncbmbuf: implement ncbmb_data() - MINOR: ncbmbuf: implement advance operation - MINOR: ncbmbuf: add tests as standalone mode - BUG/MAJOR: quic: use ncbmbuf for CRYPTO handling - MINOR: quic: remove received CRYPTO temporary tree storage - MINOR: stats-file: fix typo in shm-stats-file object struct size detection - MINOR: compiler: add FIXED_SIZE(size, type, name) macro - MEDIUM: freq-ctr: use explicit-size types for freq-ctr struct - BUG/MAJOR: stats-file: ensure shm_stats_file_object struct mapping consistency - BUG/MEDIUM: build: limit excessive and counter-productive gcc-15 vectorization - BUG/MEDIUM: stick-tables: Don't loop if there's nothing left - MINOR: acme: add the dns-01-record field to the sink - MINOR: acme: display the complete challenge_ready command in the logs - BUG/MEDIUM: mt_lists: Avoid el->prev = el->next = el - MINOR: quic: remove unused conn-tx-buffers limit keyword - MINOR: quic: prepare support for options on FE/BE side - MINOR: quic: rename "no-quic" to 
"tune.quic.listen" - MINOR: quic: duplicate glitches FE option on BE side - MINOR: quic: split congestion controler options for FE/BE usage - MINOR: quic: split Tx options for FE/BE usage - MINOR: quic: rename max Tx mem setting - MINOR: quic: rename retry-threshold setting - MINOR: quic: rename frontend sock-per-conn setting - BUG/MINOR: quic: split max-idle-timeout option for FE/BE usage - BUG/MINOR: quic: split option for congestion max window size - BUG/MINOR: quic: rename and duplicate stream settings - BUG/MEDIUM: applet: Improve again spinning loops detection with the new API - Revert "BUG/MAJOR: stats-file: ensure shm_stats_file_object struct mapping consistency" - Revert "MEDIUM: freq-ctr: use explicit-size types for freq-ctr struct" - Revert "MINOR: compiler: add FIXED_SIZE(size, type, name) macro" - BUG/MAJOR: stats-file: ensure shm_stats_file_object struct mapping consistency (2nd attempt) - BUG/MINOR: stick-tables: properly index string-type keys - BUILD: openssl-compat: fix build failure with OPENSSL=0 and KTLS=1 - BUG/MEDIUM: mt_list: Use atomic operations to prevent compiler optims - MEDIUM: quic: Fix build with openssl-compat - MINOR: applet: do not put SE_FL_WANT_ROOM on rcv_buf() if the channel is empty - MINOR: cli: create cli_raw_rcv_buf() from the generic applet_raw_rcv_buf() - BUG/MEDIUM: cli: do not return ACKs one char at a time - BUG/MEDIUM: ssl: Crash because of dangling ckch_store reference in a ckch instance - BUG/MINOR: ssl: Remove unreachable code in CLI function - BUG/MINOR: acl: warn if "_sub" derivative used with an explicit match - DOC: config: fix confusing typo about ACL -m ("now" vs "not") - DOC: config: slightly clarify the ssl_fc_has_early() behavior - MINOR: ssl-sample: add ssl_fc_early_rcvd() to detect use of early data - CI: disable fail-fast on fedora rawhide builds - MINOR: http: fix 405,431,501 default errorfile - BUG/MINOR: init: Do not close previously created fd in stdio_quiet - MINOR: init: Make devnullfd global and 
create it earlier in init - MINOR: init: Use devnullfd in stdio_quiet calls instead of recreating a fd everytime - MEDIUM: ssl: Add certificate password callback that calls external command - MEDIUM: ssl: Add local passphrase cache - MINOR: ssl: Do not dump decrypted privkeys in 'dump ssl cert' - BUG/MINOR: resolvers: Apply dns-accept-family setting on additional records - MEDIUM: h1: Immediately try to read data for frontend - REGTEST: quic: add ssl_reuse.vtc new QUIC test - BUG/MINOR: ssl: returns when SSL_CTX_new failed during init - MEDIUM: ssl/ech: config and load keys - MINOR: ssl/ech: add logging and sample fetches for ECH status and outer SNI - MINOR: listener: implement bind_conf_find_by_name() - MINOR: ssl/ech: key management via stats socket - CI: github: add USE_ECH=1 to haproxy for openssl-ech job - DOC: configuration: "ech" for bind lines - BUG/MINOR: ech: non destructive parsing in cli_find_ech_specific_ctx() - DOC: management: document ECH CLI commands - MEDIUM: mux-h2: do not needlessly refrain from sending data early - MINOR: mux-h2: extract the code to send preface+settings into its own function - BUG/MINOR: mux-h2: send the preface along with the first request if needed	2025-10-31 10:09:57 +01:00
Willy Tarreau	a1f26ca307	BUG/MINOR: mux-h2: send the preface along with the first request if needed Tests involving 0-RTT and H2 on the backend show that 0-RTT is being partially used but does not work. The analysis shows that only the preface and settings are sent using early-data and the request is sent separately. As explained in the previous patch, this is caused by the fact that a wakeup of the iocb is needed just to send the preface, then a new call to process_stream is needed to try sending again. Here with this patch, we're making h2_snd_buf() able to send the preface if it was not yet sent. Thanks to this, the preface, settings and first request can now leave as a single TCP segment. In case of TLS with 0-RTT, it now allows all the block to leave in early data. Even in clear-text H2, we're now seeing a 15% lower context-switch count, and the number of calls to process_stream() per connection dropped from 3 to 2. The connection rate increased by an extra 9.5%. Compared to without the last 3 patches, this is a 22% reduction of context-switches, 33% reduction of process_stream() calls, and 15.7% increase in connection rate. And more importantly, 0-RTT now really works with H2 on the backend, saving one full RTT on the first request. This fix is only for a missed optimization and a non-functional 0-RTT on the backend. It's worth backporting it, but it doesn't cause enough harm to hurry a backport. Better wait for it to live a little bit in 3.3 (till at least a week or two after the final release) before backporting it. It's not sure that it's worth going beyond 3.2 in any case. It depends on these two previous commits: MEDIUM: mux-h2: do not needlessly refrain from sending data early MINOR: mux-h2: extract the code to send preface+settings into its own function	2025-10-30 18:16:54 +01:00
Willy Tarreau	d5aa3e19cc	MINOR: mux-h2: extract the code to send preface+settings into its own function The code that deals with sending preface + settings and changing the state currently is in h2_process_mux(), but we'll want to do it as well from h2_snd_buf(), so let's move it to a dedicated function first. At this point there is no functional change.	2025-10-30 18:16:54 +01:00
Willy Tarreau	b0e8edaef2	MEDIUM: mux-h2: do not needlessly refrain from sending data early The mux currently refrains from sending data before H2_CS_FRAME_H, i.e. before the peer's SETTINGS frame was received. While it makes sense on the frontend, it's causing harm on the backend because it forces the first request to be sent in two halves over an extra RTT: first the preface and settings, second the request once the settings are received. This is totally contrary to the philosophy of the H2 protocol, consisting in permitting the client to send as soon as possible. Actually what happens is the following: - process_stream() calls connect_server() - connect_server() creates a connection, and if the proto/alpn is guessed or known, the mux is instantiated for the current request. - the H2 init code wakes the h2 tasklet up and returns - process_stream() tries to send the request using h2_snd_buf(), but that one sees that we're before H2_CS_FRAME_H, refrains from doing so and returns. - process_stream() subscribes and quits - the h2 tasklet can now execute to send the preface and settings, which leave as a first TCP segment. The connection is ready. - the iocb is woken again once the server's SETTINGS frame is received, turning the connection to the H2_CS_FRAME_H state, and the iocb wake up process_stream(). - process_stream() executes again and can try to send again. - h2_snd_buf() is called and finally sends the request as a second TCP segment. Not only this is inefficient, but it also renders 0-RTT and TFO impossible on H2 connections. When 0-RTT is used, only the preface and settings leave as early data (the very first data of that connection), which is totally pointless. In order to fix this, we have to go through a few steps: - first we need to let data be sent to a server immediately after the SETTINGS frame was sent (i.e. in H2_CS_SETTINGS1 state instead of H2_CS_FRAME_H). However, some protocol extensions are advertised by the server using SETTINGS (e.g. 
RFC8441) and some requests might need to know the existence of such extensions. For this reason we're adding a new h2c flag, H2_CF_SETTINGS_NEEDED, which indicates that some operations were not done because a server's SETTINGS frame is needed. This is set when trying to send a protocol upgrade or extended CONNECT during H2_CS_SETTINGS1, indicating that it's needed to wait for H2_CS_FRAME_H in this case. The flag is always set on frontend connections. This is what is being done in this patch. - second, we need to be able to push the preface opportunistically with the first h2_snd_buf() so that it's not needed to wake the tasklet up just to send that and wake process_stream() again. This will be in a separate patch. By doing the first step, we're at least saving one needless tasklet wakeup per connection (~9%), which results in ~5% backend connection rate increase.	2025-10-30 18:16:54 +01:00
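The sending rule introduced by the first step above can be summarised as a small decision function. This is a rough sketch under stated assumptions, not the mux-h2 code: the `H2State` enum only mirrors the ordering of the states named in the commit message, and `settings_needed()` and `may_send()` are hypothetical helpers standing in for the H2_CF_SETTINGS_NEEDED flag and the check done before sending.

```python
from enum import IntEnum

class H2State(IntEnum):
    # Simplified, assumed ordering of the connection states named in the commit.
    PREFACE = 0     # nothing sent yet
    SETTINGS1 = 1   # our preface and SETTINGS were sent, peer's not received yet
    FRAME_H = 2     # peer's SETTINGS frame received

def settings_needed(is_frontend, wants_upgrade_or_ext_connect):
    """Model of the H2_CF_SETTINGS_NEEDED flag as described in the commit:
    always set on frontend connections, and set on the backend when a
    protocol upgrade or extended CONNECT is attempted during SETTINGS1."""
    return is_frontend or wants_upgrade_or_ext_connect

def may_send(state, needed):
    """Return True if data may be sent in this state."""
    if state >= H2State.FRAME_H:
        return True
    # New behaviour: allow sending right after our own SETTINGS were sent,
    # unless the peer's SETTINGS frame is required first.
    return state == H2State.SETTINGS1 and not needed

if __name__ == "__main__":
    # Plain backend request: may leave before the server's SETTINGS arrive.
    print(may_send(H2State.SETTINGS1, settings_needed(False, False)))
    # Extended CONNECT on the backend: must wait for H2_CS_FRAME_H.
    print(may_send(H2State.SETTINGS1, settings_needed(False, True)))
```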
William Lallemand	0436062f48	DOC: management: document ECH CLI commands Document "show ssl ech", "add ssl ech", "set ssl ech" and "del ssl ech"	2025-10-30 11:59:39 +01:00
William Lallemand	f6503bd7d3	BUG/MINOR: ech: non destructive parsing in cli_find_ech_specific_ctx() cli_find_ech_specific_ctx() parses the <frontend>/<bind_conf> and sets a \0 in place of the '/'. But the original string is still used to emit messages in the CLI so we only output the frontend part. This patch does the parsing in a trash buffer instead.	2025-10-30 11:59:39 +01:00
William Lallemand	37f76c45fa	DOC: configuration: "ech" for bind lines ECH is an experimental feature which is still a draft, but already exists as a feature branch in OpenSSL. This patch explains how to configure "ech" on bind lines.	2025-10-30 10:38:46 +01:00
William Lallemand	ce413f002a	CI: github: add USE_ECH=1 to haproxy for openssl-ech job Add the USE_ECH=1 make option to the haproxy build in order to test the build of the feature.	2025-10-30 10:38:38 +01:00
sftcd	9aacb684cd	MINOR: ssl/ech: key management via stats socket This patch extends the ECH support by adding runtime CLI commands to view and modify ECH configurations. New commands are added to the HAProxy CLI: - "show ssl ech [<name>]" displays all ECH configurations or a specific one. - "add ssl ech <name> <payload>" adds a new PEM-formatted ECH configuration. - "set ssl ech <name> <payload>" replaces all existing ECH configurations. - "del ssl ech <name> [<age-in-secs>]" removes ECH configurations, optionally filtered by age.	2025-10-30 10:38:31 +01:00
William Lallemand	1e2f920be6	MINOR: listener: implement bind_conf_find_by_name() Returns a pointer to the first bind_conf matching <name> in a frontend <front>. When name is prefixed by a @ (@<filename>:<linenum>), it tries to look for the corresponding filename and line of the configuration file. NULL is returned if no match is found.	2025-10-30 10:37:42 +01:00
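The lookup rule described for bind_conf_find_by_name() can be sketched as follows. This is a hypothetical model only: the dictionaries standing in for bind_conf entries, their `name`, `file` and `line` keys, and the `find_bind_conf()` helper are assumptions made for illustration and do not reflect the actual C structures or signature.

```python
def find_bind_conf(binds, name):
    """Return the first entry matching name, or None.

    A name of the form "@<filename>:<linenum>" is matched against the
    configuration file and line where the bind line was declared.
    """
    if name.startswith("@"):
        # Split on the last ':' so that file names containing ':' still work.
        filename, sep, linenum = name[1:].rpartition(":")
        if not sep or not linenum.isdigit():
            return None
        for bind in binds:
            if bind["file"] == filename and bind["line"] == int(linenum):
                return bind
        return None
    for bind in binds:
        if bind.get("name") == name:
            return bind
    return None

if __name__ == "__main__":
    binds = [
        {"name": "public", "file": "haproxy.cfg", "line": 12},
        {"name": None, "file": "haproxy.cfg", "line": 13},
    ]
    print(find_bind_conf(binds, "public"))
    print(find_bind_conf(binds, "@haproxy.cfg:13"))
```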
sftcd	23f5cbb411	MINOR: ssl/ech: add logging and sample fetches for ECH status and outer SNI This patch adds functions to expose Encrypted Client Hello (ECH) status and outer SNI information for logging and sample fetching. Two new helper functions are introduced in ech.c: - conn_get_ech_status() places the ECH processing status string into a buffer. - conn_get_ech_outer_sni() retrieves the outer SNI value if ECH succeeded. Two new sample fetch keywords are added: - "ssl_fc_ech_status" returns the ECH status string. - "ssl_fc_ech_outer_sni" returns the outer SNI value seen during ECH. These allow ECH information to be used in HAProxy logs, ACLs, and captures.	2025-10-30 10:37:30 +01:00
sftcd	dba4fd248a	MEDIUM: ssl/ech: config and load keys This patch introduces the USE_ECH option in the Makefile to enable support for Encrypted Client Hello (ECH) with OpenSSL. A new function, load_echkeys, is added to load ECH keys from a specified directory. The SSL context initialization process in ssl_sock.c is updated to load these keys if configured. A new configuration directive, `ech`, is introduced to allow users to specify the ECH key directory in the listener configuration.	2025-10-30 10:37:12 +01:00
William Lallemand	83e3cbc262	BUG/MINOR: ssl: returns when SSL_CTX_new failed during init In ssl_sock_initial_ctx(), return when SSL_CTX_new() fails instead of trying to apply anything on the ctx. This may avoid crashing when there's not enough memory anymore during configuration parsing. Could be backported to every haproxy version.	2025-10-30 10:36:56 +01:00
Frederic Lecaille	2f621aa52e	REGTEST: quic: add ssl_reuse.vtc new QUIC test Note that this test does not work with OpenSSL 3.5.0 QUIC API because the callback set by SSL_CTX_sess_set_new_cb() (ssl_sess_new_srv_cb()) is not called (at least for QUIC clients). The role of this new QUIC test is to run the same SSL/TCP test as reg-tests/ssl/ssl_reuse.vtc but with QUIC connections where applicable (only with TLSv1.3). To do so, this QUIC test uses the "include" vtc command to run ssl/ssl_reuse.vtc. It also sets the VTC_SOCK_TYPE environment variable with the "setenv" command and "quic" as value. This will ask vtest2 to use QUIC sockets for all "fd@{...}" addresses prefixed by "${VTC_SOCK_TYPE}+" socket type if VTC_SOCK_TYPE value is "quic". The SSL/TCP test is modified to set this environment variable with "setenv -ifunset" from ssl/ssl_reuse.vtc with "stream" as value, if it is not already set. vtest2 must be used with this patch to support this new QUIC test: `9aa4d498db` Thanks to this latter patch, vtest2 retrieves the VTC_SOCK_TYPE environment variable value, then it parses the vtc file to retrieve all the fd addresses prefixed by "${VTC_SOCK_TYPE}+" and creates a QUIC socket or a TCP socket depending on this variable value.	2025-10-30 08:33:54 +01:00
Olivier Houchard	b3d6f44af8	MEDIUM: h1: Immediately try to read data for frontend In h1_init(), if we're a frontend connection, immediately attempt to read data, if the connection is ready, instead of just subscribing. There may already be data available, at least if we're using 0RTT. This may be backported up to 2.8 in a while, after 3.3 is released, so that if it causes problems, we have a chance to hear about it.	2025-10-29 17:18:26 +01:00
Christopher Faulet	c84c15d393	BUG/MINOR: resolvers: Apply dns-accept-family setting on additional records dns-accept-family setting was only evaluated for responses to A / AAAA DNS queries. It was ignored when additional records in SRV responses were parsed. With this patch, when an SRV response is parsed, additional records not matching the dns-accept-family setting are ignored, as expected. This patch must be backported to 3.2.	2025-10-29 11:20:27 +01:00
Remi Tricot-Le Breton	dc35a3487b	MINOR: ssl: Do not dump decrypted privkeys in 'dump ssl cert' A private key that is password protected and was decoded during init using the password obtained via 'ssl-passphrase-cmd' should not be dumped via the 'dump ssl cert' CLI command.	2025-10-29 10:54:17 +01:00
Remi Tricot-Le Breton	5a036d223b	MEDIUM: ssl: Add local passphrase cache Instead of calling the external password command for all loaded encrypted certificates, we will keep a local password cache. The passwords won't be stored as plain text, they will be stored obfuscated into the password cache. The obfuscation is simply based on a XOR'ing with a random number built during init. After init is performed, the password cache is overwritten and freed so that no dangling info allowing to dump the passwords remains.	2025-10-29 10:54:17 +01:00
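The cache behaviour described in the commit above (obfuscated storage, then overwrite and free after init) can be illustrated with a short sketch. It is not the HAProxy implementation: the `PassphraseCache` class, its method names and the 32-byte random mask are assumptions chosen for the example, and XOR masking is obfuscation only, not encryption.

```python
import secrets

class PassphraseCache:
    """Hypothetical passphrase cache storing XOR-obfuscated entries."""

    def __init__(self):
        # Random mask built once at init (assumed size: 32 bytes).
        self._mask = secrets.token_bytes(32)
        self._entries = {}

    def _xor(self, data):
        # XOR is its own inverse, so the same routine masks and unmasks.
        mask = self._mask
        return bytearray(b ^ mask[i % len(mask)] for i, b in enumerate(data))

    def store(self, cert_path, passphrase):
        self._entries[cert_path] = self._xor(passphrase.encode())

    def lookup(self, cert_path):
        blob = self._entries.get(cert_path)
        return None if blob is None else bytes(self._xor(blob)).decode()

    def wipe(self):
        # Overwrite each stored entry before dropping it, as done after init.
        for blob in self._entries.values():
            for i in range(len(blob)):
                blob[i] = 0
        self._entries.clear()

if __name__ == "__main__":
    cache = PassphraseCache()
    cache.store("/etc/haproxy/site.pem", "s3cret")
    print(cache.lookup("/etc/haproxy/site.pem"))
    cache.wipe()
    print(cache.lookup("/etc/haproxy/site.pem"))
```

The point of the cache is that the external command only needs to run once per distinct passphrase rather than once per encrypted certificate.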
Remi Tricot-Le Breton	478dd7bad0	MEDIUM: ssl: Add certificate password callback that calls external command When a certificate is protected by a password, we can provide the password via the dedicated pem_password_cb param provided to PEM_read_bio_PrivateKey. HAProxy will fetch the password automatically during init by calling a user-defined external command that should dump the right password on its standard output (see new 'ssl-passphrase-cmd' global option).	2025-10-29 10:54:17 +01:00
Remi Tricot-Le Breton	a011683622	MINOR: init: Use devnullfd in stdio_quiet calls instead of recreating a fd everytime Since commit "65760d MINOR: init: Make devnullfd global and create it earlier in init" the devnullfd file descriptor pointing to /dev/null is created regardless of the process's parameters so we can use it in all 'stdio_quiet' calls instead of recreating an FD.	2025-10-29 10:54:17 +01:00
Remi Tricot-Le Breton	1ec59d3426	MINOR: init: Make devnullfd global and create it earlier in init The devnull fd might be needed during configuration parsing, if some options require to fork/exec for instance. So we now create it much earlier in the init process and without depending on the '-q' or '-d' parameters.	2025-10-29 10:54:17 +01:00
Remi Tricot-Le Breton	c606ff45a0	BUG/MINOR: init: Do not close previously created fd in stdio_quiet During init we were calling 'stdio_quiet' and passing the previously created 'devnullfd' file descriptor. But the fd passed to 'stdio_quiet' was also closed afterwards, which raised an error (EBADF). If we refrain from closing FDs that were opened outside of the 'stdio_quiet' function we will let the caller manage its FD and avoid double close calls. This patch can be backported to all stable branches.	2025-10-29 10:54:17 +01:00
Huangbin Zhan	ad9a24ee55	MINOR: http: fix 405,431,501 default errorfile A few typos were present in the default errorfiles for the status codes above (missing dot at the end of the sentence, extra closing bracket). This fixes them. This can be backported.	2025-10-29 08:47:19 +01:00
Ilia Shipitsin	9781d91e4d	CI: disable fail-fast on fedora rawhide builds Previously builds were dependent in the sense that if one failed, the others were stopped. By their nature those builds are independent, so let's not fail them altogether.	2025-10-29 08:15:01 +01:00
Willy Tarreau	18b27bfec9	MINOR: ssl-sample: add ssl_fc_early_rcvd() to detect use of early data We currently have ssl_fc_has_early() which says that early data are still unconfirmed by a final handshake, but nothing to see if a client has been able to use early data at all, which is a problem because such mechanisms generally depend on multiple factors and it's hard to know when they start to work. This new sample fetch function will indicate that some early data were seen over that front connection, i.e. this can be used to confirm that at some point the client was able to push some. This is essentially a debugging tool that has no practical use case other than debugging.	2025-10-29 08:13:29 +01:00
Willy Tarreau	765d49b680	DOC: config: slightly clarify the ssl_fc_has_early() behavior Clarify that it's about handshake completion, and also mention that the action to be used to wait for the handshake is "wait-for-handshake", which was not mentioned. This can be backported though it's very minor.	2025-10-29 08:13:29 +01:00
Willy Tarreau	20174ca143	DOC: config: fix confusing typo about ACL -m ("now" vs "not") A one-letter typo in the doc update coming with commit 6ea50ba462 ("MINOR: acl; Warn when matching method based on a suffix is overwritten") inverts the meaning of the sentence. It was "is not allowed" and not "is now allowed". Needs to be backported only if the commit above ever is (unlikely).	2025-10-29 08:13:29 +01:00
Amaury Denoyelle	7f2ae10920	BUG/MINOR: acl: warn if "_sub" derivative used with an explicit match Recently, a new warning is displayed when an ACL derivative match method is overridden by another '-m' method. This is implemented via the following patch: 6ea50ba462692d6dcf301081f23cab3e0f6086e4 MINOR: acl; Warn when matching method based on a suffix is overwritten However, this warning was not reported when "_sub" suffix was specified. Fix this by adding PAT_MATCH_SUB in the warning comparison. No backport needed except if above commit is.	2025-10-28 11:59:32 +01:00
Remi Tricot-Le Breton	89b43740e3	BUG/MINOR: ssl: Remove unreachable code in CLI function Remove unreachable code in 'cli_parse_show_jwt' function. This bug was raised in GitHub #3159. This patch does not need to be backported.	2025-10-28 10:44:51 +01:00
Remi Tricot-Le Breton	7482b6ebf0	BUG/MEDIUM: ssl: Crash because of dangling ckch_store reference in a ckch instance When updating CAs via the CLI, we need to create new copies of all the impacted ckch instances (i.e. those referenced in the ckch_inst_link list of the updated CA) in order to use them instead of the old ones once the update is completed. This relies on the ckch_inst_rebuild function that would set the ckch_store field of the ckch_inst. But we forgot to also add the newly created instances in the ckch_inst list of the corresponding ckch_store. When updating a certificate afterwards, we iterate over all the instances linked in the ckch_inst list of the ckch_store (which is missing some instances because of the previous command) and rebuild the instances before replacing the ckch_store. The previous ckch_store, still referenced by the dangling ckch instance, then gets deleted which means that the instance keeps a reference to a free'd object. Then if we were to once again update the CA file, we would iterate over the ckch instances referenced in the cafile_entry's ckch_inst_link list, which includes the first mentioned ckch instance with the dead ckch_store reference. This ends up crashing during the ckch_inst_rebuild operation. This bug was raised in GitHub #3165. This patch should be backported to all stable branches.	2025-10-28 10:43:45 +01:00
Willy Tarreau	2d7e3ddd4a	BUG/MEDIUM: cli: do not return ACKs one char at a time Since 3.0 where the CLI started to use rcv_buf, it appears that some external tools sending chained commands are randomly experiencing failures. Each time this happens when the whole command is sent as a single packet, immediately followed by a close. This is not a correct way to use the CLI but this has been working for ages for simple netcat-based scripts, so we should at least try to preserve this. The cause of the failure is that the first LF that acks a command is immediately sent back to the client and rejected due to the closed connection. This in turn forwards the error back to the applet which aborts its processing. Before 3.0 the responses would be queued into the buffer, then sent back to the channel, and would all fail at once. This changed when snd_buf/rcv_buf were implemented because the applets are much more responsive and since they yield between each command, they can deliver one ACK at a time that is immediately forwarded down the chain. An easy way to observe the problem is to send 5 map updates, a shutdown, and immediately close via tcploop, and in parallel run a periodic "show map" to count the number of elements: $ tcploop -U /tmp/sock1 C S:"add map #0 1 1; add map #0 2 2; add map #0 3 3; add map #0 4 4; add map #0 5 5\n" F K Before 3.0, there would always be 5 elements. Since 3.0 and before 20ec1de214 ("MAJOR: cli: Refacor parsing and execution of pipelined commands"), almost always 2. And since that commit above in 3.2, almost always one. Doing the same using socat or netcat shows almost always 5... It's entirely timing-dependent, and might even vary based on the RTT between the client and haproxy! The approach taken here consists in doing the same principle as MSG_MORE or Nagle but on the response buffer: the applet doesn't need to send a single ACK for each command when it has already been woken up and is scheduled to come back to work. 
It's fine (and even desirable) that ACKs are grouped in a single packet as much as possible. For this reason, this patch implements APPCTX_CLI_ST1_YIELD, a new CLI flag which indicates that the applet left in yielding condition, i.e. it has not finished its work. This flag is used by .rcv_buf to hold pending data. This way we won't return partial responses for no reason, and we can continue to emulate the previous behavior. One very nice benefit to this is that it saves huge amounts of CPU on the client. In the test below that tries to update 1M map entries, the CPU used by socat went from 100% to 0% and the total transfer time dropped by 28%: before: $ time awk 'BEGIN{ printf "prompt i\n"; for (i=0;i<1000000;i++) { \ printf "add map #0 %d %d\n",i,i,i }}' \| socat /tmp/sock1 - >/dev/null real 0m2.407s user 0m1.485s sys 0m1.682s after: $ time awk 'BEGIN{ printf "prompt i\n"; for (i=0;i<1000000;i++) { \ printf "add map #0 %d %d\n",i,i,i }}' \| socat /tmp/sock1 - >/dev/null real 0m1.721s user 0m0.952s sys 0m0.057s The difference is also quite visible on the number of syscalls during the test (for 1k updates): before: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 100.00 0.071691 0 100001 sendmsg after: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 100.00 0.000011 1 9 sendmsg This patch will need to be backported to 3.0, and depends on these two patches to be backported as well: MINOR: applet: do not put SE_FL_WANT_ROOM on rcv_buf() if the channel is empty MINOR: cli: create cli_raw_rcv_buf() from the generic applet_raw_rcv_buf()	2025-10-27 16:57:07 +01:00
Willy Tarreau	f38ea2731b	MINOR: cli: create cli_raw_rcv_buf() from the generic applet_raw_rcv_buf() This is in preparation for a future fix. For now it's simply a pure copy of the original function, but dedicated to the CLI. It will have to be backported to 3.0.	2025-10-27 16:57:07 +01:00
Willy Tarreau	35106d65fb	MINOR: applet: do not put SE_FL_WANT_ROOM on rcv_buf() if the channel is empty appctx_rcv_buf() prepares all the work to schedule the transfers between the applet and the channel, and it takes care of setting the various flags that indicate what condition is blocking the transfer from progressing. There is one limitation though. In case an applet refrains from sending data (e.g. rate-limited, prefers to aggregate blocks etc), it will leave a possibly empty channel buffer, and keep some data in its outbuf. The data in its outbuf will be seen by the function above as an indication of a channel full condition, so it will place SE_FL_WANT_ROOM. But later, sc_applet_recv() will see this flag with a possibly empty channel, and will rightfully trigger a BUG_ON(). appctx_rcv_buf() should be more accurate in fact. It should only set SE_FL_RCV_MORE when more data are present in the applet, then it should either set or clear SE_FL_WANT_ROOM depending on whether the channel is empty or not. Right now it doesn't seem possible to trigger this condition in the current state of applets, but this will become possible with a future bugfix that will have to be backported, so this patch will need to be backported to 3.0.	2025-10-27 16:57:07 +01:00
Olivier Houchard	259b1e1c18	MEDIUM: quic: Fix build with openssl-compat As the QUIC options have been split into backend and frontend, there is no more GTUNE_QUIC_LISTEN_OFF to be found in global.tune.options, look for QUIC_TUNE_FE_LISTEN_OFF in quic_tune.fe instead. This should fix the build with USE_QUIC and USE_QUIC_OPENSSL_COMPAT.	2025-10-24 13:51:15 +02:00
Olivier Houchard	837351245a	BUG/MEDIUM: mt_list: Use atomic operations to prevent compiler optims As a follow-up to f40f5401b9f24becc6fdd2e77d4f4578bbecae7f, explicitly use atomic operations to set the prev and next fields, to make sure the compiler can't assume anything about it, and just does it. This should be backported after f40f5401b9 up to 2.8.	2025-10-24 13:34:41 +02:00
Willy Tarreau	2ec6df59bf	BUILD: openssl-compat: fix build failure with OPENSSL=0 and KTLS=1 The USE_KTLS test is currently being done outside of the USE_OPENSSL guard so disabling USE_OPENSSL still results in build failures on libcs built with support for kernels before 4.17, because we enable KTLS by default on linux. Let's move the KTLS block inside the USE_OPENSSL guard instead. No backport is needed since KTLS is only in 3.3.	2025-10-24 10:45:02 +02:00
Willy Tarreau	1824079fca	BUG/MINOR: stick-tables: properly index string-type keys This is one of the rare pleasant surprises of fixing an almost 16-years old bug that remained unnoticed since the feature was implemented. In 1.4-dev7, commit 3bd697e071 ("[MEDIUM] Add stick table (persistence) management functions and types") introduced stick-tables with multiple key types, including strings, IP addresses and integers. Entries are coded in binary and their binary representation is indexed. A special case was made for strings in order to index them as zero-terminated strings. However, there's one subtlety. While strings indeed have a zero appended, they're still indexed using ebmb_insert(), which means that all the bytes till the configured size are indexed as well. And while these bytes generally come from a temporary storage that often contains zeroes, or that is longer than the configured string length and will result in truncation, it's not always the case and certain traffic patterns with certain configurations manage to occasionally present unpadded strings resulting in apparent duplicate keys appearing in the dump, as shown in GH issue #3161. It seems to be essentially reproducible at boot, and not to be particularly affected by mixed patterns. These keys are in fact not exact duplicates in memory, but everywhere they're used (including during synchronization), they are equal. What's interesting is that when this happens, one key can be presented to a peer with its own data and will be indexed as the only one, possibly replacing contents from the previous key, which might replace them again later once updated in turn. This is visible in the dump of the issue above, where key "localhost:8001" was split into two entries, one with a request count of one and the other with a request count of 499999, and indeed, all peers see only that last value, which overwrote the first one. This fix must be backported to all stable branches. 
Special kudos to Mark Wort for underlining that one.	2025-10-24 10:15:11 +02:00
Aurelien DARRAGON	d655ed5f14	BUG/MAJOR: stats-file: ensure shm_stats_file_object struct mapping consistency (2nd attempt) This is a second attempt at fixing issues on 32bits systems which would trigger the following BUG_ON() statement: FATAL: bug condition "sizeof(struct shm_stats_file_object) != 544" matched at src/stats-file.c:825 shm_stats_file_object struct size changed, is is part of the exported API: ensure all precautions were taken (ie: shm_stats_file version change) before adjusting this This is a drop-in replacement for d30b88a6c + 4693ee0ff, as suggested by Willy. Indeed, on supported platforms unsigned int can be assumed to be 4 bytes long, and long can be assumed to be 8 bytes long. As such, the previous attempt was overkill and added unnecessary maintenance complexity which could result in bugs if not used properly. Moreover, it would only partially solve the issue, since on little endian vs big endian architectures, the provisioned memory areas (originating from the same shm stats file) could be read differently by the host. Instead we fix the alignment issues, and this alone helps to ensure struct memory consistency on 64 vs 32bits platforms. It was tested on both i386 and i586. last_change and last_sess counters are now stored as unsigned int, as it helped to fix the alignment issues and they were found to be used as 32bits integers anyway. Thanks to Willy for problem analysis and the patch proposal. No backport needed.	2025-10-24 09:35:38 +02:00
Aurelien DARRAGON	a931779dde	Revert "MINOR: compiler: add FIXED_SIZE(size, type, name) macro" This reverts commit 466a603b59ed77e9787398ecf1baf77c46ae57b1. Due to the last 2 commits, this macro is now unused, and will probably never be used, so let's get rid of that for now.	2025-10-24 09:35:34 +02:00
Aurelien DARRAGON	8277f891d2	Revert "MEDIUM: freq-ctr: use explicit-size types for freq-ctr struct" This reverts commit 4693ee0ff7a5fa4a12ff69b1a33adca142e781ac. As discussed in GH #3168, this works but it is not the proper way to fix the issue. See following commits.	2025-10-24 09:35:29 +02:00
Aurelien DARRAGON	c0d952ccc1	Revert "BUG/MAJOR: stats-file: ensure shm_stats_file_object struct mapping consistency" This reverts commit d30b88a6cc47d662e92b524ad5818be312401d0e. As discussed in GH #3168, this works but it is not the proper way to fix the issue. See following commits.	2025-10-24 09:35:25 +02:00
Christopher Faulet	854888497e	BUG/MEDIUM: applet: Improve again spinning loops detection with the new API A first attempt to fix this issue was already pushed (54b7539d6 "BUG/MEDIUM: apppet: Improve spinning loop detection with the new API"). But it was not fully accurate. Indeed, we must check if something was received or sent by the applet before incrementing the call rate. But we must also make sure the applet is allowed to receive or send data. That is what is performed in this patch. This patch must be backported as far as 3.0 with the patch above.	2025-10-24 09:26:10 +02:00
Amaury Denoyelle	7ba4b0ad5f	BUG/MINOR: quic: rename and duplicate stream settings Several settings can be set to control stream multiplexing and associated receive window. Previously, all of these settings were configured using prefix "tune.quic.frontend.", despite being applied blindly on both sides. Fix this by duplicating these settings specific to frontend and backend side. Options are also renamed to use the standardized prefix "tune.quic.[be|fe].stream." notation. Also, each option is individually renamed to better reflect its purpose and hide technical details relative to QUIC transport parameter naming : * max-data-size -> stream.rxbuf * max-streams-bidi -> stream.max-concurrent * stream-data-ratio -> stream.data-ratio No need to backport.	2025-10-23 16:49:20 +02:00
Amaury Denoyelle	d5142706f8	BUG/MINOR: quic: split option for congestion max window size	2025-10-23 16:49:20 +02:00
Amaury Denoyelle	33afba0dda	BUG/MINOR: quic: split max-idle-timeout option for FE/BE usage Streamline max-idle-timeout option. Rename it to use the newer cohesive naming scheme 'tune.quic.fe|be.'. Two different fields were already defined in global struct. These fields are moved into quic_tune along with other QUIC settings. However, no parser was defined for the backend option; this commit fixes this. No need to backport this.	2025-10-23 16:49:20 +02:00
Amaury Denoyelle	5bc659a4a2	MINOR: quic: rename frontend sock-per-conn setting On frontend side, a quic_conn can have a dedicated FD or use the listener one. These different modes can be activated via a global QUIC tune setting. This patch adjusts the option. First, it is renamed to the more meaningful name 'tune.quic.fe.sock-per-conn'. Also, arguments are now either 'default-on' or 'force-off'. The objective is to better highlight the relationship with the 'quic-socket' bind option. The older option is deprecated and will be removed in 3.5.	2025-10-23 16:49:20 +02:00
Amaury Denoyelle	a14c6cee17	MINOR: quic: rename retry-threshold setting A QUIC global tune setting is defined to be able to force Retry emission prior to handshake. By definition, this ability is only supported by QUIC servers, hence it is a frontend option only. Rename the option to use "fe" prefix. The old option name is deprecated and will be removed in 3.5	2025-10-23 16:49:20 +02:00
Amaury Denoyelle	d248c5bd21	MINOR: quic: rename max Tx mem setting QUIC global memory can be limited across the entire process via a global tune setting. Previously, this setting used the misleading "frontend" prefix. As this is applied as a sum between all QUIC connections, both from frontend and backend sides, remove the prefix. The new option name is "tune.quic.mem.tx-max". The older option name is deprecated and will be removed in 3.5.	2025-10-23 16:49:20 +02:00
Amaury Denoyelle	9bfe9b9e21	MINOR: quic: split Tx options for FE/BE usage This patch is similar to the previous one, except that it is focused on Tx QUIC settings. It is now possible to toggle GSO and pacing on frontend and backend sides independently. As with the previous patch, options are renamed to use "fe/be" unified prefixes. This is part of the current series of commits which unify QUIC settings. Older options are deprecated and will be removed in the 3.5 release.	2025-10-23 16:49:20 +02:00
Amaury Denoyelle	33a8cb87a9	MINOR: quic: split congestion controler options for FE/BE usage Various settings related to the QUIC congestion controller can be configured. This patch duplicates them to be able to set independent values on frontend and backend sides. As with the previous patch, options are renamed to use "fe/be" unified prefixes. This is part of the current series of commits which unify QUIC settings. Older options are deprecated and will be removed in the 3.5 release.	2025-10-23 16:49:20 +02:00
Amaury Denoyelle	7640e9a9ee	MINOR: quic: duplicate glitches FE option on BE side Previously, QUIC glitches support was only implemented for frontend side. Extend this so that the option can be specified separately both on frontend and backend sides. Function _qcc_report_glitch() now retrieves the relevant max value based on connection side. In addition to this, the option has been renamed to use "fe/be" prefixes. This is part of the current series of commits which unify QUIC settings. Older options are deprecated and will be removed in the 3.5 release.	2025-10-23 16:49:20 +02:00
Amaury Denoyelle	b34cd0b506	MINOR: quic: rename "no-quic" to "tune.quic.listen" Rename the option to quickly enable/disable all QUIC listeners. It now takes an argument on/off. The documentation is extended to reflect the fact that QUIC backends are not impacted by this option. The older keyword is simply removed. Deprecation is considered unnecessary as this setting is only useful during debugging.	2025-10-23 16:47:58 +02:00
Amaury Denoyelle	42e5ec6519	MINOR: quic: prepare support for options on FE/BE side A major reorganization of QUIC settings is going to be performed. One of its objectives is to clearly define options which can be separately configured on frontend and backend proxy sides. To implement this, quic_tune structure is extended to support fe and be options. A set of macros/functions is also defined : it allows retrieving an option defined on both sides with unified code, based on proxy side of a quic_conn/connection instance.	2025-10-23 15:06:01 +02:00
Amaury Denoyelle	cf3cf7bdda	MINOR: quic: remove unused conn-tx-buffers limit keyword Remove parsing code for tune.quic.frontend.conn-tx-buffers.limit. This option was deprecated for some time and in fact was a no-op and not mentioned anymore in the documentation.	2025-10-23 15:06:01 +02:00
Olivier Houchard	f40f5401b9	BUG/MEDIUM: mt_lists: Avoid el->prev = el->next = el Avoid setting both el->prev and el->next on the same line. The goal is to set both el->prev and el->next to el, but a naive compiler, such as when we're using -O0, will set el->next first, then will set el->prev to the value of el->next, but if we're unlucky, el->next will have been set to something else by another thread. So explicitly set both to what we want. This should be backported up to 2.8.	2025-10-23 14:43:51 +02:00
William Lallemand	d0f9515e5c	MINOR: acme: display the complete challenge_ready command in the logs When using a wildcard DNS domain in the ACME configuration, for example *.example.com, one might think that the challenge_ready command must be used with this domain. But that's not the case: the challenge_ready command takes the domain requested by the ACME server, which is stripped of the wildcard. To make this clearer, the log message now shows exactly the command the user should send.	2025-10-23 11:14:07 +02:00
William Lallemand	861fe53204	MINOR: acme: add the dns-01-record field to the sink The dns-01-record field in the dpapi sink outputs the authentication token which is needed in the TXT record in order to validate the DNS-01 challenge.	2025-10-23 11:14:07 +02:00
Olivier Houchard	dfe866fa98	BUG/MEDIUM: stick-tables: Don't loop if there's nothing left Before waking up the expiration task again at the end of it, make sure the next date is set. If there's nothing left to do, then task_exp will be TASK_ETERNITY and we then don't want to be woken up again.	2025-10-23 10:51:52 +02:00
Willy Tarreau	871c80505c	BUG/MEDIUM: build: limit excessive and counter-productive gcc-15 vectorization In https://bugs.gentoo.org/964719, Dan Goodliffe reported that using CFLAGS="-O3 -march=westmere" creates a binary that segfaults on startup with gcc-15. This could be reproduced here, is isolated to gcc-15 and -O3, and is caused by gcc emitting "movdqa" instructions to read unaligned longs taken from chars that were carefully isolated within ifdefs checking for support for unaligned integers on the platform... Some experiments showed that changing all casts all over the code using either typedef-enforced align(1) or using the packed union trick does the job, it needs a more in-depth validation since it's obvious that it doesn't produce the same code at all (at least on more modern machines). However, the offending optimization option could be isolated, it's "-fvect-cost-model=dynamic" which causes this, while -O2 uses "-fvect-cost-model=very-cheap". Turning it back to very-cheap solves the issue, reduces the code, and yields an extra 5% performance increase on the http-request rate (181k vs 172k on a single core)! This could at least partially explain why it has been observed several times over the last few years that -O3 yields bigger and slower code than -O2. It was also verified that the option doesn't change the emitted code at -O0..-O2,-Os,-Oz, but only at -O3. This patch detects the presence of this option and turns it on to address the problem that some distros are facing after an upgrade to gcc-15. As such it should be backported to recent LTS and stable branches. Here, 3.1 was used, so it seems legit to at least target the last two LTS branches (i.e. go as far as 3.0). Thanks to Dan Goodliffe for sharing a working reproducer, Sam James for starting the investigations and Christian Ruppert for bringing the issue to us.	2025-10-23 10:06:52 +02:00
Aurelien DARRAGON	d30b88a6cc	BUG/MAJOR: stats-file: ensure shm_stats_file_object struct mapping consistency As reported by @tianon on GH #3168, running haproxy on 32bits i386 platform would trigger the following BUG_ON() statement: FATAL: bug condition "sizeof(struct shm_stats_file_object) != 544" matched at src/stats-file.c:825 shm_stats_file_object struct size changed, is is part of the exported API: ensure all precautions were taken (ie: shm_stats_file version change) before adjusting this In fact, some efforts were already taken to ensure shm_stats_file_object struct size remains consistent on 64 vs 32 bits platforms, since shm_stats_file_object is part of the public API and directly exposed in the stats file. However, some parts were overlooked: some structs that are embedded in shm_stats_file_object struct itself weren't using fixed-width integers, and would sometimes be unaligned. The result of this is that it was up to the compiler (platform-dependent) to choose how to deal with such ambiguities, which could cause the struct mapping/size to be inconsistent from one platform to another. Fortunately this was caught by the BUG_ON() statement and with the precious help of @tianon. To fix this, we now use fixed-width integers everywhere for members (and submembers) of shm_stats_file_object struct, and we use explicit padding where missing to avoid automatic padding when we don't expect one. As for the previous commit, we leverage FIXED_SIZE() and FIXED_SIZE_ARRAY() macro to set the expected width for each integer without causing build issues on platforms that don't support larger integers. No backport needed, this feature was introduced during 3.3-dev.	2025-10-22 20:52:22 +02:00
Aurelien DARRAGON	4693ee0ff7	MEDIUM: freq-ctr: use explicit-size types for freq-ctr struct freq-ctr struct is used by the shm_stats_file API, and more precisely, it is used in the shm_stats_file_object struct for counters. shm_stats_file_object struct needs to be platform-independent, thus we switch to using explicit size types (AKA fixed width integer types) for freq-ctr, in the attempt to make freq-ctr size and memory mapping consistent from one platform to another. We cannot simply use fixed-width integers because some of them are involved in atomic operations, and forcing a given width could cause build issues on some platforms where atomic ops are not implemented for large integers. Instead we leverage the FIXED_SIZE macro to keep handling the integers as before, but forcing them to be stored using the expected number of bytes (unused bytes will simply be ignored). No change of behavior should be expected.	2025-10-22 20:52:18 +02:00
Aurelien DARRAGON	466a603b59	MINOR: compiler: add FIXED_SIZE(size, type, name) macro FIXED_SIZE() macro can be used to instruct the compiler that the struct member named <name>, handled as <type>, must be stored using <size> bytes, even if the type used is actually smaller than the expected size. FIXED_SIZE_ARRAY() is similar to FIXED_SIZE() but for arrays: it takes an extra argument which is the number of members. They may be used for portability concerns to ensure a structure mapping remains consistent between platforms.	2025-10-22 20:52:12 +02:00
Aurelien DARRAGON	1e4dbebef2	MINOR: stats-file: fix typo in shm-stats-file object struct size detection As reported by @TimWolla on GH #3168, there was a typo in shm stats file BUG_ON to report that the size of shm_stats_file_object changed. No backport needed.	2025-10-22 20:52:08 +02:00
Amaury Denoyelle	f50425c021	MINOR: quic: remove received CRYPTO temporary tree storage The previous commit switched from ncbuf to ncbmbuf as storage for received CRYPTO frames. The latter ensures that buffering of such frames cannot fail anymore due to gaps size. Previously, extra mechanisms were implemented on QUIC frames parsing function to overcome the limitation of ncbuf on gaps size. Before insertion, CRYPTO frames were stored in a temporary tree to order their insertion. As this is not necessary anymore, this commit removes the temporary tree insertion. This commit is closely associated to the previous bug fix. As it provides a neat optimization and code simplification, it can be backported with it, but not in the next immediate release to spot potential regression.	2025-10-22 15:24:02 +02:00
Amaury Denoyelle	4c11206395	BUG/MAJOR: quic: use ncbmbuf for CRYPTO handling In QUIC, TLS handshake messages such as ClientHello are encapsulated in CRYPTO frames. Each QUIC implementation can split the content in several frames of random sizes. In fact, this feature is now used by several clients, based on chrome so-called "Chaos protection" mechanism : https://quiche.googlesource.com/quiche/+/cb6b51054274cb2c939264faf34a1776e0a5bab7 To support this, haproxy uses a ncbuf storage to store received CRYPTO frames before passing it to the SSL library. However, this storage suffers from a limitation as gaps between two filled blocks cannot be smaller than 8 bytes. Thus, depending on the size of received CRYPTO frames and their order, ncbuf may not be sufficient. Over time, several mechanisms were implemented in haproxy QUIC frames parsing to overcome the ncbuf limitation. However, reports recently highlight that with some clients haproxy is not able to deal with CRYPTO frames reception. In particular, this is the case with the latest ngtcp2 release, which implements a similar chaos protection mechanism via the following patch. It also seems that this impacts haproxy interaction with firefox. commit 89c29fd8611d5e6d2f6b1f475c5e3494c376028c Author: Tatsuhiro Tsujikawa <tatsuhiro.t@gmail.com> Date: Mon Aug 4 22:48:06 2025 +0900 Crumble Client Initial CRYPTO (aka chaos protection) To fix haproxy CRYPTO frames buffering once and for all, an alternative non-contiguous buffer named ncbmbuf has been recently implemented. This type does not suffer from gaps size limitation, albeit at the cost of a small reduction in the size available for data storage. Thus, the purpose of this current patch is to replace ncbuf with the newer ncbmbuf for QUIC CRYPTO frames parsing. Now, ncbmb_add() is used to buffer received frames which is guaranteed to succeed. 
The only remaining case of error is if a received frame offset and length exceed the ncbmbuf data storage, which would result in a CRYPTO_BUFFER_EXCEEDED error code. A notable behavior change when switching to ncbmbuf implementation is that NCB_ADD_COMPARE mode cannot be used anymore during add. Instead, crypto frame content received at a similar offset will be overwritten. A final note regarding STREAM frames parsing. For now, it is considered unnecessary to switch from ncbuf in this case. Indeed, QUIC clients do not perform aggressive fragmentation for them. Keeping ncbuf ensures that the data storage size is bigger than the equivalent ncbmbuf area. This should fix github issue #3141. This patch must be backported up to 2.6. It is first necessary to pick the relevant commits for ncbmbuf implementation prior to it.	2025-10-22 15:04:41 +02:00
Amaury Denoyelle	25e378fa65	MINOR: ncbmbuf: add tests as standalone mode Write some tests for ncbmbuf buf. These tests should be run each time ncbmbuf implementation is adjusted. Use the following command : $ gcc -g -DSTANDALONE -I./include -o ncbmbuf src/ncbmbuf.c && ./ncbmbuf As the previous patch, this commit must be backported prior to the fix to come on QUIC CRYPTO frames parsing.	2025-10-22 15:04:24 +02:00
Amaury Denoyelle	8b8ab2824e	MINOR: ncbmbuf: implement advance operation Implement ncbmb_advance() function for the ncbmbuf type. This allows to remove bytes in front of the buffer, regardless of the existing gaps. This is implemented by resetting the corresponding bits of the bitmap. As the previous patch, this commit must be backported prior to the fix to come on QUIC CRYPTO frames parsing.	2025-10-22 15:04:06 +02:00
Amaury Denoyelle	42c495f3d7	MINOR: ncbmbuf: implement ncbmb_data() Implement ncbmb_data() function for the ncbmbuf type. Its purpose is similar to its ncbuf counterpart : it returns the size in bytes of data starting at a specific offset until the next gap. As the previous patch, this commit must be backported prior to the fix to come on QUIC CRYPTO frames parsing.	2025-10-22 15:04:06 +02:00
Amaury Denoyelle	db4a68752d	MINOR: ncbmbuf: implement iterator bitmap utilities functions Extend private API for ncbmbuf type by defining an iterator type for the buffer bitmap handling. The purpose is to provide a simple method to iterate over the bitmap one byte at a time, with a proper bitmask set to hide irrelevant bits. This internal type is unused for now, but will become useful when implementing ncb_data() and ncb_advance() functions. As the previous patch, this commit must be backported prior to the fix to come on QUIC CRYPTO frames parsing.	2025-10-22 15:04:06 +02:00
Amaury Denoyelle	1e1a3aa6aa	MINOR: ncbmbuf: implement add This patch implements add operation for ncbmbuf type. This function is simpler than its ncbuf counterpart. Indeed, for now only NCB_ADD_OVERWRT mode is supported. This compromise has been chosen as ncbmbuf will be first used for QUIC CRYPTO frames handling, which does not mandate to compare existing filled blocks during insertion. As the previous patch, this commit must be backported prior to the fix to come on QUIC CRYPTO frames parsing.	2025-10-22 15:04:06 +02:00
Amaury Denoyelle	b9f91ad3ff	MINOR: ncbmbuf: define new ncbmbuf type Define ncbmbuf which is an alternative non-contiguous buffer implementation. "bm" abbreviation stands for bitmap, which reflects how gaps and filled blocks are encoded. The main purpose of this implementation is to get rid of the ncbuf limitation regarding the minimal size for gaps between two blocks of data. This commit adds the new module ncbmbuf. Along with it, some utility functions such as ncbmb_make(), ncbmb_init() and ncbmb_is_empty() are defined. Public API of ncbmbuf will be extended in the following patches. This patch is not considered a bug fix. However, it will be required to fix issue encountered on QUIC CRYPTO frames parsing. Thus, it will be necessary to backport the current patch prior to the fix to come.	2025-10-22 15:04:06 +02:00
Amaury Denoyelle	59f0bafef2	MINOR: ncbuf: extract common types ncbuf is a module which provides a non-contiguous buffer type implementation. This patch extracts some basic types related to it into a new file ncbuf_common.h. This patch will be useful to provide a new non-contiguous buffer alternative implementation based on a bitmap. This patch is not a bug fix. However, it is necessary for ncbmbuf implementation which will be required to fix a QUIC issue on CRYPTO frames parsing. Thus, it will be necessary to backport the current patch prior to the fix to come.	2025-10-22 11:11:20 +02:00
Willy Tarreau	f936feb3a9	BUG/MAJOR: pools: fix default pool alignment The doc in commit 977feb5617 ("DOC: api: update the pools API with the alignment and typed declarations") says that alignment of zero means the type's alignment. And this is followed by the DECLARE_TYPED_POOL() macro. Yet this is not what is done in create_pool_from_reg() which only raises the alignment to a void* if lower, while it should start from the type's. The effect is haproxy refusing to start on some 32-bit platforms since that commit, displaying an error such as: "BUG in the code: at src/mux_h2.c:454, requested creation of pool 'h2s' aligned to 4 while type requires alignment of 8! Please report to developers. Aborting." Let's just apply the default type's alignment. Thanks to @tianon for reporting this in GH issue #3168. No backport is needed since aligned pools are 3.3-only.	2025-10-22 09:06:20 +02:00
Amaury Denoyelle	bece704128	BUG/MEDIUM: h3: properly encode response after interim one in same buf Recently, proper support for interim responses forwarding to HTTP/3 client has been implemented. However, there was still an issue if two responses are both encoded in the same snd_buf() iteration. The issue is caused due to H3 HEADERS frame encoding method : 5 bytes are reserved in front of the buffer to encode both H3 frame type and varint length field. After proper headers encoding, output buffer head is adjusted so that length can be encoded using the minimal varint size. However, if the buffer is not empty due to a previous response already encoded but not yet emitted, messing with the buffer head will corrupt the entire H3 message. This only happens when encoding of both responses is done in the same snd_buf() iteration, or at least without emission to quic_conn layer in between. The result of this bug is that the HTTP/3 client will be unable to parse the response, most of the time reporting a formatting error. This can be reproduced using the following netcat as HTTP/1 server to haproxy : $ while sleep 0.2; do printf "HTTP/1.1 100 continue\r\n\r\nHTTP/1.1 200 ok\r\nContent-length: 5\r\nConnection: close\r\n\r\nblah\n" | nc -lp8002; done To fix this, only adjust buffer head if content is empty. If this is not the case, frame length is simply encoded as a 4-bytes varint size so that messages are contiguous in the buffer. This must be backported up to 2.6.	2025-10-21 15:51:48 +02:00
Christopher Faulet	18ece2b424	BUG/MEDIUM: h1-htx: Don't set HTX_FL_EOM flag on 1xx informational messages 1xx informational messages are part of the HTTP response. It is not expected to have the HTX_FL_EOM flag set after parsing such messages when received from a server. It is especially important when an informational message is processed on client side while the final response was not received yet, to not erroneously detect the end of the message. The HTTP multiplexers seem to ignore the HTX_FL_EOM flag for informational messages, but it remains an error from the HTX specification point of view. So it must be fixed. While it should theoretically be backported as far as 3.0, it is a good idea to not do so for now because no bug was reported and regressions may happen.	2025-10-21 14:22:26 +02:00
Olivier Houchard	cd92aeb366	MEDIUM: stick-tables: Stop as soon as stktable_trash_oldest succeeds. stktable_trash_oldest() goes through all the shards, trying to free a number of entries. Going through each shard is expensive, as we have to take the shard lock, so stop as soon as we have freed at least one entry, as it is only called when we want to make room for one entry.	2025-10-20 15:04:47 +02:00
Olivier Houchard	7854331c71	MEDIUM: stick-tables: Stop if stktable_trash_oldest() fails. In stksess_new(), if the table is full, we call stktable_trash_oldest() to remove a few entries so that we have some room for a new one. It is unlikely, but possible, that stktable_trash_oldest() will fail. If so, just give up and do not add the new entry, instead of adding it anyway. Give up if stktable_trash_oldest() fails to free any entry	2025-10-20 15:04:47 +02:00
Olivier Houchard	d5562e31bd	MEDIUM: stick-tables: Remove the table lock Remove the table lock, it was only protecting the per-table expiration date, and that task is gone.	2025-10-20 15:04:47 +02:00
Olivier Houchard	8bc8a21b25	MEDIUM: stick-tables: Use a per-shard expiration task Instead of having per-table expiration tasks, just use one per shard. The task will now go through all the tables to expire entries. When a table gets an expiration earlier than the one previously known, it will be put in a mt-list, and the task will be responsible to put it into an eb32, ordered based on the next expiration. Each per-shard task will run on a different thread, so it should lead to a better load distribution than the per-table tasks.	2025-10-20 15:04:47 +02:00
Olivier Houchard	945aa0ea82	MINOR: initcalls: Add a new initcall stage, STG_INIT_2 Add a new initcall stage, STG_INIT_2, for stuff to be called after step_init_2() is called, so after we know for sure that global.nbthread will be set. Modify stick-tables stkt_late_init() to run at STG_INIT_2 instead of STG_INIT, in anticipation for it to be enhanced and have a need for global.nbthread.	2025-10-20 15:04:41 +02:00
Willy Tarreau	e63e98f1d8	BUG/MEDIUM: cli: also free the trash chunk on the error path Since commit 20ec1de214 ("MAJOR: cli: Refacor parsing and execution of pipelined commands"), commands not returning any response (e.g. "quit") don't pass through the free_trash_chunk() call, possibly leaking the cmdline buffer. A typical way to reproduce it is to loop on "quit" on the CLI, though it very likely affects other specific commands. Let's make sure in the release handler that we always release that chunk in any case. This must be backported to 3.2.	2025-10-20 14:58:53 +02:00
Frederic Lecaille	edd21121d2	BUG/MINOR: quic-be: unchecked connections during handshakes This bug impacts only the backends. The ->conn (pointer to struct connection) member validity of the ssl_sock_ctx struct was not checked before being dereferenced, leading to possible crashes in qc_ssl_do_hanshake() during handshake. This was reported in GH issue #3163. No need to backport because the QUIC backend support arrived with 3.3.	2025-10-20 14:27:12 +02:00
Olivier Houchard	7a33b90b3c	BUG/MEDIUM: mt_list: Make sure not to unlock the element twice In mt_list_delete(), if the element was not in a list, then n and p will point to it, and so setting n->prev and n->next will be enough to unlock it. Don't do it twice, as once it's been done the first time, another thread may be working with it, and may have added it to a list already, and doing it a second time can lead to list inconsistencies. This should be backported up to 2.8.	2025-10-19 23:21:42 +02:00
Willy Tarreau	aa259f5b42	[RELEASE] Released version 3.3-dev10 Released version 3.3-dev10 with the following main changes : - BUG/MEDIUM: connections: Only avoid creating a mux if we have one - BUG/MINOR: sink: retry attempt for sft server may never occur - CLEANUP: mjson: remove MJSON_ENABLE_RPC code - CLEANUP: mjson: remove MJSON_ENABLE_PRINT code - CLEANUP: mjson: remove MJSON_ENABLE_NEXT code - CLEANUP: mjson: remove MJSON_ENABLE_BASE64 code - CLEANUP: mjson: remove unused defines and math.h - BUG/MINOR: http-ana: Reset analyse_exp date after 'wait-for-body' action - CLEANUP: mjson: remove unused defines from mjson.h - BUG/MINOR: acme: avoid overflow when diff > notAfter - DEV: patchbot: use git reset+checkout instead of pull - MINOR: proxy: explicitly permit abortonclose on frontends and clarify the doc - REGTESTS: fix h2_desync_attacks to wait for the response - REGTESTS: http-messaging: fix the websocket and upgrade tests not to close early - MINOR: proxy: only check abortonclose through a dedicated function - MAJOR: proxy: enable abortonclose by default on HTTP proxies - MINOR: proxy: introduce proxy_abrt_close_def() to pass the desired default - MAJOR: proxy: enable abortonclose by default on TLS listeners - MINOR: h3/qmux: Set QC_SF_UNKNOWN_PL_LENGTH flag on QCS when headers are sent - MINOR: stconn: Add two fields in sedesc to replace the HTX extra value - MINOR: h1-htx: Increment body len when parsing a payload with no xfer length - MINOR: mux-h1: Set known input payload length during demux - MINOR: mux-fcgi: Set known input payload length during demux - MINOR: mux-h2: Use <body_len> H2S field for payload without content-length - MINOR: mux-h2: Set known input payload length of the sedesc - MINOR: h3: Set known input payload length of the sedesc - MINOR: stconn: Move data from kip to kop when data are sent to the consumer - MINOR: filters: Reset knwon input payload length if a data filter is used - MINOR: hlua/http-fetch: Use <kip> instead of HTX extra 
field to get body size - MINOR: cache: Use the <kip> value to check too big objects - MINOR: compression: Use the <kip> value to check body size - MEDIUM: mux-h1: Stop to use HTX extra value when formatting message - MEDIUM: htx: Remove the HTX extra field - MEDIUM: acme: don't insert acme account key in ckchs_tree - BUG/MINOR: acme: memory leak from the config parser - CI: cirrus-ci: bump FreeBSD image to 14-3 - BUG/MEDIUM: ssl: take care of second client hello - BUG/MINOR: ssl: always clear the remains of the first hello for the second one - BUG/MEDIUM: stconn: Properly forward kip to the opposite SE descriptor - MEDIUM: applet: Forward <kip> to applets - DEBUG: mux-h1: Dump <kip> and <kop> values with sedesc info - BUG/MINOR: ssl: leak in ssl-f-use - BUG/MINOR: ssl: leak crtlist_name in ssl-f-use - BUILD: makefile: disable tail calls optimizations with memory profiling - BUG/MEDIUM: apppet: Improve spinning loop detection with the new API - BUG/MINOR: ssl: Free global_ssl structure contents during deinit - BUG/MINOR: ssl: Free key_base from global_ssl structure during deinit - MEDIUM: jwt: Remove certificate support in jwt_verify converter - MINOR: jwt: Add new jwt_verify_cert converter - MINOR: jwt: Do not look into ckch_store for jwt_verify converter - MINOR: jwt: Add new "jwt" certificate option - MINOR: jwt: Add specific error code for known but unavailable certificate - DOC: jwt: Add doc about "jwt_verify_cert" converter - MINOR: ssl: Dump options in "show ssl cert" - MINOR: jwt: Add new "add/del/show ssl jwt" CLI commands - REGTEST: jwt: Test new CLI commands - BUG/MINOR: ssl: Potential NULL deref in trace macro - MINOR: regex: use a thread-local match pointer for pcre2 - BUG/MEDIUM: pools: fix bad freeing of aligned pools in UAF mode - MEDIUM: pools: detect() when munmap() fails in UAF mode - TESTS: quic: useless param for b_quic_dec_int() - BUG/MEDIUM: pools: fix crash on filtered "show pools" output - BUG/MINOR: pools: don't report "limited to the first 
X entries" by default - BUG/MAJOR: lb-chash: fix key calculation when using default hash-key id - BUG/MEDIUM: stick-tables: Don't forget to dec count on failure. - BUG/MINOR: quic: check applet_putchk() for 'show quic' first line - TESTS: quic: fix uninit of quic_cc_path const member - BUILD: ssl: can't build when using -DLISTEN_DEFAULT_CIPHERS - BUG/MAJOR: quic: uninitialized quic_conn_closed struct members - BUG/MAJOR: quic: do not reset QUIC backends fds in closing state - BUG/MINOR: quic: SSL counters not handled - DOC: clarify the experimental status for certain features - MINOR: config: remove experimental status on tune.disable-fast-forward - MINOR: tree-wide: add missing TAINTED flags for some experimental directives - MEDIUM: config: warn when expose-experimental-directives is used for no reason - BUG/MEDIUM: threads/config: drop absent threads from thread groups - REGTESTS: remove experimental from quic/retry.vtc	2025-10-18 11:24:05 +02:00
Willy Tarreau	e8dcd4c9c8	REGTESTS: remove experimental from quic/retry.vtc Recent commit 8b7a82cd30 ("MEDIUM: config: warn when expose-experimental-directives is used for no reason") triggered on this test exactly for the reason it was made for. The tests were just done without quic on it. Let's drop the unneeded option.	2025-10-17 20:55:43 +02:00
Willy Tarreau	c365e47095	BUG/MEDIUM: threads/config: drop absent threads from thread groups Thread groups can be assigned arbitrary thread ranges, but if the mentioned threads do not exist, this causes crashes in listener_accept() or some connections to be ignored. The reason is that the calculated mask is derived from the thread group's enabled threads count. Examples: global nbthread 2 thread-groups 2 thread-group 1 1-64 thread-group 2 65-128 frontend f-crash bind :8001 thread 1/all frontend f-freeze bind :8002 thread 2/all This commit removes missing threads, emits a warning when the thread group just has fewer threads than requested, and an error when it is left with no threads at all. This must be backported to 3.1 since the issue is present there already.	2025-10-17 20:36:00 +02:00
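The clamping behaviour described in the commit above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the HAProxy implementation: `struct tgroup_range`, `clamp_tgroup()` and `tgroup_count()` are hypothetical names, and thread numbers are assumed to be 1-based and inclusive as in the configuration example.

```c
#include <assert.h>

/* Hypothetical view of a thread group's configured range (1-based, inclusive). */
struct tgroup_range {
    int first;
    int last;
};

/* What the configuration checker should report after clamping. */
enum clamp_result {
    CLAMP_OK,     /* every requested thread exists */
    CLAMP_WARN,   /* group shrunk: fewer threads than requested */
    CLAMP_ERROR   /* group left with no thread at all */
};

/* Drop from <tg> all threads above <nbthread> (the number of existing threads). */
static enum clamp_result clamp_tgroup(struct tgroup_range *tg, int nbthread)
{
    assert(tg->first >= 1 && tg->first <= tg->last);

    if (tg->first > nbthread) {
        tg->last = tg->first - 1;   /* empty range */
        return CLAMP_ERROR;
    }
    if (tg->last > nbthread) {
        tg->last = nbthread;        /* keep only existing threads */
        return CLAMP_WARN;
    }
    return CLAMP_OK;
}

/* Number of threads left in the group, usable to derive a thread mask. */
static int tgroup_count(const struct tgroup_range *tg)
{
    return tg->last - tg->first + 1;
}
```

With `nbthread 2`, the range `1-64` would shrink to two threads with a warning, while `65-128` would end up empty and be rejected.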
Willy Tarreau	8b7a82cd30	MEDIUM: config: warn when expose-experimental-directives is used for no reason If users start to enable expose-experimental-directives for the purpose of testing one specific feature, there are chances that the option remains forever and hides the experimental status of other options. Let's emit a warning if the option appears and is not used. This will remind users that they can now drop it, and help keep configs safe for future upgrades.	2025-10-17 19:00:21 +02:00
Willy Tarreau	80ed9f9dcf	MINOR: tree-wide: add missing TAINTED flags for some experimental directives We normally taint the process when using experimental directives, but a handful of places were missed so we don't always know that they are in use. Let's fix these places (hint for future directives, just look for places checking for "experimental_directives_allowed", and add "mark_tainted(TAINTED_CONFIG_EXP_KW_DECLARED);").	2025-10-17 19:00:21 +02:00
Willy Tarreau	d3881e61ac	MINOR: config: remove experimental status on tune.disable-fast-forward The option was turned to off by default in 2.8 with commit 2f7c82bfd ("BUG/MINOR: haproxy: Fix option to disable the fast-forward"), however at the same time it should have dropped its experimental status since the feature is enabled by default. The only goal of the option is to debug something, like many other tune.xxx options. The option should still normally not be used without being invited to do so by developers looking for something specific though. This could be backported if desired to simplify debugging, though this has never been needed for now.	2025-10-17 18:59:47 +02:00
Willy Tarreau	e7c8deb810	DOC: clarify the experimental status for certain features Certain features require "expose-experimental-directives" to be set in the global section. Let's clarify that experimental features are only maintained in best effort mode, may break during the stable cycle, and are generally not maintained beyond the release of the next LTS branch since it is extremely challenging, and early adopters are expected to upgrade to benefit from improvements anyway.	2025-10-17 18:41:13 +02:00
Frederic Lecaille	51eca5cbce	BUG/MINOR: quic: SSL counters not handled The SSL counters were not handled at all for QUIC connections. This patch implements ssl_sock_update_counters() by extracting the code from ssl_sock.c, and calls this function where applicable in both the TLS/TCP and QUIC parts. Must be backported as far as 2.8.	2025-10-17 12:13:43 +02:00
Frederic Lecaille	8a8417b54a	BUG/MAJOR: quic: do not reset QUIC backends fds in closing state This bug impacts only the backends. When entering the closing state, a quic_closed_conn is used to replace the quic_conn. In this state, the ->fd value was reset to -1 by calling qc_init_fd(). This value is used by qc_may_use_saddr(), which assumes it cannot be -1 for a backend. A crash is then possible when qc_may_use_saddr() is called: qc_test_fd() is false, leading qc->li to be dereferenced, which is legal only for a listener. This patch prevents such fd value resets for backends. No need to backport because QUIC backend support arrived with 3.3.	2025-10-17 12:13:43 +02:00
Frederic Lecaille	56d15b2a03	BUG/MAJOR: quic: uninitialized quic_conn_closed struct members A quic_conn_closed struct is initialized to replace the quic_conn when the connection enters the closing state, to reduce the connection memory footprint. The ->max_udp_payload member of quic_conn_closed was not initialized, leading to possible BUG_ON()s in qc_rcv_buf() when comparing the RX buf size to this payload. The ->cntrs counters were also not initialized, with the only consequence being wrong values for these counters. Must be backported as far as 2.9.	2025-10-17 12:13:43 +02:00
William Lallemand	b74a437e57	BUILD: ssl: can't build when using -DLISTEN_DEFAULT_CIPHERS Emeric reported that he can't build haproxy anymore since 9bc6a034 ("BUG/MINOR: ssl: Free global_ssl structure contents during deinit"). src/ssl_sock.c:7020:40: error: comparison with string literal results in unspecified behavior [-Werror=address] 7020 \| if (global_ssl.listen_default_ciphers != LISTEN_DEFAULT_CIPHERS) \| ^~ src/ssl_sock.c:7023:41: error: comparison with string literal results in unspecified behavior [-Werror=address] 7023 \| if (global_ssl.connect_default_ciphers != CONNECT_DEFAULT_CIPHERS) \| ^~ src/ssl_sock.c: At top level: Indeed, the mentioned patch checks the pointer in order to free only something freeable, but that can't work because these constants are string literals which can be passed from the compiler, not pointers. The test is also not useful, because these strings are strdup()'ed in __ssl_sock_init, so they can be freed directly. Must be backported to every stable branch along with 9bc6a034.	2025-10-17 09:45:26 +02:00
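The reasoning in the commit above (defaults are always duplicated at init, so they can be freed unconditionally) can be illustrated with a small sketch. The names `struct ssl_defaults`, `DEFAULT_CIPHERS`, `dup_string()`, `ssl_defaults_init()` and `ssl_defaults_deinit()` are hypothetical and only stand in for the real global_ssl handling; the cipher string value is an arbitrary placeholder.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical build-time default, possibly overridden with -D on the command line. */
#ifndef DEFAULT_CIPHERS
#define DEFAULT_CIPHERS "HIGH:!aNULL"
#endif

struct ssl_defaults {
    char *listen_ciphers;   /* always a heap copy once initialized */
};

/* Portable replacement for strdup(): returns a heap copy of <s> or NULL. */
static char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    if (copy)
        memcpy(copy, s, len);
    return copy;
}

/* Duplicate the default so that the field never points to a string literal. */
static int ssl_defaults_init(struct ssl_defaults *d)
{
    assert(d != NULL);
    d->listen_ciphers = dup_string(DEFAULT_CIPHERS);
    return d->listen_ciphers != NULL;
}

/* No pointer comparison against the macro is needed: the copy is always freeable. */
static void ssl_defaults_deinit(struct ssl_defaults *d)
{
    assert(d != NULL);
    free(d->listen_ciphers);
    d->listen_ciphers = NULL;
}
```

Comparing the field against the macro, as the reverted check did, would compare a heap pointer with the address of a string literal, which is what `-Werror=address` rejects.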
Amaury Denoyelle	5b04a85bc7	TESTS: quic: fix uninit of quic_cc_path const member Fix quic_tx unittest module by adding an explicit define for <mtu> const member of quic_cc_path. This should fix coverity report from github issue #3162. This can be backported up to 3.2.	2025-10-17 09:29:01 +02:00
Amaury Denoyelle	5067a15870	BUG/MINOR: quic: check applet_putchk() for 'show quic' first line Ensure applet_putchk() return value is checked when outputting via the CLI 'show quic' header line. This is only to align with other usages of the same function, as trash output buffer should always be large enough for it. As such, the command is simply aborted if this is not the case. This should fix coverity report from github issue #3139. This could be backported up to 2.8.	2025-10-17 09:29:01 +02:00
Olivier Houchard	8d31784c0f	BUG/MEDIUM: stick-tables: Don't forget to dec count on failure. In stksess_new(), if we failed to allocate memory for the new stksess, don't forget to decrement the table entry count, as nobody else will do it for us. An artificially high count could lead to at least purging entries while there is no need to. This should be backported up to 2.8. WIP decrement current on allocation failure	2025-10-16 23:46:37 +02:00
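The pattern fixed by the commit above (reserve a slot in the counter, then roll it back if the allocation fails) can be sketched as follows. This is not the stksess_new() code: `struct table_sketch`, `struct entry_sketch`, `entry_new()` and `failing_alloc()` are hypothetical names used for illustration, and the allocator is passed as a parameter only so that a failure can be simulated.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical table accounting: <current> entries out of at most <size>. */
struct table_sketch {
    unsigned int current;
    unsigned int size;
};

struct entry_sketch {
    int dummy;
};

/* Allocator that always fails, only there to exercise the error path. */
static void *failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}

/* Create an entry; the counter is only left incremented on success. */
static struct entry_sketch *entry_new(struct table_sketch *t, void *(*alloc)(size_t))
{
    struct entry_sketch *e;

    assert(t != NULL && alloc != NULL);
    if (t->current >= t->size)
        return NULL;            /* table full */

    t->current++;               /* reserve the slot first */
    e = alloc(sizeof(*e));
    if (!e) {
        t->current--;           /* nobody else will undo the reservation */
        return NULL;
    }
    e->dummy = 0;
    return e;
}
```

Without the decrement on the error path, each failed allocation would leave the count one unit too high, which is the artificially high count the commit message warns about.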
Willy Tarreau	03e9a5a1e7	BUG/MAJOR: lb-chash: fix key calculation when using default hash-key id A subtle regression was introduced in 3.0 by commit faa8c3e02 ("MEDIUM: lb-chash: Deterministic node hashes based on server address"). When keys are calculated from the server's ID (which is the default), due to the reorganisation of the code, the key ended up being hashed twice instead of being multiplied by the scaling range. While most users will never notice it, it is blocking some large cache users from upgrading from 2.8 to 3.0 or 3.2 because the keys are redistributed. After a check with users on the mailing list [1] it was estimated that keeping the current situation is the worst choice because those who have not yet upgraded will face the problem while by fixing it, those who already have and for whom it happened smoothly will handle it just right again. As such this fix must be backported to 3.0 without waiting (in order to preserve those who upgrade from two redistributions). Please note that only configurations featuring "hash-type consistent" and not having "hash-key" present with a value other than "id" are affected, others are not (e.g. "hash-key addr" is unaffected). [1] https://www.mail-archive.com/haproxy@formilux.org/msg46115.html	2025-10-16 10:43:09 +02:00
Willy Tarreau	f263a45ddf	BUG/MINOR: pools: don't report "limited to the first X entries" by default With the fix in commit 982805e6a3 ("BUG/MINOR: pools: Fix the dump of pools info to deal with buffers limitations"), the max count is now compared to the number of dumped pools instead of the configured number, and keeping >= is no longer valid because maxcnt is set by default to the same value when not set, so this means that since this patch we're always displaying "limited to the first X entries" where X is the number of dumped entries even in the absence of any limitation. Let's just fix the comparison to only show this when the limit is lower. This must be backported to 3.2 where the patch above already is.	2025-10-16 08:41:32 +02:00
Willy Tarreau	ab0c97139f	BUG/MEDIUM: pools: fix crash on filtered "show pools" output The truncation of pools output that was addressed in commit 982805e6a3 ("BUG/MINOR: pools: Fix the dump of pools info to deal with buffers limitations") required splitting the pools filling from dumping. However there is a problem when a limit is passed that is lower than the number of pools or if a pool name is specified or if pool caches are disabled, because in this case the number of filled slots will be lower than the initially allocated one, and empty entries will be visited either by the sort functions when filling the entries if "byxxx" is specified, or by the dump function after the last entry, but none of these functions was expecting to be passed a NULL entry. Let's just re-adjust nbpools to match the number of filled entries at the end. Anyway the totals are calculated on the number of dumped entries. This must be backported to 3.2 since the fix above was backported there as well.	2025-10-16 08:41:32 +02:00
Frederic Lecaille	d5f4872ba6	TESTS: quic: useless param for b_quic_dec_int() The third parameter passed to b_quic_dec_int() is uninitialized. This is not a bug. But this disturbs coverity for an unknown reason as revealed by GH issue #3154. This patch takes the opportunity to use NULL as passed value to avoid using such an unneeded third parameter. Should be backported to 3.2 where this unit test was introduced.	2025-10-15 09:58:03 +02:00
Willy Tarreau	17930edecc	MEDIUM: pools: detect() when munmap() fails in UAF mode Better check that munmap() always works, otherwise it means we might have miscalculated an address, and if it fails silently, it will eat all the memory extremely quickly. Let's add a BUG_ON() on munmap's return.	2025-10-13 19:22:31 +02:00
Willy Tarreau	0e6a233217	BUG/MEDIUM: pools: fix bad freeing of aligned pools in UAF mode As reported by Christopher, in UAF mode memory release of aligned objects as introduced in commit ef915e672a ("MEDIUM: pools: respect pool alignment in allocations") does not work. The padding calculation in the freeing code is no longer correct since it now depends on the alignment, so munmap() fails on EINVAL. Fortunately we don't care much about it since we know it's the low bits of the passed address, which is much simpler to compute, since all mmaps are page-aligned. There's no need to backport this, as this was introduced in 3.3.	2025-10-13 19:19:39 +02:00
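The observation in the commit above, that the padding is simply the low bits of the address because all mmaps are page-aligned, can be illustrated with a short arithmetic sketch. `page_offset()` and `page_base()` are hypothetical helpers, not functions from the HAProxy pool code, and the sketch assumes the page size is a power of two and that the padding is smaller than one page.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of <addr> inside its page; <page_size> must be a power of two. */
static uintptr_t page_offset(uintptr_t addr, uintptr_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & (page_size - 1);
}

/* Page-aligned address that an munmap() call would need for <addr>. */
static uintptr_t page_base(uintptr_t addr, uintptr_t page_size)
{
    return addr - page_offset(addr, page_size);
}
```

Because the result only depends on the address itself, it stays correct regardless of the alignment that was requested when the object was allocated.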
Willy Tarreau	fda6dc9597	MINOR: regex: use a thread-local match pointer for pcre2 The pcre2 matching requires an array of matches for grouping, that is allocated when executing the rule by pre-processing it, and that is immediately freed after use. This is quite inefficient and results in annoying patterns in "show profiling" that attribute the allocations to libpcre2 and the releases to haproxy. A good suggestion from Dragan is to pre-allocate these per thread, since the entry is not specific to a regex. In addition we're already limited to MAX_MATCH matches so we don't even have the problem of having to grow it while parsing nor processing. The current patch adds a per-thread pair of init/deinit functions to allocate a thread-local entry for that, and gets rid of the dynamic allocations. It will result in cleaner memory management patterns and slightly higher performance (+2.5%) when using pcre2.	2025-10-13 16:56:43 +02:00
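The per-thread pre-allocation described in the commit above might look like the following sketch. It does not use libpcre2 and is not the HAProxy code: `struct match_sketch`, `regex_thread_init()`, `regex_thread_deinit()` and `regex_get_matches()` are hypothetical names, and the value given to `MAX_MATCH` here is an assumption made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed upper bound on captured groups; the real value may differ. */
#define MAX_MATCH 10

/* Hypothetical match slot: start and end offsets of one group. */
struct match_sketch {
    int so;
    int eo;
};

/* One array per thread, allocated once instead of once per rule execution. */
static _Thread_local struct match_sketch *local_matches;

/* Called once when a thread starts; returns non-zero on success. */
static int regex_thread_init(void)
{
    local_matches = calloc(MAX_MATCH, sizeof(*local_matches));
    return local_matches != NULL;
}

/* Called once when a thread stops. */
static void regex_thread_deinit(void)
{
    free(local_matches);
    local_matches = NULL;
}

/* Used by the matching code in place of a dynamic allocation. */
static struct match_sketch *regex_get_matches(void)
{
    assert(local_matches != NULL);
    return local_matches;
}
```

Since the array is bounded by `MAX_MATCH` and is not tied to a specific regex, the same storage can be reused for every match performed by the thread.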
Remi Tricot-Le Breton	6f4ca37880	BUG/MINOR: ssl: Potential NULL deref in trace macro 'ctx' might be NULL when we exit 'ssl_sock_handshake', it can't be dereferenced without check in the trace macro. This was found by Coverity and raised in GitHub #3113. This patch should be backported up to 3.2	2025-10-13 15:44:45 +02:00
Remi Tricot-Le Breton	d82019d05c	REGTEST: jwt: Test new CLI commands Test the "add/del ssl jwt" commands and check the new return value in case of unavailable certificate used in a jwt_verify_cert converter.	2025-10-13 10:38:52 +02:00
Remi Tricot-Le Breton	d4bb9983fa	MINOR: jwt: Add new "add/del/show ssl jwt" CLI commands The new "add/del ssl jwt <file>" commands allow to change the "jwt" flag of an already loaded certificate. It allows to delete certificates used for JWT validation, which was not yet possible. The "show ssl jwt" command iterates over all the ckch_stores and dumps the ones that have the option set.	2025-10-13 10:38:52 +02:00
Remi Tricot-Le Breton	daa36adc6e	MINOR: ssl: Dump options in "show ssl cert" Dump the values of the 'ocsp-update' and 'jwt' flags in the output of 'show ssl cert' CLI command.	2025-10-13 10:38:52 +02:00
Remi Tricot-Le Breton	0f35b46124	DOC: jwt: Add doc about "jwt_verify_cert" converter Add information about the new "jwt_verify_cert" converter and update the existing "jwt_converter" doc to remove mentions of certificates from it. Add information about the new "jwt" certificate option.	2025-10-13 10:38:52 +02:00
Remi Tricot-Le Breton	bf5b912a62	MINOR: jwt: Add specific error code for known but unavailable certificate A certificate that does not have the 'jwt' flag enabled cannot be used for JWT validation. We now raise a specific return value so that such a case can be identified.	2025-10-13 10:38:52 +02:00
Remi Tricot-Le Breton	18ff130e9d	MINOR: jwt: Add new "jwt" certificate option This option can be used to enable the use of a given certificate for JWT verification. It defaults to 'off' so certificates that are declared in a crt-store and will be used for JWT verification must have a "jwt on" option in the configuration.	2025-10-13 10:38:52 +02:00
Remi Tricot-Le Breton	53957c50c3	MINOR: jwt: Do not look into ckch_store for jwt_verify converter We must not try to load full-on certificates for 'jwt_verify' converter anymore. 'jwt_verify_cert' is the only one that accepts a certificate.	2025-10-13 10:38:52 +02:00
Remi Tricot-Le Breton	f5632fd481	MINOR: jwt: Add new jwt_verify_cert converter This converter will be in charge of performing the same operation as the 'jwt_verify' one except that it takes a full-on pem certificate path instead of a public key path as parameter. The certificate path can be either provided directly as a string or via a variable. This allows to use certificates that are not known during init to perform token validation.	2025-10-13 10:38:52 +02:00
Remi Tricot-Le Breton	c3c0597a34	MEDIUM: jwt: Remove certificate support in jwt_verify converter The jwt_verify converter will not take full-on certificates anymore in favor of a new soon to come jwt_verify_cert. We might end up with a new jwt_verify_hmac in the future as well which would allow to deprecate the jwt_verify converter and remove the need for a specific internal tree for public keys. The logic to always look into the internal jwt tree by default and resolve to locking the ckch tree as little as possible will also be removed. This allows to get rid of the duplicated reference to EVP_PKEYs, the one in the jwt tree entry and the one in the ckch_store.	2025-10-13 10:38:52 +02:00
Remi Tricot-Le Breton	b706f2d092	BUG/MINOR: ssl: Free key_base from global_ssl structure during deinit The key_base field of the global_ssl structure is an strdup'ed field (when set) which was never free'd during deinit. This patch can be backported up to branch 3.0.	2025-10-10 17:22:48 +02:00
Remi Tricot-Le Breton	9bc6a0349d	BUG/MINOR: ssl: Free global_ssl structure contents during deinit Some fields of the global_ssl structure are strings that are strdup'ed but never freed. There is only one static global_ssl structure so not much memory is used but we might as well free it during deinit. This patch can be backported to all stable branches.	2025-10-10 17:22:48 +02:00
Christopher Faulet	54b7539d64	BUG/MEDIUM: apppet: Improve spinning loop detection with the new API Conditions to detect the spinning loop for applets based on the new API are not accurate. We cannot continue to check the channel's buffers state to know if an applet has made some progress. At least, we must also check the applet's buffers. After digging to find the right way to do it, it was clear that the best is to use something similar to what is performed for the streams, namely, checking read and write events. And in fact, it is quite easy to do with the new API. So let's do so. This patch must be backported as far as 3.0.	2025-10-10 14:41:15 +02:00
Willy Tarreau	dfe7fa9349	BUILD: makefile: disable tail calls optimizations with memory profiling The purpose of memory profiling precisely is to figure what function allocates and what function frees for specific objects. It turns out that a non-negligible number of release callbacks basically do nothing but a free() or pool_free() call and return, which the compiler happily turns into a jump, making the caller of that callback appear as the real one. That's how we can see libcrypto release to pools such as ssl-capture for example, which also makes the per-DSO calls appear wrong: 10000 0 10720000 0\| 0x448c8d ssl_async_fd_free+0x3b9d p_alloc(1072) [pool=ssl-capture] 50000 0 6800000 0\| 0x4456b9 ssl_async_fd_free+0x5c9 p_alloc(136) [pool=ssl-keylogf] 10072 0 644608 0\| 0x447f14 ssl_async_fd_free+0x2e24 p_alloc(64) [pool=ssl-keylogf] 0 10000 0 1360000\| 0x445987 ssl_async_fd_free+0x897 p_free(-136) [pool=ssl-keylogf] 0 10000 0 1360000\| 0x4459b8 ssl_async_fd_free+0x8c8 p_free(-136) [pool=ssl-keylogf] 0 10000 0 1360000\| 0x4459e9 ssl_async_fd_free+0x8f9 p_free(-136) [pool=ssl-keylogf] 0 10000 0 1360000\| 0x445a1a ssl_async_fd_free+0x92a p_free(-136) [pool=ssl-keylogf] 0 10000 0 1360000\| 0x445a4b ssl_async_fd_free+0x95b p_free(-136) [pool=ssl-keylogf] 0 20072 0 11364608\| 0x7f5f1397db62 libcrypto:CRYPTO_free_ex_data+0xf2/0x261 p_free(-566) [pool=ssl-keylogf] [locked=72 (0.3 %)] Worse, as can be seen on the last line above, there can be a single pool per call place (since we don't release to arbitrary pools), and the stats are misleading by reporting the first used pool only when a same function can call multiple release callbacks. This is why the free call totals 10k ssl-capture and 10072 ssl-keylogfile. Let's just disable tail call optimization when using memory profiling. The gains are only very marginal and complicate so much the debugging that it's not worth it. 
Now the output is correct, and no longer claims that libcrypto is the caller: 10000 0 10720000 0\| 0x448c9f ssl_async_fd_free+0x3b9f p_alloc(1072) [pool=ssl-capture] 0 10000 0 10720000\| 0x445af0 ssl_async_fd_free+0x9f0 p_free(-1072) [pool=ssl-capture] 50000 0 6800000 0\| 0x4456c9 ssl_async_fd_free+0x5c9 p_alloc(136) [pool=ssl-keylogf] 10177 0 1221240 0\| 0x45543d ssl_async_fd_handler+0xb51d p_alloc(120) [pool=ssl_sock_ct] [locked=165 (1.6 %)] 10061 0 643904 0\| 0x447f1c ssl_async_fd_free+0x2e1c p_alloc(64) [pool=ssl-keylogf] 0 10000 0 1360000\| 0x445987 ssl_async_fd_free+0x887 p_free(-136) [pool=ssl-keylogf] 0 10000 0 1360000\| 0x4459b8 ssl_async_fd_free+0x8b8 p_free(-136) [pool=ssl-keylogf] 0 10000 0 1360000\| 0x4459e9 ssl_async_fd_free+0x8e9 p_free(-136) [pool=ssl-keylogf] 0 10000 0 1360000\| 0x445a1a ssl_async_fd_free+0x91a p_free(-136) [pool=ssl-keylogf] 0 10000 0 1360000\| 0x445a4b ssl_async_fd_free+0x94b p_free(-136) [pool=ssl-keylogf] 0 10188 0 1222560\| 0x44f518 ssl_async_fd_handler+0x55f8 p_free(-120) [pool=ssl_sock_ct] [locked=176 (1.7 %)] 0 10072 0 644608\| 0x445aa6 ssl_async_fd_free+0x9a6 p_free(-64) [pool=ssl-keylogf] [locked=72 (0.7 %)] An attempt was made to only instrument pool_free() to place a compiler barrier, but that resulted in much larger code and wouldn't cover functions ending with a simple "free()" call. "ha_free()" however is already immune against tail call optimization since it has to write the NULL when returning from free(). This should be backported to recent stable releases that are still regularly being debugged.	2025-10-10 13:45:19 +02:00
William Lallemand	47a93dc750	BUG/MINOR: ssl: leak crtlist_name in ssl-f-use This patch fixes a leak of the temporary variable "crtlist_name" which is used in the ssl-f-use parser. Must be backported in 3.2.	2025-10-10 11:22:37 +02:00
William Lallemand	d9365a88a5	BUG/MINOR: ssl: leak in ssl-f-use Fix the leak of the filename in the struct cfg_crt_node which is a temporary structure used for ssl-f-use initialization. Must be backported to 3.2.	2025-10-10 11:22:37 +02:00
Christopher Faulet	cbe5221182	DEBUG: mux-h1: Dump <kip> and <kop> values with sedesc info It could be handy to debug issues, especially because these values were recently introduced.	2025-10-10 11:16:21 +02:00
Christopher Faulet	6a0fe6e460	MEDIUM: applet: Forward <kip> to applets For now, no applets are using the <kop> value when consuming data. At least, as far as I know. But it remains a good idea to keep the applet API compatible. So now, the <kip> of the opposite side is properly forwarded to applets.	2025-10-10 11:11:44 +02:00
Christopher Faulet	4145a61101	BUG/MEDIUM: stconn: Properly forward kip to the opposite SE descriptor By refactoring the HTX to remove the extra field, a bug was introduced in the stream-connector part. The <kip> (known input payload) value of a sedesc was moved to <kop> (known output payload) using the same sedesc. Of course, this is totally wrong. The <kip> value of a sedesc must be forwarded to the opposite side. In addition, the operation is performed in sc_conn_send(). In this function, we manipulate the stream-connectors. So the se_fwd_kip() function was changed to use the stream-connectors directly. The function sc_ep_fwd_kip() is now called with both stream-connectors to properly forward <kip> from one side to the opposite side. The bug is 3.3-specific. No backport needed.	2025-10-10 11:01:21 +02:00
Willy Tarreau	54f0ab08b8	BUG/MINOR: ssl: always clear the remains of the first hello for the second one William rightfully pointed that despite the ssl capture being a structure, some of its entries are only set for certain contents, so we need to always zero it before using it so as to clear any remains of a previous use, otherwise we could possibly report some entries that were only present in the first hello and not the second one. No need to clear the data though, since any remains will not be referenced by the fields. This must be backported wherever commit 336170007c ("BUG/MEDIUM: ssl: take care of second client hello") is backported.	2025-10-09 18:50:30 +02:00
Willy Tarreau	336170007c	BUG/MEDIUM: ssl: take care of second client hello For a long time we've been observing some sporadic leaks of ssl-capture pool entries on haproxy.org without figuring exactly the root cause. All that was seen was that fewer calls to the free callback were made than calls to the hello parsing callback, and these were never reproduced locally. It recently turned out to be triggered by the presence of "curves" or "ecdhe" on the "bind" line. Captures have shown the presence of a second client hello, called "Change Cipher Client Hello" in wireshark traces, that calls the client hello callback again. That one wasn't prepared for being called twice per connection, so it allocates an ssl-capture entry and assigns it to the ex_data entry, possibly overwriting the previous one. In this case, the fix is super simple, just reuse the current ex_data if it exists, otherwise allocate a new one. This completely solves the problem. Other callbacks have been audited for the same issue and are not affected: ssl_ini_keylog() already performs this check and ignores subsequent calls, and other ones do not allocate data. This must be backported to all supported versions.	2025-10-09 17:06:49 +02:00
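The "reuse the current ex_data if it exists" fix can be sketched without OpenSSL as below. Everything here is a stand-in: `struct conn_sketch`, `struct capture_sketch`, `on_client_hello()` and the `capture_allocs` counter are hypothetical, and the `ex_data` field merely simulates the per-connection slot. The zeroing on reuse follows the follow-up commit 54f0ab08b8 listed in this log.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical capture entry filled from a client hello. */
struct capture_sketch {
    size_t ciphersuite_len;
};

/* Hypothetical connection with a single ex_data-like slot. */
struct conn_sketch {
    struct capture_sketch *ex_data;
};

/* Counts allocations so that a leak would be visible. */
static int capture_allocs;

/* Callback invoked for every client hello, possibly twice per connection. */
static struct capture_sketch *on_client_hello(struct conn_sketch *conn)
{
    struct capture_sketch *cap;

    assert(conn != NULL);
    cap = conn->ex_data;
    if (cap) {
        /* second hello: reuse the entry, clearing remains of the first one */
        memset(cap, 0, sizeof(*cap));
        return cap;
    }

    cap = calloc(1, sizeof(*cap));
    if (!cap)
        return NULL;
    capture_allocs++;
    conn->ex_data = cap;
    return cap;
}
```

Allocating unconditionally on the second call would overwrite the slot and lose the only reference to the first entry, which matches the leak pattern observed in the commit message.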
William Lallemand	229eab8fc9	CI: cirrus-ci: bump FreeBSD image to 14-3 FreeBSD CI seems to be broken for a while, try to upgrade the image to the latest 14.3 version.	2025-10-09 14:06:48 +02:00
William Lallemand	f35caafa6e	BUG/MINOR: acme: memory leak from the config parser This patch fixes some memory leaks in the configuration parser: - deinit_acme() was never called - add ha_free() before every strdup() for section overwrite - lacked some free() in deinit_acme()	2025-10-09 12:04:22 +02:00
William Lallemand	9344ecaade	MEDIUM: acme: don't insert acme account key in ckchs_tree Don't insert the acme account key in the ckchs_tree anymore. ckch_store are not made to only include a private key. CLI operations are not possible with them either. That doesn't make much sense to keep it that way until we rework the ckch_store.	2025-10-09 11:01:58 +02:00
Christopher Faulet	914538cd39	MEDIUM: htx: Remove the HTX extra field Thanks to previous changes, it is now possible to remove the <extra> field from the HTX structure. HTX_FL_ALTERED_PAYLOAD flag is also removed because it is now unused.	2025-10-08 11:10:42 +02:00
Christopher Faulet	2e2953a3f0	MEDIUM: mux-h1: Stop to use HTX extra value when formatting message We now rely on the <kop> value to format the message payload before sending it. It is no longer necessary to use the HTX extra field.	2025-10-08 11:10:42 +02:00
Christopher Faulet	4f40b2de86	MINOR: compression: Use the <kip> value to check body size When a minimum compression size is defined, we can now use the <kip> value to skip the compression instead of the HTX extra field.	2025-10-08 11:10:42 +02:00
Christopher Faulet	c0f5b19bc6	MINOR: cache: Use the <kip> value to check too big objects When an object should be cached, to check if it is too big or not, the <kip> value is now used instead of the HTX extra field.	2025-10-08 11:10:42 +02:00
Christopher Faulet	f1c659f3ae	MINOR: hlua/http-fetch: Use <kip> instead of HTX extra field to get body size The known input payload length now contains the information. There is no reason to still rely on the HTX extra field.	2025-10-08 11:10:25 +02:00
Christopher Faulet	be1ce400c4	MINOR: filters: Reset knwon input payload length if a data filter is used If a data filter is registered on a channel, the corresponding <kip> field must be reset because the payload may be altered.	2025-10-08 11:01:37 +02:00
Christopher Faulet	30c50e4f19	MINOR: stconn: Move data from kip to kop when data are sent to the consumer When data are sent to the consumer, the known output payload length is updated using the known input payload length value and this last one is then reset. se_fwd_kip() function is used for this purpose.	2025-10-08 11:01:37 +02:00
Christopher Faulet	f6a4d41dd0	MINOR: h3: Set known input payload length of the sedesc Set <kip> value when data are transferred to the upper layer, in h3_rcv_buf(). The difference between the known length of the payload before and after a parsing loop is added to <kip> value. When a content-length is specified in the message, the h3s <body_len> field is used. Otherwise, it is the h3s <data_len> field.	2025-10-08 11:01:36 +02:00
Christopher Faulet	bc8c6c42f4	MINOR: mux-h2: Set known input payload length of the sedesc Set <kip> value when data are transferred to the upper layer, in h2_rcv_buf(). The new <body_len> field of the H2S is used to increment <kip> value and then it is reset. The patch relies on the previous one ("MINOR: mux-h2: Save the known length of the payload").	2025-10-08 11:01:36 +02:00
Christopher Faulet	3a6a576e73	MINOR: mux-h2: Use <body_len> H2S field for payload without content-length Before, the <body_len> H2S field was only used to verify that the announced content-length value was respected. Now, this field is used for all messages. Messages with a content-length are still handled the same way. <body_len> is set to the content-length value and decremented by the size of each DATA frame. For other messages, the value is initialized to ULLONG_MAX and still decremented by the size of each DATA frame. This change is mandatory to properly define the known input payload length value of the sedesc.	2025-10-08 11:01:36 +02:00
Christopher Faulet	4fdc23e648	MINOR: mux-fcgi: Set known input payload length during demux Set <kip> value during the response parsing. The difference between the body length before and after a parsing loop is added. The patch relies on the previous one ("MINOR: h1-htx: Increment body len when parsing a payload with no xfer length").	2025-10-08 11:01:36 +02:00
Christopher Faulet	2bf2f68cd8	MINOR: mux-h1: Set known input payload length during demux Set <kip> value during the message parsing. The difference between the body length before and after a parsing loop is added. The patch relies on the previous one ("MINOR: h1-htx: Increment body len when parsing a payload with no xfer length").	2025-10-08 11:01:36 +02:00
Christopher Faulet	c9bc18c0bf	MINOR: h1-htx: Increment body len when parsing a payload with no xfer length In the H1 parser, the body length was only incremented when the transfer length was known, that is when the content-length was specified or when the transfer-encoding value was set to "chunked". Now for messages with unknown transfer length, it is also incremented. This is mandatory to be able to remove the extra field from the HTX message.	2025-10-08 11:01:36 +02:00
Christopher Faulet	c0b6db2830	MINOR: stconn: Add two fields in sedesc to replace the HTX extra value For now, the HTX extra value is used to specify the known part, in bytes, of the HTTP payload we will receive. It may concern the full payload if a content-length is specified or the current chunk for a chunk-encoded message. The main purpose of this value is to be used on the opposite side to be able to announce chunks bigger than a buffer. It can also be used to check the validity of the payload on the sending path, to properly detect too big or too short payload. However, setting this information in the HTX message itself is not really appropriate because the information is lost when the HTX message is consumed and the underlying buffer released. So the producer must take care to always add it in all HTX messages. It is especially an issue when the payload is altered by a filter. So to fix this design issue, the information will be moved in the sedesc. It is a persistent area to save the information. In addition, to avoid the ambiguity between what the producer says and what the consumer sees, the information will be split in two fields. In this patch, the fields are added: * kip : The known input payload length * kop : The known output payload length The producer will be responsible for setting the <kip> value. The stream will be responsible for decrementing <kip> and incrementing <kop> accordingly. And the consumer will be responsible for removing consumed bytes from <kop>.	2025-10-08 11:01:36 +02:00
Christopher Faulet	586511c278	MINOR: h3/qmux: Set QC_SF_UNKNOWN_PL_LENGTH flag on QCS when headers are sent The QC_SF_UNKNOWN_PL_LENGTH flag is set on the qcs to indicate that the message payload has an unknown length, so that no RESET_STREAM is sent on shutdown. This flag was based on the HTX extra field value. However, that is not necessary. When headers are processed, before sending them, it is possible to check the HTX start-line to know whether the length of the payload is known. So let's do so and no longer use the HTX extra field for this purpose.	2025-10-08 11:01:36 +02:00
Willy Tarreau	00b27a993f	MAJOR: proxy: enable abortonclose by default on TLS listeners In the continuity of https://github.com/orgs/haproxy/discussions/3146, we must also enable abortonclose by default for TLS listeners so as not to needlessly compute TLS handshakes on dead connections. The change is very small (just set the default value to 1 in the TLS code when neither the option nor its opposite were set). It may possibly cause some TLS handshakes to start failing with 3.3 in certain legacy environments (e.g. TLS health-checks performed using only a client hello and closing afterwards), and in this case it is sufficient to disable the option using "no option abortonclose" in either the affected frontend or the "defaults" section it derives from.	2025-10-08 10:36:59 +02:00
Willy Tarreau	75103e7701	MINOR: proxy: introduce proxy_abrt_close_def() to pass the desired default With this function we can now pass the desired default value for the abortonclose option when neither the option nor its opposite were set. Let's also take this opportunity for using it directly from the HTTP analyser since there's no point in re-checking the proxy's mode there.	2025-10-08 10:29:41 +02:00
Willy Tarreau	644b3dc7d8	MAJOR: proxy: enable abortonclose by default on HTTP proxies As discussed on https://github.com/orgs/haproxy/discussions/3146 and on the mailing list, there's a marked preference for having abortonclose enabled by default when relevant. The point being that with today's internet, the large majority of requests sent with a closed input channel are aborted requests, and that it's pointless to waste resources processing them. This patch now considers both "option abortonclose" and its opposite "no option abortonclose" to figure whether abortonclose is enabled or disabled in a backend. When neither is set (thus not even inherited from a defaults section), then it considers the proxy's mode, and HTTP mode implies abortonclose by default. This may make some legacy services fail starting with 3.3. In this case it will be sufficient to add "no option abortonclose" in either the affected backend or the defaults section it derives from. But for internet-facing proxies it's better to stay with the option enabled.	2025-10-08 10:29:41 +02:00
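The three-way decision described in 644b3dc7d8 (explicitly enabled, explicitly disabled, or unset and derived from the proxy's mode) can be sketched as follows. This is a minimal illustration, not the actual haproxy code: the enum, struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types standing in for the proxy configuration */
enum px_mode_sketch { SK_MODE_TCP, SK_MODE_HTTP };
enum opt_state_sketch { SK_OPT_UNSET, SK_OPT_ON, SK_OPT_OFF };

struct proxy_sketch {
	enum px_mode_sketch mode;
	enum opt_state_sketch abortonclose; /* "option" / "no option" / neither */
};

/* Returns the explicit setting if any, otherwise the caller's default */
static int abrt_close_def_sketch(const struct proxy_sketch *px, int dflt)
{
	assert(px != NULL);
	if (px->abortonclose == SK_OPT_ON)
		return 1;
	if (px->abortonclose == SK_OPT_OFF)
		return 0;
	return dflt;
}

/* Backend view: when nothing was configured, HTTP mode implies "enabled" */
static int abrt_close_sketch(const struct proxy_sketch *px)
{
	assert(px != NULL);
	return abrt_close_def_sketch(px, px->mode == SK_MODE_HTTP);
}
```

With this shape, an explicit "no option abortonclose" always wins over the mode-based default, which matches the workaround the commit message suggests for legacy services.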
Willy Tarreau	fe47e8dfc5	MINOR: proxy: only check abortonclose through a dedicated function In order to prepare for changing the way abortonclose works, let's replace the direct flag check with a similarly named function (proxy_abrt_close) which returns the on/off status of the directive for the proxy. For now it simply reflects the flag's state.	2025-10-08 10:29:41 +02:00
Willy Tarreau	687504344a	REGTESTS: http-messaging: fix the websocket and upgrade tests not to close early By default when building an H2 request, vtest sets the END_STREAM flag on the HEADERS frame. This is problematic with the websocket and proto upgrade tests since we're using CONNECT, because it immediately closes afterwards, which does not correspond to what we're testing. Doing this in abortonclose mode rightfully produces an error. Let's fix the test so as not to set the flag on the HEADERS frame. However, doing so means we'll receive a window update that we must also accept. Now the test works both with and without abortonclose.	2025-10-08 10:29:41 +02:00
Willy Tarreau	8573c5e2a1	REGTESTS: fix h2_desync_attacks to wait for the response Tests with abortonclose showed a bug with this test where the client would close the stream immediately after sending the request, without waiting for the response, causing some random failures on the server side.	2025-10-08 10:29:41 +02:00
Willy Tarreau	c42e62d890	MINOR: proxy: explicitly permit abortonclose on frontends and clarify the doc The "abortonclose" option was recently deprecated in frontends because its action was essentially limited to the backend part (queuing etc). But in 3.3 we started to support it for TLS on frontends, though it would only work when placed in a defaults section. Let's officially support it in frontends, and take this opportunity to clarify the documentation on this topic, which was incomplete regarding frontend and TLS support. Now the doc tries to better cover the different use cases.	2025-10-08 10:29:41 +02:00
Willy Tarreau	f657ffc6e7	DEV: patchbot: use git reset+checkout instead of pull The patchbot stopped on a previous ultra-rare forced push due to wanting the user's name and e-mail before proceeding. We don't want merges nor rebases anyway, only to reset the tree to the next one, so let's do that.	2025-10-08 04:38:35 +02:00
William Lallemand	45fba1db27	BUG/MINOR: acme: avoid overflow when diff > notAfter Avoid an overflow or a negative value if notAfter < diff. This is unlikely to provoke any problem. Fixes issue #3138. Must be backported to 3.2.	2025-10-07 10:54:58 +02:00
William Lallemand	69bd253b23	CLEANUP: mjson: remove unused defines from mjson.h This patch removes unused defines from mjson.h. It also removes unused c++ declarations and includes. string.h is moved to mjson.c	2025-10-06 09:30:07 +02:00
Christopher Faulet	8219fa1842	BUG/MINOR: http-ana: Reset analyse_exp date after 'wait-for-body' action The 'wait-for-body' action sets the analyse_exp date for the channel to the configured time. However, when the action is finished, it does not reset it. This is an issue for some following actions, like 'pause', that also rely on this date. To fix the issue, we must take care to reset the analyse_exp date to TICK_ETERNITY when the 'wait-for-body' action is finished. This patch should fix the issue #3147. It must be backported to all stable versions.	2025-10-03 17:09:16 +02:00
William Lallemand	61933a96a6	CLEANUP: mjson: remove unused defines and math.h Remove unused defines for MSVC which is not used in the case of haproxy, and remove math.h which is not used as well.	2025-10-03 16:09:51 +02:00
William Lallemand	8ea8aaace2	CLEANUP: mjson: remove MJSON_ENABLE_BASE64 code Remove the code used under #if MJSON_ENABLE_BASE64, which is not used within haproxy, to ease the maintenance of mjson.	2025-10-03 16:09:13 +02:00
William Lallemand	4edb05eb12	CLEANUP: mjson: remove MJSON_ENABLE_NEXT code Remove the code used under #if MJSON_ENABLE_NEXT, which is not used within haproxy, to ease the maintenance of mjson.	2025-10-03 16:08:17 +02:00
William Lallemand	a4eeeeeb07	CLEANUP: mjson: remove MJSON_ENABLE_PRINT code Remove the code used under #if MJSON_ENABLE_PRINT, which is not used within haproxy, to ease the maintenance of mjson.	2025-10-03 16:07:59 +02:00
William Lallemand	d63dfa34a2	CLEANUP: mjson: remove MJSON_ENABLE_RPC code Remove the code used under #if MJSON_ENABLE_RPC, which is not used within haproxy, to ease the maintenance of mjson.	2025-10-03 16:06:33 +02:00
Aurelien DARRAGON	c26ac3f5e4	BUG/MINOR: sink: retry attempt for sft server may never occur Since 9561b9fb6 ("BUG/MINOR: sink: add tempo between 2 connection attempts for sft servers"), there is a possibility that the tempo we use to schedule the task expiry may point to TICK_ETERNITY as we add ticks to tempo with a simple addition that doesn't take care of potential wrapping. When this happens (although relatively rare, since now_ms only wraps every 49.7 days, but a forced wrap occurs 20 seconds after haproxy is started so it is more likely to happen there), the process_sink_forward() task expiry being set to TICK_ETERNITY, it may never be called again, this is especially true if the ring section only contains a single server. To fix the issue, we must use tick_add() helper function to set the tempo value and this way we ensure that the value will never be TICK_ETERNITY. It must be backported everywhere 9561b9fb6 was backported (up to 2.6 it seems).	2025-10-03 14:31:05 +02:00
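The wrapping problem fixed by c26ac3f5e4 can be reproduced in isolation. The sketch below assumes a 32-bit millisecond tick where the value 0 is reserved to mean "no expiry" (TICK_ETERNITY); the function names are hypothetical and only mimic the idea behind the tick_add() helper, they are not copied from haproxy.

```c
#include <assert.h>
#include <stdint.h>

#define SK_TICK_ETERNITY 0u /* assumed reserved value meaning "never expires" */

/* Plain addition: can land exactly on the reserved value when wrapping */
static uint32_t naive_tick_add(uint32_t now, uint32_t delay)
{
	return now + delay;
}

/* Wrap-aware addition: skips the reserved value so the timer stays armed */
static uint32_t safe_tick_add(uint32_t now, uint32_t delay)
{
	uint32_t t = now + delay;

	if (t == SK_TICK_ETERNITY)
		t++;
	return t;
}

/* Small self-check showing the difference right before a wrap */
static void tick_wrap_example(void)
{
	assert(naive_tick_add(0xFFFFFFF0u, 0x10u) == SK_TICK_ETERNITY);
	assert(safe_tick_add(0xFFFFFFF0u, 0x10u) == 1u);
}
```

In the naive case a task whose expiry is computed this way would be seen as having no timeout at all, which is how the retry could be lost.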
Olivier Houchard	b01a00acb1	BUG/MEDIUM: connections: Only avoid creating a mux if we have one In connect_server(), only avoid creating a mux when we're reusing a connection, if that connection already has one. We can reuse a connection with no mux, if we made a first attempt at connecting to the server and it failed before we could create the mux (or during the mux creation). The connection will then be reused when trying again. This fixes a bug where a stream could stall if the first connection attempt failed before the mux creation. It is easy to reproduce by creating random memory allocation failure with -dmFail. This was introduced by commit 4aaf0bfbced22d706af08725f977dcce9845d340, and thus does not need any backport as long as that commit is not backported.	2025-10-03 13:13:10 +02:00
Christopher Faulet	d0084cb873	[RELEASE] Released version 3.3-dev9 Released version 3.3-dev9 with the following main changes : - BUG/MINOR: acl: Fix error message about several '-m' parameters - MINOR: server: Parse sni and pool-conn-name expressions in a dedicated function - BUG/MEDIUM: server: Use sni as pool connection name for SSL server only - BUG/MINOR: server: Update healthcheck when server settings are changed via CLI - OPTIM: backend: Don't set SNI for non-ssl connections - OPTIM: proto_rhttp: Don't set SNI for non-ssl connections - OPTIM: tcpcheck: Don't set SNI and ALPN for non-ssl connections - BUG/MINOR: tcpcheck: Don't use sni as pool-conn-name for non-SSL connections - MEDIUM: server/ssl: Base the SNI value to the HTTP host header by default - MEDIUM: httpcheck/ssl: Base the SNI value on the HTTP host header by default - OPTIM: tcpcheck: Reorder tcpchek_connect structure fields to fill holes - REGTESTS: ssl: Add a script to test the automatic SNI selection - MINOR: quic: add useful trace about padding params values - BUG/MINOR: quic: too short PADDING frame for too short packets - BUG/MINOR: cpu_topo: work around a small bug in musl's CPU_ISSET() - BUG/MEDIUM: ssl: Properly initialize msg_controllen. 
- MINOR: quic: SSL session reuse for QUIC - BUG/MEDIUM: proxy: fix crash with stop_proxy() called during init - MINOR: stats-file: use explicit unsigned integer bitshift for user slots - CLEANUP: quic: fix typo in quic_tx trace - TESTS: quic: add unit-tests for QUIC TX part - MINOR: quic: restore QUIC_HP_SAMPLE_LEN constant - REGTESTS: ssl: Fix the script about automatic SNI selection - BUG/MINOR: pools: Fix the dump of pools info to deal with buffers limitations - MINOR: pools: Don't dump anymore info about pools when purge is forced - BUG/MINOR: quic: properly support GSO on backend side - BUG/MEDIUM: mux-h2: Reset MUX blocking flags when a send error is caught - BUG/MEDIUM: mux-h2; Don't block reveives in H2_CS_ERROR and H2_CS_ERROR2 states - BUG/MEDIUM: mux-h2: Restart reading when mbuf ring is no longer full - BUG/MINOR: mux-h2: Remove H2_CF_DEM_DFULL flags when the demux buffer is reset - BUG/MEDIUM: mux-h2: Report RST/error to app-layer stream during 0-copy fwding - BUG/MEDIUM: mux-h2: Reinforce conditions to report an error to app-layer stream - BUG/MINOR: hq-interop: adjust parsing/encoding on backend side - OPTIM: check: do not delay MUX for ALPN if SSL not active - BUG/MEDIUM: checks: fix ALPN inheritance from server - BUG/MINOR: check: ensure checks are compatible with QUIC servers - MINOR: check: reject invalid check config on a QUIC server - MINOR: debug: report the process id in warnings and panics - DEBUG: stream: count the number of passes in the connect loop - MINOR: debug: report the number of loops and ctxsw for each thread - MINOR: debug: report the time since last wakeup and call - DEBUG: peers: export functions that use locks - MINOR: stick-table: permit stksess_new() to temporarily allocate more entries - MEDIUM: stick-tables: relax stktable_trash_oldest() to only purge what is needed - MEDIUM: stick-tables: give up on lock contention in process_table_expire() - MEDIUM: stick-tables: don't wait indefinitely in stktable_add_pend_updates() - 
MEDIUM: peers: don't even try to process updates under contention - BUG/MEDIUM: h1: Allow reception if we have early data - BUG/MEDIUM: ssl: create the mux immediately on early data - MINOR: ssl: Add a flag to let it known we have an ALPN negociated - MINOR: ssl: Use the new flag to know when the ALPN has been set. - MEDIUM: server: Introduce the concept of path parameters - CLEANUP: backend: clarify the role of the init_mux variable in connect_server() - CLEANUP: backend: invert the condition to start the mux in connect_server() - CLEANUP: backend: simplify the complex ifdef related to 0RTT in connect_server() - CLEANUP: backend: clarify the cases where we want to use early data - MEDIUM: server: Make use of the stored ALPN stored in the server - BUILD: ssl: address a recent build warning when QUIC is enabled - BUG/MINOR: activity: fix reporting of task latency - MINOR: activity: indicate the number of calls on "show tasks" - MINOR: tools: don't emit "+0" for symbol names which exactly match known ones - BUG/MEDIUM: stick-tables: don't loop on non-expirable entries - DEBUG: stick-tables: export stktable_add_pend_updates() for better reporting - BUG/MEDIUM: ssl: Fix a crash when using QUIC - BUG/MEDIUM: ssl: Fix a crash if we failed to create the mux - MEDIUM: dns: bind the nameserver sockets to the initiating thread - MEDIUM: resolvers: make the process_resolvers() task single-threaded - BUG/MINOR: stick-table: make sure never to miss a process_table_expire update - MEDIUM: stick-table: move process_table_expire() to a single thread - MEDIUM: peers: move process_peer_sync() to a single thread - BUG/MAJOR: stream: Force channel analysis on successful synchronous send - MINOR: quic: get rid of ->target quic_conn struct member - MINOR: quic-be: make SSL/QUIC objects use their own indexes (ssl_qc_app_data_index) - MINOR: quic: display build warning for compat layer on recent OpenSSL - DOC: quic: clarifies limited-quic support - BUG/MINOR: acme: null pointer 
dereference upon allocation failure - BUG/MEDIUM: jws: return size_t in JWS functions - BUG/MINOR: ssl: Potential NULL deref in trace macro - BUG/MINOR: ssl: Fix potential NULL deref in trace callback - BUG/MINOR: ocsp: prototype inconsistency - MINOR: ocsp: put internal functions as static ones - MINOR: ssl: set functions as static when no protypes in the .h - BUILD: ssl: functions defined but not used - BUG/MEDIUM: resolvers: Properly cache do-resolv resolution - BUG/MINOR: resolvers: Restore round-robin selection on records in DNS answers - MINOR: activity: don't report the lat_tot column for show profiling tasks - MINOR: activity: add a new lkw_avg column to show profiling stats - MINOR: activity: collect time spent waiting on a lock for each task - MINOR: thread: add a lock level information in the thread_ctx - MINOR: activity: add a new lkd_avg column to show profiling stats - MINOR: activity: collect time spent with a lock held for each task - MINOR: activity: add a new mem_avg column to show profiling stats - MINOR: activity: collect CPU time spent on memory allocations for each task - MINOR: activity/memory: count allocations performed under a lock - DOC: proxy-protocol: Add TLS group and sig scheme TLVs - BUG/MEDIUM: resolvers: Test for empty tree when getting a record from DNS answer - BUG/MEDIUM: resolvers: Make resolution owns its hostname_dn value - BUG/MEDIUM: resolvers: Accept to create resolution without hostname - BUG/MEDIUM: resolvers: Wake resolver task up whne unlinking a stream requester - BUG/MINOR: ocsp: Crash when updating CA during ocsp updates - Revert "BUG/MINOR: ocsp: Crash when updating CA during ocsp updates" - BUG/MEDIUM: http_ana: fix potential NULL deref in http_process_req_common() - MEDIUM: log/proxy: store log-steps selection using a bitmask, not an eb tree - BUG/MINOR: ocsp: Crash when updating CA during ocsp updates - BUG/MINOR: resolvers: always normalize FQDN from response - BUILD: makefile: implement support for running a 
command in range - IMPORT: cebtree: import version 0.5.0 to support duplicates - MEDIUM: migrate the patterns reference to cebs_tree - MEDIUM: guid: switch guid to more compact cebuis_tree - MEDIUM: server: switch addr_node to cebis_tree - MEDIUM: server: switch conf.name to cebis_tree - MEDIUM: server: switch the host_dn member to cebis_tree - MEDIUM: proxy: switch conf.name to cebis_tree - MEDIUM: stktable: index table names using compact trees - MINOR: proxy: add proxy_get_next_id() to find next free proxy ID - MINOR: listener: add listener_get_next_id() to find next free listener ID - MINOR: server: add server_get_next_id() to find next free server ID - CLEANUP: server: use server_find_by_id() when looking for already used IDs - MINOR: server: add server_index_id() to index a server by its ID - MINOR: listener: add listener_index_id() to index a listener by its ID - MINOR: proxy: add proxy_index_id() to index a proxy by its ID - MEDIUM: proxy: index proxy ID using compact trees - MEDIUM: listener: index listener ID using compact trees - MEDIUM: server: index server ID using compact trees - CLEANUP: server: slightly reorder fields in the struct to plug holes - CLEANUP: proxy: slightly reorganize fields to plug some holes - CLEANUP: backend: factor the connection lookup loop - CLEANUP: server: use eb64_entry() not ebmb_entry() to convert an eb64 - MINOR: server: pass the server and thread to srv_migrate_conns_to_remove() - CLEANUP: backend: use a single variable for removed in srv_cleanup_idle_conns() - MINOR: connection: pass the thread number to conn_delete_from_tree() - MEDIUM: connection: move idle connection trees to ceb64 - MEDIUM: connection: reintegrate conn_hash_node into connection - CLEANUP: tools: use the item API for the file names tree - CLEANUP: vars: use the item API for the variables trees - BUG/MEDIUM: pattern: fix possible infinite loops on deletion - CI: scripts: add support for git in openssl builds - CI: github: add an OpenSSL + ECH job - 
CI: scripts: mkdir BUILDSSL_TMPDIR - Revert "BUG/MEDIUM: pattern: fix possible infinite loops on deletion" - BUG/MEDIUM: pattern: fix possible infinite loops on deletion (try 2) - CLEANUP: log: remove deadcode in px_parse_log_steps() - MINOR: counters: document that tg shared counters are tied to shm-stats-file mapping - DOC: internals: document the shm-stats-file format/mapping - IMPORT: ebtree: delete unusable ebpttree.c - IMPORT: eb32/eb64: reorder the lookup loop for modern CPUs - IMPORT: eb32/eb64: use a more parallelizable check for lack of common bits - IMPORT: eb32: drop the now useless node_bit variable - IMPORT: eb32/eb64: place an unlikely() on the leaf test - IMPORT: ebmb: optimize the lookup for modern CPUs - IMPORT: eb32/64: optimize insert for modern CPUs - IMPORT: ebtree: only use __builtin_prefetch() when supported - IMPORT: ebst: use prefetching in lookup() and insert() - IMPORT: ebtree: Fix UB from clz(0) - IMPORT: ebtree: add a definition of offsetof() - IMPORT: ebtree: replace hand-rolled offsetof to avoid UB - MINOR: listener: add the "cc" bind keyword to set the TCP congestion controller - MINOR: server: add the "cc" keyword to set the TCP congestion controller - BUG/MEDIUM: ring: invert the length check to avoid an int overflow - MINOR: trace: don't call strlen() on the thread-id numeric encoding - MINOR: trace: don't call strlen() on the function's name - OPTIM: sink: reduce contention on sink_announce_dropped() - OPTIM: sink: don't waste time calling sink_announce_dropped() if busy - CLEANUP: ring: rearrange the wait loop in ring_write() - OPTIM: ring: always relax in the ring lock and leader wait loop - OPTIM: ring: check the queue's owner using a CAS on x86 - OPTIM: ring: avoid reloading the tail_ofs value before the CAS in ring_write() - BUG/MEDIUM: sink: fix unexpected double postinit of sink backend - MEDIUM: stats: consider that shared stats pointers may be NULL - BUG/MEDIUM: http-client: Fix the test on the response start-line - 
MINOR: acme: acme-vars allow to pass data to the dpapi sink - MINOR: acme: check acme-vars allocation during escaping - BUG/MINOR: acme/cli: wrong description for "acme challenge_ready" - CI: move VTest preparation & friends to dedicated composite action - BUG/MEDIUM: stick-tables: Don't let table_process_entry() handle refcnt - BUG/MINOR: compression: Test payload size only if content-length is specified - BUG/MINOR: pattern: Properly flag virtual maps as using samples - BUG/MINOR: acme: possible overflow on scheduling computation - BUG/MINOR: acme: possible overflow in acme_will_expire() - CLEANUP: acme: acme_will_expire() uses acme_schedule_date() - BUG/MINOR: pattern: Fix pattern lookup for map with opt@ prefix - CI: scripts: build curl with ECH support - CI: github: add curl+ech build into openssl-ech job - BUG/MEDIUM: ssl: ca-file directory mode must read every certificates of a file - MINOR: acme: provider-name for dpapi sink - BUILD: acme: fix false positive null pointer dereference - MINOR: backend: srv_queue helper - MINOR: backend: srv_is_up converter - BUILD: halog: misleading indentation in halog.c - CI: github: build halog on the vtest job - BUG/MINOR: acme: don't unlink from acme_ctx_destroy() - BUG/MEDIUM: acme: cfg_postsection_acme() don't init correctly acme sections - MINOR: acme: implement "reuse-key" option - ADMIN: haproxy-dump-certs: implement a certificate dumper - ADMIN: dump-certs: don't update the file if it's up to date - ADMIN: dump-certs: create files in a tmpdir - ADMIN: dump-certs: fix lack of / in -p - ADMIN: dump-certs: use same error format as haproxy - ADMIN: reload: add a synchronous reload helper - BUG/MEDIUM: acme: free() of i2d_X509_REQ() with AWS-LC - ADMIN: reload: introduce verbose and silent mode - ADMIN: reload: introduce -vv mode - MINOR: mt_list: Implement MT_LIST_POP_LOCKED() - BUG/MEDIUM: stick-tables: Make sure not to free a pending entry - MINOR: sched: let's permit to share the local ctx between threads - MINOR: 
sched: pass the thread number to is_sched_alive() - BUG/MEDIUM: wdt: improve stuck task detection accuracy - MINOR: ssl: add the ssl_bc_sni sample fetch function to retrieve backend SNI - MINOR: rawsock: introduce CO_RFL_TRY_HARDER to detect closures on complete reads - MEDIUM: ssl: don't always process pending handshakes on closed connections - MEDIUM: servers: Schedule the server requeue target on creation - MEDIUM: fwlc: Make it so fwlc_srv_reposition works with unqueued srv - BUG/MEDIUM: fwlc: Handle memory allocation failures. - DOC: config: clarify some known limitations of the json_query() converter - BUG/CRITICAL: mjson: fix possible DoS when parsing numbers - BUG/MINOR: h2: forbid 'Z' as well in header field names checks - BUG/MINOR: h3: forbid 'Z' as well in header field names checks - BUG/MEDIUM: resolvers: break an infinite loop in resolv_get_ip_from_response()	2025-10-03 12:12:51 +02:00
Willy Tarreau	ced9784df4	BUG/MEDIUM: resolvers: break an infinite loop in resolv_get_ip_from_response() The fix in 3023e98199 ("BUG/MINOR: resolvers: Restore round-robin selection on records in DNS answers") still contained an issue not addressed by f6dfbbe870 ("BUG/MEDIUM: resolvers: Test for empty tree when getting a record from DNS answer"). Indeed, if the next element is the same as the first one, then we can end up with an endless loop because the test at the end compares the next pointer (possibly null) with the end one (first). Let's move the null->first transition at the end. This must be backported where the patches above were backported (3.2 for now).	2025-10-03 09:08:10 +02:00
zhanhb	ad75431b9c	BUG/MINOR: h3: forbid 'Z' as well in header field names checks The current tests in _h3_handle_hdr() and h3_trailers_to_htx() check for an interval between 'A' and 'Z' for letters in header field names that should be forbidden, but mistakenly leave the 'Z' out of the forbidden range, resulting in it being implicitly valid. This has no real consequences but should be fixed for the sake of protocol validity checking. This must be backported to all relevant versions.	2025-10-02 15:30:02 +02:00
zhanhb	7163d9180c	BUG/MINOR: h2: forbid 'Z' as well in header field names checks The current tests in h2_make_htx_request(), h2_make_htx_response() and h2_make_htx_trailers() check for an interval between 'A' and 'Z' for letters in header field names that should be forbidden, but mistakenly leave the 'Z' out of the forbidden range, resulting in it being implicitly valid. This has no real consequences but should be fixed for the sake of protocol validity checking. This must be backported to all relevant versions.	2025-10-02 15:29:58 +02:00
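The off-by-one described in ad75431b9c and 7163d9180c can be illustrated with a short sketch. The exact expression used in the H2 and H3 code is not quoted in the commit messages, so the "buggy" form below is an assumption about what such a range test may look like; all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed buggy form: the interval stops one short, so 'Z' is not caught */
static int is_upper_buggy(unsigned char c)
{
	return (unsigned char)(c - 'A') < 'Z' - 'A';
}

/* Fixed form: the interval 'A'..'Z' is fully covered */
static int is_upper_fixed(unsigned char c)
{
	return (unsigned char)(c - 'A') <= 'Z' - 'A';
}

/* Rejects header field names containing any uppercase ASCII letter */
static int header_name_is_valid(const char *name)
{
	assert(name != NULL);
	for (; *name; name++) {
		if (is_upper_fixed((unsigned char)*name))
			return 0;
	}
	return 1;
}
```

With the buggy test a name such as "fooZ" would pass while "fooY" would be rejected, which is the inconsistency both patches remove.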
Willy Tarreau	06675db4bf	BUG/CRITICAL: mjson: fix possible DoS when parsing numbers Mjson comes with its own strtod() implementation for portability reasons and probably also because many generic strtod() versions as provided by operating systems do not focus on resource preservation and may call malloc(), which is not welcome in a parser. The strtod() implementation used here apparently originally comes from https://gist.github.com/mattn/1890186 and seems to have purposely omitted a few parts that were considered as not needed in this context (e.g. skipping white spaces, or setting errno). But when subject to the relevant test cases of the designated file above, the current function provides the same results. The aforementioned implementation uses pow() to calculate exponents, but mjson authors visibly preferred not to introduce a libm dependency and replaced it with an iterative loop in O(exp) time. The problem is that the exponent is not bounded and that this loop can take a huge amount of time. There's even an issue already opened on mjson about this: https://github.com/cesanta/mjson/issues/59. In the case of haproxy, fortunately, the watchdog will quickly stop a runaway process but this remains a possible denial of service. A first approach would consist in reintroducing pow() like in the original implementation, but if haproxy is built without Lua nor 51Degrees, -lm is not used so this will not work everywhere. Anyway here we're dealing with integer exponents, so an easy alternate approach consists in simply using shifts and squares, to compute the exponent in O(log(exp)) time. Not only does it not introduce any new dependency, but it turns out to be even faster than the generic pow() (85k req/s per core vs 83.5k on the same machine). This must be backported as far as 2.4, where mjson was introduced. Many thanks to Oula Kivalo for reporting this issue. CVE-2025-11230 was assigned to this issue.	2025-10-02 09:37:43 +02:00
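The "shifts and squares" approach mentioned in 06675db4bf is the classic square-and-multiply technique. The sketch below shows the general idea for a non-negative integer exponent; it is not the code from the patch, and the function name is hypothetical.

```c
#include <assert.h>

/* Computes base^exp for exp >= 0 with one loop iteration per bit of <exp>,
 * i.e. O(log(exp)) multiplications instead of O(exp).
 */
static double pow_by_squaring(double base, int exp)
{
	double result = 1.0;

	assert(exp >= 0);
	while (exp) {
		if (exp & 1)
			result *= base; /* this bit of the exponent is set */
		base *= base;           /* move on to the next power of two */
		exp >>= 1;
	}
	return result;
}
```

Even an exponent close to INT_MAX only takes about 31 iterations here, so a hostile number in the input can no longer keep the parser busy for a long time.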
Willy Tarreau	67603162c1	DOC: config: clarify some known limitations of the json_query() converter Oula Kivalo reported that different JSON libraries may process duplicate keys differently and that most JSON libraries usually decode the stream before extracting keys, while the current mjson implementation decodes the contents during extraction instead. Let's document this point so that users are aware of the limitations and do not rely on the current behavior and do not use it for what it's not made for (e.g. content sanitization). This is also the case for jwt_header_query(), jwt_payload_query() and jwt_verify(), which already refer to this converter for specificities.	2025-10-02 08:57:39 +02:00
Olivier Houchard	b71bb6c2ae	BUG/MEDIUM: fwlc: Handle memory allocation failures. Properly handle memory allocation failures, by checking the return value for pool_alloc(), and if it fails, make sure that the caller will take it into account. The only use of pool_alloc() in fwlc is to allocate the tree elements in order to properly queue the server into the ebtree, so if that allocation fails, just schedule the requeue tasklet, that will try again, until it hopefully eventually succeeds. This should be backported to 3.2. This should fix github issue #3143.	2025-10-01 18:13:33 +02:00
Olivier Houchard	f4a9c6ffae	MEDIUM: fwlc: Make it so fwlc_srv_reposition works with unqueued srv Modify fwlc_srv_reposition() so that it does not assume that the server was already queued, and so make it so it works even if s->tree_elt is NULL. While the server will usually be queued, there is an unlikely possibility that when the server attempted to get queued when it got up, it failed due to a memory allocation failure, and it just expects the server_requeue tasklet to run to take care of that later. This should be backported to 3.2. This is part of an attempt to fix github issue #3143.	2025-10-01 18:13:33 +02:00
Olivier Houchard	822ee90dc2	MEDIUM: servers: Schedule the server requeue target on creation On creation, schedule the server requeue once it's been created. It is possible that when the server went up, it tried to queue itself into the lb specific code, failed to do so, and expects the tasklet to run to take care of that. This should be backported to 3.2. This is part of an attempt to fix github issue #3143.	2025-10-01 18:13:33 +02:00
Willy Tarreau	7ea80cc5b6	MEDIUM: ssl: don't always process pending handshakes on closed connections If a client aborts a pending SSL connection for whatever reason (timeout etc) and the listen queue is large, it may inflict a severe load on a frontend which will spend the CPU creating new sessions then killing the connection. This is similar to HTTP requests aborted just after being sent, except that asymmetric crypto is way more expensive. Unfortunately "option abortonclose" has no effect on this, because it only applies at a higher level. This patch ensures that handshakes being received on a frontend having "option abortonclose" set will be checked for a pending close, and if this is the case, then the connection will be aborted before the heavy calculations. The principle is to use recv(MSG_PEEK) to detect the end, and to destroy the pending handshake data before returning to the SSL library so that it cannot start computing, notices the error and stops. We don't do it without abortonclose though, because this can be used for health checks from other haproxy nodes or even other components which just want to see a handshake succeed. This is in relation with GH issue #3124.	2025-10-01 10:23:04 +02:00
Willy Tarreau	1afaa7b59d	MINOR: rawsock: introduce CO_RFL_TRY_HARDER to detect closures on complete reads Normally, when reading a full buffer, or exactly the requested size, it is not really possible to know if the peer had closed immediately after, and usually we don't care. There's a problematic case, though, which is with SSL: the SSL layer reads in small chunks of a few bytes, and can consume a client_hello this way, then start computation without knowing yet that the client has aborted. In order to permit knowing more, we now introduce a new read flag, CO_RFL_TRY_HARDER, which says that if we've read up to the permitted limit and the flag is set, then we attempt one extra byte using MSG_PEEK to detect whether the connection was closed immediately after that content or not. The first use case will obviously be related to SSL and client_hello, but it might possibly also make sense on HTTP responses to detect a pending FIN at the end of a response (e.g. if a close was already advertised).	2025-10-01 10:23:01 +02:00
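The extra one-byte peek described in 1afaa7b59d can be tried outside of haproxy. The sketch below is an illustration under assumptions: it uses a local socket pair instead of a real client connection, relies on MSG_DONTWAIT (available on Linux and the BSDs), and its function names are hypothetical rather than taken from the rawsock code.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* After a read that returned exactly the requested size, peek one more byte
 * without consuming it. recv() returning 0 means the peer already closed;
 * -1 (EAGAIN) means still open with nothing pending; 1 means more data.
 */
static int closed_right_after(int fd)
{
	char probe;

	assert(fd >= 0);
	return recv(fd, &probe, 1, MSG_PEEK | MSG_DONTWAIT) == 0;
}

/* Returns 1 if the closure is reported only once the peer really closed,
 * 0 if detection misbehaved, -1 on setup error.
 */
static int demo_peek_detection(void)
{
	int sv[2];
	char buf[5];
	int before, after;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;
	if (write(sv[0], "hello", 5) != 5 || read(sv[1], buf, 5) != 5) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	before = closed_right_after(sv[1]); /* peer still open: not closed */
	close(sv[0]);                       /* peer goes away after its data */
	after = closed_right_after(sv[1]);  /* pending close is now visible */
	close(sv[1]);
	return !before && after;
}
```

The complete read of five bytes alone says nothing about the peer's state; only the additional peek reveals the pending close, which is the situation the new read flag is meant to cover.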
Willy Tarreau	dae4cfe8c5	MINOR: ssl: add the ssl_bc_sni sample fetch function to retrieve backend SNI Sometimes in order to debug certain difficult situations it can be useful to know what SNI was configured on a connection going to a server, for example to match it against what the server saw or to detect cases where a server would route on SNI instead of Host. This sample fetch function simply retrieves the SNI configured on the backend connection, if any.	2025-10-01 10:18:53 +02:00
Willy Tarreau	205f1cbf4c	BUG/MEDIUM: wdt: improve stuck task detection accuracy The fact that the watchdog timer measures the execution time from the last return from the poller tends to amplify the impact of multiple bad tasks, and may explain some of the panics reported by Felipe and Ricardo in GH issues #3084, #3092 and #3101. The problem is that we check the time if we see that the scheduler appears not to be moving anymore, but one situation may still arise and catch a bad task: - one slow task takes so long a time that it triggers the watchdog twice, emitting a warning the second time (~200ms). The scheduler is rightfully marked as stuck. - then it completes and the scheduler is no longer stuck. Many other tasks run in turn, they all take quite some time but not enough to trigger a warning. But collectively their cost adds up. - then a task takes more than the warning time (100ms), and causes the total execution time to cross the second. The watchdog is called, sees that we've spent more than 1 second since we left the poller, and marks the thread as stuck. - the task is not finished, the watchdog is called again, sees more than one second with a stuck thread and panics 100ms later. The total time away from the poller is indeed more than one second, which is very bad, but no single task caused this individually, and while the warnings are OK, the watchdog should not panic in this case. This patch revisits the approach to store the moment the scheduler was marked as stuck in the wdt context. The idea is that this date will be used to detect warnings and panics. And by doing so and exploiting the new is_sched_alive(thr), we can greatly simplify the mechanism so that the signal handling thread does the strict minimum (mark the scheduler as possibly stuck and update the stuck_start date), and only bounces to the reporting thread if the scheduler made no progress since last call.
This means that without even doing computations in the handling thread, we can continue to avoid all bounces unless a warning is required. Then when the reporting thread is signaled, it will check the dates from the last moment the scheduler was marked, and will decide to warn or panic. The panic decision continues to pass via a TH_FL_STUCK flag to probe the code so that exceptionally slow code (e.g. live cert generation etc) can still find a way to avoid the panic if absolutely certain that things are still moving. This means that now we have the guarantee that panics will only happen if a given task spends more than one full second not moving, and that warnings will be issued for other calls crossing the warn delay boundary. This was tested using artificially slow operations, and all combinations which individually took less than a second only resulted in floods of warnings even if the total reported time in the warning was much higher, while those above one second provoked the panic. One improvement could consist in reporting the time since last stuck in the thread dumps to differentiate the individual task from the whole set. This needs to be backported to 3.2 along with the two previous patches: MINOR: sched: let's permit to share the local ctx between threads MINOR: sched: pass the thread number to is_sched_alive()	2025-10-01 10:18:53 +02:00
Willy Tarreau	25f5f357cc	MINOR: sched: pass the thread number to is_sched_alive() Now it will be possible to query any thread's scheduler state, not only the current one. This aims at simplifying the watchdog checks for reported threads. The operation is now a simple atomic xchg.	2025-10-01 10:18:53 +02:00
Willy Tarreau	7c7e17a605	MINOR: sched: let's permit to share the local ctx between threads The watchdog timer has to go through complex operations due to not being able to check if another thread's scheduler is still ticking. This is simply because the scheduler status is marked as thread-local while it could in fact also be an array. Let's do that (and align the array to avoid false sharing) so that it's now possible to check any scheduler's status.	2025-10-01 10:18:53 +02:00
Olivier Houchard	21ae35dd29	BUG/MEDIUM: stick-tables: Make sure not to free a pending entry There is a race condition, an entry can be free'd by stksess_kill() between the time stktable_add_pend_updates() gets the entry from the mt_list, and the time it adds it to the ebtree. To prevent this, use the newly implemented MT_LIST_POP_LOCKED() to keep the stksess locked until it is added to the tree. That way, __stksess_kill() will wait until we're done with it. This should be backported to 3.2.	2025-09-30 16:25:07 +02:00
Olivier Houchard	cf26745857	MINOR: mt_list: Implement MT_LIST_POP_LOCKED() Implement MT_LIST_POP_LOCKED(), that behaves as MT_LIST_POP() and removes the first element from the list, if any, but keeps it locked. This should be backported to 3.2, as it will be used in a bug fix in the stick tables that affects 3.2 too.	2025-09-30 16:25:07 +02:00
William Lallemand	6316f958e3	ADMIN: reload: introduce -vv mode The -v verbose mode displays the loading messages returned by the master CLI reload command upon error. The new -vv mode displays the loading messages even upon success, showing the content of `show startup-logs` after the reload attempt.	2025-09-29 19:29:10 +02:00
William Lallemand	5d05f343b9	ADMIN: reload: introduce verbose and silent mode By default haproxy-reload displays the errors that are not emitted by haproxy, but only emitted by haproxy-reload. -s silent mode, don't display any error -v verbose mode, display the loading messages returned by the master CLI reload command upon error.	2025-09-29 19:29:10 +02:00
William Lallemand	3ce597bfa2	BUG/MEDIUM: acme: free() of i2d_X509_REQ() with AWS-LC When using AWS-LC, the free() of the data ptr resulting from i2d_X509_REQ() might crash, because it uses the free() of the libc instead of OPENSSL_free(). It does not seem to be a problem on openssl builds. Must be backported to 3.2.	2025-09-29 13:46:51 +02:00
William Lallemand	8635c7d789	ADMIN: reload: add a synchronous reload helper haproxy-reload is a utility script which reload synchronously using the master CLI, instead of asynchronously with kill.	2025-09-28 22:10:40 +02:00
William Lallemand	02f7bff90b	ADMIN: dump-certs: use same error format as haproxy Replace error/notice by [ALERT]/[WARNING]/[NOTICE] like it's done in haproxy. ALERT means a failure and the program will exit 1 just after it WARNING will continue the execution of the program NOTICE will continue the execution as well	2025-09-28 20:21:07 +02:00
William Lallemand	5c9f28641b	ADMIN: dump-certs: fix lack of / in -p Add a trailing / so -p doesn't fail if it wasn't specified.	2025-09-28 18:21:25 +02:00
William Lallemand	172ac6ad03	ADMIN: dump-certs: create files in a tmpdir Files dumped from the socket are put in a temporary directory, this directory is then removed upon exit. Variables were renamed to be clearer: - crt_filename -> prev_crt - key_filename -> prev_key - ${crt_filename}.${tmp} -> new_crt - ${key_filename}.${tmp} -> new_key	2025-09-28 18:21:25 +02:00
William Lallemand	8781c65d8a	ADMIN: dump-certs: don't update the file if it's up to date Compare the fingerprint of the leaf certificate to the previous file to check if it needs to be updated or not Also skip the check if no file is on the disk.	2025-09-28 18:21:20 +02:00
William Lallemand	3a6ea8b959	ADMIN: haproxy-dump-certs: implement a certificate dumper haproxy-dump-certs is a bash script that connects to your master socket or your stat socket in order to dump certificates from haproxy memory to the corresponding files.	2025-09-28 13:38:48 +02:00
William Lallemand	b70c7f48fa	MINOR: acme: implement "reuse-key" option The new "reuse-key" option in the "acme" section, allows to keep the private key instead of generating a new one at each renewal.	2025-09-27 21:41:39 +02:00
William Lallemand	a9ccf692e7	BUG/MEDIUM: acme: cfg_postsection_acme() don't init correctly acme sections The cfg_postsection_acme() redefines its own cur_acme variable, pointing to the first acme section created. This means that the first section would be initialized multiple times, and the following sections would never be initialized. It could result in crashes at the first use of all sections that are not the first one. Must be backported to 3.2.	2025-09-27 19:58:44 +02:00
William Lallemand	406fd0ceb1	BUG/MINOR: acme: don't unlink from acme_ctx_destroy() Unlinking the acme_ctx element from acme_ctx_destroy() requires to have the element unlocked, because MT_LIST_DELETE() locks the element. acme_ctx_destroy() frees the data from acme_ctx with the ctx still linked and unlocked, then lock to unlink. So there's a small risk of accessing acme_ctx from somewhere else. The only way to do that would be to use the `acme challenge_ready` CLI command at the same time. Fix the issue by doing a mt_list_unlock_link() and a mt_list_unlock_self() to unlink the element under the lock, then destroy the element. This must be backported in 3.2.	2025-09-27 18:52:56 +02:00
William Lallemand	6499c0a0d5	CI: github: build halog on the vtest job halog was not built in the vtest job. Add it to vtest.yml to be able to track build issues on push.	2025-09-26 16:29:29 +02:00
William Lallemand	f1f5877ce1	BUILD: halog: misleading indentation in halog.c admin/halog/halog.c: In function 'filter_count_url': admin/halog/halog.c:1685:9: error: this 'if' clause does not guard... [-Werror=misleading-indentation] 1685 \| if (unlikely(!ustat)) \| ^~ admin/halog/halog.c:1687:17: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if' 1687 \| if (unlikely(!ustat)) { \| ^~ This patch fixes the indentation. Must be backported where fbd0fb20a22 ("BUG/MINOR: halog: Add OOM checks for calloc() in filter_count_srv_status() and filter_count_url()") was backported.	2025-09-26 16:01:50 +02:00
Chris Staite	54f53bc875	MINOR: backend: srv_is_up converter There is currently an srv_queue converter which is capable of taking the output of a dynamic name and determining the queue length for a given server. In addition there is a sample fetcher for whether a server is currently up. This simply combines the two such that srv_is_up can be used as a converter too. Future work might extend this to other sample fetchers for servers, but this is probably the most useful for acl routing.	2025-09-26 10:46:48 +02:00
Chris Staite	faba98c85f	MINOR: backend: srv_queue helper In preparation of providing further server converters, split the code for finding the server from the sample out. Additionally, update the documentation for srv_queue converter to note security concerns.	2025-09-26 10:46:48 +02:00
William Lallemand	b3b910cc3f	BUILD: acme: fix false positive null pointer dereference src/acme.c: In function ‘cfg_parse_acme_vars_provider’: src/acme.c:471:9: error: potential null pointer dereference [-Werror=null-dereference] 471 \| free(*dst); \| ^~~~~~~~~~ gcc13 on ubuntu 24.04 detects a false positive when building 3e72a9f ("MINOR: acme: provider-name for dpapi sink"). Indeed dst can't be NULL. Clarify the code so gcc doesn't complain anymore.	2025-09-26 10:34:35 +02:00
William Lallemand	3e72a9f618	MINOR: acme: provider-name for dpapi sink Like "acme-vars", the "provider-name" in the acme section is used in case of DNS-01 challenge and is sent to the dpapi sink. This is used to pass the name of a DNS provider in order to choose the DNS API to use. This patch implements the cfg_parse_acme_vars_provider() which parses either acme-vars or provider-name options and escapes their strings. Example: $ ( echo "@@1 show events dpapi -w -0"; cat - ) \| socat /tmp/master.sock - \| cat -e <0>2025-09-18T17:53:58.831140+02:00 acme deploy foobpar.pem thumbprint gDvbPL3w4J4rxb8gj20mGEgtuicpvltnTl6j1kSZ3vQ$ acme-vars "var1=foobar\"toto\",var2=var2"$ provider-name "godaddy"$ {$ "identifier": {$ "type": "dns",$ "value": "example.com"$ },$ "status": "pending",$ "expires": "2025-09-25T14:41:57Z",$ [...]	2025-09-26 10:23:35 +02:00
William Lallemand	c52d69cc78	BUG/MEDIUM: ssl: ca-file directory mode must read every certificates of a file The httpclient is configured with @system-ca by default, which uses the directory returned by X509_get_default_cert_dir(). On debian/ubuntu systems, this directory contains multiple certificate files that are loaded successfully. However it seems that on other systems the content of this directory is the direct output of ca-certificates instead of its source, meaning that you would only have a bundle file with every certificate in it. The loading was not done correctly in case of directory loading, and was only loading the first certificate of each file. This patch fixes the issue by using X509_STORE_load_locations() on each file from the scandir instead of trying to load it manually with BIO. Note that we can't use X509_STORE_load_locations with the `dir` argument, which would be simpler, because it uses X509_LOOKUP_hash_dir() which requires a directory in hash form. That wouldn't be suited for this use case. Must be backported to every stable branch. Fix issue #3137.	2025-09-26 09:36:55 +02:00
William Lallemand	230a072102	CI: github: add curl+ech build into openssl-ech job Build a curl binary with the ECH function linked with our openssl+ech library.	2025-09-25 17:05:46 +02:00
William Lallemand	44b20e0b01	CI: scripts: build curl with ECH support Add a script to build curl with ECH support, to specify the path of the openssl+ECH library, you should set the SSL_LIB variable with the prefix of the library. Example: SSL_LIB=/opt/openssl-ech CURL_DESTDIR=/opt/curl-ech/ ./build-curl.sh	2025-09-25 17:05:46 +02:00
Christopher Faulet	7aa9f5ec98	BUG/MINOR: pattern: Fix pattern lookup for map with opt@ prefix When we look for a map file reference, the file@ prefix is removed because it may be omitted. The same is true with opt@ prefix. However this case was not properly handled in pat_ref_lookup(). Let's do so. This patch must be backported as far as 3.0.	2025-09-25 15:28:22 +02:00
William Lallemand	c325e34e6d	CLEANUP: acme: acme_will_expire() uses acme_schedule_date() The date computations in acme_will_expire() and acme_schedule_date() are the same. Call acme_schedule_date() from acme_will_expire() and make the functions static. The patch also moves the functions into the right order.	2025-09-25 15:14:31 +02:00
William Lallemand	f256b5fdf3	BUG/MINOR: acme: possible overflow in acme_will_expire() acme_will_expire() computes the schedule date using notAfter and notBefore from the certificate. However notBefore could be greater than notAfter and could result in an overflow. This is unlikely to happen and would mean an incorrect certificate. This patch fixes the issue by checking that notAfter > notBefore. It also replaces the int type with a time_t to avoid an overflow on 64-bit architectures, which is also unlikely to happen with certificates. `(date.tv_sec + diff > notAfter)` was also replaced by `if (notAfter - diff <= date.tv_sec)` to avoid an overflow. Fix issue #3135. Needs to be backported to 3.2.	2025-09-25 15:12:14 +02:00
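The rearranged comparison from the acme_will_expire() fix can be illustrated with a small sketch. The function `cert_will_expire()` and its `margin` parameter are hypothetical stand-ins for the real code and its `diff` value, and timestamps are assumed to be non-negative.

```c
#include <time.h>

/* Hypothetical helper, not HAProxy's acme_will_expire(). Returns 1 when
 * fewer than <margin> seconds remain before <not_after>, 0 otherwise,
 * and -1 when the inputs are inconsistent.
 */
static int cert_will_expire(time_t now, time_t not_before,
                            time_t not_after, time_t margin)
{
    if (not_before < 0 || not_after <= not_before)
        return -1;  /* inconsistent validity period, refuse to compute */
    if (margin < 0 || margin > not_after - not_before)
        return -1;  /* margin must fit inside the validity period */

    /* <now + margin> could overflow for a very large <now>, while
     * <not_after - margin> cannot: it stays within
     * [not_before, not_after] thanks to the checks above.
     */
    return not_after - margin <= now;
}
```

The point is only the ordering of the arithmetic: subtracting from the bounded value keeps every intermediate result inside a known range.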
William Lallemand	68770479ea	BUG/MINOR: acme: possible overflow on scheduling computation acme_schedule_date() computes the schedule date using notAfter and notBefore from the certificate. However notBefore could be greater than notAfter and could result in an overflow. This is unlikely to happen and would mean an incorrect certificate. This patch fixes the issue by checking that notAfter > notBefore. It also replaces the int type with a time_t to avoid an overflow on 64-bit architectures, which is also unlikely to happen with certificates. Fix issue #3136. Needs to be backported to 3.2.	2025-09-25 15:12:03 +02:00
Christopher Faulet	3be8b06a60	BUG/MINOR: pattern: Properly flag virtual maps as using samples When a map file is loaded, internally, the pattern reference is flagged as based on a sample. However this is not done for virtual maps. This flag is only used during startup to check the map compatibility when it is used at different places. At runtime this does not change anything. But errors can be triggered during configuration parsing. For instance, the following valid config will trigger an error: http-request set-map(virt@test) foo bar if !{ str(foo),map(virt@test) -m found } http-request set-var(txn.foo) str(foo),map(virt@test) The fix is quite obvious. PAT_REF_SMP flag must be set for virtual map as any other map. A workaround is to use optional map (opt@...) by checking the map id cannot reference an existing file. This patch must be backported as far as 3.0.	2025-09-25 10:16:53 +02:00
Christopher Faulet	23e5d272af	BUG/MINOR: compression: Test payload size only if content-length is specified When a minimum size is defined to perform the compression, the message payload size is tested. To do so, information from the HTX message is used to determine the message length. However it is performed regardless of whether the payload length is fully known or not. Concretely, the test must only be performed when a content-length value was specified or when the message was fully received (EOM flag set). Otherwise, we are unable to determine the real payload length. Because of this bug, compression may be skipped for a large chunked message because the first chunks received are too small. But this does not mean the whole message is small. This patch must be backported to 3.2.	2025-09-25 10:16:53 +02:00
Olivier Houchard	71199e394c	BUG/MEDIUM: stick-tables: Don't let table_process_entry() handle refcnt Instead of having table_process_entry() decrement the session's ref counter, do it outside, from the caller. Some were missed, such as when an action was invalid, which would lead to the ref counter not being decremented, and the session not being destroyable. It makes more sense to do that from the caller, who just obtained the ref counter, anyway. This should be backported up to 2.8.	2025-09-22 23:14:19 +02:00
Ilia Shipitsin	8c8e50e09a	CI: move VTest preparation & friends to dedicated composite action reference: https://docs.github.com/en/actions/tutorials/create-actions/create-a-composite-action preparing coredump limits, installing VTest are now served by dedicated composite action	2025-09-22 19:18:23 +02:00
William Lallemand	fbffd2e25f	BUG/MINOR: acme/cli: wrong description for "acme challenge_ready" The "acme challenge_ready" command mistakenly uses the description of the "acme status" command. This patch adds the right description. Must be backported to 3.2.	2025-09-22 19:14:54 +02:00
William Lallemand	34cdc5e191	MINOR: acme: check acme-vars allocation during escaping Handle allocation properly during acme-vars parsing. Check if we have an allocation failure in both the malloc and the realloc and emit an error if that's the case.	2025-09-19 18:11:50 +02:00
William Lallemand	92c31a6fb7	MINOR: acme: acme-vars allow to pass data to the dpapi sink In the case of the dns-01 challenge, the agent that handles the challenge might need some extra information which depends on the DNS provider. This patch introduces the "acme-vars" option in the acme section, which allows to pass these data to the dpapi sink. The double quotes will be escaped when printed in the sink. Example: global setenv VAR1 'foobar"toto"' acme LE directory https://acme-staging-v02.api.letsencrypt.org/directory challenge DNS-01 acme-vars "var1=${VAR1},var2=var2" Would output: $ ( echo "@@1 show events dpapi -w -0"; cat - ) \| socat /tmp/master.sock - \| cat -e <0>2025-09-18T17:53:58.831140+02:00 acme deploy foobpar.pem thumbprint gDvbPL3w4J4rxb8gj20mGEgtuicpvltnTl6j1kSZ3vQ$ acme-vars "var1=foobar\"toto\",var2=var2"$ {$ "identifier": {$ "type": "dns",$ "value": "example.com"$ },$ "status": "pending",$ "expires": "2025-09-25T14:41:57Z",$ [...]	2025-09-19 16:40:53 +02:00
Christopher Faulet	331689d216	BUG/MEDIUM: http-client: Fix the test on the response start-line The commit 88aa7a780 ("MINOR: http-client: Trigger an error if first response block isn't a start-line") introduced a bug. From an endpoint, an applet or a mux, the <first> index must never be used. It is reserved to the HTTP analyzers. From an endpoint, this value may be undefined or just point to a block other than the first one. Instead we must always get the head block. In that case, to be sure the first HTX block in a response is a start-line, we must use the htx_get_head_type() function instead of htx_get_first_type(). Otherwise, we can trigger an error while the response is in fact properly formatted. It is a 3.3-specific issue. No backport needed.	2025-09-19 14:59:28 +02:00
Aurelien DARRAGON	5c299dee5a	MEDIUM: stats: consider that shared stats pointers may be NULL This patch looks huge, but it has a very simple goal: protect all accesses to shared stats pointers (either reads or writes), because we now consider that these pointers may be NULL. The reason behind this is that despite all precautions taken to ensure the pointers shouldn't be NULL when not expected, there are still corner cases (ie: frontends stats used on a backend with no FE cap and vice versa) where we could try to access a memory area which is not allocated. Willy stumbled on such cases while playing with the rings servers upon connection error, which eventually led to process crashes (since 3.3 when shared stats were implemented). Also, we may decide later that shared stats are optional and should be disabled on the proxy to save memory and CPU, and this patch is a step further towards that goal. So in essence, this patch ensures shared stats pointers are always initialized (including NULL), and adds necessary guards before shared stats pointers are de-referenced. Since we already had some checks for backends and listeners stats, and the pointer address retrieval should stay in cpu cache, let's hope that this patch doesn't impact stats performance much.	2025-09-18 16:49:51 +02:00
Aurelien DARRAGON	40eb1dd135	BUG/MEDIUM: sink: fix unexpected double postinit of sink backend Willy experienced an unexpected behavior with the config below: global stats socket :1514 ring buf1 server srv1 127.0.0.1:1514 Indeed, haproxy would connect to the ring server twice since commit 23e5f18b ("MEDIUM: sink: change the sink mode type to PR_MODE_SYSLOG"), and one of the connections would report errors. The reason is that, despite the above commit saying no change of behavior is expected, with the sink forward_px proxy now being set with PR_MODE_SYSLOG, postcheck_log_backend() was being automatically executed in addition to the manual cfg_post_parse_ring() function for each "ring" section. The consequence is that sink_finalize() was called twice for a given "ring" section, which means the connection init would be triggered twice, which in turn resulted in the behavior described above, plus possible unexpected side-effects. To fix the issue, when we create the forward_px proxy, we now set the PR_CAP_INT capability on it to tell haproxy not to automatically manage the proxy (ie: to skip the automatic log backend postinit), because we are about to manually manage the proxy from the sink API. No backport needed, this bug is specific to 3.3	2025-09-18 16:49:29 +02:00
Willy Tarreau	79ef362d9e	OPTIM: ring: avoid reloading the tail_ofs value before the CAS in ring_write() The load followed by the CAS seem to cause two bus cycles, one to retrieve the cache line in shared state and a second one to get exclusive ownership of it. Tests show that on x86 it's much better to just rely on the previous value and preset it to zero before entering the loop. We just mask the ring lock in case of failure so as to challenge it on next iteration and that's done. This little change brings 2.3% extra performance (11.34M msg/s) on a 64-core AMD.	2025-09-18 15:27:32 +02:00
Willy Tarreau	a727c6eaa5	OPTIM: ring: check the queue's owner using a CAS on x86 In the loop where the queue's leader tries to get the tail lock, we also need to check if another thread took ownership of the queue the current thread is currently working for. This is currently done using an atomic load. Tests show that on x86, using a CAS for this is much more efficient because it allows to keep the cache line in exclusive state for a few more cycles that permit the queue release call after the loop to be done without having to wait again. The measured gain is +5% for 128 threads on a 64-core AMD system (11.08M msg/s vs 10.56M). However, ARM loses about 1% on this, and we cannot afford that on machines without a fast CAS anyway, so the load is performed using a CAS only on x86_64. It might not be as efficient on low-end models but we don't care since they are not the ones dealing with high contention.	2025-09-18 15:08:12 +02:00
Willy Tarreau	d25099b359	OPTIM: ring: always relax in the ring lock and leader wait loop Tests have shown that AMD systems really need to use a cpu_relax() in these two loops. The performance improves from 10.03 to 10.56M messages per second (+5%) on a 128-thread system, without affecting intel nor ARM, so let's do this.	2025-09-18 15:07:56 +02:00
Willy Tarreau	eca1f90e16	CLEANUP: ring: rearrange the wait loop in ring_write() The loop is constructed in a complicated way with a single break statement in the middle and many continue statements everywhere, making it hard to better factor between variants. Let's first reorganize it so as to make it easier to escape when the ring tail lock is obtained. The sequence of instructions remains the same, it's only better organized.	2025-09-18 14:58:38 +02:00
Willy Tarreau	08c6bbb542	OPTIM: sink: don't waste time calling sink_announce_dropped() if busy If we see that another thread is already busy trying to announce the dropped counter, there's no point going there, so let's just skip all that operation from sink_write() and avoid disturbing the other thread. This results in a boost from 244 to 262k req/s.	2025-09-18 09:07:35 +02:00
Willy Tarreau	4431e3bd26	OPTIM: sink: reduce contention on sink_announce_dropped() perf top shows that sink_announce_dropped() consumes most of the CPU on a 128-thread x86 system. Digging further reveals that the atomic fetch_or() on the dropped field used to detect the presence of another thread is entirely responsible for this. Indeed, the compiler implements it using a CAS that loops without relaxing and makes all threads wait until they can synchronize on this one, only to discover later that another thread is there and they need to give up. Let's just replace this with a hand-crafted CAS loop that will detect before attempting the CAS if another thread is there. Doing so achieves the same goal without forcing threads to agree. With this simple change, the sustained request rate on h1 with all traces on bumped from 110k/s to 244k/s! This should be backported to stable releases where it's often needed to help debugging.	2025-09-18 08:38:34 +02:00
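The "check before attempting the CAS" idea from the sink_announce_dropped() commit can be sketched with C11 atomics. This is a generic illustration, not the code from sink.c: `BUSY_BIT`, `try_claim()` and `release_claim()` are hypothetical names, and the real field layout is not shown in the commit message.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define BUSY_BIT 1u   /* hypothetical: lowest bit marks a thread at work */

/* Tries to set the busy bit. Returns true if the caller now owns it,
 * false if another thread already holds it. Unlike an unconditional
 * atomic fetch_or(), a thread that sees the bit set gives up without
 * writing to the shared cache line.
 */
static bool try_claim(_Atomic unsigned int *word)
{
    unsigned int cur = atomic_load_explicit(word, memory_order_relaxed);

    do {
        if (cur & BUSY_BIT)
            return false;   /* someone else is there, do not contend */
        /* on failure the CAS refreshes <cur> and the test runs again */
    } while (!atomic_compare_exchange_weak_explicit(word, &cur,
                                                    cur | BUSY_BIT,
                                                    memory_order_acquire,
                                                    memory_order_relaxed));
    return true;
}

/* Clears the busy bit while leaving the other bits untouched. */
static void release_claim(_Atomic unsigned int *word)
{
    atomic_fetch_and_explicit(word, ~BUSY_BIT, memory_order_release);
}
```

The early return is what avoids forcing all threads to synchronize on the same word just to learn that they must back off.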
Willy Tarreau	361c227465	MINOR: trace: don't call strlen() on the function's name Currently there's a small mistake in the way the trace function and macros work. The calling function name is known as a constant until the macro and passed as-is to the __trace() function. That one needs to know its length and will call ist() on it, resulting in a real call to strlen() while that length was known before the call. Let's use an ist instead of a const char* for __trace() and __trace_enabled() so that we can now completely avoid calling strlen() during this operation. This has significantly reduced the importance of __trace_enabled() in perf top.	2025-09-18 08:31:57 +02:00
Willy Tarreau	06fa9f717f	MINOR: trace: don't call strlen() on the thread-id numeric encoding In __trace(), we're making an integer for the thread id but this one is passed through strlen() in the call to ist() because it's not a constant. We do know that it's exactly 3 chars long so we can manage this using ist2() and pass it the length instead in order to reduce the number of calls to strlen(). Also let's note that the thread number will no longer be numeric for thread numbers above 100.	2025-09-18 08:02:59 +02:00
Willy Tarreau	d53ad49ad1	BUG/MEDIUM: ring: invert the length check to avoid an int overflow Vincent Gramer reported in GH issue #3125 a case of crash on a BUG_ON() condition in the rings. What happens is that a message that is one byte less than the maximum ring size is emitted, and it passes all the checks, but once inflated by the extra +1 for the refcount, it can no longer. But the check was made based on message size compared to space left, except that this space left can now be negative, which is a high positive for size_t, so the check remained valid and triggered a BUG_ON() later. Let's compute the size the other way around instead (i.e. current + needed) since we can't have rings as large as half of the memory space anyway, thus we have no risk of overflow on this one. This needs to be backported to all versions supporting multi-threaded rings (3.0 and above). Thanks to Vincent for the easy and working reproducer.	2025-09-17 18:45:13 +02:00
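The inverted length check from the ring fix can be illustrated in isolation. Both functions below are hypothetical and only contrast the two ways of writing the test with `size_t` operands; they are not taken from ring.c.

```c
#include <stddef.h>

/* Fragile form: the space left is computed first. If <used> already
 * exceeds <size>, the unsigned subtraction wraps to a huge value and
 * the check wrongly succeeds.
 */
static int fits_by_subtraction(size_t size, size_t used, size_t needed)
{
    return needed <= size - used;
}

/* Inverted form: add what is needed to what is used. This assumes that
 * <used + needed> cannot overflow, which holds as long as rings are far
 * smaller than half of the address space.
 */
static int fits_by_addition(size_t size, size_t used, size_t needed)
{
    return used + needed <= size;
}
```

With `size = 16`, `used = 17` and `needed = 8`, the first form reports that the message fits while the second one correctly rejects it.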
Willy Tarreau	8c077c17eb	MINOR: server: add the "cc" keyword to set the TCP congestion controller It is possible on at least Linux and FreeBSD to set the congestion control algorithm to be used with outgoing connections, among the list of supported and permitted ones. Let's expose this setting with "cc". Unknown or forbidden algorithms will be ignored and the default one will continue to be used.	2025-09-17 17:19:33 +02:00
Willy Tarreau	4ed3cf295d	MINOR: listener: add the "cc" bind keyword to set the TCP congestion controller It is possible on at least Linux and FreeBSD to set the congestion control algorithm to be used with incoming connections, among the list of supported and permitted ones. Let's expose this setting with "cc". Permission issues might be reported (as warnings).	2025-09-17 17:03:42 +02:00
Ben Kallus	31d0695a6a	IMPORT: ebtree: replace hand-rolled offsetof to avoid UB The C standard specifies that it's undefined behavior to dereference NULL (even if you use & right after). The hand-rolled offsetof idiom &(((s)NULL)->f) is thus technically undefined. This clutters the output of UBSan and is simple to fix: just use the real offsetof when it's available. Note that there's no clear statement about this point in the spec, only several points which together converge to this: - From N3220, 6.5.3.4: A postfix expression followed by the -> operator and an identifier designates a member of a structure or union object. The value is that of the named member of the object to which the first expression points, and is an lvalue. - From N3220, 6.3.2.1: An lvalue is an expression (with an object type other than void) that potentially designates an object; if an lvalue does not designate an object when it is evaluated, the behavior is undefined. - From N3220, 6.5.4.4 p3: The unary & operator yields the address of its operand. If the operand has type "type", the result has type "pointer to type". If the operand is the result of a unary operator, neither that operator nor the & operator is evaluated and the result is as if both were omitted, except that the constraints on the operators still apply and the result is not an lvalue. Similarly, if the operand is the result of a [] operator, neither the & operator nor the unary * that is implied by the [] is evaluated and the result is as if the & operator were removed and the [] operator were changed to a + operator. => In short, this is saying that C guarantees these identities: 1. &(p) is equivalent to p 2. &(p[n]) is equivalent to p + n As a consequence, &(p) doesn't result in the evaluation of *p, only the evaluation of p (and similar for []). There is no corresponding special carve-out for ->. 
See also: https://pvs-studio.com/en/blog/posts/cpp/0306/ After this patch, HAProxy can run without crashing after building w/ clang-19 -fsanitize=undefined -fno-sanitize=function,alignment This is ebtree commit bd499015d908596f70277ddacef8e6fa998c01d5. Signed-off-by: Willy Tarreau <w@1wt.eu> This is ebtree commit 5211c2f71d78bf546f5d01c8d3c1484e868fac13.	2025-09-17 14:30:32 +02:00
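For illustration, a container_of-style macro built on the standard `offsetof()` rather than on the hand-rolled `&(((s)NULL)->f)` idiom could look like this. The struct `demo_node` and the macro name `container_of_demo` are hypothetical and do not come from ebtree.

```c
#include <stddef.h>

/* Hypothetical structure with an embedded member */
struct demo_node {
    int key;
    char pad[12];
    long leaf;
};

/* Returns a pointer to the enclosing structure from a pointer to one of
 * its members. offsetof() is evaluated by the compiler and never
 * dereferences a NULL pointer.
 */
#define container_of_demo(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))
```

Given `struct demo_node n;`, the expression `container_of_demo(&n.leaf, struct demo_node, leaf)` yields `&n`.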
Willy Tarreau	a31da78685	IMPORT: ebtree: add a definition of offsetof() We'll use this to improve the definition of container_of(). Let's define it if it does not exist. We can rely on __builtin_offsetof() on recent enough compilers. This is ebtree commit 1ea273e60832b98f552b9dbd013e6c2b32113aa5. Signed-off-by: Willy Tarreau <w@1wt.eu> This is ebtree commit 69b2ef57a8ce321e8de84486182012c954380401.	2025-09-17 14:30:32 +02:00
Ben Kallus	ddbff4e235	IMPORT: ebtree: Fix UB from clz(0) From 'man gcc': passing 0 as the argument to "__builtin_ctz" or "__builtin_clz" invokes undefined behavior. This triggers UBsan in HAProxy. [wt: tested in treebench and verified not to cause any performance regression with opstime-u32 nor stress-u32] Signed-off-by: Willy Tarreau <w@1wt.eu> This is ebtree commit 8c29daf9fa6e34de8c7684bb7713e93dcfe09029. Signed-off-by: Willy Tarreau <w@1wt.eu> This is ebtree commit cf3b93736cb550038325e1d99861358d65f70e9a.	2025-09-17 14:30:32 +02:00
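The commit above does not say how the zero case is handled in ebtree, so the following is only a sketch of one common way to avoid calling the builtin with 0. The helper name `clz_safe()` and the choice of returning the full bit width for a zero input are assumptions; the code requires a compiler that provides `__builtin_clz()` (GCC or Clang).

```c
#include <limits.h>  /* CHAR_BIT */

/* Hypothetical helper: count leading zero bits of <x>, defined for
 * every input. For x == 0 it returns the width of unsigned int
 * instead of calling __builtin_clz(0), which is undefined. */
static inline unsigned int clz_safe(unsigned int x)
{
	if (x == 0)
		return (unsigned int)(sizeof(x) * CHAR_BIT);
	return (unsigned int)__builtin_clz(x);
}
```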
Willy Tarreau	52c6dd773d	IMPORT: ebst: use prefetching in lookup() and insert() While the previous optimizations couldn't be preserved due to the possibility of out-of-bounds accesses, at least the prefetch is useful. A test on treebench shows that for 64k short strings, the lookup time falls from 276 to 199ns per lookup (28% savings), and the insert falls from 311 to 296ns (4.9% savings), which are pretty respectable, so let's do this. This is ebtree commit b44ea5d07dc1594d62c3a902783ed1fb133f568d.	2025-09-17 14:30:32 +02:00
Willy Tarreau	fef4cfbd21	IMPORT: ebtree: only use __builtin_prefetch() when supported It looks like __builtin_prefetch() appeared in gcc-3.1 as there's no mention of it in 3.0's doc. Let's replace it with eb_prefetch() which maps to __builtin_prefetch() on supported compilers and falls back to the usual do{}while(0) on other ones. It was tested to properly build with tcc as well as gcc-2.95. This is ebtree commit 7ee6ede56a57a046cb552ed31302b93ff1a21b1a.	2025-09-17 14:30:32 +02:00
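The compiler-dependent mapping described in the commit above could look roughly like the sketch below. The macro name `my_prefetch()`, its single-argument form and the exact version test are assumptions made for illustration, not the definition of eb_prefetch() used by ebtree; `sum_with_prefetch()` is a hypothetical caller.

```c
/* Map to the builtin on GCC >= 3.1 (and compatible compilers that
 * advertise such a version), otherwise expand to an empty statement. */
#if defined(__GNUC__) && (__GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 1))
#define my_prefetch(addr) __builtin_prefetch(addr)
#else
#define my_prefetch(addr) do { } while (0)
#endif

/* Hypothetical caller: sums an array while hinting at data that will
 * be needed a few iterations later. The result does not depend on
 * whether the prefetch is actually emitted. */
static inline long sum_with_prefetch(const long *v, int n)
{
	long total = 0;
	int i;

	for (i = 0; i < n; i++) {
		if (i + 8 < n)
			my_prefetch(&v[i + 8]);
		total += v[i];
	}
	return total;
}
```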
Willy Tarreau	3dda813d54	IMPORT: eb32/64: optimize insert for modern CPUs Similar to previous patches, let's improve the insert() descent loop to avoid discovering mandatory data too late. The change here is even simpler than previous ones, a prefetch was installed and troot is calculated before last instruction in a speculative way. This was enough to gain +50% insertion rate on random data. This is ebtree commit e893f8cc4d44b10f406b9d1d78bd4a9bd9183ccf.	2025-09-17 14:30:32 +02:00
Willy Tarreau	61654c07bd	IMPORT: ebmb: optimize the lookup for modern CPUs These are the same principles as for the latest improvements made on integer trees. Applying the same recipes made the ebmb_lookup() function jump from 10.07 to 12.25 million lookups per second on a 10k random values tree (+21.6%). It's likely that the ebmb_lookup_longest() code could also benefit from this, though this was neither explored nor tested. This is ebtree commit a159731fd6b91648a2fef3b953feeb830438c924.	2025-09-17 14:30:32 +02:00
Willy Tarreau	6c54bf7295	IMPORT: eb32/eb64: place an unlikely() on the leaf test In the loop we can help the compiler build slightly more efficient code by placing an unlikely() around the leaf test. This shows a consistent 0.5% performance gain both on eb32 and eb64. This is ebtree commit 6c9cdbda496837bac1e0738c14e42faa0d1b92c4.	2025-09-17 14:30:32 +02:00
Willy Tarreau	384907f4e7	IMPORT: eb32: drop the now useless node_bit variable This one was previously used to preload from the node and keep a copy in a register on i386 machines with few registers. With the new more optimal code it's totally useless, so let's get rid of it. By the way the 64 bit code didn't use that at all already. This is ebtree commit 1e219a74cfa09e785baf3637b6d55993d88b47ef.	2025-09-17 14:30:31 +02:00
Willy Tarreau	c9e4adf608	IMPORT: eb32/eb64: use a more parallelizable check for lack of common bits Instead of shifting the XOR value right and comparing it to 1, which roughly requires 2 sequential instructions, better test if the XOR has any bit above the current bit, which means any bit set among those strictly higher, or in other words that XOR & (-bit << 1) is non-zero. This is one less instruction in the fast path and gives another nice performance gain on random keys (in million lookups/s): eb32: 1k: 33.17 -> 37.30 (+12.5%); 10k: 15.74 -> 17.08 (+8.51%); 100k: 8.00 -> 9.00 (+12.5%). eb64: 1k: 34.40 -> 38.10 (+10.8%); 10k: 16.17 -> 17.10 (+5.75%); 100k: 8.38 -> 8.87 (+5.85%). This is ebtree commit c942a2771758eed4f4584fe23cf2914573817a6b.	2025-09-17 14:30:31 +02:00
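To make the two checks from the commit above concrete, here is a small sketch for 32-bit values. The function names are hypothetical, the "shift and compare" form is my reading of the commit text rather than a quote from ebtree, and `bit` is assumed to be in the range 0..31.

```c
#include <stdint.h>

/* Shift-based form: the comparison depends on the result of the shift. */
static inline int bits_above_shift(uint32_t xor_val, unsigned int bit)
{
	return (xor_val >> bit) > 1;
}

/* Mask-based form: (-b << 1) has every bit strictly above <bit> set,
 * so the test is non-zero iff xor_val has a bit set above <bit>. */
static inline int bits_above_mask(uint32_t xor_val, unsigned int bit)
{
	uint32_t b = (uint32_t)1 << bit;
	uint32_t higher = (uint32_t)((0u - b) << 1);

	return (xor_val & higher) != 0;
}
```

Both functions answer the same question. The second one can use a mask computed ahead of time, which is what lets the comparison proceed without waiting for a shift of the XOR value.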
Willy Tarreau	6af17d491f	IMPORT: eb32/eb64: reorder the lookup loop for modern CPUs The current code calculates the next troot based on a calculation. This was efficient when the algorithm was developed many years ago on K6 and K7 CPUs running at low frequencies with few registers and limited branch prediction units but nowadays with ultra-deep pipelines and high latency memory that's no longer efficient, because the CPU needs to have completed multiple operations before knowing which address to start fetching from. It's sad because we only have two branches each time but the CPU cannot know it. In addition, the calculation is performed late in the loop, which does not help the address generation unit to start prefetching next data. Instead we should help the CPU by preloading data early from the node and calculating troot as soon as possible. The CPU will be able to postpone that processing until the dependencies are available and it really needs to dereference it. In addition we must absolutely avoid serializing instructions such as "(a >> b) & 1" because there's no way for the compiler to parallelize that code nor for the CPU to pre-process some early data. What this patch does is relatively simple: - we try to prefetch the next two branches as soon as the node is known, which will help dereference the selected node in the next iteration; it was shown that it only works with the next changes though, otherwise it can reduce the performance instead. In practice the prefetching will start a bit later once the node is really in the cache, but since there's no dependency between these instructions and any other one, we let the CPU optimize as it wants. - we preload all important data from the node (next two branches, key and node.bit) very early even if not immediately needed. This is cheap, it doesn't cause any pipeline stall and speeds up later operations. - we pre-calculate 1<<bit that we assign into a register, so as to avoid serializing instructions when deciding which branch to take. - we assign the troot based on a ternary operation (or if/else) so that the CPU knows upfront the two possible next addresses without waiting for the end of a calculation and can prefetch their contents every time the branch prediction unit guesses right. Just doing this provides significant gains at various tree sizes on random keys (in million lookups per second): eb32: 1k: 29.07 -> 33.17 (+14.1%); 10k: 14.27 -> 15.74 (+10.3%); 100k: 6.64 -> 8.00 (+20.5%). eb64: 1k: 27.51 -> 34.40 (+25.0%); 10k: 13.54 -> 16.17 (+19.4%); 100k: 7.53 -> 8.38 (+11.3%). The performance is now much closer to the sequential keys. This was done for all variants ({32,64}{,i,le,ge}). Another point, the equality test in the loop improves the performance when looking up random keys (since we don't need to reach the leaf), but is counter-productive for sequential keys, which can gain ~17% without that test. However sequential keys are normally not used with exact lookups, but rather with lookup_ge() that spans a time frame, and which does not have that test for this precise reason, so in the end both use cases are served optimally. It's interesting to note that everything here is solely based on data dependencies, and that trying to perform fewer operations upfront always ends up with lower performance (typically the original one). This is ebtree commit 05a0613e97f51b6665ad5ae2801199ad55991534.	2025-09-17 14:30:31 +02:00
Willy Tarreau	dcd4d36723	IMPORT: ebtree: delete unusable ebpttree.c Since commit 21fd162 ("[MEDIUM] make ebpttree rely solely on eb32/eb64 trees") it was no longer used and no longer builds. The commit message mentions that the file is no longer needed, probably that a rebase failed and left the file there. This is ebtree commit fcfaf8df90e322992f6ba3212c8ad439d3640cb7.	2025-09-17 14:30:31 +02:00
Aurelien DARRAGON	b72225dee2	DOC: internals: document the shm-stats-file format/mapping Add some documentation about shm stats file structure to help writing tools that can parse the file to use the shared stats counters. This file was written for shm stats file version 1.0 specifically, it may need to be updated when the shm stats file structure changes in the future.	2025-09-17 11:32:58 +02:00
Aurelien DARRAGON	644b6b9925	MINOR: counters: document that tg shared counters are tied to shm-stats-file mapping Let's explicitly mention that fe_counters_shared_tg and be_counters_shared_tg structs are embedded in shm_stats_file_object struct so any change in those structs will result in shm stats file incompatibility between processes, thus extra precaution must be taken when making changes to them. Note that the provisioning made in shm_stats_file_object struct could be used to add members to {fe,be}_counters_shared_tg without changing shm_stats_file_object struct size if needed in order to preserve shm stats file version.	2025-09-17 11:31:29 +02:00
Aurelien DARRAGON	31b3be7aae	CLEANUP: log: remove deadcode in px_parse_log_steps() When logsteps proxy storage was migrated from eb nodes to bitmasks in 6a92b14 ("MEDIUM: log/proxy: store log-steps selection using a bitmask, not an eb tree"), some unused eb node related code was left over in px_parse_log_steps(). Not only is this code unused, it also resulted in wasted memory since an eb node was allocated for nothing. This should fix GH #3121.	2025-09-17 11:31:17 +02:00
Willy Tarreau	3d73e6c818	BUG/MEDIUM: pattern: fix possible infinite loops on deletion (try 2) Commit e36b3b60b3 ("MEDIUM: migrate the patterns reference to cebs_tree") changed the construction of the loops used to look up matching nodes, and since we don't need two elements anymore, the "continue" statement now loops on the same element when deleting. Let's fix this to make sure it passes through the next one. While this bug is 3.3 only, it turns out that 3.2 is also affected by the incorrect loop construct in pat_ref_set_from_node(), where it's possible to run an infinite loop since commit 010c34b8c7 ("MEDIUM: pattern: consider gen_id in pat_ref_set_from_node()") due to the "continue" statement being placed before the ebmb_next_dup() call. As such the relevant part of this fix (pat_ref_set_from_elt) will need to be backported to 3.2.	2025-09-16 16:32:39 +02:00
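The loop construct discussed in the commit above can be illustrated with a simplified sketch. It uses a plain linked list rather than the cebs/ebmb trees, and the `struct elt` type, its fields and `mark_deleted()` are hypothetical. The point shown is only that the successor is fetched before any `continue`, so every path through the loop body advances to the next element.

```c
#include <stddef.h>

/* Hypothetical element type, for illustration only. */
struct elt {
	struct elt *next;
	int gen_id;
	int deleted;
};

/* Marks every element carrying <gen_id> as deleted and returns how many
 * were marked. The successor is read first, so "continue" cannot loop
 * on the same element. */
static inline int mark_deleted(struct elt *head, int gen_id)
{
	struct elt *e = head;
	struct elt *next;
	int count = 0;

	while (e) {
		next = e->next;          /* fetched before any "continue" */
		if (e->gen_id != gen_id) {
			e = next;
			continue;
		}
		e->deleted = 1;
		count++;
		e = next;
	}
	return count;
}
```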
Willy Tarreau	f1b1d3682a	Revert "BUG/MEDIUM: pattern: fix possible infinite loops on deletion" This reverts commit 359a829ccb8693e0b29808acc0fa7975735c0353. The fix is neither sufficient nor correct (it triggers ASAN). Better redo it cleanly rather than accumulate invalid fixes.	2025-09-16 16:32:39 +02:00
William Lallemand	6b6c03bc0d	CI: scripts: mkdir BUILDSSL_TMPDIR Creates the BUILDSSL_TMPDIR at the beginning of the script instead of having to create it in each download function.	2025-09-16 15:35:35 +02:00
William Lallemand	9517116f63	CI: github: add an OpenSSL + ECH job The upcoming ECH feature needs a patched OpenSSL with the "feature/ech" branch. This daily job launches an openssl build, as well as haproxy build with reg-tests.	2025-09-16 15:05:44 +02:00
William Lallemand	31319ff7f0	CI: scripts: add support for git in openssl builds Add support for git releases downloaded from github in openssl builds: - GIT_TYPE variable allows you to choose between "branch" or "commit" - OPENSSL_VERSION variable supports a "git-" prefix - "git-${commit_id}" is stored in .openssl_version instead of the branch name for version comparison.	2025-09-16 15:05:44 +02:00
Willy Tarreau	359a829ccb	BUG/MEDIUM: pattern: fix possible infinite loops on deletion Commit e36b3b60b3 ("MEDIUM: migrate the patterns reference to cebs_tree") changed the construction of the loops used to look up matching nodes, and since we don't need two elements anymore, the "continue" statement now loops on the same element when deleting. Let's fix this to make sure it passes through the next one. No backport is needed, this is only 3.3.	2025-09-16 11:49:01 +02:00
Willy Tarreau	4edff4a2cc	CLEANUP: vars: use the item API for the variables trees The variables trees use the immediate cebtree API, better use the item one which is more expressive and safer. The "node" field was renamed to "name_node" to avoid any ambiguity.	2025-09-16 10:51:23 +02:00
Willy Tarreau	c058cc5ddf	CLEANUP: tools: use the item API for the file names tree The file names tree uses the immediate cebtree API, better use the item one which is more expressive and safer.	2025-09-16 10:41:19 +02:00
Willy Tarreau	2d6b5c7a60	MEDIUM: connection: reintegrate conn_hash_node into connection Previously the conn_hash_node was placed outside the connection due to the big size of the eb64_node that could have negatively impacted frontend connections. But having it outside also means that one extra allocation is needed for each backend connection, and that one memory indirection is needed for each lookup. With the compact trees, the tree node is smaller (16 bytes vs 40) so the overhead is much lower. By integrating it into the connection, we're also eliminating one pointer from the connection to the hash node and one pointer from the hash node to the connection (in addition to the extra object bookkeeping). This results in saving at least 24 bytes per total backend connection, and only inflates connections by 16 bytes (from 240 to 256), which is a reasonable compromise. Tests on a 64-core EPYC show a 2.4% increase in the request rate (from 2.08 to 2.13 Mrps).	2025-09-16 09:23:46 +02:00
Willy Tarreau	ceaf8c1220	MEDIUM: connection: move idle connection trees to ceb64 Idle connection trees currently require a 56-byte conn_hash_node per connection, which can be reduced to 32 bytes by moving to ceb64. While ceb64 is theoretically slower, in practice here we're essentially dealing with trees that almost always contain a single key and many duplicates. In this case, ceb64 insert and lookup functions become faster than eb64 ones because all duplicates are a list accessed in O(1) while it's a subtree for eb64. In tests it is impossible to tell the difference between the two, so it's worth reducing the memory usage. This commit brings the following memory savings to conn_hash_node (one per backend connection), and to srv_per_thread (one per thread and per server), given as struct: before -> after, delta: conn_hash_node: 56 -> 32, -24; srv_per_thread: 96 -> 72, -24. The delicate part is conn_delete_from_tree(), because we need to know the tree root the connection is attached to. But thanks to recent cleanups, it's now clear enough (i.e. idle/safe/avail vs session are easy to distinguish).	2025-09-16 09:23:46 +02:00
Willy Tarreau	95b8adff67	MINOR: connection: pass the thread number to conn_delete_from_tree() We'll soon need to choose the server's root based on the connection's flags, and for this we'll need the thread it's attached to, which is not always the current one. This patch simply passes the thread number from all callers. They know it because they just set the idle_conns lock on it prior to calling the function.	2025-09-16 09:23:46 +02:00
Willy Tarreau	efe519ab89	CLEANUP: backend: use a single variable for removed in srv_cleanup_idle_conns() Probably due to older code, there's a boolean variable used to set another one which is then checked. Also the first check is made under the lock, which is unnecessary. Let's simplify this and use a single variable. This only makes the code clearer, it doesn't change the output code.	2025-09-16 09:23:46 +02:00
Willy Tarreau	f7d1fc2b08	MINOR: server: pass the server and thread to srv_migrate_conns_to_remove() We'll need to have access to the srv_per_thread element soon from this function, and there's no particular reason for passing it list pointers so let's pass the server and the thread so that it is autonomous. It also makes the calling code simpler.	2025-09-16 09:23:46 +02:00
Willy Tarreau	d1c5df6866	CLEANUP: server: use eb64_entry() not ebmb_entry() to convert an eb64 There were a few leftovers from an earlier version of the conn_hash_node that was using ebmb nodes. A few calls to ebmb_first() and ebmb_entry() were still present while acting on an eb64 tree. These are harmless as one is just eb_first() and the other container_of(), but it's confusing so let's clean them up.	2025-09-16 09:23:46 +02:00
Willy Tarreau	3d18a0d4c2	CLEANUP: backend: factor the connection lookup loop The connection lookup loop is made of two identical blocks, one looking in the idle or safe lists and the other one looking into the safe list only. The second one is skipped if a connection was found or if the request looks for a safe one (since already done). Also the two are slightly different due to leftovers from earlier versions in that the second one checks for safe connections and not the first one, and the second one sets is_safe which is not used later. Let's just rationalize all this by placing them in a loop which checks first from the idle conns and second from the safe ones, or skips the first step if the request wants a safe connection. This reduces the code and shortens the time spent under the lock.	2025-09-16 09:23:46 +02:00
Willy Tarreau	7773d87ea6	CLEANUP: proxy: slightly reorganize fields to plug some holes The proxy struct has several small holes that deserved being plugged by moving a few fields around. Now we're down to 3056 from 3072 previously, and the remaining holes are small. At the moment, compared to before this series, we're seeing these sizes (type: size at 7d554ca62 -> current, delta): listener: 752 -> 704, -48 (-6.4%); server: 4032 -> 3840, -192 (-4.8%); proxy: 3184 -> 3056, -128 (-4%); stktable: 3392 -> 3328, -64 (-1.9%). Configs with many servers have shrunk by about 4% in RAM and configs with many proxies by about 3%.	2025-09-16 09:23:46 +02:00
Willy Tarreau	8df81b6fcc	CLEANUP: server: slightly reorder fields in the struct to plug holes The struct server still has a lot of holes and padding that make it quite big. By moving a few fields around between areas which do not interact (e.g. boot vs aligned areas), it's quite easy to plug some of them and/or to arrange larger ones which could be reused later with a bit more effort. Here we've reduced holes by 40 bytes, allowing the struct to shrink by one more cache line (64 bytes). The new size is 3840 bytes.	2025-09-16 09:23:46 +02:00
Willy Tarreau	d18d972b1f	MEDIUM: server: index server ID using compact trees The server ID is currently stored as a 32-bit int using an eb32 tree. It's used essentially to find holes in order to automatically assign IDs, and to detect duplicates. Let's change this to use compact trees instead in order to save 24 bytes in struct server for this node, plus 8 bytes in struct proxy. The server struct is still 3904 bytes large (due to alignment) and the proxy struct is 3072.	2025-09-16 09:23:46 +02:00
Willy Tarreau	66191584d1	MEDIUM: listener: index listener ID using compact trees The listener ID is currently stored as a 32-bit int using an eb32 tree. It's used essentially to find holes in order to automatically assign IDs, and to detect duplicates. Let's change this to use compact trees instead in order to save 24 bytes in struct listener for this node, plus 8 bytes in struct proxy. The struct listener is now 704 bytes large, and the struct proxy 3080.	2025-09-16 09:23:46 +02:00
Willy Tarreau	1a95bc42c7	MEDIUM: proxy: index proxy ID using compact trees The proxy ID is currently stored as a 32-bit int using an eb32 tree. It's used essentially to find holes in order to automatically assign IDs, and to detect duplicates. Let's change this to use compact trees instead in order to save 24 bytes in struct proxy for this node, plus 8 bytes in the root (which is static so not much relevant here). Now the proxy is 3088 bytes large.	2025-09-16 09:23:46 +02:00
Willy Tarreau	eab5b89dce	MINOR: proxy: add proxy_index_id() to index a proxy by its ID This avoids needlessly exposing the tree's root and the mechanics outside of the low-level code.	2025-09-16 09:23:46 +02:00
Willy Tarreau	5e4b6714e1	MINOR: listener: add listener_index_id() to index a listener by its ID This avoids needlessly exposing the tree's root and the mechanics outside of the low-level code.	2025-09-16 09:23:46 +02:00
Willy Tarreau	5a5cec4d7a	MINOR: server: add server_index_id() to index a server by its ID This avoids needlessly exposing the tree's root and the mechanics outside of the low-level code.	2025-09-16 09:23:46 +02:00
Willy Tarreau	4ed4cdbf3d	CLEANUP: server: use server_find_by_id() when looking for already used IDs In srv_parse_id(), there's no point doing all the low-level work with the tree functions to check for the existence of an ID, we already have server_find_by_id() which does exactly this, so let's use it.	2025-09-16 09:23:46 +02:00
Willy Tarreau	0b0aefe19b	MINOR: server: add server_get_next_id() to find next free server ID This was previously achieved via the generic get_next_id() but we'll soon get rid of generic ID trees so let's have a dedicated server_get_next_id(). As a bonus it reduces the exposure of the tree's root outside of the functions.	2025-09-16 09:23:46 +02:00
Willy Tarreau	23605eddb1	MINOR: listener: add listener_get_next_id() to find next free listener ID This was previously achieved via the generic get_next_id() but we'll soon get rid of generic ID trees so let's have a dedicated listener_get_next_id(). As a bonus it reduces the exposure of the tree's root outside of the functions.	2025-09-16 09:23:46 +02:00
Willy Tarreau	b2402d67b7	MINOR: proxy: add proxy_get_next_id() to find next free proxy ID This was previously achieved via the generic get_next_id() but we'll soon get rid of generic ID trees so let's have a dedicated proxy_get_next_id().	2025-09-16 09:23:46 +02:00
Willy Tarreau	f4059ea42f	MEDIUM: stktable: index table names using compact trees Here we're saving 64 bytes per stick-table, from 3392 to 3328, and the change was really straightforward so there's no reason not to do it.	2025-09-16 09:23:46 +02:00
Willy Tarreau	d0d60a007d	MEDIUM: proxy: switch conf.name to cebis_tree This is used to index the proxy's name and it contains a copy of the pointer to the proxy's name in <id>. Changing that for a ceb_node placed just before <id> saves 32 bytes to the struct proxy, which is now 3112 bytes large. Here we need to continue to support duplicates since they're still allowed between type-incompatible proxies. Interestingly, the use of cebis_next_dup() instead of cebis_next() in proxy_find_by_name() allows us to get rid of an strcmp() that was performed for each use_backend rule. A test with a large config (100k backends) shows that we can get 3% extra performance on a config involving a static use_backend rule (3.09M to 3.18M rps), and even 4.5% on a dynamic rule selecting a random backend (2.47M to 2.59M).	2025-09-16 09:23:46 +02:00
Willy Tarreau	fdf6fd5b45	MEDIUM: server: switch the host_dn member to cebis_tree This member is used to index the hostname_dn contents for DNS resolution. Let's replace it with a cebis_tree to save another 32 bytes (24 for the node + 8 by avoiding the duplication of the pointer). The struct server is now at 3904 bytes.	2025-09-16 09:23:46 +02:00
Willy Tarreau	413e903a22	MEDIUM: server: switch conf.name to cebis_tree This is used to index the server name and it contains a copy of the pointer to the server's name in <id>. Changing that for a ceb_node placed just before <id> saves 32 bytes to the struct server, which remains 3968 bytes large due to alignment. The proxy struct shrinks by 8 bytes to 3144. It's worth noting that the current way duplicate names are handled remains based on the previous mechanism where dups were permitted. Ideally we should now reject them during insertion and use unique key trees instead.	2025-09-16 09:23:46 +02:00
Willy Tarreau	0e99f64fc6	MEDIUM: server: switch addr_node to cebis_tree This contains the text representation of the server's address, for use with stick-tables with "srvkey addr". Switching them to a compact node saves 24 more bytes from this structure. The key was moved to an external pointer "addr_key" right after the node. The server struct is now 3968 bytes (down from 4032) due to alignment, and the proxy struct shrinks by 8 bytes to 3152.	2025-09-16 09:23:46 +02:00
Willy Tarreau	91258fb9d8	MEDIUM: guid: switch guid to more compact cebuis_tree The current guid struct size is 56 bytes. Once reduced using compact trees, it goes down to 32 (almost half). We're not on a critical path and size matters here, so better switch to this. It's worth noting that the name part could also be stored in the guid_node at the end to save 8 extra bytes (no pointer needed anymore), however the purpose of this struct is to be embedded into other ones, which is not compatible with having a dynamic size. Affected struct sizes in bytes (before -> after, diff): server: 4032 -> 4032, 0 (*); proxy: 3184 -> 3160, -24; listener: 752 -> 728, -24. (*): struct server is full of holes and padding (176 bytes) and is 64-byte aligned. Moving the guid_node elsewhere such as after sess_conn reduces it to 3968, or one less cache line. There's no point in moving anything now because forthcoming patches will arrange other parts.	2025-09-16 09:23:46 +02:00
Willy Tarreau	e36b3b60b3	MEDIUM: migrate the patterns reference to cebs_tree cebs_tree are 24 bytes smaller than ebst_tree (16B vs 40B), and pattern references are only used during map/acl updates, so their storage is pure loss between updates (which most of the time never happen). By switching their indexing to compact trees, we can save 16 to 24 bytes per entry depending on alignment (here it's 24 per struct but 16 practical as malloc's alignment keeps 8 unused). Tested on core i7-8650U running at 3.0 GHz, with a file containing 17.7M IP addresses (16.7M different): $ time ./haproxy -c -f acl-ip.cfg Save 280 MB RAM for 17.7M IP addresses, and slightly speeds up the startup (5.8%, from 19.2s to 18.2s), a part of which possibly being attributed to having to write less memory. Note that this is on small strings. On larger ones such as user-agents, ebtree doesn't reread the whole key and might be more efficient. Before: RAM (VSZ/RSS): 4443912 3912444 real 0m19.211s user 0m18.138s sys 0m1.068s Overhead Command Shared Object Symbol 44.79% haproxy haproxy [.] ebst_insert 25.07% haproxy haproxy [.] ebmb_insert_prefix 3.44% haproxy libc-2.33.so [.] __libc_calloc 2.71% haproxy libc-2.33.so [.] _int_malloc 2.33% haproxy haproxy [.] free_pattern_tree 1.78% haproxy libc-2.33.so [.] inet_pton4 1.62% haproxy libc-2.33.so [.] _IO_fgets 1.58% haproxy libc-2.33.so [.] _int_free 1.56% haproxy haproxy [.] pat_ref_push 1.35% haproxy libc-2.33.so [.] malloc_consolidate 1.16% haproxy libc-2.33.so [.] __strlen_avx2 0.79% haproxy haproxy [.] pat_idx_tree_ip 0.76% haproxy haproxy [.] pat_ref_read_from_file 0.60% haproxy libc-2.33.so [.] __strrchr_avx2 0.55% haproxy libc-2.33.so [.] unlink_chunk.constprop.0 0.54% haproxy libc-2.33.so [.] __memchr_avx2 0.46% haproxy haproxy [.] pat_ref_append After: RAM (VSZ/RSS): 4166108 3634768 real 0m18.114s user 0m17.113s sys 0m0.996s Overhead Command Shared Object Symbol 38.99% haproxy haproxy [.] cebs_insert 27.09% haproxy haproxy [.] ebmb_insert_prefix 3.63% haproxy libc-2.33.so [.] __libc_calloc 3.18% haproxy libc-2.33.so [.] _int_malloc 2.69% haproxy haproxy [.] free_pattern_tree 1.99% haproxy libc-2.33.so [.] inet_pton4 1.74% haproxy libc-2.33.so [.] _IO_fgets 1.73% haproxy libc-2.33.so [.] _int_free 1.57% haproxy haproxy [.] pat_ref_push 1.48% haproxy libc-2.33.so [.] malloc_consolidate 1.22% haproxy libc-2.33.so [.] __strlen_avx2 1.05% haproxy libc-2.33.so [.] __strcmp_avx2 0.80% haproxy haproxy [.] pat_idx_tree_ip 0.74% haproxy libc-2.33.so [.] __memchr_avx2 0.69% haproxy libc-2.33.so [.] __strrchr_avx2 0.69% haproxy libc-2.33.so [.] _IO_getline_info 0.62% haproxy haproxy [.] pat_ref_read_from_file 0.56% haproxy libc-2.33.so [.] unlink_chunk.constprop.0 0.56% haproxy libc-2.33.so [.] cfree@GLIBC_2.2.5 0.46% haproxy haproxy [.] pat_ref_append If the addresses are totally disordered (via "shuf" on the input file), we see both implementations reach exactly 68.0s (slower due to much higher cache miss ratio). On large strings such as user agents (1 million here), it's now slightly slower (+9%): Before: real 0m2.475s user 0m2.316s sys 0m0.155s After: real 0m2.696s user 0m2.544s sys 0m0.147s But such patterns are much less common than short ones, and the memory savings do still count. Note that while it could be tempting to get rid of the list that chains all these pat_ref_elt together and only enumerate them by walking along the tree to save 16 extra bytes per entry, that's not possible due to the problem that insertion ordering is critical (think overlapping regex such as /index.* and /index.html). Currently it's not possible to proceed differently because patterns are first pre-loaded into the pat_ref via pat_ref_read_from_file_smp() and later indexed by pattern_read_from_file(), which has to only redo the second part anyway for maps/acls declared multiple times.	2025-09-16 09:23:46 +02:00
Willy Tarreau	ddf900a0ce	IMPORT: cebtree: import version 0.5.0 to support duplicates The support for duplicates is necessary for various use cases related to config names, so let's upgrade to the latest version which brings this support. This updates the cebtree code to commit 808ed67 (tag 0.5.0). A few tiny adaptations were needed: - replace a few ceb_node with ceb_root since pointers are now tagged ; - replace cebu.h with ceb.h since both are now merged in the same include file. This way we can drop the unused cebu*.h files from cebtree that are provided only for compatibility. - rename immediate storage functions to cebXX_imm_XXX() as per the API change in 0.5 that makes immediate explicit rather than implicit. This only affects vars and tools.c:copy_file_name(). The tests continue to work.	2025-09-16 09:23:46 +02:00
Willy Tarreau	90b70b61b1	BUILD: makefile: implement support for running a command in range When running "make range", it would be convenient to support running reg tests or anything else such as "size", "pahole" or even benchmarks. Such commands are usually specific to the developer's environment, so let's just pass a generic variable TEST_CMD that is executed as-is if not empty. This way it becomes possible to run "make range RANGE=... TEST_CMD=...".	2025-09-16 09:23:46 +02:00
Valentine Krasnobaeva	f8acac653e	BUG/MINOR: resolvers: always normalize FQDN from response RFC1034 states the following: By convention, domain names can be stored with arbitrary case, but domain name comparisons for all present domain functions are done in a case-insensitive manner, assuming an ASCII character set, and a high order zero bit. This means that you are free to create a node with label "A" or a node with label "a", but not both as brothers; you could refer to either using "a" or "A". In practice, most DNS resolvers normalize domain labels (i.e., convert them to lowercase) before performing searches or comparisons to ensure this requirement is met. While HAProxy normalizes the domain name in the request, it currently does not do so for the response. Commit 75cc653 ("MEDIUM: resolvers: replace bogus resolv_hostname_cmp() with memcmp()") intentionally removed the `tolower()` conversion from `resolv_hostname_cmp()` for safety and performance reasons. This commit re-introduces the necessary normalization for FQDNs received in the response. The change is made in `resolv_read_name()`, where labels are processed as an unsigned char string, allowing `tolower()` to be applied safely. Since a typical FQDN has only 3-4 labels, replacing `memcpy()` with an explicit copy that also applies `tolower()` should not introduce a significant performance degradation. This patch addresses the rare edge case, as most resolvers perform this normalization themselves. This fixes the GitHub issue #3102. This fix may be backported in all stable versions since 2.5 included 2.5.	2025-09-15 18:02:16 +02:00
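The "explicit copy that also applies tolower()" mentioned in the commit above can be sketched as follows. The helper name `copy_label_lower()` and its signature are hypothetical and are not the code of resolv_read_name(). Operating on unsigned char keeps the argument passed to tolower() within the range the C library requires.

```c
#include <ctype.h>   /* tolower */
#include <stddef.h>  /* size_t */

/* Hypothetical helper: copy <len> bytes of a DNS label from <src> to
 * <dst>, lowercasing letters on the way. Bytes are handled as unsigned
 * char so that tolower() never receives a negative value. */
static inline void copy_label_lower(unsigned char *dst,
                                    const unsigned char *src, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		dst[i] = (unsigned char)tolower(src[i]);
}
```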
Remi Tricot-Le Breton	257df69fbd	BUG/MINOR: ocsp: Crash when updating CA during ocsp updates If an ocsp response is set to be updated automatically and some certificate or CA updates are performed on the CLI, if the CLI update happens while the OCSP response is being updated and is then detached from the update tree, it might be wrongly inserted into the update tree in 'ssl_sock_load_ocsp', and then reinserted when the update finishes. The update tree then gets corrupted and we could end up crashing when accessing other nodes in the ocsp response update tree. This patch must be backported up to 2.8. This patch fixes GitHub #3100.	2025-09-15 15:34:36 +02:00
Aurelien DARRAGON	6a92b14cc1	MEDIUM: log/proxy: store log-steps selection using a bitmask, not an eb tree An eb tree was used to anticipate an unbounded number of custom log steps configured at the proxy level. It turns out that it makes no sense to configure that many logging steps for a proxy, and the cost of the eb tree is non-negligible in terms of memory footprint, especially when used in a default section. Instead, let's use a simple bitmask, which allows up to 64 logging steps configured at proxy level. If we lack space some day (and need more than 64 logging steps to be configured), we could simply modify "struct log_steps" to spread the bitmask over multiple 64-bit integers, minus some adjustments where the mask is set and checked.	2025-09-15 10:29:02 +02:00
Aurelien DARRAGON	be417c1db2	BUG/MEDIUM: http_ana: fix potential NULL deref in http_process_req_common() As reported by @kenballus in GH #3118, a potential NULL-deref was introduced in 3da1d63 ("BUG/MEDIUM: http_ana: handle yield for "stats http-request" evaluation") Indeed, px->uri_auth may be NULL when stats directive is not involved in the current proxy section. The bug went unnoticed because it didn't seem to cause any side-effect so far and valgrind didn't catch it. However ASAN did, so let's fix it before it causes harm. It should be backported with 3da1d63.	2025-09-15 10:28:59 +02:00
Christopher Faulet	b582fd41c2	Revert "BUG/MINOR: ocsp: Crash when updating CA during ocsp updates" This reverts commit 167ea8fc7b0cf9d1bf71ec03d7eac3141fbe0080. The patch was backported by mistake.	2025-09-15 10:16:20 +02:00
Remi Tricot-Le Breton	167ea8fc7b	BUG/MINOR: ocsp: Crash when updating CA during ocsp updates If an ocsp response is set to be updated automatically and some certificate or CA updates are performed on the CLI, if the CLI update happens while the OCSP response is being updated and is then detached from the update tree, it might be wrongly inserted into the update tree in 'ssl_sock_load_ocsp', and then reinserted when the update finishes. The update tree then gets corrupted and we could end up crashing when accessing other nodes in the ocsp response update tree. This patch must be backported up to 2.8. This patch fixes GitHub #3100.	2025-09-15 08:20:16 +02:00
Christopher Faulet	157852ce99	BUG/MEDIUM: resolvers: Wake resolver task up when unlinking a stream requester Another regression introduced with the commit 3023e9819 ("BUG/MINOR: resolvers: Restore round-robin selection on records in DNS answers"). Stream requesters are unlinked from any thread. So we must not try to queue the resolver's task here because it is not allowed to do so from another thread than the task thread. Instead, we can simply wake the resolver's task up. It is only performed when the last stream requester is unlinked from the resolution. This patch should fix the issue #3119. It must be backported with the commit above.	2025-09-15 07:57:29 +02:00
Christopher Faulet	e6a9192af6	BUG/MEDIUM: resolvers: Accept to create resolution without hostname A regression was introduced by commit 6cf2401ed ("BUG/MEDIUM: resolvers: Make resolution owns its hostname_dn value"). In fact, it is possible (and allowed?!) to create a resolution without hostname (hostname_dn == NULL). It only happens on startup for a server relying on a resolver but defined with an IP address and not a hostname. Because of the patch above, an error is triggered during the configuration parsing when this happens, while it should be accepted. This patch must be backported with the commit above.	2025-09-12 11:52:06 +02:00
Christopher Faulet	6cf2401eda	BUG/MEDIUM: resolvers: Make resolution owns its hostname_dn value The commit 37abe56b1 ("BUG/MEDIUM: resolvers: Properly cache do-resolv resolution") introduced a regression. A resolution does not own its hostname_dn value, it is a pointer to the first request value. But since the commit above, it is possible to have orphan resolutions, with no requester. So it is important to modify resolutions so that each one owns its hostname_dn value, by duplicating it when the resolution is created. This patch must be backported with the commit above.	2025-09-12 11:09:19 +02:00
Christopher Faulet	f6dfbbe870	BUG/MEDIUM: resolvers: Test for empty tree when getting a record from DNS answer In the previous fix 5d1d93fad ("BUG/MEDIUM: resolvers: Properly handle empty tree when getting a record from the DNS answer"), I missed the fact the answer tree can be empty. So, to avoid crashes, when the answer tree is empty, we immediately exit from resolv_get_ip_from_response() function with RSLV_UPD_NO_IP_FOUND. In addition, when a record is removed from the tree, we take care to reset the next node saved if necessary. This patch must be backported with the commit above.	2025-09-12 11:09:19 +02:00
Collison, Steven	d738fa4ec0	DOC: proxy-protocol: Add TLS group and sig scheme TLVs This change adds the PP2_SUBTYPE_SSL_GROUP and PP2_SUBTYPE_SSL_SIG_SCHEME code point reservations in proxy_protocol.txt. The motivation for adding these two TLVs is for backend visibility into the negotiated TLS key exchange group and handshake signature scheme. Demand for visibility is expected to increase as endpoints migrate to use new Post-Quantum resistant algorithms for key exchange and signatures.	2025-09-12 09:25:14 +02:00
Willy Tarreau	8fb5ae5cc6	MINOR: activity/memory: count allocations performed under a lock By checking the current thread's locking status, it becomes possible to know during a memory allocation whether it's performed under a lock or not. Both pools and memprofile functions were instrumented to check for this and to increment the memprofile bin's locked_calls counter. This one, when not zero, is reported on "show profiling memory" with a percentage of all allocations that such locked allocations represent. This way it becomes possible to try to target certain code paths that are particularly expensive. Example: $ socat - /tmp/sock1 <<< "show profiling memory"|grep lock 20297301 0 2598054528 0| 0x62a820fa3991 sockaddr_alloc+0x61/0xa3 p_alloc(128) [pool=sockaddr] [locked=54962 (0.2 %)] 0 20297301 0 2598054528| 0x62a820fa3a24 sockaddr_free+0x44/0x59 p_free(-128) [pool=sockaddr] [locked=34300 (0.1 %)] 9908432 0 1268279296 0| 0x62a820eb8524 main+0x81974 p_alloc(128) [pool=task] [locked=9908432 (100.0 %)] 9908432 0 554872192 0| 0x62a820eb85a6 main+0x819f6 p_alloc(56) [pool=tasklet] [locked=9908432 (100.0 %)] 263001 0 63120240 0| 0x62a820fa3c97 conn_new+0x37/0x1b2 p_alloc(240) [pool=connection] [locked=20662 (7.8 %)] 71643 0 47307584 0| 0x62a82105204d pool_get_from_os_noinc+0x12d/0x161 posix_memalign(660) [locked=5393 (7.5 %)]	2025-09-11 16:32:34 +02:00
Willy Tarreau	9d8c2a888b	MINOR: activity: collect CPU time spent on memory allocations for each task When task profiling is enabled, the pool alloc/free code will measure the time it takes to perform memory allocation after a cache miss or memory freeing to the shared cache or OS. The time taken with the thread-local cache is never measured as measuring that time is very expensive compared to the pool access time. Here doing so costs around 2% performance at 2M req/s, only when task profiling is enabled, so this remains reasonable. The scheduler takes care of collecting that time and updating the sched_activity entry corresponding to the current task when task profiling is enabled. The goal clearly is to track places that are wasting CPU time allocating and releasing too often, or causing large evictions. This appears like this in "show profiling tasks aggr": Tasks activity over 11.428 sec till 0.000 sec ago: function calls cpu_tot cpu_avg lkw_avg lkd_avg mem_avg lat_avg process_stream 44183891 16.47m 22.36us 491.0ns 1.154us 1.000ns 101.1us h1_io_cb 57386064 4.011m 4.193us 20.00ns 16.00ns - 29.47us sc_conn_io_cb 42088024 49.04s 1.165us - - - 54.67us h1_timeout_task 438171 196.5ms 448.0ns - - - 100.1us srv_cleanup_toremove_conns 65 1.468ms 22.58us 184.0ns 87.00ns - 101.3us task_process_applet 3 508.0us 169.3us - 107.0us 1.847us 29.67us srv_cleanup_idle_conns 6 225.3us 37.55us 15.74us 36.84us - 49.47us accept_queue_process 2 45.62us 22.81us - - 4.949us 54.33us	2025-09-11 16:32:34 +02:00
Willy Tarreau	195794eb59	MINOR: activity: add a new mem_avg column to show profiling stats This new column will be used for reporting the average time spent allocating or freeing memory in a task when task profiling is enabled. For now it is not updated.	2025-09-11 16:32:34 +02:00
Willy Tarreau	98cc815e3e	MINOR: activity: collect time spent with a lock held for each task When DEBUG_THREAD > 0 and task profiling enabled, we'll now measure the time spent with at least one lock held for each task. The time is collected by locking operations when locks are taken raising the level to one, or released resetting the level. An accumulator is updated in the thread_ctx struct that is collected by the scheduler when the task returns, and updated in the sched_activity entry of the related task. This allows to observe figures like this one: Tasks activity over 259.516 sec till 0.000 sec ago: function calls cpu_tot cpu_avg lkw_avg lkd_avg lat_avg h1_io_cb 15466589 2.574m 9.984us - - 33.45us <- sock_conn_iocb@src/sock.c:1099 tasklet_wakeup sc_conn_io_cb 8047994 8.325s 1.034us - - 870.1us <- sc_app_chk_rcv_conn@src/stconn.c:844 tasklet_wakeup process_stream 7734689 4.356m 33.79us 1.990us 1.641us 1.554ms <- sc_notify@src/stconn.c:1206 task_wakeup process_stream 7734292 46.74m 362.6us 278.3us 132.2us 972.0us <- stream_new@src/stream.c:585 task_wakeup sc_conn_io_cb 7733158 46.88s 6.061us - - 68.78us <- h1_wake_stream_for_recv@src/mux_h1.c:3633 tasklet_wakeup task_process_applet 6603593 4.484m 40.74us 16.69us 34.00us 96.47us <- sc_app_chk_snd_applet@src/stconn.c:1043 appctx_wakeup task_process_applet 4761796 3.420m 43.09us 18.79us 39.28us 138.2us <- __process_running_peer_sync@src/peers.c:3579 appctx_wakeup process_table_expire 4710662 4.880m 62.16us 9.648us 53.95us 158.6us <- run_tasks_from_lists@src/task.c:671 task_queue stktable_add_pend_updates 4171868 6.786s 1.626us - 1.487us 47.94us <- stktable_add_pend_updates@src/stick_table.c:869 tasklet_wakeup h1_io_cb 2871683 1.198s 417.0ns 70.00ns 69.00ns 1.005ms <- h1_takeover@src/mux_h1.c:5659 tasklet_wakeup process_peer_sync 2304957 5.368s 2.328us - 1.156us 68.54us <- stktable_add_pend_updates@src/stick_table.c:873 task_wakeup process_peer_sync 1388141 3.174s 2.286us - 1.130us 52.31us <- 
run_tasks_from_lists@src/task.c:671 task_queue stktable_add_pend_updates 463488 3.530s 7.615us 2.000ns 7.134us 771.2us <- stktable_touch_with_exp@src/stick_table.c:654 tasklet_wakeup Here we see that almost the entirety of stktable_add_pend_updates() is spent under a lock, that 1/3 of the execution time of process_stream() was performed under a lock and that 2/3 of it was spent waiting for a lock (this is related to the 10 track-sc present in this config), and that the locking time in process_peer_sync() has now significantly reduced. This is more visible with "show profiling tasks aggr": Tasks activity over 475.354 sec till 0.000 sec ago: function calls cpu_tot cpu_avg lkw_avg lkd_avg lat_avg h1_io_cb 25742539 3.699m 8.622us 11.00ns 10.00ns 188.0us sc_conn_io_cb 22565666 1.475m 3.920us - - 473.9us process_stream 21665212 1.195h 198.6us 140.6us 67.08us 1.266ms task_process_applet 16352495 11.31m 41.51us 17.98us 36.55us 112.3us process_peer_sync 7831923 17.15s 2.189us - 1.107us 41.27us process_table_expire 6878569 6.866m 59.89us 9.359us 51.91us 151.8us stktable_add_pend_updates 6602502 14.77s 2.236us - 2.060us 119.8us h1_timeout_task 801 703.4us 878.0ns - - 185.7us srv_cleanup_toremove_conns 347 12.43ms 35.82us 240.0ns 70.00ns 1.924ms accept_queue_process 142 1.384ms 9.743us - - 340.6us srv_cleanup_idle_conns 74 475.0us 6.418us 896.0ns 5.667us 114.6us	2025-09-11 16:32:34 +02:00
Willy Tarreau	95433f224e	MINOR: activity: add a new lkd_avg column to show profiling stats This new column will be used for reporting the average time spent in a task with at least one lock held. It will only have a non-zero value when DEBUG_THREAD > 0. For now it is not updated.	2025-09-11 16:32:34 +02:00
Willy Tarreau	4b23b2ed32	MINOR: thread: add a lock level information in the thread_ctx The new lock_level field indicates the number of cumulated locks that are held by the current thread. It's fed as soon as DEBUG_THREAD is at least 1. In addition, thread_isolate() adds 128, so that it's even possible to check for combinations of both. The value is also reported in thread dumps (warnings and panics).	2025-09-11 16:32:34 +02:00
Willy Tarreau	503084643f	MINOR: activity: collect time spent waiting on a lock for each task When DEBUG_THREAD > 0, and if task profiling is enabled, then each locking attempt will measure the time it takes to obtain the lock, then add that time to a thread_ctx accumulator that the scheduler will then retrieve to update the current task's sched_activity entry. The value will then appear averaged over the number of calls in the lkw_avg column of "show profiling tasks", such as below: Tasks activity over 48.298 sec till 0.000 sec ago: function calls cpu_tot cpu_avg lkw_avg lat_avg h1_io_cb 3200170 26.81s 8.377us - 32.73us <- sock_conn_iocb@src/sock.c:1099 tasklet_wakeup sc_conn_io_cb 1657841 1.645s 992.0ns - 853.0us <- sc_app_chk_rcv_conn@src/stconn.c:844 tasklet_wakeup process_stream 1600450 49.16s 30.71us 1.936us 1.392ms <- sc_notify@src/stconn.c:1206 task_wakeup process_stream 1600321 7.770m 291.3us 209.1us 901.6us <- stream_new@src/stream.c:585 task_wakeup sc_conn_io_cb 1599928 7.975s 4.984us - 65.77us <- h1_wake_stream_for_recv@src/mux_h1.c:3633 tasklet_wakeup task_process_applet 997609 46.37s 46.48us 16.80us 113.0us <- sc_app_chk_snd_applet@src/stconn.c:1043 appctx_wakeup process_table_expire 922074 48.79s 52.92us 7.275us 181.1us <- run_tasks_from_lists@src/task.c:670 task_queue stktable_add_pend_updates 705423 1.511s 2.142us - 56.81us <- stktable_add_pend_updates@src/stick_table.c:869 tasklet_wakeup task_process_applet 683511 34.75s 50.84us 18.37us 153.3us <- __process_running_peer_sync@src/peers.c:3579 appctx_wakeup h1_io_cb 535395 198.1ms 370.0ns 72.00ns 930.4us <- h1_takeover@src/mux_h1.c:5659 tasklet_wakeup It now makes it pretty obvious which tasks (hence call chains) spend their time waiting on a lock and for what share of their execution time.	2025-09-11 16:32:34 +02:00
Willy Tarreau	1956c544b5	MINOR: activity: add a new lkw_avg column to show profiling stats This new column will be used for reporting the average time spent waiting for a lock. It will only have a non-zero value when DEBUG_THREAD > 0. For now it is not updated.	2025-09-11 16:32:34 +02:00
Willy Tarreau	9f7ce9e807	MINOR: activity: don't report the lat_tot column for show profiling tasks This column is pretty useless, as the total latency experienced by tasks is meaningless, what matters is the average per call. Since we'll add more columns and we need to keep all of this readable, let's get rid of this column.	2025-09-11 16:32:34 +02:00
Christopher Faulet	3023e98199	BUG/MINOR: resolvers: Restore round-robin selection on records in DNS answers Since the commit dcb696cd3 ("MEDIUM: resolvers: hash the records before inserting them into the tree"), when several records are found in a DNS answer, the round robin selection over these records is no longer performed. Indeed, before, a list of records was used. To ensure each record was selected one after the other, at each selection, the first record of the list was moved to the end. When this list was replaced by a tree, the same mechanism was preserved. However, the record is indexed using its key, a hash of the record. So its position never changes. When it is removed and reinserted in the tree, its position remains the same. When we walk through the tree, starting from the root, the records are always evaluated in the same order. So, even if there are several records in a DNS answer, the same IP address is always selected. It is quite easy to trigger the issue with a do-resolv action. To fix the issue, the node to perform the next selection is now saved. So instead of restarting from the root each time, we can restart from the next node of the previous call. Thanks to Damien Claisse for the issue analysis and for the reproducer. This patch should fix the issue #3116. It must be backported as far as 2.6.	2025-09-11 15:46:45 +02:00
Christopher Faulet	37abe56b18	BUG/MEDIUM: resolvers: Properly cache do-resolv resolution As stated by the documentation, when a do-resolv resolution is performed, the result should be cached for <hold.valid> milliseconds. However, the only way to cache the result is to always have a requester. When the last requester is unlinked from the resolution, the resolution is released. So, for a do-resolv resolution, it means it could only work by chance if the same FQDN is requested enough to always have at least two streams waiting for the resolution. And because in that case, the cached result is used, it means the traffic must be quite high. In fact, a good approach to fix the issue is to keep orphan resolutions to be able to cache the result and only release them hold.valid milliseconds after the last real resolution. The resolver's task already releases orphan resolutions. So we only need to check the expiration date and take care to not release the resolution when the last stream is unlinked from it. This patch should be backported to all stable versions. We can start to backport it as far as 3.1 and then wait a bit.	2025-09-11 15:46:45 +02:00
William Lallemand	fb832e1e52	BUILD: ssl: functions defined but not used Previous patch 50d191b ("MINOR: ssl: set functions as static when no protypes in the .h") broke the WolfSSL build with unused functions. This patch adds __maybe_unused to ssl_sock_sctl_parse_cbk(), ssl_sock_sctl_add_cbk() and ssl_sock_msgcbk().	2025-09-11 15:32:59 +02:00
William Lallemand	50d191b8a3	MINOR: ssl: set functions as static when no protypes in the .h Check with -Wmissing-prototypes what should be static. src/ssl_sock.c:1572:5: error: no previous prototype for ‘ssl_sock_sctl_add_cbk’ [-Werror=missing-prototypes] 1572 \| int ssl_sock_sctl_add_cbk(SSL ssl, unsigned ext_type, const unsigned char out, size_t outlen, int al, void add_arg) \| ^~~~~~~~~~~~~~~~~~~~~ src/ssl_sock.c:1582:5: error: no previous prototype for ‘ssl_sock_sctl_parse_cbk’ [-Werror=missing-prototypes] 1582 \| int ssl_sock_sctl_parse_cbk(SSL s, unsigned int ext_type, const unsigned char in, size_t inlen, int al, void parse_arg) \| ^~~~~~~~~~~~~~~~~~~~~~~ src/ssl_sock.c:1604:6: error: no previous prototype for ‘ssl_sock_infocbk’ [-Werror=missing-prototypes] 1604 \| void ssl_sock_infocbk(const SSL ssl, int where, int ret) \| ^~~~~~~~~~~~~~~~ src/ssl_sock.c:2107:6: error: no previous prototype for ‘ssl_sock_msgcbk’ [-Werror=missing-prototypes] 2107 \| void ssl_sock_msgcbk(int write_p, int version, int content_type, const void buf, size_t len, SSL ssl, void arg) \| ^~~~~~~~~~~~~~~ src/ssl_sock.c:3936:5: error: no previous prototype for ‘sh_ssl_sess_new_cb’ [-Werror=missing-prototypes] 3936 \| int sh_ssl_sess_new_cb(SSL ssl, SSL_SESSION sess) \| ^~~~~~~~~~~~~~~~~~ src/ssl_sock.c:3990:14: error: no previous prototype for ‘sh_ssl_sess_get_cb’ [-Werror=missing-prototypes] 3990 \| SSL_SESSION sh_ssl_sess_get_cb(SSL ssl, __OPENSSL_110_CONST__ unsigned char key, int key_len, int do_copy) \| ^~~~~~~~~~~~~~~~~~ src/ssl_sock.c:4043:6: error: no previous prototype for ‘sh_ssl_sess_remove_cb’ [-Werror=missing-prototypes] 4043 \| void sh_ssl_sess_remove_cb(SSL_CTX ctx, SSL_SESSION sess) \| ^~~~~~~~~~~~~~~~~~~~~ src/ssl_sock.c:4075:6: error: no previous prototype for ‘ssl_set_shctx’ [-Werror=missing-prototypes] 4075 \| void ssl_set_shctx(SSL_CTX ctx) \| ^~~~~~~~~~~~~ src/ssl_sock.c:4103:6: error: no previous prototype for ‘SSL_CTX_keylog’ [-Werror=missing-prototypes] 4103 \| 
void SSL_CTX_keylog(const SSL ssl, const char line) \| ^~~~~~~~~~~~~~ src/ssl_sock.c:5167:6: error: no previous prototype for ‘ssl_sock_deinit’ [-Werror=missing-prototypes] 5167 \| void ssl_sock_deinit() \| ^~~~~~~~~~~~~~~ src/ssl_sock.c:6976:6: error: no previous prototype for ‘ssl_sock_close’ [-Werror=missing-prototypes] 6976 \| void ssl_sock_close(struct connection conn, void xprt_ctx) { \| ^~~~~~~~~~~~~~ src/ssl_sock.c:7846:17: error: no previous prototype for ‘ssl_action_wait_for_hs’ [-Werror=missing-prototypes] 7846 \| enum act_return ssl_action_wait_for_hs(struct act_rule rule, struct proxy *px, \| ^~~~~~~~~~~~~~~~~~~~~~	2025-09-11 15:23:59 +02:00
William Lallemand	19daee6549	MINOR: ocsp: put internal functions as static ones -Wmissing-prototypes lets us check which functions can be made static and are not used elsewhere. src/ssl_ocsp.c:1079:5: error: no previous prototype for ‘ssl_ocsp_update_insert_after_error’ [-Werror=missing-prototypes] 1079 | int ssl_ocsp_update_insert_after_error(struct certificate_ocsp *ocsp) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ src/ssl_ocsp.c:1116:6: error: no previous prototype for ‘ocsp_update_response_stline_cb’ [-Werror=missing-prototypes] 1116 | void ocsp_update_response_stline_cb(struct httpclient *hc) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ src/ssl_ocsp.c:1127:6: error: no previous prototype for ‘ocsp_update_response_headers_cb’ [-Werror=missing-prototypes] 1127 | void ocsp_update_response_headers_cb(struct httpclient *hc) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ src/ssl_ocsp.c:1138:6: error: no previous prototype for ‘ocsp_update_response_body_cb’ [-Werror=missing-prototypes] 1138 | void ocsp_update_response_body_cb(struct httpclient *hc) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ src/ssl_ocsp.c:1149:6: error: no previous prototype for ‘ocsp_update_response_end_cb’ [-Werror=missing-prototypes] 1149 | void ocsp_update_response_end_cb(struct httpclient *hc) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ src/ssl_ocsp.c:2095:5: error: no previous prototype for ‘ocsp_update_postparser_init’ [-Werror=missing-prototypes] 2095 | int ocsp_update_postparser_init() | ^~~~~~~~~~~~~~~~~~~~~~~~~~~	2025-09-11 15:18:48 +02:00
William Lallemand	0224d60de6	BUG/MINOR: ocsp: prototype inconsistency Inconsistencies between the .h and the .c can't be caught because the .h is not included in the .c. ocsp_update_init() does not have the right prototype and lacks a const attribute. Must be backported to all previous stable versions.	2025-09-11 15:18:10 +02:00
Remi Tricot-Le Breton	e0844a305c	BUG/MINOR: ssl: Fix potential NULL deref in trace callback 'conn' might be NULL in the trace callback so the calls to conn_err_code_str must be covered by a proper check. This issue was found by Coverity and raised in GitHub #3112. The patch must be backported to 3.2.	2025-09-11 14:31:32 +02:00
Remi Tricot-Le Breton	a316342ec6	BUG/MINOR: ssl: Potential NULL deref in trace macro 'ctx' might be NULL when we exit 'ssl_sock_handshake', it can't be dereferenced without check in the trace macro. This was found by Coverity and raised in GitHub #3113. This patch should be backported up to 3.2.	2025-09-11 14:31:32 +02:00
William Lallemand	e52e6f66ac	BUG/MEDIUM: jws: return size_t in JWS functions JWS functions are supposed to return 0 upon error or when nothing was produced. This was done in order to easily put the return value in trash->data without having to check the return value. However functions like a2base64url() or snprintf() could return a negative value, which would be cast to an unsigned int if this happens. This patch adds checks on the JWS functions to ensure that no negative value can be returned, and changes the prototype from int to size_t. This is also related to issue #3114. Must be backported to 3.2.	2025-09-11 14:31:32 +02:00
William Lallemand	66a7ebfeef	BUG/MINOR: acme: null pointer dereference upon allocation failure Reported in issue #3115: 11. var_compare_op: Comparing task to null implies that task might be null. 681 if (!task) { 682 ret++; 683 ha_alert("acme: couldn't start the scheduler!\n"); 684 } CID 1609721: (#1 of 1): Dereference after null check (FORWARD_NULL) 12. var_deref_op: Dereferencing null pointer task. 685 task->nice = 0; 686 task->process = acme_scheduler; 687 688 task_wakeup(task, TASK_WOKEN_INIT); 689 } 690 Task would be dereferenced upon allocation failure instead of falling back to the end of the function after the error. Should be backported in 3.2.	2025-09-11 14:31:32 +02:00
Amaury Denoyelle	c15129f7dc	DOC: quic: clarifies limited-quic support This patch extends the documentation for "limited-quic" global keyword. It mentions first that it relies on USE_QUIC_OPENSSL_COMPAT=1 build option. Compatibility with TLS libraries is now clearly exposed. In particular, it highlights the fact that it is mostly targeted at OpenSSL versions prior to 3.5.2, and that it should be disabled if a recent OpenSSL release is available. It also states that limited-quic does nothing if USE_QUIC_OPENSSL_COMPAT is not set during compilation.	2025-09-11 10:11:12 +02:00
Amaury Denoyelle	d293cc62dc	MINOR: quic: display build warning for compat layer on recent OpenSSL Build option USE_QUIC_OPENSSL_COMPAT=1 must be set to activate QUIC support for OpenSSL prior to version 3.5.2. This compiles an internal compatibility layer, which must be then activated at runtime with global option limited-quic. Starting from OpenSSL version 3.5.2, a proper QUIC TLS API is now exposed. Thus, the compatibility layer is unneeded. However it can still be compiled against newer OpenSSL releases and activated at runtime, mostly for testing purposes. As this compatibility layer has some limitations (no support for QUIC 0-RTT), it's important that users notice this situation and disable it if possible. Thus, this patch adds a notice warning when USE_QUIC_OPENSSL_COMPAT=1 is set when building against OpenSSL 3.5.2 and above. This should be sufficient for users and packagers to understand that this option is not necessary anymore. Note that USE_QUIC_OPENSSL_COMPAT=1 is incompatible with other TLS libraries which expose a QUIC API based on the original BoringSSL patch set. A build error will prevent the compatibility layer from being built. limited-quic option is thus silently ignored.	2025-09-11 10:11:12 +02:00
Frederic Lecaille	5027ba36a9	MINOR: quic-be: make SSL/QUIC objects use their own indexes (ssl_qc_app_data_index) This index is used to retrieve the quic_conn object from its SSL object, the same way the connection is retrieved from its SSL object for SSL/TCP connections. This patch implements two helper functions to avoid the ugly code with such blocks: #ifdef USE_QUIC else if (qc) { .. } #endif Implement ssl_sock_get_listener() to return the listener from an SSL object. Implement ssl_sock_get_conn() to return the connection from an SSL object and optionally a pointer to the ssl_sock_ctx struct attached to the connections or the quic_conns. Use these functions where applicable: - ssl_tlsext_ticket_key_cb() calls ssl_sock_get_listener() - ssl_sock_infocbk() calls ssl_sock_get_conn() - ssl_sock_msgcbk() calls ssl_sock_get_ssl_conn() - ssl_sess_new_srv_cb() calls ssl_sock_get_conn() - ssl_sock_srv_verifycbk() calls ssl_sock_get_conn() Also modify qc_ssl_sess_init() to initialize the ssl_qc_app_data_index index for the QUIC backends.	2025-09-11 09:51:28 +02:00
Frederic Lecaille	47bb15ca84	MINOR: quic: get rid of ->target quic_conn struct member The ->li (struct listener *) member of quic_conn struct was replaced by a ->target (struct obj_type *) member by this commit: MINOR: quic-be: get rid of ->li quic_conn member to abstract the connection type (front or back) when implementing QUIC for the backends. In these cases, ->target was a pointer to the obj_type of a server struct. This could not work with the dynamic servers contrary to the listeners which are not dynamic. This patch almost reverts the one mentioned above. ->target pointer to obj_type member is replaced by ->li pointer to listener struct member. As the listeners are not dynamic, this is easy to do. All one has to do is to replace the objt_listener(qc->target) statement by qc->li where applicable. For the backend connection, when needed, this is always qc->conn->target which is used only when qc->conn is initialized. The only "problematic" case is for quic_dgram_parse() which takes a pointer to an obj_type as third argument. But this obj_type is only used to call quic_rx_pkt_parse(). Inside this function it is used to access the proxy counters of the connection thanks to qc_counters(). So, this obj_type argument may be null from now on with this patch. This is the reason why qc_counters() is modified to take this into consideration.	2025-09-11 09:51:28 +02:00
Christopher Faulet	5354c24c76	BUG/MAJOR: stream: Force channel analysis on successful synchronous send This patch reverts commit a498e527b ("BUG/MAJOR: stream: Remove READ/WRITE events on channels after analysers eval") because of a regression. It was an attempt to properly detect synchronous sends, even when the stream was woken up on a write event. However, the fix was wrong because it could mask shutdowns performed during process_stream() and block the stream. Indeed, when a shutdown is performed, because an error occurred for instance, a write event is reported. The commit above could mask this event while the shutdown prevents any synchronous sends. In such a case, the stream could remain blocked indefinitely because an I/O event was missed. So to properly fix the original issue (#3070), the write event must not be masked before a synchronous send. Instead, we now force the channel analysis by explicitly setting the CF_WAKE_ONCE flag on the corresponding channel if a write event is reported after the synchronous send. The CF_WRITE_EVENT flag is explicitly removed just before, so it is quite easy to detect. This patch must be backported to all stable versions at the same time as the commit above.	2025-09-11 09:47:47 +02:00
Willy Tarreau	ded2110ec6	MEDIUM: peers: move process_peer_sync() to a single thread The remaining half of the task_queue() and task_wakeup() contention is caused by this function when peers are in use, because just like process_table_expire(), it's created using task_new_anywhere() and is woken up for local updates. Let's turn it to single thread by rotating the assigned threads during initialization so that a table only runs on one thread at a time. Here we go backwards to assign the threads, so that on small setups they don't end up on the same CPUs as the ones used by the stick-tables. This way this will make an even better use of large machines. The performance remains the same as with previous patch, even slightly better (1-3% on avg). At this point there's almost no multi-threaded task activity anymore (only srv_cleanup_idle_server once in a while). This should improve the situation described by Felipe in issues #3084 and #3101. This should be backported to 3.2 after some extended checks.	2025-09-10 19:14:05 +02:00
Willy Tarreau	e05afda249	MEDIUM: stick-table: move process_table_expire() to a single thread A great deal of the task_queue() contention is caused by this function because it's created using task_new_anywhere() and is subject to heavy updates. Let's turn it to single thread by rotating the assigned threads during initialization so that a table only runs on one thread at a time. However there's a trick: the function used to call task_queue() to requeue the task if it had advanced its timer (may only happen when learning an entry from a peer). We can't do that anymore since we can't queue another thread's task. Thus, if the task needs to be scheduled earlier than previously planned, we simply perform a wakeup. It will likely do nothing and will self-adjust its next wakeup timer. Doing so halves the number of multi-thread task wakeups. In addition the request rate at saturation increased by 12% with 16 peers and 40 tables on 16 8-thread processes. This should improve the situation described by Felipe in issues #3084 and #3101. This should be backported to 3.2 after some extended checks.	2025-09-10 19:13:33 +02:00
Willy Tarreau	2831cb104f	BUG/MINOR: stick-table: make sure never to miss a process_table_expire update In stktable_requeue_exp(), there's a tiny race at the beginning during which we check the task's expiration date to decide whether or not to wake process_table_expire() up. During this race, the task might just have finished running on its owner thread and we can miss a task_queue() opportunity, which probably explains why during testing it seldom happens that a few entries are left at the end. Let's perform a CAS to confirm the value is still the same before leaving. This way we're certain that our value has been seen at least once. This should be backported to 3.2.	2025-09-10 18:45:01 +02:00
Willy Tarreau	2ce5e0edcc	MEDIUM: resolvers: make the process_resolvers() task single-threaded This task is sometimes caught triggering the watchdog while waiting for the infamous resolvers lock, or the scheduler's wait queue lock in task_queue(). Both are caused by its multi-threaded capability. The task may indeed start on a thread that's different from the one that is currently receiving a response and that holds the resolvers lock, and when being queued back, it requires to lock the wait queue. Both problems disappear when sticking it to a single thread. But for configs running multiple resolvers sections, it would be suboptimal to run them all on the same thread. In order to avoid this, we implement a counter in the resolvers_finalize_config() section that rotates the thread for each resolvers section. This was sufficient to further improve the performance here, making the CPU usage drop to about 7% (from 11 previously or 38 initially) and not showing any resolvers lock contention anymore in perf top output. The change was kept fairly minimal to permit a backport once enough testing is conducted on it. It could address a significant part of the trouble reported by Felipe in GH issue #3101.	2025-09-10 16:51:14 +02:00
Willy Tarreau	d624aceaef	MEDIUM: dns: bind the nameserver sockets to the initiating thread There's still a big architectural limitation in the dns/resolvers code regarding threads: resolvers run as a task that is scheduled to run anywhere, and each NS dgram socket is bound to any thread of the same thread group as the initiating thread. This becomes a big problem when dealing with multiple nameservers because responses arrive on any thread, start by locking the resolvers section, and other threads dealing with responses are just stuck waiting for the lock to disappear. This means that most of the time is exclusively spent causing contention. The process_resolvers() function also suffers from this contention but apparently less often. It turns out that the nameserver sockets are created during emission of the first packet, triggered from the resolvers task. The present patch exploits this to stick all sockets to the calling thread instead of any thread. This way there is no longer any contention between multiple nameservers of the same resolvers section. Tests with a section having 10 name servers showed that the CPU usage dropped from 38 to about 10%, or almost by a factor of 4. Note that TCP resolvers do not offer this possibility because the tasks that manage the applets are created earlier to run anywhere during config parsing. This might possibly be refined later, e.g. by changing the task's affinity when it first runs. The change was kept fairly minimal to permit a backport once enough testing is conducted on it. It could address a significant part of the trouble reported by Felipe in GH issue #3101.	2025-09-10 16:48:09 +02:00
Olivier Houchard	07c10ec2f1	BUG/MEDIUM: ssl: Fix a crash if we failed to create the mux In ssl_sock_io_cb(), if we failed to create the mux, we may have destroyed the connection, so only attempt to access it to get the ALPN if conn_create_mux() was successful. This fixes crashes that may happen when using ssl.	2025-09-10 12:02:53 +02:00
Olivier Houchard	1759c97255	BUG/MEDIUM: ssl: Fix a crash when using QUIC Commit 5ab9954faa9c815425fa39171ad33e75f4f7d56f introduced a new flag in ssl_sock_ctx, to know that an ALPN was negotiated. However, the way to get the ssl_sock_ctx was wrong for QUIC. If we're using QUIC, get it from the quic_conn. This should fix crashes when attempting to use QUIC.	2025-09-10 11:45:03 +02:00
Willy Tarreau	be86a69fe8	DEBUG: stick-tables: export stktable_add_pend_updates() for better reporting This function is a tasklet handler used to send peers updates, and it can happen quite a bit in "show tasks" and "show profiling tasks", so let's export it so that we don't face a cryptic symbol name: $ socat - /tmp/haproxy-n10.stat <<< "show tasks" Running tasks: 43 (8 threads) function places % lat_tot lat_avg calls_tot calls_avg calls% process_table_expire 16 37.2 1.072m 4.021s 115831 7239 15.4 task_process_applet 15 34.8 1.072m 4.287s 486299 32419 65.0 stktable_add_pend_updates 8 18.6 - - 89725 11215 12.0 sc_conn_io_cb 3 6.9 - - 5007 1669 0.6 process_peer_sync 1 2.3 4.293s 4.293s 50765 50765 6.7 This should be backported to 3.2 as it participates to debugging the table+peers processing overhead.	2025-09-10 11:34:51 +02:00
Willy Tarreau	993c09438b	BUG/MEDIUM: stick-tables: don't loop on non-expirable entries The stick-table expiration of ref-counted entries was insufficiently addressed by commit 324f0a60ab ("BUG/MINOR: stick-tables: never leave used entries without expiration"), because now entries are just requeued where they were, so they're visited over and over for long sessions, causing process_table_expire() to loop, eating CPU and causing lock contention. Here we take care of refreshing their timer when they are met, so that we don't meet them more than once per stick-table lifetime. It should address at least a part of the recent degradation that Felipe noticed in GH #3084. Since the fix above was marked for backporting to 3.2, this one should be backported there as well.	2025-09-10 11:27:27 +02:00
Willy Tarreau	997d217dee	MINOR: tools: don't emit "+0" for symbol names which exactly match known ones resolve_sym_name() knows a number of symbols, but when one exactly matches (e.g. a task's handler), it systematically displays the offset behind it ("+0"). Let's only show the offset when non-zero. This can be backported as this is helpful for debugging.	2025-09-10 10:44:33 +02:00
Willy Tarreau	9eb35563a6	MINOR: activity: indicate the number of calls on "show tasks" The "show tasks" command can be useful to inspect run queues for active tasks, but currently it's difficult to distinguish an occasional running task from a heavily active one. Let's collect the number of calls for each of them, report them average on the number of instances of each task as well as a percentage of the total used. This way it even becomes possible to get a hint about how CPU usage is distributed.	2025-09-10 10:44:33 +02:00
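The per-task averages and percentages described above can be reproduced by hand from the sample "show tasks" output quoted in the stktable_add_pend_updates commit above. The sketch below is only an illustration: the function names are hypothetical, and the use of truncating integer division is an assumption inferred from the sample figures (for example 5007 calls out of 747627 is displayed as 0.6, not 0.7), not something read from the HAProxy sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: average number of calls per instance ("places")
 * of a task, truncated like the calls_avg column of the sample.
 */
static uint64_t calls_avg(uint64_t calls_tot, uint64_t places)
{
	assert(places > 0);
	return calls_tot / places;
}

/* Hypothetical helper: share of all calls, expressed in tenths of a
 * percent and truncated, so that 154 stands for "15.4".
 */
static uint64_t calls_pct_tenths(uint64_t calls_tot, uint64_t all_calls)
{
	assert(all_calls > 0);
	return calls_tot * 1000 / all_calls;
}
```

With the sample values, process_table_expire has 115831 calls over 16 instances, hence an average of 7239, and the five listed tasks total 747627 calls, hence a share of 15.4%.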
Willy Tarreau	17d3392348	BUG/MINOR: activity: fix reporting of task latency In 2.4, "show tasks" was introduced by commit 7eff06e162 ("MINOR: activity: add a new "show tasks" command to list currently active tasks") to expose some info about running tasks. The latency is not correct because it's a u32 subtracted from a u64. It ought to have been cast to u32 for the operation, which is what this patch does. This can be backported to 2.4.	2025-09-10 10:44:33 +02:00
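The arithmetic problem behind this latency fix can be shown with a small standalone sketch. The function and parameter names are hypothetical and the actual HAProxy variables differ; the only assumption is the one stated in the commit message, namely that the wake-up date keeps just the low 32 bits of a 64-bit clock.

```c
#include <stdint.h>

/* Buggy form: the 32-bit wake-up date is promoted to 64 bits, so the
 * upper 32 bits of the current date leak into the result.
 */
static uint64_t latency_buggy(uint64_t now, uint32_t wake_date)
{
	return now - wake_date;
}

/* Fixed form: the current date is first truncated to 32 bits, so the
 * subtraction is done modulo 2^32 and survives a wrap of the low bits.
 */
static uint32_t latency_fixed(uint64_t now, uint32_t wake_date)
{
	return (uint32_t)now - wake_date;
}
```

For instance, with a clock at 0x500000064 and a task woken 40 units earlier (stored as 60), the fixed form yields 40 while the buggy one yields 0x500000028.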
Willy Tarreau	bdff394195	BUILD: ssl: address a recent build warning when QUIC is enabled Since commit 5ab9954faa ("MINOR: ssl: Add a flag to let it known we have an ALPN negociated"), when building with QUIC we get this warning: src/ssl_sock.c: In function 'ssl_sock_advertise_alpn_protos': src/ssl_sock.c:2189:2: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement] Let's just move the instructions after the optional declaration. No backport is needed.	2025-09-10 10:44:33 +02:00
Olivier Houchard	d4c51a4f57	MEDIUM: server: Make use of the stored ALPN stored in the server Now that we store which ALPN gets negotiated for a given server, use that to decide if we can create the mux right away in connect_server(), and use it in conn_install_mux_be(). That way, we may create the mux soon enough for early data to be sent, before the handshake has been completed. This commit depends on several previous commits, and it has not been deemed important enough to backport.	2025-09-09 19:01:24 +02:00
Willy Tarreau	6a2b3269f9	CLEANUP: backend: clarify the cases where we want to use early data The conditions to use early data on output are super tricky and detected later, so that it's difficult to figure how this works. This patch splits the condition in two parts, the one that can be performed early that is based on config/client/etc. It is used to clear a variable that allows early data to be used in case any condition is not satisfied. It was purposely split into multiple independent and reviewable tests. The second part remains where it was at the end, and is used to temporarily clear the handshake flags to let the data layer use early data. This one being tricky, a large comment explaining the principle was added. The logic was not changed at all, only the code was made more readable.	2025-09-09 19:01:24 +02:00
Willy Tarreau	9b9d0720e1	CLEANUP: backend: simplify the complex ifdef related to 0RTT in connect_server() Since 3.0 we have HAVE_SSL_0RTT precisely to avoid checking horribly complicated and unmaintainable conditions to detect support for 0RTT. Let's just drop the complex condition and use the macro instead.	2025-09-09 19:01:24 +02:00
Willy Tarreau	4aaf0bfbce	CLEANUP: backend: invert the condition to start the mux in connect_server() Instead of trying to switch from delayed start to instant start based on a single condition, let's do the opposite and preset the condition to instant start and detect what could cause it to be delayed, thus falling back to the slow mode. The condition remains exactly the inverted one and better matches the comment about ALPN being the only cause of such a delay.	2025-09-09 19:01:24 +02:00
Willy Tarreau	7b4a7f92b5	CLEANUP: backend: clarify the role of the init_mux variable in connect_server() The init_mux variable is currently used in a way that's not super easy to grasp. It's set a bit too late and requires to know a lot of info at once. Let's first rename it to "may_start_mux_now" to clarify its role, as the purpose is not to force the mux to be initialized now but to permit it to do it.	2025-09-09 19:01:24 +02:00
Olivier Houchard	ff47ae60f3	MEDIUM: server: Introduce the concept of path parameters Add a new field in struct server, path parameters. It will contain connection information for the server that is not expected to change. For now, just store the ALPN negotiated with the server. Each time a handshake is done, we'll update it, even though it is not supposed to change. This will be useful when trying to send early data, that way we'll know which mux to use. Each time the server goes down or is disabled, that information is erased, as we can't be sure those parameters will be the same once the server is back up.	2025-09-09 19:01:24 +02:00
Olivier Houchard	9d65f5cd4d	MINOR: ssl: Use the new flag to know when the ALPN has been set. Now that we have a flag to let us know the ALPN has been set, we no longer have to call ssl_sock_get_alpn() to know if the ALPN has been negotiated already. Remove the call to conn_create_mux() from ssl_sock_handshake(), and just reuse the one already present in ssl_sock_io_cb() if we have received early data, and if the flag is set.	2025-09-09 19:01:24 +02:00
Olivier Houchard	5ab9954faa	MINOR: ssl: Add a flag to let it known we have an ALPN negociated Add a new flag to the ssl_sock_ctx, to be set as soon as the ALPN has been negotiated. This happens before the handshake has been completed, and that information will let us know that, when we receive early data, if the ALPN has been negotiated, then we can immediately create a mux, as the ALPN will tell us which mux to use.	2025-09-09 19:01:24 +02:00
Olivier Houchard	6b78af837d	BUG/MEDIUM: ssl: create the mux immediately on early data If we received early data, and an ALPN has been negotiated, then immediately try to create a mux if we did not have one already. Generally, at this point we would not have one, as the mux is decided by the ALPN, however at this point, even if the handshake is not done yet, we have enough to determine the ALPN, so we can immediately create the mux. Doing so makes us able to process the request immediately, without waiting for the handshake to be done. This should be backported up to 2.8.	2025-09-09 19:01:24 +02:00
Olivier Houchard	aa25ddb773	BUG/MEDIUM: h1: Allow reception if we have early data In h1_recv_allowed(), do not forbid the reception if we are yet to complete the connection, if we have received early data on it. That way, we can deal with them right away, instead of waiting for the handshake to be done. This should be backported up to 2.8.	2025-09-09 19:01:24 +02:00
Willy Tarreau	d7696d11e1	MEDIUM: peers: don't even try to process updates under contention Recent fix 2421c3769a ("BUG/MEDIUM: peers: don't fail twice to grab the update lock") improved the situation a lot for peers under locking contention but still not enough for situations with many peers and many entries to expire fast. It's indeed still possible to trigger warnings at end of injection sessions for 16 peers at 100k req/s each doing 10 random track-sc when process_table_expire() runs and holds the update lock if compiled with a high value of STKTABLE_MAX_UPDATES_AT_ONCE (1000). Better just not insist in this case and postpone the update. At this point, under load only ebmb_lookup() consumes CPU, other functions are in the few percent, indicating reasonable contention, and peers remain updated. This should be backported to 3.2 after a bit of testing.	2025-09-09 17:56:37 +02:00
Willy Tarreau	d5e7fba5c0	MEDIUM: stick-tables: don't wait indefinitely in stktable_add_pend_updates() This one doesn't need to wait forever, if it cannot work it can postpone it. When building with a high value of STKTABLE_MAX_UPDATES_AT_ONCE (1000), it's still possible to trigger warnings in this function on the write lock that is contended by peers and expiration. Changing it for a trylock resolves the issue. This should be backported to 3.2 after a bit of testing.	2025-09-09 17:56:37 +02:00
Willy Tarreau	a771b14541	MEDIUM: stick-tables: give up on lock contention in process_table_expire() process_table_expire() can take quite a lot of time running over all shards. During this time it will hinder track-sc rules and peers, which will experience an increased latency to do their work, especially peers where each message will cause a lock, whose cumulated time can exceed the watchdog's patience. Here, we proceed just like in stktable_trash_oldest(), which is that we're using a trylock to detect contention. The first time it happens, if we hadn't purged anything, we switch to a regular lock to perform the operation, and next time it happens we abort. This guarantees that some entries will be expired and that contention will be reduced when detected. With this change, various tests didn't manage to produce any warning, including at the end of the load generation session. This should be backported to 3.2 after a bit more testing.	2025-09-09 17:56:37 +02:00
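The give-up policy described above can be modelled without real locks. The sketch below is an assumption-laden illustration, not the HAProxy implementation: `expire_with_giveup()` and its parameters are hypothetical, a `true` in `shard_busy[]` stands for a trylock that would fail on that shard, and the choice to abort on a first failure when something was already purged is one reading of the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: walks the shards and returns how many entries were
 * expired before giving up on contention.
 */
static size_t expire_with_giveup(const bool *shard_busy,
                                 const size_t *shard_expirable,
                                 size_t nb_shards)
{
	size_t purged = 0;
	bool waited_once = false;
	size_t i;

	for (i = 0; i < nb_shards; i++) {
		if (shard_busy[i]) {
			/* a trylock would fail here: contention detected */
			if (waited_once || purged > 0)
				break; /* give up, the rest is left for a later run */
			/* first failure with nothing purged yet: a real
			 * implementation would block on the regular lock here
			 */
			waited_once = true;
		}
		purged += shard_expirable[i];
	}
	return purged;
}
```

In this model a run always makes some progress when the first shard holds expirable entries, yet it stops at the second sign of contention instead of holding other threads back.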
Willy Tarreau	f87cf8b76e	MEDIUM: stick-tables: relax stktable_trash_oldest() to only purge what is needed stktable_trash_oldest() does insist a lot on purging what was requested, only limited by STKTABLE_MAX_UPDATES_AT_ONCE. This is called in two conditions, one to allocate a new stksess, and the other one to purge entries of a stopping process. The cost of iterating over all shards is huge, and a shard lock is taken each time before looking up entries. Moreover, multiple threads can end up doing the same and looking hard for many entries to purge when only one is needed. Furthermore, all threads start from the same shard, hence synchronize their locks. All of this costs a lot to other operations such as access from peers. This commit simplifies the approach by ignoring the budget, starting from a random shard number, and using a trylock so as to be able to give up early in case of contention. The approach chosen here consists in trying hard to flush at least one entry, but once at least one is evicted or at least one trylock failed, then a failure on the trylock will result in finishing. The function now returns a success as long as one entry was freed. With this, tests no longer show watchdog warnings while running, though a few still remain when stopping the tests (which are not related to this function but to the contention from process_table_expire()). With this change, under high contention some entries' purge might be postponed and the table may occasionally contain slightly more entries than their size (though this already happens since stksess_new() first increments ->current before decrementing it). Measures were made on a 64-core system with 8 peers of 16 threads each, at CPU saturation (350k req/s each doing 10 track-sc) for 10M req, with 3 different approaches: - this one resulted in 1500 failures to find an entry (0.015% size overhead), with the lowest contention and the fairest peers distribution.
- leaving only after a success resulted in 229 failures (0.0029% size overhead) but doubled the time spent in the function (on the write lock precisely). - leaving only when both a success and a failed lock were met resulted in 31 failures (0.00031% overhead) but the contention was high enough again so that peers were not all up to date. Considering that a saturated machine might exceed its entries by 0.015% is pretty minimal, the mechanism is kept. This should be backported to 3.2 after a bit more testing as it resolves some watchdog warnings and panics. It requires precedent commit "MINOR: stick-table: permit stksess_new() to temporarily allocate more entries" to over-allocate instead of failing in case of contention.	2025-09-09 17:56:37 +02:00
Willy Tarreau	b119280f60	MINOR: stick-table: permit stksess_new() to temporarily allocate more entries stksess_new() calls stktable_trash_oldest() to release some entries. If it fails however, it will fail to allocate an entry. This is a problem because it doesn't permit stktable_trash_oldest() to be used in best effort mode, which forces it to impose high contention. There's no problem with allocating slightly more in practice. In the worst case if all entries are in use, it's not shocking to temporarily exceed the number of entries by a few units. Let's relax this problematic rule. This patch might need to be backported to 3.2 after a bit more testing in order to support locking relaxation.	2025-09-09 17:56:37 +02:00
Willy Tarreau	0f33a55171	DEBUG: peers: export functions that use locks The following functions take locks and are often involved in warnings but are currently not resolved, so let's export them so that they are properly decoded: peer_prepare_updatemsg(), peer_send_teachmsgs(), peer_treat_updatemsg(), peer_send_msgs(), peer_io_handler() This should be backported to 3.2.	2025-09-09 17:56:14 +02:00
Willy Tarreau	25195ba1e7	MINOR: debug: report the time since last wakeup and call When task profiling is enabled, the current thread knows when the currently running task was woken up and called, so we can calculate how long ago it was woken up and called. This is convenient to figure whether or not a warning or panic is caused by this task or by a previous one, so let's report this info in thread outputs when known. It would be useful to backport this to 3.2.	2025-09-09 17:56:14 +02:00
Willy Tarreau	12bc4f9c44	MINOR: debug: report the number of loops and ctxsw for each thread When multiple similar warnings are emitted, it can be difficult to know whether only one task is looping slowly or if many are sharing the CPU. Let's report the number of context switches and polling loop turns in thread dumps so that warnings are easier to understand. This should be backported to 3.2.	2025-09-09 17:56:14 +02:00
Willy Tarreau	c3f94fbd9b	DEBUG: stream: count the number of passes in the connect loop Normally the connect loop cannot loop, but some recent traces can easily convince one of the opposite. Let's add a counter, including in panic dumps, in order to avoid the repeated long head scratching sessions starting with "and what if...". In addition, if it's found to loop, this time it will be certain and will indicate what to zoom in. This should be backported to 3.2.	2025-09-09 17:56:14 +02:00
Willy Tarreau	8153cf1e51	MINOR: debug: report the process id in warnings and panics Warning and panic messages currently do not report the PID. This is annoying when trying to reproduce problems because warnings do not indicate which process to attach to in order to debug, and panics do not make it possible to know which core dump corresponds to which dump. Let's add them in both messages. This should probably be backported at least to 3.2.	2025-09-09 17:56:14 +02:00
Amaury Denoyelle	0678d0a69b	MINOR: check: reject invalid check config on a QUIC server QUIC is now supported on the backend side. The previous commit ensures that simple checks can be activated on QUIC servers without any issue. The current patch ensures that check server settings remain compatible with a QUIC server. Thus, configuration is now invalid if check specifies an explicit MUX proto other than QUIC, disables SSL or tries to use PROXY protocol.	2025-09-09 16:55:09 +02:00
Amaury Denoyelle	cd3027a7ee	BUG/MINOR: check: ensure checks are compatible with QUIC servers Previously, checks were only performed on TCP. However, QUIC is now supported on backend. Prior to this patch, check activation for QUIC servers would result in a crash. To ensure compatibility between QUIC servers and checks, adjust protocol_lookup() performed during check connect step. Instead of using a hardcoded PROTO_TYPE_STREAM, the value is now derived from server settings. This does not need to be backported.	2025-09-09 16:55:09 +02:00
Amaury Denoyelle	c6d33c09fc	BUG/MEDIUM: checks: fix ALPN inheritance from server If no specific check settings are defined on a server line, it is expected that these checks will be performed with the same parameters as normal connections on the same server. ALPN must be carefully taken into account for checks. Most notably, MUX initialization is delayed so that it is performed only after SSL handshake. Prior to this patch, MUX init delay was only performed if ALPN was defined via check settings. Thus, with the following settings, checks would be performed on HTTP/1.1 without consulting ALPN negotiation result from the server : server s1 127.0.0.1:443 ssl crt <...> alpn h2 check This bug may result in checks reporting failure, for example in case of a server answering HTTP/2 to ALPN negotiation to the configuration above. Besides, there is incoherency between normal and check connections, which is not what the documentation specifies. This patch fixes this code. Now server parameters are also taken into account. This ensures that checks and normal connections by default use the same connection method. This must be backported up to 2.4.	2025-09-09 16:55:09 +02:00
Amaury Denoyelle	fee3bd48b4	OPTIM: check: do not delay MUX for ALPN if SSL not active To ensure ALPN is properly applied on checks, MUX initialization is delayed so that it is created on SSL handshake completion. However, this does not check if SSL is really active for the connection. This patch adjusts the condition so that MUX init is not delayed if SSL is not active for the check connection. A similar process is already conducted for normal connections via connect_server(). This must be backported up to 2.4. Despite not being a bug, it must be backported for the following patch which fixes check ALPN inheritance from server settings.	2025-09-09 16:55:09 +02:00
Amaury Denoyelle	536d2aafa3	BUG/MINOR: hq-interop: adjust parsing/encoding on backend side HTTP/0.9 is available on top of QUIC. This protocol is reserved for internal use, mostly interop purpose. This patch adjusts HTTP/0.9 layer with the following changes: * version is not emitted anymore on the status line. This is performed as some servers do not parse it correctly. * status line is set explicitly on HTX status-line. This ensures the correct HTTP status code is reported to the upper stream layer. This does not need to be backported.	2025-09-09 16:55:09 +02:00
Christopher Faulet	b901e56acd	BUG/MEDIUM: mux-h2: Reinforce conditions to report an error to app-layer stream This patch relies on the previous one ("BUG/MEDIUM: mux-h2: Report RST/error to app-layer stream during 0-copy fwding"). When the end of the connection is detected, so when the H2_CF_END_REACHED flag is set after the shutdown was received and all incoming data were processed, if a stream is blocked by the flow control (the stream one or the connection one), an error must be reported to the app-layer stream. Otherwise, outgoing data won't be sent and the opposite side will handle this as a lack of room. So the stream will be blocked until the write timeout is triggered. By reporting the error early, the stream can be immediately closed. This patch should be backported to 3.2. For older versions, it is probably a good idea to wait for bug report.	2025-09-09 16:30:54 +02:00
Christopher Faulet	22e14f7b54	BUG/MEDIUM: mux-h2: Report RST/error to app-layer stream during 0-copy fwding In h2_nego_ff(), it is important to report reset and error to app-layer stream and to send the RST-STREAM frame accordingly. It is not clear if it is an issue or not. But it is clearly a difference with the classical forwarding via h2_snd_buf. And it is mandatory for the next fix. This patch should be backported to 3.2. But it is probably a good idea to not backport it on older versions, except if a bug is reported in this area.	2025-09-09 16:30:21 +02:00
Christopher Faulet	3b7112aa1d	BUG/MINOR: mux-h2: Remove H2_CF_DEM_DFULL flags when the demux buffer is reset This only happens when a connection error is detected or when the H2 connection is in ERR/ERR2 state. The demux buffer is explicitly reset. In that case, it is important to remove the flag reporting this buffer as full. It is probably worth to backport this patch to 3.2. But it is not mandatory on older versions because it does not fix any known issue.	2025-09-09 16:29:14 +02:00
Christopher Faulet	12edcccc82	BUG/MEDIUM: mux-h2: Restart reading when mbuf ring is no longer full When the mbuf ring buffer is full, the flag H2_CF_DEM_MROOM is set on the H2 connection to block any demux. It is important to properly handle ACK frames. However, we must take care to restart reading when some data were removed from the mbuf. Otherwise, we may block the demux for no reason. It is especially an issue if the demux buffer is full. In that case, the H2 connection is blocked, waiting for the timeout. This patch should be backported to 3.2. But is is probably a good idea to not backport it on older versions, except if a bug is reported in this area.	2025-09-09 16:07:20 +02:00
Christopher Faulet	c6e4584d2b	BUG/MEDIUM: mux-h2; Don't block reveives in H2_CS_ERROR and H2_CS_ERROR2 states The H2 connection is switched to ERR when a GOAWAY must be sent and in ERR2 when it is sent. In these states, no more data can be emitted by the mux. But there is no reason to not try to process incoming data or to not try to receive data. It is espcially important to be able to get the shutdown from the TCP connection when a SSL connection was previously detected. Otherwise, it is possible to block a H2 connection until its timeout expiration to be able to close it. This patch should be backported to 3.2. But is is probably a good idea to not backport it on older versions, except if a bug is reported in this area.	2025-09-09 16:07:20 +02:00
Christopher Faulet	626d7934cf	BUG/MEDIUM: mux-h2: Reset MUX blocking flags when a send error is caught When an send error is detected on the underlying connection, a pending error is reported to the H2 connection by setting H2_CF_ERR_PENDING flag. When this happen the tail of the mux ring buffer is reset. However some blocking flags remain set and have no chance to be removed later because of the pending error. Especially the flag H2_CF_DEM_MROOM which block data demultiplexing. Thus, it is possible to block a H2 connection with unparsed incoming data. Worse, if a read event is received, it could lead to a wakeup loop between the H2 connection and the underlying SSL connection. The H2 connection is unable to convert the pending error to a fatal error because the demultiplexing is blocked. In the mean time, it tries to receive more data because of the not-consumed read event. On the underlying connection side, the error detected earlier blocks the read, but the H2 connection is woken up to handle the error. To fix the issue, blocking flags must be removed when a send error is caught, H2_CF_MUX_MFULL and H2_CF_DEM_MROOM flags. But, it is not necessary to only release the tail of the mbuf ring. When a send error is detected, all outgoing data can be flushed. So, now, in h2_send(), h2_release_mbuf() function is called on pending error. The mbuf ring is fully released and H2_CF_MUX_MFULL and H2_CF_DEM_MROOM flags are removed. Many thanks to Krzysztof Kozłowski for its help to spot this issue. This patch could be backported at least as far as 2.8. But it is a bit sensitive. So, it is probably a good idea to backport it to 3.2 for now and wait for bug report on older versions.	2025-09-09 16:07:20 +02:00
Amaury Denoyelle	0b6908385e	BUG/MINOR: quic: properly support GSO on backend side Previously, GSO emission was explicitely disabled on backend side. This is not true since the following patch, thus GSO can be used, for example when transfering large POST requests to a HTTP/3 backend. commit e064e5d46171d32097a84b8f84ccc510a5c211db MINOR: quic: duplicate GSO unsupp status from listener to conn However, GSO on the backend side may cause crash when handling EIO. In this case, GSO must be completely disabled. Previously, this was performed by flagging listener instance. In backend side, this would cause a crash as listener is NULL. This patch fixes it by supporting GSO disable flag for servers. Thus, in qc_send_ppkts(), EIO can be converted either to a listener or server flag depending on the quic_conn proxy side. On backend side, server instance is retrieved via <qc.conn.target>. This is enough to guarantee that server is not deleted. This does not need to be backported.	2025-09-08 16:18:05 +02:00
Christopher Faulet	e653dc304e	MINOR: pools: Don't dump anymore info about pools when purge is forced Historically, when the purge of pools was forced by sending a SIGQUIT to haproxy, information about the pools were first dumped. It is now totally pointless because these info can be retrieved via the CLI. It is even less relevant now because the purge is forced typically when there are memroy issues and to dump pools information, data must be allocated. dump_pools_info() function was simplified because it is now called only from an applet. No reason to still try to dump info on stderr.	2025-09-08 16:04:40 +02:00
Christopher Faulet	982805e6a3	BUG/MINOR: pools: Fix the dump of pools info to deal with buffers limitations The "show pools" CLI command was not designed to dump information exceeding the size of a buffer. But there is now much more pools than few years ago and when detailed information are dumped, we exceeds the buffer limit and the output is truncated. To fix the issue, the command must be refactored to be able to stream the result. To do so, the array containing pools info is now part of the command context and it is dynamically allocated. A dedicated function was created to fill all info. In addition, the index of the next pool to dump is saved in the command context too to properly handle resumption cases. Finally global information about pools are also stored in the command context for convenience. This patch should fix the issue #3067. It must be backported to 3.2. On older release, the buffer limit is never reached.	2025-09-08 16:01:51 +02:00
Christopher Faulet	d75718af14	REGTESTS: ssl: Fix the script about automatic SNI selection First, the barrier to delay the client execution was moved before the client definition. Otherwise, the connection is established too early and with short timeouts it could be closed before the requests are sent. The main purpose of the barrier was to work around slow health-checks. This is also the reason why the script was flagged as slow. But it can be significantly sped up by setting a low "inter" value. It is now set to 100ms and the script is no longer slow.	2025-09-08 15:55:56 +02:00
Amaury Denoyelle	f645cd3c74	MINOR: quic: restore QUIC_HP_SAMPLE_LEN constant The below patch fixes padding emission for small packets, which is required to ensure that header protection removal can be performed by the recipient. commit d7dea408c64c327cab6aebf4ccad93405b675565 BUG/MINOR: quic: too short PADDING frame for too short packets In addition to the proper fix, constant QUIC_HP_SAMPLE_LEN was removed and replaced by QUIC_TLS_TAG_LEN. However, it still makes sense to have a dedicated constant which represents the size of the sample used for header protection. Thus, this patch restores it. Special instructions for backport : above patch mentions that no backport is needed. However, this is incorrect, as the bug is introduced by another patch scheduled for backport up to 2.6. Thus, it is first mandatory to schedule d7dea408c64c327cab6aebf4ccad93405b675565 after it. Then, this patch can also be used for the sake of code clarity.	2025-09-08 14:49:03 +02:00
Amaury Denoyelle	c20c71a079	TESTS: quic: add unit-tests for QUIC TX part Define a new "quic_tx" unit-test which is used to test QUIC TX module. For the moment, a single test is performed on qc_do_build_pkt(). It checks that PADDING is correctly added for HP sampling in case of a small packet.	2025-09-08 14:49:03 +02:00
Amaury Denoyelle	fb8c6e2030	CLEANUP: quic: fix typo in quic_tx trace Fix trace in qc_may_build_pkt(). This can be backported up to 3.0.	2025-09-08 14:49:03 +02:00
Aurelien DARRAGON	b9ef55d56d	MINOR: stats-file: use explicit unsigned integer bitshift for user slots As reported in GH #3104, there remained a place where (1 << shift) was used to set or remove bits from uint64_t users bitfield. It is incorrect and could lead to bugs for values > 32 bits. Instead, let's use 1ULL to ensure the operation remains 64bits consistent. No backport needed.	2025-09-08 13:38:49 +02:00
Aurelien DARRAGON	9272b8ce74	BUG/MEDIUM: proxy: fix crash with stop_proxy() called during init Willy reported that the following config would segfault right after the "removing incomplete section 'peer'" message is emitted: peers peers bind :2300 server n10 127.0.0.1:2310 listen dummy bind localhost:9999 This is caused by the fact that stop_proxy(), which tries to read shared counters, is called during early init while shared counters are not yet initialized. To fix the crash, let's check if we're still in the starting phase, in which case we assume the counters are not initialized and we assume 0 value instead. No backport needed unless 16eb0fab31 ("MAJOR: counters: dispatch counters over thread groups") is.	2025-09-08 13:38:38 +02:00
Frederic Lecaille	6f9fccec1f	MINOR: quic: SSL session reuse for QUIC Mimic the same behavior as the one for SSL/TCP connections to implement the SSL session reuse. Extract the code which tries to reuse the SSL session for SSL/TCP connections to implement ssl_sock_srv_try_reuse_sess(). Call this function from QUIC ->init() xprt callback (qc_conn_init()) as is done for SSL/TCP connections.	2025-09-08 11:46:26 +02:00
Olivier Houchard	b3e685ac3d	BUG/MEDIUM: ssl: Properly initialize msg_controllen. When kTLS is compiled in, make sure msg_controllen is initialized to 0. If we're not actually using kTLS, then it won't be set, but we'll check that it is non-zero later to check if we have ancillary data. This does not need to be backported. This should fix CID 1620865, as reported in github issue #3106.	2025-09-06 14:19:48 +02:00
Willy Tarreau	75bd9255dd	BUG/MINOR: cpu_topo: work around a small bug in musl's CPU_ISSET() As found in GH issue #3103, CPU_ISSET() on musl 1.25 doesn't match the man page which says it's returning an int. The reason is pretty simple, it's a macro that operates on the bits directly and returns the result of the bit field applied to the mask as an unsigned long. Bits above 31 will simply be dropped if returned as an int, which causes CPUs 32..63 to appear as absent from cpu_sets. The fix is trivial, it consists in just comparing the result against zero (i.e. turning it to a boolean), but before it's merged and deployed we'll have to face such deployments, so better implement the same workaround in the code here since we have access to the raw long value. This workaround should be backported to 3.0.	2025-09-06 11:05:52 +02:00
Frederic Lecaille	d7dea408c6	BUG/MINOR: quic: too short PADDING frame for too short packets This bug arrived with this commit: MINOR: quic: centralize padding for HP sampling on packet building What was missed is the fact that at the centralization point for the PADDING frame to add for too short packet, <len> payload length already includes <*pn_len> the packet number field length value. So when computing the length of the PADDING frame, the packet field length must not be considered and added to the payload length (<len>). This bug led to too short PADDING frames for too short packets. This was the case, most of the time, with Application level packets with a 1-byte packet number field followed by a 1-byte PING frame. A 1-byte PADDING frame was added in this case in place of a correct 2-byte PADDING frame. The header packet protection of such packets could not be removed by the clients as for instance for ngtcp2 with such traces: I00001828 0x5a135c81e803f092c74bac64a85513b657 pkt could not decrypt packet number As the header protection could not be removed, the header keyupdate bit could also not be read by packet analyzers such as pyshark used during the keyupdate tests. No need to backport.	2025-09-05 16:17:11 +02:00
Frederic Lecaille	71336bdd08	MINOR: quic: add useful trace about padding params values When adding a PADDING frame for too short packets, add a trace about the variable values on which this PADDING frame length depends.	2025-09-05 16:17:11 +02:00
Christopher Faulet	cc8af125be	REGTESTS: ssl: Add a script to test the automatic SNI selection The script reg-tests/ssl/ssl_sni_auto.vtc tests the automatic SNI selection for regular server connections and for health-check ones. It relies on a 3.3-dev8 feature (in fact, it was pushed just after the dev8).	2025-09-05 15:56:42 +02:00
Christopher Faulet	f9a6ae727c	OPTIM: tcpcheck: Reorder tcpcheck_connect structure fields to fill holes Thanks to this patch, two 4-byte holes are now filled in the tcpcheck_connect structure.	2025-09-05 15:56:42 +02:00
Christopher Faulet	ffc1f096e0	MEDIUM: httpcheck/ssl: Base the SNI value on the HTTP host header by default Similarly to the automatic SNI selection for regular SSL traffic, the SNI of health-checks HTTPS connection is now automatically set by default by using the host header value. "check-sni-auto" and "no-check-sni-auto" server settings were added to change this behavior. Only implicit HTTPS health-checks can take advantage of this feature. In this case, the host header value from the "option httpchk" directive is used to extract the SNI. It is disabled if http-check rules are used. So, the SNI must still be explicitly specified via a "http-check connect" rule. This patch should partially fix the issue #3081.	2025-09-05 15:56:42 +02:00
Christopher Faulet	668916c1a2	MEDIUM: server/ssl: Base the SNI value to the HTTP host header by default For HTTPS outgoing connections, the SNI is now automatically set using the Host header value if no other value is already set (via the "sni" server keyword). It is now the default behavior. It could be disabled with the "no-sni-auto" server keyword. And eventually "sni-auto" server keyword may be used to reset any previous "no-sni-auto" setting. This option can be inherited from "default-server" settings. Finally, if no connection name is set via "pool-conn-name" setting, the selected value is used. The automatic selection of the SNI is enabled by default for all outgoing connections. But it is concretely used for HTTPS connections only. The expression used is "req.hdr(host),host_only". This patch should partially fix the issue #3081. It only covers the server part. Another patch will add the feature for HTTP health-checks.	2025-09-05 15:56:42 +02:00
Christopher Faulet	58555b8653	BUG/MINOR: tcpcheck: Don't use sni as pool-conn-name for non-SSL connections When we try to reuse a connection to perform a healthcheck, the SNI, from the tcpcheck connection or the healthcheck itself, must not be used as connection name for non-SSL connections. This patch must be backported to 3.2.	2025-09-05 15:56:42 +02:00
Christopher Faulet	eb3d4eb59f	OPTIM: tcpcheck: Don't set SNI and ALPN for non-ssl connections There is no reason to set the SNI and ALPN for non-ssl connections. It is not really an issue because ssl_sock_set_servername() and ssl_sock_set_alpn() functions will do nothing. But it is cleaner this way and this could avoid bugs in future. No backport needed, because there is no bug.	2025-09-05 15:56:42 +02:00
Christopher Faulet	ef07d3511a	OPTIM: proto_rhttp: Don't set SNI for non-ssl connections There is no reason to set the SNI for non-ssl connections. It is not really an issue because ssl_sock_set_servername() function will do nothing. But there is no reason to uselessly evaluate an expression. No backport needed, because there is no bug.	2025-09-05 15:56:42 +02:00
Christopher Faulet	52866349a1	OPTIM: backend: Don't set SNI for non-ssl connections There is no reason to set the SNI for non-ssl connections. It is not really an issue because ssl_sock_set_servername() function will do nothing. But there is no reason to uselessly evaluate an expression. No backport needed, because there is no bug.	2025-09-05 15:56:42 +02:00
Christopher Faulet	a97bd0f505	BUG/MINOR: server: Update healthcheck when server settings are changed via CLI Not all changes are concerned. But when the SSL is enabled or disabled for a server, the healthcheck xprt must eventually be updated too. This happens when the healthcheck relies on the server settings. In the same spirit, when the healthcheck address and port are updated, we must fall back on the raw xprt if the SSL is not explicitly enabled for the healthcheck with a "check-ssl" parameter. This patch should be backported to all stable versions.	2025-09-05 15:56:42 +02:00
Christopher Faulet	f8f94ffc9c	BUG/MEDIUM: server: Use sni as pool connection name for SSL server only By default, for a given server, when no pool-conn-name is specified, the configured sni is used. However, this must only be done when SSL is in-use for the server. Of course, it is uncommon to have a sni expression for a non-ssl server. But this may happen. In addition, the SSL may be disabled via the CLI. In that case, the pool-conn-name must be discarded if it was copied from the sni. And, we must of course take care to set it if the ssl is enabled. Finally, when the attach-srv action is checked, we now check the pool-conn-name expression. This patch should be backported as far as 3.0. It relies on "MINOR: server: Parse sni and pool-conn-name expressions in a dedicated function" which should be backported too.	2025-09-05 15:56:08 +02:00
Christopher Faulet	086a248645	MINOR: server: Parse sni and pool-conn-name expressions in a dedicated function This change is mandatory to fix an issue. The parsing of sni and pool-conn-name expressions (from string to expression) is now handled in a dedicated function. This will avoid duplicating the same code at different places.	2025-09-05 11:32:21 +02:00
Christopher Faulet	bb407ba8e3	BUG/MINOR: acl: Fix error message about several '-m' parameters There is a typo in the commit * c51ddd5c3 ("MINOR: acl: Only allow one '-m' matching method") . '*m' was reported in the error message instead of '-m'. In addition, it is now mentioned that only the last one should be kept if an old config triggers the error. No backport needed, except if the commit above is backported.	2025-09-05 11:32:20 +02:00
Willy Tarreau	b167d545cf	[RELEASE] Released version 3.3-dev8 Released version 3.3-dev8 with the following main changes : - BUG/MEDIUM: mux-h2: fix crash on idle-ping due to unwanted ABORT_NOW - BUG/MINOR: quic-be: missing Initial packet number space discarding - BUG/MEDIUM: quic-be: crash after backend CID allocation failures - BUG/MEDIUM: ssl: apply ssl-f-use on every "ssl" bind - BUG/MAJOR: stream: Remove READ/WRITE events on channels after analysers eval - MINOR: dns: dns_connect_nameserver: fix fd leak at error path - BUG/MEDIUM: quic: reset padding when building GSO datagrams - BUG/MINOR: quic: do not emit probe data if CONNECTION_CLOSE requested - BUG/MAJOR: quic: fix INITIAL padding with probing packet only - BUG/MINOR: quic: don't coalesce probing and ACK packet of same type - MINOR: quic: centralize padding for HP sampling on packet building - MINOR: http_ana: fix typo in http_res_get_intercept_rule - BUG/MEDIUM: http_ana: handle yield for "stats http-request" evaluation - MINOR: applet: Rely on applet flag to detect the new api - MINOR: applet: Add function to test applet flags from the appctx - MINOR: applet: Add a flag to know an applet is using HTX buffers - MINOR: applet: Make some applet functions HTX aware - MEDIUM: applet: Set .rcv_buf and .snd_buf functions on default ones if not set - BUG/MEDIUM: mux-spop: Reject connection attempts from a non-spop frontend - REGTESTS: jwt: create dynamically "cert.ecdsa.pem" - BUG/MEDIUM: spoe: Improve error detection in SPOE applet on client abort - MINOR: haproxy: abort config parsing on fatal errors for post parsing hooks - MEDIUM: server: split srv_init() in srv_preinit() + srv_postinit() - MINOR: proxy: handle shared listener counters preparation from proxy_postcheck() - DOC: configuration: reword 'generate-certificates' - BUG/MEDIUM: quic-be: avoid crashes when releasing Initial pktns - BUG/MINOR: quic: reorder fragmented RX CRYPTO frames by their offsets - MINOR: ssl: diagnostic warning when both 
'default-crt' and 'strict-sni' are used - MEDIUM: ssl: convert diag to warning for strict-sni + default-crt - DOC: configuration: clarify 'default-crt' and implicit default certificates - MINOR: quic: remove ->offset qf_crypto struct field - BUG/MINOR: mux-quic: trace with non initialized qcc - BUG/MINOR: acl: set arg_list->kw to aclkw->kw string literal if aclkw is found - BUG/MEDIUM: mworker: fix startup and reload on macOS - BUG/MINOR: connection: rearrange union list members - BUG/MINOR: connection: remove extra session_unown_conn() on reverse - MINOR: cli: display failure reason on wait command - BUG/MINOR: server: decrement session idle_conns on del server - BUG/MINOR: mux-quic: do not access conn after idle list insert - MINOR: session: document explicitely that session_add_conn() is safe - MINOR: session: uninline functions related to BE conns management - MINOR: session: refactor alloc/lookup of sess_conns elements - MEDIUM: session: protect sess conns list by idle_conns_lock - MINOR: server: shard by thread sess_conns member - MEDIUM: server: close new idle conns if server in maintenance - MEDIUM: session: close new idle conns if server in maintenance - MINOR: server: cleanup idle conns for server in maint already stopped - MINOR: muxes: enforce thread-safety for private idle conns - MEDIUM: conn/muxes/ssl: reinsert BE priv conn into sess on IO completion - MEDIUM: conn/muxes/ssl: remove BE priv idle conn from sess on IO - MEDIUM: mux-quic: enforce thread-safety of backend idle conns - MAJOR: server: implement purging of private idle connections - MEDIUM: session: account on server idle conns attached to session - MAJOR: server: do not remove idle conns in del server - BUILD: mworker: fix ignoring return value of ‘read’ - DOC: unreliable sockpair@ on macOS - MINOR: muxes: adjust takeover with buf_wait interaction - OPTIM: backend: set release on takeover for strict maxconn - DOC: configuration: confuse "strict-mode" with "zero-warning" - MINOR: doc: add 
missing statistics column - MINOR: doc: add missing statistics column - MINOR: stats: display new curr_sess_idle_conns server counter - MINOR: proxy: extend "show servers conn" output - MEDIUM: proxy: Reject some header names for 'http-send-name-header' directive - BUG/BUILD: stats: fix build due to missing stat enum definition - DOC: proxy-protocol: Make example for PP2_SUBTYPE_SSL_SIG_ALG accurate - CLEANUP: quic: remove a useless CRYPTO frame variable assignment - BUG/MEDIUM: quic: CRYPTO frame freeing without eb_delete() - BUG/MAJOR: mux-quic: fix crash on reload during emission - MINOR: conn/muxes/ssl: add ASSUME_NONNULL() prior to _srv_add_idle - REG-TESTS: map_redirect: Don't use hdr_dom in ACLs with "-m end" matching method - MINOR: acl: Only allow one '-m' matching method - MINOR: acl; Warn when matching method based on a suffix is overwritten - BUG/MEDIUM: server: Duplicate healthcheck's alpn inherited from default server - BUG/MINOR: server: Duplicate healthcheck's sni inherited from default server - BUG/MINOR: acl: Properly detect overwritten matching method - BUG/MINOR: halog: Add OOM checks for calloc() in filter_count_srv_status() and filter_count_url() - BUG/MINOR: log: Add OOM checks for calloc() and malloc() in logformat parser and dup_logger() - BUG/MINOR: acl: Add OOM check for calloc() in smp_fetch_acl_parse() - BUG/MINOR: cfgparse: Add OOM check for calloc() in cfg_parse_listen() - BUG/MINOR: compression: Add OOM check for calloc() in parse_compression_options() - BUG/MINOR: tools: Add OOM check for malloc() in indent_msg() - BUG/MINOR: quic: ignore AGAIN ncbuf err when parsing CRYPTO frames - MINOR: quic/flags: complete missing flags - BUG/MINOR: quic: fix room check if padding requested - BUG/MINOR: quic: fix padding issue on INITIAL retransmit - BUG/MINOR: quic: pad Initial pkt with CONNECTION_CLOSE on client - MEDIUM: quic: strengthen BUG_ON() for unpad Initial packet on client - DOC: configuration: rework the jwt_verify keyword 
documentation - BUG/MINOR: haproxy: be sure not to quit too early on soft stop - BUILD: acl: silence a possible null deref warning in parse_acl_expr() - MINOR: quic: Add more information about RX packets - CI: fix syntax of Quic Interop pipelines - MEDIUM: cfgparse: warn when using user/group when built statically - BUG/MEDIUM: stick-tables: don't leave the expire loop with elements deleted - BUG/MINOR: stick-tables: never leave used entries without expiration - BUG/MEDIUM: peers: don't fail twice to grab the update lock - MINOR: stick-tables: limit the number of visited nodes during expiration - OPTIM: stick-tables: exit expiry faster when the update lock is held - MINOR: counters: retrieve detailed errmsg upon failure with counters_{fe,be}_shared_prepare() - MINOR: stats-file: introduce shm-stats-file directive - MEDIUM: stats-file: processes share the same clock source from shm-stats-file - MINOR: stats-file: add process slot management for shm stats file - MEDIUM: stats-file/counters: store and preload stats counters as shm file objects - DOC: config: document "shm-stats-file" directive - OPTIM: stats-file: don't unnecessarily die hard on shm_stats_file_reuse_object() - MINOR: compiler: add ALWAYS_PAD() macro - BUILD: stats-file: fix aligment issues - MINOR: stats-file: reserve some bytes in exported structs - MEDIUM: stats-file: add some BUG_ON() guards to ensure exported structs are not changed by accident - BUG/MINOR: check: ensure check-reuse is compatible with SSL - BUG/MINOR: check: fix dst address when reusing a connection - REGTESTS: explicitly use "balance roundrobin" where RR is needed - MAJOR: backend: switch the default balancing algo to "random" - BUG/MEDIUM: conn: fix UAF on connection after reversal on edge - BUG/MINOR: connection: streamline conn detach from lists - BUG/MEDIUM: quic-be: too early SSL_SESSION initialization - BUG/MINOR: log: fix potential memory leak upon error in add_to_logformat_list() - MEDIUM: init: always warn when running 
as root without being asked to - MINOR: sample: Add base2 converter - MINOR: version: add -vq, -vqb, and -vqs flags for concise version output - BUILD: trace: silence a bogus build warning at -Og - MINOR: trace: accept trace spec right after "-dt" on the command line - BUILD: makefile: bump the default minimum linux version to 4.17	2025-09-05 09:54:34 +02:00
Willy Tarreau	85ac6a6f7b	BUILD: makefile: bump the default minimum linux version to 4.17 As explained during the 3.3-dev7 announcement below: https://www.mail-archive.com/haproxy@formilux.org/msg46073.html no regularly maintained distro supports a kernel older than 4.18 anymore, and KTLS is supported since 4.17. So it's about the right moment to bump the default minimum kernel version supported by glibc and musl to automatically cover new features. The linux-glibc-legacy target still supports 2.6.28 and above.	2025-09-05 09:44:56 +02:00
Willy Tarreau	670dc299d3	MINOR: trace: accept trace spec right after "-dt" on the command line I continue to mistakenly set the traces using "-dtXXX" and to have to refer to the doc to figure that it requires a separate argument and differs from some other options. Worse, "-dthelp" doesn't say anything and silently ignores the argument. Let's make the parser take whatever follows "-dt" as the argument if present, otherwise take the next one (as it currently does). Doing this even allows to simplify the code, and is easier to figure the syntax since "-dthelp" now works.	2025-09-05 09:33:28 +02:00
Willy Tarreau	abfd6f3b93	BUILD: trace: silence a bogus build warning at -Og gcc-13.3 at -Og emits an incorrect build warning in trace.c about a possibly uninitialized variable: In file included from include/haproxy/api.h:35, from src/trace.c:22: src/trace.c: In function 'trace_parse_cmd': include/haproxy/bug.h:431:17: warning: 'arg' may be used uninitialized [-Wmaybe-uninitialized] 431 \| free(__x); \ \| ^~~~~~~~~~ src/trace.c:1136:9: note: in expansion of macro 'ha_free' 1136 \| ha_free(&oarg); \| ^~~~~~~ src/trace.c:1008:15: note: 'arg' was declared here 1008 \| char arg, *oarg; \| ^~~ The warning is obviously wrong since the field is initialized in one of the two branches of an "if" whose complementary one returns. But the compiler doesn't seem to see this because the if is in fact two ifs each with an opposite condition: "if (arg_src)" then "if (!arg_src)". Let's just move upwards the default one that returns and eliminate the other one. Reading the diff with "git diff -b" better shows the tiny change. It could be backported to 3.0.	2025-09-05 09:19:24 +02:00
Nikita Kurashkin	ef73fe2584	MINOR: version: add -vq, -vqb, and -vqs flags for concise version output This patch introduces three new command line flags to display HAProxy version info more flexibly: - `-vqs` outputs the short version string without commit info (e.g., "3.3.1"). - `-vqb` outputs only the branch (major.minor) part of the version (e.g., "3.3"). - `-vq` outputs the full version string with suffixes (e.g., "3.3.1-dev5-1bb975-71"). This allows easier parsing of version info in automation while keeping existing -v and -vv behaviors. The command line argument parsing now calls `display_version_plain()` with a display_mode parameter to select the desired output format. The function handles stripping of commit or patch info as needed, depending on the mode. Signed-off-by: Nikita Kurashkin <nkurashkin@stsoft.ru>	2025-09-05 08:57:57 +02:00
Maximilian Moehl	5d9abc68b4	MINOR: sample: Add base2 converter This commit adds the base2 converter to turn binary input into its string representation. Each input byte is converted into a series of eight characters which are either 0s or 1s by bit-wise comparison.	2025-09-05 08:51:51 +02:00
Willy Tarreau	a6986e1cd6	MEDIUM: init: always warn when running as root without being asked to Like many exposed network daemons, haproxy does normally not need to run as root and strongly recommends against this, unless strictly necessary. On some operating systems, capabilities even totally alleviate this need. Lately, maybe due to a rise in containerization or automated config generation or a bit of both, we've observed a resurgence of this bad practice, possibly due to the fact that users are just not aware of the conditions under which they're running their daemon. Let's add a warning at boot when starting as root without having requested it using "uid" or "user". And take this opportunity for warning the user about the existence of capabilities when supported, and encouraging the use of a chroot. This is achieved by leaving global.uid set to -1 by default, allowing us to detect if it was explicitly set or not.	2025-09-05 08:51:07 +02:00
Aurelien DARRAGON	c97ced3f93	BUG/MINOR: log: fix potential memory leak upon error in add_to_logformat_list() As reported on GH #3099, upon memory error add_to_logformat_list() will return an error but it fails to properly free memory which was allocated within the function, which could result in a memory leak. Let's free all relevant variables allocated by the function before returning. No backport needed unless 22ac1f5ee ("BUG/MINOR: log: Add OOM checks for calloc() and malloc() in logformat parser and dup_logger()") is.	2025-09-04 23:07:22 +02:00
Frederic Lecaille	842f32f3f1	BUG/MEDIUM: quic-be: too early SSL_SESSION initialization When an SNI is set on a QUIC server line, ssl_sock_set_servername() is called from connect_server() (backend.c). This leads some BUG_ON() to be triggered because the CO_FL_WAIT_L6_CONN \| CO_FL_SSL_WAIT_HS were not set. This must be done into the ->init() xprt callback. This patch move the flags settings from ->start() to ->init() callback. Indeed, connect_server() calls these functions in this order: ->init(), ssl_sock_set_servername() # => crash if CO_FL_WAIT_L6_CONN \| CO_FL_SSL_WAIT_HS not set ->start() Furthermore ssl_sock_set_servername() has a side effect to reset the SSL_SESSION object (attached to SSL object) calling SSL_set_session(), leading to crashes as follows: [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". Core was generated by `./haproxy -f quic_srv.cfg'. Program terminated with signal SIGSEGV, Segmentation fault. 
#0 tls_process_server_hello (s=0x560c259733b0, pkt=0x7fffac239f20) at ssl/statem/statem_clnt.c:1624 1624 if (s->session->session_id_length > 0) { [Current thread is 1 (Thread 0x7fc364e53dc0 (LWP 35514))] (gdb) bt #0 tls_process_server_hello (s=0x560c259733b0, pkt=0x7fffac239f20) at ssl/statem/statem_clnt.c:1624 #1 0x00007fc36540fba4 in ossl_statem_client_process_message (s=0x560c259733b0, pkt=0x7fffac239f20) at ssl/statem/statem_clnt.c:1042 #2 0x00007fc36540d028 in read_state_machine (s=0x560c259733b0) at ssl/statem/statem.c:646 #3 0x00007fc36540ca70 in state_machine (s=0x560c259733b0, server=0) at ssl/statem/statem.c:439 #4 0x00007fc36540c576 in ossl_statem_connect (s=0x560c259733b0) at ssl/statem/statem.c:250 #5 0x00007fc3653f1698 in SSL_do_handshake (s=0x560c259733b0) at ssl/ssl_lib.c:3835 #6 0x0000560c22620327 in qc_ssl_do_hanshake (qc=qc@entry=0x560c25961f60, ctx=ctx@entry=0x560c25963020) at src/quic_ssl.c:863 #7 0x0000560c226210be in qc_ssl_provide_quic_data (len=90, data=<optimized out>, ctx=0x560c25963020, level=ssl_encryption_initial, ncbuf=0x560c2588bb18) at src/quic_ssl.c:1071 #8 qc_ssl_provide_all_quic_data (qc=qc@entry=0x560c25961f60, ctx=0x560c25963020) at src/quic_ssl.c:1123 #9 0x0000560c2260ca5f in quic_conn_io_cb (t=0x560c25962f80, context=0x560c25961f60, state=<optimized out>) at src/quic_conn.c:791 #10 0x0000560c228255ed in run_tasks_from_lists (budgets=<optimized out>) at src/task.c:648 #11 0x0000560c22825f7a in process_runnable_tasks () at src/task.c:889 #12 0x0000560c22793dc7 in run_poll_loop () at src/haproxy.c:2836 #13 0x0000560c22794481 in run_thread_poll_loop (data=<optimized out>) at src/haproxy.c:3056 #14 0x0000560c2259082d in main (argc=<optimized out>, argv=<optimized out>) at src/haproxy.c:3667 <s> is the SSL object, and <s->session> is the SSL_SESSION object. For the client, this is the first call do SSL_do_handshake() which initializes this SSL_SESSION object from ->init() xpt callback. 
Then it is reset by ssl_sock_set_servername(), then tls_process_server_hello() TLS stack is called with NULL value for s->session when receiving the ServerHello TLS message. To fix this, simply move the first call to SSL_do_handshake to ->start xprt call back (qc_xprt_start()). No need to backport.	2025-09-04 20:49:06 +02:00
Amaury Denoyelle	687df405fe	BUG/MINOR: connection: streamline conn detach from lists Over their lifetime, connections are attached to different lists. These lists depend on whether connection is on frontend or backend side. Attach point members are stored via a union in struct connection. The next commit reorganizes them so that a proper frontend/backend separation is performed : commit a96f1286a75246fef6db3e615fabdef1de927d83 BUG/MINOR: connection: rearrange union list members On conn_free(), connection instance must be removed from these lists to ensure there is no use-after-free case. However code was still shaky there, despite no real issue. Indeed, <toremove_list> was detached for all connections, despite being used on backend side only. This patch streamlines the freeing of connection. Now, <toremove_list> detach is performed in conn_backend_deinit(). Moreover, a new helper conn_frontend_deinit() is defined. It ensures that <stopping_list> detach is done. Prior it was performed individually by muxes. Note that a similar procedure is performed when the connection is reversed. Hence, conn_frontend_deinit() is now used here as well, rendering reversal from FE to BE or vice versa symmetrical. As mentioned above, no crash occurred prior to this patch, but the code was fragile, in particular access to <toremove_list> for frontend connections. Thus this patch is considered as a bug fix worthy of a backport along with above mentioned patch, currently up to 3.0.	2025-09-04 18:31:20 +02:00
Amaury Denoyelle	27ff7ff296	BUG/MEDIUM: conn: fix UAF on connection after reversal on edge When a connection is reversed, some elements must be reset prior to reusing it. Most notably, connection must be removed from lists specific to frontend/backend sides. When reverse was performed for frontend to backend side, connection was not removed via its <stopping_list> attach point. On previous releases, this did not cause any issue. However, crashes started to occur recently, probably due to the recent reorganization of connection list attach points from the following patch. commit a96f1286a75246fef6db3e615fabdef1de927d83 BUG/MINOR: connection: rearrange union list members To fix this, simply ensure that <stopping_list> detach is performed via conn_reverse(). This patch must be backported up to 3.0 release.	2025-09-04 18:13:35 +02:00
Willy Tarreau	93cc18ac42	MAJOR: backend: switch the default balancing algo to "random" For many years, an unset load balancing algorithm would use "roundrobin". It was shown several times that "random" with at least 2 draws (the default) generally provides better performance and fairness in that it will automatically adapt to the server's load and capacity. This was further described with numbers in this discussion: https://www.mail-archive.com/haproxy@formilux.org/msg46011.html https://github.com/orgs/haproxy/discussions/3042 BTW there was no objection and only support for the change. The goal of this patch is to change the default algo when none is specified, from "roundrobin" to "random". This way, users who don't care and don't set the load balancing algorithm will benefit from a better one in most cases, while those who have good reasons to prefer roundrobin (for session affinity or for reproducible sequences like used in regtests) can continue to specify it. The vast majority of users should not notice a difference.	2025-09-04 08:30:35 +02:00
Willy Tarreau	60931ceae9	REGTESTS: explicitly use "balance roundrobin" where RR is needed A few tests explicitly rely on the server ordering granted by "balance roundrobin", but didn't specify the balance algorithm. As it will change soon, let's explicit it.	2025-09-04 08:18:53 +02:00
Amaury Denoyelle	9410b2ab97	BUG/MINOR: check: fix dst address when reusing a connection The keyword check-reuse-pool allows reusing an idle connection to perform a health check instead of opening a new one. It is implemented similarly to HTTP transfer reuse : a hash is calculated with a subset of properties to lookup a connection with the same characteristics. One of these properties is the destination address. Initially it was always set to NULL prior to reuse check, as this is necessary to match connections on a reverse-HTTP server. However, this prevents reuse on other servers with a proper address configured. Indeed, in this case destination address is always used as key for connections inserted in idle pool. This patch fixes this by properly setting destination address for check reuse. By default, it reuses the address from the server. The only exception is if the server is using reverse-HTTP, in which case address remains NULL. A new test is also performed prior to trying check reuse to ensure this is not performed on a transparent server. Indeed, in this case server address would be unset. Anyway, check cannot reuse a connection in this case so this is OK. Note that this does not prevent continuing the check with a new connection with a NULL address : this should be handled more properly in another patch. This must be backported up to 3.2.	2025-09-03 16:58:14 +02:00
Amaury Denoyelle	6d3c3c7871	BUG/MINOR: check: ensure check-reuse is compatible with SSL SSL may be activated implicitly if a server relies on SSL, even without check-ssl keyword. This is performed by init_srv_check() function. The main operation is to change xprt layer for check to SSL. Prior to this patch, <use_ssl> check member was also set, despite not being strictly necessary. This has a negative side-effect of rendering check-reuse-pool ineffective. Indeed, reuse on check is only performed if no specific check configuration has been specified (see tcpcheck_use_nondefault_connect()). This patch fixes check reuse with SSL : <use_ssl> is not set in case SSL is inherited implicitly from server configuration. Thus, <use_ssl> is now only set if an explicit check-ssl keyword is set, which disables connection reuse for check. This must be backported up to 3.2.	2025-09-03 16:54:48 +02:00
Aurelien DARRAGON	f32bc8f0a4	MEDIUM: stats-file: add some BUG_ON() guards to ensure exported structs are not changed by accident Add two BUG_ON() in shm_stats_file_prepare() which will trigger if exported structures (shm_stats_file_hdr and shm_stats_file_object) change in size, because it means that they will become incompatible with older versions and thus precautions should be taken by the developer to ensure compatibility with older versions, or at least detect incompatible versions by changing the version number to prevent bugs resulting from inconsistent mapping between versions. The BUG_ON() may be safely adjusted then. Please note that it doesn't protect against accidental struct member re-ordering if the resulting struct size is equal.	2025-09-03 16:29:55 +02:00
Aurelien DARRAGON	1a1362ea0b	MINOR: stats-file: reserve some bytes in exported structs We may need additional struct members in shm_stats_file_object and shm_stats_file_hdr, yet since these structs are exported they should not change in size nor ordering else it would require a version change to break compatibility on purpose since mapping would differ. Here we reserve 64 additional bytes in shm_stats_file_object, and 128 bytes in shm_stats_file_hdr for future usage.	2025-09-03 16:29:48 +02:00
Aurelien DARRAGON	21d97ccfae	BUILD: stats-file: fix alignment issues Document some byte holes and fix some potential alignment issues between 32 and 64 bits architectures to ensure the shm_stats_file memory mapping is consistent between operating systems.	2025-09-03 16:28:46 +02:00
Aurelien DARRAGON	46a5948ed2	MINOR: compiler: add ALWAYS_PAD() macro Same as THREAD_PAD() but doesn't depend on haproxy being compiled with thread support. It may be useful for memory (or files) that may be shared between multiple processes.	2025-09-03 16:28:46 +02:00
Aurelien DARRAGON	cf2562cddf	OPTIM: stats-file: don't unnecessarily die hard on shm_stats_file_reuse_object() shm_stats_file_reuse_object() has a non negligible cost, especially if the shm file contains a lot of objects because the function scans the whole shm file to find available slots. During startup, if no existing objects could be mapped in the shm file, shm_stats_file_add_object() is called for each object (server, fe, be or listener) with a GUID set. On large config it means shm_stats_file_add_object() could be called a lot of times in a row. With current implementation, each shm_stats_file_add_object() call leverages shm_stats_file_reuse_object(), so the more objects are defined in the config, the slower the startup will be. To try to optimize startup time a bit with large configs, we don't systematically call shm_stats_file_reuse_object(), especially when we know that the previous attempt to reuse objects failed. In this case we add a small tempo between failed attempts to reuse objects because we assume the new attempt will probably fail anyway. (For slots to become available, either an old process has to clean its entries, or they have to time out which implies that the clock needs to be updated)	2025-09-03 16:28:41 +02:00
Aurelien DARRAGON	16abfb6e06	DOC: config: document "shm-stats-file" directive Add some documentation for "shm-stats-file" and "shm-stats-file-max-objects" experimental directives related to the use of shared memory for storing stats counters (see previous commits for implementation details)	2025-09-03 15:59:42 +02:00
Aurelien DARRAGON	585ece4c92	MEDIUM: stats-file/counters: store and preload stats counters as shm file objects This is the last patch of the shm stats file series, in this patch we implement the logic to store and fetch shm stats objects and associate them to existing shared counters on the current process. Shm objects are stored in the same memory location as the shm stats file header. In fact they are stored right after it. All objects (struct shm_stats_file_object) have the same size (no matter their type), which allows for easy object traversal without having to check the object's type, and could permit the use of external tools to scan the SHM in the future. Each object stores a guid (of GUID_MAX_LEN+1 size) and tgid which allows to match corresponding shared counters indexes. Also, as stated before, each object stores the list of users making use of it. Objects are never released (the map can only grow), but when an object is unused (no more users or active users are found in objects->users), it is automatically recycled. Also, each object stores its type which defines how the object generic data member should be handled. Upon startup (or reload), haproxy first tries to scan existing shm to find objects that could be associated to frontends, backends, listeners or servers in the current config based on GUID. For associations that couldn't be made, haproxy will automatically create missing objects in the SHM during late startup. When haproxy matches with an existing object, it means the counter from an older process is preserved in the new process, so multiple processes temporarily share the same counter for as long as required for older processes to eventually exit.	2025-09-03 15:59:37 +02:00
Aurelien DARRAGON	ee17d20245	MINOR: stats-file: add process slot management for shm stats file Now that all processes tied to the same shm stats file share a common clock source, we introduce the process slot notion in this patch. Each living process registers itself in a map at a free index: each slot stores information about the process' PID and heartbeat. Each process is responsible for updating its heartbeat, a slot is considered as "free" if the heartbeat was never set or if the heartbeat is expired (60 seconds of inactivity). The total number of slots is set to 64, this is on purpose because it allows to easily store the "users" of a given shm object using a 64 bits bitmask. Given that when haproxy is reloaded older processes are supposed to die eventually, it should be large enough (64 simultaneous processes) to be safe. If we manage to reach this limit someday, more slots could be added by splitting "users" bitmask on multiple 64-bit variables.	2025-09-03 15:59:33 +02:00
Aurelien DARRAGON	443e657fd6	MEDIUM: stats-file: processes share the same clock source from shm-stats-file The use of the "shm-stats-file" directive now implies that all processes using the same file share a common clock source, this is required for consistency regarding time-related operations. The clock source is stored in the shm stats file header. When the directive is set, all processes share the same clock (global_now_ms and global_now_ns both point to variables in the map), this is required for time-based counters such as freq counters to work consistently. Since all processes manipulate global clock with atomic operations exclusively during runtime, and don't systematically rely on it (thanks to local now_ms and now_ns), it is pretty much transparent.	2025-09-03 15:59:27 +02:00
Aurelien DARRAGON	c91d93ed1c	MINOR: stats-file: introduce shm-stats-file directive Add initial support for the "shm-stats-file" directive and associated "shm-stats-file-max-objects" directive. For now they are flagged as experimental directives. The shared memory file is automatically created by the first process. The file is created using open() so it is up to the user to provide relevant path (either on regular filesystem or ramfs for performance reasons). The directive takes only one argument which is path of the shared memory file. It is passed as-is to open(). The maximum number of objects per thread-group (hard limit) that can be stored in the shm is defined by "shm-stats-file-max-objects" directive and defaults to 2k, which means approximately 1mb max per thread group and should cover most setups. Upon initial creation, the main shm stats file header is provisioned with the version which must remain the same to be compatible between processes. When the limit is reached (during startup) an error is reported by haproxy which invites the user to increase the "shm-stats-file-max-objects" if desired, but this means more memory will be allocated. Actual memory usage is low at start, because only the mmap (mapping) is provisioned with the maximum number of objects to avoid relocating the memory area during runtime, but the actual shared memory file is dynamically resized when objects are added (resized by following half power of 2 curve when new objects are added, see upcoming commits) For now only the file is created, further logic will be implemented in upcoming commits.	2025-09-03 15:59:22 +02:00
Aurelien DARRAGON	cb08bcb9d6	MINOR: counters: retrieve detailed errmsg upon failure with counters_{fe,be}_shared_prepare() counters_{fe,be}_shared_prepare now take an extra <errmsg> parameter that contains additional hints about the error in case of failure. It must be freed accordingly since it is allocated using memprintf	2025-09-03 15:59:17 +02:00
Willy Tarreau	46463d6850	OPTIM: stick-tables: exit expiry faster when the update lock is held It helps keep the contention level low: when we hold the update lock that we know other parts may be relying on (peers, track-sc etc), we decrease the remaining visit counters 4 times as fast to further reduce the contention. At this point no more warnings are seen during intense synchronization (2x64 cores, 1.5M req/s with a track-sc each, 5M entries in use).	2025-09-03 15:51:13 +02:00
Willy Tarreau	696793205b	MINOR: stick-tables: limit the number of visited nodes during expiration As reported by Felipe in GH issue #3084, on large systems it's not sufficient to leave the expiration process after a certain number of expired entries, because if they accumulate too fast, it's possible to still spend some time visiting many (e.g. those still in use), which takes time. Thus here we're taking a stricter approach consisting in counting the number of visited entries, which allows to leave early if we can't do the expected work in a reasonable amount of time. In order to avoid always stopping on first shards and never visiting last ones, we're always starting from a random shard number and looping from that one. This way even if we always leave early, all shards will be handled equally. This should be backported to 3.2.	2025-09-03 15:51:13 +02:00
Willy Tarreau	2421c3769a	BUG/MEDIUM: peers: don't fail twice to grab the update lock When the expire task is running fast (i.e. running almost alone), it's super hard to grab the update lock and peers can easily trigger the watchdog because the time it takes to grab this lock is multiplied by the number of updates to perform. This is easier to trigger at the end of an injection session where the expire task is omni-present. Let's just record that we failed once and don't fail a second time in the loop. This should be backported to 3.2, but probably not further given that this area changed significantly in 3.2.	2025-09-03 15:51:13 +02:00
Willy Tarreau	324f0a60ab	BUG/MINOR: stick-tables: never leave used entries without expiration When trying to kill/expire entries, if a ref-counted entry is found, let's requeue it with its expiration timer instead of leaving it out, because other ref-counters (e.g. peers) will not purge it otherwise, leaving it orphaned. This one seems trickier to trigger, though it seems to happen sometimes when peers are late and a long resync is active and competing with intense calls to process_table_expire() (i.e. when no other activity is there). This must be backported to 3.2. It's likely that older versions are affected as well, but possibly differently since the expiration mechanism changed between 3.1 and 3.2, so better not take unneeded risks there.	2025-09-03 15:51:13 +02:00
Willy Tarreau	8da6ed6b6a	BUG/MEDIUM: stick-tables: don't leave the expire loop with elements deleted In 3.2, the table expiration latency was improved by commit 994cc58576 ("MEDIUM: stick-tables: Limit the number of entries we expire"), however it introduced an issue by which it's possible to leave the loop after a certain number of elements were expired, without requeuing the deleted elements. The issue it causes is that other places with a non-null ref_cnt will not necessarily delete it themselves, resulting in orphan elements in the table. These ones will then pollute it and force recycling old ones more often which in turn results in an increase of the contention. Let's check for the expiration counter before deleting the element so that it can be found upon next visit. This fix must be backported to 3.2. It is directly related to GH issue #3084. Thanks to Felipe and Ricardo for sharing precious info and testing a candidate fix.	2025-09-03 15:51:13 +02:00
William Lallemand	554a15562f	MEDIUM: cfgparse: warn when using user/group when built statically In issue #3013, a user observed a crash at startup of haproxy when building statically and using the "user" global section. This is a known problem of the glibc and the linker even warns about this: > warning: Using 'getgrnam' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking > warning: Using 'getpwnam' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking Let's emit a warning when using user/group in this case.	2025-09-03 14:45:00 +02:00
Ilia Shipitsin	3354719709	CI: fix syntax of Quic Interop pipelines Previously, wrong syntax of passing build arguments was used, thus images were built using default SSLLIB=QuicTLS-1.1.1	2025-09-03 11:36:14 +02:00
Frederic Lecaille	58b153b882	MINOR: quic: Add more information about RX packets This patch is very useful to debug issues at RX packet processing level. Should be easily backported as far as 2.6 (for debug purposes).	2025-09-03 09:41:38 +02:00
Willy Tarreau	4902195313	BUILD: acl: silence a possible null deref warning in parse_acl_expr() The fix in commit 441cd614f9 ("BUG/MINOR: acl: set arg_list->kw to aclkw->kw string literal if aclkw is found") involves an unchecked access to "al" after that one is tested for possibly being NULL. This rightfully upsets Coverity (GH #3095) and might also trigger warnings depending on the compilers. However, no known caller to date passes a NULL arg list here so there's no way to trigger this theoretical bug. This should be backported along with the fix above to avoid emitting warnings, possibly as far as 2.6 since that fix was tagged as such.	2025-09-02 17:41:51 +02:00
Willy Tarreau	c128887b8e	BUG/MINOR: haproxy: be sure not to quit too early on soft stop The fix in 4a9e3e102e ("BUG/MINOR: haproxy: only tid 0 must not sleep if got signal") had the nasty side effect of breaking the graceful reload operations: threads whose id is non-zero could quit too early and not process incoming traffic, which is visible with broken connections during reloads. They just need to ignore the stopping condition until the signal queue is empty. In any case, it's the thread in charge of the signal queue which will notify them once it receives the signal. It was verified that connections are no longer broken with this fix, and that the issue that required it (#2537, looping threads on reload) does not re-appear with the reproducer, while it still did without the fix above. Since the fix above was backported to every stable version, this one will also have to.	2025-09-02 11:33:14 +02:00
William Lallemand	ce57f11991	DOC: configuration: rework the jwt_verify keyword documentation Split the documentation in multiple sections: - Explanation about what it does and how - <alg> parameter with array of parameters - <key> parameter with details about certificates and public keys - Return value Other changes: - certificates do not need to be known during configuration parsing - differences between public key and certificate	2025-09-02 11:16:42 +02:00
Amaury Denoyelle	36d28bfca3	MEDIUM: quic: strengthen BUG_ON() for unpad Initial packet on client To avoid anti-amplification limit, it is required that Initial packets are padded to be at least 1200 bytes long. On server side, this only applies to ack-eliciting packets. However, for client side, this is mandatory for every packet. This patch adjusts qc_txb_store() BUG_ON statement used to catch too small Initial packets. On QUIC client side, ack-eliciting flag is now ignored, thus every packet is checked. This is labelled as MEDIUM as this BUG_ON() is known to be easily triggered, as QUIC datagrams encoding functions are complex. However, it's important that a QUIC endpoint respects it, else the peer will drop the invalid packet and could immediately close the connection.	2025-09-02 10:41:49 +02:00
Amaury Denoyelle	209a54d539	BUG/MINOR: quic: pad Initial pkt with CONNECTION_CLOSE on client Currently, when connection is closing, only CONNECTION_CLOSE frame is emitted via qc_prep_pkts()/qc_do_build_pkt(). Also, only the first registered encryption level is considered while the others are dismissed. This results in a single packet datagram. This can cause issues for QUIC client support, as padding is required for every Initial packet, contrary to server side where only ack-eliciting packets are eligible. Thus a client must add padding to a CONNECTION_CLOSE frame on Initial level. This patch adjusts qc_prep_pkts() to ensure such packet will be correctly padded on client side. It sets <final_packet> variable which instructs that if padding is necessary it must be applied immediately on the current encryption level instead of the last one. It could appear as unnecessary to pad a CONNECTION_CLOSE packet, as the peer will enter the draining state when processing it. However, RFC mandates that a client Initial packet too small must be dropped by the server, so there is a risk that the CONNECTION_CLOSE is simply discarded prior to its processing if stored in a too small datagram. No need to backport as this is a QUIC backend issue only.	2025-09-02 10:34:12 +02:00
Amaury Denoyelle	e9b78e3fb1	BUG/MINOR: quic: fix padding issue on INITIAL retransmit On loss detection timer expiration, qc_dgrams_retransmit() is used to reemit lost packets. Different code paths are present depending on the active encryption level. If Initial level is still initialized, retransmit is performed both for Initial and Handshake spaces, by first retrieving the list of lost frames for each of them. Prior to this patch, Handshake level was always registered for emission after Initial, even if it did not have any frame to reemit. In this case, most of the time it would result in a datagram containing Initial reemitted frames packet coalesced with a Handshake packet consisting only of a PADDING frame. This is because padding is only added for the last registered QEL. For QUIC backend support, this may cause issues. This is because contrary to QUIC server side, Initial and Handshake levels keys are not derived simultaneously for a QUIC client. Thus, if the latter keys are unavailable, Handshake packet cannot be encoded in sending, leaving a single Initial packet. However, this is now too late to add PADDING. Thus the resulting datagram is invalid : this triggers the BUG_ON() assert failure located on qc_txb_store(). This patch fixes this by amending qc_dgrams_retransmit(). Now, Handshake level is only registered for emission if there are frames to retransmit, which implies that Handshake keys are already available. Thus, PADDING will now either be added at Initial or Handshake level as expected. Note that this issue should not be present on QUIC frontend, due to Initial and Handshake keys derivation almost simultaneously. However, this should still be backported up to 3.0.	2025-09-02 10:31:32 +02:00
Amaury Denoyelle	34d5bfd23c	BUG/MINOR: quic: fix room check if padding requested qc_prep_pkts() activates padding when building an Initial packet. This ensures that resulting datagram will always be at least 1200 bytes, which is mandatory to prevent deadlock over anti-amplification. Prior to padding activation, a check is performed to ensure that output buffer is big enough for a padded datagram. However, this did not take into account previously built packets which would be coalesced in the same datagram. Thus this patch fixes this comparison check. In theory, prior to this patch, in some cases Initial packets could not be built despite a datagram of the proper size. Currently, this probably never happens as Initial packet is always the first encoded in a datagram, thus there is no coalesced packet prior to it. However, there is no hard requirement on this, so it's better to reflect this in the code. This should be backported up to 2.6.	2025-09-02 10:29:11 +02:00
Amaury Denoyelle	a84b404b34	MINOR: quic/flags: complete missing flags Add missing quic_conn flags definition for dev utility.	2025-09-02 09:37:43 +02:00
Frederic Lecaille	fba80c7fe8	BUG/MINOR: quic: ignore AGAIN ncbuf err when parsing CRYPTO frames This fix follows this previous one: BUG/MINOR: quic: reorder fragmented RX CRYPTO frames by their offsets which is not sufficient when a client fragments and mixes its CRYPTO frames AND leaves holes by packet. ngtcp2 (and perhaps chrome) splits their CRYPTO frames but without hole by packet. In such a case, the CRYPTO parsing leads to QUIC_RX_RET_FRM_AGAIN errors which cannot be fixed when the peer resends its packets. Indeed, even if the peer resends its frames in a different order, this does not help because since the previous commit, the CRYPTO frames are ordered on haproxy side. This issue was detected thanks to the interop tests with quic-go as client. This client fragments its CRYPTO frames, mixes them, and generates holes, most of the time with the retry test. To fix this, when a QUIC_RX_RET_FRM_AGAIN error is encountered, the CRYPTO frames parsing is not stopped. This leaves chances to the next CRYPTO frames to be parsed. Must be backported as far as 2.6 as the commit mentioned above.	2025-09-02 08:13:58 +02:00
Alexander Stephan	26776c7b8f	BUG/MINOR: tools: Add OOM check for malloc() in indent_msg() This patch adds a missing out-of-memory (OOM) check after the call to `malloc()` in `indent_msg()`. If memory allocation fails, the function returns NULL to prevent undefined behavior. Co-authored-by: Christian Norbert Menges <christian.norbert.menges@sap.com>	2025-09-02 07:29:54 +02:00
Alexander Stephan	aa20905ac9	BUG/MINOR: compression: Add OOM check for calloc() in parse_compression_options() This patch adds a missing out-of-memory (OOM) check after the call to `calloc()` in `parse_compression_options()`. If memory allocation fails, an error message is set, the function returns -1, and parsing is aborted to ensure safe handling of low-memory conditions. Co-authored-by: Christian Norbert Menges <christian.norbert.menges@sap.com>	2025-09-02 07:29:54 +02:00
Alexander Stephan	73f9a75894	BUG/MINOR: cfgparse: Add OOM check for calloc() in cfg_parse_listen() This commit adds a missing out-of-memory (OOM) check after the call to `calloc()` in `cfg_parse_listen()`. If memory allocation fails, an alert is logged, error codes are set, and parsing is aborted to prevent undefined behavior. Co-authored-by: Christian Norbert Menges <christian.norbert.menges@sap.com>	2025-09-02 07:29:54 +02:00
Alexander Stephan	c3e69cf065	BUG/MINOR: acl: Add OOM check for calloc() in smp_fetch_acl_parse() This patch adds a missing out-of-memory (OOM) check after the call to `calloc()` in `smp_fetch_acl_parse()`. If memory allocation fails, an error message is set and the function returns 0, improving robustness in low-memory situations. Co-authored-by: Christian Norbert Menges <christian.norbert.menges@sap.com>	2025-09-02 07:29:54 +02:00
Alexander Stephan	22ac1f5ee9	BUG/MINOR: log: Add OOM checks for calloc() and malloc() in logformat parser and dup_logger() This patch adds missing out-of-memory (OOM) checks after calls to `calloc()` and `malloc()` in the logformat parser and the `dup_logger()` function. If memory allocation fails, an error is reported or NULL is returned, preventing undefined behavior in low-memory conditions. Co-authored-by: Christian Norbert Menges <christian.norbert.menges@sap.com>	2025-09-02 07:29:54 +02:00
Alexander Stephan	fbd0fb20a2	BUG/MINOR: halog: Add OOM checks for calloc() in filter_count_srv_status() and filter_count_url() This patch adds missing out-of-memory (OOM) checks after calls to calloc() in the functions `filter_count_srv_status()` and `filter_count_url()`. If memory allocation fails, an error message is printed to stderr and the process exits with status 1. This improves robustness and prevents undefined behavior in low-memory situations. Co-authored-by: Christian Norbert Menges <christian.norbert.menges@sap.com>	2025-09-02 07:29:54 +02:00
Christopher Faulet	8c555a4a4e	BUG/MINOR: acl: Properly detect overwritten matching method A bug was introduced by the commit 6ea50ba46 ("MINOR: acl; Warn when matching method based on a suffix is overwritten"). The test on the match function, when defined was not correct. It is now fixed. No backport needed, except if the commit above is backported.	2025-09-01 21:36:25 +02:00
Christopher Faulet	f8b7299ee7	BUG/MINOR: server: Duplicate healthcheck's sni inherited from default server It is not really an issue, but the "check-sni" value inherited from a default server is not duplicated while the parameter value is duplicated during the parsing. So here there is a small leak if several "check-sni" parameters are used on the same server line. The previous value is never released. But to fix this issue, the value inherited from the default server must also be duplicated. At the end it is safer this way and consistent with the parsing of the "sni" parameter. It is harmless so there is no reason to backport this patch.	2025-09-01 15:45:05 +02:00
Christopher Faulet	f7a04b428a	BUG/MEDIUM: server: Duplicate healthcheck's alpn inherited from default server When "check-alpn" parameter is inherited from the default server, the value is not duplicated, the pointer of the default server is used. However, when this parameter is overridden, the old value is released. So the "check-alpn" value of the default server is released. So it is possible to have a UAF if another server inherits from the same default server. To fix the issue, the "check-alpn" parameter must be handled the same way the "alpn" is. The default value is duplicated. So it could be safely released if it is forced on the server line. This patch should fix the issue #3096. It must be backported to all stable versions.	2025-09-01 15:45:05 +02:00
Christopher Faulet	6ea50ba462	MINOR: acl; Warn when matching method based on a suffix is overwritten From time to time, issues are reported about string matching based on suffix (for instance path_beg). Each time, it appears these ACLs are used in conjunction with a converter or followed by an explicit matching method (-m). Unfortunately, it is not an issue but an expected behavior, while it is not obvious. Matching suffixes can be considered as aliases on the corresponding '-m' matching method. Thus "path_beg" is equivalent to "path -m beg". When a converter is used the original matching (string) is used and the suffix is lost. When followed by an explicit matching method, it overwrites the matching method based on the suffix. It is expected but confusing. Thus now a warning is emitted because it is a configuration issue for sure. Following sample fetch functions are concerned: * base * path * req.cook * req.hdr * res.hdr * url * urlp The configuration manual was modified to make it less ambiguous.	2025-09-01 15:45:05 +02:00
Christopher Faulet	c51ddd5c38	MINOR: acl: Only allow one '-m' matching method Several '-m' explicit matching methods were allowed, but only the last one was really used. There is no reason to specify several matching methods and it is most probably an error or a lack of understanding of how matchings are performed. So now, an error is triggered during the configuration parsing to avoid any bad usage.	2025-09-01 15:45:05 +02:00
Christopher Faulet	d09d7676d0	REG-TESTS: map_redirect: Don't use hdr_dom in ACLs with "-m end" matching method hdr_dom() is an alias of "hdr() -m dom". So using it with another explicit matching method does not work because the matching on the domain will never be performed. Only the last matching method is used. The script was working by chance because no port was set on host header values. The script was fixed by using "host_only" converter. In addition, host header values were changed to have a port now.	2025-09-01 15:45:05 +02:00
Amaury Denoyelle	1868ca9a95	MINOR: conn/muxes/ssl: add ASSUME_NONNULL() prior to _srv_add_idle When manipulating idle backend connections for input/output processing, special care is taken to ensure the connection cannot be accessed by another thread, for example via a takeover. When processing is over, connection is reinserted in its original list. A connection can either be attached to a session (private ones) or a server idle tree. In the latter case, <srv> is guaranteed to be non null prior to _srv_add_idle() thanks to CO_FL_LIST_MASK comparison with conn flags. This patch adds an ASSUME_NONNULL() to better reflect this. This should fix coverity reports from github issue #3095.	2025-09-01 15:35:22 +02:00
Amaury Denoyelle	dcf2261612	BUG/MAJOR: mux-quic: fix crash on reload during emission MUX QUIC restricts buffer allocation per connection based on the underlying congestion window. If a QCS instance cannot allocate a new buffer, it is put in a buf_wait list. Typically, this will cause stream upper layer to subscribe for sending. A BUG_ON() was present on snd_buf and nego_ff callback prologue to ensure that these functions were not called if QCS is already in buf_wait list. The objective was to guarantee that there is no wake up on a stream if it cannot allocate a buffer. However, this BUG_ON() is not correct, as it can be fired legitimately. Indeed, stream layer can retry emission even if no wake up occurred. This case can happen on reload. Thus, BUG_ON() will cause an unexpected crash. Fix this by removing these BUG_ON(). Instead, snd_buf/nego_ff callbacks ensure that QCS is not subscribed in buf_wait list. If this is the case, a null value will be returned, which is sufficient for the stream layer to pause emission and subscribe if necessary. Occurrences for this crash have been reported on the mailing list. It is also the subject of github issue #3080, which should be fixed with this patch. This must be backported up to 3.0.	2025-09-01 15:35:22 +02:00
Frederic Lecaille	800ba73a9c	BUG/MEDIUM: quic: CRYPTO frame freeing without eb_delete() Since this commit: BUG/MINOR: quic: reorder fragmented RX CRYPTO frames by their offsets when they are parsed, the CRYPTO frames are ordered by their offsets into an ebtree. Then their data are provided to the ncbufs. But in case of error, when qc_handle_crypto_frm() returns QUIC_RX_RET_FRM_FATAL or QUIC_RX_RET_FRM_AGAIN, they remain attached to their tree. Then from <err> label, they are detached and freed (with a while(node) { eb_delete(); qc_frm_free();} loop). But before this loop, these statements directly free the frame without deleting it from its tree, if this is a CRYPTO frame, leading to a use after free when running the loop: if (frm) qc_frm_free(qc, &frm); This issue was detected by the interop tests, with quic-go as client. Weirdly, this client sends CRYPTO frames by packet with holes. Must be backported as far as 2.6 as the commit mentioned above.	2025-09-01 10:39:00 +02:00
Frederic Lecaille	90126ec9b7	CLEANUP: quic: remove a useless CRYPTO frame variable assignment This modification should have arrived with this commit: MINOR: quic: remove ->offset qf_crypto struct field Since this commit, the CRYPTO offset node key assignment is done at parsing time when calling qc_parse_frm() from qc_parse_pkt_frms(). This useless assignment has been reported in GH #3095 by coverity. This patch should be easily backported as far as 2.6 as the one mentioned above to ease any further backport to come.	2025-09-01 09:31:04 +02:00
Collison, Steven	00be358426	DOC: proxy-protocol: Make example for PP2_SUBTYPE_SSL_SIG_ALG accurate The docs call out that this field is the algorithm used to sign the certificate. However, the example only had the hash portion of the signature algorithm. This change updates the example to be accurate based on a value written by HAProxy, which is based on an OID for signature algorithms. I based the example on a real TLV written by HAProxy on my machine with all SSL TLVs enabled in config.	2025-08-29 16:26:57 +02:00
Amaury Denoyelle	1517869145	BUG/BUILD: stats: fix build due to missing stat enum definition Recently, a new server counter for private idle connections was added to the statistics output. However, the patch was missing the ST_I_PX_PRIV_IDLE_CUR enum definition. No need to backport.	2025-08-29 09:32:10 +02:00
Christopher Faulet	8f3b537547	MEDIUM: proxy: Reject some header names for 'http-send-name-header' directive From time to time, we saw the 'http-send-name-header' directive used to overwrite the Host header to work around limitations of a buggy application. Most of the time, this led to trouble. This was never officially supported and each time we strongly discouraged anyone from doing so. We already thought of deprecating this directive, but it seems to still be used by a few people. So for now, we decided to strengthen the tests performed on it. The header name is now checked during the configuration parsing to forbid some risky names. 'Host', 'Content-Length', 'Transfer-Encoding' and 'Connection' header names are now rejected. But more headers could be added in the future.	2025-08-29 09:27:01 +02:00
Amaury Denoyelle	2afcba1eb7	MINOR: proxy: extend "show servers conn" output CLI command "show servers conn" is used as a debugging tool to monitor the number of connections per server. This patch extends its output by adding the content of two server counters. <served> is the first added column. It represents the number of active streams on a server. <curr_sess_idle_conns> is the second added column. This is a recently added value which accounts for private idle connections referencing a server.	2025-08-28 18:58:11 +02:00
Amaury Denoyelle	fac1de935a	MINOR: stats: display new curr_sess_idle_conns server counter Add a new stats column in proxy stats to display server counter for private idle connections. This counter has been introduced recently. The value is displayed on CSV output on the last column before modules. It is also displayed on the HTML page alongside other idle server counters.	2025-08-28 18:58:11 +02:00
Amaury Denoyelle	fb43343f6f	MINOR: doc: add missing statistics column Complete documentation with missing description of newly added columns. This must be backported up to 2.8.	2025-08-28 18:58:11 +02:00
Amaury Denoyelle	f0710a1fbc	MINOR: doc: add missing statistics column Complete documentation with missing description of newly added columns. This should be backported up to 2.4	2025-08-28 18:58:11 +02:00
William Lallemand	e0ec01849f	DOC: configuration: confuse "strict-mode" with "zero-warning" 4b10302fd8 ("MINOR: cfgparse: implement a simple if/elif/else/endif macro block handler") introduces a confusion between "strict-mode" and "zero-warning". This patch fixes the issue by changing "strict-mode" by "zero-warning" in section 2.4. Conditional blocks. Must be backported as far as 2.4.	2025-08-28 17:35:06 +02:00
Amaury Denoyelle	21f7974e05	OPTIM: backend: set release on takeover for strict maxconn When strict maxconn is enforced on a server, it may be necessary to kill an idle connection to never exceed the limit. To be able to delete a connection from any thread, takeover is first used to migrate it on the current thread prior to its deletion. As takeover is performed to delete a connection instead of reusing it, <release> argument can be set to true. This removes unnecessary allocations of resources prior to connection deletion. As such, this patch is a small optimization for strict maxconn implementation. Note that this patch depends on the previous one which removes any assumption in takeover implementation that thread isolation is active if <release> is true.	2025-08-28 16:11:32 +02:00
Amaury Denoyelle	d971d3fed8	MINOR: muxes: adjust takeover with buf_wait interaction Takeover operation defines an argument <release>. It's a boolean which, if set, indicates that connection resources freed during the takeover do not have to be reallocated on the new thread. Typically, it is set to false when takeover is performed to reuse a connection. However, when used to be able to delete a connection from a different thread, <release> should be set to true. Previously, <release> was only set in conjunction with "del server" handler. This operation was performed under thread isolation, which guarantees that non-thread-safe operations such as removal from buf_wait list could be performed on takeover if <release> was true. In the contrary case, takeover operation would fail. Recently, "del server" handler has been adjusted to remove idle connection cleanup with takeover. As such, <release> is never set to true in remaining takeover usage. However, takeover is also used to enforce strict-maxconn on a server. This is performed to delete a connection from any thread, which is the primary reason to set <release> to true. But for the moment, as takeover implementers consider that thread isolation is active if <release> is set, this is not yet applicable for strict-maxconn usage. Thus, the purpose of this patch is to adjust takeover implementation. Remove the assumption linking <release> and thread-isolation mode. It's not possible to remove a connection from a buf_wait list; an error will be returned in any case.	2025-08-28 16:09:48 +02:00
William Lallemand	8a456399db	DOC: unreliable sockpair@ on macOS We discovered that the sockpair@ protocol is unreliable on macOS; this is the same problem that we fixed in d7f6819. But it's not possible to implement an acknowledgment once the sockets are in non-blocking mode. The problem was discovered in issue #3045. Must be backported to all stable versions.	2025-08-28 15:35:17 +02:00
William Lallemand	ffdccb6e04	BUILD: mworker: fix ignoring return value of ‘read’ Fix read return value unused result. src/haproxy.c: In function ‘main’: src/haproxy.c:3630:17: error: ignoring return value of ‘read’ declared with attribute ‘warn_unused_result’ [-Werror=unused-result] 3630 \| read(sock_pair[1], &c, 1); \| ^~~~~~~~~~~~~~~~~~~~~~~~~ Must be backported where d7f6819 is backported.	2025-08-28 15:13:01 +02:00
Amaury Denoyelle	7232677385	MAJOR: server: do not remove idle conns in del server Do not remove anymore idle and purgeable connections directly under the "del server" handler. The main objective of this patch is to reduce the amount of work performed under thread isolation. This should improve "del server" scheduling with other haproxy tasks. Another objective is to be able to properly support dynamic servers with QUIC. Indeed, takeover is not yet implemented for this protocol, hence it is not possible to rely on cleanup of idle connections performed by a single thread under "del server" handler. With this change it is not possible anymore to remove a server if there are still idle connections referencing it. To ensure this cannot be performed, srv_check_for_deletion() has been extended to check server counters for idle and idle private connections. Server deletion should still remain a viable procedure, as first it is mandatory to put the targeted server into maintenance. This step forces the cleanup of its existing idle connections. Thanks to a recent change, all finishing connections are also removed immediately instead of becoming idle. In short, this patch transforms idle connections removal from a synchronous to an asynchronous procedure. However, this should remain a steadfast and quick method achievable in less than a second. This patch is considered major as some users may notice this change when removing a server. In particular with the following CLI commands pipeline: "disable server <X>; shutdown sessions server <X>; del server <X>" Server deletion will now probably fail, as idle connections purge cannot be completed immediately. Thus, it is now highly advised to always use a small delay "wait srv-removable" before "del server" to ensure that idle connections purge is executed first.
Along with this change, documentation for "del server" and related "shutdown sessions server" has been refined, in particular to better highlight under what conditions a server can be removed.	2025-08-28 15:08:35 +02:00
Amaury Denoyelle	dbe31e3f65	MEDIUM: session: account on server idle conns attached to session This patch adds a new member <curr_sess_idle_conns> on the server. It serves as a counter of idle connections attached on a session instead of regular idle/safe trees. This is used only for private connections. The objective is to provide a method to detect if there are idle connections still referencing a server. This will be particularly useful to ensure that a server is removable. Currently, this is not yet necessary as idle connections are directly freed via "del server" handler under thread isolation. However, this procedure will be replaced by an asynchronous mechanism outside of thread isolation. Careful: connections attached to a session but not idle will not be accounted by this counter. These connections can still be detected via srv_has_streams() so "del server" will be safe. This counter is maintained during the whole lifetime of a private connection. This is mandatory to guarantee "del server" safety and conforms to other idle server counters. What this means is that decrement is performed only when the connection transitions from idle to in use, or just prior to its deletion. For the first case, this is covered by session_get_conn(). The second case is trickier. It cannot be done via session_unown_conn() as a private connection may still live a little longer after its removal from session, most notably when scheduled for idle purging. Thus, conn_free() has been adjusted to handle the final decrement. Now, conn_backend_deinit() is also called for private connections if CO_FL_SESS_IDLE flag is present. This results in a call to srv_release_conn() which is responsible for decrementing server idle counters.	2025-08-28 15:08:35 +02:00
Amaury Denoyelle	7a6e3c1a73	MAJOR: server: implement purging of private idle connections When a server goes into maintenance, or if its IP address is changed, idle connections attached to it are scheduled for deletion via the purge mechanism. Connections are moved from server idle/safe list to the purge list relative to their thread. Connections are freed on their owning thread by the scheduled purge task. This patch extends this procedure to also handle private idle connections stored in sessions instead of servers. This is possible thanks to the <sess_conns> list server member. A call to the newly defined function session_purge_conns() is performed on each list element. This moves private connections from their session to the purge list alongside other server idle connections. This change relies on the series of previous commits which ensure that access to private idle connections is now thread-safe, with idle_conns lock usage and careful manipulation of private idle conns in input/output handlers. The main benefit of this patch is that now all idle connections targeting a server set in maintenance are removed. Previously, private connections would remain until their attached sessions were closed.	2025-08-28 15:08:35 +02:00
Amaury Denoyelle	17a1daca72	MEDIUM: mux-quic: enforce thread-safety of backend idle conns Complete QUIC MUX for backend side. Ensure access to idle connections is performed in a thread-safe way. Even if takeover is not yet implemented for this protocol, it is at least necessary to ensure that there won't be any issue with idle connections purging mechanism. This change will also be necessary to ensure that QUIC servers can safely be removed via CLI "del server". This is not yet sufficient as currently server deletion still relies on takeover for idle connections removal. However, this will be adjusted in a future patch to instead use idle connections standard purging mechanism.	2025-08-28 15:08:35 +02:00
Amaury Denoyelle	73fd12e928	MEDIUM: conn/muxes/ssl: remove BE priv idle conn from sess on IO This is a direct follow-up of previous patch which adjusts idle private connections access via input/output handlers. This patch implements the handlers prologue part. Now, private idle connections require a similar treatment to non-private idle connections. Thus, private conns are removed temporarily from their session under protection of idle_conns lock. As locking usage is already performed in input/output handler, session_unown_conn() cannot be called. Thus, a new function session_detach_idle_conn() is implemented in session module, which performs basically the same operation but relies on external locking.	2025-08-28 15:08:35 +02:00
Amaury Denoyelle	8de0807b74	MEDIUM: conn/muxes/ssl: reinsert BE priv conn into sess on IO completion When dealing with input/output on a connection related handler, special care must be taken before accessing the connection if it is considered as idle, as it could be manipulated by another thread. Thus, connection is first removed from its idle tree before processing. The connection is reinserted on processing completion unless it has been freed during it. Idle private connections are not concerned by this, because takeover is not applied on them. However, a future patch will implement purging of these connections along with regular idle ones. As such, it is necessary to also protect private connections usage now. This is the subject of this patch and the next one. With this patch, input/output handlers epilogue of muxes/SSL/conn_notify_mux() are adjusted. A new code path is able to deal with a connection attached to a session instead of a server. In this case, session_reinsert_idle_conn() is used. Contrary to session_add_conn(), this new function is reserved for idle connections usage after a temporary removal. Contrary to _srv_add_idle() used by regular idle connections, session_reinsert_idle_conn() may fail as an allocation can be required. If this happens, the connection is immediately destroyed. This patch has no effect for now. It must be coupled with the next one which will temporarily remove private idle connections on input/output handler prologue.	2025-08-28 15:08:35 +02:00
Amaury Denoyelle	9574867358	MINOR: muxes: enforce thread-safety for private idle conns When a backend connection becomes idle, muxes must activate some protection to mark future access on it as dangerous. Indeed, once a connection is inserted in an idle list, it may be manipulated by another thread, either via takeover or scheduled for purging. Private idle connections are stored into a session instead of the server tree. They are never subject to a takeover for reuse or purge mechanism. As such, currently they do not require the same level of protection. However, a new patch will introduce support for private idle connections purging. Thus, the purpose of this patch is to ensure protection is activated as well now. TASK_F_USR1 was already set on them as an anticipation for such need. Only some extra operations were missing, most notably xprt_set_idle() invocation. Also, return path of muxes detach operation is adjusted to ensure such connections are never accessed after insertion.	2025-08-28 14:55:21 +02:00
Amaury Denoyelle	b18b5e2f74	MINOR: server: cleanup idle conns for server in maint already stopped When a server goes into maintenance mode, its idle connections are scheduled for an immediate purge. However, this is not the case if the server is already in stopped state, for example due to a health check failure. Adjust _srv_update_status_adm() to ensure that idle connections are always scheduled for purge when going into maintenance in both cases. The main advantage of this patch is to ensure consistent behavior for server maintenance mode. Note that it will also become necessary as server deletion will be adjusted with a future patch. Idle connection closure won't be performed by "del server" handler anymore, so it's important to ensure that a full cleanup is always performed prior to executing it, else the server may not be removable during a certain delay.	2025-08-28 14:55:21 +02:00
Amaury Denoyelle	fa1a168bf1	MEDIUM: session: close new idle conns if server in maintenance Previous patch ensures that a backend connection going into idle state is rejected and freed if its target server is in maintenance. This patch introduces a similar change for connections attached in the session. session_check_idle_conn() now returns an error if connection target server is in maintenance, similarly to session max idle conns limit reached. This is sufficient to instruct muxes to delete the connection immediately.	2025-08-28 14:55:21 +02:00
Amaury Denoyelle	67df6577ff	MEDIUM: server: close new idle conns if server in maintenance Currently, when a server is set on maintenance mode, its idle connections are scheduled for purge. However, this does not prevent currently used connections from becoming idle later on, even if the server is still off. Change this behavior: an idle connection is now rejected by the server if it is in maintenance. This is implemented with a new condition in srv_add_to_idle_list() which returns an error value. In this case, muxes stream detach callback will immediately free the connection. A similar change is also performed in each MUX and SSL I/O handlers and in conn_notify_mux(). An idle connection is not reinserted in its idle list if server is in maintenance, but instead it is immediately freed.	2025-08-28 14:55:18 +02:00
Amaury Denoyelle	f234b40cde	MINOR: server: shard by thread sess_conns member Server member <sess_conns> is a mt_list which contains every backend connection attached to a session which targets this server. These connections are not present in idle server trees. The main utility of this list is to be able to cleanup these connections prior to removing a server via "del server" CLI. However, this procedure will be adjusted by a future patch. As such, <sess_conns> member must be moved into srv_per_thread struct. Effectively, this duplicates a list for every thread. This commit does not introduce functional change. Its goal is to ensure that these connections are now ordered by their owning thread, which will allow to implement a purge, similarly to idle connections attached to servers.	2025-08-28 14:52:29 +02:00
Amaury Denoyelle	37fca75ef7	MEDIUM: session: protect sess conns list by idle_conns_lock Introduce idle_conns_lock usage to protect manipulation to <priv_conns> session member. This represents a list of intermediary elements used to store backend connections attached to a session to prevent their sharing across multiple clients. Currently, this patch is unneeded as sessions are only manipulated on a single-thread. Indeed, contrary to idle connections stored in servers, takeover is not implemented for connections attached to a session. However, a future patch will introduce purging of these connections, which is already performed for connections attached to servers. As this can be executed by any thread, it is necessary to introduce idle_conns_lock usage to protect their manipulation.	2025-08-28 14:52:29 +02:00
Amaury Denoyelle	f3e8e863c9	MINOR: session: refactor alloc/lookup of sess_conns elements By default backend connections are stored into idle/avail server trees. However, if such connections cannot be shared between multiple clients, session serves as the alternative storage. To be able to quickly reuse a backend conn from a session, they are indexed by their target, which is either a server or a backend proxy. This is the purpose of 'struct sess_priv_conns' intermediary storage element. Lookup and allocation of these elements are performed in several session functions, for example to add, get or remove a backend connection from a session. The purpose of this patch is to simplify this by providing two internal functions sess_alloc_sess_conns() and sess_get_sess_conns(). Along with this, a new BUG_ON() is added into session_unown_conn(), which ensures that sess_priv_conns element is found when the connection is removed from the session.	2025-08-28 14:52:29 +02:00
Amaury Denoyelle	d4f7a2dbcc	MINOR: session: uninline functions related to BE conns management Move from header to source file functions related to session management of backend connections. These functions are big enough to remove inline attribute.	2025-08-28 14:52:29 +02:00
Amaury Denoyelle	d0df41fd22	MINOR: session: document explicitly that session_add_conn() is safe A set of recent patches have simplified management of backend connection attached to sessions. The API is now stricter to prevent any misuse. One of these changes is the addition of a BUG_ON() in session_add_conn(), which ensures that a connection is not attached to a session if its <owner> field points to another entry. On older haproxy releases, this assertion could not be enforced due to NTLM as a connection is turned as private during its transfer. When using a true multiplexed protocol on the backend side, the connection could be assigned in turn to several sessions. However, NTLM is now only applied for HTTP/1.1 as it does not make sense if the connection is already shared. To better clarify this situation, extend the comment on BUG_ON() inside session_add_conn().	2025-08-28 14:52:29 +02:00
Amaury Denoyelle	b3ce464435	BUG/MINOR: mux-quic: do not access conn after idle list insert Once a connection is inserted into the server idle/safe tree during stream detach, it is not accessed anymore by the muxes without idle_conns_lock protection. This is because the connection could have been already stolen by a takeover operation. Adjust QUIC MUX detach implementation to follow the same pattern. Note that, no bug can occur due to takeover as QUIC does not implement it. However, prior to this patch, there may still exist race-conditions with idle connection purging. No backport needed.	2025-08-28 14:52:29 +02:00
Amaury Denoyelle	0be225f341	BUG/MINOR: server: decrement session idle_conns on del server When a server is deleted, each of its idle connections is removed. This is also performed for every private connection stored on sessions which referenced the target server. As mentioned above, these private connections are idle, guaranteed by srv_check_for_deletion(). A BUG_ON() on CO_FL_SESS_IDLE is already present to guarantee this. Thus, these connections are accounted on the session to enforce max-session-srv-conns limit. However, this counter is not decremented during private conns cleanup on "del server" handler. This patch fixes this by adding a decrement for every private connection removed via "del server". This should be backported up to 3.0.	2025-08-28 14:52:29 +02:00
Amaury Denoyelle	bce29bc7a4	MINOR: cli: display failure reason on wait command wait CLI command can be used to wait until either a defined timeout or a specific condition is reached. So far, srv-removable is the only event supported. This is implemented via srv_check_for_deletion(), which is able to report a message describing the reason if the condition is unmet. Previously, wait returned a generic string to specify whether the condition is met, the timer has expired or an immediate error is encountered. In case of srv-removable, it did not report the real reason why a server could not be removed. This patch improves wait command with srv-removable. It now displays the last message returned by srv_check_for_deletion(), either on immediate error or on timeout. This is implemented by using dynamic string output with cli_dynmsg/dynerr() functions.	2025-08-28 14:52:29 +02:00
Amaury Denoyelle	04f05f1880	BUG/MINOR: connection: remove extra session_unown_conn() on reverse When a connection is reversed via rhttp protocol on the edge endpoint, it migrates from frontend to backend side. This operation is performed by conn_reverse(). During this transition, the conn owning session is freed as it becomes unneeded. Prior to this patch, session_unown_conn() was also called during frontend to backend migration. However, this is unnecessary as this function is only used for backend connection reuse. As such, this patch removes this unnecessary call. This does not cause any harm to the process as session_unown_conn() can handle a connection not inserted yet. However, for clarity purpose it's better to backport this patch up to 3.0.	2025-08-28 14:52:29 +02:00
Amaury Denoyelle	a96f1286a7	BUG/MINOR: connection: rearrange union list members A connection can be stored in several lists, thus there are several attach points in struct connection. Depending on its proxy side, either frontend or backend, a single connection will only access some of them during its lifetime. As an optimization, these attach points are organized in a union. However, this repartition was not correctly achieved along frontend/backend side delimitation. Furthermore, reverse HTTP has recently been introduced. With this feature, a connection can migrate from frontend to backend side or vice versa. As such, it becomes even more tedious to ensure that these members are always accessed in a safe way. This commit rearranges these fields. First, union is now clearly split between frontend and backend only elements. Next, backend elements are initialized with conn_backend_init(), which is already used during connection reversal on an edge endpoint. A new function conn_frontend_init() serves to initialize the other members, called both on connection first instantiation and on reversal on a dialer endpoint. This model is much cleaner and should prevent any access to fields from the wrong side. Currently, there is no known case of wrong access in the existing code base. However, this cleanup is considered an improvement which must be backported up to 3.0 to remove any possible undefined behavior.	2025-08-28 14:52:29 +02:00
William Lallemand	d7f6819161	BUG/MEDIUM: mworker: fix startup and reload on macOS Since the mworker rework in haproxy 3.1, the worker needs to tell the master that it is ready. This is done using the sockpair protocol by sending a _send_status message to the master. It seems that the sockpair protocol is buggy on macOS because of a known issue around fd transfer documented in sendmsg(2): https://man.freebsd.org/cgi/man.cgi?sendmsg(2) BUGS section Because sendmsg() does not necessarily block until the data has been transferred, it is possible to transfer an open file descriptor across an AF_UNIX domain socket (see recv(2)), then close() it before it has actually been sent, the result being that the receiver gets a closed file descriptor. It is left to the application to implement an acknowledgment mechanism to prevent this from happening. Indeed the recv side of the sockpair is closed on the send side just after the send_fd_uxst(), which does not implement an acknowledgment mechanism. So the master might never recv the _send_status message. In order to implement an acknowledgment mechanism, a blocking read() is done before closing the recv fd on the sending side, so we are sure that the message was read on the other side. This was only reproduced on macOS, meaning the master CLI is also impacted on macOS. But no solution was found on macOS for it. Implementing an acknowledgment mechanism would make the protocol too complex in non-blocking mode. The problem was reported in ticket #3045, reproduced and analyzed by @cognet. Must be backported as far as 3.1.	2025-08-28 14:51:46 +02:00
Valentine Krasnobaeva	441cd614f9	BUG/MINOR: acl: set arg_list->kw to aclkw->kw string literal if aclkw is found During configuration parsing *args can contain different addresses, it is changing from line to line. smp_resolve_args() is called after the configuration parsing, it uses arg_list->kw to create an error message, if a userlist referenced in some ACL is absent. This leads to wrong keyword names reported in such message or some garbage is printed. It does not happen in the case of sample fetches. In this case arg_list->kw is assigned to a string literal from the sample_fetch struct returned by find_sample_fetch(). Let's do the same in parse_acl_expr(), when find_acl_kw() lookup returns a corresponding acl_keyword structure. This fixes the issue #3088 at GitHub. This should be backported in all stable versions since 2.6 including 2.6.	2025-08-28 10:22:21 +02:00
Frederic Lecaille	ffa926ead3	BUG/MINOR: mux-quic: trace with non initialized qcc This issue leads to crashes when the QUIC mux traces are enabled and could be reproduced with -dMfail. When the qcc allocation fails (qcc_init()) haproxy crashes into qmux_dump_qcc_info() because ->conn qcc member is not initialized: Program terminated with signal SIGSEGV, Segmentation fault. at src/qmux_trace.c:146 146 const struct quic_conn *qc = qcc->conn->handle.qc; [Current thread is 1 (LWP 1448960)] (gdb) p qcc $1 = (const struct qcc *) 0x7f9c63719fa0 (gdb) p qcc->conn $2 = (struct connection *) 0x155550508 (gdb) This patch simply fixes the TRACE() call concerned to avoid <qcc> object dereferencing when it is NULL. Must be backported as far as 3.0.	2025-08-28 08:19:34 +02:00
Frederic Lecaille	31c17ad837	MINOR: quic: remove ->offset qf_crypto struct field This patch follows this previous bug fix: BUG/MINOR: quic: reorder fragmented RX CRYPTO frames by their offsets where an ebtree node has been added to qf_crypto struct. It has the same meaning and type as ->offset_node.key field with ->offset_node an eb64tree node. This patch simply removes ->offset which is no longer useful. This patch should be easily backported as far as 2.6 as the one mentioned above to ease any further backport to come.	2025-08-28 08:19:34 +02:00
William Lallemand	2ed515c632	DOC: configuration: clarify 'default-crt' and implicit default certificates Clarify the behavior of implicit default certificates when used on the same line as the default-crt keyword. Should be backported as far as 3.2	2025-08-27 17:09:02 +02:00
William Lallemand	ab7358b366	MEDIUM: ssl: convert diag to warning for strict-sni + default-crt Previous patch emits a diag warning when both 'strict-sni' + 'default-crt' are used on the same bind line. This patch converts this diagnostic warning to a real warning, so the previous patch could be backported without breaking configurations. This was discussed in #3082.	2025-08-27 16:22:12 +02:00
William Lallemand	18ebd81962	MINOR: ssl: diagnostic warning when both 'default-crt' and 'strict-sni' are used It is possible to use both 'strict-sni' and 'default-crt' on the same bind line, which does not make much sense. This patch implements a check which will look for default certificates in the sni_w tree when strict-sni is used (referenced by their empty sni ""). default-crt sets the CKCH_INST_EXPL_DEFAULT flag in ckch_inst->is_default, so it is possible to differentiate explicit defaults from implicit defaults. Could be backported as far as 3.0. This was discussed in ticket #3082.	2025-08-27 16:22:12 +02:00
Frederic Lecaille	d753f24096	BUG/MINOR: quic: reorder fragmented RX CRYPTO frames by their offsets This issue impacts the QUIC listeners. It is the same as the one fixed by this commit: BUG/MINOR: quic: repeat packet parsing to deal with fragmented CRYPTO Like Chrome, the ngtcp2 client decided to fragment its CRYPTO frames, but in a much more aggressive way. For Chrome, the commit above was sufficient, and a list local to qc_parse_pkt_frms() could have been used. But this is not sufficient for ngtcp2, which often splits its ClientHello message into more than 10 fragments, some of them very small. This leads the packet parser to interrupt the CRYPTO frames parsing due to the ncbuf gap size limit. To fix this, this patch proceeds in roughly the same way but with an ebtree to reorder the CRYPTO frames by their offsets. These frames are directly inserted into a local ebtree. Then this ebtree is reused to provide the reordered CRYPTO data to the underlying ncbuf (non contiguous buffer). This way, the ncbufs used to store CRYPTO data are far less likely to reach an overly fragmented state. Must be backported as far as 2.6.	2025-08-27 16:14:19 +02:00
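The reordering idea in the commit above can be illustrated with a minimal, self-contained sketch. This is not HAProxy's code: the real patch uses an ebtree and an ncbuf, whereas this sketch substitutes a plain `qsort()` over an array and a flat output buffer. The names `crypto_frag`, `frag_cmp` and `reassemble` are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a received CRYPTO frame fragment. */
struct crypto_frag {
    uint64_t offset;            /* offset of the fragment in the CRYPTO stream */
    size_t len;                 /* fragment length in bytes */
    const unsigned char *data;  /* fragment payload */
};

/* Order fragments by increasing stream offset. */
static int frag_cmp(const void *a, const void *b)
{
    uint64_t x = ((const struct crypto_frag *)a)->offset;
    uint64_t y = ((const struct crypto_frag *)b)->offset;

    return (x > y) - (x < y);
}

/* Sort the fragments by offset (qsort is used here instead of an ebtree),
 * then copy them in order into <out>, starting at stream offset <start>.
 * Copying stops at the first gap or when <out> is full.
 * Returns the number of contiguous bytes written.
 */
static size_t reassemble(struct crypto_frag *frags, size_t n, uint64_t start,
                         unsigned char *out, size_t out_size)
{
    uint64_t next = start;
    size_t i;

    qsort(frags, n, sizeof(*frags), frag_cmp);

    for (i = 0; i < n; i++) {
        const struct crypto_frag *f = &frags[i];
        size_t skip, to_copy;

        if (f->offset > next)
            break;                      /* gap: data is still missing */
        if (f->offset + f->len <= next)
            continue;                   /* fully duplicated data */

        skip = (size_t)(next - f->offset);
        to_copy = f->len - skip;
        if ((size_t)(next - start) + to_copy > out_size)
            break;                      /* output buffer exhausted */

        memcpy(out + (next - start), f->data + skip, to_copy);
        next += to_copy;
    }
    return (size_t)(next - start);
}
```

Feeding the consumer in offset order means it only ever sees contiguous data, which is the property the commit relies on to keep the non-contiguous buffer from accumulating gaps.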
Frederic Lecaille	729196fbed	BUG/MEDIUM: quic-be: avoid crashes when releasing Initial pktns This bug arrived with this fix: BUG/MINOR: quic-be: missing Initial packet number space discarding leading to crashes when dereferencing ->ipktns. Such crashes could be reproduced with -dMfail option. To reach them, the memory allocations must fail. So, this is relatively rare, except on systems with limited memory. To fix this, do not call quic_pktns_discard() if ->ipktns is NULL. No need to backport.	2025-08-27 16:14:19 +02:00
William Lallemand	c36e4fb17f	DOC: configuration: reword 'generate-certificates' Reword the 'generate-certificates' keyword documentation to clarify what's happening upon error. This was discussed in ticket #3082.	2025-08-27 13:42:29 +02:00
Aurelien DARRAGON	2cd0afb430	MINOR: proxy: handle shared listener counters preparation from proxy_postcheck() We used to allocate and prepare listener counters from check_config_validity() all at once. But it isn't correct, since at that time listeners's guid are not inserted yet, thus counters_fe_shared_prepare() cannot work correctly, and so does shm_stats_file_preload() which is meant to be called even earlier. Thus in this commit (and to prepare for upcoming shm shared counters preloading patches), we handle the shared listener counters prep in proxy_postcheck(), which means that between the allocation and the prep there is the proper window for listener's guid insertion and shm counters preloading. No change of behavior expected when shm shared counters are not actually used.	2025-08-27 12:54:25 +02:00
Aurelien DARRAGON	cdb97cb73e	MEDIUM: server: split srv_init() in srv_preinit() + srv_postinit() We actually need more granularity to split srv postparsing init tasks: Some of them are required to be run BEFORE the config is checked, and some of them AFTER the config is checked. Thus we push the logic from 368d0136 ("MEDIUM: server: add and use srv_init() function") a little bit further and split the function in two distinct ones, one of them executed under check_config_validity() and the other one using REGISTER_POST_SERVER_CHECK() hook. SRV_F_CHECKED flag was removed because it is no longer needed, srv_preinit() is only called once, and so is srv_postinit().	2025-08-27 12:54:19 +02:00
Aurelien DARRAGON	9736221e90	MINOR: haproxy: abort config parsing on fatal errors for post parsing hooks When pre-check and post-check postparsing hooks are evaluated in step_init_2(), potential fatal errors are ignored during the iteration and are only taken into account at the end of the loop. This is not ideal because some errors (e.g. memory errors) could cause multiple alert messages in a row, which could make troubleshooting harder for the user. Let's stop as soon as a fatal error is encountered for post parsing hooks, as we do everywhere else.	2025-08-27 12:54:13 +02:00
Christopher Faulet	49db9739d0	BUG/MEDIUM: spoe: Improve error detection in SPOE applet on client abort It is possible to interrupt a SPOE applet without reporting an error. For instance, when the client of the parent stream aborts. Thanks to this patch, we take care to report an error on the SPOE applet to be sure to interrupt the processing. It is especially important if the connection to the agent is queued. Thanks to 886a248be ("BUG/MEDIUM: mux-spop: Reject connection attempts from a non-spop frontend"), it is no longer an issue. But there is no reason to continue to process if the parent stream is gone. In addition, in the SPOE filter, if the processing is interrupted when the filter is destroyed, no specific status code was set. It is not a big deal because it cannot be logged at this stage. But it can be used to notify the SPOE applet. So better to set it. This patch should be backported as far as 3.1.	2025-08-26 16:12:18 +02:00
William Lallemand	7a30c10587	REGTESTS: jwt: create dynamically "cert.ecdsa.pem" Stop declaring "cert.ecdsa.pem" in a crt-store, and add it dynamically over the stats socket instead. This way we fully verify a JWS signature with a certificate which never existed at HAProxy startup.	2025-08-25 16:44:24 +02:00
Christopher Faulet	886a248be4	BUG/MEDIUM: mux-spop: Reject connection attempts from a non-spop frontend It is possible to crash the process by initializing a connection to a SPOP server from a non-spop frontend. It is of course unexpected and invalid. And there are some checks to prevent that when the configuration is loaded. However, it is not possible to handle all cases, especially the "use_backend" rules relying on log-format strings. It could be good to improve the backend selection by checking the mode compatibility (for now, it is only performed for the HTTP). But at the end, this can also be handled by the SPOP multiplexer when it is initialized. If the opposite SD is not attached to an SPOE agent, we should fail the mux initialization and return an internal error. This patch must be backported as far as 3.1.	2025-08-25 11:11:05 +02:00
Christopher Faulet	b4a92e7cb1	MEDIUM: applet: Set .rcv_buf and .snd_buf functions on default ones if not set Based on the applet flags, it is possible to set .rcv_buf and .snd_buf callback functions if necessary. If these functions are not defined for an applet using the new API, it means the default functions must be used. We also take care to choose the raw version or the htx version, depending on the applet flags.	2025-08-25 11:11:05 +02:00
Christopher Faulet	71c01c1010	MINOR: applet: Make some applet functions HTX aware applet_output_room() and applet_input_data() are now HTX aware. These functions automatically rely on htx versions if APPLET_FL_HTX flag is set for the applet.	2025-08-25 11:11:05 +02:00
Christopher Faulet	927884a3eb	MINOR: applet: Add a flag to know an applet is using HTX buffers Multiplexers already explicitly announce their HTX support. Now it is possible to set flags on applet, it could be handy to do the same. So, now, HTX aware applets must set the APPLET_FL_HTX flag.	2025-08-25 11:11:05 +02:00
Christopher Faulet	1c76e4b2e4	MINOR: applet: Add function to test applet flags from the appctx appctx_app_test() function can now be used to test the applet flags using an appctx. This slightly simplifies tests on applet flags. For now, this function is used to test APPLET_FL_NEW_API flag.	2025-08-25 11:11:05 +02:00
Christopher Faulet	3de6c375aa	MINOR: applet: Rely on applet flag to detect the new api Instead of setting a flag on the applet context by checking the defined callback functions of the applet to know if an applet is using the new API or not, we can now rely on the applet flags itself. By checking APPLET_FL_NEW_API flag, it does the job. APPCTX_FL_INOUT_BUFS flag is thus removed.	2025-08-25 11:11:05 +02:00
Aurelien DARRAGON	3da1d63749	BUG/MEDIUM: http_ana: handle yield for "stats http-request" evaluation stats http-request rules evaluation is handled separately in http_process_req_common(). Because of that, if a rule requires yielding, the evaluation is interrupted as (F)YIELD verdict return values are not handled there. Since 3.2 with the introduction of costly ruleset interruption in 0846638 ("MEDIUM: stream: interrupt costly rulesets after too many evaluations"), the issue started being more visible because stats http-request rules would be interrupted when the evaluation counters reached tune.max-rules-at-once, but the evaluation would never be resumed, and the request would continue to be handled as if the evaluation was complete. Note however that the issue already existed in the past for actions that could return ACT_RET_YIELD such as "pause" for instance. This issue was reported by GH user @Wahnes in #3087, thanks to him for providing useful repro and details. To fix the issue, we merge rule verdict handling in http_process_req_common() so that "stats http-request" evaluation benefits from all return values already supported for the current ruleset. It should be backported in 3.2 with 0846638 ("MEDIUM: stream: interrupt costly rulesets after too many evaluations"), and probably even further (all stable versions) if the patch adaptation is not too complex (before HTTP_RULE_RES_FYIELD was introduced) because it is still relevant.	2025-08-25 10:59:16 +02:00
Aurelien DARRAGON	f9b227ebff	MINOR: http_ana: fix typo in http_res_get_intercept_rule HTTP_RULE_RES_YIELD was used where HTTP_RULE_RES_FYIELD should be used. Hopefully, aside from debug traces, both return values were treated equally. Let's fix that to prevent confusion and from causing bugs in the future. It may be backported in 3.2 with 0846638 ("MEDIUM: stream: interrupt costly rulesets after too many evaluations") if it easily applies	2025-08-25 10:59:08 +02:00
Amaury Denoyelle	1529ec1a25	MINOR: quic: centralize padding for HP sampling on packet building The below patch has simplified INITIAL padding on emission. Now, qc_prep_pkts() is responsible to activate padding for this case, and there is no more special case in qc_do_build_pkt() needed. commit 8bc339a6ad4702f2c39b2a78aaaff665d85c762b BUG/MAJOR: quic: fix INITIAL padding with probing packet only However, qc_do_build_pkt() may still activate padding on its own, to ensure that a packet is big enough so that header protection decryption can be performed by the peer. HP decryption is performed by extracting a sample from the ciphered packet, starting 4 bytes after PN offset. Sample length is 16 bytes as defined by TLS algos used by QUIC. Thus, a QUIC sender must ensure that the packet number plus payload fields are at least 4 bytes long in total. This is enough given that each packet is completed by a 16 bytes AEAD tag which can be part of the HP sample. This patch simplifies qc_do_build_pkt() by centralizing padding for this case in a single location. This is performed at the end of the function after payload is completed. The code is thus simpler. This is not a bug. However, it may be interesting to backport this patch up to 2.6, as qc_do_build_pkt() is a tedious function, in particular when dealing with padding generation, thus it may benefit greatly from simplification.	2025-08-25 08:48:24 +02:00
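The arithmetic described in the commit above can be written out as a short sketch. It is not taken from HAProxy: the function `hp_padding_needed` and the constant names are hypothetical, and a 16-byte AEAD tag is assumed, as stated in the commit message.

```c
#include <stddef.h>

/* Values taken from the commit message above. */
enum {
    HP_SAMPLE_OFFSET = 4,   /* sample starts 4 bytes after the PN offset */
    HP_SAMPLE_LEN    = 16,  /* sample length in bytes */
    AEAD_TAG_LEN     = 16   /* tag appended to every packet (assumed) */
};

/* Return how many PADDING bytes must be added so that the peer can
 * extract a full header protection sample.
 * <pn_len> is the encoded packet number length, <payload_len> is the
 * length of the frames before encryption (AEAD tag excluded).
 */
static size_t hp_padding_needed(size_t pn_len, size_t payload_len)
{
    /* PN + payload + tag must cover offset + sample, so PN + payload
     * must be at least 4 + 16 - 16 = 4 bytes.
     */
    size_t min_len = HP_SAMPLE_OFFSET + HP_SAMPLE_LEN - AEAD_TAG_LEN;
    size_t have = pn_len + payload_len;

    return have >= min_len ? 0 : min_len - have;
}
```

For instance, a packet with a 1-byte packet number carrying only a 1-byte PING frame would need 2 bytes of padding under these assumptions.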
Amaury Denoyelle	7d554ca629	BUG/MINOR: quic: don't coalesce probing and ACK packet of same type Haproxy QUIC stack suffers from a limitation: it's not possible to emit a packet which contains both probing data and an ACK frame. Thus, in case qc_do_build_pkt() is invoked with both values set to true, probing has the priority and ACK is ignored. However, this has the undesired side-effect of possibly generating two coalesced packets of the same type in the same datagram: the first one with the probing data and the second with an ACK frame. This is caused by qc_prep_pkts() loop which may call qc_do_build_pkt() multiple times with the same QEL instance. This case is normally used when a full datagram has been built but there is still content to emit on the current encryption level. To fix this, alter qc_prep_pkts() loop: if both probing and ACK are requested, force the datagram to be written after packet encoding. This will result in a datagram containing the packet with probing data as final entry. A new datagram is started for the next packet, which can contain the ACK frame. This also has some impact on INITIAL padding. Indeed, if a packet must be the last due to probing emission, qc_prep_pkts() will also activate padding to ensure the final datagram is at least 1200 bytes long. Note that coalescing two packets of the same type is not invalid according to QUIC RFC. However it could cause issues with some shaky implementations, so it is considered as a bug. This must be backported up to 2.6.	2025-08-22 18:20:42 +02:00
Amaury Denoyelle	8bc339a6ad	BUG/MAJOR: quic: fix INITIAL padding with probing packet only A QUIC datagram that contains an INITIAL packet must be padded to 1200 bytes to prevent any deadlock due to anti-amplification protection. This is implemented by encoding a PADDING frame on the last packet of the datagram if necessary. Previously, qc_prep_pkts() was responsible to activate padding when calling qc_do_build_pkt(), as it knows which packet is the last to encode. However, this has the side-effect of preventing PING emission for probing with no data as this case was handled in an else-if branch after padding. This was fixed by the below commit 217e467e89d15f3c22e11fe144458afbf718c8a8 BUG/MINOR: quic: fix malformed probing packet building Above logic was altered to fix the PING case: padding was set to false explicitly in qc_prep_pkts(). Padding was then added in a specific block dedicated to the PING case in qc_do_build_pkt() itself for INITIAL packets. However, the fix is incorrect if the last QEL used to build a packet is not the initial one and probing is used with PING frame only. In this case, the specific block in qc_do_build_pkt() does not add padding. This causes a BUG_ON() crash in qc_txb_store() which catches these packets as irregularly formed. To fix this while also properly handling PING emission, revert to the original padding logic: qc_prep_pkts() is responsible to activate INITIAL padding. To not interfere with PING emission, qc_do_build_pkt() body is adjusted so that PING block is moved up in the function and detached from the padding condition. The main benefit from this patch is that INITIAL padding decision in qc_prep_pkts() is clearer now. Note that padding can also be activated by qc_do_build_pkt(), as packets should be big enough for header protection decipher. However, this case is different from INITIAL padding, so it is not covered by this patch. This should be backported up to 2.6.	2025-08-22 18:12:32 +02:00
Amaury Denoyelle	0376e66112	BUG/MINOR: quic: do not emit probe data if CONNECTION_CLOSE requested If connection closing is activated, qc_prep_pkts() can only build a datagram with a single packet. This is because we consider that only a single CONNECTION_CLOSE frame is relevant at this stage. This is handled both by qc_prep_pkts() which ensures that only a single packet datagram is built and also qc_do_build_pkt() which prevents the invocation of qc_build_frms() if <cc> is set. However, there is an inconsistency for probing. First, qc_prep_pkts() deactivates it if connection closing is requested. But qc_do_build_pkt() may still emit a probing frame as it does not check its <probe> argument but rather <pto_probe> QEL field directly. This can result in a packet mixing a PING and a CONNECTION_CLOSE frame, which is useless. Fix this by adjusting qc_do_build_pkt(): closing argument is also checked on PING probing emission. Note that there is still shaky code here as qc_do_build_pkt() should rely only on <probe> argument to ensure this. This should be backported up to 2.6.	2025-08-22 18:06:43 +02:00
Amaury Denoyelle	fc3ad50788	BUG/MEDIUM: quic: reset padding when building GSO datagrams qc_prep_pkts() encodes input data into QUIC packets in a loop into one or several datagrams. It supports GSO, which requires building a series of multiple datagrams of the same length. Each packet encoding is performed via a call to qc_do_build_pkt(). This function has an argument to specify if output packet must be completed with a PADDING frame. This option is activated when qc_prep_pkts() encodes the last packet of a datagram with at least one INITIAL packet in it. Padding is reset each time a new datagram is started. However, this was not performed if GSO is used to build the next datagram. This patch fixes it by properly resetting padding in this case also. The impact of this bug is unknown. It may have several effects, one of the most obvious being the insertion of unnecessary padding in packets. It could also potentially trigger an infinite loop in qc_prep_pkts(), although this has never been encountered so far. This must be backported up to 3.1.	2025-08-22 16:22:01 +02:00
Valentine Krasnobaeva	0dc8d8d027	MINOR: dns: dns_connect_nameserver: fix fd leak at error path This fixes the commit 2c7e05f80e3b ("MEDIUM: dns: don't call connect to dest socket for AF_INET*"). If we fail to bind AF_INET sockets, or if the address family of the nameserver protocol isn't one we expect, we need to close the fd obtained by connect. This fixes the issue GitHub #3085. This must be backported along with the commit 2c7e05f80e3b.	2025-08-22 10:50:47 +02:00
Christopher Faulet	a498e527b4	BUG/MAJOR: stream: Remove READ/WRITE events on channels after analysers eval It is possible to miss a synchronous write event in process_stream() if the stream was woken up on a write event. In that case, it is possible to freeze the stream until the next I/O event or timeout. Concretely, the stream is woken up with CF_WRITE_EVENT on a channel. This flag is removed from the channel when we leave process_stream(). But before leaving process_stream(), when a synchronous send is tried on this channel, the flag is removed and eventually set again on success. But this event is masked by the previous one, and the channel is not resynced as it should be. To fix the bug, CF_READ_EVENT and CF_WRITE_EVENT flags are removed from a channel after the corresponding analysers evaluation. This way, we will be able to detect a successful synchronous send to restart analysers evaluation based on the new channel state. It is safe (or it should be) to do so because these flags are only used by analysers and tested to resync the stream inside process_stream(). It is a very old bug and I guess all versions are affected. It was observed on 2.9 and higher, and with the master/worker only. But it could affect any stream. It is tagged as MAJOR because this area is really sensitive to any change. This patch should fix the issue #3070. It should probably be backported to all stable versions, but only after a period of observation and with special care because this area is really sensitive to changes. It is probably reasonable to backport it as far as 3.0 and wait for older versions. Thanks to Valentine for her help on this issue!	2025-08-21 20:15:18 +02:00
William Lallemand	7b3b3d7146	BUG/MEDIUM: ssl: apply ssl-f-use on every "ssl" bind This patch introduces a change of behavior in the configuration parsing. Previously the "ssl-f-use" lines were only applied on "ssl" bind lines that does not have any "crt" configured. Since there is no warning and you could mix bind lines with and without crt, this is really confusing. This patch applies the "ssl-f-use" lines on every "ssl" bind lines. This was discussed in ticket #3082. Must be backported in 3.2.	2025-08-21 14:58:06 +02:00
Frederic Lecaille	e513620c72	BUG/MEDIUM: quic-be: crash after backend CID allocation failures This bug impacts only the QUIC backends. It arrived with this commit: MINOR: quic-be: QUIC connection allocation adaptation (qc_new_conn()) which was supposed to be fixed by: BUG/MEDIUM: quic: crash after quic_conn allocation failures but this commit was not sufficient. Such a crash could be reproduced with -dMfail option. To reach it, the <conn_id> object allocation must fail (from qc_new_conn()). So, this is relatively rare, except on systems with limited memory. No need to backport.	2025-08-21 14:24:31 +02:00
Frederic Lecaille	9a22770ac5	BUG/MINOR: quic-be: missing Initial packet number space discarding A QUIC client must discard the Initial packet number space as soon as it first sends a Handshake packet. This patch implements this packet number space which was missing.	2025-08-21 14:24:31 +02:00
Amaury Denoyelle	901de11157	BUG/MEDIUM: mux-h2: fix crash on idle-ping due to unwanted ABORT_NOW An ABORT_NOW() was used during debugging idle-ping but was not removed from the final code. This may cause crash, in particular when mixing idle-ping with shorter http-request/http-keep-alive values. Fix this situation by removing ABORT_NOW() statement. This should fix github issue #3079. This must be backported up to 3.2.	2025-08-21 14:21:11 +02:00
Willy Tarreau	82b002a225	[RELEASE] Released version 3.3-dev7 Released version 3.3-dev7 with the following main changes : - MINOR: quic: duplicate GSO unsupp status from listener to conn - MINOR: quic: define QUIC_FL_CONN_IS_BACK flag - MINOR: quic: prefer qc_is_back() usage over qc->target - BUG/MINOR: cfgparse: immediately stop after hard error in srv_init() - BUG/MINOR: cfgparse-listen: update err_code for fatal error on proxy directive - BUG/MINOR: proxy: avoid NULL-deref in post_section_px_cleanup() - MINOR: guid: add guid_get() helper - MINOR: guid: add guid_count() function - MINOR: clock: add clock_set_now_offset() helper - MINOR: clock: add clock_get_now_offset() helper - MINOR: init: add REGISTER_POST_DEINIT_MASTER() hook - BUILD: restore USE_SHM_OPEN build option - BUG/MINOR: stick-table: cap sticky counter idx with tune.nb_stk_ctr instead of MAX_SESS_STKCTR - MINOR: sock: update broken accept4 detection for older hardwares. - CI: vtest: add os name to OT cache key - CI: vtest: add Ubuntu arm64 builds - BUG/MEDIUM: ssl: Fix 0rtt to the server - BUG/MEDIUM: ssl: fix build with AWS-LC - MEDIUM: acme: use lowercase for challenge names in configuration - BUG/MINOR: init: Initialize random seed earlier in the init process - DOC: management: clarify usage of -V with -c - MEDIUM: ssl/cli: relax crt insertion in crt-list of type directory - MINOR: tools: implement ha_aligned_zalloc() - CLEANUP: fd: make use of ha_aligned_alloc() for the fdtab - MINOR: pools: distinguish the requested alignment from the type-specific one - MINOR: pools: permit to optionally specify extra size and alignment - MINOR: pools: always check that requested alignment matches the type's - DOC: api: update the pools API with the alignment and typed declarations - MEDIUM: tree-wide: replace most DECLARE_POOL with DECLARE_TYPED_POOL - OPTIM: tasks: align task and tasklet pools to 64 - OPTIM: buffers: align the buffer pool to 64 - OPTIM: queue: align the pendconn pools to 64 - OPTIM: 
connection: align connection pools to 64 - OPTIM: server: start to use aligned allocs in server - DOC: management: fix typo in commit f4f93c56 - DOC: config: recommend single quoting passwords - MINOR: tools: also implement ha_aligned_alloc_typed() - MEDIUM: server: introduce srv_alloc()/srv_free() to alloc/free a server - MINOR: server: align server struct to 64 bytes - MEDIUM: ring: always allocate properly aligned ring structures - CI: Update to actions/checkout@v5 - MINOR: quic: implement qc_ssl_do_hanshake() - BUG/MEDIUM: quic: listener connection stuck during handshakes (OpenSSL 3.5) - BUG/MINOR: mux-h1: fix wrong lock label - MEDIUM: dns: don't call connect to dest socket for AF_INET* - BUG/MINOR: spoe: Properly detect and skip empty NOTIFY frames - BUG/MEDIUM: cli: Report inbuf is no longer full when a line is consumed - BUG/MEDIUM: quic: crash after quic_conn allocation failures - BUG/MEDIUM: quic-be: do not initialize ->conn too early - BUG/MEDIUM: mworker: more verbose error upon loading failure - MINOR: xprt: Add recvmsg() and sendmsg() parameters to rcv_buf() and snd_buf(). - MINOR: ssl: Add a "flags" field to ssl_sock_ctx. - MEDIUM: xprt: Add a "get_capability" method. - MEDIUM: mux_h1/mux_pt: Use XPRT_CAN_SPLICE to decide if we should splice - MINOR: cfgparse: Add a new "ktls" option to bind and server. - MINOR: ssl: Define HAVE_VANILLA_OPENSSL if openssl is used. - MINOR: build: Add a new option, USE_KTLS. - MEDIUM: ssl: Add kTLS support for OpenSSL. - MEDIUM: splice: Don't consider EINVAL to be a fatal error - MEDIUM: ssl: Add splicing with SSL. - MEDIUM: ssl: Add ktls support for AWS-LC. - MEDIUM: ssl: Add support for ktls on TLS 1.3 with AWS-LC - MEDIUM: ssl: Handle non-Application data record with AWS-LC - MINOR: ssl: Add a way to globally disable ktls.	2025-08-20 21:52:39 +02:00
Olivier Houchard	6f21c5631a	MINOR: ssl: Add a way to globally disable ktls. Add a new global option, "noktls", as well as a command line option, "-dT", to totally disable ktls usage, even if it is activated on servers or binds in the configuration. That makes it easier to quickly figure out if a problem is related to ktls or not.	2025-08-20 18:33:11 +02:00
Olivier Houchard	5da3540988	MEDIUM: ssl: Handle non-Application data record with AWS-LC Handle receiving and sending TLS records that are not application data records. When receiving, we ignore new session tickets records, we handle close notify as a read0, and we consider any other records as a connection error. For sending, we're just sending close notify, so that the TLS connection is properly closed.	2025-08-20 18:33:11 +02:00
Olivier Houchard	fefc1cce20	MEDIUM: ssl: Add support for ktls on TLS 1.3 with AWS-LC AWS-LC added a new API in AWS-LC 1.54 that allows the user to retrieve the keys for TLS 1.3 connections with SSL_get_read_traffic_secret(), so use it to be able to use ktls with TLS 1.3 too.	2025-08-20 18:33:11 +02:00
Olivier Houchard	5c8fa50966	MEDIUM: ssl: Add ktls support for AWS-LC. Add ktls support for AWS-LC. As it does not know anything about ktls, it means extracting keys from the ssl lib, and provide them to the kernel. At which point we can use regular recvmsg()/sendmsg() calls. This patch only provides support for TLS 1.2, AWS-LC provides a different way to extract keys for TLS 1.3. Note that this may work with BoringSSL too, but it has not been tested.	2025-08-20 18:33:11 +02:00
Olivier Houchard	a903004a1a	MEDIUM: ssl: Add splicing with SSL. Implement the splicing methods to the SSL xprt (which will just call the raw_sock methods if kTLS is enabled on the socket), and properly report that a connection supports splicing if kTLS is configured on that connection. For OpenSSL, if the upper layer indicated that it wanted to start using splicing by adding the CO_FL_WANT_SPLICING flag, make sure we don't read any more data from the socket, and just drain what may be in the internal OpenSSL buffers, before allowing splicing	2025-08-20 18:33:11 +02:00
Olivier Houchard	755436920d	MEDIUM: splice: Don't consider EINVAL to be a fatal error Don't consider that EINVAL is a fatal error, when calling splice(). When doing splicing from a kTLS socket, splice() will set errno to EINVAL if the next record to be read is not an application data record. This is not a fatal error, it just means we have to use recvmsg() to read it, and potentially we can then resume using splicing. It is unfortunate that EINVAL was used for that case, but we should never get any other case of receiving EINVAL from splice(), so it should be safe to treat it as non-fatal.	2025-08-20 18:33:11 +02:00
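The errno handling described in the commit above can be sketched as a small classification helper. This is an illustration only, not HAProxy's implementation: the enum `splice_verdict` and the function `classify_splice_errno` are hypothetical names, and treating `EAGAIN`/`EINTR` as retryable is a general assumption about non-blocking I/O rather than something stated in the commit.

```c
#include <errno.h>

/* Hypothetical outcome of a failed splice() call on a kTLS socket. */
enum splice_verdict {
    SPLICE_RETRY_LATER,       /* nothing to read yet, try again later */
    SPLICE_FALLBACK_RECVMSG,  /* next record is not application data */
    SPLICE_FATAL              /* genuine error, give up on the connection */
};

/* Map the errno value left by splice() to the action to take. */
static enum splice_verdict classify_splice_errno(int err)
{
    switch (err) {
    case EAGAIN:
    case EINTR:
        return SPLICE_RETRY_LATER;
    case EINVAL:
        /* Per the commit above: read this record with recvmsg(),
         * splicing may be resumed afterwards.
         */
        return SPLICE_FALLBACK_RECVMSG;
    default:
        return SPLICE_FATAL;
    }
}
```

A caller would typically invoke such a helper right after `splice()` returns -1, passing the saved `errno`.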
Olivier Houchard	ed7d20afc8	MEDIUM: ssl: Add kTLS support for OpenSSL. Modify the SSL code to enable kTLS with OpenSSL. It mostly requires our internal BIO to be able to handle the various kTLS-specific controls in ha_ssl_ctrl(), as well as being able to use recvmsg() and sendmsg() from ha_ssl_read() and ha_ssl_write().	2025-08-20 18:33:11 +02:00
Olivier Houchard	6270073072	MINOR: build: Add a new option, USE_KTLS. Add a new define, USE_KTLS, that enables using kTLS in haproxy. It will only work for Linux with a kernel >= 4.17.	2025-08-20 18:33:11 +02:00
Olivier Houchard	7836fe8fe3	MINOR: ssl: Define HAVE_VANILLA_OPENSSL if openssl is used. If we're using OpenSSL as our crypto library, add a define, HAVE_VANILLA_OPENSSL, to make it easier to differentiate between the various crypto libs.	2025-08-20 18:33:10 +02:00
Olivier Houchard	e8674658ae	MINOR: cfgparse: Add a new "ktls" option to bind and server. Add a new "ktls" option to bind and server. Valid values are "on" and "off". It currently does nothing, but when kTLS will be implemented, it will enable or disable kTLS for the corresponding sockets. It is marked as experimental for now.	2025-08-20 18:33:10 +02:00
Olivier Houchard	075e753802	MEDIUM: mux_h1/mux_pt: Use XPRT_CAN_SPLICE to decide if we should splice In both mux_h1 and mux_pt, use the new XPRT_CAN_SPLICE capability to decide if we should attempt to use splicing or not. If we receive XPRT_CONN_CAN_MAYBE_SPLICE, add a new flag on the connection, CO_FL_WANT_SPLICING, to let the xprt know that we'd love to be able to do splicing, so that it may get ready for that. This should have no effect right now, and is required work for adding kTLS support.	2025-08-20 18:33:10 +02:00
Olivier Houchard	5731b8a19c	MEDIUM: xprt: Add a "get_capability" method. Add a new method to xprts, get_capability, that can be used to query if an xprt supports something or not. The first capability implemented is XPRT_CAN_SPLICE, to know if the xprt will be able to use splicing for the provided connection. The possible answers are XPRT_CONN_CAN_NOT_SPLICE, which indicates splicing will never be possible for that connection, XPRT_CONN_COULD_SPLICE, which indicates that splicing is not usable right now, but may be in the future, and XPRT_CONN_CAN_SPLICE, that means we can splice right away.	2025-08-20 18:33:10 +02:00
Olivier Houchard	2623b7822e	MINOR: ssl: Add a "flags" field to ssl_sock_ctx. Instead of adding more separate fields in ssl_sock_ctx, add a "flags" one. Convert the "can_send_early_data" to the flag SSL_SOCK_F_EARLY_ENABLED. More flags will be added for kTLS support.	2025-08-20 17:28:03 +02:00
Olivier Houchard	3d685fcb7d	MINOR: xprt: Add recvmsg() and sendmsg() parameters to rcv_buf() and snd_buf(). In rcv_buf() and snd_buf(), use sendmsg/recvmsg instead of send and recv, and add two new optional parameters to provide msg_control and msg_controllen. Those are unused for now, but will be used later for kTLS.	2025-08-20 17:28:03 +02:00
William Lallemand	67cb6aab90	BUG/MEDIUM: mworker: more verbose error upon loading failure When a worker crashes during its configuration parsing and without emitting any messages, the master will emit the message "Failed to load worker!". However that gives us neither the PID of the worker, nor the status code. This patch fixes the problem by emitting a more verbose error. Must be backported as far as 3.1.	2025-08-20 17:15:52 +02:00
Frederic Lecaille	ca5511f022	BUG/MEDIUM: quic-be: do not initialize ->conn too early This bug arrived with this commit: BUG/MEDIUM: quic: do not release BE quic-conn prior to upper conn which added a BUG_ON(qc->conn) statement at the beginning of quic_conn_release(). It is triggered if the connection is not released before releasing the quic_conn. But this is always the case for a backend quic_conn when its allocation from qc_new_conn() fails. Such crashes could be reproduced with -dMfail option. To reach them, the memory allocations must fail. So, this is relatively rare, except on systems with limited memory. To fix this, simply set ->conn quic_conn struct member to a not null value (the one passed as parameter) after the quic_conn allocation has succeeded. No backport needed.	2025-08-20 16:25:51 +02:00
Frederic Lecaille	8514647849	BUG/MEDIUM: quic: crash after quic_conn allocation failures This regression arrived with this commit: MINOR: quic-be: QUIC connection allocation adaptation (qc_new_conn()) where qc_new_conn() was modified. The ->cids allocation was moved without checking if a quic_conn_release() call could lead to crashes due to uninitialized quic_conn members. Indeed, if qc_new_conn() fails, then quic_conn_release() is called. This bug could impact both QUIC servers and clients. Such crashes could be reproduced with -dMfail option. To reach them, the memory allocations must fail. So, this is relatively rare, except on systems with limited memory. This patch ensures all the quic_conn members which could lead to a crash from quic_conn_release() are initialized before any remaining memory allocations required for the quic_conn. The <conn_id> variable allocated by the client is no longer attached to the connection during its allocation, but after the ->cids tree is allocated. No backport needed.	2025-08-20 16:25:51 +02:00
Christopher Faulet	c6c2ef1f11	BUG/MEDIUM: cli: Report inbuf is no longer full when a line is consumed When the command line parsing was refactored (20ec1de21 "MAJOR: cli: Refacor parsing and execution of pipelined commands"), a regression was introduced. When input data are consumed, information about the applet's input buffer is no longer updated to state that it is no longer full. So it is possible to freeze the CLI applet. And a spinning loop may be encountered if a client shutdown is detected in this state. The fix is obvious. When data are consumed from the applet's input buffer, APPCTX_FL_INBLK_FULL flag is removed to notify the input buffer is no longer full and more data can be sent to the CLI applet. This patch should fix the issue #3064. It must be backported to 3.2.	2025-08-20 16:01:50 +02:00
Christopher Faulet	dc6e8dde23	BUG/MINOR: spoe: Properly detect and skip empty NOTIFY frames Since the SPOE was refactored, the detection of empty NOTIFY frames is broken. So it is possible to send a NOTIFY frame to an agent with no message at all. The bug happens because the frame type is now added to the buffer before the messages encoding. So the buffer is never really empty. To fix the issue, the condition to detect empty frame was adapted. This patch must be backported as far as 3.1.	2025-08-20 16:01:50 +02:00
Valentine Krasnobaeva	2c7e05f80e	MEDIUM: dns: don't call connect to dest socket for AF_INET* When we call connect() on a datagram socket used to send DNS requests, we set its default destination address to the given nameserver. Then we simply use send(), as the destination address is already set. In some use cases described in GitHub issues #3001 and #2654, this approach becomes inefficient: when nameservers change their IP addresses dynamically, this triggers DNS resolution errors. To fix this, let's perform the bind() on the wildcard address for the datagram AF_INET* client socket, which allocates a port for it. Then let's use sendto() instead of send(). If the nameserver is local and is listening on a UNIX domain socket, we continue to use the existing approach (connect() and then send()). This fixes issues #3001 and #2654. This may be backported to all stable versions.	2025-08-19 11:26:02 +02:00
Amaury Denoyelle	8ac54cafcd	BUG/MINOR: mux-h1: fix wrong lock label Wrong lock label is used when manipulating idle lock on h1_timeout_task. Fix this by replacing OTHER_LOCK by IDLE_CONNS_LOCK. This only concerns thread debugging statistics. This must be backported up to 2.4.	2025-08-14 16:31:25 +02:00
Frederic Lecaille	878a72d001	BUG/MEDIUM: quic: listener connection stuck during handshakes (OpenSSL 3.5) This issue was reported in GH #3071 by @famfo where a wireshark capture reveals that some handshakes could not complete after having received two Initial packets. This could happen when the packets were parsed in two passes, calling qc_ssl_provide_all_quic_data() twice. This is due to the crypto data stream counter, which was incremented twice from qc_ssl_provide_all_quic_data() (see cstream->rx.offset += data statement around line 1223 in quic_ssl.c): once by the callback which "receives" the crypto data, and once by qc_ssl_provide_all_quic_data(). Then when parsing the second crypto data frame, the parser detected that the crypto data were already provided. To fix this, one could comment out the code which increments the crypto data stream counter by <data>. That said, when using the OpenSSL 3.5 QUIC API one should not modify the crypto data stream outside of the OpenSSL 3.5 QUIC API. So, this patch stops calling qc_ssl_provide_all_quic_data() and qc_ssl_provide_quic_data() and only calls qc_ssl_do_hanshake() after having received some crypto data. In addition to this, as these functions are no longer called when building haproxy against OpenSSL 3.5, this patch disables their compilation (with #ifndef HAVE_OPENSSL_QUIC). This patch depends on this previous one: MINOR: quic: implement qc_ssl_do_hanshake() Thank you to @famto for this report. Must be backported to 3.2.	2025-08-14 14:54:47 +02:00
Frederic Lecaille	a874821df3	MINOR: quic: implement qc_ssl_do_hanshake() Extract the code related to the SSL handshake API (SSL_do_handshake()...) from qc_ssl_provide_quic_data() to implement qc_ssl_do_handshake().	2025-08-14 14:54:47 +02:00
Tim Duesterhus	b81a7f428b	CI: Update to actions/checkout@v5 No functional change, but we should keep this current. see 5f4ddb54b05ae0355b1f64c22263a6bc381410df see 5c923f1869881156bf3a25c9659655ae10f7dbd0	2025-08-13 19:15:04 +02:00
Willy Tarreau	a7f8693fa2	MEDIUM: ring: always allocate properly aligned ring structures The rings were manually padded to place the various areas that compose them into different cache lines, provided that the allocator returned a cache-aligned address, which until now was not granted. By now switching to the aligned API we can finally have this guarantee and hope for more consistent ring performance between tests. Like previously the few carefully crafted THREAD_PAD() could simply be replaced by generic THREAD_ALIGN() that dictate the type's alignment. This was the last user of THREAD_PAD() by the way.	2025-08-13 17:47:39 +02:00
Willy Tarreau	cfdab917fe	MINOR: server: align server struct to 64 bytes Several times recently, it was noticed that some benchmarks would highly vary depending on the position of certain fields in the server struct, and this could even vary between runs. The server struct does have separate areas depending on the user cases and hot/cold aspect of the members stored there, but the areas are artificially kept apart using fixed padding instead of real alignment, which has the first sad effect of artificially inflating the struct, and the second one of misaligning it. Now that we have all the necessary tools to keep them aligned, let's just do it. The struct has shrunk from 4160 to 4032 bytes on 64-bit systems, 152 of which are still holes or padding.	2025-08-13 17:37:11 +02:00
Willy Tarreau	a469356268	MEDIUM: server: introduce srv_alloc()/srv_free() to alloc/free a server It happens that we free servers at various places in the code, both on error paths and at runtime thanks to the "server delete" feature. In order to switch to an aligned struct, we'll need to change the calloc() and free() calls. Let's first spot them and switch them to srv_alloc() and srv_free() instead of using calloc() and either free() or ha_free(). An easy trap to fall into is that some of them are default-server entries. The new srv_free() function also resets the pointer like ha_free() does. This was done by running the following coccinelle script all over the code: @@ struct server srv; @@ ( - free(srv) + srv_free(&srv) \| - ha_free(&srv) + srv_free(&srv) ) @@ struct server srv; expression e1; expression e2; @@ ( - srv = malloc(e1) + srv = srv_alloc() \| - srv = calloc(e1, e2) + srv = srv_alloc() ) This is marked medium because despite spotting all call places, we can never rule out the possibility that some out-of-tree patches would allocate their own servers and continue to use the old API... at their own risk.	2025-08-13 17:37:11 +02:00
Willy Tarreau	33d72568dd	MINOR: tools: also implement ha_aligned_alloc_typed() This one is a macro and will allocate a properly aligned and sized object. This will help make sure that the alignment promised to the compiler is respected. When memstats is used, the type name is passed as a string into the .extra field so that it can be displayed in "debug dev memstats". Two tiny mistakes related to memstats macros were also fixed (calloc instead of malloc for zalloc), and the doc was also added to document how to use these calls.	2025-08-13 17:37:08 +02:00
Lukas Tribus	9432e7d688	DOC: config: recommend single quoting passwords Suggests single quoting passwords and update examples to avoid unexpected behaviors due to special characters. Should be backported to stable versions. Link: https://discourse.haproxy.org/t/enhance-documentation-for-insecure-passwords-and-invald-characters/11959	2025-08-13 09:08:25 +02:00
Lukas Tribus	faacc6c084	DOC: management: fix typo in commit f4f93c56 Fixes a small typo in commit f4f93c56 ("DOC: management: clarify usage of -V with -c"). Must be backported as far as 2.8 along commit f4f93c56.	2025-08-13 09:08:25 +02:00
Willy Tarreau	1bb9754648	OPTIM: server: start to use aligned allocs in server This is currently for per-thread arrays like idle conns etc. We're now cache-aligning the per-thread arrays so as to put an end to false sharing. A comparative test between no alignment and alignment on a simple config with round robin between 4 servers showed an average rate of 1.75M/s vs 1.72M/s before for 100M requests. The gain seems to be more commonly less than 1% however. This should mostly help make measurements more reproducible across multiple runs.	2025-08-11 19:55:30 +02:00
Willy Tarreau	c2687f587e	OPTIM: connection: align connection pools to 64 The struct connection is used a lot by the muxes during many operations, particularly at the beginning of the struct (flags, ctrl, xprt and mux). We definitely want this one not to be falsely shared with another thread, so let's align the pools to a cache line.	2025-08-11 19:55:30 +02:00
Willy Tarreau	d6095fcfe6	OPTIM: queue: align the pendconn pools to 64 This is in order to limit false sharing, because this element is already ultra-sensitive to sharing and we'd rather limit it as much as possible.	2025-08-11 19:55:30 +02:00
Willy Tarreau	77335f52fc	OPTIM: buffers: align the buffer pool to 64 This struct is used by memcpy() and friends, particularly during the early recv() and send(). By keeping it 64-byte aligned, we let the underlying libs/kernel use optimal operations (e.g. AVX512) for memory copies while right now it's just random (buffers are found to be equally aligned to 32 and 64 in practice).	2025-08-11 19:55:30 +02:00
Willy Tarreau	c471de7964	OPTIM: tasks: align task and tasklet pools to 64 These structs are intensively used and really must not experience false sharing, so let's declare them aligned to 64. We don't try to align the struct themselves, as we don't want the compiler to expand them either.	2025-08-11 19:55:30 +02:00
Willy Tarreau	c264ea1679	MEDIUM: tree-wide: replace most DECLARE_POOL with DECLARE_TYPED_POOL This will make the pools size and alignment automatically inherit the type declaration. It was done like this: sed -i -e 's:DECLARE_POOL(\([^,]*,[^,]*,\s*\)sizeof(\([^)]*\))):DECLARE_TYPED_POOL(\1\2):g' $(git grep -lw DECLARE_POOL src addons) sed -i -e 's:DECLARE_STATIC_POOL(\([^,]*,[^,]*,\s*\)sizeof(\([^)]*\))):DECLARE_STATIC_TYPED_POOL(\1\2):g' $(git grep -lw DECLARE_STATIC_POOL src addons) 81 replacements were made. The only remaining ones are those which set their own size without depending on a structure. The few ones with an extra size were manually handled. It also means that the requested alignments are now checked against the type's. Given that none is specified for now, no issue is reported. It was verified with "show pools detailed" that the definitions are exactly the same, and that the binaries are similar.	2025-08-11 19:55:30 +02:00
Willy Tarreau	977feb5617	DOC: api: update the pools API with the alignment and typed declarations This adds the DECLARE_ALIGNED() and DECLARE_TYPED() macros.	2025-08-11 19:55:30 +02:00
Willy Tarreau	6be7b64bb4	MINOR: pools: always check that requested alignment matches the type's For pool registrations that are created from the type declaration, we now have the ability to verify that the requested alignment matches the type's one. Let's not miss this opportunity, as we've met bugs in the past that were caused by such mismatches. The principle is simple: if the type alignment is known, we check that the configured alignment is at least as large as that one otherwise we refuse to start (since the code may crash at any moment). Obviously it doesn't crash for now!	2025-08-11 19:55:30 +02:00
Willy Tarreau	e21bb531ca	MINOR: pools: permit to optionally specify extra size and alignment The common macros REGISTER_TYPED_POOL(), DECLARE_TYPED_POOL() and DECLARE_STATIC_TYPED_POOL() will now take two optional arguments, one being the extra size to be added to the structure, and a second one being the desired alignment to enforce. This will permit to specify alignments larger than the default ones promised to the compiler.	2025-08-11 19:55:30 +02:00
Willy Tarreau	d240f387ca	MINOR: pools: distinguish the requested alignment from the type-specific one We're letting users request an alignment but that can violate one imposed by a type, especially if we start seeing REGISTER_TYPED_POOL() grow in adoption, encouraging users to specify alignment on their types. On the other hand, if we ask the user to always specify the alignment, no control is possible and the error is easy. Let's have a second field in the pool registration, for the type-specific one. We'll set it to zero when unknown, and to the type's alignment when known. This way it will become possible to compare them at startup time to detect conflicts. For now no macro permits to set both separately so this is not visible.	2025-08-11 19:55:30 +02:00
Willy Tarreau	5e2837cfb4	CLEANUP: fd: make use of ha_aligned_alloc() for the fdtab We've forcefully aligned the fdtab in commit 97ea9c49f1 ("BUG/MEDIUM: fd: always align fdtab[] to 64 bytes"), but now we don't need such hacks anymore thanks to ha_aligned_alloc(). Let's use it and get rid of fdtab_addr.	2025-08-11 19:55:30 +02:00
Willy Tarreau	746e77d000	MINOR: tools: implement ha_aligned_zalloc() This one is exactly ha_aligned_alloc() followed by a memset(0), as it will be convenient for a number of call places as a replacement for calloc(). Note that ideally we should also have a calloc version that performs basic multiply overflow checks, but these are essentially used with numbers of threads times small structs so that's fine, and we already do the same everywhere in malloc() calls.	2025-08-11 19:55:30 +02:00
William Lallemand	55d561042c	MEDIUM: ssl/cli: relax crt insertion in crt-list of type directory In previous versions of haproxy, insertions of certificates in a crt-list from the CLI would require to have the path of the directory, in the path of the certificate. This would help avoiding that the certificate wasn't loaded upon a reload because it is not at the right place. However, since version 3.0 and crt-store, the name stored in the tree could be an alias and not a path, so that does not make sense anymore. Even though path would be right, the check is not right anymore in this case. The tool or user inserting the certificate must now check itself that the certificate was placed at the right spot on the filesystem. Reported in issue #3053. Could be backported as far as haproxy 3.0.	2025-08-11 17:42:16 +02:00
William Lallemand	f4f93c56c1	DOC: management: clarify usage of -V with -c In ticket #3065 a user complained that no success message is printed anymore when using -c. The message does not appear by default since version 2.9. This patch clarifies the documentation. Must be backported as far as 2.8.	2025-08-11 16:23:00 +02:00
Remi Tricot-Le Breton	15ee49e822	BUG/MINOR: init: Initialize random seed earlier in the init process The random seed used in ha_random functions needs to be first initialized by calling ha_random_boot. This function was called rather late in the init process, after the init functions (INITCALLS) are called and after the configuration parsing for instance which means that any ha_random call in an init function would return 0. This was the case in 'vars_init' and 'cache_init' which tried to build seeds for specific hash calculations but ended up not being seeded. This patch can be backported on all stable branches.	2025-08-11 16:02:41 +02:00
William Lallemand	84589a9f48	MEDIUM: acme: use lowercase for challenge names in configuration Both the RFC and the IANA registry refer to challenge names in lowercase. If we need to implement more challenges, it's better to use the correct naming. In order to keep the compatibility with the previous configurations, the parsing does a strcasecmp() instead of a strcmp(). Also rename every occurrence in the code and doc in lowercase. This was discussed in issue #1864	2025-08-11 15:09:18 +02:00
Olivier Houchard	b6702d5342	BUG/MEDIUM: ssl: fix build with AWS-LC AWS-LC doesn't provide SSL_in_before(), and doesn't provide an easy way to know if we already started the handshake or not. So instead, just add a new field in ssl_sock_ctx, "can_write_early_data", that will be initialized to 1, and will be set to 0 as soon as we start the handshake. This should be backported up to 2.8 with 13aa5616c9f99dbca0711fd18f716bd6f48eb2ae.	2025-08-08 20:21:14 +02:00
Olivier Houchard	13aa5616c9	BUG/MEDIUM: ssl: Fix 0rtt to the server In order to send early data, we have to make sure no handshake has been initiated at all. To do that, we remove the CO_FL_SSL_WAIT_HS flag, so that we won't attempt to start a handshake. However, by removing those flags, we allow ssl_sock_to_buf() to call SSL_read(), as it's no longer aware that no handshake has been done, and SSL_read() will begin the handshake, thus preventing us from sending early data. The fix is to just call SSL_in_before() to check if no handshake has been done yet, in addition to checking CO_FL_SSL_WAIT_HS (both are needed, as CO_FL_SSL_WAIT_HS may come back in case of renegotiation). In ssl_sock_from_buf(), fix the check to see if we may attempt to send early data. Use SSL_in_before() instead of SSL_is_init_finished(), as SSL_is_init_finished() will return 1 if the handshake has been started, but not terminated, and if the handshake has been started, we can no longer send early data. This fixes errors when attempting to send early data (as well as actually sending early data). This should be backported up to 2.8.	2025-08-08 19:13:37 +02:00
Ilia Shipitsin	c10e8401e2	CI: vtest: add Ubuntu arm64 builds Reference: https://github.com/actions/partner-runner-images Since GHA now supports arm64 as well, let's add those builds. We will start with ASAN builds; others will be added later if required.	2025-08-08 15:36:11 +02:00
Ilia Shipitsin	6b2bbcb428	CI: vtest: add os name to OT cache key currently OpenTracing cache does not include os name. it does not allow to distinguish, for example between ubuntu-24.04 and ubuntu-24.04-arm.	2025-08-08 15:36:12 +02:00
David Carlier	7fe8989fbb	MINOR: sock: update broken accept4 detection for older hardwares. Some older ARM embedded settings set errno to EPERM instead of ENOSYS for missing implementations (e.g. Freescale ARM 2.6.35)	2025-08-08 06:01:18 +02:00
Valentine Krasnobaeva	21d5f43aa6	BUG/MINOR: stick-table: cap sticky counter idx with tune.nb_stk_ctr instead of MAX_SESS_STKCTR Cap sticky counter index with tune.nb_stk_ctr instead of MAX_SESS_STKCTR for sc-add-gpc. The same logic is already implemented for the sc-inc-gpc and sc-set-gpt keywords, so it seems to have been missed for sc-add-gpc. This fixes the issue #3061 reported at GitHub. Thanks to @ma311 for reporting their analysis of the issue. This should be backported to all versions down to and including 2.8.	2025-08-08 05:26:30 +02:00
Aurelien DARRAGON	7656a41784	BUILD: restore USE_SHM_OPEN build option Some optional features may still require the use of shm_open() in the future. In this patch we restore the USE_SHM_OPEN build option that was removed in 143be1b59 ("MEDIUM: errors: get rid of shm_open()") and should guard the use of shm_open() in the code.	2025-08-07 22:27:22 +02:00
Aurelien DARRAGON	bcb124f92a	MINOR: init: add REGISTER_POST_DEINIT_MASTER() hook Similar to REGISTER_POST_DEINIT() hook (which is invoked during deinit) but for master process only, when haproxy was started in master-worker mode. The goal is to be able to register cleanup functions that will only run for the master process right before exiting.	2025-08-07 22:27:14 +02:00
Aurelien DARRAGON	c8282f6138	MINOR: clock: add clock_get_now_offset() helper Same as clock_set_now_offset() but to retrieve the offset from external location.	2025-08-07 22:27:09 +02:00
Aurelien DARRAGON	20f9d8fa4e	MINOR: clock: add clock_set_now_offset() helper Since now_offset is a static variable and is not exposed outside from clock.c, let's add an helper so that it becomes possible to set its value from another source file.	2025-08-07 22:27:05 +02:00
Aurelien DARRAGON	4c3a36c609	MINOR: guid: add guid_count() function Returns the total number of registered GUIDs in the guid_tree.	2025-08-07 22:26:58 +02:00
Aurelien DARRAGON	7c52964591	MINOR: guid: add guid_get() helper guid_get() is a convenient function to get the actual key string associated to a given guid_node struct	2025-08-07 22:26:52 +02:00
Aurelien DARRAGON	3759172015	BUG/MINOR: proxy: avoid NULL-deref in post_section_px_cleanup() post_section_px_cleanup(), which was implemented in abcc73830 ("MEDIUM: proxy: register a post-section cleanup function"), is called for the current section no matter if the parsing was aborted due to a fatal error. In this case, the curproxy pointer may point to NULL, yet post_section_px_cleanup() assumes curproxy pointer is always valid, which could lead to NULL-deref. For instance, the config below will cause SEGFAULT: listen toto titi To fix the issue, let's simply consider that the curproxy pointer may be NULL in post_section_px_cleanup(), in which case we skip the cleanup for the curproxy since there is nothing we can do. No backport needed	2025-08-07 22:26:47 +02:00
Aurelien DARRAGON	833158f9e0	BUG/MINOR: cfgparse-listen: update err_code for fatal error on proxy directive When improper arguments are provided on proxy directive (listen, frontend or backend), such alert may be emitted: "please use the 'bind' keyword for listening addresses" This was introduced in 6e62fb6405 ("MEDIUM: cfgparse: check section maximum number of arguments"). However, despite the error being reported as alert, the err_code isn't updated accordingly, which could make the upper parser think there was no error, while it isn't the case. In practice, since the proxy directive is ignored, the following proxy-related directives should raise errors, so this didn't cause much harm, yet better fix that. It could be backported to all stable versions.	2025-08-07 22:26:42 +02:00
Aurelien DARRAGON	525750e135	BUG/MINOR: cfgparse: immediately stop after hard error in srv_init() Since 368d01361 (" MEDIUM: server: add and use srv_init() function"), in case of srv_init() error, we simply increment cfgerr variable and keep going. It isn't enough, some treatment occurring later in check_config_validity() assumes that srv_init() succeeded for servers, and may cause undefined behavior. To fix the issue, let's consider that if (srv_init() & ERR_CODE) returns true, then we must stop checking the config immediately. No backport needed unless 368d01361 is.	2025-08-07 22:26:37 +02:00
Amaury Denoyelle	731b52ded9	MINOR: quic: prefer qc_is_back() usage over qc->target Previously quic_conn <target> member was used to determine if quic_conn was used on the frontend (as server) or backend side (as client). A new helper function can now be used to directly check flag QUIC_FL_CONN_IS_BACK. This reduces the dependency between quic_conn and their relative listener/server instances.	2025-08-07 16:59:59 +02:00
Amaury Denoyelle	cae828cbf5	MINOR: quic: define QUIC_FL_CONN_IS_BACK flag Define a new quic_conn flag assigned if the connection is used on the backend side. This is similar to other haproxy components such as struct connection and muxes element. This flag is positioned via qc_new_conn(). Also update quic traces to mark proxy side as 'F' or 'B' suffix.	2025-08-07 16:59:59 +02:00
Amaury Denoyelle	e064e5d461	MINOR: quic: duplicate GSO unsupp status from listener to conn QUIC emission can use GSO to emit multiple datagrams with a single syscall invocation. However, this feature relies on several kernel parameters which are checked on haproxy process startup. Even if these checks report no issue, GSO may still be unavailable due to the underlying network adapter. Thus, if an EIO occurred on sendmsg() with GSO, listener is flagged to mark GSO as unsupported. This allows every other QUIC connections to share the status and avoid using GSO when using this listener. Previously, listener flag was checked for every QUIC emission. This was done using an atomic operation to prevent races. Improve this by duplicating GSO unsupported status at the connection level. This is done on qc_new_conn() and also on thread rebinding if a new listener instance is used. The main benefit from this patch is to reduce the dependency between quic_conn and listener instances.	2025-08-07 16:36:26 +02:00
Willy Tarreau	d76ee72d03	[RELEASE] Released version 3.3-dev6 Released version 3.3-dev6 with the following main changes : - MINOR: acme: implement traces - BUG/MINOR: hlua: take default-path into account with lua-load-per-thread - CLEANUP: counters: rename counters_be_shared_init to counters_be_shared_prepare - MINOR: clock: make global_now_ms a pointer - MINOR: clock: make global_now_ns a pointer as well - MINOR: mux-quic: release conn after shutdown on BE reuse failure - MINOR: session: strengthen connection attach to session - MINOR: session: remove redundant target argument from session_add_conn() - MINOR: session: strengthen idle conn limit check - MINOR: session: do not release conn in session_check_idle_conn() - MINOR: session: streamline session_check_idle_conn() usage - MINOR: muxes: refactor private connection detach - BUG/MEDIUM: mux-quic: ensure Early-data header is set - BUILD: acme: avoid declaring TRACE_SOURCE in acme-t.h - MINOR: acme: emit a log for DNS-01 challenge response - MINOR: acme: emit the DNS-01 challenge details on the dpapi sink - MEDIUM: acme: allow to wait and restart the task for DNS-01 - MINOR: acme: update the log for DNS-01 - BUG/MINOR: acme: possible integer underflow in acme_txt_record() - BUG/MEDIUM: hlua_fcn: ensure systematic watcher cleanup for server list iterator - MINOR: sample: Add le2dec (little endian to decimal) sample fetch - BUILD: fcgi: fix the struct name of fcgi_flt_ctx - BUILD: compat: provide relaxed versions of the MIN/MAX macros - BUILD: quic: use _MAX() to avoid build issues in pools declarations - BUILD: compat: always set _POSIX_VERSION to ease comparisons - MINOR: implement ha_aligned_alloc() to return aligned memory areas - MINOR: pools: support creating a pool from a pool registration - MINOR: pools: add a new flag to declare static registrations - MINOR: pools: force the name at creation time to be a const. 
- MEDIUM: pools: change the static pool creation to pass a registration - DEBUG: pools: store the pool registration file name and line number - DEBUG: pools: also retrieve file and line for direct callers of create_pool() - MEDIUM: pools: add an alignment property - MINOR: pools: add macros to register aligned pools - MINOR: pools: add macros to declare pools based on a struct type - MEDIUM: pools: respect pool alignment in allocations	2025-08-06 21:50:00 +02:00
Willy Tarreau	ef915e672a	MEDIUM: pools: respect pool alignment in allocations Now pool_alloc_area() takes the alignment in argument and makes use of ha_aligned_malloc() instead of malloc(). pool_alloc_area_uaf() simply applies the alignment before returning the mapped area. The pool_free() function calls ha_aligned_free() so as to permit to use a specific API for aligned alloc/free like mingw requires. Note that it's possible to see warnings about mismatching sizes during pool_free() since we know both the pool and the type. In pool_free, adding just this is sufficient to detect potential offenders: WARN_ON(__alignof__(*__ptr) > pool->align);	2025-08-06 19:20:36 +02:00
Willy Tarreau	f0d0922aa1	MINOR: pools: add macros to declare pools based on a struct type DECLARE_TYPED_POOL() and friends take a name, a type and an extra size (to be added to the size of the element), and will use this to create the pool. This has the benefit of letting the compiler automatically adapt sizeof() and alignof() based on the type declaration.	2025-08-06 19:20:36 +02:00
Willy Tarreau	6ea0e3e2f8	MINOR: pools: add macros to register aligned pools This adds an alignment argument to create_pool_from_loc() and completes the existing low-level macros with new ones that expose the alignment and the new macros permit to specify it. For now they're not used.	2025-08-06 19:20:36 +02:00
Willy Tarreau	eb075d15f6	MEDIUM: pools: add an alignment property This will be used to declare aligned pools. For now it's not used, but it's properly set from the various registrations that compose a pool, and rounded up to the next power of 2, with a minimum of sizeof(void*). The alignment is returned in the "show pools" part that indicates the entry size. E.g. "(56 bytes/8)" means 56 bytes, aligned by 8.	2025-08-06 19:20:36 +02:00
Willy Tarreau	ac23b873f5	DEBUG: pools: also retrieve file and line for direct callers of create_pool() Just like previous patch, we want to retrieve the location of the caller. For this we turn create_pool() into a macro that collects __FILE__ and __LINE__ and passes them to the now renamed function create_pool_with_loc(). Now the remaining ~30 pools also have their location stored.	2025-08-06 19:20:34 +02:00
Willy Tarreau	efa856a8b0	DEBUG: pools: store the pool registration file name and line number When pools are declared using DECLARE_POOL(), REGISTER_POOL etc, we know where they are and it's trivial to retrieve the file name and line number, so let's store them in the pool_registration, and display them when known in "show pools detailed".	2025-08-06 19:20:32 +02:00
Willy Tarreau	ff62aacb20	MEDIUM: pools: change the static pool creation to pass a registration Now we're creating statically allocated registrations instead of passing all the parameters and allocating them on the fly. Not only this is simpler to extend (we're limited in number of INITCALL args), but it also leaves all of these in the data segment where they are easier to find when debugging.	2025-08-06 19:20:30 +02:00
Willy Tarreau	f51d58bd2e	MINOR: pools: force the name at creation time to be a const. This is already the case as all names are constant so that's fine. If it would ever change, it's not very hard to just replace it in-situ via an strdup() and set a flag to mention that it's dynamically allocated. We just don't need this right now. One immediately visible effect is in "show pools detailed" where the names are no longer truncated.	2025-08-06 19:20:28 +02:00
Willy Tarreau	ee5bc28865	MINOR: pools: add a new flag to declare static registrations We must not free these ones when destroying a pool, so let's dedicate them a flag to mention that they are static. For now we don't have any such.	2025-08-06 19:20:26 +02:00
Willy Tarreau	18505f9718	MINOR: pools: support creating a pool from a pool registration We've recently introduced pool registrations to be able to enumerate all pool creation requests with their respective parameters, but till now they were only used for debugging ("show pools detailed"). Let's go a step further and split create_pool() in two: - the first half only allocates and sets the pool registration - the second half creates the pool from the registration This is what this patch does. This now opens the ability to pre-create registrations and create pools directly from there.	2025-08-06 19:20:22 +02:00
Willy Tarreau	325d1bdcca	MINOR: implement ha_aligned_alloc() to return aligned memory areas We have two versions, _safe() which verifies and adjusts alignment, and the regular one which trusts the caller. There's also a dedicated ha_aligned_free() due to mingw. The currently detected OSes are mingw, unixes older than POSIX 200112 which require memalign(), and those post 200112 which will use posix_memalign(). Solaris 10 reports 200112 (probably through _GNU_SOURCE since it does not do it by default), and Solaris 11 still supports memalign() so for all Solaris we use memalign(). The memstats wrappers are also implemented, and have the exported names. This was the opportunity for providing a separate free call that lets the caller specify the size (e.g. for use with pools). For now this code is not used.	2025-08-06 19:19:27 +02:00
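The "safe" variant described above can be pictured with the following minimal sketch. It is not HAProxy's ha_aligned_alloc(): the `sketch_` names are hypothetical, and it only covers the posix_memalign() case, assuming a POSIX.1-2001 (or later) system.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: round <align> up to a power of two that is at least
 * sizeof(void *), which is what posix_memalign() requires.
 */
static size_t sketch_fix_alignment(size_t align)
{
	size_t a = sizeof(void *);

	while (a < align)
		a <<= 1;
	return a;
}

/* Hypothetical "safe" allocator: adjusts the alignment before allocating.
 * Returns NULL on failure; the area is released with free() here.
 */
static void *sketch_aligned_alloc_safe(size_t align, size_t size)
{
	void *ptr = NULL;

	assert(size > 0);
	if (posix_memalign(&ptr, sketch_fix_alignment(align), size) != 0)
		return NULL;
	return ptr;
}
```

A caller asking for a 48-byte alignment would get a 64-byte aligned area in this sketch, since 48 is rounded up to the next power of two.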
Willy Tarreau	e921fe894f	BUILD: compat: always set _POSIX_VERSION to ease comparisons Sometimes we need to compare it to known versions, let's make sure it's always defined. We set it to zero if undefined so that it cannot match any comparison.	2025-08-06 19:19:27 +02:00
Willy Tarreau	2ce0c63206	BUILD: quic: use _MAX() to avoid build issues in pools declarations With the upcoming pool declaration, we're filling a struct's fields, while older versions were relying on initcalls which could be turned to function declarations. Thus the compound expressions that were usable there are not necessarily anymore, as witnessed here with gcc-5.5 on solaris 10: In file included from include/haproxy/quic_tx.h:26:0, from src/quic_tx.c:15: include/haproxy/compat.h:106:19: error: braced-group within expression allowed only inside a function #define MAX(a, b) ({ \ ^ include/haproxy/pool.h:41:11: note: in definition of macro '__REGISTER_POOL' .size = _size, \ ^ ... include/haproxy/quic_tx-t.h:6:29: note: in expansion of macro 'MAX' #define QUIC_MAX_CC_BUFSIZE MAX(QUIC_INITIAL_IPV6_MTU, QUIC_INITIAL_IPV4_MTU) Let's make the macro use _MAX() instead of MAX() since it relies on pure constants.	2025-08-06 19:19:11 +02:00
Willy Tarreau	cf8871ae40	BUILD: compat: provide relaxed versions of the MIN/MAX macros In 3.0 the MIN/MAX macros were converted to compound expressions with commit 0999e3d959 ("CLEANUP: compat: make the MIN/MAX macros more reliable"). However with older compilers these are not supported out of code blocks (e.g. to initialize variables or struct members). This is the case on Solaris 10 with gcc-5.5 when QUIC doesn't compile anymore with the future pool registration: In file included from include/haproxy/quic_tx.h:26:0, from src/quic_tx.c:15: include/haproxy/compat.h:106:19: error: braced-group within expression allowed only inside a function #define MAX(a, b) ({ \ ^ include/haproxy/pool.h:41:11: note: in definition of macro '__REGISTER_POOL' .size = _size, \ ^ ... include/haproxy/quic_tx-t.h:6:29: note: in expansion of macro 'MAX' #define QUIC_MAX_CC_BUFSIZE MAX(QUIC_INITIAL_IPV6_MTU, QUIC_INITIAL_IPV4_MTU) Let's provide the old relaxed versions as _MIN/_MAX for use with constants like such cases where it's certain that there is no risk. A previous attempt using __builtin_constant_p() to switch between the variants did not work, and it's really not worth the hassle of going this far.	2025-08-06 19:18:42 +02:00
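The difference between the two macro flavours can be sketched as follows. All `SKETCH_`/`sketch_` names and the MTU values are illustrative assumptions, not taken from the HAProxy sources; the point is only that a plain conditional expression is a constant expression and may initialize a struct member at file scope, which a braced-group (`({ ... })`) cannot.

```c
#include <assert.h>

/* Relaxed variant: a plain conditional expression. It is valid in constant
 * expressions, but it evaluates its arguments twice, so it should only be
 * used with pure constants.
 */
#define SKETCH_RELAXED_MAX(a, b) (((a) > (b)) ? (a) : (b))

#define SKETCH_IPV4_MTU 1252 /* illustrative value only */
#define SKETCH_IPV6_MTU 1232 /* illustrative value only */

/* Hypothetical stand-in for a statically declared pool registration */
struct sketch_pool_reg {
	const char *name;
	unsigned int size;
};

/* File-scope initializer: a statement-expression based MAX() would be
 * rejected here ("braced-group within expression allowed only inside a
 * function"), while the relaxed variant is accepted.
 */
static const struct sketch_pool_reg sketch_reg = {
	.name = "sketch_buf",
	.size = SKETCH_RELAXED_MAX(SKETCH_IPV6_MTU, SKETCH_IPV4_MTU),
};

/* The macro is also usable in compile-time checks (C11) */
static_assert(SKETCH_RELAXED_MAX(1, 2) == 2, "usable in constant expressions");
```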
Willy Tarreau	b1f854bb2e	BUILD: fcgi: fix the struct name of fcgi_flt_ctx The struct was mistakenly spelled flt_fcgi_ctx() in fcgi_flt_stop() when it was introduced in 2.1 with commit 78fbb9f991 ("MEDIUM: fcgi-app: Add FCGI application and filter"), causing build issues when trying to get the alignment of the object in pool_free() for debugging purposes. No backport is needed as it's just used to convey a pointer.	2025-08-06 16:27:05 +02:00
Alexander Stephan	ffbb3cc306	MINOR: sample: Add le2dec (little endian to decimal) sample fetch This commit introduces a sample fetch, `le2dec`, to convert little-endian binary input samples into their decimal representations. The function converts the input into a string containing unsigned integer numbers, with each number derived from a specified number of input bytes. The numbers are separated using a user-defined separator. This new sample is achieved by adding a parametrized sample_conv_2dec function, unifying the logic for be2dec and le2dec converters. Co-authored-by: Christian Norbert Menges <christian.norbert.menges@sap.com> [wt: tracked as GH issue #2915] Signed-off-by: Willy Tarreau <w@1wt.eu>	2025-08-05 13:47:53 +02:00
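The conversion described above can be illustrated with a small standalone sketch. It is not the sample_conv_2dec code: the function name, its signature and the handling of a trailing partial chunk (converted as-is here) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Convert <len> bytes from <in> to decimal text, <chunk> bytes at a time.
 * Each chunk is read as a little-endian unsigned integer and the numbers are
 * joined with <sep>. Returns the number of characters written (excluding the
 * trailing NUL), or -1 if <out> is too small.
 */
static int sketch_le2dec(const unsigned char *in, size_t len, size_t chunk,
                         const char *sep, char *out, size_t outsz)
{
	size_t pos = 0, used = 0;

	assert(chunk >= 1 && chunk <= sizeof(unsigned long long));
	if (outsz == 0)
		return -1;
	out[0] = '\0';

	while (pos < len) {
		size_t n = (len - pos < chunk) ? len - pos : chunk;
		unsigned long long val = 0;
		size_t i;
		int ret;

		/* the least significant byte comes first, so walk backwards */
		for (i = n; i > 0; i--)
			val = (val << 8) | in[pos + i - 1];

		ret = snprintf(out + used, outsz - used, "%s%llu",
		               pos ? sep : "", val);
		if (ret < 0 || (size_t)ret >= outsz - used)
			return -1;
		used += (size_t)ret;
		pos += n;
	}
	return (int)used;
}
```

With the input bytes 01 02 03 04, a chunk size of 2 and "." as separator, the two chunks are 0x0201 and 0x0403, so the sketch produces "513.1027".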
Aurelien DARRAGON	aeff2a3b2a	BUG/MEDIUM: hlua_fcn: ensure systematic watcher cleanup for server list iterator In 358166a ("BUG/MINOR: hlua_fcn: restore server pairs iterator pointer consistency"), I wrongly assumed that because the iterator was a temporary object, no specific cleanup was needed for the watcher. In fact watcher_detach() is not only relevant for the watcher itself, but especially for its parent list to remove the current watcher from it. As iterators are temporary objects, failing to remove their watchers from the server watcher list causes the server watcher list to be corrupted. On a normal iteration sequence, the last watcher_next() receives NULL as target so it successfully detaches the last watcher from the list. However the corner case here is with interrupted iterators: users are free to break away from the iteration loop when a specific condition is met, for instance from the lua script. When this happens hlua_listable_servers_pairs_iterator() doesn't get a chance to detach the last iterator. Also, Lua doesn't tell us that the loop was interrupted, so to fix the issue we rely on the garbage collector to force a last detach right before the object is freed. To achieve that, watcher_detach() was slightly modified so that it becomes possible to call it without knowing if the watcher is already detached or not: if watcher_detach() is called on a detached watcher, the function does nothing. This way it saves the caller from having to track the watcher state and makes the API a little more convenient to use. This way we now systematically call watcher_detach() for server iterators right before they are garbage collected. This was first reported in GH #3055. It can be observed when the server list is browsed more than one time when it was already browsed from Lua for a given proxy and the iteration was interrupted before the end. 
As the watcher list is corrupted, the common symptom is watcher_attach() or watcher_next() not ending due to the internal mt_list call looping forever. Thanks to GH users @sabretus and @sabretus for their precious help. It should be backported everywhere 358166a was.	2025-08-05 13:06:46 +02:00
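The "detach does nothing on an already detached watcher" behaviour can be sketched as below. This is a single-threaded simplification with hypothetical `sketch_` names; the real code relies on mt_list and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal doubly-linked node; prev/next are both NULL when detached */
struct sketch_node {
	struct sketch_node *prev, *next;
};

static void sketch_list_init(struct sketch_node *head)
{
	head->prev = head->next = head;
}

/* Insert <w> right after <head>; <w> must currently be detached */
static void sketch_attach(struct sketch_node *head, struct sketch_node *w)
{
	assert(!w->next);
	w->next = head->next;
	w->prev = head;
	head->next->prev = w;
	head->next = w;
}

/* Idempotent detach: safe to call without tracking the node's state, which
 * is what allows a garbage-collection hook to call it unconditionally.
 */
static void sketch_detach(struct sketch_node *w)
{
	if (!w->next)
		return; /* already detached: nothing to do */
	w->prev->next = w->next;
	w->next->prev = w->prev;
	w->prev = w->next = NULL;
}
```

Calling sketch_detach() twice in a row leaves the list intact, so an interrupted iteration followed by a final cleanup cannot corrupt it in this sketch.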
William Lallemand	66f28dbd3f	BUG/MINOR: acme: possible integer underflow in acme_txt_record() a2base64url() can return a negative value if olen is too short to accept ilen. This is not supposed to happen since the sha256 should always fit in a buffer. But this is confusing since a2base64() returns a signed integer which is put in output->data which is unsigned. Fix the issue by setting ret to 0 instead of -1 upon error, and return an unsigned integer instead of a signed one. This patch also checks the return value from the caller in order to emit an error instead of setting trash.data which is already done from the function.	2025-08-05 12:12:50 +02:00
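The hazard fixed above, a negative error code stored into an unsigned length, can be reproduced with this small sketch. The functions are hypothetical and unrelated to the real a2base64url(); they only show the wrap-around and the "return 0 on error" convention.

```c
#include <assert.h>
#include <stddef.h>

/* What goes wrong when a signed error code lands in an unsigned field:
 * -1 converts to the largest value of the unsigned type.
 */
static size_t sketch_store_signed(int ret)
{
	return (size_t)ret;
}

/* Hypothetical encoder using the fixed convention: returns the number of
 * bytes written, or 0 on error, never a negative value.
 */
static size_t sketch_copy_encode(const char *in, size_t ilen,
                                 char *out, size_t olen)
{
	size_t i;

	assert(in && out);
	if (olen < ilen)
		return 0; /* output too short: report the error as 0 */
	for (i = 0; i < ilen; i++)
		out[i] = in[i];
	return ilen;
}
```

The caller then only has to test for a zero return to emit an error, instead of comparing an unsigned length against a negative value.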
William Lallemand	8afd3e588d	MINOR: acme: update the log for DNS-01 Update the log for DNS-01 by mentioning the challenge_ready command over the CLI.	2025-08-01 18:08:43 +02:00
William Lallemand	9ee14ed2d9	MEDIUM: acme: allow to wait and restart the task for DNS-01 DNS-01 needs an external process which would register a TXT record on a DNS provider, using a REST API or something else. To achieve this, the process should read the dpapi sink and wait for events. With the DNS-01 challenge, HAProxy will put the task to sleep before asking the ACME server to achieve the challenge. The task then needs to be woken up, using the command implemented by this patch. This patch implements the "acme challenge_ready" command which should be used by the agent once the challenge was configured in order to wake the task up. Example: echo "@1 acme challenge_ready foobar.pem.rsa domain kikyo" | socat /tmp/master.sock -	2025-08-01 18:07:12 +02:00
William Lallemand	3dde7626ba	MINOR: acme: emit the DNS-01 challenge details on the dpapi sink This commit adds a new message to the dpapi sink which is emitted during the new authorization request. One message is emitted per challenge to resolve. The certificate name as well as the thumbprint of the account key are on the first line of the message. A dump of the JSON response for 1 challenge follows, and the message ends with a \0. The agent consuming these messages MUST NOT access the URLs, and SHOULD only use the thumbprint, dns and token to configure a challenge. Example: $ ( echo "@@1 show events dpapi -w -0"; cat - ) | socat /tmp/master.sock - | cat -e <0>2025-08-01T16:23:14.797733+02:00 acme deploy foobar.pem.rsa thumbprint Gv7pmGKiv_cjo3aZDWkUPz5ZMxctmd-U30P2GeqpnCo$ {$ "status": "pending",$ "identifier": {$ "type": "dns",$ "value": "foobar.com"$ },$ "challenges": [$ {$ "type": "dns-01",$ "url": "https://0.0.0.0:14000/chalZ/1o7sxLnwcVCcmeriH1fbHJhRgn4UBIZ8YCbcrzfREZc",$ "token": "tvAcRXpNjbgX964ScRVpVL2NXPid1_V8cFwDbRWH_4Q",$ "status": "pending"$ },$ {$ "type": "dns-account-01",$ "url": "https://0.0.0.0:14000/chalZ/z2_WzibwTPvE2zzIiP3BF0zNy3fgpU_8Nj-V085equ0",$ "token": "UedIMFsI-6Y9Nq3oXgHcG72vtBFWBTqZx-1snG_0iLs",$ "status": "pending"$ },$ {$ "type": "tls-alpn-01",$ "url": "https://0.0.0.0:14000/chalZ/AHnQcRvZlFw6e7F6rrc7GofUMq7S8aIoeDileByYfEI",$ "token": "QhT4ejBEu6ZLl6pI1HsOQ3jD9piu__N0Hr8PaWaIPyo",$ "status": "pending"$ },$ {$ "type": "http-01",$ "url": "https://0.0.0.0:14000/chalZ/Q_qTTPDW43-hsPW3C60NHpGDm_-5ZtZaRfOYDsK3kY8",$ "token": "g5Y1WID1v-hZeuqhIa6pvdDyae7Q7mVdxG9CfRV2-t4",$ "status": "pending"$ }$ ],$ "expires": "2025-08-01T15:23:14Z"$ }$ ^@	2025-08-01 16:48:22 +02:00
William Lallemand	365a69648c	MINOR: acme: emit a log for DNS-01 challenge response This commit emits a log which outputs the TXT entry to create in case of DNS-01. This is useful in cases where you want to update your TXT entry manually. Example: acme: foobar.pem.rsa: DNS-01 requires to set the "acme-challenge.example.com" TXT record to "7L050ytWm6ityJqolX-PzBPR0LndHV8bkZx3Zsb-FMg"	2025-08-01 16:12:27 +02:00
William Lallemand	09275fd549	BUILD: acme: avoid declaring TRACE_SOURCE in acme-t.h Files ending with '-t.h' are supposed to be used for structure definitions and could be included in the same file to check API definitions. This patch removes TRACE_SOURCE from acme-t.h to avoid conflicts with other TRACE_SOURCE definitions.	2025-07-31 16:03:28 +02:00
Amaury Denoyelle	a6e67e7b41	BUG/MEDIUM: mux-quic: ensure Early-data header is set QUIC MUX may be initialized prior to handshake completion, when 0-RTT is used. In this case, connection is flagged with CO_FL_EARLY_SSL_HS, which is notably used by wait-for-hs http rule. Early data may be subject to replay attacks. For this reason, haproxy adds the header 'Early-data: 1' to all requests handled as TLS early data. Thus the server can reject it if it is deemed unsafe. This header injection is implemented by http-ana. However, it was not functional with QUIC due to missing CO_FL_EARLY_DATA connection flag. Fix this by ensuring that QUIC MUX sets CO_FL_EARLY_DATA when needed. This is performed during qcc_recv() for STREAM frame reception. It is only set if QC_CF_WAIT_HS is set, meaning that the handshake is not yet completed. After this, the request is considered safe and Early-data header is not necessary anymore. This should fix github issue #3054. This must be backported up to 3.2 at least. If possible, it should be backported to all stable releases as well. On these versions, the current patch relies on the following refactoring commit : commit 0a53a008d032b69377869c8caaec38f81bdd5bd6 MINOR: mux-quic: refactor wait-for-handshake support	2025-07-31 15:25:59 +02:00
Amaury Denoyelle	697f7d1142	MINOR: muxes: refactor private connection detach Following the latest adjustment on session_add_conn() / session_check_idle_conn(), detach muxes callbacks were rewritten for private connection handling. Nothing really fancy here: some more explicit comments and the removal of a duplicate check on idle conn status for muxes with true multiplexing support.	2025-07-30 16:14:00 +02:00
Amaury Denoyelle	2ecc5290f2	MINOR: session: streamline session_check_idle_conn() usage session_check_idle_conn() is called by muxes when a connection becomes idle. It ensures that the session idle limit is not yet reached. Else, the connection is removed from the session and it can be freed. Prior to this patch, session_check_idle_conn() was compatible with a NULL session argument. In this case, it would return true, considering that no limit was reached and the connection was not removed. However, this renders the function error-prone and subject to future bugs. This patch streamlines it by ensuring it is never called with a NULL argument. Thus it can now only return true if the connection is kept in the session or false if it was removed, as first intended.	2025-07-30 16:13:30 +02:00
Amaury Denoyelle	dd9645d6b9	MINOR: session: do not release conn in session_check_idle_conn() session_check_idle_conn() is called to flag a connection already inserted in a session list as idle. If the session limit on the number of idle connections (max-session-srv-conns) is exceeded, the connection is removed from the session list. In addition to the connection removal, session_check_idle_conn() directly calls MUX destroy callback on the connection. This means the connection is freed by the function itself and should not be used by the caller anymore. This is not practical when an alternative connection closure method should be used, such as a graceful shutdown with QUIC. As such, remove MUX destroy invocation: this is now the responsibility of the caller to either close or release immediately the connection.	2025-07-30 11:43:41 +02:00
Amaury Denoyelle	57e9425dbc	MINOR: session: strengthen idle conn limit check Add a BUG_ON() on session_check_idle_conn() to ensure the connection is not already flagged as CO_FL_SESS_IDLE. This checks that this function is only called one time per connection transition from active to idle. This is necessary to ensure that session idle counter is only incremented one time per connection.	2025-07-30 11:40:16 +02:00
Amaury Denoyelle	ec1ab8d171	MINOR: session: remove redundant target argument from session_add_conn() session_add_conn() uses three arguments: connection and session instances, plus a void pointer labelled as target. Typically, it represents the server, but can also be a backend instance (for example on dispatch). In fact, this argument is redundant as <target> is already a member of the connection. This commit simplifies session_add_conn() by removing it. A BUG_ON() on target is extended to ensure it is never NULL.	2025-07-30 11:39:57 +02:00
Amaury Denoyelle	668c2cfb09	MINOR: session: strengthen connection attach to session This commit is the first one of a series to refactor insertion of backend private connection into the session list. session_add_conn() is used to attach a connection into a session list. Previously, this function would report an error if the connection specified was already attached to another session. However, this case currently never happens and thus can be considered as buggy. Remove this check and replace it with a BUG_ON(). This allows to ensure that session insertion remains consistent. The same check is also transformed in session_check_idle_conn().	2025-07-30 11:39:26 +02:00
Amaury Denoyelle	cfe9bec1ea	MINOR: mux-quic: release conn after shutdown on BE reuse failure On stream detach on backend side, connection is inserted in the proper server/session list to be able to reuse it later. If insertion fails and the connection is idle, the connection can be removed immediately. If this occurs on a QUIC connection, QUIC MUX implements graceful shutdown to ensure the server is notified of the closure. However, the connection instance is not freed. Change this to ensure that both shutdown and release are performed.	2025-07-30 10:04:19 +02:00
Aurelien DARRAGON	14966c856b	MINOR: clock: make global_now_ns a pointer as well Similar to previous commit but for global_now_ns	2025-07-29 18:04:15 +02:00
Aurelien DARRAGON	4a20b3835a	MINOR: clock: make global_now_ms a pointer This is preparation work for shared counters between co-processes. As co-processes will need to share a common date, global_now_ms will be used for that as it will point to the shm when sharing is enabled. Thus in this patch we turn global_now_ms into a pointer (and adjust the places where it is written to and read from; fortunately atomic operations through pointer are already used so the change is trivial). For now global_now_ms points to process-local _global_now_ms which is a fallback for when sharing through the shm is not enabled.	2025-07-29 18:04:14 +02:00
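The "pointer with a process-local fallback" arrangement can be sketched as follows. The `sketch_` names are hypothetical, C11 atomics stand in for HAProxy's own atomic macros, and no real shared memory is mapped here: any caller-provided slot plays that role.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Process-local storage, used as long as no shared area is configured */
static _Atomic uint32_t sketch_local_now_ms;

/* The global date is reached through a pointer, which defaults to the
 * process-local fallback.
 */
static _Atomic uint32_t *sketch_global_now_ms = &sketch_local_now_ms;

/* Hypothetical switch-over: point to a slot living in a shared area */
static void sketch_use_shared_clock(_Atomic uint32_t *shm_slot)
{
	assert(shm_slot != NULL);
	sketch_global_now_ms = shm_slot;
}

/* Readers and writers always go through the pointer with atomic accesses,
 * so they do not need to know where the value actually lives.
 */
static void sketch_set_now_ms(uint32_t v)
{
	atomic_store(sketch_global_now_ms, v);
}

static uint32_t sketch_get_now_ms(void)
{
	return atomic_load(sketch_global_now_ms);
}
```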
Aurelien DARRAGON	713ebd2750	CLEANUP: counters: rename counters_be_shared_init to counters_be_shared_prepare 75e480d10 ("MEDIUM: stats: avoid 1 indirection by storing the shared stats directly in counters struct") took care of renaming counters_fe_shared_init() but we forgot counters_be_shared_init(). Let's fix that for consistency	2025-07-29 18:00:13 +02:00
Aurelien DARRAGON	2ffe515d97	BUG/MINOR: hlua: take default-path into account with lua-load-per-thread As discussed in GH #3051, default-path is not taken into account when loading files using lua-load-per-thread. In fact, the initial hlua_load_state() (performed on first thread which parses the config) is successful, but other threads run hlua_load_state() later based on config hints which were saved by the first thread, and those config hints only contain the file path provided on the lua-load-per-thread config line, not the absolute one. Indeed, `default-path` directive changes the current working directory only for the thread parsing the configuration. To fix the issue, when storing config hints under hlua_load_per_thread() we now make sure to save the absolute file path for `lua-load-per-thread` argument. Thanks to GH user @zhanhb for having reported the issue. It may be backported to all stable versions.	2025-07-29 17:58:28 +02:00
William Lallemand	83a335f925	MINOR: acme: implement traces Implement traces for the ACME protocol. -dt acme:data:complete will dump every input and output buffers, including decoded buffers before being converted to JWS. It will also dump certificates in the traces. -dt acme:user:complete will only dump the state of the task handler.	2025-07-29 17:25:10 +02:00
Willy Tarreau	cedb4f0461	[RELEASE] Released version 3.3-dev5 Released version 3.3-dev5 with the following main changes : - BUG/MEDIUM: queue/stats: also use stream_set_srv_target() for pendconns - DOC: list missing global QUIC settings	2025-07-28 11:26:22 +02:00
Amaury Denoyelle	7fa812a1ac	DOC: list missing global QUIC settings Complete list of global keywords with missing QUIC entries. This could be backported to stable versions. This requires to take into account the version of introduction for each keyword. * limited-quic, introduced in 2.8 * no-quic, introduced in 2.8 * tune.quic.cc.cubic.min-losses, introduced in 3.1	2025-07-28 11:22:35 +02:00
Aurelien DARRAGON	021a0681be	BUG/MEDIUM: queue/stats: also use stream_set_srv_target() for pendconns Following c24de07 ("OPTIM: stats: store fast sharded counters pointers at session and stream level") some crashes were observed in connect_server(): #0 0x00000000007ba39c in connect_server (s=0x65117b0) at src/backend.c:2101 2101 _HA_ATOMIC_INC(&s->sv_tgcounters->connect); Missing separate debuginfos, use: debuginfo-install glibc-2.17-325.el7_9.x86_64 libgcc-4.8.5-44.el7.x86_64 nss-softokn-freebl-3.67.0-3.el7_9.x86_64 pcre-8.32-17.el7.x86_64 (gdb) bt #0 0x00000000007ba39c in connect_server (s=0x65117b0) at src/backend.c:2101 #1 0x00000000007baff8 in back_try_conn_req (s=0x65117b0) at src/backend.c:2378 #2 0x00000000006c0e9f in process_stream (t=0x650f180, context=0x65117b0, state=8196) at src/stream.c:2366 #3 0x0000000000bd3e51 in run_tasks_from_lists (budgets=0x7ffd592752e0) at src/task.c:655 #4 0x0000000000bd49ef in process_runnable_tasks () at src/task.c:889 #5 0x0000000000851169 in run_poll_loop () at src/haproxy.c:2834 #6 0x0000000000851865 in run_thread_poll_loop (data=0x1a03580 <ha_thread_info>) at src/haproxy.c:3050 #7 0x0000000000852a53 in main (argc=7, argv=0x7ffd592755f8) at src/haproxy.c:3637 Here the crash occurs during the atomic inc of a sv_tgcounters metric from the stream pointer, which tells us the pointer is likely garbage. In fact, we assign s->sv_tgcounters each time the stream target is set to a valid server. For that we use stream_set_srv_target() helper which does the assignment for us. By reviewing the code, it turns out we forgot to call stream_set_srv_target() in pendconn_dequeue(), where the stream target is set to the server which picked the pendconn. Let's fix the bug by using stream_set_srv_target() there. No backport needed unless c24de07 is.	2025-07-28 08:54:38 +02:00
Willy Tarreau	5d4ff9f02e	[RELEASE] Released version 3.3-dev4 Released version 3.3-dev4 with the following main changes : - CLEANUP: server: do not check for duplicates anymore in findserver() - REORG: server: move findserver() from proxy.c to server.c - MINOR: server: use the tree to look up the server name in findserver() - CLEANUP: server: rename server_find_by_name() to server_find() - CLEANUP: server: rename findserver() to server_find_by_name() - CLEANUP: server: use server_find_by_name() where relevant - CLEANUP: cfgparse: lookup proxy ID using existing functions - CLEANUP: stream: lookup server ID using standard functions - CLEANUP: server: simplify server_find_by_id() - CLEANUP: server: add server_find_by_addr() - CLEANUP: stream: use server_find_by_addr() in sticking_rule_find_target() - CLEANUP: server: be sure never to compare src against a non-existing defsrv - MEDIUM: proxy: take the defsrv out of the struct proxy - MINOR: proxy: add checks for defsrv's validity - MEDIUM: proxy: no longer allocate the default-server entry by default - MEDIUM: proxy: register a post-section cleanup function - MINOR: debug: report haproxy and operating system info in panic dumps - BUG/MEDIUM: h3: do not overwrite interim with final response - BUG/MINOR: h3: properly realloc buffer after interim response encoding - BUG/MINOR: h3: ensure that invalid status code are not encoded (FE side) - MINOR: qmux: change API for snd_buf FIN transmission - BUG/MEDIUM: h3: handle interim response properly on FE side - BUG/MINOR: h3: properly handle interim response on BE side - BUG/MINOR: quic: Wrong source address use on FreeBSD - MINOR: h3: remove unused outbuf in h3_resp_headers_send() - BUG/MINOR: applet: Don't trigger BUG_ON if the tid is not on appctx init - DEV: gdb: add a memprofile decoder to the debug tools - MINOR: quic: Get rid of qc_is_listener() - DOC: connection: explain the rules for idle/safe/avail connections - BUG/MEDIUM: quic-be: CC buffer released from wrong pool - 
BUG/MINOR: halog: exit with error when some output filters are set simultaneosly - MINOR: cpu-topo: split cpu_dump_topology() to show its summary in show dev - MINOR: cpu-topo: write thread-cpu bindings into trash buffer - MINOR: debug: align output style of debug_parse_cli_show_dev with cpu_dump_topology - MINOR: debug: add thread-cpu bindings info in 'show dev' output - MINOR: quic: Remove pool_head_quic_be_cc_buf pool - BUILD: debug: add missed guard USE_CPU_AFFINITY to show cpu bindings - BUG/MEDIUM: threads: Disable the workaround to load libgcc_s on macOS - BUG/MINOR: logs: fix log-steps extra log origins selection - BUG/MINOR: hq-interop: fix FIN transmission - MINOR: ssl: Add ciphers in ssl traces - MINOR: ssl: Add curve id to curve name table and mapping functions - MINOR: ssl: Add curves in ssl traces - MINOR: ssl: Dump ciphers and sigalgs details in trace with 'advanced' verbosity - MINOR: ssl: Remove ClientHello specific traces if !HAVE_SSL_CLIENT_HELLO_CB - MINOR: h3: use smallbuf for request header emission - MINOR: h3: add traces to h3_req_headers_send() - BUG/MINOR: h3: fix uninitialized value in h3_req_headers_send() - MINOR: log: explicitly ignore "log-steps" on backends - BUG/MEDIUM: acme: use POST-as-GET instead of GET for resources - BUG/MINOR mux-quic: apply correctly timeout on output pending data - BUG/MINOR: mux-quic: ensure close-spread-time is properly applied - MINOR: mux-quic: refactor timeout code - MINOR: mux-quic: correctly implement backend timeout - MINOR: mux-quic: disable glitch on backend side - MINOR: mux-quic: store session in QCS instance - MEDIUM: mux-quic: implement be connection reuse - MINOR: mux-quic: do not reuse connection if app already shut - MEDIUM: mux-quic: support backend private connection - MINOR: acme: remove acme_req_auth() and use acme_post_as_get() instead - BUG/MINOR: acme: allow "processing" in challenge requests - CLEANUP: acme: fix wrong spelling of "resources" - CLEANUP: ssl: Use only NIDs in curve 
name to id table - MINOR: acme: add ACME to the haproxy -vv feature list - BUG/MINOR: hlua: Skip headers when a receive is performed on an HTTP applet - BUG/MEDIUM: applet: State inbuf is no longer full if input data are skipped - BUG/MEDIUM: stconn: Fix conditions to know an applet can get data from stream - BUG/MINOR: applet: Fix applet_getword() to not return one extra byte - BUG/MEDIUM: Remove sync sends from streams to applets - MINOR: applet: Add HTX versions for applet_input_data() and applet_output_room() - MINOR: applet: Improve applet API to take care of inbuf/outbuf alloc failures - MEDIUM: hlua: Update the tcp applet to use its own buffers - MINOR: hlua: Fill the request array on the first HTTP applet run - MINOR: hlua: Use the buffer instead of the HTTP message to get HTTP headers - MEDIUM: hlua: Update the http applet to use its own buffers - BUG/MEDIUM: hlua: Report to SC when data were consumed on a lua socket - BUG/MEDIUM: hlua: Report to SC when output data are blocked on a lua socket - MEDIUM: hlua: Update the socket applet to use its own buffers - BUG/MEDIUM: dns: Reset reconnect tempo when connection is finally established - MEDIUM: dns: Update the dns_session applet to use its own buffers - CLEANUP: http-client: Remove useless indentation when sending request body - MINOR: http-client: Try to send request body with headers if possible - MINOR: http-client: Trigger an error if first response block isn't a start-line - BUG/MINOR: httpclient-cli: Don't try to dump raw headers in HTX mode - MINOR: httpclient-cli: Reset httpclient HTX buffer instead of removing blocks - MEDIUM: http-client: Update the http-client applet to use its own buffers - MEDIUM: log: Update the log applet to use its own buffers - MEDIUM: sink: Update the sink applets to use their own buffers - MEDIUM: peers: Update the peer applet to use its own buffers - MEDIUM: promex: Update the promex applet to use their own buffers - MINOR: applet: Add support for flags on applets with 
a flag about the new API - MEDIUM: applet: Emit a warning when a legacy applet is spawned - BUG/MEDIUM: logs: fix sess_build_logline_orig() recursion with options - MEDIUM: stats: avoid 1 indirection by storing the shared stats directly in counters struct - CLEANUP: compiler: prefer char * over void * for pointer arithmetic - CLEANUP: include: replace hand-rolled offsetof to avoid UB - CLEANUP: peers: remove unused peer_session_target() - OPTIM: stats: store fast sharded counters pointers at session and stream level	2025-07-26 09:55:26 +02:00
Aurelien DARRAGON	c24de077bd	OPTIM: stats: store fast sharded counters pointers at session and stream level Following commit 75e480d10 ("MEDIUM: stats: avoid 1 indirection by storing the shared stats directly in counters struct"), in order to minimize the impact of the recent sharded counters work, we try to push things a bit further in this patch by storing and using "fast" pointers at the session and stream levels when available to avoid costly indirections and systematic "tgid" resolution (which can not be cached by the CPU due to its THREAD-local nature). Indeed, we know that a session/stream is tied to a given CPU, thanks to this we know that the tgid for a given session/stream will never change. Given that, we are able to store sharded frontend and listener counters pointer at the session level (namely sess->fe_tgcounters and sess->li_tgcounters), and once the backend and the server are selected, we are also able to store backend and server sharded counters pointer at the stream level (namely s->be_tgcounters and s->sv_tgcounters). Everywhere we rely on these counters and the stream or session context is available, we use the fast pointers instead of the indirect pointers path to make the pointer resolution a bit faster. This optimization proved to bring a few percent back, and together with the previous 75e480d10 commit we now fixed the performance regression (we are back to 3.2 stats performance).	2025-07-25 18:24:23 +02:00
Aurelien DARRAGON	cf8ba60c88	CLEANUP: peers: remove unused peer_session_target() Since commit 7293eb68 ("MEDIUM: peers: use server as stream target") peer session target always points to server in order to benefit from existing server transport options. Thanks to that, it is no longer necessary to have peer_session_target() helper function, because all it does is return the pointer to the server object. Let's get rid of that.	2025-07-25 18:24:17 +02:00
Ben Kallus	1e48ec7f6c	CLEANUP: include: replace hand-rolled offsetof to avoid UB The C standard specifies that it's undefined behavior to dereference NULL (even if you use & right after). The hand-rolled offsetof idiom &(((s)NULL)->f) is thus technically undefined. This clutters the output of UBSan and is simple to fix: just use the real offsetof when it's available. Note that there's no clear statement about this point in the spec, only several points which together converge to this: - From N3220, 6.5.3.4: A postfix expression followed by the -> operator and an identifier designates a member of a structure or union object. The value is that of the named member of the object to which the first expression points, and is an lvalue. - From N3220, 6.3.2.1: An lvalue is an expression (with an object type other than void) that potentially designates an object; if an lvalue does not designate an object when it is evaluated, the behavior is undefined. - From N3220, 6.5.4.4 p3: The unary & operator yields the address of its operand. If the operand has type "type", the result has type "pointer to type". If the operand is the result of a unary operator, neither that operator nor the & operator is evaluated and the result is as if both were omitted, except that the constraints on the operators still apply and the result is not an lvalue. Similarly, if the operand is the result of a [] operator, neither the & operator nor the unary * that is implied by the [] is evaluated and the result is as if the & operator were removed and the [] operator were changed to a + operator. => In short, this is saying that C guarantees these identities: 1. &(p) is equivalent to p 2. &(p[n]) is equivalent to p + n As a consequence, &(p) doesn't result in the evaluation of *p, only the evaluation of p (and similar for []). There is no corresponding special carve-out for ->. 
See also: https://pvs-studio.com/en/blog/posts/cpp/0306/ After this patch, HAProxy can run without crashing after building w/ clang-19 -fsanitize=undefined -fno-sanitize=function,alignment	2025-07-25 17:54:32 +02:00
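As a minimal illustration of the replacement, the standard offsetof macro from <stddef.h> gives the member offset without forming an lvalue through a NULL pointer. The struct and helper names below are hypothetical; the second helper also shows pointer arithmetic done on char * rather than void *.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure used only for this example */
struct sketch_conn {
	int fd;
	char flags;
	void *ctx;
};

/* The first member of a struct is always at offset 0 */
static_assert(offsetof(struct sketch_conn, fd) == 0, "first member at 0");

/* Standard way to obtain a member offset: no NULL dereference involved,
 * unlike the hand-rolled &(((s)NULL)->f) idiom.
 */
static size_t sketch_ctx_offset(void)
{
	return offsetof(struct sketch_conn, ctx);
}

/* container_of-like helper: recover the enclosing struct from a pointer to
 * its <ctx> member, doing the arithmetic on char * instead of void *.
 */
static struct sketch_conn *sketch_conn_from_ctx(void **ctxp)
{
	return (struct sketch_conn *)((char *)ctxp - offsetof(struct sketch_conn, ctx));
}
```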
Ben Kallus	d3b46cca7b	CLEANUP: compiler: prefer char * over void * for pointer arithmetic This patch changes two instances of pointer arithmetic on void * to use char * instead, to avoid UB. This is essentially to please UB analyzers, though.	2025-07-25 17:54:32 +02:00
Aurelien DARRAGON	75e480d107	MEDIUM: stats: avoid 1 indirection by storing the shared stats directly in counters struct Between 3.2 and 3.3-dev we observed a noticeable performance regression due to stats handling. After bisecting, Willy found out that recent work to split stats computing across multiple thread groups (stats sharding) was responsible for that performance regression. We're looking at roughly 20% performance loss. More precisely, it is the added indirections, multiplied by the number of statistics that are updated for each request, which in the end cause a significant amount of time to be spent resolving pointers. We noticed that the fe_counters_shared and be_counters_shared structures which are currently allocated in dedicated memory since a0dcab5c ("MAJOR: counters: add shared counters base infrastructure") are no longer huge since 16eb0fab31 ("MAJOR: counters: dispatch counters over thread groups") because they now essentially hold flags plus the per-thread group id pointer mapping, not the counters themselves. As such we decided to try merging fe_counters_shared and be_counters_shared in their parent structures. The cost is a slight memory overhead for the parent structure, but it gets rid of one pointer indirection. This patch alone yields visible performance gains and almost restores 3.2 stats performance. counters_fe_shared_get() was renamed to counters_fe_shared_prepare() and now returns either failure or success instead of a pointer because we don't need to retrieve a shared pointer anymore; the function takes care of initializing the existing pointer.	2025-07-25 16:46:10 +02:00
Aurelien DARRAGON	31adfb6c15	BUG/MEDIUM: logs: fix sess_build_logline_orig() recursion with options Since ccc43412 ("OPTIM: log: use thread local lf_buildctx to stop pushing it on the stack"), recursively calling sess_build_logline_orig(), which may for instance happen when leveraging %ID (or unique-id fetch) for the first time, would lead to undefined behavior because the parent sess_build_logline_orig() build context was shared between recursive calls (only one build ctx per thread to avoid pushing it on the stack for each call). In short, the parent build ctx would be altered by the recursive calls, which is obviously not expected and could result in log formatting errors. To fix the issue but still avoid polluting the stack with a large lf_buildctx struct, let's move the static 256-byte build buffer out of the buildctx so that the buildctx is now stored on the stack again (each function invocation has its own dedicated build ctx). On the other hand, it's acceptable to have only one 256-byte build buffer per thread because the build buffer is not involved in recursive calls (unlike the build ctx). Thanks to Willy and Vincent Gramer for spotting the bug and providing a useful repro. It should be backported in 3.0 with ccc43412.	2025-07-25 16:46:03 +02:00
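The failure mode can be reproduced in miniature. The sketch below is not the HAProxy code: `struct build_ctx`, `build_shared()` and `build_private()` are hypothetical names, and the "context" is reduced to a single integer. It only shows why a per-thread context is clobbered by a nested call while a stack-allocated one is not.

```c
/* Hypothetical, drastically simplified build context. */
struct build_ctx {
    int depth;
};

/* One context per thread: every nested call overwrites the same object. */
static _Thread_local struct build_ctx shared_ctx;

/* Buggy pattern: the caller's context is altered by the recursive call. */
int build_shared(int depth)
{
    shared_ctx.depth = depth;
    if (depth > 0)
        build_shared(depth - 1);
    return shared_ctx.depth;  /* value left behind by the innermost call */
}

/* Fixed pattern: each invocation owns its context on the stack. */
int build_private(int depth)
{
    struct build_ctx ctx = { .depth = depth };

    if (depth > 0)
        build_private(depth - 1);
    return ctx.depth;  /* untouched by nested calls */
}
```

With a starting depth of 2, the shared variant returns the value written by the innermost call, whereas the private variant returns its own depth.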
Christopher Faulet	b8d5307bd9	MEDIUM: applet: Emit a warning when a legacy applet is spawned To motivate developers to support the new applets API, a warning is now emitted when a legacy applet is spawned. To not flood users, this warning is only emitted once per legacy applet. To do so, the applet flag APPLET_FL_WARNED was added. It is set when the warning is emitted. Note that test and set on this flag are not performed via atomic operations. So it is possible to have more than one warning for a given applet if it is spawned at the same time on several threads. At worst, there is one warning per thread.	2025-07-25 15:53:33 +02:00
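The warn-once logic might look like the sketch below. Only the flag name APPLET_FL_WARNED comes from the commit message; its value here, the `applet_sketch` structure and the `applet_should_warn()` helper are assumptions made for the example.

```c
/* Assumed value; the real flag value is not given in the commit message. */
#define APPLET_FL_WARNED 0x00000001u

/* Hypothetical stand-in for the applet structure. */
struct applet_sketch {
    unsigned int flags;
};

/* Return 1 the first time it is called for <app>, 0 afterwards.
 * The test and the set are two plain operations, not an atomic
 * test-and-set, so two threads racing here may both return 1.
 */
int applet_should_warn(struct applet_sketch *app)
{
    if (app->flags & APPLET_FL_WARNED)
        return 0;
    app->flags |= APPLET_FL_WARNED;
    return 1;
}
```

Skipping the atomic operation keeps the spawn path cheap, at the cost of a possible duplicate warning per thread, which matches the trade-off described above.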
Christopher Faulet	337768656b	MINOR: applet: Add support for flags on applets with a flag about the new API A new field was added in the applet structure to be able to set flags on the applets. The first one is related to the new API. APPLET_FL_NEW_API is set for applets based on the new API. It was set on all HAProxy's applets.	2025-07-25 15:44:02 +02:00
Christopher Faulet	2e5e6cdf23	MEDIUM: promex: Update the promex applet to use their own buffers Thanks to this patch, the promex applet is now using its own buffers. .rcv_buf and .snd_buf callback functions are now defined to use the default HTX functions. Parts to receive and send data have also been updated to use the applet API and to remove any dependencies on the stream-connectors and the channels.	2025-07-24 12:13:42 +02:00
Christopher Faulet	a2cb0033bd	MEDIUM: peers: Update the peer applet to use its own buffers Thanks to this patch, the peer applet is now using its own buffers. .rcv_buf and .snd_buf callback functions are now defined to use the default raw functions. The applet API is now used and any dependencies on the stream-connectors and the channels were removed.	2025-07-24 12:13:42 +02:00
Christopher Faulet	576361c23e	MEDIUM: sink: Update the sink applets to use their own buffers Thanks to this patch, the sink applets are now using their own buffers. .rcv_buf and .snd_buf callback functions are now defined to use the default raw functions. The applet API is now used and any dependencies on the stream-connectors and the channels were removed.	2025-07-24 12:13:42 +02:00
Christopher Faulet	5da704b55f	MEDIUM: log: Update the log applet to use its own buffers Thanks to this patch, the log applet is now using its own buffers. .rcv_buf and .snd_buf callback functions are now defined to use the default raw functions. The applet API is now used and any dependencies on the stream-connectors and the channels were removed.	2025-07-24 12:13:42 +02:00
Christopher Faulet	6a2b354dea	MEDIUM: http-client: Update the http-client applet to use its own buffers Thanks to this patch, the http-client applet is now using its own buffers. .rcv_buf and .snd_buf callback functions are now defined to use the default HTX functions. Parts to receive and send data have also been updated to use the applet API and to remove any dependencies on the stream-connectors and the channels.	2025-07-24 12:13:42 +02:00
Christopher Faulet	d05ff904bf	MINOR: httpclient-cli: Reset httpclient HTX buffer instead of removing blocks In the CLI I/O handler interacting with the HTTP client, in HTX mode, after a dump of the HTX message, data must be removed. Instead of removing all blocks one by one, we can call htx_reset() because the whole message must be flushed.	2025-07-24 12:13:42 +02:00
Christopher Faulet	1741bc4bf0	BUG/MINOR: httpclient-cli: Don't try to dump raw headers in HTX mode In the CLI I/O handler interacting with the HTTP client, we must not try to push raw headers in HTX mode, because there is no raw data in this mode. This prevents the HTX dump at the end of the I/O handler. It is a 3.3-specific issue. No backport needed.	2025-07-24 12:13:42 +02:00
Christopher Faulet	88aa7a780c	MINOR: http-client: Trigger an error if first response block isn't a start-line The first HTX block of a response must be a start-line. There is no reason to wait for something else. And if there are output data in the response channel buffer, it means we must find the start-line.	2025-07-24 12:13:42 +02:00
Christopher Faulet	c08a0dae30	MINOR: http-client: Try to send request body with headers if possible There is no reason to yield after sending the request headers, except if the request was fully sent. If there is a payload, it is better to send it as well. However, when the whole request was sent, we can leave the I/O handler.	2025-07-24 12:13:42 +02:00
Christopher Faulet	96aa251d20	CLEANUP: http-client: Remove useless indentation when sending request body It was useless to have an indentation to handle HTTPCLIENT_S_REQ_BODY state in the http-client I/O handler.	2025-07-24 12:13:42 +02:00
Christopher Faulet	217da087fd	MEDIUM: dns: Update the dns_session applet to use its own buffers Thanks to this patch, the dns_session applet is now using its own buffers. .rcv_buf and .snd_buf callback functions are now defined to use the default raw functions. Functions to receive and send data have also been updated to use the applet API and to remove any dependencies on the stream-connectors and the channels.	2025-07-24 12:13:41 +02:00
Christopher Faulet	765f14e0e3	BUG/MEDIUM: dns: Reset reconnect tempo when connection is finally established The issue was introduced by commit 27236f221 ("BUG/MINOR: dns: add tempo between 2 connection attempts for dns servers"). In this patch, to delay the reconnection, a timer is used on the appctx when it is created. This postpones the appctx initialization. However, once initialized, the expiration time of the underlying task is not reset. So, it is always considered as expired and the appctx is woken up in a loop. The fix is quite simple. In dns_session_init(), the expiration time of the appctx's task is always set to TICK_ETERNITY. This patch must be backported everywhere the commit above was backported. So as far as 2.8 for now but possibly to all stable versions.	2025-07-24 12:13:41 +02:00
Christopher Faulet	e542d2dfaa	MEDIUM: hlua: Update the socket applet to use its own buffers Thanks to this patch, the lua cosocket applet is now using its own buffers. .rcv_buf and .snd_buf callback functions are now defined to use the default raw functions. Functions to receive and send data have also been updated to use the applet API and to remove any dependencies on the stream-connectors and the channels.	2025-07-24 12:13:41 +02:00
Christopher Faulet	7e96ff6b84	BUG/MEDIUM: hlua: Report to SC when output data are blocked on a lua socket It is a fix similar to the previous one ("BUG/MEDIUM: hlua: Report to SC when data were consumed on a lua socket"), but for the write side. The writer must notify the cosocket that it needs more space in the request buffer to produce more data by calling sc_need_room(). Otherwise, there is nothing to prevent the cosocket applet from being woken up again and again. This patch must be backported as far as 2.8, and maybe to 2.6 too.	2025-07-24 12:13:41 +02:00
Christopher Faulet	21e45a61d1	BUG/MEDIUM: hlua: Report to SC when data were consumed on a lua socket The lua cosockets are quite strange. There is an applet used to handle the connection, and writers and readers subscribed to it to write or read data. Writers and readers are tasks woken up by the cosocket applet when data can be consumed or produced, depending on the channels buffers state. Then the cosocket applet is woken up by writers and readers when read or write events were performed. It means the cosocket applet has only little information on what was produced or consumed. It is the writers and readers responsibility to notify any blocking. Among other things, the readers must take care to notify the stream on top of the cosocket applet that some data was consumed. Otherwise, it may remain blocked, waiting for a write event (a write event from the stream point of view is a read event from the cosocket point of view). This patch must be backported as far as 2.8, and maybe to 2.6 too.	2025-07-24 12:13:41 +02:00
Christopher Faulet	48df877dab	MEDIUM: hlua: Update the http applet to use its own buffers Thanks to this patch, the lua HTTP applet is now using its own buffers. .rcv_buf and .snd_buf callback functions are now defined to use the default HTX functions. Functions to receive and send data have also been updated to use the applet API and to remove any dependencies on the stream-connectors and the channels.	2025-07-24 12:13:41 +02:00
Christopher Faulet	3e456be5ae	MINOR: hlua: Use the buffer instead of the HTTP message to get HTTP headers hlua_http_get_headers() function was using the HTTP message from the stream TXN to retrieve headers from a message. However, this will be an issue to update the lua HTTP applet to use its own buffers. Indeed, in that case, information from the channels will be unavailable. So hlua_http_get_headers() is now using a buffer containing an HTX message. It is just an API change because, internally, the function was already manipulating an HTX message.	2025-07-24 12:13:41 +02:00
Christopher Faulet	15080d9aae	MINOR: hlua: Fill the request array on the first HTTP applet run When a lua HTTP applet is created, a "request" object is created, filled with the request information (method, path, headers...), to be able to easily retrieve this information from the script. However, this was done when the appctx was created, retrieving the info from the request channel. To be able to update the applet to use its own buffer, it is now performed on the first applet run. Indeed, when the applet is created, the info is not forwarded yet and should not be accessed. Note that for now, information is still retrieved from the channel.	2025-07-24 12:13:41 +02:00
Christopher Faulet	fdb66e6c5e	MEDIUM: hlua: Update the tcp applet to use its own buffers Thanks to this patch, the lua TCP applet is now using its own buffers. .rcv_buf and .snd_buf callback functions are now defined to use the default raw functions. Other changes are quite light. Mainly, end of stream and errors are reported on the appctx instead of the stream-endpoint descriptor.	2025-07-24 12:13:41 +02:00
Christopher Faulet	1f9a1cbefc	MINOR: applet: Improve applet API to take care of inbuf/outbuf alloc failures applet_get_inbuf() and applet_get_outbuf() functions were not testing if the buffers were available. So, the caller had to check them before calling one of these functions. It is not really handy. So now, these functions take care to have a fully usable buffer before returning. Otherwise NULL is returned.	2025-07-24 12:13:41 +02:00
Christopher Faulet	44aae94ab9	MINOR: applet: Add HTX versions for applet_input_data() and applet_output_room() It will be useful for HTX applets because available data in the input buffer and available space in the output buffer are computed from the HTX message and not the buffer itself. So now, applet_htx_input_data() and applet_htx_output_room() functions can be used.	2025-07-24 12:13:41 +02:00
Christopher Faulet	d9855102cf	BUG/MEDIUM: Remove sync sends from streams to applets When the applet API was reviewed to use dedicated buffers, the support for sends from the streams to applets was added. Unfortunately, it was not a good idea because this way it is possible to deliver data to an applet and release it just after, truncating data. Indeed, the release stage for applets is related to the stream release itself. However, unlike the multiplexers, the applets cannot survive a stream for now. So, for now, the sync sends from the streams are removed for applets, waiting for a better way to handle the applets release stage. Note that this only concerns applets using their own buffers. And as of now, the bug is harmless because all refactored applets are on server side and consume data first. But this will be an issue with the HTTP client. This patch should be backported as far as 3.0 after a period of observation.	2025-07-24 12:13:41 +02:00
Christopher Faulet	574d0d8211	BUG/MINOR: applet: Fix applet_getword() to not return one extra byte applet_getword() function is returning one extra byte when a string is returned because the "ret" variable is not reset before the loop on the data. The patch also fixes applet_getline(). It is a 3.3-specific issue. No need to backport.	2025-07-24 12:13:41 +02:00
Christopher Faulet	41a40680ce	BUG/MEDIUM: stconn: Fix conditions to know an applet can get data from stream sc_is_send_allowed() function is used to know if an applet is able to receive data from the stream. But this function was designed for applets using the channels buffer. It is not adapted to applets using their own buffers. When the SE_FL_WAIT_DATA flag is set, it means the applet is waiting for more data and should not be woken up without new data. For applets using channels buffer, just testing the flag is enough because process_stream() will remove it when more data are available. For applets using their own buffers, it is more complicated. Some data may be blocked in the output channel buffer. In that case, and when the applet input buffer can receive data, the applet can be woken up. This patch must be backported as far as 3.0 after a period of observation.	2025-07-24 12:13:41 +02:00
Christopher Faulet	0d371d2729	BUG/MEDIUM: applet: State inbuf is no longer full if input data are skipped When data are skipped from the input buffer of an applet, we must take care to notify that the input buffer is no longer full. Otherwise, this could prevent the stream from pushing data to the applet. It is 3.3-specific. No backport needed.	2025-07-24 12:13:41 +02:00
Christopher Faulet	5b5ecf848d	BUG/MINOR: hlua: Skip headers when a receive is performed on an HTTP applet When an HTTP applet tries to retrieve data, the request headers are still in the buffer. But, instead of being silently removed, their size is removed from the amount of data retrieved. When the request payload is fully retrieved, it is not an issue. But it is a problem when a length is specified. The data are shortened by the headers size. So now, we take care to silently remove headers. This patch must be backported to all stable versions.	2025-07-24 12:13:41 +02:00
William Lallemand	8258c8166a	MINOR: acme: add ACME to the haproxy -vv feature list Add "ACME" in the feature list in order to check if the support was built successfully.	2025-07-24 11:49:11 +02:00
Remi Tricot-Le Breton	14615a8672	CLEANUP: ssl: Use only NIDs in curve name to id table The curve name to curve id mapping table was built out of multiple internal tables found in openssl sources, namely the 'nid_to_group' table found in 'ssl/t1_lib.c' which maps openssl specific NIDs to public IANA curve identifiers. In this table, there were two instances of EVP_PKEY_XXX ids being used while all the other ones are NID_XXX identifiers. Since the two EVP_PKEY are actually equal to their NID equivalent in 'include/openssl/evp.h' we can use NIDs all along for better coherence.	2025-07-24 10:58:54 +02:00
Ilia Shipitsin	a2267fafcf	CLEANUP: acme: fix wrong spelling of "resources" "ressources" was used as a variable name, let's use the English variant to make spell check happier.	2025-07-24 08:11:42 +02:00
William Lallemand	02db0e6b9f	BUG/MINOR: acme: allow "processing" in challenge requests Allow the "processing" status in the challenge object when requesting to do the challenge, in addition to "pending". According to RFC 8555 https://datatracker.ietf.org/doc/html/rfc8555/#section-7.1.6 Challenge objects are created in the "pending" state. They transition to the "processing" state when the client responds to the challenge (see Section 7.5.1) However some CA could respond with a "processing" state without ever transitioning to "pending". Must be backported to 3.2.	2025-07-23 16:07:03 +02:00
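The relaxed check described above can be sketched as follows. The function name `challenge_status_ready()` is hypothetical and does not come from the HAProxy ACME code; only the status strings are taken from RFC 8555.

```c
#include <string.h>  /* strcmp */

/* Return 1 if a challenge in state <status> may be requested, 0 otherwise.
 * "pending" was already accepted; "processing" is the state the fix adds.
 */
int challenge_status_ready(const char *status)
{
    return strcmp(status, "pending") == 0 ||
           strcmp(status, "processing") == 0;
}
```

Any other state, such as "valid" or "invalid", is still refused by this sketch.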
William Lallemand	c103123c9e	MINOR: acme: remove acme_req_auth() and use acme_post_as_get() instead acme_req_auth() is only a call to acme_post_as_get() now, there's no reason to keep the function. This patch removes it.	2025-07-23 16:07:03 +02:00
Amaury Denoyelle	08d664b17c	MEDIUM: mux-quic: support backend private connection If a backend connection is private, it should not be reused outside of its original attached session. As such, on stream detach operation, such connection is never inserted into server idle/avail list. Instead, it is stored directly on the session. The purpose of this commit is to implement proper handling of private backend connections via QUIC multiplexer.	2025-07-23 15:49:51 +02:00
Amaury Denoyelle	00d668549e	MINOR: mux-quic: do not reuse connection if app already shut QUIC connection graceful closure is performed in two steps. First, the application layer is closed. In the context of HTTP/3, this is done with a GOAWAY frame emission, which forbids opening of new streams. Then the whole connection is terminated via CONNECTION_CLOSE which is the final emitted frame. This commit ensures that when app layer is shut for a backend connection, this connection is removed from either idle or avail server tree. The objective is to prevent stream layer to try to reuse a connection if no new stream can be attached on it. New BUG_ON checks are inserted in qmux_strm_attach() and h3_attach() to ensure that this assertion is always true.	2025-07-23 15:45:18 +02:00
Amaury Denoyelle	3217835b1d	MEDIUM: mux-quic: implement be connection reuse Implement support for QUIC connection reuse on the backend side. The main change is done during detach stream operation. If a connection is idle, it is inserted in the server list. Else, it is stored in the server avail tree if there is room for more streams. For non idle connection, qmux_avail_streams() is reused to detect that stream flow-control limit is not yet reached. If the limit is reached, the connection is not inserted in the avail tree, so it cannot be reused, even if flow-control is unblocked later by the peer. This latter point could be improved in the future. Note that support for QUIC private connections is still missing. Reuse code will evolve to fully support this case.	2025-07-23 15:45:09 +02:00
Amaury Denoyelle	3bf37596ba	MINOR: mux-quic: store session in QCS instance Add a new <sess> member into QCS structure. It is used to store the parent session of the stream on attach operation. This is only done for backend side. This new member will become necessary when connection reuse is implemented. <owner> member of connection is not suitable as it could be set to NULL, notably after a session_add_conn() failure. Also, a single BE conn can be shared among different session instances, in particular when using aggressive/always reuse mode. Thus it is necessary to link each QCS instance with its session.	2025-07-23 15:42:37 +02:00
Amaury Denoyelle	826f797bb0	MINOR: mux-quic: disable glitch on backend side For now, QUIC glitch limit counter is only available on the frontend side. Thus, disable incrementation on the backend side for now. Also, session is only available as conn <owner> reliably on the frontend side, so session_add_glitch_ctr() operation is also secured.	2025-07-23 14:39:18 +02:00
Amaury Denoyelle	89329b147d	MINOR: mux-quic: correctly implement backend timeout qcc_refresh_timeout() is the function called on QUIC MUX activity. Its purpose is to update the timeout by selecting the correct value depending on the connection state. Prior to this patch, backend connections were mostly ignored by the function. However, the default server timeout was selected as a fallback. This is incompatible with backend connections reuse. This patch fixes timeout applied on backend connections. Only values specific to frontend which are http-request and http-keep-alive timeouts are now ignored for a backend connection. Also, fallback timeout is only used for frontend connections. This patch ensures that an idle backend connection won't be deleted due to server timeout. This is necessary for proper connection reuse which will be implemented in a future patch.	2025-07-23 14:36:48 +02:00
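The placement decision taken on stream detach can be summarized by the sketch below. All names are hypothetical, and treating "idle" as "no stream attached" is an assumption of this example rather than something stated in the commit message.

```c
/* Where a backend connection goes after a stream is detached. */
enum reuse_slot {
    REUSE_NONE,   /* not inserted anywhere, cannot be reused */
    REUSE_IDLE,   /* inserted in the server idle list */
    REUSE_AVAIL   /* inserted in the server avail tree */
};

/* <attached_streams>: streams still attached to the connection.
 * <avail_streams>: streams that may still be opened before hitting the
 * flow-control limit (the kind of value qmux_avail_streams() reports).
 */
enum reuse_slot pick_reuse_slot(int attached_streams, int avail_streams)
{
    if (attached_streams == 0)
        return REUSE_IDLE;
    if (avail_streams > 0)
        return REUSE_AVAIL;
    return REUSE_NONE;
}
```

A connection that lands in REUSE_NONE stays out of the avail tree even if the peer later raises the limit, which is the limitation the message mentions.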
Amaury Denoyelle	95cb763cd6	MINOR: mux-quic: refactor timeout code This commit is a small reorganization of the conditions used in qcc_refresh_timeout(). Its objective is to render the code more logical before the next patch which will ensure that timeout is properly set for backend connections.	2025-07-23 14:36:48 +02:00
Amaury Denoyelle	558532fc57	BUG/MINOR: mux-quic: ensure close-spread-time is properly applied If a connection remains on a proxy currently disabled or stopped, a special spread timeout is set if active close is configured. For QUIC MUX, this is set via qcc_refresh_timeout() as with all other timeout values. Fix this closing timeout setting: it is now used as an override to any other timeout that may have been chosen if calculated spread time is lower than the previously selected value. This is done for backend connections as well. This should be backported up to 2.6 after a period of observation.	2025-07-23 14:36:48 +02:00
Amaury Denoyelle	c5bcc3a21e	BUG/MINOR: mux-quic: apply correctly timeout on output pending data When no stream is attached, mux layer is responsible to maintain a timeout. The first criterion is to apply client/server timeout if there is still data waiting for emission. Previously, <hreq> qcc member was used to determine this state. However, this only covers bidirectional streams. Fix this by testing if <send_list> is empty or not. This is enough to take into account both bidi and uni streams. Theoretically, this should be backported to all stable versions. However, send-list is not available on 2.6 and there is no alternative to quickly determine if there is waiting output data. Thus, it's better to backport it up to 2.8 only.	2025-07-23 14:36:48 +02:00
William Lallemand	7139ebd676	BUG/MEDIUM: acme: use POST-as-GET instead of GET for resources The requests that checked the status of the challenge and the retrieval of the certificate were done using a GET. This is working with letsencrypt and other CA providers, but it might not work everywhere. RFC 8555 specifies that only the directory and newNonce resources MUST work with GET requests, but everything else must use POST-as-GET. Must be backported to 3.2.	2025-07-23 12:42:23 +02:00
Aurelien DARRAGON	054fa05e1f	MINOR: log: explicitly ignore "log-steps" on backends "log-steps" was already ignored if directly defined in a backend section, however, when defined in a defaults section it was inherited by all proxies no matter their capability (ie: including backends). As configurations often contain more backends than frontends, this would result in wasted memory given that the log-steps setting is only considered on frontends. Let's fix that by preventing the inheritance from defaults section to anything else than frontends. Also adjust the documentation to mention that the setting is not relevant for backends.	2025-07-22 10:22:04 +02:00
Amaury Denoyelle	e02939108e	BUG/MINOR: h3: fix uninitialized value in h3_req_headers_send() Due to the introduction of smallbuf usage for HTTP/3 headers emission, ret variable may be used uninitialized if buffer allocation fails due to not enough room in QUIC connection window. Fix this by setting ret value to 0. Function variable declaration are also adjusted so that the pattern is similar to h3_resp_headers_send(). Finally, outbuf buffer is also removed as it is now unused. No need to backport.	2025-07-22 09:42:52 +02:00
Amaury Denoyelle	cbbbf4ea43	MINOR: h3: add traces to h3_req_headers_send() Add traces during HTTP/3 request encoding. This operation is performed on the backend side.	2025-07-21 16:58:12 +02:00
Amaury Denoyelle	3126cba82e	MINOR: h3: use smallbuf for request header emission Similarly to HTTP/3 response encoding, a small buffer is first allocated for the request encoding on the backend side. If this is not sufficient, the smallbuf is replaced by a standard buffer and encoding is restarted. This is useful to reduce the window usage over a connection of smaller requests.	2025-07-21 16:58:12 +02:00
Remi Tricot-Le Breton	7fd849f4e0	MINOR: ssl: Remove ClientHello specific traces if !HAVE_SSL_CLIENT_HELLO_CB SSL libraries like wolfSSL that don't have the clienthello callback mechanism enabled do not need to have the traces that are only called from the said callback. The code added to parse the ciphers relied on a function that was not defined in wolfSSL (SSL_CIPHER_find).	2025-07-21 16:44:50 +02:00
Remi Tricot-Le Breton	665b7d4fa9	MINOR: ssl: Dump ciphers and sigalgs details in trace with 'advanced' verbosity The contents of the extensions were only dumped with verbosity 'complete' which meant that the 'advanced' verbosity was pretty much useless despite what its name implies (it was the same as the 'simple' one). The 'advanced' verbosity is now the "maximum" one, using 'complete' would not add any extra information yet, but it leaves more room for some actually large traces to be dumped later on (some complete ClientHello dumps for instance).	2025-07-21 16:44:50 +02:00
Remi Tricot-Le Breton	8f2b787241	MINOR: ssl: Add curves in ssl traces Dump the ClientHello curves in the SSL traces.	2025-07-21 16:44:50 +02:00
Remi Tricot-Le Breton	d799a1b3b2	MINOR: ssl: Add curve id to curve name table and mapping functions The SSL libraries like OpenSSL for instance do not seem to actually provide a public mapping between IANA defined curve IDs and curve names, or even a mapping between curve IDs and internal NIDs. This new table groups all this information in a single place so that we can convert curve names (be it SECG or NIST format) to curve IDs or NIDs. The previously existing 'curves2nid' function now uses the new table, and a new 'curveid2str' one is added.	2025-07-21 16:44:50 +02:00
Remi Tricot-Le Breton	f00d9bf12d	MINOR: ssl: Add ciphers in ssl traces Decode the contents of the ClientHello ciphers extension and dump a human readable list in the ssl traces.	2025-07-21 16:44:50 +02:00
Amaury Denoyelle	b0fe453079	BUG/MINOR: hq-interop: fix FIN transmission Since the following patch, app_ops layer is now responsible to report that HTX block was the last transmitted so that FIN STREAM can be set. This is mandatory to properly support HTTP 1xx interim responses. f349df44b4e21d8bf9b575a0aa869056a2ebaa58 MINOR: qmux: change API for snd_buf FIN transmission This change was correctly implemented in HTTP/3 code, however an issue appeared on hq-interop transcoder in case zero-copy DATA transfer is performed when HTX buffer is swapped. If this occurred during the transfer of the last HTX block, EOM is not detected and thus STREAM FIN is never set. Most of the time, QMUX shut callback is called immediately after. This results in an emission of a RESET_STREAM to the client, which prevents the data transfer. To fix this, use the same method as HTTP/3: HTX EOM flag status is checked before any transfer, thus preserving it even after a zero-copy. Criticality of this bug is low as hq-interop is experimental and is mostly used for interop testing. This should fix github issue #3038. This patch must be backported wherever the above one is.	2025-07-21 15:38:02 +02:00
Aurelien DARRAGON	563b4fafc2	BUG/MINOR: logs: fix log-steps extra log origins selection Willy noticed that it was not possible to select extra log origins using log-steps directive. Extra origins are the ones registered using log_orig_register() such as http-req. The reason was that the error path was always executed during extra log origin matching for log-steps parser, while it should only be executed if no match was found. It should be backported to 3.1.	2025-07-21 15:33:55 +02:00
Olivier Houchard	f8e9545f70	BUG/MEDIUM: threads: Disable the workaround to load libgcc_s on macOS Don't use the workaround to load libgcc_s on macOS. It is not needed there, and it causes issues, as recent macOS dislikes processes that fork after threads were created (and the workaround creates a temporary thread). This fixes crashes on macOS at least when using master-worker, and using the system resolver. This should fix GitHub issue #3035. This should be backported up to 2.8.	2025-07-21 13:56:29 +02:00
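The corrected control flow, where the error path is reached only when no registered origin matches, can be sketched as below. `find_log_origin()` and its parameters are hypothetical and unrelated to the real log-steps parser.

```c
#include <stddef.h>  /* size_t */
#include <string.h>  /* strcmp */

/* Look <name> up in <origins>. Return its index on a match; return -1
 * (the error path) only once every entry has been compared.
 */
int find_log_origin(const char *name, const char *const *origins, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (strcmp(name, origins[i]) == 0)
            return (int)i;  /* match: leave before the error path */
    }
    return -1;              /* no match found */
}
```

The bug pattern described above corresponds to falling through to the error return even after a successful comparison.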
Valentine Krasnobaeva	5b45251d19	BUILD: debug: add missed guard USE_CPU_AFFINITY to show cpu bindings Not all platforms support thread-cpu bindings, so let's put cpu_topo_dump_summary() under USE_CPU_AFFINITY guards. Only needs to be backported if 1cc0e023ce ("MINOR: debug: add thread-cpu bindings info in 'show dev' output") is backported.	2025-07-21 11:25:08 +02:00
Frederic Lecaille	14d0f74052	MINOR: quic: Remove pool_head_quic_be_cc_buf pool This patch impacts the QUIC frontends. It reverts this patch MINOR: quic-be: add a "CC connection" backend TX buffer pool which added the new <pool_head_quic_be_cc_buf> pool to allocate CC (connection closed state) TX buffers with a bigger object size than the one for <pool_head_quic_cc_buf>. Indeed the QUIC backends must be able to send at least 1200 bytes Initial packets. From now on, both the QUIC frontends and backends use the same pool with MAX(QUIC_INITIAL_IPV6_MTU, QUIC_INITIAL_IPV4_MTU) (1252 bytes) as object size.	2025-07-17 19:33:21 +02:00
Valentine Krasnobaeva	1cc0e023ce	MINOR: debug: add thread-cpu bindings info in 'show dev' output Add thread-cpu bindings info in 'show dev' output, as it can be useful for debugging.	2025-07-17 19:08:13 +02:00
Valentine Krasnobaeva	ff461efc59	MINOR: debug: align output style of debug_parse_cli_show_dev with cpu_dump_topology Align titles style of debug_parse_cli_show_dev() with cpu_dump_topology(). We will call the latter inside of debug_parse_cli_show_dev() to show thread-cpu bindings info.	2025-07-17 19:08:06 +02:00
Valentine Krasnobaeva	9e11c852fe	MINOR: cpu-topo: write thread-cpu bindings into trash buffer Write thread-cpu bindings and cluster summary into provided trash buffer. Like this we can call this function in any place, when this info is needed.	2025-07-17 19:07:58 +02:00
Valentine Krasnobaeva	2405283230	MINOR: cpu-topo: split cpu_dump_topology() to show its summary in show dev cpu_dump_topology() prints details about each enabled CPU and a summary with clusters info and thread-cpu bindings. The latter is often useful for debugging and we want to add it in the 'show dev' output. So, let's split cpu_dump_topology() in two parts: cpu_topo_debug() to print the details about each enabled CPU; and cpu_topo_dump_summary() to print only the summary. In the next commit we will modify cpu_topo_dump_summary() to write into local trash buffer and it could be easily called from debug_parse_cli_show_dev().	2025-07-17 19:07:46 +02:00
Valentine Krasnobaeva	254e4d59f7	BUG/MINOR: halog: exit with error when some output filters are set simultaneosly Exit with an error if multiple output filters (-ic, -srv, -st, -tc, -u*, etc.) are used at the same time. halog is designed to process and display output for only one filter at a time. Using multiple filters simultaneously can cause a crash because the program is not designed to manage multiple, separate result sets (e.g., one for IP counts, another for URLs). Supporting simultaneous filters would require a redesign to collect entries for each filter in separate ebtree. This would negatively impact performance and is not requested for the moment. This patch prevents the crash by checking filter combinations just after the command line parsing. This issue was reported in GitHub #3031. This should be backported in all stable versions.	2025-07-17 17:22:37 +02:00
Frederic Lecaille	4eef300a2c	BUG/MEDIUM: quic-be: CC buffer released from wrong pool The "connection close state" TX buffer is used to build the datagram with basically a CONNECTION_CLOSE frame to notify the peer about the connection closure. It allows the quic_conn memory release and its replacement by a lighter quic_cc_conn struct. For the QUIC backend, there is a dedicated pool to build such datagrams from bigger TX buffers. But from quic_conn_release(), this is the pool dedicated to the QUIC frontends which was used to release the QUIC backend TX buffers. This patch simply adds a test about the target of the connection to release the "connection close state" TX buffers from the correct pool. No backport needed.	2025-07-17 11:48:41 +02:00
Willy Tarreau	b6d0ecd258	DOC: connection: explain the rules for idle/safe/avail connections It's super difficult to find the rules that operate idle conns depending on their idle/safe/avail/private status. Some are in lists, others not. Some are in trees, others not. Some have a flag set, others not. This documents the rules before the definitions in connection-t.h. It could even be backported to help during backport sessions.	2025-07-16 18:53:57 +02:00
Frederic Lecaille	838024e07e	MINOR: quic: Get rid of qc_is_listener() Replace all calls to qc_is_listener() (resp. !qc_is_listener()) by calls to objt_listener() (resp. objt_server()). Remove the qc_is_listener() implementation and QUIC_FL_CONN_LISTENER, the flag it relied on.	2025-07-16 16:42:21 +02:00
Willy Tarreau	d9701d312d	DEV: gdb: add a memprofile decoder to the debug tools "memprof_dump" will visit memprofile entries and dump them in a synthetic format counting allocations/releases count/size, type and calling address.	2025-07-16 15:33:33 +02:00
Christopher Faulet	4f7c26cbb3	BUG/MINOR: applet: Don't trigger BUG_ON if the tid is not on appctx init When an appctx is initialized, there is a BUG_ON() to be sure the appctx is really initialized on the right thread to avoid bugs on the thread affinity. However, it is possible to not choose the thread when the appctx is created and let it start on any thread. In that case, the thread affinity is set when the appctx is initialized. So, we must take care to not trigger the BUG_ON() in that case. For now, we never hit the bug because the thread affinity is always set during the appctx creation. This patch must be backported as far as 2.8.	2025-07-16 13:47:33 +02:00
Amaury Denoyelle	88c0422e49	MINOR: h3: remove unused outbuf in h3_resp_headers_send() Cleanup h3_resp_headers_send() by removing outbuf buffer variable which is not necessary anymore.	2025-07-16 10:30:59 +02:00
Frederic Lecaille	1c33756f78	BUG/MINOR: quic: Wrong source address use on FreeBSD The bug is a listener only one, and only occurred on FreeBSD. The FreeBSD issue has been reported here: https://forums.freebsd.org/threads/quic-http-3-with-haproxy.98443/ where QUIC traces could reveal that sendmsg() calls lead to EINVAL syscall errnos. Such a similar issue could be reproduced from a FreeBSD 14-2 VM with reg-tests/quic/retry.vtc as reg test. As noted by Olivier, this issue could be fixed within the VM binding the listener socket to INADDR_ANY. That said, the symptoms are not exactly the same as the one reported by the user. What could be observed from such a VM is that if the first recvmsg() call returns the datagram destination address, and if the listener listening address is bound to a specific address, the calls to sendmsg() fail because of the IP_SENDSRCADDR ip option value set by cmsg_set_saddr(). According to the ip(4) freebsd manual such an IP option must be used if the listening socket is bound to a specific address. It is to be noted that into a VM the first call to recvmsg() of the first connection does not return the datagram destination address. This leads the first quic_conn to be initialized without ->local_addr value. This is this value which is used by IP_SENDSRCADDR ip option. In this case, the sendmsg() calls (without IP_SENDSRCADDR) never fail. The issue appears at the second condition. This patch replaces the conditions to use IP_SENDSRCADDR to a call to qc_may_use_saddr(). This latter also checks that the listener listening address is not INADDR_ANY to allow the use of the source address. It is generalized to all the OSes. Indeed, there is no reason to set the source address when the listener is bound to a specific address. Must be backported as far as 2.8.	2025-07-16 10:17:54 +02:00
Amaury Denoyelle	63586a8ab4	BUG/MINOR: h3: properly handle interim response on BE side On backend side, H3 layer is responsible to decode a HTTP/3 response into an HTX message. Multiple responses may be received on a single stream with interim status codes prior to the final one. h3_resp_headers_to_htx() is the function used solely on backend side responsible for H3 response to HTX transcoding. This patch extends it to be able to properly support interim responses. When such a response is received, the new flag H3_SF_RECV_INTERIM is set. This is converted to QMUX qcs flag QC_SF_EOI_SUSPENDED. The objective of this latter flag is to prevent stream EOI to be reported during stream rcv_buf callback, even if HTX message contains EOM and is empty. QC_SF_EOI_SUSPENDED will be cleared when the final response is finally converted, which unblock stream EOI notification for next rcv_buf invocations. Note however that HTX EOM is untouched : it is always set for both interim and final response reception. As a minor adjustment, HTX_SL_F_BODYLESS is always set for interim responses. Contrary to frontend interim response handling, a flag is necessary on QMUX layer. This is because H3 to HTX transcoding and rcv_buf callback are two distinct operations, called under different context (MUX vs stream tasklet). Also note that H3 layer has two distinct flags for interim response handling, one only used as a server (FE side) and the other as a client (BE side). It was preferred to use two distinct flags which is considered less error-prone, contrary to a single unified flag which would require to always set the proxy side to ensure it is relevant or not. No need to backport.	2025-07-15 18:39:23 +02:00
Amaury Denoyelle	e7b3a69c59	BUG/MEDIUM: h3: handle interim response properly on FE side On frontend side, HTTP/3 layer is responsible to transcode an HTX response message into HTTP/3 HEADERS frame. This operation is handled via h3_resp_headers_send(). Prior to this patch, if HTX EOM was encountered in the HTX message after response transcoding, <fin> was reported to the QMUX layer. This will in turn cause FIN stream bit to be set when the response is emitted. However, this is not correct as a single HTX response can be constituted of several interim messages, each delimited by EOM block. Most of the time, this bug will cause the client to close the connection as it is invalid to receive an interim response with FIN bit set. Fix this by now properly differentiating interim and final response. During interim response transcoding, the new flag H3_SF_SENT_INTERIM will be set, which will prevent <fin> to be reported. Thus, <fin> will only be notified for the final response. This must be backported up to 2.6. Note that it relies on the previous patch which also must be taken.	2025-07-15 18:39:23 +02:00
Amaury Denoyelle	f349df44b4	MINOR: qmux: change API for snd_buf FIN transmission Previous patches have fixed interim response encoding via h3_resp_headers_send(). However, it is still necessary to adjust h3 layer state-machine so that several successive HTTP responses are accepted for a single stream. Prior to this, QMUX was responsible to decree that the final HTX message was encoded so that FIN stream can be emitted. However, with interim response, MUX is in fact unable to properly determine this. As such, this is the responsibility of the application protocol layer. To reflect this, app_ops snd_buf callback is modified so that a new output argument <fin> is added to it. Note that for now this commit does not bring any functional change. However, it will be necessary for the following patch. As such, it should be backported prior to it to every version as necessary.	2025-07-15 18:39:23 +02:00
Amaury Denoyelle	d8b34459b5	BUG/MINOR: h3: ensure that invalid status code are not encoded (FE side) On frontend side, H3 layer transcodes HTX status code into HTTP/3 HEADERS frame. This is done by calling qpack_encode_int_status(). Prior to this patch, the latter function was also responsible to reject an invalid value, which guarantees that only valid codes are encoded (between 100 and 999 values). However, this is not practical as it is impossible to differentiate between an invalid code error and a buffer room exhaustion. Change this so that now HTTP/3 layer first ensures that HTX code is valid. The stream is closed with H3_INTERNAL_ERROR if invalid value is present. Thus, qpack_encode_int_status() will only report an error due to buffer room exhaustion. If a small buffer is used, a standard buffer will be reallocated which should be sufficient to encode the response. The impact of this bug is minimal. Its main benefit is code clarity, while also removing an unnecessary realloc when confronting with an invalid HTTP code. This should be backported at least up to 3.1. Prior to it, smallbuf mechanism isn't present, hence the impact of this patch is less important. However, it may still be backported to older versions, which should facilitate picking patches for HTTP 1xx interim response support.	2025-07-15 18:39:23 +02:00
Amaury Denoyelle	d59bdfb8ec	BUG/MINOR: h3: properly realloc buffer after interim response encoding Previous commit fixes encoding of several following HTTP response messages when interim status codes are first reported. However, h3_resp_headers_send() still was unable to interrupt encoding if output buffer room was not sufficient. This case may be likely because small buffers are used for headers encoding. This commit fixes this situation. If output buffer is not empty prior to response encoding, this means that a previous interim response message was already encoded before. In this case, and if remaining space is not sufficient, use buffer release mechanism : this allows to restart response encoding by using a newer buffer. This process has already been used for DATA and trailers encoding. This must be backported up to 2.6. However, note that buffer release mechanism is not present for version 2.8 and lower. In this case, qcs flag QC_SF_BLK_MROOM should be enough as a replacement.	2025-07-15 18:39:23 +02:00
Amaury Denoyelle	1290fb731d	BUG/MEDIUM: h3: do not overwrite interim with final response An HTTP response may contain several interim response messages (1xx status) prior to a final response message (all other status codes). This may cause issues with h3_resp_headers_send() called for response encoding which assumes that it is only called one time per stream, most notably during output buffer handling. This commit fixes output buffer handling when h3_resp_headers_send() is called multiple times due to an interim response. Prior to it, interim response was overwritten with newer response message. Most of the time, this resulted in error for the client due to QPACK decoding failure. This is now fixed so that each response is encoded one after the other. Note that if encoding of several responses is bigger than output buffer, an error is reported. This can definitely occur as small buffers are used during header encoding. This situation will be improved by the next patch. This must be backported up to 2.6.	2025-07-15 18:39:23 +02:00
Willy Tarreau	110625bdb2	MINOR: debug: report haproxy and operating system info in panic dumps The goal is to help figure the OS version (kernel and userland), any virtualization/containers, and the haproxy version and build features. Sometimes even reporters themselves can be mistaken about the running version or environment. Also printing this at the top helps draw a visual delimitation between warnings and panic. Now we get something like this: PANIC! Thread 1 is about to kill the process. HAProxy info: version: 3.3-dev3-c863c0-18 features: +51DEGREES +ACCEPT4 +BACKTRACE -CLOSEFROM +CPU_AFFINITY (...) Operating system info: virtual machine: no container: no kernel: Linux 6.1.131 #1 SMP PREEMPT_DYNAMIC Fri Mar 14 01:04:55 CET 2025 x86_64 userland: Slackware 15.0 x86_64 * Thread 1 : id=0x7f615a8775c0 act=1 glob=0 wq=1 rq=0 tl=0 tlsz=0 rqsz=0 1/1 stuck=0 prof=0 harmless=0 isolated=0 cpu_ns: poll=1835010197 now=1835066102 diff=55905 (...)	2025-07-15 17:18:29 +02:00
Willy Tarreau	abcc73830f	MEDIUM: proxy: register a post-section cleanup function For listen/frontend/backend, we now want to be able to clean up the default-server directive that's no longer used past the end of the section. For this we register a post-section function and perform the cleanup there.	2025-07-15 10:40:17 +02:00
Willy Tarreau	49a619acae	MEDIUM: proxy: no longer allocate the default-server entry by default The default-server entry used to always be allocated. Now we'll postpone its allocation for the first time we need it, i.e. during a "default-server" directive, or when inheriting a defaults section which has one. The memory savings are significant, on a large configuration with 100k backends and no default-server directive, the memory usage dropped from 800MB RSS to 420MB (380 MB saved). It should be possible to also address configs using default-server by releasing this entry when leaving the proxy section, which is not done yet.	2025-07-15 10:39:44 +02:00
Willy Tarreau	76828d4120	MINOR: proxy: add checks for defsrv's validity Now we only copy the default server's settings if such a default server exists, otherwise we only initialize it. At the moment it always exists. The change is mostly performed in srv_settings_cpy() since that's where each caller passes through, and there's no point duplicating that test everywhere.	2025-07-15 10:36:58 +02:00
Willy Tarreau	4ac28f07d0	MEDIUM: proxy: take the defsrv out of the struct proxy The server struct has gone huge over time (~3.8kB), and having a copy of it in the defsrv section of the struct proxy costs a lot of RAM, that is not needed anymore at run time. This patch replaces this struct with a dynamically allocated one. The field is allocated and initialized during alloc_new_proxy() and is freed when the proxy is destroyed for now. But the goal will be to support freeing it after parsing the section.	2025-07-15 10:34:18 +02:00
Willy Tarreau	2414c5ce2f	CLEANUP: server: be sure never to compare src against a non-existing defsrv The test in srv_ssl_settings_cpy() comparing src to the server's proxy's default server does work but it's a subtle trap. Indeed, no check is made on srv->proxy to be valid, and this only works because the compiler is comparing pointer offsets. During the boot, it's common to have NULL here in srv->proxy and of course in this case srv does not match that value which is NULL plus epsilon. But when trying to turn defsrv to a dynamic pointer instead, then the compiler is forced to dereference this NULL srv->proxy and dies during init. Let's always add the null check for srv->proxy before the test to avoid this situation. No backport is needed since the problem cannot happen yet.	2025-07-15 10:33:08 +02:00
Willy Tarreau	36f339d2fe	CLEANUP: stream: use server_find_by_addr() in sticking_rule_find_target() This makes this function a bit less of a mess by no longer manipulating the low-level server address nodes nor the proxy lock.	2025-07-15 10:30:28 +02:00
Willy Tarreau	616c10f608	CLEANUP: server: add server_find_by_addr() Server lookup by address requires locking and manipulation of the tree from user code. Let's provide server_find_by_addr() which does that for us.	2025-07-15 10:30:28 +02:00
Willy Tarreau	fda04994d9	CLEANUP: server: simplify server_find_by_id() At a few places we're seeing some open-coding of the same function, likely because it looks overkill for what it's supposed to do, due to extraneous tests that are not needed (e.g. check of the backend's PR_CAP_BE etc). Let's just remove all these superfluous tests and inline it so that it feels more suitable for use everywhere it's needed.	2025-07-15 10:30:28 +02:00
Willy Tarreau	c8f0b69587	CLEANUP: stream: lookup server ID using standard functions The server lookup in sticking_rule_find_target() uses an open-coded tree search while we have a function for this server_find_by_id(). In addition, due to the way it's coded, the stick-table lock also covers the server lookup by accident instead of being released earlier. This is not a real problem though since such feature is rarely used nowadays. Let's clean all this stuff by first retrieving the ID under the lock and then looking up the corresponding server.	2025-07-15 10:30:28 +02:00
Willy Tarreau	a3443db2eb	CLEANUP: cfgparse: lookup proxy ID using existing functions The code used to detect proxy id conflicts uses an open-coded lookup in the ID tree which is not necessary since we already have functions for this. Let's switch to that instead.	2025-07-15 10:30:28 +02:00
Willy Tarreau	31526f73e6	CLEANUP: server: use server_find_by_name() where relevant Instead of open-coding a tree lookup, in sticking rules and server_find(), let's just rely on server_find_by_name() which now does exactly the same.	2025-07-15 10:30:28 +02:00
Willy Tarreau	61acd15ea8	CLEANUP: server: rename findserver() to server_find_by_name() Now it's more logical and matches what is done in the rest of these functions. server_find() now relies on it.	2025-07-15 10:30:28 +02:00
Willy Tarreau	6ad9285796	CLEANUP: server: rename server_find_by_name() to server_find() This function doesn't just look at the name but also the ID when the argument starts with a '#'. So the name is not correct and explains why this function is not always used when the name only is needed, and why the list-based findserver() is used instead. So let's just call the function "server_find()", and rename its generation-id based cousin "server_find_unique()".	2025-07-15 10:30:28 +02:00
Willy Tarreau	5e78ab33cd	MINOR: server: use the tree to look up the server name in findserver() Let's just use the tree-based lookup instead of walking through the list. This function is used to find duplicates in "track" statements and a few such places, so it's important not to waste too much time on large setups.	2025-07-15 10:30:27 +02:00
Willy Tarreau	12a6a3bb3f	REORG: server: move findserver() from proxy.c to server.c The reason this function was overlooked is that it had mostly equivalent ones in server.c, let's move them together.	2025-07-15 10:30:27 +02:00
Willy Tarreau	732cd0dfa2	CLEANUP: server: do not check for duplicates anymore in findserver() findserver() used to check for duplicate server names. These are no longer accepted in 3.3 so let's get rid of that test and simplify the code. Note that the function still only uses the list instead of the tree.	2025-07-15 10:30:27 +02:00
Willy Tarreau	d4d72e2303	[RELEASE] Released version 3.3-dev3 Released version 3.3-dev3 with the following main changes : - BUG/MINOR: quic-be: Wrong retry_source_connection_id check - MEDIUM: sink: change the sink mode type to PR_MODE_SYSLOG - MEDIUM: server: move _srv_check_proxy_mode() checks from server init to finalize - MINOR: server: move send-proxy* incompatibility check in _srv_check_proxy_mode() - MINOR: mailers: warn if mailers are configured but not actually used - BUG/MEDIUM: counters/server: fix server and proxy last_change mixup - MEDIUM: server: add and use a separate last_change variable for internal use - MEDIUM: proxy: add and use a separate last_change variable for internal use - MINOR: counters: rename last_change counter to last_state_change - MINOR: ssl: check TLS1.3 ciphersuites again in clienthello with recent AWS-LC - BUG/MEDIUM: hlua: Forbid any L6/L7 sample fetche functions from lua services - BUG/MEDIUM: mux-h2: Properly handle connection error during preface sending - BUG/MINOR: jwt: Copy input and parameters in dedicated buffers in jwt_verify converter - DOC: Fix 'jwt_verify' converter doc - MINOR: jwt: Rename pkey to pubkey in jwt_cert_tree_entry struct - MINOR: jwt: Remove unused parameter in convert_ecdsa_sig - MAJOR: jwt: Allow certificate instead of public key in jwt_verify converter - MINOR: ssl: Allow 'commit ssl cert' with no privkey - MINOR: ssl: Prevent delete on certificate used by jwt_verify - REGTESTS: jwt: Add test with actual certificate passed to jwt_verify - REGTESTS: jwt: Test update of certificate used in jwt_verify - DOC: 'jwt_verify' converter now supports certificates - REGTESTS: restrict execution to a single thread group - MINOR: ssl: Introduce new smp_client_hello_parse() function - MEDIUM: stats: add persistent state to typed output format - BUG/MINOR: httpclient: wrongly named httpproxy flag - MINOR: ssl/ocsp: stop using the flags from the httpclient CLI - MEDIUM: httpclient: split the CLI from the actual 
httpclient API - MEDIUM: httpclient: implement a way to use directly htx data - MINOR: httpclient/cli: add --htx option - BUILD: dev/phash: remove the accidentally committed a.out file - BUG/MINOR: ssl: crash in ssl_sock_io_cb() with SSL traces and idle connections - BUILD/MEDIUM: deviceatlas: fix when installed in custom locations. - DOC: deviceatlas build clarifications - BUG/MINOR: ssl/ocsp: fix definition discrepancies with ocsp_update_init() - MINOR: proto-tcp: Add support for TCP MD5 signature for listeners and servers - BUILD: cfgparse-tcp: Add _GNU_SOURCE for TCP_MD5SIG_MAXKEYLEN - BUG/MINOR: proto-tcp: Take care to initialized tcp_md5sig structure - BUG/MINOR: http-act: Fix parsing of the expression argument for pause action - MEDIUM: httpclient: add a Content-Length when the payload is known - CLEANUP: ssl: Rename ssl_trace-t.h to ssl_trace.h - MINOR: pattern: add a counter of added/freed patterns - CI: set DEBUG_STRICT=2 for coverity scan - CI: enable USE_QUIC=1 for OpenSSL versions >= 3.5.0 - CI: github: add an OpenSSL 3.5.0 job - CI: github: update the stable CI to ubuntu-24.04 - BUG/MEDIUM: quic: SSL/TCP handshake failures with OpenSSL 3.5 - CI: github: update to OpenSSL 3.5.1 - BUG/MINOR: quic: Missing TLS 1.3 QUIC cipher suites and groups inits (OpenSSL 3.5 QUIC API) - BUG/MINOR: quic-be: Malformed coalesced Initial packets - MINOR: quic: Prevent QUIC backend use with the OpenSSL QUIC compatibility module (USE_OPENSS_COMPAT) - MINOR: reg-tests: first QUIC+H3 reg tests (QUIC address validation) - MINOR: quic-be: Set the backend alpn if not set by conf - MINOR: quic-be: TLS version restriction to 1.3 - MINOR: cfgparse: enforce QUIC MUX compat on server line - MINOR: server: support QUIC for dynamic servers - CI: github: skip a ssl library version when latest is already in the list - MEDIUM: resolvers: switch dns-accept-family to "auto" by default - BUG/MINOR: resolvers: don't lower the case of binary DNS format - MINOR: resolvers: do not duplicate the 
hostname_dn field - MINOR: proto-tcp: Register a feature to report TCP MD5 signature support - BUG/MINOR: listener: really assign distinct IDs to shards - MINOR: quic: Prevent QUIC build with OpenSSL 3.5 new QUIC API version < 3.5.1 - BUG/MEDIUM: quic: Crash after QUIC server callbacks restoration (OpenSSL 3.5) - REGTESTS: use two haproxy instances to distinguish the QUIC traces - BUG/MEDIUM: http-client: Don't wake http-client applet if nothing was xferred - BUG/MEDIUM: http-client: Properly inc input data when HTX blocks are xferred - BUG/MEDIUM: http-client: Ask for more room when request data cannot be xferred - BUG/MEDIUM: http-client: Test HTX_FL_EOM flag before commiting the HTX buffer - BUG/MINOR: http-client: Ignore 1XX interim responses in non-HTX mode - BUG/MINOR: http-client: Reject any 101-switching-protocols response - BUG/MEDIUM: http-client: Drain the request if an early response is received - BUG/MEDIUM: http-client: Notify applet has more data to deliver until the EOM - BUG/MINOR: h3: fix https scheme request encoding for BE side - MINOR: h1-htx: Add function to format an HTX message in its H1 representation - BUG/MINOR: mux-h1: Use configured error files if possible for early H1 errors - BUG/MINOR: h1-htx: Don't forget to init flags in h1_format_htx_msg function - CLEANUP: assorted typo fixes in the code, commits and doc - BUILD: adjust scripts/build-ssl.sh to modern CMake system of QuicTLS - MINOR: debug: add distro name and version in postmortem	2025-07-11 16:45:50 +02:00
Valentine Krasnobaeva	0c63883be1	MINOR: debug: add distro name and version in postmortem Since 2012, systemd compliant distributions contain /etc/os-release file. This file has some standardized format, see details at https://www.freedesktop.org/software/systemd/man/latest/os-release.html. Let's read it in feed_post_mortem_linux() to gather more info about the distribution. (cherry picked from commit f1594c41368baf8f60737b229e4359fa7e1289a9) Signed-off-by: Willy Tarreau <w@1wt.eu>	2025-07-11 11:48:19 +02:00
Ilia Shipitsin	1888991e12	BUILD: adjust scripts/build-ssl.sh to modern CMake system of QuicTLS QuicTLS in master branch has migrated to CMake, let's adapt the script to it. Previous OpenSSL+QuicTLS patch is built as usual.	2025-07-11 05:04:31 +02:00
Ilia Shipitsin	0ee3d739b8	CLEANUP: assorted typo fixes in the code, commits and doc Corrected various spelling and phrasing errors to improve clarity and consistency.	2025-07-10 19:49:48 +02:00
Christopher Faulet	516dfe16ff	BUG/MINOR: h1-htx: Don't forget to init flags in h1_format_htx_msg function The regression was introduced by commit 187ae28 ("MINOR: h1-htx: Add function to format an HTX message in its H1 representation"). We must be sure the flags variable is initialized in h1_format_htx_msg() function. This patch must be backported with the commit above.	2025-07-10 14:10:42 +02:00
Christopher Faulet	d252ec2beb	BUG/MINOR: mux-h1: Use configured error files if possible for early H1 errors The H1 multiplexer is able to produce some errors on its own to report early errors, before the stream is created. In that case, the error files of the proxy were tested to detect empty files (or /dev/null) but they were not used to produce the error itself. But the documentation states that configured error files are used in all cases. And in fact, it is not really a problem to use these files. We must just format a full HTX message. Thanks to the previous patch, it is now possible. This patch should fix the issue #3032. It should be backported to 3.2. For older versions, it must be discussed but it should be quite easy to do.	2025-07-10 10:29:49 +02:00
Christopher Faulet	187ae28cf4	MINOR: h1-htx: Add function to format an HTX message in its H1 representation The function h1_format_htx_msg() can now be used to convert a valid HTX message in its H1 representation. No validity test is performed, the HTX message must be valid. Only trailers are silently ignored if the message is not chunked. In addition, the destination buffer must be empty. 1XX interim responses should be supported. But again, there are no validity tests.	2025-07-10 10:29:49 +02:00
Amaury Denoyelle	378c182192	BUG/MINOR: h3: fix https scheme request encoding for BE side An HTTP/3 request must contain :scheme pseudo-header. Currently, only "https" value is expected due to QUIC transport layer in use. However, https value is incorrectly encoded due to a QPACK index value mismatch in qpack_encode_scheme(). Fix it to ensure that scheme is now properly set for HTTP/3 requests on the backend side. No need to backport this.	2025-07-09 17:41:34 +02:00
Christopher Faulet	0b97bf36fa	BUG/MEDIUM: http-client: Notify applet has more data to deliver until the EOM When we leave the I/O handler with an unfinished request, we must report the applet has more data to deliver. Otherwise, when the channel request buffer is emptied, the http-client applet is not always woken up to forward the remaining request data. This issue was probably revealed by commit "BUG/MEDIUM: http-client: Don't wake http-client applet if nothing was xferred". It is only an issue with large POSTs, when the payload is streamed. This patch must be backported as far as 2.6 with the commit above. But on older versions, the applet API may differ. So be careful.	2025-07-09 16:27:24 +02:00
Christopher Faulet	25b0625d5c	BUG/MEDIUM: http-client: Drain the request if an early response is received When a large request is sent, it is possible to have a response before the end of the request. It is valid from HTTP perspective but it is an issue with the current design of the http-client. Indeed, the request and the response are handled sequentially. So the response will be blocked, waiting for the end of the request. Most of the time, it is not an issue, except when the request transfer is blocked. In that case, the applet is blocked. With the current API, it is not possible to handle early response and continue the request transfer. So, this case cannot be handled. In that case, it seems reasonable to drain the request if a response is received. This way, the request transfer, from the caller point of view, is never blocked and the response can be properly processed. To do so, the action flag HTTPCLIENT_FA_DRAIN_REQ is added to the http-client. When it is set, the request payload is just dropped. In that case, we take care to not report the end of input to properly report the request was truncated, especially in logs. It is only an issue with large POSTs, when the payload is streamed. This patch must be backported as far as 2.6.	2025-07-09 16:27:24 +02:00
Christopher Faulet	8ba754108d	BUG/MINOR: http-client: Reject any 101-switching-protocols response Protocol upgrades are not supported by the http-client. So report an error if a 101-switching-protocols response is received. Of course, it is unexpected because the API is not designed to support upgrades. But it is better to properly handle this case. This patch could be backported as far as 2.6. It depends on the commit "BUG/MINOR: http-client: Ignore 1XX interim responses in non-HTX mode".	2025-07-09 16:27:24 +02:00
Christopher Faulet	9d10be33ae	BUG/MINOR: http-client: Ignore 1XX interim responses in non-HTX mode When the response is re-formatted in raw message, the 1XX interim responses must be skipped. Otherwise, information of the first interim response will be saved (status line and headers) and those from the final response will be dropped. Note that for now, in HTX-mode, the interim messages are removed. This patch must be backported as far as 2.6.	2025-07-09 16:27:24 +02:00
Christopher Faulet	4bdb2e5a26	BUG/MEDIUM: http-client: Test HTX_FL_EOM flag before commiting the HTX buffer when htx_to_buf() function is called, if the HTX message is empty, the buffer is reset. So HTX flags must not be tested after because the info may be lost. So now, we take care to test HTX_FL_EOM flag before calling htx_to_buf(). This patch must be backported as far as 2.8.	2025-07-09 16:27:24 +02:00
Christopher Faulet	e4a0d40c62	BUG/MEDIUM: http-client: Ask for more room when request data cannot be xferred When the request payload cannot be xferred to the channel because its buffer is full, we must request for more room by calling sc_need_room(). It is important to be sure the httpclient applet will not be woken up in loop to push more data while it is not possible. It is only an issue with large POSTs, when the payload is streamed. This patch must be backported as far as 2.6. Note that on 2.6, sc_need_room() only takes one argument.	2025-07-09 16:27:24 +02:00
Christopher Faulet	d9ca8f6b71	BUG/MEDIUM: http-client: Properly inc input data when HTX blocks are xferred When HTX blocks from the requests are transferred into the channel buffer, the return value of htx_xfer_blks() function must not be used to increment the channel input value because meta data are counted here while they are not part of input data. Because of this bug, it is possible to forward more data than these present in the channel buffer. Instead, we look at the input data before and after the transfer and the difference is added. It is only an issue with large POSTs, when the payload is streamed. This patch must be backported as far as 2.6.	2025-07-09 16:27:24 +02:00
Christopher Faulet	fffdac42df	BUG/MEDIUM: http-client: Don't wake http-client applet if nothing was xferred When data are transferred to or from the http-client, the applet is systematically woken up, even when no data are transferred. This could lead to needless wakeups. When called from a lua script, if data are blocked for a while, this leads to a wakeup ping-pong loop where the http-client applet is woken up by the lua script which wakes back the script. To fix the issue, in httpclient_req_xfer() and httpclient_res_xfer() functions, we now take care to not wake the http-client applet up when no data are transferred. This patch must be backported as far as 2.6.	2025-07-09 16:27:24 +02:00
Frederic Lecaille	479c9fb067	REGTESTS: use two haproxy instances to distinguish the QUIC traces The aim of this patch is to identify the QUIC traces between the QUIC frontend and backend parts. Two haproxy instances are created. The c(1\|2) http clients connect to ha1 with TCP frontends and QUIC backends. ha2 embeds two QUIC listeners with s1 as TCP backend. When the traces are activated, they are dumped to stderr. Hopefully, they are prefixed by the haproxy instance name (h1 or h2). This is very useful to identify the QUIC instances.	2025-07-09 16:01:02 +02:00
Frederic Lecaille	45ac235baa	BUG/MEDIUM: quic: Crash after QUIC server callbacks restoration (OpenSSL 3.5) Revert this patch which is no more useful since OpenSSL 3.5.1 to remove the QUIC server callback restoration after SSL context switch: MINOR: quic: OpenSSL 3.5 internal QUIC custom extension for transport parameters reset It was required for 3.5.0. That said, there was no CI for OpenSSL 3.5 at the date of this commit. The CI recently revealed that the QUIC server side could crash during QUIC reg tests just after having restored the callbacks as implemented by the commit above. Also revert this commit which is no more useful because it arrived with the commit above: BUG/MEDIUM: quic: SSL/TCP handshake failures with OpenSSL 3. Must be backported to 3.2.	2025-07-09 16:01:02 +02:00
Frederic Lecaille	c01eb1040e	MINOR: quic: Prevent QUIC build with OpenSSL 3.5 new QUIC API version < 3.5.1 The QUIC listener part was impacted by the 3.5.0 OpenSSL new QUIC API with several issues which have been fixed by 3.5.1. Add a #error to prevent such OpenSSL 3.5 new QUIC API use with version below 3.5.1. Must be backported to 3.2.	2025-07-09 16:01:02 +02:00
Willy Tarreau	dd49f1ee62	BUG/MINOR: listener: really assign distinct IDs to shards A fix was made in 3.0 for the case where sharded listeners were using a same ID with commit 0db8b6034d ("BUG/MINOR: listener: always assign distinct IDs to shards"). However, the fix is incorrect. By checking the ID of temporary node instead of the kept one in bind_complete_thread_setup() it ends up never inserting the used nodes at this point, thus not reserving them. The side effect is that assigning too close IDs to subsequent listeners results in the same ID still being assigned twice since not reserved. Example: global nbthread 20 frontend foo bind :8000 shards by-thread id 10 bind :8010 shards by-thread id 20 The first one will start a series from 10 to 29 and the second one a series from 20 to 39. But 20 not being inserted when creating the shards, it will remain available for the post-parsing phase that assigns all unassigned IDs by filling holes, and two listeners will have ID 20. By checking the correct node, the problem disappears. The patch above was marked for backporting to 2.6, so this fix should be backported that far as well.	2025-07-09 15:52:33 +02:00
Christopher Faulet	adba8ffb49	MINOR: proto-tcp: Register a feature to report TCP MD5 signature support "HAVE_TCP_MD5SIG" feature is now registered if TCP MD5 signature is supported. This will help the feature detection in the reg-test script dedicated to this feature.	2025-07-09 09:51:24 +02:00
Willy Tarreau	96da670cd7	MINOR: resolvers: do not duplicate the hostname_dn field The hostdn.key field in the server contains a pure copy of the hostname_dn since commit 3406766d57 ("MEDIUM: resolvers: add a ref between servers and srv request or used SRV record") which wanted to lowercase it. Since it's not necessary, let's drop this useless copy. In addition, the return from strdup() was not tested, so it could theoretically crash the process under heavy memory contention.	2025-07-08 07:54:45 +02:00
Willy Tarreau	95cf518bfa	BUG/MINOR: resolvers: don't lower the case of binary DNS format The server's "hostname_dn" is in Domain Name format, not a pure string, as converted by resolv_str_to_dn_label(). It is made of lower-case string components delimited by binary lengths, e.g. <0x03>www<0x07>haproxy<0x03>org. As such it must not be lowercased again in srv_state_srv_update(), because 1) it's useless on the name components since already done, and 2) because it would replace component lengths 97 and above by 32-char shorter ones. Granted, not many domain names have that large components so the risk is very low but the operation is always wrong anyway. This was brought in 2.5 by commit 3406766d57 ("MEDIUM: resolvers: add a ref between servers and srv request or used SRV record"). In the same vein, let's fix the confusing strcasecmp() that are applied to this binary format, and use memcmp() instead. Here there's basically no risk to incorrectly match the wrong record, but that test alone is confusing enough to provoke the existence of the bug above. Finally let's update the component for that field to mention that it's in this format and already lower cased. Better not backport this, the risk of facing this bug is almost zero, and every time we touch such files something breaks for bad reasons.	2025-07-08 07:54:45 +02:00
Willy Tarreau	54d36f3e65	MEDIUM: resolvers: switch dns-accept-family to "auto" by default As notified in the 3.2 announce [1], dns-accept-family needed to switch to "auto" by default in 3.3. This is now done. [1] https://www.mail-archive.com/haproxy@formilux.org/msg45917.html	2025-07-08 07:54:45 +02:00
William Lallemand	9e78859fb3	CI: github: skip a ssl library version when latest is already in the list Skip the job for "latest" libssl version, when this version is the same as one already in the list. This avoids having 2 jobs for OpenSSL 3.5.1 since no new dev version is available for now and 3.5.1 is already in the list.	2025-07-07 19:46:07 +02:00
Amaury Denoyelle	42365f53e8	MINOR: server: support QUIC for dynamic servers To properly support QUIC for dynamic servers, it is required to extend add server CLI handler : * ensure conformity between server address and proto * automatically set proto to QUIC if not specified * prepare_srv callback must be called to initialize required SSL context Prior to this patch, crashes may occur when trying to use QUIC with dynamic servers. Also, destroy_srv callback must be called when a dynamic server is deallocated. This ensures that there is no memory leak due to SSL context. No need to backport.	2025-07-07 14:29:29 +02:00
Amaury Denoyelle	626cfd85aa	MINOR: cfgparse: enforce QUIC MUX compat on server line Add postparsing checks to control server line conformity regarding QUIC both on the server address and the MUX protocol. An error is reported in the following case : * proto quic is explicitly specified but server address does not specify quic4/quic6 prefix * another proto is explicitly specified but server address uses quic4/quic6 prefix	2025-07-07 14:29:24 +02:00
Frederic Lecaille	e76f1ad171	MINOR: quic-be: TLS version restriction to 1.3 This patch skips the TLS version settings. They have as a side effect to add all the TLS version extensions to the ClientHello message (TLS 1.0 to TLS 1.3). QUIC supports only TLS 1.3.	2025-07-07 14:13:02 +02:00
Frederic Lecaille	93a94ba87b	MINOR: quic-be: Set the backend alpn if not set by conf Simply set the alpn string to "h3,hq_interop" if there is no "alpn" setting for QUIC backends.	2025-07-07 14:13:02 +02:00
Frederic Lecaille	a9b5a2eb90	MINOR: reg-tests: first QUIC+H3 reg tests (QUIC address validation) First simple VTC file for QUIC reg tests. Two listeners are configured, one with Retry enabled and the other without. Two clients simply try to connect to these listeners to make a basic H3 request.	2025-07-07 14:13:02 +02:00
Frederic Lecaille	5a87f4673a	MINOR: quic: Prevent QUIC backend use with the OpenSSL QUIC compatibility module (USE_OPENSS_COMPAT) Make the server line parsing fail when a QUIC backend is configured if haproxy is built to use the OpenSSL stack compatibility module. This latter does not support the QUIC client part.	2025-07-07 14:13:02 +02:00
Frederic Lecaille	87ada46f38	BUG/MINOR: quic-be: Malformed coalesced Initial packets This bug fix completes this patch which was not sufficient: MINOR: quic-be: Allow sending 1200 bytes Initial datagrams This patch could not allow the build of well formed Initial packets coalesced to others (Handshake) packets. Indeed, the <padding> parameter passed to qc_build_pkt() is deduced from a first value: <padding> value and must be set to 1 for the last encryption level. As a client, the last encryption level is always the Handshake encryption level. But <padding> was always set to 1 for a QUIC client, leading the first Initial packet to be malformed because considered as the second one into the same datagram. So, this patch sets <padding> value passed to qc_build_pkt() to 1 only when there is no last encryption level at all, to allow the build of Initial only packets (not coalesced) or when it frames to send (coalesced packets). No need to backport.	2025-07-07 14:13:02 +02:00
Frederic Lecaille	6aebca7f2c	BUG/MINOR: quic: Missing TLS 1.3 QUIC cipher suites and groups inits (OpenSSL 3.5 QUIC API) This bug impacts both QUIC backends and frontends with OpenSSL 3.5 as QUIC API. The connections to a haproxy QUIC listener from a haproxy QUIC backend could not work at all without HelloRetryRequest TLS messages emitted by the backend asking the QUIC client to restart the handshake followed by TLS alerts: conn. @(nil) OpenSSL error[0xa000098] read_state_machine: excessive message size Furthermore, the Initial CRYPTO data sent by the client were big (about two 1252 bytes packets) (ClientHello TLS message). After analyzing the packets a key_share extension with <unknown> as value was long (more than 1 KB). This extension is in relation with the groups but does not belong to the groups supported by QUIC. That said such connections could work with ngtcp2 as backend built against the same OSSL TLS stack API but with a HelloRetryRequest. ngtcp2 always set the QUIC default cipher suites and group, for all the stacks it supports as implemented by this patch. So this patch configures both QUIC backend and frontend cipher suites and groups calling SSL_CTX_set_ciphersuites() and SSL_CTX_set1_groups_list() with the correct argument, except for SSL_CTX_set1_groups_list() which fails with QUIC TLS for an unknown reason at this time. The call to SSL_CTX_set_options() is useless from ssl_quic_initial_ctx() for the QUIC clients. One relies on ssl_sock_prepare_srv_ssl_ctx() to set them from now on. This patch is effective for all the supported stacks without impact for AWS-LC, and QUIC TLS and fixes the connections for haproxy QUIC frontend and backends when built against OpenSSL 3.5 QUIC API. A new define HAVE_OPENSSL_QUICTLS has been added to openssl-compat.h to distinguish the QUIC TLS stack. Must be backported to 3.2.	2025-07-07 14:13:02 +02:00
William Lallemand	0efbe6da88	CI: github: update to OpenSSL 3.5.1 Update the OpenSSL 3.5 job to 3.5.1. This must be backported to 3.2.	2025-07-07 13:58:38 +02:00
Frederic Lecaille	fb0324eb09	BUG/MEDIUM: quic: SSL/TCP handshake failures with OpenSSL 3.5 This bug arrived with this commit: MINOR: quic: OpenSSL 3.5 internal QUIC custom extension for transport parameters reset To make QUIC connection succeed with OpenSSL 3.5 API, a call to quic_ssl_set_tls_cbs() was needed from several callback which call SSL_set_SSL_CTX(). This has as side effect to set the QUIC callbacks used by the OpenSSL 3.5 API. But quic_ssl_set_tls_cbs() was also called for TCP sessions leading the SSL stack to run QUIC code, if the QUIC support is enabled. To fix this, simply ignore the TCP connections inspecting the <ssl_qc_app_data_index> index value which is NULL for such connections. Must be backported to 3.2.	2025-07-07 12:01:22 +02:00
William Lallemand	d0bd0595da	CI: github: update the stable CI to ubuntu-24.04 Update the stable CI to ubuntu-24.04. Must be backported to 3.2.	2025-07-07 09:29:33 +02:00
William Lallemand	b6fec27ef6	CI: github: add an OpenSSL 3.5.0 job Add an OpenSSL 3.5.0 job to test USE_QUIC. This must be backported to 3.2.	2025-07-07 09:27:17 +02:00
Ilia Shipitsin	d8c867a1e6	CI: enable USE_QUIC=1 for OpenSSL versions >= 3.5.0 OpenSSL 3.5.0 introduced experimental support for QUIC. This change enables the use_quic option when a compatible version of OpenSSL is detected, allowing QUIC-based functionality to be leveraged where applicable. Feature remains disabled for earlier versions to ensure compatibility.	2025-07-07 09:02:11 +02:00
Ilia Shipitsin	198d422a31	CI: set DEBUG_STRICT=2 for coverity scan enabling DEBUG_STRICT=2 will enable BUG_ON_HOT() and help coverity in bug detection for the reference: https://github.com/haproxy/haproxy/issues/3008	2025-07-06 08:17:37 +02:00
Willy Tarreau	573143e0c8	MINOR: pattern: add a counter of added/freed patterns Patterns are allocated when loading maps/acls from a file or dynamically via the CLI, and are released only from the CLI (e.g. "clear map xxx"). These ones do not use pools and are much harder to monitor, e.g. in case a script adds many and forgets to clear them, etc. Let's add a new pair of metrics "PatternsAdded" and "PatternsFreed" that will report the number of added and freed patterns respectively. This can allow to simply graph both. The difference between the two normally represents the number of allocated patterns. If Added grows without Freed following, it can indicate a faulty script that doesn't perform the needed cleanup. The metrics are also made available to Prometheus as patterns_added_total and patterns_freed_total respectively.	2025-07-05 00:12:45 +02:00
Remi Tricot-Le Breton	a075d6928a	CLEANUP: ssl: Rename ssl_trace-t.h to ssl_trace.h This header does not actually contain any structures so it's best to remove the '-t' from the name for better consistency.	2025-07-04 15:21:50 +02:00
William Lallemand	f07f0ee21c	MEDIUM: httpclient: add a Content-Length when the payload is known This introduces a change of behavior in the httpclient API. When generating a request with a payload buffer, the size of the buffer payload is known and does not need to be streamed in chunks. This patch forces payload buffers to be sent with a Content-Length header in the request; however, the behavior does not change if a callback is still used instead of a buffer.	2025-07-04 15:21:50 +02:00
Christopher Faulet	5da4da0bb6	BUG/MINOR: http-act: Fix parsing of the expression argument for pause action When the "pause" action is parsed, if an expression is used instead of a static value, the position of the current argument after the expression evaluation is incremented while it should not. The sample_parse_expr() function already takes care of it. However, it should still be incremented when a time value was parsed. This patch must be backported to 3.2.	2025-07-04 14:38:32 +02:00
Christopher Faulet	3cc5991c9b	BUG/MINOR: proto-tcp: Take care to initialized tcp_md5sig structure When the TCP MD5 signature is enabled, on a listening socket or an outgoing one, the tcp_md5sig structure must be initialized first. It is a 3.3-specific issue. No backport needed.	2025-07-04 08:32:06 +02:00
Christopher Faulet	45cb232062	BUILD: cfgparse-tcp: Add _GNU_SOURCE for TCP_MD5SIG_MAXKEYLEN It is required for the musl library to be sure TCP_MD5SIG_MAXKEYLEN is defined and avoid build errors.	2025-07-03 16:30:15 +02:00
Christopher Faulet	5232df57ab	MINOR: proto-tcp: Add support for TCP MD5 signature for listeners and servers This patch adds the support for the RFC2385 (Protection of BGP Sessions via the + TCP MD5 Signature Option) for the listeners and the servers. The feature is only available on Linux. Keywords are not exposed otherwise. By setting "tcp-md5sig <password>" option on a bind line, TCP segments of all connections instantiated from the listening socket will be signed with a 16-byte MD5 digest. The same option can be set on a server line to protect outgoing connections to the corresponding server. The primary use case for this option is to allow BGP to protect itself against the introduction of spoofed TCP segments into the connection stream. But it can be useful for any very long-lived TCP connections. A reg-test was added and it will be executed only on linux. All other targets are excluded.	2025-07-03 15:25:40 +02:00
William Lallemand	6f6c6fa4cb	BUG/MINOR: ssl/ocsp: fix definition discrepancies with ocsp_update_init() Since patch 20718f40b6 ("MEDIUM: ssl/ckch: add filename and linenum argument to crt-store parsing"), the definition of ocsp_update_init() and its declaration does not share the same arguments. Must be backported to 3.2.	2025-07-03 15:14:13 +02:00
David Carlier	e7c59a7a84	DOC: deviceatlas build clarifications Update accordingly the related documentation, removing/clarifying confusing parts as it was more complicated than it needed to be.	2025-07-03 09:08:06 +02:00
David Carlier	0e8e20a83f	BUILD/MEDIUM: deviceatlas: fix when installed in custom locations. We are reusing DEVICEATLAS_INC/DEVICEATLAS_LIB when the DeviceAtlas library had been compiled and installed with cmake and make install targets. Works fine except when ldconfig is unaware of the path, thus adding cflags/ldflags into the mix. Ideally, to be backported down to the lowest stable branch.	2025-07-03 09:08:06 +02:00
William Lallemand	720efd0409	BUG/MINOR: ssl: crash in ssl_sock_io_cb() with SSL traces and idle connections TRACE_ENTER is crashing in ssl_sock_io_cb() in case a connection idle is being stolen. Indeed the function could be called with a NULL context and dereferencing it will crash. This patch fixes the issue by initializing ctx only once it is usable, and moving TRACE_ENTER after the initialization. This must be backported to 3.2.	2025-07-02 16:14:19 +02:00
Willy Tarreau	e34a0a50ae	BUILD: dev/phash: remove the accidentally committed a.out file Commit 41f28b3c53 ("DEV: phash: Update 414 and 431 status codes to phash") accidentally committed a.out, resulting in build/checkout issues when locally rebuilt. Let's drop it. This should be backported to 3.1.	2025-07-02 10:55:13 +02:00
William Lallemand	0f1c206b8f	MINOR: httpclient/cli: add --htx option Use the new HTTPCLIENT_O_RES_HTX flag when using the CLI httpclient with --htx. It allows to process directly the response in HTX, then the htx_dump() function is used to display a debug output. Example: echo "httpclient --htx GET https://haproxy.org" \| socat /tmp/haproxy.sock htx=0x79fd72a2e200(size=16336,data=139,used=6,wrap=NO,flags=0x00000010,extra=0,first=0,head=0,tail=5,tail_addr=139,head_addr=0,end_addr=0) [0] type=HTX_BLK_RES_SL - size=31 - addr=0 HTTP/2.0 301 [1] type=HTX_BLK_HDR - size=15 - addr=31 content-length: 0 [2] type=HTX_BLK_HDR - size=32 - addr=46 location: https://www.haproxy.org/ [3] type=HTX_BLK_HDR - size=25 - addr=78 alt-svc: h3=":443"; ma=3600 [4] type=HTX_BLK_HDR - size=35 - addr=103 set-cookie: served=2:TLSv1.3+TCP:IPv4 [5] type=HTX_BLK_EOH - size=1 - addr=138 <empty>	2025-07-01 16:33:38 +02:00
William Lallemand	3e05e20029	MEDIUM: httpclient: implement a way to use directly htx data Add a HTTPCLIENT_O_RES_HTX flag which allows storing the HTX data directly in the response buffer instead of extracting the data in raw format. This is useful when the data need to be reused in another request.	2025-07-01 16:31:47 +02:00
William Lallemand	2f4219ed68	MEDIUM: httpclient: split the CLI from the actual httpclient API This patch splits the httpclient code to prevent confusion between the httpclient CLI command and the actual httpclient API. Indeed there was a confusion between the flag used internally by the CLI command, and the actual httpclient API. hc_cli_* functions as well as HC_C_F_* defines were moved to httpclient_cli.c.	2025-07-01 15:46:04 +02:00
William Lallemand	149f6a4879	MINOR: ssl/ocsp: stop using the flags from the httpclient CLI The ocsp-update uses the flags from the httpclient CLI, which are not supposed to be used elsewhere since this is a state for the CLI. This patch implements HC_OCSP flags for the ocsp-update.	2025-07-01 15:46:04 +02:00
William Lallemand	519abefb57	BUG/MINOR: httpclient: wrongly named httpproxy flag The HC_F_HTTPPROXY flag was wrongly named and does not use the correct value, indeed this flag was meant to be used for the httpclient API, not the httpclient CLI. This patch fixes the problem by introducing HTTPCLIENT_FO_HTTPPROXY which must be set in hc->flags. Also add a member 'options' in the httpclient structure, because the member flags is reinitialized when starting. Must be backported as far as 3.0.	2025-07-01 14:47:52 +02:00
Aurelien DARRAGON	747a812066	MEDIUM: stats: add persistent state to typed output format Add a fourth character to the second column of the "typed output format" to indicate whether the value results from a volatile or persistent metric ('V' or 'P' characters respectively). A persistent metric means the value could possibly be preserved across reloads by leveraging a shared memory between multiple co-processes. Such metrics are identified as "shared" in the code (since they are possibly shared between multiple co-processes) Some reg-tests were updated to take that change into account, also, some outputs in the configuration manual were updated to reflect current behavior.	2025-07-01 14:15:03 +02:00
Mariam John	bd076f8619	MINOR: ssl: Introduce new smp_client_hello_parse() function In this patch we introduce a new helper function called `smp_client_hello_parse()` to extract information presented in a TLS client hello handshake message. 7 sample fetches have also been modified to use this helper function to do the common client hello parsing and use the result to do further processing of extensions/cipher. Fixes: #2532	2025-07-01 11:55:36 +02:00
Willy Tarreau	48d5ef363d	REGTESTS: restrict execution to a single thread group When threads are enabled and running on a machine with multiple CCX or multiple nodes, thread groups are now enabled since 3.3-dev2, causing load-balancing algorithms to randomly fail due to incoming connections spreading over multiple groups and using different load balancing indexes. Let's just force "thread-groups 1" into all configs when threads are enabled to avoid this.	2025-06-30 18:54:35 +02:00
Remi Tricot-Le Breton	94d750421c	DOC: 'jwt_verify' converter now supports certificates The 'jwt_verify' converter can now accept certificates as a second parameter, which can be updated via the CLI.	2025-06-30 17:59:55 +02:00
Remi Tricot-Le Breton	db5ca5a106	REGTESTS: jwt: Test update of certificate used in jwt_verify Using certificates in the jwt_verify converter allows to make use of the CLI certificate updates, which is still impossible with public keys (the legacy behavior).	2025-06-30 17:59:55 +02:00
Remi Tricot-Le Breton	663ba093aa	REGTESTS: jwt: Add test with actual certificate passed to jwt_verify The jwt_verify can now take public certificates as second parameter, either with actual certificate path (not previously mentioned) or from a predefined crt-store or from a variable.	2025-06-30 17:59:55 +02:00
Remi Tricot-Le Breton	093a3ad7f2	MINOR: ssl: Prevent delete on certificate used by jwt_verify A ckch_store used in JWT verification might not have any ckch instances or crt-list entries linked but we don't want to be able to remove it via the CLI anyway since it would make all future jwt_verify calls using this certificate fail.	2025-06-30 17:59:55 +02:00
Remi Tricot-Le Breton	31955e6e0a	MINOR: ssl: Allow 'commit ssl cert' with no privkey The ckch_stores might be used to store public certificates only so in this case we won't provide private keys when updating the certificate via the CLI. If the ckch_store is actually used in a bind or server line an error will still be raised if the private key is missing.	2025-06-30 17:59:55 +02:00
Remi Tricot-Le Breton	522bca98e1	MAJOR: jwt: Allow certificate instead of public key in jwt_verify converter The 'jwt_verify' converter could only be passed public keys as second parameter instead of full-on public certificates. This patch allows proper certificates to be used. Those certificates can be loaded in ckch_stores like any other certificate which means that all the certificate-related operations that can be made via the CLI can now benefit JWT validation as well. We now have two ways JWT validation can work, the legacy one which only relies on public keys which could not be stored in ckch_stores without some in depth changes in the way the ckch_stores are built. In this legacy way, the public keys are fully stored in a cache dedicated to JWT only which does not have any CLI commands and any way to update them during runtime. It also requires that all the public keys used are passed at least once explicitly to the 'jwt_verify' converter so that they can be loaded during init. The new way uses actual certificates, either already stored in the ckch_store tree (if predefined in a crt-store or already used previously in the configuration) or loaded in the ckch_store tree during init if they are explicitly used in the configuration like so: var(txn.bearer),jwt_verify(txn.jwt_alg,"cert.pem") When using a variable (or any other way that can only be resolved during runtime) in place of the converter's <key> parameter, the first time we encounter a new value (for which we don't have any entry in the jwt tree) we will lock the ckch_store tree and try to perform a lookup in it. If the lookup fails, an entry will still be inserted into the jwt tree so that any following call with this value avoids performing the ckch_store tree lookup.	2025-06-30 17:59:55 +02:00
Remi Tricot-Le Breton	6e9f886c4d	MINOR: jwt: Remove unused parameter in convert_ecdsa_sig The pubkey parameter in convert_ecdsa_sig was not actually used.	2025-06-30 17:59:55 +02:00
Remi Tricot-Le Breton	cd89ce1766	MINOR: jwt: Rename pkey to pubkey in jwt_cert_tree_entry struct Rename the jwt_cert_tree_entry member pkey to pubkey to avoid any confusion between private and public key.	2025-06-30 17:59:55 +02:00
Remi Tricot-Le Breton	5c3d0a554b	DOC: Fix 'jwt_verify' converter doc Contrary to what the doc says, the jwt_verify converter only works with a public key and not a full certificate for certificate based protocols (everything but HMAC). This patch should be backported up to 2.8.	2025-06-30 17:59:55 +02:00
Remi Tricot-Le Breton	3465f88f8a	BUG/MINOR: jwt: Copy input and parameters in dedicated buffers in jwt_verify converter When resolving variable values the temporary trash chunks are used so when calling the 'jwt_verify' converter with two variable parameters like in the following line, the input would be overwritten by the value of the second parameter : var(txn.bearer),jwt_verify(txn.jwt_alg,txn.cert) Copying the values into dedicated alloc'ed buffers prevents any new call to get_trash_chunk from erasing the data we need in the converter. This patch can be backported up to 2.8.	2025-06-30 17:59:55 +02:00
Christopher Faulet	5ba0a2d527	BUG/MEDIUM: mux-h2: Properly handle connection error during preface sending On backend side, an error at connection level during the preface sending was not properly handled and could lead to a spinning loop on process_stream() when the h2 stream on client side was blocked, for instance because of h2 flow control. It appeared that no transition was performed from the PREFACE state to an ERROR state on the H2 connection when an error occurred on the underlying connection. In that case, the H2 connection was woken up in loop to try to receive data, waking up the upper stream at the same time. To fix the issue, an H2C error must be reported. Most state transitions are handled by the demux function. So it is the right place to do so. First, in PREFACE state and on server side, if an error occurred on the TCP connection, an error is now reported on the H2 connection. REFUSED_STREAM error code is used in that case. In addition, in that case, we also take care to properly handle the connection shutdown. This patch should fix the issue #3020. It must be backported to all stable versions.	2025-06-30 16:48:00 +02:00
Christopher Faulet	a2a142bf40	BUG/MEDIUM: hlua: Forbid any L6/L7 sample fetche functions from lua services It was already forbidden to use HTTP sample fetch functions from lua services. An error is triggered if it happens. However, the error must be extended to any L6/L7 sample fetch functions. Indeed, a lua service is an applet. It is totally unexpected for an applet to access input data in a channel's buffer. These data have not been analyzed yet and are still subject to any change. An applet, lua or not, must never access "not forwarded" data. Only output data are available. For now, if a lua applet relies on any L6/L7 sample fetch functions, the behavior is undefined and not consistent. So to fix the issue, hlua flag HLUA_F_MAY_USE_HTTP is renamed to HLUA_F_MAY_USE_CHANNELS_DATA. This flag is used to prevent any lua applet to use L6/L7 sample fetch functions. This patch could be backported to all stable versions.	2025-06-30 16:47:59 +02:00
William Lallemand	7fc8ab0397	MINOR: ssl: check TLS1.3 ciphersuites again in clienthello with recent AWS-LC Patch ed9b8fec49 ("BUG/MEDIUM: ssl: AWS-LC + TLSv1.3 won't do ECDSA in RSA+ECDSA configuration") partly fixed a cipher selection problem with AWS-LC. However this was not checking anymore if the ciphersuites was available in haproxy which is still a problem. The problem was fixed in AWS-LC 1.46.0 with this PR https://github.com/aws/aws-lc/pull/2092. This patch allows to filter again the TLS13 ciphersuites with recent versions of AWS-LC. However, since there are no macros to check the AWS-LC version, it is enabled at the next AWS-LC API version change following the fix in AWS-LC v1.50.0. This could be backported where ed9b8fec49 was backported.	2025-06-30 16:43:51 +02:00
Aurelien DARRAGON	4fcc9b5572	MINOR: counters: rename last_change counter to last_state_change Since proxy and server struct already have an internal last_change variable and we cannot merge it with the shared counter one, let's rename the last_change counter to be more specific and prevent the mixup between the two. last_change counter is renamed to last_state_change, and unlike the internal last_change, this one is a shared counter so it is expected to be updated by other processes behind our back. However, when updating last_state_change counter, we use the value of the server/proxy last_change as reference value.	2025-06-30 16:26:38 +02:00
Aurelien DARRAGON	5b1480c9d4	MEDIUM: proxy: add and use a separate last_change variable for internal use Same motivation as previous commit, proxy last_change is "abused" because it is used for 2 different purposes, one for stats, and the other one for process-local internal use. Let's add a separate proxy-only last_change variable for internal use, and leave the last_change shared (and thread-grouped) counter for statistics.	2025-06-30 16:26:31 +02:00
Aurelien DARRAGON	01dfe17acf	MEDIUM: server: add and use a separate last_change variable for internal use last_change server metric is used for 2 separate purposes. First it is used to report last server state change date for stats and other related metrics. But it is also used internally, including in sensitive paths, such as lb related stuff to take decision or perform computations (ie: in srv_dynamic_maxconn()). Due to last_change counter now being split over thread groups since 16eb0fa ("MAJOR: counters: dispatch counters over thread groups"), reading the aggregated value has a cost, and we cannot afford to consult last_change value from srv_dynamic_maxconn() anymore. Moreover, since the value is used to take decision for the current process we don't want the variable to be updated by another process behind our back. To prevent performance regression and sharing issues, let's instead add a separate srv->last_change value, which is not updated atomically (given how rare the updates are), and only serves for places where the use of the aggregated last_change counter/stats (split over thread groups) is too costly.	2025-06-30 16:26:25 +02:00
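The split described in this commit can be pictured with a small sketch. Everything below is hypothetical (the `demo_` names, the number of thread groups, and the "most recent value wins" aggregation are assumptions, not HAProxy's actual structures): a cheap process-local field serves the hot path, while a per-thread-group array backs the statistics.

```c
#include <stddef.h>

#define DEMO_TGROUPS 4  /* hypothetical number of thread groups */

/* Hypothetical server view: not HAProxy's struct server. */
struct demo_server {
	long shared_last_change[DEMO_TGROUPS]; /* stats: one slot per group */
	long last_change;                      /* process-local, cheap to read */
};

/* Aggregated value for stats: assumed here to be the most recent slot.
 * It needs a walk over all groups, which is the cost the commit avoids
 * in hot paths.
 */
long demo_stats_last_change(const struct demo_server *s)
{
	long last = 0;
	size_t i;

	for (i = 0; i < DEMO_TGROUPS; i++) {
		if (s->shared_last_change[i] > last)
			last = s->shared_last_change[i];
	}
	return last;
}

/* On a state change, the local value is the reference and the shared
 * slot of the calling thread group is updated from it.
 */
void demo_set_state_change(struct demo_server *s, size_t tgid, long now)
{
	s->last_change = now;
	s->shared_last_change[tgid % DEMO_TGROUPS] = s->last_change;
}
```

Hot-path code in this sketch would read `s->last_change` directly and leave `demo_stats_last_change()` to the stats dump.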
Aurelien DARRAGON	9d3c73c9f2	BUG/MEDIUM: counters/server: fix server and proxy last_change mixup 16eb0fa ("MAJOR: counters: dispatch counters over thread groups") introduced some bugs: as a result of improper copy paste during COUNTERS_SHARED_LAST() macro introduction, some functions such as srv_downtime() which used to make use of the server last_change variable now use the proxy one, which doesn't make sense and will likely cause unexpected logical errors/bugs. Let's fix them all at once by properly pointing to the server last_change variable when relevant. No backport needed.	2025-06-30 16:26:19 +02:00
Aurelien DARRAGON	837762e2ee	MINOR: mailers: warn if mailers are configured but not actually used Now that native mailers configuration is only usable with Lua mailers, Willy noticed that we lack a way to warn the user if mailers were previously configured on an older version but Lua mailers were not loaded, which could trick the user into thinking mailers keep working when transitioning to 3.2 while it is not. In this patch we add the 'core.use_native_mailers_config()' Lua function which should be called in Lua script body before making use of 'Proxy:get_mailers()' function to retrieve legacy mailers configuration from haproxy main config. This way haproxy effectively knows that the native mailers config is actually being used from Lua (which indicates user correctly migrated from native mailers to Lua mailers), else if mailers are configured but not used from Lua then haproxy warns the user about the fact that they will be ignored unless they are used from Lua. (e.g.: using the provided 'examples/lua/mailers.lua' to ease transition)	2025-06-27 16:41:18 +02:00
Aurelien DARRAGON	c7c6d8d295	MINOR: server: move send-proxy* incompatibility check in _srv_check_proxy_mode() This way the check is executed no matter the section where the server is declared (ie: not only under the "ring" section)	2025-06-27 16:41:13 +02:00
Aurelien DARRAGON	14d68c2ff7	MEDIUM: server: move _srv_check_proxy_mode() checks from server init to finalize _srv_check_proxy_mode() is currently executed during server init (from _srv_parse_init()), while it used to be fine for current checks, it seems it occurs a bit too early to be usable for some checks that depend on server keywords to be evaluated for instance. As such, to make _srv_check_proxy_mode() more relevant and be extended with additional checks in the future, let's call it later during server finalization, once all server keywords were evaluated. No change of behavior is expected	2025-06-27 16:41:07 +02:00
Aurelien DARRAGON	23e5f18b8e	MEDIUM: sink: change the sink mode type to PR_MODE_SYSLOG No change of behavior expected, but some compat checks will now be aware that the proxy type is not TCP but SYSLOG instead.	2025-06-27 16:41:01 +02:00
Frederic Lecaille	1045623cb8	BUG/MINOR: quic-be: Wrong retry_source_connection_id check This commit broke the QUIC backend connection to servers without address validation or retry activated: MINOR: quic-be: address validation support implementation (RETRY) Indeed the retry_source_connection_id transport parameter was already checked as if it was required, as if the peer (server) was always using the address validation. Furthermore, relying on ->odcid.len to ensure a retry token was received is not correct. This patch ensures the retry_source_connection_id transport parameter is checked only when a retry token was received (->retry_token != NULL). In this case it also checks that this transport parameter is present when a retry token has been received (tx_params->retry_source_connection_id.len != 0). No need to backport.	2025-06-27 07:59:12 +02:00
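A minimal sketch of the condition described in this commit, assuming simplified stand-in structures (the `demo_` types and field names are hypothetical and only mirror the `->retry_token` and `retry_source_connection_id.len` tests quoted in the message):

```c
#include <stddef.h>

/* Hypothetical, reduced view of the connection. */
struct demo_conn {
	const unsigned char *retry_token;  /* NULL when no RETRY was received */
};

/* Hypothetical, reduced view of the peer transport parameters. */
struct demo_tx_params {
	size_t retry_scid_len;  /* 0 when the parameter is absent */
};

/* Returns 1 if the parameters pass the check, 0 otherwise. */
int demo_check_retry_scid(const struct demo_conn *qc,
                          const struct demo_tx_params *p)
{
	if (!qc->retry_token)
		return 1;                   /* no RETRY seen: nothing is required */
	return p->retry_scid_len != 0;  /* RETRY seen: parameter must be present */
}
```

The sketch only covers the presence test; matching the parameter value against the stored connection ID is left out.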
Willy Tarreau	299a441110	[RELEASE] Released version 3.3-dev2 Released version 3.3-dev2 with the following main changes : - BUG/MINOR: config/server: reject QUIC addresses - MINOR: server: implement helper to identify QUIC servers - MINOR: server: mark QUIC support as experimental - MINOR: mux-quic-be: allow QUIC proto on backend side - MINOR: quic-be: Correct Version Information transp. param encoding - MINOR: quic-be: Version Information transport parameter check - MINOR: quic-be: Call ->prepare_srv() callback at parsing time - MINOR: quic-be: QUIC backend XPRT and transport parameters init during parsing - MINOR: quic-be: QUIC server xprt already set when preparing their CTXs - MINOR: quic-be: Add a function for the TLS context allocations - MINOR: quic-be: Correct the QUIC protocol lookup - MINOR: quic-be: ssl_sock contexts allocation and misc adaptations - MINOR: quic-be: SSL sessions initializations - MINOR: quic-be: Add a function to initialize the QUIC client transport parameters - MINOR: sock: Add protocol and socket types parameters to sock_create_server_socket() - MINOR: quic-be: ->connect() protocol callback adaptations - MINOR: quic-be: QUIC connection allocation adaptation (qc_new_conn()) - MINOR: quic-be: xprt ->init() adapatations - MINOR: quic-be: add field for max_udp_payload_size into quic_conn - MINOR: quic-be: Do not redispatch the datagrams - MINOR: quic-be: Datagrams and packet parsing support - MINOR: quic-be: Handshake packet number space discarding - MINOR: h3-be: Correctly retrieve h3 counters - MINOR: quic-be: Store asap the DCID - MINOR: quic-be: Build post handshake frames - MINOR: quic-be: Add the conn object to the server SSL context - MINOR: quic-be: Initial packet number space discarding. 
- MINOR: quic-be: I/O handler switch adaptation - MINOR: quic-be: Store the remote transport parameters asap - MINOR: quic-be: Missing callbacks initializations (USE_QUIC_OPENSSL_COMPAT) - MINOR: quic-be: Make the secret derivation works for QUIC backends (USE_QUIC_OPENSSL_COMPAT) - MINOR: quic-be: SSL_get_peer_quic_transport_params() not defined by OpenSSL 3.5 QUIC API - MINOR: quic-be: get rid of ->li quic_conn member - MINOR: quic-be: Prevent the MUX to send/receive data - MINOR: quic: define proper proto on QUIC servers - MEDIUM: quic-be: initialize MUX on handshake completion - BUG/MINOR: hlua: Don't forget the return statement after a hlua_yieldk() - BUILD: hlua: Fix warnings about uninitialized variables - BUILD: listener: fix 'for' loop inline variable declaration - BUILD: hlua: Fix warnings about uninitialized variables (2) - BUG/MEDIUM: mux-quic: adjust wakeup behavior - MEDIUM: backend: delay MUX init with ALPN even if proto is forced - MINOR: quic: mark ctrl layer as ready on quic_connect_server() - MINOR: mux-quic: improve documentation for snd/rcv app-ops - MINOR: mux-quic: define flag for backend side - MINOR: mux-quic: set expect data only on frontend side - MINOR: mux-quic: instantiate first stream on backend side - MINOR: quic: wakeup backend MUX on handshake completed - MINOR: hq-interop: decode response into HTX for backend side support - MINOR: hq-interop: encode request from HTX for backend side support - CLEANUP: quic-be: Add comments about qc_new_conn() usage - BUG/MINOR: quic-be: CID double free upon qc_new_conn() failures - MINOR: quic-be: Avoid SSL context unreachable code without USE_QUIC_OPENSSL_COMPAT - BUG/MINOR: quic: prevent crash on startup with -dt - MINOR: server: reject QUIC servers without explicit SSL - BUG/MINOR: quic: work around NEW_TOKEN parsing error on backend side - BUG/MINOR: http-ana: Properly handle keep-query redirect option if no QS - BUG/MINOR: quic: don't restrict reception on backend privileged ports - MINOR: 
hq-interop: handle HTX response forward if not enough space - BUG/MINOR: quic: Fix OSSL_FUNC_SSL_QUIC_TLS_got_transport_params_fn callback (OpenSSL3.5) - BUG/MINOR: quic: fix ODCID initialization on frontend side - BUG/MEDIUM: cli: Don't consume data if outbuf is full or not available - MINOR: cli: handle EOS/ERROR first - BUG/MEDIUM: check: Set SOCKERR by default when a connection error is reported - BUG/MINOR: mux-quic: check sc_attach_mux return value - MINOR: h3: support basic HTX start-line conversion into HTTP/3 request - MINOR: h3: encode request headers - MINOR: h3: complete HTTP/3 request method encoding - MINOR: h3: complete HTTP/3 request scheme encoding - MINOR: h3: adjust path request encoding - MINOR: h3: adjust auth request encoding or fallback to host - MINOR: h3: prepare support for response parsing - MINOR: h3: convert HTTP/3 response into HTX for backend side support - MINOR: h3: complete response status transcoding - MINOR: h3: transcode H3 response headers into HTX blocks - MINOR: h3: use BUG_ON() on missing request start-line - MINOR: h3: reject invalid :status in response - DOC: config: prefer-last-server: add notes for non-deterministic algorithms - CLEANUP: connection: remove unused mux-ops dedicated to QUIC - BUG/MINOR: mux-quic/h3: properly handle too low peer fctl initial stream - MINOR: mux-quic: support max bidi streams value set by the peer - MINOR: mux-quic: abort conn if cannot create stream due to fctl - MEDIUM: mux-quic: implement attach for new streams on backend side - BUG/MAJOR: fwlc: Count an avoided server as unusable. - MINOR: fwlc: Factorize code. 
- BUG/MEDIUM: quic: do not release BE quic-conn prior to upper conn - MAJOR: cfgparse: turn the same proxy name warning to an error - MAJOR: cfgparse: make sure server names are unique within a backend - BUG/MINOR: tools: only reset argument start upon new argument - BUG/MINOR: stream: Avoid recursive evaluation for unique-id based on itself - BUG/MINOR: log: Be able to use %ID alias at anytime of the stream's evaluation - MINOR: hlua: emit a log instead of an alert for aborted actions due to unavailable yield - MAJOR: mailers: remove native mailers support - BUG/MEDIUM: ssl/clienthello: ECDSA with ssl-max-ver TLSv1.2 and no ECDSA ciphers - DOC: configuration: add details on prefer-client-ciphers - MINOR: ssl: Add "renegotiate" server option - DOC: remove the program section from the documentation - MAJOR: mworker: remove program section support - BUG/MINOR: quic: wrong QUIC_FT_CONNECTION_CLOSE(0x1c) frame encoding - MINOR: quic-be: add a "CC connection" backend TX buffer pool - MINOR: quic: Useless TX buffer size reduction in closing state - MINOR: quic-be: Allow sending 1200 bytes Initial datagrams - MINOR: quic-be: address validation support implementation (RETRY) - MEDIUM: proxy: deprecate the "transparent" and "option transparent" directives - REGTESTS: update http_reuse_be_transparent with "transparent" deprecated - REGTESTS: script: also add a line pointing to the log file - DOC: config: explain how to deal with "transparent" deprecation - MEDIUM: proxy: mark the "dispatch" directive as deprecated - DOC: config: crt-list clarify default cert + cert-bundle - MEDIUM: cpu-topo: switch to the "performance" cpu-policy by default - SCRIPTS: drop the HTML generation from announce-release - BUG/MINOR: tools: use my_unsetenv instead of unsetenv - CLEANUP: startup: move comment about nbthread where it's more appropriate - BUILD: qpack: fix a build issue on older compilers	2025-06-26 18:26:45 +02:00
Willy Tarreau	543b629427	BUILD: qpack: fix a build issue on older compilers Got this on gcc-4.8: src/qpack-enc.c: In function 'qpack_encode_method': src/qpack-enc.c:168:3: error: 'for' loop initial declarations are only allowed in C99 mode for (size_t i = 0; i < istlen(other); ++i) ^ This came from commit a0912cf914 ("MINOR: h3: complete HTTP/3 request method encoding"), no backport is needed.	2025-06-26 18:09:24 +02:00
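The error quoted above is raised by compilers that default to a pre-C99 mode when the loop counter is declared inside the `for` statement. A minimal sketch of the portable form follows; `demo_count_upper()` is a hypothetical helper used for illustration, not the actual `qpack_encode_method()` code.

```c
#include <stddef.h>

/* Hypothetical helper: counts uppercase ASCII letters in <s>.
 * The counter is declared at the top of the block, so the function also
 * builds with compilers that default to C89/gnu89 such as gcc-4.8.
 */
size_t demo_count_upper(const char *s, size_t len)
{
	size_t i;      /* declared outside the for statement */
	size_t n = 0;

	for (i = 0; i < len; ++i) {
		if (s[i] >= 'A' && s[i] <= 'Z')
			n++;
	}
	return n;
}
```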
Valentine Krasnobaeva	20110491d3	CLEANUP: startup: move comment about nbthread where it's more appropriate Move the comment about non_global_section_parsed just above the line, where we reset it.	2025-06-26 18:02:16 +02:00
Valentine Krasnobaeva	a9afc10ae8	BUG/MINOR: tools: use my_unsetenv instead of unsetenv Let's use our own implementation of unsetenv() instead of the one provided by libc. The libc implementation may vary depending on the UNIX distribution. The implementation from libc.so.1 ported to Illumos (see the link below) has caused an infinite loop in clean_env(), where we invoke unsetenv(). (https://github.com/illumos/illumos-gate/blob/master/usr/src/lib/libc/port/gen/getenv.c#L411C1-L456C1) This is reported at GitHub #3018 and the reporter has proposed the patch, which we really appreciate! But looking at his fix and at the implementations of unsetenv() in FreeBSD libc and in Linux glibc 2.31, it seems that the algorithm of clean_env() will perform better with our my_unsetenv() implementation. This should be backported in versions 3.1 and 3.2.	2025-06-26 18:02:16 +02:00
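To illustrate why a self-contained removal routine behaves predictably, here is a hedged sketch of an in-place removal over a NULL-terminated `NAME=value` array. It is not HAProxy's `my_unsetenv()`; the function name and the choice to work on a caller-supplied array instead of `environ` are assumptions made for the example.

```c
#include <string.h>

/* Removes every "NAME=value" entry matching <name> from the
 * NULL-terminated array <env>, compacting it in place in one pass.
 * Returns the number of removed entries.
 */
int demo_unsetenv(char **env, const char *name)
{
	size_t len = strlen(name);
	char **src, **dst;
	int removed = 0;

	for (src = dst = env; *src; src++) {
		if (strncmp(*src, name, len) == 0 && (*src)[len] == '=')
			removed++;          /* drop this entry */
		else
			*dst++ = *src;      /* keep it, shifted left if needed */
	}
	*dst = NULL;
	return removed;
}
```

Because the array is walked once with separate read and write cursors, the loop always terminates, including when the same name appears several times.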
Willy Tarreau	27baa3f9ff	SCRIPTS: drop the HTML generation from announce-release It has not been used over the last 5 years or so and systematically requires manual removal. Let's just stop producing it. Also take this opportunity to add the missing link to /discussions.	2025-06-26 18:02:16 +02:00
Willy Tarreau	b74336984d	MEDIUM: cpu-topo: switch to the "performance" cpu-policy by default As mentioned during the NUMA series development, the goal is to use all available cores in the most efficient way by default, which normally corresponds to "cpu-policy performance". The previous default choice of "cpu-policy first-usable-node" was only meant to stay 100% identical to before cpu-policy. So let's switch the default cpu-policy to "performance" right now. The doc was updated to reflect this.	2025-06-26 16:27:43 +02:00
Maximilian Moehl	5128178256	DOC: config: crt-list clarify default cert + cert-bundle Clarify that HAProxy duplicates crt-list entries for multi-cert bundles which can create unexpected side-effects as only the very first certificate after duplication is considered as default implicitly.	2025-06-26 16:27:07 +02:00
Willy Tarreau	5c15ba5eff	MEDIUM: proxy: mark the "dispatch" directive as deprecated As mentioned in [1], the "dispatch" directive from haproxy 1.0 has long outlived its original purpose and still suffers from a number of technical limitations (no checks, no SSL, no idle connes etc) and still hinders some internal evolutions. It's now time to mark it as deprecated, and to remove it in 3.5 [2]. It was already recommended against in the documentation but remained popular in raw TCP environments for being shorter to write. The directive will now cause a warning to be emitted, suggesting an alternate method involving "server". The warning can be shut using "expose-deprecated-directives". The rare configs from 1.0 where "dispatch" is combined with sticky servers using cookies will just need to set these servers's weights to zero to prevent them from being selected by the load balancing algorithm. All of this is explained in the doc with examples. Two reg tests were using this method, one purposely for this directive, which now has expose-deprecated-directives, and another one to test the behavior of idle connections, which was updated to use "server" and extended to test both "http-reuse never" and "http-reuse always". [1] https://github.com/orgs/haproxy/discussions/2921 [2] https://github.com/haproxy/wiki/wiki/Breaking-changes	2025-06-26 15:29:47 +02:00
Willy Tarreau	19140ca666	DOC: config: explain how to deal with "transparent" deprecation The explanations for the "option transparent" keyword were a bit scarce regarding deprecation, so let's explain how to replace it with a server line that does the same.	2025-06-26 14:52:07 +02:00
Willy Tarreau	16f382f2d9	REGTESTS: script: also add a line pointing to the log file I never counted the number of hours I've been spending selecting then copy-pasting the directory output and manually appending "/LOG" to read a log file but it amounts in tens to hundreds. Let's just add a direct pointer to the log file at the end of the log for a failed run.	2025-06-26 14:33:09 +02:00
Willy Tarreau	1d3ab10423	REGTESTS: update http_reuse_be_transparent with "transparent" deprecated With commit e93f3ea3f8 ("MEDIUM: proxy: deprecate the "transparent" and "option transparent" directives") this one no longer works as the config either has to be adjusted to use server 0.0.0.0 or to enable the deprecated feature. The test used to validate a technical limitation ("transparent" not supporting shared connections), indicated as being comparable to "http-reuse never". Let's now duplicate the test for "http-reuse never" and "http-reuse always" and validate both behaviors. Take this opportunity to fix a few problems in this config: - use "nbthread 1": depending on the thread where the connection arrives, the connection may or may not be reused - add explicit URLs to the clients so that they can be recognized in the logs - add comments to make it clearer what to expect for each test	2025-06-26 14:32:20 +02:00
Willy Tarreau	e93f3ea3f8	MEDIUM: proxy: deprecate the "transparent" and "option transparent" directives As discussed here [1], "transparent" (already deprecated) and "option transparent" are horrible hacks which should really disappear in favor of "server xxx 0.0.0.0" which doesn't rely on hackish code path. This old feature is now deprecated in 3.3 and will disappear in 3.5, as indicated here [2]. A warning is emitted when used, explaining how to proceed, and how to silence the warning using the global "expose-deprecated-directives" if needed. The doc was updated to reflect this new state. [1] https://github.com/orgs/haproxy/discussions/2921 [2] https://github.com/haproxy/wiki/wiki/Breaking-changes	2025-06-26 11:55:47 +02:00
Frederic Lecaille	194e3bc2d5	MINOR: quic-be: address validation support implementation (RETRY) - Add ->retry_token and ->retry_token_len new quic_conn struct members to store the retry tokens. These objects are allocated by quic_rx_packet_parse() and released by quic_conn_release(). - Add <pool_head_quic_retry_token> new pool for these tokens. - Implement quic_retry_packet_check() to check the integrity tag of these tokens upon RETRY packets receipt. quic_tls_generate_retry_integrity_tag() is called by this new function. It has been modified to pass the address where the tag must be generated - Add <resend> new parameter to quic_pktns_discard(). This function is called to discard the packet number spaces where the already TX packets and frames are attached to. <resend> allows the caller to prevent this function to release the in flight TX packets/frames. The frames are requeued to be resent. - Modify quic_rx_pkt_parse() to handle the RETRY packets. What must be done upon such packets receipt is: - store the retry token, - store the new peer SCID as the DCID of the connection. Note that the peer will modify again its SCID. This is why this SCID is also stored as the ODCID which must be matched with the peer retry_source_connection_id transport parameter, - discard the Initial packet number space without flagging it as discarded and prevent retransmissions calling qc_set_timer(), - modify the TLS cryptographic cipher contexts (RX/TX), - wakeup the I/O handler to send new Initial packets asap. - Modify quic_transport_param_decode() to handle the retry_source_connection_id transport parameter as a QUIC client. Then its caller is modified to check this transport parameter matches with the SCID sent by the peer with the RETRY packet.	2025-06-26 09:48:00 +02:00
Frederic Lecaille	8a25fcd36e	MINOR: quic-be: Allow sending 1200 bytes Initial datagrams This easy to understand patch is not intrusive at all and cannot break the QUIC listeners. The QUIC client MUST always pad its datagrams with Initial packets. A "!l" (not a listener) test OR'ed with the existing ones is added to satisfy the condition to allow the build of such datagrams.	2025-06-26 09:48:00 +02:00
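The condition change described above can be sketched as a small predicate. The names below are hypothetical, and `listener_cond` stands in for the pre-existing listener-side tests that the commit message does not spell out; only the 1200-byte figure and the "not a listener" test OR'ed with them come from the log.

```c
#include <stddef.h>

#define DEMO_INITIAL_MIN_DGRAM 1200  /* minimum Initial datagram size */

/* Returns the number of padding bytes to append to a datagram of
 * <dgram_len> bytes. <is_listener> is non-zero on the frontend side,
 * <has_initial> is non-zero when the datagram carries an Initial packet.
 */
size_t demo_initial_padding(size_t dgram_len, int is_listener,
                            int has_initial, int listener_cond)
{
	/* a client (not a listener) always pads; a listener keeps its
	 * existing conditions
	 */
	int must_pad = has_initial && (!is_listener || listener_cond);

	if (!must_pad || dgram_len >= DEMO_INITIAL_MIN_DGRAM)
		return 0;
	return DEMO_INITIAL_MIN_DGRAM - dgram_len;
}
```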
Frederic Lecaille	c898b29e64	MINOR: quic: Useless TX buffer size reduction in closing state There is no need to limit the size of the TX buffer to QUIC_MIN_CC_PKTSIZE bytes when the connection is in closing state. There is already a test which limits the number of bytes to be used from this TX buffer after this useless test removed. It limits this number of bytes to the size of the TX buffer itself: if (end > (unsigned char *)b_wrap(buf)) end = (unsigned char *)b_wrap(buf); This is exactly what is needed when the connection is in closing state. Indeed, the size of the TX buffers are limited to reduce the memory usage. The connection only needs to send short datagrams with at most 2 packets with a CONNECTION_CLOSE* frames. They are built only one time and backed up into small TX buffer allocated from a dedicated pool. The size of this TX buffer is QUIC_MAX_CC_BUFSIZE which depends on QUIC_MIN_CC_PKTSIZE: #define QUIC_MIN_CC_PKTSIZE 128 #define QUIC_MAX_CC_BUFSIZE (2 * (QUIC_MIN_CC_PKTSIZE + QUIC_DGRAM_HEADLEN)) This size is smaller than an MTU. This patch should be backported as far as 2.9 to ease further backports to come.	2025-06-26 09:48:00 +02:00
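The remaining clamp can be expressed with lengths instead of pointers. This is only a sketch: `DEMO_DGRAM_HEADLEN` is an assumed value chosen for the arithmetic (the real `QUIC_DGRAM_HEADLEN` is not given in the commit message), and `demo_clamp_len()` is a hypothetical helper.

```c
#include <stddef.h>

#define QUIC_MIN_CC_PKTSIZE  128
/* Assumed value for illustration only. */
#define DEMO_DGRAM_HEADLEN   16
#define DEMO_MAX_CC_BUFSIZE  (2 * (QUIC_MIN_CC_PKTSIZE + DEMO_DGRAM_HEADLEN))

/* Number of bytes usable at offset <pos> of a buffer of <bufsize> bytes
 * when <wanted> bytes are requested: the request is clamped to the end
 * of the buffer, which is all the closing state needs.
 */
size_t demo_clamp_len(size_t bufsize, size_t pos, size_t wanted)
{
	size_t room = pos < bufsize ? bufsize - pos : 0;

	return wanted < room ? wanted : room;
}
```

With the assumed header length, the closing-state buffer is 288 bytes, so a 1200-byte request at offset 0 is clamped to 288 without any extra size test.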
Frederic Lecaille	9cb2acd2f2	MINOR: quic-be: add a "CC connection" backend TX buffer pool A QUIC client must be able to close a connection sending Initial packets. But QUIC client Initial packets must always be at least 1200 bytes long. To reduce the memory use of TX buffers of a connection when in "closing" state, a pool was dedicated for this purpose but with a too much reduced TX buffer size (QUIC_MAX_CC_BUFSIZE). This patch adds a "closing state connection" TX buffer pool with the same role for QUIC backends.	2025-06-26 09:48:00 +02:00
Frederic Lecaille	1e6d8f199c	BUG/MINOR: quic: wrong QUIC_FT_CONNECTION_CLOSE(0x1c) frame encoding This is an old bug which was there since this commit: MINOR: quic: Avoid zeroing frame structures It seems QUIC_FT_CONNECTION_CLOSE was confused with QUIC_FT_CONNECTION_CLOSE_APP which does not include a "frame type" field. This field was not initialized (so with a random value) which prevent the packet to be built because the packet builder supposes the packet with such frames are very short. Must be backported as far as 2.6.	2025-06-26 09:48:00 +02:00
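The difference between the two frame types can be shown with a small encoder sketch based on the frame layout in RFC 9000 (section 19.19): type 0x1c carries a "frame type" field after the error code, type 0x1d does not. The `demo_` functions are hypothetical and unrelated to HAProxy's frame builder; the varint helper only handles the 1- and 2-byte forms.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encodes <v> as a QUIC variable-length integer, 1- or 2-byte form only.
 * Returns the number of bytes written, or 0 if it does not fit.
 */
static size_t demo_varint(unsigned char *out, size_t room, uint64_t v)
{
	if (v < 64 && room >= 1) {
		out[0] = (unsigned char)v;
		return 1;
	}
	if (v < 16384 && room >= 2) {
		out[0] = (unsigned char)(0x40 | (v >> 8));
		out[1] = (unsigned char)(v & 0xff);
		return 2;
	}
	return 0;
}

/* Encodes a CONNECTION_CLOSE frame into <out>. <app> selects the
 * application variant (0x1d). Returns the frame length, or 0 on error.
 */
size_t demo_encode_cc(unsigned char *out, size_t room, int app,
                      uint64_t err, uint64_t ftype, const char *reason)
{
	size_t rlen = strlen(reason);
	size_t pos = 0, n;

	if (room < 1)
		return 0;
	out[pos++] = app ? 0x1d : 0x1c;

	if (!(n = demo_varint(out + pos, room - pos, err)))
		return 0;
	pos += n;

	if (!app) {
		/* only the transport variant carries the offending frame type,
		 * so it must always be given an explicit value here
		 */
		if (!(n = demo_varint(out + pos, room - pos, ftype)))
			return 0;
		pos += n;
	}

	if (!(n = demo_varint(out + pos, room - pos, rlen)))
		return 0;
	pos += n;

	if (room - pos < rlen)
		return 0;
	memcpy(out + pos, reason, rlen);
	return pos + rlen;
}
```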
William Lallemand	7cb6167d04	MAJOR: mworker: remove program section support This patch removes completely the support for the program section, the parsing of the section as well as the internals in the mworker does not support it anymore. The program section was considered dysfunctional and not fully compatible with the "mworker V3" model. Users that want to run an external program must use their init system. The documentation is cleaned up in another patch.	2025-06-25 16:11:34 +02:00
William Lallemand	9b5bf81f3c	DOC: remove the program section from the documentation The program section is obsolete and can be removed from the documentation.	2025-06-25 15:42:57 +02:00
Remi Tricot-Le Breton	34fc73ba81	MINOR: ssl: Add "renegotiate" server option This "renegotiate" option can be set on SSL backends to allow secure renegotiation. It is mostly useful with SSL libraries that disable secure renegotiation by default (such as AWS-LC). The "no-renegotiate" one can be used the other way around, to disable secure renegotiation that could be allowed by default. Those two options can be set via "ssl-default-server-options" as well.	2025-06-25 15:23:48 +02:00
William Lallemand	370a8cea4a	DOC: configuration: add details on prefer-client-ciphers prefer-client-ciphers does not work exactly the same way when used with a dual algorithm stack (ECDSA + RSA). This patch details its behavior. This patch must be backported in every maintained version. Problem was discovered in #2988.	2025-06-25 14:41:45 +02:00
William Lallemand	4a298c6c5c	BUG/MEDIUM: ssl/clienthello: ECDSA with ssl-max-ver TLSv1.2 and no ECDSA ciphers Patch 23093c72 ("BUG/MINOR: ssl: suboptimal certificate selection with TLSv1.3 and dual ECDSA/RSA") introduced a problem when prioritizing the ECDSA with TLSv1.3. Indeed, when a client with TLSv1.3 capabilities announces a list of ECDSA sigalgs, a list of TLSv1.3 ciphersuites compatible with ECDSA, but only RSA ciphers for TLSv1.2, and haproxy is configured with ssl-max-ver TLSv1.2, then haproxy would use the ECDSA keypair, but the client wouldn't be able to process it because TLSv1.2 was negotiated. HAProxy would be configured like that: ssl-default-bind-options ssl-max-ver TLSv1.2 And a client could be used this way: openssl s_client -connect localhost:8443 -cipher ECDHE-ECDSA-AES128-GCM-SHA256 \ -ciphersuites TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256:TLS_AES_128_GCM_SHA256 This patch fixes the issue by checking if TLSv1.3 was configured before allowing ECDSA if a TLSv1.3 ciphersuite is in the list. This could be backported where 23093c72 ("BUG/MINOR: ssl: suboptimal certificate selection with TLSv1.3 and dual ECDSA/RSA") was backported. However this is quite sensitive and we should wait a bit before the backport. This should fix issue #2988	2025-06-25 14:25:14 +02:00
Aurelien DARRAGON	5694a98744	MAJOR: mailers: remove native mailers support As mentioned in 2.8 announce on the mailing list [1] and on the wiki [2] native mailers were deprecated and planned for removal in 3.3. Now is the time to drop the legacy code for native mailers which is based on a tcpcheck "hack" and cannot be maintained. Lua mailers should be used as a drop in replacement. Indeed, "mailers" and associated config directives are preserved because mailers config is exposed to Lua, which helps smoothing the transition from native mailers to Lua based ones. As a reminder, to keep mailers configuration working as before without making changes to the config file, simply add the line below to the global section: lua-load examples/lua/mailers.lua mailers.lua script (provided in the git repository, adjust path as needed) may be customized by users familiar with Lua, by default it emulates the behavior of the native (now removed) mailers. [1]: https://www.mail-archive.com/haproxy@formilux.org/msg43600.html [2]: https://github.com/haproxy/wiki/wiki/Breaking-changes	2025-06-24 10:55:58 +02:00
Aurelien DARRAGON	c0f6024854	MINOR: hlua: emit a log instead of an alert for aborted actions due to unavailable yield As reported by Chris Staite in GH #3002, trying to yield from a Lua action during a client disconnect causes the script to be interrupted (which is expected) and an alert to be emitted with the error: "Lua function '%s': yield not allowed". While this error is well suited for cases where the yield is not expected at all (ie: when context doesn't allow it) and results from a yield misuse in the Lua script, it isn't the case when the yield is exceptionally not available due to an abort or error in the request/response processing. Because of that we raise an alert but the user cannot do anything about it (the script is correct), so it is confusing and polluting the logs. In this patch we introduce the ACT_OPT_FINAL_EARLY flag which is a complementary flag to ACT_OPT_FIRST. This flag is set when the ACT_OPT_FIRST is set earlier than normal (due to error/abort). hlua_action() then checks for this flag to decide whether an error (alert) or a simple log message should be emitted when the yield is not available. It should solve GH #3002. Thanks to Chris Staite (@chrisstaite-menlo) for having reported the issue and suggested a solution.	2025-06-24 10:55:55 +02:00
Christopher Faulet	20a82027ce	BUG/MINOR: log: Be able to use %ID alias at anytime of the stream's evaluation In a log-format string, using "%[unique-id]" or "%ID" should be equivalent. However, for the first one, the unique ID is generated when the sample fetch function is called. For the alias, it is not true. In that case, the stream's unique ID is generated when the log message is emitted. Otherwise, by default, the unique id is automatically generated at the end of the HTTP request analysis. So, if the alias "%ID" is used in a log-format string anywhere before the end of the request analysis, the evaluation fails and the ID is considered as empty. It is not consistent and in contradiction with the "%ID" documentation. To fix the issue, instead of evaluating the unique ID when the log message is emitted, it is now performed on demand when "%ID" format is evaluated. This patch should fix the issue #3016. It should be backported to all stable versions. It relies on the following commit: * BUG/MINOR: stream: Avoid recursive evaluation for unique-id based on itself	2025-06-24 08:04:50 +02:00
Christopher Faulet	fb7b5c8a53	BUG/MINOR: stream: Avoid recursive evaluation for unique-id based on itself There is nothing that prevent a "unique-id-format" to reference itself, using '%ID' or '%[unique-id]'. If the sample fetch function is used, it leads to an infinite loop, calling recursively the function responsible to generate the unique ID. One solution is to detect it during the configuration parsing to trigger an error. With this patch, we just inhibit recursive calls by considering the unique-id as empty during its evaluation. So "id-%[unique-id]" lf string will be evaluated as "id-". This patch must be backported to all stable versions.	2025-06-24 08:04:50 +02:00
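The inhibition described in this commit can be sketched with a re-entrancy flag. The structure and function below are hypothetical stand-ins, not HAProxy's stream code; the hard-coded format mimics the "id-%[unique-id]" example from the message.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stream: only the fields needed for the illustration.
 * It is expected to be zero-initialized before first use.
 */
struct demo_stream {
	char unique_id[64];
	int  generating;   /* non-zero while the ID is being built */
};

/* Returns the stream's ID, generating it on first use. A nested call
 * made while the ID is being built gets the empty string instead of
 * recursing forever.
 */
const char *demo_unique_id(struct demo_stream *s)
{
	if (s->unique_id[0])
		return s->unique_id;   /* already generated */
	if (s->generating)
		return "";             /* recursive call: evaluate as empty */

	s->generating = 1;
	/* stand-in for evaluating the format "id-%[unique-id]": the nested
	 * fetch re-enters this function and receives ""
	 */
	snprintf(s->unique_id, sizeof(s->unique_id), "id-%s", demo_unique_id(s));
	s->generating = 0;
	return s->unique_id;
}
```

In this sketch the self-referencing format evaluates to "id-", matching the behavior stated in the commit message.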
Willy Tarreau	68c3eb3013	BUG/MINOR: tools: only reset argument start upon new argument In issue #2995, Thomas Kjaer reported that empty argument position reporting had been broken yet again. This time it was broken by this latest fix: 2b60e54fb1 ("BUG/MINOR: tools: improve parse_line()'s robustness against empty args"). It turns out that this fix is not the culprit and it's in fact correct. The culprit was the original commit of this series, 7e4a2f39ef ("BUG/MINOR: tools: do not create an empty arg from trailing spaces"), which used to reset arg_start to outpos for every new char in addition to doing it for every arg. This resulted in the end of the line to be seen as always being in error, thus reporting an incorrect position that the caller would correct in a generic way designating the beginning of the line. It didn't reveal prior to the upper fix above because the misassigned value was almost not used by then. Assigning the value before entering the loop fixes this problem and doesn't break the series of previous oss-fuzz reproducers. Hopefully it's the last one again. This must be backported to 3.2. Thanks to @tkjaer for reporting the issue along with a reproducer.	2025-06-23 18:41:52 +02:00
Willy Tarreau	d7fad1320e	MAJOR: cfgparse: make sure server names are unique within a backend There was already a check for this but there used to be an exception that allowed duplicate server names only in case where their IDs were explicit and different. This has been emitting a warning since 3.1 and planned for removal in 3.3, so let's do it now. The doc was updated, though it never mentioned this unicity constraint, so that was added. Only the check for the exception was removed, the rest of the code that is currently made to deal with duplicate server names was not cleaned yet (e.g. the tree doesn't need to support dups anymore, and this could be done at insertion time). This may be a subject for future cleanups.	2025-06-23 15:42:32 +02:00
Willy Tarreau	067be38c0e	MAJOR: cfgparse: turn the same proxy name warning to an error As warned since 3.1, it's no longer permitted to have a frontend and a backend under the same name. This causes too many designation issues, and causes trouble with stick-tables as well. Now each proxy name is unique. This commit only changes the check to return an error. Some code parts currently exist to find the best candidates, these will be able to be simplified as future cleanup patches. The doc was updated.	2025-06-23 15:34:05 +02:00
Amaury Denoyelle	74b95922ef	BUG/MEDIUM: quic: do not release BE quic-conn prior to upper conn For frontend side, quic_conn is only released if MUX wasn't allocated, either due to handshake abort, in which case upper layer is never allocated, or after transfer completion when full conn + MUX layers are already released. On the backend side, initialization is not performed in the same order. Indeed, in this case, connection is first instantiated, then the quic_conn is created to execute the handshake, while MUX is still only allocated on handshake completion. As such, it is not possible anymore to free immediately quic_conn on handshake failure. Else, this can cause a crash if the connection tries to re-access its transport layer after quic_conn release. Such crash can easily be reproduced in case of connection error to the QUIC server. Here is an example of an experienced backtrace. Thread 1 "haproxy" received signal SIGSEGV, Segmentation fault. 0x0000555555739733 in quic_close (conn=0x55555734c0d0, xprt_ctx=0x5555573a6e50) at src/xprt_quic.c:28 28 qc->conn = NULL; [ ## gdb ## ] bt #0 0x0000555555739733 in quic_close (conn=0x55555734c0d0, xprt_ctx=0x5555573a6e50) at src/xprt_quic.c:28 #1 0x00005555559c9708 in conn_xprt_close (conn=0x55555734c0d0) at include/haproxy/connection.h:162 #2 0x00005555559c97d2 in conn_full_close (conn=0x55555734c0d0) at include/haproxy/connection.h:206 #3 0x00005555559d01a9 in sc_detach_endp (scp=0x7fffffffd648) at src/stconn.c:451 #4 0x00005555559d05b9 in sc_reset_endp (sc=0x55555734bf00) at src/stconn.c:533 #5 0x000055555598281d in back_handle_st_cer (s=0x55555734adb0) at src/backend.c:2754 #6 0x000055555588158a in process_stream (t=0x55555734be10, context=0x55555734adb0, state=516) at src/stream.c:1907 #7 0x0000555555dc31d9 in run_tasks_from_lists (budgets=0x7fffffffdb30) at src/task.c:655 #8 0x0000555555dc3dd3 in process_runnable_tasks () at src/task.c:889 #9 0x0000555555a1daae in run_poll_loop () at src/haproxy.c:2865 #10 0x0000555555a1e20c in run_thread_poll_loop (data=0x5555569d1c00 <ha_thread_info>) at src/haproxy.c:3081 #11 0x0000555555a1f66b in main (argc=5, argv=0x7fffffffde18) at src/haproxy.c:3671 To fix this, change the condition prior to calling quic_conn release. If <conn> member is not NULL, delay the release, similarly to the case when MUX is allocated. This allows connection to be freed first, and detach from quic_conn layer through close xprt operation. No need to backport.	2025-06-20 17:46:10 +02:00
Olivier Houchard	ba5738489f	MINOR: fwlc: Factorize code. Always set unusable if we could not use a server, instead of doing it in each branch This should be backported to 3.2 after e28e647fef43e5865c87f328832fec7794a423e5 is backported.	2025-06-20 15:59:03 +02:00
Olivier Houchard	e28e647fef	BUG/MAJOR: fwlc: Count an avoided server as unusable. In fwlc_get_next_server(), if a server to avoid has been provided, and we have to ignore it, don't forget to increase the number of unusable servers, otherwise we may end up ignoring it over and over, never switching to another server, in an infinite loop until the process gets killed. This hopefully fixes Github issues #3004 and #3014. This should be backported to 3.2.	2025-06-20 15:29:51 +02:00
Amaury Denoyelle	4527a2912b	MEDIUM: mux-quic: implement attach for new streams on backend side Implement attach and avail_streams mux-ops callbacks, which are used on backend side for connection reuse. Attach operation is used to initiate new streams on the connection outside of the first one. It simply relies on qcc_init_stream_local() to instantiate a new QCS instance, which is immediately linked to its stream data layer. Outside of attach, it is also necessary to implement avail_streams so that the stream layer will try to initiate connection reuse. This method reports the number of bidirectional streams which can still be opened for the QUIC connection. It depends directly to the flow-control value advertised by the peer. Thus, this ensures that attach won't cause any flow control violation.	2025-06-18 17:25:27 +02:00
Amaury Denoyelle	81cfaab6b4	MINOR: mux-quic: abort conn if cannot create stream due to fctl Prior to initiating the first stream on the backend side, ensure that peer flow-control allows at least a single bidirectional stream to be created. If this is not the case, abort MUX init operation. Before this patch, flow-control limit was not checked. Hence, if peer does not allow any bidirectional stream, haproxy would violate it, which would then cause the peer to close the connection. Note that with the current situation, haproxy won't be able to talk to servers which use a 0 for initial max bidi streams. A proper solution could be to pause the request until a MAX_STREAMS is received, under timeout supervision to ensure the connection is closed if no frame is received.	2025-06-18 17:25:27 +02:00
Amaury Denoyelle	06cab99a0e	MINOR: mux-quic: support max bidi streams value set by the peer Implement support for MAX_STREAMS frame. On frontend, this was mostly useless as haproxy would never initiate new bidirectional streams. However, this becomes necessary to control stream flow-control when using QUIC as a client on the backend side. Parsing of MAX_STREAMS is implemented via new qcc_recv_max_streams(). This allows to update <ms_uni>/<ms_bidi> QCC fields. This patch is necessary to achieve QUIC backend connection reuse.	2025-06-18 17:25:27 +02:00
Amaury Denoyelle	805a070ab9	BUG/MINOR: mux-quic/h3: properly handle too low peer fctl initial stream Previously, no check on peer flow-control was implemented prior to open a local QUIC stream. This was a small problem for frontend implementation, as in this case haproxy as a server never opens bidirectional streams. On frontend, the only stream opened by haproxy in this case is for HTTP/3 control unidirectional data. If the peer uses an initial value for max uni streams set to 0, it would violate its flow control, and the peer will probably close the connection. Note however that RFC 9114 mandates that each peer defines minimal initial value so that at least the control stream can be created. This commit improves the situation of too low initial max uni streams value. Now, on HTTP/3 layer initialization, haproxy preemptively checks flow control limit on streams via a new function qcc_fctl_avail_streams(). If credit is already expired due to a too small initial value, haproxy preemptively closes the connection using H3_ERR_GENERAL_PROTOCOL_ERROR. This behavior is better as haproxy is now the initiator of the connection closure. This should be backported up to 2.8.	2025-06-18 17:18:55 +02:00
Amaury Denoyelle	c807182ec9	CLEANUP: connection: remove unused mux-ops dedicated to QUIC Remove avail_streams_bidi/avail_streams_uni mux_ops. These callbacks were designed to be specific to QUIC. However, they won't be necessary, as stream layer only cares about bidirectional streams.	2025-06-18 17:02:50 +02:00
Valentine Krasnobaeva	cdb2f8d780	DOC: config: prefer-last-server: add notes for non-deterministic algorithms Add some notes which load-balancing algorithm can be considered as deterministic or non-deterministic and add some examples for each type. This was asked via mailing list to clarify the usage of prefer-last-server option. This can be backported to all stable versions.	2025-06-17 21:18:23 +02:00
Amaury Denoyelle	8fc0d2fbd5	MINOR: h3: reject invalid :status in response Add checks to ensure that :status pseudo-header received in HTTP/3 response is valid. If either the header is not provided, or it isn't a 3-digit number, the response is considered as invalid and the stream is rejected. Also, glitch counter is now incremented in any of these cases. This should fix coverity report from github issue #3009.	2025-06-17 11:39:35 +02:00
Amaury Denoyelle	f972f7d9e9	MINOR: h3: use BUG_ON() on missing request start-line Convert BUG_ON_HOT() statements to BUG_ON() if HTX start-line is either missing or duplicated when transcoding into a HTTP/3 request. This ensures that such abnormal conditions will be detected even on default builds. This is linked to coverity report #3008.	2025-06-17 11:39:35 +02:00
Amaury Denoyelle	2284aa0d6a	MINOR: h3: transcode H3 response headers into HTX blocks Finalize HTTP/3 response transcoding into HTX message. This patch implements conversion of HTTP/3 headers provided by the server into HTX blocks. Special checks have been implemented to reject connection-specific headers, causing the stream to be shut in error. Also, handling of content-length requires that the body size is equal to the value advertized in the header to prevent HTTP desync.	2025-06-16 18:11:09 +02:00
Amaury Denoyelle	d83255fdc3	MINOR: h3: complete response status transcoding On the backend side, HTTP/3 request response from server is transcoded into a HTX message. Previously, a fixed value was used for the status code. Improve this by extracting the value specified by the server and set it into the HTX status line. This requires to detect :status pseudo-header from the HTTP/3 response.	2025-06-16 18:11:09 +02:00
Amaury Denoyelle	f79effa306	MINOR: h3: convert HTTP/3 response into HTX for backend side support Implement basic support for HTTP/3 request response transcoding into HTX. This is done via a new dedicated function h3_resp_headers_to_htx(). A valid HTX status-line is allocated and stored. Status code is hardcoded to 200 for now. Following patches will be added to remove hardcoded status value and also handle response headers provided by the server.	2025-06-16 18:11:09 +02:00
Amaury Denoyelle	0eb35029dc	MINOR: h3: prepare support for response parsing Refactor HTTP/3 request headers transcoding to HTX done in h3_headers_to_htx(). Some operations are extracted into dedicated functions, to check pseudo-headers and headers conformity, and also trim the value of headers before encoding it in HTX. The objective will be to simplify implementation of HTTP/3 response transcoding by reusing these functions. Also, h3_headers_to_htx() has been renamed to h3_req_headers_to_htx(), to highlight that it is reserved to frontend usage.	2025-06-16 18:11:09 +02:00
Amaury Denoyelle	555ec99d43	MINOR: h3: adjust auth request encoding or fallback to host Implement proper encoding of HTTP/3 authority pseudo-header during request transcoding on the backend side. A pseudo-header :authority is encoded if a value can be extracted from HTX start-line. A special check is also implemented to ensure that a host header is not encoded if :authority already is. A new function qpack_encode_auth() is defined to implement QPACK encoding of :authority header using literal field line with name ref.	2025-06-16 18:11:09 +02:00
Amaury Denoyelle	96183abfbd	MINOR: h3: adjust path request encoding Previously, HTTP/3 backend request :path was hardcoded to value '/'. Change this so that we can now encode any path as requested by the client. Path is extracted from the HTX URI. Also, qpack_encode_path() is extended to support literal field line with name ref.	2025-06-16 18:11:09 +02:00
Amaury Denoyelle	235e818fa1	MINOR: h3: complete HTTP/3 request scheme encoding Previously, scheme was always set to https when transcoding an HTX start-line into a HTTP/3 request. Change this so this conversion is now fully compliant. If no scheme is specified by the client, which is what happens most of the time with HTTP/1, https is set for the HTTP/3 request. Else, reuse the scheme requested by the client. If either https or http is set, qpack_encode_scheme will encode it using entry from QPACK static table. Else, a full literal field line with name ref is used instead as the scheme value is specified as-is.	2025-06-16 18:11:09 +02:00
Amaury Denoyelle	a0912cf914	MINOR: h3: complete HTTP/3 request method encoding On the backend side, HTX start-line is converted into a HTTP/3 request message. Previously, GET method was hardcoded. Implement proper method conversion, by extracting it from the HTX start-line. qpack_encode_method() has also been extended, so that it is able to encode any method, either using a static table entry, or with a literal field line with name ref representation.	2025-06-16 18:11:09 +02:00
Amaury Denoyelle	f5342e0a96	MINOR: h3: encode request headers Implement encoding of HTTP/3 request headers during HTX->H3 conversion on the backend side. This simply relies on h3_encode_header(). Special check is implemented to ensure that connection-specific headers are ignored. An HTTP/3 endpoint must never generate them, or the peer will consider the message as malformed.	2025-06-16 18:11:09 +02:00
Amaury Denoyelle	7157adb154	MINOR: h3: support basic HTX start-line conversion into HTTP/3 request This commit is the first one of a series whose aim is to implement transcoding of a HTX request into HTTP/3, which is necessary for QUIC backend support. Transcoding is implemented via a new function h3_req_headers_send() when a HTX start-line is parsed. For now, most of the request fields are hardcoded, using a GET method. This will be adjusted in the following patches.	2025-06-16 18:11:09 +02:00
Amaury Denoyelle	fc1a17f169	BUG/MINOR: mux-quic: check sc_attach_mux return value On backend side, QUIC MUX needs to initialize the first local stream during MUX init operation. This is necessary so that the first transfer can then be performed. sc_attach_mux() is used to attach the created QCS instance to its stream data layer. However, return value was not checked, which may cause issues on allocation error. This patch fixes it by returning an error on MUX init operation and freeing the QCS instance in case of sc_attach_mux() error. This fixes coverity report from github issue #3007. No need to backport.	2025-06-16 18:11:09 +02:00
Christopher Faulet	54d74259e9	BUG/MEDIUM: check: Set SOCKERR by default when a connection error is reported When a connection error is reported, we try to collect as much information as possible on the connection status and the server status is adjusted accordingly. However, the function does nothing if there is no connection error and if the healthcheck is not expired yet. It is a problem when an internal error occurred. It may happen at many places and it is hard to be sure an error is reported on the connection. And in fact, it is already a problem when the multiplexer allocation fails. In that case, the healthcheck is not interrupted as it should be. Concretely, it could only happen when a connection is established. It is hard to predict the effects of this bug. It may be unimportant. But it could probably lead to a crash. To avoid any issue, a SOCKERR status is now set by default when a connection error is reported. There is no reason to report a connection error for nothing. So a healthcheck failure must be reported. There is no "internal error" status. So a socket error is reported. This patch must be backported to all stable versions.	2025-06-16 17:47:35 +02:00
Christopher Faulet	fb76655526	MINOR: cli: handle EOS/ERROR first This is not specifically a bug fix. But APPCTX_FL_EOS and APPCTX_FL_ERROR flags must be handled first. These flags are set by the applet itself and should mark the end of all processing. So there is no reason to get the output buffer in the first place. This patch could be backported as far as 3.0.	2025-06-16 16:47:59 +02:00
Christopher Faulet	396f0252bf	BUG/MEDIUM: cli: Don't consume data if outbuf is full or not available The output buffer must be available to process a command, at least to be able to emit error messages. When this buffer is full or cannot be allocated, we must wait. In that case, we must take care to notify the SE will not consume input data. It is important to avoid wakeup in loop, especially when the client aborts. When the output buffer is available again and no longer full, and the CLI applet is waiting for a command line, it must notify it will consume input data. This patch must be backported as far as 3.0.	2025-06-16 16:47:59 +02:00
Amaury Denoyelle	96badf86a2	BUG/MINOR: quic: fix ODCID initialization on frontend side QUIC support on the backend side has been implemented recently. This has led to some adjustment on qc_new_conn() to handle both FE and BE sides, with some of these changes performed by the following commit. 29fb1aee57288a8b16ed91771ae65c2bfa400128 MINOR: quic-be: QUIC connection allocation adaptation (qc_new_conn()) An issue was introduced during some code adjustment. Initialization of ODCID was incorrectly performed, which caused haproxy to emit invalid transport parameters. Most of the clients detected this and immediately closed the connection. Fix this by adjusting qc_lstnr_params_init() invocation: replace <qc.dcid>, which in fact points to the received SCID, by <qc.odcid> whose purpose is dedicated to original DCID storage. This fixes github issue #3006. This issue also caused the majority of tests in the interop to fail. No backport needed.	2025-06-16 10:09:37 +02:00
Frederic Lecaille	5409a73721	BUG/MINOR: quic: Fix OSSL_FUNC_SSL_QUIC_TLS_got_transport_params_fn callback (OpenSSL3.5) This patch is OpenSSL3.5 QUIC API specific. It fixes OSSL_FUNC_SSL_QUIC_TLS_got_transport_params_fn() callback (see man(3) SSL_set_quic_tls_cb). The role of this callback is to store the transport parameters received by the peer. At this time it is never used by QUIC listeners because there is another callback which is used to store the transport parameters. This latter callback is not specific to OpenSSL 3.5 QUIC API. As far as I know, the TLS stack calls only once one of the callbacks which have been set to receive and store the transport parameters. That said, OSSL_FUNC_SSL_QUIC_TLS_got_transport_params_fn() is called for QUIC backends to store the server transport parameters. qc_ssl_set_quic_transport_params() is useless in this callback. It is dedicated to storing the local transport parameters (which are sent to the peer). Furthermore <server> second parameter of quic_transport_params_store() must be 0 for a listener (or QUIC server) which calls it, denoting it does not receive the transport parameters of a QUIC server. It must be 1 for a QUIC backend (a QUIC client which receives the transport parameters of a QUIC server). Must be backported to 3.2.	2025-06-16 10:02:45 +02:00
Amaury Denoyelle	ab6895cc65	MINOR: hq-interop: handle HTX response forward if not enough space On backend side, HTTP/0.9 response body is copied into stream data HTX buffer. Properly handle the case where the HTX out buffer space is too small. Only copy part of the HTTP response. Transcoding will be restarted when new room is available.	2025-06-13 17:41:13 +02:00
Amaury Denoyelle	46cee07931	BUG/MINOR: quic: don't restrict reception on backend privileged ports When QUIC is used on the frontend side, communication is restricted with clients using privileged port. This is a simple protection against DNS/NTP spoofing. This feature should not be activated on the backend side, as in this case it is quite frequent to exchange with server running on privileged ports. As such, a new parameter is added to quic_recv() so that it is only active on the frontend side. Without this patch, it is impossible to communicate with QUIC servers running on privileged ports, as incoming datagrams would be silently dropped. No need to backport.	2025-06-13 16:40:21 +02:00
Christopher Faulet	edb8f2bb60	BUG/MINOR: http-ana: Properly handle keep-query redirect option if no QS The keep-query redirect option must do nothing if there is no query-string. However, there is a bug. When there is no QS, an error is returned, leading to return a 500-internal-error to the client. To fix the bug, instead of returning 0 when there is no QS, we just skip the QS processing. This patch should fix the issue #3005. It must be backported as far as 3.1.	2025-06-13 11:27:20 +02:00
Amaury Denoyelle	577fa44691	BUG/MINOR: quic: work around NEW_TOKEN parsing error on backend side NEW_TOKEN frame is never emitted by a client, hence parsing was not tested on frontend side. On backend side, an issue can occur, as expected token length is static, based on the token length used internally by haproxy. This is not sufficient for most server implementations, which use larger tokens. This causes a parsing error, which may cause skipping of following frames in the same packet. This issue was detected using ngtcp2 as server. As for now tokens are unused by haproxy, simply discard test on token length during NEW_TOKEN frame parsing. The token itself is merely skipped without being stored. This is sufficient for now to continue on experimenting with QUIC backend implementation. This does not need to be backported.	2025-06-12 17:47:15 +02:00
Amaury Denoyelle	830affc17d	MINOR: server: reject QUIC servers without explicit SSL Report an error during server configuration if QUIC is used but SSL is not activated via 'ssl' keyword. This is done in _srv_parse_finalize(), which is both used by static and dynamic servers. Note that contrary to listeners, an error is reported instead of a warning, and SSL is not automatically activated if missing. This is mainly due to the complex server configuration: _srv_parse_finalize() is ideal to affect every server, including dynamic entries. However, it is executed after server SSL context allocation performed via <prepare_srv> XPRT operation. A proper fix would be to move SSL ctx alloc in _srv_parse_finalize(), but this may have unknown impact. Thus, for now a simpler solution has been chosen.	2025-06-12 16:16:43 +02:00
Amaury Denoyelle	33cd96a5e9	BUG/MINOR: quic: prevent crash on startup with -dt QUIC traces in ssl_quic_srv_new_ssl_ctx() are problematic as this function is called early during startup. If activating traces via -dt command-line argument, a crash occurs due to stderr sink not yet available. Thus, traces from ssl_quic_srv_new_ssl_ctx() are simply removed. No backport needed.	2025-06-12 15:15:56 +02:00
Frederic Lecaille	5a0ae9e9be	MINOR: quic-be: Avoid SSL context unreachable code without USE_QUIC_OPENSSL_COMPAT This commit added a "err" C label reachable only with USE_QUIC_OPENSSL_COMPAT: MINOR: quic-be: Missing callbacks initializations (USE_QUIC_OPENSSL_COMPAT) leading coverity to warn this: *** CID 1611481: Control flow issues (UNREACHABLE) /src/quic_ssl.c: 802 in ssl_quic_srv_new_ssl_ctx() 796 goto err; 797 #endif 798 799 leave: 800 TRACE_LEAVE(QUIC_EV_CONN_NEW); 801 return ctx; >>> CID 1611481: Control flow issues (UNREACHABLE) >>> This code cannot be reached: "err: SSL_CTX_free(ctx);". 802 err: 803 SSL_CTX_free(ctx); 804 ctx = NULL; 805 TRACE_DEVEL("leaving on error", QUIC_EV_CONN_NEW); 806 goto leave; 807 } The least intrusive (without #ifdef) way to fix this is to add a "goto err" statement from the code part which is reachable without USE_QUIC_OPENSSL_COMPAT. Thank you to @chipitsine for having reported this issue in GH #3003.	2025-06-12 11:45:21 +02:00
Frederic Lecaille	869fb457ed	BUG/MINOR: quic-be: CID double free upon qc_new_conn() failures This issue may occur when qc_new_conn() fails after having allocated and attached <conn_cid> to its tree. This is the case when compiling haproxy against WolfSSL for an unknown reason at this time. In this case the <conn_cid> is freed by pool_head_quic_connection_id(), then freed again by quic_conn_release(). This bug arrived with this commit: MINOR: quic-be: QUIC connection allocation adaptation (qc_new_conn()) So, the aim of this patch is to free <conn_cid> only for QUIC backends and if it is not attached to its tree. This is the case when <conn_id> local variable passed with NULL value to qc_new_conn() is then initialized to the same <conn_cid> value.	2025-06-12 11:45:21 +02:00
Frederic Lecaille	dc3fb3a731	CLEANUP: quic-be: Add comments about qc_new_conn() usage This patch should have come with this last commit for the last qc_new_conn() modifications for QUIC backends: MINOR: quic-be: get rid of ->li quic_conn member qc_new_conn() must be passed NULL pointers for several variables as mentioned by the comment. Some of these local variables are used to avoid too much code modifications.	2025-06-12 11:45:21 +02:00
Amaury Denoyelle	603afd495b	MINOR: hq-interop: encode request from HTX for backend side support Implement transcoding of a HTX request into HTTP/0.9. This protocol is a simplified version of HTTP. Request only supports GET method without any header. As such, only a request line is written during snd_buf operation.	2025-06-12 11:28:54 +02:00
Amaury Denoyelle	a286d5476b	MINOR: hq-interop: decode response into HTX for backend side support Implement transcoding of a HTTP/0.9 response into a HTX message. HTTP/0.9 is a really simple subset of HTTP spec. The response does not have any status line and contains only the payload body. Response is finished when the underlying connection/stream is closed. A status line is generated to be compliant with HTX. This is performed on the first invocation of rcv_buf for the current stream. Status code is set to 200. Payload body if present is then copied using htx_add_data().	2025-06-12 11:28:54 +02:00
Amaury Denoyelle	4031bf7432	MINOR: quic: wakeup backend MUX on handshake completed This commit is the second and final step to initiate QUIC MUX on the backend side. On handshake completion, MUX is woken up just after its creation. This step is necessary to notify the stream layer, via the QCS instance pre-initialized on MUX init, so that the transfer can be resumed. This mode of operation is similar to TCP stack when TLS+ALPN are used, which forces MUX initialization to be delayed after handshake completion.	2025-06-12 11:28:54 +02:00
Amaury Denoyelle	1efaca8a57	MINOR: mux-quic: instantiate first stream on backend side Adjust qmux_init() to handle frontend and backend sides differently. Most notably, on backend side, the first bidirectional stream is created preemptively. This step is necessary as MUX layer will be woken up just after handshake completion.	2025-06-12 11:28:54 +02:00
Amaury Denoyelle	f8d096c05f	MINOR: mux-quic: set expect data only on frontend side Stream data layer is notified that data is expected when FIN is received, which marks the end of the HTTP request. This prepares data layer to be able to handle the expected HTTP response. Thus, this step is only relevant on frontend side. On backend side, FIN marks the end of the HTTP response. No further content is expected, thus expect data should not be set in this case. Note that se_expect_data() invocation via qcs_attach_sc() is not protected. This is because this function will only be called during request headers parsing which is performed on the frontend side.	2025-06-12 11:28:54 +02:00
Amaury Denoyelle	e8775d51df	MINOR: mux-quic: define flag for backend side Mux connection is flagged with new QC_CF_IS_BACK if used on the backend side. For now the only change is during traces, to be able to differentiate frontend and backend usage.	2025-06-12 11:28:54 +02:00
Amaury Denoyelle	93b904702f	MINOR: mux-quic: improve documentation for snd/rcv app-ops Complete documentation for rcv_buf/snd_buf operations. In particular, return value is now explicitly defined. For H3 layer, associated functions documentation is also extended.	2025-06-12 11:28:54 +02:00
Amaury Denoyelle	e7f1db0348	MINOR: quic: mark ctrl layer as ready on quic_connect_server() Use conn_ctrl_init() on the connection when quic_connect_server() succeeds. This is necessary so that the connection is considered as completely initialized. Without this, connect operation will be called again if connection is reused.	2025-06-12 11:25:12 +02:00
Amaury Denoyelle	a0db93f3d8	MEDIUM: backend: delay MUX init with ALPN even if proto is forced On backend side, multiplexer layer is initialized during connect_server(). However, this step is not performed if ALPN is used, as the negotiated protocol may be unknown. Multiplexer initialization is delayed after TLS handshake completion. There are still exceptions though that force the MUX to be initialized even if ALPN is used. One of them was if <mux_proto> server field was already set at this stage, which is the case when an explicit proto is selected on the server line configuration. Remove this condition so that now MUX init is delayed with ALPN even if proto is forced. The scope of this change should be minimal. In fact, the only impact concerns server config with both proto and ALPN set, which is pretty unlikely as it is contradictory. The main objective of this patch is to prepare QUIC support on the backend side. Indeed, QUIC proto will be forced on the server if a QUIC address is used, similarly to bind configuration. However, we still want to delay MUX initialization after QUIC handshake completion. This is mandatory to know the selected application protocol, required during QUIC MUX init.	2025-06-12 11:21:32 +02:00
Amaury Denoyelle	044ad3a602	BUG/MEDIUM: mux-quic: adjust wakeup behavior Change wake callback behavior for QUIC MUX. This operation loops over each QCS and notifies their stream data layer on certain events via internal helper qcc_wake_some_streams(). Previously, streams were notified only if an error occurred on the connection. Change this to notify streams data layer every time wake callback is used. This behavior is now identical to H2 MUX. qcc_wake_some_streams() is also renamed to qcc_wake_streams(), as it better reflects its true behavior. This change should not have performance impact as wake mux ops should not be called frequently. Note that qcc_wake_streams() can also be called directly via qcc_io_process() to ensure a new error is correctly propagated. As wake callback first uses qcc_io_process(), it will only call qcc_wake_streams() if no error is present. No known issue is associated with this commit. However, it could prevent freezing transfer under certain condition. As such, it is considered as a bug fix worthy of backporting. This should be backported after a period of observation.	2025-06-12 11:12:49 +02:00
Christopher Faulet	2c3f3eaaed	BUILD: hlua: Fix warnings about uninitialized variables (2) It was still failing on Ubuntu-24.04 with GCC+ASAN. So, instead of understanding the code path the compiler followed to report uninitialized variables, let's init them now. No backport needed.	2025-06-12 10:49:54 +02:00
Aurelien DARRAGON	b5067a972c	BUILD: listener: fix 'for' loop inline variable declaration commit 16eb0fab3 ("MAJOR: counters: dispatch counters over thread groups") introduced a build regression on some compilers: src/listener.c: In function 'listener_accept': src/listener.c:1095:3: error: 'for' loop initial declarations are only allowed in C99 mode for (int it = 0; it < global.nbtgroups; it++) ^ src/listener.c:1095:3: note: use option -std=c99 or -std=gnu99 to compile your code src/listener.c:1101:4: error: 'for' loop initial declarations are only allowed in C99 mode for (int it = 0; it < global.nbtgroups; it++) { ^ make: * [src/listener.o] Error 1 make: * Waiting for unfinished jobs.... Let's fix that. No backport needed	2025-06-12 08:46:36 +02:00
Christopher Faulet	01f011faeb	BUILD: hlua: Fix warnings about uninitialized variables In hlua_applet_tcp_recv_try() and hlua_applet_tcp_getline_yield(), GCC 14.2 reports warnings about 'blk2' variable that may be used uninitialized. It is a bit strange because the code is pretty similar to what it was before. But to make it happy and to avoid bugs if the API changes in the future, 'blk2' is now used only when its length is greater than 0. No need to backport.	2025-06-12 08:46:36 +02:00
Christopher Faulet	8c573deb9f	BUG/MINOR: hlua: Don't forget the return statement after a hlua_yieldk() In hlua_applet_tcp_getline_yield(), the function may yield if there is no data available. However we must take care to add a return statement just after the call to hlua_yieldk(). I don't know the details of the LUA API, but at least, this return statement fixes a build error about uninitialized variables that may be used. It is a 3.3-specific issue. No backport needed.	2025-06-12 08:46:36 +02:00
Frederic Lecaille	bf6e576cfd	MEDIUM: quic-be: initialize MUX on handshake completion On backend side, MUX is instantiated after QUIC handshake completion. This step is performed via qc_ssl_provide_quic_data(). First, connection flags for handshake completion are reset. Then, MUX is instantiated via conn_create_mux() function.	2025-06-11 18:37:34 +02:00
Amaury Denoyelle	cdcecb9b65	MINOR: quic: define proper proto on QUIC servers Force QUIC as <mux_proto> for server if a QUIC address is used. This is similar to what is already done for bind instances on the frontend side. This step ensures that conn_create_mux() will select the proper protocol.	2025-06-11 18:37:34 +02:00
Frederic Lecaille	855fd63f90	MINOR: quic-be: Prevent the MUX to send/receive data Such actions must be interrupted until the handshake completion.	2025-06-11 18:37:34 +02:00
Frederic Lecaille	b9703cf711	MINOR: quic-be: get rid of ->li quic_conn member Replace ->li quic_conn pointer to struct listener member by ->target which is an object type enum and adapt the code. Use __objt_(listener|server)() where the object type is known. Typically this is where the code is specific to one connection type (frontend/backend). Remove <server> parameter passed to qc_new_conn(). It is redundant with the <target> parameter. GSO is not supported at this time for QUIC backend. qc_prep_pkts() is modified to prevent it from building more than an MTU. As a consequence, qc_send_ppkts() cannot use GSO. ssl_clienthello.c code is run only by listeners. This is why __objt_listener() is used in place of ->li.	2025-06-11 18:37:34 +02:00
Frederic Lecaille	f6ef3bbc8a	MINOR: quic-be: SSL_get_peer_quic_transport_params() not defined by OpenSSL 3.5 QUIC API Disable the code around SSL_get_peer_quic_transport_params() as this was done for USE_QUIC_OPENSSL_COMPAT because SSL_get_peer_quic_transport_params() is not defined by OpenSSL 3.5 QUIC API.	2025-06-11 18:37:34 +02:00
Frederic Lecaille	034cf74437	MINOR: quic-be: Make the secret derivation works for QUIC backends (USE_QUIC_OPENSSL_COMPAT) quic_tls_compat_keylog_callback() is the callback used by the QUIC OpenSSL compatibility module to derive the TLS secrets from other secrets provided by keylog. The <write> local variable of this function is initialized to denote the direction (write to send, read to receive) the secret is supposed to be used for. That said, as the QUIC cryptographic algorithms are symmetrical, the direction is inverted between the peers: a secret which is used to write/send/cipher data from one peer's point of view is also the secret which is used by the other peer to read/receive/decipher data. This was confirmed by the fact that without this patch, the TLS stack first provides the peer with the Handshake secret to send/cipher data. The client could not use such a secret to decipher the Handshake packets received from the server. This patch simply reverses the direction stored by the <write> variable to make the secrets derivation work for the QUIC client.	2025-06-11 18:37:34 +02:00
Frederic Lecaille	d1cd0bb987	MINOR: quic-be: Missing callbacks initializations (USE_QUIC_OPENSSL_COMPAT) quic_tls_compat_init() function is called from OpenSSL QUIC compatibility module (USE_QUIC_OPENSSL_COMPAT) to initialize the keylog callback and the callback which stores the QUIC transport parameters as a TLS extension into the stack. These callbacks must also be initialized for QUIC backends.	2025-06-11 18:37:34 +02:00
Frederic Lecaille	fc90964b55	MINOR: quic-be: Store the remote transport parameters asap This is done from TLS secrets derivation callback at Application level (the last encryption level) calling SSL_get_peer_quic_transport_params() to have access to the TLS transport parameters extension embedded into the Server Hello TLS message. Then, quic_transport_params_store() is called to store a decoded version of these transport parameters.	2025-06-11 18:37:34 +02:00
Frederic Lecaille	8c2f2615f4	MINOR: quic-be: I/O handler switch adaptation For connections to QUIC servers, this patch modifies the moment when the I/O handler callback is switched to quic_conn_app_io_cb(). This is no longer done just after the handshake has completed, as for listeners, but just after it has been confirmed.	2025-06-11 18:37:34 +02:00
Frederic Lecaille	f085a2f5bf	MINOR: quic-be: Initial packet number space discarding. Discard the Initial packet number space as soon as possible. This is done during handshakes in quic_conn_io_cb() as soon as a Handshake packet could be successfully sent.	2025-06-11 18:37:34 +02:00
Frederic Lecaille	a62098bfb0	MINOR: quic-be: Add the conn object to the server SSL context The initialization of <ssl_app_data_index> SSL user data index is required to make all the SSL sessions to QUIC servers work as this is done for TCP servers. The conn object is notably retrieved by SSL callbacks which are server specific (e.g. ssl_sess_new_srv_cb()).	2025-06-11 18:37:34 +02:00
Frederic Lecaille	e226a7cb79	MINOR: quic-be: Build post handshake frames This action is not specific to listeners. A QUIC client also has to send NEW_CONNECTION_ID frames.	2025-06-11 18:37:34 +02:00
Frederic Lecaille	2d076178c6	MINOR: quic-be: Store asap the DCID Store the peer connection ID (SCID) as the connection DCID as soon as an Initial packet is received. Stop comparing the packet to QUIC_PACKET_TYPE_0RTT if it already matched QUIC_PACKET_TYPE_INITIAL. A QUIC server must not send datagrams that are too short with ack-eliciting packets inside. This cannot be checked from quic_rx_pkt_parse() because one does not know whether there is an ack-eliciting frame in the Initial packets. If the packet must be dropped, this is after having parsed it!	2025-06-11 18:37:34 +02:00
Frederic Lecaille	b4a9b53515	MINOR: h3-be: Correctly retrieve h3 counters This is done using qc_counters() function which supports also QUIC servers.	2025-06-11 18:37:34 +02:00
Frederic Lecaille	e27b7b4889	MINOR: quic-be: Handshake packet number space discarding This is done for QUIC clients (or haproxy QUIC servers) when the handshake is confirmed.	2025-06-11 18:37:34 +02:00
Frederic Lecaille	43d88a44f1	MINOR: quic-be: Datagrams and packet parsing support Modify quic_dgram_parse() to stop passing it a listener as third parameter. Instead, the object type address of the connection socket owner is passed to support the haproxy servers with QUIC as transport protocol. qc_owner_obj_type() is implemented to return this address. qc_counters() is also implemented to return the QUIC specific counters of the proxy owning the connection. quic_rx_pkt_parse(), called by quic_dgram_parse(), is also modified to use the object type address used by the latter as last parameter. It is also modified to send Retry packets only from listeners. A QUIC client (connection to haproxy QUIC servers) must drop the Initial packets with a non-null token length. It is also not supposed to receive 0-RTT packets, which are dropped.	2025-06-11 18:37:34 +02:00
Frederic Lecaille	266b10b8a4	MINOR: quic-be: Do not redispatch the datagrams The QUIC datagram redispatch is there to counter the race condition which exists only for QUIC connections to listener where datagrams may arrive on the wrong socket between the bind() and connect() calls. Run this code part only for listeners.	2025-06-11 18:37:34 +02:00
Frederic Lecaille	89d5a59933	MINOR: quic-be: add field for max_udp_payload_size into quic_conn Add ->max_udp_payload_size new member to quic_conn struct. Initialize it from qc_new_conn(). Adapt qc_snd_buf() to use it.	2025-06-11 18:37:34 +02:00
Frederic Lecaille	f7c0f5ac1b	MINOR: quic-be: xprt ->init() adapatations Allocate a connection to connect to QUIC servers from qc_conn_init() which is the ->init() QUIC xprt callback. Also initialize ->prepare_srv and ->destroy_srv callbacks as this is done for TCP servers.	2025-06-11 18:37:34 +02:00
Frederic Lecaille	29fb1aee57	MINOR: quic-be: QUIC connection allocation adaptation (qc_new_conn()) For haproxy QUIC servers (or QUIC clients), the peer is considered as validated. This is a property which is more specific to QUIC servers (haproxy QUIC listeners). No <odcid> is used for the QUIC client connection. It is used only on the QUIC server side. The <token_odcid> is also not used on the QUIC client side. It must be embedded into the transport parameters only on the QUIC server side. The quic_conn is created before the socket allocation. So, the local address is zeroed. Initialize the transport parameters with qc_srv_params_init(). Stop hardcoding the <server> parameter passed value to qc_new_isecs() to correctly initialize the Initial secrets.	2025-06-11 18:37:34 +02:00
Frederic Lecaille	9831f596ea	MINOR: quic-be: ->connect() protocol callback adaptations Modify quic_connect_server() which is the ->connect() callback for QUIC protocol: - add a BUG_ON() run when entering this function: the <fd> socket must equal -1 - conn->handle is a union. conn->handle.qc is used for QUIC connections, conn->handle.fd must not be used to store the fd. - code alignment fix for setsockopt(fd, SOL_SOCKET, (SO_SNDBUF|SO_RCVBUF)) statements - remove the section of code which was duplicated from ->connect() TCP callback - fd_insert() the new socket file descriptor created to connect to the QUIC server with quic_conn_sock_fd_iocb() as callback for read event.	2025-06-11 18:37:34 +02:00
Frederic Lecaille	52ec3430f2	MINOR: sock: Add protocol and socket types parameters to sock_create_server_socket() This patch only adds <proto_type> new proto_type enum parameter and <sock_type> socket type parameter to sock_create_server_socket() and adapts its callers. This is to prepare the use of this function by QUIC servers/backends.	2025-06-11 18:37:34 +02:00
Frederic Lecaille	9c84f64652	MINOR: quic-be: Add a function to initialize the QUIC client transport parameters Implement qc_srv_params_init() to initialize the QUIC client transport parameters in relation with connections to haproxy servers/backends.	2025-06-11 18:37:34 +02:00
Frederic Lecaille	f49bbd36b9	MINOR: quic-be: SSL sessions initializations Modify qc_alloc_ssl_sock_ctx() to pass the connection object as parameter. It is NULL for a QUIC listener, not NULL for a QUIC server. This connection object is set as value for ->conn quic_conn struct member. Initialise the SSL session object from this function for QUIC servers. qc_ssl_set_quic_transport_params() is also modified to pass the SSL object as parameter. This is the unique parameter this function needs. <qc> parameter is used only for the trace. SSL_do_handshake() must be called as soon as the SSL object is initialized for the QUIC backend connection. This triggers the TLS CRYPTO data delivery. tasklet_wakeup() is also called to send asap these CRYPTO data. Modify the QUIC_EV_CONN_NEW event trace to dump the potential errors returned by SSL_do_handshake().	2025-06-11 18:37:34 +02:00
Frederic Lecaille	1408d94bc4	MINOR: quic-be: ssl_sock contexts allocation and misc adaptations Implement ssl_sock_new_ssl_ctx() to allocate a SSL server context as this is currently done for TCP servers and also for QUIC servers depending on the <is_quic> boolean value passed as new parameter. For QUIC servers, this function calls ssl_quic_srv_new_ssl_ctx() which is specific to QUIC.	2025-06-11 18:37:34 +02:00
Frederic Lecaille	7c76252d8a	MINOR: quic-be: Correct the QUIC protocol lookup From connect_server(), QUIC protocol could not be retrieved by protocol_lookup() because of the PROTO_TYPE_STREAM default passed as argument. Instead, to support QUIC, srv->addr_type.proto_type may be safely passed.	2025-06-11 18:37:34 +02:00
Frederic Lecaille	1e45690656	MINOR: quic-be: Add a function for the TLS context allocations Implement ssl_quic_srv_new_ssl_ctx() whose aim is to allocate a TLS context for QUIC servers.	2025-06-11 18:37:34 +02:00
Frederic Lecaille	a4e1296208	MINOR: quic-be: QUIC server xprt already set when preparing their CTXs The QUIC servers xprts have already been set at server line parsing time. This patch prevents the QUIC servers xprts to be reset to <ssl_sock> value which is the value used for SSL/TCP connections.	2025-06-11 18:37:34 +02:00
Frederic Lecaille	24fc44c44d	MINOR: quic-be: QUIC backend XPRT and transport parameters init during parsing Add ->quic_params new member to server struct. Also set the ->xprt member of the server being initialized and initialize asap its transport parameters from _srv_parse_init().	2025-06-11 18:37:34 +02:00
Frederic Lecaille	0e67687ca9	MINOR: quic-be: Call ->prepare_srv() callback at parsing time This XPRT callback is called from check_config_validity() after the configuration has been parsed to initialize all the SSL server contexts. This patch implements the same thing for the QUIC servers.	2025-06-11 18:37:34 +02:00
Frederic Lecaille	5a711551a2	MINOR: quic-be: Version Information transport parameter check Add a little check to verify that the version chosen by the server matches with the client one. Initializes local transport parameters ->negotiated_version value with this version if this is the case. If not, return 0;	2025-06-11 18:37:34 +02:00
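The check described in the commit above could look like the following sketch. It is an illustration only, not the haproxy code: `struct demo_tx_params` and `demo_check_chosen_version()` are hypothetical names, and only the rule stated in the message is modelled (store the version when it matches the client one, return 0 otherwise). The two version constants are the QUIC v1 and v2 wire values.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_QUIC_V1 0x00000001u /* QUIC version 1 */
#define DEMO_QUIC_V2 0x6b3343cfu /* QUIC version 2 */

/* Hypothetical, reduced view of the local transport parameters. */
struct demo_tx_params {
	uint32_t negotiated_version;
};

/* Returns 1 and records the version if the version chosen by the server
 * matches the one used by the client, 0 otherwise. On mismatch the local
 * parameters are left untouched.
 */
static int demo_check_chosen_version(struct demo_tx_params *local,
                                     uint32_t client_version,
                                     uint32_t server_chosen)
{
	assert(local);
	if (server_chosen != client_version)
		return 0;
	local->negotiated_version = server_chosen;
	return 1;
}
```

In this sketch a mismatch is reported to the caller, which decides how to fail the connection; the actual error handling in haproxy is not shown by the commit message.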
Frederic Lecaille	990c9f95f7	MINOR: quic-be: Correct Version Information transp. param encoding According to the RFC, a QUIC client must encode the QUIC versions it supports into the "Available Versions" of the "Version Information" transport parameter, ordered by descending preference. This is done by defining <quic_version_2> and <quic_version_draft_29> new variables as pointers to the corresponding elements of the <quic_versions> array. A client announces its available versions as follows: v1, v2, draft29.	2025-06-11 18:37:34 +02:00
Amaury Denoyelle	9c751a3cc1	MINOR: mux-quic-be: allow QUIC proto on backend side Activate QUIC protocol support for MUX-QUIC on the backend side, in addition to the current frontend support. This change is mandatory to be able to implement QUIC on the backend side. Without this modification, it is impossible to explicitly activate the QUIC protocol on a server line, hence an error is reported : config : proxy 'xxxx' : MUX protocol 'quic' is not usable for server 'yyyy'	2025-06-11 18:37:34 +02:00
Amaury Denoyelle	f66b495f8e	MINOR: server: mark QUIC support as experimental Mark QUIC address support for servers as experimental on the backend side. Previously, it was allowed but wouldn't function as expected. As QUIC backend support requires several changes, it is better to declare it as experimental first.	2025-06-11 18:37:33 +02:00
Amaury Denoyelle	bdd5e58179	MINOR: server: implement helper to identify QUIC servers Define srv_is_quic() which can be used to quickly identify whether a server uses the QUIC protocol.	2025-06-11 18:37:19 +02:00
Amaury Denoyelle	1ecf2e9bab	BUG/MINOR: config/server: reject QUIC addresses QUIC is not implemented on the backend side. To prevent any issue, it is better to reject any server configured which uses it. This is done via _srv_parse_init() which is used both for static and dynamic servers. This should be backported up to all stable versions.	2025-06-11 18:37:17 +02:00
Christopher Faulet	b5525fe759	[RELEASE] Released version 3.3-dev1 Released version 3.3-dev1 with the following main changes : - BUILD: tools: properly define ha_dump_backtrace() to avoid a build warning - DOC: config: Fix a typo in 2.7 (Name format for maps and ACLs) - REGTESTS: Do not use REQUIRE_VERSION for HAProxy 2.5+ (5) - REGTESTS: Remove REQUIRE_VERSION=2.3 from all tests - REGTESTS: Remove REQUIRE_VERSION=2.4 from all tests - REGTESTS: Remove tests with REQUIRE_VERSION_BELOW=2.4 - REGTESTS: Remove support for REQUIRE_VERSION and REQUIRE_VERSION_BELOW - MINOR: server: group postinit server tasks under _srv_postparse() - MINOR: stats: add stat_col flags - MINOR: stats: add ME_NEW_COMMON() helper - MINOR: proxy: collect per-capability stat in proxy_cond_disable() - MINOR: proxy: add a true list containing all proxies - MINOR: log: only run postcheck_log_backend() checks on backend - MEDIUM: proxy: use global proxy list for REGISTER_POST_PROXY_CHECK() hook - MEDIUM: server: automatically add server to proxy list in new_server() - MEDIUM: server: add and use srv_init() function - BUG/MAJOR: leastconn: Protect tree_elt with the lbprm lock - BUG/MEDIUM: check: Requeue healthchecks on I/O events to handle check timeout - CLEANUP: applet: Update comment for applet_put* functions - DEBUG: check: Add the healthcheck's expiration date in the trace messags - BUG/MINOR: mux-spop: Fix null-pointer deref on SPOP stream allocation failure - CLEANUP: sink: remove useless cleanup in sink_new_from_logger() - MAJOR: counters: add shared counters base infrastructure - MINOR: counters: add shared counters helpers to get and drop shared pointers - MINOR: counters: add common struct and flags to {fe,be}_counters_shared - MEDIUM: counters: manage shared counters using dedicated helpers - CLEANUP: counters: merge some common counters between {fe,be}_counters_shared - MINOR: counters: add local-only internal rates to compute some maxes - MAJOR: counters: dispatch counters over thread groups - BUG/MEDIUM: cli: Properly parse empty lines and avoid crashed - BUG/MINOR: config: emit warning for empty args only in discovery mode - BUG/MINOR: config: fix arg number reported on empty arg warning - BUG/MINOR: quic: Missing SSL session object freeing - MINOR: applet: Add API functions to manipulate input and output buffers - MINOR: applet: Add API functions to get data from the input buffer - CLEANUP: applet: Simplify a bit comments for applet_put* functions - MEDIUM: hlua: Update TCP applet functions to use the new applet API - BUG/MEDIUM: fd: Use the provided tgid in fd_insert() to get tgroup_info - BUG/MINIR: h1: Fix doc of 'accept-unsafe-...-request' about URI parsing	2025-06-11 14:31:33 +02:00
Christopher Faulet	b2f64af341	BUG/MINIR: h1: Fix doc of 'accept-unsafe-...-request' about URI parsing The description of the tests performed on the URI in H1 with the 'accept-unsafe-violations-in-http-request' option is wrong. It states that only characters below 32 and 127 are blocked when this option is set, suggesting that otherwise, when it is not set, all invalid characters in the URI, according to the RFC3986, are blocked. But in fact, it is not true. By default all characters below 32 and above 127 are blocked. And when 'accept-unsafe-violations-in-http-request' option is set, characters above 127 (excluded) are accepted. But characters in (33..126) are never checked, independently of this option. This patch should fix the issue #2906. It should be backported as far as 3.0. For older versions, the documentation could also be clarified because this part is not really clear. Note the request URI validation is still under discussion because invalid characters in (33..126) are never checked and some users request a stricter parsing.	2025-06-10 19:17:56 +02:00
Olivier Houchard	6993981cd6	BUG/MEDIUM: fd: Use the provided tgid in fd_insert() to get tgroup_info In fd_insert(), use the provided tgid to get the thread group info, instead of using the one of the current thread, as we may call fd_insert() from a thread of another thread group; that will happen at least when binding the listeners. Otherwise we'd end up accessing the thread mask containing the enabled threads of the wrong thread group, which can lead to crashes if we're binding on threads not present in the thread group. This should fix Github issue #2991. This should be backported up to 2.8.	2025-06-10 15:10:56 +02:00
Christopher Faulet	9df380a152	MEDIUM: hlua: Update TCP applet functions to use the new applet API The functions responsible to extract data from the applet input buffer or to push data into the applet output buffer are now relying on the newly added functions in the applet API. This simplifies a bit the code.	2025-06-10 08:16:10 +02:00
Christopher Faulet	18f9c71041	CLEANUP: applet: Simplify a bit comments for applet_put* functions Instead of repeating which buffer is used depending on the API used by the applet, a reference to applet_get_outbuf() was added.	2025-06-10 08:16:10 +02:00
Christopher Faulet	79445766a3	MINOR: applet: Add API functions to get data from the input buffer There were already functions to push data from the applet to the stream by inserting them in the right buffer, depending on whether the applet was using the legacy API or not. Here, functions to retrieve data pushed to the applet by the stream were added: * applet_getchar : Gets one character * applet_getblk : Copies a full block of data * applet_getword : Copies one text block representing a word using a custom separator as delimiter * applet_getline : Copies one text line * applet_getblk_nc : Gets one or two blocks of data * applet_getword_nc: Gets one or two blocks of text representing a word using a custom separator as delimiter * applet_getline_nc: Gets one or two blocks of text representing a line	2025-06-10 08:16:10 +02:00
Christopher Faulet	0d8ecb1edc	MINOR: applet: Add API functions to manipulate input and output buffers In this patch, some functions were added to ease input and output buffers manipulation, regardless of whether the corresponding applet is using its own buffers or relying on channels buffers. Following functions were added: * applet_get_inbuf : Gets the buffer containing data pushed to the applet by the stream * applet_get_outbuf : Gets the buffer containing data pushed by the applet to the stream * applet_input_data : Returns the amount of data in the input buffer * applet_skip_input : Skips <len> bytes from the input buffer * applet_reset_input: Skips all bytes from the input buffer * applet_output_room: Returns the amount of space available in the output buffer * applet_need_room : Indicates that the applet has more data to deliver and needs more room in the output buffer to do so	2025-06-10 08:16:10 +02:00
Frederic Lecaille	6b74633069	BUG/MINOR: quic: Missing SSL session object freeing qc_alloc_ssl_sock_ctx() allocates an SSL_CTX object for each connection. It also allocates an SSL object. When this function failed, it freed only the SSL_CTX object. The correct way to free both of them is to call qc_free_ssl_sock_ctx(). Must be backported as far as 2.6.	2025-06-06 17:53:13 +02:00
Amaury Denoyelle	0cdf529720	BUG/MINOR: config: fix arg number reported on empty arg warning If an empty argument is used in configuration, for example due to an undefined environment variable, the rest of the line is not parsed. As such, a warning is emitted to report this. The warning was not totally correct as it reported the wrong argument index. Fix this by this patch. Note that there is still an issue with the "^" indicator, but this is not as easy to fix yet. This is related to github issue #2995. This should be backported up to 3.2.	2025-06-06 17:03:02 +02:00
Amaury Denoyelle	5f1fad1690	BUG/MINOR: config: emit warning for empty args only in discovery mode Hide warning about empty argument outside of discovery mode. This is necessary, else the message will be displayed twice, which hampers the readability of haproxy output. This should fix github issue #2995. This should be backported up to 3.2.	2025-06-06 17:02:58 +02:00
Christopher Faulet	f5d41803d3	BUG/MEDIUM: cli: Properly parse empty lines and avoid crashed Empty lines were not properly parsed and could lead to crashes because the last argument was parsed outside of the cmdline buffer. Indeed, the last argument is parsed to look for a possible payload pattern. It starts one character after the newline at the end of the command line. But this is only valid for a non-empty command line. So, now, this case is properly detected and we leave early if an empty line is detected. This patch must be backported to 3.2.	2025-06-05 10:46:13 +02:00
Aurelien DARRAGON	16eb0fab31	MAJOR: counters: dispatch counters over thread groups Most fe and be counters are good candidates for being shared between processes. They are now grouped inside "shared" struct sub member under be_counters and fe_counters. Now that they are properly identified, they would greatly benefit from being shared over thread groups to reduce the cost of atomic operations when updating them. For this, we take the current tgid into account so each thread group only updates its own counters. For this to work, it is mandatory that the "shared" member from {fe,be}_counters is initialized AFTER global.nbtgroups is known, because each shared counter causes the stat to be allocated global.nbtgroups times. When updating a counter without concurrency, the first counter from the array may be updated. To consult the shared counters (which requires aggregation of per-tgid individual counters), some helper functions were added to counter.h to ease code maintenance and avoid computing errors.	2025-06-05 09:59:38 +02:00
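The idea behind this commit can be sketched as follows: one atomic slot per thread group, updated only by threads of that group, and summed when the value is consulted. This is a simplified illustration under assumptions, not the haproxy implementation: all `demo_*` names are hypothetical, the thread group ID is assumed to be 1-based, and C11 `<stdatomic.h>` is used in place of haproxy's own atomic macros.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical shared counter: one slot per thread group. */
struct demo_shared_counter {
	int nbtgroups;
	atomic_ullong *per_tg;
};

/* Must be called once the number of thread groups is known, since the
 * storage is allocated nbtgroups times. Returns 0 on success, -1 on
 * allocation failure.
 */
static int demo_counter_init(struct demo_shared_counter *c, int nbtgroups)
{
	int i;

	assert(c && nbtgroups > 0);
	c->per_tg = malloc((size_t)nbtgroups * sizeof(*c->per_tg));
	if (!c->per_tg)
		return -1;
	for (i = 0; i < nbtgroups; i++)
		atomic_init(&c->per_tg[i], 0);
	c->nbtgroups = nbtgroups;
	return 0;
}

/* Each thread group only touches its own slot (tgid assumed 1-based). */
static void demo_counter_add(struct demo_shared_counter *c, int tgid,
                             unsigned long long v)
{
	assert(tgid >= 1 && tgid <= c->nbtgroups);
	atomic_fetch_add_explicit(&c->per_tg[tgid - 1], v, memory_order_relaxed);
}

/* Consulting the counter requires aggregating the per-group values. */
static unsigned long long demo_counter_total(struct demo_shared_counter *c)
{
	unsigned long long total = 0;
	int i;

	for (i = 0; i < c->nbtgroups; i++)
		total += atomic_load_explicit(&c->per_tg[i], memory_order_relaxed);
	return total;
}

static void demo_counter_free(struct demo_shared_counter *c)
{
	free(c->per_tg);
	c->per_tg = NULL;
	c->nbtgroups = 0;
}
```

The trade-off is the one the message describes: updates stay local to a thread group, while reads pay for the aggregation loop.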
Aurelien DARRAGON	12c3ffbb48	MINOR: counters: add local-only internal rates to compute some maxes cps_max (max new connections received per second), sps_max (max new sessions per second) and http.rps_max (maximum new http requests per second) all rely on shared counters (namely conn_per_sec, sess_per_sec and http.req_per_sec). The problem is that shared counters are about to be distributed over thread groups, and we cannot afford to compute the total (for all thread groups) each time we update the max counters. Instead, since such max counters (relying on shared counters) are rare exceptions, let's add internal (sess,conn,req) per sec freq counters that are dedicated to cps_max, sps_max and http.rps_max computing. Thanks to that, related *_max counters shouldn't be negatively impacted by the thread-group distribution, yet they will not benefit from it either. Related internal freq counters are prefixed with "_" to emphasize the fact that they should not be used for other purposes (the shared ones, which are about to be distributed over thread groups in upcoming commits, are still available and must be used instead). The internal ones could eventually be removed at any time if we find another way to compute the {cps,sps,http.rps}_max counters.	2025-06-05 09:59:31 +02:00
Aurelien DARRAGON	b72a8bb138	CLEANUP: counters: merge some common counters between {fe,be}_counters_shared Now that we have a common struct between fe and be shared counters struct let's perform some cleanup to merge duplicate members into the common struct part. This will ease code maintenance.	2025-06-05 09:59:24 +02:00
Aurelien DARRAGON	b599138842	MEDIUM: counters: manage shared counters using dedicated helpers proxies, listeners and server shared counters are now managed via helpers added in one of the previous commits. When guid is not set (ie: when not yet assigned), shared counters pointer is allocated using calloc() (local memory) and a flag is set on the shared counters struct to know how to manipulate (and free it). Else if guid is set, then it means that the counters may be shared so while for now we don't actually use a shared memory location the API is ready for that. The way it works, for proxies and servers (for which guid is not known during creation), we first call counters_{fe,be}_shared_get with guid not set, which results in local pointer being retrieved (as if we just manually called calloc() to retrieve a pointer). Later (during postparsing) if guid is set we try to upgrade the pointer from local to shared. Lastly, since the memory location for some objects (proxies and servers counters) may change from creation to postparsing, let's update counters->last_change member directly under counters_{fe,be}_shared_get() so we don't miss it. No change of behavior is expected, this is only preparation work.	2025-06-05 09:59:17 +02:00
Aurelien DARRAGON	c10ce1c85b	MINOR: counters: add common struct and flags to {fe,be}_counters_shared fe_counters_shared and be_counters_shared may share some common members since they are quite similar, so we add a common struct part shared between the two. struct counters_shared is added for convenience as a generic pointer to manipulate common members from fe or be shared counters pointer. Also, the first common member is added: shared fe and be counters now have a flags member.	2025-06-05 09:59:10 +02:00
Aurelien DARRAGON	aa53887398	MINOR: counters: add shared counters helpers to get and drop shared pointers create include/haproxy/counters.h and src/counters.c files to anticipate for further helpers as some counters specific tasks needs to be carried out and since counters are shared between multiple object types (ie: listener, proxy, server..) we need generic helpers. Add some shared counters helper which are not yet used but will be updated in upcoming commits.	2025-06-05 09:59:04 +02:00
Aurelien DARRAGON	a0dcab5c45	MAJOR: counters: add shared counters base infrastructure Shareable counters are now tagged as shared counters and are dynamically allocated in a separate memory area as a prerequisite for being stored in a shared memory area. For now, GUID and thread groups are not taken into account, this is only a first step. Also we ensure all counters are now manipulated using atomic operations, namely, "last_change" counter is now read from and written to using atomic ops. Despite the numerous changes caused by the counters being moved away from counters struct, no change of behavior should be expected.	2025-06-05 09:58:58 +02:00
Aurelien DARRAGON	89b04f2191	CLEANUP: sink: remove useless cleanup in sink_new_from_logger() As reported by Ilya in GH #2994, some cleanup parts in sink_new_from_logger() function are not used. We can actually simplify the cleanup logic to remove dead code, let's do that by renaming "error_final" label to "error" and only making use of the "error" label, because sink_free() already takes care of proper cleanup for all sink members.	2025-06-05 09:58:50 +02:00
Christopher Faulet	8c4bb8cab3	BUG/MINOR: mux-spop: Fix null-pointer deref on SPOP stream allocation failure When we try to allocate a new SPOP stream, if an error is encountered, spop_strm_destroy() is called to release the possibly allocated stream. But, it must only be called if a stream was allocated. If the reported error is an SPOP stream allocation failure, we must just leave to avoid null-pointer dereference. This patch should fix point 1 of the issue #2993. It must be backported as far as 3.1.	2025-06-04 08:48:49 +02:00
Christopher Faulet	6786b05297	DEBUG: check: Add the healthcheck's expiration date in the trace messags It could help to diagnose some issues about timeout processing. So let's add it !	2025-06-03 15:06:12 +02:00
Christopher Faulet	8ee650a88b	CLEANUP: applet: Update comment for applet_put* functions These functions were copied from the channel API and modified to work with applets using the new API or the legacy one. However, the comments were not updated accordingly. It is the purpose of this patch.	2025-06-03 15:03:30 +02:00
Christopher Faulet	7c788f0984	BUG/MEDIUM: check: Requeue healthchecks on I/O events to handle check timeout When a healthcheck is processed, once the first wakeup passed to start the check, and as long as the expiration timer is not reached, only I/O events are able to wake it up. It is an issue when there is a check timeout defined. Especially if the connect timeout is high and the check timeout is low. In that case, the healthcheck's task is never requeued to handle any timeout update. When the connection is established, the check timeout is set to replace the connect timeout. It is thus possible to report a success while a timeout should be reported. So, now, when an I/O event is handled, the healthcheck is requeued, except if a success or an abort is reported. Thanks to Thierry Fournier for the report and the reproducer. This patch must be backported to all stable versions.	2025-06-03 15:03:30 +02:00
Olivier Houchard	913b2d6c83	BUG/MAJOR: leastconn: Protect tree_elt with the lbprm lock In fwlc_srv_reposition(), set the server's tree_elt while we still hold the lbprm read lock. While it was protected from concurrent fwlc_srv_reposition() calls by the server's lb_lock, it was not from dequeuing/requeuing that could occur if the server gets down/up or its weight is changed, and that would lead to inconsistencies, and the watchdog killing the process because it is stuck in an infinite loop in fwlc_get_next_server(). This hopefully fixes github issue #2990. This should be backported to 3.2.	2025-06-03 04:42:47 +02:00
Aurelien DARRAGON	368d01361a	MEDIUM: server: add and use srv_init() function Rename the _srv_postparse() internal function to srv_init() and group srv_init_per_thr() plus the idle conns list init inside it. This way we can perform some simplifications as srv_init() performs multiple server init steps after parsing. The SRV_F_CHECKED flag was added; it is automatically set when srv_init() runs successfully. If the flag is already set and srv_init() is called again, nothing is done. This permits manually calling srv_init() earlier than the default POST_CHECK hook when needed without risking doing things twice.	2025-06-02 17:51:33 +02:00
Aurelien DARRAGON	889ef6f67b	MEDIUM: server: automatically add server to proxy list in new_server() While new_server() takes the parent proxy as argument and even assigns srv->proxy to the parent proxy, it didn't actually insert the server into the parent proxy server list on success. The result is that sometimes we add the server to the list after new_server() is called, and sometimes we don't. This is really error-prone and because of that, hooks such as REGISTER_POST_SERVER_CHECK(), which are run for all servers listed in all proxies, may not be relied upon for servers which are not actually inserted in their parent proxy server list. Plus it feels very strange to have a server that points to a proxy, but then the proxy doesn't know about it because it cannot find it in its server list. To prevent errors and make the proxy->srv list reliable, we move the insertion logic directly under new_server(). This requires knowing whether we are called during parsing or during runtime to either insert or append the server to the parent proxy list. For that we use the PR_FL_CHECKED flag from the parent proxy (if the flag is set, then the proxy was checked so we are past the init phase, thus we assume we are called during runtime). This implies that during startup, if new_server() has to be cancelled on error paths, we need to call srv_detach() (which is now exposed in server.h) before srv_drop(). The consequence of this commit is that REGISTER_POST_SERVER_CHECK() should now run reliably on all servers created using new_server() (without having to manually loop on the global servers_list).	2025-06-02 17:51:30 +02:00
Aurelien DARRAGON	e262e4bbe4	MEDIUM: proxy: use global proxy list for REGISTER_POST_PROXY_CHECK() hook REGISTER_POST_PROXY_CHECK() used to iterate over "main" proxies to run registered callbacks. This means hidden proxies (and their servers) did not get a chance to get post-checked and could cause issues if some post- checks are expected to be executed on all proxies no matter their type. Instead we now rely on the global proxies list. Another side effect is that the REGISTER_POST_SERVER_CHECK() now runs as well for servers from proxies that are not part of the main proxies list.	2025-06-02 17:51:27 +02:00
Aurelien DARRAGON	1f12e45b0a	MINOR: log: only run postcheck_log_backend() checks on backend postcheck_log_backend() checks are executed no matter if the proxy actually has the backend capability while the checks actually depend on this. Let's fix that by adding an extra condition to ensure that the BE capability is set. This issue is not tagged as a bug because for now it remains impossible to have a syslog proxy without BE capability in the main proxy list, but this may change in the future.	2025-06-02 17:51:24 +02:00
Aurelien DARRAGON	943958c3ff	MINOR: proxy: add a true list containing all proxies We have a global proxies_list pointer which is announced as the list of "all existing proxies", but in fact it only represents regular proxies declared in the config file through the "listen", "frontend" or "backend" keywords. It is ambiguous, and we currently don't have a straightforward method to iterate over all proxies (either public or internal ones) within haproxy. Instead we still have to manually iterate over multiple lists (main proxies, log-forward proxies, peer proxies..), which is error-prone. In this patch we add a struct list member (8 bytes) inside struct proxy in order to store every proxy (except default ones) within a global "proxies" list which is actually representative of all proxies existing under the haproxy process, like we already have for servers.	2025-06-02 17:51:21 +02:00
Aurelien DARRAGON	6ccf770fe2	MINOR: proxy: collect per-capability stat in proxy_cond_disable() proxy_cond_disable() collects and prints cumulated connections for be and fe proxies no matter their type. With shared stats it may cause issues because depending on the proxy capabilities only fe or be counters may be allocated. In this patch we add some checks to ensure we only try to read from valid memory locations, else we rely on default values (0).	2025-06-02 17:51:17 +02:00
Aurelien DARRAGON	c7c017ec3c	MINOR: stats: add ME_NEW_COMMON() helper Split ME_NEW_* helper into COMMON part and specific part so it becomes easier to add alternative helpers without code duplication.	2025-06-02 17:51:12 +02:00
Aurelien DARRAGON	d04843167c	MINOR: stats: add stat_col flags Add stat_col flags member to store .generic bit and prepare for upcoming flags. No functional change expected.	2025-06-02 17:51:08 +02:00
Aurelien DARRAGON	f0b40b49b8	MINOR: server: group postinit server tasks under _srv_postparse() init_srv_requeue() and init_srv_slowstart() functions are called after initial server parsing via REGISTER_POST_SERVER_CHECK() hook, and they are also manually called for dynamic server after the server is initialized. This may conflict with _srv_postparse() which is also registered via REGISTER_POST_SERVER_CHECK() and called during dynamic server creation To ensure functions don't conflict with each other, let's ensure they are executed in proper order by calling init_srv_requeue and init_srv_slowstart() from _srv_postparse() which now becomes the parent function for server related postparsing stuff. No change of behavior is expected.	2025-06-02 17:51:05 +02:00
Tim Duesterhus	8ee8b8a04d	REGTESTS: Remove support for REQUIRE_VERSION and REQUIRE_VERSION_BELOW This is no longer used since the migration to the native `haproxy -cc 'version_atleast(X)'` functionality. see 8727614dc4046e91997ecce421bcb6a5537cac93 see 5efc48dcf1b133dd415c759e83b21d52dc303786	2025-06-02 17:37:11 +02:00
Tim Duesterhus	d8951ec70f	REGTESTS: Remove tests with REQUIRE_VERSION_BELOW=2.4 HAProxy 2.4 is the lowest supported version, thus this never matches. see 18cd4746e5aff9da78d16220b0412947ceba24f3	2025-06-02 17:37:07 +02:00
Tim Duesterhus	534b09f2a2	REGTESTS: Remove REQUIRE_VERSION=2.4 from all tests HAProxy 2.4 is the lowest supported version, thus this always matches. see 7aff1bf6b90caadfa95f6b43b526275191991d6f	2025-06-02 17:37:04 +02:00
Tim Duesterhus	239785fd27	REGTESTS: Remove REQUIRE_VERSION=2.3 from all tests HAProxy 2.4 is the lowest supported version, thus this always matches. see 7aff1bf6b90caadfa95f6b43b526275191991d6f	2025-06-02 17:37:00 +02:00
Tim Duesterhus	294c47a5ef	REGTESTS: Do not use REQUIRE_VERSION for HAProxy 2.5+ (5) Introduced in: 25bcdb1d9 BUG/MAJOR: h1: Be stricter on request target validation during message parsing see also: fbbbc33df REGTESTS: Do not use REQUIRE_VERSION for HAProxy 2.5+	2025-06-02 17:36:56 +02:00
Christopher Faulet	8e8cdf114b	DOC: config: Fix a typo in 2.7 (Name format for maps and ACLs) "identified" was used instead of "identifier". May be backported as far as 3.0	2025-06-02 09:19:38 +02:00
Willy Tarreau	b88164d9c0	BUILD: tools: properly define ha_dump_backtrace() to avoid a build warning In resolve_sym_name() we declare a few symbols that we want to be able to resolve. ha_dump_backtrace() was declared with a struct buffer instead of a pointer to such a struct, which has no effect since we only want to get the function's pointer, but produces a build warning with LTO, so let's fix it. This can be backported to 3.0.	2025-05-30 17:15:48 +02:00
Willy Tarreau	9f4cd435d3	[RELEASE] Released version 3.3-dev0 Released version 3.3-dev0 with the following main changes : - MINOR: version: mention that it's development again	2025-05-28 16:46:34 +02:00
Willy Tarreau	8809251ee0	MINOR: version: mention that it's development again This essentially reverts a6458fd4269.	2025-05-28 16:46:15 +02:00
Willy Tarreau	e134140d28	[RELEASE] Released version 3.2.0 Released version 3.2.0 with the following main changes : - MINOR: promex: Add agent check status/code/duration metrics - MINOR: ssl: support strict-sni in ssl-default-bind-options - MINOR: ssl: also provide the "tls-tickets" bind option - MINOR: server: define CLI I/O handler for "add server" - MINOR: server: implement "add server help" - MINOR: server: use stress mode for "add server help" - BUG/MEDIUM: server: fix crash after duplicate GUID insertion - BUG/MEDIUM: server: fix potential null-deref after previous fix - MINOR: config: list recently added sections with -dKcfg - BUG/MAJOR: cache: Crash because of wrong cache entry deleted - DOC: configuration: fix the example in crt-store - DOC: config: clarify the wording around single/double quotes - DOC: config: clarify the legacy cookie and header captures - DOC: config: fix alphabetical ordering of layer 7 sample fetch functions - DOC: config: fix alphabetical ordering of layer 6 sample fetch functions - DOC: config: fix alphabetical ordering of layer 5 sample fetch functions - DOC: config: fix alphabetical ordering of layer 4 sample fetch functions - DOC: config: fix alphabetical ordering of internal sample fetch functions - BUG/MINOR: h3: Set HTX flags corresponding to the scheme found in the request - BUG/MEDIUM: h3: Declare absolute URI as normalized when a :authority is found - DOC: config: mention in bytes_in and bytes_out that they're read on input - DOC: config: clarify the basics of ACLs (call point, multi-valued etc) - REGTESTS: Make the script testing conditional set-var compatible with Vtest2 - REGTESTS: Explicitly allow failing shell commands in some scripts - MINOR: listeners: Add support for a label on bind line - BUG/MEDIUM: cli/ring: Properly handle shutdown in "show event" I/O handler - BUG/MEDIUM: hlua: Properly detect shudowns for TCP applets based on the new API - BUG/MEDIUM: hlua: Fix getline() for TCP applets to work with applet's 
buffers - BUG/MEDIUM: hlua: Fix receive API for TCP applets to properly handle shutdowns - CI: vtest: Rely on VTest2 to run regression tests - CI: vtest: Fix the build script to properly work on MaOS - CI: combine AWS-LC and AWS-LC-FIPS by template - BUG/MEDIUM: httpclient: Throw an error if an lua httpclient instance is reused - DOC: hlua: Add a note to warn user about httpclient object reuse - DOC: hlua: fix a few typos in HTTPMessage.set_body_len() documentation - DEV: patchbot: prepare for new version 3.3-dev - MINOR: version: mention that it's 3.2 LTS now.	2025-05-28 16:35:14 +02:00
Willy Tarreau	a6458fd426	MINOR: version: mention that it's 3.2 LTS now. The version will be maintained up to around Q2 2030. Let's also update the INSTALL file to mention this.	2025-05-28 16:31:27 +02:00
Willy Tarreau	2502435eb3	DEV: patchbot: prepare for new version 3.3-dev The bot will now load the prompt for the upcoming 3.2 version so we have to rename the files and update their contents to match the current version.	2025-05-28 16:23:12 +02:00
Willy Tarreau	21ce685fcd	DOC: hlua: fix a few typos in HTTPMessage.set_body_len() documentation A few typos were noticed while gathering info for the 3.2 announce messages, this fixes them, and will probably constitute the last commit of this release. There's no need to backport it unless commit 94055a5e7 ("MEDIUM: hlua: Add function to change the body length of an HTTP Message") is backported.	2025-05-27 19:33:49 +02:00
Christopher Faulet	cb7a2444d1	DOC: hlua: Add a note to warn user about httpclient object reuse Reusing a lua httpclient instance to process several requests is not supported. A new object must be created for each request. Thanks to the previous patch ("BUG/MEDIUM: httpclient: Throw an error if an lua httpclient instance is reused"), an error is now reported if this happens. But it is not obvious for users. So the lua-api documentation was updated accordingly. This patch is related to issue #2986. It should be backported with the commit above.	2025-05-27 18:48:23 +02:00
Christopher Faulet	50fca6f0b7	BUG/MEDIUM: httpclient: Throw an error if an lua httpclient instance is reused It is not expected/supported to reuse an httpclient instance to process several requests. A new instance must be created for each request. However, in lua, there is nothing to prevent a user from creating an httpclient object and using it in a loop to process requests. That's unfortunate because this will apparently work: the requests will be sent and a response will be received and processed. However, internally some resources will be allocated and never released. When the next response is processed, the resources allocated for the previous one are definitively lost. In this patch we take care to check that the httpclient object was never used when a request is sent from a lua script by checking the HTTPCLIENT_FS_STARTED flag. This flag is set when an httpclient applet is spawned to process a request and is never removed after that. In lua, the httpclient applet is created when the request is sent. So, it is the right place to do this test. This patch should fix issue #2986. It should be backported as far as 2.6.	2025-05-27 18:47:24 +02:00
Ilya Shipitsin	94ded5523f	CI: combine AWS-LC and AWS-LC-FIPS by template let's reduce code duplication by involving workflow templates	2025-05-27 15:06:58 +02:00
Christopher Faulet	508e074a32	CI: vtest: Fix the build script to properly work on MaOS "config.h" header file is new in VTest2 and includes must be adapted to be able to build VTest on MacOS. Let's add "-I." to make it work.	2025-05-27 14:48:53 +02:00
Christopher Faulet	6a18d28ba2	CI: vtest: Rely on VTest2 to run regression tests VTest2 (https://github.com/vtest/VTest2) was released and is a replacement for VTest. VTest was archived. So let's use the new version now. If this commit is backported, the 2 following commits must also be backported: * 2808e3577 ("REGTESTS: Explicitly allow failing shell commands in some scripts") * 82c291124 ("REGTESTS: Make the script testing conditional set-var compatible with Vtest2")	2025-05-27 14:38:46 +02:00
Christopher Faulet	bc4c3c7969	BUG/MEDIUM: hlua: Fix receive API for TCP applets to properly handle shutdowns An optional timeout was added to AppletTCP.receive() to interrupt calls after a delay. It was mandatory to be able to implement interactive applets (like trisdemo). However, this broke the API and made it impossible to differentiate shutdowns from delay expirations. Indeed, in both cases, an empty string was returned. Because historically an empty string was used to notify a connection shutdown, it should not be changed. So now, a 'nil' value is returned when no data was available before the delay expiration. The new AppletTCP:try_receive() function was also affected. To fix it, instead of stating there is no delay when a receive is tried, an expired delay is set. Concretely, TICK_ETERNITY was replaced by now_ms. Finally, the AppletTCP:getline() function is not concerned for now because there is no way to interrupt it after some delay. The documentation and trisdemo lua script were updated accordingly. This patch depends on "BUG/MEDIUM: hlua: Properly detect shudowns for TCP applets based on the new API". However, it is a 3.2-specific issue, so no backport is needed.	2025-05-27 07:53:19 +02:00
Christopher Faulet	c0ecef71d7	BUG/MEDIUM: hlua: Fix getline() for TCP applets to work with applet's buffers The commit e5e36ce09 ("BUG/MEDIUM: hlua/cli: Fix lua CLI commands to work with applet's buffers") fixed the TCP applets API to work with applets using their own buffers. However, the getline() function was not updated. It could be an issue for anyone registering CLI commands reading lines. This patch should be backported as far as 3.0.	2025-05-27 07:53:01 +02:00
Christopher Faulet	c64781c2c8	BUG/MEDIUM: hlua: Properly detect shudowns for TCP applets based on the new API The internal function responsible to receive data for TCP applets with internal buffers is buggy. Indeed, for these applets, the buffer API is used to get data. So there is no tests on the SE to properly detect connection shutdowns. So, it must be performed by hand after the call to b_getblk_nc(). This patch must be backported as far as 3.0.	2025-05-26 19:00:00 +02:00
Christopher Faulet	4d4da515f2	BUG/MEDIUM: cli/ring: Properly handle shutdown in "show event" I/O handler The commit 03dc54d802 ("BUG/MINOR: ring: Fix I/O handler of "show event" command to not rely on the SC") introduced a regression. By removing dependencies on the SC, a test to detect client shutdowns was removed. So now, the CLI applet is no longer released when the client shuts the connection during a "show event -w". Of course, we should not use the SC to detect the shutdowns; the SE must be used instead. It is a 3.2-specific issue, so no backport is needed.	2025-05-26 19:00:00 +02:00
Christopher Faulet	99e755d673	MINOR: listeners: Add support for a label on bind line It is now possible to set a label on a bind line. All sockets attached to this bind line inherit this label. The idea is to be able to group sockets. For now, there is no mechanism to create these groups, this must be done by hand.	2025-05-26 19:00:00 +02:00
Christopher Faulet	2808e3577f	REGTESTS: Explicitly allow failing shell commands in some scripts Vtest2, which should replace Vtest in a few months, will reject any failing commands in shell blocks. However, some scripts execute commands that are expected to fail in order to parse the error output. So, now use "set +e" in those scripts to explicitly state that failing commands are expected. It is only used for non-final commands. At the end, the shell block must still report a success.	2025-05-26 19:00:00 +02:00
Christopher Faulet	82c2911248	REGTESTS: Make the script testing conditional set-var compatible with Vtest2 VTest2 will replace VTest in a few months. Not many changes are expected. One of them is that a User-Agent header is added by default in all requests, except if a custom one is already set or if the "-nouseragent" option is used. To remain compatible with VTest, it is not possible to use the option to avoid the header addition. So, a custom user-agent is added in the last test of "sample_fetches/cond_set_var.vtc" to be sure it will pass with Vtest and Vtest2. It is mandatory because the request length is tested.	2025-05-26 19:00:00 +02:00
Willy Tarreau	5b937b7a97	DOC: config: clarify the basics of ACLs (call point, multi-valued etc) This is essentially in order to address the concerns expressed in issue #2226 where it is mentioned that the moment they are called is not clear enough. Admittedly, re-reading the paragraph doesn't make it obvious on a quick read that they behave like functions. This patch adds an extra paragraph that makes the parallel with programming languages' boolean functions and explains the fact that they can be multi-valued. Hoping this is clearer now.	2025-05-26 16:25:22 +02:00
Willy Tarreau	ef9511be90	DOC: config: mention in bytes_in and bytes_out that they're read on input Issue #2267 suggests that it's unclear what exactly the byte counts mean (particularly when compression is involved). Let's clarify that the counts are read on data input and that they also cover headers and a bit of internal overhead.	2025-05-26 15:54:36 +02:00
Christopher Faulet	e70c23e517	BUG/MEDIUM: h3: Declare absolute URI as normalized when a :authority is found Since commit 2c3d656f8 ("MEDIUM: h3: use absolute URI form with :authority"), the absolute URI form is used when a ':authority' pseudo-header is found. However, this URI was not declared as normalized internally. So, when the request is reformatted to be sent to an h1 server, the absolute-form is used instead of the origin-form. It is unexpected and may be an issue for some servers that could reject the request. So, now, we take care to set the HTX_SL_F_HAS_AUTHORITY flag on the HTX message when an authority was found, and the HTX_SL_F_NORMALIZED_URI flag is set for "http" or "https" schemes. No backport needed because the commit above must not be backported. It should fix a regression reported on 3.2-dev17 in issue #2977. This commit depends on "BUG/MINOR: h3: Set HTX flags corresponding to the scheme found in the request".	2025-05-26 11:47:23 +02:00
Christopher Faulet	da9792cca8	BUG/MINOR: h3: Set HTX flags corresponding to the scheme found in the request When a ":scheme" pseudo-header is found in a h3 request, the HTX_SL_F_HAS_SCHM flag must be set on the HTX message. And if the scheme is 'http' or 'https', the corresponding HTX flag must also be set. So, respectively, HTX_SL_F_SCHM_HTTP or HTX_SL_F_SCHM_HTTPS. It is mainly used to send the right ":scheme" pseudo-header value to H2 server on backend side. This patch could be backported as far as 2.6.	2025-05-26 11:38:29 +02:00
Willy Tarreau	083708daf8	DOC: config: fix alphabetical ordering of internal sample fetch functions Some misordering has been accumulating over time, making some of them hard to spot. Also "uptime" was not indexed.	2025-05-26 09:36:23 +02:00
Willy Tarreau	52c2247d90	DOC: config: fix alphabetical ordering of layer 4 sample fetch functions Some misordering has been accumulating over time, making some of them hard to spot.	2025-05-26 09:33:17 +02:00
Willy Tarreau	770098f5e3	DOC: config: fix alphabetical ordering of layer 5 sample fetch functions Some misordering has been accumulating over time, making some of them hard to spot.	2025-05-26 09:26:11 +02:00
Willy Tarreau	5261e35b8f	DOC: config: fix alphabetical ordering of layer 6 sample fetch functions Some misordering has been accumulating over time, making some of them hard to spot.	2025-05-26 09:26:11 +02:00
Willy Tarreau	e9248243e9	DOC: config: fix alphabetical ordering of layer 7 sample fetch functions Some misordering has been accumulating over time, making some of them hard to spot.	2025-05-26 09:26:11 +02:00
Willy Tarreau	38456f63a3	DOC: config: clarify the legacy cookie and header captures As reported in issue #2195, cookie captures and header captures are no longer the recommended way to proceed. Let's mention that this is the legacy way and provide a few pointers to the recommended functions and actions to use the modern methods.	2025-05-26 08:56:33 +02:00
Willy Tarreau	da8d6d1b2c	DOC: config: clarify the wording around single/double quotes As reported in issue #2327, the wording used in the section about quoting can be read two ways due to the use of the two types of quotes to protect each other quote. Better only use the quoting without mixing the two when mentioning them.	2025-05-26 08:36:33 +02:00
William Lallemand	d607940915	DOC: configuration: fix the example in crt-store Fix a bad example in the crt-store section. site1 does not use the "web" crt-store but the global one. Must be backported as far as 3.0 however the section was 3.12 in previous version.	2025-05-25 16:55:08 +02:00
Remi Tricot-Le Breton	90441e9bfe	BUG/MAJOR: cache: Crash because of wrong cache entry deleted When "vary" is enabled, we can have multiple entries for a given primary key in the cache tree. There is a limit to how many secondary entries can be inserted for a given key. When we try to insert a new secondary entry, if the limit is already reached, we can try to find expired entries with the same primary key, and if the limit is still reached we want to abort the current insertion and to remove the node that was just inserted. In commit "a29b073: MEDIUM: cache: Add refcount on cache_entry" though, a regression was introduced. Instead of removing the entry just inserted as the comments suggested, we removed the second to last entry and returned NULL. We then reset the eb.key of the cache_entry in the caller because we assumed that the entry was already removed from the tree. This means that some entries with an empty key were wrongly kept in the tree and the last secondary entry, which keeps the number of secondary entries of a given key was removed. This ended up causing some crashes later on when we tried to iterate over the elements of this given key. The crash could occur in multiple places, either when trying to retrieve an entry or to add some new ones. This crash was raised in GitHub issue #2950. The fix should be backported up to 3.0.	2025-05-23 22:38:54 +02:00
Willy Tarreau	84ffb3d0a9	MINOR: config: list recently added sections with -dKcfg Newly added sections (crt-store, traces, acme) were not listed in -dKcfg, let's add them. For now they have to be manually enumerated.	2025-05-23 10:49:33 +02:00
Willy Tarreau	28c7a22790	BUG/MEDIUM: server: fix potential null-deref after previous fix A valid build warning was reported in the CI with latest commit b40ce97ecc ("BUG/MEDIUM: server: fix crash after duplicate GUID insertion"). Indeed, if the first test in the function fails, we branch to the err label with guid==NULL and will crash there. Let's just test guid before dereferencing it for freeing. This needs to be backported to 3.0 as well since the commit above was meant to go there.	2025-05-22 18:09:12 +02:00
Amaury Denoyelle	b40ce97ecc	BUG/MEDIUM: server: fix crash after duplicate GUID insertion On "add server", if a GUID is defined, guid_insert() is used to add the entry into the global GUID tree. If a similar entry already exists, GUID insertion fails and the server creation is eventually aborted. A crash could occur in this case because of an invalid memory access via guid_remove(). The latter is called via free_server() as the server insertion is rejected. The invalid access occurs on the GUID key. The issue occurs because of guid_insert(). The function properly deallocates the GUID key on duplicate insertion, but it failed to reset <guid.node.key> to NULL. This caused the invalid memory access on guid_remove(). To fix this, ensure that the key member is properly reset on the guid_insert() error path. This must be backported up to 3.0.	2025-05-22 17:59:37 +02:00
Amaury Denoyelle	5e088e3f8e	MINOR: server: use stress mode for "add server help" Implement stress mode on "add server help". This ensures that the command is fully reentrant on full output buffer. For testing, it requires compilation with USE_STRESS and global setting "stress-level 1".	2025-05-22 17:40:05 +02:00
Amaury Denoyelle	4de5090976	MINOR: server: implement "add server help" Implement "help" as a sub-command for "add server" CLI. The objective is to list all the keywords that are supported for dynamic servers. CLI IO handler and add_srv_ctx are used to support reentrancy on full output buffer. Now that this command is implemented, the outdated keyword list on "add server" from management documentation can be removed.	2025-05-22 17:40:05 +02:00
Amaury Denoyelle	2570892c41	MINOR: server: define CLI I/O handler for "add server" Extend "add server" to support an IO handler function named cli_io_handler_add_server(). A context object is also defined whose usage will depend on IO handler capabilities. IO handler is skipped when "add server" is run in default mode, i.e. on a dynamic server creation. Thus, currently IO handler is unneeded. However, it will become useful to support sub-commands for "add server". Note that return value of "add server" parser has been changed on server creation success. Previously, it was used incorrectly to report if server was inserted or not. In fact, parser return value is used by CLI generic code to detect if command processing has been completed, or should continue to the IO handler. Now, "add server" always returns 1 to signal that CLI processing is completed. This is necessary to preserve CLI output emitted by parser, even now that IO handler is defined for the command. Previously, output was emitted in every situations due to IO handler not defined. See below code snippet from cli.c for a better overview : if (kw->parse && kw->parse(args, payload, appctx, kw->private) != 0) { ret = 1; goto fail; } /* kw->parse could set its own io_handler or io_release handler */ if (!appctx->cli_ctx.io_handler) { ret = 1; goto fail; } appctx->st0 = CLI_ST_CALLBACK; ret = 1; goto end;	2025-05-22 17:40:05 +02:00
Willy Tarreau	1c0f2e62ad	MINOR: ssl: also provide the "tls-tickets" bind option Currently there is "no-tls-tickets" that is also supported in the ssl-default-bind-options directive, but there's no way to re-enable them on a specific "bind" line. This patch simply provides the option to re-enable them. Note that the flag is inverted because tickets are enabled by default and the no-tls-ticket option sets the flag to disable them.	2025-05-22 15:31:54 +02:00
Willy Tarreau	3494775a1f	MINOR: ssl: support strict-sni in ssl-default-bind-options Several users already reported that it would be nice to support strict-sni in ssl-default-bind-options. However, in order to support it, we also need an option to disable it. This patch moves the setting of the option from the strict_sni field to a flag in the ssl_options field so that it can be inherited from the default bind options, and adds a new "no-strict-sni" directive to allow to disable it on a specific "bind" line. The test file "del_ssl_crt-list.vtc" which already tests both options was updated to make use of the default option and the no- variant to confirm everything continues to work.	2025-05-22 15:31:54 +02:00
Christopher Faulet	7244f16ac4	MINOR: promex: Add agent check status/code/duration metrics In the Prometheus exporter, the last health check status is already exposed, with its code and duration in seconds. The server status is also exposed. But the information about the agent check is not available. It is not really handy because when a server status is changed because of the agent, it is not obvious by looking at the Prometheus metrics. Indeed, the server may be reported as DOWN for instance, while the health check status still reports a success. Being able to get the agent status in that case could be valuable. So now, the last agent check status is exposed, with its code and duration in seconds. The following metrics can be grabbed now: * haproxy_server_agent_status * haproxy_server_agent_code * haproxy_server_agent_duration_seconds Note that unlike the other metrics, no per-backend aggregated metric is exposed. This patch is related to issue #2983.	2025-05-22 09:50:10 +02:00
Willy Tarreau	0ac41ff97e	[RELEASE] Released version 3.2-dev17 Released version 3.2-dev17 with the following main changes : - DOC: configuration: explicit multi-choice on bind shards option - BUG/MINOR: sink: detect and warn when using "send-proxy" options with ring servers - BUG/MEDIUM: peers: also limit the number of incoming updates - MEDIUM: hlua: Add function to change the body length of an HTTP Message - BUG/MEDIUM: stconn: Disable 0-copy forwarding for filters altering the payload - BUG/MINOR: h3: don't insert more than one Host header - BUG/MEDIUM: h1/h2/h3: reject forbidden chars in the Host header field - DOC: config: properly index "table and "stick-table" in their section - DOC: management: change reference to configuration manual - BUILD: debug: mark ha_crash_now() as attribute(noreturn) - IMPORT: slz: avoid multiple shifts on 64-bits - IMPORT: slz: support crc32c for lookup hash on sse4 but only if requested - IMPORT: slz: use a better hash for machines with a fast multiply - IMPORT: slz: fix header used for empty zlib message - IMPORT: slz: silence a build warning on non-x86 non-arm - BUG/MAJOR: leastconn: do not loop forever when facing saturated servers - BUG/MAJOR: queue: properly keep count of the queue length - BUG/MINOR: quic: fix crash on quic_conn alloc failure - BUG/MAJOR: leastconn: never reuse the node after dropping the lock - MINOR: acme: renewal notification over the dpapi sink - CLEANUP: quic: Useless BIO_METHOD initialization - MINOR: quic: Add useful error traces about qc_ssl_sess_init() failures - MINOR: quic: Allow the use of the new OpenSSL 3.5.0 QUIC TLS API (to be completed) - MINOR: quic: implement all remaining callbacks for OpenSSL 3.5 QUIC API - MINOR: quic: OpenSSL 3.5 internal QUIC custom extension for transport parameters reset - MINOR: quic: OpenSSL 3.5 trick to support 0-RTT - DOC: update INSTALL for QUIC with OpenSSL 3.5 usages - DOC: management: update 'acme status' - BUG/MEDIUM: wdt: always ignore the first watchdog 
wakeup - CLEANUP: wdt: clarify the comments on the common exit path - BUILD: ssl: avoid possible printf format warning in traces - BUILD: acme: fix build issue on 32-bit archs with 64-bit time_t - DOC: management: precise some of the fields of "show servers conn" - BUG/MEDIUM: mux-quic: fix BUG_ON() on rxbuf alloc error - DOC: watchdog: update the doc to reflect the recent changes - BUG/MEDIUM: acme: check if acme domains are configured - BUG/MINOR: acme: fix formatting issue in error and logs - EXAMPLES: lua: avoid screen refresh effect in "trisdemo" - CLEANUP: quic: remove unused cbuf module - MINOR: quic: move function to check stream type in utils - MINOR: quic: refactor handling of streams after MUX release - MINOR: quic: add some missing includes - MINOR: quic: adjust quic_conn-t.h include list - CLEANUP: cfgparse: alphabetically sort the global keywords - MINOR: glitches: add global setting "tune.glitches.kill.cpu-usage"	2025-05-21 15:56:06 +02:00
Willy Tarreau	a1577a89a0	MINOR: glitches: add global setting "tune.glitches.kill.cpu-usage" It was mentioned during the development of glitches that it would be nice to support not killing misbehaving connections below a certain CPU usage so that poor implementations that routinely misbehave without impact are not killed. This is now possible by setting a CPU usage threshold under which we don't kill them via this parameter. It defaults to zero so that we continue to kill them by default.	2025-05-21 15:47:42 +02:00
Willy Tarreau	eee57b4d3f	CLEANUP: cfgparse: alphabetically sort the global keywords The global keywords table was no longer sorted at all, let's fix it to ease spotting the searched ones.	2025-05-21 15:47:42 +02:00
Amaury Denoyelle	00d90e8839	MINOR: quic: adjust quic_conn-t.h include list Adjust include list in quic_conn-t.h. This file is included in many QUIC sources, so it is useful to keep it as lightweight as possible. Note that connection/QUIC MUX are transformed into forward declarations for better layer separation.	2025-05-21 14:44:27 +02:00
Amaury Denoyelle	01e3b2119a	MINOR: quic: add some missing includes Insert some missing include statements in QUIC source files. This was detected after the next commit, which adjusts the include list used in the quic_conn-t.h file.	2025-05-21 14:44:27 +02:00
Amaury Denoyelle	f286288471	MINOR: quic: refactor handling of streams after MUX release quic-conn layer has to handle itself STREAM frames after MUX release. If the stream was already seen, it is probably only a retransmitted frame which can be safely ignored. For other streams, an active closure may be needed. Thus it's necessary that quic-conn layer knows the highest stream ID already handled by the MUX after its release. Previously, this was done via <nb_streams> member array in quic-conn structure. Refactor this by replacing <nb_streams> by two members called <stream_max_uni>/<stream_max_bidi>. Indeed, it is unnecessary for quic-conn layer to monitor locally opened uni streams, as the peer cannot by definition emit a STREAM frame on it. Also, bidirectional streams are always opened by the remote side. Previously, <nb_streams> were set by quic-stream layer. Now, <stream_max_uni>/<stream_max_bidi> members are only set one time, just prior to QUIC MUX release. This is sufficient as quic-conn do not use them if the MUX is available. Note that previously, IDs were used relatively to their type, thus incremented by 1, after shifting the original value. For simplification, use the plain stream ID, which is incremented by 4.	2025-05-21 14:26:45 +02:00
Amaury Denoyelle	07d41a043c	MINOR: quic: move function to check stream type in utils Move general function to check if a stream is uni or bidirectional from QUIC MUX to quic_utils module. This should prevent unnecessary include of QUIC MUX header file in other sources.	2025-05-21 14:17:41 +02:00
Amaury Denoyelle	cf45bf1ad8	CLEANUP: quic: remove unused cbuf module Cbuf are not used anymore. Remove the related source and header files, as well as include statements in the rest of QUIC source files.	2025-05-21 14:16:37 +02:00
Baptiste Assmann	b437094853	EXAMPLES: lua: avoid screen refresh effect in "trisdemo" In current version of the game, there is a "screen refresh" effect: the screen is cleared before being re-drawn. I moved the clear right after the connection is opened and removed it from rendering time.	2025-05-21 12:00:53 +02:00
William Lallemand	8b121ab6f7	BUG/MINOR: acme: fix formatting issue in error and logs Stop emitting \n in errmsg for intermediate error messages, this was emitting multiline logs and was returning to a new line in the middle of sentences. We don't need to emit them in acme_start_task() since the errmsg is output in a send_log which already contains a \n or on the CLI which also emits it.	2025-05-21 11:41:28 +02:00
William Lallemand	156f4bd7a6	BUG/MEDIUM: acme: check if acme domains are configured When starting the ACME task with a ckch_conf which does not contain the domains, the ACME task would segfault because it will try to dereference a NULL in this case. The patch fixes the issue by emitting a warning when no domains are configured. It's not done at configuration parsing because it is not easy to emit the warning there: there is no callback system which gives access to the whole ckch_conf once a line is parsed. No backport needed.	2025-05-21 11:41:28 +02:00
Willy Tarreau	f5ed309449	DOC: watchdog: update the doc to reflect the recent changes The watchdog was improved and fixed a few months ago, but the doc had not been updated to reflect this. That's now done.	2025-05-21 11:34:55 +02:00
Amaury Denoyelle	e399daa67e	BUG/MEDIUM: mux-quic: fix BUG_ON() on rxbuf alloc error RX buffer allocation has been reworked in current dev tree. The objective is to support multiple buffers per QCS to improve upload throughput. RX buffer allocation failure is handled simply : the whole connection is closed. This is done via qcc_set_error(), with INTERNAL_ERROR as error code. This function contains a BUG_ON() to ensure it is called only one time per connection instance. On RX buffer alloc failure, the aforementioned BUG_ON() crashes due to a double invocation of qcc_set_error(). First by qcs_get_rxbuf(), and immediately after it by qcc_recv(), which is the caller of the previous one. This regression was introduced by the following commit. 60f64449fbba7bb6e351e8343741bb3c960a2e6d MAJOR: mux-quic: support multiple QCS RX buffers To fix this, simply remove qcc_set_error() invocation in qcs_get_rxbuf(). On buffer alloc failure, qcc_recv() is responsible for setting the error. This does not need to be backported.	2025-05-21 11:33:00 +02:00
Willy Tarreau	5c628d4e09	DOC: management: precise some of the fields of "show servers conn" As reported in issue #2970, the output of "show servers conn" is not clear. It was essentially meant as a debugging tool during some changes to idle connections management, but if some users want to monitor or graph them, more info is needed. The doc mentions the currently known list of fields, and reminds that this output is not meant to be stable over time, but as long as it does not change, it can provide some useful metrics to some users.	2025-05-21 10:45:07 +02:00
Willy Tarreau	4b52d5e406	BUILD: acme: fix build issue on 32-bit archs with 64-bit time_t The build failed on mips32 with a 64-bit time_t here: https://github.com/haproxy/haproxy/actions/runs/15150389164/job/42595310111 Let's just turn the "remain" variable used to show the remaining time into a more portable ullong and use %llu for all format specifiers, since long remains limited to 32-bit on 32-bit archs. No backport needed.	2025-05-21 10:18:47 +02:00
Willy Tarreau	09d4c9519e	BUILD: ssl: avoid possible printf format warning in traces When building on MIPS-32 with gcc-9.5 and glibc-2.31, I got this: src/ssl_trace.c: In function 'ssl_trace': src/ssl_trace.c:118:42: warning: format '%ld' expects argument of type 'long int', but argument 3 has type 'ssize_t' {aka 'const int'} [-Wformat=] 118 \| chunk_appendf(&trace_buf, " : size=%ld", *size); \| ~~^ ~~~~~ \| \| \| \| \| ssize_t {aka const int} \| long int \| %d Let's just cast the type. No backport needed.	2025-05-21 10:01:14 +02:00
Willy Tarreau	3b2fb5cc15	CLEANUP: wdt: clarify the comments on the common exit path The condition in which we reach the check for ha_panic() and ha_stuck_warning() are not super clear, let's reformulate them.	2025-05-20 16:37:06 +02:00
Willy Tarreau	0a8bfb5b90	BUG/MEDIUM: wdt: always ignore the first watchdog wakeup With commit a06c215f08 ("MEDIUM: wdt: always make the faulty thread report its own warnings"), when the TH_FL_STUCK flag was flipped on, we'd then go to the panic code instead of giving a second chance like before the commit. This can trigger rare cases that only happen with moderate loads like was addressed by commit 24ce001771 ("BUG/MEDIUM: wdt: fix the stuck detection for warnings"). This is in fact due to the loss of the common "goto update_and_leave" that used to serve both the warning code and the flag setting for probation, and it's apparently what hit Christian in issue #2980. Let's make sure we exit naturally when turning the bit on for the first time. Let's also update the confusing comment at the end of the check that was left over by latest change. Since the first commit was backported to 3.1, this commit should be backported there as well.	2025-05-20 16:37:03 +02:00
William Lallemand	dcdf27af70	DOC: management: update 'acme status' Update the 'acme status' section with the "Stopped" status and fix the description.	2025-05-20 16:08:57 +02:00
Frederic Lecaille	bbe302087c	DOC: update INSTALL for QUIC with OpenSSL 3.5 usages Update the QUIC sections which mention the OpenSSL library use cases.	2025-05-20 15:00:06 +02:00
Frederic Lecaille	08eee0d9cf	MINOR: quic: OpenSSL 3.5 trick to support 0-RTT For an unidentified reason, SSL_do_handshake() succeeds at its first call when 0-RTT is enabled for the connection. This behavior looks very similar to the one encountered with the AWS-LC stack. That said, it was documented by AWS-LC. This issue leads the connection to stop sending handshake packets after having released the handshake encryption level. In fact, no handshake packets could even be sent, leading the handshake to always fail. To fix this, this patch simulates a "handshake in progress" state waiting for the application level read secret to be established by the TLS stack. This may happen only after the QUIC listener has completed/confirmed the handshake upon handshake CRYPTO data receipt from the peer.	2025-05-20 15:00:06 +02:00
Frederic Lecaille	849a3af14e	MINOR: quic: OpenSSL 3.5 internal QUIC custom extension for transport parameters reset A QUIC endpoint must send its transport parameters using a TLS custom extension. This extension is reset by SSL_set_SSL_CTX(). It can be restored by calling quic_ssl_set_tls_cbs() (which calls SSL_set_quic_tls_cbs()).	2025-05-20 15:00:06 +02:00
Frederic Lecaille	b3ac1a636c	MINOR: quic: implement all remaining callbacks for OpenSSL 3.5 QUIC API The quic_conn struct is modified for two reasons. The first one is to store the encoded version of the local transport parameters as this is done for USE_QUIC_OPENSSL_COMPAT. Indeed, the local transport parameters "should remain valid until after the parameters have been sent" as mentioned by the SSL_set_quic_tls_cbs(3) manual. In our case, the buffer is a static buffer attached to the quic_conn object. The role of the qc_ssl_set_quic_transport_params() function is to call SSL_set_tls_quic_transport_params() (aliased by SSL_set_quic_transport_params()) to set these local transport parameters into the TLS stack from the buffer attached to the quic_conn struct. The second quic_conn struct modification is the addition of the new ->prot_level (SSL protection level) member to store "the most recent write encryption level set via the OSSL_FUNC_SSL_QUIC_TLS_yield_secret_fn callback (if it has been called)" as mentioned by the SSL_set_quic_tls_cbs(3) manual. This patch finally implements the five remaining callbacks to make the haproxy QUIC implementation work. OSSL_FUNC_SSL_QUIC_TLS_crypto_send_fn() (ha_quic_ossl_crypto_send) is easy to implement. It calls ha_quic_add_handshake_data() after having converted qc->prot_level TLS protection level value to the correct ssl_encryption_level_t (boringSSL API/quictls) value. OSSL_FUNC_SSL_QUIC_TLS_crypto_recv_rcd_fn() (ha_quic_ossl_crypto_recv_rcd()) provides the non-contiguous addresses to the TLS stack, without releasing them. OSSL_FUNC_SSL_QUIC_TLS_crypto_release_rcd_fn() (ha_quic_ossl_crypto_release_rcd()) releases these non-contiguous buffers relying on the fact that the list of encryption levels (qc->qel_list) is correctly ordered by SSL protection level secret establishment order (by the TLS stack). 
OSSL_FUNC_SSL_QUIC_TLS_yield_secret_fn() (ha_quic_ossl_yield_secret()) is a simple wrapping function over ha_quic_set_encryption_secrets() which is used by boringSSL/quictls API. OSSL_FUNC_SSL_QUIC_TLS_got_transport_params_fn() (ha_quic_ossl_got_transport_params()) role is to store the peer received transport parameters. It simply calls quic_transport_params_store() and sets them into the TLS stack by calling qc_ssl_set_quic_transport_params(). Also add some comments for all the OpenSSL 3.5 QUIC API callbacks. This patch has no impact on the other uses of the QUIC API provided by the other TLS stacks.	2025-05-20 15:00:06 +02:00
Frederic Lecaille	dc6a3c329a	MINOR: quic: Allow the use of the new OpenSSL 3.5.0 QUIC TLS API (to be completed) This patch allows the use of the new OpenSSL 3.5.0 QUIC TLS API when it is available and detected at compilation time. The detection relies on the presence of the OSSL_FUNC_SSL_QUIC_TLS_CRYPTO_SEND macro from openssl-compat.h. Indeed this macro is defined by OpenSSL since 3.5.0 version. It is not defined by quictls. This helps in distinguishing these two TLS stacks. When the detection succeeds, HAVE_OPENSSL_QUIC is also defined by openssl-compat.h. It is then this new macro which is used to detect the availability of the new OpenSSL 3.5.0 QUIC TLS API. Note that this detection is done only if USE_QUIC_OPENSSL_COMPAT is not requested. So, USE_QUIC_OPENSSL_COMPAT and HAVE_OPENSSL_QUIC are exclusive. At the same location, from openssl-compat.h, ssl_encryption_level_t enum is defined. This enum was defined by quictls and extensively used by the haproxy QUIC implementation. SSL_set_quic_transport_params() is replaced by SSL_set_quic_tls_transport_params. SSL_set_quic_early_data_enabled() (quictls) is also replaced by SSL_set_quic_tls_early_data_enabled() (OpenSSL). SSL_quic_read_level() (quictls) is not defined by OpenSSL. It is only used by the traces to log the current TLS stack decryption level (read). A macro makes it return -1 which is an unused value. Most of the differences between the quictls and OpenSSL QUIC APIs are in quic_ssl.c where some callbacks must be defined for these two APIs. This is why this patch modifies quic_ssl.c to define an array of OSSL_DISPATCH structs: <ha_quic_dispatch>. Each element of this array defines a callback. So, this patch implements these six callbacks: - ha_quic_ossl_crypto_send() - ha_quic_ossl_crypto_recv_rcd() - ha_quic_ossl_crypto_release_rcd() - ha_quic_ossl_yield_secret() - ha_quic_ossl_got_transport_params() and - ha_quic_ossl_alert(). 
But at this time, these implementations which must return an int return 0 interpreted as a failure by the OpenSSL QUIC API, except for ha_quic_ossl_alert() which is implemented the same way as for quictls. The five remaining functions above will be implemented by the next patches to come. ha_quic_set_encryption_secrets() and ha_quic_add_handshake_data() have been moved to be defined for both quictls and OpenSSL QUIC API. These callbacks are attached to the SSL objects (sessions) by calling the new qc_ssl_set_cbs() function. The latter calls the correct function to attach the correct callbacks to the SSL objects (defined by <ha_quic_method> for quictls, and <ha_quic_dispatch> for OpenSSL). The calls to SSL_provide_quic_data() and SSL_process_quic_post_handshake() have also been disabled. These functions are not defined by OpenSSL QUIC API. At this time, the functions which call them are still defined when HAVE_OPENSSL_QUIC is defined.	2025-05-20 15:00:06 +02:00
Frederic Lecaille	894595b711	MINOR: quic: Add useful error traces about qc_ssl_sess_init() failures There were no traces to diagnose qc_ssl_sess_init() failures from QUIC traces. This patch adds calls to TRACE_DEVEL() into qc_ssl_sess_init() and its caller (qc_alloc_ssl_sock_ctx()). This was useful at least to diagnose SSL context initialization failures when porting QUIC to the new OpenSSL 3.5 QUIC API. Should be easily backported as far as 2.6.	2025-05-20 15:00:06 +02:00
Frederic Lecaille	a2822b1776	CLEANUP: quic: Useless BIO_METHOD initialization This code is there from QUIC implementation start. It was supposed to initialize <ha_quic_meth> as a BIO_METHOD static object. But this BIO_METHOD is not used at all! Should be backported as far as 2.6 to help integrate the next patches to come.	2025-05-20 15:00:06 +02:00
William Lallemand	e803385a6e	MINOR: acme: renewal notification over the dpapi sink Output a sink message when the certificate was renewed by the ACME client. The message is emitted on the "dpapi" sink, and ends by \n\0. Since the message contains this binary character, the right -0 parameter must be used when consulting the sink over the CLI: Example: $ echo "show events dpapi -nw -0" \| socat -t9999 /tmp/haproxy.sock - <0>2025-05-19T15:56:23.059755+02:00 acme newcert foobar.pem.rsa\n\0 When used with the master CLI, @@1 should be used instead of @1 in order to keep the connection to the worker. Example: $ echo "@@1 show events dpapi -nw -0" \| socat -t9999 /tmp/master.sock - <0>2025-05-19T15:56:23.059755+02:00 acme newcert foobar.pem.rsa\n\0	2025-05-19 16:07:25 +02:00
Willy Tarreau	99d6c889d0	BUG/MAJOR: leastconn: never reuse the node after dropping the lock On ARM with 80 cores and a single server, it's sometimes possible to see a segfault in fwlc_get_next_server() around 600-700k RPS. It seldom happens as well on x86 with 128 threads with the same config around 1M rps. It turns out that in fwlc_get_next_server(), before calling fwlc_srv_reposition(), we have to drop the lock and that one takes it back again. The problem is that anything can happen to our node during this time, and it can be freed. Then when continuing our work, we later iterate over it and its next to find a node with an acceptable key, and by doing so we can visit either uninitialized memory or simply nodes that are no longer in the tree. A first attempt at fixing this consisted in artificially incrementing the elements count before dropping the lock, but that turned out to be even worse because other threads could loop forever on such an element looking for an entry that does not exist. Maintaining a separate refcount didn't work well either, and it required to deal with the memory release while dropping it, which is really not convenient. Here we're taking a different approach consisting in simply not trusting this node anymore and going back to the beginning of the loop, as is done at a few other places as well. This way we can safely ignore the possibly released node, and the test runs reliably both on the arm and the x86 platforms mentioned above. No performance regression was observed either, likely because this operation is quite rare. No backport is needed since this appeared with the leastconn rework in 3.2.	2025-05-19 16:05:03 +02:00
Amaury Denoyelle	d358da4d83	BUG/MINOR: quic: fix crash on quic_conn alloc failure If there is an alloc failure during qc_new_conn(), cleaning is done via quic_conn_release(). However, since the below commit, an unchecked dereferencing of <qc.path> is performed in the latter. e841164a4402118bd7b2e2dc2b5068f21de5d9d2 MINOR: quic: account for global congestion window To fix this, simply check <qc.path> before dereferencing it in quic_conn_release(). This is safe as it is properly initialized to NULL on qc_new_conn() first stage. This does not need to be backported.	2025-05-19 11:03:48 +02:00
Willy Tarreau	099c1b2442	BUG/MAJOR: queue: properly keep count of the queue length The queue length was moved to its own variable in commit 583303c48 ("MINOR: proxies/servers: Calculate queueslength and use it."), however a few places were missed in pendconn_unlink() and assign_server_and_queue() resulting in never decreasing counts on aborted streams. This was reproduced when injecting more connections than the total backend could stand in TCP mode and letting some of them time out in the queue. No backport is needed, this is only 3.2.	2025-05-17 10:46:10 +02:00
Willy Tarreau	6be02d1c6e	BUG/MAJOR: leastconn: do not loop forever when facing saturated servers Since commit 9fe72bba3 ("MAJOR: leastconn; Revamp the way servers are ordered."), there's no way to escape the loop visiting the mt_list heads in fwlc_get_next_server if all servers in the list are saturated, resulting in a watchdog panic. It can be reproduced with this config and injecting with more than 2 concurrent conns: balance leastconn server s1 127.0.0.1:8000 maxconn 1 server s2 127.0.0.1:8000 maxconn 1 Here we count the number of saturated servers that were encountered, and escape the loop once the number of remaining servers exceeds the number of saturated ones. No backport is needed since this arrived in 3.2.	2025-05-17 10:44:36 +02:00
Willy Tarreau	ccc65012d3	IMPORT: slz: silence a build warning on non-x86 non-arm Building with clang 16 on MIPS64 yields this warning: src/slz.c:931:24: warning: unused function 'crc32_uint32' [-Wunused-function] static inline uint32_t crc32_uint32(uint32_t data) ^ Let's guard it using UNALIGNED_LE_OK which is the only case where it's used. This saves us from introducing a possibly non-portable attribute. This is libslz upstream commit f5727531dba8906842cb91a75c1ffa85685a6421.	2025-05-16 16:43:53 +02:00
Willy Tarreau	31ca29eee1	IMPORT: slz: fix header used for empty zlib message Calling slz_rfc1950_finish() without emitting any data would result in incorrectly emitting a gzip header (rfc1952) instead of a zlib header (rfc1950) due to a copy-paste between the two wrappers. The impact is almost inexistent since the zlib format is almost never used in this context, and compressing totally empty messages is quite rare as well. Let's take this opportunity for fixing another mistake on an RFC number in a comment. This is slz upstream commit 7f3fce4f33e8c2f5e1051a32a6bca58e32d4f818.	2025-05-16 16:43:53 +02:00
Willy Tarreau	411b04c7d3	IMPORT: slz: use a better hash for machines with a fast multiply The current hash involves 3 simple shifts and additions so that it can be mapped to a multiply on architectures having a fast multiply. This is indeed what the compiler does on x86_64. A large range of values was scanned to try to find more optimal factors on machines supporting such a fast multiply, and it turned out that new factor 0x1af42f resulted in smoother hashes that provided on average 0.4% better compression on both the Silesia corpus and an mbox file composed of very compressible emails and uncompressible attachments. It's even slightly better than CRC32C while being faster on Skylake. This patch enables this factor on archs with a fast multiply. This is slz upstream commit 82ad1e75c13245a835c1c09764c89f2f6e8e2a40.	2025-05-16 16:43:53 +02:00
Willy Tarreau	248bbec83c	IMPORT: slz: support crc32c for lookup hash on sse4 but only if requested If building for sse4 and USE_CRC32C_HASH is defined, then we can use crc32c to calculate the lookup hash. By default we don't do it because even on skylake it's slower than the current hash, which only involves a short multiply (~5% slower). But the gains are marginal (0.3%). This is slz upstream commit 44ae4f3f85eb275adba5844d067d281e727d8850. Note: this is not used by default and only merged in order to avoid divergence between the code bases.	2025-05-16 16:43:53 +02:00
Willy Tarreau	ea1b70900f	IMPORT: slz: avoid multiple shifts on 64-bits On 64-bit platforms, disassembling the code shows that send_huff() performs a left shift followed by a right one, which are the result of integer truncation and zero-extension caused solely by using different types at different levels in the call chain. By making encode24() take a 64-bit int on input and send_huff() take one optionally, we can remove one shift in the hot path and gain 1% performance without affecting other platforms. This is slz upstream commit fd165b36c4621579c5305cf3bb3a7f5410d3720b.	2025-05-16 16:43:53 +02:00
Willy Tarreau	0a91c6dcae	BUILD: debug: mark ha_crash_now() as attribute(noreturn) Building on MIPS64 with clang16 incorrectly reports some uninitialized value warnings in stats-proxy.c due to some calls to ABORT_NOW() where the compiler didn't know the code wouldn't return. Let's properly mark the function as noreturn, and take this opportunity for also marking it unused to avoid possible warnings depending on the build options (if ABORT_NOW is not used). No backport needed though it will not harm.	2025-05-16 16:43:53 +02:00
William Lallemand	1eebf98952	DOC: management: change reference to configuration manual Since e24b77e7 ('DOC: config: move the extraneous sections out of the "global" definition') the ACME section of the configuration manual was moved from 3.13 to 12.8. Change the reference to that section in "acme renew".	2025-05-16 16:01:43 +02:00
Willy Tarreau	81e46be026	DOC: config: properly index "table and "stick-table" in their section Tim reported in issue #2953 that "stick-table" and "table" were not indexed as keywords. The issue was the indent level. Also let's make sure to put a box around the "store" arguments as well.	2025-05-16 15:37:03 +02:00
Willy Tarreau	df00164fdd	BUG/MEDIUM: h1/h2/h3: reject forbidden chars in the Host header field In continuation with 9a05c1f574 ("BUG/MEDIUM: h2/h3: reject some forbidden chars in :authority before reassembly") and the discussion in issue #2941, @DemiMarie rightfully suggested that Host should also be sanitized, because it is sometimes used in concatenation, such as this: http-request set-url https://%[req.hdr(host)]%[pathq] which was proposed as a workaround for h2 upstream servers that require :authority here: https://www.mail-archive.com/haproxy@formilux.org/msg43261.html The current patch then adds the same check for forbidden chars in the Host header, using the same function as for the patch above, since in both cases we validate the host:port part of the authority. This way we won't reconstruct ambiguous URIs by concatenating Host and path. Just like the patch above, this can be backported after a period of observation.	2025-05-16 15:13:17 +02:00
Willy Tarreau	b84762b3e0	BUG/MINOR: h3: don't insert more than one Host header Let's make sure we drop extraneous Host headers after having compared them. That also works when :authority was already present. This way, like for h1 and h2, we only keep one copy of it, while still making sure that Host matches :authority. This way, if a request has both :authority and Host, only one Host header will be produced (from :authority). Note that due to the different organization of the code and wording along the evolving RFCs, here we also check that all duplicates are identical, while h2 ignores them as per RFC7540, but this will be re-unified later. This should be backported to stable versions, at least 2.8, though thanks to the existing checks the impact is probably null.	2025-05-16 15:13:17 +02:00
Christopher Faulet	f45a632bad	BUG/MEDIUM: stconn: Disable 0-copy forwarding for filters altering the payload It is especially a problem with Lua filters, but it is important to disable the 0-copy forwarding if a filter alters the payload, or at least to be able to disable it. While the filter is registered on the data filtering, it is not an issue (and it is the common case) because there is no way to fast-forward data at all. But it may be an issue if a filter decides to alter the payload and to unregister from data filtering. In that case, the 0-copy forwarding can be re-enabled in a hardly predictable state. To fix the issue, an SC flag was added to do so. The HTTP compression filter sets it, and Lua filters do too if the body length is changed (via HTTPMessage.set_body_len()). Note that it is an issue because of a bad design about the HTX. Many info about the message are stored in the HTX structure itself. It must be refactored to move several info to the stream-endpoint descriptor. This should ease modifications at the stream level, from a filter or TCP/HTTP rules. This should be backported as far as 3.0. If necessary, it may be backported on lower versions, as far as 2.6. In that case, it must be reviewed and adapted.	2025-05-16 15:11:37 +02:00
Christopher Faulet	94055a5e73	MEDIUM: hlua: Add function to change the body length of an HTTP Message There was no function for a lua filter to change the body length of an HTTP Message. But it is mandatory to be able to alter the message payload. It is not possible to directly update the message headers because the internal state of the message must also be updated accordingly. It is the purpose of HTTPMessage.set_body_len() function. The new body length must be passed as argument. If it is an integer, the right "Content-Length" header is set. If the "chunked" string is used, it forces the message to be chunked-encoded and in that case the "Transfer-Encoding" header is set. This patch should fix the issue #2837. It could be backported as far as 2.6.	2025-05-16 14:34:12 +02:00
Willy Tarreau	f2d7aa8406	BUG/MEDIUM: peers: also limit the number of incoming updates There's a configurable limit to the number of messages sent to a peer (tune.peers.max-updates-at-once), but this one is not applied to the receive side. While it can usually be OK with default settings, setups involving a large tune.bufsize (1MB and above) regularly experience high latencies and even watchdogs during reloads because the full learning process sends a lot of data that manages to fill the entire buffer, and due to the compactness of the protocol, 1MB of buffer can contain more than 100k updates, meaning taking locks etc during this time, which is not workable. Let's make sure the receiving side also respects the max-updates-at-once setting. For this it counts incoming updates, and refrains from continuing once the limit is reached. It's a bit tricky to do because after receiving updates we still have to send ours (and possibly some ACKs) so we cannot just leave the loop. This issue was reported on 3.1 but it should progressively be backported to all versions having the max-updates-at-once option available.	2025-05-15 16:57:21 +02:00
Aurelien DARRAGON	098a5e5c0b	BUG/MINOR: sink: detect and warn when using "send-proxy" options with ring servers Using "send-proxy" or "send-proxy-v2" option on a ring server is not relevant nor supported. Worse, on 2.4 it causes haproxy process to crash as reported in GH #2965. Let's be more explicit about the fact that this keyword is not supported under "ring" context by ignoring the option and emitting a warning message to inform the user about that. Ideally, we should do the same for peers and log servers. The proper way would be to check servers options during postparsing but we currently lack proper cross-type server postparsing hooks. This will come later and thus will give us a chance to perform the compatibility checks for server options depending on proxy type. But for now let's simply fix the "ring" case since it is the only one that's known to cause a crash. It may be backported to all stable versions.	2025-05-15 16:18:31 +02:00
Basha Mougamadou	824bb93e18	DOC: configuration: explicit multi-choice on bind shards option From the documentation, this wasn't clear enough that shards should be followed by one of the options number / by-thread / by-group. Align it with existing options in documentation so that it becomes more explicit.	2025-05-14 19:41:38 +02:00
Willy Tarreau	17df04ff09	[RELEASE] Released version 3.2-dev16 Released version 3.2-dev16 with the following main changes : - BUG/MEDIUM: mux-quic: fix crash on invalid fctl frame dereference - DEBUG: pool: permit per-pool UAF configuration - MINOR: acme: add the global option 'acme.scheduler' - DEBUG: pools: add a new integrity mode "backup" to copy the released area - MEDIUM: sock-inet: re-check IPv6 connectivity every 30s - BUG/MINOR: ssl: doesn't fill conf->crt with first arg - BUG/MINOR: ssl: prevent multiple 'crt' on the same ssl-f-use line - BUG/MINOR: ssl/ckch: always free() the previous entry during parsing - MINOR: tools: ha_freearray() frees an array of string - BUG/MINOR: ssl/ckch: always ha_freearray() the previous entry during parsing - MINOR: ssl/ckch: warn when the same keyword was used twice - BUG/MINOR: threads: fix soft-stop without multithreading support - BUG/MINOR: tools: improve parse_line()'s robustness against empty args - BUG/MINOR: cfgparse: improve the empty arg position report's robustness - BUG/MINOR: server: dont depend on proxy for server cleanup in srv_drop() - BUG/MINOR: server: perform lbprm deinit for dynamic servers - MINOR: http: add a function to validate characters of :authority - BUG/MEDIUM: h2/h3: reject some forbidden chars in :authority before reassembly - MINOR: quic: account Tx data per stream - MINOR: mux-quic: account Rx data per stream - MINOR: quic: add stream format for "show quic" - MINOR: quic: display QCS info on "show quic stream" - MINOR: quic: display stream age - BUG/MINOR: cpu-topo: fix group-by-cluster policy for disordered clusters - MINOR: cpu-topo: add a new "group-by-ccx" CPU policy - MINOR: cpu-topo: provide a function to sort clusters by average capacity - MEDIUM: cpu-topo: change "performance" to consider per-core capacity - MEDIUM: cpu-topo: change "efficiency" to consider per-core capacity - MEDIUM: cpu-topo: prefer grouping by CCX for "performance" and "efficiency" - MEDIUM: config: change default 
limits to 1024 threads and 32 groups - BUG/MINOR: hlua: Fix Channel:data() and Channel:line() to respect documentation - DOC: config: Fix a typo in the "term_events" definition - BUG/MINOR: spoe: Don't report error on applet release if filter is in DONE state - BUG/MINOR: mux-spop: Don't report error for stream if ACK was already received - BUG/MINOR: mux-spop: Make the demux stream ID a signed integer - BUG/MINOR: mux-spop: Don't open new streams for SPOP connection on error - MINOR: mux-spop: Don't set SPOP connection state to FRAME_H after ACK parsing - BUG/MEDIUM: mux-spop: Remove frame parsing states from the SPOP connection state - BUG/MEDIUM: mux-spop: Properly handle CLOSING state - BUG/MEDIUM: spop-conn: Report short read for partial frames payload - BUG/MEDIUM: mux-spop: Properly detect truncated frames on demux to report error - BUG/MEDIUM: mux-spop; Don't report a read error if there are pending data - DEBUG: mux-spop: Review some trace messages to adjust the message or the level - DOC: config: move address formats definition to section 2 - DOC: config: move stick-tables and peers to their own section - DOC: config: move the extraneous sections out of the "global" definition - CI: AWS-LC(fips): enable unit tests - CI: AWS-LC: enable unit tests - CI: compliance: limit run on forks only to manual + cleanup - CI: musl: enable unit tests - CI: QuicTLS (weekly): limit run on forks only to manual dispatch - CI: WolfSSL: enable unit tests	2025-05-14 17:01:46 +02:00
Ilia Shipitsin	12de9ecce5	CI: WolfSSL: enable unit tests Run the new make unit-tests on the CI.	2025-05-14 17:00:31 +02:00
Ilia Shipitsin	75a1e40501	CI: QuicTLS (weekly): limit run on forks only to manual dispatch	2025-05-14 17:00:31 +02:00
Ilia Shipitsin	a8b1b08fd7	CI: musl: enable unit tests Run the new make unit-tests on the CI.	2025-05-14 17:00:31 +02:00
Ilia Shipitsin	01225f9aa5	CI: compliance: limit run on forks only to manual + cleanup	2025-05-14 17:00:31 +02:00
Ilia Shipitsin	61b30a09c0	CI: AWS-LC: enable unit tests Run the new make unit-tests on the CI.	2025-05-14 17:00:31 +02:00
Ilia Shipitsin	944a96156e	CI: AWS-LC(fips): enable unit tests Run the new make unit-tests on the CI.	2025-05-14 17:00:31 +02:00
Willy Tarreau	e24b77e765	DOC: config: move the extraneous sections out of the "global" definition Due to some historic mistakes that have spread to newly added sections, a number of recently added small sections found themselves described under section 3 "global parameters" which is specific to "global" section keywords. This is highly confusing, especially given that sections 3.1, 3.2, 3.3 and 3.10 directly start with keywords valid in the global section, while others start with keywords that describe a new section. Let's just create a new chapter "12. other sections" and move them all there. 3.10 "HTTPclient tuning" however was moved to 3.4 as it's really a definition of the global options assigned to the HTTP client. The "programs" that are going away in 3.3 were moved at the end to avoid a renumbering later. Another nice benefit is that it moves a lot of text that was previously keeping the global and proxies sections apart.	2025-05-14 16:08:02 +02:00
Willy Tarreau	da67a89f30	DOC: config: move stick-tables and peers to their own section As suggested by Tim in issue #2953, stick-tables really deserve their own section to explain the configuration. And peers have to move there as well since they're totally dedicated to stick-tables. Now we introduce a new section "Stick-tables and Peers", explaining the concepts, and under which there is one subsection for stick-tables configuration and one for the peers (which mostly keeps the existing peers section).	2025-05-14 16:08:02 +02:00
Willy Tarreau	423dffa308	DOC: config: move address formats definition to section 2 Section 2 describes the config file format, variables naming etc, so there's no reason why the address format used in this file should be in a separate section, let's bring it into section 2 as well.	2025-05-14 16:08:02 +02:00
Christopher Faulet	e2ae8a74e8	DEBUG: mux-spop: Review some trace messages to adjust the message or the level Some trace messages were not really accurate, reporting a CLOSED connection while only an error was reported on it. In addition, a TRACE_ERROR() was used to report a short read on HELLO/DISCONNECT frames header. But it is not an error. A TRACE_DEVEL() should be used instead. This patch could be backported to 3.1 to ease future backports.	2025-05-14 11:52:10 +02:00
Christopher Faulet	6e46f0bf93	BUG/MEDIUM: mux-spop; Don't report a read error if there are pending data When a read error is detected, no error must be reported on the SPOP connection if there are still some data to parse. It is important to be sure to process all data before reporting the error and be sure to not truncate received frames. However, we must also take care to handle short read case to not wait for data that will never be received. This patch must be backported to 3.1.	2025-05-14 11:51:58 +02:00
Christopher Faulet	16314bb93c	BUG/MEDIUM: mux-spop: Properly detect truncated frames on demux to report error There was no test in the demux part to detect truncated frames and to report an error at the connection level. The SPOP streams were properly switched to half-closed state. But while waiting for the associated SPOE applets to be woken up and released, the SPOP connection could be woken up several times for nothing. I never triggered the watchdog in that case, but it is not excluded. Now, at the end of the demux function, a specific test was added to detect truncated frames to report an error and close the connection. This patch must be backported to 3.1.	2025-05-14 11:47:41 +02:00
Christopher Faulet	71feb49a9f	BUG/MEDIUM: spop-conn: Report short read for partial frames payload When a frame was not fully received, a short read must be reported on the SPOP connection to help the demux to handle truncated frames. This was performed for frames truncated on the header part but not on the payload part. It is now properly detected. This patch must be backported to 3.1.	2025-05-14 09:20:10 +02:00
Christopher Faulet	ddc5f8d92e	BUG/MEDIUM: mux-spop: Properly handle CLOSING state The CLOSING state was not handled at all by the SPOP multiplexer while it is mandatory when a DISCONNECT frame was sent and the mux should wait for the DISCONNECT frame in reply from the agent. Thanks to this patch, it should be fixed. In addition, if an error occurs during the AGENT HELLO frame parsing, the SPOP connection is no longer switched to CLOSED state and remains in ERROR state instead. It is important to be able to send the DISCONNECT frame to the agent instead of closing the TCP connection immediately. This patch depends on following commits: * BUG/MEDIUM: mux-spop: Remove frame parsing states from the SPOP connection state * MINOR: mux-spop: Don't set SPOP connection state to FRAME_H after ACK parsing * BUG/MINOR: mux-spop: Don't open new streams for SPOP connection on error * BUG/MINOR: mux-spop: Make the demux stream ID a signed integer All the series must be backported to 3.1.	2025-05-14 09:14:12 +02:00
Christopher Faulet	a3940614c2	BUG/MEDIUM: mux-spop: Remove frame parsing states from the SPOP connection state SPOP_CS_FRAME_H and SPOP_CS_FRAME_P states, that were used to handle frame parsing, were removed. The demux process now relies on the demux stream ID to know if it is waiting for the frame header or the frame payload. Concretely, when the demux stream ID is not set (dsi == -1), the demuxer is waiting for the next frame header. Otherwise (dsi >= 0), it is waiting for the frame payload. It is especially important to be able to properly handle DISCONNECT frames sent by the agents. SPOP_CS_RUNNING state is introduced to know the hello handshake was finished and the SPOP connection is able to open SPOP streams and exchange NOTIFY/ACK frames with the agents. It depends on the following fixes: * MINOR: mux-spop: Don't set SPOP connection state to FRAME_H after ACK parsing * BUG/MINOR: mux-spop: Make the demux stream ID a signed integer This change will be mandatory for the next fix. It must be backported to 3.1 with the commits above.	2025-05-13 19:51:40 +02:00
Christopher Faulet	6b0f7de4e3	MINOR: mux-spop: Don't set SPOP connection state to FRAME_H after ACK parsing After the ACK frame was parsed, it is useless to set the SPOP connection state to SPOP_CS_FRAME_H state because this will be automatically handled by the demux function. It is not an issue, but this will simplify changes for the next commit.	2025-05-13 19:51:40 +02:00
Christopher Faulet	197eaaadfd	BUG/MINOR: mux-spop: Don't open new streams for SPOP connection on error Till now, only SPOP connections fully closed or those with a TCP connection on error were concerned. But available streams could be reported for SPOP connections in error or closing state. But in these states, no NOTIFY frames will be sent and no ACK frames will be parsed. So, no new SPOP streams should be opened. This patch should be backported to 3.1.	2025-05-13 19:51:40 +02:00
Christopher Faulet	cbc10b896e	BUG/MINOR: mux-spop: Make the demux stream ID a signed integer The demux stream ID of a SPOP connection, used when received frames are parsed, must be a signed integer because it is set to -1 when the SPOP connection is initialized. It will be important for the next fix. This patch must be backported to 3.1.	2025-05-13 19:51:40 +02:00
Christopher Faulet	6d68beace5	BUG/MINOR: mux-spop: Don't report error for stream if ACK was already received When a SPOP connection was closed or was in error, an error was systematically reported on all its SPOP streams. However, SPOP streams that already received their ACK frame must be excluded. Otherwise if an agent sends an ACK and closes immediately, the ACK will be ignored because the SPOP stream will handle the error first. This patch must be backported to 3.1.	2025-05-13 19:51:40 +02:00
Christopher Faulet	1cd30c998b	BUG/MINOR: spoe: Don't report error on applet release if filter is in DONE state When the SPOE applet was released, if a SPOE filter context was still attached to it, an error was reported to the filter. However, there is no reason to report an error if the ACK message was already received. Because of this bug, if the ACK message is received and the SPOE connection is immediately closed, this prevents the ACK message from being processed. This patch should be backported to 3.1.	2025-05-13 19:51:40 +02:00
Christopher Faulet	dcce02d6ed	DOC: config: Fix a typo in the "term_events" definition A space was missing before the colon.	2025-05-13 19:51:40 +02:00
Christopher Faulet	a5de0e1595	BUG/MINOR: hlua: Fix Channel:data() and Channel:line() to respect documentation When the channel API was revisited, both functions above were added. An offset can be passed as argument. However, this parameter could be reported to be out of range if not enough input data was received yet. It is an issue, especially with a tcp rule, because more data could be received. If an error is reported too early, this prevents the rule from being reevaluated later. In fact, an error should only be reported if the offset is part of the output data. Another issue is about the conditions to report 'nil' instead of an empty string. 'nil' was reported when no data was found. But it is not aligned with the documentation. 'nil' must only be returned if no more data can be received and there is no input data at all. This patch should fix the issue #2716. It should be backported as far as 2.6.	2025-05-13 19:51:40 +02:00
Willy Tarreau	e049bd00ab	MEDIUM: config: change default limits to 1024 threads and 32 groups A test run on a dual-socket EPYC 9845 (2x160 cores) showed that we'll be facing new limits during the lifetime of 3.2 with our current 16 groups and 256 threads max: $ cat test.cfg global cpu-policy perforamnce $ ./haproxy -dc -c -f test.cfg ... Thread CPU Bindings: Tgrp/Thr Tid CPU set 1/1-32 1-32 32: 0-15,320-335 2/1-32 33-64 32: 16-31,336-351 3/1-32 65-96 32: 32-47,352-367 4/1-32 97-128 32: 48-63,368-383 5/1-32 129-160 32: 64-79,384-399 6/1-32 161-192 32: 80-95,400-415 7/1-32 193-224 32: 96-111,416-431 8/1-32 225-256 32: 112-127,432-447 Raising the default limit to 1024 threads and 32 groups is sufficient to buy us enough margin for a long time (hopefully, please don't laugh, you, reader from the future): $ ./haproxy -dc -c -f test.cfg ... Thread CPU Bindings: Tgrp/Thr Tid CPU set 1/1-32 1-32 32: 0-15,320-335 2/1-32 33-64 32: 16-31,336-351 3/1-32 65-96 32: 32-47,352-367 4/1-32 97-128 32: 48-63,368-383 5/1-32 129-160 32: 64-79,384-399 6/1-32 161-192 32: 80-95,400-415 7/1-32 193-224 32: 96-111,416-431 8/1-32 225-256 32: 112-127,432-447 9/1-32 257-288 32: 128-143,448-463 10/1-32 289-320 32: 144-159,464-479 11/1-32 321-352 32: 160-175,480-495 12/1-32 353-384 32: 176-191,496-511 13/1-32 385-416 32: 192-207,512-527 14/1-32 417-448 32: 208-223,528-543 15/1-32 449-480 32: 224-239,544-559 16/1-32 481-512 32: 240-255,560-575 17/1-32 513-544 32: 256-271,576-591 18/1-32 545-576 32: 272-287,592-607 19/1-32 577-608 32: 288-303,608-623 20/1-32 609-640 32: 304-319,624-639 We can change this default now because it has no functional effect without any configured cpu-policy, so this will only be an opt-in and it's better to do it now than to have an effect during the maintenance phase. 
A tiny effect is a doubling of the number of pool buckets and stick-table shards internally, which means that aside from slightly reducing contention in these areas, a dump of tables can enumerate keys in a different order (hence the adjustment in the vtc). The only really visible effect is a slightly higher static memory consumption (29->35 MB on a small config), but that difference remains even with 50k servers so that's pretty much acceptable. Thanks to Erwan Velu for the quick tests and the insights!	2025-05-13 18:15:33 +02:00
Willy Tarreau	158da59c34	MEDIUM: cpu-topo: prefer grouping by CCX for "performance" and "efficiency" Most of the time, machines made of multiple CPU types use the same L3 for them, and grouping CPUs by frequencies to form groups doesn't bring any value and on the contrary can impair the incoming connection balancing. This choice of grouping by cluster was made in order to constitute a good choice on homogeneous machines as well, so better rely on the per-CCX grouping than the per-cluster one in this case. This will create fewer clusters on machines where it counts without affecting other ones. It doesn't seem necessary to change anything for the "resource" policy since it selects a single cluster.	2025-05-13 16:48:30 +02:00
Willy Tarreau	70b0dd6b0f	MEDIUM: cpu-topo: change "efficiency" to consider per-core capacity This is similar to the previous change to the "performance" policy but it applies to the "efficiency" one. Here we're changing the sorting method to sort CPU clusters by average per-CPU capacity, and we evict clusters whose per-CPU capacity is above 125% of the previous one. Per-core capacity allows to detect discrepancies between CPU cores, and to continue to focus on efficient ones as a priority.	2025-05-13 16:48:30 +02:00
Willy Tarreau	6c88e27cf4	MEDIUM: cpu-topo: change "performance" to consider per-core capacity Running the "performance" policy on highly heterogeneous systems yields bad choices when there are sufficiently more small than big cores, and/or when there are multiple cluster types, because on such setups, the higher the frequency, the lower the number of cores, despite small differences in frequencies. In such cases, we quickly end up with "performance" only choosing the small or the medium cores, which is contrary to the original intent, which was to select performance cores. This is what happens on boards like the Orion O6 for example where only the 4 medium cores and 2 big cores are chosen, evicting the 2 biggest cores and the 4 smallest ones. Here we're changing the sorting method to sort CPU clusters by average per-CPU capacity, and we evict clusters whose per-CPU capacity falls below 80% of the previous one. Per-core capacity allows to detect discrepancies between CPU cores, and to continue to focus on high performance ones as a priority.	2025-05-13 16:48:30 +02:00
Willy Tarreau	5ab2c815f1	MINOR: cpu-topo: provide a function to sort clusters by average capacity The current per-capacity sorting function acts on a whole cluster, but in some setups having many small cores and few big ones, it becomes easy to observe an inversion of metrics where the many small cores show a globally higher total capacity than the few big ones. This does not necessarily fit all use cases. Let's add a new function to sort clusters by their per-cpu average capacity to cover more use cases.	2025-05-13 16:48:30 +02:00
Willy Tarreau	01df98adad	MINOR: cpu-topo: add a new "group-by-ccx" CPU policy This cpu-policy will only consider CCX and not clusters. This makes a difference on machines with heterogeneous CPUs that generally share the same L3 cache, where it's not desirable to create multiple groups based on the CPU types, but instead create one with the different CPU types. The variants "group-by-2/3/4-ccx" have also been added. Let's also add some text explaining the difference between cluster and CCX.	2025-05-13 16:48:30 +02:00
Willy Tarreau	33d8b006d4	BUG/MINOR: cpu-topo: fix group-by-cluster policy for disordered clusters Some (rare) boards have their clusters in an erratic order. This is the case for the Radxa Orion O6 where one of the big cores appears as CPU0 due to booting from it, then followed by the small cores, then the medium cores, then the remaining big cores. This results in clusters appearing in this order: 0,2,1,0. The code in cpu_policy_group_by_cluster() expected ordered clusters, and performs ordered comparisons to decide whether a CPU's cluster has already been taken care of. On the board above this doesn't work, only clusters 0 and 2 appear and 1 is skipped. Let's replace the cluster number comparison with a cpuset to record which clusters have been taken care of. Now the groups properly appear like this: Tgrp/Thr Tid CPU set 1/1-2 1-2 2: 0,11 2/1-4 3-6 4: 1-4 3/1-6 7-12 6: 5-10 No backport is needed, this is purely 3.2.	2025-05-13 16:48:30 +02:00
Amaury Denoyelle	f3b9676416	MINOR: quic: display stream age Add a field to save the creation date of qc_stream_desc instance. This is useful to display QUIC stream age in "show quic stream" output.	2025-05-13 15:44:22 +02:00
Amaury Denoyelle	dbf07c754e	MINOR: quic: display QCS info on "show quic stream" Complete stream output for "show quic" by displaying information from its upper QCS. Note that QCS may be NULL if already released, so a default output is also provided.	2025-05-13 15:43:28 +02:00
Amaury Denoyelle	cbadfa0163	MINOR: quic: add stream format for "show quic" Add a new format for "show quic" command labelled as "stream". This is an equivalent of "show sess", dedicated to the QUIC stack. Each active QUIC stream is listed on a line with its related info. The main objective of this command is to ensure there are no frozen streams remaining after a transfer.	2025-05-13 15:41:51 +02:00
Amaury Denoyelle	1ccede211c	MINOR: mux-quic: account Rx data per stream Add counters to measure Rx buffers usage per QCS. This reuses the newly defined bdata_ctr type already used for Tx accounting. Note that for now, <tot> value of bdata_ctr is not used. This is because it is not easy to account for data across contiguous buffers. These values are displayed both on log/traces and "show quic" output.	2025-05-13 15:41:51 +02:00
Amaury Denoyelle	a1dc9070e7	MINOR: quic: account Tx data per stream Add accounting at qc_stream_desc level to be able to report the number of allocated Tx buffers and the sum of their data. This represents data ready for emission or already emitted and waiting on ACK. To simplify this accounting, a new counter type bdata_ctr is defined in quic_utils.h. This regroups both buffers and data counter, plus a maximum on the buffer value. These values are now displayed on QCS info used both on logline and traces, and also on "show quic" output.	2025-05-13 15:41:41 +02:00
Willy Tarreau	9a05c1f574	BUG/MEDIUM: h2/h3: reject some forbidden chars in :authority before reassembly As discussed here: https://github.com/httpwg/http2-spec/pull/936 https://github.com/haproxy/haproxy/issues/2941 It's important to take care of some special characters in the :authority pseudo header before reassembling a complete URI, because after assembly it's too late (e.g. the '/'). This patch does this, both for h2 and h3. The impact on H2 was measured in the worst case at 0.3% of the request rate, while the impact on H3 is around 1%, but H3 was about 1% faster than H2 before and is now on par. It may be backported after a period of observation, and in this case it relies on this previous commit: MINOR: http: add a function to validate characters of :authority Thanks to @DemiMarie for reviving this topic in issue #2941 and bringing new potential interesting cases.	2025-05-12 18:02:47 +02:00
Willy Tarreau	ebab479cdf	MINOR: http: add a function to validate characters of :authority As discussed here: https://github.com/httpwg/http2-spec/pull/936 https://github.com/haproxy/haproxy/issues/2941 It's important to take care of some special characters in the :authority pseudo header before reassembling a complete URI, because after assembly it's too late (e.g. the '/'). This patch adds a specific function which checks all such characters and their ranges on an ist, and benefits from modern compilers optimizations that arrange the comparisons into an evaluation tree for faster match. That's the version that gave the most consistent performance across various compilers, though some hand-crafted versions using bitmaps stored in register could be slightly faster but super sensitive to code ordering, suggesting that the results might vary with future compilers. This one takes on average 1.2ns per character at 3 GHz (3.6 cycles per char on avg). The resulting impact on H2 request processing time (small requests) was measured around 0.3%, from 6.60 to 6.618us per request, which is a bit high but remains acceptable given that the test only focused on req rate. The code was made usable both for H2 and H3.	2025-05-12 18:02:47 +02:00
Aurelien DARRAGON	c40d6ac840	BUG/MINOR: server: perform lbprm deinit for dynamic servers Last commit 7361515 ("BUG/MINOR: server: dont depend on proxy for server cleanup in srv_drop()") introduced a regression because the lbprm server_deinit is not evaluated anymore with dynamic servers, possibly resulting in a memory leak. To fix the issue, in addition to free_proxy(), the server deinit check should be manually performed in cli_parse_delete_server() as well. No backport needed.	2025-05-12 16:29:36 +02:00
Aurelien DARRAGON	736151556c	BUG/MINOR: server: dont depend on proxy for server cleanup in srv_drop() In commit b5ee8bebfc ("MINOR: server: always call ssl->destroy_srv when available"), we made it so srv_drop() doesn't depend on proxy to perform server cleanup. It turns out this is now mandatory, because during deinit, free_proxy() can occur before the final srv_drop(). This is the case when using Lua scripts for instance. In 2a9436f96 ("MINOR: lbprm: Add method to deinit server and proxy") we added a freeing check under srv_drop() that depends on the proxy. Because of that UAF may occur during deinit when using a Lua script that manipulates server objects. To fix the issue, let's perform the lbprm server deinit logic under free_proxy() directly, where the DEINIT server hooks are evaluated. Also, to prevent similar bugs in the future, let's explicitly document in srv_drop() that server cleanups should assume that the proxy may already be freed. No backport needed unless 2a9436f96 is.	2025-05-12 16:17:26 +02:00
Willy Tarreau	be4d816be2	BUG/MINOR: cfgparse: improve the empty arg position report's robustness OSS Fuzz found that the previous fix ebb19fb367 ("BUG/MINOR: cfgparse: consider the special case of empty arg caused by \x00") was incomplete, as the output can sometimes be larger than the input (due to variables expansion) in which case the work around to try to report a bad arg will fail. While the parse_line() function has been made more robust now in order to avoid this condition, let's fix the handling of this special case anyway by just pointing to the beginning of the line if the supposed error location is out of the line's buffer. All details here: https://oss-fuzz.com/testcase-detail/5202563081502720 No backport is needed unless the fix above is backported.	2025-05-12 16:11:15 +02:00
Willy Tarreau	2b60e54fb1	BUG/MINOR: tools: improve parse_line()'s robustness against empty args The fix in 10e6d0bd57 ("BUG/MINOR: tools: only fill first empty arg when not out of range") was not that good. It focused on protecting against <arg> becoming out of range to detect we haven't emitted anything, but it's not the right way to detect this. We're always maintaining arg_start as a copy of outpos, and that latter one is incremented when emitting a char, so instead of testing args[arg] against out+arg_start, we should instead check outpos against arg_start, thereby eliminating the <out> offset and the need to access args[]. This way we now always know if we've emitted an empty arg without dereferencing args[]. There's no need to backport this unless the fix above is also backported.	2025-05-12 16:11:15 +02:00
Aurelien DARRAGON	7d057e56af	BUG/MINOR: threads: fix soft-stop without multithreading support When thread support is disabled ("USE_THREAD=" or "USE_THREAD=0" when building), soft-stop doesn't work as haproxy never ends after stopping the proxies. This used to work fine in the past but suddenly stopped working with ef422ced91 ("MEDIUM: thread: make stopping_threads per-group and add stopping_tgroups") because the "break;" instruction under the stopping condition is never executed when support for multithreading is disabled. To fix the issue, let's add an "else" block to run the "break;" instruction when USE_THREAD is not defined. It should be backported up to 2.8	2025-05-12 14:18:39 +02:00
William Lallemand	8b0d1a4113	MINOR: ssl/ckch: warn when the same keyword was used twice When using a crt-list or a crt-store, keywords mentioned twice on the same line overwrite the previous value. This patch emits a warning when the same keyword is found another time on the same line.	2025-05-09 19:18:38 +02:00
William Lallemand	9c0c05b7ba	BUG/MINOR: ssl/ckch: always ha_freearray() the previous entry during parsing The ckch_conf_parse() function is the generic function which parses crt-store keywords from the crt-store section, and also from a crt-list. When the same keyword appears multiple times, a leak of the previous value happens. This patch ensures that the previous value is always freed before overwriting it. This is the same problem as the previous "BUG/MINOR: ssl/ckch: always free() the previous entry during parsing" patch, however this one applies to PARSE_TYPE_ARRAY_SUBSTR. No backport needed.	2025-05-09 19:16:02 +02:00
William Lallemand	96b1f1fd26	MINOR: tools: ha_freearray() frees an array of string ha_freearray() is a new function which free()s an array of strings terminated by a NULL entry. The pointer to the array will be freed and set to NULL.	2025-05-09 19:12:05 +02:00
William Lallemand	311e0aa5c7	BUG/MINOR: ssl/ckch: always free() the previous entry during parsing The ckch_conf_parse() function is the generic function which parses crt-store keywords from the crt-store section, and also from a crt-list. When the same keyword appears multiple times, a leak of the previous value happens. This patch ensures that the previous value is always freed before overwriting it. This patch should be backported as far as 3.0.	2025-05-09 19:01:28 +02:00
William Lallemand	9ce3fb35a2	BUG/MINOR: ssl: prevent multiple 'crt' on the same ssl-f-use line The 'ssl-f-use' implementation doesn't prevent the 'crt' keyword from being used multiple times, which overwrites the previous value, letting users think that it is possible to use multiple certificates on the same line, which is not the case. This patch emits an alert when setting the 'crt' keyword multiple times on the same ssl-f-use line. Should fix issue #2966. No backport needed.	2025-05-09 18:52:09 +02:00
William Lallemand	0c4abf5a22	BUG/MINOR: ssl: doesn't fill conf->crt with first arg Commit c7f29afc ("MEDIUM: ssl: replace "crt" lines by "ssl-f-use" lines") forgot to remove the allocation of the crt field which was done with the first argument. Since ssl-f-use takes keywords, this would put the first keyword in "crt" instead of the certificate name.	2025-05-09 18:23:06 +02:00
Willy Tarreau	8a96216847	MEDIUM: sock-inet: re-check IPv6 connectivity every 30s IPv6 connectivity might start off (e.g. network not fully up when haproxy starts), so for features like resolvers, it would be nice to periodically recheck. With this change, instead of having the resolvers code rely on a variable indicating connectivity, it will now call a function that will check for how long a connectivity check hasn't been run, and will perform a new one if needed. The age was set to 30s which seems reasonable considering that the DNS will cache results anyway. There's no saving in spacing it more since the syscall is very cheap (just a connect() without any packet being emitted). The variables remain exported so that we could present them in show info or anywhere else. This way, "dns-accept-family auto" will now stay up to date. Warning though, it does perform some caching so even with a refreshed IPv6 connectivity, an older record may be returned anyway.	2025-05-09 15:45:44 +02:00
Willy Tarreau	1404f6fb7b	DEBUG: pools: add a new integrity mode "backup" to copy the released area This way we can preserve the entire contents of the released area for later inspection. This automatically enables comparison at reallocation time as well (like "integrity" does). If used in combination with integrity, the comparison is disabled but the check of non-corruption of the area mangled by integrity is still operated.	2025-05-09 14:57:00 +02:00
William Lallemand	e7574cd5f0	MINOR: acme: add the global option 'acme.scheduler' The automatic scheduler is useful but sometimes you don't want to use it, or you want to schedule manually. This patch adds an 'acme.scheduler' option in the global section, which can be set to either 'auto' or 'off'. (auto is the default value) This also changes the output of the 'acme status' command so it does not show scheduled values. The state will be 'Stopped' instead of 'Scheduled'.	2025-05-09 14:00:39 +02:00
Willy Tarreau	0ae14beb2a	DEBUG: pool: permit per-pool UAF configuration The new MEM_F_UAF flag can be set just after a pool's creation to make this pool UAF for debugging purposes. This allows to maintain a better overall performance required to reproduce issues while still having a chance to catch UAF. It will only be used by developers who will manually add it to areas worth being inspected, though.	2025-05-09 13:59:02 +02:00
Amaury Denoyelle	14e4f2b811	BUG/MEDIUM: mux-quic: fix crash on invalid fctl frame dereference Emission of flow-control frames has been recently modified. Now, each frame is sent one by one, via a single entry list. If a failure occurs, emission is interrupted and the frame is reinserted into the original <qcc.lfctl.frms> list. This code is incorrect as it only checks if qcc_send_frames() returns an error code to perform the reinsert operation. However, an error here does not always mean that the frame was not properly emitted by the lower quic-conn layer. As such, an extra test LIST_ISEMPTY() must be performed prior to reinserting the frame. This bug would cause a heap overflow. Indeed, the reinserted frame would be a random value. A crash would occur as soon as it would be dereferenced via the <qcc.lfctl.frms> list. This was reproduced by issuing a POST with a big file and interrupting it after just a few seconds. This results in a crash in about a third of the tests. Here is an example command using ngtcp2 : $ ngtcp2-client -q --no-quic-dump --no-http-dump \ -m POST -d ~/infra/html/1g 127.0.0.1 20443 "http://127.0.0.1:20443/post" Heap overflow was detected via a BUG_ON() statement from qc_frm_free() via qcc_release() caller : FATAL: bug condition "!((&((frm)->reflist))->n == (&((frm)->reflist)))" matched at src/quic_frame.c:1270 This does not need to be backported.	2025-05-09 11:07:11 +02:00
Willy Tarreau	3f9194bfc9	[RELEASE] Released version 3.2-dev15 Released version 3.2-dev15 with the following main changes : - BUG/MEDIUM: stktable: fix sc_*(<ctr>) BUG_ON() regression with ctx > 9 - BUG/MINOR: acme/cli: don't output error on success - BUG/MINOR: tools: do not create an empty arg from trailing spaces - MEDIUM: config: warn about the consequences of empty arguments on a config line - MINOR: tools: make parse_line() provide hints about empty args - MINOR: cfgparse: visually show the input line on empty args - BUG/MINOR: tools: always terminate empty lines - BUG/MINOR: tools: make parseline report the required space for the trailing 0 - DEBUG: threads: don't keep lock label "OTHER" in the per-thread history - DEBUG: threads: merge successive idempotent lock operations in history - DEBUG: threads: display held locks in threads dumps - BUG/MINOR: proxy: only use proxy_inc_fe_cum_sess_ver_ctr() with frontends - Revert "BUG/MEDIUM: mux-spop: Handle CLOSING state and wait for AGENT DISCONNECT frame" - MINOR: acme/cli: 'acme status' show the status acme-configured certificates - MEDIUM: acme/ssl: remove 'acme ps' in favor of 'acme status' - DOC: configuration: add "acme" section to the keywords list - DOC: configuration: add the "crt-store" keyword - BUG/MAJOR: queue: lock around the call to pendconn_process_next_strm() - MINOR: ssl: add filename and linenum for ssl-f-use errors - BUG/MINOR: ssl: can't use crt-store some certificates in ssl-f-use - BUG/MINOR: tools: only fill first empty arg when not out of range - MINOR: debug: bump the dump buffer to 8kB - MINOR: stick-tables: add "ipv4" as an alias for the "ip" type - MINOR: quic: extend return value during TP parsing - BUG/MINOR: quic: use proper error code on missing CID in TPs - BUG/MINOR: quic: use proper error code on invalid server TP - BUG/MINOR: quic: reject retry_source_cid TP on server side - BUG/MINOR: quic: use proper error code on invalid received TP value - BUG/MINOR: quic: fix TP reject on 
invalid max-ack-delay - BUG/MINOR: quic: reject invalid max_udp_payload size - BUG/MEDIUM: peers: hold the refcnt until updating ts->seen - BUG/MEDIUM: stick-tables: close a tiny race in __stksess_kill() - BUG/MINOR: cli: fix too many args detection for commands - MINOR: server: ensure server postparse tasks are run for dynamic servers - BUG/MEDIUM: stick-table: always remove update before adding a new one - BUG/MEDIUM: quic: free stream_desc on all data acked - BUG/MINOR: cfgparse: consider the special case of empty arg caused by \x00 - DOC: config: recommend disabling libc-based resolution with resolvers	2025-05-09 10:51:30 +02:00
Willy Tarreau	4e20fab7ac	DOC: config: recommend disabling libc-based resolution with resolvers Using both libc and haproxy resolvers can lead to hard to diagnose issues when their behaviour diverges; recommend using only one type of resolver. Should be backported to stable versions. Link: https://www.mail-archive.com/haproxy@formilux.org/msg45663.html Co-authored-by: Lukas Tribus <lukas@ltri.eu>	2025-05-09 10:31:39 +02:00
Willy Tarreau	ebb19fb367	BUG/MINOR: cfgparse: consider the special case of empty arg caused by \x00 The reporting of the empty arg location added with commit 08d3caf30 ("MINOR: cfgparse: visually show the input line on empty args") falls victim of a special case detected by OSS Fuzz: https://issues.oss-fuzz.com/issues/415850462 In short, making an argument start with "\x00" doesn't make it empty for the parser, but still emits an empty string which is detected and displayed. Unfortunately in this case the error pointer is not set so the sanitization function crashes. What we're doing in this case is that we fall back to the position of the output argument as an estimate of where it was located in the input. It's clearly inexact (quoting etc) but will still help the user locate the problem. No backport is needed unless the commit above is backported.	2025-05-09 10:01:44 +02:00
Amaury Denoyelle	3fdb039a99	BUG/MEDIUM: quic: free stream_desc on all data acked The following patch simplifies qc_stream_desc_ack(). The qc_stream_desc instance is not freed anymore, even if all data were acknowledged. As implied by the commit message, the caller is responsible for performing this cleaning operation. f4a83fbb14bdd14ed94752a2280a2f40c1b690d2 MINOR: quic: do not remove qc_stream_desc automatically on ACK handling However, despite the commit instruction, the qc_stream_desc_free() invocation was not moved into the caller. This commit fixes this by adding it after stream ACK handling. This is performed only when a transfer is completed : all data is acknowledged and qc_stream_desc has been released by its MUX stream instance counterpart. This bug may cause a significant increase in memory usage when dealing with long running connections. However, there is no memory leak, as every qc_stream_desc attached to a connection is finally freed when the quic_conn instance is released. This must be backported up to 3.1.	2025-05-09 09:25:47 +02:00
Willy Tarreau	576e47fb9a	BUG/MEDIUM: stick-table: always remove update before adding a new one Since commit 388539faa ("MEDIUM: stick-tables: defer adding updates to a tasklet"), between the entry creation and its arrival in the updates tree, there is time for scheduling, and it now becomes possible for an stksess entry to be requeued into the list while it's still in the tree as a remote one. Only local updates were removed prior to being inserted. In this case we would re-insert the entry, causing it to appear as the parent of two distinct nodes or leaves, and to be visited from the first leaf during a delete() after having already been removed and freed, causing a crash, as Christian reported in issue #2959. There's no reason to backport this as this appeared with the commit above in 3.2-dev13.	2025-05-08 23:32:25 +02:00
Aurelien DARRAGON	f03e999912	MINOR: server: ensure server postparse tasks are run for dynamic servers commit 29b76cae4 ("BUG/MEDIUM: server/log: "mode log" after server keyword causes crash") introduced some postparsing checks/tasks for servers. Initially they were mainly meant for "mode log" servers postparsing, but we already have a check dedicated to "tcp/http" servers (ie: only tcp proto supported). However when dynamic servers are added they bypass _srv_postparse() since the REGISTER_POST_SERVER_CHECK() is only executed for servers defined in the configuration. To ensure consistency between dynamic and static servers, and ensure no post-check init routine is missed, let's manually invoke _srv_postparse() after creating a dynamic server added via the cli.	2025-05-08 02:03:50 +02:00
Aurelien DARRAGON	976e0bd32f	BUG/MINOR: cli: fix too many args detection for commands d3f928944 ("BUG/MINOR: cli: Issue an error when too many args are passed for a command") added a new check to prevent the command from running when too many arguments are provided. In this case an error is reported. However it turns out this check (despite being marked for backports) was ineffective prior to 20ec1de21 ("MAJOR: cli: Refacor parsing and execution of pipelined commands") as the 'p' pointer was reset to the end of the buffer before the check was executed. Now since 20ec1de21, the check works, but we have another issue: we may read past initialized bytes in the buffer because the 'p' pointer is always incremented in a while loop without checking if we increment it past 'end' (this was detected using valgrind). To fix the issue introduced by 20ec1de21, let's only increment the 'p' pointer if p < end. For 3.2 this is it, now for older versions, since d3f928944 was marked for backport, a slightly different approach is needed: - conditional p increment must be done in the loop (as in this patch) - max arg check must be moved above the "fill unused slots" comment where p is assigned to the end of the buffer This patch should be backported with d3f928944.	2025-05-08 02:03:43 +02:00
Willy Tarreau	0cee7b5b8d	BUG/MEDIUM: stick-tables: close a tiny race in __stksess_kill() It might be possible not to see the element in the tree, then not to see it in the update list, thus not to take the lock before deleting. But an element in the list could have moved to the tree during the check, and be removed later without the updt_lock. Let's delete prior to checking the presence in the tree to avoid this situation. No backport is needed since this arrived in -dev13 with the update list.	2025-05-07 18:49:21 +02:00
Willy Tarreau	006a3acbde	BUG/MEDIUM: peers: hold the refcnt until updating ts->seen In peer_treat_updatemsg(), we call stktable_touch_remote() after releasing the write lock on the TS, asking it to decrement the refcnt, then we update ts->seen. Unfortunately this is racy and causes the issue that Christian reported in issue #2959. The sequence of events is very hard to trigger manually, but what happens is the following: T1. stktable_touch_remote(table, ts, 1); -> at this point the entry is in the mt_list, and the refcnt is zero. T2. stktable_trash_oldest() or process_table_expire() -> these can run, because the refcnt is now zero. The entry is cleanly deleted and freed. T1. HA_ATOMIC_STORE(&ts->seen, 1) -> we dereference freed memory. A first attempt at a fix was made by keeping the refcnt held during all the time the entry is in the mt_list, but this is expensive as such entries cannot be purged, causing lots of skips during trash_oldest_data(). This managed to trigger watchdogs, and was only hiding the real cause of the problem. The correct approach clearly is to maintain the ref_cnt until we touch ->seen. That's what this patch does. It does not decrement the refcnt, while calling stktable_touch_remote(), and does it manually after touching ->seen. With this the problem is gone. Note that a reproducer involves the following: - a config with 10 stick-ctr tracking the same table with a random key between 10M and 100M depending on the machine. - the expiration should be between 10 and 20s. http_req_cnt is stored and shared with the peers. - 4 total processes with such a config on the local machine, each corresponding to a different peer. 3 of the peers are bound to half of the cores (all threads) and share the same threads; the last process is bound to the other half with its own threads. - injecting at full load, ~256 conn, on the shared listening port. 
After ~2x expiration time to 1 minute the lone process should segfault in pools code due to a corrupted by_lru list. This problem already exists in earlier versions but the race looks narrower. Given how difficult it is to trigger on a given machine in its current form, it's likely that it only happens once in a while on stable branches. The fix must be backported wherever the code is similar, and there's no hope to reproduce it to validate the backport. Thanks again to Christian for his amazing help!	2025-05-07 18:49:21 +02:00
Amaury Denoyelle	4bc7aa548a	BUG/MINOR: quic: reject invalid max_udp_payload size Add a check on received max_udp_payload transport parameters. As defined per RFC 9000, values below 1200 are invalid, and thus the connection must be closed with TRANSPORT_PARAMETER_ERROR code. Prior to this patch, an invalid value was silently ignored. This should be backported up to 2.6. Note that it relies on the previous patch "MINOR: quic: extend return value during TP parsing".	2025-05-07 15:21:30 +02:00
Amaury Denoyelle	ffabfb0fc3	BUG/MINOR: quic: fix TP reject on invalid max-ack-delay Checks are implemented on some received transport parameter values, to reject invalid ones defined per RFC 9000. This is the case for the max_ack_delay parameter. The check was not properly implemented as it only rejected values strictly greater than the limit of 2^14. Fix this by rejecting values of 2^14 and above. Also, the proper error code TRANSPORT_PARAMETER_ERROR is now set. This should be backported up to 2.6. Note that it relies on the previous patch "MINOR: quic: extend return value during TP parsing".	2025-05-07 15:21:30 +02:00
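The two range checks described in the entries above (max_ack_delay and max_udp_payload) come down to boundary comparisons. The sketch below only illustrates those boundaries as stated in the commit messages; the macro and function names are made up for the example and are not haproxy's.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the limits quoted in the commit messages. */
#define TP_MAX_ACK_DELAY_LIMIT   (UINT64_C(1) << 14) /* 2^14 and above are invalid */
#define TP_MIN_UDP_PAYLOAD_SIZE  UINT64_C(1200)      /* values below 1200 are invalid */

/* Returns 1 if the received max_ack_delay is acceptable, otherwise 0, in
 * which case the connection is expected to be closed with
 * TRANSPORT_PARAMETER_ERROR. Using "<" rather than "<=" is the point of the
 * fix: the limit itself must be rejected.
 */
int tp_max_ack_delay_ok(uint64_t value)
{
	return value < TP_MAX_ACK_DELAY_LIMIT;
}

/* Returns 1 if the received max_udp_payload value is acceptable, else 0. */
int tp_max_udp_payload_ok(uint64_t value)
{
	return value >= TP_MIN_UDP_PAYLOAD_SIZE;
}
```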
Amaury Denoyelle	b60a17aad7	BUG/MINOR: quic: use proper error code on invalid received TP value As per RFC 9000, checks must be implemented to reject invalid values for received transport parameters. Such values are dependent on the parameter type. Checks were already implemented for ack_delay_exponent and active_connection_id_limit, in accordance with the QUIC specification. However, the connection was closed with an incorrect error code. Fix this to ensure that TRANSPORT_PARAMETER_ERROR code is used as expected. This should be backported up to 2.6. Note that it relies on the previous patch "MINOR: quic: extend return value during TP parsing".	2025-05-07 15:21:30 +02:00
Amaury Denoyelle	10f1f1adce	BUG/MINOR: quic: reject retry_source_cid TP on server side Close the connection on error if the retry_source_connection_id transport parameter is received. This is specified by RFC 9000 as this parameter must not be emitted by a client. Previously, it was silently ignored. This should be backported up to 2.6. Note that it relies on the previous patch "MINOR: quic: extend return value during TP parsing".	2025-05-07 15:21:30 +02:00
Amaury Denoyelle	a54fdd3d92	BUG/MINOR: quic: use proper error code on invalid server TP This commit is similar to the previous one. It fixes the error code reported when dealing with invalid received transport parameters. This time, it handles reception of original_destination_connection_id, preferred_address and stateless_reset_token which must not be emitted by the client. This should be backported up to 2.6. Note that it relies on the previous patch "MINOR: quic: extend return value during TP parsing".	2025-05-07 15:20:06 +02:00
Amaury Denoyelle	df6bd4909e	BUG/MINOR: quic: use proper error code on missing CID in TPs Handle missing received transport parameter value initial_source_connection_id / original_destination_connection_id. Previously, such a case would result in an error reported via quic_transport_params_store(), which triggers a TLS alert converted as expected as a CONNECTION_CLOSE. The issue is that the error code reported in the frame was incorrect. Fix this by returning QUIC_TP_DEC_ERR_INVAL for such conditions. This is directly handled via quic_transport_params_store() which sets the proper TRANSPORT_PARAMETER_ERROR code for the CONNECTION_CLOSE. However, no error is reported so the SSL handshake is properly terminated without a TLS alert. This is enough to ensure that the CONNECTION_CLOSE frame will be emitted as expected. This should be backported up to 2.6. Note that it relies on the previous patch "MINOR: quic: extend return value during TP parsing".	2025-05-07 15:20:06 +02:00
Amaury Denoyelle	294bf26c06	MINOR: quic: extend return value during TP parsing Extend API used for QUIC transport parameter decoding. This is done via the introduction of a dedicated enum to report the various error conditions detected. No functional change should occur with this patch, as the only returned code is QUIC_TP_DEC_ERR_TRUNC, which results in the connection closure via a TLS alert. This patch will be necessary to properly reject transport parameters with the proper CONNECTION_CLOSE error code. As such, it should be backported up to 2.6 with the following series.	2025-05-07 15:19:52 +02:00
Willy Tarreau	46b5dcad99	MINOR: stick-tables: add "ipv4" as an alias for the "ip" type However the doc purposely says the opposite, to encourage migrating away from "ip". The goal is that in the future we change "ip" to mean "ipv6", which seems to be what most users naturally expect. But we cannot break configurations in the LTS version so for now "ipv4" is the alias. The reason for not changing it in the table is that the type name is used at a few places (look for "].kw"): - dumps - promex We'd rather not change that output for 3.2, but only do it in 3.3. This way, 3.2 can be made future-proof by using "ipv4" in the config without any other side effect. Please see github issue #2962 for updates on this transition.	2025-05-07 10:11:55 +02:00
Willy Tarreau	697a531516	MINOR: debug: bump the dump buffer to 8kB Now with the improved backtraces, the lock history and details in the mux layers, some dumps appear truncated or with some chars alone at the beginning of the line. The issue is in fact caused by the limited dump buffer size (2kB for stderr, 4kB for warning), that cannot hold a complete trace anymore. Let's bump them to 8kB, this will be plenty for a long time.	2025-05-07 10:02:58 +02:00
Willy Tarreau	10e6d0bd57	BUG/MINOR: tools: only fill first empty arg when not out of range In commit 3f2c8af313 ("MINOR: tools: make parse_line() provide hints about empty args") we've added the ability to record the position of the first empty arg in parse_line(), but that check requires to access the args[] array for the current arg, which is not valid in case we stopped on too large an argument count. Let's just check the arg's validity before doing so. This was reported by OSS Fuzz: https://issues.oss-fuzz.com/issues/415850462 No backport is needed since this was in the latest dev branch.	2025-05-07 07:25:29 +02:00
William Lallemand	fbceabbccf	BUG/MINOR: ssl: can't use crt-store some certificates in ssl-f-use When declaring a certificate via the crt-store section, this certificate can then be used 2 ways in a crt-list: - only by using its name, without any crt-store options - or by using the exact set of crt-list options that was defined in the crt-store Since ssl-f-use generates a crt-list, it is supposed to behave the same. To achieve this, ckch_conf_parse() will parse the keywords related to the ckch_conf on the ssl-f-use line and use ckch_conf_cmp() to compare it to the previous declaration from the crt-store. This comparison is only done when any ckch_conf keyword is present. However, ckch_conf_parse() was done for the crt-list, and the crt-list does not use the "crt" parameter to declare the name of the certificate, since it's the first element of the line. So when used with ssl-f-use, ckch_conf_parse() will always see a "crt" keyword which is a ckch_conf one, and consider that it will always need to have the exact same set of parameters when using the same crt in a crt-store and an ssl-f-use line. So a simple configuration like this: crt-store web load crt "foo.com.crt" key "foo.com.key" alias "foo" frontend mysite bind :443 ssl ssl-f-use crt "@web/foo" ssl-min-ver TLSv1.2 Would lead to an error like this: config : '@web/foo' in crt-list '(null)' line 0, is already defined with incompatible parameters: - different parameter 'key' : previously 'foo.com.key' vs '(null)' In order to fix the issue, this patch parses the "crt" parameter itself for ssl-f-use instead of using ckch_conf_parse(), so the keyword would never be considered as a ckch_conf keyword to compare. This patch also takes care of setting the CKCH_CONF_SET_CRTLIST flag only if a ckch_conf keyword was found. This flag is used by ckch_conf_cmp() to know if it has to compare or not. No backport needed.	2025-05-06 21:36:29 +02:00
William Lallemand	b3b282d2ee	MINOR: ssl: add filename and linenum for ssl-f-use errors Fill cfg_crt_node with a filename and linenum so the post_section callback can use it to emit errors. This way the errors are emitted with the right filename and linenum where ssl-f-use is used instead of (null):0	2025-05-06 21:36:29 +02:00
Willy Tarreau	99f5be5631	BUG/MAJOR: queue: lock around the call to pendconn_process_next_strm() The extra call to pendconn_process_next_strm() made in commit cda7275ef5 ("MEDIUM: queue: Handle the race condition between queue and dequeue differently") was performed after releasing the server queue's lock, which is incompatible with the calling convention for this function. The result is random corruption of the server's streams list likely due to picking old or incorrect pendconns from the queue, and in the end infinitely looping on apparently already locked mt_list objects. Just adding the lock fixes the problem. It's very difficult to reproduce, it requires low maxconn values on servers, stickiness on the servers (cookie), a long enough slowstart (e.g. 10s), and regularly flipping servers up/down to re-trigger the slowstart. No backport is needed as this was only in 3.2.	2025-05-06 18:59:54 +02:00
William Lallemand	e035f0c48e	DOC: configuration: add the "crt-store" keyword Add the "crt-store" keyword with its argument in the "3.12" section, so this could be detected by haproxy-dconv as a keyword and put in the keywords list. Must be backported as far as 3.0.	2025-05-06 16:07:29 +02:00
William Lallemand	e516b14d36	DOC: configuration: add "acme" section to the keywords list Add the "acme" keyword with its argument in the "3.13" section, so this could be detected by haproxy-dconv as a keyword and put in the keywords list.	2025-05-06 15:34:39 +02:00
William Lallemand	b7c4a68ecf	MEDIUM: acme/ssl: remove 'acme ps' in favor of 'acme status' Remove the 'acme ps' command which does not seem useful anymore with the 'acme status' command. The big difference with the 'acme status' command is that it was only displaying the running tasks instead of the status of all certificates.	2025-05-06 15:27:29 +02:00
William Lallemand	48f1ce77b7	MINOR: acme/cli: 'acme status' show the status acme-configured certificates The "acme status" command shows the status of every certificate configured with ACME, not only the running tasks like "acme ps". The IO handler loops on the ckch_store tree and outputs a line for each ckch_store which has an acme section set. This is still done under the ckch_store lock and doesn't support resuming when the buffer is full, but we need to change that in the future.	2025-05-06 15:27:29 +02:00
Christopher Faulet	a3ce7d7772	Revert "BUG/MEDIUM: mux-spop: Handle CLOSING state and wait for AGENT DISCONNECT frame" This reverts commit 53c3046898633e56f74f7f05fb38cabeea1c87a1. This patch introduced a regression leading to a loop on the frames demultiplexing because a frame may be ignored but not consumed. But beyond this regression, which can be fixed, there is a design issue that was not totally fixed by the patch above. The SPOP connection state is mixed with the status of the frames demultiplexer and this needlessly complicates the connection management. Instead of fixing the fix, a better solution is to revert it and work on a proper solution. For the record, the idea is to deal with the spop connection state only using the 'state' field and to introduce a new field to handle the frames demultiplexer state. This should ease the closing state management. Another issue must be fixed: we must take care not to abort a SPOP stream when an error is detected on a SPOP connection or when the connection is closed, if the ACK frame was already received for this stream. It is not a common case, but it can be solved by saving the last known stream ID that received an ACK. This patch must be backported if the commit above is backported.	2025-05-06 13:43:59 +02:00
Aurelien DARRAGON	b39825ee45	BUG/MINOR: proxy: only use proxy_inc_fe_cum_sess_ver_ctr() with frontends proxy_inc_fe_cum_sess_ver_ctr() was implemented in 9969adbc ("MINOR: stats: add by HTTP version cumulated number of sessions and requests"). As its name suggests, it is meant to be called for frontends, not backends. Also, in 9969adbc, when used under h1_init(), a precaution is taken to ensure that the function is only called with frontends. However, this precaution was not applied in h2_init() and qc_init(). Due to this, it remains possible to have proxy_inc_fe_cum_sess_ver_ctr() being called with a backend proxy as parameter. While it did not cause known issues so far, it is not expected and could result in bugs in the future. Better fix this by ensuring the function is only called with frontends. It may be backported up to 2.8.	2025-05-06 11:01:39 +02:00
Willy Tarreau	3bb6eea6d5	DEBUG: threads: display held locks in threads dumps Based on the lock history, we can spot some locks that are still held by checking the last operation that happened on them: if it's not an unlock, then we know the lock is held. In this case we append the list after "locked:" with their label and state like below: U:QUEUE S:IDLE_CONNS U:IDLE_CONNS R:TASK_WQ U:TASK_WQ S:QUEUE S:QUEUE S:QUEUE locked: QUEUE(S) S:IDLE_CONNS U:IDLE_CONNS S:TASK_RQ U:TASK_RQ S:QUEUE U:QUEUE S:IDLE_CONNS locked: IDLE_CONNS(S) R:TASK_WQ S:TASK_WQ R:TASK_WQ S:TASK_WQ R:TASK_WQ S:TASK_WQ R:TASK_WQ locked: TASK_WQ(R) W:STK_TABLE W:STK_TABLE_UPDT U:STK_TABLE_UPDT W:STK_TABLE W:STK_TABLE_UPDT U:STK_TABLE_UPDT W:STK_TABLE W:STK_TABLE_UPDT locked: STK_TABLE(W) STK_TABLE_UPDT(W) The format is slightly different (label(status)) so as to easily differentiate them visually from the history.	2025-05-06 05:20:37 +02:00
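The rule stated in the entry above (a lock is considered held when the last operation recorded on its label is not an unlock) can be sketched like this. It is a simplified illustration working on a plain array; the structure, the function name and the output buffer handling are assumptions for the example and do not reflect haproxy's actual lock history storage.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One recorded operation: 'R', 'S' or 'W' as shown in the dumps above,
 * or 'U' for an unlock. Hypothetical layout.
 */
struct lock_op {
	char op;
	const char *label;
};

/* Scans the history (oldest first) and appends "LABEL(op)" to <out> for each
 * label whose most recent operation is not an unlock. Returns the number of
 * locks reported; stops early if <out> is too small.
 */
int report_held_locks(const struct lock_op *hist, int len, char *out, size_t size)
{
	size_t pos = 0;
	int held = 0;
	int i, j, n;

	if (size)
		out[0] = '\0';

	for (i = 0; i < len; i++) {
		/* skip this entry if a more recent one exists for the same label */
		for (j = i + 1; j < len; j++)
			if (strcmp(hist[j].label, hist[i].label) == 0)
				break;
		if (j < len || hist[i].op == 'U')
			continue;

		n = snprintf(out + pos, size - pos, "%s%s(%c)",
		             held ? " " : "", hist[i].label, hist[i].op);
		if (n < 0 || (size_t)n >= size - pos) {
			/* not enough room: drop the partial entry and stop */
			if (size > pos)
				out[pos] = '\0';
			break;
		}
		pos += (size_t)n;
		held++;
	}
	return held;
}
```

Fed with the first history line quoted above, this yields "QUEUE(S)", which matches the "locked:" suffix shown in the commit message.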
Willy Tarreau	feaac66b5e	DEBUG: threads: merge successive idempotent lock operations in history In order to make the lock history a bit more useful, let's try to merge adjacent lock/unlock sequences that don't change anything for other threads. For this we can replace the last unlock with the new operation on the same label, and even just not store it if it was the same as the one before the unlock, since in the end it's the same as if the unlock had not been done. Now loops that used to be filled with "R:LISTENER U:LISTENER" show more useful info such as: S:IDLE_CONNS U:IDLE_CONNS S:PEER U:PEER S:IDLE_CONNS U:IDLE_CONNS R:LISTENER U:LISTENER U:STK_TABLE W:STK_SESS U:STK_SESS R:STK_TABLE U:STK_TABLE W:STK_SESS U:STK_SESS R:STK_TABLE R:STK_TABLE U:STK_TABLE W:STK_SESS U:STK_SESS W:STK_TABLE_UPDT U:STK_TABLE_UPDT S:PEER It's worth noting that it can sometimes induce confusion when recursive locks of the same label are used (a few exist on peers or stick-tables), as in such a case the two operations would be needed. However these ones are already undebuggable, so instead they will just have to be renamed to make sure they use a distinct label.	2025-05-05 18:36:12 +02:00
Willy Tarreau	743dce95d2	DEBUG: threads: don't keep lock label "OTHER" in the per-thread history Most threads are filled with "R:OTHER U:OTHER" in their history. Since anything non-important can use other it's not observable but it pollutes the history. Let's just drop OTHER entirely during the recording.	2025-05-05 18:10:57 +02:00
Willy Tarreau	1f51f1c816	BUG/MINOR: tools: make parseline report the required space for the trailing 0 The fix in commit 09a325a4de ("BUG/MINOR: tools: always terminate empty lines") is insufficient. While it properly addresses the lack of trailing zero, it doesn't account for it in the returned outlen that is used to allocate a larger line. This happens at boot if the very first line of the test file is exactly a sharp with nothing else. In this case it will return a length 0 and the caller (parse_cfg()) will try to re-allocate an entry of size zero and will fail, bailing out on a lack of memory. This time it should really be OK. It doesn't need to be backported, unless the patch above would be.	2025-05-05 17:58:04 +02:00
Willy Tarreau	09a325a4de	BUG/MINOR: tools: always terminate empty lines Since latest commit 7e4a2f39ef ("BUG/MINOR: tools: do not create an empty arg from trailing spaces"), an empty line will no longer produce an arg and no longer append a trailing zero to them. This was not visible because one is already present in the input string, however all the trailing args are set to out+outpos-1, which now points one char before the buffer since nothing was emitted, and was noticed by ASAN, and/or when parsing garbage. Let's make sure to always emit the zero for empty lines as well to address this issue. No backport is needed unless the patch above gets backported.	2025-05-05 17:33:22 +02:00
Willy Tarreau	08d3caf30e	MINOR: cfgparse: visually show the input line on empty args Now when an empty arg is found on a line, we emit the sanitized input line and the position of the first empty arg so as to help the user figure the cause (likely an empty environment variable). Co-authored-by: Valentine Krasnobaeva <vkrasnobaeva@haproxy.com>	2025-05-05 16:17:24 +02:00
Willy Tarreau	3f2c8af313	MINOR: tools: make parse_line() provide hints about empty args In order to help parse_line() callers report the position of empty args to the user, let's decide that if no error is emitted, then we'll stuff the errptr with the position of the first empty arg without affecting the return value. Co-authored-by: Valentine Krasnobaeva <vkrasnobaeva@haproxy.com>	2025-05-05 16:17:24 +02:00
Willy Tarreau	9d14f2c764	MEDIUM: config: warn about the consequences of empty arguments on a config line For historical reasons, the config parser relies on the trailing '\0' to detect the end of the line being parsed. When the lines started to be tokenized into arguments, this principle has been preserved, and now all the parsers rely on *args[arg]='\0' to detect the end of a line. But as reported in issue #2944, while most of the time it breaks the parsing like below: http-request deny if { path_dir '' } it can also cause some elements to be silently ignored like below: acl bad_path path_sub '%2E' '' '%2F' This may also subtly happen with environment variables that don't exist or which are empty: acl bad_path path_sub '%2E' "$BAD_PATTERN" '%2F' Fortunately, parse_line() returns the number of arguments found, so it's easy from the callers to verify if any was empty. The goal of this commit is not to perform sensitive changes, it's only to mention when parsing a line that an empty argument was found and alert about its consequences using a warning. Most of the time when this happens, the config does not parse. But for examples as the ACLs above, there could be consequences that are better detected early. This patch depends on this previous fix: BUG/MINOR: tools: do not create an empty arg from trailing spaces Co-authored-by: Valentine Krasnobaeva <vkrasnobaeva@haproxy.com>	2025-05-05 16:17:24 +02:00
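The effect described in the entry above, where an empty argument hides everything after it, can be illustrated with a small sketch. The two helpers are hypothetical and only mimic the convention mentioned in the message (parsers stop at the first argument whose first character is '\0'); they are not taken from haproxy.

```c
#include <assert.h>

/* Counts arguments the way a keyword parser relying on *args[arg] == '\0'
 * would see them: it stops at the first empty string, at most <max> entries.
 */
int visible_args(const char **args, int max)
{
	int arg = 0;

	while (arg < max && *args[arg] != '\0')
		arg++;
	return arg;
}

/* Returns the index of the first empty argument among the <nbargs> reported
 * by the tokenizer, or -1 if none is empty. A non-negative result is the
 * situation the warning is about.
 */
int first_empty_arg(const char **args, int nbargs)
{
	int arg;

	for (arg = 0; arg < nbargs; arg++)
		if (*args[arg] == '\0')
			return arg;
	return -1;
}
```

With the ACL example from the message tokenized as "acl", "bad_path", "path_sub", "%2E", "", "%2F", the tokenizer reports six arguments while a parser using the end-of-line convention only sees four, so '%2F' is silently dropped.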
Willy Tarreau	7e4a2f39ef	BUG/MINOR: tools: do not create an empty arg from trailing spaces Trailing spaces on the lines of the config file create an empty arg which makes it complicated to detect really empty args. Let's first address this. Note that it is not user-visible but prevents from fixing user-visible issues. No backport is needed. The initial issue was introduced with this fix that already tried to address it: 8a6767d266 ("BUG/MINOR: config: don't count trailing spaces as empty arg (v2)") The current patch properly addresses leading and trailing spaces by only counting arguments if non-lws chars were found on the line. LWS do not cause a transition to a new arg anymore but they complete the current one. The whole new code relies on a state machine to detect when to create an arg (!in_arg->in_arg), and when to close the current arg. A special care was taken for word expansion in the form of "${ARGS[]}" which still continue to emit individual arguments past the first LWS. This example works fine: ARGS="100 check inter 1000" server name 192.168.1."${ARGS[]}" It properly results in 6 args: "server", "name", "192.168.1.100", "check", "inter", "1000" This fix should not have any visible user impact and is a bit tricky, so it's best not to backport it, at least for a while. Co-authored-by: Valentine Krasnobaeva <vkrasnobaeva@haproxy.com>	2025-05-05 16:16:54 +02:00
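The state machine idea from the commit above (create an argument only on the `!in_arg -> in_arg` transition, and let LWS merely close the current argument) can be sketched as follows. This is a simplified Python illustration that assumes no quoting, escaping or environment expansion; `split_args()` is a hypothetical name, not the real parse_line().

```python
def split_args(line):
    """Tokenize a line on linear white space (LWS).

    A new argument is created only when a non-LWS character is met while
    no argument is open, so leading and trailing spaces never produce an
    empty argument.
    """
    args = []
    in_arg = False
    for ch in line:
        if ch in " \t":
            in_arg = False          # LWS only closes the current argument
        else:
            if not in_arg:          # !in_arg -> in_arg: open a new argument
                args.append("")
                in_arg = True
            args[-1] += ch
    return args


print(split_args("  server name 192.168.1.100 check   "))
print(split_args("    "))
```

With this approach an all-blank line yields zero arguments, which is what allows a later check to treat any remaining empty argument as one the user really wrote.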
William Lallemand	af5bbce664	BUG/MINOR: acme/cli: don't output error on success Previous patch 7251c13c7 ("MINOR: acme: move the acme task init in a dedicated function") mistakenly returned the wrong error code when "acme renew" parsing was successful, and tried to emit an error message. This patch fixes the issue by returning 0 when the acme task was correctly scheduled to start. No backport needed.	2025-05-02 21:21:09 +02:00
Aurelien DARRAGON	0e6f968ee3	BUG/MEDIUM: stktable: fix sc_(<ctr>) BUG_ON() regression with ctx > 9 As reported in GH #2958, commit 6c9b315 caused a regression with sc_ fetches and tracked counter id > 9. As such, the below configuration would cause a BUG_ON() to be triggered: global log stdout format raw local0 tune.stick-counters 11 defaults log global mode http frontend www bind :8080 acl track_me bool(true) http-request set-var(txn.track_var) str("a") http-request track-sc10 var(txn.track_var) table rate_table if track_me http-request set-var(txn.track_var_rate) sc_gpc_rate(0,10,rate_table) http-request return status 200 backend rate_table stick-table type string size 1k expire 5m store gpc_rate(1,1m) While in 6c9b315 the src_fetch logic was removed from smp_fetch_sc_stkctr(), num > 9 is indeed not expected anymore as original num value. But what we didn't consider is that num is effectively re-assigned for generic sc_ variant. Thus the BUG_ON() is misplaced as it should only be evaluated for non-generic fetches. It explains why it triggers with valid configurations Thanks to GH user @tkjaer for his detailed report and bug analysis No backport needed, this bug is specific to 3.2.	2025-05-02 16:57:45 +02:00
Willy Tarreau	758e0818c3	[RELEASE] Released version 3.2-dev14 Released version 3.2-dev14 with the following main changes : - MINOR: acme: retry label always do a request - MINOR: acme: does not leave task for next request - BUG/MINOR: acme: reinit the retries only at next request - MINOR: acme: change the default max retries to 5 - MINOR: acme: allow a delay after a valid response - MINOR: acme: wait 5s before checking the challenges results - MINOR: acme: emit a log when starting - MINOR: acme: delay of 5s after the finalize - BUG/MEDIUM: quic: Let it be known if the tasklet has been released. - BUG/MAJOR: tasks: fix task accounting when killed - CLEANUP: tasks: use the local state, not t->state, to check for tasklets - DOC: acme: external account binding is not supported - MINOR: hlua: ignore "tune.lua.bool-sample-conversion" if set after "lua-load" - MEDIUM: peers: Give up if we fail to take locks in hot path - MEDIUM: stick-tables: defer adding updates to a tasklet - MEDIUM: stick-tables: Limit the number of old entries we remove - MEDIUM: stick-tables: Limit the number of entries we expire - MINOR: cfgparse-global: add explicit error messages in cfg_parse_global_env_opts - MINOR: ssl: add function to extract X509 notBefore date in time_t - BUILD: acme: need HAVE_ASN1_TIME_TO_TM - MINOR: acme: move the acme task init in a dedicated function - MEDIUM: acme: add a basic scheduler - MINOR: acme: emit a log when the scheduler can't start the task	2025-05-02 16:23:28 +02:00
William Lallemand	7ad501e6a1	MINOR: acme: emit a log when the scheduler can't start the task Emit an error log when the renewal scheduler can't start the task.	2025-05-02 16:12:41 +02:00
William Lallemand	7fe59ebb88	MEDIUM: acme: add a basic scheduler This patch implements a very basic scheduler for the ACME tasks. The scheduler is a task which is started from the postparser function when at least one acme section was configured. The scheduler will loop over the certificates in the ckchs_tree, and for each certificate will start an ACME task if the notAfter date is past curtime + (notAfter - notBefore) / 12, or 7 days if notBefore is not available. Once the lookup over all certificates is terminated, the task will sleep and will wakeup after 12 hours.	2025-05-02 16:01:32 +02:00
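The renewal rule described in the commit above can be sketched as below. This is a Python illustration under a stated assumption: the rule is read as "renew once the remaining lifetime drops below one twelfth of the validity period, or below 7 days when notBefore is unknown". The function name `renewal_due()` and the use of plain epoch seconds are hypothetical.

```python
SEVEN_DAYS = 7 * 24 * 3600


def renewal_due(now, not_after, not_before=None):
    """Return True when a certificate should be renewed.

    Assumed reading of the rule: renew when now + margin reaches notAfter,
    with margin = (notAfter - notBefore) / 12, or 7 days as a fallback.
    All values are seconds since the epoch.
    """
    if not_before is not None:
        margin = (not_after - not_before) // 12
    else:
        margin = SEVEN_DAYS
    return now + margin >= not_after


DAY = 24 * 3600
# A 90-day certificate has a margin of 7.5 days.
print(renewal_due(80 * DAY, 90 * DAY, 0))
print(renewal_due(83 * DAY, 90 * DAY, 0))
```

Since the scheduler described above only wakes up every 12 hours, the margin has to be comfortably larger than that interval for a renewal to start in time.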
William Lallemand	7251c13c77	MINOR: acme: move the acme task init in a dedicated function acme_start_task() is a dedicated function which starts an acme task for a specified <store> certificate. The initialization code was moved from the "acme renew" command parser to this function, in order to be called from a scheduler.	2025-05-02 16:01:32 +02:00
William Lallemand	878a3507df	BUILD: acme: need HAVE_ASN1_TIME_TO_TM Restrict the build of the ACME feature to libraries which provide ASN1_TIME_to_tm() function.	2025-05-02 16:01:32 +02:00
William Lallemand	626de9538e	MINOR: ssl: add function to extract X509 notBefore date in time_t Add x509_get_notbefore_time_t() which returns the notBefore date in time_t format.	2025-05-02 16:01:32 +02:00
Valentine Krasnobaeva	8a4b3216f9	MINOR: cfgparse-global: add explicit error messages in cfg_parse_global_env_opts When the env variable name or value is not provided for setenv/presetenv, it's not clear from the old error message shown on stderr what exactly is missing. The user needs to search in their configuration. Let's add more explicit error messages about these inconsistencies. No need to be backported.	2025-05-02 15:37:45 +02:00
Olivier Houchard	994cc58576	MEDIUM: stick-tables: Limit the number of entries we expire In process_table_expire(), limit the number of entries we remove in one call, and just reschedule the task if there's more to do. Removing entries require to use the heavily contended update write lock, and we don't want to hold it for too long. This helps getting stick tables perform better under heavy load.	2025-05-02 15:27:55 +02:00
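The "do a bounded amount of work, then reschedule" pattern from the commit above can be sketched as follows. This is a Python illustration only: `expire_batch()` and its list of `(expiry, key)` tuples are hypothetical stand-ins for the real expiration tree and lock handling in process_table_expire().

```python
def expire_batch(entries, now, max_per_call):
    """Remove at most max_per_call expired entries.

    entries is a list of (expiry, key) tuples sorted by expiry.
    Returns (removed_keys, more_to_do) so that the caller can reschedule
    itself immediately instead of holding a contended lock for too long.
    """
    removed = []
    while entries and entries[0][0] <= now and len(removed) < max_per_call:
        removed.append(entries.pop(0)[1])
    more_to_do = bool(entries) and entries[0][0] <= now
    return removed, more_to_do


table = [(1, "a"), (2, "b"), (3, "c"), (10, "d")]
print(expire_batch(table, now=5, max_per_call=2))  # first pass, work remains
print(expire_batch(table, now=5, max_per_call=2))  # second pass finishes
```

The trade-off is latency for fairness: expired entries may linger for one more scheduling round, but no single call monopolizes the lock.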
Olivier Houchard	d2d4c3eb65	MEDIUM: stick-tables: Limit the number of old entries we remove Limit the number of old entries we remove in one call of stktable_trash_oldest(), as we do so while holding the heavily contended update write lock, so we'd rather not hold it for too long. This helps getting stick tables perform better under heavy load.	2025-05-02 15:27:55 +02:00
Olivier Houchard	388539faa3	MEDIUM: stick-tables: defer adding updates to a tasklet There is a lot of contention trying to add updates to the tree. So instead of trying to add the updates to the tree right away, just add them to a mt-list (with one mt-list per thread group, so that the mt-list does not become the new point of contention that much), and create a tasklet dedicated to adding updates to the tree, in batches, to avoid keeping the update lock for too long. This helps getting stick tables perform better under heavy load.	2025-05-02 15:27:55 +02:00
Olivier Houchard	b3ad7b6371	MEDIUM: peers: Give up if we fail to take locks in hot path In peer_send_msgs(), give up in order to retry later if we failed to get the update read lock. Similarly, in __process_running_peer_sync(), give up and just reschedule the task if we failed to get the peer lock. There is heavy contention on both those locks, so we could spend a lot of time trying to get them. This helps getting peers perform better under heavy load.	2025-05-02 15:27:55 +02:00
Aurelien DARRAGON	7a8d1a3122	MINOR: hlua: ignore "tune.lua.bool-sample-conversion" if set after "lua-load" tune.lua.bool-sample-conversion must be set before any lua-load or lua-load-per-thread is used for it to be considered. Indeed, lua-load directives are parsed on the fly and will cause some parts of the scripts to be executed during init already (script body/init contexts). As such, we cannot afford to have "tune.lua.bool-sample-conversion" set after some Lua code was loaded, because it would mean that the setting would be handled differently for Lua's code executed during or after config parsing. To avoid ambiguities, the documentation now states that the setting must be set before any lua-load(-per-thread) directive, and if the setting is met after some Lua was already loaded, the directive is ignored and a warning informs about that. It should fix GH #2957 It may be backported with 29b6d8af16 ("MINOR: hlua: rename "tune.lua.preserve-smp-bool" to "tune.lua.bool-sample-conversion"")	2025-05-02 14:38:37 +02:00
William Lallemand	6051a6e485	DOC: acme: external account binding is not supported Add a note on external account binding in the ACME section.	2025-05-02 12:04:07 +02:00
Willy Tarreau	1ed238101a	CLEANUP: tasks: use the local state, not t->state, to check for tasklets There's no point reading t->state to check for a tasklet after we've atomically read the state into the local "state" variable. Not only it's more expensive, it's also less clear whether that state is supposed to be atomic or not. And in any case, tasks and tasklets have their type forever and the one reflected in state is correct and stable.	2025-05-02 11:09:28 +02:00
Willy Tarreau	45e83e8e81	BUG/MAJOR: tasks: fix task accounting when killed After recent commit b81c9390f ("MEDIUM: tasks: Mutualize the TASK_KILLED code between tasks and tasklets"), the task accounting was no longer correct for killed tasks due to the decrement of tasks in list that was no longer done, resulting in infinite loops in process_runnable_tasks(). This just illustrates that this code remains complex and should be further cleaned up. No backport is needed, as this was in 3.2.	2025-05-02 11:09:28 +02:00
Olivier Houchard	faa18c1ad8	BUG/MEDIUM: quic: Let it be known if the tasklet has been released. quic_conn_release() may, or may not, free the tasklet associated with the connection. So make it return 1 if it was, and 0 otherwise, so that if it was called from the tasklet handler itself, the said handler can act accordingly and return NULL if the tasklet was destroyed. This should be backported if 9240cd4a2771245fae4d0d69ef025104b14bfc23 is backported.	2025-05-02 11:09:28 +02:00
William Lallemand	f63ceeded0	MINOR: acme: delay of 5s after the finalize Give the server 5 seconds by default after the finalize to generate the certificate. Some servers would not send a Retry-After during processing.	2025-05-02 10:34:48 +02:00
William Lallemand	2db4848fc8	MINOR: acme: emit a log when starting Emit an administrative log when starting the ACME client for a certificate.	2025-05-02 10:23:42 +02:00
William Lallemand	fbd740ef3e	MINOR: acme: wait 5s before checking the challenges results Wait 5 seconds before trying to check if the challenges are ready, so as to give the server time to execute the challenges.	2025-05-02 10:18:24 +02:00
William Lallemand	f7cae0e55b	MINOR: acme: allow a delay after a valid response Use the retryafter value to set a delay before doing the next request when the previous response was valid.	2025-05-02 10:16:12 +02:00
William Lallemand	18d2371e0d	MINOR: acme: change the default max retries to 5 Change the default max retries constant to 5 instead of 3. Some servers can take a while to execute the challenge.	2025-05-02 09:40:12 +02:00
William Lallemand	24fbd1f724	BUG/MINOR: acme: reinit the retries only at next request The retries were reinitialized incorrectly, it must be reinit only when we didn't retry. So any valid response would reinit the retries number.	2025-05-02 09:34:45 +02:00
William Lallemand	6626011720	MINOR: acme: does not leave task for next request The next request was always leaving the task before initializing the httpclient. This patch optimizes it by jumping to the next step at the end of the current one. This way, only the httpclient does a task_wakeup() to handle the response, while transitioning from a response to the next request does not leave the task.	2025-05-02 09:31:39 +02:00
William Lallemand	51f9415d5e	MINOR: acme: retry label always do a request Doing a retry always results in initializing a request again, so set ACME_HTTP_REQ directly in the label instead of doing it for each step.	2025-05-02 09:15:07 +02:00
Willy Tarreau	c589964bcc	[RELEASE] Released version 3.2-dev13 Released version 3.2-dev13 with the following main changes : - MEDIUM: checks: Make sure we return the tasklet from srv_chk_io_cb - MEDIUM: listener: Make sure w ereturn the tasklet from accept_queue_process - MEDIUM: mux_fcgi: Make sure we return the tasklet from fcgi_deferred_shut - MEDIUM: quic: Make sure we return the tasklet from qcc_io_cb - MEDIUM: quic: Make sure we return NULL in quic_conn_app_io_cb if needed - MEDIUM: quic: Make sure we return the tasklet from quic_accept_run - BUG/MAJOR: tasklets: Make sure he tasklet can't run twice - BUG/MAJOR: listeners: transfer connection accounting when switching listeners - MINOR: ssl/cli: add a '-t' option to 'show ssl sni' - DOC: config: fix ACME paragraph rendering issue - DOC: config: clarify log-forward "host" option - MINOR: promex: expose ST_I_PX_RATE (current_session_rate) - BUILD: acme: use my_strndup() instead of strndup() - BUILD: leastconn: fix build warning when building without threads on old machines - MINOR: threads: prepare DEBUG_THREAD to receive more values - MINOR: threads: turn the full lock debugging to DEBUG_THREAD=2 - MEDIUM: threads: keep history of taken locks with DEBUG_THREAD > 0 - MINOR: threads/cli: display the lock history on "show threads" - MEDIUM: thread: set DEBUG_THREAD to 1 by default - BUG/MINOR: ssl/acme: free EVP_PKEY upon error - MINOR: acme: separate the code generating private keys - MINOR: acme: failure when no directory is specified - MEDIUM: acme: generate the account file when not found - MEDIUM: acme: use 'crt-base' to load the account key - MINOR: compiler: add more macros to detect macro definitions - MINOR: cli: split APPCTX_CLI_ST1_PROMPT into two distinct flags - MEDIUM: cli: make the prompt mode configurable between n/i/p - MEDIUM: mcli: make the prompt mode configurable between i/p - MEDIUM: mcli: replicate the current mode when enterin the worker process - DOC: configuration: acme account key are 
auto generated - CLEANUP: acme: remove old TODO for account key - DOC: configuration: add quic4 to the ssl-f-use example - BUG/MINOR: acme: does not try to unlock after a failed trylock - BUG/MINOR: mux-h2: fix the offset of the pattern for the ping frame - MINOR: tcp: add support for setting TCP_NOTSENT_LOWAT on both sides - BUG/MINOR: acme: creating an account should not end the task - MINOR: quic: rename min/max fields for congestion window algo - MINOR: quic: refactor BBR API - BUG/MINOR: quic: ensure cwnd limits are always enforced - MINOR: thread: define cshared type - MINOR: quic: account for global congestion window - MEDIUM: quic: limit global Tx memory - MEDIUM: acme: use a map to store tokens and thumbprints - BUG/MINOR: acme: remove references to virt@acme - MINOR: applet: add appctx_schedule() macro - BUG/MINOR: dns: add tempo between 2 connection attempts for dns servers - CLEANUP: dns: remove unused dns_stream_server struct member - BUG/MINOR: dns: prevent ds accumulation within dss - CLEANUP: proxy: mention that px->conn_retries isn't relevant in some cases - DOC: ring: refer to newer RFC5424 - MINOR: tools: make my_strndup() take a size_t len instead of and int - MINOR: Add "sigalg" to "sigalg name" helper function - MINOR: ssl: Add traces to ssl init/close functions - MINOR: ssl: Add traces to recv/send functions - MINOR: ssl: Add traces to ssl_sock_io_cb function - MINOR: ssl: Add traces around SSL_do_handshake call - MINOR: ssl: Add traces to verify callback - MINOR: ssl: Add ocsp stapling callback traces - MINOR: ssl: Add traces to the switchctx callback - MINOR: ssl: Add traces about sigalg extension parsing in clientHello callback - MINOR: Add 'conn' param to ssl_sock_chose_sni_ctx - BUG/MEDIUM: mux-spop: Wait end of handshake to declare a spop connection ready - BUG/MEDIUM: mux-spop: Handle CLOSING state and wait for AGENT DISCONNECT frame - BUG/MINOR: mux-h1: Don't pretend connection was released for TCP>H1>H2 upgrade - BUG/MINOR: mux-h1: 
Fix trace message in h1_detroy() to not relay on connection - BUILD: ssl: Fix wolfssl build - BUG/MINOR: mux-spop: Use the right bitwise operator in spop_ctl() - MEDIUM: mux-quic: increase flow-control on each bufsize - MINOR: mux-quic: limit emitted MSD frames count per qcs - MINOR: add hlua_yield_asap() helper - MINOR: hlua_fcn: enforce yield after *_get_stats() methods - DOC: config: restore default values for resolvers hold directive - MINOR: ssl/cli: "acme ps" shows the acme tasks - MINOR: acme: acme_ctx_destroy() returns upon NULL - MINOR: acme: use acme_ctx_destroy() upon error - MEDIUM: tasks: Mutualize code between tasks and tasklets. - MEDIUM: tasks: More code factorization - MEDIUM: tasks: Remove TASK_IN_LIST and use TASK_QUEUED instead. - MINOR: tasks: Remove unused tasklet_remove_from_tasklet_list - MEDIUM: tasks: Mutualize the TASK_KILLED code between tasks and tasklets - BUG/MEDIUM: connections: Report connection closing in conn_create_mux() - BUILD/MEDIUM: quic: Make sure we build with recent changes	2025-04-30 18:25:28 +02:00
Olivier Houchard	81e4083efb	BUILD/MEDIUM: quic: Make sure we build with recent changes TASK_IN_LIST has been changed to TASK_QUEUED, but one was missed in quic_conn.c, so fix that.	2025-04-30 18:00:56 +02:00
Olivier Houchard	b138eab302	BUG/MEDIUM: connections: Report connection closing in conn_create_mux() Add an extra parameter to conn_create_mux(), "closed_connection". If a pointer is provided, then let it know if the connection was closed. Callers have no way to determine that otherwise, and we need to know that, at least in ssl_sock_io_cb(): if the connection was closed we need to return NULL, as the tasklet was freed; otherwise that can lead to memory corruption and crashes. This should be backported if 9240cd4a2771245fae4d0d69ef025104b14bfc23 is backported too.	2025-04-30 17:17:36 +02:00
Olivier Houchard	b81c9390f4	MEDIUM: tasks: Mutualize the TASK_KILLED code between tasks and tasklets The code to handle a task/tasklet when it's been killed before it were to run is mostly identical, so move it outside of task and tasklet specific code, and inside the common code. This commit is just cosmetic, and should have no impact.	2025-04-30 17:09:14 +02:00
Olivier Houchard	4abfade371	MINOR: tasks: Remove unused tasklet_remove_from_tasklet_list Remove tasklet_remove_from_tasklet_list, as the function hasn't been used for a long time, and there is little reason to keep it.	2025-04-30 17:09:06 +02:00
Olivier Houchard	2bab043c8c	MEDIUM: tasks: Remove TASK_IN_LIST and use TASK_QUEUED instead. TASK_QUEUED was used to mean "the task has been scheduled to run", TASK_IN_LIST was used to mean "the tasklet has been scheduled to run", remove TASK_IN_LIST and just use TASK_QUEUED for tasklets instead. This commit is just cosmetic, and should not have any impact.	2025-04-30 17:08:57 +02:00
Olivier Houchard	35df7cbe34	MEDIUM: tasks: More code factorization There is some code that should run no matter if the task was killed or not, and was needlessly duplicated, so only use one instance. This also fixes a small bug when a tasklet that got killed before it could run would still count as a tasklet that ran, when it should not, which just means that we'd run one less useful task before going back to the poller. This commit is mostly cosmetic, and should not have any impact.	2025-04-30 17:08:57 +02:00
Olivier Houchard	438c000e9f	MEDIUM: tasks: Mutualize code between tasks and tasklets. The code that checks if we're currently running, and waits if so, was identical between tasks and tasklets, so move it in code common to tasks and tasklets. This commit is just cosmetic, and should not have any impact.	2025-04-30 17:08:57 +02:00
William Lallemand	6462f183ad	MINOR: acme: use acme_ctx_destroy() upon error Use acme_ctx_destroy() instead of a simple free() upon error in the "acme renew" error handling. It's better to use this function to be sure that everything has been freed.	2025-04-30 17:18:46 +02:00
William Lallemand	b8a5270334	MINOR: acme: acme_ctx_destroy() returns upon NULL acme_ctx_destroy() returns when its argument is NULL.	2025-04-30 17:17:58 +02:00
William Lallemand	563ca94ab8	MINOR: ssl/cli: "acme ps" shows the acme tasks Implement a way to display the running acme tasks over the CLI. It currently only displays a "Running" status with the certificate name and the acme section from the configuration. The displayed running tasks are limited to the size of a buffer for now, it will require a backref list later to be called multiple times to resume the list.	2025-04-30 17:12:50 +02:00
Aurelien DARRAGON	4bceca83fc	DOC: config: restore default values for resolvers hold directive Default values for hold directive (resolver context) used to be documented but this was lost when the keyword description was reworked in 24b319b ("Default value is 10s for "valid", 0s for "obsolete" and 30s for others.") Restoring the part that describes the default value. It may be backported to all stable versions with 24b319b	2025-04-30 17:00:37 +02:00
Aurelien DARRAGON	7f418ac7d2	MINOR: hlua_fcn: enforce yield after *_get_stats() methods {listener,proxy,server}_get_stats() methods are known to be expensive, especially if used under an iteration. Indeed, while automatic yield is performed every X lua instructions (defaults to 10k), computing an object's stats 10K times in a single cpu loop is not desirable and could create contention. In this patch we leverage hlua_yield_asap() at the end of *_get_stats() methods in order to force the automatic yield to occur ASAP after the method returns. Hopefully this should help in similar scenarios as the one described in GH #2903	2025-04-30 17:00:31 +02:00
Aurelien DARRAGON	97363015a5	MINOR: add hlua_yield_asap() helper When called, this function will try to enforce a yield (if available) as soon as possible. Indeed, automatic yield is already enforced every X Lua instructions. However, there may be some cases where we know after running heavy operation that we should yield already to avoid taking too much CPU at once. This is what this function offers, instead of asking the user to manually yield using "core.yield()" from Lua itself after using an expensive Lua method offered by haproxy, we can directly enforce the yield without the need to do it in the Lua script.	2025-04-30 17:00:27 +02:00
Amaury Denoyelle	df50d3e39f	MINOR: mux-quic: limit emitted MSD frames count per qcs The previous commit has implemented a new calculation method for MAX_STREAM_DATA frame emission. Now, a frame may be emitted as soon as a buffer was consumed by a QCS instance. This will probably increase the number of MAX_STREAM_DATA frames emitted. It may even cause a series of frames to be emitted for the same stream with increasing values under high load, which is completely unnecessary. To improve this, limit the number of MAX_STREAM_DATA frames built to one per QCS instance. This is implemented by storing a reference to this frame in the QCS structure via a new member <tx.msd_frm>. Note that to properly reset the QCS msd_frm member, emission of flow-control frames has been changed. Now, each frame is emitted individually. On one side, it is better as it prevents emitting frames related to different streams in a single datagram, which is not desirable in case of packet loss. However, this can also increase the number of sendto() syscall invocations.	2025-04-30 16:08:47 +02:00
Amaury Denoyelle	14a3fb679f	MEDIUM: mux-quic: increase flow-control on each bufsize Recently, the QCS Rx buffer allocation method has been improved. It is now possible to allocate multiple buffers per QCS instance, which was necessary to improve HTTP/3 POST throughput. However, a limitation remained related to the emission of MAX_STREAM_DATA. These frames are only emitted once at least half of the receive capacity has been consumed by its QCS instance. This may be too restrictive when a client needs to upload a large payload. Improve this by adjusting MAX_STREAM_DATA allocation. If QCS capacity is still limited to 1 or 2 buffers max, the old calculation is still used. This is necessary when the user has limited upload throughput via their configuration. If QCS capacity is more than 2 buffers, a new frame is emitted if at least one buffer was consumed. This patch has reduced the number of STREAM_DATA_BLOCKED frames received in POST tests with some specific clients.	2025-04-30 16:08:47 +02:00
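The two emission rules described in the commit above can be compared with a small sketch. It is a Python illustration under assumptions: `should_emit_msd()` is a hypothetical helper, sizes are in bytes, and "capacity" stands for the receive capacity of a single QCS instance.

```python
def should_emit_msd(consumed, capacity, bufsize):
    """Decide whether a MAX_STREAM_DATA frame should be emitted.

    - capacity of 1 or 2 buffers: old rule, emit once at least half of
      the receive capacity has been consumed;
    - larger capacity: new rule, emit as soon as one buffer was consumed.
    """
    if capacity <= 2 * bufsize:
        return consumed * 2 >= capacity
    return consumed >= bufsize


BUFSIZE = 16384
# With 8 buffers, one consumed buffer is now enough; the old rule
# would have waited for 4 buffers (half of the capacity).
print(should_emit_msd(BUFSIZE, 8 * BUFSIZE, BUFSIZE))
print(should_emit_msd(BUFSIZE - 1, 8 * BUFSIZE, BUFSIZE))
```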
Christopher Faulet	2ccfebcebf	BUG/MINOR: mux-spop: Use the right bitwise operator in spop_ctl() Because of a typo, '||' was used instead of '|' to test the SPOP connection flags and decide if the mux is ready or not. The regression was introduced in the commit fd7ebf117 ("BUG/MEDIUM: mux-spop: Wait end of handshake to declare a spop connection ready"). This patch must be backported to 3.1 with the commit above.	2025-04-30 16:01:36 +02:00
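Why a logical OR silently breaks a flag test can be shown with a short sketch. It is written in Python, so C's `||` is emulated by a helper that returns 0 or 1; the flag names and values are hypothetical and are not the real SPOP connection flags.

```python
FL_ERROR = 0x04   # hypothetical flag values, for illustration only
FL_SHUT = 0x08


def c_logical_or(a, b):
    """Emulate C's '||', which always evaluates to 0 or 1."""
    return 1 if (a != 0 or b != 0) else 0


def is_ready_buggy(flags):
    # flags & (FL_ERROR || FL_SHUT) only ever tests bit 0
    return (flags & c_logical_or(FL_ERROR, FL_SHUT)) == 0


def is_ready_fixed(flags):
    # flags & (FL_ERROR | FL_SHUT) tests both flag bits
    return (flags & (FL_ERROR | FL_SHUT)) == 0


print(is_ready_buggy(FL_ERROR))  # the error flag goes unnoticed
print(is_ready_fixed(FL_ERROR))
```

Both expressions are valid C and compile cleanly when written this way, which is why this kind of typo tends to surface only through behaviour.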
Remi Tricot-Le Breton	f191a830d8	BUILD: ssl: Fix wolfssl build The newly added SSL traces require an extra 'conn' parameter to ssl_sock_chose_sni_ctx which was added in the "regular" code but not in the wolfssl specific one. Wolfssl also has a different prototype for some getter functions (SSL_get_servername for instance), which do not expect a const SSL while openssl version does.	2025-04-30 15:50:10 +02:00
Christopher Faulet	7dc4e94830	BUG/MINOR: mux-h1: Fix trace message in h1_detroy() to not relay on connection h1_destroy() may be called to release a H1C after a multiplexer upgrade. In that case, the connection is no longer attached to the H1C. It must not be used in the h1 trace message because the connection context is no longer a H1C. Because of this bug, when a H1>H2 upgrade is performed, a crash may be experienced if the H1 traces are enabled. This patch must be backported to all stable versions.	2025-04-30 14:44:42 +02:00
Christopher Faulet	2dc334be61	BUG/MINOR: mux-h1: Don't pretend connection was released for TCP>H1>H2 upgrade When an applicative upgrade of the H1 multiplexer is performed, we must not pretend the connection was released. Indeed, in that case, a H1 stream is still there with a stream connector attached to it. It must be detached first before releasing the H1 connection and the underlying connection. So it is important to not pretend the connection was already released. Concretely, in that case h1_process() must return 0 instead of -1. It is a minor error because, AFAIK, it is harmless. But it is not correct. So let's fix it to avoid future bugs. To be clear, this happens when a TCP connection is upgraded to a H1 connection and a H2 preface is detected, leading to a second upgrade from H1 to H2. This patch may be backported to all stable versions.	2025-04-30 14:44:42 +02:00
Christopher Faulet	53c3046898	BUG/MEDIUM: mux-spop: Handle CLOSING state and wait for AGENT DISCONNECT frame In the SPOE specification, when an error occurs on the SPOP connection, HAProxy must send a DISCONNECT frame and wait for the agent DISCONNECT frame in return before truly closing the connection. However, this part was not properly handled by the SPOP multiplexer. In this case, the SPOP connection should be in the CLOSING state. But this state was not used at all. Depending on when the error was encountered, the connection could be closed immediately, without sending any DISCONNECT frame. It was the case when an early error was detected during the AGENT-HELLO frame parsing. Or it could be moved from ERROR to FRAME_H state, as if no error were detected. This case was less dramatic than it seemed because some flags were also set to prevent any problem. But it was not obvious. So now, the SPOP connection is properly switched to the CLOSING state when a DISCONNECT is sent to the agent, to be able to wait for its DISCONNECT in reply. spop_process_demux() was updated to parse frames in that state and some validity checks were added. This patch must be backported to 3.1.	2025-04-30 14:44:42 +02:00
Christopher Faulet	fd7ebf117b	BUG/MEDIUM: mux-spop: Wait end of handshake to declare a spop connection ready A SPOP connection must not be considered as ready while the hello handshake is not finished with success. In addition, no error or shutdown must have been reported for the underlying connection. Otherwise a freshly opened SPOP connection may be reused while it is in fact dead, leading to a connection retry. This patch must be backported to 3.1.	2025-04-30 14:44:42 +02:00
Remi Tricot-Le Breton	047fb37b19	MINOR: Add 'conn' param to ssl_sock_chose_sni_ctx This is only useful in the traces, the conn parameter won't be used otherwise.	2025-04-30 11:11:26 +02:00
Remi Tricot-Le Breton	6519cec2ed	MINOR: ssl: Add traces about sigalg extension parsing in clientHello callback We had to parse the sigAlg extension by hand in order to properly select the certificate used by the SSL frontends. These traces allow to dump the allowed sigAlg list sent by the client in its clientHello.	2025-04-30 11:11:26 +02:00
Remi Tricot-Le Breton	105c1ca139	MINOR: ssl: Add traces to the switchctx callback This callback allows to pick the used certificate on an SSL frontend. The certificate selection is made according to the information sent by the client in the clientHello. The traces that were added will allow to better understand what certificate was chosen and why. It will also warn us if the chosen certificate was the default one. The actual certificate parsing happens in ssl_sock_chose_sni_ctx. It's in this function that we actually get the filename of the certificate used.	2025-04-30 11:11:26 +02:00
Remi Tricot-Le Breton	dbdd0630e1	MINOR: ssl: Add ocsp stapling callback traces If OCSP stapling fails because of a missing or invalid OCSP response we used to silently disable stapling for the given session. We can now know a bit more what happened regarding OCSP stapling.	2025-04-30 11:11:26 +02:00
Remi Tricot-Le Breton	0fb05540b2	MINOR: ssl: Add traces to verify callback Those traces allow to know which errors were met during certificate chain validation as well as which ones were ignored.	2025-04-30 11:11:26 +02:00
Remi Tricot-Le Breton	4a8fa28e36	MINOR: ssl: Add traces around SSL_do_handshake call Those traces dump information about the multiple SSL_do_handshake calls (renegotiation and regular call). Some errors could also be dumped in case of rejected early data. Depending on the chosen verbosity, some information about the current handshake can be dumped as well (servername, tls version, chosen cipher for instance). In case of failed handshake, the error codes and messages will also be dumped in the log to ease debugging.	2025-04-30 11:11:26 +02:00
Remi Tricot-Le Breton	9f146bdab3	MINOR: ssl: Add traces to ssl_sock_io_cb function Add new SSL traces.	2025-04-30 11:11:26 +02:00
Remi Tricot-Le Breton	475bb8d843	MINOR: ssl: Add traces to recv/send functions Those traces will allow to identify sessions on which early data is used as well as some forcefully closed connections.	2025-04-30 11:11:26 +02:00
Remi Tricot-Le Breton	9bb8d6dcd1	MINOR: ssl: Add traces to ssl init/close functions Add a dedicated trace for some unlikely allocation failures and async errors. Those traces will ostly be used to identify the start and end of a given SSL connection.	2025-04-30 11:11:26 +02:00
Remi Tricot-Le Breton	08e40f4589	MINOR: Add "sigalg" to "sigalg name" helper function This function can be used to convert a TLSv1.3 sigAlg entry (2 bytes) from the signature_algorithms client hello extension into a string. In order to ease debugging, some TLSv1.2 combinations can also be dumped. In TLSv1.2 those signature algorithm pairs were built out of a one-byte signature identifier combined with a one-byte hash identifier. In TLSv1.3 those identifiers are two-byte blocks that must be treated as such.	2025-04-30 11:11:26 +02:00
Willy Tarreau	566b384e4e	MINOR: tools: make my_strndup() take a size_t len instead of and int In relation to issue #2954, it appears that turning some size_t length calculations to the int that uses my_strndup() upsets coverity a bit. Instead of dealing with such warnings each time, better address it at the root. An inspection of all call places show that the size passed there is always positive so we can safely use an unsigned type, and size_t will always suit it like for strndup() where it's available.	2025-04-30 05:17:43 +02:00
Lukas Tribus	5f9ce99c79	DOC: ring: refer to newer RFC5424 In the ring configuration example we refer to RFC3164 - the original BSD syslog protocol without support for structured data (SDATA). Let's refer to RFC5424 instead so SDATA is by default forwarded if someone copy & pastes from the documentation: https://discourse.haproxy.org/t/structured-data-lost-when-forwarding-logs-voa-syslog-forwarding-feature/11741/5 Should be backported to 2.6.	2025-04-29 21:39:01 +02:00
Aurelien DARRAGON	bd48e26a74	CLEANUP: proxy: mention that px->conn_retries isn't relevant in some cases Since 91e785edc ("MINOR: stream: Rely on a per-stream max connection retries value"), px->conn_retries may be ignored in the following cases: * proxy not part of a list which gets properly post-init (ie: main proxy list, log-forward list, sink list) * proxy lacking the CAP_FE capability Documenting such cases where the px->conn_retries is set but effectively ignored, so that we either remove ignored statements or fix them in the future if they are really needed. In fact all cases affected here are autonomous applets that already handle the retries themselves so the fact that 91e785edc made ->conn_retries ineffective should not be a big deal anyway.	2025-04-29 21:21:19 +02:00
Aurelien DARRAGON	5288b39011	BUG/MINOR: dns: prevent ds accumulation within dss when dns session callback (dns_session_release()) is called upon error (ie: when some pending queries were not sent), we try our best to re-create the applet in order to preserve the pending queries and give them a chance to be retried. This is done at the end of dns_session_release(). However, doing so exposes us to an issue: if the error preventing queries from being sent is still encountered over and over, the dns session could stay there indefinitely. Meanwhile, other dns sessions may be created on the same dns_stream_server periodically. If previous failing dns sessions don't terminate but we also keep creating new ones, we end up accumulating failing sessions on a given dns_stream_server, which can eventually cause resource shortage. This issue was found when trying to address ("BUG/MINOR: dns: add tempo between 2 connection attempts for dns servers") To fix it, we track the number of failed consecutive sessions for a given dns server. When we reach the threshold (set to 100), we consider that the link to the dns server is broken (at least temporarily) and we force dns_session_new() to fail, so that we stop creating new sessions until one of the existing ones eventually succeeds. A workaround for this fix consists in setting the "maxconn" parameter on nameserver directive (under resolvers section) to a reasonable value so that no more than "maxconn" sessions may co-exist on the same server at a given time. This may be backported to all stable versions. ("CLEANUP: dns: remove unused dns_stream_server struct member") may be backported to ease the backport.	2025-04-29 21:20:54 +02:00
Aurelien DARRAGON	14ebe95a10	CLEANUP: dns: remove unused dns_stream_server struct member dns_stream_server "max_slots" is unused, let's get rid of it	2025-04-29 21:20:44 +02:00
Aurelien DARRAGON	27236f2218	BUG/MINOR: dns: add tempo between 2 connection attempts for dns servers As reported by Lukas Tribus on the mailing list [1], trying to connect to a nameserver with invalid network settings causes haproxy to retry a new connection attempt immediately which eventually causes unexpected CPU usage on the thread responsible for the applet (namely 100% on one CPU will be observed). This can be reproduced with the test config below: resolvers default nameserver ns1 tcp4@8.8.8.8:53 source 192.168.99.99 listen listen mode http bind :8080 server s1 www.google.com resolvers default init-addr none To fix this issue, we add a delay of one second before a new connection attempt is retried. We do this in dns_session_create() when we know that the applet was created in the release callback (when previous query attempt was unsuccessful), which means initial connection is not affected. [1]: https://www.mail-archive.com/haproxy@formilux.org/msg45665.html This should fix GH #2909 and may be backported to all stable versions. This patch depends on ("MINOR: applet: add appctx_schedule() macro")	2025-04-29 21:20:11 +02:00
Aurelien DARRAGON	1ced5ef2fd	MINOR: applet: add appctx_schedule() macro Just like task_schedule() but for applets to wakeup an applet at a specific time, leverages _task_schedule() internally	2025-04-29 21:19:37 +02:00
William Lallemand	c11ab983bf	BUG/MINOR: acme: remove references to virt@acme "virt@acme" was the default map used during development, now this must be configured in the acme section or it won't try to use any map. This patch removes the references to virt@acme in the comments and the code.	2025-04-29 16:35:35 +02:00
William Lallemand	5555926fdd	MEDIUM: acme: use a map to store tokens and thumbprints The stateless mode which was documented previously in the ACME example is not convenient for all use cases. First, when HAProxy generates the account key itself, you wouldn't be able to put the thumbprint in the configuration, so you will have to get the thumbprint and then reload. Second, in the case you are using multiple account key, there are multiple thumbprint, and it's not easy to know which one you want to use when responding to the challenger. This patch allows to configure a map in the acme section, which will be filled by the acme task with the token corresponding to the challenge, as the key, and the thumbprint as the value. This way it's easy to reply the right thumbprint. Example: http-request return status 200 content-type text/plain lf-string "%[path,field(-1,/)].%[path,field(-1,/),map(virt@acme)]\n" if { path_beg '/.well-known/acme-challenge/' }	2025-04-29 16:15:55 +02:00
Amaury Denoyelle	0f9b3daf98	MEDIUM: quic: limit global Tx memory Define a new settings tune.quic.frontend.max-tot-window. It contains a size argument which can be used to set a limit on the sum of all QUIC connections congestion window. This is applied both on quic_cc_path_set() and quic_cc_path_inc(). Note that this limitation cannot reduce a congestion window more than the minimal limit which is set to 2 datagrams.	2025-04-29 15:19:32 +02:00
Amaury Denoyelle	e841164a44	MINOR: quic: account for global congestion window Use the newly defined cshared type to account for the sum of congestion window of every QUIC connection. This value is stored in global counter quic_mem_global defined in proto_quic module.	2025-04-29 15:19:32 +02:00
Amaury Denoyelle	3891456d20	MINOR: thread: define cshared type Define a new type "struct cshared". This can be used as a tool to manipulate a global counter with thread-safety ensured. Each thread would declare its thread-local cshared type, which would point to a global counter. Each thread can then add/subtract a value to its own thread-local cshared instance via cshared_add(). If the difference exceeds a configured limit, either positively or negatively, the global counter is updated and the thread-local instance is reset to 0. Each thread can safely read the global counter value using cshared_read().	2025-04-29 15:10:06 +02:00
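The mechanism described for "struct cshared" can be illustrated with a minimal sketch. This is not the HAProxy implementation: the type name, field names and function names below are hypothetical stand-ins, and C11 atomics are assumed in place of HAProxy's own atomic macros.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for "struct cshared"; the layout is an assumption. */
struct cshared_sketch {
	_Atomic int64_t *global; /* shared counter this instance publishes to */
	int64_t diff;            /* thread-local delta not yet published */
	int64_t lim;             /* publish once |diff| exceeds this limit */
};

static inline void cshared_sketch_init(struct cshared_sketch *c,
                                       _Atomic int64_t *global, int64_t lim)
{
	assert(lim > 0);
	c->global = global;
	c->diff = 0;
	c->lim = lim;
}

/* Accumulate locally; touch the shared counter only when the local
 * difference exceeds the limit in either direction.
 */
static inline void cshared_sketch_add(struct cshared_sketch *c, int64_t v)
{
	c->diff += v;
	if (c->diff > c->lim || c->diff < -c->lim) {
		atomic_fetch_add(c->global, c->diff);
		c->diff = 0;
	}
}

/* Read the shared value; it may lag each thread by at most <lim>. */
static inline int64_t cshared_sketch_read(const struct cshared_sketch *c)
{
	return atomic_load(c->global);
}
```

In such a scheme each thread would own one instance (for example in thread-local storage), so the shared counter is only written when a thread's local delta grows large, at the cost of a bounded inaccuracy on reads.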
Amaury Denoyelle	7bad88c35c	BUG/MINOR: quic: ensure cwnd limits are always enforced Congestion window is limited by minimum and maximum values which can never be exceeded. Min value is hardcoded to 2 datagrams as recommended by the specification. Max value is specified via haproxy configuration. These values must be respected each time the congestion window size is adjusted. However, on some rare occasions, limits were not always enforced. Fix this by implementing wrappers to set or increment the congestion window. These functions ensure limits are always applied after the operation. Additionally, wrappers also ensure that if the window reached a new maximum value, it is saved in <cwnd_last_max> field. This should be backported up to 2.6, after a brief period of observation.	2025-04-29 15:10:06 +02:00
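The wrapper approach described above could look roughly like the following sketch. The struct and function names are hypothetical; only the field names limit_min, limit_max and cwnd_last_max come from the commit messages, and the real structure layout is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of a congestion control path. */
struct cc_path_sketch {
	size_t cwnd;          /* current congestion window, in bytes */
	size_t limit_min;     /* enforced lower bound */
	size_t limit_max;     /* enforced upper bound */
	size_t cwnd_last_max; /* highest window value reached so far */
};

/* Set the window, clamp it to [limit_min, limit_max], then record a
 * new maximum if one was reached.
 */
static void cc_path_sketch_set(struct cc_path_sketch *p, size_t val)
{
	assert(p->limit_min <= p->limit_max);
	if (val < p->limit_min)
		val = p->limit_min;
	if (val > p->limit_max)
		val = p->limit_max;
	p->cwnd = val;
	if (p->cwnd > p->cwnd_last_max)
		p->cwnd_last_max = p->cwnd;
}

/* Increment the window through the same clamping path, saturating
 * instead of wrapping on overflow.
 */
static void cc_path_sketch_inc(struct cc_path_sketch *p, size_t inc)
{
	size_t room = SIZE_MAX - p->cwnd;

	cc_path_sketch_set(p, inc > room ? SIZE_MAX : p->cwnd + inc);
}
```

Routing every write to the window through these two functions means the limits cannot be skipped by a code path that forgets to check them.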
Amaury Denoyelle	c01d455288	MINOR: quic: refactor BBR API Write minor adjustments to QUIC BBR functions. The objective is to centralize every modification of path cwnd field. No functional change. This patch will be useful to simplify implementation of global QUIC Tx memory usage limitation.	2025-04-29 15:10:06 +02:00
Amaury Denoyelle	2eb1b0cd96	MINOR: quic: rename min/max fields for congestion window algo There was some possible confusion between fields related to congestion window size min and max limit which cannot be exceeded, and the maximum value previously reached by the window. Fix this by adopting a new naming scheme. Enforced limit are now renamed <limit_max>/<limit_min>, while the previously reached max value is renamed <cwnd_last_max>. This should be backported up to 3.1.	2025-04-29 15:10:06 +02:00
William Lallemand	62dfe1fc87	BUG/MINOR: acme: creating an account should not end the task The account creation was mistakenly ending the task instead of being woken up for the NewOrder state. This prevented the creation of the certificate, although the account was correctly created. To fix this, only the jump to the end label needs to be removed; the standard exit path of the function will allow the task to be woken up. No backport needed.	2025-04-29 14:18:05 +02:00
Willy Tarreau	2cdb3cb91e	MINOR: tcp: add support for setting TCP_NOTSENT_LOWAT on both sides TCP_NOTSENT_LOWAT is very convenient as it indicates when to report EAGAIN on the sending side. It takes a margin on top of the estimated window, meaning that it's no longer needed to store too many data in socket buffers. Instead there's just enough to fill the send window and a little bit of margin to cover the scheduling time to restart sending. Experiments on a 100ms network have shown a 10-fold reduction in the memory used by socket buffers by just setting this value to tune.bufsize, without noticing any performance degradation. Theoretically the responsiveness on multiplexed protocols such as H2 should also be improved.	2025-04-29 12:13:42 +02:00
Willy Tarreau	989f609b1a	BUG/MINOR: mux-h2: fix the offset of the pattern for the ping frame The ping frame's pattern must be written at offset 9 (frame header length), not 8. This was added in 3.2 with commit 4dcfe098a6 ("MINOR: mux-h2: prepare to support PING emission"), so no backport is needed.	2025-04-29 12:13:41 +02:00
William Lallemand	2f7f65e159	BUG/MINOR: acme: does not try to unlock after a failed trylock Return after a failed trylock in acme_update_certificate() instead of jumping to the error label which does an unlock.	2025-04-29 11:29:52 +02:00
William Lallemand	1cd0b35896	DOC: configuration: add quic4 to the ssl-f-use example The ssl-f-use keyword is very useful in the case of multiple SSL bind lines. Add a quic4 bind line in the example to show that.	2025-04-29 10:50:39 +02:00
William Lallemand	582614e1b2	CLEANUP: acme: remove old TODO for account key Remove old TODO comments about the account key.	2025-04-29 09:59:32 +02:00
William Lallemand	59d83688e8	DOC: configuration: acme account key are auto generated Explain that account key are auto generated when they do not exist.	2025-04-29 09:32:33 +02:00
Willy Tarreau	dc06495b71	MEDIUM: mcli: replicate the current mode when enterin the worker process While humans can find it convenient to enter the worker process in prompt mode, for external tools it will not be convenient to have to systematically disable it. A better approach is to replicate the master socket's mode there, since it has already been configured to suit the user: interactive, prompt and timed modes are automatically passed to the worker process. This makes using the worker commands more natural from the master process, without having to systematically adapt it for each new connection.	2025-04-28 20:21:06 +02:00
Willy Tarreau	c347cb73fa	MEDIUM: mcli: make the prompt mode configurable between i/p Support the same syntax in master mode as in worker mode in order to configure the prompt. The only thing is that for now the master doesn't have a non-interactive mode and it doesn't seem necessary to implement it, so we only support the interactive and prompt modes. However the code was written in a way that makes it easy to change this later if desired.	2025-04-28 20:21:06 +02:00
Willy Tarreau	e5c255c4e5	MEDIUM: cli: make the prompt mode configurable between n/i/p Now the prompt mode can more finely be configured between non-interactive (default), interactive without prompt, and interactive with prompt. This will ease the usage from automated tools which are not necessarily interested in having to consume '> ' after each command nor displaying "+" on payload lines. This can also be convenient when coming from the master CLI to keep the same output format.	2025-04-28 20:21:06 +02:00
Willy Tarreau	f25b4abc9b	MINOR: cli: split APPCTX_CLI_ST1_PROMPT into two distinct flags The CLI's "prompt" command toggles two distinct things: - displaying or hiding the prompt at the beginning of the line - single-command vs interactive mode These are two independent concepts and the prompt mode doesn't always cope well with tools that would like to upload data without having to read the prompt on return. Also, the master command line works in interactive mode by default with no prompt, which is not consistent (and not convenient for tools). So let's start by splitting the bit in two, and have a new APPCTX_CLI_ST1_INTER flag dedicated to the interactive mode. For now the "prompt" command alone continues to toggle the two at once.	2025-04-28 20:21:06 +02:00
Willy Tarreau	5ac280f2a7	MINOR: compiler: add more macros to detect macro definitions We add __equals_0(NAME) which is only true if NAME is defined as zero, and __def_as_empty(NAME) which is only true if NAME is defined as an empty string.	2025-04-28 20:21:06 +02:00
William Lallemand	32b2b782e2	MEDIUM: acme: use 'crt-base' to load the account key Prefix the filename with the 'crt-base' before loading the account key, in order to work like every other keypair in haproxy.	2025-04-28 18:20:21 +02:00
William Lallemand	856b6042d3	MEDIUM: acme: generate the account file when not found Generate the private key in the account file when the file does not exist. This generates a private key of the type and parameters configured in the acme section.	2025-04-28 18:20:21 +02:00
William Lallemand	b2dd6dd72b	MINOR: acme: failure when no directory is specified The "directory" parameter of the acme section is mandatory. This patch exits with an alert when this parameter is not found.	2025-04-28 18:20:21 +02:00
William Lallemand	420de91d26	MINOR: acme: separate the code generating private keys acme_EVP_PKEY_gen() generates private keys of specified <keytype>, <curves> and <bits>. Only RSA and EC are supported for now.	2025-04-28 18:20:21 +02:00
William Lallemand	0897175d73	BUG/MINOR: ssl/acme: free EVP_PKEY upon error Free the EVP_PKEY upon error when the X509_REQ generation failed. No backport needed.	2025-04-28 18:20:21 +02:00
Willy Tarreau	12c7189bc8	MEDIUM: thread: set DEBUG_THREAD to 1 by default Setting DEBUG_THREAD to 1 allows recording the lock history for each thread. Tests have shown that (as predicted) the cost of updating a single thread-local variable is not perceptible in the noise, especially when compared to the cost of obtaining a lock. Since this can provide useful value when debugging deadlocks, let's enable it by default when threads are enabled.	2025-04-28 16:50:34 +02:00
Willy Tarreau	d9a659ed96	MINOR: threads/cli: display the lock history on "show threads" This will display the lock labels and modes for each non-empty step at the end of "show threads" when these are defined. This allows to emit up to the last 8 locking operation for each thread on 64 bit machines.	2025-04-28 16:50:34 +02:00
Willy Tarreau	b8a1c2380b	MEDIUM: threads: keep history of taken locks with DEBUG_THREAD > 0 by only storing a word in each thread context, we can keep the history of all taken/dropped locks by label. This is expected to be very cheap and to permit to store up to 8 consecutive lock operations in 64 bits. That should significantly help detect recursive locks as well as figure what thread was likely to hinder another one waiting for a lock. For now we only store the final state of the lock, we don't store the attempt to get it. It's just a matter of space since we already need 4 ops (rd,sk,wr,un) which take 2 bits, leaving max 64 labels. We're already around 45. We could also multiply by 5 and still keep 8 bits total per lock, that would limit us to 51 locks max. It seems that most of the time if we get a watchdog panic, anyway the victim thread will be perfectly located so that we don't need a specific value for this. Another benefit is that we perform a single memory write per lock.	2025-04-28 16:50:34 +02:00
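The packing described above (8 bits per lock operation, 2 bits for the operation and 6 bits for the label, 8 operations per 64-bit word) can be sketched as follows. The exact bit layout, the enum values and all names are assumptions made for illustration; the commit message only states the sizes.

```c
#include <assert.h>
#include <stdint.h>

/* The four operations named in the commit message (rd, sk, wr, un);
 * the numeric values are an assumption.
 */
enum lock_op_sketch {
	LOCK_OP_RD_SKETCH = 0,
	LOCK_OP_SK_SKETCH = 1,
	LOCK_OP_WR_SKETCH = 2,
	LOCK_OP_UN_SKETCH = 3,
};

/* Push one operation into the history: older entries shift left by one
 * byte and the oldest of the 8 falls off the top. Labels are assumed to
 * start at 1 so that an all-zero byte can mean "empty slot".
 */
static inline uint64_t lock_hist_push_sketch(uint64_t hist, unsigned int label,
                                             enum lock_op_sketch op)
{
	assert(label < 64); /* only 6 bits are available for the label */
	return (hist << 8) | ((uint64_t)label << 2) | (uint64_t)op;
}

/* Entry 0 is the most recent operation, entry 7 the oldest one kept. */
static inline unsigned int lock_hist_label_sketch(uint64_t hist, unsigned int idx)
{
	return (unsigned int)((hist >> (idx * 8 + 2)) & 0x3f);
}

static inline enum lock_op_sketch lock_hist_op_sketch(uint64_t hist, unsigned int idx)
{
	return (enum lock_op_sketch)((hist >> (idx * 8)) & 0x3);
}
```

With this layout, recording an operation is a shift, two ORs and a single store to the thread-local word, which is consistent with the single memory write per lock mentioned in the message.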
Willy Tarreau	23371b3e7c	MINOR: threads: turn the full lock debugging to DEBUG_THREAD=2 At level 1 it now does nothing. This is reserved for some subsequent patches which will implement lighter debugging.	2025-04-28 16:50:34 +02:00
Willy Tarreau	903a6b14ef	MINOR: threads: prepare DEBUG_THREAD to receive more values We now default the value to zero and make sure all tests properly take care of values above zero. This is in preparation for supporting several degrees of debugging.	2025-04-28 16:50:34 +02:00
Willy Tarreau	aa49965d4e	BUILD: leastconn: fix build warning when building without threads on old machines Machines lacking CAS8B/DWCAS emit a warning in lb_fwlc.c without threads due to declaration ordering. Let's just move the variable declaration into the block that uses it as a last variable. No backport is needed.	2025-04-28 16:50:34 +02:00
Willy Tarreau	589d916efa	BUILD: acme: use my_strndup() instead of strndup() Not all systems have strndup(), that's why we have our "my_strndup()", so let's make use of it here. This fixes the build on Solaris 10. No backport is needed.	2025-04-28 16:37:54 +02:00
Aurelien DARRAGON	dc95a3ed61	MINOR: promex: expose ST_I_PX_RATE (current_session_rate) It has been requested to have the current_session_rate exposed at the frontend level. For now only the per-process value was exposed (ST_I_INF_SESS_RATE). Thanks to the work done lately to merge promex and stat_cols_px[] array, let's simply define an .alt_name for the ST_I_PX_RATE metric in order to have promex exposing it as current_session_rate for relevant contexts.	2025-04-28 12:23:20 +02:00
Aurelien DARRAGON	e921362810	DOC: config: clarify log-forward "host" option log-forward "host" option may be confusing because we often mention the host field when talking about syslog RFC3164 or RFC5424 messages, but neither rfc actually define "host" field. In fact, everywhere we used "host field" we actually meant "hostname field" as documented in RFC5424. This was a language abuse on our side. In this patch we replace "host" with "hostname" where relevant in the documentation to prevent confusion. Thanks to Nick Ramirez for having reported the issue.	2025-04-28 12:23:16 +02:00
Aurelien DARRAGON	385b3f923f	DOC: config: fix ACME paragraph rendering issue Nick Ramirez reported that the ACME paragraph (3.13) caused a rendering issue where simple text was rendered as a directive. This was caused by the use of unescaped <name> which confuses dconv. Let's escape <name> by putting quotes around it to prevent the rendering issue. No backport needed.	2025-04-28 12:23:12 +02:00
William Lallemand	83975f34e4	MINOR: ssl/cli: add a '-t' option to 'show ssl sni' Add a -t option to 'show ssl sni', allowing to add an offset to the current date so it would allow to check which certificates are expired after a certain period of time.	2025-04-28 11:35:11 +02:00
Willy Tarreau	f1064c7382	BUG/MAJOR: listeners: transfer connection accounting when switching listeners Since we made it possible for a bind_conf to listen to multiple thread groups with shards in 2.8 with commit 9d360604bd ("MEDIUM: listener: rework thread assignment to consider all groups"), the per-listener connection count was not properly transferred to the target listener with the connection when switching to another thread group. This results in one listener possibly reaching high values and another one possibly reaching negative values. Usually it's not visible, unless a maxconn is set on the bind_conf, in which case comparisons will quickly put an end to the willingness to accept new connections. This problem only happens when thread groups are enabled, and it seems very hard to trigger it normally, it only impacts sockets having a single shard, hence currently the CLI (or any conf with "bind ... shards 1"), where it can be reproduced with a config having a very low "maxconn" on the stats socket directive (here, 4), and issuing a few tens of socat <<< "show activity" in parallel, or sending HTTP connections to a single-shard listener. Very quickly, haproxy stops accepting connections and eats CPU in the poller which tries to get its connections accepted. A BUG_ON(l->nbconn<0) after HA_ATOMIC_DEC() in listener_release() also helps spotting them better. Many thanks to Christian Ruppert who once again provided a very accurate report in GH #2951 with the required data permitting this analysis. This fix must be backported to 2.8.	2025-04-25 18:47:11 +02:00
Olivier Houchard	9240cd4a27	BUG/MAJOR: tasklets: Make sure he tasklet can't run twice tasklets were originally designed to always run on only one thread, so it was not possible to have it run on 2 threads concurrently. The API has been extended so that another thread may wake the tasklet, the idea was still that we wanted to have it run on one thread only. However, the way it's been done meant that unless a tasklet was bound to a specific tid with tasklet_set_tid(), or we explicitly used tasklet_wakeup_on() to specify the thread for the target to run on, it would be scheduled to run on the current thread. This is in fact a desirable feature. There is however a race condition in which the tasklet would be scheduled on a thread, while it is running on another. This could lead to the same tasklet running on multiple threads, which we do not want. To fix this, just do what we already do for regular tasks, set the "TASK_RUNNING" flag, and when it's time to execute the tasklet, wait until that flag is gone. Only one case has been found in the current code, where the tasklet could run on different threads depending on who wakes it up, in the leastconn load balancer, since commit 627280e15f03755b8f59f0191cd6d6bcad5afeb3. It should not be a problem in practice, as the function called can be called concurrently. If a bug is eventually found in relation to this problem, and this patch should be backported, the following patches should be backported too : MEDIUM: quic: Make sure we return the tasklet from quic_accept_run MEDIUM: quic: Make sure we return NULL in quic_conn_app_io_cb if needed MEDIUM: quic: Make sure we return the tasklet from qcc_io_cb MEDIUM: mux_fcgi: Make sure we return the tasklet from fcgi_deferred_shut MEDIUM: listener: Make sure w ereturn the tasklet from accept_queue_process MEDIUM: checks: Make sure we return the tasklet from srv_chk_io_cb	2025-04-25 16:14:26 +02:00
Olivier Houchard	09f5501bb9	MEDIUM: quic: Make sure we return the tasklet from quic_accept_run In quic_accept_run, return the tasklet to tell the scheduler the tasklet is still alive, it is not yet needed, but will be soon.	2025-04-25 16:14:26 +02:00
Olivier Houchard	5838786fa0	MEDIUM: quic: Make sure we return NULL in quic_conn_app_io_cb if needed In quic_conn_app_io_cb, make sure we return NULL if the tasklet has been destroyed, so that the scheduler knows. It is not yet needed, but will be soon.	2025-04-25 16:14:26 +02:00
Olivier Houchard	15c5846db8	MEDIUM: quic: Make sure we return the tasklet from qcc_io_cb In qcc_io_cb, return the tasklet to tell the scheduler the tasklet is still alive, it is not yet needed, but will be soon.	2025-04-25 16:14:26 +02:00
Olivier Houchard	8f70f9c04b	MEDIUM: mux_fcgi: Make sure we return the tasklet from fcgi_deferred_shut In fcgi_deferred_shut, return the tasklet to tell the scheduler the tasklet is still alive, it is not yet needed, but will be soon.	2025-04-25 16:14:26 +02:00
Olivier Houchard	7d190e7df6	MEDIUM: listener: Make sure w ereturn the tasklet from accept_queue_process In accept_queue_process, return the tasklet to tell the scheduler the tasklet is still alive, it is not yet needed, but will be soon.	2025-04-25 16:14:26 +02:00
Olivier Houchard	81dc3e67cf	MEDIUM: checks: Make sure we return the tasklet from srv_chk_io_cb In srv_chk_io_cb, return the tasklet to tell the scheduler the tasklet is still alive, it is not yet needed, but will be soon.	2025-04-25 16:14:26 +02:00
Willy Tarreau	beb23069c6	[RELEASE] Released version 3.2-dev12 Released version 3.2-dev12 with the following main changes : - BUG/MINOR: quic: do not crash on CRYPTO ncbuf alloc failure - BUG/MINOR: proxy: always detach a proxy from the names tree on free() - CLEANUP: proxy: detach the name node in proxy_free_common() instead - CLEANUP: Slightly reorder some proxy option flags to free slots - MINOR: proxy: Add options to drop HTTP trailers during message forwarding - MINOR: h1-htx: Skip C-L and T-E headers for 1xx and 204 messages during parsing - MINOR: mux-h1: Keep custom "Content-Length: 0" header in 1xx and 204 messages - MINOR: hlua/h1: Use http_parse_cont_len_header() to parse content-length value - CLEANUP: h1: Remove now useless h1_parse_cont_len_header() function - BUG/MEDIUM: mux-spop: Respect the negociated max-frame-size value to send frames - MINOR: http-act: Add 'pause' action to temporarily suspend the message analysis - MINOR: acme/cli: add the 'acme renew' command to the help message - MINOR: httpclient: add an "https" log-format - MEDIUM: acme: use a customized proxy - MEDIUM: acme: rename "uri" into "directory" - MEDIUM: acme: rename "account" into "account-key" - MINOR: stick-table: use a separate lock label for updates - MINOR: h3: simplify h3_rcv_buf return path - BUG/MINOR: mux-quic: fix possible infinite loop during decoding - BUG/MINOR: mux-quic: do not decode if conn in error - BUG/MINOR: cli: Issue an error when too many args are passed for a command - MINOR: cli: Use a full prompt command for bidir connections with workers - MAJOR: cli: Refacor parsing and execution of pipelined commands - MINOR: cli: Rename some CLI applet states to reflect recent refactoring - CLEANUP: applet: Update st0/st1 comment in appctx structure - BUG/MINOR: hlua: Fix I/O handler of lua CLI commands to not rely on the SC - BUG/MINOR: ring: Fix I/O handler of "show event" command to not rely on the SC - MINOR: cli/applet: Move appctx fields only used by the CLI in 
a private context - MINOR: cache: Add a pointer on the cache in the cache applet context - MINOR: hlua: Use the applet name in error messages for lua services - MINOR: applet: Save the "use-service" rule in the stream to init a service applet - CLEANUP: applet: Remove unsued rule pointer in appctx structure - BUG/MINOR: master/cli: properly trim the '@@' process name in error messages - MEDIUM: resolvers: add global "dns-accept-family" directive - MINOR: resolvers: add command-line argument -4 to force IPv4-only DNS - MINOR: sock-inet: detect apparent IPv6 connectivity - MINOR: resolvers: add "dns-accept-family auto" to rely on detected IPv6 - MEDIUM: acme: use Retry-After value for retries - MEDIUM: acme: reset the remaining retries - MEDIUM: acme: better error/retry management of the challenge checks - BUG/MEDIUM: cli: Handle applet shutdown when waiting for a command line - Revert "BUG/MINOR: master/cli: properly trim the '@@' process name in error messages" - BUG/MINOR: master/cli: only parse the '@@' prefix on complete lines - MINOR: resolvers: use the runtime IPv6 status instead of boot time one	2025-04-25 10:19:03 +02:00
Willy Tarreau	40aceb7414	MINOR: resolvers: use the runtime IPv6 status instead of boot time one On systems where the network is not reachable at boot time (certain HA systems for example, or dynamically addressed test machines), we'll want to be able to periodically revalidate the IPv6 reachability status. The current code makes it complicated because it sets the config bits once for all at boot time. This commit changes this so that the config bits are not changed, but instead we rely on a static inline function that relies on sock_inet6_seems_reachable for every test (really cheap). This also removes the now unneeded resolvers late init code. This variable for now is still set at boot time but this will ease the transition later, as the resolvers code is now ready for this.	2025-04-25 09:32:05 +02:00
Willy Tarreau	7a79f54c98	BUG/MINOR: master/cli: only parse the '@@' prefix on complete lines The new adhoc parser for the '@@' prefix forgot to require the presence of the LF character marking the end of the line. This is the reason why entering incomplete commands would display garbage, because the line was expected to have its LF character replaced with a zero. The problem is well illustrated by using socat in raw mode: socat /tmp/master.sock STDIO,raw,echo=0 then entering "@@1 show info" one character at a time would error just after the second "@". The command must take care to report an incomplete line and wait for more data in such a case.	2025-04-25 09:05:00 +02:00
Willy Tarreau	931d932b3e	Revert "BUG/MINOR: master/cli: properly trim the '@@' process name in error messages" This reverts commit 0e94339eaf1c8423132debb6b1b485d8bb1bb7da. This patch was in fact fixing the symptom, not the cause. The root cause of the problem is that the parser was processing an incomplete line when looking for '@@'. When the LF is present, this problem does not exist as it's properly replaced with a zero. This can be verified using socat in raw mode: socat /tmp/master.sock STDIO,raw,echo=0 Then entering "@@1 show info" one character at a time will immediately fail on "@@" without going further. A subsequent patch will fix this. No backport is needed.	2025-04-25 09:05:00 +02:00
Christopher Faulet	101cc4f334	BUG/MEDIUM: cli: Handle applet shutdown when waiting for a command line When the CLI applet was refactored in the commit 20ec1de21 ("MAJOR: cli: Refacor parsing and execution of pipelined commands"), a regression was introduced. The applet shutdown was no longer handled when the applet was waiting for the next command line. It is especially visible when a client timeout occurs because the client connection is no longer closed. To fix the issue, the test on the SE_FL_SHW flag was reintroduced in CLI_ST_PARSE_CMDLINE state, but only if there is no pending input data. It is a 3.2-specific issue. No backport needed.	2025-04-25 08:47:05 +02:00
William Lallemand	27b732a661	MEDIUM: acme: better error/retry management of the challenge checks When the ACME task is checking for the status of the challenge, it would only succeed or retry upon failure. However that's not the best way to do it: ACME objects contain a "status" field which could have a final status or an in-progress status, so we need to be able to retry. This patch adds an acme_ret enum which contains OK, RETRY and FAIL. In the case of the CHKCHALLENGE, the ACME server could return a "pending" or a "processing" status, which basically needs to be rechecked later with the RETRY. However an "invalid" or "valid" status is final and will return either a FAIL or an OK. So instead of retrying in any case, the "invalid" status will end the task with an error.	2025-04-24 20:14:47 +02:00
William Lallemand	0909832e74	MEDIUM: acme: reset the remaining retries When a request succeeds, reset the remaining retries to the default ACME_RETRY value (3 by default).	2025-04-24 20:14:47 +02:00
William Lallemand	bb768b3e26	MEDIUM: acme: use Retry-After value for retries Parse the Retry-After header in the response and store it in order to use the value as the delay before the next retry. Fall back to 3s if the value couldn't be parsed or does not exist.	2025-04-24 20:14:47 +02:00
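A minimal sketch of the fallback logic described above, assuming the header carries a plain number of seconds. The helper name `retry_after_delay()` and the one-day upper bound are hypothetical; the HTTP-date form of Retry-After is not handled here and simply falls back to the default.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define RETRY_AFTER_DEFAULT 3     /* seconds; fallback named in the commit */
#define RETRY_AFTER_MAX     86400 /* assumption: cap at one day */

/* Hypothetical helper: return the delay in seconds carried by a Retry-After
 * value, or the default when the header is missing or is not a plain
 * non-negative number of seconds. */
unsigned int retry_after_delay(const char *value)
{
	char *end;
	unsigned long secs;

	/* missing header, or something not starting with a digit */
	if (!value || *value < '0' || *value > '9')
		return RETRY_AFTER_DEFAULT;

	errno = 0;
	secs = strtoul(value, &end, 10);
	if (errno || *end != '\0' || secs > RETRY_AFTER_MAX)
		return RETRY_AFTER_DEFAULT;

	return (unsigned int)secs;
}
```

For instance, `retry_after_delay("120")` yields 120, while a missing header (`NULL`) or a value such as `"soon"` yields the 3-second default.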
Willy Tarreau	69b051d1dc	MINOR: resolvers: add "dns-accept-family auto" to rely on detected IPv6 Instead of always having to force IPv4 or IPv6, let's now also offer "auto" which will only enable IPv6 if the system has a default gateway for it. This means that properly configured dual-stack systems will default to "ipv4,ipv6" while those lacking a gateway will only use "ipv4". Note that no real connectivity test is performed, so firewalled systems may still get it wrong and might prefer to rely on a manual "ipv4" assignment.	2025-04-24 17:52:28 +02:00
Willy Tarreau	5d41d476f3	MINOR: sock-inet: detect apparent IPv6 connectivity In order to ease dual-stack deployments, we could at least try to check if ipv6 seems to be reachable. For this we're adding a test based on a UDP connect (no traffic) on port 53 to the base of public addresses (2001::) and see if the connect() is permitted, indicating that the routing table knows how to reach it, or fails. Based on this result we're setting a global variable that other subsystems might use to preset their defaults.	2025-04-24 17:52:28 +02:00
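The probe described above (a UDP connect() with no traffic) can be sketched with plain POSIX sockets. The function name `ipv6_seems_reachable()` is hypothetical and this is not HAProxy's code; it only shows why the test is cheap: connect() on a datagram socket merely asks the kernel for a route and sends nothing.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if a UDP connect() to [2001::]:53 is permitted, meaning the
 * routing table knows how to reach public IPv6 space, otherwise 0.
 * No packet is emitted. */
int ipv6_seems_reachable(void)
{
	struct sockaddr_in6 dst;
	int fd, ok;

	fd = socket(AF_INET6, SOCK_DGRAM, 0);
	if (fd < 0)
		return 0; /* no IPv6 support at all */

	memset(&dst, 0, sizeof(dst));
	dst.sin6_family = AF_INET6;
	dst.sin6_port = htons(53);
	if (inet_pton(AF_INET6, "2001::", &dst.sin6_addr) != 1) {
		close(fd);
		return 0;
	}

	ok = connect(fd, (struct sockaddr *)&dst, sizeof(dst)) == 0;
	close(fd);
	return ok;
}
```

As the commit message notes, this only reflects the routing table: a firewalled host can still pass the check without having working IPv6.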
Willy Tarreau	2c46c2c042	MINOR: resolvers: add command-line argument -4 to force IPv4-only DNS In order to ease troubleshooting and testing, the new "-4" command line argument enforces queries and processing of "A" DNS records only, i.e. those representing IPv4 addresses. This can be useful when a host lacks end-to-end dual-stack connectivity. This overrides the global "dns-accept-family" directive and is equivalent to value "ipv4".	2025-04-24 17:52:28 +02:00
Willy Tarreau	940fa19ad8	MEDIUM: resolvers: add global "dns-accept-family" directive By default, DNS resolvers accept both IPv4 and IPv6 addresses. This can be influenced by the "resolve-prefer" keywords on server lines as well as the family argument to the "do-resolve" action, but that is only a preference, which does not block the other family from being used when it's alone. In some environments where dual-stack is not usable, stumbling on an unreachable IPv6-only DNS record can cause significant trouble as it will replace a previous IPv4 one which would possibly have continued to work till next request. The "dns-accept-family" global option makes it possible to enforce usage of only one (or both) address families. The argument is a comma-delimited list of the following words: - "ipv4": query and accept IPv4 addresses ("A" records) - "ipv6": query and accept IPv6 addresses ("AAAA" records) When a single family is used, no request will be sent to resolvers for the other family, and any response for the other family will be ignored. The default value is "ipv4,ipv6", which effectively enables both families.	2025-04-24 17:52:28 +02:00
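The comma-delimited argument described above can be turned into a set of flags along these lines. This is a sketch only: the flag names and `parse_accept_family()` are hypothetical, and only the two words listed in this commit are recognized.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag names, one bit per accepted address family. */
#define FAM_IPV4 0x1 /* query and accept "A" records */
#define FAM_IPV6 0x2 /* query and accept "AAAA" records */

/* Parse a comma-delimited list made of "ipv4" and "ipv6". Returns the
 * resulting flags, or -1 on an unknown or empty word. */
int parse_accept_family(const char *arg)
{
	int flags = 0;

	assert(arg != NULL);

	while (*arg) {
		size_t len = strcspn(arg, ","); /* length of the current word */

		if (len == 4 && strncmp(arg, "ipv4", 4) == 0)
			flags |= FAM_IPV4;
		else if (len == 4 && strncmp(arg, "ipv6", 4) == 0)
			flags |= FAM_IPV6;
		else
			return -1;

		arg += len;
		if (*arg == ',')
			arg++; /* skip the delimiter */
	}
	return flags ? flags : -1;
}
```

With such flags, a resolver would skip sending the query type whose bit is unset and drop any answer of that family, which is the blocking behaviour the directive adds on top of the existing preference keywords.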
Willy Tarreau	0e94339eaf	BUG/MINOR: master/cli: properly trim the '@@' process name in error messages When '@@' alone is sent on the master CLI (no trailing LF), we get an error that displays anything past these two characters in the buffer since there's no room for a \0. Let's make sure to limit the length of the process name in this case. No backport is needed since this was added with 00c967fac4 ("MINOR: master/cli: support bidirectional communications with workers").	2025-04-24 17:52:28 +02:00
Christopher Faulet	29632bcabf	CLEANUP: applet: Remove unsued rule pointer in appctx structure Thanks to previous commits, the "rule" field in the appctx structure is no longer used. So we can safely remove it.	2025-04-24 16:22:31 +02:00
Christopher Faulet	568ed6484a	MINOR: applet: Save the "use-service" rule in the stream to init a service applet When a service is initialized, the "use-service" rule that was executed is now saved in the stream, using "current_rule" field, instead of saving it into the applet context. It is safe to do so because this field is unused at this stage. To avoid any issue, it is reset after the service initialization. Doing so, it is no longer necessary to save it in the applet context. It was the last usage of the rule pointer in the applet context. The init functions for TCP and HTTP lua services were updated accordingly.	2025-04-24 16:22:24 +02:00
Christopher Faulet	6f59986e7c	MINOR: hlua: Use the applet name in error messages for lua services The lua function name was used in error messages of HTTP/TCP lua services while the applet name can be used. Concretely, this will not change anything, because when a lua service is registered, the lua function name is used to name the applet. But it is easier, cleaner and more logical because it is really the applet name that should be displayed in these error messages.	2025-04-24 15:59:33 +02:00
Christopher Faulet	e05074f632	MINOR: cache: Add a pointer on the cache in the cache applet context Thanks to this change, when a response is delivered from the cache, it is no longer necessary to get the cache filter configuration from the http "use-cache" rule saved in the appctx to get the currently used cache. It was a bit complex to get an info that can be directly and naturally stored in the cache applet context.	2025-04-24 15:48:59 +02:00
Christopher Faulet	b734d7c156	MINOR: cli/applet: Move appctx fields only used by the CLI in a private context There are several fields in the appctx structure only used by the CLI. To make things cleaner, all these fields are now placed in a dedicated context inside the appctx structure. The final goal is to move it in the service context and add an API for cli commands to get a command context inside the cli context.	2025-04-24 15:09:37 +02:00
Christopher Faulet	03dc54d802	BUG/MINOR: ring: Fix I/O handler of "show event" command to not rely on the SC Thanks to the CLI refactoring ("MAJOR: cli: Refacor parsing and execution of pipelined commands"), it is possible to fix "show event" I/O handler function to no longer use the SC. When the applet API was refactored to no longer manipulate the channels or the stream-connectors, this part was missed. However, without the patch above, it could not be fixed. It is now possible so let's do it. This patch must not be backported because it depends on refactoring of the CLI applet.	2025-04-24 15:09:37 +02:00
Christopher Faulet	e406fe16ea	BUG/MINOR: hlua: Fix I/O handler of lua CLI commands to not rely on the SC Thanks to the CLI refactoring ("MAJOR: cli: Refacor parsing and execution of pipelined commands"), it is possible to fix the I/O handler function used by lua CLI commands to no longer use the SC. When the applet API was refactored to no longer manipulate the channels or the stream-connectors, this part was missed. However, without the patch above, it could not be fixed. It is now possible so let's do it. This patch must not be backported because it depends on refactoring of the CLI applet.	2025-04-24 15:09:37 +02:00
Christopher Faulet	742dc01537	CLEANUP: applet: Update st0/st1 comment in appctx structure Today, these states are used by almost all applets. So update the comments of these fields.	2025-04-24 15:09:37 +02:00
Christopher Faulet	44ace9a1b7	MINOR: cli: Rename some CLI applet states to reflect recent refactoring CLI_ST_GETREQ state was renamed into CLI_ST_PARSE_CMDLINE and CLI_ST_PARSEREQ into CLI_ST_PROCESS_CMDLINE to reflect the real action performed in these states.	2025-04-24 15:09:37 +02:00
Christopher Faulet	20ec1de214	MAJOR: cli: Refacor parsing and execution of pipelined commands Before this patch, when pipelined commands were received, each command was parsed and then executed before moving to the next command. Pending commands were not copied in the input buffer of the applet. The major issue with this way to handle commands is the impossibility to consume inputs from commands with an I/O handler, like "show events" for instance. It was working thanks to a "bug" if such commands were the last ones on the command line. But it was impossible to use them followed by another command. And this prevents us from implementing any streaming support for CLI commands. So we decided to refactor the command line parsing to have something similar to a basic shell. Now an entire line is parsed, including the payload, before starting commands execution. The command line is copied in a dedicated buffer. "appctx->chunk" buffer is used for this purpose. It was an unused field, so it is safe to use it here. Once the command line is copied, the commands found on this line are executed. Because the applet input buffer was flushed, any input can be safely consumed by the CLI applet and is available for the command I/O handler. Thanks to this change, "show event -w" command can be followed by a command. And in theory, it should be possible to implement commands supporting input data streaming. For instance, the Tetris-like lua applet can be used on the CLI now. Note that the payload, if any, is part of the command line and must be fully received before starting the commands processing. It means there is still the limitation to a buffer, but not only for the payload but for the whole command line. The payload is still necessarily at the end of the command line and is passed as argument to the last command. Internally, the "appctx->cli_payload" field was introduced to point on the payload in the command line buffer. This patch is quite huge but it cannot easily be split. 
It should not introduce significant changes.	2025-04-24 15:09:37 +02:00
Christopher Faulet	69a9ec5bef	MINOR: cli: Use a full prompt command for bidir connections with workers When a bidirectional connection with no command is established with a worker (so "@@<pid>" alone), a "prompt" command is automatically added to display the worker's prompt and enter in interactive mode in the worker context. However, till now, an unfinished command line is sent, with a semicolon instead of a newline at the end. It is not exactly a bug because this works. But it is not really expected and could be a problem for future changes. So now, a full command line is sent: the "prompt" command finished by a newline character.	2025-04-24 15:09:37 +02:00
Christopher Faulet	d3f9289447	BUG/MINOR: cli: Issue an error when too many args are passed for a command When a command is parsed to split it in an array of arguments, by default, at most 64 arguments are supported. But no warning was emitted when there were too many arguments. Instead, the arguments above the limit were silently ignored. It could be an issue for some commands, like "add server", because there was no way to know some arguments were ignored. Now an error is issued when too many arguments are passed and the command is not executed. This patch should be backported to all stable versions.	2025-04-24 14:58:24 +02:00
Amaury Denoyelle	6c5030f703	BUG/MINOR: mux-quic: do not decode if conn in error Add an early return to qcc_decode_qcs() if QCC instance is flagged on error and connection is scheduled for immediate closure. The main objective is to ensure to not trigger BUG_ON() from qcc_set_error() : if a stream decoding has set the connection error, do not try to process decoding on other streams as they may also encounter an error. Thus, the connection is closed asap with the first encountered error case. This should be backported up to 2.6, after a period of observation.	2025-04-24 14:15:02 +02:00
Amaury Denoyelle	fbedb8746f	BUG/MINOR: mux-quic: fix possible infinite loop during decoding With the support of multiple Rx buffers per QCS instance, stream decoding in qcc_io_recv() has been reworked for the next haproxy release. An issue appears in a double while loop : a break statement is used in the inner loop, which is not sufficient as it should instead exit from the outer one. Fix this by replacing break with a goto statement. No need to backport this.	2025-04-24 14:15:02 +02:00
Amaury Denoyelle	3dcda87e58	MINOR: h3: simplify h3_rcv_buf return path Remove return statement in h3_rcv_buf() in case of stream/connection error. Instead, reuse already existing label err. This simplifies the code path. It also fixes the missing leave trace for these cases.	2025-04-24 14:15:02 +02:00
Willy Tarreau	1af592c511	MINOR: stick-table: use a separate lock label for updates Too many locks were sharing STK_TABLE_LOCK making it hard to analyze. Let's split the already heavily used update lock.	2025-04-24 14:02:22 +02:00
William Lallemand	f192e446d6	MEDIUM: acme: rename "account" into "account-key" Rename the "account" option of the acme section into "account-key".	2025-04-24 11:10:46 +02:00
William Lallemand	af73f98a3e	MEDIUM: acme: rename "uri" into "directory" Rename the "uri" option of the acme section into "directory".	2025-04-24 10:52:46 +02:00
William Lallemand	4e14889587	MEDIUM: acme: use a customized proxy Use a customized proxy for the ACME client. The proxy is initialized at the first acme section parsed. The proxy uses the httpsclient log format as ACME CA use HTTPS.	2025-04-23 15:37:57 +02:00
William Lallemand	d700a242b4	MINOR: httpclient: add an "https" log-format Add an experimental "https" log-format for the httpclient, it is not used by the httpclient by default, but could be defined in a customized proxy. The string is basically a httpslog, with some of the fields replaced by their backend equivalent or - when not available: "%ci:%cp [%tr] %ft -/- %TR/%Tw/%Tc/%Tr/%Ta %ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r %[bc_err]/%[ssl_bc_err,hex]/-/-/%[ssl_bc_is_resumed] -/-/-"	2025-04-23 15:32:46 +02:00
William Lallemand	d19a62dc65	MINOR: acme/cli: add the 'acme renew' command to the help message Add the 'acme renew' command to the 'help' command of the CLI.	2025-04-23 13:59:27 +02:00
Christopher Faulet	1709cfd31d	MINOR: http-act: Add 'pause' action to temporarily suspend the message analysis The 'pause' HTTP action can now be used to suspend for a moment the message analysis. A timeout, expressed in milliseconds using a time-format parameter, or an expression can be used. If an expression is used, errors and invalid values are ignored. Internally, the action will set the analysis expiration date on the corresponding channel to the configured value and it will yield while it is not expired. The 'pause' action is available for 'http-request' and 'http-response' rules.	2025-04-22 16:14:47 +02:00
Christopher Faulet	ce8c2d359b	BUG/MEDIUM: mux-spop: Respect the negociated max-frame-size value to send frames When a SPOP connection is opened, the maximum size for frames is negotiated. This negotiated size is properly used when a frame is received and if a too big frame is detected, an error is triggered. However, the same was not performed on the sending path. No check was performed on frames sent to the agent. So it was possible to send frames bigger than the maximum size supported by the SPOE agent. Now, the size of NOTIFY and DISCONNECT frames is checked before sending them to the agent. Thanks to Miroslav for reporting the issue. This patch must be backported to 3.1.	2025-04-22 16:14:47 +02:00
Christopher Faulet	a56feffc6f	CLEANUP: h1: Remove now useless h1_parse_cont_len_header() function Since the commit "MINOR: hlua/h1: Use http_parse_cont_len_header() to parse content-length value", this function is no longer used. So it can be safely removed.	2025-04-22 16:14:47 +02:00
Christopher Faulet	9e05c14a41	MINOR: hlua/h1: Use http_parse_cont_len_header() to parse content-length value Till now, h1_parse_cont_len_header() was used during the H1 message parsing and by the lua HTTP applets to parse the content-length header value. But a more generic function was added some years ago doing exactly the same operations. So let's use it instead.	2025-04-22 16:14:47 +02:00
Christopher Faulet	a6b32922fc	MINOR: mux-h1: Keep custom "Content-Length: 0" header in 1xx and 204 messages Thanks to the commit "MINOR: mux-h1: Don't remove custom "Content-Length: 0" header in 1xx and 204 messages", we are now sure that 1xx and 204 responses were sanitized during the parsing. So, if one of these headers is found in such responses when sent to the client, it means it was added by hand, via a "set-header" action for instance. In this context, we are able to make an exception for the "Content-Length: 0" header, and only this one with this value, to not break legacy applications. So now, a user can force the "Content-Length: 0" header to appear in 1xx and 204 responses by adding the right action in their configuration. "Transfer-Encoding" headers are still dropped, as are "Content-Length" headers with a value other than 0. Note that in practice, only 101 and 204 are concerned because other 1xx messages are not subject to HTTP analysis. This patch should fix the issue #2888. There is no reason to backport it. But if we do so, the patch above must be backported too.	2025-04-22 16:14:47 +02:00
Christopher Faulet	1db99b09d0	MINOR: h1-htx: Skip C-L and T-E headers for 1xx and 204 messages during parsing According to the RFC9110 and RFC9112, a server must not add 'Content-Length' or 'Transfer-Encoding' headers into 1xx and 204 responses. So till now, these headers were dropped from the response when it is sent to the client. However, it seems more logical to remove it during the message parsing. In addition to sanitize messages as early as possible, this will allow us to apply some exception in some cases (This will be the subject of another patch). In this patch, 'Content-Length' and 'Transfer-Encoding' headers are removed from 1xx and 204 responses during the parsing but the same is still performed during the formatting stage.	2025-04-22 16:14:47 +02:00
Christopher Faulet	5200203677	MINOR: proxy: Add options to drop HTTP trailers during message forwarding In RFC9110, it is stated that trailers could be merged with the headers. While it should be performed with a special care, it may be a problem for some applications. To avoid any trouble with such applications, two new options were added to drop trailers during the message forwarding. On the backend, "http-drop-request-trailers" option can be enabled to drop trailers from the requests before sending them to the server. And on the frontend, "http-drop-response-trailers" option can be enabled to drop trailers from the responses before sending them to the client. The options can be defined in defaults sections and disabled with "no" keyword. This patch should fix the issue #2930.	2025-04-22 16:14:46 +02:00
Christopher Faulet	044ef9b3d6	CLEANUP: Slightly reorder some proxy option flags to free slots PR_O_TCPCHK_SSL and PR_O_CONTSTATS were shifted to free a slot. The idea is to have 2 contiguous slots to be able to insert two new options.	2025-04-22 16:14:46 +02:00
Willy Tarreau	5763a891a9	CLEANUP: proxy: detach the name node in proxy_free_common() instead This changes commit d2a9149f0 ("BUG/MINOR: proxy: always detach a proxy from the names tree on free()") to be cleaner. Aurélien spotted that the free(p->id) was indeed already done in proxy_free_common(), which is called before we delete the node. That's still a bit ugly and it only works because ebpt_delete() does not dereference the key during the operation. Better play safe and delete the entry before freeing it, that's more future-proof.	2025-04-19 10:21:19 +02:00
Willy Tarreau	d2a9149f09	BUG/MINOR: proxy: always detach a proxy from the names tree on free() Stephen Farrell reported in issue #2942 that recent haproxy versions crash if there's no resolv.conf. A quick bisect with his reproducer showed that it started with commit 4194f75 ("MEDIUM: tree-wide: avoid manually initializing proxies") which reorders the proxies initialization sequence a bit. The crash shows a corrupted tree, typically indicating a use-after-free. With the help of ASAN it was possible to find that a resolver proxy had been destroyed and freed before the name insertion that causes the crash, very likely caused by the absence of the needed resolv.conf: #0 0x7ffff72a82f7 in free (/usr/local/lib64/libasan.so.5+0x1062f7) #1 0x94c1fd in free_proxy src/proxy.c:436 #2 0x9355d1 in resolvers_destroy src/resolvers.c:2604 #3 0x93e899 in resolvers_create_default src/resolvers.c:3892 #4 0xc6ed29 in httpclient_resolve_init src/http_client.c:1170 #5 0xc6fbcf in httpclient_create_proxy src/http_client.c:1310 #6 0x4ae9da in ssl_ocsp_update_precheck src/ssl_ocsp.c:1452 #7 0xa1b03f in step_init_2 src/haproxy.c:2050 But free_proxy() doesn't delete the ebpt_node that carries the name, which perfectly explains the situation. This patch simply deletes the name node and Stephen confirmed that it fixed the problem for him as well. Let's also free it since the key points to p->id which is never freed either in this function! No backport is needed since the patch above was first merged into 3.2-dev10.	2025-04-18 23:50:13 +02:00
Amaury Denoyelle	4309a6fbf8	BUG/MINOR: quic: do not crash on CRYPTO ncbuf alloc failure To handle out-of-order received CRYPTO frames, a ncbuf instance is allocated. This is done via the helper quic_get_ncbuf(). Buffer allocation was improperly checked. In case b_alloc() fails, it crashes due to a BUG_ON(). Fix this by removing it. The function now returns NULL on allocation failure, which is already properly handled in its caller qc_handle_crypto_frm(). This should fix the last reported crash from github issue #2935. This must be backported up to 2.6.	2025-04-18 18:11:17 +02:00
Willy Tarreau	acd372d6ac	[RELEASE] Released version 3.2-dev11 Released version 3.2-dev11 with the following main changes : - CI: enable weekly QuicTLS build - DOC: management: slightly clarify the prefix role of the '@' command - DOC: management: add a paragraph about the limitations of the '@' prefix - MINOR: master/cli: support bidirectional communications with workers - MEDIUM: ssl/ckch: add filename and linenum argument to crt-store parsing - MINOR: acme: add the acme section in the configuration parser - MINOR: acme: add configuration for the crt-store - MINOR: acme: add private key configuration - MINOR: acme/cli: add the 'acme renew' command - MINOR: acme: the acme section is experimental - MINOR: acme: get the ACME directory - MINOR: acme: handle the nonce - MINOR: acme: check if the account exist - MINOR: acme: generate new account - MINOR: acme: newOrder request retrieve authorizations URLs - MINOR: acme: allow empty payload in acme_jws_payload() - MINOR: acme: get the challenges object from the Auth URL - MINOR: acme: send the request for challenge ready - MINOR: acme: implement a check on the challenge status - MINOR: acme: generate the CSR in a X509_REQ - MINOR: acme: finalize by sending the CSR - MINOR: acme: verify the order status once finalized - MINOR: acme: implement retrieval of the certificate - BUG/MINOR: acme: ckch_conf_acme_init() when no filename - MINOR: ssl/ckch: handle ckch_conf in ckchs_dup() and ckch_conf_clean() - MINOR: acme: copy the original ckch_store - MEDIUM: acme: replace the previous ckch instance with new ones - MINOR: acme: schedule retries with a timer - BUILD: acme: enable the ACME feature when JWS is present - BUG/MINOR: cpu-topo: check the correct variable for NULL after malloc() - BUG/MINOR: acme: key not restored upon error in acme_res_certificate() - BUG/MINOR: thread: protect thread_cpus_enabled_at_boot with USE_THREAD - MINOR: acme: default to 2048bits for RSA - DOC: acme: explain how to configure and run ACME - 
BUG/MINOR: debug: remove the trailing \n from BUG_ON() statements - DOC: config: add the missing "profiling.memory" to the global kw index - DOC: config: add the missing "force-cfg-parser-pause" to the global kw index - DEBUG: init: report invalid characters in debug description strings - DEBUG: rename DEBUG_GLITCHES to DEBUG_COUNTERS and enable it by default - DEBUG: counters: make COUNT_IF() only appear at DEBUG_COUNTERS>=1 - DEBUG: counters: add the ability to enable/disable updating the COUNT_IF counters - MINOR: tools: let dump_addr_and_bytes() support dumping before the offset - MINOR: debug: in call traces, dump the 8 bytes before the return address, not after - MINOR: debug: detect call instructions and show the branch target in backtraces - BUG/MINOR: acme: fix possible NULL deref - CLEANUP: acme: stored value is overwritten before it can be used - BUILD: incompatible pointer type suspected with -DDEBUG_UNIT - BUG/MINOR: http-ana: Properly detect client abort when forwarding the response - BUG/MEDIUM: http-ana: Report 502 from req analyzer only during rsp forwarding - CI: fedora rawhide: enable unit tests - DOC: configuration: fix a typo in ACME documentation - MEDIUM: sink: add a new dpapi ring buffer - Revert "BUG/MINOR: acme: key not restored upon error in acme_res_certificate()" - BUG/MINOR: acme: key not restored upon error in acme_res_certificate() V2 - BUG/MINOR: acme: fix the exponential backoff of retries - DOC: configuration: specify limitations of ACME for 3.2 - MINOR: acme: emit logs instead of ha_notice - MINOR: acme: add a success message to the logs - BUG/MINOR: acme/cli: fix certificate name in error message - MINOR: acme: register the task in the ckch_store - MINOR: acme: free acme_ctx once the task is done - BUG/MEDIUM: h3: trim whitespaces when parsing headers value - BUG/MEDIUM: h3: trim whitespaces in header value prior to QPACK encoding - BUG/MINOR: h3: filter upgrade connection header - BUG/MINOR: h3: reject invalid :path in request 
- BUG/MINOR: h3: reject request URI with invalid characters - MEDIUM: h3: use absolute URI form with :authority - BUG/MEDIUM: hlua: fix hlua_applet_{http,tcp}_fct() yield regression (lost data) - BUG/MINOR: mux-h2: prevent past scheduling with idle connections - BUG/MINOR: rhttp: fix reconnect if timeout connect unset - BUG/MINOR: rhttp: ensure GOAWAY can be emitted after reversal - BUG/MINOR: mux-h2: do not apply timer on idle backend connection - MINOR: mux-h2: refactor idle timeout calculation - MINOR: mux-h2: prepare to support PING emission - MEDIUM: server/mux-h2: implement idle-ping on backend side - MEDIUM: listener/mux-h2: implement idle-ping on frontend side - MINOR: mux-h2: do not emit GOAWAY on idle ping expiration - MINOR: mux-h2: handle idle-ping on conn reverse - BUILD: makefile: enable backtrace by default on musl - BUG/MINOR: threads: set threads_idle and threads_harmless even with no threads - BUG/MINOR debug: fix !USE_THREAD_DUMP in ha_thread_dump_fill() - BUG/MINOR: wdt/debug: avoid signal re-entrance between debugger and watchdog - BUG/MINOR: debug: detect and prevent re-entrance in ha_thread_dump_fill() - MINOR: debug: do not statify a few debugging functions often used with wdt/dbg - MINOR: tools: also protect the library name resolution against concurrent accesses - MINOR: tools: protect dladdr() against reentrant calls from the debug handler - MINOR: debug: protect ha_dump_backtrace() against risks of re-entrance - MINOR: tinfo: keep a copy of the pointer to the thread dump buffer - MINOR: debug: always reset the dump pointer when done - MINOR: debug: remove unused case of thr!=tid in ha_thread_dump_one() - MINOR: pass a valid buffer pointer to ha_thread_dump_one() - MEDIUM: wdt: always make the faulty thread report its own warnings - MINOR: debug: make ha_stuck_warning() only work for the current thread - MINOR: debug: make ha_stuck_warning() print the whole message at once - CLEANUP: debug: no longer set nor use TH_FL_DUMPING_OTHERS - 
MINOR: sched: add a new function is_sched_alive() to report scheduler's health - MINOR: wdt: use is_sched_alive() instead of keeping a local ctxsw copy - MINOR: sample: add 4 new sample fetches for clienthello parsing - REGTEST: add new reg-test for the 4 new clienthello fetches - MINOR: servers: Move the per-thread server initialization earlier - MINOR: proxies: Initialize the per-thread structure earlier. - MINOR: servers: Provide a pointer to the server in srv_per_tgroup. - MINOR: lb_fwrr: Move the next weight out of fwrr_group. - MINOR: proxies: Add a per-thread group lbprm struct. - MEDIUM: lb_fwrr: Use one ebtree per thread group. - MEDIUM: lb_fwrr: Don't start all thread groups on the same server. - MINOR: proxies: Do stage2 initialization for sinks too	2025-04-18 14:19:47 +02:00
Olivier Houchard	c4aec7a52f	MINOR: proxies: Do stage2 initialization for sinks too In check_config_validity(), we initialize the proxy in several stages. We do so for the sink list for stage1, but not for stage2. It may not be needed right now, but it may become needed in the future, so do it anyway.	2025-04-17 17:38:23 +02:00
Olivier Houchard	658eaa4086	MEDIUM: lb_fwrr: Don't start all thread groups on the same server. Now that there is one tree per thread group, all thread groups will start on the same server. To prevent that, just insert the servers in a different order for each thread group.	2025-04-17 17:38:23 +02:00
Olivier Houchard	3758eab71c	MEDIUM: lb_fwrr: Use one ebtree per thread group. When using the round-robin load balancer, the major source of contention is the lbprm lock, that has to be held every time we pick a server. To mitigate that, make it so there is one tree per thread-group, and one lock per thread-group. That means we now have a lb_fwrr_per_tgrp structure that will contain the two lb_fwrr_groups (active and backup) as well as the lock to protect them in the per-thread lbprm struct, and all fields in the struct server are now moved to the per-thread structure too. Those changes are mostly mechanical, and bring a good performance improvement: on a 64-core AMD CPU, with 64 servers configured, we could process about 620000 requests per second, and we can now process around 1400000 requests per second.	2025-04-17 17:38:23 +02:00
Olivier Houchard	f36f6cfd26	MINOR: proxies: Add a per-thread group lbprm struct. Add a new structure in the per-thread groups proxy structure, that will contain whatever is per-thread group in lbprm. It will be accessed as p->per_tgrp[tgid].lbprm.	2025-04-17 17:38:23 +02:00
Olivier Houchard	7ca1c94ff0	MINOR: lb_fwrr: Move the next weight out of fwrr_group. Move the "next_weight" outside of fwrr_group, and inside struct lb_fwrr directly, one for the active servers, one for the backup servers. We will soon have one fwrr_group per thread group, but next_weight will be global to all of them.	2025-04-17 17:38:23 +02:00
Olivier Houchard	444125a764	MINOR: servers: Provide a pointer to the server in srv_per_tgroup. Add a pointer to the server into the struct srv_per_tgroup, so that if we only have access to that srv_per_tgroup, we can come back to the corresponding server.	2025-04-17 17:38:23 +02:00
Olivier Houchard	5e1ce09e54	MINOR: proxies: Initialize the per-thread structure earlier. Move the call to initialize the proxy's per-thread structure earlier than currently done, so that they are usable when we're initializing the load balancers.	2025-04-17 17:38:23 +02:00
Olivier Houchard	e7613d3717	MINOR: servers: Move the per-thread server initialization earlier Move the code responsible for calling per-thread server initialization earlier than it was done, so that per-thread structures are available a bit later, when we initialize load-balancing.	2025-04-17 17:38:23 +02:00
Mariam John	9a8c4df45d	REGTEST: add new reg-test for the 4 new clienthello fetches Add a reg-test which uses the 4 fetches: - req.ssl_cipherlist - req.ssl_sigalgs - req.ssl_keyshare_groups - req.ssl_supported_groups	2025-04-17 16:39:47 +02:00
Mariam John	fa063a9e77	MINOR: sample: add 4 new sample fetches for clienthello parsing This patch contains these 4 new fetches and doc changes for the new fetches: - req.ssl_cipherlist - req.ssl_sigalgs - req.ssl_keyshare_groups - req.ssl_supported_groups Towards:#2532	2025-04-17 16:39:47 +02:00
Willy Tarreau	5901164789	MINOR: wdt: use is_sched_alive() instead of keeping a local ctxsw copy Now we can simply call is_sched_alive() on the local thread to verify that the scheduler is still ticking instead of having to keep a copy of the ctxsw and comparing it. It's cleaner, doesn't require to maintain a local copy, doesn't rely on activity[] (whose purpose is mainly for observation and debugging), and shows how this could be extended later to cover other use cases. Practically speaking this doesn't change anything however, the algorithm is still the same.	2025-04-17 16:25:47 +02:00
Willy Tarreau	36ec70c526	MINOR: sched: add a new function is_sched_alive() to report scheduler's health This verifies that the scheduler is still ticking without having to access the activity[] array nor keeping local copies of the ctxsw counter. It just tests and sets a flag that is reset after each return from a ->process() function.	2025-04-17 16:25:47 +02:00
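The test-and-set check described in the is_sched_alive() commit above can be sketched as follows. This is a minimal illustration, not HAProxy's code: `struct sched_state`, `sched_mark_progress()` and `sched_is_alive()` are hypothetical names, and C11 atomics stand in for whatever primitives the project actually uses.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-thread state; the real HAProxy structures differ. */
struct sched_state {
    atomic_bool checked; /* set by the checker, cleared by the scheduler */
};

/* To be called by the scheduler after each return from a task's
 * process callback: it clears the flag to prove it is still ticking.
 */
static inline void sched_mark_progress(struct sched_state *s)
{
    atomic_store(&s->checked, false);
}

/* Tests and sets the flag. If the flag was still set from the previous
 * check, the scheduler did not return from any callback in between, so
 * it is reported as not alive.
 */
static inline bool sched_is_alive(struct sched_state *s)
{
    return !atomic_exchange(&s->checked, true);
}
```

With this shape the checker needs no copy of a context-switch counter: two consecutive checks without an intervening `sched_mark_progress()` are enough to raise suspicion.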
Willy Tarreau	874ba2afed	CLEANUP: debug: no longer set nor use TH_FL_DUMPING_OTHERS TH_FL_DUMPING_OTHERS was being used to try to perform exclusion between threads running "show threads" and those producing warnings. Now that it is much more cleanly handled, we don't need that type of protection anymore, which was adding to the complexity of the solution. Let's just get rid of it.	2025-04-17 16:25:47 +02:00
Willy Tarreau	513397ac82	MINOR: debug: make ha_stuck_warning() print the whole message at once It has been noticed quite a few times during troubleshooting and even testing that warnings can happen in avalanches from multiple threads at the same time, and that their reporting is interleaved because the output is produced in small chunks. Originally, this code inspired by the panic code aimed at making sure to log whatever could be emitted in case it would crash later. But this approach was wrong since writes are atomic, and performing 5 writes in sequence in each dumping thread also means that the outputs can be mixed up at 5 different locations between multiple threads. The output of warnings is never very long, and the stack-based buffer is 4kB so let's just concatenate everything in the buffer and emit it at once using a single write(). Now there's no longer this confusion on the output.	2025-04-17 16:25:47 +02:00
Willy Tarreau	c16d5415a8	MINOR: debug: make ha_stuck_warning() only work for the current thread Since we no longer call it with a foreign thread, let's simplify its code and get rid of the special cases that were relying on ha_thread_dump_fill() and synchronization with a remote thread. We're now only dumping the current thread so ha_thread_dump_one() is sufficient.	2025-04-17 16:25:47 +02:00
Willy Tarreau	a06c215f08	MEDIUM: wdt: always make the faulty thread report its own warnings Warnings remain tricky to deal with, especially for other threads as they require some inter-thread synchronization that doesn't cope very well with other parallel activities such as "show threads" for example. However there is nothing that forces us to handle them this way. The panic for example is already handled by bouncing the WDT signal to the faulty thread. This commit rearranges the WDT handler to make better use of this existing signal bouncing feature of the WDT handler so that it's no longer limited to panics but can also deal with warnings. In order not to bounce on all wakeups, we only bounce when there is a suspicion, that is, when the warning timer has been crossed. We'll let the target thread verify the stuck flag and context switch count by itself to decide whether or not to panic, warn, or just do nothing and update the counters. As a bonus, now all warning traces look the same regardless of the reporting thread: call trace(16): \| 0x6bc733 <01 00 00 e8 6d e6 de ff]: ha_dump_backtrace+0x73/0x309 > main-0x2570 \| 0x6bd37a <00 00 00 e8 d6 fb ff ff]: ha_thread_dump_fill+0xda/0x104 > ha_thread_dump_one \| 0x6bd625 <00 00 00 e8 7b fc ff ff]: ha_stuck_warning+0xc5/0x19e > ha_thread_dump_fill \| 0x7b2b60 <64 8b 3b e8 00 aa f0 ff]: wdt_handler+0x1f0/0x212 > ha_stuck_warning \| 0x7fd7e2cef3a0 <00 00 00 00 0f 1f 40 00]: libpthread:+0x123a0 \| 0x7ffc6af9e634 <85 a6 00 00 00 0f 01 f9]: linux-vdso:__vdso_gettimeofday+0x34/0x2b0 \| 0x6bad74 <7c 24 10 e8 9c 01 df ff]: sc_conn_io_cb+0x9fa4 > main-0x2400 \| 0x67c457 <89 f2 4c 89 e6 41 ff d0]: main+0x1cf147 \| 0x67d401 <48 89 df e8 8f ed ff ff]: cli_io_handler+0x191/0xb38 > main+0x1cee80 \| 0x6dd605 <40 48 8b 45 60 ff 50 18]: task_process_applet+0x275/0xce9	2025-04-17 16:25:47 +02:00
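The "concatenate, then emit once" approach from the ha_stuck_warning() commit above can be sketched like this. The names `struct warn_buf`, `warn_buf_appendf()` and `warn_buf_flush()` are hypothetical and only illustrate the idea; the 4kB size mirrors the buffer size mentioned in the commit message.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdarg.h>
#include <stdio.h>
#include <unistd.h>

#define WARN_BUF_SIZE 4096 /* matches the 4kB buffer cited in the commit */

/* Hypothetical accumulation buffer, typically placed on the stack. */
struct warn_buf {
    char data[WARN_BUF_SIZE];
    size_t len;
};

/* Appends formatted text to the buffer, truncating if it does not fit. */
static void warn_buf_appendf(struct warn_buf *b, const char *fmt, ...)
{
    size_t room = sizeof(b->data) - b->len;
    va_list ap;
    int n;

    if (room <= 1)
        return; /* buffer full: silently drop the rest */

    va_start(ap, fmt);
    n = vsnprintf(b->data + b->len, room, fmt, ap);
    va_end(ap);

    if (n < 0)
        return;
    /* vsnprintf() stores at most room-1 characters plus the final NUL */
    b->len += ((size_t)n < room) ? (size_t)n : room - 1;
}

/* Emits the whole message with a single write() and resets the buffer. */
static ssize_t warn_buf_flush(struct warn_buf *b, int fd)
{
    ssize_t ret = write(fd, b->data, b->len);

    b->len = 0;
    return ret;
}
```

A caller would chain several `warn_buf_appendf()` calls for the header, the thread state and the backtrace, then call `warn_buf_flush(&buf, 2)` once, so that output from concurrent threads can no longer be mixed in the middle of a message.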
Willy Tarreau	b24d7f248e	MINOR: pass a valid buffer pointer to ha_thread_dump_one() The goal is to let the caller deal with the pointer so that the function only has to fill that buffer without worrying about locking. This way, synchronous dumps from "show threads" are produced and emitted directly without causing undesired locking of the buffer nor risking causing confusion about thread_dump_buffer containing bits from an interrupted dump in progress. It's only the caller that's responsible for notifying the requester of the end of the dump by setting bit 0 of the pointer if needed (i.e. it's only done in the debug handler).	2025-04-17 16:25:47 +02:00
Willy Tarreau	5ac739cd0c	MINOR: debug: remove unused case of thr!=tid in ha_thread_dump_one() This function was initially designed to dump any thread into the presented buffer, but the way it currently works is that it's always called for the current thread, and uses the distinction between coming from a sighandler or being called directly to detect which thread is the caller. Let's simplify all this by replacing thr with tid everywhere, and using the thread-local pointers where it makes sense (e.g. th_ctx, th_ctx etc). The confusing "from_signal" argument is now replaced with "is_caller" which clearly states whether or not the caller declares being the one asking for the dump (the logic is inverted, but there are only two call places with a constant).	2025-04-17 16:25:47 +02:00
Willy Tarreau	5646ec4d40	MINOR: debug: always reset the dump pointer when done We don't need to copy the old dump pointer to the thread_dump_pointer area anymore to indicate a dump is collected. It used to be done as an artificial way to keep the pointer for the post-mortem analysis but since we now have this pointer stored separately, that's no longer needed and it simplifies the mechanism to reset it.	2025-04-17 16:25:47 +02:00
Willy Tarreau	6d8a523d14	MINOR: tinfo: keep a copy of the pointer to the thread dump buffer Instead of using the thread dump buffer for post-mortem analysis, we'll keep a copy of the assigned pointer whenever it's used, even for warnings or "show threads". This will offer more opportunities to figure from a core what happened, and will give us more freedom regarding the value of the thread_dump_buffer itself. For example, even at the end of the dump when the pointer is reset, the last used buffer is now preserved.	2025-04-17 16:25:47 +02:00
Willy Tarreau	d20e9cad67	MINOR: debug: protect ha_dump_backtrace() against risks of re-entrance If a thread is dumping itself (warning, show thread etc) and another one wants to dump the state of all threads (e.g. panic), it may interrupt the first one during backtrace() and re-enter it from the signal handler, possibly triggering a deadlock in the underlying libc. Let's postpone the debug signal delivery at this point until the call ends in order to avoid this.	2025-04-17 16:25:47 +02:00
Willy Tarreau	2dfb63313b	MINOR: tools: protect dladdr() against reentrant calls from the debug handler If a thread is currently resolving a symbol while another thread triggers a thread dump, the current thread may enter the debug handler and call resolve_sym_addr() again, possibly deadlocking if the underlying libc uses locking. Let's postpone the debug signal delivery in this area during the call. This will slow the resolution a little bit but we don't care, it's not supposed to happen often and it must remain rock-solid.	2025-04-17 16:25:47 +02:00
Willy Tarreau	8d0c633677	MINOR: tools: also protect the library name resolution against concurrent accesses This is an extension of eb41d768f ("MINOR: tools: use only opportunistic symbols resolution"). It also makes sure we're not calling dladdr() in parallel to dladdr_and_size(), as a preventive measure against some potential deadlocks in the inner layers of the libc.	2025-04-17 16:25:47 +02:00
Willy Tarreau	5b5960359f	MINOR: debug: do not statify a few debugging functions often used with wdt/dbg A few functions are used when debugging debug signals and watchdog, but being static, they're not resolved and are hard to spot in dumps, and they appear as any random other function plus an offset. Let's just not mark them static anymore, it only hurts: - cli_io_handler_show_threads() - debug_run_cli_deadlock() - debug_parse_cli_loop() - debug_parse_cli_panic()	2025-04-17 16:25:47 +02:00
Willy Tarreau	47f8397afb	BUG/MINOR: debug: detect and prevent re-entrance in ha_thread_dump_fill() In the following trace trying to abuse the watchdog from the CLI's "debug dev loop" command running in parallel to "show threads" loops, it's clear that some re-entrance may happen in ha_thread_dump_fill(). A first minimal fix consists in using a test-and-set on the flag indicating that the function is currently dumping threads, so that the one from the signal just returns. However the caller should be made more reliable to serialize all of this, that's for future work. Here's an example capture of 7 threads stuck waiting for each other: (gdb) bt #0 0x00007fe78d78e147 in sched_yield () from /lib64/libc.so.6 #1 0x0000000000674a05 in ha_thread_relax () at src/thread.c:356 #2 0x00000000005ba4f5 in ha_thread_dump_fill (thr=2, buf=0x7ffdd8e08ab0) at src/debug.c:402 #3 ha_thread_dump_fill (buf=0x7ffdd8e08ab0, thr=<optimized out>) at src/debug.c:384 #4 0x00000000005baac4 in ha_stuck_warning (thr=thr@entry=2) at src/debug.c:840 #5 0x00000000006a360d in wdt_handler (sig=<optimized out>, si=<optimized out>, arg=<optimized out>) at src/wdt.c:156 #6 <signal handler called> #7 0x00007fe78d78e147 in sched_yield () from /lib64/libc.so.6 #8 0x0000000000674a05 in ha_thread_relax () at src/thread.c:356 #9 0x00000000005ba4c2 in ha_thread_dump_fill (thr=2, buf=0x7fe78f2d6420) at src/debug.c:426 #10 ha_thread_dump_fill (buf=0x7fe78f2d6420, thr=2) at src/debug.c:384 #11 0x00000000005ba7c6 in cli_io_handler_show_threads (appctx=0x2a89ab0) at src/debug.c:548 #12 0x000000000057ea43 in cli_io_handler (appctx=0x2a89ab0) at src/cli.c:1176 #13 0x00000000005d7885 in task_process_applet (t=0x2a82730, context=0x2a89ab0, state=<optimized out>) at src/applet.c:920 #14 0x0000000000659002 in run_tasks_from_lists (budgets=budgets@entry=0x7ffdd8e0a5c0) at src/task.c:644 #15 0x0000000000659bd7 in process_runnable_tasks () at src/task.c:886 #16 0x00000000005cdcc9 in run_poll_loop () at src/haproxy.c:2858 
#17 0x00000000005ce457 in run_thread_poll_loop (data=<optimized out>) at src/haproxy.c:3075 #18 0x0000000000430628 in main (argc=<optimized out>, argv=<optimized out>) at src/haproxy.c:3665	2025-04-17 16:25:47 +02:00
Willy Tarreau	ebf1757dc2	BUG/MINOR: wdt/debug: avoid signal re-entrance between debugger and watchdog As seen in issue #2860, there are some situations where a watchdog could trigger during the debug signal handler, and where similarly the debug signal handler may trigger during the wdt handler. This is really bad because it could trigger some deadlocks inside inner libc code such as dladdr() or backtrace() since the code will not protect against re-entrance but only against concurrent accesses. A first attempt was made using ha_sigmask() but that's not always very convenient because the second handler is called immediately after unblocking the signal and before returning, leaving signal cascades in backtrace. Instead, let's mark which signals to block at registration time. Here we're blocking wdt/dbg for both signals, and optionally SIGRTMAX if DEBUG_DEV is used as that one may also be used in this case. This should be backported at least to 3.1.	2025-04-17 16:25:47 +02:00
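Blocking signals "at registration time", as the commit above puts it, maps onto the `sa_mask` field of POSIX `sigaction()`. The sketch below assumes a hypothetical helper `register_exclusive_handler()`; any two signals can be passed, for example `SIGUSR1` and `SIGUSR2` as stand-ins for the watchdog and debug signals actually used by HAProxy.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Installs <handler> for <sig> and asks the kernel to keep <other>
 * blocked for as long as the handler runs. <sig> itself is also blocked
 * during its own handler since SA_NODEFER is not set.
 */
static int register_exclusive_handler(int sig, int other, void (*handler)(int))
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sigaddset(&sa.sa_mask, other);
    return sigaction(sig, &sa, NULL);
}

/* Placeholder handler used only for illustration. */
static void noop_handler(int sig)
{
    (void)sig;
}
```

Registering both handlers this way, each masking the other signal, means neither can interrupt the other. A signal that arrives in the meantime stays pending and is delivered only once the running handler has returned, which avoids the cascade seen when unblocking manually from inside the handler.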
Willy Tarreau	0b56839455	BUG/MINOR debug: fix !USE_THREAD_DUMP in ha_thread_dump_fill() The function must make sure to return NULL for foreign threads and the local buffer for the current thread in this case, otherwise panics (and sometimes even warnings) will segfault when USE_THREAD_DUMP is disabled. Let's slightly re-arrange the function to reduce the #if/else since we have to specifically handle the case of !USE_THREAD_DUMP anyway. This needs to be backported wherever b8adef065d ("MEDIUM: debug: on panic, make the target thread automatically allocate its buf") was backported (at least 2.8).	2025-04-17 16:25:47 +02:00
Willy Tarreau	337017e2f9	BUG/MINOR: threads: set threads_idle and threads_harmless even with no threads Some signal handlers rely on these to decide about the level of detail to provide in dumps, so let's properly fill the info about entering/leaving idle. Note that for consistency with other tests we're using bitops with t->ltid_bit, while we could simply assign 0/1 to the fields. But it makes the code more readable and the whole difference is only 88 bytes on a 3MB executable. This bug is not important, and while older versions are likely affected as well, it's not worth taking the risk to backport this in case it would wake up an obscure bug.	2025-04-17 16:25:47 +02:00
Willy Tarreau	f499fa3dcd	BUILD: makefile: enable backtrace by default on musl The reason musl builds were not producing exploitable backtraces was that the toolchain used appears to automatically omit the frame pointer at -O2 but leaves it at -O0. This patch just makes sure to always append -fno-omit-frame-pointer to the BACKTRACE cflags and enables the option with musl where it now works. This will allow us to finally get exploitable traces from docker images where core dumps are not always available.	2025-04-17 16:25:47 +02:00
Amaury Denoyelle	bd1d02e2b3	MINOR: mux-h2: handle idle-ping on conn reverse This commit extends MUX H2 connection reversal step to properly take into account the new idle-ping feature. It first ensures that h2c task is properly instantiated/freed depending now on both timers and idle-ping configuration. Also, h2c_update_timeout() is now called instead of manually requeuing the task, which ensures the proper timer value is selected depending on the new connection side.	2025-04-17 14:49:36 +02:00
Amaury Denoyelle	cc5a7a760f	MINOR: mux-h2: do not emit GOAWAY on idle ping expiration If idle-ping is activated and h2c task is expired due to missing PING ACK, consider that the peer is away and the connection can be closed immediately. GOAWAY emission is thus skipped. A new test is necessary in h2c_update_timeout() when PING ACK is currently expected, but the next timer expiration selected is not idle-ping. This may happen if http-keep-alive/http-request timers are selected first. In this case, H2_CF_IDL_PING_SENT flag is reset. This is necessary to not prevent GOAWAY emission on expiration.	2025-04-17 14:49:36 +02:00
Amaury Denoyelle	52246249ab	MEDIUM: listener/mux-h2: implement idle-ping on frontend side This commit is the counterpart of the previous one, adapted on the frontend side. "idle-ping" is added as keyword to bind lines, to be able to refresh client timeout of idle frontend connections. H2 MUX behavior remains similar as the previous patch. The only significant change is in h2c_update_timeout(), as idle-ping is now taken into account also for frontend connection. The calculated value is compared with http-request/http-keep-alive timeout value. The shorter delay is then used as expired date. As hr/ka timeout are based on idle_start, this allows to run them in parallel with an idle-ping timer.	2025-04-17 14:49:36 +02:00
Amaury Denoyelle	a78a04cfae	MEDIUM: server/mux-h2: implement idle-ping on backend side This commit implements support for idle-ping on the backend side. First, a new server keyword "idle-ping" is defined in configuration parsing. It is used to set the corresponding new server member. The second part of this commit implements idle-ping support on H2 MUX. A new inlined function conn_idle_ping() is defined to access connection idle-ping value. Two new connection flags are defined H2_CF_IDL_PING and H2_CF_IDL_PING_SENT. The first one is set for idle connections via h2c_update_timeout(). On h2_timeout_task() handler, if first flag is set, instead of releasing the connection as before, the second flag is set and tasklet is scheduled. As both flags are now set, h2_process_mux() will proceed to PING emission. The timer has also been rearmed to the idle-ping value. If a PING ACK is received before next timeout, connection timer is refreshed. Else, the connection is released, as with timer expiration. Also of importance, special care is needed when a backend connection is going to idle. In this case, idle-ping timer must be rearmed. Thus a new invocation of h2c_update_timeout() is performed on h2_detach().	2025-04-17 14:49:36 +02:00
Amaury Denoyelle	4dcfe098a6	MINOR: mux-h2: prepare to support PING emission Adapt the already existing function h2c_ack_ping(). The objective is to be able to emit a PING request. First, it is renamed as h2c_send_ping(). A new boolean argument <ack> is used to emit either a PING request or ack.	2025-04-17 14:49:36 +02:00
Amaury Denoyelle	99b2e52f89	MINOR: mux-h2: refactor idle timeout calculation Reorganize code for timeout calculation in case the connection is idle. The objective is to better reflect the relations between the timeouts as follows: * if GOAWAY already emitted, use shut-timeout, or if unset fallback to client/server one. However, an already set timeout is never erased. * else, for frontend connection, http-request or keep-alive timeout is applied depending on the current demux state. If the selected value is unset, fallback to client timeout * for backend connection, no timeout is set to perform http-reuse This commit is pure refactoring, so no functional change should occur.	2025-04-17 14:49:36 +02:00
Amaury Denoyelle	243bc95de0	BUG/MINOR: mux-h2: do not apply timer on idle backend connection Since the following commit, MUX H2 timeout function has been slightly extended. d38d8c6ccb189e7bc813b3693fec3093c9be55f1 BUG/MEDIUM: mux-h2: make sure control frames do not refresh the idle timeout A side-effect of this patch is that now backend idle connection expire timer is not reset if already defined. This means that if a timer was registered prior to the connection transition to idle, the connection would be destroyed on its timeout. If this happens for enough connections, this may have an impact on the reuse rate. In practice, this case should be rare, as h2c timer is set to TICK_ETERNITY while there are active streams. The timer is not refreshed most of the time before the transition to idle, so the connection won't be deleted on expiration. The only case where it could occur is if there is still pending data blocked on emission on stream detach. Here, timeout server is applied on the connection. When the emission completes, the connection goes to idle, but the timer will still be armed, and thus will be triggered on the idle connection. To prevent this, explicitly reset h2c timer to TICK_ETERNITY for idle backend connection via h2c_update_timeout(). This patch is explicitly not scheduled for backport for now, as it is difficult to estimate the real impact of the previous code state.	2025-04-17 14:49:36 +02:00
Amaury Denoyelle	9e6f8ce328	BUG/MINOR: rhttp: ensure GOAWAY can be emitted after reversal GOAWAY should not be emitted before preface. Thus, max_id field from h2c acting as a server is initialized to -1, which prevents its emission until preface is received from the peer. If acting as a client, max_id is initialized to a valid value on the first h2s emission. This causes an issue with reverse HTTP on the active side. First, it starts as a client, so the peer does not emit a preface but instead a simple SETTINGS frame. As roles are switched, max_id is initialized much later when the first h2s response is emitted. Thus, if the connection must be terminated before any stream transfer, GOAWAY cannot be emitted. To fix this, ensure max_id is initialized to 0 on h2_conn_reverse() for active connect side. Thus, a GOAWAY indicating that no stream has been handled can be generated. Note that passive connect side is not impacted, as its max_id is initialized thanks to preface reception. This should be backported up to 2.9.	2025-04-17 14:49:36 +02:00
Amaury Denoyelle	2b8da5f9ab	BUG/MINOR: rhttp: fix reconnect if timeout connect unset Active connect on reverse http relies on connect timeout to detect connection failure. Thus, if this timeout was unset, connection failure may not be properly detected. Fix this by falling back on a hardcoded value of 1s for connect if timeout is unset in the configuration. This is considered as a minor bug, as haproxy advises against running with timeout unset. This must be backported up to 2.9.	2025-04-17 14:49:36 +02:00
Amaury Denoyelle	3ebdd3ae50	BUG/MINOR: mux-h2: prevent past scheduling with idle connections While reviewing HTTP/2 MUX timeout, it seems there is a possibility that MUX task is requeued via h2c_update_timeout() with an already expired date. This can happen with idle connections in two cases : * first with shut timeout, as timer is not refreshed if already set * second with http-request and keep-alive timers, which are based on idle_start Queuing an already expired task is an undefined behavior. Fix this by using task_wakeup() instead of task_queue() at the end of h2c_update_timeout() if such case occurs. This should be backported up to 2.6.	2025-04-17 14:49:36 +02:00
Aurelien DARRAGON	b81ab159a6	BUG/MEDIUM: hlua: fix hlua_applet_{http,tcp}_fct() yield regression (lost data) Jacques Heunis from Bloomberg reported on the mailing list [1] that with haproxy 2.8 up to master, yielding from a Lua tcp service while data was still buffered inside haproxy would eat some data which was definitely lost. He provided the reproducer below which turned out to be really helpful: global log stdout format raw local0 info lua-load haproxy_yieldtest.lua defaults log global timeout connect 10s timeout client 1m timeout server 1m listen echo bind *:9090 mode tcp tcp-request content use-service lua.print_input haproxy_yieldtest.lua: core.register_service("print_input", "tcp", function(applet) core.Info("Start printing input...") while true do local inputs = applet:getline() if inputs == nil or string.len(inputs) == 0 then core.Info("closing input connection") return end core.Info("Received line: "..inputs) core.yield() end end) And the script below: #!/usr/bin/bash for i in $(seq 1 9999); do for j in $(seq 1 50); do echo "${i}_foo_${j}" done sleep 2 done Using it like this: ./test_seq.sh \| netcat localhost 9090 We can clearly see the missing data for every "foo" burst (every 2 seconds), as there are holes in the numbering. Thanks to the reproducer, it was quickly found that only versions >= 2.8 were affected, and that in fact this regression was introduced by commit 31572229e ("MEDIUM: hlua/applet: Use the sedesc to report and detect end of processing") In fact in 31572229e 2 mistakes were made during the refactoring. Indeed, both in hlua_applet_tcp_fct() (which is involved in the reproducer above) and hlua_applet_http_fct(), the request (buffer) is now systematically consumed when returning from the function, which wasn't the case prior to this commit: when HLUA_E_AGAIN is returned, it means a yield was requested and that the processing is not done yet, thus we should not consume any data, like we did prior to the refactoring. 
Big thanks to Jacques who did a great job reproducing and reporting this issue on the mailing list. [1]: https://www.mail-archive.com/haproxy@formilux.org/msg45778.html It should be backported up to 2.8 with commit 31572229e	2025-04-17 14:40:34 +02:00
Amaury Denoyelle	2c3d656f8d	MEDIUM: h3: use absolute URI form with :authority Change the representation of the start-line URI when parsing a HTTP/3 request into HTX. Adopt the same conversion as HTTP/2. If :authority header is used (default case), the URI is encoded using absolute-form, with scheme, host and path concatenated. If only a plain host header is used instead, fallback to the origin form. This commit may cause some configuration to be broken if parsing is performed on the URI. Indeed, now most of the HTTP/3 requests will be represented with an absolute-form URI at the stream layer. Note that prior to this commit a check was performed on the path used as URI to ensure that it did not contain any invalid characters. Now, this is directly performed on the URI itself, which may include the path. This must not be backported.	2025-04-16 18:32:00 +02:00
Amaury Denoyelle	1faa1285aa	BUG/MINOR: h3: reject request URI with invalid characters Ensure that the HTX start-line generated after parsing an HTTP/3 request does not contain any invalid character, i.e. control or whitespace characters. Note that for now path is used directly as URI. Thus, the check is performed directly over it. A patch will change this to generate an absolute-form URI in most cases, but it won't be backported to avoid configuration breaking in stable versions. This must be backported up to 2.6.	2025-04-16 18:32:00 +02:00
Amaury Denoyelle	fc28fe7191	BUG/MINOR: h3: reject invalid :path in request RFC 9114 specifies some requirements for :path pseudo-header when using http or https scheme. This commit enforces this by rejecting a request if needed. Thus, path cannot be empty, and it must either start with a '/' character or contains only '*'. This must be backported up to 2.6.	2025-04-16 18:31:55 +02:00
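The :path rule stated in the commit above (not empty, and either starting with '/' or consisting only of '*') can be expressed as a small predicate. This is a sketch under assumptions: `h3_path_is_acceptable()` is a hypothetical name, and "contains only '*'" is interpreted here as the single-character asterisk form.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if <path> (not necessarily NUL-terminated, <len> bytes)
 * satisfies the rule described in the commit message for http/https
 * schemes. Assumption: the asterisk form is exactly "*".
 */
static bool h3_path_is_acceptable(const char *path, size_t len)
{
    if (len == 0)
        return false;           /* empty :path is rejected */
    if (path[0] == '/')
        return true;            /* regular path */
    return len == 1 && path[0] == '*';
}
```

For instance, `h3_path_is_acceptable("/index.html", 11)` and `h3_path_is_acceptable("*", 1)` are accepted, while an empty value or `"index.html"` is rejected.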
Amaury Denoyelle	6403bfbce8	BUG/MINOR: h3: filter upgrade connection header As specified in RFC 9114, connection headers require special care in HTTP/3. When a request is received with connection headers, the stream is immediately closed. Conversely, when translating the response from HTX, such headers are not encoded but silently ignored. However, "upgrade" was not listed in connection headers. This commit fixes this by adding a check on it both on request parsing and response encoding. This must be backported up to 2.6.	2025-04-16 18:31:04 +02:00
Amaury Denoyelle	bd3587574d	BUG/MEDIUM: h3: trim whitespaces in header value prior to QPACK encoding This commit does a similar job to the previous one, but it acts now on the response path. Any leading or trailing whitespace characters from a HTX block header value are removed, prior to the header encoding via QPACK. This must be backported up to 2.6.	2025-04-16 18:31:04 +02:00
Amaury Denoyelle	a17e5b27c0	BUG/MEDIUM: h3: trim whitespaces when parsing headers value Remove any leading and trailing whitespace from header field values prior to inserting a new HTX header block. This is done when parsing a HEADERS frame, both as headers and trailers. This must be backported up to 2.6.	2025-04-16 18:31:04 +02:00
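Trimming leading and trailing whitespace from a header field value, as described in the two commits above, can be done without copying by narrowing a pointer/length pair. The sketch below uses hypothetical names (`struct span`, `trim_header_value()`) and assumes that "whitespace" means space and horizontal tab.

```c
#include <stddef.h>

/* Hypothetical non-owning view over a header value. */
struct span {
    const char *ptr;
    size_t len;
};

/* Assumption: only SP and HTAB count as trimmable whitespace. */
static int is_hdr_ws(char c)
{
    return c == ' ' || c == '\t';
}

/* Returns a view of <v> without leading and trailing whitespace.
 * The input is left untouched; inner whitespace is preserved.
 */
static struct span trim_header_value(const char *v, size_t len)
{
    struct span s = { v, len };

    while (s.len && is_hdr_ws(s.ptr[0])) {
        s.ptr++;
        s.len--;
    }
    while (s.len && is_hdr_ws(s.ptr[s.len - 1]))
        s.len--;
    return s;
}
```

A value made only of whitespace yields a zero-length span, which the caller can then treat as an empty header value.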
William Lallemand	8efafe76a3	MINOR: acme: free acme_ctx once the task is done Free the acme_ctx task context once the task is done. It frees everything but the config and the httpclient, everything else is freed. The ckch_store is freed in case of error, but when the task is successful, the ptr is set to NULL to prevent the free once inserted in the tree.	2025-04-16 18:08:01 +02:00
William Lallemand	e778049ffc	MINOR: acme: register the task in the ckch_store This patch registers the task in the ckch_store so we don't run 2 tasks at the same time for a given certificate. Move the task creation under the lock and check if there was already a task under the lock.	2025-04-16 17:12:43 +02:00
William Lallemand	115653bfc8	BUG/MINOR: acme/cli: fix certificate name in error message The acme command had a new parameter so the certificate name is not correct anymore because args[1] is not the certificate value anymore.	2025-04-16 17:06:52 +02:00
William Lallemand	39088a7806	MINOR: acme: add a success message to the logs Add a success log when the certificate was updated. Ex: acme: foobar.pem: Successful update of the certificate.	2025-04-16 14:51:18 +02:00
William Lallemand	31a1d13802	MINOR: acme: emit logs instead of ha_notice Emit logs using the global logs when the ACME task failed or retries, instead of using ha_notice().	2025-04-16 14:39:39 +02:00
William Lallemand	f36f9ca21c	DOC: configuration: specify limitations of ACME for 3.2 Specify the version for which the limitation applies.	2025-04-16 14:30:45 +02:00
William Lallemand	608eb3d090	BUG/MINOR: acme: fix the exponential backoff of retries Exponential backoff values were multiplied by 3000 instead of 3 with a second to ms conversion. Leading to a 9000000ms value at the 2nd attempt. Fix the issue by setting the value in seconds and converting the value in tick_add(). No backport needed.	2025-04-16 14:20:00 +02:00
William Lallemand	7814a8b446	BUG/MINOR: acme: key not restored upon error in acme_res_certificate() V2 When receiving the final certificate, it needs to be loaded by ssl_sock_load_pem_into_ckch(). However this function will remove any existing private key in the struct ckch_store. In order to fix the issue, the ptr to the key is swapped with a NULL ptr, and restored once the new certificate is committed. However there is a discrepancy when ssl_sock_load_pem_into_ckch() fails and the pointer is lost. This patch fixes the issue by restoring the pointer in the error path. This must fix issue #2933.	2025-04-16 14:05:04 +02:00
William Lallemand	e21a165af6	Revert "BUG/MINOR: acme: key not restored upon error in acme_res_certificate()" This reverts commit 7a43094f8d8fe3c435ecc003f07453dd9de8134a. Part of another incomplete patch was accidentally squashed into the patch.	2025-04-16 14:03:08 +02:00
William Lallemand	bea6235629	MEDIUM: sink: add a new dpapi ring buffer Add a 1MB ring buffer called "dpapi" for communication with the dataplane API. It would first be used to transmit ACME information to the dataplane API but could be used for more.	2025-04-16 13:56:12 +02:00
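The arithmetic behind the backoff bug above is easy to reproduce. The sketch below is not the ACME code: the function names are hypothetical, and the 3-second initial delay is an assumption chosen because it matches the 9000000ms figure quoted for the 2nd attempt (3000 * 3000).

```c
#include <stdint.h>

/* Buggy variant: the delay is kept in milliseconds and the growth
 * factor was also converted, so each retry multiplies by 3000.
 */
static uint64_t backoff_buggy_ms(unsigned int attempt)
{
    uint64_t delay_ms = 3000; /* assumed initial delay: 3s in ms */

    while (attempt-- > 1)
        delay_ms *= 3000;
    return delay_ms;
}

/* Fixed variant: the delay grows by a factor of 3 in seconds and is
 * converted to milliseconds only once, when the timer is armed.
 */
static uint64_t backoff_fixed_ms(unsigned int attempt)
{
    uint64_t delay_s = 3; /* assumed initial delay in seconds */

    while (attempt-- > 1)
        delay_s *= 3;
    return delay_s * 1000;
}
```

Under these assumptions the second attempt waits 9000000ms (2.5 hours) with the buggy variant and 9000ms with the fixed one, which is the intended 3s, 9s, 27s progression.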
William Lallemand	f6fc914fb6	DOC: configuration: fix a typo in ACME documentation Fix "supposed" typo in ACME documentation.	2025-04-16 13:55:25 +02:00
Ilia Shipitsin	4dee087f19	CI: fedora rawhide: enable unit tests Run the new make unit-tests on the CI.	2025-04-15 16:53:54 +02:00
Christopher Faulet	d160046e2c	BUG/MEDIUM: http-ana: Report 502 from req analyzer only during rsp forwarding A server abort must be handled by the request analyzers only when the response forwarding was already started. Otherwise, it is the responsibility of the response analyzer to detect this event. L7-retries and conditions to decide to silently close a client connection are handled by this analyzer. Because a reused server connection closed too early could be detected at the wrong place, it was possible to get a 502/SH instead of a silent close, preventing the client from safely retrying its request. Thanks to this patch, we are able to silently close the client connection in this case and eventually to perform a L7 retry. This patch must be backported as far as 2.8.	2025-04-15 16:28:15 +02:00
Christopher Faulet	c672b2a297	BUG/MINOR: http-ana: Properly detect client abort when forwarding the response During the response payload forwarding, if the back SC is closed, we try to figure out if it is because of a client abort or a server abort. However, the condition was not accurate, especially when abortonclose option is set. Because of this issue, a server abort may be reported (SD-- in logs) instead of a client abort (CD-- in logs). The right way to detect a client abort when we try to forward the response is to test if the back SC was shut down (SC_FL_SHUT_DOWN flag set) AND aborted (SC_FL_ABRT_DONE flag set). When both of these flags are set, it means the back connection underwent the shutdown, which should be converted to a client abort at this stage. This patch should be backported as far as 2.8. It should fix last strange SD report in the issue #2749.	2025-04-15 16:28:15 +02:00
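The condition described in the commit above requires both flags at once, which is a common source of mistakes when written with a plain bitwise AND. A minimal sketch follows; the `EX_` prefixed constants use made-up bit values and only mirror the names SC_FL_SHUT_DOWN and SC_FL_ABRT_DONE, and `back_sc_indicates_client_abort()` is a hypothetical helper.

```c
#include <stdbool.h>

/* Illustrative bit values only; the real flags are defined by HAProxy. */
#define EX_SC_FL_ABRT_DONE 0x1u
#define EX_SC_FL_SHUT_DOWN 0x2u

/* Returns true only when the back SC was both shut down AND aborted.
 * Testing (flags & both) != 0 would wrongly accept a single flag.
 */
static bool back_sc_indicates_client_abort(unsigned int flags)
{
    const unsigned int both = EX_SC_FL_SHUT_DOWN | EX_SC_FL_ABRT_DONE;

    return (flags & both) == both;
}
```

With only one of the two flags set, the helper returns false and the event would keep being treated as a server-side condition.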
William Lallemand	c291a5c73c	BUILD: incompatible pointer type suspected with -DDEBUG_UNIT src/jws.c: In function '__jws_init': src/jws.c:594:38: error: passing argument 2 of 'hap_register_unittest' from incompatible pointer type [-Wincompatible-pointer-types] 594 \| hap_register_unittest("jwk", jwk_debug); \| ^~~~~~~~~ \| \| \| int ()(int, char ) In file included from include/haproxy/api.h:36, from include/import/ebtree.h:251, from include/import/ebmbtree.h:25, from include/haproxy/jwt-t.h:25, from src/jws.c:5: include/haproxy/init.h:37:52: note: expected 'int ()(void)' but argument is of type 'int ()(int, char )' 37 \| void hap_register_unittest(const char name, int (*fct)()); \| ~~~~~~^~~~~~ GCC 15 is warning because the function pointer does have its arguments in the register function. Should fix issue #2929.	2025-04-15 15:49:44 +02:00
William Lallemand	05ebb448b5	CLEANUP: acme: stored value is overwritten before it can be used >>> CID 1609049: Code maintainability issues (UNUSED_VALUE) >>> Assigning value "NULL" to "new_ckchs" here, but that stored value is overwritten before it can be used. 592 struct ckch_store old_ckchs, new_ckchs = NULL; Coverity reported an issue where a variable is initialized to NULL then directly overwritten with another value. This does no harm, but this patch removes the useless initialization. Must fix issue #2932.	2025-04-15 11:44:45 +02:00
William Lallemand	3866d3bd12	BUG/MINOR: acme: fix possible NULL deref Task was dereferenced when setting ctx but was only checked afterwards. This patch moves the setting of ctx after the check. Should fix issue #2931.	2025-04-15 11:41:58 +02:00
Willy Tarreau	3cbbf41cd8	MINOR: debug: detect call instructions and show the branch target in backtraces In backtraces, sometimes it's difficult to know what was called by a given point, because some functions can be fairly long making one doubt about the correct pointer of unresolved ones, others might just use a tail branch instead of a call + return, etc. On common architectures (x86 and aarch64), it's not difficult to detect and decode a relative call, so let's do it on both of these platforms and show the branch location after a '>'. Example: x86_64: call trace(19): \| 0x6bd644 <64 8b 38 e8 ac f7 ff ff]: debug_handler+0x84/0x95 > ha_thread_dump_one \| 0x7feb3e5383a0 <00 00 00 00 0f 1f 40 00]: libpthread:+0x123a0 \| 0x7feb3e53748b <c0 b8 03 00 00 00 0f 05]: libpthread:__close+0x3b/0x8b \| 0x7619e4 <44 89 ff e8 fc 97 d4 ff]: _fd_delete_orphan+0x1d4/0x1d6 > main-0x2130 \| 0x743862 <8b 7f 68 e8 8e e1 01 00]: sock_conn_ctrl_close+0x12/0x54 > fd_delete \| 0x5ac822 <c0 74 05 4c 89 e7 ff d0]: main+0xff512 \| 0x5bc85c <48 89 ef e8 04 fc fe ff]: main+0x10f54c > main+0xff150 \| 0x5be410 <4c 89 e7 e8 c0 e1 ff ff]: main+0x111100 > main+0x10f2c0 \| 0x6ae6a4 <28 00 00 00 00 ff 51 58]: cli_io_handler+0x31524 \| 0x6aeab4 <7c 24 08 e8 fc fa ff ff]: sc_destroy+0x14/0x2a4 > cli_io_handler+0x31430 \| 0x6c685d <48 89 ef e8 43 82 fe ff]: process_chk_conn+0x51d/0x1927 > sc_destroy aarch64: call trace(15): \| 0xaaaaad0c1540 <60 6a 60 b8 c3 fd ff 97]: debug_handler+0x9c/0xbc > ha_thread_dump_one \| 0xffffa8c177ac <c2 e0 3b d5 1f 20 03 d5]: linux-vdso:__kernel_rt_sigreturn \| 0xaaaaad0b0964 <c0 03 5f d6 d2 ff ff 97]: cli_io_handler+0x28e44 > sedesc_new \| 0xaaaaad0b22a4 <00 00 80 d2 94 f9 ff 97]: sc_new_from_strm+0x1c/0x54 > cli_io_handler+0x28dd0 \| 0xaaaaad0167e8 <21 00 80 52 a9 6e 02 94]: stream_new+0x258/0x67c > sc_new_from_strm \| 0xaaaaad0b21f8 <e1 03 13 aa e7 90 fd 97]: sc_new_from_endp+0x38/0xc8 > stream_new \| 0xaaaaacfda628 <21 18 40 f9 e7 5e 03 94]: main+0xcaca8 > 
sc_new_from_endp \| 0xaaaaacfdb95c <42 c0 00 d1 02 f3 ff 97]: main+0xcbfdc > main+0xc8be0 \| 0xaaaaacfdd3f0 <e0 03 13 aa f5 f7 ff 97]: h1_io_cb+0xd0/0xb90 > main+0xcba40	2025-04-14 20:06:48 +02:00
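The x86_64 decoding described in the commit above can be illustrated with a small sketch. It assumes only what the message states: a direct call is the opcode 0xe8 followed by a 4-byte displacement, and the trace prints the 8 bytes that precede the return address. The function name `decode_rel_call` is hypothetical and is not taken from the HAProxy sources; a stray 0xe8 belonging to another instruction could also produce a false positive, which this sketch does not try to detect.

```c
#include <stdint.h>

/* Hypothetical helper, not HAProxy's implementation.
 * <ret> is the return address shown at the start of a trace line and
 * <before> holds the 8 bytes preceding it (the bytes printed between
 * '<' and ']'). A "call rel32" is 5 bytes long, so when it ends exactly
 * at <ret> its opcode sits at before[3] and the little-endian signed
 * displacement at before[4..7]. The target is relative to <ret>.
 * Returns the branch target, or 0 if no direct relative call is found.
 */
uint64_t decode_rel_call(uint64_t ret, const uint8_t before[8])
{
	uint32_t raw;
	int32_t disp;

	if (before[3] != 0xe8)
		return 0;

	raw = (uint32_t)before[4] |
	      (uint32_t)before[5] << 8 |
	      (uint32_t)before[6] << 16 |
	      (uint32_t)before[7] << 24;

	/* two's complement reinterpretation, as done by mainstream compilers */
	disp = (int32_t)raw;

	return ret + (uint64_t)(int64_t)disp;
}
```

Applied to the trace line `0x5bc85c <48 89 ef e8 04 fc fe ff]: main+0x10f54c > main+0xff150`, the displacement is 0xfffefc04 (that is, -0x103fc), and 0x10f54c - 0x103fc = 0xff150, which matches the target printed after the '>'.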
Willy Tarreau	9740f15274	MINOR: debug: in call traces, dump the 8 bytes before the return address, not after In call traces, we're interested in seeing the code that was executed, not the code that was not yet. The return address is where the CPU will return to, so we want to see the bytes that precede this location. In the example below on x86 we can clearly see a number of direct "call" instructions (0xe8 + 4 bytes). There are also indirect calls (0xffd0) that cannot be exploited but it gives insights about where the code branched, which will not always be the function above it if that one used tail branching for example. Here's an example dump output: call ------------, v 0x6bd634 <64 8b 38 e8 ac f7 ff ff]: debug_handler+0x84/0x95 0x7fa4ea2593a0 <00 00 00 00 0f 1f 40 00]: libpthread:+0x123a0 0x752132 <00 00 00 00 00 90 41 55]: htx_remove_blk+0x2/0x354 0x5b1a2c <4c 89 ef e8 04 07 1a 00]: main+0x10471c 0x5b5f05 <48 89 df e8 8b b8 ff ff]: main+0x108bf5 0x60b6f4 <89 ee 4c 89 e7 41 ff d0]: tcpcheck_eval_send+0x3b4/0x14b2 0x610ded <00 00 00 e8 53 a5 ff ff]: tcpcheck_main+0x7dd/0xd36 0x6c5ab4 <48 89 df e8 5c ab f4 ff]: wake_srv_chk+0xc4/0x3d7 0x6c5ddc <48 89 f7 e8 14 fc ff ff]: srv_chk_io_cb+0xc/0x13	2025-04-14 19:28:22 +02:00
Willy Tarreau	003f5168e4	MINOR: tools: let dump_addr_and_bytes() support dumping before the offset For code dumps, dumping from the return address is pointless, what is interesting is to dump before the return address to read the machine code that was executed before branching. Let's just make the function support negative sizes to indicate that we're dumping this number of bytes to the address instead of this number from the address. In this case, in order to distinguish them, we're using a '<' instead of '[' to start the series of bytes, indicating where the bytes expand and where they stop. For example we can now see this: 0x6bd634 <64 8b 38 e8 ac f7 ff ff]: debug_handler+0x84/0x95 0x7fa4ea2593a0 <00 00 00 00 0f 1f 40 00]: libpthread:+0x123a0 0x752132 <00 00 00 00 00 90 41 55]: htx_remove_blk+0x2/0x354 0x5b1a2c <4c 89 ef e8 04 07 1a 00]: main+0x10471c 0x5b5f05 <48 89 df e8 8b b8 ff ff]: main+0x108bf5 0x60b6f4 <89 ee 4c 89 e7 41 ff d0]: tcpcheck_eval_send+0x3b4/0x14b2 0x610ded <00 00 00 e8 53 a5 ff ff]: tcpcheck_main+0x7dd/0xd36 0x6c5ab4 <48 89 df e8 5c ab f4 ff]: wake_srv_chk+0xc4/0x3d7 0x6c5ddc <48 89 f7 e8 14 fc ff ff]: srv_chk_io_cb+0xc/0x13	2025-04-14 19:25:27 +02:00
Willy Tarreau	b708345c17	DEBUG: counters: add the ability to enable/disable updating the COUNT_IF counters These counters can have a noticeable cost on large machines, though not dramatic. There's no single good choice to keep them enabled or disabled. This commit adds multiple choices: - DEBUG_COUNTERS set to 2 will automatically enable them by default, while 1 will disable them by default - the global "debug.counters on/off" will allow to change the setting at boot, regardless of DEBUG_COUNTERS as long as it was at least 1. - the CLI "debug counters on/off" will also allow to change the value at run time, allowing to observe a phenomenon while it's happening, or to disable counters if it's suspected that their cost is too high Finally, the "debug counters" command will append "(stopped)" at the end of the CNT lines when these counters are stopped. Note that the whole mechanism would easily support being extended to all counter types by specifying the types to apply to, but it doesn't seem useful at all and would require the user to also type "cnt" on debug lines. This may easily be changed in the future if it's found relevant.	2025-04-14 19:02:13 +02:00
Willy Tarreau	a142adaba0	DEBUG: counters: make COUNT_IF() only appear at DEBUG_COUNTERS>=1 COUNT_IF() is convenient but can be heavy since some of them were found to trigger often (roughly 1 counter per request on avg). This might even have an impact on large setups due to the cost of a shared cache line bouncing between multiple cores. For now there's no way to disable it, so let's only enable it when DEBUG_COUNTERS is 1 or above. A future change will make it configurable.	2025-04-14 19:02:13 +02:00
Willy Tarreau	61d633a3ac	DEBUG: rename DEBUG_GLITCHES to DEBUG_COUNTERS and enable it by default Till now the per-line glitches counters were only enabled with the confusingly named DEBUG_GLITCHES (which would not turn glitches off when disabled). Let's instead change it to DEBUG_COUNTERS and make sure it's enabled by default (though it can still be disabled with -DDEBUG_GLITCHES=0 just like for DEBUG_STRICT). It will later be expanded to cover more counters.	2025-04-14 19:02:13 +02:00
Willy Tarreau	a8148c313a	DEBUG: init: report invalid characters in debug description strings It's easy to leave some trailing \n or even other characters that can mangle the debug output. Let's verify at boot time that the debug sections are clean by checking for chars 0x20 to 0x7e inclusive. This is very simple to do and it managed to find another one in a multi-line message: [WARNING] (23696) : Invalid character 0x0a at position 96 in description string at src/cli.c:2516 _send_status() This way new offending code will be spotted before being committed.	2025-04-14 19:02:13 +02:00
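The boot-time check described in the commit above amounts to scanning each description string for bytes outside the 0x20 to 0x7e range. A minimal sketch follows, assuming a plain NUL-terminated string; the helper name `first_invalid_char` is hypothetical and the actual HAProxy code may be organized differently.

```c
#include <stddef.h>

/* Hypothetical helper, not HAProxy's implementation.
 * Returns the position of the first character of <s> that is outside
 * the printable ASCII range 0x20..0x7e inclusive, or -1 if the whole
 * string is clean.
 */
long first_invalid_char(const char *s)
{
	size_t i;

	for (i = 0; s[i] != '\0'; i++) {
		unsigned char c = (unsigned char)s[i];

		if (c < 0x20 || c > 0x7e)
			return (long)i;
	}
	return -1;
}
```

A caller could report the returned position together with the offending byte, in the spirit of the warning quoted in the commit message.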
Willy Tarreau	9efc60c887	DOC: config: add the missing "force-cfg-parser-pause" to the global kw index It was documented but missing from the index, let's add it. This can be backported to 3.1.	2025-04-14 19:02:13 +02:00
Willy Tarreau	640a699804	DOC: config: add the missing "profiling.memory" to the global kw index It was in the description but not in the index. This can be backported to all versions where it applies.	2025-04-14 19:02:13 +02:00
Willy Tarreau	23705564ae	BUG/MINOR: debug: remove the trailing \n from BUG_ON() statements These ones were added by mistake during the change of the cfgparse mechanism in 3.1, but they're corrupting the output of "debug counters" by leaving stray ']' on their own lines. We could possibly check them all once at boot but it doesn't seem worth it. This should be backported to 3.1.	2025-04-14 19:02:13 +02:00
William Lallemand	f9390a689f	DOC: acme: explain how to configure and run ACME Add configuration about the acme section in the configuration manual, as well as the acme command in the management guide.	2025-04-14 16:14:57 +02:00
William Lallemand	7119b5149d	MINOR: acme: default to 2048bits for RSA Change the default RSA value to 2048 bits.	2025-04-14 16:14:57 +02:00
Valentine Krasnobaeva	08efe8cd24	BUG/MINOR: thread: protect thread_cpus_enabled_at_boot with USE_THREAD The following error is triggered at linker invocation, when we try to compile with USE_THREAD=0 and -O0. make -j 8 TARGET=linux-glibc USE_LUA=1 USE_PCRE2=1 USE_LINUX_CAP=1 \ USE_MEMORY_PROFILING=1 OPT_CFLAGS=-O0 USE_THREAD=0 /usr/bin/ld: src/thread.o: warning: relocation against `thread_cpus_enabled_at_boot' in read-only section `.text' /usr/bin/ld: src/thread.o: in function `thread_detect_count': /home/vk/projects/haproxy/src/thread.c:1619: undefined reference to `thread_cpus_enabled_at_boot' /usr/bin/ld: /home/vk/projects/haproxy/src/thread.c:1619: undefined reference to `thread_cpus_enabled_at_boot' /usr/bin/ld: /home/vk/projects/haproxy/src/thread.c:1620: undefined reference to `thread_cpus_enabled_at_boot' /usr/bin/ld: warning: creating DT_TEXTREL in a PIE collect2: error: ld returned 1 exit status make: *** [Makefile:1044: haproxy] Error 1 thread_cpus_enabled_at_boot is only available when we compile with USE_THREAD=1, which is the default for most targets now. In some cases, we need to recompile in mono-thread mode, thus thread_cpus_enabled_at_boot should be protected with USE_THREAD in thread_detect_count(). thread_detect_count() is always called during the process initialization, regardless of multi-thread support. It sets some defaults in global.nbthread and global.nbtgroups. This patch is related to GitHub issue #2916. No need to be backported as it was added in 3.2-dev9 version.	2025-04-14 16:03:21 +02:00
William Lallemand	7a43094f8d	BUG/MINOR: acme: key not restored upon error in acme_res_certificate() When receiving the final certificate, it needs to be loaded by ssl_sock_load_pem_into_ckch(). However this function will remove any existing private key in the struct ckch_store. In order to fix the issue, the ptr to the key is swapped with a NULL ptr, and restored once the new certificate is committed. However there is a discrepancy when ssl_sock_load_pem_into_ckch() fails: the pointer is lost. This patch fixes the issue by restoring the pointer in the error path. This must fix issue #2933.	2025-04-14 10:55:44 +02:00
Willy Tarreau	4a44d592ae	BUG/MINOR: cpu-topo: check the correct variable for NULL after malloc() We were testing ha_cpu_topo instead of ha_cpu_clusters after an allocation, making the check ineffective. No backport is needed.	2025-04-12 18:23:29 +02:00
William Lallemand	39c05cedff	BUILD: acme: enable the ACME feature when JWS is present The ACME feature depends on the JWS, which currently does not work with every SSL libraries. This patch only enables ACME when JWS is enabled.	2025-04-12 01:39:03 +02:00
William Lallemand	a96cbe32b6	MINOR: acme: schedule retries with a timer Schedule the retries with a 3s exponential timer. This is a temporary measure as the client should follow the Retry-After field for rate-limiting for every request (https://datatracker.ietf.org/doc/html/rfc8555#section-6.6)	2025-04-12 01:39:03 +02:00
William Lallemand	768458a79e	MEDIUM: acme: replace the previous ckch instance with new ones This is the last step needed to have a usable ACME certificate in haproxy. It looks for the previous certificate, locks the "BIG CERTIFICATE LOCK", copies every instance, deploys the new ones and removes the previous one. This is done in one step in a function which does not yield, so it could be problematic if you have thousands of instances to handle. It still lacks the rate limit which is mandatory to be used in production, and more cleanup and deinit.	2025-04-12 01:39:03 +02:00
William Lallemand	9505b5bdf0	MINOR: acme: copy the original ckch_store Copy the original ckch_store instead of creating a new one. This allows to inherit the ckch_conf from the previous structure when doing a ckchs_dup(). The ckch_conf contains the SAN for ACME. Free the previous PKEY since a new one is generated.	2025-04-12 01:39:03 +02:00
William Lallemand	5b85b81d84	MINOR: ssl/ckch: handle ckch_conf in ckchs_dup() and ckch_conf_clean() Handle new members of the ckch_conf in ckchs_dup() and ckch_conf_clean(). This could be automated at some point since we have the description of all types in ckch_conf_kws.	2025-04-12 01:39:03 +02:00
William Lallemand	73ab78e917	BUG/MINOR: acme: ckch_conf_acme_init() when no filename Does not try to strdup the configuration filename if there is none. No backport needed.	2025-04-12 01:39:03 +02:00
William Lallemand	5500bda9eb	MINOR: acme: implement retrieval of the certificate Once the Order status is "valid", the certificate URL is accessible. This patch implements the retrieval of the certificate, which is stored in ctx->store.	2025-04-12 01:39:03 +02:00
William Lallemand	27fff179fe	MINOR: acme: verify the order status once finalized This implements a call to the order status to check if the certificate is ready.	2025-04-12 01:39:03 +02:00
William Lallemand	680222b382	MINOR: acme: finalize by sending the CSR This patch does the finalize step of the ACME task. This encodes the CSR into base64 format and send it to the finalize URL. https://www.rfc-editor.org/rfc/rfc8555#section-7.4	2025-04-12 01:29:27 +02:00
William Lallemand	de5dc31a0d	MINOR: acme: generate the CSR in a X509_REQ Generate the X509_REQ using the generated private key and the SAN from the configuration. This is only done once before the task is started. It could probably be done at the beginning of the task with the private key generation once we have a scheduler instead of a CLI command.	2025-04-12 01:29:27 +02:00
William Lallemand	00ba62df15	MINOR: acme: implement a check on the challenge status This patch implements a check on the challenge URL: once haproxy has asked for the challenge to be verified, it must check the status of the challenge resolution and whether any error occurred.	2025-04-12 01:29:27 +02:00
William Lallemand	711a13a4b4	MINOR: acme: send the request for challenge ready This patch sends the "{}" message to specify that a challenge is ready. It iterates on every challenge URL in the authorization list from the acme_ctx. This allows the ACME server to proceed to the challenge validation. https://www.rfc-editor.org/rfc/rfc8555#section-7.5.1	2025-04-12 01:29:27 +02:00
William Lallemand	ae0bc88f91	MINOR: acme: get the challenges object from the Auth URL This patch implements the retrieval of the challenges objects on the authorizations URLs. The challenges object contains a token and a challenge URL that needs to be called once the challenge is set up. Each authorization URL contains multiple challenge objects, usually one per challenge type (HTTP-01, DNS-01, ALPN-01...). We only need to keep the one that is relevant to our configuration.	2025-04-12 01:29:27 +02:00
William Lallemand	7231bf5726	MINOR: acme: allow empty payload in acme_jws_payload() Some ACME requests are required to have a JWS with an empty payload, let's be more flexible and allow this function to have an empty buffer.	2025-04-12 01:29:27 +02:00
William Lallemand	4842c5ea8c	MINOR: acme: newOrder request retrieve authorizations URLs This patch implements the newOrder action in the ACME task. In order to ask for a new certificate, a list of SANs is sent as a JWS payload, and the ACME server replies with a list of Authorization URLs. One Authorization is created per SAN on an Order. The authorization URLs are stored in a linked list of 'struct acme_auth' in acme_ctx, so we can get the challenge URLs from them later. The location header is also stored as it is the URL of the order object. https://datatracker.ietf.org/doc/html/rfc8555#section-7.4	2025-04-12 01:29:27 +02:00
William Lallemand	04d393f661	MINOR: acme: generate new account The new account action in the ACME task uses the same function as the chkaccount, but onlyReturnExisting is not sent in this case!	2025-04-12 01:29:27 +02:00
William Lallemand	7f9bf4d5f7	MINOR: acme: check if the account exist This patch implements the retrieval of the KID (account identifier) using the pkey. A request is sent to the newAccount URL using the onlyReturnExisting option, which allows to get the kid of an existing account. acme_jws_payload() implements a way to generate a JWS payload using the nonce, pkey and provided URI.	2025-04-12 01:29:27 +02:00
William Lallemand	0aa6dedf72	MINOR: acme: handle the nonce ACME requests are supposed to be sent with a Nonce, the first Nonce should be retrieved using the newNonce URI provided by the directory. This nonce is stored and must be replaced by the new one received in each response.	2025-04-12 01:29:27 +02:00
William Lallemand	471290458e	MINOR: acme: get the ACME directory The first request of the ACME protocol is getting the list of URLs for the next steps. This patch implements the first request and the parsing of the response. The response is a JSON object so mjson is used to parse it.	2025-04-12 01:29:27 +02:00
William Lallemand	4780a1f223	MINOR: acme: the acme section is experimental Allow the usage of the acme section only when expose-experimental-directives is set.	2025-04-12 01:29:27 +02:00
William Lallemand	b8209cf697	MINOR: acme/cli: add the 'acme renew' command The "acme renew" command launches the ACME task for a given certificate. The CLI parser generates a new private key using the parameters from the acme section.	2025-04-12 01:29:27 +02:00
William Lallemand	bf6a39c4d1	MINOR: acme: add private key configuration This commit allows to configure the generated private keys, you can configure the keytype (RSA/ECDSA), the number of bits or the curves. Example: acme LE uri https://acme-staging-v02.api.letsencrypt.org/directory account account.key contact foobar@example.com challenge HTTP-01 keytype ECDSA curves P-384	2025-04-12 01:29:27 +02:00
William Lallemand	2e8c350b95	MINOR: acme: add configuration for the crt-store Add new acme keywords for the ckch_conf parsing, which will be used on a crt-store, a crt line in a frontend, or even a crt-list. The cfg_postparser_acme() is called in order to check if a section referenced elsewhere really exists in the config file.	2025-04-12 01:29:27 +02:00
William Lallemand	077e2ce84c	MINOR: acme: add the acme section in the configuration parser Add a configuration parser for the new acme section, the section is configured this way: acme letsencrypt uri https://acme-staging-v02.api.letsencrypt.org/directory account account.key contact foobar@example.com challenge HTTP-01 When unspecified, the challenge defaults to HTTP-01, and the account key to "<section_name>.account.key". Section are stored in a linked list containing acme_cfg structures, the configuration parsing is mostly resolved in the postsection parser cfg_postsection_acme() which is called after the parsing of an acme section.	2025-04-12 01:29:27 +02:00
William Lallemand	20718f40b6	MEDIUM: ssl/ckch: add filename and linenum argument to crt-store parsing Add filename and linenum arguments to the crt-store / ckch_conf parsing. This allows the parsing functions to use them when emitting errors.	2025-04-12 01:29:27 +02:00
Willy Tarreau	00c967fac4	MINOR: master/cli: support bidirectional communications with workers Some rare commands in the worker require to keep their input open and terminate when it's closed ("show events -w", "wait"). Others maintain a per-session context ("set anon on"). But in its default operation mode, the master CLI passes commands one at a time to the worker, and closes the CLI's input channel so that the command can immediately close upon response. This effectively prevents these two specific cases from being used. Here the approach that we take is to introduce a bidirectional mode to connect to the worker, where everything sent to the master is immediately forwarded to the worker (including the raw command), allowing to queue multiple commands at once in the same session, and to continue to watch the input to detect when the client closes. It must be a client's choice however, since doing so means that the client cannot batch many commands at once to the master process, but must wait for these commands to complete before sending new ones. For this reason we use the prefix "@@<pid>" for this. It works exactly like "@" except that it maintains the channel open during the whole execution. Similarly to "@<pid>" with no command, "@@<pid>" will simply open an interactive CLI session to the worker, that will be ended by "quit" or by closing the connection. This can be convenient for the user, and possibly for clients willing to dedicate a connection to the worker.	2025-04-11 16:09:17 +02:00
Willy Tarreau	b6a8abcd0b	DOC: management: add a paragraph about the limitations of the '@' prefix The '@' prefix permits to execute a single command at once in a worker. It is very handy but comes with some limitations affecting rare commands, which is better to be documented (one command per session, input closed) since it can seldom have user-visible effects.	2025-04-11 16:09:17 +02:00
Willy Tarreau	e8267d1ce2	DOC: management: slightly clarify the prefix role of the '@' command While the examples were clear, the text did not fully imply what was reflected there. Better have the text explicitly mention that the '@' command may be used as a prefix or wrapper in front of a command as well as a standalone command.	2025-04-11 16:09:17 +02:00
Ilya Shipitsin	eed4116c07	CI: enable weekly QuicTLS build QuicTLS started its own fork that does not depend on OpenSSL, so let's add it to the weekly builds. ML: https://www.mail-archive.com/haproxy@formilux.org/msg45574.html GH: https://github.com/quictls/quictls/issues/244	2025-04-11 16:01:45 +02:00
Willy Tarreau	a6982a898e	[RELEASE] Released version 3.2-dev10 Released version 3.2-dev10 with the following main changes : - REORG: ssl: move curves2nid and nid2nist to ssl_utils - BUG/MEDIUM: stream: Fix a possible freeze during a forced shut on a stream - MEDIUM: stream: Save SC and channel flags earlier in process_steam() - BUG/MINOR: peers: fix expire learned from a peer not converted from ms to ticks - BUG/MEDIUM: peers: prevent learning expiration too far in futur from unsync node - CI: spell check: allow manual trigger - CI: codespell: add "pres" to spellcheck whitelist - CLEANUP: assorted typo fixes in the code, commits and doc - CLEANUP: atomics: remove support for gcc < 4.7 - CLEANUP: atomics: also replace __sync_synchronize() with __atomic_thread_fence() - TESTS: Fix build for filltab25.c - MEDIUM: ssl: replace "crt" lines by "ssl-f-use" lines - DOC: configuration: replace "crt" by "ssl-f-use" in listeners - MINOR: backend: mark srv as nonnull in alloc_dst_address() - BUG/MINOR: server: ensure check-reuse-pool is copied from default-server - MINOR: server: activate automatically check reuse for rhttp@ protocol - MINOR: check/backend: support conn reuse with SNI - MINOR: check: implement check-pool-conn-name srv keyword - MINOR: task: add thread safe notification_new and notification_wake variants - BUG/MINOR: hlua_fcn: fix potential UAF with Queue:pop_wait() - MINOR: hlua_fcn: register queue class using hlua_register_metatable() - MINOR: hlua: add core.wait() - MINOR: hlua: core.wait() takes optional delay paramater - MINOR: hlua: split hlua_applet_tcp_recv_yield() in two functions - MINOR: hlua: add AppletTCP:try_receive() - MINOR: hlua_fcn: add Queue:alarm() - MEDIUM: task: make notification_* API thread safe by default - CLEANUP: log: adjust _lf_cbor_encode_byte() comment - MEDIUM: ssl/crt-list: warn on negative wildcard filters - MEDIUM: ssl/crt-list: warn on negative filters only - BUILD: atomics: fix build issue on non-x86/non-arm systems - 
BUG/MINOR: log: fix CBOR encoding with LOG_VARTEXT_START() + lf_encode_chunk() - BUG/MEDIUM: sample: fix risk of overflow when replacing multiple regex back-refs - DOC: configuration: rework the crt-list section - MINOR: ring: support arbitrary delimiters through ring_dispatch_messages() - MINOR: ring/cli: support delimiting events with a trailing \0 on "show events" - DEV: h2: fix h2-tracer.lua nil value index - BUG/MINOR: backend: do not use the source port when hashing clientip - BUG/MINOR: hlua: fix invalid errmsg use in hlua_init() - MINOR: proxy: add setup_new_proxy() function - MINOR: checks: mark CHECKS-FE dummy frontend as internal - MINOR: flt_spoe: mark spoe agent frontend as internal - MEDIUM: tree-wide: avoid manually initializing proxies - MINOR: proxy: add deinit_proxy() helper func - MINOR: checks: deinit checks_fe upon deinit - MINOR: flt_spoe: deinit spoe agent proxy upon agent release	2025-04-11 10:04:00 +02:00
Aurelien DARRAGON	f3b231714f	MINOR: flt_spoe: deinit spoe agent proxy upon agent release Even though spoe agent proxy is statically allocated, it uses the proxy API and is initialized like a regular proxy, thus specific cleanup is required upon release. This is not tagged as a bug because as of now this would only cause some minor memory leak upon deinit. We check the presence of proxy->id to know if it was initialized since we cannot rely on a pointer for that.	2025-04-10 22:10:31 +02:00
Aurelien DARRAGON	8a944d0e46	MINOR: checks: deinit checks_fe upon deinit This is just to make valgrind and friends happy, leverage deinit_proxy() for checks_fe proxy upon deinit to ensure proper cleanup. We check the presence of proxy->id to know if it was initialized because we cannot rely on a pointer for that.	2025-04-10 22:10:31 +02:00
Aurelien DARRAGON	fbfeb591f7	MINOR: proxy: add deinit_proxy() helper func Same as free_proxy(), but does not free the base proxy pointer (ie: the proxy itself may not be allocated) Goal is to be able to cleanup statically allocated dummy proxies.	2025-04-10 22:10:31 +02:00
Aurelien DARRAGON	4194f756de	MEDIUM: tree-wide: avoid manually initializing proxies In this patch we try to use the proxy API init functions as much as possible to avoid code redundancy and prevent proxy initialization errors. As such, we prefer using alloc_new_proxy() and setup_new_proxy() instead of manually allocating the proxy pointer and performing the base init ourselves.	2025-04-10 22:10:31 +02:00
Aurelien DARRAGON	60f45564a1	MINOR: flt_spoe: mark spoe agent frontend as internal spoe agent frontend is used by the agent internally, but it is not meant to be directly exposed like user-facing proxies defined in the config. As such, better mark it as internal using PR_CAP_INT capability to prevent any mis-use.	2025-04-10 22:10:31 +02:00
Aurelien DARRAGON	5087048b6d	MINOR: checks: mark CHECKS-FE dummy frontend as internal CHECKS-FE frontend is a dummy frontend used to create checks sessions as such, it is internal and should not be exposed to the user. Better mark it as internal using PR_CAP_INT capability to prevent proxy API from ever exposing it.	2025-04-10 22:10:31 +02:00
Aurelien DARRAGON	e1cec655ee	MINOR: proxy: add setup_new_proxy() function Split alloc_new_proxy() in two functions: the preparing part is now handled by setup_new_proxy() which can be called individually, while alloc_new_proxy() takes care of allocating a new proxy struct and then calling setup_new_proxy() with the freshly allocated proxy.	2025-04-10 22:10:31 +02:00
Aurelien DARRAGON	ea3c96369f	BUG/MINOR: hlua: fix invalid errmsg use in hlua_init() errmsg is used with memprintf and friends, thus it must be NULL initialized before being passed to memprintf, else invalid read will occur. However in hlua_init() the errmsg value isn't initialized, let's fix that This is really minor because it would only cause issue on error paths, yet it may be backported to all stable versions, just in case.	2025-04-10 22:10:26 +02:00
Willy Tarreau	7b6df86a83	BUG/MINOR: backend: do not use the source port when hashing clientip The server's "usesrc" keyword supports among other options "client" and "clientip". The former means we bind to the client's IP and port to connect to the server, while the latter means we bind to its IP only. It's done in two steps, first alloc_bind_address() retrieves the IP address and port, and second, tcp_connect_server() decides to either bind to the IP only or IP+port. The problem comes with idle connection pools, which hash all the parameters: the hash is calculated before (and ideally without) calling tcp_connect_server(), and it considers the whole struct sockaddr_storage for the hash, except that both client and clientip entirely fill it with the client's address. This means that both client and clientip make use of the source port in the hash calculation, making idle connections almost not reusable when using "usesrc clientip" while they should for clients coming from the same source. A work-around is to force the source port to zero using "tcp-request session set-src-port int(0)" but it's ugly. Let's fix this by properly zeroing the port for AF_INET/AF_INET6 addresses. This can be backported to 2.4. Thanks to Sebastien Gross for providing a reproducer for this problem.	2025-04-09 11:05:22 +02:00
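The effect of zeroing the port before hashing can be shown with a self-contained sketch. Everything here is illustrative: `clear_src_port`, `hash_addr` and `make_v4` are hypothetical names, and the FNV-1a hash merely stands in for whatever hash the idle connection pool really uses. Only the idea of clearing the port for AF_INET/AF_INET6 before hashing the whole `struct sockaddr_storage` comes from the commit message.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: drop the source port so that only the IP
 * address takes part in the hash (the "clientip" case).
 */
void clear_src_port(struct sockaddr_storage *ss)
{
	if (ss->ss_family == AF_INET)
		((struct sockaddr_in *)ss)->sin_port = 0;
	else if (ss->ss_family == AF_INET6)
		((struct sockaddr_in6 *)ss)->sin6_port = 0;
}

/* Stand-in hash (FNV-1a) over the whole structure, mimicking the fact
 * that every byte of the sockaddr_storage contributes to the result.
 */
uint64_t hash_addr(const struct sockaddr_storage *ss)
{
	const unsigned char *p = (const unsigned char *)ss;
	uint64_t h = 0xcbf29ce484222325ULL;
	size_t i;

	for (i = 0; i < sizeof(*ss); i++) {
		h ^= p[i];
		h *= 0x100000001b3ULL;
	}
	return h;
}

/* Builds a zero-padded IPv4 address from host-order values */
struct sockaddr_storage make_v4(uint32_t ip, uint16_t port)
{
	struct sockaddr_storage ss;
	struct sockaddr_in *in = (struct sockaddr_in *)&ss;

	memset(&ss, 0, sizeof(ss));
	in->sin_family = AF_INET;
	in->sin_addr.s_addr = htonl(ip);
	in->sin_port = htons(port);
	return ss;
}
```

Two connections from the same client IP but different source ports hash differently until `clear_src_port()` is applied to both, after which they hash identically and can share a pool entry.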
Aurelien DARRAGON	afd5f5d671	DEV: h2: fix h2-tracer.lua nil value index Nick Ramirez reported the following error while testing the h2-tracer.lua script: Lua filter 'h2-tracer' : [state-id 0] runtime error: /etc/haproxy/h2-tracer.lua:227: attempt to index a nil value (field '?') from /etc/haproxy/h2-tracer.lua:227: in function line 109. It is caused by h2ff indexing with an out of bound value. Indeed, h2ff is indexed with the frame type, which can potentially be > 9 (not common nor observed during Willy's tests), while h2ff only defines indexes from 0 to 9. The fix was provided by Willy, it consists in skipping h2ff indexing if frame type is > 9. It was confirmed that doing so fixes the error.	2025-04-08 17:44:41 +02:00
Willy Tarreau	f4634e5a38	MINOR: ring/cli: support delimiting events with a trailing \0 on "show events" At the moment it is not supported to produce multi-line events on the "show events" output, simply because the LF character is used as the default end-of-event mark. However it could be convenient to produce well-formatted multi-line events, e.g. in JSON or other formats. UNIX utilities have already faced similar needs in the past and added "-print0" to "find" and "-0" to "xargs" to mention that the delimiter is the NUL character. This makes perfect sense since it's never present in contents, so let's do exactly the same here. Thus from now on, "show events <ring> -0" will delimit messages using a \0 instead of a \n, permitting a better and safer encapsulation.	2025-04-08 14:36:35 +02:00
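On the consumer side, NUL-delimited output can be split without caring about embedded line feeds. The sketch below assumes the client has read raw bytes from a `show events <ring> -0` session into a buffer; the function name `for_each_event0` and its callback signature are hypothetical and not part of any HAProxy API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical client-side helper.
 * Walks buf[0..len) and invokes <cb> (if not NULL) once per complete
 * event, an event being a run of bytes terminated by '\0'. Bytes after
 * the last '\0' form an incomplete event and are left for the caller
 * to keep until more data arrives. Returns the number of complete
 * events found.
 */
size_t for_each_event0(const char *buf, size_t len,
                       void (*cb)(const char *ev, size_t evlen, void *arg),
                       void *arg)
{
	size_t count = 0;
	size_t start = 0;

	while (start < len) {
		const char *end = memchr(buf + start, '\0', len - start);

		if (!end)
			break; /* incomplete trailing event */

		if (cb)
			cb(buf + start, (size_t)(end - (buf + start)), arg);
		count++;
		start = (size_t)(end - buf) + 1;
	}
	return count;
}
```

Because the delimiter is searched with `memchr()` rather than a string function, a multi-line JSON event is delivered to the callback as a single unit.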
Willy Tarreau	0be6d73e88	MINOR: ring: support arbitrary delimiters through ring_dispatch_messages() In order to support delimiting output events with other characters than just the LF, let's pass the delimiter through the API. The default remains the LF, used by applet_append_line(), and ignored by the log forwarder.	2025-04-08 14:36:35 +02:00
William Lallemand	038a372684	DOC: configuration: rework the crt-list section The crt-list section was unclear, this patch reworks it, giving more details on the matching algorithms and how the things are loaded.	2025-04-08 14:29:10 +02:00
Willy Tarreau	3e3b9eebf8	BUG/MEDIUM: sample: fix risk of overflow when replacing multiple regex back-refs Aleandro Prudenzano of Doyensec and Edoardo Geraci of Codean Labs reported a bug in sample_conv_regsub(), which can cause replacements of multiple back-references to overflow the temporary trash buffer. The problem happens when doing "regsub(match,replacement,g)": we're replacing every occurrence of "match" with "replacement" in the input sample, which requires a length check. For this, a max is applied, so that a replacement may not use more than the remaining length in the buffer. However, the length check is made on the replaced pattern and not on the temporary buffer used to carry the new string. This results in the remaining size to be usable for each input match, which can go beyond the temporary buffer size if more than one occurrence has to be replaced with something that's larger than the remaining room. The fix proposed by Aleandro and Edoardo is the correct one (check on "trash" not "output"), and is the one implemented in this patch. While it is very unlikely that a config will replace multiple short patterns each with a larger one in a request, this possibility cannot be entirely ruled out (e.g. mask a known, short IP address using "XXX.XXX.XXX.XXX"). However when this happens, the replacement pattern will be static, and not be user-controlled, which is why this patch is marked as medium. The bug was introduced in 2.2 with commit 07e1e3c93e ("MINOR: sample: regsub now supports backreferences"), so it must be backported to all versions. Special thanks go to Aleandro and Edoardo for reporting this bug with a simple reproducer and a fix.	2025-04-07 15:57:28 +02:00
Aurelien DARRAGON	9e8444b730	BUG/MINOR: log: fix CBOR encoding with LOG_VARTEXT_START() + lf_encode_chunk() There have been some reports that using %HV logformat alias with CBOR encoder would produce invalid CBOR payload according to some CBOR implementations such as "cbor.me". Indeed, with the below log-format: log-format "%{+cbor}o %(protocol)HV" And the resulting CBOR payload: BF6870726F746F636F6C7F48485454502F312E31FFFF cbor.me would complain with: "bytes/text mismatch (ASCII-8BIT != UTF-8) in streaming string") error message. It is due to the version string being first announced as text, while CBOR encoder actually encodes it as byte string later when lf_encode_chunk() is used. In fact it affects all patterns combining LOG_VARTEXT_START() with lf_encode_chunk() which means %HM, %HU, %HQ, %HPO and %HP are also affected. To fix the issue, in _lf_encode_bytes() (which is lf_encode_chunk() helper), we now check if we are inside a VARTEXT (we can tell it if ctx->in_text is true), in which case we consider that we already announced the current data as regular text so we keep the same type to encode the bytes from the chunk to prevent inconsistencies. It should be backported in 3.0	2025-04-07 12:27:14 +02:00
Willy Tarreau	f01ff2478f	BUILD: atomics: fix build issue on non-x86/non-arm systems Commit f435a2e518 ("CLEANUP: atomics: also replace __sync_synchronize() with __atomic_thread_fence()") replaced the builtins used for barriers, but the different API required an argument while the macros didn't specify any, resulting in double parenthesis that were causing obscure build errors such as "called object type 'void' is not a function or function pointer". Let's just specify the args for the macro. No backport is needed.	2025-04-07 09:38:22 +02:00
William Lallemand	ab4cd49c04	MEDIUM: ssl/crt-list: warn on negative filters only negative SNI filters on crt-list lines only have a meaning when they match a positive wildcard filter. This patch adds a warning which is emitted when trying to use negative filters without any wildcard on the same line. This was discovered in ticket #2900.	2025-04-04 18:18:44 +02:00
William Lallemand	a9ae6b516d	MEDIUM: ssl/crt-list: warn on negative wildcard filters negative wildcard filters were always a noop, and are not useful for anything unless you want to use !* alone to remove every name from a certificate. This is confusing and the documentation never stated it correctly. This patch adds a warning during the bind initialization if it finds one; only !* does not emit a warning. This patch was done during the debugging of issue #2900.	2025-04-04 17:13:51 +02:00
Aurelien DARRAGON	ce6951d6f9	CLEANUP: log: adjust _lf_cbor_encode_byte() comment _lf_cbor_encode_byte() comment was not updated in c33b857df ("MINOR: log: support true cbor binary encoding") to reflect the new behavior. Indeed, binary form is now supported. Updating the comment that says otherwise.	2025-04-03 17:52:56 +02:00
Aurelien DARRAGON	11d4d0957e	MEDIUM: task: make notification_* API thread safe by default Some notification_* functions were not thread safe by default as they assumed only one producer would emit events for registered tasks. While this suited well with the Lua sockets use-case, this proved to be a limitation with some other event sources (ie: lua Queue class). Instead of having to deal with both the non thread safe and thread safe variants (_mt suffix), which is error prone, let's make the entire API thread safe regarding the event list. Pruning functions still require that only one thread executes them, with Lua this is always the case because there is one cleanup list per context.	2025-04-03 17:52:50 +02:00
Aurelien DARRAGON	976890edda	MINOR: hlua_fcn: add Queue:alarm() Queue:alarm() sets a wakeup alarm on the task when new data becomes available on Queue. It must be re-armed for each event. Lua documentation was updated	2025-04-03 17:52:44 +02:00
Aurelien DARRAGON	0ffc80d3ba	MINOR: hlua: add AppletTCP:try_receive() This is the non-blocking variant for AppletTCP:receive(). It doesn't take any argument, instead it tries to read as much data as available at once. If no data is available, empty string is returned. Lua documentation was updated.	2025-04-03 17:52:39 +02:00
Aurelien DARRAGON	86d3cfdeeb	MINOR: hlua: split hlua_applet_tcp_recv_yield() in two functions Split hlua_applet_tcp_recv_yield() in order to create hlua_applet_tcp_recv_try() helper function which does a single receive attempt.	2025-04-03 17:52:34 +02:00
Aurelien DARRAGON	c7cbfafa38	MINOR: hlua: core.wait() takes optional delay parameter core.wait() now accepts optional delay parameter in ms. Past this delay, the task is woken up if no event woke it before. Lua documentation was updated.	2025-04-03 17:52:28 +02:00
Aurelien DARRAGON	1e4e5ab4d2	MINOR: hlua: add core.wait() Similar to core.yield(), except that the task is not woken up automatically, instead it waits for events to trigger the task wakeup. Lua documentation was updated.	2025-04-03 17:52:23 +02:00
Aurelien DARRAGON	748dba4859	MINOR: hlua_fcn: register queue class using hlua_register_metatable() Most lua classes are registered by leveraging the hlua_register_metatable() helper. Let's use that for the Queue class as well for consistency.	2025-04-03 17:52:17 +02:00
Aurelien DARRAGON	c6fa061f22	BUG/MINOR: hlua_fcn: fix potential UAF with Queue:pop_wait() If Queue:pop_wait() is executed from a stream context and pop_wait() is aborted due to a Lua or resource error, then the waiting object pointing to the task will still be registered, so if the task eventually disappears, Queue:push() may try to wake an invalid task pointer. To prevent this bug from happening, we now rely on notification_* API to deliver waiting signals. This way signals are properly garbage collected when a lua context is destroyed. It should be backported in 2.8 with 86fb22c55 ("MINOR: hlua_fcn: add Queue class"). This patch depends on ("MINOR: task: add thread safe notification_new and notification_wake variants")	2025-04-03 17:52:09 +02:00
Aurelien DARRAGON	b77b1a2c3a	MINOR: task: add thread safe notification_new and notification_wake variants notification_new and notification_wake were historically meant to be called by a single thread doing both the init and the wakeup for other tasks waiting on the signals. In this patch, we extend the API so that notification_new and notification_wake have thread-safe variants that can safely be used with multiple threads registering on the same list of events and multiple threads pushing updates on the list.	2025-04-03 17:52:03 +02:00
Amaury Denoyelle	f0f1816f1a	MINOR: check: implement check-pool-conn-name srv keyword This commit is a direct follow-up of the previous one. It defines a new server keyword check-pool-conn-name. It is used as the default value for the name parameter of idle connection hash generation. Its behavior is similar to server keyword pool-conn-name, but reserved for checks reuse. If check-pool-conn-name is set, it is used in priority to match a connection for reuse. If unset, a fallback is performed on check-sni.	2025-04-03 17:19:07 +02:00
Amaury Denoyelle	43367f94f1	MINOR: check/backend: support conn reuse with SNI Support for connection reuse during server checks was implemented recently. This is activated with the server keyword check-reuse-pool. Similarly to stream processing via connect_backend(), a connection hash is calculated when trying to perform reuse for checks. This is necessary to retrieve a connection which shares the check connect parameters. However, idle connections can additionally be tagged using a pool-conn-name or SNI under connect_backend(). Check reuse does not test these values, which prevents retrieving a matching connection. Improve this by using "check-sni" value as idle connection hash input for check reuse. be_calculate_conn_hash() API has been adjusted so that name value can be passed as input, both when using streams or checks. Even with the current patch, there are still some scenarios which cannot be covered for checks connection reuse, most notably when using a dynamic pool-conn-name/SNI value. It is however at least sufficient to cover simpler cases.	2025-04-03 17:19:07 +02:00
Amaury Denoyelle	28116e307a	MINOR: server: activate automatically check reuse for rhttp@ protocol Without check-reuse-pool, it is impossible to perform check on server using @rhttp protocol. This is due to the inherent nature of the protocol which does not implement an active connect method. Thus, ensure that check-reuse-pool is always set when a reverse HTTP server is declared. This reduces server configuration and should prevent any omission. Note that it is still required to add the "check" server keyword to activate server checks.	2025-04-03 17:19:07 +02:00
Amaury Denoyelle	ace9f5db10	BUG/MINOR: server: ensure check-reuse-pool is copied from default-server Duplicate server check.reuse_pool boolean value in srv_settings_cpy(). This is necessary to ensure that check-reuse-pool value can be set via default-server or server-template. This does not need to be backported.	2025-04-03 17:19:07 +02:00
Amaury Denoyelle	76e9156c9b	MINOR: backend: mark srv as nonnull in alloc_dst_address() Server instance can be NULL on connect_server(), either when dispatch or transparent proxy are active. However, in alloc_dst_address() access to <srv> is safe thanks to SF_ASSIGNED stream flag. Add an ASSUME_NONNULL() to reflect this state. This should fix coverity report from github issue #2922.	2025-04-03 17:19:07 +02:00
William Lallemand	feb1a9ea17	DOC: configuration: replace "crt" by "ssl-f-use" in listeners Replace the "crt" keyword from the frontend section with a "ssl-f-use" keyword, "crt" could be ambiguous in case we don't want to put a certificate filename.	2025-04-03 16:38:15 +02:00
William Lallemand	c7f29afcea	MEDIUM: ssl: replace "crt" lines by "ssl-f-use" lines The new "crt" lines in frontend and listen sections are confusing: - a filename is mandatory but we could need a syntax without the filename in the future, if the filename is generated for example - there is no clue about the fact that its only used on the frontend side when reading the line A new "ssl-f-use" line replaces the "crt" line, but a "crt" keyword can be used on this line. "f" indicates that this is the frontend configuration, a "ssl-b-use" keyword could be used in the future. The "crt" lines only appeared in 3.2-dev so this won't change anything for people using configurations from previous major versions.	2025-04-03 16:38:15 +02:00
Olivier Houchard	4715c557e9	TESTS: Fix build for filltab25.c Give a return type to main(), so that filltab25.c compiles with modern compilers.	2025-04-03 15:59:41 +02:00
Willy Tarreau	f435a2e518	CLEANUP: atomics: also replace __sync_synchronize() with __atomic_thread_fence() The drop of older compilers also allows us to focus on clearer barriers, so let's use them.	2025-04-03 11:59:31 +02:00
Willy Tarreau	34e3b83f9c	CLEANUP: atomics: remove support for gcc < 4.7 The old __sync_* API is no longer necessary since we do not support gcc before 4.7 anymore. Let's just get rid of this code, the file is still ugly enough without it.	2025-04-03 11:55:35 +02:00
Ilia Shipitsin	27a6353ceb	CLEANUP: assorted typo fixes in the code, commits and doc	2025-04-03 11:37:25 +02:00
Ilia Shipitsin	bd477d5f51	CI: codespell: add "pres" to spellcheck whitelist spellcheck was triggered by the following: * pres : same as "res" but using the parent stream, if any. "pres" variables are only accessible during response processing of the parent stream.	2025-04-03 11:37:25 +02:00
Ilia Shipitsin	30df5b0f23	CI: spell check: allow manual trigger	2025-04-03 11:37:25 +02:00
Emeric Brun	b02b8453d1	BUG/MEDIUM: peers: prevent learning expiration too far in the future from unsync node This patch sets the expire of the entry to the max value in configuration if the value shown in the peer update message is too far in the future. This should be backported on all supported branches.	2025-04-03 11:26:29 +02:00
Emeric Brun	00461fbfbf	BUG/MINOR: peers: fix expire learned from a peer not converted from ms to ticks This has no impact currently since the MS_TO_TICKS macro does nothing, but it will prevent further bugs.	2025-04-03 11:26:21 +02:00
Christopher Faulet	6365eb85e5	MEDIUM: stream: Save SC and channel flags earlier in process_stream() At the beginning of process_stream(), the flags of the stream connectors and channels are saved to be able to handle changes performed in sub-functions (for instance in analyzers). But, some operations were performed before saving these flags: Synchronous receives and forced shutdowns. While it seems to be safe for now, it is a bit annoying because some events could be missed. So, to avoid bugs in the future, the channels and stream connectors flags are now really saved before any other processing.	2025-04-03 10:19:58 +02:00
Christopher Faulet	51611a5b70	BUG/MEDIUM: stream: Fix a possible freeze during a forced shut on a stream When a forced shutdown is performed on a stream, it is possible to freeze it infinitely because it is performed in an unexpected way from process_stream() point of view, especially when the stream is waiting for a server connection. The events sequence is a bit complex but at the end the stream remains blocked in turn-around state and no event is triggered to unblock it. By trying to fix the issue, we considered it was safer to rethink the feature. The idea is to quickly shutdown a stream to release resources. For instance to be able to delete a server. So, instead of scheduling a shutdown, it is more efficient to trigger an error and detach the stream from the server, if necessary. The same code as the one used to deal with connection errors in back_handle_st_cer() is used. This patch must be slowly backported as far as 2.6.	2025-04-03 10:19:57 +02:00
William Lallemand	b351f06ff1	REORG: ssl: move curves2nid and nid2nist to ssl_utils curves2nid and nid2nist are generic functions that could be used outside the JWS scope, this patch put them at the right place so they can be reused.	2025-04-02 19:34:09 +02:00
Willy Tarreau	a8fab63604	[RELEASE] Released version 3.2-dev9 Released version 3.2-dev9 with the following main changes : - MINOR: quic: move global tune options into quic_tune - CLEANUP: quic: reorganize TP flow-control initialization - MINOR: quic: ignore uni-stream for initial max data TP - MINOR: mux-quic: define config for max-data - MINOR: quic: define max-stream-data configuration as a ratio - MEDIUM: lb-chash: add directive hash-preserve-affinity - MEDIUM: pools: be a bit smarter when merging comparable size pools - REGTESTS: disable the test balance/balance-hash-maxqueue - BUG/MINOR: log: fix gcc warn about truncating NUL terminator while init char arrays - CI: fedora rawhide: allow "on: workflow_dispatch" in forks - CI: fedora rawhide: install "awk" as a dependency - CI: spellcheck: allow "on: workflow_dispatch" in forks - CI: coverity scan: allow "on: workflow_dispatch" in forks - CI: cross compile: allow "on: workflow_dispatch" in forks - CI: Illumos: allow "on: workflow_dispatch" in forks - CI: NetBSD: allow "on: workflow_dispatch" in forks - CI: QUIC Interop on AWS-LC: allow "on: workflow_dispatch" in forks - CI: QUIC Interop on LibreSSL: allow "on: workflow_dispatch" in forks - MINOR: compiler: add __nonstring macro - MINOR: thread: dump the CPU topology in thread_map_to_groups() - MINOR: cpu-set: compare two cpu sets with ha_cpuset_isequal() - MINOR: cpu-set: add a new function to print cpu-sets in human-friendly mode - MINOR: cpu-topo: add a dump of thread-to-CPU mapping to -dc - MINOR: cpu-topo: pass an extra argument to ha_cpu_policy - MINOR: cpu-topo: add new cpu-policies "group-by-2-clusters" and above - BUG/MINOR: config: silence .notice/.warning/.alert in discovery mode - EXAMPLES: add "games.cfg" and an example game in Lua - MINOR: jws: emit the JWK thumbprint - TESTS: jws: change the jwk format - MINOR: ssl/ckch: add substring parser for ckch_conf - MINOR: mt_list: Implement mt_list_try_lock_prev(). 
- MINOR: lbprm: Add method to deinit server and proxy - MINOR: threads: Add HA_RWLOCK_TRYRDTOWR() - MAJOR: leastconn; Revamp the way servers are ordered. - BUG/MINOR: ssl/ckch: leak in error path - BUILD: ssl/ckch: potential null pointer dereference - MINOR: log: support "raw" logformat node typecast - CLEANUP: assorted typo fixes in the code and comments - DOC: config: fix two missing "content" in "tcp-request" examples - MINOR: cpu-topo: cpu_dump_topology() SMT info check little optimisation - BUILD: compiler: undefine the CONCAT() macro if already defined - BUG/MEDIUM: leastconn: Don't try to reposition if the server is down - BUG/MINOR: rhttp: fix incorrect dst/dst_port values - BUG/MINOR: backend: do not overwrite srv dst address on reuse - BUG/MEDIUM: backend: fix reuse with set-dst/set-dst-port - MINOR: sample: define bc_reused fetch - REGTESTS: extend conn reuse test with transparent proxy - MINOR: backend: fix comment when killing idle conns - MINOR: backend: adjust conn_backend_get() API - MINOR: backend: extract conn hash calculation from connect_server() - MINOR: backend: extract conn reuse from connect_server() - MINOR: backend: remove stream usage on connection reuse - MINOR: check define check-reuse-pool server keyword - MEDIUM: check: implement check-reuse-pool - BUILD: backend: silence a build warning when not using ssl - BUILD: quic_sock: address a strict-aliasing build warning with gcc 5 and 6 - BUILD: ssl_ckch: use my_strndup() instead of strndup() - DOC: update INSTALL to reflect the minimum compiler version	2025-04-02 18:12:34 +02:00
Willy Tarreau	1450b44bb9	DOC: update INSTALL to reflect the minimum compiler version The mt_list update in 3.1 mandated the support for c11-like atomics that arrived with gcc-4.7. As such, older versions are no longer supported. For special cases in single-threaded environments, mt_lists could be replaced with regular lists but it doesn't seem worth the hassle. It was verified that gcc 4.7 to 14 and clang 3.0 and 19 do build fine. That leaves us with 10 years of coverage of compiler versions, which remains reasonable assuming that users of old ultra-stable systems are unlikely to upgrade haproxy without touching the rest of the system. This should be backported to 3.1.	2025-04-02 18:09:47 +02:00
Willy Tarreau	90e9b9d477	BUILD: ssl_ckch: use my_strndup() instead of strndup() Not all systems have strndup(), that's why we have our "my_strndup()", so let's make use of it here. This fixes the build on Solaris 10. No backport is needed, this was just merged with commit fdcb97614c ("MINOR: ssl/ckch: add substring parser for ckch_conf").	2025-04-02 17:20:03 +02:00
Willy Tarreau	dd900aead8	BUILD: quic_sock: address a strict-aliasing build warning with gcc 5 and 6 The UDP GSO code emits a build warning with older toolchains (gcc 5 and 6): src/quic_sock.c: In function 'cmsg_set_gso': src/quic_sock.c:683:2: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing] *((uint16_t *)CMSG_DATA(c)) = gso_size; ^ Let's just use the write_u16() function that's made for this purpose. It was verified that for all versions from 5 to 13, gcc produces the exact same code with the fix (and without the warning). It arrived in 3.1 with commit 448d3d388a ("MINOR: quic: add GSO parameter on quic_sock send API") so this can be backported there.	2025-04-02 16:07:31 +02:00
Willy Tarreau	870f7aa5cf	BUILD: backend: silence a build warning when not using ssl Since recent commit ee94a6cfc1 ("MINOR: backend: extract conn reuse from connect_server()") a build warning "set but not used" on the "reuse" variable is emitted, because indeed the variable is now only checked when SSL is in use. Let's just mark it as such.	2025-04-02 15:26:31 +02:00
Amaury Denoyelle	f1fb396d71	MEDIUM: check: implement check-reuse-pool Implement the possibility to reuse idle connections when performing server checks. This is done thanks to the recently introduced functions be_calculate_conn_hash() and be_reuse_connection(). One side effect of this change is that be_calculate_conn_hash() can now be called with a NULL stream instance. As such, part of the functions are adjusted accordingly. Note that to simplify configuration, connection reuse is not performed if any specific check connection parameters are defined on the server line or via the tcp-check connect rule. This is performed via newly defined tcpcheck_use_nondefault_connect().	2025-04-02 14:57:40 +02:00
Amaury Denoyelle	e34f748e3a	MINOR: check define check-reuse-pool server keyword Define a new server keyword check-reuse-pool, and its counterpart with a "no" prefix. For the moment, only parsing is implemented. The real behavior adjustment will be implemented in the next patch.	2025-04-02 14:57:40 +02:00
Amaury Denoyelle	20eb57b486	MINOR: backend: remove stream usage on connection reuse Adjust newly defined be_reuse_connection() API. The stream argument is removed. This will allow checks to be able to invoke it without relying on a stream instance.	2025-04-02 14:57:40 +02:00
Amaury Denoyelle	ee94a6cfc1	MINOR: backend: extract conn reuse from connect_server() Following the previous patch, the part directly related to connection reuse is extracted from connect_server(). It is now defined in a new function be_reuse_connection().	2025-04-02 14:57:40 +02:00
Amaury Denoyelle	c7cc6b6401	MINOR: backend: extract conn hash calculation from connect_server() On connection reuse, a hash is first calculated. It is generated from various connection parameters, to retrieve a matching connection. Extract hash calculation from connect_server() into a new dedicated function be_calculate_conn_hash(). The objective is to be able to perform connection reuse for checks, without connect_server() invocation which relies on a stream instance.	2025-04-02 14:57:40 +02:00
Amaury Denoyelle	4f0240f9a4	MINOR: backend: adjust conn_backend_get() API The main objective of this patch is to remove the stream instance from conn_backend_get() parameters. This would allow to perform reuse outside of stream contexts, for example for checks purpose.	2025-04-02 14:57:40 +02:00
Amaury Denoyelle	2ca616b4e1	MINOR: backend: fix comment when killing idle conns Previously, if a server reached its pool-high-count limit, connection were killed on connect_server() when reuse was not possible. However, this is now performed even if reuse is done since the following patch : b3397367dc7cec9e78c62c54efc24d9db5cde2d2 MEDIUM: connections: Kill connections even if we are reusing one. Thus, adjust the related comment to reflect this state.	2025-04-02 14:57:40 +02:00
Amaury Denoyelle	2f36162ee1	REGTESTS: extend conn reuse test with transparent proxy Recently, work on connection reuse revealed an issue when mixed with transparent proxy and set-dst. This patch rewrites the related regtests to be able to catch this now fixed bug. Note that it is the first regtest which relies on the recently introduced bc_reused sample fetch. This fetch could be reused in other related connection reuse regtests to simplify them.	2025-04-02 14:57:40 +02:00
Amaury Denoyelle	ec76d52cea	MINOR: sample: define bc_reused fetch Define a new layer4 sample fetch "bc_reused". It is used as a boolean, set to true if backend connection was reused for the request.	2025-04-02 14:57:40 +02:00
Amaury Denoyelle	5fda64e87e	BUG/MEDIUM: backend: fix reuse with set-dst/set-dst-port On backend connection reuse, a hash is calculated from various parameters, to ensure the selected connection matches the requested parameters. Notably, destination address is one of these parameters. However, it is only taken into account if using a transparent server (server address 0.0.0.0). This may cause issue where an incorrect connection is reused, which is not targeted to the correct destination address. This may be the case if a set-dst/set-dst-port is used with a transparent proxy (proxy option transparent). The fix is simple enough. Destination address is now always used as input to the connection reuse hash. This must be backported up to 2.6. Note that for reverse HTTP to work, it relies on the following patch, which ensures destination address remains NULL in this case. commit e94baf6ca71cb2319610baa74dbf17b9bc602b18 BUG/MINOR: rhttp: fix incorrect dst/dst_port values	2025-04-02 14:57:40 +02:00
Amaury Denoyelle	d7fa8e88c4	BUG/MINOR: backend: do not overwrite srv dst address on reuse Previously, destination address of backend connection was systematically reassigned. However, this step is unnecessary on connection reuse. Indeed, reuse should only be conducted with connection using the same destination address matching the stream requirements. This patch removes this unnecessary assignment. It is now only performed when reuse cannot be conducted and a new connection is instantiated. Functionally speaking, this patch should not change anything in theory, as reuse is performed in conformance with the destination address. However, it appears that it was not always properly enforced. The systematic assignment of the destination address hides these issues, so it is now removed. The identified bogus cases will then be fixed in the following patches. This should be backported up to all stable versions.	2025-04-02 14:57:40 +02:00
Amaury Denoyelle	c05bb8c967	BUG/MINOR: rhttp: fix incorrect dst/dst_port values With a @rhttp server, connect is not possible, transfer is only possible via idle connection reuse. The server does not have any network address. Thus, it is unnecessary to allocate the stream destination address prior to connection reuse. This patch adjusts this by fixing alloc_dst_address() to take this into account. Prior to this patch, alloc_dst_address() would incorrectly assimilate a @rhttp server with a transparent proxy mode. Thus stream destination address would be copied from the destination address. Connection address would then be rewritten with this incorrect value. This did not impact connect or reuse as destination addr is only used in idle conn hash calculation for transparent servers. However, it causes incorrect values for dst/dst_port samples. This should be backported up to 2.9.	2025-04-02 14:57:40 +02:00
Olivier Houchard	f59297e492	BUG/MEDIUM: leastconn: Don't try to reposition if the server is down It may happen that the server is going down, and fwlc_srv_reposition() is still called, because streams still attached to the server are being terminated. So in fwlc_srv_reposition(), just do nothing if we've been removed from the tree. This should fix github issue #2919. This should not be backported, unless commit 9fe72bba3cf3484577fa1ef00723de08df757996 is also backported.	2025-04-02 12:24:04 +02:00
Willy Tarreau	4ec5509541	BUILD: compiler: undefine the CONCAT() macro if already defined As Ilya reported in issue #2911, the CONCAT() macro breaks on NetBSD which defines its own as __CONCAT() (which is exactly the same). Let's just undefine it before ours to fix the issue instead of renaming, but keep ours so that we don't have doubts about what we're running with. Note that the patch introducing this breaking change was backported to 3.0.	2025-04-02 11:36:43 +02:00
David Carlier	a703eeaef7	MINOR: cpu-topo: cpu_dump_topology() SMT info check little optimisation Once we stumble across the first cpu having the criteria, we exit earlier from the loop.	2025-04-02 11:31:37 +02:00
Willy Tarreau	3de99a0919	DOC: config: fix two missing "content" in "tcp-request" examples As reported by Uku Sõrmus in GitHub issue #2917, two "tcp-request" rules in an example were mistakenly missing the "content" hook, rendering them invalid. This can be backported.	2025-04-02 11:17:05 +02:00
Ilia Shipitsin	78b849b839	CLEANUP: assorted typo fixes in the code and comments code, comments and doc actually.	2025-04-02 11:12:20 +02:00
Aurelien DARRAGON	423cca64b6	MINOR: log: support "raw" logformat node typecast "raw" logformat node typecast is a special value (unlike str,bool,int..) which tells haproxy to completely ignore logformat options (including encoding ones) and force binary output for the current node only. It is mainly intended for use with JSON or CBOR encoders in order to generate nested CBOR or nested JSON by storing intermediate log-formats within variables and assembling the final object in the parent log-format. Example: http-request set-var-fmt(txn.intermediate) "%{+json}o %(lower)[str(value)]" log-format "%{+json}o %(upper)[str(value)] %(intermediate:raw)[var(txn.intermediate)]" Would produce: {"upper": "value", "intermediate": {"lower": "value"}}	2025-04-02 11:04:43 +02:00
William Lallemand	31bd3627cd	BUILD: ssl/ckch: potential null pointer dereference src/ssl_ckch.c: In function ‘ckch_conf_parse’: src/ssl_ckch.c:4852:40: error: potential null pointer dereference [-Werror=null-dereference] 4852 \| while (r) { \| ^~ Add a test on r before using r. No backport needed	2025-04-02 10:02:07 +02:00
William Lallemand	2e8acf54d4	BUG/MINOR: ssl/ckch: leak in error path fdcb97614cb ("MINOR: ssl/ckch: add substring parser for ckch_conf") introduced a leak in the error path when the strndup fails. This patch fixes issue #2920. No backport needed.	2025-04-02 09:53:48 +02:00
Olivier Houchard	9fe72bba3c	MAJOR: leastconn; Revamp the way servers are ordered. For leastconn, servers used to just be stored in an ebtree. Each server would be one node. Change that so that nodes contain multiple mt_lists. Each list will contain servers that share the same key (typically meaning they have the same number of connections). Using mt_lists means that as long as tree elements already exist, moving a server from one tree element to another no longer requires the lbprm write lock. We use multiple mt_lists to reduce the contention when moving a server from one tree element to another. A list in the new element will be chosen randomly. We no longer remove a tree element as soon as they no longer contain any server. Instead, we keep a list of all elements, and when we need a new element, we look at that list only if it contains a number of elements already, otherwise we'll allocate a new one. Keeping nodes in the tree ensures that we very rarely have to take the lbprm write lock (as it only happens when we're moving the server to a position for which no element is currently in the tree). The number of mt_lists used is defined as FWLC_NB_LISTS. The number of tree elements we want to keep is defined as FWLC_MIN_FREE_ENTRIES, both in defaults.h. The values used were picked after experimentation, and seem to be the best choice of performances vs memory usage. Doing that gives a good boost in performances when a lot of servers are used. With a configuration using 500 servers, before that patch, about 830000 requests per second could be processed, with that patch, about 1550000 requests per second are processed, on a 64-core AMD, using 1200 concurrent connections.	2025-04-01 18:05:30 +02:00
Olivier Houchard	ba521a1d88	MINOR: threads: Add HA_RWLOCK_TRYRDTOWR() Add HA_RWLOCK_TRYRDTOWR(), that tries to upgrade a lock from reader to writer, and fails if any seeker or writer already holds it.	2025-04-01 18:05:30 +02:00
Olivier Houchard	2a9436f96b	MINOR: lbprm: Add method to deinit server and proxy Add two new methods to lbprm, server_deinit() and proxy_deinit(), in case something should be done at the lbprm level when removing servers and proxies.	2025-04-01 18:05:30 +02:00
Olivier Houchard	17059098e7	MINOR: mt_list: Implement mt_list_try_lock_prev(). Implement mt_list_try_lock_prev(), that does the same thing as mt_list_lock_prev(), except if the list is locked, it returns { NULL, NULL } instead of waiting.	2025-04-01 18:05:30 +02:00
William Lallemand	fdcb97614c	MINOR: ssl/ckch: add substring parser for ckch_conf Add a substring parser for the ckch_conf keyword parser, this will split a string into multiple substrings, and strdup them in an array.	2025-04-01 15:38:32 +02:00
William Lallemand	fa01c9d92b	TESTS: jws: change the jwk format The format of the jwk output changed a little bit because of the previous commit.	2025-04-01 14:37:22 +02:00
William Lallemand	f8fe84caca	MINOR: jws: emit the JWK thumbprint jwk_thumbprint() is a function which implements RFC7638 and emits a JWK thumbprint using a EVP_PKEY. EVP_PKEY_EC_to_pub_jwk() and EVP_PKEY_RSA_to_pub_jwk() were changed in order to match what is required to emit a thumbprint (ie, no spaces or lines and the lexicographic order of the fields)	2025-04-01 11:57:55 +02:00
Willy Tarreau	ed1d4807da	EXAMPLES: add "games.cfg" and an example game in Lua The purpose is mainly to exhibit certain limitations that come with such less common programming models, to show users how to program interactive tools in Lua, and how to connect interactively. Other use cases that could be envisioned are "top" and various monitoring utilities, with sliding graphs etc. Lua is particularly attractive for this usage, easy to program, well known from most AI tools (including its integration into haproxy), making such programs very quick to obtain in their basic form, and to improve later. A very limited example game is provided, following the principle of a very popular one, where the player must compose lines from falling pieces. It quickly revealed the need for the ability to enforce a timeout to applet:receive(). Other identified limitations include the difficulty from the Lua side to monitor multiple events at once, but it seems that callbacks and/or event dispatchers would be useful here. At the moment the CLI is not workable (its interactivity was broken in 2.9 when line buffering was adopted), though it was verified that it works with older releases. The command needed to connect to the game is displayed as a notice message during boot.	2025-04-01 09:10:00 +02:00
Willy Tarreau	2c779f3938	BUG/MINOR: config: silence .notice/.warning/.alert in discovery mode When first pre-parsing the config to detect the presence or absence of the master mode, we must not emit messages because they are not supposed to be visible at this point, otherwise they appear twice each. The pre-parsing, also called discovery mode, is only for internal use, thus it should remain silent. This should be backported to 3.1 where this mode was introduced.	2025-04-01 09:06:25 +02:00
Willy Tarreau	9f00702dc6	MINOR: cpu-topo: add new cpu-policies "group-by-2-clusters" and above This adds "group-by-{2,3,4}-clusters", which, as its name implies, creates one thread group per X clusters. This can be useful when CPUs are split into too small clusters, as well as when the total number of assigned cores is not even between the clusters, to try to spread the load between fewer different ones.	2025-03-31 16:21:37 +02:00
Willy Tarreau	1e9a2529aa	MINOR: cpu-topo: pass an extra argument to ha_cpu_policy This extra argument will allow common functions to distinguish between multiple policies. For now it's not used.	2025-03-31 16:21:37 +02:00
Willy Tarreau	e4053b0d09	MINOR: cpu-topo: add a dump of thread-to-CPU mapping to -dc When emitting the CPU topology info with -dc, also emit a list of thread-to-CPU mapping. The group/thread and thread ID are emitted with the list of their CPUs on each line. The count of CPUs is shown to ease comparisons, and as much as possible, we try to pack identical lines within a group by showing thread ranges.	2025-03-31 16:21:37 +02:00
Willy Tarreau	571573874a	MINOR: cpu-set: add a new function to print cpu-sets in human-friendly mode The new function "print_cpu_set()" will print cpu sets in a human-friendly way, with commas and dashes for intervals. The goal is to keep them compact enough.	2025-03-31 16:21:37 +02:00
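The compact "commas and dashes" notation described in the commit above can be illustrated with a small sketch. This is not HAProxy's print_cpu_set(): the function name format_cpu_mask(), its signature and the use of a plain 64-bit mask are assumptions made only for this illustration.

```c
#include <stdio.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (NOT HAProxy's print_cpu_set()): formats the CPUs set
 * in <mask> (bit N set means CPU N is present) into <buf>, using commas
 * between entries and dashes for runs of consecutive CPUs. Returns the number
 * of characters written (excluding the trailing NUL), or -1 if <buf> is too
 * small.
 */
int format_cpu_mask(uint64_t mask, char *buf, size_t size)
{
	size_t len = 0;
	int cpu = 0;

	if (!size)
		return -1;
	buf[0] = '\0';

	while (cpu < 64) {
		int first, last, ret;

		if (!(mask & (1ULL << cpu))) {
			cpu++;
			continue;
		}

		/* find the end of this run of consecutive CPUs */
		first = last = cpu;
		while (last + 1 < 64 && (mask & (1ULL << (last + 1))))
			last++;

		if (first == last)
			ret = snprintf(buf + len, size - len, "%s%d",
			               len ? "," : "", first);
		else
			ret = snprintf(buf + len, size - len, "%s%d-%d",
			               len ? "," : "", first, last);

		/* snprintf() reports truncation by returning >= remaining room */
		if (ret < 0 || (size_t)ret >= size - len)
			return -1;

		len += (size_t)ret;
		cpu = last + 1;
	}
	return (int)len;
}
```

With this sketch, a mask holding CPUs 0 to 3, 8, 10 and 11 is rendered as a single short string instead of seven separate numbers, which is the kind of compaction the commit message aims for.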
Willy Tarreau	3955f151b1	MINOR: cpu-set: compare two cpu sets with ha_cpuset_isequal() This function returns true if two CPU sets are equal.	2025-03-31 16:21:37 +02:00
Willy Tarreau	e17512c3b2	MINOR: thread: dump the CPU topology in thread_map_to_groups() It was previously done in thread_detect_count() but that's not quite handy because we still don't know about the groups setting. Better do it slightly later and have all the relevant info instead.	2025-03-31 15:42:13 +02:00
Valentine Krasnobaeva	b303861469	MINOR: compiler: add __nonstring macro GCC 15 throws the following warning on fixed-size char arrays if they do not contain the terminating NUL: src/tools.c:2041:25: error: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (17 chars into 16 available) [-Werror=unterminated-string-initialization] 2041 \| const char hextab[16] = "0123456789ABCDEF"; We are using a couple of such definitions for some constants. Converting them to flexible arrays, like: hextab[] = "0123456789ABCDEF" may have consequences, as enlarged arrays won't fit anymore where they were possibly located due to the memory alignment constraints. GCC adds 'nonstring' variable attribute for such char arrays, but clang and other compilers don't have it. Let's wrap 'nonstring' with our __nonstring macro, which will test if the compiler supports this attribute. This fixes the issue #2910.	2025-03-31 13:50:28 +02:00
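A minimal sketch of how such a wrapper can be written, assuming the compiler exposes __has_attribute. The macro name HEX_NONSTRING and the helper hex_digit() are hypothetical and chosen for this illustration only; the actual __nonstring definition in HAProxy's compiler header may differ.

```c
#include <stddef.h>

/* Sketch only: expands to the 'nonstring' attribute when the compiler
 * reports support for it, and to nothing otherwise. HEX_NONSTRING is a
 * hypothetical name, not the macro added by the commit.
 */
#if defined(__has_attribute)
#  if __has_attribute(nonstring)
#    define HEX_NONSTRING __attribute__((nonstring))
#  endif
#endif
#ifndef HEX_NONSTRING
#  define HEX_NONSTRING
#endif

/* Exactly 16 bytes: the initializer's trailing NUL is intentionally dropped,
 * which is valid C, and the attribute tells the compiler this is on purpose.
 */
static const char hextab[16] HEX_NONSTRING = "0123456789ABCDEF";

/* Hypothetical accessor: returns the hex digit for the low 4 bits of <nibble>. */
char hex_digit(unsigned int nibble)
{
	return hextab[nibble & 0xF];
}
```

Keeping the explicit size of 16 preserves the array's footprint, which is the concern the commit message raises about switching to `hextab[]`.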
Ilia Shipitsin	415d446065	CI: QUIC Interop on LibreSSL: allow "on: workflow_dispatch" in forks Previously, this build was limited to the "haproxy" GitHub organization only. Let's allow manual builds from forks.	2025-03-28 09:51:35 +01:00
Ilia Shipitsin	8d591c387a	CI: QUIC Interop on AWS-LC: allow "on: workflow_dispatch" in forks Previously, this build was limited to the "haproxy" GitHub organization only. Let's allow manual builds from forks.	2025-03-28 09:51:35 +01:00
Ilia Shipitsin	7de45e3874	CI: NetBSD: allow "on: workflow_dispatch" in forks Previously, this build was limited to the "haproxy" GitHub organization only. Let's allow manual builds from forks.	2025-03-28 09:51:35 +01:00
Ilia Shipitsin	8231f58fdc	CI: Illumos: allow "on: workflow_dispatch" in forks Previously, this build was limited to the "haproxy" GitHub organization only. Let's allow manual builds from forks.	2025-03-28 09:51:35 +01:00
Ilia Shipitsin	7495dbed22	CI: cross compile: allow "on: workflow_dispatch" in forks Previously, this build was limited to the "haproxy" GitHub organization only. Let's allow manual builds from forks.	2025-03-28 09:51:35 +01:00
Ilia Shipitsin	7eb54656ae	CI: coverity scan: allow "on: workflow_dispatch" in forks Previously, this build was limited to the "haproxy" GitHub organization only. Let's allow manual builds from forks.	2025-03-28 09:51:35 +01:00
Ilia Shipitsin	424ca19831	CI: spellcheck: allow "on: workflow_dispatch" in forks Previously, this build was limited to the "haproxy" GitHub organization only. Let's allow manual builds from forks.	2025-03-28 09:51:35 +01:00
Ilia Shipitsin	d9cb95c2a5	CI: fedora rawhide: install "awk" as a dependency for some reason it is not installed by default on rawhide anymore	2025-03-28 09:51:35 +01:00
Ilia Shipitsin	21894300c1	CI: fedora rawhide: allow "on: workflow_dispatch" in forks Previously, this build was limited to the "haproxy" GitHub organization only. Let's allow manual builds from forks.	2025-03-28 09:51:35 +01:00
Valentine Krasnobaeva	44f98f1747	BUG/MINOR: log: fix gcc warn about truncating NUL terminator while init char arrays gcc 15 throws such kind of warnings about initialization of some char arrays: src/log.c:181:33: error: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (17 chars into 16 available) [-Werror=unterminated-string-initialization] 181 \| const char sess_term_cond[16] = "-LcCsSPRIDKUIIII"; /* normal, Local, CliTo, CliErr, SrvTo, SrvErr, PxErr, Resource, Internal, Down, Killed, Up, -- */ \| ^~~~~~~~~~~~~~~~~~ src/log.c:182:33: error: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (9 chars into 8 available) [-Werror=unterminated-string-initialization] 182 \| const char sess_fin_state[8] = "-RCHDLQT"; /* cliRequest, srvConnect, srvHeader, Data, Last, Queue, Tarpit */ So, let's make it happy by not giving the sizes of these char arrays explicitly, so that the compiler can accommodate the NUL terminators there. Reported in GitHub issue #2910. This should be backported up to 2.6.	2025-03-27 11:52:33 +01:00
Willy Tarreau	9b53a4a7fb	REGTESTS: disable the test balance/balance-hash-maxqueue This test brought by commit 8ed1e91efd ("MEDIUM: lb-chash: add directive hash-preserve-affinity") seems to have hit a limitation of what can be expressed in vtc, as it would be desirable to have one server response release two clients at once but the various attempts using barriers have failed so far. The test seems to work fine locally but still fails almost 100% of the time on the CI, so it remains timing dependent in some ways. Tests have been done with nbthread 1, pool-idle-shared off, http-reuse never (since always fails locally) etc but to no avail. Let's just mark it broken in case we later figure another way to fix it. It's still usable locally most of the time, though.	2025-03-25 18:24:49 +01:00
Willy Tarreau	6b17310757	MEDIUM: pools: be a bit smarter when merging comparable size pools By default, pools of comparable sizes are merged together. However, the current algorithm is dumb: it rounds the requested size to the next multiple of 16 and compares the sizes like this. This results in many entries which are already multiples of 16 not being merged, for example 1024 and 1032 are separate, 65536 and 65540 are separate, 48 and 56 are separate (though 56 merges with 64). This commit changes this to consider not just the entry size but also the average entry size, that is, it compares the average size of all objects sharing the pool with the size of the object looking for a pool. If the object is not more than 1% bigger nor smaller than the current average size or if it is neither 16 bytes smaller nor larger, then it can be merged. Also, it always respects exact matches in order to avoid merging objects into larger pools or worse, extending existing ones for no reason, and when there's a tie, it always avoids extending an existing pool. Also, we now visit all existing pools in order to spot the best one, we do not stop anymore at the smallest one large enough. Theoretically this could cost a bit of CPU but in practice it's O(N^2) with N quite small (typically in the order of 100) and the cost at each step is very low (compare a few integer values). But as a side effect, pools are no longer sorted by size, "show pools bysize" is needed for this. This causes the objects to be much better grouped together, accepting to use a little bit more sometimes to avoid fragmentation, without causing everyone to be merged into the same pool. 
Thanks to this we're now seeing 36 pools instead of 48 by default, with some very nice examples of compact grouping: - Pool qc_stream_r (80 bytes) : 13 users > qc_stream_r : size=72 flags=0x1 align=0 > quic_cstrea : size=80 flags=0x1 align=0 > qc_stream_a : size=64 flags=0x1 align=0 > hlua_esub : size=64 flags=0x1 align=0 > stconn : size=80 flags=0x1 align=0 > dns_query : size=64 flags=0x1 align=0 > vars : size=80 flags=0x1 align=0 > filter : size=64 flags=0x1 align=0 > session pri : size=64 flags=0x1 align=0 > fcgi_hdr_ru : size=72 flags=0x1 align=0 > fcgi_param_ : size=72 flags=0x1 align=0 > pendconn : size=80 flags=0x1 align=0 > capture : size=64 flags=0x1 align=0 - Pool h3s (56 bytes) : 17 users > h3s : size=56 flags=0x1 align=0 > qf_crypto : size=48 flags=0x1 align=0 > quic_tls_se : size=48 flags=0x1 align=0 > quic_arng : size=56 flags=0x1 align=0 > hlua_flt_ct : size=56 flags=0x1 align=0 > promex_metr : size=48 flags=0x1 align=0 > conn_hash_n : size=56 flags=0x1 align=0 > resolv_requ : size=48 flags=0x1 align=0 > mux_pt : size=40 flags=0x1 align=0 > comp_state : size=40 flags=0x1 align=0 > notificatio : size=48 flags=0x1 align=0 > tasklet : size=56 flags=0x1 align=0 > bwlim_state : size=48 flags=0x1 align=0 > xprt_handsh : size=48 flags=0x1 align=0 > email_alert : size=56 flags=0x1 align=0 > caphdr : size=41 flags=0x1 align=0 > caphdr : size=41 flags=0x1 align=0 - Pool quic_cids (32 bytes) : 13 users > quic_cids : size=16 flags=0x1 align=0 > quic_tls_ke : size=32 flags=0x1 align=0 > quic_tls_iv : size=12 flags=0x1 align=0 > cbuf : size=32 flags=0x1 align=0 > hlua_queuew : size=24 flags=0x1 align=0 > hlua_queue : size=24 flags=0x1 align=0 > promex_modu : size=24 flags=0x1 align=0 > cache_st : size=24 flags=0x1 align=0 > spoe_appctx : size=32 flags=0x1 align=0 > ehdl_sub_tc : size=32 flags=0x1 align=0 > fcgi_flt_ct : size=16 flags=0x1 align=0 > sig_handler : size=32 flags=0x1 align=0 > pipe : size=24 flags=0x1 align=0 - Pool quic_crypto (1032 bytes) : 2 users > 
quic_crypto : size=1032 flags=0x1 align=0 > requri : size=1024 flags=0x1 align=0 - Pool quic_conn_r (65544 bytes) : 2 users > quic_conn_r : size=65536 flags=0x1 align=0 > dns_msg_buf : size=65540 flags=0x1 align=0 On a very unscientific test consisting in sending 1 million H1 requests and 1 million H2 requests to the stats page, we're seeing an ~6% lower memory usage with the patch: before the patch: Total: 48 pools, 4120832 bytes allocated, 4120832 used (~3555680 by thread caches). after the patch: Total: 36 pools, 3880648 bytes allocated, 3880648 used (~3299064 by thread caches). This should be taken with care however since pools allocate and release in batches.	2025-03-25 18:01:01 +01:00
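The difference between the old and new merge criteria described in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the code from src/pool.c: the function names are hypothetical, the exact boundary handling (strictly less than 16 bytes, at most 1%) is a guess from the wording of the message, and the exact-match and tie-breaking rules are left out.

```c
#include <stddef.h>

/* Rounds <size> up to the next multiple of 16. */
static size_t round_up_16(size_t size)
{
	return (size + 15) & ~(size_t)15;
}

/* Old criterion as described in the message: two sizes share a pool only if
 * they round up to the same multiple of 16. Hypothetical name.
 */
int old_sizes_match(size_t a, size_t b)
{
	return round_up_16(a) == round_up_16(b);
}

/* New criterion as described in the message: an object of <size> bytes may
 * join a pool whose users average <avg> bytes if it is within 1% of that
 * average, or less than 16 bytes away from it. Hypothetical name; the
 * boundary conditions are assumptions.
 */
int new_size_is_comparable(size_t avg, size_t size)
{
	size_t diff = avg > size ? avg - size : size - avg;

	if (diff * 100 <= avg)   /* within 1% of the average size */
		return 1;
	return diff < 16;        /* or less than 16 bytes apart */
}
```

Under these assumptions, the pairs quoted in the message (1024 and 1032, 65536 and 65540, 48 and 56) are rejected by the old test but accepted by the new one, while clearly different sizes such as 64 and 1024 remain separate.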
Pierre-Andre Savalle	8ed1e91efd	MEDIUM: lb-chash: add directive hash-preserve-affinity When using hash-based load balancing, requests are always assigned to the server corresponding to the hash bucket for the balancing key, without taking maxconn or maxqueue into account, unlike in other load balancing methods like 'first'. This adds a new backend directive that can be used to take maxconn and possibly maxqueue in that context. This can be used when hashing is desired to achieve cache locality, but sending requests to a different server is preferable to queuing for a long time or failing requests when the initial server is saturated. By default, affinity is preserved as was the case previously. When 'hash-preserve-affinity' is set to 'maxqueue', servers are considered successively in the order of the hash ring until a server that does not have a full queue is found. When 'maxconn' is set on a server, queueing cannot be disabled, as 'maxqueue=0' means unlimited. To support picking a different server when a server is at 'maxconn' irrespective of the queue, 'hash-preserve-affinity' can be set to 'maxconn'.	2025-03-25 18:01:01 +01:00
Amaury Denoyelle	cf9e40bd8a	MINOR: quic: define max-stream-data configuration as a ratio	2025-03-25 16:30:35 +01:00
Amaury Denoyelle	68c10d444d	MINOR: mux-quic: define config for max-data Define a new global configuration tune.quic.frontend.max-data. This allows users to explicitly set the value for the corresponding QUIC TP initial-max-data, with direct impact on haproxy memory consumption.	2025-03-25 16:30:09 +01:00
Amaury Denoyelle	1f1a18e318	MINOR: quic: ignore uni-stream for initial max data TP Initial TP value for max-data is automatically calculated to be adjusted to the maximum number of opened streams over a QUIC connection. This took into account both max-streams-bidi-remote and uni-streams. By default, this is equivalent to 100 + 3 = 103 max opened streams. This patch simplifies the calculation by only using bidirectional streams. Uni streams are ignored because they are only used for HTTP/3 control exchanges, which should only represent a few bytes. For now, users can only configure the max number of remote bidi streams, so the simplified calculation should make more sense to them. Note that this relies on the assumption that HTTP/3 is used as application protocol. To support other protocols, it may be necessary to review this and take into account both local bidi and uni streams.	2025-03-25 16:29:38 +01:00
Amaury Denoyelle	3db5320289	CLEANUP: quic: reorganize TP flow-control initialization Adjust initialization of flow-control transport parameters via quic_transport_params_init(). This is purely cosmetic, with some comments added. It is also a preparatory step for future patches with addition of new configuration keywords related to flow-control TP values.	2025-03-25 16:29:35 +01:00
Amaury Denoyelle	a71007c088	MINOR: quic: move global tune options into quic_tune A new structure quic_tune has recently been defined. Its purpose is to store global options related to QUIC. Previously, only the tunable to toggle pacing was stored in it. This commit moves several QUIC related tunable from global to quic_tune structure. This better centralizes QUIC configuration option and gives room for future generic options.	2025-03-24 10:01:46 +01:00
Willy Tarreau	119a79f479	[RELEASE] Released version 3.2-dev8 Released version 3.2-dev8 with the following main changes : - MINOR: jws: implement JWS signing - TESTS: jws: implement a test for JWS signing - CI: github: add "jose" to apt dependencies - CLEANUP: log-forward: remove useless options2 init - CLEANUP: log: add syslog_process_message() helper - MINOR: proxy: add proxy->options3 - MINOR: log: migrate log-forward options from proxy->options2 to options3 - MINOR: log: provide source address information in syslog_process_message() - MINOR: tools: only print address in sa2str() when port == -1 - MINOR: log: add "option host" log-forward option - MINOR: log: handle log-forward "option host" - MEDIUM: log: change default "host" strategy for log-forward section - BUG/MEDIUM: thread: use pthread_self() not ha_pthread[tid] in set_affinity - MINOR: compiler: add a simple macro to concatenate resolved strings - MINOR: compiler: add a new __decl_thread_var() macro to declare local variables - BUILD: tools: silence a build warning when USE_THREAD=0 - BUILD: backend: silence a build warning when threads are disabled - DOC: management: rename some last occurences from domain "dns" to "resolvers" - BUG/MINOR: stats: fix capabilities and hide settings for some generic metrics - MINOR: cli: export cli_io_handler() to ease symbol resolution - MINOR: tools: improve symbol resolution without dl_addr - MINOR: tools: ease the declaration of known symbols in resolve_sym_name() - MINOR: tools: teach resolve_sym_name() a few more common symbols - BUILD: tools: avoid a build warning on gcc-4.8 in resolve_sym_name() - DEV: ncpu: also emulate sysconf() for _SC_NPROCESSORS_* - DOC: design-thoughts: commit numa-auto.txt - MINOR: cpuset: make the API support negative CPU IDs - MINOR: thread: rely on the cpuset functions to count bound CPUs - MINOR: cpu-topo: add ha_cpu_topo definition - MINOR: cpu-topo: allocate and initialize the ha_cpu_topo array. 
- MINOR: cpu-topo: rely on _SC_NPROCESSORS_CONF to trim maxcpus - MINOR: cpu-topo: add a function to dump CPU topology - MINOR: cpu-topo: update CPU topology from excluded CPUs at boot - REORG: cpu-topo: move bound cpu detection from cpuset to cpu-topo - MINOR: cpu-topo: add detection of online CPUs on Linux - MINOR: cpu-topo: add detection of online CPUs on FreeBSD - MINOR: cpu-topo: try to detect offline cpus at boot - MINOR: cpu-topo: add CPU topology detection for linux - MINOR: cpu-topo: also store the sibling ID with SMT - MINOR: cpu-topo: add NUMA node identification to CPUs on Linux - MINOR: cpu-topo: add NUMA node identification to CPUs on FreeBSD - MINOR: thread: turn thread_cpu_mask_forced() into an init-time variable - MINOR: cfgparse: move the binding detection into numa_detect_topology() - MINOR: cfgparse: use already known offline CPU information - MINOR: global: add a command-line option to enable CPU binding debugging - MINOR: cpu-topo: add a new "cpu-set" global directive to choose cpus - MINOR: cpu-topo: add "drop-cpu" and "only-cpu" to cpu-set - MEDIUM: thread: start to detect thread groups and threads min/max - MEDIUM: cpu-topo: make sure to properly assign CPUs to threads as a fallback - MEDIUM: thread: reimplement first numa node detection - MEDIUM: cfgparse: remove now unused numa & thread-count detection - MINOR: cpu-topo: refine cpu dump output to better show kept/dropped CPUs - MINOR: cpu-topo: fall back to nominal_perf and scaling_max_freq for the capacity - MINOR: cpu-topo: use cpufreq before acpi cppc - MINOR: cpu-topo: boost the capacity of performance cores with cpufreq - MINOR: cpu-topo: skip CPU detection when /sys/.../cpu does not exist - MINOR: cpu-topo: skip identification of non-existing CPUs - MINOR: cpu-topo: skip CPU properties that we've verified do not exist - MINOR: cpu-topo: implement a sorting mechanism for CPU index - MINOR: cpu-topo: implement a sorting mechanism by CPU locality - MINOR: cpu-topo: implement a CPU 
sorting mechanism by cluster ID - MINOR: cpu-topo: ignore single-core clusters - MINOR: cpu-topo: assign clusters to cores without and renumber them - MINOR: cpu-topo: make sure we don't leave unassigned IDs in the cpu_topo - MINOR: cpu-topo: assign an L3 cache if more than 2 L2 instances - MINOR: cpu-topo: renumber cores to avoid holes and make them contiguous - MINOR: cpu-topo: add a function to sort by cluster+capacity - MINOR: cpu-topo: consider capacity when forming clusters - MINOR: cpu-topo: create an array of the clusters - MINOR: cpu-topo: ignore excess of too small clusters - MINOR: cpu-topo: add "only-node" and "drop-node" to cpu-set - MINOR: cpu-topo: add "only-thread" and "drop-thread" to cpu-set - MINOR: cpu-topo: add "only-core" and "drop-core" to cpu-set - MINOR: cpu-topo: add "only-cluster" and "drop-cluster" to cpu-set - MINOR: cpu-topo: add a CPU policy setting to the global section - MINOR: cpu-topo: add a 'first-usable-node' cpu policy - MEDIUM: cpu-topo: use the "first-usable-node" cpu-policy by default - CLEANUP: thread: now remove the temporary CPU node binding code - MINOR: cpu-topo: add cpu-policy "group-by-cluster" - MEDIUM: cpu-topo: let the "group-by-cluster" split groups - MINOR: cpu-topo: add a new "performance" cpu-policy - MINOR: cpu-topo: add a new "efficiency" cpu-policy - MINOR: cpu-topo: add a new "resource" cpu-policy - MINOR: jws: add new functions in jws.h - MINOR: cpu-topo: fix unused stack var 'cpu2' reported by coverity - MINOR: hlua: add an optional timeout to AppletTCP:receive() - MINOR: jws: use jwt_alg type instead of a char - BUG/MINOR: log: prevent saddr NULL deref in syslog_io_handler() - MINOR: stream: decrement srv->served after detaching from the list - BUG/MINOR: hlua: fix optional timeout argument index for AppletTCP:receive() - MINOR: server: simplify srv_has_streams() - CLEANUP: server: make it clear that srv_check_for_deletion() is thread-safe - MINOR: cli/server: don't take thread isolation to check for 
srv-removable - BUG/MINOR: limits: compute_ideal_maxconn: don't cap remain if fd_hard_limit=0 - MINOR: limits: fix check_if_maxsock_permitted description - BUG/MEDIUM: hlua/cli: fix cli applet UAF in hlua_applet_wakeup() - MINOR: tools: path_base() concatenates a path with a base path - MEDIUM: ssl/ckch: make the ckch_conf more generic - BUG/MINOR: mux-h2: Reset streams with NO_ERROR code if full response was already sent - MINOR: stats: add .generic explicit field in stat_col struct - MINOR: stats: STATS_PX_CAP___B_ macro - MINOR: stats: add .cap for some static metrics - MINOR: stats: use stat_col storage stat_cols_info - MEDIUM: promex: switch to using stat_cols_info for global metrics - MINOR: promex: expose ST_I_INF_WARNINGS (AKA total_warnings) metric - MEDIUM: promex: switch to using stat_cols_px for front/back/server metrics - MINOR: stats: explicitly add frontend cap for ST_I_PX_REQ_TOT - CLEANUP: promex: remove unused PROMEX_FL_{INFO,FRONT,BACK,LI,SRV} flags - BUG/MEDIUM: mux-quic: fix crash on RS/SS emission if already close local - BUG/MINOR: mux-quic: remove extra BUG_ON() in _qcc_send_stream() - MEDIUM: mt_list: Reduce the max number of loops with exponential backoff - MINOR: stats: add alt_name field to stat_col struct - MINOR: stats: add alt name info to stat_cols_info where relevant - MINOR: promex: get rid of promex_global_metric array - MINOR: stats-proxy: add alt_name field for ME_NEW_{FE,BE,PX} helpers - MINOR: stats-proxy: add alt name info to stat_cols_px where relevant - MINOR: promex: get rid of promex_st_metrics array - MINOR: pools: rename the "by_what" field of the show pools context to "how" - MINOR: cli/pools: record the list of pool registrations even when merging them	2025-03-21 17:33:36 +01:00
Willy Tarreau	9091c5317f	MINOR: cli/pools: record the list of pool registrations even when merging them By default, create_pool() tries to merge similar pools into one. But when dealing with certain bugs, it's hard to say which ones were merged together. We do have the information at registration time, so let's just create a list of registrations ("pool_registration") attached to each pool, that will store that information. It can then be consulted on the CLI using "show pools detailed", where the names, sizes, alignment and flags are reported.	2025-03-21 17:09:30 +01:00
Willy Tarreau	baf8b742b4	MINOR: pools: rename the "by_what" field of the show pools context to "how" The goal will be to support other dump options. We don't need 32 bits to express sorting criteria, let's reserve only 4 bits for them and leave the remaining ones unused.	2025-03-21 17:09:30 +01:00
Aurelien DARRAGON	83074bf690	MINOR: promex: get rid of promex_st_metrics array In this patch we pursue the work started in a5aadbd ("MEDIUM: promex: switch to using stat_cols_px for front/back/server metrics"): Indeed, while having ".promex_name" info in stat_cols_info generic array was confusing, Willy suggested that we have ".alt_name" which stays generic and may be considered by alternative exporters for metric naming. For now, only promex exporter will make use of it. Thanks to this, it allows us to completely get rid of the stat_cols_px array. The other main benefit is that it will be much harder to overlook promex metric definition now because .alt_name has more visibility in the main metric array rather than in an addon file.	2025-03-21 17:05:31 +01:00
Aurelien DARRAGON	276491dc22	MINOR: stats-proxy: add alt name info to stat_cols_px where relevant For all metrics defined under promex_st_metrics array, add the corresponding .alt_name field in the general purpose stat_cols_px array.	2025-03-21 17:05:26 +01:00
Aurelien DARRAGON	7f9d8c1327	MINOR: stats-proxy: add alt_name field for ME_NEW_{FE,BE,PX} helpers For now alt_name is systematically set to NULL. Thanks to this change we may easily add an altname to existing metrics. Also by requiring explicit value it offers more visibility for this field.	2025-03-21 17:05:19 +01:00
Aurelien DARRAGON	155fb4ec74	MINOR: promex: get rid of promex_global_metric array In this patch we pursue the work started in 1adc796 ("MEDIUM: promex: switch to using stat_cols_info for global metrics"): Indeed, while having ".promex_name" info in stat_cols_info generic array was confusing, Willy suggested that we have ".alt_name" which stays generic and may be considered by alternative exporters for metric naming. For now, only promex exporter will make use of it. Thanks to this, it allows us to completely get rid of the promex_global_metric array. The other main benefit is that it will be much harder to overlook promex metric definition now because .alt_name has more visibility in the main metric array rather than in an addon file.	2025-03-21 17:05:14 +01:00
Aurelien DARRAGON	b03e05cd36	MINOR: stats: add alt name info to stat_cols_info where relevant For all metrics defined under promex_global_metrics array, add the corresponding .alt_name field in the general purpose stat_cols_info array.	2025-03-21 17:05:02 +01:00
Aurelien DARRAGON	7ec6f4412c	MINOR: stats: add alt_name field to stat_col struct alt_name will be used by metric exporters to know how the metric should be presented to the user. If the alt_name is NULL, the metric should be ignored. For now only promex exporter will make use of this.	2025-03-21 17:04:54 +01:00
Olivier Houchard	98967aa09f	MEDIUM: mt_list: Reduce the max number of loops with exponential backoff Reduce the max number of loops in the mt_list code while waiting for a lock to be available with exponential backoff. It's been observed that the current value led to severe performance degradation at least on some hardware, hopefully this value will be acceptable everywhere.	2025-03-21 11:30:59 +01:00
Amaury Denoyelle	c5f8df8d55	BUG/MINOR: mux-quic: remove extra BUG_ON() in _qcc_send_stream() The following patch fixed a BUG_ON() which could be triggered if RS/SS emission was scheduled after stream local closure. 7ee1279f4b8416435faba5cb93a9be713f52e4df BUG/MEDIUM: mux-quic: fix crash on RS/SS emission if already close local qcc_send_stream() was rewritten as a wrapper around an internal _qcc_send_stream() used to bypass the faulty BUG_ON(). However, an extra unnecessary BUG_ON() was added by mistake in _qcc_send_stream(). This should not cause any issue, as the BUG_ON() is only active if <urg> argument is false, which is not the case for RS/SS emission. However, this patch is labelled as a bug as this BUG_ON() is unnecessary and may cause issues in the future. This should be backported up to 2.8, after the above mentioned patch.	2025-03-20 18:18:52 +01:00
Amaury Denoyelle	7ee1279f4b	BUG/MEDIUM: mux-quic: fix crash on RS/SS emission if already close local A BUG_ON() is present in qcc_send_stream() to ensure that emission is never performed with a stream already closed locally. However, this function is also used for RESET_STREAM/STOP_SENDING emission. No protection exists to ensure that RS/SS is not scheduled after stream local closure, which would result in this BUG_ON() crash. This crash can be triggered with the following QUIC client sequence : 1. SS is emitted to open a new stream. QUIC-MUX schedules a RS emission by and the stream is locally closed. 2. An invalid HTTP/3 request is sent on the same stream, for example with duplicated pseudo-headers. The objective is to ensure qcc_abort_stream_read() is called after stream closure, which results in the following backtrace. 0x000055555566a620 in qcc_send_stream (qcs=0x7ffff0061420, urg=1, count=0) at src/mux_quic.c:1633 1633 BUG_ON(qcs_is_close_local(qcs)); [ ## gdb ## ] bt #0 0x000055555566a620 in qcc_send_stream (qcs=0x7ffff0061420, urg=1, count=0) at src/mux_quic.c:1633 #1 0x000055555566a921 in qcc_abort_stream_read (qcs=0x7ffff0061420) at src/mux_quic.c:1658 #2 0x0000555555685426 in h3_rcv_buf (qcs=0x7ffff0061420, b=0x7ffff748d3f0, fin=0) at src/h3.c:1454 #3 0x0000555555668a67 in qcc_decode_qcs (qcc=0x7ffff0049eb0, qcs=0x7ffff0061420) at src/mux_quic.c:1315 #4 0x000055555566c76e in qcc_recv (qcc=0x7ffff0049eb0, id=12, len=0, offset=23, fin=0 '\000', data=0x7fffe0049c1c "\366\r,\230\205\354\234\301;\2563\335\037k\306\334\037\260", <incomplete sequence \323>) at src/mux_quic.c:1901 #5 0x0000555555692551 in qc_handle_strm_frm (pkt=0x7fffe00484b0, strm_frm=0x7ffff00539e0, qc=0x7fffe0049220, fin=0 '\000') at src/quic_rx.c:635 #6 0x0000555555694530 in qc_parse_pkt_frms (qc=0x7fffe0049220, pkt=0x7fffe00484b0, qel=0x7fffe0075fc0) at src/quic_rx.c:980 #7 0x0000555555696c7a in qc_treat_rx_pkts (qc=0x7fffe0049220) at src/quic_rx.c:1324 #8 0x00005555556b781b in 
quic_conn_app_io_cb (t=0x7fffe0037f20, context=0x7fffe0049220, state=49232) at src/quic_conn.c:601 #9 0x0000555555d53788 in run_tasks_from_lists (budgets=0x7ffff748e2b0) at src/task.c:603 #10 0x0000555555d541ae in process_runnable_tasks () at src/task.c:886 #11 0x00005555559c39e9 in run_poll_loop () at src/haproxy.c:2858 #12 0x00005555559c41ea in run_thread_poll_loop (data=0x55555629fb40 <ha_thread_info+64>) at src/haproxy.c:3075 The proper solution is to not execute this BUG_ON() for RS/SS emission. Indeed, it is valid and can be useful to emit these frames, even after stream local closure. To implement this, qcc_send_stream() has been rewritten as a mere wrapper function around the new internal _qcc_send_stream(). The latter is used only by QMUX for STREAM, RS and SS emission. The application layer continues to use the original function for STREAM emission, with the BUG_ON() still in place there. This must be backported up to 2.8.	2025-03-20 17:32:14 +01:00
Aurelien DARRAGON	85f2f93d11	CLEANUP: promex: remove unused PROMEX_FL_{INFO,FRONT,BACK,LI,SRV} flags Now promex metric dumping relies on stat_cols API, we don't make use of these flags, so let's remove them.	2025-03-20 11:42:58 +01:00
Aurelien DARRAGON	2ab82124ec	MINOR: stats: explicitly add frontend cap for ST_I_PX_REQ_TOT While being a generic metric, ST_I_PX_REQ_TOT is handled specifically for the frontend case. But the frontend capability isn't set for that metric. It is actually quite misleading, because the capability may be checked to see whether the metric is relevant for a given scope, yet it is relevant for frontend scope. In this patch we also add the frontend capability for the metric.	2025-03-20 11:42:43 +01:00
Aurelien DARRAGON	a5aadbd512	MEDIUM: promex: switch to using stat_cols_px for front/back/server metrics Now the stat_cols_px array contains all info that prometheus requires, stop using the promex_st_metrics array that contains redundant infos. As for ("MEDIUM: promex: switch to using stat_cols_info for global metrics"), initial goal was to completely get rid of promex_st_metrics array, but it turns out it is still required but only for the name mapping part now. So in this commit we change it from complex structure array (with redundant info) to a simple ist array with the metric id:promex name mapping. If a metric name is not defined there, then promex ignores it.	2025-03-20 11:40:07 +01:00
Aurelien DARRAGON	d31ef6134a	MINOR: promex: expose ST_I_INF_WARNINGS (AKA total_warnings) metric It has been requested to have the ST_I_INF_WARNINGS metric available from prometheus, let's define it in promex_global_metrics ist array so that prometheus starts advertising it.	2025-03-20 11:39:16 +01:00
Aurelien DARRAGON	1adc796c4b	MEDIUM: promex: switch to using stat_cols_info for global metrics Now the stat_cols_info array contains all info that prometheus requires, stop using the promex_global_metrics array that contains redundant infos. Initial goal was to completely drop the promex_global_metrics array. However it was deemed no longer relevant as prometheus stats rely on a custom name that cannot be derived from stat_cols_info[], unless we add a specific ".promex_name" field or similar to name the stats for prometheus. This is what was carried over on a first attempt but it proved to burden stat_cols_info[] array (not only memory wise, it is quite confusing to see promex in the main codebase, given that prometheus is shipped as an optional add-on). The new strategy consists in revamping the promex_global_metrics array from promex_metric (with all redundant fields for metrics) to a simple ID<==>IST mapping. If the metric is mapped, then it means promex addon should advertise it (using the name provided in the mapping). Now for all the metric retrieval, no longer rely on built-in hardcoded values but instead leverage the new stat cols API. The tricky part is the .type association because the general rule doesn't apply for all metrics as it seems that we stated that some non-counters oriented metrics (at least from haproxy point of view) had to be presented as counter metrics. So in this patch we add some special treatment for those metrics to emulate the old behavior. If that's not relevant in the future, it may be removed. But this requires to ensure that promex users will properly cope with that change. At least for now, no change of behavior should be expected.	2025-03-20 11:38:56 +01:00
Aurelien DARRAGON	af68343a56	MINOR: stats: use stat_col storage stat_cols_info Use stat_col storage for stat_cols_info[] array instead of name_desc. As documented in 65624876f ("MINOR: stats: introduce a more expressive stat definition method"), stat_col supersedes name_desc storage but it remains backward compatible. Here we migrate to the new API to be able to further extend stat_cols_info[] in following patches.	2025-03-20 11:38:32 +01:00
Aurelien DARRAGON	8aa8626d12	MINOR: stats: add .cap for some static metrics Goal is to merge promex metrics definition into the main one. Promex metrics will use the metric capability to know available scopes, thus only metrics relevant for prometheus were updated.	2025-03-20 11:38:17 +01:00
Aurelien DARRAGON	9c60fc9fe1	MINOR: stats: STATS_PX_CAP___B_ macro STATS_PX_CAP___B_ points to STATS_PX_CAP_BE, it is just an alias for consistency, like STATS_PX_CAP____S which points to STATS_PX_CAP_SRV.	2025-03-20 11:37:47 +01:00
Aurelien DARRAGON	3c1b00b127	MINOR: stats: add .generic explicit field in stat_col struct Further extend logic implemented in 65624876 ("MINOR: stats: introduce a more expressive stat definition method") and 4e9e8418 ("MINOR: stats: prepare stats-file support for values other than FN_COUNTER"): we don't rely anymore on the presence of the capability to know if the metric is generic or not. This is because it prevents us from setting a capability on static statistics. Yet it could be useful to set the capability even on static metrics, thus we add a dedicated .generic bit to tell haproxy that the metric is generic and can be handled automatically by the API. Also, ME_NEW_* helpers are not explicitly associated to generic metric definition (as it was already the case before) to avoid ambiguities. It may change in the future as we may need to use the new definition method to define static metrics (without the generic bit set). But for now it isn't the case as this need definition was implemented for generic metrics support in the first place. If we want to define static metrics using the API, we could add a new set of helpers for instance.	2025-03-20 11:37:21 +01:00
Christopher Faulet	e87397bc7d	BUG/MINOR: mux-h2: Reset streams with NO_ERROR code if full response was already sent On frontend side, when a stream is shut while the response was already fully sent, it was cancelled by sending a RST_STREAM(CANCEL) frame. However, it is not accurate. CANCEL error code must only be used if the response headers were sent, but not the full response. As stated in the RFC 9113, when the response was fully sent, to stop the request sending, a RST_STREAM with an error code of NO_ERROR must be sent. This patch should solve the issue #1219. It must be backported to all stable versions.	2025-03-20 08:36:06 +01:00
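The rule stated in e87397bc7d can be sketched as a small decision function. This is only an illustration, not the mux-h2 code: the enum, the function name and the boolean parameter are hypothetical. The two numeric values are the NO_ERROR and CANCEL error codes defined by RFC 9113, and only the two cases named in the commit message are modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* HTTP/2 error codes from RFC 9113 (only the two discussed here).
 * The enum and its member names are made up for this sketch.
 */
enum h2_err_sketch {
    H2_SKETCH_NO_ERROR = 0x0,
    H2_SKETCH_CANCEL   = 0x8,
};

/* Hypothetical helper: pick the RST_STREAM error code for a frontend
 * stream that is being shut after response headers were sent.
 * <full_response_sent> is true once the whole response was emitted.
 */
static uint32_t rst_code_on_shut(bool full_response_sent)
{
    /* full response already sent: ask the peer to stop sending the
     * request without signalling an error; otherwise cancel.
     */
    return full_response_sent ? H2_SKETCH_NO_ERROR : H2_SKETCH_CANCEL;
}
```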
William Lallemand	2fb6270910	MEDIUM: ssl/ckch: make the ckch_conf more generic The ckch_store_load_files() function makes specific processing for PARSE_TYPE_STR as if it was a type only used for paths. This patch changes a little bit the way it's done, PARSE_TYPE_STR is only meant to strdup() a string and stores the resulting pointer in the ckch_conf structure. Any processing regarding the path is now done in the callback. Since the callbacks were basically doing the same thing, they were transformed into the DECLARE_CKCH_CONF_LOAD() macros which allows to do some templating of these functions. The resulting ckch_conf_load_* functions will do the same as before, except they will also do the path processing instead of letting ckch_store_load_files() do it, which means we don't need the "base" member anymore in the struct ckch_conf_kws.	2025-03-19 18:08:40 +01:00
William Lallemand	b0ad777902	MINOR: tools: path_base() concatenates a path with a base path With the SSL configuration, crt-base, key-base are often used, these keywords concatenate the base path with the path when the path does not start with '/'. This is done at several places in the code, so a function to do this would be better to standardize the code.	2025-03-19 17:59:31 +01:00
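The behaviour described in b0ad777902 (prefix a base path unless the path is absolute) could look like the sketch below. It does not reproduce the real path_base() prototype, which is not shown in this log: the function name, its arguments and its return convention are assumptions chosen for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not haproxy's path_base(): writes into <dst>
 * either <path> unchanged (when it starts with '/' or when no base is
 * given) or "<base>/<path>". Returns 0 on success, -1 if the result
 * does not fit in <size> bytes.
 */
static int join_with_base(const char *path, const char *base,
                          char *dst, size_t size)
{
    int ret;

    if (path[0] == '/' || base == NULL || base[0] == '\0')
        ret = snprintf(dst, size, "%s", path);
    else
        ret = snprintf(dst, size, "%s/%s", base, path);

    /* snprintf() returns the length it wanted to write */
    return (ret < 0 || (size_t)ret >= size) ? -1 : 0;
}
```

With a base of "/etc/haproxy/certs", the relative path "site.pem" becomes "/etc/haproxy/certs/site.pem", while "/srv/site.pem" is left untouched.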
Aurelien DARRAGON	21601f4a27	BUG/MEDIUM: hlua/cli: fix cli applet UAF in hlua_applet_wakeup() Recent commit e5e36ce09 ("BUG/MEDIUM: hlua/cli: Fix lua CLI commands to work with applet's buffers") revealed a bug in hlua cli applet handling. Indeed, playing with Willy's lua tetris script on the cli, a segfault would be encountered when forcefully closing the session by sending a CTRL+C on the terminal. In fact the crash was caused by a UAF: while the cli applet was already freed, the lua task responsible for waking it up would still point to it. Thus hlua_applet_wakeup() could be called even if the applet didn't exist anymore. To fix the issue, in hlua_cli_io_release_fct() we must also free the hlua task linked to the applet, like we already do for hlua_applet_tcp_release() and hlua_applet_http_release(). While this bug exists on stable versions (where it should be backported too for precaution), it only seems to be triggered starting with 3.0.	2025-03-19 17:03:28 +01:00
Valentine Krasnobaeva	6986e3f41f	MINOR: limits: fix check_if_maxsock_permitted description Fix typo in check_if_maxsock_permitted() description.	2025-03-18 17:38:04 +01:00
Valentine Krasnobaeva	060f441199	BUG/MINOR: limits: compute_ideal_maxconn: don't cap remain if fd_hard_limit=0 'global.fd_hard_limit' stays uninitialized if haproxy is started with -m (global.rlimit_memmax). 'remain' is the MAX between the soft and hard process fd limits. It will always be bigger than 'global.fd_hard_limit' (0) in this case. So, if we unconditionally reassign 'remain' to 'global.fd_hard_limit', the 'maxconn' calculated from it can even be negative, and DEFAULT_MAXCONN (100) will be set as the 'ideal_maxconn'. During the 'global.maxconn' calculations in set_global_maxconn(), if the provided 'global.rlimit_memmax' is quite big, the system will refuse the 'global.maxconn' calculated from it and we will fall back to the 'ideal_maxconn', which is 100. Same problem for the configs with SSL frontends and backends. This fixes the issue #2899. This should be backported to v3.1.0.	2025-03-18 17:37:33 +01:00
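The guard described in 060f441199 boils down to treating a zero hard limit as "not set". The sketch below shows that idea only; it is not the body of compute_ideal_maxconn(), and the function name and parameters are hypothetical stand-ins for 'remain' and 'global.fd_hard_limit'.

```c
/* Hypothetical helper: cap the number of usable file descriptors by
 * the configured hard limit, but only when such a limit was set.
 * A value of 0 means "unset" and must leave <remain> untouched.
 */
static int cap_remaining_fds(int remain, int fd_hard_limit)
{
    if (fd_hard_limit && remain > fd_hard_limit)
        remain = fd_hard_limit;
    return remain;
}
```

Without the `fd_hard_limit &&` test, an unset limit would force the value to 0, which is how the later maxconn computation could end up negative.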
Willy Tarreau	6336b636f7	MINOR: cli/server: don't take thread isolation to check for srv-removable Thanks to the previous commits, we now know that "wait srv-removable" does not require thread isolation, as long as 3372a2ea00 ("BUG/MEDIUM: queues: Stricly respect maxconn for outgoing connections") and c880c32b16 ("MINOR: stream: decrement srv->served after detaching from the list") are present. Let's just get rid of thread_isolate() here, which can consume a lot of CPU on highly threaded machines when removing many servers at once.	2025-03-18 17:36:02 +01:00
Willy Tarreau	aad8e74cb9	CLEANUP: server: make it clear that srv_check_for_deletion() is thread-safe This function was marked as requiring thread isolation because its code was extracted from cli_parse_delete_server() and was running under isolation. But upon closer inspection, and using atomic loads to check a few counters, it is actually safe to run without isolation, so let's reflect that in its description. However, it remains true that cli_parse_delete_server() continues to call it under isolation.	2025-03-18 17:36:02 +01:00
Willy Tarreau	0e8c573b4b	MINOR: server: simplify srv_has_streams() Now that thanks to commit c880c32b16 ("MINOR: stream: decrement srv->served after detaching from the list") we can trust srv->served, let's use it and no longer loop on threads when checking if a server still has streams attached to it. This will be much cheaper and will result in keeping isolation for a shorter time in the "wait" command.	2025-03-18 17:36:02 +01:00
Aurelien DARRAGON	4651c4edd5	BUG/MINOR: hlua: fix optional timeout argument index for AppletTCP:receive() Baptiste reported that using the new optional timeout argument introduced in 19e48f2 ("MINOR: hlua: add an optional timeout to AppletTCP:receive()") the following error would occur at some point: runtime error: file.lua:lineno: bad argument #-2 to 'receive' (number expected, got light userdata) from [C]: in method 'receive... In fact this is caused by exp_date being retrieved using relative index -1 instead of absolute index 3. Indeed, while using relative index is fine most of the time when we trust the stack, when combined with yielding the top of the stack when resuming from yielding is not necessarily the same as when the function was first called (ie: if some data was pushed to the stack in the yieldable function itself). As such, it is safer to use explicit index to access exp_date variable at position 3 on the stack. It was confirmed that doing so addresses the issue. No backport needed unless 19e48f2 is.	2025-03-18 16:48:32 +01:00
Willy Tarreau	c880c32b16	MINOR: stream: decrement srv->served after detaching from the list In commit 3372a2ea00 ("BUG/MEDIUM: queues: Stricly respect maxconn for outgoing connections"), it has been ensured that srv->served is held as long as possible around the periods where a stream is attached to a server. However, it's decremented early when entering sess_change_server, and actually just before detaching from that server's list. While there is theoretically nothing wrong with this, it prevents us from looking at this counter to know if streams are still using a server or not. We could imagine decrementing it much later but that wouldn't work with leastconn, since that algo needs ->served to be final before calling lbprm.server_drop_conn(). Thus what we're doing here is to detach from the server, then decrement ->served, and only then call the LB callback to update the server's position in the tree. At this moment the stream doesn't know the server anymore anyway (except via this function's local variable) so it's safe to consider that no stream knows the server once the variable reaches zero.	2025-03-18 11:43:52 +01:00
Aurelien DARRAGON	7895726bff	BUG/MINOR: log: prevent saddr NULL deref in syslog_io_handler() In ad0133cc ("MINOR: log: handle log-forward "option host""), we de-reference saddr without first checking if saddr is NULL. In practise saddr shouldn't be null, but it may be the case if memory error happens for tcp syslog handler so we must assume that it can be NULL at some point. To fix the bug, we simply check for NULL before de-referencing it under syslog_io_handler(), as the function comment suggests. No backport needed unless ad0133cc is.	2025-03-18 00:13:19 +01:00
William Lallemand	29b4b985c3	MINOR: jws: use jwt_alg type instead of a char This patch implements the function EVP_PKEY_to_jws_algo() which returns a jwt_alg compatible with the private key. This value can then be passed to jws_b64_protected() and jws_b64_signature(), which are modified to take a jwt_alg instead of a char.	2025-03-17 18:06:34 +01:00
Willy Tarreau	19e48f237f	MINOR: hlua: add an optional timeout to AppletTCP:receive() TCP services might want to be interactive, and without a timeout on receive(), the possibilities are a bit limited. Let's add an optional timeout in the 3rd argument to possibly limit the wait time. In this case if the timeout strikes before the requested size is complete, a possibly incomplete block will be returned.	2025-03-17 16:19:34 +01:00
Valentine Krasnobaeva	557f62593f	MINOR: cpu-topo: fix unused stack var 'cpu2' reported by coverity Coverity has reported that cpu2 seems sometimes unused in cpu_fixup_topology(): *** CID 1593776: Code maintainability issues (UNUSED_VALUE) /src/cpu_topo.c: 690 in cpu_fixup_topology() 684 continue; 685 686 if (ha_cpu_topo[cpu].cl_gid != curr_id) { 687 if (curr_id >= 0 && cl_cpu <= 2) 688 small_cl++; 689 cl_cpu = 0; >>> CID 1593776: Code maintainability issues (UNUSED_VALUE) >>> Assigning value from "cpu" to "cpu2" here, but that stored value is overwritten before it can be used. 690 cpu2 = cpu; 691 curr_id = ha_cpu_topo[cpu].cl_gid; 692 } 693 cl_cpu++; 694 } 695 That's it. 'cpu2' automatic/stack variable is used only in for() loop scopes to save the IDs of the CPUs we are interested in. In the loop pointed by coverity this variable is not used for further processing within the loop's scope. Then it is always reinitialized to 0 in the following loops. This fixes GitHub issue #2895.	2025-03-17 14:53:36 +01:00
William Lallemand	de67f25a7e	MINOR: jws: add new functions in jws.h Add signatures of jws_b64_payload(), jws_b64_protected(), jws_b64_signature(), jws_flattened() which allows to create a complete JWS flattened object.	2025-03-17 11:51:52 +01:00
Willy Tarreau	e3fd9970a9	MINOR: cpu-topo: add a new "resource" cpu-policy This cpu policy keeps the smallest CPU cluster. This can be used to limit the resource usage to the strict minimum that still delivers decent performance, for example to try to further reduce power consumption or minimize the number of cores needed on some rented systems for a sidecar setup, in order to scale the system down more easily. Note that if a single cluster is present, it will still be fully used. When started on a 64-core EPYC gen3, it uses only one CCX with 8 cores and 16 threads, all in the same group.	2025-03-14 18:33:16 +01:00
Willy Tarreau	ad3650c354	MINOR: cpu-topo: add a new "efficiency" cpu-policy This cpu policy tries to evict performant core clusters and only focuses on efficiency-oriented ones. On an intel i9-14900k, we can get 525k rps using 8 performance cores, versus 405k when using all 24 efficiency cores. In some cases the power savings might be more desirable (e.g. scalability tests on a developer's laptop), or the performance cores might be better suited for another component (application or security component).	2025-03-14 18:33:16 +01:00
Willy Tarreau	dcae2fa4a4	MINOR: cpu-topo: add a new "performance" cpu-policy This cpu policy tries to evict efficient core clusters and only focuses on performance-oriented ones. On an intel i9-14900k, we can get 525k rps using only 8 cores this way, versus 594k when using all 24 cores. The gains from using all these cores are not significant enough to waste them on this. Also these cores can be much slower at doing SSL handshakes so it can make sense to evict them. Better keep the efficiency cores for network interrupts for example. Also, on a developer's machine it can be convenient to keep all these cores for the local tasks and extra tools (load generators etc).	2025-03-14 18:33:16 +01:00
Willy Tarreau	96cd420dc3	MEDIUM: cpu-topo: let the "group-by-cluster" split groups When a cluster is too large to fit into a single group, let's split it into two equal groups, which will still be allowed to use all the CPUs of the cluster. This allows haproxy to start all the threads with a minimum number of groups (e.g. 2x40 for 80 cores).	2025-03-14 18:33:16 +01:00
Willy Tarreau	8aeb096740	MINOR: cpu-topo: add cpu-policy "group-by-cluster" This policy forms thread groups from the CPU clusters, and bind all the threads in them to all the CPUs of the cluster. This is recommended on system with bad inter-CCX latencies. It was shown to simply triple the performance with queuing on a 64-core EPYC without having to manually assign the cores with cpu-map.	2025-03-14 18:33:16 +01:00
Willy Tarreau	aaa4080b8b	CLEANUP: thread: now remove the temporary CPU node binding code This is now superseded by the default "safe" cpu-policy, and every time it's used, that code was bypassed anyway since global.nbthread was set. We can now safely remove it. Note that for other policies which do not set a thread count nor further restrict CPUs (such as "none", or even "safe" when finding a single node), we continue to go through the fallback code that automatically assigns CPUs to threads and counts them.	2025-03-14 18:33:16 +01:00
Willy Tarreau	56d939866b	MEDIUM: cpu-topo: use the "first-usable-node" cpu-policy by default This now turns the cpu-policy to "first-usable-node" by default, so that we preserve the current default behavior consisting in binding to the first node if nothing was forced. If a second node is found, global.nbthread is set and the previous code will be skipped.	2025-03-14 18:33:16 +01:00
Willy Tarreau	7fc6cdd0b1	MINOR: cpu-topo: add a 'first-usable-node' cpu policy This is a reimplementation of the current default policy. It binds to the first node having usable CPUs if found, and drops CPUs from the second and next nodes.	2025-03-14 18:33:16 +01:00
Willy Tarreau	156430ceb6	MINOR: cpu-topo: add a CPU policy setting to the global section We'll need to let the user decide what's best for their workload, and in order to do this we'll have to provide tunable options. For that, we're introducing struct ha_cpu_policy which contains a name, a description and a function pointer. The purpose will be to use that function pointer to choose the best CPUs to use and now to set the number of threads and thread-groups, that will be called during the thread setup phase. The only supported policy for now is "none" which doesn't set/touch anything (i.e. all available CPUs are used).	2025-03-14 18:33:16 +01:00
Willy Tarreau	9a8e8af11a	MINOR: cpu-topo: add "only-cluster" and "drop-cluster" to cpu-set These are processed after the topology is detected, and they allow to restrict binding to or evict CPUs matching the indicated hardware cluster number(s). It can be used to bind to only some clusters, such as CCX or different energy efficiency cores. For this reason, here we use the cluster's local ID (local to the node).	2025-03-14 18:33:16 +01:00
Willy Tarreau	a946cfa8b5	MINOR: cpu-topo: add "only-core" and "drop-core" to cpu-set These are processed after the topology is detected, and they allow to restrict binding to or evict CPUs matching the indicated hardware core number(s). It can be used to bind to only some clusters as well as to evict efficient cores whose number is known.	2025-03-14 18:33:16 +01:00
Willy Tarreau	c591c9d6a6	MINOR: cpu-topo: add "only-thread" and "drop-thread" to cpu-set These are processed after the topology is detected, and they allow to restrict binding to or evict CPUs matching the indicated hardware thread number(s). It can be used to reserve even threads for HW IRQs and odd threads for haproxy for example, or to evict efficient cores that do only have thread #0.	2025-03-14 18:33:16 +01:00
Willy Tarreau	c93ee25054	MINOR: cpu-topo: add "only-node" and "drop-node" to cpu-set These are processed after the topology is detected, and they allow to restrict binding to or evict CPUs matching the indicated node(s).	2025-03-14 18:33:16 +01:00
Willy Tarreau	7263366606	MINOR: cpu-topo: ignore excess of too small clusters On some Arm systems (typically A76/N1) where CPUs can be associated in pairs, clusters are reported while they have no impact on I/O etc. Yet it's possible to have tens of clusters of 2 CPUs each, which is counterproductive since it does not even allow to start enough threads. Let's detect this situation as soon as there are at least 4 clusters having each 2 CPUs or less, which is already very suspicious. In this case, all these clusters will be reset as meaningless. In the worst case if needed they'll be re-assigned based on L2/L3.	2025-03-14 18:33:12 +01:00
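The detection threshold named in 7263366606 (at least 4 clusters of 2 CPUs or fewer) can be sketched as a simple count. This is an illustration under assumptions: the per-cluster CPU counts are passed as a plain array and the function name is hypothetical, whereas the real code walks haproxy's own CPU topology array.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 when at least 4 clusters contain
 * 2 CPUs or fewer, i.e. when the reported clusters look meaningless
 * and should be reset, otherwise 0.
 */
static int too_many_small_clusters(const int *cpus_per_cluster,
                                   size_t nb_clusters)
{
    size_t i, small = 0;

    for (i = 0; i < nb_clusters; i++) {
        if (cpus_per_cluster[i] <= 2)
            small++;
    }
    return small >= 4;
}
```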
Willy Tarreau	aa4776210b	MINOR: cpu-topo: create an array of the clusters The goal here is to keep an array of the known CPU clusters, because we'll use that often to decide of the performance of a cluster and its relevance compared to other ones. We'll store the number of CPUs in it, the total capacity etc. For the capacity, we count one unit per core, and 1/3 of it per extra SMT thread, since this is roughly what has been measured on modern CPUs. In order to ease debugging, they're also dumped with -dc.	2025-03-14 18:30:31 +01:00
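The capacity rule quoted in aa4776210b (one unit per core, one third of a unit per extra SMT thread) can be written with integers by counting in thirds of a unit. The unit choice and the function name are assumptions made for this sketch; the log does not say how the real code scales the value.

```c
/* Hypothetical helper: capacity of a cluster expressed in thirds of
 * a core unit, so that no floating point is needed:
 *   - each core counts for 3 thirds (one full unit)
 *   - each SMT thread beyond the first one of a core counts for 1 third
 * <cores> is the number of physical cores, <threads> the total number
 * of hardware threads in the cluster.
 */
static unsigned int cluster_capa_thirds(unsigned int cores,
                                        unsigned int threads)
{
    unsigned int extra = threads > cores ? threads - cores : 0;

    return 3 * cores + extra;
}
```

For instance, 8 cores exposing 16 threads give 32 thirds, a bit under 11 core units, while 4 single-threaded cores give 12 thirds, exactly 4 units.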
Willy Tarreau	204ac3c0b6	MINOR: cpu-topo: consider capacity when forming clusters By using the cluster+capacity sorting function we can detect heterogeneous clusters which are not properly reported. Thanks to this, the following misnumbered machine featuring 4 big cores, 4 medium ones and 4 small ones is properly detected with its clusters correctly assigned: [keep] thr= 0 -> cpu= 0 pk=00 no=00 cl=000 ts=000 capa=1024 [keep] thr= 1 -> cpu= 1 pk=00 no=00 cl=002 ts=008 capa=278 [keep] thr= 2 -> cpu= 2 pk=00 no=00 cl=002 ts=009 capa=278 [keep] thr= 3 -> cpu= 3 pk=00 no=00 cl=002 ts=010 capa=278 [keep] thr= 4 -> cpu= 4 pk=00 no=00 cl=002 ts=011 capa=278 [keep] thr= 5 -> cpu= 5 pk=00 no=00 cl=001 ts=004 capa=905 [keep] thr= 6 -> cpu= 6 pk=00 no=00 cl=001 ts=005 capa=905 [keep] thr= 7 -> cpu= 7 pk=00 no=00 cl=001 ts=006 capa=866 [keep] thr= 8 -> cpu= 8 pk=00 no=00 cl=001 ts=007 capa=866 [keep] thr= 9 -> cpu= 9 pk=00 no=00 cl=000 ts=001 capa=984 [keep] thr= 10 -> cpu= 10 pk=00 no=00 cl=000 ts=002 capa=984 [keep] thr= 11 -> cpu= 11 pk=00 no=00 cl=000 ts=003 capa=1024 Also this has the benefit of always assigning highest performance clusters with the smallest IDs so that simple configs can decide to simply bind to cluster 0 or clusters 0,1 and benefit from optimal performance.	2025-03-14 18:30:31 +01:00
Willy Tarreau	4a6eaf6c5e	MINOR: cpu-topo: add a function to sort by cluster+capacity The purpose here is to detect heterogenous clusters which are not properly reported, based on the exposed information about the cores capacity. The algorithm here consists in sorting CPUs by capacity within a cluster, and considering as equal all those which have 5% or less difference in capacity with the previous one. This allows large clusters of more than 5% total between extremities, while keeping apart those where the limit is more pronounced. This is quite common in embedded environments with big.little systems, as well as on some laptops.	2025-03-14 18:30:31 +01:00
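The grouping idea from 4a6eaf6c5e (sort by capacity, then treat a CPU as equal to the previous one when it is at most 5% below it) can be sketched as follows. This is not the cpu-topo code: it works on a bare array of capacities for a single cluster, and the function names and the way groups are returned are assumptions for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* qsort() callback: highest capacity first */
static int cmp_capa_desc(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;

    return (x < y) - (x > y);
}

/* Hypothetical helper: sorts <capa> in descending order, then stores
 * in group[i] the group index of the CPU at sorted position i. A new
 * group starts whenever a capacity is more than 5% below the previous
 * one. Returns the number of groups found.
 */
static int group_by_capacity(int *capa, int *group, size_t n)
{
    size_t i;
    int cur = 0;

    if (n == 0)
        return 0;

    qsort(capa, n, sizeof(*capa), cmp_capa_desc);
    group[0] = 0;
    for (i = 1; i < n; i++) {
        /* compare to the previous entry, not to the group's first one,
         * so a long chain of small steps stays in a single group
         */
        if ((long)(capa[i - 1] - capa[i]) * 100 > (long)capa[i - 1] * 5)
            cur++;
        group[i] = cur;
    }
    return cur + 1;
}
```

Fed with the twelve capa= values from the dump in 204ac3c0b6 (1024, 984, 905, 866 and 278 variants), this sketch yields three groups: 1024/984, 905/866 and 278.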
Willy Tarreau	0290b807dd	MINOR: cpu-topo: renumber cores to avoid holes and make them contiguous Due to the way core numbers are assigned and the presence of SMT on some of them, some holes may remain in the array. Let's renumber them to plug holes once they're known, following pkg/node/die/llc etc, so that they're local to a (pkg,node) set. Now an i7-14700 shows cores 0 to 19, not 0 to 27.	2025-03-14 18:30:31 +01:00
Willy Tarreau	b633b9d422	MINOR: cpu-topo: assign an L3 cache if more than 2 L2 instances On some machines, L3 is not always reported (e.g. on some lx2 or some armada8040). But some also don't have L3 (core 2 quad). However, no L3 when there are more than 2 L2 is quite unheard of, and while we don't really care about firing 2 thread groups for 2 L2, we'd rather avoid doing this if there are 8! In this case we'll declare an L3 instance to fix the situation. This allows small machines to continue to start with two groups while not derivating on large ones.	2025-03-14 18:30:31 +01:00
Willy Tarreau	d169758fa9	MINOR: cpu-topo: make sure we don't leave unassigned IDs in the cpu_topo It's important that we don't leave unassigned IDs in the topology, because the selection mechanism is based on index-based masks, so an unassigned ID will never be kept. This is particularly visible on systems where we cannot access the CPU topology, the package id, node id and even thread id are set to -1, and all CPUs are evicted due to -1 not being set in the "only-cpu" sets. Here in new function "cpu_fixup_topology()", we assign them with the smallest unassigned value. This function will be used to assign IDs where missing in general.	2025-03-14 18:30:31 +01:00
Willy Tarreau	af648c7b58	MINOR: cpu-topo: assign clusters to cores without and renumber them Due to the previous commit we can end up with cores not assigned any cluster ID. For this, at the end we sort the CPUs by topology and assign cluster IDs to remaining CPUs based on pkg/node/llc. For example an 14900 now shows 5 clusters, one for the 8 p-cores, and 4 of 4 e-cores each. The local cluster numbers are per (node,pkg) ID so that any rule could easily be applied on them, but we also keep the global numbers that will help with thread group assignment. We still need to force to assign distinct cluster IDs to cores running on a different L3. For example the EPYC 74F3 is reported as having 8 different L3s (which is true) and only one cluster. Here we introduce a new function "cpu_compose_clusters()" that is called from the main init code just after cpu_detect_topology() so that it's not OS-dependent. It deals with this renumbering of all clusters in topology order, taking care of considering any distinct LLC as being on a distinct cluster.	2025-03-14 18:30:31 +01:00
Willy Tarreau	385360fe81	MINOR: cpu-topo: ignore single-core clusters Some platforms (several armv7, intel 14900 etc) report one distinct cluster per core. This is problematic as it cannot let clusters be used to distinguish real groups of cores, and cannot be used to build thread groups. Let's just compare the cluster cpus to the siblings, and ignore it if they exactly match. We must also take care of not falling back to core_cpus_list, which can enumerate cores that already have their cluster assigned (e.g. intel 14900 has 4 4-Ecore clusters in addition to the 8 Pcores).	2025-03-14 18:30:31 +01:00
Willy Tarreau	a4471ea56d	MINOR: cpu-topo: implement a CPU sorting mechanism by cluster ID This will be used to detect and fix incorrect setups which report the same cluster ID for multiple L3 instances. The arrangement of functions in this file is becoming a real problem. Maybe we should move all this to cpu_topo for example, and better distinguish OS-specific and generic code.	2025-03-14 18:30:31 +01:00
Willy Tarreau	a8acdbd9fd	MINOR: cpu-topo: implement a sorting mechanism by CPU locality Once we've kept only the CPUs we want, the next step will be to form groups and these ones are based on locality. Thus we'll have to sort by locality. For now the locality is only inferred by the index. No grouping is made at this point. For this we add the "cpu_reorder_by_locality" function with a locality-based comparison function.	2025-03-14 18:30:31 +01:00
Willy Tarreau	18133a054d	MINOR: cpu-topo: implement a sorting mechanism for CPU index CPU selection will be performed by sorting CPUs according to various criteria. For dumps however, that's really not convenient and we'll need to reorder the CPUs according to their index only. This is what the new function cpu_reorder_by_index() does. It's called in thread_detect_count() before dumping the CPU topology.	2025-03-14 18:30:31 +01:00
Willy Tarreau	661d49a18a	MINOR: cpu-topo: skip CPU properties that we've verified do not exist A number of entries under /cpu/cpu%d only exist on certain kernel versions, certain archs and/or with certain modules loaded. It's pointless to insist on trying to read them all for all CPUs when we've already verified they do not exist. Thus let's use stat() the first time prior to checking some of them, and only try to access them when they really exist. This almost completely eliminates the large number of ENOENT that was visible in strace during startup.	2025-03-14 18:30:31 +01:00
Willy Tarreau	baeea08dba	MINOR: cpu-topo: skip identification of non-existing CPUs There's no point trying to read all entries under /cpu/cpu%d when that one does not exist, so let's just skip it in this case.	2025-03-14 18:30:31 +01:00
Willy Tarreau	8542c79f9d	MINOR: cpu-topo: skip CPU detection when /sys/.../cpu does not exist There's no point scanning all entries when /cpu doesn't exist in the first place. Let's check once for it and skip the loop in this case.	2025-03-14 18:30:30 +01:00
Willy Tarreau	c5ddf4a5b2	MINOR: cpu-topo: boost the capacity of performance cores with cpufreq Cpufreq alone isn't a good metric on heterogenous CPUs because efficient cores can reach almost as high frequencies as performant ones. Tests have shown that majoring performance cores by 50% gives a pretty accurate estimate of the performance to expect on modern CPUs, and that counting +33% per extra SMT thread is reasonable as well. We don't have the info about the core's quality, but using the presence of SMT is a reasonable approach in this case, given that efficiency cores will not use it. As an example, using one thread of each of the 8 P-cores of an intel i9-14900k gives 395k rps for a corrected total capacity of 69.3k, using the 16 E-cores gives 40.5k for a total capacity of 70.4k, and using both threads of 6 P-cores gives 41.1k for a total capacity of 69.6k. Thus the 3 same scores deliver the same performance in various combinations.	2025-03-14 18:30:30 +01:00
Willy Tarreau	e4aa13e786	MINOR: cpu-topo: use cpufreq before acpi cppc The acpi_cppc method was found to take about 5ms per CPU on a 64-core EPYC system, which is plain unacceptable as it delays the boot by half a second. Let's use the less accurate cpufreq first, which should be sufficient anyway since many systems do not have acpi_cppc. We'll only fall back to acpi_cppc for systems without cpufreq. If it were to be an issue over time, we could also automatically consider that all threads of the same core or even of the same cluster run at the same speed (when a cluster is known to be accurate).	2025-03-14 18:30:30 +01:00
Willy Tarreau	d11241b7ba	MINOR: cpu-topo: fall back to nominal_perf and scaling_max_freq for the capacity When cpu_capacity is not present, let's try to check acpi_cppc's nominal_perf which is similar and commonly found on servers, then scaling_max_freq (though that last one may vary a bit between CPUs depending on die quality). That variation is not a problem since we can absorb a ~5% variation without issue. It was verified on an i9-14900 featuring 5.7-P, 6.0-P and 4.4-E GHz that P-cores were not reordered and that E cores were placed last. It was also OK on a W3-2345 with 4.3 to 4.5GHz.	2025-03-14 18:30:30 +01:00
Willy Tarreau	322c28cc19	MINOR: cpu-topo: refine cpu dump output to better show kept/dropped CPUs It's becoming difficult to see which CPUs are going to be kept/dropped. Let's just skip all offline CPUs, and indicate "keep" in front of those that are going to be used, and "----" in front of the excluded ones. It is way more readable this way. Also let's just drop the array entry number, since it's always the same as the CPU number and is only an internal representation anyway.	2025-03-14 18:30:30 +01:00
Willy Tarreau	f1210ee7c6	MEDIUM: cfgparse: remove now unused numa & thread-count detection This is not needed anymore since it is already done before landing here via thread_detect_count().	2025-03-14 18:30:30 +01:00
Willy Tarreau	e3aef4c9a4	MEDIUM: thread: reimplement first numa node detection Let's reimplement automatic binding to the first NUMA node when thread count is not forced. It's the same thing as is already done in check_config_validity() except that this time it's based on the collected CPU information. The threads are automatically counted and CPUs from non-first node(s) are evicted.	2025-03-14 18:30:30 +01:00
Willy Tarreau	4a525e8d27	MEDIUM: cpu-topo: make sure to properly assign CPUs to threads as a fallback If no cpu-map is done and no cpu-policy could be enforced, we still need to count the number of usable CPUs, assign them to all threads and set the nbthread value accordingly. This already handles the part that was done in check_config_validity() via thread_cpus_enabled_at_boot.	2025-03-14 18:30:30 +01:00
Willy Tarreau	1af4942c95	MEDIUM: thread: start to detect thread groups and threads min/max By mutually refining the thread count and group count, we can try to detect the most suitable setup for the current machine. Taskset is implicitly handled correctly. tgroups automatically adapt to the configured number of threads. cpu-map manages to limit tgroups to the smallest supported value. The thread-limit is enforced. Just like in cfgparse, if the thread count was forced to a higher value, it's reduced and a warning is emitted. But if it was not set, the thr_max value is bound to this limit so that further calculations respect it. We continue to default to the max number of available threads and 1 tgroup by default, with the limit. This normally allows to get rid of that test in check_config_validity().	2025-03-14 18:30:30 +01:00
Willy Tarreau	68069e4b27	MINOR: cpu-topo: add "drop-cpu" and "only-cpu" to cpu-set These allow respectively to disable binding to CPUs listed in a set, and to disable binding to CPUs not in a set.	2025-03-14 18:30:30 +01:00
Willy Tarreau	cda4956d9c	MINOR: cpu-topo: add a new "cpu-set" global directive to choose cpus For now it's limited, it only supports "reset" to ask that any previous "taskset" be ignored. The goal will be to later add more actions that allow to symbolically define sets of cpus to bind to or to drop. This also clears the cpu_mask_forced variable that is used to detect that a taskset had been used.	2025-03-14 18:30:30 +01:00
Willy Tarreau	f0661e79fe	MINOR: global: add a command-line option to enable CPU binding debugging During development, everything related to CPU binding and the CPU topology is debugged using state dumps at various places, but it does make sense to have a real command line option so that this remains usable in production to help users figure why some CPUs are not used by default. Let's add "-dc" for this. Since the list of global.tune.options values is almost full and does not 100% match this option, let's add a new "tune.debug" field for this.	2025-03-14 18:30:30 +01:00
Willy Tarreau	94543d7b65	MINOR: cfgparse: use already known offline CPU information No need to reparse cpu/online, let's just rely on the info we learned previously about offline CPUs.	2025-03-14 18:30:30 +01:00
Willy Tarreau	1560827c9d	MINOR: cfgparse: move the binding detection into numa_detect_topology() For now the function refrains from detecting the CPU topology when a restrictive taskset or cpu-map was already performed on the process, and it's documented as such, the reason being that until we're able to automatically create groups, better not change user settings. But we'll need to be able to detect bound CPUs and to process them as desired by the user, so we now need to move that detection into the function itself. It changes nothing to the logic, just gives more freedom to the function.	2025-03-14 18:30:30 +01:00
Willy Tarreau	ac1db9db7d	MINOR: thread: turn thread_cpu_mask_forced() into an init-time variable The function is not convenient because it doesn't allow us to undo the startup changes, and depending on where it's being used, we don't know whether the values read have already been altered (this is not the case right now but it's going to evolve). Let's just compute the status during cpu_detect_usable() and set a variable accordingly. This way we'll always read the init value, and if needed we can even afford to reset it. Also, placing it in cpu_topo.c limits cross-file dependencies (e.g. threads without affinity etc).	2025-03-14 18:30:30 +01:00
Willy Tarreau	3a7cc676fa	MINOR: cpu-topo: add NUMA node identification to CPUs on FreeBSD With this patch we're also assigning NUMA node IDs to each CPU when the info is found. The code is highly inspired from the one in commit f5d48f8b3 ("MEDIUM: cfgparse: numa detect topology on FreeBSD."), the difference being that we're just setting the value in ha_cpu_topo[].	2025-03-14 18:30:30 +01:00
Willy Tarreau	f6154c079e	MINOR: cpu-topo: add NUMA node identification to CPUs on Linux With this patch we're also assigning NUMA node IDs to each CPU when one is found. The code is highly inspired from the one in commit b56a7c89a ("MEDIUM: cfgparse: detect numa and set affinity if needed") that already did the job, except that it could be simplified since we're just collecting info to fill the ha_cpu_topo[] array.	2025-03-14 18:30:30 +01:00
Willy Tarreau	65612369e7	MINOR: cpu-topo: also store the sibling ID with SMT The sibling ID was not reported because it's not directly accessible but we don't care, what matters is that we assign numbers to all the threads we find using the same CPU so that some strategies permit to allocate one thread at a time if we want to use few threads with max performance.	2025-03-14 18:30:30 +01:00
Willy Tarreau	7cb274439b	MINOR: cpu-topo: add CPU topology detection for linux This uses the publicly available information from /sys to figure the cache and package arrangements between logical CPUs and fill ha_cpu_topo[], as well as their SMT capabilities and relative capacity for those which expose this. The functions clearly have to be OS-specific.	2025-03-14 18:30:30 +01:00
Willy Tarreau	12f3a2bbb7	MINOR: cpu-topo: try to detect offline cpus at boot When possible, the offline CPUs are detected at boot and their OFFLINE flag is set in the ha_cpu_topo[] array. When the detection is not possible (e.g. not linux, /sys not mounted etc), we just mark none of them as being offline, as we don't want to infer wrong info that could hinder automatic CPU placement detection. When valid, we take this opportunity for refining cpu_topo_lastcpu so that we don't need to manipulate CPUs beyond this value.	2025-03-14 18:30:30 +01:00
Willy Tarreau	44881e5abf	MINOR: cpu-topo: add detection of online CPUs on FreeBSD On FreeBSD we can detect online CPUs at least by doing the bitwise-OR of the CPUs of all domains, so we're using this and adding this detection to ha_cpuset_detect_online(). If we find simpler later, we can always rework it, but it's reasonably inexpensive since we only check existing domains.	2025-03-14 18:30:30 +01:00
Willy Tarreau	8f72ce335a	MINOR: cpu-topo: add detection of online CPUs on Linux This adds a generic function ha_cpuset_detect_online() which for now only supports linux via /sys. It fills a cpuset with the list of online CPUs that were detected (or returns a failure).	2025-03-14 18:30:30 +01:00
Willy Tarreau	8c524c7c9d	REORG: cpu-topo: move bound cpu detection from cpuset to cpu-topo The cpuset files are normally used only for cpu manipulations. It happens that the initial CPU binding detection was initially placed there since there was no better place, but in practice, being OS-specific, it should really be in cpu-topo. This simplifies cpuset which doesn't need to know about the OS anymore.	2025-03-14 18:30:30 +01:00
Willy Tarreau	a6fdc3eaf0	MINOR: cpu-topo: update CPU topology from excluded CPUs at boot Now before trying to resolve the thread assignment to groups, we detect which CPUs are not bound at boot so that we can mark them with HA_CPU_F_EXCLUDED. This will be useful to better know on which CPUs we can count later. Note that we purposely ignore cpu-map here as we don't know how threads and groups will map to cpu-map entries, hence which CPUs will really be used. It's important to proceed this way so that when we have no info we assume they're all available.	2025-03-14 18:30:30 +01:00
Willy Tarreau	bdb731172c	MINOR: cpu-topo: add a function to dump CPU topology The new function cpu_dump_topology() will centralize most debugging calls, and it can make efforts of not dumping some possibly irrelevant fields (e.g. non-existing cache levels).	2025-03-14 18:30:30 +01:00
Willy Tarreau	041462c4af	MINOR: cpu-topo: rely on _SC_NPROCESSORS_CONF to trim maxcpus We don't want to constantly deal with as many CPUs as a cpuset can hold, so let's first try to trim the value to what the system claims to support via _SC_NPROCESSORS_CONF. It is obviously still subject to the limit of the cpuset size though. The value is stored globally so that we can reuse it elsewhere after initialization.	2025-03-14 18:30:30 +01:00
Willy Tarreau	656cedad42	MINOR: cpu-topo: allocate and initialize the ha_cpu_topo array. This does the bare minimum to allocate and initialize a global ha_cpu_topo array for the number of supported CPUs and release it at deinit time.	2025-03-14 18:30:30 +01:00
Willy Tarreau	d165f5d3ab	MINOR: cpu-topo: add ha_cpu_topo definition This structure will be used to store information about each CPU's topology (package ID, L3 cache ID, NUMA node ID etc). This will be used in conjunction with CPU affinity setting to try to perform a mostly optimal binding between threads and CPU numbers by default. Since it was noticed during tests that absolutely none of the many machines tested reports different die numbers, the die_id is not stored. Also, it was found along experiments that the cluster ID will be used a lot, half of the time as a node-local identifier, and half of the time as a global identifier. So let's store the two versions at once (cl_gid, cl_lid). Some flags are added to indicate causes of exclusion (offline, excluded at boot, excluded by rules, ignored by policy).	2025-03-14 18:30:30 +01:00
Willy Tarreau	05a4efb102	MINOR: thread: rely on the cpuset functions to count bound CPUs let's just clean up the thread_cpus_enabled() code a little bit by removing the OS-specific code and rely on ha_cpuset_detect_bound() instead. On macos we continue to use sysconf() for now.	2025-03-14 18:30:30 +01:00
Willy Tarreau	32bb68e736	MINOR: cpuset: make the API support negative CPU IDs Negative IDs are very convenient to mean "not set", so let's just make the cpuset API robust against this, especially with ha_cpuset_isset() so that we don't have to manually add this check everywhere when a value is not known.	2025-03-14 18:30:30 +01:00
Willy Tarreau	f156baf8ce	DOC: design-thoughts: commit numa-auto.txt Lots of collected data and observations aggregated into a single commit so as not to lose them. Some parts below come from several commit messages and are incremental. Add captures and analysis of intel 14900 where it's not easy to draw the line between the desired P and E cores. The 14900 raises some questions (imagine a dual-die variant in multi-socket). That's the start of an algorithmic distribution of performance cores into thread groups. cpu-map currently conflicts a lot with the choices after auto-detection but it doesn't have to. The problem is the inability to configure the threads for the whole process like taskset does. By offering this ability we can also start to designate groups of CPUs symbolically (package, die, ccx, cores, smt). It can also be useful to exploit the info from cpuinfo that is not available in /sys, such as the model number. At least on arm, higher numbers indicate bigger cores and can be useful to distinguish cores inside a cluster. It will not indicate big vs medium ones of the same type (e.g. a78 3.0 vs 2.4 GHz) but can still be effective at identifying the efficient ones. In short, info such as the cluster ID is not always reliable, and is local to the package. The same goes for die_id. The die number is not reported here but should definitely be used, as a higher priority than L3. We're still missing a discriminant between the l3 and cluster number in order to address heterogeneous CPUs (e.g. intel 14900), though in terms of locality that's currently done correctly. CPU selection is also a full topic, and some thoughts were noted regarding sorting by perf vs locality so as never to mix inter-socket CPUs due to sorting. The proposed cpu-selection cannot work as-is, because it acts both on restriction and preference, and these two are not actions but a sequence. 
First restrictions must be enforced, and second the remaining CPUs are sorted according to the preferred criterion, and a number of threads are selected. Currently we refine the OS-exposed cluster number but it's not correct as we can end up with something poorly numbered. We need to respect the LLC in any case so let's explain the approach.	2025-03-14 18:30:30 +01:00
Willy Tarreau	0ceb1f2c51	DEV: ncpu: also emulate sysconf() for _SC_NPROCESSORS_* This is also needed in order to make the requested number of CPUs appear. For now we don't reroute to the original sysconf() call so we return -1,EINVAL for all other info.	2025-03-14 18:30:30 +01:00
Willy Tarreau	ed75148ca0	BUILD: tools: avoid a build warning on gcc-4.8 in resolve_sym_name() A build warning is emitted with gcc-4.8 in tools.c since commit e920d73f59 ("MINOR: tools: improve symbol resolution without dl_addr") because the compiler doesn't see that <size> is necessarily initialized. Let's just preset it.	2025-03-14 18:30:30 +01:00
Willy Tarreau	4e09789644	MINOR: tools: teach resolve_sym_name() a few more common symbols This adds run_poll_loop, run_tasks_from_lists, process_runnable_tasks, ha_dump_backtrace and cli_io_handler which are fairly common in backtraces. This will result in fewer symbols being reported relative to main when dladdr is not usable.	2025-03-13 17:31:16 +01:00
Willy Tarreau	a3582a77f7	MINOR: tools: ease the declaration of known symbols in resolve_sym_name() Let's have a macro that declares both the symbol and its name, it will avoid the risk of introducing typos, and encourages adding more when needed. The macro also takes an optional second argument to permit an inline declaration of an extern symbol.	2025-03-13 17:30:48 +01:00
Willy Tarreau	e920d73f59	MINOR: tools: improve symbol resolution without dl_addr When dl_addr is not usable or fails, better fall back to the closest symbol among the known ones instead of providing everything relative to main. Most often, the location of the function will give some hints about what it can be. Thus now we can emit fct+0xXXX in addition to main+0xXXX or main-0xXXX. We keep a margin of +256kB maximum after a function for a match, which is around the maximum size met in an object file, otherwise it becomes pointless again.	2025-03-13 17:30:48 +01:00
Willy Tarreau	1e99efccef	MINOR: cli: export cli_io_handler() to ease symbol resolution It's common to meet this function in backtraces, it's a bit annoying that it's not resolved, so let's export it so that it becomes resolvable.	2025-03-13 17:30:48 +01:00
Aurelien DARRAGON	8311be5ac6	BUG/MINOR: stats: fix capabilities and hide settings for some generic metrics Performing a diff on stats output before vs after commit 66152526 ("MEDIUM: stats: convert counters to new column definition") revealed that some metrics were not properly ported to the new API. Namely, "lbtot", "cli_abrt" and "srv_abrt" are now exposed on frontend and listeners while it was not the case before. Also, "hrsp_other" is exposed even when "mode http" wasn't set on the proxy. In this patch we restore the original behavior by fixing the capabilities and hide settings. As this could be considered as a minor regression (looking at the commit message it doesn't seem intended), better tag this as a bug. It should be backported in 3.0 with 66152526.	2025-03-13 11:49:18 +01:00
Aurelien DARRAGON	4c3eb60e70	DOC: management: rename some last occurences from domain "dns" to "resolvers" This is a complementary patch to cf913c2f9 ("DOC: management: rename show stats domain cli "dns" to "resolvers"). The doc still referred to the legacy "dns" domain filter for the stat command. Let's rename those occurrences to "resolvers". It may be backported to all stable versions.	2025-03-13 11:49:10 +01:00
Willy Tarreau	78ef52dbd1	BUILD: backend: silence a build warning when threads are disabled Since commit 8de8ed4f48 ("MEDIUM: connections: Allow taking over connections from other tgroups.") we got this partially absurd build warning when disabling threads: src/backend.c: In function 'conn_backend_get': src/backend.c:1371:27: warning: array subscript [0, 0] is outside array bounds of 'struct tgroup_info[1]' [-Warray-bounds] The reason is that gcc sees that curtgid is not equal to tgid which is defined as 1 in this case, thus it figures that tgroup_info[curtgid-1] will be anything but zero and that doesn't fit. It is ridiculous as it is a perfect case of dead code elimination which should not warrant a warning. Nevertheless we know we don't need to do this when threads are disabled and in this case there will not be more than 1 thread group, so we can happily use that preliminary test to help the compiler eliminate the dead condition and avoid spitting this warning. No backport is needed.	2025-03-12 18:16:14 +01:00
Willy Tarreau	b61ed9babe	BUILD: tools: silence a build warning when USE_THREAD=0 The dladdr_lock that was added to avoid re-entering into dladdr is conditioned by threads, but the way it's declared causes a build warning if threads are disabled due to the insertion of a lone semi colon in the variables block. Let's switch to __decl_thread_var() for this. This can be backported wherever commit eb41d768f9 ("MINOR: tools: use only opportunistic symbols resolution") is backported. It relies on these previous two commits: bb4addabb7 ("MINOR: compiler: add a simple macro to concatenate resolved strings") 69ac4cd315 ("MINOR: compiler: add a new __decl_thread_var() macro to declare local variables")	2025-03-12 18:11:14 +01:00
Willy Tarreau	69ac4cd315	MINOR: compiler: add a new __decl_thread_var() macro to declare local variables __decl_thread() already exists but is more suited for struct members. When using it in a variables block, it appends the final trailing semi-colon which is a statement that ends the variable block. Better clean this up and have one precisely for variable blocks. In this case we can simply define an unused enum value that will consume the semi-colon. That's what the new macro __decl_thread_var() does.	2025-03-12 18:08:12 +01:00
Willy Tarreau	bb4addabb7	MINOR: compiler: add a simple macro to concatenate resolved strings It's often useful to be able to concatenate strings after resolving them (e.g. __FILE__, __LINE__ etc). Let's just have a CONCAT() macro to do that, which calls _CONCAT() with the same arguments to make sure the contents are resolved before being concatenated.	2025-03-12 18:06:55 +01:00
Willy Tarreau	12383fd9f5	BUG/MEDIUM: thread: use pthread_self() not ha_pthread[tid] in set_affinity A bug was uncovered by the work on NUMA. It only triggers in the CI with libmusl due to a race condition. What happens is that the call to set_thread_cpu_affinity() is done very early in the polling loop, and that it relies on ha_pthread[tid] instead of pthread_self(). The problem is that ha_pthread[tid] is only set by the return from pthread_create(), which might happen later depending on the number of CPUs available to run the starting thread. Let's just use pthread_self() here. ha_pthread[] is only used to send signals between threads, there's no point in using it here. This can be backported to 2.6.	2025-03-12 15:59:23 +01:00
Aurelien DARRAGON	e942305214	MEDIUM: log: change default "host" strategy for log-forward section Historically, log-forward proxy used to preserve host field from input message as much as possible, and if syslog host wasn't provided (rfc5424 '-' or bad rfc3164 or rfc5424 message) then "localhost" or "-" would be used as host when outputting message using rfc3164 or rfc5424. We change that behavior (which corresponds to "keep" host option), so that log-forward now uses "fill" strategy as default: if the host is provided in input message, it is preserved. However if it is missing and IP address from sender is available, we use it.	2025-03-12 10:55:49 +01:00
Aurelien DARRAGON	ad0133cc50	MINOR: log: handle log-forward "option host" Following previous patch, we now implement the logic for the host option under log-forward section. Possible strategies are: replace If input message already contains a value for the host field, we replace it by the source IP address from the sender. If input message doesn't contain a value for the host field (ie: '-' as input rfc5424 message or non compliant rfc3164 or rfc5424 message), we use the source IP address from the sender as host field. fill If input message already contains a value for the host field, we keep it. If input message doesn't contain a value for the host field (ie: '-' as input rfc5424 message or non compliant rfc3164 or rfc5424 message), we use the source IP address from the sender as host field. keep If input message already contains a value for the host field, we keep it. If input message doesn't contain a value for the host field, we set it to localhost (rfc3164) or '-' (rfc5424). (This is the default) append If input message already contains a value for the host field, we append a comma followed by the IP address from the sender. If input message doesn't contain a value for the host field, we use the source IP address from the sender. Default value (unchanged) is "keep" strategy. option host is only relevant with rfc3164 or rfc5424 format on log targets. Also, if the source address is not available (ie: UNIX socket), default behavior prevails. Documentation was updated.	2025-03-12 10:52:07 +01:00
Aurelien DARRAGON	003fe530ae	MINOR: log: add "option host" log-forward option add only the parsing part, options are currently unused	2025-03-12 10:51:35 +01:00
Aurelien DARRAGON	47f14be9f3	MINOR: tools: only print address in sa2str() when port == -1 Support special value for port in sa2str: if port is equal to -1, only print the address without the port, also ignoring <map_ports> value.	2025-03-12 10:51:20 +01:00
Aurelien DARRAGON	2de62d0461	MINOR: log: provide source address information in syslog_process_message() provide struct sockaddr_storage pointer from the message sender in syslog_process_message()	2025-03-12 10:50:30 +01:00
Aurelien DARRAGON	bc76f6dde9	MINOR: log: migrate log-forward options from proxy->options2 to options3 Migrate recently added log-forward section options, currently stored under proxy->options2 to proxy->options3 since proxy->options2 is running out of space and we plan on adding more log-forward options.	2025-03-12 10:50:03 +01:00
Aurelien DARRAGON	cc5a66212d	MINOR: proxy: add proxy->options3 proxy->options2 is almost full, yet we will add new log-forward options in upcoming patches so we anticipate that by adding a new {no_}options3 and cfg_opts3[] to further extend proxy options	2025-03-12 10:49:36 +01:00
Aurelien DARRAGON	d47e7103b8	CLEANUP: log: add syslog_process_message() helper Prevent code duplication under syslog_fd_handler() and syslog_io_handler() by merging the common code path into a single syslog_process_message() helper that processes a single message stored in <buf> according to <frontend> settings.	2025-03-12 10:49:18 +01:00
Aurelien DARRAGON	8b8520305e	CLEANUP: log-forward: remove useless options2 init It is actually not required to zero out proxy->options2 since proxy is allocated using calloc() which already does it.	2025-03-12 10:49:08 +01:00
William Lallemand	c6e6318125	CI: github: add "jose" to apt dependencies jose is used in the JWS unit-test, let's add it to the CI.	2025-03-11 22:29:40 +01:00
William Lallemand	d014d7ee72	TESTS: jws: implement a test for JWS signing This test returns a JWS payload signed with a specified private key in the PEM format, and uses the "jose" command tool to check if the signature is correct against the jwk public key. The test could be improved later by using the code from jwt.c allowing to check a signature.	2025-03-11 22:29:40 +01:00
William Lallemand	3abb428fc8	MINOR: jws: implement JWS signing This commit implements JWS signing, which is divided in 3 parts: - jws_b64_protected() creates a JWS "protected" header, which takes the algorithm, kid or jwk, nonce and url as input, and fills a destination buffer with the base64url version of the header - jws_b64_payload() just encodes a payload in base64url - jws_b64_signature() generates a signature using as input the protected header and the payload, it supports ES256, ES384 and ES512 for ECDSA keys, and RS256 for RSA ones. The RSA signature just uses the EVP_DigestSign() API with its result encoded in base64url. For ECDSA it's a little bit more complicated, and should follow section 3.4 of RFC7518, R and S should be padded to byte size. Then the JWS can be output with jws_flattened() which just formats the 3 base64url outputs in a JSON representation with the 3 fields, protected, payload and signature.	2025-03-11 22:29:40 +01:00
Willy Tarreau	3cbeb6a74b	[RELEASE] Released version 3.2-dev7 Released version 3.2-dev7 with the following main changes : - BUG/MEDIUM: applet: Don't handle EOI/EOS/ERROR is applet is waiting for room - BUG/MEDIUM: spoe/mux-spop: Introduce an NOOP action to deal with empty ACK - BUG/MINOR: cfgparse: fix NULL ptr dereference in cfg_parse_peers - BUG/MEDIUM: uxst: fix outgoing abns address family in connect() - REGTESTS: fix reg-tests/server/abnsz.vtc - BUG/MINOR: log: fix outgoing abns address family - BUG/MINOR: sink: add tempo between 2 connection attempts for sft servers - MINOR: clock: always use atomic ops for global_now_ms - CI: QUIC Interop: clean old docker images - BUG/MINOR: stream: do not call co_data() from __strm_dump_to_buffer() - BUG/MINOR: mux-h1: always make sure h1s->sd exists in h1_dump_h1s_info() - MINOR: tinfo: add a new thread flag to indicate a call from a sig handler - BUG/MEDIUM: stream: never allocate connection addresses from signal handler - MINOR: freq_ctr: provide non-blocking read functions - BUG/MEDIUM: stream: use non-blocking freq_ctr calls from the stream dumper - MINOR: tools: use only opportunistic symbols resolution - CLEANUP: task: move the barrier after clearing th_ctx->current - MINOR: compression: Introduce minimum size - BUG/MINOR: h2: always trim leading and trailing LWS in header values - MINOR: tinfo: split the signal handler report flags into 3 - BUG/MEDIUM: stream: don't use localtime in dumps from a signal handler - OPTIM: connection: don't try to kill other threads' connection when !shared - BUILD: add possibility to use different QuicTLS variants - MEDIUM: fd: Wait if locked in fd_grab_tgid() and fd_take_tgid(). - MINOR: fd: Add fd_lock_tgid_cur(). - MEDIUM: epoll: Make sure we can add a new event - MINOR: pollers: Add a fixup_tgid_takeover() method. - MEDIUM: pollers: Drop fd events after a takeover to another tgid. - MEDIUM: connections: Allow taking over connections from other tgroups. 
- MEDIUM: servers: Add strict-maxconn. - BUG/MEDIUM: server: properly initialize PROXY v2 TLVs - BUG/MINOR: server: fix the "server-template" prefix memory leak - BUG/MINOR: h3: do not report transfer as aborted on preemptive response - CLEANUP: h3: fix documentation of h3_rcv_buf() - MINOR: hq-interop: properly handle incomplete request - BUG/MEDIUM: mux-fcgi: Try to fully fill demux buffer on receive if not empty - MINOR: h1: permit to relax the websocket checks for missing mandatory headers - BUG/MINOR: hq-interop: fix leak in case of rcv_buf early return - BUG/MINOR: server: check for either proxy-protocol v1 or v2 to send hedaer - MINOR: jws: implement a JWK public key converter - DEBUG: init: add a way to register functions for unit tests - TESTS: add a unit test runner in the Makefile - TESTS: jws: register a unittest for jwk - CI: github: run make unit-tests on the CI - TESTS: add config smoke checks in the unit tests - MINOR: jws: conversion to NIST curves name - CI: github: remove smoke tests from vtest.yml - TESTS: ist: fix wrong array size - TESTS: ist: use the exit code to return a verdict - TESTS: ist: add a ist.sh to launch in make unit-tests - CI: github: fix h2spec.config proxy names - DEBUG: init: Add a macro to register unit tests - MINOR: sample: allow custom date format in error-log-format - CLEANUP: log: removing "log-balance" references - BUG/MINOR: log: set proper smp size for balance log-hash - MINOR: log: use __send_log() with exact payload length - MEDIUM: log: postpone the decision to send or not log with empty messages - MINOR: proxy: make pr_mode enum bitfield compatible - MINOR: cfgparse-listen: add and use cfg_parse_listen_match_option() helper - MINOR: log: add options eval for log-forward - MINOR: log: detach prepare from parse message - MINOR: log: add dont-parse-log and assume-rfc6587-ntf options - BUG/MEIDUM: startup: return to initial cwd only after check_config_validity() - TESTS: change the output of run-unittests.sh - TESTS: 
unit-tests: store sh -x in a result file - CI: github: show results of the Unit tests - BUG/MINOR: cfgparse/peers: fix inconsistent check for missing peer server - BUG/MINOR: cfgparse/peers: properly handle ignored local peer case - BUG/MINOR: server: dont return immediately from parse_server() when skipping checks - MINOR: cfgparse/peers: provide more info when ignoring invalid "peer" or "server" lines - BUG/MINOR: stream: fix age calculation in "show sess" output - MINOR: stream/cli: rework "show sess" to better consider optional arguments - MINOR: stream/cli: make "show sess" support filtering on front/back/server - TESTS: quic: create first quic unittest - MINOR: h3/hq-interop: restore function for standalone FIN receive - MINOR/OPTIM: mux-quic: do not allocate rxbuf on standalone FIN - MINOR: mux-quic: refine reception of standalone STREAM FIN - MINOR: mux-quic: define globally stream rxbuf size - MINOR: mux-quic: define rxbuf wrapper - MINOR: mux-quic: store QCS Rx buf in a single-entry tree - MINOR: mux-quic: adjust Rx data consumption API - MINOR: mux-quic: adapt return value of qcc_decode_qcs() - MAJOR: mux-quic: support multiple QCS RX buffers - MEDIUM: mux-quic: handle too short data splitted on multiple rxbuf - MAJOR: mux-quic: increase stream flow-control for multi-buffer alloc - BUG/MINOR: cfgparse-tcp: relax namespace bind check - MINOR: startup: adjust alert messages, when capabilities are missed	2025-03-07 16:37:57 +01:00
Valentine Krasnobaeva	7d427134fe	MINOR: startup: adjust alert messages, when capabilities are missed CAP_SYS_ADMIN support was added, in order to access sockets in namespaces. So let's adjust the alert at startup, where we check preserved capabilities from global.last_checks. Let's mention here cap_sys_admin as well.	2025-03-07 16:37:16 +01:00
Damien Claisse	f0a07f834c	BUG/MINOR: cfgparse-tcp: relax namespace bind check Commit 5cbb278 introduced cap_sys_admin support, and enforced checks for both binds and servers. However, when binding into a namespace, the bind is done before dropping privileges. Hence, checking that we have cap_sys_admin capability set in this case is not needed (and it would decrease security to add it). For users starting haproxy with other user than root and without cap_sys_admin, bind should have already failed. As a consequence, relax runtime check for binds into a namespace.	2025-03-07 16:23:29 +01:00
Amaury Denoyelle	dc7913d814	MAJOR: mux-quic: increase stream flow-control for multi-buffer alloc Support for multiple Rx buffers per QCS instance has been introduced by previous patches. However, due to flow-control initial values, clients were still unable to fully use this to increase their upload throughput. This patch increases max-stream-data-bidi-remote flow-control initial values. A new define QMUX_STREAM_RX_BUF_FACTOR will fix the number of concurrent buffers allocable per QCS. It is set to 90. Note that the connection flow-control initial value did not change. It is still configured to be equivalent to bufsize multiplied by the maximum concurrent streams. This ensures that Rx buffers allocation is still constrained per connection, so that it won't be possible to have all active QCS instances using in parallel their maximum Rx buffers count.	2025-03-07 12:06:27 +01:00
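The flow-control arithmetic described in this commit can be sketched as follows. This is a minimal illustration, not HAProxy code: the helper names are hypothetical, and it assumes the per-stream window is simply bufsize multiplied by QMUX_STREAM_RX_BUF_FACTOR (the value 90 is the one quoted in the message).

```c
#include <stdint.h>

/* Value quoted in the commit message. */
#define QMUX_STREAM_RX_BUF_FACTOR 90

/* Hypothetical helper: initial per-stream flow-control window, assuming
 * it is bufsize multiplied by the buffer factor.
 */
static uint64_t stream_initial_window(uint64_t bufsize)
{
	return bufsize * QMUX_STREAM_RX_BUF_FACTOR;
}

/* Hypothetical helper: initial connection flow-control window, which the
 * message describes as bufsize multiplied by the maximum number of
 * concurrent streams.
 */
static uint64_t conn_initial_window(uint64_t bufsize, uint64_t max_streams)
{
	return bufsize * max_streams;
}
```

With a 16384-byte bufsize and 100 concurrent streams, a single stream could be granted more than 1 MB while the connection as a whole stays capped at 100 buffers, so not every stream can use its full allowance at once.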
Amaury Denoyelle	75027692a3	MEDIUM: mux-quic: handle too short data splitted on multiple rxbuf Previous commit introduces support for multiple Rx buffers per QCS instance. Contiguous data may be split across multiple buffers depending on their offset. A particular issue could arise with this new model. Indeed, app_ops rcv_buf callback can still deal with a single buffer at a time. This may cause a deadlock in decoding if app_ops layer cannot proceed due to partial data, but such data are precisely divided between two buffers. This can for example occur during HTTP/3 frame header parsing. To deal with this, a new function is implemented to force data realignment between two contiguous buffers. This is called only when app_ops rcv_buf returned 0 but data is available in the next buffer after the current one. In this case, data are transferred from the next into the current buffer via qcs_transfer_rx_data(). Decoding is then restarted, which should ensure that app_ops layer has enough data to advance. During this operation, special care is taken to remove both qc_stream_rxbuf entries, as their offsets are adjusted. The next buffer is only reinserted if there is remaining data in it, else it can be freed. This case is not easily reproducible as it depends on the HTTP/3 framing used by the client. It seems to be easily reproduced though with quiche. $ quiche-client --http-version HTTP/3 --method POST --body /tmp/100m \ "https://127.0.0.1:20443/post"	2025-03-07 12:06:27 +01:00
Amaury Denoyelle	60f64449fb	MAJOR: mux-quic: support multiple QCS RX buffers Implement support for multiple Rx buffers per QCS instance. This requires several changes mostly in qcc_recv() / qcc_decode_qcs() which deal with STREAM frames reception and decoding. These multiple buffers can be stored in QCS rx.bufs tree which was introduced in an earlier patch. On STREAM frame reception, a buffer is retrieved from QCS bufs tree, or allocated if necessary, based on the data starting offset. Each buffer is aligned on bufsize for convenience. This ensures there is no overlap between two contiguous buffers. Special care is taken when dealing with a STREAM frame which must be split and stored in two contiguous buffers. When decoding input data, qcc_decode_qcs() is still invoked with a single buffer as input. This requires a new while loop to ensure decoding is performed across multiple contiguous buffers until all data are decoded or app stream buffer is full. Also, after qcs_consume() has been performed, the stream Rx channel is immediately closed if FIN was already received and QCS now contains only a single buffer with all remaining data. This is necessary as qcc_recv() is unable to close the Rx channel if FIN is received for a buffer different from the current readable offset. Note that for now the stream flow-control value is still too low to fully utilize this new infrastructure and improve clients upload throughput. Indeed, flow-control max-stream-data initial values are set to match bufsize. This ensures that each QCS will use 1 buffer, or at most 2 if data are split. A future patch will increase this value to unblock this limitation.	2025-03-07 12:06:26 +01:00
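The bufsize alignment and the splitting of a STREAM frame over contiguous buffers can be sketched as below. This is only an illustration under stated assumptions: the struct and function names are hypothetical and do not come from the HAProxy sources, and the real code stores buffers in a tree rather than returning an array.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one piece of a STREAM frame after splitting. */
struct rx_chunk {
	uint64_t buf_off; /* stream offset of the bufsize-aligned buffer */
	uint64_t start;   /* position of the piece inside that buffer */
	uint64_t len;     /* number of bytes stored in that buffer */
};

/* Start offset of the aligned buffer covering stream offset <off>. */
static uint64_t rxbuf_base(uint64_t off, uint64_t bufsize)
{
	return off - (off % bufsize);
}

/* Split the range [off, off+len) into per-buffer chunks. At most <max>
 * entries are written to <out>; the number of entries written is returned.
 */
static size_t split_frame(uint64_t off, uint64_t len, uint64_t bufsize,
                          struct rx_chunk *out, size_t max)
{
	size_t n = 0;

	while (len > 0 && n < max) {
		uint64_t base = rxbuf_base(off, bufsize);
		uint64_t room = base + bufsize - off; /* bytes left in this buffer */
		uint64_t take = len < room ? len : room;

		out[n].buf_off = base;
		out[n].start = off - base;
		out[n].len = take;
		n++;
		off += take;
		len -= take;
	}
	return n;
}
```

Because every buffer starts on a multiple of bufsize, two buffers can never overlap, and a frame no larger than bufsize touches at most two of them.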
Amaury Denoyelle	7b168e356f	MINOR: mux-quic: adapt return value of qcc_decode_qcs() Change return value of qcc_decode_qcs(). It now directly returns the value from app_ops rcv_buf callback. Function documentation is updated to reflect this. For now, qcc_decode_qcs() return value is ignored by callers, so this patch should not have any functional change. However, it will become necessary when implementing multiple Rx buffers per QCS, as a loop will be implemented to invoke qcc_decode_qcs() on several contiguous buffers. Decoding must be stopped however as soon as an error is returned by rcv_buf callback. This is also the case in case of a null value, which indicates there is not enough data to continue decoding.	2025-03-07 12:06:26 +01:00
Amaury Denoyelle	6b5607d66f	MINOR: mux-quic: adjust Rx data consumption API HTTP/3 data are converted into HTX via qcc_decode_qcs() function. On completion, these data are removed from QCS Rx buffer via qcs_consume(). This patch adjust qcs_consume() API with several changes. Firstly, the Rx buffer instance to operate on must now be specified as a new argument to the function. Secondly, buffer liberation when all data were removed from qcs_consume() is extracted up to qcc_decode_qcs() caller. No functional change with this patch. The objective is to have an API which can be better adapted to multiple Rx buffers per QCS instance.	2025-03-07 12:06:26 +01:00
Amaury Denoyelle	a4f31ffeeb	MINOR: mux-quic: store QCS Rx buf in a single-entry tree Convert QCS rx buffer pointer to a tree container. Additionally, the offset field of qc_stream_rxbuf is thus transformed into a tree node. For now, only a single Rx buffer is stored at most in QCS tree. Multiple Rx buffers will be implemented in a future patch to improve QUIC clients upload throughput.	2025-03-07 12:06:26 +01:00
Amaury Denoyelle	cc3c2d1f12	MINOR: mux-quic: define rxbuf wrapper Define a new type qc_stream_rxbuf. This is used as a wrapper around QCS Rx buffer with encapsulation of the ncbuf storage. It is allocated via a new pool. Several functions are adapted to be able to deal with qc_stream_rxbuf as a wrapper instead of the previous plain ncbuf instance. No functional change should happen with this patch. For now, only a single qc_stream_rxbuf can be instantiated per QCS. However, this new type will be useful to implement multiple Rx buffer storage in a future commit.	2025-03-07 12:06:26 +01:00
Amaury Denoyelle	4b1e63d191	MINOR: mux-quic: define globally stream rxbuf size QCS uses ncbuf for STREAM data storage. This serves as a limit for maximum STREAM buffering capacity, advertised via QUIC transport parameters for initial flow-control values. Define a new function qmux_stream_rx_bufsz() which can be used to retrieve this Rx buffer size. This can be used both in MUX/H3 layers and in QUIC transport parameters.	2025-03-07 12:06:26 +01:00
Amaury Denoyelle	7dd1eec2b1	MINOR: mux-quic: refine reception of standalone STREAM FIN Reception of standalone STREAM FIN is a corner case, which may be difficult to handle. In particular, care must be taken to ensure app_ops rcv_buf() is always called to be notified of the FIN, even if the Rx buffer is empty or the full demux flag is set. Otherwise, this could prevent closure of the QCS Rx channel. To ensure this, rcv_buf() was systematically called if FIN was received, with or without data payload. This could cause unnecessary invocations when FIN is transmitted with data and the full demux flag is set, or when data are received out of order. This patch improves qcc_recv() by explicitly detecting a standalone FIN case. Thus, rcv_buf() is only forcefully called in this case and if all data were already previously received.	2025-03-07 12:06:26 +01:00
Amaury Denoyelle	20dc8e4ec2	MINOR/OPTIM: mux-quic: do not allocate rxbuf on standalone FIN STREAM FIN may be received without any payload. However, qcc_recv() always called qcs_get_ncbuf() indiscriminately, which may allocate a QCS Rx buffer. This is unneeded as there is no payload to store. Improve this by skipping the qcs_get_ncbuf() invocation when dealing with a standalone FIN signal. This should prevent superfluous buffer allocation.	2025-03-07 12:06:26 +01:00
Amaury Denoyelle	861b11334c	MINOR: h3/hq-interop: restore function for standalone FIN receive Previously, a function qcs_http_handle_standalone_fin() was implemented to handle a received standalone FIN, bypassing app_ops layer decoding. However, this was removed as app_ops layer interaction is necessary. For example, HTTP/3 checks that FIN is never sent on the control uni stream. This patch reintroduces qcs_http_handle_standalone_fin(), albeit in a slightly diminished version. Most importantly, it is now the responsibility of the app_ops layer itself to use it, to avoid the shortcoming described above. The main objective of this patch is to be able to support standalone FIN in HTTP/0.9 layer. This is easily done via the reintroduction of qcs_http_handle_standalone_fin() usage. This will be useful to perform testing, as standalone FIN is a corner case which can easily be broken.	2025-03-07 12:06:26 +01:00
Amaury Denoyelle	6f95d0dad0	TESTS: quic: create first quic unittest Define a first unit-test dedicated to QUIC. A single test for now ensures that variable length decoding is compliant. This should be extended in the future with new set of tests.	2025-03-07 12:06:26 +01:00
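For context, the variable-length integer encoding exercised by such a test is defined in RFC 9000, section 16: the two most significant bits of the first byte give the encoded length (1, 2, 4 or 8 bytes) and the remaining bits carry the value in network byte order. A minimal standalone decoder might look like this; the function name is hypothetical and is not the one used by HAProxy.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one QUIC variable-length integer from <buf> (RFC 9000, section 16).
 * Returns the number of bytes consumed, or 0 if <len> is too short.
 */
static size_t quic_varint_decode(const unsigned char *buf, size_t len,
                                 uint64_t *val)
{
	size_t need, i;
	uint64_t v;

	if (len == 0)
		return 0;

	/* The two high bits of the first byte select 1, 2, 4 or 8 bytes. */
	need = (size_t)1 << (buf[0] >> 6);
	if (len < need)
		return 0;

	/* The low six bits of the first byte start the value. */
	v = buf[0] & 0x3f;
	for (i = 1; i < need; i++)
		v = (v << 8) | buf[i];

	*val = v;
	return need;
}
```

The sample encodings listed in RFC 9000, appendix A.1 make convenient test vectors for this kind of decoder.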
Willy Tarreau	5e558c1727	MINOR: stream/cli: make "show sess" support filtering on front/back/server With "show sess", particularly "show sess all", we're often missing the ability to inspect only streams attached to a frontend, backend or server. Let's just add these filters to the command. Only one at a time may be set. One typical use case could be to dump streams attached to a server after issuing "shutdown sessions server XXX" to figure why any wouldn't stop for example.	2025-03-07 10:38:12 +01:00
Willy Tarreau	2bd7cf53cb	MINOR: stream/cli: rework "show sess" to better consider optional arguments The "show sess" CLI command parser is getting really annoying because several options were added in an exclusive mode as the single possible argument. Recently some cumulable options were added ("show-uri") but the older ones were not yet adapted. Let's just make sure that the various filters such as "older" and "age" now belong to the options and leave only <id>, "all", and "help" for the first ones. The doc was updated and it's now easier to find these options.	2025-03-07 10:36:58 +01:00
Willy Tarreau	1cdf2869f6	BUG/MINOR: stream: fix age calculation in "show sess" output The "show sess" output reports an age that's based on the last byte of the HTTP request instead of the stream creation date, due to a confusion between logs->request_ts and the request_date sample fetch function. Most of the time these are equal except when the request is not yet full for any reason (e.g. wait-body). This explains why a few "show sess" could report a few new streams aged by 99 days for example. Let's perform the correct request timestamp calculation like the sample fetch function does, by adding t_idle and t_handshake to the accept_ts. Now the stream's age is correct and can be correctly used with the "show sess older <age>" variant. This issue was introduced in 2.9 and the fix can be backported to 3.0.	2025-03-07 10:36:58 +01:00
Aurelien DARRAGON	dbb25720dd	MINOR: cfgparse/peers: provide more info when ignoring invalid "peer" or "server" lines Invalid (incomplete) "server" or "peer" lines under peers section are now properly ignored. For completeness, in this patch we add some reports so that the user knows that incomplete lines were ignored. For an incomplete server line, since it is tolerated (see GH #565), we only emit a diag warning. For an incomplete peer line, we report a real warning, as it is not expected to have a peer line without an address:port specified. Also, 'newpeer == curpeers->local' check could be simplified since we already have the 'local_peer' variable which tells us that the parsed line refers to a local peer.	2025-03-07 09:39:51 +01:00
Aurelien DARRAGON	a76b5358f0	BUG/MINOR: server: dont return immediately from parse_server() when skipping checks If parse_server() is called under peers section parser, and the address needs to be parsed but it is missing, we directly return from the function. However since 0fc136ce5b ("REORG: server: use parsing ctx for server parsing"), parse_server() uses parsing ctx to emit warnings/errors, and the ctx must be reset before returning from the function, yet this early return was overlooked. Because of that, any ha_{warning,alert..} message reported after an early return from parse_server() could cause messages to have an extra "parsing [file:line]" info. We fix that by ensuring parse_server() doesn't return without resetting the parsing context. It should be backported up to 2.6.	2025-03-07 09:39:46 +01:00
Aurelien DARRAGON	054443dfb9	BUG/MINOR: cfgparse/peers: properly handle ignored local peer case In 8ba10fea6 ("BUG/MINOR: peers: Incomplete peers sections should be validated."), some checks were relaxed in parse_server(), and extra logic was added in the peers section parser in an attempt to properly ignore incomplete "server" or "peer" statement under peers section. This was done in response to GH #565, the main intent was that haproxy should already complain about incomplete peers section (ie: missing localpeer). However, 8ba10fea69 explicitly skipped the peer cleanup upon missing srv association for local peers. This is wrong because later haproxy code always assumes that peer->srv is valid. Indeed, we got reports that the (invalid) config below would cause segmentation fault on all stable versions: global localpeer 01JM0TEPAREK01FQQ439DDZXD8 peers my-table peer 01JM0TEPAREK01FQQ439DDZXD8 listen dummy bind localhost:8080 To fix the issue, instead of by-passing some cleanup for the local peer, handle this case specifically by doing the regular peer cleanup and reset some fields set on the curpeers and curpeers proxy because of the invalid local peer (do as if the peer was not declared). It should still comply with requirements from #565. This patch should be backported to all stable versions.	2025-03-06 22:05:29 +01:00
Aurelien DARRAGON	2560ab892f	BUG/MINOR: cfgparse/peers: fix inconsistent check for missing peer server In the "peers" section parser, right after parse_server() is called, we used to check whether the curpeers->peers_fe->srv pointer was set or not to know if parse_server() successfully added a server to the peers proxy, a server that we can then associate with the new peer. However the check is wrong: curpeers->peers_fe->srv points to the last added server, so if a server was successfully added before the failing one, we cannot detect that the last parse_server() didn't add a server. This is known to cause bugs with bad "peer"/"server" statements. To fix the issue, we save a pointer to the last known curpeers->peers_fe->srv before parse_server() is called, and we then compare the saved value with the pointer after parse_server(): if the value didn't change, then parse_server() didn't add a server. This makes the check consistent in all situations. It should be backported to all stable versions.	2025-03-06 22:05:24 +01:00
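The "save the pointer, call, then compare" pattern described in this fix can be illustrated in isolation. The types and the fake_parse_server() stand-in below are hypothetical simplifications, not the real HAProxy structures.

```c
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the real structures. */
struct server { struct server *next; };
struct proxy  { struct server *srv; /* most recently added server */ };

/* Stand-in for parse_server(): links <candidate> at the head of the
 * proxy's server list only when <ok> is non-zero.
 */
static void fake_parse_server(struct proxy *px, struct server *candidate, int ok)
{
	if (!ok)
		return;
	candidate->next = px->srv;
	px->srv = candidate;
}

/* Returns 1 if the call added a server, 0 otherwise. The head pointer is
 * saved before the call and compared afterwards, so a server left over
 * from an earlier successful line cannot be mistaken for a new one.
 */
static int server_was_added(struct proxy *px, struct server *candidate, int ok)
{
	struct server *prev = px->srv;

	fake_parse_server(px, candidate, ok);
	return px->srv != prev;
}
```

A plain `px->srv != NULL` test would report success for the second, failing line as soon as any earlier line had added a server, which is the inconsistency the commit describes.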
William Lallemand	29db5406b4	CI: github: show results of the Unit tests Add a "Show Unit-Tests results" section which shows each failed unit test by displaying its result file.	2025-03-06 21:23:54 +01:00
William Lallemand	0b22c8e0e0	TESTS: unit-tests: store sh -x in a result file Store `sh -e -x` of the test in a result file. This file is deleted upon success, but can be consulted if the test fails	2025-03-06 21:22:38 +01:00
William Lallemand	7fdc4160b2	TESTS: change the output of run-unittests.sh - "check" is run with sh -e so it will stop at the first error - output of "check" is not shown anymore - add a line with the name of the failed test	2025-03-06 17:53:53 +01:00
Valentine Krasnobaeva	e900ef987e	BUG/MEIDUM: startup: return to initial cwd only after check_config_validity() In check_config_validity() we evaluate some sample fetch expressions (log-format, server rules, etc). These expressions may use external files like maps. If some particular 'default-path' was set in the global section before, it's no longer applied to resolve file paths in check_config_validity(), because parse_cfg() switches back to the initial cwd at the end of config parsing. This fixes the issue #2886. This patch should be backported in all stable versions since 2.4.0, including 2.4.0.	2025-03-06 10:49:48 +01:00
Roberto Moreda	f98b5c4f59	MINOR: log: add dont-parse-log and assume-rfc6587-ntf options This commit introduces the dont-parse-log option to disable log message parsing, allowing raw log data to be forwarded without modification. Also, it adds the assume-rfc6587-ntf option to frame log messages using only non-transparent framing as per RFC 6587. This avoids misparsing in certain cases (mainly with non-RFC-compliant messages). The documentation is updated to include details on the new options and their intended use cases. This feature was discussed in GH #2856.	2025-03-06 09:30:39 +01:00
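To illustrate what non-transparent framing means here: RFC 6587 describes two ways of delimiting syslog messages on a stream, octet counting (a decimal length prefix followed by a space) and non-transparent framing (a trailer character, typically LF). The sketch below extracts messages using the LF trailer only; the function name is hypothetical and the code is not taken from HAProxy.

```c
#include <stddef.h>
#include <string.h>

/* Find the next LF-terminated message in <buf>. On success, returns the
 * message length without the trailer and stores in <consumed> the number
 * of bytes to skip, trailer included. Returns -1 if no complete message
 * is available yet.
 */
static long ntf_next_frame(const char *buf, size_t len, size_t *consumed)
{
	const char *lf = memchr(buf, '\n', len);

	if (!lf)
		return -1;
	*consumed = (size_t)(lf - buf) + 1;
	return (long)(lf - buf);
}
```

A message such as "123 foo" followed by LF shows why the distinction matters: a parser that also accepts octet counting could read the leading "123 " as a length prefix, while a trailer-only parser treats the whole line as the message.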
Roberto Moreda	c25e6f5efa	MINOR: log: detach prepare from parse message This commit adds a new function `prepare_log_message` to initialize log message buffers and metadata. This function sets default values for log level and facility, ensuring a consistent starting state for log processing. It also prepares the buffer and metadata fields, simplifying subsequent log parsing and construction.	2025-03-06 09:30:31 +01:00
Roberto Moreda	834e9af877	MINOR: log: add options eval for log-forward This commit adds parsing of options in log-forward config sections and prepares the scenario to implement actual changes of behaviour. So far we only take into account proxy->options2, which is the bit container with more available positions.	2025-03-06 09:30:25 +01:00
Aurelien DARRAGON	0746f6bde0	MINOR: cfgparse-listen: add and use cfg_parse_listen_match_option() helper cfg_parse_listen_match_option() takes a cfg_opt array as parameter, as well as the current args, and the expected mode and cap bitfields. It is expected to be used under cfg_parse_listen() function or similar. Its goal is to remove code duplication around proxy->options and proxy->options2 handling, since the same checks are performed for the two. Also, this function could help to evaluate proxy options for mode-specific proxies such as log-forward section for instance: by giving the expected mode and capability as input, the function would only match compatible options.	2025-03-06 09:30:18 +01:00
Aurelien DARRAGON	d9aa199100	MINOR: proxy: make pr_mode enum bitfield compatible Current pr_mode enum is a regular enum because a proxy only supports one mode at a time. However it can be handy for a function to be given a list of compatible modes for a proxy, and we can't do that using a bitfield because pr_mode is not bitfield compatible (values share the same bits). In this patch we manually define pr_mode values so that they are all using separate bits and allows a function to take a bitfield of compatible modes as parameter.	2025-03-06 09:30:11 +01:00
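The idea of a "bitfield compatible" enum can be shown with a small sketch. The enum name, its members and their bit assignments are hypothetical and are not copied from the HAProxy headers; only the technique (one distinct bit per mode) follows the commit message.

```c
/* Hypothetical modes: each value uses its own bit, so several modes can
 * be OR'ed together into a mask of compatible modes.
 */
enum ex_mode {
	EX_MODE_TCP    = 1 << 0,
	EX_MODE_HTTP   = 1 << 1,
	EX_MODE_SYSLOG = 1 << 2,
};

/* Returns non-zero if <mode> is part of the <allowed> mask. */
static int mode_is_compatible(enum ex_mode mode, unsigned int allowed)
{
	return (mode & allowed) != 0;
}
```

With sequential values such as 0, 1, 2 this test would not work, since 1 and 3 share a bit and 0 matches nothing, which is why the values have to be assigned manually.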
Aurelien DARRAGON	c7abe7778e	MEDIUM: log: postpone the decision to send or not log with empty messages As reported by Nick Ramirez in GH #2891, it is currently not possible to use log-profile without a log-format set on the proxy. This is due to historical reason, because all log sending functions avoid trying to send a log with empty message. But now with log-profile which can override log-format, it is possible that some loggers may actually end up generating a valid log message that should be sent! Yet from the upper logging functions we don't know about that because loggers are evaluated in lower API functions. Thus, to avoid skipping potentially valid messages (thanks to log-profile overrides), in this patch we postpone the decision to send or not empty log messages in lower log API layer, ie: _process_send_log_final(), once the log-profile settings were evaluated for a given logger. A known side-effect of this change is that fe->log_count statistic may be increased even if no log message is sent because the message was empty and even the log-profile didn't help to produce a non empty log message. But since configurations lacking proxy log-format are not supposed to be used without log-profile (+ log steps combination) anyway it shouldn't be an issue.	2025-03-05 15:38:52 +01:00
Aurelien DARRAGON	9e9b110032	MINOR: log: use __send_log() with exact payload length Historically, __send_log() was called with terminating NULL byte after the message payload. But now that __send_log() supports being called without terminating NULL byte (thanks to size hint), and that __send_log() actually strips any \n or NULL byte, we don't need to bother with that anymore. So let's remove extra logic around __send_log() users where we added 1 extra byte for the terminating NULL byte. No change of behavior should be expected.	2025-03-05 15:38:46 +01:00
Aurelien DARRAGON	94a9b0f5de	BUG/MINOR: log: set proper smp size for balance log-hash result.data.u.str.size was set to size+1 to take into account terminating NULL byte as per the comment. But this is wrong because the caller is free to set size to just the right amount of bytes (without terminating NULL byte). In fact all smp API functions will not read past str.data so there is no risk of uninitialized reads, but this leaves an ambiguity for converters that may use all the smp size to perform transformations, and since we don't know about the "message" memory origin, we cannot assume that its size may be greater than size. So we max it out to size just to be safe. This bug was not known to cause any issue, it was spotted during code review. It should be backported in 2.9 with b30bd7a ("MEDIUM: log/balance: support for the "hash" lb algorithm").	2025-03-05 15:38:41 +01:00
Aurelien DARRAGON	ddf66132f4	CLEANUP: log: removing "log-balance" references This is a complementary patch to 0e1f389fe9 ("DOC: config: removing "log-balance" references"): we properly removed all log-balance references in the doc but there remained some in the code, let's fix that. It could be backported in 2.9 with 0e1f389fe9	2025-03-05 15:38:34 +01:00
Valentine Krasnobaeva	b46b81949f	MINOR: sample: allow custom date format in error-log-format Sample fetches %[accept_date] and %[request_date] with converters can be used in error-log-format string. But in most error cases they fetch nothing, as error logs are produced on SSL handshake issues or when an invalid PROXY protocol header is used. Stream object is never allocated in such cases and smp_fetch_accept_date() just simply returns 0. There is a need to have a custom date format (ISO8601) also in the error logs, along with normal logs. When sess_build_logline_orig() builds log line it always copies the accept date to strm_logs structure. When stream is absent, accept date is copied from the session object. So, if the stream object wasn't allocated, let's use the session date info in smp_fetch_accept_date(). This allows then, in sample_process(), to apply to the fetched date different converters and formats. This fixes the issue #2884.	2025-03-04 18:57:29 +01:00
Olivier Houchard	335ef3264b	DEBUG: init: Add a macro to register unit tests Add a new macro, REGISTER_UNITTEST(), that will automatically make sure we call hap_register_unittest(), instead of having to create a function that will do so.	2025-03-04 18:18:10 +01:00
William Lallemand	588237ca6e	CI: github: fix h2spec.config proxy names h2spec.config config file emitted a warning because the frontend name has the same name as the backend.	2025-03-04 11:44:03 +01:00
William Lallemand	06d86822c1	TESTS: ist: add a ist.sh to launch in make unit-tests Compile and run the ist unit tests from ist.sh	2025-03-04 11:25:35 +01:00
William Lallemand	11ea331e20	TESTS: ist: use the exit code to return a verdict Use the exit code to return a verdict on the test.	2025-03-04 11:25:35 +01:00
William Lallemand	ddd2c82a35	TESTS: ist: fix wrong array size test_istzero() and test_istpad() have the wrong array size for buf[], which lacks the space for the '\0'. Could be backported to all stable branches.	2025-03-04 11:25:25 +01:00
William Lallemand	937ece45d4	CI: github: remove smoke tests from vtest.yml Smoke tests from the vtest.yml are not useful anymore since they are run directly by tests/unit/smoke/test.sh. This patch removes them.	2025-03-03 12:46:20 +01:00
William Lallemand	cf71e9f5cf	MINOR: jws: conversion to NIST curves name OpenSSL versions greater than 3.0 do not use the same API when manipulating EVP_PKEY structures: the EC_KEY API is deprecated and it's not possible anymore to get an EC_GROUP and simply call EC_GROUP_get_curve_name(). Instead, one must call EVP_PKEY_get_utf8_string_param with the OSSL_PKEY_PARAM_GROUP_NAME parameter, but this would result in a SECG curve name, instead of the NIST curve name obtained with previous versions. (ex: secp384r1 vs P-384) This patch adds 2 functions: - the first one looks up a curve name and converts it to an openssl NID. - the second one converts a NID to a NIST curve name The list only contains: P-256, P-384 and P-521 for now, it could be extended in the future with more curves.	2025-03-03 12:43:32 +01:00
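The name conversion can be illustrated without OpenSSL by a direct lookup table. This is a simplification: the commit goes through an OpenSSL NID as an intermediate step, whereas the hypothetical curve_to_nist() below maps names to names directly. The aliases listed are the usual SECG/X9.62 names for the three NIST curves mentioned in the message.

```c
#include <stddef.h>
#include <string.h>

/* SECG / X9.62 names and their NIST equivalents for the three curves
 * mentioned in the commit message.
 */
static const struct {
	const char *name;
	const char *nist;
} curve_names[] = {
	{ "prime256v1", "P-256" },
	{ "secp256r1",  "P-256" },
	{ "secp384r1",  "P-384" },
	{ "secp521r1",  "P-521" },
};

/* Hypothetical helper: returns the NIST name matching <name>, or NULL if
 * the curve is not in the table.
 */
static const char *curve_to_nist(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(curve_names) / sizeof(curve_names[0]); i++) {
		if (strcmp(curve_names[i].name, name) == 0)
			return curve_names[i].nist;
	}
	return NULL;
}
```

Returning NULL for unknown curves lets the caller reject keys whose curve has no NIST name, rather than emitting a JWK with an unexpected "crv" value.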
William Lallemand	8a6b0b06cd	TESTS: add config smoke checks in the unit tests vtest.yml contains some config checks that are used to check the memleaks. This patch adds a unit test which runs the same tests.	2025-03-03 12:43:32 +01:00
William Lallemand	7a2a613132	CI: github: run make unit-tests on the CI Run the new make unit-tests on the CI. It requires HAProxy to be built with -DDEBUG_UNIT so the -U option is available in HAProxy	2025-03-03 12:43:32 +01:00
William Lallemand	09457111bb	TESTS: jws: register a unittest for jwk Add a way to test the jwk converter in the unit test system $ make TARGET=linux-glibc USE_OPENSSL=1 CFLAGS="-DDEBUG_UNIT=1" $ ./haproxy -U jwk foobar.pem.rsa { "kty": "RSA", "n": "...", "e": "AQAB" } $ ./haproxy -U jwk foobar.pem.ecdsa { "kty": "EC", "crv": "P-384", "x": "...", "y": "..." } This is then tested by a shell script: $ HAPROXY_PROGRAM=${PWD}/haproxy tests/unit/jwk/test.sh + readlink -f tests/unit/jwk/test.sh + BASENAME=/haproxy/tests/unit/jwk/test.sh + dirname /haproxy/tests/unit/jwk/test.sh + TESTDIR=/haproxy/tests/unit/jwk + HAPROXY_PROGRAM=/haproxy/haproxy + mktemp + FILE1=/tmp/tmp.iEICxC5yNK + /haproxy/haproxy -U jwk /haproxy/tests/unit/jwk/ecdsa.key + diff -Naurp /haproxy/tests/unit/jwk/ecdsa.pub.jwk /tmp/tmp.iEICxC5yNK + rm /tmp/tmp.iEICxC5yNK + mktemp + FILE2=/tmp/tmp.EIrGZGaCDi + /haproxy/haproxy -U jwk /haproxy/tests/unit/jwk/rsa.key + diff -Naurp /haproxy/tests/unit/jwk/rsa.pub.jwk /tmp/tmp.EIrGZGaCDi + rm /tmp/tmp.EIrGZGaCDi $ echo $? 0	2025-03-03 12:43:32 +01:00
William Lallemand	1e7478bb4e	TESTS: add a unit test runner in the Makefile `make unit-tests` would run shell scripts from tests/unit/ The run-unittests.sh script will look for any .sh in tests/unit/ and will call it twice: - first with the 'check' argument in order to decide if we should skip the test or not - second to run the check A simple test could be written this way: #!/bin/sh check() { ${HAPROXY_PROGRAM} -cc 'feature(OPENSSL)' command -v socat } run() { ${HAPROXY_PROGRAM} -dI -f ${ROOTDIR}/examples/quick-test.cfg -c } case "$1" in "check") check ;; "run") run ;; esac The tests MUST be written in POSIX shell in order to be portable, and any special commands should be tested with `command -v` before using it. Tests are run with `sh -e` so everything must be tested.	2025-03-03 12:43:32 +01:00
William Lallemand	a647839954	DEBUG: init: add a way to register functions for unit tests Doing unit tests with haproxy was always a bit difficult: some of the functions you want to test depend on the buffer or trash buffer initialisation of HAProxy, so building a separate main() for them is quite hard. This patch adds a way to register a function that can be called with the "-U" parameter on the command line, will be executed just after step_init_1() and will exit the process with its return value as an exit code. When using the -U option, every keyword after this option is passed to the callback and can be used as a parameter, making it possible to handle complex arguments if required by the test. HAProxy needs to be built with DEBUG_UNIT to activate this feature.	2025-03-03 12:43:32 +01:00
William Lallemand	4dc0ba233e	MINOR: jws: implement a JWK public key converter Implement a converter which takes an EVP_PKEY and converts it to a public JWK key. This is the first step of the JWS implementation. It supports both EC and RSA keys. Known to work with: - LibreSSL - AWS-LC - OpenSSL > 1.1.1	2025-03-03 12:43:32 +01:00
Willy Tarreau	730641f7ca	BUG/MINOR: server: check for either proxy-protocol v1 or v2 to send hedaer As reported in issue #2882, using "no-send-proxy-v2" on a server line does not properly disable the use of proxy-protocol if it was enabled in a default-server directive in combination with other PP options. The reason for this is that the sending of a proxy header is determined by a test on srv->pp_opts without any distinction, so disabling PPv2 while leaving other options results in a PPv1 header to be sent. Let's fix this by explicitly testing for the presence of either send-proxy or send-proxy-v2 when deciding to send a proxy header. This can be backported to all versions. Thanks to Andre Sencioles (@asenci) for reporting the issue and testing the fix.	2025-03-03 04:05:47 +01:00
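The difference between the old and the new test can be sketched with a few flag bits. The flag names and values below are hypothetical placeholders and not the real srv->pp_opts constants; only the logic (test for the two "send" flags explicitly instead of testing the whole field) follows the commit message.

```c
/* Hypothetical option bits standing in for the real pp_opts flags. */
#define EX_PP_V1      0x01u /* "send-proxy" */
#define EX_PP_V2      0x02u /* "send-proxy-v2" */
#define EX_PP_V2_OPT  0x04u /* any additional proxy-protocol option */

/* Old behaviour: any bit set in pp_opts triggers a proxy header. */
static int must_send_pp_old(unsigned int pp_opts)
{
	return pp_opts != 0;
}

/* Fixed behaviour: a header is sent only if v1 or v2 is requested. */
static int must_send_pp(unsigned int pp_opts)
{
	return (pp_opts & (EX_PP_V1 | EX_PP_V2)) != 0;
}
```

When "no-send-proxy-v2" clears the v2 bit but another option bit inherited from default-server remains set, the old test still fires while the fixed one does not.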
Amaury Denoyelle	d0f97040a3	BUG/MINOR: hq-interop: fix leak in case of rcv_buf early return HTTP/0.9 parser was recently updated to support truncated requests in rcv_buf operation. However, this caused a leak as the input buffer is allocated early. In fact, the leak was already present in case of fatal errors. Fix this by first delaying buffer allocation, so that initial checks are performed before. Then, ensure that the buffer is released in case of a later error. This is considered as minor, as HTTP/0.9 is reserved for experiment and QUIC interop usages. This should be backported up to 2.6.	2025-02-28 17:37:00 +01:00
Willy Tarreau	fd5d59967a	MINOR: h1: permit to relax the websocket checks for missing mandatory headers At least one user would like to allow a standards-violating client setup WebSocket connections through haproxy to a standards-violating server that accepts them. While this should of course never be done over the internet, it can make sense in the datacenter between application components which do not need to mask the data, so this typically falls into the situation of what the "accept-unsafe-violations-in-http-request" option and the "accept-unsafe-violations-in-http-response" option are made for. See GH #2876 for more context. This patch relaxes the test on the "Sec-Websocket-Key" header field in the request, and of the "Sec-Websocket-Accept" header in the response when these respective options are set. The doc was updated to reference this addition. This may be backported to 3.1 but preferably not further.	2025-02-28 17:31:20 +01:00
Christopher Faulet	0e08252294	BUG/MEDIUM: mux-fcgi: Try to fully fill demux buffer on receive if not empty Don't reserve space for the HTX overhead on receive if the demux buffer is not empty. Otherwise, the demux buffer may be erroneously reported as full and this may block records processing. Because of this bug, a ping-pong loop till timeout between data reception and demux process can be observed. This bug was introduced by the commit 5f927f603 ("BUG/MEDIUM: mux-fcgi: Properly handle read0 on partial records"). To fix the issue, if the demux buffer is not empty when we try to receive more data, all free space in the buffer can now be used. However, if the demux buffer is empty, we still try to keep it aligned with the HTX. This patch must be backported to 3.1.	2025-02-28 16:07:05 +01:00
Amaury Denoyelle	3cc095a011	MINOR: hq-interop: properly handle incomplete request Extends HTTP/0.9 layer to be able to deal with incomplete requests. Instead of an error, 0 is returned. Thus, instead of a stream closure, QUIC-MUX may retry rcv_buf operation later if more data is received, similarly to HTTP/3 layer. Note that HTTP/0.9 is only used for testing and interop purposes. As such, this limitation is not considered as a bug. It is probably not worth backporting.	2025-02-27 17:34:06 +01:00
Amaury Denoyelle	0aa35289b3	CLEANUP: h3: fix documentation of h3_rcv_buf() Return value of h3_rcv_buf() is incorrectly documented. Indeed, it may return a positive value to indicate that input bytes were converted into HTX. This is especially important, as caller uses this value to consume the reported data amount in QCS Rx buffer. This should be backported up to 2.6. Note that on 2.8, h3_rcv_buf() was named h3_decode_qcs().	2025-02-27 17:31:40 +01:00
Amaury Denoyelle	f6648d478b	BUG/MINOR: h3: do not report transfer as aborted on preemptive response HTTP/3 specification allows a server to emit the entire response even if only a partial request was received. In particular, this happens when request STREAM FIN is delayed and transmitted in an empty payload frame. In this case, qcc_abort_stream_read() was used by HTTP/3 layer to emit a STOP_SENDING. Remaining received data were not transmitted to the stream layer as they were simply discarded. However, this prevents FIN transmission to the stream layer. This causes the transfer to be considered as prematurely closed, resulting in a cL-- log line status. This is misleading to users who could interpret it as if the response was not sent. To fix this, disable STOP_SENDING emission on full preemptive response emission. Rx channel is kept opened until the client closes it with either a FIN or a RESET_STREAM. This ensures that the FIN signal can be relayed to the stream layer, which allows the transfer to be reported as completed. This should be backported up to 2.9.	2025-02-27 17:23:24 +01:00
Dragan Dosen	0ae7a5d672	BUG/MINOR: server: fix the "server-template" prefix memory leak The srv->tmpl_info.prefix was not freed in srv_free_params(). This could be backported to all stable versions.	2025-02-27 04:21:01 +01:00
Dragan Dosen	6838fe43a3	BUG/MEDIUM: server: properly initialize PROXY v2 TLVs The PROXY v2 TLVs were not properly initialized when defined with "set-proxy-v2-tlv-fmt" keyword, which could have caused a crash when validating the configuration or malfunction (e.g. when used in combination with "server-template" and/or "default-server"). The issue was introduced with commit 6f4bfed3a ("MINOR: server: Add parser support for set-proxy-v2-tlv-fmt"). This should be backported up to 2.9.	2025-02-27 04:20:45 +01:00
Olivier Houchard	706b008429	MEDIUM: servers: Add strict-maxconn. Maxconn is a bit of a misnomer when it comes to servers, as it doesn't control the maximum number of connections we establish to a server, but the maximum number of simultaneous requests. So add "strict-maxconn", that will make it so we will never establish more connections than maxconn. It extends the meaning of the "restricted" setting of tune.takeover-other-tg-connections, as it will also attempt to get idle connections from other thread groups if strict-maxconn is set.	2025-02-26 13:00:18 +01:00
Olivier Houchard	8de8ed4f48	MEDIUM: connections: Allow taking over connections from other tgroups. Allow haproxy to take over idle connections from other thread groups than our own. To control that, add a new tunable, tune.takeover-other-tg-connections. It can have 3 values, "none", where we won't attempt to get connections from the other thread group (the default), "restricted", where we only will try to get idle connections from other thread groups when we're using reverse HTTP, and "full", where we always try to get connections from other thread groups. Unless there is a special need, it is advised to use "none" (or restricted if we're using reverse HTTP) as using connections from other thread groups may have a performance impact.	2025-02-26 13:00:18 +01:00
Olivier Houchard	d31b1650ae	MEDIUM: pollers: Drop fd events after a takeover to another tgid. In pollers that support it, provide the generation number in addition to the fd, and, when an event happened, if the generation number is the same, but the tgid changed, then assume the fd was taken over by a thread from another thread group, and just delete the event from the current thread's poller, as we no longer want to hear about it.	2025-02-26 13:00:18 +01:00
Olivier Houchard	c36aae2af1	MINOR: pollers: Add a fixup_tgid_takeover() method. Add a fixup_tgid_takeover() method to pollers for which it makes sense (epoll, kqueue and evport). That method can be called after a takeover of a fd from a different thread group, to make sure the poller's internal structure reflects the new state.	2025-02-26 13:00:18 +01:00
Olivier Houchard	752c5cba5d	MEDIUM: epoll: Make sure we can add a new event Check that the call to epoll_ctl() succeeds, and if it does not, if we're adding a new event and it fails with EEXIST, then delete and re-add the event. There are a few cases where we may already have events for a fd. If epoll_ctl() fails for any reason, use BUG_ON to make sure we immediately crash, as this should not happen.	2025-02-26 13:00:18 +01:00
Olivier Houchard	c5cc09c00d	MINOR: fd: Add fd_lock_tgid_cur(). Add fd_lock_tgid_cur(), a function that will lock the tgid, without modifying its value.	2025-02-26 13:00:18 +01:00
Olivier Houchard	52b97ff8dd	MEDIUM: fd: Wait if locked in fd_grab_tgid() and fd_take_tgid(). Wait while the tgid is locked in fd_grab_tgid() and fd_take_tgid(). As that lock is barely used, it should have no impact.	2025-02-26 13:00:18 +01:00
Ilia Shipitsin	814b5dfe30	BUILD: add possibility to use different QuicTLS variants Initially QuicTLS started as a patchset on top of OpenSSL; the project has since started its own journey as QuicTLS, so we need to support both. ML: https://www.mail-archive.com/haproxy@formilux.org/msg45574.html GH: https://github.com/quictls/quictls/issues/244	2025-02-25 10:29:46 +01:00
Willy Tarreau	a826250659	OPTIM: connection: don't try to kill other threads' connection when !shared Users may have good reasons for using "tune.idle-pool.shared off", one of them being the cost of moving cache lines between cores, or the kernel- side locking associated with moving FDs. For this reason, when getting close to the file descriptors limits, we must not try to kill adjacent threads' FDs when the sharing of pools is disabled. This is extremely expensive and kills the performance. We must limit ourselves to our local FDs only. In such cases, it's up to the users to configure a large enough maxconn for their usages. Before this patch, perf top reported 9% CPU usage in connect_server() on the trylock used to kill connections when running at 4800 conns for a global maxconn of 6400 on a 128-thread server. Now it doesn't spend its time there anymore, and performance has increased by 12%. Note, it was verified that disabling the locks in such a case has no effect at all, so better keep them and stay safe.	2025-02-25 09:23:46 +01:00
Willy Tarreau	2e0bac90da	BUG/MEDIUM: stream: don't use localtime in dumps from a signal handler In issue #2861, Jarosław Rzeszótko reported another issue with "show threads", this time in relation with the conversion of a stream's accept date to local time. Indeed, if the libc was interrupted in this same function, it could have been interrupted with a lock held, then it's no longer possible to dump the date, and we face a deadlock. This is easy to reproduce with logging enabled. Let's detect we come from a signal handler and do not try to resolve the time to localtime in this case.	2025-02-24 13:40:42 +01:00
Willy Tarreau	fb7874c286	MINOR: tinfo: split the signal handler report flags into 3 While signals are not recursive, one signal (e.g. wdt) may interrupt another one (e.g. debug). The problem this causes is that when leaving the inner handler, it removes the outer's flag, hence the protection that comes with it. Let's just have 3 distinct flags for regular signals, debug signal and watchdog signal. We add a 4th definition which is an aggregate of the 3 to ease testing.	2025-02-24 13:37:52 +01:00
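The flag split described in the commit above can be sketched as follows. This is a minimal, single-threaded illustration; the names and values are hypothetical, and the real code operates on per-thread flags rather than on a plain integer passed by value.

```c
#include <assert.h>

/* Hypothetical flags, one per handler class, plus an aggregate for tests. */
#define FL_IN_SIG_HANDLER 0x01u  /* regular signal handler */
#define FL_IN_DBG_HANDLER 0x02u  /* debug signal handler */
#define FL_IN_WDT_HANDLER 0x04u  /* watchdog signal handler */
#define FL_IN_ANY_HANDLER (FL_IN_SIG_HANDLER | FL_IN_DBG_HANDLER | FL_IN_WDT_HANDLER)

/* Marks entry into one handler class and returns the updated flags. */
static unsigned int handler_enter(unsigned int flags, unsigned int which)
{
    assert((which & FL_IN_ANY_HANDLER) != 0);
    return flags | which;
}

/* Clears only the leaving handler's own bit, so an outer handler's
 * protection survives when an inner handler returns. */
static unsigned int handler_leave(unsigned int flags, unsigned int which)
{
    return flags & ~which;
}

/* Single test covering all three classes. */
static int in_any_handler(unsigned int flags)
{
    return (flags & FL_IN_ANY_HANDLER) != 0;
}
```

With a single shared flag, leaving the watchdog handler would also clear the indication that the interrupted debug handler is still running. With distinct bits, `in_any_handler()` remains true until the outer handler leaves.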
Willy Tarreau	bbf824933f	BUG/MINOR: h2: always trim leading and trailing LWS in header values Annika Wickert reported some occasional disconnections between haproxy and varnish when communicating over HTTP/2, with varnish complaining about protocol errors while captures looked apparently normal. Nils Goroll managed to reproduce this on varnish by injecting the capture of the outgoing haproxy traffic and noticed that haproxy was forwarding a header value containing a trailing space, which is now explicitly forbidden since RFC9113. It turns out that the only way for such a header to pass through haproxy is to arrive in h2 and not be edited, in which case it will arrive in HTX with its undesired spaces. Since the code dealing with HTX headers always trims spaces around them, these are not observable in dumps, but only when started in debug mode (-d). Conversions to/from h1 also drop the spaces. With this patch we trim LWS both on input and on output. This way we always present clean headers in the whole stack, and even if some are manually crafted by the configuration or Lua, they will be trimmed on the output. This must be backported to all stable versions. Thanks to Annika for the helpful capture and Nils for the help with the analysis on the varnish side!	2025-02-24 09:39:57 +01:00
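Trimming leading and trailing LWS from a header value, as the commit above describes, can be sketched like this. The function name and signature are assumptions made for illustration and do not reflect HAProxy's internal API; the value is treated as a pointer plus length, without requiring a terminating NUL.

```c
#include <assert.h>
#include <stddef.h>

/* Trims spaces and horizontal tabs at both ends of value[0..*len).
 * Returns a pointer to the first kept byte and stores the new length
 * in *len. The input buffer is not modified. */
static const char *trim_lws(const char *value, size_t *len)
{
    const char *start;
    const char *end;

    assert(value != NULL && len != NULL);
    start = value;
    end = value + *len;

    while (start < end && (*start == ' ' || *start == '\t'))
        start++;
    while (end > start && (end[-1] == ' ' || end[-1] == '\t'))
        end--;

    *len = (size_t)(end - start);
    return start;
}
```

Applying the same helper on both input and output, as the commit message suggests, also covers values crafted by the configuration or by Lua. A value made only of whitespace ends up with a length of zero.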
Vincent Dechenaux	9011b3621b	MINOR: compression: Introduce minimum size This is the introduction of "minsize-req" and "minsize-res". These two options allow you to set the minimum payload size required for compression to be applied. This helps save CPU on both server and client sides when the payload does not need to be compressed.	2025-02-22 11:32:40 +01:00
Willy Tarreau	e7510d6230	CLEANUP: task: move the barrier after clearing th_ctx->current There's a barrier after releasing the current task in the scheduler. However it's improperly placed, it's done after pool_free() while in fact it must be done immediately after resetting the current pointer. Indeed, the purpose is to make sure that nobody sees the task as valid when it's in the process of being released. This is something that could theoretically happen if interrupted by a signal in the inlined code of pool_free() if the compiler decided to postpone the write to ->current. In practice since nothing fancy is done in the inlined part of the function, there's currently no risk of reordering. But it could happen if the underlying __pool_free() were to be inlined for example, and in this case we could possibly observe th_ctx->current pointing to something currently being destroyed. With the barrier between the two, there's no risk anymore.	2025-02-21 18:31:46 +01:00
Willy Tarreau	eb41d768f9	MINOR: tools: use only opportunistic symbols resolution As seen in issue #2861, dladdr_and_size() can be quite expensive and will often hold a mutex in the underlying library. It becomes a real problem when issuing lots of "show threads" or wdt warnings in parallel because threads will queue up waiting for each other to finish, adding to their existing latency that possibly caused the warning in the first place. Here we're taking a different approach. If the thread is not isolated and not panicking, it's doing unimportant stuff like showing threads or warnings. In this case we try to grab a lock, and if we fail because another thread is already there, we just pretend we cannot resolve the symbol. This is not critical because then we fall back to the already used case which consists in writing "main+<offset>". In practice this will almost never happen except in bad situations which could have otherwise degenerated.	2025-02-21 18:26:29 +01:00
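The opportunistic approach from the commit above can be sketched with a C11 `atomic_flag` used as a try-lock. Everything here is hypothetical: `slow_resolve()` merely stands in for an expensive resolver, and the sketch only covers the non-critical path where the caller gives up instead of waiting.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Set while one thread is inside the resolver. */
static atomic_flag resolver_busy = ATOMIC_FLAG_INIT;

/* Hypothetical stand-in for an expensive symbol lookup. */
static const char *slow_resolve(const void *addr)
{
    (void)addr;
    return "some_symbol";
}

/* Returns the symbol name, or NULL if another thread is already
 * resolving. The caller is then expected to fall back to a cheaper
 * representation such as "main+<offset>". */
static const char *resolve_opportunistic(const void *addr)
{
    const char *name;

    assert(addr != NULL);
    if (atomic_flag_test_and_set_explicit(&resolver_busy, memory_order_acquire))
        return NULL;  /* busy: pretend the symbol cannot be resolved */

    name = slow_resolve(addr);
    atomic_flag_clear_explicit(&resolver_busy, memory_order_release);
    return name;
}
```

The design choice is that a failed attempt costs a single atomic operation, so threads emitting warnings in parallel never queue up behind each other.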
Willy Tarreau	3c22fa315b	BUG/MEDIUM: stream: use non-blocking freq_ctr calls from the stream dumper The stream dump function is called from signal handlers (warning, show threads, panic). It makes use of read_freq_ctr() which might possibly block if it tries to access a locked freq_ctr in the process of being updated, e.g. by the current thread. Here we're relying on the non-blocking API instead. It may return incorrect values (typically smaller ones after resetting the curr counter) but at least it will not block. This needs to be backported to stable versions along with the previous commit below: MINOR: freq_ctr: provide non-blocking read functions At least 3.1 is concerned as the warnings tend to increase the risk of this situation appearing.	2025-02-21 18:26:29 +01:00
Willy Tarreau	29e246a84c	MINOR: freq_ctr: provide non-blocking read functions Some code called by the debug handlers in the context of a signal handler accesses to some freq_ctr and occasionally ends up on a locked one from the same thread that is dumping it. Let's introduce a non-blocking version that at least allows to return even if the value is in the process of being updated, it's less problematic than hanging.	2025-02-21 18:26:29 +01:00
Willy Tarreau	84d4c948fc	BUG/MEDIUM: stream: never allocate connection addresses from signal handler In __strm_dump_to_buffer(), we call conn_get_src()/conn_get_dst() to try to retrieve the connection's IP addresses. But this function may be called from a signal handler to dump a currently running stream, and if the addresses were not allocated yet, a pool_alloc() will be performed while we might possibly already be running pools code, resulting in pool list corruption. Let's just make sure we don't call these sensitive functions there when called from a signal handler. This must be backported at least to 3.1 and ideally all other versions, along with this previous commit: MINOR: tinfo: add a new thread flag to indicate a call from a sig handler	2025-02-21 17:41:38 +01:00
Willy Tarreau	ddd173355c	MINOR: tinfo: add a new thread flag to indicate a call from a sig handler Signal handlers must absolutely not change anything, but some long and complex call chains may look innocuous at first glance, yet result in some subtle write accesses (e.g. pools) that can conflict with a running thread being interrupted. Let's add a new thread flag TH_FL_IN_SIG_HANDLER that is only set when entering a signal handler and cleared when leaving them. Note, we're speaking about real signal handlers (synchronous ones), not deferred ones. This will allow some sensitive call places to act differently when detecting such a condition, and possibly even to place a few new BUG_ON().	2025-02-21 17:41:38 +01:00
Willy Tarreau	a56dfbdcb4	BUG/MINOR: mux-h1: always make sure h1s->sd exists in h1_dump_h1s_info() This function may be called from a signal handler during a warning, a panic or a show thread. We need to be more cautious about what may or may not be dereferenced since an h1s is not necessarily fully initialized. Loops of "show threads" sometimes manage to crash when dereferencing a null h1s->sd, so let's guard it and add a comment reminding about the unusual call place. This can be backported to the relevant versions.	2025-02-21 17:41:38 +01:00
Willy Tarreau	9d5bd47634	BUG/MINOR: stream: do not call co_data() from __strm_dump_to_buffer() co_data() was instrumented to detect cases where c->output > data and emits a warning if that's not correct. The problem is that it happens quite a bit during "show threads" if it interrupts traffic anywhere, and that in some environments building with -DDEBUG_STRICT_ACTION=3, it will kill the process. Let's just open-code the channel functions that make access to co_data(), there are not that many and the operations remain very simple. This can be backported to 3.1. It didn't trigger in earlier versions because they didn't have this CHECK_IF_HOT() test.	2025-02-21 17:18:00 +01:00
Ilia Shipitsin	0bdf414fa5	CI: QUIC Interop: clean old docker images Currently, temporary docker images are kept forever. Let's delete outdated ones.	2025-02-21 11:34:43 +01:00
Aurelien DARRAGON	97a19517ff	MINOR: clock: always use atomic ops for global_now_ms global_now_ms is shared between threads so we must give hint to the compiler that read/writes operations should be performed atomically. Everywhere global_now_ms was used, atomic ops were used, except in clock_update_global_date() where a read was performed without using atomic op. In practise it is not an issue because on most systems such reads should be atomic already, but to prevent any confusion or potential bug on exotic systems, let's use an explicit _HA_ATOMIC_LOAD there. This may be backported up to 2.8	2025-02-21 11:22:35 +01:00
Aurelien DARRAGON	9561b9fb69	BUG/MINOR: sink: add tempo between 2 connection attempts for sft servers When the connection for sink_forward_{oc}_applet fails or a previous one is destroyed, the sft->appctx is instantly released. However process_sink_forward_task(), which may run at any time, iterates over all known sfts and tries to create sessions for orphan ones. It means that instantly after sft->appctx is destroyed, a new one will be created, thus a new connection attempt will be made. It can be an issue with tcp log-servers or sink servers, because if the server is unavailable, process_sink_forward() will keep looping without any temporisation until the applet survives (ie: connection succeeds), which results in unexpected CPU usage on the threads responsible for that task. Instead, we add a tempo logic so that a delay of 1second is applied between two retries. Of course the initial attempt is not delayed. This could be backported to all stable versions.	2025-02-21 11:22:35 +01:00
Aurelien DARRAGON	c9d4192726	BUG/MINOR: log: fix outgoing abns address family While reviewing the code in an attempt to fix GH #2875, I stumbled on another case similar to aac570c ("BUG/MEDIUM: uxst: fix outgoing abns address family in connect()") that caused abns(z) addresses to fail when used as log targets. The underlying cause is the same as aac570c, which is the rework of the unix socket families in order to support custom addresses for different addressing schemes, where a real_family() was overlooked before passing a haproxy-internal address struct to socket-oriented syscall. To fix the issue, we first copy the target's addr, and then leverage real_family() to set the proper low-level address family that is passed to sendmsg() syscall. It should be backported in 3.1	2025-02-21 11:22:28 +01:00
Aurelien DARRAGON	26d97ec148	REGTESTS: fix reg-tests/server/abnsz.vtc It was proved in GH #2875 that the regtest was broken, at least for the server-side abnsz, as the connect() was not performed using the proper family, which results in kernel refusing to perform the call, while the reg-test actually succeeds. Indeed, in the test we used vtest client to connect to haproxy, which then routed the request to another haproxy instance listening on an abnsz socket, and this last haproxy was the one to answer the http request. As we only used "rxresp" in vtest client, the test succeeded with empty responses, which was the case due to the server connection failing on the first haproxy process.	2025-02-21 08:22:25 +01:00
Willy Tarreau	aac570cd03	BUG/MEDIUM: uxst: fix outgoing abns address family in connect() Since we reworked the unix socket families in order to support custom addresses for different addressing schemes, we've been using extra values for the ss_family field in sockaddr_storage. These ones have to be adjusted before calling bind() or connect(). It turns out that after the abns/abnsz updates in 3.1, the connect() code was not adjusted to take care of the change, resulting in AF_CUST_ABNS or AF_CUST_ABNSZ to be placed in the address that was passed to connect(). The right approach is to locally copy the address, get its length, fixup the family and use the fixed value and length for connect(). This must be backported to 3.1. Many thanks for @Mewp for reporting this issue in github issue #2875.	2025-02-21 07:59:08 +01:00
Valentine Krasnobaeva	390df282c1	BUG/MINOR: cfgparse: fix NULL ptr dereference in cfg_parse_peers When "peers" keyword is followed by more than one argument and it's the first "peers" section in the config, cfg_parse_peers() detects it and exits with "ERR_ALERT|ERR_FATAL" err_code. So, upper layer parser, parse_cfg(), continues and parses the next keyword "peer" and then it tries to check the global cfg_peers, which should contain "my_cluster". The global cfg_peers is still NULL, because after alerting a user in alertif_too_many_args, cfg_parse_peers() exited. peers my_cluster __some_wrong_data__ peer haproxy1 1.1.1.1 1000 In order to fix this, let's add ERR_ABORT, if "peers" keyword is followed by more than one argument. Like this parse_cfg() will stop immediately and terminate haproxy with "too many args for peers my_cluster..." alert message. It's more reliable than adding checks "if (cfg_peers !=NULL)" in "peer" subparser, as we may have many "peers" sections. peers my_another_cluster peer haproxy1 1.1.1.2 1000 peers my_cluster __some_wrong_data__ peer haproxy1 1.1.1.1 1000 In addition, for the example above, parse_cfg() will parse all configuration until the end and only then terminate haproxy with the alert "too many args...". Peer haproxy1 will be wrongly associated with my_another_cluster. This fixes the issue #2872. This should be backported in all stable versions.	2025-02-20 17:10:26 +01:00
Christopher Faulet	851e52b551	BUG/MEDIUM: spoe/mux-spop: Introduce a NOOP action to deal with empty ACK In the SPOP protocol, ACK frames with empty payload are allowed. However, in that case, because only the payload is transferred, there is no data to return to the SPOE applet. Only the end of input is reported. Thus the applet is never woken up. It means that the SPOE filter will be blocked during the processing timeout and will finally return an error. To work around this issue, a NOOP action is introduced with the value 0. It is only an internal action for now. It does not exist in the SPOP protocol. When an ACK frame with an empty payload is received, this noop action is transferred to the SPOE applet, instead of nothing. Thanks to this trick, the applet is properly notified. This works because unknown actions are ignored by the SPOE filter. This patch must be backported to 3.1.	2025-02-20 11:56:27 +01:00
Christopher Faulet	efc46de294	BUG/MEDIUM: applet: Don't handle EOI/EOS/ERROR if applet is waiting for room The commit 7214dcd52 ("BUG/MEDIUM: applet: Don't pretend to have more data to handle EOI/EOS/ERROR") introduced a regression. Because of this patch, it was possible to handle EOI/EOS/ERROR applet flags too early while the applet was waiting for more room to transfer the last output data. This bug can be encountered with any applet using its own buffers (cache and stats for instance). And depending on the configuration and the timing, the data may be truncated or the stream may be blocked, infinitely or not. Streams blocked infinitely were observed with the cache applet and the HTTP compression enabled. For the record, it is important to detect EOI/EOS/ERROR applet flags to be able to report the corresponding event on the SE and by transitivity on the SC. Most of the time, this happens when some data should be transferred to the stream. The .rcv_buf callback function is called and these flags are properly handled. However, some applets may also report them spontaneously, outside of any data transfer. In that case, the .rcv_buf callback is not called. This is the purpose of this patch (and the one above): being able to detect pending EOI/EOS/ERROR applet flags. However, we must be sure to not handle them too early at this place. When these flags are set, it means no more data will be produced by the applet. So we must only wait to have transferred everything to the stream. And this happens when the applet is no longer waiting for more room. This patch must be backported to 3.1 with the one above.	2025-02-20 10:00:32 +01:00
Willy Tarreau	4ef6be4a1f	[RELEASE] Released version 3.2-dev6 Released version 3.2-dev6 with the following main changes : - BUG/MEDIUM: debug: close a possible race between thread dump and panic() - DEBUG: thread: report the spin lock counters as seek locks - DEBUG: thread: make lock time computation more consistent - DEBUG: thread: report the wait time buckets for lock classes - DEBUG: thread: don't keep the redundant _locked counter - DEBUG: thread: make lock_stat per operation instead of for all operations - DEBUG: thread: reduce the struct lock_stat to store only 30 buckets - MINOR: lbprm: add a new callback ->server_requeue to the lbprm - MEDIUM: server: allocate a tasklet for asyncronous requeuing - MAJOR: leastconn: postpone the server's repositioning under contention - BUG/MINOR: quic: reserve length field for long header encoding - BUG/MINOR: quic: fix CRYPTO payload size calcul for encoding - MINOR: quic: simplify length calculation for STREAM/CRYPTO frames - BUG/MINOR: mworker: section ignored in discovery after a post_section_parser - BUG/MINOR: mworker: post_section_parser for the last section in discovery - CLEANUP: mworker: "program" section does not have a post_section_parser anymore - MEDIUM: initcall: allow to register mutiple post_section_parser per section - CI: cirrus-ci: bump FreeBSD image to 14-2 - DOC: initcall: name correctly REGISTER_CONFIG_POST_SECTION() - REGTESTS: stop using truncated.vtc on freebsd - MINOR: quic: refactor STREAM encoding and splitting - MINOR: quic: refactor CRYPTO encoding and splitting - BUG/MEDIUM: fd: mark FD transferred to another process as FD_CLONED - BUG/MINOR: ssl/cli: "show ssl crt-list" lacks client-sigals - BUG/MINOR: ssl/cli: "show ssl crt-list" lacks sigals - MINOR: ssl/cli: display more filenames in 'show ssl cert' - DOC: watchdog: document the sequence of the watchdog and panic - MINOR: ssl: store the filenames resulting from a lookup in ckch_conf - MINOR: startup: allow hap_register_feature() to enable a feature in the list - MINOR: quic: support frame type as a varint - BUG/MINOR: startup: leave at first post_section_parser which fails - BUG/MINOR: startup: hap_register_feature() fix for partial feature name - BUG/MEDIUM: cli: Be sure to drop all input data in END state - BUG/MINOR: cli: Wait for the last ACK when FDs are xferred from the old worker - BUG/MEDIUM: filters: Handle filters registered on data with no payload callback - BUG/MINOR: fcgi: Don't set the status to 302 if it is already set - MINOR: ssl/crtlist: split the ckch_conf loading from the crtlist line parsing - MINOR: ssl/crtlist: handle crt_path == cc->crt in crtlist_load_crt() - MINOR: ssl/ckch: return from ckch_conf_clean() when conf is NULL - MEDIUM: ssl/crtlist: "crt" keyword in frontend - DOC: configuration: document the "crt" frontend keyword - DEV: h2: add a Lua-based HTTP/2 connection tracer - BUG/MINOR: quic: prevent crash on conn access after MUX init failure - BUG/MINOR: mux-quic: prevent crash after MUX init failure - DEV: h2: fix flags for the continuation frame - REGTESTS: Fix truncated.vtc to send 0-CRLF - BUG/MINOR: mux-h2: Properly handle full or truncated HTX messages on shut - Revert "REGTESTS: stop using truncated.vtc on freebsd" - MINOR: mux-quic: define a QCC application state member - MINOR: mux-quic/h3: emit SETTINGS via MUX tasklet handler - MINOR: mux-quic/h3: support temporary blocking on control stream sending	2025-02-19 18:39:51 +01:00
Amaury Denoyelle	a7645d7cd5	MINOR: mux-quic/h3: support temporary blocking on control stream sending When HTTP/3 layer is initialized via QUIC MUX, it first emits a SETTINGS frame on an unidirectional control stream. However, this could be prevented if client did not provide initial flow control. Previously, QUIC MUX was unable to deal with such situation. Thus, the connection was closed immediately and no transfer could occur. Improve this by extending QUIC MUX application layer API : initialization may now return a transient error. This allows MUX to continue to use the connection normally. Initialization will be retried periodically later until it can succeed. This new API allows to deal with the flow control issue described above. Note that this patch is not considered as a bug fix. Indeed, clients are strongly advised to provide enough flow control for a SETTINGS frame exchange.	2025-02-19 11:08:02 +01:00
Amaury Denoyelle	06e7674399	MINOR: mux-quic/h3: emit SETTINGS via MUX tasklet handler Previously, QUIC MUX application layer was installed and initialized via MUX init. However, the latter stage involves I/O operations, for example when using HTTP/3 with the emission of a SETTINGS frame. Change this to prevent any I/O operations during MUX init. As such, finalize app_ops callback is now called during the first invocation of qcc_io_send(), in the context of MUX tasklet. To implement this, a new application state value is added, to detect the transition from NULL to INIT stage.	2025-02-19 11:03:40 +01:00
Amaury Denoyelle	188fc45b95	MINOR: mux-quic: define a QCC application state member Introduce a new QCC field to track the current application layer state. For the moment, only INIT and SHUT state are defined. This allows to replace the older flag QC_CF_APP_SHUT. This commit does not bring major changes. It is only necessary to permit future evolutions on QUIC MUX. The only noticeable change is that QMUX traces can now display this new field.	2025-02-19 10:59:53 +01:00
Christopher Faulet	4a99f15f0c	Revert "REGTESTS: stop using truncated.vtc on freebsd" This reverts commit 0b9a75e8781593c250f6366a64a019018ade688e. Thanks to the previous fixes ("REGTESTS: Fix truncated.vtc to send 0-CRLF" and "BUG/MINOR: mux-h2: Properly handle full or truncated HTX messages on shut"), this script can be reenabled for FreeBSD.	2025-02-18 17:35:00 +01:00
Christopher Faulet	b70921f2c1	BUG/MINOR: mux-h2: Properly handle full or truncated HTX messages on shut On shut, truncated HTX messages were not properly handled by the H2 multiplexer. Depending on how data were emitted, a chunked HTX message without the 0-CRLF could be considered as full and an empty data with ES flag set could be emitted instead of a RST_STREAM(CANCEL) frame. In the H2 multiplexer, when a shut is performed, an HTX message is considered as truncated if more HTX data are still expected. It is based on the presence or not of the H2_SF_MORE_HTX_DATA flag on the H2 stream. However, this flag is set or unset depending on the HTX extra field value. This field is used to state how much data that must still be transferred, based on the announced data length. For a message with a content-length, this assumption is valid. But for a chunked message, it is not true. Only the length of the current chunk is announced. So we cannot rely on this field in that case to know if a message is full or not. Instead, we must rely on the HTX start-line flags to know if more HTX data are expected or not. If the xfer length is known (the HTX_SL_F_XFER_LEN flag is set on the HTX start-line), it means that more data are always expected, until the end of message is reached (the HTX_FL_EOM flag is set on the HTX message). This is true for bodyless message because the end of message is reported with the end of headers. This is also true for tunneled messages because the end of message is received before switching the H2 stream in tunnel mode. This patch must be backported as far as 2.8.	2025-02-18 17:34:59 +01:00
Christopher Faulet	b93e419750	REGTESTS: Fix truncated.vtc to send 0-CRLF When a chunked message is sent, the 0-CRLF must be explicitly sent. It has been missing since the beginning. Just add it.	2025-02-18 17:34:59 +01:00
Willy Tarreau	af5c07eee9	DEV: h2: fix flags for the continuation frame It's flag 2 (end of headers) that's defined there, not 3 (padded).	2025-02-18 14:17:17 +01:00
Amaury Denoyelle	2715dbe9d0	BUG/MINOR: mux-quic: prevent crash after MUX init failure qmux_init() may fail for several reasons. In this case, connection resources are freed and a CONNECTION_CLOSE will be emitted via the underlying quic_conn instance. In case of qmux_init() failure, qcc_release() is used to clean up resources, but QCC <conn> member is first reset to NULL, as connection release must be delayed. Some cleanup operations are thus skipped, one of them is the resetting of <ctx> connection member to NULL. This may cause a crash as <ctx> is a dangling pointer after QCC release. One possible reproducer is to activate QMUX traces, which will cause a segfault on the qmux_init() error leave trace. To fix this, simply reset <ctx> to NULL manually on qmux_init() failure. This must be backported up to 3.0.	2025-02-18 11:02:46 +01:00
Amaury Denoyelle	2cdc4695cb	BUG/MINOR: quic: prevent crash on conn access after MUX init failure Initially, QUIC-MUX was responsible for resetting quic_conn <conn> member to NULL when MUX was released. This was performed via qcc_release(). However, qcc_release() is also used on qmux_init() failure. In this case, connection must be freed via its session, so QCC <conn> member is reset to NULL prior to qcc_release(), which prevents quic_conn <conn> member from also being reset. As the connection is freed soon after, quic_conn <conn> is a dangling pointer, which may cause crashes. This bug should be very rare as first it implies that QUIC-MUX initialization has failed (for example due to a memory alloc error). Also, <conn> member is rarely used by quic_conn instance. In fact, the only reproducible crash was done with QUIC traces activated, as in this case connection is accessed via quic_conn under __trace_enabled() function. To fix this, detach connection from quic_conn via the XPRT layer instead of the MUX. More precisely, this is performed via quic_close(). This should ensure that it will always be conducted, both on normal connection closure and after special conditions such as MUX init failure. This should be backported up to 2.6.	2025-02-18 10:43:56 +01:00
Willy Tarreau	607aa57b2e	DEV: h2: add a Lua-based HTTP/2 connection tracer The following config is sufficient to trace H2 exchanges between a client and a server: global lua-load "dev/h2/h2-tracer.lua" listen h2_sniffer mode tcp bind :8002 filter lua.h2-tracer #hex server s1 127.0.0.1:8003 The commented "hex" argument will also display full frames in hex (not recommended). The connections are prefixed with a 3-hex digit number in order to also support a bit of multiplexing without impacting the reading too much. The screen is split in two, with the request on the left and the response on the right. Here's an example of what it does between an haproxy backend and an haproxy frontend both in H2, when submitted a curl request for /?s=30k handled by httpterm: [001] ### req start [001] [PREFACE len=24] [001] [SETTINGS sid=0 len=24 (bytes=24)] [001] \| ### res start [001] \| [SETTINGS sid=0 len=18 (bytes=27)] [001] \| [SETTINGS ACK sid=0 len=0 (bytes=0)] [001] [SETTINGS ACK sid=0 len=0 (bytes=56)] [001] [HEADERS EH+ES sid=1 len=47 (bytes=47)] [001] \| [HEADERS EH sid=1 len=101 (bytes=15351)] [001] \| [DATA sid=1 len=15126 (bytes=15241)] [001] \| [DATA sid=1 len=1258 (bytes=106)] [001] \| ... -106 = 1152 [001] \| ... -1152 = 0 [001] [WINDOW_UPDATE sid=1 len=4 (bytes=43)] [001] [WINDOW_UPDATE sid=0 len=4 (bytes=30)] [001] [WINDOW_UPDATE sid=1 len=4 (bytes=17)] [001] [WINDOW_UPDATE sid=0 len=4 (bytes=4)] [001] \| [DATA ES sid=1 len=14336 (bytes=14336)] [001] [WINDOW_UPDATE sid=0 len=4 (bytes=4)] [001] ### req end: 31080 bytes total [001] \| [GOAWAY sid=0 len=8 (bytes=8)] [001] \| ### res end: 31097 bytes total It deserves some improvements. For instance at the moment it does not verify the preface, any 24 bytes will work. It does not perform any protocol validation either. Detecting some issues such as out-of-sequence frames could be helpful. But it already helps as-is.	2025-02-18 09:26:15 +01:00
William Lallemand	764f6910ed	DOC: configuration: document the "crt" frontend keyword Document the "crt" keyword of frontend and listen section.	2025-02-17 18:26:37 +01:00
William Lallemand	cd6a02ace9	MEDIUM: ssl/crtlist: "crt" keyword in frontend This patch implements the "crt" keywords in frontend, declaring an implicit crt-list named after the frontend. The patch is split in two steps: The first step is the crt keyword parser, which parses crt lines and fill a "cfg_crt_node" struct containing a ssl_bind_conf and a ckch_conf which are put in a list to be used later. After parsing the frontend section, as a 2nd step, a post_section_parser is called, it will create a crt-list named after the frontend and will fill it with certificates from the list of cfg_crt_node. Once created this crt-list will be loaded in every "ssl" bind lines that didn't declare any crt or crt-list. Example: listen https bind :443 ssl crt foobar.pem crt test1.net.crt key test1.net.key Implements part of #2854	2025-02-17 18:26:37 +01:00
William Lallemand	82f927817e	MINOR: ssl/ckch: return from ckch_conf_clean() when conf is NULL ckch_conf_clean() mustn't be executed when the argument is NULL, this will keep the API more consistent, like any free() function.	2025-02-17 18:26:37 +01:00
William Lallemand	0330011acf	MINOR: ssl/crtlist: handle crt_path == cc->crt in crtlist_load_crt() Handle the case where crt_path == cc->crt, so the pointer doesn't get free'd before getting strdup'ed in crtlist_load_crt().	2025-02-17 18:26:37 +01:00
William Lallemand	69163cd63e	MINOR: ssl/crtlist: split the ckch_conf loading from the crtlist line parsing ckch_conf loading is not that simple as it requires checking: - if the cert already exists in the ckchs_tree - if the ckch_conf is compatible with an existing cert in ckchs_tree - if the cert is a bundle which needs to load multiple ckch_store. This logic could be reused elsewhere, so this commit introduces the new crtlist_load_crt() function which does that.	2025-02-17 18:26:37 +01:00
Christopher Faulet	ca79ed5eef	BUG/MINOR: fcgi: Don't set the status to 302 if it is already set When a "Location" header was found in a FCGI response, the status code was forced to 302. But it should only be performed if no status code was set first. So now, we take care to not override an already defined status code when the "Location" header is found. This patch should fix the issue #2865. It must be backported to all stable versions.	2025-02-17 16:37:53 +01:00
Christopher Faulet	34542d5ec2	BUG/MEDIUM: filters: Handle filters registered on data with no payload callback An HTTP filter with no http_payload callback function may be registered on data. In that case, this filter is obviously not called when some data are received but it remains important to update its internal state to be sure to keep it synchronized on the stream, especially its offset value. Otherwise, the wrong calculation on the global offset may be performed in flt_http_end(), leading to an integer overflow when data are moved from input to output. This overflow triggers a BUG_ON() in c_adv(). The same is true for TCP filters with no tcp_payload callback function. This patch must be backported to all stable versions.	2025-02-17 16:16:29 +01:00
Christopher Faulet	49b7bcf583	BUG/MINOR: cli: Wait for the last ACK when FDs are xferred from the old worker On reload, the new worker requests bound FDs from the old one. The old worker sends them in messages of at most 252 FDs. Each message is acknowledged by the new worker. All messages sent or received by the old worker are handled manually via sendmsg/recv syscalls. So the old worker must be sure to consume all the ACK replies. However, the last one was never consumed. So it was considered as a command by the CLI applet. This issue was hidden until recently. But it was the root cause of the issue #2862. Note this last ack is also the first one when there are fewer than 252 FDs to transfer. This patch must be backported to all stable versions.	2025-02-17 15:31:07 +01:00
Christopher Faulet	972ce87676	BUG/MEDIUM: cli: Be sure to drop all input data in END state Commit 7214dcd ("BUG/MEDIUM: applet: Don't pretend to have more data to handle EOI/EOS/ERROR") revealed a bug with the CLI applet. Pending input data when the applet is in CLI_ST_END state were never consumed or dropped, leading to a wakeup loop. The CLI applet implements its own snd_buf callback function. It is important it consumes all pending input data. Otherwise, the applet is woken up in a loop until it empties the request buffer. Another way to fix the issue would be to report an error. But in that case, it seems reasonable to drop these data. The issue can be observed on reload, in master/worker mode, because of the issue with the last ACK message which was never consumed by the _getsocks() command. This patch should fix the issue #2862. It must be backported to 3.1 with the commit above.	2025-02-17 15:31:07 +01:00
William Lallemand	ab2fa95bdd	BUG/MINOR: startup: hap_register_feature() fix for partial feature name In patch 2fe4cbd8e ("MINOR: startup: allow hap_register_feature() to enable a feature in the list"), the ability to overwrite a '-' in the feature list was added. However the code was not tokenizing the string correctly, and a partial match on a feature name could result in the same feature name appearing multiple times. This patch rewrites the lookup of the string by tokenizing it correctly.	2025-02-17 14:56:09 +01:00
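The difference between a substring search and a token-wise lookup can be sketched as follows. This is a minimal illustration only, not the code from the patch: it assumes the feature list is a space-separated string of names each prefixed with '+' or '-', and the helpers find_feature() and enable_feature() are hypothetical names introduced here.

```c
#include <stddef.h>
#include <string.h>

/* Return a pointer to the '+' or '-' sign of the token that names <name>
 * exactly in the space-separated <list>, or NULL if no token matches.
 * Comparing whole tokens means "SSL" never matches inside "OPENSSL".
 */
static char *find_feature(char *list, const char *name)
{
	size_t nlen = strlen(name);
	char *p = list;

	while (*p) {
		size_t tlen;

		p += strspn(p, " ");    /* skip separators */
		tlen = strcspn(p, " "); /* length of the current token */
		if (tlen == nlen + 1 && (*p == '+' || *p == '-') &&
		    memcmp(p + 1, name, nlen) == 0)
			return p;
		p += tlen;
	}
	return NULL;
}

/* Mark an already listed feature as enabled by overwriting its sign.
 * Returns 1 if the feature was found, 0 otherwise.
 */
static int enable_feature(char *list, const char *name)
{
	char *tok = find_feature(list, name);

	if (!tok)
		return 0;
	*tok = '+';
	return 1;
}
```

With a list such as "+OPENSSL -SSL -QUIC", enabling "SSL" only touches the second token, and a lookup for the partial name "QUI" finds nothing.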
William Lallemand	7268e9c249	BUG/MINOR: startup: leave at first post_section_parser which fails Since we are now iterating on post_section_parser() for a same keyword, we need to exit at the first ERR_ABORT. The post_section_parser() is called when parsing a new section, but also at the end of the file to be called for the last section. The changes in 4de86bb ("MEDIUM: initcall: allow to register mutiple post_section_parser per section") should have added tests on the ERR_ABORT value. Also pcs->post_section_parser() must be called instead of cs->post_section_parser() because we could have a NULL ptr. This bug does not affect anything since we don't use REGISTER_CONFIG_POST_SECTION() yet.	2025-02-17 11:21:20 +01:00
Amaury Denoyelle	32691e7c25	MINOR: quic: support frame type as a varint QUIC frame type is encoded as a variable-length integer. Thus, 64-bit integer should be used for them. Currently, this was not the case as type was represented as a 1-byte char inside quic_frame structure. This does not cause any issue with QUIC from RFC9000, as all frame types fit in this range. Furthermore, a QUIC implementation is required to use the smallest size varint when encoding a frame type. However, the current code is unable to accept QUIC extension with bigger frame types. This is notably the case for quic-on-streams draft. Thus, this commit readjusts quic_frame architecture to be able to support higher frame type values. First, type field of quic_frame is changed to a 64-bits variable. Both encoding and decoding frame functions uses variable-length integer helpers to manipulate the frame type field. Secondly, the quic_frame builders/parsers infrastructure is still preserved. However, it could be impossible to define new large frame type as an index into quic_frame_builders / quic_frame_parsers arrays. Thus, wrapper functions are now provided to access the builders and parsers. Both qf_builder() and qf_parser() wrappers can then be extended to return custom builder/parser instances for larger frame type. Finally, unknown frame type detection also uses the new wrapper quic_frame_is_known(). As with builders/parsers, for large frame type, this function must be manually completed to support a new type value.	2025-02-14 09:00:05 +01:00
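For readers unfamiliar with the encoding, here is a minimal sketch of QUIC variable-length integers as defined in RFC 9000 section 16, which is what allows a frame type to exceed one byte. The function names varint_len(), varint_encode() and varint_decode() are illustrative assumptions and are not HAProxy's own helpers.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of bytes of the smallest varint form able to hold <v>,
 * or 0 if <v> does not fit in 62 bits.
 */
static size_t varint_len(uint64_t v)
{
	if (v < (1ULL << 6))
		return 1;
	if (v < (1ULL << 14))
		return 2;
	if (v < (1ULL << 30))
		return 4;
	if (v < (1ULL << 62))
		return 8;
	return 0;
}

/* Encode <v> into <buf> using the smallest form, in network byte order.
 * Returns the number of bytes written, or 0 on error.
 */
static size_t varint_encode(unsigned char *buf, size_t size, uint64_t v)
{
	size_t len = varint_len(v);
	unsigned int prefix;
	size_t i;

	if (!len || len > size)
		return 0;
	for (i = 0; i < len; i++)
		buf[i] = (unsigned char)(v >> (8 * (len - 1 - i)));
	/* the two most significant bits hold log2 of the encoded length */
	prefix = (len == 1) ? 0 : (len == 2) ? 1 : (len == 4) ? 2 : 3;
	buf[0] |= (unsigned char)(prefix << 6);
	return len;
}

/* Decode a varint from <buf>. Returns the number of bytes consumed,
 * or 0 if the buffer is too short.
 */
static size_t varint_decode(const unsigned char *buf, size_t size, uint64_t *out)
{
	size_t len, i;
	uint64_t v;

	if (!size)
		return 0;
	len = (size_t)1 << (buf[0] >> 6);
	if (len > size)
		return 0;
	v = buf[0] & 0x3f;
	for (i = 1; i < len; i++)
		v = (v << 8) | buf[i];
	*out = v;
	return len;
}
```

All RFC 9000 frame types are below 64 and therefore encode on a single byte identical to their value, which is why a 1-byte type field was sufficient until extensions with larger types appeared.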
William Lallemand	2fe4cbd8e5	MINOR: startup: allow hap_register_feature() to enable a feature in the list This patch allows hap_register_feature() to enable a feature in the list which was already registered and marked disabled. This way we could enable automatically some features under certain condition without the need of the USE argument with make and correctly report its activation.	2025-02-14 00:09:17 +01:00
William Lallemand	7034f2ca48	MINOR: ssl: store the filenames resulting from a lookup in ckch_conf With this patch, files resulting from a lookup (.key, .ocsp, *.issuer etc) are now stored in the ckch_conf. It allows to see the original filename from where it was loaded in "show ssl cert <filename>"	2025-02-13 17:44:00 +01:00
Willy Tarreau	a4d65c9cc8	DOC: watchdog: document the sequence of the watchdog and panic Each time we go into the watchdog and panic code, it's super hard to figure who calls what since signals are involved to bounce between threads. Let's document the main principles and sequences to ease the journey next time.	2025-02-13 16:45:07 +01:00
William Lallemand	0c0b38d64c	MINOR: ssl/cli: display more filenames in 'show ssl cert' "show ssl cert <file>" only displays a unique filename, which is the key used in the ckch_store tree. This patch extends it by displaying every filename from the ckch_conf that can be configured with the crt-store. In order to be more consistent, some changes are needed in the future: - we need to store the complete path in the ckch_conf (meaning with crt-path or key-path) - we need to fill a ckch_conf in cases the files are autodiscovered	2025-02-13 16:18:06 +01:00
William Lallemand	5a7cbb8d81	BUG/MINOR: ssl/cli: "show ssl crt-list" lacks sigals 1d3c8223 ("MINOR: ssl: allow to change the server signature algorithm") implemented the sigals keyword in the crt-list but never the dump of the keyword over the CLI. Must be backported as far as 2.8.	2025-02-12 17:16:50 +01:00
William Lallemand	037d2e5498	BUG/MINOR: ssl/cli: "show ssl crt-list" lacks client-sigals b6ae2aafde43 ("MINOR: ssl: allow to change the signature algorithm for client authentication") implemented the client-sigals keyword in the crt-list but never the dump of the keyword over the CLI. Must be backported as far as 2.8.	2025-02-12 17:16:50 +01:00
Willy Tarreau	561319bd1c	BUG/MEDIUM: fd: mark FD transferred to another process as FD_CLONED The crappy epoll API struck again with reloads and transferred FDs. Indeed, when listening sockets are retrieved by a new worker from a previous one, and the old one finally stops listening on them, it closes the FDs. But in this case, since the sockets themselves were not closed, epoll will not unregister them and will continue to report new activity for these in the old process, which can only observe, count an fd_poll_drop event and not unregister them since they're not reachable anymore. The unfortunate effect is that long-lasting old processes are woken up at the same rate as the new process when accepting new connections, and can waste a lot of CPU. Accept rates divided by 8 were observed on a small test involving a slow transfer on 10 connections facing a reload every second so that 10 processes were busy dealing with them while another process was hammering the service with new connections. Fortunately, years ago we implemented a flag FD_CLONED exactly for similar purposes. Let's simply mark transferred FDs with FD_CLONED so that the process knows that these ones require special treatment and have to be manually unregistered before being closed. This does the job fine, now old processes correctly unregister the FD before closing it and no longer receive accept events for the new process. This needs to be backported to all stable versions. It only affects epoll, as usual, and this time in combination with transferred FDs (typically reloads in master-worker mode). Thanks to Damien Claisse for providing all detailed measurements and statistics allowing to understand and reproduce the problem.	2025-02-12 16:35:01 +01:00
Amaury Denoyelle	e2744d23be	MINOR: quic: refactor CRYPTO encoding and splitting This patch is the direct follow-up of the previous one which refactored STREAM frame encoding. Reuse the newly defined quic_strm_frm_fillbuf() and quic_strm_frm_split() functions for CRYPTO frame encoding. The code for CRYPTO and STREAM frames encoding should now be clearer as it is mostly identical.	2025-02-12 15:10:54 +01:00
Amaury Denoyelle	f96af8e463	MINOR: quic: refactor STREAM encoding and splitting CRYPTO and STREAM frames encoding is similar. If the payload is too large, the frame will be split and only the first payload part will be written in the output QUIC packet. This process is complicated by the presence of a variable-length integer Length field prior to the payload. This commit aims at refactoring these operations. Define two functions to simplify the code : * quic_strm_frm_fillbuf() which is used to calculate the optimal frame length of a STREAM/CRYPTO frame with its payload in a buffer * quic_strm_frm_split() which is used to split the frame payload if buffer is too small With this patch, both functions are now implemented for STREAM encoding.	2025-02-12 15:10:03 +01:00
William Lallemand	0b9a75e878	REGTESTS: stop using truncated.vtc on freebsd We never succeeded in making the truncated.vtc reg-test work consistently on the Cirrus FreeBSD CI. Let's exclude it from the FreeBSD tests so the CI doesn't break randomly.	2025-02-12 13:34:40 +01:00
William Lallemand	0b47e5fa20	DOC: initcall: name correctly REGISTER_CONFIG_POST_SECTION() REGISTER_CONFIG_POST_SECTION() was not named correctly.	2025-02-12 13:27:44 +01:00
William Lallemand	6097938209	CI: cirrus-ci: bump FreeBSD image to 14-2 FreeBSD CI seems to have been broken for a while, try to upgrade the image to the latest 14.2 version.	2025-02-12 13:18:55 +01:00
William Lallemand	4de86bbbfc	MEDIUM: initcall: allow to register mutiple post_section_parser per section Before this patch, REGISTER_CONFIG_SECTION() allowed to register one and only one callback (<post>) called after the parsing of a section. It was limiting because you couldn't register a post callback from anywhere else in the code. This patch introduces the new REGISTER_CONFIG_SECTION_POST() macros which allows to register a new post callback for a section keyword from anywhere. This patch introduces the feature by allowing `struct cfg_section` entries that do not have a `section_parser`, and then iterating on all cfg_section with a post_section_parser for a keyword.	2025-02-12 12:52:41 +01:00
William Lallemand	5c2039b5b8	CLEANUP: mworker: "program" section does not have a post_section_parser anymore The "program" section does not have a post_section_parser anymore so no need to make an exception for it.	2025-02-12 12:37:01 +01:00
William Lallemand	313eeae7db	BUG/MINOR: mworker: post_section_parser for the last section in discovery Previous patch 2c270a05f ("BUG/MINOR: mworker: section ignored in discovery after a post_section_parser") needs an adjustment for the last section of the file. Indeed the post_section_parser of the last section must not be called in discovery mode. Must be backported in 3.1.	2025-02-12 12:34:57 +01:00
William Lallemand	2c270a05f0	BUG/MINOR: mworker: section ignored in discovery after a post_section_parser When a new section is discovered, the post_section_parser of the previous section is called. However in the new master-worker mode the discovery mode will skip the post_section_parser. But instead of trying to parse the current section keyword after that, it would skip completely the current line. This is a minor bug since there aren't a lot of sections with post_section_parser, and not a lot of sections to parse in discovery mode. But this could be reproduced like this: global expose-deprecated-directives resolvers res parse-resolv-conf program foo command sleep 10 program bar command sleep 10 The 'resolvers' section has a post_section_parser which will be ignored in discovery mode with the consequence of ignoring the first program section. This must be backported in 3.1.	2025-02-12 12:18:17 +01:00
Amaury Denoyelle	731340afbd	MINOR: quic: simplify length calculation for STREAM/CRYPTO frames STREAM and CRYPTO frames have a similar encoding format. In particular, both of them have a variable-length integer Length field just before the frame payload. It is complex to determine the optimal Length value before copying the payload data in the remaining buffer space. As such, helper functions were implemented to calculate this. However, CRYPTO and STREAM frames encoding implementation were not completely aligned, which renders the code harder to follow. The purpose of this commit is to simplify CRYPTO and STREAM frames encoding. First, a new helper quic_int_cap_length() is defined which is useful to determine the optimal buffer room available if prefixed by a variable-length integer as Length field. Then, processing of both CRYPTO and STREAM frames is now nearly identical, based on this new helper function. Functions max_available_room() and max_stream_data_size() are now unused and are removed.	2025-02-12 11:51:09 +01:00
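The difficulty mentioned above is that the size of the Length field depends on the payload length it announces. Below is a minimal sketch of that calculation under stated assumptions: max_payload_with_length_prefix() is a hypothetical name, it is not quic_int_cap_length(), and it only answers "how many payload bytes fit in <room> bytes once a QUIC varint Length field is placed in front".

```c
#include <stddef.h>
#include <stdint.h>

/* Return the largest payload length P such that a QUIC varint encoding
 * of P followed by P payload bytes fits in <room> bytes. Each varint
 * form (1, 2, 4 or 8 bytes) caps the value it can announce, so the best
 * candidate of every form is evaluated and the largest one is kept.
 */
static size_t max_payload_with_length_prefix(size_t room)
{
	static const struct {
		size_t len;   /* size of the Length field in bytes */
		uint64_t max; /* largest value this form can carry */
	} forms[] = {
		{ 1, (1ULL << 6) - 1 },
		{ 2, (1ULL << 14) - 1 },
		{ 4, (1ULL << 30) - 1 },
		{ 8, (1ULL << 62) - 1 },
	};
	size_t best = 0;
	size_t i;

	for (i = 0; i < sizeof(forms) / sizeof(forms[0]); i++) {
		uint64_t cand;

		if (room <= forms[i].len)
			break; /* no room left for any payload */
		cand = room - forms[i].len;
		if (cand > forms[i].max)
			cand = forms[i].max;
		if (cand > best)
			best = (size_t)cand;
	}
	return best;
}
```

The boundaries are where naive code goes wrong: with 65 bytes of room, only 63 payload bytes fit, because announcing 64 would require a 2-byte Length field and 66 bytes in total.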
Amaury Denoyelle	e6a223542a	BUG/MINOR: quic: fix CRYPTO payload size calcul for encoding Function max_stream_data_size() is used to determine the payload length of a CRYPTO frame. It takes into account that the CRYPTO length field is a variable length integer. The implemented calculation was incorrect as it reserved too much space as a frame header. This error is mostly due to the fact that max_stream_data_size() reuses max_available_room() which also reserves space for a variable length integer. This results in CRYPTO frames 1 to 2 bytes shorter than the maximum achievable value, which in the end produces datagrams shorter than the MTU. Fix max_stream_data_size() implementation. It is now merely a wrapper on max_available_room(). This ensures that CRYPTO frame encoding is now properly optimized to use the MTU available. This should be backported up to 2.6.	2025-02-12 11:51:09 +01:00
Amaury Denoyelle	63747452a3	BUG/MINOR: quic: reserve length field for long header encoding Long header packets have a mandatory Length field, which contains the size of Packet number and payload, encoded as a variable-length integer. Its value can thus only be determined after the payload size is known, which depends on the remaining buffer space after this variable-length field. Packet payloads are encoded in two steps. First, a list of input frames is processed until the packet buffer is full. CRYPTO and STREAM frame payloads can be split if needed to fill the buffer. Real encoding is then performed as a second stage operation, first with Length field, then with the selected frames themselves. Before this patch, no space was reserved in the buffer for Length field when attaching the frames to the packet. This could result in an error as the packet payload would be too large for the remaining space. In practice, this issue was rarely encountered, mostly as a side-effect from another issue linked to CRYPTO frame encoding. Indeed, a wrong calculation is performed on CRYPTO splitting, which results in frame payload shorter by a few bytes than expected. This however ensured there would be always enough room for the Length field and payload during encoding. As CRYPTO frames are the only big enough content emitted with a Long header packet, this renders the current issue mostly non reproducible. Fix the original issue by reserving some space for Length field prior to frame payload calculation, using a maximum value based on the remaining room space. Packet length is then reduced if needed when encoding is performed, which ensures there is always enough room for the selected frames. Note that the other issue impacting CRYPTO frame encoding is not yet fixed. This could result in datagrams with Long header packets not completely extended to the full MTU. The issue will be addressed in another patch. This should be backported up to 2.6.	2025-02-12 11:51:09 +01:00
Willy Tarreau	627280e15f	MAJOR: leastconn: postpone the server's repositioning under contention When leastconn is used under many threads, there can be a lot of contention on leastconn, because the same node has to be moved around all the time (when picking it and when releasing it). In GH issue #2861 it was noticed that 46 threads out of 64 were waiting on the same lock in fwlc_srv_reposition(). In such a case, the accuracy of the server's key becomes quite irrelevant because nobody cares if the same server is picked twice in a row and the next one twice again. While other approaches in the past considered using a floating key to avoid moving the server each time (which was not compatible with the round-robin rule for equal keys), here a more drastic solution is needed. What we're doing instead is that we turn this lock into a trylock. If we can grab it, we do the job. If we can't, then we just wake up a server's tasklet dedicated to this. That tasklet will then try again slightly later, knowing that during this short time frame, the server's position in the queue is slightly inaccurate. Note that any thread touching the same server will also reposition it and save that work for next time. Also if multiple threads wake the tasklet up, then that's fine, their calls will be merged and a single lock will be taken in the end. Testing this on a 24-core EPYC 74F3 showed a significant performance boost from 382krps to 610krps. The performance profile reported by perf top dropped from 43% to 2.5%: Before: Overhead Shared Object Symbol 43.46% haproxy-master-inlineebo [.] fwlc_srv_reposition 21.20% haproxy-master-inlineebo [.] fwlc_get_next_server 0.91% haproxy-master-inlineebo [.] process_stream 0.75% [kernel] [k] ice_napi_poll 0.51% [kernel] [k] tcp_recvmsg 0.50% [kernel] [k] ice_start_xmit 0.50% [kernel] [k] tcp_ack After: Overhead Shared Object Symbol 30.37% haproxy [.] fwlc_get_next_server 2.51% haproxy [.] fwlc_srv_reposition 1.91% haproxy [.] 
process_stream 1.46% [kernel] [k] ice_napi_poll 1.36% [kernel] [k] tcp_recvmsg 1.04% [kernel] [k] tcp_ack 1.00% [kernel] [k] skb_release_data 0.96% [kernel] [k] ice_start_xmit 0.91% haproxy [.] conn_backend_get 0.82% haproxy [.] connect_server 0.82% haproxy [.] run_tasks_from_lists Tested on an Ampere Altra with 64 aarch64 cores dedicated to haproxy, the gain is even more visible (3.6x): Before: 311-323k rps, 3.16-3.25ms, 6400% CPU Overhead Shared Object Symbol 55.69% haproxy-master [.] fwlc_srv_reposition 33.30% haproxy-master [.] fwlc_get_next_server 0.89% haproxy-master [.] process_stream 0.45% haproxy-master [.] h1_snd_buf 0.34% haproxy-master [.] run_tasks_from_lists 0.32% haproxy-master [.] connect_server 0.31% haproxy-master [.] conn_backend_get 0.31% haproxy-master [.] h1_headers_to_hdr_list 0.24% haproxy-master [.] srv_add_to_idle_list 0.23% haproxy-master [.] http_request_forward_body 0.22% haproxy-master [.] __pool_alloc 0.21% haproxy-master [.] http_wait_for_response 0.21% haproxy-master [.] h1_send After: 1.21M rps, 0.842ms, 6400% CPU Overhead Shared Object Symbol 17.44% haproxy [.] fwlc_get_next_server 6.33% haproxy [.] process_stream 4.40% haproxy [.] fwlc_srv_reposition 3.64% haproxy [.] conn_backend_get 2.75% haproxy [.] connect_server 2.71% haproxy [.] h1_snd_buf 2.66% haproxy [.] srv_add_to_idle_list 2.33% haproxy [.] run_tasks_from_lists 2.14% haproxy [.] h1_headers_to_hdr_list 1.56% haproxy [.] stream_set_backend 1.37% haproxy [.] http_request_forward_body 1.35% haproxy [.] http_wait_for_response 1.34% haproxy [.] h1_send And at similar loads, the CPU usage considerably drops (3.55x), as well as the response time (10x): After: 320k rps, 0.322ms, 1800% CPU Overhead Shared Object Symbol 7.62% haproxy [.] process_stream 4.64% haproxy [.] h1_headers_to_hdr_list 3.09% haproxy [.] h1_snd_buf 3.08% haproxy [.] h1_process_demux 2.22% haproxy [.] __pool_alloc 2.14% haproxy [.] connect_server 1.87% haproxy [.] h1_send > 1.84% haproxy [.] 
fwlc_srv_reposition 1.84% haproxy [.] run_tasks_from_lists 1.77% haproxy [.] sock_conn_iocb 1.75% haproxy [.] srv_add_to_idle_list 1.66% haproxy [.] http_request_forward_body 1.65% haproxy [.] wake_expired_tasks 1.59% haproxy [.] h1_parse_msg_hdrs 1.51% haproxy [.] http_wait_for_response > 1.50% haproxy [.] fwlc_get_next_server The cost of fwlc_get_next_server() naturally increases as the server count increases, but now has no visible effect on updates. The load distribution remains unchanged compared to the previous approach, the weight still being respected. For further improvements to the fwlc algo, please consult github issue #881 which centralizes everything related to this algorithm.	2025-02-12 11:48:10 +01:00
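The "try the lock, otherwise defer" idea described in this commit can be sketched as follows. This is a simplified illustration under loud assumptions: the tree lock is replaced by a toy C11 atomic_flag try-lock, the tasklet wakeup is modelled by a pending flag, and every name (lb_tree, srv_reposition(), srv_requeue_tasklet(), ...) is hypothetical rather than taken from HAProxy.

```c
#include <stdatomic.h>

struct lb_tree {
	atomic_flag lock;          /* toy try-lock standing in for the tree lock */
	unsigned long repositions; /* counts how many times the tree was updated */
};

struct server {
	struct lb_tree *tree;
	atomic_int requeue_pending; /* models "the requeue tasklet was woken up" */
};

static int tree_trylock(struct lb_tree *t)
{
	/* returns 1 when the lock was free and is now held by the caller */
	return !atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire);
}

static void tree_unlock(struct lb_tree *t)
{
	atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/* Fast path: reposition now if the lock is free, otherwise only record
 * that a requeue is needed. Concurrent requests merge into one flag.
 * Returns 1 if the work was done immediately, 0 if it was deferred.
 */
static int srv_reposition(struct server *s)
{
	if (tree_trylock(s->tree)) {
		s->tree->repositions++; /* placeholder for the real tree update */
		tree_unlock(s->tree);
		return 1;
	}
	atomic_store(&s->requeue_pending, 1);
	return 0;
}

/* Deferred path, run slightly later. Returns 1 if it repositioned the
 * server, 0 if nothing was pending or the lock was still busy.
 */
static int srv_requeue_tasklet(struct server *s)
{
	if (!atomic_load(&s->requeue_pending))
		return 0;
	if (!tree_trylock(s->tree))
		return 0; /* still contended, the request stays pending */
	atomic_store(&s->requeue_pending, 0);
	s->tree->repositions++;
	tree_unlock(s->tree);
	return 1;
}
```

The trade-off is the one stated in the message: while a requeue is pending, the server's position is slightly stale, which is acceptable under contention because no thread ever blocks on the lock in the fast path.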
Willy Tarreau	b6a8318cc2	MEDIUM: server: allocate a tasklet for asyncronous requeuing This creates a tasklet that only expects to be called when the LB algorithm is under contention when trying to reposition the server in its tree. Indeed, that's one of the operations that usually requires to take a write lock on a highly contended area, often for very little benefits under contention; indeed, under load, if a server keeps its previous position for a few extra microseconds, usually there's no harm. Thus this new tasklet can be woken up by the LB algo to ask the server to later call lbprm.server_requeue(). It does nothing else.	2025-02-11 17:24:09 +01:00
Willy Tarreau	20b8c4ddba	MINOR: lbprm: add a new callback ->server_requeue to the lbprm This callback will be used to reposition a server to its expected position regardless of the fact that it was taken or dropped. It will only be used by supporting LB algos. For now, only fwlc defines it and assigns it to fwlc_srv_reposition(). At the moment it's not used yet.	2025-02-11 17:16:14 +01:00
Willy Tarreau	eced1d6d8a	DEBUG: thread: reduce the struct lock_stat to store only 30 buckets Storing only 30 buckets means we only keep 256 bytes per label. This further simplifies address calculation and reduces the memory used without complicating the locking code. It means we won't measure wait times larger than a second but we're not supposed to face this as it would trigger the watchdog anyway. It may become a little bit tight if measuring using rdtsc() instead of now_mono_time() though (typically the limit would be around 350ms for a 3 GHz CPU).	2025-02-10 18:34:43 +01:00
Willy Tarreau	c2f2d6fd3c	DEBUG: thread: make lock_stat per operation instead of for all operations It's more convenient (and more readable) to have the lock stats arranged by operation type (read, seek, write). It will also allow to later simplify the structure format and the bucket address calculation. Now lock_stat[] got split into lock_stats_rd[], lock_stats_sk[], lock_stats_wr[].	2025-02-10 18:34:43 +01:00
Willy Tarreau	4168d1278c	DEBUG: thread: don't keep the redundant _locked counter Now that we have our sums by bucket, the _locked counter is redundant since it's always equal to the sum of all entries. Let's just get rid of it and replace its consumption with a loop over all buckets, this will reduce the overhead of taking each lock at the expense of a tiny extra effort when dumping all locks, which we don't care about.	2025-02-10 18:34:43 +01:00
Willy Tarreau	a22550fbd7	DEBUG: thread: report the wait time buckets for lock classes In addition to the total/average wait time, we now also store the wait time in 2^N buckets. There are 32 buckets for each type (read, seek, write), allowing to store wait times from 1-2ns to 2.1-4.3s, which is quite sufficient, even if we'd want to switch from NS to CPU cycles in the future. The counters are only reported for non- zero buckets so as not to visually pollute the output. This significantly inflates the lock_stat struct, which is now aligned to 256 bytes and rounded up to 1kB. But that's not really a problem, given that there's only one per lock label.	2025-02-10 18:34:43 +01:00
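A power-of-two histogram like the one described here can be sketched as below. This is only an illustration, assuming bucket N covers wait times in [2^N, 2^(N+1)) nanoseconds with the last bucket absorbing anything larger; wait_bucket(), wait_hist_add() and wait_hist_total() are hypothetical names and not the fields or functions of struct lock_stat.

```c
#include <stdint.h>

#define WAIT_BUCKETS 32

struct wait_hist {
	uint64_t buckets[WAIT_BUCKETS]; /* one counter per 2^N ns range */
};

/* Index of the bucket for a wait time of <ns> nanoseconds: the position
 * of the highest set bit, capped to the last bucket. 0 and 1 both land
 * in bucket 0.
 */
static unsigned int wait_bucket(uint64_t ns)
{
	unsigned int b = 0;

	while (ns > 1 && b < WAIT_BUCKETS - 1) {
		ns >>= 1;
		b++;
	}
	return b;
}

/* Account one lock operation that waited <ns> nanoseconds. */
static void wait_hist_add(struct wait_hist *h, uint64_t ns)
{
	h->buckets[wait_bucket(ns)]++;
}

/* Total number of accounted operations, obtained by summing the buckets
 * instead of maintaining a separate counter.
 */
static uint64_t wait_hist_total(const struct wait_hist *h)
{
	uint64_t sum = 0;
	unsigned int i;

	for (i = 0; i < WAIT_BUCKETS; i++)
		sum += h->buckets[i];
	return sum;
}
```

With 32 buckets, the last one starts at 2^31 ns, that is about 2.1 s, which matches the upper range quoted in the message.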
Willy Tarreau	0b849c59fb	DEBUG: thread: make lock time computation more consistent The lock time computation was a bit inconsistent between functions, particularly those using a try_lock. Some of them would count the lock as taken without counting the time, others would simply not count it. This is essentially due to the way the time is retrieved, as it was done inside the atomic increment. Let's instead always use start_time to carry the elapsed time, by presetting it to the negative time before the event and adding the positive time after, so that it finally contains the duration. Then depending on the try lock's success, we add the result or not. This was generalized to all lock functions for consistency, and because this will be handy for future changes.	2025-02-10 18:34:43 +01:00
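The "negative time before, positive time after" idiom from the entry above can be sketched as follows. This is only an illustration: the clock and the try-lock are hypothetical stand-ins (`fake_clock`, `fake_trylock`) so the example is deterministic, and the struct fields are assumed names rather than haproxy's.

```c
#include <stdint.h>

/* Hypothetical per-label counters. */
struct lock_time {
	uint64_t nsec_wait;   /* cumulated wait time */
	uint64_t num_locked;  /* number of successful acquisitions */
};

/* Hypothetical monotonic clock, advanced manually for the demo. */
static uint64_t fake_now_ns;

static uint64_t fake_clock(void)
{
	return fake_now_ns;
}

/* Hypothetical try-lock that "spends" 40ns and reports <result>. */
static int fake_trylock(int result)
{
	fake_now_ns += 40;
	return result;
}

/* start_time is preset to the negated time before the attempt, then the
 * time after the attempt is added, so it ends up holding the elapsed
 * duration (unsigned arithmetic wraps, which makes this well defined).
 * The duration is only accounted when the try-lock succeeded.
 */
static int timed_trylock(struct lock_time *st, int result)
{
	uint64_t start_time = -fake_clock();
	int ok = fake_trylock(result);

	start_time += fake_clock();
	if (ok) {
		st->nsec_wait += start_time;
		st->num_locked++;
	}
	return ok;
}
```

Carrying the duration in a single variable keeps the success and failure paths identical up to the final accounting step, which is the consistency the message is after.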
Willy Tarreau	99a88ee904	DEBUG: thread: report the spin lock counters as seek locks Technically speaking, spin locks use a seek lock, not a write lock, so better count them appropriately for consistency (lock time, or function calls count).	2025-02-10 18:34:43 +01:00
Willy Tarreau	7ddcdff33f	BUG/MEDIUM: debug: close a possible race between thread dump and panic() The rework of the thread dumping mechanism in 2.8 with commit 9a6ecbd590 ("MEDIUM: debug: simplify the thread dump mechanism") opened a small race, which is that a thread in the process of dumping other ones may block the other one from panicking while it's looping at the end of ha_thread_dump_fill(), or any other sequence involving the currently dumped one. This was emphasized in 3.1 with commit 148eb5875f ("DEBUG: wdt: better detect apparently locked up threads and warn about them") that allowed to emit warnings about long-stuck threads, because in this case, what happens is that sometimes a thread starts to emit a warning (or a set of warnings), and while the warning is being awaited, a panic finally happens and interrupts either the dumping thread, which never finishes and waits for the target's pointer to become NULL which will never happen since it was supposed to do it itself, or the currently dumped thread which could wait for the dumping thread to become ready while this one has not released the former. In order to address this, first we now make sure never to dump a thread that is already in the process of dumping another one. We're adding a new thread flag to know this situation, that is set in ha_thread_dump_fill() and cleared in ha_thread_dump_done(). And similarly, we don't trigger the watchdog on a thread waiting for another one to finish its dump, as it's likely a case of warning (and maybe even a panic) that makes them wait for each other and we don't want such cases to be reentrant. Finally, we check in the main polling loop that the flag never accidentally leaked (e.g. wrong flag manipulation) as this would be difficult to spot with bad consequences. This should be backported at least to 2.8, and should resolve github issue #2860. Thanks to Chris Staite for the very informative backtrace that exhibited the problem.	2025-02-10 18:34:26 +01:00
Willy Tarreau	37e84676c7	[RELEASE] Released version 3.2-dev5 Released version 3.2-dev5 with the following main changes : - BUG/MINOR: ssl: put ssl_sock_load_ca under SSL_NO_GENERATE_CERTIFICATES - CLEANUP: ssl: rename ssl_sock_load_ca to ssl_sock_gencert_load_ca - CLEANUP: ssl: move ssl_sock_gencert_load_ca declaration in ssl_gencert.h - CLEANUP: tree-wide: define and use acl_match_cond() helper - MINOR: epoll: permit to mask certain specific events - MINOR: proxies: Add a per-thread group field to struct proxy. - MINOR: Add fields to the per-thread group field in struct server. - MINOR: proxies/servers: Calculate queueslength and use it. - MEDIUM: servers/proxies: Switch to using per-tgroup queues. - BUG/MINOR: stream: Properly handle "on-marked-up shutdown-backup-sessions" - MEDIUM: stream: Map task wake up reasons to dedicated stream events - MEDIUM: stream: No longer use TASK_F_UEVT* to shut a stream down - BUILD: tools: fix build on BSD by dropping the ETIME check - MINOR: queues: use __ha_cpu_relax() on failed CAS. 
- BUILD: queues: Use unsigned int when needed - BUILD: ssl: allow to build without the renegotiation API of WolfSSL - BUILD: ssl: more cleaner approach to WolfSSL without renegotiation - BUG/MEDIUM: chunk: make sure to flush the trash pool before resizing - MINOR: quic: remove references to burst in quic-cc-algo parsing - MINOR: quic: allow BBR testing without pacing - MINOR: quic: transform pacing settings into a global option - MAJOR: quic: mark pacing as stable and enable it by default - MINOR: quic: mark BBR as stable - MINOR: quic: define quic_tune - BUILD: quic: fix overflow in global tune - DEBUG: fd: add a counter of takeovers of an FD since it was last opened - MINOR: fd: add a generation number to file descriptors - DEBUG: epoll: store and compare the FD's generation count with reported event - MEDIUM: epoll: skip reports of stale file descriptors - MINOR: mux-h1: Add masks to group H1S DEMUX and MUX errors - BUG/MINOR: mux-h1: Only report a SE error on demux error - MINOR: tevt: Add the termination events log's fundations - MINOR: tevt/stconn: Add a termination events log in the SE descriptor - MINOR: tevt/mux-h1: Report termination events for the H1C and H1S - MINOR: tevt/mux-h2: Report termination events for the H2C - MINOR: tevt/stream/stconn: Report termination events for stream and sc - MINOR: tevt/conn: Report intercepted event for L4 rules - MINOR: tevt/mux-h1/mux-h2: Add termination events log when dumping mux info - MINOR: tevt/muxes: Add CTL and SCTL command to get the termination event logs - MINOR: tevt/mux-pt: Add support for termination event logs - MINOR: tevt/connection: Add dedicated termination events for lower locations - MEDIUM: tevt/muxes: Add dedicated termination events for muxc/se locations - MINOR: tevt/stconn: Be more accurate to report shutw events - MEDIUM: tevt/stconn/stream: Add dedicated termination events for stream location - MINOR: tevt: Don't duplicate termination event during reporting - MINOR: tevt/applet: Add limited 
support for termination event logs for applets - MINOR: tevt: Add a sample to get termination events for all locations - MINOR: tevt: Improve function to convert a termination events log to string - REORG: tevt/connection: Move enums at the end of the header file - MINOR: tevt/dev: Add term_events tool - MINOR: tevt/connection: Add support for POLL_HUP/POLL_ERR events - MINOR: tevt/dev: Parse tuple of termination events - BUG/MEDIUM: htx: wrong count computation in htx_xfer_blks() - DOC: htx: clarify <mark> parameter for htx_xfer_blks() - BUILD: quic: remove GCC undefined error in qc_release_lost_pkts() - MEDIUM: htx: prevent <mark> to copy incomplete headers in htx_xfer_blks() - BUG/MEDIUM: mux-fcgi: Properly handle read0 on partial records - BUG/MINOR: tevt/http-ana: Remove badly placed event reports - DEBUG: http-ana: Remove debug counters from HTTP analyzers - DEBUG: mux-h1: Remove some debug counters - BUG/MINOR: tcp-rules: Don't forward close during tcp-response content rules eval - MEDIUM: stream: interrupt costly rulesets after too many evaluations - BUG/MINOR: http-check: Don't pretend a C-L heeader is set before adding it - BUILD: ssl: remove a boringssl definition defined by recent boringssl libs - BUG/MINOR: tevt/mux-h2: Set truncated receive/eos events at SE level on error - BUG/MEDIUM: flt-spoe: Set/test applet flags instead of SE flags from I/O handler - BUG/MEDIUM: applet: Don't pretend to have more data to handle EOI/EOS/ERROR - BUG/MEDIUM: flt-spoe: Properly handle end of stream from the SPOE applet - MINOR: flt-spoe: Report end of input immediately after applet init - MINOR: mux-spop: Report EOI on the SE when a ACK is received for a stream - MINOR: mux-spop: Set SPOP_CF_ERROR flag on connection error only - MINOR: tevt/mux-spop: Report termination events for the SPOP connect/stream - CLEANUP: mux-spop: Remove useless comments - MINOR: mux-spop: Dump info about connections and streams in dedicated functions - MINOR: mux-spop: Implement .show_sd 
callback function - MEDIUM: mux-fcgi: Add a function to propagate termination flags from fstrm to SE - BUG/MEDIUM: mux-fcgi: Propagate flags to SE in fcgi_strm_wake_one_stream - MINOR: tevt/mux-fcgi: Report termination events for the FCGI connect/stream - MINOR: mux-fcgi: Dump info about connections and streams in dedicated functions - MINOR: mux-spop/mux-fcgi: Add support of the debug string for logs - BUG/MINOR: cli: Don't set SE flags from the cli applet - BUG/MINOR: cli: Fix memory leak on error for _getsocks command - BUG/MINOR: cli: Fix a possible infinite loop in _getsocks() - BUG/MINOR: config/userlist: Support one 'users' option for 'group' directive - BUG/MINOR: auth: Fix a leak on error path when parsing user's groups - BUG/MINOR: flt-trace: Support only one name option - MINOR: filters: Improve errors formating during filters parsing - BUG/MINOR: stats-json: Define JSON_INT_MAX as a signed integer - DOC: option redispatch should mention persist options - BUG/MINOR: debug: make "debug dev sched" accept a negative TID - BUG/MINOR: debug: make sure the "debug dev sched" tasks don't block stopping - IMPORT: plock: export the uninlined version of the lock wait function - IMPORT: plock: give higher precedence to W than S - IMPORT: plock: lower the slope of the exponential back-off - IMPORT: plock: use cpu_relax() for a shorter time in EBO - Revert "IMPORT: plock: export the uninlined version of the lock wait function" - BUG/MEDIUM: ssl: chosing correct certificate using RSA-PSS with TLSv1.3	2025-02-08 05:53:40 +01:00
William Lallemand	3912780b1e	BUG/MEDIUM: ssl: chosing correct certificate using RSA-PSS with TLSv1.3 The clienthello callback was written when TLSv1.3 was not yet out, and signature algorithms have changed since then. With TLSv1.2, the least significant byte was used to determine the SignatureAlgorithm, which could be rsa(1), dsa(2), ecdsa(3). https://datatracker.ietf.org/doc/html/rfc5246#section-7.4.1.4.1 This was used to choose which type of certificate to push to the client. But TLSv1.3 changed that, and introduced new RSA-PSS algorithms that do not have the least significant byte set to 1. https://datatracker.ietf.org/doc/html/rfc8446#section-4.2.3 This would result in choosing the wrong certificate when both an RSA and an ECDSA one are in the configuration for the same SNI or default entry. This patch fixes the issue by parsing both the hash and signature fields to check the RSA-PSS signature scheme. This must fix issue #2852. This must be backported to all stable versions. The code was moved from ssl_sock.c to ssl_clienthello in recent versions.	2025-02-07 20:56:42 +01:00
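To make the difference between the two encodings concrete, here is a minimal sketch of a classifier that looks at both bytes of a 16-bit signature algorithm code. It is not the code from the patch: the enum and the function name `classify_sigalg` are hypothetical, and only the code points come from RFC 5246 and RFC 8446.

```c
#include <stdint.h>

enum sig_kind { SIG_OTHER = 0, SIG_RSA, SIG_ECDSA };

/* Classifies a 16-bit value from the signature_algorithms extension.
 * Illustrative only; not taken from the haproxy sources.
 */
static enum sig_kind classify_sigalg(uint16_t code)
{
	uint8_t hash = code >> 8;   /* high byte: "hash" in the TLSv1.2 layout */
	uint8_t sign = code & 0xff; /* low byte: "signature" in the TLSv1.2 layout */

	/* RFC 8446 signature schemes with a 0x08 high byte:
	 * rsa_pss_rsae_sha256/384/512 = 0x0804..0x0806
	 * rsa_pss_pss_sha256/384/512  = 0x0809..0x080b
	 */
	if (hash == 0x08) {
		if ((sign >= 0x04 && sign <= 0x06) ||
		    (sign >= 0x09 && sign <= 0x0b))
			return SIG_RSA;
		return SIG_OTHER; /* e.g. ed25519 (0x0807), ed448 (0x0808) */
	}

	/* TLSv1.2 layout: low byte is rsa(1), dsa(2) or ecdsa(3) */
	if (sign == 1)
		return SIG_RSA;
	if (sign == 3)
		return SIG_ECDSA;
	return SIG_OTHER;
}
```

Looking only at the low byte would miss `0x0804` (rsa_pss_rsae_sha256) as an RSA scheme, since its low byte is 4 rather than 1, which is the kind of mismatch the message describes.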
Willy Tarreau	ae540e3d9c	Revert "IMPORT: plock: export the uninlined version of the lock wait function" This reverts commit 5496d06b2b1ea276ffb6aec78ffca177b88d89cd. It breaks the build on Windows which apparently doesn't support the weak attribute well on functions. It's not big deal anyway, playing with build options while debugging still works though it's less easy to use.	2025-02-07 19:51:15 +01:00
Willy Tarreau	b957e2f3ef	IMPORT: plock: use cpu_relax() for a shorter time in EBO Tests have shown that on modern CPUs it's interesting to wait a bit less in cpu_relax(). Till now we were looping down to 60 iterations and then switching to just barriers. Increasing the threshold to 90 iterations left before getting out of the loop improved the average and max time to grab a write lock by a few percent (e.g. 10% at 1us, 20% at 256ns or lower). Higher values tend to progressively lose that gain so let's stick to this one. This was measured on an EPYC 74F3 like previous measurements that initially led to this value, and the value might possibly depend on the mask applied to the loop counter. This is plock commit 74ca0a7307fa6aec3139f27d3b7e534e1bdb748e.	2025-02-07 18:04:29 +01:00
Willy Tarreau	253fba01a7	IMPORT: plock: lower the slope of the exponential back-off Along many tests involving both haproxy's scheduler and forwarded traffic, various exponents and algorithms were attempted for the EBO and their effects were measured. It was found that a growth in 1.25^N limited to 128k cycles consistently gives a better latency than 1.5^N limited to 256k cycles, without degrading general performance. The measures of the time to grab a write lock on a 48-thread EPYC show that the number of occurrences of low times was roughly multiplied by 2-3 while the number of occurrences of times above 64us was reduced by similar factors, to even reach 300 at 64us and limiting the maximum time by a factor of 4. The other variants that were experimented with are: m = ((m + (m >> 1)) + 2) & 0x3ffff; // original m = ((m + (m >> 1) + (m >> 3)) + 2) & 0x3ffff; m = ((m + (m >> 1) + (m >> 4)) + 2) & 0x3ffff; m = ((m + (m >> 1) + (m >> 4)) + 2) & 0x1ffff; m = ((m + (m >> 1) + (m >> 4)) + 1) & 0x1ffff; m = ((m + (m >> 2) + (m >> 4)) + 1) & 0x1ffff; // lowest CPU on pl_wr test + good perf m = ((m + (m >> 2)) + 1) & 0x1ffff; // even lower cpu usage, lowest max m = ((m + (m >> 1) + (m >> 2)) + 1) & 0x1ffff; // correct but slightly higher maxes m = ((m + (m >> 1) + (m >> 3)) + 1) & 0x1ffff; // less good than m+m>>2 m = ((m + (m >> 2) + (m >> 3)) + 1) & 0x1ffff; // better but not as good as m+m>>2 m = ((m + (m >> 3) + (m >> 4)) + 1) & 0x1ffff; // less good, lower rates on small counts. m = ((m + (m >> 2) + (m >> 3) + (m >> 4)) + 1) & 0x1ffff; // less good as well m = ((m & 0x7fff) + (m >> 1) + (m >> 4)) + 2; m = ((m & 0xffff) + (m >> 1) + (m >> 4)) + 2; This is plock commit dddd9ee01c522da33c353e2e4d4fd743d8336ec3.	2025-02-07 18:04:29 +01:00
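As an illustration of the back-off step discussed above, the sketch below wraps one of the listed expressions in a function. The choice of `m + (m >> 2)` with the `0x1ffff` mask is an assumption: it is the listed variant that appears to correspond to the "1.25^N limited to 128k cycles" description, and the function name `ebo_next` is hypothetical.

```c
#include <stdint.h>

/* One back-off step: grow the wait counter by roughly 1.25x, add 1 so
 * that it leaves zero, and keep it within 17 bits (128k) with the mask.
 */
static uint32_t ebo_next(uint32_t m)
{
	return ((m + (m >> 2)) + 1) & 0x1ffff;
}
```

Starting from 0, the counter first creeps up by one per step (0, 1, 2, 3, 4), then grows geometrically (4, 6, 8, 11, ...). Once the sum exceeds the mask, it wraps to a much smaller value instead of saturating, so a long wait does not stay pinned at the maximum.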
Willy Tarreau	9dd56da730	IMPORT: plock: give higher precedence to W than S It was noticed in haproxy that in certain extreme cases, a write lock subject to EBO may fail for a very long time in front of a large set of readers constantly trying to upgrade to the S state. The reason is that among many readers, one will succeed in its upgrade, and this situation can last for a very long time with many readers upgrading in turn, while the writer waits longer and longer before trying again. Here we're taking a reasonable approach which is that the write lock should have a higher precedence in its attempt to grab the lock. What is done is that instead of fully rolling back in case of conflict with a pure S lock, the writer will only release its read part in order to let the S upgrade to W if needed, and finish its operations. This guarantees no other seek/read/write can enter. Once the conflict is resolved, the writer grabs the read part again and waits for readers to be gone (in practice it could even return without waiting since we know that any possible wanderers would leave or even not be there at all, but it avoids a complicated loop code that wouldn't improve the practical situation but inflate the code). Thanks to this change, the maximum write lock latency on a 48 threads AMD with a heavily loaded scheduler went down from 256 to 64 ms, and the number of occurrences of 32ms or more was divided by 300, while all occurrences of 1ms or less were multiplied by up to 3 (3 for the 4-16ns cases). This is plock commit b6a28366d156812f59c91346edc2eab6374a5ebd.	2025-02-07 18:04:29 +01:00
Willy Tarreau	5496d06b2b	IMPORT: plock: export the uninlined version of the lock wait function The inlining of the lock waiting function was made more easily configurable with commit 7505c2e ("plock: always expose the inline version of the lock wait function"). However, the standard one remained static, but in order to resolve the symbols in "perf top", it's much better to export it, so let's move "static" with "inline" and leave it exported when PLOCK_INLINE_EBO is not set. This is plock commit 3bea7812ec705b9339bbb0ed482a2cd8aa6c185c.	2025-02-07 18:04:29 +01:00
Willy Tarreau	8d63dc50ab	BUG/MINOR: debug: make sure the "debug dev sched" tasks don't block stopping When "debug dev sched" is used to pop up background tasks, these tasks are never stopped, so we must be careful to stop them when the stopping flag is set, otherwise they can prevent the process from stopping when sufficiently numerous (tests went as far as 100 million tasks, leading the run queue never being completely purged in one poll round). No backport is needed since this is only used when debugging and tuning the scheduler.	2025-02-07 18:04:29 +01:00
Willy Tarreau	6765a32eb4	BUG/MINOR: debug: make "debug dev sched" accept a negative TID The TID passed to "debug dev sched" is used to pin the task to a given thread. A negative value normally means the task is unpinned and goes to the shared wait queue and run queue. However due to the type of the variable, negative values were mapped as highly positive values and were set to the current thread. Let's add the proper cast to fix this. No backport is needed since this is only used to experiment with the scheduler and measure its performance.	2025-02-07 18:04:29 +01:00
Lukas Tribus	5926fb7823	DOC: option redispatch should mention persist options "option redispatch" remains vague in which cases a session would persist; let's mention "option persist" and "force-persist" as an example so folks don't draw the conclusion that this may be default. Should be backported to stable branches.	2025-02-06 17:49:13 +01:00
Christopher Faulet	d48b5add88	BUG/MINOR: stats-json: Define JSON_INT_MAX as a signed integer A JSON integer is defined in the range [-(2^53)+1, (2^53)-1]. Macros are used to define the minimum and maximum values. The minimum one is defined using the maximum one. So JSON_INT_MAX must be defined as a signed integer value to avoid a wrong cast of JSON_INT_MIN. It was reported by Coverity in #2841: CID 1587769. This patch could be backported to all stable versions.	2025-02-06 17:19:49 +01:00
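The signedness issue described above can be reproduced with a small sketch. The macro names below are hypothetical (`DEMO_*`) and the exact definitions used in haproxy are not shown in this log; the only assumption taken from the message is that the minimum is derived from the maximum.

```c
#include <stdint.h>

/* Unsigned maximum: the derived minimum is computed in unsigned
 * arithmetic and wraps to a huge positive value.
 */
#define DEMO_INT_MAX_U ((1ULL << 53) - 1)
#define DEMO_INT_MIN_U (0 - DEMO_INT_MAX_U)

/* Signed maximum: the derived minimum is the expected negative value. */
#define DEMO_INT_MAX_S ((1LL << 53) - 1)
#define DEMO_INT_MIN_S (0 - DEMO_INT_MAX_S)

/* Range check with signed bounds: behaves as intended. */
static int in_range_signed(int64_t v)
{
	return v >= DEMO_INT_MIN_S && v <= DEMO_INT_MAX_S;
}

/* Same check with unsigned bounds: <v> is converted to unsigned for the
 * comparisons, and the lower bound is larger than the upper one, so no
 * value can ever pass.
 */
static int in_range_unsigned_bug(int64_t v)
{
	return v >= DEMO_INT_MIN_U && v <= DEMO_INT_MAX_U;
}
```

Compilers typically flag the second function with a sign-compare warning, which is the same class of problem a static analyzer would report.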
Christopher Faulet	bc487afc85	MINOR: filters: Improve errors formating during filters parsing The error messages reported by a filter during parsing were displayed between quotes. This is not really user-friendly, so let's remove the quotes.	2025-02-06 17:03:40 +01:00
Christopher Faulet	b20e2c96cf	BUG/MINOR: flt-trace: Support only one name option When a trace filter is defined, only one 'name' option is expected. But it was not tested. Thus it was possible to set several names leading to a memory leak. It is now tested, and it is not allowed to redefine the trace filter name. It was reported by Coverity in #2841: CID 1587768. This patch could be backported to all stable versions.	2025-02-06 17:01:15 +01:00
Christopher Faulet	a7f513af91	BUG/MINOR: auth: Fix a leak on error path when parsing user's groups In a userlist section, when a user is parsed, if a specified group is not found, an error is reported. In this case we must take care to release the already built groups list. It was reported by Coverity in #2841: CID 1587770. This patch could be backported to all stable versions.	2025-02-06 16:55:37 +01:00
Christopher Faulet	a1e14d2a82	BUG/MINOR: config/userlist: Support one 'users' option for 'group' directive When a group is defined in a userlist section, only one 'users' option is expected. But it was not tested. Thus it was possible to set several options leading to a memory leak. It is now tested, and it is not allowed to redefine the users option. It was reported by Coverity in #2841: CID 1587771. This patch could be backported to all stable versions.	2025-02-06 16:55:29 +01:00
Christopher Faulet	75e8c8ed33	BUG/MINOR: cli: Fix a possible infinite loop in _getsocks() In the _getsocks() function, when we failed to set the unix socket in non-blocking mode, a goto to the "out" label led to an infinite loop. To fix the issue, we must only let the function exit. This patch should be backported to all stable versions.	2025-02-06 15:44:21 +01:00
Christopher Faulet	372cc696d4	BUG/MINOR: cli: Fix memory leak on error for _getsocks command Some errors in the parse function of the _getsocks command were not properly handled and immediately returned, leading to a memory leak on the cmsgbuf and tmpbuf buffers. To fix the issue, instead of immediately returning -1, we jump to the "out" label. Returning 1 instead of -1 in that case is valid. This was reported by Coverity in #2841: CIDs 1587773 and 1587772. This patch should be backported as far as 2.4.	2025-02-06 15:43:04 +01:00
Christopher Faulet	7e927243b9	BUG/MINOR: cli: Don't set SE flags from the cli applet Since the CLI was updated to use the new applet API, it should no longer directly set the SE flags. Instead, the corresponding applet flags must be set, using the applet API (applet_set_*). This is true for the CLI I/O handler but also for the commands' parse function and I/O callback function. This patch should be backported as far as 3.0.	2025-02-06 15:23:20 +01:00
Christopher Faulet	0aa69e7865	MINOR: mux-spop/mux-fcgi: Add support of the debug string for logs Now it is possible to have debug info about FCGI and SPOP multiplexers. To do so, the support for the MUX_SCTL_DBG_STR command was implemented for these muxes. To have this log message, the log-format must be set to: log-format "$HAPROXY_HTTP_LOG_FMT bs=<%[bs.debug_str]>"	2025-02-06 11:19:32 +01:00
Christopher Faulet	456cfa450a	MINOR: mux-fcgi: Dump info about connections and streams in dedicated functions The fcgi_show_fd() function was split to dump the info about the FCGI connections and the FCGI streams in dedicated functions, duplicating this way what is performed in other muxes. In addition, the FCGI multiplexer now implements the .show_sd callback function called by the "show sess" CLI command.	2025-02-06 11:19:32 +01:00
Christopher Faulet	bbc8c98a54	MINOR: tevt/mux-fcgi: Report termination events for the FCGI connect/stream Termination events are now reported for the FCGI connections and the FCGI streams. In addition, all available termination events logs are reported in the "show-fd" callback function. The .ctl and .sctl callback functions were also updated to support, respectively, MUX_CTL_TEVTS and MUX_SCTL_TEVTS commands.	2025-02-06 11:19:32 +01:00
Christopher Faulet	5b1c2277ae	BUG/MEDIUM: mux-fcgi: Propagate flags to SE in fcgi_strm_wake_one_stream The commit is flagged as a bug because the same fix on the H2 multiplexer was reported as a bug. But no issue was reported. When a stream is explicitly woken up by the FCGI connection, if an error condition is detected, the corresponding error flag is set on the SE. So SE_FL_ERROR or SE_FL_ERR_PENDING, depending on whether the end of stream was reported or not. However, there is no attempt to propagate other termination flags. We must be sure to properly set SE_FL_EOI and SE_FL_EOS when appropriate to be able to switch a pending error to a fatal error. Because of this bug, the SE could remain with a pending error and no end of stream, preventing the application stream from truly aborting it. This means that in some abort scenarios, it seems possible to block a stream indefinitely. This patch depends on: * MEDIUM: mux-fcgi: Add a function to propagate termination flags from fstrm to SE * BUG/MEDIUM: mux-fcgi: Properly handle read0 on partial records This patch could be backported at least as far as 2.8 after a period of observation. However no bug was reported, so there is no rush.	2025-02-06 11:19:32 +01:00
Christopher Faulet	ccdca4bb77	MEDIUM: mux-fcgi: Add a function to propagate termination flags from fstrm to SE The function fcgi_strm_propagate_term_flags() was added to check the FSTRM state and evaluate when EOI/EOS/ERR_PENDING/ERROR flags must be set on the SE. It is not the only place where those flags are set. But it centralizes the synchro between the FCGI stream and the SC. For now, this function is only used at the end of fcgi_rcv_buf(). But it will be used to fix a potential bug.	2025-02-06 11:19:32 +01:00
Christopher Faulet	7b638eb1a6	MINOR: mux-spop: Implement .show_sd callback function The SPOP multiplexer now implements the .show_sd callback function called by "show sess" CLI command.	2025-02-06 11:19:32 +01:00
Christopher Faulet	5aeb678762	MINOR: mux-spop: Dump info about connections and streams in dedicated functions The spop_show_fd() function was split to dump the info about the SPOP connections and the SPOP streams in dedicated functions, duplicating this way what is performed in other muxes.	2025-02-06 11:19:32 +01:00
Christopher Faulet	eb4e517489	CLEANUP: mux-spop: Remove useless comments Just a small cleanup to remove some comments added during the development of the mux.	2025-02-06 11:19:32 +01:00
Christopher Faulet	4f8ae5b1f6	MINOR: tevt/mux-spop: Report termination events for the SPOP connect/stream Termination events are now reported for the SPOP connections and the SPOP streams. In addition, all available termination events logs are reported in the "show-fd" callback function. The .ctl and .sctl callback functions were also updated to support, respectively, MUX_CTL_TEVTS and MUX_SCTL_TEVTS commands.	2025-02-06 11:19:32 +01:00
Christopher Faulet	514a912a4d	MINOR: mux-spop: Set SPOP_CF_ERROR flag on connection error only The SPOP_CF_ERROR flag is now set on connection error only. It was also set on some demux failures. But it is not mandatory because the connection is closed anyway. And it is handy to have a flag dedicated to tcp connection error. It was the original purpose of this flag. This patch could be backported to 3.1 to ease future backports.	2025-02-06 11:19:32 +01:00
Christopher Faulet	d16c534511	MINOR: mux-spop: Report EOI on the SE when a ACK is received for a stream The spop stream now reports the end of input when the ACK is transferred to the SPOE applet. To do so, the flag SPOP_SF_ACK_RCVD was added. It is set on the SPOP stream when its ACK is received by the SPOP connection. In addition when SPOP stream flags are propagated to the SE, the error is now reported if end of input was not reached instead of testing the connection error code. It is more accurate. This patch should be backported to 3.1.	2025-02-06 11:19:32 +01:00
Christopher Faulet	f7e5718596	MINOR: flt-spoe: Report end of input immediately after applet init The SPOE applet forwards the message that must be sent to agent during its init stage. So just after it is created. When it is performed, the end of input must be reported because no more data will be forwarded. However, it was performed after receiving the ACK response. It is harmless, but there is no reason to delay the EOI. It is now fixed. This patch must be backported to 3.1.	2025-02-06 11:19:32 +01:00
Christopher Faulet	38aac2c7bc	BUG/MEDIUM: flt-spoe: Properly handle end of stream from the SPOE applet The previous fix ("BUG/MEDIUM: applet: Don't pretend to have more data to handle EOI/EOS/ERROR") revealed an issue with the way the SPOE applet was reporting the end of stream, so that the applet was never shut down. In fact, there are two bugs in one. The first one is about the applet shutdown. Since the fix above, the applet is no longer closed. Before, it was closed because it was reported in error. But now, it is just delayed because the applet and the SPOP stream are declared to support half-closed connections. So the applet is only closed when the SPOP connection is closed. To fix this bug, both sides are now stating that half-closed connections are not supported. The second bug is about the way the end of stream is reported. It is reported when the ACK response is received. But it is too early, because the parent stream must process the response first. So now, we take care to have processed the ACK from the parent applet before reporting an end of stream. This patch must be backported with the commit above to 3.1.	2025-02-06 11:19:32 +01:00
Christopher Faulet	7214dcd52d	BUG/MEDIUM: applet: Don't pretend to have more data to handle EOI/EOS/ERROR The way appctx EOI/EOS/ERROR flags were reported for applets using the new API were to state the applet had more data to deliver. But it was not correct and for APPCTX_FL_EOS, this led to report an error on the SE because it is not expected. More data to deliver and an end of stream is an impossible situation. This was added as a fix by commit b8ca114031 ("BUG/MEDIUM: applet: State appctx have more data if its EOI/EOS/ERROR flag is set"), mainly to make the SPOE applet work. When an applet set one of these flags, it really means it has no more data to deliver. So we must not try to trigger a new receive to handle these flags. Instead we must handle them directly in task_process_applet() function and only if the corresponding SE flags were not already set. This patch must be backported to 3.1.	2025-02-06 11:19:32 +01:00
Christopher Faulet	db504fbdbe	BUG/MEDIUM: flt-spoe: Set/test applet flags instead of SE flags from I/O handler The SPOE applet is using the new applet API. Thus end of input, end of stream and errors must be reported using the applet flags, not the SE flags. This was not the case. So let's fix it. It seems this bug is harmless for now. This patch must be backported to 3.1.	2025-02-06 11:19:32 +01:00
Christopher Faulet	54a09dfe0f	BUG/MINOR: tevt/mux-h2: Set truncated receive/eos events at SE level on error When receive or EOS termination events are reported at the SE level, a truncation was erroneously reported when no error was detected. Of course, it must be the opposite. No backport needed.	2025-02-06 11:19:32 +01:00
Frederic Lecaille	85cb1cc7f4	BUILD: ssl: remove a boringssl definition defined by recent boringssl libs This is the case for AWS-LC which derives from boringssl, where X509_OBJECT_get0_X509_CRL() is already defined. There is definitively no more need to define this function to build haproxy against TLS libs derived from boringssl.	2025-02-06 10:48:25 +01:00
Christopher Faulet	fad68cb16d	BUG/MINOR: http-check: Don't pretend a C-L heeader is set before adding it When a GET/HEAD/OPTIONS/DELETE healthcheck request was formatted, we claimed there was a "content-length" header set even when there was no payload, leading to actually send a "content-length: 0" header to the server. It was unexpected and could be rejected by servers. When a healthcheck request is sent we must take care to state there is a "content-length" header when it is explicitly added. This patch should fix the issue #2851. It must be backported as far as 2.9.	2025-02-03 18:46:41 +01:00
Aurelien DARRAGON	0846638f7f	MEDIUM: stream: interrupt costly rulesets after too many evaluations It is not rare to see configurations with a large number of "tcp-request content" or "http-request" rules for instance. A large number of rules combined with cpu-demanding actions (e.g.: actions that work on content) may create thread contention as all the rules from a given ruleset are evaluated under the same polling loop if the evaluation is not interrupted. Thus, in this patch we add extra logic around "tcp-request content", "tcp-response content", "http-request" and "http-response" rulesets, so that when a certain number of rules are evaluated under the single polling loop, we force the evaluating function to yield. As such, the rule which was about to be evaluated is saved, and the function starts evaluating rules from the saved pointer when it returns (in the next polling loop). We use task_wakeup(task, TASK_WOKEN_MSG) to explicitly wake the task so that no time is wasted and the processing is resumed ASAP. TASK_WOKEN_MSG is mandatory here because process_stream() expects TASK_WOKEN_MSG for explicit analyzers re-evaluation. The rules_bcount stream attribute was added to count how many rules were evaluated since the last interruption (yield). Also, the SF_RULE_FYIELD flag was added to know that the s->current_rule was assigned due to a forced yield and not a regular yield. By default haproxy will enforce a yield every 50 rules; this behavior can be configured using the "tune.max-rules-at-once" global keyword. There is a limitation though: for now, if the ACT_OPT_FINAL flag is set on act_opts, we consider it is not safe to yield (as it is already the case for automatic yield). In this case instead of yielding and taking the risk of not being called back, we skip the yield and hope it will not create contention. This is something we should ideally try to improve in order to yield in all conditions.	2025-02-03 17:09:48 +01:00
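The forced-yield mechanism described above can be sketched as a resumable loop. This is a simplified model under stated assumptions, not the stream code: rules live in a plain array, the context struct and `eval_rules()` are hypothetical, and the wake-up via `task_wakeup()` as well as the `ACT_OPT_FINAL` exception are left out.

```c
#include <stddef.h>

/* Hypothetical rule; a real one would carry a condition and an action. */
struct rule {
	int id;
};

/* Hypothetical per-stream evaluation state. */
struct eval_ctx {
	size_t resume_at;      /* index of the rule to resume from */
	unsigned int bcount;   /* rules evaluated since the last yield */
	int forced_yield;      /* set when the last return was a forced yield */
};

/* Evaluates rules starting at ctx->resume_at. Returns 1 when the whole
 * ruleset was evaluated, or 0 after <max_at_once> rules, in which case
 * the position is saved so that the next call resumes from there.
 * <evaluated> is incremented once per rule actually processed.
 */
static int eval_rules(struct eval_ctx *ctx, const struct rule *rules,
                      size_t nb, unsigned int max_at_once,
                      unsigned int *evaluated)
{
	size_t i;

	ctx->forced_yield = 0;
	for (i = ctx->resume_at; i < nb; i++) {
		if (ctx->bcount >= max_at_once) {
			/* budget exhausted: remember where to resume */
			ctx->resume_at = i;
			ctx->bcount = 0;
			ctx->forced_yield = 1;
			return 0;
		}
		(void)rules[i]; /* a real evaluator would run the rule here */
		ctx->bcount++;
		(*evaluated)++;
	}
	ctx->resume_at = 0;
	ctx->bcount = 0;
	return 1;
}
```

With 120 rules and a budget of 50, three calls are needed: the first two stop after 50 rules each and the third finishes the remaining 20. In the real code the caller would also have to reschedule itself immediately, which is the role the message gives to the explicit wake-up.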
Christopher Faulet	04bbfa4354	BUG/MINOR: tcp-rules: Don't forward close during tcp-response content rules eval When the tcp-response content ruleset evaluation is delayed because of an ACL condition, the close forwarding on the client side is not explicitly blocked. So it is possible to close the client side before the end of the response evaluation. To fix the issue, this is now done in all cases where some data are missing. Concretely, channel_dont_close() is called in "missing_data" goto label. Note it is only a theoretical bug (or pending bug). It is not possible to trigger it for now because an ACL cannot wait for more data when a close was received. But the code remains a bit weak. It is safer this way. It is especially mandatory for the "force yield" option that should be added soon. This patch could be backported to all stable versions.	2025-02-03 15:31:59 +01:00
Christopher Faulet	431c5533b7	DEBUG: mux-h1: Remove some debug counters Several debug counters were added to debug a strange issue about early aborts. Most of them are now useless, especially because it is now possible to rely on the termination events logs. So, it is better to remove them. Note that these counters are still there in 3.1.	2025-02-03 08:48:31 +01:00
Christopher Faulet	1c6512f8fc	DEBUG: http-ana: Remove debug counters from HTTP analyzers Several debug counters were added in HTTP analyzers to help debugging a strange issue about early aborts. But these counters are a bit overkill now. Especially because it is now possible to rely on the termination event log. So just remove them. Note that these counters are still there in 3.1.	2025-02-03 08:28:45 +01:00
Christopher Faulet	274c9d21a6	BUG/MINOR: tevt/http-ana: Remove badly placed event reports When specific events for the stream location were added, some reports about message interception were not removed. These reports are now removed. No need to backport.	2025-02-03 08:20:41 +01:00
Christopher Faulet	5f927f603a	BUG/MEDIUM: mux-fcgi: Properly handle read0 on partial records A Read0 event could be ignored by the FCGI multiplexer if it is blocked on a partial record. Instead of handling the event, it remained blocked, waiting for the end of the record. To fix the issue, the same solution as for the H2 multiplexer is used. Two flags are introduced. The first one, FCGI_CF_END_REACHED, is used to acknowledge a read0. This flag is set when a read0 was received AND the FCGI multiplexer must handle it. The second one, FCGI_CF_DEM_SHORT_READ, is set when the demux is interrupted on a partial record. A short read and a read0 lead to set the FCGI_CF_END_REACHED flag. With these changes, the FCGI mux should be able to properly handle read0 on partial records. This patch should be backported to all stable versions after a period of observation.	2025-02-03 07:49:50 +01:00
William Lallemand	0a28b1ea0c	MEDIUM: htx: prevent <mark> to copy incomplete headers in htx_xfer_blks() Prevent a partial copy of trailers or headers when using the <mark> parameter. When using htx_xfer_blks(), transferring partial headers or trailers is prevented when restricted by the <count> parameter. However using the <mark> parameter will still allow to do it. This patch changes the behavior by checking the <mark> type only after checking the headers/trailers type, so we can still rollback on partial transfer. No impact on the current code, which does not try to do that yet.	2025-01-31 15:51:51 +01:00
Amaury Denoyelle	4ad2accfee	BUILD: quic: remove GCC undefined error in qc_release_lost_pkts() Every once in a while, GCC reports issues with qc_release_lost_pkts() function. It seems that its static analysis is foiled by the code structuring. The latest warning reports the following issue : CC src/quic_loss.o src/quic_loss.c: In function ‘qc_release_lost_pkts’: src/quic_loss.c:313:58: error: potential null pointer dereference [-Werror=null-dereference] 313 \| unsigned int period = newest_lost->time_sent_ms - oldest_lost->time_sent_ms; \| ~~~~~~~~~~~^~~~~~~~~~~~~~ To fix this definitively, slightly change the code. <oldest_lost> and <newest_lost> are now initialized on the first list entry outside of the loop. This is enough to guarantee to GCC that they cannot be NULL for the remainder of the function.	2025-01-31 15:34:30 +01:00
William Lallemand	c17e029232	DOC: htx: clarify <mark> parameter for htx_xfer_blks() Clarify the fact that the first <mark> block is transferred before stopping when using htx_xfer_blks()	2025-01-31 15:23:47 +01:00
William Lallemand	c6390cdf9c	BUG/MEDIUM: htx: wrong count computation in htx_xfer_blks() When transferring blocks from an src to another dst htx representation, htx_xfer_blks() decreases the size of each block removed from the <count> value passed in parameter, so it can't transfer more than <count>. The size must also contain the metadata, represented by a simple sizeof(struct htx_blk). However, the code was doing a sizeof(dstblk) instead of a sizeof(*dstblk), which has the consequence of removing only a size_t from count. Fortunately htx_blk size is 64bits, so that does not provoke any problem in 64bits. But on 32bits architecture, the count value is not decreased correctly and the function could try to transfer more blocks than allowed by the count parameter. Must be backported in every stable release.	2025-01-31 15:02:58 +01:00
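The class of bug fixed in c6390cdf9c can be reproduced in isolation. The sketch below uses a hypothetical 64-bit `struct blk_meta` as a stand-in for the block descriptor; it is not HAProxy's definition, and the two helper functions exist only to contrast the two `sizeof` forms.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a block descriptor holding 64 bits of metadata. */
struct blk_meta {
    uint32_t addr;
    uint32_t info;
};

/* Buggy form: sizeof applied to the pointer. The result depends on the ABI:
 * typically 4 bytes on 32-bit targets and 8 bytes on 64-bit ones. */
static size_t meta_cost_buggy(const struct blk_meta *blk)
{
    return sizeof(blk);
}

/* Fixed form: sizeof applied to the pointed-to structure, 8 bytes here
 * regardless of the pointer width. */
static size_t meta_cost_fixed(const struct blk_meta *blk)
{
    return sizeof(*blk);
}
```

On a 64-bit build both functions happen to return 8, which is why such a mistake can go unnoticed there; on a 32-bit build the buggy form returns the pointer size and under-counts the metadata.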
Christopher Faulet	956cb5d554	MINOR: tevt/dev: Parse tuple of termination events term_events tool is now able to parse tuple of termination events, as returned by "term_events" sample fetch function.	2025-01-31 10:46:08 +01:00
Christopher Faulet	71320fc9c1	MINOR: tevt/connection: Add support for POLL_HUP/POLL_ERR events Connection errors can be detected via connect/recv/send syscall, but also because it was reported by the poller. So dedicated events, at the FD level, are introduced to make the difference. term_events tool was updated accordingly.	2025-01-31 10:41:50 +01:00
Christopher Faulet	c7457427ab	MINOR: tevt/dev: Add term_events tool This development tool can be used to convert a string representing a termination events log to its human readable representation. Several strings may be converted at a time. To do so, several arguments can be specified on the command line or they can be provided on STDIN, using "-" argument. Here is an example: > term_events f2x2f4x4 m2m4m1 e2e1 s2s1S1 E1 M1 F1 ### f2x2f4x4 : fd:shutr > xprt:shutr > fd:snd_err > xprt:snd_err ### m2m4m1 : muxc:shutr > muxc:snd_err > muxc:shutw ### e2e1 : se:eos > se:shutw ### s2s1S1 : strm:eos > strm:shutw > STRM:shutw ### E1 : SE:shutw ### M1 : MUXC:shutw ### F1 : FD:shutw The make target "dev/term_events/term_events" must be used to compile it.	2025-01-31 10:41:50 +01:00
Christopher Faulet	990854ee0d	REORG: tevt/connection: Move enums at the end of the header file Enums used to report events were placed in the connection header for convenience. But they are not specifically related to connection. So, they are moved at the end of the file to have a better isolation.	2025-01-31 10:41:50 +01:00
Christopher Faulet	487d6b09f1	MINOR: tevt: Improve function to convert a termination events log to string The function is now responsible for handling an empty log, when no event was reported. In that case, an empty string is returned. It is also responsible for handling the case where termination events log is not supported for a given entity (for instance the quic mux for now). In that case, a dash ("-") is returned.	2025-01-31 10:41:50 +01:00
Christopher Faulet	b161155498	MINOR: tevt: Add a sample to get termination events for all locations "term_events" is a sample fetch function that can be used to get termination events for all locations in one call. The format is equivalent to: {fc_term_events,fc_mux_term_events,fs.term_events,txn.term_events,bs.term_events,bc_mux_term_events,bc_term_events} If no event was reported for a location, the field is empty. If the feature is not supported yet, a dash ('-') is printed.	2025-01-31 10:41:50 +01:00
Christopher Faulet	eb2f1a4ba4	MINOR: tevt/applet: Add limited support for termination event logs for applets There is no termination events log for applet but events for the SE location are filled when the endpoint is an applet. Most of them relies on the new applet API. Only few events are reported for legacy applets.	2025-01-31 10:41:50 +01:00
Christopher Faulet	cbd898c42b	MINOR: tevt: Don't duplicate termination event during reporting It is hard to never detect the same event several times without painful tests. In other words, the same termination event can be reported several times and this must be handled. To do so, "tevt_report_event" macro is updated to ignore an event if the last reported one is of the same type, for the same location. Of course, if the same event is reported several times at different moments, it will not be detected.	2025-01-31 10:41:50 +01:00
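The deduplication rule from cbd898c42b can be sketched as follows. This assumes a 32-bit log where each event occupies one byte and the most recent event sits in the low byte; `evlog_report_dedup()` is a hypothetical helper, not the actual `tevt_report_event` macro, and the byte order is an assumption.

```c
#include <stdint.h>

/* Report one event byte (location in the high nibble, type in the low one).
 * Assumptions: newest event is in the low byte, a zero top byte means the
 * log still has room, and an event byte is never zero. */
static uint32_t evlog_report_dedup(uint32_t log, uint8_t ev)
{
    if ((log & 0xffu) == ev)
        return log;           /* same location and type as the last event */
    if (log & 0xff000000u)
        return log;           /* four events already stored */
    return (log << 8) | ev;   /* shift older events up, append the new one */
}
```

Only a repeat of the immediately preceding event is dropped. The same event reported again after a different one is stored a second time, which matches the limitation stated in the commit message.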
Christopher Faulet	2dc02f75b1	MEDIUM: tevt/stconn/stream: Add dedicated termination events for stream location This is the last patch to introduce dedicated termination events for each location. In this one, events for the stream location are introduced. The old enum is also removed because it is now unused. Here, more accurate events are added. The "intercepted" event was split.	2025-01-31 10:41:50 +01:00
Christopher Faulet	9697704932	MINOR: tevt/stconn: Be more accurate to report shutw events In se_shutdown() a SE termination event is reported while the shutw stream event is reported in sc_app_shut_conn().	2025-01-31 10:41:50 +01:00
Christopher Faulet	a58e650ad1	MEDIUM: tevt/muxes: Add dedicated termination events for muxc/se locations Termination events dedicated to mux connection and stream-endpoint descriptors are added in this patch. Specific events to these locations are thus added. Changes for the H1 and H2 multiplexers are reviewed to be more accurate.	2025-01-31 10:41:50 +01:00
Christopher Faulet	f2778ccc7d	MINOR: tevt/connection: Add dedicated termination events for lower locations To be able to add more accurate termination events for each location, the enum will be split by location. Indeed, there are at most 16 possible events. It will be pretty confusing to use same termination events for the different locations. So the best is to split them. In this patch, the termination events for the fd, hs and xprt locations are introduced. For now some holes are added to keep similar events aligned across enums. But this may change in future.	2025-01-31 10:41:50 +01:00
Christopher Faulet	9cbc3229ec	MINOR: tevt/mux-pt: Add support for termination event logs A termination event logs is added to the mux-pt context and appropriate events are reported for the muxc location. There is no SE events for this mux.	2025-01-31 10:41:50 +01:00
Christopher Faulet	a4c281a190	MINOR: tevt/muxes: Add CTL and SCTL command to get the termination event logs MUX_CTL_TEVTS command is added to get the termination event logs of a mux connection and MUX_SCTL_TEVTS command to get the termination event logs of a mux stream.	2025-01-31 10:41:50 +01:00
Christopher Faulet	95029305d3	MINOR: tevt/mux-h1/mux-h2: Add termination events log when dumping mux info The termination events logs of the multiplexer connection and stream are now dumped when corresponding mux info are dumped. The termination events log of the underlying connection is also dumped in the debug string.	2025-01-31 10:41:50 +01:00
Christopher Faulet	170d46989c	MINOR: tevt/conn: Report intercepted event for L4 rules When a L4 rules interrupts the processing, a termination event is reported for the connection, with the "fd" location.	2025-01-31 10:41:50 +01:00
Christopher Faulet	00a07c8b54	MINOR: tevt/stream/stconn: Report termination events for stream and sc In this patch, events for the stream location are reported. These events are first reported on the corresponding stream-connector. So front events on scf and back event on scb. Then all events are both merged in the stream. But only 4 events are saved on the stream. Several internal events are for now grouped with the type "tevt_type_intercepted". More events will be added to have a better resolution. But at least the place to report these events are identified. For now, when a event is reported on a SC, it is also reported on the stream and vice versa.	2025-01-31 10:41:50 +01:00
Christopher Faulet	147b6d3d4d	MINOR: tevt/mux-h2: Report termination events for the H2C shutdown for reads (read0), receive errors, shutdown for writes and timeouts are reported, but only for the H2 connection for now. As for the H1 multiplexer, more events must be added to report protocol errors, goaways and rst-streams. And of course, all events for the H2 streams must be reported too.	2025-01-31 10:41:50 +01:00
Christopher Faulet	5f03261166	MINOR: tevt/mux-h1: Report termination events for the H1C and H1S shutdown for reads (read0), receive errors, shutdown for writes and timeouts are reported. It is not too hard to know where to report events generated by HAProxy (timeouts and shutw). For detected events (shutr and receive error), it is not so simple. These events must not be reported when they are detected but when the mux can handle them. For instance, some unprocessed input data may block a read0. So, experience will tell us if these events are reported at the right time and under the right conditions. For now, no internal errors (parsing errors, protocol errors, internal errors...) are reported because these event types have not yet been added.	2025-01-31 10:41:50 +01:00
Christopher Faulet	992b4b9726	MINOR: tevt/stconn: Add a termination events log in the SE descriptor This termination events log will be used to report events from the mux streams. The location will be "tevt_loc_se" and the muxes will be responsible to report the corresponding events.	2025-01-31 10:41:50 +01:00
Christopher Faulet	e944944990	MINOR: tevt: Add the termination events log's foundations Termination events logs will be used to report the events that led to closing a connection. Unlike flags, which reflect a state, the idea here is to store a log to preserve the order of the events. Most of the time, when debugging an issue, the order of the events is crucial to understand the root cause of the issue. The traces are truly helpful to do so. But it is not always possible to activate them because they are pretty verbose. On heavily loaded platforms, it is not acceptable. We hope that the termination events logs will help us in such situations. One termination events log will be stored at each layer (connection, mux connection, mux stream...) as a 32-bit integer. Each event will be stored on 8 bits, 4 bits for the location and 4 bits for the type. So only the first four events will be stored for each layer. It should be enough to understand why a connection is closed. In this patch, the enums defining the termination event locations and types are added. The macro to report a new event is also added, as well as a function to convert a termination events log to a string that could be displayed in log messages for instance.	2025-01-31 10:41:49 +01:00
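The storage layout described in e944944990 (a 32-bit integer, 8 bits per event, 4 bits of location and 4 bits of type, first four events kept) can be sketched like this. The helper names and the byte order are assumptions made for illustration; the commit only specifies the sizes.

```c
#include <stdint.h>

/* Pack a 4-bit location and a 4-bit type into one event byte. */
static uint8_t evlog_pack(unsigned loc, unsigned type)
{
    return (uint8_t)(((loc & 0xfu) << 4) | (type & 0xfu));
}

/* Append an event to the log and return the updated log.
 * Assumptions: events are shifted in from the low byte, so the oldest
 * event ends up in the highest used byte; a non-zero top byte means four
 * events are already stored and later ones are dropped; an event byte is
 * never zero (location and type are not both 0). */
static uint32_t evlog_append(uint32_t log, unsigned loc, unsigned type)
{
    if (log & 0xff000000u)
        return log;
    return (log << 8) | evlog_pack(loc, type);
}
```

Read as hexadecimal, the resulting integer lists the events in the order they were reported, two digits per event, which keeps the ordering information that plain state flags lose.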
Christopher Faulet	4ccca7efcf	BUG/MINOR: mux-h1: Only report a SE error on demux error When a demux error is reported by the H1S, an error must be reported on the SE and not an end-of-input or an end-of-stream. So SE_FL_ERROR flag must be set and not SE_FL_EOI/SE_FL_EOS. It seems this bug has no impact. So there is no reason to backport it.	2025-01-31 10:41:49 +01:00
Christopher Faulet	e56e718c82	MINOR: mux-h1: Add masks to group H1S DEMUX and MUX errors It is just a small patch to clean up mux/demux functions. Instead of listing the H1S errors that must be handled during demux or mux operations, masks of flags are used. It is more readable.	2025-01-31 10:41:49 +01:00
Willy Tarreau	8235a24782	MEDIUM: epoll: skip reports of stale file descriptors Now that we can see that some events are reported for older instances of a file descriptor, let's skip these ones instead of reporting dangerous events on them. It might possibly qualify as a bug if it helps fixing strange issues in certain environments, in which case it can make sense to backport it along with the following recent patches: DEBUG: fd: add a counter of takeovers of an FD since it was last opened MINOR: fd: add a generation number to file descriptors DEBUG: epoll: store and compare the FD's generation count with reported event	2025-01-30 19:45:34 +01:00
Willy Tarreau	5012b6c6d9	DEBUG: epoll: store and compare the FD's generation count with reported event There have been some reported cases where races between threads in epoll were causing wrong reports of close or error events. Since the epoll_event data is 64 bits, we can store the FD's generation counter in the upper bits to verify if we're speaking about the same instance of the FD as the current one or a stale one. If the generation number does not match, then we classify these into 3 conditions and increment the relevant COUNT_IF() counters (stale report for closed FD, stale report of harmless event on reopened FD, stale report of HUP/ERR on reopened FD). Tests have shown that with heavy concurrency, a very small maxconn (typically 1 per thread), http-reuse always and a server closing connections first but randomly (httpterm with /C=2r), such events can happen at a pace of a few per second for the closed FDs, and a few per minute for the other ones, so there's value in leaving this accessible for troubleshooting. 
E.g. after a few minutes: Count Type Location function(): "condition" [comment] 5541 CNT ev_epoll.c:296 _do_poll(): "1" [epoll report of event on a just closed fd (harmless)] 10 CNT ev_epoll.c:294 _do_poll(): "1" [epoll report of event on a closed recycled fd (rare)] 42 CNT ev_epoll.c:289 _do_poll(): "1" [epoll report of HUP on a stale fd reopened on the same thread (suspicious)] 212 CNT ev_epoll.c:279 _do_poll(): "1" [epoll report of HUP/ERR on a stale fd reopened on another thread (harmless)] 1 CNT mux_h1.c:3911 h1_send(): "b_data(&h1c->obuf)" [connection error (send) with pending output data] This one with the following setup, which abuses thread contention by starting 64 threads on two cores: - config: global nbthread 64 stats socket /tmp/sock1 level admin stats timeout 1h defaults timeout client 5s timeout server 5s timeout connect 5s mode http listen p2 bind :8002 http-reuse always server s1 127.0.0.1:8000 maxconn 4 - haproxy forcefully started on 2C4T: $ taskset -c 0,1,4,5 ./haproxy -db -f epoll-dbg.cfg - httpterm on port 8000, cpus 2,3,6,7 (2C4T) - h1load with responses larger than a single buffer, and randomly closing/keeping alive: $ taskset -c 2,3,6,7 h1load -e -t 4 -c 256 -r 1 0:8002/?s=19k/C=2r	2025-01-30 19:45:34 +01:00
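The idea in 5012b6c6d9 of carrying a generation counter in the upper bits of the 64-bit epoll data can be sketched without any epoll call. The 32/32 split and the `ev_*` helper names are assumptions for illustration; the commit does not state how many bits are used.

```c
#include <stdint.h>

/* Pack the FD in the low 32 bits and its generation in the high 32 bits
 * (hypothetical split) of the 64-bit value registered with the poller. */
static uint64_t ev_pack(int fd, uint32_t gen)
{
    return ((uint64_t)gen << 32) | (uint32_t)fd;
}

/* Recover the FD from a reported value. */
static int ev_fd(uint64_t data)
{
    return (int)(uint32_t)data;
}

/* Recover the generation recorded when the FD was registered. */
static uint32_t ev_gen(uint64_t data)
{
    return (uint32_t)(data >> 32);
}

/* An event is stale when the generation stored at registration time no
 * longer matches the FD's current generation (it was closed since then). */
static int ev_is_stale(uint64_t data, uint32_t current_gen)
{
    return ev_gen(data) != current_gen;
}
```

A poller loop would compare each reported value against the current generation of the FD and either count or skip stale reports instead of acting on them.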
Willy Tarreau	d155924efe	MINOR: fd: add a generation number to file descriptors This patch adds a counter of close() on file descriptors in the fdtab. The goal is to better detect if reported events concern the current or a previous file descriptor. For now the counter is only added, and is shown in "show fd" as "gen". We're reusing unused space at the end of the struct. If it's needed for something more important later, this patch can be reverted.	2025-01-30 19:45:34 +01:00
Willy Tarreau	44ac7a7e73	DEBUG: fd: add a counter of takeovers of an FD since it was last opened That's essentially in order to help with debugging strange cases like the occasional epoll issues/races, by keeping a counter of how many times an FD was taken over since last inserted. The room is available so let's use it. If it's needed later, this patch can easily be reverted. The counter is also reported in "show fd" as "tkov".	2025-01-30 19:45:34 +01:00
Amaury Denoyelle	b849ee5fa3	BUILD: quic: fix overflow in global tune A new global option was recently introduced to disable pacing. However, the value used (1<<31) caused issue with some compiler as options field used for storage is declared as int. Move pacing deactivation flag outside into the newly defined quic_tune to fix this. This should be backported up to 3.1 after a period of observation. Note that it relied on the previous patch which defined new quic_tune type.	2025-01-30 18:12:53 +01:00
Amaury Denoyelle	09e9c7d5b7	MINOR: quic: define quic_tune Define a new structure quic_tune. It will be useful to regroup various configuration settings and tunable related to QUIC, instead of defining them into the global structure.	2025-01-30 18:12:40 +01:00
Amaury Denoyelle	2fc63cb186	MINOR: quic: mark BBR as stable Pacing has recently been moved out of experimental status and is activated by default. This is a mandatory requirement for BBR. Furthermore, BBR is now considered stable. As such, removes its experimental status with this commit.	2025-01-30 17:20:41 +01:00
Amaury Denoyelle	a19d9b0486	MAJOR: quic: mark pacing as stable and enable it by default Remove pacing experimental status, so it's not required anymore to use expose-experimental-directives to enable it. Along this change, pacing is now activated by default. As such, pacing configuration is transformed into its final form. The global on/off setting is turned into a disable setting without argument.	2025-01-30 17:20:41 +01:00
Amaury Denoyelle	0c8b54b2d1	MINOR: quic: transform pacing settings into a global option Pacing support was previously activated on each bind line individually, via an optional argument of quic-cc-algo keyword. Remove this optional argument and introduce a global setting to enable/disable pacing. Pacing activation is still flagged as experimental. One important change is that previously BBR usage automatically activated pacing support. This is not the case anymore, so users should now always explicitly activate pacing if BBR is selected. A new warning message will be displayed if this is not the case. Another consequence of this change is that now pacing_inter callback is always defined for every quic_cc_algo types. As such, QUIC MUX uses global.tune.options to determine if pacing is required. This should be backported up to 3.1, after a period of observation.	2025-01-30 17:19:38 +01:00
Amaury Denoyelle	d04e93bc2e	MINOR: quic: allow BBR testing without pacing Pacing is activated per bind line via an optional boolean argument of quic-cc-algo keyword. Contrary to the default usage, pacing is automatically activated when BBR is chosen. This is because this algorithm is expected to run on top of pacing, else its behavior is undefined. Previously, pacing argument was thus ignored when BBR was selected. Change this to support explicit deactivation of pacing with it. This could be useful to test BBR without pacing when debugging some issues. This should be backported up to 3.1, after a period of observation.	2025-01-30 17:18:02 +01:00
Amaury Denoyelle	6acf391e89	MINOR: quic: remove references to burst in quic-cc-algo parsing Pacing activation configuration has been recently revamped. Previously, pacing related quic-cc-algo argument was used to specify a burst size. It evolved into a boolean value as burst size is dynamically calculated now. As such, removes any references to the old burst value in config parsing code for cleaner code. This should be backported up to 3.1, after a period of observation.	2025-01-30 17:02:59 +01:00
Willy Tarreau	bd7a688b8b	BUG/MEDIUM: chunk: make sure to flush the trash pool before resizing Late in 3.1 we've added an integrity check to make sure we didn't keep trash objects allocated before resizing the trash with commit 0bfd36e7b8 ("MINOR: chunk: add a BUG_ON upon the next init_trash_buffer()"), but it turns out that the counter that is being checked includes the number of objects left in local thread caches. As such it can trigger despite no object being allocated. This precisely happens when setting tune.memory.hot-size to a few megabytes because some temporarily used trash objects will remain in cache. In order to address this, let's first flush the pool before running the check. That was previously done by pool_destroy() but the check had to be inserted before it. So now we first flush the trash pool, then verify it's no longer used, and finally we can destroy it. This needs to be backported to 3.1. Thanks to Christian Ruppert for reporting this bug.	2025-01-29 17:55:18 +01:00
William Lallemand	b43e5d8c16	BUILD: ssl: cleaner approach to WolfSSL without renegotiation Patch discussed in https://github.com/wolfSSL/wolfssl/issues/6834 When building WolfSSL without renegotiation options, WolfSSL still defines the macros about it, which causes warnings during the build. This patch completes the previous one by undefining the macros so haproxy could build without any warning.	2025-01-28 20:55:20 +01:00
William Lallemand	c6a8279cdf	BUILD: ssl: allow to build without the renegotiation API of WolfSSL In ticket https://github.com/wolfSSL/wolfssl/issues/6834, it was suggested to push --enable-haproxy within --enable-distro. WolfSSL does not want to include the renegotiation support in --enable-distro. To achieve this, let haproxy build without SSL_renegotiate_pending() when wolfssl does not define HAVE_SECURE_RENEGOCIATION or HAVE_SERVER_RENEGOCIATION_INFO.	2025-01-28 18:31:32 +01:00
Olivier Houchard	9253146b90	BUILD: queues: Use unsigned int when needed Use unsigned int instead of int when calculating which thread group we should dequeue from next, as the difference in signedness makes clang unhappy.	2025-01-28 17:44:54 +01:00
Olivier Houchard	b74ec1efc2	MINOR: queues: use __ha_cpu_relax() on failed CAS. Make sure we call __ha_cpu_relax() if we fail a CAS, to help with contention.	2025-01-28 16:00:19 +01:00
Willy Tarreau	f17b0a994b	BUILD: tools: fix build on BSD by dropping the ETIME check Commit 44537379fc ("MINOR: tools: add errname to print errno macro name") brought a facility to report errno using a symbolic string when known instead of showing only the value. However, among the listed options, ETIME is mentioned but is unknown from FreeBSD where it breaks the build. Let's simply drop it, we don't use ETIME anyway and even if it would be reported, the default code path still reports the numeric value so there's no harm. If other ones fail to build in the future, they could be handled the same way.	2025-01-28 15:58:57 +01:00
Christopher Faulet	36d151dc10	MEDIUM: stream: No longer use TASK_F_UEVT* to shut a stream down Thanks to the previous patch, it is now possible to explicitly rely on stream's events to shut it down. The right event is set in stream_shutdown(), before waking up the stream, via an atomic operation. In process_stream(), this event will be handled as expected. Thus, TASK_F_UEVT* are no longer used, but not removed since still usable for other tasks. This patch depends on "MEDIUM: stream: Map task wake up reasons to dedicated stream events".	2025-01-28 14:53:37 +01:00
Christopher Faulet	6048460102	MEDIUM: stream: Map task wake up reasons to dedicated stream events To fix thread-safety issues when a stream must be shut, three new task states were added. These states are generic (UEVT1, UEVT2 and UEVT3), the task callback function is responsible to know what to do with them. However, it is not really scalable. The best is to use an atomic field in the stream structure itself to deal with these dedicated events. There is already the "pending_events" field that saves wake up reasons (TASK_WOKEN_) to not lose them if process_stream() is interrupted before it had a chance to handle them. So the idea is to introduce a new field to handle streams dedicated events and merge them with the task's wake up reasons used by the stream. This means a mapping must be performed between some task wake up reasons and streams events. Note that not all task wake up reasons will be mapped. In this patch, the "new_events" field is introduced. It is an atomic bit-field. Streams events (STRM_EVT_) are also introduced to map the task wake up reasons used by process_stream(). Only TASK_WOKEN_TIMER and TASK_WOKEN_MSG are mapped, in addition to TASK_F_UEVT* flags. In process_stream(), "pending_events" field is now filled with new stream events and the mapping of the wake up reasons.	2025-01-28 14:53:37 +01:00
Christopher Faulet	0a52a75ef7	BUG/MINOR: stream: Properly handle "on-marked-up shutdown-backup-sessions" shutdown-backup-sessions action for on-marked-up directive does not work anymore since the stream_shutdown() function was modified to be async-safe. When stream_shutdown() was modified to be async-safe, dedicated task events were added to map the reasons to shut a stream down. SF_ERR_DOWN was mapped to TASK_F_EVT1 and SF_ERR_KILLED was mapped to TASK_F_EVT2. The reverse mapping was performed by process_stream() to shut the stream with the appropriate reason. However, SF_ERR_UP reason, used by shutdown-backup-sessions action to shut a stream down because a preferred server became available, was not mapped in the same way. So since commit b8e3b0a18d ("BUG/MEDIUM: stream: make stream_shutdown() async-safe"), this action is ignored and does not work anymore. To fix the issue, and to be able to backport the fix, a third task event was added. TASK_F_EVT3 is now mapped on SF_ERR_UP. This patch should fix the issue #2848. It must be backported as far as 2.6.	2025-01-28 14:53:37 +01:00
Olivier Houchard	26b3e5236f	MEDIUM: servers/proxies: Switch to using per-tgroup queues. For both servers and proxies, use one connection queue per thread-group, instead of only one. Having only one can lead to severe performance issues on NUMA machines, it is actually trivial to get the watchdog to trigger on an AMD machine, having a server with a maxconn of 96, and an injector that uses 160 concurrent connections. We now have one queue per thread-group, however when dequeueing, we're dequeuing MAX_SELF_USE_QUEUE (currently 9) pendconns from our own queue, before dequeueing one from another thread group, if available, to make sure everybody is still running.	2025-01-28 12:49:41 +01:00
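The dequeue fairness rule from 26b3e5236f (take up to MAX_SELF_USE_QUEUE entries from the local thread group's queue, then one from another group if available) can be sketched as a queue picker. Only the constant's name and value come from the commit message; `struct picker`, `pick_queue()` and the scan order over other groups are hypothetical.

```c
/* Value quoted in the commit message ("currently 9"). */
#define MAX_SELF_USE_QUEUE 9

/* Hypothetical per-thread state: consecutive picks from our own queue. */
struct picker {
    unsigned self_used;
};

/* Return the index of the thread group to dequeue from next, or -1 if every
 * queue is empty. qlen[g] is the current length of group g's queue; the
 * caller is expected to perform the actual dequeue and update qlen. */
static int pick_queue(struct picker *p, unsigned self,
                      const unsigned *qlen, unsigned ngroups)
{
    /* Prefer our own queue until the self-use budget is spent. */
    if (p->self_used < MAX_SELF_USE_QUEUE && qlen[self] > 0) {
        p->self_used++;
        return (int)self;
    }
    /* Budget spent or own queue empty: serve one entry from another group. */
    for (unsigned i = 1; i < ngroups; i++) {
        unsigned g = (self + i) % ngroups;
        if (qlen[g] > 0) {
            p->self_used = 0;
            return (int)g;
        }
    }
    /* No other group has work: keep draining our own queue if possible. */
    if (qlen[self] > 0)
        return (int)self;
    return -1;
}
```

With two busy groups, a thread in group 0 gets nine picks from its own queue, one from group 1, then returns to its own queue, so remote queues keep making progress without paying the cross-group cost on every dequeue.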
Olivier Houchard	583303c48b	MINOR: proxies/servers: Calculate queueslength and use it. For both proxies and servers, properly calculates queueslength, which is the total number of element in each queues (as they currently are only using one queue, it is equivalent to the number of element of that queue), and use it instead of the queue's length.	2025-01-28 12:49:41 +01:00
Olivier Houchard	59eddabe16	MINOR: Add fields to the per-thread group field in struct server. Add a per-thread group queue and associated fields in per-thread group field in struct server, as well as a new field, queues length. This is currently unused, so should change nothing.	2025-01-28 12:49:41 +01:00
Olivier Houchard	f879b9a18a	MINOR: proxies: Add a per-thread group field to struct proxy. Add a per-thread group field to struct proxy, that will contain a struct queue, as well as a new field, "queueslength". This is currently unused, so should change nothing. Please note that proxy_init_per_thr() must now be called for each proxy once the thread groups number is known.	2025-01-28 12:49:41 +01:00
Willy Tarreau	7fa70da06d	MINOR: epoll: permit to mask certain specific events A few times in the past we've seen cases where epoll was caught reporting a wrong event that caused trouble (e.g. spuriously reporting HUP or RDHUP after a successful connect()). The new tune.epoll.mask-events directive permits to mask events such as ERR, HUP and RDHUP and convert them to IN events that are processed by the regular receive path. This should help better diagnose and troubleshoot issues such as this one, as well as rule out such a cause when similar issues are reported: https://github.com/haproxy/haproxy/issues/2368 https://www.spinics.net/lists/netdev/msg876470.html It should be harmless to backport this if necessary.	2025-01-27 15:47:46 +01:00
Aurelien DARRAGON	e768a531b7	CLEANUP: tree-wide: define and use acl_match_cond() helper acl_match_cond() combines acl_exec_cond() + acl_pass() and a check on the condition->pol (to check if the cond is inverted) in order to return either 0 if the cond doesn't match or 1 if it matches (or NULL). Thanks to this we can actually simplify some redundant constructs that iterate over rules and evaluate if the condition matches or not. Conditions for tcp-request inspect-content and tcp-response inspect-content couldn't be simplified because they perform an extra check for missing data, and thus still need to leverage acl_exec_cond() It's best to display the patch using "-w", like "git show xxxx -w", because some blocks had to be re-indented after the cleanup, which makes the patch hard to review by default.	2025-01-27 11:11:43 +01:00
Valentine Krasnobaeva	94d3b7375a	CLEANUP: ssl: move ssl_sock_gencert_load_ca declaration in ssl_gencert.h As ssl_sock_gencert_load_ca and ssl_sock_gencert_free_ca are compiled only if SSL_NO_GENERATE_CERTIFICATES is not defined, let's align it and move these declarations in ssl_gencert.h.	2025-01-24 12:31:07 +01:00
Valentine Krasnobaeva	846819b316	CLEANUP: ssl: rename ssl_sock_load_ca to ssl_sock_gencert_load_ca ssl_sock_load_ca is defined in ssl_gencert.c and compiled only if SSL_NO_GENERATE_CERTIFICATES is not defined. Its name is a bit confusing, as we may think at first glance that it's a generic function which is also used to load the CA file provided via the 'ca-file' keyword. ssl_set_verify_locations_file is used in this case. So let's rename ssl_sock_load_ca into ssl_sock_gencert_load_ca. Same is applied to ssl_sock_free_ca.	2025-01-24 12:31:07 +01:00
Valentine Krasnobaeva	c987f30245	BUG/MINOR: ssl: put ssl_sock_load_ca under SSL_NO_GENERATE_CERTIFICATES ssl_sock_load_ca and ssl_sock_free_ca definitions are compiled only if SSL_NO_GENERATE_CERTIFICATES is not set. When this define is set and haproxy is built, the linker throws an error. So, let's fix this. This should be backported in all stable versions.	2025-01-24 12:31:07 +01:00
Willy Tarreau	670182bc9e	[RELEASE] Released version 3.2-dev4 Released version 3.2-dev4 with the following main changes : - BUG/MINOR: stktable: fix big-endian compatiblity in smp_to_stkey() - MINOR: stktable: add stkey_to_smp() helper - MINOR: stktable: add stksess_getkey() helper - MINOR: stktable: add sc[0-2]_key fetches - BUG/MEDIUM: queues: Adjust the proxy counters when appropriate - MINOR: trace: add help message for -dt argument - MINOR: trace: ensure -dt priority over traces config section - MINOR: trace: support all source alias on -dt - BUG/MINOR: quic: reject NEW_TOKEN frames from clients - MINOR: stktable: fix potential build issue in smp_to_stkey - BUG/MEDIUM: stktable: fix missing lock on some table converters - BUG/MEDIUM: promex: Use right context pointers to dump backends extra-counters - MINOR: stktable: fix potential build issue in smp_to_stkey (2nd try) - MINOR: stktable: add smp_fetch_stksess() helper function - MEDIUM: stktable: split src-based key smp_fetch_sc functions - MEDIUM: stktable: split sc_ and src_ fetch lookup logics - MEDIUM: stktable: leverage smp_fetch_* helpers from sample conv - DOC: config: unify sample conv\|fetches optional arguments syntax - DOC: config: stick-table converters support implicit <table> argument - DOC: config: stick-table converter do accept ANY-typed input - DOC: config: clarify return type for some stick-table converters - DOC: config: refer to canonical sticktable converters for src_* fetches - CLEANUP: stktable: move sample_conv_table_bytes_out_rate() - MINOR: stktable: add table_{inc,clr}_gpc* converters - BUG/MAJOR: quic: reject too large CRYPTO frames - BUG/MAJOR: log/sink: possible sink collision in sink_new_from_srv() - BUG/MINOR: init: set HAPROXY_STARTUP_VERSION from the variable, not the macro - REORG: version: move the remaining BUILD_* stuff from haproxy.c to version.c - BUG/MINOR: quic: ensure a detached coalesced packet can't access its neighbours - MINOR: quic: Add a BUG_ON() on quic_tx_packet refcount - BUILD: quic: Move an ASSUME_NONNULL() for variable which is not null - BUG/MEDIUM: mux-h1: Properly close H1C if an error is reported before sending data - CLEANUP: quic: remove unused prototype - MINOR: quic: rename pacing_rate cb to pacing_inter - BUG/MINOR: quic: do not increase congestion window if app limited - MINOR: mux-quic: increment pacing retry counter on expired - MEDIUM: quic: implement credit based pacing - MEDIUM: mux-quic: reduce pacing CPU usage with passive wait - MEDIUM: quic: use dynamic credit for pacing - MINOR: quic: remove unused pacing burst in bind_conf/quic_cc_path - MINOR: quic: adapt credit based pacing to BBR - MINOR: tools: add errname to print errno macro name - MINOR: debug: debug_parse_cli_show_dev: use errname - MINOR: debug: show boot and runtime process settings in table	2025-01-24 11:01:06 +01:00
Valentine Krasnobaeva	8620ae7962	MINOR: debug: show boot and runtime process settings in table Let's reformat output of "show dev" in order to show some boot and runtime process settings in a table. This makes the output less crowded.	2025-01-24 09:54:57 +01:00
Valentine Krasnobaeva	df7f16d960	MINOR: debug: debug_parse_cli_show_dev: use errname Let's use errname, introduced in the previous commit, in the output of "show dev". This output is intended for engineers, so there is no need to provide the long descriptions of errnos given by strerror.	2025-01-24 09:54:57 +01:00
Valentine Krasnobaeva	44537379fc	MINOR: tools: add errname to print errno macro name Add helper to print the name of errno's corresponding macro, for example "EINVAL" for errno=22. This may be helpful for debugging and for using in some CLI commands output. The switch-case in errname() contains only the errnos currently used in the code. So, it needs to be extended, if one starts to use new syscalls.	2025-01-24 09:54:57 +01:00
Amaury Denoyelle	42bac9339c	MINOR: quic: adapt credit based pacing to BBR Credit based pacing has been further refined to be able to calculate dynamically burst size based on congestion parameter. However, BBR algorithm already provides pacing rate and burst size (labelled as send_quantum) for 1ms of emission. Adapt quic_pacing_reload() to use BBR values to compute pacing credit. This is done via pacing_burst callback which is now only defined for BBR. For other algorithms, determine the burst size over 1ms with the congestion window size and RTT. This should be backported up to 3.1.	2025-01-23 17:41:07 +01:00
Amaury Denoyelle	7896edccdc	MINOR: quic: remove unused pacing burst in bind_conf/quic_cc_path Pacing burst size is now dynamic. As such, configuration value has been removed and related fields in bind_conf and quic_cc_path structures can be safely removed. This should be backported up to 3.1.	2025-01-23 17:40:48 +01:00
Amaury Denoyelle	cb91ccd8a8	MEDIUM: quic: use dynamic credit for pacing Major improvements have been introduced in pacing recently. Most notably, QMUX schedules emission on a millisecond resolution, which allows passive wait to be used and is much more CPU friendly. However, an issue remains with the pacing max credit. Unless BBR is used, it is fixed to the configured value from quic-cc-algo bind statement. This is not practical as if too low, it may drastically reduce performance due to 1ms sleep resolution. If too high, some clients will suffer from too much packet loss. This commit fixes the issue by implementing a dynamic maximum credit value based on the network condition specific to each client. Calculation is done to fix a maximum value which should allow QMUX current tasklet context to emit enough data to cover the delay with the next tasklet invocation. As such, avg_loop_us is used to detect the process load. If too small, 1.5ms is used as minimal value, to cover the extra delay incurred by the system which will happen for a default 1ms sleep. This should be backported up to 3.1.	2025-01-23 17:40:48 +01:00
Amaury Denoyelle	8098be1fdc	MEDIUM: mux-quic: reduce pacing CPU usage with passive wait Pacing algorithm has been revamped in the previous commit to implement a credit based solution. This is a far more adaptive solution which, in particular, allows catching up when the pause between pacing emissions was longer than expected. This allows QMUX to remove the active loop based on tasklet wake-up. Instead, a new task is used when emission should be paced. The main advantage is that CPU usage is drastically reduced. New pacing task timer is reset each time qcc_io_send() is invoked. Timer will be set only if pacing engine reports that emission must be interrupted. In this case timer is set via qcc_wakeup_pacing() to the delay reported by congestion algorithm, or 1ms if delay is too short. At the end of qcc_io_cb(), pacing task is queued if timer has been set. Pacing task execution is simple enough: it immediately wakes up QCC I/O handler. Note that decent performance requires a large enough burst defined in the quic-cc-algo configuration. However, this value is common to every client of a listener, which may cause too much loss under some network conditions. This will be addressed in a future patch. This should be backported up to 3.1.	2025-01-23 17:40:22 +01:00
Amaury Denoyelle	4489a61585	MEDIUM: quic: implement credit based pacing Implement a new method for QUIC pacing emission based on credit. This represents the number of packets which can be emitted in a single burst. After emission, decrement from the credit the number of emitted packets. Several emissions can be conducted in the same sequence until the credit is completely decremented. When a new emission sequence is initiated (i.e. under a new QMUX tasklet invocation), credit is refilled according to the delay which occurred between the last and current emission context. The main advantage of this new mechanism is that it allows conducting several emissions in the same task context without having to wait between each invocation. Wait is only forced if pacing is expired, which is now equivalent to having a null credit. Furthermore, if the delay between two emission sequences would have been smaller than expected, credit is only partially refilled. This allows to restart emission without having to wait for the whole credit to be available. On the implementation side, a new field <credit> is available in quic_pacer structure. It is automatically decremented on quic_pacing_sent_done() invocation. Also, a new function quic_pacing_reload() must be used by QUIC MUX when a new emission sequence is initiated to refill credit. <next> field from quic_pacer has been removed. For the moment, credit is based on the burst configured via quic-cc-algo keyword, or directly reported by BBR. This should be backported up to 3.1.	2025-01-23 17:40:20 +01:00
Amaury Denoyelle	9d8589f0de	MINOR: mux-quic: increment pacing retry counter on expired A field <paced_sent_ctr> from quic_pacer structure is used to report the number of occurrences where emission has been interrupted due to pacing. However, it was not incremented when QUIC MUX had to immediately pause emission as pacing was still not yet expired. Fix this by incrementing <paced_sent_ctr> in qcc_io_send() prior to emission if pacing is expired. Note that incrementation is only done once if the tasklet is then repeatedly woken up until the timer is expired. This should be backported up to 3.1.	2025-01-23 17:29:14 +01:00
Amaury Denoyelle	bbaa7aef7b	BUG/MINOR: quic: do not increase congestion window if app limited Previously, congestion window was increased each time a new acknowledgment was received. However, it did not take into account the window filling level. In a network condition with negligible loss, this will cause the window to be incremented until the maximum value (by default 480k), even though the application does not have enough data to fill it. In most cases, this issue is not noticeable. However, it may lead to excessive memory consumption when a QUIC connection is suddenly interrupted, as in this case haproxy will fill the window with retransmission. It even has caused OOM crash when thousands of clients were interrupted at once on a local network benchmark. Fix this by first checking window level prior to every incrementation via a new helper function quic_cwnd_may_increase(). It was arbitrarily decided that the window must be at least 50% full when the ACK is handled prior to increment it. This value is a good compromise to keep window in check while still allowing fast increment when needed. Note that this patch only concerns cubic and newreno algorithm. BBR has already its notion of application limited which ensures the window is only incremented when necessary. This should be backported up to 2.6.	2025-01-23 14:49:35 +01:00
Amaury Denoyelle	7c0820892f	MINOR: quic: rename pacing_rate cb to pacing_inter Rename one of the congestion algorithms pacing callback from pacing_rate to pacing_inter. This better reflects that this function returns a delay (in nanoseconds) which should be applied between each packet emission to fill the congestion window with a perfectly smoothed emission. This should be backported up to 3.1.	2025-01-23 14:49:35 +01:00
Amaury Denoyelle	2178bf1192	CLEANUP: quic: remove unused prototype Remove undefined quic_pacing_send() function prototype from quic_pacing module. This should be backported up to 3.1.	2025-01-23 14:49:35 +01:00
Christopher Faulet	b18e988e0d	BUG/MEDIUM: mux-h1: Properly close H1C if an error is reported before sending data It is possible to have front H1 connections waiting for the client timeout while they should be closed because a connection error was reported before sending an error message to the client. It is not a leak because the connections are closed when the timeout expires but it is a waste of resources, especially if the client timeout is high. When an early error message must be sent to the client, if an error was already detected, no data are sent and the output buffer is released. At this stage, the H1 connection is in CLOSING state and it must be released. But because of a bug, this is not performed. The client timeout is rearmed and the H1 connection is only closed when it expires. To fix the issue, the condition to close a H1C must also be evaluated when an error is detected before sending data. It is only an issue with idle client connections, because there is no H1 stream in that case and the error message is generated by the mux itself. This patch must be backported as far as 2.8.	2025-01-23 11:05:48 +01:00
Frederic Lecaille	1f099db7e2	BUILD: quic: Move an ASSUME_NONNULL() for variable which is not null Some new compilers warn that <oldest_lost> variable can be null even though this cannot be the case, as mentioned by the comment on an already present ASSUME_NONNULL() call. The warning reads as follows: src/quic_loss.c: In function ‘qc_release_lost_pkts’: src/quic_loss.c:307:86: error: potential null pointer dereference [-Werror=null-dereference] 307 \| unsigned int period = newest_lost->time_sent_ms - oldest_lost->time_sent_ms; \| ~~~~~~~~~~~^~~~~~~~~~~~~~ Move up this ASSUME_NONNULL() statement to please these compilers. Must be backported as far as 2.6 to ease any further backport around this code part.	2025-01-21 22:01:34 +01:00
Frederic Lecaille	4f38c4bfd8	MINOR: quic: Add a BUG_ON() on quic_tx_packet refcount It is definitely a bug to call quic_tx_packet_refdec() to decrement the reference counter of a TX packet, and possibly to release its memory, when this counter is negative or null. This counter is incremented when a TX frm is attached to it with some allocated memory and when the packet is inserted into a data structure, if needed (list or tree). Should be easily backported as far as 2.6 to ease any further backport around this code part.	2025-01-21 22:01:34 +01:00
Frederic Lecaille	cb729fb64d	BUG/MINOR: quic: ensure a detached coalesced packet can't access its neighbours Reset ->prev and ->next fields of a coalesced TX packet to ensure it cannot access several times its neighbours after it is supposed to be detached from them calling quic_tx_packet_dgram_detach(). There are two cases where a packet can be coalesced to another previous built one: this is when it is built into the same datagram without GSO (and flagged with QUIC_FL_TX_PACKET_COALESCED) or when sent from the same sendto() syscall with GSO (not flagged with QUIC_FL_TX_PACKET_COALESCED). This fix may be in relation with GH #2839. Must be backported as far as 2.6.	2025-01-21 22:01:34 +01:00
Willy Tarreau	b066c0affb	REORG: version: move the remaining BUILD_* stuff from haproxy.c to version.c version.c tries to centralize all variables conveying version information, but there's still an issue with the BUILD_* variables which are only passed to haproxy.o and are only updated when that one is rebuilt. This is not very logical given that we can end up with values there which contradict info from version.c. Better move all of these to version.c which is systematically rebuilt. Most of these variables only end up as string concatenation at the moment. Some of them are even duplicated. In version.c we now have one variable (or constant) for each of them and haproxy.c references them in messages. This is much more logical and easier to maintain in a consistent state. The patch looks a bit large but it really only moves the ifdefed string assignment from one file to another, placing them into variables.	2025-01-20 17:53:55 +01:00
Willy Tarreau	9e61cf6790	BUG/MINOR: init: set HAPROXY_STARTUP_VERSION from the variable, not the macro This environment variable was added by commit d4c0be6b20 ("MINOR: startup: HAPROXY_STARTUP_VERSION contains the version used to start"). However, it's set from the macro that is passed during the build process instead of being set from the variable that's kept up to date in version.c. The difference is visible only during debugging/bisecting because only changed files and version.o are rebuilt, but not necessarily haproxy.o, which is where the environment variable is set. This means that the version exposed in the environment is not necessarily the same as the one presented in "haproxy -v" during such debugging sessions. This should be backported to 2.8. It has no impact at all on regularly built binaries.	2025-01-20 17:53:55 +01:00
Aurelien DARRAGON	bfa493d4be	BUG/MAJOR: log/sink: possible sink collision in sink_new_from_srv() sink_new_from_srv() leverages sink_new_buf() with the server id as name, sink_new_buf() then calls __sink_new() with the provided name. Unfortunately sink_new() is designed in such a way that it will first look up in the list of existing sinks to check if a sink already exists with given name, in which case the existing sink is returned. While this behavior may be error-prone, it is actually up to the caller to ensure that the provided name is unique if it really expects a unique sink pointer. Due to this bug in sink_new_from_srv(), multiple tcp servers with the same name defined in distinct log backends would end up sharing the same sink, which means messages sent to one of the servers would also be forwarded to all servers with the same name across all log backend sections defined in the config, which is obviously an issue and could even raise security concerns. Example: defaults log backend@log-1 local0 backend log-1 mode log server s1 127.0.0.1:514 backend log-2 mode log server s1 127.0.0.1:5114 With the above config, logs sent to log-1/s1 would also end up being sent to log-2/s1 due to server id "s1" being used for tcp servers in distinct log backends. To fix the issue, we now prefix the sink name with the backend name: back_name/srv_id combination is known to be unique (backend name serves as a namespace). This bug was reported by GH user @landon-lengyel under #2846. UDP servers (with udp@ prefix before the address) are not affected as they don't make use of the sink facility. As a workaround, one should manually ensure that all tcp servers across different log backends (backend with "mode log" enabled) use unique names. This bug was introduced in e58a9b4 ("MINOR: sink: add sink_new_from_srv() function") thus it exists since the introduction of log backends in 2.9, which means this patch should be backported up to 2.9.	2025-01-20 12:33:20 +01:00
Amaury Denoyelle	c3a4a4d166	BUG/MAJOR: quic: reject too large CRYPTO frames Received CRYPTO frames are inserted in a ncbuf to handle out-of-order reception via ncb_add(). They are stored on the position relative to the frame offset, minus a base offset which corresponds to the in-order data length already handled. Previously, no check was implemented on the frame offset value prior to ncb_add(), which could easily trigger a crash if relative offset was too large. Fix this by ensuring first that the frame can be stored in the buffer before ncb_add() invocation. If this is not the case, connection is closed with error CRYPTO_BUFFER_EXCEEDED, as required by QUIC specification. This should fix github issue #2842. This must be backported up to 2.6.	2025-01-20 11:43:23 +01:00
Aurelien DARRAGON	0486b9e491	MINOR: stktable: add table_{inc,clr}_gpc* converters As discussed in GH #2423, there are some cases where src_{inc,clr}_gpc* is not sufficient because we need to perform the lookup on a specific key. Indeed, just like we did in e642916 ("MEDIUM: stktable: leverage smp_fetch_* helpers from sample conv"), we can easily implement new table converters based on existing fetches. This is what we do in this patch. Also the doc was updated so that src_{inc,clr}_gpc* fetches now point to their generic equivalent table_{inc,clr}_gpc. Indeed, src_{inc,clr}_gpc are simply aliases. This should fix GH #2423.	2025-01-16 11:50:33 +01:00
Aurelien DARRAGON	9f68049cc1	CLEANUP: stktable: move sample_conv_table_bytes_out_rate() sample_conv_table_bytes_out_rate() was defined in the middle of other stick-table sample convs without any ordering logic. Let's put it where it belongs, right after sample_conv_table_bytes_in_rate().	2025-01-16 11:50:27 +01:00
Aurelien DARRAGON	62e42184ab	DOC: config: refer to canonical sticktable converters for src_* fetches When available, to prevent doc duplication, let's make src_* fetches point to equivalent table_* converters, as they are in fact aliases for src,table_* converters.	2025-01-16 11:50:20 +01:00
Aurelien DARRAGON	163c1124a2	DOC: config: clarify return type for some stick-table converters Some stick-table converters such as "table_gpt" erroneously suggest that the returned type is a boolean while in fact it is integer type, as properly documented for the sample fetch equivalents.	2025-01-16 11:50:14 +01:00
Aurelien DARRAGON	a8407cf3f7	DOC: config: stick-table converter do accept ANY-typed input Since 2d17db58 ("MINOR: stick-table: change all stick-table converters' inputs to SMP_T_ANY"), all stick-table converters accept ANY input type as parameter, which means that the key is no longer restricted to a string representation of the input. However the doc wasn't updated when the change was made. Moreover, some converters document the updated behavior while others don't, which is kind of confusing, so let's fix that.	2025-01-16 11:50:08 +01:00
Aurelien DARRAGON	0d318b4383	DOC: config: stick-table converters support implicit <table> argument As with stick-table sample fetches, the <table> argument is not strictly needed and defaults to the current proxy's stick-table when not provided Let's update the doc and prototype to reflect the current behavior.	2025-01-16 11:50:02 +01:00
Aurelien DARRAGON	dfdee47a8e	DOC: config: unify sample conv\|fetches optional arguments syntax The most common way (and proper way it seems) to declare optional arguments in sample fetch or converters' prototype is to declare them between square brackets, including the leading comma (because the comma should be omitted if the argument is not provided). Also, when multiple optional arguments are found, we should apply the same logic but recursively. In this patch we fix prototypes that include optional arguments and don't follow this syntax. This improves readability and sets the norm for upcoming sample fetches/converters.	2025-01-16 11:49:55 +01:00
Aurelien DARRAGON	e6429166b9	MEDIUM: stktable: leverage smp_fetch_* helpers from sample conv In this patch we try to prevent code duplication: some fetches and sample converters do the exact same thing, except that the converter takes the argument as input data. Until now, both the converter and the fetch had their own implementation (copy pasted), with the fetch specific or converter specific lookup part. Thanks to previous commits, we now have generic sample fetch helpers that take the stkctr as argument, so let's leverage them directly from the converter functions when available. This allows to remove a lot of code duplication and should make code maintenance easier in the future.	2025-01-15 14:04:55 +01:00
Aurelien DARRAGON	6c9b315187	MEDIUM: stktable: split sc_ and src_ fetch lookup logics While this patch adds more insertions than deletions, it actually tries to simplify the lookup logic for sc_ and src_ sticktable fetches. Indeed, smp_create_src_stkctr() and smp_fetch_sc_stkctr() combination was used everywhere the fetch supports sc_ and src_ form, and smp_fetch_sc_stkctr() even integrated some of the src-oriented fetch logic. Not only was this confusing, but it made the task of adding new generic fetches even more complex. Thus in this patch we completely dedicate smp_fetch_sc_stkctr() to sc_ oriented fetches, while smp_create_src_stkctr() is now renamed to smp_fetch_src_stkctr() and can now work on its own for src_ oriented fetches. It takes an additional parameter, "create", to tell the function if the entry should be created if it doesn't exist yet. Now it's up to the calling function to know if it should be using the sc_ oriented fetch or the src_ oriented one based on the input keyword.	2025-01-15 14:04:50 +01:00
Aurelien DARRAGON	22229a41a2	MEDIUM: stktable: split src-based key smp_fetch_sc functions In this patch we split several sample fetch functions that are leveraged by the "src-" fetches such as smp_fetch_sc_inc_gpc(). Indeed, for all of them, we add an intermediate helper function that takes a stkctr pointer as parameter and performs the logic, leaving the lookup part in the calling function. Before this patch existing functions were doing the lookup + the fetch logic. Thanks to this patch it will become easier to add generic converters taking lookup key as input. List of targeted functions: - smp_fetch_sc_inc_gpc() - smp_fetch_sc_inc_gpc0() - smp_fetch_sc_inc_gpc1() - smp_fetch_sc_clr_gpc() - smp_fetch_sc_clr_gpc0() - smp_fetch_sc_clr_gpc1() - smp_fetch_sc_conn_cnt() - smp_fetch_sc_conn_rate() - smp_fetch_sc_updt_conn_cnt() - smp_fetch_sc_conn_curr() - smp_fetch_sc_glitch_cnt() - smp_fetch_sc_glitch_rate() - smp_fetch_sc_sess_cnt() - smp_fetch_sc_sess_rate() - smp_fetch_sc_http_req_cnt() - smp_fetch_sc_http_req_rate() - smp_fetch_sc_http_err_cnt() - smp_fetch_sc_http_err_rate() - smp_fetch_sc_http_fail_cnt() - smp_fetch_sc_http_fail_rate() - smp_fetch_sc_kbytes_in() - smp_fetch_sc_bytes_in_rate() - smp_fetch_kbytes_out() - smp_fetch_sc_gpc1_rate() - smp_fetch_sc_gpc0_rate() - smp_fetch_sc_gpc_rate() - smp_fetch_sc_get_gpc1() - smp_fetch_sc_get_gpc0() - smp_fetch_sc_get_gpc() - smp_fetch_sc_get_gpt0() - smp_fetch_sc_get_gpt() - smp_fetch_sc_bytes_out_rate() Please note that this patch doesn't render any good using "git show" or "git diff". For all the functions listed above, a new helper function was defined right above it, with the same name without "_sc". These new functions perform the fetch part, while the original ones (with "_sc") now simply perform the lookup and then leverage the corresponding fetch helper.	2025-01-15 14:04:45 +01:00
Aurelien DARRAGON	f71bad4694	MINOR: stktable: add smp_fetch_stksess() helper function smp_fetch_stksess(table, smp, create) performs a lookup in <table> by using <smp> as a key. It returns matching entry on success and NULL on failure. <create> can be set to 1 to force the entry creation. We then use this helper everywhere relevant to prevent code duplication	2025-01-15 14:04:40 +01:00
Aurelien DARRAGON	0fb8807820	MINOR: stktable: fix potential build issue in smp_to_stkey (2nd try) As discussed in GH #2838, the previous fix f399dbf ("MINOR: stktable: fix potential build issue in smp_to_stkey") which attempted to remove conversion ambiguity and prevent build warning proved to be insufficient. This time, we implement Willy's suggestion, which is to use a union to perform the conversion. Hopefully this should fix GH #2838. If that's the case (and only in that case), then this patch may be backported with f399dbf (else the patch won't apply) anywhere b59d1fd ("BUG/MINOR: stktable: fix big-endian compatiblity in smp_to_stkey()") was backported.	2025-01-15 14:04:31 +01:00
Christopher Faulet	91578212d7	BUG/MEDIUM: promex: Use right context pointers to dump backends extra-counters When backends extra counters are dumped, the wrong pointer was used in the promex context to retrieve the stats module. p[1] must be used instead of p[2]. Because of this typo, an infinite loop could be experienced if the output buffer is full during this stage. But in all cases an overflow is possible leading to a memory corruption. This patch may be related to issue #2831. It must be backported as far as 3.0.	2025-01-14 15:38:43 +01:00
Aurelien DARRAGON	8919a80da9	BUG/MEDIUM: stktable: fix missing lock on some table converters In 819fc6f563 ("MEDIUM: threads/stick-tables: handle multithreads on stick tables"), sample fetch and action functions were properly guarded with stksess read/write locks for read and write operations respectively, but the sample_conv_table functions leveraged by "table_" converters were overlooked. This bug was not known to cause issues in existing deployments yet (at least it was not reported), but due to its nature it can theoretically lead to inconsistent values being reported by "table_" converters if the value is being updated by another thread in parallel. It should be backported to all stable versions. [ada: for versions < 3.0, glitch_cnt and glitch_rate samples should be ignored as they first appeared in 3.0]	2025-01-14 11:36:04 +01:00
Aurelien DARRAGON	f399dbf70c	MINOR: stktable: fix potential build issue in smp_to_stkey smp_to_stkey() uses an ambiguous cast from 64bit integer to 32 bit unsigned integer. While it is intended, let's make the cast less ambiguous by explicitly casting the right part of the assignment to the proper type. This should fix GH #2838	2025-01-13 09:45:40 +01:00
Amaury Denoyelle	4a5d82a97d	BUG/MINOR: quic: reject NEW_TOKEN frames from clients As specified by RFC 9000, reject NEW_TOKEN frames emitted by clients. Close the connection with error code PROTOCOL_VIOLATION. This must be backported up to 2.6.	2025-01-10 14:50:59 +01:00
Amaury Denoyelle	a2c0c459a4	MINOR: trace: support all source alias on -dt Command line argument -dt can be used to activate traces during startup. Via its optional argument, it is possible to change settings for a particular trace source. It is also possible to update every registered source by specifying an empty name. Support the trace source alias "all". This is an alternative to the empty name to update every source.	2025-01-10 14:50:59 +01:00
Amaury Denoyelle	a50dd07c16	MINOR: trace: ensure -dt priority over traces config section Traces can be activated on startup either via -dt command line argument or via the traces configuration section. This can cause confusion as it may not be clear whether a trace source is completed or overridden by one or the other. Fix the precedence to give the priority to the command line argument. Now, each trace source configured via -dt is first reset to a default state before applying new settings. Then, it is impossible to change a trace source via the configuration file if it was already targeted via -dt argument.	2025-01-10 14:50:59 +01:00
Amaury Denoyelle	da9a7e0bd9	MINOR: trace: add help message for -dt argument Traces can be activated on startup via -dt command line argument. To facilitate its usage, display a usage description and examples when "help" is specified.	2025-01-10 14:50:59 +01:00
Olivier Houchard	659d5f6579	BUG/MEDIUM: queues: Adjust the proxy counters when appropriate In process_srv_queue(), if we manage to successfully run an extra task, don't forget to adjust the proxy's totpend and served counters accordingly. Having an inaccurate served could lead to various subtle bugs, as it is used when making load balancing decisions. This should not be backported, unless cda7275ef5d5e49fb2ea2373ea3b1ba63fc927c3 is backported too.	2025-01-09 17:46:46 +01:00
Aurelien DARRAGON	24042df94e	MINOR: stktable: add sc[0-2]_key fetches As discussed in GH #1750, we were lacking a sample fetch to be able to retrieve the key from the currently tracked counter entry. To do so, sc_key fetch can now be used. It returns a sample with the correct type (table key type) corresponding to the tracked counter entry (from previous track-sc rules). If no entry is currently tracked, it returns nothing. It can be used using the standard form "sc_key(<sc_number>)" or the legacy form: "sc0_key", "sc1_key", "sc2_key" Documentation was updated.	2025-01-09 10:57:01 +01:00
Aurelien DARRAGON	7423310d5d	MINOR: stktable: add stksess_getkey() helper stksess_getkey(t, ts) returns a stktable_key struct pointer filled with data from input <ts> entry in <t> table. Returned pointer uses the static_table_key variable. Indeed, stktable_key struct is more convenient to manipulate than having to deal with the key extraction from stksess struct directly.	2025-01-09 10:56:56 +01:00
Aurelien DARRAGON	df9c2ef2c3	MINOR: stktable: add stkey_to_smp() helper reverse operation for smp_to_stkey(): fills input <smp> from a stktable_key struct. Returns 1 on success and 0 on failure.	2025-01-09 10:56:50 +01:00
Aurelien DARRAGON	b59d1fd911	BUG/MINOR: stktable: fix big-endian compatiblity in smp_to_stkey() When smp_to_stkey() deals with SINT samples, since stick-tables deals with 32 bits integers while SINT sample is 64 bit integer, inplace conversion was done in smp_to_stkey. For that the 64 bit integer was truncated before the key would point to it. Unfortunately this only works on little endian architectures because with big endian ones, the key would point to the wrong 32bit range. To fix the issue and make the conversion endian-proof, let's re-assign the sample as 32bit integer before the key points to it. Thanks to Willy for having spotted the bug and suggesting the above fix. It should be backported to all stable versions.	2025-01-09 10:56:43 +01:00
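The endianness pitfall described in the entry above can be sketched in a few lines of C. This is only an illustration under stated assumptions: `union sample_value` and both helper functions are hypothetical stand-ins, not HAProxy's actual sample or stick-table structures. The first helper reads the first four bytes of the 64-bit storage, which only yields the low half on little-endian machines; the second re-assigns the value as a 32-bit integer before reading it, which works regardless of byte order.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a sample value: the same storage can be
 * viewed as a 64-bit integer or as a 32-bit one. */
union sample_value {
    int64_t  sint;  /* how a 64-bit integer sample is stored */
    uint32_t as32;  /* 32-bit view used by the endian-proof variant */
};

/* Fragile variant: take the first 4 bytes of the 64-bit storage as the key.
 * On little-endian this is the low 32 bits, on big-endian the high 32 bits. */
static uint32_t key_from_first_bytes(const union sample_value *v)
{
    uint32_t key;

    memcpy(&key, &v->sint, sizeof(key));
    return key;
}

/* Endian-proof variant: convert the value to 32 bits first, store it back
 * as a 32-bit integer, and only then let the key refer to it. */
static uint32_t key_from_reassigned(union sample_value *v)
{
    uint32_t low = (uint32_t)v->sint; /* well-defined modular truncation */

    v->as32 = low;
    return v->as32;
}
```

With the value 0x0000000100000002, the second helper always returns 2, while the first returns 2 on little-endian hosts and 1 on big-endian ones.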
Willy Tarreau	7be596b35c	[RELEASE] Released version 3.2-dev3 Released version 3.2-dev3 with the following main changes : - DOC: config: add missing "track-sc0" in action keywords matrix - BUG/MINOR: stktable: invalid use of stkctr_set_entry() with mixed table types - BUG/MAJOR: mux-quic: fix BUG_ON on empty STREAM emission - BUG/MEDIUM: mux-h2: Count copied data when looping on RX bufs in h2_rcv_buf() - Revert "BUG/MAJOR: mux-quic: fix BUG_ON on empty STREAM emission" - BUG/MAJOR: mux-quic: properly fix BUG_ON on empty STREAM emission - MINOR: mux-quic: add traces on sd attach - BUG/MEDIUM: mux-quic: do not attach on already closed stream - BUG/MINOR: compression: handle a possible strdup() failure - BUG/MINOR: pool: handle a possible strdup() failure - BUG/MINOR: cfgparse-tcp: handle a possible strdup() failure - BUG/MINOR: log: Allow to use if/unless conditionnals for do-log action - MINOR: config: Alert about extra arguments for errorfile and errorloc - BUG/MINOR: mux-quic: fix wakeup on qcc_set_error() - MINOR: mux-quic: change return value of qcs_attach_sc() - BUG/MINOR: mux-quic: handle closure of uni-stream - BUG/MEDIUM: promex/resolvers: Don't dump metrics if no nameserver is defined - BUG/MAJOR: ssl/ocsp: fix NULL conn object dereferencing to access QUIC TLS counters - MEDIUM: errors: get rid of shm_open() - BUILD: makefile: do not clean standalone binaries on a simple "make clean" - BUILD: makefile: add a qinfo macro to pass info in quiet mode - DEV: ncpu: add a simple utility to help with NUMA development - DEV: ncpu: implement a wrapper mode - DEV: ncpu: make the wrapper work both as a lib and executable - BUG/MEDIUM: h1-htx: Properly handle bodyless messages - MINOR: tools: add a few functions to simply check for a file's existence	2025-01-09 09:21:04 +01:00
Willy Tarreau	b25850f25b	MINOR: tools: add a few functions to simply check for a file's existence At many places we'd like to be able to simply construct a path from a format string and check if that path corresponds to an existing file, directory etc. Here we add 3 functions, a generic one to test that a path corresponds to a given file mode (e.g. S_IFDIR, S_IFREG etc), and two other ones specifically checking for a file or a dir for easier use.	2025-01-09 09:18:49 +01:00
Christopher Faulet	b9cc361b35	BUG/MEDIUM: h1-htx: Properly handle bodyless messages During h1 parsing, there are some postparsing checks to detect bodyless messages and switch the parsing in DONE state. However, a case was not properly handled. Responses to HEAD requests with a "transfer-encoding" header. The response parser remained blocked waiting for the response body. To fix the issue, the postparsing was slightly modified. Instead of trying to handle bodyless messages in a common way between the request and the response, it is now performed in the dedicated postparsing functions. It is easier to enumerate all cases, especially because there is already a test for responses to HEAD requests. This patch should fix the issue #2836. It must be backported as far as 2.9.	2025-01-08 18:20:26 +01:00
Willy Tarreau	ca773e1a2a	DEV: ncpu: make the wrapper work both as a lib and executable It's convenient to have a shared lib be able to also work as a wrapper. But recent glibc broke support for this dual-mode thing some time ago: https://patchwork.ozlabs.org/project/glibc/patch/20190312130235.8E82C89CE49C@oldenburg2.str.redhat.com/ https://stackoverflow.com/questions/59074126/loading-executable-or-executing-a-library Trying to preload such an executable indeed returns: ERROR: ld.so: object '/path/to/ncpu.so' from LD_PRELOAD cannot be preloaded (cannot dynamically load position-independent executable): ignored. Note that the code still supports it since libc.so is both an executable and a lib. The approach taken here is the same as in the nousr.so wrapper. It consists in dropping the DF_1_PIE flag from the resulting executable since it's what the dynamic linker is looking for. This flag is found in FLAGS_1 in the .dynamic section. As readelf -a suggests, it's after the tag 0x6ffffffb. The value is 0x08000000. We're using objdump to figure the length and offset of the struct, dd to extract the 3 parts, and sed to patch the binary. It's likely that it will only work on 64-bit little endian, though tests should be performed to see what to do on other platforms. At least on x86_64, ld.so is happy and it continues to be possible to use the binary as a .so, and that's the platform where most of the development happens so that's fine. In any case the wrapper and the standard shared lib are still made two distinct files so that it's possible to use the non-patched version on unsupported OSes or architectures.	2025-01-08 11:27:10 +01:00
Willy Tarreau	3fdf875716	DEV: ncpu: implement a wrapper mode The wrapper mode allows to present itself as LD_PRELOAD before loading haproxy, which is often more convenient since it allows to pass the number of CPUs in argument. However, this mode is no longer supported by modern glibcs, so a future patch will come to implement a trick that was tested to work at least on x86.	2025-01-08 11:26:05 +01:00
Willy Tarreau	25c08562cb	DEV: ncpu: add a simple utility to help with NUMA development Collecting captures of /sys isn't sufficient for NUMA development because haproxy detects the number of CPUs at boot time and will not be able to inspect more than this number. Let's just have a small utility to report a fake number of CPUs, that will be loaded using LD_PRELOAD. It checks the NCPU variable if it exists and will present this number of CPUs, or if it does not exist, will expose the maximum supported number.	2025-01-08 11:26:05 +01:00
Willy Tarreau	bd06502b22	BUILD: makefile: add a qinfo macro to pass info in quiet mode Some commands such as $(cmd_CC) etc already handle the quiet vs verbose mode in the makefile, but sometimes we may want to pass other info. The new "qinfo" macro can be called with a 9-char string argument (spaces included) as a prefix for some commands, to emit that string when in quiet mode. The caller must fill the spaces needed for alignment. E.g: $(call qinfo, CC )$(CC) ...	2025-01-08 11:26:05 +01:00
Willy Tarreau	c87619fa25	BUILD: makefile: do not clean standalone binaries on a simple "make clean" Running "make clean" currently gets rid of a number of auxiliary tools, including the standalone ones that do not depend on haproxy's build options. This is a bit annoying as they have to be rebuilt each time. Let's move them to the distclean target instead.	2025-01-08 11:26:01 +01:00
William Lallemand	143be1b59f	MEDIUM: errors: get rid of shm_open() Since 5ee266b7 ("MINOR: error: simplify startup_logs_init_shm"), the FD of the startup logs is always closed and the HAPROXY_STARTUPLOGS_FD variable is not used anymore, which means we only need a mmap. Indeed the shm_open() function was only needed to keep the shm between the exec() of the master so we can get the logs stored there after doing the final exec() in wait mode. Since the wait mode doesn't exist anymore and the parsing is done in a worker, we only need to share a memory zone between the master and the worker. This patch removes shm_open() and replaces it with a simple mmap(), this way the shared startup-logs become more portable and USE_SHM_OPEN is not required anymore.	2025-01-07 16:42:38 +01:00
Frederic Lecaille	d7fc90afe9	BUG/MAJOR: ssl/ocsp: fix NULL conn object dereferencing to access QUIC TLS counters This bug arrived with this commit in the current dev branch: 056ec51c26 MEDIUM: ssl/ocsp: counters for OCSP stapling and could occur for QUIC connections during handshake when the underlying <conn> connection object is not already initialized. So in this case the TLS counters attached to TLS listeners cannot be accessed through this object but from the QUIC connection object. Modify the code to initialize the listener (<li> variable) for both QUIC and TCP connections, then initialize the variables for the TLS counters if the listener is also initialized. Thank you to @Tristan971 for having reported this issue in GH #2833. Must be backported with the commit mentioned above if it is planned to be backported.	2025-01-07 15:19:42 +01:00
Christopher Faulet	892eb2bb2c	BUG/MEDIUM: promex/resolvers: Don't dump metrics if no nameserver is defined A 'resolvers' section may be defined without any nameserver. In that case, we must take care to not dump corresponding Prometheus metrics. However there is an issue that could lead to a crash or a strange infinite loop because we are looping on an empty list and, at some point, we are dereferencing an invalid pointer. There is an issue because the loop on the nameservers of a resolvers section is performed via callback functions and not the standard list_for_each_entry macro. So we must take care to properly detect end of the list and empty lists for nameservers. But the fix is not so simple because resolvers sections with and without nameservers may be mixed. To fix the issue, in rslv_promex_start_ts() and rslv_promex_next_ts(), when the next resolvers section must be evaluated, a loop is now used to properly skip empty sections. This patch is related to #2831. Not sure it fixes it. It must be backported as far as 3.0.	2025-01-06 09:08:38 +01:00
Amaury Denoyelle	801e39e1cc	BUG/MINOR: mux-quic: handle closure of uni-stream This commit is a direct follow-up to the previous one. As already described, a previous fix was merged to prevent streamdesc attach operation on already completed QCS instances scheduled for purging. This was implemented by skipping app proto decoding. However, this has a bad side-effect for remote uni-directional streams. If receiving a FIN stream frame on such a stream, it will be considered as complete because streamdesc are never attached to a uni stream. Due to the mentioned new fix, this prevents analysis of this last frame for every uni stream. To fix this, do not skip anymore app proto decoding for completed QCS. Update instead qcs_attach_sc() to transform it as a noop function if QCS is already fully closed before streamdesc instantiation. However, success return value is still used to prevent an invalid decoding error report. The impact of this bug should be minor. Indeed, HTTP3 and QPACK uni streams are never closed by the client as this is invalid due to the spec. The only issue was that this prevented QUIC MUX from closing the connection with error H3_ERR_CLOSED_CRITICAL_STREAM. This must be backported along the previous patch, at least to 3.1, and eventually to 2.8 if mentioned patches are merged there.	2025-01-03 17:21:19 +01:00
Amaury Denoyelle	af00be8e0f	MINOR: mux-quic: change return value of qcs_attach_sc() A recent fix was introduced to ensure that a streamdesc instance won't be attached to an already completed QCS which is eligible for purging. This was performed by skipping application protocol decoding if a QCS is in such a state. Here is the patch responsible for this change. caf60ac696a29799631a76beb16d0072f65eef12 BUG/MEDIUM: mux-quic: do not attach on already closed stream However, this is too restrictive, in particular for unidirectional streams where no streamdesc is ever attached. To fix this behavior, first qcs_attach_sc() API has been modified. Instead of returning a streamdesc instance, it returns either 0 on success or a negative error code. There should be no functional changes with this patch. It is only to be able to extend qcs_attach_sc() with the possibility of skipping streamdesc instantiation while still keeping a success return value. This should be backported wherever the above patch has been merged. For the record, it was scheduled for immediate backport on 3.1, plus merging on older releases up to 2.8 after a period of observation.	2025-01-03 17:19:21 +01:00
Amaury Denoyelle	4f2554903b	BUG/MINOR: mux-quic: fix wakeup on qcc_set_error() The following patch was a major refactoring of QUIC MUX. It removes pacing specific code path. In particular, qcc_wakeup() utility function was removed and replaced by its tasklet_wakeup() usage. 41f0472d967b2deb095d5adc8a167da973fbee3d MEDIUM: mux-quic: remove pacing specific code on qcc_io_cb However, an incorrect substitution was performed in qcc_set_error(). As such, there was no explicit wakeup in case an error is detected by QUIC MUX or the app protocol layer. This may lead to missing error reporting to clients. Fix this by re-adding tasklet_wakeup() usage into qcc_set_error(). This must be backported up to 3.1 where above patch is scheduled.	2025-01-03 10:39:49 +01:00
Christopher Faulet	f578811c4e	MINOR: config: Alert about extra arguments for errorfile and errorloc errorfile and errorloc directives expect exactly two arguments. But extra arguments were just ignored while an error should be emitted. It is now fixed. This patch could be backported as far as 2.2 if necessary.	2025-01-03 10:10:09 +01:00
Christopher Faulet	a785a20bef	BUG/MINOR: log: Allow to use if/unless conditionnals for do-log action The do-log action does not accept arguments for now. But an error was triggered if any extra argument was found, preventing the use of if/unless conditionals. When an action is parsed, expected arguments must be tested to detect missing ones but not unexpected extra arguments because this should be performed by the conditional parser. So just removing the test in the do-log parser function is enough to fix the issue. This patch must be backported to 3.1.	2025-01-03 09:44:08 +01:00
Ilia Shipitsin	bbd1cedefc	BUG/MINOR: cfgparse-tcp: handle a possible strdup() failure This defect was found by the coccinelle script "unchecked-strdup.cocci". It can be backported to all supported branches.	2025-01-02 14:31:07 +01:00
Ilia Shipitsin	beca953c55	BUG/MINOR: pool: handle a possible strdup() failure This defect was found by the coccinelle script "unchecked-strdup.cocci". It can be backported to all supported branches.	2025-01-02 14:31:07 +01:00
Ilia Shipitsin	b4f965be9e	BUG/MINOR: compression: handle a possible strdup() failure This defect was found by the coccinelle script "unchecked-strdup.cocci". It can be backported to all supported branches.	2025-01-02 14:31:07 +01:00
Amaury Denoyelle	caf60ac696	BUG/MEDIUM: mux-quic: do not attach on already closed stream Due to QUIC packet reordering, a stream may be opened via a new RESET_STREAM or STOP_SENDING frame. This would cause either Tx or Rx channel to be immediately closed. This can cause an issue with current QUIC MUX implementation with QCS purging. QCS are inserted into QCC purge list when transfer could be considered as completed. In most cases, this happens after full request/response exchange. However, it can also happen after request reception if RESET_STREAM/STOP_SENDING are received first. A BUG_ON() crash will occur if a STREAM frame is received after. In this case, streamdesc instance will be attached via qcs_attach_sc() to handle the new request. However, QCS is already considered eligible for purging. It could cause it to be released while its streamdesc instance remains. A BUG_ON() crash detects this problem in qcc_purge_streams(). To fix this, extend qcc_decode_qcs() to skip app proto rcv_buf invocation if QCS is considered completed. A similar condition was already implemented when read was previously aborted after a STOP_SENDING emission by QUIC MUX. This crash was reproduced on haproxy.org. Here is the output of the backtrace : Core was generated by `./haproxy-dev -db -f /etc/haproxy/haproxy-current.cfg -sf 16495'. Program terminated with signal SIGILL, Illegal instruction. 
#0 0x00000000004e442b in qcc_purge_streams (qcc=0x774cca0) at src/mux_quic.c:2661 2661 BUG_ON_HOT(!qcs_is_completed(qcs)); [Current thread is 1 (LWP 1457)] [ ## gdb ## ] bt #0 0x00000000004e442b in qcc_purge_streams (qcc=0x774cca0) at src/mux_quic.c:2661 #1 0x00000000004e4db7 in qcc_io_process (qcc=0x774cca0) at src/mux_quic.c:2744 #2 0x00000000004e5a54 in qcc_io_cb (t=0x7f71193940c0, ctx=0x774cca0, status=573504) at src/mux_quic.c:2886 #3 0x0000000000b4f792 in run_tasks_from_lists (budgets=0x7ffdcea1e670) at src/task.c:603 #4 0x0000000000b5012f in process_runnable_tasks () at src/task.c:883 #5 0x00000000007de4a3 in run_poll_loop () at src/haproxy.c:2771 #6 0x00000000007deb9f in run_thread_poll_loop (data=0x1335a00 <ha_thread_info>) at src/haproxy.c:2985 #7 0x00000000007dfd8d in main (argc=6, argv=0x7ffdcea1e958) at src/haproxy.c:3570 This BUG_ON() crash can only happen since 3.1 refactoring. Indeed, purge list was only implemented on this version. As such, please backport it on 3.1 immediately. However, a logic issue remains for older version as a stream could be attached on a fully closed QCS. Thus, it should be backported up to 2.8, this time after a period of observation.	2025-01-02 11:25:40 +01:00
Amaury Denoyelle	4a997e5a93	MINOR: mux-quic: add traces on sd attach Add traces into qcs_attach_sc(). This function is called when a request is received on a QCS stream and a streamdesc instance is attached. This will be useful to facilitate debugging.	2025-01-02 11:25:40 +01:00
Amaury Denoyelle	ddfd8031f8	BUG/MAJOR: mux-quic: properly fix BUG_ON on empty STREAM emission Properly fix BUG_ON() occurrence when QUIC MUX emits only empty STREAM frames. This was addressed by a previous patch but it causes another regression so a revert was needed. BUG_ON() on qcc_build_frms() return value is invalid. Indeed, qcc_build_frms() may return 0, but this does not imply that frame list is empty, as encoded frames can have a zero length payload. As such, simply remove this invalid BUG_ON(). This must be backported up to 3.1.	2025-01-02 11:25:40 +01:00
Amaury Denoyelle	85e27f1e92	Revert "BUG/MAJOR: mux-quic: fix BUG_ON on empty STREAM emission" This reverts commit 98064537423fafe05b9ddd97e81cedec8b6b278d. Above patch tried to fix a BUG_ON() occurrence when MUX only emitted empty STREAM frames via qcc_build_frms(). Return value of qcs_send() was changed from the payload STREAM frame to the whole frame length. However, this is invalid as this return value is used to ensure connection flow-control is not exceeded on sending retry. This causes occurrence of BUG_ON() crash in qcc_io_send() as send-list is not properly purged after QCS emission. Reverts this incorrect fix. The original issue will be properly dealt with in the next commit. This commit must be backported to 3.1 if reverted commit was already applied on it.	2025-01-02 11:00:25 +01:00
Christopher Faulet	22f8d2c99e	BUG/MEDIUM: mux-h2: Count copied data when looping on RX bufs in h2_rcv_buf() When data was copied from RX buffers to the channel buffer, more data than expected could be moved because amount of data copied was never decremented from the limit. This could lead to a stream deadlock when the compression filter was in use. The issue was introduced by commit 4eb3ff1 ("MAJOR: mux-h2: make streams use the connection's buffers") but revealed by 3816c38 ("MAJOR: mux-h2: permit a stream to allocate as many buffers as desired"). Because a h2 stream can now have several RX buffers, in h2_rcv_buf(), we loop on these buffers to fill the channel buffer. However, we must still take care to respect the limit to not copy too much data. However, the "count" variable was never decremented to reflect amount of data already copied. So, it was possible to exceed the limit. It was an issue when the compression filter was in use because the channel buffer could be fully filled, preventing the compression from being performed. When this happened, the stream was infinitely blocked because the compression filter was asking for some space but nothing was scheduled to be forwarded. This patch should fix the issue #2826. It must be backported to 3.1.	2025-01-02 09:58:23 +01:00
Amaury Denoyelle	9806453742	BUG/MAJOR: mux-quic: fix BUG_ON on empty STREAM emission A BUG_ON() is present in qcc_io_send() to ensure that encoded frame list is empty if qcc_build_frms() previously returned 0. This BUG_ON() may be triggered if empty STREAM frame is encoded for standalone FIN. Indeed, qcc_build_frms() returns the sum of all STREAM payload length. In case only empty STREAM frames are generated, return value will be 0, despite new frames encoded and inserted into frame list. To fix this, change return value of qcs_send(). This now returns the whole STREAM frame length, both header and payload included. This ensures that qcc_build_frms() won't return a zero value if new frames are encoded, even empty ones. This must be backported up to 3.1.	2024-12-31 16:39:53 +01:00
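The reasoning in the entry above (a zero payload total does not imply an empty frame list) can be shown with a small, self-contained C sketch. The `struct stream` type and `build_frames()` function are hypothetical simplifications made up for this illustration; they do not reproduce the QUIC MUX code.

```c
#include <stddef.h>

/* Hypothetical, simplified stream state: bytes waiting to be sent and
 * whether a FIN must still be signalled. */
struct stream {
    size_t pending; /* payload bytes ready for emission */
    int    fin;     /* non-zero if a FIN must be sent */
};

/* Builds one frame per stream that has payload or a FIN to send.
 * Returns the sum of payload lengths and stores the frame count in *nfrms. */
static size_t build_frames(const struct stream *s, size_t n, size_t *nfrms)
{
    size_t total = 0;
    size_t i;

    *nfrms = 0;
    for (i = 0; i < n; i++) {
        if (s[i].pending == 0 && !s[i].fin)
            continue;          /* nothing to emit for this stream */
        (*nfrms)++;            /* a frame is built, even if it is empty */
        total += s[i].pending; /* an empty FIN frame adds 0 here */
    }
    return total;
}
```

For a single stream with no pending payload but a FIN to send, the function returns 0 while one frame has been built, so asserting "return value 0 implies empty list" would be wrong in this model.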
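The accounting problem described in the entry above, a copy limit that is never decremented while looping over several source buffers, can be sketched as follows. The `struct rx_buf` type and `copy_bufs()` function are hypothetical and only model the idea; they are not the h2_rcv_buf() implementation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical receive buffer: a read pointer and the bytes left in it. */
struct rx_buf {
    const char *data;
    size_t      len;
};

/* Copies at most <count> bytes in total from <nbufs> buffers into <dst>.
 * Returns the number of bytes copied. */
static size_t copy_bufs(char *dst, size_t count, struct rx_buf *bufs, size_t nbufs)
{
    size_t total = 0;
    size_t i;

    for (i = 0; i < nbufs && count > 0; i++) {
        size_t n = bufs[i].len < count ? bufs[i].len : count;

        memcpy(dst + total, bufs[i].data, n);
        bufs[i].data += n;
        bufs[i].len  -= n;
        total += n;
        count -= n; /* without this, each buffer would see the full limit */
    }
    return total;
}
```

If the `count -= n` line were omitted, every buffer in the loop could contribute up to the original limit, so the destination could receive more data than the caller allowed.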
Aurelien DARRAGON	5bbdd14f56	BUG/MINOR: stktable: invalid use of stkctr_set_entry() with mixed table types Some actions such as "sc0_get_gpc0" (using smp_fetch_sc_stkctr() internally) can take an optional table name as parameter to perform the lookup on a different table from the tracked one but using the key from the tracked entry. It is done by leveraging the stktable_lookup() function which was originally meant to perform intra-table lookups. Calling sc0_get_gpc0() with a different table name will result in stktable_lookup() being called to perform lookup using a stktsess from a different table. While it is theoretically fine, it comes with a pitfall: both tables (the one from where the stktsess originates and the actual target table) should rely on the exact same key type and length. Failure to do so actually results in undefined behavior, because the key type and/or length from one table is used to perform the lookup in another table, while the underlying lookup API expects explicit type and key length. 
For instance, consider the below example: peers testpeers bind 127.0.0.1:10001 server localhost table test type binary len 1 size 100k expire 1h store gpc0 table test2 type string size 100k expire 1h store gpc0 listen test_px mode http bind 0.0.0.0:8080 http-request track-sc0 bin(AA) table testpeers/test http-request track-sc1 str(ok) table testpeers/test2 log-format "%[sc0_get_gpc0(testpeers/test2)]" log stdout format raw local0 server s1 git.haproxy.org:80 Performing a curl request to localhost:8080 will cause uninitialized reads because string "ok" from test2 table will be compared as a string against "AA" binary sample which is not NULL terminated: ==2450742== Conditional jump or move depends on uninitialised value(s) ==2450742== at 0x484F238: strlen (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==2450742== by 0x27BCE6: stktable_lookup (stick_table.c:539) ==2450742== by 0x281470: smp_fetch_sc_stkctr (stick_table.c:3580) ==2450742== by 0x283083: smp_fetch_sc_get_gpc0 (stick_table.c:3788) ==2450742== by 0x2A805C: sample_process (sample.c:1376) So let's prevent that by adding some comments in stktable_set_entry() func description, and by adding a check in smp_fetch_sc_stkctr() to ensure both source stksess and target table share the same key properties. While it could be relevant to backport this in all stable versions, it is probably safer to wait for some time before doing so, to ensure that no existing configs rely on this ambiguity because the fact that the target table and source stksess entry need to share the same key type and length is not explicitly documented.	2024-12-31 16:36:00 +01:00
Aurelien DARRAGON	f94c63021b	DOC: config: add missing "track-sc0" in action keywords matrix In d54e8f8107 ("DOC: config: reorganize actions into their own section"), "track-sc0" keyword was properly documented but the keyword was not placed in the action keywords matrix alongside other track-sc* statements. It was probably overlooked, so let's fix that. Could be backported up to 2.9 with d54e8f8107.	2024-12-31 16:35:54 +01:00
Willy Tarreau	e148dfd35d	[RELEASE] Released version 3.2-dev2 Released version 3.2-dev2 with the following main changes : - MINOR: build: define DEBUG_STRESS - MINOR: applet: define applet_putchk_stress() alternative - MINOR: stats: use stress mode to force reentrant dumps - CI: scripts: add support for AWS-LC-FIPS in build-ssl.sh - MINOR: ssl: add "FIPS" details in haproxy -vv - MEDIUM: ssl: rename 'OpenSSL' by 'SSL library' in haproxy -vv - CI: github: let's add an AWS-LC-FIPS job - MINOR: window_filter: rely on the time to update the filter samples (QUIC/BBR) - BUG/MINOR: quic: wrong logical statement in in_recovery_period() (BBR) - BUG/MINOR: quic: fix BBB max bandwidth oscillation issue. - BUG/MINOR: quic: wrong bbr_target_inflight() implementation - BUG/MINOR: quic: remove max_bw filter from delivery rate sampling - BUG/MINOR: quic: underflow issue for bbr_inflight_hi_from_lost_packet() - BUG/MINOR: quic: reduce packet losses at least during ProbeBW_CRUISE (BBR) - MINOR: quic: reduce the private data size of QUIC cc algos - CLEANUP: quic: remove a wrong comment about ->app_limited (drs) - BUG/MINOR: quic: fix the wrong tracked recovery start time value - BUG/MINOR: quic: too permissive exit condition for high loss detection in Startup (BBR) - BUG/MINOR: cli: cli_snd_buf: preserve \r\n for payload lines - REGTESTS: ssl: add a PEM with mix of LF and CRLF line endings - BUG/MINOR: quic: missing Startup accelerating probing bw states - CLEANUP: quic: Rename some BBR functions in relation with bw probing - REORG: startup: move global.maxconn calculations in limits.c - REORG: startup: move code that applies limits to limits.c - REORG: startup: move nofile limit checks in limits.c - MINOR: ssl: add utils functions to extract X509 notAfter date - MINOR: ssl/cli: allow to filter expired certificates with 'show ssl sni' - MINOR: ssl/cli: add -A to the 'show ssl sni' command description - BUG/MINOR: ssl/cli: 'show ssl cert' escape the first '' of a filename - BUG/MINOR: 
ssl/cli: 'show ssl crl-file' escape the first '' of a filename - BUG/MINOR: ssl/cli: 'show ssl ca-file' escape the first '' of a filename - BUG/MEDIUM: stconn: Only consider I/O timers to update stream's expiration date - BUG/MEDIUM: queues: Make sure we call process_srv_queue() when leaving - BUG/MEDIUM: queues: Do not use pendconn_grab_from_px(). - CLEANUP: queues: Remove pendconn_grab_from_px(). - BUILD: debug: only dump/reset glitch counters when really defined - MINOR: compiler: add a __has_builtin() macro to detect features more easily - MINOR: compiler: rely on builtin detection for __builtin_unreachable() - MINOR: compiler: add a new "ASSUME" macro to help the compiler - MINOR: compiler: also enable __builtin_assume() for ASSUME() - MINOR: compiler: add ASSUME_NONNULL() to tell the compiler a pointer is valid - MINOR: bug: make BUG_ON() fall back to ASSUME - CLEANUP: cache: use ASSUME_NONNULL() instead of DISGUISE() - CLEANUP: hlua: use ASSUME_NONNULL() instead of ALREADY_CHECKED() - CLEANUP: htx: use ASSUME_NONNULL() to mark the start line as non-null - CLEANUP: mux-fcgi: use ASSUME_NONNULL() to indicate that the first block exists - CLEANUP: stats: use ASSUME_NONNULL() to indicate that the first block exists - CLEANUP: quic: replace ALREADY_CHECKED() with ASSUME_NONNULL() at a few places - CLEANUP: ssl-sock: drop two now unneeded ALREADY_CHECKED() - BUG/MEDIUM: mux-quic: do not mix qcc_io_send() return codes with pacing - CLEANUP: mux-quic: remove unused qcc member send_retry_list - MINOR: quic: add traces - MINOR: mux-quic: refactor wait-for-handshake support - MEDIUM/OPTIM: mux-quic: define a recv_list for demux resumption - MEDIUM/OPTIM: mux-quic: implement purg_list - MINOR: mux-quic: extract code to build STREAM frames list - MINOR: mux-quic: split STREAM and RS/SS emission - MEDIUM/OPTIM: mux-quic: do not rebuild frms list on every send - MEDIUM: mux-quic: remove pacing specific code on qcc_io_cb - MINOR: trace: implement tracing disabling API - 
MINOR: mux-quic: hide traces when woken up on pacing only - MINOR: ssl/cli: add a 'Uncommitted' status for 'show ssl' commands - MINOR: ssl/ocsp: Add extra details in error logs when possible - BUILD: ssl/ocsp: error: ‘%.s’ directive argument is null - MEDIUM: ssl/ocsp: OCSP response is expired with OCSP_MAX_RESPONSE_TIME_SKEW - MINOR: ssl: improve HAVE_SSL_OCSP ifdef - DOC: config: add example for server "track" keyword - DOC: config: reorder "tune.lua.*" keywords by alphabetical order - DOC: config: add "tune.lua.burst-timeout" to the list of global parameters - MINOR: hlua: add option to preserve bool type from smp to lua - REGTESTS: fix lua-based regtests using tune.lua.smp-preserve-bool - BUG/MEDIUM: mux-quic: prevent BUG_ON() by refreshing frms on MAX_DATA - CLEANUP: mux-quic: remove dead err label in qcc_build_frms() - BUG/MINOR: h2/rhttp: fix HTTP2 conn counters on reverse - MINOR: hlua: rename "tune.lua.preserve-smp-bool" to "tune.lua.bool-sample-conversion" - MINOR: ssl: change visibility of ssl_stats_module - MINOR: ssl: rework the error management in the OCSP callback - MEDIUM: ssl/ocsp: counters for OCSP stapling - CI: limit aws-lc and libressl Quic Interop to "haproxy" only - BUG/MEDIUM: queue: Make process_srv_queue return the number of streams - CI: github: try to build the latest WolfSSL master weekly - CI: github: activate ASAN on the WolfSSL weekly job - BUG/MINOR: stats: fix segfault caused by uninitialized value in "show schema json" - MINOR: stktable: add stktable_get_data_type_idx() helper function - MINOR: stktable: support optional index for array types in {set, clear, show} table commands - CI: scripts: allow to build wolfssl with --enable-debug - CI: github: activate debug in wolfssl weekly build - BUG/MEDIUM: queues: Stricly respect maxconn for outgoing connections - MEDIUM: queue: Handle the race condition between queue and dequeue differently - CLEANUP: Remove pendconn_must_try_again(). 
- BUILD: compat: add missing fcntl.h before defining F_SETPIPE_SZ - BUILD: mworker: always initialize the saveptr of strtok_r() - BUILD: limits: make normalize_rlim() take an rlim_t to fix build on m68k - BUG/MINOR: checks: handle a possible strdup() failure - BUG/MINOR: listener: handle a possible strdup() failure - BUG/MINOR: mux_h1: handle a possible strdup() failure - BUG/MINOR: debug: handle a possible strdup() failure	2024-12-25 15:17:01 +01:00
Ilia Shipitsin	6524fbfb70	BUG/MINOR: debug: handle a possible strdup() failure This defect was found by the coccinelle script "unchecked-strdup.cocci". It can be backported to all supported branches.	2024-12-25 12:42:33 +01:00
Ilia Shipitsin	a3e6c783cd	BUG/MINOR: mux_h1: handle a possible strdup() failure This defect was found by the coccinelle script "unchecked-strdup.cocci". It can be backported to all supported branches.	2024-12-25 12:42:33 +01:00
Ilia Shipitsin	89c62693da	BUG/MINOR: listener: handle a possible strdup() failure This defect was found by the coccinelle script "unchecked-strdup.cocci". It can be backported to all supported branches.	2024-12-25 12:41:08 +01:00
Ilia Shipitsin	495f1f9741	BUG/MINOR: checks: handle a possible strdup() failure This defect was found by the coccinelle script "unchecked-strdup.cocci". It can be backported to all supported branches.	2024-12-25 12:40:56 +01:00
Willy Tarreau	f486f976c7	BUILD: limits: make normalize_rlim() take an rlim_t to fix build on m68k As can be seen here, the build fails on m68k since commit 665dde648 ("MINOR: debug: use LIM2A to show limits") in 3.1: https://github.com/haproxy/haproxy/actions/runs/12440234399/job/34735360177 The reason is the comparison between a ulong limit and RLIM_INFINITY. Indeed, on m68k, rlim_t is an unsigned long long. Let's just change the function's input type to take an rlim_t instead. This also allows to get rid of the casts in the call place. This can be backported to 3.1 though it's not important given the low prevalence of this platform for such use cases.	2024-12-25 12:33:06 +01:00
Willy Tarreau	21df7677a9	BUILD: mworker: always initialize the saveptr of strtok_r() Building with some libcs which define strtok_r() as an inline function can yield a possibly uninitialized warning due to a loop dereferencing this save pointer early, even though the doc clearly mentions that it is ignored. This is actually more of a mismatch between the compiler and the libc (gcc-4.7 and glibc-2.23 in that case). It's trivial to set s2 to NULL here so let's do it to please this old couple. Note that while the warning is triggered in all supported versions, there's no point backporting it since it's unlikely this combination will be relevant outside of backwards compatibility checks now.	2024-12-25 12:18:46 +01:00
Willy Tarreau	f78121dd32	BUILD: compat: add missing fcntl.h before defining F_SETPIPE_SZ In 1.5-dev8, 13 years ago, support for setting pipe size was added by commit bd9a0a778 ("OPTIM/MINOR: make it possible to change pipe size (tune.pipesize)"). For compatibility purposes, it was defining F_SETPIPE_SZ in compat.h if it was not set. It apparently always had F_SETPIPE_SZ defined before being included. Now in 3.2-dev1, commit fbc534a6f ("REORG: startup: move nofile limit checks in limits.c") reordered a few includes and ended up with mworker-prog.c including compat.h before fcntl.h, causing a redefinition error on certain libcs: CC src/mworker-prog.o In file included from /usr/include/bits/fcntl.h:61:0, from /usr/include/fcntl.h:35, from include/haproxy/limits.h:11, from include/haproxy/mworker.h:18, from src/mworker-prog.c:27: /usr/include/bits/fcntl-linux.h:203:0: warning: "F_SETPIPE_SZ" redefined [enabled by default] In file included from include/haproxy/api-t.h:35:0, from include/haproxy/api.h:33, from src/mworker-prog.c:23: include/haproxy/compat.h:161:0: note: this is the location of the previous definition Let's simply include fcntl.h in compat.h before the macro is redefined. There's normally no need to backport this, though it's harmless to do it if needed.	2024-12-25 11:53:11 +01:00
Olivier Houchard	505480eeef	CLEANUP: Remove pendconn_must_try_again(). Remove pendconn_must_try_again(), now that it no longer is used.	2024-12-24 14:10:06 +01:00
Olivier Houchard	cda7275ef5	MEDIUM: queue: Handle the race condition between queue and dequeue differently There is a small race condition between a server checking whether something is left in the proxy queue and a stream being added to that queue. If the server checks just before the stream is added to the queue, and it no longer has any stream to deal with, then nothing will take care of the stream, which may stay in the queue forever. This was worked around with commit 5541d4995d, by checking for that exact condition after adding the stream to the queue, and trying again to get a server assigned if it is detected. That fix led to multiple infinite loops, which got fixed, but it is not unlikely that it could happen again. So let's fix the initial problem differently: a single server may mark itself as ready, and it removes itself once used. The principle is that when we discover that the just queued stream is alone with no active request anywhere to dequeue it, instead of rebalancing it, it will be assigned to that current "ready" server that is available to handle it. The extra cost of the atomic ops is negligible since the situation is super rare.	2024-12-24 14:10:06 +01:00
Olivier Houchard	3372a2ea00	BUG/MEDIUM: queues: Stricly respect maxconn for outgoing connections The "served" field of struct server is used to know how many connections are currently in use for a server. But served used to be incremented way after the server was picked, so there were race conditions that could lead more than maxconn connections to be allocated for one server. To fix this, increment served way earlier, and make sure at the time that it never goes past maxconn. We now should never have more outgoing connections than set by maxconn.	2024-12-24 14:10:06 +01:00
William Lallemand	4332fed6c1	CI: github: activate debug in wolfssl weekly build Activate WolfSSL debugging in the weekly job.	2024-12-23 18:00:34 +01:00
William Lallemand	287b2dc6dd	CI: scripts: allow to build wolfssl with --enable-debug Allow to activate the debugging of WolfSSL when building it. WOLFSSL_DEBUG=1 WOLFSSL_VERSION=git-master ./scripts/build-ssl.sh	2024-12-23 18:00:25 +01:00
Aurelien DARRAGON	e8b7337d86	MINOR: stktable: support optional index for array types in {set, clear, show} table commands As discussed in GH #2286, {set, clear, show} table commands were unable to deal with array types such as gpt, because they handled such types as a non-array types, thus only the first entry (ie: gpt[0]) was considered. In this patch we add an extra logic around array-types handling so that it is possible to specify an array index right after the type, like this: set table peer/table key mykey data.gpt[2] value # where 2 is the entry index that we want to access If no index is specified, then it implicitly defaults to 0 to mimic previous behavior.	2024-12-23 17:32:11 +01:00
Aurelien DARRAGON	c0dc7769d4	MINOR: stktable: add stktable_get_data_type_idx() helper function Same as stktable_get_data_type(), but tries to parse optional index in the form "name[idx]" (only for array types). Falls back to stktable_get_data_type() when no index is provided.	2024-12-23 17:32:09 +01:00
Aurelien DARRAGON	ac1f413590	BUG/MINOR: stats: fix segfault caused by uninitialized value in "show schema json" Since b3d5708 ("MINOR: stats: remove implicit static trash_chunk usage") a segfault can occur when issuing "show schema json" on the stats socket. Indeed, now the dumping functions don't rely on trash_chunk anymore, but instead they rely on the appctx->chunk buffer. However, unlike other stats dumping commands, the "show schema json" command only has an io handler and no parse function. With other commands, the parse function is responsible for pre-setting some data, including applet ctx reservation. Thus, due to "show schema json" lacking a parse function, the applet ctx is used uninitialized, which is obviously a bug. To fix the issue we simply add a parse function for "show schema json", although all it does for now is call applet_reserve_svcctx() for the current applet ctx. This issue was reported by @dsuch in GH #2825. It must be backported up to 3.0.	2024-12-23 17:32:07 +01:00
William Lallemand	dfc403f5c6	CI: github: activate ASAN on the WolfSSL weekly job Activate ASAN on the WolfSSL weekly job in order to have use-after-free traces.	2024-12-23 17:27:27 +01:00
William Lallemand	ef108705e4	CI: github: try to build the latest WolfSSL master weekly The WolfSSL latest version is still broken (5.7.4), no new release was done with a new version. Modify the weekly CI job so we could build with the latest git version.	2024-12-23 17:27:00 +01:00
Olivier Houchard	5b8899b6cc	BUG/MEDIUM: queue: Make process_srv_queue return the number of streams Make process_srv_queue() return the number of streams unqueued, as pendconn_grab_from_px() did, as that number is used by srv_update_status() to generate logs. This should be backported up to 2.6 with 111ea83ed4e13ac3ab028ed5e95201a1b4aa82b8	2024-12-23 15:03:40 +01:00
Ilia Shipitsin	6aae995b1d	CI: limit aws-lc and libressl Quic Interop to "haproxy" only These CI jobs are not supposed to run in forks (however, anyone who wants to can enable them personally).	2024-12-23 13:59:48 +01:00
William Lallemand	056ec51c26	MEDIUM: ssl/ocsp: counters for OCSP stapling Add 2 counters in the SSL stats module for OCSP stapling. - ssl_ocsp_staple is the number of OCSP responses successfully stapled with the handshake - ssl_failed_ocsp_stapled is the number of OCSP responses that we couldn't staple, either because of an error or because the response is expired. These counters are incremented in the OCSP stapling callback, so if no OCSP was configured they will never increase. Also, they only work in frontends. This was discussed in github issue #2822.	2024-12-23 11:23:00 +01:00
William Lallemand	6e4dd4c64c	MINOR: ssl: rework the error management in the OCSP callback Use an error label to fail in the OCSP callback, instead of returns everywhere.	2024-12-23 11:23:00 +01:00
William Lallemand	0e6af97233	MINOR: ssl: change visibility of ssl_stats_module In order to add stats from other files, the ssl_stats_module need to be visible from other files. This moves the ssl_counters definition in ssl_sock-t.h and removes the static of ssl_stats_module.	2024-12-23 11:23:00 +01:00
Aurelien DARRAGON	29b6d8af16	MINOR: hlua: rename "tune.lua.preserve-smp-bool" to "tune.lua.bool-sample-conversion" A better name was found for the option implemented in ec74438 ("MINOR: hlua: add option to preserve bool type from smp to lua"). Indeed, "tune.lua.preserve-smp-bool {on \| off}" wasn't explicit enough nor did it encourage the adoption of the new "fixed" behavior (vs historical behavior which is now considered as a bug). Thus it becomes "tune.lua.bool-sample-conversion { normal \| pre-3.1-bug }" which actively encourages users to switch to the new behavior after having patched in-use Lua scripts if needed. From a technical point of view, the logic remains the same, as the option currently defaults to "pre-3.1-bug" to prevent script breakage, and a warning is emitted if the option isn't set explicitly and Lua is used. Documentation and regtests were updated. Must be backported in 3.1 with ec74438 and f2838f5 ("REGTESTS: fix lua-based regtests using tune.lua.smp-preserve-bool")	2024-12-20 17:34:05 +01:00
Amaury Denoyelle	8633446337	BUG/MINOR: h2/rhttp: fix HTTP2 conn counters on reverse Dedicated HTTP/2 stats proxy counters are available for current and total number of HTTP/2 connections on both frontend and backend sides. Both counters are simply incremented in h2_init(). This causes issues when using reverse HTTP. First, increment is not performed on the expected side, as it is triggered before h2_conn_reverse() which switches a connection from frontend to backend or vice versa. For example on the active reverse side, h2_total_connections is incremented on the backend only even after connection is reversed and attached to a listener for the remainder of its lifetime. h2_open_connections suffers from a similar but arguably worse behavior as it is also decremented. If increment and decrement operations are not performed on the same proxy side, which happens for every connection which has been successfully reversed, it causes an invalid counter value, possibly with an integer overflow. To fix this, delay increment operations on reverse HTTP from h2_init() to h2_conn_reverse(). Both counters are updated only after reverse has completed, thus using the expected frontend or backend side. To prevent overflow on h2_open_connections, ensure h2_release() decrement is not performed if a connection is freed before achieving its reversal, as in this case it would not have been accounted by H2 counters. This should be backported up to 2.9. This should fix github issue #2821.	2024-12-19 17:32:01 +01:00
Amaury Denoyelle	4490df57a6	CLEANUP: mux-quic: remove dead err label in qcc_build_frms() STREAM frames emission in qcc_build_frms() has been split from RESET_STREAM/STOP_SENDING into qcc_emit_rs_ss(). Now, the former cannot fail, so the err label can be removed as it is unreachable. This should be backported up to 3.1. This should fix github issue #2824.	2024-12-19 16:36:33 +01:00
Amaury Denoyelle	7edb2ffae7	BUG/MEDIUM: mux-quic: prevent BUG_ON() by refreshing frms on MAX_DATA QUIC MUX emission has been optimized recently by recycling STREAM frames list between emission cycles. This is done via qcc frms list member. If new data is available, frames list must be cleared before the next emission to force the encoding of new STREAM frames. If a refresh frames list is missed, it would lead to incomplete data emission on the next transfer. In most cases, this is detected via a BUG_ON() inside qcc_io_send(), as qcs instances remains in send_list after a qcc_send_frames() full emission. A bug was recently found which causes this BUG_ON() crash. This is directly related to flow control. Indeed, when sending credit is increased on the connection or a stream, frames list should be cleared as new larger STREAM frames could be encoded. This was already performed on MAX_DATA/MAX_STREAM_DATA reception but only if flow-control limit was unblocked. However this is not the proper condition and it may lead to insufficient frames refresh and thus this BUG_ON() crash. Fix this by adjusting the condition for frames refresh on flow control credit increase. Now, frames list is cleared if real offset is not blocked and soft offset was equal or greater to the previous limit. Indeed, this is the only case in which frames refreshing is necessary as it would result in bigger encoded STREAM frames. This bug was detected on QUIC interop with go-x-net client. It can also be reproduced, albeit not systematically, using the following command : $ ngtcp2-client -q --no-quic-dump --no-http-dump \ --exit-on-all-streams-close --max-data 10 \ 127.0.0.1 20443 -n10 "http://127.0.0.1:20443/?s=10k" This bug appeared with the following patch. As it is scheduled for 3.1 backporting, the current fix should be backported with it. 14710b5e6bf76834343d58db22e00b72590b16fe MEDIUM/OPTIM: mux-quic: do not rebuild frms list on every send	2024-12-19 16:36:28 +01:00
Aurelien DARRAGON	f2838f5172	REGTESTS: fix lua-based regtests using tune.lua.smp-preserve-bool Because of the previous commit, configs making use of Lua scripts without setting "tune.lua.smp-preserve-bool" explicitly now raise a warning. However, since 6f746af91 ("REGTESTS: use -dW by default on every reg-tests"), regtests are not allowed to raise warnings anymore. Because of this the CI now fails for every test that relies on Lua. To fix this, let's explicitly set the "tune.lua.smp-preserve-bool" for all tests involving Lua. Here we set the value to "on" because we know it is safe to do so, and this way it will be future-proof. If ec7443827 ("MINOR: hlua: add option to preserve bool type from smp to lua") is backported, then this patch must be backported with it (if it is not trivial to backport, then simply follow this rule: grep for "lua-load" in reg-tests directory, then for each match, make sure to set the tune.lua.smp-preserve-bool tunable in the global section).	2024-12-19 14:21:35 +01:00
Aurelien DARRAGON	ec74438273	MINOR: hlua: add option to preserve bool type from smp to lua As discussed in GH #2814, there is an ambiguity in hlua implementation that causes haproxy smp boolean type to be pushed as an integer on the Lua stack. On the other hand, when doing Lua to haproxy smp conversion, the boolean type is properly preserved. Of course this situation is not desirable and can lead to unexpected results. However we cannot simply fix the behavior because in Lua boolean and integer types are completely distinct types and cannot be used interchangeably. So in order to prevent breaking existing scripts logic, in this patch we add a dedicated lua tunable named "tune.lua.smp-preserve-bool" which can take the following values: - "on" : when converting haproxy smp to lua, boolean type is preserved - "off": when converting haproxy smp to lua, boolean is converted to integer (legacy behavior) For now, the tunable defaults to "off" to preserve historical behavior. However, when the option isn't set explicitly and lua is used, a warning will be emitted in order to raise user's awareness about this ambiguity. It is expected that the tunable could default to "on" in future versions, thus it is recommended to avoid setting it to "off" except when using existing Lua scripts that still rely on the old behavior regarding boolean smp to Lua conversion and that cannot be fixed easily. This should solve issue GH #2814. It may be relevant to backport this in haproxy 3.1.	2024-12-19 13:50:27 +01:00
Aurelien DARRAGON	67e3270c59	DOC: config: add "tune.lua.burst-timeout" to the list of global parameters "tune.lua.burst-timeout" was properly defined but not listed in the list of global parameters as it was overlooked in 58e36e5b1 ("MEDIUM: hlua: introduce tune.lua.burst-timeout")	2024-12-19 13:50:21 +01:00
Aurelien DARRAGON	985a45d9c7	DOC: config: reorder "tune.lua." keywords by alphabetical order Effort was made to properly organize "tune." keywords by alphabetical order, but "tune.lua" keywords didn't follow that rule with care. Let's fix that.	2024-12-19 13:50:16 +01:00
Aurelien DARRAGON	48545113f4	DOC: config: add example for server "track" keyword As requested on GH #2325, "track" server keyword could benefit from a simple config example to show how to make use of it. That's what we're doing in this commit, thanks to GH user @HAkmiller for the suggestion.	2024-12-19 13:50:03 +01:00
William Lallemand	acb2c9eb8b	MINOR: ssl: improve HAVE_SSL_OCSP ifdef Allow to build correctly without OCSP. It can be disabled easily with an OpenSSL built with OPENSSL_NO_OCSP, or even with DEFINE="-DOPENSSL_NO_OCSP" on the haproxy make line.	2024-12-19 10:53:05 +01:00
William Lallemand	1c7f5ce32e	MEDIUM: ssl/ocsp: OCSP response is expired with OCSP_MAX_RESPONSE_TIME_SKEW When an OCSP response has a nextUpdate date which is OCSP_MAX_RESPONSE_TIME_SKEW (300) seconds in the future, the OCSP stapling callback ssl_sock_ocsp_stapling_cbk() returns SSL_TLSEXT_ERR_NOACK. However we don't emit an error when trying to load the file. There is an OCSP_check_validity() check using OCSP_MAX_RESPONSE_TIME_SKEW, but it only checks that the thisUpdate of the OCSP response is not too far in the past. This patch emits an error during loading so we don't try to load an OCSP response which would never be emitted because of OCSP_MAX_RESPONSE_TIME_SKEW. This was discussed in issue #2822.	2024-12-18 16:14:32 +01:00
William Lallemand	6e11d34940	BUILD: ssl/ocsp: error: ‘%.s’ directive argument is null Some gcc versions emit an error because a '%.s' argument has a NULL parameter. Initialize the string to "" instead.	2024-12-18 11:25:22 +01:00
Remi Tricot-Le Breton	93f2c73423	MINOR: ssl/ocsp: Add extra details in error logs when possible When the ocsp response auto update process fails during insertion or while validating the received ocsp response, we call ssl_sock_update_ocsp_response or ssl_ocsp_check_response respectively and both these functions take an 'err' parameter in which detailed error messages can be written. Until now, those error messages were discarded and the only information given to the user was a generic error (ERR_CHECK or ERR_INSERT) which does not help much. We now keep a pointer to the last error message in the certificate_ocsp structure and dump its content in the update logs as well as in the "show ssl ocsp-updates" cli command. This issue was raised in GitHub #2817.	2024-12-18 10:41:16 +01:00
William Lallemand	4abedc3fb0	MINOR: ssl/cli: add a 'Uncommitted' status for 'show ssl' commands Add a 'Uncommitted' status for 'show ssl' commands on the 'Status' line when accessing a non-empty and uncommitted SSL transaction. Available with: - show ssl cert - show ssl ca-file - show ssl crl-file	2024-12-18 10:32:26 +01:00
Amaury Denoyelle	53db43aff2	MINOR: mux-quic: hide traces when woken up on pacing only Previous commit aligned default and pacing emission. This is a cleaner and more robust code. However, it may disrupt traces analysis when pacing is rescheduled until timer expiration. Hide traces when qcc_io_cb() is woken up only due to pacing and timer is not yet expired. This is implemented by using special TASK_WOKEN_IO for pacing. This should be backported up to 3.1.	2024-12-18 09:52:16 +01:00
Amaury Denoyelle	9d155ca706	MINOR: trace: implement tracing disabling API Define a set of functions to temporarily disable/reactivate tracing for the current thread. This could be useful when wanting to quickly remove tracing output for some code parts. The API relies on a disable/resume set of functions, with a thread-local counter. This counter is tested under __trace_enabled(). It is a cumulative value so that the same count of resume must be issued after several disable usage. There is also the possibility to force reset the counter to 0 before restoring the old value. This should be backported up to 3.1.	2024-12-18 09:52:06 +01:00
Amaury Denoyelle	41f0472d96	MEDIUM: mux-quic: remove pacing specific code on qcc_io_cb Pacing was recently implemented by QUIC MUX. Its tasklet is rescheduled until next emission timer is reached. To improve performance, an alternate execution of qcc_io_cb was performed when rescheduled due to pacing. This was implemented using TASK_F_USR1 flag. However, this model is fragile, in particular when several events happened alongside pacing scheduling. This has caused some issues recently, most notably when MUX is subscribed on transport layer on receive for handshake completion while pacing emission is performed in parallel. MUX qcc_io_cb() would not execute the default code path, which means the reception event is silently ignored. Recent patches have reworked several parts of qcc_io_cb. The objective was to improve performance with better algorithms on the send and receive parts. Most notably, qcc frames list is only cleared when new data is available for emission. With this, pacing alternative code is now mostly unneeded. As such, this patch removes it. The following changes are performed : * TASK_F_USR1 is now not used by QUIC MUX. As such, tasklet_wakeup() default invocation can now replace obsolete wrappers qcc_wakeup/qcc_wakeup_pacing * qcc_purge_sending is removed. On pacing rescheduling, the whole qcc_io_cb() is executed. This is less error-prone, in particular when pacing is mixed with other events like receive handling. This renders the code less fragile, as it completely solves the issue described above. This should be backported up to 3.1.	2024-12-18 09:49:20 +01:00
Amaury Denoyelle	14710b5e6b	MEDIUM/OPTIM: mux-quic: do not rebuild frms list on every send A newly introduced frames list member has been defined into QCC instance with pacing implementation. This allowed to preserve STREAM frames built between different emissions scheduled by pacing, without having to regenerate it if no new QCS data is available. Generalize this principle outside of pacing scheduling. Now, the frames list will be reused across several qcc_io_send() usages. Frames list is only cleared when necessary. This will force its refreshing in the next qcc_io_send() via qcc_build_frms_list(). Frames list refreshing is performed in the following cases : * on successful transfer from stream snd_buf / done_ff / shut * on stream reset or read abort * on max_data/max_stream_data reception with window increase Note that the first two cases are in fact covered directly due to qcc_send_stream() usage when QCS is (re)inserted into the send_list. The main objective of this patch will be to remove QUIC MUX pacing specific code path. It could also provide better performance as emission of large frames may often be rescheduled due to transport layer, either on congestion or full socket buffer. When QUIC MUX is rescheduled, no new data is available and frames list can be reused as-is, avoiding an unnecessary loop over send_list. This should be backported up to 3.1.	2024-12-18 09:49:02 +01:00
Amaury Denoyelle	9ecc1a8e57	MINOR: mux-quic: split STREAM and RS/SS emission This commit is a follow-up of the previous one which defines function qcc_build_frms(). This function implements looping over qcc send_list, to both encode and send individually any STOP_SENDING and RESET_STREAM, but also encode STREAM frames as a preparatory step. STREAM frames were then sent as a list outside of qcc_build_frms() via qcc_send_frames(). Extract STOP_SENDING/RESET_STREAM encoding and emission step into a new function qcc_emit_rs_ss(). The code is thus cleaner. In particular it highlights that an error during STOP_SENDING/RESET_STREAM emission stage is fatal and prevents any STREAM frames processing. This should be backported up to 3.1.	2024-12-18 09:40:21 +01:00
Amaury Denoyelle	244dc00b09	MINOR: mux-quic: extract code to build STREAM frames list Extracts code responsible to generate STREAM, RESET_STREAM and STOP_SENDING frames for each qcs instance registered in qcc send_list. It is moved from qcc_io_send() to its own new function qcc_build_frms(). This commit does not bring functional change. It is a preparatory step to adapt QUIC MUX send mechanism to allow reusing of qcc frms list across qcc_io_send() invocations. As a side change, qcc_tx_frms_free() is renamed to qcc_clear_frms(). This better highlights its relationship with qcc_build_frms(). This should be backported up to 3.1.	2024-12-18 09:38:19 +01:00
Amaury Denoyelle	e296585ae9	MEDIUM/OPTIM: mux-quic: implement purg_list This commit is part of the current series which aims to refactor and improve overall performance of QUIC MUX I/O handler. qcc_io_process() is responsible to perform some internal operations on QUIC MUX after I/O completion. It is notably called on every qcc_io_cb() tasklet handler. The most intensive work on it is the purging of QCS instances after transfer completion. This was implemented by looping on QCC streams tree and inspecting the state of every QCS. The purpose of this commit is to optimize this processing. A new purg_list QCC member is defined. It is responsible to list every QCS instance whose transfer has been completed. It is thus safe to reuse <el_send> QCS list attach point. Stream purging will thus only loop on purg_list instead of every known QCS. This should be backported up to 3.1.	2024-12-18 09:33:52 +01:00
Amaury Denoyelle	4b42dd4ae0	MEDIUM/OPTIM: mux-quic: define a recv_list for demux resumption This commit is part of the current series which aims to refactor and improve overall performance of QUIC MUX I/O handler. Define a recv_list element into qcc structure. This is used to register every instance of qcs which is currently blocked on demuxing, which happens when no more space is available in <rx.appbuf>. The purpose of this patch is to reduce qcc_io_recv() CPU usage. Now, only recv_list iteration is performed, instead of the previous looping over every qcs instance. This is useful as qcc_io_recv() is called each time qcc_io_cb() is scheduled, even if only sending condition was the wakeup origin. A qcs is not inserted into recv_list immediately after blocking on demux full buffer. Instead, this is only done after unblocking via stream rcv_buf callback, which ensures that new buffer space is available. This should be backported up to 3.1.	2024-12-18 09:23:41 +01:00
Amaury Denoyelle	0a53a008d0	MINOR: mux-quic: refactor wait-for-handshake support This commit refactors wait-for-handshake support from QUIC MUX. The flag logic QC_CF_WAIT_HS is inverted : it is now positioned only if MUX is instantiated before handshake completion. When the handshake is completed, the flag is removed. The flag is now set directly on initialization via qmux_init(). Removal via qcc_wait_for_hs() is moved from qcc_io_process() to qcc_io_recv(). This is deemed more logical as QUIC MUX is scheduled on RECV to be notified by the transport layer about handshake termination. Moreover, qcc_wait_for_hs() is now called if recv subscription is still active. This commit is the first of a series which aims to refactor QUIC MUX I/O handler and improve its overall performance. The ultimate objective is to be able to streamline qcc_io_cb() by removing pacing specific code path via qcc_purge_sending(). This should be backported up to 3.1.	2024-12-18 09:23:41 +01:00
Amaury Denoyelle	9dcd2369e2	MINOR: quic: add traces Add some traces to better follow QUIC MUX scheduling, in particular with pacing interaction. This should be backported up to 3.1.	2024-12-18 09:20:20 +01:00
Amaury Denoyelle	17bfe93768	CLEANUP: mux-quic: remove unused qcc member send_retry_list Remove unused fields send_retry_list from qcc and its corresponding attach element el from qcs. This should be backported up to 3.1.	2024-12-18 09:20:20 +01:00
Amaury Denoyelle	2e3542bec6	BUG/MEDIUM: mux-quic: do not mix qcc_io_send() return codes with pacing With pacing implementation, qcc_send_frames() return code has been extended to report emission interruption due to pacing limitation. This is used only in qcc_io_send(). However, its invocation may be skipped using 'sent_done' label. This happens on emission failure of a STOP_SENDING or RESET_STREAM (either memory allocation failure, or transport layer rejection). In this case, return values are mixed as qcs_send() is wrongly compared against pacing interruption condition. This value corresponds to the length of the last built STREAM frames. If by mischance the last frame was 1 byte long, qcs_send() return value is equal to pacing interruption condition. This has several effects. If pacing is activated, it may lead to unneeded wakeup on QUIC MUX. Worse, if pacing is not used, a BUG_ON() crash will be triggered. Fix this by using a different variable dedicated to qcc_send_frames() return value. By default it is initialized to 0. This ensures that pacing code won't be activated in case qcc_send_frames() is not used. This must be backported up to 3.1.	2024-12-18 09:18:48 +01:00
Willy Tarreau	93d4e9d50f	CLEANUP: ssl-sock: drop two now unneeded ALREADY_CHECKED() In ssl_sock_bind_verifycbk() a BUG_ON() checks the validity of "ctx" and "bind_conf". There was a pair of ALREADY_CHECKED() macros after BUG_ON() for the case where DEBUG_STRICT=0. But this is now addressed so we can remove these two macros and rely on the BUG_ON() instead.	2024-12-17 17:47:57 +01:00
Willy Tarreau	7760e3a374	CLEANUP: quic: replace ALREADY_CHECKED() with ASSUME_NONNULL() at a few places There were 4 instances of ALREADY_CHECKED() used to tell the compiler that the argument couldn't be NULL by design. Let's change them to the cleaner ASSUME_NONNULL(). Functions like qc_snd_buf() were slightly reduced in size (-24 bytes). Apparently gcc-13 sees a potential case that others don't see, and it's likely a bug since depending what is masked, it will completely change the output warnings to the point of contradicting itself. After many attempts, it appears that just checking that CMSG_FIRSTHDR(msg) is not null suffices to calm it down, so the strange warnings might have been the result of an overoptimization based on a supposed UB in the first place. At least now all versions up to 13.2 as well as clang are happy.	2024-12-17 17:47:57 +01:00
Willy Tarreau	1f93622779	CLEANUP: stats: use ASSUME_NONNULL() to indicate that the first block exists In stats_scope_ptr(), the validity of blk() was assumed using ALREADY_CHECKED(blk), but we can now use the cleaner ASSUME_NONNULL(). In addition this simplifies the BUG_ON() check that follows.	2024-12-17 17:47:57 +01:00
Willy Tarreau	6dfd541ca8	CLEANUP: mux-fcgi: use ASSUME_NONNULL() to indicate that the first block exists In fcgi_snd_buf(), this was previously achieved using ALREADY_CHECKED(blk), but we can now fold it into the cleaner ASSUME_NONNULL().	2024-12-17 17:47:57 +01:00
Willy Tarreau	143a103696	CLEANUP: htx: use ASSUME_NONNULL() to mark the start line as non-null In http_replace_req_uri(), this assumption was previously made using ALREADY_CHECKED() but the new one is cleaner (and smaller, 24 bytes less).	2024-12-17 17:47:57 +01:00
Willy Tarreau	a4f50c69e4	CLEANUP: hlua: use ASSUME_NONNULL() instead of ALREADY_CHECKED() The purpose of the test in hlua_applet_tcp_new() was precisely to declare non-nullity. Let's just do it using ASSUME_NONNULL() now.	2024-12-17 17:47:57 +01:00
Willy Tarreau	29b2c5d4d4	CLEANUP: cache: use ASSUME_NONNULL() instead of DISGUISE() DISGUISE() was used to avoid a NULL warning. Using ASSUME_NONNULL() instead makes it clearer and made the function slightly shorter.	2024-12-17 17:42:11 +01:00
Willy Tarreau	7b6acb6a51	MINOR: bug: make BUG_ON() fall back to ASSUME When the strict level is zero and BUG_ON() is not implemented, some possible null-deref warnings are emitted again because some were covering for these cases. Let's make it fall back to ASSUME() so that the compiler continues to know that the tested expression never happens. It also allows to further optimize certain functions by helping the compiler eliminate certain tests for impossible values. However it requires that the expression is really evaluated before passing the result through ASSUME() otherwise it was shown that gcc-11 and above will fail to evaluate its implications and will continue to emit the null-deref warnings in case the expression is non-trivial (e.g. it has multiple terms). We don't do it for BUG_ON_HOT() however because the extra cost of evaluating the condition is generally not welcome in fast paths, particularly when that BUG_ON_HOT() was kept disabled for performance reasons.	2024-12-17 17:39:12 +01:00
Willy Tarreau	63798088b3	MINOR: compiler: add ASSUME_NONNULL() to tell the compiler a pointer is valid At plenty of places we have ALREADY_CHECKED() or DISGUISE() on a pointer just to avoid "possibly null-deref" warnings. These ones have the side effect of weakening optimizations by passing through an assembly step. Using ASSUME_NONNULL() we can avoid that extra step. And when the __builtin_unreachable() builtin is not present, we fall back to the old method using assembly. The macro returns the input value so that it may be used both as a declarative way to claim non-nullity or directly inside an expression like DISGUISE().	2024-12-17 16:46:46 +01:00
Willy Tarreau	2ce63b7b17	MINOR: compiler: also enable __builtin_assume() for ASSUME() Clang apparently has __builtin_assume() which does exactly the same as our macro, since at least v3.8. Let's enable it, in case it may even better detect assumptions vs unreachable code.	2024-12-17 16:46:46 +01:00
Willy Tarreau	efc897484b	MINOR: compiler: add a new "ASSUME" macro to help the compiler This macro takes an expression, tests it and calls an unreachable statement if false. This allows the compiler to know that such a combination does not happen, and totally eliminate tests that would be related to this condition. When the statement is not available in the compiler, we just perform a break from a do {} while loop so that the expression remains evaluated if needed (e.g. function call).	2024-12-17 16:46:46 +01:00
Willy Tarreau	41fc18b1d1	MINOR: compiler: rely on builtin detection for __builtin_unreachable() Due to __builtin_unreachable() only being associated to gcc 4.5 and above, it turns out it was not enabled for clang. It's not used that much but still a little bit, so let's enable it now. This reduces the code size by 0.2% and makes it a bit more efficient.	2024-12-17 16:46:46 +01:00
Willy Tarreau	96cfcb1df3	MINOR: compiler: add a __has_builtin() macro to detect features more easily We already have a __has_attribute() macro to detect when the compiler supports a specific attribute, but we didn't have the equivalent for builtins. clang-3 and gcc-10 have __has_builtin() for this. Let's just bring it using the same mechanism as __has_attribute(), which will allow us to simply define the macro's value for older compilers. It will save us from keeping that many compiler-specific tests that are incomplete (e.g. the __builtin_unreachable() test currently doesn't cover clang).	2024-12-17 16:46:46 +01:00
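The compiler-related commits above (__has_builtin(), ASSUME(), ASSUME_NONNULL() and the BUG_ON() fallback) describe a small stack of macros. The following is only a sketch of how such a stack could be layered, not the actual HAProxy definitions: every SKETCH_* name and the sketch_name_len() helper are hypothetical, and the value-returning form relies on the GNU C statement-expression extension supported by gcc and clang.

```c
#include <assert.h>
#include <stddef.h>

/* Older compilers lack __has_builtin(); make it evaluate to 0 there. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* SKETCH_ASSUME(): promise the compiler that <expr> always holds. */
#if __has_builtin(__builtin_unreachable)
#define SKETCH_ASSUME(expr) do { if (!(expr)) __builtin_unreachable(); } while (0)
#else
/* Fallback: the expression is still evaluated, then we leave the loop. */
#define SKETCH_ASSUME(expr) do { if (!(expr)) break; } while (0)
#endif

/* Returns its argument so it can be used inside an expression
 * (GNU C statement expression, an assumption of this sketch).
 */
#define SKETCH_ASSUME_NONNULL(p) \
	({ __typeof__(p) sketch_p_ = (p); SKETCH_ASSUME(sketch_p_ != NULL); sketch_p_; })

/* With strict checks the condition aborts; without them it degrades to an
 * assumption. The condition is evaluated into a local first, as the
 * BUG_ON() commit message says is needed for non-trivial expressions.
 */
#ifdef SKETCH_STRICT
#define SKETCH_BUG_ON(cond) assert(!(cond))
#else
#define SKETCH_BUG_ON(cond) \
	do { int sketch_c_ = !!(cond); SKETCH_ASSUME(!sketch_c_); } while (0)
#endif

/* Hypothetical user: the caller guarantees a non-NULL, short name. */
static inline size_t sketch_name_len(const char *name)
{
	const char *p = SKETCH_ASSUME_NONNULL(name);
	size_t len = 0;

	while (p[len])
		len++;
	SKETCH_BUG_ON(len > 255);
	return len;
}
```

Since a violated assumption is undefined behavior, such macros are only suitable for conditions that are guaranteed by design.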
Willy Tarreau	4710ab5604	BUILD: debug: only dump/reset glitch counters when really defined If neither DEBUG_GLITCHES nor DEBUG_STRICT is set, we end up with no dbg_cnt section, resulting in debug_parse_cli_counters not building due to __stop_dbg_cnt and __start_dbg_cnt not being defined. Let's just condition the end of the function to these conditions. An alternate approach (less elegant) is to always declare a dummy entry of type DBG_COUNTER_TYPES in debug.c. This must be backported to 3.1 since it was brought with glitches.	2024-12-17 16:46:25 +01:00
Olivier Houchard	b3cd5a4b86	CLEANUP: queues: Remove pendconn_grab_from_px(). pendconn_grab_from_px() is now unused, so just remove it.	2024-12-17 16:05:44 +01:00
Olivier Houchard	111ea83ed4	BUG/MEDIUM: queues: Do not use pendconn_grab_from_px(). pendconn_grab_from_px() was called when a server was brought back up, to get some streams waiting in the proxy's queue and get them to run on the newly available server. It is very similar to process_srv_queue(), except it only goes through the proxy's queue, which can be a problem, because there is a small race condition that could lead us to add more streams to the server queue just as it's going down. If that happens, the server would just be ignored when back up by new streams, as its queue is not empty, and it would never try to process its queue. The other problem with pendconn_grab_from_px() is that it is very liberal with how it dequeues streams, and it is not very good at enforcing maxconn, it could lead to having 3*maxconn connections. For both those reasons, just get rid of pendconn_grab_from_px(), and just use process_srv_queue(). Both problems are easy to reproduce, especially on a 64 threads machine, set a maxconn to 100, inject in H2 with 1000 concurrent connections containing up to 100 streams each, and after a few seconds/minutes the max number of concurrent output streams will be much higher than maxconn, and eventually the server will stop processing connections. It may be related to github issue #2744. Note that it doesn't totally fix the problem, we can occasionally see a few more connections than maxconn, but the max that have been observed is 4 more connections, we no longer get multiple times maxconn. This should be backported up to 2.6.	2024-12-17 16:05:44 +01:00
Olivier Houchard	dc9ce9c264	BUG/MEDIUM: queues: Make sure we call process_srv_queue() when leaving In stream_free(), make sure we call process_srv_queue() each time we call sess_change_server(), otherwise a server may end up not dequeuing any stream when it could do so. In some extreme cases it could lead to an infinite loop, as the server would appear to be available, as its "served" parameter would be < maxconn, but would end up not being used, as there are elements still in its queue. This should be backported up to 2.6.	2024-12-17 16:05:44 +01:00
Christopher Faulet	4f32d03360	BUG/MEDIUM: stconn: Only consider I/O timers to update stream's expiration date In sc_notify(), it remained a case where it was possible to set an expiration date on the stream in the past, leading to a crash because of a BUG_ON(). This must never happen of course. In sc_notify(), the stream's expiration may be updated in case no wakeup conditions are encountered. In that case, we must take care to never set an expiration date in the past. However, it appeared there was still a condition to do so. This code is based on an implicit postulate: the stream's expiration date must always be set when we leave process_stream(). It was true since the 2.9. But in 3.0, the buffer allocation mechanism was improved and on an alloc failure in process_stream(), the stream is inserted in a wait-list and its expiration date is set to TICK_ETERNITY. With the good timing, and an analysis expiration date set on a channel, it is possible to set the stream's expiration date in the past. After analysis, it appeared that the proper way to fix the issue is to only evaluate I/O timers (read and write timeout) and not stream's timers (analyse_exp or conn_exp) because only I/O timers may have changed since the last process_stream() call. This patch must be backported as far as 3.0 to fix the issue. But it is probably a good idea to also backport it as far as 2.8.	2024-12-16 17:47:25 +01:00
William Lallemand	e3b760ebcc	BUG/MINOR: ssl/cli: 'show ssl ca-file' escape the first '*' of a filename When doing a 'show ssl ca-file <filename>', prefixing a filename with a '*' allows to show the uncommitted transaction associated to this filename. However for people using '*' as the first character of their filename, there is no way to access this filename. This patch fixes the problem by allowing to escape the first character with \. This should be backported to every stable branch.	2024-12-16 17:09:34 +01:00
William Lallemand	82c83a11a1	BUG/MINOR: ssl/cli: 'show ssl crl-file' escape the first '*' of a filename When doing a 'show ssl crl-file <filename>', prefixing a filename with a '*' allows to show the uncommitted transaction associated to this filename. However for people using '*' as the first character of their filename, there is no way to access this filename. This patch fixes the problem by allowing to escape the first character with \. This should be backported to every stable branch.	2024-12-16 16:46:52 +01:00
William Lallemand	2ba4cf541b	BUG/MINOR: ssl/cli: 'show ssl cert' escape the first '*' of a filename When doing a 'show ssl cert <filename>', prefixing a filename with a '*' allows to show the uncommitted transaction associated to this filename. However for people using '*' as the first character of their filename, there is no way to access this filename. This patch fixes the problem by allowing to escape the first character with \. This should be backported to every stable branch.	2024-12-16 16:17:12 +01:00
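The three 'show ssl' fixes above share one parsing rule: a leading '*' selects the uncommitted transaction, and a leading backslash makes the next character literal. A minimal sketch of that rule follows, assuming a hypothetical sketch_parse_filename() helper and result struct; the real CLI parser may be organized differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result of parsing the <filename> argument. */
struct sketch_lookup {
	const char *filename;   /* name to look up, without any prefix */
	bool transaction;       /* true: show the uncommitted transaction */
};

static struct sketch_lookup sketch_parse_filename(const char *arg)
{
	struct sketch_lookup res = { arg, false };

	assert(arg != NULL);
	if (arg[0] == '*') {
		/* "*foo.pem": transaction attached to "foo.pem" */
		res.transaction = true;
		res.filename = arg + 1;
	} else if (arg[0] == '\\') {
		/* "\*foo.pem": committed file literally named "*foo.pem" */
		res.filename = arg + 1;
	}
	return res;
}
```

With this rule, `*ca.pem` refers to the pending transaction for `ca.pem`, while `\*ca.pem` refers to a committed file whose name starts with an asterisk.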
William Lallemand	fd35b7fb97	MINOR: ssl/cli: add -A to the 'show ssl sni' command description Add [-A] to the 'show ssl sni' command description.	2024-12-16 15:22:27 +01:00
William Lallemand	7c8e38d4d6	MINOR: ssl/cli: allow to filter expired certificates with 'show ssl sni' -A option in 'show ssl sni' shows certificates that are past the notAfter date. The patch reworks the option parsing to accept multiple options.	2024-12-16 14:55:23 +01:00
William Lallemand	bb88f68cf7	MINOR: ssl: add utils functions to extract X509 notAfter date Add ASN1_to_time_t() which converts an ASN1_TIME to a time_t and x509_get_notafter_time_t() which returns the notAfter date in time_t format.	2024-12-16 14:54:53 +01:00
Valentine Krasnobaeva	fbc534a6fa	REORG: startup: move nofile limit checks in limits.c Let's encapsulate the code, which checks the applied nofile limit into a separate helper check_nofile_lim_and_prealloc_fd(). Let's keep in this new function scope the block, which tries to create a copy of FD with the highest number, if prealloc-fd is set in the configuration.	2024-12-16 10:44:01 +01:00
Valentine Krasnobaeva	14f5e00d38	REORG: startup: move code that applies limits to limits.c In step_init_3() we try to apply provided or calculated earlier haproxy maxsock and memmax limits. Let's encapsulate these code blocks in dedicated functions: apply_nofile_limit() and apply_memory_limit() and let's move them into limits.c. Limits.c gathers now all the logic for calculating and setting system limits in dependency of the provided configuration.	2024-12-16 10:44:01 +01:00
Valentine Krasnobaeva	1332e9b58d	REORG: startup: move global.maxconn calculations in limits.c Let's encapsulate the code, which calculates global.maxconn and global.maxsslconn into a dedicated function set_global_maxconn() and let's move this function in limits.c. In limits.c we keep helpers to calculate and check haproxy internal limits, based on the system nofile and memory limits.	2024-12-16 10:44:01 +01:00
Frederic Lecaille	949bc18f66	CLEANUP: quic: Rename some BBR functions in relation with bw probing Rename bbr_is_probing_bw() to bbr_is_in_a_probe_state() and bbr_is_accelerating_probing_bw() to bbr_is_probing_bw() to match the function names of the BBR v3 internet draft. Must be backported to 3.1 to ease any further backport to come.	2024-12-13 19:41:21 +01:00
Frederic Lecaille	0dc0c890ea	BUG/MINOR: quic: missing Startup accelerating probing bw states Startup state is also a probing with acceleration bandwidth state. This modification should have come with this previous one: BUG/MINOR: quic: reduce packet losses at least during ProbeBW_CRUISE (BBR) Must be backported to 3.1.	2024-12-13 19:41:21 +01:00
Valentine Krasnobaeva	ea4a148a7d	REGTESTS: ssl: add a PEM with mix of LF and CRLF line endings User tried to update a PEM, generated automatically. Part of this PEM has LF line endings, and another part (CA certificate), added by some API, has CRLF line endings. This has revealed a bug in cli_snd_buf(), see more details in GitHub issue #2818. So, let's add an example of such PEM in our SSL regtest.	2024-12-13 18:13:42 +01:00
Valentine Krasnobaeva	d60c893991	BUG/MINOR: cli: cli_snd_buf: preserve \r\n for payload lines cli_snd_buf() analyzes input line by line. Before this patch it has always scanned a given line for the presence of '\r' followed by '\n'. This is only needed for strings that contain the command itself, like "show ssl cert\n", "set ssl cert test.pem <<\n". In case of strings which contain the command's payload, like "-----BEGIN CERTIFICATE-----\r\n", '\r\n' should be preserved as is. This patch fixes the GitHub issue #2818. This patch should be backported in v3.1 and in v3.0.	2024-12-13 18:13:42 +01:00
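The distinction made by the cli_snd_buf() fix can be illustrated with a small helper that only touches the line ending of command lines and leaves payload lines byte for byte. This is a sketch under assumptions: sketch_trim_line() is a hypothetical name, and collapsing CRLF to LF is one possible normalization, not necessarily what cli_snd_buf() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Normalizes the ending of one input line of <len> bytes and returns the
 * new length. Command lines get a trailing "\r\n" collapsed to "\n";
 * payload lines are returned untouched.
 */
static size_t sketch_trim_line(char *line, size_t len, bool is_payload)
{
	assert(line != NULL);

	if (is_payload)
		return len;     /* e.g. PEM data: keep "\r\n" as is */

	if (len >= 2 && line[len - 2] == '\r' && line[len - 1] == '\n') {
		line[len - 2] = '\n';
		line[len - 1] = '\0';
		return len - 1;
	}
	return len;
}
```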
Frederic Lecaille	178109f608	BUG/MINOR: quic: too permissive exit condition for high loss detection in Startup (BBR) This bug fixes the 3rd condition used by bbr_check_startup_high_loss() to decide it has detected some high loss as mentioned by the BBR v3 RFC draft: 4.3.1.3. Exiting Startup Based on Packet Loss ... There are at least BBRStartupFullLossCnt=6 discontiguous sequence ranges lost in that round trip. where a <= operator was used in place of <. Must be backported to 3.1.	2024-12-13 14:42:43 +01:00
Frederic Lecaille	e61b418907	BUG/MINOR: quic: fix the wrong tracked recovery start time value bbr_congestion_event() role is to track the start time of recovery periods. This was done using <ts> passed as parameter. But this parameter is the time the newest lost packet has been sent. The timestamp value to store in ->recovery_start_ts is <now_ms>. Must be backported to 3.1.	2024-12-13 14:42:43 +01:00
Frederic Lecaille	e1d25cdbdd	CLEANUP: quic: remove a wrong comment about ->app_limited (drs) ->app_limited quic_drs struct member is not a boolean. This is the index of the last transmitted packet marked as application-limited, or 0 if the connection is not currently application-limited (see C.app_limited definition in BBR v3 draft).	2024-12-13 14:42:43 +01:00
Frederic Lecaille	eeaeb412dc	MINOR: quic: reduce the private data size of QUIC cc algos After these commits: BUG/MINOR: quic: remove max_bw filter from delivery rate sampling BUG/MINOR: quic: fix BBB max bandwidth oscillation issue where some members were removed from bbr struct, the private data size of QUIC cc algorithms may be reduced from 160 to 144 uint32_t. Should be easily backported to 3.1 alongside the commits mentioned above.	2024-12-13 14:42:43 +01:00
Frederic Lecaille	9813de0537	BUG/MINOR: quic: reduce packet losses at least during ProbeBW_CRUISE (BBR) Upon congestion events (for instance packet loss), bbr_adapt_lower_bounds_from_congestion() role is to adapt some BBR internal variables in relation with the estimated bandwidth (BBR.bw). According to the BBR v3 draft, this function should do nothing if BBRIsProbingBW() pseudo-code returns true. That said, this function is not defined by the BBR v3 draft. But according to this part mentioned before defining the pseudo-code for BBRAdaptLowerBoundsFromCongestion(): 4.5.10.3. When not Probing for Bandwidth When not explicitly accelerating to probe for bandwidth (Drain, ProbeRTT, ProbeBW_DOWN, ProbeBW_CRUISE), BBR responds to loss by slowing down to some extent. This is because loss suggests that the available bandwidth and safe volume of in-flight data may have decreased recently, and the flow needs to adapt, slowing down toward the latest delivery process. BBR flows implement this response by reducing the short-term model parameters, BBR.bw_lo and BBR.inflight_lo. BBRIsProbingBW() should concern the accelerating probe for bandwidth states which are BBR_ST_PROBE_BW_REFILL and BBR_ST_PROBE_BW_UP. Adapt the code to match this latter assumption. At least this drastically reduces the packet loss volumes at least during ProbeBW_CRUISE. As an example, on a 100MBits/s internet link with ~94ms as RTT, before this patch, 4329640 sent packets were needed with 1617119 lost packets (!!!) to download a 3GB object. After this patch, 2843952 sent packets vs 144134 lost packets are needed. There may be some packet loss issue. I suspect the maximum bandwidth may be overestimated. The more this is the case, the higher the packet loss. That said, at this time, it remains below 5% depending on the size of the objects, 5% being for more than 2GB objects. Must be backported to 3.1.	2024-12-13 14:42:43 +01:00
Frederic Lecaille	ebfc301d5d	BUG/MINOR: quic: underflow issue for bbr_inflight_hi_from_lost_packet() Add a test to ensure that the value of a local variable used by bbr_inflight_hi_from_lost_packet() is not impacted by underflow issues when subtracting too big numbers and make this function return a correct value. Must be backported to 3.1.	2024-12-13 14:42:43 +01:00
Frederic Lecaille	22ab45a3a8	BUG/MINOR: quic: remove max_bw filter from delivery rate sampling This filter is no more needed after this commit: BUG/MINOR: quic: fix BBB max bandwidth oscillation issue. Indeed, one added this filter at delivery rate sampling level to filter the BBR max bandwidth estimations and was inspired from ngtcp2 code source when trying to fix the oscillation issue. But this BBR max bandwidth oscillation issue was fixed by the aforementioned commit. Furthermore this code tends to always increment the BBR max bandwidth. From my point of view, this is not a good idea at all. Must be backported to 3.1.	2024-12-13 14:42:43 +01:00
Frederic Lecaille	2bcd5b4cba	BUG/MINOR: quic: wrong bbr_target_inflight() implementation This bug arrived with this commit: 6404b7a18a BUG/MINOR: quic: fix bbr_inflight() calls with wrong gain value This patch partially reverts after having checked the BBR v3 draft. This bug was invisible when testing long BBR flows. Must be backported to 3.1.	2024-12-13 14:42:43 +01:00
Frederic Lecaille	b47e1e65df	BUG/MINOR: quic: fix BBB max bandwidth oscillation issue. Remove the code in relation with BBR.ack_phase as per this commit: `ee98c12ad6` I do not know at this time why such a request was pushed on GH for the BBR v3 draft pseudo-code. That said, the use of such an ack phase seemed confusing, adding much more information about a BBR flow state than needed. Indeed, the ack phase state is modified several times in the BBR draft pseudo-code but only used to decide if the max bandwidth filter virtual clock had to be incremented by BBRAdvanceMaxBwFilter(). In addition to this, when discussing about haproxy BBR implementation with Neal Cardwell on the BBR development google group about an oscillation issue of the max bandwidth (BBR.max_bw), I concluded that this was due to the fact that its filter virtual clock was updated too often, due to the ack phase which was stalled in BBR_ACK_PHASE_ACKS_PROBE_STOPPING state for too long. This is where Neal asked me to test the aforementioned commit. This definitely makes the max bandwidth (BBR.max_bw) oscillation issue disappear. Another solution would have been to add a new ack phase enum after BBR_ACK_PHASE_ACKS_PROBE_STOPPING. BBR_ACK_PHASE_ACKS_PROBE_STOPPED would have been a good candidate. Remove the code in relation with BBR.ack_phase. Must be backported to 3.1.	2024-12-13 14:42:43 +01:00
Frederic Lecaille	1dbf6b8bed	BUG/MINOR: quic: wrong logical statement in in_recovery_period() (BBR) A && logical operator was badly replaced by a \|\| in this function which decides if BBR is in a recovery period. Must be backported to 3.1.	2024-12-13 14:42:43 +01:00
Frederic Lecaille	a9a2f98f86	MINOR: window_filter: rely on the time to update the filter samples (QUIC/BBR) The windowed filters are used only by the BBR implementation for QUIC to filter the maximum bandwidth samples for its estimation over a virtual time interval tracked by counting the cyclical progression through ProbeBW cycles. ngtcp2 and quiche use such windowed filters in their BBR implementation. But in a slightly different way. When updating the 2nd or 3rd filter samples, this is done based on their values in place of the time they have been sampled. It seems more logical to rely on the sample timestamps even if this has no implication because when a sample is updated using another sample because it has the same value, they have both the same timestamps! This patch modifies two statements which compare two consecutive filter samples based on their values (smp[]->v) by statements which compare them based on the virtual time they have been sampled (smp[]->t). This fully complies with the code used by the Linux kernel in lib/win_minmax.c. Also take the opportunity of this patch to shorten some statements using <smp> local variable value to update smp[2] sample in place of initializing its two members with the <smp> member values. This patch SHOULD be easily backported to 3.1 where BBR was first implemented.	2024-12-13 14:42:43 +01:00
William Lallemand	0c1fdb2908	CI: github: let's add an AWS-LC-FIPS job Add a job which does exactly the same as the aws-lc.yml job, but using the AWS-LC-FIPS build.	2024-12-12 16:35:42 +01:00
William Lallemand	0107bfdb1a	MEDIUM: ssl: rename 'OpenSSL' by 'SSL library' in haproxy -vv It's been some time since we are compatible with multiple SSL libraries, let's rename the "OpenSSL library" strings in "SSL library" strings in haproxy -vv, in order to be more generic.	2024-12-12 15:58:57 +01:00
William Lallemand	f97ffb9ec4	MINOR: ssl: add "FIPS" details in haproxy -vv Add the FIPS mode in haproxy -vv, it needs to be activated on the system with openssl.cnf or by compiling the SSL library with the right options. Can't work with OpenSSL >= 3.0 because FIPS is a "provider" to load, works with AWS-LC, WolfSSL and OpenSSL 1.1.1.	2024-12-12 15:57:38 +01:00
William Lallemand	23f670f1f5	CI: scripts: add support for AWS-LC-FIPS in build-ssl.sh Allow the build-ssl.sh script to build AWS-LC-FIPS. Example: sudo AWS_LC_FIPS_VERSION=3.0.0 BUILDSSL_DESTDIR=/opt/awslc-fips-3.0.0/ ./scripts/build-ssl.sh	2024-12-12 15:57:30 +01:00
Amaury Denoyelle	ee7241ed18	MINOR: stats: use stress mode to force reentrant dumps Provide alternative code during stats dump when stress mode is active. The objective is to force the applet to yield on every output line. This allows to easily test reentrant code paths, in particular while adding and removing server instances. To support this, output is interrupted every time the output buffer (or its equivalent) is not empty. Use COND_STRESS() macro to provide default and stress alternative conditions.	2024-12-12 11:26:33 +01:00
Amaury Denoyelle	1f458b3ea8	MINOR: applet: define applet_putchk_stress() alternative Previous patch introduced stress mode to be able to easily test alternative code paths. The first point would be to force interruption of stats dump on every line and check reentrant paths, in particular while adding and removing server instances. The purpose of this patch is to be able to use applet_putchk_stress() during stats dump while not impacting other applets. To support this, extract applet_putchk() into an internal _applet_putchk() which has a new argument stress. Define two helpers applet_putchk() and applet_putchk_stress(), the latter to set the stress argument to true. For the moment, applet_putchk_stress() is not used. This will be the subject of the next patch.	2024-12-12 11:26:33 +01:00
Amaury Denoyelle	9d19fc4cf7	MINOR: build: define DEBUG_STRESS Define a new build mode DEBUG_STRESS. This will be used to stress some code parts which cannot be reproduced easily with an alternative suboptimal code. First, a global <mode_stress> is set either to 1 or 0 depending on DEBUG_STRESS compilation. A new global keyword "stress-level" is also defined. It allows to specify a level from 0 to 9, to increase the stress incurred on the code. Helper macros STRESS_RUN* are defined for each stress level. This allows to easily specify an instruction in default execution and a stress counterpart if running on the corresponding stress level.	2024-12-12 11:19:10 +01:00
Willy Tarreau	f36ac42274	[RELEASE] Released version 3.2-dev1 Released version 3.2-dev1 with the following main changes : - MINOR: pattern: split pat_ref_set() - MINOR: pattern: add pat_ref_gen_set() function - MINOR: pattern: add pat_ref_gen_find_elt() function - MINOR: pattern: add pat_ref_gen_delete() function - MEDIUM: pattern: consider gen_id in pat_ref_set_from_node() - MEDIUM: pattern: always consider gen_id for pat_ref lookup operations - MINOR: version: this is development again (3.2) - DEV: patchbot: prepare for new version 3.2-dev - BUG/MEDIUM: sock: Remove FD_POLL_HUP during connect() if FD_POLL_ERR is not set - MINOR: proxy: Add support of 421-Misdirected-Request in retry-on status - BUG/MINOR: log: fix lf_text() behavior with empty string - MINOR: log: always consider "+M" option in lf_text_len() - BUG/MINOR: improve BBR throughput on very fast links - MINOR: event_hdl: add PAT_REF events - MINOR: pattern: publish event_hdl events on pat_ref updates - MINOR: hlua: add patref class - MINOR: hlua: add core.get_patref method - MINOR: hlua_fcn: implement index and pair metamethods for patref class - MINOR: hlua_fcn: wrap pat_ref struct for patref class - MINOR: pattern: add pat_ref_may_commit() helper function - MINOR: hlua_fcn: add Patref:commit() method - MINOR: hlua_fcn: add Patref:prepare() method - MINOR: hlua_fcn: add Patref:purge() method - MINOR: hlua_fcn: add Patref:giveup() - MINOR: hlua_fcn: add Patref:add() - MINOR: hlua_fcn: add Patref:del() - MINOR: hlua_fcn: add Patref:set() - MINOR: hlua_fcn: add Patref:add_bulk() - MINOR: hlua_fcn: add Patref:event_sub() - DOC: lua: prefer Patref:{set,add}() over legacy methods for acl and maps - BUG/MINOR: hlua_fcn: fix Patref:set() force parameter - BUG/MEDIUM: event_hdl: fix uninitialized value in async mode when no data is provided - BUG/MEDIUM: quic: prevent stream freeze on pacing - BUG/MEDIUM: http-ana: Reset request flag about data sent to perform a L7 retry - BUG/MINOR: h1-htx: Use default reason 
if not set when formatting the response - BUILD: quic: fix a build error about an non initialized timestamp - CI: github: allow coredumps on aws-lc and wolfssl jobs - BUG/MINOR: listener: fix potential null pointer dereference in listener_release() - MINOR: hlua: fix ambiguous hlua usage in hlua_filter_delete() - BUG/MINOR: signal: register default handler for SIGINT in signal_init() - BUG/MINOR: startup: close pidfd and free global.pidfile in handle_pidfile() - BUG/MINOR: startup: fix pidfile creation - MINOR: tools: add a new macro DEFVAL() to provide a default argument - MINOR: tasklet: set TASK_WOKEN_OTHER on tasklets by default - BUG/MINOR: quic: fix bbr_inflight() calls with wrong gain value - BUG/MEDIUM: init: make sure only daemonized processes change their session - BUG/MINOR: init: do not call fork_poller() for non-forked processes - BUG/MEDIUM: mux-quic: remove pacing status when everything is sent - BUG/MINOR: quic: remove startup alert if conn socket-owner unsupported - BUG/MINOR: quic: remove startup alert if GSO unsupported - MINOR: stktable: implement "recv-only" table option - CLEANUP: stktable: replace nopurge attribute with flag - CLEANUP: stktable: add some stktable flags polishing - BUG/MEDIUM: mux-h2: make sure not to touch dummy streams when sending WU - MINOR: mux-quic: clean up zero-copy done_ff callback - BUG/MINOR: config: Fix parsing of accept-invalid-http-{request,response} - BUG/MINOR: mworker: don't save program PIDs in oldpids - BUG/MINOR: mworker: fix -D -W -sf/-st modes - BUG/MINOR: startup: fix error path for master, if can't open pidfile - CLEANUP: startup: make if condition to kill old pids more readable - DOC: config: fix confusing init-state examples - MINOR: mux-h1: use explicit __objt_server on idle conn reinsert - MINOR: mux-h2: use explicit __objt_server on idle conn reinsert - MINOR: mux-spop: use explicit __objt_server on idle conn reinsert - MINOR: mux-fcgi: use explicit __objt_server on idle conn reinsert - MINOR: 
quic: convert startup check in a freestanding function - MINOR: quic: split startup check function - MINOR: quic: implement build options report - BUG/MINOR: debug: COUNT_IF() should return true/false - MINOR: mux-h2/traces: add a missing trace on negative initial window size - CLEANUP: mux-h2/traces: reword certain ambiguous traces - MINOR: mux-h2/glitches: add a description to the H2 glitches - BUG/MINOR: mux-h2: fix expression when detecting excess of CONTINUATION frames - BUILD: debug: fix build issues in COUNT_IF() with -Wunused-value - MINOR: tools: make fddebug() automatically emit the location - MINOR: ssl: add notBefore and notAfter utility functions - MEDIUM: ssl/cli: "show ssl sni" list the loaded SNI in frontends - BUG/MEDIUM: startup: don't daemonize if started with -c - BUG/MEDIUM: startup: report status if daemonized process fails - BUG/MEDIUM: mworker: report status, if daemonized master fails - BUG/MINOR: mworker: detach from tty when received READY from worker - BUG/MINOR: namespace: handle a possible strdup() failure - BUG/MINOR: ssl_crtlist: handle a possible strdup() failure - BUG/MINOR: resolvers: handle a possible strdup() failure - CI: use "/tmp" as default value for TMPDIR when searching logs - DOC: management: fix typos and paragraph ordering in 'show ssl sni' - CLEANUP: ssl: fix comment in 'show ssl sni' - MINOR: ssl/cli: add negative filters to "show ssl sni" - BUG/MINOR: stats: decrement srv refcount on stats-file release - MINOR: list: define a watcher type - BUG/MEDIUM: stats/server: use watcher to track server during stats dump - MINOR: server: remove prev_deleted server list - BUG/MINOR: http-fetch: Ignore empty argument string for query() - BUG/MINOR: server-state: Fix expiration date of srvrq_check tasks - BUG/MINOR: hlua_fcn: restore server pairs iterator pointer consistency	2024-12-11 14:17:46 +01:00
Aurelien DARRAGON	358166ae6a	BUG/MINOR: hlua_fcn: restore server pairs iterator pointer consistency Since 9c91b30 ("MINOR: server: remove prev_deleted server list"), hlua server pair iterator may use and return invalid (stale) server pointer if multiple servers were deleted between two iterations. Indeed, the server refcount mechanism (using srv_take()) is no longer sufficient as the prev_deleted mitigation was removed. To ensure server pointer consistency between two yields, the new watcher mechanism must be used (as it already the case for stats dumping). Thus in this patch we slightly change the server iteration logic: hlua_server_list_iterator_context struct now stores the next valid server pointer, and a watcher is added to ensure this pointer is never stale. Then in hlua_listable_servers_pairs_iterator(), this next pointer is used to create the Lua server object, and the next valid pointer is obtained by leveraging watcher_next(). No backport needed unless 9c91b30 ("MINOR: server: remove prev_deleted server list") is. Please note that dynamic servers were not supported in Lua prior to 2.8, so it doesn't make sense to backport this patch further than 2.8.	2024-12-11 10:52:11 +01:00
Christopher Faulet	647a290662	BUG/MINOR: server-state: Fix expiration date of srvrq_check tasks "hold.timeout" was used as expiration date for srvrq_check tasks. But it is not accurate. The expiration date must be based on the resolution timeouts instead (resolve and retry). The purpose of srvrq_check task is to clean up the server resolution status when outdated info is inherited from the state file. Using "hold.timeout" is not accurate here because hold timeouts concern the resolution response items not the resolution status of servers. It may be set to a huge value or 0. The expiration date of these tasks must be based on the resolution timeouts instead. So now the ("timeout resolve" + resolve_retries * "timeout retry") value is used. This patch should fix the issue #2816. It must be backported to all stable versions.	2024-12-11 10:00:01 +01:00
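The expiration formula quoted in the commit above can be illustrated with a minimal sketch. The struct and function names here are hypothetical stand-ins chosen for illustration, not HAProxy's actual resolver types; only the formula itself comes from the commit message.

```c
#include <assert.h>

/* Hypothetical stand-in for the resolver settings named in the commit;
 * the field names are illustrative, not HAProxy's struct layout. */
struct resolv_timeouts {
    unsigned int resolve_ms;  /* "timeout resolve" */
    unsigned int retry_ms;    /* "timeout retry" */
    unsigned int retries;     /* resolve_retries */
};

/* Delay after which a srvrq_check-like task should expire:
 * "timeout resolve" + resolve_retries * "timeout retry". */
unsigned int srvrq_check_delay_ms(const struct resolv_timeouts *t)
{
    return t->resolve_ms + t->retries * t->retry_ms;
}
```

With a 1 s resolve timeout, a 1 s retry timeout and 3 retries, the sketch yields 4 s, independent of whatever value "hold.timeout" carries.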
Christopher Faulet	e1525e7b8f	BUG/MINOR: http-fetch: Ignore empty argument string for query() query() sample fetch function takes an optional argument string. During configuration parsing, empty string must be ignored. It is especially important when the sample is used with empty parenthesis. The argument is optional and it is a list of options to configure the behavior of the sample fetch. So it is logical to ignore empty strings. This patch should fix the issue #2815. It must be backported to 3.1.	2024-12-11 10:00:01 +01:00
Amaury Denoyelle	9c91b30139	MINOR: server: remove prev_deleted server list This patch is a direct follow-up to the previous one. Thanks to watcher type, it is now safe to assume that servers manipulated via stats dump were not targeted by a "delete server" CLI command. As such, prev_deleted list server member is now unneeded. This patch thus removes any reference to it.	2024-12-10 16:19:33 +01:00
Amaury Denoyelle	071ae8ce3d	BUG/MEDIUM: stats/server: use watcher to track server during stats dump If a server A is deleted while a stats dump is currently on it, deletion is delayed thanks to reference counting. Server A is nonetheless removed from the proxy list. However, this list is a single linked list. If the next server B is deleted and freed immediately, server A would still point to it. This problem has been solved by the prev_deleted list in servers. This model seems correct, but it is difficult to fully ensure its validity. In particular, it implies that when a stats dump is resumed, server A elements will be accessed despite the server being in a half-deleted state. Thus, it has been decided to completely ditch the refcount mechanism for stats dump. Instead, use the watcher element to register every stats dump currently tracking a server instance. Each time a server is deleted on the CLI, every stats dump element which may point to it is updated to access the next server instance, or NULL if this is the last server. This ensures that a server which was deleted via CLI but not completely freed is never accessed on stats dump resumption. Currently, no race condition related to dynamic servers and stats dump is known. However, as described above, the previous model is deemed too fragile, as such this patch is labelled as bug-fix. It should be backported up to 2.6, after a reasonable period of observation. It relies on the following patch : MINOR: list: define a watcher type	2024-12-10 16:19:33 +01:00
Amaury Denoyelle	eafa8a32bb	MINOR: list: define a watcher type Define a new watcher type into list module. This type is similar to bref and can be used to register an element which is currently tracking a dynamic target. Contrary to bref, if the target is freed, every watcher element is updated to point to the next valid entry or NULL. This type will simplify handling of dynamic servers deletion, in particular while stats dumps are performed. This patch is not a bug-fix. However, it is mandatory to fix a race condition in dynamic servers. Thus, it should be backported along the next commit up to 2.6.	2024-12-10 16:04:11 +01:00
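The watcher idea described above can be sketched as follows. This is a simplified illustration under stated assumptions, not HAProxy's list module: the `item`, `watcher`, `watcher_attach()` and `item_delete()` names are hypothetical, the list is a plain singly linked list, and no locking is shown.

```c
#include <assert.h>
#include <stddef.h>

struct item;

/* A watcher tracks one item and is chained into that item's watcher list. */
struct watcher {
    struct item *target;
    struct watcher *next;
};

struct item {
    int id;
    struct item *next;         /* singly linked list of items */
    struct watcher *watchers;  /* watchers currently tracking this item */
};

/* Make <w> track <it> (which may be NULL when nothing is left to track). */
void watcher_attach(struct watcher *w, struct item *it)
{
    w->target = it;
    w->next = NULL;
    if (it) {
        w->next = it->watchers;
        it->watchers = w;
    }
}

/* Unlink <it> from the list at <head> and move all of its watchers to the
 * next item, or to NULL if <it> was the last one. */
void item_delete(struct item **head, struct item *it)
{
    struct item **pp = head;

    while (*pp && *pp != it)
        pp = &(*pp)->next;
    if (!*pp)
        return;  /* not in the list */
    *pp = it->next;

    while (it->watchers) {
        struct watcher *w = it->watchers;

        it->watchers = w->next;
        watcher_attach(w, it->next);
    }
    it->next = NULL;
}
```

The point of the design is that a paused iteration never holds a pointer to a removed element: the deletion path itself repoints every watcher before the element goes away.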
Amaury Denoyelle	2199179461	BUG/MINOR: stats: decrement srv refcount on stats-file release Server instances may be removed at runtime. This can occur during a stats dump which currently references this server instance. This case is protected by server refcount to prevent the server immediate release. CLI output may be interrupted prior to stats dump completion, for example if client CLI has been disconnected before the full response transfer. As such, srv_drop() must be called in every stats dump release callback. srv_drop() was missing for stats-file dump release callback. This could cause a race condition which would prevent a server instance from being fully removed. Fix this by adding srv_drop() invocation into cli_io_handler_release_dump_stat_file(). This should be backported up to 3.0.	2024-12-10 16:04:11 +01:00
William Lallemand	a6b3080966	MINOR: ssl/cli: add negative filters to "show ssl sni" The 'show ssl sni' output can be confusing when using crt-list, because the wildcards can be completed with negative filters, and they need to be associated to the same line. Having a negative filter on its line alone does not make much sense, this patch adds a new 'Negative Filter' column that show the exception applied on a wildcard from a crt-list line.	2024-12-10 11:36:50 +01:00
William Lallemand	da28cd08f5	CLEANUP: ssl: fix comment in 'show ssl sni' Fix a comment in the 'show ssl sni' IO handler.	2024-12-10 11:17:10 +01:00
William Lallemand	9681fe0dba	DOC: management: fix typos and paragraph ordering in 'show ssl sni' Fixes small typos, uppercase and paragraph ordering in the 'show ssl sni' section.	2024-12-10 10:27:57 +01:00
Ilia Shipitsin	d61cac4ed1	CI: use "/tmp" as default value for TMPDIR when searching logs VTest uses /tmp already if not defined, let's stick to that behaviour for searching logs as well	2024-12-10 08:20:51 +01:00
Ilia Shipitsin	193c94a539	BUG/MINOR: resolvers: handle a possible strdup() failure This defect was found by the coccinelle script "unchecked-strdup.cocci". It can be backported to all supported branches.	2024-12-10 08:05:50 +01:00
Ilia Shipitsin	ce30bc1730	BUG/MINOR: ssl_crtlist: handle a possible strdup() failure This defect was found by the coccinelle script "unchecked-strdup.cocci". It can be backported to all supported branches.	2024-12-10 08:05:42 +01:00
Ilia Shipitsin	abee546850	BUG/MINOR: namespace: handle a possible strdup() failure This defect was found by the coccinelle script "unchecked-strdup.cocci". It can be backported to all supported branches.	2024-12-10 08:05:34 +01:00
Valentine Krasnobaeva	1f63a53955	BUG/MINOR: mworker: detach from tty when received READY from worker Some master process' initialization steps are conditioned by receiving the READY message from worker (pidfile creation, forwarding READY message to the launching parent). So, master process can not do these initialization routines before. If the master process fails, while creating pid or forwarding the READY to the parent in daemon mode, he exits with a proper alert message. In daemon mode we no longer see such message, as process is already detached from the tty. To fix this, as these alerts could be very useful, let's detach the master process from the tty after his last initialization steps in _send_status.	2024-12-09 21:32:54 +01:00
Valentine Krasnobaeva	97aaf76716	BUG/MEDIUM: mworker: report status, if daemonized master fails As daemonization fork happens now very early and before the master-worker fork, if master or worker processes fail during the initialization, some critical errors can't be reported to stdout. The launching (parent) process in such cases exits with 0. This makes an impression, that master and his worker have successfully started at background, which really complicates the operations. In the previous commit a pipe was added to make daemonized child communicate with his parent. Let's add the same logic to master-worker mode. Up to receiving the READY message from the worker, master will "forward" it via the pipe to the launching process. Launching process can obtain master's exit status, if the master fails to start and nothing has been written in the pipe. This fix should be backported only in 3.1.	2024-12-09 21:32:49 +01:00
Valentine Krasnobaeva	663d75e7a0	BUG/MEDIUM: startup: report status if daemonized process fails Due to master-worker rework, daemonization fork happens now before parsing and applying the configuration. This makes it impossible to correctly report all warnings and alerts to shell's stdout. If the daemonized process fails while already in the background, the exit code reported by the shell via '$?' equals 0, as it's the exit code of its parent. To fix this, let's create a pipe between parent and daemonized child. The child will send into this pipe a "READY" message, when it finishes its initialization. The parent will wait on the "read" end of the pipe until receiving something. If read() fails, parent obtains the status of the exited child with waitpid(). So, the parent can correctly report the error to the stdout and it can exit with child's exitcode. This fix should be backported only in 3.1.	2024-12-09 21:32:44 +01:00
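The pipe handshake described above can be sketched with plain POSIX calls. This is a minimal illustration, not the HAProxy startup code: `start_child()` and its `child_fails` argument are hypothetical, the child exits right after signalling instead of continuing as a daemon, and the parent reaps the child on success only to keep the sketch tidy.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that reports "READY" through a pipe once initialized.
 * Returns 0 if READY was received, the child's exit code if it died
 * before reporting, or -1 on internal error. */
int start_child(int child_fails)
{
    int fds[2];
    char buf[8];
    int status = 0;
    ssize_t n;
    pid_t pid;

    if (pipe(fds) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* child: keep only the write end */
        close(fds[0]);
        if (child_fails)
            _exit(1);  /* simulated initialization failure */
        if (write(fds[1], "READY", 5) != 5)
            _exit(2);
        close(fds[1]);
        _exit(0);      /* a real daemon would keep running here */
    }

    /* parent: keep only the read end, so that read() returns 0 (EOF)
     * if the child exits without writing anything */
    close(fds[1]);
    n = read(fds[0], buf, sizeof(buf));
    close(fds[0]);

    if (n > 0) {
        waitpid(pid, NULL, 0);  /* sketch only: reap the short-lived child */
        return 0;
    }

    /* nothing received: fetch the child's exit status */
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;
}
```

Closing the write end in the parent is what makes the scheme work: without it, read() would block forever when the child dies silently.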
Valentine Krasnobaeva	5f94e98d89	BUG/MEDIUM: startup: don't daemonize if started with -c Due to master-worker refactoring, daemonization fork happens now very early, before parsing and verifying the configuration. For the moment there is no specific syntax which needs the daemon mode to be really applied in order to perform the tests. So, it's better not to do the daemonization fork, if 'daemon' keyword is present in the config (or -D option), when we are started with -c (MODE_CHECK). Like this, during the config verification, the process will always stay in foreground and all warnings or errors will be delivered to the stdout. This fix should be backported only in 3.1.	2024-12-09 21:32:36 +01:00
William Lallemand	5d1b30d6b8	MEDIUM: ssl/cli: "show ssl sni" list the loaded SNI in frontends The "show ssl sni" command, allows one to dump the list of SNI in an haproxy process, or a designated frontend. It lists the SNI with the type, filename, and dates of expiration and activation	2024-12-09 18:29:35 +01:00
William Lallemand	5454824e31	MINOR: ssl: add notBefore and notAfter utility functions Extracting notBefore and notAfter as a string can be bothersome, add 2 utility functions that returns the value in a static buffer.	2024-12-09 18:29:23 +01:00
Willy Tarreau	c3ee4e375b	MINOR: tools: make fddebug() automatically emit the location fddebug() is sometimes quite helpful, but annoying to use when following a call path because it's a pain to always repeat the function name and call place. Let's have it automatically prepend the function name, the file name and the line number, and make its arguments optional, replacing them by a simple LF when all absent. This way, simply placing: fddebug(); is sufficient to emit a location following "[%s@%s:%d]\n". This function must not be used in production (and even call places with it shouldn't be committed) and it should only be used by developers, so the simpler the better.	2024-12-09 18:05:09 +01:00
Willy Tarreau	d6dc8120c0	BUILD: debug: fix build issues in COUNT_IF() with -Wunused-value Commit 7f64bb79fd ("BUG/MINOR: debug: COUNT_IF() should return true/false") allowed the COUNT_IF() macro to return the evaluated value. This is handy to place it in "if ()" conditions and count them at the same time. When glitches are disabled, the condition is just returned as-is, but most call places do not use the result, making some compilers complain. In addition, while reviewing this, it was noticed that when DEBUG_STRICT=0, the macro would still be replaced by a "do { } while (0)" statement, which not only does not evaluate the expression, but also cannot return anything. Ditto for COUNT_IF_HOT(). Let's make sure both are always properly evaluated now.	2024-12-09 18:04:51 +01:00
Willy Tarreau	cb21db04c7	BUG/MINOR: mux-h2: fix expression when detecting excess of CONTINUATION frames Latest commit f0eca8fe7 ("MINOR: mux-h2/glitches: add a description to the H2 glitches") misplaced the optional glitch description field, with it appearing at the end of the if () condition and always reporting an excess of CONTINUATION frames from the first exceeding one. This needs to be backported along with that commit once it gets backported.	2024-12-06 18:53:19 +01:00
Willy Tarreau	f0eca8fe73	MINOR: mux-h2/glitches: add a description to the H2 glitches Since we can now list them using "debug counters" and now support a description, better add the description to all glitches. This patch may be backported to 3.1, but before this the following patches must also be picked: 86823c828 MINOR: mux-h2/traces: add a missing trace on negative initial window size 7c8e9420a CLEANUP: mux-h2/traces: reword certain ambiguous traces	2024-12-06 18:49:07 +01:00
Willy Tarreau	7c8e9420a2	CLEANUP: mux-h2/traces: reword certain ambiguous traces Some h2 traces were not very clear, let's reword them a bit.	2024-12-06 18:45:46 +01:00
Willy Tarreau	86823c828f	MINOR: mux-h2/traces: add a missing trace on negative initial window size When a negative initial window size is reported, we're going to close the connection, so it's important to report a trace to explain why! This should be backported at least to 3.1 and possibly 3.0 (adapting the context since there's no glitches there).	2024-12-06 18:45:46 +01:00
Willy Tarreau	7f64bb79fd	BUG/MINOR: debug: COUNT_IF() should return true/false The COUNT_IF() macro was initially meant to return true/false to be used in if() conditions but had an extra do { } while(0) that prevents it from doing so. Let's get rid of the do { } while(0) before the code generalizes to too many places. There's no impact on existing code, but may have to be backported if future fixes rely on it.	2024-12-06 18:45:46 +01:00
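The difference between a statement-style macro and one usable in if() conditions can be sketched as follows. The macros and the counter below are hypothetical simplifications written for illustration; they are not the COUNT_IF() definition from the HAProxy tree.

```c
#include <assert.h>

/* Hypothetical global event counter used by both macro variants. */
unsigned long glitch_count;

/* Statement form: valid as a standalone statement, but it yields no value,
 * so "if (COUNT_IF_STMT(x))" would not compile. */
#define COUNT_IF_STMT(cond) do { if (cond) glitch_count++; } while (0)

/* Expression form: counts the event and evaluates to 1 or 0, so it can be
 * placed directly inside a condition. */
#define COUNT_IF_EXPR(cond) ((cond) ? (glitch_count++, 1) : 0)

/* Example caller: rejects negative lengths and counts each rejection. */
int check_len(int len)
{
    if (COUNT_IF_EXPR(len < 0))
        return -1;
    return 0;
}
```

Calling check_len() with a negative length both takes the error branch and bumps the counter, which is the combined behaviour the commit aims for.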
Amaury Denoyelle	fc0bb6224c	MINOR: quic: implement build options report Define a new function quic_register_build_options(). Its purpose is to register a build options string for QUIC features which is reported when using haproxy -vv. This will allow to easily determine if connection socket-owner mode and GSO are supported or not. Here is the new filtered output : $ ./haproxy -vv | grep '^QUIC:' QUIC: connection socket-owner mode support : yes QUIC: GSO emission support : yes	2024-12-06 18:34:10 +01:00
Amaury Denoyelle	cab2cc15c1	MINOR: quic: split startup check function Two features are tested on startup via quic_test_socketopts() : connection socket-owner mode support and GSO. Extract both test in their separated functions called by quic_test_socketopts(). This patch will allow to reuse easily QUIC features detection for build options report via haproxy -vv.	2024-12-06 18:34:09 +01:00
Amaury Denoyelle	e7fd458c14	MINOR: quic: convert startup check in a freestanding function quic_test_socketopts() function is used to detect system support for QUIC network stack. Previously, it relies on an already bound listener instance, notably to ensure that two UDP sockets can be bound on the same source address. Improve quic_test_socketopts() to run without any listener argument. It now automatically instantiates and manipulates two dummy sockets FDs to check for multi-bind support. This brings two advantages : * the function is now called via an initcall * it will easily be reusable to implement build option description	2024-12-06 18:33:50 +01:00
Amaury Denoyelle	d4f6f2df5e	MINOR: mux-fcgi: use explicit __objt_server on idle conn reinsert This commit is the counterpart of the previous one for FCGI mux. It replaces objt_server() by unsafe __objt_server(), as conn target is guaranteed to point to a valid server instance, which can then be used as _srv_add_idle() argument.	2024-12-06 18:02:55 +01:00
Amaury Denoyelle	1778284824	MINOR: mux-spop: use explicit __objt_server on idle conn reinsert This commit is the counterpart of the previous one for SPOP mux. It replaces objt_server() by unsafe __objt_server(), as conn target is guaranteed to point to a valid server instance, which can then be used as _srv_add_idle() argument. This should fix coverity report from github issue #2811.	2024-12-06 18:02:55 +01:00
Amaury Denoyelle	762d0764d7	MINOR: mux-h2: use explicit __objt_server on idle conn reinsert This commit is the counterpart of the previous one for H2 mux. It replaces objt_server() by unsafe __objt_server(), as conn target is guaranteed to point to a valid server instance, which can then be used as _srv_add_idle() argument.	2024-12-06 18:02:55 +01:00
Amaury Denoyelle	ece3bf65ca	MINOR: mux-h1: use explicit __objt_server on idle conn reinsert When dealing with a backend connection, H1 mux IO handler must reinsert it in its idle list pool if it was extracted from it at the beginning. This is the case if conn_in_list is true. On reinsert, idle list pool is retrieved via the server instance accessible from <conn.target>. Replace objt_server usage with __objt_server as an idle connection is always attached to a server. This ensures that there is no issue when using _srv_add_idle() then. This should fix coverity report from github issue #2810.	2024-12-06 18:02:55 +01:00
Aurelien DARRAGON	7934eef25d	DOC: config: fix confusing init-state examples in 50322dff ("MEDIUM: server: add init-state"), some examples on how to use init-state server keyword were added alongside with the keyword documentation. However, as reported by Nick Ramirez, there was an error because the example that stated that haproxy will pass the traffic to the server after 3 successful health checks used the "init-state down" instead of the "init-state fully-down". Thus the behavior wouldn't match what the comment said (only 1 successful health check was required). Here we fix the example in itself to match with the comment. Also the following example ("# or") was also affected, but it is kind of redundant as the main purpose of the examples are to illustrate the feature in itself and not how to use server-template directive, so we remove it. This should be backported in 3.1 with 50322dff	2024-12-06 13:16:12 +01:00
Valentine Krasnobaeva	f24e57d717	CLEANUP: startup: make if condition to kill old pids more readable Update comment and condition. nb_oldpids it's not a pointer, but a signed int, which keeps the max number of elements in oldpids array. So, it's a good practice to check, if it's strictly positive here.	2024-12-06 12:00:22 +01:00
Valentine Krasnobaeva	cd0b58e23e	BUG/MINOR: startup: fix error path for master, if can't open pidfile If master process can't open a pidfile, there is no sense to send SIGTTIN to oldpids, as it will exit. So, old workers will terminate as well. It's better to send the last alert to the log about unrecoverable error, because master is already in its polling loop. For the standalone mode we should keep the previous logic in this case: send SIGTTIN to old process and unbind listeners for the new one. So, it's better to put this error path in main(), as it's done when other configuration settings can't be applied. This patch should be backported only in 3.1.	2024-12-06 12:00:22 +01:00
Valentine Krasnobaeva	ee111d2004	BUG/MINOR: mworker: fix -D -W -sf/-st modes When a new master process is launched like below: ./haproxy -W -D -p ha.pid -sf $(cat ha.pid)... The old master process and its workers do not stop. Since the master-worker refactoring, the code, which sends USR1/TERM to old pids from -sf, is called only for the standalone mode. In master-worker mode we should receive the READY message from the newly forked worker at first, in order to be able to terminate the previous master. So, to fix this, let's terminate the previous master in _send_status(), where we parse the READY message from the newly forked worker. And let's continue to use oldpids array, as it was in 3.0, in order to stop the workers, launched before the reload. This patch should be backported only in 3.1.	2024-12-06 12:00:22 +01:00
Valentine Krasnobaeva	1fead6c0ca	BUG/MINOR: mworker: don't save program PIDs in oldpids After reload, previously launched programs are stopped explicitly in mworker_ext_launch_all(). So, there is no longer need to save their PIDs in oldpids array before the master reexec(). This also prepares the fix of "-D -W -sf/-st" modes, as we will need to loop over this array in the master process context, in order to stop the previous master, when the new one is ready. This patch should be backported only in 3.1.	2024-12-06 12:00:22 +01:00
Christopher Faulet	bc453c5106	BUG/MINOR: config: Fix parsing of accept-invalid-http-{request,response} These options are now deprecated, but the proxy capabilities are not properly checked during the configuration parsing, leading to these options always being ignored. This is now fixed by checking the frontend capability for "accept-invalid-http-request" option and the backend capability for "accept-invalid-http-response" option. In addition, the messages about the deprecation of these options are now emitted with ha_warning() instead of ha_alert() because they are only warnings and not errors. This patch should fix the issue #2806. It must be backported to 3.1.	2024-12-05 22:02:58 +01:00
Amaury Denoyelle	7885a3b3e1	MINOR: mux-quic: clean up zero-copy done_ff callback Recently, an issue was found with QUIC zero-copy forwarding on 3.0 version. A desynchronization could occur internally in QCS Tx bytes counters which would cause a BUG_ON() crash on qcs_destroy() when the stream is detached. It was silently fixed in version 3.1 by the following patch. As it was considered as an optimization, it was not scheduled yet for backport. 6697e87ae5e1f569dc87cf690b5ecfc049c4aab0 MINOR: mux-quic: Don't send an emtpy H3 DATA frame during zero-copy forwarding This mistake has been caused by some counter-intuitive manipulation in QUIC zero-copy implementation. Try to streamline this in QUIC MUX done_ff callback and its application protocol counterpart. Especially for values exchanged between MUX and application on one side, and MUX and stconn layer as done_fastfwd return value. First, application done_ff callback now returns the length of the wholly encoded frame. For HTTP/3, it means header length + payload length of the H3 frame. This value can then be reused as qcc_send_stream() argument to increase QCS Tx soft offset. As previously, special care has been taken to ensure that QUIC MUX done_ff only returns the transferred data bytes. Thus, any extra offset for HTTP/3 header is properly excluded. This is mandatory for stconn layer to consider the transfer has completed. Secondly, remove duplicated code in application done_ff to reset iobuf info. This is now factorized in QUIC MUX done_ff itself. This patch is related to github issue #2678.	2024-12-05 16:57:31 +01:00
Willy Tarreau	d649278fce	BUG/MEDIUM: mux-h2: make sure not to touch dummy streams when sending WU Since commit 1cc851d9f2 ("MEDIUM: mux-h2: start to update stream when sending WU") we started storing stream offsets in the h2s struct. These offsets are updated at a few points, where it's safe to write to the stream, and in h2c_send_strm_wu(), where the h2s->h2c was not performed. Due to this, nothing protects the h2s from being updated when sending a WU for a closed stream, which might only happen when acknowledging a frame after resetting that stream, which is quite unlikely. In any case if this happens, it will crash as in issue #2793 since the closed streams are purposely read-only to catch such bugs. The fix is trivial, just check h2s->h2c before deciding to update the stream. Thanks to @Wahnes for reporting this, and Christopher for spotting the cause. This needs to be backported to 3.1 only.	2024-12-05 15:25:09 +01:00
Aurelien DARRAGON	ae9d8d40d0	CLEANUP: stktable: add some stktable flags polishing Better late than never, commit 1f73d35 ("MINOR: stktable: implement "recv-only" table option") implemented stktable flags and initial definitions, but it lacks some comments plus the flag is stored as 16bits but the SKT_FL_ definition width allows for only 8bits so it is a bit confusing, let's fix that	2024-12-05 13:14:21 +01:00
Aurelien DARRAGON	9f44c5f9be	CLEANUP: stktable: replace nopurge attribute with flag Thanks to previous commit stktable struct now have a "flags" struct member Let's take this opportunity to remove the isolated "nopurge" attribute in stktable struct and rely on a flag named STK_FL_NOPURGE instead. This helps to better organize stktable struct members.	2024-12-05 12:15:31 +01:00
Aurelien DARRAGON	1f73d3524d	MINOR: stktable: implement "recv-only" table option When "recv-only" keyword is added on a stick table declaration (in peers or proxy section), haproxy considers that the table is only used for data retrieval from a remote location and not used to perform local updates. As such, it enables the retrieval of local-only values such as conn_cur that are ignored by default. This can be useful in some contexts where we want to know about local-values such as conn_cur from a remote peer. To do this, add stktable struct flags which default to NONE and enable the RECV_ONLY flag on the table when "recv-only" keyword is found in the table declaration. Then, in peer_treat_updatemsg(), when handling table updates, don't ignore data updates for local-only values if the flag is set.	2024-12-05 12:15:24 +01:00
Amaury Denoyelle	3c239b2f80	BUG/MINOR: quic: remove startup alert if GSO unsupported This patch is similar to the previous one, but for GSO support. Remove alert level message to a diag report only visible with argument -dD. This must be backported up to 3.1.	2024-12-05 11:30:31 +01:00
Amaury Denoyelle	6fed219fd7	BUG/MINOR: quic: remove startup alert if conn socket-owner unsupported QUIC relies on several advanced network API features from the kernel to perform optimally. Checks are performed during startup to ensure that these features are supported. A fallback is automatically performed for every incompatible feature. Besides the automatic fallback mechanism, a message is also reported to the user at the same time. Previously, alert level was used, but it is incorrect as it is reserved for unrecoverable errors which should prevent haproxy to start. Warning level could be used, but this can annoy users running with zero-warning mode. This patch removes the alert message when 'socket-owner connection' mode cannot be activated. Convert the message to a diag level. This allows users to start without forcing configuration modification to hide a warning. Besides, several feature fallback such as the polling mechanism does not emit any warning either, so it's better to adopt a similar behavior for QUIC features. This must be backported up to 2.8.	2024-12-05 11:30:12 +01:00
Amaury Denoyelle	08f557f0c4	BUG/MEDIUM: mux-quic: remove pacing status when everything is sent TASK_F_USR1 is used by MUX tasklet when emission has been interrupted due to pacing. When the tasklet runs again, only qcc_purge_sending() will be called as an optimization. Pacing status is only removed via qcc_wakeup(). Until then, TASK_F_USR1 is not cleared. This causes an issue after emission with pacing completion if the MUX tasklet is woken up for a recv subscribe, as qcc_wakeup() is not used by quic-conn layer. The tasklet will incorrectly run only for pacing emission, without handling reception process. Worse, a crash will occur if QCC tx frames list is empty, due to a BUG_ON() in qcc_purge_sending(). Recv subscribe is only used for 0-RTT, when QUIC MUX is instantiated before quic-conn handshake completion. Thus, this bug can only be reproduced with 0-RTT. Furthermore, MUX must already have emitted at least a few response bytes with pacing, before QUIC handshake completion. It cannot easily be reproduced, at least with CLI clients where the handshake is always already completed before MUX exchanges. To fix this, remove TASK_F_USR1 when pacing emission has been completed. At least, this prevents BUG_ON() on qcc_purge_sending() as it won't be called with an empty QCC Tx frame list anymore. However, this bug has revealed that MUX tasklet architecture is not suitable when handling both reception and emission parts. This will be improved in a future series of patches. This should fix github issue #2796. This must be backported up to 3.1.	2024-12-05 11:04:06 +01:00
Willy Tarreau	8b16b72541	BUG/MINOR: init: do not call fork_poller() for non-forked processes In 3.1-dev10, commit 8dd4efe42f ("MAJOR: mworker: move master-worker fork in init()") made the fork_poller() code unconditional, while it is only desirable for processes that have been forked from a parent (standalone daemon mode) or from a master (master-worker mode). The call can be expensive in some cases as it will create a new poller, scan and try to migrate to it all existing FDs till the highest known one. With very high numbers of FDs, this can take several seconds to start. This should be backported to 3.1.	2024-12-04 19:46:42 +01:00
Willy Tarreau	70e4938aec	BUG/MEDIUM: init: make sure only daemonized processes change their session Commit 8dd4efe42f ("MAJOR: mworker: move master-worker fork in init()") introduced some sensitive changes to the startup code (which was expected), and one sensitive change is that the second call to setsid() was accidentally made unconditional. As such it even applies to foreground processes, resulting in foreground processes being detached from the terminal and no longer responding to Ctrl-C nor Ctrl-Z. An example of this simply consists in starting haproxy -db under sudo. Then a new shell is required to stop it. This patch removes this second setsid(), as it is already done in apply_daemon_mode(). This must be backported to 3.1.	2024-12-04 19:46:42 +01:00
Frederic Lecaille	6404b7a18a	BUG/MINOR: quic: fix bbr_inflight() calls with wrong gain value This patch fixes two wrong calls to bbr_inflight(). bbr_target_inflight() aim is to compute the number of bytes BBR has to put on the network as bytes in flight (sent but not acked bytes). It must call bbr_inflight() with the current window gain value (in place of a wrong fixed 100 gain value here, in percents). bbr_is_time_to_cruise() also called bbr_inflight() with a wrong gain value as parameter due to a confusion between the value mentioned by the RFC (1 meaning 100% of the current window) and our implementation which needs value in percents (so 100 in place of 1 here). Note that bbr_is_time_to_cruise() aim is to make BBR decide to leave the probing_bw down state. The bug had as side effect to make BBR stay in this state for too long periods of time during which the bottleneck bandwidth is decreasing, leading to big oscillations between the minimum and maximum bottleneck bandwidth estimations. This patch must be backported to 3.1 where BBR was first implemented.	2024-12-04 18:47:15 +01:00
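The unit confusion described above (a gain of 1 meaning 100% in the RFC versus a gain expressed in percent in the implementation) can be shown with a deliberately simplified sketch. `inflight_for_gain()` is a hypothetical helper, not the real bbr_inflight(): it only scales a bandwidth-delay product by a percentage and omits everything else BBR does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical simplification: scale the bandwidth-delay product (bytes)
 * by a gain expressed in percent, so 100 means "100% of the BDP". */
uint64_t inflight_for_gain(uint64_t bdp_bytes, unsigned int gain_pct)
{
    return bdp_bytes * gain_pct / 100;
}
```

Passing 1 where 100 is expected shrinks the result by a factor of 100 (120 bytes instead of 12000 for a 12000-byte BDP), which is the kind of error the commit corrects.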
Willy Tarreau	e6f4f15929	MINOR: tasklet: set TASK_WOKEN_OTHER on tasklets by default Now when tasklets are woken up via tasklet_wakeup(), tasklet_wakeup_on() or tasklet_wakeup_after(), either the optional wakeup flags will be used, or TASK_WOKEN_OTHER will be used. This allows tasklet handlers waking up for any given cause to notice whether or not they were also woken for another reason. For example, a mux handler could skip heavy parts when seeing that TASK_WOKEN_OTHER is absent, proving that no standard tasklet_wakeup() was done, for example in response to a subscribe(). The benefit of the TASK_WOKEN_* flags is that they're purged during the wakeup, and that they're easy to check for using TASK_WOKEN_ANY. TASK_F_UEVT1 and TASK_F_UEVT2 are also usable for private use (e.g. wakeup from a stream to a connection inside a mux). Probably that in the future, code dealing with subscribe events should start to place TASK_WOKEN_IO like is done for upper layers.	2024-12-03 19:45:08 +01:00
Willy Tarreau	6322c9fbbf	MINOR: tools: add a new macro DEFVAL() to provide a default argument This is like DEFZERO and DEFNULL, but this one allows to specify the default value to be used as the first argument.	2024-12-03 19:45:08 +01:00
Valentine Krasnobaeva	295071007b	BUG/MINOR: startup: fix pidfile creation Pidfile should be created at the latest initialization stage, when we are sure, that process is able to start successfully, otherwise PID value, written in this file is no longer valid. So, for the standalone mode, let's move the block, which opens the pidfile and let's put it just before applying "chroot". In master-worker mode, master doesn't perform chroot. So it creates the pidfile, only when the "READY" message from the newly forked worker is received. This should be backported only in 3.1	2024-12-02 17:28:04 +01:00
Valentine Krasnobaeva	a33977da48	BUG/MINOR: startup: close pidfd and free global.pidfile in handle_pidfile() After master-worker mode refactoring, global.pidfile is only used in handle_pidfile(), which opens the provided file and writes the PID into it. So, it's more appropriate to perform the close(pidfd) and ha_free(&global.pidfile) also in this function. This commit prepares the fix of the pidfile creation, as it's created now very early, when we are not sure, that process has successfully started. In master-worker mode handle_pidfile() can be called in the master process context. So, let's make it accessible from other compilation units via global.h. This should be backported only in 3.1.	2024-12-02 17:28:04 +01:00
Valentine Krasnobaeva	d3c20b0246	BUG/MINOR: signal: register default handler for SIGINT in signal_init() When haproxy is launched in the background and in a subshell (see example below), according to the POSIX standard (2.11. Signals and Error Handling), it inherits the SIG_IGN signal handler for SIGINT and SIGQUIT from the subshell. $ (./haproxy -f env4.cfg &) So, when haproxy is launched like this, it doesn't stop upon receiving SIGINT. This can be a root cause of some unexpected timeouts when haproxy is started under VTest, as VTest sends SIGINT to the process in order to terminate it. To fix this, let's explicitly register the default signal handler for SIGINT in the signal_init() initcall. This should be backported in all stable versions.	2024-12-02 17:28:04 +01:00
Aurelien DARRAGON	70b5cd6794	MINOR: hlua: fix ambiguous hlua usage in hlua_filter_delete() In GH #2804, @Bbulatov reported that the result of hlua_stream_ctx_get() was used and de-referenced without checking if it's NULL in hlua_filter_delete() while other functions used to check for NULL before de-referencing it. In fact hlua_stream_ctx_get() can only return NULL if hlua_stream_ctx_prepare() failed or was not called on the current stream. Now because of the filter's API, since hlua_filter_delete() is mapped as detach method and hlua_filter_new() as attach method, and since hlua_filter_new() is responsible for calling hlua_stream_ctx_prepare(), there's no reason hlua_filter_delete() should be called if hlua_filter_new() failed or wasn't called. Thus we can assume that hlua can never be NULL in hlua_filter_delete(), so we add a BUG_ON() to ensure it is always the case and remove the ambiguity.	2024-12-02 17:22:51 +01:00
Aurelien DARRAGON	b167426b6b	BUG/MINOR: listener: fix potential null pointer dereference in listener_release() As reported by @Bbulatov on GH #2804, fe is found at multiple places in listener_release(): in some places it is first checked against NULL before being de-referenced while in some other places it is not, which is ambiguous and could hide a bug. In practice, fe cannot be NULL for now, but it might not be the case in the future as we want to keep the possibility to run isolated listeners (that is, without proxy attached). We've already ensured this was the case with a57786e ("BUG/MINOR: listener: null pointer dereference suspected by coverity"), but this promise was recently broken by 65ae134 ("BUG/MINOR: listener: Wake proxy's mngmt task up if necessary on session release"). Let's fix that by conditioning the block with an "else if" statement instead of a regular "else". No need for backport except if multi-connection protocols (ie: FTP) were to be backported as well.	2024-12-02 17:22:45 +01:00
William Lallemand	a582b9c18d	CI: github: allow coredumps on aws-lc and wolfssl jobs The weekly aws-lc and wolfssl jobs lack a `ulimit -c` call, which is needed to be able to get the coredumps.	2024-12-02 15:19:41 +01:00
Frederic Lecaille	7868dc9c45	BUILD: quic: fix a build error about a non-initialized timestamp This is to please some unidentified compilers which complain about a hypothetical <time_ns> variable which would not be initialized, even if this is the case only when it is not used. This build issue arrived with this commit: BUG/MINOR: improve BBR throughput on very fast links Should be backported to 3.1 with this previous commit.	2024-11-29 14:48:37 +01:00
Christopher Faulet	37487ada73	BUG/MINOR: h1-htx: Use default reason if not set when formatting the response When the response status line is formatted before sending it to the client, if there is no reason set, HAProxy should add one that matches the status code, as stated in the configuration manual. However it is not performed. It is possible to hit this bug when the response comes from a H2 server, because there is no reason field in HTTP/2 and above. This patch should fix the issue #2798. It should be backported to all stable versions.	2024-11-29 14:46:38 +01:00
Christopher Faulet	62f37801c8	BUG/MEDIUM: http-ana: Reset request flag about data sent to perform a L7 retry It is possible to lose the request after several L7 retries, leading to crashes, because the request channel flag stating some data were sent is not properly reset. When a L7 retry is performed, some flags on different entities must be reset to be sure a new connection will be properly retried, just like it was the first one, mainly because there was no connection establishment failure. One of them, on the request channel, is not reset: the flag stating some data were already sent. It is annoying because this flag is used during the connection establishment to know if an error is triggered at the connection level or at the data level. In the last case, the error must be handled by the HTTP response analyzer, to eventually perform another L7 retry. Because CF_WROTE_DATA flag is not removed when a L7 retry is performed, a subsequent connection establishment error may be handled as a L7 error while in fact the request was never sent. It also means the request was never saved in the buffer used to perform L7 retries. Thus, on the next L7 retries, the request is just lost. This inevitably leads to a bunch of undefined behavior. One of them is a crash, when the request is used to perform the load-balancing. This patch should fix issue #2793. It must be backported to all stable versions.	2024-11-29 14:46:38 +01:00
Amaury Denoyelle	9d4c26ebaa	BUG/MEDIUM: quic: prevent stream freeze on pacing On snd_buf completion, QUIC MUX tasklet is scheduled if new data has been transferred from the stream layer. Thanks to qcc_wakeup(), pacing status is removed from tasklet, which ensures next emission will reset Tx frames and use the new data. Tasklet is not scheduled if MUX is already subscribed on send due to a previous blocking condition. This is an optimization to prevent an unneeded IO handler execution. However, this causes a bug if an emission is currently delayed due to pacing. As pacing status is not removed on snd_buf, next emission process will continue emission with older data without refreshing the newly transferred one. This causes a transfer freeze. Unless there is some activity on the connection, the transfer will be eventually aborted due to idle timeout. To fix this, remove TASK_F_USR1 if tasklet wakeup is not called due to send subscription. Note that this code is also duplicated in done_ff for zero-copy transfer. This must be backported up to 3.1.	2024-11-29 14:35:10 +01:00
Aurelien DARRAGON	dd56616067	BUG/MEDIUM: event_hdl: fix uninitialized value in async mode when no data is provided In _event_hdl_publish(), when we prepare the asynchronous event and no <data> was provided (set to NULL), we forgot to initialize the _data event_hdl_async_event struct member to NULL, which leads to uninitialized reads in event_hdl_async_free_event() when the event is freed: ==1002331== Conditional jump or move depends on uninitialised value(s) ==1002331== at 0x35D9D1: event_hdl_async_free_event (event_hdl.c:224) ==1002331== by 0x1CC8EC: hlua_event_runner (hlua.c:9917) ==1002331== by 0x39AD3F: run_tasks_from_lists (task.c:641) ==1002331== by 0x39B7B4: process_runnable_tasks (task.c:883) ==1002331== by 0x314B48: run_poll_loop (haproxy.c:2976) ==1002331== by 0x315218: run_thread_poll_loop (haproxy.c:3190) ==1002331== by 0x18061D: main (haproxy.c:3747) The bug severity was set to MEDIUM because of its nature, and it's best if this patch can be backported up to 2.8. But in practise it can only be triggered with events that don't provide optional data: since PAT_REF events are the first native events making use of this feature, this bug shouldn't be an issue before f72a66e ("MINOR: pattern: publish event_hdl events on pat_ref updates")	2024-11-29 10:18:07 +01:00
Aurelien DARRAGON	4e52438c0b	BUG/MINOR: hlua_fcn: fix Patref:set() force parameter Patref:set(key, val[, force]) takes an optional "force" parameter (defaults to false) to force the entry to be created if it doesn't already exist. To retrieve the value, lua_tointeger() was used in place of lua_toboolean(), and because of that force is not enabled if "true" is passed as parameter (only numbers were recognized) despite the documentation mentioning that "force" is a boolean. To fix the issue, we replace lua_tointeger() by lua_toboolean(). Also, the doc was updated to rename "bool" to "boolean" for the "force" parameter to stay consistent with historical naming in the file. No backport needed unless 9ee37de5c ("MINOR: hlua_fcn: add Patref:set()") is.	2024-11-29 07:39:38 +01:00
Aurelien DARRAGON	e5acb03137	DOC: lua: prefer Patref:{set,add}() over legacy methods for acl and maps Patref:set() can achieve the same thing as core.set_map() Patref:add() can achieve the same thing as core.add_acl() Patref:del() can achieve the same thing as core.del_map() and core.del_acl() As a bonus, Patref:{set,add} are more efficient than their core legacy equivalent, because they don't require systematic pattern reference lookup for each individual operation. Let's mention that in the doc to encourage Patref methods adoption.	2024-11-29 07:23:59 +01:00
Aurelien DARRAGON	7ff9a1c341	MINOR: hlua_fcn: add Patref:event_sub() Just like we did for server events, in this patch we expose the PAT_REF event family (see "MINOR: event_hdl: add PAT_REF events") in Lua. Unlike server events, Patref events don't provide additional event data, and the registration can only take place from a Patref object (ie: not globally). Thanks to this commit it now becomes possible to trigger actions when updates are performed on a map (or acl list) being monitored, without the need to loop or use inefficient workarounds.	2024-11-29 07:23:53 +01:00
Aurelien DARRAGON	884dc6232a	MINOR: hlua_fcn: add Patref:add_bulk() There is no cli equivalent for this one. It is similar to Patref:add() except that it takes a table as parameter (for acl: table of keys, for maps: table of keys:values). The goal is to add multiple entries at once to limit locking time to the strict minimum. It is recommended to use this one over Patref:add() when adding multiple entries at once.	2024-11-29 07:23:48 +01:00
Aurelien DARRAGON	9ee37de5cf	MINOR: hlua_fcn: add Patref:set() Just like "set map" on the cli, the Patref:set() method (only relevant for maps) can be used to modify an existing entry's value in the pattern reference pointed to by the Lua Patref object. Lookup is performed on the key. The update will target the live pattern reference version, unless Patref:prepare() is ongoing.	2024-11-29 07:23:43 +01:00
Aurelien DARRAGON	a5f74a2a2d	MINOR: hlua_fcn: add Patref:del() Just like "del map" and "del acl" on the cli, the Patref:del() method can be used to delete an existing entry in the pattern reference pointed to by the Lua Patref object. The update will target the live pattern reference version, unless Patref:prepare() is ongoing.	2024-11-29 07:23:37 +01:00
Aurelien DARRAGON	6cc2662ce7	MINOR: hlua_fcn: add Patref:add() Just like "add map" and "add acl" on the cli, the Patref:add() method can be used to add a new entry to the pattern reference pointed to by the Lua Patref object. The update will target the live pattern reference version, unless Patref:prepare() is ongoing.	2024-11-29 07:23:32 +01:00
Aurelien DARRAGON	3bcc653ce1	MINOR: hlua_fcn: add Patref:giveup() If Patref:prepare() was used and the new version (generation) isn't going to be committed, calling Patref:giveup() will allow allocated resources to be freed and reused. It is a good habit to call this if commit() isn't called after a prepare().	2024-11-29 07:23:26 +01:00
Aurelien DARRAGON	fda5ca3472	MINOR: hlua_fcn: add Patref:purge() method It is a special Lua Patref method: it bypasses the commit/prepare logic and purges the whole pattern reference items pointed to by Patref Lua object (all versions, not just the current one). It doesn't have a cli equivalent: it leverages pat_ref_purge_range().	2024-11-29 07:23:20 +01:00
Aurelien DARRAGON	fe394598c5	MINOR: hlua_fcn: add Patref:prepare() method Just like the "prepare map" or "prepare acl" on the cli, but for Lua: it leverages the pattern API to create a subset (ie: a new generation id) that will automatically be used as target for following Patref operations (add/set/del...) until the "commit" method is invoked to atomically push the pending updates.	2024-11-29 07:23:14 +01:00
Aurelien DARRAGON	8bce7ff854	MINOR: hlua_fcn: add Patref:commit() method commit() method may be used to commit pending updates on the local patref object: hlua_patref flags were added: HLUA_PATREF_FL_GEN means the patref object has been updated and it is associated to a new revision (curr_gen) in order to prepare and commit the pending updates. upon commit, the pattern API is leveraged with curr_gen as revision to commit new object items. Once commit is performed, previous (pending) revisions that are older than the committed one are cleaned up (similar to what's done with commit on the cli). Also, Patref function APIs now take into account curr_gen to perform lookups.	2024-11-29 07:23:08 +01:00
Aurelien DARRAGON	e769d8f426	MINOR: pattern: add pat_ref_may_commit() helper function pat_ref_may_commit() may be used to know if a given generation ID is still valid, which means it may still be committed at some point. Else it means that another pending generation ID older than the tested one was already committed and thus other generation IDs below this one are stale and must be regenerated.	2024-11-29 07:23:01 +01:00
Aurelien DARRAGON	43ab25f007	MINOR: hlua_fcn: wrap pat_ref struct for patref class In order to extend the patref class features, let's wrap the pat_ref struct into hlua_patref struct. This way we may add additional data alongside the pat_ref pointer to store additional context required for pat_ref data manipulation from lua. Since the wrapper (hlua_patref) is an allocated object, we declare the _gc metamethod for patref class in order to properly cleanup resources when they are out of scope.	2024-11-29 07:22:54 +01:00
Aurelien DARRAGON	2021072391	MINOR: hlua_fcn: implement index and pair metamethods for patref class patref object may now leverage index and pair metamethods to list and access patref elements at a specific index (=key). Also, patref:is_map() method may be used to know if the patref stores acl (key only) or map-style (key:value) patterns.	2024-11-29 07:22:46 +01:00
Aurelien DARRAGON	31784efad2	MINOR: hlua: add core.get_patref method core.get_patref() method may be used to get a reference to a pattern object (pat_ref struct which is used for maps and acl storage) from Lua by providing the reference name (filename for files, or prefix+name for opt or virtual pattern references). Lua documentation was updated.	2024-11-29 07:22:38 +01:00
Aurelien DARRAGON	956a25cf60	MINOR: hlua: add patref class Implement patref class to expose the pat_ref internal pattern struct in Lua. This is some prerequisite work needed to be able to manipulate existing generic pattern object lists (acl/map) from Lua, because the Map class can only be used to perform matching ops on Map files.	2024-11-29 07:22:32 +01:00
Aurelien DARRAGON	f72a66eef2	MINOR: pattern: publish event_hdl events on pat_ref updates Now that PAT_REF events were defined in previous commit, let's actually publish them from pattern API where relevant. Unlike server events, pattern reference events are only published in the pat_ref subscriber's list on purpose, because in some setups patref updates (updates performed on a map for instance from action or cli) are very frequent, and we don't want to impact pattern API performance just for that. Moreover, as the main use case is to be able to subscribe to maps updates from Lua, allowing a per-pattern reference registration is already enough. No additional data is provided for such events (also for performance reason) Care was taken not to publish events when the update doesn't affect the live subset (the one targeted by curr_gen).	2024-11-29 07:22:25 +01:00
Aurelien DARRAGON	f7267bd315	MINOR: event_hdl: add PAT_REF events This is some prerequisite work for implementing PAT_REF events. In this commit we define the PAT_REF event_hdl family (which gets family slot id #2), with the following supported events: - EVENT_HDL_SUB_PAT_REF_ADD: element was added to the current version of the pattern ref - EVENT_HDL_SUB_PAT_REF_DEL: element was deleted from the current version of the pattern ref - EVENT_HDL_SUB_PAT_REF_SET: element was modified in the current version of the pattern ref - EVENT_HDL_SUB_PAT_REF_COMMIT: pending element(s) was/were committed in the current version of the pattern ref - EVENT_HDL_SUB_PAT_REF_CLEAR: all elements were cleared from the current version of the pattern ref The goal is to be able to track a pat_ref struct in order to be notified when it is updated. For performance reasons, events from this family won't provide any additional info, and will only be published in the pat_ref subscription list. Indeed, pat_ref may be updated at a relatively high frequency (or worse, batch work), so we cannot afford doing expensive treatment for each update.	2024-11-29 07:22:18 +01:00
Frederic Lecaille	f8b697c19b	BUG/MINOR: improve BBR throughput on very fast links This patch fixes the loss of information when computing the delivery rate (quic_cc_drs.c) on links with very low latency due to usage of 32-bit variables with millisecond precision. Initialize the quic_conn task with the TASK_F_WANTS_TIME flag to ask the scheduler to update the call date of this task. This allows this task to get a nanosecond resolution on the call date by calling task_mono_time(). This is enabled only for congestion control algorithms with delivery rate estimation support (BBR only at this time). Store the send date of each TX packet with nanosecond precision, obtained thanks to task_mono_time(), into the new ->time_sent_ns quic_tx_packet struct member. Make use of this new timestamp in the delivery rate estimation algorithm (quic_cc_drs.c). Rename the current ->time_sent member of the quic_tx_packet struct to ->time_sent_ms to make the unit used by this variable (millisecond) explicit and update the code which uses this variable. The logic found in quic_loss.c is not modified at all. Must be backported to 3.1.	2024-11-28 21:39:05 +01:00
Aurelien DARRAGON	e37976166b	MINOR: log: always consider "+M" option in lf_text_len() Historically, when lf_text_len() or lf_text() were called with a NULL string and "+M" option was set, "-" would be printed. However, if the input string was simply an empty one with len > 0, then nothing would be printed. This can happen if lf_text() is called with an empty string because in this case len is set to size (indeed, for performance reasons we don't pre-compute the length, we stop as soon as we encounter a NULL-byte) In practise, a lot of call places making use of lf_text() or lf_text_len() try their best to avoid calling lf_text() with an empty string, and instead explicitly call lf_text_len() with NULL as parameter to consider the "+M" option. But this is not enough, as shown in GH #2797, there could still be places where lf_text() is called with an empty string. In such case, instead of ignoring the "+M" option, let's check after _lf_text_len() if the returned pointer differs from the original one. If both are equal, then it means that nothing was printed (ie: result of empty string): in that case we check the "+M" option to print "-" when possible. While this commit seems harmless, it's probably better to avoid backporting it since it could break existing applications relying on the historical behavior.	2024-11-28 13:11:11 +01:00
Aurelien DARRAGON	3e470471b7	BUG/MINOR: log: fix lf_text() behavior with empty string As reported by Baptiste in GH #2797, if a logformat alias leveraging lf_text() ends up printing nothing (empty string), the whole logformat evaluation stops, leading to a garbage log message. This bug was introduced during 3.0 cycle in fcb7e4b ("MINOR: log: add lf_rawtext{_len}() functions"). At that time I genuinely thought that if strlcpy2() returned 0, it was due to a lack of space, actually forgetting that the function may simply be called with an empty string. Because of that, lf_text() would return NULL if called with an empty string, and since all lf_*() helpers are expected to return NULL on error, this explains why the logformat evaluation immediately stops in this case. To fix the issue, let's simply consider that strlcpy2() returning 0 is not an error, like it was already the case before. It should be backported in 3.1 and 3.0 with fcb7e4b.	2024-11-28 12:10:11 +01:00
Christopher Faulet	bc66d31985	MINOR: proxy: Add support of 421-Misdirected-Request in retry-on status The "421" status can now be specified on retry-on directives. PR_RE_* flags were updated to remain sorted. This patch should fix the issue #2794. It is quite simple so it may safely be backported to 3.1 if necessary.	2024-11-28 11:47:40 +01:00
Christopher Faulet	7262433183	BUG/MEDIUM: sock: Remove FD_POLL_HUP during connect() if FD_POLL_ERR is not set epoll_wait() may return EPOLLHUP and/or EPOLLRDHUP after an asynchronous connect(), to indicate that the peer accepted the connection then immediately closed before epoll_wait() returned. When this happens, sock_conn_check() is called to check whether or not the connection correctly established, and after that the receive channel of the socket is assumed to already be closed. This lets haproxy send the request at best (if RDHUP and not HUP) then immediately close. Over the last two years, there were a few reports about this spuriously happening on connections where network captures proved that the server had not closed at all (and sometimes even received the request and responded to it after haproxy had closed). The logs show that a successful connection is immediately reported on error after the request was sent. After investigations, it appeared that an EPOLLHUP, or possibly an EPOLLRDHUP, can be reported by epoll_wait() during the connect() but in sock_conn_check(), the connect() reports a success. So the connection is validated but the HUP is handled on the first receive and an error is reported. The same behavior could be observed on health-checks, leading HAProxy to consider the server as DOWN while it is not. The only explanation at this point is that it is a kernel bug, notably because it does not even match the documentation for connect() nor epoll. In addition for now it was only observed with Ubuntu kernels 5.4 and 5.15 and was never reproduced on any other one. 
We have no reproducer but here is the typical strace observed: socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 114 fcntl(114, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 setsockopt(114, SOL_TCP, TCP_NODELAY, [1], 4) = 0 connect(114, {sa_family=AF_INET, sin_port=htons(11000), sin_addr=inet_addr("A.B.C.D")}, 16) = -1 EINPROGRESS (Operation now in progress) epoll_ctl(19, EPOLL_CTL_ADD, 114, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP, data={u32=114, u64=114}}) = 0 epoll_wait(19, [{events=EPOLLIN, data={u32=15, u64=15}}, {events=EPOLLIN, data={u32=151, u64=151}}, {events=EPOLLIN, data={u32=59, u64=59}}, {events=EPOLLIN|EPOLLRDHUP, data={u32=114, u64=114}}], 200, 0) = 4 epoll_ctl(19, EPOLL_CTL_MOD, 114, {events=EPOLLOUT, data={u32=114, u64=114}}) = 0 epoll_wait(19, [{events=EPOLLOUT, data={u32=114, u64=114}}, {events=EPOLLIN, data={u32=15, u64=15}}, {events=EPOLLIN, data={u32=10, u64=10}}, {events=EPOLLIN, data={u32=165, u64=165}}], 200, 0) = 4 connect(114, {sa_family=AF_INET, sin_port=htons(11000), sin_addr=inet_addr("A.B.C.D")}, 16) = 0 sendto(114, "POST "..., 1009, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 1009 close(114) = 0 Some resources about this issue: - https://www.spinics.net/lists/netdev/msg876470.html - https://github.com/haproxy/haproxy/issues/1863 - https://github.com/haproxy/haproxy/issues/2368 So, to work around the issue, we have decided to remove the FD_POLL_HUP flag on the FD during the connection establishment if FD_POLL_ERR is not reported too in sock_conn_check(). This way, the call to connect() is able to validate or reject the connection. At the end, if the HUP or RDHUP flags were valid, either connect() would report the error itself, or the next recv() would return 0 confirming the closure that the poller tried to report. EPOLLRDHUP is only an optimization to save a syscall anyway, and this pattern is so rare that nobody will ever notice the extra call to recv(). 
Please note that at least one reporter confirmed that using poll() instead of epoll() also addressed the problem, so that can also be a temporary workaround for those discovering the problem without the ability to immediately upgrade. The event is accounted via a COUNT_IF(), to be able to spot it in future issue. Just in case. This patch should fix the issue #1863 and #2368. It may be related to #2751. It should be backported as far as 2.4. In 3.0 and below, the COUNT_IF() must be removed.	2024-11-27 12:16:25 +01:00
Willy Tarreau	eea2697e95	DEV: patchbot: prepare for new version 3.2-dev The bot will now load the prompt for the upcoming 3.2 version so we have to rename the files and update their contents to match the current version.	2024-11-26 17:24:21 +01:00
Willy Tarreau	97d33abb23	MINOR: version: this is development again (3.2) This basically reverts commit b629f366a7 ("MINOR: version: mention that 3.1 is stable now").	2024-11-26 17:21:16 +01:00
Aurelien DARRAGON	aa69a02d7f	MEDIUM: pattern: always consider gen_id for pat_ref lookup operations Historically, pat_ref lookup operations were performed on the whole pat_ref elements list. As such, set, find and delete operations on a given key would cause any matching element in pat_ref to be considered. When prepare/commit operations were added, gen_id was implemented in order to be able to work on a subset from pat_ref without impacting the current (live) version from pat_ref, until a new subset is committed to replace the current one. While the logic was good, there remained a design flaw from the historical implementation: indeed, legacy functions such as pat_ref_set(), pat_ref_delete() and pat_ref_find_elt() kept performing the lookups on the whole set of elements instead of considering only elements from the current subset. Because of this, mixing new prepare/commit operations with legacy operations could yield unexpected results. For instance, before this commit: echo "add map #0 key oldvalue" | socat /tmp/ha.sock - echo "prepare map #0" | socat /tmp/ha.sock - New version created: 1 echo "add map @1 #0 key newvalue" | socat /tmp/ha.sock - echo "del map #0 key" | socat /tmp/ha.sock - echo "commit map @1 #0" | socat /tmp/ha.sock - -> the result would be that "key" entry doesn't exist anymore after the commit, while we would expect the new value to be there instead. Thanks to the previous commits, we may finally fix this issue: for set, find_elt and delete operations, the current generation id is considered. With the above example, it means that the "del map #0 key" would only target elements from the current subset, thus elements in "version 1" of the map would be immune to the delete (as we would expect it to work).	2024-11-26 16:12:31 +01:00
Aurelien DARRAGON	010c34b8c7	MEDIUM: pattern: consider gen_id in pat_ref_set_from_node() Don't set all duplicates from a given node if they don't have the same gen_id. Indeed, now we consider the gen_id to only work on the same pattern ref revision.	2024-11-26 16:12:26 +01:00
Aurelien DARRAGON	4792f27892	MINOR: pattern: add pat_ref_gen_delete() function pat_ref_gen_delete(ref, gen_id, key) tries to delete all samples belonging to <gen_id> and matching <key> under <ref> The goal is to be able to target a single subset from <ref>	2024-11-26 16:12:21 +01:00
Aurelien DARRAGON	a131c542a6	MINOR: pattern: add pat_ref_gen_find_elt() function pat_ref_gen_find_elt(ref, gen_id, key) tries to find <elt> element belonging to <gen_id> and matching <key> in <ref> reference. The goal is to be able to target a single subset from <ref>	2024-11-26 16:12:16 +01:00
Aurelien DARRAGON	c9d6af3c6d	MINOR: pattern: add pat_ref_gen_set() function pat_ref_gen_set(ref, gen_id, value, err) modifies to <value> the sample of all patterns matching <key> and belonging to <gen_id> (generation id) under <ref> The goal is to be able to target a single subset from <ref>	2024-11-26 16:12:11 +01:00
Aurelien DARRAGON	3d250b3be8	MINOR: pattern: split pat_ref_set() Split the pat_ref_set() function into 2 distinct functions. Indeed, since 0844bed7d3 ("MEDIUM: map/acl: Improve pat_ref_set() efficiency (for "set-map", "add-acl" action perfs)"), pat_ref_set() prototype was updated to include an extra <elt> argument. But the logic behind is not explicit because the function will not only try to set <elt>, but also its duplicate (unlike pat_ref_set_elt() which only tries to update <elt>). Thus, to make it clearer and better distinguish between the key-based lookup version and the elt-based one, restore pat_ref_set() previous prototype and add a dedicated pat_ref_set_elt_duplicate() that takes <elt> as argument and tries to update <elt> and all duplicates.	2024-11-26 16:12:05 +01:00
Willy Tarreau	4d58f521ee	[RELEASE] Released version 3.2-dev0 Released version 3.2-dev0 with the following main changes : - exact copy of 3.1.0	2024-11-26 15:33:57 +01:00
Willy Tarreau	f2b97918e8	[RELEASE] Released version 3.1.0 Released version 3.1.0 with the following main changes : - BUG/MAJOR: mux-h1: Properly handle wrapping on obuf when dumping the first-line - BUILD: activity/memprofile: fix a build warning in the posix_memalign handler - BUG/MINOR: quic: Avoid BUG_ON() on ->on_pkt_lost() BBR callback call - CI: update to the latest AWS-LC version - CI: update to the latest WolfSSL version - DOC: ot: mention planned deprecation of the OT filter - Revert "CI: update to the latest WolfSSL version" - CI: github: add a WolfSSL job which tries the latest version - BUILD: systemd: fix usage of reserved name "sun" in the address field - BUILD: init: use the more portable FD_CLOEXEC for /dev/null - CI: github: improve the Wolfssl job - CI: github: improve the AWS-LC job - BUG/MINOR: mux-quic: fix show quic report of QCS prepared bytes - BUG/MEDIUM: quic: fix sending performance due to qc_prep_pkts() return - MINOR: mux-quic: use sched call time for pacing - CI: github: allow to run the Illumos job manually - BUILD: tcp_sample: var_fc_counter defined but not used - CI: github: add 'workflow_dispatch' on remaining build jobs - DOC: config: refine a little bit the text on QUIC pacing - MINOR: proto_sockpair: send_fd_uxst: init iobuf, cmsghdr, cmsgbuf to zeros - MINOR: startup: rename on_new_child_failure to mworker_on_new_child_failure - REORG: startup: move on_new_child_failure in mworker.c - MINOR: startup: prefix prepare_master and run_master with mworker_* - REORG: startup: move mworker_prepare_master in mworker.c - MINOR: startup: keep updating verbosity modes only in haproxy.c - REORG: startup: move mworker_run_master and mworker_loop in mworker.c - REORG: startup: move mworker_reexec and mworker_reload in mworker.c - MINOR: startup: prefix apply_master_worker_mode with mworker_* - REORG: startup: move mworker_apply_master_worker_mode in mworker.c - MINOR: cfgparse-quic: strengthen quic-cc-algo parsing - BUG/MAJOR: quic: fix wrong 
packet building due to already acked frames - DEV: lags/show-sess-to-flags: Properly handle fd state on server side - BUG/MEDIUM: http-ana: Don't release too early the L7 buffer - MINOR: quic: make bbr consider the max window size setting - DOC: quic: Amend the pacing information about BBR. - BUG/MEDIUM: quic: prevent EMSGSIZE with GSO for larger bufsize - MINOR: cli: Add a "help" keyword to show sess - MINOR: cli/quic: Add a "help" keyword to show quic - DOC: management: mention "show sess help" and "show quic help" - DOC: install: update the list of supported versions - MINOR: version: mention that 3.1 is stable now	2024-11-26 15:24:10 +01:00
Christopher Faulet	b629f366a7	MINOR: version: mention that 3.1 is stable now This version will be maintained up to around Q1 2026. The INSTALL file also mentions it.	2024-11-26 15:23:54 +01:00
Willy Tarreau	0a406054c7	DOC: install: update the list of supported versions OpenSSL up to 3.4 was tested, and gcc up to 14 was tested, so let's reflect this in the install doc.	2024-11-26 15:23:54 +01:00
Willy Tarreau	16022c2a7b	DOC: management: mention "show sess help" and "show quic help" These ones were recently added but we forgot to update the doc.	2024-11-26 15:00:51 +01:00
Olivier Houchard	4f973ab23a	MINOR: cli/quic: Add a "help" keyword to show quic Add a help keyword to show quic, that will provide a longer explanation of all the available options than what is provided by the command "help".	2024-11-26 14:55:30 +01:00
Olivier Houchard	5288d0f47b	MINOR: cli: Add a "help" keyword to show sess Add a help keyword to show sess, that will provide a longer explanation of all the available options than what is provided by the command "help".	2024-11-26 14:55:30 +01:00
Amaury Denoyelle	2fffd85b97	BUG/MEDIUM: quic: prevent EMSGSIZE with GSO for larger bufsize A UDP datagram cannot be greater than 65535 bytes, as the UDP length header field is encoded on 2 bytes. As such, sendmsg() will reject a bigger input with error EMSGSIZE. By default, this does not cause any issue as QUIC datagrams are limited to 1252 bytes and sent individually. However, with GSO support, values bigger than 1252 bytes are specified on sendmsg(). If using a bufsize equal to or greater than 65535, the syscall could reject the input buffer with EMSGSIZE. As this value is not expected, the connection is immediately closed by haproxy and the transfer is interrupted. This bug can easily be reproduced by requesting a large object on the loopback interface and using a bufsize of 65535 bytes. In fact, the limit is slightly less than 65535, as extra room is also needed for IP + UDP headers. Fix this by reducing the count of datagrams encoded in a single GSO invocation via qc_prep_pkts(). Previously, it was set to 64 as specified by man 7 udp. However, with 1252-byte datagrams, this is still too many. Reduce it to a value of 52. Input to sendmsg will thus be restricted to at most 65104 bytes if the last datagram is full. If there is still data available for encoding in qc_prep_pkts(), it will be written in a separate batch of datagrams. qc_send_ppkts() will then loop over the whole QUIC Tx buffer and call sendmsg() for each series of at most 52 datagrams. This does not need to be backported.	2024-11-26 11:49:30 +01:00
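The arithmetic behind the limit of 52 datagrams can be checked with a minimal sketch. The helper name gso_max_dgrams() and the hdr_room parameter are hypothetical and are not taken from the HAProxy sources; the sketch only assumes the 16-bit UDP length limit and the 1252-byte datagram size quoted in the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Largest value that fits in the 16-bit UDP length field. */
#define UDP_LEN_MAX 65535u

/* Hypothetical helper (not an HAProxy function): returns how many
 * datagrams of <dgram_size> bytes fit in a single sendmsg() call while
 * keeping <hdr_room> bytes free for the IP + UDP headers.
 */
static size_t gso_max_dgrams(size_t dgram_size, size_t hdr_room)
{
	assert(dgram_size > 0 && hdr_room < UDP_LEN_MAX);
	return (UDP_LEN_MAX - hdr_room) / dgram_size;
}
```

With 1252-byte datagrams and 48 bytes reserved for headers (an assumed IPv6 + UDP budget), the helper yields 52, which matches the value chosen by the patch: 52 * 1252 = 65104 bytes, whereas the former count of 64 would give 80128 bytes.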
Frederic Lecaille	3cee8d7830	DOC: quic: Amend the pacing information about BBR. BBR handles its own burst size itself (mentioned as send_quantum in BBR RFC).	2024-11-26 08:00:58 +01:00
Frederic Lecaille	a3248a39eb	MINOR: quic: make bbr consider the max window size setting Limit the BBR congestion control window size as this is done for all the other congestion control algorithms with tune.quic.frontend.default-max-window-size or as first argument passed to "bbr" option for "quic-cc-algo".	2024-11-26 08:00:58 +01:00
Christopher Faulet	dc15581c02	BUG/MEDIUM: http-ana: Don't release too early the L7 buffer In some cases, the buffer used to store the request to be able to perform a L7 retry is released too early, leading to a crash because a retry is performed with an empty request. First, there is a test on invalid 101 responses that may be caught by the "junk-response" retry policy. Then, it is possible to get an error (empty-response, bad status code...) after an interim response. In both cases, the L7 buffer is already released while it should not. To fix the issue, the L7 buffer is now released at the end of the AN_RES_WAIT_HTTP analyser, but only when a response was successfully received and processed. In all error cases, the stream is quickly released, with the L7 buffer. So there is no leak and it is safer this way. This patch may fix the issue #2793. It must be backported as far as 2.4.	2024-11-25 22:18:19 +01:00
Christopher Faulet	ceb80aed57	DEV: lags/show-sess-to-flags: Properly handle fd state on server side It must be handled as a hexadecimal value.	2024-11-25 21:57:30 +01:00
Frederic Lecaille	96b2641fc8	BUG/MAJOR: quic: fix wrong packet building due to already acked frames If a packet build was asked to probe the peer with frames which have just been acked, the frames build run by qc_build_frms() could be cancelled by qc_stream_frm_is_acked() whose aim is to check that current frames to be built have not been already acknowledged. In this case the packet build run by qc_do_build_pkt() is not interrupted, leading to the build of an empty packet which should be ack-eliciting. This is a bug detected by the BUG_ON() statement in qc_do_build_pkt(): BUG_ON(qel->pktns->tx.pto_probe && !(pkt->flags & QUIC_FL_TX_PACKET_ACK_ELICITING)); Thank you to @Tristan971 for having reported this issue in GH #2709. This is an old bug which must be backported as far as 2.6.	2024-11-25 18:55:45 +01:00
Amaury Denoyelle	d41273c633	MINOR: cfgparse-quic: strengthen quic-cc-algo parsing quic-cc-algo is a bind keyword which is used to specify the congestion control algorithm. It is parsed via function bind_parse_quic_cc_algo(). The parsing function was too lax as it used strncmp for algo token matching. This could cause surprises when specifying an invalid algorithm whose name starts identically to another entry. This is especially true if extra parameters are specified in parentheses, as in this case the parameter values will be completely ignored and default values used instead. To fix this, convert algo argument to ist. Then, use istsplit() to extract algo token from the optional extra arguments and compare the whole value with isteq().	2024-11-25 16:19:54 +01:00
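The difference between prefix matching and whole-token matching can be illustrated in plain C. This is only a sketch of the idea: the function names algo_match_lax() and algo_match_strict() are hypothetical, and the standard string functions are used here in place of HAProxy's ist API (istsplit(), isteq()), whose exact signatures are not shown in the commit message.

```c
#include <assert.h>
#include <string.h>

/* Lax matching in the style the commit describes: only the first
 * strlen(name) bytes are compared, so "cubicx(10m)" matches "cubic".
 */
static int algo_match_lax(const char *arg, const char *name)
{
	assert(arg && name);
	return strncmp(arg, name, strlen(name)) == 0;
}

/* Strict matching: isolate the token that precedes the optional
 * "(...)" parameter list, then compare it as a whole with <name>.
 */
static int algo_match_strict(const char *arg, const char *name)
{
	size_t tok_len;

	assert(arg && name);
	tok_len = strcspn(arg, "(");
	return tok_len == strlen(name) && memcmp(arg, name, tok_len) == 0;
}
```

With the lax variant, an input such as "cubicx(10m)" is accepted as "cubic"; the strict variant rejects it while still accepting "cubic" and "cubic(10m)".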
Valentine Krasnobaeva	3500865bc1	REORG: startup: move mworker_apply_master_worker_mode in mworker.c mworker_apply_master_worker_mode() is called only in master-worker mode, so let's move it mworker.c	2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva	3899a7ecaa	MINOR: startup: prefix apply_master_worker_mode with mworker_* This patch prepares the move of apply_master_worker_mode in mworker.c. So, let's at first rename it to mworker_apply_master_worker_mode.	2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva	dee247c14e	REORG: startup: move mworker_reexec and mworker_reload in mworker.c Let's move mworker_reexec() and mworker_reload() in mworker.c. mworker_reload() is called only within functions which are already in mworker.c. So, this reorganization allows mworker_reload() to be declared static.	2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva	0c7b93eb1d	REORG: startup: move mworker_run_master and mworker_loop in mworker.c mworker_run_master() is called only in master mode. mworker_loop() is static and called only in mworker_run_master(). So let's move both functions in mworker.c. We also need here to make run_thread_poll_loop() accessible from other units, as it's used in mworker_loop().	2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva	56894db000	MINOR: startup: keep updating verbosity modes only in haproxy.c This commit prepares the move of mworker_run_master() in mworker.c. Let's remove from its definition the code which adjusts verbosity depending on other global run time modes (daemon or foreground). This part should stay in main(), where all verbosity modes are handled for different mode combinations.	2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva	7974089ac6	REORG: startup: move mworker_prepare_master in mworker.c mworker_prepare_master() performs some preparation routines for the new worker process, which will be forked during the startup. It's called only in master-worker mode, so let's move it in mworker.c.	2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva	41cc1fe310	MINOR: startup: prefix prepare_master and run_master with mworker_* This patch prepares the move of prepare_master() and run_master() definitions into mworker.c. So, let's first prefix their names with mworker_*.	2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva	af642420b4	REORG: startup: move on_new_child_failure in mworker.c mworker_on_new_child_failure() performs some routines for the worker process, if it has failed the reload. As it's called only in mworker_catch_sigchld() from mworker.c, let's move mworker_on_new_child_failure() in mworker.c as well. Like this it could also be declared as a static.	2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva	321c021a83	MINOR: startup: rename on_new_child_failure to mworker_on_new_child_failure This patch prepares the moving of on_new_child_failure definition into mworker.c. So, let's rename it accordingly and let's also update its description.	2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva	10c14a1ed0	MINOR: proto_sockpair: send_fd_uxst: init iobuf, cmsghdr, cmsgbuf to zeros In master-worker mode, the worker process now uses send_fd_uxst() to send the '_send_status' command to the master. Since the refactoring, this started to trigger the following Valgrind reports: ==810584== Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s) ==810584== at 0x4AAC99D: __libc_sendmsg (sendmsg.c:28) ==810584== by 0x4AAC99D: sendmsg (sendmsg.c:25) ==810584== by 0x56350F: send_fd_uxst (proto_sockpair.c:271) ==810584== by 0x3AA25C: main (haproxy.c:4151) ==810584== Address 0x1ffefffbfe is on thread 1's stack ==810584== in frame #1, created by send_fd_uxst (proto_sockpair.c:241) ==810584== ==810584== Syscall param sendmsg(msg.msg_control) points to uninitialised byte(s) ==810584== at 0x4AAC99D: __libc_sendmsg (sendmsg.c:28) ==810584== by 0x4AAC99D: sendmsg (sendmsg.c:25) ==810584== by 0x56350F: send_fd_uxst (proto_sockpair.c:271) ==810584== by 0x3AA25C: main (haproxy.c:4151) ==810584== Address 0x1ffefffc14 is on thread 1's stack ==810584== in frame #1, created by send_fd_uxst (proto_sockpair.c:241) ==810584== So, let's initialize with zeros all buffers which are passed to the sendmsg() syscall used in send_fd_uxst(), to avoid these Valgrind messages. They increase the Valgrind output and could hide other, more important reports.	2024-11-25 15:20:24 +01:00
Willy Tarreau	7fb98e833c	DOC: config: refine a little bit the text on QUIC pacing The QUIC pacing options changed a few times during their development. For example the unit is now in datagrams not bytes. Also a few sentences were slightly ambiguous so let's reword this. No backport is needed.	2024-11-25 14:54:16 +01:00
William Lallemand	dee3f4b3ff	CI: github: add 'workflow_dispatch' on remaining build jobs Add 'workflow_dispatch' on the remaining scheduled build jobs that do not have it. This keyword allows a job to be started manually from the "Actions" interface in github.	2024-11-25 14:03:13 +01:00
William Lallemand	da1331b0b5	BUILD: tcp_sample: var_fc_counter defined but not used var_fc_counter is not used on Illumos and emits a warning: src/tcp_sample.c:291:12: warning: ‘var_fc_counter’ defined but not used [-Wunused-function] 291 \| static int var_fc_counter(struct arg args, char *err) \| ^~~~~~~~~~~~~~ Let's add an ifdef to build it.	2024-11-25 11:41:26 +01:00
William Lallemand	079193e375	CI: github: allow to run the Illumos job manually Add the "workflow_dispatch" option to the Illumos CI so it can be run manually from the github actions page.	2024-11-25 11:30:55 +01:00
Amaury Denoyelle	22bd92a87f	MINOR: mux-quic: use sched call time for pacing QUIC pacing was recently implemented to limit burst and improve overall bandwidth. This is used only for MUX STREAM emission. Pacing requires nanosecond resolution. As such, it used now_cpu_time() which relies on clock_gettime() syscall. The usage of clock_gettime() has several drawbacks : * it is a syscall and thus requires a context-switch which may hurt performance * it is not available on all systems * timestamp is retrieved multiple times during a single task execution, thus yielding different values which may tamper with pacing calculation Improve this by using task_mono_time() instead. This returns task call time from the scheduler thread context. It requires the flag TASK_F_WANTS_TIME on QUIC MUX tasklet to force the scheduler to update call time with now_mono_time(). This solves all the limitations listed above : * syscall invocation is only performed once before tasklet execution, thus reducing context-switch impact * on non compatible system, a millisecond timer is used as a fallback which should ensure that pacing works decently for them * timer value is now guaranteed to be fixed during task execution	2024-11-25 11:21:45 +01:00
Amaury Denoyelle	044452546e	BUG/MEDIUM: quic: fix sending performance due to qc_prep_pkts() return qc_prep_pkts() is a QUIC transport level function which encodes one or several datagrams in a buffer before sending them. It returns the number of encoded datagrams. This is especially important when pacing is used to limit packet bursts. This datagram accounting was not trivial as qc_prep_pkts() used several code paths depending on the condition of the current encoded packet. Thus, there were several places where the local variable dgram_cnt could have been incremented. This was implemented by the following commit : commit 5cb8f8a6224db96f4386277c41ddae4a29a4130d MINOR: quic: support a max number of built packet per send iteration However, there is a bug due to a missing increment when all frames from the current QEL have been encoded. In this case, the encoding continues in the same datagram to coalesce a future packet. However, if this is the last QEL, the encoding loop will then break. As first_pkt is not NULL, qc_txb_store() is called outside but dgram_cnt is not yet incremented. In particular, this causes qc_prep_pkts() to return 0 when there are only small STREAM frames to emit for application QEL. In qc_send(), this is interpreted as a value which prevents further emission for the current invocation. Thus, it may hurt performance, both without and with pacing. To fix this, remove the multiple dgram_cnt increments. Now, it is modified only in a single place which should cover every case, and render the code easier to validate. The most notable case where the bug is visible is when using cubic with pacing without any burst, with quic-cc-algo cubic(,1). First, transfer bandwidth on average was suboptimal, with significant variation. Worse, it could sometimes fall dramatically for a particular stream without recovering before returning to an expected level on the next one. No need to backport.	2024-11-25 11:21:28 +01:00
Amaury Denoyelle	3704e0e174	BUG/MINOR: mux-quic: fix show quic report of QCS prepared bytes On show quic, each MUX stream is listed with its various indicators for buffering on Rx and Tx. In particular, txoff displays in parentheses the current level of data prepared by the upper stream instance not yet emitted by QUIC transport layer. This value is only accessible after a subtraction. However, there was a typo which caused the result to be always 0. Fix this by reusing the correct offsets in the calculation. This should be backported up to 3.0.	2024-11-25 11:21:28 +01:00
William Lallemand	a7e5180c71	CI: github: improve the AWS-LC job Like the WolfSSL job, improve the AWS-LC job by adding the socat command so all SSL reg-tests can be run. Also add gdb and output of corefiles.	2024-11-25 11:14:33 +01:00
William Lallemand	b0c2745ed0	CI: github: improve the Wolfssl job Improve the WolfSSL job by adding the missing socat command. Also add gdb and output corefiles like it's done on the VTest job.	2024-11-25 11:00:03 +01:00
Willy Tarreau	a3613d239b	BUILD: init: use the more portable FD_CLOEXEC for /dev/null In 3.1-dev10, commit 8dd4efe42f ("MAJOR: mworker: move master-worker fork in init()"), the FD associated to /dev/null was made CLOEXEC using O_CLOEXEC. Unfortunately this is not portable on older OSes, doesn't build on Solaris for example, and was even reported as breaking moderately old Linux OSes for other projects. Better not use it unless absolutely certain it will work (currently we only use it for Linux namespaces, which are optional), and use the conventional FD_CLOEXEC instead. No backport is needed.	2024-11-25 08:46:29 +01:00
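The portable approach mentioned above can be sketched as follows: open the file without O_CLOEXEC, then set FD_CLOEXEC with fcntl(). The helper name open_cloexec() is hypothetical and this is not the code from the patch, only an illustration of the conventional POSIX sequence.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: opens <path> with <flags>, then marks the
 * descriptor close-on-exec using the F_GETFD/F_SETFD pair instead of
 * relying on O_CLOEXEC at open() time. Returns the fd, or -1 on error.
 */
static int open_cloexec(const char *path, int flags)
{
	int fd, fdflags;

	assert(path != NULL);
	fd = open(path, flags);
	if (fd < 0)
		return -1;

	fdflags = fcntl(fd, F_GETFD);
	if (fdflags < 0 || fcntl(fd, F_SETFD, fdflags | FD_CLOEXEC) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

The trade-off is a short window between open() and fcntl() during which the descriptor is not yet close-on-exec, which matters only if another thread may fork and exec in between.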
Willy Tarreau	f0548302bb	BUILD: systemd: fix usage of reserved name "sun" in the address field systemd.c doesn't build on Solaris / Illumos because it uses "sun" as the field name in a structure, while "sun" is the name of the macro used to detect Solaris: src/systemd.c: In function 'sd_notify': src/systemd.c:43:22: error: expected identifier or '(' before numeric constant struct sockaddr_un sun; ^ src/systemd.c:44:2: warning: no semicolon at end of struct or union } socket_addr = { ^ Admittedly, the OS could have instead defined "sun" to itself to avoid this. Any other name will work, let's just use "ux" for the short form of "unix". The problem appeared in 3.0-dev with commit aa3632962f ("MEDIUM: mworker: get rid of libsystemd"), though by then this file was only built when USE_SYSTEMD was set, which was not the case for non-linux platforms. However since 3.1-dev14 with commit 15845247db ("MEDIUM: mworker: remove USE_SYSTEMD requirement for -Ws"), all platforms now build this file. No backport is needed even though it will not hurt to have it in 3.0 for completeness.	2024-11-25 08:09:09 +01:00
William Lallemand	a941c92c12	CI: github: add a WolfSSL job which tries the latest version Like the AWS-LC job, add a CI job which looks for the latest WolfSSL version and tries to build it. The patch adds a function which determines the latest version of WolfSSL from the github tag, and the yml which describes the job.	2024-11-22 17:40:34 +01:00
William Lallemand	16e44e70c8	Revert "CI: update to the latest WolfSSL version" This reverts commit 03f57fcf94dae61906b56d10d1fb21f7afaae4fc. Looks like the 5.7.4 version is broken with HAProxy, let's revert the CI for now.	2024-11-22 16:24:23 +01:00
Willy Tarreau	450528b9f5	DOC: ot: mention planned deprecation of the OT filter Miroslav mentioned below that he's currently working on an OpenTelemetry replacement for the OpenTracing filter since OpenTracing itself is no longer maintained nor supported: https://github.com/haproxy/haproxy/issues/2782#issuecomment-2493576327 Given that he aims for 3.2, let's already settle on an upcoming deprecation of the filter for 3.3 with a removal for 3.5. This will leave time to finish the development and permit users to switch smoothly. At this point no warning is emitted (since the users have no alternative) but better mention this plan in the doc to make them aware of future changes.	2024-11-22 16:11:51 +01:00
William Lallemand	03f57fcf94	CI: update to the latest WolfSSL version Update the CI to the 5.7.4 WolfSSL version.	2024-11-22 16:05:32 +01:00
William Lallemand	0022962ecb	CI: update to the latest AWS-LC version Update the CI to the 1.39.0 AWS-LC version.	2024-11-22 16:03:28 +01:00
Frederic Lecaille	7472990f86	BUG/MINOR: quic: Avoid BUG_ON() on ->on_pkt_lost() BBR callback call The per-packet delivery rate sample is applied to ack-eliciting packets only, by calling the ->drs_on_transmit() BBR callback. So, ->on_pkt_lost(), which inspects the delivery rate sampling information during packet loss detection, must not be called for non ack-eliciting packets. Otherwise, it would face uninitialized variables with a high chance of triggering a BUG_ON(). As BBR is implemented in the current development version, there is no need to backport this patch.	2024-11-22 15:51:29 +01:00
Willy Tarreau	b30639848e	BUILD: activity/memprofile: fix a build warning in the posix_memalign handler A "return NULL" statement was placed for error handling in the posix_memalign() handler instead of an int errno value, by recent commit 5ddc8b3ad4 ("MINOR: activity/memprofile: monitor non-portable calls as well"). Surprisingly the warning only triggered on gcc-4.8. Let's use ENOMEM instead. No backport needed.	2024-11-22 09:42:49 +01:00
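The contract at stake is that posix_memalign() returns 0 or an errno value as an int, and never a pointer. A minimal sketch of a wrapper respecting that contract follows; wrapped_memalign() and its simulate_failure parameter are hypothetical and only stand in for the error path of the real handler, which is not shown in the commit message.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical wrapper with the same return contract as
 * posix_memalign(): 0 on success, an errno value on failure.
 * <simulate_failure> only exists to exercise the error path here.
 */
static int wrapped_memalign(void **ptr, size_t align, size_t size,
                            int simulate_failure)
{
	assert(ptr != NULL);
	if (simulate_failure)
		return ENOMEM; /* an int error code, not "return NULL" */
	return posix_memalign(ptr, align, size);
}
```

Returning NULL from such a function would be converted to the integer 0, which callers interpret as success; this is why an explicit error code such as ENOMEM is needed.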
Christopher Faulet	b150ae46dd	BUG/MAJOR: mux-h1: Properly handle wrapping on obuf when dumping the first-line The formatting of the first-line, for a request or a response, does not properly handle the wrapping of the output buffer. This may lead to a data corruption for the current response or eventually for the previous one. Utility functions used to format the first-line of the request or the response rely on the chunk API. So it is not expected to pass a buffer that wraps. Unfortunately, because of a change performed during the 2.9 dev cycle, the output buffer was directly used instead of a non-wrapping buffer created from it with b_make() function. It is not an issue for the request because its start-line is always the first block formatted in the output buffer. But for the response, the output may be not empty and may wrap. In that case, the response start-line is dumped at a random position in the buffer, corrupting data. AFAIK, it is only an issue if the HTTP request pipelining is used. To fix the issue, we now take care to create a non-wrapping buffer from the output buffer. This patch should fix issues #2779 and #2996. It must be backported as far as 2.9.	2024-11-22 08:48:53 +01:00
Willy Tarreau	c5d0342fa2	[RELEASE] Released version 3.1-dev14 Released version 3.1-dev14 with the following main changes : - MINOR: acl: export find_acl_default() - MINOR: sample: extend the "when" converter to support an ACL - MINOR: cfgparse: parse tune.{rcvbuf,sndbuf}.{client,server} as sizes - MINOR: cfgparse: parse tune.{rcvbuf,sndbuf}.{frontend,backend} as sizes - MINOR: cfgparse: parse tune.pipesize as a size - MINOR: cfgparse: parse tune.recv_enough as a size - MINOR: cfgparse: parse tune.bufsize as a size - MINOR: cfgparse: parse tune.bufsize.small as a size - REGTESTS: silence the "log format ignored" warnings - REGTESTS: silence warning "previous 'http-response' action is final" - REGTESTS: make the unit explicit for very short timeouts - REGTESTS: silence warnings about content-type being ignored - REGTESTS: remove a duplicate "option httpslog" in the defaults section - REGTESTS: silence warning "L6 sample fetches ignored" in cond_set_var - REGTESTS: add missing timeouts to 30 tests - REGTESTS: only use tune.ssl.default-dh-param when not using AWS-LC - REGTESTS: enable -dW on almost all tests to fail on warnings - MEDIUM: config: warn on unitless timeouts < 100 ms - MINOR: tools: make parse_size_err() support 32/64 bits - MINOR: ring: support unit suffixes in the size - MINOR: cfgparse-global: parse options to allow non std keywords in discovery mode - BUG/MINOR: mworker-prog: don't warn about deprecated section with expose-deprecated-directives - MINOR: cli: make "show env" accessible via master CLI without enabling debug - MINOR: config: show HAPROXY_BRANCH in "show env" output - MINOR: http-ana: Add option to keep query-string on a localtion-based redirect - MINOR: http-ana: Add support for "set-cookie-fmt" option to redirect rules - MINOR: agent-check: Be able to set absolute weight via an agent - MINOR: stream: Add an option to "show sess" command to dump the captured URI - DOC: config: A a space before ':' for {bs,fs}.aborted and {bs,fs}.rst_code 
- DOC: config: Fix a typo in "1.3.1. The Request line" - MINOR: http: Add support for HTTP 414/431 status codes - DEV: phash: Update 414 and 431 status codes to phash - MINIR: mux-h1: Return 414 or 431 when appropriate - BUG/MINOR: http_ana: Report -1 for %Tr for invalid response only - DOC: config: Slightly improve the %Tr documentation - DOC: config: Move wait_end in section about internal samples - DOC: config: Move fs.* and bs.* in section about L5 samples - MINOR: stats-file: add the filename in the warning - MEDIUM: stats-file: explicitely ignore comments starting by // - DOC: quic: rename max-window-size as with default prefix - MINOR: mux-quic: add missing values for show flags - MINOR: quic: simplify qc_prep_pkts() exit path - MINOR: quic: support a max number of built packet per send iteration - MINOR: quic: extend qc_send_mux() return type with a dedicated enum - MINOR: quic: define quic_pacing module - MINOR: quic/pacing: implement quic_pacer engine - MINOR: quic/pacing: support pacing emission on quic_conn layer - MINOR: quic/pacing: add burst support - MINOR: mux-quic: define a tx STREAM frame list member - MINOR: mux-quic: encapsulate QCC tasklet wakeup - MAJOR: mux-quic: support pacing emission - MINOR: quic: use dynamic cc_algo on bind_conf - MINOR: quic: extend quic-cc-algo optional parameters - MEDIUM: quic: define cubic-pacing congestion algorithm - MINOR: mux_quic/pacing: display pacing info on show quic - MEDIUM: stats-file: silently ignore be/fe mistmatch - REGTESTS: use -dW by default on every reg-tests - DOC: lua: fix yield-dependent methods expected contexts - DOC: sched: add missing scheduler API documentation for tasklet_wakeup_after() - DOC: sched: document the missing TASK_F_UEVT* flags - CLEANUP: tinfo: move sched__date/_mono_time to the thread-local area - MINOR: stream: don't update s->lat_time when the wakeup date is not set - MINOR: tinfo/clock: turn sched_call_date to 64-bits - MINOR: sched: add TASK_F_WANTS_TIME to make the 
scheduler update the call date - MINOR: tools: add new macro DEFZERO to provide a default zero argument - MINOR: tasklet: make the low-level tasklet API take a flag - MINOR: tasklet: support an optional set of wakeup flags to tasklet_wakeup_on() - DOC: configuration: explain the rules regarding spaces in arguments - DOC: configuration: explain quotes and spaces in conditional blocks - DOC: configuration: wrap long line for "strstr()" conditional expression - BUG/MINOR: http-ana: Adjust the server status before the L7 retries - MINOR: http-fetch: Add an option to 'query" to get the QS with the '?' - BUG/MINOR: cfgparse-quic: fix renaming of max-window-size - MEDIUM: mworker: remove USE_SYSTEMD requirement for -Ws - CI: vtest: temporarily build from the sd-notify PR - MINOR: systemd: replace SOCK_CLOEXEC by fcntl call to FD_CLOEXEC - BUILD: makefile: make ERR apply to build options as well - MINOR: startup: set HAPROXY_LOCALPEER only once - DOC: configuration: update "Environment variables" chapter - DOC: config: indent the list of environment variables - OPTION: map/hlua: make core.set_map() lookup more efficient - REGTESTS: switch to -Ws for master-worker reg-tests - REGTESTS: disable temporarly mworker test on OSX - MINOR: quic: Add the congestion window initial value to QUIC path - MINOR: window_filter: Implement windowed filter (only max) - MINOR: quic: implement delivery rate sampling algorithm - MINOR: quic: implement BBR congestion control algorithm for QUIC - MINOR: quic: quic_cc modifications to support BBR - MINOR: quic: quic_loss modifications to support BBR - MINOR: quic: RX part modifications to support BBR - MINOR: quic: TX part modifications to support BBR. 
- MINOR: quic: add "bbr" new "quic-cc-algo" option - BUG/MEDIUM: mux-h2: Increase max number of headers when encoding HEADERS frames - BUG/MEDIUM: mux-h2: Check the number of headers in HEADERS frame after decoding - BUG/MEDIUM: h3: Properly limit the number of headers received - BUG/MEDIUM: h3: Increase max number of headers when sending headers - DOC: config: Improve documentation of tune.http.maxhdr directive - DOC: management: Clearly state "show errors" only reports malformed H1 messages - BUILD: makefile: build flags.c before haproxy to speed up the build - BUILD: makefile: reorder object files by build time - MINOR: config: Improve warnings on misplaced rules by adding an optional arg - CLEANUP: cfgparse: Add direction in functions name that warn on misplaced rules - MINOR: cfgparse: Emit a warning for misplaced "tcp-response content" rules - BUG/MINOR: cfgparse-quic: fix bbr initialization - MINOR: cfgparse-quic: activate pacing only via burst argument - MINOR: quic: Useless rate sample member initialization - BUG/MINOR: cfgparse-quic: fix warning for cc-aglo with 0 burst - MINOR: quic: support pacing for newreno and nocc - BUG/MINOR: quic: Missing application limitations tracking for BBR - MINOR: cfgparse-global: add cfg_parse_global_chroot - MINOR: cfgparse-global: add more checks for "chroot" argument - BUG/MINOR: startup: fix UAF when set the default for log_tag - MINOR: capabilities: rename program_name argument to progname - MINOR: startup: use global progname variable - MINOR: cfgparse-global: add cfg_parse_global_localpeer - BUG/MINOR: config: allow to check HAPROXY_LOCALPEER in config - BUG/MINOR: startup: init_early: remove obsolete comment - BUG/MEDIUM: debug: don't set the STUCK flag from debug_handler() - BUG/MEDIUM: wdt: fix the stuck detection for warnings - BUG/MINOR: activity/memprofile: reinitialize the free calls on DSO summary - MINOR: activity/memprofile: offer a function to unregister stale info - BUG/MEDIUM: pools/memprofile: always 
clean stale pool info on pool_destroy() - MINOR: activity: better report nil than ffff in unknown callers - CLEANUP: activity: better use a mask to tests freeing methods - MINOR: activity/memprofile: also monitor strdup() activity - MINOR: activity/memprofile: monitor non-portable calls as well - MINOR: activity: interrupt the show profile dump more often - MINOR: tools: resolve main() only once in resolve_sym_name() - MINOR: tools: add a new function "resolve_dso_name" to find a symbol's DSO - MINOR: activity/memprofile: use resolve_dso_name() for the DSO summary - REGTESTS: relax strerror matching to avoid a failure on libmusl - REGTESTS: don't rely on the base64 utility when openssl base64 is already used	2024-11-21 23:26:41 +01:00
Willy Tarreau	a89a2d8902	REGTESTS: don't rely on the base64 utility when openssl base64 is already used Regtest ocsp_auto_update.vtc used to fail here on FreeBSD because the base64 utility was not installed by default. Once installed it would still fail because the utility doesn't support -w to wrap lines. Since the regtest already relies on openssl base64 for a few commands, let's just rely on it for the other ones. The only limitation is that openssl freezes on lines longer than 1024 bytes, and doesn't seem to process more than 255 chars at once, which might be the reason for using base64 -w 1000 in the first place (the script was probably tested like this). Instead sed is efficient at wrapping long lines and does the job pretty well. The output was fixed at 72 chars so that the output is also readable on a terminal for debugging.	2024-11-21 21:10:09 +01:00
Willy Tarreau	a1ace74b7e	REGTESTS: relax strerror matching to avoid a failure on libmusl The regtest 4be_1srv_smtpchk_httpchk_layer47errors.vtc fails on musl because it reports "Network unreachable" for -EUNREACH while the check matches "Network is unreachable" as on other OSes. Let's just replace " is" with ".*". It now works on both glibc and musl.	2024-11-21 20:26:46 +01:00
Willy Tarreau	ead0b0154b	MINOR: activity/memprofile: use resolve_dso_name() for the DSO summary Let's simplify the code by making use of this simpler and sometimes more efficient variant.	2024-11-21 19:58:06 +01:00
Willy Tarreau	670507a66e	MINOR: tools: add a new function "resolve_dso_name" to find a symbol's DSO In the memprofile summary per DSO, we currently have to pay a high price by calling dladdr() on each symbol when doing the summary per DSO at the end, while we're not interested in these details, we just want the DSO name which can be made cheaper to obtain, and easier to manipulate. So let's create resolve_dso_name() to only extract minimal information from an address. At the moment it still uses dladdr() though it avoids all the extra expensive work, and will further be able to leverage the same mechanism as "show libs" to instantly spot DSO from address ranges.	2024-11-21 19:58:06 +01:00
Willy Tarreau	a205a91bb3	MINOR: tools: resolve main() only once in resolve_sym_name() resolve_sym_name() calls dladdr(main) for each symbol in order to compare the first address with other symbols. But this is pointless and quite expensive in outputs to "show profiling" for example. Let's just keep a local copy and have a variable indicating if the resolution is needed/in progress/done to save the value for subsequent calls.	2024-11-21 19:58:06 +01:00
Willy Tarreau	9a8b834435	MINOR: activity: interrupt the show profile dump more often The calls to resolve_sym_name() can be a bit expensive. Forcing to yield more often is better for the latency and will avoid the watchdog reporting warnings. Note that it's still called in the sort at the end, but that one cannot be avoided. At best we could try to rely on the list of libs but that's not trivial and not always present.	2024-11-21 19:58:06 +01:00
Willy Tarreau	5ddc8b3ad4	MINOR: activity/memprofile: monitor non-portable calls as well Some dependencies might very well rely on posix_memalign(), strndup() or other less portable calls, making us miss them when chasing memory leaks, resulting in negative global allocation counters. Let's provide the handlers for the following functions: strndup() // _POSIX_C_SOURCE >= 200809L || glibc >= 2.10 valloc() // _BSD_SOURCE || _XOPEN_SOURCE>=500 || glibc >= 2.12 aligned_alloc() // _ISOC11_SOURCE posix_memalign() // _POSIX_C_SOURCE >= 200112L memalign() // obsolete pvalloc() // obsolete This time we don't fail if they're not found, we just silently forward the calls.	2024-11-21 19:58:06 +01:00
Willy Tarreau	33c0ce299d	MINOR: activity/memprofile: also monitor strdup() activity Some memory profiling outputs have shown negative counters, very likely due to some libs calling strdup(). Let's add it to the list of monitored activities. Actually even haproxy itself uses some. Having "profiling.memory on" in the config reveals 35 call places.	2024-11-21 19:58:06 +01:00
Willy Tarreau	623a2c4e19	CLEANUP: activity: better use a mask to test freeing methods In "show profiling memory", we need to distinguish methods which really free memory from those which do not so that we don't account for the free value twice. However for now it's done using multiple tests, which are going to complicate the addition of new methods. Let's switch to a bit field defined as a mask in a single place instead, as we don't intend to use more than 32/64 methods!	2024-11-21 19:58:06 +01:00
Willy Tarreau	f3547d0b74	MINOR: activity: better report nil than ffff in unknown callers For unknown callers we try to get the lowest known address and we purposely ignore NULL during calculation of the min. But the side effect is that we also report ffff in the per-DSO address. Better catch this case and finally accept to report nil. Before it would report this: $ socat - /tmp/sock1 <<< "show profiling memory" |grep nil 50000 10 9600000 9440| (nil) [other] unknown(192) [delta=9590560] [pool=http_txn] 50000 10 9600000 9440| (nil) DSO:other; delta_calls=49990; delta_bytes=9590560 now it reports this: $ socat - /tmp/sock1 <<< "show profiling memory" |grep nil 50000 11 9600000 9656| (nil) [other] unknown(192) [delta=9590344] [pool=connection] 50000 11 9600000 9656| (nil) DSO:other; delta_calls=49989; delta_bytes=9590344	2024-11-21 19:58:06 +01:00
Willy Tarreau	ed3ed35867	BUG/MEDIUM: pools/memprofile: always clean stale pool info on pool_destroy() There's actually a problem with memprofiles: the pool pointer is stored in ->info but some pools are replaced during startup, such as the trash pool, leaving a dangling pointer there, that may randomly report crap or even crash during "show profiling memory". Let's make pool_destroy() call memprof_remove_stale_info() added by previous patch so that these entries are properly unregistered. This must be backported along with the previous patch (MINOR: activity/memprofile: offer a function to unregister stale info) as far as 2.8.	2024-11-21 19:58:06 +01:00
Willy Tarreau	859341c1ec	MINOR: activity/memprofile: offer a function to unregister stale info There's actually a problem with memprofiles: the pool pointer is stored in ->info but some pools are replaced during startup, such as the trash pool, leaving a dangling pointer there. Let's complete the API with a new function memprof_remove_stale_info() that will remove all stale references to this info pointer. It's also present when USE_MEMORY_PROFILING is not set so as to ease the job on callers.	2024-11-21 19:58:06 +01:00
Willy Tarreau	c42a2b8c94	BUG/MINOR: activity/memprofile: reinitialize the free calls on DSO summary In commit 401fb0e87a ("MINOR: activity/memprofile: show per-DSO stats") we added a summary per DSO. However the free calls/tot were not initialized when creating a new entry because initially they were applied to any entry, but since we don't update free calls for non-free capable callers, we still need to reinitialize these entries when reassigning one. Because of this bug, a "show profiling memory" output can randomly show highly negative values on the DSO lines if it turns out that the DSO entry was created on an alloc instead of a realloc/free. Since the commit above was backported to 2.9, this one must go there as well.	2024-11-21 19:58:05 +01:00
Willy Tarreau	24ce001771	BUG/MEDIUM: wdt: fix the stuck detection for warnings If two slow tasks trigger one warning even a few seconds apart, the watchdog code will mistakenly take this for a definite stuck task and kill the process. The reason is that since commit 148eb5875f ("DEBUG: wdt: better detect apparently locked up threads and warn about them") the updated ctxsw count is not the correct one, instead of updating the private counter it resets the public one, preventing it from making progress and making the wdt believe that no progress was made. In addition the initial value was read from [tid] instead of [thr]. Please note that another fix is needed in debug_handler() otherwise the watchdog will fire early after the first warning or thread dump. A simple test for this is to issue several of these commands back-to-back on the CLI, which crashes an unfixed 3.1 very quickly: $ socat /tmp/sock1 - <<< "expert-mode on; debug dev loop 1000" This needs to be backported to 2.9 since the fix above was backported there. The impact on 3.0 and 2.9 is almost inexistent since the watchdog there doesn't apply the shorter warning delay, so the first call already indicates that the thread is stuck.	2024-11-21 19:58:05 +01:00
Willy Tarreau	1151fe6818	BUG/MEDIUM: debug: don't set the STUCK flag from debug_handler() Since 2.0 with commit e6a02fa65a ("MINOR: threads: add a "stuck" flag to the thread_info struct"), the TH_FL_STUCK flag was set by the debugger to flag that a thread was stuck and report it in the output. However, two commits later (2bfefdbaef "MAJOR: watchdog: implement a thread lockup detection mechanism"), this flag was used to detect that a thread had already been reported as stuck. The problem is that it seldom happens that a "show threads" command instantly crashes because it calls debug_handler(), which sets the flag, and if the watchdog timer was about to trigger before going back to the scheduler, the watchdog believes that the thread has been stuck for a while and will kill the process. The issue was magnified in 3.1 with the lower-delay warning, because it's possible for a thread to die on the next wakeup after the first warning (which calls debug_handler() hence sets the STUCK flag). One good approach would have been to use two distinct flags, one for "stuck" as reported by the debug handler, and one for "stuck" as seen by the watchdog. However, one could also argue that since the second commit, given that the wdt monitors the threads, there's no point any more for the debug handler to set the flag itself. Removing this code means that two consecutive "show threads" will not report "stuck" until the watchdog sets it, which aligns better with expectations. This can be backported to all stable releases. This code has changed a bit over time, the "if" block and the harmless variables just need to be removed.	2024-11-21 19:58:05 +01:00
Valentine Krasnobaeva	332839eb9d	BUG/MINOR: startup: init_early: remove obsolete comment This fixes the commit d6ccd1738bae ("MINOR: startup: set HAPROXY_LOCALPEER only once"). Comment "/* preset some environment variables */" is now useless here as HAPROXY_LOCALPEER is set later during the initialization stage and only once. This should not be backported, as related to the latest master-worker refactoring.	2024-11-21 19:55:21 +01:00
Valentine Krasnobaeva	aa88d6ee37	BUG/MINOR: config: allow to check HAPROXY_LOCALPEER in config This fixes the commit d6ccd1738bae ("MINOR: startup: set HAPROXY_LOCALPEER only once"). HAPROXY_LOCALPEER could be checked in the configuration to set some server or listener settings. So, we need to set it just before we read the configuration for the second time. Let's mark HAPROXY_LOCALPEER as "usable" in the configuration in the related documentation chapter. This should not be backported, as related to the latest master-worker refactoring.	2024-11-21 19:55:21 +01:00
Valentine Krasnobaeva	d253f30823	MINOR: cfgparse-global: add cfg_parse_global_localpeer This commit prepares the parsing of localpeer keyword in MODE_DISCOVERY. We need this, as HAPROXY_LOCALPEER environment variable could be checked in the configuration in order to enable some backend or frontend settings. So, first, let's add a dedicated parser for localpeer. Second, we no longer need to check whether cfg_peers is a valid pointer, as in MODE_DISCOVERY we parse only the "global" section. In addition, let's make the code of localpeer parser a little bit more readable.	2024-11-21 19:55:21 +01:00
Valentine Krasnobaeva	bfe0f9d02d	MINOR: startup: use global progname variable Let's store progname in the global variable, as it is handy to use it in different parts of code to format messages sent to stdout. This reduces the number of arguments, which we should pass to some functions.	2024-11-21 19:55:21 +01:00
Valentine Krasnobaeva	ef154a49e1	MINOR: capabilities: rename program_name argument to progname This commit prepares the usage of the global progname variable. prepare_caps_from_permitted_set() uses progname value in warning messages. So, let's rename program_name argument to progname.	2024-11-21 19:55:21 +01:00
Valentine Krasnobaeva	351ae5dbed	BUG/MINOR: startup: fix UAF when setting the default for log_tag In the init_early() global.log_tag is initialized to the string from progname pointer and global.log_tag.area points to this pointer. If log-tag keyword is provided in the configuration, its parser at first frees global.log_tag.area and then it does a new memory allocation to copy there the argument of log-tag. So, progname no longer points to the valid memory. To fix this, let's always keep progname and global.log_tag.area at separate memory areas. If log_tag is redefined in the configuration, its parser will free the memory allocated for the default value in chunk_destroy(). Memory allocated for progname will be freed in deinit(). This should not be backported as related to the latest master-worker refactoring.	2024-11-21 19:55:21 +01:00
Valentine Krasnobaeva	d1c3cd8974	MINOR: cfgparse-global: add more checks for "chroot" argument If directory provided as a "chroot" keyword argument does not exist or is inaccessible, this is reported only at the latest initialization stage, when haproxy tries to perform chroot. Sometimes it's not very convenient, as the process is already bound to listen sockets. This was done explicitly in order not to break the case, when haproxy is launched with "-c" option in some specific environment, where it's not possible to create or to modify chroot directory, provided in the configuration. So, let's add more checks for "chroot" directory during the parsing stage and let's show diagnostic warnings, if this directory has become inaccessible or was deleted. Like this, users who want to catch errors related to misconfigured chroot before starting the process can launch haproxy with -dW and -dD. zero-warning mode will stop the process with error, if any warning was emitted during initialization stage.	2024-11-21 19:55:21 +01:00
Valentine Krasnobaeva	c853502cc6	MINOR: cfgparse-global: add cfg_parse_global_chroot Let's add a dedicated parser for "chroot" keyword, as we add some more checks for its argument in the next commit. This reduces the size of cfg_parse_global().	2024-11-21 19:55:21 +01:00
Frederic Lecaille	01fcbd6c08	BUG/MINOR: quic: Missing application limitations tracking for BBR The aim of the ->app_limited member of the delivery rate struct (quic_cc_drs) is to store the index of the last transmitted byte marked as application-limited in order to track the application-limited phases. During these phases, BBR must ignore delivery rate samples to properly estimate the delivery rate. Without such a patch, the Startup phase could be exited very quickly with a very low estimated bottleneck bandwidth. This had a very bad impact on little objects with download times smaller than the expected Startup phase duration. For such objects, with enough bandwidth, BBR should stay in the Startup state. No need to be backported, as BBR is implemented in the current development version.	2024-11-21 19:23:53 +01:00
Amaury Denoyelle	95d3edd68f	MINOR: quic: support pacing for newreno and nocc Extend extra pacing support for newreno and nocc congestion algorithms, as with cubic. For better extensibility of cc algo definition, define a new flags field in quic_cc_algo structure. For now, the only value is QUIC_CC_ALGO_FL_OPT_PACING which is set if pacing support can be optionally activated. Cubic, newreno and nocc all now support this. This new flag is then reused by QUIC config parser. If set, extra quic-cc-algo burst parameter is taken into account. If positive, this will activate pacing support on top of the congestion algorithm. As with cubic previously, pacing is only supported if running under experimental mode. Only BBR is not flagged with this new value as pacing is directly builtin in the algorithm and cannot be turned off. Furthermore, BBR calculates automatically its value for maximum burst. As such, any quic-cc-algo burst argument used with BBR is still ignored with a warning.	2024-11-21 11:33:44 +01:00
Amaury Denoyelle	99497d23b5	BUG/MINOR: cfgparse-quic: fix warning for cc-algo with 0 burst Optional burst argument for quic-cc-algo is used to toggle pacing support on top of cubic. This is the case if it is positive. The default value is 0, which does not activate pacing. However, in this case, an incorrect warning is reported about the parameter being ignored. Fix this by removing the warning in this case. No need to backport.	2024-11-21 11:26:36 +01:00
Frederic Lecaille	ea17de01ac	MINOR: quic: Useless rate sample member initialization This poor/inefficient code has been revealed by coverity GH issue in #2788 where some quic_cc_rs struct member initializations were mentioned as overwritten (after initialization) before being used as follows: CID 1565821: Code maintainability issues (UNUSED_VALUE) /src/quic_cc_bbr.c: 1373 in bbr_handle_lost_packet() 1367 } 1368 1369 static void bbr_handle_lost_packet(struct bbr *bbr, struct quic_cc_path *p, 1370 struct quic_tx_packet *pkt, 1371 uint32_t lost) 1372 { >>> CID 1565821: Code maintainability issues (UNUSED_VALUE) >>> Assigning value "0UL" to "rs.tx_in_flight" here, but that stored value is overwritten before it can be used. 1373 struct quic_cc_rs rs = {0}; 1374 1375 /* C.delivered = bbr->drs.delivered */ 1376 bbr_note_loss(bbr, bbr->drs.delivered); 1377 if (!bbr->bw_probe_samples) 1378 return; /* not a packet sent while probing bandwidth */ Remove the {0} initializer for <rs> variable. This is safe because the members initializations of <rs> local variable passed to functions from bbr_handle_lost_packet() are done. Add a comment to mention this.	2024-11-21 11:01:53 +01:00
Amaury Denoyelle	de86fd1e6c	MINOR: cfgparse-quic: activate pacing only via burst argument Recently, pacing support was added for cubic congestion algorithm. This was activated by using the new token "cubic-pacing" on quic-cc-algo. Furthermore, it was possible to define a burst size with a new parameter after congestion token between parenthesis. This configuration is not obvious to users. In particular, it makes it easy to forget to tweak burst size, which can dramatically impact performance. Simplify this by removing the extra "-pacing" suffix. Now, pacing will be activated solely based on the burst parameter. If 0, burst is considered as infinite and no pacing will be used. Pacing will be activated for any positive burst. This better reflects the link between pacing and burst and its importance. Note that for the moment, if burst is specified, it will be ignored with a warning for algorithms other than cubic. This is not a breaking change as pacing support was implemented in the current dev version.	2024-11-21 10:55:55 +01:00
Amaury Denoyelle	7b23c9075c	BUG/MINOR: cfgparse-quic: fix bbr initialization To support pacing with cubic, a recent change was introduced to render quic_cc_algo on bind line dynamically allocated, instead of pointing to a globally defined variable. This allows customization of the algorithm callbacks per bind line. This was not correctly used for BBR as it was set to point to the global quic_cc_algo_bbr. This causes a segfault on haproxy process closing. Fix this by properly initializing BBR as other algorithms. This should fix coverity report from github issue #2786.	2024-11-21 10:49:16 +01:00
Christopher Faulet	e58a30d369	MINOR: cfgparse: Emit a warning for misplaced "tcp-response content" rules When a "tcp-response content" rule is placed after a "http-response" rule, a warning is now emitted, just like for rules applied on the requests.	2024-11-21 09:55:04 +01:00
Christopher Faulet	5dcd3b0d99	CLEANUP: cfgparse: Add direction in functions name that warn on misplaced rules This only concerns functions emitting warnings about misplaced tcp-request rules. The direction is now specified in the functions name. For instance "warnif_misplaced_tcp_conn" is replaced by "warnif_misplaced_tcp_req_conn".	2024-11-21 09:51:37 +01:00
Christopher Faulet	7710580428	MINOR: config: Improve warnings on misplaced rules by adding an optional arg In warnings about misplaced rules, only the first keyword is mentioned. It works well for http-request or quic-initial rules for instance. But it is a bit confusing for tcp-request rules, because the layer is missing (session or content). To make it a bit systematic (and generic), the second argument can now be provided. It can be set to NULL if there is no layer or scope. But otherwise, it may be specified and it will be reported in the warning. So the following snippet: tcp-request content reject if FALSE tcp-request session reject if FALSE tcp-request connection reject if FALSE Will now emit the following warnings: a 'tcp-request session' rule placed after a 'tcp-request content' rule will still be processed before. a 'tcp-request connection' rule placed after a 'tcp-request session' rule will still be processed before. This patch should fix the issue #2596.	2024-11-21 09:28:42 +01:00
Willy Tarreau	c329bfe3f5	BUILD: makefile: reorder object files by build time mux_spop is quite long to build and was at the end. The rest did not change much, but the build time is now dominated by hlua.o and mux_h2.o and by a large margin. On the 80-core ARM mux_h2.o is present from beginning to end and on the PC it's hlua.o, so both might have to be split at some point to benefit from multi-core. Nevertheless, the changes allowed to shrink about one second out of the 18 it was taking on that machine.	2024-11-20 18:49:56 +01:00
Willy Tarreau	f16edcd34c	BUILD: makefile: build flags.c before haproxy to speed up the build The end of the build is often super slow. In practice it's flags.o that now takes ages (3.4 seconds) and blocks everything on a single core at the end. Let's declare it before the haproxy target so that it starts earlier. On a quad-2.2 GHz CPU, the build time goes down from 44 to 42s and the end feels less painful.	2024-11-20 18:49:56 +01:00
Christopher Faulet	667ac8acc6	DOC: management: Clearly state "show errors" only reports malformed H1 messages For now, only the H1 multiplexer is able to capture malformed messages. So it is better to update the management guide accordingly to avoid any confusion.	2024-11-20 18:08:17 +01:00
Christopher Faulet	e863d8d681	DOC: config: Improve documentation of tune.http.maxhdr directive The description was improved to clearly mention that it is applied on received requests and responses. In addition, a comment was added about HTTP/2 and HTTP/3 limitation when messages are encoded to be sent.	2024-11-20 18:02:36 +01:00
Christopher Faulet	3bd9a9e7d7	BUG/MEDIUM: h3: Increase max number of headers when sending headers In the same way as for H2, the maximum number of headers that can be encoded when headers are sent must be increased to match the limit imposed when they are received. The reasons are the same. On receive path, the maximum number of headers accepted must be higher than the configured limit to be able to handle pseudo headers and cookies headers. On the sending path, the same limit must be applied because the pseudo headers will consume some extra slots and the cookie header could be split. This patch should be backported as far as 2.6.	2024-11-20 17:44:22 +01:00
Christopher Faulet	785e633353	BUG/MEDIUM: h3: Properly limit the number of headers received The number of headers is limited before the decoding but pseudo headers and cookie headers consume extra slots. In practice, this lowers the maximum number of headers that can be received. To work around this issue, the limit is doubled during the frame decoding to be sure to have enough extra slots. And the number of headers is tested against the configured limit after the HTX message was created to be able to report an error. Unfortunately no parsing error is reported because the QUIC multiplexer is not able to do so for now. The same is performed on trailers to be consistent with H2. This patch should be backported as far as 2.6.	2024-11-20 17:44:22 +01:00
Christopher Faulet	63d2760dfa	BUG/MEDIUM: mux-h2: Check the number of headers in HEADERS frame after decoding There is no explicit test on the number of headers when a HEADERS frame is received. It is implicitly limited by the size of the header list. But it is twice the configured limit to be sure to decode the frame. So now, a check is performed after the HTX message was created. This way, we are sure to not exceed the configured limit after the decoding stage. If there are too many headers, a parsing error is reported. Note the same is performed on the trailers. This patch should partially address the issue #2685. It should be backported to all stable versions.	2024-11-20 17:44:22 +01:00
Christopher Faulet	e415e3cb7a	BUG/MEDIUM: mux-h2: Increase max number of headers when encoding HEADERS frames When a HEADERS frame is encoded to be sent, the maximum number of headers allowed in the frame is lower than on receiving path. This can lead to report a sending error while the message was accepted. It could be confusing. In addition, the start-line is split into pseudo-headers and this way consumes some header slots, increasing the difference between HEADERS frames encoding and decoding. It is even more noticeable because when a HEADERS frame is decoded, a margin is used to be able to handle split cookie headers. Concretely, on decoding path, a limit of twice the maximum number of headers allowed in a message (tune.http.maxhdr * 2) is used. On encoding path, the exact limit is used. It is not consistent. Note that when a frame is decoded, we must use a larger limit because the pseudo headers are reassembled in the start-line and must count for one. But also because, most of time, the cookies are split into several headers and are reassembled too. To fix the issue, the same ratio is applied on sending path. A limit must be defined because a dynamic allocation is not acceptable. Twice of the configured limit should be good enough to support headers manipulation. This patch should be backported to all stable versions.	2024-11-20 17:44:22 +01:00
Frederic Lecaille	349954601f	MINOR: quic: add "bbr" new "quic-cc-algo" option Add this new "bbr" option to the list of the congestion control algorithms which may be set by "quic-cc-algo" setting. This new algorithm is considered as experimental and may be enabled only if "expose-experimental-directive" is set. Also update the documentation for this new setting.	2024-11-20 17:34:22 +01:00
Frederic Lecaille	e778b9a2b6	MINOR: quic: TX part modifications to support BBR. Very few modifications: call ->on_transmit() and ->drs_on_transmit() congestion control algorithm (quic_cc) callbacks from qc_send_ppkts() just after having sent some packets.	2024-11-20 17:34:22 +01:00
Frederic Lecaille	44af88d856	MINOR: quic: RX part modifications to support BBR The aim of qc_notify_cc_of_newly_acked_pkts() is to notify the congestion algorithm of all the packet acknowledgements. It must call quic_cc_drs_update_rate_sample() to update the delivery rate sampling information. It must also call quic_cc_drs_on_ack_recv() to update the state of the delivery rate sampling part used by BBR. Finally, ->on_ack_rcvd() is called with the total number of bytes delivered by the sender from the newly acknowledged packets with <bytes_delivered> as parameter to do so. <pkt_delivered> stores the per-packet number of bytes delivered by the newly sent acknowledged packet (the packet with the highest packet number). <bytes_lost> is also used and has been set by qc_packet_loss_lookup() before calling qc_notify_cc_of_newly_acked_pkts().	2024-11-20 17:34:22 +01:00
Frederic Lecaille	d85eb127e9	MINOR: quic: quic_loss modifications to support BBR The aim of qc_packet_loss_lookup() is to detect the packet losses. It is this function which must call the ->on_pkt_lost() BBR-specific callback. It also sets the <bytes_lost> passed parameter to the total number of bytes detected as lost upon an ACK frame receipt for its caller. Modify qc_release_lost_pkts() to call ->congestion_event() with the send time from the newest packet detected as lost. Modify qc_release_lost_pkts() to call ->slow_start() callback only if defined by the congestion control algorithm. This is not the case for BBR.	2024-11-20 17:34:22 +01:00
Frederic Lecaille	af75665cb7	MINOR: quic: quic_cc modifications to support BBR Add several callbacks to quic_cc_algo struct which are only called by BBR. ->get_drs() may be used to retrieve the delivery rate sampling information from a congestion algorithm struct (quic_cc). ->on_transmit() must be called by a QUIC sender before sending any packet. ->on_ack_rcvd() must be called after having received an ACK. ->on_pkt_lost() must be called after having detected a packet loss. ->congestion_event() must be called after any congestion event detection. Modify quic_cc.c to call ->event only if defined. This is not the case for BBR.	2024-11-20 17:34:22 +01:00
Frederic Lecaille	d04adf44dc	MINOR: quic: implement BBR congestion control algorithm for QUIC Implement the version 3 of BBR for QUIC specified by the IETF in this draft: https://datatracker.ietf.org/doc/draft-ietf-ccwg-bbr/ Here is an extract from the Abstract part to sum up the capabilities of BBR: BBR ("Bottleneck Bandwidth and Round-trip propagation time") uses recent measurements of a transport connection's delivery rate, round-trip time, and packet loss rate to build an explicit model of the network path. BBR then uses this model to control both how fast it sends data and the maximum volume of data it allows in flight in the network at any time. Relative to loss-based congestion control algorithms such as Reno [RFC5681] or CUBIC [RFC9438], BBR offers substantially higher throughput for bottlenecks with shallow buffers or random losses, and substantially lower queueing delays for bottlenecks with deep buffers (avoiding "bufferbloat"). BBR can be implemented in any transport protocol that supports packet-delivery acknowledgment. Thus far, open source implementations are available for TCP [RFC9293] and QUIC [RFC9000]. In haproxy, this implementation is considered as still experimental. It depends on the newly implemented pacing feature. BBR was requested in GH #2516 by @KazuyaKanemura, @osevan and @kennyZ96.	2024-11-20 17:34:22 +01:00
Frederic Lecaille	472d575950	MINOR: quic: implement delivery rate sampling algorithm This patch implements an algorithm which may be used by congestion algorithms for QUIC to estimate the current delivery rate of a sender. It is at least used by BBR and could be used by other congestion algorithms such as cubic. This algorithm was specified by an RFC draft here: https://datatracker.ietf.org/doc/html/draft-cheng-iccrg-delivery-rate-estimation before being merged into BBR v3 here: https://datatracker.ietf.org/doc/html/draft-cardwell-ccwg-bbr#section-4.5.2.2	2024-11-20 17:34:22 +01:00
Frederic Lecaille	c08b877657	MINOR: window_filter: Implement windowed filter (only max) Implement Kathleen Nichols' algorithm used by several congestion control algorithm implementations (TCP/BBR in Linux kernel, QUIC/BBR in quiche) to track the maximum value of a data type during a fixed time interval. In this implementation, counters which are periodically reset are used in place of timestamps. Only the max part has been implemented. (see lib/minmax.c implementation for Linux kernel).	2024-11-20 17:34:22 +01:00
Frederic Lecaille	7bbe8828ba	MINOR: quic: Add the congestion window initial value to QUIC path Add ->initial_wnd new member to quic_cc_path struct to keep the initial value of the congestion window. This member is initialized as soon as a QUIC connection is allocated. This modification is required for BBR congestion control algorithm.	2024-11-20 17:34:22 +01:00
William Lallemand	5ebecbe45b	REGTESTS: disable temporarily mworker test on OSX -Ws on VTest is not working correctly for an unknown reason, the polling of the NOTIFY_SOCKET seems to timeout, and VTest never receives the READY message. This patch disables the reg-tests using -Ws on OS X.	2024-11-20 17:13:59 +01:00
William Lallemand	b7d81b3511	REGTESTS: switch to -Ws for master-worker reg-tests The -W mode implemented in VTest is not reliable anymore, because VTest waits for the pidfile to be created. But with the new master-worker mode, this file is created long before haproxy is ready. This can lead to the test being started too soon, and failing from time to time. The -Ws option allows to wait for haproxy to deliver a message to VTest once it is ready.	2024-11-20 17:13:59 +01:00
Aurelien DARRAGON	2ce0db4e4b	OPTIM: map/hlua: make core.set_map() lookup more efficient 0844bed7d3 ("MEDIUM: map/acl: Improve pat_ref_set() efficiency (for "set-map", "add-acl" action perfs)") improved lookup efficiency for set-map http action, but the core.set_map() lua method which is built on the same construct was overlooked. Let's also benefit from this optim as it easily applies.	2024-11-20 16:14:13 +01:00
Willy Tarreau	311dc748b0	DOC: config: indent the list of environment variables In the doc our lists are indented but for some reason this one was not, making it harder to visually delimit. Let's just indent it. No need to backport this, it's totally cosmetic and would need adaptations since it was recently touched.	2024-11-20 15:57:09 +01:00
Valentine Krasnobaeva	41d906d69b	DOC: configuration: update "Environment variables" chapter There are some variables, which are set by HAProxy process (HAPROXY_*). Some of them are handy to check or to redefine in the configuration, in order to create conditional blocks and make the configuration more flexible. But it wasn't clear in the documentation, which variables are really safe and useful to redefine and which ones could be only read via "show env" output. Latest changes in master-worker architecture make the existing description even more confusing. So let's sort all HAPROXY_* variables into four categories and let's also mark explicitly, which ones are set in which process, when haproxy is started in master-worker mode. In addition, update examples in chapter "2.4. Conditional blocks". This might bring more ideas for users how HAPROXY_* variables could be used in the conditional blocks.	2024-11-20 15:56:50 +01:00
Valentine Krasnobaeva	d6ccd1738b	MINOR: startup: set HAPROXY_LOCALPEER only once Before this patch HAPROXY_LOCALPEER variable could be set in init_early(), in init_args() and in cfg_parse_global(). In master-worker mode, if localpeer keyword is set in the global section, HAPROXY_LOCALPEER in the worker environment is set to this keyword's value, but in the master environment it still keeps the default, a localhost name. This is confusing. To fix it, let's set HAPROXY_LOCALPEER only once, when a worker or process in a standalone mode has finished parsing its configuration. And let's set this variable only for the worker process or for the process in a standalone mode, because the master doesn't need it. HAPROXY_LOCALPEER takes the value saved in localpeer global variable, which is always set by default in init_early() to the local hostname. Then, localpeer could be reset in init_args (-L option) and in cfg_parse_global() (while parsing "localpeer" keyword).	2024-11-20 15:44:10 +01:00
Willy Tarreau	1171a23aec	BUILD: makefile: make ERR apply to build options as well Once in a while we find some makefiles ignoring some outdated arguments and just emit a warning. What's annoying is that if users (say, distro packagers), have purposely added ERR=1 to their build scripts to make sure to fail on any warning, these ones will be ignored and the build can continue with invalid or missing options. William rightfully suggested that ERR=1 should also catch make's warnings so this patch implements this, by creating a new "complain" variable that points either to "error" or "warning" depending on $(ERR), and that is used to send the messages using $(call $(complain),...). This does the job right at little effort (tested from GNU make 3.82 to 4.3). Note that for this purpose the ERR declaration was upped in the makefile so that it appears before the new errors.mk file is included.	2024-11-20 14:58:35 +01:00
William Lallemand	b861dc9371	MINOR: systemd: replace SOCK_CLOEXEC by fcntl call to FD_CLOEXEC Since we build systemd.o for every target, we need it to be more portable. The SOCK_CLOEXEC argument from socket() is not portable and won't build on some OS like macOS X. This patch fixes the issue by replacing SOCK_CLOEXEC with an fcntl() call that sets FD_CLOEXEC.	2024-11-20 14:26:23 +01:00
William Lallemand	1ceeeacbad	CI: vtest: temporarily build from the sd-notify PR Build VTest temporarily from the sd-notify PR until https://github.com/vtest/VTest/pull/41 is merged. This PR allows starting with -Ws in order to have more reliable tests in master-worker mode.	2024-11-20 12:07:38 +01:00
William Lallemand	15845247db	MEDIUM: mworker: remove USE_SYSTEMD requirement for -Ws Since sd_notify() is now implemented in src/systemd.c, there is no need anymore to build its support conditionally with USE_SYSTEMD. This patch adds support for -Ws for every build and removes the USE_SYSTEMD build option. It also removes every reference to USE_SYSTEMD in the documentation and the CI. This also allows to run the reg-tests in -Ws with the new VTest support.	2024-11-20 12:07:38 +01:00
Amaury Denoyelle	16147e6cf3	BUG/MINOR: cfgparse-quic: fix renaming of max-window-size A patch has recently tried to rename QUIC max-window-size global parameter to default-max-window-size to better reflect its usage. However, only the documentation was edited but not cfgparse-quic.c. Fix this by updating cfgparse-quic.c with the new default- naming. No need to backport.	2024-11-20 11:12:06 +01:00
Christopher Faulet	17d4e6eaf9	MINOR: http-fetch: Add an option to "query" to get the QS with the '?' As mentioned by Thayne McCombs in #2728, it could be handy to have a sample fetch function to retrieve the query string with the question mark character. Indeed, for now, the "query" sample fetch function already extracts the query string from the path, but the question mark character is not included. Instead of adding a new sample fetch function with a too similar name, an optional argument is added to "query". If "with_qm" is passed as argument, the question mark will be included in the query string, but only if it is not empty. Thanks to this patch, the following rules: http-request redirect location /destination?%[query] if { -m found query } some_condition http-request redirect location /destination if some_condition can now be expressed this way: http-request redirect location /destination%[query(with_qm)] if some_condition	2024-11-20 10:20:05 +01:00
Christopher Faulet	2a5da31cce	BUG/MINOR: http-ana: Adjust the server status before the L7 retries The server status must be adjusted, if necessary, at each retry. It is properly performed when "observe layer4" directive is set. But for the layer 7, only the last attempt was considered. When the L7 retries were implemented, all retries were added before the server status adjustment. So only the last attempt was considered. To fix the issue, we must adjust the server status first, and then try to perform a L7 retry. This patch should fix the issue #2679. It must be backported to all stable versions.	2024-11-20 09:22:06 +01:00
Willy Tarreau	5c15899410	DOC: configuration: wrap long line for "strstr()" conditional expression This keyword had too long a description line, let's split it. This can be backported to 2.8.	2024-11-20 09:04:53 +01:00
Willy Tarreau	da1620b317	DOC: configuration: explain quotes and spaces in conditional blocks Conditional blocks inherit the same tokenizer and argument parser as the rest of the configuration, but are also silently concatenated around groups of spaces and tabs. This can lead to subtle failures for configs containing spaces around commas and parentheses, where a string comparison might silently fail for example. Let's better document this particular case. Thanks to Valentine for analysing and reporting the problem. This can be backported to 2.4.	2024-11-20 09:04:53 +01:00
Willy Tarreau	962d5e038f	DOC: configuration: explain the rules regarding spaces in arguments Spaces around commas or parentheses in expressions are generally part of the value due to the long history of supporting unquoted arguments. But this tends to come as a surprise to new users and sometimes creates subtly invalid configurations. Let's add some text covering this. This can be backported to 2.4.	2024-11-20 08:42:02 +01:00
Willy Tarreau	12fcd65468	MINOR: tasklet: support an optional set of wakeup flags to tasklet_wakeup_on() tasklet_wakeup_on() and its derivatives (tasklet_wakeup_after() and tasklet_wakeup()) do not support passing a wakeup cause like task_wakeup(). This is essentially due to an API limitation caused by the fact that for a very long time the only reason for waking up was to process pending I/O. But with the growing complexity of mux tasks, it is becoming important to be able to skip certain heavy processing when not strictly needed. One possibility is to permit the caller of tasklet_wakeup() to pass flags like task_wakeup(). Instead of going with a complex naming scheme, let's simply make the flags optional and be zero when not specified. This means that tasklet_wakeup_on() now takes either 2 or 3 args, and that the third one is the optional flags to be passed to the callee. Eligible flags are essentially the non-persistent ones (TASK_F_UEVT* and TASK_WOKEN_*) which are cleared when the tasklet is executed. This way the handler will find them in its <state> argument and will be able to distinguish various causes for the call.	2024-11-19 20:13:41 +01:00
Willy Tarreau	0334cb28a9	MINOR: tasklet: make the low-level tasklet API take a flag Everything in the tasklet layer supports flags, except that they are just not implemented in the wakeup functions, while they are in the task_wakeup functions. Initially it was not considered useful to pass wakeup causes because these were essentially I/O, but with the growing number of I/O handlers having to deal with various types of operations (typically cheap I/O notifications on subscribe vs heavy parsing on application-level wakeups), it would be nice to start to make this distinction possible. This commit extends _tasklet_wakeup_on() and _tasklet_wakeup_after() to pass a set of flags that continues to be set as zero. For now this changes nothing, but new functions will come.	2024-11-19 20:13:41 +01:00
Willy Tarreau	e57581d76d	MINOR: tools: add new macro DEFZERO to provide a default zero argument This is the equivalent of DEFNULL except that it sets a zero value instead of a NULL for a missing argument.	2024-11-19 20:13:41 +01:00
Willy Tarreau	c5052bad8a	MINOR: sched: add TASK_F_WANTS_TIME to make the scheduler update the call date Currently tasks being profiled have th_ctx->sched_call_date set to the current nanosecond in monotonic time. But there's no other way to have this, despite the scheduler being capable of it. Let's just declare a new task flag, TASK_F_WANTS_TIME, that makes the scheduler take the time just before calling the handler. This way, a task that needs nanosecond resolution on the call date will be able to be called with an up-to-date date without having to abuse now_mono_time() if not needed. In addition, if CLOCK_MONOTONIC is not supported (now_mono_time() always returns 0), the date is set to the most recently known now_ns, which is guaranteed to be atomic and is only updated once per poll loop. This date can be more conveniently retrieved using task_mono_time(). This can be useful, e.g. for pacing. The code was slightly adjusted so as to merge the common parts between the profiling case and this one.	2024-11-19 20:13:41 +01:00
Willy Tarreau	12969c1b17	MINOR: tinfo/clock: turn sched_call_date to 64-bits We used to store it in 32-bits since we'd only use it for latency and CPU usage calculation but usages will evolve so let's not truncate the value anymore. Now we store the full 64 bits. Note that this doesn't even increase the storage size due to alignment. The 3 usage places were verified to still be valid (most were already cast to 32 bits anyway).	2024-11-19 20:13:41 +01:00
Willy Tarreau	33c461314c	MINOR: stream: don't update s->lat_time when the wakeup date is not set A stream wakeup latency calculation was added in 2.7 with commit 6a28a30efa ("MINOR: tasks: do not keep cpu and latency times in struct task"). However, due to the transformation of the previous code, it kept unconditionally updating s->lat_time even if the sched_wake_date was zero. In other words, s->lat_time is constantly updated for the huge majority of calls that are made without profiling. Let's just check the sched_wake_date status before doing so.	2024-11-19 20:13:41 +01:00
Willy Tarreau	973c81ceec	CLEANUP: tinfo: move sched__date/_mono_time to the thread-local area These ones are never atomically accessed, they have nothing to do in the atomic ops cache line, let's move them to the thread-local area.	2024-11-19 20:13:41 +01:00
Willy Tarreau	8dc68f3c75	DOC: sched: document the missing TASK_F_UEVT* flags These are user-defined one-shot events that are application-specific and reset upon wakeup and were not documented. No backport is needed since these were added to 3.1.	2024-11-19 20:13:41 +01:00
Willy Tarreau	e5ca72cb6f	DOC: sched: add missing scheduler API documentation for tasklet_wakeup_after() This was added to 2.6 but the doc was forgotten. Let's add it. It's not needed to backport this since it's only used for new developments.	2024-11-19 20:13:41 +01:00
Aurelien DARRAGON	501827ebe0	DOC: lua: fix yield-dependent methods expected contexts Contrary to what the doc states, it is not expected (nor relevant) to use yield-dependent methods such as core.yield() or core.(m)sleep() from contexts that don't support yielding. Such contexts include body, init, fetches and converters. Thus the doc got it wrong since the beginning, because such methods were never supported from the above contexts, yet it was listed in the list of compatible contexts (probably the result of a copy-paste), which is error-prone because it could either cause a Lua runtime error to be thrown, or be ignored in some other cases. It should be backported to all stable versions.	2024-11-19 19:36:02 +01:00
William Lallemand	6f746af915	REGTESTS: use -dW by default on every reg-tests Every reg-test now runs without any warning, so let's activate -dW by default so the new ones will inherit the option. This patch reverts 9d511b3c ("REGTESTS: enable -dW on almost all tests to fail on warnings") and adds -dW in the default HAPROXY_ARGS of scripts/run-regtests.sh instead.	2024-11-19 16:53:10 +01:00
William Lallemand	e1fb9a47e1	MEDIUM: stats-file: silently ignore be/fe mismatch Most of the invalid or unknown fields in the stats-file parser are ignored silently, which is not the case of the frontend/backend mismatch on a guid, which is kind of strange. Since this is ""documented"" to be ignored in the reg-tests/stats/sample-stats-file file, let's also ignore this kind of line. This will allow to run the associated reg-test with -dW.	2024-11-19 16:44:51 +01:00
Amaury Denoyelle	5a29fd6c61	MINOR: mux_quic/pacing: display pacing info on show quic To improve debugging, extend "show quic" output to report if pacing is activated on a connection. Two values will be displayed for pacing : * a new counter paced_sent_ctr is defined in QCC structure. It will be incremented each time an emission is interrupted due to pacing. * pacing engine now saves the number of datagrams sent in the last paced emission. This will be helpful to ensure burst parameter is valid.	2024-11-19 16:21:05 +01:00
Amaury Denoyelle	24cea66e07	MEDIUM: quic: define cubic-pacing congestion algorithm Define a new QUIC congestion algorithm token 'cubic-pacing' for quic-cc-algo bind keyword. This is identical to default cubic implementation, except that pacing is used for STREAM frames emission. This algorithm supports an extra argument to specify a burst size. This is stored into a new bind_conf member named quic_pacing_burst which can be reused to initialize quic path. Pacing support is still considered experimental. As such, 'cubic-pacing' can only be used with expose-experimental-directives set.	2024-11-19 16:20:58 +01:00
Amaury Denoyelle	6dfc8fbf1d	MINOR: quic: extend quic-cc-algo optional parameters Modify quic-cc-algo for better extensibility of optional parameters parsing. This will be useful to support a new parameter for maximum allowed pacing burst size. Take this opportunity to refine quic-cc-algo documentation. Optional parameters are now presented as a list which will soon be extended.	2024-11-19 16:20:52 +01:00
Amaury Denoyelle	a6504c9cfb	MINOR: quic: use dynamic cc_algo on bind_conf A QUIC congestion algorithm can be specified on the bind line via keyword quic-cc-algo. As such, bind_conf structure has a member quic_cc_algo. Previously, if quic-cc-algo was set, bind_conf member was initialized to one of the globally defined CC algo structure. This patch changes bind_conf quic_cc_algo initialization to point to a dynamically allocated copy of CC algo structure. With this change, it will be possible to tweak individually each CC algo of a bind line. This will be used to activate pacing on top of the congestion algorithm. As bind_conf member is dynamically allocated now, its member is now freed via free_proxy() to prevent any leak.	2024-11-19 16:16:48 +01:00
Amaury Denoyelle	796446a15e	MAJOR: mux-quic: support pacing emission Support pacing emission for STREAM frames at the QUIC MUX layer. This is implemented by adding a quic_pacer engine into QCC structure. The main changes have been written into qcc_io_send(). It now differentiates cases when some frames have been rejected by transport layer. This can occur as previously due to congestion or FD buffer full, which requires subscribing on transport layer. The new case is when emission has been interrupted due to pacing timing. In this case, QUIC MUX I/O tasklet is rescheduled to run with the flag TASK_F_USR1. On tasklet execution, if TASK_F_USR1 is set, all standard processing for emission and reception is skipped. Instead, a new function qcc_purge_sending() is called. Its purpose is to retry emission with the saved STREAM frames list. Either all remaining frames can now be sent, subscribe is done on transport error or tasklet must be rescheduled for pacing purging. In the meantime, if tasklet is rescheduled due to other conditions, TASK_F_USR1 is reset. This will trigger a full regeneration of STREAM frames. In this case, pacing expiration must be checked before calling qcc_send_frames() to ensure emission is now allowed.	2024-11-19 16:16:48 +01:00
Amaury Denoyelle	ede4cd4c2e	MINOR: mux-quic: encapsulate QCC tasklet wakeup QUIC MUX will be responsible to drive emission with pacing. This will be implemented via setting TASK_F_USR1 before I/O tasklet wakeup. To prepare this, encapsulate each I/O tasklet wakeup into a new function qcc_wakeup(). This commit is purely refactoring prior to pacing implementation into QUIC MUX.	2024-11-19 16:16:48 +01:00
Amaury Denoyelle	4a94a018f0	MINOR: mux-quic: define a tx STREAM frame list member For STREAM emission, MUX QUIC previously used a local list defined under qcc_io_send(). This was suitable as either all frames were sent, or emission must be interrupted due to transport congestion or fatal error. In the latter case, the list was emptied anyway and a new frame list was built on future qcc_io_send() invocation. For pacing, MUX QUIC may have to save the frame list if pacing should be applied across emission. This is necessary to avoid unnecessarily rebuilding the stream frame list between each paced emission. To support this, STREAM list is now stored as a member of QCC structure. Ensure frame list is always deleted, even on QCC release, using newly defined utility function qcc_tx_frms_free().	2024-11-19 16:16:48 +01:00
Amaury Denoyelle	886a7c475c	MINOR: quic/pacing: add burst support qc_send_mux() has been extended previously to support pacing emission. This will ensure that no more than one datagram will be emitted during each invocation. However, to achieve better performance, it may be necessary to emit a batch of several datagrams in one turn. A so-called burst value can be specified by the user in the configuration. However, some congestion control algos may define their own dynamic value. As such, a new CC callback pacing_burst is defined. quic_cc_default_pacing_burst() can be used for algo without pacing interaction, such as cubic. It will return a static value based on user selected configuration.	2024-11-19 16:16:48 +01:00
Amaury Denoyelle	8039fe43e6	MINOR: quic/pacing: support pacing emission on quic_conn layer Pacing will be implemented for STREAM frames emission. As such, qc_send_mux() API has been extended to add an argument to a quic_pacer engine. If non NULL, engine will be used to pace emission. In short, no more than one datagram will be emitted for each qc_send_mux() invocation. Pacer is then notified about the emission and a timer for a future emission is calculated. qc_send_mux() will return PACING error value, to inform QUIC MUX layer that it will be responsible to retry emission after some delay.	2024-11-19 16:16:48 +01:00
Amaury Denoyelle	ab82fab442	MINOR: quic/pacing: implement quic_pacer engine Extend quic_pacer engine to support pacing emission. Several functions are defined. * quic_pacing_sent_done() to notify engine about an emission of one or several datagrams * quic_pacing_expired() to check if emission should be delayed or can be conducted immediately	2024-11-19 16:16:48 +01:00
Amaury Denoyelle	3e11492c99	MINOR: quic: define quic_pacing module Add a new module quic_pacing. A new structure quic_pacer is defined. This will be used as a pacing engine to implement smooth emission of QUIC data.	2024-11-19 16:16:48 +01:00
Amaury Denoyelle	7fd48a5723	MINOR: quic: extend qc_send_mux() return type with a dedicated enum This commit is part of an adjustment on QUIC transport send API to support pacing. Here, qc_send_mux() return type has been changed to use a new enum quic_tx_err. This is useful to explain different failure causes of emission. For now, only two values have been defined : NONE and FATAL. When pacing is implemented, a new value will be added to specify that emission was interrupted on pacing. This won't be a fatal error as this allows to retry emission but not immediately.	2024-11-19 16:16:48 +01:00
Amaury Denoyelle	5cb8f8a622	MINOR: quic: support a max number of built packet per send iteration Extend QUIC transport emission function to support a maximum datagram argument. The purpose is to ensure that qc_send() won't emit more than the specified value, unless it is 0 which is considered as unlimited. In qc_prep_pkts(), a counter of built datagram has been added to support this. The packet building loop is interrupted if it reaches a specified maximum value. Also, its return value has been changed to the number of prepared datagrams. This is reused by qc_send() to interrupt its work if a specified max datagram argument value is reached over one or several iteration of prepared/sent datagrams. This change is necessary to support pacing emission. Note that ideally, the total length in bytes of emitted datagrams should be taken into account instead of the raw number of datagrams. However, for a first implementation, it was deemed easier to implement it with the latter.	2024-11-19 16:16:48 +01:00
Amaury Denoyelle	a554d82131	MINOR: quic: simplify qc_prep_pkts() exit path To prepare pacing support, qc_prep_pkts() exit path has been rewritten to be easily modified. This is purely refactoring which should not have any functional change : * a dedicated error path has been added * ensure qc_txb_store() is always called to finalize datagram on normal exit path if first_pkt is not NULL. Needed to support breaking from packet building loop in an easier way.	2024-11-19 16:16:48 +01:00
Amaury Denoyelle	4069873403	MINOR: mux-quic: add missing values for show flags Add QCC QC_CF_WAIT_FOR_HS and QCS QC_SF_TXBUB_OOB flags to their respective show_flags to be able to decipher them via dev flags utility. These values have been added in the current dev version, thus no need to backport this patch.	2024-11-19 16:16:48 +01:00
Amaury Denoyelle	8540886f00	DOC: quic: rename max-window-size with default prefix Rename 'tune.quic.frontend.max-window-size' with the prefix 'default-'. This highlights the fact that it is not a hard limit, as it can be overridden if specifying an optional window size via quic-cc-algo on a bind line. No need to backport as this keyword was added on the current dev version.	2024-11-19 16:16:48 +01:00
William Lallemand	f36caf7b81	MEDIUM: stats-file: explicitly ignore comments starting by // Explicitly ignore comments starting by // so they don't emit a warning.	2024-11-19 15:49:44 +01:00
William Lallemand	96f2736e99	MINOR: stats-file: add the filename in the warning Add the name of the stats-file in the warning so it's clear that the warning was provoked by the stats-file and not the config file.	2024-11-19 15:49:44 +01:00
Christopher Faulet	e68c6852ad	DOC: config: Move fs.* and bs.* in section about L5 samples These sample fetch functions were added in the wrong section. Move them in the section about sample fetch functions at L5 layer.	2024-11-19 15:29:41 +01:00
Christopher Faulet	4ccc3f4048	DOC: config: Move wait_end in section about internal samples wait_end is an internal sample fetch function and not a L6 one. So move it in the corresponding section.	2024-11-19 15:29:40 +01:00
Christopher Faulet	e9021a4ca1	DOC: config: Slightly improve the %Tr documentation Specify -1 can also be reported for %Tr delay when the response is invalid.	2024-11-19 15:29:40 +01:00
Christopher Faulet	5863d33fce	BUG/MINOR: http_ana: Report -1 for %Tr for invalid response only The server response time is erroneously reported as -1 when it is intercepted by HAProxy. As stated in the documentation, the server response time is reported as -1 when the last response header was never seen. It happens when a server timeout is triggered before the server managed to process the request. It also happens if the response is invalid. This may be reported by the mux during the response parsing, but also by the HTTP analyzers. However, in this last case, the response time must only be reported as -1 on 502. This patch must be backported to all stable versions. It should fix the issue #2384.	2024-11-19 15:29:40 +01:00
Christopher Faulet	bc967758a2	MINOR: mux-h1: Return 414 or 431 when appropriate When the request is too large to fit in a buffer a 414 or a 431 error message is returned depending on the error state of the request parser. A 414 is returned if the URI is too long, otherwise a 431 is returned. This patch should fix the issue #1309.	2024-11-19 15:29:40 +01:00
Christopher Faulet	41f28b3c53	DEV: phash: Update 414 and 431 status codes to phash The phash tool was updated to reflect the previous change. 414 and 431 are now part of the handled status codes.	2024-11-19 15:29:40 +01:00
Christopher Faulet	62dc8750a9	MINOR: http: Add support for HTTP 414/431 status codes 414-Uri-Too-Long and 431-Request-Header-Fields-Too-Large are now part of supported status codes that can be defined as error files. The hash table defined in http_get_status_idx() was updated accordingly.	2024-11-19 15:29:40 +01:00
Christopher Faulet	18de419f96	DOC: config: Fix a typo in "1.3.1. The Request line" At the beginning of the last paragraph of this section, HTTP/3 was used instead of HTTP/2. It is now fixed.	2024-11-19 15:29:40 +01:00
Christopher Faulet	3af2d91b3b	DOC: config: Add a space before ':' for {bs,fs}.aborted and {bs,fs}.rst_code A space was missing before the ':' for the sample fetch functions above. It was an issue for the text to HTML conversion script. So, let's fix it.	2024-11-19 15:29:40 +01:00
Christopher Faulet	fa43ca2ed0	MINOR: stream: Add an option to "show sess" command to dump the captured URI "show sess" command now supports a list of options that can be set after all other possible arguments (<id>, all...). For now, "show-uri" is the only supported option. With this option, the captured URI, if non-null, is added to the dump of a stream, complete or not. The URI may be anonymized if necessary. This patch should fix the issue #663.	2024-11-19 15:29:40 +01:00
Christopher Faulet	e9bc5937c9	MINOR: agent-check: Be able to set absolute weight via an agent Historically, an agent-check program is only able to set a weight proportional to the initial server's weight. However, it could be handy to also set an absolute value. It is the purpose of this patch. Instead of changing the current way to set a server's weight, a new agent-check command is introduced. The string "weight:", followed by a positive integer or a positive integer percentage, can now be used. If the value ends with the '%' sign, then the new weight will be proportional to the initial weight of the server. Otherwise, the value is considered as an absolute weight and must be between 0 and 256. This patch should fix the issue #360.	2024-11-19 15:29:40 +01:00
Christopher Faulet	1be7140ade	MINOR: http-ana: Add support for "set-cookie-fmt" option to redirect rules It is now possible to use a log-format string to define the "Set-Cookie" header value of a response generated by a redirect rule. There is no special check on the result format and it is not possible during the configuration parsing. It is probably not a big deal because already existing "set-cookie" and "clear-cookie" options don't perform any check. Here is an example: http-request redirect location https://someurl.com/ set-cookie-fmt haproxy="%[var(txn.var)]" This patch should fix the issue #1784.	2024-11-19 15:20:02 +01:00
Christopher Faulet	b2877db47c	MINOR: http-ana: Add option to keep query-string on a location-based redirect On prefix-based redirect, there is an option to drop the query-string of the location. Here it is the opposite: an option is added to preserve the query-string of the original URI for a location-based redirect. By setting "keep-query" option, for a location-based redirect only, the query-string of the original URI is appended to the location. If there is no query-string, nothing is added (no empty '?'). If there is already a non-empty query-string on the location, the original one is appended with '&' separator. This patch should fix issue #2728.	2024-11-19 15:20:02 +01:00
Valentine Krasnobaeva	7848692c4c	MINOR: config: show HAPROXY_BRANCH in "show env" output Before this patch HAPROXY_BRANCH was unset just after configuration parsing. Let's keep it, as it could be used in conditional blocks and some configuration directives and it's handy to check its runtime value via "show env". In master-worker mode, this variable is set to the same value for both processes.	2024-11-19 14:13:50 +01:00
Valentine Krasnobaeva	d58a8d1f64	MINOR: cli: make "show env" accessible via master CLI without enabling debug Before this patch, we needed to put the master CLI in debug mode to be able to issue 'show env' command for the master process. Output of this command is handy even for the master process context, as it allows to control its environment variables, which could be used/modified in the 'global' section. So, let's provide in 'show env' command structure the level ACCESS_MASTER. This allows to see and to access this command in master CLI without putting it in debug mode.	2024-11-19 14:13:42 +01:00
Valentine Krasnobaeva	b9536717cd	BUG/MINOR: mworker-prog: don't warn about deprecated section with expose-deprecated-directives As the master now parses the expose-deprecated-directives option, let's emit the warning about the deprecated 'program' section only if this option wasn't set in the 'global' section. This allows people who prefer not to remove the 'program' section immediately to continue to start the process in zero-warning mode. Adjust the warning message accordingly and mcli_start_progs.vtc test. As expose-deprecated-directives option is a 'global' section keyword, this section must always precede any 'program' section, if users still continue to keep 'program' section. This doesn't need to be backported, as related to the latest changes in the master-worker architecture.	2024-11-19 14:13:30 +01:00
Valentine Krasnobaeva	39ea0df38f	MINOR: cfgparse-global: parse options to allow non std keywords in discovery mode 'Program' section is considered as deprecated now, see the commit 581c8a27d98c ("MEDIUM: mworker: depreciate the 'program' section"). So, the 'program' section parser emits a warning every time since this commit, if its section is present. This makes it impossible to launch the process in zero-warning mode. After master-worker refactoring only the master process parses the 'program' section. So, at first, in order to be able to start in zero-warning mode, we need to parse in master process option, which allows deprecated keywords. Thus, let's set in this commit KWF_DISCOVERY flag to cfg_parse_global_non_std_directives parser, which parses 'expose-deprecated-directives' and 'expose-experimental-directives' options.	2024-11-19 14:13:19 +01:00
Willy Tarreau	f8d3d2e4cf	MINOR: ring: support unit suffixes in the size The ring size used to take only numbers and silently ignore letters (due to atol()), resulting in tiny buffers when trying to collect traces and using e.g. "size 10g". Let's make use of parse_size_err() to properly parse units.	2024-11-19 10:56:45 +01:00
Willy Tarreau	82f190f882	MINOR: tools: make parse_size_err() support 32/64 bits parse_size_err() currently is a function working only on an uint. It's not convenient for certain elements such as rings on large machines. This commit addresses this by having one function for uints and one for ullong, and making parse_size_err() a macro that automatically calls one or the other. It also has the benefit of automatically supporting compatible types (long, size_t etc).	2024-11-19 10:50:42 +01:00
Willy Tarreau	9c6ccb8dbb	MEDIUM: config: warn on unitless timeouts < 100 ms From time to time we face a configuration with very small timeouts which look accidental because there could be expectations that they're expressed in seconds and not milliseconds. This commit adds a check for non-null unitless values smaller than 100 and emits a warning suggesting to append an explicit unit if that was the intent. Only the common timeouts, the server check intervals and the resolvers hold and timeout values were covered for now. All the code needs to be manually reviewed to verify if it supports emitting warnings. This may break some configs using "zero-warning", but greps in existing configs indicate that these are extremely rare and solely intentionally done during tests. At least even if a user leaves that after a test, it will be more obvious when reading 10ms that something's probably not correct.	2024-11-19 10:33:20 +01:00
Willy Tarreau	9d511b3c27	REGTESTS: enable -dW on almost all tests to fail on warnings Now that warnings were almost all removed, let's enable zero-warning via -dW. All tests were adjusted, but two: - mcli/mcli_start_progs.vtc: the programs section currently cannot be silenced - stats/stats-file.vtc: the warning comes from the stats file itself on comment lines. All other ones are now OK.	2024-11-19 09:27:08 +01:00
Willy Tarreau	efd745e22d	REGTESTS: only use tune.ssl.default-dh-param when not using AWS-LC This option is not available with AWS-LC and emits a warning, so let's properly enclose the test to cover this special case.	2024-11-19 09:27:08 +01:00
Willy Tarreau	d37610f43d	REGTESTS: add missing timeouts to 30 tests No less than 30 tests were missing timeouts, preventing them from being started with zero-warning. Since they were not supposed to trigger, they have been set to 30s so as never to trigger, and now they do not produce any warning anymore.	2024-11-19 08:46:02 +01:00
Willy Tarreau	52b72ec3ba	REGTESTS: silence warning "L6 sample fetches ignored" in cond_set_var This reg-test uses req.len in an HTTP backend. It does work but emits a warning suggesting that this is ignored, so most likely its days are numbered now. Let's just use req.hdrs,length instead.	2024-11-19 08:33:15 +01:00
Willy Tarreau	b9537fe66d	REGTESTS: remove a duplicate "option httpslog" in the defaults section This triggers the following warning: 'option httpslog' overrides previous 'option httpslog' in 'defaults' section.	2024-11-19 08:06:26 +01:00
Willy Tarreau	dce394a303	REGTESTS: silence warnings about content-type being ignored The following rules are triggering warnings about content-type being ignored: http-request return content-type "text/plain" if { path /def-4 } http-request return content-type "text/plain" file /dev/null hdr "x-custom-hdr" "%[url]" if { path /empty-file } Annoyingly, the content-type is mandatory when the file is not empty, that might be something to revisit in the future to relax at least one of the rules so that the config doesn't strictly require to know the file contents upfront.	2024-11-19 08:06:26 +01:00
Willy Tarreau	6d70da76d3	REGTESTS: make the unit explicit for very short timeouts Two tests were using "timeout {client,server} 1" to forcefully trigger them, but a forthcoming patch will emit a warning for such small unitless values, so let's be explicit about the unit.	2024-11-19 08:06:26 +01:00
Willy Tarreau	04465d25bc	REGTESTS: silence warning "previous 'http-response' action is final" The regtest "h1or2_to_h1c" contains both an allow and a deny at the end, likely to help catch rare bugs. But this triggers a warning that we can silence by placing a condition on the penultimate rule.	2024-11-19 08:06:26 +01:00
Willy Tarreau	671f6beac1	REGTESTS: silence the "log format ignored" warnings Several tests were declaring a log format without having an explicit log server configured, causing a warning. Let's clean them up.	2024-11-19 08:06:26 +01:00
Willy Tarreau	e72b525832	MINOR: cfgparse: parse tune.bufsize.small as a size Till now this value was parsed as raw integer using atol() and would silently ignore any trailing suffix, causing unexpected behaviors when set, e.g. to "4k". Let's make use of parse_size_err() on it so that units are supported. This requires to turn it to uint as well, which was verified to be OK.	2024-11-18 19:07:05 +01:00
Willy Tarreau	a344d37fad	MINOR: cfgparse: parse tune.bufsize as a size Till now this value was parsed as raw integer using atol() and would silently ignore any trailing suffix, preventing from starting when set e.g. to "64k". Let's make use of parse_size_err() on it so that units are supported. This requires to turn it to uint as well, and to explicitly limit its range to INT_MAX - 2*sizeof(void*), which was previously partially handled as part of the sign check.	2024-11-18 19:06:25 +01:00
Willy Tarreau	2f0c6ff3a5	MINOR: cfgparse: parse tune.recv_enough as a size Till now this value was parsed as raw integer using atol() and would silently ignore any trailing suffix, causing unexpected behaviors when set, e.g. to "512k". Let's make use of parse_size_err() on it so that units are supported. This requires to turn it to uint as well, and since it's sometimes compared to an int, we limit its range to 0..INT_MAX.	2024-11-18 19:01:28 +01:00
Willy Tarreau	a90a7d4d60	MINOR: cfgparse: parse tune.pipesize as a size Till now this value was parsed as raw integer using atol() and would silently ignore any trailing suffix, causing unexpected behaviors when set, e.g. to "512k". Let's make use of parse_size_err() on it so that units are supported. This requires to turn it to uint as well, which was verified to be OK.	2024-11-18 18:51:31 +01:00
Willy Tarreau	f9f28b7584	MINOR: cfgparse: parse tune.{rcvbuf,sndbuf}.{frontend,backend} as sizes Till now these values were parsed as raw integer using atol() and would silently ignore any trailing suffix, causing unexpected behaviors when set, e.g. to "512k". Let's make use of parse_size_err() on them so that units are supported. This requires to turn them to uint as well, which is OK.	2024-11-18 18:50:02 +01:00
Willy Tarreau	a923c72357	MINOR: cfgparse: parse tune.{rcvbuf,sndbuf}.{client,server} as sizes Till now these values were parsed as raw integer using atol() and would silently ignore any trailing suffix, causing unexpected behaviors when set, e.g. to "512k". Let's make use of parse_size_err() on them so that units are supported. This requires to turn them to uint as well, which is OK.	2024-11-18 18:49:01 +01:00
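The atol() pitfall mentioned in these size-parsing commits is easy to reproduce: atol("10g") stops at the first non-digit and returns 10. The sketch below shows one way a unit-aware parser can reject or scale such input. parse_size_sketch() is a hypothetical helper written for illustration; it is not parse_size_err(), whose real signature and accepted units are not shown in this log, and the 1024-based k/m/g suffixes are an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper, NOT HAProxy's parse_size_err(): parses
 * "<digits>[k|m|g]" (case-insensitive, powers of 1024) into *out.
 * Returns 0 on success, -1 on syntax error or overflow.
 */
int parse_size_sketch(const char *text, unsigned long long *out)
{
	char *end;
	unsigned long long value;
	unsigned int shift = 0;

	assert(text != NULL && out != NULL);

	if (*text < '0' || *text > '9')
		return -1;               /* reject empty input, signs, spaces */

	errno = 0;
	value = strtoull(text, &end, 10);
	if (errno == ERANGE)
		return -1;               /* digits alone already overflow */

	switch (*end) {
	case 'k': case 'K': shift = 10; end++; break;
	case 'm': case 'M': shift = 20; end++; break;
	case 'g': case 'G': shift = 30; end++; break;
	case '\0': break;
	default: return -1;          /* unknown suffix */
	}

	if (*end != '\0')
		return -1;               /* trailing garbage after the unit */

	if (shift && value > (ULLONG_MAX >> shift))
		return -1;               /* would overflow once scaled */

	*out = value << shift;
	return 0;
}
```

The result is stored in an unsigned long long so that values such as "10g" fit regardless of the platform's int width; a caller needing a narrower type would add its own range check, as the tune.bufsize and tune.recv_enough commits describe.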
Willy Tarreau	45f9e95f22	MINOR: sample: extend the "when" converter to support an ACL Sometimes conditions to decide of an anomaly are not as easy to define as just an error or a success. One example use case would be to monitor the transfer time and fix a threshold. An idea suggested by Tristan would be to permit the "when" converter to refer to a more variable or dynamic condition. Here we make this possible by making "when" rely on a named ACL. The ACL then needs to be specified in either the proxy or the defaults section. Since it is evaluated inline, it may even refer to information available at the end (at log time) such as the data transfer time. If the ACL evaluates to true, the converter passes the data. Example: log "dbg={-}" when fine, or "dbg={... debug info ...}" on slow transfers: acl slow_xfer res.timer.data ge 10000 # more than 10s is slow log-format "$HAPROXY_HTTP_LOG_FMT \ fsdbg={%[fs.debug_str,when(acl,slow_xfer)]} \ bsdbg={%[bs.debug_str,when(acl,slow_xfer)]}"	2024-11-18 16:11:55 +01:00
Willy Tarreau	00fcda1ff2	MINOR: acl: export find_acl_default() It will be needed in a future patch, so let's export it (it was static).	2024-11-18 15:15:54 +01:00
Willy Tarreau	9539f2b097	[RELEASE] Released version 3.1-dev13 Released version 3.1-dev13 with the following main changes : - MEDIUM: mworker: depreciate the 'program' section - BUILD: ot: use a cebtree instead of a list for variable names - MINOR: startup: replace HAPROXY_LOAD_SUCCESS with global load_status - BUG/MINOR: startup: set HAPROXY_CFGFILES in read_cfg - BUG/MINOR: cli: don't show sockpairs in HAPROXY_CLI and HAPROXY_MASTER_CLI - BUG/MEDIUM: stconn: Don't forward shut for SC in connecting state - BUG/MEDIUM: resolvers: Insert a non-executed resulution in front of the wait list - MINOR: debug: explicitly permit the counter condition to be empty - MINOR: debug: add a new counter type for glitches - MINOR: mux-h2: count glitches when they're reported - BUG/MINOR: deinit: release uri_auth admin rules - MINOR: uri_auth: add stats_uri_auth_free helper - MEDIUM: uri_auth: implement clean uri_auth cleaning - MINOR: mux-quic/h3: count glitches when they're reported - BUG/MEDIUM: mux-h2: Don't send RST_STREAM frame for streams with no ID - BUG/MINOR: Don't report early srv aborts on request forwarding in DONE state - MINOR: promex: Expose the global node and description in process metrics - MINOR: promex: Add global and proxies description as labels to all metrics - OPTIM: pattern: only apply LRU cache for large enough lists - BUG/MEDIUM: checks: make sure to always apply offsets to now_ms in expiration - BUG/MINOR: debug: do not set task expiration to TICK_ETERNITY - BUG/MEDIUM: mailers: make sure to always apply offsets to now_ms in expiration - BUG/MINOR: mux_quic: make sure to always apply offsets to now_ms in expiration - BUG/MINOR: peers: make sure to always apply offsets to now_ms in expiration - BUG/MEDIUM: clock: make sure now_ms cannot be TICK_ETERNITY - MINOR: debug/cli: replace "debug dev counters" with "debug counters" - DOC: config: add tune.h2.{be,fe}.rxbuf to the global keywords index - MINOR: chunk: add a BUG_ON upon the next init_trash_buffer()	2024-11-15 18:42:29 +01:00
William Lallemand	0bfd36e7b8	MINOR: chunk: add a BUG_ON upon the next init_trash_buffer() The trash pool is initialized twice in haproxy, first during STG_POOL, and 2nd after configuration parsing. Doing alloc_trash_chunk() between this 2 phases can lead to strange things if we are using it after, indeed the pool is destroyed and trying to do a free_trash_chunk() or accessing the pointer will lead to crashes. This patch checks that we don't have used buffers from the trash pool before initializing the pool again.	2024-11-15 17:15:06 +01:00
Willy Tarreau	5f37af7a8e	DOC: config: add tune.h2.{be,fe}.rxbuf to the global keywords index These two keywords were missing from the index, let's add them.	2024-11-15 16:32:37 +01:00
Willy Tarreau	4420939fcd	MINOR: debug/cli: replace "debug dev counters" with "debug counters" "debug dev" commands are not meant to be used by end-users, and are purposely not documented. Yet due to their usefulness in troubleshooting sessions, users are increasingly invited by developers to use some of them. "debug dev counters" is one of them. Better move it to "debug counters" and document it so that users can check them even if the output can look cryptic at times. This, combined with DEBUG_GLITCHES, can be convenient to observe suspicious activity. The doc however specifies that the format may change between versions and that new entries/types might appear within a stable branch.	2024-11-15 16:26:01 +01:00
Willy Tarreau	5a3735a155	BUG/MEDIUM: clock: make sure now_ms cannot be TICK_ETERNITY In clock ticks, 0 is TICK_ETERNITY. Long ago we used to make sure now_ms couldn't be zero so that it could be assigned to expiration timers, but it has long changed after functions like tick_add() were instrumented to make the check. The problem is that aside the rare few accidental direct assignments to expiration dates, it's also used to mark the beginning of an event that's later checked against TICK_ETERNITY to know if it has already struck. The problem in this case is that certain events may just be replaced or dropped just because they apparently never appeared. It's probably the case for stconn's "lra" and "fsb" fields, just like it is for all those involving tick_add_ifset(), like h2c->idle_start. The right approach would be to change the type of now_ms to something else that cannot take direct computations and that represents a timestamp, forcing to always use the conversion functions. The variables holding such timestamps would also be distinguished from intervals. At first glance we could have for timestamps: - 0 = never happened (for the past), eternity (for the future) - X = date and for intervals: - 0 = not set - X = interval However this requires significant changes. Instead for now, let's just make sure again that now_ms is never 0 by setting it to 1 when this happens (1 / 4 billion times, or 1ms every 49.7 days). This will need to be carefully backported to older versions. Note that with this patch backported, the previous ones fixing the zero date are not strictly needed.	2024-11-15 16:01:31 +01:00
Willy Tarreau	ed55ff878d	BUG/MINOR: peers: make sure to always apply offsets to now_ms in expiration Now_ms can be zero nowadays, so it's not suitable for direct assignment to t->expire, as there's a risk that the timer never wakes up once assigned (TICK_ETERNITY). Let's use tick_add(now_ms, 0) for an immediate wakeup instead. The impact here might be a reconnect programmed upon signal receipt at the wrapping date not having a working timeout. This should be backported where it applies.	2024-11-15 15:44:05 +01:00
Willy Tarreau	f66bfcff96	BUG/MINOR: mux_quic: make sure to always apply offsets to now_ms in expiration Now_ms can be zero nowadays, so it's not suitable for direct assignment to t->expire, as there's a risk that the timer never wakes up once assigned (TICK_ETERNITY). Let's use tick_add(now_ms, 0) for an immediate wakeup instead. The impact looks null since the task is also woken up, but better not leave such tasks in the timer tree anyway. This should be backported where it applies.	2024-11-15 15:41:21 +01:00
Willy Tarreau	841be4cdd1	BUG/MEDIUM: mailers: make sure to always apply offsets to now_ms in expiration Now_ms can be zero nowadays, so it's not suitable for direct assignment to t->expire, as there's a risk that the timer never wakes up once assigned (TICK_ETERNITY). Let's use tick_add(now_ms, 0) for an immediate wakeup instead. The impact here might be mailers suddenly stopping. This should be backported where it applies.	2024-11-15 15:39:58 +01:00
Willy Tarreau	808a7cc777	BUG/MINOR: debug: do not set task expiration to TICK_ETERNITY Using "debug task", it's possible to change a task's expiration, but we must be careful not to set it to TICK_ETERNITY. Let's use tick_add() instead. The risk is basically null since it's a debugging command, so no backport is needed.	2024-11-15 15:39:00 +01:00
Willy Tarreau	2f287f14f3	BUG/MEDIUM: checks: make sure to always apply offsets to now_ms in expiration Now_ms can be zero nowadays, so it's not suitable for direct assignment to t->expire, as there's a risk that the timer never wakes up once assigned (TICK_ETERNITY). Let's use tick_add(now_ms, 0) for an immediate wakeup instead. The impact here might be health checks suddenly stopping. This should be backported where it applies.	2024-11-15 15:39:00 +01:00
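The now_ms fixes above all revolve around one convention: a tick value of 0 means TICK_ETERNITY, so neither a timestamp nor an expiration date may ever be 0. A minimal sketch of the two guards follows. The names sketch_tick_add() and sketch_clock_ms() are hypothetical stand-ins, not HAProxy's tick_add() or its clock code; they only assume a 32-bit millisecond counter that wraps, as the "1ms every 49.7 days" remark implies.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_TICK_ETERNITY 0u   /* 0 is reserved for "never expires" */

/* Hypothetical stand-in for tick_add(): adds <timeout> ms to <now> with
 * 32-bit wrapping, and skips over the reserved value so that the result
 * can safely be assigned to an expiration date.
 */
uint32_t sketch_tick_add(uint32_t now, uint32_t timeout)
{
	uint32_t expire = (uint32_t)(now + timeout);

	if (expire == SKETCH_TICK_ETERNITY)
		expire++;                /* land 1 ms later instead of "never" */

	assert(expire != SKETCH_TICK_ETERNITY);
	return expire;
}

/* Hypothetical clock update: truncates a 64-bit millisecond counter to
 * 32 bits and makes sure the visible timestamp is never 0, which happens
 * once every 2^32 ms (about 49.7 days).
 */
uint32_t sketch_clock_ms(uint64_t raw_ms)
{
	uint32_t now = (uint32_t)raw_ms;

	if (now == SKETCH_TICK_ETERNITY)
		now = 1;
	return now;
}
```

In this sketch, sketch_tick_add(now, 0) plays the role of the "immediate wakeup" idiom used in the fixes: it returns the current time unless that would be 0, in which case it returns 1.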
Willy Tarreau	555994c968	OPTIM: pattern: only apply LRU cache for large enough lists As shown in issue #1518, the LRU cache has a non-null cost that can sometimes be above the match cost it's trying to avoid. After a number of tests, it appears that: - "simple" match operations (sub, beg, end, int etc) reach a break-even after ~20 patterns in list - "heavy" match operations (reg) reach a break-even after ~5 patterns in list Let's only consult the LRU cache when the number of patterns in the expression is at least as large as this limit. Of course there will always be outliers but this is already a good start. Another improvement consists in reducing the cache size to further speed up lookups, which makes sense if fewer expressions use the cache.	2024-11-15 15:33:04 +01:00
Christopher Faulet	25b0592745	MINOR: promex: Add global and proxies description as labels to all metrics While the global description is exposed, when defined, in a dedicated metric, it is not possible to dump the description defined in a frontend/listen/backend sections. So, thanks to this patch, it is now possible to dump it as a label of all metrics of the corresponding section. To do so, "desc-labels" parameter must be provided on the URL: /metrics?desc-labels When this parameter is set, if a description is provided in a section, including the global one, the "desc" label will be added to all metrics of this section. For instance: haproxy_frontend_current_sessions{proxy="front-http",desc="..."} 1 Note that servers metrics inherit the description of their backend/listen section. This patch should solve the issue #1531.	2024-11-15 14:25:13 +01:00
Christopher Faulet	451d216a53	MINOR: promex: Expose the global node and description in process metrics The global node value is now exposed via "haproxy_process_node" metrics. The metric value is always set to 1 and the node name itself is the "node" label. The same is performed for the global description. But only if it is defined. In that case "haproxy_process_description" metric is defined, with 1 as value and the description itself is set in the "desc" label.	2024-11-15 14:24:31 +01:00
Christopher Faulet	a930e99f46	BUG/MINOR: Don't report early srv aborts on request forwarding in DONE state L7-retries may be ignored if server aborts are detected during the request forwarding, when the request is already in DONE state. When a request was fully processed (so in HTTP_MSG_DONE state) and is waiting to be forwarded to the server, there is a test to detect server aborts, to be able to report the error. However, this test must be skipped if the response was not received yet, to let the response analyzers handle the abort. It is important to properly handle the retries. This test must only be performed if the response analysis was finished. It means the response must be at least in HTTP_MSG_BODY state. This patch should be backported as far as 2.8.	2024-11-15 11:00:05 +01:00
Christopher Faulet	f065d00098	BUG/MEDIUM: mux-h2: Don't send RST_STREAM frame for streams with no ID On server side, the H2 stream is first created with an unassigned ID (ID == 0). Its ID is assigned when the request is emitted, before formatting the HEADERS frame. However, the session may be aborted during that stage. We must take care to not emit RST_STREAM frame for this stream, because it does not exist yet for the server. It is especially important to do so because, depending on the timing, it may also happen before the H2 PREFACE was sent. This patch must be backported to all stable versions. It is related to issue	2024-11-15 10:34:47 +01:00
Willy Tarreau	4fd6d15344	MINOR: mux-quic/h3: count glitches when they're reported The qcc_report_glitch() function is now replaced with a macro to support enumerating counters for each individual glitch line. For now this adds 36 such counters. The macro supports an optional description, though that is not being used for now. As a reminder, this requires to build with -DDEBUG_GLITCHES=1.	2024-11-14 20:43:33 +01:00
Aurelien DARRAGON	42710b7320	MEDIUM: uri_auth: implement clean uri_auth cleaning proxy auth_uri struct was manually cleaned up during deinit, but the logic behind it was kind of awkward because it was required to find out which ones were shared or not. Instead, let's switch to a proper refcount mechanism and free the auth_uri struct directly in proxy_free_common().	2024-11-14 15:03:38 +01:00
Aurelien DARRAGON	e1ec37ea51	MINOR: uri_auth: add stats_uri_auth_free helper Let's now leverage stats_uri_auth_free() helper to free uri_auth struct instead of manually performing the cleanup, which is error-prone.	2024-11-14 15:03:33 +01:00
Aurelien DARRAGON	350a3ab052	BUG/MINOR: deinit: release uri_auth admin rules When uri_auth admin rules were implemented in 474be415 ("[MEDIUM] stats: add an admin level") no attempt was made to free the list of allocated rules, which makes valgrind unhappy upon deinit when "stats admin" is used in the config. To fix the issue, let's cleanup the admin rules list upon deinit where uri_auth freeing is already handled. While this could be backported to every stable version, given how minor this is and that it has no impact on the dying process, it is probably not worth the effort.	2024-11-14 15:03:27 +01:00
Willy Tarreau	df93cf72b9	MINOR: mux-h2: count glitches when they're reported The h2c_report_glitch() function is now replaced with a macro to support enumerating counters for each individual glitch line. For now this adds 43 such counters. The macro supports an optional description, though that is not being used for now. It gives outputs like this (note that the last one was purposely instrumented to pass a description): > debug dev counters glt all 0 GLT mux_h2.c:5976 h2c_dec_hdrs() 0 GLT mux_h2.c:5960 h2c_dec_hdrs() (...) 0 GLT mux_h2.c:2207 h2c_frt_recv_preface() 0 GLT mux_h2.c:1954 h2c_frt_stream_new(): new stream too early As a reminder, this requires to build with -DDEBUG_GLITCHES=1.	2024-11-14 09:01:57 +01:00
Willy Tarreau	502790ed7e	MINOR: debug: add a new counter type for glitches COUNT_GLITCH() will implement an unconditional counter on its declaration line when DEBUG_GLITCHES is set, and do nothing otherwise. The output will be reported as "GLT" and can be filtered as "glt" on the CLI. The purpose is to help figure what's happening if some glitches counters start going through the roof. The macro supports an optional string argument to describe the cause of the glitch (e.g. "truncated header"), which is then reported in the dump. For now this is conditioned by DEBUG_GLITCHES but if it turns out to be light enough, maybe we'll keep it enabled full time. In this case it might have to be moved away from debug dev, or at least documented (or done as debug counters maybe so that dev can remain undocumented and updatable within a branch?).	2024-11-14 08:49:38 +01:00
Willy Tarreau	e119095290	MINOR: debug: explicitly permit the counter condition to be empty In order to count new event types, we'll need to support empty conditions so that we don't have to fake if (1) that would pollute the output. This change checks if #cond is an empty string before concatenating it with the optional var args, and avoids dumping the colon on the dump if the whole description is empty.	2024-11-14 08:47:00 +01:00
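The empty-condition handling described in this commit relies on the preprocessor's stringification: an empty macro argument stringifies to "". A minimal sketch of the idea follows, using the hypothetical names SKETCH_DESC() and sketch_format(), which are not HAProxy's macros; the note is passed as a mandatory string literal here instead of optional variadic arguments, to keep the example strictly portable.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical macro: builds one description string from the stringified
 * condition followed by an optional note. An empty <cond> yields "" and
 * adjacent string literals are concatenated at compile time.
 */
#define SKETCH_DESC(cond, note) ("" #cond note)

/* Hypothetical formatter: prints "<where>: <desc>", but omits the colon
 * entirely when the whole description is empty.
 */
int sketch_format(char *buf, size_t size, const char *where, const char *desc)
{
	assert(buf != NULL && size > 0);

	if (*desc)
		return snprintf(buf, size, "%s: %s", where, desc);
	return snprintf(buf, size, "%s", where);
}
```

With this sketch, SKETCH_DESC(x > 0, "") produces "x > 0", while SKETCH_DESC(, "") produces an empty string and the formatter prints only the location.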
Christopher Faulet	8f28dbeea9	BUG/MEDIUM: resolvers: Insert a non-executed resulution in front of the wait list When a resolver is woken up to process DNS resolutions, it is possible to trigger an infinite loop on the resolver's wait list because delayed resolutions are always reinserted at the end of this list. This leads the watchdog to kill the process. Re-inserting them in front of the list fixes the bug. When a resolver tries to send the queries for the resolutions in its wait list, it may be unable to proceed for a resolution. This may happen because the resolution must be skipped (no hostname to resolve, a resolution already in progress) or when an error occurred. In that case, the resolution is re-inserted in the resolver's wait list to be retried later, on a next wakeup. However, the resolution is inserted at the end of the wait list. So it is immediately reevaluated, in the same execution loop, instead of being delayed. Most of the time, it is not an issue because the resolution is considered as not expired on the second run. But it is a problem when the internal time wraps and is equal to 0. In that case, the resolution expiration date is badly computed and it is always considered as expired. If two or more resolutions are in that state, the resolver loops forever on its wait list, until the process is killed by the watchdog. So we can argue that the way the resolution expiration date is computed must be fixed. And it would be true in a perfect world. However, the resolvers code is so crappy that it is hard to be sure not to introduce regressions. It is far easier to re-insert delayed resolutions in front of the wait list. This fixes the issue and at worst, these resolutions will be evaluated one time too many on the next wakeup and only if now_ms was equal to 0 on the prior wakeup. This patch should be backported to all stable versions. On 2.2, LIST_ADD() must be used instead of LIST_INSERT()	2024-11-13 10:53:27 +01:00
Christopher Faulet	72e529829b	BUG/MEDIUM: stconn: Don't forward shut for SC in connecting state In connecting state, shutdown must not be forwarded or scheduled because otherwise this will prevent any connection retries. Indeed, if an EOS is reported by the mux during the connection establishment, this should be handled by the stream to eventually retry. If the write side is closed first, this will not be possible because the stconn will be switched to DIS state. If the shut is scheduled because pending data are blocked, the same may happen, depending on the abort-on-close option. This patch should slowly be backported as far as 2.4. But an observation period is mandatory. On 2.4, the patch must be adapted to use the stream-interface API.	2024-11-13 10:53:27 +01:00
Valentine Krasnobaeva	113745e6f0	BUG/MINOR: cli: don't show sockpairs in HAPROXY_CLI and HAPROXY_MASTER_CLI Before this fix, HAPROXY_CLI and HAPROXY_MASTER_CLI contained, along with the CLI socket addresses, internal sockpairs which are used only for the master CLI (reload sockpair and sockpair shared with a worker process). These internal sockpairs always need to be hidden. At the moment there is no client which uses sockpair addresses for the stats listener or in order to connect to the master CLI. So, let's simply not copy these internal sockpair addresses of MASTER and GLOBAL proxy listeners. As listeners with sockpairs are skipped and they can be present in the listeners list in any order, let's add the semicolon separator between addresses only when some string is already saved in the trash and we are sure that we are adding a new address to it. Otherwise, we could have such weird output: HAPROXY_MASTER_CLI=unix@/tmp/mcli.sock;; This fix needs to be backported to all stable versions.	2024-11-13 09:50:05 +01:00
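The separator rule described in this fix (emit ';' only when the output already holds an address and a new one is really being appended) can be sketched as follows. The struct sketch_listener type and sketch_join_cli_addrs() function are hypothetical and only mimic the behaviour stated in the message; the real code works on HAProxy's listener list and trash buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified listener: an address string and a flag telling
 * whether it is an internal sockpair that must stay hidden.
 */
struct sketch_listener {
	const char *addr;
	int is_sockpair;
};

/* Joins the visible addresses with ';' into <out>. The separator is only
 * written when <out> already contains an address, so skipped sockpairs
 * never leave a dangling or doubled ';'. Returns the resulting length.
 */
size_t sketch_join_cli_addrs(char *out, size_t size,
                             const struct sketch_listener *l, size_t count)
{
	size_t len = 0;
	size_t i;

	assert(out != NULL && size > 0);
	out[0] = '\0';

	for (i = 0; i < count; i++) {
		int ret;

		if (l[i].is_sockpair)
			continue;        /* internal sockpair: never exported */

		ret = snprintf(out + len, size - len, "%s%s",
		               len ? ";" : "", l[i].addr);
		if (ret < 0 || (size_t)ret >= size - len) {
			out[len] = '\0'; /* drop the partially written entry */
			break;
		}
		len += (size_t)ret;
	}
	return len;
}
```

Because the separator is decided at append time rather than after each element, the order in which sockpairs appear in the list does not matter.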
Valentine Krasnobaeva	1f0cd91fe7	BUG/MINOR: startup: set HAPROXY_CFGFILES in read_cfg load_cfg() is called only once before the first reading of the configuration (we parse here only the global section). Then, before reading the rest of the sections (second call of read_cfg()), we call clean_env(). As HAPROXY_CFGFILES is set in load_cfg(), which is called only once, clean_env() erases it. Thus, it's no longer shown in "show env" output. To fix this, let's set HAPROXY_CFGFILES in read_cfg(). This way, in master-worker mode it is set for both master and worker processes, as it was before the refactoring. This fix doesn't need to be backported as it is related to the latest master-worker architecture change.	2024-11-13 09:50:05 +01:00
Valentine Krasnobaeva	d5d41dee3d	MINOR: startup: replace HAPROXY_LOAD_SUCCESS with global load_status After the master-worker refactoring, the master performs a re-exec only once until it receives a "reload" command or a USR2 signal. There is no longer a second master re-exec to free unused memory. Thus, there is no longer any need to export the environment variable HAPROXY_LOAD_SUCCESS with the worker process load status. This status can simply be saved in a global variable load_status.	2024-11-13 09:50:05 +01:00
Miroslav Zagorac	aadda34fd6	BUILD: ot: use a cebtree instead of a list for variable names In order for the function flt_ot_vars_scope_dump() to work, it is necessary to take into account the changes made by the commits 47ec7c681 ("OPTIM: vars: use a cebtree instead of a list for variable names") and 5d350d1e5 ("OPTIM: vars: use multiple name heads in the vars struct"). The function is only used if the OT_DEBUG=1 option is set when compiling HAProxy.	2024-11-12 11:07:13 +01:00
William Lallemand	581c8a27d9	MEDIUM: mworker: depreciate the 'program' section The program section is unreliable and should not be used, more reliable alternatives exist outside HAProxy. Let's depreciate the section so we could remove it completely in 3.3.	2024-11-08 17:06:58 +01:00
Willy Tarreau	0434e87348	[RELEASE] Released version 3.1-dev12 Released version 3.1-dev12 with the following main changes : - MINOR: startup: tune.renice.{startup,runtime} allow to change priorities - BUG/MEDIUM: promex: Fix dump of extra counters - BUILD: import/mt_list: support building with TCC - BUILD: compiler: define __builtin_prefetch() for tcc - CLEANUP: quic: Remove the useless directive "tune.quic.backend.max-idle-timeou" - DOC: config: document connection error 44 (reverse connect failure) - CLEANUP: connection: properly name the CO_ER_SSL_FATAL enum entry - DEBUG: cli: support closing "hard" using close() in addition to fd_delete() - MINOR: connection: add more connection error codes to cover common errno - MINOR: rawsock: set connection error codes when returning from recv/send/splice - MINOR: connection: add new sample fetch functions fc_err_name and bc_err_name - MINOR: quic: Help diagnosing malformed probing packets - BUG/MINOR: quic: fix malformed probing packet building - MINOR: listener: Remove useless checks on the receiver protocol existence - MINOR: http-conv: Remove unreachable goto statement in sample_conv_q_preferred - MINOR: http: don't %-encode the payload when not relevant - MINOR: quic: simplify qc_parse_pkt_frms() return path - MINOR: quic: use dynamically allocated frame on parsing - MINOR: quic: extend return value of CRYPTO parsing - BUG/MINOR: quic: repeat packet parsing to deal with fragmented CRYPTO - BUG/MINOR: mworker: do 'program' postparser checks in read_cfg_in_discovery_mode - EXAMPLES: add "traces.cfg" with traces examples - BUG/MEDIUM: quic: do not consider ACK on released stream as error - CLEANUP: stats: fix misleading comment on top of stat_idx_info - MINOR: wdt: move the local timers to a struct - MINOR: debug: add a function to dump a stuck thread - DEBUG: wdt: better detect apparently locked up threads and warn about them - DEBUG: cli: make it possible for "debug dev loop" to trigger warnings - DEBUG: wdt: make the blocked traffic warning delay configurable - DEBUG: wdt: add a stats counter "BlockedTrafficWarnings" in show info - DEBUG: wdt: set the default blocked task delay to 100 ms - MINOR: debug: move the "recover now" warn message after the optional notes - MINOR: event_hdl: add event_hdl_sub_list_empty() helper func - MINOR: pattern: add _pat_ref_new() helper func - OPTIM: pattern: use malloc() to initialize new pat_ref struct - MINOR: pattern: add pat_ref_free() helper func - CLEANUP: guid: remove global tree export - BUG/MINOR: guid/server: ensure thread-safety on GUID insert/delete - DOC: management: explain the change of behavior of the program section - BUG/MEDIUM: mux-h2: try to wait for the peer to read the GOAWAY - BUG/MEDIUM: quic: prevent crash due to CRYPTO parsing error	2024-11-08 15:46:54 +01:00
Amaury Denoyelle	2975e8805d	BUG/MEDIUM: quic: prevent crash due to CRYPTO parsing error A packet which contains several split and out-of-order CRYPTO frames may be parsed multiple times to ensure it can be handled via ncbuf. Only 3 iterations can be performed to prevent excessive CPU usage. There is a risk of crash if packet parsing is interrupted after the maximum number of iterations is reached, or if no progress can be made on the ncbuf. This is because <frm> may be dangling after list_for_each_entry_safe(). The crash occurs on qc_frm_free() invocation, on the error path of qc_parse_pkt_frms(). To fix it, always reset frm to NULL after list_for_each_entry_safe() to ensure it is not dangling. This should fix the new report on github issue #2776. This regression has been triggered by the following patch : 1767196d5b2d8d1e557f7b3911a940000166ecda BUG/MINOR: quic: repeat packet parsing to deal with fragmented CRYPTO As such, it must be backported up to 2.6, after the above patch.	2024-11-08 15:19:57 +01:00
Willy Tarreau	3ed9361688	BUG/MEDIUM: mux-h2: try to wait for the peer to read the GOAWAY When timeout http-keep-alive is very short (e.g. 10ms), it's possible sometimes for a client to face truncated responses due to an early close that happens while the system is still pushing the last data, colliding with the client's WINDOW_UPDATEs that trigger RSTs. Here we're trying to do better: first we send a GOAWAY on timeout, then we wait up to clientfin/client timeout for the peer to react so that we don't immediately close. This is sufficient to avoid truncation as soon as the timeout is more than a few hundred ms. It's not certain it should be backported, because it's a bit sensitive and might possibly fall into certain edge cases.	2024-11-08 14:31:07 +01:00
William Lallemand	75b302d123	DOC: management: explain the change of behavior of the program section The program section does not work exactly the same way with the master-worker rework of HAProxy 3.1. Let's explain it in the program documentation.	2024-11-08 12:00:26 +01:00
Amaury Denoyelle	8e0e7d9d1a	BUG/MINOR: guid/server: ensure thread-safety on GUID insert/delete Since 3.0, it is possible to assign a GUID to proxies, listeners and servers. These objects are stored in a global tree guid_tree. Proxies and listeners are static. However, servers may be added or deleted at runtime, which implies that guid_tree must be protected. Fix this by declaring a read-write lock to protect tree access. For now, only guid_insert() and guid_remove() are protected using a write lock. Outside of these, the GUID tree is not accessed at runtime. If server CLI commands are extended to support GUID as server identifier, the lookup operation should be extended with a read lock protection. Note that during stat-file preloading, the GUID tree is accessed for lookup. However, as it is performed on startup which is single threaded, there is no need for a lock here. A BUG_ON() has been added to ensure this precondition remains true. This bug could cause a segfault when using dynamic servers with GUID. However, it has not been reproduced so far. This must be backported up to 3.0. To avoid a conflict issue, the previous cleanup patch can be merged before it.	2024-11-07 18:17:03 +01:00
Amaury Denoyelle	b70880cdc9	CLEANUP: guid: remove global tree export guid_tree is not directly used outside of functions provided by the guid module. Remove its export from the include file.	2024-11-07 17:20:00 +01:00
Aurelien DARRAGON	aba3ed62ae	MINOR: pattern: add pat_ref_free() helper func For now, pat_ref structs are never freed, except during init in case of error. The freeing is done directly in the init functions because we don't have a helper for that. Not having a helper func to properly free pat_ref structs doesn't encourage us to free unused pat_ref structs, plus it is error-prone if new dynamic members are added to the pat_ref struct in the future. To fix that, let's add a pat_ref_free() helper func and use it where relevant (which means only under pat_ref init functions for now).	2024-11-07 11:36:13 +01:00
Aurelien DARRAGON	e8a0dbff93	OPTIM: pattern: use malloc() to initialize new pat_ref struct As mentioned in the previous commit, in _pat_ref_new(), it was not strictly needed to explicitly assign all struct members to 0 since the struct was allocated with calloc() which does the zeroing for us. However, it was verified that we already initialize all fields explicitly, thus there is no reason to keep using calloc() instead of malloc(). In fact using malloc() is less expensive, so let's use that instead now.	2024-11-07 11:36:08 +01:00
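The trade-off discussed here can be shown with a small sketch. `struct ref` and `ref_new()` are hypothetical and are not the actual pat_ref layout. The point is only that malloc() is safe when every member is assigned explicitly, and that this has to be re-checked each time a member is added.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical struct standing in for a reference object. */
struct ref {
    char *name;
    unsigned int flags;
    int id;
    struct ref *next;
};

/* Allocates a new ref with malloc(). Every member is assigned explicitly
 * below, so the zeroing that calloc() would perform is redundant here.
 */
struct ref *ref_new(int id)
{
    struct ref *r;

    assert(id >= 0); /* hypothetical precondition for this sketch */

    r = malloc(sizeof(*r));
    if (!r)
        return NULL;

    r->name = NULL;
    r->flags = 0;
    r->id = id;
    r->next = NULL;
    return r;
}
```

Note that padding bytes stay uninitialized with malloc(), which only matters if the struct is later compared or copied as raw bytes.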
Aurelien DARRAGON	d1397401f0	MINOR: pattern: add _pat_ref_new() helper func pat_ref_newid() and pat_ref_new() are two functions to create and initialize a pat_ref struct based on input parameters. Both function perform the same generic allocation and initialization for pat_ref struct, thus there is quite a lot of code redundancy. This is error-prone if the pat_ref init sequence has to be updated at some point. To reduce maintenance costs, let's add a _pat_ref_new() helper func that takes care of the generic allocation and base initialization for pat_ref struct.	2024-11-07 11:36:01 +01:00
Aurelien DARRAGON	79a346aa28	MINOR: event_hdl: add event_hdl_sub_list_empty() helper func event_hdl_sub_list_empty() may be used to know if the subscription list passed as argument is empty or not (ie: if there currently are any subscribers or not). It can be useful to know if the subscription list is empty in order to avoid unnecessary preparation work and skip event publishing to save CPU time if we already know that no one is interested in tracking the changes for a given subscription list.	2024-11-07 11:35:55 +01:00
Willy Tarreau	5dcf2012fc	MINOR: debug: move the "recover now" warn message after the optional notes At the end of the too long processing warning added by commit 0950778b3a ("MINOR: debug: add a function to dump a stuck thread"), there can be some optional notes about lua and memory trimming. However it's a bit awkward that they appear after the "trying to recover now" message. Let's just move that message after the notes.	2024-11-07 07:56:13 +01:00
Willy Tarreau	5f4fe20116	DEBUG: wdt: set the default blocked task delay to 100 ms The warn-blocked-traffic-after can be significantly lowered. In any case, in order to be usable it must be well below the limit to have a chance to emit exploitable traces before the watchdog finally fires. Even configured at 1ms it looks very difficult to trigger it on a laptop doing SSL and compression, so applying a 100-fold factor to cover for large configs and small machines sounds sane for 3.1. In any case, even at 100ms, the service degradation becomes quite visible.	2024-11-06 18:35:42 +01:00
Willy Tarreau	84dd05e7d8	DEBUG: wdt: add a stats counter "BlockedTrafficWarnings" in show info Every time a warning is issued about traffic being blocked, let's increment a global counter so that we can check for this situation in "show info".	2024-11-06 18:35:42 +01:00
Willy Tarreau	6127e5a4e9	DEBUG: wdt: make the blocked traffic warning delay configurable The new global "warn-blocked-traffic-after" allows one to configure after how much time a warning should be emitted when traffic is blocked.	2024-11-06 18:35:42 +01:00
Willy Tarreau	7337c42224	DEBUG: cli: make it possible for "debug dev loop" to trigger warnings A new argument "warn" allows to force the emission of a warning while stuck in the loop by making the internal state inconsistent.	2024-11-06 18:35:42 +01:00
Willy Tarreau	148eb5875f	DEBUG: wdt: better detect apparently locked up threads and warn about them In order to help users detect when threads are behaving abnormally, let's try to emit a warning when one is no longer making any progress. This will allow to catch faulty situations more accurately, instead of occasionally triggering just after the long task. It will also let users know that there is something wrong with their configuration, and inspect the call trace to figure whether they're using excessively long rules or Lua for example (the usual warnings about lua-load vs lua-load-per-thread are still reported). The warning will only be emitted for threads not yet marked as stuck so as not to interfere with panic dumps and avoid sending a warning just before a panic. A tainted flag is set when this happens however (0x2000).	2024-11-06 18:35:42 +01:00
Willy Tarreau	0950778b3a	MINOR: debug: add a function to dump a stuck thread There's currently no way to just emit a warning informing that a thread is stuck without crashing. This is a problem because sometimes users would benefit from this info to clean up their configuration (e.g. abuse of map_regm, lua-load etc). This commit adds a new function ha_stuck_warning() that will emit a warning indicating that the designated thread has been stuck for XX milliseconds, with a number of streams blocked, and will make that thread dump its own state. The warning will then be sent to stderr, along with some reminders about the impacts of such situations to encourage users to fix their configuration. In order not to disrupt operations, a local 4kB buffer is allocated in the stack. This should be quite sufficient. For now the function is not used.	2024-11-06 18:35:42 +01:00
Willy Tarreau	3f4d646849	MINOR: wdt: move the local timers to a struct Better have a local struct for per-thread timers, as this will allow us to store extra info that are useful to improve accurate reporting.	2024-11-06 18:35:42 +01:00
Willy Tarreau	1f34a0fd27	CLEANUP: stats: fix misleading comment on top of stat_idx_info The comment asks to update the "metrics_info" array, which does not exist, instead it's called stat_cols_info[] and is in stats.c. Let's mention all that to save time searching for the needed info. While no version seems to have ever known that "metrics_info", it's not needed to backport this as it's only a comment.	2024-11-06 18:35:42 +01:00
Amaury Denoyelle	3b851a326b	BUG/MEDIUM: quic: do not consider ACK on released stream as error When an ACK is received by haproxy, a lookup is performed to retrieve the related emitted frames. For STREAM type frames, a lookup is performed under quic_conn stream_desc tree. Indeed, the corresponding stream instance could be already released if multiple ACK were received referring to the same stream offset, which can happen notably if retransmission occurred. qc_handle_newly_acked_frm() implements this logic. If the case with an already released stream is encountered, an error is returned. In the end, this error is propagated via qc_parse_pkt_frms() into qc_treat_rx_pkts(), despite being in fact a perfectly valid case. Fix this by adjusting ACK handling function to return a success value for the particular case of released stream instead. The impact of this bug is unknown, but it can have several consequences. * if the packet with the ACK contains other frames after it, their content will be skipped * the packet won't be acknowledged by haproxy, even if it contains other frames and is ack-eliciting. This may cause unneeded retransmission by the client. * RTT sampling information related to this ACK is ignored by haproxy Finally, it also caused the increment of the quic_conn counter dropped_parsing (droppars in "show quic" output) which should be reserved only for real error cases. This regression is present since the following patch : e7578084b0536e3e5988be7f09091c85beb8fa9d MINOR: quic: implement dedicated type for out-of-order stream ACK Before, qc_handle_newly_acked_frm() return type was always ignored. As such, no backport is needed.	2024-11-06 17:37:44 +01:00
William Lallemand	66bff034d7	EXAMPLES: add "traces.cfg" with traces examples Add an example on how to use the traces section. The example use the 3.1-dev8 syntax and enables all traces on stderr.	2024-11-06 17:32:32 +01:00
Valentine Krasnobaeva	e9928c306c	BUG/MINOR: mworker: do 'program' postparser checks in read_cfg_in_discovery_mode cfg_program_postparser() contains 2 parts: - check the combination of MODE_MWORKER and the "program" section: if a "program" section was parsed, MODE_MWORKER is mandatory; - check the "command" keyword, which is mandatory for this section as well. After the master-worker refactoring, it is more appropriate to do the first part in read_cfg_in_discovery_mode(), where we already check the combination of MODE_MWORKER and the -S option. We need to do the second part just below, in read_cfg_in_discovery_mode() as well, because only the master process now parses the program section, and programs are forked before running the postparser functions in step_init_2. Otherwise, if the 'command' keyword is absent, mworker_ext_launch_all() will emit a log message saying that the program is started while nothing has actually been launched. This does not need to be backported, as it is related to the master-worker refactoring.	2024-11-06 15:49:44 +01:00
Amaury Denoyelle	1767196d5b	BUG/MINOR: quic: repeat packet parsing to deal with fragmented CRYPTO A ClientHello may be split across several different CRYPTO frames, then mixed in a single QUIC packet. This is used notably by clients such as chrome to render the first Initial packet opaque to middleboxes. Each packet frame is handled sequentially. Out-of-order CRYPTO frames are buffered in a ncbuf, until gaps are filled and data is transferred to the SSL stack. If CRYPTO frames are heavily split into small fragments, buffering may fail as ncbuf does not support small gaps. This causes the whole packet to be rejected and unacknowledged. It could be solved if the client reemits its ClientHello after remixing its CRYPTO frames. This patch is written to improve CRYPTO frame parsing. Each CRYPTO frame which cannot be buffered due to ncbuf limitation is now stored in a temporary list. Packet parsing is completed until all frames have been handled. If the temporary list is not empty, reparsing is done on the stored frames. With the newly buffered CRYPTO frames, the ncbuf insert operation may succeed this time if the frame now covers a whole gap. Reparsing will loop until either no progress can be made or it has been done at least 3 times, to prevent excessive CPU usage. This patch should fix github issue #2776. This should be backported up to 2.6, after a period of observation. Note that it relies on the following refactor patches : MINOR: quic: extend return value of CRYPTO parsing MINOR: quic: use dynamically allocated frame on parsing MINOR: quic: simplify qc_parse_pkt_frms() return path	2024-11-06 14:29:14 +01:00
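The bounded reparsing strategy described in this commit can be sketched as follows. This is a deliberately simplified model and not the haproxy implementation: the buffer only accepts a fragment that starts exactly at the next expected offset (the real ncbuf tolerates gaps), and `struct frag`, `assemble()` and `MAX_PASSES` are hypothetical names. Counting three passes in total, initial pass included, is also an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PASSES 3 /* assumed bound: total passes, initial one included */

/* Hypothetical fragment descriptor: an offset/length pair plus a flag
 * telling whether it was already accepted into the buffer.
 */
struct frag {
    size_t off;
    size_t len;
    bool done;
};

/* Tries to assemble fragments into a contiguous stream starting at offset 0.
 * Fragments that cannot be accepted yet stay pending and are retried on the
 * next pass. The loop stops when nothing is pending, when a pass makes no
 * progress, or when MAX_PASSES is reached. Returns the number of contiguous
 * bytes assembled and stores the number of passes in <passes_used>.
 */
size_t assemble(struct frag *frags, size_t n, int *passes_used)
{
    size_t next = 0;
    int pass;

    assert(frags != NULL || n == 0);
    assert(passes_used != NULL);

    for (pass = 0; pass < MAX_PASSES; pass++) {
        bool progress = false;
        size_t pending = 0;

        for (size_t i = 0; i < n; i++) {
            if (frags[i].done)
                continue;
            if (frags[i].off == next) {
                /* fragment fills the front of the gap: accept it */
                next += frags[i].len;
                frags[i].done = true;
                progress = true;
            } else {
                pending++; /* defer to the next pass */
            }
        }
        if (!pending || !progress) {
            pass++; /* count the pass that just completed */
            break;
        }
    }
    *passes_used = pass;
    return next;
}
```

With fragments at offsets 4, 0, 2 (two bytes each), the first pass accepts the last two and the second pass accepts the deferred one. Fully reversed input with four fragments would need a fourth pass, so the bound leaves the last fragment unhandled, which is the cost accepted in exchange for bounded CPU usage.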
Amaury Denoyelle	d65e782c8c	MINOR: quic: extend return value of CRYPTO parsing qc_handle_crypto_frm() is the function used to handle a newly received CRYPTO frame. Change its API to use a newly dedicated return type. This allows to report if the frame was properly handled, ignored if already parsed previously or rejected after a fatal error. This commit does not have any functional changes. However, it allows to simplify qc_handle_crypto_frm() API by removing <fast_retrans> as output parameter. Also, this patch will be necessary to support multiple iterations of packet parsing for CRYPTO frames.	2024-11-06 14:28:14 +01:00
Amaury Denoyelle	190fc97606	MINOR: quic: use dynamically allocated frame on parsing qc_parse_pkt_frms() is the function responsible for parsing a received QUIC packet. Payload is decoded and split into individual frames which are then handled individually. Previously, the frame was allocated locally on the stack. Change this to work on a dynamically allocated frame. This commit does not bring any functional changes. However, it will be useful to extend packet parsing. In particular, it will be necessary to save some frames during parsing to reparse them after the others.	2024-11-06 14:28:14 +01:00
Amaury Denoyelle	498a99a849	MINOR: quic: simplify qc_parse_pkt_frms() return path Change qc_parse_pkt_frms() return path for normal and error cases. Most notably, it allows to remove local variable ret as now return value is hardcoded on normal and err label. This also allows to define a different trace for error leaving code.	2024-11-06 14:28:14 +01:00
Aurelien DARRAGON	24dd7154a6	MINOR: http: don't %-encode the payload when not relevant As reported by Pierre Maoui in GH #2477, it's not possible to render control chars from variables or expressions verbatim in the payload part of http-return statements. That's a problem because this part should not require to be encoded at all (we could even imagine building favicons on the fly for example). In fact it is the LOG_OPT_HTTP option when passed as default options on parse_logformat_string() which tells the log encoder that the payload should be http-encoded using lf_chunk() instead of being printed using the per-type encoder. This option was set when parsing logformat expressions for lf-string expression under http-return statements, as well as logformat expressions for set-map action. While it is true that those actions may only be used under http context, the LOG_OPT_HTTP logformat option is not relevant there, because the payload is expected to be used without being encoded. So let's simply get rid of this option when parsing logformat expressions for set-map action key/value and lf-string from http-request return action, and add a note next to LOG_OPT_HTTP option to indicate that it is used to tell the log encoder that the payload should be HTTP-encoded. Thanks to Pierre for having reported the issue and Willy for the analysis and patch proposal.	2024-11-06 10:21:15 +01:00
Christopher Faulet	97d3096040	MINOR: http-conv: Remove unreachable goto statement in sample_conv_q_preferred This was reported by Coverity. In sample_conv_q_preferred() function, a goto statement after a "while(1)" loop is unreachable. Instead of just removing it, the same goto statement in the loop is replaced by a break. It is safer this way, in case the loop change in future. This patch should fix the issue #2683.	2024-11-06 10:06:52 +01:00
Christopher Faulet	1cc9340afd	MINOR: listener: Remove useless checks on the receiver protocol existence The receiver protocol is always set when a listener is created or cloned. At least for now. And there is no check on it at many places, except in listener_accept() function. So, let's remove remaining useless checks. That will avoid false Coverity reports in future. This patch should fix the issue #2631.	2024-11-06 09:35:01 +01:00
Frederic Lecaille	217e467e89	BUG/MINOR: quic: fix malformed probing packet building This bug arrived with this commit: cdfceb10a MINOR: quic: refactor qc_prep_pkts() loop which prevents haproxy from sending PING only packets/datagrams (some packets/datagrams with only PING frame as ack-eliciting frames inside). Such packets/datagrams are useful in rare cases during retransmissions when one wants to probe the peer without exceeding the anti-amplification limit. Modify the condition passed to qc_build_pkt() to add padding to the current datagram. One does not want to do that when probing the peer without ack-eliciting frames passed as <frms> parameter. Indeed qc_build_pkt() calls qc_do_build_pkt() which supports this case: if <probe> is true (probing required), qc_do_build_pkt() handles the case where some padding must be added to a PING only packet/datagram. This is the case when probing with an empty <frms> frame list of ack-eliciting frames without exceeding the anti-amplification limit from qc_dgrams_retransmit(). Add some comments to qc_build_pkt() and qc_do_build_pkt() to clarify this as this code is easy to break! Thank you to @Tristan971 for having reported this issue in GH #2709. Must be backported to 3.0.	2024-11-05 20:17:35 +01:00
Frederic Lecaille	444a19ea38	MINOR: quic: Help diagnosing malformed probing packets Add a BUG_ON() to detect some malformed packets which are supposed to probe the peer without being ack-eliciting: the peer would not acknowledge such packets.	2024-11-05 20:17:35 +01:00
Willy Tarreau	601b34fe7b	MINOR: connection: add new sample fetch functions fc_err_name and bc_err_name These functions return a symbolic error code such as ECONNRESET to keep logs compact while making them human-readable. It's a good alternative to the numeric code in that it's more expressive, and a good one to the full message since it's shorter and more precise (some codes even match errno names). The doc was updated so that the symbolic names appear in the table. It could be useful to backport this feature to help with troubleshooting some issues, though backporting the doc might possibly be more annoying in case users have local patches already, so maybe the table update does not need to be backported in this case.	2024-11-05 18:57:43 +01:00
Willy Tarreau	822d82caf4	MINOR: rawsock: set connection error codes when returning from recv/send/splice For a long time the errno values returned by recv/send/splice() were not translated to connection error codes. There are not that many eligible and having them would help a lot when debugging some complex issues where logs disagree with network traces. Let's add them now.	2024-11-05 18:57:43 +01:00
Willy Tarreau	00c383ff65	MINOR: connection: add more connection error codes to cover common errno While we get reports of connection setup errors in fc_err/bc_err, we don't have the equivalent for the recv/send/splice syscalls. Let's add provisions for new codes that cover the common errno values that recv/send/splice can return, i.e. ECONNREFUSED, ENOMEM, EBADF, EFAULT, EINVAL, ENOTCONN, ENOTSOCK, ENOBUFS, EPIPE. We also add a special case for when the poller reported the error itself. It's worth noting that EBADF/EFAULT/EINVAL will generally indicate serious bugs in the code and should not be reported. The only thing is that it's quite hard to forcefully (and reliably) trigger these errors in automated tests as the timing is critical. Using iptables to manually reset established connections in the middle of large transfers at least permits to see some ECONNRESET and/or EPIPE, but the other ones are harder to trigger.	2024-11-05 18:57:43 +01:00
Willy Tarreau	0f1d37a479	DEBUG: cli: support closing "hard" using close() in addition to fd_delete() "debug dev close <fd>" currently closes that FD using fd_delete() after checking that it's known from the fdtab. Sometimes we also want to just perform a pure close() of FDs not in the fdtab (pollers, etc) in order to provoke certain error cases. The optional "hard" argument to the command will make it use a plain close() instead of fd_delete() and skip the fd owner check. The main visible effect when closing a traffic socket with it is that instead of dying from a double fd_delete() by seeing that fd.owner is already 0, it will die during the next fd_insert() seeing that fd.owner was not 0.	2024-11-05 18:57:43 +01:00
Willy Tarreau	393957908b	CLEANUP: connection: properly name the CO_ER_SSL_FATAL enum entry It was the only one prefixed with "CO_ERR_", making it harder to batch process and to look up. It was added in 2.5 by commit 61944f7a73 ("MINOR: ssl: Set connection error code in case of SSL read or write fatal failure") so it can be backported as far as 2.6 if needed to help integrate other patches.	2024-11-05 18:57:42 +01:00
Willy Tarreau	abed9e0426	DOC: config: document connection error 44 (reverse connect failure) It was missing from commit ac1164de7c ("MINOR: connection: define error for reverse connect"), and can be backported to 3.0 and 2.9.	2024-11-05 18:57:42 +01:00
Christopher Faulet	1f71ec85b0	CLEANUP: quic: Remove the useless directive "tune.quic.backend.max-idle-timeou" First there is a typo in the directive name, then it is not documented and finally, it is not used at all. The directive is only removed from the keyword list. Parsing function is not updated. This patch should fix the issue #2601.	2024-11-05 18:53:54 +01:00
Willy Tarreau	b300db55f6	BUILD: compiler: define __builtin_prefetch() for tcc We're using a few occurrences of __builtin_prefetch() but tcc doesn't know about it so let's give it a dummy definition. Now the code builds and works again with tcc without thread support.	2024-11-05 15:43:17 +01:00
Willy Tarreau	033db091fc	BUILD: import/mt_list: support building with TCC TCC is often convenient to quickly test builds, run CI tests etc. It has limited thread support (e.g. no thread-local stuff) but that is often sufficient for testing. TCC lacks __atomic_exchange_n() but has the exactly equivalent __atomic_exchange(), and doesn't have any barrier. For this reason we force the atomic_exchange to use the stricter SEQ_CST mem ordering that allows to ignore the barrier. [wt: that's upstream commit ca8b865 ("BUILD: support building with TCC")]	2024-11-05 15:43:17 +01:00
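The two builtin forms mentioned in this commit can be compared side by side. This is only a sketch using the GCC/Clang `__atomic` builtins on a `long`. The wrapper names `xchg_n()` and `xchg_generic()` are hypothetical and do not come from mt_list. Both use the sequentially consistent ordering, as the commit describes for the TCC case.

```c
#include <assert.h>
#include <stddef.h>

/* "_n" form: takes the new value directly and returns the previous one. */
static inline long xchg_n(long *p, long v)
{
    assert(p != NULL);
    return __atomic_exchange_n(p, v, __ATOMIC_SEQ_CST);
}

/* Generic form: takes pointers to the new value and to the location
 * receiving the previous value. The observable effect is the same as above.
 */
static inline long xchg_generic(long *p, long v)
{
    long old;

    assert(p != NULL);
    __atomic_exchange(p, &v, &old, __ATOMIC_SEQ_CST);
    return old;
}
```

Since the generic form only differs in how operands are passed, a compiler that provides it but lacks the "_n" variant can still implement the same exchange.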
Christopher Faulet	d1adfd9fe4	BUG/MEDIUM: promex: Fix dump of extra counters When extra counters are dumped for an entity (frontend, backend, server or listener), there is a filter on capabilities. Some extra counters are not available for all entities and must be ignored. However, when this was performed, the field number, used as an index to dump the metric value, was still incremented while it should not and leads to an overflow or a stats mix-up. This patch must be backported to 3.0.	2024-11-05 15:36:41 +01:00
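One possible reading of this indexing bug is sketched below. The data layout is an assumption made for illustration and is not taken from the promex code: each entity is supposed to carry a compact array of values holding one slot per counter that applies to it, in definition order. `struct counter_def`, `struct metric`, `dump_counters()` and the `CAP_*` flags are all hypothetical. Under that assumption, the value index must advance only when a counter is actually dumped.

```c
#include <assert.h>
#include <stddef.h>

#define CAP_FE 0x1u /* hypothetical capability: frontend */
#define CAP_BE 0x2u /* hypothetical capability: backend  */

struct counter_def {
    const char *name;
    unsigned int caps; /* entities this counter applies to */
};

struct metric {
    const char *name;
    long value;
};

/* Dumps the counters applicable to <entity_cap> into <out>. <values> is
 * assumed to be compact: one slot per applicable counter. Returns the number
 * of metrics written.
 */
size_t dump_counters(const struct counter_def *defs, size_t ndefs,
                     unsigned int entity_cap, const long *values,
                     struct metric *out)
{
    size_t field = 0;

    assert(defs != NULL && values != NULL && out != NULL);

    for (size_t i = 0; i < ndefs; i++) {
        if (!(defs[i].caps & entity_cap))
            continue; /* filtered out: the value index must NOT advance */

        out[field].name = defs[i].name;
        out[field].value = values[field];
        field++;
    }
    return field;
}
```

If `field` were incremented for filtered counters too, later names would be paired with the wrong values or the read would run past the end of `values`, which matches the overflow and mix-up symptoms described in the commit.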
William Lallemand	e75a019fba	MINOR: startup: tune.renice.{startup,runtime} allow to change priorities This commit introduces the tune.renice.startup and tune.renice.runtime global keywords that allow to change the priority with setpriority(). tune.renice.startup is parsed and applied in the worker or the standalone process for configuration parsing. If this keyword is used alone, the nice value is changed to the previous one after configuration parsing. tune.renice.runtime is applied after configuration parsing, so in the worker or a standalone process. Combined with tune.renice.startup it allows to have a different nice value during configuration parsing and during runtime. The feature was discussed in github issue #1919. Example: global tune.renice.startup 15 tune.renice.runtime 0	2024-11-04 17:48:58 +01:00
Willy Tarreau	2092199353	[RELEASE] Released version 3.1-dev11 Released version 3.1-dev11 with the following main changes : - BUG/MINOR: httpclient: return NULL when no proxy available during httpclient_new() - BUG/MEDIUM: mworker/httpclient: initialization skipped by accident in mworker mode - BUG/MINOR: resolvers/mworker: missing default resolvers in mworker mode - MINOR: mworker/ocsp: skip ocsp-update proxy init in master - BUG/MEDIUM: stconn: Wait iobuf is empty to shut SE down during a check send - MINOR: mux-h1: Show the SD iobuf in trace messages on stream send events - MINOR: mux-h1: Add a trace on shutdown when keep-alive is not possible - BUG/MINOR: http-ana: Don't report a server abort if response payload is invalid - BUG/MEDIUM: stconn: Check FF data of SC to perform a shutdown in sc_notify() - BUG/MAJOR: filters/htx: Add a flag to state the payload is altered by a filter - REGTESTS: Never reuse server connection in http-messaging/truncated.vtc - BUG/MINOR: quic: avoid leaking post handshake frames - MINOR: quic: send new tokens (NEW_TOKEN) even for 1RTT sessions - BUG/MEDIUM: quic: avoid freezing 0RTT connections - DOC: config: fix rfc7239 forwarded typo in desc - MINOR: http_ext: implement rfc7239_{nn,np} converters - CLEANUP: http_ext: remove useless BUG_ON() in http_handle_xot_header() - BUG/MINOR: sample: free err2 in smp_resolve_args for type ARGT_REG - MINOR: arg: add an argument type for identifier - BUILD: buffers: keep b_getblk_nc() and b_peek_varint() in buf.h - CLEANUP: buffers: simplify b_get_varint() - OPTIM: buffers: avoid a useless wrapping check for ofs == 0 - MINOR: debug: make mark_tainted() return the previous value - MINOR: chunk: drop the global thread_dump_buffer - MINOR: debug: split ha_thread_dump() in two parts - MINOR: debug: slightly change the thread_dump_pointer signification - MINOR: debug: make ha_thread_dump_done() take the pointer to be used - MINOR: debug: replace ha_thread_dump() with its two components - MEDIUM: 
debug: on panic, make the target thread automatically allocate its buf - BUILD: mux-h2/traces: fix build on 32-bit due to size of the DATA frame - CI: prepare Coverity build for Ubuntu 24 - CI: bump development builds explicitely to Ubuntu 24.04 - CI: modernize macos builds to macos-15 - BUG/MINOR: mworker: fix mworker-max-reloads parser - MINOR: mux-quic: simplify sending of empty STREAM FIN - BUG/MINOR: mux-quic: do not close STREAM with empty FIN if no data sent - CLEANUP: debug: make the BUG_ON() macros check the condition in the outer one - MEDIUM: debug: add match counters for BUG_ON/WARN_ON/CHECK_IF - MINOR: debug: add a new debug macro COUNT_IF() - MINOR: debug: add "debug dev counters" to list code counters - BUG/MEDIUM: stats-html: Never dump more data than expected during 0-copy FF - BUG/MEDIUM: mux-h2: Remove H2S from send list if data are sent via 0-copy FF - BUG/MINOR: stconn: Pretend the SE have more data to deliver on abortonclose - CLEANUP: stream: remove outdated comments - DEBUG: stream: Add debug counters to track some client/server aborts - DEBUG: mux-h1: Add debug counters to track some errors - MINOR: mux-h1: Add support of the debug string for logs - MINOR: stream: maintain per-stream counters of the number of passes on code - MINOR: filters: add per-filter call counters - MINOR: sample: add the "when" converter to condition some expressions - BUG/MEDIUM: connection/http-reuse: fix address collision on unhandled address families - BUILD: spoe: fix build warning on older gcc around sub-struct initialization - Revert "OPTIM: mux-h2: make h2_send() report more accurate wake up conditions" - DEBUG: mux-h1: Add debug counters to track errors with in/out pending data - BUG/MINOR: mux-h1: Fix conditions on pipe in some COUNT_IF() - MINOR: activity/memprofile: show per-DSO stats - BUG/MINOR: mworker/cli: show master startup logs in recovery mode - MINOR: mworker: stop MASTER proxy listener on worker mcli sockpair - MINOR: error: simplify 
startup_logs_init_shm - BUG/MINOR: mworker: show worker warnings in startup logs - CLEANUP: mworker: clean mworker_reexec - MINOR: mworker/cli: split mworker_cli_proxy_create - BUG/MINOR: server: fix dynamic server leak with check on failed init - BUG/MEDIUM: server: fix race on servers_list during server deletion - BUG/MEDIUM: stconn: Report blocked send if sends are blocked by an error - BUG/MINOR: http-ana: Fix wrong client abort reports during responses forwarding - BUG/MINOR: stconn: Don't disable 0-copy FF if EOS was reported on consumer side - MINOR: mworker/cli: add 'debug' to 'show proc' - MINOR: mworker/cli: remove comment line for program when useless - MINOR: mworker/cli: 'show proc debug' for old workers - BUILD: debug: silence a build warning with threads disabled - CLEANUP: mux-h2: remove the unused "full" variable in h2_frt_transfer_data() - MINOR: pools: export the pools variable - MINOR: debug: place a magic pattern at the beginning of post_mortem - MINOR: debug: place the post_mortem struct in its own section. 
- MINOR: debug: store important pointers in post_mortem - MINOR: debug: do not limit backtraces to stuck threads - MINOR: cli: remove non-printable characters from 'debug dev fd' - MINOR: cli: add an 'echo' command - MINOR: debug: also add a pointer to struct global to post_mortem - CLEANUP: mworker: make mworker_create_master_cli more readable - BUG/MEIDUM: mworker: fix fd leak from master to worker - BUG/MINOR: mworker/cli: fix mworker_cli_global_proxy_new_listener - MINOR: tools: add strnlen2() helper - CLEANUP: log: use strnlen2() in _lf_text_len() to compute string length - DOC: design: add notes about more detailed error reporting for logs - MINOR: debug: also add fdtab and acitvity to struct post_mortem - MINOR: debug: remove the redundant process.thread_info array from post_mortem - DEV: gdb: add a number of gdb scripts to navigate in core dumps - BUG/MINOR: trace: stop rewriting argv with -dt - MEDIUM: protocol: make abns a custom unix socket address family - MEDIUM: protocol: rely on AF_CUST_ABNS family to recognize ABNS sockets - CLEANUP: tools: rely on address family to detect ABNS sockets - MINOR: protocol: create abnsz socket address family - MINOR: sock: restore effective UNIX family in sock_get_old_sockets() - MEDIUM: sock: also restore effective unix family in get_{src,dst}() - MEDIUM: sock_unix: use per-family addrcmp function - MEDIUM: socket: add zero-terminated ABNS alternative - BUG/MINOR: ssl/cli: 'set ssl cert' does not check the transaction name correctly - BUG/MINOR: mworker: mworker_reexec: unset MODE_STARTING before free startup logs ring - BUG/MINOR: errors: startup_logs_free: set global startup_logs ptr to NULL - BUG/MINOR: errors: print_message: don't allocate startup logs ring - BUG/MINOR: startup: don't fork worker if started with -c -W - BUG/MINOR: startup: dump libs only in worker if started with -W -dL - BUG/MINOR: startup: dump keywords only in worker if started with -W -dKAll - BUG/MINOR: startup: don't dump polling info for 
master in verbose mode - CI: switch QUIC Interop on AWS-LC to common docker image - CI: switch QUIC Interop on LibreSSL to common docker image - CI: enable chacha20 test on LibreSSL QUIC Interop - DOC: config: add missing glitch_{cnt,rate} data types - DOC: config: add missing glitch_{cnt,rate} sample definitions - CI: LibreSSL QUIC Interop: fix docker context - DEBUG: mux-h1: Add H1C expiration dates in trace messages - BUG/MEDIUM: mux-h1: Fix how timeouts are applied on H1 connections - BUG/MINOR: http-ana: Report internal error if an action yields on a final eval - MINOR: stream: Save last evaluated rule on invalid yield - MINOR: quic: complete trace in qc_may_build_pkt() - MINOR: quic: move qc_send_mux() prototype into quic_tx.h - MINOR: stream: Replace last_rule_file/line fields by a more generic field - MINOR: stream: Save the last filter evaluated interrupting the processing - MINOR: stream: Save the entity waiting to continue its processing - MINOR: stream: Use an enum to identify last and waiting entities for streams - MINOR: stream: Add http-buffer-request option in the waiting entities - DOC: config: Add documentation about last_entity sample fetch - DOC: config: Add documentation about waiting_entity sample fetch	2024-11-01 10:17:02 +01:00
Christopher Faulet	1cd8173687	DOC: config: Add documentation about waiting_entity sample fetch The commit adds the documentation for the waiting_entity sample fetch.	2024-10-31 20:47:59 +01:00
Christopher Faulet	6034080c49	DOC: config: Add documentation about last_entity sample fetch The commit adds the documentation for the last_entity sample fetch.	2024-10-31 20:25:07 +01:00
Christopher Faulet	64554a55f4	MINOR: stream: Add http-buffer-request option in the waiting entities When the http-buffer-request option is set on a proxy, the processing is paused to wait for the full request payload or a full buffer. So it is an entity that blocks the processing, just like a rule or a filter that yields. It is now reported as a waiting entity if an error or a timeout occurs. To do so, a stream entity type is added for this option. There is no pointer, and the "waiting_entity" sample fetch returns the option name.	2024-10-31 20:24:50 +01:00
Christopher Faulet	c64712b085	MINOR: stream: Use an enum to identify last and waiting entities for streams Instead of using 1 for last/waiting rule and 2 for last/waiting filter, an enum is used. It is less ambiguous this way.	2024-10-31 20:24:37 +01:00
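The move from magic numbers to an enum can be sketched as below. This is only an illustration: the `demo_` type, constant and function names are hypothetical and are not taken from the HAProxy sources.

```c
/* Hypothetical entity kinds; the real enum names are not shown in the log */
enum demo_entity_type {
	DEMO_ENTITY_NONE = 0, /* nothing recorded */
	DEMO_ENTITY_RULE,     /* previously the bare value 1 */
	DEMO_ENTITY_FILTER,   /* previously the bare value 2 */
};

/* Generic "type + pointer" record, as described for last/waiting entities */
struct demo_entity {
	enum demo_entity_type type;
	const void *ptr;
};

/* Returns a printable name for an entity kind */
static const char *demo_entity_type_name(enum demo_entity_type type)
{
	switch (type) {
	case DEMO_ENTITY_RULE:
		return "rule";
	case DEMO_ENTITY_FILTER:
		return "filter";
	default:
		return "none";
	}
}
```

With named constants, a `switch` over the type reads unambiguously, and a new kind can be added without renumbering the existing ones.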
Christopher Faulet	537f20eb3e	MINOR: stream: Save the entity waiting to continue its processing When a rule or a filter yields because it waits for something to be able to continue its processing, this entity is saved in the stream. If an error or a timeout occurs, info on this entity may be retrieved via the "waiting_entity" sample fetch, for instance to dump it in the logs. This info may be useful to find the root cause of some bugs because it is a way to know that the processing was temporarily stopped. This may explain timeouts for instance. The sample fetch is not documented yet.	2024-10-31 16:40:09 +01:00
Christopher Faulet	53de6da1c0	MINOR: stream: Save the last filter evaluated interrupting the processing It is very similar to the last evaluated rule. When a filter returns an error that interrupts the processing, it is saved in the stream, in the last_entity field, with the type 2. The pointer to the filter config is saved. This pointer never changes during runtime and is part of the proxy's structure. It is an element of the filter_configs list in the proxy structure. The "last_entity" sample fetch was updated accordingly. The filter identifier is returned, if defined. Otherwise the saved pointer is returned.	2024-10-31 16:39:04 +01:00
Christopher Faulet	c9fa78e747	MINOR: stream: Replace last_rule_file/line fields by a more generic field The last evaluated rule is now saved in a generic structure, named last_entity, with a type to identify it. The idea is to be able to store other kinds of entities that may interrupt a specific processing. The type of the last evaluated rule is set to 1. It will be replaced later by an enum to be more explicit. In addition, the pointer to the rule itself is saved instead of its location. The sample fetch "last_entity" was added to retrieve the information about it. In this case, it is the rule location: the config file containing the rule followed by the line where the rule is defined, separated by a colon. This sample fetch is not documented yet.	2024-10-31 16:36:39 +01:00
Amaury Denoyelle	dcf334168c	MINOR: quic: move qc_send_mux() prototype into quic_tx.h qc_send_mux() is defined in quic_tx.c. As such, its prototype is moved from quic_conn.h to quic_tx.h.	2024-10-31 15:35:31 +01:00
Amaury Denoyelle	a8738f4156	MINOR: quic: complete trace in qc_may_build_pkt() Log the encryption level in qc_may_build_pkt(). This is necessary to fully understand the sending conditions of the QUIC stack.	2024-10-31 15:35:31 +01:00
Christopher Faulet	0b7605491e	MINOR: stream: Save last evaluated rule on invalid yield When an action yields while it is not allowed, an internal error is reported. This interrupts the processing. So info about the last evaluated rule must be filled. This patch may be backported if needed. If so, the commit ("MINOR: stream: Save last evaluated rule on invalid yield") must be backported first.	2024-10-31 09:30:52 +01:00
Christopher Faulet	65ea29dcf8	BUG/MINOR: http-ana: Report internal error if an action yields on a final eval This was already performed for tcp actions at content level, but not for HTTP actions. It is always a bug, so it must be reported accordingly. This patch may be backported to all stable versions.	2024-10-31 09:30:52 +01:00
Christopher Faulet	3c09b34325	BUG/MEDIUM: mux-h1: Fix how timeouts are applied on H1 connections There were several flaws in the way the different timeouts were applied on H1 connections. First, the H1C task handling timeouts was not created if no client/server timeout was specified. But there are other timeouts to consider: first, the client-fin/server-fin timeouts. For frontend connections, the http-keep-alive and http-request timeouts may also be used. And finally, on soft-stop, the close-spread-time value must be considered too. So in the end, it is probably easier to always create a task to manage H1C timeouts, especially since the client/server timeouts are most often set. Then, the expiration date of the H1C's task must only be updated if the considered timeout is set. So tick_add_ifset() must be used instead of tick_add(). Otherwise, if a timeout is undefined, the task may expire immediately while it should in fact never expire. Finally, the idle expiration date must only be considered for idle connections. This patch should be backported to all stable versions, at least as far as 2.6. On 2.4, it will have to be slightly adapted for the idle_exp part. On 2.2 and 2.0, the patch will have to be rewritten because h1_refresh_timeout() is quite different.	2024-10-31 09:30:52 +01:00
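The difference between the two tick helpers mentioned above can be illustrated with simplified stand-ins. This is a sketch under assumptions: `add_ticks()`, `add_ticks_ifset()` and `ETERNITY` are hypothetical names written for this example, and the value 0 is assumed to mean "never expires".

```c
/* Assumed convention for this sketch: 0 means "no timeout / never expires" */
#define ETERNITY 0u

/* Unconditional addition: an unset timeout yields an expiry equal to <now> */
static unsigned int add_ticks(unsigned int now, unsigned int timeout)
{
	now += timeout;
	if (now == ETERNITY) /* avoid landing on the reserved value */
		now++;
	return now;
}

/* Conditional addition: an unset timeout yields "never expires" */
static unsigned int add_ticks_ifset(unsigned int now, unsigned int timeout)
{
	if (timeout == ETERNITY)
		return ETERNITY;
	return add_ticks(now, timeout);
}
```

With the unconditional helper, an undefined timeout produces an expiration date equal to the current date, which is the immediate expiry the commit message describes. The conditional variant keeps the task from expiring at all in that case.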
Christopher Faulet	9fa5b379fa	DEBUG: mux-h1: Add H1C expiration dates in trace messages The expiration date of the H1C task and the H1C idle expiration date are now dumped in the trace messages.	2024-10-31 09:30:52 +01:00
Ilia Shipitsin	976af317a4	CI: LibreSSL QUIC Interop: fix docker context In commit `98099287ee`, the docker build was switched to a URL, but I forgot to change the context. This is a follow-up fix.	2024-10-30 19:42:31 +01:00
Aurelien DARRAGON	0686fd8cfc	DOC: config: add missing glitch_{cnt,rate} sample definitions Following the previous commit: when the glitch_cnt and glitch_rate data types were implemented in c9c6b683f ("MEDIUM: stick-tables: add a new stored type for glitch_cnt and glitch_rate"), newly exposed samples such as table_glitch_cnt(), table_glitch_rate(), src_glitch_cnt() and src_glitch_rate() were documented but their definitions were missing from the supported keywords list. It should be backported in 3.0 with c9c6b683f.	2024-10-30 17:47:30 +01:00
Aurelien DARRAGON	9a6fc2d474	DOC: config: add missing glitch_{cnt,rate} data types When glitch_cnt and glitch_rate data types were implemented in c9c6b683f ("MEDIUM: stick-tables: add a new stored type for glitch_cnt and glitch_rate"), the data types list for "stick-table" keyword documentation was overlooked. This was reported by Nick Ramirez. It should be backported in 3.0 with c9c6b683f.	2024-10-30 17:47:24 +01:00
Ilia Shipitsin	3ecca216b4	CI: enable chacha20 test on LibreSSL QUIC Interop it was commented on purpose "until LibreSSL-4.0 is released". lets enable it	2024-10-30 16:46:22 +01:00
Ilia Shipitsin	98099287ee	CI: switch QUIC Interop on LibreSSL to common docker image previously we used different docker images for different SSL libs, now all of them are merged into one, lets switch to it	2024-10-30 16:46:06 +01:00
Ilia Shipitsin	4d40e9384c	CI: switch QUIC Interop on AWS-LC to common docker image previously we used different docker images for different SSL libs, now all of them are merged into one, lets switch to it	2024-10-30 16:45:36 +01:00
Valentine Krasnobaeva	d3eb00e61d	BUG/MINOR: startup: don't dump polling info for master in verbose mode The master-worker fork now happens before step_init_2(), where pollers are initialized and polling settings are dumped to stdout in verbose and debug modes. As a result, the master and the worker each dump the same polling settings separately. This creates long and messy output in these modes. Polling settings are the same for the master and the worker processes for the moment. Even if they diverge in the future, we are interested here in the worker's settings. So, when started in master-worker mode, let's dump them only in the worker context. This doesn't need to be backported as it is related to the latest master-worker refactoring.	2024-10-30 10:50:09 +01:00
Valentine Krasnobaeva	bbe7828d49	BUG/MINOR: startup: dump keywords only in worker if started with -W -dKAll If haproxy was started with -W -dK*, after the master-worker refactoring, we dump registered keywords to stdout twice, in the master and in the worker processes. This information is redundant and the output no longer has the right format. So, as the keyword registration happens very early before the fork, let's dump keywords only in the worker context, if haproxy was launched with -W. This does not need to be backported, as it is related to the latest master-worker refactoring.	2024-10-30 10:01:28 +01:00
Valentine Krasnobaeva	ea824aebc1	BUG/MINOR: startup: dump libs only in worker if started with -W -dL If haproxy was started with -W -dL, after the master-worker refactoring we dump libs to stdout twice, in the master and in the worker processes. This information is redundant. So let's show linked libraries only in the worker context, if haproxy was also started with -W. This does not need to be backported, as it is related to the latest master-worker rework.	2024-10-30 10:00:40 +01:00
Valentine Krasnobaeva	d1c6d44976	BUG/MINOR: startup: don't fork worker if started with -c -W Don't do master-worker fork if MODE_CHECK is detected from the command line along with the master-worker mode. We should exit in MODE_CHECK, after the configuration parsing and validation. So, with the new master-worker architecture it's better to align this mode with the standalone. This patch does not need to be backported, as related to the latest master-worker rework.	2024-10-30 09:59:59 +01:00
Valentine Krasnobaeva	f0f03b98f7	BUG/MINOR: errors: print_message: don't allocate startup logs ring Don't call startup_logs_init() in order to allocate the startup logs ring again if the startup_logs pointer is NULL. The startup logs ring is allocated explicitly in the step_init_1 routine, when the process starts, and it's freed explicitly for the master process at the end of the mworker_reexec scope. So, when we no longer have this pointer, let's just save the log message in the message buffer. Otherwise, in the case of the master process, we would allocate the startup logs ring again here and we would lose its address after execvp. No need to backport this fix as it's related to the latest master-worker refactoring.	2024-10-29 18:17:49 +01:00
Valentine Krasnobaeva	bf8c871e26	BUG/MINOR: errors: startup_logs_free: set global startup_logs ptr to NULL ring_free() calls free() on the ring struct pointer, but startup_logs continues to keep this address. So let's reset at the end startup_logs to NULL. startup_logs is checked in print_message(). No need to backport this fix, as it's related to the latest master-worker refactoring.	2024-10-29 18:17:49 +01:00
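The "free, then reset the global pointer" pattern described above looks roughly like this. It is a generic sketch, not the HAProxy code: `demo_ring` and `demo_ring_free()` are hypothetical stand-ins for the startup logs ring pointer and its release function.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the global startup logs ring pointer */
static char *demo_ring;

/* Releases the ring and clears the global so that later checks such as
 * "if (demo_ring)" no longer see a dangling address.
 */
static void demo_ring_free(void)
{
	free(demo_ring);
	demo_ring = NULL;
}
```

Because `free(NULL)` is a no-op, calling the function twice is harmless once the pointer is reset.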
Valentine Krasnobaeva	cd57ee7ffa	BUG/MINOR: mworker: mworker_reexec: unset MODE_STARTING before free startup logs ring The MODE_STARTING flag should be unset for the master just before freeing the startup logs ring, as it triggers the copy of process logs to this ring, see the code of print_message(). Moreover, with this flag set, if the startup logs ring pointer is NULL, any print_message() triggered just before the execvp in mworker_reexec() will call startup_logs_init(). So the ring will be allocated again "discreetly" and after execvp we will lose its address, as in step_init_1() we will call startup_logs_init() again. No need to backport this fix as it's related to the latest master-worker refactoring.	2024-10-29 18:17:49 +01:00
William Lallemand	984d2cfb61	BUG/MINOR: ssl/cli: 'set ssl cert' does not check the transaction name correctly Since commit 089c13850f ("MEDIUM: ssl: ssl-load-extra-del-ext work only with .crt"), the 'set ssl cert' CLI command does not correctly check whether the transaction you are trying to update is the right one. The consequence is that you could accidentally commit a transaction on the wrong certificate. The fix introduces the check again in case you are not using ssl-load-extra-del-ext. This must be backported to all stable versions.	2024-10-29 16:01:07 +01:00
Tristan	18582ede05	MEDIUM: socket: add zero-terminated ABNS alternative When an abstract unix socket is bound by HAProxy (using the "abns@" prefix), NUL bytes are appended at the end of its path until sun_path is filled (for a total of 108 characters). Here we add an alternative to pass only the non-NUL length of that path to connect/bind calls, such that the effective path of the socket's name is as humanly written. This may be useful to interconnect with existing software that implements abstract sockets with this logic instead of the default haproxy one. This is achieved by implementing the "abnsz" socket prefix (instead of "abns"), which stands for "zero-terminated ABNS". The "abnsz" prefix may be used anywhere "abns" is. Internally, haproxy uses the custom socket family (AF_CUST_ABNS vs AF_CUST_ABNSZ) to differentiate default abns sockets from zero-terminated ones. Documentation was updated and a regtest was added. Fixes GH issues #977 and #2479 Co-authored-by: Aurelien DARRAGON <adarragon@haproxy.com>	2024-10-29 12:15:24 +01:00
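The two conventions differ only in the address length passed to bind() or connect(). The sketch below assumes a Linux-style `struct sockaddr_un`. The `abstract_addr()` helper and its `zero_term` flag are hypothetical names for this example and do not come from the HAProxy sources.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Fills <addr> with the abstract name <name> (the leading NUL byte is added
 * here) and returns the length to pass to bind()/connect(), or 0 if the name
 * does not fit. With <zero_term> == 0 the whole sun_path counts ("abns"
 * style); otherwise only the meaningful bytes count ("abnsz" style).
 */
static socklen_t abstract_addr(struct sockaddr_un *addr, const char *name, int zero_term)
{
	size_t len = strlen(name);

	if (len + 1 > sizeof(addr->sun_path))
		return 0;

	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;
	memcpy(addr->sun_path + 1, name, len); /* sun_path[0] stays NUL */

	if (zero_term)
		return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + len);
	return (socklen_t)sizeof(*addr);
}
```

Since the kernel treats every byte covered by the address length as part of an abstract name, the same text bound with the two lengths yields two distinct sockets, which is why a peer must use the same convention to connect.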
Aurelien DARRAGON	43861e3234	MEDIUM: sock_unix: use per-family addrcmp function Thanks to the previous commit, we may now use dedicated addrcmp functions for each UNIX address family. This allows us to simplify the sock_unix_addrcmp() function and avoid useless checks that try to guess the socket type. In this patch we implement the sock_abns_addrcmp() and sock_abnsz_addrcmp() functions, which are respectively used for the ABNS and ABNSZ custom families. sock_unix_addrcmp() now only holds the regular UNIX socket comparison logic.	2024-10-29 12:15:09 +01:00
Aurelien DARRAGON	d879bf6600	MEDIUM: sock: also restore effective unix family in get_{src,dst}() As in the previous commit, let's push the logic a bit further in order to properly restore the effective UNIX socket type when leveraging the get_src() and get_dst() sock functions, since they rely on getpeername() and getsockname() under the hood, both of which will actually lose the effective family and return AF_UNIX for all our custom UNIX sockets. To do this, add the sock_restore_unix_family() helper function from the logic implemented in the previous commit, and call this function from get_src() and get_dst() in case of a unix socket prior to returning.	2024-10-29 12:15:03 +01:00
Aurelien DARRAGON	ae64444303	MINOR: sock: restore effective UNIX family in sock_get_old_sockets() When getting sockets from an older process in sock_get_old_sockets(), we leverage getsockname() to fill the sockaddr struct from a known fd. However, the kernel doesn't know about our custom UNIX families such as CUST_ABNS and CUST_ABNSZ, which are both based on the real AF_UNIX family. Since the haproxy socket API relies on the effective family (and not the real family) to recognize the socket type instead of having to guess it by analyzing the path content, let's restore it right after getsockname() since we have all the info needed to deduce the right family. If the path starts with a NUL byte, we know that it is an abstract socket. Then we simply check the <addrlen> value from getsockname() to know if the address makes use of the whole path space (normal ABNS) or of a partial path space terminated by 0 (zero-terminated ABNS, aka ABNSZ).	2024-10-29 12:14:57 +01:00
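The deduction described above (first path byte, then address length) can be sketched as follows. This is an assumption-laden illustration: `enum unix_kind` and `guess_unix_kind()` are hypothetical, and the real code works with HAProxy's custom address families rather than this enum.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical stand-ins for the regular and custom UNIX families */
enum unix_kind { KIND_UNIX, KIND_ABNS, KIND_ABNSZ };

/* Classifies an address as returned by getsockname() on an AF_UNIX socket */
static enum unix_kind guess_unix_kind(const struct sockaddr_un *addr, socklen_t addrlen)
{
	/* unnamed socket: no path byte was returned at all */
	if (addrlen <= offsetof(struct sockaddr_un, sun_path))
		return KIND_UNIX;

	/* a regular path never starts with a NUL byte */
	if (addr->sun_path[0] != '\0')
		return KIND_UNIX;

	/* abstract socket: full-length name or shorter name */
	if (addrlen == sizeof(*addr))
		return KIND_ABNS;
	return KIND_ABNSZ;
}
```

One limitation of this sketch: an abstract name that happens to fill the whole of sun_path is indistinguishable between the two conventions and is classified as the full-length kind here.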
Willy Tarreau	d24768ab44	MINOR: protocol: create abnsz socket address family For now it's the same as abns. We'll need to modify sock_unix_addrcmp(), and a few other ones to support effective path length when dealing with the \0. Let's check with Tristan's patch for this (upcoming patch). Co-authored-by: Aurelien DARRAGON <adarragon@haproxy.com>	2024-10-29 12:14:50 +01:00
Aurelien DARRAGON	9fea4a3ca5	CLEANUP: tools: rely on address family to detect ABNS sockets Following previous commit, in str2sa_range(), make use of address' family which was just set to check if the socket is ABNS or not instead of relying on an extra boolean to save this info.	2024-10-29 12:14:44 +01:00
Aurelien DARRAGON	5d766260f0	MEDIUM: protocol: rely on AF_CUST_ABNS family to recognize ABNS sockets Now that we can easily distinguish regular UNIX socket from ABNS sockets by simply looking at the address family, stop looking at the first byte from addr->sun_path to guess if the socket is an ABNS one or not. Looking at the family is straightforward and will allow to differentiate between upcoming ABNSZ and ABNS (where looking at the first byte from path won't help anymore).	2024-10-29 12:14:37 +01:00
Willy Tarreau	78ac312bbd	MEDIUM: protocol: make abns a custom unix socket address family This is a pre-requisite to adding the abnsz socket address family: in this patch we make use of protocol API rework started by 732913f ("MINOR: protocol: properly assign the sock_domain and sock_family") in order to implement a dedicated address family for ABNS sockets (based on UNIX parent family). Thanks to this, it will become trivial to implement a new ABNSZ (for abns zero) family which is essentially the same as ABNS but with a slight difference when it comes to path handling (ABNS uses the whole sun_path length, while ABNSZ's path is zero terminated and evaluation stops at 0) It was verified that this patch doesn't break reg-tests and behaves properly (tests performed on the CLI with show sess and show fd). Anywhere relevant, AF_CUST_ABNS is handled alongside AF_UNIX. If no distinction needs to be made, real_family() is used to fetch the proper real family type to handle it properly. Both stream and dgram were converted, so no functional change should be expected for this "internal" rework, except that proto will be displayed as "abns_{stream,dgram}" instead of "unix_{stream,dgram}". Before ("show sess" output): 0x64c35528aab0: proto=unix_stream src=unix:1 fe=GLOBAL be=<NONE> srv=<none> ts=00 epoch=0 age=0s calls=1 rate=0 cpu=0 lat=0 rq[f=848000h,i=0,an=00h,ax=] rp[f=80008000h,i=0,an=00h,ax=] scf=[8,0h,fd=21,rex=10s,wex=] scb=[8,1h,fd=-1,rex=,wex=] exp=10s rc=0 c_exp= After: 0x619da7ad74c0: proto=abns_stream src=unix:1 fe=GLOBAL be=<NONE> srv=<none> ts=00 epoch=0 age=0s calls=1 rate=0 cpu=0 lat=0 rq[f=848000h,i=0,an=00h,ax=] rp[f=80008000h,i=0,an=00h,ax=] scf=[8,0h,fd=22,rex=10s,wex=] scb=[8,1h,fd=-1,rex=,wex=] exp=10s rc=0 c_exp= Co-authored-by: Aurelien DARRAGON <adarragon@haproxy.com>	2024-10-29 12:14:25 +01:00
William Lallemand	596db3ef86	BUG/MINOR: trace: stop rewriting argv with -dt When using trace with -dt, the trace_parse_cmd() function does a strtok which writes \0 into the argv string. When using the mworker mode and reloading, argv was modified and the trace would no longer work because the first ':' is replaced by a '\0'. This patch fixes the issue by allocating a temporary string so we don't modify the source string directly. It also replaces strtok with its reentrant version strtok_r. Must be backported as far as 2.9.	2024-10-29 11:01:47 +01:00
Willy Tarreau	e240be5495	DEV: gdb: add a number of gdb scripts to navigate in core dumps This is a collection of functions I'm occasionally using to navigate in core dumps. Only working ones were extracted. Those requiring knowledge of global variables (e.g. pools, proxy list) use the one extracted from the post_mortem struct. That one is defined in post-mortem.gdb and needs to be initialized using "pm_init post_mortem" or "pm_init <pointer>". From this point a number of global variables are accessible even if symbols are missing; those ones are then used by other functions to dump streams, threads, pools, proxies etc. The files can be sourced or copy-pasted into a gdb session. It's worth trying to keep them up-to-date, as the old ones used to navigate through tasks are no longer usable due to massive changes.	2024-10-28 17:55:08 +01:00
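The general shape of such a fix, tokenizing a private copy with strtok_r() so the caller's string survives, can be sketched like this. It is not the actual trace_parse_cmd() code: `count_fields()` is a hypothetical helper written for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>

/* Counts the non-empty ':'-separated fields of <arg> without modifying it.
 * Returns -1 if the temporary copy cannot be allocated.
 */
static int count_fields(const char *arg)
{
	char *copy, *save = NULL, *tok;
	int count = 0;

	copy = strdup(arg); /* strtok_r() writes NUL bytes into its input */
	if (!copy)
		return -1;

	for (tok = strtok_r(copy, ":", &save); tok; tok = strtok_r(NULL, ":", &save))
		count++;

	free(copy);
	return count;
}
```

Since only the copy is modified, the original argument can be passed unchanged to a re-executed process.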
Willy Tarreau	52240680f1	MINOR: debug: remove the redundant process.thread_info array from post_mortem That one is huge and unneeded since we now have the pointer to the whole thread_info[] array, which does contain the freshest version of these info and many more. Let's just get rid of it entirely.	2024-10-28 17:14:48 +01:00
Willy Tarreau	da5cf52173	MINOR: debug: also add fdtab and acitvity to struct post_mortem These ones are often used as well when trying to analyse sequences of events, let's add them.	2024-10-28 17:14:48 +01:00
Willy Tarreau	20ffa35f66	DOC: design: add notes about more detailed error reporting for logs These are the notes of a day long code analysis session (CFA+WTA) aimed at figuring what's missing during most code troubleshooting sessions. The goal is to provide good indications about what rules/ filters were still active when the processing ended (timeout, error etc), what subscribers are still active (indicating waiting for an event), and what shut/abort events were met at the various levels of each side's stack, in each direction.	2024-10-28 17:14:48 +01:00
Aurelien DARRAGON	6d5b32daad	CLEANUP: log: use strnlen2() in _lf_text_len() to compute string length Thanks to previous commit, we can now use strnlen2() function to perform strnlen() portable equivalent instead of re-implementing the logic under _lf_text_len() function.	2024-10-28 14:59:42 +01:00
Aurelien DARRAGON	24131dee30	MINOR: tools: add strnlen2() helper strnlen2() is functionally equivalent to strnlen(). Goal is to provide an alternative to strnlen() which is not portable since it requires _POSIX_C_SOURCE >= 200809L	2024-10-28 14:59:35 +01:00
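A portable bounded length helper of the kind described above can be written in a few lines. This is a sketch only: `bounded_strlen()` is a hypothetical name, and the actual strnlen2() implementation is not shown in this log.

```c
#include <stddef.h>

/* Returns the number of bytes before the first NUL in <s>, without ever
 * reading more than <maxlen> bytes. Relies only on standard C.
 */
static size_t bounded_strlen(const char *s, size_t maxlen)
{
	size_t len = 0;

	while (len < maxlen && s[len] != '\0')
		len++;
	return len;
}
```

Unlike strlen(), this stays within bounds when the buffer has no terminating NUL in its first `maxlen` bytes.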
Valentine Krasnobaeva	7855069655	BUG/MINOR: mworker/cli: fix mworker_cli_global_proxy_new_listener There is no need to close proc->ipc_fd[0] on the error path in mworker_cli_global_proxy_new_listener(), as it's already closed before by the caller.	2024-10-26 22:53:24 +02:00
Valentine Krasnobaeva	4931d1ca5f	BUG/MEIDUM: mworker: fix fd leak from master to worker During re-execution the master always keeps open the "reload" sockpair FDs and the shared sockpair ipc_fd[0]. The latter is used to transfer listener sockets from the previously forked worker to the new one. So, these master FDs are inherited by the newly forked worker and must be closed in its context. The inherited "reload" sockpair FDs and the shared sockpair FD (ipc_fd[0]) are closed separately, because the master doesn't recreate the "reload" sockpair each time after its re-exec. It always keeps the same FDs for this "reload" sockpair. So in the worker context it can be closed immediately after the fork. In contrast, the shared sockpair is created each time after a reload, when the new worker is forked. So, if N previous workers still exist at this moment, the new worker will inherit N ipc_fd[0] from the master. So, it's safer to close all these FDs after the get_listeners_fd() and bind_listeners() calls. Otherwise, FDs closed early in the worker context will be immediately bound to listeners and we could potentially have some bugs.	2024-10-26 22:53:24 +02:00
Valentine Krasnobaeva	745a4c5e93	CLEANUP: mworker: make mworker_create_master_cli more readable Using nested 'if' statements to check whether the "reload" sockpair needs to be allocated again does not degrade performance, as mworker_create_master_cli is a startup routine. These nested 'if' statements (one condition is checked in each) make it more visible that the "reload" sockpair is allocated only once, when the master process starts, and that it is not reallocated (hence, its FDs are not closed) during reloads. Checking the conditions this way makes this fact easier to spot when analysing the code in order to investigate FD leaks between master and worker.	2024-10-26 22:26:49 +02:00
Willy Tarreau	2f04ebe14a	MINOR: debug: also add a pointer to struct global to post_mortem The pointer to struct global is also an important element to have in post_mortem given that it's used a lot to take decisions in the code. Let's just add it. It's worth noting that we could get rid of argc/argv at this point since they're also present in the global struct, but they don't cost much there anyway.	2024-10-26 11:33:09 +02:00
William Lallemand	dc1c0a169c	MINOR: cli: add an 'echo' command Add an echo command to write text over the CLI output.	2024-10-24 17:20:57 +02:00
William Lallemand	944a224358	MINOR: cli: remove non-printable characters from 'debug dev fd' When using 'debug dev fd', the output of laddr and raddr can contain some garbage. This patch replaces any control or non-printable character by a '.'.	2024-10-24 16:45:11 +02:00
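Masking control and non-printable bytes with a '.' can be sketched as below. This is an illustration under assumptions: `mask_unprintable()` is a hypothetical helper, not the function used by 'debug dev fd', and it relies on the C locale's notion of printable characters.

```c
#include <ctype.h>
#include <stddef.h>

/* Replaces every control or non-printable byte of <buf> with '.' */
static void mask_unprintable(char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		/* the cast avoids undefined behavior with negative char values */
		if (!isprint((unsigned char)buf[i]))
			buf[i] = '.';
	}
}
```

The length is passed explicitly so that embedded NUL bytes, which are common in raw socket addresses, are masked as well.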
Willy Tarreau	4adb2d864d	MINOR: debug: do not limit backtraces to stuck threads Historically for size limitation reasons, we would only dump the backtrace of stuck threads. The problem is that when triggering a panic or other reasons, we have no backtrace, which effectively limits it to the watchdog timer. It's also visible in "show threads" which used to report backtraces for all threads in 2.4 and displays none nowadays, making its use much more limited. A first approach could be to just dump the thread that triggers the panic (in addition to stuck threads). But that remains quite limited since "show threads" would still display nothing. This patch takes a better approach consisting in dumping all non-idle threads. This way the output is less polluted than with the older approach (no need to dump all those waiting in the poller), and all active threads are visible, in panics as well as in "show threads". As such, the CLI command "debug dev panic" now dumps backtraces again. This is already a benefit which will ease testing of various locations against the ability to resolve useful symbols.	2024-10-24 16:12:46 +02:00
Willy Tarreau	e5fccfe0b6	MINOR: debug: store important pointers in post_mortem Dealing with a core and a stripped executable is a pain when it comes to finding pools, proxies or thread contexts. Let's put a pointer to these heads and arrays in the post_mortem struct for easier location. Other critical lists like this could possibly benefit from being added later. Here we now have: - tgroup_info - thread_info - tgroup_ctx - thread_ctx - pools - proxies Example: $ objdump -h haproxy\|grep post 34 _post_mortem 000014b0 0000000000cfd400 0000000000cfd400 008fc400 2*8 (gdb) set $pm=(struct post_mortem)0x0000000000cfd400 (gdb) p $pm->tgroup_ctx[0] $8 = { threads_harmless = 254, threads_idle = 254, stopping_threads = 0, timers = { b = {0x0, 0x0} }, niced_tasks = 0, __pad = 0xf5662c <ha_tgroup_ctx+44> "", __end = 0xf56640 <ha_tgroup_ctx+64> "" } (gdb) info thr Id Target Id Frame * 1 Thread 0x7f9e7706a440 (LWP 21169) 0x00007f9e76a9c868 in raise () from /lib64/libc.so.6 2 Thread 0x7f9e76a60640 (LWP 21175) 0x00007f9e76b343c7 in wait4 () from /lib64/libc.so.6 3 Thread 0x7f9e7613d640 (LWP 21176) 0x00007f9e76b343c7 in wait4 () from /lib64/libc.so.6 4 Thread 0x7f9e7493a640 (LWP 21179) 0x00007f9e76b343c7 in wait4 () from /lib64/libc.so.6 5 Thread 0x7f9e7593c640 (LWP 21177) 0x00007f9e76b343c7 in wait4 () from /lib64/libc.so.6 6 Thread 0x7f9e7513b640 (LWP 21178) 0x00007f9e76b343c7 in wait4 () from /lib64/libc.so.6 7 Thread 0x7f9e6ffff640 (LWP 21180) 0x00007f9e76b343c7 in wait4 () from /lib64/libc.so.6 8 Thread 0x7f9e6f7fe640 (LWP 21181) 0x00007f9e76b343c7 in wait4 () from /lib64/libc.so.6 (gdb) p/x $pm->thread_info[0].pth_id $12 = 0x7f9e7706a440 (gdb) p/x $pm->thread_info[1].pth_id $13 = 0x7f9e76a60640 (gdb) set $px = *$pm->proxies while ($px != 0) printf "%#lx %s served=%u\n", $px, $px->id, $px->served set $px = ($px)->next end 0x125eda0 GLOBAL served=0 0x12645b0 stats served=0 0x1266940 comp served=0 0x1268e10 comp_bck served=0 0x1260cf0 <OCSP-UPDATE> served=0 0x12714c0 
<HTTPCLIENT> served=0	2024-10-24 16:12:46 +02:00
Willy Tarreau	93c3f2a0b4	MINOR: debug: place the post_mortem struct in its own section. Placing it in its own section will ease its finding, particularly in gdb which is too dumb to find anything in memory. Now it will be sufficient to issue this: $ gdb -ex "info files" -ex "quit" ./haproxy core 2>/dev/null \|grep _post_mortem 0x0000000000cfd300 - 0x0000000000cfe780 is _post_mortem or this: $ objdump -h haproxy\|grep post 34 _post_mortem 00001480 0000000000cfd300 0000000000cfd300 008fc300 2*8 to spot the symbol's address. Then it can be read this way: (gdb) p (struct post_mortem *)0x0000000000cfd300	2024-10-24 16:12:46 +02:00
Willy Tarreau	989b02e193	MINOR: debug: place a magic pattern at the beginning of post_mortem In order to ease finding of the post_mortem struct in core dumps, let's make it start with a recognizable pattern of exactly 32 chars (to preserve alignment): "POST-MORTEM STARTS HERE+7654321\0" It can then be found like this from gdb: (gdb) find 0x000000012345678, 0x0000000100000000, 'P','O','S','T','-','M','O','R','T','E','M' 0xcfd300 <post_mortem> 1 pattern found. Or easier with any other more practical tool (who has ever used "find" in gdb, given that it cannot iterate over maps and is 100% useless?).	2024-10-24 16:12:46 +02:00
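Locating such a marker in a raw memory image amounts to a byte search. The sketch below shows the idea on an in-memory buffer. `pm_magic` and `find_magic()` are hypothetical names, and reading an actual core file is left out.

```c
#include <stddef.h>
#include <string.h>

/* 31 visible characters plus the terminating NUL: exactly 32 bytes */
static const char pm_magic[32] = "POST-MORTEM STARTS HERE+7654321";

/* Returns the offset of the magic pattern in <mem>, or -1 if it is absent */
static long find_magic(const unsigned char *mem, size_t size)
{
	size_t off;

	for (off = 0; off + sizeof(pm_magic) <= size; off++) {
		if (memcmp(mem + off, pm_magic, sizeof(pm_magic)) == 0)
			return (long)off;
	}
	return -1;
}
```

A fixed 32-byte marker keeps whatever follows it aligned and makes false positives unlikely.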
Willy Tarreau	fba48e1c40	MINOR: pools: export the pools variable We want it to be accessible from debuggers for inspection and it's currently unavailable. Let's start by exporting it as a first step.	2024-10-24 16:12:46 +02:00
Willy Tarreau	db76949cff	CLEANUP: mux-h2: remove the unused "full" variable in h2_frt_transfer_data() During the 11th and 12th iterations of the development cycle for the H2 auto rx window, several approaches were attempted to figure out whether another buffer could be allocated or not. One of them consisted in looping back to the beginning of the function, requesting a new buffer slot and getting one if the buffer was either apparently or confirmed full. The latest one consisted in directly allocating the next buffer from the two places where it's proven to be full, instead of checking with the now defunct h2s_may_get_rxbuf() if we were allowed to get one and loop. That approach was retained. In this case the "full" variable is no longer needed, so let's get rid of it because the construct looks bogus and confuses coverity (and possibly code readers as the intent is unclear compared to the code).	2024-10-24 16:12:46 +02:00
Willy Tarreau	f163cbfb7f	BUILD: debug: silence a build warning with threads disabled Commit 091de0f9b2 ("MINOR: debug: slightly change the thread_dump_pointer signification") caused the following warning to be emitted when threads are disabled: src/debug.c: In function 'ha_thread_dump_one': src/debug.c:359:9: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing] Let's just disguise the pointer to silence it. It should be backported where the patch above was backported, since it was part of a series aiming at making thread dumps more exploitable from core dumps.	2024-10-24 16:12:46 +02:00
William Lallemand	5db761f709	MINOR: mworker/cli: 'show proc debug' for old workers Add FD details for old workers in 'show proc debug'.	2024-10-24 14:47:28 +02:00
William Lallemand	b49ddae21b	MINOR: mworker/cli: remove comment line for program when useless Remove the '# programs' line on 'show proc' output when there are no program.	2024-10-24 14:39:41 +02:00
William Lallemand	84640aaa2a	MINOR: mworker/cli: add 'debug' to 'show proc' This patch adds a 'debug' parameter to the 'show proc' command of the master CLI. It allows to show debug details about the processes. Example: echo 'show proc debug' \| socat /tmp/master.sock - \#<PID> <type> <reloads> <uptime> <version> <ipc_fd[0]> <ipc_fd[1]> 391999 master 0 [failed: 0] 0d00h00m02s 3.1-dev10-b9095a-63 5 6 \# workers 392001 worker 0 0d00h00m02s 3.1-dev10-b9095a-63 3 -1 \# programs	2024-10-24 14:23:27 +02:00
Christopher Faulet	362de90f3e	BUG/MINOR: stconn: Don't disable 0-copy FF if EOS was reported on consumer side There is no reason to disable the 0-copy data forwarding if an end-of-stream was reported on the consumer side. Indeed, the consumer will send data in this case. So there is no reason to check the read side here. This patch may be backported as far as 2.9.	2024-10-24 12:07:50 +02:00
Christopher Faulet	5970c6abec	BUG/MINOR: http-ana: Fix wrong client abort reports during responses forwarding When the response forwarding is aborted, we must not report a client abort if an EOS was seen on client side. Only an abort performed by the stream must be considered. This bug was introduced when the SHUTR was split into 2 flags. This patch must be backported as far as 2.8.	2024-10-24 12:07:50 +02:00
Christopher Faulet	fbc3de6e9e	BUG/MEDIUM: stconn: Report blocked send if sends are blocked by an error When some data must be sent to the endpoint but an error was previously reported, nothing is performed and we leave. But, in this case, the SC is not notified that the sends are blocked. It is indeed an issue if the endpoint reports an error after consuming all data from the SC. In the endpoint the outgoing data are trashed because of the error, but on the SC, everything was sent, even if an error was also reported. Because of this bug, it is possible to have outgoing data blocked at the SC level but without any write timeout armed. In some cases, this may lead to blocking conditions where the stream is never closed. So now, when outgoing data cannot be sent because a previous error was triggered, a blocked send is reported. This way, it is possible to report a write timeout. This patch should fix the issue #2754. It must be backported as far as 2.8.	2024-10-24 11:46:33 +02:00
Amaury Denoyelle	7a02fcaf20	BUG/MEDIUM: server: fix race on servers_list during server deletion Each server is inserted in a global list named servers_list on new_server(). This list is then only used to finalize servers initialization after parsing. On dynamic server creation, there is no issue as new_server() is under thread isolation. However, when a server is deleted after its refcount reached zero, srv_drop() removes it from servers_list without lock protection. In the long term, this can cause list corruption and crashes, especially if multiple adjacent servers are removed in parallel. To fix this, convert servers_list to a mt_list. This should not impact performance as servers_list is not used during runtime outside of server creation/deletion. This should fix github issue #2733. Thanks to Chris Staite who first found the issue here. This must be backported up to 2.6.	2024-10-24 11:35:57 +02:00
Amaury Denoyelle	116178563c	BUG/MINOR: server: fix dynamic server leak with check on failed init If a dynamic server is added with check or agent-check, its refcount is incremented after server keyword parsing. However, if add server fails at a later stage, refcount is only decremented once, which prevents the server from being fully released. This causes a leak with a server which is detached from most of the lists but still exists in the system. This bug is considered minor as only a few conditions may cause a failure in add server after check/agent-check initialization. This is the case if there is a naming collision or the dynamic ID cannot be generated. To fix this, simply decrement server refcount on add server error path if either check and/or agent-check are flagged as activated. This bug is related to github issue #2733. Thanks to Chris Staite who first found the leak. This must be backported up to 2.6.	2024-10-24 11:35:57 +02:00
Valentine Krasnobaeva	ddb829bb51	MINOR: mworker/cli: split mworker_cli_proxy_create There are two parts in mworker_cli_proxy_create(): allocating and setting up MASTER proxy and allocating and setting up servers on ipc_fd[0] of the sockpairs shared with workers. So, let's split mworker_cli_proxy_create() into two functions respectively. Each of them takes **errmsg as an argument to write an error message, which may be triggered by some subcalls. The content of this errmsg will allow to extend the final alert message shown to user, if these new functions fail. The main goal of this split is to allow moving these two parts independently in future and to make the code of haproxy initialization in haproxy.c more transparent.	2024-10-24 11:32:20 +02:00
Valentine Krasnobaeva	a0d727e069	CLEANUP: mworker: clean mworker_reexec Before refactoring master-worker architecture, resources to setup master CLI for the new worker process (shared sockpair, entry in proc_list) were created in init() before parsing the configuration and binding listening sockets. So, master during its re-exec has had to cleanup the new worker's resources in a case, when it fails at some initialization step before the fork. Now fork happens very early and worker parses its configuration by itself. If it fails during the initialization stage, all clean ups (deleting the fds of the shared sockpair, proc_list cleanup) are performed in SIGCHLD handler upon catching the SIGCHLD corresponding to this new worker. So, there is no longer need to call mworker_cleanup_proc() in mworker_reexec(). As for mworker_cleanlisteners(), there is no longer need to call this function. Master parses now only "global" and "program" sections, so it allocates only MASTER proxy, which is stopped in mworker_reexec() by mworker_cli_proxy_stop(). Let's keep the definitions of mworker_cleanlisteners() and mworker_cleanup_proc() in mworker.c for the moment. We may reuse parts of its code later.	2024-10-24 11:32:20 +02:00
Valentine Krasnobaeva	4db0f69527	BUG/MINOR: mworker: show worker warnings in startup logs As master-worker fork happens now at early init stage and worker then parses its configuration and performs all initialization steps, let's duplicate startup logs ring for it, just before the moment when it enters in its polling loop. Startup logs ring content is shown as an output of the "reload" master CLI command and we should be able to dump here worker initialization logs. Log messages are written in startup logs ring only, when mode MODE_STARTING is set (see print_message()). So, to be able to keep in startup logs the last worker alerts, let's withdraw MODE_STARTING and let's reset user messages context respectively just before entering in polling loop. This fix does not need to be backported as it is a part of previous patches from this version, which refactor master-worker architecture.	2024-10-24 11:32:20 +02:00
Valentine Krasnobaeva	5ee266b745	MINOR: error: simplify startup_logs_init_shm This patch simplifies the code of startup_logs_init_shm(). We no longer re-exec master process twice after each reload to free its unused memory, which it had to allocate, because it has parsed all configuration sections. So, there is no longer need to keep SHM fd opened between the first and the next reloads. We can completely remove HAPROXY_STARTUPLOGS_FD. In step_init_1() we continue to call startup_logs_init_shm() to open SHM and to allocate startup logs ring area within it. In master-worker mode, worker duplicates initial startup logs ring after sending its READY state to master. Sharing the same ring between two processes until the worker finishes its initialization allows to show at master CLI output worker's startup logs. During the next reload master process should free the memory allocated for the ring structure. Then after the execvp() it will reopen and map SHM area again and it will reallocate again the ring structure.	2024-10-24 11:32:20 +02:00
Valentine Krasnobaeva	e9c8e0efc9	MINOR: mworker: stop MASTER proxy listener on worker mcli sockpair After sending its "READY" status worker should not keep the access to MASTER proxy, thus, it shouldn't be able to send any other commands further to master process. To achieve this, let's stop in master context master CLI listener attached on the sockpair shared with worker. We do this just after receiving the worker's status message.	2024-10-24 11:32:20 +02:00
Valentine Krasnobaeva	3a5b28e00c	BUG/MINOR: mworker/cli: show master startup logs in recovery mode When master enters in recovery mode after unsuccessful reload HAPROXY_LOAD_SUCCESS should be set as 0. Like this cli_io_handler_show_cli_sock() could dump in master CLI its warnings and alerts, saved in startup logs ring. No need to backport this fix, as this is related to the previous patches in this version to refactor master-worker architecture.	2024-10-24 11:32:20 +02:00
Willy Tarreau	401fb0e87a	MINOR: activity/memprofile: show per-DSO stats On systems where many libs are loaded, it's hard to track suspected leaks. Having a per-DSO summary makes it more convenient. That's what we're doing here by summarizing all calls per DSO before showing the total.	2024-10-24 10:49:21 +02:00
Christopher Faulet	c91745e3a4	BUG/MINOR: mux-h1: Fix conditions on pipe in some COUNT_IF() The previous commit contains a bug in some COUNT_IF() relying on the pipe inside the IOBUF. We must take care to have a pipe before checking its size. No backport needed.	2024-10-24 09:50:16 +02:00
Christopher Faulet	7e60928c9c	DEBUG: mux-h1: Add debug counters to track errors with in/out pending data Debug counters were added on all connection errors when pending data remain blocked in the input or output buffers. The same is performed when the H1C is released, when the connection is closed and when a timeout is reached. The idea is to be able to count all cases where data are lost, especially the outgoing ones.	2024-10-24 08:18:55 +02:00
Willy Tarreau	1eb31d30fe	Revert "OPTIM: mux-h2: make h2_send() report more accurate wake up conditions" This reverts commit 9fbc01710a313968c90e72537a5906432f438062. In 3.1-dev10, commit 9fbc01710a ("OPTIM: mux-h2: make h2_send() report more accurate wake up conditions") leveraged the more accurate distinction between demux and recv to decide when to wake the tasklet up after a send. But other cases are needed. When we just need to wake the processing task up so that it itself wakes up other streams, for example because these ones are blocked. Indeed, a temporarily blocked stream may block other ones, which will never be woken up if the demux has nothing to do. In an ideal world we would check all cases where blocking flags were dropped. However it looks like this case after a send is probably the only one that deserves waking up the connection again. It's likely that in practice the MUX_MFULL flag was dropped and that it was that one that was blocking the send. In addition, dealing with these cases was not sufficient, as one case was encountered where dbuf was empty, subs=0, short_read still present while in FRH state... and the timeouts were still there (easily found with halog -tcn cD at a rate of 1-2 every 2 minutes roughly). Interestingly, in a dump, some MBUF_HAS_DATA were seen on an empty mbuf, so it means that certain conditions must be taken very carefully in the wakeup conditions. So overall this indicates that there remain subtle inconsistencies that this optimization is sensitive to. It may have to be revisited later but for now better revert it. No backport is needed. 
Annex: - first dump showing a dependency on WAIT_INLIST after h2_send(): 0x6dc2800: [23/Oct/2024:18:07:22.861247] id=1696 proto=tcpv4 flags=0x100c4a, conn_retries=0, conn_exp=<NEVER> conn_et=0x000 srv_conn=0x597a900, pend_pos=(nil) waiting=0 epoch=0 frontend=public (id=2 mode=http), listener=SSL (id=5) backend=gitweb-haproxy (id=6 mode=http) task=0x6e1d090 (state=0x00 nice=0 calls=23 rate=0 exp=2s tid=0(1/0) age=57s) txn=0x6e3f7c0 flags=0x43000 meth=1 status=200 req.st=MSG_DONE rsp.st=MSG_DATA req.f=0x4c rsp.f=0x2e scf=0x6dc33a0 flags=0x00002482 ioto=1m state=EST endp=CONN,0x6dc6c20,0x40405001 sub=3 rex=<NEVER> wex=3s rto=3s wto=3s iobuf.flags=0x00000000 .pipe=0 .buf=0@(nil)+0/0 h2s=0x6dc6c20 h2s.id=59 .st=HCR .flg=0x7001 .rxwin=32712 .rxbuf.c=0 .t=0@(nil)+0/0 .h=0@(nil)+0/0 .sc=0x6dc33a0(.flg=0x00002482 .app=0x6dc2800) .sd=0x6e83fd0(.flg=0x40405001) .subs=0x6dc33b8(ev=3 tl=0x6e22a20 tl.calls=10 tl.ctx=0x6dc33a0 tl.fct=sc_conn_io_cb) h2c=0x6e66570 h2c.st0=FRH .err=0 .maxid=77 .lastid=-1 .flg=0x2000e00 .nbst=2 .nbsc=2 .nbrcv=0 .glitches=0 .fctl_cnt=0 .send_cnt=2 .tree_cnt=2 .orph_cnt=0 .sub=1 .dsi=77 .dbuf=0@(nil)+0/0 .mbuf=[4..4\|32],h=[0@(nil)+0/0],t=[0@(nil)+0/0] .task=0x6dbdc60 .exp=<NEVER> co0=0x7f84881614b0 ctrl=tcpv4 xprt=SSL mux=H2 data=STRM target=LISTENER:0x2acb7c0 flags=0x80000300 fd=19 fd.state=121 updt=0 fd.tmask=0x1 scb=0x2a8da90 flags=0x00001211 ioto=1m state=EST endp=CONN,0x6e5a530,0x106c0001 sub=0 rex=<NEVER> wex=<NEVER> rto=3s wto=<NEVER> iobuf.flags=0x00000000 .pipe=0 .buf=0@(nil)+0/0 h1s=0x6e5a530 h1s.flg=0x14094 .sd.flg=0x106c0001 .req.state=MSG_DONE .res.state=MSG_DATA .meth=GET status=200 .sd.flg=0x106c0001 .sc.flg=0x00001211 .sc.app=0x6dc2800 .subs=(nil) h1c=0x7f84880f5f40 h1c.flg=0x80000020 .sub=0 .ibuf=32704@0x6ddef30+16262/32768 .obuf=0@(nil)+0/0 .task=0x6e131d0 .exp=<NEVER> co1=0x7f8488172b70 ctrl=tcpv4 xprt=RAW mux=H1 data=STRM target=SERVER:0x597a900 flags=0x00000300 fd=31 fd.state=10122 updt=0 fd.tmask=0x1 filters={0x6e49f30="cache 
store filter", 0x6e67ad0="compression filter"} req=0x6dc2828 (f=0x21840000 an=0x48000 tofwd=0 total=224) an_exp=<NEVER> buf=0x6dc2830 data=(nil) o=0 p=0 i=0 size=0 htx=0x104d2c0 flags=0x0 size=0 data=0 used=0 wrap=NO extra=0 res=0x6dc2870 (f=0xa0040000 an=0x24000000 tofwd=0 total=309982) an_exp=<NEVER> buf=0x6dc2878 data=0x6dceef0 o=16333 p=16333 i=16435 size=32768 htx=0x6dceef0 flags=0x0 size=32720 data=16333 used=1 wrap=NO extra=0 ----------------------------------- strm.flg 0x100c4a SF_SRV_REUSED SF_HTX SF_REDIRECTABLE SF_CURR_SESS SF_BE_ASSIGNED SF_ASSIGNED task.state 0 0 txn.meth 1 GET txn.flg 0x43000 TX_NOT_FIRST TX_CACHE_COOK TX_CACHEABLE txn.req.flg 0x4c HTTP_MSGF_BODYLESS HTTP_MSGF_VER_11 HTTP_MSGF_XFER_LEN txn.rsp.flg 0x2e HTTP_MSGF_COMPRESSING HTTP_MSGF_VER_11 HTTP_MSGF_XFER_LEN HTTP_MSGF_TE_CHNK f.sc.flg 0x2482 SC_FL_SND_EXP_MORE SC_FL_RCV_ONCE SC_FL_WONT_READ SC_FL_EOI f.sc.sd.flg 0x40405001 SE_FL_HAVE_NO_DATA SE_FL_MAY_FASTFWD_CONS SE_FL_EOI SE_FL_NOT_FIRST SE_FL_T_MUX f.h2s.flg 0x7001 H2_SF_HEADERS_RCVD H2_SF_OUTGOING_DATA H2_SF_HEADERS_SENT H2_SF_ES_RCVD f.h2s.sd.flg 0x40405001 SE_FL_HAVE_NO_DATA SE_FL_MAY_FASTFWD_CONS SE_FL_EOI SE_FL_NOT_FIRST SE_FL_T_MUX f.h2c.flg 0x2000e00 H2_CF_MBUF_HAS_DATA H2_CF_DEM_IN_PROGRESS H2_CF_DEM_SHORT_READ H2_CF_WAIT_INLIST f.co.flg 0x80000300 CO_FL_XPRT_TRACKED CO_FL_XPRT_READY CO_FL_CTRL_READY f.co.fd.st 0x121 FD_POLL_IN FD_EV_READY_W FD_EV_ACTIVE_R b.sc.flg 0x1211 SC_FL_SND_NEVERWAIT SC_FL_NEED_ROOM SC_FL_NOHALF SC_FL_ISBACK b.sc.sd.flg 0x106c0001 SE_FL_WAIT_DATA SE_FL_MAY_FASTFWD_CONS SE_FL_MAY_FASTFWD_PROD SE_FL_WANT_ROOM SE_FL_RCV_MORE SE_FL_T_MUX b.h1s.sd.flg 0x106c0001 SE_FL_WAIT_DATA SE_FL_MAY_FASTFWD_CONS SE_FL_MAY_FASTFWD_PROD SE_FL_WANT_ROOM SE_FL_RCV_MORE SE_FL_T_MUX b.h1s.flg 0x14094 H1S_F_HAVE_CLEN H1S_F_HAVE_O_CONN H1S_F_NOT_FIRST H1S_F_WANT_KAL H1S_F_RX_CONGESTED b.h1c.flg 0x80000020 H1C_F_IS_BACK H1C_F_IN_FULL b.co.flg 0x300 CO_FL_XPRT_READY CO_FL_CTRL_READY b.co.fd.st 0x278a FD_POLL_OUT FD_POLL_PRI 
FD_POLL_IN FD_EV_ERR_RW FD_EV_READY_R 0x2008 req.flg 0x21840000 CF_FLT_ANALYZE CF_DONT_READ CF_AUTO_CONNECT CF_WROTE_DATA req.ana 0x48000 AN_REQ_FLT_END AN_REQ_HTTP_XFER_BODY req.htx.flg 0 0 res.flg 0xa0040000 CF_ISRESP CF_FLT_ANALYZE CF_WROTE_DATA res.ana 0x24000000 AN_RES_FLT_END AN_RES_HTTP_XFER_BODY res.htx.flg 0 0 ----------------------------------- - second example of stuck connection after properly checking for WAIT_INLIST as well: 0x73438d0: [23/Oct/2024:18:46:57.235709] id=3963 proto=tcpv4 flags=0x100c4a, conn_retries=0, conn_exp=<NEVER> conn_et=0x000 srv_conn=0x5dd3f50, pend_pos=(nil) waiting=0 epoch=0x13 p_stc=25 p_req=29 p_res=29 p_prp=29 frontend=public (id=2 mode=http), listener=SSL (id=5) backend=gitweb-haproxy (id=6 mode=http) task=0x72a13e0 (state=0x00 nice=0 calls=24 rate=0 exp=7s tid=0(1/0) age=53s) txn=0x7287260 flags=0x43000 meth=1 status=200 req.st=MSG_DONE rsp.st=MSG_DATA req.f=0x4c rsp.f=0x2e scf=0x729e520 flags=0x00042082 ioto=1m state=EST endp=CONN,0x737ffd0,0x4040d001 sub=2 rex=<NEVER> wex=46s rto=46s wto=46s iobuf.flags=0x00000000 .pipe=0 .buf=0@(nil)+0/0 h2s=0x737ffd0 h2s.id=57 .st=HCR .flg=0x7001 .rxwin=32712 .rxbuf.c=0 .t=0@(nil)+0/0 .h=0@(nil)+0/0 .sc=0x729e520(.flg=0x00042082 .app=0x73438d0) .sd=0x72afd50(.flg=0x4040d001) .subs=0x729e538(ev=2 tl=0x72af760 tl.calls=10 tl.ctx=0x729e520 tl.fct=sc_conn_io_cb) h2c=0x72555a0 h2c.st0=FRH .err=0 .maxid=77 .lastid=-1 .flg=0x60e00 .nbst=1 .nbsc=1 .nbrcv=0 .glitches=0 .fctl_cnt=0 .send_cnt=1 .tree_cnt=1 .orph_cnt=0 .sub=0 .dsi=77 .dbuf=0@(nil)+0/0 .mbuf=[2..2\|32],h=[0@(nil)+0/0],t=[0@(nil)+0/0] .task=0x725e660 .exp=<NEVER> co0=0x7378e00 ctrl=tcpv4 xprt=SSL mux=H2 data=STRM target=LISTENER:0x2f24800 flags=0x80040300 fd=23 fd.state=1122 updt=0 fd.tmask=0x1 scb=0x2ee74c0 flags=0x00001211 ioto=1m state=EST endp=CONN,0x7287190,0x106c0001 sub=0 rex=<NEVER> wex=<NEVER> rto=46s wto=<NEVER> iobuf.flags=0x00000000 .pipe=0 .buf=0@(nil)+0/0 h1s=0x7287190 h1s.flg=0x14094 .sd.flg=0x106c0001 
.req.state=MSG_DONE .res.state=MSG_DATA .meth=GET status=200 .sd.flg=0x106c0001 .sc.flg=0x00001211 .sc.app=0x73438d0 .subs=(nil) h1c=0x7373920 h1c.flg=0x80000020 .sub=0 .ibuf=32704@0x7272700+318/32768 .obuf=0@(nil)+0/0 .task=0x729e700 .exp=<NEVER> co1=0x72f5290 ctrl=tcpv4 xprt=RAW mux=H1 data=STRM target=SERVER:0x5dd3f50 flags=0x00000300 fd=19 fd.state=10122 updt=0 fd.tmask=0x1 filters={0x728f1f0="cache store filter" [3], 0x728fea0="compression filter" [28]} req=0x73438f8 (f=0x21840000 an=0x48000 tofwd=0 total=224) an_exp=<NEVER> buf=0x7343900 data=(nil) o=0 p=0 i=0 size=0 htx=0x105f440 flags=0x0 size=0 data=0 used=0 wrap=NO extra=0 res=0x7343940 (f=0xa0040000 an=0x24000000 tofwd=0 total=359574) an_exp=<NEVER> buf=0x7343948 data=0x72b1b30 o=16333 p=16333 i=16435 size=32768 htx=0x72b1b30 flags=0x8 size=32720 data=16333 used=1 wrap=NO extra=0 ----------------------------------- strm.flg 0x100c4a SF_SRV_REUSED SF_HTX SF_REDIRECTABLE SF_CURR_SESS SF_BE_ASSIGNED SF_ASSIGNED task.state 0 0 txn.meth 1 GET txn.flg 0x43000 TX_NOT_FIRST TX_CACHE_COOK TX_CACHEABLE txn.req.flg 0x4c HTTP_MSGF_BODYLESS HTTP_MSGF_VER_11 HTTP_MSGF_XFER_LEN txn.rsp.flg 0x2e HTTP_MSGF_COMPRESSING HTTP_MSGF_VER_11 HTTP_MSGF_XFER_LEN HTTP_MSGF_TE_CHNK f.sc.flg 0x42082 SC_FL_EOS SC_FL_SND_EXP_MORE SC_FL_WONT_READ SC_FL_EOI f.sc.sd.flg 0x4040d001 SE_FL_HAVE_NO_DATA SE_FL_MAY_FASTFWD_CONS SE_FL_EOS SE_FL_EOI SE_FL_NOT_FIRST SE_FL_T_MUX f.h2s.flg 0x7001 H2_SF_HEADERS_RCVD H2_SF_OUTGOING_DATA H2_SF_HEADERS_SENT H2_SF_ES_RCVD f.h2s.sd.flg 0x4040d001 SE_FL_HAVE_NO_DATA SE_FL_MAY_FASTFWD_CONS SE_FL_EOS SE_FL_EOI SE_FL_NOT_FIRST SE_FL_T_MUX f.h2c.flg 0x60e00 H2_CF_END_REACHED H2_CF_RCVD_SHUT H2_CF_MBUF_HAS_DATA H2_CF_DEM_IN_PROGRESS H2_CF_DEM_SHORT_READ f.co.flg 0x80040300 CO_FL_XPRT_TRACKED CO_FL_SOCK_RD_SH CO_FL_XPRT_READY CO_FL_CTRL_READY f.co.fd.st 0x1122 FD_POLL_HUP FD_POLL_IN FD_EV_READY_W FD_EV_READY_R b.sc.flg 0x1211 SC_FL_SND_NEVERWAIT SC_FL_NEED_ROOM SC_FL_NOHALF SC_FL_ISBACK b.sc.sd.flg 0x106c0001 
SE_FL_WAIT_DATA SE_FL_MAY_FASTFWD_CONS SE_FL_MAY_FASTFWD_PROD SE_FL_WANT_ROOM SE_FL_RCV_MORE SE_FL_T_MUX	2024-10-23 19:17:10 +02:00
Willy Tarreau	a1d0e58b06	BUILD: spoe: fix build warning on older gcc around sub-struct initialization gcc-4.8 is unhappy with the cfg_file initialization: src/flt_spoe.c: In function 'parse_spoe_flt': src/flt_spoe.c:2202:9: warning: missing braces around initializer [-Wmissing-braces] struct cfgfile cfg_file = {0}; ^ src/flt_spoe.c:2202:9: warning: (near initialization for 'cfg_file.list') [-Wmissing-braces] This is due to the embedded list member. Initializing it to empty like we do almost everywhere else makes it happy. No backport is needed as this was changed in 3.1-dev5 only.	2024-10-23 15:12:59 +02:00
Aurelien DARRAGON	b5b40a9843	BUG/MEDIUM: connection/http-reuse: fix address collision on unhandled address families As described in GH #2765, there were situations where http connections would be re-used for requests to different endpoints, which is obviously unexpected. In GH #2765, this occurred with httpclient and UNIX socket combination, but later code analysis revealed that while disabling http reuse on httpclient proxy helped, it didn't fix the underlying issue since it was found that conn_calculate_hash_sockaddr() didn't take into account families such as AF_UNIX or AF_CUST_SOCKPAIR, and because of that the sock_addr part of the connection wasn't hashed. To properly fix the issue, let's explicitly handle UNIX (both regular and ABNS) and AF_CUST_SOCKPAIR families, so that the destination address is properly hashed. To prevent this bug from re-appearing: when the family isn't known, instead of doing nothing like before, let's fall back to a generic (suboptimal) hashing which hashes the whole sockaddr_storage struct. As a workaround, http-reuse may be disabled on impacted proxies (unfortunately this doesn't help for httpclient since reuse policy defaults to safe and cannot be modified from the config). It should be backported to all stable versions. Shout out to @christopherhibbert for having reported the issue and provided a trivial reproducer. [ada: prior to 3.0, ctx adjt is required because conn_hash_update()'s prototype is slightly different]	2024-10-23 11:48:16 +02:00
Willy Tarreau	b74fb1325e	MINOR: sample: add the "when" converter to condition some expressions Sometimes it would be desirable to include some debugging output only under certain conditions, but the end of the transfer is too late to apply some rules. Here we take the approach of making a converter ("when") that takes a condition among an arbitrary list, and decides whether or not to let the input sample pass through based on the condition. This allows for example to log debugging information only when an error was encountered during the processing (sort of an extension of dontlog-normal). The conditions are quite limited (stopping, error, normal, toapplet, forwarded, processed) and can be negated. The converter can also be chained to use more complex conditions. A suggested example will be: # log "dbg={-}" when fine, or "dbg={... debug info ...}" on error: log-format "$HAPROXY_HTTP_LOG_FMT dbg={%[bs.debug_str,when(!normal)]}"	2024-10-22 20:13:00 +02:00
Willy Tarreau	19e4ec43b9	MINOR: filters: add per-filter call counters The idea here is to record how many times a filter is being called on a stream. We're incrementing the same counter all along, regardless of the type of event, since the purpose is essentially to detect one that might be misbehaving. The number of calls is reported in "show sess all" next to the filter name. It may also help detect suboptimal processing. For example compressing 1GB shows 138k calls to the compression filter, which is roughly two calls per buffer. Maybe we wake up with incomplete buffers and compress less. That's left for a future analysis.	2024-10-22 20:13:00 +02:00
Willy Tarreau	37d5c6fe3a	MINOR: stream: maintain per-stream counters of the number of passes on code Process_stream() is a complex function and a few times some loops were either witnessed or suspected. Each time this happens it's extremely difficult to figure out why because it involves combinations of analysers, filters, errors etc. Let's at least maintain a set of 4 counters per stream that report the number of times we've been through each of the 4 most important blocks (stconn changes, request analysers, response analysers, and propagation of changes down). These ones are stored in the stream and reported in "show sess all", just like they will be reported in panic dumps.	2024-10-22 20:13:00 +02:00
Christopher Faulet	ce314cfb39	MINOR: mux-h1: Add support of the debug string for logs Now it is possible to have info about front and back H1 multiplexer. For instance: <134>Oct 22 18:10:46 haproxy[3841864]: 127.0.0.1:44280 [22/Oct/2024:18:10:43.265] front-http back-http/www 0/0/-1/-1/3082 503 217 - - SC-- 1/1/0/0/3 0/0 "GET / HTTP/1.1" fs=< h1s=0x13b6f10 h1s.flg=0x14010 .sd.flg=0x50404601 .req.state=MSG_DONE .res.state=MSG_DONE .meth=GET status=503 .sd.flg=0x50404601 .sc.flg=0x00034482 .sc.app=0x11e4c30 .subs=(nil) h1c.flg=0x0 .sub=0 .ibuf =0@(nil)+0/0 .obuf=0@(nil)+0/0 .task=0x1337d10 .exp=<NEVER> conn.flg=0x80000300> bs=< h1s=0x13bb400 h1s.flg=0x100010 .sd.flg=0x10400001 .req.state=MSG_RQBEFORE .res.state=MSG_RPBEFORE .meth=UNKNOWN status=0 .sd.flg=0x10400001 .sc.flg=0x0003c007 .sc.app=0x11e4c30 .subs=(nil) h1c.flg=0x80000000 .sub=0 .ibuf=0@(nil)+0/0 .obuf=0@(nil)+0/0 .task=0x12ba610 .exp=<NEVER> conn.flg=0x5c0300> To have this log message, the log-format must be set to: log-format "$HAPROXY_HTTP_LOG_FMT fs=<%[fs.debug_str]> bs=<%[bs.debug_str]>"	2024-10-22 18:21:28 +02:00
Christopher Faulet	35ab9b8c6d	DEBUG: mux-h1: Add debug counters to track some errors Debug counters are added to track errors about a wrong payload length during the message formatting (on the sending path). Aborts are also concerned. Connection shutdowns and errors while the end of the message was not reached are now tracked. On the sending path, shutdowns performed while the whole message was not forwarded are tracked too.	2024-10-22 17:39:32 +02:00
Christopher Faulet	c8aecc393b	DEBUG: stream: Add debug counters to track some client/server aborts Not all aborts are tracked for now but only those a bit ambiguous. Mainly, aborts during the data forwarding are concerned. Those triggered during the request or the response analysis are easier to analyze with the stream termination state.	2024-10-22 16:46:37 +02:00
Christopher Faulet	19b736a5fb	CLEANUP: stream: remove outdated comments Comments added during a refactoring session were still there while they are now totally useless. So let's remove them.	2024-10-22 16:14:15 +02:00
Christopher Faulet	7dc930d231	BUG/MINOR: stconn: Pretend the SE have more data to deliver on abortonclose When abortonclose option is enabled on the backend, at the SC level, we must still pretend the SE has more data to deliver to be able to receive the EOS. It must be performed at 2 places: * When the backend is set and the connection is requested. It is when the option is seen for the first time. * After a receive attempt, if the EOI flag is set on the sedesc. Otherwise, when an abort is detected by the mux, the SC is not notified. This patch should fix the issue #2764. This bug probably exists in all stable versions but is only visible since bca5e1423 ("OPTIM: stconn: Don't pretend mux have more data to deliver on EOI/EOS/ERROR"). So I suggest not backporting it for now, unless the commit above is backported.	2024-10-22 11:16:24 +02:00
Christopher Faulet	ded28f6e5c	BUG/MEDIUM: mux-h2: Remove H2S from send list if data are sent via 0-copy FF When data are sent via the zero-copy data forwarding, in h2_done_ff, we must be sure to remove the H2 stream from the send list if something is sent. It was only performed if no blocking condition was encountered. But we must also do it if something is sent. Otherwise the transfer may be blocked till timeout. This patch must be backported as far as 2.9.	2024-10-22 08:00:32 +02:00
Christopher Faulet	529e4f36a3	BUG/MEDIUM: stats-html: Never dump more data than expected during 0-copy FF During the zero-copy data forwarding, the caller specifies the maximum amount of data the producer may push. However, the HTML stats applet does not use it and can fill all the free space in the buffer. It is especially an issue when the consumer is limited by a flow control, like the H2, because we may emit too large a DATA frame in this case. It is especially visible with big buffers (for instance 32k). In the early age of zero-copy data forwarding, the caller was responsible for passing a properly resized buffer. And during the different refactoring steps, this has changed but the HTML stats applet was not updated accordingly. To fix the bug, the buffer used to dump the HTML page is resized to be sure not too much data are dumped. This patch should solve the issue #2757. It must be backported to 3.0.	2024-10-22 08:00:32 +02:00
Willy Tarreau	f2c415cec1	MINOR: debug: add "debug dev counters" to list code counters Issuing "debug dev counters" on the CLI will now scan all existing counters, and report their count, type, location, function name, the condition and an optional comment passed to the macro. The command takes a number of arguments: - "show": this is the default, it will just list the counters - "reset": will reset the matching counters instead of listing them - "all": by default, only non-zero counters are listed. With "all", they are all listed - "bug": restrict the reset or dump to counters of type "BUG" (BUG_ON usually) - "chk": restrict the reset or dump to counters of type "CHK" (CHECK_IF) - "cnt": restrict the reset or dump to counters of type "CNT" (COUNT_IF) The types may be cumulated, and the option entered in any order. Here's an example of the output of "debug dev counters show all bug": Count Type Location function(): "condition" [comment] 0 BUG ring.h:114 ring_dup(): "max > ring_size(dst)" 0 BUG vecpair.h:223 vp_getblk_ofs(): "ofs >= v1->len + v2->len" 0 BUG buf.h:395 b_add(): "b->data + count > b->size" 0 BUG buf.h:106 b_room(): "b->data > b->size" 0 BUG task.h:328 _task_queue(): "(ulong)caller & 1" 0 BUG task.h:324 _task_queue(): "task->tid != tid" 0 BUG task.h:313 _task_queue(): "(ulong)caller & 1" (...) This is expected to be convenient combined with the use and abuse of COUNT_IF() at select locations.	2024-10-21 19:17:55 +02:00
Willy Tarreau	da66c42f65	MINOR: debug: add a new debug macro COUNT_IF() This macro works exactly like BUG_ON() except that it never logs anything nor crashes, it only implements an atomic counter that is incremented on every call. This can be used to count a number of unlikely events that are worth checking at run time on setups showing unusual and unreproducible behaviors.	2024-10-21 19:14:07 +02:00
Willy Tarreau	776fd03509	MEDIUM: debug: add match counters for BUG_ON/WARN_ON/CHECK_IF These macros do not always kill the process, and sometimes it would be nice to know if some match or not, and how many times (especially for the CHECK_IF one). This commit adds a new section "dbg_cnt" made of structs that contain function name, file name, line number, check type, condition and match count. A new macro __DBG_COUNT() adds one to the counter, and is placed inside _BUG_ON() and _BUG_ON_ONCE(). It's worth noting that the exact type of the check is not very precise but in practice we don't care, as most checks will cause the process to die anyway unless they're of type _BUG_ON_ONCE() (used by CHECK_IF by default). All of this is limited to !defined(USE_OBSOLETE_LINKER) because we're creating a section, thus we need a modern linker to be able to scan this section later. Doing so adds ~50kB to the executable due to the ~1266 BUG_ON() and others placed there. That's not huge in comparison to the visibility it can provide.	2024-10-21 19:14:07 +02:00
Willy Tarreau	8844ed2009	CLEANUP: debug: make the BUG_ON() macros check the condition in the outer one The BUG_ON() macros are made of two levels so as to resolve the condition to a string. However this doesn't offer much flexibility for performing other operations when the condition is validated, so let's adjust them so that the condition is checked in the outer macro and the operations are performed in the inner one.	2024-10-21 18:17:25 +02:00
Amaury Denoyelle	68c8c91023	BUG/MINOR: mux-quic: do not close STREAM with empty FIN if no data sent A stream may be shut without any HTX EOM reported to report a proper closure. This is the case for QCS instances flagged with QC_SF_UNKNOWN_PL_LENGTH. Shut is performed with an empty FIN emission instead of a RESET_STREAM. This has been implemented since the following patch : 24962dd1784dd22babc8da09a5fc8769617f89e3 BUG/MEDIUM: mux-quic: do not emit RESET_STREAM for unknown length However, in case of HTTP/3, an empty FIN should only be done after a full message is emitted, which requires at least a HEADERS frame. If an empty FIN is emitted without it, client may interpret this as invalid and close the connection. To prevent this, fallback to a RESET_STREAM emission if no data were emitted on the stream. This was reproduced using ngtcp2-client with 10% loss (-r 0.1) on a remote host, with httpterm request "/?s=100k&C=1&b=0&P=400". An error ERR_H3_FRAME_UNEXPECTED is returned by ngtcp2-client when the bug occurs. Note that this change is incomplete. The message validity depends solely on the application protocol in use. As such, a new app_ops callback should be implemented to ensure the stream is closed accordingly. However, this first patch ensures that at least HTTP/3 case is valid while keeping a minimal backport process. This should be backported up to 2.8.	2024-10-21 11:24:38 +02:00
Amaury Denoyelle	b200d3d80b	MINOR: mux-quic: simplify sending of empty STREAM FIN An empty STREAM frame can be emitted by QUIC MUX to notify about a delayed FIN when there is no data left to transmit. This requires a tedious comparison on stream offset in qmux_ctrl_send() to ensure an empty stream frame is not always considered as retransmitted, which is necessary to locally close the QCS instance. Simplify this by unsubscribing from the streamdesc layer when the QCS is locally closed on FIN transmission notification. This prevents all future retransmitted frames from being reported to the QCS instance, especially any potentially retransmitted empty FIN.	2024-10-21 11:21:07 +02:00
Valentine Krasnobaeva	af1d170122	BUG/MINOR: mworker: fix mworker-max-reloads parser Before this patch, when a wrong argument was provided in the configuration for the mworker-max-reloads keyword, the parser showed the errors below on stderr: [WARNING] (1820317) : config : parsing [haproxy.cfg:154] : (null)parsing [haproxy.cfg:154] : 'mworker-max-reloads' expects an integer argument. In the case where two arguments were provided by mistake instead of one, this also triggered a buggy error message: [ALERT] (1820668) : config : parsing [haproxy.cfg:154] : 'mworker-max-reloads' cannot handle unexpected argument '45'. [WARNING] (1820668) : config : parsing [haproxy.cfg:154] : (null) So, as 'mworker-max-reloads' is parsed in discovery mode by the master process, let's now align its parser with all the others that may be called in this mode. This way, when there are too many args or the argument isn't a valid integer, we return proper error codes to the global section parser and messages are formatted properly. This fix should be backported to all stable versions.	2024-10-21 10:46:58 +02:00
Ilya Shipitsin	8a1aabb133	CI: modernize macos builds to macos-15 macos-15 support was announced a few months ago: https://github.com/github/roadmap/issues/986	2024-10-21 07:54:38 +02:00
Ilya Shipitsin	50cf89ad5c	CI: bump development builds explicitely to Ubuntu 24.04 Initially we agreed to split builds into "latest" for development branch and fixed 22.04 for stable branches. It got broken when the "latest" label migrated from ubuntu-22 to ubuntu-24 ... because of the build cache. The cache key is built using the runner label, and it was not prepared to use the same "latest" cache from ubuntu 22 on ubuntu 24. To make things clear, let's stick explicitly to ubuntu 24.	2024-10-21 07:54:35 +02:00
Ilya Shipitsin	b6491ab19f	CI: prepare Coverity build for Ubuntu 24 PCRE2 is recommended, PCRE was chosen for no reason. GHA Ubuntu 22 images include both libs, but recent Ubuntu 24 does not. Let us prepare for Ubuntu 24	2024-10-21 07:54:32 +02:00
Willy Tarreau	9aa86b9dbd	BUILD: mux-h2/traces: fix build on 32-bit due to size of the DATA frame Commit cf3fe1eed ("MINOR: mux-h2/traces: print the size of the DATA frames") added the size of the DATA frame to the traces. Unfortunately it uses ullong instead of ulong to cast a pointer, which breaks the build on 32-bit platforms. Let's just switch it to ulong which works on both.	2024-10-21 04:17:59 +02:00
Willy Tarreau	278b9613a3	MEDIUM: debug: on panic, make the target thread automatically allocate its buf One main problem with panic dumps is that they're filling the dumping thread's trash, and that the global thread_dump_buffer is too small to catch enough of them. Here we're proceeding differently. When dumping threads for a panic, we're passing the magic value 0x2 as the buffer, and it will instruct the target thread to allocate its own buffer using get_trash_chunk() (which is signal safe), so that each thread dumps into its own buffer. Then the thread will wait for the buffer to be consumed, and will assign its own thread_dump_buffer to it. This way we can simply dump all threads' buffers from gdb like this: (gdb) set $t=0 while ($t < global.nbthread) printf "%s\n", ha_thread_ctx[$t].thread_dump_buffer.area set $t=$t+1 end For now we make it wait forever since it's only called on panic and we want to make sure the thread doesn't leave and continues to use that trash buffer or do other nasty stuff. That way the dumping thread will make all of them die. This would be useful to backport to the most recent branches to help troubleshooting. It backports well to 2.9, except for some trivial context in tinfo-t.h for an updated comment. 2.8 and older would also require TAINTED_PANIC. The following previous patches are required: MINOR: debug: make mark_tainted() return the previous value MINOR: chunk: drop the global thread_dump_buffer MINOR: debug: split ha_thread_dump() in two parts MINOR: debug: slightly change the thread_dump_pointer signification MINOR: debug: make ha_thread_dump_done() take the pointer to be used MINOR: debug: replace ha_thread_dump() with its two components	2024-10-19 16:01:52 +02:00
Willy Tarreau	afeac4bc02	MINOR: debug: replace ha_thread_dump() with its two components At the few places we were calling ha_thread_dump(), now we're calling separately ha_thread_dump_fill() and ha_thread_dump_done() once the data are consumed.	2024-10-19 15:42:34 +02:00
Willy Tarreau	d7c34ba479	MINOR: debug: make ha_thread_dump_done() take the pointer to be used This will allow the caller to decide whether to definitely clear the pointer and release the thread, or to leave it unlocked so that it's easy to analyse from the struct (the goal will be to use that in panic() so that cores are easy to analyse).	2024-10-19 15:42:07 +02:00
Willy Tarreau	091de0f9b2	MINOR: debug: slightly change the thread_dump_pointer signification Now the thread_dump_pointer is returned ORed with 1 once done, or NULL when cancelled (for now no one cancels). The goal will be to permit the callee to provide its own pointer. The ha_thread_dump_fill() function now returns the buffer pointer that was used (without OR 1) or NULL, for ease of use from the caller.	2024-10-19 15:42:07 +02:00
Willy Tarreau	2036f5bba1	MINOR: debug: split ha_thread_dump() in two parts We want to have a function to trigger the dump and another one to wait for it to be completed. This will be important to permit panic dumps to be done on local threads. For now this does not change anything, as the function still calls the two new functions one after the other.	2024-10-19 15:42:07 +02:00
Willy Tarreau	a6698304e0	MINOR: chunk: drop the global thread_dump_buffer This variable is not very useful and is confusing anyway. It was mostly used to detect that a panic dump was still in progress, but we can now check mark_tainted() for this. The pointer was set to one of the dumping thread's trash chunks. Let's temporarily continue to copy the dumps to that trash, we'll remove it later.	2024-10-19 15:42:00 +02:00
Willy Tarreau	8e048603d1	MINOR: debug: make mark_tainted() return the previous value Since mark_tainted() uses atomic ops to update the tainted status, let's make it return the prior value, which will allow the caller to detect if it's the first one to set it or not.	2024-10-19 15:13:47 +02:00
Willy Tarreau	84340d108b	OPTIM: buffers: avoid a useless wrapping check for ofs == 0 As mentioned in previous commit, b_peek_ofs() performs a wrapping check but is often called with ofs == 0 as a constant. We can detect this case with __builtin_const_p() so it makes sense to use it. A test shows a size reduction of about 320 bytes, which is not much, but it happens in hot code paths, and each 16 bytes reduction indicates an eliminated conditional branch. Some clear winners are ci_getblk_nc() (-48 bytes), h2c_dec_hdrs (-141B), h1_copy_msg_data (-124B), tcpcheck_spop_expect_hello (-80B), h1_parse_msg_data (-44B). These ones will definitely benefit from doing less conditional jumps.	2024-10-18 18:42:47 +02:00
Willy Tarreau	fca212292a	CLEANUP: buffers: simplify b_get_varint() The function is an exact copy of b_peek_varint() with ofs==0 and doing a b_del() at the end. We can simply call that other one and delete the contents. It turns out that the code is bigger with this change because b_peek_varint() passes its offset to b_peek() which performs a wrapping check. When ofs==0 the wrapping cannot happen, but there's no real way to tell that to the compiler. Instead conditioning the if() in b_peek() with (!__builtin_constant_p(ofs) \|\| ofs) does the job, but it's not worth it at the moment since we have no users of b_get_varint() for now. Let's just stick to the simple normal code.	2024-10-18 18:28:39 +02:00
Willy Tarreau	8b5a1fd1fc	BUILD: buffers: keep b_getblk_nc() and b_peek_varint() in buf.h Some large functions were moved to buf.c by commit ac66df4e2 ("REORG: buffers: move some of the heavy functions from buf.h to buf.c"). However, as found by Amaury, haring doesn't build anymore. Upon close inspection, b_getblk_nc() isn't that big since it's very much inlinable, and a part of its apparently large size comes from the BUG_ON_HOT() that were implemented. Regarding b_peek_varint(), it doesn't have any dependency and is used only at 4 places in the DNS code, so its loop will not have big impacts, and the rest around can be optimised away by the compiler so it remains relevant to keep it inlined. Also it can serve as a base to deduplicate the code in b_get_varint(). No backport needed.	2024-10-18 17:53:25 +02:00
Dragan Dosen	f33e9079a9	MINOR: arg: add an argument type for identifier The ARGT_ID argument type may now be used to set a custom resolve function in order to help resolve the argument string value. If the custom resolve function is not set, the behavior is the same as of type ARGT_STR.	2024-10-18 14:30:24 +02:00
Dragan Dosen	40ab88899c	BUG/MINOR: sample: free err2 in smp_resolve_args for type ARGT_REG The err2 may be leaking memory in case an error occurred as a result of regex_comp() call.	2024-10-18 14:29:56 +02:00
Aurelien DARRAGON	9262b7109e	CLEANUP: http_ext: remove useless BUG_ON() in http_handle_xot_header() A useless BUG_ON() statement was left in a conditional block that already checks that the condition cannot be met within the block. Remove the useless BUG_ON()	2024-10-17 17:25:06 +02:00
Aurelien DARRAGON	d28d016f43	MINOR: http_ext: implement rfc7239_{nn,np} converters "option forwarded" provides a convenient way to automatically insert rfc7239 forwarded header to requests sent to servers. On the other hand, manually crafting the header is quite complicated due to specific formatting rules that must be followed as per rfc7239. However, sometimes it may be necessary to craft the header manually, for instance if it has to be conditional or based on parameters that "option forwarded" doesn't provide. To ease this task, in this patch we implement rfc7239_nn and rfc7239_np which are respectively meant to craft nodename: nodeport values, specifically intended to manually build rfc7239 'for' and 'by' header fields while ensuring rfc7239 compliancy. Example: # build RFC-compliant 7239 header: http-request set-var-fmt(txn.forwarded) "for=\"%[ipv6(::1),rfc7239_nn]:%[str(8888),rfc7239_np]\";host=\"haproxy.org\";proto=http" # check RFC-compliancy: http-request set-var(txn.test) "var(txn.forwarded),debug(ok,stderr),rfc7239_is_valid,debug(ok,stderr)" # stderr output: # [debug] ok: type=str <for="[::1]:_8888";host="haproxy.org";proto=http> # [debug] ok: type=bool <1> See documentation for more info and examples.	2024-10-17 17:24:58 +02:00
Aurelien DARRAGON	45cbbdc845	DOC: config: fix rfc7239 forwarded typo in desc replace specicy with specify in rfc7239 forwarded option description. Multiple occurrences were found. May be backported in 2.8.	2024-10-17 17:24:51 +02:00
Frederic Lecaille	b1af5dabf0	BUG/MEDIUM: quic: avoid freezing 0RTT connections This issue came with this commit: f627b92 BUG/MEDIUM: quic: always validate sender address on 0-RTT and could be easily reproduced with picoquic QUIC client with -Q option which splits a big ClientHello TLS message into two Initial datagrams. A second condition must be fulfilled to reproduce this issue: picoquic must not send the token provided by haproxy (NEW_TOKEN). To do that, haproxy must be patched to prevent it from sending such tokens. Under these conditions, if haproxy has enough time to reply to the first Initial datagrams, when it receives the second Initial datagram it sends a Retry packet. Then the client ignores the Retry packet as mentioned by RFC 9000: 17.2.5.2. Handling a Retry Packet A client MUST accept and process at most one Retry packet for each connection attempt. After the client has received and processed an Initial or Retry packet from the server, it MUST discard any subsequent Retry packets that it receives. On its side, haproxy has closed the connection. When it receives the second Initial datagram, it opens a new connection but with Initial packets it cannot decrypt (wrong ODCID) leaving the client without response. To fix this, as the aim of the token (NEW_TOKEN) sent by haproxy is to validate the peer address, in place of closing the connection when no token was received for a 0RTT connection, one leaves this validation to the handshake process. Indeed, the peer address is validated during the handshake when a valid handshake packet is received by the listener. But as one does not want haproxy to process 0RTT data when no token was received, one does not accept the connection before the successful handshake completion. In addition to this, the 0RTT packets are not released after successful handshake completion when no token was received to leave a chance to haproxy to process these 0RTT data in such case (see quic_conn_io_cb()). Must be backported as far as 2.9.	2024-10-17 15:04:06 +02:00
Frederic Lecaille	c7f14a38f5	MINOR: quic: send new tokens (NEW_TOKEN) even for 1RTT sessions Tokens are sent when opening a connection, just after the handshake, to be possibly reused by the peer for the next connection. They are used to validate the peer address during the 0RTT connection openings. But there is no reason to reserve this feature to 0RTT connections. This patch modifies quic_build_post_handshake_frames() to do so.	2024-10-17 15:04:06 +02:00
Frederic Lecaille	19aa320f64	BUG/MINOR: quic: avoid leaking post handshake frames This bug came with this commit: f627b92 BUG/MEDIUM: quic: always validate sender address on 0-RTT If an error happens in quic_build_post_handshake_frames() during the code executed for the NEW_TOKEN frame allocation, some frames could leak because of the wrong label used to interrupt this function asap. Replace the "goto leave" by "goto err" to deallocate such frames and fix this issue. Must be backported as far as 2.9.	2024-10-17 15:04:06 +02:00
Christopher Faulet	e7be13da87	REGTESTS: Never reuse server connection in http-messaging/truncated.vtc A "Connection: close" header is added to responses to avoid any connection reuse. This should avoid errors on the client side.	2024-10-17 14:44:01 +02:00
Christopher Faulet	52a3d807fc	BUG/MAJOR: filters/htx: Add a flag to state the payload is altered by a filter When a filter is registered on the data, it means it may change the payload length by rewriting data. It means consumers of the message cannot trust the expected length of payload as announced by the producer. The commit 8bd835b2d2 ("MEDIUM: filters/htx: Don't rely on HTX extra field if payload is filtered") was pushed to solve this issue. When the HTTP payload of a message is filtered, the extra field is set to 0 to be sure it will never be used by error by any consumer. However, it is not enough. Indeed, the filters must be called before forwarding some data. They cannot be by-passed. But if a consumer is unable to flush the HTX message, some outgoing data can remain blocked in the channel's buffer. If some new data are then pushed because there is some room in the channel's buffer, the producer will set the HTX extra field. At this stage, if the consumer is unblocked and can send again data, it is possible to call it to forward outgoing data blocked in the channel's buffer before waking the stream up to filter new input data. It is the purpose of the data fast-forwarding. In this case, the HTX extra field will be seen by the consumer. It is unexpected and leads to undefined behavior. One consequence of this bug is to perform a wrong chunking on compressed messages, leading to processing errors at the end of the message, reported as "ID--" in logs. To fix the bug, a HTX flag is added to state the payload of the current HTX message is altered. When this flag is set (HTX_FL_ALTERED_PAYLOAD), the HTX extra field must not be trusted. And to keep things simple, when this flag is set, the HTX extra field is automatically set to 0 when the HTX message is loaded, in htxbuf() function. It is probably the least intrusive way to fix the bug for now. But this part must be reviewed to save meta-info of the HTX message outside of the message itself. This commit should solve the issue #2741. It must be backported as far as 2.9.	2024-10-17 13:54:54 +02:00
Christopher Faulet	0fcfed9e23	BUG/MEDIUM: stconn: Check FF data of SC to perform a shutdown in sc_notify() In sc_notify() function, the consumer side of the SC is tested to verify if we must perform a shutdown on the endpoint. To do so, no output data must be present in the buffer and in the iobuf. However, there is a bug here, the iobuf of the opposite SC is tested instead of the one of the current SC. So a shutdown can be performed on the endpoint while there are still output data in the iobuf that must be sent. Concretely, it can only be data blocked in a pipe. Because of this bug, data blocked in the pipe will never be sent. I've not tested but I guess this may block the stream during the client or server timeout. This patch must be backported as far as 2.9.	2024-10-17 13:53:40 +02:00
Christopher Faulet	6790067e79	BUG/MINOR: http-ana: Don't report a server abort if response payload is invalid If a parsing error is reported by the mux on the response payload, a proxy error (PRXCOND) must be reported instead of a server abort (SRVCL). Because of this bug, invalid responses may be reported as "SD--" or "SL--" in logs instead of "PD--" or "PL--". This patch must be backported to all stable versions.	2024-10-17 13:53:40 +02:00
Christopher Faulet	f98feda53f	MINOR: mux-h1: Add a trace on shutdown when keep-alive is not possible When the stream is shut down, some tests are performed to know if the connection must also be closed or not. There are trace messages for all cases, except for the default one: Abort or close-mode. Thanks to this patch, there is now a message too in this case.	2024-10-17 13:53:40 +02:00
Christopher Faulet	2c82ca60c6	MINOR: mux-h1: Show the SD iobuf in trace messages on stream send events Info about the SD iobuf are now dumped in trace messages when a stream send event is processed. It is a useful information to debug zero-copy forwarding issues.	2024-10-17 13:53:40 +02:00
Christopher Faulet	48f1e2b6fe	BUG/MEDIUM: stconn: Wait iobuf is empty to shut SE down during a check send When a send attempt is performed on the opposite side from sc_notify() and all outgoing data are sent while a shut was scheduled, the SE is shut down because we consider all data were sent and no more are expected. However, here we must also be careful to have sent all pending data in the iobuf. Indeed, some spliced data may be blocked. In this case, if the SE is shut down, these data may be lost. This patch should fix the original bug reported in #2749. It must be backported as far as 2.9.	2024-10-17 13:53:40 +02:00
William Lallemand	043f11e891	MINOR: mworker/ocsp: skip ocsp-update proxy init in master The proxy must be created in mworker mode, but only in the worker, not in the master. The current code creates the proxy in both processes. The patch only checks that we are not in the master to start the ocsp-update pre-check. No backport needed.	2024-10-17 12:30:59 +02:00
William Lallemand	5184f3fb30	BUG/MINOR: resolvers/mworker: missing default resolvers in mworker mode Since commit fe75c1e12da061 ("MEDIUM: startup: remove MODE_MWORKER_WAIT") the MODE_MWORKER_WAIT constant disappeared. The initialization of the default resolvers section was conditioned by this constant. The section must be created in mworker mode, but only in the worker not in the master. It was currently completely disabled in both the master and the worker which could break configuration using it, as well as the httpclient. No backport needed.	2024-10-17 12:17:23 +02:00
William Lallemand	fdbff3a020	BUG/MEDIUM: mworker/httpclient: initialization skipped by accident in mworker mode Since commit fe75c1e12da061 ("MEDIUM: startup: remove MODE_MWORKER_WAIT") the MODE_MWORKER_WAIT constant disappeared. The initialization of the httpclient proxy was conditioned by this constant. The proxy must be created in mworker mode, but only in the worker not in the master. It was currently completely disabled in both the master and the worker provoking a NULL dereference upon httpclient usage. No backport needed.	2024-10-17 12:16:35 +02:00
William Lallemand	e7b7072943	BUG/MINOR: httpclient: return NULL when no proxy available during httpclient_new() Latest patches on the mworker rework skipped the httpclient_proxy creation by accident. This is not supposed to happen because haproxy is supposed to stop when the proxy creation failed, but it shows a flaw in the API. When the httpclient_proxy or the proxy used in parameter of httpclient_new_from_proxy() is NULL, it will be dereferenced and cause a crash. The patch only returns a NULL when doing an httpclient_new() if the proxy is not available. Must be backported as far as 2.7.	2024-10-17 11:57:29 +02:00
Willy Tarreau	1fb61475f2	[RELEASE] Released version 3.1-dev10 Released version 3.1-dev10 with the following main changes : - BUG/MAJOR: mux-quic: do not crash on empty STREAM frame emission - BUG/MINOR: stats: Fix the name for the total number of streams created - MINOR: quic: strengthen qc_release_frm() - MEDIUM: quic: decount acknowledged data for MUX txbuf window - MINOR: quic: implement dedicated type for out-of-order stream ACK - MEDIUM: quic: merge contiguous/overlapping buffered ack stream range - MEDIUM: quic: decount out-of-order ACK data range for MUX txbuf window - MINOR: log: add do_log() logging helper - MINOR: log: add do_log_parse_act() helper func - MINOR: action: add do-log action - REGTESTS: add some tests for 'do-log' action - BUG/MEDIUM: hlua: make hlua_ctx_renew() safe - BUG/MEDIUM: hlua: properly handle sample func errors in hlua_run_sample_{fetch,conv}() - BUG/MINOR: quic: fix discarding of already stored out-of-order ACK - BUG/MEDIUM: quic: properly decount out-of-order ACK on stream release - MINOR: ssl: disable server side default CRL check with WolfSSL - MEDIUM: sink: implement sink_find_early() - MINOR: trace: postresolve sink names - MINOR: sample: postresolve sink names in debug() converter - BUG/MEDIUM: mux-quic: ensure timeout server is active for short requests - MINOR: cfgparse: simulate long configuration parsing with force-cfg-parser-pause - BUILD: cache: silence an uninitialized warning at -Og with gcc-12.2 - BUG/MINOR: mux-h2/traces: present the correct buffer for trailers errors traces - MINOR: mux-h2/traces: print the size of the DATA frames - CLEANUP: muxes: remove useless inclusion of ebmbtree.h - REORG: buffers: move some of the heavy functions from buf.h to buf.c - MINOR: buffer: add a buffer list type with functions - MINOR: mux-h2: split the amount of rx data from the amount to ack - MINOR: mux-h2: create and initialize an rx offset per stream - MEDIUM: mux-h2: start to update stream when sending WU - MEDIUM: mux-h2: start to introduce the window size in the offset calculation - MINOR: mux-h2: count within a connection, how many streams are receiving data - MINOR: mux-h2: allocate the array of shared rx bufs in the h2c - MINOR: mux-h2: add rxbuf head/tail/count management for h2s - MINOR: mux-h2: move H2_CF_WAIT_IN_LIST flag away from the demux flags - MINOR: mux-h2: simplify the exit code in h2_rcv_buf() - MINOR: mux-h2: simplify the wake up code in h2_rcv_buf() - MINOR: mux-h2: clear up H2_CF_DEM_DFULL and H2_CF_DEM_SHORT_READ ambiguity - MAJOR: mux-h2: make streams use the connection's buffers - MAJOR: mux-h2: permit a stream to allocate as many buffers as desired - MAJOR: mux-h2: make the rxbuf allocation algorithm a bit smarter - MINOR: mux-h2: add tune.h2.be.rxbuf and tune.h2.fe.rxbuf global settings - MEDIUM: mux-h2: change the default initial window to 16kB - DOC: design-thoughts: add diagrams illustrating an rx win groth - MEDIUM: mux-h2: rework h2_restart_reading() to differentiate recv and demux - OPTIM: mux-h2: make h2_send() report more accurate wake up conditions - OPTIM: mux-h2: try to continue reading after demuxing when useful - OPTIM: mux-h2: use tasklet_wakeup_after() in h2s_notify_recv() - MINOR: mux-h2/traces: add missing flags and proxy ID in traces - MINOR: mux-h2/traces: add buffer-related info to h2s and h2c - CI: cirrus-ci: bump FreeBSD image to 14-1 - REGTESTS: fix a reload race in abns_socket.vtc - MINOR: activity/memprofile: always return "other" bin on NULL return address - MINOR: quic: notify connection layer on handshake completion - BUG/MINOR: stream: unblock stream on wait-for-handshake completion - BUG/MEDIUM: quic: support wait-for-handshake - BUG/MEDIUM: server: server stuck in maintenance after FQDN change - BUG/MEDIUM: queue: make sure never to queue when there's no more served conns - DEBUG: mux-h2/flags: add H2_CF_DEM_RXBUF & H2_SF_EXPECT_RXDATA for the decoder - REGTESTS: cli: add delay 0.1 before connect to cli - MINOR: startup: add O_CLOEXEC flag to open /dev/null - MEDIUM: startup: move daemonization fork in init - MINOR: startup: refactor "daemonization" fork - MEDIUM: startup: move PID handling in init() - MAJOR: mworker: move master-worker fork in init() - BUG/MINOR: mworker: fix memory leak due to master-worker fork - REORG: mworker: set nbthread=1 for master after fork - MINOR: init: check MODE_MWORKER before creating master CLI - REORG: mworker: move mworker_create_master_cli in master 'case' - MEDIUM: startup: call chroot() if needed in one place - MEDIUM: startup: do set_identity() if needed in one place - MINOR: startup: only worker gets capabilities from bin - CLEANUP: haproxy: rm no longer used mworker_reexec_waitmode - MINOR: startup: rename exit_on_waitmode_failure to exit_on_failure - MINOR: defaults: update MASTER_MAXCONN description - MEDIUM: startup: remove MODE_MWORKER_WAIT - MINOR: global: add MODE_DISCOVERY flag - MEDIUM: cfgparse: add KWF_DISCOVERY keyword flag - MEDIUM: cfgparse: call some parsers only in MODE_DISCOVERY - MEDIUM: cfgparse-global: parse only KWF_DISCOVERY keywords in MODE_DISCOVERY - MEDIUM: cfgparse: parse only "global" section in MODE_DISCOVERY - MEDIUM: startup: introduce load_cfg and read_cfg - MINOR: cfgparse: fix thread keywords sensitive to global section position - MINOR: mworker/cli: rename mworker_cli_proxy_new_listener - MINOR: mworker/cli: rename and clean mworker_cli_sockpair_new - MINOR: mworker/cli: create master CLI sockpair before fork - MINOR: mworker/cli: create MASTER proxy before mcli listeners - MINOR: mworker: add and set state PROC_O_INIT for new worker - MEDIUM: mworker/cli: close child and parent fds, setup listeners - MINOR: mworker: mworker_catch_sigchld: use fd_delete instead of close - MINOR: startup: rename and adapt reexec_on_failure - MINOR: mworker: add support for case when new worker dies - MINOR: mworker: simplify the code that sets PROC_O_LEAVING - MINOR: mworker/cli: add _send_status to support state transition - MEDIUM: startup: split sending oldpids_sig logic for standalone and mworker modes - MINOR: startup: split init() into separate initialization routines - MINOR: startup: split main: add step_init_3 - MINOR: startup: simplify check for calling sock_get_old_sockets - MINOR: startup: encapsulate sock_get_old_sockets in a function - MINOR: startup: add bind_listeners - MINOR: startup: split main: add step_init_4 - MINOR: startup: encapsulate master's code in run_master - MINOR: startup: add read_cfg_in_discovery_mode - MINOR: mworker: adapt exit_on_failure for master recovery mode - MEDIUM: mworker: add support of master recovery mode - MINOR: startup: add set_verbosity - MEDIUM: mworker: block reloads - MINOR: mworker: slow load status delivery if worker is starting - MINOR: mworker: readapt program support in mworker_catch_sigchld - MINOR: mworker: deserialize process list before read_cfg_in_discovery_mode - MINOR: mworker: parse program only in MODE_DISCOVERY - MINOR: cfgparse: add support for program section - MINOR: startup: reintroduce program support - MINOR: mworker-prog: stop old programs in mworker_ext_launch_all - MINOR: mworker: reintroduce systemd support - MINOR: mworker: report explicitly when worker exits due to max reloads - MINOR: cfgparse-global: parse env keywords in MODE_DISCOVERY - MINOR: startup: reintroduce *env keywords support - MINOR: startup: close devnullfd, when daemon mode is applied	2024-10-16 22:57:52 +02:00
Valentine Krasnobaeva	c42ad79134	MINOR: startup: close devnullfd, when daemon mode is applied In daemon mode, the daemonization fork now happens in the early init stage, before parsing and applying the configuration, so we can't close stdin/stderr/stdout immediately after forking. We keep them open until most of the configuration, including chroot, is applied, in order to show alerts if there are any problems. To achieve this, /dev/null is opened just before calling chroot(), and after the chroot block it's used to close all standard outputs and stdin. At this point we no longer need the fd of /dev/null, so we can close it as well.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	dc53c37234	MINOR: startup: reintroduce *env keywords support setenv/resetenv/presetenv/unsetenv keywords in the configuration modify the process environment. In case of master-worker and programs we need to restore the initial process environment before reload, as the configuration could change in between and newly forked workers and programs should be launched in the environment corresponding to this new configuration. To achieve this we back up the initial process environment before the first configuration read, when 'global' and 'program' sections are read. And then we clean up the master process environment and restore the initial one from the backup in mworker_reexec().	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	d5ad92c7aa	MINOR: cfgparse-global: parse env keywords in MODE_DISCOVERY setenv/resetenv/presetenv/unsetenv keywords should be parsed by both the master process and the worker, as some other master parameters could be enabled in conditional blocks (.if...endif). To achieve this let's tag 'env' keywords with the KWF_DISCOVERY flag.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	d11dc11e5a	MINOR: mworker: report explicitly when worker exits due to max reloads It's convenient for testing and for usage to produce different warning messages when the former worker exits due to max reloads exceeded, and when it was terminated by the master.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	4c8303a59e	MINOR: mworker: reintroduce systemd support Let's reintroduce systemd support in the refactored master-worker mode. As for now, the master-worker fork happens during early initialization steps and then the master process receives the "READY" status message from the newly forked worker that has successfully loaded. Let's propagate this "READY" status message at this moment to systemd from the master process context (_send_status()). We use the master process to send messages to systemd, because it is the only process monitored by systemd. In master recovery mode, we also need to send the "READY" message to systemd, but with the status "Reload failed". "READY" will signal to systemd that the master process is still alive, because it doesn't exit in recovery mode and it keeps the existing worker. Status "Reload failed" will signal to the user that something wrong has happened with the configuration. The same message logic was originally preserved for the case when the worker fails to read its configuration, see on_new_child_failure() for more details.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	9e23cfa5c2	MINOR: mworker-prog: stop old programs in mworker_ext_launch_all This patch is a part of series to reintroduce the program support in the new master-worker architecture. Now, after refactoring in master-worker mode it's the master process, who stops workers forked before the reload. Current worker no longer sends USR1 or TERM signals to the previous one after ports binding. This behaviour is kept only for the standalone mode. So, in case of programs, it's up to master process as well to stop programs, which were launched before reload. Let's do this in mworker_ext_launch_all(), just before starting the new programs.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	0fc2ff4b7d	MINOR: startup: reintroduce program support This patch is a part of series to reintroduce the program support in the new master-worker architecture. Let's add here mworker_ext_launch_all() call before master-worker fork to start external programs. We keep the order and the place of these two forks (program and master-worker) the same as before the refactoring, in order to avoid regressions.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	a2fac5a3a1	MINOR: cfgparse: add support for program section This patch is a part of series to reintroduce the program support in the new master-worker architecture. Programs are launched by master, thus only the master process needs its configuration. Therefore, program section parser should be called only in discovery mode, when master parses its configuration. Program section has a post section parser. It should be called only in discovery mode as well.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	45a284895a	MINOR: mworker: parse program only in MODE_DISCOVERY This patch is a part of series to reintroduce the program support in the new master-worker architecture. Master process launches external programs, so it needs to read program section. Thus, it should be parsed in MODE_DISCOVERY. Worker does not need program settings, so let's check the runtime mode in cfg_parse_program. Worker should always skip this section.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	ee7fc98320	MINOR: mworker: deserialize process list before read_cfg_in_discovery_mode This patch is a part of series to reintroduce the program support in the new master-worker architecture. For the moment we keep the order of program and worker forks the same as before the refactoring, as we need to be sure that this won't introduce regressions. So, programs are forked before the new worker process. Before the program's fork we already need the deserialized processes list to find the programs launched before reload and to stop them. The processes list is saved before the reload in the HAPROXY_PROCESSES variable. It should be deserialized before the first configuration read in discovery mode, because the resetenv keyword could be present in the global section. So, let's move mworker_env_to_proc_list() from mworker_create_master_cli() to main(). We need to call it only after reload in master-worker mode, thus HAPROXY_MWORKER_REEXEC and HAPROXY_PROCESSES should still be present in the re-executing process environment before the first configuration read.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	7a267c4a27	MINOR: mworker: readapt program support in mworker_catch_sigchld This patch is a part of series to reintroduce the program support in the new master-worker architecture. We only launch and stop external programs and there is no communication between the master process and the started program binary. So, ipc_fd[0] and ipc_fd[1] are not used and are kept as -1 for program processes. Due to this, there is no need for the exiting program process to call fd_delete on these fds. Otherwise, this will trigger a BUG_ON.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	d766677d92	MINOR: mworker: slow load status delivery if worker is starting With refactored master-worker architecture master and worker processes parse their own parts of the configuration. Worker could have a huge configuration, so it will take some time to load. As HAPROXY_LOAD_SUCCESS is now set to 1 only after receiving the status READY from the new worker, cli_io_handler_show_loadstatus() may exit very fast by showing load status 0, and in such a case the mcli socket will be closed. This already breaks some regression tests and can confuse some APIs. So, let's slow down the load status delivery. If there is still some process in the process list which is loading (PROC_O_INIT), the appctx task will sleep for 50ms and then return 0. cli_io_handler_show_loadstatus() is called in loop, so with such pacing, there is a high chance that the next time, when we enter in its scope all processes will have the state READY. Like this master CLI connection socket won't be closed until the loading of the new worker is really finished, thus the reload status and logs (Success=1/0) will be shown in a synchronous way.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	5f16453082	MEDIUM: mworker: block reloads When reloads arrive very often (sent by some APIs), newly forked workers hardly have time to load completely and to send their READY status to master, which then allows it to stop the previous worker (launched before reload). As a result, the number of workers increases very quickly, previous workers are still alive and the memory consumption is very high. To avoid such situations let's return in cli_parse_reload() reload status 0 with the text "Another reload is still in progress", if there is still a process with PROC_O_INIT flag in the processes list.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	5be14b338a	MINOR: startup: add set_verbosity Let's encapsulate the logic to set verbosity modes (MODE_DEBUG and MODE_VERBOSE) in a separate function set_verbosity(). This makes the code of main() more readable and this allows to call set_verbosity() for master process in recovery mode. So, in this mode, verbosity settings before the master re-execution will be re-applied to master. set_verbosity() will be extended in future commits to reduce the verbosiness of master in order not to dump pollers list and filters, if it was started with -V or -d.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	5909d508bc	MEDIUM: mworker: add support of master recovery mode In this commit we add run_master_in_recovery_mode(), which groups all necessary initialization steps, which master should perform to be able to enter in its polling loop (run_master()), when it fails while parsing its new config. As exit_on_failure() is now adapted for master recovery mode, let's register it as an atexit handler when master enters this mode. And let's remove atexit_flag variable for master, because we no longer use it. We also slightly refactor here read_cfg_in_discovery_mode() in order to call run_master_in_recovery_mode() for the case, described above. Warning messages are mandatory before calling the run_master_in_recovery_mode() as this allows to stop haproxy with error, if it was launched in zero-warning mode. So, in recovery mode master does not launch any worker. It just performs its necessary initialization routines and enters in its polling loop to continue to monitor the existing worker process.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	fe4708feaa	MINOR: mworker: adapt exit_on_failure for master recovery mode Master recovery mode replaces the former wait-mode with a difference, that master in this case doesn't try to fork the new worker process. But it still needs to enter to its polling loop in order to monitor the previous worker. Master performs some initialization steps for this and it recreates its master CLI. During its initialization steps, master could potentially fail again. As we use for the moment for master init steps some common routines (step_init_2() and step_init_3()), there is no way there to signal to user that failure has happened for the master and in addition, in its recovery mode. So, in such case exit_on_failure() can be still useful in order to print an appropriate alert, as we can register this function as atexit handler for the master.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	6615e46456	MINOR: startup: add read_cfg_in_discovery_mode Let's encapsulate here the code to load and to read the configuration at the first time in MODE_DISCOVERY. This makes the code of main() more readable and this adds the structure for adding necessary master initializations routines to support master recovery mode.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	1cee184145	MINOR: startup: encapsulate master's code in run_master Let's encapsulate master's code (steps which it does before entering in its polling loop and deinitialization routines after) in a separate run_master() function. This makes the code of main() more readable. In future we plan to put in run_master() more master process related code, in order to clean completely step_init_2(), step_init_3() and step_init_4().	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	e5cd81cf8f	MINOR: startup: split main: add step_init_4 Let's encapsulate here another part of main, after binding listeners sockets and before calling the master's code in master-worker mode. This block contains the code, which applies verbosity settings, checks limits and updates the ready date. It will take some time to figure out, which of these parts are really needed for the master, or which ones it could skip. So let's put all these for the moment in step_init_4() and let's call it for all modes.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	26a6fdf542	MINOR: startup: add bind_listeners Let's encapsulate here the code, which tries to bind listeners for the new process in a separate function. This will make the main() code more readable. Master process, even if it has failed while reading its new configuration, has to bind its master CLI sockets. This way we can call this function in the master recovery mode. Master CLI socket address and port for external connections (user, monitoring tools) are provided for now only via the command line. So, master, even after this failure can and must reestablish master CLI connections again.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	babbcb047e	MINOR: startup: encapsulate sock_get_old_sockets in a function Let's encapsulate here the code, that calls sock_get_old_sockets() to obtain listeners sockets from the previous process into a separate function. This will make the code of main() more readable and we can move this new function (if we might need so) in future.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	f4e73b4302	MINOR: startup: simplify check for calling sock_get_old_sockets MODE_CHECK and MODE_CHECK_CONDITION are applied now very early in step_init_1() and step_init_2() in order to check the configuration or to check some condition provided via the command line. When these checks have terminated, the main process exits. So, there is no longer any need to verify these modes at the moment when the current process has already done its basic initialization routines and is asking for listeners sockets from the previously started one.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	c4795e4019	MINOR: startup: split main: add step_init_3 The first part of main(), just after calling the former init() and before trying to bind listeners, also needs to be encapsulated into a separate step_init_3() as it is. It contains important blocks to register signals, to apply memory and nofile limits, etc. The order of these blocks should be also preserved (especially the signals part). For the moment step_init_3() must be also executed for all runtime modes.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	49772c55e3	MINOR: startup: split init() into separate initialization routines This is the first commit in a series to add a support of the 5-th reload use case, when the master process fails to read its new configuration. In this case it just needs to perform its initialization steps and keep the existing worker. To add the support for this last use case we need to split init() and main() into shorter steps in order to encapsulate necessary initialization routines into separate functions. Let's at first, make here progname as a global variable for haproxy.c, as it will be used in error messages in the initialization functions. Then let's split the init() into separate routines, which set and apply modes, write process PID in a pidfile, etc. The big part of the former init(), which called functions to allocate pools, to initialize proxies, to calculate maxconn and to perform some post checks was just encapsulated as is, into step_init_2(). It will take some time to figure out exactly which parts of this initialization block are really necessary for the master process and which ones it could skip. So, for the moment step_init_2() is called for all runtime modes.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	81dbc2c2e2	MEDIUM: startup: split sending oldpids_sig logic for standalone and mworker modes Before refactoring the master-worker mode, in all runtime modes, when the new process successfully parsed its configuration and bound to sockets, it sent either SIGUSR1 or SIGTERM to the previous one in order to terminate it. Let's keep this logic as is for the standalone mode. In addition, in standalone mode we need to send the signal to old process before calling set_identity(), because in set_identity() effective user or group may change. So, the order is important here. In case of master-worker mode after refactoring, master terminates the previous worker by itself upon receiving "READY" status from the new one in _send_status(). Master also sets at this moment HAPROXY_LOAD_SUCCESS env variable and checks, if there are some other workers to terminate with max_reloads exceeded. So, now in master-worker mode we terminate old workers only, when the new one has successfully done all initialization steps and has sent "READY" status to master.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	b73a278df4	MINOR: mworker/cli: add _send_status to support state transition In the new master-worker architecture, when a worker process is forked and successfully initialized it needs somehow to communicate its "READY" state to the master, in order to terminate the previous worker and workers, that might have exceeded max_reloads counter. So, let's implement for this a new master CLI _send_status command. A new worker can send its status string "READY" to the master, when it's about to enter the run poll loop, thus it can start to receive data. In _send_status() in the master context we update the status of the new worker: PROC_O_INIT flag is withdrawn. When TERM signal is sent to a worker, worker terminates and this triggers the mworker_catch_sigchld() handler in master. This handler deletes the exiting process entry from the processes list. In _send_status() we loop over the processes list twice. At the first time, in order to stop workers that exceeded the max_reloads counter. At the second time, in order to stop the worker forked before the last reload. In the corner case, when max_reloads=1, we avoid sending SIGTERM twice to the same worker by setting sigterm_sent flag during the first loop.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	154848a314	MINOR: mworker: simplify the code that sets PROC_O_LEAVING When master performs a reexec it should set the flag PROC_O_LEAVING for an already existing worker. It means that the existing worker is marked as the previous one and will be terminated after the reload. In the previous implementation master process needed to do the reexec twice (the first time for parsing its configuration and the second time to free unused resources). So the logic of setting PROC_O_LEAVING was based on comparing the number of reloads, performed by each process from the processes list, except the master. Now, as mentioned before, reexec is performed only once. So, in this case we need to set PROC_O_LEAVING flag, when we deserialize the list. It is done for all processes, which have a strictly positive number of reloads.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	c8aac63893	MINOR: mworker: add support for case when new worker dies The case, when the new worker fails while it parses its configuration or while it tries to apply it, could be considered as the new one, because the master process no longer needs to reexec again. The master simply keeps the previous worker (forked before the reload) and it lets the new one exit with failure. When the new worker exits, in the master process context (mworker_catch_sigchld) we need to stop a MASTER proxy listener and we need to drop the server, attached to new worker's CLI sockpair (it's inherited in master). Then we explicitly delete master's end of this sockpair (child->ipc_fd[0]) from the fdtab and we free the memory allocated for the worker process. on_new_child_failure() is called before the clean up to signal systemd that reload/load has failed. If the new worker fails during the first start, so there is no previous worker, master process should exit immediately in order to keep the same behaviour, as it was before this architecture change.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	2bb07b913d	MINOR: startup: rename and adapt reexec_on_failure Previously reexec_on_failure() was called in cases when the process has failed after reload, while it was parsing its configuration or it was trying to apply it. reexec_on_failure() has called mworker_reexec() and the master process has been reexecuted. With the new architecture in such cases there is no longer need to reexecute the master process after its reload again. It simply keeps the previous worker, forked before the reload, and it lets the new one to exit with an error. But we still need the code, which increments the number of failed reloads and which notifies systemd with new "Reload failed!" status. So, let's reuse and adapt for this reexec_on_failure() and let's rename it to on_new_child_failure().	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	9b27f82da3	MINOR: mworker: mworker_catch_sigchld: use fd_delete instead of close If the worker exits due to failure or due to receiving TERM signal, in the master context, we can't now simply close the master's fd (ipc_fd[0]) of the inherited master CLI sockpair. When the worker is created, in the master process context MASTER proxy listener is bound to ipc_fd[0]. When this worker fails or exits, master process is always in its polling loop. So, closing some fd in its context immediately triggers the BUG_ON(fd->owner), as the poller tries to reinsert the "freed" fd into fdtab and tries to reuse it. We must call fd_delete in this case. This deinitializes the fd auxiliary data and closes it properly.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	cf150fd73d	MEDIUM: mworker/cli: close child and parent fds, setup listeners Basically, this is the continuation of the previous commits. So, here after the fork, worker process closes the "master" end of the copied CLI sockpair and binds its end, ipc_fd[1], to the GLOBAL proxy listener. mworker_cli_global_proxy_new_listener() guarantees that GLOBAL proxy will be created, if it wasn't the case before. Master process, at first, allocates the MASTER proxy, creates master CLI listener (-S command line option) and reload sockpair and then closes the "worker" end of the copied CLI sockpair and binds its end, ipc_fd[0], to the created MASTER proxy. Usage of the new PROC_O_INIT state helps to reduce test conditions to find the newly forked worker.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	646299fc95	MINOR: mworker: add and set state PROC_O_INIT for new worker Here, to distinguish between the new worker and the previous one let's add a new process state PROC_O_INIT and let's set it, when the memory is allocated for the new worker in the processes list.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	26ad5465cc	MINOR: mworker/cli: create MASTER proxy before mcli listeners For the master process we always need to create a MASTER proxy, even if master cli settings were not provided via command line, because now we bind a listener in the master process context at ipc_fd[0]. So, MASTER proxy should be already allocated at this moment.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	6ec38c9a74	MINOR: mworker/cli: create master CLI sockpair before fork The main idea here is to create a master CLI inherited sockpair just before the master-worker fork. And only then after the fork let each process bind a needed listener to its end of this sockpair. Like this master and worker processes can close unused "ends" of their sockpair copy (ipc_fd[0] for worker and ipc_fd[1] for master). When this sockpair creation happens inside the mworker_cli_global_proxy_new_listener(), it is not possible for the master to close ipc_fd[1] bound to the GLOBAL proxy listener, as this triggers a BUG_ON(fd->owner) in fd_insert() in master context, because master process has already entered its polling loop and the poller in its turn tries to reuse the closed fd.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	cc1a631beb	MINOR: mworker/cli: rename and clean mworker_cli_sockpair_new Let's rename mworker_cli_sockpair_new() to mworker_cli_global_proxy_new_listener() to outline that this function creates the GLOBAL proxy, allocates the listener with "master-socket" bind conf and attaches this listener to this GLOBAL proxy. Listener is bound to ipc_fd[1] of the sockpair inherited in master and in worker (master CLI sockpair).	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	0fbf1973ad	MINOR: mworker/cli: rename mworker_cli_proxy_new_listener This is the first commit in a series to add the support of 4 primary reload use-cases for the new master-worker architecture: 1. Newly forked worker process dies before any reload, due to some errors in the configuration. Newly forked worker process crashes before any reload after sending its "READY" state to master. 2. Newly forked worker process dies due to some errors in the new configuration. This happens after reload, when this new configuration was supplied, so the previous worker process is still here. 3. Newly forked worker process crashes after sending its "READY" state to master due to some bugs. This happens after reload, so the previous worker process is still here. 4. Newly forked worker process has sent its "READY" state to master and starts to receive traffic. This happens after reload, the old worker hasn't terminated yet, as it is waiting on some idle connection and it crashes. Let's rename in this commit mworker_cli_proxy_new_listener() to mworker_cli_master_proxy_new_listener() to outline, that this function creates "master-socket" bind conf and allocates a listener. This listener is attached to the MASTER proxy and it's bound to the ipc_fd[0] of the sockpair, inherited in master and in worker processes (master CLI sockpair).	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	223caab96f	MINOR: cfgparse: fix thread keywords sensitive to global section position thread keywords parsers are sensitive to global section position. If they are present there, the global section must be the first section in the configuration. *thread parsers logic is based on non_global_section_parsed counter. So, we need to reset it explicitly before the second configuration read done by worker or in a standalone mode.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	0ed262d7bf	MEDIUM: startup: introduce load_cfg and read_cfg This commit is a part of the series to add a support of discovery mode in the configuration parser and in initialization sequence. In order to support discovery mode, we need to read the configuration twice. So, we need to split the stage, when we load all configuration files, from the stage when we parse it. To do this, let's encapsulate in read_cfg() the part, where we load the configuration files in a separate function, load_cfg(). Like this we can call only the parsing part as many times as we need. Before reading configuration at the first time we set MODE_DISCOVERY. After the reading this mode is immediately unset, as the real runtime mode has been already set by discovery keywords parsers. Second read is performed when all primary runtime modes (daemon, master-worker) are applied, because we should not read the configuration twice in the master process.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	e2b4768224	MEDIUM: cfgparse: parse only "global" section in MODE_DISCOVERY This commit is a part of the series to add a support of discovery mode in the configuration parser and in initialization sequence. So, in discovery mode, when we read the configuration the first time, we parse for the moment only the "global" section. Unknown section names will be ignored.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	699be6a55d	MEDIUM: cfgparse-global: parse only KWF_DISCOVERY keywords in MODE_DISCOVERY This commit is a part of the series to add a support of discovery mode in the configuration parser and in initialization sequence. Global section parser parses the majority of keywords in its function, so those keywords don't have any dedicated parsers yet. Only after this parsing block cfg_parse_global() starts to call dedicated parsers for any other discovered keywords, which were not found in the block. As all keywords, which should be parsed in MODE_DISCOVERY have their own parser functions, we can skip this block with goto discovery_kw and start directly from the part, where we call parsers from the keywords list. KWF_DISCOVERY flag helps to call in MODE_DISCOVERY only the parsers, which are needed in this mode. All unknown keywords and garbage will be ignored at this stage.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	48371e6a30	MEDIUM: cfgparse: call some parsers only in MODE_DISCOVERY This commit is a part of the series to add a support of discovery mode in the configuration parser and in initialization sequence. Some keyword parsers tagged with KWF_DISCOVERY (for example those, which parse runtime modes, poller types, pidfile), should not be called twice when the configuration will be read the second time after the discovery mode. It's redundant and could trigger parser's errors in standalone mode. In master-worker mode the worker process inherits parsed settings from the master.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	f9123e2183	MEDIUM: cfgparse: add KWF_DISCOVERY keyword flag This commit is a part of the series to add a support of discovery mode in the configuration parser and in initialization sequence. So, let's add here KWF_DISCOVERY flag to distinguish the keywords, which should be parsed in "discovery" mode and which are needed for master process, from all others. Keywords, that should be parsed in "discovery" mode have their dedicated parser functions. Let's tag these functions with KWF_DISCOVERY flag in keywords list. Like this, only these keyword parsers might be called during the first configuration read in discovery mode.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	6769745fe5	MINOR: global: add MODE_DISCOVERY flag This is the first commit from a series to add a support of discovery mode in the configuration parser and in initialization sequence. Discovery mode is the mode, when we read the configuration at the first time and we parse and set runtime modes: daemon, zero-warning, master-worker. In this mode we also parse some parameters needed for the master process to start, in case if we are in the master-worker mode. Like this the master process doesn't allocate any additional resources, which it doesn't use and it quickly finishes its initialization and enters to its polling loop. The worker process after its fork reads the rest of the configuration. So, let's add in this commit MODE_DISCOVERY flag to check it in configuration parser functions.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	fe75c1e12d	MEDIUM: startup: remove MODE_MWORKER_WAIT MODE_MWORKER_WAIT becomes redundant with MODE_MWORKER, due to moving master-worker fork in init(). This change allows master to no longer perform reexec just after forking in order to free additional memory. As after the fork in the master process we set 'master' variable, we can replace now MODE_MWORKER_WAIT in some 'if' statements by simple check of this 'master' variable. Let's also continue to get rid of HAPROXY_MWORKER_WAIT_ONLY environment variable, as it's no longer needed as well. In cfg_program_postparser(), which is used to check if cmdline is defined to launch a program, we completely remove the check of mode for now, because the master process does not parse the configuration for the moment. 'program' section parsing will be reintroduced in master later in the next commits.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	fb7bef781d	MINOR: defaults: update MASTER_MAXCONN description This is a one of the commits to prepare the removal of MODE_MWORKER_WAIT support, as it became redundant with MODE_MWORKER due to moving master-worker fork in init().	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	3f5f57845b	MINOR: startup: rename exit_on_waitmode_failure to exit_on_failure As we no longer support MODE_MWORKER_WAIT for master (it became redundant with MODE_MWORKER after moving master-worker fork in init()), let's rename exit_on_waitmode_failure() callback in just exit_on_failure().	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	7795d49ae6	CLEANUP: haproxy: rm no longer used mworker_reexec_waitmode This is the first commit to prepare the removal of MODE_MWORKER_WAIT support. It has become redundant with MODE_MWORKER, due to moving master-worker fork in init(). Master process no longer performs reexec to free additional memory after forking and no longer changes its mode to MODE_MWORKER_WAIT, in which it entered its wait polling loop and handled signals. Now, master enters this loop almost immediately after forking a worker and always stays in MODE_MWORKER. So, we can remove mworker_reexec_waitmode() wrapper, which was used to set HAPROXY_MWORKER_WAIT_ONLY variable and to call mworker_reexec(). But let's keep for the moment the logic of reexec_on_failure() atexit callback for master, in case we need to support this case again in the future.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	cb0f1f42e1	MINOR: startup: only worker gets capabilities from bin Due to moving the master-worker fork in init(), we need to protect prepare_caps_from_permitted_set() call, which is executed after init(). This call makes sense only for worker, daemon and for foreground mono process modes. prepare_caps_from_permitted_set() allows to read Linux capabilities from haproxy binary and to move some of them in process Effective set, if 'setcap' keyword lists needed capabilities in the global section.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	fe04c2ad37	MEDIUM: startup: do set_identity() if needed in one place There are two set_identity() calls, both under quite same: 'if ((global.mode & (MODE_MWORKER|MODE_DAEMON...)...' The first call serves to change uid/gid and set some needed Linux capabilities only for process in the foreground mode. The second comes after master-worker fork and allows to do the same in daemon and in worker modes. Due to moving the master-worker fork in init() in some previous commit, the second set_identity() now is no longer under the 'if'. So, it is executed for all modes, except MODE_MWORKER. Now in MODE_MWORKER process enters in its wait polling loop just after forking a worker and it terminates almost immediately, if it exits this loop. Worker, daemon and process in a foreground mode will perform set_identity() as before, but now it will be called in a one place at main(). global.last_checks should be verified just after set_identity() call. As it's stated in comments some configuration options may require full privileges or some Linux capabilities need to be granted to process. set_identity() via prepare_caps_for_setuid() may put configured capabilities in process Effective set and, hence, remove respective flag from global.last_checks.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	02af1fe067	MEDIUM: startup: call chroot() if needed in one place There are two 'chroot' code blocks, both under quite same: 'if ((global.mode & (MODE_MWORKER|MODE_DAEMON...)...' The first block serves to perform chroot only for process in the foreground mode. The second comes after master-worker fork and allows to do chroot in daemon and in worker modes. Due to moving the master-worker fork in init() in some previous commit, the second 'chroot' code block now is no longer under the 'if'. So, it is executed for all modes, except MODE_MWORKER. Now in MODE_MWORKER process enters in its wait polling loop just after forking a worker and it terminates almost immediately, if it exits this loop. Worker, daemon and process in a foreground mode will perform the chroot as before, but now it will be done in a one place at main().	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	7a2ee10d71	REORG: mworker: move mworker_create_master_cli in master 'case' Let's move mworker_create_master_cli() call in 'master' case just above and get rid of redundant global.mode tests.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	e4c10a704d	MINOR: init: check MODE_MWORKER before creating master CLI mworker_create_master_cli() creates MASTER proxy and allocates listeners, which are attached to this proxy. It also creates a reload sockpair. So, it's more appropriate to do the check, that we are in a MODE_MWORKER, if master CLI settings were provided via command line, just after the config parsing. And only then, if runtime mode and command line settings are coherent, try to perform master-worker fork and try to create master CLI.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	26e53e2e8c	REORG: mworker: set nbthread=1 for master after fork After moving master-worker fork into init() and reintroducing it into a switch-case (see the previous commit), it is more appropriate to set nbthread=1 and nbtgroups=1 immediately in the 'case' for the parent process.	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	ae84f06025	BUG/MINOR: mworker: fix memory leak due to master-worker fork Before this fix, startup logs ring was duplicated before the fork(), so master and worker had both the original startup_logs ring and the duplicated one. In the worker context we freed the original ring and used a duplicated one. In the master context we did nothing, but we still create a duplicated copy again and again during the reload. So, let's duplicate startup logs ring only in the worker context. Master continues to use the original ring initialized in init() before its fork().	2024-10-16 22:02:39 +02:00
Valentine Krasnobaeva	8dd4efe42f	MAJOR: mworker: move master-worker fork in init() This refactoring allows to simplify 'master-worker' logic. The master process with this change will fork a worker very early at the initialization stage, which allows to perform a configuration parsing only for the worker. In reality only the worker process needs to parse and to apply the whole configuration. Master process just polls master CLI sockets, watches worker status, catches its termination state and handles the signals. With this refactoring there is no longer need for master to perform re-execution after reading the whole configuration file to free additional memory. And there is no longer need for worker to register atexit callbacks, in order to free the memory, when it fails to apply the new configuration. In contrast, we now need to set proc_self pointer to the new worker entry in processes list just after the fork in the worker process context. proc_self is dereferenced in mworker_sockpair_register_per_thread(), which is called when worker enters in its polling loop. Following patches will try to gather more 'worker' and 'master' specific' code in the dedicated cases of this new fork() switch, or in a separate functions.	2024-10-16 22:00:58 +02:00
Valentine Krasnobaeva	4cbfcc60f4	MEDIUM: startup: move PID handling in init() Let's move PID handling in init() from the main() code. It is more appropriate to open and to write the PID of the process just after daemonization fork. In case of daemon monoprocess mode, we will simply write a PID of the process, which is already in the background. In case of 'master-worker' mode, we keep the previous behaviour and we write only a PID of the master process. This allows to remove redundant tests of the process execution mode, tests of the pidfd value and consequent writes to this pidfd. This patch prepares the refactoring of master-worker fork by moving it in init() function as well.	2024-10-16 22:00:58 +02:00
Valentine Krasnobaeva	95c19be2ab	MINOR: startup: refactor "daemonization" fork Let's put "daemonization" fork into a switch-case. This is more readable and we don't need to allocate memory for the fork() return value here.	2024-10-16 22:00:58 +02:00
Valentine Krasnobaeva	90b8181c0a	MEDIUM: startup: move daemonization fork in init Let's move daemonization fork in init(). We need to perform this fork always before forking a worker process, in order to be able to launch master and then its worker in daemon, i.e. background mode, if haproxy was started with '-D' option. This refactoring is a preparation step, needed for replacing then master-worker fork in init() as well. This allows the master process not to read the whole configuration file and not to do re-execution in order to free additional memory, when worker was forked. In the new refactored design only the worker process will read and apply a new configuration, while the master will arrive very fast in its polling loop to wait worker's termination and to handle signals. See more details in the following commits.	2024-10-16 22:00:58 +02:00
Valentine Krasnobaeva	df12791da3	MINOR: startup: add O_CLOEXEC flag to open /dev/null As the master process performs the execvp() syscall to handle USR2 and HUP signals in mworker_reexec(), let's add the O_CLOEXEC flag when we open '/dev/null' in order to avoid an fd leak. This is a preparation step to refactor the master-worker logic. See more details in the next commits.	2024-10-16 22:00:58 +02:00
Valentine Krasnobaeva	5bbcdc003a	REGTESTS: cli: add delay 0.1 before connect to cli When vtest starts the haproxy process, it loops until the haproxy pidfile is created. Once the pidfile is created, vtest considers that the haproxy process is ready and starts to perform test commands, in particular, it connects to the CLI. Basing the process readiness check on the PID file is not a very reliable approach. After the master-worker architecture refactoring, the pidfile is created in the early init stage, when the master and worker have not yet finished their initialization routines. So, all mcli tests and some tests where we send commands to the CLI started to fail regularly. At the moment vtest offers no other way to check that the process is really ready. So let's add a 0.1s delay before connecting to the CLI in all mcli tests and in the acl_cli_spaces test.	2024-10-16 22:00:58 +02:00
Willy Tarreau	2c2dac77aa	DEBUG: mux-h2/flags: add H2_CF_DEM_RXBUF & H2_SF_EXPECT_RXDATA for the decoder Both flags were recently added but missing from the decoders flags, so they appeared in hex in dev/flags/flags output. No backport needed.	2024-10-16 18:32:52 +02:00
Willy Tarreau	ca275d99ce	BUG/MEDIUM: queue: make sure never to queue when there's no more served conns Since commit 53f52e67a0 ("BUG/MEDIUM: queue: always dequeue the backend when redistributing the last server"), we've got two reports again still showing the theoretically impossible condition in pendconn_add(), including a single threaded one. Thanks to the traces, the issue could be tracked down to the redispatch part. In fact, in non-determinist LB algorithms (RR, LC, FAS), we don't perform the LB if there are pending connections in the backend, since it indicates that previous attempts already failed, so we directly return SRV_STATUS_FULL. And contrary to a previous belief, it is possible to meet this condition with be->served==0 when redispatching (and likely with maxconn not greater than the number of threads). The problem is that in this case, the entry is queued and then the pendconn_must_try_again() function checks if any connections are currently being served to detect if we missed a race, and tries again, but that situation is not caused by a concurrent thread and will never fix itself, resulting in the loop. All that part around pendconn_must_try_again() is still quite brittle, and a safer approach would involve a sequence counter to detect new arrivals and dequeues during the pendconn_add() call. But it's more sensitive work, probably for a later fix. This fix must be backported wherever the fix above was backported. Thanks to Patrick Hemmer, as well as Damien Claisse and Basha Mougamadou from Criteo for their help on tracking this one!	2024-10-16 18:08:39 +02:00
Aurelien DARRAGON	85298189bf	BUG/MEDIUM: server: server stuck in maintenance after FQDN change Pierre Bonnat reported that SRV-based server-template recently stopped to work properly. After reviewing the changes, it was found that the regression was caused by a4d04c6 ("BUG/MINOR: server: make sure the HMAINT state is part of MAINT") Indeed, HMAINT is not a regular maintenance flag. It was implemented in b418c122 a4d04c6 ("BUG/MINOR: server: make sure the HMAINT state is part of MAINT"). This flag is only set (and never removed) when the server FQDN is changed from its initial config-time value. This can happen with "set server fqdn" command as well as SRV records updates from the DNS. This flag should ideally belong to server flags.. but it was stored under srv_admin enum because cur_admin is properly exported/imported via server state-file while regular server's flags are not. Due to a4d04c6, when a server FQDN changes, the server is considered in maintenance, and since the HMAINT flag is never removed, the server is stuck in maintenance. To fix the issue, we partially revert a4d04c6. But this latter commit is right on one point: HMAINT flag was way too confusing and mixed-up between regular MAINT flags, thus there's nothing to blame about a4d04c6 as it was error-prone anyway.. To prevent such kind of bugs from happening again, let's rename HMAINT to something more explicit (SRV_ADMF_FQDN_CHANGED) and make it stand out under srv_admin enum so we're not tempted to mix it with regular maintenance flags anymore. Since a4d04c6 was set to be backported in all versions, this patch must be backported there as well.	2024-10-16 14:26:57 +02:00
Amaury Denoyelle	0918c41ef6	BUG/MEDIUM: quic: support wait-for-handshake The wait-for-handshake http-request action was completely ineffective with the QUIC protocol. This commit implements its support for QUIC. The QUIC MUX layer is extended to support wait-for-handshake. A new function qcc_handle_wait_for_hs() is executed during qcc_io_process(). It detects if MUX processing occurs before the underlying QUIC handshake completion. If this is the case, it indicates that early data may be received. As such, the connection is flagged with CO_FL_EARLY_SSL_HS, which is necessary to block stream processing on the wait-for-handshake action. After this, qcc subscribes on the quic_conn layer for RECV notification. This is used to detect QUIC handshake completion. Thus, qcc_handle_wait_for_hs() can be reexecuted one last time, to remove CO_FL_EARLY_SSL_HS and notify every stream flagged as SE_FL_WAIT_FOR_HS. This patch must be backported up to 2.6, after a mandatory period of observation. Note that it relies on the backport of the two previous patches : - MINOR: quic: notify connection layer on handshake completion - BUG/MINOR: stream: unblock stream on wait-for-handshake completion	2024-10-16 11:51:35 +02:00
Amaury Denoyelle	73031e81cd	BUG/MINOR: stream: unblock stream on wait-for-handshake completion wait-for-handshake is an http-request action which delays the processing of content received as TLS early data. The action yields as long as the connection handshake is in progress. In the meantime, the stconn is flagged with SE_FL_WAIT_FOR_HS. When the handshake is finished, the MUX layer is responsible for waking up the stconn instances flagged with SE_FL_WAIT_FOR_HS to restart the stream processing. In sc_conn_process(), the SE_FL_WAIT_FOR_HS flag is removed and the stream layer is woken up. However, the stream may remain blocked after the MUX notification. sc_conn_recv() may return 0 because no new data were received, which prevents sc_conn_process() execution. The stream is thus blocked until its timeout. To fix this, check the handshake termination condition in sc_conn_recv(). If true, explicitly return 1 to ensure sc_conn_process() will be executed. Note that this bug is not reproducible due to various conditions related to the early data implementation in haproxy. Indeed, connection layer instantiation is always delayed until SSL handshake completion, which prevents the handling of early data as expected. This fix will be necessary to implement wait-for-handshake support for QUIC. As such, it must be backported with the next commit up to 2.6, after a mandatory period of observation.	2024-10-16 11:44:31 +02:00
Amaury Denoyelle	5a5950e42d	MINOR: quic: notify connection layer on handshake completion Wake up connection layer on QUIC handshake completion via quic_conn_io_cb. Select SUB_RETRY_RECV as this was previously unused by QUIC MUX layer. For the moment, QUIC MUX never subscribes for handshake completion. However, this will be necessary for features such as the delaying of early data forwarding via wait-for-handshake. This patch will be necessary to implement wait-for-handshake support for QUIC. As such, it must be backported with next commits up to 2.6, after a mandatory period of observation.	2024-10-16 11:42:06 +02:00
Willy Tarreau	5091f90479	MINOR: activity/memprofile: always return "other" bin on NULL return address It was found in a large "show profiling memory" output that a few entries have a NULL return address, which causes confusion because this address will be reused by the next new allocation caller, possibly resulting in inconsistencies such as "free() ... pool=trash" which makes no sense. The cause is in fact that the first caller had an entry->info pointing to the trash pool from a p_alloc/p_free with a NULL return address, and the second had a different type and reused that entry. Let's make sure undecodable stacks causing an apparent NULL return address all lead to the "other" bin. While this is not exactly a bug, it would make sense to backport it to the recent branches where the feature is used (probably at least as far as 2.8).	2024-10-15 08:12:34 +02:00
Willy Tarreau	93c9f19af7	REGTESTS: fix a reload race in abns_socket.vtc This test issues a reload over the master CLI, but it is totally possible that the master has not yet finished starting up the master CLI when the command is issued, resulting in a failure. This was much more visible on the new master-worker model, but definitely affects the old one and could be the reason for this test to occasionally fail on the CI.	2024-10-14 19:15:21 +02:00
William Lallemand	0302adf996	CI: cirrus-ci: bump FreeBSD image to 14-1 FreeBSD CI seems to have been broken for a while, try to upgrade the image to the latest 14.1 version.	2024-10-14 14:28:26 +02:00
Willy Tarreau	e4cb0ad632	MINOR: mux-h2/traces: add buffer-related info to h2s and h2c The traces currently don't contain any info about the amount of data present in buffers, making it difficult to figure if an empty buffer is the cause for not demuxing or if a full buffer is the cause for not reading more data. Let's add them, with the head/tail info as well.	2024-10-12 18:07:21 +02:00
Willy Tarreau	a8f907a459	MINOR: mux-h2/traces: add missing flags and proxy ID in traces H2 traces are unusable to detect bugs most of the time because they miss the h2c and h2s flags, as well as the proxy, which makes it very hard to figure if the info comes from the client or the server as soon as two layers are stacked. This commit adds these precious information as well as the h2s's rx and tx windows. This could be backported to a few recent branches, but the rx window calculation will have to be replaced with the static value there.	2024-10-12 17:45:51 +02:00
Willy Tarreau	fcab647613	OPTIM: mux-h2: use tasklet_wakeup_after() in h2s_notify_recv() This reduces the avg wakeup latency of sc_conn_io_cb() from 1900 to 51us. The L2 cache misses went from 1.4 to 1.2 billion for 20k req. But the perf is not better. Also there are situations where we must not perform such wakeup, these may only be done from h2_io_cb, hence the test on the next_tasklet pointer and its reset when leaving the function. In practice all callers to h2s_close() or h2s_destroy() can reach that code, this includes h2_detach, h2_snd_buf, h2_shut etc. Another test with 40 concurrent connections, transferring 40k 1MB objects at different concurrency levels from 1 to 80 also showed a 21% drop in L2 cache misses, and a 2% perf improvement: Before: 329,510,887,528 instructions 50,907,966,181 branches 843,515,912 branch-misses 2,753,360,222 cache-misses 19,306,172,474 L1-icache-load-misses 17,321,132,742 L1-dcache-load-misses 951,787,350 LLC-load-misses 44.660469000 seconds user 62.459354000 seconds sys => avg perf: 373 MB/s After: 331,310,219,157 instructions 51,343,396,257 branches 851,567,572 branch-misses 2,183,369,149 cache-misses 19,129,827,134 L1-icache-load-misses 17,441,877,512 L1-dcache-load-misses 906,923,115 LLC-load-misses 42.795458000 seconds user 62.277983000 seconds sys => avg perf: 380 MB/s With small requests, it's the L1 and L3 cache misses which reduced by 3% and 7% respectively, and the performance went up by 3%.	2024-10-12 17:17:51 +02:00
Willy Tarreau	04ce6536e1	OPTIM: mux-h2: try to continue reading after demuxing when useful When we stop demuxing in the middle of a frame, we know that there are other data following. The demux buffer is small and unique, but now we have rxbufs, so after h2_process_demux() is left, the dbuf is almost empty and has room to be delivered into another rxbuf. Let's implement a short loop with a counter and a few conditions around the demux call. We limit the number of turns to the number of available rxbufs and no more than 12, since it shows good performance, and the wakeup is only called once. This has shown a nice 12-20% bandwidth gain on backend-side H2 transferring 1MB-large objects, and does not affect the rest (headers, control etc). The number of wakeup calls was divided by 5 to 8, which is also a nice improvement. The counter is limited to make sure we don't add processing latency. Tests were run to find the optimal limit, and it turns out that 16 is just slightly better, but not worth the +33% increase in peak processing latency. The h2_process_demux() function just doesn't call the wakeup function anymore, and solely focuses on transferring from dbuf to rxbuf. Practical measurement: test with h2load producing 4 concurrent connections with 10 concurrent streams each, downloading 1MB objects (20k total) via two layers of haproxy stacked, reaching httpterm over H1 (numbers are total for the 2 h2 front and 1 h2 back). All on a single thread.
Before: 549-553 MB/s (on h2load) function calls cpu_tot cpu_avg h2_io_cb 2562340 8.157s 3.183us <- h2c_restart_reading@src/mux_h2.c:957 tasklet_wakeup h2_io_cb 30109 840.9ms 27.93us <- sock_conn_iocb@src/sock.c:1007 tasklet_wakeup h2_io_cb 16105 106.4ms 6.607us <- ssl_sock_io_cb@src/ssl_sock.c:5721 tasklet_wakeup h2_io_cb 1 11.75us 11.75us <- sock_conn_iocb@src/sock.c:986 tasklet_wakeup h2_io_cb 2608555 9.104s 3.490us --total-- perf stat: 153,117,996,214 instructions (71.41%) 22,919,659,027 branches # 14.97% of inst (71.41%) 384,009,600 branch-misses # 1.68% of all branches (71.42%) 44,052,220 cache-misses # 1 inst / 3476 (71.44%) 9,819,232,047 L1-icache-load-misses # 6.4% of inst (71.45%) 8,426,410,306 L1-dcache-load-misses # 5.5% of inst (57.15%) 10,951,949 LLC-load-misses # 1 inst / 13982 (57.13%) 12.372600000 seconds user 23.629506000 seconds sys After: 660 MB/s (+20%) function calls cpu_tot cpu_avg h2_io_cb 244502 4.410s 18.04us <- h2c_restart_reading@src/mux_h2.c:957 tasklet_wakeup h2_io_cb 42107 1.062s 25.22us <- sock_conn_iocb@src/sock.c:1007 tasklet_wakeup h2_io_cb 13703 106.3ms 7.758us <- ssl_sock_io_cb@src/ssl_sock.c:5721 tasklet_wakeup h2_io_cb 1 13.74us 13.74us <- sock_conn_iocb@src/sock.c:986 tasklet_wakeup h2_io_cb 300313 5.578s 18.57us --total-- perf stat: 126,840,441,876 instructions (71.40%) 17,576,059,236 branches # 13.86% of inst (71.40%) 274,136,753 branch-misses # 1.56% of all branches (71.42%) 30,413,562 cache-misses # 1 inst / 4170 (71.45%) 6,665,036,203 L1-icache-load-misses # 5.25% of inst (71.46%) 7,519,037,097 L1-dcache-load-misses # 5.9% of inst (57.15%) 6,702,411 LLC-load-misses # 1 inst / 18925 (57.12%) 10.490097000 seconds user 19.212515000 seconds sys It's also interesting to see that less total time is spent in these functions, clearly indicating that the cost of interrupted processing, and the extraneous cache misses come into play at some point. 
Indeed, after the change, the number of instructions went down by 17.2%, while the L2 cache misses dropped by 31% and the L3 cache misses by 39%!	2024-10-12 16:38:36 +02:00
Willy Tarreau	9fbc01710a	OPTIM: mux-h2: make h2_send() report more accurate wake up conditions h2_send() used to report non-zero every time any data were sent, and this was used from h2_snd_buf() or h2_done_ff() to trigger a wakeup, which possibly can do nothing. Restricting this wakeup to either a successful send() combined with the ability to demux, or an error. Doing this makes the number of h2_io_cb() wakeups drop from 422k to 245k for 1000 1MB objects delivered over 100 streams between two H2 proxies, without any behavior change nor performance change. In practice, most send() calls do not result in a wakeup anymore but synchronous errors still do. A local test downloading 10k 1MB objects from an H1 server with a single connection shows this change: before after caller 1547 1467 h2_process_demux() 2138 0 h2_done_ff() <--- 38 1453 ssl_sock_io_cb() <--- 18 0 h2_snd_buf() 1 1 h2_init() 3742 2921 -- total -- In practice the ssl_sock_io_cb() wakeups are those notifying about SUB_RETRY_RECV, which are not accounted for when h2_done_ff() performs the wakeup because the tasklet is already queued (a counter placed there shows that it's nonetheless called). So there's no transfer and h2_done_ff() was only hiding the other one. 
Another test involving 4 connections with 10 concurrent streams each and 20000 1MB objects total shows a total disparition of the wakeups from h2_snd_buf and h2_done_ff, which used to account together for 50% of the wakeups, resulting in effectively halving the number of wakeups which, based on their avg process time, were not doing anything: Before: function calls cpu_tot cpu_avg h2_io_cb 2571208 7.406s 2.880us <- h2c_restart_reading@src/mux_h2.c:940 tasklet_wakeup h2_io_cb 2536949 251.4ms 99.00ns <- h2_snd_buf@src/mux_h2.c:7573 tasklet_wakeup ### h2_io_cb 41100 5.622ms 136.0ns <- h2_done_ff@src/mux_h2.c:7779 tasklet_wakeup ### h2_io_cb 38979 852.8ms 21.88us <- sock_conn_iocb@src/sock.c:1007 tasklet_wakeup h2_io_cb 12519 90.28ms 7.211us <- ssl_sock_io_cb@src/ssl_sock.c:5721 tasklet_wakeup h2_io_cb 1 13.81us 13.81us <- sock_conn_iocb@src/sock.c:986 tasklet_wakeup h2_io_cb 5200756 8.606s 1.654us --total-- After: h2_io_cb 2562340 8.157s 3.183us <- h2c_restart_reading@src/mux_h2.c:957 tasklet_wakeup h2_io_cb 30109 840.9ms 27.93us <- sock_conn_iocb@src/sock.c:1007 tasklet_wakeup h2_io_cb 16105 106.4ms 6.607us <- ssl_sock_io_cb@src/ssl_sock.c:5721 tasklet_wakeup h2_io_cb 1 11.75us 11.75us <- sock_conn_iocb@src/sock.c:986 tasklet_wakeup h2_io_cb 2608555 9.104s 3.490us --total--	2024-10-12 16:38:36 +02:00
Willy Tarreau	633c41c621	MEDIUM: mux-h2: rework h2_restart_reading() to differentiate recv and demux From the beginning, h2_restart_reading() has always been confusing because it decides whether or not to wake the tasklet handler up or not. This tasklet handler does two things, one is receiving from the socket to the demux buf, and one is demuxing from the demux buf to the streams' rxbufs. The conditions are governed by h2_recv_allowed(), which is also called at a few places to decide whether or not to actually receive from the socket. It starts to be visible that this leaves some difficulties regarding what to do with possibly pending data. In 2.0 with commit 3ca18bf0b ("BUG/MEDIUM: h2: Don't attempt to recv from h2_process_demux if we subscribed."), we even had to address a special case where it was possibly to endlessly wake up because the conditions would rely on the demux buffer's contents, though the solution consisted in passing a flag to decide whether or not to consider the buffer's contents. In 2.5 commit b5f7b5296 ("BUG/MEDIUM: mux-h2: Handle remaining read0 cases on partial frames") introduced a new flag H2_CF_DEM_SHORT_READ which indicates that the demux had to stop in the middle of a frame and cannot make progress without more data. More adaptations later came in based on this but this actually reflected exactly what was needed to solve this painful situation: a state indicating whether to receive or parse. Now's about time to definitely address this by reworking h2_restart_reading() to check two completely independent things: - the ability to receive more data into the demux buffer, which is based on its allocation/fill state and the socket's errors - the ability to demux such data, which is based on the presence of enough data (i.e. no stuck short read), and ability to find an rx buf to continue the processing. 
Now the conditions are much more understandable, and it's also visible that the consider_buffer argument, whose value was not trivial for callers, is not used anymore. Tests stacking two layers of H2 show strictly no change to the wakeup cause distributions nor counts.	2024-10-12 16:38:36 +02:00
Willy Tarreau	e057f8367c	DOC: design-thoughts: add diagrams illustrating an rx win groth Let's just see on a diagram how the receiver can detect that the window is large enough for the remote sender to fill the link. Here it seems that a first criterion is that data are accumulating in the rxbuf, indicating that the next hop doesn't consume them fast enough. On the diagram it's visible when blue arrows (incoming data) are more frequent than the magenta ones on average (outgoing data), which happens when silence moments are less frequent and don't allow the reader to catch up. It's also visible that there are two phases alternating in the transfer: - measure round trip time (i.e. how long it takes to restart sending after a WU was sent after a long silence) - measure the lowest rxbuf size during the previous round trip It's worth noting that a window size change only has observable effect after two RTT: the first RTT is to restart sending (opening or enlarging the window), the second RTT to measure the lowest rxbuf size over the period. By turning the advertised window into an offset and comparing it to the received quantity, it's possible to measure the RTT of the whole chain (including the client possibly producing the data). Note that when multiple streams compete for BW this can become tricky. Limiting the window to available buffers and counting the number of sending streams on a connection could work (i.e. split total buffers into 1+#senders, first one being used for tx).	2024-10-12 16:38:36 +02:00
Willy Tarreau	0fd66703c2	MEDIUM: mux-h2: change the default initial window to 16kB Now that we're using all available rx buffers for transfers, there's no point anymore in advertising more than the minimum value we can safely buffer. Let's be conservative and only rely on the dynamic buffers to improve speed beyond the configured value, and make sure that many streams will no longer cause unfairness. Interestingly, the total number of wakeups has further shrunk down, but with a different distribution. From 128k for 1000 1M transfers, it went down to 119k, with 96k from restart_reading, 10k from done_ff and 2.6k from snd_buf. done_ff went up by 30% and restart_reading went down by 30%.	2024-10-12 16:38:26 +02:00
Willy Tarreau	1ed9d37c88	MINOR: mux-h2: add tune.h2.be.rxbuf and tune.h2.fe.rxbuf global settings These settings allow to change the total buffer size allocated to the backend and frontend respectively. This way it's no longer necessary to play with tune.bufsize nor increase the number of streams to benefit from more buffers. Setting tune.h2.fe.rxbuf to 4m to match a sender's max tcp_wmem resulted in 257 Mbps for a single stream at 103ms vs 121 Mbps default (or 5.1 Mbps with a single buffer and 64kB window).	2024-10-12 16:29:16 +02:00
Willy Tarreau	e018d9a0cf	MAJOR: mux-h2: make the rxbuf allocation algorithm a bit smarter Without using bandwidth estimates, we can already use up to the number of allocatable rxbufs and share them evenly between receiving streams. In practice we reserve one buffer for any non-receiving stream, plus 1 per 8 possible new streams, and divide the rest between the number of receiving streams. Finally, for front streams, this is rounded up to the buffer size while for back streams we round it down. The rationale here is that front to back is very fast to flush and slow to refill so we want to optimise upload bandwidth regardless of the number of streams, while it's the opposite in the other way so we try to minimize HoL. That shows good results with a single stream being able to send at 121 Mbps at 103ms using 1.4 MB buffer with default settings, or 8 streams sharing the bandwidth at 180kB each. Previously the limit was approx 5.1 Mbps per stream. It also enables better sharing of backend connections: a slow (100 Mbps) and a fast (1 Gbps) clients were both downloading 2 100MB files each over a shared H2 connection. The fast one used to show 6.86 to 20.74s with an avg of 11.45s and an stddev of 5.81s before the patch, and went to a much more respectable 6.82 to 7.73s with 7.08s avg and 0.336s stddev. We don't try to increase the window past the remaining content length. First, this is pointless (though harmless), but in addition it causes needless emission of WINDOW_UPDATE frames on small uploads that are smaller than a window, and beyond being useless, it upsets vtest which expects an RST on some tests. The scheduling is not reliable enough to insert an expect for a window update first, so in the end with that extra check we save a few useless frames on small uploads and please vtest. A new setting should be added to allow to increase the number of buffers without having to change the number of streams. At this point it's not done.	2024-10-12 16:29:16 +02:00
Willy Tarreau	3816c38601	MAJOR: mux-h2: permit a stream to allocate as many buffers as desired Now we don't enforce allocation limits in h2s_get_rxbuf(), since there is no benefit in not processing pending data, it would still cause HoL for no saving. The only reason for not allocating is if there are no buffers available for the connection. In theory this should not change anything except that it exercises code paths that support reallocating multiple buffers, which could possibly uncover a sleeping bug. This is why it's placed in a separate commit. And one observation worth noting is that it almost cut in half the number of iocb wakeups: for 1000 1MB transfers over 100 concurrent streams of a single connection, we used to observe 208k wakeups (110 from restart_reading, 80 from snd_buf, 11 from done_ff), and now we're observing 128k (113 from restart_reading, 2.4 from snd_buf, 6.9k from done_ff), which seems to indicate that pretty often the demuxing was blocked on a buffer full due to the default advertised window of 64k.	2024-10-12 16:29:16 +02:00
Willy Tarreau	4eb3ff1d3b	MAJOR: mux-h2: make streams use the connection's buffers For now it seems to work as before, and even when artificially inflating the number of allocatable buffers per stream. The number of allocated slots is always the same as the max number of streams, which guarantees that each stream will find one buffer. we only grant one buffer per stream at this point, since the goal was to replace the existing single rxbuf. A new demux blocking flag, H2_CF_DEM_RXBUF, was added to indicate a failure to get an rxbuf slot from the connection. It was lightly tested (by forcing bl_init() to a lower number of buffers). It is not yet certain whether it's more useful to have a new flag or to reuse the existing H2_CF_DEM_SFULL which indicates the rxbuf is full, but at least the new flag more accurately translates the condition, that may make a difference in the future. However, given that when RXBUF is set, most of the time it results in a failure to find more room to demux and it sets SFULL, for now we have to always clear SFULL when clearing RXBUF as well. This means that most of the time we'll see 3 combinations: - none: everything's OK - SFULL: the unique rx buffer is full - RXBUF \|\| (RXBUF\|SFULL): cannot allocate more entries Note that we need to be super careful in h2_frt_transfer_data() because the htx_free_data_space() function doesn't guarantee that the room is usable, so htx_add_data() may still fail despite an apparent room. For this reason, h2_frt_transfer_data() maintains a "full" flag to indicate that a transfer attempt failed and that a new buffer is required.	2024-10-12 16:29:16 +02:00
Willy Tarreau	6279cbc9e9	MINOR: mux-h2: clear up H2_CF_DEM_DFULL and H2_CF_DEM_SHORT_READ ambiguity Since commit 485da0b05 ("BUG/MEDIUM: mux_h2: Handle others remaining read0 cases on partial frames"), H2_CF_DEM_SHORT_READ is set when there are no blocking flags. However, it checks H2_CF_DEM_BLOCK_ANY which does not include H2_CF_DEM_DFULL. This results in many cases where both H2_CF_DEM_DFULL and H2_CF_DEM_SHORT_READ are set together, which makes no sense, since one says the demux buffer is full while the other one says an incomplete read was done. This makes it impossible to properly decide whether to restart reading or processing. Let's make sure to clear DFULL in h2_process_demux() whenever we consume incoming data from the dbuf, and check for DFULL before setting SHORT_READ. This could probably be considered as a bug fix but it's hard to say if it has any impact on the current code, probably at worst it might cause a few useless wakeups, so until there's any proof that it needs to be backported, better not do it.	2024-10-12 16:29:16 +02:00
Willy Tarreau	b74bedf157	MINOR: mux-h2: simplify the wake up code in h2_rcv_buf() The code used to decide when to restart reading is far from being trivial and will cause trouble after the forthcoming changes: it checks if the current stream is the same that is being demuxed, and only if so, wakes the demux to restart reading. Once streams will start to use multiple buffers, this condition will make no sense anymore. Actually the real reason is split into two steps: - detect if the demux is currently blocked on the current stream, and if so remove SFULL - detect if any demux blocking flags were removed during the operations, and if so, wake demuxing. For now this doesn't change anything.	2024-10-12 16:29:16 +02:00
Willy Tarreau	a0ed92f3dd	MINOR: mux-h2: simplify the exit code in h2_rcv_buf() The code used to decide what to tell to the upper layer and when to free the rxbuf is a bit convoluted and difficult to adapt to dynamic rxbufs. We first need to deal with memory management (b_free) and only then to decide what to report upwards. Right now it does it the other way around. This should not change anything.	2024-10-12 16:29:16 +02:00
Willy Tarreau	3b5ac2b553	MINOR: mux-h2: move H2_CF_WAIT_IN_LIST flag away from the demux flags It's not convenient to have this flag in the middle of the demux flags, it easily hides other ones that need to be added. Let's move it after the other ones.	2024-10-12 16:29:16 +02:00
Willy Tarreau	8cf418811d	MINOR: mux-h2: add rxbuf head/tail/count management for h2s Now the h2s get their rx_head, rx_tail and rx_count associated with the shared rxbufs. A few functions are provided to manipulate all this, essentially allocate/release a buffer for the stream, return a buffer pointer to the head/tail, counting allocated buffers for the stream and reporting if a stream may still allocate. For now this code is not used.	2024-10-12 16:29:16 +02:00
Willy Tarreau	a891534bfd	MINOR: mux-h2: allocate the array of shared rx bufs in the h2c In preparation for having a shared list of rx bufs, we're now allocating the array of shared rx bufs in the h2c. The pool is created at the max size between the front and back max streams for now, and the array is not used yet.	2024-10-12 16:29:16 +02:00
Willy Tarreau	721ea5b06c	MINOR: mux-h2: count within a connection, how many streams are receiving data A stream is receiving data from after the HEADERS frame missing END_STREAM, to the end of the stream or HREM (the presence of END_STREAM). We're now adding a flag to the stream that indicates this state, as well as a counter in the connection of streams currently receiving data. The purpose will be to gauge at any instant the number of streams that might have to share the available bandwidth and buffers count in order not to allocate too much flow control to any single stream. For now the counter is kept up to date, and is reported in "show fd".	2024-10-12 16:29:16 +02:00
Willy Tarreau	c9275084bc	MEDIUM: mux-h2: start to introduce the window size in the offset calculation Instead of incrementing the last_max_ofs by the amount of received bytes, we now start from the new current offset to which we add the static window size. The result is exactly the same but it prepares the code to use a window size combined with an offset instead of just refilling the budget from what was received. It was even verified that changing h2_fe_settings_initial_window_size in the middle of a transfer using gdb does indeed allow the transfer speed to adapt accordingly.	2024-10-12 16:29:16 +02:00
Willy Tarreau	1cc851d9f2	MEDIUM: mux-h2: start to update stream when sending WU The rationale here is that we don't absolutely need to update the stream offset live, there's already the rcvd_s counter to remind us we've received data. So we can continue to exploit the current check points for this. Now we know that rcvd_s indicates the amount of newly received bytes for the stream since last call to h2c_send_strm_wu() so we can update our stream offsets within that function. The wu_s counter is set to the difference between next_adv_ofs and last_adv_ofs, which are resynchronized once the frame is sent. If the stream suddenly disappears with unacked data (aborted upload), the presence of the last update in h2c->wu_s is sufficient to let the connection ack the data alone, and upon subsequent calls with new rcvd_s, the received counter will be used to ack, like before. We don't need to do more anyway since the goal is to let the client abort ASAP when it gets an RST. At this point, the stream knows its current rx offset, the computed max offset and the last advertised one.	2024-10-12 16:29:16 +02:00
Willy Tarreau	eb0fe66c61	MINOR: mux-h2: create and initialize an rx offset per stream In H2, everything is accounted as budget. But if we want to moderate the rcv window that's not very convenient, and we'd rather have offsets instead so that we know where we are in the stream. Let's first add the fields to the struct and initialize them. The curr_rx_ofs indicates the position in the stream where next incoming bytes will be stored. last_adv_ofs tells what's the offset that was last advertised as the window limit, and next_max_ofs is the one that will need to be advertised, which is curr_rx_ofs plus the current window. next_max_ofs will have to cause a WINDOW_UPDATE to be emitted when it's higher than last_adv_ofs, and once the WU is sent, its value will have to be copied over last_adv_ofs. The problem is, for now wherever we emit a stream WU, we have no notion of stream (the stream might even not exist anymore, e.g. after aborting an upload), because we currently keep a counter of stream window to be acked for the current stream ID (h2c->dsi) in the connection (rcvd_s). Similarly there are a few places early in the frame header processing where rcvd_s is incremented without knowing the stream yet. Thus, lookups will be needed for that, unless such a connection-level counter remains used and poured into the stream's count once known (delicate). Thus for now this commit only creates the fields and initializes them.	2024-10-12 16:29:15 +02:00
Willy Tarreau	560e474cdd	MINOR: mux-h2: split the amount of rx data from the amount to ack We'll need to keep track of the total amount of data received for the current stream, and the amount of data to ack for the current stream, which might soon diverge as soon as we'll have to update the stream's offset with received data, which are different from those to be ACKed. One reason is that in case a stream doesn't exist anymore (e.g. aborted an upload), the rcvd_s info might get lost after updating the stream, so we do need to have an in-connection counter for that. What's done here is that the rcvd_s count is transferred to wu_s in h2c_send_strm_wu(), to be used as the counter to send, and both are considered as sufficient when non-null to call the function.	2024-10-12 16:29:15 +02:00
Willy Tarreau	8f09bdce10	MINOR: buffer: add a buffer list type with functions The buffer ring is problematic in multiple aspects, one of which being that it is only usable by one entity. With multiplexed protocols, we need to have shared buffers used by many entities (streams and connection), and the only way to use the buffer ring model in this case is to have each entity store its own array, and keep a shared counter on allocated entries. But even with the default 32 buf and 100 streams per HTTP/2 connection, we're speaking about 32*101*32 bytes = 103424 bytes per H2 connection, just to store up to 32 shared buffers, spread randomly in these tables. Some users might want to achieve much higher than default rates over high speed links (e.g. 30-50 MB/s at 100ms), which is 3 to 5 MB storage per connection, hence 180 to 300 buffers. There it starts to cost a lot, up to 1 MB per connection, just to store buffer indexes. Instead this patch introduces a variant which we call a buffer list. That's basically just a free list encoded in an array. Each cell contains a buffer structure, a next index, and a few flags. The index could be reduced to 16 bits if needed, in order to make room for a new struct member. The design permits initializing a whole freelist at once using memset(0). The list pointer is stored at a single location (e.g. the connection) and all users (the streams) will just have indexes referencing their first and last assigned entries (head and tail). This means that with a single table we can now have all our buffers shared between multiple streams, regardless of the number of potential streams which would want to use them. Now the 180 to 300 entries array only costs 7.2 to 12 kB, or 80 times less. Two large functions (bl_deinit() & bl_get()) were implemented in buf.c. A basic doc was added to explain how it works.	2024-10-12 16:29:15 +02:00
Willy Tarreau	ac66df4e2e	REORG: buffers: move some of the heavy functions from buf.h to buf.c Over time, some of the buffer management functions grew quite a bit, and were still forced to remain inlined since all defined in buf.h. Let's create buf.c and move the heaviest ones there. All those moved here were above 200 bytes.	2024-10-12 16:29:15 +02:00
Willy Tarreau	d288ddb575	CLEANUP: muxes: remove useless inclusion of ebmbtree.h Since 2.7 with commit 8522348482 ("BUG/MAJOR: conn-idle: fix hash indexing issues on idle conns"), we've been using eb64 trees and not ebmb trees anymore, and later we dropped all that to centralize the operations in the server. Let's remove the ebmbtree.h includes from the muxes that do not use them.	2024-10-12 16:29:15 +02:00
Willy Tarreau	cf3fe1eed4	MINOR: mux-h2/traces: print the size of the DATA frames DATA frames produce a special trace with the amount of transferred data in arg4, but this was not reported by h2_trace(). This commit just adds it.	2024-10-12 16:29:15 +02:00
Willy Tarreau	af064b497a	BUG/MINOR: mux-h2/traces: present the correct buffer for trailers errors traces The local "rxbuf" buffer was passed to the trace instead of h2s->rxbuf that is used when decoding trailers. The impact is essentially the impossibility to present some buffer contents in some rare cases. It may be backported but it's unlikely that anyone will ever notice the difference.	2024-10-12 16:29:15 +02:00
Willy Tarreau	0fa654ca92	BUILD: cache: silence an uninitialized warning at -Og with gcc-12.2 Building with gcc-12.2 -Og yields this incorrect warning in cache.c: In function 'release_entry_unlocked', inlined from 'http_action_store_cache' at src/cache.c:1449:4: src/cache.c:330:9: warning: 'object' may be used uninitialized [-Wmaybe-uninitialized] 330 | release_entry(cache, entry, 1); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ src/cache.c: In function 'http_action_store_cache': src/cache.c:1200:29: note: 'object' was declared here 1200 | struct cache_entry *object, *old; | ^~~~~~ This is wrong, the only way to reach the function is with first!=NULL and the gotos that reach there are all those made with first==NULL. Let's just preset object to NULL to silence it.	2024-10-12 16:28:54 +02:00
William Lallemand	edf85a1d76	MINOR: cfgparse: simulate long configuration parsing with force-cfg-parser-pause This command is pausing the configuration parser for <timeout> milliseconds. This is useful for development or for testing timeouts of init scripts, particularly to simulate a very long reload. It requires the expose-experimental-directives to be set.	2024-10-11 17:40:37 +02:00
Amaury Denoyelle	232083c3e5	BUG/MEDIUM: mux-quic: ensure timeout server is active for short requests If a small request is received on QUIC MUX frontend, it can be transmitted directly with the FIN on attach operation. rcv_buf is skipped by the stream layer. Thus, it is necessary to ensure that there is similar behavior when FIN is reported either on attach or rcv_buf. One difference was that se_expect_data() was called only for rcv_buf but not on attach. The most obvious effect is that stream timeout was deactivated for this request : client timeout was disabled on EOI but server one not armed due to previous se_expect_no_data(). This prevents the early closure of too long requests. To fix this, add an invocation of se_expect_data() on attach operation. This bug can simply be detected using httpterm with delay request (for example /?t=10000) and using smaller client/server timeouts. The bug is present if the request is not aborted on timeout but instead continues until its proper HTTP 200 termination. This has been introduced by the following commit : 85eabfbf672c57e4ed082da1b96c95348b331320 MEDIUM: mux-quic: Don't expect data from server as long as request is unfinished This must be backported up to 2.8.	2024-10-10 17:20:39 +02:00
Aurelien DARRAGON	7144e60cd2	MINOR: sample: postresolve sink names in debug() converter debug() converter used to resolve sink names during parsing time. Because of this, we were unable to specify sink names that were defined after the debug() converter was placed. Like in the previous commit, let's implement proper postparsing for the debug() converter, in order to be able to use sink names that are about to be defined later in the config file.	2024-10-10 16:55:15 +02:00
Aurelien DARRAGON	ed266589b6	MINOR: trace: postresolve sink names A previous known limitation about traces was that parsing was performed on the fly, meaning that when using "sink" keyword, only sinks that were either internal or previously defined in the config could be used. Indeed, it was not possible to use a ring section defined AFTER the traces section when using the 'sink' keyword from traces. This limitation was also mentioned in the config file. Let's get rid of that limitation by implementing proper postparsing for the sink parameter in traces section. To do this, make use of the new sink_find_early() helper to start referencing sink by their names even if they don't exist yet (if they are about to be defined later in the config) Traces commands on the cli are not concerned by this change.	2024-10-10 16:55:15 +02:00
Aurelien DARRAGON	1bdf6e884a	MEDIUM: sink: implement sink_find_early() sink_find_early() is a convenient function that can be used instead of sink_find() during parsing time in order to try to find a matching sink even if the sink is not defined yet. Indeed, if the sink is not defined, sink_find_early() will try to create it and mark it as forward-declared. It will also save information from the caller to better identify it in case of errors. If the sink happens to be found in the config, it will transition from forward-declared type to its final type. Otherwise, the sink was not found in the config, and in this case, during postresolve, we raise an error to indicate that the sink was not found in the configuration. It should help solve postresolving issues with rings, because for now only log targets implement proper ring postresolving, but rings may be used at different places in the code, such as debug() converter or in "traces" section.	2024-10-10 16:55:15 +02:00
Damien Claisse	ba7c03c18e	MINOR: ssl: disable server side default CRL check with WolfSSL Patch 64a77e3ea5 disabled CRL check when no CRL file was provided, but it only did it on bind side. Add the same fix in server context initialization side. This allows to enable peer verification (verify required) on a server using TLS, without having to provide a CRL file.	2024-10-10 09:31:19 +02:00
Amaury Denoyelle	456c3997b2	BUG/MEDIUM: quic: properly decount out-of-order ACK on stream release Out-of-order STREAM ACK are buffered in its related streambuf tree. On insertion, overlapping or contiguous ranges are merged together. The total size of buffered ack range is stored in <room> streambuf member and reported to QUIC MUX layer on streambuf release. The objective is to ensure QUIC MUX layer can allocate Tx buffers conveniently to preserve a good transfer throughput. Streamdesc is the overall container of many streambufs. It may also be released when its upper QCS instance is freed, after all stream data have been emitted. In this case, the active streambuf is also released via custom code. However, in this code path, <room> was not reported to the QUIC MUX layer. This bug caused wrong estimation for the QUIC MUX txbuf window, with bytes remaining even after all ACK reception. This may cause transfer freeze on other connection streams, with RESET_STREAM emission on timeout client. To fix this, reuse the existing qc_stream_buf_release() function on streamdesc release. This ensures that notify_room is correctly used. No need to backport.	2024-10-09 17:47:16 +02:00
Amaury Denoyelle	f0049d0748	BUG/MINOR: quic: fix discarding of already stored out-of-order ACK To properly decount out-of-order acked data range, contiguous or overlapping ranges are first merged before their insertion in a tree. The first step ensures that a newly reported range is not completely covered by the existing tree ranges. However, one of the conditions was incorrect. Fix this to ensure that the final range tree does not contain duplicated entries. The impact of this bug is unknown. However, it may have allowed the insertion of overlapping ranges, which could in turn cause an error in QUIC MUX txbuf window, with a possible transfer freeze. No need to backport.	2024-10-09 17:32:30 +02:00
Aurelien DARRAGON	f88f162868	BUG/MEDIUM: hlua: properly handle sample func errors in hlua_run_sample_{fetch,conv}() To execute sample fetches and converters from Lua, the hlua API leverages the sample API. Prior to executing the sample func, the arg checker is called from hlua_run_sample_{fetch,conv}() to detect potential errors. However, hlua_run_sample_{fetch,conv}() both pass NULL as <err> argument, but it is wrong for two reasons. First we miss an opportunity to report precise error messages to help the user know what went wrong during the check, and more importantly, some val check functions consider that the <err> pointer is never NULL. This is the case for example with check_crypto_hmac(). Because of this, when such val check functions encounter an error, they will crash the process because they will try to de-reference NULL. This bug was discovered and reported by GH user @JB0925 on #2745. Perhaps val check functions should make sure that the provided <err> pointer is != NULL prior to de-referencing it. But since there are multiple occurrences found in the code and the API isn't clear about that, it is easier to fix the hlua part (caller) for now. To fix the issue, let's always provide a valid <err> pointer when leveraging val_arg() check function pointer, and make use of it in case of error to report relevant message to the user before freeing it. It should be backported to all stable versions.	2024-10-08 12:00:42 +02:00
Aurelien DARRAGON	d0e0105181	BUG/MEDIUM: hlua: make hlua_ctx_renew() safe hlua_ctx_renew() is called from unsafe places where the caller doesn't expect it to LJMP.. however hlua_ctx_renew() makes use of Lua library function that could potentially raise errors, such as lua_newthread(), and it does nothing to catch errors. Because of this, haproxy could unexpectedly crash. This was discovered and reported by GH user @JB0925 on #2745. To fix the issue, let's simply make hlua_ctx_renew() safe by applying the same logic implemented for hlua_ctx_init() or hlua_ctx_destroy(), which is catching Lua errors by leveraging SET_SAFE_LJMP_PARENT() helper. It should be backported to all stable versions.	2024-10-08 12:00:36 +02:00
Aurelien DARRAGON	3f4a788329	REGTESTS: add some tests for 'do-log' action Now that 'do-log' action may be used for all existing action contexts, let's add some tests in reg-tests/log/log_profile.vtc to ensure it works as expected. quic-ini is not tested as it may not be builtin depending on build options..	2024-10-04 21:38:19 +02:00
Aurelien DARRAGON	3ba924a4da	MINOR: action: add do-log action Thanks to the two previous commits, we can now expose the do-log action on all available action contexts, including the new quic-init context. Each context is responsible for exposing the do-log action by registering the relevant log steps, saving the identifier, and then storing it in the rule's context so that do_log_action() automatically uses it to produce the log during runtime. To use the feature, it is simply needed to use "do-log" (without argument) on an action directive, example: tcp-request connection do-log As mentioned before, each context where the action is exposed has its own log step identifier. Currently known identifiers are: quic-initial: quic-init tcp-request connection: tcp-req-conn tcp-request session: tcp-req-sess tcp-request content: tcp-req-cont tcp-response content: tcp-res-cont http-request: http-req http-response: http-res http-after-response: http-after-res Thus, these "additional" logging steps can be used as-is under log-profile section (after "on" keyword). However, although the parser will accept them, it makes no sense to use them with the "log-steps" proxy keyword, since the only path for these origins to trigger a log generation is through the explicit use of "do-log" action. This need was described in GH #401, it should help to conditionally trigger logs using ACL at specific key points, and may either be used alone or combined with "log-steps" to add additional log "trackers" during transaction handling. Documentation was updated and some examples were added.	2024-10-04 21:38:14 +02:00
Aurelien DARRAGON	0e271f1d2a	MINOR: log: add do_log_parse_act() helper func Function may be used from places where per-context actions are usually registered (tcp_act.c, http_act.c, quic_rules.c.. to name a few) in order to expose the do_log() action.	2024-10-04 21:38:08 +02:00
Aurelien DARRAGON	e63c7da508	MINOR: log: add do_log() logging helper do_log() is quite similar to sess_log() or strm_log(), except that it may be called at any time during session handling in an opportunistic way as long as the session exists (the stream may or may not exist). Also, it will try to emit the log as INFO by default, unless set-log-level is used on the stream, or error origin flag is set.	2024-10-04 21:38:02 +02:00
Amaury Denoyelle	f6599cf5a6	MEDIUM: quic: decount out-of-order ACK data range for MUX txbuf window This commit is the last one of a series whose objective is to restore QUIC transfer throughput performance to the state prior to the recent QUIC MUX buffer allocator rework. This gain is obtained by reporting received out-of-order ACK data range to the QUIC MUX which can then decount room in its txbuf window. This is implemented in QUIC streamdesc layer by adding a new invocation of notify_room callback. This is done in qc_stream_buf_store_ack() which handles out-of-order ACK data range. Previous commit has introduced merging of overlapping ACK data range. As such, it's easy to only report the newly acknowledged data range. As with in-order ACKs, this new notification is only performed on released streambuf. As such, when a streambuf instance is released, notify_room notification now also reports the total length of out-of-order ACK data range currently stored. This value is stored in a new streambuf member <room> to avoid unnecessary tree lookup. This <room> member also serves on in-order ACK notification to reduce the notified room. This prevents reporting invalid values when overlapping ranges are treated first out-of-order and then in-order, which would cause an invalid QUIC MUX txbuf window value. After this change has been implemented, performance has been significantly improved, both with ngtcp2-client rate usage and on interop goodput test. These values are now similar to the rate observed on older haproxy version before QUIC MUX buffer allocator rework.	2024-10-04 18:09:51 +02:00
Amaury Denoyelle	ae3e768d32	MEDIUM: quic: merge contiguous/overlapping buffered ack stream range Transfer throughput has deteriorated since the recent rework of QUIC MUX txbuf allocator. This was partially restored with the commit to decount individual in-order ACK from the MUX buffer window. To fully retrieve the old performance level, all ACKs must be decounted when handled by QUIC streamdesc layer, even out-of-order ranges. However, this is not easily implemented as several ranges may exist in parallel with overlap on the underlying data. It would cause miscalculation for QUIC MUX buffer window if such ranges were blindly reported. The proper solution is to first implement merge of contiguous or overlapping ACK data ranges to reduce the number of stored ranges to the minimum. This is the purpose of this patch. This is implemented in a new static function named qc_stream_buf_store_ack() in the streamdesc layer. The merge algorithm is simple enough. First, it ensures the newly added range is not already fully covered by a preexisting entry. Then, it checks if there is contiguity/overlap with one or several ranges starting at the same or a greater offset. If true, the newly added entry is extended to cover them all, and all contiguous/overlapped ranges are removed. Finally, if there is contiguity or overlap with an entry starting at a smaller offset, no new range is instantiated and instead the smaller offset is extended. Now that contiguous or overlapped ranges cannot exist anymore, ACK data ranges tree instantiation can use EB_ROOT_UNIQUE. Outside of the longer term objective which is to decount out-of-order ACKs from MUX txbuf window, this commit could also improve some performance and/or memory usage for connections where stream data fragmentation and packet reordering is high.	2024-10-04 18:07:52 +02:00
Amaury Denoyelle	e7578084b0	MINOR: quic: implement dedicated type for out-of-order stream ACK QUIC streamdesc layer is responsible for handling reception of ACK for streams. It removes stream data from the underlying buffers on ACK reception. Streamdesc layer treats ACK in order at the stream level. Out of order ACKs are buffered in a tree until they can be handled on older data acknowledgement reception. Previously, qf_stream instance which comes from the quic_tx_packet was used as tree node to buffer such ranges. Introduce a new type dedicated to represent out of order stream ack data range. This type is named qc_stream_ack. It contains minimal infos only relative to the acknowledged stream data range. This allows to reduce size of frequently used quic_frame with the removal of tree node from qf_stream. Another side effect of this change is that now quic_frame are always released immediately on ACK reception, both in-order and out-of-order. This allows to also release the quic_tx_packet instance which should reduce memory consumption. The drawback of this change is that qc_stream_ack instance must be allocated on out-of-order ACK reception. As such, qc_stream_desc_ack() may fail if an error happens on allocation. For the moment, such error is silently recovered up to qc_treat_rx_pkts() with the dropping of the received packet containing the ACK frame. In the future, it may be useful to close the connection as this error may only happen under low memory conditions.	2024-10-04 17:56:45 +02:00
Amaury Denoyelle	4ff87db5fe	MEDIUM: quic: decount acknowledged data for MUX txbuf window Recently, a new allocation mechanism was implemented for Tx buffers used by QUIC MUX. Now, underlying congestion window size is used to determine if it is still possible or not to allocate a new buffer when necessary. This mechanism has rendered the QUIC stack more flexible. However, it also has brought some performance degradation, with longer transfer times in certain environments. It was first discovered on the measurement results of the interop. It can also easily be reproduced using the following ngtcp2-client example which forces a very small congestion window due to frequent loss : $ ngtcp2-client -q --no-quic-dump --no-http-dump --exit-on-all-streams-close -r 0.1 127.0.0.1 20443 "https://[::]:20443/?s=10m" This performance decrease is caused by the allocator which is now too strict. It may cause buffer underrun frequently at the MUX layer when the congestion window is too small, as new buffers cannot be allocated until the current one is fully acknowledged. This results in transfers with very bad throughput utilisation. The objective of this new series of patches is to relax some restrictions to permit QUIC MUX to allocate new buffers more quickly, while preserving the initial limitation based on congestion window size. An interesting method for this is to notify QUIC MUX about newly available room on individual ACK reception, without waiting for the full buffer acknowledgement. This is easily implemented by adding a new notify_room invocation in QUIC streamdesc layer on ACK reception. However, ACK receptions are handled in-order at the stream level. Out of order ACKs are buffered and are not decounted for now. This will be implemented in a future commit. Note that for a single buffer instance, data can in parallel be written by QUIC MUX and removed on ACK reception. This could cause room notification to QUIC MUX layer to report invalid values.
As such, ACK receptions are only accounted for released buffers. This ensures that such buffers won't receive any new data. At the same time, buffer room is notified on release operation as it does not need acknowledgement. This commit improves performance for the ngtcp2-client scenario above. However, it is not yet sufficient for interop goodput test.	2024-10-04 17:31:26 +02:00
Amaury Denoyelle	324a49ed4d	MINOR: quic: strengthen qc_release_frm() quic_frame is the type used to represent frames emitted in a QUIC Tx packet. Each frame is attached to a packet, and can also be linked to other frames from the same packet, or duplicated frames for retransmission. As such, quic_frame free operation is a tedious process. qc_release_frm() has been implemented to ensure quic_frame is always properly freed after detaching from all its list attach points. One particular point is to ensure that when a frame is released, the frame origin and all origin copies, including the current <frm>, are flagged as acked and detached from the reflist. Add a BUG_ON() to ensure this loop is properly conducted when dealing with the current <frm> instance.	2024-10-04 16:00:05 +02:00
Christopher Faulet	131b877565	BUG/MINOR: stats: Fix the name for the total number of streams created Because of a copy/paste error, CurrStreams was reused by mistake. It should be "CumStreams" No backports needed.	2024-10-04 15:44:40 +02:00
Amaury Denoyelle	c1d714156e	BUG/MAJOR: mux-quic: do not crash on empty STREAM frame emission Most of the time STREAM frames emitted by QUIC MUX have some data in it. However, it is possible to use an empty frame when a delayed FIN must be transferred. Recently, QUIC MUX send callback notification has been refactored. Now, this callback is blindly called by quic_conn lower layer each time a STREAM frame is built into a newly Tx packet. QUIC MUX is responsible to ensure the notified frame corresponds to newly emitted data or retransmission. Offsets are used for this comparison, but this requires special care for empty FIN frames. Sadly, the comparison written to determine if an empty FIN frame was sent for the first time or retransmitted is not correct. This caused such frame to always be dismissed as retransmission in QUIC MUX sent callback. This prevented the related QCS instance to be removed from the send_list, causing qcc_io_send() to retry a new emission. This was finally interrupted by the BUG_ON() assertion to prevent an infinite loop. Fix this crash by updating the condition in QUIC MUX send callback. For empty STREAM frame, it is sufficient to check if QC_SF_FIN_STREAM was already removed or not to detect a retransmission. Indeed, empty STREAM frames are never used outside of delayed FIN reporting. No need to backport. This crash was introduced in the current dev branch by the following commit. d7f4e5abf0b7129329d0ea716c104474fd934bc6 MEDIUM: quic: strengthen MUX send notification	2024-10-04 11:31:11 +02:00
Willy Tarreau	7cdc9325a1	[RELEASE] Released version 3.1-dev9 Released version 3.1-dev9 with the following main changes : - MINOR: tools: add minimal file name management - CLEANUP: stick-table: make the file location point to a global file name - MINOR: proxy: use the global file names for conf->file - CLEANUP: cfgparse: factor proxy vs log-forward collisions - BUG/MINOR: cfgparse: detect another uncaught case of duplicate defaults - MINOR: proxy: add a list of orphaned defaults sections - MEDIUM: cfgparse: drop duplicate named defaults sections after use - OPTIM: cfgparse: speed up duplicate server detection - MEDIUM: cfgparse: warn about deprecated use of duplicate server names - BUG/MINOR: server: shut down streams under thread isolation - BUG/MINOR: proxy: also make the cli and resolvers use the global name - REGTESTS: log: fix log-profile.vtc - MEDIUM: mailers: warn about deprecated legacy mailers - BUG/MEDIUM: cli: Be sure to catch immediate client abort - DEV: flags/applet: decode appctx flags - BUG/MEDIUM: cli: Deadlock when setting frontend maxconn - MINOR: log: fix indent in strm_log() - MINOR: log: introduce extra log profile steps - MINOR: log: handle extra log origins in _process_send_log_override() - MINOR: log: introduce log_orig flags - MINOR: log: explicitly handle extra log origins as error when relevant - MINOR: log: support extra log origins for '%OG' alias - MINOR: proxy: add log_steps struct member - MINOR: log: introduce "log-steps" proxy keyword - MINOR: log: add log_orig_proxy() helper function - MEDIUM: log: consider log-steps proxy setting for existing log origins - DOC: config: document proxy "log-steps" keyword - REGTESTS: add a test for proxy "log-steps" - Revert "BUG/MINOR: server: shut down streams under thread isolation" - MINOR: task: define two new one-shot events for use with WOKEN_OTHER or MSG - BUG/MEDIUM: stream: make stream_shutdown() async-safe - BUG/MINOR: server: make sure the HMAINT state is part of MAINT - BUG/MINOR: 
queue: make sure that maintenance redispatches server queue - MINOR: server: make srv_shutdown_sessions() call pendconn_redistribute() - BUILD: tools: only include execinfo.h for the real backtrace() function - MINOR: tools: do not attempt to use backtrace() on linux without glibc - OPTIM: channel: speed up co_getline()'s search of the end of line - OPTIM: stconn: Don't pretend mux have more data to deliver on EOI/EOS/ERROR - BUG/MINOR: mcli: Pretend the mux have more data to deliver between two commands - MINOR: action: Export release_expr_int_action() release function - MINOR: stream: Rely on a per-stream max connection retries value - MINOR: stream: Support dynamic changes of the number of connection retries - MINOR: stream/stats: Expose the current number of streams in stats - MINOR: stream/stats: Expose the total number of streams ever created in stats - BUG/MINOR: cfgparse-global: fix allowed args number for setenv - MINOR: cfgparse-global: add dedicated parser for *env keywords - MINOR: mux-quic: complete Tx infos for QCS dump - MINOR: quic: ensure txbuf realloc is only performed on empty buffer - MINOR: mux-quic: strengthen qcs_send_metadata() usage - MINOR: quic: remove unneeded notification of txbuf room - MINOR: quic: refactor MUX send notification - MEDIUM: quic: strengthen MUX send notification - MINOR: quic: refactor STREAM room notification - MINOR: quic: do not remove qc_stream_desc automatically on ACK handling - MINOR: quic: store streambuf in a streamdesc tree - MINOR: quic: move buffered ACK to streambuf - MEDIUM: quic: handle out-of-order ACK at streamdesc layer - MEDIUM: quic: refactor buffered STREAM ACK consuming - BUG/MEDIUM: queue: always dequeue the backend when redistributing the last server - MINOR: config/trace: Add a 'traces' section to declare debug traces - MINOR: trace: Be able to chain commands for a source in one line - MINOR: tcpcheck: Add support for an option host header value for httpchk option - BUG/MINOR: mux-h1: Fix 
condition to set EOI on SE during zero-copy forwarding - MINOR: mux-h1: Use a dedicated function to conditionnaly set EOI flag on SE - BUG/MINOR: http-ana: Disable fast-fwd for unfinished req waiting for upgrade - BUG/MINOR: mux-quic: fix crash on qcc_init() early return - BUG/MINOR: quic: fix trace on releasing STREAM frame after ack	2024-10-03 17:47:33 +02:00
Amaury Denoyelle	b74df9fbc9	BUG/MINOR: quic: fix trace on releasing STREAM frame after ack Fix NULL argument passed to qc_release_frm(). This allows more context to be given in the traces inside it. Note that no crash occurred as QUIC traces always check the validity of the first arg before dereferencing it. No backport needed.	2024-10-02 17:10:51 +02:00
Amaury Denoyelle	58b7a72d07	BUG/MINOR: mux-quic: fix crash on qcc_init() early return qcc_release() may be used in case qcc_init() cannot complete. In this case, connection instance is NULL. As such, it cannot be dereferenced without testing it first. This should fix github coverity report #2739. No backport needed.	2024-10-02 17:06:31 +02:00
Christopher Faulet	cea1379cf1	BUG/MINOR: http-ana: Disable fast-fwd for unfinished req waiting for upgrade If a request is waiting for a protocol upgrade but it is not finished, the data fast-forwarding is disabled. Otherwise, the request analyzers will miss the end of the message. This case is possible since the commit 01fb1a54 ("BUG/MEDIUM: mux-h1/mux-h2: Reject upgrades with payload on H2 side only"). Indeed, before, a protocol upgrade was not allowed for requests with payload. But it is now possible and this comes with a side-effect. It is not really satisfying but for now there is no other way to sync the muxes and the applicative stream. It seems to be a reasonable fix for now, waiting for a deeper refactoring. This patch must be backported with the commit above.	2024-10-02 10:31:40 +02:00
Christopher Faulet	267ba1d889	MINOR: mux-h1: Use a dedicated function to conditionnaly set EOI flag on SE The same conditions are evaluated in h1_process_demux() and h1_fastfwd() to know if SE_FL_EOI flag must be set or not on the sedesc. So now, a dedicated function is used.	2024-10-02 10:22:51 +02:00
Christopher Faulet	6b39e245e1	BUG/MINOR: mux-h1: Fix condition to set EOI on SE during zero-copy forwarding During zero-copy data forwarding, the producer must set the EOI flag on the SE when the end of the message is reached. It is already done but there is a case where this flag is set while it should not be. When a request wants to perform a protocol upgrade and it is waiting for the server response, the flag must not be set because the HTTP message is finished but some data are possibly still expected, depending on the server response. On a 101-switching-protocol, more data will be sent because the producer is switched to TUNNEL state. So, now, the right condition is used. In DONE state, SE_FL_EOI flag is set on the sedesc iff: - it is the response - it is the request and the response is also in DONE state - it is a request but not a protocol upgrade nor a CONNECT This patch must be backported as far as 2.9.	2024-10-02 10:22:51 +02:00
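The three-way condition listed in commit 6b39e245e1 can be read as a small predicate. The following is only a minimal sketch of that reading: `struct msg_view` and `must_set_eoi()` are hypothetical names chosen for illustration and are not HAProxy's actual types or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one direction of an HTTP/1 exchange.
 * None of these fields mirror real HAProxy structures. */
struct msg_view {
    bool is_response;   /* true for the response, false for the request */
    bool done;          /* this message reached the DONE state */
    bool peer_done;     /* the other direction also reached DONE */
    bool is_upgrade;    /* the request asks for a protocol upgrade */
    bool is_connect;    /* the request uses the CONNECT method */
};

/* Returns true when the end-of-input flag should be reported, following
 * the three cases listed in the commit message. */
static bool must_set_eoi(const struct msg_view *m)
{
    assert(m != NULL);

    if (!m->done)
        return false;               /* only evaluated in DONE state */
    if (m->is_response)
        return true;                /* case 1: it is the response */
    if (m->peer_done)
        return true;                /* case 2: request, response also DONE */
    return !m->is_upgrade && !m->is_connect; /* case 3: plain request */
}
```

With this reading, a finished upgrade request whose response is still pending does not get the flag, which matches the behavior the fix describes.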
Christopher Faulet	27ee292731	MINOR: tcpcheck: Add support for an option host header value for httpchk option Support for headers and body hidden in the version for the "option httpchk" directive was removed. However a Host header is mandatory for HTTP/1.1 requests and some servers may return an error if it is not set. For now, to add it, an "http-check send" rule must be added. But it is not really handy to use an extra config line for this purpose. So now, it is possible to set the host header value, a log-format string, as extra argument to "option httpchk" directive. It must be the fourth argument: option httpchk GET / HTTP/1.1 www.srv.com While this patch is not a bug fix, it is simple enough to be backported if necessary. On 2.9 and older, lf_init_expr() does not exist and LIST_INIT() must be used instead.	2024-10-02 10:22:51 +02:00
Christopher Faulet	c39c351a73	MINOR: trace: Be able to chain commands for a source in one line In the configuration file or on the CLI, configuring traces for a specific source is a bit painful because this must be done in several lines. Thanks to this patch, it is now possible to fully configure traces for a source in one line. For instance, the following on the CLI: trace h1 sink stderr; trace h1 level developer; trace h1 verbosity complete; trace h1 start now can now be replaced by: trace h1 sink stderr level developer verbosity complete start now The same is true for the 'trace' directives in the configuration file.	2024-10-02 10:22:51 +02:00
Christopher Faulet	15a520d474	MINOR: config/trace: Add a 'traces' section to declare debug traces It is no longer supported to declare debug traces, via 'trace' directive, in a global section. A 'traces' section must be used instead. The syntax of the 'trace' directive in these sections remains the same. But it is no longer experimental. The main reason for this change is to avoid having a ring section defined before a global one. Indeed, for now, forward declarations of ring sections are not supported. So to configure traces, you had to add a ring section before the global one defining the traces. Most of the time, that meant having two global sections: global [...] # global settings ring <name> [...] global [...] # trace config In addition, it will be possible to easily extend the traces section by adding some new directives.	2024-10-02 10:22:51 +02:00
Willy Tarreau	53f52e67a0	BUG/MEDIUM: queue: always dequeue the backend when redistributing the last server An interesting bug was revealed by commit 5541d4995d ("BUG/MEDIUM: queue: deal with a rare TOCTOU in assign_server_and_queue()"). When shutting down a server to redistribute its connections, no check is made on the backend's queue. If we're turning off the last server and the backend has pending connections, these ones will wait there till the queue timeout. But worse, since the commit above, we can enter an endless loop in the following situation: - streams are present in the backend's queue - streams are purged on the last server via srv_shutdown_streams() - that one calls pendconn_redistribute(srv) which does not purge the backend's pendconns - a stream performs some load balancing and enters assign_server_and_queue() - assign_server() is called in turn - the LB algo is non-deterministic and there are entries in the backend's queue. The function notices it and returns SRV_STATUS_FULL - assign_server_and_queue() calls pendconn_add() to add the connection to the backend's queue - on return, pendconn_must_try_again() is called, it figures there's no stream served anymore on the server nor the proxy, so it removes the pendconn from the queue and returns 1 - assign_server_and_queue() loops back to the beginning to try again, while the conditions have not changed, resulting in an endless loop. Ideally a change count should be used in the queues so that it's possible to detect that some dequeuing happened and/or that a last stream has left. But that wouldn't completely solve the problem that is that we must never ever add to a queue when there's no server streams to dequeue the new entries. The current solution consists in making pendconn_redistribute() take care of the proxy after the server in case there's no more server available on the proxy. 
It at least ensures that no pending streams are left in the backend's queue when shutting streams down or when the last server goes down. The try_again loop remains necessary to deal with inevitable races during pendconn additions. It could be limited to a few rounds, though, but it should never trigger if the conditions are sufficient to permit it to converge. One way to reproduce the issue is to run a config with a single server with maxconn 1 and plenty of threads, then run in loops series of: "disable server px/s;shutdown sessions server px/s; wait 100ms server-removable px/s; show servers conn px; enable server px/s" on the CLI at ~10/s while injecting with around 40 concurrent conns at 40-100k RPS. In this case in 10s - 1mn the crash can appear with a backtrace like this one for at least 1 thread: #0 pendconn_add (strm=strm@entry=0x17f2ce0) at src/queue.c:487 #1 0x000000000064797d in assign_server_and_queue (s=s@entry=0x17f2ce0) at src/backend.c:1064 #2 0x000000000064a928 in srv_redispatch_connect (s=s@entry=0x17f2ce0) at src/backend.c:1962 #3 0x000000000064ac54 in back_handle_st_req (s=s@entry=0x17f2ce0) at src/backend.c:2287 #4 0x00000000005ae1d5 in process_stream (t=t@entry=0x17f4ab0, context=0x17f2ce0, state=<optimized out>) at src/stream.c:2336 It's worth noting that other threads may often appear waiting after the poller and one in server_atomic_sync() waiting for isolation, because the event that is processed when shutting the server down is consumed under isolation, and having less threads available to dequeue remaining requests increases the probability to trigger the problem, though it is not at all necessary (some less common traces never show them). This should carefully be backported wherever the commit above was backported.	2024-10-01 18:57:51 +02:00
Amaury Denoyelle	8d68717a41	MEDIUM: quic: refactor buffered STREAM ACK consuming For the moment, streamdesc layer can only deal with in-order ACK at the stream level. Received out-of-order ACKs are buffered in a tree attached to a streambuf instance. Previously, the caller of qc_stream_desc_ack() was responsible for implementing consumption of these buffered ACKs. Refactor this by implementing it directly at the streamdesc layer within qc_stream_desc_ack(). This simplifies quic_rx ACK handling and ensures buffered ACKs are consumed as soon as possible.	2024-10-01 16:22:23 +02:00
Amaury Denoyelle	cc4384aeb7	MEDIUM: quic: handle out-of-order ACK at streamdesc layer qc_stream_desc_ack() is the entrypoint for streamdesc layer to handle a new acknowledgement of previously emitted STREAM data. Previously, it was only able to deal with in-order ACK offset. The caller was responsible for buffering out-of-order ACKs. Change this by dealing with the latter case directly in qc_stream_desc_ack(). This notably simplifies ACK handling in quic_rx module.	2024-10-01 16:22:20 +02:00
Amaury Denoyelle	62558a9285	MINOR: quic: move buffered ACK to streambuf QUIC streamdesc layer is used to manage QUIC MUX stream txbuf data storage until acknowledgment. Currently, it only supports in-order acknowledgment at the stream level. This requires to be able to buffer out-of-order ACKs until they can be handled. Previously, these ACKs were stored in a tree to the streamdesc instance. Move this indexed storage at the streambuf instance. This commit is purely an architecture change. However, it will allow to extend ACK management in future patches, such as the ability to merge overlapping out-of-order ACKs.	2024-10-01 16:19:42 +02:00
Amaury Denoyelle	943e48dadd	MINOR: quic: store streambuf in a streamdesc tree qc_stream_desc layer is used by QUIC MUX to store emitted STREAM data until their acknowledgement. Each stream with Tx capability can allocate its own qc_stream_desc. In turn, each stream desc can have one or multiple data buffers. This is useful when a MUX stream releases a buffer and allocates a new one, to preserve bandwidth without waiting to receive all acknowledgements of the previous buffer. Each buffer is encapsulated in a qc_stream_buf structure. Previously, it was stored as a list into qc_stream_desc. Change this storage to use a tree instead. Each buffer is indexed by its offset. This commit does not introduce functional changes. However, this rearchitecture will be necessary for future commits to extend ACK management, which requires fetching individual buffer instances by their offset, not just the first or last element of a streamdesc.	2024-10-01 16:19:41 +02:00
Amaury Denoyelle	f4a83fbb14	MINOR: quic: do not remove qc_stream_desc automatically on ACK handling qc_stream_desc_ack() is used to handle ACK received for STREAM frame. It removes acknowledged data from their underlying buffer. If all data were removed after ACK handling, qc_stream_desc instance would automatically be freed at the end of qc_stream_desc_ack(). However, this renders the function complicated to use. Simplify this by removing this automatic removal. Now, caller is responsible to check after ACK handling if qc_stream_desc instance can be removed. This is easily done using qc_stream_desc_done() helper.	2024-10-01 16:19:25 +02:00
Amaury Denoyelle	db68f8ed86	MINOR: quic: refactor STREAM room notification qc_stream_desc is an intermediary layer between QUIC MUX and quic_conn. It is a facility which permits to store data to emit and keep them for retransmission until acknowledgment. This layer is responsible for notifying QUIC MUX each time a buffer is freed. This is necessary as MUX buffer allocation is limited by the underlying congestion window size. Refactor this to use a mechanism similar to send notification. A new callback notify_room can now be registered to qc_stream_desc instance. This is set by QUIC MUX to qmux_ctrl_room(). On MUX QUIC free, special care is now taken to reset notify_room callback to NULL. Thanks to this refactoring, further adjustments have been made to refine the architecture. One of them is the removal of qc_stream_desc QC_SD_FL_OOB_BUF, which is now converted to a MUX layer flag QC_SF_TXBUF_OOB.	2024-10-01 16:19:25 +02:00
Amaury Denoyelle	d7f4e5abf0	MEDIUM: quic: strengthen MUX send notification The previous commit implemented a refactor of MUX send notification from quic_conn layer. With this new architecture, a proper callback is defined for each qc_stream_desc instance. This architecture change allows to simplify notification from quic_conn layer. First, ensure the MUX callback properly ignores retransmission of an already emitted frame. Luckily, this can be handled easily by comparing offsets and FIN status. Also, each QCS instance can now be unregistered from send notification just prior to qc_stream_desc release. This ensures a QCS is never manipulated from quic_conn after its emission ends. Both these changes render the send notification more robust. As a nice effect, flag QUIC_FL_CONN_TX_MUX_CONTEXT can be removed as it is now unneeded.	2024-10-01 16:19:25 +02:00
Amaury Denoyelle	6ad99af0a9	MINOR: quic: refactor MUX send notification For STREAM emission, MUX QUIC generates one or several frames and emits them via qc_send_mux(). Lower layer may use them as-is, or split them into smaller chunks to fit in a QUIC packet. It is then responsible for notifying the MUX to report the amount of data sent. Previously, this was done via a direct call from quic_conn to MUX using qcc_streams_sent_done(). Modify this to have a better isolation across layers. Define a send callback handled by the qc_stream_desc instance. This allows the MUX to register each QCS instance individually to the renamed qmux_ctrl_send() which replaces qcc_streams_sent_done(). At quic_conn layer, qc_stream_desc_send() can be used now. This is a wrapper to qc_stream_desc layer to invoke the send callback if registered. This mechanism of qc_stream_desc callback should be extended later to implement other notifications across the QUIC stack.	2024-10-01 16:19:25 +02:00
Amaury Denoyelle	4859d8e71d	MINOR: quic: remove unneeded notification of txbuf room When a stream buffer is freed, qc_stream_desc notifies MUX. This is useful if MUX is waiting for Tx buffer allocation. Remove this notification in qc_stream_desc(). This is because the function is called when all stream data have been acknowledged and thus notified. This function can also be called with some data unacknowledged, but in this case this is only true just before connection closure. As such, it is unnecessary to notify the MUX in this condition.	2024-10-01 16:19:25 +02:00
Amaury Denoyelle	12782da020	MINOR: mux-quic: strengthen qcs_send_metadata() usage This function is reserved for QCS instance where no data was emitted. A BUG_ON() ensures this by checking that streamdesc buf_list is empty. However, this condition would not be enough if data were previously emitted but already fully acknowledged. Thus, extend the condition by also checking the streamdesc ack_offset is 0.	2024-10-01 16:17:03 +02:00
Amaury Denoyelle	fdc16c1e01	MINOR: quic: ensure txbuf realloc is only performed on empty buffer QUIC application protocol layer has the ability to either allocate a standard buffer or a smaller one. The latter is useful when only small data are transferred to prevent consuming too much of the QUIC MUX buffer window. This operation is performed using qc_stream_buf_realloc(). Add a new BUG_ON() in it to ensure no data is present in the buffer. Indeed, this would cause data loss, or even a crash when trying to acknowledge data. Note that for the moment qc_stream_buf_realloc() is only used for HTTP/3 headers transmission, and this usage conforms to the new BUG_ON. This commit is thus not a bug fix, but only strengthens the API.	2024-10-01 11:51:51 +02:00
Amaury Denoyelle	172404a8ec	MINOR: mux-quic: complete Tx infos for QCS dump Complete debug info when a QCS instance is dumped either on traces or show quic. Display the value of Tx offset both soft and real, along with the current flow-control limit.	2024-10-01 11:51:51 +02:00
Valentine Krasnobaeva	f18b52cc80	MINOR: cfgparse-global: add dedicated parser for env keywords This commit prepares the config parser to support MODE_DISCOVERY and, thus, refactored master-worker mode. The latter implies, that master process reads only the 'DISCOVERY' tagged keywords from the global section and it must call for this an appropriate keyword parser. So, let's move the code, which parses env keywords, from the global section parser to its own keyword registered parser.	2024-10-01 10:37:29 +02:00
Valentine Krasnobaeva	df68f7ec96	BUG/MINOR: cfgparse-global: fix allowed args number for setenv Keywords setenv and presetenv take 2 arguments: variable name and value. So, the total number, that should be passed to alertif_too_many_args is 2 ("setenv <name> <value>") instead of 3. For alertif_too_many_args the first argument index is 0. This should be backported in all stable versions.	2024-10-01 10:35:09 +02:00
Christopher Faulet	273d322b6f	MINOR: stream/stats: Expose the total number of streams ever created in stats A shared counter is added in the thread context to track the total number of streams created on the thread. This number is then reported in stats. It will be a useful information to diagnose some bugs.	2024-09-30 16:55:53 +02:00
Christopher Faulet	18ee22ff76	MINOR: stream/stats: Expose the current number of streams in stats A shared counter is added in the thread context to track the current number of streams. This number is then reported in stats. It will be a useful information to diagnose some bugs.	2024-09-30 16:55:53 +02:00
Christopher Faulet	6a94b7419e	MINOR: stream: Support dynamic changes of the number of connection retries Thanks to the previous patch, it is now possible to add an action to dynamically change the maximum number of connection retries for a stream. "set-retries" action may now be used to do so, from a "tcp-request content" or a "http-request" rule. This action accepts an expression or an integer between 0 and 100. The integer value is checked during the configuration parsing and leads to an error if it is not in the expected range. However, for the expression, the value is retrieved at runtime. So, invalid values are just ignored. Too high a value is forbidden to avoid any trouble. 100 retries already seems to be an amazingly high value. In addition, the option is only available on backend or listen sections. Because the max retries is limited to 100 at most, it can be stored as an unsigned short. This saves some space in the stream structure.	2024-09-30 16:55:53 +02:00
Christopher Faulet	91e785edc9	MINOR: stream: Rely on a per-stream max connection retries value Instead of directly relying on the backend parameter to limit the number of connection retries, we now use a per-stream value. This value is by default inherited from the backend value when it is set. So for now, there is no change except the stream value is used instead of the backend value. But thanks to this change, it will be possible to dynamically change this value.	2024-09-30 16:55:53 +02:00
Christopher Faulet	0d91de2be4	MINOR: action: Export release_expr_int_action() release function This function was only used by TCP actions and was private to tcp_act.c file. However, it makes sense to make it public to be used by any action relying on an int-or-expression argument.	2024-09-30 16:55:53 +02:00
Christopher Faulet	688abb6f30	BUG/MINOR: mcli: Pretend the mux have more data to deliver between two commands Since the commit "OPTIM: stconn: Don't pretend mux have more data to deliver on EOI/EOS/ERROR", the SC no longer pretends its mux has more data to deliver when one of the EOI/EOS/ERROR flags is set on its sedesc. However, for the master cli, it is an issue because any EOI/EOS at the end of a command is in fact detected on the attempt to get the next command. To do so, the stream is reset. Because of the commit above, the next receive is never performed. To fix the issue, when the stream is reset, the front SC pretends its mux has more data to deliver. This patch must only be backported if the commit above is backported.	2024-09-30 16:55:53 +02:00
Christopher Faulet	bca5e14235	OPTIM: stconn: Don't pretend mux have more data to deliver on EOI/EOS/ERROR While running some benchmarks on 3.0, we encountered a small loss in requests/sec on small objects compared to 2.8. After bisecting the issue, it appeared that this was introduced when the mux-to-mux zero-copy data forwarding was implemented in 2.9-dev8. Extra subscribes on receives at the end of the message were responsible for the loss. A basic configuration, sending H2 requests to a H1 server returning responses without payload, is enough to observe the issue. With the following command, we can observe a huge increase of epoll_ctl calls on 2.9/3.x: h2load -c 100 -m 10 -n 100000 http://... On 2.8 we have around 3200 calls to epoll_ctl against more than 20k on 3.1. The fix seems obvious. After a receive, there is no reason to state that a mux has more data to deliver if the EOI/EOS/ERROR flag was set on the stream-endpoint descriptor. With this change, extra calls to epoll_ctl disappear. However it is a sensitive part so it is important to keep an eye on it and to not backport it. Thanks to Willy and Emeric for having spotted the issue.	2024-09-30 16:55:48 +02:00
Willy Tarreau	11051ed9c7	OPTIM: channel: speed up co_getline()'s search of the end of line Previously, co_getline() was essentially used for occasional parsing in peers's banner or Lua, so it could afford to read one character at a time. However now it's also used on the TCP log path, where it can consume up to 40% CPU as mentioned in GH issue #2731. Let's speed it up by using memchr() to look for the LF, and copying the data at once using memcpy(). Previously it would take 2.44s to consume 1 GB of log on a single thread of a Core i7-8650U, now it takes 1.56s (-36%).	2024-09-30 11:36:39 +02:00
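The idea behind commit 11051ed9c7 (find the LF with memchr(), then copy the whole span with a single memcpy() instead of moving one character at a time) can be sketched as follows. This is an illustration only, not the co_getline() code: it assumes a contiguous input buffer, whereas the real function works on a channel buffer, and `copy_line()` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies at most <dst_len> bytes from <src> into <dst>, stopping right
 * after the first LF if one is found within that window. Returns the
 * number of bytes copied. The caller can tell whether a full line was
 * read by checking that the last copied byte is '\n'.
 * Assumption: <src> is contiguous (no buffer wrapping is handled here). */
static size_t copy_line(char *dst, size_t dst_len,
                        const char *src, size_t src_len)
{
    size_t max = src_len < dst_len ? src_len : dst_len;
    const char *lf;
    size_t len;

    assert(dst != NULL && src != NULL);

    /* one library call locates the end of line... */
    lf = memchr(src, '\n', max);
    len = lf ? (size_t)(lf - src) + 1 : max;

    /* ...and one call copies everything up to and including it */
    memcpy(dst, src, len);
    return len;
}
```

Both calls are usually heavily optimized by the C library, which is the kind of gain the commit message reports for the byte-by-byte loop it replaces.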
Willy Tarreau	7caf073faa	MINOR: tools: do not attempt to use backtrace() on linux without glibc The function is provided by glibc. Nothing prevents us from using our own outside of glibc there (tested on aarch64 with musl). We still do not enable it by default as we don't yet know if all archs work well, but it's sufficient to pass USE_BACKTRACE=1 when building with musl to verify it's OK.	2024-09-29 09:52:23 +02:00
Willy Tarreau	1c4776dbc3	BUILD: tools: only include execinfo.h for the real backtrace() function No need to include this possibly non-existing file when using our own backtrace() implementation, it's only needed for the libc-provided one. Because of this it's currently not possible to build musl with backtrace enabled.	2024-09-29 09:52:23 +02:00
Willy Tarreau	1d403caf8a	MINOR: server: make srv_shutdown_sessions() call pendconn_redistribute() When shutting down server sessions, the queue was not considered, which is a problem if some element reached the queue at the moment the server was going down, because there will be no more requests to kick them out of it. Let's always make sure we scan the queue to kick these streams out of it and that they can possibly find a more suitable server. This may make a difference in the time it takes to shut down a server on the CLI when lots of servers are in the queue. It might be interesting to backport this to 3.0 but probably not much further.	2024-09-27 19:01:38 +02:00
Willy Tarreau	1385e33eb0	BUG/MINOR: queue: make sure that maintenance redispatches server queue Turning a server to maintenance currently doesn't redispatch the server queue unless there's an explicit "option redispatch" and no "option persist", while the former has never really been the purpose of this test. Better refine this so that forced maintenance also causes the queue to be flushed, and possibly redispatched unless the proxy has option persist. This way now when turning a server to maintenance, the queue is immediately flushed and streams can decide what to do. This can be backported, though there's no need to go far since it was never directly reported and only noticed as part of debugging some rare "shutdown sessions" strangeness, which it might participate to.	2024-09-27 18:54:07 +02:00
Willy Tarreau	a4d04c649a	BUG/MINOR: server: make sure the HMAINT state is part of MAINT In 1.8 when adding "set server fqdn" with commit b418c1228c ("MINOR: server: cli: Add server FQDNs to server-state file and stats socket."), the HMAINT flag was not made part of the MAINT ones, so technically speaking when changing the FQDN, the server is not completely considered as in maintenance mode. In its defense, the code location around that was completely messy, with the aggregator flag being hidden between other values and purposely but discretely ignoring one of the flags, so the comments were updated to make the intent clearer (particularly regarding CMAINT which looked like it was also forgotten while it was on purpose). This can be backported anywhere.	2024-09-27 18:40:15 +02:00
Willy Tarreau	b8e3b0a18d	BUG/MEDIUM: stream: make stream_shutdown() async-safe The solution found in commit b500e84e24 ("BUG/MINOR: server: shut down streams under thread isolation") to deal with inter-thread stream shutdown doesn't work fine because there exists code paths involving a server lock which can then deadlock on thread_isolate(). A better solution then consists in deferring the shutdown to the stream itself and just wake it up for that. The only thing is that TASK_WOKEN_OTHER is a bit too generic and we need to pass at least 2 types of events (SF_ERR_DOWN and SF_ERR_KILLED), so we're now leveraging the new TASK_F_UEVT1 and _UEVT2 flags on the task's state to convey these info. The caller only needs to wake the task up with these flags set, and the stream handler will then finish the job locally using stream_shutdown_self(). This needs to be carefully backported to all branches affected by the dequeuing issue and containing any of the 5541d4995d ("BUG/MEDIUM: queue: deal with a rare TOCTOU in assign_server_and_queue()"), and/or b11495652e ("BUG/MEDIUM: queue: implement a flag to check for the dequeuing").	2024-09-27 12:15:41 +02:00
Willy Tarreau	b5281283bb	MINOR: task: define two new one-shot events for use with WOKEN_OTHER or MSG TASK_WOKEN_MSG only says "someone sent you a message" but doesn't convey any info about the message. TASK_WOKEN_OTHER says "you're woken for another reason" but doesn't tell which one. Most often they're used as-is by the task handlers to report very specific situations. For some important control notifications, having the ability to modulate the message a little bit is useful, so let's define two user event types UEVT1 and UEVT2 to be used in conjunction with TASK_WOKEN_MSG or _OTHER so that the application can know that a specific condition was explicitly requested. It will be used this way: task_wakeup(s->task, TASK_WOKEN_MSG | TASK_F_UEVT1); or: task_wakeup(s->task, TASK_WOKEN_OTHER | TASK_F_UEVT2); Since events are cumulative, keep in mind not to consider a 3rd value as the combination of EVT1+EVT2; these really mean that the two events appeared (though in unspecified order).	2024-09-27 11:56:10 +02:00
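The remark in commit b5281283bb about events being cumulative can be illustrated with a toy model. Everything below is hypothetical: the `EX_*` constants, their values, `struct ex_task` and both helpers are invented for this sketch and do not reflect HAProxy's task API or flag layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values, chosen only so that each bit is distinct. */
#define EX_WOKEN_MSG   0x01u
#define EX_WOKEN_OTHER 0x02u
#define EX_F_UEVT1     0x10u
#define EX_F_UEVT2     0x20u

/* Toy task: wakeup reasons accumulate in <state> until the handler runs. */
struct ex_task {
    unsigned int state;
};

/* Records a wakeup; flags are OR-ed in, never overwritten. */
static void ex_task_wakeup(struct ex_task *t, unsigned int flags)
{
    assert(t != NULL);
    t->state |= flags;
}

/* Tests each user event on its own bit. */
static bool ex_has_event(const struct ex_task *t, unsigned int evt)
{
    return (t->state & evt) != 0;
}
```

Because two wakeups may be merged before the handler runs, seeing both bits set means both events were requested, in unspecified order. A handler should therefore test each bit separately rather than treat the combination as a third, distinct event.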
Willy Tarreau	d1c398b786	Revert "BUG/MINOR: server: shut down streams under thread isolation" This reverts commit b500e84e24fd19ccbcdf4fae5165aeb07e46bd67. Thread isolation does not work well for this: there exist code paths which already hold the server's lock and result in a deadlock. Let's revert that and address it better without isolation.	2024-09-27 10:17:31 +02:00
Aurelien DARRAGON	0c94b2efec	REGTESTS: add a test for proxy "log-steps" Now that proxy "log-steps" keyword was implemented and is usable since ("MEDIUM: log: consider log-steps proxy setting for existing log origins") let's add some tests for it in reg-tests/log/log_profile.vtc.	2024-09-26 16:53:07 +02:00
Aurelien DARRAGON	7ad4e00c1f	DOC: config: document proxy "log-steps" keyword Now that "log-steps" proxy keyword is functional, let's add some documentation and usage examples for it.	2024-09-26 16:53:07 +02:00
Aurelien DARRAGON	e3eb6a9035	MEDIUM: log: consider log-steps proxy setting for existing log origins During tcp/http transaction processing, haproxy may produce logs at different steps during the processing (accept, connect, request, response, close). But the behavior is hardly configurable because haproxy will only emit a single log per transaction, and by default it will try to produce the log once all log aliases or fetches used in the logformat could be satisfied, which means the log is often emitted during connection teardown, unless "option logasap" is used. We were often asked to have a way to emit multiple logs for a single transaction, for instance emitting a log during accept, then request, response and close; see GH #401 for more context. Thanks to "log-steps" keyword introduced by commit "MINOR: log: introduce "log-steps" proxy keyword", it is now possible to explicitly configure when logs should be generated by haproxy when processing a transaction. This commit adds the required checks so that log-steps proxy option is properly considered for existing logs generated by haproxy. If "log-steps" is not specified on the proxy, the old behavior is preserved. Note: a slight cpu overhead should only be visible when the "log-steps" keyword is used, due to the implementation relying on eb32 lookup instead of basic bitfield check as described in "MINOR: proxy: add log_steps struct member". However, the default behavior shouldn't be affected. When combining log-steps with log-profiles, user has the ability to explicitly control how and when haproxy should generate logs during requests handling.	2024-09-26 16:53:07 +02:00
Aurelien DARRAGON	4189eb7aca	MINOR: log: add log_orig_proxy() helper function Function may be used on proxy where log-steps are used to check if a given log origin should be handled or not.	2024-09-26 16:53:07 +02:00
Aurelien DARRAGON	c043d5d372	MINOR: log: introduce "log-steps" proxy keyword For now it is only available for proxies with frontend capability because log-steps are only evaluated under sess_log() or strm_log() which essentially focus on the frontend side when it comes to log settings so it's better to keep it this way for better consistency, at least for now. For now the setting does nothing (it is not considered during runtime), it will be implemented and documented in upcoming commits.	2024-09-26 16:53:07 +02:00
Aurelien DARRAGON	9341792baf	MINOR: proxy: add log_steps struct member add proxy->conf.log_steps eb32 root tree which will be used to store the log origin identifiers that should result in haproxy emitting a log as configured by the user using upcoming "log-steps" proxy keyword. It was chosen to use eb32 tree instead of simple bitfield because despite the slight overhead it is more future-proof given that we already implemented the prerequisites for seamless custom log origins registration that will also be usable from "log-steps" proxy keyword.	2024-09-26 16:53:07 +02:00
Aurelien DARRAGON	b882402a29	MINOR: log: support extra log origins for '%OG' alias Following previous commits, let's improve log_orig_to_str() so that extra log origins (registered through log_orig_register()) can be translated to string from origin ID. For that, it is required to add eb_32 tree node to log_origin struct in order to enable quick integer lookup during runtime. Slow name lookup using the list is acceptable for config parsing, but it is not the case during runtime when log_orig_to_str() is expected to be used. Also, to prevent duplicated info, get rid of ->id field and use ->tree.key instead	2024-09-26 16:53:07 +02:00
Aurelien DARRAGON	f8bb9d5c57	MINOR: log: explicitly handle extra log origins as error when relevant Thanks to previous commit, we can now check for log_orig optional flags in functions taking struct log_orig as parameter. Let's take this opportunity to add the LOG_ORIG_FL_ERROR flag and check this flag at a few places to handle the log message differently because if the flag is set then the caller expects the log to be handled as an error explicitly. e.g.: in _process_send_log_override(), if the flag is set, use the error log format instead of the dedicated one.	2024-09-26 16:53:07 +02:00
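The wrapper introduced in 3c15ee05e9 and the error flag added in f8bb9d5c57 might look roughly like the sketch below. The enum members, the flag value, the field types and the `demo_*` helpers are assumptions for illustration only; the commit messages only confirm the names `enum log_orig_id`, `struct log_orig` and `LOG_ORIG_FL_ERROR`.

```c
/* Hypothetical subset of origin identifiers for this illustration. */
enum log_orig_id {
	DEMO_ORIG_UNSPEC = 0,
	DEMO_ORIG_TXN_CLOSE,
	DEMO_ORIG_EXTRA,
};

/* Hypothetical flag values. */
#define LOG_ORIG_FL_NONE   0x0000
#define LOG_ORIG_FL_ERROR  0x0001

/* Wraps an origin id with optional flags. */
struct log_orig {
	enum log_orig_id id;
	unsigned short flags;
};

/* Build a wrapper from an id and flags (mirrors the described helper). */
static struct log_orig demo_make_log_orig(enum log_orig_id id, unsigned short flags)
{
	struct log_orig orig = { .id = id, .flags = flags };

	return orig;
}

/* Pick the error format when the caller flagged the origin as an error,
 * otherwise keep the format dedicated to that origin.
 */
static const char *demo_pick_format(struct log_orig orig,
                                    const char *dedicated_fmt,
                                    const char *error_fmt)
{
	return (orig.flags & LOG_ORIG_FL_ERROR) ? error_fmt : dedicated_fmt;
}
```

Passing the wrapper by value keeps call sites short while leaving room for more flags later, which seems to be the intent described in the two commits.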
Aurelien DARRAGON	3c15ee05e9	MINOR: log: introduce log_orig flags Rename 'enum log_orig' to 'enum log_orig_id', since this enum specifically contains the log origin ids. Add 'struct log_orig' which wraps 'enum log_orig' with optional flags (no flags defined for now). Add log_orig() helper func that takes id and flags as parameter and returns log_orig struct initialized with input arguments. Update functions taking log origin as parameter so they explicitly take log orig id or log orig wrapper as argument depending on the level of context expected by the function.	2024-09-26 16:53:07 +02:00
Aurelien DARRAGON	6567e37680	MINOR: log: handle extra log origins in _process_send_log_override() Thanks to the previous commit, it is now possible to register additional log origins that may be used from log-profile section as 'on' steps. As such, let's make _process_send_log_override() function aware of them by trying to lookup in the tree of extra logging steps in the default switch-case catchall. If the log origin id matches with the id of the extra logging step, we use the associated log format instead of the "any" log format.	2024-09-26 16:53:07 +02:00
Aurelien DARRAGON	818475c5cc	MINOR: log: introduce extra log profile steps add a way to register additional log origins using log_origin_register() that may be used as log profile steps from log profile sections. For now this does nothing as no extra origins are registered and extra log origins are not yet considered for runtime logging paths. When specifying an extra logging step for on <step> under log-profile section, the logging step is stored within a binary tree for efficient lookup during runtime. No performance impact should be expected if extra log origins are not being used, and slight performance impact if extra log origins are used. Don't forget to update the documentation when new log origins are added (both %OG log alias and on <step> log-profile keyword are concerned).	2024-09-26 16:53:07 +02:00
Aurelien DARRAGON	facf259d88	MINOR: log: fix indent in strm_log() 8f34320e15 ("MINOR: log: provide log origin in logformat expressions using '%OG'") caused wrong indent in strm_log()	2024-09-26 16:53:07 +02:00
Oliver Dala	a889413f5e	BUG/MEDIUM: cli: Deadlock when setting frontend maxconn The proxy lock state isn't passed down to relax_listener through dequeue_proxy_listeners, which causes a deadlock in relax_listener when it tries to get that lock. Backporting: Older versions didn't have relax_listener and directly called resume_listener in dequeue_proxy_listeners. lpx should just be passed directly to resume_listener then. The bug was introduced in commit 001328873c352e5e4b1df0dcc8facaf2fc1408aa [cf: This patch should fix the issue #2726. It must be backported as far as 2.4]	2024-09-25 17:12:11 +02:00
Christopher Faulet	96edacc546	DEV: flags/applet: decode appctx flags Decode APPCTX flags via appctx_show_flags() function.	2024-09-24 18:26:36 +02:00
Christopher Faulet	14a413033c	BUG/MEDIUM: cli: Be sure to catch immediate client abort A client abort while nothing was sent is properly handled except when this immediately happens after the connection was accepted. The read0 event is caught before the CLI applet is created. In that case, the shutdown is not handled and the applet is no longer woken up, so the stream remains blocked and no timeouts are armed. The bug was due to the fact that when the applet I/O handler was called for the first time, the applet context was initialized and nothing more was performed. A shutdown, if any, would be handled on the next call. In that case, it was too late. Now, after the init step, we loop to eval the first command. There is no command here but the shutdown will be tested. This patch should fix the issue #2727. It must be backported to 3.0.	2024-09-24 18:01:38 +02:00
Aurelien DARRAGON	d622f9d5b6	MEDIUM: mailers: warn about deprecated legacy mailers As mentioned in 2.8 announce on the mailing list [1] and on the wiki [2], use of legacy mailers is now deprecated and will not be supported anymore starting with version 3.3. Use of Lua script (AKA Lua mailers) is now encouraged (and fully supported since 2.8) for this purpose, as it offers more flexibility (e.g: alerts can be customized) and is more future-proof. Configurations relying on legacy mailers will now raise a warning. Users willing to keep their existing mailers config in a working state should simply add the following line to their global section: # mailers.lua file as provided in the git repository # adjust path as needed lua-load examples/lua/mailers.lua [1]: https://www.mail-archive.com/haproxy@formilux.org/msg43600.html [2]: https://github.com/haproxy/wiki/wiki/Breaking-changes	2024-09-23 20:16:27 +02:00
Aurelien DARRAGON	cdaa749ba0	REGTESTS: log: fix log-profile.vtc Add missing wait for Slg4 introduced in f8299bc ("MINOR: log: "drop" support for log-profile steps"), and missing barrier increase due to the use of barrier sync, which could have resulted in the regtest being timing-sensitive and thus less reliable. Also, the "error" check in Slg4 wasn't even considered because it is emitted by frontend 4, not frontend 2. No backport needed unless f8299bc is.	2024-09-23 20:15:47 +02:00
Willy Tarreau	fdf38ed7fc	BUG/MINOR: proxy: also make the cli and resolvers use the global name As detected by ASAN on the CI, two places still using strdup() on the proxy names were left by commit b325453c3 ("MINOR: proxy: use the global file names for conf->file"). No backport is needed.	2024-09-21 20:08:06 +02:00
Willy Tarreau	b500e84e24	BUG/MINOR: server: shut down streams under thread isolation Since the beginning of thread support, the shutdown of streams attached to a server was run under the server's lock, but that's not sufficient. It indeed turns out that shutting down streams (either from the CLI using "shutdown sessions server XXX" or due to "on-error shutdown-sessions") iterates over all the streams to shut them down, but stream_shutdown() has no way to protect its actions against concurrent actions from the stream itself on another thread, and streams offer no such provisions anyway. The impact is some rare but possible crashes when shutting down streams from the CLI in competition with high server traffic. The probability is low enough to mark it minor, though it was observed in the field. At least since 2.4 the streams are arranged in per-thread lists, so it likely would be possible using the event subsystem to delegate these events to dedicated per-thread tasks which would address the problem. But server streams don't get killed often enough to justify such extra complexity, so better just run the loop under thread isolation. It also shows that the internal API could probably be improved to support a lighter thread exclusion instead of full isolation: various places want to only exclude one thread and here it could work. But again there's no point doing this for now. This patch should be backported to all stable branches. It's important to carefully check that this srv_shutdowns_streams() function is never called itself under isolation in older versions (though at first glance it looks OK).	2024-09-21 19:35:35 +02:00
Willy Tarreau	e77c73316a	MEDIUM: cfgparse: warn about deprecated use of duplicate server names As discussed below, there are too many problems and limitations caused by still supporting duplicate server names. That's already particularly complicated and dissuasive to use since it requires these servers to have explicit IDs to be accepted. Let's now warn on any duplicate, even with explicit IDs and remind that this will become forbidden in 3.3. Link: https://www.mail-archive.com/haproxy@formilux.org/msg45185.html	2024-09-20 17:15:11 +02:00
Willy Tarreau	029d75df1e	OPTIM: cfgparse: speed up duplicate server detection Surprisingly, the duplicate server name detection has never made use of the names tree, so lookups were still in O(N^2). It took 1 second to validate 50k servers spread into 25 backends at 2k per backend. By simply using the tree (and since the current server already is in the tree), we just have to walk using ebpt_prev_dup to visit previous servers with the same name. We can then detect which ones conflict without having an ID set and error. The config check time is now 1/4 of the previous one for 2k servers per backend, and more importantly it will make it simpler to check for any duplicates later.	2024-09-20 17:14:50 +02:00
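The idea behind 029d75df1e is that an ordered index places equal names next to each other, so each server only needs to be compared with its duplicate neighbours rather than with every other server. The sketch below swaps in a plainly different technique, sorting an array with `qsort()` instead of walking an ebtree with `ebpt_prev_dup`, just to show that effect. The `demo_server` structure and the conflict rule (same name with at least one of the two lacking an explicit ID) are assumptions made for this illustration.

```c
#include <stdlib.h>
#include <string.h>

struct demo_server {
	const char *name;
	int has_explicit_id; /* non-zero if an explicit ID was configured */
};

/* qsort() comparator ordering servers by name. */
static int demo_cmp_by_name(const void *a, const void *b)
{
	const struct demo_server *sa = a;
	const struct demo_server *sb = b;

	return strcmp(sa->name, sb->name);
}

/* Sort servers by name, then compare each one with its predecessor only.
 * A pair counts as a conflict when both share a name and at least one of
 * them has no explicit ID. Cost is O(N log N) instead of O(N^2).
 */
static size_t demo_count_conflicts(struct demo_server *srv, size_t count)
{
	size_t i, conflicts = 0;

	qsort(srv, count, sizeof(*srv), demo_cmp_by_name);
	for (i = 1; i < count; i++) {
		if (strcmp(srv[i - 1].name, srv[i].name) != 0)
			continue;
		if (!srv[i - 1].has_explicit_id || !srv[i].has_explicit_id)
			conflicts++;
	}
	return conflicts;
}
```

A tree indexed by name gives the same adjacency property without a separate sorting pass, which is why reusing the existing names tree was enough to remove the quadratic behaviour.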
Willy Tarreau	ccd1ecba1d	MEDIUM: cfgparse: drop duplicate named defaults sections after use It has never been permitted to explicitly reference named defaults sections for which there are duplicate names. This means that when a duplicate defaults section is found, there's no point in keeping it since it will never be used for lookups, so it can be dropped. However, some such defaults sections might have some rules in them that are implicitly referenced by proxies placed after them. In this case they cannot be removed. What is done here is that upon each new named section creation, if another one is found with the same name, its config location is stored into the new proxy's {prev_file,prev_line} pair, and the old section is either destroyed if its refcount is null, or just unindexed. The dup check when creating a new proxy now consists in checking the prev_line instead of performing a dup lookup on the defaults section. This will guarantee that we can't find duplicate defaults sections in their tree anymore, while still keeping track of what's allocated and releasing everything upon exit. Beyond the consistency gain, there are nice savings for large configs involving many defaults sections: a test with 300k sections saved about 1.9 GB of RAM, and started 25% faster likely thanks to spending less time allocating memory.	2024-09-20 16:35:32 +02:00
Willy Tarreau	c8b813771d	MINOR: proxy: add a list of orphaned defaults sections We'll soon delete unreferenced and duplicated named defaults sections from the list of proxies. The problem with this is that this list (in fact a name-based tree) is used to release all of them at the end. Let's add a list of orphaned defaults sections, typically those containing "http-check send" statements or various other rules, and that are implicitly inherited by a proxy hence have a non-zero refcount while also having a name. This now makes it possible to remove them from the name index while still keeping their memory around for the lifetime of the process, and cleaning it at the end.	2024-09-20 15:59:04 +02:00
Willy Tarreau	cb4c236fac	BUG/MINOR: cfgparse: detect another uncaught case of duplicate defaults The following sequence was not properly caught: defaults def backend back from def defaults def But this one was: defaults def defaults def backend back from def Let's check when defaults are declared that they're not already referenced. Better not backport this. While it will catch broken configs (possibly some with backends pasted after the wrong defaults), these might still work by accident. It may be reported as a diag warning though.	2024-09-20 15:58:10 +02:00
Willy Tarreau	5b221d1e41	CLEANUP: cfgparse: factor proxy vs log-forward collisions This simplifies the check added in 1a38684fbc ("MEDIUM: cfgparse: detect collisions between defaults and log-forward"), by factoring it with the other existing one. The tests are ugly in that code because a first block tests pure proxies, a second one proxies or defaults and inside that one we have special cases for defaults. Let's just move the tests to the "any proxy type" block.	2024-09-20 14:13:14 +02:00
Willy Tarreau	b325453c36	MINOR: proxy: use the global file names for conf->file Proxy file names are assigned a bit everywhere (resolvers, peers, cli, logs, proxy). All these elements were enumerated and now use copy_file_name(). The only ha_free() call was turned to drop_file_name(). As a bonus side effect, a 300k backend config saved 14 MB of RAM.	2024-09-19 15:38:19 +02:00
Willy Tarreau	9ab21a3c2d	CLEANUP: stick-table: make the file location point to a global file name The file name used to point to the calling function's stack for stick tables, which was OK during parsing but remained dangling afterwards. At least it was already marked const so as not to accidentally free it. Let's make it point to a file_name_node now.	2024-09-19 15:38:19 +02:00
Willy Tarreau	d6c060c5ae	MINOR: tools: add minimal file name management In proxies, stick-tables, servers, etc... at plenty of places we store a file name and a line number. Some file names are the result of strdup() (e.g. in proxies), others not (e.g. stick-tables) and leave dangling pointers at the end of parsing. The risk of double-free is not null either. In order to stop this, let's first add a simple tool that allows to register short strings inside a global list, these strings happening to be server names. The strings are either duplicated and stored upon failure to find them, or just added to this storage. Since file names are not expected to disappear before the end of the process, for now we don't even implement refcounting, and we free them all at the end. There's already a drop_file_name() function to reset the pointer like ha_free() used to do, and even if not strictly needed it's a good habit to get used to doing it. The strings are returned as const so that they're stored as-is in structs, and that nasty free() calls are easily caught. The pointer points to the char[] storage inside the node itself. This way later if we want to implement refcounting, it will be trivial to just look up a string and change its associated node's refcount. If needed, comparisons can also be made on pointers. For now they're not used yet and are released on deinit().	2024-09-19 15:36:58 +02:00
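The file name registry described in d6c060c5ae can be sketched as follows. This is only an illustration under stated assumptions: the node layout, the linear list walk and every `demo_*` name are hypothetical, and the real `copy_file_name()` and `drop_file_name()` may differ in signature and behaviour.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node: the string is stored inside the node itself. */
struct demo_file_name_node {
	struct demo_file_name_node *next;
	char name[]; /* flexible array member holding the copy */
};

static struct demo_file_name_node *demo_file_names; /* global list head */

/* Return a shared const copy of <name>, registering it on first use.
 * Returns NULL on NULL input or allocation failure.
 */
static const char *demo_copy_file_name(const char *name)
{
	struct demo_file_name_node *node;
	size_t len;

	if (!name)
		return NULL;
	for (node = demo_file_names; node; node = node->next)
		if (strcmp(node->name, name) == 0)
			return node->name;

	len = strlen(name);
	node = malloc(sizeof(*node) + len + 1);
	if (!node)
		return NULL;
	memcpy(node->name, name, len + 1);
	node->next = demo_file_names;
	demo_file_names = node;
	return node->name;
}

/* Forget a reference without freeing it: the list owns the storage. */
static void demo_drop_file_name(const char **name)
{
	*name = NULL;
}

/* Release every registered name, as would be done once at exit. */
static void demo_free_file_names(void)
{
	while (demo_file_names) {
		struct demo_file_name_node *next = demo_file_names->next;

		free(demo_file_names);
		demo_file_names = next;
	}
}
```

Because every caller receives a pointer into the same node, two structures referring to the same file can compare pointers instead of strings, and the `const` qualifier makes an accidental `free()` stand out at compile time.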
Willy Tarreau	30a0e93fe6	[RELEASE] Released version 3.1-dev8 Released version 3.1-dev8 with the following main changes : - DOC: configuration: place the HAPROXY_HTTP_LOG_FMT example on the correct line - MINOR: mux-h1: Set EOI on SE during demux when both side are in DONE state - BUG/MEDIUM: mux-h1/mux-h2: Reject upgrades with payload on H2 side only - REGTESTS: h1/h2: Update script testing H1/H2 protocol upgrades - BUG/MEDIUM: clock: detect and cover jumps during execution - BUG/MINOR: pattern: prevent const sample from being tampered in pat_match_beg() - BUG/MEDIUM: pattern: prevent uninitialized reads in pat_match_{str,beg} - BUG/MEDIUM: pattern: prevent UAF on reused pattern expr - MEDIUM: ssl/cli: "dump ssl cert" allow to dump a certificate in PEM format - BUG/MAJOR: mux-h1: Wake SC to perform 0-copy forwarding in CLOSING state - BUG/MINOR: h1-htx: Don't flag response as bodyless when a tunnel is established - REGTESTS: fix random failures with wrong_ip_port_logging.vtc under load - BUG/MINOR: pattern: do not leave a leading comma on "set" error messages - REGTESTS: shorten a bit the delay for the h1/h2 upgrade test - MINOR: server: allow init-state for dynamic servers - DOC: server: document what to check for when adding new server keywords - MEDIUM: h1: Accept invalid T-E values with accept-invalid-http-response option - BUG/MINOR: polling: fix time reporting when using busy polling - BUG/MINOR: clock: make time jump corrections a bit more accurate - BUG/MINOR: clock: validate that now_offset still applies to the current date - BUG/MEDIUM: queue: implement a flag to check for the dequeuing - OPTIM: sample: don't check casts for samples of same type - OPTIM: vars: remove the unneeded lock in vars_prune_* - OPTIM: vars: inline vars_prune() to avoid many calls - MINOR: vars: remove the emptiness tests in callers before pruning - IMPORT: import cebtree (compact elastic binary trees) - OPTIM: vars: use a cebtree instead of a list for variable names - OPTIM: vars: 
use multiple name heads in the vars struct - BUG/MINOR: peers: local entries updates may not be advertised after resync - DOC: config: Explicitly list relaxing rules for accept-invalid-http-* options - MINOR: proxy: Rename accept-invalid-http-* options - DOC: configuration: Remove dangerous directives from the proxy matrix - BUG/MEDIUM: sc_strm/applet: Wake applet after a successfull synchronous send - BUG/MEDIUM: cache/stats: Wait to have the request before sending the response - BUG/MEDIUM: promex: Wait to have the request before sending the response - MINOR: clock: test all clock_gettime() return values - MEDIUM: clock: collect the monotonic time in clock_local_update_date() - MEDIUM: clock: opportunistically use CLOCK_MONOTONIC for the internal time - MEDIUM: clock: use the monotonic clock for idle time calculation - MEDIUM: clock: don't compute before_poll when using monotonic clock - BUG/MINOR: fix missing "log-format overrides previous 'option tcplog clf'..." detection - BUG/MINOR: fix missing "'option httpslog' overrides previous 'option tcplog clf'..." detection - BUG/MINOR: cfgparse-listen: fix option httpslog override warning message - BUG/MINOR: cfgparse: detect incorrect overlap of same backend names - MEDIUM: cfgparse: warn about proxies having the same names - DOC: management: add init-state to add server keywords - BUG/MINOR: mux-quic: report glitches to session - BUILD: cebtree: silence a bogus gcc warning on impossible code paths - MEDIUM: cfgparse: warn about colliding names between defaults and proxies - MEDIUM: cfgparse: detect collisions between defaults and log-forward	2024-09-18 22:29:08 +02:00
Willy Tarreau	1a38684fbc	MEDIUM: cfgparse: detect collisions between defaults and log-forward Sadly, when log-forward were introduced they took great care of avoiding collision with regular proxies but defaults were missed (they need to be explicitly checked for). So now we have to move them to a warning for 3.1 instead of rejecting them.	2024-09-18 18:08:15 +02:00
Willy Tarreau	d8f4b07e40	MEDIUM: cfgparse: warn about colliding names between defaults and proxies In order to complete the checks added in 303a66573d ("MEDIUM: cfgparse: warn about proxies having the same names"), we also need to warn about regular proxies having the same name as defaults sections as well as defaults sections having the same name as proxies, since defaults sections are inherently proxies, albeit stored in a separate list for now.	2024-09-18 18:08:06 +02:00
Willy Tarreau	8df44eea6d	BUILD: cebtree: silence a bogus gcc warning on impossible code paths gcc-12 and above report a wrong warning about a negative length being passed to memcmp() on an impossible code path when built at -O0. The pattern is the same at a few places, basically: int foo(int op, const void *a, const void *b, size_t size, size_t arg) { if (op == 1) // arg is a strict multiple of size return memcmp(a, b, arg - size); return 0; } ... int bar() { return foo(0, a, b, sizeof(something), 0); } It might be possible to invent dummy values for the "len" argument above in the real code, but that significantly complicates it and as usual can easily result in introducing undesired bugs. Here we take a different approach consisting in shutting the -Wstringop-overread warning on gcc>=12 at -O0 since that's the only condition that triggers it. The issue was reported to and confirmed by the gcc team here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114622 No backport needed, but this should be upstreamed into cebtree after checking that all involved macros are available.	2024-09-18 17:42:52 +02:00
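One way to restrict such a suppression to gcc 12 and above at -O0 is sketched below. It assumes that gcc leaves `__OPTIMIZE__` undefined when not optimizing and uses a file-wide `#pragma GCC diagnostic`; the exact condition and placement used in 8df44eea6d are not shown in the commit message, so treat this as an illustration rather than the actual patch. The `demo_*` functions reproduce the pattern quoted in the message.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: __OPTIMIZE__ is undefined at -O0, and clang is excluded
 * explicitly since it also defines __GNUC__.
 */
#if defined(__GNUC__) && !defined(__clang__) && (__GNUC__ >= 12) && !defined(__OPTIMIZE__)
#pragma GCC diagnostic ignored "-Wstringop-overread"
#endif

/* Same shape as the pattern from the commit message. */
static int demo_foo(int op, const void *a, const void *b, size_t size, size_t arg)
{
	if (op == 1) /* arg is a strict multiple of size */
		return memcmp(a, b, arg - size);
	return 0;
}

/* With op == 0 the memcmp() branch can never run, yet the compiler may
 * still evaluate "0 - size" statically and warn about the huge length.
 */
static int demo_bar(void)
{
	static const char x[4] = "abc";
	static const char y[4] = "abc";

	return demo_foo(0, x, y, sizeof(x), 0);
}
```

Limiting the pragma to the one compiler and optimization level that trigger the false positive keeps the warning active everywhere else.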
Amaury Denoyelle	fcd6d29acf	BUG/MINOR: mux-quic: report glitches to session Glitch counter was implemented for QUIC/HTTP3. The counter is stored in the QCC MUX connection instance. However, this is never reported at the session level which is necessary if glitch counter is tracked via a stick-table. To fix this, use session_add_glitch_ctr() in various QUIC MUX functions which may increment glitch counter. This should be backported up to 3.0.	2024-09-18 16:11:03 +02:00
Damien Claisse	2c783c25d6	DOC: management: add init-state to add server keywords Commit ce6a621ae allowed init-state to be used for dynamic servers but I forgot to update management doc.	2024-09-17 22:44:53 +02:00
Willy Tarreau	303a66573d	MEDIUM: cfgparse: warn about proxies having the same names As discussed below, there are too many problems and uncaught bugs in the parser when trying to support proxies having similar names but different types. There's specific code to detect the presence of stick-tables in a pair of such proxies for example. It's even possible that certain combinations of backend+listen that were not previously detected have some nasty side effects. According to the proposal in the discussion, this is now deprecated in 3.1 (thus we emit a warning) and will become forbidden in 3.3. A backport might be useful, but reporting a diag_warning only, not a classical warning, so as not to break setups running in zero-warning mode. It was verified with a config involving all 9 combinations of (frontend,backend,listen) followed by one of the same three that all collisions are now properly blocked and that only back+front are kept and emit a warning. Link: https://www.mail-archive.com/haproxy@formilux.org/msg45185.html	2024-09-17 19:55:00 +02:00
Willy Tarreau	c70906c8a1	BUG/MINOR: cfgparse: detect incorrect overlap of same backend names As reported below, it's possible to declare a backend then a proxy with the same name, because for the proxy we check a frontend capability (the first one to be tested): backend b listen b bind :8888 Let's check the two capabilities in this case and not just the frontend. Better not backport this, as there's a risk of breakage of existing setups that work by accident. It might make sense to report them as diag warnings though. Link: https://www.mail-archive.com/haproxy@formilux.org/msg45185.html	2024-09-17 19:55:00 +02:00
Aurelien DARRAGON	17e52c922b	BUG/MINOR: cfgparse-listen: fix option httpslog override warning message "option httpslog" override warning message used to be reported as "option httplog", probably as a result of copy paste without adjusting the context. Let's fix that to prevent emitting confusing warning messages. The issue exists since 98b930d ("MINOR: ssl: Define a default https log format"), thus it should be backported up to 2.6	2024-09-17 15:40:02 +02:00
Aurelien DARRAGON	bc4bf5779f	BUG/MINOR: fix missing "'option httpslog' overrides previous 'option tcplog clf'..." detection Same as b85edd44db0 ("BUG/MINOR: fix missing "log-format overrides previous 'option tcplog clf'..." detection") but for "option httpslog" keyword. No backport needed unless fd48b28 ("MINOR: Implements new log format of option tcplog clf") is.	2024-09-17 15:40:02 +02:00
Aurelien DARRAGON	607b9adc9b	BUG/MINOR: fix missing "log-format overrides previous 'option tcplog clf'..." detection In commit fd48b28315 ("MINOR: Implements new log format of option tcplog clf") "option tcplog clf" detection was correctly added for "option tcplog" and "option httplog", but "log-format" case was overlooked. Thus, this config would report erroneous warning message: defaults option tcplog clf log-format "ok" [WARNING] (727893) : config : parsing [test.conf:3]: 'log-format' overrides previous 'log-format' in 'defaults' section. No backport needed unless fd48b28315 is.	2024-09-17 14:41:58 +02:00
Willy Tarreau	499e057644	MEDIUM: clock: don't compute before_poll when using monotonic clock There's no point keeping both clocks up to date; if the monotonic clock is ticking, let's just refrain from updating the wall clock one before polling since we won't use it. We still do it after polling however as we need a wall clock time to communicate with outside. This saves one gettimeofday() call per loop and two timeval comparisons.	2024-09-17 09:08:10 +02:00
Willy Tarreau	24496803d1	MEDIUM: clock: use the monotonic clock for idle time calculation By just keeping a copy of the last known value before entering polling, we can apply the same algorithm as we're currently using, except that it's now applied to the monotonic clock instead of the wall clock, when it's detected that it's ticking. This improves idle time calculation accuracy by making it independent on the wall clock.	2024-09-17 09:08:10 +02:00
Willy Tarreau	4150851ce5	MEDIUM: clock: opportunistically use CLOCK_MONOTONIC for the internal time We already collect CLOCK_MONOTONIC when it's available when leaving the poller, but it's only used for profiling. The functions that return it set the value to zero when it's not available, so we can use that to detect if it works or not. The idea is that if the monotonic time is non-zero, it is ticking and usable, then we use it for now_ns, otherwise we use the corrected date. We continue to apply the now_offset to the returned value because it helps forcing an early time wrap-around. Proceeding like this presents two benefits: - on systems supporting this, the time is much more robust against time changes - when it works, it saves us from having to go through the time correction code, which is usually cheap, but better avoided anyway. Note that idle time calculation continues to rely on the wall-clock time.	2024-09-17 09:08:10 +02:00
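The fallback logic described in 4150851ce5, together with the return value check from 42e699903e, can be sketched as below. The `demo_*` function names and the `offset` parameter (standing in for now_offset) are assumptions for illustration; the real code also corrects the wall-clock date for jumps, which is omitted here.

```c
#include <stdint.h>
#include <sys/time.h>
#include <time.h>

/* Return CLOCK_MONOTONIC in nanoseconds, or 0 if it is unavailable or
 * if clock_gettime() fails, so that callers can detect "not ticking".
 */
static uint64_t demo_mono_ns(void)
{
#if defined(CLOCK_MONOTONIC)
	struct timespec ts;

	if (clock_gettime(CLOCK_MONOTONIC, &ts) == 0)
		return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
#endif
	return 0;
}

/* Wall-clock time in nanoseconds (no jump correction in this sketch). */
static uint64_t demo_wall_ns(void)
{
	struct timeval tv;

	gettimeofday(&tv, NULL);
	return (uint64_t)tv.tv_sec * 1000000000ULL + (uint64_t)tv.tv_usec * 1000ULL;
}

/* Prefer the monotonic clock when it is ticking, otherwise fall back to
 * the wall clock. <offset> is applied in both cases.
 */
static uint64_t demo_now_ns(uint64_t offset)
{
	uint64_t mono = demo_mono_ns();

	return (mono ? mono : demo_wall_ns()) + offset;
}
```

Using zero as the "unavailable" marker avoids a separate status flag, at the cost of assuming the monotonic clock never legitimately reads exactly zero.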
Willy Tarreau	f793845f4a	MEDIUM: clock: collect the monotonic time in clock_local_update_date() Now we collect this clock in clock_local_update_date(), the closest to the poller, which is also used when busy-polling, and the value is set into the thread's curr_mono_time which did not exist before. Later, clock_leaving_poll() just sets the prev_mono_time value from the curr_ one instead of retrieving the time at this specific point. It also means that the monotonic time will now also cover the time needed to update the global time, which should be negligible. Note that we don't collect the CPU time in the clock_local_update_date() function even though it's tempting, because when doing busy-polling, it would be collected on each round while being useless. Doing so will make sure that the local time always knows the monotonic time when it is available.	2024-09-17 09:08:10 +02:00
Willy Tarreau	42e699903e	MINOR: clock: test all clock_gettime() return values Till now we were only using clock_gettime() for profiling, so if it would fail it was no big deal. We intend to use it as the main clock as well now, so we need to more reliably detect its absence or failure and gracefully fall back to other options. Without the test we would return anything present in the stack, which is neither clean nor easy to detect.	2024-09-17 09:08:10 +02:00
Christopher Faulet	bb2a2bc5f2	BUG/MEDIUM: promex: Wait to have the request before sending the response It is similar to the previous fix about the stats applet ("BUG/MEDIUM: cache/stats: Wait to have the request before sending the response"). However, for promex, there is no crash and no obvious issue. But it depends on the filter. Indeed, the request is used by promex, independently of whether it was considered as forwarded or not. So if it is modified by the filter, modifications are just ignored. Same bug, same fix. We now wait for the request to be forwarded before processing it and producing the response.	2024-09-16 22:56:28 +02:00
Christopher Faulet	afc50f2445	BUG/MEDIUM: cache/stats: Wait to have the request before sending the response It seems obvious. On a classical workflow, the request headers analysis is finished when these applets are woken up for the first time. So they don't take care to really have the request to start to process it and to send the response. But with a filter, it is possible to stop the request analysis after the applet creation. If this happens for the stats applet, this leads to a crash because we retrieve the request start-line without checking if it is available. For the cache applet, the response is just immediately sent. And here it is a problem if the compression is enabled. In that case too, this may lead to a crash because the compression may be enabled but not initialized. For a true server, there is no issue because the connection cannot be established. The server is chosen only after the request analysis. The issue with applets is that once created, an applet is quickly switched to the established state. So it is probably a point that must be carefully reviewed and probably reworked. In the meantime, as a fix, in the cache and the stats applet, we just take care to have the request before sending the response. This will do the trick. The patch must be backported as far as 2.6. On 2.6, the patch must be adapted.	2024-09-16 22:55:40 +02:00
Christopher Faulet	5fc12b0afd	BUG/MEDIUM: sc_strm/applet: Wake applet after a successfull synchronous send On a synchronous send from the stream to an applet, if some data were sent, we must take care to wake the applet up. It is important because if everything was sent at this stage, there is no other chance to wake the applet up, mainly because SE_FL_WAIT_DATA flag is set on the applet's sedesc in sc_update_tx() at the end of process_stream(). This flag prevents any wakeup of the applet for a send event. It is not necessary for a mux because the mux stream is called when a synchronous send from the stream is performed. So it is responsible for waking the mux connection if necessary. This patch must be backported to 3.0.	2024-09-16 22:55:40 +02:00
Christopher Faulet	655124f5cc	DOC: configuration: Remove dangerous directives from the proxy matrix For now, that only concerns accept-invalid-http-{request/response} and accept-unsafe-violations-in-http-{request/response}. But the idea is to make dangerous directives hard to find. It is one more way to discourage anyone from using them. And, optionally, it is also handy because it keeps the matrix aligned on 80 columns.	2024-09-16 22:55:25 +02:00
Christopher Faulet	4de6632693	MINOR: proxy: Rename accept-invalid-http-* options With these options, it is possible to accept some invalid messages that may be considered as unsafe and may result in vulnerabilities. The naming is not explicit enough on this point. These options must really be considered as dangerous and only used as a temporary workaround. Unfortunately, when used, it is probably because there are some legacy and unsupported applications in place. Nevermind. The documentation warns about the use of these options. Now the name of the options itself is a warning. So now, "accept-invalid-http-request" and "accept-invalid-http-response" options are deprecated and replaced by "accept-unsafe-violations-in-http-request" and "accept-unsafe-violations-in-http-response" options.	2024-09-16 22:55:25 +02:00
Christopher Faulet	0f4fad5291	DOC: config: Explicitly list relaxing rules for accept-invalid-http-* options From time to time, new exceptions are added in the HTTP parsing (most of the time H1) to not reject some invalid messages sent by legacy applications. But the documentation of accept-invalid-http-request and accept-invalid-http-response options is not very clear. So, now, there is an explicit list of relaxing rules for both options.	2024-09-16 22:55:24 +02:00
Aurelien DARRAGON	1e0920f855	BUG/MINOR: peers: local entries updates may not be advertised after resync Since commit 864ac3117 ("OPTIM: stick-tables: check the stksess without taking the read lock"), when entries for a local table are learned from another peer upon resynchro, and this is the only peer haproxy speaks to, local updates on such entries are not advertised to the peer anymore, until they eventually expire and can be recreated upon local updates. This is due to the fact that ts->seen is always set to 0 when creating a new entry, and also when touch_remote is performed on the entry. Indeed, while 864ac3117 attempts to avoid useless updates, it didn't consider entries learned from a remote peer. Such entries are exclusively learned in peer_treat_updatemsg(): once the entry is created (or updated) with new data, touch_remote is used to commit the change. However, unlike touch_local, entries committed using touch_remote will not be advertised to the peer from which the entry was just learned (otherwise we would enter a looping situation). Due to the above patch, once an entry is learned from the (unique) remote peer, 'seen' will be stuck to 0 so it will never be advertised for its whole lifetime. Instead, when entries are learned from a peer, we should consider that the peer that taught us the entry has seen it. To do this, let's set seen=1 in peer_treat_updatemsg() after calling touch_remote(). This way, if we happen to perform updates on this entry, it will be properly advertised to relevant peers. This patch should not affect the performance gain documented in 864ac3117 given that the test scenario didn't involve entries learned by remote peers, but solely locally created entries advertised to remote peers upon updates. This should be backported in 3.0 with 864ac3117.	2024-09-16 14:06:39 +02:00
Willy Tarreau	5d350d1e50	OPTIM: vars: use multiple name heads in the vars struct Given that the original list-based version was using a list head as the root of the variables, while the tree is using a single pointer, it made sense to reuse that space to place multiple roots, indexed on the lower bits of the name hash. Two roots slightly increase the performance level, but the best gain is obtained with 4 roots. The performance is now always above that of the list, even with small counts, and with 100 vars, it's 21% higher than before, or 67% higher than with the list. We keep the same lock (it could have made sense to use one lock per head), because most of the variables in large configs are attached to a stream or a session, hence are not shared between threads. Thus there's no point in sharding the pointer.	2024-09-15 23:51:51 +02:00
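Selecting one of several roots from the lower bits of the name hash, as the commit above describes, might look like the sketch below. The structure and names (`vars_sketch`, `SKETCH_NAME_ROOTS`, `sketch_root_index`) are hypothetical and only illustrate the indexing; they are not copied from HAProxy's source.

```c
#include <stdint.h>

/* The commit reports the best gain with 4 roots. A power of two lets
 * the index be taken with a simple mask on the low bits of the hash.
 */
#define SKETCH_NAME_ROOTS 4

struct sketch_node;	/* opaque tree node, hypothetical */

/* Hypothetical variables container with several tree roots in the
 * space previously used by a list head.
 */
struct vars_sketch {
	struct sketch_node *name_root[SKETCH_NAME_ROOTS];
};

/* Pick the root from the lower bits of the 64-bit name hash. */
static inline unsigned int sketch_root_index(uint64_t name_hash)
{
	return (unsigned int)(name_hash & (SKETCH_NAME_ROOTS - 1));
}
```

A lookup would then descend only the tree at `name_root[sketch_root_index(hash)]`, so each tree holds roughly a quarter of the names.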
Willy Tarreau	47ec7c681e	OPTIM: vars: use a cebtree instead of a list for variable names Configs involving many variables can start to eat a lot of CPU in name lookups. The reason is that the names themselves are dynamic in that they are relative to dynamic objects (sessions, streams, etc), so there's no fixed index for example. The current implementation relies on a standard linked list, and in order to speed up lookups and avoid comparing strings, only a 64-bit hash of the variable's name is stored and compared everywhere. But with just 100 variables and 1000 accesses in a config, it's clearly visible that variable name lookup can reach 56% CPU with a config generated this way: for i in {0..100}; do printf "\thttp-request set-var(txn.var%04d) int(%d)" $i $i; for j in {1..10}; do [ $i -lt $j ] || printf ",add(txn.var%04d)" $((i-j)); done; echo; done The performance on a 4-core skylake 4.4 GHz reaches 85k RPS with a perf profile showing: Samples: 170K of event 'cycles', Event count (approx.): 142378815419 Overhead Shared Object Symbol 56.39% haproxy [.] var_to_smp 6.65% haproxy [.] var_set.part.0 5.76% haproxy [.] sample_process_cnv 3.23% haproxy [.] sample_conv_var2smp 2.88% haproxy [.] sample_conv_arith_add 2.33% haproxy [.] __pool_alloc 2.19% haproxy [.] action_store 2.13% haproxy [.] vars_get_by_desc 1.87% haproxy [.] smp_dup [above, var_to_smp() calls var_get() under the read lock]. By switching to a binary tree, the cost is significantly lower, the performance reaches 117k RPS (+37%) with this profile: Samples: 170K of event 'cycles', Event count (approx.): 142323631229 Overhead Shared Object Symbol 40.22% haproxy [.] cebu64_lookup 7.12% haproxy [.] sample_process_cnv 6.15% haproxy [.] var_to_smp 4.75% haproxy [.] cebu64_insert 3.79% haproxy [.] sample_conv_var2smp 3.40% haproxy [.] cebu64_delete 3.10% haproxy [.] sample_conv_arith_add 2.36% haproxy [.] action_store 2.32% haproxy [.] __pool_alloc 2.08% haproxy [.] vars_get_by_desc 1.96% haproxy [.] smp_dup 1.75% haproxy [.] var_set.part.0 1.74% haproxy [.] cebu64_first 1.07% [kernel] [k] aq_hw_read_reg 1.03% haproxy [.] pool_put_to_cache 1.00% haproxy [.] sample_process The performance lowers a bit earlier than with the list however. What can be seen is that the performance maintains a plateau till 25 vars, starts degrading a little bit for the tree while it remains stable till 28 vars for the list. Then both cross at 42 vars and the list continues to degrade doing a hyperbole while the tree resists better. The biggest loss is at around 32 variables where the list stays 10% higher. Regardless, given the extremely narrow band where the list is better, it looks relevant to switch to this in order to preserve the almost linear performance of large setups. For example at 1000 variables and 10k lookups, the tree is 18 times faster than the list. In addition this reduces the size of the struct vars by 8 bytes since there's a single pointer, though it could make sense to re-invest them into a secondary head for example.	2024-09-15 23:49:01 +02:00
Willy Tarreau	a0205f9de4	IMPORT: import cebtree (compact elastic binary trees) This is an import of the compact elastic binary trees at commit a9cd84a ("OPTIM: descent: better prefetch less and for writes when deleting") These will be used to replace certain lists (and possibly certain tree nodes as well). They're as fast (or even faster) than ebtrees for lookups, as fast for insertion and slower for deletion, and a node only uses 2 pointers (like a list). The only changes were cebtree.h where common/tools.h was replaced with ebtree.h which we already have and already provides the needed functions and macros, and the addition of a wrapper cebtree-prv.h in src/ to redirect to import/cebtree-prv.h.	2024-09-15 23:44:59 +02:00
Willy Tarreau	6e92988e20	MINOR: vars: remove the emptiness tests in callers before pruning All callers of vars_prune_* currently check the list for emptiness. Let's leave that to vars_prune() itself, it will ease some changes in the code. Thanks to the previous inlining of the vars_prune() function, there's no performance loss, and even a very tiny 0.1% gain.	2024-09-15 23:44:16 +02:00
Willy Tarreau	2c1a9c3a43	OPTIM: vars: inline vars_prune() to avoid many calls Many configs don't have variables and call it for no reason, and even configs with variables don't necessarily have some in all scopes.	2024-09-15 23:42:09 +02:00
Willy Tarreau	aad6b771dd	OPTIM: vars: remove the unneeded lock in vars_prune_* vars_prune() and vars_prune_all() take the variable lock while purging all variables from a head. However this is not needed: - proc scope variables are only purged during deinit, hence no lock is needed ; - all other scopes are attached to entities bound to a single thread so no lock is needed either. Removing the lock saves about 0.5% CPU on variables-intensive setups, but above all simplify the code, so let's do it.	2024-09-15 23:05:50 +02:00
Willy Tarreau	51ade2f1db	OPTIM: sample: don't check casts for samples of same type Originally when converters were created, they were mostly for casting types. Nowadays we have many arithmetic converters to perform operations on integers, and a number of converters operating on strings. Both of these categories most often do not need any cast since the input and output types are the same, which is visible as the cast function is c_none. However, profiling shows that when heavily using arithmetic converters, it's possible to spend up to ~7% of the time in sample_process_cnv(), a good part of which is only in accessing the sample_casts[] array. Simply avoiding this lookup when input and output types are equal saves about 2% CPU on such setups doing intensive use of converters.	2024-09-15 12:43:56 +02:00
Willy Tarreau	b11495652e	BUG/MEDIUM: queue: implement a flag to check for the dequeuing As unveiled in GH issue #2711, commit 5541d4995d ("BUG/MEDIUM: queue: deal with a rare TOCTOU in assign_server_and_queue()") does have some side effects in that it can occasionally cause an endless loop. As Christopher analysed it, the problem is that process_srv_queue(), which uses a trylock in order to leave only one thread in charge of the dequeueing process, can lose the lock race against pendconn_add(). If this happens on the last served request, then there's no more thread to deal with the dequeuing, and assign_server_and_queue() will loop forever on a condition that was initially expected to be extremely rare (and still is, except that now it can become sticky). Previously what was happening is that such queued requests would just time out and since that was very rare, nobody would notice. The root of the problem really is that trylock. It was added so that only one thread dequeues at a time but it doesn't offer only that guarantee since it also prevents a thread from dequeuing if another one is in the process of queuing. We need a different criterion. What we're doing now is to set a flag "dequeuing" in the server, which indicates that one thread is currently in the process of dequeuing requests. This one is atomically tested, and only if no thread is in this process, then the thread grabs the queue's lock and dequeues. This way it will be serialized with pendconn_add() and no request addition will be missed. It is not certain whether the original race covered by the fix above can still happen with this change, so better keep that fix for now. Thanks to @Yenya (Jan Kasprzak) for the precise and complete report allowing to spot the problem. This patch should be backported wherever the patch above was backported.	2024-09-13 08:35:47 +02:00
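The "dequeuing" flag idea from the commit above can be illustrated with C11 atomics. This is a simplified sketch under stated assumptions: the structure and function names are hypothetical, the queue is reduced to two counters, and the queue lock that the real code takes after winning the flag is only mentioned in a comment.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, much reduced view of a server and its queue. */
struct server_sketch {
	atomic_bool dequeuing;	/* set while one thread is dequeuing */
	int queued;		/* pending requests (stand-in for the queue) */
	int served;		/* requests handed to the server */
};

/* Returns true if this caller was in charge of dequeuing, false if
 * another thread already was. Unlike a trylock on the queue lock, the
 * flag only excludes other dequeuers, not a thread adding a request.
 */
static bool sketch_try_dequeue(struct server_sketch *srv)
{
	/* atomically test and set: the previous value tells who won */
	if (atomic_exchange(&srv->dequeuing, true))
		return false;

	/* The real code would take the queue lock here (a blocking lock),
	 * which serializes it with request additions.
	 */
	while (srv->queued > 0) {
		srv->queued--;
		srv->served++;
	}

	atomic_store(&srv->dequeuing, false);
	return true;
}
```

The point of the design is that losing the race against an addition no longer means "nobody dequeues": only a concurrent dequeuer can make the caller back off.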
Willy Tarreau	adaba6f904	BUG/MINOR: clock: validate that now_offset still applies to the current date We want to make sure that now_offset is still valid for the current date: another thread could very well have updated it by detecting a backwards jump, and at the very same moment the time got fixed again, that we retrieve and add to the new offset, which results in a larger jump. Normally, for this to happen, it would mean that before_poll was also affected by the jump and was detected before and bounded within 2 seconds, resulting in max 2 seconds perturbations. Here we try to detect this situation and fall back to re-adjusting the offset instead. It's more of a strengthening of what's done by commit e8b1ad4c2b ("BUG/MEDIUM: clock: also update the date offset on time jumps") than a pure fix, in that the issue was not directly observed but it's visibly possible by reading the code, so this should be backported along with the patch above. This is related to issue GH #2704. Note that this could be simplified in terms of operations by migrating the deadlines to nanoseconds, but this was the path to least intrusive changes.	2024-09-12 19:09:19 +02:00
Willy Tarreau	af48e4cc6b	BUG/MINOR: clock: make time jump corrections a bit more accurate Since commit e8b1ad4c2b ("BUG/MEDIUM: clock: also update the date offset on time jumps") we try to update the now_offset based on the last known valid date. But if it's off compared to the global_now_ns date shared by other threads, we'll get the time off a little bit. When this happens, we should consider the most recent of these dates so that if the global date was already known to be more recent, we should use it and stick to it. This will avoid setting too large an offset that could in turn provoke a larger jump on another thread. This is related to issue GH #2704. This can be backported to other branches having the patch above.	2024-09-12 18:27:03 +02:00
Willy Tarreau	ad98edd00a	BUG/MINOR: polling: fix time reporting when using busy polling Since commit beb859abce ("MINOR: polling: add an option to support busy polling") the time and status passed to clock_update_local_date() were incorrect. Indeed, what is considered is the before_poll date related to the configured timeout which does not correspond to what is passed to the poller. That's not correct because before_poll+the syscall's timeout will be crossed by the current date 100 ms after the start of the poller. In practice it didn't happen when the poller was limited to 1s timeout but at one minute it happens all the time. That's particularly visible when running a multi-threaded setup with busy polling and only half of the threads working (bind ... thread even). In this case, the fixup code of clock_update_local_date() is executed for each round of busy polling. The issue was made really visible starting with recent commit e8b1ad4c2b ("BUG/MEDIUM: clock: also update the date offset on time jumps") because upon a jump, the shared offset is reset, while it should not be in this specific case. What needs to be done instead is to pass the configured timeout of the poller (and not of the syscall), and always pass "interrupted" set so as to claim we got an event (which is sort of true as it just means the poller returned instantly). In this case we can still detect backwards/forward jumps and will use a correct boundary for the maximum date that covers the whole loop. This can be backported to all versions since the issue was introduced with busy-polling in 1.9-dev8.	2024-09-12 17:47:13 +02:00
Christopher Faulet	1900ca475f	MEDIUM: h1: Accept invalid T-E values with accept-invalid-http-response option Since 2.6, a parsing error is reported when the chunked encoding is found twice. As stated in RFC9112, a sender must not apply the chunked transfer coding more than once to a message body. It means only one chunked coding must be found. In addition, empty values are also rejected because it is forbidden by RFC9110. However, in both cases, it may be useful to relax the rules for trusted legacy servers when accept-invalid-http-response option is set. Especially because it was accepted on 2.4 and older. In addition, T-E header is now sanitized before sending it. It is not a problem because it is a hop-by-hop header. Note that it remains invalid on client side because there is no good reason to relax the parsing on this side. We can argue a server is trusted so we can decide to support some legacy behavior. It is not true on client side and it is highly suspicious if a client is sending an invalid T-E header. Note also we continue to reject unsupported T-E values (so all codings except "chunked"). Because the "TE" header is sanitized and cannot contain other value than "Trailers", there is absolutely no reason for a server to use something else. This patch should fix the issue #2677. It could probably be backported as far as 2.6 if necessary.	2024-09-12 09:21:57 +02:00
Willy Tarreau	2b95c77c08	DOC: server: document what to check for when adding new server keywords It's too easy to overlook the dynamic servers when adding new server keywords, and the fields on each keyword line are totally obscure. This commit adds a title to each column of the table and explains what is expected and what to check for when adding a keyword.	2024-09-10 18:50:12 +02:00
Damien Claisse	ce6a621ae3	MINOR: server: allow init-state for dynamic servers Commit 50322df introduced the init-state keyword, but it didn't enable it for dynamic servers. However, this feature is perfectly desirable for virtual servers too, where someone would like a server inlived through "set server be1/srv1 state ready" to be put out of maintenance in down state until the next health check succeeds. At reading the code, it seems that it's only a matter of allowing this keyword for dynamic servers, as current code path calls srv_adm_set_ready() which incidentally triggers a call to _srv_update_status_adm().	2024-09-10 18:18:38 +02:00
Willy Tarreau	33deb4babe	REGTESTS: shorten a bit the delay for the h1/h2 upgrade test Commit d6c4ed9a96 ("REGTESTS: h1/h2: Update script testing H1/H2 protocol upgrades") introduced a 0.5 second delay which is higher than those of most other tests (usually 0.05 or 0.2) and triggers timeouts on my side. Let's just shorten it to 0.2 since its goal is only to send data separately. Note: maybe a barrier approach would be possible, though not studied.	2024-09-10 10:36:59 +02:00
Willy Tarreau	9f8d9c9e8b	BUG/MINOR: pattern: do not leave a leading comma on "set" error messages Commit 4f2493f355 ("BUG/MINOR: pattern: pat_ref_set: fix UAF reported by coverity") dropped the condition to concatenate error messages and as such introduced a leading comma in front of all of them. Then commit 911f4d93d4 ("BUG/MINOR: pattern: pat_ref_set: return 0 if err was found") changed the behavior to stop at the first error anyway, so all the mechanics dedicated to the concatenation of error messages is no longer needed and we can simply return the error as-is, without inserting any comma. This should be backported where the patches above are backported.	2024-09-10 08:55:29 +02:00
Willy Tarreau	036ab62231	REGTESTS: fix random failures with wrong_ip_port_logging.vtc under load This test has an expect rule for syslog that looks for [cC]D, to indicate a client abort or timeout during the data phase. The purpose was to say that when it fails it must be this, but the very low timeout (1ms) still makes it prone to succeeding if the machine is highly loaded. This has become more visible since commit e8b1ad4c2b ("BUG/MEDIUM: clock: also update the date offset on time jumps") because the clock drift adjustments are more systematic. Since this commit, running 50 such tests at twice more than the number of CPUs in parallel is sufficient to yield errors due to some lines appearing as succeeding: make reg-tests -- --j $((($(nproc)+1)*2)) --vtestparams -n50 reg-tests/log/wrong_ip_port_logging.vtc It was observed that pauses up to 300ms were observed in epoll_wait() in such circumstances, which were properly fixed by the time drift detection.. Another approach would consist in increasing the permitted margin during which we don't fix the clock drift but that would not be logical since the base time had really been awaited for. This should be backported to all stable releases since the commit above will trigger the issue more often.	2024-09-09 19:38:28 +02:00
Christopher Faulet	a99d58819f	BUG/MINOR: h1-htx: Don't flag response as bodyless when a tunnel is established This reverts commit 225a4d02e1f6a12c0b4f3584949fad3339d71708. When a 200-OK response is replied to a CONNECT request or a 101-Switching-protocol, a tunnel is considered as established between the client and the server. However, we must not declare the response as bodyless. Of course, there is no payload, but tunneled data are expected. Because of this bug, the zero-copy forwarding is disabled on the server side. This patch must be backported as far as 2.9.	2024-09-09 19:01:47 +02:00
Christopher Faulet	f6e193f1b0	BUG/MAJOR: mux-h1: Wake SC to perform 0-copy forwarding in CLOSING state When the mux is woken up on I/O events, if the zero-copy forwarding is enabled, receives are blocked. In this case, the SC is woken up to be able to perform 0-copy forwarding to the other side. This works well, except for the H1C in CLOSING state. Indeed, in that case, in h1_process(), the SC is not woken up because only RUNNING H1 connections are considered. As a consequence, the mux will ignore connection closure. The H1 connection remains blocked, waiting for the shutdown timeout. If no timeout is configured, the H1 connection is never closed, leading to a leak. This patch should fix the leak reported by Damien Claisse in the issue #2697. It should be backported as far as 2.8.	2024-09-09 19:01:47 +02:00
William Lallemand	021ac6a108	MEDIUM: ssl/cli: "dump ssl cert" allow to dump a certificate in PEM format The new "dump ssl cert" CLI command allows to dump a certificate stored into HAProxy memory. Until now it was only possible to dump the description of the certificate using "show ssl cert", but with this new command you can dump the PEM content on the filesystem. This command is only available on an admin stats socket. $ echo "@1 dump ssl cert cert.pem" | socat /tmp/master.sock - -----BEGIN PRIVATE KEY----- [...] -----END PRIVATE KEY----- -----BEGIN CERTIFICATE----- [...] -----END CERTIFICATE----- -----BEGIN CERTIFICATE----- [...] -----END CERTIFICATE-----	2024-09-09 16:54:48 +02:00
Aurelien DARRAGON	68cfb222b5	BUG/MEDIUM: pattern: prevent UAF on reused pattern expr Since c5959fd ("MEDIUM: pattern: merge same pattern"), UAF (leading to crash) can be experienced if the same pattern file (and match method) is used in two default sections and the first one is not referenced later in the config. In this case, the first default section will be cleaned up. However, due to an unhandled case in the above optimization, the original expr which the second default section relies on is mistakenly freed. This issue was discovered while trying to reproduce GH #2708. The issue was particularly tricky to reproduce given the config and sequence required to make the UAF happen. Hopefully, Github user @asmnek not only provided useful information, but since he was able to consistently trigger the crash in his environment he was able to nail down the crash to the use of pattern file involved with 2 named default sections. Big thanks to him. To fix the issue, let's push the logic from c5959fd a bit further. Instead of relying on "do_free" variable to know if the expression should be freed or not (which proved to be insufficient in our case), let's switch to a simple refcounting logic. This way, no matter who owns the expression, the last one attempting to free it will be responsible for freeing it. Refcount is implemented using a 32bit value which fills a previous 4 bytes structure gap: int mflags; /* 80 4 */ /* XXX 4 bytes hole, try to pack */ long unsigned int lock; /* 88 8 */ (output from pahole) Even though it was not reproduced in 2.6 or below by @asmnek (the bug was revealed thanks to another bugfix), this issue theoretically affects all stable versions (up to c5959fd), thus it should be backported to all stable versions.	2024-09-09 16:07:05 +02:00
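The refcounting logic mentioned in the commit above ("the last one attempting to free it will be responsible for freeing it") follows a common pattern that can be sketched as below. The type and function names are hypothetical and do not reflect the actual pattern expression structure; the sketch is single-threaded and leaves out any locking.

```c
#include <stdlib.h>

/* Hypothetical shared expression with a 32-bit reference count. */
struct expr_sketch {
	unsigned int refcount;
};

/* Create an expression owned by exactly one user. */
static struct expr_sketch *sketch_expr_new(void)
{
	struct expr_sketch *e = calloc(1, sizeof(*e));

	if (e)
		e->refcount = 1;
	return e;
}

/* A second user (for instance another section reusing the same
 * pattern) takes its own reference instead of sharing ownership
 * implicitly.
 */
static struct expr_sketch *sketch_expr_take(struct expr_sketch *e)
{
	e->refcount++;
	return e;
}

/* Drop one reference. Only the last user actually frees the object.
 * Returns 1 if the expression was freed, 0 if it is still in use.
 */
static int sketch_expr_drop(struct expr_sketch *e)
{
	if (--e->refcount > 0)
		return 0;
	free(e);
	return 1;
}
```

With this scheme the order in which owners are cleaned up no longer matters, which is the property a single "do_free" decision could not guarantee.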
Aurelien DARRAGON	8157c1caf2	BUG/MEDIUM: pattern: prevent uninitialized reads in pat_match_{str,beg} Using valgrind when running map_beg or map_str, the following error is reported: ==242644== Conditional jump or move depends on uninitialised value(s) ==242644== at 0x2E4AB1: pat_match_str (pattern.c:457) ==242644== by 0x2E81ED: pattern_exec_match (pattern.c:2560) ==242644== by 0x343176: sample_conv_map (map.c:211) ==242644== by 0x27522F: sample_process_cnv (sample.c:1330) ==242644== by 0x2752DB: sample_process (sample.c:1373) ==242644== by 0x319917: action_store (vars.c:814) ==242644== by 0x24D451: http_req_get_intercept_rule (http_ana.c:2697) In fact, the error is legit, because in pat_match_{beg,str}, we dereference the buffer on len+1 to check if a value was previously set, and then decide to force NULL-byte if it wasn't set. But the approach is no longer compatible with current architecture: data past str.data is not guaranteed to be initialized in the buffer. Thus we cannot dereference the value, else we expose us to uninitialized read errors. Moreover, the check is useless, because we systematically set the ending byte to 0 when the conditions are met. Finally, restoring the older value after the lookup is not relevant: indeed, either the sample is marked as const and in such case it is already duplicated, or the sample is not const and we forcefully add a terminating NULL byte outside from the actual string bytes (since we're past str.data), so as we didn't alter effective string data and that data past str.data cannot be dereferenced anyway as it isn't guaranteed to be initialized, there's no point in restoring previous uninitialized data. It could be backported in all stable versions. But since this was only detected by valgrind and isn't known to cause issues in existing deployments, it's probably better to wait a bit before backporting it to avoid any breakage.. although the fix should be theoretically harmless.	2024-09-09 15:57:30 +02:00
Aurelien DARRAGON	3449525a02	BUG/MINOR: pattern: prevent const sample from being tampered in pat_match_beg() This is a complementary patch to a68affeaa ("BUG/MINOR: pattern: a sample marked as const could be written"). Indeed the same logic from pat_match_str() is used there, but we lack the check to ensure that the sample is not const before writing data to it. It could be backported to all stable versions.	2024-09-09 15:57:23 +02:00
Willy Tarreau	ef8d8215de	BUG/MEDIUM: clock: detect and cover jumps during execution After commit e8b1ad4c2 ("BUG/MEDIUM: clock: also update the date offset on time jumps"), @firexinghe mentioned that the issue was still present in their case. In fact it depends on the load, which affects the probability that the time changes between two poll() calls vs that it changes during poll(). The time correction code used to only deal with the latter. But under load if it changes between two poll() calls, what happens then is that before_poll is off, and after returning from poll(), the date is within bounds defined by before_poll, so no correction is applied. After many tests, it turns out that the most reliable solution without using CLOCK_MONOTONIC is to prevent before_poll from being earlier than the previous after_poll (trivial), and to cover forward jumps, we need to enforce a margin. Given that the watchdog kills a looping task within 2 seconds and that no sane setup triggers it, it seems that 2 seconds remains a safe enough margin. This means that in the worst case, some forward jumps of up to 2 seconds will not be corrected, leading to an apparent fast time and low rates. But this is supposed to be an exceptional event anyway (typically an admin or crontab running ntpdate). For future versions, given that we now opportunistically call now_mono_time() before and after poll(), that returns zero if not supported, we could imagine relying on this one for the thread's local time when it's non-null.	2024-09-08 19:15:38 +02:00
Christopher Faulet	d6c4ed9a96	REGTESTS: h1/h2: Update script testing H1/H2 protocol upgrades "http-messaging/protocol_upgrade.vtc" script was updated to test upgrades for requests with a payload. It should fail when the request is sent to a H2 server. When sent to a H1 server, it should succeed, except if the server replies before the end of the request.	2024-09-06 14:18:02 +02:00
Christopher Faulet	001fb1a548	BUG/MEDIUM: mux-h1/mux-h2: Reject upgrades with payload on H2 side only Since 1d2d77b27 ("MEDIUM: mux-h1: Return a 501-not-implemented for upgrade requests with a body"), it is no longer possible to perform a protocol upgrade for requests with a payload. The main reason was to be able to support protocol upgrade for H1 client requesting a H2 server. In that case, the upgrade request is converted to a CONNECT request. So, it is not possible to convey a payload in that case. But, it is a problem for anyone wanting to perform upgrades on H1 server using requests with a payload. It is uncommon but valid. So, now, it is the H2 multiplexer responsibility to reject upgrade requests, on server side, if there is a payload. An INTERNAL_ERROR is returned for the H2S in that case. On H1 side, the upgrade is now allowed, but only if the server waits for the end of the request to return the 101-Switching-protocol response. Indeed, it is quite hard to synchronise the frontend side and the backend side in that case. Asking servers to fully consume the request payload before returning the response seems reasonable. This patch should fix the issue #2684. It could be backported after a period of observation, as far as 2.4 if possible. But only if it is not too hard. It depends on "MINOR: mux-h1: Set EOI on SE during demux when both side are in DONE state".	2024-09-06 09:16:18 +02:00
Christopher Faulet	ad1ef94612	MINOR: mux-h1: Set EOI on SE during demux when both side are in DONE state For now, this case is already handled for all requests except for those waiting for a tunnel establishment (CONNECT and protocol upgrades). It is not an issue because only bodyless requests are supported in these cases. So the request is always finished at the end of headers and therefore before the response. However, to relax conditions for full H1 protocol upgrades (H1 client and server), this case will be necessary. Indeed, the idea is to be able to perform protocol upgrades for requests with a payload. Today, the "Upgrade:" header is removed before sending the request to the server. But to support this case, this patch is required to properly finish transaction when the server does not perform the upgrade.	2024-09-06 09:00:13 +02:00
Willy Tarreau	c22fc591d4	DOC: configuration: place the HAPROXY_HTTP_LOG_FMT example on the correct line When HAPROXY_HTTP_LOG_FMT was added by commit 537b9e7f36 ("MINOR: config: add environment variables for default log format"), the example was placed by accident after the clf log format instead of the HTTP log format, causing a bit of confusion. This can be backported to 2.8.	2024-09-06 07:41:16 +02:00
Willy Tarreau	a2aea9f573	[RELEASE] Released version 3.1-dev7 Released version 3.1-dev7 with the following main changes : - MINOR: config: Created env variables for http and tcp clf formats - MINOR: mux-quic: add buf_in_flight to QCC debug infos - MINOR: mux-quic: correct qcc_bufwnd_full() documentation - MINOR: tools: add helpers to backup/clean/restore env - MINOR: mworker: restore initial env before wait mode - BUG/MINOR: haproxy: free init_env in deinit only if allocated - BUILD: tools: environ is not defined in OS X and BSD - DEV: coccinelle: add a test to detect unchecked malloc() - DEV: coccinelle: add a test to detect unchecked calloc() - CI: QUIC Interop AWS-LC: enable ngtcp2 client - CI: fix missing comma introduced in 956839c0f68a7722acc586ecd91ffefad2ccb303 - CI: QUIC Interop: do not run bandwidth measurement tests - CI: QUIC Interop: use different artifact names for uploading logs - BUILD: quic: 32bits build broken by wrong integer conversions for printf() - CLEANUP: ssl: cleanup the clienthello capture - MEDIUM: ssl: capture the supported_versions extension from Client Hello - MEDIUM: ssl/sample: add ssl_fc_supported_versions_bin sample fetch - MEDIUM: ssl: capture the signature_algorithms extension from Client Hello - MEDIUM: ssl/sample: add ssl_fc_sigalgs_bin sample fetch - MINOR: proxy: Add support of 429-Too-Many-Requests in retry-on status - BUG/MEDIUM: mux-h2: Set ES flag when necessary on 0-copy data forwarding - BUG/MEDIUM: stream: Prevent mux upgrades if client connection is no longer ready - BUG/MINIR: proxy: Match on 429 status when trying to perform a L7 retry - CLEANUP: haproxy: fix typos in code comment - CLEANUP: mqtt: fix typo in MQTT_REMAINING_LENGHT_MAX_SIZE - MINOR: tools: Implement ipaddrcpy(). - MINOR: quic: Implement quic_tls_derive_token_secret(). - MINOR: quic: Token for future connections implementation. 
- BUG/MINOR: quic: Missing incrementation in NEW_TOKEN frame builder - MINOR: quic: Modify NEW_TOKEN frame structure (qf_new_token struct) - MINOR: quic: Implement qc_ssl_eary_data_accepted(). - MINOR: quic: Add trace for QUIC_EV_CONN_IO_CB event. - BUG/MEDIUM: quic: always validate sender address on 0-RTT - BUILD: quic: fix build errors on FreeBSD since recent GSO changes - MINOR: tools: extend str2sa_range to add an alt parameter - MINOR: server: add a alt_proto field for server - MEDIUM: sock: use protocol when creating socket - MEDIUM: protocol: add MPTCP per address support - BUG/MINOR: quic: Crash from trace dumping SSL eary data status (AWS-LC) - MEDIUM: stick-table: Add support of a factor for IN/OUT bytes rates - MEDIUM: bwlim: Use a read-lock on the sticky session to apply a shared limit - BUG/MEDIUM: mux-pt: Never fully close the connection on shutdown - BUG/MEDIUM: cli: Always release back endpoint between two commands on the mcli - BUG/MINOR: quic: unexploited retransmission cases for Initial pktns. - BUG/MEDIUM: mux-h1: Properly handle empty message when an error is triggered - MINOR: mux-h2: try to clear DEM_MROOM and MUX_MFULL at more places - BUG/MAJOR: mux-h2: always clear MUX_MFULL and DEM_MROOM when clearing the mbuf - BUG/MINOR: mux-spop: always clear MUX_MFULL and DEM_MROOM when clearing the mbuf - BUG/MINOR: Crash on O-RTT RX packet after dropping Initial pktns - BUG/MEDIUM: mux-pt: Fix condition to perform a shutdown for writes in mux_pt_shut() - CLEANUP: assorted typo fixes in the code and comments - DEV: patchbot: count the number of backported/non-backported patches - DEV: patchbot: add direct links to show only specific categories - DEV: patchbot: detect commit IDs starting with 7 chars - BUG/MEDIUM: clock: also update the date offset on time jumps - MEDIUM: server: add init-state	2024-09-05 18:53:54 +02:00
Aaron Kuehler	50322dff81	MEDIUM: server: add init-state Allow the user to set the "initial state" of a server. Context: Servers are always set in an UP status by default. In some cases, further checks are required to determine if the server is ready to receive client traffic. This introduces the "init-state {up|down}" configuration parameter to the server. - when set to 'fully-up', the server is considered immediately available and can turn to the DOWN state when ALL health checks fail. - when set to 'up' (the default), the server is considered immediately available and will initiate a health check that can turn it to the DOWN state immediately if it fails. - when set to 'down', the server initially is considered unavailable and will initiate a health check that can turn it to the UP state immediately if it succeeds. - when set to 'fully-down', the server is initially considered unavailable and can turn to the UP state when ALL health checks succeed. The server's init-state is considered when the HAProxy instance is (re)started, a new server is detected (for example via service discovery / DNS resolution), a server exits maintenance, etc. Link: https://github.com/haproxy/haproxy/issues/51	2024-09-05 11:13:10 +02:00
Willy Tarreau	e8b1ad4c2b	BUG/MEDIUM: clock: also update the date offset on time jumps In GH issue #2704, @swimlessbird and @xanoxes reported problems handling time jumps. Indeed, since 2.7 with commit 4eaf85f5d9 ("MINOR: clock: do not update the global date too often") we refrain from updating the global offset in case it didn't change. But there's a catch: in case of a large time jump, if the poller was interrupted, the local time remains the same and we return immediately from there without updating the offset. It then becomes incorrect regarding the "date" value, and upon subsequent call to the poller, there's no way to detect a jump anymore so we apply the old, incorrect offset and the date becomes wrong. Worse, going back to the original time (then in the past), global_now_ns remains higher than the local time and neither get updated anymore. What is missing in practice is to immediately update the offset when detecting a time jump. In an ideal world, the offset would be updated upon every call, that's what was being done prior to commit above but it's extremely CPU intensive on large systems. However we can perfectly afford to update the offset every time we detect a time jump, as it's not as common. This needs to be backported as far as 2.8. Thanks to both participants above for providing very helpful details.	2024-09-04 16:55:43 +02:00
Willy Tarreau	531bf44a65	DEV: patchbot: detect commit IDs starting with 7 chars Some commit messages contain commit IDs as short as 7 chars, let's detect them.	2024-09-04 09:41:40 +02:00
Willy Tarreau	f6910a4578	DEV: patchbot: add direct links to show only specific categories The per-category counters are now clickable so that it becomes possible to list the relevant ones.	2024-09-04 09:38:43 +02:00
Willy Tarreau	eaf4adb5e2	DEV: patchbot: count the number of backported/non-backported patches It's useful to instantly see how many patches of each category have already been backported and are still pending, let's count them and report them at the top of the page.	2024-09-04 09:11:04 +02:00
Ilya Shipitsin	1f6e5f7a61	CLEANUP: assorted typo fixes in the code and comments This is 43rd iteration of typo fixes	2024-09-03 17:49:21 +02:00
Christopher Faulet	e1cae42879	BUG/MEDIUM: mux-pt: Fix condition to perform a shutdown for writes in mux_pt_shut() A regression was introduced in the commit 76fa71f7a ("BUG/MEDIUM: mux-pt: Never fully close the connection on shutdown") because of a typo on the connection flags. CO_FL_SOCK_WR_SH flag must be tested to prevent a call to conn_sock_shutw() and not CO_FL_SOCK_RD_SH. Concretely, most of the time, it is harmless because shutdown for writes is always performed before any shutdown for reads, except in the case described by the commit above. But it is not clear if it has an impact or not. This patch must be backported with the commit above, so as far as 2.9.	2024-09-03 15:25:05 +02:00
Frederic Lecaille	7e19432fd4	BUG/MINOR: Crash on O-RTT RX packet after dropping Initial pktns This bug arrived with this naive commit: BUG/MINOR: quic: Too shord datagram during O-RTT handshakes (aws-lc only) which omitted to consider the case where the Initial packet number space could be discarded before receiving 0-RTT packets. To fix this, append/insert the 0-RTT (early-data) packet number space into the encryption level list depending on the presence or not of the Initial packet number space. This issue was revealed when using aws-lc as TLS stack in GH #2701 issue. Thank you to @Tristan971 for having reported this issue. Must be backported where the commit mentioned above is supposed to be backported: as far as 2.9.	2024-09-03 15:23:06 +02:00
Willy Tarreau	f8bff3b531	BUG/MINOR: mux-spop: always clear MUX_MFULL and DEM_MROOM when clearing the mbuf That's the equivalent of the mux-h2 one, except that here there's no real risk to loop since normally we cannot feed data that bypass the closed state check (e.g. no zero-copy forward). But it still remains dirty to be able to leave and empty mbuf with MFULL and MROOM set, so better clear them as well. No backport is needed since this is only in 3.1.	2024-09-03 14:39:04 +02:00
Willy Tarreau	830e50561c	BUG/MAJOR: mux-h2: always clear MUX_MFULL and DEM_MROOM when clearing the mbuf There exists an extremely tricky code path that was revealed in 3.0 by the glitches feature, though it might theoretically have existed before. TL;DR: a mux mbuf may be full after successfully sending GOAWAY, and discard its remaining contents without clearing H2_CF_MUX_MFULL and H2_CF_DEM_MROOM, then endlessly loop in h2_send(), until the watchdog takes care of it. What can happen is the following: Some data are received, h2_io_cb() is called. h2_recv() is called to receive the incoming data. Then h2_process() is called and in turn calls h2_process_demux() to process input data. At some point, a glitch limit is reached and h2c_error() is called to close the connection. The input frame was incomplete, so some data are left in the demux buffer. Then h2_send() is called, which in turn calls h2_process_mux(), which manages to queue the GOAWAY frame, turning the state to H2_CS_ERROR2. The frame is sent, and h2_process() calls h2_send() a last time (doing nothing) and leaves. The streams are all woken up to notify about the error. Multiple backend streams were waiting to be scheduled and are woken up in turn, before their parents being notified, and communicate with the h2 mux in zero-copy-forward mode, request a buffer via h2_nego_ff(), fill it, and commit it with h2_done_ff(). At some point the mux's output buffer is full, and gets flags H2_CF_MUX_MFULL. The io_cb is called again to process more incoming data. h2_send() isn't called (polled) or does nothing (e.g. TCP socket buffers full). h2_recv() may or may not do anything (doesn't matter). h2_process() is called since some data remain in the demux buf. It goes till the end, where it finds st0 == H2_CS_ERROR2 and clears the mbuf. We're now in a situation where the mbuf is empty and MFULL is still present. 
Then it calls h2_send(), which doesn't call h2_process_mux() due to MFULL, doesn't enter the for() loop since all buffers are empty, then keeps sent=0, which doesn't allow to clear the MFULL flag, and since "done" was not reset, it loops forever there. Note that the glitches make the issue more reproducible but theoretically it could happen with any other GOAWAY (e.g. PROTOCOL_ERROR). What makes it not happen with the data produced on the parsing side is that we process a single buffer of input at once, and there's no way to amplify this to 30 buffers of responses (RST_STREAM, GOAWAY, SETTINGS ACK, WINDOW_UPDATE, PING ACK etc are all quite small), and since the mbuf is cleared upon every exit from h2_process() once the error was sent, it is not possible to accumulate response data across multiple calls. And the regular h2_snd_buf() path checks for st0 >= H2_CS_ERROR so it will not produce any data there either. Probably that h2_nego_ff() should check for H2_CS_ERROR before accepting to deliver a buffer, but this needs to be carefully studied. In the mean time the real problem is that the MFULL flag was kept when clearing the buffer, making the two inconsistent. Since it doesn't seem possible to trigger this sequence without the zero-copy-forward mechanism, this fix needs to be backported as far as 2.9, along with previous commit "MINOR: mux-h2: try to clear DEM_MROOM and MUX_MFULL at more places" which will strengthen the consistency between these checks. Many thanks to Annika Wickert for her detailed report that allowed to diagnose this problem. CVE-2024-45506 was assigned to this problem.	2024-09-03 14:39:04 +02:00
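The inconsistency described in the mux-h2 commit above (an empty mbuf with MFULL still set) can be illustrated with a reduced model. Everything in this sketch is hypothetical: `mini_mux`, its two flags and both helpers are stand-ins chosen for illustration, and the consistency rule "MFULL implies a non-empty buffer" is an assumption drawn from the commit message, not the real h2 mux code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for H2_CF_MUX_MFULL and H2_CF_DEM_MROOM. */
#define MINI_MUX_MFULL 0x1u /* output buffer full, producers must wait */
#define MINI_DEM_MROOM 0x2u /* demux blocked waiting for room in the mbuf */

struct mini_mux {
    uint32_t flags;
    size_t mbuf_len; /* bytes pending in the output buffer */
};

/* Drop pending output together with the flags that describe a full buffer,
 * so that the buffer and the flags can never disagree.
 */
static void mini_mux_clear_mbuf(struct mini_mux *m)
{
    assert(m != NULL);
    m->mbuf_len = 0;
    m->flags &= ~(MINI_MUX_MFULL | MINI_DEM_MROOM);
}

/* Return 1 if the state is consistent: MFULL implies a non-empty buffer. */
static int mini_mux_is_consistent(const struct mini_mux *m)
{
    assert(m != NULL);
    return !(m->flags & MINI_MUX_MFULL) || m->mbuf_len > 0;
}
```

Clearing the length alone would leave `mini_mux_is_consistent()` returning 0, which is the state the commit message says made h2_send() loop.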
Willy Tarreau	e9cdedb39b	MINOR: mux-h2: try to clear DEM_MROOM and MUX_MFULL at more places The code leading to H2_CF_MUX_MFULL and H2_CF_DEM_MROOM being cleared is quite complex and assumptions about its state are extremely difficult when reading the code. There are indeed long sequences where the mux might possibly be empty, still having the flag set until it reaches h2_send() which will clear it after the last send. Even then it's not obvious whether it's always guaranteed to release the flag when invoked in multiple passes. Let's just simplify the condition so that h2_send() does not depend on "sent" anymore and that h2_timeout_task() doesn't leave the flags set on the buffer on emptiness. While it doesn't seem to fix anything, it will make the code more robust against future changes.	2024-09-03 14:39:04 +02:00
Christopher Faulet	0d4271cdae	BUG/MEDIUM: mux-h1: Properly handle empty message when an error is triggered When a 400/408/500/501 error is returned by the H1 multiplexer, we first try to get the error message of the proxy before using the default one. This may be configured to be mapped on /dev/null or on an empty file. In that case, no message is emitted, as expected. But everything is handled as if the error was successfully sent. However, there is a bug here. In h1_send_error() function, this case is not properly handled. The flag H1C_F_ABRTED is not set on the H1 connection as it should be and h1_close() function is not called, leaving the H1 connection in an undefined state. It is especially an issue when an "empty" 408-Request-Time-out error is emitted while there are data blocked in the output buffer. In that case, the connection remains opened until the client closes and a "cR--"/408 is logged repeatedly, every time the client timeout is reached. This patch must be backported as far as 2.8.	2024-09-03 14:28:42 +02:00
Frederic Lecaille	15a737eb5f	BUG/MINOR: quic: unexploited retransmission cases for Initial pktns. qc_prep_hdshk_fast_retrans() job is to pick some packets to be retransmitted from Initial and Handshake packet number spaces. A packet may be coalesced to a first one into the same datagram. When a coalesced packet is inspected for retransmission, it is skipped if its length would make the total length of the datagram it is attached to exceed the anti-amplification limit. But in this case, the first packet must be kept for the current retransmission. This is tracked by this trace statement: TRACE_PROTO("will probe Initial packet number space", QUIC_EV_CONN_SPPKTS, qc); This was not the case because of the wrong "goto end" statement. The latter must be run only if the Initial packet number space must not be probed with the first packet found as coalesced to another one which must be skipped. This bug was revealed by AWS-LC interop runner with handshakeloss and handshakecorruption which always fail because this stack leads the server to send more Initial packets. Thank you to Ilya (@chipitsine) for this issue report in GH #2663. Must be backported as far as 2.6.	2024-09-03 11:47:51 +02:00
Christopher Faulet	d4781bd5e7	BUG/MEDIUM: cli: Always release back endpoint between two commands on the mcli When several commands are chained on the master CLI, the same client connection is used. Because it is a TCP connection, the mux PT is used. It means there is no stream at the mux level. It is not possible to release the applicative stream between commands as for HTTP. So, to work around this limitation, between two commands, the master CLI is resetting the stream. It does exactly what was performed on HTTP to manage keep-alive connections on old HAProxy versions. But this part was copied from a code dealing with connection only while the back endpoint can be an applet or a mux for the master cli. The previous fix on the mux PT ("BUG/MEDIUM: mux-pt: Never fully close the connection on shutdown") revealed a bug. Between two commands, the back endpoint was only released if the connection's XPRT was closed. This works if the back endpoint is an applet because there is no connection. But for commands sent to a worker, a connection is used. At this stage, this only works if the connection's XPRT is closed. Otherwise, the old endpoint is never detached leading to undefined behavior on the next command execution (most probably a crash). Without the commit above, the connection's XPRT is always closed on shutdown. It is no longer true. At this stage, we must unconditionally release the back endpoint by resetting the corresponding sedesc to fix the bug. This patch must be backported with the commit above in all stable versions. On 2.4 and lower, it will need to be adapted.	2024-09-02 18:31:35 +02:00
Christopher Faulet	76fa71f7a8	BUG/MEDIUM: mux-pt: Never fully close the connection on shutdown When a shutdown is reported to the mux (shutdown for reads or writes), the connection is immediately fully closed if the mux detects the connection is closed in both directions. Only the passthrough multiplexer is able to perform this action at this stage because there is no stream and no internal data. Other muxes perform a full connection close during the mux's release stage. It was working quite well until recently. But, in theory, the bug is quite old. In fact, it seems possible for the lower layer to report an error on the connection at the same time a shutdown is performed on the mux. Depending on how events are scheduled, the following may happen: 1. A connection error is detected at the fd layer and a wakeup is scheduled on the mux to handle the event. 2. A shutdown for writes is performed on the mux. Here the mux decides to fully close the connection. If the xprt is not used to log info, it is released. 3. The mux is finally woken up. It tries to retrieve data from the xprt because it is not aware there was an error. This leads to a crash because of a NULL-deref. By reading the code, it is not obvious. But it seems possible with SSL connection when the handshake is rearmed. It happens when a SSL_ERROR_WANT_WRITE is reported on a SSL_read() attempt or a SSL_ERROR_WANT_READ on a SSL_write() attempt. This bug is only visible if the XPRT is not used to log info. So it is not so common. This patch should fix the 2nd crash reported in the issue #2656. It must first be backported as far as 2.9 and then slowly to all stable versions.	2024-09-02 15:50:25 +02:00
Christopher Faulet	f9adcdf039	MEDIUM: bwlim: Use a read-lock on the sticky session to apply a shared limit There is no reason to acquire a write-lock on the sticky session when a shared limit is applied because only the frequency is updated. The sticky session itself is not modified. We must just take care it is not removed in the mean time. So a read-lock may be used instead.	2024-09-02 15:50:25 +02:00
Christopher Faulet	a7f6b0ac03	MEDIUM: stick-table: Add support of a factor for IN/OUT bytes rates Add a factor parameter to stick-tables, called "brates-factor", that is applied to in/out bytes rates to work around the 32-bits limit of the frequency counters. Thanks to this factor, it is possible to have bytes rates beyond the 4GB. Instead of counting each bytes, we count blocks of bytes. Among other things, it will be useful for the bwlim filter, to be able to configure shared limit exceeding the 4GB/s. For now, this parameter must be in the range ]0-1024].	2024-09-02 15:50:25 +02:00
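The idea behind "brates-factor" in the stick-table commit above, counting blocks of bytes so that a 32-bit counter can represent more than 4GB, can be sketched as below. The helper names, the truncating division and the saturation are assumptions made for illustration; the commit message does not say how the real code rounds or handles overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a byte count into blocks of <factor> bytes, saturating at the
 * largest value a 32-bit counter can hold. The factor range follows the
 * ]0-1024] range given in the commit message.
 */
static uint32_t bytes_to_blocks(uint64_t bytes, uint32_t factor)
{
    uint64_t blocks;

    assert(factor >= 1 && factor <= 1024);
    blocks = bytes / factor;
    return blocks > UINT32_MAX ? UINT32_MAX : (uint32_t)blocks;
}

/* Convert a block count back to bytes for reporting. */
static uint64_t blocks_to_bytes(uint32_t blocks, uint32_t factor)
{
    assert(factor >= 1 && factor <= 1024);
    return (uint64_t)blocks * factor;
}
```

For instance, with a factor of 1024, a rate of 8 GiB per second becomes 8388608 blocks, which fits comfortably in 32 bits, whereas with a factor of 1 the same rate saturates the counter.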
Frederic Lecaille	db13df3d6e	BUG/MINOR: quic: Crash from trace dumping SSL eary data status (AWS-LC) This bug follows this patch: MINOR: quic: Add trace for QUIC_EV_CONN_IO_CB event. where a new third variable was added to be dumped from QUIC_EV_CONN_IO_CB trace event. The quic_trace() code did not reveal there was already another variable passed as third argument but not dumped. This led to a crash when dereferencing a pointer to an int in place of a pointer to an SSL object. This issue was reproduced only by handshakecorruption aws-lc interop test with s2n-quic as client. Note that this patch must be backported with this one: BUG/MEDIUM: quic: always validate sender address on 0-RTT which depends on the commit mentioned above.	2024-09-02 10:01:41 +02:00
Aperence	20efb856e1	MEDIUM: protocol: add MPTCP per address support Multipath TCP (MPTCP), standardized in RFC8684 [1], is a TCP extension that enables a TCP connection to use different paths. Multipath TCP has been used for several use cases. On smartphones, MPTCP enables seamless handovers between cellular and Wi-Fi networks while preserving established connections. This use-case is what pushed Apple to use MPTCP since 2013 in multiple applications [2]. On dual-stack hosts, Multipath TCP enables the TCP connection to automatically use the best performing path, either IPv4 or IPv6. If one path fails, MPTCP automatically uses the other path. To benefit from MPTCP, both the client and the server have to support it. Multipath TCP is a backward-compatible TCP extension that is enabled by default on recent Linux distributions (Debian, Ubuntu, Redhat, ...). Multipath TCP is included in the Linux kernel since version 5.6 [3]. To use it on Linux, an application must explicitly enable it when creating the socket. No need to change anything else in the application. This attached patch adds MPTCP per address support, to be used with: mptcp{,4,6}@<address>[:port1[-port2]] MPTCP v4 and v6 protocols have been added: they are mainly a copy of the TCP ones, with small differences: names, proto, and receivers lists. These protocols are stored in __protocol_by_family, as an alternative to TCP, similar to what has been done with QUIC. By doing that, the size of __protocol_by_family has not been increased, and it behaves like TCP. MPTCP is both supported for the frontend and backend sides. Also added an example of configuration using mptcp along with a backend allowing to experiment with it. Note that this is a re-implementation of Björn's work from 3 years ago [4], when haproxy's internals were probably less ready to deal with this, causing his work to be left pending for a while. Currently, the TCP_MAXSEG socket option doesn't seem to be supported with MPTCP [5]. 
This results in a warning when trying to set the MSS of sockets in proto_tcp:tcp_bind_listener. This can be resolved by adding two new variables: sock_inet(6)_mptcp_maxseg_default that will hold the default value of the TCP_MAXSEG option. Note that for the moment, this will always be -1 as the option isn't supported. However, in the future, when the support for this option will be added, it should contain the correct value for the MSS, allowing to correctly set the TCP_MAXSEG option. Link: https://www.rfc-editor.org/rfc/rfc8684.html [1] Link: https://www.tessares.net/apples-mptcp-story-so-far/ [2] Link: https://www.mptcp.dev [3] Link: https://github.com/haproxy/haproxy/issues/1028 [4] Link: https://github.com/multipath-tcp/mptcp_net-next/issues/515 [5] Co-authored-by: Dorian Craps <dorian.craps@student.vinci.be> Co-authored-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>	2024-08-30 18:53:49 +02:00
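The MPTCP commit above notes that on Linux an application must explicitly request MPTCP when creating the socket. A minimal sketch of that step follows. The function name is hypothetical, the numeric value 262 for IPPROTO_MPTCP is the Linux value used only when the system headers do not define it, and the fallback to plain TCP is a choice made for this illustration; the commit message does not say whether haproxy falls back.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Older C libraries do not define IPPROTO_MPTCP; 262 is the Linux value. */
#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262
#endif

/* Create a stream socket for <family>, asking for MPTCP when <want_mptcp>
 * is non-zero. If the kernel refuses MPTCP, retry with regular TCP.
 * Returns the file descriptor, or -1 on failure.
 */
static int open_stream_socket(int family, int want_mptcp)
{
    int fd = -1;

    assert(family == AF_INET || family == AF_INET6);
    if (want_mptcp)
        fd = socket(family, SOCK_STREAM, IPPROTO_MPTCP);
    if (fd < 0)
        fd = socket(family, SOCK_STREAM, IPPROTO_TCP);
    return fd;
}
```

The rest of the application (bind, listen, connect, read, write) is unchanged, which is the property the commit message relies on when it describes the MPTCP protocols as mainly a copy of the TCP ones.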
Aperence	2f171fe36a	MEDIUM: sock: use protocol when creating socket Use the protocol configured for a connection when creating the socket, instead of always using 0. This change is needed to allow new protocol to be used when creating the sockets, such as MPTCP. Note however that this patch won't change anything for now, as the only other value that proto->sock_prot could hold is IPPROTO_TCP, which has the same behavior as 0 when passed to socket.	2024-08-30 18:53:49 +02:00
Aperence	38618822e1	MINOR: server: add a alt_proto field for server Add a new field alt_proto to the server structures that specify if an alternate protocol should be used for this server. This field can be transparently passed to protocol_lookup to get an appropriate protocol structure. This change allows thus to create servers with different protocols, and not only TCP anymore.	2024-08-30 18:53:49 +02:00
Aperence	a7b04e383a	MINOR: tools: extend str2sa_range to add an alt parameter Add a new parameter "alt" that will store whether this configuration uses an alternate protocol. This alt pointer will contain a value that can be transparently passed to protocol_lookup to obtain an appropriate protocol structure. This change is needed to allow for example the servers to know if they need to use an alternate protocol or not.	2024-08-30 18:53:49 +02:00
Willy Tarreau	2bc513dd31	BUILD: quic: fix build errors on FreeBSD since recent GSO changes The following commits broke the build on FreeBSD when QUIC is enabled: 35470d518 ("MINOR: quic: activate UDP GSO for QUIC if supported") 448d3d388 ("MINOR: quic: add GSO parameter on quic_sock send API") Indeed, it turns out that netinet/udp.h requires sys/types.h to be included before. Let's just change the includes order to fix the build. No backport is needed.	2024-08-30 18:53:49 +02:00
Frederic Lecaille	f627b9272b	BUG/MEDIUM: quic: always validate sender address on 0-RTT It has been reported by Wedl Michael, a student at the University of Applied Sciences St. Poelten, a potential vulnerability into haproxy as described below. An attacker could have obtained a TLS session ticket after having established a connection to an haproxy QUIC listener, using its real IP address. The attacker has not even to send a application level request (HTTP3). Then the attacker could open a 0-RTT session with a spoofed IP address trusted by the QUIC listen to bypass IP allow/block list and send HTTP3 requests. To mitigate this vulnerability, one decided to use a token which can be provided to the client each time it successfully managed to connect to haproxy. These tokens may be reused for future connections to validate the address/path of the remote peer as this is done with the Retry token which is used for the current connection, not the next one. Such tokens are transported by NEW_TOKEN frames which was not used at this time by haproxy. So, each time a client connect to an haproxy QUIC listener with 0-RTT enabled, it is provided with such a token which can be reused for the next 0-RTT session. If no such a token is presented by the client, haproxy checks if the session is a 0-RTT one, so with early-data presented by the client. Contrary to the Retry token, the decision to refuse the connection is made only when the TLS stack has been provided with enough early-data from the Initial ClientHello TLS message and when these data have been accepted. Hopefully, this event arrives fast enough to allow haproxy to kill the connection if some early-data have been accepted without token presented by the client. quic_build_post_handshake_frames() has been modified to build a NEW_TOKEN frame with this newly implemented token to be transported inside. 
quic_tls_derive_retry_token_secret() was renamed to quic_do_tls_derive_token_secret() and modified to be reused and derive the secret for the new token implementation. quic_token_validate() has been implemented to validate both the Retry and the new token implemented by this patch. When this is a non-retry token which could not be validated, the datagram received is marked as requiring a Retry packet to be sent, and no connection is created. When the Initial packet does not embed any non-retry token and if 0-RTT is enabled the connection is marked with this new flag: QUIC_FL_CONN_NO_TOKEN_RCVD. As soon as the TLS stack detects that some early-data have been provided and accepted by the client, the connection is marked to be killed (QUIC_FL_CONN_TO_KILL) from ha_quic_add_handshake_data(). This is done by calling the new qc_ssl_eary_data_accepted() function. The secret TLS handshake is interrupted as soon as possible by returning 0 from ha_quic_add_handshake_data(). The connection is also marked as requiring a Retry packet to be sent (QUIC_FL_CONN_SEND_RETRY) from ha_quic_add_handshake_data(). Then the handshake I/O handler (quic_conn_io_cb()) knows how to behave: kill the connection after having sent a Retry packet. About TLS stack compatibility, this patch is supported by aws-lc. It is disabled for wolfssl which does not support 0-RTT at this time thanks to HAVE_SSL_0RTT_QUIC. This patch depends on these commits: MINOR: quic: Add trace for QUIC_EV_CONN_IO_CB event. MINOR: quic: Implement qc_ssl_eary_data_accepted(). MINOR: quic: Modify NEW_TOKEN frame structure (qf_new_token struct) BUG/MINOR: quic: Missing incrementation in NEW_TOKEN frame builder MINOR: quic: Token for future connections implementation. MINOR: quic: Implement quic_tls_derive_token_secret(). MINOR: tools: Implement ipaddrcpy(). Must be backported as far as 2.6.	2024-08-30 17:04:09 +02:00
Frederic Lecaille	8854cef036	MINOR: quic: Add trace for QUIC_EV_CONN_IO_CB event. Dump the early data status from QUIC_EV_CONN_IO_CB trace event. This is very helpful to know if the QUIC server has accepted the early data received from clients.	2024-08-30 17:04:09 +02:00
Frederic Lecaille	609b124561	MINOR: quic: Implement qc_ssl_eary_data_accepted(). This function is a wrapper around SSL_get_early_data_status() for OpenSSL derived stacks and SSL_early_data_accepted() for boringSSL derived stacks like AWS-LC. It returns true for a TLS server if it has accepted the early data received from a client. Also implement quic_ssl_early_data_status_str() which is dedicated to be used for debugging purposes (traces). This function converts the enum returned by the two functions mentioned above to a human readable string.	2024-08-30 17:04:09 +02:00
Frederic Lecaille	e926378375	MINOR: quic: Modify NEW_TOKEN frame structure (qf_new_token struct) Modify qf_new_token structure to use a static buffer with QUIC_TOKEN_LEN as size as defined by the token for future connections (quic_token.c). Modify consequently the NEW_TOKEN frame parser (see quic_parse_new_token_frame()). Also add comments to denote that the NEW_TOKEN parser function is used only by clients and that its builder is used only by servers.	2024-08-30 17:04:09 +02:00
Frederic Lecaille	76c80605a6	BUG/MINOR: quic: Missing incrementation in NEW_TOKEN frame builder quic_build_new_token_frame() is the function which is called to build a NEW_TOKEN frame into a buffer. The position pointer for this buffer was not updated, leading the NEW_TOKEN frame to be malformed. Must be backported as far as 2.6.	2024-08-30 17:04:09 +02:00
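The class of bug fixed by the NEW_TOKEN builder commit above, a builder that writes into a buffer without advancing the caller's position pointer, can be illustrated with a reduced builder. This is a sketch, not quic_build_new_token_frame(): the function name and signature are hypothetical, it writes the frame type itself, and it only supports tokens shorter than 64 bytes so that the length fits in a one-byte QUIC variable-length integer. The frame type 0x07 is the NEW_TOKEN type from RFC 9000.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append a NEW_TOKEN frame (type, length, token) at *pos without writing
 * past <end>. Returns 1 and advances *pos on success, returns 0 and leaves
 * *pos untouched otherwise.
 */
static int build_new_token_frame(unsigned char **pos, const unsigned char *end,
                                 const unsigned char *token, size_t len)
{
    unsigned char *p;

    assert(pos != NULL && *pos != NULL && end != NULL && token != NULL);
    p = *pos;
    if (len == 0 || len > 63)          /* sketch: one-byte varint only */
        return 0;
    if ((size_t)(end - p) < 2 + len)   /* type + length + token */
        return 0;

    *p++ = 0x07;                       /* NEW_TOKEN frame type */
    *p++ = (unsigned char)len;         /* token length as a 1-byte varint */
    memcpy(p, token, len);
    p += len;

    *pos = p;                          /* forgetting this malforms the packet */
    return 1;
}
```

If the final assignment were dropped, the next frame would be written over this one, which is the kind of malformed output the commit message describes.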
Frederic Lecaille	f5b09dc452	MINOR: quic: Token for future connections implementation. There exist two sorts of tokens used by QUIC. They are both used to validate the peer address (path validation). Retry tokens are used for the current connection the client wants to open. This patch implements the other sort of token which, after having been received from a connection, may be provided for the next connection from the same IP address to validate it (or validate the network path between the client and the server). The token generation is implemented by quic_generate_token(), and the token validation by quic_token_check(). The same method is used as for Retry tokens to build such tokens to be reused for future connections. The format is very simple: one byte for the format identifier to distinguish these new tokens from the Retry tokens, followed by a 32-bit timestamp. As this part is ciphered with AEAD as cryptographic algorithm, 16 bytes are needed for the AEAD tag. 16 more random bytes are added to this token as a salt to derive the AEAD secret used to cipher the token. In addition to this salt, the client IP address is also used as AAD to derive the AEAD secret. So, the length of the token is fixed: 37 bytes.	2024-08-30 17:04:09 +02:00
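The fixed 37-byte length follows directly from the field sizes listed in the message above: 1 + 4 + 16 + 16. A small sketch makes the arithmetic checkable at compile time. The constant names and the field order are assumptions made for illustration; only the sizes come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Field sizes taken from the commit message; the names and the field order
 * are assumptions made for illustration only.
 */
enum {
    TOK_FORMAT_LEN    = 1,   /* format identifier */
    TOK_TIMESTAMP_LEN = 4,   /* 32-bit timestamp (ciphered part) */
    TOK_AEAD_TAG_LEN  = 16,  /* AEAD authentication tag */
    TOK_SALT_LEN      = 16,  /* random salt used to derive the AEAD secret */
    TOK_TOTAL_LEN     = TOK_FORMAT_LEN + TOK_TIMESTAMP_LEN +
                        TOK_AEAD_TAG_LEN + TOK_SALT_LEN
};

/* Fails at compile time if the fields stop adding up to 37 bytes. */
static_assert(TOK_TOTAL_LEN == 37, "token length must be 37 bytes");

/* Offset of the salt within the token under the assumed field order. */
static size_t tok_salt_offset(void)
{
    return TOK_FORMAT_LEN + TOK_TIMESTAMP_LEN + TOK_AEAD_TAG_LEN;
}
```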
Frederic Lecaille	74caa0eece	MINOR: quic: Implement quic_tls_derive_token_secret(). This function is similar to quic_tls_derive_retry_token_secret(). Its aim is to derive the secret used to cipher the token to be used for future connections. This patch renames quic_tls_derive_retry_token_secret() and reuses its code to produce a more generic function: quic_do_tls_derive_token_secret(). Two arguments are added to the latter so that both quic_tls_derive_retry_token_secret() and the new quic_tls_derive_token_secret() function can be implemented as calls to quic_do_tls_derive_token_secret().	2024-08-30 17:04:09 +02:00
Frederic Lecaille	fb7a092203	MINOR: tools: Implement ipaddrcpy(). Implement ipaddrcpy() new function to copy only the IP address from a sockaddr_storage struct object into a buffer.	2024-08-30 17:04:09 +02:00
Nicolas CARPi	a33407b499	CLEANUP: mqtt: fix typo in MQTT_REMAINING_LENGHT_MAX_SIZE There was a typo in the macro name, where LENGTH was incorrectly written. This didn't cause any issue because the typo appeared in all occurrences in the codebase.	2024-08-30 14:58:59 +02:00
Nicolas CARPi	534e7e4598	CLEANUP: haproxy: fix typos in code comment Use "from" instead of "form" in ha_random_boot function code comments.	2024-08-30 14:58:59 +02:00
Christopher Faulet	62c9d51ca4	BUG/MINOR: proxy: Match on 429 status when trying to perform a L7 retry Support for 429 was recently added to L7 retries (0d142e075 "MINOR: proxy: Add support of 429-Too-Many-Requests in retry-on status"). But the l7_status_match() function was not properly updated. The switch statement must match the 429 status to be able to perform a L7 retry. This patch must be backported if the commit above is backported. It is related to #2687.	2024-08-30 12:13:32 +02:00
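The fix described above amounts to adding a missing case to a switch on the response status. The sketch below shows the general shape of such a matcher. The flag names, their values, and the function name are hypothetical and do not reproduce HAProxy's PR_RE_* definitions or l7_status_match().

```c
#include <assert.h>

/* Hypothetical retry-on flags, for illustration only. */
enum {
    RETRY_ON_404 = 1u << 0,
    RETRY_ON_429 = 1u << 1,
    RETRY_ON_503 = 1u << 2
};

/* Returns 1 if <status> is enabled in the <retry_on> bitmask, 0 otherwise.
 * A status with a flag but no case here can be configured yet never matches,
 * which is the kind of omission the commit corrects for 429.
 */
static int status_matches(unsigned int retry_on, int status)
{
    switch (status) {
    case 404: return !!(retry_on & RETRY_ON_404);
    case 429: return !!(retry_on & RETRY_ON_429);
    case 503: return !!(retry_on & RETRY_ON_503);
    default:  return 0;
    }
}
```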
Christopher Faulet	e4812404c5	BUG/MEDIUM: stream: Prevent mux upgrades if client connection is no longer ready If an early error occurred on the client connection, we must prevent any multiplexer upgrades. Indeed, it is unexpected for a mux to be initialized with no xprt. On a normal workflow it is impossible. So it is not an issue. But if a mux upgrade is performed at the stream level, an early error on the connection may have already been handled by the previous mux and the connection may be already fully closed. If the mux upgrade is still performed, a crash can be experienced. It is possible to have a crash with an implicit TCP>HTTP upgrade if there is no data in the input buffer. But it is also possible to get a crash with an explicit "switch-mode http" rule. It must be backported to all stable versions. In 2.2, the patch must be applied directly in stream_set_backend() function.	2024-08-28 16:38:20 +02:00
Christopher Faulet	4ef5251c44	BUG/MEDIUM: mux-h2: Set ES flag when necessary on 0-copy data forwarding When DATA frames are sent via the 0-copy data forwarding, we must take care to set the ES flag on the last DATA frame. It should be performed in h2_done_ff() when IOBUF_FL_EOI flag was set by the producer. This flag is here to know when the producer has reached the end of input. When this happens, the h2s state is also updated. It is switched to "half-closed local" or "closed" state depending on its previous state. It is mainly an issue on uploads because the server may be blocked waiting for the end of the request. A workaround is to disable the 0-copy forwarding support for H2 by setting the "tune.h2.zero-copy-fwd-send" directive to off in your global section. This patch should fix the issue #2665. It must be backported as far as 2.9.	2024-08-28 10:05:34 +02:00
Christopher Faulet	0d142e0756	MINOR: proxy: Add support of 429-Too-Many-Requests in retry-on status The "429" status can now be specified on retry-on directives. PR_RE_* flags were updated to remain sorted. This patch should fix the issue #2687. It is quite simple so it may safely be backported to 3.0 if necessary.	2024-08-28 10:05:34 +02:00
William Lallemand	d2fc1ab66e	MEDIUM: ssl/sample: add ssl_fc_sigalgs_bin sample fetch This new sample fetch allows extracting the binary list contained in the signature_algorithms (13) TLS extension. https://datatracker.ietf.org/doc/html/rfc8446#section-4.2.3	2024-08-26 15:17:40 +02:00
William Lallemand	e8fecef0ff	MEDIUM: ssl: capture the signature_algorithms extension from Client Hello Activate the capture of the TLS signature_algorithms extension from the Client Hello. This list is stored in the ssl_capture buffer when the global option "tune.ssl.capture-cipherlist-size" is enabled.	2024-08-26 15:17:40 +02:00
William Lallemand	ac5c7158f9	MEDIUM: ssl/sample: add ssl_fc_supported_versions_bin sample fetch This new sample fetch allows extracting the binary list contained in the supported_versions (43) TLS extension. https://datatracker.ietf.org/doc/html/rfc8446#section-4.2.1	2024-08-26 15:17:40 +02:00
William Lallemand	ce7fb6628e	MEDIUM: ssl: capture the supported_versions extension from Client Hello Activate the capture of the TLS supported_versions extension from the Client Hello. This list is stored in the ssl_capture buffer when the global option "tune.ssl.capture-cipherlist-size" is enabled.	2024-08-26 15:12:42 +02:00
William Lallemand	3c0a0f1e1b	CLEANUP: ssl: cleanup the clienthello capture In order to add more extensions, clean up the clienthello capture function a little bit.	2024-08-26 15:12:42 +02:00
Frederic Lecaille	414e3aa6bc	BUILD: quic: 32bits build broken by wrong integer conversions for printf() Since these commits the 32bits build is broken due to several errors as follows: CC src/quic_cli.o src/quic_cli.c: In function ‘dump_quic_full’: src/quic_cli.c:285:94: error: format ‘%ld’ expects argument of type ‘long int’, but argument 5 has type ‘uint64_t’ {aka ‘long long unsigned int’} [-Werror=format=] 285 \| chunk_appendf(&trash, " [initl] rx.ackrng=%-6zu tx.inflight=%-6zu(%ld%%)\n", \| ~~^ \| \| \| long int \| %lld 286 \| pktns->rx.arngs.sz, pktns->tx.in_flight, 287 \| pktns->tx.in_flight * 100 / qc->path->cwnd); \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ \| \| \| uint64_t {aka long long unsigned int} Replace several %ld by %llu with ull as printf conversion in quic_cli.c and a %ld by %lld with (long long) as printf conversion in quic_cc_cubic.c. Thank you to Ilya (@chipitsine) for having reported this issue in GH #2689. Must be backported to 3.0.	2024-08-26 11:21:48 +02:00
Ilya Shipitsin	4256961a44	CI: QUIC Interop: use different artifact names for uploading logs Artifact names must be unique, otherwise only the first failed logs are uploaded and the others encounter a 409 conflict.	2024-08-26 11:19:41 +02:00
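The portability issue above comes from uint64_t being "unsigned long" on typical 64-bit Linux targets but "unsigned long long" on 32-bit ones, so %ld only matches by accident on the former. A minimal sketch of the portable pattern (cast to unsigned long long and print with %llu) follows. The function name is hypothetical; the expression mirrors the percentage computed in the compiler output quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: formats in_flight as a percentage of cwnd into <dst>.
 * The explicit cast makes the argument type match %llu on both 32-bit and
 * 64-bit builds. Returns the snprintf() result, or -1 when cwnd is zero.
 */
static int format_inflight_pct(char *dst, size_t size,
                               uint64_t in_flight, uint64_t cwnd)
{
    if (cwnd == 0)
        return -1;
    return snprintf(dst, size, "%llu%%",
                    (unsigned long long)(in_flight * 100 / cwnd));
}
```

Using PRIu64 from <inttypes.h> is an equally valid alternative to the cast; the cast is shown here because it is the approach the commit message describes.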
Ilya Shipitsin	438ad6b495	CI: QUIC Interop: do not run bandwidth measurement tests crosstraffic, goodput tests are intended to perform bandwidth measurement, we do not consider GitHub runners for that purpose GH issue: https://github.com/haproxy/haproxy/issues/2688	2024-08-26 11:19:41 +02:00
Ilya Shipitsin	f583ed9469	CI: fix missing comma introduced in 956839c0f68a7722acc586ecd91ffefad2ccb303 In 956839c0f68a7722acc586ecd91ffefad2ccb303 the syntax was broken due to a missing comma. This is the follow-up fix.	2024-08-26 11:19:41 +02:00
Ilya Shipitsin	956839c0f6	CI: QUIC Interop AWS-LC: enable ngtcp2 client Let's add it and see how it goes. GH issue: https://github.com/haproxy/haproxy/issues/2688	2024-08-24 19:13:59 +02:00
Ilya Shipitsin	8f2112a04f	DEV: coccinelle: add a test to detect unchecked calloc() The coccinelle test "unchecked-calloc.cocci" detects various cases of unchecked calloc().	2024-08-24 19:13:56 +02:00
Ilya Shipitsin	2ec42bff48	DEV: coccinelle: add a test to detect unchecked malloc() The coccinelle test "unchecked-malloc.cocci" detects various cases of unchecked malloc().	2024-08-24 19:13:56 +02:00
William Lallemand	7a03ab426f	BUILD: tools: environ is not defined in OS X and BSD Add extern char **environ in order to build the new functions to manipulate the environment. Indeed the variable environ is not required to be declared by POSIX, so it needs to be declared manually: "In addition, the following variable, which must be declared by the user if it is to be used directly: extern char **environ;" https://pubs.opengroup.org/onlinepubs/9699919799/functions/environ.html	2024-08-23 19:39:57 +02:00
Valentine Krasnobaeva	28ca7fc594	BUG/MINOR: haproxy: free init_env in deinit only if allocated This fixes 7b78e1571 (" MINOR: mworker: restore initial env before wait mode"). In cases, when haproxy starts without any configuration, for example: 'haproxy -vv', init_env array to backup env variables is never allocated. So, we need to check in deinit(), when we free its memory, that init_env is not a NULL ptr.	2024-08-23 19:08:53 +02:00
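The guard described above can be sketched as follows. The function name is hypothetical and this is not HAProxy's deinit() code; the sketch assumes the backup is a NULL-terminated array of heap-allocated strings. Note that free(NULL) itself is harmless in C; the check matters because walking a NULL array to release its entries would dereference a NULL pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: releases a NULL-terminated array of heap-allocated
 * strings and resets the caller's pointer. Does nothing if the backup was
 * never allocated (for example when the program only printed its version).
 */
static void free_env_backup(char ***env)
{
    if (!*env)
        return;               /* never allocated: nothing to walk or free */

    for (char **p = *env; *p; p++)
        free(*p);             /* each saved "NAME=value" string */
    free(*env);               /* the array itself */
    *env = NULL;
}
```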
Valentine Krasnobaeva	7b78e1571b	MINOR: mworker: restore initial env before wait mode This patch is the follow-up of 1811d2a6ba (MINOR: tools: add helpers to backup/clean/restore env). In order to avoid unexpected behaviour in master-worker mode during the process reload with a new configuration, when the old one contained '*env' keywords, let's back up its initial environment before calling parse_cfg() and let's clean and restore it in the context of the master process, just before it enters a wait polling loop. This will guarantee that new workers will have a new updated environment and not the previous one inherited from the master, which does not read the configuration when it's in wait mode.	2024-08-23 17:06:59 +02:00
Valentine Krasnobaeva	1811d2a6ba	MINOR: tools: add helpers to backup/clean/restore env 'setenv', 'presetenv', 'unsetenv', 'resetenv' keywords in configuration could modify the process runtime environment. In case of master-worker mode this creates a problem, as the configuration is read only once before forking a worker and then the master process does the reexec without reading any config files, just to free the memory. So, during the reload a new worker process will be created, but it will inherit the previous unchanged environment from the master in wait mode, thus it won't benefit from the changes in configuration related to '*env' keywords. This may cause unexpected behavior or some parser errors in master-worker mode. So, let's add a helper to back up all process env variables just before it reads its configuration. And let's also add helpers to clean up the current runtime environment and to restore it to its initial state (as it was before parsing the config).	2024-08-23 17:06:33 +02:00
Amaury Denoyelle	960d68a5af	MINOR: mux-quic: correct qcc_bufwnd_full() documentation Fix the returned value comment of qcc_bufwnd_full() which was incorrect.	2024-08-23 16:25:04 +02:00
Amaury Denoyelle	ecfedc2570	MINOR: mux-quic: add buf_in_flight to QCC debug infos Dump <buf_in_flight> QCC field both in QUIC MUX traces and "show quic". This could help to detect if MUX does not allocate enough buffers compared to quic_conn current congestion window.	2024-08-22 17:48:23 +02:00
Nathan Wehrman	5c07d58e08	MINOR: config: Created env variables for http and tcp clf formats Since we already have variables for the other formats and the change is trivial I thought it would be a nice addition for completeness	2024-08-22 09:15:58 +02:00
Willy Tarreau	599f043e74	[RELEASE] Released version 3.1-dev6 Released version 3.1-dev6 with the following main changes : - BUG/MINOR: proto_tcp: delete fd from fdtab if listen() fails - BUG/MINOR: proto_tcp: keep error msg if listen() fails - MINOR: proto_tcp: tcp_bind_listener: copy errno in errmsg - MINOR: channel: implement ci_insert() function - BUG/MEDIUM: mworker/cli: fix pipelined modes on master CLI - REGTESTS: mcli: test the pipelined commands on master CLI - MINOR: cfgparse: load_cfg_in_mem: fix null ptr dereference reported by coverity - MINOR: startup: fix unused value reported by coverity - BUG/MINOR: mux-quic: do not send too big MAX_STREAMS ID - BUG/MINOR: proto_uxst: delete fd from fdtab if listen() fails - BUG/MINOR: cfgparse: parse_cfg: fix null ptr dereference reported by coverity - MINOR: proto_uxst: copy errno in errmsg for syscalls - MINOR: mux-quic: do not trace error in qcc_send_frames() on empty list - BUG/MINOR: h3: properly reject too long header responses - CLEANUP: mworker/cli: clean up the mode handling - BUG/MINOR: tools: make fgets_from_mem() stop at the end of the input - BUG/MINOR: pattern: pat_ref_set: fix UAF reported by coverity - BUG/MINOR: pattern: pat_ref_set: return 0 if err was found - CI: keep logs for failed QIUC Interop jobs - BUG/MINOR: release-estimator: fix relative scheme in CHANGELOG URL - MINOR: release-estimator: add requirements.txt - MINOR: release-estimator: add installation steps in README.md - MINOR: release-estimator: fix the shebang of the python script - DOC: config: correct the table for option tcplog - MEDIUM: log: relax some checks and emit diag warnings instead in lf_expr_postcheck() - MINOR: log: "drop" support for log-profile steps - CI: QUIC Interop LibreSSL: document chacha20 test status - CI: modernize codespell action, switch to node 16 - CI: QUIC Interop AWS-LC: enable chrome client - DOC: lua: fix incorrect english in lua.txt - MINOR: Implements new log format of option tcplog clf - MINOR: 
cfgparse: limit file size loaded via /dev/stdin - BUG/MINOR: stats: fix color of input elements in dark mode - CLEANUP: stats: use modern DOCTYPE tag - BUG/MINOR: stats: add lang attribute to html tag - DOC: quic: fix default minimal value for max window size - DOC: quic: document nocc debug congestion algorithm - MINOR: quic: extract config window-size parsing - MINOR: quic: define max-window-size config setting - MINOR: quic: allocate stream txbuf via qc_stream_desc API - MINOR: mux-quic: account stream txbuf in QCC - MEDIUM: mux-quic: implement API to ignore txbuf limit for some streams - MINOR: h3: mark control stream as metadata - MINOR: mux-quic: define buf_in_flight - MAJOR: mux-quic: allocate Tx buffers based on congestion window - MINOR: quic/config: adapt settings to new conn buffer limit - MINOR: quic: define sbuf pool - MINOR: quic: support sbuf allocation in quic_stream - MEDIUM: h3: allocate small buffers for headers frames - MINOR: mux-quic: retry after small buf alloc failure - BUG/MINOR: cfgparse-global: fix err msg in mworker keyword parser - BUG/MINOR: cfgparse-global: clean common_kw_list - BUG/MINOR: cfgparse-global: remove redundant goto - MINOR: cfgparse-global: move 'pidfile' in global keywords list - MINOR: cfgparse-global: move 'expose-*' in global keywords list - MINOR: cfgparse-global: move tune options in global keywords list - MINOR: cfgparse-global: move unsupported keywords in global list - BUG/MINOR: cfgparse-global: remove tune.fast-forward from common_kw_list - MINOR: quic: store the lost packets counter in the quic_cc_event element - MINOR: quic: support a tolerance for spurious losses - MINOR: protocol: properly assign the sock_domain and sock_family - MINOR: protocol: add a family lookup - MEDIUM: socket: always properly use the sock_domain for requested families - MINOR: protocol: add the real address family to the protocol - MINOR: socket: don't ban all custom families from reuseport - MINOR: protocol: always initialize the 
receivers list on registration - CLEANUP: protocol: no longer initialize .receivers nor .nb_receivers	2024-08-21 17:50:03 +02:00
Willy Tarreau	9911b53d75	CLEANUP: protocol: no longer initialize .receivers nor .nb_receivers Protocol definitions no longer need to initialize these internal fields, as they're now properly initialized during protocol registration.	2024-08-21 17:37:46 +02:00
Willy Tarreau	1cb3b0b745	MINOR: protocol: always initialize the receivers list on registration Till now, protocols were required to self-initialize their receivers list head, which is not very convenient, and is quite error prone. Indeed, it's too easy to copy-paste a protocol definition and forget to update the .receivers field to point to itself, resulting in mixed lists. Let's just do that in protocol_register(). And while we're at it, let's also zero the nb_receivers entry that works with it, so that the protocol definition isn't required to pre-initialize stuff related to internal book-keeping.	2024-08-21 17:37:46 +02:00
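The idea above (let the registration function set up the internal bookkeeping instead of relying on each protocol definition to do it) can be sketched with a minimal circular list head. All type and function names below are hypothetical simplifications and do not reproduce HAProxy's struct protocol or protocol_register().

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly-linked list head, for illustration only. */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical, reduced protocol descriptor. */
struct proto_sketch {
    const char *name;
    struct list_head receivers;  /* internal: list of receivers */
    int nb_receivers;            /* internal: number of entries in the list */
};

/* Registration initializes the internal fields itself, so a copy-pasted
 * definition cannot end up with a list head pointing at another protocol.
 */
static void proto_register_sketch(struct proto_sketch *p)
{
    p->receivers.next = &p->receivers;  /* empty list points to itself */
    p->receivers.prev = &p->receivers;
    p->nb_receivers = 0;
}
```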
Willy Tarreau	034974106f	MINOR: socket: don't ban all custom families from reuseport The test on ss_family >= AF_MAX is too strict if we want to support new custom families, let's apply this to the real_family instead so that we check that the underlying socket supports reuseport.	2024-08-21 17:37:46 +02:00
Willy Tarreau	2a799b64b0	MINOR: protocol: add the real address family to the protocol For custom families, there's sometimes an underlying real address and it would be nice to be able to directly use the real family in calls to bind() and connect() without having to add explicit checks for exceptions everywhere. Let's add a .real_family field to struct proto_fam for this. For now it's always equal to the family except for non-transferable ones such as rhttp where it's equal to the custom one (anything else could fit).	2024-08-21 17:37:46 +02:00
Willy Tarreau	d592ebdbeb	MEDIUM: socket: always properly use the sock_domain for requested families Now we make sure to always look up the protocol's domain for an address family. Previously we would use it as-is, which prevented from properly using custom addresses (which is when they differ). This removes some hard-coded tests such as in log.c where UNIX vs UDP was explicitly checked for example. It requires a bit of care, however, so as to properly pass value 1 in the 3rd arg of the protocol_lookup() for DGRAM stuff. Maybe one day we'll change these for defines or enums to limit mistakes.	2024-08-21 17:36:58 +02:00
Willy Tarreau	ba4a416c66	MINOR: protocol: add a family lookup At plenty of places we have access to an address family which may include some custom addresses but we cannot simply convert them to the real families without performing some random protocol lookups. Let's simply add a proto_fam table like we have for the protocols. The protocols could even be indexed there, but for now it's not worth it.	2024-08-21 16:46:15 +02:00
Willy Tarreau	732913f848	MINOR: protocol: properly assign the sock_domain and sock_family When we finally split sock_domain from sock_family in 2.3, something was not cleanly finished. The family is what should be stored in the address while the domain is what is supposed to be passed to socket(). But for the custom addresses, we did the opposite, just because the protocol_lookup() function was acting on the domain, not the family (both of which are equal for non-custom addresses). This is an API bug but there's no point backporting it since it does not have visible effects. It was visible in the code since a few places were using PF_UNIX while others were comparing the domain against AF_MAX instead of comparing the family. This patch clarifies this in the comments on top of proto_fam, addresses the indexing issue and properly reconfigures the two custom families.	2024-08-21 16:46:15 +02:00
Willy Tarreau	67bf1d6c9e	MINOR: quic: support a tolerance for spurious losses Tests performed between a 1 Gbps connected server and a 100 mbps client, distant by 95ms showed that: - we need 1.1 MB in flight to fill the link - rare but inevitable losses are sufficient to make cubic's window collapse fast and long to recover - a 100 MB object takes 69s to download - tolerance for 1 loss between two ACKs suffices to shrink the download time to 20-22s - 2 losses go to 17-20s - 4 losses reach 14-17s At 100 concurrent connections that fill the server's link: - 0 loss tolerance shows 2-3% losses - 1 loss tolerance shows 3-5% losses - 2 loss tolerance shows 10-13% losses - 4 loss tolerance shows 23-29% losses As such while there can be a significant gain sometimes in setting this tolerance above zero, it can also significantly waste bandwidth by sending far more than can be received. While it's probably not a solution to real world problems, it repeatedly proved to be a very effective troubleshooting tool helping to figure different root causes of low transfer speeds. In spirit it is comparable to the no-cc congestion algorithm, i.e. it must not be used except for experimentation.	2024-08-21 08:34:30 +02:00
Willy Tarreau	fab0e99aa1	MINOR: quic: store the lost packets counter in the quic_cc_event element Upon loss detection, qc_release_lost_pkts() notifies congestion controllers about the event and its final time. However it does not pass the number of lost packets, that can provide useful hints for some controllers. Let's just pass this option.	2024-08-21 08:02:44 +02:00
Valentine Krasnobaeva	2e6e159ac4	BUG/MINOR: cfgparse-global: remove tune.fast-forward from common_kw_list Remove tune.fast-forward from common_kw_list. It was replaced by 'tune.disable-fast-forward' and it's no longer present in "if..else if.." parser from cfg_parse_global(). Otherwise, it may be shown as the best-match keyword for some tune options, which is now wrong. Should be backported in versions 2.9 and 3.0.	2024-08-20 19:16:34 +02:00
Valentine Krasnobaeva	731ef865e3	MINOR: cfgparse-global: move unsupported keywords in global list Following the previous commits and in order to clean up cfg_parse_global let's move unsupported keywords in the global list and let's add for them a dedicated parser.	2024-08-20 19:16:33 +02:00
Valentine Krasnobaeva	55309592db	MINOR: cfgparse-global: move tune options in global keywords list In order to clean up cfg_parse_global() and to add the support of the new MODE_DISCOVERY in configuration parsing, let's move the keywords related to tune options into the global keywords list and let's add for them two dedicated parsers. Tune options keywords are sorted between two parsers in dependency of parameters number, which a given tune option needs. tune options parser is called by section parser and follows the common API, i.e. it returns -1 on failure, 0 on success and 1 on recoverable error. In case of recoverable error we've previously returned ERR_ALERT (0x10) and we have emitted an alert message at startup. Section parser treats all rc > 0 as ERR_WARN. So in case, if some tune option was set twice in the global section, tune options parser will return 1 (in order to respect the common API), section parser will treat this as ERR_WARN and a warning message will be emitted during process startup instead of alert, as it was before.	2024-08-20 19:16:32 +02:00
Valentine Krasnobaeva	c46497f16f	MINOR: cfgparse-global: move 'expose-*' in global keywords list Following the previous commit let's also move 'expose-*' keywords in the global cfg_kws list and let's add for them a dedicated parser. This will simplify the configuration parsing in the new MODE_DISCOVERY, which reads only the keywords needed at the early start of the haproxy process (i.e. modes, pidfile, chosen poller).	2024-08-20 19:16:31 +02:00
Valentine Krasnobaeva	450ce3e61b	MINOR: cfgparse-global: move 'pidfile' in global keywords list This commit cleans up cfg_parse_global() and prepares the config parser to support MODE_DISCOVERY. This step is needed in the early starting stage, just to figure out in which mode the process was started, to set some necessary parameters needed for this mode and to continue the initialization stage. 'pidfile' is one of those common keywords which need to be parsed very early and which are used in almost all process modes (except the foreground, '-d'). 'pidfile' keyword parser is called by section parser and follows the common API, i.e. it returns -1 on failure, 0 on success and 1 on recoverable error. In case of recoverable error we've previously returned ERR_ALERT (0x10) and we have emitted an alert message at startup. Section parser treats all rc > 0 as ERR_WARN. So in case pidfile was already specified via command line, the keyword parser will return 1 (in order to respect the common API), section parser will treat this as ERR_WARN and a warning message will be emitted during process startup instead of alert, as it was before.	2024-08-20 19:16:30 +02:00
Valentine Krasnobaeva	f29be97ac7	BUG/MINOR: cfgparse-global: remove redundant goto When the given keyword is found in the global 'cfg_kws' list, we go to the 'out' label anyway, after testing the rc returned by the keyword's parser. So there is not much gain in performing a 'goto out' jump specifically when rc > 0.	2024-08-20 19:16:29 +02:00
Valentine Krasnobaeva	74bc6f3d66	BUG/MINOR: cfgparse-global: clean common_kw_list This patch fixes commits 118ac11ce ("MINOR: cfgparse-global: move mode's keywords in cfg_kw_list") and 83ff4db18 (MINOR: cfgparse-global: move no<poller_name> in cfg_kw_list). 'common_kw_list' serves to show the best-match keyword in cfg_parse_global(), if the given keyword was not parsed in "if..else if.." cases. cfg_parse_global() is still used as a parser for some keywords from the global section. Mode-specific and no<poller_name> keywords now have their own parsers. They no longer take place in the "if..else if.." from cfg_parse_global() and they are registered in the 'cfg_kws' list. So, there is no longer need to duplicate them in the 'common_kw_list'. Otherwise, they will be shown twice in parser error message.	2024-08-20 19:16:28 +02:00
Valentine Krasnobaeva	4291d10b44	BUG/MINOR: cfgparse-global: fix err msg in mworker keyword parser This patch fixes the commit 118ac11ce ("cfgparse-global: move mode's keywords in cfg_kw_list"). Error message delivered by keyword parser in **err is always shown with ha_alert() by the caller cfg_parse_global(). The caller always supplies these alerts with the filename and the line number.	2024-08-20 19:16:27 +02:00
Amaury Denoyelle	0d6112b40b	MINOR: mux-quic: retry after small buf alloc failure The previous commit switched to small buffers for HTTP/3 HEADERS emission. This ensures that several parallel streams can allocate their own buffer without hitting the connection buffer limit, which is now based on the congestion window size. However, this prevents the transmission of responses with uncommonly large headers. Indeed, if all headers cannot be encoded in a single buffer, an error is reported which causes the whole connection to be closed. Adjust this by implementing a realloc API exposed by QUIC MUX. This allows the application layer to switch from a small to a default buffer and restart its processing. This guarantees again that headers no longer than bufsize can be properly transferred.	2024-08-20 18:12:27 +02:00
Amaury Denoyelle	b355e89bf9	MEDIUM: h3: allocate small buffers for headers frames A major change was recently implemented to change QUIC MUX Tx buffer allocation limit, which is now based on the current connection congestion window size. As this size may be smaller than the previous static value, it is likely that the limit will be reached more frequently. When using HTTP/3, the majority of request streams are used for small object exchanges. Every response starts with a HEADERS frame which should be much smaller in size than the default buffer. But as the whole buffer size is accounted against the congestion window, a single stream can block others even if only emitting a single HEADERS frame, which is suboptimal for bandwidth usage if the congestion window is small enough. To adapt to this new situation, rely on the newly available small buffers to transfer the HEADERS frame of the response. This at least guarantees that several parallel streams can allocate their own buffer for the first part of the response, even with a small congestion window. The situation could be further improved by using various indications of the data size and selecting a small buffer if sufficient. This could be done for example via the Content-length value or HTX extra field. However this must be the subject of a dedicated patch.	2024-08-20 18:12:27 +02:00
Amaury Denoyelle	885e4c5cf8	MINOR: quic: support sbuf allocation in quic_stream This patch extends qc_stream_desc API to be able to allocate small buffers. QUIC MUX API is similarly updated as ultimately each application protocol is responsible for choosing between a default or a smaller buffer. Internally, the type of allocated buffer is remembered via qc_stream_buf instance. This is mandatory to ensure that the buffer is released in the correct pool, in particular as small and standard buffers can be configured with the same size. This commit is purely an API change. For the moment, small buffers are not used. This will be changed in a dedicated patch.	2024-08-20 18:12:27 +02:00
Amaury Denoyelle	d0d8e57d47	MINOR: quic: define sbuf pool Define a new buffer pool reserved to allocate smaller memory area. For the moment, its usage will be restricted to QUIC, as such it is declared in quic_stream module. Add a new config option "tune.bufsize.small" to specify the size of the allocated objects. A special check ensures that it is not greater than the default bufsize to avoid unexpected effects.	2024-08-20 18:12:27 +02:00
Amaury Denoyelle	1de5f718cf	MINOR: quic/config: adapt settings to new conn buffer limit QUIC MUX buffer allocation limit is now directly based on the underlying congestion window size. previous static limit based on conn-tx-buffers is now unused. As such, this commit adds a warning to users to prevent that it is now obsolete. Secondly, update max-window-size setting. It is now the main entrypoint to limit both the maximum congestion window size and the number of QUIC MUX allocated buffer on emission. Remove its special value '0' which was used to automatically adjust it on now unused conn-tx-buffers.	2024-08-20 17:59:35 +02:00
Amaury Denoyelle	aeb8c1ddc3	MAJOR: mux-quic: allocate Tx buffers based on congestion window Each QUIC MUX may allocate buffers for MUX stream emission. These buffers are then shared with quic_conn to handle ACK reception and retransmission. A limit on the number of concurrent buffers used per connection has been defined statically and can be updated via a configuration option. This commit replaces the limit to instead use the current underlying congestion window size. The purpose of this change is to remove the artificial static buffer count limit, which may be difficult to choose. Indeed, if a connection performs with minimal loss rate, the buffer count would limit severely its throughput. It could be increase to fix this, but it also impacts others connections, even with less optimal performance, causing too many extra data buffering on the MUX layer. By using the dynamic congestion window size, haproxy ensures that MUX buffering corresponds roughly to the network conditions. Using QCC <buf_in_flight>, a new buffer can be allocated if it is less than the current window size. If not, QCS emission is interrupted and haproxy stream layer will subscribe until a new buffer is ready. One of the criticals parts is to ensure that MUX layer previously blocked on buffer allocation is properly woken up when sending can be retried. This occurs on two occasions : * after an already used Tx buffer is cleared on ACK reception. This case is already handled by qcc_notify_buf() via quic_stream layer. * on congestion window increase. A new qcc_notify_buf() invokation is added into qc_notify_send(). Finally, remove <avail_bufs> QCC field which is now unused. This commit is labelled MAJOR as it may have unexpected effect and could cause significant behavior change. For example, in previous implementation QUIC MUX would be able to buffer more data even if the congestion window is small. 
With this patch, data cannot be transferred from the stream layer which may cause more streams to be shut down on client timeout. Another effect may be more CPU consumption as the connection limit would be hit more often, causing more streams to be interrupted and woken up in cycle.	2024-08-20 17:17:17 +02:00
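The allocation rule described in this commit can be sketched as follows. This is only an illustration under stated assumptions: `struct qcc_sketch` and the `txbuf_*` helpers are hypothetical names, not haproxy's actual structures or API. Only the comparison of `buf_in_flight` against the congestion window, and the exemption for metadata streams mentioned in the neighbouring commits, come from the log.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-connection state; field names only mirror the commit text. */
struct qcc_sketch {
	uint64_t buf_in_flight; /* sum of Tx buffer sizes currently allocated */
	uint64_t cwnd;          /* current congestion window, in bytes */
};

/* Returns 1 if a new Tx buffer may be allocated, 0 if emission must wait.
 * <oob> marks a stream exempt from the limit (e.g. a control stream).
 */
int txbuf_may_alloc(const struct qcc_sketch *qcc, int oob)
{
	assert(qcc);
	return oob || qcc->buf_in_flight < qcc->cwnd;
}

/* Accounts a newly allocated buffer of <size> bytes. */
void txbuf_alloc_account(struct qcc_sketch *qcc, uint64_t size)
{
	assert(qcc);
	qcc->buf_in_flight += size;
}

/* Accounts a released buffer. Returns 1 if blocked streams could now be
 * woken up because allocation is possible again, 0 otherwise.
 */
int txbuf_free_account(struct qcc_sketch *qcc, uint64_t size)
{
	assert(qcc && qcc->buf_in_flight >= size);
	qcc->buf_in_flight -= size;
	return txbuf_may_alloc(qcc, 0);
}
```

In this sketch a stream blocked on allocation would be retried either when `txbuf_free_account()` returns 1 or when `cwnd` grows, which matches the two wake-up occasions listed in the message.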
Amaury Denoyelle	000976af58	MINOR: mux-quic: define buf_in_flight Define a new QCC counter named <buf_in_flight>. Its purpose is to account the current sum of all allocated stream buffer size used on emission. For the moment, this counter is updated on buffer allocation and deallocation. It will be used to replace <avail_bufs> once congestion window is used as limit for buffer allocation in a future commit.	2024-08-20 17:17:17 +02:00
Amaury Denoyelle	f9777bea30	MINOR: h3: mark control stream as metadata A current work is performed to change QUIC MUX buffer allocation limit from a configurable static value to use the size of the congestion window instead. This change may cause the buffer allocation limit to be triggered more frequently. To ensure HTTP/3 control emission is not perturbed by this change, mark the stream with qcc_send_metadata(). This ensures that buffer allocation for this stream won't be subject to the connection limit. This is necessary to guarantee that SETTINGS and GOAWAY frames are emitted.	2024-08-20 17:17:17 +02:00
Amaury Denoyelle	4c4bf26f44	MEDIUM: mux-quic: implement API to ignore txbuf limit for some streams Define a new qc_stream_desc flag QC_SD_FL_OOB_BUF. This is to mark streams which are not subject to the connection limit on allocated MUX stream buffer. The purpose is to simplify handling of QUIC MUX streams which do not transfer data and as such are not driven by haproxy layer, for example HTTP/3 control stream. These streams interact synchronously with QUIC MUX and cannot retry emission in case of temporary failure. This commit will be useful once connection buffer allocation limit is reimplemented to directly rely on the congestion window size. This will probably cause the buffer limit to be reached more frequently, maybe even on QUIC MUX initialization. As such, it will be possible to mark control streams and prevent them from being subject to the buffer limit. QUIC MUX exposes a new function qcs_send_metadata(). It can be used by an application protocol to specify which streams are used for control exchanges. For the moment, no such stream uses this mechanism.	2024-08-20 17:17:17 +02:00
Amaury Denoyelle	f4d1bd0b76	MINOR: mux-quic: account stream txbuf in QCC A limit per connection is put on the number of buffers allocated by QUIC MUX for emission across all its streams. This ensures memory consumption remains under control. This limit is simply explained as a count of buffers which can be concurrently allocated for each connection. As such, quic_conn structure was used to account currently allocated buffers. However, a quic_conn never allocates new stream buffers. This is only done at QUIC MUX layer. As such, this commit moves buffer accounting inside QCC structure. This simplifies the API, most notably qc_stream_buf_alloc() usage. Note that this commit inverts the accounting. Previously, it was initially set to 0 and incremented for each allocated buffer. Now, it is set to the maximum value and decremented for each buf usage. This is considered as clearer to use.	2024-08-20 17:17:17 +02:00
Amaury Denoyelle	635fbaaa4a	MINOR: quic: allocate stream txbuf via qc_stream_desc API This commit simply adjusts QUIC stream buffer allocation. This operation is conducted by QUIC MUX using qc_stream_desc layer. Previously, qc_stream_buf_alloc() would return a qc_stream_buf instance and QUIC MUX would finalize the buffer area allocation. Change this to perform the buffer allocation directly into qc_stream_buf_alloc(). This patch clarifies the interaction between QUIC MUX and qc_stream_desc. It is cleaner to allocate the buffer via qc_stream_desc as it is already responsible for freeing the buffer. It also ensures that connection buffer accounting is only done after the whole qc_stream_buf and its buffer are allocated. Previously, the increment operation was performed between the two steps. This was not an issue, as this kind of error triggers the whole connection closure. However, if in the future this is handled as a stream closure instead, this commit ensures that the buffer remains valid in all cases.	2024-08-20 17:17:17 +02:00
Amaury Denoyelle	c24c8667b2	MINOR: quic: define max-window-size config setting Define a new global keyword tune.quic.frontend.max-window-size. This allows setting globally the maximum congestion window size for each QUIC frontend connection. The default value is 0. It is a special value which automatically derives the size from the configured QUIC connection buffer limit. This is similar to the previous "quic-cc-algo" behavior, which can be used to override the maximum window size per bind line.	2024-08-20 17:02:29 +02:00
Amaury Denoyelle	280b61468a	MINOR: quic: extract config window-size parsing quic-cc-algo is a bind line keyword which allows selecting a QUIC congestion algorithm. It can take an optional integer to specify the maximum window size. This value is an integer and supports the suffixes 'k', 'm' and 'g' to specify respectively kilobytes, megabytes and gigabytes. Extract the maximum window size parsing in a dedicated function named parse_window_size(). It accepts as input an integer value with an optional suffix, 'k', 'm' or 'g'. The first invalid character is returned by the function to the caller. No functional change. This commit will allow to quickly implement a new keyword to configure a default congestion window size in the global section.	2024-08-20 16:07:22 +02:00
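The parsing described in this commit (an integer followed by an optional 'k', 'm' or 'g' suffix, with the first unconsumed character handed back to the caller) might look roughly like the sketch below. `parse_size_sketch()` is a hypothetical stand-in written for illustration; its signature and error convention are assumptions and are not those of haproxy's parse_window_size().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in, not haproxy's code. Parses "<integer>[k|m|g]" into a
 * byte count. On return, *end_opt points to the first character that was not
 * consumed. Returns 0 with *end_opt == arg when no digit is found or when the
 * value overflows.
 */
uint64_t parse_size_sketch(const char *arg, const char **end_opt)
{
	char *end;
	unsigned long long val;
	unsigned int shift = 0;

	assert(arg && end_opt);
	*end_opt = arg;

	/* strtoull() would accept blanks and signs: require a leading digit */
	if (*arg < '0' || *arg > '9')
		return 0;

	errno = 0;
	val = strtoull(arg, &end, 10);
	if (errno == ERANGE)
		return 0;

	switch (*end) {
	case 'k': shift = 10; end++; break; /* kilobytes */
	case 'm': shift = 20; end++; break; /* megabytes */
	case 'g': shift = 30; end++; break; /* gigabytes */
	}

	/* reject values which would not fit once the suffix is applied */
	if (shift && val > (UINT64_MAX >> shift))
		return 0;

	*end_opt = end;
	return (uint64_t)val << shift;
}
```

A caller would typically treat any `*end_opt` that does not point at the end of the argument as a syntax error, then apply its own range check on the returned value.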
Amaury Denoyelle	5b6e8c4d4d	DOC: quic: document nocc debug congestion algorithm Document nocc congestion algorithm as an entry of quic-cc-algo. Highlight the fact that it is reserved for debugging and should not be used outside of this use case.	2024-08-20 16:07:22 +02:00
Amaury Denoyelle	103d860777	DOC: quic: fix default minimal value for max window size It is possible to override the default QUIC congestion algorithm on a bind line. With the same setting, it is also possible to specify the maximum congestion window size. The parser rejects values outside of the range between 10k and 4g. This is in contradiction with the documentation which specifies 1k as the lower value. Correct this value in the documentation. This should be backported up to 2.9.	2024-08-20 16:07:22 +02:00
Nicolas CARPi	bba679026c	BUG/MINOR: stats: add lang attribute to html tag The "html" element of the stats page was missing a "lang" attribute. This change specifies the "en" value, which corresponds to english language. It is also a required element for WCAG Success Criterion 3.1.1, which renders the web more accessible through a set of requirements. In this case it allows assistive technologies such as screen readers to determine the language of the page. MDN page: https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/lang HTML standard: https://html.spec.whatwg.org/multipage/dom.html#attr-lang WCAG criterion: https://www.w3.org/WAI/WCAG22/Understanding/language-of-page.html	2024-08-20 15:55:45 +02:00
Nicolas CARPi	9318a624a1	CLEANUP: stats: use modern DOCTYPE tag Switching the stats page doctype to the modern standard is shorter and less complex, and is the recommended doctype by current HTML standard. It makes it clear that we do not want to run in quirks mode. More information below. Quirks mode: https://developer.mozilla.org/en-US/docs/Web/HTML/Quirks_Mode_and_Standards_Mode HTML Standard: https://html.spec.whatwg.org/multipage/syntax.html#the-doctype	2024-08-20 15:55:31 +02:00
Nicolas CARPi	c63d558e41	BUG/MINOR: stats: fix color of input elements in dark mode Previously the text color was dark, with a dark background, this makes it white, and thus readable. This is visible on the "Scope" input field.	2024-08-20 15:55:14 +02:00
Valentine Krasnobaeva	8b1dfa9def	MINOR: cfgparse: limit file size loaded via /dev/stdin load_cfg_in_mem() can continuously reallocate memory in order to load an extremely large input from /dev/stdin, until it fails with ENOMEM, which means that process has consumed all available RAM. In case of containers and virtualized environments it's not very good. So, in order to prevent this, let's introduce MAX_CFG_SIZE as 10MB, which will limit the size of input supplied via /dev/stdin.	2024-08-20 14:28:34 +02:00
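The idea of capping a chunk-by-chunk read from a stream of unknown size can be sketched as below. This is not the code of load_cfg_in_mem(): the function name, the chunk size and the exact byte value chosen for the 10MB limit are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define SKETCH_MAX_CFG_SIZE (10 * 1024 * 1024) /* "10MB"; exact value assumed */
#define SKETCH_CHUNK        4096               /* arbitrary read granularity */

/* Reads the whole stream <f> into a malloc'ed area returned in *out, growing
 * it chunk by chunk. Returns the number of bytes read, or -1 on allocation
 * failure, read error, or when the input exceeds <max_size>. On failure *out
 * is NULL; on success the caller must free(*out).
 */
long read_capped_sketch(FILE *f, size_t max_size, char **out)
{
	char *buf = NULL, *tmp;
	size_t used = 0, n;

	assert(f && out);
	*out = NULL;

	for (;;) {
		tmp = realloc(buf, used + SKETCH_CHUNK);
		if (!tmp)
			goto fail;
		buf = tmp;

		n = fread(buf + used, 1, SKETCH_CHUNK, f);
		used += n;

		/* stop early instead of growing until ENOMEM */
		if (used > max_size || ferror(f))
			goto fail;
		if (n < SKETCH_CHUNK)
			break; /* short read without error: end of input */
	}
	*out = buf;
	return (long)used;
fail:
	free(buf);
	return -1;
}
```

With this shape, memory use never exceeds the limit plus one chunk, whatever the size of the input piped to the process.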
Nathan Wehrman	fd48b28315	MINOR: Implements new log format of option tcplog clf Some systems require log formats in the CLF format and that meant that I could not send my logs for proxies in mode tcp to those servers. This implements a format that uses log variables that are compatible with TCP mode frontends and replaces traditional HTTP values in the CLF format to make them stand out. Instead of logging method and URI like this "GET /example HTTP/1.1" it will log "TCP " and for a response code I used "000" so it would be easy to separate from legitimate HTTP traffic. Now your log servers that require a CLF format can see the timings for TCP traffic as well as HTTP.	2024-08-20 07:46:34 +02:00
Nicolas CARPi	974fae2b17	DOC: lua: fix incorrect english in lua.txt This commit fixes some typos, grammatical errors and unusual english such as "can not" instead of preferred "cannot".	2024-08-20 05:21:02 +02:00
Ilia Shipitsin	ae8f6724a1	CI: QUIC Interop AWS-LC: enable chrome client chrome is important browser, let's enable it in AWS-LC weekly tests. the only test supported by chrome is http3	2024-08-20 05:13:46 +02:00
Ilia Shipitsin	6301042938	CI: modernize codespell action, switch to node 16 The following actions uses node12 which is deprecated and will be forced to run on node16: codespell-project/codespell-problem-matcher@v1. For more info: https://github.blog/changelog/2023-06-13-github-actions-all-actions-will-run-on-node16-instead-of-node12-by-default/	2024-08-20 05:13:46 +02:00
Ilia Shipitsin	8b422971ee	CI: QUIC Interop LibreSSL: document chacha20 test status due to https://github.com/haproxy/haproxy/issues/2569 chacha20 is disabled completely on LibreSSL. let's add a comment to not forget enabling it	2024-08-20 05:13:26 +02:00
Aurelien DARRAGON	f8299bc5ea	MINOR: log: "drop" support for log-profile steps It is now possible to use "drop" keyword for "on" lines under a log-profile section to specify that no log at all should be emitted for the specified step (setting an empty format was not sufficient to do so because only the log payload would be empty, not the log header, thus the log would still be emitted). It may be useful to selectively disable logging at specific steps for a given log target (since the log profile may be set on log directives): log-profile myprof on request format "blabla" sd "custom sd" on response drop New testcase was added to reg-tests/log/log_profiles.vtc	2024-08-19 18:53:01 +02:00
Aurelien DARRAGON	41ca89bc6f	MEDIUM: log: relax some checks and emit diag warnings instead in lf_expr_postcheck() With 7a21c3a ("MAJOR: log: implement proper postparsing for logformat expressions") which finally made postparsing checks reliable, we started to get reports from users that couldn't start haproxy 3.0 with configs that used to work in the past. The current situation is described in GH #2642. While the checks are mostly relevant, it turns out they are not strictly needed anymore from a technical point of view. Most of them were useful in early logformat implementation to prevent runtime bugs due to the use of an alias or fetch at runtime from an incompatible proxy. It's been a few versions already that the code handling fetches and log aliases is robust enough to support fetches/aliases used from the wrong context: all it does is that the fetch/alias will silently fail if it's not available. This can be proved by the fact that even if the postparsing checks were partially broken in the past, it didn't cause runtime issues (at least on recent haproxy versions). Most of these checks can now be seen as configuration hints: when a check triggers, it will indicate a configuration inconsistency in most cases, but there are some corner cases where it is not possible to know at config time if the conditions will be met for the alias/fetch to work properly. So instead of failing with a hard error like we did so far, let's just be more permissive and report our findings using "diag_warning": such warnings are only emitted when haproxy is started with '-dD' cli option. We also took this opportunity to improve messages clarity and make them more precise (report the offending item instead of complaining about the whole expression because of a single element). With this patch, configs that used to start before 7a21c3a shouldn't trigger hard errors anymore. This may be backported in 3.0.	2024-08-16 14:25:10 +02:00
Nathan Wehrman	9788ae1d19	DOC: config: correct the table for option tcplog option tcplog was reported as functional in the backend section in error. This can be back ported as needed but it simply corrects that.	2024-08-13 19:50:18 +02:00
William Lallemand	f14bdba867	MINOR: release-estimator: fix the shebang of the python script Fix the shebang of the python script to use /usr/bin/env, allowing to call the script directly from a virtualenv with `./release-estimator.py` without using the python3 install of the system.	2024-08-13 17:26:36 +02:00
William Lallemand	5131f32440	MINOR: release-estimator: add installation steps in README.md Update the README.md with the dependencies and the installation steps with a python venv.	2024-08-13 17:21:47 +02:00
William Lallemand	9857eba3ae	MINOR: release-estimator: add requirements.txt Add a requirements.txt file to install the release-estimator script.	2024-08-13 17:12:59 +02:00
William Lallemand	bb02d95e92	BUG/MINOR: release-estimator: fix relative scheme in CHANGELOG URL The CHANGELOG URL which is parsed in the HTML now has a relative scheme, which is incompatible with requests. This patch adds an https scheme to the URL.	2024-08-13 16:43:03 +02:00
Ilia Shipitsin	ec1d93a6e9	CI: keep logs for failed QUIC Interop jobs it might be useful to investigate logs of failed tests. to keep artifacts small the following actions are taken - only failed logs are kept - logs retention is 6 days	2024-08-13 16:21:01 +02:00
Valentine Krasnobaeva	911f4d93d4	BUG/MINOR: pattern: pat_ref_set: return 0 if err was found pat_ref_set_elt() returns 0, if we run out of memory or can't parse a new map value. Any error message emitted by pat_ref_set_elt() is saved in err buffer, if it's provided by caller. These error messages are cumulated during the loop. pat_ref_set() is used to update values in map, referred to the same given key. If during the update pat_ref_set_elt() fails, let's return 0 to caller immediately. We have the same non-unique key and the same new value in each loop. So it seems quite odd to cumulate the same error messages and print them in CLI: > add map @1 mytest.map << + 1.0.1.11 TestA + 1.0.1.11 TESTA + 1.0.1.11 test_a + > set map mytest.map 1.0.1.11 15 unable to parse '15' unable to parse '15' unable to parse '15'. cli_parse_set_map(), which calls pat_ref_set() to update map, will return only one error message with this patch: > set map mytest.map 1.0.1.11 15 unable to parse '15'. hlua_set_map() and http_action_set_map() don't provide error buffer and will just exit on the first error. This should be backported in all stable versions.	2024-08-13 16:13:43 +02:00
Valentine Krasnobaeva	4f2493f355	BUG/MINOR: pattern: pat_ref_set: fix UAF reported by coverity memprintf() performs realloc and then updates the pointer to an output buffer, where it has written the data. So free() is called on the previous buffer address, if it was provided. pat_ref_set_elt() uses memprintf() to write its error message as well as pat_ref_set(). So, when we re-enter into the while loop the second time and pat_ref_set_elt() has returned, the err ptr (previous value of merr) is already freed by memprintf() from pat_ref_set_elt(). 'if (!found)' condition is false at this point, because we've found a node at the first loop. So, the second memprintf(), in order to write error messages, does again free(*err). This should be backported in all stable versions.	2024-08-13 16:13:41 +02:00
Willy Tarreau	0982bfd999	BUG/MINOR: tools: make fgets_from_mem() stop at the end of the input The memchr() used to look for the LF character must consider the end of input, not just the output buffer size. This was found by oss-fuzz: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=71096 No backport is needed.	2024-08-11 14:44:28 +02:00
William Lallemand	75944e266e	CLEANUP: mworker/cli: clean up the mode handling Cleanup the mode handling by refactoring the string constants that are written multiple times	2024-08-09 17:47:20 +02:00
Amaury Denoyelle	48514c118c	BUG/MINOR: h3: properly reject too long header responses When encoding HTX to HTTP/3 headers on the response path, a bunch of ABORT_NOW() were used when buffer room was not enough. In most cases this is safe as output buffer has just been allocated and so is empty at the start of the function. However, with a header list longer than a whole buffer, this would cause an unexpected crash. Fix this by replacing ABORT_NOW() statements with a proper error return path. For the moment, this would cause the whole connection to be closed rather than the stream only. This may be further improved in the future. Also remove ABORT_NOW() when encoding frame length at the end of headers or trailers encoding. Buffer room is sufficient as it was already checked prior in the same function. This should be backported up to 2.6. Special care should be taken however as this code path has changed frequently : * for 2.9 and older, the extra following statement must be inserted prior each newly added goto statement : h3c->err = H3_INTERNAL_ERROR; * for 2.6, trailers support is not implemented. As such, related chunks should just be ignored when backporting.	2024-08-09 17:41:16 +02:00
Amaury Denoyelle	8939d8e473	MINOR: mux-quic: do not trace error in qcc_send_frames() on empty list qcc_send_frames() can be called with an empty list and returns immediately with an error code. This is a convenience to be able to call it in a while loop. Remove the trace with "error" when this is the case and replace it with a less alarming "leaving on..." message. This should help debugging when traces are active.	2024-08-09 17:41:16 +02:00
Valentine Krasnobaeva	9fc69ebc0a	MINOR: proto_uxst: copy errno in errmsg for syscalls Let's copy errno in error messages, which we emit in cases when listen() or connect() fail. This is helpful for debugging.	2024-08-09 17:38:42 +02:00
Valentine Krasnobaeva	16e89f6b5c	BUG/MINOR: cfgparse: parse_cfg: fix null ptr dereference reported by coverity This commit fixes potential null ptr dereferences reported by coverity, see more details about it in the issues #2676 and #2668. 'outline' ptr, which is initialized to NULL explicitly as a temporary buffer to store split keywords may be in theory implicitly dereferenced in some corner cases (which we haven't encountered yet with real world configurations) in 'if (!**args)'. parse_line() code, called before under some conditions assigns: args[arg] = outline + outpos and outpos initial value is 0.	2024-08-09 15:43:29 +02:00
Valentine Krasnobaeva	eb82358690	BUG/MINOR: proto_uxst: delete fd from fdtab if listen() fails This patch is done mostly as a safeguard in order not to trigger BUG_ON(fdtab[fd].owner != NULL) check, if listen() will fail on UNIX domain socket. In uxst_bind_listener(), the pretty same logic of closing socket on error path was kept, as it was in tcp_bind_listener() before. The use of fd_delete() was not generalized, when the support of UNIX sock_stream protocol was implemented. So, let's remove fd from fdtab on failure, instead of closing it. Otherwise, uxst_bind_listener(), which could be called in loop for each receiver, will obtain the same fd via socket() for the next receiver. Then, it will bind it again and it will try to re-insert it in fdtab. This can be backported to all stable versions.	2024-08-09 15:23:28 +02:00
Amaury Denoyelle	f3c75a52df	BUG/MINOR: mux-quic: do not send too big MAX_STREAMS ID QUIC stream IDs are expressed as QUIC variable integer which cover the range for 0 to 2^62 - 1. As such, it is forbidden to send an ID for MAX_STREAMS flow-control frame which would allow to overcome this value. This patch fixes MAX_STREAMS emission to ensure sent value is valid. This also ensures that the peer cannot open a stream with an invalid ID as this would cause a flow-control violation instead. This must be backported up to 2.6.	2024-08-09 14:33:49 +02:00
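The bound discussed in this commit follows from how QUIC numbers streams: the two low bits of a stream ID carry the stream type, so the n-th stream of a type has ID 4*(n-1) + type. With IDs limited to 2^62 - 1, the cumulative stream count advertised in MAX_STREAMS cannot exceed 2^60, as RFC 9000 states. The sketch below illustrates a clamp on that count; the function names are hypothetical and this is not haproxy's implementation.

```c
#include <assert.h>
#include <stdint.h>

#define QUIC_VARINT_MAX        ((1ULL << 62) - 1) /* largest encodable value */
#define QUIC_MAX_STREAMS_LIMIT (1ULL << 60)       /* largest valid MAX_STREAMS */

/* Highest stream ID reachable once <count> streams of type <type> are
 * allowed. <type> is the 2-bit stream type stored in the low bits of the ID.
 */
uint64_t max_stream_id_sketch(uint64_t count, unsigned int type)
{
	assert(count >= 1 && count <= QUIC_MAX_STREAMS_LIMIT && type <= 3);
	return ((count - 1) << 2) | type;
}

/* Returns the MAX_STREAMS value to advertise after granting <increment> more
 * streams on top of <current>, clamped so that no resulting stream ID can
 * exceed QUIC_VARINT_MAX. The test is written to avoid integer overflow.
 */
uint64_t max_streams_clamp_sketch(uint64_t current, uint64_t increment)
{
	if (current >= QUIC_MAX_STREAMS_LIMIT ||
	    increment > QUIC_MAX_STREAMS_LIMIT - current)
		return QUIC_MAX_STREAMS_LIMIT;
	return current + increment;
}
```

At the limit, `max_stream_id_sketch(QUIC_MAX_STREAMS_LIMIT, 3)` is exactly `QUIC_VARINT_MAX`, which shows why 2^60 is the largest count that keeps every stream ID encodable.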
Valentine Krasnobaeva	aae2ff7691	MINOR: startup: fix unused value reported by coverity Unused 0 is assigned to ret, as it's rewritten by error code of read_cfg(). This issue was reported by coverity.	2024-08-08 19:54:12 +02:00
Valentine Krasnobaeva	da82f08055	MINOR: cfgparse: load_cfg_in_mem: fix null ptr dereference reported by coverity This helps to optimize a bit load_cfg_in_mem() and fixes the potential null ptr dereference in fread() call. If (read_bytes + bytes_to_read) equals the initial chunk_size (zero), realloc is never called, *cfg_content keeps its NULL value. So, let's ensure that the initial number of bytes to read (read_bytes + bytes_to_read) is strictly positive, when we enter the loop for the first time.	2024-08-08 19:54:12 +02:00
William Lallemand	fe5ddcc490	REGTESTS: mcli: test the pipelined commands on master CLI A recent fix broke the pipelined commands on the master CLI, this reg-test implements a simple test that allows checking their correct behavior. This could be backported as far as 2.6.	2024-08-08 17:29:37 +02:00
William Lallemand	b75edf2f11	BUG/MEDIUM: mworker/cli: fix pipelined modes on master CLI Since commit 3d93ecc ("BUG/MAJOR: cli: Restore non-interactive mode behavior with pipelined commands") and commit 598c7f16 ("BUG/MEDIUM: cli: Warn if pipelined commands are delimited by a \n"), the pipelined commands on the master CLI are either broken or emit warnings depending on the version. The reason is that the modes applied on the master CLI are saved in the current CLI session, and then reinserted for each pipelined command; however, these commands were inserted as new lines. For example: "@1; expert-mode on; debug dev log foo; debug dev log bar" Would be sent as: "expert mode on\ndebug dev log foo" "expert mode on\ndebug dev log bar" This patch fixes the issue by using the new ci_insert() function which inserts a string instead of a newline, and the commands are now suffixed by ';' upon insertion allowing a correct pipelined command chain. This must be backported with the previous commit introducing ci_insert() in every stable version. This is broken since the 3.0 version, but it emits a warning in every version below, because 598c7f164 was backported.	2024-08-08 17:29:37 +02:00
William Lallemand	b2a8e8731d	MINOR: channel: implement ci_insert() function ci_insert() is a function which allows to insert a string <str> of size <len> at <pos> of the input buffer. This is the equivalent of ci_insert_line2() but without inserting '\r\n'	2024-08-08 17:29:37 +02:00
Valentine Krasnobaeva	46181e730a	MINOR: proto_tcp: tcp_bind_listener: copy errno in errmsg Let's copy errno in errmsg produced by tcp_bind_listener if it fails in a syscall(). This is helpful to debug issues, while binding listeners.	2024-08-08 16:34:13 +02:00
Valentine Krasnobaeva	81f48395b3	BUG/MINOR: proto_tcp: keep error msg if listen() fails If listen() fails, we need to keep the message about it, which is copied then in errmsg buffer on the error path. This buffer is properly provided by the caller (protocol_bind_all()) and reallocated if needed in memprintf(), but it was deleted without being returned. This can be backported to all stable versions.	2024-08-08 16:34:06 +02:00
Valentine Krasnobaeva	308c6881c0	BUG/MINOR: proto_tcp: delete fd from fdtab if listen() fails If listen() fails, fd should be deleted from fdtab, not just closed. Otherwise, sock_inet_bind_receiver(), which is called in loop for each receiver, will obtain the same fd via socket() for the next receiver, registered in the receivers list. Then, it will bind it again and it will try to re-insert it in fdtab, and fd_insert() will trigger the BUG_ON(fdtab[fd].owner != NULL) check. When tcp_bind_listener() code was implemented, the use of fd_delete() was not generalized and this one remained overlooked. This can be backported to all stable versions.	2024-08-08 16:33:53 +02:00
Willy Tarreau	8427c5b542	[RELEASE] Released version 3.1-dev5 Released version 3.1-dev5 with the following main changes : - BUG/MINOR: quic: Lack of precision when computing K (cubic only cc) - MEDIUM: ssl/quic: implement quic crypto with EVP_AEAD - MINOR: quic: rename confusing wording aes to hp - MEDIUM: quic: add key argument to header protection crypto functions - MEDIUM: quic: implement CHACHA20_POLY1305 for AWS-LC - MEDIUM: sink: assume sft appctx stickiness - MINOR: quic: delay Retry emission on quic-force-retry - MEDIUM: quic: implement quic-initial rules - MINOR: quic: support ACL for quic-initial rules - MINOR: quic: pass quic_dgram as obj_type for quic-initial rules - MINOR: quic: implement reject quic-initial action - MINOR: quic: implement send-retry quic-initial rules - BUG/MEDIUM: quic: fix invalid conn reject with CONNECTION_REFUSED - MEDIUM: h1: allow to preserve keep-alive on T-E + C-L - MINOR: quic: Add information to "show quic" for CUBIC cc. - MINOR: quic: Dump TX in flight bytes vs window values ratio. 
- BUG/MEDIUM: jwt: Clear SSL error queue on error when checking the signature - BUILD: cfgparse-quic: fix build error on Solaris due to missing netinet/in.h - MINOR: queue: add a function to check for TOCTOU after queueing - BUG/MEDIUM: queue: deal with a rare TOCTOU in assign_server_and_queue() - DOC: config: Add documentation about spop mode for backends - BUG/MEDIUM: stconn: Report error on SC on send if a previous SE error was set - BUG/MEDIUM: mux-pt/mux-h1: Release the pipe on connection error on sending path - BUILD: mux-pt: Use the right name for the sedesc variable - BUG/MINOR: stconn: bs.id and fs.id had their dependencies incorrect - BUG/MEDIUM: ssl: reactivate 0-RTT for AWS-LC - BUG/MEDIUM: ssl: 0-RTT initialized at the wrong place for AWS-LC - BUILD: ssl: replace USE_OPENSSL_AWSLC by OPENSSL_IS_AWSLC - BUG/MEDIUM: quic: prevent conn freeze on 0RTT undeciphered content - MINOR: tcp_sample: Move TCP low level sample fetch function to control layer - MINOR: quic: Define ->get_info() control layer callback for QUIC - MINOR: flags/mux-quic: decode qcc and qcs flags - BUG/MINOR: quic: fix fc_rtt/srtt values - BUG/MIONR: quic: fix fc_lost - BUG/MINOR: h1: do not forward h2c upgrade header token - BUG/MINOR: h2: reject extended connect for h2c protocol - BUG/MEDIUM: http-ana: Report error on write error waiting for the response - BUG/MEDIUM: h2: Only report early HTX EOM for tunneled streams - BUG/MEDIUM: mux-h2: Propagate term flags to SE on error in h2s_wake_one_stream - BUG/MEDIUM: peer: Notify the applet won't consume data when it waits for sync - BUG/MINOR: quic: Too shord datagram during O-RTT handshakes (aws-lc only) - CI: add weekly QUIC Interop regression against AWS-LC - CI: harden NetBSD builds by ERR=1 - BUG/MINOR: quic: Too short datagram during packet building failures (aws-lc only) - DEV: coccinelle: add a test to detect unchecked strdup() - BUG/MINOR: fcgi-app: handle a possible strdup() failure - BUG/MEDIUM: server/addr: fix 
tune.events.max-events-at-once event miss and leak - MINOR: quic: convert qc_stream_desc release field to flags - MINOR: quic: implement function to check if STREAM is fully acked - BUG/MEDIUM: quic: handle retransmit for standalone FIN STREAM - MINOR: quic: enforce ACK reception is handled in order - DOC: configuration: fix alphabetical ordering of {bs,fs}.aborted - MINOR: stconn: add a new pair of sf functions {bs,fs}.debug_str - MINOR: mux-h2: implement the debug string for logs - MINOR: mux-quic: define dump functions for QCC and QCS - MINOR: mux-quic: implement debug string for logs - MINOR: quic: dump quic_conn debug string for logs - MINOR: time: define tot_time structure - MINOR: mux-quic: measure QCS lifetime and its blocking state - BUG/MINOR: trace/quic: enable conn/session pointer recovery from quic_conn - BUG/MINOR: trace/quic: permit to lock on frontend/connect/session etc - BUG/MEDIUM: trace: fix null deref in lockon mechanism since TRACE_ENABLED() - BUG/MINOR: trace: automatically start in waiting mode with "start <evt>" - BUG/MINOR: trace/quic: make "qconn" selectable as a lockon criterion - BUG/MINOR: quic/trace: make quic_conn_enc_level_init() emit NEW not CLOSE - MINOR: trace: support setting the sink and level for all sources at once - MINOR: session/trace: enable very minimal session tracing - MEDIUM: trace: implement a "follow" mechanism - MINOR: trace: move the known trace context into a dedicated struct - MINOR: trace: add a per-source helper to pre-fill the context - MINOR: mux-h2: add a trace context filling helper - MINOR: mux-h1: add a trace context filling helper - MINOR: mux-quic: don't leave dangling pointer after freeing qcs->sd - MINOR: mux-quic: add a trace context filling helper - MINOR: mux-h1/trace: add a state trace on stream creation/upgrade - MINOR: mux-h2/trace: add a state trace on stream creation/destruction - MINOR: mux-h3/trace: add a state trace on stream creation/destruction - BUG/MINOR: quic: prevent freeze after 
early QCS closure - MINOR: server: ensure max_events_at_once > 0 in server_atomic_sync() - MINOR: cfgparse: add struct cfgfile to represent config in memory - REORG: tools: move list_append_word to cfgparse - MINOR: startup: adapt list_append_word to use cfgfile - MINOR: cfgparse: add load_cfg_in_mem - MINOR: cfgparse: load_cfg_in_mem: take in account file size - MINOR: tools: add fgets_from_mem - MEDIUM: startup: make read_cfg() return immediately on ENOMEM - MEDIUM: startup: load and parse configs from memory - MINOR: startup: rename readcfgfile in parse_cfg	2024-08-07 18:42:33 +02:00
Valentine Krasnobaeva	c6cfa7cb4a	MINOR: startup: rename readcfgfile in parse_cfg As readcfgfile no longer opens configuration files and reads them with fgets, but performs only the parsing of provided data, let's rename it to parse_cfg by analogy with read_cfg in haproxy.c.	2024-08-07 18:41:41 +02:00
Valentine Krasnobaeva	5b52df4c4d	MEDIUM: startup: load and parse configs from memory Let's call load_cfg_in_mem() helper for each configuration file to load its content in some area in memory. Adapt readcfgfile() parser function respectively. In order to limit changes in its scope we give as an argument a cfgfile structure, already filled in init_args() and in load_cfg_in_mem() with file metadata and content. Parser function (readcfgfile()) uses now fgets_from_mem() instead of standard fgets from libc implementations. SPOE filter parses its own configuration file, pointed by 'config' keyword in the configuration already loaded in memory. So, let's allocate and fill for this a supplementary cfgfile structure, which is not referenced in cfg_cfgfiles list. This structure and the memory with content of SPOE filter configuration are freed immediately in parse_spoe_flt(), when readcfgfile() returns. HAProxy OpenTracing filter also uses its own configuration file. So, let's follow the same logic as we do for SPOE filter.	2024-08-07 18:41:41 +02:00
Valentine Krasnobaeva	2bb34edb0b	MEDIUM: startup: make read_cfg() return immediately on ENOMEM This commit prepares read_cfg() to call load_cfg_in_mem() helper in order to load configuration files in memory. Before, read_cfg() called the parser for all files from cfg_cfgfiles list and cumulated parser's errors and memprintf's errors in for_each loop. memprintf's errors did not stop this loop and were accounted just after. Now, as we plan to load configuration files in memory, we stop the loop, if memprintf() fails, and we show appropriate error message with ha_alert. Then process terminates. So not all cumulated syntax-related errors will be shown before exit in this case and we have to stop, because we run out of memory. If we can't open the current file or we fail to allocate a memory to store some configuration line, the previous behaviour is kept, process emits appropriate alert message and exits. If parser returns some syntax-related error on the current file, the previous behaviour is kept as well. We cumulate such errors for all parsed files and we check them just after the loop. All syntax-related errors for all files are shown then as before in ha_alert messages line by line during the startup. Then process will exit with 1. As now cfg_cfgfiles list contains many pointers to some memory areas with configuration files content and this content could be big, it's better to free the list explicitly, when parsing was finished. So, let's change read_cfg() to return some integer value to its caller init(), and let's perform the free routine at a caller level, as cfg_cfgfiles list was initialized and initially filled at this level.	2024-08-07 18:41:41 +02:00
Valentine Krasnobaeva	007f7f2f02	MINOR: tools: add fgets_from_mem Add fgets_from_mem() helper to read lines from configuration files, stored now as memory chunks. In order to limit changes in the first-level parser code (readcfgfile()), it is better to reimplement the standard fgets, i.e. to have a fgets, which can read the serialized data line by line from some memory area, instead of file stream, and can keep the same behaviour as libc implementations fgets.	2024-08-07 18:41:41 +02:00
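The idea behind fgets_from_mem() above can be illustrated with a minimal sketch. The function name mem_fgets(), its signature and its edge-case behaviour are assumptions made for illustration only; this is not the helper from the HAProxy tree. It copies at most size - 1 bytes, stops after a newline, and advances a caller-owned position pointer, which is what lets a line-oriented parser keep its fgets()-style loop:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical fgets()-like reader over the memory area [*pos, end).
 * Returns buf, or NULL when there is nothing left to read.
 */
static char *mem_fgets(char *buf, int size, const char **pos, const char *end)
{
    const char *nl;
    size_t len;

    if (size <= 0 || *pos >= end)
        return NULL;                    /* same outcome as fgets() at EOF */

    /* never copy more than what remains, nor more than size - 1 bytes */
    len = (size_t)(end - *pos);
    if (len > (size_t)size - 1)
        len = (size_t)size - 1;

    /* stop right after the first newline, keeping it like fgets() does */
    nl = memchr(*pos, '\n', len);
    if (nl)
        len = (size_t)(nl - *pos) + 1;

    memcpy(buf, *pos, len);
    buf[len] = '\0';
    *pos += len;                        /* next call resumes from here */
    return buf;
}
```

A caller would loop on mem_fgets() until it returns NULL, exactly as it would loop on fgets() with a FILE stream.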
Valentine Krasnobaeva	03e63b98ca	MINOR: cfgparse: load_cfg_in_mem: take in account file size Let's take into account the given file size, when it's reported via stat. It's very convenient for large configuration files, as this allows performing only one memory allocation call for precisely the needed file size. This also allows performing only one call to fread(). We need to provide to fread() file_stat.st_size + 1 to be able to grab EOF. Like this it sets feof(f)=1 flag and this allows to exit from the loop immediately, just after fread call. If /dev/stdin or /dev/null is provided as a file, we continue to read the configuration chunk by chunk, as stat doesn't report the size.	2024-08-07 18:41:41 +02:00
Valentine Krasnobaeva	5b9ed6e4be	MINOR: cfgparse: add load_cfg_in_mem Add load_cfg_in_mem() helper, which allows to store the content of a given file in memory.	2024-08-07 18:41:41 +02:00
Valentine Krasnobaeva	bafb0ce272	MINOR: startup: adapt list_append_word to use cfgfile list_append_word() helper was used before only to chain configuration file names in a list. As now we start to use cfgfile structure which represents entire file in memory and its metadata, let's adapt this helper to use this structure and let's rename it to list_append_cfgfile(). Adapt functions, which process configuration files and directories to use cfgfile structure and list_append_cfgfile() instead of wordlist.	2024-08-07 18:41:41 +02:00
Valentine Krasnobaeva	39f2a19620	REORG: tools: move list_append_word to cfgparse Let's move list_append_word to cfgparse.c as it is used only to fill cfg_cfgfiles list with configuration file names.	2024-08-07 18:41:41 +02:00
Valentine Krasnobaeva	70b842e847	MINOR: cfgparse: add struct cfgfile to represent config in memory This and following commits serve to prepare loading configuration files in memory, before parsing them, as we may need to parse some parts of configuration in different moments of the startup sequence. This is a case of the new master-worker initialization process. Here we need to read at first only the global and the program sections and only after some steps (forking worker, etc) the rest of the configuration. Add a new structure cfgfile to keep configuration files metadata and content, loaded somewhere in a memory. Instances of filled cfgfile structures could be chained in a list, as the order in which they were loaded is important.	2024-08-07 18:41:41 +02:00
Aurelien DARRAGON	a6d1eb8f5d	MINOR: server: ensure max_events_at_once > 0 in server_atomic_sync() In 8f1fd96 ("BUG/MEDIUM: server/addr: fix tune.events.max-events-at-once event miss and leak"), we added a comment saying that tune.events.max-events-at-once is assumed to be strictly positive. It is so because the keyword parser forces values between 1 and 10000: we don't want less than 1 because it wouldn't make any sense, and 10k max because beyond that we could create contention in server_atomic_sync() Now as the above commit implements a do..while it heavily relies on the fact that the budget is at least 1. Upon soft-stop, we break away from the loop without decrementing the budget. With all that in mind, it is safe to assume that the 'remain' counter will only fall to 0 if the task runs out of budget while doing work, in which case the task still exists and must be rescheduled. As seen in GH #2667 this assumption was ambiguous, so let's make it official by adding a pair of BUG_ON() that make it explicit that it works because remain 'cannot' be 0 unless the entire budget was consumed. No backport needed.	2024-08-07 18:31:35 +02:00
Amaury Denoyelle	3ef1ee477d	BUG/MINOR: quic: prevent freeze after early QCS closure A connection freeze may occur if a QCS is released before transmitting any data. This can happen when an error is detected early by the stream, for example during HTTP response headers encoding, forcing the whole connection closure. In this case, a connection error is registered by the QUIC MUX to the lower layer. MUX is then released and xprt layer is notified to prepare CONNECTION_CLOSE emission. However, this is prevented because quic_conn streams tree is not empty as it contains the qc_stream_desc previously attached to the failed QCS instance. The connection will freeze until QUIC idle timeout. This situation is caused by an omission during qc_stream_desc release operation. In the described situation, qc_stream_desc current buffer is empty and can thus be removed, which is the purpose of this patch. This unblocks this previously failed situation, with qc_stream_desc removal from quic_conn tree. This issue can be reproduced by modifying H3/QPACK code to return an early error during HEADERS response processing. This must be backported up to 2.6, after a period of observation.	2024-08-07 18:14:29 +02:00
Willy Tarreau	d5da87b5dc	MINOR: mux-h3/trace: add a state trace on stream creation/destruction Logging below the developer level doesn't always yield very convenient traces as we don't know well where streams are allocated nor released. Let's just make that more explicit by using state-level traces for these important steps.	2024-08-07 16:02:59 +02:00
Willy Tarreau	23417ab9d4	MINOR: mux-h2/trace: add a state trace on stream creation/destruction Logging below the developer level doesn't always yield very convenient traces as we don't know well where streams are allocated nor released. Let's just make that more explicit by using state-level traces for these important steps.	2024-08-07 16:02:59 +02:00
Willy Tarreau	cc12d1b253	MINOR: mux-h1/trace: add a state trace on stream creation/upgrade Logging below the developer level doesn't always yield very convenient traces as we don't know well where streams are allocated nor released. Let's just make that more explicit by using state-level traces. Note that h1s destruction was already logged as closing connection or switching to idle mode.	2024-08-07 16:02:59 +02:00
Willy Tarreau	6191de6aa6	MINOR: mux-quic: add a trace context filling helper This helper is able to find a connection, a session, a stream, or a frontend from its args.	2024-08-07 16:02:59 +02:00
Willy Tarreau	b2cede590b	MINOR: mux-quic: don't leave dangling pointer after freeing qcs->sd In qcs_free() we're calling a few other functions after releasing qcs->sd. None of them make use of it for now but with traces that will change. Make sure to clear qcs->sd after releasing it.	2024-08-07 16:02:59 +02:00
Willy Tarreau	adfe0a30e1	MINOR: mux-h1: add a trace context filling helper This helper is able to find a connection, a session, a stream, a frontend or a backend from its args.	2024-08-07 16:02:59 +02:00
Willy Tarreau	6c6ef5ae12	MINOR: mux-h2: add a trace context filling helper This helper is able to find a connection, a session, a stream, a frontend or a backend from its args. Note that this required to always make sure that h2s->sess is reset on allocation because it's normally initialized later for backend streams, and producing traces between the two could pre-fill a bad pointer in the trace_ctx.	2024-08-07 16:02:59 +02:00
Willy Tarreau	10c8baca44	MINOR: trace: add a per-source helper to pre-fill the context Now sources which want to do it can provide a helper that can pre-fill some fields in the context based on their knowledge (e.g. mux streams).	2024-08-07 16:02:59 +02:00
Willy Tarreau	7d55a70f5a	MINOR: trace: move the known trace context into a dedicated struct We now have a trace_ctx to hold the sess, conn, qc, stream and so on. This will allow us to pass it across layers so that other helpers can help fill them. Ideally it should be passed as an argument to __trace_enabled() by __trace() so that it can be passed back to the trace callback. But it seems that trace callbacks are smart enough to figure all their info when they need them.	2024-08-07 16:02:59 +02:00
Willy Tarreau	d465610ec3	MEDIUM: trace: implement a "follow" mechanism With "follow" from one source to another, it becomes possible for a source to automatically follow another source's tracked pointer. The best example is the session: - the "session" source is enabled and has a "lockon session" -> its lockon_ptr is equal to the session when valid - other sources (h1,h2,h3 etc) are configured for "follow session" and will then automatically check if session's lockon_ptr matches its own session, in which case tracing will be enabled for that trace (no state change). It's not necessary to start/pause/stop traces when using this, only "follow" followed by a source with lockon enabled is needed. Some combinations might work better than others. At the moment the session is almost never known from the backend, but this may improve. The meta-source "all" is supported for the follower so that all sources will follow the tracked one.	2024-08-07 16:02:59 +02:00
Willy Tarreau	abb07af67e	MINOR: session/trace: enable very minimal session tracing By having traces at the session level, it becomes possible to start traces on session creation and pause them on session end. Doing so will soon open new possibilities to synchronize multiple traces.	2024-08-07 16:02:59 +02:00
Willy Tarreau	d2a49de9c7	MINOR: trace: support setting the sink and level for all sources at once It's extremely painful to have to set "trace <src> sink buf1" for all sources, then to do the same for "level developer" (for example). Let's have a possibility via a meta-source "all" to apply the change to all sources at once. This currently supports level and sink, which are not dependent on the source, this is a good start.	2024-08-07 16:02:59 +02:00
Willy Tarreau	6bf50dfccc	BUG/MINOR: quic/trace: make quic_conn_enc_level_init() emit NEW not CLOSE The event emitted by this trace was of type CLOSE instead of NEW, which would sometimes temporarily pause a started trace. This can be backported to 3.0, probably 2.6.	2024-08-07 16:02:59 +02:00
Willy Tarreau	7a22fbd453	BUG/MINOR: trace/quic: make "qconn" selectable as a lockon criterion The test was performed but there's no way to set the option! Let's just add "qconn" to select the quic conn when the source supports it. This can be backported at least to 3.0, probably 2.6.	2024-08-07 16:02:59 +02:00
Willy Tarreau	0406efe9ad	BUG/MINOR: trace: automatically start in waiting mode with "start <evt>" The doc clearly says that "start <evt>" should leave the trace in pause mode until the indicated event appears. However it's not what's happening, the state is not changed until one command uses "now", so it's typically needed to configure the events with "start <evt>" then enable the waiting mode using "pause now". This is counter-intuitive and does not match the doc, so let's fix it so that "start <evt>" switches from stopped to waiting as long as at least one event is enabled. This can be backported to all versions.	2024-08-07 16:02:59 +02:00
Willy Tarreau	b5df6b5a31	BUG/MEDIUM: trace: fix null deref in lockon mechanism since TRACE_ENABLED() When calling TRACE_ENABLED(), which is called by TRACE_PRINTF(), we pass a NULL plockptr to __trace_enabled(). This argument is used when lockon is active, and may update the pointer. This is an oversight which also broke the lockon mechanism because now for calls from __trace(), it dereferences a pointer pointing to NULL, and never updates it due to the broken condition, so that trace() never sets up src->lockon_ptr. The bug was introduced in 2.8 by commit 8f9a9704bb ("MINOR: trace: add a TRACE_ENABLED() macro to determine if a trace is active"), so the fix must be backported there.	2024-08-07 16:02:59 +02:00
Willy Tarreau	88a752ca78	BUG/MINOR: trace/quic: permit to lock on frontend/connect/session etc These ones were not proposed in the list of trackable elements. Note that this depends on previous commit: BUG/MINOR: trace/quic: enable conn/session pointer recovery from quic_conn This should be backported to at least 3.0, maybe even 2.6.	2024-08-07 16:02:59 +02:00
Willy Tarreau	aa1915a9f5	BUG/MINOR: trace/quic: enable conn/session pointer recovery from quic_conn In __trace_enabled(), a quic_conn was detected, but it was not possible to derive the connection nor the session from it, which was quite limiting in terms of ability to track a same instance. This should be backported to at least 3.0, maybe even 2.6.	2024-08-07 16:02:59 +02:00
Amaury Denoyelle	9f829ea3f3	MINOR: mux-quic: measure QCS lifetime and its blocking state Reuse newly defined tot_time structure to measure various values related to a QCS lifetime. First, a timer is used to account for the total QCS lifetime. Then, two other timers are used to account for the total time during which Tx from stream layer to MUX is blocked, either on lack of buffer or due to flow-control. These three timers are reported in qmux_dump_qcs_info(). Thus, they are available in traces and for QUIC MUX debug string sample.	2024-08-07 15:40:52 +02:00
Amaury Denoyelle	a6e2523ca1	MINOR: time: define tot_time structure Define a new utility type tot_time. Its purpose is to be able to account for elapsed time across multiple periods. Functions are defined to easily start and stop measures, and return the current value.	2024-08-07 15:40:52 +02:00
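A timer that accumulates elapsed time across several start/stop periods, as described for tot_time above, could look roughly like the following sketch. The acc_time type, its fields and the explicit now_ms argument are hypothetical and chosen only to keep the example self-contained; they are not taken from the HAProxy sources:

```c
#include <stdint.h>

/* Hypothetical accumulator: sums the duration of every started period. */
struct acc_time {
    uint32_t start_ms;   /* start of the period in progress, if any */
    uint32_t total_ms;   /* sum of all completed periods */
    int running;         /* non-zero while a period is being measured */
};

/* Begin a new period; ignored if one is already in progress. */
static void acc_time_start(struct acc_time *t, uint32_t now_ms)
{
    if (!t->running) {
        t->start_ms = now_ms;
        t->running = 1;
    }
}

/* End the period in progress and add its duration to the total. */
static void acc_time_stop(struct acc_time *t, uint32_t now_ms)
{
    if (t->running) {
        t->total_ms += now_ms - t->start_ms;
        t->running = 0;
    }
}

/* Report the accumulated time, including the period still in progress. */
static uint32_t acc_time_read(const struct acc_time *t, uint32_t now_ms)
{
    return t->total_ms + (t->running ? now_ms - t->start_ms : 0);
}
```

With such a type, a stream could keep one accumulator per blocking cause and start or stop it each time the corresponding condition appears or clears.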
Amaury Denoyelle	663416b4ef	MINOR: quic: dump quic_conn debug string for logs Define a new xprt_ops callback named dump_info. This can be used to extend MUX debug string with infos from the lower layer. Implement dump_info for QUIC stack. For now, only minimal info are reported : bytes in flight and size of the sending window. This should allow to detect if the congestion controller is fine. These info are reported via QUIC MUX debug string sample.	2024-08-07 15:40:52 +02:00
Amaury Denoyelle	630fa53c51	MINOR: mux-quic: implement debug string for logs Implement MUX_SCTL_DBG_STR for QUIC MUX. This returns info for the current QCS and QCC instances, reusing qmux_dump_qc{c,s}_info functions already used for traces, and the connection flags. This stream operation is useful for debug string sample support.	2024-08-07 15:40:52 +02:00
Amaury Denoyelle	eb4dfa3b36	MINOR: mux-quic: define dump functions for QCC and QCS Extract trace code to dump QCC and QCS instances into dedicated functions named qmux_dump_qc{c,s}_info(). This will allow to easily print QCC/QCS infos outside of traces.	2024-08-07 15:40:52 +02:00
Willy Tarreau	490cb16d3a	MINOR: mux-h2: implement the debug string for logs Now it permits to have this for a front and a back: <134>Jul 30 19:32:53 haproxy[24405]: 127.0.0.1:64860 [30/Jul/2024:19:32:53.732] test2 test2/s1 0/0/0/0/0 200 130 - - ---- 2/1/0/0/0 0/0 "GET /blah HTTP/2.0" h2s.id=1 .st=CLO .flg=0x7003 .rxbuf=0@(nil)+0/0 .sc=0x1e03fb0(.flg=0x00034482 .app=0x1e04020) .sd=0x1e03f30(.flg=0x50405601) .subs=(nil) h2c.st0=FRH .err=0 .maxid=1 .lastid=-1 .flg=0x100e00 .nbst=0 .nbsc=1, .glitches=0 .fctl_cnt=0 .send_cnt=0 .tree_cnt=1 .orph_cnt=0 .sub=1 .dsi=1 .dbuf=0@(nil)+0/0 .mbuf=[1..1\|32],h=[0@(nil)+0/0],t=[0@(nil)+0/0] .task=(nil) conn.flg=0x80000300 <134>Jul 30 19:32:53 haproxy[24405]: 127.0.0.1:65246 [30/Jul/2024:19:32:53.732] test1 test1/s1 0/0/0/0/0 200 130 - - ---- 2/1/0/0/0 0/0 "GET /blah HTTP/1.1" h2s.id=1 .st=CLO .flg=0x7003 .rxbuf=0@(nil)+0/0 .sc=0x1dfc7b0(.flg=0x0006d01b .app=0x1c65fe0) .sd=0x1dfc820(.flg=0x1040ca01) .subs=(nil) h2c.st0=FRH .err=0 .maxid=1 .lastid=-1 .flg=0x108e00 .nbst=0 .nbsc=1, .glitches=0 .fctl_cnt=0 .send_cnt=0 .tree_cnt=1 .orph_cnt=0 .sub=1 .dsi=1 .dbuf=0@(nil)+0/0 .mbuf=[1..1\|32],h=[0@(nil)+0/0],t=[0@(nil)+0/0] .task=(nil) conn.flg=0x000300 Just with this in the front and back proxies respectively: log-format "$HAPROXY_HTTP_LOG_FMT %[bs.debug_str(15)]" log-format "$HAPROXY_HTTP_LOG_FMT %[fs.debug_str(15)]" For now the mux only implements muxs, muxc, conn. Xprt is ignored.	2024-08-07 14:07:41 +02:00
Willy Tarreau	921e04bf87	MINOR: stconn: add a new pair of sf functions {bs,fs}.debug_str These are passed to the underlying mux to retrieve debug information at the mux level (stream/connection) as a string that's meant to be added to logs. The API is quite complex just because we can't pass any info to the bottom function. So we construct a union and pass the argument as an int, and expect the callee to fill that with its buffer in return. Most likely the mux->ctl and ->sctl API should be reworked before the release to simplify this. The functions take an optional argument that is a bit mask of the layers to dump: muxs=1 muxc=2 xprt=4 conn=8 sock=16 The default (0) logs everything available.	2024-08-07 14:07:41 +02:00
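The layer bit mask described above (muxs=1 muxc=2 xprt=4 conn=8 sock=16, with 0 meaning everything) can be decoded as in this small sketch. The enum and function names are made up for illustration; only the numeric values come from the commit message:

```c
/* Layer bits as listed in the commit message; names are hypothetical. */
enum dbg_layer {
    DBG_MUXS = 1,
    DBG_MUXC = 2,
    DBG_XPRT = 4,
    DBG_CONN = 8,
    DBG_SOCK = 16
};

/* Returns non-zero if <layer> must be dumped for the given mask.
 * A mask of 0 is the default and selects every layer.
 */
static int dbg_layer_selected(unsigned int mask, enum dbg_layer layer)
{
    return mask == 0 || (mask & layer) != 0;
}
```

Under this reading, an argument of 15 selects muxs, muxc, xprt and conn, and leaves sock out.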
Willy Tarreau	b681a9e488	DOC: configuration: fix alphabetical ordering of {bs,fs}.aborted These must be before {bs,fs}.id, not after. Should be backported wherever 068ce2d5d2 ("MINOR: stconn: Add samples to retrieve about stream aborts") is (normally 3.0).	2024-08-07 14:07:41 +02:00
Amaury Denoyelle	b2282082dd	MINOR: quic: enforce ACK reception is handled in order Add a new BUG_ON() in qc_stream_desc_ack(). It ensures that acknowledgements are always notified in-order. This is because out-of-order ACKs cannot be handled by qc_stream_desc layer which does not support gap in STREAM sent data. Prior to this fix, out-of-order ACKs are simply ignored without any error. This currently cannot happen thanks to careful qc_stream_desc_ack() invocation. If this assumption is broken in the future by inattention, this would cause loss of ACK notification which will prevent qc_stream_desc release.	2024-08-07 11:08:20 +02:00
Amaury Denoyelle	e177cf341c	BUG/MEDIUM: quic: handle retransmit for standalone FIN STREAM STREAM frames have dedicated handling on retransmission. A special check is done to remove data already acked in case of duplicated frames, thus only unacked data are retransmitted. This handling is faulty in case of an empty STREAM frame with FIN set. On retransmission, this frame does not cover any unacked range as it is empty and is thus discarded. This may cause the transfer to freeze with the client waiting indefinitely for the FIN notification. To handle retransmission of empty FIN STREAM frame, qc_stream_desc layer has been extended. A new flag QC_SD_FL_WAIT_FOR_FIN is set by MUX QUIC when FIN has been transmitted. If set, it prevents qc_stream_desc from being freed until FIN is acknowledged. On retransmission side, qc_stream_frm_is_acked() has been updated. It now reports false if FIN bit is set on the frame and qc_stream_desc has QC_SD_FL_WAIT_FOR_FIN set. This must be backported up to 2.6. However, this heavily modifies a critical section for ACK handling and retransmission. As such, it must be backported only after a period of observation. This issue can be reproduced by using the following socat command as server to add delay between the response and connection closure : $ socat TCP-LISTEN:<port>,fork,reuseaddr,crlf SYSTEM:'echo "HTTP/1.1 200 OK"; echo ""; sleep 1;' On the client side, ngtcp2 can be used to simulate packet drop. 
Without this patch, connection will be interrupted on QUIC idle timeout or haproxy client timeout with ERR_DRAINING on ngtcp2 : $ ngtcp2-client --exit-on-all-streams-close -r 0.3 <host> <port> "http://<host>:<port>/?s=32o" Alternatively to ngtcp2 random loss, an extra haproxy patch can also be used to force skipping the emission of the empty STREAM frame : diff --git a/include/haproxy/quic_tx-t.h b/include/haproxy/quic_tx-t.h index efbdfe687..1ff899acd 100644 --- a/include/haproxy/quic_tx-t.h +++ b/include/haproxy/quic_tx-t.h @@ -26,6 +26,8 @@ extern struct pool_head *pool_head_quic_cc_buf; /* Flag a sent packet as being probing with old data */ #define QUIC_FL_TX_PACKET_PROBE_WITH_OLD_DATA (1UL << 5) +#define QUIC_FL_TX_PACKET_SKIP_SENDTO (1UL << 6) + /* Structure to store enough information about TX QUIC packets. */ struct quic_tx_packet { /* List entry point. */ diff --git a/src/quic_tx.c b/src/quic_tx.c index 2f199ac3c..2702fc9b9 100644 --- a/src/quic_tx.c +++ b/src/quic_tx.c @@ -318,7 +318,7 @@ static int qc_send_ppkts(struct buffer *buf, struct ssl_sock_ctx *ctx) tmpbuf.size = tmpbuf.data = dglen; TRACE_PROTO("TX dgram", QUIC_EV_CONN_SPPKTS, qc); - if (!skip_sendto) { + if (!skip_sendto && !(first_pkt->flags & QUIC_FL_TX_PACKET_SKIP_SENDTO)) { int ret = qc_snd_buf(qc, &tmpbuf, tmpbuf.data, 0, gso); if (ret < 0) { if (gso && ret == -EIO) { @@ -354,6 +354,7 @@ static int qc_send_ppkts(struct buffer *buf, struct ssl_sock_ctx *ctx) qc->cntrs.sent_bytes_gso += ret; } } + first_pkt->flags &= ~QUIC_FL_TX_PACKET_SKIP_SENDTO; b_del(buf, dglen + QUIC_DGRAM_HEADLEN); qc->bytes.tx += tmpbuf.data; @@ -2066,6 +2067,17 @@ static int qc_do_build_pkt(unsigned char *pos, const unsigned char *end, continue; } + switch (cf->type) { + case QUIC_FT_STREAM_8 ... 
QUIC_FT_STREAM_F: + if (!cf->stream.len && (qc->flags & QUIC_FL_CONN_TX_MUX_CONTEXT)) { + TRACE_USER("artificially drop packet with empty STREAM frame", QUIC_EV_CONN_TXPKT, qc); + pkt->flags |= QUIC_FL_TX_PACKET_SKIP_SENDTO; + } + break; + default: + break; + } + quic_tx_packet_refinc(pkt); cf->pkt = pkt; }	2024-08-07 11:03:32 +02:00
Amaury Denoyelle	714009b7bc	MINOR: quic: implement function to check if STREAM is fully acked When a STREAM frame is retransmitted, a check is performed to remove range of data already acked from it. This is useful when STREAM frames are duplicated and split to cover different data ranges. The newly retransmitted frame contains only unacked data. This process is performed similarly in qc_dup_pkt_frms() and qc_build_frms(). Refactor the code into a new function named qc_stream_frm_is_acked(). It returns true if frame data are already fully acked and retransmission can be avoided. If only a partial range of data is acknowledged, frame content is updated to only cover the unacked data. This patch does not have any functional change. However, it simplifies retransmission for STREAM frames. Also, it will be reused to fix retransmission for empty STREAM frames with FIN set from the following patch : BUG/MEDIUM: quic: handle retransmit for standalone FIN STREAM As such, it must be backported prior to it.	2024-08-07 10:57:10 +02:00
Amaury Denoyelle	bb9ac256a1	MINOR: quic: convert qc_stream_desc release field to flags qc_stream_desc had a field <release> used as a boolean. Convert it with a new <flags> field and QC_SD_FL_RELEASE value as equivalent. The purpose of this patch is to be able to extend qc_stream_desc by adding newer flags values. This patch is required for the following patch BUG/MEDIUM: quic: handle retransmit for standalone FIN STREAM As such, it must be backported prior to it.	2024-08-06 18:00:17 +02:00
Aurelien DARRAGON	8f1fd96d17	BUG/MEDIUM: server/addr: fix tune.events.max-events-at-once event miss and leak An issue has been introduced with cd99440 ("BUG/MAJOR: server/addr: fix a race during server addr:svc_port updates"). Indeed, in the above commit we implemented the atomic_sync task which is responsible for consuming pending server events to apply the changes atomically. For now only server's addr updates are concerned. To prevent the task from causing contention, a budget was assigned to it. It can be controlled with the global tunable 'tune.events.max-events-at-once': the task may not process more than this number of events at once. However, a bug was introduced with this budget logic: each time the task has to be interrupted because it runs out of budget, we reschedule the task to finish where it left off, but the current event which was already removed from the queue wasn't processed yet. This means that this pending event (each tune.events.max-events-at-once) is effectively lost. When the atomic_sync task deals with large number of concurrent events, this bug has 2 known consequences: first a server's addr/port update will be lost every 'tune.events.max-events-at-once'. This can of course cause reliability issues because if the event is not republished periodically, the server could stay in a stale state for indefinite amount of time. This is the case when the DNS server flaps for instance: some servers may not come back UP after the incident as described in GH #2666. Another issue is that the lost event was not cleaned up, resulting in a small memory leak. So in the end, it means that the bug is likely to cause more and more degradation over time until haproxy is restarted. As a workaround, 'tune.events.max-events-at-once' may be set to the maximum number of events expected per batch. 
Note however that this value cannot exceed 10 000, otherwise it could cause the watchdog to trigger due to the task being busy for too long and preventing other threads from making any progress. Setting higher values may not be optimal for common workloads so it should only be used to mitigate the bug while waiting for this fix. Since tune.events.max-events-at-once defaults to 100, this bug only affects configs that involve more than 100 servers whose addr:port properties are likely to be updated at the same time (batched updates from cli, lua, dns..) To fix the bug, we move the budget check after the current event is fully handled. For that we went from a basic 'while' to 'do..while' loop as we assume from the config that 'tune.events.max-events-at-once' cannot be 0. While at it, we reschedule the task once thread isolation ends (it was not required to perform the reschedule while under isolation) to give the hand back faster to waiting threads. This patch should be backported up to 2.9 with cd99440. It should fix GH #2666.	2024-08-06 16:41:37 +02:00
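The event loss described above can be reproduced with a deliberately simplified model. Everything in this sketch (the evq type, the integer events, the function names) is hypothetical and only mirrors the control flow from the commit message: in the buggy loop the budget is checked after an event has already been dequeued, so that event is dropped on every interruption, whereas the do..while form only checks the budget once the current event is fully handled:

```c
/* Toy event queue: events are the integers head..tail-1. */
struct evq {
    int head;      /* next event to dequeue */
    int tail;      /* one past the last queued event */
    int handled;   /* number of events actually processed */
};

/* Dequeue one event, or return -1 when the queue is empty. */
static int evq_pop(struct evq *q)
{
    return (q->head == q->tail) ? -1 : q->head++;
}

/* Buggy shape: the budget check happens after the dequeue. */
static void run_buggy(struct evq *q, int budget)
{
    int remain = budget;

    while (evq_pop(q) >= 0) {
        if (!remain--)
            break;          /* event was dequeued but is never handled */
        q->handled++;
    }
}

/* Fixed shape: budget is checked once the event is handled.
 * Assumes budget >= 1, as the commit message states for the tunable.
 */
static void run_fixed(struct evq *q, int budget)
{
    int remain = budget;

    do {
        if (evq_pop(q) < 0)
            break;          /* nothing left to do */
        q->handled++;
    } while (--remain);
}
```

With 10 queued events and a budget of 3, repeatedly calling run_buggy() until the queue is empty handles only 8 events, while run_fixed() handles all 10.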
Ilia Shipitsin	aaaacaaf4b	BUG/MINOR: fcgi-app: handle a possible strdup() failure This defect was found by the coccinelle script "unchecked-strdup.cocci". It can be backported to 2.2.	2024-08-06 08:21:49 +02:00
Ilia Shipitsin	661e1db826	DEV: coccinelle: add a test to detect unchecked strdup() The coccinelle test "unchecked-strdup.cocci" detects various cases of unchecked strdup().	2024-08-06 08:21:49 +02:00
Frederic Lecaille	eb1a097a66	BUG/MINOR: quic: Too short datagram during packet building failures (aws-lc only) This issue was reported by Ilya (@Chipitsine) when building haproxy against aws-lc in GH #2663 where handshakeloss and handshakecorruption interop tests could lead haproxy to crash after having built too short datagrams: FATAL: bug condition "first_pkt->type == QUIC_PACKET_TYPE_INITIAL && (first_pkt->flags & (1UL << 0)) && length < 1200" matched at src/quic_tx.c:163 call trace(13): \| 0x55f4ee4dcc02 [ba d9 00 00 00 48 8d 35]: main-0x195bf2 \| 0x55f4ee4e3112 [83 3d 2f 16 35 00 00 0f]: qc_send+0x11f3/0x1b5d \| 0x55f4ee4e9ab4 [85 c0 0f 85 00 f6 ff ff]: quic_conn_io_cb+0xab1/0xf1c \| 0x55f4ee6efa82 [48 c7 c0 f8 55 ff ff 64]: run_tasks_from_lists+0x173/0x9c2 \| 0x55f4ee6f05d3 [8b 7d a0 29 c7 85 ff 0f]: process_runnable_tasks+0x302/0x6e6 \| 0x55f4ee671bb7 [83 3d 86 72 44 00 01 0f]: run_poll_loop+0x6e/0x57b \| 0x55f4ee672367 [48 8b 1d 22 d4 1d 00 48]: main-0x48d \| 0x55f4ee6755e0 [b8 00 00 00 00 e8 08 61]: main+0x2dec/0x335d This could happen after Handshake packet building failures which follow a successful Initial packet into the same datagram. In this case, the datagram could be emitted with a too short length (<1200 bytes). To fix this, store the datagram only if the first packet is not an Initial packet or if its length is big enough (>=1200 bytes). Must be backported as far as 2.6.	2024-08-05 13:40:51 +02:00
Ilia Shipitsin	7fc52032e3	CI: harden NetBSD builds by ERR=1 Add ERR=1 build option to the NetBSD build from github.	2024-08-05 08:49:19 +02:00
Ilia Shipitsin	15d47eda37	CI: add weekly QUIC Interop regression against AWS-LC currently only quic-go and picoquic clients are enabled. Tests will be run weekly.	2024-08-05 08:46:49 +02:00
Frederic Lecaille	e12620a8a9	BUG/MINOR: quic: Too shord datagram during O-RTT handshakes (aws-lc only) By "aws-lc only", one means that this bug was first revealed by aws-lc stack. This does not mean it will not appear with new versions of other TLS stacks which have never revealed this bug. This bug was reported by Ilya (@chipitsine) in GH #2657 where some QUIC interop tests (resumption, zerortt) could lead to crash with haproxy compiled against aws-lc TLS stack. These crashes were triggered by this BUG_ON() which detects that too short datagrams with at least one ack-eliciting Initial packet inside could be built. <0>2024-07-31T15:13:42.562717+02:00 [01\|quic\|5\|quic_tx.c:739] qc_prep_pkts(): next encryption level : qc@0x61d000041080 idle_timer_task@0x60d000006b80 flags=0x6000058 FATAL: bug condition "first_pkt->type == QUIC_PACKET_TYPE_INITIAL && (first_pkt->flags & (1UL << 0)) && length < 1200" matched at src/quic_tx.c:163 call trace(12): \| 0x563ea447bc02 [ba d9 00 00 00 48 8d 35]: main-0x1958ce \| 0x563ea4482703 [e9 73 fe ff ff ba 03 00]: qc_send+0x17e4/0x1b5d \| 0x563ea4488ab4 [85 c0 0f 85 00 f6 ff ff]: quic_conn_io_cb+0xab1/0xf1c \| 0x563ea468e6f9 [48 c7 c0 f8 55 ff ff 64]: run_tasks_from_lists+0x173/0x9c2 \| 0x563ea468f24a [8b 7d a0 29 c7 85 ff 0f]: process_runnable_tasks+0x302/0x6e6 \| 0x563ea4610893 [83 3d aa 65 44 00 01 0f]: run_poll_loop+0x6e/0x57b \| 0x563ea4611043 [48 8b 1d 46 c7 1d 00 48]: main-0x48d \| 0x7f64d05fb609 [64 48 89 04 25 30 06 00]: libpthread:+0x8609 \| 0x7f64d0520353 [48 89 c7 b8 3c 00 00 00]: libc:clone+0x43/0x5e That said everything was correctly done by qc_prep_pkts() to prevent such a case. But this relied on the hypothesis that the list of encryption levels it used was always built in the same order as follows for 0-RTT sessions: initial, early-data, handshake, application But this order is determined by the order in which the TLS stack derives the secrets for these encryption levels. 
For aws-lc, this order is not the same but as follows: initial, handshake, application, early-data During 0-RTT sessions, the server may have to build three ack-eliciting packets (with CRYPTO data inside) to reply to the first client packet: initial, handshake, application. qc_prep_pkts() adds a PADDING frame to the last built packet for the last encryption level in the list. But after application level encryption, there is early-data encryption level. This prevented qc_prep_pkts() from building a padded application level last packet to send a 1200-bytes datagram. To fix this, always insert early-data encryption level after the initial encryption level into the encryption levels list when initializing this encryption level from quic_conn_enc_level_init(). Must be backported as far as 2.9.	2024-08-02 15:25:26 +02:00
Christopher Faulet	78b8b60030	BUG/MEDIUM: peer: Notify the applet won't consume data when it waits for sync When the peer applet is waiting for a synchronisation with the global sync task, we must notify it won't consume data. Otherwise, if some data are already waiting in the input buffer, the applet will be woken up in loop and this wil trigger the watchdog. Once synchronized, the applet is woken up. In that case, the peer applet must indicate it is going to consume data again. This patch should fix the issue #2656. It must be backported to 3.0.	2024-08-02 08:42:29 +02:00
Christopher Faulet	184f16ded7	BUG/MEDIUM: mux-h2: Propagate term flags to SE on error in h2s_wake_one_stream When a stream is explicitly woken up by the H2 conneciton, if an error condition is detected, the corresponding error flag is set on the SE. So SE_FL_ERROR or SE_FL_ERR_PENDING, depending if the end of stream was reported or not. However, there is no attempt to propagate other termination flags. We must be sure to properly set SE_FL_EOI and SE_FL_EOS when appropriate to be able to switch a pending error to a fatal error. Because of this bug, the SE remains with a pending error and no end of stream, preventing the applicative stream to trully abort it. It means on some abort scenario, it is possible to block a stream infinitely. This patch must be backported at least as far as 2.8. No bug was observed on older versions while the same code is inuse.	2024-08-02 08:42:28 +02:00
Christopher Faulet	6743e128f3	BUG/MEDIUM: h2: Only report early HTX EOM for tunneled streams For regular H2 messages, the HTX EOM flag is synonymous with the end of input. So SE_FL_EOI flag must also be set on the stream-endpoint descriptor. However, there is an exception. For tunneled streams, the end of message is reported on the HTX message just after the headers. But in that case, no end of input is reported on the SE. But here, there is a bug. The "early" EOM is also reported on the HTX messages when there is no payload (for instance a content-length set to 0). If there is no ES flag on the H2 HEADERS frame, it is an unexpected case. Because for the applicative stream and most probably for the opposite endpoint, the message is considered as finished. It is switched in its DONE state (or the equivalent on the endpoint). But, if an extra H2 frame with the ES flag is received, a TRAILERS frame or an empty DATA frame, an extra EOT HTX block is pushed to carry the HTX EOM flag. So an extra HTX block is emitted for a regular HTX message. It is totally invalid, it must never happen. Because it is an undefined behavior, it is difficult to predict the result. But it definitely prevents the applicative stream from properly handling aborts and errors because data remain blocked in the channel buffer. Indeed, the end of the message was seen, so no more data are forwarded. It seems to be an issue for 2.8 and upper. Harder to evaluate for older versions. This patch must be backported as far as 2.4.	2024-08-02 08:42:28 +02:00
Christopher Faulet	0ba6202796	BUG/MEDIUM: http-ana: Report error on write error waiting for the response When we are waiting for the server response, if an error is pending on the frontend side (a write error on client), it is handled as an abort and all regular response analyzers are removed, except the one responsible to release the filters, if any. However, while it is handled as an abort, the error is not reported, as usual, via http_reply_and_close() function. It is an issue because in that case, the channels buffers are not reset. Because of this bug, it is possible to block a stream infinitely. The request side is waiting for the response side and the response side is blocked because filters must be released and this cannot be done because data remain blocked in channels buffers. So, in that case, calling http_reply_and_close() with no message is enough to unblock the stream. This patch must be backported as far as 2.8.	2024-08-02 08:42:28 +02:00
Amaury Denoyelle	7a5a30d28a	BUG/MINOR: h2: reject extended connect for h2c protocol This commit prevents forwarding of an HTTP/2 Extended CONNECT when "h2c" or "h2" token is set as targeted protocol. Contrary to the previous commit which deals with HTTP/1 mux, this time the request is rejected and a RESET_STREAM is reported to the client. This must be backported up to 2.4 after a period of observation.	2024-08-01 18:23:44 +02:00
Amaury Denoyelle	7b89aa5b19	BUG/MINOR: h1: do not forward h2c upgrade header token haproxy supports tunnel establishment through HTTP Upgrade mechanism. Since the following commit, extended CONNECT is also supported for HTTP/2 both on frontend and backend side. commit 9bf957335e2c385b74901481f7a89c9565dfce53 MEDIUM: mux_h2: generate Extended CONNECT from htx upgrade As specified by HTTP/2 rfc, "h2c" can be used by an HTTP/1.1 client to request an upgrade to HTTP/2. In haproxy, this is not supported so it silently ignores this. However, Connection and Upgrade headers are forwarded as-is on the backend side. If using HTTP/1 on the backend side and the server supports this upgrade mechanism, haproxy won't be able to parse the HTTP response. If using HTTP/2, mux backend tries to incorrectly convert the request to an Extended CONNECT with h2c protocol, which may also prevent the response to be transmitted. To fix this, flag HTTP/1 request with "h2c" or "h2" token in an upgrade header. On converting the header list to HTX, the upgrade header is skipped if any of this token is present and the H1_MF_CONN_UPG flag is removed. This issue can easily be reproduced using curl --http2 argument to connect to an HTTP/1 frontend. This must be backported up to 2.4 after a period of observation.	2024-08-01 18:23:32 +02:00
Amaury Denoyelle	a7a2db4ad5	BUG/MIONR: quic: fix fc_lost Control layer callback get_info has recently been implemented for QUIC. However, fc_lost always returned 0. This is because quic_get_info() does not use the correct input argument value to identify lost value. This does not need to be backported.	2024-08-01 11:35:27 +02:00
Amaury Denoyelle	522c3bea2c	BUG/MINOR: quic: fix fc_rtt/srtt values QUIC has recently implemented get_info callback to return RTT/sRTT values. However, it uses milliseconds, contrary to TCP which uses microseconds. This causes smp fetch functions to return invalid values. Fix this by converting QUIC values to microseconds. This does not need to be backported.	2024-08-01 11:35:27 +02:00
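The unit mismatch described in 522c3bea2c can be sketched as follows. This is an illustration only, not haproxy code: `fetch_rtt_ms()`, `tcp_get_rtt_us()`, `quic_get_rtt_buggy()` and `quic_get_rtt_fixed()` are hypothetical names, and the only assumption taken from the commit message is that the shared fetch path expects microseconds while QUIC tracks its RTT in milliseconds.

```python
US_PER_MS = 1000  # microseconds per millisecond


def fetch_rtt_ms(get_info_us):
    """Hypothetical shared fetch: expects the callback to return microseconds."""
    return get_info_us() // US_PER_MS


def tcp_get_rtt_us(rtt_us=25_000):
    # TCP side: the value is already in microseconds.
    return rtt_us


def quic_get_rtt_buggy(rtt_ms=25):
    # Buggy variant: returns milliseconds where microseconds are expected.
    return rtt_ms


def quic_get_rtt_fixed(rtt_ms=25):
    # Fixed variant: convert milliseconds to microseconds before returning.
    return rtt_ms * US_PER_MS


print(fetch_rtt_ms(tcp_get_rtt_us))      # 25
print(fetch_rtt_ms(quic_get_rtt_buggy))  # 0, the value is scaled down twice
print(fetch_rtt_ms(quic_get_rtt_fixed))  # 25
```

With the buggy callback, a 25 ms RTT is reported as 0 because the consumer divides a millisecond value by 1000 again.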
Amaury Denoyelle	4b0bda42f7	MINOR: flags/mux-quic: decode qcc and qcs flags Decode QUIC MUX connection and stream elements via qcc_show_flags() and qcs_show_flags(). Flags definition have been moved outside of USE_QUIC to ease compilation of flags binary.	2024-07-31 17:59:35 +02:00
Frederic Lecaille	f7f76b8b0d	MINOR: quic: Define ->get_info() control layer callback for QUIC This low level callback may be called by several sample fetches for frontend connections like "fc_rtt", "fc_rttvar" etc. Define this callback for QUIC protocol as pointer to quic_get_info(). This latter supports these sample fetches: "fc_lost", "fc_reordering", "fc_rtt" and "fc_rttvar". Update the documentation consequently.	2024-07-31 10:29:42 +02:00
Frederic Lecaille	1733dff42a	MINOR: tcp_sample: Move TCP low level sample fetch function to control layer Add ->get_info() new control layer callback definition to protocol struct to retrieve statistical counters information at transport layer (TCPv4/TCPv6) identified by an integer into a long long int. Move the TCP specific code from get_tcp_info() to the tcp_get_info() control layer function (src/proto_tcp.c) and define it as the ->get_info() callback for TCPv4 and TCPv6. Note that get_tcp_info() is called for several TCP sample fetches. This patch is useful to support some of these sample fetches for QUIC and to keep the code simple and easy to maintain.	2024-07-31 10:29:42 +02:00
Amaury Denoyelle	bba6baff30	BUG/MEDIUM: quic: prevent conn freeze on 0RTT undeciphered content Received QUIC packets are stored in quic_conn Rx buffer after header protection removal in qc_rx_pkt_handle(). These packets are then removed after quic_conn IO handler via qc_treat_rx_pkts(). If HP cannot be removed, packets are still copied into quic_conn Rx buffer. This can happen if encryption level TLS keys are not yet available. The packet remains in the buffer until HP can be removed and its content processed. An issue occurs if client emits a 0-RTT packet but haproxy does not have the shared secret, for example after a haproxy process restart. In this case, the packet is copied in quic_conn Rx buffer but its HP won't ever be removed. This prevents the buffer to be purged. After some time, if the client has emitted enough packets, Rx buffer won't have any space left and received packets are dropped. This will cause the connection to freeze. To fix this, remove any 0-RTT buffered packets on handshake completion. At this stage, 0-RTT packets are unnecessary anymore. The client is expected to reemit its content in 1-RTT packet which are properly deciphered. This can easily reproduce with HTTP/3 POST requests or retrieving a big enough object, which will fill the Rx buffer with ACK frames. Here is a picoquic command to provoke the issue on haproxy startup : $ picoquicdemo -Q -v 00000001 -a h3 <hostname> 20443 "/?s=1g" Note that allow-0rtt must be present on the bind line to trigger the issue. Else haproxy will reject any 0-RTT packets. This must be backported up to 2.6. This could be one of the reason for github issue #2549 but it's unsure for now.	2024-07-31 10:24:53 +02:00
William Lallemand	f76e8e50f4	BUILD: ssl: replace USE_OPENSSL_AWSLC by OPENSSL_IS_AWSLC Replace USE_OPENSSL_AWSLC by OPENSSL_IS_AWSLC in the code source, so we won't need to set USE_OPENSSL_AWSLC in the Makefile on the long term.	2024-07-30 18:53:08 +02:00
William Lallemand	1889b86561	BUG/MEDIUM: ssl: 0-RTT initialized at the wrong place for AWS-LC Revert patch fcc8255 "MINOR: ssl_sock: Early data disabled during SSL_CTX switching (aws-lc)". The patch was done in the wrong callback which is never built for AWS-LC, and applies options on the SSL_CTX instead of the SSL, which should never be done elsewhere than in the configuration parsing. This was probably triggered by successfully linking haproxy against AWS-LC without using USE_OPENSSL_AWSLC. The patch also reintroduced SSL_CTX_set_early_data_enabled() in the ssl_quic_initial_ctx() and ssl_sock_initial_ctx(). So the initial_ctx does have the right setting, but it still needs to be applied to the selected SSL_CTX in the clienthello, because we need it on the selected SSL_CTX. Must be backported to 3.0. (ssl_clienthello.c part was in ssl_sock.c)	2024-07-30 18:53:08 +02:00
William Lallemand	56eefd6827	BUG/MEDIUM: ssl: reactivate 0-RTT for AWS-LC Then reactivate HAVE_SSL_0RTT and HAVE_SSL_0RTT_QUIC for AWS-LC, which were wrongly deactivated in f5353f2c ("MINOR: ssl: add HAVE_SSL_0RTT constant"). Must be backported to 3.0.	2024-07-30 18:53:08 +02:00
Willy Tarreau	376b147fff	BUG/MINOR: stconn: bs.id and fs.id had their dependencies incorrect The backend depends on the response and the frontend on the request, not the other way around. In addition, they used to depend on L6 (hence contents in the channel buffers) while they should only depend on L5 (permanent info known in the mux). This came in 2.9 with commit 24059615a7 ("MINOR: Add sample fetches to get the frontend and backend stream ID") so this can be backported there. (cherry picked from commit 61dd0156c82ea051779e6524cad403871c31fc5a) Signed-off-by: Willy Tarreau <w@1wt.eu>	2024-07-30 18:39:29 +02:00
Christopher Faulet	d9f41b1d6e	BUILD: mux-pt: Use the right name for the sedesc variable A typo was introduced in 760d26a86 ("BUG/MEDIUM: mux-pt/mux-h1: Release the pipe on connection error on sending path"). The sedesc variable is 'sd', not 'se'. This patch must be backported with the commit above.	2024-07-30 10:44:00 +02:00
Christopher Faulet	760d26a862	BUG/MEDIUM: mux-pt/mux-h1: Release the pipe on connection error on sending path When data are sent using the kernel splicing, if a connection error occurred, the pipe must be released. Indeed, in that case, no more data can be sent and there is no reason to not release the pipe. But it is in fact an issue for the stream because the channel will appear as not empty. This may prevent the stream from being released. This happens on 2.8 when a filter is also attached on it. On 2.9 and upper, it seems there is no issue. But it is hard to be sure and the current patch remains valid in all cases. On 2.6 and lower, the code is not the same and, AFAIK, there is no issue. This patch must be backported to 2.8. However, on 2.8, there is no zero-copy data forwarding. The patch must be adapted. There is no done_ff/resume_ff callback functions for muxes. The pipe must be released in sc_conn_send() when an error flag is set on the SE, after the call to snd_pipe callback function.	2024-07-30 09:05:25 +02:00
Christopher Faulet	5dc45445ff	BUG/MEDIUM: stconn: Report error on SC on send if a previous SE error was set When a send on a connection is performed, if a SE error (or a pending error) was already reported earlier, we leave immediately. No send is performed. However, we must be sure to report the error at the SC level if necessary. Indeed, the SE error may have been reported during the zero-copy data forwarding. So during receive on the opposite side. In that case, we may have missed the opportunity to report it at the SC level. The patch must be backported as far as 2.8.	2024-07-30 09:05:25 +02:00
Christopher Faulet	33c9562f07	DOC: config: Add documentation about spop mode for backends The SPOE was refactored. Now backends referenced by a SPOE filter must use the spop mode to be able to use the spop multiplexer for server connections. The "spop" mode was added in the list of supported mode for backends.	2024-07-30 09:05:25 +02:00
Willy Tarreau	5541d4995d	BUG/MEDIUM: queue: deal with a rare TOCTOU in assign_server_and_queue() After checking that a server or backend is full, it remains possible to call pendconn_add() just after the last pending requests finishes, so that there's no more connection on the server for very low maxconn (typ 1), leaving new ones in queue till the timeout. The approach depends on where the request was queued, though: - when queued on a server, we can simply detect that we may dequeue pending requests and wake them up, it will wake our request and that's fine. This needs to be done in srv_redispatch_connect() when the server is set. - when queued on a backend, it means that all servers are done with their requests. It means that all servers were full before the check and all were empty after. In practice this will only concern configs with less servers than threads. It's where the issue was first spotted, and it's very hard to reproduce with more than one server. In this case we need to load-balance again in order to find a spare server (or even to fail). For this, we call the newly added dedicated function pendconn_must_try_again() that tells whether or not a blocked pending request was dequeued and needs to be retried. This should be backported along with pendconn_must_try_again() to all stable versions, but with extreme care because over time the queue's locking evolved.	2024-07-29 09:27:01 +02:00
Willy Tarreau	1a8f3a368f	MINOR: queue: add a function to check for TOCTOU after queueing There's a rare TOCTOU case that happens from time to time with maxconn 1 and multiple threads. Between the moment we see the queue full and the moment we queue a request, it's possible that the last request on the server or proxy ended and that no other one is left to offer it its place. Given that all this code path is performance-critical and we cannot afford to increase the lock duration, better recheck for the condition after queueing. For this we need to be able to check for the condition and cleanly dequeue a request. That's what this patch provides via the new function pendconn_must_try_again(). It will catch more requests than absolutely needed though it will catch them all. It may find that around 1/1000 of requests are at risk, though testing shows that in practice, it's around 1 per million that really gets stuck (other ones benefit from timing and finishing late requests). Maybe in the future some conditions might be refined but it's harmless. What happens to such requests is that they're dequeued and their pendconn freed, so that the caller can decide to try to LB or queue them again. For now the function is not used, it's just added separately for easier tracking.	2024-07-29 09:27:01 +02:00
Willy Tarreau	4316ef2eab	BUILD: cfgparse-quic: fix build error on Solaris due to missing netinet/in.h Since commit 35470d518 ("MINOR: quic: activate UDP GSO for QUIC if supported"), Solaris build fails due to netinet/udp.h being included without netinet/in.h. Adding it is sufficient to fix the problem. No backport is needed.	2024-07-28 14:59:23 +02:00
Christopher Faulet	46b1fec0e9	BUG/MEDIUM: jwt: Clear SSL error queue on error when checking the signature When the signature included in a JWT is verified, if an error occurred, one or more SSL errors are queued and never cleared. These errors may be then caught by the SSL stack and a fatal SSL error may be erroneously reported during a SSL received or send. So we must take care to clear the SSL error queue when the signature verification failed. This patch should fix issue #2643. It must be backported as far as 2.6.	2024-07-26 16:59:00 +02:00
Frederic Lecaille	4abaadd842	MINOR: quic: Dump TX in flight bytes vs window values ratio. Display the ratio of the numbers of bytes in flight by packet number spaces versus the current window values in percent.	2024-07-26 16:42:44 +02:00
Frederic Lecaille	76ff8afa2d	MINOR: quic: Add information to "show quic" for CUBIC cc. Add ->state_cli() new callback to quic_cc_algo struct to define a function called by the "show quic (cc|full)" commands to dump some information about the congestion algorithm internal state currently in use by the QUIC connections. Implement this callback for CUBIC algorithm to dump its internal variables: - K: (the time to reach the cubic curve inflexion point), - last_w_max: the last maximum window value reached before entering the last recovery period. This is also the window value at the inflexion point of the cubic curve, - wdiff: the difference between the current window value and last_w_max. So negative before the inflexion point, and positive after.	2024-07-26 16:42:44 +02:00
Willy Tarreau	2dab1ba84b	MEDIUM: h1: allow to preserve keep-alive on T-E + C-L In 2.5-dev9, commit 631c7e866 ("MEDIUM: h1: Force close mode for invalid uses of T-E header") enforced a recently arrived new security rule in the HTTP specification aiming at preventing a class of content-smuggling attacks involving HTTP/1.0 agents. It consists in handling the very rare T-E + C-L requests or responses in close mode. It happens it does have an impact of a rare few and very old clients (probably running insecure TLS stacks by the way) that continue to send both with their POST requests. The impact is that for each and every request they'll have to reconnect, possibly negotiating a full TLS handshake that becomes harmful to the machine in terms of CPU computation. This commit adds a new option "h1-do-not-close-on-insecure-transfer-encoding" that does exactly what it says, it just asks not to close on such messages, even though the message continues to be sanitized and C-L dropped. It means that the risk is only between the sender and haproxy, which is limited, and might be the only acceptable solution for such environments having to deal with broken implementations. The cases are so rare that it should not need to be backported, or in the worst case, to the latest LTS if there is any demand.	2024-07-26 15:59:35 +02:00
Amaury Denoyelle	85131f91bf	BUG/MEDIUM: quic: fix invalid conn reject with CONNECTION_REFUSED quic-initial rules were implemented just recently. For some actions, a new flags field was added in quic_dgram structure. This is used to report the result of the rules execution. However, this flags field was left uninitialized. Depending on its value, it may cause the connection to be wrongly rejected via CONNECTION_REFUSED. Fix this by properly setting flags value to 0. No need to backport.	2024-07-26 15:24:35 +02:00
Amaury Denoyelle	08515af9df	MINOR: quic: implement send-retry quic-initial rules Define a new quic-initial "send-retry" rule. This allows to force the emission of a Retry packet on an initial without token instead of instantiating a new QUIC connection.	2024-07-25 15:39:39 +02:00
Amaury Denoyelle	69d7e9f3b7	MINOR: quic: implement reject quic-initial action Define a new quic-initial action named "reject". Contrary to dgram-drop, the client is notified of the rejection by a CONNECTION_CLOSE with CONNECTION_REFUSED error code. To be able to emit the necessary CONNECTION_CLOSE frame, quic_conn is instantiated, contrary to dgram-drop action. quic_set_connection_close() is called immediately after qc_new_conn() which prevents the handshake startup.	2024-07-25 15:39:39 +02:00
Amaury Denoyelle	f91be2657e	MINOR: quic: pass quic_dgram as obj_type for quic-initial rules To extend quic-initial rules, pass quic_dgram instance to argument for the various actions. As such, quic_dgram is now supported as an obj_type and can be used in session origin field.	2024-07-25 15:39:39 +02:00
Amaury Denoyelle	1259700763	MINOR: quic: support ACL for quic-initial rules Add ACL condition support for quic-initial rules. This requires the extension of quic_parse_quic_initial() to parse an extra if/unless block. Only layer4 client samples are allowed to be used with quic-initial rules. However, due to the early execution of quic-initial rules prior to any connection instantiation, some samples are non supported. To be able to use the 4 described samples, a dummy session is instantiated before quic-initial rules execution. Its src and dst fields are set from the received datagram values.	2024-07-25 15:39:39 +02:00
Amaury Denoyelle	cafe596608	MEDIUM: quic: implement quic-initial rules Implement a new set of rules labelled as quic-initial. These rules are specific to QUIC. They are scheduled to be executed early on Initial packet parsing, prior to a new QUIC connection instantiation. Contrary to tcp-request connection, this allows to reject traffic earlier, most notably by avoiding unnecessary QUIC SSL handshake processing. A new module quic_rules is created. Its main function quic_init_exec_rules() is called on Initial packet parsing in function quic_rx_pkt_retrieve_conn(). For the moment, only "accept" and "dgram-drop" are valid actions. Both are final. The latter drops silently the Initial packet instead of allocating a new QUIC connection.	2024-07-25 15:39:39 +02:00
Amaury Denoyelle	a72e82c382	MINOR: quic: delay Retry emission on quic-force-retry Currently, quic Retry packets are emitted for two different reasons after processing an Initial without token : - quic-force-retry is set on bind-line - an abnormal number of half-open connection is currently detected Previously, these two conditions were checked separately in different functions during datagram parsing. Uniformize this by moving quic-force-retry check in quic_rx_pkt_retrieve_conn() along the second condition check. The purpose of this patch is to uniformize datagram parsing stages. It is necessary to implement quic-initial rules in quic_rx_pkt_retrieve_conn() prior to any Retry emission. This prevents to emit unnecessary Retry if an Initial is subject to a reject rule.	2024-07-25 15:29:50 +02:00
Aurelien DARRAGON	e328056ddc	MEDIUM: sink: assume sft appctx stickiness As mentioned in b40d804 ("MINOR: sink: add some comments about sft->appctx usage in applet handlers"), there are few places in the code where it looks like we assumed that the applet callbacks such as sink_forward_session_init() or sink_forward_io_handler() could be executing an appctx whose sft is detached from the appctx (appctx != sft->appctx). In practise this should not be happening since an appctx sticks to the same thread its entire lifetime, and the only times sft->appctx is effectively assigned is during the session/appctx creation (in process_sink_forward()) or release. Thus if sft->appctx wouldn't point to the appctx that the sft was bound to after appctx creation, it would probably indicate a bug rather than an expected condition. To further emphasize that and prevent the confusion, and since 3.1-dev4 was released, let's remove such checks and instead add a BUG_ON to ensure this never happens. In _sink_forward_io_handler(), the "hard_close" label was removed since there are no more uses for it (no hard errors may be caught from the function for now)	2024-07-25 14:56:19 +02:00
William Lallemand	28cb01f8e8	MEDIUM: quic: implement CHACHA20_POLY1305 for AWS-LC With AWS-LC, the aead part is covered by the EVP_AEAD API which provides the correct EVP_aead_chacha20_poly1305(), however for header protection it does not provide an EVP_CIPHER for chacha20. This patch implements exceptions in the header protection code and uses EVP_CIPHER_CHACHA20 and EVP_CIPHER_CTX_CHACHA20 placeholders so we can use the CRYPTO_chacha_20() primitive manually instead of the EVP_CIPHER API. This requires to check if we are using EVP_CIPHER_CTX_CHACHA20 when doing EVP_CIPHER_CTX_free().	2024-07-25 13:45:39 +02:00
William Lallemand	177c84808c	MEDIUM: quic: add key argument to header protection crypto functions In order to prepare the code for using Chacha20 with the EVP_AEAD API, both quic_tls_hp_decrypt() and quic_tls_hp_encrypt() need an extra key argument. Indeed Chacha20 does not exist as an EVP_CIPHER in AWS-LC, so the key won't be embedded into the EVP_CIPHER_CTX, so we need an extra parameter to use it.	2024-07-25 13:45:39 +02:00
William Lallemand	d55a297b85	MINOR: quic: rename confusing wording aes to hp Some of the crypto functions used for headers protection in QUIC are named with an "aes" name even though they are not used for AES encryption only. This patch renames these "aes" to "hp" so it is clearer.	2024-07-25 13:45:38 +02:00
William Lallemand	31c831e29b	MEDIUM: ssl/quic: implement quic crypto with EVP_AEAD The QUIC crypto is using the EVP_CIPHER API in order to achieve authenticated encryption, this was the API which was used with OpenSSL. With libraries inspired by BoringSSL (libreSSL and AWS-LC), the AEAD algorithms are implemented using the EVP_AEAD API. This patch converts the call to the EVP_CIPHER API when called in the context of AEAD cryptography for QUIC. The patch defines some QUIC_AEAD macros that can be either EVP_CIPHER or EVP_AEAD depending on the library. This was mainly done for AWS-LC but this could be useful for other libraries. This should finally allow to use CHACHA20_POLY1305 with AWS-LC. This patch allows to use the following ciphers with the EVP_AEAD API: - TLS1_3_CK_AES_128_GCM_SHA256 - TLS1_3_CK_AES_256_GCM_SHA384 AWS-LC does not implement TLS1_3_CK_AES_128_CCM_SHA256 and TLS1_3_CK_CHACHA20_POLY1305_SHA256 requires some hack for headers protection which will come in another patch.	2024-07-25 13:45:38 +02:00
Frederic Lecaille	a6d40e09f7	BUG/MINOR: quic: Lack of precision when computing K (cubic only cc) K cubic variable is stored in ms. But it was a formula with the second as unit for the window difference parameter which was used to compute K without considering the loss of information. Then the result was converted in ms (K *= 1000). This led to a lack of precision and multiples of 1000 as values. To fix this, use the same formula but with the window difference in ms as parameter passed to the cubic function and remove the conversion. Must be backported as far as 2.6.	2024-07-24 18:24:39 +02:00
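The precision loss described in a6d40e09f7 can be illustrated with a small integer-arithmetic sketch. This is not haproxy's implementation: `icbrt()`, `k_coarse_ms()` and `k_precise_ms()` are hypothetical helpers, the window difference is assumed to be expressed in segments, and the CUBIC constant C = 0.4 is written as the fraction 4/10. The point is only that taking an integer cube root in seconds and then multiplying by 1000 can only yield multiples of 1000, while scaling the operand before the cube root keeps millisecond resolution.

```python
def icbrt(n: int) -> int:
    """Largest integer r such that r**3 <= n, for n >= 0."""
    lo, hi = 0, 1
    while hi ** 3 <= n:
        hi *= 2
    # Invariant: lo**3 <= n < hi**3
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid
    return lo


def k_coarse_ms(wdiff: int) -> int:
    """Cube root computed in seconds, then converted to ms (K *= 1000)."""
    return icbrt(wdiff * 10 // 4) * 1000


def k_precise_ms(wdiff: int) -> int:
    """Operand scaled by 1000**3 first, so the cube root is directly in ms."""
    return icbrt(wdiff * 10 * 1000 ** 3 // 4)


print(k_coarse_ms(100))   # 6000
print(k_precise_ms(100))  # 6299
```

For a window difference of 100 segments, the exact value is about 6.2996 s: the first variant truncates to 6 s before scaling, the second keeps 6299 ms.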
Willy Tarreau	7eca16921b	[RELEASE] Released version 3.1-dev4 Released version 3.1-dev4 with the following main changes : - MINOR: limits: prepare to keep limits in one place - REORG: fd: move raise_rlim_nofile to limits - CLEANUP: fd: rm struct rlimit definition - REORG: global: move rlim_fd__at_boot in limits - MINOR: haproxy: prepare to move limits-related code - REORG: haproxy: move limits handlers to limits - MINOR: limits: add is_any_limit_configured - CLEANUP: quic: remove obsolete comment on send - MINOR: quic: extend detection of UDP API OS features - MINOR: quic: activate UDP GSO for QUIC if supported - MINOR: quic: define quic_cc_path MTU as constant - MINOR: quic: add GSO parameter on quic_sock send API - MAJOR: quic: support GSO when encoding datagrams - MEDIUM: quic: implement GSO fallback mechanism - MINOR: quic: add counters of sent bytes with and without GSO - BUG/MEDIUM: bwlim: Be sure to never set the analyze expiration date in past - CLEANUP: proto: rename TID affinity callbacks - CLEANUP: quic: rename TID affinity elements - BUG/MINOR: limits: fix license type in limits.h - BUG/MINOR: session: Eval L4/L5 rules defined in the default section - CLEANUP: stconn: Fix a typo in comments for SE_ABRT_SRC_ - MEDIUM: spoe: Remove fragmentation support - MEDIUM: spoe: Remove async mode support - MINOR: spoe: Use only a global engine-id per agent - MINOR: spoe: Remove debugging - MAJOR: spoe: Remove idle applets and pipelining support - MINOR: spoe: Remove the dedicated SPOE applet task - MEDIUM: proxy/spoe: Add a SPOP mode - MEDIUM: applet: Add a .shut callback function for applets - MINOR: connection: No longer include stconn type header in connection-t.h - MINOR: stconn: Use a dedicated function to get the opposite sedesc - MINOR: spoe: Rename some flags and constant to use SPOP prefix - MINOR: spoe: Dynamically alloc the message list per event of an agent - MINOR: spoe: Move all stuff regarding the filter/applet in the C file - MINOR: spoe: Move 
spoe_str_to_vsn() into the header file - MEDIUM: mux-spop: Introduce the SPOP multiplexer - MEDIUM: check/spoe: Use SPOP multiplexer to perform SPOP health-checks - MAJOR: spoe: Rewrite SPOE applet to use the SPOP mux - CLEANUP: spoe: Uniformize function definitions - MINOR: spoe: Add internal sample fetch to retrieve the SPOE engine ID - MEDIUM: spoe: Set a specific name for the connection pool of SPOP servers - MINOR: backend: Remove test on HTX streams to reuse idle connections on connect - MEDIUM: spoe: Force the reuse 'always' mode for SPOP backends - MINOR: mux-spop: Use a dedicated function to update the SPOP connection timeout - MAJOR: mux-spop: Make the SPOP connections reusable - MINOR: stats-html: Display reuse ratio for spop connections - MEDIUM: spoe: Directly xfer NOTIFY frame when SPOE applet is created - MEDIUM: spoe: Directly receive ACK frame in the SPOE context buffer - MEDIUM: mux-spop/spoe: Save negociated max-frame-size value in the mux - MINOR: spoe: Remove the spop version from the SPOE appctx context - MEDIUM: mux-spop: Add checks on received frames - MEDIUM: mux-spop: Announce the pipeling support if possible - MEDIUM: spoe: Forward SPOE context error to the SPOE applet - MEDIUM: spoe: Make the SPOE applet use its own buffers - DOC: spoe: Update SPOE documentation to reflect recent refactoring - BUILD: mux-spop: fix build failure on gcc 4-10 and clang - MINOR: fd: don't scan the full fdtab on all threads - MINOR: server: better mt_list usage for node migration (prev_deleted handling) - BUG/MINOR: do not close uninit FD in quic_test_socketops() - BUG/MEDIUM: debug/cli: fix "show threads" crashing with low thread counts - MINOR: debug: prepare feed_post_mortem_late - CLEANUP: debug: fix indents in debug_parse_cli_show_dev - MINOR: debug: store runtime uid/gid in postmortem - MINOR: debug: keep runtime capabilities in post_mortem - MINOR: debug: use LIM2A to show limits - MINOR: debug: prepare to show runtime limits - MINOR: debug: keep 
runtime limits in postmortem - DOC: install: don't reference removed CPU arg - BUG/MEDIUM: ssl_sock: fix deadlock in ssl_sock_load_ocsp() on error path - BUG/MAJOR: mux-h2: force a hard error upon short read with pending error - MEDIUM: sink: start applets asynchronously - OPTIM: sink: balance applets accross threads - MEDIUM: ocsp: fix ocsp when the chain is loaded from 'issuers-chain-path' - MEDIUM: ssl: add extra_chain to ckch_data - MINOR: ssl: change issuers-chain for show_cert_detail() - REGTESTS: ssl: test the issuers-chain-path keyword - DOC: configuration: issuers-chain-path not compatible with OCSP - DOC: configuration: issuers-chain-path is compatible with OCSP - BUG/MEDIUM: startup: fix zero-warning mode - BUILD: tree-wide: cast arguments to tolower/toupper to unsigned char (2) - MINOR: cfgparse-global: move mode's keywords in cfg_kw_list - MINOR: cfgparse-global: move no<poller_name> in cfg_kw_list - DOC: config: improve the http-keep-alive section - BUG/MINOR: stick-table: fix crash for src_inc_gpc() without stkcounter - BUG/MINOR: server: Don't warn fallback IP is used during init-addr resolution - BUG/MINOR: cli: Atomically inc the global request counter between CLI commands - MINOR: stream: Add a pointer to set the parent stream - MINOR: vars: Fill a description instead of hash and scope when a name is parsed - MINOR: vars: Use a description to set/unset a variable instead of its hash and scope - MEDIUM: vars: Be able to parse parent scopes for variables - MINOR: vars: Use a variable description to get variables of a specific scope - MEDIUM: vars: Be able to retrieve variable of the parent stream, if any - MEDIUM: spoe: Set the parent stream for SPOE streams - BUG/MINOR: quic: Non optimal first datagram. 
- DOC: config: Add a dedicated section about variables - DOC: config: Add info about variable scopes referencing the parent stream - DOC: config: Explicitly state the SPOE streams have a usable parent stream - MINOR: quic: Avoid cc priv buffer overflow. - MINOR: spoe: Add a function to validate a version is supported - MINOR: spoe: export the list of SPOP error reasons - MEDIUM: spoe/tcpcheck: Reintroduce SPOP check as a customized tcp-check - REGTESTS: check/spoe: Re-enable the script performing SPOP health-checks - BUG/MEDIUM: sink: properly init applet under sft lock - MINOR: sink: unify and sink_forward_io_handler() and sink_forward_oc_io_handler() - MINOR: sink: Remove useless test on SE_FL_SHR/SHW flags - MINOR: sink: merge sink_forward_io_handler() with sink_forward_oc_io_handler() - MINOR: sink: add some comments about sft->appctx usage in applet handlers - MINOR: sink: distinguish between hard and soft close in _sink_forward_io_handler() - MEDIUM: sink: don't set NOLINGER flag on the outgoing stream interface - MINOR: ring: count processed messages in ring_dispatch_messages() - MINOR: sink: add processed events counter in sft - MEDIUM: sink: "max-reuse" support for sink servers - OPTIM: sink: consider threads' current load when rebalancing applets	2024-07-24 18:20:24 +02:00
Aurelien DARRAGON	2513bd257f	OPTIM: sink: consider threads' current load when rebalancing applets In c454296f0 ("OPTIM: sink: balance applets accross threads"), we already made sure to balance applets across threads by picking a random thread to spawn the new applet. Thanks to the previous commit, we can also destroy the applet once a certain number of messages has been processed, to help distribute the load during runtime. Let's improve that by trying up to 3 different threads in the hope of picking a non-overloaded one in the best case, and the least overloaded one in the worst case. This should help to better distribute the load over multiple threads when high loads are expected. The logic was greatly inspired by the thread migration logic used by server health checks, but it was simplified for the sink's use case.	2024-07-24 17:59:18 +02:00
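The selection described in 2513bd257f can be pictured with a minimal sketch. This is not the HAProxy code: pick_thread(), the load[] table and overload_threshold are hypothetical names introduced only to illustrate "try up to 3 random threads, keep the least loaded one".

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_TRIES 3 /* number of random candidates probed, as in the commit text */

/* Hypothetical helper: load[] holds one illustrative load figure per thread
 * (higher means busier). nb_threads must be greater than zero. Returns the
 * index of the chosen thread. */
static size_t pick_thread(const unsigned int *load, size_t nb_threads,
                          unsigned int overload_threshold)
{
    size_t best = (size_t)rand() % nb_threads;

    for (int attempt = 1; attempt < MAX_TRIES; attempt++) {
        if (load[best] < overload_threshold)
            break; /* best case: a non-overloaded thread was found */

        size_t candidate = (size_t)rand() % nb_threads;

        /* worst case: remember the least loaded candidate seen so far */
        if (load[candidate] < load[best])
            best = candidate;
    }
    return best;
}
```

Probing a small fixed number of random candidates keeps the cost constant regardless of the thread count, at the price of not always finding the globally least loaded thread.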
Aurelien DARRAGON	237849c911	MEDIUM: sink: "max-reuse" support for sink servers Thanks to the previous commit, it is now possible to know how many events were processed for a given sft/server sink pair. As mentioned in commit c454296 ("OPTIM: sink: balance applets accross threads"), let's provide the ability to restart a server connection once a certain number of events has been processed, to help better balance the load over multiple threads. For this, we make use of the "max-reuse" server keyword, which was only relevant under "http" context so far. Under sink context, "max-reuse" corresponds to the number of times the tcp connection can be reused for sending messages, which in fact means that "max-reuse + 1" is the number of events (ie: messages) that are allowed to be sent using the same tcp server connection: when this threshold is met, the connection will be destroyed and a new one will be created on a random thread. The value is not strict: it is the minimum value above which the connection may be destroyed, since the value is checked after ring_dispatch_messages(), which may process multiple messages at once. By default, no limit is enforced (the connection will be reused for as long as it is available). The documentation was updated accordingly.	2024-07-24 17:59:14 +02:00
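The "max-reuse + 1" arithmetic and the non-strict threshold in 237849c911 can be sketched as follows. The struct and function names are hypothetical and only model the behaviour described in the commit message; they are not taken from the HAProxy sources.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-connection accounting. */
struct conn_state {
    size_t e_processed; /* events sent on the current connection */
    size_t max_reuse;   /* configured "max-reuse" value */
    bool   has_limit;   /* false: no limit enforced (the default) */
};

/* Account for one batch of dispatched events and report whether the
 * connection should now be renewed. The check runs after the whole batch,
 * so the counter may overshoot the threshold. */
static bool account_batch(struct conn_state *c, size_t batch)
{
    c->e_processed += batch;
    /* "max-reuse" counts reuses, so max_reuse + 1 events are allowed */
    return c->has_limit && c->e_processed >= c->max_reuse + 1;
}
```

With max_reuse set to 2, a batch of 2 events keeps the connection, and a following batch of 5 triggers the renewal at 7 events rather than exactly 3, which is why the value is described as a minimum.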
Aurelien DARRAGON	709b3db941	MINOR: sink: add processed events counter in sft Add a new struct member to sft structure named e_processed in order to track the total number of events processed by sft applets. sink_forward_oc_io_handler() and sink_forward_io_handler() now make use of ring_dispatch_messages() optional value added in the previous commit in order to increase the number of processed events.	2024-07-24 17:59:08 +02:00
Aurelien DARRAGON	47323e64ad	MINOR: ring: count processed messages in ring_dispatch_messages() ring_dispatch_messages() now takes an optional argument <processed> which must point to a size_t counter when provided. When provided, the value is updated to the number of messages processed by the function.	2024-07-24 17:59:03 +02:00
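The optional <processed> argument in 47323e64ad follows a common C pattern: an output pointer that is only written when the caller provides it. A minimal sketch, assuming a hypothetical dispatch_messages() that stands in for the real function and does not reproduce its signature:

```c
#include <stddef.h>

/* Hypothetical stand-in for a dispatch loop: handles up to 'budget' of the
 * 'pending' messages. When 'processed' is not NULL, it receives the number
 * of messages handled. Returns non-zero if messages remain. */
static int dispatch_messages(size_t pending, size_t budget, size_t *processed)
{
    size_t done = pending < budget ? pending : budget;

    if (processed)
        *processed = done; /* only touched when the caller asked for it */
    return done < pending;
}
```

Callers that do not need the count simply pass NULL, so existing call sites keep working unchanged.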
Aurelien DARRAGON	0821460e3f	MEDIUM: sink: don't set NOLINGER flag on the outgoing stream interface Given that sink applets are responsible for conveying messages from the ring to the tcp server endpoint, there are no protocol timeouts or errors expected there; it is a unidirectional flow of data over TCP. As such, the NOLINGER flag which was inherited from the peers applet, see dbd026792 ("BUG/MEDIUM: peers: set NOLINGER on the outgoing stream interface"), is not desirable under sink context: The reason why we have the NOLINGER flag set is to ensure the connection is closed right away and avoid the 60s TIME_WAIT delay on closed sockets. The downside is that messages sent right before closing the socket are not guaranteed to make it to the server, because closing with the NOLINGER flag set will result in an RST packet being emitted right away, which could prevent in-flight messages from being properly delivered. Unlike peers applets, the only cases where sink applets are expected to close the connection are upon unexpected error or upon stopping, which are relatively rare events. Thanks to the previous commit, the ERROR flag is already set in case of error, so the use of NOLINGER is not mandatory for the RST to be sent. Now for the stopping case, it only happens once in the process lifetime so it's acceptable to close the socket using EOS+EOI flags without the NOLINGER option set. So in our case, it is preferable to ensure messages get properly delivered knowing that closed sockets should be piling up in TIME_WAIT, this means removing the NOLINGER flag on the outgoing stream interface for sink applets. It is a prerequisite for upcoming patches in order to cleanly shut the applet during runtime without risking to send the RST packet before all pending messages were sent to the endpoint.	2024-07-24 17:58:58 +02:00
Aurelien DARRAGON	c6ab0e14e2	MINOR: sink: distinguish between hard and soft close in _sink_forward_io_handler() Aborting the socket on soft-stop is not the same as aborting it due to unexpected error. As such, let's leverage the granularity offered by sedesc flags to better reflect the situation: abort during soft-stop is handled as a soft close thanks to EOI+EOS flags, while abort due to unexpected error is handled as hard error thanks to ERROR+EOS flags. Thanks to this change, hard error will always emit RST packet even if the NOLINGER option wasn't set on the socket.	2024-07-24 17:58:52 +02:00
Aurelien DARRAGON	b40d804c7f	MINOR: sink: add some comments about sft->appctx usage in applet handlers There seems to be an ambiguity in the code where sft->appctx would differ from the appctx that was assigned to it upon appctx creation. In practice, it doesn't seem this could happen. Adding a few notes to come back to this later and try to see if we can remove this ambiguity.	2024-07-24 17:58:47 +02:00
Aurelien DARRAGON	10811fdfd6	MINOR: sink: merge sink_forward_io_handler() with sink_forward_oc_io_handler() Now that sink_forward_oc_io_handler() and sink_forward_io_handler() were unified again thanks to the previous commit, let's take a chance to merge code that is common to both functions in order to ease code maintenance. Let's add _sink_forward_io_handler() internal function which takes the applet and a message handler as argument: sink_forward_io_handler() and sink_forward_oc_io_handler() leverage this internal function by passing the correct message handler for the desired format.	2024-07-24 17:58:41 +02:00
Aurelien DARRAGON	f2848e6146	MINOR: sink: Remove useless test on SE_FL_SHR/SHW flags Re-apply dcd917d972 ("MINOR: applet: Remove uselelss test on SE_FL_SHR/SHW flags") for sink_forward_oc_io_handler() function as it was probably overlooked given that sink_forward_oc_io_handler() and sink_forward_io_handler() follow the same logic.	2024-07-24 17:58:35 +02:00
Aurelien DARRAGON	901a66b3fc	MINOR: sink: unify and sink_forward_io_handler() and sink_forward_oc_io_handler() In a739dc2 ("MEDIUM: sink: Use the sedesc to report and detect end of processing"), we added a drain after close in sink_forward_oc_io_handler() by the use of "goto out". However, since we perform a close, there is no reason to drain data from the socket. Moreover, before the patch there was no drain and nothing mentioned the fact that the drain was added on purpose. Lastly, sink_forward_io_handler() and sink_forward_oc_io_handler() functions are strictly identical when it comes to processing logic, and the drain was only added in sink_forward_oc_io_handler() and not in sink_forward_io_handler(). As such, it's pretty safe to assume that the drain is not needed here and was added by accident. So in this patch we remove it in an attempt to unify sink_forward_io_handler() and sink_forward_oc_io_handler() functions like it was already the case before.	2024-07-24 17:58:30 +02:00
Aurelien DARRAGON	c81b8ee480	BUG/MEDIUM: sink: properly init applet under sft lock Since 09d69eacf8 ("MEDIUM: sink: start applets asynchronously") the applet is no longer initialized under the sft lock while it was the case before. At first it doesn't seem to be an issue, but if we look closer at sink_forward_session_init(), we can see that sft->appctx is assigned while it can be accessed at the same time from sink_init_forward(). Let's restore the old guarantees by performing the .init under the sft lock. No backport needed unless 09d69eacf8 is.	2024-07-24 17:58:24 +02:00
Christopher Faulet	06547dcf52	REGTESTS: check/spoe: Re-enable the script performing SPOP health-checks Thanks to previous patches, it is now possible to re-enable the test on SPOP health-checks support.	2024-07-24 14:19:10 +02:00
Christopher Faulet	51e18c9aa6	MEDIUM: spoe/tcpcheck: Reintroduce SPOP check as a customized tcp-check To be able to retrieve accurate errors when a SPOP health-check is performed, a customized tcp-check is used. Indeed, it is not possible to rely on the SPOP multiplexer for now because the check is performed at the mux connection layer and the error, if any, cannot be retrieved by the health-check. A L4 success or error is reported. To fix this issue and restore the previous behavior, a customized tcp-check is created. The connection is forced to use the PT multiplexer. A hardcoded message is sent and a custom handler is used to decode the SPOA response. This way, it is possible to parse the response and return an accurate status code.	2024-07-24 14:19:10 +02:00
Christopher Faulet	2f3c4d1b6c	MINOR: spoe: export the list of SPOP error reasons The strings representing the human-readable version for SPOP errors are now exported. It is now an array of IST to ease manipulation.	2024-07-24 14:19:10 +02:00
Christopher Faulet	f8fed07d3a	MINOR: spoe: Add a function to validate a version is supported spoe_check_vsn() function can now be used to check if a version, converted to an integer, via spoe_str_to_vsn() for instance, is supported. To do so, the list of all supported version is now exported.	2024-07-24 14:19:10 +02:00
Frederic Lecaille	735e4aecfc	MINOR: quic: Avoid cc priv buffer overflow. Add two initcall callbacks with BUG_ON_HOT() to the newreno and cubic modules to ensure there is no buffer overflow when accessing the private data of these congestion control algorithm state structures. This is to ensure that further modifications of these data structures will not lead to surprises. At this time there is no possible buffer overflow.	2024-07-24 11:07:19 +02:00
Christopher Faulet	e902db2609	DOC: config: Explicitly state the SPOE streams have a usable parent stream It is explicitly mentioned in the configuration manual that the parent of a SPOE stream is the filtered stream. It means variables of the filtered stream are usable from the SPOE stream.	2024-07-19 16:35:44 +02:00
Christopher Faulet	2e86de0e0f	DOC: config: Add info about variable scopes referencing the parent stream It is now possible for a stream to have a parent and it is also possible to retrieve variables defined in the parent stream context. To do so, some extra scopes were introduced. The section 2.8. was updated accordingly.	2024-07-19 16:35:38 +02:00
Christopher Faulet	b643fbb1a6	DOC: config: Add a dedicated section about variables The variables in the HAProxy configuration are now described in a dedicated section. Instead of repeating the same description everywhere a variable name can be used, the section 2.8. is now referenced.	2024-07-19 16:31:13 +02:00
Frederic Lecaille	402ce29e9e	BUG/MINOR: quic: Non optimal first datagram. This bug arrived with this commit: b068e758f MINOR: quic: simplify rescheduling for handshake This commit introduced a bad side effect. HAProxy always replied with an ACK-only datagram when it received the first client Initial packet. Then it handled the CRYPTO data inside it. And finally, it sent its own CRYPTO data. This broke the packet coalescing rule, whose aim is to build and send as many QUIC packets as possible per datagram. To fix this, simply partially revert this commit, to make the low level I/O task return again if some CRYPTO data were received. This delays the acknowledgement, which will again be sent in the same datagram as the CRYPTO data. Must be backported to 3.0.	2024-07-19 16:22:00 +02:00
Christopher Faulet	127083a7a2	MEDIUM: spoe: Set the parent stream for SPOE streams When a SPOE applet is created to send a message to an agent, the parent of the associated stream is set to the one filtered. And the relationship between the streams is removed when the applet is released or when the processing on main stream is finished. In the meantime, it is possible to get variables of the parent stream from the SPOE one. It is not a huge change but this will be amazingly useful. For instance, it is now possible to be sticky on a server using a criterion of the main stream. Here is an example using the client source address: listen http bind *:80 tcp-request content set-var(txn.client_src) src filter spoe engine {SPOE-NAME} config /{SPOE-CONFIG} http-request send-spoe-group {SPOE-NAME} {SPOE-MSG} server www 127.0.0.1:8000 backend spoe-backend mode spop timeout server 10s stick-table type ip size 200k expire 30m stick on var(ptxn.client_src) server srv1 ... server srv2 ... server srv3 ... server srv4 ... Of course, the feature is not limited to stick-tables. Everywhere variables are used, it is now possible to get the value set on the parent stream from the SPOE stream.	2024-07-18 17:06:12 +02:00
Christopher Faulet	230c1570ac	MEDIUM: vars: Be able to retrieve variable of the parent stream, if any It is now possible to retrieve the value of a variable using the parent stream or the parent session instead of the current one. It remains forbidden to set or unset this value. The sample fetch used to store the result is a local copy. So it may be safely altered by a converter without changing the value of the original variable. Note that for now, the parent of a stream is never set. So this part is not really used. This will change with the SPOE.	2024-07-18 17:06:12 +02:00
Christopher Faulet	1a1afecb8b	MINOR: vars: Use a variable description to get variables of a specific scope Now a variable description is retrieved when a variable is parsed, we can use it to get the variable value. It is mandatory to be able to know the parent stream, if any, must be used, instead of the current one.	2024-07-18 17:06:12 +02:00
Christopher Faulet	f93828f229	MEDIUM: vars: Be able to parse parent scopes for variables Add session/stream scopes related to the parent. To do so, "psess", "ptxn", "preq" or "pres" must be used instead of traditional scopes (without the first "p"). The "proc" scope is not concerned by this change because it is not linked to a stream. When such scopes are used, a specific flag is added on the variable description during the variable parsing. For now, these scopes are parsed and the variable description is updated accordingly. But at the end, any operation on the variable value fails.	2024-07-18 16:39:39 +02:00
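The scope naming rule in f93828f229 (a leading "p" selects the parent variant of "sess", "txn", "req" and "res", while "proc" has no parent variant) can be illustrated with a small parser. This is a sketch under stated assumptions: the enum, struct var_desc and parse_var_name() are hypothetical and do not mirror the actual HAProxy structures.

```c
#include <stddef.h>
#include <string.h>

enum var_scope { SCOPE_PROC, SCOPE_SESS, SCOPE_TXN, SCOPE_REQ, SCOPE_RES };

/* Hypothetical variable description filled at parsing time. */
struct var_desc {
    enum var_scope scope;
    int parent;       /* non-zero: the parent stream must be used */
    const char *name; /* points just after the "scope." prefix */
};

/* Parses "<scope>.<name>" and returns 1 on success, 0 on error. */
static int parse_var_name(const char *full, struct var_desc *desc)
{
    static const struct {
        const char *prefix;
        enum var_scope scope;
    } scopes[] = {
        { "proc", SCOPE_PROC }, { "sess", SCOPE_SESS }, { "txn", SCOPE_TXN },
        { "req", SCOPE_REQ }, { "res", SCOPE_RES },
    };
    const char *dot = strchr(full, '.');
    size_t len, i;
    int parent;

    if (!dot || dot[1] == '\0')
        return 0;

    /* first pass: plain scopes; second pass: same scopes behind a 'p' */
    for (parent = 0; parent <= 1; parent++) {
        const char *start = full + parent;

        if (parent && full[0] != 'p')
            break;
        len = (size_t)(dot - start);
        for (i = 0; i < sizeof(scopes) / sizeof(scopes[0]); i++) {
            if (parent && scopes[i].scope == SCOPE_PROC)
                continue; /* "proc" is not linked to a stream */
            if (strlen(scopes[i].prefix) == len &&
                strncmp(start, scopes[i].prefix, len) == 0) {
                desc->scope = scopes[i].scope;
                desc->parent = parent;
                desc->name = dot + 1;
                return 1;
            }
        }
    }
    return 0;
}
```

Recording the parent choice as a flag in the description, rather than as extra scope values, lets the same lookup code serve both the current and the parent stream.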
Christopher Faulet	d430edcda3	MINOR: vars: Use a description to set/unset a variable instead of its hash and scope Now a variable description is retrieved when a variable is parsed, we can use it to set or unset the variable value. It is mandatory to be able to know the parent stream, if any, must be used, instead of the current one.	2024-07-18 16:39:38 +02:00
Christopher Faulet	eb2d71614f	MINOR: vars: Fill a description instead of hash and scope when a name is parsed A variable description is now used to parse a variable and extract its name and its scope. It is mandatory to be able to add some flags on the variable when it is evaluated (set or get). Among other things, this will be used to know the parent stream, if any, must be used, instead of the current one.	2024-07-18 16:39:38 +02:00
Christopher Faulet	b020bb73a0	MINOR: stream: Add a pointer to set the parent stream A pointer to a parent stream was added in the stream structure. For now, this pointer is never set, but the idea is to have an access to a stream environment from another one from the moment there is a parent/child relationship between these streams. Concretely, for now, there is nothing to formalize this relationship.	2024-07-18 16:39:38 +02:00
Christopher Faulet	3cdb3fa5d9	BUG/MINOR: cli: Atomically inc the global request counter between CLI commands The global request counter is used to set the stream id (s->uniq_id). It is incremented at different places. And it must be atomically incremented because it is a global value. However, in the analyzer dealing with CLI command response, this was not the case. It is now fixed. This patch must be backported to all stable versions.	2024-07-18 16:39:38 +02:00
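The requirement stated in 3cdb3fa5d9 (a global counter shared by threads must be incremented atomically) can be shown with standard C11 atomics. This is only an illustration: HAProxy uses its own atomic macros, and the names global_req_count and next_stream_id() below are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical global request counter shared by all threads. */
static atomic_uint global_req_count;

/* Returns a fresh identifier. atomic_fetch_add() performs the read and the
 * increment as one indivisible step and returns the previous value, so two
 * threads can never obtain the same identifier. */
static unsigned int next_stream_id(void)
{
    return atomic_fetch_add(&global_req_count, 1);
}
```

A plain `counter++` compiles to separate load, add and store steps, which is what allows two concurrent callers to read the same value.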
Christopher Faulet	abaafda485	BUG/MINOR: server: Don't warn fallback IP is used during init-addr resolution When a fallback IP address is provided in the list of methods to use to resolve the server address, a warning is emitted if previous methods failed. The aim is to inform that this address will be used for the server. However, it is a valid use case. It is the expected behavior. There is no reason to emit a warning. Having a message during HAProxy startup to inform that the fallback IP address will be used is probably a good idea. But it should be a notice, not a warning. Otherwise, checking the configuration validity will always fail, just like starting HAProxy in zero-warning mode, while the option was set on purpose. This patch should fix the issue #2627. It must be backported to all stable versions.	2024-07-18 16:39:38 +02:00
Amaury Denoyelle	ea7ea5198a	BUG/MINOR: stick-table: fix crash for src_inc_gpc() without stkcounter Since 2.5, an array of GPC is provided to replace legacy gpc0/gpc1. src_inc_gpc is a sample fetch which is used to increment counters in this array. A crash occurs if src_inc_gpc is used without any previous track-sc rule. This is caused by an error in smp_fetch_sc_inc_gpc(). When temporary stick counter is created via smp_create_src_stkctr(), table pointer arg value used is not correct : it points to the counter ID instead of the table argument. To fix this, use the proper sample fetch second arg. This can be reproduced with the following config : acl mark src_inc_gpc(0,<table>) -m bool tcp-request connection accept if mark This should be backported up to 2.6.	2024-07-18 16:12:36 +02:00
Willy Tarreau	2bd269cf2a	DOC: config: improve the http-keep-alive section Nathan Wehrman suggested this add-on to try to better explain the interactions between http-keep-alive and other timeouts, and the impacts on protocols (HTTP/1, HTTP/2 etc).	2024-07-18 14:24:07 +02:00
Valentine Krasnobaeva	83ff4db188	MINOR: cfgparse-global: move no<poller_name> in cfg_kw_list This commit continues to clean up cfg_parse_global() and to prepare the refactoring of master-worker mode. Master, after forking a worker, enters in its wait polling loop to catch signals and to provide master CLI. So, some poller types could be disabled for master process it as well.	2024-07-18 14:15:59 +02:00
Valentine Krasnobaeva	118ac11cea	MINOR: cfgparse-global: move mode's keywords in cfg_kw_list This commit cleans up cfg_parse_global() and prepares the config parser for master-worker mode refactoring, where daemon and master-worker fork() calls will happen very early in init(). So, the config in such case should be read twice: - at first: only some keywords in the global section for the mode discovery and everything, which is related to master process by opportunity; - at second: except the master process, all other keywords would be parsed;	2024-07-18 14:15:52 +02:00
Aurelien DARRAGON	d3d35f0fc6	BUILD: tree-wide: cast arguments to tolower/toupper to unsigned char (2) Fix build warning on NetBSD by reapplying f278eec37a ("BUILD: tree-wide: cast arguments to tolower/toupper to unsigned char"). This should fix issue #2551.	2024-07-18 13:29:52 +02:00
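The cast applied tree-wide by d3d35f0fc6 matters because tolower() and toupper() take an int whose value must be representable as an unsigned char (or be EOF); passing a plain char that happens to be negative is undefined behaviour. A minimal sketch, with a hypothetical str_to_lower() helper:

```c
#include <ctype.h>

/* Hypothetical helper: lowercases a NUL-terminated string in place. */
static void str_to_lower(char *s)
{
    for (; *s; s++) {
        /* the cast keeps bytes >= 0x80 in the valid 0..255 range even
         * where plain char is signed */
        *s = (char)tolower((unsigned char)*s);
    }
}
```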
Valentine Krasnobaeva	fcd4bf54c8	BUG/MEDIUM: startup: fix zero-warning mode Let's check the global counter of "ha_warning" messages a second time, if zero-warning is set. And let's do this just before forking. At this moment we are sure that we've already done all init operations where we could emit "ha_warning", and we still have the stderr fd opened. Even with the second check, we could lose some late and rare warnings about failing to drop supplementary groups and about re-enabling core dumps. Notes about this are added into the 'zero-warning' keyword description.	2024-07-18 05:24:56 +02:00
William Lallemand	beaa0e1635	DOC: configuration: issuers-chain-path is compatible with OCSP Since patch f3dfd95a ("MEDIUM: ocsp: fix ocsp when the chain is loaded from 'issuers-chain-path'") the OCSP features are compatible with 'issuers-chain-path'.	2024-07-17 18:20:43 +02:00
William Lallemand	8a3e4a608b	DOC: configuration: issuers-chain-path not compatible with OCSP State that issuers-chain-path is not compatible with OCSP features. Must be backported in every stable version.	2024-07-17 17:46:16 +02:00
William Lallemand	4bac38d088	REGTESTS: ssl: test the issuers-chain-path keyword Add a reg-test which test the completion of the issuers-chain-path keyword Note that it could be interesting to have the loading of a .ocsp combined with this, but our pki for OCSP tests lacks the SubjectKeyIdentifier extensions.	2024-07-17 16:52:06 +02:00
William Lallemand	ae8c3f7f77	MINOR: ssl: change issuers-chain for show_cert_detail() Since data->chain is now completed when loading the files, we don't need to use ssl_get0_issuer_chain() anywhere else in the code. data->chain will always be completed once the files are loaded, but we can't know from show_cert_detail() from what chain file it was completed. That's why the extra_chain pointer was added to dump the chain file.	2024-07-17 16:52:06 +02:00
William Lallemand	344c3ce8fc	MEDIUM: ssl: add extra_chain to ckch_data The extra_chain member is a pointer to the 'issuers-chain-path' file that completed the chain. This is useful to get what chain file was used.	2024-07-17 16:52:06 +02:00
Valentine Krasnobaeva	f3dfd95aa2	MEDIUM: ocsp: fix ocsp when the chain is loaded from 'issuers-chain-path' This fixes OCSP when the issuer chain is in a separate PEM file. This is the case with the issuers-chain-path keyword, which points to a folder that contains only PEM files with the RootCA and IntermediateCA. Before this patch, the chain from 'issuers-chain-path' was applied directly to the SSL_CTX without being applied to the data->chain structure. This would work for SSL traffic, but every test done with data->chain would fail, OCSP included, because the chain would be NULL. This patch moves the loading of the chain from ssl_sock_load_cert_chain(), which is the function that applies the chain to the SSL_CTX, to ssl_sock_load_pem_into_ckch() which is the function that loads the files into the ckch_data structure. Fixes issue #2635 but it changes things on the CLI, so that's not backportable.	2024-07-17 16:52:06 +02:00
Aurelien DARRAGON	c454296f07	OPTIM: sink: balance applets accross threads Most of the time all sink applets (which are responsible for relaying messages from the ring to the tcp servers endpoints) would end up being assigned to the first available thread (tid:0), resulting in excessive CPU usage on a single thread when multiple sink servers were defined (no matter if they were defined over multiple "ring" sections) and significant message load was pushed through them over the ring API. This patch is similar to 34e4085f ("MEDIUM: peers: Balance applets across threads") but for sinks. We use a slightly different approach, which is to elect a random thread instead of picking the one with the fewest applets. This proves to be already sufficient to alleviate the issue. In the case we want to have a better load distribution we should consider breaking existing connections to reestablish them on a new thread when we find out that they start monopolizing a cpu thread (ie: after a certain amount of messages for instance). Also check tcpchecks migrating model for inspiration. This patch depends on the previous one ("MEDIUM: sink: start applets asynchronously").	2024-07-17 16:45:49 +02:00
Aurelien DARRAGON	09d69eacf8	MEDIUM: sink: start applets asynchronously Since d9c1d33fa1 ("MEDIUM: applet: Add support for async appctx startup on a thread subset"), it is now possible to delay appctx's init: for that it is required that the .init callback is defined on the applet. When the applet will be processed on the first run, applet API will automatically finish the applet initialization. Thus we explicitly call appctx_wakeup() on the applet to schedule it for initial run instead of calling appctx_init() ourselves. This is done in prevision of the next patch in order to be able to schedule the applet on a different thread from the one executing sink_forward_session_create() function. Note: 'out_free_appctx' label was removed since it is no longer used.	2024-07-17 16:45:43 +02:00
Willy Tarreau	4de03e42cd	BUG/MAJOR: mux-h2: force a hard error upon short read with pending error A risk of truncated packet was addressed in 2.9 by commit 19fb19976f ("BUG/MEDIUM: mux-h2: Only Report H2C error on read error if demux buffer is empty") by ignoring CO_FL_ERROR after a recv() call as long as some data remained present in the buffer. However it has a side effect due to the fact that some frame processors only deal with full frames, for example, HEADERS. The side effect is that an incomplete frame will not be processed and will remain in the buffer, preventing the error from being taken into account, so the I/O handler wakes up the H2 parser to handle the error, and that one just subscribes for more data, and this loops forever wasting CPU cycles. Note that this only happens with errors at the SSL layer exclusively, otherwise we'd have a read0 pending that would properly be detected: conn->flags = CO_FL_XPRT_TRACKED | CO_FL_ERROR | CO_FL_XPRT_READY | CO_FL_CTRL_READY conn->err_code = CO_ERR_SSL_FATAL h2c->flags = H2_CF_ERR_PENDING | H2_CF_WINDOW_OPENED | H2_CF_MBUF_HAS_DATA | H2_CF_DEM_IN_PROGRESS | H2_CF_DEM_SHORT_READ The condition to report the error in h2_recv() needs to be refined, so that connection errors are taken into account either when the buffer is empty, or when there's an incomplete frame, since we're certain it will never be completed. We're certain to enter that function because H2_CF_DEM_SHORT_READ implies too short a frame, and earlier there's a protocol check to validate that no frame size is larger than bufsize, hence a H2_CF_DEM_SHORT_READ implies there's some room left in the buffer and we're allowed to try to receive. The condition to reproduce the bug seems super hard to meet but was observed once by Patrick Hemmer who had the reflex to capture lots of information that allowed to explain the problem. 
In order to reproduce it, the SSL code had to be significantly modified to alter received contents at very empiric places, but that was sufficient to reproduce it and confirm that the current patch works as expected. The bug was tagged MAJOR because when it triggers there's no other solution to get rid of it but to restart the process. However given how hard it is to trigger on a lab, it does not seem very likely to occur in field. This needs to be backported to 2.9.	2024-07-17 15:07:47 +02:00
Valentine Krasnobaeva	9371c28c28	BUG/MEDIUM: ssl_sock: fix deadlock in ssl_sock_load_ocsp() on error path We could run under heavy load in containers or on premises and some automatic tool in parallel could use CLI to check OCSP updates statuses or to upload new OCSP responses. So, calloc() to store OCSP update callback arguments may fail and ocsp_tree_lock need to be unlocked, when exiting due to this failure. This needs to be backported in all stable versions until v2.4.0 included.	2024-07-17 14:52:11 +02:00
Lukas Tribus	a9e3decd76	DOC: install: don't reference removed CPU arg Remove reference to the removed CPU= build argument in commit 018443b8a1 ("BUILD: makefile: get rid of the CPU variable"). This should be backported to 3.0.	2024-07-16 20:06:06 +02:00
Valentine Krasnobaeva	e8799d2880	MINOR: debug: keep runtime limits in postmortem It's useful to keep runtime limits (fd and RAM) in postmortem and show them in debug_parse_cli_show_dev(). Runtime limits are fed in feed_post_mortem_late(), as we are sure at this moment that all configuration was parsed and all applied limits were already adjusted.	2024-07-16 14:04:41 +02:00
Valentine Krasnobaeva	3abd03aa78	MINOR: debug: prepare to show runtime limits This is a preparation patch to extend postmortem in order to store runtime limits. No need to perform getrlimit() in feed_post_mortem(), as we do this in the very beginning of main() and we store initial fd limits in global 'rlim_fd_cur_at_boot' and 'rlim_fd_max_at_boot' variables.	2024-07-16 14:04:41 +02:00
Valentine Krasnobaeva	665dde6481	MINOR: debug: use LIM2A to show limits It is more handy to use LIM2A in debug_parse_cli_show_dev(), as it allows to show a custom string ("unlimited"), if a given limit value equals to 0. normalize_rlim() handler is needed to convert properly RLIM_INFINITY to zero, with the respect of type sizes, as rlim_t is always 4 bytes on 32bit and 64bit arch.	2024-07-16 14:04:41 +02:00
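The idea in 665dde6481 of mapping RLIM_INFINITY to zero so that zero can be printed as "unlimited" can be sketched as below. The helpers limit_to_ull() and limit_to_str() are hypothetical illustrations and are not the normalize_rlim() or LIM2A definitions from the HAProxy tree.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: map RLIM_INFINITY to 0 so that 0 means "unlimited",
 * widening explicitly to avoid depending on the size of rlim_t. */
static unsigned long long limit_to_ull(rlim_t v)
{
    return v == RLIM_INFINITY ? 0ULL : (unsigned long long)v;
}

/* Hypothetical helper: render a normalized limit, using buf for numbers. */
static const char *limit_to_str(unsigned long long v, char *buf, size_t len)
{
    if (v == 0)
        return "unlimited";
    snprintf(buf, len, "%llu", v);
    return buf;
}
```

Comparing against RLIM_INFINITY before any conversion avoids truncating the sentinel value when the destination type is narrower than rlim_t.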
Valentine Krasnobaeva	93cc7df276	MINOR: debug: keep runtime capabilities in post_mortem Let's extend postmortem to keep process runtime capabilities. This information is gathered in feed_post_mortem_late(), as it is called just before run_poll_loop() and we are sure at this moment, that all configuration settings were successfully applied.	2024-07-16 14:04:41 +02:00
Valentine Krasnobaeva	baa4e1cf39	MINOR: debug: store runtime uid/gid in postmortem Let's extend post_mortem to store runtime process uid and gid. This information is fed in feed_post_mortem_late(), just before calling run_poll_loop(). Like this we are sure that all configuration settings were successfully applied.	2024-07-16 14:04:41 +02:00
Valentine Krasnobaeva	ac8bd679dc	CLEANUP: debug: fix indents in debug_parse_cli_show_dev Fix indents in debug_parse_cli_show_dev() to avoid useless conflicts in case of future changes in this function or git-bisect.	2024-07-16 14:04:41 +02:00
Valentine Krasnobaeva	7cdf5751b5	MINOR: debug: prepare feed_post_mortem_late Process runtime information could be very useful in post_mortem, but we have to collect it just before calling run_poll_loop(). Like this we are sure that we've successfully applied all configuration parameters and that what we've collected are the latest runtime settings. The most appropriate place to collect such information is feed_post_mortem_late(). It's called in each thread, but puts thread info in the post_mortem only when it's in the last thread context. As it's called under mutex lock, other threads at this moment have to wait until feed_post_mortem_late() and the other initialization functions from per_thread_init_list finish. The number of threads could be large. So, to avoid spending a lot of time under the lock, let's exit immediately from feed_post_mortem_late() if it wasn't called in the last thread.	2024-07-16 14:04:41 +02:00
Willy Tarreau	e0e2b66132	BUG/MEDIUM: debug/cli: fix "show threads" crashing with low thread counts The "show threads" command introduced early in the 2.0 dev cycle uses appctx->st1 to store its context (the number of the next thread to dump). It goes back to an era where contexts were shared between the various applets and the CLI's command handlers. In fact it was already not good by then because st1 could possibly have APPCTX_CLI_ST1_PAYLOAD (2) in it, that would make the dump start at thread 2, though it was extremely unlikely. When contexts were finally cleaned up and moved to their own storage, this one was overlooked, maybe due to using st1 instead of st2 like most others. So it continues to rely on st1, and more recently some new flags were appended, one of which is APPCTX_CLI_ST1_LASTCMD (16) and is always there. This results in "show threads" believing it must start to dump from thread 16, and if this thread is not present, it can simply crash the process. A tiny reproducer is: global nbthread 1 stats socket /tmp/sock1 level admin mode 666 $ socat /tmp/sock1 - <<< "show threads" The fix for modern versions simply consists in assigning a context to this command from the applet storage. We're using a single int, no need for a struct, an int* will do it. That's valid till 2.6. Prior to 2.6, better switch to appctx->ctx.cli.i0 or i1 which are all properly initialized before the command is executed. This must be backported to all stable versions. Thanks to Andjelko Horvat for the report and the reproducer.	2024-07-16 11:35:06 +02:00
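A minimal sketch of the failure mode described above, assuming a heavily simplified context: struct demo_appctx, its svc_ctx field and the helper names are hypothetical and only mimic the shape of the problem; the flag value 16 is the one quoted in the commit message.

```c
#include <assert.h>

#define DEMO_CLI_ST1_LASTCMD 16   /* flag value quoted in the commit message */

/* Hypothetical, simplified applet context */
struct demo_appctx {
	unsigned int st1;   /* flag bits owned by the CLI layer */
	int svc_ctx;        /* private per-command storage, zeroed before each command */
};

/* Buggy pattern: the flag word is reused as "next thread to dump" */
static int first_thread_buggy(const struct demo_appctx *ctx)
{
	return (int)ctx->st1;
}

/* Fixed pattern: the index lives in storage private to the command */
static int first_thread_fixed(const struct demo_appctx *ctx)
{
	return ctx->svc_ctx;
}

/* Returns non-zero if <thr> designates an existing thread */
static int thread_index_valid(int thr, int nbthread)
{
	assert(nbthread > 0);
	return thr >= 0 && thr < nbthread;
}
```

With nbthread set to 1 and the LASTCMD flag always present, the buggy accessor yields index 16, which does not designate an existing thread, while the fixed one starts at 0.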
Amaury Denoyelle	d57b95aab7	BUG/MINOR: do not close uninit FD in quic_test_socketops() On startup, quic_test_socketops() is called to ensure that the chosen configuration options are compatible with the UDP system stack. A dummy FD is allocated to invoke various setsockopt() settings. If no tests are required, the FD is not allocated. In this case, close() should not be called. This is mostly for better coding as this does not cause any real issue for users. This should fix github issue #2638. No need to backport.	2024-07-16 10:51:02 +02:00
Aurelien DARRAGON	05f33e95ba	MINOR: server: better mt_list usage for node migration (prev_deleted handling) Now that mt_list v2 api was merged into haproxy's codebase in 4e65fc6 (" MAJOR: import: update mt_list to support exponential back-off (try #2)"), let's fix a hack in cli_parse_delete_server() which abused from mt_list api to migrate an element from one list to another: there used to be a tiny race there between the pop and the append operations, race that was compensated by the fact that it was performed under full thread isolation. However that was a bad example of the mt_list API which could have resulted in actual bug if the code was duplicated elsewhere without thread isolation. To fix this, we now make use of the MT_LIST_FOR_EACH_ENTRY_LOCKED() macro which allows us to simply migrate the current element to another list since the element is appended into another one while still in busy state and then unlinked from the original list.	2024-07-16 09:12:39 +02:00
Willy Tarreau	75b335abc7	MINOR: fd: don't scan the full fdtab on all threads During tests, it's pretty visible that with many threads and a large number of FDs, the process may take time to be ready. The reason for this is that the full fdtab array is scanned by each and every thread at boot in fd_reregister_all() in order to make each thread-local poller adopt the FDs that are relevant to it. The problem is that when dealing with 1-2M FDs and 64+ threads, it starts to represent quite a number of loops, and usually the fdtab array doesn't entirely fit in the CPU's L3 cache, causing extra memory accesses. It's particularly visible when issuing debugging commands to the CLI because usually the first one fails while the CPU is at 100% for half a second (which also is socat's timeout). A quick test with this: global stats socket /tmp/sock1 level admin mode 666 stats timeout 1h maxconn 2000000 And the following script started in another window: while ! time socat -t5 - /tmp/sock1 <<< "show version";do date -Ins;done shows that it takes 1.58s for the socat instance that succeeds on an Ampere Altra with 80 cores, this requires to change the timeout (defaults to half a second) otherwise it returns nothing. In addition it also means that during reloads, some CPU spikes will be noticed. Adding a prefetch of the current FD + 16 improves the startup time by 30% but that's far from being sufficient. In practice all of this is performed at boot time, a moment at which we know that extremely few FDs are registered (basically just the listeners), so FD numbers are usually very low and the rest of the table is scanned for no benefit. Ideally, knowing upfront how many FDs we have should be sufficient. A first approach would consist in counting the entries on a single thread before registering pollers. It's not necessarily efficient and would take time anyway. This patch takes a different approach. 
It consists in keeping a thread-local max ("fd_highest") that is updated whenever fd_insert() is called with a larger number. Of course this is not correct once all threads have started, but it will remain valid during boot since the same value is used during startup and is cloned for each thread, and no scheduling happens anywhere during this period, so that all threads are aware of the highest FD they've seen registered, even if it had been done in some init code, and this without having to deal with a shared variable. Here on the test platform, the script gets its response in 10ms vs 1580 before.	2024-07-15 19:19:13 +02:00
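The approach described above can be sketched as follows. This is an illustration only: the demo_ names and the table size are hypothetical, and the real fd_insert() and fd_reregister_all() do considerably more than this.

```c
#include <assert.h>

#define DEMO_MAXFD 1024                        /* hypothetical table size */

static unsigned char demo_fdtab[DEMO_MAXFD];   /* 1 = FD registered */
static _Thread_local int demo_fd_highest = -1; /* highest FD seen by this thread */

/* Register <fd> and remember the highest value seen so far */
static void demo_fd_insert(int fd)
{
	assert(fd >= 0 && fd < DEMO_MAXFD);
	demo_fdtab[fd] = 1;
	if (fd > demo_fd_highest)
		demo_fd_highest = fd;
}

/* Scan only up to the highest known FD instead of the whole table.
 * Returns the number of slots visited; <adopted> receives the number of
 * registered FDs found.
 */
static int demo_fd_reregister_all(int *adopted)
{
	int fd, visited = 0;

	*adopted = 0;
	for (fd = 0; fd <= demo_fd_highest; fd++) {
		visited++;
		if (demo_fdtab[fd])
			(*adopted)++;
	}
	return visited;
}
```

The scan cost then depends on the highest registered FD rather than on the configured table size, which is what matters at boot when only a few listeners exist.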
Willy Tarreau	a5c5a68454	BUILD: mux-spop: fix build failure on gcc 4-10 and clang A label at end of block was added in mux_spop.c in function spop_conn_update_timeout() by commit 7e1bb7283b ("MEDIUM: mux-spop: Introduce the SPOP multiplexer"). This is normally not permitted, so gcc-4 to 10 and clang whine about it: CC src/mux_spop.o src/mux_spop.c: In function 'spop_conn_update_timeout': src/mux_spop.c:899:2: error: label at end of compound statement 899 \| leave: \| ^~~~~ Let's just add a return there to make the compiler happy. No backport is needed.	2024-07-15 19:19:13 +02:00
Christopher Faulet	b353232641	DOC: spoe: Update SPOE documentation to reflect recent refactoring The SPOE was refactored. Several parameters were deprecated. Fragmentation and async capabilities support were removed. The default log-format was updated too. So, the SPOE documentation was updated accordingly. The related issue is #2502.	2024-07-12 16:38:49 +02:00
Christopher Faulet	e83ab972cc	MEDIUM: spoe: Make the SPOE applet use its own buffers The SPOE applet is rewritten to use its own buffers. It is not a huge change because, once started, the only responsibility of the SPOE applet is to transfer the ACK frame to the SPOE filter. So it means it does not send any data to the opposite endpoint, the NOTIFY frame was already transferred during the applet creation. And it does only receive one full frame. Once received, it can exit. The related issue is #2502.	2024-07-12 15:27:05 +02:00
Christopher Faulet	1dd2e484b0	MEDIUM: spoe: Forward SPOE context error to the SPOE applet Errors triggered by a SPOE filter instance, mainly the processing timeout, are now forwarded to the SPOE applet. This way, an error can be reported to the SPOP mux stream to abort it early. Note that, for now, no abort reason is set because the SPOP connection is not closed. Only the SPOP stream is aborted. But thanks to this patch, the SPOE applet can be released immediately, instead of waiting for the ACK frame or an error on the mux side. The related issue is #2502.	2024-07-12 15:27:05 +02:00
Christopher Faulet	1755c32949	MEDIUM: mux-spop: Announce the pipeling support if possible Reintroduce the pipelining support. Everything was already in place to be able to multiplex the streams on a SPOP connection. Here, the pipelining support is announced and checked in the agent replies. A hard-coded limit of 20 streams is set if the pipelining is supported on both sides. Otherwise, it is disabled and only one stream at a time is allowed. The related issue is #2502.	2024-07-12 15:27:05 +02:00
Christopher Faulet	880c037bcf	MEDIUM: mux-spop: Add checks on received frames Some conformance checks on received frames are added with this patch. The idea is to detect invalid frames and ignore unknown ones if possible. All checks are performed on the frame metadata, mainly on the stream and the frame identifiers. The related issue is #2502.	2024-07-12 15:27:05 +02:00
Christopher Faulet	7890d6b28d	MINOR: spoe: Remove the spop version from the SPOE appctx context The SPOE applet no longer manipulates the SPOP version. So it can be safely removed from its context. The related issue is #2502.	2024-07-12 15:27:05 +02:00
Christopher Faulet	62d3a96301	MEDIUM: mux-spop/spoe: Save negociated max-frame-size value in the mux The SPOE applet is just a pass-through now. It is no longer responsible for checking the frame size. On the other hand, the SPOP multiplexer negotiates the maximum frame size with the agent. So, it seems logical to store this negotiated value in the mux and no longer in the applet context. The related issue is #2502.	2024-07-12 15:27:05 +02:00
Christopher Faulet	ba64bc3f20	MEDIUM: spoe: Directly receive ACK frame in the SPOE context buffer Just like the previous patch, here we avoid a buffer copy between the SPOE applet and the SPOE filter for the ACK reply. The buffer from the SPOE context is used to retrieve the ACK reply from the channel response buffer. The related issue is #2502.	2024-07-12 15:27:05 +02:00
Christopher Faulet	07cf7769ce	MEDIUM: spoe: Directly xfer NOTIFY frame when SPOE applet is created Instead of using a buffer from the SPOE filter to store the NOTIFY frame, to copy it in a trash buffer in the SPOE applet to add meta-data and then transfer it to the channel, the original buffer is directly transferred to the channel during the SPOE applet creation. The SPOE applet is thus simplified, the I/O handler is now only responsible for retrieving the ACK reply. The related issue is #2502.	2024-07-12 15:27:05 +02:00
Christopher Faulet	6b9daec93d	MINOR: stats-html: Display reuse ratio for spop connections Now SPOP connections can be reused, it could be pretty useful to know the reuse rate. The corresponding backend and server counters are already incremented, but not displayed on the stats HTML page. Thanks to this patch, it is now possible to get it, just like for HTTP proxies. The related issue is #2502.	2024-07-12 15:27:05 +02:00
Christopher Faulet	e68274c90a	MAJOR: mux-spop: Make the SPOP connections reusable Thanks to this patch, SPOP connections can now be inserted in the idle connections list of the server or the session. There is no multiplexing but SPOP connections can be reused. It is the same mechanism as for other muxes. Nothing really new. But it is a huge improvement. The related issue is #2502.	2024-07-12 15:27:05 +02:00
Christopher Faulet	078f9d3583	MINOR: mux-spop: Use a dedicated function to update the SPOP connection timeout Force the SPOP servers to use the SPOE engine identifier as pool connection name. This way, idle SPOP connections, once implemented, of different engine but using the same backend will not be mixed up. The related issue is #2502.	2024-07-12 15:27:05 +02:00
Christopher Faulet	e65ff4bf58	MEDIUM: spoe: Force the reuse 'always' mode for SPOP backends The reuse "always" mode is forced for SPOP backends. For now, SPOP connections cannot be idle, but once implemented, thanks to this patch, it will be possible to reuse SPOP connections. The related issue is #2502.	2024-07-12 15:27:05 +02:00
Christopher Faulet	d2ce835fb7	MINOR: backend: Remove test on HTX streams to reuse idle connections on connect In connect_server() function, there is a test to be able to reuse idle connections for HTX streams only. Till now, only HTTP connections can be idle. And this test was added to be sure not to reuse idle connections for legacy HTTP streams. But the legacy HTTP was removed in HAProxy-2.1. So we can safely remove this test. The related issue is #2502.	2024-07-12 15:27:05 +02:00
Christopher Faulet	3a7879a652	MEDIUM: spoe: Set a specific name for the connection pool of SPOP servers With this patch, we force the connection pool name of SPOP servers to the SPOE engine identifier. This way, SPOP idle connections cannot be shared between different engines. The related issue is #2502.	2024-07-12 15:27:05 +02:00
Christopher Faulet	706a57d55a	MINOR: spoe: Add internal sample fetch to retrieve the SPOE engine ID The internal sample fetch "spoe.engine-id" is added. It may be used to retrieve the current engine identifier, but only if the client endpoint is an SPOE applet. For now, this sample is not documented. It will only be used to set the connection pool name for a specific engine. This way, several engine can use the same SPOP backend without sharing their idle connections. The documentation will be added later, mainly because other SPOE sample fetches will be added, and some changes are expected. The related issue is #2502.	2024-07-12 15:27:05 +02:00
Christopher Faulet	a492e08e62	CLEANUP: spoe: Uniformize function definitions SPOE function definitions were split on 2 or more lines, with the return type alone on the first line. It is unusual in the HAProxy code. The related issue is #2502.	2024-07-12 15:27:05 +02:00
Christopher Faulet	cab98784d8	MAJOR: spoe: Rewrite SPOE applet to use the SPOP mux It is the huge part of the series. The patch is not so huge, it removes functions to produce or consume frames. The SPOE applet is pretty light now. But since this patch, the SPOP multiplexer is now used. The SPOP mode is now automatically used for SPOP backends. So if there are bugs in the SPOP multiplexer, they will be visible now. The related issue is #2502.	2024-07-12 15:27:04 +02:00
Christopher Faulet	1bea73612a	MEDIUM: check/spoe: Use SPOP multiplexer to perform SPOP health-checks The SPOP health-checks are now performed using the SPOP multiplexer. This will be fixed later, but for now, it is considered as a L4 health-check and no specific status code is reported. It means the corresponding vtest script is marked as broken for now. Functionally speaking, the same is performed. A connection is opened, a HELLO frame is sent to the agent and we wait for the HELLO frame from the agent in reply. But only L4OK, L4KO or L4TOUT will be reported. The related issue is #2502.	2024-07-12 15:27:04 +02:00
Christopher Faulet	7e1bb7283b	MEDIUM: mux-spop: Introduce the SPOP multiplexer It is not possible yet to use it. Idle connections and pipelining mode are not supported for now. But it should be possible to open a SPOP connection, perform the HELLO handshake, send a NOTIFY frame based on data produced by the client side and receive the corresponding ACK frame to transfer its content to the client side. The related issue is #2502.	2024-07-12 15:27:04 +02:00
Christopher Faulet	d0d23a7a66	MINOR: spoe: Move spoe_str_to_vsn() into the header file The function used to convert the SPOE version from a string to an integer is now located in spoe-t.h header file. The related issue is #2502.	2024-07-12 15:27:04 +02:00
Christopher Faulet	08b522d6ac	MINOR: spoe: Move all stuff regarding the filter/applet in the C file Structures describing the SPOE applet context, the SPOE filter configuration and context and the SPOE messages and groups are moved in the C file. In spoe-t.h file, it remains the structure describing an SPOE agent and flags used by both sides. In addition, the SPOE frontend, created for a given SPOE engine, is moved from the SPOE filter configuration to the SPOE agent structure. The related issue is #2502.	2024-07-12 15:27:04 +02:00
Christopher Faulet	e6145a0ea1	MINOR: spoe: Dynamically alloc the message list per event of an agent The inline array used to store, the configured messages per event in the SPOE agent structure, is replaced by a dynamic array, allocated during the configuration parsing. The main purpose of this change is to be able to move all stuff regarding the SPOE filter and applet in the C file. The related issue is #2502.	2024-07-12 15:27:04 +02:00
Christopher Faulet	ce53bb6284	MINOR: spoe: Rename some flags and constant to use SPOP prefix A SPOP multiplexer will be added. Many flags, constants and structures will be removed from the applet scope. So the "SPOP" prefix is used instead of "SPOE", to be consistent. The related issue is #2502.	2024-07-12 15:27:04 +02:00
Christopher Faulet	51ebf644e5	MINOR: stconn: Use a dedicated function to get the opposite sedesc se_opposite() function is added to let an endpoint retrieve the opposite endpoint descriptor. Muxes supporting the zero-copy forwarding can now use it. The se_shutdown() function too. This will be used by the SPOP multiplexer to be able to retrieve the SPOE agent configuration attached to the applet on client side. The related issue is #2502.	2024-07-12 15:27:04 +02:00
Christopher Faulet	4b8098bf48	MINOR: connection: No longer include stconn type header in connection-t.h It is a small change, but it is cleaner not to include the stconn-t.h header in connection-t.h, mainly to avoid circular definitions. The related issue is #2502.	2024-07-12 15:27:04 +02:00
Christopher Faulet	33ac3dabcb	MEDIUM: applet: Add a .shut callback function for applets Applets can now define a shutdown callback function, just like the multiplexer. It is especially useful to get the abort reason. This will be pretty useful to get the status code from the SPOP stream to report it at the SPOE filter level. The related issue is #2502.	2024-07-12 15:27:04 +02:00
Christopher Faulet	1538c4aa82	MEDIUM: proxy/spoe: Add a SPOP mode The SPOE was significantly lightened. It is now possible to refactor it to use a dedicated multiplexer. The first step is to add a SPOP mode for proxies. The corresponding multiplexer mode is also added. For now, there is no SPOP multiplexer, so it is only declarative. But at the end, the SPOP multiplexer will be automatically selected for servers inside a SPOP backend. The related issue is #2502.	2024-07-12 15:27:04 +02:00
Christopher Faulet	b986952a75	MINOR: spoe: Remove the dedicated SPOE applet task The dedicated task per SPOE applet is no longer used. So it is removed. The related issue is #2502.	2024-07-12 15:27:04 +02:00
Christopher Faulet	4e589095d9	MAJOR: spoe: Remove idle applets and pipelining support Management of idle applets is removed. Consequently, the pipelining support is also removed. It is a huge change but it should be transparent for the agents, except regarding the performances. Of course, being able to reuse already opened connections and being able to multiplex frames on a given connection is a must have. These features will be restored later. hello and idle timeout are no longer used. Because an applet is spawned to process a NOTIFY frame and closed after receiving the ACK reply, the processing timeout is the only one required. In addition, the parameters to limit the SPOE applet creation are no longer used either. The related issue is #2502.	2024-07-12 15:27:04 +02:00
Christopher Faulet	2405881ab0	MINOR: spoe: Remove debugging All the SPOE debugging is removed. The code will be easier to rework this way and the debugging will be mainly moved in the SPOP multiplexer via the trace API. The related issue is #2502.	2024-07-12 15:27:04 +02:00
Christopher Faulet	d37489abef	MINOR: spoe: Use only a global engine-id per agent Because the async mode was removed, it is no longer mandatory to announce a different engine identifiers per thread for a given SPOE agent. This was used to be sure requests and the corresponding responses are stuck on the same thread. So, now, a SPOE agent only announces one engine identifier on all connections. No changes should be expected for agents. The related issue is #2502.	2024-07-12 15:27:04 +02:00
Christopher Faulet	52ad7eb79e	MEDIUM: spoe: Remove async mode support The support for asynchronous mode, the ability to send messages on a connection and receive the responses on any other connections, is removed. It appears this feature was a bit overkill. And it is a problem for this refactoring. This feature is removed and will not be restored at the end. It is not a big deal for agent supporting the async mode because it is usable if it is announced on both sides. HAProxy stops to announce it. This should be transparent for agents. The related issue is #2502.	2024-07-12 15:27:04 +02:00
Christopher Faulet	e3c92209f7	MEDIUM: spoe: Remove fragmentation support It is the first patch of a long series to refactor the SPOE filter. The idea is to rely on a dedicated multiplexer instead of hacking HAProxy with a list of applets processing a message queue. First of all, optional features will be removed. Some will be restored at the end, some others will just be removed. It is the case here. The frame fragmentation support is removed. The only purpose of this feature is to be able to support the streaming. Because it is out of the scope of this refactoring, the fragmentation is removed. The related issue is #2502.	2024-07-12 15:27:04 +02:00
Christopher Faulet	249a547f37	CLEANUP: stconn: Fix a typo in comments for SE_ABRT_SRC_* Just a little typo: s/set bu/ set by/	2024-07-12 15:27:04 +02:00
Christopher Faulet	0764445505	BUG/MINOR: session: Eval L4/L5 rules defined in the default section It is possible to define TCP/HTTP rules in a named default section to inherit from it in a proxy. However, there is an issue with L4/L5 rules. Only the lists of the current frontend are checked to know if an eval must be performed. Nothing is done for an empty list. Of course, the lists of the default proxy must also be checked to be sure not to ignore default L4/L5 rules. It is now fixed. This patch should fix the issue #2637. It must be backported as far as 2.6.	2024-07-12 15:27:04 +02:00
Valentine Krasnobaeva	9302869c95	BUG/MINOR: limits: fix license type in limits.h Need to use LGPL-2.1-or-later in headers since our headers default to LGPL.	2024-07-11 18:15:48 +02:00
Amaury Denoyelle	3be58fc720	CLEANUP: quic: rename TID affinity elements This commit is the renaming counterpart of the previous one, this time for quic_conn module. Several elements related to TID affinity update from quic_conn has been renamed : public functions, but also flag renamed to QUIC_FL_CONN_TID_REBIND and trace event to QUIC_EV_CONN_BIND_TID. This should be backported with the same instruction as the previous commit.	2024-07-11 15:14:06 +02:00
Amaury Denoyelle	9fbe8b0334	CLEANUP: proto: rename TID affinity callbacks Since the following patch, protocol API to update a connection TID affinity has been extended. commit 1a43b9f32c71267e3cb514aa70a13c75adb20742 MINOR: proto: extend connection thread rebind API The single callback set_affinity has been split into 3 different functions which are called at different stages during listener_accept(), depending on accept queue push success or not. However, the naming was rendered confusing by the usage of function prefix 1 and 2. Rename proto callback related to TID affinity update and use the following names : * bind_tid_prep * bind_tid_commit * bind_tid_reset This commit should probably be backported at least up to 3.0 with the above patch. This is because the fix was recently backported and it would allow to keep changes minimal between the two versions. It could even be backported up to 2.8 if there is no major conflict.	2024-07-11 15:14:06 +02:00
Christopher Faulet	2cb5b7dca6	BUG/MEDIUM: bwlim: Be sure to never set the analyze expiration date in past Every time a bandwidth limitation is evaluated on a channel, the analyze expiration date is renewed, mainly based on the internal bandwidth limitation filter expiration date. However, when the filter is called while there is no data to filter, we skip all limitation computations to jump at the end of the function. At this stage, the analyze expiration date is renewed before exiting. But here the internal expiration date may be expired and not reset. To sum up, it is possible to set the analyze expiration date of a channel in the past. It is unexpected and this could lead to a loop in process_stream. To fix the issue, we just now take care to reset the internal expiration date, if needed, before exiting. This patch should fix the issue #2634. It must be backported as far as 2.8.	2024-07-11 14:51:23 +02:00
Amaury Denoyelle	b0990b38f8	MINOR: quic: add counters of sent bytes with and without GSO Add a sent bytes counter for each quic_conn instance. A secondary field only accounts for bytes sent via GSO, which is useful to check whether GSO is activated. For the moment, these counters are reported on "show quic" but not aggregated on proxy quic module stats.	2024-07-11 11:02:44 +02:00
Amaury Denoyelle	d0ea173e35	MEDIUM: quic: implement GSO fallback mechanism UDP GSO on Linux is not implemented in every network devices. For example, this is not available for veth devices frequently used in container environment. In such case, EIO is reported on send() invocation. It is impossible to test at startup for proper GSO support in this case as a listener may be bound on multiple network interfaces. Furthermore, network interfaces may change during haproxy lifetime. As such, the only option is to react on send syscall error when GSO is used. The purpose of this patch is to implement a fallback when encountering such conditions. Emission can be retried immediately by trying to send each prepared datagrams individually. To support this, qc_send_ppkts() is able to iterate over each datagram in a so-called non-GSO fallback mode. Between each emission, a datagram header is rewritten in front of the buffer which allows the sending loop to proceed until last datagram is emitted. To complement this, quic_conn listener is flagged on first GSO send error with value LI_F_UDP_GSO_NOTSUPP. This completely disables GSO for all future emission with QUIC connections using this listener. For the moment, non-GSO fallback mode is activated when EIO is reported after GSO has been set. This is the error reported for the veth usage described above.	2024-07-11 11:02:44 +02:00
Amaury Denoyelle	af22792a43	MAJOR: quic: support GSO when encoding datagrams QUIC datagrams are encoded during emission via the function qc_prep_pkts(). By default, if GSO is not used, each datagram is prefixed by a metadata header which specifies its length and the address of its first quic_tx_packet instance. If GSO is activated, the metadata header won't be inserted for datagrams following the first one sent in a single syscall. The length field will contain the total size of these datagrams. This allows to support both GSO and non-GSO prepared datagrams in the same Tx buffer. qc_send_ppkts() is invoked just after datagrams encoding. It iterates over each metadata header in the Tx buffer to send each datagram individually. If the length field is bigger than the network MTU, GSO usage is assumed and the qc_snd_buf() GSO parameter will be set. Another important point to note regarding GSO implementation is that during datagram encoding, packets from the same datagram instance are attached together. However, if using GSO, consecutive packets from different datagrams are also linked, but without QUIC_FL_TX_PACKET_COALESCED flag. This allows to properly update quic_conn status with all sent packets in qc_send_ppkts(). Packets from different datagrams are then unlinked to treat them separately when receiving corresponding ACK frames.	2024-07-11 11:02:44 +02:00
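The "length greater than MTU implies GSO" rule described above can be sketched as follows. This is an assumption-laden illustration: struct demo_dgram_hdr carries only a length (the real metadata header also holds the address of the first quic_tx_packet), and demo_count_sends() is a hypothetical helper that merely walks the buffer instead of sending anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical metadata header: payload length only */
struct demo_dgram_hdr {
	uint16_t len;
};

/* Walk a Tx buffer made of (header, payload) records. Each record maps to
 * one send; a record longer than <mtu> is assumed to be a GSO batch.
 * Returns the number of sends, or -1 if the buffer is malformed.
 * <gso_sends> receives the number of sends that would use GSO.
 */
static int demo_count_sends(const unsigned char *buf, size_t size,
                            uint16_t mtu, int *gso_sends)
{
	size_t pos = 0;
	int sends = 0;

	assert(buf != NULL && gso_sends != NULL);
	*gso_sends = 0;
	while (pos < size) {
		struct demo_dgram_hdr hdr;

		if (size - pos < sizeof(hdr))
			return -1;              /* truncated header */
		memcpy(&hdr, buf + pos, sizeof(hdr));
		pos += sizeof(hdr);
		if (hdr.len > size - pos)
			return -1;              /* truncated payload */
		if (hdr.len > mtu)
			(*gso_sends)++;         /* several datagrams in one syscall */
		pos += hdr.len;
		sends++;
	}
	return sends;
}
```

This detection only works if the MTU cannot change between encoding and sending, since the comparison is the sole marker of a GSO batch.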
Amaury Denoyelle	448d3d388a	MINOR: quic: add GSO parameter on quic_sock send API Add <gso_size> parameter to qc_snd_buf(). When non-null, this specifies the value for socket option SOL_UDP/UDP_SEGMENT. This allows to send several datagrams in a single call by splitting data multiple times at <gso_size> boundary. For now, <gso_size> remains set to 0 by caller, as such there should not be any functional change.	2024-07-11 11:02:44 +02:00
Amaury Denoyelle	96a34d79d9	MINOR: quic: define quic_cc_path MTU as constant Future commits will implement GSO support to be able to emit multiple datagrams in a single syscall invocation. This will be used every time there is more data to send than the UDP network MTU. No change will be done for Tx buffer encoding, in particular when using extra metadata datagram header. When GSO is used, the length field will contain the total length of all datagrams to emit in a single GSO syscall send. As such, QUIC send functions will detect that GSO is in use if the total length is greater than the MTU. This last assumption requires the MTU to be constant. Indeed, in case qc_send() is interrupted, the Tx buffer will be left with prepared datagrams. These datagrams will be emitted at the next qc_send() invocation. If the MTU changed between these two calls, it would be impossible to know if GSO was used or not. To prevent this, mark <mtu> field of quic_cc_path as constant.	2024-07-11 11:02:44 +02:00
Amaury Denoyelle	35470d5185	MINOR: quic: activate UDP GSO for QUIC if supported Add a startup test for GSO support in quic_test_socketopts() and automatically activate it in qc_prep_pkts() when building datagrams as big as MTU. Also define a new config option tune.quic.disable-udp-gso. This is useful to prevent warning on older platform or to debug an issue which may be related to GSO.	2024-07-11 11:02:44 +02:00
Amaury Denoyelle	5bddf39fb2	MINOR: quic: extend detection of UDP API OS features QUIC haproxy implementation relies on specific OS features to activate some UDP optimization. One of these is the ability to bind multiple sockets on the same address, which is necessary to have a dedicated socket for each QUIC connections. This feature support is tested during startup via an internal proto-quic function. It automatically deactivate socket per connection if OS is not compatible. The purpose of this patch is to render this QUIC feature detection code more generic. Function is renamed quic_test_socketopts() and is still invoked on startup. Its internal code has been refactored to be able to implement other features support test in it. Return value has also been changed and is now taken into account. In case of ERR_FATAL, haproxy startup will be interrupted. This happens on socket() syscall failure used to duplicate a QUIC listener FD. This commit will become necessary to detect GSO support on startup.	2024-07-11 11:02:44 +02:00
Amaury Denoyelle	cac47d19bd	CLEANUP: quic: remove obsolete comment on send Remove comment on send which is now obsolete since the introduction of per-connection socket.	2024-07-11 11:02:44 +02:00
Valentine Krasnobaeva	3a0b44b122	MINOR: limits: add is_any_limit_configured Let's encapsulate the check of all supported for now process internal limits in a separate function. This will help in cases, when we need to simply check if we have even only one limit set in the configuration file. It's important, as the default value for a one limit (fd-hard-limits, for example) sometimes must not affect the computation of the others.	2024-07-10 18:05:48 +02:00
Valentine Krasnobaeva	1f8addfdc2	REORG: haproxy: move limits handlers to limits This patch moves handlers to compute process related limits in 'limits' compilation unit.	2024-07-10 18:05:48 +02:00
Valentine Krasnobaeva	22db643648	MINOR: haproxy: prepare to move limits-related code This patch is done in order to prepare the move of handlers to compute and to check process related limits as maxconn, maxsock, maxpipes. So, these handlers become no longer static due to the future move. We add the handlers declarations in limits.h in this patch as well, in order to keep the next patch, dedicated to code replacement, without any additional modifications. Such split also assures that this patch can be compiled separately from the next one, where we move the handlers. This is important in case of git-bisect.	2024-07-10 18:05:48 +02:00
Valentine Krasnobaeva	b8dc783eb9	REORG: global: move rlim_fd_*_at_boot in limits Let's move in 'limits' compilation unit global variables to keep the initial process fd limits.	2024-07-10 18:05:48 +02:00
Valentine Krasnobaeva	47f2afb436	CLEANUP: fd: rm struct rlimit definition As raise_rlim_nofile() was moved to the limits compilation unit, limits.h includes the system <sys/resource.h>. So, this definition of the rlimit system type structure is no longer needed to compile the fd unit.	2024-07-10 18:05:48 +02:00
Valentine Krasnobaeva	3759674047	REORG: fd: move raise_rlim_nofile to limits Let's move raise_rlim_nofile() from 'fd' compilation unit to 'limits', as it wraps setrlimit to change process RLIMIT_NOFILE.	2024-07-10 18:05:48 +02:00
Valentine Krasnobaeva	1517bcb5e3	MINOR: limits: prepare to keep limits in one place The code which gets, sets and checks initial and current fd limits and process related limits (maxconn, maxsock, ulimit-n, fd-hard-limit) is spread around different functions in haproxy.c and in fd.c. Let's group it together in dedicated limits.c and limits.h. This patch is done in order to prepare the moving of limits-related functions from different places to the new 'limits' compilation unit. It helps to keep clean the next patch, which will do only the move without any additional modifications. Such detailed split is needed in order to be sure not to break accidentally limits logic and in order to be able to compile each commit separately in case of git-bisect.	2024-07-10 18:05:48 +02:00
Willy Tarreau	a4bc71a1a3	[RELEASE] Released version 3.1-dev3 Released version 3.1-dev3 with the following main changes : - BUG/MINOR: quic: Wrong datagram building when probing. - BUG/MEDIUM: quic: fix possible exit from qc_check_dcid() without unlocking - BUG/MINOR: promex: Remove Help prefix repeated twice for each metric - DOC: configuration: add details about crt-store in bind "crt" keyword - BUG/MEDIUM: hlua/cli: Fix lua CLI commands to work with applet's buffers - DOC: configuration: more details about the master-worker mode - BUG/MEDIUM: server: fix race on server_atomic_sync() - BUG/MINOR: jwt: don't try to load files with HMAC algorithm - CLEANUP: quic: cleanup prototypes related to CIDs handling - CLEANUP: quic: remove non-existing quic_cid_tree definition - MINOR: quic: remove access to CID global tree outside of quic_cid module - REORG: quic: remove quic_cid_trees reference from proto_quic - MINOR: quic: add 2 BUG_ON() on datagram dispatch - MINOR: quic: ensure quic_conn is never removed on thread affinity rebind - MEDIUM: init: set default for fd_hard_limit via DEFAULT_MAXFD - DOC: configuration: update maxconn description - MINOR: proto: extend connection thread rebind API - BUG/MEDIUM: quic: prevent crash on accept queue full - BUG/MEDIUM: peers: Fix crash when syncing learn state of a peer without appctx - CI: add weekly QUIC Interop regression against LibreSSL - DEV: flags/quic: decode quic_conn flags - MINOR: quic: rename "ssl error" trace - BUG/MEDIUM: init: fix fd_hard_limit default in compute_ideal_maxconn - BUG/MINOR: jwt: fix variable initialisation - MINOR: ssl/sample: ssl_c_san returns a comma separated list of SAN - OPTIM: pool: improve needed_avg cache line access pattern - MAJOR: import: update mt_list to support exponential back-off (try #2) - CI: weekly QUIC Interop: try to fix private image - BUG/MINOR: h1: Fail to parse empty transfer coding names - BUG/MINOR: h1: Reject empty coding name as last transfer-encoding value - BUG/MEDIUM: 
h1: Reject empty Transfer-encoding header - BUG/MEDIUM: spoe: Be sure to create a SPOE applet if none on the current thread - BUILD: listener: silence a build warning about unused value without threads - DOC: architecture: remove the totally outdated architecture manual - SCRIPTS: create-release: no more need to skip architecture.txt	2024-07-10 15:39:36 +02:00
Willy Tarreau	d96b9f4249	SCRIPTS: create-release: no more need to skip architecture.txt Now that it's gone we won't stumble upon it by accident anymore.	2024-07-10 15:38:45 +02:00
Willy Tarreau	95b9d8abee	DOC: architecture: remove the totally outdated architecture manual We've discussed about removing it many times and I thought it had been removed long ago, but apparently not as William proved me. Let's get rid of it now. It's totally outdated (last updated 18 years ago, when laptop processors were still 32 bits), mentions keywords and external products that don't exist anymore. It's not even on docs.haproxy.org. At some point, old stuff must really die.	2024-07-10 15:38:20 +02:00
Willy Tarreau	0cb8743209	BUILD: listener: silence a build warning about unused value without threads A variable introduced in commit 1a43b9f32c ("MINOR: proto: extend connection thread rebind API") is not used without threads and causes a build warning. Let's just mark it maybe_unused. Since the commit above is tagged for backporting, this one will need to be backported along with it.	2024-07-10 15:17:04 +02:00
Christopher Faulet	5e84f13a0b	BUG/MEDIUM: spoe: Be sure to create a SPOE applet if none on the current thread When a message is queued, waiting to be processed by a SPOE applet, there are some heuristics to know if a new applet must be created or not. There are 2 conditions to skip the applet creation: 1 - if there are enough idle applets on the current thread, or, 2 - if the processing rate on the current thread is high enough to handle this new message In the 2nd case, there is a flaw when the number of processed messages falls to zero while the processing rate is still greater than zero. In that case, we will skip the SPOE applet creation without taking care to check there is at least one applet on the current thread. So now, the conditions above to skip the SPOE applet creation are only evaluated if there is at least one applet on the current thread. This patch must be backported to all stable versions.	2024-07-10 10:52:20 +02:00
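The decision described in 5e84f13a0b can be sketched as below. This is a minimal illustration, not the SPOE code: the structure, its field names and the two thresholds (comparing against the number of waiting messages) are assumptions made for the example. Only the ordering comes from the commit message: the two skip conditions are evaluated only when the thread already has at least one applet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-thread view; field names are illustrative only. */
struct spoe_thread_view {
    unsigned int applets;       /* applets running on this thread */
    unsigned int idle_applets;  /* applets currently idle */
    unsigned int waiting_msgs;  /* queued messages, including the new one */
    unsigned int process_rate;  /* messages processed over the last period */
};

/* Returns true when a new applet should be created for the queued message. */
static bool spoe_must_create_applet(const struct spoe_thread_view *t)
{
    assert(t != NULL);

    /* The fix: with no applet on this thread nothing can drain the queue,
     * so the skip conditions below are not even evaluated. */
    if (t->applets == 0)
        return true;

    /* Skip condition 1: enough idle applets (threshold is an assumption). */
    if (t->idle_applets >= t->waiting_msgs)
        return false;

    /* Skip condition 2: processing rate high enough (assumed threshold). */
    if (t->process_rate >= t->waiting_msgs)
        return false;

    return true;
}
```

With a view where applets is 0 and process_rate is still 10, the function asks for a new applet, which is the case the original heuristic missed.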
Christopher Faulet	4a2dd6f377	BUG/MEDIUM: h1: Reject empty Transfer-encoding header The Transfer-Encoding header lists the transfer codings that have been applied to the content in order to form the message body. It is a list of tokens. And as specified by RFC 9110, a token cannot be empty. When several coding names are specified as a comma-separated value, this case is properly handled and an error is triggered. However, an empty header value will just be skipped and no error is triggered. This could be an issue with some buggy servers. Now, empty Transfer-Encoding headers are rejected too. This patch must be backported as far as 2.6.	2024-07-10 10:52:20 +02:00
Christopher Faulet	428451fe96	BUG/MINOR: h1: Reject empty coding name as last transfer-encoding value The following Transfer-Encoding header is now rejected with a 400-bad-request: Transfer-Encoding: chunked,\r\n This case was not properly handled and the last empty value was just ignored. This patch must be backported as far as 2.6.	2024-07-10 10:52:20 +02:00
Christopher Faulet	b8b0102760	BUG/MINOR: h1: Fail to parse empty transfer coding names Empty transfer coding names, inside a comma-separated list, are already rejected. But it is only by chance. Today, such a name is detected as an unknown coding name (concretely, not "chunked"). Then, it is handled by the H1 multiplexer as an error and a 422-Unprocessable-Content response is returned. So, the error is properly detected in this case, but it is not accurate. A 400-bad-request response must be returned instead. Then, it is better to catch the error during the header parsing. It is the purpose of this patch. This patch should be backported as far as 2.6.	2024-07-10 10:52:20 +02:00
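The three h1 commits above (4a2dd6f377, 428451fe96, b8b0102760) all reject an empty coding name in Transfer-Encoding. A minimal sketch of such a check follows. It is not HAProxy's H1 parser: the helper names are hypothetical, and it only tests for emptiness rather than validating token characters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Optional whitespace allowed around list elements. */
static bool is_ows(char c)
{
    return c == ' ' || c == '\t';
}

/*
 * Hypothetical helper: returns true only if every element of the
 * comma-separated value is non-empty. An empty value, a trailing comma
 * or two consecutive commas return false, which a caller would map to
 * a 400 response.
 */
static bool te_value_has_no_empty_name(const char *value)
{
    const char *p = value;

    assert(value != NULL);
    for (;;) {
        size_t len;

        while (is_ows(*p))
            p++;

        /* Length of the candidate name, up to the next comma or the end. */
        len = strcspn(p, ",");

        /* Trim trailing whitespace before deciding the name is empty. */
        while (len > 0 && is_ows(p[len - 1]))
            len--;
        if (len == 0)
            return false;

        p += strcspn(p, ",");
        if (*p == '\0')
            return true;
        p++; /* skip the comma and check the next element */
    }
}
```

For the case quoted in 428451fe96, the header value is "chunked," and the element after the comma is empty, so the check fails.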
Ilia Shipitsin	89bdd8b62a	CI: weekly QUIC Interop: try to fix private image For some reason the image built in the HAProxy workflow is "private": it is successfully built, but fails to pull. Let's try an explicit docker login for the run job as well.	2024-07-10 09:43:02 +02:00
Willy Tarreau	4e65fc66f6	MAJOR: import: update mt_list to support exponential back-off (try #2 ) This is the second attempt at importing the updated mt_list code (commit 59459ea3). The previous one was attempted with commit c618ed5ff4 ("MAJOR: import: update mt_list to support exponential back-off") but revealed problems with QUIC connections and was reverted. The problem that was faced was that elements deleted inside an iterator were no longer reset, and that if they were to be recycled in this form, they could appear as busy to the next user. This was trivially reproduced with this: $ cat quic-repro.cfg global stats socket /tmp/sock1 level admin stats timeout 1h limited-quic frontend stats mode http bind quic4@:8443 ssl crt rsa+dh2048.pem alpn h3 timeout client 5s stats uri / $ ./haproxy -db -f quic-repro.cfg & $ h2load -c 10 -n 100000 --npn h3 https://127.0.0.1:8443/ => hang This was purely an API issue caused by the simplified usage of the macros for the iterator. The original version had two backups (one full element and one pointer) that the user had to take care of, while the new one only uses one that is transparent for the user. But during removal, the element still has to be unlocked if it's going to be reused. All of this sparked discussions with Fred and Aurélien regarding the still unclear state of locking. It was found that the lock API does too much at once and is lacking granularity. The new version offers a much more fine-grained control allowing to selectively lock/unlock an element, a link, the rest of the list etc. It was also found that plenty of places just want to free the current element, or delete it to do anything with it, hence don't need to reset its pointers (e.g. event_hdl). 
Finally it appeared obvious that the root cause of the problem was the unclear usage of the list iterators themselves because one does not necessarily expect the element to be presented locked when not needed, which makes the unlock easy to overlook during reviews. The updated version of the list presents explicit lock status in the macro name (_LOCKED or _UNLOCKED suffixes). When using the _LOCKED suffix, the caller is expected to unlock the element if it intends to reuse it. At least the status is advertised. The _UNLOCKED variant, instead, always unlocks it before starting the loop block. This means it's not necessary to think about unlocking it, though it's obviously not usable with everything. A few _UNLOCKED were used at obvious places (i.e. where the element is deleted and freed without any prior check). Interestingly, the tests performed last year on QUIC forwarding, that resulted in limited traffic for the original version and higher bit rate for the new one couldn't be reproduced because since then the QUIC stack has gained in efficiency, and the 100 Gbps barrier is now reached with or without the mt_list update. However the unit tests definitely show a huge difference, particularly on EPYC platforms where the EBO provides tremendous CPU savings. Overall, the following changes are visible from the application code: - mt_list_for_each_entry_safe() + 1 back elem + 1 back ptr => MT_LIST_FOR_EACH_ENTRY_LOCKED() or MT_LIST_FOR_EACH_ENTRY_UNLOCKED() + 1 back elem - MT_LIST_DELETE_SAFE() no longer needed in MT_LIST_FOR_EACH_ENTRY_UNLOCKED() => just manually set iterator to NULL however. For MT_LIST_FOR_EACH_ENTRY_LOCKED() => mt_list_unlock_self() (if element going to be reused) + NULL - MT_LIST_LOCK_ELT => mt_list_lock_full() - MT_LIST_UNLOCK_ELT => mt_list_unlock_full() - l = MT_LIST_APPEND_LOCKED(h, e); MT_LIST_UNLOCK_ELT(); => l=mt_list_lock_prev(h); mt_list_lock_elem(e); mt_list_unlock_full(e, l)	2024-07-09 16:46:38 +02:00
Willy Tarreau	87d269707b	OPTIM: pool: improve needed_avg cache line access pattern On an AMD EPYC 3rd gen, 20% of the CPU is spent calculating the amount of pools needed when using QUIC, because pool allocations/releases are quite frequent and the inter-CCX communication is super slow. Still, there's a way to save between 0.5 and 1% CPU by using fetch-add and sub-fetch that are converted to XADD so that the result is directly fed into the swrate_add argument without having to re-read the memory area. That's what this patch does.	2024-07-09 16:46:38 +02:00
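The access pattern described in 87d269707b can be illustrated with C11 atomics. This is a sketch under assumptions, not the pool code: the function names and the counter are hypothetical, and C11 `atomic_fetch_add()` stands in for HAProxy's own atomic macros. The point is that the value returned by the atomic operation already gives the new total, so the shared cache line does not need to be read a second time.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Updates the counter, then reads it again: two accesses to the shared
 * cache line, and the reload may also observe other threads' updates. */
static unsigned int add_then_reload(atomic_uint *counter, unsigned int n)
{
    assert(counter != NULL);
    atomic_fetch_add(counter, n);
    return atomic_load(counter);
}

/* Derives the new value from what the atomic operation returned: a single
 * access. On x86 a fetch-add typically compiles to a LOCK XADD. */
static unsigned int add_and_use_result(atomic_uint *counter, unsigned int n)
{
    assert(counter != NULL);
    return atomic_fetch_add(counter, n) + n;
}
```

The second form is the one whose result can be passed straight to a follow-up computation, as the commit does with the swrate_add argument.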
William Lallemand	9797a7718c	MINOR: ssl/sample: ssl_c_san returns a comma separated list of SAN The ssl_c_san sample fetch returns a list of Subject Alt Name which was presented by the client certificate. The format is the same as the "openssl x509 -text" command, it's a Description: Value list separated by commas. The format is directly generated by the GENERAL_NAME_print() openssl function. https://github.com/openssl/openssl/blob/openssl-3.0/crypto/x509/v3_san.c#L207 Example: IP Address:127.0.0.1, IP Address:127.0.0.2, IP Address:127.0.0.3, URI:http://docs.haproxy.org/2.7/, DNS:ca.tests.haproxy.com	2024-07-09 13:57:18 +02:00
William Lallemand	0a1b251c1a	BUG/MINOR: jwt: fix variable initialisation Set the alg variable from sample_conv_jwt_verify_check() to JWT_ALG_DEFAULT. This was reported by coverity in #2630, but since you need to use the first argument to use the 2nd, this has no real impact. Must be backported with 883f1bd (as far as 2.6).	2024-07-08 14:23:14 +02:00
Valentine Krasnobaeva	16a5fac4bb	BUG/MEDIUM: init: fix fd_hard_limit default in compute_ideal_maxconn This commit fixes 41275a691 ("MEDIUM: init: set default for fd_hard_limit via DEFAULT_MAXFD"). fd_hard_limit is taken into account implicitly via the 'ideal_maxconn' value in all maxconn adjustments, when global.rlimit_memmax is set: MIN(global.maxconn, capped by global.rlimit_memmax, ideal_maxconn); It also caps the provided global.rlimit_nofile, if it couldn't be set as the current process fd limit (see more details in the main() code). So, let's set the default value for fd_hard_limit only when no other haproxy-specific limit is provided, i.e. rlimit_memmax, maxconn, rlimit_nofile. Otherwise we may break users' configs. Please note that in master-worker mode, the master does not need the DEFAULT_MAXFD (1048576) either, as we explicitly limit its maxconn to 100. Must be backported in all stable versions until v2.6.0, including v2.6.0, like the commit above.	2024-07-08 11:26:16 +02:00
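The precedence rule stated in 16a5fac4bb and 41275a6918 can be sketched as follows. This is only an illustration, not the code of compute_ideal_maxconn(): the structure, its field names and both functions are hypothetical, and 0 is assumed to mean "not configured". The rule itself comes from the commit messages: an explicit fd-hard-limit always wins, and the DEFAULT_MAXFD fallback applies only when no other haproxy-specific limit is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#ifndef DEFAULT_MAXFD
#define DEFAULT_MAXFD 1048576 /* fallback value named in the commit message */
#endif

/* Hypothetical subset of the global settings; 0 means "not configured". */
struct limits_cfg {
    int fd_hard_limit;
    int maxconn;
    int rlimit_nofile;
    int rlimit_memmax;
};

/* True when the user set any limit other than fd-hard-limit. */
static bool any_other_limit_configured(const struct limits_cfg *cfg)
{
    return cfg->maxconn || cfg->rlimit_nofile || cfg->rlimit_memmax;
}

/* Returns the fd hard limit to apply, or 0 when no cap should be used. */
static int effective_fd_hard_limit(const struct limits_cfg *cfg)
{
    assert(cfg != NULL);

    if (cfg->fd_hard_limit)
        return cfg->fd_hard_limit;   /* explicit setting has precedence */
    if (any_other_limit_configured(cfg))
        return 0;                    /* do not override the user's limits */
    return DEFAULT_MAXFD;            /* nothing configured at all */
}
```

A build made with DEBUG=-DDEFAULT_MAXFD=50000, as in the make command quoted in 41275a6918, would change only the last return value.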
Amaury Denoyelle	3d4baa3c7b	MINOR: quic: rename "ssl error" trace SSL status is reported each time quic_conn_io_cb() is finished via a trace. Change the trace label from "ssl error" to "ssl status". This allows to search for errors easier without being distracted by this trace.	2024-07-08 09:38:35 +02:00
Amaury Denoyelle	19b8c1b7cd	DEV: flags/quic: decode quic_conn flags Decode quic_conn flags via qc_show_flags() function. To support this, quic flags definitions have been put outside of USE_QUIC directive.	2024-07-08 09:38:35 +02:00
Ilia Shipitsin	f8a30b69d2	CI: add weekly QUIC Interop regression against LibreSSL currently only quic-go and picoquic clients are enabled with testsuites supposed to be "green". Tests will be run weekly.	2024-07-05 15:11:21 +02:00
Christopher Faulet	3e2d1476e6	BUG/MEDIUM: peers: Fix crash when syncing learn state of a peer without appctx For a given peer, the synchronization of the learn state is no longer performed in the peer appctx. It is delayed to be handled by the peers sync task. It means that for a given peer, it is possible to have finished learning and only handle it after the appctx release. So the synchronization may happen on a peer without appctx. This was not tested and an unconditional wakeup on the appctx could lead to a crash because of a NULL-deref. It may be experienced by running reg-tests/peers/tls_basic_sync.vtc script in loop. The fix is obvious. In sync_peer_learn_state(), we must omit to wakeup the appctx if it was already released. This patch should fix issue #2629. It must be backported to 3.0.	2024-07-05 12:14:27 +02:00
Amaury Denoyelle	95f624540b	BUG/MEDIUM: quic: prevent crash on accept queue full Handshake for quic_conn instances runs on a single non-chosen thread. On completion, listener_accept() is performed to select the least loaded thread before initializing connection instance. As such, quic_conn instance is migrated to the thread with its upper connection. In case accept queue is full, listener_accept() falls back to local accept mode, which causes the connection to be assigned to the current thread. However, this is not supported by QUIC as quic_conn instance is left on the previously selected thread. In most cases, this will cause a BUG_ON() due to a task manipulation from an outside thread. To fix this, handle quic_conn thread rebind in multiple steps using the new extended protocol API. Several operations have been moved from qc_set_tid_affinity1() to newly defined qc_set_tid_affinity2(), in particular CID TID update. This ensures that quic_conn instance is not prematurely accessed on the new thread until accept queue push is guaranteed to succeed. qc_reset_tid_affinity() is also newly defined to reassign the newly created tasks and tasklets to the current thread. This is necessary to prevent the BUG_ON() crash described above. This must be backported up to 2.8 after a period of observation. Note that it depends on previous patch : MINOR: proto: extend connection thread rebind API	2024-07-04 17:28:56 +02:00
Amaury Denoyelle	1a43b9f32c	MINOR: proto: extend connection thread rebind API MINOR: listener: define callback for accept queue push Extend API for connection thread rebind API by replacing single callback set_affinity by three different ones. Each one of them is used at a different stage of the operation : * set_affinity1 is used similarly to previous set_affinity * set_affinity2 is called directly from accept_queue_push_mp() when an entry has been found in accept ring. This operation cannot fail. * reset_affinity is called after set_affinity1 in case of failure from accept_queue_push_mp() due to no space left in accept ring. This is necessary for protocols which must reconfigure resources before fallback on the current tid. This patch does not have any functional changes. However, it will be required to fix crashes for QUIC connections when accept queue ring is full. As such, it must be backported with it.	2024-07-04 16:33:21 +02:00
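The three-stage rebind described in 1a43b9f32c can be sketched with a toy accept ring. This is an assumption-laden illustration: the callback names come from the commit message, but their signatures, the structures and every other function below are hypothetical and do not reflect the real protocol API or accept_queue_push_mp().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical connection: "tid" owns it, "next_tid" is a pending move. */
struct conn {
    int tid;
    int next_tid;
};

/* The three callbacks named in the commit message (signatures assumed). */
struct rebind_ops {
    int  (*set_affinity1)(struct conn *c, int new_tid); /* prepare, may fail */
    void (*set_affinity2)(struct conn *c);              /* commit, cannot fail */
    void (*reset_affinity)(struct conn *c);             /* undo the preparation */
};

/* Toy accept ring that only counts its used slots. */
struct accept_ring {
    size_t used;
    size_t size;
};

static int toy_prepare(struct conn *c, int new_tid) { c->next_tid = new_tid; return 0; }
static void toy_commit(struct conn *c) { c->tid = c->next_tid; }
static void toy_rollback(struct conn *c) { c->next_tid = c->tid; }

static const struct rebind_ops toy_ops = { toy_prepare, toy_commit, toy_rollback };

/* Stand-in for the queue push: commit only once a slot is secured. */
static bool ring_push(struct accept_ring *r, struct conn *c,
                      const struct rebind_ops *ops)
{
    if (r->used == r->size)
        return false;
    r->used++;
    ops->set_affinity2(c);
    return true;
}

/* Returns the thread that ends up owning the connection. */
static int assign_thread(struct conn *c, struct accept_ring *r,
                         const struct rebind_ops *ops, int new_tid)
{
    assert(c && r && ops);

    if (ops->set_affinity1(c, new_tid) != 0)
        return c->tid;           /* preparation refused: stay local */
    if (!ring_push(r, c, ops))
        ops->reset_affinity(c);  /* ring full: fall back to current thread */
    return c->tid;
}
```

Splitting the commit step from the preparation is what lets the caller undo cleanly when the ring has no space left, which is the situation 95f624540b fixes for QUIC.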
Valentine Krasnobaeva	ff024206f0	DOC: configuration: update maxconn description Let's update maxconn keyword description, in order to make it clear, which setting has the precedence over the global.maxconn and the SYSTEM_MAXCONN if set.	2024-07-04 07:53:07 +02:00
Valentine Krasnobaeva	41275a6918	MEDIUM: init: set default for fd_hard_limit via DEFAULT_MAXFD Let's provide a default value for fd_hard_limit, if it's not set in the configuration. With this patch we could set some specific default via compile-time variable DEFAULT_MAXFD as well. Hopefully this will be helpful for haproxy package maintainers. make -j 8 TARGET=linux-glibc DEBUG=-DDEFAULT_MAXFD=50000 If haproxy is compiled without DEFAULT_MAXFD defined, the default will be set to 1048576. This is done to avoid the process being killed by its watchdog when it is started without any limitations in its configuration or on the command line and the hard RLIMIT_NOFILE is extremely huge (~1000000000). In this case we use compute_ideal_maxconn() to calculate maxconn and maxsock. maxsock defines the size of the internal fdtab, which becomes very large as well. When the process starts to simply loop over this fdtab (O(n)), this takes a lot of time, so the watchdog does its job. To avoid this, maxconn is now always reduced to some reasonable value either by explicit global.fd-hard-limit from configuration, or by its default. The default may be changed at build-time and overwritten then by global.fd-hard-limit at runtime. Explicit global.fd-hard-limit from the configuration always has precedence over DEFAULT_MAXFD, if set. Must be backported in all stable versions until v2.6.0, including v2.6.0.	2024-07-04 07:52:42 +02:00
Amaury Denoyelle	bfdf145859	MINOR: quic: ensure quic_conn is never removed on thread affinity rebind On accept, quic_conn instance is migrated from its original thread to a new one. This operation is conducted in two steps, on the original then the new thread instance. During the interval, quic_conn is artificially rendered inactive. It must never be accessed nor removed until migration is completed via qc_finalize_affinity_rebind(). This new BUG_ON() will enforce that removal is never conducted until migration is completed.	2024-07-03 15:02:40 +02:00
Amaury Denoyelle	a4240fb26f	MINOR: quic: add 2 BUG_ON() on datagram dispatch QUIC datagram dispatch is an error prone operation as it must always ensure the correct thread is used before accessing the recipient quic_conn instance. Strengthen this code part by adding two BUG_ON_HOT() to ensure thread safety.	2024-07-03 15:02:40 +02:00
Amaury Denoyelle	8550549cca	REORG: quic: remove quic_cid_trees reference from proto_quic Previous commit removed access/manipulation to QUIC CID global tree outside of quic_cid module. This ensures that proper locking is always performed. This commit finalizes this cleanup by marking CID global tree as static only to quic_cid source file. Initialization of this tree is removed from proto_quic and now performed using dedicated initcalls quic_alloc_global_cid_tree(). As a side change, complete CID global tree documentation, in particular to explain CID global tree artificial splitting and ODCID handling. Overall, the code is now clearer and safer.	2024-07-03 15:02:40 +02:00
Amaury Denoyelle	0a352ef08e	MINOR: quic: remove access to CID global tree outside of quic_cid module haproxy generates for each QUIC connection a set of CIDs. The peer must reuse them as DCID for its emitted packets. On datagram reception, DCID field serves as identifier to dispatch them on their correct thread. These CIDs are stored in a global CID tree. Access to this data structure must always be protected with CID_LOCK. This commit is a refactoring to regroup all CID tree access in quic_cid module. Several code parts are adjusted : * quic_cid_insert() is extended to check for insertion race-condition. This is useful on quic_conn instantiation. Code where such race cannot happen can use unsafe _quic_cid_insert() instead. * on RETIRE_CONNECTION_ID frame reception, existing quic_cid_delete() function is used. * remove tree lookup from qc_check_dcid(), extracted in the new quic_cmp_cid_conn() function. Ultimately, the latter should be removed as CID lookup could be conducted on quic_conn owned tree without locking.	2024-07-03 15:02:40 +02:00
Amaury Denoyelle	5d186673df	CLEANUP: quic: remove non-existing quic_cid_tree definition quic_cid_tree global variable does not exist anymore. Remove its definition in quic_conn.c.	2024-07-03 15:02:40 +02:00
Amaury Denoyelle	a05fefe74d	CLEANUP: quic: cleanup prototypes related to CIDs handling Remove duplicated prototypes from quic_conn.h also present in quic_cid.h. Also remove quic_derive_cid() prototype and mark it as static.	2024-07-03 15:02:40 +02:00
William Lallemand	883f1bdbce	BUG/MINOR: jwt: don't try to load files with HMAC algorithm When trying to use a HMAC algorithm (HS256, HS384, HS512) the sample_conv_jwt_verify_check() function of the converter tries to load a file even if it is only supposed to contain a secret instead of a path. When using lua, the check function is called at runtime so it even tries to load file at each call... This fixes the issue for HMAC algorithm but this is still a problem with the other algorithms, since we don't have a way of pre-loading files before the call. Another solution must be found to prevent disk IO with lua using other algorithms. Must be backported as far as 2.6.	2024-07-03 12:35:50 +02:00
Amaury Denoyelle	50ae717624	BUG/MEDIUM: server: fix race on server_atomic_sync() The following patch fixes a race condition during server addr/port update : cd994407a9545a8d84e410dc0cc18c30966b70d8 BUG/MAJOR: server/addr: fix a race during server addr:svc_port updates The new update mechanism is implemented via an event update. It uses thread isolation to guarantee that no other thread is accessing server addr/port. Furthermore, to ensure server instance is not deleted just before the event handler, server instance is looked up via its ID in proxy tree. However, thread isolation is only entered after server lookup. This leaves a tiny race condition as the thread will be marked as harmless and a concurrent thread can delete the server in the meantime. This causes server_atomic_sync() to manipulate a deleted server instance to reinsert it in used_server_addr backend tree. This can cause a segfault during this operation or possibly on a future used_server_addr tree access. This issue was detected by criteo. Several backtraces were retrieved, each related to server addr_node insert or delete operation, either in srv_set_addr_desc(), or add/delete dynamic server handlers. To fix this, simply extend thread isolation section to start it before server lookup. This ensures that once retrieved the server cannot be deleted until its addr/port are updated. To ensure this issue won't happen anymore, a new BUG_ON() is added in srv_set_addr_desc(). Also note that ebpt_delete() is now called every time on delete handler as this is a safe idempotent operation. To reproduce these crashes, a script was executed to add then remove different servers every second. In parallel, the following CLI command was issued repeatedly without any delay to force multiple updates on servers port : set server <srv> addr 0.0.0.0 port $((1024 + RANDOM % 1024)) This must be backported at least up to 3.0. 
If the above mentioned patch has been selected for previous versions, this commit must also be backported to them.	2024-07-03 09:20:24 +02:00
William Lallemand	419b79492a	DOC: configuration: more details about the master-worker mode Add more details about the master-worker mode in the "master-worker" global keyword. Should fix issue #2198.	2024-07-02 18:23:34 +02:00
Christopher Faulet	e5e36ce097	BUG/MEDIUM: hlua/cli: Fix lua CLI commands to work with applet's buffers In 3.0, the CLI applet was rewritten to use its own buffers. However, the lua part, used to register CLI commands at runtime, was not updated accordingly. It means the lua CLI commands still try to write in the channel buffers. This is of course totally unexpected and not supported. Because of this bug, the applet hangs instead of returning the command result. The registration of lua CLI commands relies on the lua TCP applets. So the send and receive functions were fixed to use the applet's buffer when it is required and still use the channel buffers otherwise. This way, other lua TCP applets can still run on the legacy mode, without the applet's buffers. This patch must be backported to 3.0.	2024-07-02 10:05:40 +02:00
William Lallemand	ba37ad41b2	DOC: configuration: add details about crt-store in bind "crt" keyword Add some details about the certificate storage cache system in the "crt" bind keyword. This should be backported to 3.0. Fix issue #2618.	2024-07-01 12:30:06 +02:00
Christopher Faulet	b789cef91f	BUG/MINOR: promex: Remove Help prefix repeated twice for each metric When the support for modules was added, the function producing the #HELP line of each metric was refactored. Since then, the prefix "#HELP <metric-name>" is printed twice because a code block was not removed. It is now fixed. This patch must be backported to 3.0.	2024-07-01 10:50:27 +02:00
Willy Tarreau	192abc6f83	BUG/MEDIUM: quic: fix possible exit from qc_check_dcid() without unlocking Locking of the CID tree was extended in qc_check_dcid() by recent commit 05f59a5 ("BUG/MINOR: quic: fix race condition in qc_check_dcid()") but there was a direct return from the middle of the function which was not covered by the unlock, resulting in the function keeping the lock on success return. Let's just remove this return and replace it with a variable to merge all exit paths. This must be backported wherever the fix above is backported.	2024-07-01 10:29:31 +02:00
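The fix pattern described in 192abc6f83 (replace an early return with a result variable so that every exit path goes through the unlock) can be sketched as below. This is not qc_check_dcid(): a pthread mutex stands in for the CID tree lock, and the table, lock and function names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared table protected by a lock. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static const int known_ids[] = { 3, 7, 11 };

/* Looks up "id" under the lock, with a single exit path. */
static bool id_is_known(int id)
{
    bool found = false;
    size_t i;
    int rc;

    rc = pthread_mutex_lock(&table_lock);
    assert(rc == 0);
    (void)rc;

    for (i = 0; i < sizeof(known_ids) / sizeof(known_ids[0]); i++) {
        if (known_ids[i] == id) {
            /* A "return true" here would leave the lock held. */
            found = true;
            break;
        }
    }

    pthread_mutex_unlock(&table_lock);
    return found;
}
```

Calling the function twice in a row for a known id shows the difference: with an early return inside the loop, the second call would block on the lock that the first call never released.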
Frederic Lecaille	6d943b8db6	BUG/MINOR: quic: Wrong datagram building when probing. This issue was revealed by chacha20 interop test which very often fails with ngtcp2 as client. This was due to the fact that 2 application level packets could be coalesced into the same datagram as revealed by such a capture: Frame 380: 255 bytes on wire (2040 bits), 255 bytes captured (2040 bits) Point-to-Point Protocol Internet Protocol Version 4, Src: 193.167.100.100, Dst: 193.167.0.100 User Datagram Protocol QUIC IETF QUIC Connection information [Connection Number: 0] [Packet Length: 187] QUIC Short Header DCID=ec523fe99840f9c17c868a88d649147814 PKN=333 0... .... = Header Form: Short Header (0) .1.. .... = Fixed Bit: True ..0. .... = Spin Bit: False [...0 0... = Reserved: 0] [.... .0.. = Key Phase Bit: False] [.... ..00 = Packet Number Length: 1 bytes (0)] Destination Connection ID: ec523fe99840f9c17c868a88d649147814 [Packet Number: 333] Protected Payload […]: 43537d43a3c83e47db6891bd6a4fd7d7fa31941badcb87a540e843341d6a5e493ed4c3f6e6bbff094804ee0ab06830dc1a1bbf52ace4323d2e4f6e0bd4eea73df0721d2949d05a058d3afb974e814494ebf44d1375b0e7f1fd5bcf634cf32ef9a9b4018758a49d39a24c40 STREAM id=0 fin=0 off=294768 len=144 dir=Bidirectional origin=Client-initiated Frame Type: STREAM (0x000000000000000e) .... ...0 = Fin: False .... ..1. = Len(gth): True .... .1.. = Off(set): True Stream ID: 0 .... .... .... .... .... .... .... .... .... .... .... .... .... .... .... ...0 = Stream initiator: Client-initiated (0) .... .... .... .... .... .... .... .... .... .... .... .... .... .... .... ..0. = Stream direction: Bidirectional (0) Offset: 294768 Length: 144 Stream Data […]: 63eef6ccee0d2ab602db3682d0e7cc09b72db6adc307d7699a211144b4b6c029cbed9beae1491c10a5fe0678d815a5303843d33c0593fedc9b64068fd0207e280d05aac2c0054fe9ab30857bc3669ee51d34756cfd2e098eb1ab31a03911f6a103f0a16f8f984d9861efdcf4433c QUIC IETF [Packet Length: 38] QUIC Short Header DCID=ec523fe99840f9c17c868a88d649147814 PKN=334 0... .... 
= Header Form: Short Header (0) .1.. .... = Fixed Bit: True ..0. .... = Spin Bit: False [...0 0... = Reserved: 0] [.... .0.. = Key Phase Bit: False] [.... ..00 = Packet Number Length: 1 bytes (0)] Destination Connection ID: ec523fe99840f9c17c868a88d649147814 [Packet Number: 334] Protected Payload: b9c0e6dc3fc523574f8164c31b6cd156496212 PING Frame Type: PING (0x0000000000000001) PADDING Length: 2 Frame Type: PADDING (0x0000000000000000) [Padding Length: 2] On the peer side these two packets are considered as a single one because there may be only one packet per datagram at application encryption level and reported as a STREAM frame encoding error: I00000332 0xec523fe99840f9c17c868a88d649147814 con recv packet len=225 mask=b2c69c7827 sample=43a3c83e47db6891bd6a4fd7d7fa3194 I00000332 0xec523fe99840f9c17c868a88d649147814 pkt rx pkn=333 dcid=0xec523fe99840f9c17c868a88d649147814 type=1RTT k=0 I00000332 0xec523fe99840f9c17c868a88d649147814 frm rx 333 1RTT STREAM(0x0e) id=0x0 fin=0 offset=294768 len=144 uni=0 ngtcp2_conn_read_pkt: ERR_FRAME_ENCODING I00000332 0xec523fe99840f9c17c868a88d649147814 pkt tx pkn=1531039643 dcid=0xae79dfc99d6c65d6 type=1RTT k=0 I00000332 0xec523fe99840f9c17c868a88d649147814 frm tx 1531039643 1RTT CONNECTION_CLOSE(0x1c) error_code=FRAME_ENCODING_ERROR(0x7) frame_type=0 reason_len=0 reason=[] I00000332 0xec523fe99840f9c17c868a88d649147814 frm tx 1531039643 1RTT PADDING(0x00) len=9 Note here that the sum of the two packet sizes (from capture) is the same as the packet length reported by ngtcp2: 187+38 = 225. It also seems that wireshark tries to parse as many packets as possible from the same datagram, regardless of the QUIC protocol rules. Haproxy traces revealed that this could happen at least when probing the peer. The aim of the recent low level packet building modifications was to build as many datagrams as possible into the same buffer. But it seems that the probing packet case treatment has been broken. That said, I have not identified the impacted commit. 
This issue could be reproduced inside interop test environment (no possible git bisection). To fix this, rely on the <probe> variable value to identify if the last packet built by qc_prep_pkts() was a probing one, then try to coalesce some others packet into the same datagram if this was not the case. Of course the test on <probe> value has to be done before setting it for the next packet. Must be backported to 3.0.	2024-07-01 09:29:09 +02:00
Willy Tarreau	bbc2f043e3	[RELEASE] Released version 3.1-dev2 Released version 3.1-dev2 with the following main changes :
    - BUG/MINOR: log: fix broken '+bin' logformat node option
    - DEBUG: hlua: distinguish burst timeout errors from exec timeout errors
    - REGTESTS: ssl: fix some regtests 'feature cmd' start condition
    - BUG/MEDIUM: ssl: AWS-LC + TLSv1.3 won't do ECDSA in RSA+ECDSA configuration
    - MINOR: ssl: activate sigalgs feature for AWS-LC
    - REGTESTS: ssl: activate new SSL reg-tests with AWS-LC
    - BUG/MEDIUM: proxy: fix email-alert invalid free
    - REORG: mailers: move free_email_alert() to mailers.c
    - BUG/MINOR: proxy: fix email-alert leak on deinit() (2nd try)
    - DOC: configuration: fix alphabetical order of bind options
    - DOC: management: document ptr lookup for table commands
    - BUG/MAJOR: quic: fix padding with short packets
    - BUG/MAJOR: quic: do not loop on emission on closing/draining state
    - MINOR: sample: date converter takes HTTP date and output an UNIX timestamp
    - SCRIPTS: git-show-backports: do not truncate git-show output
    - DOC: api/event_hdl: small updates, fix an example and add some precisions
    - BUG/MINOR: h3: fix crash on STOP_SENDING receive after GOAWAY emission
    - BUG/MINOR: mux-quic: fix crash on qcs SD alloc failure
    - BUG/MINOR: h3: fix BUG_ON() crash on control stream alloc failure
    - BUG/MINOR: quic: fix BUG_ON() on Tx pkt alloc failure
    - DEV: flags/show-fd-to-flags: adapt to recent versions
    - MINOR: capabilities: export capget and __user_cap_header_struct
    - MINOR: capabilities: prepare support for version 3
    - MINOR: capabilities: use _LINUX_CAPABILITY_VERSION_3
    - MINOR: cli/debug: show dev: add cmdline and version
    - MINOR: cli/debug: show dev: show capabilities
    - MINOR: debug: print gdb hints when crashing
    - BUILD: debug: also declare strlen() in __ABORT_NOW()
    - BUILD: Missing inclusion header for ssize_t type
    - BUG/MINOR: hlua: report proper context upon error in hlua_cli_io_handler_fct()
    - MINOR: cfgparse/log: remove leftover dead code
    - BUG/MEDIUM: stick-table: Decrement the ref count inside lock to kill a session
    - MINOR: stick-table: Always decrement ref count before killing a session
    - REORG: init: do MODE_CHECK_CONDITION logic first
    - REORG: init: encapsulate CHECK_CONDITION logic in a func
    - REORG: init: encapsulate 'reload' sockpair and master CLI listeners creation
    - REORG: init: encapsulate code that reads cfg files
    - BUG/MINOR: server: fix first server template name lookup UAF
    - MINOR: activity: make the memory profiling hash size configurable at build time
    - BUG/MEDIUM: server/dns: prevent DOWN/UP flap upon resolution timeout or error
    - BUG/MEDIUM: h3: ensure the ":method" pseudo header is totally valid
    - BUG/MEDIUM: h3: ensure the ":scheme" pseudo header is totally valid
    - BUG/MEDIUM: quic: fix race-condition in quic_get_cid_tid()
    - BUG/MINOR: quic: fix race condition in qc_check_dcid()
    - BUG/MINOR: quic: fix race-condition on trace for CID retrieval	2024-06-29 11:28:41 +02:00
Amaury Denoyelle	bbb9f8248e	BUG/MINOR: quic: fix race-condition on trace for CID retrieval quic_rx_pkt_retrieve_conn() is used when parsing a received datagram from the listener socket. It returns the quic_conn instance corresponding to the first packet DCID, unless it is mapped to another thread. As expected, global CID tree access is protected by a lock in the function. However, there is a race condition due to the final trace where the qc instance is dereferenced outside of the lock. Fix this by adding a new trace under lock protection and removing the qc dereference at function end. This may fix first crash of github issue #2607. This must be backported up to 2.8.	2024-06-28 16:28:33 +02:00
Amaury Denoyelle	05f59a51ac	BUG/MINOR: quic: fix race condition in qc_check_dcid() qc_check_dcid() is a function which checks that a DCID is associated to the expected quic_conn instance. This is used for the quic_conn socket receive handler as there is a tiny risk that a datagram for another connection was received on this socket. As with other operations on the global CID tree, a lock must be used to protect against race conditions. However, as in the previous commit, the lock was not held long enough as the CID tree node is accessed outside of the lock region. To fix this, extend the critical section until the CID dereference is done. The impact of this bug should be similar to the previous one. However, the risk of a crash is even lower as it should be extremely rare to receive datagrams for other connections on a quic_conn socket. As such, most of the time the first check condition of qc_check_dcid() is enough. This may fix first crash of github issue #2607. This must be backported up to 2.8.	2024-06-28 16:28:33 +02:00
Amaury Denoyelle	72267ff35f	BUG/MEDIUM: quic: fix race-condition in quic_get_cid_tid() haproxy generates CIDs for clients which reuse them as DCID on their packets. These CIDs are stored in a global tree quic_cid_trees. Each operation on this tree must be done under lock protection. quic_get_cid_tid() is a function which looks up a CID in the global tree and returns the associated thread ID. This is used on datagram reception on the listener socket before redispatching the datagram to the correct thread. This function uses a lock to protect quic_cid_trees access. However, the lock region is too small as the CID tree node is accessed outside of it. Fix this by extending lock protection for the CID dereference until the thread ID is retrieved. The impact of this bug is unknown, but it may possibly cause crashes. However, it is probably rare as most datagram reception is done on the quic_conn socket which does not use quic_get_cid_tid(). This may fix first crash of github issue #2607. This must be backported up to 2.8.	2024-06-28 16:27:20 +02:00
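The three race-condition fixes above share one pattern: the value read from a shared node must be copied out while the lock is still held. The following is a minimal sketch of that pattern only, not haproxy code. It assumes a plain pthread mutex and a flat array in place of the real CID tree; `cid_entry`, `cid_table`, `cid_insert()` and `cid_get_tid()` are hypothetical names.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a CID tree node (not haproxy's structure). */
struct cid_entry {
    unsigned char cid[8];
    int tid;     /* ID of the thread owning the connection */
    int in_use;
};

#define CID_TABLE_SIZE 16

static struct cid_entry cid_table[CID_TABLE_SIZE];
static pthread_mutex_t cid_lock = PTHREAD_MUTEX_INITIALIZER;

/* Looks up a CID. Must be called with cid_lock held. */
static struct cid_entry *cid_lookup_locked(const unsigned char *cid)
{
    for (size_t i = 0; i < CID_TABLE_SIZE; i++) {
        if (cid_table[i].in_use && memcmp(cid_table[i].cid, cid, 8) == 0)
            return &cid_table[i];
    }
    return NULL;
}

/* Registers a CID for thread <tid>. Returns 0 on success, -1 if full. */
int cid_insert(const unsigned char *cid, int tid)
{
    int ret = -1;

    pthread_mutex_lock(&cid_lock);
    for (size_t i = 0; i < CID_TABLE_SIZE; i++) {
        if (!cid_table[i].in_use) {
            memcpy(cid_table[i].cid, cid, 8);
            cid_table[i].tid = tid;
            cid_table[i].in_use = 1;
            ret = 0;
            break;
        }
    }
    pthread_mutex_unlock(&cid_lock);
    return ret;
}

/* Returns the thread ID owning <cid>, or -1 if the CID is unknown. */
int cid_get_tid(const unsigned char *cid)
{
    int tid = -1;

    pthread_mutex_lock(&cid_lock);
    struct cid_entry *e = cid_lookup_locked(cid);
    if (e)
        tid = e->tid; /* the node is dereferenced before unlocking */
    pthread_mutex_unlock(&cid_lock);

    /* Only the copied value is used past this point, never <e>. */
    return tid;
}
```

The buggy shape would unlock right after the lookup and read `e->tid` afterwards, leaving a window in which another thread may remove the node.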
Amaury Denoyelle	a3bed52d1f	BUG/MEDIUM: h3: ensure the ":scheme" pseudo header is totally valid Ensure the pseudo-header scheme is only constituted of valid characters according to RFC 9110. If an invalid value is found, the request is rejected and the stream is reset. It's the same as for previous commit "BUG/MEDIUM: h3: ensure the ":method" pseudo header is totally valid" except that this time it applies to the ":scheme" pseudo header. This must be backported up to 2.6.	2024-06-28 14:36:30 +02:00
Amaury Denoyelle	789d4abd73	BUG/MEDIUM: h3: ensure the ":method" pseudo header is totally valid Ensure the pseudo-header method is only constituted of valid characters according to RFC 9110. If an invalid value is found, the request is rejected and the stream is reset. Previously only characters forbidden in headers were rejected (NUL/CR/LF), but this is insufficient for :method, where some other forbidden chars might be used to trick a non-compliant backend server into seeing a different path from the one seen by haproxy. Note that header injection is not possible though. This must be backported up to 2.6. Many thanks to Yuki Mogi of FFRI Security Inc for the detailed report that allowed to quickly spot, confirm and fix the problem.	2024-06-28 14:36:30 +02:00
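To illustrate the kind of check described for ":method", here is a minimal sketch of a validator based on the RFC 9110 `token` grammar (a method is a non-empty sequence of `tchar`). It is not the haproxy implementation; `is_tchar()` and `http_method_is_valid()` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if <c> is a "tchar" as defined by RFC 9110, otherwise 0. */
static int is_tchar(unsigned char c)
{
    if ((c >= '0' && c <= '9') ||
        (c >= 'A' && c <= 'Z') ||
        (c >= 'a' && c <= 'z'))
        return 1;
    /* strchr() would also match the terminating NUL, so exclude it. */
    return c != '\0' && strchr("!#$%&'*+-.^_`|~", c) != NULL;
}

/* Returns 1 if the <len> bytes at <s> form a valid method token. */
int http_method_is_valid(const char *s, size_t len)
{
    if (len == 0)
        return 0;
    for (size_t i = 0; i < len; i++) {
        if (!is_tchar((unsigned char)s[i]))
            return 0;
    }
    return 1;
}
```

With such a check, a value like "GET /admin" is rejected because neither the space nor the slash is a `tchar`, while "M-SEARCH" is accepted.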
Aurelien DARRAGON	80aba1d284	BUG/MEDIUM: server/dns: prevent DOWN/UP flap upon resolution timeout or error This is a complementary patch to c16eba818 ("BUG/MEDIUM: server/dns: preserve server's port upon resolution timeout or error"). Indeed, since c16eba818, the port is properly preserved, but unsetting server's address this way results in server_atomic_sync() function thinking that we're actually setting a new address and not unsetting the previous one because addr family is != AF_UNSPEC. Upon DNS timeout, this could be observed: [WARNING] (2588257) : Server http/s1 is going DOWN for maintenance (DNS timeout status). 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue. [WARNING] (2588257) : Server http/s1 ('test1.localhost') is UP/READY (resolves again). Notice that the server times out and then immediately resolves again. Of course in this case the server's address was properly set to 0, meaning that the server will not receive any traffic, but it is confusing and could result in haproxy temporarily thinking that the server is actually available while it's not. To properly fix the issue and restore historical behavior, let's explicitly set inetaddr's family to AF_UNSPEC after fetching original server's address. It should be backported in 3.0 with c16eba818.	2024-06-28 11:26:52 +02:00
Willy Tarreau	290659ffd3	MINOR: activity: make the memory profiling hash size configurable at build time The MEMPROF_HASH_BITS variable was set to 10 without a possibility to change it (beyond patching the code). After seeing a few reports already with "other" being listed and a list with close to 1024 entries, it looks like it's about time to either increase the hash size, or at least make it configurable for special cases. As a reminder, in order to remain fast, the algorithm searches no more than 16 places after the hash, so when a table is almost full, searches are long and new places are rare. The present patch just makes it possible to redefine it by passing "-DMEMPROF_HASH_BITS=11" or "-DMEMPROF_HASH_BITS=12" in CFLAGS, and moves the definition to defaults.h to make it easier to find. Such values should be way sufficient for the vast majority of use cases. Maybe in the future we'd change the default. At least this version should be backported to ease rebuilds, say, till 2.8 or so.	2024-06-27 18:01:27 +02:00
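The build-time override and the bounded search described above can be sketched as follows. This is an illustration under assumptions, not the haproxy code: only `MEMPROF_HASH_BITS`, its default of 10 and the 16-place search limit come from the commit message; the slot layout, the hash function and the names `MEMPROF_MAX_PROBES`, `memprof_slot` and `memprof_get_slot()` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Overridable at build time, e.g. with -DMEMPROF_HASH_BITS=12 in CFLAGS. */
#ifndef MEMPROF_HASH_BITS
#define MEMPROF_HASH_BITS 10
#endif

#define MEMPROF_HASH_BUCKETS (1U << MEMPROF_HASH_BITS)
#define MEMPROF_MAX_PROBES   16 /* hypothetical name for the search limit */

/* Hypothetical slot layout: one entry per calling address. */
struct memprof_slot {
    const void *caller;
    unsigned long calls;
};

static struct memprof_slot memprof_table[MEMPROF_HASH_BUCKETS];

/* Multiplicative hash keeping the top MEMPROF_HASH_BITS bits. */
static unsigned int ptr_hash(const void *p)
{
    uint64_t v = (uint64_t)(uintptr_t)p;

    v *= 0x9E3779B97F4A7C15ULL;
    return (unsigned int)(v >> (64 - MEMPROF_HASH_BITS));
}

/* Finds or creates the slot for <caller>, looking at no more than
 * MEMPROF_MAX_PROBES places after the hash. Returns NULL when none
 * is usable, in which case the caller would be counted as "other".
 */
struct memprof_slot *memprof_get_slot(const void *caller)
{
    unsigned int start = ptr_hash(caller);

    for (unsigned int i = 0; i < MEMPROF_MAX_PROBES; i++) {
        unsigned int idx = (start + i) & (MEMPROF_HASH_BUCKETS - 1);
        struct memprof_slot *s = &memprof_table[idx];

        if (s->caller == caller)
            return s;
        if (s->caller == NULL) {
            s->caller = caller;
            return s;
        }
    }
    return NULL;
}
```

The bounded probe keeps lookups cheap, which is why a nearly full table starts reporting entries under "other" and a larger `MEMPROF_HASH_BITS` helps.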
Aurelien DARRAGON	eec8048042	BUG/MINOR: server: fix first server template name lookup UAF This is a follow-up for 7223296 ("BUG/MINOR: server: fix first server template not being indexed"). Indeed, in 7223296 we added a new call to _srv_parse_set_id_from_prefix() for the first server before handling additional ones. But we actually overlooked the fact that _srv_parse_set_id_from_prefix() was already performed at the end of _srv_parse_tmpl_init() for the same server. Since _srv_parse_set_id_from_prefix() frees srv->id, it results in UAF when performing name lookups on the first server, because used_server_name node key still uses the freed string pointer. The early _srv_parse_set_id_from_prefix() call (added in 7223296) and the original one perform the same task, except that the new one is followed by name node insertion logic required for name lookups to work properly. So let's simply get rid of the old one at the end of the function. _srv_parse_set_id_from_prefix() in the 'err:' label was also removed since it is now useless as well starting with 7223296 and would trigger the same bug on error paths. Thanks to Amaury for noticing it. This bug was discovered while trying to address GH issue #2620. Thanks to @x-yuri for his detailed report (with working repro). It should be backported in 3.0 with 7223296.	2024-06-27 16:38:25 +02:00
Valentine Krasnobaeva	ed90ad895c	REORG: init: encapsulate code that reads cfg files Haproxy master process should not read its configuration the second time after performing reexec and passing to MODE_MWORKER_WAIT. So, to make this part of init() function more readable and to distinguish better the point, where configs have been read, let's encapsulate it in a separate function.	2024-06-27 16:09:38 +02:00
Valentine Krasnobaeva	5e06d45df7	REORG: init: encapsulate 'reload' sockpair and master CLI listeners creation Let's encapsulate the logic of 'reload' sockpair and master CLI listeners creation, used by master CLI into a separate function, as we needed this only in master-worker runtime mode. This makes the code of init() more readable.	2024-06-27 16:08:42 +02:00
Valentine Krasnobaeva	6f613faa71	REORG: init: encapsulate CHECK_CONDITION logic in a func As MODE_CHECK_CONDITION logic terminates the process anyway, no matter if the test for the provided condition was successful or not, let's encapsulate it in a separate function. This makes the code of init() more readable.	2024-06-27 16:01:01 +02:00
Valentine Krasnobaeva	10de58fbfb	REORG: init: do MODE_CHECK_CONDITION logic first In MODE_CHECK_CONDITION we only parse check_condition string, provided by '-cc', and then we evaluate it. Haproxy process terminates at the end of {if..else} block anyway, if the test has failed or passed. So, it will be more appropriate to perform MODE_CHECK_CONDITION test first and then do all other process runtime mode verifications.	2024-06-27 15:59:43 +02:00
Christopher Faulet	ad946a704d	MINOR: stick-table: Always decrement ref count before killing a session Guarded functions to kill a sticky session, stksess_kill() and stksess_kill_if_expired(), may or may not decrement and test its reference counter before really killing it. This depends on a parameter. If it is set to a non-zero value, the ref count is decremented and if it falls to zero, the session is killed. Otherwise, if this parameter is equal to zero, the session is killed, regardless of the ref count value. In the code, these functions are always called with a non-zero parameter and the ref count is always decremented and tested. So, there is no reason to still have a special case. Especially because it is not really easy to say if it is supported or not. Does it mean it is possible to kill a sticky session while it is still referenced somewhere? Probably not. So, does it mean it is possible to kill an unreferenced session? This case may be problematic because the session is accessed outside of any lock and thus may be released by another thread because it is unreferenced. Enlarging the scope of the lock to avoid any issue is possible but it is a bit of a shame to do so because there is no usage for now. The best is to simplify the API and remove this case. Now, stksess_kill() and stksess_kill_if_expired() functions always decrement and test the ref count before killing a sticky session.	2024-06-26 15:05:06 +02:00
Christopher Faulet	9357873641	BUG/MEDIUM: stick-table: Decrement the ref count inside lock to kill a session When we try to kill a session, the shard must be locked before decrementing the ref count on the session. Otherwise, the ref count can fall to 0 and a purge task (stktable_trash_oldest or process_table_expire) may release the session before we have the opportunity to acquire the lock on the shard to effectively kill the session. This could lead to a double free. Here is the scenario:

    Thread 1                                  Thread 2

    stksess_kill(ts)
      if (ATOMIC_DEC(&ts->ref_cnt) != 0)
          return
      /* here the ref count is 0 */
                                              stktable_trash_oldest()
                                                LOCK(&sh_lock)
                                                if (!ATOMIC_LOAD(&ts->ref_cnt))
                                                    __stksess_free(ts)
                                                UNLOCK(&sh_lock)
      /* here the session was released */
      LOCK(&sh_lock)
      __stksess_free(ts)  <--- double free
      UNLOCK(&sh_lock)

The bug was introduced in 2.9 by the commit 7968fe3889 ("MEDIUM: stick-table: change the ref_cnt atomically"). The ref count must be decremented inside the lock for the stksess_kill() and stksess_kill_if_expired() functions. This patch should fix the issue #2611. It must be backported as far as 2.9. On the 2.9, there is no sharding. All the table is locked. The patch will have to be adapted.	2024-06-26 12:05:37 +02:00
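The corrected ordering from the commit above can be sketched in a few lines. This is a simplified illustration, not the stick-table code: it assumes a single pthread mutex in place of the shard lock, and `struct sess`, `sess_new()`, `sess_kill()` and `sess_purge_if_unused()` are hypothetical names.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical session object carrying an atomic reference count. */
struct sess {
    atomic_int ref_cnt;
};

static pthread_mutex_t shard_lock = PTHREAD_MUTEX_INITIALIZER;

/* Allocates a session holding <refs> references. */
struct sess *sess_new(int refs)
{
    struct sess *ts = malloc(sizeof(*ts));

    if (ts)
        atomic_init(&ts->ref_cnt, refs);
    return ts;
}

/* Drops one reference and frees the session if it was the last one.
 * The decrement happens inside the lock, so no purge task can observe
 * a zero count before this thread has finished with the session.
 * Returns 1 if the session was freed, 0 otherwise.
 */
int sess_kill(struct sess *ts)
{
    int freed = 0;

    pthread_mutex_lock(&shard_lock);
    if (atomic_fetch_sub(&ts->ref_cnt, 1) == 1) {
        free(ts);
        freed = 1;
    }
    pthread_mutex_unlock(&shard_lock);
    return freed;
}

/* Purge path: frees the session only if it is unreferenced.
 * Returns 1 if the session was freed, 0 otherwise.
 */
int sess_purge_if_unused(struct sess *ts)
{
    int freed = 0;

    pthread_mutex_lock(&shard_lock);
    if (atomic_load(&ts->ref_cnt) == 0) {
        free(ts);
        freed = 1;
    }
    pthread_mutex_unlock(&shard_lock);
    return freed;
}
```

Because the decrement and the free sit in the same critical section, the window shown in the scenario, where the count is 0 but the lock is not yet taken, no longer exists.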
Aurelien DARRAGON	bcf98c9b5f	MINOR: cfgparse/log: remove leftover dead code Remove development leftover introduced by commit 15e9c7da6 ("MINOR: log: add log-profile parsing logic"). Indeed, since "log-profile" section keyword is registered via REGISTER_CONFIG_SECTION() macro, it is not relevant to declare it in common_kw_list[] from cfgparse-global.c. All it does is that it could confuse the user by suggesting him to use "log-profile" inside a global section when trying to find a best match in cfg_parse_global().	2024-06-26 11:06:31 +02:00
Aurelien DARRAGON	185d230e2c	BUG/MINOR: hlua: report proper context upon error in hlua_cli_io_handler_fct() As a result of copy pasting, hlua_cli_io_handler_fct() used to report lua exceptions like E_ETMOUT as "Lua converter" instead of "Lua cli". Let's fix that. It could be backported to all stable versions. [ada: for older versions, HLUA_E_BTMOUT case didn't exist so it has to be skipped]	2024-06-26 11:06:24 +02:00
Frederic Lecaille	bc9821fd26	BUILD: Missing inclusion header for ssize_t type Compilation issue detected as follows by gcc: In file included from src/ncbuf.c:19: src/ncbuf.c: In function 'ncb_write_off': include/haproxy/bug.h:144:10: error: unknown type name 'ssize_t' 144 \| extern ssize_t write(int, const void *, size_t); \	2024-06-26 10:17:09 +02:00
Willy Tarreau	2d27c80288	BUILD: debug: also declare strlen() in __ABORT_NOW() Previous commit 8f204fa8ae ("MINOR: debug: print gdb hints when crashing") broke on the CI where strlen() isn't known. Let's forward-declare it in the __ABORT_NOW() functions, just like write(). No backport is needed.	2024-06-26 08:04:40 +02:00
Willy Tarreau	8f204fa8ae	MINOR: debug: print gdb hints when crashing To make bug reporting easier for users, when crashing, let's suggest what to do. Typically when a BUG_ON() matches, only the current thread is useful the vast majority of the time, while when the watchdog triggers, all threads are interesting. The messages are printed at the end after the dump. We may adjust these with wiki links in the future if more detailed instructions are relevant.	2024-06-26 07:43:00 +02:00
Valentine Krasnobaeva	2cd52a88be	MINOR: cli/debug: show dev: show capabilities If haproxy is compiled with Linux capabilities support, let's show process capabilities before applying the configuration and at runtime in 'show dev' command output. This may be useful for debugging purposes, especially in cases where the process changes its UID and GID to non-privileged ones, or where it has started and runs under a non-privileged UID and the needed capabilities are set by the admin on the haproxy binary.	2024-06-26 07:38:21 +02:00
Valentine Krasnobaeva	0d79c9bedf	MINOR: cli/debug: show dev: add cmdline and version 'show dev' command is very convenient to obtain haproxy debugging information, while the process is run in a container. Let's extend its output with version and cmdline. cmdline is useful as it shows the absolute binary path and its arguments, because sometimes the person who is debugging a failing container is not the same one who created and deployed it. argc and argv are stored in the exported global structure, because feed_post_mortem() is added as a post check function callback in the post_check_list. So we can't simply change the signature of feed_post_mortem(), without breaking other post check callbacks APIs. Parsers are not supposed to modify argv, so we can safely pass its pointer to debug_parse_cli_show_dev(), without copying all argument strings somewhere in the heap or on stack.	2024-06-26 07:38:21 +02:00
Valentine Krasnobaeva	fba9ade891	MINOR: capabilities: use _LINUX_CAPABILITY_VERSION_3 Linux kernel shows the warning below, when _LINUX_CAPABILITY_VERSION_1 is used in capset() and capget(). [1710243.523230] capability: warning: `haproxy' uses 32-bit capabilities (legacy support in use) This triggers questions from users. Warning is shown by kernel, because since Linux 2.6.25, 64-bit capabilities support was introduced in _LINUX_CAPABILITY_VERSION_2. It's in order to be able to continuously extend the capabilities list with new ones. We can't use _LINUX_CAPABILITY_VERSION_2, because this version triggers another warning, according to linux/kernel/capability.c (see also more details about it in comments from kernel sources and in man capset(2)). kernel/capability.c: ... static int cap_validate_magic(cap_user_header_t header, unsigned tocopy) { __u32 version; if (get_user(version, &header->version)) return -EFAULT; switch (version) { case _LINUX_CAPABILITY_VERSION_1: warn_legacy_capability_use(); tocopy = _LINUX_CAPABILITY_U32S_1; break; case _LINUX_CAPABILITY_VERSION_2: warn_deprecated_v2(); fallthrough; /* v3 is otherwise equivalent to v2 */ case _LINUX_CAPABILITY_VERSION_3: tocopy = _LINUX_CAPABILITY_U32S_3; break; default: ... So, to avoid any warnings, let's use _LINUX_CAPABILITY_VERSION_3, which according to comments in linux/kernel/capability.c, has the same functionality as _LINUX_CAPABILITY_VERSION_2 (i.e. array of 2 __user_cap_data_struct with 32-bits integers for each capability set), but comes in Linux 2.6.26 with a header change, in order to protect legacy source code. For the moment, we don't authorize capabilities higher than CAP_SYS_ADMIN (21-st bit), so we always check the "low" 32 bits, i.e. __user_cap_data_struct[0].	2024-06-26 07:38:21 +02:00
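For readers unfamiliar with the raw capability syscalls, the following Linux-only sketch shows how a version 3 header and a two-element data array fit together, and how a capability below bit 32 is tested in the first element. It is an illustration, not the haproxy wrapper: `read_caps()` and `cap_effective_low()` are hypothetical names, while the structures and constants come from `<linux/capability.h>`.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for the syscall() declaration */
#endif
#include <linux/capability.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fills <data> (an array of _LINUX_CAPABILITY_U32S_3 elements) with the
 * capability sets of the calling process. Returns 0 on success, -1 on
 * error with errno set by the kernel.
 */
int read_caps(struct __user_cap_data_struct *data)
{
    struct __user_cap_header_struct hdr;

    memset(&hdr, 0, sizeof(hdr));
    hdr.version = _LINUX_CAPABILITY_VERSION_3;
    hdr.pid = 0; /* 0 designates the calling process */

    memset(data, 0, sizeof(*data) * _LINUX_CAPABILITY_U32S_3);
    return (int)syscall(SYS_capget, &hdr, data);
}

/* Returns 1 if capability <cap> (0..31) is set in the effective set.
 * Only the "low" 32 bits, i.e. data[0], are inspected.
 */
int cap_effective_low(const struct __user_cap_data_struct *data, int cap)
{
    if (cap < 0 || cap > 31)
        return 0;
    return (int)((data[0].effective >> cap) & 1U);
}
```

A caller would declare `struct __user_cap_data_struct data[_LINUX_CAPABILITY_U32S_3];`, call `read_caps(data)` and then test for example `cap_effective_low(data, CAP_NET_BIND_SERVICE)`.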
Valentine Krasnobaeva	e2e756a67d	MINOR: capabilities: prepare support for version 3 Commit e338d263a76a ("Add 64-bit capability support to the kernel") introduces in the kernel _LINUX_CAPABILITY_VERSION_1 and _LINUX_CAPABILITY_VERSION_2 and their corresponding magic numbers "1" (_LINUX_CAPABILITY_U32S_1) and "2" (_LINUX_CAPABILITY_U32S_2). Capability sets, since this commit, are composed as arrays of __user_cap_data_struct with the length defined in the version's magic number (e.g. struct __user_cap_data_struct kdata[_LINUX_CAPABILITY_U32S_1]). These magic numbers also help the kernel to figure out how much data (in __user_cap_data_struct "units") it needs to copy_from/to_user in capset/capget syscalls. In order to use _LINUX_CAPABILITY_VERSION_3 in the next commit (it has the same functionality as version 2), let's follow the kernel code and let's allocate memory to store 32-capabilities as an array of __user_cap_data_struct with the length of 1 (_LINUX_CAPABILITY_U32S_1).	2024-06-26 07:38:21 +02:00
Valentine Krasnobaeva	fcf1a0bcf5	MINOR: capabilities: export capget and __user_cap_header_struct To be able to show process capabilities before applying its configuration and also at runtime in 'show dev' command output, we need to export the wrapper around capget() syscall. It also seems more handy to place __user_cap_header_struct in .data section and declare it as globally accessible, as we always fill it with the same values. This avoids allocating and filling these 8 bytes each time on the stack frame, when capget() or capset() wrappers are called.	2024-06-26 07:38:21 +02:00
Willy Tarreau	a14c7d194a	DEV: flags/show-fd-to-flags: adapt to recent versions The script hadn't been updated since it was introduced, and the hard-coded field 12 doesn't match anymore (it's 16 now). Let's just use "grep -o cflg..." to extract the desired part more flexibly. This can be backported at least to 3.0, probably further, but it will need to be tested prior to this. Better not bring it too far, it's only used when debugging.	2024-06-25 08:13:24 +02:00
Amaury Denoyelle	d5376b7a87	BUG/MINOR: quic: fix BUG_ON() on Tx pkt alloc failure On quic_tx_packet allocation failure, it is possible to trigger a BUG_ON() crash on INITIAL packet building. This statement is responsible for ensuring INITIAL packets are padded to 1200 bytes as required. If allocation of a packet on a higher encryption level fails, the PADDING frame cannot be properly encoded, despite the INITIAL packet being properly built. This crash happens due to qc_txb_store() invocation after quic_tx_packet allocation failure to validate already built packets. However, this statement is unneeded as qc_purge_tx_buf() is called just after. Simply remove qc_txb_store() to fix this issue. This was detected using -dMfail. This should be backported up to 2.6.	2024-06-24 14:40:38 +02:00
Amaury Denoyelle	5718c67c19	BUG/MINOR: h3: fix BUG_ON() crash on control stream alloc failure BUG_ON() from qcc_set_error() is triggered on HTTP/3 control stream allocation failure. This is caused because both h3_finalize() and qcc_init_stream_local() call qcc_set_error() which is forbidden to prevent error code erasure. Fix this by removing qcc_set_error() invocation from h3_finalize() on allocation failure. Note that this function is still responsible for using it on SETTINGS frame emission failure. This was detected using -dMfail. This must be backported up to 3.0.	2024-06-24 14:40:38 +02:00
Amaury Denoyelle	3aded1d375	BUG/MINOR: mux-quic: fix crash on qcs SD alloc failure Since the following commit, sedesc are created since QCS instantiation in qcs_new(). 086e51017e7731ee9820b882fe6e8cc5f0dd5352 BUG/MEDIUM: mux-quic: Create sedesc in same time of the QUIC stream However, sedesc is initialized before other QCS mandatory fields. If sedesc allocation fails, a crash would occur on qcs_free() invocation for QCS early release. To fix this, delay sedesc allocation until function end. This bug was detected using -dMfail. This should be backported up to 2.6.	2024-06-24 14:04:48 +02:00
Amaury Denoyelle	85838822ba	BUG/MINOR: h3: fix crash on STOP_SENDING receive after GOAWAY emission After emitting a HTTP/3 GOAWAY frame, opening of streams higher than the advertised ID was prevented. The h3_attach operation would return success but without allocating an H3S stream context for the QCS. In addition, the stream would be immediately scheduled for RESET_STREAM emission. Despite the immediate stream close, the current behavior is not sufficient and can cause crashes. One such occurrence can be found if STOP_SENDING is the first frame received for a stream. A crash would occur under qcc_recv_stop_sending() after h3_attach invocation, when h3_close() is used, which tries to access the H3S context. To fix this, change the h3_attach API. In case of success, the H3S stream context is always allocated, even if the stream will be scheduled for immediate close. This renders the code more reliable. This crash should be extremely rare, as it can only happen after GOAWAY emission, which is only used on soft-stop or reload. This should solve the second crash occurrence reported on GH #2607. This must be backported up to 2.8.	2024-06-24 12:03:55 +02:00
Aurelien DARRAGON	13e0972aea	DOC: api/event_hdl: small updates, fix an example and add some precisions Fix an example suggesting that using EVENT_HDL_SUB_TYPE(x, y) with y being 0 was valid. Then add some notes to explain how to use EVENT_HDL_SUB_FAMILY() and EVENT_HDL_SUB_TYPE() with valid values. Also mention that the feature is available starting from 2.8 and not 2.7. Finally, perform some purely cosmetic updates. This could be backported in 2.8.	2024-06-21 18:12:31 +02:00
Amaury Denoyelle	b27470fd1d	SCRIPTS: git-show-backports: do not truncate git-show output git-show-backports lists a git-show command which can be used to inspect all commits subject to backport. This command specifies a formatting option to reproduce the default git-show output, especially for commit messages indented with 4 space characters. However, it also adds wrapping on message lines longer than 72 characters. This reduces the readability of messages where large info is written such as backtraces. Improve this by changing the git-show format option. Use a limit value of 0 to disable wrapping while preserving indentation. This could be backported to every stable version to simplify the backporting process.	2024-06-21 15:08:42 +02:00
William Lallemand	5756f10cbc	MINOR: sample: date converter takes HTTP date and output an UNIX timestamp The `date` converter takes an HTTP date as input, which can be either an IMF, RFC 850 or asctime date. It outputs a UNIX timestamp.	2024-06-20 16:38:48 +02:00
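As an illustration of the conversion this converter performs, here is a small standalone sketch that turns an IMF-fixdate string such as "Sun, 06 Nov 1994 08:49:37 GMT" into a UNIX timestamp. It is not the haproxy implementation and it only handles the IMF-fixdate form, not RFC 850 or asctime; `imf_fixdate_to_unix()` and `days_from_civil()` are hypothetical names.

```c
#include <stdio.h>
#include <string.h>

/* Number of days between 1970-01-01 and the given proleptic Gregorian
 * date (month 1..12, day 1..31), using the usual era-based algorithm.
 */
static long long days_from_civil(long long y, unsigned m, unsigned d)
{
    y -= m <= 2;
    const long long era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = (unsigned)(y - era * 400);
    const unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;

    return era * 146097 + (long long)doe - 719468;
}

/* Parses an IMF-fixdate and returns the matching UNIX timestamp, or -1
 * if the string cannot be parsed. The day name and the trailing "GMT"
 * are not verified by this sketch.
 */
long long imf_fixdate_to_unix(const char *s)
{
    static const char *const months[12] = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };
    char wday[4], mon[4];
    int day, year, hh, mm, ss;
    unsigned month = 0;

    if (sscanf(s, "%3s, %d %3s %d %d:%d:%d GMT",
               wday, &day, mon, &year, &hh, &mm, &ss) != 7)
        return -1;

    for (unsigned i = 0; i < 12; i++) {
        if (strcmp(mon, months[i]) == 0)
            month = i + 1;
    }
    if (month == 0 || day < 1 || day > 31 ||
        hh < 0 || hh > 23 || mm < 0 || mm > 59 || ss < 0 || ss > 60)
        return -1;

    return days_from_civil(year, month, (unsigned)day) * 86400LL +
           hh * 3600LL + mm * 60LL + ss;
}
```

For the example date above, the function returns 784111777.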
Amaury Denoyelle	937324d493	BUG/MAJOR: quic: do not loop on emission on closing/draining state To emit a CONNECTION_CLOSE frame, a special buffer is allocated via qc_txb_store(). This is due to the QUIC_FL_CONN_IMMEDIATE_CLOSE flag. However this flag is reset after qc_send_ppkts() invocation to prevent reemission of the CONNECTION_CLOSE frame. qc_send() can invoke multiple times a series of qc_prep_pkts() + qc_send_ppkts() to emit several datagrams. However, this may cause a crash if on the first loop a CONNECTION_CLOSE is emitted. On the next loop iteration, QUIC_FL_CONN_IMMEDIATE_CLOSE is reset, thus qc_prep_pkts() will use the wrong buffer size as end delimiter. In some cases, this may cause a BUG_ON() crash due to b_add() outside of buffer. This bug can be reproduced by using a while loop of ngtcp2-client and interrupting them randomly via Ctrl+C. Here is the patch which introduced this regression : cdfceb10ae136b02e51f9bb346321cf0045d58e0 MINOR: quic: refactor qc_prep_pkts() loop	2024-06-19 15:15:59 +02:00
Amaury Denoyelle	c714b6bb55	BUG/MAJOR: quic: fix padding with short packets QUIC sending functions were extended to be more flexible. Of all the changes, they now support iterating over a variable number of QEL instances instead of only 2 previously. This change has rendered PADDING emission less predictable, which was adjusted via the following patch : a60609f1aa3e5f61d2a2286fdb40ebf6936a80ee BUG/MINOR: quic: fix padding of INITIAL packets Its main purpose was to ensure PADDING would only be generated for the last iterated QEL instance, to avoid unnecessary padding. In parallel, a BUG_ON() statement ensures that built INITIAL packets are always padded to 1200 bytes as necessary before emitting them. This BUG_ON() statement caused a crash in one particular case : when building datagrams that mixed Initial long packets and 1-RTT short packets. This last packet type does not have a length field in its header, contrary to Long packets. This caused a miscalculation of the necessary padding size, with INITIAL packets not padded enough to reach the necessary 1200 bytes size. This issue was detected on 3.0.2. It can be reproduced by using 0-RTT combined with latency. Here are the used commands : $ ngtcp2-client --tp-file=/tmp/ngtcp2-tp.txt \ --session-file=/tmp/ngtcp2-session.txt --exit-on-all-streams-close \ 127.0.0.1 20443 "https://[::]/?s=32o" $ sudo tc qdisc add dev lo root netem latency 500ms Note that this issue cannot be reproduced on current dev version. Indeed, it seems that the following patch introduces a slight change in packet building ordering : cdfceb10ae136b02e51f9bb346321cf0045d58e0 MINOR: quic: refactor qc_prep_pkts() loop This must be backported to 3.0. This should fix github issue #2609.	2024-06-19 11:11:57 +02:00
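The 1200-byte requirement applies to the whole UDP datagram, so every coalesced packet counts, including a trailing short header packet that carries no length field. The arithmetic can be sketched as below. This only illustrates the size rule, not the haproxy packet builder; `QUIC_INITIAL_MIN_DGRAM_SIZE` and `quic_dgram_padding()` are hypothetical names.

```c
#include <stddef.h>

/* Minimum size of a datagram carrying an Initial packet (hypothetical
 * macro name, the 1200 value is the one quoted in the commit message).
 */
#define QUIC_INITIAL_MIN_DGRAM_SIZE 1200

/* Returns the number of padding bytes needed so that a datagram made of
 * <nb_pkts> coalesced packets reaches the minimum size. <has_initial>
 * is non-zero when one of the packets is an Initial packet.
 */
size_t quic_dgram_padding(const size_t *pkt_lens, size_t nb_pkts,
                          int has_initial)
{
    size_t total = 0;

    /* Sum every packet, long header or short header alike. */
    for (size_t i = 0; i < nb_pkts; i++)
        total += pkt_lens[i];

    if (!has_initial || total >= QUIC_INITIAL_MIN_DGRAM_SIZE)
        return 0;
    return QUIC_INITIAL_MIN_DGRAM_SIZE - total;
}
```

For instance, an Initial packet of 300 bytes coalesced with a 200-byte 1-RTT packet needs 700 bytes of padding.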
Aurelien DARRAGON	7422f16da3	DOC: management: document ptr lookup for table commands Add missing documentation and examples for the optional ptr lookup method for table {show,set,clear} commands introduced in commit 9b2717e7 ("MINOR: stktable: use {show,set,clear} table with ptr"), as initially described in GH #2118. It may be backported in 3.0.	2024-06-19 10:28:10 +02:00
William Lallemand	0cc2913aec	DOC: configuration: fix alphabetical order of bind options Put the curves, ecdhe, severity-output, v4v6 and v6only keyword at the right place. Fix issue #2594. Could be backported in every stable versions.	2024-06-18 12:08:19 +02:00
Aurelien DARRAGON	9d312212df	BUG/MINOR: proxy: fix email-alert leak on deinit() (2nd try) As shown in GH #2608 and ("BUG/MEDIUM: proxy: fix email-alert invalid free"), simply calling free_email_alert() from free_proxy() is not the right thing to do. In this patch, we reuse proxy->email_alert.set memory space to introduce proxy->email_alert.flags in order to support 2 flags: PR_EMAIL_ALERT_SET (to mimic proxy->email_alert.set) and PR_EMAIL_ALERT_RESOLVED (set once init_email_alert() was called on the proxy to resolve email_alert.mailer pointer). Thanks to PR_EMAIL_ALERT_RESOLVED flag, free_email_alert() may now properly handle the freeing of proxy email_alert settings: if the RESOLVED flag is set, then it means the .email_alert.mailers.name parsing hint was replaced by the actual mailers pointer, thus no free should be attempted. No backport needed: as described in ("BUG/MEDIUM: proxy: fix email-alert invalid free"), this historical leak is not sensitive as it cannot be triggered during runtime, thus given that the fix is not backport-friendly, it's not worth the trouble.	2024-06-17 19:37:29 +02:00
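The idea of guarding a union with a "resolved" flag can be sketched as follows. Only the two flag names come from the commit message. Their values, the structure layout and the helper functions are hypothetical and much simpler than the real proxy code.

```c
#include <stdlib.h>
#include <string.h>

#define PR_EMAIL_ALERT_SET      0x01 /* hypothetical value */
#define PR_EMAIL_ALERT_RESOLVED 0x02 /* hypothetical value */

/* Hypothetical stand-in for a mailers section shared by several proxies. */
struct mailers {
    const char *id;
};

struct email_alert {
    union {
        char *name;        /* parsing hint, owned until resolution */
        struct mailers *m; /* shared section, never owned */
    } mailers;
    unsigned int flags;
};

/* Stores a private copy of <name> as the parsing hint.
 * Returns 0 on success, -1 on allocation failure.
 */
int email_alert_set_name(struct email_alert *ea, const char *name)
{
    size_t len = strlen(name) + 1;
    char *copy = malloc(len);

    if (!copy)
        return -1;
    memcpy(copy, name, len);
    ea->mailers.name = copy;
    ea->flags = PR_EMAIL_ALERT_SET;
    return 0;
}

/* Replaces the name hint with the pointer to the resolved section. */
void email_alert_resolve(struct email_alert *ea, struct mailers *m)
{
    if (!(ea->flags & PR_EMAIL_ALERT_RESOLVED))
        free(ea->mailers.name);
    ea->mailers.m = m;
    ea->flags |= PR_EMAIL_ALERT_RESOLVED;
}

/* Releases the settings. The union member is freed only when it still
 * holds the name hint. Returns 1 if the name was freed, 0 otherwise.
 */
int email_alert_free(struct email_alert *ea)
{
    int freed = 0;

    if ((ea->flags & PR_EMAIL_ALERT_SET) &&
        !(ea->flags & PR_EMAIL_ALERT_RESOLVED)) {
        free(ea->mailers.name);
        freed = 1;
    }
    ea->mailers.name = NULL;
    ea->flags = 0;
    return freed;
}
```

Without the flag, a free function cannot tell which union member is live, which is how a shared mailers section ended up being freed once per referencing proxy.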
Aurelien DARRAGON	ee8be55942	REORG: mailers: move free_email_alert() to mailers.c free_email_alert() was declared in cfgparse.c, but it should belong to mailers.c instead.	2024-06-17 19:37:29 +02:00
Aurelien DARRAGON	8e226682be	BUG/MEDIUM: proxy: fix email-alert invalid free In fa90a7d3 ("BUG/MINOR: proxy: fix email-alert leak on deinit()"), I tried to fix email-alert deinit() leak the simple way by leveraging existing free_email_alert() helper function which was already used for freeing email alert settings used in a default section. However, as described in GH #2608, there is a subtelty that makes free_email_alert() not suitable for use from free_proxy(). Indeed, proxy 'mailers.name' hint shares the same memory space than the pointer to the corresponding mailers section (once the proxy is resolved, name hint is replaced by the pointer to the section). However, since both values share the same space (through union), we have to take care of not freeing `mailers.name` once init_email_alert() was called on the proxy. Unfortunately, free_email_alert() isn't protected against that, causing double free() during deinit when mailers section is referenced from multiple proxy sections. Since there is no easy fix, and that the leak in itself isn't a big deal (fa90a7d3 was simply an opportunistic fix rather than a must-have given that the leak only occurs during deinit and not during runtime), let's actually revert the fix to restore legacy behavior and prevent deinit errors. Thanks to @snetat for having reported the issue on Github as well as providing relevant infos to pinpoint the bug. It should be backported everywhere fa90a7d3 was backported. [ada: for versions prior to 3.0, simply revert the offending commit using 'git revert' as proxy_free_common() first appears in 3.0]	2024-06-17 19:37:24 +02:00
William Lallemand	c268313f60	REGTESTS: ssl: activate new SSL reg-tests with AWS-LC Prerequisites are now available in AWS-LC, so we can enable these reg-tests. With this patch, aws-lc only has 5 reg-tests that are not working: - reg-tests/ssl/ssl_reuse.vtc: stateful session resumption is only supported with TLSv1.2 - reg-tests/ssl/ssl_curve_name.vtc: function to extract curve name is not available - reg-tests/ssl/ssl_errors.vtc: errors are not the same as with OpenSSL - reg-tests/ssl/ssl_dh.vtc: AWS-LC does not support DH - reg-tests/ssl/ssl_curves.vtc: not working correctly Which means most of the features are working correctly.	2024-06-17 17:43:22 +02:00
William Lallemand	30a432d198	MINOR: ssl: activate sigalgs feature for AWS-LC AWS-LC lacks the SSL_CTX_set1_sigalgs_list define although the function exists, which disables the feature in HAProxy even though we could have built with it. SSL_CTX_set1_client_sigalgs_list() is not available, though. This patch introduces the define so the feature is enabled.	2024-06-17 17:40:49 +02:00
William Lallemand	ed9b8fec49	BUG/MEDIUM: ssl: AWS-LC + TLSv1.3 won't do ECDSA in RSA+ECDSA configuration SSL_get_ciphers() in AWS-LC seems to lack the TLSv1.3 ciphersuites, which breaks the ECDSA key selection when doing TLSv1.3. An issue was opened: https://github.com/aws/aws-lc/issues/1638 Indeed, in ssl_sock_switchctx_cbk(), the sigalgs are used to determine if ECDSA is doable or not, then the function compares the list of ciphers in the clienthello with the list of configured ciphers. The fix solves the issue by never skipping the TLSv1.3 ciphersuites, even if they are not in SSL_get_ciphers().	2024-06-17 17:40:49 +02:00
William Lallemand	6da0879083	REGTESTS: ssl: fix some regtests 'feature cmd' start condition Since patch fde517b ("REGTESTS: wolfssl: temporarly disable some failing reg-tests") some 'feature cmd' lines have an extra quotation mark, so they were disabled in every case. Must be backported to 2.9.	2024-06-17 16:12:57 +02:00
Aurelien DARRAGON	983513d901	DEBUG: hlua: distinguish burst timeout errors from exec timeout errors hlua burst timeout was introduced in 58e36e5b1 ("MEDIUM: hlua: introduce tune.lua.burst-timeout"). It is a safety measure that allows to detect when too much time is spent on a single lua execution (between 2 interruptions/yields), meaning that the current thread is not able to perform other tasks. Such scenario should be avoided because it will cause thread contention which may have negative performance impact and could cause the watchdog to trigger. When the burst timeout is exceeded, the current Lua execution is aborted and a timeout error is reported to the user. Unfortunately, the same error is currently being reported for cumulative (AKA execution) timeout and for burst timeout, which may be confusing to the user. Indeed, "execution timeout" error historically results from the current hlua context exceeding the total (cumulative) time it's allowed to run. It is set per lua context using the dedicated tunables: - tune.lua.session-timeout - tune.lua.task-timeout - tune.lua.service-timeout We've already faced a user report where the user was able to trigger the burst timeout and got "Lua task: execution timeout." error while the user didn't set cumulative timeout. Thus the error was actually confusing because it was indeed the burst timeout which was causing it due to the use of cpu-intensive call from within the task without sufficient manual "yield" keypoints around the cpu-intensive call to ensure it runs on a dedicated scheduler cycle. In this patch we make it so burst timeout related errors are reported as "burst timeout" errors instead of "execution timeout" errors (which in fact became the generic timeout errors catchall with 58e36e5b1). To do this, hlua_timer_check() now returns a different value depending on whether the exceeded timeout is the burst one or the cumulative one, which allows us to return either HLUA_E_ETMOUT or HLUA_E_BTMOUT in hlua_ctx_resume().
It should improve the situation described in GH #2356 and may possibly be backported with 58e36e5b1 to improve error reporting if it applies without resistance.	2024-06-14 18:25:58 +02:00
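The idea of a timer check that reports which limit was exceeded can be sketched in standalone C as follows. This is only an illustration of the principle from 983513d901: the enum, struct, and function names are hypothetical and do not reproduce the actual hlua_timer_check() signature or the HLUA_E_* values.

```c
#include <assert.h>

/* Hypothetical result codes: one per timeout kind, so that the caller can
 * report "burst timeout" and "execution timeout" separately. */
enum timer_status_sketch {
    TMR_OK = 0,
    TMR_BURST_EXCEEDED,   /* too long between two yields */
    TMR_CUMUL_EXCEEDED    /* total run time of the context exceeded */
};

struct timer_sketch {
    unsigned int burst_ms;      /* time spent since the last yield */
    unsigned int cumul_ms;      /* total time spent by this context */
    unsigned int max_cumul_ms;  /* cumulative limit, 0 means unlimited */
};

/* Check both limits; the burst limit is tested first. A limit of 0 disables
 * the corresponding check. */
static enum timer_status_sketch timer_check(const struct timer_sketch *t,
                                            unsigned int max_burst_ms)
{
    assert(t);
    if (max_burst_ms && t->burst_ms > max_burst_ms)
        return TMR_BURST_EXCEEDED;
    if (t->max_cumul_ms && t->cumul_ms > t->max_cumul_ms)
        return TMR_CUMUL_EXCEEDED;
    return TMR_OK;
}

/* Map a status to the message shown to the user. */
static const char *timer_status_str(enum timer_status_sketch s)
{
    switch (s) {
    case TMR_BURST_EXCEEDED: return "burst timeout";
    case TMR_CUMUL_EXCEEDED: return "execution timeout";
    default:                 return "ok";
    }
}
```

In this sketch, a context with no cumulative limit that spends too long between two yields yields TMR_BURST_EXCEEDED rather than a generic timeout, which mirrors the user report described in the commit message.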
Aurelien DARRAGON	0030f722a2	BUG/MINOR: log: fix broken '+bin' logformat node option In 12d08cf912 ("BUG/MEDIUM: log: don't ignore disabled node's options"), while trying to restore historical node option inheritance behavior, I broke the '+bin' logformat node option recently introduced in b7c3d8c87c ("MINOR: log: add +bin logformat node option"). Indeed, because of 12d08cf912, LOG_OPT_BIN is not set anymore on individual nodes even if it was set globally, making the feature unusable. ('+bin' is also used for binary cbor encoding) What I should have done instead is include LOG_OPT_BIN in the options inherited from global ones. This is what's being done in this commit. Misleading comment was adjusted. It must be backported in 3.0 with 12d08cf912.	2024-06-14 18:25:21 +02:00
Christopher Faulet	dc1bca4e9f	[RELEASE] Released version 3.1-dev1 Released version 3.1-dev1 with the following main changes : - REGTESTS: Remove REQUIRE_VERSION=2.1 from all tests - REGTESTS: Remove REQUIRE_VERSION=2.2 from all tests - CI: use "--no-install-recommends" for apt-get - CI: switch to lua 5.4 - CI: use USE_PCRE2 instead of USE_PCRE - DOC: replace the README by a markdown version - CI: VTest: accelerate package install a bit - ADMIN: acme.sh: remove the old acme.sh code - BUG/MINOR: cfgparse: remove the correct option on httpcheck send-state warning - BUG/MINOR: tcpcheck: report correct error in tcp-check rule parser - BUG/MINOR: tools: fix possible null-deref in env_expand() on out-of-memory - DOC: configuration: add an example for keywords from crt-store - CI: speedup apt package install - DOC: add the FreeBSD status badge to README.md - DOC: change the link to the FreeBSD CI in README.md - MINOR: stktable: avoid ambiguous stktable_data_ptr() usage in cli_io_handler_table() - BUG/MINOR: hlua: use CertCache.set() from various hlua contexts - CLEANUP: hlua: fix CertCache class comment - CI: FreeBSD: upgrade image, packages - BUG/MEDIUM: h1-htx: Don't state interim responses are bodyless - MEDIUM: stconn: Be able to unblock zero-copy data forwarding from done_fastfwd - BUG/MEDIUM: mux-quic: Unblock zero-copy forwarding if the txbuf can be released - BUG/MINOR: quic: prevent crash on qc_kill_conn() - CLEANUP: hlua: use hlua_pusherror() where relevant - BUG/MINOR: hlua: don't use lua_pushfstring() when we don't expect LJMP - BUG/MINOR: hlua: fix unsafe hlua_pusherror() usage - BUG/MINOR: hlua: prevent LJMP in hlua_traceback() - CLEANUP: hlua: get rid of hlua_traceback() security checks - BUG/MINOR: hlua: fix leak in hlua_ckch_set() error path - CLEANUP: hlua: simplify ambiguous lua_insert() usage in hlua_ctx_resume() - BUG/MEDIUM: mux-quic: Don't unblock zero-copy fwding if blocked during nego - MINOR: mux-quic: Don't send an emtpy H3 DATA frame during 
zero-copy forwarding - BUG/MEDIUM: ssl: wrong priority whem limiting ECDSA ciphers in ECDSA+RSA configuration - BUG/MEDIUM: ssl: bad auth selection with TLS1.2 and WolfSSL - BUG/MINOR: quic: fix computed length of emitted STREAM frames - BUG/MINOR: quic: ensure Tx buf is always purged - BUG/MEDIUM: stconn/mux-h1: Fix suspect change causing timeouts - BUG/MAJOR: mux-h1: Properly copy chunked input data during zero-copy nego - BUG/MINOR: mux-h1: Use the right variable to set NEGO_FF_FL_EXACT_SIZE flag - DOC: install: remove boringssl from the list of supported libraries - MINOR: log: fix "http-send-name-header" ignore warning message - BUG/MINOR: proxy: fix server_id_hdr_name leak on deinit() - BUG/MINOR: proxy: fix log_tag leak on deinit() - BUG/MINOR: proxy: fix email-alert leak on deinit() - BUG/MINOR: proxy: fix check_{command,path} leak on deinit() - BUG/MINOR: proxy: fix dyncookie_key leak on deinit() - BUG/MINOR: proxy: fix source interface and usesrc leaks on deinit() - BUG/MINOR: proxy: fix header_unique_id leak on deinit() - MINOR: proxy: add proxy_free_common() helper function - BUG/MEDIUM: proxy: fix UAF with {tcp,http}checks logformat expressions - MINOR: log: change wording in lf_expr_postcheck() error message - BUG/MEDIUM: log: fix lf_expr_postcheck() behavior with default section - CLEANUP: log/proxy: fix comment in proxy_free_common() - DOC: config: move "hash-key" from proxy to server options - DOC: config: add missing section hint for "guid" proxy keyword - DOC: config: add missing context hint for new server and proxy keywords - BUG/MINOR: promex: Skip resolvers metrics when there is no resolver section - DOC: internals: add a documentation about the master worker - BUG/MAJOR: mux-h1: Prevent any UAF on H1 connection after draining a request - BUG/MINOR: quic: fix padding of INITIAL packets - OPTIM: quic: fill whole Tx buffer if needed - MINOR: quic: refactor qc_build_pkt() error handling - MINOR: quic: use global datagram headlen definition - 
MINOR: quic: refactor qc_prep_pkts() loop - DOC/MINOR: management: add missed -dR and -dv options - DOC/MINOR: management: add -dZ option - DOC: management: rename show stats domain cli "dns" to "resolvers" - REORG: log: reorder send log helpers by dependency order - MINOR: session: expose session_embryonic_build_legacy_err() function - MEDIUM: log/session: handle embryonic session log within sess_log() - MINOR: log: provide sending log context to process_send_log() when available - MINOR: log: add log_orig_to_str() function - MINOR: log: provide log origin in logformat expressions using '%OG' - CLEANUP: log: remove ambiguous legacy comment for resolve_logger() - MINOR: log/backend: always free parsing hints in resolve_logger() - MINOR: log: make resolve_logger() static - MINOR: log: provide proxy context to resolve_logger() - MINOR: log: add __send_log_set_metadata_sd helper - MINOR: log: add logger flags - MINOR: log: add log-profile parsing logic - MINOR: log: add log profile buildlines - MEDIUM: log: handle log-profile in process_send_log() - DOC: config: add documentation for log profiles - REGTESTS: log: add a test for log-profile - MINOR: ssl: add ssl_sock_bind_verifycbk() in ssl_sock.h - REORG: ssl: move the SNI selection code in ssl_clienthello.c - BUILD: ssl: fix build with wolfSSL - CI: github: upgrade aws-lc to 1.29.0 - Revert "CI: github: upgrade aws-lc to 1.29.0" - MEDIUM: ssl: support for ECDA+RSA certificate selection with AWS-LC - BUILD: ssl: disable deprecated functions for AWS-LC 1.29.0 - MINOR: ssl: relax the 'ssl.default-dh-param' keyword parsing - CI: github: upgrade aws-lc to 1.29.0 - DOC: INSTALL: minimum AWS-LC version is v1.22.0 - CI: github: do the AWS-LC weekly build with ERR=1	2024-06-14 16:04:18 +02:00
William Lallemand	5e361c7767	CI: github: do the AWS-LC weekly build with ERR=1 The weekly CI that tries new versions of AWS-LC was not building with ERR=1, which let us think that everything was good while there were in fact new warnings that we missed. Add ERR=1 to the build so the CI will fail on any new warning.	2024-06-14 12:18:32 +02:00
William Lallemand	1950996e83	DOC: INSTALL: minimum AWS-LC version is v1.22.0 Change the minimum AWS-LC version required	2024-06-14 12:06:03 +02:00
William Lallemand	11e13175d4	CI: github: upgrade aws-lc to 1.29.0 Upgrade aws-lc to 1.29.0 on the push CI.	2024-06-14 11:37:11 +02:00
William Lallemand	7e80af04ca	MINOR: ssl: relax the 'ssl.default-dh-param' keyword parsing Some libraries are ignoring SSL_CTX_set_tmp_dh_callback(), but disabling the 'ssl.default-dh-param' keyword when the function is not supported would result in an error instead of silently continuing. This patch emits a warning when the keyword is not supported instead of a loading failure.	2024-06-14 11:36:52 +02:00
William Lallemand	ee5aa4e5e6	BUILD: ssl: disable deprecated functions for AWS-LC 1.29.0 AWS-LC has a lot of functions that do nothing, which are now deprecated and emit warnings. This patch disables the following useless functions that emit a warning: SSL_CTX_get_security_level(), SSL_CTX_set_tmp_dh_callback(), ERR_load_SSL_strings(), RAND_keep_random_devices_open() The list of deprecated functions is here: https://github.com/aws/aws-lc/blob/main/docs/porting/functionality-differences.md	2024-06-14 10:41:36 +02:00
William Lallemand	7120c77b14	MEDIUM: ssl: support for ECDA+RSA certificate selection with AWS-LC AWS-LC does not support the SSL_CTX_set_client_hello_cb() function from OpenSSL which allows to analyze the ciphers and signature algorithms of the ClientHello. However it supports SSL_CTX_set_select_certificate_cb(), which allows the same thing but comes from the BoringSSL side. This patch uses SSL_CTX_set_select_certificate_cb() as well as the SSL_early_callback_ctx_extension_get() function to get the signature algorithms. This was successfully tested with openssl s_client as well as testssl.sh. This should allow to enable more reg-tests that depend on certificate selection. Requires at least AWS-LC 1.22.0.	2024-06-13 19:36:40 +02:00
William Lallemand	935b3bd1b7	Revert "CI: github: upgrade aws-lc to 1.29.0" This reverts commit 6e986e7493ad2aa0c5a11c59d1235b03c02ef71c.	2024-06-13 17:14:58 +02:00
William Lallemand	6e986e7493	CI: github: upgrade aws-lc to 1.29.0 Upgrade aws-lc to 1.29.0 on the push CI.	2024-06-13 17:11:04 +02:00
William Lallemand	5149cc4990	BUILD: ssl: fix build with wolfSSL fix build with wolfSSL, broken since the reorg in src/ssl_clienthello.c	2024-06-13 17:01:45 +02:00
William Lallemand	4ced880d22	REORG: ssl: move the SNI selection code in ssl_clienthello.c Move the code which is used to select the final certificate with the clienthello callback. ssl_sock_client_sni_pool needs to be exposed outside ssl_sock.c	2024-06-13 16:48:17 +02:00
William Lallemand	fc7c5d892b	MINOR: ssl: add ssl_sock_bind_verifycbk() in ssl_sock.h Add missing ssl_sock_bind_verifycbk() in ssl_sock.h	2024-06-13 16:48:17 +02:00
Aurelien DARRAGON	bcad26c814	REGTESTS: log: add a test for log-profile Try to cover some common use-cases for "log-profile" feature. The tests mainly focus on log-profile section declaration, and testing the behavior of logformat / log-tag overriding capabilities. For now, the use of log-profiles is somewhat limited because we lack the ability to explicitly trigger the log building process at specific steps during the stream handling. Indeed, for now we rely on "option logasap" and proxy log-format string content "hacks" to force the log emission at some specific steps, thus more tests should be added over time, when new mechanisms allowing the emission of logs at expected processing steps are added, or if new keywords are added to the log-profile section. This test requires versions >= 3.0-dev1	2024-06-13 15:43:10 +02:00
Aurelien DARRAGON	8fa4036dae	DOC: config: add documentation for log profiles Now that log-profile parsing logic has been implemented in "MINOR: log: add log-profile parsing logic" and is actually effective since "MEDIUM: log: handle log-profile in process_send_log()", let's document the feature and add some examples. Log-profile section is declared like this: log-profile myprof log-tag "custom-tag" on error format "%ci: error" on any format "(custom httplog) ${HAPROXY_HTTP_LOG_FMT}" sd "[exampleSDID@1234 step=\"accept\" id=\"%ID\"]" (check out the documentation for the full list of options, some options are only relevant under specific contexts) And used this way (from usual "log" directive lines): global log stdout format rfc5424 profile myprof local0 -------------- For now, the use of log-profiles is somewhat limited because we lack the ability to explicitly trigger the log building process at specific steps during the stream handling, but it should gain more traction over the time as the feature evolves and new mechanisms allowing the emission of logs at expected processing steps will be added. It should partially fix GH #401	2024-06-13 15:43:10 +02:00
Aurelien DARRAGON	cc6fd2646b	MEDIUM: log: handle log-profile in process_send_log() In previous commit we implemented log-profile parsing logic. Now let's actually make use of available log-profile information from logger struct to decide whether we need to rebuild the logline under process_send_log() according to log profile settings. Nothing is done if the logger didn't specify a log-profile.	2024-06-13 15:43:09 +02:00
Aurelien DARRAGON	48d34b98e4	MINOR: log: add log profile buildlines Now that we have log-profile parsing done, let's prepare for runtime log-profile handling by adding the necessary string buffer required to re-build log strings using sess_build_logline() on the fly without altering regular loglines content. Indeed, since a different log-profile may (or may not) be specified for each logger, we must keep the original string and only rebuild a custom one when required for the current logger (according to the selected log- profile).	2024-06-13 15:43:09 +02:00
Aurelien DARRAGON	15e9c7da6b	MINOR: log: add log-profile parsing logic This patch implements prerequisite log-profile struct and parser logic. It has no effect during runtime for now. Logformat expressions provided in log-profile "steps" are postchecked during postparsing for each proxy "log" directive that makes use of a given profile. (this allows to ensure that the logformat expressions used in the profile are compatible with proxy using them)	2024-06-13 15:43:09 +02:00
Aurelien DARRAGON	33f3bec7ee	MINOR: log: add logger flags Logger struct may benefit from having a "flags" struct member to set or remove different logger states. For that, we reuse an existing 4 bytes hole in the logger struct to store a 2 bytes flags integer, leaving the struct with a 2-bytes hole now.	2024-06-13 15:43:09 +02:00
Aurelien DARRAGON	a6e38465fb	MINOR: log: add __send_log_set_metadata_sd helper Extract sd metadata assignment in __send_log() to make an inline helper function out of it in order to be able to use it from other functions if needed.	2024-06-13 15:43:09 +02:00
Aurelien DARRAGON	3102c89dde	MINOR: log: provide proxy context to resolve_logger() Prerequisite work for log-profiles, we need to know under which proxy context the logger is being used. When the info is not available (ie: global section or log-forward section), <px> is set to NULL.	2024-06-13 15:43:09 +02:00
Aurelien DARRAGON	42139fa16e	MINOR: log: make resolve_logger() static There is no need to expose this internal function, let's make it static.	2024-06-13 15:43:09 +02:00
Aurelien DARRAGON	db47471155	MINOR: log/backend: always free parsing hints in resolve_logger() Since resolve_logger() always resolves logger target (even when error occurs), we must take care of freeing parsing hints because free_logger() won't try to do it if target RESOLVED flag is set on the target. This isn't considered as a bug because resolve_logger(), being a postparsing check, will make haproxy immediately exit upon fatal error in haproxy.c, but it's better to ensure that everything will be properly freed if we decide to perform a clean exit upon postparsing checks error in the future.	2024-06-13 15:43:09 +02:00
Aurelien DARRAGON	2a1bf99923	CLEANUP: log: remove ambiguous legacy comment for resolve_logger() It is no longer relevant to say that <logger> is used for implicit settings. In fact the function resolves <logger>, but currently mainly focuses on the logger's target. However we could extend the function to perform additional work on the logger itself in the future. Let's adjust the comment to prevent any confusion.	2024-06-13 15:43:09 +02:00
Aurelien DARRAGON	8f34320e15	MINOR: log: provide log origin in logformat expressions using '%OG' '%OG' logformat alias may be used to report the log origin (when/where) that triggered log generation using sess_build_logline(). Possible values are: - "sess_error": log was generated during session error handling - "sess_killed": log was generated during session abortion (killed embryonic session) - "txn_accept": log was generated right after frontend conn was accepted - "txn_request": log was generated after client request was received - "txn_connect": log was generated after backend connection establishment - "txn_response": log was generated during server response handling - "txn_close": log was generated at the final txn step, before closing - "unspec": unknown or not specified Documentation was updated.	2024-06-13 15:43:09 +02:00
Aurelien DARRAGON	b52862d401	MINOR: log: add log_orig_to_str() function Get human readable string from log_orig enum members.	2024-06-13 15:43:09 +02:00
Aurelien DARRAGON	2a91bd52ad	MINOR: log: provide sending log context to process_send_log() when available This is another prerequisite work in preparation for log-profiles: in this patch we make process_send_log() aware of the log origin, primarily aiming for sess and txn logging steps such as error, accept, connect, close, as well as relevant sess and stream pointers.	2024-06-13 15:43:09 +02:00
Aurelien DARRAGON	0b7a5a64eb	MEDIUM: log/session: handle embryonic session log within sess_log() Move the embryonic session logging logic down to sess_log() in preparation for log-profiles because then log preferences will be set per logger and not per proxy. Indeed, as each logger may come with its own log-profile that possibly overrides proxy logformat preferences, the check will need to be performed at a central place by lower sending functions. To ensure the change doesn't break existing behavior, a dedicated sess_log_embryonic() wrapper was added and is exclusively used by session_kill_embryonic() to indicate that a special logging logic must be performed under sess_log(). Also, thanks to this change, log-format-sd will now be taken into account for legacy embryonic session logging.	2024-06-13 15:43:09 +02:00
Aurelien DARRAGON	79a0a7b4d8	MINOR: session: expose session_embryonic_build_legacy_err() function rename session_build_err_string() to session_embryonic_build_legacy_err() and add new <out> buffer argument to the prototype. <out> will be used as destination for the generated string instead of implicitly relying on the trash buffer. Finally, expose the new function through the header file so that it becomes usable from any source file. The function is expected to be called with a session originating from a connection and should not be used for applets.	2024-06-13 15:43:09 +02:00
Aurelien DARRAGON	ee288a4eef	REORG: log: reorder send log helpers by dependency order This commit looks messy, but all it does is reorganize send_log() helpers by dependency order to remove the need of forward-declaring some of them. Also, since they're all internal helpers, let's explicitly mark them as static to prevent any misuse.	2024-06-13 15:43:09 +02:00
Aurelien DARRAGON	cf913c2f90	DOC: management: rename show stats domain cli "dns" to "resolvers" In commit f8642ee82 ("MEDIUM: resolvers: rename dns extra counters to resolvers extra counters"), we renamed "dns" counters to "resolvers", but we forgot to update the documentation accordingly. This may be backported to all stable versions.	2024-06-13 15:43:09 +02:00
Valentine Krasnobaeva	61d66a3d06	DOC/MINOR: management: add -dZ option Add some description for the missing -dZ command line option in the "3. Starting HAProxy" chapter. Needs to be backported as far as 2.9.	2024-06-12 18:21:21 +02:00
Valentine Krasnobaeva	27623d8393	DOC/MINOR: management: add missed -dR and -dv options Add some description for the missing -dR and -dv command line options in the "3. Starting HAProxy" chapter. Needs to be backported to every stable version.	2024-06-12 18:20:41 +02:00
Amaury Denoyelle	cdfceb10ae	MINOR: quic: refactor qc_prep_pkts() loop qc_prep_pkts() is built around a double loop iteration. First, it iterates over every QEL instance registered for sending. The inner loop is used to repeatedly call qc_build_pkt() with a QEL instance. If the QEL instance has no more data to send, the next QEL entry is selected. It can also be interrupted earlier if there is not enough room in the send buffer. Clarify the inner loop by using qc_may_build_pkt() directly in it, besides the check on buffer room left. This function is used to test if the QEL instance has something to send. This should simplify send evolution, in particular GSO implementation.	2024-06-12 18:05:40 +02:00
Amaury Denoyelle	ba00431625	MINOR: quic: use global datagram headlen definition Each emitted QUIC datagram is prefixed by an out-of-band header. This header specifies the datagram length and the pointer to the first QUIC packet instance. This header length is defined via QUIC_DGRAM_HEADLEN. Replace every occurrence of manually calculated header length with the globally defined QUIC_DGRAM_HEADLEN. This should ease code maintenance and simplify GSO implementation.	2024-06-12 18:05:40 +02:00
Amaury Denoyelle	88681681cc	MINOR: quic: refactor qc_build_pkt() error handling qc_build_pkt() error handling was difficult due to the multiple possible error codes. Improve this by defining a proper enum to describe the various error codes. Also clean up ending labels inside qc_build_pkt().	2024-06-12 18:05:40 +02:00
Amaury Denoyelle	ab37b86921	OPTIM: quic: fill whole Tx buffer if needed Previously, packet encoding was stopped as soon as the buffer room left was less than the UDP MTU. This is suboptimal if the next packet would be smaller than that. To improve this, only check if there is at least enough room for the mandatory packet header. qc_build_pkt() is thus responsible for returning QC_BUILD_PKT_ERR_BUFROOM as soon as the buffer room left is insufficient, in order to stop packet encoding. An extra check is added to ensure the end pointer never exceeds the buffer end. This should not have any significant impact on the performance. However, this renders the code intention clearer.	2024-06-12 18:05:40 +02:00
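The buffer-filling policy from ab37b86921, combined with the error enum idea from 88681681cc, can be sketched in standalone C as follows. All names and sizes here are hypothetical (HDR_MIN_SKETCH, build_one(), fill_buffer()); the sketch does not reproduce qc_build_pkt() or the real QUIC packet layout, and only shows the "check header room in the loop, let the builder report insufficient room" pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error codes for the packet builder. */
enum build_err_sketch {
    BUILD_OK = 0,
    BUILD_ERR_BUFROOM   /* not enough room left for this packet */
};

#define HDR_MIN_SKETCH 8  /* hypothetical mandatory header size, in bytes */

/* "Encode" one packet of <payload> bytes at *pos, never going past <end>.
 * On success *pos is advanced; on failure it is left untouched. */
static enum build_err_sketch build_one(unsigned char **pos,
                                       const unsigned char *end,
                                       size_t payload)
{
    size_t need = HDR_MIN_SKETCH + payload;

    assert(pos && *pos && *pos <= end);
    if ((size_t)(end - *pos) < need)
        return BUILD_ERR_BUFROOM;
    *pos += need;
    return BUILD_OK;
}

/* Fill <buf> with as many packets as fit. The loop only requires room for
 * the mandatory header; the builder decides whether the full packet fits.
 * Returns the number of packets encoded. */
static size_t fill_buffer(unsigned char *buf, size_t size,
                          const size_t *payloads, size_t n)
{
    unsigned char *pos = buf;
    const unsigned char *end = buf + size;
    size_t i;

    for (i = 0; i < n; i++) {
        if ((size_t)(end - pos) < HDR_MIN_SKETCH)
            break;
        if (build_one(&pos, end, payloads[i]) == BUILD_ERR_BUFROOM)
            break;
    }
    return i;
}
```

Compared with stopping as soon as less than a full MTU is left, this lets a small trailing packet still be placed in the remaining room, which is the behavior the commit message describes.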
Amaury Denoyelle	a60609f1aa	BUG/MINOR: quic: fix padding of INITIAL packets API for sending has been extended to support emission on more than 2 QEL instances. However, this has rendered the PADDING emission for INITIAL packets less predictable. Indeed, if qc_send() is used with empty QEL instances, a padding frame may be generated before handling the last QEL registered, which could cause unnecessary padding to be emitted. This commit simplifies PADDING by only activating it for the last QEL registered. This ensures that no superfluous padding is generated, as if the minimal INITIAL datagram length is reached, padding is reset before handling the last QEL instance. This bug is labelled as minor as haproxy already emits big enough INITIAL packets coalesced with HANDSHAKE ones without needing padding. This however renders the padding code difficult to test. Thus, it may be useful to force emission on INITIAL qel only without coalescing HANDSHAKE packet. Here is a sample to reproduce it : --- a/src/quic_conn.c +++ b/src/quic_conn.c @@ -794,6 +794,14 @@ struct task *quic_conn_io_cb(struct task *t, void *context, unsigned int state) } } + if (qc->iel && qel_need_sending(qc->iel, qc)) { + struct list empty = LIST_HEAD_INIT(empty); + qel_register_send(&send_list, qc->iel, &qc->iel->pktns->tx.frms); + if (qc->hel) + qel_register_send(&send_list, qc->hel, &empty); + qc_send(qc, 0, &send_list); + } + /* Insert each QEL into sending list if needed. */ list_for_each_entry(qel, &qc->qel_list, list) { if (qel_need_sending(qel, qc)) This should be backported up to 3.0.	2024-06-12 18:05:40 +02:00
Christopher Faulet	0e09cce0fd	BUG/MAJOR: mux-h1: Prevent any UAF on H1 connection after draining a request Since 2.9, it is possible to drain the request payload from the H1 multiplexer in case of early reply. When this happens, the upper stream is detached but the H1 stream is not destroyed. Once the whole request is drained, the end of the detach stage is finished. So the H1 stream is destroyed and the H1 connection is ready to be reused, if possible, otherwise it is released. And here is the issue. If some data of the next request are received with last bytes of the drained one, parsing of the next request is immediately started. The previous H1 stream is destroyed and a new one is created to handle the parsing. At this stage the H1 connection may be released, for instance because of a parsing error. This case was not properly handled. Instead of immediately exiting the mux, it was still possible to access the released H1 connection to refresh its timeouts, leading to a UAF issue. Many thanks to Annika for her invaluable help on this issue. The patch should fix the issue #2602. It must be backported as far as 2.9.	2024-06-12 16:12:47 +02:00
William Lallemand	82a4dd7df6	DOC: internals: add a documentation about the master worker Add a documentation about the history of the master-worker and how it was implemented in its first version and how it is currently working. This is a global view of the architecture, and not an exhaustive explanation of all mechanisms.	2024-06-12 14:46:05 +02:00
Christopher Faulet	91fe085943	BUG/MINOR: promex: Skip resolvers metrics when there is no resolver section By default, there is always at least one resolver section, the default one, based on "/etc/resolv.conf" content. However, it is possible to have no resolver at all if the file is empty or if any error occurred. Errors are silently ignored at this stage. In that case, there was a bug in the Prometheus exporter leading to a crash because the resolver section list is empty. An invalid resolver entity was used. To fix the issue we must only take care to not dump resolvers metrics when there is no resolver. Thanks to Aurelien for having spotted the offending commit. This patch should fix the issue #2604. It must be backported to 3.0.	2024-06-12 08:55:52 +02:00
Aurelien DARRAGON	c157894ba9	DOC: config: add missing context hint for new server and proxy keywords To stay consistent with the work started in 54627f991 ("DOC: config: add context hint for proxy keywords") and 3d4e1e682 ("DOC: config: add context hint for server keywords"), we add missing context hint for "guid" (both proxy and server) keyword and "hash-key" server keyword that were added during 3.0 development. This may be backported in 3.0.	2024-06-11 17:03:02 +02:00
Aurelien DARRAGON	aec02320bd	DOC: config: add missing section hint for "guid" proxy keyword "guid" proxy keyword added in da754b45 ("MINOR: proxy: implement GUID support") was lacking the section hint in the keyword description, let's fix that. It could be backported in 3.0 with da754b45.	2024-06-11 17:02:55 +02:00
Aurelien DARRAGON	cdf1d20e8a	DOC: config: move "hash-key" from proxy to server options As reported by Ashley Morris, "hash-key" keyword which was introduced in commit faa8c3e0 ("MEDIUM: lb-chash: Deterministic node hashes based on server address") doesn't belong to proxy keywords and should be found in 5.2 "Server and default-server options" instead. It should be backported in 3.0 with faa8c3e0	2024-06-11 17:02:50 +02:00
Aurelien DARRAGON	c6931a4f01	CLEANUP: log/proxy: fix comment in proxy_free_common() Thanks to previous commit, logformat expressions for default proxies are also postchecked, adjusting a comment that suggests it's not the case.	2024-06-11 11:00:11 +02:00
Aurelien DARRAGON	e4f122f3f4	BUG/MEDIUM: log: fix lf_expr_postcheck() behavior with default section Since 7a21c3a4ef ("MAJOR: log: implement proper postparsing for logformat expressions"), logformat expressions stored in a default section are not postchecked anymore. This is because the REGISTER_POST_PROXY_CHECK() only evaluates regular proxies. Because of this, proxy options which are automatically enabled on the proxy depending on the logformat expression features in use are not set on the default proxy, which means such options are not passed to the regular proxies that inherit from it (proxies that will actually be running the logformat expression during runtime). Because of that, a logformat expression stored inside a default section and executed by a regular proxy may not behave properly. Also, since 03ca16f38b ("OPTIM: log: resolve logformat options during postparsing"), it's even worse because logformat node options postresolving is also skipped, which may also alter logformat expression encoding feature. To fix the issue, let's add a special case for default proxies in parse_logformat_string() and lf_expr_postcheck() so that default proxies are postchecked on the fly during parsing time in a "relaxed" way as we cannot assume that the features involved in the logformat expression won't be compatible with the proxy actually running it since we may have different types of proxies inheriting from the same default section. This bug was discovered while trying to address GH #2597. It should be backported to 3.0 with 7a21c3a4ef and 03ca16f38b.	2024-06-11 11:00:05 +02:00
Aurelien DARRAGON	cbc8e1394d	MINOR: log: change wording in lf_expr_postcheck() error message logformat_node was referenced as "node" in the error message reported to the user, but in fact it is referred to as "item" in user documentation. Using "item" in the error message to better comply with the doc. Error message was introduced with 7a21c3a4ef ("MAJOR: log: implement proper postparsing for logformat expressions")	2024-06-11 10:59:58 +02:00
Aurelien DARRAGON	318c290ff2	BUG/MEDIUM: proxy: fix UAF with {tcp,http}checks logformat expressions When parsing a logformat expression using parse_logformat_string(), the caller passes the proxy under which the expression is found as argument. This information allows the logformat expression API to check if the expression is compatible with the proxy settings. Since 7a21c3a ("MAJOR: log: implement proper postparsing for logformat expressions"), the proxy compatibility checks are postponed after the proxy is fully parsed to ensure proxy properties are fully resolved for checks consistency. The way it works, is that each time parse_logformat_string() is called for a given expression and proxy, it schedules the expression for postchecking by appending the expression to the list of pending expression checks on the proxy (lf_checks struct). Then, when the proxy is called with the REGISTER_POST_PROXY_CHECK() hook, it iterates over unchecked expressions and performs the check, then it removes the expression from its list. However, I overlooked a special case: if a logformat expression is used on a proxy that is disabled or a default proxy: REGISTER_POST_PROXY_CHECK() hook is never called. Because of that, lf expressions may still point to the proxy after the proxy is freed. For most logformat expressions, this isn't an issue because they are stored within the proxy itself, but this isn't the case with {tcp,http}checks logformat expressions: during deinit() sequence, all proxies are first cleaned up, and only then shared checks are freed. 
Because of that, the below config will trigger UAF since 7a21c3a: uaf.conf: listen dummy bind localhost:2222 backend testback disabled mode http option httpchk http-check send hdr test "test" http-check expect status 200 haproxy -f uaf.conf -c: ==152096== Invalid write of size 8 ==152096== at 0x21C317: lf_expr_deinit (log.c:3491) ==152096== by 0x2334A3: free_tcpcheck_http_hdr (tcpcheck.c:84) ==152096== by 0x2334A3: free_tcpcheck_http_hdr (tcpcheck.c:79) ==152096== by 0x2334A3: free_tcpcheck_http_hdrs (tcpcheck.c:98) ==152096== by 0x23365A: free_tcpcheck.part.0 (tcpcheck.c:130) ==152096== by 0x2338B1: free_tcpcheck (tcpcheck.c:108) ==152096== by 0x2338B1: deinit_tcpchecks (tcpcheck.c:3780) ==152096== by 0x2CF9A4: deinit (haproxy.c:2949) ==152096== by 0x2D0065: deinit_and_exit (haproxy.c:3052) ==152096== by 0x169BC0: main (haproxy.c:3996) ==152096== Address 0x52a8df8 is 6,968 bytes inside a block of size 7,168 free'd ==152096== at 0x484B27F: free (vg_replace_malloc.c:872) ==152096== by 0x2CF8AD: deinit (haproxy.c:2906) ==152096== by 0x2D0065: deinit_and_exit (haproxy.c:3052) ==152096== by 0x169BC0: main (haproxy.c:3996) To fix the issue, let's ensure in proxy_free_common() that no unchecked expressions may still point to the proxy after the proxy is freed by purging the list (DEL_INIT is used to reset list items). Special thanks to GH user @mhameed who filed a comprehensive issue with all the relevant information required to reproduce the bug (see GH #2597), after having first reported the issue on the alpine project bug tracker.	2024-06-11 10:59:52 +02:00
Aurelien DARRAGON	005e4ba715	MINOR: proxy: add proxy_free_common() helper function As shown by previous patch series, having to free some common proxy struct members twice (in free_proxy() and proxy_free_defaults()) is error-prone: we often overlook one of the two free locations when adding new features. To prevent such bugs from being introduced in the future, and also avoid code duplication, we now have a proxy_free_common() function to free all proxy struct members that are common to all proxy types (either regular or default ones). This should greatly improve code maintenance related to proxy freeing logic.	2024-06-11 10:59:45 +02:00
Aurelien DARRAGON	847c406b9a	BUG/MINOR: proxy: fix header_unique_id leak on deinit() proxy header_unique_id wasn't cleaned up in proxy_free_defaults(), resulting in small memory leak if "unique-id-header" was used on a default proxy section. It may be backported to all stable versions.	2024-06-11 10:59:39 +02:00
Aurelien DARRAGON	1aa219078d	BUG/MINOR: proxy: fix source interface and usesrc leaks on deinit() proxy conn_src.iface_name was only freed in proxy_free_defaults(), whereas proxy conn_src.bind_hdr_name was only freed in free_proxy(). Because of that, using "source usesrc hdr_ip()" in a default proxy, or "source interface" in a regular or default proxy would cause memory leaks during deinit. It may be backported to all stable versions.	2024-06-11 10:59:33 +02:00
Aurelien DARRAGON	6f53df3fcf	BUG/MINOR: proxy: fix dyncookie_key leak on deinit() proxy dyncookie_key wasn't cleaned up in free_proxy(), resulting in small memory leak if "dynamic-cookie-key" was used on a regular or default proxy. It may be backported to all stable versions.	2024-06-11 10:59:27 +02:00
Aurelien DARRAGON	62d0465a96	BUG/MINOR: proxy: fix check_{command,path} leak on deinit() proxy check_{command,path} members (used for "external-check" feature) weren't cleaned up in free_proxy(), resulting in small memory leak if "external-check command" or "external-check path" were used on a regular or default proxy. It may be backported to all stable versions.	2024-06-11 10:59:20 +02:00
Aurelien DARRAGON	fa90a7d313	BUG/MINOR: proxy: fix email-alert leak on deinit() proxy email-alert settings weren't cleaned up in free_proxy(), resulting in small memory leak if "email-alert to" or "email-alert from" were used on a regular or default proxy. It may be backported to all stable versions.	2024-06-11 10:59:15 +02:00
Aurelien DARRAGON	77b192ea36	BUG/MINOR: proxy: fix log_tag leak on deinit() proxy log_tag wasn't cleaned up in free_proxy(), resulting in small memory leak if "log-tag" was used on a regular or default proxy. It may be backported to all stable versions.	2024-06-11 10:59:08 +02:00
Aurelien DARRAGON	99f3409582	BUG/MINOR: proxy: fix server_id_hdr_name leak on deinit() proxy server_id_hdr_name member (used for "http-send-name-header" option) wasn't cleaned up in free_proxy(), resulting in small memory leak if "http-send-name-header" was used on a regular or default proxy. This may be backported to all stable versions.	2024-06-11 10:59:02 +02:00
Aurelien DARRAGON	e5ccfda9d3	MINOR: log: fix "http-send-name-header" ignore warning message Warning message to indicate that the "http-send-name-header" option is ignored for backend in "mode log" was referenced using its internal struct wording instead of public name (as seen in the documentation). Let's fix that. It may be backported with c7783fb ("MINOR: log/backend: prevent "http-send-name-header" use with LOG mode") in 2.9.	2024-06-11 10:58:55 +02:00
William Lallemand	7acdc3f6ff	DOC: install: remove boringssl from the list of supported libraries BoringSSL support is known to be broken since 2021, it was removed from the CI at this time and never fixed. (30ee2965b66f20a2649323ca36029bf2440e34b9) Even the QUIC code for boringSSL was removed in 2022. (e06f7459faf36f5f63092cb6ce89d281dfc4ee6a)	2024-06-10 18:54:28 +02:00
Christopher Faulet	7bff576ebb	BUG/MINOR: mux-h1: Use the right variable to set NEGO_FF_FL_EXACT_SIZE flag Instead of setting this flag on the ones used for the zero-copy negotiation, it is set on the connection flags used for xprt->rcv_buf() call. Fortunately, there is no real consequence. The only visible effect is the chunk size that is written on 8 bytes for no reason. This patch is related to issue #2598. It must be backported to 3.0.	2024-06-10 14:06:35 +02:00
Christopher Faulet	e8cc8a60be	BUG/MAJOR: mux-h1: Properly copy chunked input data during zero-copy nego When data are transferred via zero-copy data forwarding, if some data were already received, we try to immediately transfer it during the negotiation step. If data are chunked and the chunk size is unknown, 10 bytes are reserved to write the chunk size during the done step. However, when input data are finally transferred, the offset is ignored. Data are copied into the output buffer. But the first 10 bytes are then crushed by the chunk size. Thus the chunk is truncated leading to a malformed message. This patch should fix the issue #2598. It must be backported to 3.0.	2024-06-10 14:06:35 +02:00
William Manley	52eb6b23f8	BUG/MEDIUM: stconn/mux-h1: Fix suspect change causing timeouts This fixes an issue I've had where if a connection was idle for ~23s it would get in a bad state. I don't understand this code, so I'm not sure exactly why it was failing. I discovered this by bisecting to identify the commit that caused the regression between 2.9 and 3.0. The commit is d2c3f8dde7c2474616c0ea51234e6ba9433a4bc1: "MINOR: stconn/connection: Move shut modes at the SE descriptor level" - a part of v3.0-dev8. It seems to be an innocent renaming, so I looked through it and this stood out as suspect: - if (mode != CO_SHW_NORMAL) + if (mode & SE_SHW_NORMAL) It looks like the not went missing here, so this patch reverses that condition. It fixes my test. I don't quite understand what this is doing or is for so I can't write a regression test or decent commit message. Hopefully someone else will be able to pick this up from where I've left it. [CF: This inverts the condition to perform clean shutdowns. This means no clean shutdowns are performed when they should be. This patch must be backported to 3.0]	2024-06-10 14:06:35 +02:00
Amaury Denoyelle	0ef94e2dff	BUG/MINOR: quic: ensure Tx buf is always purged quic_conn API for sending was recently refactored. The main objective was to regroup the different functions present for both handshake and application emission. After this refactoring, an optimization was introduced to avoid calling qc_send() if there was nothing new to emit. However, this prevents the Tx buffer from being purged if previous sending was interrupted, until new frames are finally available. To fix this, simply remove the optimization. qc_send() is thus now always called in quic_conn IO handlers. The impact of this bug should be minimal as it happens only on a temporary sending error. However in this case, this could cause extra latency or even a complete sending freeze in the worst scenario. This must be backported up to 3.0.	2024-06-10 10:29:28 +02:00
Amaury Denoyelle	50470a5181	BUG/MINOR: quic: fix computed length of emitted STREAM frames qc_build_frms() is responsible for encoding multiple frames in a single QUIC packet. It accounts for room left in the buffer packet for each newly encoded frame. An incorrect computation was performed when encoding a STREAM frame in a single packet. Frame length was accounted twice which would reduce in excess the buffer packet room. This caused the remaining built frames to be reduced with the resulting packet not able to fill the whole MTU. The impact of this bug should be minimal. It is only present when multiple frames are encoded in a single packet after a STREAM. However in this case datagrams built are smaller than expected, which is suboptimal for bandwidth. This should be backported up to 2.6.	2024-06-10 10:24:02 +02:00
William Lallemand	711338e1ce	BUG/MEDIUM: ssl: bad auth selection with TLS1.2 and WolfSSL The ClientHello callback for WolfSSL introduced in haproxy 2.9, seems not to behave correctly with TLSv1.2. In TLSv1.2, this is the cipher that is used to choose the authentication algorithm (ECDSA or RSA), however an SSL client can send a signature algorithm. In TLSv1.3, the authentication is not part of the ciphersuites, and is selected using the signature algorithm. The mistake in the code is that the signature algorithms in TLSv1.2 are overwriting the auth that was selected using the ciphers. This must be backported as far as 2.9.	2024-06-07 15:47:15 +02:00
William Lallemand	93cc23a355	BUG/MEDIUM: ssl: wrong priority when limiting ECDSA ciphers in ECDSA+RSA configuration The ClientHello Callback which is used for certificate selection uses both the signature algorithms and the ciphers sent by the client. However, when a client is announcing both ECDSA and RSA capabilities with ECDSA ciphers that are not available on haproxy side and RSA ciphers that are compatible, the ECDSA certificate will still be used but this will result in a "no shared cipher" error, instead of a fallback on the RSA certificate. For example, a client could send 'ECDHE-ECDSA-AES128-CCM:ECDHE-RSA-AES256-SHA' and HAProxy could be configured with only 'ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA'. This patch fixes the issue by validating that at least one ECDSA cipher is available on both sides before choosing the ECDSA certificate. This must be backported on all stable versions.	2024-06-05 15:33:36 +02:00
Christopher Faulet	6697e87ae5	MINOR: mux-quic: Don't send an empty H3 DATA frame during zero-copy forwarding It may only happen when there is no data to forward but a last stream frame must be sent with the FIN bit. It is not invalid, but it is useless to send an empty H3 DATA frame in that case.	2024-06-05 07:28:10 +02:00
Christopher Faulet	9748df29ff	BUG/MEDIUM: mux-quic: Don't unblock zero-copy fwding if blocked during nego The previous fix (792a645ec2 ["BUG/MEDIUM: mux-quic: Unblock zero-copy forwarding if the txbuf can be released"]) introduced a regression. The zero-copy data forwarding must only be unblocked if it was blocked by the producer, after a successful negotiation. It is important because during a negotiation, the consumer may be blocked for another reason. Because of the flow control for instance. In that case, there is not necessarily a TX buffer. And it is unexpected to try to release an unallocated TX buf. In addition, the same may happen while a TX buf is still in-use. In that case, it must also not be released. So testing the TX buffer is not the right solution. To fix the issue, a new IOBUF flag was added (IOBUF_FL_FF_WANT_ROOM). It must be set by the producer if it is blocked after a successful negotiation because it needs more room. In that case, we know a buffer was provided by the consumer. In done_fastfwd() callback function, it is then possible to safely unblock the zero-copy data forwarding if this flag is set. This patch must be backported to 3.0 with the commit above.	2024-06-05 07:28:10 +02:00
Aurelien DARRAGON	2bde0d64dd	CLEANUP: hlua: simplify ambiguous lua_insert() usage in hlua_ctx_resume() 'lua_insert(lua->T, -lua_gettop(lua->T))' is actually used to rotate the top value with the bottom one, thus the code was overkill and the comment was actually misleading, let's fix that by using explicit equivalent form (absolute index). It may be backported with 5508db9a2 ("BUG/MINOR: hlua: fix unsafe lua_tostring() usage with empty stack") to all stable versions to ease code maintenance.	2024-06-04 16:31:38 +02:00
Aurelien DARRAGON	755c2daf0f	BUG/MINOR: hlua: fix leak in hlua_ckch_set() error path In hlua_ckch_commit_yield() and hlua_ckch_set(), when an error occurs, we enter the error path and try to raise an error from the <err> msg pointer which must be freed afterwards. However, the fact that luaL_error() never returns was overlooked, because of that <err> msg is never freed in such case. To fix the issue, let's use hlua_pushfstring_safe() helper to push the err on the lua stack and then free it before throwing the error using lua_error(). It should be backported up to 2.6 with 30fcca18 ("MINOR: ssl/lua: CertCache.set() allows to update an SSL certificate file")	2024-06-04 16:31:30 +02:00
Aurelien DARRAGON	2be94c008e	CLEANUP: hlua: get rid of hlua_traceback() security checks Thanks to the previous commit, we may now assume that hlua_traceback() won't LJMP, so it's safe to use it from unprotected environment without any precautions.	2024-06-04 16:31:22 +02:00
Aurelien DARRAGON	365ee28510	BUG/MINOR: hlua: prevent LJMP in hlua_traceback() Function is often used on error paths where no precaution is taken against LJMP. Since the function is used on error paths (which include out-of-memory error paths) the function lua_getinfo() could also raise a memory exception, causing the process to crash or improper error handling if the caller isn't prepared against that eventuality. Since the function is only used on rare events (error handling) and is lacking the __LJMP prototype prefix, let's make it safe by protecting the lua_getinfo() call so that hlua_traceback() callers may use it safely now (the function will always succeed, output will be truncated in case of error). This could be backported to all stable versions.	2024-06-04 16:31:15 +02:00
Aurelien DARRAGON	f0e5b825cf	BUG/MINOR: hlua: fix unsafe hlua_pusherror() usage Following previous commit's logic: hlua_pusherror() is mainly used from cleanup paths where the caller isn't protected against LJMPs. Caller was tempted to think that the function was safe because func prototype was lacking the __LJMP prefix. Let's make the function really LJMP-safe by wrapping the sensitive calls under lua_pcall(). This may be backported to all stable versions.	2024-06-04 16:31:09 +02:00
Aurelien DARRAGON	c0a3c1281f	BUG/MINOR: hlua: don't use lua_pushfstring() when we don't expect LJMP lua_pushfstring() is used in multiple cleanup paths (upon error) to push the error message that will be raised by lua_error(). However this is often done from an unprotected environment, or in the middle of a cleanup sequence, thus we don't want the function to LJMP! (it may cause various issues ranging from memory leaks to crashing the process..) Hopefully this has very few chances of happening but since the use of lua_pushfstring() is limited to error reporting here, it's ok to use our own hlua_pushfstring_safe() implementation with a little overhead to ensure that the function will never LJMP. This could be backported to all stable versions.	2024-06-04 16:31:01 +02:00
Aurelien DARRAGON	6e484996c6	CLEANUP: hlua: use hlua_pusherror() where relevant In hlua_map_new(), when error occurs we use a combination of luaL_where, lua_pushfstring and lua_concat to build the error string before calling lua_error(). It turns out that we already have the hlua_pusherror() macro which is exactly made for that purpose so let's use it. It could be backported to all stable versions to ease code maintenance.	2024-06-04 16:30:55 +02:00
Amaury Denoyelle	f7ae84e7d1	BUG/MINOR: quic: prevent crash on qc_kill_conn() Ensure idle_timer task is allocated in qc_kill_conn() before waking it up. It can be NULL if idle timer has already fired but MUX layer is still present, which prevents immediate quic_conn release. qc_kill_conn() is only used on send() syscall fatal error to notify upper layer of an error and close the whole connection asap. This crash occurrence is pretty rare as it relies on timing issues. It happens only if idle timer occurs before the MUX release (a bigger client timeout is thus required) and any send() syscall detected error. For now, it was only reproduced using GDB to interrupt haproxy longer than the idle timeout. This should be backported up to 2.6.	2024-06-04 14:59:24 +02:00
Christopher Faulet	792a645ec2	BUG/MEDIUM: mux-quic: Unblock zero-copy forwarding if the txbuf can be released In done_fastfwd() callback function, if nothing was forwarded while the SD is blocked, it means there is not enough space in the buffer to proceed. It may be because there are data to be sent. But it may also be data already sent waiting for an ack. In this case, there is no data to be sent by the mux. So the quic stream is not woken up when data are finally removed from the buffer. The data forwarding can thus be stuck. This happens when the stats page is requested in QUIC/H3. Only applets are affected by this issue and only with the QUIC multiplexer because it is the only mux with already sent data in the TX buf. To fix the issue, the idea is to release the txbuf if possible and then unblock the SD to perform a new zero-copy data forwarding attempt. Doing so, and thanks to the previous patch ("MEDIUM: applet: Be able to unblock zero-copy data forwarding from done_fastfwd"), the applet will be woken up. This patch should fix the issue #2584. It must be backported to 3.0.	2024-06-04 14:23:40 +02:00
Christopher Faulet	d2a2014f15	MEDIUM: stconn: Be able to unblock zero-copy data forwarding from done_fastfwd This part is only experienced by applet. When an applet tries to forward data via an iobuf, it may decide to block for any reason even if there is free space in the buffer. For instance, the stats applet doesn't produce data if the buffer is almost full. However, in this case, it could be good to let the consumer decide a new attempt is possible because more space was made. So, if IOBUF_FL_FF_BLOCKED flag is removed by the consumer when done_fastfwd() callback function is called, the SE_FL_WANT_ROOM flag is removed on the producer sedesc. It is only done for applets. And thanks to this change, the applet can be woken up for a new attempt. This patch is required for a fix on the QUIC multiplexer.	2024-06-04 14:23:40 +02:00
Christopher Faulet	7c84ee71f7	BUG/MEDIUM: h1-htx: Don't state interim responses are bodyless Interim responses are by definition bodyless. But we must not set the corresponding HTX start-line flag, because the start-line of the final response is still expected. Setting the flag above too early may lead the multiplexer on the sending side to consider the message is finished after the headers of the interim message. It happens with the H2 multiplexer on frontend side if a "100-Continue" is received from the server. The interim response is sent and HTX_SL_F_BODYLESS_RESP flag is evaluated. Then, the headers of the final response are sent with ES flag, because HTX_SL_F_BODYLESS_RESP flag was seen too early, leading to a protocol error if the response has a body. Thanks to grembo for this analysis. This patch should fix the issue #2587. It must be backported as far as 2.9.	2024-06-04 14:23:40 +02:00
Ilia Shipitsin	1ef6cdcd26	CI: FreeBSD: upgrade image, packages FreeBSD-13.2 was removed from cirrus-ci, let's upgrade to 14.0, also, pcre is EOL, let's switch to pcre2. lua is updated to 5.4	2024-06-04 11:19:00 +02:00
Aurelien DARRAGON	a63f2cde94	CLEANUP: hlua: fix CertCache class comment CLASS_CERTCACHE is used to declare CertCache global object, not Regex one. This copy-paste typo was introduced in 30fcca18 ("MINOR: ssl/lua: CertCache.set() allows to update an SSL certificate file")	2024-06-03 17:00:06 +02:00
Aurelien DARRAGON	4f906a9c38	BUG/MINOR: hlua: use CertCache.set() from various hlua contexts Using CertCache.set() from init context wasn't explicitly supported and caused the process to crash: crash.lua: core.register_init(function() CertCache.set{filename="reg-tests/ssl/set_cafile_client.pem", ocsp=""} end) crash.conf: global lua-load crash.lua listen front bind localhost:9090 ssl crt reg-tests/ssl/set_cafile_client.pem ca-file reg-tests/ssl/set_cafile_interCA1.crt verify none ./haproxy -f crash.conf [NOTICE] (267993) : haproxy version is 3.0-dev2-640ff6-910 [NOTICE] (267993) : path to executable is ./haproxy [WARNING] (267993) : config : missing timeouts for proxy 'front'. \| While not properly invalid, you will certainly encounter various problems \| with such a configuration. To fix this, please ensure that all following \| timeouts are set to a non-zero value: 'client', 'connect', 'server'. [1] 267993 segmentation fault (core dumped) ./haproxy -f crash.conf This is because in hlua_ckch_set/hlua_ckch_commit_yield, we always consider that we're being called from a yield-capable runtime context. As such, hlua_gethlua() is never checked for NULL and we systematically try to wake hlua->task and yield every 10 instances. In fact, if we're called from the body or init context (that is, during haproxy startup), hlua_gethlua() will return NULL, and in this case we shouldn't care about yielding because it is ok to commit all instances at once since haproxy is still starting up. Also, when calling CertCache.set() from a non-yield capable runtime context (such as hlua fetch context), we kept doing as if the yield succeeded, resulting in unexpected function termination (operation would be aborted and the CertCache lock wouldn't be released). Instead, now we explicitly state in the doc that CertCache.set() cannot be used from a non-yield capable runtime context, and we raise a runtime error if it is used that way. 
These bugs were discovered by reading the code when trying to address Svace report documented by @Bbulatov GH #2586. It should be backported up to 2.6 with 30fcca18 ("MINOR: ssl/lua: CertCache.set() allows to update an SSL certificate file")	2024-06-03 17:00:00 +02:00
Aurelien DARRAGON	8860c22c00	MINOR: stktable: avoid ambiguous stktable_data_ptr() usage in cli_io_handler_table() As reported by @Bbulatov in GH #2586, stktable_data_ptr() return value is used without checking it isn't NULL first, which may happen if the given type is invalid or not stored in the table. However, since date_type is set by table_prepare_data_request() right before cli_io_handler_table() is invoked, date_type is not expected to be invalid: table_prepare_data_request() normally checked that the type is stored inside the table. Thus stktable_data_ptr() should not be failing at this point, so we add a BUG_ON() to indicate that.	2024-06-03 16:59:54 +02:00
William Lallemand	dc8a2c7f43	DOC: change the link to the FreeBSD CI in README.md Change the link to the FreeBSD CI status badge to use the cirrus.com jobs list.	2024-06-03 15:21:29 +02:00
William Lallemand	45cac52212	DOC: add the FreeBSD status badge to README.md Add the FreeBSD status badge that comes from the Cirrus CI in the README.md	2024-06-03 15:14:37 +02:00
Ilia Shipitsin	ab23d7eb69	CI: speedup apt package install we are fine to skip some repos like languages and translations. this drops number of repos twice	2024-06-03 11:59:07 +02:00
William Lallemand	c79c312142	DOC: configuration: add an example for keywords from crt-store In ticket #785, people are still confused about how to use the crt-store load parameters in a crt-list. This patch adds an example. This must be backported in 3.0	2024-06-03 11:02:23 +02:00
Willy Tarreau	ba958fb230	BUG/MINOR: tools: fix possible null-deref in env_expand() on out-of-memory In GH issue #2586 @Bbulatov reported a theoretical null-deref in env_expand() in case there's no memory anymore to expand an environment variable. The function should return NULL in this case so that the only caller (str2sa_range) sees it. In practice it may only happen during boot thus is harmless but better fix it since it's easy. This can be backported to all versions where this applies.	2024-05-31 18:55:36 +02:00
Willy Tarreau	8a7afb6964	BUG/MINOR: tcpcheck: report correct error in tcp-check rule parser When parsing tcp-check expect-header, a copy-paste error in the error message causes the name of the header to be reported as the invalid format string instead of its value. This is really harmless but should be backported to all versions to help users understand the cause of the problem when this happens. This was reported in GH issue #2586 by @Bbulatov.	2024-05-31 18:37:56 +02:00
Willy Tarreau	d8194fab82	BUG/MINOR: cfgparse: remove the correct option on httpcheck send-state warning In GH issue #2586 @Bbulatov reported a bug where the http-check send-state flag is removed from options instead of options2 when http-check is disabled. It only has an effect when this option is set and http-check disabled, where it displays a warning indicating this will be ignored. The option removed instead is srvtcpka when this happens. It's likely that both options being so minor, nobody ever faced it. This can be backported to all versions.	2024-05-31 18:30:16 +02:00
William Lallemand	f8418d3ade	ADMIN: acme.sh: remove the old acme.sh code Remove the acme.sh script since it was merged in https://github.com/acmesh-official/acme.sh/pull/4581 So people don't try to download a script which is not up to date with the current acme.sh master.	2024-05-31 13:37:47 +02:00
Ilia Shipitsin	f3e6dfdc92	CI: VTest: accelerate package install a bit let's check and install only package is required	2024-05-30 17:04:08 +02:00
William Lallemand	485b206f61	DOC: replace the README by a markdown version This patch removes the old README file and replaces it with a more modern markdown version which allows clickable links on the github page. It also adds some of the Github Actions workflow Status. This patch includes the HAProxy png in the doc directory.	2024-05-30 13:53:46 +02:00
Ilia Shipitsin	09db70d021	CI: use USE_PCRE2 instead of USE_PCRE USE_PCRE2 is recommended, I guess USE_PCRE is left unintentionally	2024-05-29 22:37:26 +02:00
Ilia Shipitsin	11c088e203	CI: switch to lua 5.4 current release is 5.4, let's switch to it	2024-05-29 22:37:26 +02:00
Ilia Shipitsin	01c213a4bb	CI: use "--no-install-recommends" for apt-get this reduces number of packages installed by 1	2024-05-29 22:37:26 +02:00
Tim Duesterhus	e349159a34	REGTESTS: Remove REQUIRE_VERSION=2.2 from all tests HAProxy 2.2 is the lowest supported version, thus this always matches. see 7aff1bf6b90caadfa95f6b43b526275191991d6f	2024-05-29 22:36:15 +02:00
Tim Duesterhus	10418b6b5a	REGTESTS: Remove REQUIRE_VERSION=2.1 from all tests HAProxy 2.2 is the lowest supported version, thus this always matches. see 7aff1bf6b90caadfa95f6b43b526275191991d6f	2024-05-29 22:36:15 +02:00
Willy Tarreau	1eb0f22ee1	[RELEASE] Released version 3.1-dev0 Released version 3.1-dev0 with the following main changes : - MINOR: version: mention that it's development again	2024-05-29 15:00:02 +02:00
Willy Tarreau	555772e961	MINOR: version: mention that it's development again This essentially reverts 2e42a19cde.	2024-05-29 14:59:19 +02:00
Willy Tarreau	5590ada473	[RELEASE] Released version 3.0.0 Released version 3.0.0 with the following main changes : - MINOR: sample: implement the uptime sample fetch - CI: scripts: fix build of vtest regarding option -C - CI: scripts: build vtest using multiple CPUs - MINOR: log: rename 'log-format tag' to 'log-format alias' - DOC: config: document logformat item naming and typecasting features - BUILD: makefile: yearly reordering of objects by build time - BUILD: fd: errno is also needed without poll() - DOC: config: fix two typos "RST_STEAM" vs "RST_STREAM" - DOC: config: refer to the non-deprecated keywords in ocsp-update on/off - DOC: streamline http-reuse and connection naming definition - REGTESTS: complete http-reuse test with pool-conn-name - DOC: config: add %ID logformat alias alternative - CLEANUP: ssl/ocsp: readable ifdef in ssl_sock_load_ocsp - BUG/MINOR: ssl/ocsp: init callback func ptr as NULL - CLEANUP: ssl_sock: move dirty openssl-1.0.2 wrapper to openssl-compat - BUG/MINOR: activity: fix Delta_calls and Delta_bytes count - CI: github: upgrade the WolfSSL job to 5.7.0 - DOC: install: update quick build reminders with some missing options - DOC: install: update the range of tested openssl version to cover 3.3 - DEV: patchbot: prepare for new version 3.1-dev - MINOR: version: mention that it's 3.0 LTS now.	2024-05-29 14:43:38 +02:00
Willy Tarreau	2e42a19cde	MINOR: version: mention that it's 3.0 LTS now. The version will be maintained up to around Q2 2029. Let's also update the INSTALL file to mention this.	2024-05-29 14:40:26 +02:00
Willy Tarreau	bb7e62b98a	DEV: patchbot: prepare for new version 3.1-dev The bot will now load the prompt for the upcoming 3.1 version so we have to rename the files and update their contents to match the current version.	2024-05-29 14:38:21 +02:00
Willy Tarreau	8452a3f7c9	DOC: install: update the range of tested openssl version to cover 3.3 OpenSSL 3.3 is known to work since it's tested on the CI, so let's add it to the list of known good versions.	2024-05-29 10:23:59 +02:00
Willy Tarreau	2a949be18d	DOC: install: update quick build reminders with some missing options The quick build reminders claimed to present "all options" but were still missing QUIC. It was also the moment to split FreeBSD and OpenBSD apart since the latter uses LibreSSL and does not require the openssl compatibility wrapper. We also replace the hard-coded number of cpus for the parallel build, by the real number reported by the system.	2024-05-29 08:43:01 +02:00
William Lallemand	40cd5cc0e2	CI: github: upgrade the WolfSSL job to 5.7.0 WolfSSL 5.7.0 was released in March 2024, let's upgrade our CI job to this version.	2024-05-28 19:26:52 +02:00
Valentine Krasnobaeva	d5e43caaf5	BUG/MINOR: activity: fix Delta_calls and Delta_bytes count Thanks to the commit 5714aff4a6bf "DEBUG: pool: store the memprof bin on alloc() and update it on free()", the number of memory allocations and memory "frees" is now shown on the same line, corresponding to the caller name. This is very convenient for debugging memory leaks (haproxy should run with the -dMcaller option). The implicit drawback of this solution is that we count the same free_calls and free_tot (bytes) values twice in cli_io_handler_show_profiling() when we calculate tot_free_calls and tot_free_bytes, by adding them to these totals for the p_alloc, malloc and calloc allocator types. See the details about why this happens in the __pool_free() implementation and also in the commit message for 5714aff4a6bf. This double addition of free counters skews 'Delta_calls' and 'Delta_bytes'; sometimes we even noticed that they show negative values. The same problem affected the calculation of the average allocated buffer size for lines where we show the number of allocated and freed bytes simultaneously.	2024-05-28 19:25:08 +02:00
Willy Tarreau	decb7c90df	CLEANUP: ssl_sock: move dirty openssl-1.0.2 wrapper to openssl-compat Valentine noticed this ugly SSL_CTX_get_tlsext_status_cb() macro definition inside ssl_sock.c that is dedicated to openssl-1.0.2 only. It would be better placed in openssl-compat.h, which is what this patch does. It also addresses a missing pair of parenthesis and removes an invalid extra semicolon.	2024-05-28 19:17:57 +02:00
Valentine Krasnobaeva	84380965a5	BUG/MINOR: ssl/ocsp: init callback func ptr as NULL In ssl_sock_load_ocsp() it is better to initialize the local scope variable 'callback' function pointer as NULL when we declare it. According to the SSL_CTX_get_tlsext_status_cb() API, we will then provide a pointer to this 'on stack' variable in order to check whether the callback was already set before: OpenSSL 1.x.x and 3.x.x: long SSL_CTX_get_tlsext_status_cb(SSL_CTX *ctx, int (**callback)(SSL *, void *)); long SSL_CTX_set_tlsext_status_cb(SSL_CTX *ctx, int (*callback)(SSL *, void *)); WolfSSL 5.7.0: typedef int(*tlsextStatusCb)(WOLFSSL* ssl, void*); WOLFSSL_API int wolfSSL_CTX_get_tlsext_status_cb(WOLFSSL_CTX* ctx, tlsextStatusCb* cb); WOLFSSL_API int wolfSSL_CTX_set_tlsext_status_cb(WOLFSSL_CTX* ctx, tlsextStatusCb cb); When this func ptr variable stays uninitialized, haproxy compiled with ASAN crashes in ssl_sock_load_ocsp(): ./haproxy -d -f haproxy.cfg ... AddressSanitizer:DEADLYSIGNAL ================================================================= ==114919==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000008 (pc 0x5eab8951bb32 bp 0x7ffcdd6d8410 sp 0x7ffcdd6d82e0 T0) ==114919==The signal is caused by a READ memory access. ==114919==Hint: address points to the zero page. #0 0x5eab8951bb32 in ssl_sock_load_ocsp /home/vk/projects/haproxy/src/ssl_sock.c:1248:22 #1 0x5eab89510d65 in ssl_sock_put_ckch_into_ctx /home/vk/projects/haproxy/src/ssl_sock.c:3389:6 ... This happens because the callback variable is allocated on the stack. As it is not explicitly initialized, it may contain some garbage value at runtime, due to the linked crypto library update or recompilation. So, following the ssl_sock_load_ocsp code, SSL_CTX_get_tlsext_status_cb() may fail, callback will still contain its initial garbage value, and the 'if (!callback) {...' test will put us on the wrong path, accessing some ocsp_cbk_arg properties via its pointer, which won't be set, so we end up with a segmentation fault. 
Must be backported to all stable versions. Not all versions have the ifdef; the previous cleanup patch is useful starting from version 2.7.	2024-05-28 18:14:26 +02:00
Valentine Krasnobaeva	fb7b46d267	CLEANUP: ssl/ocsp: readable ifdef in ssl_sock_load_ocsp Due to the support of different TLS/SSL libraries and their different versions, sometimes we are forced to use different internal typedefs and callback functions. We strive to avoid this, but from time to time "#ifdef... #endif" becomes inevitable. In particular, in ssl_sock_load_ocsp() we define a 'callback' variable, which will contain a function pointer to our OCSP stapling callback, assigned further via SSL_CTX_set_tlsext_status_cb() to the internal SSL context struct in a linked crypto library. If this linked crypto library is OpenSSL 1.x.x/3.x.x, for setting and getting this callback we have the following API signatures (see doc/man3/SSL_CTX_set_tlsext_status_cb.pod): long SSL_CTX_get_tlsext_status_cb(SSL_CTX *ctx, int (**callback)(SSL *, void *)); long SSL_CTX_set_tlsext_status_cb(SSL_CTX *ctx, int (*callback)(SSL *, void *)); If we are using WolfSSL, the same APIs expect the tlsextStatusCb function prototype, provided via the typedef below (see wolfssl/wolfssl/ssl.h): typedef int(*tlsextStatusCb)(WOLFSSL* ssl, void*); WOLFSSL_API int wolfSSL_CTX_get_tlsext_status_cb(WOLFSSL_CTX* ctx, tlsextStatusCb* cb); WOLFSSL_API int wolfSSL_CTX_set_tlsext_status_cb(WOLFSSL_CTX* ctx, tlsextStatusCb cb); It seems that in OpenSSL < 1.0.0 there was no support for the OCSP extension, so there is no need to set this callback. Let's avoid #ifndef... #endif for this 'callback' variable definition to keep things clear. #ifndef... #endif is usually less readable than a straightforward "#ifdef... #endif".	2024-05-28 18:00:44 +02:00
Aurelien DARRAGON	f9740230fc	DOC: config: add %ID logformat alias alternative unique-id sample fetch may be used instead of %ID alias but it wasn't mentioned explicitly in the doc.	2024-05-28 15:45:03 +02:00
Amaury Denoyelle	b0e1f77fea	REGTESTS: complete http-reuse test with pool-conn-name Add new test cases in http_reuse_conn_hash vtest. Ensure new server parameter "pool-conn-name" is used as expected for idle connection name, both alone and mixed with a SNI.	2024-05-28 15:00:54 +02:00
Amaury Denoyelle	8c09c7f39f	DOC: streamline http-reuse and connection naming definition With the introduction of "pool-conn-name", documentation related to http-reuse was rendered more complex than already, notably with multiple cross-references between "pool-conn-name" and "sni" server keywords. Took the opportunity to improve all http-reuse related documentation. First, "http-reuse" keyword general purpose has been greatly expanded and reordered. Then, "pool-conn-name" and "sni" have been clarified, in particular the relation between them, with the foremost being an advanced usage to the default SSL SNI case in the context of http-reuse. Also update attach-srv rule documentation as its name parameter is directly linked to both "pool-conn-name" and "sni".	2024-05-28 13:58:08 +02:00
Willy Tarreau	652a6f18b2	DOC: config: refer to the non-deprecated keywords in ocsp-update on/off The doc for "ocsp-update [ off \| on ]" was still referring to "tune.ssl.ocsp-update." instead of "ocsp-update.". No backport needed.	2024-05-27 20:13:42 +02:00
Willy Tarreau	2ed3531619	DOC: config: fix two typos "RST_STEAM" vs "RST_STREAM" These were added in 3.0-dev11 by commit 068ce2d5d2 ("MINOR: stconn: Add samples to retrieve about stream aborts"), no backport needed.	2024-05-27 19:51:19 +02:00
Willy Tarreau	725fa0ecd2	BUILD: fd: errno is also needed without poll() When building without USE_POLL, fd.c fails on errno because that one is only included when USE_POLL is set. Let's move it outside of the ifdef.	2024-05-27 19:14:14 +02:00
Willy Tarreau	35e9826c13	BUILD: makefile: yearly reordering of objects by build time Some large files have been split since 2.9 (e.g. stats) and build times have moved and become less smooth, causing a less even parallel build. As usual, a small reordering cleans all this up. The effect was less visible than previous years though.	2024-05-27 19:14:14 +02:00
Aurelien DARRAGON	141bc5ba0d	DOC: config: document logformat item naming and typecasting features The ability to give a name to a logformat_node (known as logformat item in the documentation) implemented in 2ed6068f2a ("MINOR: log: custom name for logformat node") wasn't documented. The same goes for the ability to force the logformat_node's output type to a specific type implemented in 1448478d62 ("MINOR: log: explicit typecasting for logformat nodes") Let's quickly describe such new usages at the start of the custom log format section.	2024-05-27 17:04:16 +02:00
Aurelien DARRAGON	435a9da267	MINOR: log: rename 'log-format tag' to 'log-format alias' In 2.9 we started to introduce an ambiguity in the documentation by referring to historical log-format variables ('%var') as log-format tags in 739c4e5b1e ("MINOR: sample: accept_date / request_date return %Ts / %tr timestamp values") and 454c372b60 ("DOC: configuration: add sample fetches for timing events"). In fact, we've had this confusion between log-format tag and log-format var for more than 10 years now, but in 2.9 it was the first time the confusion was exposed in the documentation. Indeed, both 'log-format variable' and 'log-format tag' actually refer to the same feature (that is: '%B' and friends that can be used for direct access to some log-oriented predefined fetches instead of using %[expr] with generic sample expressions). This feature was first implemented in 723b73ad75 ("MINOR: config: Parse the string of the log-format config keyword") and later documented in 4894040fa ("DOC: log-format documentation"). At that time, it was clear that we used to name it 'log-format variable'. But later the same year, 'log-format tag' naming started to appear in some commit messages (while still referring to the same feature), for instance with ffc3fcd6d ("MEDIUM: log: report SSL ciphers and version in logs using logformat %sslc/%sslv"). Unfortunately in 2.9 when we added (and documented) new log-format variables we officially started drifting to the misleading 'log-format tag' naming (perhaps because it was the most recent naming found for this feature in git log history, or because the confusion has always been there) Even worse, in 3.0 this confusion led us to rename all 'var' occurrences to 'tag' in log-format related code to unify the code with the doc. 
Hopefully William quickly noticed that we made a mistake there, but instead of reverting to historical naming (log-format variable), it was decided that we must use a different name that is less confusing than 'tags' or 'variables' (tags and variables are keywords that are already used to designate other features in the code and that are not very explicit under log-format context today). Now we refer to '%B' and friends as a logformat alias, which is essentially a handy way to print some log oriented information in the log string instead of leveraging '%[expr]' with generic sample expressions made of fetches and converters. Of course, there are some subtleties, such as a few log-format aliases that still don't have a sample fetch equivalent for historical reasons, and some aliases that may be a little faster than their generic sample expression equivalents because most aliases are pretty much hardcoded in the log building function. But in general logformat aliases should be simply considered as an alternative to using expressions (with '%[expr]'). Also, under log-format context, when we want to refer to either an alias ('%alias') or an expression ('%[expr]'), we should use the generic term 'logformat item', which in fact designates a single item within the logformat string provided by the user. Indeed, a logformat item (whether it is an alias or an expression) always starts with '%' and may accept optional flags / arguments. Both the code and the documentation were updated in that sense, hopefully this will clarify things and prevent future confusions.	2024-05-27 17:03:48 +02:00
Willy Tarreau	7e943cdf27	CI: scripts: build vtest using multiple CPUs Now that vtest supports make -j, let's use it to save a bit of time (the build time is ~6s per test by default).	2024-05-27 12:15:50 +02:00
Willy Tarreau	01843c47a1	CI: scripts: fix build of vtest regarding option -C On Linux, GNU make emits "w" at the beginning of the MAKEFLAGS variable if -C is passed, which happens since vtest d6d228bcb3. In fact it emits any of the command line flags without the leading '-' in this case. gmake doesn't do that on BSD apparently. It's documented under Options/Recursion in the GNU make doc. There's also MFLAGS that could work but it does not contain the variables definitions. So let's just avoid the -C that we don't really need. This needs to be backported to stable versions.	2024-05-27 12:15:50 +02:00
William Lallemand	0a00302fab	MINOR: sample: implement the uptime sample fetch 'uptime' returns the uptime of the current HAProxy worker in seconds.	2024-05-27 11:06:40 +02:00
Willy Tarreau	f76e73511a	[RELEASE] Released version 3.0-dev13 Released version 3.0-dev13 with the following main changes : - CLEANUP: ssl/cli: remove unused code in dump_crtlist_conf - MINOR: ssl: check parameter in ckch_conf_cmp() - BUG/MINOR: ring: free ring's allocated area not ring's usable area when using maps - DOC: configuration: rework the crt-store load documentation - DEBUG: tools: add vma_set_name() helper - DEBUG: shctx: name shared memory using vma_set_name() - DEBUG: sink: add name hint for memory area used by memory-backed sinks - DEBUG: pollers: add name hint for large memory areas used by pollers - DEBUG: errors: add name hint for startup-logs memory area - DEBUG: fd: add name hint for large memory areas - MEDIUM: ssl: don't load file by discovering them in crt-store - DOC: configuration: update the crt-list documentation - DOC: configuration: add the supported crt-store options in crt-list - BUG/MEDIUM: proto: fix fd leak in <proto>_connect_server - MINOR: sock: set conn->err_code in case of EPERM - BUG/MINOR: http-ana: Don't crush stream termination condition on internal error - MAJOR: spoe: Let the SPOE back into the game - BUG/MINOR: connection: parse PROXY TLV for LOCAL mode - BUG/MINOR: server: free PROXY v2 TLVs on srv drop - MINOR: rhttp: add log on connection allocation failure - BUG/MEDIUM: rhttp: fix preconnect on single-thread - BUG/MINOR: rhttp: prevent listener suspend - BUG/MINOR: rhttp: fix task_wakeup state - MINOR: session: define flag to explicitely release listener on free - MEDIUM: rhttp: create session for active preconnect - MINOR: rhttp: support PROXY emission on preconnect - MINOR: connection: support PROXY v2 TLV emission without stream - MINOR: traces: enumerate the list of levels/verbosities when not found - BUG/MINOR: sock: fix sock_create_server_socket - MINOR: proto: fix coding style - BUG/MAJOR: quic: Crash with TLS_AES_128_CCM_SHA256 (libressl only) - REGTESTS: scripts: allow to change the vtest timeout - 
BUG/MEDIUM: quic_tls: prevent LibreSSL < 4.0 from negotiating CHACHA20_POLY1305 - CI: scripts/build-ssl.sh: loudly fail on unsupported platforms - BUG/MEDIUM: mux-quic: Create sedesc in same time of the QUIC stream - MINOR: mux-quic: Set abort info for SC-less QCS on STOP_SENDING frame - CI: scripts/build-ssl: add a DESTDIR and TMPDIR variable - CI: scripts/buil-ssl: cleanup the boringssl and quictls build - MINOR: config: add thread-hard-limit to set an upper bound to nbthread - BUILD: quic: fix unused variable warning when threads are disabled - BUG/MEDIUM: stick-tables: Fix race with peers when trashing oldest entries - BUG/MEDIUM: stick-tables: Fix race with peers when killing a sticky session - BUG/MEDIUM: stick-tables: make sure never to create two same remote entries - CLEANUP: stick-tables: remove a few unneeded tests for use_wrlock - MINOR: stick-tables: remove the uneeded read lock in stksess_free() - CLEANUP: tools: fix vma_set_name() function comment - DEBUG: tools: add vma_set_name_id() helper - DEBUG: pollers/fd: add thread id suffix to per-thread memory areas name hints - DOC: config: fix aes_gcm_enc() description text - BUILD: trace: fix warning on null dereference - MEDIUM: config: prevent communication with privileged ports - MAJOR: config: prevent QUIC with clients privileged port by default - BUG/MINOR: quic: adjust restriction for stateless reset emission - MINOR: quic: clarify doc for quic_recv() - MINOR: server: generalize sni expr parsing - MINOR: server: define pool-conn-name keyword - MEDIUM: connection: use pool-conn-name instead of sni on reuse - BUG/MINOR: rhttp: initialize session origin after preconnect reversal - BUG/MEDIUM: server/dns: preserve server's port upon resolution timeout or error - BUG/MINOR: http-htx: Support default path during scheme based normalization - BUG/MINOR: server: Don't reset resolver options on a new default-server line - DOC: quic: specify that connection migration is not supported - DOC: config: fix 
incorrect section reference about custom log format - DOC: config: uniformize the naming and description of custom log format args - DOC: config: clarify the fact that custom log format is not just for logging - REGTESTS: acl_cli_spaces: avoid a warning caused by undefined logs	2024-05-24 17:57:29 +02:00
Willy Tarreau	45a187304e	REGTESTS: acl_cli_spaces: avoid a warning caused by undefined logs There's a warning being reported in this reg test in the detailed startup logs because of "log global" and "option httplog" while there's no global section hence no logger. Let's just drop both options since they're not relevant to this test.	2024-05-24 17:50:19 +02:00
Willy Tarreau	0af9bfcbc5	DOC: config: clarify the fact that custom log format is not just for logging The wording in the Custom log format section was still extremely centered on logging, but it's about time to mention that these are usable for other actions as well, otherwise it's very confusing for newcomers who try to define a variable or header. The updated text also reminds about the risks of safe encodings that may (rarely) mangle an output string, and encourages to migrate away from the unquoted definition which is full of backslashes. It would definitely deserve further improvements and refinements.	2024-05-24 17:32:59 +02:00
Willy Tarreau	c02cefce23	DOC: config: uniformize the naming and description of custom log format args A significant number of actions now take arguments that are evaluated as log-format expressions. Some of them are called "fmt", others "string". The description of the argument sometimes just says "the log-format string" or "log format" or "custom log format" etc. Most of them do not mention the section to visit, and section 8.2 speaking about log-format is very centric on logs usage (the primary use case), making all of this very confusing for newcomers. Since section 8.2.6 is titled "Custom log format" and describes the syntax to be used with the "log-format" (and other) directives, let's call this "Custom log format" everywhere and mention section 8.2.6. When the field was called "string", it was also renamed to "fmt". It doesn't seem worth backporting this, unless it applies fine.	2024-05-24 17:32:59 +02:00
Willy Tarreau	474cbcf842	DOC: config: fix incorrect section reference about custom log format Since 2.5 with commit 98b930d043 ("MINOR: ssl: Define a default https log format"), some log-format sections were shifted a bit without having been renumbered, causing 8.2.4 to be referenced as the custom log format while it's in fact 8.2.6. This patch fixes the affected locations. In addition two places mentioned 8.2.6 instead of 8.2.5 for the error log format. This can be backported to 2.6.	2024-05-24 17:32:59 +02:00
Amaury Denoyelle	59b69aafae	DOC: quic: specify that connection migration is not supported Currently haproxy does not support QUIC connection migration. This is advertized to clients on their connections. Document this in the first QUIC related paragraph. This should be backported up to 2.6.	2024-05-24 17:32:37 +02:00
Christopher Faulet	0d7c1bc6ab	BUG/MINOR: server: Don't reset resolver options on a new default-server line When a new "default-server" line is parsed, some resolver options are reset. Thus previously defined default options cannot be inherited. There is no reason to do so. First because other server options are inherited. And then because not all resolver options are reset. It is not consistent. This patch should fix issue #2559. It should be backported to all stable versions.	2024-05-24 16:31:01 +02:00
Christopher Faulet	8d2514e087	BUG/MINOR: http-htx: Support default path during scheme based normalization As stated in RFC3986, for an absolute-form URI, an empty path should be normalized to a path of "/". This is part of scheme based normalization rules. This kind of normalization is already performed for default ports. So we might as well deal with the case of empty path. The associated reg-tests was updated accordingly. This patch should fix the issue #2573. It may be backported as far as 2.4 if necessary.	2024-05-24 16:17:24 +02:00
Aurelien DARRAGON	c16eba8183	BUG/MEDIUM: server/dns: preserve server's port upon resolution timeout or error @boi4 reported in GH #2578 that since 3.0-dev1 for servers with address learned from A/AAAA records after a DNS flap server would be put out of maintenance with proper address but with invalid port (== 0), making it unusable and causing tcp checks to fail: [NOTICE] (1) : Loading success. [WARNING] (8) : Server mybackend/myserver1 is going DOWN for maintenance (DNS refused status). 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue. [ALERT] (8) : backend 'mybackend' has no server available! [WARNING] (8) : mybackend/myserver1: IP changed from '(none)' to '127.0.0.1' by 'myresolver/ns1'. [WARNING] (8) : Server mybackend/myserver1 ('myhost') is UP/READY (resolves again). [WARNING] (8) : Server mybackend/myserver1 administratively READY thanks to valid DNS answer. [WARNING] (8) : Server mybackend/myserver1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue. @boi4 also mentioned that this used to work fine before. Willy suggested that this regression may have been introduced by 64c9c8e ("BUG/MINOR: server/dns: use server_set_inetaddr() to unset srv addr from DNS") Turns out he was right! Indeed, in 64c9c8e we systematically memset the whole server_inetaddr struct (which contains both the requested server's addr and port planned for atomic update) instead of only memsetting the addr part of the structure: except when SRV records are involved (SRV records provide both the address and the port unlike A or AAAA records), we must not reset the server's port upon DNS errors because the port may have been provided at config time and we don't want to lose its value. Big thanks to @boi4 for his well-documented issue that really helped us to pinpoint the bug right on time for the dev-13 release. 
No backport needed (unless 64c9c8e gets backported).	2024-05-24 15:29:48 +02:00
Amaury Denoyelle	98ed11b0c5	BUG/MINOR: rhttp: initialize session origin after preconnect reversal Since the following commit, session is initialized early for rhttp preconnect. 12c40c25a9520fe3365950184fe724a1f4e91d03 MEDIUM: rhttp: create session for active preconnect Session origin member was not set. However, this prevents several session fetches from working as expected. Worse, this caused a regression, as previously the session was created after reversal with the origin member defined. This was reported on the mailing-list by user William Manley, who relies on set-dst. One possible fix would be to set origin on session_new(). However, as this is done before reversal, some session members may be incorrectly initialized, in particular source and destination address. Thus, session origin is only set after reversal is completed. This ensures that session fetches have the same behavior on standard connections and reversible ones. This does not need to be backported.	2024-05-24 14:47:21 +02:00
Amaury Denoyelle	47168e217a	MEDIUM: connection: use pool-conn-name instead of sni on reuse Implement pool-conn-name support for idle connection reuse. It replaces SNI as arbitrary identifier for connections in the idle pool. Thus, every SNI reference in this context has been replaced. The main change occurs in connect_server(), where the pool-conn-name sample fetch is now prehashed to generate the idle connection identifier. SNI is now solely used in the context of SSL for ssl_sock_set_servername().	2024-05-24 14:47:21 +02:00
Amaury Denoyelle	be4f89f2b2	MINOR: server: define pool-conn-name keyword Define a new server keyword pool-conn-name. The purpose of this keyword will be to identify connections inside the idle connections pool, replacing SNI in case SSL is not wanted. This keyword uses a sample expression argument. It thus can reuse existing function parse_srv_expr() for parsing. In the future, it may be necessary to define a keyword variant which uses a logformat for extensibility. This patch only implements parsing. Argument is stored inside new server field <pool_conn_name> and expression is generated in _srv_parse_finalize() into <pool_conn_name_expr>. If pool-conn-name is not set but SNI is, the latter is reused automatically as pool-conn-name via _srv_parse_finalize(). This ensures current reuse behavior remains compatible and idle connection reuse will not mix connections with different SNIs by mistake. Main usage will be for rhttp when SSL is not wanted between the two haproxy instances. Previously, it was possible to use the "sni" keyword even without SSL on a server line, which has a similar effect. However, having a dedicated "pool-conn-name" keyword is deemed clearer. Besides, it would allow for more complex configurations where pool-conn-name and SNI are used in parallel with different values.	2024-05-24 14:36:31 +02:00
Amaury Denoyelle	91001422b4	MINOR: server: generalize sni expr parsing Two functions exist for server sni sample expression parsing. This is confusing so this commit aims at clarifying this. Functions are renamed with the following identifiers. First function is named parse_srv_expr() and can be used during parsing. Besides expression parsing, it ensures sample fetch validity in the context of a server line. Second function is renamed _parse_srv_expr() and is used internally by parse_srv_expr(). It only implements sample parsing without extra checks. It is already used for server instantiation derived from server-template as checks were already performed. Also, it is now used in http-client code as SNI is a fixed string. Finally, both functions are generalized to remove any reference to SNI. This will allow them to be reused to parse other server keywords which use an expression. This will be the case for the future keyword pool-conn-name.	2024-05-24 14:36:31 +02:00
Amaury Denoyelle	b9f67a46a2	MINOR: quic: clarify doc for quic_recv() Just highlight the fact that quic_recv() only receives a single datagram.	2024-05-24 14:36:31 +02:00
Amaury Denoyelle	5764bc50b5	BUG/MINOR: quic: adjust restriction for stateless reset emission Review RFC 9000 and ensure restrictions on Stateless Reset are properly enforced. After careful examination, several changes are introduced. First, redefine the minimal emitted Stateless Reset packet length to 21 bytes (5 random bytes + a token). This is the new default length used in every case, unless the received packet which triggered it is 43 bytes or smaller. Ensure every emitted Stateless Reset packet is at least 1 byte shorter than the received packet which triggered it. No Stateless Reset will be emitted if this falls under the above limit of 21 bytes. Thus this should prevent looping issues. This should be backported up to 2.6.	2024-05-24 14:36:31 +02:00
Amaury Denoyelle	f55748a422	MAJOR: config: prevent QUIC with clients privileged port by default The previous commit introduces a new protection mechanism to forbid communications with clients which use a privileged source port. By default, this mechanism is disabled for every protocol. This patch changes the default value and activates the protection mechanism for the QUIC protocol. This is justified as it is a probable sign of a DNS/NTP amplification attack. This is labelled as major as it can be a breaking change with some network environments.	2024-05-24 14:36:31 +02:00
Amaury Denoyelle	45f40bac4c	MEDIUM: config: prevent communication with privileged ports This commit introduces a new global setting named harden.reject_privileged_ports.{tcp\|quic}. When active, communications with clients which use privileged source ports are forbidden. Such behavior is considered suspicious as it can be used for spoofing or DNS/NTP amplification attacks. The value is configured per transport protocol. For TCP and QUIC, distinct code locations are impacted by this setting. The first one is in sock_accept_conn(), which acts as a filter for all TCP based communications just after accept() returns a new connection. The second one is dedicated to QUIC communication in quic_recv(). In both cases, if a privileged source port is used and the setting is enabled, the received message is silently dropped. By default, the protection is disabled for both protocols. This is to be able to backport it without breaking changes on stable releases. This should be backported as it is an interesting security feature yet relatively simple to implement.	2024-05-24 14:36:31 +02:00
Amaury Denoyelle	4e632545f7	BUILD: trace: fix warning on null dereference Since a recent change on trace, the following compilation warning may occur : src/trace.c: In function ‘trace_parse_cmd’: src/trace.c:865:33: error: potential null pointer dereference [-Werror=null-dereference] 865 \| for (nd = src->decoding; nd->name && nd->desc; nd++) \| ~~~^~~~~~~~~~~~~~~ Fix this by rearranging code path to better highlight that only "quiet" verbosity is allowed if no trace source is specified. This was detected with GCC 14.1.	2024-05-24 14:36:03 +02:00
Willy Tarreau	77c228f04f	DOC: config: fix aes_gcm_enc() description text As reported by Nick Ramirez, it was written "decrypts" instead of "encrypts". No backport needed.	2024-05-24 12:09:25 +02:00
Aurelien DARRAGON	c9af6d5414	DEBUG: pollers/fd: add thread id suffix to per-thread memory areas name hints Willy reported that since abb8412d2 ("DEBUG: pollers: add name hint for large memory areas used by pollers") and 22ec2ad8b ("DEBUG: fd: add name hint for large memory areas") multiple maps with the same name could be found in /proc/<pid>/maps when the haproxy process is started with multiple threads, which can be annoying. In fact this happens because some poller and fd-created memory areas are created for each available thread, and since the naming was done using vma_set_name() with the same <type> and <name> inputs, the resulting name was the same for all threads. Thanks to the previous commit, we now use vma_set_name_id() for naming per-thread memory areas so that a "-id" suffix is appended after the name, where "id" equals 'tid+1' (to match the thread numbering logic found in the config file or in the ha_panic() report), making it easy to identify which haproxy thread owns the map in /proc/<pid>/maps: 7d3b26200000-7d3b26a01000 rw-p 00000000 00:00 0 [anon:ev_poll:poll_events-2] 7d3b26c00000-7d3b27001000 rw-p 00000000 00:00 0 [anon:fd:fd_updt-2] 7d3b27200000-7d3b27a01000 rw-p 00000000 00:00 0 [anon:ev_poll:poll_events-1] 7d3b34200000-7d3b34601000 rw-p 00000000 00:00 0 [anon:fd:fd_updt-1]	2024-05-24 12:07:18 +02:00
Aurelien DARRAGON	9d37c4b989	DEBUG: tools: add vma_set_name_id() helper Just like vma_set_name() from 51a8f134e ("DEBUG: tools: add vma_set_name() helper"), but also takes <id> as parameter to append "-$id" suffix after the name in order to differentiate 2 areas that were named using the same <type> and <name> combination. example, using mmap + MAP_SHARED\|MAP_ANONYMOUS: 7364c4fff000-736508000000 rw-s 00000000 00:01 3540 [anon_shmem:type:name-id] Another example, using mmap + MAP_PRIVATE\|MAP_ANONYMOUS or using glibc/malloc() above MMAP_THRESHOLD: 7364c4fff000-736508000000 rw-s 00000000 00:01 3540 [anon:type:name-id]	2024-05-24 12:07:13 +02:00
Aurelien DARRAGON	23814a44e5	CLEANUP: tools: fix vma_set_name() function comment There was a typo in the example provided in vma_set_name(): maps named using the function will show up as "type:name", not "type.name", updating the comment to reflect the current behavior.	2024-05-24 12:07:07 +02:00
Willy Tarreau	0bda33a3ec	MINOR: stick-tables: remove the uneeded read lock in stksess_free() During changes made in 2.7 by commits 8d3c3336f9 ("MEDIUM: stick-table: make stksess_kill_if_expired() avoid the exclusive lock") and 996f1a5124 ("MEDIUM: stick-table: do not take a lock to update t->current anymore."), the operation was done cautiously one baby step at a time and the final cleanup was not done, as we're keeping a read lock under an atomic dec. Furthermore there's a pool_free() call under that lock, and we try to avoid pool_alloc() and pool_free() under locks for their nasty side effects (e.g. when memory gets recompacted), so let's really drop it now. Note that the performance gain is not really perceptible here, it's essentially for code clarity reasons that this has to be done.	2024-05-24 11:52:57 +02:00
Willy Tarreau	8580f9db20	CLEANUP: stick-tables: remove a few unneeded tests for use_wrlock Due to the code in stktable_touch_with_exp() being the same as in other functions previously made around a loop trying first to upgrade a read lock then to fall back to a direct write lock, there remains a confusing construct with multiple tests on use_wrlock that is obviously zero when tested. Let's remove them since the value is known and the loop does not exist anymore.	2024-05-24 11:52:19 +02:00
Willy Tarreau	77f286e8bc	BUG/MEDIUM: stick-tables: make sure never to create two same remote entries In GH issue #2552, Christian Ruppert reported an increase in crashes with recent 3.0-dev versions, always related with stick-tables and peers. One particularity of his config is that it has a lot of peers. While trying to reproduce, it empirically was found that firing 10 load generators at 10 different haproxy instances tracking a random key among 100k against a table of max 5k entries, on 8 threads and between a total of 50 parallel peers managed to reproduce the crashes in seconds, very often in ebtree deletion or insertion code, but not only. The debugging revealed that the crashes are often caused by a parent node being corrupted while delete/insert tries to update it regarding a recently inserted/removed node, and that that corrupted node had always been proven to be deleted, then immediately freed, so it ought not be visited in the tree from functions enclosed between a pair of lock/unlock. As such the only possibility was that it had experienced unexpected inserts. Also, running with pool integrity checking would 90% of the time cause crashes during allocation based on corrupted contents in the node, likely because it was found at two places in the same tree and still present as a parent of a node being deleted or inserted (hence the __stksess_free and stktable_trash_oldest callers being visible on these items). Indeed the issue is in fact related to the test set (occasionally redundant keys, many peers). What happens is that sometimes, a same key is learned from two different peers. When it is learned for the first time, we end up in stktable_touch_with_exp() in the "else" branch, where the test for existence is made before taking the lock (since commit cfeca3a3a3 ("MEDIUM: stick-table: touch updates under an upgradable read lock") that was merged in 2.9), and from there the entry is added. 
But if one of the threads manages to insert it before the other thread takes the lock, then the second thread will try to insert this node again. And inserting an already inserted node will corrupt the tree (note that we never switched to enforcing a check in insertion code on this due to API history that would break various code parts). Here the solution is simple: recheck leaf_p after getting the lock, to avoid touching anything if the entry has already been inserted in the meantime. Many thanks to Christian Ruppert for testing this and for his invaluable help on this hard-to-trigger issue. This fix needs to be backported to 2.9.	2024-05-24 11:52:11 +02:00
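The fix described in 77f286e8bc is an instance of rechecking a condition once the lock is held. The following is a minimal sketch of that pattern, not haproxy's actual code: a plain pthread mutex stands in for the stick-table lock, an atomic flag stands in for the leaf_p test, a counter stands in for the tree, and all demo_* names are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical entry: <inserted> stands in for "leaf_p != NULL". */
struct demo_entry {
	atomic_bool inserted;
};

/* Hypothetical table: a counter stands in for the tree. */
struct demo_table {
	pthread_mutex_t lock;
	size_t count;
};

/* Links <e> into <t> unless it is already there. Returns true only when
 * this call performed the insertion.
 */
static bool demo_insert_once(struct demo_table *t, struct demo_entry *e)
{
	bool done = false;

	/* Unlocked test: cheap, but two threads may both see "false". */
	if (atomic_load(&e->inserted))
		return false;

	pthread_mutex_lock(&t->lock);
	/* Recheck under the lock: another thread may have won the race. */
	if (!atomic_load(&e->inserted)) {
		t->count++;
		atomic_store(&e->inserted, true);
		done = true;
	}
	pthread_mutex_unlock(&t->lock);
	return done;
}
```

Without the second test, two threads passing the unlocked check would both perform the insertion, which is the double-insert scenario the commit message describes.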
Christopher Faulet	9938fb9c7a	BUG/MEDIUM: stick-tables: Fix race with peers when killing a sticky session When a sticky session is killed, we must be sure no other entity is still referencing it. The session's ref_cnt must be 0. However, there is a race with peers, as described in 21447b1dd4 ("BUG/MAJOR: stick-tables: fix race with peers in entry expiration"). When the update lock is acquired, we must recheck the ref_cnt value. This patch is part of a debugging session about issue #2552. It must be backported to 2.9.	2024-05-24 11:52:11 +02:00
Christopher Faulet	dfd938bad6	BUG/MEDIUM: stick-tables: Fix race with peers when trashing oldest entries It is the same race as the one fixed in process_table_expire() (21447b1dd4 ["BUG/MAJOR: stick-tables: fix race with peers in entry expiration"]). In stktable_trash_oldest(), when the update lock is acquired, we must take care to check the ref_cnt again because some peers may increment it (see the commit above for details). This patch fixes a crash mentioned in 2552#issuecomment-2110532706. It must be backported to 2.9.	2024-05-24 11:52:11 +02:00
Willy Tarreau	51f9f6cfd4	BUILD: quic: fix unused variable warning when threads are disabled The tree variable was introduced in 3.0 by commit dd58dff1e6 ("BUG/MEDIUM: quic: QUIC CID removed from tree without locking") which was marked for backport. The variable is only used for locks. Let's just mark the variable __maybe_unused for when the code is built without threads. The patch above was marked for backport to 2.7 so this should be backported wherever the fix was backported.	2024-05-24 11:51:41 +02:00
Willy Tarreau	381ed2a4dd	MINOR: config: add thread-hard-limit to set an upper bound to nbthread On today's large systems, it's not always desired to run on all threads for light loads, and usually users enforce nbthread to a lower value (e.g. 8). The problem is that this is a fixed value, and moving such configs to smaller machines continues to enforce the value and this becomes extremely unproductive due to having more threads than CPUs. This also happens quite a bit in VMs, containers, or cloud instances of various sizes. This commit introduces the thread-hard-limit setting that allows to only set an upper bound to the number of threads without raising a lower value. This means that using "thread-hard-limit 8" will make sure that no more than 8 threads will be used when available, but it will remain two when run on a dual-core machine.	2024-05-24 09:46:49 +02:00
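The behaviour described for thread-hard-limit amounts to an upper clamp that never raises a lower value. A minimal sketch of that arithmetic, assuming a hypothetical demo_effective_threads() helper and treating a limit of 0 as "no limit" (an assumption of this sketch, not something the commit message states):

```c
/* Hypothetical helper: <detected> is the thread count the process would
 * use on this machine, <hard_limit> is the configured upper bound
 * (0 means "unset" in this sketch).
 */
static int demo_effective_threads(int detected, int hard_limit)
{
	if (hard_limit > 0 && detected > hard_limit)
		return hard_limit;   /* cap large machines */
	return detected;             /* never raise a lower value */
}
```

With a limit of 8, a 64-thread machine would end up with 8 threads while a dual-core machine would keep 2, matching the example in the commit message.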
William Lallemand	9c1fa3e411	CI: scripts/buil-ssl: cleanup the boringssl and quictls build Put the quictls and boringssl build in their own function instead of keeping it in the main part of the script.	2024-05-23 16:54:30 +02:00
William Lallemand	5d73643ca3	CI: scripts/build-ssl: add a DESTDIR and TMPDIR variable Add DESTDIR and TMPDIR variables so the build-ssl.sh script can be used as a generic SSL lib installer outside the CI. The variables are prefixed with BUILDSSL so they don't collide with the makefile ones. Ex: OPENSSL_VERSION=3.2.0 BUILDSSL_DESTDIR=/opt/openssl-3.2.0/ ./scripts/build-ssl.sh WOLFSSL_VERSION=5.7.0 BUILDSSL_DESTDIR=/opt/wolfssl-5.7.0/ ./scripts/build-ssl.sh	2024-05-23 15:34:59 +02:00
Christopher Faulet	d11249f292	MINOR: mux-quic: Set abort info for SC-less QCS on STOP_SENDING frame It is a revert of cc9827bb09 ("BUG/MEDIUM: mux-quic: fix crash on STOP_SENDING received without SD"). This fix was based on a wrong assumption about QUIC streams that may have no stream-endpoint descriptor. However, it must never happen. And this was fixed. So we can now safely revert the commit above. However, it is not a bugfix because, for now, abort info are only used by the upper layer. So it is not a big deal to not set it when there is no SC.	2024-05-23 11:18:19 +02:00
Christopher Faulet	086e51017e	BUG/MEDIUM: mux-quic: Create sedesc in same time of the QUIC stream Recent changes to save abort reason revealed an issue during the QUIC stream creation. Indeed, by design, when a mux stream is created, it must always have a valid stream-endpoint descriptor and it must remain valid till the mux stream destruction. On frontend side, it is the multiplexer responsibility to create it and set it as orphan. On the backend side, the sedesc is provided by the upper layer. It is the sedesc of the back stream-connector. For the QUIC multiplexer, the stream-endpoint descriptor was only created when the stream-connector was created and attached on it. It is unexpected and some bugs may be introduced because there is no valid sedesc on a QUIC stream. And a recent bug was introduced for this reason. This patch must be backported as far as 2.6.	2024-05-23 11:18:06 +02:00
Ilia Shipitsin	4a968d9d27	CI: scripts/build-ssl.sh: loudly fail on unsupported platforms	2024-05-22 16:52:43 +02:00
Willy Tarreau	c7335d55f8	BUG/MEDIUM: quic_tls: prevent LibreSSL < 4.0 from negotiating CHACHA20_POLY1305 As diagnosed in GH issue #2569, there's currently an issue in LibreSSL's CHACHA20 in-place implementation that makes haproxy discard incoming QUIC packets encrypted with it. It's not very easy to observe the issue because: - QUIC recommends that CHACHA20 is used in priority - on x86 with AES-NI, LibreSSL prefers AES-GCM for performance reasons, so the problem is only observed there if a client explicitly forces TLS_CHACHA20_POLY1305_SHA256 only. - discarded packets cause retransmits showing some apparent activity, and the handshake succeeds so it's not easy to analyze from the client which thinks that the server is slow to respond. Thus in practice, on non-x86 machines running LibreSSL, requests made over QUIC freeze for a long time, unless the client explicitly forces algos excluding TLS_CHACHA20_POLY1305_SHA256. That's typically the case by default on modern OpenBSD systems, and was reported in the issue above for an arm64 machine running OpenBSD -current, and was also observed on a mips64 one running OpenBSD 7.5. There is no simple solution to this problem due to some of the protocol's constraints without digging too low into the stack (and risking to break more). Here we're taking a pragmatic approach consisting in making the connection fail hard when TLS_CHACHA20_POLY1305_SHA256 is selected, regardless of the availability of other ciphers. This means that every time a connection would have hung, instead it will fail fast, allowing the client to retry over TLS/TCP. Theo Buehler recommends that we limit this protection to all LibreSSL versions before 4.0 since it's where the fix will be implemented. Older stable versions will just see TLS_CHACHA20_POLY1305_SHA256 disabled, which should be sufficient to make QUIC work there again as well. 
The following config is sufficient to reproduce the issue (on a non-x86 machine, both arm64 & mips64 were confirmed to reproduce it): global limited-quic frontend stats mode http #bind :8181 #bind :8443 ssl crt rsa+dh2048.pem bind quic4@:8443 ssl crt rsa+dh2048.pem alpn h3 timeout client 5s stats uri / And the following commands will trigger the problem on affected LibreSSL versions: curl --tls13-ciphers TLS_CHACHA20_POLY1305_SHA256 -v --http3 -k https://127.0.0.1:8443/ curl -v --http3 -k https://127.0.0.1:8443/ while these ones must work: curl --tls13-ciphers TLS_AES_128_GCM_SHA256 -v --http3 -k https://127.0.0.1:8443/ curl --tls13-ciphers TLS_AES_256_GCM_SHA384 -v --http3 -k https://127.0.0.1:8443/ Normally all of them will work with LibreSSL 4, and only the first one should fail with stable LibreSSL versions higher than 3.9.2. An haproxy version without this workaround will show an unresponsive command after the GET is sent, while a version with the workaround will close the connection on error. On a version with this workaround, if TCP listeners are uncommented, curl will automatically fall back to TCP and attempt the request again over HTTP/2. Finally, on OpenSSL 1.1.1 in compat mode (hence the limited-quic option above) all of them must work. Many thanks to github user @lgv5 for the detailed report, tests, and for spotting the issue, and to @botovq (Theo Buehler) for the quick analysis, patch and help on this workaround. This needs to be backported to versions 2.6 and above.	2024-05-22 16:22:22 +02:00
William Lallemand	0182f6bbb6	REGTESTS: scripts: allow to change the vtest timeout $ make reg-tests VTEST_TIMEOUT=5 Allow to change the timeout of the regtests with the VTEST_TIMEOUT variable. The default value is still 10.	2024-05-22 15:43:53 +02:00
Frederic Lecaille	169fc0b771	BUG/MAJOR: quic: Crash with TLS_AES_128_CCM_SHA256 (libressl only) At least version 3.9.0 of the libressl TLS stack does not behave like other stacks such as quictls, which make SSL_do_handshake() return an error when no cipher could be negotiated, in addition to emitting a TLS alert (0x28). This is the case when TLS_AES_128_CCM_SHA256 is forced as TLS1.3 cipher from the client side. This makes haproxy enter a code path which leads to a crash as follows: [Switching to Thread 0x7ffff76b9640 (LWP 23902)] 0x0000000000487627 in quic_tls_key_update (qc=qc@entry=0x7ffff00371f0) at src/quic_tls.c:910 910 struct quic_kp_trace kp_trace = { (gdb) list 905 { 906 struct quic_tls_ctx *tls_ctx = &qc->ael->tls_ctx; 907 struct quic_tls_secrets *rx = &tls_ctx->rx; 908 struct quic_tls_secrets *tx = &tls_ctx->tx; 909 /* Used only for the traces */ 910 struct quic_kp_trace kp_trace = { 911 .rx_sec = rx->secret, 912 .rx_seclen = rx->secretlen, 913 .tx_sec = tx->secret, 914 .tx_seclen = tx->secretlen, (gdb) p qc $1 = (struct quic_conn *) 0x7ffff00371f0 (gdb) p qc->ael $2 = (struct quic_enc_level *) 0x0 (gdb) bt #0 0x0000000000487627 in quic_tls_key_update (qc=qc@entry=0x7ffff00371f0) at src/quic_tls.c:910 #1 0x000000000049bca9 in qc_ssl_provide_quic_data (len=268, data=<optimized out>, ctx=0x7ffff0047f80, level=<optimized out>, ncbuf=<optimized out>) at src/quic_ssl.c:617 #2 qc_ssl_provide_all_quic_data (qc=qc@entry=0x7ffff00371f0, ctx=0x7ffff0047f80) at src/quic_ssl.c:688 #3 0x00000000004683a7 in quic_conn_io_cb (t=0x7ffff0047f30, context=0x7ffff00371f0, state=<optimized out>) at src/quic_conn.c:760 #4 0x000000000063cd9c in run_tasks_from_lists (budgets=budgets@entry=0x7ffff76961f0) at src/task.c:596 #5 0x000000000063d934 in process_runnable_tasks () at src/task.c:876 #6 0x0000000000600508 in run_poll_loop () at src/haproxy.c:3073 #7 0x0000000000600b67 in run_thread_poll_loop (data=<optimized out>) at src/haproxy.c:3287 #8 0x00007ffff7f6ae45 in start_thread () from
/lib64/libpthread.so.0 #9 0x00007ffff78254af in clone () from /lib64/libc.so.6 When a TLS alert is emitted, haproxy calls quic_set_connection_close() which sets the QUIC_FL_CONN_IMMEDIATE_CLOSE connection flag. This patch tests this flag to make the handshake fail even if SSL_do_handshake() does not return an error. This test is specific to libressl and is never run with other TLS stacks. Thank you to @lgv5 and @botovq for having reported this issue in GH #2569. Must be backported as far as 2.6.	2024-05-22 15:21:55 +02:00
Valentine Krasnobaeva	0e93549d2a	MINOR: proto: fix coding style Remove redundant brackets for 'if' statements that contain only one instruction.	2024-05-22 12:00:11 +02:00
Valentine Krasnobaeva	83ab1479d0	BUG/MINOR: sock: fix sock_create_server_socket Set stream_err value as SF_ERR_NONE, if obtained socket fd has passed all common runtime and configuration related checks. '.connect()' method implementation in higher protocol layers requires Stream Error Flag as the return value. So, at the socket layer, we need to pass to sock_create_server_socket() a variable to set this flag, because syscalls and some socket options checks are convenient to perform at the socket layer.	2024-05-22 11:59:55 +02:00
Willy Tarreau	5b9503ed33	MINOR: traces: enumerate the list of levels/verbosities when not found It's quite frustrating, particularly on the command line, not to have access to the list of available levels and verbosities when one does not exist for a given source, because there's no easy way to find them except by starting without and connecting to the CLI. Let's enumerate the list of supported levels and verbosities when a name does not match. For example: $ ./haproxy -db -f quic-repro.cfg -dt h2:help [NOTICE] (9602) : haproxy version is 3.0-dev12-60496e-27 [NOTICE] (9602) : path to executable is ./haproxy [ALERT] (9602) : -dt: no such trace level 'help', available levels are 'error', 'user', 'proto', 'state', 'data', and 'developer'. $ ./haproxy -db -f quic-repro.cfg -dt h2:user:help [NOTICE] (9604) : haproxy version is 3.0-dev12-60496e-27 [NOTICE] (9604) : path to executable is ./haproxy [ALERT] (9604) : -dt: no such trace verbosity 'help' for source 'h2', available verbosities for this source are: 'quiet', 'clean', 'minimal', 'simple', 'advanced', and 'complete'. The same is done for the CLI where the existing help message is always displayed when entering an invalid verbosity or level.	2024-05-22 11:17:57 +02:00
Amaury Denoyelle	60496e884e	MINOR: connection: support PROXY v2 TLV emission without stream Update API for PROXY protocol header encoding. Previously, it requires stream parameter to be set. Change make_proxy_line() and associated functions to add an extra session parameter. This is useful in context where no stream is instantiated. For example, this is the case for rhttp preconnect. This change allows to extend PROXY v2 TLV encoding. Replace build_logline() which requires a stream instance and call directly sess_build_logline(). Note that stream parameter is kept as it is necessary for unique ID encoding. This change has no functional change for standard connections. However, it is necessary to support TLV encoding on rhttp preconnect.	2024-05-22 10:01:57 +02:00
Amaury Denoyelle	7a81bfc8d2	MINOR: rhttp: support PROXY emission on preconnect Extend preconnect to support PROXY protocol emission. Code is duplicated from connect_server() into new_reverse_conn(). This is necessary to support send-proxy on server line used as rhttp.	2024-05-22 10:01:57 +02:00
Amaury Denoyelle	12c40c25a9	MEDIUM: rhttp: create session for active preconnect Modify rhttp preconnect by instantiating a new session for each connection attempt. Connection is thus linked to a session directly on its instantiation contrary to previously where no session existed until listener_accept(). This patch will allow to extend rhttp usage. Most notably, it will be useful to use various sample fetches on the server line and extend logging capabilities. Changes are minimal, yet consequences are considered not trivial as for the first time a FE connection session is instantiated before listener_accept(). This requires an extra explicit check in session_accept_fd() to not overwrite an existing session. Also, flag SESS_FL_RELEASE_LI is not set immediately as listener counters must not be decremented if connection and its session are freed before reversal is completed, or else listener counters will be invalid. conn_session_free() is used as connection destroy callback to ensure the session will be freed automatically on connection release.	2024-05-22 10:01:57 +02:00
Amaury Denoyelle	45b80aed70	MINOR: session: define flag to explicitely release listener on free When a session is allocated for a FE connection, session_free() is responsible to call listener_release() to decrement listener connection counters and resume listening. Until now, <listener> member of session was tested inside session_free() before invoking listener_release(). To highlight more explicitly the relation between sessions and listeners, introduce a new flag SESS_FL_RELEASE_LI. Only session with such flag set will invoke listener_release() on their cleanup. Flag is set inside session_accept_fd() on success. This patch has no functional change. However, it will be useful to implement session creation for rHTTP preconnect.	2024-05-22 10:01:57 +02:00
Amaury Denoyelle	808daa7cfb	BUG/MINOR: rhttp: fix task_wakeup state TASK_WOKEN_ANY was incorrectly used as argument to task_wakeup() for rhttp preconnect task. This value is used as a flag. Replace it by proper individual values. This is labelled as a bug but it has no known impact. This should be backported up to 2.9.	2024-05-22 10:01:57 +02:00
Amaury Denoyelle	2770ef352e	BUG/MINOR: rhttp: prevent listener suspend Ensure "disable frontend" on a reverse HTTP listener is forbidden by returning -1 on suspend callback. Suspending such a listener has unknown effect and so is not properly implemented for now. This should be backported up to 2.9.	2024-05-22 10:01:57 +02:00
Amaury Denoyelle	ceebb09744	BUG/MEDIUM: rhttp: fix preconnect on single-thread On initialization of a rhttp bind, the first thread available on the listener is selected to execute the first occurrence of the preconnect task. This thread selection was incorrect as it used my_ffsl() which returns a value indexed from 1, contrary to tid which is indexed from 0. This causes the first listener thread to be skipped in favor of the second one. Worse, if haproxy runs in single-thread mode, the calculated thread ID will be invalid and the task will never run, which prevents any preconnect execution. Fix this by subtracting 1 from the result of my_ffsl() to have a value indexed from 0. This must be backported up to 2.9.	2024-05-22 10:01:57 +02:00
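The off-by-one described in ceebb09744 can be illustrated with the POSIX ffs() function, which, like my_ffsl() as described in the commit message, numbers bits from 1. This is a minimal sketch under that assumption; demo_first_thread() is a hypothetical name and not a function from haproxy.

```c
#include <strings.h>   /* ffs() */

/* Hypothetical helper: returns the 0-based ID of the first thread set in
 * <mask>, or -1 when the mask is empty. ffs() stands in for my_ffsl().
 */
static int demo_first_thread(unsigned int mask)
{
	int pos = ffs((int)mask);   /* 1-based bit position, 0 if no bit set */

	return pos - 1;             /* convert to a 0-based thread ID */
}
```

Using the raw 1-based position as a thread ID would select thread 1 for a mask of 0x1, which does not exist in a single-threaded process.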
Amaury Denoyelle	4f80543220	MINOR: rhttp: add log on connection allocation failure Add an error log when new_reverse_conn() fails. This may help to diagnose future issues on reverse HTTP.	2024-05-22 10:01:57 +02:00
Amaury Denoyelle	3efd9f3925	BUG/MINOR: server: free PROXY v2 TLVs on srv drop Dynamically allocated servers PROXY TLVs were not freed on server release. This patch fixes this leak by extending srv_free_params(). Every server line with set-proxy-v2-tlv-fmt keyword is impacted. For static servers, issue is minimal as it will only cause leak on deinit(). However, this could be aggravated when performing multiple removal of dynamic servers. This should be backported up to 2.9.	2024-05-22 10:01:57 +02:00
Amaury Denoyelle	8b72270e95	BUG/MINOR: connection: parse PROXY TLV for LOCAL mode conn_recv_proxy() is responsible to parse PROXY protocol header. For v2 of the protocol, TLVs parsing is implemented. However, this step was only done inside 'PROXY' command label. TLVs were never extracted for 'LOCAL' command mode. Fix this by extracting TLV parsing loop outside of the switch case. Of notable importance, tlv_offset is updated on LOCAL label to point to first TLV location. This bug should be backported up to 2.9 at least. It should even probably be backported to every stable versions. Note however that this code has changed much over time. It may be useful to use option '--ignore-all-space' to have a clearer overview of the git diff.	2024-05-22 10:01:57 +02:00
Christopher Faulet	eb89a7da33	MAJOR: spoe: Let the SPOE back into the game This reverts commits 885e40494c5de6aee841222496d84dc718401fa0 and dff98071888ae06dcec0a6c3a9222e76e893305d. We decided to spend some time to refactor and rationalize the SPOE for the 3.1. Thus there is no reason to still consider it as deprecated for the 3.0. Compatibility between both versions will be maintained. See #2502 for more info.	2024-05-22 09:04:38 +02:00
Christopher Faulet	746e6f8597	BUG/MINOR: http-ana: Don't crush stream termination condition on internal error When internal error is reported from an HTTP analyzer, we must take care to not set the stream termination condition if it was already set. For instance, it happens when a message rewrite fails. In this case SF_ERR_PXCOND is set by the rule. The HTTP analyzer must not crush it with SF_ERR_INTERNAL. The regression was introduced with the commit 0fd25514d6 ("MEDIUM: http-ana: Set termination state before returning haproxy response"). The bug was discovered working in the issue #2568. It must be backported to 2.9.	2024-05-22 09:04:38 +02:00
Valentine Krasnobaeva	39caa20b3c	MINOR: sock: set conn->err_code in case of EPERM To improve the readability of sock_handle_system_err(), let's set explicitly conn->err_code as CO_ER_SOCK_ERR in case of EPERM (could be returned by setns syscall).	2024-05-21 20:14:31 +02:00
Valentine Krasnobaeva	5f713c03be	BUG/MEDIUM: proto: fix fd leak in <proto>_connect_server This fixes the fd leak, introduced in the commit d3fc982cd788 ("MEDIUM: proto: make common fd checks in sock_create_server_socket"). Initially sock_create_server_socket() was designed to return only created socket FD or -1. Its callers from upper protocol layers were required to test the returned errno and were required then to apply different configuration related checks to obtained positive sock_fd. A lot of this code was duplicated among protocols implementations. The new refactored version of sock_create_server_socket() gathers in one place all duplicated checks, but in order to be compliant with upper protocol layers, it needs the 3rd parameter: 'stream_err', in which it sets the Stream Error Flag for upper levels, if the obtained sock_fd has passed all additional checks. No backport needed since this was introduced in 3.0-dev10.	2024-05-21 20:14:05 +02:00
William Lallemand	04a42a92f4	DOC: configuration: add the supported crt-store options in crt-list The crt-list supports some crt-store keywords. This patch list them in the crt-list documentation.	2024-05-21 18:30:45 +02:00
William Lallemand	e732de7db2	DOC: configuration: update the crt-list documentation Update the crt-list documentation with the supported keywords. Also format it in a more clear way. Must be backported to 2.8.	2024-05-21 18:30:45 +02:00
William Lallemand	e6657fd108	MEDIUM: ssl: don't load file by discovering them in crt-store In commit 55e9e9591 ("MEDIUM: ssl: temporarily load files by detecting their presence in crt-store"), ssl_sock_load_pem_into_ckch() was replaced by ssl_sock_load_files_into_ckch() in the crt-store loading. But the side effect was that we always try to autodetect, and this is not what we want. This patch reverses this, and adds specific code in the crt-list loading, so we can autodetect in crt-list like it was done before, but still try to load files when a crt-store filename keyword is specified. Example: These crt-list lines won't autodetect files: foobar.crt [key foobar.key issuer foobar.issuer ocsp-update on] *.foo.bar foobar.crt [key foobar.key] *.foo.bar These crt-list lines will autodetect files: foobar.pem [ocsp-update on] *.foo.bar foobar.pem	2024-05-21 18:30:45 +02:00
Aurelien DARRAGON	22ec2ad8b0	DEBUG: fd: add name hint for large memory areas Thanks to ("MINOR: tools: add vma_set_name() helper"), set a name hint for large arrays created by fd api (fdtab arrays and so on) so that they can be easily identified in /proc/<pid>/maps. Depending on malloc() implementation, such memory areas will normally be merged on the heap under MMAP_THRESHOLD (128 kB by default) and will have a dedicated memory area once the threshold is exceeded. As such, when large enough, they will appear like this in /proc/<pid>/maps: 7b8e83200000-7b8e84201000 rw-p 00000000 00:00 0 [anon:fd:fdinfo] 7b8e84400000-7b8e85401000 rw-p 00000000 00:00 0 [anon:fd:polled_mask] 7b8e85600000-7b8e89601000 rw-p 00000000 00:00 0 [anon:fd:fdtab_addr] 7b8e90a00000-7b8e90e01000 rw-p 00000000 00:00 0 [anon:fd:fd_updt]	2024-05-21 17:55:29 +02:00
Aurelien DARRAGON	9424e5a06f	DEBUG: errors: add name hint for startup-logs memory area Thanks to ("MINOR: tools: add vma_set_name() helper"), set a name hint for startup-logs ring's memory area created using mmap() so it can be easily identified in /proc/<pid>/maps. 7b8e91cce000-7b8e91cde000 rw-s 00000000 00:19 46 [anon_shmem:errors:startup_logs]	2024-05-21 17:55:20 +02:00
Aurelien DARRAGON	abb8412d20	DEBUG: pollers: add name hint for large memory areas used by pollers Thanks to ("MINOR: tools: add vma_set_name() helper"), set a name hint for large memory areas allocated by pollers upon init so that they can be easily identified in /proc/<pid>/maps. For now, only linux-compatible pollers are considered since vma_set_name() requires a recent linux kernel (>= 5.17). Depending on malloc() implementation, such memory areas will normally be merged on the heap under MMAP_THRESHOLD (128 kB by default) and will have a dedicated memory area once the threshold is exceeded. As such, when large enough, they will appear like this in /proc/<pid>/maps: 7ec6b2d40000-7ec6b2d61000 rw-p 00000000 00:00 0 [anon:ev_poll:fd_evts_wr] 7ec6b2d61000-7ec6b2d82000 rw-p 00000000 00:00 0 [anon:ev_poll:fd_evts_rd]	2024-05-21 17:55:14 +02:00
Aurelien DARRAGON	6c5869f846	DEBUG: sink: add name hint for memory area used by memory-backed sinks Thanks to ("MINOR: tools: add vma_set_name() helper"), set a name hint for user created memory-backed sinks (ring sections without backing-file) so that they can be easily identified in /proc/<pid>/maps. Depending on malloc() implementation, such memory areas will normally be merged on the heap under MMAP_THRESHOLD (128 kB by default) and will have a dedicated memory area once the threshold is exceeded. As such, when large enough, they will appear like this in /proc/<pid>/maps: 7b8e8ac00000-7b8e8bf13000 rw-p 00000000 00:00 0 [anon:ring:myring]	2024-05-21 17:55:09 +02:00
Aurelien DARRAGON	6de0da1b54	DEBUG: shctx: name shared memory using vma_set_name() In 98d22f212 ("MEDIUM: shctx: Naming shared memory context"), David implemented prctl/PR_SET_VMA support to give a name to shctx maps when supported. Maps were named after "HAProxy $name". It turns out that it is not relevant to include "HAProxy" in the map name, given that we're already looking at maps for a given PID (and here it's HAProxy's pid). Instead, let's name shctx maps by making use of the new vma_set_name() helper introduced by previous commit. Resulting maps will be named "shctx:$name", e.g.: "shctx:globalCache", they will appear like this in /proc/<pid>/maps: 7ec6aab0f000-7ec6ac000000 rw-s 00000000 00:01 405 [anon_shmem:shctx:custom_name]	2024-05-21 17:55:03 +02:00
Aurelien DARRAGON	51a8f134ef	DEBUG: tools: add vma_set_name() helper Following David Carlier's work in 98d22f21 ("MEDIUM: shctx: Naming shared memory context"), let's provide a helper function to set a name hint on a virtual memory area (ie: anonymous map created using mmap(), or memory area returned by malloc()). Naming will only occur if available, and naming errors will be ignored. The function takes mandatory <type> and <name> parameters to build the map name as follows: "type:name". When looking at /proc/<pid>/maps, vma named using this helper function will show up this way (provided that the kernel has prctl support for PR_SET_VMA_ANON_NAME): example, using mmap + MAP_SHARED|MAP_ANONYMOUS: 7364c4fff000-736508000000 rw-s 00000000 00:01 3540 [anon_shmem:type:name] Another example, using mmap + MAP_PRIVATE|MAP_ANONYMOUS or using glibc/malloc() above MMAP_THRESHOLD: 7364c4fff000-736508000000 rw-s 00000000 00:01 3540 [anon:type:name]	2024-05-21 17:54:58 +02:00
William Lallemand	4bb6ea5d00	DOC: configuration: rework the crt-store load documentation The load keyword from the documentation has its own section to be readable (like the server or bind options section). The ocsp-update keyword was moved from the bind section to the crt-list load one.	2024-05-21 12:00:55 +02:00
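The naming scheme described in 51a8f134ef can be sketched with the Linux prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, ...) call that the commit message refers to. This is a minimal, Linux-only sketch and not haproxy's implementation: the demo_* functions and the 64-byte buffer are assumptions, and the fallback constant values are the ones from the Linux uapi headers for libcs that do not define them yet.

```c
#include <stdio.h>
#include <sys/prctl.h>

/* Fallback values for older libc headers (Linux uapi constants). */
#ifndef PR_SET_VMA
#define PR_SET_VMA 0x53564d41
#endif
#ifndef PR_SET_VMA_ANON_NAME
#define PR_SET_VMA_ANON_NAME 0
#endif

/* Builds "type:name" into <buf>. Returns the name length, or -1 if it
 * does not fit in <len> bytes.
 */
static int demo_vma_build_name(char *buf, size_t len,
                               const char *type, const char *name)
{
	int ret = snprintf(buf, len, "%s:%s", type, name);

	return (ret < 0 || (size_t)ret >= len) ? -1 : ret;
}

/* Sets a "type:name" hint on the area [addr, addr+size). Errors are
 * ignored on purpose: naming is only a debugging aid and may be
 * unsupported by the running kernel.
 */
static void demo_vma_set_name(void *addr, size_t size,
                              const char *type, const char *name)
{
	char buf[64];   /* arbitrary size chosen for this sketch */

	if (demo_vma_build_name(buf, sizeof(buf), type, name) < 0)
		return;
	(void)prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME,
	            (unsigned long)addr, (unsigned long)size,
	            (unsigned long)buf);
}
```

On a kernel with support for anonymous VMA names, an area named this way with type "fd" and name "fdtab" would be expected to show up as [anon:fd:fdtab] or [anon_shmem:fd:fdtab] in /proc/<pid>/maps, following the format shown in the commit message.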
Aurelien DARRAGON	0cfbeb1ae8	BUG/MINOR: ring: free ring's allocated area not ring's usable area when using maps Since 40d1c84bf0 ("BUG/MAJOR: ring: free the ring storage not the ring itself when using maps"), munmap() call for startup_logs's ring and file-backed rings fails to work (EINVAL) and causes memory leaks during process cleanup. munmap() fails because it is called with the ring's usable area pointer which is an offset from the underlying original memory block allocated using mmap(). Indeed, ring_area() helper function was misused because it didn't explicitly mention that the returned address corresponds to the usable storage's area, not the allocated one. To fix the issue, we add an explicit ring_allocated_area() helper to return the allocated area for the ring, just like we already have ring_allocated_size() for the allocated size, and we properly use both the allocated size and allocated area to manipulate them using munmap() and msync(). No backport needed.	2024-05-21 11:42:35 +02:00
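The distinction made in 0cfbeb1ae8 between a ring's usable area and its allocated area can be sketched as follows. The layout is hypothetical (a small header followed by the storage) and the demo_* names are not haproxy's; only the general rule comes from the commit message: munmap() must receive the address originally returned by mmap(), not a pointer offset into the mapping.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical layout: a header at the start of the mapping, followed by
 * the usable storage.
 */
struct demo_ring_hdr {
	size_t usable;   /* size of the storage that follows the header */
};

/* Maps header + storage and returns the usable area, or NULL on failure. */
static void *demo_ring_map(size_t usable)
{
	struct demo_ring_hdr *hdr;

	hdr = mmap(NULL, sizeof(*hdr) + usable, PROT_READ | PROT_WRITE,
	           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (hdr == MAP_FAILED)
		return NULL;
	hdr->usable = usable;
	return hdr + 1;
}

/* Returns the address mmap() originally returned for <area>. */
static void *demo_ring_allocated_area(void *area)
{
	return (struct demo_ring_hdr *)area - 1;
}

/* Returns the full mapping size for <area>, header included. */
static size_t demo_ring_allocated_size(void *area)
{
	const struct demo_ring_hdr *hdr = demo_ring_allocated_area(area);

	return sizeof(*hdr) + hdr->usable;
}

/* Unmaps the whole ring. Returns 0 on success, -1 on error. */
static int demo_ring_unmap(void *area)
{
	size_t size = demo_ring_allocated_size(area);

	return munmap(demo_ring_allocated_area(area), size);
}
```

Passing the usable pointer directly to munmap() would hand the kernel an address that is not page-aligned, which is the kind of EINVAL failure the commit message reports.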
William Lallemand	d74ba7cc24	MINOR: ssl: check parameter in ckch_conf_cmp() Check prev and new parameters in ckch_conf_cmp() so we don't dereference a NULL ptr. There is no risk since it's not used with a NULL ptr yet. Also remove the checks that are done later, and do them at the beginning of the function. Should fix issue #2572.	2024-05-21 11:09:59 +02:00
William Lallemand	140078c19d	CLEANUP: ssl/cli: remove unused code in dump_crtlist_conf This code was never used because space is never defined before: if (space) chunk_appendf(buf, " "); Should fix issue #2571.	2024-05-21 10:58:09 +02:00
Willy Tarreau	d236b43da7	[RELEASE] Released version 3.0-dev12 Released version 3.0-dev12 with the following main changes : - CI: drop asan.log umbrella completely - BUG/MINOR: log: fix leak in add_sample_to_logformat_list() error path - BUG/MINOR: log: smp_rgs array issues with inherited global log directives - MINOR: rhttp: Don't require SSL when attach-srv name parsing - REGTESTS: ssl: be more verbose with ocsp_compat_check.vtc - DOC: Update UUID references to RFC 9562 - MINOR: hlua: add hlua_nb_instruction getter - MEDIUM: hlua: take nbthread into account in hlua_get_nb_instruction() - BUG/MEDIUM: server: clear purgeable conns before server deletion - BUG/MINOR: mux-quic: fix error code on shutdown for non HTTP/3 - BUG/MINOR: qpack: fix error code reported on QPACK decoding failure - BUG/MEDIUM: htx: mark htx_sl as packed since it may be realigned - BUG/MEDIUM: stick-tables: properly mark stktable_data as packed - SCRIPTS: run-regtests: fix a few occurrences of extended regexes - BUG/MINOR: ssl_sock: fix xprt_set_used() to properly clear the TASK_F_USR1 bit - MINOR: dynbuf: provide a b_dequeue() variant for multi-thread - BUG/MEDIUM: muxes: enforce buf_wait check in takeover() - BUG/MINOR: h1: Check authority for non-CONNECT methods only if a scheme is found - BUG/MEDIUM: h1: Reject CONNECT request if the target has a scheme - BUG/MAJOR: h1: Be stricter on request target validation during message parsing - MINOR: qpack: prepare error renaming - MINOR: h3/qpack: adjust naming for errors - MINOR: h3: adjust error reporting on sending - MINOR: h3: adjust error reporting on receive - MINOR: mux-quic: support glitches - MINOR: h3: report glitch on RFC violation - BUILD: stick-tables: better mark the stktable_data as 32-bit aligned - MINOR: ssl: rename tune.ssl.ocsp-update.mode in ocsp-update.mode - REGTESTS: update the ocsp-update tests - BUILD: stats: remove non portable getline() usage - MEDIUM: ssl: add ocsp-update.mindelay and ocsp-update.maxdelay - BUILD: log: 
get rid of non-portable strnlen() func - BUG/MEDIUM: fd: prevent memory waste in fdtab array - CLEANUP: compat: make the MIN/MAX macros more reliable - Revert: MEDIUM: evports: permit to report multiple events at once" - BUG/MINOR: stats: Don't state the 303 redirect response is chunked - MINOR: mux-h1: Add a flag to ignore the request payload - REORG: mux-h1: Group H1S_F_BODYLESS_* flags - CLEANUP: mux-h1: Remove unused H1S_F_ERROR_MASK mask value - MEDIUM: mux-h1: Support C-L/T-E header suppressions when sending messages - MINOR: ssl: ckch_store_new_load_files_conf() loads filenames from ckch_conf - MEDIUM: ssl/crtlist: loading crt-store keywords from a crt-list - CLEANUP: ssl/ocsp: remove the deprecated parsing code for "ocsp-update" - MINOR: ssl: pass ckch_store instead of ckch_data to ssl_sock_load_ocsp() - MEDIUM: ssl: ckch_conf_parse() uses -1/0/1 for off/default/on - MINOR: ssl: handle PARSE_TYPE_INT and PARSE_TYPE_ONOFF in ckch_store_load_files() - MINOR: ssl/ocsp: use 'ocsp-update' in crt-store - MINOR: ssl: ckch_conf_clean() utility function for ckch_conf - MEDIUM: ssl: add ocsp-update.disable global option - MEDIUM: ssl/cli: handle crt-store keywords in crt-list over the CLI - MINOR: ssl: ckch_conf_cmp() compare multiple ckch_conf structures - MEDIUM: ssl: temporarily load files by detecting their presence in crt-store - REGTESTS: ocsp-update: change the reg-test to support the new crt-store mode - DOC: capabilities: fix chapter header rendering	2024-05-18 16:51:23 +02:00
Valentine Krasnobaeva	63bed0161d	DOC: capabilities: fix chapter header rendering The header of a new management guide chapter, "13.1. Linux capabilities support", is not rendered in HTML format in a proper way, because of missing dots at the end of this chapter's number.	2024-05-18 16:48:20 +02:00
William Lallemand	d33a5f8e14	REGTESTS: ocsp-update: change the reg-test to support the new crt-store mode Update the ocsp-update tests for the recent changes: - Incompatibilities check string changed to match the crt-store one - The "good configurations" are not good anymore because the ckch_conf_cmp() does not compare anymore with a global value.	2024-05-17 17:35:51 +02:00
William Lallemand	55e9e95914	MEDIUM: ssl: temporarily load files by detecting their presence in crt-store crt-store is meant to be stricter than your common crt argument on a bind line, and is supposed to be a declarative format. However, since the 'ocsp-update' was migrated from ssl_conf to ckch_conf, the .issuer file is not autodetected anymore when adding an ocsp-update keyword in a crt-list file, which breaks retro-compatibility. This patch is a quick fix that will disappear once we are able to be strict on a crt-store and autodetect on a crt-list.	2024-05-17 17:35:51 +02:00
William Lallemand	58103bc8e6	MINOR: ssl: ckch_conf_cmp() compare multiple ckch_conf structures The ckch_conf_cmp() function allows comparing multiple ckch_conf structures in order to check that multiple usages of the same crt in the configuration use the same ckch_conf definition. A crt-list allows the use of "crt-store" keywords that define a ckch_store, which can lead to inconsistencies when a crt is called multiple times with different parameters. This function compares and dumps a list of differences in the err variable to be output as an error. The variant ckch_conf_cmp_empty() compares the ckch_conf structure to an empty one, which is useful for bind lines, which are not able to have crt-store keywords. These functions are used when a crt-store is already initialized and we need to verify if the parameters are compatible. ckch_conf_cmp() handles multiple cases: - When the previous ckch_conf was declared with CKCH_CONF_SET_EMPTY, we can't define any new keyword in the next initialisation - When the previous ckch_conf was declared with keywords in a crtlist (CKCH_CONF_SET_CRTLIST), the next initialisation must have the exact same keywords. - When the previous ckch_conf was declared in a "crt-store" (CKCH_CONF_SET_CRTSTORE), the next initialisation could use no keyword at all or the exact same keywords.	2024-05-17 17:35:51 +02:00
William Lallemand	1bc6e990f2	MEDIUM: ssl/cli: handle crt-store keywords in crt-list over the CLI This patch adds crt-store keywords from the crt-list on the CLI. - keywords from crt-store can be used over the CLI when inserting a certificate in a crt-list - keywords from crt-store are dumped when showing a crt-list content over the CLI The ckch_conf_kws.func function pointer needed a new "cli" parameter, in order to differentiate loading that comes from the CLI from loading at startup, as they don't behave the same. For example it must not try to load a file on the filesystem when loading a crt-list line from the CLI. dump_crtlist_sslconf() was renamed to dump_crtlist_conf() and takes a new ckch_conf parameter in order to dump relevant crt-store keywords.	2024-05-17 17:35:51 +02:00
William Lallemand	2bcf38c7c8	MEDIUM: ssl: add ocsp-update.disable global option This option allows completely disabling the ocsp-update. To achieve this, the ocsp-update.mode global keyword doesn't rely anymore on SSL_SOCK_OCSP_UPDATE_OFF during parsing to call ssl_create_ocsp_update_task(). Instead, we will inherit the SSL_SOCK_OCSP_UPDATE_* value from ocsp-update.mode for each certificate which does not specify its own mode. To completely disable the ocsp-update without editing all crt entries, ocsp-update.disable is used instead of "ocsp-update.mode" which is now only used as the default value for crt.	2024-05-17 17:35:51 +02:00
William Lallemand	2e6615b282	MINOR: ssl: ckch_conf_clean() utility function for ckch_conf - ckch_conf_clean() to free() the content of a ckch_conf structure, mostly the strings that were strdup()'ed	2024-05-17 17:35:51 +02:00
William Lallemand	2b6b7fea58	MINOR: ssl/ocsp: use 'ocsp-update' in crt-store Use the ocsp-update keyword in the crt-store section. This is not used as an exception in the crtlist code anymore. This patch introduces the "ocsp_update_mode" variable in the ckch_conf structure. The SSL_SOCK_OCSP_UPDATE_* enum was changed to a define to match the ckch_conf on/off parser so we can set off to -1.	2024-05-17 17:35:51 +02:00
William Lallemand	462e5b0098	MINOR: ssl: handle PARSE_TYPE_INT and PARSE_TYPE_ONOFF in ckch_store_load_files() The callback used by ckch_store_load_files() only works with PARSE_TYPE_STR. This patch allows using a callback which takes an integer type for PARSE_TYPE_INT and PARSE_TYPE_ONOFF. This requires changing the type of the callback to void * to pass either a char * or an int depending on the parsing type. The ssl_sock_load_* functions were encapsulated in ckch_conf_load_* functions just to match the type. This will allow handling crt-store keywords that are ONOFF or INT types.	2024-05-17 17:35:51 +02:00
William Lallemand	c5a665f5d8	MEDIUM: ssl: ckch_conf_parse() uses -1/0/1 for off/default/on ckch_conf_parse() now sets -1 for an off value and 1 for an on value. This allows detecting when a value is the default since the structs are memset to 0.	2024-05-17 17:35:51 +02:00
William Lallemand	2b8880e395	MINOR: ssl: pass ckch_store instead of ckch_data to ssl_sock_load_ocsp() ssl_sock_put_ckch_into_ctx() and ssl_sock_load_ocsp() need to take a ckch_store in argument. Indeed the ocsp_update_mode is not stored anymore in ckch_data, but in ckch_conf which is part of the ckch_store. This is a minor change, but the function definition had to change.	2024-05-17 17:35:51 +02:00
William Lallemand	db09c2168f	CLEANUP: ssl/ocsp: remove the deprecated parsing code for "ocsp-update" Remove the "ocsp-update" keyword handling from the crt-list. The code was made as an exception everywhere so we could activate the ocsp-update for an individual certificate. The feature will still exist but will be parsed as a "crt-store" keyword which will still be usable in a "crt-list". This will appear in future commits. This commit also disables the reg-tests for now.	2024-05-17 17:35:51 +02:00
William Lallemand	d616932076	MEDIUM: ssl/crtlist: loading crt-store keywords from a crt-list This patch allows the usage of "crt-store" keywords from a "crt-list". The crtstore_parse_load() function was split into 2 functions, so the keywords parsing is done in ckch_conf_parse(). With this patch, crt are loaded with ckch_store_new_load_files_conf() or ckch_store_new_load_files_path() depending on whether or not there is a "crt-store" keyword. More checks need to be done on "crt" bind keywords to ensure that keywords are compatible. This patch does not introduce the feature on the CLI.	2024-05-17 17:35:51 +02:00
William Lallemand	8526d666d2	MINOR: ssl: ckch_store_new_load_files_conf() loads filenames from ckch_conf ckch_store_new_load_files_conf() is the equivalent of new_ckch_store_load_files_path() but instead of trying to find the files using a base filename, it will load them from a list of files.	2024-05-17 17:35:51 +02:00
Christopher Faulet	2fc9e6fa39	MEDIUM: mux-h1: Support C-L/T-E header suppressions when sending messages During the 2.9 dev cycle, to be able to support zero-copy data forwarding, a change on the H1 mux was performed to ignore the headers modifications about payload representation (Content-Length and Transfer-Encoding headers). It appears there are some use-cases where it could be handy to change values of these headers or just remove them. For instance, we can imagine removing these headers on a server response to force the old HTTP/1.0 close mode behavior. So thanks to this patch, the rules are relaxed. It is now possible to remove these headers. When this happens, the following rules are applied: * If "Content-Length" header is removed but a "Transfer-Encoding: chunked" header is found, no special processing is performed. The message remains chunked. However the close mode is not forced. * If "Transfer-Encoding" header is removed but a "Content-Length" header is found, no special processing is performed. The payload length must comply with the specified content length. * If one of them is removed and the other one is not found, a response is switched to the close mode and a "Content-Length: 0" header is forced on a request. With these rules, we best fit user expectations. This patch depends on the following commit: * MINOR: mux-h1: Add a flag to ignore the request payload This patch should fix the issue #2536. It should be backported to 2.9 with the commit above.	2024-05-17 16:33:53 +02:00
Christopher Faulet	1a2699d5f7	CLEANUP: mux-h1: Remove unused H1S_F_ERROR_MASK mask value This mask value is unused, so we can safely remove it. This is fortunate because its value was wrong. But there is no bug here, even in stable versions, because it is no longer used in any version.	2024-05-17 16:33:53 +02:00
Christopher Faulet	071057d112	REORG: mux-h1: Group H1S_F_BODYLESS_* flags To ease reading of H1S flags, H1S_F_BODYLESS_REQ and H1S_F_BODYLESS_RESP flags are grouped.	2024-05-17 16:33:53 +02:00
Christopher Faulet	8e55d29109	MINOR: mux-h1: Add a flag to ignore the request payload There was a flag to skip the response payload on output, if any, by stating it is bodyless. It is used for responses to HEAD requests or for 204/304 responses. This allows rewrites during analysis. For instance a HEAD request can be rewritten to a GET request for any reason (e.g. a server not supporting HEAD requests). In this case, the server will send a response with a payload. On the frontend side, the payload will be skipped and a valid response (without payload) will be sent to the client. With this patch we introduce the corresponding flag for the request. It will be used to skip the request payload. In addition, when the payload must be skipped for a request or a response, the zero-copy data forwarding is now disabled.	2024-05-17 16:33:53 +02:00
Christopher Faulet	45a45c917a	BUG/MINOR: stats: Don't state the 303 redirect response is chunked Start-line flags for 303-See-Other response returned by the stats applet are not properly set. Indeed, the response has a "content-length" header but both HTX_SL_F_CHNK and HTX_SL_F_CLEN flags are set. Because of this bug, the response is considered as chunked. So, let's remove HTX_SL_F_CHNK flag. And also add HTX_SL_F_BODYLESS flag because there is no payload ("content-length" header is always set to 0). This patch must be backported to all stable versions. On the 2.8 and lower versions, the commit d0b04920d1 ("BUG/MINOR: htpp-ana/stats: Specify that HTX redirect messages have a C-L header") must be backported first.	2024-05-17 16:33:53 +02:00
Willy Tarreau	e362b076b1	Revert: MEDIUM: evports: permit to report multiple events at once" Tests have shown that switching nevlist to global.tune.maxpollevents is totally unreliable when using evports, and that events seem to be missed. A good reproducer seems to be QUIC. There are not enough users of Solaris to warrant spending more time trying to get down to this, and even the few that remain are by definition not interested in performance, so let's just revert the commit that tried to lift the value: e6662bf706 ("MEDIUM: evports: permit to report multiple events at once"). No backport is needed.	2024-05-17 15:57:18 +02:00
Willy Tarreau	0999e3d959	CLEANUP: compat: make the MIN/MAX macros more reliable After every release we say that MIN/MAX should be changed to be an expression that only evaluates each operand once, and before every version we forget to change it and we recheck that the code doesn't misuse them. Let's fix them now.	2024-05-17 15:57:18 +02:00
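The double-evaluation hazard mentioned in 0999e3d959 can be illustrated with a small sketch. The macro names NAIVE_MAX and DEMO_MAX are hypothetical, and the single-evaluation form shown here relies on the GCC/Clang statement-expression and __typeof__ extensions; it is one common way to write such a macro, not necessarily the exact form HAProxy adopted.

```c
#include <assert.h>

/* Classic form: the selected operand is evaluated a second time. */
#define NAIVE_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Single-evaluation form (hypothetical name): each operand is copied into
 * a local once, then only the copies are compared and returned. */
#define DEMO_MAX(a, b) ({                \
        __typeof__(a) _a = (a);          \
        __typeof__(b) _b = (b);          \
        _a > _b ? _a : _b;               \
})

/* Returns <i> after NAIVE_MAX(i++, 0), starting from i = 1:
 * i++ runs once for the comparison and once more for the result. */
int demo_naive_side_effect(void)
{
    int i = 1;

    (void)NAIVE_MAX(i++, 0);
    return i;
}

/* Same call through DEMO_MAX: i++ runs exactly once. */
int demo_safe_side_effect(void)
{
    int i = 1;

    (void)DEMO_MAX(i++, 0);
    return i;
}
```

With the naive macro the counter ends up incremented twice, which is exactly the kind of misuse the commit message says had to be rechecked before every release.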
Aurelien DARRAGON	b9915a745e	BUG/MEDIUM: fd: prevent memory waste in fdtab array In 97ea9c49f1 ("BUG/MEDIUM: fd: always align fdtab[] to 64 bytes"), the patch doesn't do what the message says. The intent was only to align the base fdtab addr on 64 bytes so that all fdtab entries are aligned and thus don't share the same cache line. For that, the fdtab pointer is adjusted from the fdtab_addr (unaligned) address after it is allocated. Thus, all we need is an extra 64 bytes in the fdtab_addr array for the alignment. Because we use calloc() to perform the allocation, a dumb mistake was made: the '+64' was added on the <size> calloc argument, which means EACH fdtab entry is allocated with 64 extra bytes. Given that a single fdtab entry is 64 bytes, since 97ea9c49f1 each fdtab entry now takes 128 bytes! We doubled fdtab memory consumption. To give you an idea, on my laptop, when looking at memory consumption using 'ps -p `pidof haproxy` -o size' right after starting haproxy process with default settings (no maxsock enforced): before 97ea9c49f1: -> 118440 (KB, ~= 118MB) after 97ea9c49f1: -> 183976 (KB, ~= 184MB) To fix this, use calloc with 1 <nmemb> and manually provide the size with <size> as we would do if we used malloc(). With this patch, we're back to pre-97ea9c49f1 for fdtab memory consumption (with 64 extra bytes for the whole array, which is insignificant). It should be backported to all stable versions.	2024-05-17 15:25:03 +02:00
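The arithmetic behind b9915a745e can be sketched as follows. The names (struct demo_entry, demo_bytes_buggy(), demo_bytes_fixed(), demo_alloc_aligned()) are hypothetical stand-ins, assuming a 64-byte entry as stated in the commit message; this is an illustration of the calloc() argument mix-up, not the HAProxy code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical 64-byte entry standing in for one fdtab slot. */
struct demo_entry {
    char pad[64];
};

/* Bytes requested by the buggy call: calloc(n, sizeof(entry) + 64).
 * The padding is paid once per entry. */
size_t demo_bytes_buggy(size_t n)
{
    return n * (sizeof(struct demo_entry) + 64);
}

/* Bytes requested by the fixed call: calloc(1, n * sizeof(entry) + 64).
 * The padding is paid once for the whole array. */
size_t demo_bytes_fixed(size_t n)
{
    return n * sizeof(struct demo_entry) + 64;
}

/* Allocates <n> zeroed entries and returns a pointer aligned on 64 bytes.
 * <raw> receives the original pointer, which is the one to pass to free(). */
struct demo_entry *demo_alloc_aligned(size_t n, void **raw)
{
    *raw = calloc(1, demo_bytes_fixed(n));
    if (!*raw)
        return NULL;
    /* Round the address up to the next multiple of 64. */
    return (struct demo_entry *)(((uintptr_t)*raw + 63) & ~(uintptr_t)63);
}
```

For 1000 entries, the buggy form asks for 128000 bytes while the fixed one asks for 64064, which matches the "doubled" consumption described above.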
Aurelien DARRAGON	e84c8dee1a	BUILD: log: get rid of non-portable strnlen() func In c614fd3b9 ("MINOR: log: add +cbor encoding option"), I wrongly used strnlen() without noticing that the function is not portable (requires _POSIX_C_SOURCE >= 2008) and that it was the first occurrence in the entire project. In fact it is not a hard requirement since it's a pretty simple function. Thus to restore build compatibility with minimal/older build systems, let's actually get rid of it and use an equivalent portable code where needed (we cannot simply rely on strlen() because the string might not be NULL terminated, we must take upstream len into account). No backport needed (unless c614fd3b9 gets backported)	2024-05-17 15:24:53 +02:00
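A portable equivalent of strnlen(), as alluded to in e84c8dee1a, can be as small as the sketch below. The function name demo_bounded_len() is hypothetical and this is not necessarily the code that was committed; it only shows why strlen() alone is not enough when the buffer may lack a terminating NUL byte.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the number of bytes before the first NUL
 * in <s>, but never reads more than <max> bytes. */
size_t demo_bounded_len(const char *s, size_t max)
{
    size_t len = 0;

    while (len < max && s[len] != '\0')
        len++;
    return len;
}
```

Called on a 4-byte buffer holding 'a', 'b', 'c', 'd' with no terminator and max set to 4, it stops at the bound instead of reading past the buffer.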
William Lallemand	f18ed8d07e	MEDIUM: ssl: add ocsp-update.mindelay and ocsp-update.maxdelay This patch deprecates tune.ssl.ocsp-update.* in favor of "ocsp-update.*", since the ocsp-update is not really a tunable of the SSL connections.	2024-05-17 15:00:11 +02:00
Amaury Denoyelle	fbc3d46b9f	BUILD: stats: remove non portable getline() usage getline() was used to read stats-file. However, this function is not portable and may cause build issues on some systems. Replace it with standard fgets(). No need to backport.	2024-05-17 14:53:19 +02:00
William Lallemand	ef943c186d	REGTESTS: update the ocsp-update tests Update the ocsp-update tests for the recent changes: - "tune.ssl.ocsp-update.mode" was renamed to "ocsp-update.mode"	2024-05-17 14:50:00 +02:00
William Lallemand	ee58fac1b4	MINOR: ssl: rename tune.ssl.ocsp-update.mode in ocsp-update.mode Since the ocsp-update is not strictly a tuning of the SSL stack, but a feature of its own, let's rename the option. The option was also missing from the index.	2024-05-17 14:50:00 +02:00
Willy Tarreau	ea3b89952d	BUILD: stick-tables: better mark the stktable_data as 32-bit aligned Aurélien reported that clang's build was broken by the recent fix 845fb846c7 ("BUG/MEDIUM: stick-tables: properly mark stktable_data as packed"), because it now wants to use a helper for some atomic ops (to increment std_t_uint). While it makes no sense to do something that slow on modern architectures like x86 and arm64 which are fine with unaligned accesses, we can actually simply mark the struct as aligned to its smallest element which is 32-bit (but still packed). With this, it was verified that it is enough for clang to see that its 32-bit operations will always be aligned, while making 64-bit operations safe on 64-bit platforms that do not support unaligned accesses. This should be backported wherever the patch above is backported.	2024-05-17 11:00:45 +02:00
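The effect of combining "packed" with a 32-bit alignment, as described in ea3b89952d, can be observed with the sketch below. The union names are hypothetical simplifications (the real stktable_data has more members), and the attributes and __alignof__ are GCC/Clang extensions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical union, packed only: the compiler must assume it can sit
 * at any byte address, so even 32-bit accesses are treated as unaligned. */
union demo_data_packed {
    uint32_t u32;
    uint64_t u64;
} __attribute__((packed));

/* Packed but declared 32-bit aligned: 32-bit accesses are known to be
 * aligned, while 64-bit ones are still handled as possibly unaligned. */
union demo_data_packed_a4 {
    uint32_t u32;
    uint64_t u64;
} __attribute__((packed, aligned(4)));

/* Alignment the compiler assumes for each variant. */
size_t demo_align_packed(void)
{
    return __alignof__(union demo_data_packed);
}

size_t demo_align_packed_a4(void)
{
    return __alignof__(union demo_data_packed_a4);
}
```

The first variant reports an alignment of 1 and the second an alignment of 4, while both keep the same 8-byte size.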
Amaury Denoyelle	0d35f8d918	MINOR: h3: report glitch on RFC violation Increment the glitch connection counter on every HTTP/3 or QPACK error that is a violation of the specification. This could be useful to get rid of bogus clients early.	2024-05-16 10:58:54 +02:00
Amaury Denoyelle	216f70f989	MINOR: mux-quic: support glitches Implement basic support for glitches on the QUIC multiplexer. This is mostly identical to glitches for HTTP/2. A new configuration option named tune.quic.frontend.glitches-threshold is defined to limit the number of glitches on a connection before closing it. The glitches counter is incremented via qcc_report_glitch(). A new qcc_app_ops callback <report_susp> is defined. When the threshold is reached, it allows setting an application error code to close the connection. For HTTP/3, value H3_EXCESSIVE_LOAD is returned. If not defined, default code INTERNAL_ERROR is used. For the moment, no glitches are reported for QUIC or HTTP/3 usage. This will be added in future patches as needed.	2024-05-16 10:58:20 +02:00
Amaury Denoyelle	a6993a669b	MINOR: h3: adjust error reporting on receive This commit is the second step to simplify HTTP/3 error management. This time it deals with the receive side in h3_rcv_buf(). Various internal HTTP/3 to HTX conversion functions do not set H3_INTERNAL_ERROR on h3c err anymore. Only standard error codes are set. For every error, both internal and protocol ones, a negative value is returned. This ensures that h3_rcv_buf() looping is interrupted. This function will then set H3_INTERNAL_ERROR only if no standard error is registered via h3c or h3s. Along with the previous commit, this should better distinguish internal errors from protocol ones caused by a faulty client.	2024-05-16 10:31:17 +02:00
Amaury Denoyelle	079d13f73f	MINOR: h3: adjust error reporting on sending It's currently difficult to differentiate HTTP/3 standard protocol violations from internal issues which use solely the H3_INTERNAL_ERROR code. This patch is the first step to simplify this. The objective is to reduce the use of H3_INTERNAL_ERROR. The <err> field of h3c should be reserved exclusively to other values. Simplify error management in sending via h3_snd_buf(). The sending side is straightforward as only internal errors can be encountered. Do not manually set h3c.err to H3_INTERNAL_ERROR in the various HTX to HTTP/3 conversion functions. Instead, just return a negative value which is enough to break the h3_snd_buf() loop. H3_INTERNAL_ERROR is thus positioned at a single location in this function for all sending operations.	2024-05-16 10:31:17 +02:00
Amaury Denoyelle	e094412337	MINOR: h3/qpack: adjust naming for errors Rename enum values used for HTTP/3 and QPACK RFC defined codes. First uses a prefix H3_ERR_* which serves as identifier between them. Also separate QPACK values in a new dedicated enum qpack_err. This is deemed cleaner.	2024-05-16 10:31:17 +02:00
Amaury Denoyelle	2dabcf30be	MINOR: qpack: prepare error renaming There are two distinct enums both related to QPACK error management. The first one is dedicated to RFC-defined codes. The other one is a set of internal values returned by qpack_decode_fs(). There have been issues discovered recently due to the confusion between them. Rename internal values with the prefix QPACK_RET_. The older name QPACK_ERR_ will be used in a future commit for the first enum.	2024-05-16 10:31:17 +02:00
Christopher Faulet	25bcdb1d95	BUG/MAJOR: h1: Be stricter on request target validation during message parsing As stated in issue #2565, checks on the request target during H1 message parsing are not good enough. Invalid paths, not starting with a slash, are in fact parsed as authorities. The same error is repeated at the sample fetch level. This last point is annoying because routing rules may be fooled. It is also an issue when the URI or the Host header are updated. Because the error is repeated at different places, it must be fixed. We cannot be lax by arguing it is the server's job to accept or reject invalid request targets. With this patch, we strengthen the checks performed on the request target during H1 parsing. The idea is to reject invalid requests at this step to be sure it is safe to manipulate the path or the authority at other places. So now, the asterisk-form is only allowed for OPTIONS and OTHER methods. This last point was added to not reject the H2 preface. In addition, we take care to have only one asterisk and nothing more. For the CONNECT method, we take care to have a valid authority-form. All other forms are rejected. The authority-form is now only supported for the CONNECT method. No specific check is performed on the origin-form (except for the CONNECT method). For the absolute-form, we take care to have a scheme and a valid authority. These checks are not perfect but should be good enough to properly identify each part of the request target for a relatively small cost. But, it is a breaking change. Some requests are now rejected while they were not on older versions. However, nowadays, it is most probably not an issue. If it turns out it's really an issue for legitimate use-cases, an option would be to support these kinds of requests when the "accept-invalid-http-request" option is set, with the consequence of seeing some sample fetches having an unexpected behavior. This patch should fix the issue #2665. It MUST NOT be backported. First because it is a breaking change. And then because by avoiding backporting it, it remains possible to relax the parsing with the "accept-invalid-http-request" option.	2024-05-15 21:20:37 +02:00
Christopher Faulet	d3d9d83f03	BUG/MEDIUM: h1: Reject CONNECT request if the target has a scheme The target of a CONNECT request must not have a scheme. However, this was not checked during the message parsing. It is now rejected. This patch may be backported as far as 2.4.	2024-05-15 21:20:37 +02:00
Christopher Faulet	d724b0d147	BUG/MINOR: h1: Check authority for non-CONNECT methods only if a scheme is found When a non-CONNECT H1 request is parsed, the authority is compared to the host header value, to validate that they are the same. However there is an issue here when a relative path is used (not beginning with a '/'). In this case, the path is considered as the authority and will be erroneously compared to the host header value. It is observable with this kind of request: GET admin HTTP/1.1 Host: www.mysite.com In this case "admin" is parsed as an authority while it is in fact a path. At this step, it is not a big deal because it just happens on the very first checks on the message during the parsing. However, the same happens when the authority is updated. This will be fixed in another commit. Note this kind of request is invalid because the path does not start with a '/'. But, till now, HAProxy does not reject it. This patch is related to issue #2565. It must be backported as far as 2.4.	2024-05-15 21:20:37 +02:00
Willy Tarreau	821a04377d	BUG/MEDIUM: muxes: enforce buf_wait check in takeover() The ->takeover() is quite tricky. It didn't take care of the possibility that the original thread's connection handler had been woken up to handle an event (e.g. read0), failed to get a buffer, registered against its own thread's buffer_wait queue and left the connection in an idle state. A new thread could then come by, perform a takeover(), and when a buffer was available, the new thread's tasklet would be woken up by the old one via _buf_available(), causing all sort of problems. These problems are easy to reproduce, by running with shared backend connections and few buffers (tune.buffers.limit=20, 8 threads, 500 connections, transfer 64kB objects and wait 2-5s for a crash to appear). A first estimated solution consisted in removing the connection from the idle list but it turns out that it would be worse for the delete stuff (the connection no longer appearing as idle, making it impossible to find it in order to close it). Also, idle counts wouldn't match anymore the list's state, and the special case of private connections could be difficult to handle as the connection could be forcefully re-added to the idle list after allocation despite being private. After multiple attempts to address the problem in various ways, it appears that the only reliable solution for now (without starting to turn many lists to mt_lists) is to have the takeover() function handle the buf_wait detection or unregistration itself: - when doing a regular takeover aiming at finding an idle connection for a new request, connections that are blocked in a buffer_wait queue are quite rare and not interesting at all (since not immediately usable), so skipping them is sufficient. For this we detect that the desired connection belongs to a buffer_wait list by checking its buf_wait.list element. Note that this check is not* thread-safe! 
The LIST_DEL_INIT() is performed by __offer_buffers() after the callback was called. But this is sufficient as it is now because the only way for the element to be seen as not in a list is after the element was last touched by __offer_buffers(), so the situation for this connection will not change in a different way later. - when doing a server delete, we're running under thread isolation. The connection might get taken over to be killed. The only trick is that private connections not belonging to any idle list may also experience this, and in this case even the idle_conns lock will not offer any protection against anything. But since we're run under thread isolation, we're certain not to compete with the other thread, so it's safe to directly unregister the connection from its owner thread. Normally this is already handled by conn_release() in cli_parse_delete_server(), which calls mux->destroy(), but this would actually update the current thread's queue instead of the origin thread's, thus we do need to perform an explicit dequeue before completing the takeover. With this, the problem now looks solved for HTTP/1, HTTP/2 and FCGI, though extensive tests were essentially run on HTTP/1 and HTTP/2. While the problem has been there for a very long time, there should be no reason to backport it since buffer_wait didn't practically work before 3.0-dev and the process used to freeze hard very quickly before we'd even have a chance to meet that race.	2024-05-15 19:37:12 +02:00
Willy Tarreau	b0349cf2de	MINOR: dynbuf: provide a b_dequeue() variant for multi-thread In order to forcefully unregister a buffer waiter during an inter-thread takeover under isolation, we'll need the function to work with the target thread's ctx instead of th_ctx. Let's implement this by passing the target thread as an argument. Now b_dequeue() simply calls this one with tid. That's OK since it's not on that critical a path, especially since the list has been checked for existence before performing the call.	2024-05-15 19:37:12 +02:00
Willy Tarreau	edb99e296d	BUG/MINOR: ssl_sock: fix xprt_set_used() to properly clear the TASK_F_USR1 bit In 2.4-dev8 with commit 5c7086f6b0 ("MEDIUM: connection: protect idle conn lists with locks"), the idle conns list started to be protected using the lock for takeover, and the SSL layer used to always take that lock. Later in 2.4-dev11, with commit 4149168255 ("MEDIUM: ssl: implement xprt_set_used and xprt_set_idle to relax context checks"), we decided to relax this lock using TASK_F_USR1 just as is done in muxes. However the xprt_set_used() call, that's supposed to clear the flag, visibly suffered from a copy-paste and kept the OR operation instead of the AND, resulting in the flag never being released, so that SSL on the backend continues to take the lock on each and every I/O access even when the connection is not idle. The effect is only a reduced performance. This could be backported, but given the non-zero risk of triggering another bug somewhere, it would be prudent to wait for this fix to be sufficiently tested in new versions first.	2024-05-15 19:37:12 +02:00
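The copy-paste mistake described in edb99e296d boils down to using OR where AND-NOT was needed. The sketch below uses a hypothetical flag value and plain (non-atomic) helpers named demo_set_idle(), demo_set_used_buggy() and demo_set_used_fixed(); the real code operates on a tasklet's state, which this illustration does not attempt to reproduce.

```c
#include <assert.h>

/* Hypothetical flag bit standing in for TASK_F_USR1. */
#define DEMO_F_USR1 0x00010000u

/* Marks the connection idle: sets the flag. */
unsigned int demo_set_idle(unsigned int state)
{
    return state | DEMO_F_USR1;
}

/* Buggy "used" transition: a copy of the idle path, so the flag stays set. */
unsigned int demo_set_used_buggy(unsigned int state)
{
    return state | DEMO_F_USR1;
}

/* Fixed "used" transition: clears the flag and leaves other bits intact. */
unsigned int demo_set_used_fixed(unsigned int state)
{
    return state & ~DEMO_F_USR1;
}
```

With the buggy variant a connection that went idle once keeps the flag forever, which corresponds to the "lock taken on each and every I/O access" symptom reported in the commit message.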
Willy Tarreau	b6ed749adc	SCRIPTS: run-regtests: fix a few occurrences of extended regexes Running run-regtests on OpenBSD failed to identify haproxy version and the various build options because the backslash is not recognized in grep expressions. One must only use -E for the extended regexes and not use the slash.	2024-05-15 19:33:45 +02:00
Willy Tarreau	845fb846c7	BUG/MEDIUM: stick-tables: properly mark stktable_data as packed The stktable_data union is made of types of varying sizes, and depending on which types are stored in a table, some offsets might not necessarily be aligned. This results in a bus error for certain regtests (e.g. lb-services) on MIPS64. This bug may impact MIPS64, SPARC64, armv7 when accessing a 64-bit counter (e.g. bytes) and depending on how the compiler emitted the operation, and cause a trap that's emulated by the OS on RISCV (heavy cost). x86_64 and armv8 are not affected at all. Let's properly mark the struct with __attribute__((packed)) so that the compiler emits the suitable unaligned-compatible instructions when accessing the fields. This should be backported to all versions where it applies.	2024-05-15 19:03:18 +02:00
Willy Tarreau	276cdc11e8	BUG/MEDIUM: htx: mark htx_sl as packed since it may be realigned A test on MIPS64 revealed that the following reg tests would all fail at the same place in htx_replace_stline() when updating parts of the request line: reg-tests/cache/if-modified-since.vtc reg-tests/http-rules/h1or2_to_h1c.vtc reg-tests/http-rules/http_after_response.vtc reg-tests/http-rules/normalize_uri.vtc reg-tests/http-rules/path_and_pathq.vtc While the status line is normally aligned since it's the first block of the HTX, it may become unaligned once replaced. The problem is, it is a structure which contains some u16 and u32, and dereferencing them on machines not natively supporting unaligned accesses makes them crash or handle crap. Typically, MIPS/MIPS64/SPARC will crash, ARMv5 will either crash or (more likely) return swapped values and do crap, and RISCV will trap and turn to slow emulation. We can assign the htx_sl struct the packed attribute, but then this also causes the ints to fill the 2-bytes gap before them, always causing unaligned accesses for this part on such machines. The patch does a bit better, by explicitly filling this two-bytes hole, and packing the struct. This should be backported to all versions.	2024-05-15 19:03:17 +02:00
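The difference between packing alone and packing with an explicitly filled hole can be sketched as follows. The struct layouts are hypothetical and much smaller than the real htx_sl; the sketch assumes a GCC- or Clang-compatible compiler for `__attribute__((packed))`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts, NOT the real htx_sl definition. */

/* Natural layout: on typical ABIs the compiler inserts a 2-byte hole
 * after 'flags' so that 'value' starts on a 4-byte boundary.
 */
struct demo_sl_natural {
	uint16_t flags;
	uint32_t value;
};

/* Packed only: the hole disappears, so 'value' moves to offset 2 and
 * is misaligned relative to the start of the struct.
 */
struct demo_sl_packed_only {
	uint16_t flags;
	uint32_t value;
} __attribute__((packed));

/* Packed with the hole filled explicitly: 'value' stays at offset 4,
 * and the compiler still emits unaligned-safe accesses in case the
 * whole struct itself is placed at an unaligned address.
 */
struct demo_sl_packed_filled {
	uint16_t flags;
	uint16_t pad;   /* explicit fill of the former 2-byte hole */
	uint32_t value;
} __attribute__((packed));
```

The third form keeps the field offsets of the natural layout while still telling the compiler that the struct may live at any address.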
Amaury Denoyelle	86aafd0236	BUG/MINOR: qpack: fix error code reported on QPACK decoding failure qpack_decode_fs() is used to decode QPACK field section on HTTP/3 headers parsing. Its return value is incoherent as it returns either QPACK_DECOMPRESSION_FAILED defined in RFC 9204 or any other internal values defined in qpack-dec.h. On failure, such return code is reused by HTTP/3 layer to be reported via a CONNECTION_CLOSE frame. This is incorrect if an internal error value was reported as it is not defined by any specification. Fix return values of qpack_decode_fs() in two ways. Firstly, replace invalid usages of QPACK_DECOMPRESSION_FAILED when decoded content is too large with the correct internal error QPACK_ERR_TOO_LARGE. Secondly, adjust qpack_decode_fs() API to only return internal code values. A new internal enum QPACK_ERR_DECOMP is defined to replace QPACK_DECOMPRESSION_FAILED. Caller is responsible for converting it to a suitable error value. For other internal values, H3_INTERNAL_ERROR is used. This is done through a set of convert functions. This should be backported up to 2.6. Note that trailers are not supported in 2.6 so chunk related to h3_trailers_to_htx() can be safely skipped.	2024-05-15 16:07:15 +02:00
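The conversion step described above can be sketched as a small mapping function. The enum and function names are hypothetical (only QPACK_ERR_DECOMP and QPACK_ERR_TOO_LARGE are named in the message); the two wire values are the ones assigned by RFC 9204 (QPACK_DECOMPRESSION_FAILED, 0x200) and RFC 9114 (H3_INTERNAL_ERROR, 0x102).

```c
#include <stdint.h>

/* Hypothetical internal decoder error codes, never sent on the wire. */
enum demo_qpack_err {
	DEMO_QPACK_ERR_DECOMP = 1,  /* malformed field section */
	DEMO_QPACK_ERR_TOO_LARGE,   /* decoded content does not fit */
	DEMO_QPACK_ERR_OTHER        /* any other internal failure */
};

/* Error codes defined by the specifications. */
#define DEMO_H3_INTERNAL_ERROR           0x102u /* RFC 9114 */
#define DEMO_QPACK_DECOMPRESSION_FAILED  0x200u /* RFC 9204 */

/* Converts an internal decoder error into a code that may legitimately
 * appear in a CONNECTION_CLOSE frame. Only a real decompression
 * failure maps to the QPACK code; everything else is reported as an
 * HTTP/3 internal error.
 */
static uint64_t demo_qpack_err_to_wire(enum demo_qpack_err err)
{
	switch (err) {
	case DEMO_QPACK_ERR_DECOMP:
		return DEMO_QPACK_DECOMPRESSION_FAILED;
	default:
		return DEMO_H3_INTERNAL_ERROR;
	}
}
```

Keeping the decoder API purely internal and converting at the boundary means no unspecified value can leak to the peer.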
Amaury Denoyelle	4295dd21bd	BUG/MINOR: mux-quic: fix error code on shutdown for non HTTP/3 qcc_shutdown() is called whenever the connection must be closed. If application protocol defined its own shutdown callback, it is invoked to use the correct error code. Else transport error code NO_ERROR is used. A bug occurs in the latter case as NO_ERROR is used with quic_err_app() which is reserved for application error codes. This will trigger the emission of a CONNECTION_CLOSE of type 0x1d (Application) instead of 0x1c (Transport). This bug is considered minor as it does not impact QUIC with HTTP/3. It may only be visible when using experimental HTTP/0.9 protocol. This should be backported up to 2.6. For 2.6, patch must be completely rewritten due to code differences. Here is the change to apply : diff --git a/src/mux_quic.c b/src/mux_quic.c index 26fb70ddf..c48f82e27 100644 --- a/src/mux_quic.c +++ b/src/mux_quic.c @@ -1918,7 +1918,9 @@ static void qc_release(struct qcc *qcc) qc_send(qcc); } else { - qcc_emit_cc_app(qcc, QC_ERR_NO_ERROR, 0); + /* Duplicate from qcc_emit_cc_app() for Transport error code. */ + if (!(qcc->conn->handle.qc->flags & QUIC_FL_CONN_IMMEDIATE_CLOSE)) + qcc->conn->handle.qc->err = quic_err_transport(QC_ERR_NO_ERROR); } }	2024-05-15 16:03:01 +02:00
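The effect of tagging an error as application or transport can be sketched as below. The types and helpers are hypothetical stand-ins for quic_err_app() and quic_err_transport(); the frame type values 0x1c and 0x1d are the ones quoted in the message.

```c
#include <stdint.h>

/* Hypothetical error descriptor: a code plus its class. */
struct demo_quic_err {
	uint64_t code;
	int app; /* non-zero for an application-level error code */
};

/* Transport code for "no error". */
#define DEMO_QC_ERR_NO_ERROR 0x0u

static struct demo_quic_err demo_err_transport(uint64_t code)
{
	struct demo_quic_err e = { code, 0 };
	return e;
}

static struct demo_quic_err demo_err_app(uint64_t code)
{
	struct demo_quic_err e = { code, 1 };
	return e;
}

/* The class of the error, not its numeric value, selects the
 * CONNECTION_CLOSE frame type: 0x1c for transport, 0x1d for
 * application.
 */
static unsigned int demo_cc_frame_type(struct demo_quic_err e)
{
	return e.app ? 0x1du : 0x1cu;
}
```

Wrapping a transport code with the application helper therefore produces a frame of the wrong type even though the numeric code is unchanged.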
Amaury Denoyelle	412f1eeb89	BUG/MEDIUM: server: clear purgeable conns before server deletion Since the following commit, idle connections are cleared before a server is deleted. This is better than blocking server deletion due to inactive connections : 6e0afb2e274952663957121ea33cb6bae574fc2e MEDIUM: server: close idle conn on server deletion A BUG_ON() has been added to ensure that server idle conn counter is null after these connections are removed. However, Willy managed to trigger it easily by repeatedly and randomly deleting servers across a single-thread haproxy using a server-template with 1000 instances. In parallel, a h1load client is executed to generate traffic. This BUG_ON() reflected that some connections referencing the server targeted for deletion remained, even though idle server list is empty. In fact, this is caused by connections scheduled for purging. These connections are moved from idle server list to a global toremove_list while still being accounted by the server. A first approach could be to decrement server idle counter while moving connection to the purge list. However, this is functionally incorrect as these purgeable connections still reference the server and it could cause a crash if cleared after it. The correct fix for this issue is simply to remove every purgeable connection before a server is deleted. This is implemented by this patch by extending cli_parse_delete_server(). It could be enough to only remove connections targeting the deleted server, but as these connections will be purged anyway it is justified to clear the whole list. This must not be backported, unless the above mentioned patch is.	2024-05-15 15:01:55 +02:00
Aurelien DARRAGON	231d3d32be	MEDIUM: hlua: take nbthread into account in hlua_get_nb_instruction() Based on Willy's idea (from 3.0-dev6 announcement message): in this patch we try to reduce the max latency that can be caused by running lua scripts with default settings. Indeed, by default, hlua engine is allowed to process up to 10k instructions per batch. While this value was found to be the optimal one for a single thread, it turns out that keeping a thread busy for 10k lua instructions could increase thread contention. This is especially true when the script is loaded with 'lua-load', because in that case the current thread owns the main lua lock and prevent other threads from making any progress if they're also waiting on the main lock. Thanks to Thierry Fournier's work, we know that performance-wise we can reach optimal performance by sticking between 500 and 10k instructions per batch. Given that, when the script is loaded using 'lua-load', if no "tune.lua.forced-yield" was set by the user, we automatically divide the default value (10K) by the number of threads haproxy can use to reduce thread contention (given that all threads could compete for the main lua lock), however we make sure not to return a value below 500, because Thierry's work showed that this would come with a significant performance loss. The historical behavior may still be enforced by setting "tune.lua.forced-yield" to 10000 in the global config section.	2024-05-15 11:59:44 +02:00
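The computation described above can be sketched as a small helper. The function name, its parameters, and the use of 0 to mean "not configured" are assumptions made for illustration; only the 10000 default, the division by the thread count, and the 500 floor come from the message.

```c
/* Values taken from the commit message. */
#define DEMO_DEFAULT_YIELD 10000u
#define DEMO_MIN_YIELD       500u

/* Returns the number of Lua instructions allowed per batch.
 *  - forced_yield: user setting, 0 meaning "not configured" (assumption)
 *  - shared_lock : non-zero when the script shares the main Lua lock,
 *                  i.e. it was loaded with 'lua-load'
 *  - nbthread    : number of threads the process can use
 */
static unsigned int demo_nb_instruction(unsigned int forced_yield,
                                        int shared_lock,
                                        unsigned int nbthread)
{
	unsigned int val;

	if (forced_yield)
		return forced_yield;          /* explicit setting always wins */
	if (!shared_lock || nbthread <= 1)
		return DEMO_DEFAULT_YIELD;    /* no contention to mitigate */

	val = DEMO_DEFAULT_YIELD / nbthread;
	return val < DEMO_MIN_YIELD ? DEMO_MIN_YIELD : val;
}
```

With 4 threads the batch size becomes 2500 instructions; from 20 threads upward the floor of 500 applies.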
Aurelien DARRAGON	e60d9dddf8	MINOR: hlua: add hlua_nb_instruction getter No functional behavior change, but this will ease the work of dynamically computing hlua_nb_instruction value depending on various inputs.	2024-05-15 11:59:37 +02:00
Tim Duesterhus	6610f656ea	DOC: Update UUID references to RFC 9562 When support for UUIDv7 was added in commit aab6477b67415c4cc260bba5df359fa2e6f49733 the specification still was a draft. It has since been published as RFC 9562. This patch updates all UUID references from the obsoleted RFC 4122 and the draft for RFC 9562 to the published RFC 9562.	2024-05-15 11:40:08 +02:00
William Lallemand	8c6f43d382	REGTESTS: ssl: be more verbose with ocsp_compat_check.vtc the ocsp_compat_check.vtc reg-test is difficult to debug given that the haproxy output is piped in `grep -q`. This patch helps by showing the haproxy output as well as the return code.	2024-05-15 10:36:02 +02:00
William Manley	366b722f7e	MINOR: rhttp: Don't require SSL when attach-srv name parsing An attach-srv config line usually looks like this: tcp-request session attach-srv be/srv name ssl_c_s_dn(CN) while a rhttp server line usually looks like this: server srv rhttp@ sni req.hdr(host) The server sni argument is used as a key for looking up connection in the connection pool. The attach-srv name argument is used as a key for inserting connections into the pool. For it to work correctly they must match. There was a check that either both the attach-srv and server provide that key or neither does. It also checked that SSL and SNI were activated on the server. However, thanks to current connect_server() implementation, it appears that SNI is usable even without SSL to identify a connection in the pool. Thus, it can be diverted from its original intent in reverse HTTP case to serve even without SSL activated. For example, this could be useful to use `fc_pp_unique_id` as a name expression (DISCLAIMER: note that for now PROXY protocol is not compatible with rhttp). Error is still reported if either SNI or name is used without the other. This patch adjusts the message to a more helpful one. Arguably it would be easier to understand if instead of using `name` and `sni` for `attach-srv` and `server` rules it used the same term in both places - like "conn-pool-key" or something. That would make it clear that the two must match.	2024-05-14 16:39:07 +02:00
Aurelien DARRAGON	32f0cd3242	BUG/MINOR: log: smp_rgs array issues with inherited global log directives When a log directive is defined in the global section, each time we use "log global" in a proxy section, the global log directives are duplicated for the current proxy. This works by creating a new proxy logger struct and duplicating every member for each global one. However, smp_rgs logger member is a special pointer member that is allocated when "range" is used on a log directive. Currently, we simply copy the array pointer (from the global one), instead of creating our own copy. Because of that, range log sampling may not work properly in some situations prior to 3f1284560 ("MINOR: log: remove the unused curr_idx in struct smp_log_range") when used in global log directives, for instance: global log 127.0.0.1:5114 format raw sample 1-2,3:4 local0 info # should receive 75% of all proxy logs log 127.0.0.1:5115 format raw sample 4:4 local0 info # should receive 25% of all proxy logs listen proxy1 log global listen proxy2 log global May not work as expected, because curr_idx was stored within smp_rgs array member prior to 3f1284560, and due to this bug, it happens to be shared between every log directive inherited from a "global" one. The result is that curr_idx counter will not behave properly because the index will be increased globally instead of per-log directive, and it could even suffer from concurrent thread accesses under load since we don't own the global log directive's lock when manipulating it. Another issue that was revealed because of this bug is that the smp_rgs array allocated during config parsing is never freed in free_logger(), resulting in small memory leak during clean exit. To fix these issues all at once, let's properly duplicate smp_rgs logger struct member in dup_logger() like we already do for other special members so that every log directive has its own smp_rgs copy, and then systematically free it in free_logger(). 
While this bug affects all stable versions (including 2.4), it's probably best to not backport this beyond 2.6 because of 211ea252d ("BUG/MINOR: logs: fix logsrv leaks on clean exit") prerequisite that first appears in 2.6. [ada: for versions prior to 2.9, 969e212 ("MINOR: log: add dup_logsrv() helper function") and 76acde91 ("BUG/MINOR: log: keep the ref in dup_logger()") must be backported first. Note: Some ctx adjustments should be performed because 'logger' struct used to be named 'logsrv' in the past and 2.9 introduced logger target struct member. Thus it's probably easier to manually apply 76acde91 and the current bugfix by hand directly on top of 969e212. ]	2024-05-14 12:00:23 +02:00
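The fix for the shared smp_rgs array boils down to a deep copy on duplication and a matching free. The sketch below uses hypothetical, simplified types and names; it is not the real dup_logger() or free_logger() code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the logger structures. */
struct demo_range {
	unsigned int low, high;
};

struct demo_logger {
	struct demo_range *rgs; /* allocated array, owned by this logger */
	size_t rgs_sz;          /* number of entries in rgs */
};

/* Duplicates 'src' into 'dst'. Plain members are copied by value, but
 * the range array gets its own allocation so that the two loggers
 * never share mutable state. Returns 0 on success, -1 on allocation
 * failure.
 */
static int demo_dup_logger(struct demo_logger *dst,
                           const struct demo_logger *src)
{
	*dst = *src;
	dst->rgs = NULL;
	if (src->rgs && src->rgs_sz) {
		dst->rgs = malloc(src->rgs_sz * sizeof(*dst->rgs));
		if (!dst->rgs)
			return -1;
		memcpy(dst->rgs, src->rgs, src->rgs_sz * sizeof(*dst->rgs));
	}
	return 0;
}

/* Releases what the logger owns; without this the array would leak on
 * clean exit.
 */
static void demo_free_logger(struct demo_logger *l)
{
	free(l->rgs);
	l->rgs = NULL;
	l->rgs_sz = 0;
}
```

Copying only the pointer, as the struct assignment alone would do, is what made every inherited directive share the same array.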
Aurelien DARRAGON	9d4a44e713	BUG/MINOR: log: fix leak in add_sample_to_logformat_list() error path If add_sample_to_logformat_list() fails to allocate new logformat_node, then we directly jump to error_free label to cleanup the node using free_logformat_node() before returning an error. However if the node failed to allocate, then the sample expression that was allocated just before (not yet assigned) isn't released (free_logformat_node() is a no-op when NULL is provided). Thus if expr wasn't assigned to the node during early failure, then it must be manually released. This bug was introduced by 2462e5bcc ("BUG/MINOR: log: fix potential lf->name memory leak") which wasn't marked for backports. It only affects 3.0.	2024-05-13 16:44:27 +02:00
Ilia Shipitsin	cbe78c0281	CI: drop asan.log umbrella completely asan.log redirection appeared to work poorly, let's cease that practice for good. ML: https://www.mail-archive.com/haproxy@formilux.org/msg44844.html	2024-05-13 11:36:36 +02:00
Willy Tarreau	7217a9e9b9	[RELEASE] Released version 3.0-dev11 Released version 3.0-dev11 with the following main changes : - BUILD: clock: improve check for pthread_getcpuclockid() - CI: add Illumos scheduled workflow - CI: netbsd: limit scheduled workflow to parent repo only - OPTIM: log: resolve logformat options during postparsing - BUG/MINOR: haproxy: only tid 0 must not sleep if got signal - REGTEST: add tests for acl() sample fetch - BUG/MINOR: acl: support built-in ACLs with acl() sample - BUG/MINOR: cfgparse: use curproxy global var from config post validation - MEDIUM: stconn/muxes: Add an abort reason for SE shutdowns on muxes - MINOR: mux-h2: Set the SE abort reason when a RST_STREAM frame is received - MEDIUM: mux-h2: Forward h2 client cancellations to h2 servers - MINOR: mux-quic: Set tha SE abort reason when a STOP_SENDING frame is received - MINOR: stconn: Add samples to retrieve about stream aborts - MINOR: mux-quic: Add .ctl callback function to get info about a mux connection - MINOR: muxes: Add ctl commands to get info on streams for a connection - MINOR: connection: Add samples to retrieve info on streams for a connection - BUG/MEDIUM: log/ring: broken syslog octet counting - BUG/MEDIUM: mux-quic: fix crash on STOP_SENDING received without SD - DOC: lua: fix filters.txt file location - MINOR: dynbuf: pass a criticality argument to b_alloc() - MINOR: dynbuf: add functions to help queue/requeue buffer_wait fields - MINOR: dynbuf: use the b_queue()/b_requeue() functions everywhere - MEDIUM: dynbuf: make the buffer_wq an array of list heads - CLEANUP: tinfo: better align fields in thread_ctx - MINOR: dynbuf: provide a b_dequeue() function to detach a bw from the queue - MEDIUM: dynbuf: generalize the use of b_dequeue() to detach buffer_wait - MEDIUM: dynbuf/stream: re-enable queueing upon failed buffer allocation - MEDIUM: dynbuf/stream: do not allocate the buffers in the callback - MEDIUM: applet: make appctx_buf_available() only wake the applet 
up, not allocate - MINOR: applet: set the blocking flag in the buffer allocation function - MINOR: applet: adjust the allocation criticity based on the requested buffer - MINOR: dynbuf/mux-h1: use different criticalities for buffer allocations - MEDIUM: dynbuf/mux-h1: do not allocate the buffers in the callback - MEDIUM: dynbuf: refrain from offering a buffer if more critical ones are waiting - MINOR: stconn: report that a buffer allocation succeeded - MINOR: stream: report that a buffer allocation succeeded - MINOR: applet: report about buffer allocation success - MINOR: mux-h1: report that a buffer allocation succeeded - MEDIUM: stream: allocate without queuing when retrying - MEDIUM: channel: allocate without queuing when retrying - MEDIUM: mux-h1: allocate without queuing when retrying - MEDIUM: dynbuf: implement emergency buffers - MEDIUM: dynbuf: use emergency buffers upon failed memory allocations	2024-05-10 17:39:19 +02:00
Willy Tarreau	fc792694a6	MEDIUM: dynbuf: use emergency buffers upon failed memory allocations Now, if a pool_alloc() fails for a buffer and if conditions are met based on the queue number, we'll try to get an emergency buffer. Thanks to this the situation is way more stable now. With only 4 reserve buffers and 1 buffer it's possible to reliably serve 500 concurrent end-to-end H1 connections and consult stats in parallel in loops showing the growing number of buf_wait events in "show activity" without facing an instant stall like in the past. Lower values still cause quick stalls though. It's also apparent that some subsystems do not seem to detach from the buffer_wait lists when leaving. For example several crashes in the H1 part showed list elements still present after a free(), so maybe some operations performed inside h1_release() after the b_dequeue() call can sometimes result in a new allocation. Same for streams, where the dequeue is done relatively early.	2024-05-10 17:18:13 +02:00
Willy Tarreau	0ce51dc93b	MEDIUM: dynbuf: implement emergency buffers The buffer reserve set by tune.buffers.reserve has long been unused, and in order to deal gracefully with failed memory allocations we'll need to resort to a few emergency buffers that are pre-allocated per thread. These buffers are only for emergency use, so every time their count is below the configured number a b_free() will refill them. For this reason their count can remain pretty low. We changed the default number from 2 to 4 per thread, and the minimum value is now zero (e.g. for low-memory systems). The tune.buffers.limit setting has always been a problem when trying to deal with the reserve but now we could simplify it by simply pushing the limit (if set) to match the reserve. That was already done in the past with a static value, but now with threads it was a bit trickier, which is why the per-thread allocators increment the limit on the fly before allocating their own buffers. This also means that the configured limit is saner and now corresponds to the regular buffers that can be allocated on top of emergency buffers. At the moment these emergency buffers are not used upon allocation failure. The only reason is to ease bisecting later if needed, since this commit only has to deal with resource management.	2024-05-10 17:18:13 +02:00
Willy Tarreau	47665be083	MEDIUM: mux-h1: allocate without queuing when retrying Now when trying to allocate a buffer, we can check if we've been notified of availability via the callback, in which case we should not consult the queue, or if we're doing a first allocation and check the queue. At this point it still doesn't change much since the stream still doesn't make use of it but some progress is expected.	2024-05-10 17:18:13 +02:00
Willy Tarreau	5b8d27617f	MEDIUM: channel: allocate without queuing when retrying Now when trying to allocate a channel buffer, we can check if we've been notified of availability via the producer stream connector callback, in which case we should not consult the queue, or if we're doing a first allocation and check the queue.	2024-05-10 17:18:13 +02:00
Willy Tarreau	b5714b45e8	MEDIUM: stream: allocate without queuing when retrying Now when trying to allocate the work buffer, we can check if we've been notified of availability via the buf_wait callback, in which case we should not consult the queue, or if we're doing a first allocation and check the queue.	2024-05-10 17:18:13 +02:00
Willy Tarreau	f552f79ba5	MINOR: mux-h1: report that a buffer allocation succeeded When the buffer allocation callback is notified of a buffer availability, it will now set a MAYALLOC flag in addition to clearing the ALLOC one, for each of the 3 levels where we may fail an allocation. The flag will be cleared upon a successful allocation. This will soon be used to decide to re-allocate without waiting again in the queue. For now it has no effect. There's just a trick, we need to clear the various *_ALLOC flags before testing h1_recv_allowed() otherwise it will return false!	2024-05-10 17:18:13 +02:00
Willy Tarreau	cb2d758043	MINOR: applet: report about buffer allocation success When appctx_buf_available() is called, it now sets APPCTX_FL_IN_MAYALLOC or APPCTX_FL_OUT_MAYALLOC depending on the reportedly permitted buffer allocation, and these flags are cleared when the said buffers are allocated. For now they're not used for anything else.	2024-05-10 17:18:13 +02:00
Willy Tarreau	17d8916bb1	MINOR: stream: report that a buffer allocation succeeded When the buffer allocation callback is notified of a buffer availability, it will now set a MAYALLOC flag on the stream so that the stream knows it is allowed to bypass the queue checks. For now this is not used.	2024-05-10 17:18:13 +02:00
Willy Tarreau	7aff64518c	MINOR: stconn: report that a buffer allocation succeeded We used to have two states for the channel's input buffer used by the SC, NEED_BUFF or not, flipped by sc_need_buff() and sc_have_buff(). We want to have a 3rd state, indicating that we've just got a desired buffer. Let's add an HAVE_BUFF flag that is set by sc_have_buff() and that is cleared by sc_used_buff(). This way by looking at HAVE_BUFF we know that we're coming back from the allocation callback and that the offered buffer has not yet been used.	2024-05-10 17:18:13 +02:00
Willy Tarreau	d1eb48a12b	MEDIUM: dynbuf: refrain from offering a buffer if more critical ones are waiting Now b_alloc() will check the queues at the same and higher criticality levels before allocating a buffer, and will refrain from allocating one if these are not empty. The purpose is to put some priorities in the allocation order so that most critical allocators are offered a chance to complete. However in order to permit a freshly dequeued task to allocate again while siblings are still in the queue, there is a special DB_F_NOQUEUE flag to pass to b_alloc() that will take care of this special situation.	2024-05-10 17:18:13 +02:00
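The priority check described above can be sketched with a bitmap of waiting queues. The function name, the flag value, and the convention that a higher queue index means a more critical queue are assumptions for illustration; the real b_alloc() and DB_F_NOQUEUE definitions are not reproduced here.

```c
/* Hypothetical flag standing in for the "do not check queues" request. */
#define DEMO_F_NOQUEUE 0x1u

/* Decides whether an allocation for queue 'q' may proceed.
 *  - map  : bit N set when queue N has at least one waiter
 *  - q    : queue index of the requester (assumed: higher = more critical)
 *  - flags: DEMO_F_NOQUEUE for a freshly dequeued task retrying
 * Returns 1 if the allocation may proceed, 0 if the requester should
 * queue behind waiters of the same or a higher criticality.
 */
static int demo_may_alloc(unsigned int map, unsigned int q,
                          unsigned int flags)
{
	unsigned int same_or_higher = ~0u << q; /* bits q and above */

	if (flags & DEMO_F_NOQUEUE)
		return 1;
	return (map & same_or_higher) == 0;
}
```

The bypass flag is what lets a task that was just woken up allocate even though its siblings are still in the same queue.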
Willy Tarreau	a160b3c50c	MEDIUM: dynbuf/mux-h1: do not allocate the buffers in the callback One of the problematic designs with the buffer_wait mechanism is that the callbacks pre-allocate the buffers and stay in the run queue for a while, resulting in all of the few buffers being assigned to waiting tasks instead of being all available to one task that needs them all at once. Here we simply stop doing this, the callback clears the waiting flags and wakes the task up so that it has a chance of still finding some buffers.	2024-05-10 17:18:13 +02:00
Willy Tarreau	c510e81a3f	MINOR: dynbuf/mux-h1: use different criticalities for buffer allocations While it could certainly still be improved, this first approach consists in assigning buffers like this in the H1 mux: - h1c->obuf : DB_MUX_TX - h1c->ibuf : DB_MUX_RX - h1s->rxbuf: DB_SE_RX That's done via 3 distinct functions for better code clarity, and it also allowed to move the missing buffer flags assignment there. Among possible improvements would be to take into consideration the state of the parser (i.e. no data yet vs data, or headers vs payload) so that even server beginning of response or pure payload can be lowered in priority.	2024-05-10 17:18:13 +02:00
Willy Tarreau	4a42af1744	MINOR: applet: adjust the allocation criticity based on the requested buffer When we want to allocate an in buffer, it's in order to pass data to the applet, that will consume it, so it must be seen as the same as a send() from the higher level, i.e. MUX_TX. And for the outbuf, it's a stream endpoint returning data, i.e. DB_SE_RX.	2024-05-10 17:18:13 +02:00
Willy Tarreau	4ffb3b5ebe	MINOR: applet: set the blocking flag in the buffer allocation function Instead of having each caller of appctx_get_buf() think about setting the blocking flag, better have the function do it, since it's already handling the queue anyway. This way we're sure that both are consistent.	2024-05-10 17:18:13 +02:00
Willy Tarreau	ee0d56ac85	MEDIUM: applet: make appctx_buf_available() only wake the applet up, not allocate Now we don't want bufwait handlers to preallocate the resources they were expecting since it contributes to the shortage. Let's just wake the applet up and that's all.	2024-05-10 17:18:13 +02:00
Willy Tarreau	9a27d7aa6f	MEDIUM: dynbuf/stream: do not allocate the buffers in the callback One of the problematic designs with the buffer_wait mechanism is that the callbacks pre-allocate the buffers and stay in the run queue for a while, resulting in all of the few buffers being assigned to waiting tasks instead of being all available to one task that needs them all at once. Here we simply stop doing this, the callback clears the waiting flags and wakes the task up so that it has a chance of still finding some buffers.	2024-05-10 17:18:13 +02:00
Willy Tarreau	db21062881	MEDIUM: dynbuf/stream: re-enable queueing upon failed buffer allocation The errors were not working fine anyway since we know that upon low memory condition everything freezes. However we have a chance to do better now, so let's start by re-enabling queueing when allocations fail.	2024-05-10 17:18:13 +02:00
Willy Tarreau	f5566afec6	MEDIUM: dynbuf: generalize the use of b_dequeue() to detach buffer_wait Now thanks to this the bufq_map field is expected to remain accurate.	2024-05-10 17:18:13 +02:00
Willy Tarreau	f70bd5fad1	MINOR: dynbuf: provide a b_dequeue() function to detach a bw from the queue Now that we need to keep the bitmap in sync with the list heads, we don't want tasks to leave just doing a LIST_DEL_INIT() without updating the map. Let's provide a b_dequeue() function for that purpose. The function detects when it's going to remove the last element and figures the queue number based on the pointer since it points to the root. It's not used yet.	2024-05-10 17:18:13 +02:00
Willy Tarreau	53461e4d94	CLEANUP: tinfo: better align fields in thread_ctx The introduction of buffer_wq[] in thread_ctx pushed a few fields around and the cache line alignment is less satisfying. And more importantly, even before this, all the lists in the local parts were 8-aligned, with the first one split across two cache lines. We can do better: - sched_profile_entry is not atomic at all, the data it points to is atomic so it doesn't need to be in the atomic-only region, and it can fill the 8-hole before the lists - the align(2*void) that was only before tasklets[] moves before all lists (and it's a nop for now) This now makes the lists and buffer_wq[] start on a cache line boundary, leaves 48 bytes after the lists before the atomic-only cache line, and leaves a full cache line at the end for 128-alignment. This way we still have plenty of room in both parts with better aligned fields.	2024-05-10 17:18:13 +02:00
Willy Tarreau	a5d6a79986	MEDIUM: dynbuf: make the buffer_wq an array of list heads Let's turn the buffer_wq into an array of 4 list heads. These are chosen by criticality. The DB_CRIT_TO_QUEUE() macro maps each criticality level into one of these 4 queues. The goal here clearly is to make it possible to wake up the most critical queues in priority in order to let some tasks finish their job and release buffers that others can use. In order to avoid having to look up all queues, a bit map indicates which queues are in use, which also allows to avoid looping in the most common case where queues are empty.	2024-05-10 17:18:13 +02:00
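A minimal sketch of four queues tracked by a bitmap follows. The list heads are replaced by simple counters, and the names as well as the "higher index is more critical" ordering are assumptions; the point is only how the map is kept in sync on enqueue and dequeue and how it gives a fast path when nothing is waiting.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NB_QUEUES 4

/* Hypothetical wait-queue set: counters stand in for list heads. */
struct demo_wq {
	unsigned int count[DEMO_NB_QUEUES];
	uint8_t map; /* bit q is set while queue q is non-empty */
};

static void demo_enqueue(struct demo_wq *wq, unsigned int q)
{
	assert(q < DEMO_NB_QUEUES);
	wq->count[q]++;
	wq->map |= (uint8_t)(1u << q);
}

/* Removes one waiter and clears the map bit when the queue drains,
 * so the map never reports an empty queue as busy.
 */
static void demo_dequeue(struct demo_wq *wq, unsigned int q)
{
	assert(q < DEMO_NB_QUEUES);
	if (!wq->count[q])
		return;
	if (--wq->count[q] == 0)
		wq->map &= (uint8_t)~(1u << q);
}

/* Returns the most critical non-empty queue, or -1 if none is waiting.
 * The map test avoids any loop in the common all-empty case.
 */
static int demo_next_queue(const struct demo_wq *wq)
{
	int q;

	if (!wq->map)
		return -1;
	for (q = DEMO_NB_QUEUES - 1; q >= 0; q--)
		if (wq->map & (1u << q))
			return q;
	return -1;
}
```

A waiter that left its queue without going through the dequeue helper would leave a stale bit behind, which is why a dedicated dequeue function matters once the map exists.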
Willy Tarreau	a214197ce7	MINOR: dynbuf: use the b_queue()/b_requeue() functions everywhere The code places that were used to manipulate the buffer_wq manually now just call b_queue() or b_requeue(). This will simplify the multiple list management later.	2024-05-10 17:18:13 +02:00
Willy Tarreau	d1c2f325a2	MINOR: dynbuf: add functions to help queue/requeue buffer_wait fields When failing an allocation we always do the same dance, add the buffer_wait struct to a list if it's not, and return. Let's just add dedicated functions to centralize this, this will be useful to implement a bit more complex logic. For now they're not used.	2024-05-10 17:18:13 +02:00
Willy Tarreau	72d0dcda8e	MINOR: dynbuf: pass a criticality argument to b_alloc() The goal is to indicate how critical the allocation is, between the least one (growing an existing buffer ring) and the topmost one (boot time allocation for the life of the process). The 3 tcp-based muxes (h1, h2, fcgi) use a common allocation function to try to allocate otherwise subscribe. There's currently no distinction of direction nor part that tries to allocate, and this should be revisited to improve this situation, particularly when we consider that mux-h2 can reduce its Tx allocations if needed. For now, 4 main levels are planned, to translate how the data travels inside haproxy from a producer to a consumer: - MUX_RX: buffer used to receive data from the OS - SE_RX: buffer used to place a transformation of the RX data for a mux, or to produce a response for an applet - CHANNEL: the channel buffer for sync recv - MUX_TX: buffer used to transfer data from the channel to the outside, generally a mux but there can be a few specificities (e.g. http client's response buffer passed to the application, which also gets a transformation of the channel data). The other levels are a bit different in that they don't strictly need to allocate for the first two ones, or they're permanent for the last one (used by compression).	2024-05-10 17:18:13 +02:00
Aurelien DARRAGON	84f7525c5b	DOC: lua: fix filters.txt file location At the beginning of the filter class section, we encourage the user to check out filters.txt file to get to know how the filters API works within haproxy. However the file location is incorrect. The proper directory to look for the file is: doc/internals/api. It should be backported up to 2.5.	2024-05-10 11:02:56 +02:00
Amaury Denoyelle	cc9827bb09	BUG/MEDIUM: mux-quic: fix crash on STOP_SENDING received without SD Abort reason code received on STOP_SENDING is notified to upper layer since the following commit : 367ce1ebf3e4cead319a9f01581037c9f0280e77 MINOR: mux-quic: Set tha SE abort reason when a STOP_SENDING frame is received However, this causes a crash when a STOP_SENDING is received on a QCS instance without any stream instantiated. Fix this by checking first if qcs->sd is not NULL before setting abort code. This bug can easily be reproduced by emitting a STOP_SENDING as first frame of a stream. This should fix github issue #2563. This does not need to be backported.	2024-05-10 11:01:05 +02:00
Aurelien DARRAGON	fbbc2925d4	BUG/MEDIUM: log/ring: broken syslog octet counting As reported by Tristan in GH #2561, syslog messages sent over rings are malformed since commit 01aa0a05 ("MEDIUM: ring: change the ring reader to use the new vector-based API now"). Indeed, take a look at the following log message produced prior to 01aa0a05: 181 <134>1 2024-05-07T09:45:21.543263+02:00 - haproxy 113700 - - 127.0.0.1:56136 [07/May/2024:09:45:21.491] front front/s1 0/0/21/30/51 404 369 - - ---- 1/1/0/0/0 0/0 "GET / HTTP/1.1" Starting with 01aa0a05, here's the equivalent log message: <134>1 2024-05-07T09:45:21.543263+02:00 - haproxy 112729 - - 127.0.0.1:56136 [07/May/2024:09:45:21.491] front front/s1 0/0/66/39/105 404 369 - - ---- 1/1/0/0/0 0/0 "GET / HTTP/1.1"-fwr -> Message is missing octet counting header, and garbage bytes are found at the end of the payload. This bug is caused by a small mistake in syslog_applet_append_event(): when the function was refactored to use vector API instead of buffer API, we used 'trash.area' as starting pointer to write the event instead of 'trash.area + trash.data', causing existing octet counting prefix (already written in trash) to be overwritten and trash.data to be wrongly incremented. No backport needed (01aa0a05 was introduced during 3.0 development)	2024-05-07 19:23:01 +02:00
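The pointer mistake described above can be illustrated with a tiny framing helper. The buffer type and function name are hypothetical; the framing itself (decimal length, one space, then the message) follows the octet counting format visible in the first log sample.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical scratch buffer, loosely modelled on an area/data pair. */
struct demo_buf {
	char area[256];
	size_t data; /* number of bytes currently used in area */
};

/* Frames 'msg' with an octet counting header: "<len> <msg>".
 * Returns 0 on success, -1 if the result does not fit.
 */
static int demo_frame_event(struct demo_buf *b, const char *msg, size_t len)
{
	int n = snprintf(b->area, sizeof(b->area), "%zu ", len);

	if (n < 0 || (size_t)n + len > sizeof(b->area))
		return -1;
	b->data = (size_t)n;

	/* Key point: the payload goes AFTER the header already present,
	 * at area + data. Writing at 'area' instead would overwrite the
	 * header while 'data' still grows by 'len', leaving stale bytes
	 * at the end of the reported payload.
	 */
	memcpy(b->area + b->data, msg, len);
	b->data += len;
	return 0;
}
```

Framing a 5-byte message this way yields the 7 bytes "5 hello", with the length prefix intact.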
Christopher Faulet	bd47e344b8	MINOR: connection: Add samples to retrieve info on streams for a connection Thanks to the previous fix, it is now possible to get the number of opened streams for a connection and the negotiated limit. Here, corresponding sample fetches are added, in fc_ and bc_ scopes. On frontend side, the limit of streams is imposed by HAProxy. But on the backend side, the limit is defined by the server. It may be useful for debugging purposes because it may explain slow-downs on some processing.	2024-05-06 22:00:01 +02:00
Christopher Faulet	eca9831ec8	MINOR: muxes: Add ctl commands to get info on streams for a connection There are 2 new ctl commands that may be used to retrieve the current number of streams opened for a connection and its limit (the maximum number of streams a mux connection supports). For the PT and H1 muxes, the limit is always 1 and the current number of streams is 0 for idle connections, otherwise 1 is returned. For the H2 and the FCGI muxes, the info is already available in the mux connection. For the QUIC mux, the limit is also directly available. It is the maximum initial sub-ID of bidirectional stream allowed for the connection. For the current number of streams, it is the number of SC attached on the connection and the number of not already attached streams present in the "opening_list" list.	2024-05-06 22:00:00 +02:00
Christopher Faulet	12fb6d73cd	MINOR: mux-quic: Add .ctl callback function to get info about a mux connection Other muxes implement this callback function. It was not implemented for the QUIC mux because it was useless. It will be used to retrieve the current/max number of streams for a QUIC connection. So let's add it, adding the default support for MUX_CTL_EXIT_STATUS command.	2024-05-06 22:00:00 +02:00
Christopher Faulet	068ce2d5d2	MINOR: stconn: Add samples to retrieve about stream aborts It is now possible to retrieve some info about the abort received for a server or a client stream, if any. * fs.aborted and bs.aborted can be used to know if an abort was received on frontend or backend side. A boolean is returned. * fs.rst_code and bs.rst_code return the code of the received RESET_STREAM frame for a H2 stream or the code of the received STOP_SENDING frame for a QUIC stream. In both cases, the error code attached to the frame is returned. The sample fetch fails if no such frame was received or if the stream is not an H2/QUIC stream.	2024-05-06 22:00:00 +02:00
Christopher Faulet	367ce1ebf3	MINOR: mux-quic: Set tha SE abort reason when a STOP_SENDING frame is received When STOP_SENDING frame is received for a quic stream, the error code is now saved in the SE abort reason. To do so, we use the QUIC source (SE_ABRT_SRC_MUX_QUIC). For now, this code is only set but not used on the opposite side.	2024-05-06 22:00:00 +02:00
Christopher Faulet	20b156ee15	MEDIUM: mux-h2: Forward h2 client cancellations to h2 servers When a H2 client sends a RST_STREAM(CANCEL) frame to abort a request, the abort reason is now used on server side, in the H2 mux, to set the RST_STREAM code. The main use case is to forward client cancellations to gRPC applications. This patch should fix the issue #172.	2024-05-06 22:00:00 +02:00
Christopher Faulet	dea79f3fe1	MINOR: mux-h2: Set the SE abort reason when a RST_STREAM frame is received When RST_STREAM frame is received, the error code is now saved in the SE abort reason. To do so, we use the H2 source (SE_ABRT_SRC_MUX_H2). For now, this code is only set but not used on the opposite side.	2024-05-06 22:00:00 +02:00
Christopher Faulet	96f8b7ad08	MEDIUM: stconn/muxes: Add an abort reason for SE shutdowns on muxes A reason is now passed as parameter to muxes shutdowns to pass additional info about the abort, if any. No info means no abort or only a generic one. For now, the reason is composed of two 32-bit integers. The first one represents the abort code and the other one represents the info about the code (for instance the source). The code should be interpreted according to the associated info. One info is the source, encoded on 5 bits. Other bits are reserved for now. For now, the muxes are the only supported source. But we can imagine extending it to applets, streams, health-checks... The current design is quite simple and will most probably evolve. But the idea is to let the opposite side forward some errors and let a mux know why its stream was aborted. At first glance, an abort reason must only be evaluated if SE_SHW_SILENT flag is set. The main goal in the short term is to forward some H2 RST_STREAM codes because it is mandatory for gRPC applications, mainly to forward gRPC cancellation from an H2 client to an H2 server. But we can imagine altering this reason at the application level to enrich it. It would also be used to report more accurate errors in logs.	2024-05-06 22:00:00 +02:00
Patrick Hemmer	28489021b3	BUG/MINOR: cfgparse: use curproxy global var from config post validation Previously check_config_validity() had its own curproxy variable. This resulted in the acl() sample fetch being unable to determine which proxy was in use when used from within log-format statements. This change addresses the issue by having the check_config_validity() function use the global variable instead.	2024-05-06 18:45:47 +02:00
Patrick Hemmer	93d4e99714	BUG/MINOR: acl: support built-in ACLs with acl() sample Built-in ACLs were not being searched by the acl() sample fetch. This fixes that so they are searched if no other match is found.	2024-05-06 18:42:54 +02:00
Patrick Hemmer	7c6b410b35	REGTEST: add tests for acl() sample fetch This adds reg tests for the recently added acl() sample fetch	2024-05-06 18:41:57 +02:00
Valentine Krasnobaeva	4a9e3e102e	BUG/MINOR: haproxy: only tid 0 must not sleep if got signal This patch fixes the commit eea152ee68 ("BUG/MINOR: signals/poller: ensure wakeup from signals"). There is some probability that run_poll_loop() becomes infinite, if TH_FL_SLEEPING is withdrawn from all threads in the second signal_queue_len check, when a signal has been received just after the first one. In such a particular case, the 'wake' variable, which is used to terminate the thread's poll loop, is never reset to 0. So, we never enter the "stopping" part of run_poll_loop() and threads, except the one with id 0 (tid 0 handles signals), will continue to call _do_poll() eternally and will never sleep, as their TH_FL_SLEEPING flag was unset. This flag needs to be removed only for tid 0, as it was done in the first signal_queue_len check. This fixes issue #2537 "infinite loop when shutting down". This fix must be backported to every stable version.	2024-05-06 18:39:08 +02:00
Aurelien DARRAGON	03ca16f38b	OPTIM: log: resolve logformat options during postparsing In lf_buildctx_prepare(), we perform costly bitwise operations for every node to resolve node options and check for incompatibilities with global options. In fact, all this logic may safely be performed during postparsing. This is what we're doing in this commit. Doing so saves us from unnecessary runtime checks and could help speed up sess_build_logline(). Since checks are not as costly as before (due to them being performed during postparsing and not on the log building path anymore), a complementary check for OPT_HTTP vs OPT_ENCODE incompatibility was added: encoding is ignored if HTTP option is set, unless HTTP option wasn't set globally and encoding was set globally, which means encoding takes precedence. Thanks to this patch, lf_buildctx_prepare() now only takes care of assigning proper typecast and options settings depending on whether it's used from global or per-node context, and prepares CBOR-specific structure members when CBOR encode option is set.	2024-05-06 11:13:46 +02:00
Ilia Shipitsin	05ecba0813	CI: netbsd: limit scheduled workflow to parent repo only it is not very useful for most of forks.	2024-05-06 08:26:14 +02:00
Ilia Shipitsin	fab5a23731	CI: add Illumos scheduled workflow this is very initial build only implementation.	2024-05-06 08:26:05 +02:00
Ilia Shipitsin	a7cf2454dd	BUILD: clock: improve check for pthread_getcpuclockid() if _POSIX_THREAD_CPUTIME is greater than 0, pthread_getcpuclockid() is implemented. This should fix the build on Solaris 11. Reference: https://docs.oracle.com/cd/E88353_01/html/E37842/unistd-3head.html ML: https://www.mail-archive.com/haproxy@formilux.org/msg44915.html	2024-05-06 08:25:17 +02:00
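The octet-counting regression described in commit fbbc2925d4 above is easier to follow with a small model of the framing itself. The sketch below is not HAProxy code: it is a minimal Python illustration, assuming the RFC 6587 octet-counting convention in which each syslog message is prefixed with its length in bytes and a single space (the leading `181 ` in the first log sample). The function names are hypothetical.

```python
def frame_octet_counting(msg: bytes) -> bytes:
    """Prefix a syslog message with its byte length and one space."""
    return str(len(msg)).encode("ascii") + b" " + msg


def parse_octet_counting(stream: bytes) -> list:
    """Split a stream of octet-counted frames back into messages.

    Raises ValueError when a frame has no numeric length prefix or is
    shorter than announced, which is roughly what a receiver would face
    with the malformed output quoted in the commit message.
    """
    messages = []
    pos = 0
    while pos < len(stream):
        sep = stream.index(b" ", pos)    # end of the length prefix
        length = int(stream[pos:sep])    # ValueError if not a number
        start = sep + 1
        end = start + length
        if end > len(stream):
            raise ValueError("truncated frame")
        messages.append(stream[start:end])
        pos = end
    return messages


if __name__ == "__main__":
    framed = frame_octet_counting(b"<134>1 hello")
    print(framed)
    print(parse_octet_counting(framed + framed))
```

A message emitted without its length prefix, as in the second log sample, fails in `parse_octet_counting` at the `int()` conversion, because the stream starts with `<134>1` instead of a decimal count.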

960 changed files with 109256 additions and 36857 deletions

									
										6

.cirrus.yml
									
@@ -1,15 +1,15 @@
 FreeBSD_task:
   freebsd_instance:
     matrix:
-      image_family: freebsd-13-2
+      image_family: freebsd-14-3
   only_if: $CIRRUS_BRANCH =~ 'master|next'
   install_script:
-    - pkg update -f && pkg upgrade -y && pkg install -y openssl git gmake lua53 socat pcre
+    - pkg update -f && pkg upgrade -y && pkg install -y openssl git gmake lua54 socat pcre2
   script:
     - sudo sysctl kern.corefile=/tmp/%N.%P.core
     - sudo sysctl kern.sugid_coredump=1
     - scripts/build-vtest.sh
-    - gmake CC=clang V=1 ERR=1 TARGET=freebsd USE_ZLIB=1 USE_PCRE=1 USE_OPENSSL=1 USE_LUA=1 LUA_INC=/usr/local/include/lua53 LUA_LIB=/usr/local/lib LUA_LIB_NAME=lua-5.3
+    - gmake CC=clang V=1 ERR=1 TARGET=freebsd USE_ZLIB=1 USE_PCRE2=1 USE_PCRE2_JIT=1 USE_OPENSSL=1 USE_LUA=1 LUA_INC=/usr/local/include/lua54 LUA_LIB=/usr/local/lib LUA_LIB_NAME=lua-5.4
     - ./haproxy -vv
     - ldd haproxy
   test_script:

									
										34

.github/actions/setup-vtest/action.yml
									
										vendored
									
										Normal file
									
				@ -0,0 +1,34 @@

				name: 'setup VTest'

				description: 'ssss'

				runs:

				  using: "composite"

				  steps:

				    - name: Setup coredumps

				      if: ${{ startsWith(matrix.os, 'ubuntu-') }}

				      shell: bash

				      run: |

				        sudo sysctl -w fs.suid_dumpable=1

				        sudo sysctl kernel.core_pattern=/tmp/core.%h.%e.%t

				    - name: Setup ulimit for core dumps

				      shell: bash

				      run: |

				        # This is required for macOS which does not actually allow to increase

				        # the '-n' soft limit to the hard limit, thus failing to run.

				        ulimit -n 65536

				        ulimit -c unlimited

				    - name: Install VTest

				      shell: bash

				      run: |

				        scripts/build-vtest.sh

				    - name: Install problem matcher for VTest

				      shell: bash

				      # This allows one to more easily see which tests fail.

				      run: echo "::add-matcher::.github/vtest.json"

6

.github/h2spec.config vendored

@@ -19,9 +19,9 @@ defaults
 frontend h2
     mode http
-    bind 127.0.0.1:8443 ssl crt reg-tests/ssl/common.pem alpn h2,http/1.1
-    default_backend h2
+    bind 127.0.0.1:8443 ssl crt reg-tests/ssl/certs/common.pem alpn h2,http/1.1
+    default_backend h2b
-backend h2
+backend h2b
     errorfile 200 .github/errorfile
     http-request deny deny_status 200

									
										129

.github/matrix.py
									
										vendored
									
				@ -67,6 +67,37 @@ def determine_latest_aws_lc(ssl):

				    latest_tag = max(valid_tags, key=aws_lc_version_string_to_num)

				    return "AWS_LC_VERSION={}".format(latest_tag[1:])

				def aws_lc_fips_version_string_to_num(version_string):

				    return tuple(map(int, version_string[12:].split('.')))

				def aws_lc_fips_version_valid(version_string):

				    return re.match('^AWS-LC-FIPS-[0-9]+(\.[0-9]+)*$', version_string)

				@functools.lru_cache(5)

				def determine_latest_aws_lc_fips(ssl):

				    # the AWS-LC-FIPS tags are at the end of the list, so let's get a lot

				    tags = get_all_github_tags("https://api.github.com/repos/aws/aws-lc/tags?per_page=200")

				    if not tags:

				        return "AWS_LC_FIPS_VERSION=failed_to_detect"

				    valid_tags = list(filter(aws_lc_fips_version_valid, tags))

				    latest_tag = max(valid_tags, key=aws_lc_fips_version_string_to_num)

				    return "AWS_LC_FIPS_VERSION={}".format(latest_tag[12:])

				def wolfssl_version_string_to_num(version_string):

				    return tuple(map(int, version_string[1:].removesuffix('-stable').split('.')))

				def wolfssl_version_valid(version_string):

				    return re.match('^v[0-9]+(\.[0-9]+)*-stable$', version_string)

				@functools.lru_cache(5)

				def determine_latest_wolfssl(ssl):

				    tags = get_all_github_tags("https://api.github.com/repos/wolfssl/wolfssl/tags")

				    if not tags:

				        return "WOLFSSL_VERSION=failed_to_detect"

				    valid_tags = list(filter(wolfssl_version_valid, tags))

				    latest_tag = max(valid_tags, key=wolfssl_version_string_to_num)

				    return "WOLFSSL_VERSION={}".format(latest_tag[1:].removesuffix('-stable'))

				@functools.lru_cache(5)

				def determine_latest_libressl(ssl):

				    try:

@@ -94,9 +125,11 @@ def main(ref_name):
     # Ubuntu
     if "haproxy-" in ref_name:
-        os = "ubuntu-22.04" # stable branch
+        os = "ubuntu-24.04"         # stable branch
+        os_arm = "ubuntu-24.04-arm" # stable branch
     else:
-        os = "ubuntu-latest" # development branch
+        os = "ubuntu-24.04"         # development branch
+        os_arm = "ubuntu-24.04-arm" # development branch
     TARGET = "linux-glibc"
     for CC in ["gcc", "clang"]:

@@ -123,11 +156,10 @@ def main(ref_name):
                     "OT_INC=${HOME}/opt-ot/include",
                     "OT_LIB=${HOME}/opt-ot/lib",
                     "OT_RUNPATH=1",
-                    "USE_PCRE=1",
-                    "USE_PCRE_JIT=1",
+                    "USE_PCRE2=1",
+                    "USE_PCRE2_JIT=1",
                     "USE_LUA=1",
                     "USE_OPENSSL=1",
-                    "USE_SYSTEMD=1",
                     "USE_WURFL=1",
                     "WURFL_INC=addons/wurfl/dummy",
                     "WURFL_LIB=addons/wurfl/dummy",

@@ -142,37 +174,37 @@ def main(ref_name):
         # ASAN
-        matrix.append(
-            {
-                "name": "{}, {}, ASAN, all features".format(os, CC),
-                "os": os,
-                "TARGET": TARGET,
-                "CC": CC,
-                "FLAGS": [
-                    "USE_OBSOLETE_LINKER=1",
-                    'ARCH_FLAGS="-g -fsanitize=address"',
-                    'OPT_CFLAGS="-O1"',
-                    "USE_ZLIB=1",
-                    "USE_OT=1",
-                    "OT_INC=${HOME}/opt-ot/include",
-                    "OT_LIB=${HOME}/opt-ot/lib",
-                    "OT_RUNPATH=1",
-                    "USE_PCRE=1",
-                    "USE_PCRE_JIT=1",
-                    "USE_LUA=1",
-                    "USE_OPENSSL=1",
-                    "USE_SYSTEMD=1",
-                    "USE_WURFL=1",
-                    "WURFL_INC=addons/wurfl/dummy",
-                    "WURFL_LIB=addons/wurfl/dummy",
-                    "USE_DEVICEATLAS=1",
-                    "DEVICEATLAS_SRC=addons/deviceatlas/dummy",
-                    "USE_PROMEX=1",
-                    "USE_51DEGREES=1",
-                    "51DEGREES_SRC=addons/51degrees/dummy/pattern",
-                ],
-            }
-        )
+        for os_asan in [os, os_arm]:
+            matrix.append(
+                {
+                    "name": "{}, {}, ASAN, all features".format(os_asan, CC),
+                    "os": os_asan,
+                    "TARGET": TARGET,
+                    "CC": CC,
+                    "FLAGS": [
+                        "USE_OBSOLETE_LINKER=1",
+                        'ARCH_FLAGS="-g -fsanitize=address"',
+                        'OPT_CFLAGS="-O1"',
+                        "USE_ZLIB=1",
+                        "USE_OT=1",
+                        "OT_INC=${HOME}/opt-ot/include",
+                        "OT_LIB=${HOME}/opt-ot/lib",
+                        "OT_RUNPATH=1",
+                        "USE_PCRE2=1",
+                        "USE_PCRE2_JIT=1",
+                        "USE_LUA=1",
+                        "USE_OPENSSL=1",
+                        "USE_WURFL=1",
+                        "WURFL_INC=addons/wurfl/dummy",
+                        "WURFL_LIB=addons/wurfl/dummy",
+                        "USE_DEVICEATLAS=1",
+                        "DEVICEATLAS_SRC=addons/deviceatlas/dummy",
+                        "USE_PROMEX=1",
+                        "USE_51DEGREES=1",
+                        "51DEGREES_SRC=addons/51degrees/dummy/pattern",
+                    ],
+                }
+            )
         for compression in ["USE_ZLIB=1"]:
             matrix.append(

@@ -189,9 +221,10 @@ def main(ref_name):
             "stock",
             "OPENSSL_VERSION=1.0.2u",
             "OPENSSL_VERSION=1.1.1s",
+            "OPENSSL_VERSION=3.5.1",
             "QUICTLS=yes",
-            "WOLFSSL_VERSION=5.6.6",
-            "AWS_LC_VERSION=1.16.0",
+            "WOLFSSL_VERSION=5.7.0",
+            "AWS_LC_VERSION=1.39.0",
             # "BORINGSSL=yes",
         ]

@@ -203,8 +236,7 @@ def main(ref_name):
         for ssl in ssl_versions:
             flags = ["USE_OPENSSL=1"]
-            if ssl == "BORINGSSL=yes" or ssl == "QUICTLS=yes" or "LIBRESSL" in ssl or "WOLFSSL" in ssl or "AWS_LC" in ssl:
-                flags.append("USE_QUIC=1")
+            skipdup=0
             if "WOLFSSL" in ssl:
                 flags.append("USE_OPENSSL_WOLFSSL=1")
             if "AWS_LC" in ssl:

@@ -214,8 +246,23 @@ def main(ref_name):
                 flags.append("SSL_INC=${HOME}/opt/include")
             if "LIBRESSL" in ssl and "latest" in ssl:
                 ssl = determine_latest_libressl(ssl)
+                skipdup=1
             if "OPENSSL" in ssl and "latest" in ssl:
                 ssl = determine_latest_openssl(ssl)
+                skipdup=1
+            # if "latest" equals a version already in the list
+            if ssl in ssl_versions and skipdup == 1:
+                continue
+            openssl_supports_quic = False
+            try:
+              openssl_supports_quic = version.Version(ssl.split("OPENSSL_VERSION=",1)[1]) >= version.Version("3.5.0")
+            except:
+              pass
+            if ssl == "BORINGSSL=yes" or ssl == "QUICTLS=yes" or "LIBRESSL" in ssl or "WOLFSSL" in ssl or "AWS_LC" in ssl or openssl_supports_quic:
+                flags.append("USE_QUIC=1")
             matrix.append(
                 {

@@ -233,7 +280,7 @@ def main(ref_name):
     if "haproxy-" in ref_name:
         os = "macos-13"     # stable branch
     else:
-        os = "macos-14"     # development branch
+        os = "macos-26"     # development branch
     TARGET = "osx"
     for CC in ["clang"]:
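The tag-selection helpers added to `.github/matrix.py` in the first hunk of this file all follow one pattern: keep only the tags that match a release regex, then pick the maximum by numeric version tuple rather than by string order. The following is a minimal standalone sketch of that pattern for the `AWS-LC-FIPS-` prefix. It does not call the GitHub API; the tag names in the usage lines are made-up samples, `latest_fips_tag` is a hypothetical name, and returning `None` for an empty list is a choice of this sketch (the diff calls `max()` directly on the filtered list).

```python
import re
from typing import List, Optional, Tuple

FIPS_PREFIX = "AWS-LC-FIPS-"
# Same shape as the pattern in the diff, written as a raw string.
FIPS_TAG_RE = re.compile(r"^AWS-LC-FIPS-[0-9]+(\.[0-9]+)*$")


def fips_version_key(tag: str) -> Tuple[int, ...]:
    """Turn 'AWS-LC-FIPS-2.0.17' into (2, 0, 17) for numeric comparison."""
    return tuple(int(part) for part in tag[len(FIPS_PREFIX):].split("."))


def latest_fips_tag(tags: List[str]) -> Optional[str]:
    """Return the highest valid FIPS tag, or None if there is none."""
    valid = [t for t in tags if FIPS_TAG_RE.match(t)]
    if not valid:
        return None
    return max(valid, key=fips_version_key)


if __name__ == "__main__":
    # Hypothetical tag list, for illustration only.
    sample = ["v1.39.0", "AWS-LC-FIPS-2.0.9", "AWS-LC-FIPS-2.0.17"]
    print(latest_fips_tag(sample))
```

Comparing tuples of integers is what makes `2.0.17` sort above `2.0.9`; a plain string comparison would rank them the other way round.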

									
										12

.github/workflows/aws-lc-fips.yml
									
										vendored
									
										Normal file
									
				@ -0,0 +1,12 @@

				name: AWS-LC-FIPS

				on:

				  schedule:

				    - cron: "0 0 * * 4"

				  workflow_dispatch:

				jobs:

				  test:

				    uses: ./.github/workflows/aws-lc-template.yml

				    with:

				      command: "from matrix import determine_latest_aws_lc_fips; print(determine_latest_aws_lc_fips(''))"

									
										94

.github/workflows/aws-lc-template.yml
									
										vendored
									
										Normal file
									
				@ -0,0 +1,94 @@

				name: AWS-LC template

				on:

				  workflow_call:

				    inputs:

				      command:

				        required: true

				        type: string

				permissions:

				  contents: read

				jobs:

				  test:

				    runs-on: ubuntu-latest

				    if: ${{ github.repository_owner == 'haproxy' || github.event_name == 'workflow_dispatch' }}

				    steps:

				      - uses: actions/checkout@v5

				      - name: Determine latest AWS-LC release

				        id: get_aws_lc_release

				        run: |

				          result=$(cd .github && python3  -c "${{ inputs.command }}")

				          echo $result

				          echo "result=$result" >> $GITHUB_OUTPUT

				      - name: Cache AWS-LC

				        id: cache_aws_lc

				        uses: actions/cache@v4

				        with:

				          path: '~/opt/'

				          key: ssl-${{ steps.get_aws_lc_release.outputs.result }}-Ubuntu-latest-gcc

				      - name: Install apt dependencies

				        run: |

				          sudo apt-get update -o Acquire::Languages=none -o Acquire::Translation=none

				          sudo apt-get --no-install-recommends -y install socat gdb jose

				      - name: Install AWS-LC

				        if: ${{ steps.cache_ssl.outputs.cache-hit != 'true' }}

				        run: env ${{ steps.get_aws_lc_release.outputs.result }} scripts/build-ssl.sh

				      - name: Compile HAProxy

				        run: |

				          make -j$(nproc) ERR=1 CC=gcc TARGET=linux-glibc \

				            USE_OPENSSL_AWSLC=1 USE_QUIC=1 \

				            SSL_LIB=${HOME}/opt/lib SSL_INC=${HOME}/opt/include \

				            DEBUG="-DDEBUG_POOL_INTEGRITY -DDEBUG_UNIT" \

				            ADDLIB="-Wl,-rpath,/usr/local/lib/ -Wl,-rpath,$HOME/opt/lib/"

				          sudo make install

				      - name: Show HAProxy version

				        id: show-version

				        run: |

				          ldd $(which haproxy)

				          haproxy -vv

				          echo "version=$(haproxy -v |awk 'NR==1{print $3}')" >> $GITHUB_OUTPUT

				      - uses: ./.github/actions/setup-vtest

				      - name: Run VTest for HAProxy

				        id: vtest

				        run: |

				          make reg-tests VTEST_PROGRAM=../vtest/vtest REGTESTS_TYPES=default,bug,devel

				      - name: Run Unit tests

				        id: unittests

				        run: |

				          make unit-tests

				      - name: Show VTest results

				        if: ${{ failure() && steps.vtest.outcome == 'failure' }}

				        run: |

				          for folder in ${TMPDIR:-/tmp}/haregtests-*/vtc.*; do

				            printf "::group::"

				            cat $folder/INFO

				            cat $folder/LOG

				            echo "::endgroup::"

				          done

				          exit 1

				      - name: Show coredumps

				        if: ${{ failure() && steps.vtest.outcome == 'failure' }}

				        run: |

				          failed=false

				          shopt -s nullglob

				          for file in /tmp/core.*; do

				            failed=true

				            printf "::group::"

				            gdb -ex 'thread apply all bt full' ./haproxy $file

				            echo "::endgroup::"

				          done

				          if [ "$failed" = true ]; then

				            exit 1;

				          fi

				      - name: Show Unit-Tests results

				        if: ${{ failure() && steps.unittests.outcome == 'failure' }}

				        run: |

				          for result in ${TMPDIR:-/tmp}/ha-unittests-*/results/res.*; do

				            printf "::group::"

				            cat $result

				            echo "::endgroup::"

				          done

				          exit 1

									
										60

.github/workflows/aws-lc.yml
									
										vendored
									
@@ -5,62 +5,8 @@ on:
     - cron: "0 0 * * 4"
   workflow_dispatch:
-permissions:
-  contents: read
 jobs:
   test:
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@v4
-      - name: Install VTest
-        run: |
-          scripts/build-vtest.sh
-      - name: Determine latest AWS-LC release
-        id: get_aws_lc_release
-        run: |
-          result=$(cd .github && python3  -c "from matrix import determine_latest_aws_lc; print(determine_latest_aws_lc(''))")
-          echo $result
-          echo "result=$result" >> $GITHUB_OUTPUT
-      - name: Cache AWS-LC
-        id: cache_aws_lc
-        uses: actions/cache@v4
-        with:
-          path: '~/opt/'
-          key: ssl-${{ steps.get_aws_lc_release.outputs.result }}-Ubuntu-latest-gcc
-      - name: Install AWS-LC
-        if: ${{ steps.cache_ssl.outputs.cache-hit != 'true' }}
-        run: env ${{ steps.get_aws_lc_release.outputs.result }} scripts/build-ssl.sh
-      - name: Compile HAProxy
-        run: |
-          make -j$(nproc) CC=gcc TARGET=linux-glibc \
-            USE_OPENSSL_AWSLC=1 USE_QUIC=1 \
-            SSL_LIB=${HOME}/opt/lib SSL_INC=${HOME}/opt/include \
-            DEBUG="-DDEBUG_POOL_INTEGRITY" \
-            ADDLIB="-Wl,-rpath,/usr/local/lib/ -Wl,-rpath,$HOME/opt/lib/"
-          sudo make install
-      - name: Show HAProxy version
-        id: show-version
-        run: |
-          ldd $(which haproxy)
-          haproxy -vv
-          echo "version=$(haproxy -v |awk 'NR==1{print $3}')" >> $GITHUB_OUTPUT
-      - name: Install problem matcher for VTest
-        run: echo "::add-matcher::.github/vtest.json"
-      - name: Run VTest for HAProxy
-        id: vtest
-        run: |
-          # This is required for macOS which does not actually allow to increase
-          # the '-n' soft limit to the hard limit, thus failing to run.
-          ulimit -n 65536
-          make reg-tests VTEST_PROGRAM=../vtest/vtest REGTESTS_TYPES=default,bug,devel
-      - name: Show VTest results
-        if: ${{ failure() && steps.vtest.outcome == 'failure' }}
-        run: |
-          for folder in ${TMPDIR}/haregtests-*/vtc.*; do
-            printf "::group::"
-            cat $folder/INFO
-            cat $folder/LOG
-            echo "::endgroup::"
-          done
-          exit 1
+    uses: ./.github/workflows/aws-lc-template.yml
+    with:
+      command: "from matrix import determine_latest_aws_lc; print(determine_latest_aws_lc(''))"

									
										9

.github/workflows/codespell.yml
									
										vendored
									
@@ -3,6 +3,7 @@ name: Spelling Check
 on:
   schedule:
     - cron: "0 0 * * 2"
+  workflow_dispatch:
 permissions:
   contents: read

@@ -10,12 +11,12 @@ permissions:
 jobs:
   codespell:
     runs-on: ubuntu-latest
-    if: ${{ github.repository_owner == 'haproxy' }}
+    if: ${{ github.repository_owner == 'haproxy' || github.event_name == 'workflow_dispatch' }}
     steps:
-    - uses: actions/checkout@v4
-    - uses: codespell-project/codespell-problem-matcher@v1
+    - uses: actions/checkout@v5
+    - uses: codespell-project/codespell-problem-matcher@v1.2.0
     - uses: codespell-project/actions-codespell@master
       with:
         skip: CHANGELOG,Makefile,*.fig,*.pem,./doc/design-thoughts,./doc/internals
-        ignore_words_list: ist,ists,hist,wan,ca,cas,que,ans,te,nd,referer,ot,uint,iif,fo,keep-alives,dosen,ifset,thrid,strack,ba,chck,hel,unx,mor,clen,collet,bu,htmp,siz,experim
+        ignore_words_list: pres,ist,ists,hist,wan,ca,cas,que,ans,te,nd,referer,ot,uint,iif,fo,keep-alives,dosen,ifset,thrid,strack,ba,chck,hel,unx,mor,clen,collet,bu,htmp,siz,experim
         uri_ignore_words_list: trafic,ressources

									
										17

.github/workflows/compliance.yml
									
										vendored
									
@@ -11,15 +11,10 @@ permissions:
 jobs:
   h2spec:
     name: h2spec
-    runs-on: ${{ matrix.os }}
-    strategy:
-      matrix:
-        include:
-        - TARGET: linux-glibc
-          CC: gcc
-          os: ubuntu-latest
+    runs-on: ubuntu-latest
+    if: ${{ github.repository_owner == 'haproxy' || github.event_name == 'workflow_dispatch' }}
     steps:
-    - uses: actions/checkout@v4
+    - uses: actions/checkout@v5
     - name: Install h2spec
       id: install-h2spec
       run: |

@@ -28,12 +23,12 @@ jobs:
         tar xvf h2spec.tar.gz
         sudo install -m755 h2spec /usr/local/bin/h2spec
         echo "version=${H2SPEC_VERSION}" >> $GITHUB_OUTPUT
-    - name: Compile HAProxy with ${{ matrix.CC }}
+    - name: Compile HAProxy with gcc
       run: |
         make -j$(nproc) all \
           ERR=1 \
-          TARGET=${{ matrix.TARGET }} \
-          CC=${{ matrix.CC }} \
+          TARGET=linux-glibc \
+          CC=gcc \
           DEBUG="-DDEBUG_POOL_INTEGRITY" \
           USE_OPENSSL=1
         sudo make install

									
										2

.github/workflows/contrib.yml
									
										vendored
									
@@ -10,7 +10,7 @@ jobs:
   build:
     runs-on: ubuntu-latest
     steps:
-    - uses: actions/checkout@v4
+    - uses: actions/checkout@v5
     - name: Compile admin/halog/halog
       run: |
         make admin/halog/halog

									
										13

.github/workflows/coverity.yml
									
										vendored
									
@@ -15,14 +15,15 @@ permissions:
 jobs:
   scan:
     runs-on: ubuntu-latest
-    if: ${{ github.repository_owner == 'haproxy' }}
+    if: ${{ github.repository_owner == 'haproxy' || github.event_name == 'workflow_dispatch' }}
     steps:
-    - uses: actions/checkout@v4
+    - uses: actions/checkout@v5
     - name: Install apt dependencies
       run: |
-        sudo apt-get update
-        sudo apt-get install -y \
-          liblua5.3-dev \
+        sudo apt-get update -o Acquire::Languages=none -o Acquire::Translation=none
+        sudo apt-get --no-install-recommends -y install \
+          liblua5.4-dev \
+          libpcre2-dev \
           libsystemd-dev
     - name: Install QUICTLS
       run: |

@@ -37,7 +38,7 @@ jobs:
     - name: Build with Coverity build tool
       run: |
         export PATH=`pwd`/coverity_tool/bin:$PATH
-        cov-build --dir cov-int make CC=clang TARGET=linux-glibc USE_ZLIB=1 USE_PCRE=1 USE_PCRE_JIT=1 USE_LUA=1 USE_OPENSSL=1 USE_QUIC=1 USE_SYSTEMD=1 USE_WURFL=1 WURFL_INC=addons/wurfl/dummy WURFL_LIB=addons/wurfl/dummy USE_DEVICEATLAS=1 DEVICEATLAS_SRC=addons/deviceatlas/dummy USE_51DEGREES=1 51DEGREES_SRC=addons/51degrees/dummy/pattern ADDLIB=\"-Wl,-rpath,$HOME/opt/lib/\" SSL_LIB=${HOME}/opt/lib SSL_INC=${HOME}/opt/include DEBUG+=-DDEBUG_STRICT=1 DEBUG+=-DDEBUG_USE_ABORT=1
+        cov-build --dir cov-int make CC=clang TARGET=linux-glibc USE_ZLIB=1 USE_PCRE2=1 USE_PCRE2_JIT=1 USE_LUA=1 USE_OPENSSL=1 USE_QUIC=1 USE_WURFL=1 WURFL_INC=addons/wurfl/dummy WURFL_LIB=addons/wurfl/dummy USE_DEVICEATLAS=1 DEVICEATLAS_SRC=addons/deviceatlas/dummy USE_51DEGREES=1 51DEGREES_SRC=addons/51degrees/dummy/pattern ADDLIB=\"-Wl,-rpath,$HOME/opt/lib/\" SSL_LIB=${HOME}/opt/lib SSL_INC=${HOME}/opt/include DEBUG+=-DDEBUG_STRICT=2 DEBUG+=-DDEBUG_USE_ABORT=1
     - name: Submit build result to Coverity Scan
       run: |
         tar czvf cov.tar.gz cov-int

									
										7

.github/workflows/cross-zoo.yml
									
										vendored
									
@@ -6,6 +6,7 @@ name: Cross Compile
 on:
   schedule:
     - cron: "0 0 21 * *"
+  workflow_dispatch:
 permissions:
   contents: read

@@ -90,15 +91,15 @@ jobs:
           }
         ]
     runs-on: ubuntu-latest
-    if: ${{ github.repository_owner == 'haproxy' }}
+    if: ${{ github.repository_owner == 'haproxy' || github.event_name == 'workflow_dispatch' }}
     steps:
     - name: install packages
       run: |
-        sudo apt-get update
+        sudo apt-get update -o Acquire::Languages=none -o Acquire::Translation=none
         sudo apt-get -yq --force-yes install \
             gcc-${{ matrix.platform.arch }} \
             ${{ matrix.platform.libs }}
-    - uses: actions/checkout@v4
+    - uses: actions/checkout@v5
     - name: install quictls

									
.github/workflows/fedora-rawhide.yml
											
				@ -3,6 +3,7 @@ name: Fedora/Rawhide/QuicTLS

				on:

				  schedule:

				    - cron: "0 0 25 * *"

				  workflow_dispatch:

				permissions:

				  contents: read

				@ -17,19 +18,19 @@ jobs:

				          { name: x86, cc: gcc,   QUICTLS_EXTRA_ARGS: "-m32 linux-generic32", ADDLIB_ATOMIC: "-latomic", ARCH_FLAGS: "-m32" },

				          { name: x86, cc: clang, QUICTLS_EXTRA_ARGS: "-m32 linux-generic32", ADDLIB_ATOMIC: "-latomic", ARCH_FLAGS: "-m32" }

				        ]

				      fail-fast: false

				    name: ${{ matrix.platform.cc }}.${{ matrix.platform.name }}

				    runs-on: ubuntu-latest

				    if: ${{ github.repository_owner == 'haproxy' }}

				    if: ${{ github.repository_owner == 'haproxy' || github.event_name == 'workflow_dispatch' }}

				    container:

				      image: fedora:rawhide

				    steps:

				    - uses: actions/checkout@v4

				    - uses: actions/checkout@v5

				    - name: Install dependencies

				      run: |

				        dnf -y install diffutils git pcre-devel zlib-devel pcre2-devel 'perl(FindBin)' perl-IPC-Cmd 'perl(File::Copy)' 'perl(File::Compare)' lua-devel socat findutils systemd-devel clang

				        dnf -y install awk diffutils git pcre-devel zlib-devel pcre2-devel 'perl(FindBin)' perl-IPC-Cmd 'perl(File::Copy)' 'perl(File::Compare)' lua-devel socat findutils systemd-devel clang

				        dnf -y install 'perl(FindBin)' 'perl(File::Compare)' perl-IPC-Cmd 'perl(File::Copy)' glibc-devel.i686 lua-devel.i686 lua-devel.x86_64 systemd-devel.i686 zlib-ng-compat-devel.i686 pcre-devel.i686 libatomic.i686

				    - name: Install VTest

				      run: scripts/build-vtest.sh

				    - uses: ./.github/actions/setup-vtest

				    - name: Install QuicTLS

				      run: QUICTLS=yes QUICTLS_EXTRA_ARGS="${{ matrix.platform.QUICTLS_EXTRA_ARGS }}" scripts/build-ssl.sh

				    - name: Build contrib tools

				@ -40,7 +41,7 @@ jobs:

				        make dev/hpack/decode dev/hpack/gen-enc dev/hpack/gen-rht

				    - name: Compile HAProxy with ${{ matrix.platform.cc }}

				      run: |

				        make -j3 CC=${{ matrix.platform.cc }} V=1 ERR=1 TARGET=linux-glibc USE_OPENSSL=1 USE_QUIC=1 USE_ZLIB=1 USE_PCRE=1 USE_PCRE_JIT=1 USE_LUA=1 USE_SYSTEMD=1 ADDLIB="${{ matrix.platform.ADDLIB_ATOMIC }} -Wl,-rpath,${HOME}/opt/lib" SSL_LIB=${HOME}/opt/lib SSL_INC=${HOME}/opt/include ARCH_FLAGS="${{ matrix.platform.ARCH_FLAGS }}"

				        make -j3 CC=${{ matrix.platform.cc }} V=1 ERR=1 TARGET=linux-glibc DEBUG="-DDEBUG_POOL_INTEGRITY -DDEBUG_UNIT" USE_OPENSSL=1 USE_QUIC=1 USE_ZLIB=1 USE_PCRE=1 USE_PCRE_JIT=1 USE_LUA=1 ADDLIB="${{ matrix.platform.ADDLIB_ATOMIC }} -Wl,-rpath,${HOME}/opt/lib" SSL_LIB=${HOME}/opt/lib SSL_INC=${HOME}/opt/include ARCH_FLAGS="${{ matrix.platform.ARCH_FLAGS }}"

				        make install

				    - name: Show HAProxy version

				      id: show-version

				@ -57,9 +58,13 @@ jobs:

				    - name: Show VTest results

				      if: ${{ failure() && steps.vtest.outcome == 'failure' }}

				      run: |

				        for folder in ${TMPDIR}/haregtests-*/vtc.*; do

				        for folder in ${TMPDIR:-/tmp}/haregtests-*/vtc.*; do

				          printf "::group::"

				          cat $folder/INFO

				          cat $folder/LOG

				          echo "::endgroup::"

				        done

				    - name: Run Unit tests

				      id: unittests

				      run: |

				        make unit-tests

									
.github/workflows/illumos.yml
											
				@ -0,0 +1,24 @@

				name: Illumos

				on:

				  schedule:

				    - cron: "0 0 25 * *"

				  workflow_dispatch:

				jobs:

				  gcc:

				    runs-on: ubuntu-latest

				    if: ${{ github.repository_owner == 'haproxy' || github.event_name == 'workflow_dispatch' }}

				    permissions:

				      contents: read

				    steps:

				      - name: "Checkout repository"

				        uses: actions/checkout@v5

				      - name: "Build on VM"

				        uses: vmactions/solaris-vm@v1

				        with:

				          prepare: |

				            pkg install gcc make

				          run: |

				            gmake CC=gcc TARGET=solaris USE_OPENSSL=1 USE_PROMEX=1
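
The job-level `if:` guard used by this and the other scheduled workflows lets the job run in the upstream `haproxy` organisation, or anywhere when it is started by hand through `workflow_dispatch`. A minimal shell sketch of that boolean, assuming a hypothetical helper `should_run` that takes the repository owner and the event name as arguments (it is not part of the repository):

```shell
# should_run OWNER EVENT
# Hypothetical helper mirroring the workflow expression:
#   github.repository_owner == 'haproxy' || github.event_name == 'workflow_dispatch'
# Returns 0 (true) when the job would run, non-zero otherwise.
should_run() {
    owner=$1
    event=$2
    [ "$owner" = "haproxy" ] || [ "$event" = "workflow_dispatch" ]
}
```

Under this reading, a scheduled run in a fork is skipped, while a manually dispatched run in the same fork still executes.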

									
.github/workflows/musl.yml
											
				@ -20,13 +20,13 @@ jobs:

				        run: |

				          ulimit -c unlimited

				          echo '/tmp/core/core.%h.%e.%t' > /proc/sys/kernel/core_pattern

				      - uses: actions/checkout@v4

				      - uses: actions/checkout@v5

				      - name: Install dependencies

				        run: apk add gcc gdb make tar git python3 libc-dev linux-headers pcre-dev pcre2-dev openssl-dev lua5.3-dev grep socat curl musl-dbg lua5.3-dbg

				        run: apk add gcc gdb make tar git python3 libc-dev linux-headers pcre-dev pcre2-dev openssl-dev lua5.3-dev grep socat curl musl-dbg lua5.3-dbg jose

				      - name: Install VTest

				        run: scripts/build-vtest.sh

				      - name: Build

				        run: make -j$(nproc) TARGET=linux-musl ARCH_FLAGS='-ggdb3' CC=cc V=1 USE_LUA=1 LUA_INC=/usr/include/lua5.3 LUA_LIB=/usr/lib/lua5.3 USE_OPENSSL=1 USE_PCRE2=1 USE_PCRE2_JIT=1 USE_PROMEX=1

				        run: make -j$(nproc) TARGET=linux-musl DEBUG="-DDEBUG_POOL_INTEGRITY -DDEBUG_UNIT" ARCH_FLAGS='-ggdb3' CC=cc V=1 USE_LUA=1 LUA_INC=/usr/include/lua5.3 LUA_LIB=/usr/lib/lua5.3 USE_OPENSSL=1 USE_PCRE2=1 USE_PCRE2_JIT=1 USE_PROMEX=1

				      - name: Show version

				        run: ./haproxy -vv

				      - name: Show linked libraries

				@ -37,6 +37,10 @@ jobs:

				      - name: Run VTest

				        id: vtest

				        run: make reg-tests VTEST_PROGRAM=../vtest/vtest REGTESTS_TYPES=default,bug,devel

				      - name: Run Unit tests

				        id: unittests

				        run: |

				          make unit-tests

				      - name: Show coredumps

				        if: ${{ failure() && steps.vtest.outcome == 'failure' }}

				        run: |

				@ -60,3 +64,13 @@ jobs:

				            cat $folder/LOG

				            echo "::endgroup::"

				          done

				      - name: Show Unit-Tests results

				        if: ${{ failure() && steps.unittests.outcome == 'failure' }}

				        run: |

				          for result in ${TMPDIR:-/tmp}/ha-unittests-*/results/res.*; do

				            printf "::group::"

				            cat $result

				            echo "::endgroup::"

				          done

				          exit 1
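
The result loops in these workflows switch from `${TMPDIR}` to `${TMPDIR:-/tmp}`. The sketch below illustrates the difference, using a hypothetical helper `results_root` that is not part of the repository: with the `:-` form, an unset or empty `TMPDIR` falls back to `/tmp` instead of expanding to an empty string, which would make the glob start at the filesystem root.

```shell
# results_root
# Hypothetical helper: print the directory the result loops search under.
# ${TMPDIR:-/tmp} expands to /tmp when TMPDIR is unset OR empty.
results_root() {
    printf '%s\n' "${TMPDIR:-/tmp}"
}
```

With `TMPDIR` unset, a pattern such as `$(results_root)/ha-unittests-*/results/res.*` resolves under `/tmp`, whereas the plain `${TMPDIR}` form would yield `/ha-unittests-*/results/res.*`.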

									
.github/workflows/netbsd.yml
											
				@ -3,15 +3,17 @@ name: NetBSD

				on:

				  schedule:

				    - cron: "0 0 25 * *"

				  workflow_dispatch:

				jobs:

				  gcc:

				    runs-on: ubuntu-latest

				    if: ${{ github.repository_owner == 'haproxy' || github.event_name == 'workflow_dispatch' }}

				    permissions:

				      contents: read

				    steps:

				      - name: "Checkout repository"

				        uses: actions/checkout@v4

				        uses: actions/checkout@v5

				      - name: "Build on VM"

				        uses: vmactions/netbsd-vm@v1

				@ -19,4 +21,4 @@ jobs:

				          prepare: |

				            /usr/sbin/pkg_add gmake curl

				          run: |

				            gmake CC=gcc TARGET=netbsd USE_OPENSSL=1 USE_LUA=1 USE_PCRE2=1 USE_PCRE2_JIT=1 USE_PROMEX=1 USE_ZLIB=1

				            gmake CC=gcc TARGET=netbsd ERR=1 USE_OPENSSL=1 USE_LUA=1 USE_PCRE2=1 USE_PCRE2_JIT=1 USE_PROMEX=1 USE_ZLIB=1

									
.github/workflows/openssl-ech.yml
											
				@ -0,0 +1,82 @@

				name: openssl ECH

				on:

				  schedule:

				  - cron: "0 3 * * *"

				  workflow_dispatch:

				permissions:

				  contents: read

				jobs:

				  test:

				    runs-on: ubuntu-latest

				    if: ${{ github.repository_owner == 'haproxy' || github.event_name == 'workflow_dispatch' }}

				    steps:

				      - uses: actions/checkout@v5

				      - name: Install VTest

				        run: |

				          scripts/build-vtest.sh

				      - name: Install apt dependencies

				        run: |

				          sudo apt-get update -o Acquire::Languages=none -o Acquire::Translation=none

				          sudo apt-get --no-install-recommends -y install socat gdb

				          sudo apt-get --no-install-recommends -y install libpsl-dev

				      - name: Install OpenSSL+ECH

				        run: env OPENSSL_VERSION="git-feature/ech" GIT_TYPE="branch" scripts/build-ssl.sh

				      - name: Install curl+ECH

				        run: env SSL_LIB=${HOME}/opt/ scripts/build-curl.sh

				      - name: Compile HAProxy

				        run: |

				          make -j$(nproc) CC=gcc TARGET=linux-glibc \

				            USE_QUIC=1 USE_OPENSSL=1 USE_ECH=1 \

				            SSL_LIB=${HOME}/opt/lib SSL_INC=${HOME}/opt/include \

				            DEBUG="-DDEBUG_POOL_INTEGRITY -DDEBUG_UNIT" \

				            ADDLIB="-Wl,-rpath,/usr/local/lib/ -Wl,-rpath,$HOME/opt/lib/" \

				            ARCH_FLAGS="-ggdb3 -fsanitize=address"

				          sudo make install

				      - name: Show HAProxy version

				        id: show-version

				        run: |

				          ldd $(which haproxy)

				          haproxy -vv

				          echo "version=$(haproxy -v |awk 'NR==1{print $3}')" >> $GITHUB_OUTPUT

				      - name: Install problem matcher for VTest

				        run: echo "::add-matcher::.github/vtest.json"

				      - name: Run VTest for HAProxy

				        id: vtest

				        run: |

				          # This is required for macOS which does not actually allow to increase

				          # the '-n' soft limit to the hard limit, thus failing to run.

				          ulimit -n 65536

				          # allow to catch coredumps

				          ulimit -c unlimited

				          make reg-tests VTEST_PROGRAM=../vtest/vtest REGTESTS_TYPES=default,bug,devel

				      - name: Show VTest results

				        if: ${{ failure() && steps.vtest.outcome == 'failure' }}

				        run: |

				          for folder in ${TMPDIR:-/tmp}/haregtests-*/vtc.*; do

				            printf "::group::"

				            cat $folder/INFO

				            cat $folder/LOG

				            echo "::endgroup::"

				          done

				          exit 1

				      - name: Run Unit tests

				        id: unittests

				        run: |

				          make unit-tests

				      - name: Show coredumps

				        if: ${{ failure() && steps.vtest.outcome == 'failure' }}

				        run: |

				          failed=false

				          shopt -s nullglob

				          for file in /tmp/core.*; do

				            failed=true

				            printf "::group::"

				            gdb -ex 'thread apply all bt full' ./haproxy $file

				            echo "::endgroup::"

				          done

				          if [ "$failed" = true ]; then

				            exit 1;

				          fi
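
The "Show HAProxy version" step publishes a `version` output by taking the third whitespace-separated field of the first line printed by `haproxy -v`. A minimal sketch of that extraction, assuming a hypothetical helper `extract_version` and a made-up banner string (the version number and date are illustrative only):

```shell
# extract_version
# Hypothetical helper: read a version banner on stdin and print the
# third field of its first line, as the workflow's awk one-liner does.
extract_version() {
    awk 'NR==1{print $3}'
}

# Illustrative banner only; the real text comes from `haproxy -v`.
printf 'HAProxy version 3.4-dev2 2026/01/16\nStatus: development branch\n' | extract_version
```

In the workflow the result is appended to `$GITHUB_OUTPUT` as `version=...`, which makes it available to later steps as `steps.show-version.outputs.version`.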

									
.github/workflows/openssl-master.yml
											
				@ -0,0 +1,77 @@

				name: openssl master

				on:

				  schedule:

				  - cron: "0 3 * * *"

				  workflow_dispatch:

				permissions:

				  contents: read

				jobs:

				  test:

				    runs-on: ubuntu-latest

				    if: ${{ github.repository_owner == 'haproxy' || github.event_name == 'workflow_dispatch' }}

				    steps:

				      - uses: actions/checkout@v5

				      - name: Install apt dependencies

				        run: |

				          sudo apt-get update -o Acquire::Languages=none -o Acquire::Translation=none

				          sudo apt-get --no-install-recommends -y install socat gdb

				          sudo apt-get --no-install-recommends -y install libpsl-dev

				      - uses: ./.github/actions/setup-vtest

				      - name: Install OpenSSL master

				        run: env OPENSSL_VERSION="git-master" GIT_TYPE="branch" scripts/build-ssl.sh

				      - name: Compile HAProxy

				        run: |

				          make -j$(nproc) ERR=1 CC=gcc TARGET=linux-glibc \

				            USE_QUIC=1 USE_OPENSSL=1 \

				            SSL_LIB=${HOME}/opt/lib SSL_INC=${HOME}/opt/include \

				            DEBUG="-DDEBUG_POOL_INTEGRITY -DDEBUG_UNIT" \

				            ADDLIB="-Wl,-rpath,/usr/local/lib/ -Wl,-rpath,$HOME/opt/lib/"

				          sudo make install

				      - name: Show HAProxy version

				        id: show-version

				        run: |

				          ldd $(which haproxy)

				          haproxy -vv

				          echo "version=$(haproxy -v |awk 'NR==1{print $3}')" >> $GITHUB_OUTPUT

				      - name: Install problem matcher for VTest

				        run: echo "::add-matcher::.github/vtest.json"

				      - name: Run VTest for HAProxy

				        id: vtest

				        run: |

				          # This is required for macOS which does not actually allow to increase

				          # the '-n' soft limit to the hard limit, thus failing to run.

				          ulimit -n 65536

				          # allow to catch coredumps

				          ulimit -c unlimited

				          make reg-tests VTEST_PROGRAM=../vtest/vtest REGTESTS_TYPES=default,bug,devel

				      - name: Show VTest results

				        if: ${{ failure() && steps.vtest.outcome == 'failure' }}

				        run: |

				          for folder in ${TMPDIR:-/tmp}/haregtests-*/vtc.*; do

				            printf "::group::"

				            cat $folder/INFO

				            cat $folder/LOG

				            echo "::endgroup::"

				          done

				          exit 1

				      - name: Run Unit tests

				        id: unittests

				        run: |

				          make unit-tests

				      - name: Show coredumps

				        if: ${{ failure() && steps.vtest.outcome == 'failure' }}

				        run: |

				          failed=false

				          shopt -s nullglob

				          for file in /tmp/core.*; do

				            failed=true

				            printf "::group::"

				            gdb -ex 'thread apply all bt full' ./haproxy $file

				            echo "::endgroup::"

				          done

				          if [ "$failed" = true ]; then

				            exit 1;

				          fi

									
.github/workflows/openssl-nodeprecated.yml
										
				@ -1,33 +0,0 @@

				#

				# special purpose CI: test against OpenSSL built in "no-deprecated" mode

				# let us run those builds weekly

				#

				# for example, OpenWRT uses such OpenSSL builds (those builds are smaller)

				#

				#

				# some details might be found at NL: https://www.mail-archive.com/haproxy@formilux.org/msg35759.html

				#                                GH: https://github.com/haproxy/haproxy/issues/367

				name: openssl no-deprecated

				on:

				  schedule:

				  - cron: "0 0 * * 4"

				permissions:

				  contents: read

				jobs:

				  test:

				    runs-on: ubuntu-latest

				    steps:

				    - uses: actions/checkout@v4

				    - name: Install VTest

				      run: |

				        scripts/build-vtest.sh

				    - name: Compile HAProxy

				      run: |

				        make DEFINE="-DOPENSSL_API_COMPAT=0x10100000L -DOPENSSL_NO_DEPRECATED" -j3 CC=gcc ERR=1 TARGET=linux-glibc USE_OPENSSL=1

				    - name: Run VTest

				      run: |

				        make reg-tests VTEST_PROGRAM=../vtest/vtest REGTESTS_TYPES=default,bug,devel

									
.github/workflows/quic-interop-aws-lc.yml
											
				@ -0,0 +1,104 @@

				#

				# goodput,crosstraffic are not run on purpose, those tests are intended to bandwidth measurement, we currently do not want to use GitHub runners for that

				#

				name: QUIC Interop AWS-LC

				on:

				  workflow_dispatch:

				  schedule:

				    - cron: "0 0 * * 2"

				jobs:

				  build:

				    runs-on: ubuntu-24.04

				    if: ${{ github.repository_owner == 'haproxy' || github.event_name == 'workflow_dispatch' }}

				    permissions:

				      contents: read

				      packages: write

				    steps:

				      - uses: actions/checkout@v5

				      - name: Log in to the Container registry

				        uses: docker/login-action@v3

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: Build and push Docker image

				        id: push

				        uses: docker/build-push-action@v5

				        with:

				          context: https://github.com/haproxytech/haproxy-qns.git

				          push: true

				          build-args: |

				            SSLLIB=AWS-LC

				          tags: ghcr.io/${{ github.repository }}:aws-lc

				      - name: Cleanup registry

				        uses: actions/delete-package-versions@v5

				        with:

				          owner: ${{ github.repository_owner }}

				          package-name: 'haproxy'

				          package-type: container

				          min-versions-to-keep: 1

				          delete-only-untagged-versions: 'true'

				  run:

				    needs: build

				    strategy:

				      matrix:

				        suite: [

				          { client: chrome, tests: "http3" },

				          { client: picoquic, tests: "handshake,transfer,longrtt,chacha20,multiplexing,retry,resumption,zerortt,http3,blackhole,keyupdate,ecn,amplificationlimit,handshakeloss,transferloss,handshakecorruption,transfercorruption,ipv6,v2" },

				          { client: quic-go,  tests: "handshake,transfer,longrtt,chacha20,multiplexing,retry,resumption,zerortt,http3,blackhole,keyupdate,ecn,amplificationlimit,handshakeloss,transferloss,handshakecorruption,transfercorruption,ipv6,v2" },

				          { client: ngtcp2,  tests: "handshake,transfer,longrtt,chacha20,multiplexing,retry,resumption,zerortt,http3,blackhole,keyupdate,ecn,amplificationlimit,handshakeloss,transferloss,handshakecorruption,transfercorruption,ipv6,v2" }

				        ]

				      fail-fast: false

				    name: ${{ matrix.suite.client }}

				    runs-on: ubuntu-24.04

				    if: ${{ github.repository_owner == 'haproxy' || github.event_name == 'workflow_dispatch' }}

				    steps:

				      - uses: actions/checkout@v5

				      - name: Log in to the Container registry

				        uses: docker/login-action@v3

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: Install tshark

				        run: |

				          sudo apt-get update

				          sudo apt-get -y install tshark

				      - name: Pull image

				        run: |

				          docker pull ghcr.io/${{ github.repository }}:aws-lc

				      - name: Run

				        run: |

				          git clone https://github.com/quic-interop/quic-interop-runner

				          cd quic-interop-runner

				          pip install -r requirements.txt --break-system-packages

				          python run.py -j result.json -l logs -r haproxy=ghcr.io/${{ github.repository }}:aws-lc -t ${{ matrix.suite.tests }} -c ${{ matrix.suite.client }} -s haproxy

				      - name: Delete succeeded logs

				        if: failure()

				        run: |

				          cd quic-interop-runner/logs/haproxy_${{ matrix.suite.client }}

				          cat ../../result.json | jq -r '.results[][] | select(.result=="succeeded") | .name' | xargs rm -rf

				      - name: Logs upload

				        if: failure()

				        uses: actions/upload-artifact@v4

				        with:

				          name: logs-${{ matrix.suite.client }}

				          path: quic-interop-runner/logs/

				          retention-days: 6

									
.github/workflows/quic-interop-libressl.yml
											
				@ -0,0 +1,102 @@

				#

				# goodput,crosstraffic are not run on purpose, those tests are intended to bandwidth measurement, we currently do not want to use GitHub runners for that

				#

				name: QUIC Interop LibreSSL

				on:

				  workflow_dispatch:

				  schedule:

				    - cron: "0 0 * * 2"

				jobs:

				  build:

				    runs-on: ubuntu-24.04

				    if: ${{ github.repository_owner == 'haproxy' || github.event_name == 'workflow_dispatch' }}

				    permissions:

				      contents: read

				      packages: write

				    steps:

				      - uses: actions/checkout@v5

				      - name: Log in to the Container registry

				        uses: docker/login-action@v3

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: Build and push Docker image

				        id: push

				        uses: docker/build-push-action@v5

				        with:

				          context: https://github.com/haproxytech/haproxy-qns.git

				          push: true

				          build-args: |

				            SSLLIB=LibreSSL

				          tags: ghcr.io/${{ github.repository }}:libressl

				      - name: Cleanup registry

				        uses: actions/delete-package-versions@v5

				        with:

				          owner: ${{ github.repository_owner }}

				          package-name: 'haproxy'

				          package-type: container

				          min-versions-to-keep: 1

				          delete-only-untagged-versions: 'true'

				  run:

				    needs: build

				    strategy:

				      matrix:

				        suite: [

				          { client: picoquic, tests: "handshake,transfer,longrtt,chacha20,multiplexing,retry,http3,blackhole,amplificationlimit,handshakeloss,transferloss,handshakecorruption,transfercorruption,v2" },

				          { client: quic-go,  tests: "handshake,transfer,longrtt,chacha20,multiplexing,retry,http3,blackhole,amplificationlimit,transferloss,transfercorruption,v2" }

				        ]

				      fail-fast: false

				    name: ${{ matrix.suite.client }}

				    runs-on: ubuntu-24.04

				    if: ${{ github.repository_owner == 'haproxy' || github.event_name == 'workflow_dispatch' }}

				    steps:

				      - uses: actions/checkout@v5

				      - name: Log in to the Container registry

				        uses: docker/login-action@v3

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: Install tshark

				        run: |

				          sudo apt-get update

				          sudo apt-get -y install tshark

				      - name: Pull image

				        run: |

				          docker pull ghcr.io/${{ github.repository }}:libressl

				      - name: Run

				        run: |

				          git clone https://github.com/quic-interop/quic-interop-runner

				          cd quic-interop-runner

				          pip install -r requirements.txt --break-system-packages

				          python run.py -j result.json -l logs -r haproxy=ghcr.io/${{ github.repository }}:libressl -t ${{ matrix.suite.tests }} -c ${{ matrix.suite.client }} -s haproxy

				      - name: Delete succeeded logs

				        if: failure()

				        run: |

				          cd quic-interop-runner/logs/haproxy_${{ matrix.suite.client }}

				          cat ../../result.json | jq -r '.results[][] | select(.result=="succeeded") | .name' | xargs rm -rf

				      - name: Logs upload

				        if: failure()

				        uses: actions/upload-artifact@v4

				        with:

				          name: logs-${{ matrix.suite.client }}

				          path: quic-interop-runner/logs/

				          retention-days: 6

									
.github/workflows/quictls.yml
											
				@ -0,0 +1,74 @@

				#

				# weekly run against modern QuicTLS branch, i.e. https://github.com/quictls/quictls

				#

				name: QuicTLS

				on:

				  schedule:

				    - cron: "0 0 * * 4"

				  workflow_dispatch:

				permissions:

				  contents: read

				jobs:

				  test:

				    runs-on: ubuntu-latest

				    if: ${{ github.repository_owner == 'haproxy' || github.event_name == 'workflow_dispatch' }}

				    steps:

				      - uses: actions/checkout@v5

				      - name: Install apt dependencies

				        run: |

				          sudo apt-get update -o Acquire::Languages=none -o Acquire::Translation=none

				          sudo apt-get --no-install-recommends -y install socat gdb

				      - name: Install QuicTLS

				        run: env QUICTLS=yes QUICTLS_URL=https://github.com/quictls/quictls scripts/build-ssl.sh

				      - name: Compile HAProxy

				        run: |

				          make -j$(nproc) ERR=1 CC=gcc TARGET=linux-glibc \

				            USE_QUIC=1 USE_OPENSSL=1 \

				            SSL_LIB=${HOME}/opt/lib SSL_INC=${HOME}/opt/include \

				            DEBUG="-DDEBUG_POOL_INTEGRITY -DDEBUG_UNIT" \

				            ADDLIB="-Wl,-rpath,/usr/local/lib/ -Wl,-rpath,$HOME/opt/lib/" \

				            ARCH_FLAGS="-ggdb3 -fsanitize=address"

				          sudo make install

				      - name: Show HAProxy version

				        id: show-version

				        run: |

				          ldd $(which haproxy)

				          haproxy -vv

				          echo "version=$(haproxy -v |awk 'NR==1{print $3}')" >> $GITHUB_OUTPUT

				      - uses: ./.github/actions/setup-vtest

				      - name: Run VTest for HAProxy

				        id: vtest

				        run: |

				          make reg-tests VTEST_PROGRAM=../vtest/vtest REGTESTS_TYPES=default,bug,devel

				      - name: Show VTest results

				        if: ${{ failure() && steps.vtest.outcome == 'failure' }}

				        run: |

				          for folder in ${TMPDIR:-/tmp}/haregtests-*/vtc.*; do

				            printf "::group::"

				            cat $folder/INFO

				            cat $folder/LOG

				            echo "::endgroup::"

				          done

				          exit 1

				      - name: Run Unit tests

				        id: unittests

				        run: |

				          make unit-tests

				      - name: Show coredumps

				        if: ${{ failure() && steps.vtest.outcome == 'failure' }}

				        run: |

				          failed=false

				          shopt -s nullglob

				          for file in /tmp/core.*; do

				            failed=true

				            printf "::group::"

				            gdb -ex 'thread apply all bt full' ./haproxy $file

				            echo "::endgroup::"

				          done

				          if [ "$failed" = true ]; then

				            exit 1;

				          fi

									
.github/workflows/vtest.yml
											
				@ -23,7 +23,7 @@ jobs:

				    outputs:

				      matrix: ${{ steps.set-matrix.outputs.matrix }}

				    steps:

				      - uses: actions/checkout@v4

				      - uses: actions/checkout@v5

				      - name: Generate Build Matrix

				        env:

				          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

				@ -44,16 +44,10 @@ jobs:

				      TMPDIR: /tmp

				      OT_CPP_VERSION: 1.6.0

				    steps:

				    - uses: actions/checkout@v4

				    - uses: actions/checkout@v5

				      with:

				        fetch-depth: 100

				    - name: Setup coredumps

				      if: ${{ startsWith(matrix.os, 'ubuntu-') }}

				      run: |

				        sudo sysctl -w fs.suid_dumpable=1

				        sudo sysctl kernel.core_pattern=/tmp/core.%h.%e.%t

				#

				# Github Action cache key cannot contain comma, so we calculate it based on job name

				#

				@ -76,26 +70,24 @@ jobs:

				      uses: actions/cache@v4

				      with:

				        path: '~/opt-ot/'

				        key: ot-${{ matrix.CC }}-${{ env.OT_CPP_VERSION }}-${{ contains(matrix.name, 'ASAN') }}

				        key: ${{ matrix.os }}-ot-${{ matrix.CC }}-${{ env.OT_CPP_VERSION }}-${{ contains(matrix.name, 'ASAN') }}

				    - name: Install apt dependencies

				      if: ${{ startsWith(matrix.os, 'ubuntu-') }}

				      run: |

				        sudo apt-get update

				        sudo apt-get install -y \

				          liblua5.3-dev \

				          libpcre2-dev \

				          libsystemd-dev \

				          ninja-build \

				        sudo apt-get update -o Acquire::Languages=none -o Acquire::Translation=none

				        sudo apt-get --no-install-recommends -y install \

				          ${{ contains(matrix.FLAGS, 'USE_LUA=1')     && 'liblua5.4-dev'  || '' }} \

				          ${{ contains(matrix.FLAGS, 'USE_PCRE2=1')   && 'libpcre2-dev'   || '' }} \

				          ${{ contains(matrix.ssl,   'BORINGSSL=yes') && 'ninja-build'    || '' }} \

				          socat \

				          gdb

				          gdb \

				          jose

				    - name: Install brew dependencies

				      if: ${{ startsWith(matrix.os, 'macos-') }}

				      run: |

				        brew install socat

				        brew install lua

				    - name: Install VTest

				      run: |

				        scripts/build-vtest.sh

				    - uses: ./.github/actions/setup-vtest

				    - name: Install SSL ${{ matrix.ssl }}

				      if: ${{ matrix.ssl && matrix.ssl != 'stock' && steps.cache_ssl.outputs.cache-hit != 'true' }}

				      run: env ${{ matrix.ssl }} scripts/build-ssl.sh

				@ -118,10 +110,19 @@ jobs:

				          ERR=1 \

				          TARGET=${{ matrix.TARGET }} \

				          CC=${{ matrix.CC }} \

				          DEBUG="-DDEBUG_POOL_INTEGRITY" \

				          DEBUG="-DDEBUG_POOL_INTEGRITY -DDEBUG_UNIT" \

				          ${{ join(matrix.FLAGS, ' ') }} \

				          ADDLIB="-Wl,-rpath,/usr/local/lib/ -Wl,-rpath,$HOME/opt/lib/"

				        sudo make install-bin

				    - name: Compile admin/halog/halog

				      run: |

				        make -j$(nproc) admin/halog/halog \

				          ERR=1 \

				          TARGET=${{ matrix.TARGET }} \

				          CC=${{ matrix.CC }} \

				          DEBUG="-DDEBUG_POOL_INTEGRITY -DDEBUG_UNIT" \

				          ${{ join(matrix.FLAGS, ' ') }} \

				          ADDLIB="-Wl,-rpath,/usr/local/lib/ -Wl,-rpath,$HOME/opt/lib/"

				        sudo make install

				    - name: Show HAProxy version

				      id: show-version

				      run: |

				@ -136,45 +137,33 @@ jobs:

				        echo "::endgroup::"

				        haproxy -vv

				        echo "version=$(haproxy -v |awk 'NR==1{print $3}')" >> $GITHUB_OUTPUT

				    - name: Install problem matcher for VTest

				      # This allows one to more easily see which tests fail.

				      run: echo "::add-matcher::.github/vtest.json"

				    - name: Run VTest for HAProxy ${{ steps.show-version.outputs.version }}

				      id: vtest

				      env:

				        # Force ASAN output into asan.log to make the output more readable.

				        ASAN_OPTIONS: log_path=asan.log

				      run: |

				        # This is required for macOS which does not actually allow to increase

				        # the '-n' soft limit to the hard limit, thus failing to run.

				        ulimit -n 65536

				        ulimit -c unlimited

				        make reg-tests VTEST_PROGRAM=../vtest/vtest REGTESTS_TYPES=default,bug,devel

				    - name: Config syntax check memleak smoke testing

				      if: ${{ contains(matrix.name, 'ASAN') }}

				      run: |

				        ./haproxy -dI -f .github/h2spec.config -c

				        ./haproxy -dI -f examples/content-sw-sample.cfg -c

				        ./haproxy -dI -f examples/option-http_proxy.cfg -c

				        ./haproxy -dI -f examples/quick-test.cfg -c

				        ./haproxy -dI -f examples/transparent_proxy.cfg -c

				    - name: Show VTest results

				      if: ${{ failure() && steps.vtest.outcome == 'failure' }}

				      run: |

				        for folder in ${TMPDIR}/haregtests-*/vtc.*; do

				        for folder in ${TMPDIR:-/tmp}/haregtests-*/vtc.*; do

				          printf "::group::"

				          cat $folder/INFO

				          cat $folder/LOG

				          echo "::endgroup::"

				        done

				        shopt -s nullglob

				        for asan in asan.log*; do

				          echo "::group::$asan"

				          cat $asan

				        exit 1

				    - name: Run Unit tests

				      id: unittests

				      run: |

				        make unit-tests

				    - name: Show Unit-Tests results

				      if: ${{ failure() && steps.unittests.outcome == 'failure' }}

				      run: |

				        for result in ${TMPDIR:-/tmp}/ha-unittests-*/results/res.*; do

				          printf "::group::"

				          cat $result

				          echo "::endgroup::"

				        done

				        exit 1

				    - name: Show coredumps

				      if: ${{ failure() && steps.vtest.outcome == 'failure' }}

				      run: |

									
										2

.github/workflows/windows.yml
									
											
				@ -35,7 +35,7 @@ jobs:

				          - USE_THREAD=1

				          - USE_ZLIB=1

				    steps:

				    - uses: actions/checkout@v4

				    - uses: actions/checkout@v5

				    - uses: msys2/setup-msys2@v2

				      with:

				        install: >-

									
										80

.github/workflows/wolfssl.yml
									
											
				@ -0,0 +1,80 @@

				name: WolfSSL

				on:

				  schedule:

				    - cron: "0 0 * * 4"

				  workflow_dispatch:

				permissions:

				  contents: read

				jobs:

				  test:

				    runs-on: ubuntu-latest

				    if: ${{ github.repository_owner == 'haproxy' || github.event_name == 'workflow_dispatch' }}

				    steps:

				      - uses: actions/checkout@v5

				      - name: Install apt dependencies

				        run: |

				          sudo apt-get update -o Acquire::Languages=none -o Acquire::Translation=none

				          sudo apt-get --no-install-recommends -y install socat gdb jose

				      - name: Install WolfSSL

				        run: env WOLFSSL_VERSION=git-master WOLFSSL_DEBUG=1 scripts/build-ssl.sh

				      - name: Compile HAProxy

				        run: |

				          make -j$(nproc) ERR=1 CC=gcc TARGET=linux-glibc \

				            USE_OPENSSL_WOLFSSL=1 USE_QUIC=1 \

				            SSL_LIB=${HOME}/opt/lib SSL_INC=${HOME}/opt/include \

				            DEBUG="-DDEBUG_POOL_INTEGRITY -DDEBUG_UNIT" \

				            ADDLIB="-Wl,-rpath,/usr/local/lib/ -Wl,-rpath,$HOME/opt/lib/" \

				            ARCH_FLAGS="-ggdb3 -fsanitize=address"

				          sudo make install

				      - name: Show HAProxy version

				        id: show-version

				        run: |

				          ldd $(which haproxy)

				          haproxy -vv

				          echo "version=$(haproxy -v |awk 'NR==1{print $3}')" >> $GITHUB_OUTPUT

				      - uses: ./.github/actions/setup-vtest

				      - name: Run VTest for HAProxy

				        id: vtest

				        run: |

				          make reg-tests VTEST_PROGRAM=../vtest/vtest REGTESTS_TYPES=default,bug,devel

				      - name: Run Unit tests

				        id: unittests

				        run: |

				          make unit-tests

				      - name: Show VTest results

				        if: ${{ failure() && steps.vtest.outcome == 'failure' }}

				        run: |

				          for folder in ${TMPDIR:-/tmp}/haregtests-*/vtc.*; do

				            printf "::group::"

				            cat $folder/INFO

				            cat $folder/LOG

				            echo "::endgroup::"

				          done

				          exit 1

				      - name: Show coredumps

				        if: ${{ failure() && steps.vtest.outcome == 'failure' }}

				        run: |

				          failed=false

				          shopt -s nullglob

				          for file in /tmp/core.*; do

				            failed=true

				            printf "::group::"

				            gdb -ex 'thread apply all bt full' ./haproxy $file

				            echo "::endgroup::"

				          done

				          if [ "$failed" = true ]; then

				            exit 1;

				          fi

				      - name: Show Unit-Tests results

				        if: ${{ failure() && steps.unittests.outcome == 'failure' }}

				        run: |

				          for result in ${TMPDIR:-/tmp}/ha-unittests-*/results/res.*; do

				            printf "::group::"

				            cat $result

				            echo "::endgroup::"

				          done

				          exit 1
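The result-collection loops in these workflows use `${TMPDIR:-/tmp}` rather than a bare `${TMPDIR}`. The following is a minimal sketch of the difference, not code from the repository: the two helper functions are hypothetical names introduced for illustration, and the patterns are printed as strings rather than expanded against a real filesystem.

```shell
#!/bin/sh
# Hypothetical helpers (not part of HAProxy's scripts): each one prints the
# glob pattern a "Show VTest results" style loop would iterate over.

results_glob_bare() {
    # With TMPDIR empty, the pattern starts at the filesystem root.
    printf '%s\n' "${TMPDIR}/haregtests-*/vtc.*"
}

results_glob_default() {
    # ":-" substitutes /tmp when TMPDIR is unset or empty.
    printf '%s\n' "${TMPDIR:-/tmp}/haregtests-*/vtc.*"
}

# Command substitution runs in a subshell, so these assignments do not leak
# into the calling shell.
bare=$(TMPDIR=''; results_glob_bare)
with_default=$(TMPDIR=''; results_glob_default)
custom=$(TMPDIR=/var/tmp; results_glob_default)

echo "bare:    $bare"
echo "default: $with_default"
echo "custom:  $custom"
```

On a runner where TMPDIR is not exported, the bare form would look for `/haregtests-*` and find nothing, while the defaulted form falls back to `/tmp`; when TMPDIR is set, both forms behave the same.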

1

.gitignore

 @ -57,3 +57,4 @@ dev/udp/udp-perturb
 /src/dlmalloc.c
 /tests/test_hashes
 doc/lua-api/_build
 dev/term_events/term_events

									
										2

.travis.yml
									
											
				@ -8,7 +8,7 @@ branches:

				env:

				  global:

				    - FLAGS="USE_LUA=1 USE_OPENSSL=1 USE_PCRE=1 USE_PCRE_JIT=1 USE_SYSTEMD=1 USE_ZLIB=1"

				    - FLAGS="USE_LUA=1 USE_OPENSSL=1 USE_PCRE=1 USE_PCRE_JIT=1 USE_ZLIB=1"

				    - TMPDIR=/tmp

				addons:

12

BRANCHES


 @ -171,7 +171,17 @@ feedback for developers:
     as the previous releases that had 6 months to stabilize. In terms of
     stability it really means that the point zero version already accumulated
     6 months of fixes and that it is much safer to use even just after it is
     released.
     released. There is one exception though, features marked as "experimental"
     are not guaranteed to be maintained beyond the release of the next LTS
     branch. The rationale here is that the experimental status is made to
     expose an early preview of a feature, that is often incomplete, not always
     in its definitive form regarding configuration, and for which developers
     are seeking feedback from the users. It is even possible that changes will
     be brought within the stable branch and it may happen that the feature
     breaks. It is not imaginable to always be able to backport bug fixes too
     far in this context since the code and configuration may change quite a
     bit. Users who want to try experimental features are expected to upgrade
     quickly to benefit from the improvements made to that feature.
   - for developers, given that the odd versions are solely used by highly
     skilled users, it's easier to get advanced traces and captures, and there

3832

CHANGELOG

File diff suppressed because it is too large

2

CONTRIBUTING


 @ -1010,7 +1010,7 @@ you notice you're already practising some of them:
   - continue to send pull requests after having been explained why they are not
     welcome.
   - give wrong advices to people asking for help, or sending them patches to
   - give wrong advice to people asking for help, or sending them patches to
     try which make no sense, waste their time, and give them a bad impression
     of the people working on the project.

120

INSTALL


 @ -9,7 +9,7 @@ used to follow updates then it is recommended that instead you use the packages
 provided by your software vendor or Linux distribution. Most of them are taking
 this task seriously and are doing a good job at backporting important fixes.
 If for any reason you'd prefer to use a different version than the one packaged
 If for any reason you would prefer a different version than the one packaged
 for your system, you want to be certain to have all the fixes or to get some
 commercial support, other choices are available at http://www.haproxy.com/.
 @ -34,18 +34,26 @@ are a few build examples :
   - recent Linux system with all options, make and install :
     $ make clean
     $ make -j $(nproc) TARGET=linux-glibc \
                 USE_OPENSSL=1 USE_LUA=1 USE_PCRE2=1 USE_SYSTEMD=1
            USE_OPENSSL=1 USE_QUIC=1 USE_QUIC_OPENSSL_COMPAT=1 \
            USE_LUA=1 USE_PCRE2=1
     $ sudo make install
   - FreeBSD and OpenBSD, build with all options :
     $ gmake -j 4 TARGET=freebsd USE_OPENSSL=1 USE_LUA=1 USE_PCRE2=1
   - FreeBSD + OpenSSL, build with all options :
     $ gmake -j $(sysctl -n hw.ncpu) TARGET=freebsd \
            USE_OPENSSL=1 USE_QUIC=1 USE_QUIC_OPENSSL_COMPAT=1 \
            USE_LUA=1 USE_PCRE2=1
   - OpenBSD + LibreSSL, build with all options :
     $ gmake -j $(sysctl -n hw.ncpu) TARGET=openbsd \
            USE_OPENSSL=1 USE_QUIC=1 USE_LUA=1 USE_PCRE2=1
   - embedded Linux, build using a cross-compiler :
     $ make -j $(nproc) TARGET=linux-glibc USE_OPENSSL=1 USE_PCRE2=1 \
                 CC=/opt/cross/gcc730-arm/bin/gcc ADDLIB=-latomic
            CC=/opt/cross/gcc730-arm/bin/gcc CFLAGS="-mthumb" ADDLIB=-latomic
   - Build with static PCRE on Solaris / UltraSPARC :
     $ make TARGET=solaris CPU_CFLAGS="-mcpu=v9" USE_STATIC_PCRE2=1
     $ make -j $(/usr/sbin/psrinfo -p) TARGET=solaris \
            CPU_CFLAGS="-mcpu=v9" USE_STATIC_PCRE2=1
 For more advanced build options or if a command above reports an error, please
 read the following sections.
 @ -103,20 +111,22 @@ HAProxy requires a working GCC or Clang toolchain and GNU make :
     may want to retry with "gmake" which is the name commonly used for GNU make
     on BSD systems.
   - GCC >= 4.2 (up to 13 tested). Older versions can be made to work with a
     few minor adaptations if really needed. Newer versions may sometimes break
     due to compiler regressions or behaviour changes. The version shipped with
     your operating system is very likely to work with no trouble. Clang >= 3.0
     is also known to work as an alternative solution. Recent versions may emit
     a bit more warnings that are worth reporting as they may reveal real bugs.
     TCC (https://repo.or.cz/tinycc.git) is also usable for developers but will
     not support threading and was found at least once to produce bad code in
     some rare corner cases (since fixed). But it builds extremely quickly
     (typically half a second for the whole project) and is very convenient to
     run quick tests during API changes or code refactoring.
   - GCC >= 4.7 (up to 15 tested). Older versions are no longer supported due to
     the latest mt_list update which only uses c11-like atomics. Newer versions
     may sometimes break due to compiler regressions or behaviour changes. The
     version shipped with your operating system is very likely to work with no
     trouble. Clang >= 3.0 is also known to work as an alternative solution, and
     versions up to 19 were successfully tested. Recent versions may emit a bit
     more warnings that are worth reporting as they may reveal real bugs. TCC
     (https://repo.or.cz/tinycc.git) is also usable for developers but will not
     support threading and was found at least once to produce bad code in some
     rare corner cases (since fixed). But it builds extremely quickly (typically
     half a second for the whole project) and is very convenient to run quick
     tests during API changes or code refactoring.
   - GNU ld (binutils package), with no particular version. Other linkers might
     work but were not tested.
     work but were not tested. The default one from your operating system will
     normally work.
 On debian or Ubuntu systems and their derivatives, you may get all these tools
 at once by issuing the two following commands :
 @ -227,7 +237,7 @@ to forcefully enable it using "USE_LIBCRYPT=1".
 -----------------
 For SSL/TLS, it is necessary to use a cryptography library. HAProxy currently
 supports the OpenSSL library, and is known to build and work with branches
 1.0.0, 1.0.1, 1.0.2, 1.1.0, 1.1.1, 3.0, 3.1 and 3.2. It is recommended to use
 1.0.0, 1.0.1, 1.0.2, 1.1.0, 1.1.1, and 3.0 to 3.6. It is recommended to use
 at least OpenSSL 1.1.1 to have support for all SSL keywords and configuration
 in HAProxy. OpenSSL follows a long-term support cycle similar to HAProxy's,
 and each of the branches above receives its own fixes, without forcing you to
 @ -244,16 +254,20 @@ https://github.com/openssl/openssl/issues/17627). If a migration to 3.x is
 mandated by support reasons, at least 3.1 recovers a small fraction of this
 important loss.
 Four OpenSSL derivatives called LibreSSL, BoringSSL, QUICTLS, and AWS-LC are
 Three OpenSSL derivatives called LibreSSL, QUICTLS, and AWS-LC are
 reported to work as well. While there are some efforts from the community to
 ensure they work well, OpenSSL remains the primary target and this means that
 in case of conflicting choices, OpenSSL support will be favored over other
 options.  Note that QUIC is not fully supported when haproxy is built with
 OpenSSL. In this case, QUICTLS is the preferred alternative.  As of writing
 this, the QuicTLS project follows OpenSSL very closely and provides update
 simultaneously, but being a volunteer-driven project, its long-term future does
 not look certain enough to convince operating systems to package it, so it
 needs to be build locally. See the section about QUIC in this document.
 OpenSSL < 3.5.2 version. In this case, QUICTLS or AWS-LC are the preferred
 alternatives. As of writing this, the QuicTLS project follows OpenSSL very
 closely and provides update simultaneously, but being a volunteer-driven
 project, its long-term future does not look certain enough to convince
 operating systems to package it, so it needs to be build locally. Recent
 versions of AWS-LC (>= 1.22 and the FIPS branches) are pretty complete and
 generally more performant than other OpenSSL derivatives, but may behave
 slightly differently, particularly when dealing with outdated setups. See
 the section about QUIC in this document.
 A fifth option is wolfSSL (https://github.com/wolfSSL/wolfssl). It is the only
 supported alternative stack not based on OpenSSL, yet which implements almost
 @ -312,7 +326,7 @@ command line, for example:
   $ make -j $(nproc) TARGET=generic USE_OPENSSL_WOLFSSL=1 USE_QUIC=1 \
     SSL_INC=/opt/wolfssl-5.6.6/include SSL_LIB=/opt/wolfssl-5.6.6/lib
 To use HAProxy with AWS-LC you must have version v1.13.0 or newer of AWS-LC
 To use HAProxy with AWS-LC you must have version v1.22.0 or newer of AWS-LC
 built and installed locally.
  $ cd ~/build/aws-lc
  $ cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/opt/aws-lc
 @ -375,10 +389,15 @@ systems, by passing "USE_SLZ=" to the "make" command.
 Please note that SLZ will benefit from some CPU-specific instructions like the
 availability of the CRC32 extension on some ARM processors. Thus it can further
 improve its performance to build with "CPU=native" on the target system, or
 "CPU=armv81" (modern systems such as Graviton2 or A55/A75 and beyond),
 "CPU=a72" (e.g. for RPi4, or AWS Graviton), "CPU=a53" (e.g. for RPi3), or
 "CPU=armv8-auto" (automatic detection with minor runtime penalty).
 improve its performance to build with:
   - "CPU_CFLAGS=-march=native" on the target system or
   - "CPU_CFLAGS=-march=armv81" on modern systems such as Graviton2 or A55/A75
      and beyond)
   - "CPU_CFLAGS=-march=a72" (e.g. for RPi4, or AWS Graviton)
   - "CPU_CFLAGS=-march=a53" (e.g. for RPi3)
   - "CPU_CFLAGS=-march=armv8-auto" automatic detection with minor runtime
      penalty)
 A second option involves the widely known zlib library, which is very likely
 installed on your system. In order to use zlib, simply pass "USE_ZLIB=1" to the
 @ -452,12 +471,6 @@ are the extra libraries that may be referenced at build time :
                   on Linux. It is automatically detected and may be disabled
                   using "USE_DL=", though it should never harm.
   - USE_SYSTEMD=1 enables support for the sdnotify features of systemd,
                   allowing better integration with systemd on Linux systems
                   which come with it. It is never enabled by default so there
                   is no need to disable it.
 4.10) Common errors
 -------------------
 Some build errors may happen depending on the options combinations or the
 @ -481,8 +494,8 @@ target. Common issues may include:
        other supported compatible library.
   - many "dereferencing pointer 'sa.985' does break strict-aliasing rules"
     => these warnings happen on old compilers (typically gcc-4.4), and may
        safely be ignored; newer ones are better on these.
     => these warnings happen on old compilers (typically gcc before 7.x),
        and may safely be ignored; newer ones are better on these.
 4.11) QUIC
 @ -491,10 +504,11 @@ QUIC is the new transport layer protocol and is required for HTTP/3. This
 protocol stack is currently supported as an experimental feature in haproxy on
 the frontend side. In order to enable it, use "USE_QUIC=1 USE_OPENSSL=1".
 Note that QUIC is not fully supported by the OpenSSL library. Indeed QUIC 0-RTT
 cannot be supported by OpenSSL contrary to others libraries with full QUIC
 support. The preferred option is to use QUICTLS. This is a fork of OpenSSL with
 a QUIC-compatible API. Its repository is available at this location:
 Note that QUIC is not always fully supported by the OpenSSL library depending on
 its version. Indeed QUIC 0-RTT cannot be supported by OpenSSL for versions before
 3.5 contrary to others libraries with full QUIC support. The preferred option is
 to use QUICTLS. This is a fork of OpenSSL with a QUIC-compatible API. Its
 repository is available at this location:
      https://github.com/quictls/openssl
 @ -522,14 +536,18 @@ way assuming that wolfSSL was installed in /opt/wolfssl-5.6.0 as shown in 4.5:
     SSL_INC=/opt/wolfssl-5.6.0/include SSL_LIB=/opt/wolfssl-5.6.0/lib
     LDFLAGS="-Wl,-rpath,/opt/wolfssl-5.6.0/lib"
 As last resort, haproxy may be compiled against OpenSSL as follows:
 As last resort, haproxy may be compiled against OpenSSL as follows from 3.5
 version with 0-RTT support:
   $ make TARGET=generic USE_OPENSSL=1 USE_QUIC=1
 or as follows for all OpenSSL versions but without O-RTT support:
   $ make TARGET=generic USE_OPENSSL=1 USE_QUIC=1 USE_QUIC_OPENSSL_COMPAT=1
 Note that QUIC 0-RTT is not supported by haproxy QUIC stack when built against
 OpenSSL. In addition to this compilation requirements, the QUIC listener
 bindings must be explicitly enabled with a specific QUIC tuning parameter.
 (see "limited-quic" global parameter of haproxy Configuration Manual).
 In addition to this requirements, the QUIC listener bindings must be explicitly
 enabled with a specific QUIC tuning parameter. (see "limited-quic" global
 parameter of haproxy Configuration Manual).
 5) How to build HAProxy
 @ -545,9 +563,9 @@ It goes into more details with the main options.
 To build haproxy, you have to choose your target OS amongst the following ones
 and assign it to the TARGET variable :
   - linux-glibc         for Linux kernel 2.6.28 and above
   - linux-glibc         for Linux kernel 4.17 and above
   - linux-glibc-legacy  for Linux kernel 2.6.28 and above without new features
   - linux-musl          for Linux kernel 2.6.28 and above with musl libc
   - linux-musl          for Linux kernel 4.17 and above with musl libc
   - solaris             for Solaris 10 and above
   - freebsd             for FreeBSD 10 and above
   - dragonfly           for DragonFlyBSD 4.3 and above
 @ -747,8 +765,8 @@ forced to produce final binaries, and must not be used during bisect sessions,
 as it will often lead to the wrong commit.
 Examples:
   # silence strict-aliasing warnings with old gcc-4.4:
   $ make -j$(nproc) TARGET=linux-glibc CC=gcc-44 CFLAGS=-fno-strict-aliasing
   # silence strict-aliasing warnings with old gcc-5.5:
   $ make -j$(nproc) TARGET=linux-glibc CC=gcc-55 CFLAGS=-fno-strict-aliasing
   # disable all warning options:
   $ make -j$(nproc) TARGET=linux-glibc CC=mycc WARN_CFLAGS= NOWARN_CFLAGS=
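The INSTALL changes above raise the minimum kernel for the "linux-glibc" target to 4.17 and keep "linux-glibc-legacy" for older kernels. As a rough sketch of how a build wrapper might choose between the two, the hypothetical helper below (not part of HAProxy's build system) compares the major and minor numbers of a kernel release string. It assumes a glibc system and does not try to detect musl, for which "linux-musl" would be the relevant target.

```shell
#!/bin/sh
# Hypothetical helper (illustration only): map a kernel release string such
# as the output of "uname -r" to one of the glibc TARGET names in INSTALL.
pick_linux_target() {
    release=$1                  # e.g. "5.15.0-91-generic"
    major=${release%%.*}        # text before the first dot
    rest=${release#*.}          # text after the first dot
    minor=${rest%%[!0-9]*}      # leading digits of the remainder

    # 4.17 is the threshold stated in the INSTALL text above.
    if [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 17 ]; }; then
        echo linux-glibc
    else
        echo linux-glibc-legacy
    fi
}

# Sample release strings, chosen for illustration.
for rel in 2.6.32-754.el6 4.9.0-8-amd64 4.17.0 5.15.0-91-generic; do
    echo "$rel -> $(pick_linux_target "$rel")"
done
```

A wrapper could then pass the result to make, for example `make TARGET=$(pick_linux_target "$(uname -r)")`, though in most cases naming the target explicitly remains the simpler choice.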

2

MAINTAINERS


 @ -138,7 +138,7 @@ ScientiaMobile WURFL Device Detection
 Maintainer: Paul Borile, Massimiliano Bellomi <wurfl-haproxy-support@scientiamobile.com>
 Files: addons/wurfl, doc/WURFL-device-detection.txt
 SPOE (deprecated)
 SPOE
 Maintainer: Christopher Faulet <cfaulet@haproxy.com>
 Files: src/flt_spoe.c, include/haproxy/spoe*.h, doc/SPOE.txt

									
										203

Makefile
									
											
				@ -35,6 +35,7 @@

				#   USE_OPENSSL             : enable use of OpenSSL. Recommended, but see below.

				#   USE_OPENSSL_AWSLC       : enable use of AWS-LC

				#   USE_OPENSSL_WOLFSSL     : enable use of wolfSSL with the OpenSSL API

				#   USE_ECH                 : enable use of ECH with the OpenSSL API

				#   USE_QUIC                : enable use of QUIC with the quictls API (quictls, libressl, boringssl)

				#   USE_QUIC_OPENSSL_COMPAT : enable use of QUIC with the standard openssl API (limited features)

				#   USE_ENGINE              : enable use of OpenSSL Engine.

				@ -56,14 +57,14 @@

				#   USE_DEVICEATLAS         : enable DeviceAtlas api.

				#   USE_51DEGREES           : enable third party device detection library from 51Degrees

				#   USE_WURFL               : enable WURFL detection library from Scientiamobile

				#   USE_SYSTEMD             : enable sd_notify() support.

				#   USE_OBSOLETE_LINKER     : use when the linker fails to emit __start_init/__stop_init

				#   USE_THREAD_DUMP         : use the more advanced thread state dump system. Automatic.

				#   USE_OT                  : enable the OpenTracing filter

				#   USE_MEMORY_PROFILING    : enable the memory profiler. Linux-glibc only.

				#   USE_LIBATOMIC           : force to link with/without libatomic. Automatic.

				#   USE_PTHREAD_EMULATION   : replace pthread's rwlocks with ours

				#   USE_SHM_OPEN            : use shm_open() for the startup-logs

				#   USE_SHM_OPEN            : use shm_open() for features that can make use of shared memory

				#   USE_KTLS                : use kTLS.(requires at least Linux 4.17).

				#

				# Options can be forced by specifying "USE_xxx=1" or can be disabled by using

				# "USE_xxx=" (empty string). The list of enabled and disabled options for a

				@ -135,7 +136,12 @@

				#   VTEST_PROGRAM  : location of the vtest program to run reg-tests.

				#   DEBUG_USE_ABORT: use abort() for program termination, see include/haproxy/bug.h for details

				#### Add -Werror when set to non-empty, and make Makefile stop on warnings.

				#### It must be declared before includes because it's used there.

				ERR =

				include include/make/verbose.mk

				include include/make/errors.mk

				include include/make/compiler.mk

				include include/make/options.mk

				@ -159,7 +165,7 @@ TARGET =

				CPU =

				ifneq ($(CPU),)

				ifneq ($(CPU),generic)

				$(warning Warning: the "CPU" variable was forced to "$(CPU)" but is no longer \

				$(call $(complain),the "CPU" variable was forced to "$(CPU)" but is no longer \

				  used and will be ignored. For native builds, modern compilers generally     \

				  prefer that the string "-march=native" is passed in CPU_CFLAGS or CFLAGS.   \

				  For other CPU-specific options, please read suggestions in the INSTALL file.)

				@ -169,7 +175,7 @@ endif

				#### No longer used

				ARCH =

				ifneq ($(ARCH),)

				$(warning Warning: the "ARCH" variable was forced to "$(ARCH)" but is no \

				$(call $(complain),the "ARCH" variable was forced to "$(ARCH)" but is no \

				  longer used and will be ignored. Please check the INSTALL file for other \

				  options, but usually in order to pass arch-specific options, ARCH_FLAGS, \

				  CFLAGS or LDFLAGS are preferred.)

				@ -187,7 +193,7 @@ OPT_CFLAGS = -O2

				#### No longer used

				DEBUG_CFLAGS =

				ifneq ($(DEBUG_CFLAGS),)

				$(warning Warning: DEBUG_CFLAGS was forced to "$(DEBUG_CFLAGS)" but is no     \

				$(call $(complain),DEBUG_CFLAGS was forced to "$(DEBUG_CFLAGS)" but is no     \

				  longer used and will be ignored. If you have ported this build setting from \

				  and older version, it is likely that you just want to pass these options    \

				  to the CFLAGS variable. If you are passing some debugging-related options   \

				@ -195,12 +201,10 @@ $(warning Warning: DEBUG_CFLAGS was forced to "$(DEBUG_CFLAGS)" but is no     \

				  both the compilation and linking stages.)

				endif

				#### Add -Werror when set to non-empty

				ERR =

				#### May be used to force running a specific set of reg-tests

				REG_TEST_FILES =

				REG_TEST_SCRIPT=./scripts/run-regtests.sh

				UNIT_TEST_SCRIPT=./scripts/run-unittests.sh

				#### Standard C definition

				# Compiler-specific flags that may be used to set the standard behavior we

				@ -210,7 +214,8 @@ REG_TEST_SCRIPT=./scripts/run-regtests.sh

				# undefined behavior to silently produce invalid code. For this reason we have

				# to use -fwrapv or -fno-strict-overflow to guarantee the intended behavior.

				# It is preferable not to change this option in order to avoid breakage.

				STD_CFLAGS  := $(call cc-opt-alt,-fwrapv,-fno-strict-overflow)

				STD_CFLAGS  := $(call cc-opt-alt,-fwrapv,-fno-strict-overflow)                \

				               $(call cc-opt,-fvect-cost-model=very-cheap)

				#### Compiler-specific flags to enable certain classes of warnings.

				# Some are hard-coded, others are enabled only if supported.

				@ -247,7 +252,7 @@ endif

				#### No longer used

				SMALL_OPTS =

				ifneq ($(SMALL_OPTS),)

				$(warning Warning: SMALL_OPTS was forced to "$(SMALL_OPTS)" but is no longer \

				$(call $(complain),SMALL_OPTS was forced to "$(SMALL_OPTS)" but is no longer \

				  used and will be ignored. Please check if this setting are still relevant, \

				  and move it either to DEFINE or to CFLAGS instead.)

				endif

				@ -260,8 +265,9 @@ endif

				# without appearing here. Currently defined DEBUG macros include DEBUG_FULL,

				# DEBUG_MEM_STATS, DEBUG_DONT_SHARE_POOLS, DEBUG_FD, DEBUG_POOL_INTEGRITY,

				# DEBUG_NO_POOLS, DEBUG_FAIL_ALLOC, DEBUG_STRICT_ACTION=[0-3], DEBUG_HPACK,

				# DEBUG_AUTH, DEBUG_SPOE, DEBUG_UAF, DEBUG_THREAD, DEBUG_STRICT, DEBUG_DEV,

				# DEBUG_TASK, DEBUG_MEMORY_POOLS, DEBUG_POOL_TRACING, DEBUG_QPACK, DEBUG_LIST.

				# DEBUG_AUTH, DEBUG_SPOE, DEBUG_UAF, DEBUG_THREAD=0-2, DEBUG_STRICT, DEBUG_DEV,

				# DEBUG_TASK, DEBUG_MEMORY_POOLS, DEBUG_POOL_TRACING, DEBUG_QPACK, DEBUG_LIST,

				# DEBUG_COUNTERS=[0-2], DEBUG_STRESS, DEBUG_UNIT.

				DEBUG =

				#### Trace options

				@ -336,14 +342,16 @@ use_opts = USE_EPOLL USE_KQUEUE USE_NETFILTER USE_POLL                        \

				           USE_TPROXY USE_LINUX_TPROXY USE_LINUX_CAP                          \

				           USE_LINUX_SPLICE USE_LIBCRYPT USE_CRYPT_H USE_ENGINE               \

				           USE_GETADDRINFO USE_OPENSSL USE_OPENSSL_WOLFSSL USE_OPENSSL_AWSLC  \

					       USE_ECH                                                            \

				           USE_SSL USE_LUA USE_ACCEPT4 USE_CLOSEFROM USE_ZLIB USE_SLZ         \

				           USE_CPU_AFFINITY USE_TFO USE_NS USE_DL USE_RT USE_LIBATOMIC        \

				           USE_MATH USE_DEVICEATLAS USE_51DEGREES                             \

				           USE_WURFL USE_SYSTEMD USE_OBSOLETE_LINKER USE_PRCTL USE_PROCCTL    \

				           USE_WURFL USE_OBSOLETE_LINKER USE_PRCTL USE_PROCCTL                \

				           USE_THREAD_DUMP USE_EVPORTS USE_OT USE_QUIC USE_PROMEX             \

				           USE_MEMORY_PROFILING USE_SHM_OPEN                                  \

				           USE_STATIC_PCRE USE_STATIC_PCRE2                                   \

				           USE_PCRE USE_PCRE_JIT USE_PCRE2 USE_PCRE2_JIT USE_QUIC_OPENSSL_COMPAT

				           USE_PCRE USE_PCRE_JIT USE_PCRE2 USE_PCRE2_JIT                      \

				           USE_QUIC_OPENSSL_COMPAT USE_KTLS

				# preset all variables for all supported build options among use_opts

				$(reset_opts_vars)

				@ -374,13 +382,13 @@ ifeq ($(TARGET),haiku)

				  set_target_defaults = $(call default_opts,USE_POLL USE_TPROXY USE_OBSOLETE_LINKER)

				endif

				# For linux >= 2.6.28 and glibc

				# For linux >= 4.17 and glibc

				ifeq ($(TARGET),linux-glibc)

				  set_target_defaults = $(call default_opts, \

				    USE_POLL USE_TPROXY USE_LIBCRYPT USE_DL USE_RT USE_CRYPT_H USE_NETFILTER  \

				    USE_CPU_AFFINITY USE_THREAD USE_EPOLL USE_LINUX_TPROXY USE_LINUX_CAP      \

				    USE_ACCEPT4 USE_LINUX_SPLICE USE_PRCTL USE_THREAD_DUMP USE_NS USE_TFO     \

				    USE_GETADDRINFO USE_BACKTRACE USE_SHM_OPEN USE_SYSTEMD)

				    USE_GETADDRINFO USE_BACKTRACE USE_SHM_OPEN USE_KTLS)

				  INSTALL = install -v

				endif

				@ -393,13 +401,13 @@ ifeq ($(TARGET),linux-glibc-legacy)

				  INSTALL = install -v

				endif

				# For linux >= 2.6.28 and musl

				# For linux >= 4.17 and musl

				ifeq ($(TARGET),linux-musl)

				  set_target_defaults = $(call default_opts, \

				    USE_POLL USE_TPROXY USE_LIBCRYPT USE_DL USE_RT USE_CRYPT_H USE_NETFILTER  \

				    USE_CPU_AFFINITY USE_THREAD USE_EPOLL USE_LINUX_TPROXY USE_LINUX_CAP      \

				    USE_ACCEPT4 USE_LINUX_SPLICE USE_PRCTL USE_THREAD_DUMP USE_NS USE_TFO     \

				    USE_GETADDRINFO USE_SHM_OPEN)

				    USE_GETADDRINFO USE_BACKTRACE USE_SHM_OPEN USE_KTLS)

				  INSTALL = install -v

				endif

				@ -416,7 +424,7 @@ endif

				ifeq ($(TARGET),freebsd)

				  set_target_defaults = $(call default_opts, \

				    USE_POLL USE_TPROXY USE_LIBCRYPT USE_THREAD USE_CPU_AFFINITY USE_KQUEUE   \

				    USE_ACCEPT4 USE_CLOSEFROM USE_GETADDRINFO USE_PROCCTL USE_SHM_OPEN)

				    USE_ACCEPT4 USE_CLOSEFROM USE_GETADDRINFO USE_PROCCTL)

				endif

				# kFreeBSD glibc

				@ -590,10 +598,16 @@ endif

				ifneq ($(USE_BACKTRACE:0=),)

				  BACKTRACE_LDFLAGS = -Wl,$(if $(EXPORT_SYMBOL),$(EXPORT_SYMBOL),--export-dynamic)

				  BACKTRACE_CFLAGS  = -fno-omit-frame-pointer

				endif

				ifneq ($(USE_MEMORY_PROFILING:0=),)

				  MEMORY_PROFILING_CFLAGS  = -fno-optimize-sibling-calls

				endif

				ifneq ($(USE_CPU_AFFINITY:0=),)

				  OPTIONS_OBJS   += src/cpuset.o

				  OPTIONS_OBJS   += src/cpu_topo.o

				endif

				# OpenSSL is packaged in various forms and with various dependencies.

				@ -626,7 +640,10 @@ ifneq ($(USE_OPENSSL:0=),)

				    SSL_LDFLAGS   := $(if $(SSL_LIB),-L$(SSL_LIB)) -lssl -lcrypto

				  endif

				  USE_SSL         := $(if $(USE_SSL:0=),$(USE_SSL:0=),implicit)

				  OPTIONS_OBJS += src/ssl_sock.o src/ssl_ckch.o src/ssl_sample.o src/ssl_crtlist.o src/cfgparse-ssl.o src/ssl_utils.o src/jwt.o src/ssl_ocsp.o src/ssl_gencert.o

				  OPTIONS_OBJS += src/ssl_sock.o src/ssl_ckch.o src/ssl_ocsp.o src/ssl_crtlist.o       \

				                  src/ssl_sample.o src/cfgparse-ssl.o src/ssl_gencert.o                \

				                  src/ssl_utils.o src/jwt.o src/ssl_clienthello.o src/jws.o src/acme.o \

				                  src/ssl_trace.o src/jwe.o

				endif

				ifneq ($(USE_ENGINE:0=),)

				@ -638,17 +655,22 @@ ifneq ($(USE_ENGINE:0=),)

				endif

				ifneq ($(USE_QUIC:0=),)

				OPTIONS_OBJS += src/quic_conn.o src/mux_quic.o src/h3.o src/xprt_quic.o    \

				                src/quic_frame.o src/quic_tls.o src/quic_tp.o              \

				                src/quic_stats.o src/quic_sock.o src/proto_quic.o          \

				                src/qmux_trace.o src/quic_loss.o src/qpack-enc.o           \

				                src/quic_cc_newreno.o src/quic_cc_cubic.o src/qpack-tbl.o  \

				                src/qpack-dec.o src/hq_interop.o src/quic_stream.o         \

				                src/h3_stats.o src/qmux_http.o src/cfgparse-quic.o         \

				                src/cbuf.o src/quic_cc.o src/quic_cc_nocc.o src/quic_ack.o \

				                src/quic_trace.o src/quic_cli.o src/quic_ssl.o             \

				                src/quic_rx.o src/quic_tx.o src/quic_cid.o src/quic_retry.o\

				                src/quic_retransmit.o src/quic_fctl.o

				OPTIONS_OBJS += src/mux_quic.o src/h3.o src/quic_rx.o src/quic_tx.o	\

				                src/quic_conn.o src/quic_frame.o src/quic_sock.o	\

				                src/quic_tls.o src/quic_ssl.o src/proto_quic.o		\

				                src/quic_cli.o src/quic_trace.o src/quic_tp.o		\

				                src/quic_cid.o src/quic_stream.o			\

				                src/quic_retransmit.o src/quic_loss.o			\

				                src/hq_interop.o src/quic_cc_cubic.o			\

				                src/quic_cc_bbr.o src/quic_retry.o			\

				                src/cfgparse-quic.o src/xprt_quic.o src/quic_token.o	\

				                src/quic_ack.o src/qpack-dec.o src/quic_cc_newreno.o	\

				                src/qmux_http.o src/qmux_trace.o src/quic_rules.o	\

				                src/quic_cc_nocc.o src/quic_cc.o src/quic_pacing.o	\

				                src/h3_stats.o src/quic_stats.o src/qpack-enc.o		\

				                src/qpack-tbl.o src/quic_cc_drs.o src/quic_fctl.o	\

				                src/quic_enc.o

				endif

				ifneq ($(USE_QUIC_OPENSSL_COMPAT:0=),)

				@ -760,10 +782,6 @@ ifneq ($(USE_WURFL:0=),)

				  WURFL_LDFLAGS    = $(if $(WURFL_LIB),-L$(WURFL_LIB)) -lwurfl

				endif

				ifneq ($(USE_SYSTEMD:0=),)

				  OPTIONS_OBJS    += src/systemd.o

				endif

				ifneq ($(USE_PCRE:0=)$(USE_STATIC_PCRE:0=)$(USE_PCRE_JIT:0=),)

				  ifneq ($(USE_PCRE2:0=)$(USE_STATIC_PCRE2:0=)$(USE_PCRE2_JIT:0=),)

				    $(error cannot compile both PCRE and PCRE2 support)

				@ -933,7 +951,7 @@ all:

					@echo

					@exit 1

				else

				all: haproxy dev/flags/flags $(EXTRA)

				all: dev/flags/flags haproxy $(EXTRA)

				endif # obsolete targets

				endif # TARGET

				@ -943,40 +961,48 @@ ifneq ($(EXTRA_OBJS),)

				  OBJS += $(EXTRA_OBJS)

				endif

				OBJS += src/mux_h2.o src/mux_fcgi.o src/mux_h1.o src/tcpcheck.o               \

				        src/stream.o src/stats.o src/http_ana.o src/server.o                  \

				        src/stick_table.o src/sample.o src/flt_spoe.o src/tools.o             \

				        src/log.o src/cfgparse.o src/peers.o src/backend.o src/resolvers.o    \

				        src/cli.o src/connection.o src/proxy.o src/http_htx.o                 \

				        src/cfgparse-listen.o src/pattern.o src/check.o src/haproxy.o         \

				        src/cache.o src/stconn.o src/http_act.o src/http_fetch.o              \

				        src/http_client.o src/listener.o src/dns.o src/vars.o src/debug.o     \

				        src/tcp_rules.o src/sink.o src/h1_htx.o src/task.o src/mjson.o        \

				        src/h2.o src/filters.o src/server_state.o src/payload.o               \

				        src/fcgi-app.o src/map.o src/htx.o src/h1.o src/pool.o src/dns_ring.o \

				        src/cfgparse-global.o src/trace.o src/tcp_sample.o src/http_ext.o     \

				        src/flt_http_comp.o src/mux_pt.o src/flt_trace.o src/mqtt.o           \

				        src/acl.o src/sock.o src/mworker.o src/tcp_act.o src/ring.o           \

				        src/session.o src/proto_tcp.o src/fd.o src/channel.o src/activity.o   \

				        src/queue.o src/lb_fas.o src/http_rules.o src/extcheck.o              \

				        src/flt_bwlim.o src/thread.o src/http.o src/lb_chash.o src/applet.o   \

				        src/compression.o src/raw_sock.o src/ncbuf.o src/frontend.o           \

				        src/errors.o src/uri_normalizer.o src/http_conv.o src/lb_fwrr.o       \

				        src/sha1.o src/proto_sockpair.o src/mailers.o src/lb_fwlc.o           \

				        src/ebmbtree.o src/cfgcond.o src/action.o src/xprt_handshake.o        \

				        src/protocol.o src/proto_uxst.o src/proto_udp.o src/lb_map.o          \

				        src/fix.o src/ev_select.o src/arg.o src/sock_inet.o src/event_hdl.o   \

				        src/mworker-prog.o src/hpack-dec.o src/cfgparse-tcp.o src/lb_ss.o     \

				        src/sock_unix.o src/shctx.o src/proto_uxdg.o src/fcgi.o               \

				        src/eb64tree.o src/clock.o src/chunk.o src/cfgdiag.o src/signal.o     \

				        src/regex.o src/lru.o src/eb32tree.o src/eb32sctree.o                 \

				        src/cfgparse-unix.o src/hpack-tbl.o src/ebsttree.o src/ebimtree.o     \

				        src/base64.o src/auth.o src/uri_auth.o src/time.o src/ebistree.o      \

				        src/dynbuf.o src/wdt.o src/pipe.o src/init.o src/http_acl.o           \

				        src/hpack-huff.o src/hpack-enc.o src/dict.o src/freq_ctr.o            \

				        src/ebtree.o src/hash.o src/dgram.o src/version.o src/proto_rhttp.o   \

				        src/guid.o src/stats-html.o src/stats-json.o src/stats-file.o         \

				        src/stats-proxy.o

				OBJS += src/mux_h2.o src/mux_h1.o src/mux_fcgi.o src/log.o		\

				        src/server.o src/stream.o src/tcpcheck.o src/http_ana.o		\

				        src/stick_table.o src/tools.o src/mux_spop.o src/sample.o	\

				        src/activity.o src/cfgparse.o src/peers.o src/cli.o		\

				        src/backend.o src/connection.o src/resolvers.o src/proxy.o	\

				        src/cache.o src/stconn.o src/http_htx.o src/debug.o		\

				        src/check.o src/stats-html.o src/haproxy.o src/listener.o	\

				        src/applet.o src/pattern.o src/cfgparse-listen.o		\

				        src/flt_spoe.o src/cebis_tree.o src/http_ext.o			\

				        src/http_act.o src/http_fetch.o src/cebs_tree.o			\

				        src/cebib_tree.o src/http_client.o src/dns.o			\

				        src/cebb_tree.o src/vars.o src/event_hdl.o src/tcp_rules.o	\

				        src/trace.o src/stats-proxy.o src/pool.o src/stats.o		\

				        src/cfgparse-global.o src/filters.o src/mux_pt.o		\

				        src/flt_http_comp.o src/sock.o src/h1.o src/sink.o		\

				        src/ceba_tree.o src/session.o src/payload.o src/htx.o		\

				        src/cebl_tree.o src/ceb32_tree.o src/ceb64_tree.o		\

				        src/server_state.o src/proto_rhttp.o src/flt_trace.o src/fd.o	\

				        src/task.o src/map.o src/fcgi-app.o src/h2.o src/mworker.o	\

				        src/tcp_sample.o src/mjson.o src/h1_htx.o src/tcp_act.o		\

				        src/ring.o src/flt_bwlim.o src/acl.o src/thread.o src/queue.o	\

				        src/http_rules.o src/http.o src/channel.o src/proto_tcp.o	\

				        src/mqtt.o src/lb_chash.o src/extcheck.o src/dns_ring.o		\

				        src/errors.o src/ncbuf.o src/compression.o src/http_conv.o	\

				        src/frontend.o src/stats-json.o src/proto_sockpair.o		\

				        src/raw_sock.o src/action.o src/stats-file.o src/buf.o		\

				        src/xprt_handshake.o src/proto_uxst.o src/lb_fwrr.o		\

				        src/uri_normalizer.o src/mailers.o src/protocol.o		\

				        src/cfgcond.o src/proto_udp.o src/lb_fwlc.o src/ebmbtree.o	\

				        src/proto_uxdg.o src/cfgdiag.o src/sock_unix.o src/sha1.o	\

				        src/lb_fas.o src/clock.o src/sock_inet.o src/ev_select.o	\

				        src/lb_map.o src/shctx.o src/hpack-dec.o src/net_helper.o       \

				        src/arg.o src/signal.o src/fix.o src/dynbuf.o src/guid.o	\

				        src/cfgparse-tcp.o src/lb_ss.o src/chunk.o src/counters.o	\

				        src/cfgparse-unix.o src/regex.o src/fcgi.o src/uri_auth.o	\

				        src/eb64tree.o src/eb32tree.o src/eb32sctree.o src/lru.o	\

				        src/limits.o src/ebimtree.o src/wdt.o src/hpack-tbl.o		\

				        src/ebistree.o src/base64.o src/auth.o src/time.o		\

				        src/ebsttree.o src/freq_ctr.o src/systemd.o src/init.o		\

				        src/http_acl.o src/dict.o src/dgram.o src/pipe.o		\

				        src/hpack-huff.o src/hpack-enc.o src/ebtree.o src/hash.o	\

				        src/httpclient_cli.o src/version.o src/ncbmbuf.o src/ech.o

				ifneq ($(TRACE),)

				  OBJS += src/calltrace.o

				@ -1011,8 +1037,9 @@ help:

				# TARGET variable is not set since we're not building, by definition.

				IGNORE_OPTS=help install install-man install-doc install-bin \

					uninstall clean tags cscope tar git-tar version update-version \

					opts reg-tests reg-tests-help admin/halog/halog dev/flags/flags \

					dev/haring/haring dev/poll/poll dev/tcploop/tcploop

					opts reg-tests reg-tests-help unit-tests admin/halog/halog dev/flags/flags \

					dev/haring/haring dev/ncpu/ncpu dev/poll/poll dev/tcploop/tcploop \

					dev/term_events/term_events

				ifneq ($(TARGET),)

				ifeq ($(filter $(firstword $(MAKECMDGOALS)),$(IGNORE_OPTS)),)

				@ -1049,6 +1076,9 @@ dev/haring/haring: dev/haring/haring.o

				dev/hpack/%: dev/hpack/%.o

					$(cmd_LD) $(ARCH_FLAGS) $(LDFLAGS) -o $@ $^ $(LDOPTS)

				dev/ncpu/ncpu:

					$(cmd_MAKE) -C dev/ncpu ncpu V='$(V)'

				dev/poll/poll:

					$(cmd_MAKE) -C dev/poll poll CC='$(CC)' OPTIMIZE='$(COPTS)' V='$(V)'

				@ -1061,13 +1091,16 @@ dev/tcploop/tcploop:

				dev/udp/udp-perturb: dev/udp/udp-perturb.o

					$(cmd_LD) $(ARCH_FLAGS) $(LDFLAGS) -o $@ $^ $(LDOPTS)

				dev/term_events/term_events: dev/term_events/term_events.o

					$(cmd_LD) $(ARCH_FLAGS) $(LDFLAGS) -o $@ $^ $(LDOPTS)

				# rebuild it every time

				.PHONY: src/version.c dev/poll/poll dev/tcploop/tcploop

				.PHONY: src/version.c dev/ncpu/ncpu dev/poll/poll dev/tcploop/tcploop

				src/calltrace.o: src/calltrace.c $(DEP)

					$(cmd_CC) $(TRACE_COPTS) -c -o $@ $<

				src/haproxy.o:	src/haproxy.c $(DEP)

				src/version.o:	src/version.c $(DEP)

					$(cmd_CC) $(COPTS) \

					      -DBUILD_TARGET='"$(strip $(TARGET))"' \

					      -DBUILD_CC='"$(strip $(CC))"' \

				@ -1090,6 +1123,11 @@ install-doc:

						$(INSTALL) -m 644 doc/$$x.txt "$(DESTDIR)$(DOCDIR)" ; \

					done

				install-admin:

					$(Q)$(INSTALL) -d "$(DESTDIR)$(SBINDIR)"

					$(Q)$(INSTALL) admin/cli/haproxy-dump-certs "$(DESTDIR)$(SBINDIR)"

					$(Q)$(INSTALL) admin/cli/haproxy-reload "$(DESTDIR)$(SBINDIR)"

				install-bin:

					$(Q)for i in haproxy $(EXTRA); do \

						if ! [ -e "$$i" ]; then \

				@ -1100,7 +1138,7 @@ install-bin:

					$(Q)$(INSTALL) -d "$(DESTDIR)$(SBINDIR)"

					$(Q)$(INSTALL) haproxy $(EXTRA) "$(DESTDIR)$(SBINDIR)"

				install: install-bin install-man install-doc

				install: install-bin install-admin install-man install-doc

				uninstall:

					$(Q)rm -f "$(DESTDIR)$(MANDIR)"/man1/haproxy.1

				@ -1122,10 +1160,13 @@ clean:

					$(Q)rm -f addons/ot/src/*.[oas]

					$(Q)rm -f addons/wurfl/*.[oas] addons/wurfl/dummy/*.[oas]

					$(Q)rm -f admin/*/*.[oas] admin/*/*/*.[oas]

					$(Q)rm -f dev/*/*.[oas]

					$(Q)rm -f dev/flags/flags

				distclean: clean

					$(Q)rm -f admin/iprange/iprange admin/iprange/ip6range admin/halog/halog

					$(Q)rm -f admin/dyncookie/dyncookie

					$(Q)rm -f dev/*/*.[oas]

					$(Q)rm -f dev/flags/flags dev/haring/haring dev/poll/poll dev/tcploop/tcploop

					$(Q)rm -f dev/haring/haring dev/ncpu/ncpu{,.so} dev/poll/poll dev/tcploop/tcploop

					$(Q)rm -f dev/hpack/decode dev/hpack/gen-enc dev/hpack/gen-rht

					$(Q)rm -f dev/qpack/decode

				@ -1245,10 +1286,17 @@ reg-tests-help:

				.PHONY: reg-tests reg-tests-help

				unit-tests:

					$(Q)$(UNIT_TEST_SCRIPT)

				.PHONY: unit-tests

				# "make range" iteratively builds using "make all" and the exact same build

				# options for all commits within RANGE. RANGE may be either a git range

				# such as ref1..ref2 or a single commit, in which case all commits from

				# the master branch to this one will be tested.

				# Will execute TEST_CMD for each commit if defined, and will stop in case of

				# failure.

				range:

					$(Q)[ -d .git/. ] || { echo "## Fatal: \"make $@\" may only be used inside a Git repository."; exit 1; }

				@ -1274,6 +1322,7 @@ range:

							echo "[ $$index/$$count ]   $$commit #############################"; \

							git checkout -q $$commit || die 1; \

							$(MAKE) all || die 1; \

							[ -z "$(TEST_CMD)" ] || $(TEST_CMD) || die 1; \

							index=$$((index + 1)); \

						done; \

						echo;echo "Done! $${count} commit(s) built successfully for RANGE $${RANGE}" ; \

22

README


 @ -1,22 +0,0 @@
 The HAProxy documentation has been split into a number of different files for
 ease of use.
 Please refer to the following files depending on what you're looking for :
   - INSTALL for instructions on how to build and install HAProxy
   - BRANCHES to understand the project's life cycle and what version to use
   - LICENSE for the project's license
   - CONTRIBUTING for the process to follow to submit contributions
 The more detailed documentation is located into the doc/ directory :
   - doc/intro.txt for a quick introduction on HAProxy
   - doc/configuration.txt for the configuration's reference manual
   - doc/lua.txt for the Lua's reference manual
   - doc/SPOE.txt for how to use the SPOE engine
   - doc/network-namespaces.txt for how to use network namespaces under Linux
   - doc/management.txt for the management guide
   - doc/regression-testing.txt for how to use the regression testing suite
   - doc/peers.txt for the peers protocol reference
   - doc/coding-style.txt for how to adopt HAProxy's coding style
   - doc/internals for developer-specific documentation (not all up to date)

									
										62

README.md
									
											
				@ -0,0 +1,62 @@

				# HAProxy

				[![alpine/musl](https://github.com/haproxy/haproxy/actions/workflows/musl.yml/badge.svg)](https://github.com/haproxy/haproxy/actions/workflows/musl.yml)

				[![AWS-LC](https://github.com/haproxy/haproxy/actions/workflows/aws-lc.yml/badge.svg)](https://github.com/haproxy/haproxy/actions/workflows/aws-lc.yml)

				[![openssl no-deprecated](https://github.com/haproxy/haproxy/actions/workflows/openssl-nodeprecated.yml/badge.svg)](https://github.com/haproxy/haproxy/actions/workflows/openssl-nodeprecated.yml)

				[![Illumos](https://github.com/haproxy/haproxy/actions/workflows/illumos.yml/badge.svg)](https://github.com/haproxy/haproxy/actions/workflows/illumos.yml)

				[![NetBSD](https://github.com/haproxy/haproxy/actions/workflows/netbsd.yml/badge.svg)](https://github.com/haproxy/haproxy/actions/workflows/netbsd.yml)

				[![FreeBSD](https://api.cirrus-ci.com/github/haproxy/haproxy.svg?task=FreeBSD)](https://cirrus-ci.com/github/haproxy/haproxy/)

				[![VTest](https://github.com/haproxy/haproxy/actions/workflows/vtest.yml/badge.svg)](https://github.com/haproxy/haproxy/actions/workflows/vtest.yml)

				![HAProxy logo](doc/HAProxyCommunityEdition_60px.png)

				HAProxy is a free, very fast and reliable reverse-proxy offering high availability, load balancing, and proxying for TCP

				and HTTP-based applications.

				## Installation

				The [INSTALL](INSTALL) file describes how to build HAProxy.

				A [list of packages](https://github.com/haproxy/wiki/wiki/Packages) is also available on the wiki.

				## Getting help

				The [discourse](https://discourse.haproxy.org/) and the [mailing-list](https://www.mail-archive.com/haproxy@formilux.org/)

				are available for questions or configuration assistance. You can also use the [slack](https://slack.haproxy.org/) or

				[IRC](irc://irc.libera.chat/%23haproxy) channel. Please don't use the issue tracker for these.

				The [issue tracker](https://github.com/haproxy/haproxy/issues/) is only for bug reports or feature requests.

				## Documentation

				The HAProxy documentation has been split into a number of different files for

				ease of use. It is available in text format as well as HTML. The wiki is also meant to replace the old architecture

				guide.

				- [HTML documentation](http://docs.haproxy.org/)

				- [HTML HAProxy LUA API Documentation](https://www.arpalert.org/haproxy-api.html)

				- [Wiki](https://github.com/haproxy/wiki/wiki)

				Please refer to the following files depending on what you're looking for:

				  - [INSTALL](INSTALL) for instructions on how to build and install HAProxy

				  - [BRANCHES](BRANCHES) to understand the project's life cycle and what version to use

				  - [LICENSE](LICENSE) for the project's license

				  - [CONTRIBUTING](CONTRIBUTING) for the process to follow to submit contributions

				The more detailed documentation is located into the doc/ directory:

				  - [ doc/intro.txt ](doc/intro.txt) for a quick introduction on HAProxy

				  - [ doc/configuration.txt ](doc/configuration.txt) for the configuration's reference manual

				  - [ doc/lua.txt ](doc/lua.txt) for the Lua's reference manual

				  - [ doc/SPOE.txt ](doc/SPOE.txt) for how to use the SPOE engine

				  - [ doc/network-namespaces.txt ](doc/network-namespaces.txt) for how to use network namespaces under Linux

				  - [ doc/management.txt ](doc/management.txt) for the management guide

				  - [ doc/regression-testing.txt ](doc/regression-testing.txt) for how to use the regression testing suite

				  - [ doc/peers.txt ](doc/peers.txt) for the peers protocol reference

				  - [ doc/coding-style.txt ](doc/coding-style.txt) for how to adopt HAProxy's coding style

				  - [ doc/internals ](doc/internals) for developer-specific documentation (not all up to date)

				## License

				HAProxy is licensed under [GPL 2](doc/gpl.txt) or any later version, the headers under [LGPL 2.1](doc/lgpl.txt). See the

				[LICENSE](LICENSE) file for a more detailed explanation.

2

VERDATE


 @ -1,2 +1,2 @@
 $Format:%ci$
 /05/04
 /01/07

2

VERSION


 @ -1 +1 @@
 .0-dev10
 .4-dev2

									
										3

addons/deviceatlas/Makefile.inc
									
											
				@ -5,7 +5,8 @@ CXX             := c++

				CXXLIB          := -lstdc++

				ifeq ($(DEVICEATLAS_SRC),)

				OPTIONS_LDFLAGS         += -lda

				OPTIONS_CFLAGS  += -I$(DEVICEATLAS_INC)

				OPTIONS_LDFLAGS += -Wl,-rpath,$(DEVICEATLAS_LIB) -L$(DEVICEATLAS_LIB) -lda

				else

				DEVICEATLAS_INC = $(DEVICEATLAS_SRC)

				DEVICEATLAS_LIB = $(DEVICEATLAS_SRC)

									
										2

addons/deviceatlas/dummy/dac.h
									
											
				@ -212,7 +212,7 @@ da_status_t da_atlas_compile(void *ctx, da_read_fn readfn, da_setpos_fn setposfn

				 * da_getpropid on the atlas, and if generated by the search, the ID will be consistent across

				 * different calls to search.

				 * Properties added by a search that are neither in the compiled atlas, nor in the extra_props list

				 * Are assigned an ID within the context that is not transferrable through different search results

				 * Are assigned an ID within the context that is not transferable through different search results

				 * within the same atlas.

				 * @param atlas Atlas instance

				 * @param extra_props properties

6

addons/ot/README


 @ -47,6 +47,12 @@ via the OpenTracing API with OpenTracing compatible servers (tracers).
 Currently, tracers that support this API include Datadog, Jaeger, LightStep
 and Zipkin.
 Note: The OpenTracing filter shouldn't be used for new designs as OpenTracing
       itself is no longer maintained nor supported by its authors. A
       replacement filter based on OpenTelemetry is currently under development
       and is expected to be ready around HAProxy 3.2. As such OpenTracing will
       be deprecated in 3.3 and removed in 3.5.
 The OT filter was primarily tested with the Jaeger tracer, while configurations
 for both Datadog and Zipkin tracers were also set in the test directory.

									
										2

addons/ot/src/filter.c
									
											
				@ -718,7 +718,7 @@ static void flt_ot_check_timeouts(struct stream *s, struct filter *f)

					if (flt_ot_is_disabled(f FLT_OT_DBG_ARGS(, -1)))

						FLT_OT_RETURN();

					s->pending_events |= TASK_WOKEN_MSG;

					s->pending_events |= STRM_EVT_MSG;

					flt_ot_return_void(f, &err);

									
										17

addons/ot/src/parser.c
									
											
				@ -1074,8 +1074,9 @@ static int flt_ot_post_parse_cfg_scope(void)

				 */

				static int flt_ot_parse_cfg(struct flt_ot_conf *conf, const char *flt_name, char **err)

				{

					struct list backup_sections;

					int         retval = ERR_ABORT | ERR_ALERT;

					struct list    backup_sections;

					struct cfgfile cfg_file = {0};

					int            retval = ERR_ABORT | ERR_ALERT;

					FLT_OT_FUNC("%p, \"%s\", %p:%p", conf, flt_name, FLT_OT_DPTR_ARGS(err));

				@ -1094,8 +1095,16 @@ static int flt_ot_parse_cfg(struct flt_ot_conf *conf, const char *flt_name, char

						/* Do nothing. */;

					else if (access(conf->cfg_file, R_OK) == -1)

						FLT_OT_PARSE_ERR(err, "'%s' : %s", conf->cfg_file, strerror(errno));

					else

						retval = readcfgfile(conf->cfg_file);

					else {

						cfg_file.filename = conf->cfg_file;

						cfg_file.size = load_cfg_in_mem(cfg_file.filename, &cfg_file.content);

						if (cfg_file.size < 0) {

							ha_free(&cfg_file.content);

							FLT_OT_RETURN_INT(retval);

						}

						retval = parse_cfg(&cfg_file);

						ha_free(&cfg_file.content);

					}

					/* Unregister OT sections and restore previous sections. */

					cfg_unregister_sections();
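
Note: the hunk above replaces a single read-and-parse call with three steps: load the file into memory, parse the buffer, then release the buffer on both the success and the error path. A minimal, self-contained sketch of that pattern follows. The names `load_file_in_mem`, `count_config_lines` and `parse_file` are hypothetical stand-ins written for this illustration. They are not HAProxy functions and do not reproduce the behavior of `load_cfg_in_mem()` or `parse_cfg()`.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical loader: reads the whole file into a heap buffer, stores it
 * in *content and returns its size, or -1 on error (*content stays NULL).
 */
static long load_file_in_mem(const char *filename, char **content)
{
	FILE *f;
	long size;
	char *buf;

	assert(filename != NULL && content != NULL);
	*content = NULL;

	f = fopen(filename, "rb");
	if (f == NULL)
		return -1;

	if (fseek(f, 0, SEEK_END) != 0 || (size = ftell(f)) < 0 ||
	    fseek(f, 0, SEEK_SET) != 0) {
		fclose(f);
		return -1;
	}

	buf = malloc((size_t)size + 1);
	if (buf == NULL) {
		fclose(f);
		return -1;
	}

	if (fread(buf, 1, (size_t)size, f) != (size_t)size) {
		free(buf);
		fclose(f);
		return -1;
	}
	fclose(f);

	buf[size] = '\0';
	*content = buf;
	return size;
}

/* Hypothetical parser: counts the lines holding at least one character. */
static int count_config_lines(const char *content, long size)
{
	int lines = 0, in_line = 0;
	long i;

	for (i = 0; i < size; i++) {
		if (content[i] == '\n')
			in_line = 0;
		else if (!in_line) {
			in_line = 1;
			lines++;
		}
	}
	return lines;
}

/* Load, parse, and free the buffer on every path. Returns -1 when the
 * file cannot be loaded, otherwise the parser's result.
 */
static int parse_file(const char *filename)
{
	char *content = NULL;
	long size = load_file_in_mem(filename, &content);
	int ret;

	if (size < 0) {
		free(content); /* free(NULL) is a no-op */
		return -1;
	}
	ret = count_config_lines(content, size);
	free(content);
	return ret;
}
```

Keeping the buffer owned by the caller means the parser never has to know where the content came from, which is the property the hunk appears to rely on.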

									
										13

addons/ot/src/vars.c
									
											
				@ -39,14 +39,21 @@

				 */

				static void flt_ot_vars_scope_dump(struct vars *vars, const char *scope)

				{

					const struct var *var;

					int i;

					if (vars == NULL)

						return;

					vars_rdlock(vars);

					list_for_each_entry(var, &(vars->head), l)

						FLT_OT_DBG(2, "'%s.%016" PRIx64 "' -> '%.*s'", scope, var->name_hash, (int)b_data(&(var->data.u.str)), b_orig(&(var->data.u.str)));

					for (i = 0; i < VAR_NAME_ROOTS; i++) {

						struct ceb_node *node = cebu64_first(&(vars->name_root[i]));

						for ( ; node != NULL; node = cebu64_next(&(vars->name_root[i]), node)) {

							struct var *var = container_of(node, struct var, node);

							FLT_OT_DBG(2, "'%s.%016" PRIx64 "' -> '%.*s'", scope, var->name_hash, (int)b_data(&(var->data.u.str)), b_orig(&(var->data.u.str)));

						}

					}

					vars_rdunlock(vars);

				}
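
Note: the hunk above moves from walking one list to walking an array of roots, visiting the elements attached to each root in turn. The sketch below shows only that two-level traversal. It assumes a plain singly linked list per root in place of the tree used by the real code, and `NB_ROOTS`, `struct item`, `struct store`, `store_insert` and `store_count` are hypothetical names introduced for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NB_ROOTS 4 /* hypothetical value, stands in for VAR_NAME_ROOTS */

/* Hypothetical element type; a plain linked list replaces the tree. */
struct item {
	struct item *next;
	uint64_t name_hash;
};

struct store {
	struct item *root[NB_ROOTS];
};

/* Link <it> into the root selected by its hash. */
static void store_insert(struct store *s, struct item *it)
{
	size_t idx;

	assert(s != NULL && it != NULL);
	idx = (size_t)(it->name_hash % NB_ROOTS);
	it->next = s->root[idx];
	s->root[idx] = it;
}

/* Visit every element: outer loop over the roots, inner loop over the
 * elements attached to each root. Returns the number of elements seen.
 */
static size_t store_count(const struct store *s)
{
	size_t count = 0;
	int i;

	assert(s != NULL);
	for (i = 0; i < NB_ROOTS; i++) {
		const struct item *it;

		for (it = s->root[i]; it != NULL; it = it->next)
			count++;
	}
	return count;
}
```

With several roots, the visiting order follows the root index first and no longer reflects insertion order, so a dump produced this way should not be expected to list elements in the order they were created.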

17

addons/promex/README


 @ -91,6 +91,18 @@ name must be preceded by a minus character ('-'). Here are examples:
   # Only dump frontends, backends and servers status
   /metrics?metrics=haproxy_frontend_status,haproxy_backend_status,haproxy_server_status
 * Add section description as label for all metrics
 It is possible to set a description in global and proxy sections, via the
 "description" directive. The global description, if defined, is exposed via
 the "haproxy_process_description" metric. But the descriptions provided in proxy
 sections are not dumped. However, it is possible to add it as a label for all
 metrics of the corresponding section, including the global one. To do so,
 "desc-labels" parameter must be set:
   /metrics?desc-labels
   /metrics?scope=frontend&desc-labels
 * Dump extra counters
 @ -193,6 +205,8 @@ listed below. Metrics from extra counters are not listed.
 | haproxy_process_current_tasks                  |
 | haproxy_process_current_run_queue              |
 | haproxy_process_idle_time_percent              |
 | haproxy_process_node                           |
 | haproxy_process_description                    |
 | haproxy_process_stopping                       |
 | haproxy_process_jobs                           |
 | haproxy_process_unstoppable_jobs               |
 @ -375,6 +389,9 @@ listed below. Metrics from extra counters are not listed.
 | haproxy_server_max_connect_time_seconds            |
 | haproxy_server_max_response_time_seconds           |
 | haproxy_server_max_total_time_seconds              |
 | haproxy_server_agent_status                        |
 | haproxy_server_agent_code                          |
 | haproxy_server_agent_duration_seconds              |
 | haproxy_server_internal_errors_total               |
 | haproxy_server_unsafe_idle_connections_current     |
 | haproxy_server_safe_idle_connections_current       |

									
										11

addons/promex/include/promex/promex.h
									
											
				@ -32,11 +32,11 @@

				/* Prometheus exporter flags (ctx->flags) */

				#define PROMEX_FL_METRIC_HDR        0x00000001

				#define PROMEX_FL_INFO_METRIC       0x00000002

				#define PROMEX_FL_FRONT_METRIC      0x00000004

				#define PROMEX_FL_BACK_METRIC       0x00000008

				#define PROMEX_FL_SRV_METRIC        0x00000010

				#define PROMEX_FL_LI_METRIC         0x00000020

				#define PROMEX_FL_BODYLESS_RESP     0x00000002

				/* unused: 0x00000004 */

				/* unused: 0x00000008 */

				/* unused: 0x00000010 */

				/* unused: 0x00000020 */

				#define PROMEX_FL_MODULE_METRIC     0x00000040

				#define PROMEX_FL_SCOPE_GLOBAL      0x00000080

				#define PROMEX_FL_SCOPE_FRONT       0x00000100

				@ -47,6 +47,7 @@

				#define PROMEX_FL_NO_MAINT_SRV      0x00002000

				#define PROMEX_FL_EXTRA_COUNTERS    0x00004000

				#define PROMEX_FL_INC_METRIC_BY_DEFAULT 0x00008000

				#define PROMEX_FL_DESC_LABELS       0x00010000

				#define PROMEX_FL_SCOPE_ALL (PROMEX_FL_SCOPE_GLOBAL | PROMEX_FL_SCOPE_FRONT | \

							     PROMEX_FL_SCOPE_LI | PROMEX_FL_SCOPE_BACK | \
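
Note: each `PROMEX_FL_*` value in the hunk above occupies a single bit, so several flags can share one integer and be set, cleared or tested independently. The sketch below illustrates this with two values copied from the hunk. The helpers `flags_set`, `flags_clear` and `flags_test` are hypothetical and are not part of the exporter.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the hunk above. */
#define PROMEX_FL_EXTRA_COUNTERS 0x00004000
#define PROMEX_FL_DESC_LABELS    0x00010000

/* Hypothetical helpers, written for this illustration only. */

/* Return <flags> with every bit of <mask> turned on. */
static inline uint32_t flags_set(uint32_t flags, uint32_t mask)
{
	return flags | mask;
}

/* Return <flags> with every bit of <mask> turned off. */
static inline uint32_t flags_clear(uint32_t flags, uint32_t mask)
{
	return flags & ~mask;
}

/* Return non-zero only when all bits of <mask> are set in <flags>. */
static inline int flags_test(uint32_t flags, uint32_t mask)
{
	assert(mask != 0);
	return (flags & mask) == mask;
}
```

Because the values do not overlap, a new flag such as `PROMEX_FL_DESC_LABELS` can be added at the next free bit without affecting tests on the existing ones.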

745

addons/promex/service-prometheus.c


File diff suppressed because it is too large

674

admin/acme.sh/LICENSE

View File

 @ -1,674 +0,0 @@
                     GNU GENERAL PUBLIC LICENSE
                        Version 3, 29 June 2007
  Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
  Everyone is permitted to copy and distribute verbatim copies
  of this license document, but changing it is not allowed.
                             Preamble
   The GNU General Public License is a free, copyleft license for
 software and other kinds of works.
   The licenses for most software and other practical works are designed
 to take away your freedom to share and change the works.  By contrast,
 the GNU General Public License is intended to guarantee your freedom to
 share and change all versions of a program--to make sure it remains free
 software for all its users.  We, the Free Software Foundation, use the
 GNU General Public License for most of our software; it applies also to
 any other work released this way by its authors.  You can apply it to
 your programs, too.
   When we speak of free software, we are referring to freedom, not
 price.  Our General Public Licenses are designed to make sure that you
 have the freedom to distribute copies of free software (and charge for
 them if you wish), that you receive source code or can get it if you
 want it, that you can change the software or use pieces of it in new
 free programs, and that you know you can do these things.
   To protect your rights, we need to prevent others from denying you
 these rights or asking you to surrender the rights.  Therefore, you have
 certain responsibilities if you distribute copies of the software, or if
 you modify it: responsibilities to respect the freedom of others.
   For example, if you distribute copies of such a program, whether
 gratis or for a fee, you must pass on to the recipients the same
 freedoms that you received.  You must make sure that they, too, receive
 or can get the source code.  And you must show them these terms so they
 know their rights.
   Developers that use the GNU GPL protect your rights with two steps:
 (1) assert copyright on the software, and (2) offer you this License
 giving you legal permission to copy, distribute and/or modify it.
   For the developers' and authors' protection, the GPL clearly explains
 that there is no warranty for this free software.  For both users' and
 authors' sake, the GPL requires that modified versions be marked as
 changed, so that their problems will not be attributed erroneously to
 authors of previous versions.
   Some devices are designed to deny users access to install or run
 modified versions of the software inside them, although the manufacturer
 can do so.  This is fundamentally incompatible with the aim of
 protecting users' freedom to change the software.  The systematic
 pattern of such abuse occurs in the area of products for individuals to
 use, which is precisely where it is most unacceptable.  Therefore, we
 have designed this version of the GPL to prohibit the practice for those
 products.  If such problems arise substantially in other domains, we
 stand ready to extend this provision to those domains in future versions
 of the GPL, as needed to protect the freedom of users.
   Finally, every program is threatened constantly by software patents.
 States should not allow patents to restrict development and use of
 software on general-purpose computers, but in those that do, we wish to
 avoid the special danger that patents applied to a free program could
 make it effectively proprietary.  To prevent this, the GPL assures that
 patents cannot be used to render the program non-free.
   The precise terms and conditions for copying, distribution and
 modification follow.
                        TERMS AND CONDITIONS
   0. Definitions.
   "This License" refers to version 3 of the GNU General Public License.
   "Copyright" also means copyright-like laws that apply to other kinds of
 works, such as semiconductor masks.
   "The Program" refers to any copyrightable work licensed under this
 License.  Each licensee is addressed as "you".  "Licensees" and
 "recipients" may be individuals or organizations.
   To "modify" a work means to copy from or adapt all or part of the work
 in a fashion requiring copyright permission, other than the making of an
 exact copy.  The resulting work is called a "modified version" of the
 earlier work or a work "based on" the earlier work.
   A "covered work" means either the unmodified Program or a work based
 on the Program.
   To "propagate" a work means to do anything with it that, without
 permission, would make you directly or secondarily liable for
 infringement under applicable copyright law, except executing it on a
 computer or modifying a private copy.  Propagation includes copying,
 distribution (with or without modification), making available to the
 public, and in some countries other activities as well.
   To "convey" a work means any kind of propagation that enables other
 parties to make or receive copies.  Mere interaction with a user through
 a computer network, with no transfer of a copy, is not conveying.
   An interactive user interface displays "Appropriate Legal Notices"
 to the extent that it includes a convenient and prominently visible
 feature that (1) displays an appropriate copyright notice, and (2)
 tells the user that there is no warranty for the work (except to the
 extent that warranties are provided), that licensees may convey the
 work under this License, and how to view a copy of this License.  If
 the interface presents a list of user commands or options, such as a
 menu, a prominent item in the list meets this criterion.
   1. Source Code.
   The "source code" for a work means the preferred form of the work
 for making modifications to it.  "Object code" means any non-source
 form of a work.
   A "Standard Interface" means an interface that either is an official
 standard defined by a recognized standards body, or, in the case of
 interfaces specified for a particular programming language, one that
 is widely used among developers working in that language.
   The "System Libraries" of an executable work include anything, other
 than the work as a whole, that (a) is included in the normal form of
 packaging a Major Component, but which is not part of that Major
 Component, and (b) serves only to enable use of the work with that
 Major Component, or to implement a Standard Interface for which an
 implementation is available to the public in source code form.  A
 "Major Component", in this context, means a major essential component
 (kernel, window system, and so on) of the specific operating system
 (if any) on which the executable work runs, or a compiler used to
 produce the work, or an object code interpreter used to run it.
   The "Corresponding Source" for a work in object code form means all
 the source code needed to generate, install, and (for an executable
 work) run the object code and to modify the work, including scripts to
 control those activities.  However, it does not include the work's
 System Libraries, or general-purpose tools or generally available free
 programs which are used unmodified in performing those activities but
 which are not part of the work.  For example, Corresponding Source
 includes interface definition files associated with source files for
 the work, and the source code for shared libraries and dynamically
 linked subprograms that the work is specifically designed to require,
 such as by intimate data communication or control flow between those
 subprograms and other parts of the work.
   The Corresponding Source need not include anything that users
 can regenerate automatically from other parts of the Corresponding
 Source.
   The Corresponding Source for a work in source code form is that
 same work.
 2. Basic Permissions.
   All rights granted under this License are granted for the term of
 copyright on the Program, and are irrevocable provided the stated
 conditions are met.  This License explicitly affirms your unlimited
 permission to run the unmodified Program.  The output from running a
 covered work is covered by this License only if the output, given its
 content, constitutes a covered work.  This License acknowledges your
 rights of fair use or other equivalent, as provided by copyright law.
   You may make, run and propagate covered works that you do not
 convey, without conditions so long as your license otherwise remains
 in force.  You may convey covered works to others for the sole purpose
 of having them make modifications exclusively for you, or provide you
 with facilities for running those works, provided that you comply with
 the terms of this License in conveying all material for which you do
 not control copyright.  Those thus making or running the covered works
 for you must do so exclusively on your behalf, under your direction
 and control, on terms that prohibit them from making any copies of
 your copyrighted material outside their relationship with you.
   Conveying under any other circumstances is permitted solely under
 the conditions stated below.  Sublicensing is not allowed; section 10
 makes it unnecessary.
 3. Protecting Users' Legal Rights From Anti-Circumvention Law.
   No covered work shall be deemed part of an effective technological
 measure under any applicable law fulfilling obligations under article
 11 of the WIPO copyright treaty adopted on 20 December 1996, or
 similar laws prohibiting or restricting circumvention of such
 measures.
   When you convey a covered work, you waive any legal power to forbid
 circumvention of technological measures to the extent such circumvention
 is effected by exercising rights under this License with respect to
 the covered work, and you disclaim any intention to limit operation or
 modification of the work as a means of enforcing, against the work's
 users, your or third parties' legal rights to forbid circumvention of
 technological measures.
 4. Conveying Verbatim Copies.
   You may convey verbatim copies of the Program's source code as you
 receive it, in any medium, provided that you conspicuously and
 appropriately publish on each copy an appropriate copyright notice;
 keep intact all notices stating that this License and any
 non-permissive terms added in accord with section 7 apply to the code;
 keep intact all notices of the absence of any warranty; and give all
 recipients a copy of this License along with the Program.
   You may charge any price or no price for each copy that you convey,
 and you may offer support or warranty protection for a fee.
 5. Conveying Modified Source Versions.
   You may convey a work based on the Program, or the modifications to
 produce it from the Program, in the form of source code under the
 terms of section 4, provided that you also meet all of these conditions:
     a) The work must carry prominent notices stating that you modified
     it, and giving a relevant date.
     b) The work must carry prominent notices stating that it is
     released under this License and any conditions added under section
     7.  This requirement modifies the requirement in section 4 to
     "keep intact all notices".
     c) You must license the entire work, as a whole, under this
     License to anyone who comes into possession of a copy.  This
     License will therefore apply, along with any applicable section 7
     additional terms, to the whole of the work, and all its parts,
     regardless of how they are packaged.  This License gives no
     permission to license the work in any other way, but it does not
     invalidate such permission if you have separately received it.
     d) If the work has interactive user interfaces, each must display
     Appropriate Legal Notices; however, if the Program has interactive
     interfaces that do not display Appropriate Legal Notices, your
     work need not make them do so.
   A compilation of a covered work with other separate and independent
 works, which are not by their nature extensions of the covered work,
 and which are not combined with it such as to form a larger program,
 in or on a volume of a storage or distribution medium, is called an
 "aggregate" if the compilation and its resulting copyright are not
 used to limit the access or legal rights of the compilation's users
 beyond what the individual works permit.  Inclusion of a covered work
 in an aggregate does not cause this License to apply to the other
 parts of the aggregate.
 6. Conveying Non-Source Forms.
   You may convey a covered work in object code form under the terms
 of sections 4 and 5, provided that you also convey the
 machine-readable Corresponding Source under the terms of this License,
 in one of these ways:
     a) Convey the object code in, or embodied in, a physical product
     (including a physical distribution medium), accompanied by the
     Corresponding Source fixed on a durable physical medium
     customarily used for software interchange.
     b) Convey the object code in, or embodied in, a physical product
     (including a physical distribution medium), accompanied by a
     written offer, valid for at least three years and valid for as
     long as you offer spare parts or customer support for that product
     model, to give anyone who possesses the object code either (1) a
     copy of the Corresponding Source for all the software in the
     product that is covered by this License, on a durable physical
     medium customarily used for software interchange, for a price no
     more than your reasonable cost of physically performing this
     conveying of source, or (2) access to copy the
     Corresponding Source from a network server at no charge.
     c) Convey individual copies of the object code with a copy of the
     written offer to provide the Corresponding Source.  This
     alternative is allowed only occasionally and noncommercially, and
     only if you received the object code with such an offer, in accord
     with subsection 6b.
     d) Convey the object code by offering access from a designated
     place (gratis or for a charge), and offer equivalent access to the
     Corresponding Source in the same way through the same place at no
     further charge.  You need not require recipients to copy the
     Corresponding Source along with the object code.  If the place to
     copy the object code is a network server, the Corresponding Source
     may be on a different server (operated by you or a third party)
     that supports equivalent copying facilities, provided you maintain
     clear directions next to the object code saying where to find the
     Corresponding Source.  Regardless of what server hosts the
     Corresponding Source, you remain obligated to ensure that it is
     available for as long as needed to satisfy these requirements.
     e) Convey the object code using peer-to-peer transmission, provided
     you inform other peers where the object code and Corresponding
     Source of the work are being offered to the general public at no
     charge under subsection 6d.
   A separable portion of the object code, whose source code is excluded
 from the Corresponding Source as a System Library, need not be
 included in conveying the object code work.
   A "User Product" is either (1) a "consumer product", which means any
 tangible personal property which is normally used for personal, family,
 or household purposes, or (2) anything designed or sold for incorporation
 into a dwelling.  In determining whether a product is a consumer product,
 doubtful cases shall be resolved in favor of coverage.  For a particular
 product received by a particular user, "normally used" refers to a
 typical or common use of that class of product, regardless of the status
 of the particular user or of the way in which the particular user
 actually uses, or expects or is expected to use, the product.  A product
 is a consumer product regardless of whether the product has substantial
 commercial, industrial or non-consumer uses, unless such uses represent
 the only significant mode of use of the product.
   "Installation Information" for a User Product means any methods,
 procedures, authorization keys, or other information required to install
 and execute modified versions of a covered work in that User Product from
 a modified version of its Corresponding Source.  The information must
 suffice to ensure that the continued functioning of the modified object
 code is in no case prevented or interfered with solely because
 modification has been made.
   If you convey an object code work under this section in, or with, or
 specifically for use in, a User Product, and the conveying occurs as
 part of a transaction in which the right of possession and use of the
 User Product is transferred to the recipient in perpetuity or for a
 fixed term (regardless of how the transaction is characterized), the
 Corresponding Source conveyed under this section must be accompanied
 by the Installation Information.  But this requirement does not apply
 if neither you nor any third party retains the ability to install
 modified object code on the User Product (for example, the work has
 been installed in ROM).
   The requirement to provide Installation Information does not include a
 requirement to continue to provide support service, warranty, or updates
 for a work that has been modified or installed by the recipient, or for
 the User Product in which it has been modified or installed.  Access to a
 network may be denied when the modification itself materially and
 adversely affects the operation of the network or violates the rules and
 protocols for communication across the network.
   Corresponding Source conveyed, and Installation Information provided,
 in accord with this section must be in a format that is publicly
 documented (and with an implementation available to the public in
 source code form), and must require no special password or key for
 unpacking, reading or copying.
 7. Additional Terms.
   "Additional permissions" are terms that supplement the terms of this
 License by making exceptions from one or more of its conditions.
 Additional permissions that are applicable to the entire Program shall
 be treated as though they were included in this License, to the extent
 that they are valid under applicable law.  If additional permissions
 apply only to part of the Program, that part may be used separately
 under those permissions, but the entire Program remains governed by
 this License without regard to the additional permissions.
   When you convey a copy of a covered work, you may at your option
 remove any additional permissions from that copy, or from any part of
 it.  (Additional permissions may be written to require their own
 removal in certain cases when you modify the work.)  You may place
 additional permissions on material, added by you to a covered work,
 for which you have or can give appropriate copyright permission.
   Notwithstanding any other provision of this License, for material you
 add to a covered work, you may (if authorized by the copyright holders of
 that material) supplement the terms of this License with terms:
     a) Disclaiming warranty or limiting liability differently from the
     terms of sections 15 and 16 of this License; or
     b) Requiring preservation of specified reasonable legal notices or
     author attributions in that material or in the Appropriate Legal
     Notices displayed by works containing it; or
     c) Prohibiting misrepresentation of the origin of that material, or
     requiring that modified versions of such material be marked in
     reasonable ways as different from the original version; or
     d) Limiting the use for publicity purposes of names of licensors or
     authors of the material; or
     e) Declining to grant rights under trademark law for use of some
     trade names, trademarks, or service marks; or
     f) Requiring indemnification of licensors and authors of that
     material by anyone who conveys the material (or modified versions of
     it) with contractual assumptions of liability to the recipient, for
     any liability that these contractual assumptions directly impose on
     those licensors and authors.
   All other non-permissive additional terms are considered "further
 restrictions" within the meaning of section 10.  If the Program as you
 received it, or any part of it, contains a notice stating that it is
 governed by this License along with a term that is a further
 restriction, you may remove that term.  If a license document contains
 a further restriction but permits relicensing or conveying under this
 License, you may add to a covered work material governed by the terms
 of that license document, provided that the further restriction does
 not survive such relicensing or conveying.
   If you add terms to a covered work in accord with this section, you
 must place, in the relevant source files, a statement of the
 additional terms that apply to those files, or a notice indicating
 where to find the applicable terms.
   Additional terms, permissive or non-permissive, may be stated in the
 form of a separately written license, or stated as exceptions;
 the above requirements apply either way.
 8. Termination.
   You may not propagate or modify a covered work except as expressly
 provided under this License.  Any attempt otherwise to propagate or
 modify it is void, and will automatically terminate your rights under
 this License (including any patent licenses granted under the third
 paragraph of section 11).
   However, if you cease all violation of this License, then your
 license from a particular copyright holder is reinstated (a)
 provisionally, unless and until the copyright holder explicitly and
 finally terminates your license, and (b) permanently, if the copyright
 holder fails to notify you of the violation by some reasonable means
 prior to 60 days after the cessation.
   Moreover, your license from a particular copyright holder is
 reinstated permanently if the copyright holder notifies you of the
 violation by some reasonable means, this is the first time you have
 received notice of violation of this License (for any work) from that
 copyright holder, and you cure the violation prior to 30 days after
 your receipt of the notice.
   Termination of your rights under this section does not terminate the
 licenses of parties who have received copies or rights from you under
 this License.  If your rights have been terminated and not permanently
 reinstated, you do not qualify to receive new licenses for the same
 material under section 10.
 9. Acceptance Not Required for Having Copies.
   You are not required to accept this License in order to receive or
 run a copy of the Program.  Ancillary propagation of a covered work
 occurring solely as a consequence of using peer-to-peer transmission
 to receive a copy likewise does not require acceptance.  However,
 nothing other than this License grants you permission to propagate or
 modify any covered work.  These actions infringe copyright if you do
 not accept this License.  Therefore, by modifying or propagating a
 covered work, you indicate your acceptance of this License to do so.
 10. Automatic Licensing of Downstream Recipients.
   Each time you convey a covered work, the recipient automatically
 receives a license from the original licensors, to run, modify and
 propagate that work, subject to this License.  You are not responsible
 for enforcing compliance by third parties with this License.
   An "entity transaction" is a transaction transferring control of an
 organization, or substantially all assets of one, or subdividing an
 organization, or merging organizations.  If propagation of a covered
 work results from an entity transaction, each party to that
 transaction who receives a copy of the work also receives whatever
 licenses to the work the party's predecessor in interest had or could
 give under the previous paragraph, plus a right to possession of the
 Corresponding Source of the work from the predecessor in interest, if
 the predecessor has it or can get it with reasonable efforts.
   You may not impose any further restrictions on the exercise of the
 rights granted or affirmed under this License.  For example, you may
 not impose a license fee, royalty, or other charge for exercise of
 rights granted under this License, and you may not initiate litigation
 (including a cross-claim or counterclaim in a lawsuit) alleging that
 any patent claim is infringed by making, using, selling, offering for
 sale, or importing the Program or any portion of it.
 11. Patents.
   A "contributor" is a copyright holder who authorizes use under this
 License of the Program or a work on which the Program is based.  The
 work thus licensed is called the contributor's "contributor version".
   A contributor's "essential patent claims" are all patent claims
 owned or controlled by the contributor, whether already acquired or
 hereafter acquired, that would be infringed by some manner, permitted
 by this License, of making, using, or selling its contributor version,
 but do not include claims that would be infringed only as a
 consequence of further modification of the contributor version.  For
 purposes of this definition, "control" includes the right to grant
 patent sublicenses in a manner consistent with the requirements of
 this License.
   Each contributor grants you a non-exclusive, worldwide, royalty-free
 patent license under the contributor's essential patent claims, to
 make, use, sell, offer for sale, import and otherwise run, modify and
 propagate the contents of its contributor version.
   In the following three paragraphs, a "patent license" is any express
 agreement or commitment, however denominated, not to enforce a patent
 (such as an express permission to practice a patent or covenant not to
 sue for patent infringement).  To "grant" such a patent license to a
 party means to make such an agreement or commitment not to enforce a
 patent against the party.
   If you convey a covered work, knowingly relying on a patent license,
 and the Corresponding Source of the work is not available for anyone
 to copy, free of charge and under the terms of this License, through a
 publicly available network server or other readily accessible means,
 then you must either (1) cause the Corresponding Source to be so
 available, or (2) arrange to deprive yourself of the benefit of the
 patent license for this particular work, or (3) arrange, in a manner
 consistent with the requirements of this License, to extend the patent
 license to downstream recipients.  "Knowingly relying" means you have
 actual knowledge that, but for the patent license, your conveying the
 covered work in a country, or your recipient's use of the covered work
 in a country, would infringe one or more identifiable patents in that
 country that you have reason to believe are valid.
   If, pursuant to or in connection with a single transaction or
 arrangement, you convey, or propagate by procuring conveyance of, a
 covered work, and grant a patent license to some of the parties
 receiving the covered work authorizing them to use, propagate, modify
 or convey a specific copy of the covered work, then the patent license
 you grant is automatically extended to all recipients of the covered
 work and works based on it.
   A patent license is "discriminatory" if it does not include within
 the scope of its coverage, prohibits the exercise of, or is
 conditioned on the non-exercise of one or more of the rights that are
 specifically granted under this License.  You may not convey a covered
 work if you are a party to an arrangement with a third party that is
 in the business of distributing software, under which you make payment
 to the third party based on the extent of your activity of conveying
 the work, and under which the third party grants, to any of the
 parties who would receive the covered work from you, a discriminatory
 patent license (a) in connection with copies of the covered work
 conveyed by you (or copies made from those copies), or (b) primarily
 for and in connection with specific products or compilations that
 contain the covered work, unless you entered into that arrangement,
 or that patent license was granted, prior to 28 March 2007.
   Nothing in this License shall be construed as excluding or limiting
 any implied license or other defenses to infringement that may
 otherwise be available to you under applicable patent law.
 12. No Surrender of Others' Freedom.
   If conditions are imposed on you (whether by court order, agreement or
 otherwise) that contradict the conditions of this License, they do not
 excuse you from the conditions of this License.  If you cannot convey a
 covered work so as to satisfy simultaneously your obligations under this
 License and any other pertinent obligations, then as a consequence you may
 not convey it at all.  For example, if you agree to terms that obligate you
 to collect a royalty for further conveying from those to whom you convey
 the Program, the only way you could satisfy both those terms and this
 License would be to refrain entirely from conveying the Program.
 13. Use with the GNU Affero General Public License.
   Notwithstanding any other provision of this License, you have
 permission to link or combine any covered work with a work licensed
 under version 3 of the GNU Affero General Public License into a single
 combined work, and to convey the resulting work.  The terms of this
 License will continue to apply to the part which is the covered work,
 but the special requirements of the GNU Affero General Public License,
 section 13, concerning interaction through a network will apply to the
 combination as such.
 14. Revised Versions of this License.
   The Free Software Foundation may publish revised and/or new versions of
 the GNU General Public License from time to time.  Such new versions will
 be similar in spirit to the present version, but may differ in detail to
 address new problems or concerns.
   Each version is given a distinguishing version number.  If the
 Program specifies that a certain numbered version of the GNU General
 Public License "or any later version" applies to it, you have the
 option of following the terms and conditions either of that numbered
 version or of any later version published by the Free Software
 Foundation.  If the Program does not specify a version number of the
 GNU General Public License, you may choose any version ever published
 by the Free Software Foundation.
   If the Program specifies that a proxy can decide which future
 versions of the GNU General Public License can be used, that proxy's
 public statement of acceptance of a version permanently authorizes you
 to choose that version for the Program.
   Later license versions may give you additional or different
 permissions.  However, no additional obligations are imposed on any
 author or copyright holder as a result of your choosing to follow a
 later version.
 15. Disclaimer of Warranty.
   THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
 APPLICABLE LAW.  EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
 HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
 OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
 THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
 PURPOSE.  THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
 IS WITH YOU.  SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
 ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
 16. Limitation of Liability.
   IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
 WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
 THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
 GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
 USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
 DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
 PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
 EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
 SUCH DAMAGES.
 17. Interpretation of Sections 15 and 16.
   If the disclaimer of warranty and limitation of liability provided
 above cannot be given local legal effect according to their terms,
 reviewing courts shall apply local law that most closely approximates
 an absolute waiver of all civil liability in connection with the
 Program, unless a warranty or assumption of liability accompanies a
 copy of the Program in return for a fee.
                      END OF TERMS AND CONDITIONS
             How to Apply These Terms to Your New Programs
   If you develop a new program, and you want it to be of the greatest
 possible use to the public, the best way to achieve this is to make it
 free software which everyone can redistribute and change under these terms.
   To do so, attach the following notices to the program.  It is safest
 to attach them to the start of each source file to most effectively
 state the exclusion of warranty; and each file should have at least
 the "copyright" line and a pointer to where the full notice is found.
     <one line to give the program's name and a brief idea of what it does.>
     Copyright (C) <year>  <name of author>
     This program is free software: you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
     the Free Software Foundation, either version 3 of the License, or
     (at your option) any later version.
     This program is distributed in the hope that it will be useful,
     but WITHOUT ANY WARRANTY; without even the implied warranty of
     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
     GNU General Public License for more details.
     You should have received a copy of the GNU General Public License
     along with this program.  If not, see <https://www.gnu.org/licenses/>.
 Also add information on how to contact you by electronic and paper mail.
   If the program does terminal interaction, make it output a short
 notice like this when it starts in an interactive mode:
     <program>  Copyright (C) <year>  <name of author>
     This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
     This is free software, and you are welcome to redistribute it
     under certain conditions; type `show c' for details.
 The hypothetical commands `show w' and `show c' should show the appropriate
 parts of the General Public License.  Of course, your program's commands
 might be different; for a GUI interface, you would use an "about box".
   You should also get your employer (if you work as a programmer) or school,
 if any, to sign a "copyright disclaimer" for the program, if necessary.
 For more information on this, and how to apply and follow the GNU GPL, see
 <https://www.gnu.org/licenses/>.
   The GNU General Public License does not permit incorporating your program
 into proprietary programs.  If your program is a subroutine library, you
 may consider it more useful to permit linking proprietary applications with
 the library.  If this is what you want to do, use the GNU Lesser General
 Public License instead of this License.  But first, please read
 <https://www.gnu.org/licenses/why-not-lgpl.html>.


admin/acme.sh/README


 @@ -1,13 +0,0 @@
 This directory contains a fork of the acme.sh deploy script for haproxy which
 allows acme.sh to run as non-root and doesn't require reloading haproxy.
 The content of this directory is licensed under GPLv3 as explained in the
 LICENSE file.
 This was originally written for this pull request
 https://github.com/acmesh-official/acme.sh/pull/4581.
 The documentation is available on the haproxy wiki:
 https://github.com/haproxy/wiki/wiki/Letsencrypt-integration-with-HAProxy-and-acme.sh
 The haproxy.sh script must replace the one provided by acme.sh.
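
A minimal sketch of how the deploy hook might be configured, using only variable names documented in the header comments of haproxy.sh. The values, the `example.com` domain, and the commented-out `acme.sh --deploy` invocation are illustrative assumptions, not settings taken from this repository.

```shell
#!/usr/bin/env sh
# Illustrative values only; adjust them to your own setup.
# The variable names come from the header comments of haproxy.sh.

# Directory where the PEM file is written (the script checks that it exists).
export DEPLOY_HAPROXY_PEM_PATH="/etc/haproxy"

# Push the certificate over the stats socket instead of reloading HAProxy.
# This path needs the socat binary; the socket uses the socat address format.
export DEPLOY_HAPROXY_HOT_UPDATE="yes"
export DEPLOY_HAPROXY_STATS_SOCKET="UNIX:/run/haproxy/admin.sock"

# Assumption: acme.sh is installed and example.com already has a certificate.
# acme.sh --deploy -d example.com --deploy-hook haproxy
```

The deploy command is left commented out so the snippet can be sourced safely on a machine without acme.sh installed.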

									

admin/acme.sh/haproxy.sh
									
										
				@@ -1,403 +0,0 @@

				#!/usr/bin/env sh

				# Script for acme.sh to deploy certificates to haproxy

				#

				# The following variables can be exported:

				#

				# export DEPLOY_HAPROXY_PEM_NAME="${domain}.pem"

				#

				# Defines the name of the PEM file.

				# Defaults to "<domain>.pem"

				#

				# export DEPLOY_HAPROXY_PEM_PATH="/etc/haproxy"

				#

				# Defines location of PEM file for HAProxy.

				# Defaults to /etc/haproxy

				#

				# export DEPLOY_HAPROXY_RELOAD="systemctl reload haproxy"

				#

				# OPTIONAL: Reload command used post deploy

				# This defaults to a no-op (i.e. "true").

				# It is strongly recommended to set this to something that makes sense

				# for your distro.

				#

				# export DEPLOY_HAPROXY_ISSUER="no"

				#

				# OPTIONAL: Places CA file as "${DEPLOY_HAPROXY_PEM}.issuer"

				# Note: Required for OCSP stapling to work

				#

				# export DEPLOY_HAPROXY_BUNDLE="no"

				#

				# OPTIONAL: Deploy this certificate as part of a multi-cert bundle

				# This adds a suffix to the certificate based on the certificate type

				# eg RSA certificates will have .rsa as a suffix to the file name

				# HAProxy will load all certificates and provide one or the other

				# depending on client capabilities

				# Note: This functionality requires HAProxy to be compiled against

				# a version of OpenSSL that supports this.

				#

				# export DEPLOY_HAPROXY_HOT_UPDATE="yes"

				# export DEPLOY_HAPROXY_STATS_SOCKET="UNIX:/run/haproxy/admin.sock"

				#

				# OPTIONAL: Deploy the certificate over the HAProxy stats socket without

				# needing to reload HAProxy. Default is "no".

				#

				# Requires the socat binary. DEPLOY_HAPROXY_STATS_SOCKET variable uses the socat

				# address format.

				#

				# export DEPLOY_HAPROXY_MASTER_CLI="UNIX:/run/haproxy-master.sock"

				#

				# OPTIONAL: To use the master CLI with DEPLOY_HAPROXY_HOT_UPDATE="yes" instead

				# of a stats socket, use this variable.

				########  Public functions #####################

				#domain keyfile certfile cafile fullchain

				haproxy_deploy() {

				  _cdomain="$1"

				  _ckey="$2"

				  _ccert="$3"

				  _cca="$4"

				  _cfullchain="$5"

				  _cmdpfx=""

				  # Some defaults

				  DEPLOY_HAPROXY_PEM_PATH_DEFAULT="/etc/haproxy"

				  DEPLOY_HAPROXY_PEM_NAME_DEFAULT="${_cdomain}.pem"

				  DEPLOY_HAPROXY_BUNDLE_DEFAULT="no"

				  DEPLOY_HAPROXY_ISSUER_DEFAULT="no"

				  DEPLOY_HAPROXY_RELOAD_DEFAULT="true"

				  DEPLOY_HAPROXY_HOT_UPDATE_DEFAULT="no"

				  DEPLOY_HAPROXY_STATS_SOCKET_DEFAULT="UNIX:/run/haproxy/admin.sock"

				  _debug _cdomain "${_cdomain}"

				  _debug _ckey "${_ckey}"

				  _debug _ccert "${_ccert}"

				  _debug _cca "${_cca}"

				  _debug _cfullchain "${_cfullchain}"

				  # PEM_PATH is optional. If not provided then assume "${DEPLOY_HAPROXY_PEM_PATH_DEFAULT}"

				  _getdeployconf DEPLOY_HAPROXY_PEM_PATH

				  _debug2 DEPLOY_HAPROXY_PEM_PATH "${DEPLOY_HAPROXY_PEM_PATH}"

				  if [ -n "${DEPLOY_HAPROXY_PEM_PATH}" ]; then

				    Le_Deploy_haproxy_pem_path="${DEPLOY_HAPROXY_PEM_PATH}"

				    _savedomainconf Le_Deploy_haproxy_pem_path "${Le_Deploy_haproxy_pem_path}"

				  elif [ -z "${Le_Deploy_haproxy_pem_path}" ]; then

				    Le_Deploy_haproxy_pem_path="${DEPLOY_HAPROXY_PEM_PATH_DEFAULT}"

				  fi

				  # Ensure PEM_PATH exists

				  if [ -d "${Le_Deploy_haproxy_pem_path}" ]; then

				    _debug "PEM_PATH ${Le_Deploy_haproxy_pem_path} exists"

				  else

				    _err "PEM_PATH ${Le_Deploy_haproxy_pem_path} does not exist"

				    return 1

				  fi

				  # PEM_NAME is optional. If not provided then assume "${DEPLOY_HAPROXY_PEM_NAME_DEFAULT}"

				  _getdeployconf DEPLOY_HAPROXY_PEM_NAME

				  _debug2 DEPLOY_HAPROXY_PEM_NAME "${DEPLOY_HAPROXY_PEM_NAME}"

				  if [ -n "${DEPLOY_HAPROXY_PEM_NAME}" ]; then

				    Le_Deploy_haproxy_pem_name="${DEPLOY_HAPROXY_PEM_NAME}"

				    _savedomainconf Le_Deploy_haproxy_pem_name "${Le_Deploy_haproxy_pem_name}"

				  elif [ -z "${Le_Deploy_haproxy_pem_name}" ]; then

				    Le_Deploy_haproxy_pem_name="${DEPLOY_HAPROXY_PEM_NAME_DEFAULT}"

				    # We better not have '*' as the first character

				    if [ "${Le_Deploy_haproxy_pem_name%%"${Le_Deploy_haproxy_pem_name#?}"}" = '*' ]; then

				      # remove the first character and add a _ instead

				      Le_Deploy_haproxy_pem_name="_${Le_Deploy_haproxy_pem_name#?}"

				    fi

				  fi

				  # BUNDLE is optional. If not provided then assume "${DEPLOY_HAPROXY_BUNDLE_DEFAULT}"

				  _getdeployconf DEPLOY_HAPROXY_BUNDLE

				  _debug2 DEPLOY_HAPROXY_BUNDLE "${DEPLOY_HAPROXY_BUNDLE}"

				  if [ -n "${DEPLOY_HAPROXY_BUNDLE}" ]; then

				    Le_Deploy_haproxy_bundle="${DEPLOY_HAPROXY_BUNDLE}"

				    _savedomainconf Le_Deploy_haproxy_bundle "${Le_Deploy_haproxy_bundle}"

				  elif [ -z "${Le_Deploy_haproxy_bundle}" ]; then

				    Le_Deploy_haproxy_bundle="${DEPLOY_HAPROXY_BUNDLE_DEFAULT}"

				  fi

				  # ISSUER is optional. If not provided then assume "${DEPLOY_HAPROXY_ISSUER_DEFAULT}"

				  _getdeployconf DEPLOY_HAPROXY_ISSUER

				  _debug2 DEPLOY_HAPROXY_ISSUER "${DEPLOY_HAPROXY_ISSUER}"

				  if [ -n "${DEPLOY_HAPROXY_ISSUER}" ]; then

				    Le_Deploy_haproxy_issuer="${DEPLOY_HAPROXY_ISSUER}"

				    _savedomainconf Le_Deploy_haproxy_issuer "${Le_Deploy_haproxy_issuer}"

				  elif [ -z "${Le_Deploy_haproxy_issuer}" ]; then

				    Le_Deploy_haproxy_issuer="${DEPLOY_HAPROXY_ISSUER_DEFAULT}"

				  fi

				  # RELOAD is optional. If not provided then assume "${DEPLOY_HAPROXY_RELOAD_DEFAULT}"

				  _getdeployconf DEPLOY_HAPROXY_RELOAD

				  _debug2 DEPLOY_HAPROXY_RELOAD "${DEPLOY_HAPROXY_RELOAD}"

				  if [ -n "${DEPLOY_HAPROXY_RELOAD}" ]; then

				    Le_Deploy_haproxy_reload="${DEPLOY_HAPROXY_RELOAD}"

				    _savedomainconf Le_Deploy_haproxy_reload "${Le_Deploy_haproxy_reload}"

				  elif [ -z "${Le_Deploy_haproxy_reload}" ]; then

				    Le_Deploy_haproxy_reload="${DEPLOY_HAPROXY_RELOAD_DEFAULT}"

				  fi

				  # HOT_UPDATE is optional. If not provided then assume "${DEPLOY_HAPROXY_HOT_UPDATE_DEFAULT}"

				  _getdeployconf DEPLOY_HAPROXY_HOT_UPDATE

				  _debug2 DEPLOY_HAPROXY_HOT_UPDATE "${DEPLOY_HAPROXY_HOT_UPDATE}"

				  if [ -n "${DEPLOY_HAPROXY_HOT_UPDATE}" ]; then

				    Le_Deploy_haproxy_hot_update="${DEPLOY_HAPROXY_HOT_UPDATE}"

				    _savedomainconf Le_Deploy_haproxy_hot_update "${Le_Deploy_haproxy_hot_update}"

				  elif [ -z "${Le_Deploy_haproxy_hot_update}" ]; then

				    Le_Deploy_haproxy_hot_update="${DEPLOY_HAPROXY_HOT_UPDATE_DEFAULT}"

				  fi

				  # STATS_SOCKET is optional. If not provided then assume "${DEPLOY_HAPROXY_STATS_SOCKET_DEFAULT}"

				  _getdeployconf DEPLOY_HAPROXY_STATS_SOCKET

				  _debug2 DEPLOY_HAPROXY_STATS_SOCKET "${DEPLOY_HAPROXY_STATS_SOCKET}"

				  if [ -n "${DEPLOY_HAPROXY_STATS_SOCKET}" ]; then

				    Le_Deploy_haproxy_stats_socket="${DEPLOY_HAPROXY_STATS_SOCKET}"

				    _savedomainconf Le_Deploy_haproxy_stats_socket "${Le_Deploy_haproxy_stats_socket}"

				  elif [ -z "${Le_Deploy_haproxy_stats_socket}" ]; then

				    Le_Deploy_haproxy_stats_socket="${DEPLOY_HAPROXY_STATS_SOCKET_DEFAULT}"

				  fi

				  # MASTER_CLI is optional. No defaults are used. When the master CLI is used,

				  # all commands are sent with a prefix.

				  _getdeployconf DEPLOY_HAPROXY_MASTER_CLI

				  _debug2 DEPLOY_HAPROXY_MASTER_CLI "${DEPLOY_HAPROXY_MASTER_CLI}"

				  if [ -n "${DEPLOY_HAPROXY_MASTER_CLI}" ]; then

				    Le_Deploy_haproxy_stats_socket="${DEPLOY_HAPROXY_MASTER_CLI}"

				    _savedomainconf Le_Deploy_haproxy_stats_socket "${Le_Deploy_haproxy_stats_socket}"

				    _cmdpfx="@1 " # command prefix used for master CLI only.

				  fi

				  # Set the suffix depending if we are creating a bundle or not

				  if [ "${Le_Deploy_haproxy_bundle}" = "yes" ]; then

				    _info "Bundle creation requested"

				    # Initialise $Le_Keylength if it's not already set

				    if [ -z "${Le_Keylength}" ]; then

				      Le_Keylength=""

				    fi

				    if _isEccKey "${Le_Keylength}"; then

				      _info "ECC key type detected"

				      _suffix=".ecdsa"

				    else

				      _info "RSA key type detected"

				      _suffix=".rsa"

				    fi

				  else

				    _suffix=""

				  fi

				  _debug _suffix "${_suffix}"

				  # Set variables for later

				  _pem="${Le_Deploy_haproxy_pem_path}/${Le_Deploy_haproxy_pem_name}${_suffix}"

				  _issuer="${_pem}.issuer"

				  _ocsp="${_pem}.ocsp"

				  _reload="${Le_Deploy_haproxy_reload}"

				  _statssock="${Le_Deploy_haproxy_stats_socket}"

				  _info "Deploying PEM file"

				  # Create a temporary PEM file

				  _temppem="$(_mktemp)"

				  _debug _temppem "${_temppem}"

				  cat "${_ccert}" "${_cca}" "${_ckey}" | grep . >"${_temppem}"

				  _ret="$?"

				  # Check that we could create the temporary file

				  if [ "${_ret}" != "0" ]; then

				    _err "Error code ${_ret} returned during PEM file creation"

				    [ -f "${_temppem}" ] && rm -f "${_temppem}"

				    return ${_ret}

				  fi

				  # Move PEM file into place

				  _info "Moving new certificate into place"

				  _debug _pem "${_pem}"

				  cat "${_temppem}" >"${_pem}"

				  _ret=$?

				  # Clean up temp file

				  [ -f "${_temppem}" ] && rm -f "${_temppem}"

				  # Deal with any failure of moving PEM file into place

				  if [ "${_ret}" != "0" ]; then

				    _err "Error code ${_ret} returned while moving new certificate into place"

				    return ${_ret}

				  fi

				  # Update .issuer file if requested

				  if [ "${Le_Deploy_haproxy_issuer}" = "yes" ]; then

				    _info "Updating .issuer file"

				    _debug _issuer "${_issuer}"

				    cat "${_cca}" >"${_issuer}"

				    _ret="$?"

				    if [ "${_ret}" != "0" ]; then

				      _err "Error code ${_ret} returned while copying issuer/CA certificate into place"

				      return ${_ret}

				    fi

				  else

				    [ -f "${_issuer}" ] && _err "Issuer file update not requested but .issuer file exists"

				  fi

				  # Update .ocsp file if certificate was requested with --ocsp/--ocsp-must-staple option

				  if [ -z "${Le_OCSP_Staple}" ]; then

				    Le_OCSP_Staple="0"

				  fi

				  if [ "${Le_OCSP_Staple}" = "1" ]; then

				    _info "Updating OCSP stapling info"

				    _debug _ocsp "${_ocsp}"

				    _info "Extracting OCSP URL"

				    _ocsp_url=$(${ACME_OPENSSL_BIN:-openssl} x509 -noout -ocsp_uri -in "${_pem}")

				    _debug _ocsp_url "${_ocsp_url}"

				    # Only process OCSP if URL was present

				    if [ "${_ocsp_url}" != "" ]; then

				      # Extract the hostname from the OCSP URL

				      _info "Extracting OCSP host"

				      _ocsp_host=$(echo "${_ocsp_url}" | cut -d/ -f3)

				      _debug _ocsp_host "${_ocsp_host}"

				      # Only process the certificate if we have a .issuer file

				      if [ -r "${_issuer}" ]; then

				        # Check if issuer cert is also a root CA cert

				        _subjectdn=$(${ACME_OPENSSL_BIN:-openssl} x509 -in "${_issuer}" -subject -noout | cut -d'/' -f2,3,4,5,6,7,8,9,10)

				        _debug _subjectdn "${_subjectdn}"

				        _issuerdn=$(${ACME_OPENSSL_BIN:-openssl} x509 -in "${_issuer}" -issuer -noout | cut -d'/' -f2,3,4,5,6,7,8,9,10)

				        _debug _issuerdn "${_issuerdn}"

				        _info "Requesting OCSP response"

				        # If the issuer is a CA cert then our command line has "-CAfile" added

				        if [ "${_subjectdn}" = "${_issuerdn}" ]; then

				          _cafile_argument="-CAfile \"${_issuer}\""

				        else

				          _cafile_argument=""

				        fi

				        _debug _cafile_argument "${_cafile_argument}"

				        # if OpenSSL/LibreSSL is v1.1 or above, the format for the -header option has changed

				        _openssl_version=$(${ACME_OPENSSL_BIN:-openssl} version | cut -d' ' -f2)

				        _debug _openssl_version "${_openssl_version}"

				        _openssl_major=$(echo "${_openssl_version}" | cut -d '.' -f1)

				        _openssl_minor=$(echo "${_openssl_version}" | cut -d '.' -f2)

				        if [ "${_openssl_major}" -eq "1" ] && [ "${_openssl_minor}" -ge "1" ] || [ "${_openssl_major}" -ge "2" ]; then

				          _header_sep="="

				        else

				          _header_sep=" "

				        fi

				        # Request the OCSP response from the issuer and store it

				        _openssl_ocsp_cmd="${ACME_OPENSSL_BIN:-openssl} ocsp \

				          -issuer \"${_issuer}\" \

				          -cert \"${_pem}\" \

				          -url \"${_ocsp_url}\" \

				          -header Host${_header_sep}\"${_ocsp_host}\" \

				          -respout \"${_ocsp}\" \

				          -verify_other \"${_issuer}\" \

				          ${_cafile_argument} \

				          | grep -q \"${_pem}: good\""

				        _debug _openssl_ocsp_cmd "${_openssl_ocsp_cmd}"

				        eval "${_openssl_ocsp_cmd}"

				        _ret=$?

				      else

				        # Non fatal: No issuer file was present so no OCSP stapling file created

				        _err "OCSP stapling in use but no .issuer file was present"

				      fi

				    else

				      # Non fatal: No OCSP URL was found in the certificate

				      _err "OCSP update requested but no OCSP URL was found in certificate"

				    fi

				    # Non fatal: Check return code of openssl command

				    if [ "${_ret}" != "0" ]; then

				      _err "Updating OCSP stapling failed with return code ${_ret}"

				    fi

				  else

				    # An OCSP file was already present but certificate did not have OCSP extension

				    if [ -f "${_ocsp}" ]; then

				      _err "OCSP was not requested but .ocsp file exists."

				      # Could remove the file at this step, although HAProxy just ignores it in this case

				      # rm -f "${_ocsp}" || _err "Problem removing stale .ocsp file"

				    fi

				  fi

				  if [ "${Le_Deploy_haproxy_hot_update}" = "yes" ]; then

				    # set the socket name for messages

				    if [ -n "${_cmdpfx}" ]; then

				      _socketname="master CLI"

				    else

				      _socketname="stats socket"

				    fi

				    # Update certificate over HAProxy stats socket or master CLI.

				    if _exists socat; then

				      # look for the certificate on the stats socket, to choose between updating or creating one

				      _socat_cert_cmd="echo '${_cmdpfx}show ssl cert' | socat '${_statssock}' - | grep -q '^${_pem}$'"

				      _debug _socat_cert_cmd "${_socat_cert_cmd}"

				      eval "${_socat_cert_cmd}"

				      _ret=$?

				      if [ "${_ret}" != "0" ]; then

				        _newcert="1"

				        _info "Creating new certificate '${_pem}' over HAProxy ${_socketname}."

				        # certificate wasn't found, so it's a new one. Check that the crt-list exists before creating and inserting the certificate.

				        _socat_crtlist_show_cmd="echo '${_cmdpfx}show ssl crt-list' | socat '${_statssock}' - | grep -q '^${Le_Deploy_haproxy_pem_path}$'"

				        _debug _socat_crtlist_show_cmd "${_socat_crtlist_show_cmd}"

				        eval "${_socat_crtlist_show_cmd}"

				        _ret=$?

				        if [ "${_ret}" != "0" ]; then

				          _err "Couldn't find '${Le_Deploy_haproxy_pem_path}' in haproxy 'show ssl crt-list'"

				          return "${_ret}"

				        fi

				        # create a new certificate

				        _socat_new_cmd="echo '${_cmdpfx}new ssl cert ${_pem}' | socat '${_statssock}' - | grep -q 'New empty'"

				        _debug _socat_new_cmd "${_socat_new_cmd}"

				        eval "${_socat_new_cmd}"

				        _ret=$?

				        if [ "${_ret}" != "0" ]; then

				          _err "Couldn't create '${_pem}' in haproxy"

				          return "${_ret}"

				        fi

				      else

				        _info "Update existing certificate '${_pem}' over HAProxy ${_socketname}."

				      fi

				      _socat_cert_set_cmd="echo -e '${_cmdpfx}set ssl cert ${_pem} <<\n$(cat "${_pem}")\n' | socat '${_statssock}' - | grep -q 'Transaction created'"

				      _debug _socat_cert_set_cmd "${_socat_cert_set_cmd}"

				      eval "${_socat_cert_set_cmd}"

				      _ret=$?

				      if [ "${_ret}" != "0" ]; then

				        _err "Can't update '${_pem}' in haproxy"

				        return "${_ret}"

				      fi

				      _socat_cert_commit_cmd="echo '${_cmdpfx}commit ssl cert ${_pem}' | socat '${_statssock}' - | grep -q '^Success!$'"

				      _debug _socat_cert_commit_cmd "${_socat_cert_commit_cmd}"

				      eval "${_socat_cert_commit_cmd}"

				      _ret=$?

				      if [ "${_ret}" != "0" ]; then

				        _err "Can't commit '${_pem}' in haproxy"

				        return ${_ret}

				      fi

				      if [ "${_newcert}" = "1" ]; then

				        # if this is a new certificate, it needs to be inserted into the crt-list

				        _socat_cert_add_cmd="echo '${_cmdpfx}add ssl crt-list ${Le_Deploy_haproxy_pem_path} ${_pem}' | socat '${_statssock}' - | grep -q 'Success!'"

				        _debug _socat_cert_add_cmd "${_socat_cert_add_cmd}"

				        eval "${_socat_cert_add_cmd}"

				        _ret=$?

				        if [ "${_ret}" != "0" ]; then

				          _err "Can't add '${_pem}' to the crt-list in haproxy"

				          return "${_ret}"

				        fi

				      fi

				    else

				      _err "'socat' is not available, couldn't update over ${_socketname}"

				    fi

				  else

				    # Reload HAProxy

				    _debug _reload "${_reload}"

				    eval "${_reload}"

				    _ret=$?

				    if [ "${_ret}" != "0" ]; then

				      _err "Error code ${_ret} during reload"

				      return ${_ret}

				    else

				      _info "Reload successful"

				    fi

				  fi

				  return 0

				}
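Note on the hot-update path above: the hook sends a fixed sequence of CLI commands, each prefixed with "@1 " when the master CLI is used. The following is only a sketch of that ordering, reconstructed from the function body; `hot_update_cmds` is a hypothetical helper that prints the commands instead of sending them through socat, and the PEM payload that follows `<<` is left out.

```shell
#!/bin/bash
# Print, in order, the CLI commands sent for one certificate.
# $1: command prefix ("@1 " for the master CLI, "" for a stats socket)
# $2: certificate path as known to HAProxy
# $3: crt-list (or directory) the certificate belongs to
# $4: "1" when the certificate is not loaded in HAProxy yet
hot_update_cmds() {
    local pfx="$1" pem="$2" crtlist="$3" is_new="$4"
    # A new certificate must first be created as an empty entry.
    [ "$is_new" = "1" ] && printf '%snew ssl cert %s\n' "$pfx" "$pem"
    # The PEM payload and a terminating blank line follow the "<<" marker.
    printf '%sset ssl cert %s <<\n' "$pfx" "$pem"
    printf '%scommit ssl cert %s\n' "$pfx" "$pem"
    # Only a new certificate needs to be attached to the crt-list.
    [ "$is_new" = "1" ] && printf '%sadd ssl crt-list %s %s\n' "$pfx" "$crtlist" "$pem"
    return 0
}

# Example values are made up for illustration.
hot_update_cmds "@1 " /etc/haproxy/example.com.pem /etc/haproxy 1
```

For a certificate that is already loaded, only the `set ssl cert` and `commit ssl cert` lines are produced.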

235

admin/cli/haproxy-dump-certs Executable file

 @ -0,0 +1,235 @@
 #!/bin/bash
 #
 # Dump certificates from the HAProxy stats or master socket to the filesystem
 # Experimental script
 #
 set -e
 export BASEPATH=${BASEPATH:-/etc/haproxy}/
 export SOCKET=${SOCKET:-/var/run/haproxy-master.sock}
 export DRY_RUN=0
 export DEBUG=
 export VERBOSE=
 export M="@1 "
 export TMP
 vecho() {
 	[ -n "$VERBOSE" ] && echo "$@"
 	return 0
 }
 read_certificate() {
 	name=$1
 	crt_filename=
 	key_filename=
 	OFS=$IFS
 	IFS=":"
 	while read -r key value; do
 		case "$key" in
 			"Crt filename")
 				crt_filename="${value# }"
 				key_filename="${value# }"
 			;;
 			"Key filename")
 				key_filename="${value# }"
 			;;
 		esac
 	done < <(echo "${M}show ssl cert ${name}" | socat "${SOCKET}" -)
 	IFS=$OFS
 	if [ -z "$crt_filename" ] || [ -z "$key_filename" ]; then
 		return 1
 	fi
 	# handle fields without a crt-base/key-base
 	[ "${crt_filename:0:1}" != "/" ] && crt_filename="${BASEPATH}${crt_filename}"
 	[ "${key_filename:0:1}" != "/" ] && key_filename="${BASEPATH}${key_filename}"
 	vecho "name:$name"
 	vecho "crt:$crt_filename"
 	vecho "key:$key_filename"
 	export NAME="$name"
 	export CRT_FILENAME="$crt_filename"
 	export KEY_FILENAME="$key_filename"
 	return 0
 }
 cmp_certkey() {
 	prev=$1
 	new=$2
 	if [ ! -f "$prev" ]; then
 		return 1;
 	fi
 	if ! cmp -s <(openssl x509 -in "$prev" -noout -fingerprint -sha256) <(openssl x509 -in "$new" -noout -fingerprint -sha256); then
 		return 1
 	fi
 	return 0
 }
 dump_certificate() {
 	name=$1
 	prev_crt=$2
 	prev_key=$3
 	r="tmp.${RANDOM}"
 	d="old.$(date +%s)"
 	new_crt="$TMP/$(basename "$prev_crt").${r}"
 	new_key="$TMP/$(basename "$prev_key").${r}"
 	if ! touch "${new_crt}" || ! touch "${new_key}"; then
 		echo "[ALERT] ($$) : can't dump \"$name\", can't create tmp files" >&2
 		return 1
 	fi
 	echo "${M}dump ssl cert ${name}" | socat "${SOCKET}" - | openssl pkey >> "${new_key}"
 	# use crl2pkcs7 as a way to dump multiple x509, storeutl could be used in modern versions of openssl
 	echo "${M}dump ssl cert ${name}" | socat "${SOCKET}" - | openssl crl2pkcs7 -nocrl -certfile /dev/stdin | openssl pkcs7 -print_certs  >> "${new_crt}"
 	if ! cmp -s <(openssl x509 -in "${new_crt}" -pubkey -noout) <(openssl pkey -in "${new_key}" -pubout); then
 		echo "[ALERT] ($$) : Private key \"${new_key}\" and public key \"${new_crt}\" don't match" >&2
 		return 1
 	fi
 	if cmp_certkey "${prev_crt}" "${new_crt}"; then
 		echo "[NOTICE] ($$) : ${crt_filename} is already up to date" >&2
 		return 0
 	fi
 	# dry run will just return before trying to move the files
 	if [ "${DRY_RUN}" != "0" ]; then
 		return 0
 	fi
 	# move the current certificates to ".old.timestamp"
 	if [ -f "${prev_crt}" ] && [ -f "${prev_key}" ]; then
 		mv "${prev_crt}" "${prev_crt}.${d}"
 		[ "${prev_crt}" != "${prev_key}" ] && mv "${prev_key}" "${prev_key}.${d}"
 	fi
 	# move the new certificates into place of the old ones
 	mv "${new_crt}" "${prev_crt}"
 	[ "${prev_crt}" != "${prev_key}" ] && mv "${new_key}" "${prev_key}"
 	return 0
 }
 dump_all_certificates() {
 	echo "${M}show ssl cert" | socat "${SOCKET}" - | grep -v '^#' | grep -v '^$' | while read -r line; do
 		export NAME
 		export CRT_FILENAME
 		export KEY_FILENAME
 		if read_certificate "$line"; then
 			dump_certificate "$NAME" "$CRT_FILENAME" "$KEY_FILENAME"
 		else
 			echo "[WARNING] ($$) : can't dump \"$name\", crt/key filename details not found in \"show ssl cert\"" >&2
 		fi
 	done
 }
 usage() {
 	echo "Usage:"
 	echo " $0 [options]* [cert]*"
 	echo ""
 	echo " Dump certificates from the HAProxy stats or master socket to the filesystem"
 	echo " Requires socat and openssl"
 	echo " EXPERIMENTAL script, backup your files!"
 	echo " The script will move your previous files to FILE.old.unixtimestamp (ex: foo.com.pem.old.1759044998)"
 	echo ""
 	echo "Options:"
 	echo "  -S, --master-socket <path>   Use the master socket at <path> (default: ${SOCKET})"
 	echo "  -s, --socket <path>          Use the stats socket at <path>"
 	echo "  -p, --path <path>            Specify a base path for relative files (default: ${BASEPATH})"
 	echo "  -n, --dry-run                Read certificates on the socket but don't dump them"
 	echo "  -d, --debug                  Debug mode, set -x"
 	echo "  -v, --verbose                Verbose mode"
 	echo "  -h, --help                   This help"
 	echo "  --                           End of options"
 	echo ""
 	echo "Examples:"
 	echo "  $0 -v -p ${BASEPATH} -S ${SOCKET}"
 	echo "  $0 -v -p ${BASEPATH} -S ${SOCKET} bar.com.rsa.pem"
 	echo "  $0 -v -p ${BASEPATH} -S ${SOCKET} -- foo.com.ecdsa.pem bar.com.rsa.pem"
 }
 main() {
 	while [ -n "$1" ]; do
 		case "$1" in
 			-S|--master-socket)
 				SOCKET="$2"
 				M="@1 "
 				shift 2
 				;;
 			-s|--socket)
 				SOCKET="$2"
 				M=
 				shift 2
 				;;
 			-p|--path)
 				BASEPATH="$2/"
 				shift 2
 				;;
 			-n|--dry-run)
 				DRY_RUN=1
 				shift
 				;;
 			-d|--debug)
 				DEBUG=1
 				shift
 				;;
 			-v|--verbose)
 				VERBOSE=1
 				shift
 				;;
 			-h|--help)
 				usage "$@"
 				exit 0
 				;;
 			--)
 				shift
 				break
 				;;
 			-*)
 				echo "[ALERT] ($$) : Unknown option '$1'" >&2
 				usage "$@"
 				exit 1
 				;;
 			*)
 				break
 				;;
 		esac
 	done
 	if [ -n "$DEBUG" ]; then
 		set -x
 	fi
 	TMP=${TMP:-$(mktemp -d)}
 	if [ -z "$1" ]; then
 		dump_all_certificates
 	else
 		# compute the certificates names at the end of the command
 		while [ -n "$1" ]; do
 			if ! read_certificate "$1"; then
 				echo "[ALERT] ($$) : can't dump \"$1\", crt/key filename details not found in \"show ssl cert\"" >&2
 				exit 1
 			fi
 			[ "${DRY_RUN}" = "0" ] && dump_certificate "$NAME" "$CRT_FILENAME" "$KEY_FILENAME"
 			shift
 		done
 	fi
 }
 trap 'rm -rf -- "$TMP"' EXIT
 main "$@"
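Note on read_certificate above: it splits each line of the "show ssl cert <name>" output on the first colon and keeps only the "Crt filename" and "Key filename" fields, with the key defaulting to the certificate file. The following sketch isolates that parsing step so it can be tried without a socket; `parse_cert_files` is a hypothetical name and the sample input is invented, not captured from HAProxy.

```shell
#!/bin/bash
# Read "Key: value" lines on stdin and print the certificate file name,
# then the key file name, one per line. Returns 1 if either is missing.
parse_cert_files() {
    local key value crt="" keyfile=""
    # IFS=":" splits on the first colon; the rest of the line goes to $value.
    while IFS=":" read -r key value; do
        case "$key" in
            "Crt filename")
                crt="${value# }"
                keyfile="${value# }"   # same file unless a key line follows
                ;;
            "Key filename")
                keyfile="${value# }"
                ;;
        esac
    done
    [ -n "$crt" ] && [ -n "$keyfile" ] || return 1
    printf '%s\n%s\n' "$crt" "$keyfile"
}

# Hypothetical sample input.
parse_cert_files <<'EOF'
Crt filename: /etc/haproxy/example.com.crt
Key filename: /etc/haproxy/example.com.key
EOF
```

Setting IFS only for the `read` command avoids having to save and restore the global IFS, which the script does by hand with OFS.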

113

admin/cli/haproxy-reload Executable file

 @ -0,0 +1,113 @@
 #!/bin/bash
 set -e
 export VERBOSE=1
 export TIMEOUT=90
 export MASTER_SOCKET=${MASTER_SOCKET:-/var/run/haproxy-master.sock}
 export RET=
 alert() {
 	if [ "$VERBOSE" -ge "1" ]; then
 		echo "[ALERT] $*" >&2
 	fi
 }
 reload() {
 	while read -r line; do
 		if [ "$line" = "Success=0" ]; then
 			RET=1
 		elif [ "$line" = "Success=1" ]; then
 			RET=0
 		elif [ "$line" = "Another reload is still in progress." ]; then
 			alert "$line"
 		elif [ "$line" = "--" ]; then
 			continue;
 		else
 			if [ "$RET" = 1 ] && [ "$VERBOSE" = "2" ]; then
 				echo "$line" >&2
 			elif [ "$VERBOSE" = "3" ]; then
 				echo "$line" >&2
 			fi
 		fi
 	done < <(echo "reload" | socat -t"${TIMEOUT}" "${MASTER_SOCKET}" -)
 	if [ -z "$RET" ]; then
 		alert "Couldn't finish the reload before the timeout (${TIMEOUT})."
 		return 1
 	fi
 	return "$RET"
 }
 usage() {
 	echo "Usage:"
 	echo " $0 [options]*"
 	echo ""
 	echo " Trigger a reload from the master socket"
 	echo " Requires socat"
 	echo " EXPERIMENTAL script!"
 	echo ""
 	echo "Options:"
 	echo "  -S,  --master-socket <path>   Use the master socket at <path> (default: ${MASTER_SOCKET})"
 	echo "  -d,  --debug                  Debug mode, set -x"
 	echo "  -t,  --timeout                Timeout (socat -t) (default: ${TIMEOUT})"
 	echo "  -s,  --silent                 Silent mode (no output)"
 	echo "  -v,  --verbose                Verbose output (output from haproxy on failure)"
 	echo "  -vv                           Even more verbose output (output from haproxy on success and failure)"
 	echo "  -h,  --help                   This help"
 	echo ""
 	echo "Examples:"
 	echo "  $0 -S ${MASTER_SOCKET} -t ${TIMEOUT}"
 }
 main() {
 	while [ -n "$1" ]; do
 		case "$1" in
 			-S|--master-socket)
 				MASTER_SOCKET="$2"
 				shift 2
 				;;
 			-t|--timeout)
 				TIMEOUT="$2"
 				shift 2
 				;;
 			-s|--silent)
 				VERBOSE=0
 				shift
 				;;
 			-v|--verbose)
 				VERBOSE=2
 				shift
 				;;
 			-vv)
 				VERBOSE=3
 				shift
 				;;
 			-d|--debug)
 				DEBUG=1
 				shift
 				;;
 			-h|--help)
 				usage "$@"
 				exit 0
 				;;
 			*)
 				echo "[ALERT] ($$) : Unknown option '$1'" >&2
 				usage "$@"
 				exit 1
 				;;
 		esac
 	done
 	if [ -n "$DEBUG" ]; then
 		set -x
 	fi
 }
 main "$@"
 reload
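Note on reload() above: the outcome is decided by the "Success=1" or "Success=0" line in the master socket's reply, and an empty result is treated as a timeout. A minimal sketch of that decision, reading the reply from stdin instead of socat, could look like this. `parse_reload_status` is a hypothetical name, and returning 2 for "no status line seen" is a choice made here; the script itself returns 1 in that case.

```shell
#!/bin/bash
# Read the reply to "reload" on stdin.
# Returns 0 on "Success=1", 1 on "Success=0", 2 if no status line was seen.
parse_reload_status() {
    local line ret=""
    while read -r line; do
        case "$line" in
            "Success=1") ret=0 ;;
            "Success=0") ret=1 ;;
        esac
    done
    [ -n "$ret" ] || return 2
    return "$ret"
}

# Invented reply, for illustration only.
if printf 'Success=1\n--\n' | parse_reload_status; then
    echo "reload ok"
fi
```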

									
										29

admin/halog/halog.c
									
				@ -123,6 +123,22 @@ struct url_stat {

				#define FILT2_PRESERVE_QUERY    0x02

				#define FILT2_EXTRACT_CAPTURE   0x04

				#define FILT_OUTPUT_FMT   (FILT_COUNT_ONLY| \

							   FILT_COUNT_STATUS| \

							   FILT_COUNT_SRV_STATUS| \

							   FILT_COUNT_COOK_CODES| \

							   FILT_COUNT_TERM_CODES| \

							   FILT_COUNT_URL_ONLY| \

							   FILT_COUNT_URL_COUNT| \

							   FILT_COUNT_URL_ERR| \

							   FILT_COUNT_URL_TAVG| \

							   FILT_COUNT_URL_TTOT| \

							   FILT_COUNT_URL_TAVGO| \

							   FILT_COUNT_URL_TTOTO| \

							   FILT_COUNT_URL_BAVG| \

							   FILT_COUNT_URL_BTOT| \

							   FILT_COUNT_IP_COUNT)

				unsigned int filter = 0;

				unsigned int filter2 = 0;

				unsigned int filter_invert = 0;

				@ -192,7 +208,7 @@ void help()

					       "                         you can also use -n to start from earlier than field %d\n"

					       " -query                  preserve the query string for per-URL (-u*) statistics\n"

					       "\n"

-					       "Output format - only one may be used at a time\n"

+					       "Output format - **only one** may be used at a time\n"

					       " -c    only report the number of lines that would have been printed\n"

					       " -pct  output connect and response times percentiles\n"

					       " -st   output number of requests per HTTP status code\n"

				@ -898,6 +914,9 @@ int main(int argc, char **argv)

					if (!filter && !filter2)

						die("No action specified.\n");

					if ((filter & FILT_OUTPUT_FMT) & ((filter & FILT_OUTPUT_FMT) - 1))

						die("Please, set only one output filter.\n");

					if (filter & FILT_ACC_COUNT && !filter_acc_count)

						filter_acc_count=1;

				@ -1552,6 +1571,10 @@ void filter_count_srv_status(const char *accept_field, const char *time_field, s

					if (!srv_node) {

						/* server not yet in the tree, let's create it */

						srv = (void *)calloc(1, sizeof(struct srv_st) + e - b + 1);

						if (unlikely(!srv)) {

							fprintf(stderr, "%s: not enough memory\n", __FUNCTION__);

							exit(1);

						}

						srv_node = &srv->node;

						memcpy(&srv_node->key, b, e - b);

						srv_node->key[e - b] = '\0';

				@ -1661,6 +1684,10 @@ void filter_count_url(const char *accept_field, const char *time_field, struct t

					 */

					if (unlikely(!ustat))

						ustat = calloc(1, sizeof(*ustat));

					if (unlikely(!ustat)) {

						fprintf(stderr, "%s: not enough memory\n", __FUNCTION__);

						exit(1);

					}

					ustat->nb_err = err;

					ustat->nb_req = 1;
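Note on the new check in main(): it relies on the x & (x - 1) idiom. Subtracting one clears the lowest set bit and sets the bits below it, so the AND is non-zero only when more than one bit was set. The sketch below shows the same test in shell arithmetic; `more_than_one_bit` is a hypothetical helper and the mask values are arbitrary, not halog's actual FILT_* constants.

```shell
#!/bin/bash
# Succeeds (returns 0) when more than one bit is set in $1.
# $1 may be decimal or 0x-prefixed hexadecimal.
more_than_one_bit() {
    local x=$(( $1 ))
    [ $(( x & (x - 1) )) -ne 0 ]
}

for mask in 0x00 0x04 0x06; do
    if more_than_one_bit "$mask"; then
        echo "$mask: more than one output format selected"
    else
        echo "$mask: zero or one output format selected"
    fi
done
```

A value of zero also passes the test, which is why halog checks for "No action specified" separately before it.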

15

admin/release-estimator/README.md

 @ -7,6 +7,21 @@ the queue.
 ## Requirements
   - Python 3.x
   - [lxml](https://lxml.de/installation.html)
   - requests
   - urllib3
 ## Installation
 It can be easily installed with venv from python3
     $ python3 -m venv ~/.local/venvs/stable-bot/
     $ source ~/.local/venvs/stable-bot/bin/activate
     $ pip install -r requirements.txt
 And can be executed with:
     $ ~/.local/venvs/stable-bot/bin/python release-estimator.py
 ## Usage

									
										4

admin/release-estimator/release-estimator.py
									
				@ -1,4 +1,4 @@

-				#!/usr/bin/python3

+				#!/usr/bin/env python3

				#

				# Release estimator for HAProxy

				#

				@ -16,6 +16,7 @@

				#

				from lxml import html

				from urllib.parse import urljoin

				import requests

				import traceback

				import smtplib

				@ -190,6 +191,7 @@ This is a friendly bot that watches fixes pending for the next haproxy-stable re

				        # parse out the CHANGELOG link

				        CHANGELOG = tree.xpath('//a[contains(@href,"CHANGELOG")]/@href')[0]

				        CHANGELOG = urljoin("https://", CHANGELOG)

				        last_version = tree.xpath('//td[contains(text(), "last")]/../td/a/text()')[0]

				        first_version = "%s.0" % (version)

3

admin/release-estimator/requirements.txt Normal file

 @ -0,0 +1,3 @@
 lxml
 requests
 urllib3

									
										6

admin/systemd/haproxy.service.in
									
				@ -6,9 +6,9 @@ Wants=network-online.target

				[Service]

				EnvironmentFile=-/etc/default/haproxy

				EnvironmentFile=-/etc/sysconfig/haproxy

-				Environment="CONFIG=/etc/haproxy/haproxy.cfg" "PIDFILE=/run/haproxy.pid" "EXTRAOPTS=-S /run/haproxy-master.sock"

-				ExecStart=@SBINDIR@/haproxy -Ws -f $CONFIG -p $PIDFILE $EXTRAOPTS

-				ExecReload=@SBINDIR@/haproxy -Ws -f $CONFIG -c $EXTRAOPTS

+				Environment="CONFIG=/etc/haproxy/haproxy.cfg" "PIDFILE=/run/haproxy.pid" "CFGDIR=/etc/haproxy/conf.d" "EXTRAOPTS=-S /run/haproxy-master.sock"

+				ExecStart=@SBINDIR@/haproxy -Ws -f $CONFIG -f $CFGDIR -p $PIDFILE $EXTRAOPTS

+				ExecReload=@SBINDIR@/haproxy -Ws -f $CONFIG -f $CFGDIR -c $EXTRAOPTS

				ExecReload=/bin/kill -USR2 $MAINPID

				KillMode=mixed

				Restart=always
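Note on the unit change above: with `-f $CFGDIR` added, the commit message states that /etc/haproxy/conf.d must exist or the service will not start. A packaging or provisioning step might therefore create the directory up front. The following is a sketch under that assumption; `ensure_cfgdir` is a hypothetical helper, and the demonstration runs against a temporary directory rather than /etc.

```shell
#!/bin/bash
# Create the configuration directory if it is missing.
# $1: directory to create (defaults to the CFGDIR value from the unit file)
ensure_cfgdir() {
    local dir="${1:-/etc/haproxy/conf.d}"
    [ -d "$dir" ] || mkdir -p -- "$dir"
}

# Demonstration against a throwaway directory.
demo_root="$(mktemp -d)"
ensure_cfgdir "$demo_root/conf.d" && echo "ready: $demo_root/conf.d"
rm -rf -- "$demo_root"
```

An empty directory is enough: when `-f` is given a directory, HAProxy loads the `.cfg` files it contains.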

34

dev/coccinelle/unchecked-calloc.cocci Normal file

 @ -0,0 +1,34 @@
 // find calls to calloc
 @call@
 expression ptr;
 position p;
 @@
 ptr@p = calloc(...);
 // find ok calls to calloc
 @ok@
 expression ptr;
 position call.p;
 @@
 ptr@p = calloc(...);
 ... when != ptr
 (
  (ptr == NULL || ...)
 |
  (ptr == 0 || ...)
 |
  (ptr != NULL || ...)
 |
  (ptr != 0 || ...)
 )
 // fix bad calls to calloc
 @depends on !ok@
 expression ptr;
 position call.p;
 @@
 ptr@p = calloc(...);
 + if (ptr == NULL) return;

34

dev/coccinelle/unchecked-malloc.cocci Normal file

 @ -0,0 +1,34 @@
 // find calls to malloc
 @call@
 expression ptr;
 position p;
 @@
 ptr@p = malloc(...);
 // find ok calls to malloc
 @ok@
 expression ptr;
 position call.p;
 @@
 ptr@p = malloc(...);
 ... when != ptr
 (
  (ptr == NULL || ...)
 |
  (ptr == 0 || ...)
 |
  (ptr != NULL || ...)
 |
  (ptr != 0 || ...)
 )
 // fix bad calls to malloc
 @depends on !ok@
 expression ptr;
 position call.p;
 @@
 ptr@p = malloc(...);
 + if (ptr == NULL) return;

34

dev/coccinelle/unchecked-strdup.cocci Normal file

 @ -0,0 +1,34 @@
 // find calls to strdup
 @call@
 expression ptr;
 position p;
 @@
 ptr@p = strdup(...);
 // find ok calls to strdup
 @ok@
 expression ptr;
 position call.p;
 @@
 ptr@p = strdup(...);
 ... when != ptr
 (
  (ptr == NULL || ...)
 |
  (ptr == 0 || ...)
 |
  (ptr != NULL || ...)
 |
  (ptr != 0 || ...)
 )
 // fix bad calls to strdup
 @depends on !ok@
 expression ptr;
 position call.p;
 @@
 ptr@p = strdup(...);
 + if (ptr == NULL) return;

									
										18

dev/flags/flags.c
									
				@ -4,6 +4,7 @@

				/* make the include files below expose their flags */

				#define HA_EXPOSE_FLAGS

				#include <haproxy/applet-t.h>

				#include <haproxy/channel-t.h>

				#include <haproxy/connection-t.h>

				#include <haproxy/fd-t.h>

				@ -12,7 +13,10 @@

				#include <haproxy/mux_fcgi-t.h>

				#include <haproxy/mux_h2-t.h>

				#include <haproxy/mux_h1-t.h>

				#include <haproxy/mux_quic-t.h>

				#include <haproxy/mux_spop-t.h>

				#include <haproxy/peers-t.h>

				#include <haproxy/quic_conn-t.h>

				#include <haproxy/stconn-t.h>

				#include <haproxy/stream-t.h>

				#include <haproxy/task-t.h>

				@ -39,11 +43,17 @@

				#define SHOW_AS_FSTRM 0x00040000

				#define SHOW_AS_PEERS 0x00080000

				#define SHOW_AS_PEER  0x00100000

				#define SHOW_AS_QC    0x00200000

				#define SHOW_AS_SPOPC 0x00400000

				#define SHOW_AS_SPOPS 0x00800000

				#define SHOW_AS_QCC   0x01000000

				#define SHOW_AS_QCS   0x02000000

				#define SHOW_AS_APPCTX 0x04000000

				// command line names, must be in exact same order as the SHOW_AS_* flags above

				// so that show_as_words[i] matches flag 1U<<i.

				const char *show_as_words[] = { "ana", "chn", "conn", "sc", "stet", "strm", "task", "txn", "sd", "hsl", "htx", "hmsg", "fd", "h2c", "h2s",  "h1c", "h1s", "fconn", "fstrm",

-								"peers", "peer"};

+								"peers", "peer", "qc", "spopc", "spops", "qcc", "qcs", "appctx"};

				/* will be sufficient for even largest flag names */

				static char buf[4096];

				@ -158,6 +168,12 @@ int main(int argc, char **argv)

						if (show_as & SHOW_AS_FSTRM) printf("fstrm->flags = %s\n",(fstrm_show_flags  (buf, bsz, " | ", flags), buf));

						if (show_as & SHOW_AS_PEERS) printf("peers->flags = %s\n",(peers_show_flags  (buf, bsz, " | ", flags), buf));

						if (show_as & SHOW_AS_PEER)  printf("peer->flags = %s\n", (peer_show_flags   (buf, bsz, " | ", flags), buf));

						if (show_as & SHOW_AS_QC)    printf("qc->flags = %s\n",   (qc_show_flags     (buf, bsz, " | ", flags), buf));

						if (show_as & SHOW_AS_SPOPC) printf("spopc->flags = %s\n",(spop_conn_show_flags(buf, bsz, " | ", flags), buf));

						if (show_as & SHOW_AS_SPOPS) printf("spops->flags = %s\n",(spop_strm_show_flags(buf, bsz, " | ", flags), buf));

						if (show_as & SHOW_AS_QCC)    printf("qcc->flags = %s\n", (qcc_show_flags    (buf, bsz, " | ", flags), buf));

						if (show_as & SHOW_AS_QCS)    printf("qcs->flags = %s\n", (qcs_show_flags    (buf, bsz, " | ", flags), buf));

						if (show_as & SHOW_AS_APPCTX) printf("appctx->flags = %s\n", (appctx_show_flags(buf, bsz, " | ", flags), buf));

					}

					return 0;

				}
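The comment above requires show_as_words[i] to line up with flag 1U<<i. A minimal sketch of that pairing follows, written in Python for brevity; the helper name show_as_mask is hypothetical and is not part of the flags tool.

```python
# Same words, in the same order, as the updated show_as_words[] array.
SHOW_AS_WORDS = [
    "ana", "chn", "conn", "sc", "stet", "strm", "task", "txn", "sd",
    "hsl", "htx", "hmsg", "fd", "h2c", "h2s", "h1c", "h1s", "fconn",
    "fstrm", "peers", "peer", "qc", "spopc", "spops", "qcc", "qcs",
    "appctx",
]

def show_as_mask(names):
    """Return the OR of (1 << index) for each requested word.

    Hypothetical helper: an unknown word raises ValueError.
    """
    mask = 0
    for name in names:
        mask |= 1 << SHOW_AS_WORDS.index(name)
    return mask
```

With this ordering, "fstrm" lands on 0x00040000 and "appctx" on 0x04000000, matching the SHOW_AS_* defines in the hunk.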

									
										2

dev/flags/show-fd-to-flags.sh
									
				@@ -1,2 +1,2 @@

				#!/bin/sh

				awk '{print $12}' | grep cflg= | sort | uniq -c | sort -nr | while read a b; do c=${b##*=}; d=$(${0%/*}/flags conn $c);d=${d##*= }; printf "%6d %s    %s\n" $a "$b" "$d";done

				grep -o 'cflg=[0-9a-fx]*' | sort | uniq -c | sort -nr | while read a b; do c=${b##*=}; d=$(${0%/*}/flags conn $c);d=${d##*= }; printf "%6d %s    %s\n" $a "$b" "$d";done
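The new pipeline extracts every cflg=... token, counts identical values and lists the most frequent first. A rough Python equivalent of the extract-and-count part is sketched below; the function name count_cflg is an assumption, and the call to the flags decoder is left out.

```python
import re
from collections import Counter

def count_cflg(text):
    """Count identical cflg=<value> tokens, most frequent first.

    Mirrors: grep -o 'cflg=[0-9a-fx]*' | sort | uniq -c | sort -nr
    (the order among values with equal counts may differ from sort's).
    """
    tokens = re.findall(r"cflg=[0-9a-fx]*", text)
    return Counter(tokens).most_common()
```

Unlike the previous awk '{print $12}' form, matching the token itself does not depend on the field position within the line.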

									
										2

dev/flags/show-sess-to-flags.sh
									
				@@ -195,7 +195,7 @@ while read -r; do

				                ! [[ "$REPLY" =~ [[:blank:]]h2c.*\.flg=([0-9a-fx]*) ]] || append_flag b.h2c.flg   h2c  "${BASH_REMATCH[1]}"

				        elif [ $ctx = cob ]; then

				                ! [[ "$REPLY" =~ [[:blank:]]flags=([0-9a-fx]*) ]]      || append_flag b.co.flg    conn "${BASH_REMATCH[1]}"

				                ! [[ "$REPLY" =~ [[:blank:]]fd.state=([0-9a-fx]*) ]]   || append_flag b.co.fd.st  fd   "${BASH_REMATCH[1]}"

				                ! [[ "$REPLY" =~ [[:blank:]]fd.state=([0-9a-fx]*) ]]   || append_flag b.co.fd.st  fd   0x"${BASH_REMATCH[1]}"

				        elif [ $ctx = res ]; then

				                ! [[ "$REPLY" =~ [[:blank:]]\(f=([0-9a-fx]*) ]]        || append_flag res.flg     chn  "${BASH_REMATCH[1]}"

				                ! [[ "$REPLY" =~ [[:blank:]]an=([0-9a-fx]*) ]]         || append_flag res.ana     ana  "${BASH_REMATCH[1]}"

118

dev/gdb/ebtree.gdb Normal file

 @@ -0,0 +1,118 @@
 # sets $tag and $node from $arg0, for internal use only
 define _ebtree_set_tag_node
   set $tag = (unsigned long)$arg0 & 0x1
   set $node = (unsigned long)$arg0 & 0xfffffffffffffffe
   set $node = (struct eb_node *)$node
 end
 # get root from any node (leaf or node), returns in $node
 define ebtree_root
   set $node = (struct eb_root *)$arg0->node_p
   if $node == 0
     # sole node
     set $node = (struct eb_root *)$arg0->leaf_p
   end
   # walk up
   while 1
     _ebtree_set_tag_node $node
     if $node->branches.b[1] == 0
       loop_break
     end
     set $node = $node->node_p
   end
   # root returned in $node
 end
 # returns $node filled with the first node of ebroot $arg0
 define ebtree_first
   # browse ebtree left until encountering leaf
   set $node = (struct eb_node *)$arg0->b[0]
   while 1
     _ebtree_set_tag_node $node
     if $tag == 0
       loop_break
     end
     set $node = (struct eb_root *)$node->branches.b[0]
   end
   # extract last node
   _ebtree_set_tag_node $node
 end
 # finds next ebtree node after $arg0, and returns it in $node
 define ebtree_next
   # get parent
   set $node = (struct eb_root *)$arg0->leaf_p
   # Walking up from right branch, so we cannot be below root
   # while (eb_gettag(t) != EB_LEFT) // #define EB_LEFT 0
   while 1
     _ebtree_set_tag_node $node
     if $tag == 0
       loop_break
     end
     set $node = (struct eb_root *)$node->node_p
   end
   set $node = (struct eb_root *)$node->branches.b[1]
   # walk down (left side => 0)
   # while (eb_gettag(start) == EB_NODE) // #define EB_NODE 1
   while 1
     _ebtree_set_tag_node $node
     if $node == 0
       loop_break
     end
     if $tag != 1
       loop_break
     end
     set $node = (struct eb_root *)$node->branches.b[0]
   end
 end
 # sets $tag and $node from $arg0, for internal use only
 define _ebsctree_set_tag_node
   set $tag = (unsigned long)$arg0 & 0x1
   set $node = (unsigned long)$arg0 & 0xfffffffffffffffe
   set $node = (struct eb32sc_node *)$node
 end
 # returns $node filled with the first node of ebroot $arg0
 define ebsctree_first
   # browse ebsctree left until encountering leaf
   set $node = (struct eb32sc_node *)$arg0->b[0]
   while 1
     _ebsctree_set_tag_node $node
     if $tag == 0
       loop_break
     end
     set $node = (struct eb_root *)$node->branches.b[0]
   end
   # extract last node
   _ebsctree_set_tag_node $node
 end
 # finds next ebtree node after $arg0, and returns it in $node
 define ebsctree_next
   # get parent
   set $node = (struct eb_root *)$arg0->node.leaf_p
   # Walking up from right branch, so we cannot be below root
   # while (eb_gettag(t) != EB_LEFT) // #define EB_LEFT 0
   while 1
     _ebsctree_set_tag_node $node
     if $tag == 0
       loop_break
     end
     set $node = (struct eb_root *)$node->node.node_p
   end
   set $node = (struct eb_root *)$node->node.branches.b[1]
   # walk down (left side => 0)
   # while (eb_gettag(start) == EB_NODE) // #define EB_NODE 1
   while 1
     _ebsctree_set_tag_node $node
     if $node == 0
       loop_break
     end
     if $tag != 1
       loop_break
     end
     set $node = (struct eb_root *)$node->node.branches.b[0]
   end
 end
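Both _ebtree_set_tag_node and _ebsctree_set_tag_node split a tagged pointer the same way: bit 0 is the tag and the remaining bits are the node address. A small Python sketch of that split, with a hypothetical function name:

```python
TAG_MASK = 0x1
ADDR_MASK = 0xfffffffffffffffe  # same 64-bit mask as in the gdb macros

def split_tagged(ptr):
    """Return (tag, node_address) for a tagged ebtree pointer."""
    return ptr & TAG_MASK, ptr & ADDR_MASK
```

Per the comments quoted in the macros, tag 0 is EB_LEFT while walking up and tag 1 is EB_NODE while walking down.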

26

dev/gdb/list.gdb Normal file

 @@ -0,0 +1,26 @@
 # lists entries starting at list head $arg0
 define list_dump
   set $h = $arg0
   set $p = *(void **)$h
   while ($p != $h)
     printf "%#lx\n", $p
     if ($p == 0)
       loop_break
     end
     set $p = *(void **)$p
   end
 end
 # list all entries starting at list head $arg0 until meeting $arg1
 define list_find
   set $h = $arg0
   set $k = $arg1
   set $p = *(void **)$h
   while ($p != $h)
     printf "%#lx\n", $p
     if ($p == 0 || $p == $k)
       loop_break
     end
     set $p = *(void **)$p
   end
 end
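list_dump follows the first pointer of each element until it comes back to the list head, and stops early on a NULL pointer. The walk can be modelled in Python with a dictionary standing in for memory; this is only a sketch, and the mem mapping is an assumption made for illustration.

```python
def list_dump(mem, head):
    """Return the element addresses visited from list head `head`.

    `mem` maps an address to the pointer stored at that address,
    like *(void **)$p in the gdb macro.
    """
    visited = []
    p = mem[head]
    while p != head:
        visited.append(p)   # the macro prints before testing for NULL
        if p == 0:
            break
        p = mem[p]
    return visited
```

As in the macro, a NULL pointer is reported once before the walk stops, which makes a corrupted list visible.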

19

dev/gdb/memprof.dbg Normal file

 @@ -0,0 +1,19 @@
 # show non-null memprofile entries with method, alloc/free counts/tot and caller
 define memprof_dump
   set $i = 0
   set $meth={ "UNKN", "MALL", "CALL", "REAL", "STRD", "FREE", "P_AL", "P_FR", "STND", "VALL", "ALAL", "PALG", "MALG", "PVAL" }
   while $i < sizeof(memprof_stats) / sizeof(memprof_stats[0])
     if memprof_stats[$i].alloc_calls || memprof_stats[$i].free_calls
       set $m = memprof_stats[$i].method
       printf "m:%s ac:%u fc:%u at:%u ft:%u ", $meth[$m], \
            memprof_stats[$i].alloc_calls, memprof_stats[$i].free_calls, \
            memprof_stats[$i].alloc_tot, memprof_stats[$i].free_tot
       output/a memprof_stats[$i].caller
       printf "\n"
     end
     set $i = $i + 1
   end
 end

21

dev/gdb/pools.gdb Normal file

 @@ -0,0 +1,21 @@
 # dump pool contents (2.9 and above, with buckets)
 define pools_dump
   set $h = $po
   set $p = *(void **)$h
   while ($p != $h)
     set $e = (struct pool_head *)(((char *)$p) - (unsigned long)&((struct pool_head *)0)->list)
     set $total = 0
     set $used = 0
     set $idx = 0
     while $idx < sizeof($e->buckets) / sizeof($e->buckets[0])
       set $total=$total + $e->buckets[$idx].allocated
       set $used=$used + $e->buckets[$idx].used
       set $idx=$idx + 1
     end
     set $mem = $total * $e->size
     printf "list=%#lx pool_head=%p name=%s size=%u alloc=%u used=%u mem=%u\n", $p, $e, $e->name, $e->size, $total, $used, $mem
     set $p = *(void **)$p
   end
 end
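pools_dump does two things for each list element: it recovers the enclosing pool_head by subtracting the offset of its list member, then sums the per-bucket counters. A Python sketch of both steps, using plain dictionaries in place of the C structures (an assumption made for illustration):

```python
def container_of(member_addr, member_offset):
    """Address of the structure that embeds a member at `member_offset`."""
    return member_addr - member_offset

def pool_usage(size, buckets):
    """Sum bucket counters the way pools_dump does.

    `buckets` is a list of dicts with "allocated" and "used" keys.
    """
    total = sum(b["allocated"] for b in buckets)
    used = sum(b["used"] for b in buckets)
    return {"alloc": total, "used": used, "mem": total * size}
```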

47

dev/gdb/post-mortem.gdb Normal file

 @@ -0,0 +1,47 @@
 # This script will set the post_mortem struct pointer ($pm) from the one found
 # in the "post_mortem" symbol. If not found or if not correct, it's the same
 # address as the "_post_mortem" section, which can be found using "info files"
 # or "objdump -h" on the executable. The guessed value is set by a first call
 # to pm_init, but if not correct, you just need to call pm_init again with the
 # correct pointer, e.g:
 #   pm_init 0xcfd400
 define pm_init
   set $pm = (struct post_mortem*)$arg0
   set $g = $pm.global
   set $ti = $pm.thread_info
   set $tc = $pm.thread_ctx
   set $tgi = $pm.tgroup_info
   set $tgc = $pm.tgroup_ctx
   set $fd = $pm.fdtab
   set $pxh = *$pm.proxies
   set $po  = $pm.pools
   set $ac  = $pm.activity
 end
 # show basic info on the running process (OS, uid, etc)
 define pm_show_info
   print $pm->platform
   print $pm->process
 end
 # show thread IDs to easily map between gdb threads and tid
 define pm_show_threads
   set $t = 0
   while $t < $g.nbthread
     printf "Tid %4d: pthread_id=%#lx  stack_top=%#lx\n", $t, $ti[$t].pth_id, $ti[$t].stack_top
     set $t = $t + 1
   end
 end
 # dump all threads' dump buffers
 define pm_show_thread_dump
   set $t = 0
   while $t < $g.nbthread
     printf "%s\n", $tc[$t].thread_dump_buffer->area
     set $t = $t + 1
   end
 end
 # initialize the various pointers
 pm_init &post_mortem

25

dev/gdb/proxies.gdb Normal file

 @@ -0,0 +1,25 @@
 # list proxies starting with the one in argument (typically $pxh)
 define px_list
   set $p = (struct proxy *)$arg0
   while ($p != 0)
     printf "%p (", $p
     if $p->cap & 0x10
       printf "LB,"
     end
     if $p->cap & 0x1
       printf "FE,"
     end
     if $p->cap & 0x2
       printf "BE,"
     end
     printf "%s)", $p->id
     if $p->cap & 0x1
       printf " feconn=%u cmax=%u cum_conn=%llu cpsmax=%u", $p->feconn, $p->fe_counters.conn_max, $p->fe_counters.cum_conn, $p->fe_counters.cps_max
     end
     if $p->cap & 0x2
       printf " beconn=%u served=%u queued=%u qmax=%u cum_sess=%llu wact=%u", $p->beconn, $p->served, $p->queue.length, $p->be_counters.nbpend_max, $p->be_counters.cum_sess, $p->lbprm.tot_wact
     end
     printf "\n"
     set $p = ($p)->next
   end
 end
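px_list tests three capability bits and prints a label for each one that is set. The same decoding, sketched in Python; the bit values and their order are taken from the macro, and the names below are hypothetical.

```python
# (bit, label) pairs in the order px_list tests them.
CAP_LABELS = [(0x10, "LB"), (0x1, "FE"), (0x2, "BE")]

def cap_labels(cap):
    """Return the labels px_list would print for a proxy's cap field."""
    return [label for bit, label in CAP_LABELS if cap & bit]
```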

9

dev/gdb/servers.gdb Normal file

 @@ -0,0 +1,9 @@
 # list servers in a proxy whose pointer is passed in argument
 define px_list_srv
   set $h = (struct proxy *)$arg0
   set $p = ($h)->srv
   while ($p != 0)
     printf "%#lx %s maxconn=%u cur_sess=%u max_sess=%u served=%u queued=%u st=%u->%u ew=%u sps_max=%u\n", $p, $p->id, $p->maxconn, $p->cur_sess, $p->counters.cur_sess_max, $p->served, $p->queue.length, $p->cur_state, $p->next_state, $p->cur_eweight, $p->counters.sps_max
     set $p = ($p)->next
   end
 end

18

dev/gdb/stream.gdb Normal file

 @@ -0,0 +1,18 @@
 # list all streams for all threads
 define stream_dump
   set $t = 0
   while $t < $g.nbthread
     set $h = &$tc[$t].streams
     printf "Tid %4d: &streams=%p\n", $t, $h
     set $p = *(void **)$h
     while ($p != $h)
       set $s = (struct stream *)(((char *)$p) - (unsigned long)&((struct stream *)0)->list)
       printf "  &list=%#lx strm=%p uid=%u strm.fe=%s strm.flg=%#x strm.list={n=%p,p=%p}\n", $p, $s, $s->uniq_id, $s->sess->fe->id, $s->flags, $s->list.n, $s->list.p
       if ($p == 0)
          loop_break
       end
       set $p = *(void **)$p
     end
     set $t = $t + 1
   end
 end

									
										247

dev/h2/h2-tracer.lua
									
										Normal file
									
				@@ -0,0 +1,247 @@

				-- This is an HTTP/2 tracer for a TCP proxy. It will decode the frames that are

				-- exchanged between the client and the server and indicate their direction,

				-- types, flags and lengths. Lines are prefixed with a connection number modulo

				-- 4096 that allows to sort out multiplexed exchanges. In order to use this,

				-- simply load this file in the global section and use it from a TCP proxy:

				--

				--   global

				--       lua-load "dev/h2/h2-tracer.lua"

				--

				--   listen h2_sniffer

				--       mode tcp

				--       bind :8002

				--       filter lua.h2-tracer #hex

				--       server s1 127.0.0.1:8003

				--

				-- define the decoder's class here

				Dec = {}

				Dec.id = "Lua H2 tracer"

				Dec.flags = 0

				Dec.__index = Dec

				Dec.args = {}  -- args passed by the filter's declaration

				Dec.cid = 0    -- next connection ID

				-- prefix to indent responses

				res_pfx = "                                         | "

				-- H2 frame types

				h2ft = {

				    [0] = "DATA",

				    [1] = "HEADERS",

				    [2] = "PRIORITY",

				    [3] = "RST_STREAM",

				    [4] = "SETTINGS",

				    [5] = "PUSH_PROMISE",

				    [6] = "PING",

				    [7] = "GOAWAY",

				    [8] = "WINDOW_UPDATE",

				    [9] = "CONTINUATION",

				}

				h2ff = {

				    [0] = { [0] = "ES", [3] = "PADDED" }, -- data

				    [1] = { [0] = "ES", [2] = "EH", [3] = "PADDED", [5] = "PRIORITY" }, -- headers

				    [2] = { }, -- priority

				    [3] = { }, -- rst_stream

				    [4] = { [0] = "ACK" }, -- settings

				    [5] = { [2] = "EH", [3] = "PADDED" }, -- push_promise

				    [6] = { [0] = "ACK" }, -- ping

				    [7] = { }, -- goaway

				    [8] = { }, -- window_update

				    [9] = { [2] = "EH" }, -- continuation

				}

				function Dec:new()

				    local dec = {}

				    setmetatable(dec, Dec)

				    dec.do_hex = false

				    if (Dec.args[1] == "hex") then

				        dec.do_hex = true

				    end

				    Dec.cid = Dec.cid+1

				    -- mix the thread number when multithreading.

				    dec.cid = Dec.cid + 64 * core.thread

				    -- state per dir. [1]=req [2]=res

				    dec.st = {

				        [1] = {

				            hdr = { 0, 0, 0, 0, 0, 0, 0, 0, 0 },

				            fofs = 0,

				            flen = 0,

				            ftyp = 0,

				            fflg = 0,

				            sid = 0,

				            tot = 0,

				        },

				        [2] = {

				            hdr = { 0, 0, 0, 0, 0, 0, 0, 0, 0 },

				            fofs = 0,

				            flen = 0,

				            ftyp = 0,

				            fflg = 0,

				            sid = 0,

				            tot = 0,

				        },

				    }

				    return dec

				end

				function Dec:start_analyze(txn, chn)

				    if chn:is_resp() then

				        io.write(string.format("[%03x] ", self.cid % 4096) .. res_pfx .. "### res start\n")

				    else

				        io.write(string.format("[%03x] ", self.cid % 4096) .. "### req start\n")

				    end

				    filter.register_data_filter(self, chn)

				end

				function Dec:end_analyze(txn, chn)

				    if chn:is_resp() then

				        io.write(string.format("[%03x] ", self.cid % 4096) .. res_pfx .. "### res end: " .. self.st[2].tot .. " bytes total\n")

				    else

				        io.write(string.format("[%03x] ", self.cid % 4096) .. "### req end: " ..self.st[1].tot.. " bytes total\n")

				    end

				end

				function Dec:tcp_payload(txn, chn)

				    local data = { }

				    local dofs = 1

				    local pfx = ""

				    local dir = 1

				    local sofs = 0

				    local ft = ""

				    local ff = ""

				    if chn:is_resp() then

				        pfx = res_pfx

				        dir = 2

				    end

				    pfx = string.format("[%03x] ", self.cid % 4096) .. pfx

				    -- stream offset before processing

				    sofs = self.st[dir].tot

				    if (chn:input() > 0) then

				        data = chn:data()

				        self.st[dir].tot = self.st[dir].tot + chn:input()

				    end

				    if (chn:input() > 0 and self.do_hex ~= false) then

				        io.write("\n" .. pfx .. "Hex:\n")

				        for i = 1, #data do

				            if ((i & 7) == 1) then io.write(pfx) end

				            io.write(string.format("0x%02x ", data:sub(i, i):byte()))

				            if ((i & 7) == 0 or i == #data) then io.write("\n") end

				        end

				    end

				    -- start at byte 1 in the <data> string

				    dofs = 1

				    -- the first 24 bytes are expected to be an H2 preface on the request

				    if (dir == 1 and sofs < 24) then

				        -- let's not check it for now

				        local bytes = self.st[dir].tot - sofs

				        if (sofs + self.st[dir].tot >= 24) then

				            -- skip what was missing from the preface

				            dofs = dofs + 24 - sofs

				            sofs = 24

				            io.write(pfx .. "[PREFACE len=24]\n")

				        else

				            -- consume more preface bytes

				            sofs = sofs + self.st[dir].tot

				            return

				        end

				    end

				    -- parse contents as long as there are pending data

				    while true do

				        -- check if we need to consume data from the current frame

				        -- flen is the number of bytes left before the frame's end.

				        if (self.st[dir].flen > 0) then

				            if dofs > #data then return end -- missing data

				            if (#data - dofs + 1 < self.st[dir].flen) then

				                -- insufficient data

				                self.st[dir].flen = self.st[dir].flen - (#data - dofs + 1)

				                io.write(pfx .. string.format("%32s\n", "... -" .. (#data - dofs + 1) .. " = " .. self.st[dir].flen))

				                dofs = #data + 1

				                return

				            else

				                -- enough data to finish

				                if (dofs == 1) then

				                    -- only print a partial size if the frame was interrupted

				                    io.write(pfx .. string.format("%32s\n", "... -" .. self.st[dir].flen .. " = 0"))

				                end

				                dofs = dofs + self.st[dir].flen

				                self.st[dir].flen = 0

				            end

				        end

				        -- here, flen = 0, we're at the beginning of a new frame --

				        -- read possibly missing header bytes until dec.fofs == 9

				        while self.st[dir].fofs < 9 do

				            if dofs > #data then return end -- missing data

				            self.st[dir].hdr[self.st[dir].fofs + 1] = data:sub(dofs, dofs):byte()

				            dofs = dofs + 1

				            self.st[dir].fofs = self.st[dir].fofs + 1

				        end

				        -- we have a full frame header here

				        if (self.do_hex ~= false) then

				            io.write("\n" .. pfx .. string.format("hdr=%02x %02x %02x %02x %02x %02x %02x %02x %02x\n",

				                     self.st[dir].hdr[1], self.st[dir].hdr[2], self.st[dir].hdr[3],

				                     self.st[dir].hdr[4], self.st[dir].hdr[5], self.st[dir].hdr[6],

				                     self.st[dir].hdr[7], self.st[dir].hdr[8], self.st[dir].hdr[9]))

				        end

				        -- we have a full frame header, we'll be ready

				        -- for a new frame once the data is gone

				        self.st[dir].flen = self.st[dir].hdr[1] * 65536 +

				                            self.st[dir].hdr[2] * 256 +

				                            self.st[dir].hdr[3]

				        self.st[dir].ftyp = self.st[dir].hdr[4]

				        self.st[dir].fflg = self.st[dir].hdr[5]

				        self.st[dir].sid  = self.st[dir].hdr[6] * 16777216 +

				                            self.st[dir].hdr[7] * 65536 +

				                            self.st[dir].hdr[8] * 256 +

				                            self.st[dir].hdr[9]

				        self.st[dir].fofs = 0

				        -- decode frame type

				        if self.st[dir].ftyp <= 9 then

				            ft = h2ft[self.st[dir].ftyp]

				        else

				            ft = string.format("TYPE_0x%02x", self.st[dir].ftyp)

				        end

				        -- decode frame flags for frame type <ftyp>

				        ff = ""

				        for i = 7, 0, -1 do

				            if (((self.st[dir].fflg >> i) & 1) ~= 0) then

				                if self.st[dir].ftyp <= 9 and h2ff[self.st[dir].ftyp][i] ~= nil then

				                    ff = ff .. ((ff == "") and "" or "+")

				                    ff = ff .. h2ff[self.st[dir].ftyp][i]

				                else

				                    ff = ff .. ((ff == "") and "" or "+")

				                    ff = ff .. string.format("0x%02x", 1<<i)

				                end

				            end

				        end

				        io.write(pfx .. string.format("[%s %ssid=%u len=%u (bytes=%u)]\n",

				            ft, (ff == "") and "" or ff .. " ",

				            self.st[dir].sid, self.st[dir].flen,

				            (#data - dofs + 1)))

				    end

				end

				core.register_filter("h2-tracer", Dec, function(dec, args)

				    Dec.args = args

				    return dec

				end)
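The core of the tracer is the decoding of the 9-byte frame header: a 24-bit length, one type byte, one flags byte and a 32-bit stream ID. A standalone Python sketch of that step is given below. The tables are copied from h2ft and h2ff; the function name is an assumption and none of the buffering across TCP segments is reproduced.

```python
H2_FRAME_TYPES = ["DATA", "HEADERS", "PRIORITY", "RST_STREAM", "SETTINGS",
                  "PUSH_PROMISE", "PING", "GOAWAY", "WINDOW_UPDATE",
                  "CONTINUATION"]

# flag names indexed by frame type, then by bit position
H2_FRAME_FLAGS = {
    0: {0: "ES", 3: "PADDED"},
    1: {0: "ES", 2: "EH", 3: "PADDED", 5: "PRIORITY"},
    4: {0: "ACK"},
    5: {2: "EH", 3: "PADDED"},
    6: {0: "ACK"},
    9: {2: "EH"},
}

def decode_frame_header(hdr):
    """Decode a 9-byte HTTP/2 frame header into (type, flags, sid, length)."""
    if len(hdr) != 9:
        raise ValueError("an HTTP/2 frame header is exactly 9 bytes")
    flen = int.from_bytes(hdr[0:3], "big")
    ftyp = hdr[3]
    fflg = hdr[4]
    sid = int.from_bytes(hdr[5:9], "big")  # reserved bit kept, as in the tracer
    if ftyp < len(H2_FRAME_TYPES):
        name = H2_FRAME_TYPES[ftyp]
    else:
        name = "TYPE_0x%02x" % ftyp
    flags = []
    for bit in range(7, -1, -1):           # highest bit first, like the Lua loop
        if (fflg >> bit) & 1:
            known = H2_FRAME_FLAGS.get(ftyp, {})
            flags.append(known.get(bit, "0x%02x" % (1 << bit)))
    return name, "+".join(flags), sid, flen
```

For example, the header bytes 00 00 0c 04 00 00 00 00 00 decode to a SETTINGS frame of length 12 on stream 0 with no flags.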

									
										6

dev/haring/haring.c
									
				@@ -59,9 +59,9 @@ struct ring_v2 {

				struct ring_v2a {

					size_t size;         // storage size

					size_t rsvd;         // header length (used for file-backed maps)

					size_t tail __attribute__((aligned(64)));         // storage tail

					size_t head __attribute__((aligned(64)));         // storage head

					char area[0] __attribute__((aligned(64)));        // storage area begins immediately here

					size_t tail ALIGNED(64);         // storage tail

					size_t head ALIGNED(64);         // storage head

					char area[0] ALIGNED(64);        // storage area begins immediately here

				};

				/* display the message and exit with the code */

									
										31

dev/ncpu/Makefile
									
										Normal file
									
				@@ -0,0 +1,31 @@

				include ../../include/make/verbose.mk

				CC       = cc

				OPTIMIZE = -O2 -g

				DEFINE   =

				INCLUDE  =

				OBJS     = ncpu.so ncpu

				OBJDUMP  = objdump

				all:	$(OBJS)

				%.o: %.c

					$(cmd_CC) $(OPTIMIZE) $(DEFINE) $(INCLUDE) -shared -fPIC -c -o $@ $^

				%.so: %.o

					$(cmd_CC) -pie -o $@ $^

					$(Q)rm -f $^

				%: %.so

					$(call qinfo, PATCHING)set -- $$($(OBJDUMP) -j .dynamic -h $^ | fgrep .dynamic); \

					  ofs=$$6; size=$$3; \

					  dd status=none bs=1 count=$$((0x$$ofs)) if=$^ of=$^-p1; \

					  dd status=none bs=1 skip=$$((0x$$ofs)) count=$$((0x$$size)) if=$^ of=$^-p2; \

					  dd status=none bs=1 skip=$$((0x$$ofs+0x$$size)) if=$^ of=$^-p3; \

					  sed -e 's,\xfb\xff\xff\x6f\x00\x00\x00\x00\x00\x00\x00\x08,\xfb\xff\xff\x6f\x00\x00\x00\x00\x00\x00\x00\x00,g' < $^-p2 > $^-p2-patched; \

					  cat $^-p1 $^-p2-patched $^-p3 > "$@"

					$(Q)rm -f $^-p*

					$(Q)chmod 755 "$@"

				clean:

					rm -f $(OBJS) *.[oas] *.so-* *~
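The PATCHING rule splits the file around the .dynamic section, rewrites one 12-byte pattern inside that section only, and concatenates the three parts again. On a little-endian 64-bit ELF, the pattern appears to be the DT_FLAGS_1 tag (0x6ffffffb) followed by the DF_1_PIE bit (0x08000000), which the rule clears. That reading is an interpretation and the Makefile does not state it. A Python sketch of the byte-level operation, with hypothetical names:

```python
# 8-byte tag followed by the low 4 bytes of the value, as in the sed pattern
PIE_ENTRY = bytes.fromhex("fbffff6f00000000" "00000008")
CLEARED_ENTRY = bytes.fromhex("fbffff6f00000000" "00000000")

def patch_dynamic(image, offset, size):
    """Rewrite the pattern inside image[offset:offset+size] only.

    `offset` and `size` stand for the values the rule reads from
    objdump's section listing; the file length is unchanged.
    """
    head = image[:offset]
    section = image[offset:offset + size]
    tail = image[offset + size:]
    return head + section.replace(PIE_ENTRY, CLEARED_ENTRY) + tail
```

Restricting the replacement to the section avoids touching an identical byte sequence that might occur elsewhere in the file.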

									
										136

dev/ncpu/ncpu.c
									
										Normal file
									
				@@ -0,0 +1,136 @@

				#define _GNU_SOURCE

				#include <errno.h>

				#include <limits.h>

				#include <sched.h>

				#include <stdio.h>

				#include <stdlib.h>

				#include <string.h>

				#include <unistd.h>

				// gcc -fPIC -shared -O2 -o ncpu{.so,.c}

				// NCPU=16 LD_PRELOAD=$PWD/ncpu.so command args...

				static char prog_full_path[PATH_MAX];

				long sysconf(int name)

				{

					if (name == _SC_NPROCESSORS_ONLN ||

					    name == _SC_NPROCESSORS_CONF) {

						const char *ncpu = getenv("NCPU");

						int n;

						n = ncpu ? atoi(ncpu) : CPU_SETSIZE;

						if (n < 0 || n > CPU_SETSIZE)

							n = CPU_SETSIZE;

						return n;

					}

					errno = EINVAL;

					return -1;

				}

				/* return a cpu_set having the first $NCPU set */

				int sched_getaffinity(pid_t pid, size_t cpusetsize, cpu_set_t *mask)

				{

					const char *ncpu;

					int i, n;

					CPU_ZERO_S(cpusetsize, mask);

					ncpu = getenv("NCPU");

					n = ncpu ? atoi(ncpu) : CPU_SETSIZE;

					if (n < 0 || n > CPU_SETSIZE)

						n = CPU_SETSIZE;

					for (i = 0; i < n; i++)

						CPU_SET_S(i, cpusetsize, mask);

					return 0;

				}

				/* silently ignore the operation */

				int sched_setaffinity(pid_t pid, size_t cpusetsize, const cpu_set_t *mask)

				{

					return 0;

				}

				void usage(const char *argv0)

				{

					fprintf(stderr,

						"Usage: %s [-n ncpu] [cmd [args...]]\n"

						"       Will install itself in LD_PRELOAD before calling <cmd> with args.\n"

						"       The number of CPUs may also come from variable NCPU or default to %d.\n"

						"\n"

						"",

						argv0, CPU_SETSIZE);

					exit(1);

				}

				/* Called in wrapper mode, no longer supported on recent glibc */

				int main(int argc, char **argv)

				{

					const char *argv0 = argv[0];

					char *preload;

					int plen;

					prog_full_path[0] = 0;

					plen = readlink("/proc/self/exe", prog_full_path, sizeof(prog_full_path) - 1);

					if (plen != -1)

						prog_full_path[plen] = 0;

					else

						plen = snprintf(prog_full_path, sizeof(prog_full_path), "%s", argv[0]);

					while (1) {

						argc--;

						argv++;

						if (argc < 1)

							usage(argv0);

						if (strcmp(argv[0], "--") == 0) {

							argc--;

							argv++;

							break;

						}

						else if (strcmp(argv[0], "-n") == 0) {

							if (argc < 2)

								usage(argv0);

							if (setenv("NCPU", argv[1], 1) != 0)

								usage(argv0);

							argc--;

							argv++;

						}

						else {

							/* unknown arg, that's the command */

							break;

						}

					}

					/* here the only args left start with the cmd name */

					/* now we'll concatenate ourselves at the end of the LD_PRELOAD variable */

					preload = getenv("LD_PRELOAD");

					if (preload) {

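						int olen = strlen(preload);

						char *old = preload;

						/* getenv() storage is not malloc'ed: copy it instead of realloc() */

						preload = malloc(olen + 1 + plen + 1);

						if (!preload) {

							perror("malloc");

							exit(2);

						}

						memcpy(preload, old, olen);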

						preload[olen] = ' ';

						memcpy(preload + olen + 1, prog_full_path, plen);

						preload[olen + 1 + plen] = 0;

					}

					else {

						preload = prog_full_path;

					}

					if (setenv("LD_PRELOAD", preload, 1) < 0) {

						perror("setenv");

						exit(2);

					}

					execvp(*argv, argv);

					perror("execve");

					exit(2);

				}
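Both sysconf() and sched_getaffinity() above apply the same rule: take $NCPU if set, and fall back to CPU_SETSIZE when the value is negative or too large. A Python sketch of that rule follows. CPU_SETSIZE is assumed to be 1024 (glibc's usual value), and int() only approximates atoi(), which would also accept a numeric prefix such as "12abc".

```python
CPU_SETSIZE = 1024  # assumption: the usual glibc value

def effective_ncpu(env):
    """Return the CPU count the preloaded wrappers would report."""
    raw = env.get("NCPU")
    if raw is None:
        return CPU_SETSIZE
    try:
        n = int(raw)
    except ValueError:
        n = 0  # atoi() yields 0 when the string has no leading digits
    if n < 0 or n > CPU_SETSIZE:
        n = CPU_SETSIZE
    return n
```

Note that NCPU=0 is accepted as-is by the C code, so the sketch returns 0 in that case as well.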

4

dev/patchbot/prompts/prompt15-3.0-mist7bv2-pfx.txt → dev/patchbot/prompts/prompt15-3.1-mist7bv2-pfx.txt

 @@ -14,11 +14,11 @@ that are picked from the development branch.
 Branches are numbered in 0.1 increments. Every 6 months, upon a new major
 release, the development branch enters maintenance and a new development branch
 is created with a new, higher version. The current development branch is
 3.0-dev, and maintenance branches are 2.9 and below.
 3.1-dev, and maintenance branches are 3.0 and below.
 Fixes created in the development branch for issues that were introduced in an
 earlier branch are applied in descending order to each and every version till
 that branch that introduced the issue: 2.9 first, then 2.8, then 2.7 and so
 that branch that introduced the issue: 3.0 first, then 2.9, then 2.8 and so
 on. This operation is called "backporting". A fix for an issue is never
 backported beyond the branch that introduced the issue. An important point is
 that the project maintainers really aim at zero regression in maintenance

2

dev/patchbot/prompts/prompt15-3.0-mist7bv2-sfx.txt → dev/patchbot/prompts/prompt15-3.1-mist7bv2-sfx.txt

 @@ -17,7 +17,7 @@ Finally, based on your analysis, give your general conclusion as "Conclusion: X"
 where X is a single word among:
   - "yes", if you recommend to backport the patch right now either because
     it explicitly states this or because it's a fix for a bug that affects
     a maintenance branch (2.9 or lower);
     a maintenance branch (3.0 or lower);
   - "wait", if this patch explicitly mentions that it must be backported, but
     only after waiting some time.
   - "no", if nothing clearly indicates a necessity to backport this patch (e.g.

70

dev/patchbot/prompts/prompt15-3.2-mist7bv2-pfx.txt Normal file

 @@ -0,0 +1,70 @@
 BEGININPUT
 BEGINCONTEXT
 HAProxy's development cycle consists in one development branch, and multiple
 maintenance branches.
 All the development is made into the development branch exclusively. This
 includes mostly new features, doc updates, cleanups and of course, fixes.
 The maintenance branches, also called stable branches, never see any
 development, and only receive ultra-safe fixes for bugs that affect them,
 that are picked from the development branch.
 Branches are numbered in 0.1 increments. Every 6 months, upon a new major
 release, the development branch enters maintenance and a new development branch
 is created with a new, higher version. The current development branch is
 3.2-dev, and maintenance branches are 3.1 and below.
 Fixes created in the development branch for issues that were introduced in an
 earlier branch are applied in descending order to each and every version till
 that branch that introduced the issue: 3.1 first, then 3.0, then 2.9, then 2.8
 and so on. This operation is called "backporting". A fix for an issue is never
 backported beyond the branch that introduced the issue. An important point is
 that the project maintainers really aim at zero regression in maintenance
 branches, so they're never willing to take any risk backporting patches that
 are not deemed strictly necessary.
 Fixes consist of patches managed using the Git version control tool and are
 identified by a Git commit ID and a commit message. For this reason we
 indistinctly talk about backporting fixes, commits, or patches; all mean the
 same thing. When mentioning commit IDs, developers always use a short form
 made of the first 8 characters only, and expect the AI assistant to do the
 same.
 It seldom happens that some fixes depend on changes that were brought by other
 patches that were not in some branches and that will need to be backported as
 well for the fix to work. In this case, such information is explicitly provided
 in the commit message by the patch's author in natural language.
 Developers are serious and always indicate if a patch needs to be backported.
 Sometimes they omit the exact target branch, or they will say that the patch is
 "needed" in some older branch, but it means the same. If a commit message
 doesn't mention any backport instructions, it means that the commit does not
 have to be backported. And patches that are not strictly bug fixes nor doc
 improvements are normally not backported. For example, fixes for design
 limitations, architectural improvements and performance optimizations are
 considered too risky for a backport. Finally, all bug fixes are tagged as
 "BUG" at the beginning of their subject line. Patches that are not tagged as
 such are not bugs, and must never be backported unless their commit message
 explicitly requests so.
 ENDCONTEXT
 A developer is reviewing the development branch, trying to spot which commits
 need to be backported to maintenance branches. This person is already expert
 on HAProxy and everything related to Git, patch management, and the risks
 associated with backports, so he doesn't want to be told how to proceed nor to
 review the contents of the patch.
 The goal for this developer is to get some help from the AI assistant to save
 some precious time on this tedious review work. In order to do a better job, he
 needs an accurate summary of the information and instructions found in each
 commit message. Specifically he needs to figure if the patch fixes a problem
 affecting an older branch or not, if it needs to be backported, if so to which
 branches, and if other patches need to be backported along with it.
 The indented text block below after an "id" line and starting with a Subject line
 is a commit message from the HAProxy development branch that describes a patch
 applied to that branch, starting with its subject line, please read it carefully.
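The backporting rule stated in this prompt (newest maintenance branch first, down to and including the branch that introduced the issue, never beyond) can be sketched as a small Python function. The function name and the explicit branch list are assumptions made for illustration; this is not part of the patchbot scripts.

```python
def backport_targets(maintenance, introduced_in):
    """Return the branches a fix should reach, newest first.

    `maintenance` lists maintenance branches in descending order,
    e.g. ["3.1", "3.0", "2.9", "2.8"]. An issue introduced in the
    development branch itself needs no backport.
    """
    if introduced_in not in maintenance:
        return []
    last = maintenance.index(introduced_in)
    return maintenance[:last + 1]
```

For an issue introduced in 2.9 while 3.2-dev is the development branch, this gives 3.1, then 3.0, then 2.9.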

29

dev/patchbot/prompts/prompt15-3.2-mist7bv2-sfx.txt Normal file

 @@ -0,0 +1,29 @@
 ENDINPUT
 BEGININSTRUCTION
 You are an AI assistant that follows instruction extremely well. Help as much
 as you can, responding to a single question using a single response.
 The developer wants to know if he needs to backport the patch above to fix
 maintenance branches, for which branches, and what possible dependencies might
 be mentioned in the commit message. Carefully study the commit message and its
 backporting instructions if any (otherwise it should probably not be backported),
 then provide a very concise and short summary that will help the developer decide
 to backport it, or simply to skip it.
 Start by explaining in one or two sentences what you recommend for this one and why.
 Finally, based on your analysis, give your general conclusion as "Conclusion: X"
 where X is a single word among:
   - "yes", if you recommend to backport the patch right now either because
     it explicitly states this or because it's a fix for a bug that affects
     a maintenance branch (3.1 or lower);
   - "wait", if this patch explicitly mentions that it must be backported, but
     only after waiting some time.
   - "no", if nothing clearly indicates a necessity to backport this patch (e.g.
      lack of explicit backport instructions, or it's just an improvement);
   - "uncertain" otherwise for cases not covered above
 ENDINSTRUCTION
 Explanation:

70

dev/patchbot/prompts/prompt15-3.3-mist7bv2-pfx.txt Normal file

 @ -0,0 +1,70 @@
 BEGININPUT
 BEGINCONTEXT
 HAProxy's development cycle consists in one development branch, and multiple
 maintenance branches.
 All the development is made into the development branch exclusively. This
 includes mostly new features, doc updates, cleanups and of course, fixes.
 The maintenance branches, also called stable branches, never see any
 development, and only receive ultra-safe fixes for bugs that affect them,
 that are picked from the development branch.
 Branches are numbered in 0.1 increments. Every 6 months, upon a new major
 release, the development branch enters maintenance and a new development branch
 is created with a new, higher version. The current development branch is
 3.3-dev, and maintenance branches are 3.2 and below.
 Fixes created in the development branch for issues that were introduced in an
 earlier branch are applied in descending order to each and every version till
 that branch that introduced the issue: 3.2 first, then 3.1, then 3.0, then 2.9
 and so on. This operation is called "backporting". A fix for an issue is never
 backported beyond the branch that introduced the issue. An important point is
 that the project maintainers really aim at zero regression in maintenance
 branches, so they're never willing to take any risk backporting patches that
 are not deemed strictly necessary.
 Fixes consist of patches managed using the Git version control tool and are
 identified by a Git commit ID and a commit message. For this reason we
 indistinctly talk about backporting fixes, commits, or patches; all mean the
 same thing. When mentioning commit IDs, developers always use a short form
 made of the first 8 characters only, and expect the AI assistant to do the
 same.
 It seldom happens that some fixes depend on changes that were brought by other
 patches that were not in some branches and that will need to be backported as
 well for the fix to work. In this case, such information is explicitly provided
 in the commit message by the patch's author in natural language.
 Developers are serious and always indicate if a patch needs to be backported.
 Sometimes they omit the exact target branch, or they will say that the patch is
 "needed" in some older branch, but it means the same. If a commit message
 doesn't mention any backport instructions, it means that the commit does not
 have to be backported. And patches that are not strictly bug fixes nor doc
 improvements are normally not backported. For example, fixes for design
 limitations, architectural improvements and performance optimizations are
 considered too risky for a backport. Finally, all bug fixes are tagged as
 "BUG" at the beginning of their subject line. Patches that are not tagged as
 such are not bugs, and must never be backported unless their commit message
 explicitly requests so.
 ENDCONTEXT
 A developer is reviewing the development branch, trying to spot which commits
 need to be backported to maintenance branches. This person is already expert
 on HAProxy and everything related to Git, patch management, and the risks
 associated with backports, so he doesn't want to be told how to proceed nor to
 review the contents of the patch.
 The goal for this developer is to get some help from the AI assistant to save
 some precious time on this tedious review work. In order to do a better job, he
 needs an accurate summary of the information and instructions found in each
 commit message. Specifically he needs to figure if the patch fixes a problem
 affecting an older branch or not, if it needs to be backported, if so to which
 branches, and if other patches need to be backported along with it.
 The indented text block below after an "id" line and starting with a Subject line
 is a commit message from the HAProxy development branch that describes a patch
 applied to that branch, starting with its subject line, please read it carefully.

29

dev/patchbot/prompts/prompt15-3.3-mist7bv2-sfx.txt Normal file

 @ -0,0 +1,29 @@
 ENDINPUT
 BEGININSTRUCTION
 You are an AI assistant that follows instruction extremely well. Help as much
 as you can, responding to a single question using a single response.
 The developer wants to know if he needs to backport the patch above to fix
 maintenance branches, for which branches, and what possible dependencies might
 be mentioned in the commit message. Carefully study the commit message and its
 backporting instructions if any (otherwise it should probably not be backported),
 then provide a very concise and short summary that will help the developer decide
 to backport it, or simply to skip it.
 Start by explaining in one or two sentences what you recommend for this one and why.
 Finally, based on your analysis, give your general conclusion as "Conclusion: X"
 where X is a single word among:
   - "yes", if you recommend to backport the patch right now either because
     it explicitly states this or because it's a fix for a bug that affects
     a maintenance branch (3.2 or lower);
   - "wait", if this patch explicitly mentions that it must be backported, but
     only after waiting some time.
   - "no", if nothing clearly indicates a necessity to backport this patch (e.g.
      lack of explicit backport instructions, or it's just an improvement);
   - "uncertain" otherwise for cases not covered above
 ENDINSTRUCTION
 Explanation:

70

dev/patchbot/prompts/prompt15-3.4-mist7bv2-pfx.txt Normal file

 @ -0,0 +1,70 @@
 BEGININPUT
 BEGINCONTEXT
 HAProxy's development cycle consists in one development branch, and multiple
 maintenance branches.
 All the development is made into the development branch exclusively. This
 includes mostly new features, doc updates, cleanups and of course, fixes.
 The maintenance branches, also called stable branches, never see any
 development, and only receive ultra-safe fixes for bugs that affect them,
 that are picked from the development branch.
 Branches are numbered in 0.1 increments. Every 6 months, upon a new major
 release, the development branch enters maintenance and a new development branch
 is created with a new, higher version. The current development branch is
 3.4-dev, and maintenance branches are 3.3 and below.
 Fixes created in the development branch for issues that were introduced in an
 earlier branch are applied in descending order to each and every version till
 that branch that introduced the issue: 3.3 first, then 3.2, then 3.1, then 3.0
 and so on. This operation is called "backporting". A fix for an issue is never
 backported beyond the branch that introduced the issue. An important point is
 that the project maintainers really aim at zero regression in maintenance
 branches, so they're never willing to take any risk backporting patches that
 are not deemed strictly necessary.
 Fixes consist of patches managed using the Git version control tool and are
 identified by a Git commit ID and a commit message. For this reason we
 indistinctly talk about backporting fixes, commits, or patches; all mean the
 same thing. When mentioning commit IDs, developers always use a short form
 made of the first 8 characters only, and expect the AI assistant to do the
 same.
 It seldom happens that some fixes depend on changes that were brought by other
 patches that were not in some branches and that will need to be backported as
 well for the fix to work. In this case, such information is explicitly provided
 in the commit message by the patch's author in natural language.
 Developers are serious and always indicate if a patch needs to be backported.
 Sometimes they omit the exact target branch, or they will say that the patch is
 "needed" in some older branch, but it means the same. If a commit message
 doesn't mention any backport instructions, it means that the commit does not
 have to be backported. And patches that are not strictly bug fixes nor doc
 improvements are normally not backported. For example, fixes for design
 limitations, architectural improvements and performance optimizations are
 considered too risky for a backport. Finally, all bug fixes are tagged as
 "BUG" at the beginning of their subject line. Patches that are not tagged as
 such are not bugs, and must never be backported unless their commit message
 explicitly requests so.
 ENDCONTEXT
 A developer is reviewing the development branch, trying to spot which commits
 need to be backported to maintenance branches. This person is already expert
 on HAProxy and everything related to Git, patch management, and the risks
 associated with backports, so he doesn't want to be told how to proceed nor to
 review the contents of the patch.
 The goal for this developer is to get some help from the AI assistant to save
 some precious time on this tedious review work. In order to do a better job, he
 needs an accurate summary of the information and instructions found in each
 commit message. Specifically he needs to figure if the patch fixes a problem
 affecting an older branch or not, if it needs to be backported, if so to which
 branches, and if other patches need to be backported along with it.
 The indented text block below after an "id" line and starting with a Subject line
 is a commit message from the HAProxy development branch that describes a patch
 applied to that branch, starting with its subject line, please read it carefully.

29

dev/patchbot/prompts/prompt15-3.4-mist7bv2-sfx.txt Normal file

 @ -0,0 +1,29 @@
 ENDINPUT
 BEGININSTRUCTION
 You are an AI assistant that follows instruction extremely well. Help as much
 as you can, responding to a single question using a single response.
 The developer wants to know if he needs to backport the patch above to fix
 maintenance branches, for which branches, and what possible dependencies might
 be mentioned in the commit message. Carefully study the commit message and its
 backporting instructions if any (otherwise it should probably not be backported),
 then provide a very concise and short summary that will help the developer decide
 to backport it, or simply to skip it.
 Start by explaining in one or two sentences what you recommend for this one and why.
 Finally, based on your analysis, give your general conclusion as "Conclusion: X"
 where X is a single word among:
   - "yes", if you recommend to backport the patch right now either because
     it explicitly states this or because it's a fix for a bug that affects
     a maintenance branch (3.3 or lower);
   - "wait", if this patch explicitly mentions that it must be backported, but
     only after waiting some time.
   - "no", if nothing clearly indicates a necessity to backport this patch (e.g.
      lack of explicit backport instructions, or it's just an improvement);
   - "uncertain" otherwise for cases not covered above
 ENDINSTRUCTION
 Explanation:
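
The suffix prompts above ask the model to end its answer with a line of the form "Conclusion: X", where X is one of four words. A minimal sketch, in C, of how a consumer of such a response might extract that verdict. The function name `parse_conclusion` and the `enum verdict` values are hypothetical and are not taken from the patchbot scripts.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical verdict codes mirroring the four words allowed by the prompt. */
enum verdict { V_NONE = 0, V_NO, V_UNCERTAIN, V_WAIT, V_YES };

/* Returns the verdict found after the last "Conclusion:" marker in <resp>,
 * or V_NONE when the marker is missing or followed by an unknown word.
 */
enum verdict parse_conclusion(const char *resp)
{
	static const struct { const char *word; enum verdict v; } words[] = {
		{ "no", V_NO }, { "uncertain", V_UNCERTAIN },
		{ "wait", V_WAIT }, { "yes", V_YES },
	};
	const char *marker = "Conclusion:";
	const char *p, *last = NULL;
	char word[16];
	size_t i, n = 0;

	/* keep the last occurrence, in case the explanation quotes the marker */
	for (p = resp; (p = strstr(p, marker)) != NULL; p += strlen(marker))
		last = p + strlen(marker);
	if (!last)
		return V_NONE;

	/* skip blanks and an optional opening quote */
	while (*last == ' ' || *last == '\t' || *last == '"')
		last++;

	/* copy the word in lower case, bounded by the local buffer */
	while (isalpha((unsigned char)last[n]) && n < sizeof(word) - 1) {
		word[n] = tolower((unsigned char)last[n]);
		n++;
	}
	word[n] = 0;

	for (i = 0; i < sizeof(words) / sizeof(words[0]); i++)
		if (strcmp(word, words[i].word) == 0)
			return words[i].v;
	return V_NONE;
}
```

Taking the last marker rather than the first is a design choice of this sketch: the prompt asks for the explanation first and the conclusion last, so an earlier mention of the marker is more likely to be a quotation.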

									
										51

dev/patchbot/scripts/post-ai.sh
									
				@ -150,11 +150,14 @@ function updt_table(line) {

				  var w = document.getElementById("sh_w").checked;

				  var y = document.getElementById("sh_y").checked;

				  var tn = 0, tu = 0, tw = 0, ty = 0;

				  var bn = 0, bu = 0, bw = 0, by = 0;

				  var i, el;

				  for (i = 1; i < nb_patches; i++) {

				    if (document.getElementById("bt_" + i + "_n").checked) {

				      tn++;

				      if (bkp[i])

				         bn++;

				      if (line && i != line)

				        continue;

				      el = document.getElementById("tr_" + i);

				@ -163,6 +166,8 @@ function updt_table(line) {

				    }

				    else if (document.getElementById("bt_" + i + "_u").checked) {

				      tu++;

				      if (bkp[i])

				         bu++;

				      if (line && i != line)

				        continue;

				      el = document.getElementById("tr_" + i);

				@ -171,6 +176,8 @@ function updt_table(line) {

				    }

				    else if (document.getElementById("bt_" + i + "_w").checked) {

				      tw++;

				      if (bkp[i])

				         bw++;

				      if (line && i != line)

				        continue;

				      el = document.getElementById("tr_" + i);

				@ -179,6 +186,8 @@ function updt_table(line) {

				    }

				    else if (document.getElementById("bt_" + i + "_y").checked) {

				      ty++;

				      if (bkp[i])

				         by++;

				      if (line && i != line)

				        continue;

				      el = document.getElementById("tr_" + i);

				@ -198,6 +207,18 @@ function updt_table(line) {

				  document.getElementById("cnt_u").innerText = tu;

				  document.getElementById("cnt_w").innerText = tw;

				  document.getElementById("cnt_y").innerText = ty;

				  document.getElementById("cnt_bn").innerText = bn;

				  document.getElementById("cnt_bu").innerText = bu;

				  document.getElementById("cnt_bw").innerText = bw;

				  document.getElementById("cnt_by").innerText = by;

				  document.getElementById("cnt_bt").innerText = bn + bu + bw + by;

				  document.getElementById("cnt_nbn").innerText = tn - bn;

				  document.getElementById("cnt_nbu").innerText = tu - bu;

				  document.getElementById("cnt_nbw").innerText = tw - bw;

				  document.getElementById("cnt_nby").innerText = ty - by;

				  document.getElementById("cnt_nbt").innerText = tn - bn + tu - bu + tw - bw + ty - by;

				}

				function updt_output() {

				@ -236,23 +257,47 @@ function updt(line,value) {

				  updt_output();

				}

				function show_only(b,n,u,w,y) {

				    document.getElementById("sh_b").checked = !!b;

				    document.getElementById("sh_n").checked = !!n;

				    document.getElementById("sh_u").checked = !!u;

				    document.getElementById("sh_w").checked = !!w;

				    document.getElementById("sh_y").checked = !!y;

				    document.getElementById("show_all").checked = true;

				    updt(0,"r");

				}

				// -->

				</script>

				</HEAD>

				EOF

				echo "<BODY>"

				echo -n "<table cellpadding=3 cellspacing=5 style='font-size: 150%;'><tr><th align=left>Backported</th>"

				echo -n "<td style='background-color:$BG_N'><a href='#' onclick='show_only(1,1,0,0,0);'> N: <span id='cnt_bn'>0</span> </a></td>"

				echo -n "<td style='background-color:$BG_U'><a href='#' onclick='show_only(1,0,1,0,0);'> U: <span id='cnt_bu'>0</span> </a></td>"

				echo -n "<td style='background-color:$BG_W'><a href='#' onclick='show_only(1,0,0,1,0);'> W: <span id='cnt_bw'>0</span> </a></td>"

				echo -n "<td style='background-color:$BG_Y'><a href='#' onclick='show_only(1,0,0,0,1);'> Y: <span id='cnt_by'>0</span> </a></td>"

				echo -n "<td>total: <span id='cnt_bt'>0</span></td>"

				echo "</tr><tr>"

				echo -n "<th align=left>Not backported</th>"

				echo -n "<td style='background-color:$BG_N'><a href='#' onclick='show_only(0,1,0,0,0);'> N: <span id='cnt_nbn'>0</span> </a></td>"

				echo -n "<td style='background-color:$BG_U'><a href='#' onclick='show_only(0,0,1,0,0);'> U: <span id='cnt_nbu'>0</span> </a></td>"

				echo -n "<td style='background-color:$BG_W'><a href='#' onclick='show_only(0,0,0,1,0);'> W: <span id='cnt_nbw'>0</span> </a></td>"

				echo -n "<td style='background-color:$BG_Y'><a href='#' onclick='show_only(0,0,0,0,1);'> Y: <span id='cnt_nby'>0</span> </a></td>"

				echo -n "<td>total: <span id='cnt_nbt'>0</span></td>"

				echo "</tr></table><P/>"

				echo -n "<big><big>Show:"

				echo -n " <span style='background-color:$BG_B'><input type='checkbox' onclick='updt_table(0);' id='sh_b' checked />B (${#bkp[*]})</span> "

				echo -n " <span style='background-color:$BG_N'><input type='checkbox' onclick='updt_table(0);' id='sh_n' checked />N (<span id='cnt_n'>0</span>)</span> "

				echo -n " <span style='background-color:$BG_U'><input type='checkbox' onclick='updt_table(0);' id='sh_u' checked />U (<span id='cnt_u'>0</span>)</span> "

				echo -n " <span style='background-color:$BG_W'><input type='checkbox' onclick='updt_table(0);' id='sh_w' checked />W (<span id='cnt_w'>0</span>)</span> "

				echo -n " <span style='background-color:$BG_Y'><input type='checkbox' onclick='updt_table(0);' id='sh_y' checked />Y (<span id='cnt_y'>0</span>)</span> "

				echo -n "</big/></big> (B=show backported, N=no/drop, U=uncertain, W=wait/next, Y=yes/pick"

				echo -n "</big/></big><br/>(B=show backported, N=no/drop, U=uncertain, W=wait/next, Y=yes/pick"

				echo ")<P/>"

				echo "<TABLE COLS=5 BORDER=1 CELLSPACING=0 CELLPADDING=3>"

				echo "<TR><TH>All<br/><input type='radio' name='review' onclick='updt(0,\"r\");' checked title='Start review here'/></TH><TH>CID</TH><TH>Subject</TH><TH>Verdict<BR>N U W Y</BR></TH><TH>Reason</TH></TR>"

				echo "<TR><TH>All<br/><input type='radio' name='review' id='show_all' onclick='updt(0,\"r\");' checked title='Start review here'/></TH><TH>CID</TH><TH>Subject</TH><TH>Verdict<BR>N U W Y</BR></TH><TH>Reason</TH></TR>"

				seq_num=1; do_check=1; review=0;

				for patch in "${PATCHES[@]}"; do

				        # try to retrieve the patch's numbering (0001-9999)

				@ -335,7 +380,7 @@ for patch in "${PATCHES[@]}"; do

				        resp=$(echo "$resp" | sed -e "s|#\([0-9]\{1,5\}\)|<a href='${ISSUES}\1'>#\1</a>|g")

				        # put links to commit IDs

				        resp=$(echo "$resp" | sed -e "s|\([0-9a-f]\{8,40\}\)|<a href='${GITURL}\1'>\1</a>|g")

				        resp=$(echo "$resp" | sed -e "s|\([0-9a-f]\{7,40\}\)|<a href='${GITURL}\1'>\1</a>|g")

				        echo -n "<TD nowrap align=center ${bkp[$cid]:+style='background-color:${BG_B}'}>$seq_num<BR/>"

				        echo -n "<input type='radio' name='review' onclick='updt($seq_num,\"r\");' ${do_check:+checked} title='Start review here'/></TD>"
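
The last hunk above lowers the minimum length of a linked commit ID from 8 to 7 lower-case hex digits in the sed expression. A minimal sketch, in C, of the same leftmost-match rule, to show what the bound changes. The helper names `is_lhex` and `find_commit_id` are hypothetical, and the 7-digit ID used in the note below is only an illustration.

```c
#include <stddef.h>

/* lower-case hex digit, as in the sed class [0-9a-f] */
static int is_lhex(char c)
{
	return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f');
}

/* Returns a pointer to the first run of at least <min_len> lower-case hex
 * digits in <s> and stores the matched length (capped at 40) into <len>,
 * or returns NULL if there is no such run.
 */
const char *find_commit_id(const char *s, size_t min_len, size_t *len)
{
	while (*s) {
		size_t run = 0;

		while (is_lhex(s[run]))
			run++;
		if (run >= min_len) {
			*len = run > 40 ? 40 : run;
			return s;
		}
		/* skip the too-short run, or a single non-hex character */
		s += run ? run : 1;
	}
	return NULL;
}
```

With the text "fixed by commit 0f99e34 earlier", a minimum of 8 finds nothing, while a minimum of 7 returns the position of "0f99e34" with a length of 7. Like the sed expression, the sketch does not check word boundaries, so any long enough run of hex digits is treated as a commit ID.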

									
										3

dev/patchbot/scripts/update-3.0.sh
									
				@ -22,7 +22,8 @@ STABLE=$(cd "$HAPROXY_DIR" && git describe --tags "v${BRANCH}-dev0^" |cut -f1,2

				PATCHES_DIR="$PATCHES_PFX"-"$BRANCH"

				(cd "$HAPROXY_DIR"

				 git pull

				 # avoid git pull, it chokes on forced push

				 git remote update origin; git reset origin/master;git checkout -f

				 last_file=$(ls -1 "$PATCHES_DIR"/*.patch 2>/dev/null | tail -n1)

				 if [ -n "$last_file" ]; then

					restart=$(head -n1 "$last_file" | cut -f2 -d' ')

									
										4

dev/phash/phash.c
									
				@ -17,9 +17,9 @@

				//const int codes[CODES] = { 200,400,401,403,404,405,407,408,410,413,421,422,425,429,500,501,502,503,504};

				#define CODES 32

				const int codes[CODES] = { 200,400,401,403,404,405,407,408,410,413,421,422,425,429,500,501,502,503,504,

				const int codes[CODES] = { 200,400,401,403,404,405,407,408,410,413,414,421,422,425,429,431,500,501,502,503,504,

					/* padding entries below, which will fall back to the default code */

					-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1};

					-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1};

				unsigned mul, xor;

				unsigned bmul = 0, bxor = 0;
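
The hunk above adds two status codes (414 and 431) and drops two padding entries, so the table still holds CODES (32) entries: 21 real codes plus 11 padding values. A minimal sketch, in C, of a compile-time check for that arithmetic, assuming a C11 compiler. The array size is deliberately left to the compiler here, unlike in phash.c, and `count_real_codes` is a hypothetical helper.

```c
#include <stddef.h>

#define CODES 32

/* same initializer as the updated line in the diff, but with the size left
 * to the compiler so that the number of entries can be checked
 */
static const int codes[] = { 200,400,401,403,404,405,407,408,410,413,414,421,422,425,429,431,500,501,502,503,504,
	/* padding entries below, which will fall back to the default code */
	-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1};

/* fails to compile if codes and padding no longer add up to CODES */
_Static_assert(sizeof(codes) / sizeof(codes[0]) == CODES,
               "codes[] must hold exactly CODES entries");

/* number of real status codes, i.e. entries that are not -1 padding */
size_t count_real_codes(void)
{
	size_t i, n = 0;

	for (i = 0; i < CODES; i++)
		if (codes[i] >= 0)
			n++;
	return n;
}
```

With an explicit `codes[CODES]` declaration, a missing entry would silently be zero-initialized, which is why this sketch measures the initializer instead.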

									
										233

dev/term_events/term_events.c
									
										Normal file
									
				@ -0,0 +1,233 @@

				#include <stdio.h>

				#include <stdlib.h>

				#include <haproxy/connection-t.h>

				#include <haproxy/intops.h>

				struct tevt_info {

					const char *loc;

					const char **types;

				};

				/* will be sufficient for even largest flag names */

				static char buf[4096];

				static size_t bsz = sizeof(buf);

				static const char *tevt_unknown_types[16] = {

					[ 0] = "-", [ 1] = "-", [ 2] = "-", [ 3] = "-",

					[ 4] = "-", [ 5] = "-", [ 6] = "-", [ 7] = "-",

					[ 8] = "-", [ 9] = "-", [10] = "-", [11] = "-",

					[12] = "-", [13] = "-", [14] = "-", [15] = "-",

				};

				static const char *tevt_fd_types[16] = {

					[ 0] = "-",           [ 1] = "shutw",         [ 2] = "shutr",    [ 3] = "rcv_err",

					[ 4] = "snd_err",     [ 5] = "-",             [ 6] = "-",        [ 7] = "conn_err",

					[ 8] = "intercepted", [ 9] = "conn_poll_err", [10] = "poll_err", [11] = "poll_hup",

					[12] = "-",           [13] = "-",             [14] = "-",        [15] = "-",

				};

				static const char *tevt_hs_types[16] = {

					[ 0] = "-",       [ 1] = "-", [ 2] = "-", [ 3] = "rcv_err",

					[ 4] = "snd_err", [ 5] = "-", [ 6] = "-", [ 7] = "-",

					[ 8] = "-",       [ 9] = "-", [10] = "-", [11] = "-",

					[12] = "-",       [13] = "-", [14] = "-", [15] = "-",

				};

				static const char *tevt_xprt_types[16] = {

					[ 0] = "-",       [ 1] = "shutw", [ 2] = "shutr", [ 3] = "rcv_err",

					[ 4] = "snd_err", [ 5] = "-",     [ 6] = "-",     [ 7] = "-",

					[ 8] = "-",       [ 9] = "-",     [10] = "-",     [11] = "-",

					[12] = "-",       [13] = "-",     [14] = "-",     [15] = "-",

				};

				static const char *tevt_muxc_types[16] = {

					[ 0] = "-",             [ 1] = "shutw",           [ 2] = "shutr",             [ 3] = "rcv_err",

					[ 4] = "snd_err",       [ 5] = "truncated_shutr", [ 6] = "truncated_rcv_err", [ 7] = "tout",

					[ 8] = "goaway_rcvd",   [ 9] = "proto_err",       [10] = "internal_err",      [11] = "other_err",

					[12] = "graceful_shut", [13] = "-",               [14] = "-",                 [15] = "-",

				};

				static const char *tevt_se_types[16] = {

					[ 0] = "-",         [ 1] = "shutw",         [ 2] = "eos",               [ 3] = "rcv_err",

					[ 4] = "snd_err",   [ 5] = "truncated_eos", [ 6] = "truncated_rcv_err", [ 7] = "-",

					[ 8] = "rst_rcvd",  [ 9] = "proto_err",     [10] = "internal_err",      [11] = "other_err",

					[12] = "cancelled", [13] = "-",             [14] = "-",                 [15] = "-",

				};

				static const char *tevt_strm_types[16] = {

					[ 0] = "-",           [ 1] = "shutw",         [ 2] = "eos",               [ 3] = "rcv_err",

					[ 4] = "snd_err",     [ 5] = "truncated_eos", [ 6] = "truncated_rcv_err", [ 7] = "tout",

					[ 8] = "intercepted", [ 9] = "proto_err",     [10] = "internal_err",      [11] = "other_err",

					[12] = "aborted",     [13] = "-",             [14] = "-",                 [15] = "-",

				};

				static const struct tevt_info tevt_location[26] = {

					[ 0] = {.loc = "-",    .types = tevt_unknown_types}, [ 1] = {.loc = "-",    .types = tevt_unknown_types},

					[ 2] = {.loc = "-",    .types = tevt_unknown_types}, [ 3] = {.loc = "-",    .types = tevt_unknown_types},

					[ 4] = {.loc = "se",   .types = tevt_se_types},      [ 5] = {.loc = "fd",   .types = tevt_fd_types},

					[ 6] = {.loc = "-",    .types = tevt_unknown_types}, [ 7] = {.loc = "hs",   .types = tevt_hs_types},

					[ 8] = {.loc = "-",    .types = tevt_unknown_types}, [ 9] = {.loc = "-",    .types = tevt_unknown_types},

					[10] = {.loc = "-",    .types = tevt_unknown_types}, [11] = {.loc = "-",    .types = tevt_unknown_types},

					[12] = {.loc = "muxc", .types = tevt_muxc_types},    [13] = {.loc = "-",    .types = tevt_unknown_types},

					[14] = {.loc = "-",    .types = tevt_unknown_types}, [15] = {.loc = "-",    .types = tevt_unknown_types},

					[16] = {.loc = "-",    .types = tevt_unknown_types}, [17] = {.loc = "-",    .types = tevt_unknown_types},

					[18] = {.loc = "strm", .types = tevt_strm_types},    [19] = {.loc = "-",    .types = tevt_unknown_types},

					[20] = {.loc = "-",    .types = tevt_unknown_types}, [21] = {.loc = "-",    .types = tevt_unknown_types},

					[22] = {.loc = "-",    .types = tevt_unknown_types}, [23] = {.loc = "xprt", .types = tevt_xprt_types},

					[24] = {.loc = "-",    .types = tevt_unknown_types}, [25] = {.loc = "-",    .types = tevt_unknown_types},

				};

				void usage_exit(const char *name)

				{

					fprintf(stderr, "Usage: %s { value* | - }\n", name);

					exit(1);

				}

				char *to_upper(char *dst, const char *src)

				{

					int i;

					for (i = 0; src[i]; i++)

						dst[i] = toupper(src[i]);

					dst[i] = 0;

					return dst;

				}

				char *tevt_show_events(char *buf, size_t len, const char *delim, const char *value)

				{

					char loc[5];

					int ret;

					if (!value || !*value) {

						snprintf(buf, len, "##NONE");

						goto end;

					}

					if (strcmp(value, "-") == 0) {

						snprintf(buf, len, "##UNK");

						goto end;

					}

					if (strlen(value) % 2 != 0) {

						snprintf(buf, len, "##INV");

						goto end;

					}

					while (*value) {

						struct tevt_info info;

						char l = value[0];

						char t = value[1];

						if (!isalpha(l) || !isxdigit(t)) {

							snprintf(buf, len, "##INV");

							goto end;

						}

						info = tevt_location[tolower(l) - 'a'];

						ret = snprintf(buf, len, "%s:%s%s",

							       isupper(l) ? to_upper(loc, info.loc) : info.loc,

							       info.types[hex2i(t)],

							       value[2] != 0 ? delim : "");

						if (ret < 0)

							break;

						len -= ret;

						buf += ret;

						value += 2;

					}

				  end:

					return buf;

				}

				char *tevt_show_tuple_events(char *buf, size_t len, char *value)

				{

					char *p = value;

					/* skip '{' */

					p++;

					while (*p) {

						char *v;

						char c;

						while (*p == ' ' || *p == '\t')

							p++;

						v = p;

						while (*p && *p != ',' && *p != '}')

							p++;

						c = *p;

						*p = 0;

						tevt_show_events(buf, len, " > ", v);

						printf("\t- %s\n", buf);

						*p = c;

						if (*p == ',')

							p++;

						else if (*p == '}')

							break;

						else {

							printf("\t- ##INV\n");

							break;

						}

					}

					*buf = 0;

					return buf;

				}

				int main(int argc, char **argv)

				{

					const char *name = argv[0];

					char line[128];

					char *value;

					int multi = 0;

					int use_stdin = 0;

					char *err;

					while (argc == 1)

						usage_exit(name);

					argv++; argc--;

					if (argc > 1)

						multi = 1;

					if (strcmp(argv[0], "-") == 0)

						use_stdin = 1;

					while (argc > 0) {

						if (use_stdin) {

							value = fgets(line, sizeof(line), stdin);

							if (!value)

								break;

							/* skip common leading delimiters that slip from copy-paste */

							while (*value == ' ' || *value == '\t' || *value == ':' || *value == '=')

								value++;

							err = value;

							while (*err && *err != '\n')

								err++;

							*err = 0;

						}

						else {

							value = argv[0];

							argv++; argc--;

						}

						if (multi)

							printf("### %-8s : ", value);

						if (*value == '{') {

							if (!use_stdin)

								printf("\n");

							tevt_show_tuple_events(buf, bsz, value);

						}

						else

							tevt_show_events(buf, bsz, " > ", value);

						printf("%s\n", buf);

					}

					return 0;

				}
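
The tool above decodes a string made of (location letter, hex digit) pairs, where the letter indexes tevt_location[] and the digit indexes the matching types table. A minimal standalone sketch, in C, of that decoding without the HAProxy headers. The names `loc_name` and `tevt_decode` are hypothetical, the event type is left as a number instead of being looked up, and, unlike the original, the sketch reports truncated output as an error.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* location names by letter, copied from tevt_location[] in the diff */
static const char *loc_name(char l)
{
	switch (tolower((unsigned char)l)) {
	case 'e': return "se";
	case 'f': return "fd";
	case 'h': return "hs";
	case 'm': return "muxc";
	case 's': return "strm";
	case 'x': return "xprt";
	default:  return "-";
	}
}

/* Decodes <value> into "loc:type" items joined by " > " in <out>. Returns 0
 * on success, -1 on invalid input or when <out> is too small.
 */
int tevt_decode(char *out, size_t len, const char *value)
{
	size_t used = 0;

	if (!len || strlen(value) % 2 != 0)
		return -1;
	*out = 0;
	for (; *value; value += 2) {
		char l = value[0], t = value[1];
		const char *name = loc_name(l);
		char up[8];
		size_t i;
		int type, ret;

		if (!isalpha((unsigned char)l) || !isxdigit((unsigned char)t))
			return -1;
		/* an upper-case letter is reported as an upper-case location */
		for (i = 0; name[i] && i < sizeof(up) - 1; i++)
			up[i] = isupper((unsigned char)l) ? toupper((unsigned char)name[i]) : name[i];
		up[i] = 0;

		type = isdigit((unsigned char)t) ? t - '0' : tolower((unsigned char)t) - 'a' + 10;
		ret = snprintf(out + used, len - used, "%s:%d%s", up, type,
		               value[2] ? " > " : "");
		if (ret < 0 || (size_t)ret >= len - used)
			return -1; /* error or truncated output */
		used += ret;
	}
	return 0;
}
```

With this sketch, "s8m1" decodes to "strm:8 > muxc:1". The numbers can then be looked up in the tables above: entry 8 of tevt_strm_types is "intercepted" and entry 1 of tevt_muxc_types is "shutw".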

31

doc/DeviceAtlas-device-detection.txt

 @ -3,7 +3,9 @@ DeviceAtlas Device Detection
 In order to add DeviceAtlas Device Detection support, you would need to download
 the API source code from https://deviceatlas.com/deviceatlas-haproxy-module.
 Once extracted :
 Once extracted, two modes are supported :
 1/ Build HAProxy and DeviceAtlas in one command
     $ make TARGET=<target> USE_DEVICEATLAS=1 DEVICEATLAS_SRC=<path to the API root folder>
 @ -14,10 +16,6 @@ directory. Also, in the case the api cache support is not needed and/or a C++ to
     $ make TARGET=<target> USE_DEVICEATLAS=1 DEVICEATLAS_SRC=<path to the API root folder> DEVICEATLAS_NOCACHE=1
 However, if the API had been installed beforehand, DEVICEATLAS_SRC
 can be omitted. Note that the DeviceAtlas C API version supported is from the 3.x
 releases series (3.2.1 minimum recommended).
 For HAProxy developers who need to verify that their changes didn't accidentally
 break the DeviceAtlas code, it is possible to build a dummy library provided in
 the addons/deviceatlas/dummy directory and to use it as an alternative for the
 @ -27,6 +25,29 @@ validate API changes :
     $ make TARGET=<target> USE_DEVICEATLAS=1 DEVICEATLAS_SRC=$PWD/addons/deviceatlas/dummy
 2/ Build and install DeviceAtlas according to https://docs.deviceatlas.com/apis/enterprise/c/<release version>/README.html
 For example :
 In the deviceatlas library folder :
     $ cmake .
     $ make
     $ sudo make install
 In the HAProxy folder :
     $ make TARGET=<target> USE_DEVICEATLAS=1
 Note that if the -DCMAKE_INSTALL_PREFIX cmake option had been used, it is also necessary to set DEVICEATLAS_LIB and
 DEVICEATLAS_INC as follows :
     $ make TARGET=<target> USE_DEVICEATLAS=1 DEVICEATLAS_INC=<CMAKE_INSTALL_PREFIX value>/include DEVICEATLAS_LIB=<CMAKE_INSTALL_PREFIX value>/lib
 For example :
     $ cmake -DCMAKE_INSTALL_PREFIX=/opt/local
     $ make
     $ sudo make install
     $ make TARGET=<target> USE_DEVICEATLAS=1 DEVICEATLAS_INC=/opt/local/include DEVICEATLAS_LIB=/opt/local/lib
 Note that DEVICEATLAS_SRC is omitted in this case.
 These are supported DeviceAtlas directives (see doc/configuration.txt) :
   - deviceatlas-json-file <path to the DeviceAtlas JSON data file>.
   - deviceatlas-log-level <number> (0 to 3, level of information returned by

BIN
doc/HAProxyCommunityEdition_60px.png Normal file

Binary file not shown.

325

doc/SPOE.txt

 @ -1,16 +1,12 @@
                 -----------------------------------------------
                    Stream Processing Offload Engine (SPOE)
                                   Version 1.2
                           ( Last update: 2020-06-13 )
                           ( Last update: 2024-07-12 )
                 -----------------------------------------------
                           Author : Christopher Faulet
                       Contact : cfaulet at haproxy dot com
   WARNING: The SPOE is now deprecated and will be removed in future version.
 SUMMARY
 --------
 @ -73,13 +69,10 @@ systems (often at least the connect() is blocking). So, it is hard to properly
 implement Single Sign On solution (SSO) in HAProxy. The SPOE will ease this
 kind of processing, or we hope so.
 Now, the aim of SPOE is to allow any kind of offloading on the streams. First
 releases won't do lot of things. As we will see, there are few handled events
 and even less actions supported. Actually, for now, the SPOE can offload the
 processing before "tcp-request content", "tcp-response content", "http-request"
 and "http-response" rules. And it only supports variables definition. But, in
 spite of these limited features, we can easily imagine to implement SSO
 solution, ip reputation or ip geolocation services.
 The aim of SPOE is to allow any kind of offloading on the streams. It can
 offload the processing before "tcp-request content", "tcp-response content",
 "http-request" and "http-response" rules. It is also possible to offload the
 processing via a TCP/HTTP rule.
 Some example implementations in various languages are linked to from the
 HAProxy Wiki page dedicated to this mechanism:
 @ -89,8 +82,8 @@ HAProxy Wiki page dedicated to this mechanism:
 2. SPOE configuration
 ----------------------
 Because SPOE is implemented as a filter, To use it, you must declare a "filter
 spoe" line in a proxy section (frontend/backend/listen) :
 Because SPOE is implemented as a filter, to use it, a "filter spoe" line must
 be declared in a proxy section (frontend/backend/listen) :
   frontend my-front
       ...
 @ -103,9 +96,10 @@ the SPOE configuration. So it is possible to use the same SPOE configuration
 for several engines. If no name is provided, the SPOE configuration must not
 contain any scope directive.
 We use a separate configuration file on purpose. By commenting SPOE filter
 line, you completely disable the feature, including the parsing of sections
 reserved to SPOE. This is also a way to keep the HAProxy configuration clean.
 Using a separate configuration file makes it possible to completely disable
 an engine just by commenting out the SPOE filter line, which also skips the
 parsing of the sections reserved to SPOE. This is also a way to keep the
 HAProxy configuration clean.
 A SPOE configuration file must contain, at least, the SPOA configuration
 ("spoe-agent" section) and SPOE messages/groups ("spoe-message" or "spoe-group"
 @ -118,12 +112,13 @@ file.
 .1. SPOE scope
 -------------------------
 If you specify an engine name on the SPOE filter line, then you need to define
 scope in the SPOE configuration with the same name. You can have several SPOE
 scope in the same file. In each scope, you must define one and only one
 "spoe-agent" section to configure the SPOA linked to your SPOE and several
 "spoe-message" and "spoe-group" sections to describe, respectively, messages and
 group of messages sent to servers managed by your SPOA.
 If an engine name is specified on the SPOE filter line, then the corresponding
 scope must be defined in the SPOE configuration with the same name. It is
 possible to have several SPOE scopes in the same file. In each scope, one and
 only one "spoe-agent" section must be defined, to configure the SPOA linked to
 the defined engine, and several "spoe-message" and "spoe-group" sections to
 describe, respectively, messages and groups of messages sent to servers managed
 by the SPOA.
 A SPOE scope starts with this kind of line :
 @ -152,15 +147,15 @@ If no engine name is provided on the SPOE filter line, no SPOE scope must be
 found in the SPOE configuration file. All the file is considered to be in the
 same anonymous and implicit scope.
 The engine name must be uniq for a proxy. If no engine name is provided on the
 SPOE filter line, the SPOE agent name is used by default.
 The engine name must be unique for a proxy. If no engine name is provided on
 the SPOE filter line, the SPOE agent name is used by default.
 .2. "spoe-agent" section
 --------------------------
 For each engine, you must define one and only one "spoe-agent" section. In this
 section, you will declare SPOE messages and the backend you will use. You will
 also set timeouts and options to customize your agent's behaviour.
 For each engine, exactly one "spoe-agent" section must be defined. Enabled SPOE
 messages are declared in this section, along with all the parameters (timeout,
 options, ...) used to customize the agent behavior.
 spoe-agent <name>
 @ -173,15 +168,10 @@ spoe-agent <name>
   following keywords are supported :
     - groups
     - log
     - maxconnrate
     - maxerrrate
     - max-frame-size
     - max-waiting-frames
     - messages
     - [no] option async
     - [no] option dontlog-normal
     - [no] option pipelining
     - [no] option send-frag-payload
     - option continue-on-error
     - option force-set-var
     - option set-on-error
 @ -189,9 +179,16 @@ spoe-agent <name>
     - option set-total-time
     - option var-prefix
     - register-var-names
     - timeout hello|idle|processing
     - timeout processing
     - use-backend
   following keywords are deprecated and ignored:
     - maxconnrate
     - maxerrrate
     - max-waiting-frames
     - [no] option async
     - [no] option send-frag-payload
     - timeout hello|idle
 groups <grp-name> ...
   Declare the list of SPOE groups that an agent will handle.
 @ -200,11 +197,11 @@ groups <grp-name> ...
     <grp-name>   is the name of a SPOE group.
   Groups declared here must be found in the same engine scope, else an error is
   triggered during the configuration parsing. You can have many "groups" lines.
   triggered during the configuration parsing. Several "groups" lines can be
   defined.
   See also: "spoe-group" section.
 log global
 log <address> [len <length>] [format <format>] <facility> [<level> [<minlevel>]]
 no log
 @ -215,28 +212,35 @@ no log
   See the HAProxy Configuration Manual for details about this option.
 maxconnrate <number>
 maxconnrate <number> [DEPRECATED]
   Set the maximum number of connections per second to <number>. The SPOE will
   stop to open new connections if the maximum is reached and will wait to
   acquire an existing one. So it is important to set "timeout hello" to a
   relatively small value.
   This parameter is now deprecated and ignored. It will be removed in future
   versions.
 maxerrrate <number>
 maxerrrate <number> [DEPRECATED]
   Set the maximum number of errors per second to <number>. The SPOE will stop
   its processing if the maximum is reached.
   This parameter is now deprecated and ignored. It will be removed in future
   versions.
 max-frame-size <number>
   Set the maximum allowed size for frames exchanged between HAProxy and SPOA.
   It must be in the range [256, tune.bufsize-4] (4 bytes are reserved for the
   frame length). By default, it is set to (tune.bufsize-4).
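The range above can be checked with a few lines of C. This is only a sketch of the documented rule, not HAProxy's own parsing code: the function and macro names are hypothetical, and the buffer size is passed in explicitly instead of being read from "tune.bufsize".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bytes reserved for the frame length, as stated in the text above. */
#define SPOE_FRAME_LEN_PREFIX 4
/* Lower bound of the documented range. */
#define SPOE_MIN_FRAME_SIZE   256

/* Default max-frame-size for a given buffer size (hypothetical helper). */
static size_t spoe_default_max_frame_size(size_t bufsize)
{
    assert(bufsize >= SPOE_MIN_FRAME_SIZE + SPOE_FRAME_LEN_PREFIX);
    return bufsize - SPOE_FRAME_LEN_PREFIX;
}

/* True if <value> lies in [256, bufsize - 4] (hypothetical helper). */
static bool spoe_max_frame_size_is_valid(size_t value, size_t bufsize)
{
    if (bufsize < SPOE_MIN_FRAME_SIZE + SPOE_FRAME_LEN_PREFIX)
        return false;
    return value >= SPOE_MIN_FRAME_SIZE &&
           value <= bufsize - SPOE_FRAME_LEN_PREFIX;
}
```

With a 16384-byte buffer, the default computed this way is 16380, and both 255 and 16381 fall outside the accepted range.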
 max-waiting-frames <number>
 max-waiting-frames <number>  [DEPRECATED]
   Set the maximum number of frames waiting for an acknowledgement on the same
   connection. This value is only used when the pipelinied or asynchronous
   exchanges between HAProxy and SPOA are enabled. By default, it is set to 20.
   This parameter is now deprecated and ignored. It will be removed in future
   versions.
 messages <msg-name> ...
   Declare the list of SPOE messages that an agent will handle.
 @ -244,23 +248,24 @@ messages <msg-name> ...
     <msg-name>   is the name of a SPOE message.
   Messages declared here must be found in the same engine scope, else an error
   is triggered during the configuration parsing. You can have many "messages"
   lines.
   is triggered during the configuration parsing. Several "messages" lines can
   be defined.
   See also: "spoe-message" section.
 option async
 option async [DEPRECATED]
 no option async
   Enable or disable the support of asynchronous exchanges between HAProxy and
   SPOA. By default, this option is enabled.
   This parameter is now deprecated and ignored. It will be removed in future
   versions.
 option continue-on-error
   Do not stop the events processing when an error occurred on a stream.
   By default, for a specific stream, when an abnormal/unexpected error occurs,
   the SPOE is disabled for all the transaction. So if you have several events
   the SPOE is disabled for the whole transaction. If several events are
   configured, such an error on one event disables all the following ones. For
   TCP streams, this will disable the SPOE for the whole session. For HTTP
   streams, this will disable it for the transaction (request and response).
 @ -268,7 +273,6 @@ option continue-on-error
   When set, this option bypasses this behaviour and only the current event will
   be ignored.
 option dontlog-normal
 no option dontlog-normal
   Enable or disable logging of normal, successful processing.
 @ -277,29 +281,27 @@ no option dontlog-normal
   See also: "log" and section 4 about logging.
 option force-set-var
   By default, the SPOE filter only registers already known variables (mainly
   from parsing of the configuration), and process-wide variables (those of
   scope
   "proc") cannot be created. If you want that haproxy trusts the agent and
   registers all variables (ex: can be useful for LUA workload), activate this
   option.
   "proc") cannot be created. To make HAProxy trust the agent and register all
   variables (e.g. this can be useful for Lua workloads), this option can be set.
   Caution : this option opens to a variety of attacks such as a rogue SPOA that
   asks to register too many variables.
 option pipelining
 no option pipelining
   Enable or disable the support of pipelined exchanges between HAProxy and
   SPOA. By default, this option is enabled.
 option send-frag-payload
 option send-frag-payload [DEPRECATED]
 no option send-frag-payload
   Enable or disable the sending of fragmented payload to SPOA. By default, this
   option is enabled.
   This parameter is now deprecated and ignored. It will be removed in future
   versions.
 option set-on-error <var name>
   Define the variable to set when an error occurred during an event processing.
 @ -311,13 +313,13 @@ option set-on-error <var name>
   This variable will only be set when an error occurred in the scope of the
   transaction. As for all other variables defined by the SPOE, it will be
   prefixed. So, if your variable name is "error" and your prefix is
   prefixed. So, if the variable name is "error" and the prefix is
   "my_spoe_pfx", the variable will be "txn.my_spoe_pfx.error".
   When set, the variable is an integer representing the error reason. For values
   under 256, it represents an error coming from the engine. From 256 upwards,
   it reports a SPOP error. In this case, to retrieve the right SPOP status code,
   you must remove 256 to this value. Here are possible values:
   256 must be subtracted from this value. Here are possible values:
     * 1       a timeout occurred during the event processing.
 @ -351,8 +353,8 @@ option set-process-time <var name>
                  contain characters 'a-z', 'A-Z', '0-9', '.' and '_'.
   This variable will be set in the scope of the transaction. As for all other
   variables define by the SPOE, it will be prefixed. So, if your variable name
   is "process_time" and your prefix is "my_spoe_pfx", the variable will be
   variables defined by the SPOE, it will be prefixed. So, if the variable name
   is "process_time" and the prefix is "my_spoe_pfx", the variable will be
   "txn.my_spoe_pfx.process_time".
   When set, the variable is an integer representing the delay to process the
 @ -360,11 +362,10 @@ option set-process-time <var name>
   latency added by the SPOE processing for the last handled event or group.
   If several events or groups are processed for the same stream, this value
   will be overrideen.
   will be overridden.
   See also: "option set-total-time".
 option set-total-time <var name>
   Define the variable to set to report the total processing time SPOE for a
   stream.
 @ -375,8 +376,8 @@ option set-total-time <var name>
                  contain characters 'a-z', 'A-Z', '0-9', '.' and '_'.
   This variable will be set in the scope of the transaction. As for all other
   variables define by the SPOE, it will be prefixed. So, if your variable name
   is "total_time" and your prefix is "my_spoe_pfx", the variable will be
   variables defined by the SPOE, it will be prefixed. So, if the variable name
   is "total_time" and the prefix is "my_spoe_pfx", the variable will be
   "txn.my_spoe_pfx.total_time".
   When set, the variable is an integer representing the sum of processing times
 @ -388,7 +389,6 @@ option set-total-time <var name>
   See also: "option set-process-time".
 option var-prefix <prefix>
   Define the prefix used when variables are set by an agent.
 @ -403,19 +403,19 @@ option var-prefix <prefix>
   The prefix will be added between the variable scope and its name, separated
   by a '.'. It may only contain characters 'a-z', 'A-Z', '0-9', '.' and '_', as
   for variables name. In HAProxy configuration, you need to use this prefix as
   a part of the variables name. For example, if an agent define the variable
   "myvar" in the "txn" scope, with the prefix "my_spoe_pfx", then you should
   use "txn.my_spoe_pfx.myvar" name in your HAProxy configuration.
   for variable names. In HAProxy configuration, this prefix must be used as a
   part of the variable names. For example, if an agent defines the variable
   "myvar" in the "txn" scope, with the prefix "my_spoe_pfx", then the name
   "txn.my_spoe_pfx.myvar" must be used in HAProxy configuration.
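The naming rule above (scope, then prefix, then name, separated by '.') can be illustrated with a small C helper. It is a sketch under assumptions: the function names are hypothetical and do not come from HAProxy, and the character check simply mirrors the set allowed by this document ('a-z', 'A-Z', '0-9', '.' and '_').

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* True if <s> is non-empty and only uses 'a-z', 'A-Z', '0-9', '.' and '_'. */
static bool spoe_name_is_valid(const char *s)
{
    if (strlen(s) == 0)
        return false;
    for (; *s; s++) {
        bool ok = (*s >= 'a' && *s <= 'z') || (*s >= 'A' && *s <= 'Z') ||
                  (*s >= '0' && *s <= '9') || *s == '.' || *s == '_';
        if (!ok)
            return false;
    }
    return true;
}

/* Build "<scope>.<prefix>.<name>" into <out> (hypothetical helper).
 * Returns 0 on success, -1 on an invalid name or a too small buffer.
 */
static int spoe_full_var_name(char *out, size_t outlen, const char *scope,
                              const char *prefix, const char *name)
{
    int n;

    assert(out != NULL && outlen > 0);
    if (!spoe_name_is_valid(prefix) || !spoe_name_is_valid(name))
        return -1;
    n = snprintf(out, outlen, "%s.%s.%s", scope, prefix, name);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Called with "txn", "my_spoe_pfx" and "myvar", the helper produces "txn.my_spoe_pfx.myvar", which is the name to reference from the HAProxy configuration.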
   By default, an agent will never set new variables at runtime: It can only set
   new value for existing ones. If you want a different behaviour, see
   force-set-var option and register-var-names directive.
   new values for existing ones. To change this behaviour, see the
   "force-set-var" option and the "register-var-names" directive.
 register-var-names <var name> ...
   Register some variable names. By default, an agent will not be allowed to set
   new variables at runtime. This rule can be totally relaxed by setting the
   option "force-set-var". If you know all the variables you will need, this
   option "force-set-var". If all the required variables are known, this
   directive is a good way to register them without letting an agent do whatever
   it wants. This is only required if these variables are not referenced anywhere
   in the HAProxy configuration or the SPOE one.
 @ -424,12 +424,12 @@ register-var-names <var name> ...
     <var name>   is a variable name without the scope. The name may only
                  contain characters 'a-z', 'A-Z', '0-9', '.' and '_'.
   The prefix will be automatically added during the registration. You can have
   many "register-var-names" lines.
   The prefix will be automatically added during the registration. Several
   "register-var-names" lines can be used.
   See also: "option force-set-var", "option var-prefix".
 timeout hello <timeout>
 timeout hello <timeout> [DEPRECATED]
   Set the maximum time to wait for an agent to receive the AGENT-HELLO frame.
   It is applied on the stream that handles the connection with the agent.
 @ -441,8 +441,10 @@ timeout hello <timeout>
   This timeout is an applicative timeout. It differs from "timeout connect"
   defined on backends.
   This parameter is now deprecated and ignored. It will be removed in future
   versions.
 timeout idle <timeout>
 timeout idle <timeout>  [DEPRECATED]
   Set the maximum time to wait for an agent to close an idle connection. It is
   applied on the stream that handles the connection with the agent.
 @ -451,6 +453,8 @@ timeout idle <timeout>
                 can be in any other unit if the number is suffixed by the unit,
                 as explained at the top of this document.
   This parameter is now deprecated and ignored. It will be removed in future
   versions.
 timeout processing <timeout>
   Set the maximum time to wait for a stream to process an event, i.e to acquire
 @ -486,21 +490,19 @@ spoe-message <name>
   Arguments :
     <name>   is the name of the SPOE message.
   Here you define a message that can be referenced in a "spoe-agent"
   section. Following keywords are supported :
   Here a message that can be referenced in a "spoe-agent" section is
   defined. Following keywords are supported :
     - acl
     - args
     - event
   See also: "spoe-agent" section.
 acl <aclname> <criterion> [flags] [operator] <value> ...
   Declare or complete an access list.
   See section 7 about ACL usage in the HAProxy Configuration Manual.
 args [name=]<sample> ...
   Define arguments passed into the SPOE message.
 @ -514,7 +516,6 @@ args [name=]<sample> ...
   For example:
     args frontend=fe_id src dst
 event <name> [ { if | unless } <condition> ]
   Set the event that triggers sending of the message. It may optionally be
   followed by an ACL-based condition, in which case it will only be evaluated
 @ -556,13 +557,12 @@ spoe-group <name>
   Arguments :
     <name>   is the name of the SPOE group.
   Here you define a group of SPOE messages that can be referenced in a
   Here a group of SPOE messages is defined. It can be referenced in a
   "spoe-agent" section. Following keywords are supported :
      - messages
        - messages
   See also: "spoe-agent" and "spoe-message" sections.
 messages <msg-name> ...
   Declare the list of SPOE messages belonging to the group.
 @ -571,7 +571,7 @@ messages <msg-name> ...
   Messages declared here must be found in the same engine scope, else an error
   is triggered during the configuration parsing. Furthermore, a message belongs
   at most to a group. You can have many "messages" lines.
   to at most one group. Several "messages" lines can be defined.
   See also: "spoe-message" section.
 @ -602,7 +602,7 @@ and 0 a blacklisted IP with no doubt).
         server http A.B.C.D:80
     backend iprep-servers
         mode tcp
         mode spop
         balance roundrobin
         timeout connect 5s # greater than hello timeout
 @ -620,8 +620,6 @@ and 0 a blacklisted IP with no doubt).
         option var-prefix iprep
         timeout hello      2s
         timeout idle       2m
         timeout processing 10ms
         use-backend iprep-servers
 @ -718,62 +716,37 @@ actions.
             +---+---+----------+
     FIN: Indicates that this is the final payload fragment. The first fragment
          may also be the final fragment.
          may also be the final fragment. The payload fragmentation was removed
          and is now deprecated. It means the FIN flag must be set on all
          frames.
     ABORT: Indicates that the processing of the current frame must be
            cancelled. This bit should be set on frames with a fragmented
            payload. It can be ignore for frames with an unfragemnted
            payload. When it is set, the FIN bit must also be set.
            cancelled.
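Since fragmentation was removed, a receiver only has to check that FIN is set and whether ABORT is raised. The sketch below shows one way to do it in C. The bit positions (FIN as bit 0, ABORT as bit 1) and the 4-byte network byte order encoding are assumptions taken from the flags diagram of the full specification, which is not part of this hunk. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SPOP_FL_FIN   0x00000001u /* assumed position: bit 0 */
#define SPOP_FL_ABORT 0x00000002u /* assumed position: bit 1 */

/* Decode a 32-bit flags field stored on 4 bytes in network byte order. */
static uint32_t spop_decode_flags(const unsigned char b[4])
{
    assert(b != 0);
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Without fragmentation, every frame is expected to carry FIN. */
static bool spop_flags_acceptable(uint32_t flags)
{
    return (flags & SPOP_FL_FIN) != 0;
}

/* True if processing of the current frame must be cancelled. */
static bool spop_flags_abort(uint32_t flags)
{
    return (flags & SPOP_FL_ABORT) != 0;
}
```

A frame whose flags decode to 0 would be rejected by spop_flags_acceptable(), because it would announce a fragment that can no longer exist.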
 Frames cannot exceed a maximum size negotiated between HAProxy and agents
 during the HELLO handshake. Most of time, payload will be small enough to send
 it in one frame. But when supported by the peer, it will be possible to
 fragment huge payload on many frames. This ability is announced during the
 HELLO handshake and it can be asynmetric (supported by agents but not by
 HAProxy or the opposite). The following rules apply to fragmentation:
   * An unfragemnted payload consists of a single frame with the FIN bit set.
   * A fragemented payload consists of several frames with the FIN bit clear and
     terminated by a single frame with the FIN bit set. All these frames must
     share the same STREAM-ID and FRAME-ID. The first frame must set the right
     FRAME-TYPE (e.g, NOTIFY). The following frames must have an unset type (0).
 Beside the support of fragmented payload by a peer, some payload must not be
 fragmented. See below for details.
 it in one frame.
 IMPORTANT : The maximum size supported by peers for a frame must be greater
 than or equal to 256 bytes.
             than or equal to 256 bytes. A good common value is the HAProxy
             buffer size minus 4 bytes, reserved for the frame length
             (tune.bufsize - 4). It is the default value announced by HAProxy.
 .2.1. Frame capabilities
 --------------------------
 Here are the list of official capabilities that HAProxy and agents can support:
   * fragmentation: This is the ability for a peer to support fragmented
                    payload in received frames. This is an asymmectical
                    capability, it only concerns the peer that announces
                    it. This is the responsibility to the other peer to use it
                    or not.
   * pipelining: This is the ability for a peer to decouple NOTIFY and ACK
                 frames. This is a symmectical capability. To be used, it must
                 be supported by HAProxy and agents. Unlike HTTP pipelining, the
                 ACK frames can be send in any order, but always on the same TCP
                 connection used for the corresponding NOTIFY frame.
   * async: This ability is similar to the pipelining, but here any TCP
            connection established between HAProxy and the agent can be used to
            send ACK frames. if an agent accepts connections from multiple
            HAProxy, it can use the "engine-id" value to group TCP
            connections. See details about HAPROXY-HELLO frame.
 Unsupported or unknown capabilities are silently ignored, when possible.
 NOTE: HAProxy does not support the fragmentation for now. This means it is not
       able to handle fragmented frames. However, if an agent announces the
       fragmentation support, HAProxy may choose to send fragemented frames.
 NOTE: Fragmentation and async capabilities were deprecated and are now ignored.
 .2.2. Frame types overview
 ----------------------------
 @ -782,9 +755,6 @@ Here are types of frame supported by SPOE. Frames sent by HAProxy come first,
 then frames sent by agents :
     TYPE                       |  ID | DESCRIPTION
   -----------------------------+-----+-------------------------------------
      UNSET                     |  0  | Used for all frames but the first when a
                                |     | payload is fragmented.
   -----------------------------+-----+-------------------------------------
      HAPROXY-HELLO             |  1  |  Sent by HAProxy when it opens a
                                |     |  connection on an agent.
 @ -805,7 +775,8 @@ then frames sent by agents :
      ACK                       | 103 |  Sent to acknowledge a NOTIFY frame
   -----------------------------+-----+-------------------------------------
 Unknown frames may be silently skipped.
 Unknown frames may be silently skipped or trigger an error, depending on the
 implementation.
 .2.3. Workflow
 ----------------
 @ -869,37 +840,6 @@ Unknown frames may be silently skipped.
        | <-------------------------- |
        |                             |
   * Notify / Ack exchange (fragmented payload):
     HAPROXY                       AGENT SRV
        |      NOTIFY (frag 1)        |
        | --------------------------> |
        |                             |
        |       UNSET (frag 2)        |
        | --------------------------> |
        |            ...              |
        |       UNSET (frag N)        |
        | --------------------------> |
        |                             |
        |           ACK               |
        | <-------------------------- |
        |                             |
   * Aborted fragmentation of a NOTIFY frame:
     HAPROXY                       AGENT SRV
        |            ...              |
        |       UNSET (frag X)        |
        | --------------------------> |
        |                             |
        |         ACK/ABORT           |
        | <-------------------------- |
        |                             |
        |       UNSET (frag X+1)      |
        | -----------X                |
        |                             |
        |                             |
   * Connection closed by haproxy:
     HAPROXY                       AGENT SRV
 @ -921,8 +861,8 @@ Unknown frames may be silently skipped.
 ----------------------------
 This frame is the first one exchanged between HAProxy and an agent, when the
 connection is established. The payload of this frame is a KV-LIST. It cannot be
 fragmented. STREAM-ID and FRAME-ID are must be set 0.
 connection is established. The payload of this frame is a KV-LIST. STREAM-ID
 and FRAME-ID must both be set to 0.
 Following items are mandatory in the KV-LIST:
 @ -967,7 +907,7 @@ AGENT-DISCONNECT frame must be returned.
 This frame is sent in reply to a HAPROXY-HELLO frame to finish a HELLO
 handshake. As for HAPROXY-HELLO frame, STREAM-ID and FRAME-ID are also set
 . The payload of this frame is a KV-LIST and it cannot be fragmented.
 to 0. The payload of this frame is a KV-LIST.
 Following items are mandatory in the KV-LIST:
 @ -1001,8 +941,7 @@ will close the connection at the end of the health check.
 Information are sent to the agents inside NOTIFY frames. These frames are
 attached to a stream, so STREAM-ID and FRAME-ID must be set. The payload of
 NOTIFY frames is a LIST-OF-MESSAGES and, if supported by agents, it can be
 fragmented.
 NOTIFY frames is a LIST-OF-MESSAGES.
 NOTIFY frames must be acknowledged by agents sending an ACK frame, repeating
 the right STREAM-ID and FRAME-ID.
 @ -1012,8 +951,7 @@ right STREAM-ID and FRAME-ID.
 ACK frames must be sent by agents to reply to NOTIFY frames. STREAM-ID and
 FRAME-ID found in a NOTIFY frame must be reused in the corresponding ACK
 frame. The payload of ACK frames is a LIST-OF-ACTIONS and, if supported by
 HAProxy, it can be fragmented.
 frame. The payload of ACK frames is a LIST-OF-ACTIONS.
 .2.8. Frame: HAPROXY-DISCONNECT
 ---------------------------------
 @ -1023,8 +961,8 @@ frame is sent with information describing the error. HAProxy will wait an
 AGENT-DISCONNECT frame in reply. All other frames will be ignored. The agent
 must then close the socket.
 The payload of this frame is a KV-LIST. It cannot be fragmented. STREAM-ID and
 FRAME-ID are must be set 0.
 The payload of this frame is a KV-LIST. STREAM-ID and FRAME-ID must both be
 set to 0.
 Following items are mandatory in the KV-LIST:
 @ -1046,8 +984,8 @@ is sent, with information describing the error. such frame is also sent in reply
 to a HAPROXY-DISCONNECT. The agent must close the socket just after sending
 this frame.
 The payload of this frame is a KV-LIST. It cannot be fragmented. STREAM-ID and
 FRAME-ID are must be set 0.
 The payload of this frame is a KV-LIST. STREAM-ID and FRAME-ID must both be
 set to 0.
 Following items are mandatory in the KV-LIST:
 @ -1064,10 +1002,10 @@ For more information about known errors, see section "Errors & timeouts"
 .3. Events & Messages
 -----------------------
 Information about streams are sent in NOTIFY frames. You can specify which kind
 of information to send by defining "spoe-message" sections in your SPOE
 configuration file. for each "spoe-message" there will be a message in a NOTIFY
 frame when the right event is triggered.
 Information about streams is sent in NOTIFY frames. It is possible to specify
 which kind of information to send by defining "spoe-message" sections in the
 SPOE configuration file. For each "spoe-message" there will be a message in a
 NOTIFY frame when the right event is triggered.
 A NOTIFY frame is sent for a specific event when there is at least one
 "spoe-message" attached to this event. All messages for an event will be added
 @ -1189,21 +1127,15 @@ An agent can define its own errors using a not yet assigned status code.
 IMPORTANT NOTE: By default, for a specific stream, when an abnormal/unexpected
                 error occurs, the SPOE is disabled for all the transaction. So
                 if you have several events configured, such error on an event
                 will disabled all following. For TCP streams, this will
                 disable the SPOE for the whole session. For HTTP streams, this
                 will disable it for the transaction (request and response).
                 See 'option continue-on-error' to bypass this limitation.
                 if several events are configured, such an error on one event
                 will disable all the following ones. For TCP streams, this
                 will disable the SPOE for the whole session. For HTTP streams,
                 this will disable it for the transaction (request and
                 response). See 'option continue-on-error' to bypass this
                 limitation.
 To avoid a stream to wait undefinetly, you must carefully choose the
 acknowledgement timeout. In most of cases, it will be quiet low. But it depends
 on the responsivness of your service.
 You must also choose idle timeout carefully. Because connection with your
 service depends on the backend configuration used by the SPOA, it is important
 to use a lower value for idle timeout than the server timeout. Else the
 connection will be closed by HAProxy. The same is true for hello timeout. You
 should choose a lower value than the connect timeout.
 To prevent a stream from waiting indefinitely, a processing timeout should be
 carefully defined. Most of the time, it will be quite low, but it depends on
 the SPOA responsiveness.
 . Logging
 -----------
 @ -1218,40 +1150,19 @@ LOG_NOTICE. Otherwise, the message is logged with the level LOG_WARNING.
 The messages are logged using the agent's logger, if defined, and use the
 following format:
     SPOE: [AGENT] <TYPE:NAME> sid=STREAM-ID st=STATUS-CODE reqT/qT/wT/resT/pT \
     <idles>/<applets> <nb_sending>/<nb_waiting> <nb_error>/<nb_processed>
     SPOE: [AGENT] <TYPE:NAME> sid=STREAM-ID st=STATUS-CODE pT <nb_error>/<nb_processed>
       AGENT              is the agent name
       TYPE               is EVENT or GROUP
       NAME               is the event or the group name
       STREAM-ID          is an integer, the unique id of the stream
       STATUS-CODE        is the processing's status code
       reqT/qT/wT/resT/pT are the following time events:
         * reqT : the encoding time. It includes ACLs processing, if any. For
                  fragmented frames, it is the sum of all fragments.
         * qT   : the delay before the request gets out the sending queue. For
                  fragmented frames, it is the sum of all fragments.
         * wT   : the delay before the response is received. No fragmentation
                  supported here.
         * resT : the delay to process the response. No fragmentation supported
                  here.
         * pT   : the delay to process the event or the group. From the stream
                  point of view, it is the latency added by the SPOE processing.
                  It is more or less the sum of values above.
       <idle>             is the numbers of idle SPOE applets
       <applets>          is the numbers of SPOE applets
       <nb_sending>       is the numbers of streams waiting to send data
       <nb_waiting>       is the numbers of streams waiting for a ack
       pT                 is the delay to process the event or the group.
                             From the stream point of view, it is the latency added
                             by the SPOE processing.
       <nb_error>         is the number of processing errors
       <nb_processed>     is the number of events/groups processed
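A log consumer could split the shortened format with a single sscanf() call, as sketched below in C. The sketch assumes that the square and angle brackets appear literally in the emitted line and that pT is printed as a plain integer. The sample line used in the usage note is invented for illustration, and the structure and function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Fields of one SPOE log line (hypothetical structure). */
struct spoe_log_line {
    char agent[64];             /* AGENT */
    char type[16];              /* TYPE: EVENT or GROUP */
    char name[64];              /* NAME */
    unsigned int sid;           /* STREAM-ID */
    unsigned int status;        /* STATUS-CODE */
    long pt;                    /* pT */
    unsigned long nb_error;     /* <nb_error> */
    unsigned long nb_processed; /* <nb_processed> */
};

/* Parse <line> into <out>. Returns 0 on success, -1 on a format mismatch. */
static int spoe_parse_log_line(const char *line, struct spoe_log_line *out)
{
    int n;

    assert(line != NULL && out != NULL);
    n = sscanf(line,
               "SPOE: [%63[^]]] <%15[^:]:%63[^>]> sid=%u st=%u %ld %lu/%lu",
               out->agent, out->type, out->name, &out->sid, &out->status,
               &out->pt, &out->nb_error, &out->nb_processed);
    return n == 8 ? 0 : -1;
}
```

For instance, a made-up line such as "SPOE: [iprep] <EVENT:on-client-session> sid=7 st=0 3 0/12" is split into the agent "iprep", the type "EVENT", the name "on-client-session" and the five numeric fields.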
 For all these time events, -1 means the processing was interrupted before the
 end. So -1 for the queue time means the request was never dequeued. For
 fragmented frames it is harder to know when the interruption happened.
 /*
  * Local variables:
  *  fill-column: 79

1448

doc/architecture.txt

File diff suppressed because it is too large

11517

doc/configuration.txt

File diff suppressed because it is too large

114

doc/design-thoughts/error-reporting.txt Normal file

 @ -0,0 +1,114 @@
 -10-28 - error reporting
 ----------------------------
 - rules:
     -> stream->current_rule ~= yielding rule or error
        pb: not always set.
     -> todo: curr_rule_in_progress points to &rule->conf (file+line)
        - set on ACT_RET_ERR, ACT_RET_YIELD, ACT_RET_INV.
        - sample_fetch: curr_rule
 - filters:
     -> strm_flt.filters[2] (1 per direction) ~= yielding filter or error
     -> to check: what to do on forward filters (e.g. compression)
     -> check spoe / waf (stream data)
     -> sample_fetch: curr_filt
 - cleanup:
   - last_rule_line + last_rule_file can point to &rule->conf
 - xprt:
   - all handshakes use the dummy xprt "xprt_handshake" ("HS"). No data
     exchange is possible there. The ctx is of type xprt_handshake_ctx
     for all of them, and contains a wait_event.
     => conn->xprt_ctx->wait_event contains the sub for current handshake
        *if* xprt points to xprt_handshake.
   - at most 2 active xprt at once: top and bottom (bottom=raw_sock)
 - proposal:
   - combine 2 bits for muxc, 2 bits for xprt, 4 bits for fd (active,ready).
     => 8 bits for muxc and below. QUIC uses something different TBD.
   - muxs uses 6 bits max (ex: h2 send_list, fctl_list, full etc; h1: full,
     blocked connect...).
   - 2 bits for sc's sub
   - mux_sctl to retrieve a 32-bit code padded right, limited to 16 bits
     for now.
     => [ 0000 | 0000 | 0000 | 0000 | SC | MUXS | MUXC | XPRT | FD ]
                                       2     6      2      2     4
   - sample-fetch for each side.
 - shut / abort
   - history, almost human-readable.
   - event locations:
      - fd (detected by rawsock)
      - handshake (detected by xprt_handshake). Eg. parsing or address encoding
      - xprt (ssl)
      - muxc
      - se: muxs / applet
      - stream
      < 8 total. +8 to distinguish front from back at stream level.
      suggest:
        - F, H, X, M, E, S  front or back
        - f, h, x, m, e, s  back or front
   - event types:
       - 0 = no event yet
       - 1 = timeout
       - 2 = intercepted (rule, etc)
       - 3 unused
       // shutr / shutw: +1 if other side already shut
       - 4 = aligned shutr
       - 6 = aligned recv error
       - 8 = early shutr (truncation)
       - 10 = early error (truncation)
       - 12 = shutw
       - 14 = send error
   - event location = MSB
     event type     = LSB
     appending a single event:
       -- if code not full --
       code <<= 8;
       code |= location << 4;
       code |= event type;
   - up to 4 events per connection in 32-bit mode stored on connection
     (since raw_sock & ssl_sock need to access it).
   - SE (muxs/applet) store their event log in the SD: se_event_log (64 bits).
   - muxs must aggregate the connection's flags with its own:
     - store last known connection state in SD: conn_event_log
     - detect changes at the connection level by comparing with SD conn_event_log
     - create a new SD event with difference(s) into SD se_event_log
     - update connection state in SD conn_event_log
   - stream
     - store their event log in the stream: strm_event_log (64 bits).
     - for each side:
       - store last known SE state in SD: last_se_event_log
       - detect changes at the SE level by comparing with SD se_event_log
       - create a new STREAM event with difference(s) into STREAM strm_event_log
         and patch the location depending on front vs back (+8 for back).
       - update SE state in SD last_se_event_log
     => strm_event_log contains a composite of each side + stream.
     - converted to string using the location letters
     - if more event types needed later, can enlarge bits and use another letter.
     - note: also possible to create an exhaustive enumeration of all possible codes
       (types+locations).
 - sample fetch to retrieve strm_event_log.
 - Note that fc_err and fc_err_str are already usable
 - questions:
   - htx layer needed ?
   - ability to map EOI/EOS etc to SE activity ?
   - we'd like to detect an HTTP response before end of POST.
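The "appending a single event" steps above can be sketched in C as follows. This is only an illustration of the proposal, not existing HAProxy code: the enum names, the numeric location codes (1 to 6 in F, H, X, M, E, S order), the choice of upper case for the front side, and the "letter + hex type" string format are all assumptions made for the example. Only the event type values and the shift/or sequence come from the notes.

```c
#include <stdint.h>

/* Hypothetical location codes (high nibble). Adding 8 marks the back side
 * at stream level, as suggested in the notes. */
enum ev_loc {
    EV_LOC_FD = 1, EV_LOC_HS, EV_LOC_XPRT, EV_LOC_MUXC, EV_LOC_SE, EV_LOC_STRM,
    EV_LOC_BACK = 8,
};

/* Event types from the list above (low nibble). */
enum ev_type {
    EV_NONE = 0, EV_TIMEOUT = 1, EV_INTERCEPTED = 2,
    EV_SHUTR = 4, EV_RECV_ERR = 6, EV_EARLY_SHUTR = 8,
    EV_EARLY_ERR = 10, EV_SHUTW = 12, EV_SEND_ERR = 14,
};

/* Appends one event to a 32-bit log holding up to 4 events.
 * Returns 1 on success, 0 if the log is already full. */
static int ev_log_append(uint32_t *code, unsigned int loc, unsigned int type)
{
    if (*code >> 24)              /* oldest slot already used: code is full */
        return 0;
    *code <<= 8;
    *code |= (loc & 0xFu) << 4;   /* event location = MSB nibble */
    *code |= type & 0xFu;         /* event type     = LSB nibble */
    return 1;
}

/* Converts a log to text, oldest event first, two characters per event.
 * <out> must have room for at least 9 bytes. */
static void ev_log_str(uint32_t code, char *out)
{
    static const char front[] = "?FHXMES?";
    static const char back[]  = "?fhxmes?";
    static const char hex[]   = "0123456789ABCDEF";
    int shift;

    for (shift = 24; shift >= 0; shift -= 8) {
        unsigned int ev  = (code >> shift) & 0xFFu;
        unsigned int loc = ev >> 4;

        if (!ev)
            continue;             /* unused slot */
        *out++ = (loc & 8) ? back[loc & 7] : front[loc & 7];
        *out++ = hex[ev & 0xF];
    }
    *out = '\0';
}
```

With these assumptions, an early shutr seen at the fd level followed by a receive error at the xprt level yields the code 0x1836, which the second helper renders as "F8X6".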

doc/design-thoughts/h2-rx-win.fig Normal file

 @ -0,0 +1,750 @@
 [xfig drawing source (FIG 3.2 format, 750 lines) not reproduced here: the polyline coordinates did not survive extraction. Recoverable text labels: "WP0 @4", "=> +8", "WP1 @12", "WP2 @28", "WU: win=8", "WU: win=16", "WU: win=32", "Pause", "Zero Window", "Fixed".]

doc/design-thoughts/numa-auto.txt Normal file

File diff suppressed because it is too large

doc/internals/api/buffer-api.txt

 @ -548,11 +548,15 @@ buffer_almost_full  | const buffer *buf| returns true if the buffer is not null
                     |                  | are used. A waiting buffer will match.
 --------------------+------------------+---------------------------------------
 b_alloc             | buffer *buf      | ensures that <buf> is allocated or
-                    | ret: buffer *    | allocates a buffer and assigns it to
-                    |                  | *buf. If no memory is available, (1)
-                    |                  | is assigned instead with a zero size.
+                    | enum dynbuf_crit | allocates a buffer and assigns it to
+                    |     criticality  | *buf. If no memory is available, (1)
+                    | ret: buffer *    | is assigned instead with a zero size.
                     |                  | The allocated buffer is returned, or
-                    |                  | NULL in case no memory is available
+                    |                  | NULL in case no memory is available.
+                    |                  | The criticality indicates how the
+                    |                  | buffer might be used and how likely it
+                    |                  | is that the allocated memory will be
+                    |                  | quickly released.
 --------------------+------------------+---------------------------------------
 __b_free            | buffer *buf      | releases <buf> which must be allocated
                     | ret: void        | and marks it empty

doc/internals/api/buffer-list-api.txt Normal file

 @ -0,0 +1,128 @@
 -09-30 - Buffer List API
 1. Use case
 The buffer list API allows one to share a certain amount of buffers between
 multiple entities, which will each see their own as lists of buffers, while
 keeping a shared free list. The immediate use case is for muxes, which may
 want to allocate up to a certain number of buffers per connection, shared
 among all streams. In this case, each stream will first request a new list
 for its own use, then may request extra entries from the free list. At any
 moment it will be possible to enumerate all allocated lists and to know which
 buffer follows which one.
 2. Representation
 The buffer list is an array of struct bl_elem. It can hold up to N-1 buffers
 for N elements. The first one serves as the bookkeeping head and creates the
 free list.
 Each bl_elem contains a struct buffer, a pointer to the next cell, and a few
 flags. The struct buffer is a real struct buffer for all cells, except the
 first one where it holds useful data to describe the state of the array:
     struct bl_elem {
         struct buffer {
             size_t size;  // head: size of the array in number of elements
             char  *area;  // head: not used (0)
             size_t data;  // head: number of elements allocated
             size_t head;  // head: number of users
         } buf;
         uint32_t next;
         uint32_t flags;
     };
 There are a few important properties here:
   - for the free list, the first element isn't part of the list, otherwise
     there wouldn't be any head storage anymore.
   - the head's buf.data doesn't include the first cell of the array, thus its
     maximum value is buf.size - 1.
   - allocations are always made by appending to end of the existing list
   - releases are always made by releasing the beginning of the existing list
   - next == 0 for an allocatable cell implies that all the cells from this
     element to the last one of the array are free. This allows one to simply
     initialize a whole new array with memset(array, 0, sizeof(array))
   - next == ~0 for an allocated cell indicates we've reached the last element
     of the current list.
   - for the head of the list, next points to the first available cell, or 0 if
     the free list is depleted.
 3. Example
 The array starts like this, created with a calloc() and having size initialized
 to the total number of cells. The number represented is the 'next' value. "~"
 here stands for ~0 (i.e. end marker).
   [1|0|0|0|0|0|0|0|0|0]    => array entirely free
 strm1: bl_get(0) -> 1 = assign 1 to strm1's first cell
   [2|~|0|0|0|0|0|0|0|0]    => strm1 allocated at [1]
 
 strm1: bl_get(1) -> 2 = allocate one cell after cell 1
   [3|2|~|0|0|0|0|0|0|0]
 
 strm1: bl_get(2) -> 3 = allocate one cell after cell 2
   [4|2|3|~|0|0|0|0|0|0]
 
 strm2: bl_get(0) -> 4 = assign 4 to strm2's first cell
   [5|2|3|~|~|0|0|0|0|0]    => strm1 starts at [1], strm2 at [4]
 strm1: bl_put(1) -> 2 = release cell 1, jump to next one (2)
   [1|5|3|~|~|0|0|0|0|0]    => cell 1 is free again, strm1 starts at [2]
 4. Manipulating buffer lists
 The API is very simple: it allows one to reserve a buffer for a new stream or for
 an existing one, to release a stream's first buffer or release the entire
 stream, and to initialize / release the whole array.
 ====================+==================+=======================================
 Function            | Arguments/Return | Description
 --------------------+------------------+---------------------------------------
 bl_users()          | const bl_elem *b | returns the current number of users on
                     | ret: uint32_t    | the array (i.e. buf.head).
 --------------------+------------------+---------------------------------------
 bl_size()           | const bl_elem *b | returns the total number of
                     | ret: uint32_t    | allocatable cells (i.e. buf.size-1)
 --------------------+------------------+---------------------------------------
 bl_used()           | const bl_elem *b | returns the number of cells currently
                     | ret: uint32_t    | in use (i.e. buf.data)
 --------------------+------------------+---------------------------------------
 bl_avail()          | const bl_elem *b | returns the number of cells still
                     | ret: uint32_t    | available.
 --------------------+------------------+---------------------------------------
 bl_init()           | bl_elem *b       | initializes b for n elements. All are
                     | uint32_t n       | in the free list.
 --------------------+------------------+---------------------------------------
 bl_put()            | bl_elem *b       | releases cell <n> to the free list,
                     | uint32_t n       | possibly deleting the user. Returns
                     | ret: uint32_t    | next cell idx or 0 if none (last one).
 --------------------+------------------+---------------------------------------
 bl_deinit()         | bl_elem *b       | only when DEBUG_STRICT==2, scans the
                     |                  | array to check for leaks.
 --------------------+------------------+---------------------------------------
 bl_get()            | bl_elem *b       | allocates a new cell after cell <n>, or
                     | uint32_t n       | for a new stream if n is 0. Returns the
                     | ret: uint32_t    | cell or 0 if no more space.
 ====================+==================+=======================================
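The representation and the example above can be reproduced with a small C sketch. It is not HAProxy's implementation: it only follows the rules listed in section 2, with a local stand-in for struct buffer. Two points are assumptions made for this sketch: a released cell pushed onto an empty free list is terminated with ~0 so that it cannot be mistaken for the start of a never-used region, and bl_get() refuses to append after a cell that is not the last one of its list.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BL_END (~0U)   /* end-of-list marker, shown as "~" in the example */

/* Local stand-in for HAProxy's struct buffer, same fields as shown above. */
struct buffer {
    size_t size;   /* head: size of the array in number of elements */
    char  *area;   /* head: not used (0) */
    size_t data;   /* head: number of elements allocated */
    size_t head;   /* head: number of users */
};

struct bl_elem {
    struct buffer buf;
    uint32_t next;
    uint32_t flags;
};

/* Initializes <b> for <n> elements; all cells but the head are free. */
static void bl_init(struct bl_elem *b, uint32_t n)
{
    memset(b, 0, n * sizeof(*b));
    b->buf.size = n;
    b->next = (n > 1) ? 1 : 0;
}

/* Allocates a cell after cell <idx>, or for a new user if <idx> is 0.
 * Returns the new cell's index, or 0 if no cell could be allocated. */
static uint32_t bl_get(struct bl_elem *b, uint32_t idx)
{
    uint32_t e = b->next;   /* first cell of the free list */
    uint32_t n;

    if (!e)
        return 0;           /* free list depleted */
    if (idx && b[idx].next != BL_END)
        return 0;           /* sketch choice: only append at a list's end */

    n = b[e].next;
    if (n == 0)             /* never-used region: the next cell follows */
        n = (e + 1 < b->buf.size) ? e + 1 : 0;
    else if (n == BL_END)   /* last released cell of the free list */
        n = 0;
    b->next = n;

    b[e].next = BL_END;
    if (idx)
        b[idx].next = e;
    else
        b->buf.head++;      /* one more user */
    b->buf.data++;
    return e;
}

/* Releases cell <idx>, the first cell of its list, to the free list.
 * Returns the next cell of that list, or 0 if it was the last one. */
static uint32_t bl_put(struct bl_elem *b, uint32_t idx)
{
    uint32_t n;

    if (!idx)
        return 0;
    n = b[idx].next;
    b[idx].next = b->next ? b->next : BL_END;
    b->next = idx;
    b->buf.data--;
    if (n == BL_END) {
        b->buf.head--;      /* the user's list is now empty */
        return 0;
    }
    return n;
}
```

Running the sequence from section 3 on a 10-element array (bl_get(0), bl_get(1), bl_get(2), bl_get(0), then bl_put(1)) goes through the same 'next' values as the arrays shown there.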

doc/internals/api/event_hdl.txt

 @ -1,12 +1,12 @@
                    -----------------------------------------
-                         event_hdl Guide - version 2.8
-                          ( Last update: 2022-11-14 )
+                         event_hdl Guide - version 3.1
+                          ( Last update: 2024-06-21 )
                    ------------------------------------------
 ABSTRACT
 --------
-The event_hdl support is a new feature of HAProxy 2.7. It is a way to easily
+The event_hdl support is a new feature of HAProxy 2.8. It is a way to easily
 handle general events in a simple to maintain fashion, while keeping core code
 impact to the bare minimum.
 @ -38,7 +38,7 @@ SUMMARY
 1. EVENT_HDL INTRODUCTION
------------------------
+-------------------------
 EVENT_HDL provides two complementary APIs, both are implemented
 in src/event_hdl.c and include/haproxy/event_hdl(-t).h:
 @ -52,7 +52,7 @@ an event that is happening in the process.
 (See section 3.)
 2. HOW TO HANDLE EXISTING EVENTS
----------------------
+--------------------------------
 To handle existing events, you must first decide which events you're
 interested in.
 @ -197,7 +197,7 @@ event subscription is performed using the function:
 	As the name implies, anonymous subscriptions don't support lookups.
 2.1 SYNC MODE
----------------------
+-------------
 Example, you want to register a sync handler that will be called when
 a new server is added.
 @ -280,12 +280,12 @@ identified subscription where freeing private is required when subscription ends
 ```
 2.2 ASYNC MODE
----------------------
+--------------
 As mentioned before, async mode comes in 2 flavors, normal and task.
 2.2.1 NORMAL VERSION
----------------------
+--------------------
 Normal is meant to be really easy to use, and highly compatible with sync mode.
@@ -379,7 +379,7 @@ identified subscription where freeing private is required when subscription ends
 ```
 2.2.2 TASK VERSION
----------------------
+------------------
 task version requires a bit more setup, but it's pretty
 straightforward actually.
@@ -510,14 +510,14 @@ Note:  it is not recommended to perform multiple subscriptions
 			   that might already be freed. Thus UAF will occur.
 2.3 ADVANCED FEATURES
------------------------
+---------------------
 We've already covered some of these features in the previous examples.
 Here is a documented recap.
 2.3.1 SUB MGMT
------------------------
+--------------
 From an event handler context, either sync or async mode:
 	You have the ability to directly manage the subscription
@@ -565,7 +565,7 @@ task and notify async modes (from the event):
 ```
 2.3.2 SUBSCRIPTION EXTERNAL LOOKUPS
------------------------
+-----------------------------------
 As you've seen in 2.3.1, managing the subscription directly
 from the handler is a possibility.
@@ -620,7 +620,7 @@ unsubscribing:
 ```
 2.3.3 SUBSCRIPTION PTR
------------------------
+----------------------
 To manage existing subscriptions from external code,
 we already talked about identified subscriptions that
@@ -720,7 +720,7 @@ Example:
 ```
 2.3.4 PRIVATE FREE
------------------------
+------------------
 Upon handler subscription, you have the ability to provide
 a private data pointer that will be passed to the handler
@@ -777,7 +777,7 @@ Then:
 ```
 3. HOW TO ADD SUPPORT FOR NEW EVENTS
------------------------
+-----------------------------------
 Adding support for a new event is pretty straightforward.
@@ -787,9 +787,20 @@ First, you need to declare a new event subtype in event_hdl-t.h file
 You might want to declare a whole new event family, in which case
 you declare both the new family and the associated subtypes (if any).
+Up to 256 families containing 16 subtypes each are supported by the API.
+Family 0 is reserved for special events, which means there are 255 usable
+families.
+You can declare a family using EVENT_HDL_SUB_FAMILY(x) where x is the
+family.
+You can declare a subtype using EVENT_HDL_SUB_TYPE(x, y) where x is the
+family previously declared and y the subtype, Subtypes range from 1 to
+16 (included), 0 is not a valid subtype.
 ```
 	#define EVENT_HDL_SUB_NEW_FAMILY                EVENT_HDL_SUB_FAMILY(4)
-	#define EVENT_HDL_SUB_NEW_FAMILY_SUBTYPE_1      EVENT_HDL_SUB_TYPE(4,0)
+	#define EVENT_HDL_SUB_NEW_FAMILY_SUBTYPE_1      EVENT_HDL_SUB_TYPE(4,1)
 ```
 Then, you need to update the event_hdl_sub_type_map map,
@@ -803,7 +814,7 @@ Please follow this procedure:
 	You added a new family: go to section 3.1
 3.1 DECLARING A NEW EVENT DATA STRUCTURE
------------------------
+----------------------------------------
 You have the ability to provide additional data for a given
 event family when such events occur.
@@ -943,7 +954,7 @@ Event publishing can be performed from anywhere in the code.
 --------------------------------------------------------------------------------
 4. SUBSCRIPTION LISTS
------------------------
+--------------------
 As you may already know, EVENT_HDL API main functions rely on
 subscription lists.
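
The limits stated in the event_hdl.txt hunk above (256 families, 16 subtypes per family, subtype 0 invalid) are what one would expect if a family is stored as a small integer and each subtype as one bit of a 16-bit mask. The sketch below only illustrates that reading. `struct demo_sub_type` and all `demo_*` helpers are hypothetical names chosen for this example; they are not the definitions found in event_hdl-t.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a subscription type (assumption, not the
 * real HAProxy structure): an 8-bit family and a 16-bit subtype mask.
 */
struct demo_sub_type {
	uint8_t  family;   /* 0..255; family 0 reserved, so 255 usable */
	uint16_t subtype;  /* one bit per subtype, hence 16 subtypes */
};

/* Whole family: every subtype bit is set. */
static inline struct demo_sub_type demo_sub_family(uint8_t family)
{
	struct demo_sub_type t = { .family = family, .subtype = 0xFFFF };
	return t;
}

/* Single subtype <y> (1..16) of <family>. With this encoding y == 0 has
 * no bit to map to, so an empty mask is returned for out-of-range values.
 */
static inline struct demo_sub_type demo_sub_one(uint8_t family, unsigned int y)
{
	struct demo_sub_type t = { .family = family, .subtype = 0 };

	if (y >= 1 && y <= 16)
		t.subtype = (uint16_t)(1u << (y - 1));
	return t;
}

/* Returns non-zero if event <e> is covered by subscription <s>. */
static inline int demo_sub_match(struct demo_sub_type s, struct demo_sub_type e)
{
	return s.family == e.family && (s.subtype & e.subtype) != 0;
}
```

Under this assumption, a subscription built with `demo_sub_family(4)` matches any event produced with `demo_sub_one(4, y)` for y in 1..16, while `demo_sub_one(4, 0)` yields an empty mask that matches nothing.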

17

doc/internals/api/htx-api.txt


@@ -540,14 +540,15 @@ message. These functions are used by HTX analyzers or by multiplexers.
       the amount of data drained.
     - htx_xfer_blks() transfers HTX blocks from an HTX message to another,
-      stopping on the first block of a specified type or when a specific amount
-      of bytes, including meta-data, was moved. If the tail block is a DATA
-      block, it may be partially moved. All other block are transferred at once
-      or kept. This function returns a mixed value, with the last block moved,
-      or NULL if nothing was moved, and the amount of data transferred. When
-      HEADERS or TRAILERS blocks must be transferred, this function transfers
-      all of them. Otherwise, if it is not possible, it triggers an error. It is
-      the caller responsibility to transfer all headers or trailers at once.
+      stopping after the first block of a specified type is transferred or when
+      a specific amount of bytes, including meta-data, was moved. If the tail
+      block is a DATA block, it may be partially moved. All other block are
+      transferred at once or kept. This function returns a mixed value, with the
+      last block moved, or NULL if nothing was moved, and the amount of data
+      transferred. When HEADERS or TRAILERS blocks must be transferred, this
+      function transfers all of them. Otherwise, if it is not possible, it
+      triggers an error. It is the caller responsibility to transfer all headers
+      or trailers at once.
     - htx_append_msg() append an HTX message to another one. All the message is
       copied or nothing. So, if an error occurred, a rollback is performed. This

10

doc/internals/api/initcalls.txt


@@ -314,6 +314,16 @@ alphanumerically ordered:
   call to cfg_register_section() with the three arguments at stage
   STG_REGISTER.
+  You can only register a section once, but you can register post callbacks
+  multiple time for this section with REGISTER_CONFIG_POST_SECTION().
+- REGISTER_CONFIG_POST_SECTION(name, post)
+  Registers a function which will be called after a section is parsed. This is
+  the same as the <post> argument in REGISTER_CONFIG_SECTION(), the difference
+  is that it allows to register multiple <post> callbacks and to register them
+  elsewhere in the code.
 - REGISTER_PER_THREAD_ALLOC(fct)
   Registers a call to register_per_thread_alloc(fct) at stage STG_REGISTER.
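
The behaviour described for REGISTER_CONFIG_POST_SECTION() in the hunk above (several post-parsing callbacks attached to one section name, possibly registered from different places) can be pictured with a small registry. This is a minimal sketch, not HAProxy's implementation: `demo_register_post_section()`, `demo_run_post_section()`, the fixed-size table and the two sample callbacks are all hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_POST 8

/* Hypothetical registry entry: one post-parsing callback bound to a section. */
struct demo_post_cb {
	const char *section;
	int (*post)(void);
};

static struct demo_post_cb demo_post_cbs[DEMO_MAX_POST];
static size_t demo_post_count;

/* Appends <post> to the callbacks of <section>.
 * Returns 0 on success, -1 if the table is full.
 */
static int demo_register_post_section(const char *section, int (*post)(void))
{
	if (demo_post_count >= DEMO_MAX_POST)
		return -1;
	demo_post_cbs[demo_post_count].section = section;
	demo_post_cbs[demo_post_count].post = post;
	demo_post_count++;
	return 0;
}

/* Runs every callback registered for <section>, in registration order.
 * Returns the bitwise OR of their return values (0 if all returned 0).
 */
static int demo_run_post_section(const char *section)
{
	int err = 0;
	size_t i;

	for (i = 0; i < demo_post_count; i++)
		if (strcmp(demo_post_cbs[i].section, section) == 0)
			err |= demo_post_cbs[i].post();
	return err;
}

/* Two sample callbacks that only record that they were called. */
static int demo_calls;
static int demo_post_a(void) { demo_calls += 1;  return 0; }
static int demo_post_b(void) { demo_calls += 10; return 0; }
```

Registering `demo_post_a` and `demo_post_b` for the same section name and then calling `demo_run_post_section()` with that name invokes both, which is the property the documentation change highlights.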

86

doc/internals/api/memory.txt Normal file


@@ -0,0 +1,86 @@
+2025-08-13 - Memory allocation in HAProxy 3.3
+The vast majority of dynamic memory allocations are performed from pools. Pools
+are optimized to store pre-calibrated objects of the right size for a given
+usage, try to favor locality and hot objects as much as possible, and are
+heavily instrumented to detect and help debug a wide class of bugs including
+buffer overflows, use-after-free, etc.
+For objects of random sizes, or those used only at configuration time, pools
+are not suited, and the regular malloc/free family is available, in addition of
+a few others.
+The standard allocation calls are intercepted at the code level (#define) when
+the code is compiled with -DDEBUG_MEM_STATS. For this reason, these calls are
+redefined as macros in "bug.h", and one must not try to use the pointers to
+such functions, as this may break DEBUG_MEM_STATS. This provides fine-grained
+stats about allocation/free per line of source code using locally implemented
+counters that can be consulted by "debug dev memstats". The calls are
+categorized into one of "calloc", "free", "malloc", "realloc", "strdup",
+"p_alloc", "p_free", the latter two designating pools. Extra calls such as
+memalign() and similar are also intercepted and counted as malloc.
+Due to the nature of this replacement, DEBUG_MEM_STATS cannot see operations
+performed in libraries or dependencies.
+In addition to DEBUG_MEM_STATS, when haproxy is built with USE_MEMORY_PROFILING
+the standard functions are wrapped by new ones defined in "activity.c", which
+also hold counters by call place. These ones are able to trace activity in
+libraries because the functions check the return pointer to figure where the
+call was made. The approach is different and relies on a large hash table. The
+files, function names and line numbers are not know, but by passing the pointer
+to dladdr(), we can often resolve most of these symbols. These operations are
+consulted via "show profiling memory". It must first be enabled either in the
+global config "profiling.memory on" or the CLI using "set profiling memory on".
+Memory profiling can also track pool allocations and frees thanks to knowing
+the size of the element and knowing a place where to store it. Some future
+evolutions might consider making this possible as well for pure malloc/free
+too by leveraging malloc_usable_size() a bit more.
+Finally, 3.3 brought aligned allocations. These are made available via a new
+family of functions around ha_aligned_alloc() that simply map to either
+posix_memalign(), memalign() or _aligned_malloc() for CYGWIN, depending on
+which one is available. This latter one requires to pass the pointer to
+_aligned_free() instead of free(), so for this reason, all aligned allocations
+have to be released using ha_aligned_free(). Since this mostly happens on
+configuration elements, in practice it's not as inconvenient as it can sound.
+These functions are in reality macros handled in "bug.h" like the previous
+ones in order to deal with DEBUG_MEM_STATS. All "alloc" variants are reported
+in memstats as "malloc". All "zalloc" variants are reported in memstats as
+"calloc".
+The currently available allocators are the following:
+  - void *ha_aligned_alloc(size_t align, size_t size)
+  - void *ha_aligned_zalloc(size_t align, size_t size)
+    Equivalent of malloc() but aligned to <align> bytes. The alignment MUST be
+    at least as large as one word and MUST be a power of two. The "zalloc"
+    variant also zeroes the area on success. Both return NULL on failure.
+  - void *ha_aligned_alloc_safe(size_t align, size_t size)
+  - void *ha_aligned_zalloc_safe(size_t align, size_t size)
+    Equivalent of malloc() but aligned to <align> bytes. The alignment is
+    automatically adjusted to the nearest larger power of two that is at least
+    as large as a word. The "zalloc" variant also zeroes the area on
+    success. Both return NULL on failure.
+  - (type *)ha_aligned_alloc_typed(size_t count, type)
+    (type *)ha_aligned_zalloc_typed(size_t count, type)
+    This macro returns an area aligned to the required alignment for type
+    <type>, large enough for <count> objects of this type, and the result is a
+    pointer of this type. The goal is to ease allocation of known structures
+    whose alignment is not necessarily known to the developer (and to avoid
+    encouraging to hard-code alignment). The cast in return also provides a
+    last-minute control in case a wrong type is mistakenly used due to a poor
+    copy-paste or an extra "*" after the type. When DEBUG_MEM_STATS is in use,
+    the type is stored as a string in the ".extra" field so that it can be
+    displayed in "debug dev memstats". The "zalloc" variant also zeroes the
+    area on success. Both return NULL on failure.
+  - void ha_aligned_free(void *ptr)
+    Frees the area pointed to by ptr. It is the equivalent of free() but for
+    objects allocated using one of the functions above.
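
To make the "_safe" and "_typed" descriptions in memory.txt above more concrete, here is a minimal sketch of how such wrappers could be layered on posix_memalign(). It is an illustration under stated assumptions, not the code from "bug.h": every `demo_*` name is hypothetical, only the POSIX path is covered (no memalign() or _aligned_malloc() fallback), and no DEBUG_MEM_STATS accounting is attempted.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Rounds <align> up to a power of two that is at least one pointer wide.
 * Returns 0 if no such value fits in a size_t.
 */
static size_t demo_fix_align(size_t align)
{
	size_t a = sizeof(void *);

	while (a < align) {
		if (a > SIZE_MAX / 2)
			return 0;
		a <<= 1;
	}
	return a;
}

/* malloc()-like, aligned to the adjusted <align>. Returns NULL on failure. */
static void *demo_aligned_alloc_safe(size_t align, size_t size)
{
	void *ptr = NULL;

	align = demo_fix_align(align);
	if (!align || posix_memalign(&ptr, align, size) != 0)
		return NULL;
	return ptr;
}

/* Same as above, but the area is zeroed on success. */
static void *demo_aligned_zalloc_safe(size_t align, size_t size)
{
	void *ptr = demo_aligned_alloc_safe(align, size);

	if (ptr)
		memset(ptr, 0, size);
	return ptr;
}

/* Typed variant: the alignment comes from the type (C11 _Alignof) and the
 * cast makes the compiler complain if the result is assigned to a pointer
 * of another type.
 */
#define demo_aligned_alloc_typed(count, type) \
	((type *)demo_aligned_alloc_safe(_Alignof(type), (count) * sizeof(type)))

/* Single release function, so callers never need to know which allocator
 * was used underneath. On POSIX this is plain free().
 */
static void demo_aligned_free(void *ptr)
{
	free(ptr);
}
```

In this sketch a request such as `demo_aligned_zalloc_safe(48, 100)` is served with a 64-byte alignment, and `demo_aligned_alloc_typed(4, double)` yields a `double *` without the caller spelling out any alignment. Keeping a dedicated free function mirrors the reason the document gives for ha_aligned_free(): some platforms need a matching release call.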

Compare commits

3823 Commits v3.0-dev10 ... master

6 .cirrus.yml
34 .github/actions/setup-vtest/action.yml vendored Normal file
6 .github/h2spec.config vendored
129 .github/matrix.py vendored
12 .github/workflows/aws-lc-fips.yml vendored Normal file
94 .github/workflows/aws-lc-template.yml vendored Normal file
60 .github/workflows/aws-lc.yml vendored
9 .github/workflows/codespell.yml vendored
17 .github/workflows/compliance.yml vendored
2 .github/workflows/contrib.yml vendored
13 .github/workflows/coverity.yml vendored
7 .github/workflows/cross-zoo.yml vendored
19 .github/workflows/fedora-rawhide.yml vendored
24 .github/workflows/illumos.yml vendored Normal file
20 .github/workflows/musl.yml vendored
6 .github/workflows/netbsd.yml vendored
82 .github/workflows/openssl-ech.yml vendored Normal file
77 .github/workflows/openssl-master.yml vendored Normal file
33 .github/workflows/openssl-nodeprecated.yml vendored
104 .github/workflows/quic-interop-aws-lc.yml vendored Normal file
102 .github/workflows/quic-interop-libressl.yml vendored Normal file
74 .github/workflows/quictls.yml vendored Normal file
79 .github/workflows/vtest.yml vendored
2 .github/workflows/windows.yml vendored
80 .github/workflows/wolfssl.yml vendored Normal file
1 .gitignore vendored
2 .travis.yml
12 BRANCHES
3832 CHANGELOG
2 CONTRIBUTING
120 INSTALL
2 MAINTAINERS
203 Makefile
22 README
62 README.md Normal file
2 VERDATE
2 VERSION
3 addons/deviceatlas/Makefile.inc
2 addons/deviceatlas/dummy/dac.h
6 addons/ot/README
2 addons/ot/src/filter.c
17 addons/ot/src/parser.c
13 addons/ot/src/vars.c
17 addons/promex/README
11 addons/promex/include/promex/promex.h
745 addons/promex/service-prometheus.c
674 admin/acme.sh/LICENSE
13 admin/acme.sh/README
403 admin/acme.sh/haproxy.sh
235 admin/cli/haproxy-dump-certs Executable file
113 admin/cli/haproxy-reload Executable file
29 admin/halog/halog.c
15 admin/release-estimator/README.md
4 admin/release-estimator/release-estimator.py
3 admin/release-estimator/requirements.txt Normal file
6 admin/systemd/haproxy.service.in
34 dev/coccinelle/unchecked-calloc.cocci Normal file
34 dev/coccinelle/unchecked-malloc.cocci Normal file
34 dev/coccinelle/unchecked-strdup.cocci Normal file
18 dev/flags/flags.c
2 dev/flags/show-fd-to-flags.sh
2 dev/flags/show-sess-to-flags.sh
118 dev/gdb/ebtree.gdb Normal file
26 dev/gdb/list.gdb Normal file
19 dev/gdb/memprof.dbg Normal file
21 dev/gdb/pools.gdb Normal file
47 dev/gdb/post-mortem.gdb Normal file
25 dev/gdb/proxies.gdb Normal file
9 dev/gdb/servers.gdb Normal file
18 dev/gdb/stream.gdb Normal file
247 dev/h2/h2-tracer.lua Normal file
6 dev/haring/haring.c
31 dev/ncpu/Makefile Normal file
136 dev/ncpu/ncpu.c Normal file
4 dev/patchbot/prompts/prompt15-3.0-mist7bv2-pfx.txt → dev/patchbot/prompts/prompt15-3.1-mist7bv2-pfx.txt
2 dev/patchbot/prompts/prompt15-3.0-mist7bv2-sfx.txt → dev/patchbot/prompts/prompt15-3.1-mist7bv2-sfx.txt
70 dev/patchbot/prompts/prompt15-3.2-mist7bv2-pfx.txt Normal file
29 dev/patchbot/prompts/prompt15-3.2-mist7bv2-sfx.txt Normal file

70

dev/patchbot/prompts/prompt15-3.3-mist7bv2-pfx.txt Normal file
