haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-09-27 08:41:31 +02:00

Author	SHA1	Message	Date
Willy Tarreau	099c1b2442	BUG/MAJOR: queue: properly keep count of the queue length The queue length was moved to its own variable in commit 583303c48 ("MINOR: proxies/servers: Calculate queueslength and use it."), however a few places were missed in pendconn_unlink() and assign_server_and_queue() resulting in never decreasing counts on aborted streams. This was reproduced when injecting more connections than the total backend could stand in TCP mode and letting some of them time out in the queue. No backport is needed, this is only 3.2.	2025-05-17 10:46:10 +02:00
Willy Tarreau	6be02d1c6e	BUG/MAJOR: leastconn: do not loop forever when facing saturated servers Since commit 9fe72bba3 ("MAJOR: leastconn; Revamp the way servers are ordered."), there's no way to escape the loop visiting the mt_list heads in fwlc_get_next_server if all servers in the list are saturated, resulting in a watchdog panic. It can be reproduced with this config and injecting with more than 2 concurrent conns: balance leastconn server s1 127.0.0.1:8000 maxconn 1 server s2 127.0.0.1:8000 maxconn 1 Here we count the number of saturated servers that were encountered, and escape the loop once the number of remaining servers exceeds the number of saturated ones. No backport is needed since this arrived in 3.2.	2025-05-17 10:44:36 +02:00
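The escape condition described in the leastconn fix above can be sketched as follows. This is a simplified Python model, not HAProxy's C code: the `Server` class, the `pick_server` name and the plain ring walk are assumptions made for illustration, and the real `fwlc_get_next_server` works on mt_list heads ordered by load.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Server:
    """Hypothetical stand-in for a backend server."""
    name: str
    maxconn: int
    served: int = 0

    def saturated(self) -> bool:
        return self.served >= self.maxconn


def pick_server(servers: Sequence[Server], start: int = 0) -> Optional[Server]:
    """Walk the ring of servers and give up once every one was seen saturated."""
    if not servers:
        return None
    saturated_seen = 0
    idx = start
    # Without the counter, this loop would spin forever when all servers are full.
    while saturated_seen < len(servers):
        srv = servers[idx % len(servers)]
        if not srv.saturated():
            return srv
        saturated_seen += 1
        idx += 1
    return None
```

With two servers limited to `maxconn 1` each, as in the reproducer config, a third concurrent connection makes `pick_server` return `None` instead of looping.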
Willy Tarreau	ccc65012d3	IMPORT: slz: silence a build warning on non-x86 non-arm Building with clang 16 on MIPS64 yields this warning: src/slz.c:931:24: warning: unused function 'crc32_uint32' [-Wunused-function] static inline uint32_t crc32_uint32(uint32_t data) ^ Let's guard it using UNALIGNED_LE_OK which is the only case where it's used. This saves us from introducing a possibly non-portable attribute. This is libslz upstream commit f5727531dba8906842cb91a75c1ffa85685a6421.	2025-05-16 16:43:53 +02:00
Willy Tarreau	31ca29eee1	IMPORT: slz: fix header used for empty zlib message Calling slz_rfc1950_finish() without emitting any data would result in incorrectly emitting a gzip header (rfc1952) instead of a zlib header (rfc1950) due to a copy-paste between the two wrappers. The impact is almost inexistent since the zlib format is almost never used in this context, and compressing totally empty messages is quite rare as well. Let's take this opportunity for fixing another mistake on an RFC number in a comment. This is slz upstream commit 7f3fce4f33e8c2f5e1051a32a6bca58e32d4f818.	2025-05-16 16:43:53 +02:00
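To make the difference between the two headers concrete, here is a small check written against Python's standard `zlib` and `gzip` modules rather than libslz (which this sketch does not call). The helper names are illustrative; the header rules come from RFC 1950 and RFC 1952.

```python
import gzip
import zlib

# Compress an empty message with each wrapper.
empty_zlib = zlib.compress(b"")
empty_gzip = gzip.compress(b"")


def is_zlib_header(data: bytes) -> bool:
    """RFC 1950: CMF low nibble is 8 (deflate) and CMF*256+FLG is a multiple of 31."""
    return (
        len(data) >= 2
        and (data[0] & 0x0F) == 8
        and ((data[0] << 8) | data[1]) % 31 == 0
    )


def is_gzip_header(data: bytes) -> bool:
    """RFC 1952: a gzip member starts with the magic bytes 0x1f 0x8b."""
    return data[:2] == b"\x1f\x8b"
```

A receiver expecting the zlib format would reject an empty message that starts with the gzip magic bytes, which is the mismatch the fix addresses.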
Willy Tarreau	411b04c7d3	IMPORT: slz: use a better hash for machines with a fast multiply The current hash involves 3 simple shifts and additions so that it can be mapped to a multiply on architectures having a fast multiply. This is indeed what the compiler does on x86_64. A large range of values was scanned to try to find more optimal factors on machines supporting such a fast multiply, and it turned out that new factor 0x1af42f resulted in smoother hashes that provided on average 0.4% better compression on both the Silesia corpus and an mbox file composed of very compressible emails and uncompressible attachments. It's even slightly better than CRC32C while being faster on Skylake. This patch enables this factor on archs with a fast multiply. This is slz upstream commit 82ad1e75c13245a835c1c09764c89f2f6e8e2a40.	2025-05-16 16:43:53 +02:00
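The equivalence between a shift-and-add hash and a single multiply can be illustrated with a short sketch. Only the factor `0x1af42f` comes from the commit message; the shift amounts, the 32-bit width, the 13-bit table size and the function names are assumptions chosen for the example and are not taken from slz.

```python
MASK32 = 0xFFFFFFFF
FACTOR = 0x1AF42F  # factor quoted in the commit message


def shift_add_hash(x: int, a: int, b: int, c: int) -> int:
    """Hypothetical mix made of three shifts and additions, truncated to 32 bits."""
    return ((x << a) + (x << b) + (x << c)) & MASK32


def multiply_hash(x: int, factor: int) -> int:
    """Same kind of mix expressed as one 32-bit multiply."""
    return (x * factor) & MASK32


def bucket(x: int, bits: int = 13) -> int:
    """Keep the top `bits` bits of the product (table size is an assumption)."""
    return multiply_hash(x, FACTOR) >> (32 - bits)
```

Any sum of shifted copies of `x` equals `x` multiplied by the sum of the corresponding powers of two, which is why a compiler can turn the former into a multiply. An arbitrary factor such as `0x1af42f` has more set bits, so it is only attractive where the multiply itself is cheap.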
Willy Tarreau	248bbec83c	IMPORT: slz: support crc32c for lookup hash on sse4 but only if requested If building for sse4 and USE_CRC32C_HASH is defined, then we can use crc32c to calculate the lookup hash. By default we don't do it because even on skylake it's slower than the current hash, which only involves a short multiply (~5% slower). But the gains are marginal (0.3%). This is slz upstream commit 44ae4f3f85eb275adba5844d067d281e727d8850. Note: this is not used by default and only merged in order to avoid divergence between the code bases.	2025-05-16 16:43:53 +02:00
Willy Tarreau	ea1b70900f	IMPORT: slz: avoid multiple shifts on 64-bits On 64-bit platforms, disassembling the code shows that send_huff() performs a left shift followed by a right one, which are the result of integer truncation and zero-extension caused solely by using different types at different levels in the call chain. By making encode24() take a 64-bit int on input and send_huff() take one optionally, we can remove one shift in the hot path and gain 1% performance without affecting other platforms. This is slz upstream commit fd165b36c4621579c5305cf3bb3a7f5410d3720b.	2025-05-16 16:43:53 +02:00
Willy Tarreau	0a91c6dcae	BUILD: debug: mark ha_crash_now() as attribute(noreturn) Building on MIPS64 with clang16 incorrectly reports some uninitialized value warnings in stats-proxy.c due to some calls to ABORT_NOW() where the compiler didn't know the code wouldn't return. Let's properly mark the function as noreturn, and take this opportunity for also marking it unused to avoid possible warnings depending on the build options (if ABORT_NOW is not used). No backport needed though it will not harm.	2025-05-16 16:43:53 +02:00
William Lallemand	1eebf98952	DOC: management: change reference to configuration manual Since e24b77e7 ('DOC: config: move the extraneous sections out of the "global" definition') the ACME section of the configuration manual was moved from 3.13 to 12.8. Change the reference to that section in "acme renew".	2025-05-16 16:01:43 +02:00
Willy Tarreau	81e46be026	DOC: config: properly index "table" and "stick-table" in their section Tim reported in issue #2953 that "stick-table" and "table" were not indexed as keywords. The issue was the indent level. Also let's make sure to put a box around the "store" arguments as well.	2025-05-16 15:37:03 +02:00
Willy Tarreau	df00164fdd	BUG/MEDIUM: h1/h2/h3: reject forbidden chars in the Host header field In continuation with 9a05c1f574 ("BUG/MEDIUM: h2/h3: reject some forbidden chars in :authority before reassembly") and the discussion in issue #2941, @DemiMarie rightfully suggested that Host should also be sanitized, because it is sometimes used in concatenation, such as this: http-request set-url https://%[req.hdr(host)]%[pathq] which was proposed as a workaround for h2 upstream servers that require :authority here: https://www.mail-archive.com/haproxy@formilux.org/msg43261.html The current patch then adds the same check for forbidden chars in the Host header, using the same function as for the patch above, since in both cases we validate the host:port part of the authority. This way we won't reconstruct ambiguous URIs by concatenating Host and path. Just like the patch above, this can be backported after a period of observation.	2025-05-16 15:13:17 +02:00
Willy Tarreau	b84762b3e0	BUG/MINOR: h3: don't insert more than one Host header Let's make sure we drop extraneous Host headers after having compared them. That also works when :authority was already present. This way, like for h1 and h2, we only keep one copy of it, while still making sure that Host matches :authority. This way, if a request has both :authority and Host, only one Host header will be produced (from :authority). Note that due to the different organization of the code and wording along the evolving RFCs, here we also check that all duplicates are identical, while h2 ignores them as per RFC7540, but this will be re-unified later. This should be backported to stable versions, at least 2.8, though thanks to the existing checks the impact is probably null.	2025-05-16 15:13:17 +02:00
Christopher Faulet	f45a632bad	BUG/MEDIUM: stconn: Disable 0-copy forwarding for filters altering the payload It is especially a problem with Lua filters, but it is important to disable the 0-copy forwarding if a filter alters the payload, or at least to be able to disable it. While the filter is registered on the data filtering, it is not an issue (and it is the common case) because there is no way to fast-forward data at all. But it may be an issue if a filter decides to alter the payload and to unregister from data filtering. In that case, the 0-copy forwarding can be re-enabled in a hardly predictable state. To fix the issue, an SC flag was added to do so. The HTTP compression filter sets it, and Lua filters do too if the body length is changed (via HTTPMessage.set_body_len()). Note that it is an issue because of a bad design about the HTX. Many info about the message are stored in the HTX structure itself. It must be refactored to move several info to the stream-endpoint descriptor. This should ease modifications at the stream level, from filters or TCP/HTTP rules. This should be backported as far as 3.0. If necessary, it may be backported on lower versions, as far as 2.6. In that case, it must be reviewed and adapted.	2025-05-16 15:11:37 +02:00
Christopher Faulet	94055a5e73	MEDIUM: hlua: Add function to change the body length of an HTTP Message There was no function for a lua filter to change the body length of an HTTP Message. But it is mandatory to be able to alter the message payload. It is not possible to directly update the message headers because the internal state of the message must also be updated accordingly. It is the purpose of HTTPMessage.set_body_len() function. The new body length must be passed as argument. If it is an integer, the right "Content-Length" header is set. If the "chunked" string is used, it forces the message to be chunked-encoded and in that case the "Transfer-Encoding" header is set. This patch should fix the issue #2837. It could be backported as far as 2.6.	2025-05-16 14:34:12 +02:00
Willy Tarreau	f2d7aa8406	BUG/MEDIUM: peers: also limit the number of incoming updates There's a configurable limit to the number of messages sent to a peer (tune.peers.max-updates-at-once), but this one is not applied to the receive side. While it can usually be OK with default settings, setups involving a large tune.bufsize (1MB and above) regularly experience high latencies and even watchdogs during reloads because the full learning process sends a lot of data that manages to fill the entire buffer, and due to the compactness of the protocol, 1MB of buffer can contain more than 100k updates, meaning taking locks etc during this time, which is not workable. Let's make sure the receiving side also respects the max-updates-at-once setting. For this it counts incoming updates, and refrains from continuing once the limit is reached. It's a bit tricky to do because after receiving updates we still have to send ours (and possibly some ACKs) so we cannot just leave the loop. This issue was reported on 3.1 but it should progressively be backported to all versions having the max-updates-at-once option available.	2025-05-15 16:57:21 +02:00
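The receive-side limit described above amounts to counting updates and stopping at a cap while leaving the rest for a later wakeup. A minimal Python sketch of that idea follows. The function name, the deque of raw updates and the `apply` callback are hypothetical, and the real peers code must also keep sending its own updates and ACKs, which this model leaves out.

```python
from collections import deque
from typing import Callable, Deque


def receive_updates(
    pending: Deque[bytes],
    apply: Callable[[bytes], None],
    max_updates: int,
) -> int:
    """Apply at most `max_updates` pending updates and return how many were applied.

    Whatever is left in `pending` stays there for the next call, so a single
    pass never processes an entire large buffer in one go.
    """
    processed = 0
    while pending and processed < max_updates:
        apply(pending.popleft())
        processed += 1
    return processed
```

With ten queued updates and a cap of four, three successive calls process 4, 4 and 2 updates, so other work can be scheduled between the passes.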
Aurelien DARRAGON	098a5e5c0b	BUG/MINOR: sink: detect and warn when using "send-proxy" options with ring servers Using "send-proxy" or "send-proxy-v2" option on a ring server is not relevant nor supported. Worse, on 2.4 it causes haproxy process to crash as reported in GH #2965. Let's be more explicit about the fact that this keyword is not supported under "ring" context by ignoring the option and emitting a warning message to inform the user about that. Ideally, we should do the same for peers and log servers. The proper way would be to check servers options during postparsing but we currently lack proper cross-type server postparsing hooks. This will come later and thus will give us a chance to perform the compatibility checks for server options depending on proxy type. But for now let's simply fix the "ring" case since it is the only one that's known to cause a crash. It may be backported to all stable versions.	2025-05-15 16:18:31 +02:00
Basha Mougamadou	824bb93e18	DOC: configuration: explicit multi-choice on bind shards option From the documentation, this wasn't clear enough that shards should be followed by one of the options number / by-thread / by-group. Align it with existing options in documentation so that it becomes more explicit.	2025-05-14 19:41:38 +02:00
Willy Tarreau	17df04ff09	[RELEASE] Released version 3.2-dev16 Released version 3.2-dev16 with the following main changes : - BUG/MEDIUM: mux-quic: fix crash on invalid fctl frame dereference - DEBUG: pool: permit per-pool UAF configuration - MINOR: acme: add the global option 'acme.scheduler' - DEBUG: pools: add a new integrity mode "backup" to copy the released area - MEDIUM: sock-inet: re-check IPv6 connectivity every 30s - BUG/MINOR: ssl: doesn't fill conf->crt with first arg - BUG/MINOR: ssl: prevent multiple 'crt' on the same ssl-f-use line - BUG/MINOR: ssl/ckch: always free() the previous entry during parsing - MINOR: tools: ha_freearray() frees an array of string - BUG/MINOR: ssl/ckch: always ha_freearray() the previous entry during parsing - MINOR: ssl/ckch: warn when the same keyword was used twice - BUG/MINOR: threads: fix soft-stop without multithreading support - BUG/MINOR: tools: improve parse_line()'s robustness against empty args - BUG/MINOR: cfgparse: improve the empty arg position report's robustness - BUG/MINOR: server: dont depend on proxy for server cleanup in srv_drop() - BUG/MINOR: server: perform lbprm deinit for dynamic servers - MINOR: http: add a function to validate characters of :authority - BUG/MEDIUM: h2/h3: reject some forbidden chars in :authority before reassembly - MINOR: quic: account Tx data per stream - MINOR: mux-quic: account Rx data per stream - MINOR: quic: add stream format for "show quic" - MINOR: quic: display QCS info on "show quic stream" - MINOR: quic: display stream age - BUG/MINOR: cpu-topo: fix group-by-cluster policy for disordered clusters - MINOR: cpu-topo: add a new "group-by-ccx" CPU policy - MINOR: cpu-topo: provide a function to sort clusters by average capacity - MEDIUM: cpu-topo: change "performance" to consider per-core capacity - MEDIUM: cpu-topo: change "efficiency" to consider per-core capacity - MEDIUM: cpu-topo: prefer grouping by CCX for "performance" and "efficiency" - MEDIUM: config: change default limits to 1024 threads and 32 groups - BUG/MINOR: hlua: Fix Channel:data() and Channel:line() to respect documentation - DOC: config: Fix a typo in the "term_events" definition - BUG/MINOR: spoe: Don't report error on applet release if filter is in DONE state - BUG/MINOR: mux-spop: Don't report error for stream if ACK was already received - BUG/MINOR: mux-spop: Make the demux stream ID a signed integer - BUG/MINOR: mux-spop: Don't open new streams for SPOP connection on error - MINOR: mux-spop: Don't set SPOP connection state to FRAME_H after ACK parsing - BUG/MEDIUM: mux-spop: Remove frame parsing states from the SPOP connection state - BUG/MEDIUM: mux-spop: Properly handle CLOSING state - BUG/MEDIUM: spop-conn: Report short read for partial frames payload - BUG/MEDIUM: mux-spop: Properly detect truncated frames on demux to report error - BUG/MEDIUM: mux-spop; Don't report a read error if there are pending data - DEBUG: mux-spop: Review some trace messages to adjust the message or the level - DOC: config: move address formats definition to section 2 - DOC: config: move stick-tables and peers to their own section - DOC: config: move the extraneous sections out of the "global" definition - CI: AWS-LC(fips): enable unit tests - CI: AWS-LC: enable unit tests - CI: compliance: limit run on forks only to manual + cleanup - CI: musl: enable unit tests - CI: QuicTLS (weekly): limit run on forks only to manual dispatch - CI: WolfSSL: enable unit tests v3.2-dev16	2025-05-14 17:01:46 +02:00
Ilia Shipitsin	12de9ecce5	CI: WolfSSL: enable unit tests Run the new make unit-tests on the CI.	2025-05-14 17:00:31 +02:00
Ilia Shipitsin	75a1e40501	CI: QuicTLS (weekly): limit run on forks only to manual dispatch	2025-05-14 17:00:31 +02:00
Ilia Shipitsin	a8b1b08fd7	CI: musl: enable unit tests Run the new make unit-tests on the CI.	2025-05-14 17:00:31 +02:00
Ilia Shipitsin	01225f9aa5	CI: compliance: limit run on forks only to manual + cleanup	2025-05-14 17:00:31 +02:00
Ilia Shipitsin	61b30a09c0	CI: AWS-LC: enable unit tests Run the new make unit-tests on the CI.	2025-05-14 17:00:31 +02:00
Ilia Shipitsin	944a96156e	CI: AWS-LC(fips): enable unit tests Run the new make unit-tests on the CI.	2025-05-14 17:00:31 +02:00
Willy Tarreau	e24b77e765	DOC: config: move the extraneous sections out of the "global" definition Due to some historic mistakes that have spread to newly added sections, a number of of recently added small sections found themselves described under section 3 "global parameters" which is specific to "global" section keywords. This is highly confusing, especially given that sections 3.1, 3.2, 3.3 and 3.10 directly start with keywords valid in the global section, while others start with keywords that describe a new section. Let's just create a new chapter "12. other sections" and move them all there. 3.10 "HTTPclient tuning" however was moved to 3.4 as it's really a definition of the global options assigned to the HTTP client. The "programs" that are going away in 3.3 were moved at the end to avoid a renumbering later. Another nice benefit is that it moves a lot of text that was previously keeping the global and proxies sections apart.	2025-05-14 16:08:02 +02:00
Willy Tarreau	da67a89f30	DOC: config: move stick-tables and peers to their own section As suggested by Tim in issue #2953, stick-tables really deserve their own section to explain the configuration. And peers have to move there as well since they're totally dedicated to stick-tables. Now we introduce a new section "Stick-tables and Peers", explaining the concepts, and under which there is one subsection for stick-tables configuration and one for the peers (which mostly keeps the existing peers section).	2025-05-14 16:08:02 +02:00
Willy Tarreau	423dffa308	DOC: config: move address formats definition to section 2 Section 2 describes the config file format, variables naming etc, so there's no reason why the address format used in this file should be in a separate section, let's bring it into section 2 as well.	2025-05-14 16:08:02 +02:00
Christopher Faulet	e2ae8a74e8	DEBUG: mux-spop: Review some trace messages to adjust the message or the level Some trace messages were not really accurate, reporting a CLOSED connection while only an error was reported on it. In addition, a TRACE_ERROR() was used to report a short read on HELLO/DISCONNECT frames header. But it is not an error. A TRACE_DEVEL() should be used instead. This patch could be backported to 3.1 to ease future backports.	2025-05-14 11:52:10 +02:00
Christopher Faulet	6e46f0bf93	BUG/MEDIUM: mux-spop; Don't report a read error if there are pending data When a read error is detected, no error must be reported on the SPOP connection if there are still some data to parse. It is important to be sure to process all data before reporting the error and be sure to not truncate received frames. However, we must also take care to handle the short read case to not wait for data that will never be received. This patch must be backported to 3.1.	2025-05-14 11:51:58 +02:00
Christopher Faulet	16314bb93c	BUG/MEDIUM: mux-spop: Properly detect truncated frames on demux to report error There was no test in the demux part to detect truncated frames and to report an error at the connection level. The SPOP streams were properly switched to half-closed state. But while waiting for the associated SPOE applets to be woken up and released, the SPOP connection could be woken up several times for nothing. I never triggered the watchdog in that case, but it is not excluded. Now, at the end of the demux function, a specific test was added to detect truncated frames, report an error and close the connection. This patch must be backported to 3.1.	2025-05-14 11:47:41 +02:00
Christopher Faulet	71feb49a9f	BUG/MEDIUM: spop-conn: Report short read for partial frames payload When a frame was not fully received, a short read must be reported on the SPOP connection to help the demux to handle truncated frames. This was performed for frames truncated on the header part but not on the payload part. It is now properly detected. This patch must be backported to 3.1.	2025-05-14 09:20:10 +02:00
Christopher Faulet	ddc5f8d92e	BUG/MEDIUM: mux-spop: Properly handle CLOSING state The CLOSING state was not handled at all by the SPOP multiplexer while it is mandatory when a DISCONNECT frame was sent and the mux should wait for the DISCONNECT frame in reply from the agent. Thanks to this patch, it should be fixed. In addition, if an error occurs during the AGENT HELLO frame parsing, the SPOP connection is no longer switched to CLOSED state and remains in ERROR state instead. It is important to be able to send the DISCONNECT frame to the agent instead of closing the TCP connection immediately. This patch depends on following commits: * BUG/MEDIUM: mux-spop: Remove frame parsing states from the SPOP connection state * MINOR: mux-spop: Don't set SPOP connection state to FRAME_H after ACK parsing * BUG/MINOR: mux-spop: Don't open new streams for SPOP connection on error * BUG/MINOR: mux-spop: Make the demux stream ID a signed integer All the series must be backported to 3.1.	2025-05-14 09:14:12 +02:00
Christopher Faulet	a3940614c2	BUG/MEDIUM: mux-spop: Remove frame parsing states from the SPOP connection state SPOP_CS_FRAME_H and SPOP_CS_FRAME_P states, that were used to handle frame parsing, were removed. The demux process now relies on the demux stream ID to know if it is waiting for the frame header or the frame payload. Concretely, when the demux stream ID is not set (dsi == -1), the demuxer is waiting for the next frame header. Otherwise (dsi >= 0), it is waiting for the frame payload. It is especially important to be able to properly handle DISCONNECT frames sent by the agents. SPOP_CS_RUNNING state is introduced to know the hello handshake was finished and the SPOP connection is able to open SPOP streams and exchange NOTIFY/ACK frames with the agents. It depends on the following fixes: * MINOR: mux-spop: Don't set SPOP connection state to FRAME_H after ACK parsing * BUG/MINOR: mux-spop: Make the demux stream ID a signed integer This change will be mandatory for the next fix. It must be backported to 3.1 with the commits above.	2025-05-13 19:51:40 +02:00
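The use of the demux stream ID as the only parsing state can be modelled with a toy demuxer. This is a sketch under loud assumptions: the 6-byte header (32-bit stream ID, 16-bit payload length) and the `ToyDemux` class are invented for the example and do not match the real SPOP frame layout or the mux-spop code. Only the `dsi == -1` / `dsi >= 0` convention comes from the commit message.

```python
import struct
from typing import List, Tuple

# Hypothetical header: stream id (u32) followed by payload length (u16).
HEADER = struct.Struct("!IH")


class ToyDemux:
    def __init__(self) -> None:
        self.dsi = -1   # -1: no frame in progress, waiting for a header
        self.dfl = 0    # payload length of the frame in progress
        self.buf = b""
        self.frames: List[Tuple[int, bytes]] = []

    def feed(self, data: bytes) -> None:
        """Consume bytes, emitting complete frames into self.frames."""
        self.buf += data
        while True:
            if self.dsi == -1:
                if len(self.buf) < HEADER.size:
                    return  # short read on the header
                self.dsi, self.dfl = HEADER.unpack_from(self.buf)
                self.buf = self.buf[HEADER.size:]
            if len(self.buf) < self.dfl:
                return  # short read on the payload, dsi stays >= 0
            self.frames.append((self.dsi, self.buf[:self.dfl]))
            self.buf = self.buf[self.dfl:]
            self.dsi = -1  # frame done, wait for the next header
```

Because `-1` is the "no frame" marker, the field has to be signed in C, which is why the series also changes the demux stream ID to a signed integer.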
Christopher Faulet	6b0f7de4e3	MINOR: mux-spop: Don't set SPOP connection state to FRAME_H after ACK parsing After the ACK frame was parsed, it is useless to set the SPOP connection state to SPOP_CS_FRAME_H state because this will be automatically handled by the demux function. It is not an issue, but this will simplify changes for the next commit.	2025-05-13 19:51:40 +02:00
Christopher Faulet	197eaaadfd	BUG/MINOR: mux-spop: Don't open new streams for SPOP connection on error Till now, only SPOP connections fully closed or those with a TCP connection on error were concerned. But available streams could be reported for SPOP connections in error or closing state. But in these states, no NOTIFY frames will be sent and no ACK frames will be parsed. So, no new SPOP streams should be opened. This patch should be backported to 3.1.	2025-05-13 19:51:40 +02:00
Christopher Faulet	cbc10b896e	BUG/MINOR: mux-spop: Make the demux stream ID a signed integer The demux stream ID of a SPOP connection, used when received frames are parsed, must be a signed integer because it is set to -1 when the SPOP connection is initialized. It will be important for the next fix. This patch must be backported to 3.1.	2025-05-13 19:51:40 +02:00
Christopher Faulet	6d68beace5	BUG/MINOR: mux-spop: Don't report error for stream if ACK was already received When a SPOP connection was closed or was in error, an error was systematically reported on all its SPOP streams. However, SPOP streams that already received their ACK frame must be excluded. Otherwise if an agent sends an ACK and closes immediately, the ACK will be ignored because the SPOP stream will handle the error first. This patch must be backported to 3.1.	2025-05-13 19:51:40 +02:00
Christopher Faulet	1cd30c998b	BUG/MINOR: spoe: Don't report error on applet release if filter is in DONE state When the SPOE applet was released, if a SPOE filter context was still attached to it, an error was reported to the filter. However, there is no reason to report an error if the ACK message was already received. Because of this bug, if the ACK message is received and the SPOE connection is immediately closed, this prevents the ACK message from being processed. This patch should be backported to 3.1.	2025-05-13 19:51:40 +02:00
Christopher Faulet	dcce02d6ed	DOC: config: Fix a typo in the "term_events" definition A space was missing before the colon.	2025-05-13 19:51:40 +02:00
Christopher Faulet	a5de0e1595	BUG/MINOR: hlua: Fix Channel:data() and Channel:line() to respect documentation When the channel API was revisited, both functions above were added. An offset can be passed as argument. However, this parameter could be reported to be out of range if not enough input data was received yet. It is an issue, especially with a tcp rule, because more data could be received. If an error is reported too early, this prevents the rule from being reevaluated later. In fact, an error should only be reported if the offset is part of the output data. Another issue is about the conditions to report 'nil' instead of an empty string. 'nil' was reported when no data was found. But it is not aligned with the documentation. 'nil' must only be returned if no more data can be received and there is no input data at all. This patch should fix the issue #2716. It should be backported as far as 2.6.	2025-05-13 19:51:40 +02:00
Willy Tarreau	e049bd00ab	MEDIUM: config: change default limits to 1024 threads and 32 groups A test run on a dual-socket EPYC 9845 (2x160 cores) showed that we'll be facing new limits during the lifetime of 3.2 with our current 16 groups and 256 threads max: $ cat test.cfg global cpu-policy performance $ ./haproxy -dc -c -f test.cfg ... Thread CPU Bindings: Tgrp/Thr Tid CPU set 1/1-32 1-32 32: 0-15,320-335 2/1-32 33-64 32: 16-31,336-351 3/1-32 65-96 32: 32-47,352-367 4/1-32 97-128 32: 48-63,368-383 5/1-32 129-160 32: 64-79,384-399 6/1-32 161-192 32: 80-95,400-415 7/1-32 193-224 32: 96-111,416-431 8/1-32 225-256 32: 112-127,432-447 Raising the default limit to 1024 threads and 32 groups is sufficient to buy us enough margin for a long time (hopefully, please don't laugh, you, reader from the future): $ ./haproxy -dc -c -f test.cfg ... Thread CPU Bindings: Tgrp/Thr Tid CPU set 1/1-32 1-32 32: 0-15,320-335 2/1-32 33-64 32: 16-31,336-351 3/1-32 65-96 32: 32-47,352-367 4/1-32 97-128 32: 48-63,368-383 5/1-32 129-160 32: 64-79,384-399 6/1-32 161-192 32: 80-95,400-415 7/1-32 193-224 32: 96-111,416-431 8/1-32 225-256 32: 112-127,432-447 9/1-32 257-288 32: 128-143,448-463 10/1-32 289-320 32: 144-159,464-479 11/1-32 321-352 32: 160-175,480-495 12/1-32 353-384 32: 176-191,496-511 13/1-32 385-416 32: 192-207,512-527 14/1-32 417-448 32: 208-223,528-543 15/1-32 449-480 32: 224-239,544-559 16/1-32 481-512 32: 240-255,560-575 17/1-32 513-544 32: 256-271,576-591 18/1-32 545-576 32: 272-287,592-607 19/1-32 577-608 32: 288-303,608-623 20/1-32 609-640 32: 304-319,624-639 We can change this default now because it has no functional effect without any configured cpu-policy, so this will only be an opt-in and it's better to do it now than to have an effect during the maintenance phase. A tiny effect is a doubling of the number of pool buckets and stick-table shards internally, which means that aside from slightly reducing contention in these areas, a dump of tables can enumerate keys in a different order (hence the adjustment in the vtc). The only really visible effect is a slightly higher static memory consumption (29->35 MB on a small config), but that difference remains even with 50k servers so that's pretty much acceptable. Thanks to Erwan Velu for the quick tests and the insights!	2025-05-13 18:15:33 +02:00
Willy Tarreau	158da59c34	MEDIUM: cpu-topo: prefer grouping by CCX for "performance" and "efficiency" Most of the time, machines made of multiple CPU types use the same L3 for them, and grouping CPUs by frequencies to form groups doesn't bring any value and on the opposite can impair the incoming connection balancing. This choice of grouping by cluster was made in order to constitute a good choice on homogeneous machines as well, so better rely on the per-CCX grouping than the per-cluster one in this case. This will create fewer clusters on machines where it counts without affecting other ones. It doesn't seem necessary to change anything for the "resource" policy since it selects a single cluster.	2025-05-13 16:48:30 +02:00
Willy Tarreau	70b0dd6b0f	MEDIUM: cpu-topo: change "efficiency" to consider per-core capacity This is similar to the previous change to the "performance" policy but it applies to the "efficiency" one. Here we're changing the sorting method to sort CPU clusters by average per-CPU capacity, and we evict clusters whose per-CPU capacity is above 125% of the previous one. Per-core capacity allows to detect discrepancies between CPU cores, and to continue to focus on efficient ones as a priority.	2025-05-13 16:48:30 +02:00
Willy Tarreau	6c88e27cf4	MEDIUM: cpu-topo: change "performance" to consider per-core capacity Running the "performance" policy on highly heterogeneous systems yields bad choices when there are sufficiently more small than big cores, and/or when there are multiple cluster types, because on such setups, the higher the frequency, the lower the number of cores, despite small differences in frequencies. In such cases, we quickly end up with "performance" only choosing the small or the medium cores, which is contrary to the original intent, which was to select performance cores. This is what happens on boards like the Orion O6 for example where only the 4 medium cores and 2 big cores are chosen, evicting the 2 biggest cores and the 4 smallest ones. Here we're changing the sorting method to sort CPU clusters by average per-CPU capacity, and we evict clusters whose per-CPU capacity falls below 80% of the previous one. Per-core capacity allows to detect discrepancies between CPU cores, and to continue to focus on high performance ones as a priority.	2025-05-13 16:48:30 +02:00
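The selection rule described above (sort clusters by average per-CPU capacity, then evict those falling below 80% of the previous one) can be sketched in a few lines. The `Cluster` type, the capacity figures used in the usage note and the choice to stop at the first evicted cluster are assumptions for illustration and are not taken from the cpu-topo code.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Cluster:
    """Hypothetical cluster description: total capacity summed over its CPUs."""
    name: str
    nb_cpu: int
    capacity: int

    @property
    def per_cpu(self) -> float:
        return self.capacity / self.nb_cpu


def select_performance(clusters: List[Cluster], ratio: float = 0.8) -> List[Cluster]:
    """Keep the clusters with the highest per-CPU capacity.

    Clusters are visited from the highest per-CPU capacity down, and the walk
    stops at the first one falling below `ratio` times the previous kept one.
    """
    ordered = sorted(clusters, key=lambda c: c.per_cpu, reverse=True)
    kept: List[Cluster] = []
    for cl in ordered:
        if kept and cl.per_cpu < ratio * kept[-1].per_cpu:
            break
        kept.append(cl)
    return kept
```

With made-up figures of 2 big CPUs at 1024 each, 4 medium at 900 each and 4 small at 400 each, sorting by total capacity would rank the medium cluster (3600) above the big one (2048). Sorting by per-CPU capacity keeps big and medium and evicts small, since 400 is below 80% of 900.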
Willy Tarreau	5ab2c815f1	MINOR: cpu-topo: provide a function to sort clusters by average capacity The current per-capacity sorting function acts on a whole cluster, but in some setups having many small cores and few big ones, it becomes easy to observe an inversion of metrics where the many small cores show a globally higher total capacity than the few big ones. This does not necessarily fit all use cases. Let's add a new function to sort clusters by their per-cpu average capacity to cover more use cases.	2025-05-13 16:48:30 +02:00
Willy Tarreau	01df98adad	MINOR: cpu-topo: add a new "group-by-ccx" CPU policy This cpu-policy will only consider CCX and not clusters. This makes a difference on machines with heterogeneous CPUs that generally share the same L3 cache, where it's not desirable to create multiple groups based on the CPU types, but instead create one with the different CPU types. The variants "group-by-2/3/4-ccx" have also been added. Let's also add some text explaining the difference between cluster and CCX.	2025-05-13 16:48:30 +02:00
Willy Tarreau	33d8b006d4	BUG/MINOR: cpu-topo: fix group-by-cluster policy for disordered clusters Some (rare) boards have their clusters in an erratic order. This is the case for the Radxa Orion O6 where one of the big cores appears as CPU0 due to booting from it, then followed by the small cores, then the medium cores, then the remaining big cores. This results in clusters appearing this order: 0,2,1,0. The core in cpu_policy_group_by_cluster() expected ordered clusters, and performs ordered comparisons to decide whether a CPU's cluster has already been taken care of. On the board above this doesn't work, only clusters 0 and 2 appear and 1 is skipped. Let's replace the cluster number comparison with a cpuset to record which clusters have been taken care of. Now the groups properly appear like this: Tgrp/Thr Tid CPU set 1/1-2 1-2 2: 0,11 2/1-4 3-6 4: 1-4 3/1-6 7-12 6: 5-10 No backport is needed, this is purely 3.2.	2025-05-13 16:48:30 +02:00
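The difference between an ordered comparison and a "seen" set can be shown on the cluster order quoted above. Both functions are simplified Python models written for this example; in particular, the "old" version is only a guess at the shape of the ordered comparison, not a transcription of `cpu_policy_group_by_cluster()`.

```python
from typing import Iterable, List


def clusters_ordered_compare(cpu_clusters: Iterable[int]) -> List[int]:
    """Model of the old behaviour: a cluster is treated as new only when its
    number is higher than the last one handled."""
    handled: List[int] = []
    last = -1
    for cl in cpu_clusters:
        if cl > last:
            handled.append(cl)
            last = cl
    return handled


def clusters_with_set(cpu_clusters: Iterable[int]) -> List[int]:
    """Model of the fix: remember every cluster already handled in a set."""
    seen = set()
    handled: List[int] = []
    for cl in cpu_clusters:
        if cl not in seen:
            seen.add(cl)
            handled.append(cl)
    return handled


# Per-CPU cluster numbers matching the group output above:
# CPU 0 -> cluster 0, CPUs 1-4 -> cluster 2, CPUs 5-10 -> cluster 1, CPU 11 -> cluster 0.
ORION_LIKE = [0] + [2] * 4 + [1] * 6 + [0]
```

On `ORION_LIKE`, the ordered comparison only handles clusters 0 and 2, while the set-based version handles 0, 2 and 1, which corresponds to the three thread groups shown in the commit message.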
Amaury Denoyelle	f3b9676416	MINOR: quic: display stream age Add a field to save the creation date of qc_stream_desc instance. This is useful to display QUIC stream age in "show quic stream" output.	2025-05-13 15:44:22 +02:00
Amaury Denoyelle	dbf07c754e	MINOR: quic: display QCS info on "show quic stream" Complete stream output for "show quic" by displaying information from its upper QCS. Note that QCS may be NULL if already released, so a default output is also provided.	2025-05-13 15:43:28 +02:00
Amaury Denoyelle	cbadfa0163	MINOR: quic: add stream format for "show quic" Add a new format for "show quic" command labelled as "stream". This is an equivalent of "show sess", dedicated to the QUIC stack. Each active QUIC stream is listed on a line with its related info. The main objective of this command is to ensure there are no frozen streams remaining after a transfer.	2025-05-13 15:41:51 +02:00

25093 Commits