haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-10-27 06:31:23 +01:00

Author	SHA1	Message	Date
Willy Tarreau	e3fd9970a9	MINOR: cpu-topo: add a new "resource" cpu-policy This cpu policy keeps the smallest CPU cluster. This can be used to limit the resource usage to the strict minimum that still delivers decent performance, for example to try to further reduce power consumption or minimize the number of cores needed on some rented systems for a sidecar setup, in order to scale the system down more easily. Note that if a single cluster is present, it will still be fully used. When started on a 64-core EPYC gen3, it uses only one CCX with 8 cores and 16 threads, all in the same group.	2025-03-14 18:33:16 +01:00
Willy Tarreau	ad3650c354	MINOR: cpu-topo: add a new "efficiency" cpu-policy This cpu policy tries to evict performant core clusters and only focuses on efficiency-oriented ones. On an intel i9-14900k, we can get 525k rps using 8 performance cores, versus 405k when using all 24 efficiency cores. In some cases the power savings might be more desirable (e.g. scalability tests on a developer's laptop), or the performance cores might be better suited for another component (application or security component).	2025-03-14 18:33:16 +01:00
Willy Tarreau	dcae2fa4a4	MINOR: cpu-topo: add a new "performance" cpu-policy This cpu policy tries to evict efficient core clusters and only focuses on performance-oriented ones. On an intel i9-14900k, we can get 525k rps using only 8 cores this way, versus 594k when using all 24 cores. The gains from using all these cores are not significant enough to waste them on this. Also these cores can be much slower at doing SSL handshakes so it can make sense to evict them. Better keep the efficiency cores for network interrupts for example. Also, on a developer's machine it can be convenient to keep all these cores for the local tasks and extra tools (load generators etc).	2025-03-14 18:33:16 +01:00
Willy Tarreau	8aeb096740	MINOR: cpu-topo: add cpu-policy "group-by-cluster" This policy forms thread groups from the CPU clusters, and binds all the threads in them to all the CPUs of the cluster. This is recommended on systems with bad inter-CCX latencies. It was shown to simply triple the performance with queuing on a 64-core EPYC without having to manually assign the cores with cpu-map.	2025-03-14 18:33:16 +01:00
Willy Tarreau	56d939866b	MEDIUM: cpu-topo: use the "first-usable-node" cpu-policy by default This now turns the cpu-policy to "first-usable-node" by default, so that we preserve the current default behavior consisting in binding to the first node if nothing was forced. If a second node is found, global.nbthread is set and the previous code will be skipped.	2025-03-14 18:33:16 +01:00
Willy Tarreau	7fc6cdd0b1	MINOR: cpu-topo: add a 'first-usable-node' cpu policy This is a reimplementation of the current default policy. It binds to the first node having usable CPUs if found, and drops CPUs from the second and next nodes.	2025-03-14 18:33:16 +01:00
Willy Tarreau	9a8e8af11a	MINOR: cpu-topo: add "only-cluster" and "drop-cluster" to cpu-set These are processed after the topology is detected, and they allow to restrict binding to or evict CPUs matching the indicated hardware cluster number(s). It can be used to bind to only some clusters, such as CCX or different energy efficiency cores. For this reason, here we use the cluster's local ID (local to the node).	2025-03-14 18:33:16 +01:00
Willy Tarreau	a946cfa8b5	MINOR: cpu-topo: add "only-core" and "drop-core" to cpu-set These are processed after the topology is detected, and they allow to restrict binding to or evict CPUs matching the indicated hardware core number(s). It can be used to bind to only some clusters as well as to evict efficient cores whose number is known.	2025-03-14 18:33:16 +01:00
Willy Tarreau	c591c9d6a6	MINOR: cpu-topo: add "only-thread" and "drop-thread" to cpu-set These are processed after the topology is detected, and they allow to restrict binding to or evict CPUs matching the indicated hardware thread number(s). It can be used to reserve even threads for HW IRQs and odd threads for haproxy for example, or to evict efficient cores that do only have thread #0.	2025-03-14 18:33:16 +01:00
Willy Tarreau	c93ee25054	MINOR: cpu-topo: add "only-node" and "drop-node" to cpu-set These are processed after the topology is detected, and they allow to restrict binding to or evict CPUs matching the indicated node(s).	2025-03-14 18:33:16 +01:00
Willy Tarreau	68069e4b27	MINOR: cpu-topo: add "drop-cpu" and "only-cpu" to cpu-set These allow respectively to disable binding to CPUs listed in a set, and to disable binding to CPUs not in a set.	2025-03-14 18:30:30 +01:00
Willy Tarreau	cda4956d9c	MINOR: cpu-topo: add a new "cpu-set" global directive to choose cpus For now it's limited, it only supports "reset" to ask that any previous "taskset" be ignored. The goal will be to later add more actions that allow to symbolically define sets of cpus to bind to or to drop. This also clears the cpu_mask_forced variable that is used to detect that a taskset had been used.	2025-03-14 18:30:30 +01:00
Willy Tarreau	f0661e79fe	MINOR: global: add a command-line option to enable CPU binding debugging During development, everything related to CPU binding and the CPU topology is debugged using state dumps at various places, but it does make sense to have a real command line option so that this remains usable in production to help users figure why some CPUs are not used by default. Let's add "-dc" for this. Since the list of global.tune.options values is almost full and does not 100% match this option, let's add a new "tune.debug" field for this.	2025-03-14 18:30:30 +01:00
Willy Tarreau	f156baf8ce	DOC: design-thoughts: commit numa-auto.txt Lots of collected data and observations aggregated into a single commit so as not to lose them. Some parts below come from several commit messages and are incremental. Add captures and analysis of intel 14900 where it's not easy to draw the line between the desired P and E cores. The 14900 raises some questions (imagine a dual-die variant in multi-socket). That's the start of an algorithmic distribution of performance cores into thread groups. cpu-map currently conflicts a lot with the choices after auto-detection but it doesn't have to. The problem is the inability to configure the threads for the whole process like taskset does. By offering this ability we can also start to designate groups of CPUs symbolically (package, die, ccx, cores, smt). It can also be useful to exploit the info from cpuinfo that is not available in /sys, such as the model number. At least on arm, higher numbers indicate bigger cores and can be useful to distinguish cores inside a cluster. It will not indicate big vs medium ones of the same type (e.g. a78 3.0 vs 2.4 GHz) but can still be effective at identifying the efficient ones. In short, info such as the cluster ID is not always reliable, and is local to the package. die_id as well. die number is not reported here but should definitely be used, as a higher priority than L3. We're still missing a discriminant between the l3 and cluster number in order to address heterogeneous CPUs (e.g. intel 14900), though in terms of locality that's currently done correctly. CPU selection is also a full topic, and some thoughts were noted regarding sorting by perf vs locality so as never to mix inter-socket CPUs due to sorting. The proposed cpu-selection cannot work as-is, because it acts both on restriction and preference, and these two are not actions but a sequence. 
First restrictions must be enforced, and second the remaining CPUs are sorted according to the preferred criterion, and a number of threads are selected. Currently we refine the OS-exposed cluster number but it's not correct as we can end up with something poorly numbered. We need to respect the LLC in any case so let's explain the approach.	2025-03-14 18:30:30 +01:00
Aurelien DARRAGON	4c3eb60e70	DOC: management: rename some last occurences from domain "dns" to "resolvers" This is a complementary patch to cf913c2f9 ("DOC: management: rename show stats domain cli "dns" to "resolvers"). The doc still referred to the legacy "dns" domain filter for the stat command. Let's rename those occurrences to "resolvers". It may be backported to all stable versions.	2025-03-13 11:49:10 +01:00
Aurelien DARRAGON	e942305214	MEDIUM: log: change default "host" strategy for log-forward section Historically, log-forward proxy used to preserve host field from input message as much as possible, and if syslog host wasn't provided (rfc5424 '-' or bad rfc3164 or rfc5424 message) then "localhost" or "-" would be used as host when outputting message using rfc3164 or rfc5424. We change that behavior (which corresponds to "keep" host option), so that log-forward now uses "fill" strategy as default: if the host is provided in input message, it is preserved. However if it is missing and IP address from sender is available, we use it.	2025-03-12 10:55:49 +01:00
Aurelien DARRAGON	ad0133cc50	MINOR: log: handle log-forward "option host" Following previous patch, we now implement the logic for the host option under log-forward section. Possible strategies are: replace: If input message already contains a value for the host field, we replace it by the source IP address from the sender. If input message doesn't contain a value for the host field (ie: '-' as input rfc5424 message or non-compliant rfc3164 or rfc5424 message), we use the source IP address from the sender as host field. fill: If input message already contains a value for the host field, we keep it. If input message doesn't contain a value for the host field (ie: '-' as input rfc5424 message or non-compliant rfc3164 or rfc5424 message), we use the source IP address from the sender as host field. keep: If input message already contains a value for the host field, we keep it. If input message doesn't contain a value for the host field, we set it to localhost (rfc3164) or '-' (rfc5424). (This is the default) append: If input message already contains a value for the host field, we append a comma followed by the IP address from the sender. If input message doesn't contain a value for the host field, we use the source IP address from the sender. Default value (unchanged) is "keep" strategy. option host is only relevant with rfc3164 or rfc5424 format on log targets. Also, if the source address is not available (ie: UNIX socket), default behavior prevails. Documentation was updated.	2025-03-12 10:52:07 +01:00
Willy Tarreau	3cbeb6a74b	[RELEASE] Released version 3.2-dev7 Released version 3.2-dev7 with the following main changes : - BUG/MEDIUM: applet: Don't handle EOI/EOS/ERROR is applet is waiting for room - BUG/MEDIUM: spoe/mux-spop: Introduce an NOOP action to deal with empty ACK - BUG/MINOR: cfgparse: fix NULL ptr dereference in cfg_parse_peers - BUG/MEDIUM: uxst: fix outgoing abns address family in connect() - REGTESTS: fix reg-tests/server/abnsz.vtc - BUG/MINOR: log: fix outgoing abns address family - BUG/MINOR: sink: add tempo between 2 connection attempts for sft servers - MINOR: clock: always use atomic ops for global_now_ms - CI: QUIC Interop: clean old docker images - BUG/MINOR: stream: do not call co_data() from __strm_dump_to_buffer() - BUG/MINOR: mux-h1: always make sure h1s->sd exists in h1_dump_h1s_info() - MINOR: tinfo: add a new thread flag to indicate a call from a sig handler - BUG/MEDIUM: stream: never allocate connection addresses from signal handler - MINOR: freq_ctr: provide non-blocking read functions - BUG/MEDIUM: stream: use non-blocking freq_ctr calls from the stream dumper - MINOR: tools: use only opportunistic symbols resolution - CLEANUP: task: move the barrier after clearing th_ctx->current - MINOR: compression: Introduce minimum size - BUG/MINOR: h2: always trim leading and trailing LWS in header values - MINOR: tinfo: split the signal handler report flags into 3 - BUG/MEDIUM: stream: don't use localtime in dumps from a signal handler - OPTIM: connection: don't try to kill other threads' connection when !shared - BUILD: add possibility to use different QuicTLS variants - MEDIUM: fd: Wait if locked in fd_grab_tgid() and fd_take_tgid(). - MINOR: fd: Add fd_lock_tgid_cur(). - MEDIUM: epoll: Make sure we can add a new event - MINOR: pollers: Add a fixup_tgid_takeover() method. - MEDIUM: pollers: Drop fd events after a takeover to another tgid. - MEDIUM: connections: Allow taking over connections from other tgroups. 
- MEDIUM: servers: Add strict-maxconn. - BUG/MEDIUM: server: properly initialize PROXY v2 TLVs - BUG/MINOR: server: fix the "server-template" prefix memory leak - BUG/MINOR: h3: do not report transfer as aborted on preemptive response - CLEANUP: h3: fix documentation of h3_rcv_buf() - MINOR: hq-interop: properly handle incomplete request - BUG/MEDIUM: mux-fcgi: Try to fully fill demux buffer on receive if not empty - MINOR: h1: permit to relax the websocket checks for missing mandatory headers - BUG/MINOR: hq-interop: fix leak in case of rcv_buf early return - BUG/MINOR: server: check for either proxy-protocol v1 or v2 to send hedaer - MINOR: jws: implement a JWK public key converter - DEBUG: init: add a way to register functions for unit tests - TESTS: add a unit test runner in the Makefile - TESTS: jws: register a unittest for jwk - CI: github: run make unit-tests on the CI - TESTS: add config smoke checks in the unit tests - MINOR: jws: conversion to NIST curves name - CI: github: remove smoke tests from vtest.yml - TESTS: ist: fix wrong array size - TESTS: ist: use the exit code to return a verdict - TESTS: ist: add a ist.sh to launch in make unit-tests - CI: github: fix h2spec.config proxy names - DEBUG: init: Add a macro to register unit tests - MINOR: sample: allow custom date format in error-log-format - CLEANUP: log: removing "log-balance" references - BUG/MINOR: log: set proper smp size for balance log-hash - MINOR: log: use __send_log() with exact payload length - MEDIUM: log: postpone the decision to send or not log with empty messages - MINOR: proxy: make pr_mode enum bitfield compatible - MINOR: cfgparse-listen: add and use cfg_parse_listen_match_option() helper - MINOR: log: add options eval for log-forward - MINOR: log: detach prepare from parse message - MINOR: log: add dont-parse-log and assume-rfc6587-ntf options - BUG/MEIDUM: startup: return to initial cwd only after check_config_validity() - TESTS: change the output of run-unittests.sh - TESTS: 
unit-tests: store sh -x in a result file - CI: github: show results of the Unit tests - BUG/MINOR: cfgparse/peers: fix inconsistent check for missing peer server - BUG/MINOR: cfgparse/peers: properly handle ignored local peer case - BUG/MINOR: server: dont return immediately from parse_server() when skipping checks - MINOR: cfgparse/peers: provide more info when ignoring invalid "peer" or "server" lines - BUG/MINOR: stream: fix age calculation in "show sess" output - MINOR: stream/cli: rework "show sess" to better consider optional arguments - MINOR: stream/cli: make "show sess" support filtering on front/back/server - TESTS: quic: create first quic unittest - MINOR: h3/hq-interop: restore function for standalone FIN receive - MINOR/OPTIM: mux-quic: do not allocate rxbuf on standalone FIN - MINOR: mux-quic: refine reception of standalone STREAM FIN - MINOR: mux-quic: define globally stream rxbuf size - MINOR: mux-quic: define rxbuf wrapper - MINOR: mux-quic: store QCS Rx buf in a single-entry tree - MINOR: mux-quic: adjust Rx data consumption API - MINOR: mux-quic: adapt return value of qcc_decode_qcs() - MAJOR: mux-quic: support multiple QCS RX buffers - MEDIUM: mux-quic: handle too short data splitted on multiple rxbuf - MAJOR: mux-quic: increase stream flow-control for multi-buffer alloc - BUG/MINOR: cfgparse-tcp: relax namespace bind check - MINOR: startup: adjust alert messages, when capabilities are missed	2025-03-07 16:37:57 +01:00
Willy Tarreau	5e558c1727	MINOR: stream/cli: make "show sess" support filtering on front/back/server With "show sess", particularly "show sess all", we're often missing the ability to inspect only streams attached to a frontend, backend or server. Let's just add these filters to the command. Only one at a time may be set. One typical use case could be to dump streams attached to a server after issuing "shutdown sessions server XXX" to figure why any wouldn't stop for example.	2025-03-07 10:38:12 +01:00
Willy Tarreau	2bd7cf53cb	MINOR: stream/cli: rework "show sess" to better consider optional arguments The "show sess" CLI command parser is getting really annoying because several options were added in an exclusive mode as the single possible argument. Recently some cumulable options were added ("show-uri") but the older ones were not yet adapted. Let's just make sure that the various filters such as "older" and "age" now belong to the options and leave only <id>, "all", and "help" for the first ones. The doc was updated and it's now easier to find these options.	2025-03-07 10:36:58 +01:00
Roberto Moreda	f98b5c4f59	MINOR: log: add dont-parse-log and assume-rfc6587-ntf options This commit introduces the dont-parse-log option to disable log message parsing, allowing raw log data to be forwarded without modification. Also, it adds the assume-rfc6587-ntf option to frame log messages using only non-transparent framing as per RFC 6587. This avoids misparsing in certain cases (mainly with non RFC compliant messages). The documentation is updated to include details on the new options and their intended use cases. This feature was discussed in GH #2856	2025-03-06 09:30:39 +01:00
Willy Tarreau	fd5d59967a	MINOR: h1: permit to relax the websocket checks for missing mandatory headers At least one user would like to allow a standards-violating client to set up WebSocket connections through haproxy to a standards-violating server that accepts them. While this should of course never be done over the internet, it can make sense in the datacenter between application components which do not need to mask the data, so this typically falls into the situation of what the "accept-unsafe-violations-in-http-request" option and the "accept-unsafe-violations-in-http-response" option are made for. See GH #2876 for more context. This patch relaxes the test on the "Sec-Websocket-Key" header field in the request, and of the "Sec-Websocket-Accept" header in the response when these respective options are set. The doc was updated to reference this addition. This may be backported to 3.1 but preferably not further.	2025-02-28 17:31:20 +01:00
Olivier Houchard	706b008429	MEDIUM: servers: Add strict-maxconn. Maxconn is a bit of a misnomer when it comes to servers, as it doesn't control the maximum number of connections we establish to a server, but the maximum number of simultaneous requests. So add "strict-maxconn", that will make it so we will never establish more connections than maxconn. It extends the meaning of the "restricted" setting of tune.takeover-other-tg-connections, as it will also attempt to get idle connections from other thread groups if strict-maxconn is set.	2025-02-26 13:00:18 +01:00
Olivier Houchard	8de8ed4f48	MEDIUM: connections: Allow taking over connections from other tgroups. Allow haproxy to take over idle connections from other thread groups than our own. To control that, add a new tunable, tune.takeover-other-tg-connections. It can have 3 values, "none", where we won't attempt to get connections from the other thread group (the default), "restricted", where we only will try to get idle connections from other thread groups when we're using reverse HTTP, and "full", where we always try to get connections from other thread groups. Unless there is a special need, it is advised to use "none" (or restricted if we're using reverse HTTP) as using connections from other thread groups may have a performance impact.	2025-02-26 13:00:18 +01:00
Vincent Dechenaux	9011b3621b	MINOR: compression: Introduce minimum size This is the introduction of "minsize-req" and "minsize-res". These two options allow you to set the minimum payload size required for compression to be applied. This helps save CPU on both server and client sides when the payload does not need to be compressed.	2025-02-22 11:32:40 +01:00
Willy Tarreau	4ef6be4a1f	[RELEASE] Released version 3.2-dev6 Released version 3.2-dev6 with the following main changes : - BUG/MEDIUM: debug: close a possible race between thread dump and panic() - DEBUG: thread: report the spin lock counters as seek locks - DEBUG: thread: make lock time computation more consistent - DEBUG: thread: report the wait time buckets for lock classes - DEBUG: thread: don't keep the redundant _locked counter - DEBUG: thread: make lock_stat per operation instead of for all operations - DEBUG: thread: reduce the struct lock_stat to store only 30 buckets - MINOR: lbprm: add a new callback ->server_requeue to the lbprm - MEDIUM: server: allocate a tasklet for asyncronous requeuing - MAJOR: leastconn: postpone the server's repositioning under contention - BUG/MINOR: quic: reserve length field for long header encoding - BUG/MINOR: quic: fix CRYPTO payload size calcul for encoding - MINOR: quic: simplify length calculation for STREAM/CRYPTO frames - BUG/MINOR: mworker: section ignored in discovery after a post_section_parser - BUG/MINOR: mworker: post_section_parser for the last section in discovery - CLEANUP: mworker: "program" section does not have a post_section_parser anymore - MEDIUM: initcall: allow to register mutiple post_section_parser per section - CI: cirrus-ci: bump FreeBSD image to 14-2 - DOC: initcall: name correctly REGISTER_CONFIG_POST_SECTION() - REGTESTS: stop using truncated.vtc on freebsd - MINOR: quic: refactor STREAM encoding and splitting - MINOR: quic: refactor CRYPTO encoding and splitting - BUG/MEDIUM: fd: mark FD transferred to another process as FD_CLONED - BUG/MINOR: ssl/cli: "show ssl crt-list" lacks client-sigals - BUG/MINOR: ssl/cli: "show ssl crt-list" lacks sigals - MINOR: ssl/cli: display more filenames in 'show ssl cert' - DOC: watchdog: document the sequence of the watchdog and panic - MINOR: ssl: store the filenames resulting from a lookup in ckch_conf - MINOR: startup: allow hap_register_feature() to enable 
a feature in the list - MINOR: quic: support frame type as a varint - BUG/MINOR: startup: leave at first post_section_parser which fails - BUG/MINOR: startup: hap_register_feature() fix for partial feature name - BUG/MEDIUM: cli: Be sure to drop all input data in END state - BUG/MINOR: cli: Wait for the last ACK when FDs are xferred from the old worker - BUG/MEDIUM: filters: Handle filters registered on data with no payload callback - BUG/MINOR: fcgi: Don't set the status to 302 if it is already set - MINOR: ssl/crtlist: split the ckch_conf loading from the crtlist line parsing - MINOR: ssl/crtlist: handle crt_path == cc->crt in crtlist_load_crt() - MINOR: ssl/ckch: return from ckch_conf_clean() when conf is NULL - MEDIUM: ssl/crtlist: "crt" keyword in frontend - DOC: configuration: document the "crt" frontend keyword - DEV: h2: add a Lua-based HTTP/2 connection tracer - BUG/MINOR: quic: prevent crash on conn access after MUX init failure - BUG/MINOR: mux-quic: prevent crash after MUX init failure - DEV: h2: fix flags for the continuation frame - REGTESTS: Fix truncated.vtc to send 0-CRLF - BUG/MINOR: mux-h2: Properly handle full or truncated HTX messages on shut - Revert "REGTESTS: stop using truncated.vtc on freebsd" - MINOR: mux-quic: define a QCC application state member - MINOR: mux-quic/h3: emit SETTINGS via MUX tasklet handler - MINOR: mux-quic/h3: support temporary blocking on control stream sending	2025-02-19 18:39:51 +01:00
William Lallemand	764f6910ed	DOC: configuration: document the "crt" frontend keyword Document the "crt" keyword of frontend and listen section.	2025-02-17 18:26:37 +01:00
Willy Tarreau	a4d65c9cc8	DOC: watchdog: document the sequence of the watchdog and panic Each time we go into the watchdog and panic code, it's super hard to figure who calls what since signals are involved to bounce between threads. Let's document the main principles and sequences to ease the journey next time.	2025-02-13 16:45:07 +01:00
William Lallemand	0b47e5fa20	DOC: initcall: name correctly REGISTER_CONFIG_POST_SECTION() REGISTER_CONFIG_POST_SECTION() was not named correctly.	2025-02-12 13:27:44 +01:00
William Lallemand	4de86bbbfc	MEDIUM: initcall: allow to register mutiple post_section_parser per section Before this patch, REGISTER_CONFIG_SECTION() allowed to register one and only one callback (<post>) called after the parsing of a section. It was limiting because you couldn't register a post callback from anywhere else in the code. This patch introduces the new REGISTER_CONFIG_POST_SECTION() macro which allows to register a new post callback for a section keyword from anywhere. This patch introduces the feature by allowing `struct cfg_section` entries that do not have a `section_parser`, and then iterating on all cfg_section with a post_section_parser for a keyword.	2025-02-12 12:52:41 +01:00
Willy Tarreau	37e84676c7	[RELEASE] Released version 3.2-dev5 Released version 3.2-dev5 with the following main changes : - BUG/MINOR: ssl: put ssl_sock_load_ca under SSL_NO_GENERATE_CERTIFICATES - CLEANUP: ssl: rename ssl_sock_load_ca to ssl_sock_gencert_load_ca - CLEANUP: ssl: move ssl_sock_gencert_load_ca declaration in ssl_gencert.h - CLEANUP: tree-wide: define and use acl_match_cond() helper - MINOR: epoll: permit to mask certain specific events - MINOR: proxies: Add a per-thread group field to struct proxy. - MINOR: Add fields to the per-thread group field in struct server. - MINOR: proxies/servers: Calculate queueslength and use it. - MEDIUM: servers/proxies: Switch to using per-tgroup queues. - BUG/MINOR: stream: Properly handle "on-marked-up shutdown-backup-sessions" - MEDIUM: stream: Map task wake up reasons to dedicated stream events - MEDIUM: stream: No longer use TASK_F_UEVT* to shut a stream down - BUILD: tools: fix build on BSD by dropping the ETIME check - MINOR: queues: use __ha_cpu_relax() on failed CAS. 
- BUILD: queues: Use unsigned int when needed - BUILD: ssl: allow to build without the renegotiation API of WolfSSL - BUILD: ssl: more cleaner approach to WolfSSL without renegotiation - BUG/MEDIUM: chunk: make sure to flush the trash pool before resizing - MINOR: quic: remove references to burst in quic-cc-algo parsing - MINOR: quic: allow BBR testing without pacing - MINOR: quic: transform pacing settings into a global option - MAJOR: quic: mark pacing as stable and enable it by default - MINOR: quic: mark BBR as stable - MINOR: quic: define quic_tune - BUILD: quic: fix overflow in global tune - DEBUG: fd: add a counter of takeovers of an FD since it was last opened - MINOR: fd: add a generation number to file descriptors - DEBUG: epoll: store and compare the FD's generation count with reported event - MEDIUM: epoll: skip reports of stale file descriptors - MINOR: mux-h1: Add masks to group H1S DEMUX and MUX errors - BUG/MINOR: mux-h1: Only report a SE error on demux error - MINOR: tevt: Add the termination events log's fundations - MINOR: tevt/stconn: Add a termination events log in the SE descriptor - MINOR: tevt/mux-h1: Report termination events for the H1C and H1S - MINOR: tevt/mux-h2: Report termination events for the H2C - MINOR: tevt/stream/stconn: Report termination events for stream and sc - MINOR: tevt/conn: Report intercepted event for L4 rules - MINOR: tevt/mux-h1/mux-h2: Add termination events log when dumping mux info - MINOR: tevt/muxes: Add CTL and SCTL command to get the termination event logs - MINOR: tevt/mux-pt: Add support for termination event logs - MINOR: tevt/connection: Add dedicated termination events for lower locations - MEDIUM: tevt/muxes: Add dedicated termination events for muxc/se locations - MINOR: tevt/stconn: Be more accurate to report shutw events - MEDIUM: tevt/stconn/stream: Add dedicated termination events for stream location - MINOR: tevt: Don't duplicate termination event during reporting - MINOR: tevt/applet: Add limited 
support for termination event logs for applets - MINOR: tevt: Add a sample to get termination events for all locations - MINOR: tevt: Improve function to convert a termination events log to string - REORG: tevt/connection: Move enums at the end of the header file - MINOR: tevt/dev: Add term_events tool - MINOR: tevt/connection: Add support for POLL_HUP/POLL_ERR events - MINOR: tevt/dev: Parse tuple of termination events - BUG/MEDIUM: htx: wrong count computation in htx_xfer_blks() - DOC: htx: clarify <mark> parameter for htx_xfer_blks() - BUILD: quic: remove GCC undefined error in qc_release_lost_pkts() - MEDIUM: htx: prevent <mark> to copy incomplete headers in htx_xfer_blks() - BUG/MEDIUM: mux-fcgi: Properly handle read0 on partial records - BUG/MINOR: tevt/http-ana: Remove badly placed event reports - DEBUG: http-ana: Remove debug counters from HTTP analyzers - DEBUG: mux-h1: Remove some debug counters - BUG/MINOR: tcp-rules: Don't forward close during tcp-response content rules eval - MEDIUM: stream: interrupt costly rulesets after too many evaluations - BUG/MINOR: http-check: Don't pretend a C-L heeader is set before adding it - BUILD: ssl: remove a boringssl definition defined by recent boringssl libs - BUG/MINOR: tevt/mux-h2: Set truncated receive/eos events at SE level on error - BUG/MEDIUM: flt-spoe: Set/test applet flags instead of SE flags from I/O handler - BUG/MEDIUM: applet: Don't pretend to have more data to handle EOI/EOS/ERROR - BUG/MEDIUM: flt-spoe: Properly handle end of stream from the SPOE applet - MINOR: flt-spoe: Report end of input immediately after applet init - MINOR: mux-spop: Report EOI on the SE when a ACK is received for a stream - MINOR: mux-spop: Set SPOP_CF_ERROR flag on connection error only - MINOR: tevt/mux-spop: Report termination events for the SPOP connect/stream - CLEANUP: mux-spop: Remove useless comments - MINOR: mux-spop: Dump info about connections and streams in dedicated functions - MINOR: mux-spop: Implement .show_sd 
callback function - MEDIUM: mux-fcgi: Add a function to propagate termination flags from fstrm to SE - BUG/MEDIUM: mux-fcgi: Propagate flags to SE in fcgi_strm_wake_one_stream - MINOR: tevt/mux-fcgi: Report termination events for the FCGI connect/stream - MINOR: mux-fcgi: Dump info about connections and streams in dedicated functions - MINOR: mux-spop/mux-fcgi: Add support of the debug string for logs - BUG/MINOR: cli: Don't set SE flags from the cli applet - BUG/MINOR: cli: Fix memory leak on error for _getsocks command - BUG/MINOR: cli: Fix a possible infinite loop in _getsocks() - BUG/MINOR: config/userlist: Support one 'users' option for 'group' directive - BUG/MINOR: auth: Fix a leak on error path when parsing user's groups - BUG/MINOR: flt-trace: Support only one name option - MINOR: filters: Improve errors formating during filters parsing - BUG/MINOR: stats-json: Define JSON_INT_MAX as a signed integer - DOC: option redispatch should mention persist options - BUG/MINOR: debug: make "debug dev sched" accept a negative TID - BUG/MINOR: debug: make sure the "debug dev sched" tasks don't block stopping - IMPORT: plock: export the uninlined version of the lock wait function - IMPORT: plock: give higher precedence to W than S - IMPORT: plock: lower the slope of the exponential back-off - IMPORT: plock: use cpu_relax() for a shorter time in EBO - Revert "IMPORT: plock: export the uninlined version of the lock wait function" - BUG/MEDIUM: ssl: chosing correct certificate using RSA-PSS with TLSv1.3	2025-02-08 05:53:40 +01:00
Lukas Tribus	5926fb7823	DOC: option redispatch should mention persist options "option redispatch" remains vague in which cases a session would persist; let's mention "option persist" and "force-persist" as an example so folks don't draw the conclusion that this may be default. Should be backported to stable branches.	2025-02-06 17:49:13 +01:00
Aurelien DARRAGON	0846638f7f	MEDIUM: stream: interrupt costly rulesets after too many evaluations It is not rare to see configurations with a large number of "tcp-request content" or "http-request" rules for instance. A large number of rules combined with cpu-demanding actions (e.g.: actions that work on content) may create thread contention as all the rules from a given ruleset are evaluated under the same polling loop if the evaluation is not interrupted. Thus, in this patch we add extra logic around "tcp-request content", "tcp-response content", "http-request" and "http-response" rulesets, so that when a certain number of rules are evaluated under the single polling loop, we force the evaluating function to yield. As such, the rule which was about to be evaluated is saved, and the function starts evaluating rules from the saved pointer when it returns (in the next polling loop). We use task_wakeup(task, TASK_WOKEN_MSG) to explicitly wake the task so that no time is wasted and the processing is resumed ASAP. TASK_WOKEN_MSG is mandatory here because process_stream() expects TASK_WOKEN_MSG for explicit analyzers re-evaluation. The rules_bcount stream attribute was added to count how many rules were evaluated since the last interruption (yield). Also, the SF_RULE_FYIELD flag was added to know that s->current_rule was assigned due to a forced yield and not a regular yield. By default haproxy will enforce a yield every 50 rules; this behavior can be configured using the "tune.max-rules-at-once" global keyword. There is a limitation though: for now, if the ACT_OPT_FINAL flag is set on act_opts, we consider it is not safe to yield (as it is already the case for automatic yield). In this case, instead of yielding and taking the risk of not being called back, we skip the yield and hope it will not create contention. This is something we should ideally try to improve in order to yield in all conditions.	2025-02-03 17:09:48 +01:00
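
The forced-yield behavior described in the commit above can be pictured with a small Python model. This is only an illustrative sketch, not HAProxy's C code: the `Stream` class, `evaluate_rules()` and the `forced_yield` field are hypothetical stand-ins for s->current_rule, rules_bcount and SF_RULE_FYIELD, and resetting the counter when a ruleset completes is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Callable, List

# Mirrors the default of 50 named in the commit message (tune.max-rules-at-once).
MAX_RULES_AT_ONCE = 50


@dataclass
class Stream:
    current_rule: int = 0       # index of the rule to resume from (hypothetical)
    rules_bcount: int = 0       # rules evaluated since the last yield
    forced_yield: bool = False  # stand-in for the SF_RULE_FYIELD flag


def evaluate_rules(stream: Stream,
                   rules: List[Callable[[Stream], None]],
                   final: bool = False,
                   budget: int = MAX_RULES_AT_ONCE) -> bool:
    """Run rules from the saved position.

    Returns True when the ruleset is finished, False when the function
    yielded and must be called again in the next polling loop.
    """
    i = stream.current_rule
    stream.forced_yield = False
    while i < len(rules):
        # Yield once the budget is spent, unless this is a final evaluation
        # (the ACT_OPT_FINAL case, where yielding is considered unsafe).
        if stream.rules_bcount >= budget and not final:
            stream.current_rule = i   # save the rule about to be evaluated
            stream.rules_bcount = 0
            stream.forced_yield = True
            return False
        rules[i](stream)
        stream.rules_bcount += 1
        i += 1
    # Assumption of this sketch: state is reset once the ruleset completes.
    stream.current_rule = 0
    stream.rules_bcount = 0
    return True
```

In this model, a ruleset of 120 rules with the default budget needs three passes: two forced yields, then completion. With `final=True` the same ruleset runs in a single pass.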
William Lallemand	c17e029232	DOC: htx: clarify <mark> parameter for htx_xfer_blks() Clarify the fact that the first <mark> block is transferred before stopping when using htx_xfer_blks()	2025-01-31 15:23:47 +01:00
Christopher Faulet	b161155498	MINOR: tevt: Add a sample to get termination events for all locations "term_events" is a sample fetch function that can be used to get termination events for all locations in one call. The format is equivalent to: {fc_term_events,fc_mux_term_events,fs.term_events,txn.term_events,bs.term_events,bc_mux_term_events,bc_term_events} If no event was reported for a location, the field is empty. If the feature is not supported yet, a dash ('-') is printed.	2025-01-31 10:41:50 +01:00
Amaury Denoyelle	2fc63cb186	MINOR: quic: mark BBR as stable Pacing has recently been moved out of experimental status and is activated by default. This is a mandatory requirement for BBR. Furthermore, BBR is now considered stable. As such, this commit removes its experimental status.	2025-01-30 17:20:41 +01:00
Amaury Denoyelle	a19d9b0486	MAJOR: quic: mark pacing as stable and enable it by default Remove pacing experimental status, so it's not required anymore to use expose-experimental-directives to enable it. Along with this change, pacing is now activated by default. As such, pacing configuration is transformed into its final form. The global on/off setting is turned into a disable setting without argument.	2025-01-30 17:20:41 +01:00
Amaury Denoyelle	0c8b54b2d1	MINOR: quic: transform pacing settings into a global option Pacing support was previously activated on each bind line individually, via an optional argument of the quic-cc-algo keyword. Remove this optional argument and introduce a global setting to enable/disable pacing. Pacing activation is still flagged as experimental. One important change is that previously BBR usage automatically activated pacing support. This is not the case anymore, so users should now always explicitly activate pacing if BBR is selected. A new warning message will be displayed if this is not the case. Another consequence of this change is that the pacing_inter callback is now always defined for every quic_cc_algo type. As such, QUIC MUX uses global.tune.options to determine if pacing is required. This should be backported up to 3.1, after a period of observation.	2025-01-30 17:19:38 +01:00
Amaury Denoyelle	d04e93bc2e	MINOR: quic: allow BBR testing without pacing Pacing is activated per bind line via an optional boolean argument of quic-cc-algo keyword. Contrary to the default usage, pacing is automatically activated when BBR is chosen. This is because this algorithm is expected to run on top of pacing, else its behavior is undefined. Previously, pacing argument was thus ignored when BBR was selected. Change this to support explicit deactivation of pacing with it. This could be useful to test BBR without pacing when debugging some issues. This should be backported up to 3.1, after a period of observation.	2025-01-30 17:18:02 +01:00
Christopher Faulet	0a52a75ef7	BUG/MINOR: stream: Properly handle "on-marked-up shutdown-backup-sessions" The shutdown-backup-sessions action for the on-marked-up directive does not work anymore since the stream_shutdown() function was modified to be async-safe. When stream_shutdown() was modified to be async-safe, dedicated task events were added to map the reasons to shut a stream down. SF_ERR_DOWN was mapped to TASK_F_EVT1 and SF_ERR_KILLED was mapped to TASK_F_EVT2. The reverse mapping was performed by process_stream() to shut the stream with the appropriate reason. However, the SF_ERR_UP reason, used by the shutdown-backup-sessions action to shut a stream down because a preferred server became available, was not mapped in the same way. So since commit b8e3b0a18d ("BUG/MEDIUM: stream: make stream_shutdown() async-safe"), this action is ignored and does not work anymore. To fix the issue, and to be able to backport the fix, a third task event was added. TASK_F_EVT3 is now mapped on SF_ERR_UP. This patch should fix the issue #2848. It must be backported as far as 2.6.	2025-01-28 14:53:37 +01:00
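
The reason-to-event mapping discussed in the commit above can be modeled as a pair of lookup tables. The following Python sketch is only an illustration of why an unmapped reason gets silently dropped; the names are taken from the commit message, but they are treated here as plain strings, and the helper functions `post_shutdown()` and `reason_from_event()` are hypothetical, not HAProxy functions.

```python
from typing import Dict, Optional

# Mapping as described before the fix: SF_ERR_UP has no task event.
LEGACY_REASON_TO_EVENT: Dict[str, str] = {
    "SF_ERR_DOWN": "TASK_F_EVT1",
    "SF_ERR_KILLED": "TASK_F_EVT2",
}

# Mapping after the fix: a third event carries SF_ERR_UP.
FIXED_REASON_TO_EVENT: Dict[str, str] = {
    **LEGACY_REASON_TO_EVENT,
    "SF_ERR_UP": "TASK_F_EVT3",
}


def post_shutdown(reason: str, table: Dict[str, str]) -> Optional[str]:
    """Requesting side: translate a shutdown reason into a task event.

    Returns None when the reason has no event, which models the request
    being ignored.
    """
    return table.get(reason)


def reason_from_event(event: Optional[str], table: Dict[str, str]) -> Optional[str]:
    """Processing side: reverse mapping from a task event to a reason."""
    reverse = {evt: reason for reason, evt in table.items()}
    return reverse.get(event) if event is not None else None
```

With the legacy table, `post_shutdown("SF_ERR_UP", LEGACY_REASON_TO_EVENT)` yields no event, so the processing side never sees the request. With the fixed table, the reason round-trips through TASK_F_EVT3.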
Willy Tarreau	7fa70da06d	MINOR: epoll: permit to mask certain specific events A few times in the past we've seen cases where epoll was caught reporting a wrong event that caused trouble (e.g. spuriously reporting HUP or RDHUP after a successful connect()). The new tune.epoll.mask-events directive permits to mask events such as ERR, HUP and RDHUP and convert them to IN events that are processed by the regular receive path. This should help better diagnose and troubleshoot issues such as this one, as well as rule out such a cause when similar issues are reported: https://github.com/haproxy/haproxy/issues/2368 https://www.spinics.net/lists/netdev/msg876470.html It should be harmless to backport this if necessary.	2025-01-27 15:47:46 +01:00
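
The event masking described in the commit above amounts to a small bitmask transformation. The Python sketch below shows one plausible reading of "mask events and convert them to IN events"; it is not HAProxy's implementation. The `mask_events()` helper is hypothetical, and the constants are defined locally with the Linux epoll flag values so the example does not depend on the platform.

```python
# Linux epoll event bits, defined locally so the sketch runs anywhere.
EPOLLIN = 0x001
EPOLLERR = 0x008
EPOLLHUP = 0x010
EPOLLRDHUP = 0x2000


def mask_events(reported: int, masked: int) -> int:
    """Hide the masked bits from a reported event set.

    If any masked bit was present, it is removed and EPOLLIN is set
    instead, so the event goes through the regular receive path.
    Events that do not intersect the mask are returned unchanged.
    """
    if reported & masked:
        reported = (reported & ~masked) | EPOLLIN
    return reported
```

For example, with `masked = EPOLLHUP | EPOLLRDHUP`, a spurious `EPOLLHUP` report becomes a plain `EPOLLIN`, while an `EPOLLERR` report is left untouched because it is not part of the mask.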
Willy Tarreau	670182bc9e	[RELEASE] Released version 3.2-dev4 Released version 3.2-dev4 with the following main changes : - BUG/MINOR: stktable: fix big-endian compatiblity in smp_to_stkey() - MINOR: stktable: add stkey_to_smp() helper - MINOR: stktable: add stksess_getkey() helper - MINOR: stktable: add sc[0-2]_key fetches - BUG/MEDIUM: queues: Adjust the proxy counters when appropriate - MINOR: trace: add help message for -dt argument - MINOR: trace: ensure -dt priority over traces config section - MINOR: trace: support all source alias on -dt - BUG/MINOR: quic: reject NEW_TOKEN frames from clients - MINOR: stktable: fix potential build issue in smp_to_stkey - BUG/MEDIUM: stktable: fix missing lock on some table converters - BUG/MEDIUM: promex: Use right context pointers to dump backends extra-counters - MINOR: stktable: fix potential build issue in smp_to_stkey (2nd try) - MINOR: stktable: add smp_fetch_stksess() helper function - MEDIUM: stktable: split src-based key smp_fetch_sc functions - MEDIUM: stktable: split sc_ and src_ fetch lookup logics - MEDIUM: stktable: leverage smp_fetch_* helpers from sample conv - DOC: config: unify sample conv\|fetches optional arguments syntax - DOC: config: stick-table converters support implicit <table> argument - DOC: config: stick-table converter do accept ANY-typed input - DOC: config: clarify return type for some stick-table converters - DOC: config: refer to canonical sticktable converters for src_* fetches - CLEANUP: stktable: move sample_conv_table_bytes_out_rate() - MINOR: stktable: add table_{inc,clr}_gpc* converters - BUG/MAJOR: quic: reject too large CRYPTO frames - BUG/MAJOR: log/sink: possible sink collision in sink_new_from_srv() - BUG/MINOR: init: set HAPROXY_STARTUP_VERSION from the variable, not the macro - REORG: version: move the remaining BUILD_* stuff from haproxy.c to version.c - BUG/MINOR: quic: ensure a detached coalesced packet can't access its neighbours - MINOR: quic: Add a BUG_ON() on 
quic_tx_packet refcount - BUILD: quic: Move an ASSUME_NONNULL() for variable which is not null - BUG/MEDIUM: mux-h1: Properly close H1C if an error is reported before sending data - CLEANUP: quic: remove unused prototype - MINOR: quic: rename pacing_rate cb to pacing_inter - BUG/MINOR: quic: do not increase congestion window if app limited - MINOR: mux-quic: increment pacing retry counter on expired - MEDIUM: quic: implement credit based pacing - MEDIUM: mux-quic: reduce pacing CPU usage with passive wait - MEDIUM: quic: use dynamic credit for pacing - MINOR: quic: remove unused pacing burst in bind_conf/quic_cc_path - MINOR: quic: adapt credit based pacing to BBR - MINOR: tools: add errname to print errno macro name - MINOR: debug: debug_parse_cli_show_dev: use errname - MINOR: debug: show boot and runtime process settings in table	2025-01-24 11:01:06 +01:00
Amaury Denoyelle	cb91ccd8a8	MEDIUM: quic: use dynamic credit for pacing Major improvements have been introduced in pacing recently. Most notably, QMUX schedules emission on a millisecond resolution, which allows passive waiting and is much more CPU friendly. However, an issue remains with the pacing max credit. Unless BBR is used, it is fixed to the configured value from the quic-cc-algo bind statement. This is not practical: if too low, it may drastically reduce performance due to the 1ms sleep resolution. If too high, some clients will suffer from too much packet loss. This commit fixes the issue by implementing a dynamic maximum credit value based on the network conditions specific to each client. The calculation fixes a maximum value which should allow the current QMUX tasklet context to emit enough data to cover the delay until the next tasklet invocation. As such, avg_loop_us is used to detect the process load. If too small, 1.5ms is used as the minimal value, to cover the extra delay incurred by the system which will happen for a default 1ms sleep. This should be backported up to 3.1.	2025-01-23 17:40:48 +01:00
Amaury Denoyelle	8098be1fdc	MEDIUM: mux-quic: reduce pacing CPU usage with passive wait The pacing algorithm was revamped in the previous commit to implement a credit-based solution. This is a far more adaptive solution; in particular, it allows catching up when the pause between pacing emissions was longer than expected. This allows QMUX to remove the active loop based on tasklet wake-up. Instead, a new task is used when emission should be paced. The main advantage is that CPU usage is drastically reduced. The new pacing task timer is reset each time qcc_io_send() is invoked. The timer will be set only if the pacing engine reports that emission must be interrupted. In this case the timer is set via qcc_wakeup_pacing() to the delay reported by the congestion algorithm, or 1ms if the delay is too short. At the end of qcc_io_cb(), the pacing task is queued if the timer has been set. Pacing task execution is simple enough: it immediately wakes up the QCC I/O handler. Note that decent performance requires a large enough burst defined in the quic-cc-algo configuration. However, this value is common to all clients of a listener, which may cause too much loss under some network conditions. This will be addressed in a future patch. This should be backported up to 3.1.	2025-01-23 17:40:22 +01:00
Aurelien DARRAGON	0486b9e491	MINOR: stktable: add table_{inc,clr}_gpc* converters As discussed in GH #2423, there are some cases where src_{inc,clr}_gpc* is not sufficient because we need to perform the lookup on a specific key. Indeed, just like we did in e642916 ("MEDIUM: stktable: leverage smp_fetch_* helpers from sample conv"), we can easily implement new table converters based on existing fetches. This is what we do in this patch. Also the doc was updated so that src_{inc,clr}_gpc* fetches now point to their generic equivalent table_{inc,clr}_gpc. Indeed, src_{inc,clr}_gpc are simply aliases. This should fix GH #2423.	2025-01-16 11:50:33 +01:00
Aurelien DARRAGON	62e42184ab	DOC: config: refer to canonical sticktable converters for src_* fetches When available, to prevent doc duplication, let's make src_* fetches point to equivalent table_* converters, as they are in fact aliases for src,table_* converters.	2025-01-16 11:50:20 +01:00
Aurelien DARRAGON	163c1124a2	DOC: config: clarify return type for some stick-table converters Some stick-table converters such as "table_gpt" erroneously suggest that the returned type is a boolean while in fact it is integer type, as properly documented for the sample fetch equivalents.	2025-01-16 11:50:14 +01:00
Aurelien DARRAGON	a8407cf3f7	DOC: config: stick-table converter do accept ANY-typed input Since 2d17db58 ("MINOR: stick-table: change all stick-table converters' inputs to SMP_T_ANY"), all stick-table converters accept ANY input type as parameter, which means the key is no longer restricted to a string representation of the input. However the doc wasn't updated when the change was made. Moreover, some converters document the updated behavior while others don't, which is kind of confusing; let's fix that.	2025-01-16 11:50:08 +01:00
Aurelien DARRAGON	0d318b4383	DOC: config: stick-table converters support implicit <table> argument As with stick-table sample fetches, the <table> argument is not strictly needed and defaults to the current proxy's stick-table when not provided. Let's update the doc and prototype to reflect the current behavior.	2025-01-16 11:50:02 +01:00
Aurelien DARRAGON	dfdee47a8e	DOC: config: unify sample conv\|fetches optional arguments syntax The most common way (and proper way it seems) to declare optional arguments in a sample fetch or converter prototype is to declare them between square brackets, including the leading comma (because the comma should be omitted if the argument is not provided). Also, when multiple optional arguments are found, we should apply the same logic but recursively. In this patch we fix prototypes that include optional arguments and don't follow this syntax. This improves readability and sets the norm for upcoming sample fetches/converters.	2025-01-16 11:49:55 +01:00
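
The bracket convention described in the commit above (optional arguments in square brackets, leading comma inside the bracket, nesting applied recursively) can be made concrete with a tiny formatter. This Python sketch is purely illustrative: `prototype()` is a hypothetical helper, and the names `example_conv` and `example_fetch` are made up, not real HAProxy keywords.

```python
from typing import Sequence


def prototype(name: str, required: Sequence[str], optional: Sequence[str]) -> str:
    """Render a prototype string with nested optional arguments.

    Each optional argument is wrapped in square brackets together with
    its leading comma, and later optional arguments nest inside earlier
    ones. The comma is dropped only for a first optional argument that
    has no required argument before it.
    """
    tail = ""
    # Build from the innermost (last) optional argument outwards.
    for pos in range(len(optional) - 1, -1, -1):
        sep = "," if (required or pos > 0) else ""
        tail = f"[{sep}<{optional[pos]}>{tail}]"
    args = ",".join(f"<{arg}>" for arg in required)
    if not required and not optional:
        return name
    return f"{name}({args}{tail})"


print(prototype("example_conv", ["idx"], ["table"]))
# prints: example_conv(<idx>[,<table>])
print(prototype("example_conv", ["a"], ["b", "c"]))
# prints: example_conv(<a>[,<b>[,<c>]])
print(prototype("example_fetch", [], ["table"]))
# prints: example_fetch([<table>])
```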

3279 Commits