haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-11-04 02:21:03 +01:00

Author	SHA1	Message	Date
Willy Tarreau	d11241b7ba	MINOR: cpu-topo: fall back to nominal_perf and scaling_max_freq for the capacity When cpu_capacity is not present, let's try to check acpi_cppc's nominal_perf which is similar and commonly found on servers, then scaling_max_freq (though that last one may vary a bit between CPUs depending on die quality). That variation is not a problem since we can absorb a ~5% variation without issue. It was verified on an i9-14900 featuring 5.7-P, 6.0-P and 4.4-E GHz that P-cores were not reordered and that E cores were placed last. It was also OK on a W3-2345 with 4.3 to 4.5GHz.	2025-03-14 18:30:30 +01:00
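The fallback order described in commit d11241b7ba can be sketched as below. This is a minimal Python illustration, not HAProxy's C code: the names `FALLBACK_ORDER`, `cpu_capacity` and `read_sysfs_int` are hypothetical, and the sketch returns the raw value without showing how values from different sources are normalized or compared.

```python
# Hypothetical sketch of the capacity fallback order from the commit message:
# cpu_capacity first, then acpi_cppc's nominal_perf, then scaling_max_freq.
FALLBACK_ORDER = (
    "cpu_capacity",
    "acpi_cppc/nominal_perf",
    "cpufreq/scaling_max_freq",
)


def read_sysfs_int(path):
    """Read an integer from a sysfs-style file, or return None on failure."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def cpu_capacity(cpu_dir, read_int=read_sysfs_int):
    """Return the first available capacity hint for one CPU directory.

    cpu_dir is assumed to look like /sys/devices/system/cpu/cpu0.
    read_int is injectable so the lookup order can be exercised without /sys.
    """
    for rel in FALLBACK_ORDER:
        value = read_int(f"{cpu_dir}/{rel}")
        if value is not None:
            return value
    return None
```

Passing a custom `read_int` (for example the `get` method of a dict) allows checking the order on a machine that lacks some of these files.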
Willy Tarreau	322c28cc19	MINOR: cpu-topo: refine cpu dump output to better show kept/dropped CPUs It's becoming difficult to see which CPUs are going to be kept/dropped. Let's just skip all offline CPUs, and indicate "keep" in front of those that are going to be used, and "----" in front of the excluded ones. It is way more readable this way. Also let's just drop the array entry number, since it's always the same as the CPU number and is only an internal representation anyway.	2025-03-14 18:30:30 +01:00
Willy Tarreau	f1210ee7c6	MEDIUM: cfgparse: remove now unused numa & thread-count detection This is not needed anymore since it is already done before landing here via thread_detect_count().	2025-03-14 18:30:30 +01:00
Willy Tarreau	e3aef4c9a4	MEDIUM: thread: reimplement first numa node detection Let's reimplement automatic binding to the first NUMA node when thread count is not forced. It's the same thing as is already done in check_config_validity() except that this time it's based on the collected CPU information. The threads are automatically counted and CPUs from non-first node(s) are evicted.	2025-03-14 18:30:30 +01:00
Willy Tarreau	4a525e8d27	MEDIUM: cpu-topo: make sure to properly assign CPUs to threads as a fallback If no cpu-map is done and no cpu-policy could be enforced, we still need to count the number of usable CPUs, assign them to all threads and set the nbthread value accordingly. This already handles the part that was done in check_config_validity() via thread_cpus_enabled_at_boot.	2025-03-14 18:30:30 +01:00
Willy Tarreau	1af4942c95	MEDIUM: thread: start to detect thread groups and threads min/max By mutually refining the thread count and group count, we can try to detect the most suitable setup for the current machine. Taskset is implicitly handled correctly. tgroups automatically adapt to the configured number of threads. cpu-map manages to limit tgroups to the smallest supported value. The thread-limit is enforced. Just like in cfgparse, if the thread count was forced to a higher value, it's reduced and a warning is emitted. But if it was not set, the thr_max value is bound to this limit so that further calculations respect it. We continue to default to the max number of available threads and 1 tgroup by default, with the limit. This normally allows to get rid of that test in check_config_validity().	2025-03-14 18:30:30 +01:00
Willy Tarreau	68069e4b27	MINOR: cpu-topo: add "drop-cpu" and "only-cpu" to cpu-set These allow respectively to disable binding to CPUs listed in a set, and to disable binding to CPUs not in a set.	2025-03-14 18:30:30 +01:00
Willy Tarreau	cda4956d9c	MINOR: cpu-topo: add a new "cpu-set" global directive to choose cpus For now it's limited, it only supports "reset" to ask that any previous "taskset" be ignored. The goal will be to later add more actions that allow to symbolically define sets of cpus to bind to or to drop. This also clears the cpu_mask_forced variable that is used to detect that a taskset had been used.	2025-03-14 18:30:30 +01:00
Willy Tarreau	f0661e79fe	MINOR: global: add a command-line option to enable CPU binding debugging During development, everything related to CPU binding and the CPU topology is debugged using state dumps at various places, but it does make sense to have a real command line option so that this remains usable in production to help users figure why some CPUs are not used by default. Let's add "-dc" for this. Since the list of global.tune.options values is almost full and does not 100% match this option, let's add a new "tune.debug" field for this.	2025-03-14 18:30:30 +01:00
Willy Tarreau	94543d7b65	MINOR: cfgparse: use already known offline CPU information No need to reparse cpu/online, let's just rely on the info we learned previously about offline CPUs.	2025-03-14 18:30:30 +01:00
Willy Tarreau	1560827c9d	MINOR: cfgparse: move the binding detection into numa_detect_topology() For now the function refrains from detecting the CPU topology when a restrictive taskset or cpu-map was already performed on the process, and it's documented as such, the reason being that until we're able to automatically create groups, better not change user settings. But we'll need to be able to detect bound CPUs and to process them as desired by the user, so we now need to move that detection into the function itself. It changes nothing to the logic, just gives more freedom to the function.	2025-03-14 18:30:30 +01:00
Willy Tarreau	ac1db9db7d	MINOR: thread: turn thread_cpu_mask_forced() into an init-time variable The function is not convenient because it doesn't allow us to undo the startup changes, and depending on where it's being used, we don't know whether the values read have already been altered (this is not the case right now but it's going to evolve). Let's just compute the status during cpu_detect_usable() and set a variable accordingly. This way we'll always read the init value, and if needed we can even afford to reset it. Also, placing it in cpu_topo.c limits cross-file dependencies (e.g. threads without affinity etc).	2025-03-14 18:30:30 +01:00
Willy Tarreau	3a7cc676fa	MINOR: cpu-topo: add NUMA node identification to CPUs on FreeBSD With this patch we're also assigning NUMA node IDs to each CPU when the info is found. The code is highly inspired from the one in commit f5d48f8b3 ("MEDIUM: cfgparse: numa detect topology on FreeBSD."), the difference being that we're just setting the value in ha_cpu_topo[].	2025-03-14 18:30:30 +01:00
Willy Tarreau	f6154c079e	MINOR: cpu-topo: add NUMA node identification to CPUs on Linux With this patch we're also assigning NUMA node IDs to each CPU when one is found. The code is highly inspired from the one in commit b56a7c89a ("MEDIUM: cfgparse: detect numa and set affinity if needed") that already did the job, except that it could be simplified since we're just collecting info to fill the ha_cpu_topo[] array.	2025-03-14 18:30:30 +01:00
Willy Tarreau	65612369e7	MINOR: cpu-topo: also store the sibling ID with SMT The sibling ID was not reported because it's not directly accessible but we don't care, what matters is that we assign numbers to all the threads we find using the same CPU so that some strategies permit to allocate one thread at a time if we want to use few threads with max performance.	2025-03-14 18:30:30 +01:00
Willy Tarreau	7cb274439b	MINOR: cpu-topo: add CPU topology detection for linux This uses the publicly available information from /sys to figure the cache and package arrangements between logical CPUs and fill ha_cpu_topo[], as well as their SMT capabilities and relative capacity for those which expose this. The functions clearly have to be OS-specific.	2025-03-14 18:30:30 +01:00
Willy Tarreau	12f3a2bbb7	MINOR: cpu-topo: try to detect offline cpus at boot When possible, the offline CPUs are detected at boot and their OFFLINE flag is set in the ha_cpu_topo[] array. When the detection is not possible (e.g. not linux, /sys not mounted etc), we just mark none of them as being offline, as we don't want to infer wrong info that could hinder automatic CPU placement detection. When valid, we take this opportunity for refining cpu_topo_lastcpu so that we don't need to manipulate CPUs beyond this value.	2025-03-14 18:30:30 +01:00
Willy Tarreau	44881e5abf	MINOR: cpu-topo: add detection of online CPUs on FreeBSD On FreeBSD we can detect online CPUs at least by doing the bitwise-OR of the CPUs of all domains, so we're using this and adding this detection to ha_cpuset_detect_online(). If we find simpler later, we can always rework it, but it's reasonably inexpensive since we only check existing domains.	2025-03-14 18:30:30 +01:00
Willy Tarreau	8f72ce335a	MINOR: cpu-topo: add detection of online CPUs on Linux This adds a generic function ha_cpuset_detect_online() which for now only supports linux via /sys. It fills a cpuset with the list of online CPUs that were detected (or returns a failure).	2025-03-14 18:30:30 +01:00
Willy Tarreau	8c524c7c9d	REORG: cpu-topo: move bound cpu detection from cpuset to cpu-topo The cpuset files are normally used only for cpu manipulations. It happens that the initial CPU binding detection was initially placed there since there was no better place, but in practice, being OS-specific, it should really be in cpu-topo. This simplifies cpuset which doesn't need to know about the OS anymore.	2025-03-14 18:30:30 +01:00
Willy Tarreau	a6fdc3eaf0	MINOR: cpu-topo: update CPU topology from excluded CPUs at boot Now before trying to resolve the thread assignment to groups, we detect which CPUs are not bound at boot so that we can mark them with HA_CPU_F_EXCLUDED. This will be useful to better know on which CPUs we can count later. Note that we purposely ignore cpu-map here as we don't know how threads and groups will map to cpu-map entries, hence which CPUs will really be used. It's important to proceed this way so that when we have no info we assume they're all available.	2025-03-14 18:30:30 +01:00
Willy Tarreau	bdb731172c	MINOR: cpu-topo: add a function to dump CPU topology The new function cpu_dump_topology() will centralize most debugging calls, and it can make efforts of not dumping some possibly irrelevant fields (e.g. non-existing cache levels).	2025-03-14 18:30:30 +01:00
Willy Tarreau	041462c4af	MINOR: cpu-topo: rely on _SC_NPROCESSORS_CONF to trim maxcpus We don't want to constantly deal with as many CPUs as a cpuset can hold, so let's first try to trim the value to what the system claims to support via _SC_NPROCESSORS_CONF. It is obviously still subject to the limit of the cpuset size though. The value is stored globally so that we can reuse it elsewhere after initialization.	2025-03-14 18:30:30 +01:00
Willy Tarreau	656cedad42	MINOR: cpu-topo: allocate and initialize the ha_cpu_topo array. This does the bare minimum to allocate and initialize a global ha_cpu_topo array for the number of supported CPUs and release it at deinit time.	2025-03-14 18:30:30 +01:00
Willy Tarreau	05a4efb102	MINOR: thread: rely on the cpuset functions to count bound CPUs Let's just clean up the thread_cpus_enabled() code a little bit by removing the OS-specific code and relying on ha_cpuset_detect_bound() instead. On macos we continue to use sysconf() for now.	2025-03-14 18:30:30 +01:00
Willy Tarreau	32bb68e736	MINOR: cpuset: make the API support negative CPU IDs Negative IDs are very convenient to mean "not set", so let's just make the cpuset API robust against this, especially with ha_cpuset_isset() so that we don't have to manually add this check everywhere when a value is not known.	2025-03-14 18:30:30 +01:00
Willy Tarreau	ed75148ca0	BUILD: tools: avoid a build warning on gcc-4.8 in resolve_sym_name() A build warning is emitted with gcc-4.8 in tools.c since commit e920d73f59 ("MINOR: tools: improve symbol resolution without dl_addr") because the compiler doesn't see that <size> is necessarily initialized. Let's just preset it.	2025-03-14 18:30:30 +01:00
Willy Tarreau	4e09789644	MINOR: tools: teach resolve_sym_name() a few more common symbols This adds run_poll_loop, run_tasks_from_lists, process_runnable_tasks, ha_dump_backtrace and cli_io_handler which are fairly common in backtraces. This will result in fewer relative symbols when dladdr is not usable.	2025-03-13 17:31:16 +01:00
Willy Tarreau	a3582a77f7	MINOR: tools: ease the declaration of known symbols in resolve_sym_name() Let's have a macro that declares both the symbol and its name, it will avoid the risk of introducing typos, and encourages adding more when needed. The macro also takes an optional second argument to permit an inline declaration of an extern symbol.	2025-03-13 17:30:48 +01:00
Willy Tarreau	e920d73f59	MINOR: tools: improve symbol resolution without dl_addr When dl_addr is not usable or fails, better fall back to the closest symbol among the known ones instead of providing everything relative to main. Most often, the location of the function will give some hints about what it can be. Thus now we can emit fct+0xXXX in addition to main+0xXXX or main-0xXXX. We keep a margin of +256kB maximum after a function for a match, which is around the maximum size met in an object file, otherwise it becomes pointless again.	2025-03-13 17:30:48 +01:00
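The closest-symbol fallback described in commit e920d73f59 can be sketched as follows. This is a hedged Python illustration only: `resolve`, `known` and `main_addr` are hypothetical names, the addresses are made up, and treating the 256kB margin as inclusive is an assumption.

```python
# +256kB maximum margin after a function for a match, per the commit message.
MAX_MARGIN = 256 * 1024


def resolve(addr, known, main_addr):
    """Return 'fct+0xOFF' for the closest known symbol at or below addr.

    known maps symbol names to start addresses (hypothetical data).
    When no known symbol is within the margin, fall back to an offset
    relative to main, which may be positive or negative.
    """
    best = None
    for name, start in known.items():
        off = addr - start
        if 0 <= off <= MAX_MARGIN and (best is None or off < best[1]):
            best = (name, off)
    if best is not None:
        return f"{best[0]}+{best[1]:#x}"
    delta = addr - main_addr
    sign = "+" if delta >= 0 else "-"
    return f"main{sign}{abs(delta):#x}"
```

For instance, with `known = {"run_poll_loop": 0x5000}`, the address `0x5010` resolves to `run_poll_loop+0x10`, while an address far beyond any known symbol is reported relative to main.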
Willy Tarreau	1e99efccef	MINOR: cli: export cli_io_handler() to ease symbol resolution It's common to meet this function in backtraces, it's a bit annoying that it's not resolved, so let's export it so that it becomes resolvable.	2025-03-13 17:30:48 +01:00
Aurelien DARRAGON	8311be5ac6	BUG/MINOR: stats: fix capabilities and hide settings for some generic metrics Performing a diff on stats output before vs after commit 66152526 ("MEDIUM: stats: convert counters to new column definition") revealed that some metrics were not properly ported to the new API. Namely, "lbtot", "cli_abrt" and "srv_abrt" are now exposed on frontend and listeners while it was not the case before. Also, "hrsp_other" is exposed even when "mode http" wasn't set on the proxy. In this patch we restore original behavior by fixing the capabilities and hide settings. As this could be considered as a minor regression (looking at the commit message it doesn't seem intended), better tag this as a bug. It should be backported in 3.0 with 66152526.	2025-03-13 11:49:18 +01:00
Willy Tarreau	78ef52dbd1	BUILD: backend: silence a build warning when threads are disabled Since commit 8de8ed4f48 ("MEDIUM: connections: Allow taking over connections from other tgroups.") we got this partially absurd build warning when disabling threads: src/backend.c: In function 'conn_backend_get': src/backend.c:1371:27: warning: array subscript [0, 0] is outside array bounds of 'struct tgroup_info[1]' [-Warray-bounds] The reason is that gcc sees that curtgid is not equal to tgid which is defined as 1 in this case, thus it figures that tgroup_info[curtgid-1] will be anything but zero and that doesn't fit. It is ridiculous as it is a perfect case of dead code elimination which should not warrant a warning. Nevertheless we know we don't need to do this when threads are disabled and in this case there will not be more than 1 thread group, so we can happily use that preliminary test to help the compiler eliminate the dead condition and avoid spitting this warning. No backport is needed.	2025-03-12 18:16:14 +01:00
Willy Tarreau	b61ed9babe	BUILD: tools: silence a build warning when USE_THREAD=0 The dladdr_lock that was added to avoid re-entering into dladdr is conditioned by threads, but the way it's declared causes a build warning if threads are disabled due to the insertion of a lone semi colon in the variables block. Let's switch to __decl_thread_var() for this. This can be backported wherever commit eb41d768f9 ("MINOR: tools: use only opportunistic symbols resolution") is backported. It relies on these previous two commits: bb4addabb7 ("MINOR: compiler: add a simple macro to concatenate resolved strings") 69ac4cd315 ("MINOR: compiler: add a new __decl_thread_var() macro to declare local variables")	2025-03-12 18:11:14 +01:00
Willy Tarreau	12383fd9f5	BUG/MEDIUM: thread: use pthread_self() not ha_pthread[tid] in set_affinity A bug was uncovered by the work on NUMA. It only triggers in the CI with libmusl due to a race condition. What happens is that the call to set_thread_cpu_affinity() is done very early in the polling loop, and that it relies on ha_pthread[tid] instead of pthread_self(). The problem is that ha_pthread[tid] is only set by the return from pthread_create(), which might happen later depending on the number of CPUs available to run the starting thread. Let's just use pthread_self() here. ha_pthread[] is only used to send signals between threads, there's no point in using it here. This can be backported to 2.6.	2025-03-12 15:59:23 +01:00
Aurelien DARRAGON	e942305214	MEDIUM: log: change default "host" strategy for log-forward section Historically, log-forward proxy used to preserve host field from input message as much as possible, and if syslog host wasn't provided (rfc5424 '-' or bad rfc3164 or rfc5424 message) then "localhost" or "-" would be used as host when outputting message using rfc3164 or rfc5424. We change that behavior (which corresponds to "keep" host option), so that log-forward now uses "fill" strategy as default: if the host is provided in input message, it is preserved. However if it is missing and IP address from sender is available, we use it.	2025-03-12 10:55:49 +01:00
Aurelien DARRAGON	ad0133cc50	MINOR: log: handle log-forward "option host" Following previous patch, we now implement the logic for the host option under log-forward section. Possible strategies are: "replace": If input message already contains a value for the host field, we replace it by the source IP address from the sender. If input message doesn't contain a value for the host field (ie: '-' as input rfc5424 message or non compliant rfc3164 or rfc5424 message), we use the source IP address from the sender as host field. "fill": If input message already contains a value for the host field, we keep it. If input message doesn't contain a value for the host field (ie: '-' as input rfc5424 message or non compliant rfc3164 or rfc5424 message), we use the source IP address from the sender as host field. "keep": If input message already contains a value for the host field, we keep it. If input message doesn't contain a value for the host field, we set it to localhost (rfc3164) or '-' (rfc5424). (This is the default) "append": If input message already contains a value for the host field, we append a comma followed by the IP address from the sender. If input message doesn't contain a value for the host field, we use the source IP address from the sender. Default value (unchanged) is "keep" strategy. option host is only relevant with rfc3164 or rfc5424 format on log targets. Also, if the source address is not available (ie: UNIX socket), default behavior prevails. Documentation was updated.	2025-03-12 10:52:07 +01:00
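The four "option host" strategies listed in commit ad0133cc50 can be summarized with a small decision function. This is a hedged Python sketch rather than the actual C implementation: `apply_host_option` and its parameters are hypothetical, a missing host is modelled as `None`, an empty string or `'-'`, and "default behavior prevails" is interpreted here as falling back to the "keep" logic.

```python
def apply_host_option(strategy, host, source_ip, fmt="rfc5424"):
    """Return the host field to emit for one forwarded message.

    strategy:  "replace", "fill", "keep" or "append"
    host:      host field from the input message, or None / "" / "-" if absent
    source_ip: sender address as text, or None (e.g. UNIX socket)
    fmt:       output format, "rfc3164" or "rfc5424"
    """
    placeholder = "localhost" if fmt == "rfc3164" else "-"
    has_host = host not in (None, "", "-")

    if source_ip is None or strategy == "keep":
        # No usable source address, or explicit "keep": preserve the input
        # host when present, otherwise emit the format's placeholder.
        return host if has_host else placeholder
    if strategy == "replace":
        return source_ip
    if strategy == "fill":
        return host if has_host else source_ip
    if strategy == "append":
        return f"{host},{source_ip}" if has_host else source_ip
    raise ValueError(f"unknown strategy: {strategy!r}")
```

Keeping the "no source address" case first makes the UNIX socket behavior the same for every strategy, which matches the note at the end of the commit message.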
Aurelien DARRAGON	003fe530ae	MINOR: log: add "option host" log-forward option Add only the parsing part; the options are currently unused.	2025-03-12 10:51:35 +01:00
Aurelien DARRAGON	47f14be9f3	MINOR: tools: only print address in sa2str() when port == -1 Support special value for port in sa2str: if port is equal to -1, only print the address without the port, also ignoring <map_ports> value.	2025-03-12 10:51:20 +01:00
Aurelien DARRAGON	2de62d0461	MINOR: log: provide source address information in syslog_process_message() Provide the struct sockaddr_storage pointer from the message sender in syslog_process_message().	2025-03-12 10:50:30 +01:00
Aurelien DARRAGON	bc76f6dde9	MINOR: log: migrate log-forward options from proxy->options2 to options3 Migrate recently added log-forward section options, currently stored under proxy->options2 to proxy->options3 since proxy->options2 is running out of space and we plan on adding more log-forward options.	2025-03-12 10:50:03 +01:00
Aurelien DARRAGON	cc5a66212d	MINOR: proxy: add proxy->options3 proxy->options2 is almost full, yet we will add new log-forward options in upcoming patches so we anticipate that by adding a new {no_}options3 and cfg_opts3[] to further extend proxy options	2025-03-12 10:49:36 +01:00
Aurelien DARRAGON	d47e7103b8	CLEANUP: log: add syslog_process_message() helper Prevent code duplication under syslog_fd_handler() and syslog_io_handler() by merging common code path in a single syslog_process_message() helper that processed a single message stored in <buf> according to <frontend> settings.	2025-03-12 10:49:18 +01:00
Aurelien DARRAGON	8b8520305e	CLEANUP: log-forward: remove useless options2 init It is actually not required to zero out proxy->options2 since proxy is allocated using calloc() which already does it.	2025-03-12 10:49:08 +01:00
William Lallemand	d014d7ee72	TESTS: jws: implement a test for JWS signing This test returns a JWS payload signed with a specified private key in the PEM format, and uses the "jose" command tool to check if the signature is correct against the jwk public key. The test could be improved later by using the code from jwt.c allowing to check a signature.	2025-03-11 22:29:40 +01:00
William Lallemand	3abb428fc8	MINOR: jws: implement JWS signing This commit implements JWS signing, which is divided in 3 parts: - jws_b64_protected() creates a JWS "protected" header, which takes the algorithm, kid or jwk, nonce and url as input, and fills a destination buffer with the base64url version of the header - jws_b64_payload() just encodes a payload in base64url - jws_b64_signature() generates a signature using as input the protected header and the payload, it supports ES256, ES384 and ES512 for ECDSA keys, and RS256 for RSA ones. The RSA signature just uses the EVP_DigestSign() API with its result encoded in base64url. For ECDSA it's a little bit more complicated, and should follow section 3.4 of RFC7518, R and S should be padded to byte size. Then the JWS can be output with jws_flattened() which just formats the 3 base64url outputs in a JSON representation with the 3 fields, protected, payload and signature.	2025-03-11 22:29:40 +01:00
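Two details from commit 3abb428fc8 can be illustrated with a short sketch: the fixed-size R and S padding required by section 3.4 of RFC 7518 for ECDSA signatures, and the flattened JSON output with its 3 fields. This is Python using only the standard library, not the HAProxy C code; the helper names (`b64url`, `ecdsa_raw_signature`, `flattened`) are hypothetical, and no actual signing is performed (R and S are taken as given integers).

```python
import base64
import json

# Size in bytes of each of R and S per RFC 7518 section 3.4
# (P-256, P-384 and P-521 curves respectively).
COORD_SIZE = {"ES256": 32, "ES384": 48, "ES512": 66}


def b64url(data: bytes) -> str:
    """base64url encoding without '=' padding, as used by JWS."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def ecdsa_raw_signature(alg: str, r: int, s: int) -> bytes:
    """Concatenate R and S as fixed-size big-endian integers (R || S)."""
    size = COORD_SIZE[alg]
    return r.to_bytes(size, "big") + s.to_bytes(size, "big")


def flattened(protected_b64: str, payload_b64: str, signature_b64: str) -> str:
    """Format the three base64url parts as a flattened JWS JSON object."""
    return json.dumps({
        "protected": protected_b64,
        "payload": payload_b64,
        "signature": signature_b64,
    })
```

The zero padding matters because a signature library typically returns R and S as variable-length integers, while the JWS encoding expects each half at exactly the curve's byte size.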
Valentine Krasnobaeva	7d427134fe	MINOR: startup: adjust alert messages, when capabilities are missed CAP_SYS_ADMIN support was added, in order to access sockets in namespaces. So let's adjust the alert at startup, where we check preserved capabilities from global.last_checks. Let's mention here cap_sys_admin as well.	2025-03-07 16:37:16 +01:00
Damien Claisse	f0a07f834c	BUG/MINOR: cfgparse-tcp: relax namespace bind check Commit 5cbb278 introduced cap_sys_admin support, and enforced checks for both binds and servers. However, when binding into a namespace, the bind is done before dropping privileges. Hence, checking that we have cap_sys_admin capability set in this case is not needed (and it would decrease security to add it). For users starting haproxy with other user than root and without cap_sys_admin, bind should have already failed. As a consequence, relax runtime check for binds into a namespace.	2025-03-07 16:23:29 +01:00
Amaury Denoyelle	dc7913d814	MAJOR: mux-quic: increase stream flow-control for multi-buffer alloc Support for multiple Rx buffers per QCS instance has been introduced by previous patches. However, due to flow-control initial values, clients were still unable to fully use this to increase their upload throughput. This patch increases max-stream-data-bidi-remote flow-control initial values. A new define QMUX_STREAM_RX_BUF_FACTOR will fix the number of concurrent buffers allocable per QCS. It is set to 90. Note that connection flow-control initial value did not change. It is still configured to be equivalent to bufsize multiplied by the maximum concurrent streams. This ensures that Rx buffers allocation is still constrained per connection, so that it won't be possible to have all active QCS instances using in parallel their maximum Rx buffers count.	2025-03-07 12:06:27 +01:00
Amaury Denoyelle	75027692a3	MEDIUM: mux-quic: handle too short data splitted on multiple rxbuf Previous commit introduces support for multiple Rx buffers per QCS instance. Contiguous data may be split across multiple buffers depending on their offset. A particular issue could arise with this new model. Indeed, app_ops rcv_buf callback can still deal with a single buffer at a time. This may cause a deadlock in decoding if app_ops layer cannot proceed due to partial data, but such data are precisely divided across two buffers. This can for example occur during HTTP/3 frame header parsing. To deal with this, a new function is implemented to force data realign between two contiguous buffers. This is called only when app_ops rcv_buf returned 0 but data is available in the next buffer after the current one. In this case, data are transferred from the next into the current buffer via qcs_transfer_rx_data(). Decoding is then restarted, which should ensure that app_ops layer has enough data to advance. During this operation, special care is taken to remove both qc_stream_rxbuf entries, as their offsets are adjusted. The next buffer is only reinserted if there is remaining data in it, else it can be freed. This case is not easily reproducible as it depends on the HTTP/3 framing used by the client. It seems to be easily reproduced though with quiche. $ quiche-client --http-version HTTP/3 --method POST --body /tmp/100m "https://127.0.0.1:20443/post"	2025-03-07 12:06:27 +01:00
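The realign step described in commit 75027692a3 can be modelled roughly as follows. This is a simplified Python sketch under stated assumptions, not the logic of qcs_transfer_rx_data() itself: buffers are held in a dict keyed by stream offset, `capacity` stands for the buffer size, and the function name `transfer_rx_data` is hypothetical.

```python
def transfer_rx_data(bufs, cur_off, capacity):
    """Move data from the next contiguous buffer into the current one.

    bufs maps a stream offset to a bytearray holding data from that offset.
    Both entries are removed first because the next buffer's offset changes;
    the next buffer is reinserted only if data remains in it.
    Returns the (possibly extended) current buffer.
    """
    cur = bufs.pop(cur_off)
    next_off = cur_off + len(cur)
    nxt = bufs.pop(next_off, None)
    if nxt is not None:
        # Copy as much as fits in the free room of the current buffer.
        moved = max(0, min(capacity - len(cur), len(nxt)))
        cur += nxt[:moved]
        rest = nxt[moved:]
        if rest:
            # Remaining data now starts further into the stream.
            bufs[next_off + moved] = rest
    bufs[cur_off] = cur
    return cur
```

After the transfer, the decoder would be retried on the current buffer, which now holds enough contiguous bytes for a header that previously straddled the two buffers.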

19082 Commits