haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-12-21 01:21:00 +01:00

Author	SHA1	Message	Date
Willy Tarreau	5e558c1727	MINOR: stream/cli: make "show sess" support filtering on front/back/server With "show sess", particularly "show sess all", we're often missing the ability to inspect only streams attached to a frontend, backend or server. Let's just add these filters to the command. Only one at a time may be set. One typical use case could be to dump streams attached to a server after issuing "shutdown sessions server XXX" to figure why any wouldn't stop for example.	2025-03-07 10:38:12 +01:00
Willy Tarreau	2bd7cf53cb	MINOR: stream/cli: rework "show sess" to better consider optional arguments The "show sess" CLI command parser is getting really annoying because several options were added in an exclusive mode as the single possible argument. Recently some cumulable options were added ("show-uri") but the older ones were not yet adapted. Let's just make sure that the various filters such as "older" and "age" now belong to the options and leave only <id>, "all", and "help" for the first ones. The doc was updated and it's now easier to find these options.	2025-03-07 10:36:58 +01:00
Willy Tarreau	1cdf2869f6	BUG/MINOR: stream: fix age calculation in "show sess" output The "show sess" output reports an age that's based on the last byte of the HTTP request instead of the stream creation date, due to a confusion between logs->request_ts and the request_date sample fetch function. Most of the time these are equal except when the request is not yet full for any reason (e.g. wait-body). This explains why a few "show sess" could report a few new streams aged by 99 days for example. Let's perform the correct request timestamp calculation like the sample fetch function does, by adding t_idle and t_handshake to the accept_ts. Now the stream's age is correct and can be correctly used with the "show sess older <age>" variant. This issue was introduced in 2.9 and the fix can be backported to 3.0.	2025-03-07 10:36:58 +01:00
Aurelien DARRAGON	dbb25720dd	MINOR: cfgparse/peers: provide more info when ignoring invalid "peer" or "server" lines Invalid (incomplete) "server" or "peer" lines under peers section are now properly ignored. For completeness, in this patch we add some reports so that the user knows that incomplete lines were ignored. For an incomplete server line, since it is tolerated (see GH #565), we only emit a diag warning. For an incomplete peer line, we report a real warning, as it is not expected to have a peer line without an address:port specified. Also, 'newpeer == curpeers->local' check could be simplified since we already have the 'local_peer' variable which tells us that the parsed line refers to a local peer.	2025-03-07 09:39:51 +01:00
Aurelien DARRAGON	a76b5358f0	BUG/MINOR: server: don't return immediately from parse_server() when skipping checks If parse_server() is called under peers section parser, and the address needs to be parsed but it is missing, we directly return from the function. However since 0fc136ce5b ("REORG: server: use parsing ctx for server parsing"), parse_server() uses parsing ctx to emit warning/errors, and the ctx must be reset before returning from the function, yet this early return was overlooked. Because of that, any ha_{warning,alert..} message reported after early return from parse_server() could cause messages to have an extra "parsing [file:line]" info. We fix that by ensuring parse_server() doesn't return without resetting the parsing context. It should be backported up to 2.6	2025-03-07 09:39:46 +01:00
Aurelien DARRAGON	054443dfb9	BUG/MINOR: cfgparse/peers: properly handle ignored local peer case In 8ba10fea6 ("BUG/MINOR: peers: Incomplete peers sections should be validated."), some checks were relaxed in parse_server(), and extra logic was added in the peers section parser in an attempt to properly ignore incomplete "server" or "peer" statement under peers section. This was done in response to GH #565, the main intent was that haproxy should already complain about incomplete peers section (ie: missing localpeer). However, 8ba10fea69 explicitly skipped the peer cleanup upon missing srv association for local peers. This is wrong because later haproxy code always assumes that peer->srv is valid. Indeed, we got reports that the (invalid) config below would cause segmentation fault on all stable versions: global localpeer 01JM0TEPAREK01FQQ439DDZXD8 peers my-table peer 01JM0TEPAREK01FQQ439DDZXD8 listen dummy bind localhost:8080 To fix the issue, instead of by-passing some cleanup for the local peer, handle this case specifically by doing the regular peer cleanup and reset some fields set on the curpeers and curpeers proxy because of the invalid local peer (do as if the peer was not declared). It should still comply with requirements from #565. This patch should be backported to all stable versions.	2025-03-06 22:05:29 +01:00
Aurelien DARRAGON	2560ab892f	BUG/MINOR: cfgparse/peers: fix inconsistent check for missing peer server In the "peers" section parser, right after parse_server() is called, we used to check whether the curpeers->peers_fe->srv pointer was set or not to know if parse_server() successfully added a server to the peers proxy, server that we can then associate to the new peer. However the check is wrong, as curpeers->peers_fe->srv points to the last added server, if a server was successfully added before the failing one, we cannot detect that the last parse_server() didn't add a server. This is known to cause bug with bad "peer"/"server" statements. To fix the issue, we save a pointer on the last known curpeers->peers_fe->srv before parse_server() is called, and we then compare the save with the pointer after parse_server(), if the value didn't change, then parse_server() didn't add a server. This makes the check consistent in all situations. It should be backported to all stable versions.	2025-03-06 22:05:24 +01:00
Valentine Krasnobaeva	e900ef987e	BUG/MEDIUM: startup: return to initial cwd only after check_config_validity() In check_config_validity() we evaluate some sample fetch expressions (log-format, server rules, etc). These expressions may use external files like maps. If some particular 'default-path' was set in the global section before, it's no longer applied to resolve file paths in check_config_validity(). parse_cfg() at the end of config parsing switches back to the initial cwd. This fixes the issue #2886. This patch should be backported in all stable versions since 2.4.0, including 2.4.0.	2025-03-06 10:49:48 +01:00
Roberto Moreda	f98b5c4f59	MINOR: log: add dont-parse-log and assume-rfc6587-ntf options This commit introduces the dont-parse-log option to disable log message parsing, allowing raw log data to be forwarded without modification. Also, it adds the assume-rfc6587-ntf option to frame log messages using only non-transparent framing as per RFC 6587. This avoids misparsing in certain cases (mainly with non RFC compliant messages). The documentation is updated to include details on the new options and their intended use cases. This feature was discussed in GH #2856	2025-03-06 09:30:39 +01:00
Roberto Moreda	c25e6f5efa	MINOR: log: detach prepare from parse message This commit adds a new function `prepare_log_message` to initialize log message buffers and metadata. This function sets default values for log level and facility, ensuring a consistent starting state for log processing. It also prepares the buffer and metadata fields, simplifying subsequent log parsing and construction.	2025-03-06 09:30:31 +01:00
Roberto Moreda	834e9af877	MINOR: log: add options eval for log-forward This commit adds parsing of options in log-forward config sections and prepares the scenario to implement actual changes of behaviour. So far we only take into account proxy->options2, which is the bit container with more available positions.	2025-03-06 09:30:25 +01:00
Aurelien DARRAGON	0746f6bde0	MINOR: cfgparse-listen: add and use cfg_parse_listen_match_option() helper cfg_parse_listen_match_option() takes cfg_opt array as parameter, as well as current args, expected mode and cap bitfields. It is expected to be used under cfg_parse_listen() function or similar. Its goal is to remove code duplication around proxy->options and proxy->options2 handling, since the same checks are performed for the two. Also, this function could help to evaluate proxy options for mode-specific proxies such as log-forward section for instance: by giving the expected mode and capability as input, the function would only match compatible options.	2025-03-06 09:30:18 +01:00
Aurelien DARRAGON	c7abe7778e	MEDIUM: log: postpone the decision to send or not log with empty messages As reported by Nick Ramirez in GH #2891, it is currently not possible to use log-profile without a log-format set on the proxy. This is due to historical reason, because all log sending functions avoid trying to send a log with empty message. But now with log-profile which can override log-format, it is possible that some loggers may actually end up generating a valid log message that should be sent! Yet from the upper logging functions we don't know about that because loggers are evaluated in lower API functions. Thus, to avoid skipping potentially valid messages (thanks to log-profile overrides), in this patch we postpone the decision to send or not empty log messages in lower log API layer, ie: _process_send_log_final(), once the log-profile settings were evaluated for a given logger. A known side-effect of this change is that fe->log_count statistic may be increased even if no log message is sent because the message was empty and even the log-profile didn't help to produce a non empty log message. But since configurations lacking proxy log-format are not supposed to be used without log-profile (+ log steps combination) anyway it shouldn't be an issue.	2025-03-05 15:38:52 +01:00
Aurelien DARRAGON	9e9b110032	MINOR: log: use __send_log() with exact payload length Historically, __send_log() was called with terminating NULL byte after the message payload. But now that __send_log() supports being called without terminating NULL byte (thanks to size hint), and that __send_log() actually strips any \n or NULL byte, we don't need to bother with that anymore. So let's remove extra logic around __send_log() users where we added 1 extra byte for the terminating NULL byte. No change of behavior should be expected.	2025-03-05 15:38:46 +01:00
Aurelien DARRAGON	94a9b0f5de	BUG/MINOR: log: set proper smp size for balance log-hash result.data.u.str.size was set to size+1 to take into account terminating NULL byte as per the comment. But this is wrong because the caller is free to set size to just the right amount of bytes (without terminating NULL byte). In fact all smp API functions will not read past str.data so there is no risk about uninitialized reads, but this leaves an ambiguity for converters that may use all the smp size to perform transformations, and since we don't know about the "message" memory origin, we cannot assume that its size may be greater than size. So we max it out to size just to be safe. This bug was not known to cause any issue, it was spotted during code review. It should be backported in 2.9 with b30bd7a ("MEDIUM: log/balance: support for the "hash" lb algorithm")	2025-03-05 15:38:41 +01:00
Aurelien DARRAGON	ddf66132f4	CLEANUP: log: removing "log-balance" references This is a complementary patch to 0e1f389fe9 ("DOC: config: removing "log-balance" references"): we properly removed all log-balance references in the doc but there remained some in the code, let's fix that. It could be backported in 2.9 with 0e1f389fe9	2025-03-05 15:38:34 +01:00
Valentine Krasnobaeva	b46b81949f	MINOR: sample: allow custom date format in error-log-format Sample fetches %[accept_date] and %[request_date] with converters can be used in error-log-format string. But in most error cases they fetch nothing, as error logs are produced on SSL handshake issues or when invalid PROXY protocol header is used. Stream object is never allocated in such cases and smp_fetch_accept_date() just simply returns 0. There is a need to have a custom date format (ISO8601) also in the error logs, along with normal logs. When sess_build_logline_orig() builds log line it always copies the accept date to strm_logs structure. When stream is absent, accept date is copied from the session object. So, if the stream object wasn't allocated, let's use the session date info in smp_fetch_accept_date(). This allows then, in sample_process(), to apply to the fetched date different converters and formats. This fixes the issue #2884.	2025-03-04 18:57:29 +01:00
William Lallemand	cf71e9f5cf	MINOR: jws: conversion to NIST curves name OpenSSL version greater than 3.0 does not use the same API when manipulating EVP_PKEY structures, the EC_KEY API is deprecated and it's not possible anymore to get an EC_GROUP and simply call EC_GROUP_get_curve_name(). Instead, one must call EVP_PKEY_get_utf8_string_param with the OSSL_PKEY_PARAM_GROUP_NAME parameter, but this would result in a SECG curves name, instead of a NIST curves name in previous version. (ex: secp384r1 vs P-384) This patch adds 2 functions: - the first one looks for a curves name and converts it to an openssl NID. - the second one converts a NID to a NIST curves name The list only contains: P-256, P-384 and P-521 for now, it could be extended in the future with more curves.	2025-03-03 12:43:32 +01:00
William Lallemand	09457111bb	TESTS: jws: register a unittest for jwk Add a way to test the jwk converter in the unit test system $ make TARGET=linux-glibc USE_OPENSSL=1 CFLAGS="-DDEBUG_UNIT=1" $ ./haproxy -U jwk foobar.pem.rsa { "kty": "RSA", "n": "...", "e": "AQAB" } $ ./haproxy -U jwk foobar.pem.ecdsa { "kty": "EC", "crv": "P-384", "x": "...", "y": "..." } This is then tested by a shell script: $ HAPROXY_PROGRAM=${PWD}/haproxy tests/unit/jwk/test.sh + readlink -f tests/unit/jwk/test.sh + BASENAME=/haproxy/tests/unit/jwk/test.sh + dirname /haproxy/tests/unit/jwk/test.sh + TESTDIR=/haproxy/tests/unit/jwk + HAPROXY_PROGRAM=/haproxy/haproxy + mktemp + FILE1=/tmp/tmp.iEICxC5yNK + /haproxy/haproxy -U jwk /haproxy/tests/unit/jwk/ecdsa.key + diff -Naurp /haproxy/tests/unit/jwk/ecdsa.pub.jwk /tmp/tmp.iEICxC5yNK + rm /tmp/tmp.iEICxC5yNK + mktemp + FILE2=/tmp/tmp.EIrGZGaCDi + /haproxy/haproxy -U jwk /haproxy/tests/unit/jwk/rsa.key + diff -Naurp /haproxy/tests/unit/jwk/rsa.pub.jwk /tmp/tmp.EIrGZGaCDi + rm /tmp/tmp.EIrGZGaCDi $ echo $? 0	2025-03-03 12:43:32 +01:00
William Lallemand	a647839954	DEBUG: init: add a way to register functions for unit tests Doing unit tests with haproxy was always a bit difficult, some of the functions you want to test would depend on the buffer or trash buffer initialisation of HAProxy, so building a separate main() for them is quite hard. This patch adds a way to register a function that can be called with the "-U" parameter on the command line, will be executed just after step_init_1() and will exit the process with its return value as an exit code. When using the -U option, every keyword after this option is passed to the callback and could be used as a parameter, letting the capability to handle complex arguments if required by the test. HAProxy needs to be built with DEBUG_UNIT to activate this feature.	2025-03-03 12:43:32 +01:00
William Lallemand	4dc0ba233e	MINOR: jws: implement a JWK public key converter Implement a converter which takes an EVP_PKEY and converts it to a public JWK key. This is the first step of the JWS implementation. It supports both EC and RSA keys. Known to work with: - LibreSSL - AWS-LC - OpenSSL > 1.1.1	2025-03-03 12:43:32 +01:00
Willy Tarreau	730641f7ca	BUG/MINOR: server: check for either proxy-protocol v1 or v2 to send header As reported in issue #2882, using "no-send-proxy-v2" on a server line does not properly disable the use of proxy-protocol if it was enabled in a default-server directive in combination with other PP options. The reason for this is that the sending of a proxy header is determined by a test on srv->pp_opts without any distinction, so disabling PPv2 while leaving other options results in a PPv1 header to be sent. Let's fix this by explicitly testing for the presence of either send-proxy or send-proxy-v2 when deciding to send a proxy header. This can be backported to all versions. Thanks to Andre Sencioles (@asenci) for reporting the issue and testing the fix.	2025-03-03 04:05:47 +01:00
Amaury Denoyelle	d0f97040a3	BUG/MINOR: hq-interop: fix leak in case of rcv_buf early return HTTP/0.9 parser was recently updated to support truncated requests in rcv_buf operation. However, this caused a leak as input buffer is allocated early. In fact, the leak was already present in case of fatal errors. Fix this by first delaying buffer allocation, so that initial checks are performed before. Then, ensure that buffer is released in case of a later error. This is considered as minor, as HTTP/0.9 is reserved for experiment and QUIC interop usages. This should be backported up to 2.6.	2025-02-28 17:37:00 +01:00
Willy Tarreau	fd5d59967a	MINOR: h1: permit to relax the websocket checks for missing mandatory headers At least one user would like to allow a standards-violating client setup WebSocket connections through haproxy to a standards-violating server that accepts them. While this should of course never be done over the internet, it can make sense in the datacenter between application components which do not need to mask the data, so this typically falls into the situation of what the "accept-unsafe-violations-in-http-request" option and the "accept-unsafe-violations-in-http-response" option are made for. See GH #2876 for more context. This patch relaxes the test on the "Sec-Websocket-Key" header field in the request, and of the "Sec-Websocket-Accept" header in the response when these respective options are set. The doc was updated to reference this addition. This may be backported to 3.1 but preferably not further.	2025-02-28 17:31:20 +01:00
Christopher Faulet	0e08252294	BUG/MEDIUM: mux-fcgi: Try to fully fill demux buffer on receive if not empty Don't reserve space for the HTX overhead on receive if the demux buffer is not empty. Otherwise, the demux buffer may be erroneously reported as full and this may block records processing. Because of this bug, a ping-pong loop till timeout between data reception and demux process can be observed. This bug was introduced by the commit 5f927f603 ("BUG/MEDIUM: mux-fcgi: Properly handle read0 on partial records"). To fix the issue, if the demux buffer is not empty when we try to receive more data, all free space in the buffer can now be used. However, if the demux buffer is empty, we still try to keep it aligned with the HTX. This patch must be backported to 3.1.	2025-02-28 16:07:05 +01:00
Amaury Denoyelle	3cc095a011	MINOR: hq-interop: properly handle incomplete request Extends HTTP/0.9 layer to be able to deal with incomplete requests. Instead of an error, 0 is returned. Thus, instead of a stream closure, QUIC-MUX may retry rcv_buf operation later if more data is received, similarly to HTTP/3 layer. Note that HTTP/0.9 is only used for testing and interop purpose. As such, this limitation is not considered as a bug. It is probably not worth to backport it.	2025-02-27 17:34:06 +01:00
Amaury Denoyelle	0aa35289b3	CLEANUP: h3: fix documentation of h3_rcv_buf() Return value of h3_rcv_buf() is incorrectly documented. Indeed, it may return a positive value to indicate that input bytes were converted into HTX. This is especially important, as caller uses this value to consume the reported data amount in QCS Rx buffer. This should be backported up to 2.6. Note that on 2.8, h3_rcv_buf() was named h3_decode_qcs().	2025-02-27 17:31:40 +01:00
Amaury Denoyelle	f6648d478b	BUG/MINOR: h3: do not report transfer as aborted on preemptive response HTTP/3 specification allows a server to emit the entire response even if only a partial request was received. In particular, this happens when request STREAM FIN is delayed and transmitted in an empty payload frame. In this case, qcc_abort_stream_read() was used by HTTP/3 layer to emit a STOP_SENDING. Remaining received data were not transmitted to the stream layer as they were simply discarded. However, this prevents FIN transmission to the stream layer. This causes the transfer to be considered as prematurely closed, resulting in a cL-- log line status. This is misleading to users which could interpret it as if the response was not sent. To fix this, disable STOP_SENDING emission on full preemptive response emission. Rx channel is kept opened until the client closes it with either a FIN or a RESET_STREAM. This ensures that the FIN signal can be relayed to the stream layer, which allows the transfer to be reported as completed. This should be backported up to 2.9.	2025-02-27 17:23:24 +01:00
Dragan Dosen	0ae7a5d672	BUG/MINOR: server: fix the "server-template" prefix memory leak The srv->tmpl_info.prefix was not freed in srv_free_params(). This could be backported to all stable versions.	2025-02-27 04:21:01 +01:00
Dragan Dosen	6838fe43a3	BUG/MEDIUM: server: properly initialize PROXY v2 TLVs The PROXY v2 TLVs were not properly initialized when defined with "set-proxy-v2-tlv-fmt" keyword, which could have caused a crash when validating the configuration or malfunction (e.g. when used in combination with "server-template" and/or "default-server"). The issue was introduced with commit 6f4bfed3a ("MINOR: server: Add parser support for set-proxy-v2-tlv-fmt"). This should be backported up to 2.9.	2025-02-27 04:20:45 +01:00
Olivier Houchard	706b008429	MEDIUM: servers: Add strict-maxconn. Maxconn is a bit of a misnomer when it comes to servers, as it doesn't control the maximum number of connections we establish to a server, but the maximum number of simultaneous requests. So add "strict-maxconn", that will make it so we will never establish more connections than maxconn. It extends the meaning of the "restricted" setting of tune.takeover-other-tg-connections, as it will also attempt to get idle connections from other thread groups if strict-maxconn is set.	2025-02-26 13:00:18 +01:00
Olivier Houchard	8de8ed4f48	MEDIUM: connections: Allow taking over connections from other tgroups. Allow haproxy to take over idle connections from other thread groups than our own. To control that, add a new tunable, tune.takeover-other-tg-connections. It can have 3 values, "none", where we won't attempt to get connections from the other thread group (the default), "restricted", where we only will try to get idle connections from other thread groups when we're using reverse HTTP, and "full", where we always try to get connections from other thread groups. Unless there is a special need, it is advised to use "none" (or restricted if we're using reverse HTTP) as using connections from other thread groups may have a performance impact.	2025-02-26 13:00:18 +01:00
Olivier Houchard	d31b1650ae	MEDIUM: pollers: Drop fd events after a takeover to another tgid. In pollers that support it, provide the generation number in addition to the fd, and, when an event happened, if the generation number is the same, but the tgid changed, then assume the fd was taken over by a thread from another thread group, and just delete the event from the current thread's poller, as we no longer want to hear about it.	2025-02-26 13:00:18 +01:00
Olivier Houchard	c36aae2af1	MINOR: pollers: Add a fixup_tgid_takeover() method. Add a fixup_tgid_takeover() method to pollers for which it makes sense (epoll, kqueue and evport). That method can be called after a takeover of a fd from a different thread group, to make sure the poller's internal structure reflects the new state.	2025-02-26 13:00:18 +01:00
Olivier Houchard	752c5cba5d	MEDIUM: epoll: Make sure we can add a new event Check that the call to epoll_ctl() succeeds, and if it does not, if we're adding a new event and it fails with EEXIST, then delete and re-add the event. There are a few cases where we may already have events for a fd. If epoll_ctl() fails for any reason, use BUG_ON to make sure we immediately crash, as this should not happen.	2025-02-26 13:00:18 +01:00
Willy Tarreau	a826250659	OPTIM: connection: don't try to kill other threads' connection when !shared Users may have good reasons for using "tune.idle-pool.shared off", one of them being the cost of moving cache lines between cores, or the kernel- side locking associated with moving FDs. For this reason, when getting close to the file descriptors limits, we must not try to kill adjacent threads' FDs when the sharing of pools is disabled. This is extremely expensive and kills the performance. We must limit ourselves to our local FDs only. In such cases, it's up to the users to configure a large enough maxconn for their usages. Before this patch, perf top reported 9% CPU usage in connect_server() on the trylock used to kill connections when running at 4800 conns for a global maxconn of 6400 on a 128-thread server. Now it doesn't spend its time there anymore, and performance has increased by 12%. Note, it was verified that disabling the locks in such a case has no effect at all, so better keep them and stay safe.	2025-02-25 09:23:46 +01:00
Willy Tarreau	2e0bac90da	BUG/MEDIUM: stream: don't use localtime in dumps from a signal handler In issue #2861, Jarosław Rzeszótko reported another issue with "show threads", this time in relation with the conversion of a stream's accept date to local time. Indeed, if the libc was interrupted in this same function, it could have been interrupted with a lock held, then it's no longer possible to dump the date, and we face a deadlock. This is easy to reproduce with logging enabled. Let's detect we come from a signal handler and do not try to resolve the time to localtime in this case.	2025-02-24 13:40:42 +01:00
Willy Tarreau	fb7874c286	MINOR: tinfo: split the signal handler report flags into 3 While signals are not recursive, one signal (e.g. wdt) may interrupt another one (e.g. debug). The problem this causes is that when leaving the inner handler, it removes the outer's flag, hence the protection that comes with it. Let's just have 3 distinct flags for regular signals, debug signal and watchdog signal. We add a 4th definition which is an aggregate of the 3 to ease testing.	2025-02-24 13:37:52 +01:00
Willy Tarreau	bbf824933f	BUG/MINOR: h2: always trim leading and trailing LWS in header values Annika Wickert reported some occasional disconnections between haproxy and varnish when communicating over HTTP/2, with varnish complaining about protocol errors while captures looked apparently normal. Nils Goroll managed to reproduce this on varnish by injecting the capture of the outgoing haproxy traffic and noticed that haproxy was forwarding a header value containing a trailing space, which is now explicitly forbidden since RFC9113. It turns out that the only way for such a header to pass through haproxy is to arrive in h2 and not be edited, in which case it will arrive in HTX with its undesired spaces. Since the code dealing with HTX headers always trims spaces around them, these are not observable in dumps, but only when started in debug mode (-d). Conversions to/from h1 also drop the spaces. With this patch we trim LWS both on input and on output. This way we always present clean headers in the whole stack, and even if some are manually crafted by the configuration or Lua, they will be trimmed on the output. This must be backported to all stable versions. Thanks to Annika for the helpful capture and Nils for the help with the analysis on the varnish side!	2025-02-24 09:39:57 +01:00
Vincent Dechenaux	9011b3621b	MINOR: compression: Introduce minimum size This is the introduction of "minsize-req" and "minsize-res". These two options allow you to set the minimum payload size required for compression to be applied. This helps save CPU on both server and client sides when the payload does not need to be compressed.	2025-02-22 11:32:40 +01:00
Willy Tarreau	e7510d6230	CLEANUP: task: move the barrier after clearing th_ctx->current There's a barrier after releasing the current task in the scheduler. However it's improperly placed, it's done after pool_free() while in fact it must be done immediately after resetting the current pointer. Indeed, the purpose is to make sure that nobody sees the task as valid when it's in the process of being released. This is something that could theoretically happen if interrupted by a signal in the inlined code of pool_free() if the compiler decided to postpone the write to ->current. In practice since nothing fancy is done in the inlined part of the function, there's currently no risk of reordering. But it could happen if the underlying __pool_free() were to be inlined for example, and in this case we could possibly observe th_ctx->current pointing to something currently being destroyed. With the barrier between the two, there's no risk anymore.	2025-02-21 18:31:46 +01:00
Willy Tarreau	eb41d768f9	MINOR: tools: use only opportunistic symbols resolution As seen in issue #2861, dladdr_and_size() can be quite expensive and will often hold a mutex in the underlying library. It becomes a real problem when issuing lots of "show threads" or wdt warnings in parallel because threads will queue up waiting for each other to finish, adding to their existing latency that possibly caused the warning in the first place. Here we're taking a different approach. If the thread is not isolated and not panicking, it's doing unimportant stuff like showing threads or warnings. In this case we try to grab a lock, and if we fail because another thread is already there, we just pretend we cannot resolve the symbol. This is not critical because then we fall back to the already used case which consists in writing "main+<offset>". In practice this will almost never happen except in bad situations which could have otherwise degenerated.	2025-02-21 18:26:29 +01:00
Willy Tarreau	3c22fa315b	BUG/MEDIUM: stream: use non-blocking freq_ctr calls from the stream dumper The stream dump function is called from signal handlers (warning, show threads, panic). It makes use of read_freq_ctr() which might possibly block if it tries to access a locked freq_ctr in the process of being updated, e.g. by the current thread. Here we're relying on the non-blocking API instead. It may return incorrect values (typically smaller ones after resetting the curr counter) but at least it will not block. This needs to be backported to stable versions along with the previous commit below: MINOR: freq_ctr: provide non-blocking read functions At least 3.1 is concerned as the warnings tend to increase the risk of this situation appearing.	2025-02-21 18:26:29 +01:00
Willy Tarreau	29e246a84c	MINOR: freq_ctr: provide non-blocking read functions Some code called by the debug handlers in the context of a signal handler accesses to some freq_ctr and occasionally ends up on a locked one from the same thread that is dumping it. Let's introduce a non-blocking version that at least allows to return even if the value is in the process of being updated, it's less problematic than hanging.	2025-02-21 18:26:29 +01:00
Willy Tarreau	84d4c948fc	BUG/MEDIUM: stream: never allocate connection addresses from signal handler In __strm_dump_to_buffer(), we call conn_get_src()/conn_get_dst() to try to retrieve the connection's IP addresses. But this function may be called from a signal handler to dump a currently running stream, and if the addresses were not allocated yet, a pool_alloc() will be performed while we might possibly already be running pools code, resulting in pool list corruption. Let's just make sure we don't call these sensitive functions there when called from a signal handler. This must be backported at least to 3.1 and ideally all other versions, along with this previous commit: MINOR: tinfo: add a new thread flag to indicate a call from a sig handler	2025-02-21 17:41:38 +01:00
Willy Tarreau	ddd173355c	MINOR: tinfo: add a new thread flag to indicate a call from a sig handler Signal handlers must absolutely not change anything, but some long and complex call chains may look innocuous at first glance, yet result in some subtle write accesses (e.g. pools) that can conflict with a running thread being interrupted. Let's add a new thread flag TH_FL_IN_SIG_HANDLER that is only set when entering a signal handler and cleared when leaving them. Note, we're speaking about real signal handlers (synchronous ones), not deferred ones. This will allow some sensitive call places to act differently when detecting such a condition, and possibly even to place a few new BUG_ON().	2025-02-21 17:41:38 +01:00
Willy Tarreau	a56dfbdcb4	BUG/MINOR: mux-h1: always make sure h1s->sd exists in h1_dump_h1s_info() This function may be called from a signal handler during a warning, a panic or a show thread. We need to be more cautious about what may or may not be dereferenced since an h1s is not necessarily fully initialized. Loops of "show threads" sometimes manage to crash when dereferencing a null h1s->sd, so let's guard it and add a comment reminding about the unusual call place. This can be backported to the relevant versions.	2025-02-21 17:41:38 +01:00
Willy Tarreau	9d5bd47634	BUG/MINOR: stream: do not call co_data() from __strm_dump_to_buffer() co_data() was instrumented to detect cases where c->output > data and emits a warning if that's not correct. The problem is that it happens quite a bit during "show threads" if it interrupts traffic anywhere, and that in some environments building with -DDEBUG_STRICT_ACTION=3, it will kill the process. Let's just open-code the channel functions that make access to co_data(), there are not that many and the operations remain very simple. This can be backported to 3.1. It didn't trigger in earlier versions because they didn't have this CHECK_IF_HOT() test.	2025-02-21 17:18:00 +01:00
Aurelien DARRAGON	97a19517ff	MINOR: clock: always use atomic ops for global_now_ms global_now_ms is shared between threads so we must give hint to the compiler that read/writes operations should be performed atomically. Everywhere global_now_ms was used, atomic ops were used, except in clock_update_global_date() where a read was performed without using atomic op. In practise it is not an issue because on most systems such reads should be atomic already, but to prevent any confusion or potential bug on exotic systems, let's use an explicit _HA_ATOMIC_LOAD there. This may be backported up to 2.8	2025-02-21 11:22:35 +01:00
Aurelien DARRAGON	9561b9fb69	BUG/MINOR: sink: add tempo between 2 connection attempts for sft servers When the connection for sink_forward_{oc}_applet fails or a previous one is destroyed, the sft->appctx is instantly released. However process_sink_forward_task(), which may run at any time, iterates over all known sfts and tries to create sessions for orphan ones. It means that instantly after sft->appctx is destroyed, a new one will be created, thus a new connection attempt will be made. It can be an issue with tcp log-servers or sink servers, because if the server is unavailable, process_sink_forward() will keep looping without any temporisation until the applet survives (ie: connection succeeds), which results in unexpected CPU usage on the threads responsible for that task. Instead, we add a tempo logic so that a delay of 1 second is applied between two retries. Of course the initial attempt is not delayed. This could be backported to all stable versions.	2025-02-21 11:22:35 +01:00
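
The check described in commit 2560ab892f (save the list head before the call, compare it afterwards) can be illustrated with a minimal C sketch. This is not HAProxy code: `struct srv`, `struct px`, `try_add_server()` and `added_server()` are hypothetical, simplified stand-ins chosen for illustration, and the only assumption carried over from the commit message is that the proxy's `srv` pointer designates the most recently added server.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT HAProxy's real structures. */
struct srv {
    struct srv *next;
};

struct px {
    struct srv *srv;   /* head of the list: the most recently added server */
};

/* Hypothetical parser stand-in: links new_srv at the head of the list
 * when ok is non-zero, and leaves the list untouched otherwise. */
static void try_add_server(struct px *px, struct srv *new_srv, int ok)
{
    if (!ok)
        return;
    new_srv->next = px->srv;
    px->srv = new_srv;
}

/* Returns 1 if this call added a server, 0 otherwise. Testing only
 * "px->srv != NULL" would wrongly report success whenever an earlier
 * call had already added a server, so the head is saved beforehand
 * and compared after the call. */
static int added_server(struct px *px, struct srv *new_srv, int ok)
{
    struct srv *prev = px->srv;   /* last known head before the call */

    try_add_server(px, new_srv, ok);
    return px->srv != prev;
}
```

With one server already in the list, a failing call leaves the head unchanged, so `added_server()` returns 0 even though the head pointer is non-NULL.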


20122 Commits