haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-10-27 06:31:23 +01:00

Author	SHA1	Message	Date
Aurelien DARRAGON	5c299dee5a	MEDIUM: stats: consider that shared stats pointers may be NULL This patch looks huge, but it has a very simple goal: protect all accesses to shared stats pointers (either reads or writes), because we now consider that these pointers may be NULL. The reason behind this is that, despite all precautions taken to ensure the pointers shouldn't be NULL when not expected, there are still corner cases (e.g. frontend stats used on a backend which has no FE cap and vice versa) where we could try to access a memory area which is not allocated. Willy stumbled on such cases while playing with the rings servers upon connection error, which eventually led to process crashes (since 3.3 when shared stats were implemented). Also, we may decide later that shared stats are optional and should be disabled on the proxy to save memory and CPU, and this patch is a step further towards that goal. So in essence, this patch ensures shared stats pointers are always initialized (including NULL), and adds necessary guards before shared stats pointers are de-referenced. Since we already had some checks for backends and listeners stats, and the pointer address retrieval should stay in cpu cache, let's hope that this patch doesn't impact stats performance much.	2025-09-18 16:49:51 +02:00
Amaury Denoyelle	0678d0a69b	MINOR: check: reject invalid check config on a QUIC server QUIC is now supported on the backend side. The previous commit ensures that simple checks can be activated on QUIC servers without any issue. The current patch ensures that check server settings remain compatible with a QUIC server. Thus, configuration is now invalid if check specifies an explicit MUX proto other than QUIC, disables SSL or tries to use PROXY protocol.	2025-09-09 16:55:09 +02:00
Amaury Denoyelle	6d3c3c7871	BUG/MINOR: check: ensure check-reuse is compatible with SSL SSL may be activated implicitly if a server relies on SSL, even without check-ssl keyword. This is performed by init_srv_check() function. The main operation is to change xprt layer for check to SSL. Prior to this patch, <use_ssl> check member was also set, despite not being strictly necessary. This has a negative side-effect of rendering check-reuse-pool ineffective. Indeed, reuse on check is only performed if no specific check configuration has been specified (see tcpcheck_use_nondefault_connect()). This patch fixes check reuse with SSL: <use_ssl> is not set in case SSL is inherited implicitly from server configuration. Thus, <use_ssl> is now only set if an explicit check-ssl keyword is set, which disables connection reuse for check. This must be backported up to 3.2.	2025-09-03 16:54:48 +02:00
Christopher Faulet	f8b7299ee7	BUG/MINOR: server: Duplicate healthcheck's sni inherited from default server It is not really an issue, but the "check-sni" value inherited from a default server is not duplicated while the parameter value is duplicated during the parsing. So here there is a small leak if several "check-sni" parameters are used on the same server line. The previous value is never released. But to fix this issue, the value inherited from the default server must also be duplicated. At the end it is safer this way and consistent with the parsing of the "sni" parameter. It is harmless so there is no reason to backport this patch.	2025-09-01 15:45:05 +02:00
Christopher Faulet	f7a04b428a	BUG/MEDIUM: server: Duplicate healthcheck's alpn inherited from default server When "check-alpn" parameter is inherited from the default server, the value is not duplicated, the pointer of the default server is used. However, when this parameter is overridden, the old value is released. So the "check-alpn" value of the default server is released. So it is possible to have a UAF if another server inherits from the same default server. To fix the issue, the "check-alpn" parameter must be handled the same way the "alpn" is. The default value is duplicated. So it could be safely released if it is forced on the server line. This patch should fix the issue #3096. It must be backported to all stable versions.	2025-09-01 15:45:05 +02:00
Aurelien DARRAGON	75e480d107	MEDIUM: stats: avoid 1 indirection by storing the shared stats directly in counters struct Between 3.2 and 3.3-dev we observed a noticeable performance regression due to stats handling. After bisecting, Willy found out that recent work to split stats computing across multiple thread groups (stats sharding) was responsible for that performance regression. We're looking at roughly 20% performance loss. More precisely, it is the added indirections, multiplied by the number of statistics that are updated for each request, which in the end causes a significant amount of time being spent resolving pointers. We noticed that the fe_counters_shared and be_counters_shared structures which are currently allocated in dedicated memory since a0dcab5c ("MAJOR: counters: add shared counters base infrastructure") are no longer huge since 16eb0fab31 ("MAJOR: counters: dispatch counters over thread groups") because they now essentially hold flags plus the per-thread group id pointer mapping, not the counters themselves. As such we decided to try merging fe_counters_shared and be_counters_shared in their parent structures. The cost is slight memory overhead for the parent structure, but it allows getting rid of one pointer indirection. This patch alone yields visible performance gains and almost restores 3.2 stats performance. counters_fe_shared_get() was renamed to counters_fe_shared_prepare() and now returns either failure or success instead of a pointer because we don't need to retrieve a shared pointer anymore, the function takes care of initializing the existing pointer.	2025-07-25 16:46:10 +02:00
Aurelien DARRAGON	01dfe17acf	MEDIUM: server: add and use a separate last_change variable for internal use last_change server metric is used for 2 separate purposes. First it is used to report last server state change date for stats and other related metrics. But it is also used internally, including in sensitive paths, such as lb related stuff to take decisions or perform computations (e.g. in srv_dynamic_maxconn()). Due to last_change counter now being split over thread groups since 16eb0fa ("MAJOR: counters: dispatch counters over thread groups"), reading the aggregated value has a cost, and we cannot afford to consult last_change value from srv_dynamic_maxconn() anymore. Moreover, since the value is used to take decisions for the current process we don't want the variable to be updated by another process behind our back. To prevent performance regression and sharing issues, let's instead add a separate srv->last_change value, which is not updated atomically (given how rare the updates are), and only serves for places where the use of the aggregated last_change counter/stats (split over thread groups) is too costly.	2025-06-30 16:26:25 +02:00
Aurelien DARRAGON	5694a98744	MAJOR: mailers: remove native mailers support As mentioned in 2.8 announce on the mailing list [1] and on the wiki [2] native mailers were deprecated and planned for removal in 3.3. Now is the time to drop the legacy code for native mailers which is based on a tcpcheck "hack" and cannot be maintained. Lua mailers should be used as a drop in replacement. Indeed, "mailers" and associated config directives are preserved because mailers config is exposed to Lua, which helps smoothing the transition from native mailers to Lua based ones. As a reminder, to keep mailers configuration working as before without making changes to the config file, simply add the line below to the global section: lua-load examples/lua/mailers.lua mailers.lua script (provided in the git repository, adjust path as needed) may be customized by users familiar with Lua, by default it emulates the behavior of the native (now removed) mailers. [1]: https://www.mail-archive.com/haproxy@formilux.org/msg43600.html [2]: https://github.com/haproxy/wiki/wiki/Breaking-changes	2025-06-24 10:55:58 +02:00
Christopher Faulet	54d74259e9	BUG/MEDIUM: check: Set SOCKERR by default when a connection error is reported When a connection error is reported, we try to collect as much information as possible on the connection status and the server status is adjusted accordingly. However, the function does nothing if there is no connection error and if the healthcheck is not expired yet. It is a problem when an internal error occurred. It may happen at many places and it is hard to be sure an error is reported on the connection. And in fact, it is already a problem when the multiplexer allocation fails. In that case, the healthcheck is not interrupted as it should be. Concretely, it could only happen when a connection is established. It is hard to predict the effects of this bug. It may be unimportant. But it could probably lead to a crash. To avoid any issue, a SOCKERR status is now set by default when a connection error is reported. There is no reason to report a connection error for nothing. So a healthcheck failure must be reported. There is no "internal error" status. So a socket error is reported. This patch must be backported to all stable versions.	2025-06-16 17:47:35 +02:00
Aurelien DARRAGON	16eb0fab31	MAJOR: counters: dispatch counters over thread groups Most fe and be counters are good candidates for being shared between processes. They are now grouped inside "shared" struct sub member under be_counters and fe_counters. Now that they are properly identified, they would greatly benefit from being shared over thread groups to reduce the cost of atomic operations when updating them. For this, we take the current tgid into account so each thread group only updates its own counters. For this to work, it is mandatory that the "shared" member from {fe,be}_counters is initialized AFTER global.nbtgroups is known, because each shared counter causes the stat to be allocated global.nbtgroups times. When updating a counter without concurrency, the first counter from the array may be updated. To consult the shared counters (which requires aggregation of per-tgid individual counters), some helper functions were added to counter.h to ease code maintenance and avoid computing errors.	2025-06-05 09:59:38 +02:00
Aurelien DARRAGON	a0dcab5c45	MAJOR: counters: add shared counters base infrastructure Shareable counters are now tagged as shared counters and are dynamically allocated in separate memory area as a prerequisite for being stored in shared memory area. For now, GUID and threads groups are not taken into account, this is only a first step. Also, we ensure all counters are now manipulated using atomic operations, namely, "last_change" counter is now read from and written to using atomic ops. Despite the numerous changes caused by the counters being moved away from counters struct, no change of behavior should be expected.	2025-06-05 09:58:58 +02:00
Christopher Faulet	6786b05297	DEBUG: check: Add the healthcheck's expiration date in the trace messages It could help to diagnose some issues about timeout processing. So let's add it!	2025-06-03 15:06:12 +02:00
Christopher Faulet	7c788f0984	BUG/MEDIUM: check: Requeue healthchecks on I/O events to handle check timeout When a healthcheck is processed, once the first wakeup to start the check has passed, and as long as the expiration timer is not reached, only I/O events are able to wake it up. It is an issue when there is a check timeout defined. Especially if the connect timeout is high and the check timeout is low. In that case, the healthcheck's task is never requeued to handle any timeout update. When the connection is established, the check timeout is set to replace the connect timeout. It is thus possible to report a success while a timeout should be reported. So, now, when an I/O event is handled, the healthcheck is requeued, except if a success or an abort is reported. Thanks to Thierry Fournier for the report and the reproducer. This patch must be backported to all stable versions.	2025-06-03 15:03:30 +02:00
Olivier Houchard	81dc3e67cf	MEDIUM: checks: Make sure we return the tasklet from srv_chk_io_cb In srv_chk_io_cb, return the tasklet to tell the scheduler the tasklet is still alive, it is not yet needed, but will be soon.	2025-04-25 16:14:26 +02:00
Aurelien DARRAGON	8a944d0e46	MINOR: checks: deinit checks_fe upon deinit This is just to make valgrind and friends happy, leverage deinit_proxy() for checks_fe proxy upon deinit to ensure proper cleanup. We check the presence of proxy->id to know if it was initialized because we cannot rely on a pointer for that.	2025-04-10 22:10:31 +02:00
Aurelien DARRAGON	4194f756de	MEDIUM: tree-wide: avoid manually initializing proxies In this patch we try to use the proxy API init functions as much as possible to avoid code redundancy and prevent proxy initialization errors. As such, we prefer using alloc_new_proxy() and setup_new_proxy() instead of manually allocating the proxy pointer and performing the base init ourselves.	2025-04-10 22:10:31 +02:00
Aurelien DARRAGON	5087048b6d	MINOR: checks: mark CHECKS-FE dummy frontend as internal CHECKS-FE frontend is a dummy frontend used to create checks sessions as such, it is internal and should not be exposed to the user. Better mark it as internal using PR_CAP_INT capability to prevent proxy API from ever exposing it.	2025-04-10 22:10:31 +02:00
Amaury Denoyelle	f0f1816f1a	MINOR: check: implement check-pool-conn-name srv keyword This commit is a direct follow-up of the previous one. It defines a new server keyword check-pool-conn-name. It is used as the default value for the name parameter of idle connection hash generation. Its behavior is similar to server keyword pool-conn-name, but reserved for checks reuse. If check-pool-conn-name is set, it is used in priority to match a connection for reuse. If unset, a fallback is performed on check-sni.	2025-04-03 17:19:07 +02:00
Amaury Denoyelle	e34f748e3a	MINOR: check define check-reuse-pool server keyword Define a new server keyword check-reuse-pool, and its counterpart with a "no" prefix. For the moment, only parsing is implemented. The real behavior adjustment will be implemented in the next patch.	2025-04-02 14:57:40 +02:00
Olivier Houchard	583303c48b	MINOR: proxies/servers: Calculate queueslength and use it. For both proxies and servers, properly calculates queueslength, which is the total number of elements in all queues (as they currently are only using one queue, it is equivalent to the number of elements of that queue), and use it instead of the queue's length.	2025-01-28 12:49:41 +01:00
Ilia Shipitsin	495f1f9741	BUG/MINOR: checks: handle a possible strdup() failure This defect was found by the coccinelle script "unchecked-strdup.cocci". It can be backported to all supported branches.	2024-12-25 12:40:56 +01:00
Willy Tarreau	9c6ccb8dbb	MEDIUM: config: warn on unitless timeouts < 100 ms From time to time we face a configuration with very small timeouts which look accidental because there could be expectations that they're expressed in seconds and not milliseconds. This commit adds a check for non-nul unitless values smaller than 100 and emits a warning suggesting to append an explicit unit if that was the intent. Only the common timeouts, the server check intervals and the resolvers hold and timeout values were covered for now. All the code needs to be manually reviewed to verify if it supports emitting warnings. This may break some configs using "zero-warning", but greps in existing configs indicate that these are extremely rare and solely intentionally done during tests. At least even if a user leaves that after a test, it will be more obvious when reading 10ms that something's probably not correct.	2024-11-19 10:33:20 +01:00
Willy Tarreau	2f287f14f3	BUG/MEDIUM: checks: make sure to always apply offsets to now_ms in expiration Now_ms can be zero nowadays, so it's not suitable for direct assignment to t->expire, as there's a risk that the timer never wakes up once assigned (TICK_ETERNITY). Let's use tick_add(now_ms, 0) for an immediate wakeup instead. The impact here might be health checks suddenly stopping. This should be backported where it applies.	2024-11-15 15:39:00 +01:00
Aperence	a7b04e383a	MINOR: tools: extend str2sa_range to add an alt parameter Add a new parameter "alt" that will store whether this configuration uses an alternate protocol. This alt pointer will contain a value that can be transparently passed to protocol_lookup to obtain an appropriate protocol structure. This change is needed to allow for example the servers to know if they need to use an alternate protocol or not.	2024-08-30 18:53:49 +02:00
Christopher Faulet	1538c4aa82	MEDIUM: proxy/spoe: Add a SPOP mode The SPOE was significantly lightened. It is now possible to refactor it to use a dedicated multiplexer. The first step is to add a SPOP mode for proxies. The corresponding multiplexer mode is also added. For now, there is no SPOP multiplexer, so it is only declarative. But at the end, the SPOP multiplexer will be automatically selected for servers inside a SPOP backend. The related issue is #2502.	2024-07-12 15:27:04 +02:00
Willy Tarreau	f5566afec6	MEDIUM: dynbuf: generalize the use of b_dequeue() to detach buffer_wait Now thanks to this the bufq_map field is expected to remain accurate.	2024-05-10 17:18:13 +02:00
Willy Tarreau	a214197ce7	MINOR: dynbuf: use the b_queue()/b_requeue() functions everywhere The code places that were used to manipulate the buffer_wq manually now just call b_queue() or b_requeue(). This will simplify the multiple list management later.	2024-05-10 17:18:13 +02:00
Willy Tarreau	72d0dcda8e	MINOR: dynbuf: pass a criticality argument to b_alloc() The goal is to indicate how critical the allocation is, between the least one (growing an existing buffer ring) and the topmost one (boot time allocation for the life of the process). The 3 tcp-based muxes (h1, h2, fcgi) use a common allocation function to try to allocate otherwise subscribe. There's currently no distinction of direction nor part that tries to allocate, and this should be revisited to improve this situation, particularly when we consider that mux-h2 can reduce its Tx allocations if needed. For now, 4 main levels are planned, to translate how the data travels inside haproxy from a producer to a consumer: - MUX_RX: buffer used to receive data from the OS - SE_RX: buffer used to place a transformation of the RX data for a mux, or to produce a response for an applet - CHANNEL: the channel buffer for sync recv - MUX_TX: buffer used to transfer data from the channel to the outside, generally a mux but there can be a few specificities (e.g. http client's response buffer passed to the application, which also gets a transformation of the channel data). The other levels are a bit different in that they don't strictly need to allocate for the first two ones, or they're permanent for the last one (used by compression).	2024-05-10 17:18:13 +02:00
Amaury Denoyelle	634cc2a5d8	MINOR: counters: move last_change into counters struct last_change was a member present in both proxy and server struct. It is used as an age statistics to report the last update of the object. Move last_change into fe_counters/be_counters. This is necessary to be able to manipulate it through generic stat column and report it into stats-file. Note that there is a change for proxy structure with now 2 different last_change values, on frontend and backend side. Special care was taken to ensure that the value is initialized only on the proxy side. The other value is set to 0 unless a listen proxy is instantiated. For the moment, only backend counter is reported in stats. However, with now two distinct values, stats could be extended to report it on both side.	2024-05-02 10:55:25 +02:00
Christopher Faulet	1e38ac72ce	MEDIUM: stconn: Use one function to shut connection and applet endpoints se_shutdown() function is now used to perform a shutdown on a connection endpoint and an applet endpoint. The same function is used for both. sc_conn_shut() function was removed and appctx_shut() function was updated to only deal with the applet stuff.	2024-04-19 16:33:35 +02:00
Christopher Faulet	c96a873ba3	MEDIUM: stconn: Use only one SC function to shut connection endpoints The SC API to perform shutdowns on connection endpoints was unified to have only one function, sc_conn_shut(), with read/write shut modes passed explicitly. It means sc_conn_shutr() and sc_conn_shutw() were removed. The next step is to do the same at the mux level.	2024-04-19 16:25:06 +02:00
Ilya Shipitsin	80813cdd2a	CLEANUP: assorted typo fixes in the code and comments This is 37th iteration of typo fixes	2023-11-23 16:23:14 +01:00
Aurelien DARRAGON	12582eb8e5	MINOR: tools: make str2sa_range() directly return type hints str2sa_range() already allows the caller to provide <proto> in order to get a pointer on the protocol matching with the string input thanks to 5fc9328a ("MINOR: tools: make str2sa_range() directly return the protocol") However, as stated into the commit message, there is a trick: "we can fail to return a protocol in case the caller accepts an fqdn for use later. This is what servers do and in this case it is valid to return no protocol" In this case, we're unable to return protocol because the protocol lookup depends on both the [proto type + xprt type] and the [family type] to be known. While family type might not be directly resolved when fqdn is involved (because family type might be discovered using DNS queries), proto type and xprt type are already known. As such, the caller might be interested in knowing those address related hints even if the address family type is not yet resolved and thus the matching protocol cannot be looked up. Thus in this patch we add the optional net_addr_type (custom type) argument to str2sa_range to enable the caller to check the protocol type and transport type when the function succeeds.	2023-11-10 17:49:57 +01:00
Christopher Faulet	c72ab1cc6d	BUG/MINOR: tcpcheck: Report hexstring instead of binary one on check failure When an expect rule failed for a tcp-check, information about the expect rule is dumped in the report. For a check on a binary string, a hexstring is used in the configuration but the decoded string is dumped. It is a problem because it can contain special characters. And it is not really handy because there is no correspondence with the config. So, now, the hexstring is dumped in the report. This way, we are sure there are no special characters and it is easy to find it in the configuration. This patch should solve the issue #2326. It must be backported as far as 2.2.	2023-10-31 08:02:44 +01:00
Willy Tarreau	fca3fc0d90	BUILD: checks: shut up yet another stupid gcc warning gcc has always had hallucinations regarding value ranges, and this one is interesting, and affects branches 4.7 to 11.3 at least. When building without threads, the randomly picked new_tid that is reduced to a multiply by 1 shifted right 32 bits, hence a constant output of 0 shows this warning: src/check.c: In function 'process_chk_conn': src/check.c:1150:32: warning: array subscript [-1, 0] is outside array bounds of 'struct thread_ctx[1]' [-Warray-bounds] In file included from include/haproxy/thread.h:28, from include/haproxy/list.h:26, from include/haproxy/action.h:28, from src/check.c:31: or this one when trying to force the test to see that it cannot be zero(!): src/check.c: In function 'process_chk_conn': src/check.c:1150:54: warning: array subscript [0, 0] is outside array bounds of 'struct thread_ctx[1]' [-Warray-bounds] 1150 \| uint t2_act = _HA_ATOMIC_LOAD(&ha_thread_ctx[thr2].active_checks); \| ~~~~~~~~~~~~~^~~~~~ include/haproxy/atomic.h:66:40: note: in definition of macro 'HA_ATOMIC_LOAD' 66 \| #define HA_ATOMIC_LOAD(val) *(val) \| ^~~ src/check.c:1150:24: note: in expansion of macro '_HA_ATOMIC_LOAD' 1150 \| uint t2_act = _HA_ATOMIC_LOAD(&ha_thread_ctx[thr2].active_checks); \| ^~~~~~~~~~~~~~~ Let's just add an ALREADY_CHECKED() statement there, no other check seems to get rid of it. No backport is needed.	2023-09-04 19:38:51 +02:00
Willy Tarreau	b0031d9679	MINOR: checks: also consider the thread's queue for rebalancing Let's also check for other threads when the current one is queueing, let's not wait for the load to be high. Now this totally eliminates differences between threads.	2023-09-01 14:00:04 +02:00
Willy Tarreau	844a3bc25b	MEDIUM: checks: implement a queue in order to limit concurrent checks The progressive adoption of OpenSSL 3 and its abysmal handshake performance has started to reveal situations where it simply isn't possible anymore to successfully run health checks on many servers, because between the moment all the checks are started and the moment the handshake finally completes, the timeout has expired! This also has consequences on production traffic which gets significantly delayed as well, all that for lots of checks. While it's possible to increase the check delays, it doesn't solve everything as checks still take a huge amount of time to converge in such conditions. Here we take a different approach by permitting to enforce the maximum concurrent checks per thread limitation and implementing an ordered queue. Thanks to this, if a thread about to start a check has reached its limit, it will add the check at the end of a queue and it will be processed once another check is finished. This proves to be extremely efficient, with all checks completing in a reasonable amount of time and not being disturbed by the rest of the traffic from other checks. They're just cycling slower, but at the speed the machine can handle. One must understand however that if some complex checks perform multiple exchanges, they will take a check slot for all the required duration. This is why the limit is not enforced by default. Tests on SSL show that a limit of 5-50 checks per thread on local servers gives excellent results already, so that could be a good starting point.	2023-09-01 14:00:04 +02:00
Willy Tarreau	cfc0bceeb5	MEDIUM: checks: search more aggressively for another thread on overload When the current check is overloaded (more running checks than the configured limit), we'll try more aggressively to find another thread. Instead of just opportunistically looking for one half as loaded, now if the current thread has more than 1% more active checks than another one, or has more than a configured limit of concurrent running checks, it will search for a more suitable thread among 3 other random ones in order to migrate the check there. The number of migrations remains very low (~1%) and the checks load very fair across all threads (~1% as well). The new parameter is called tune.max-checks-per-thread.	2023-09-01 08:26:06 +02:00
Willy Tarreau	016e189ea3	MINOR: check: also consider the random other thread's active checks When checking if it's worth transferring a sleeping thread to another random thread, let's also check if that random other thread has less checks than the current one, which is another reason for transferring the load there. This commit adds a function "check_thread_cmp_load()" to compare two threads' loads in order to simplify the decision taking. The minimum active check count before starting to consider rebalancing the load was now raised from 2 to 3, because tests show that at 15k concurrent checks, at 2, 50% are evaluated for rebalancing and 30% are rebalanced, while at 3, this is cut in half.	2023-09-01 08:26:06 +02:00
Willy Tarreau	00de9e0804	MINOR: checks: maintain counters of active checks per thread Let's keep two check counters per thread: - one for "active" checks, i.e. checks that are no more sleeping and are assigned to the thread. These include sleeping and running checks ; - one for "running" checks, i.e. those which are currently executing on the thread. By doing so, we'll be able to spread the health checks load a bit better and refrain from sending too many at once per thread. The counters are atomic since a migration increments the target thread's active counter. These numbers are reported in "show activity", which allows to check per thread and globally how many checks are currently pending and running on the system. Ideally, we should only consider checks in the process of establishing a connection since that's really the expensive part (particularly with OpenSSL 3.0). But the inner layers are really not suitable to doing this. However knowing the number of active checks is already a good enough hint.	2023-09-01 08:26:06 +02:00
Willy Tarreau	3b7942a1c9	MINOR: check/activity: collect some per-thread check activity stats We now count the number of times a check was started on each thread and the number of times a check was adopted. This helps understand better what is observed regarding checks.	2023-09-01 08:26:06 +02:00
Willy Tarreau	e03d05c6ce	MINOR: check: remember when we migrate a check The goal here is to explicitly mark that a check was migrated so that we don't do it again. This will allow us to perform other actions on the target thread while still knowing that we don't want to be migrated again. The new READY bit combine with SLEEPING to form 4 possible states: SLP RDY State Description 0 0 - (reserved) 0 1 RUNNING Check is bound to current thread and running 1 0 SLEEPING Check is sleeping, not bound to a thread 1 1 MIGRATING Check is migrating to another thread Thus we set READY upon migration, and check for it before migrating, this is sufficient to prevent a second migration. To make things a bit clearer, the SLEEPING bit was switched with FASTINTER so that SLEEPING and READY are adjacent.	2023-09-01 08:26:06 +02:00
Willy Tarreau	3544c9f8a0	MINOR: checks: pin the check to its thread upon wakeup When a check leaves the sleeping state, we must pin it to the thread that is processing it. It's normally always the case after the first execution, but initial checks that start assigned to any thread (-1) could be assigned much later, causing problems with planned changes involving queuing. Thus better do it early, so that all threads start properly pinned.	2023-09-01 08:26:06 +02:00
Willy Tarreau	7163f95b43	MINOR: checks: start the checks in sleeping state The CHK_ST_SLEEPING state was introduced by commit d114f4a68 ("MEDIUM: checks: spread the checks load over random threads") to indicate that a check was not currently bound to a thread and that it could easily be migrated to any other thread. However it did not start the checks in this state, meaning that they were not redispatchable on startup. Sometimes under heavy load (e.g. when using SSL checks with OpenSSL 3.0) the cost of setting up new connections is so high that some threads may experience connection timeouts on startup. In this case it's better if they can transfer their excess load to other idle threads. By just marking the check as sleeping upon startup, we can do this and significantly reduce the number of failed initial checks.	2023-09-01 08:26:06 +02:00
Willy Tarreau	48442b8b15	BUG/MINOR: checks: do not queue/wake a bounced check A small issue was introduced with commit d114f4a68 ("MEDIUM: checks: spread the checks load over random threads"): when a check is bounced to another thread, its expiration time is set to TICK_ETERNITY. This makes it show as not expired upon first wakeup on the next thread, thus being detected as "woke up too early" and being instantly rescheduled. Only after this next wakeup will it be properly considered. Several approaches were attempted to fix this. The best one seems to consist in resetting t->expire and expired upon wakeup, and changing the !expired test for !tick_is_expired() so that we don't trigger on this case. This needs to be backported to 2.7.	2023-09-01 08:26:06 +02:00
Christopher Faulet	8bca3cc8c7	MEDIUM: checks: Stop scheduling healthchecks during stopping stage When the process is stopping, the health-checks are suspended. However the task is still periodically woken up for nothing. If there is a huge number of health-checks and if they are woken up in same time, it may lead to a noticeable CPU consumption for no reason. To avoid this extra CPU cost, we stop to schedule the health-check tasks when the proxy is disabled or stopped. This patch should partially solve the issue #2145.	2023-05-17 14:57:10 +02:00
Willy Tarreau	c7b9308f20	BUG/MINOR: clock: automatically adjust the internal clock with the boot time This is a better and more general solution to the problem described in this commit: BUG/MINOR: checks: postpone the startup of health checks by the boot time Now we're updating the now_offset that is used to compute now_ms at the few points where we update the ready date during boot. This ensures that now_ms while being stable during all the boot process will be correct and will start with the boot value right after the boot is finished. As such the patch above is rolled back (we don't want to count the boot time twice). This must not be backported because it relies on the more flexible clock architecture in 2.8.	2023-05-17 09:33:54 +02:00
Willy Tarreau	8e978a094d	BUG/MINOR: checks: postpone the startup of health checks by the boot time When health checks are started at boot, now_ms could be off by the boot time. In general it's not even noticeable, but with very large configs taking up to one or even a few seconds to start, this can result in a part of the servers' checks being scheduled slightly in the past. As such all of them will start grouped, partially defeating the purpose of the spread-checks setting. For example, this can cause a burst of connections for the network, or an excess of CPU usage during SSL handshakes, possibly even causing some timeouts to expire early. Here in order to compensate for this, we simply add the known boot time to the computed delay when scheduling the startup of checks. That's very simple and particularly efficient. For example, a config with 5k servers in 800 backends checked every 5 seconds, that was taking 3.8 seconds to start used to show this distribution of health checks previously despite the spread-checks 50: 3690 08:59:25 417 08:59:26 213 08:59:27 71 08:59:28 428 08:59:29 860 08:59:30 918 08:59:31 938 08:59:32 1124 08:59:33 904 08:59:34 647 08:59:35 890 08:59:36 973 08:59:37 856 08:59:38 893 08:59:39 154 08:59:40 Now with the fix it shows this: 470 08:59:59 929 09:00:00 896 09:00:01 937 09:00:02 854 09:00:03 827 09:00:04 906 09:00:05 863 09:00:06 913 09:00:07 873 09:00:08 162 09:00:09 This should be backported to all supported versions. It depends on this commit: MINOR: clock: measure the total boot time For 2.8 where the internal clock is now totally independent of the human one, a more generic fix will consist in simply updating now_ms to reflect the startup time.	2023-05-17 09:33:54 +02:00
Christopher Faulet	cb76030356	CLEANUP: check; Remove some useless assignments to NULL In process_chk_conn(), some assignments to NULL are useless and are reported by Coverity as unused value. While it is harmless, these assignments can be removed. This patch should fix the Coverity report #2158.	2023-05-17 09:28:23 +02:00
Willy Tarreau	b93758cec9	MINOR: checks: make sure spread-checks is used also at boot time This makes use of spread-checks also for the startup of the check tasks. This provides a smoother load on startup for uneven configurations which tend to enable only some servers. Below is the connection distribution per second of the SSL checks of a config with 5k servers spread over 800 backends, with a check inter of 5 seconds: - default: 682 08:00:50 826 08:00:51 773 08:00:52 1016 08:00:53 885 08:00:54 889 08:00:55 825 08:00:56 773 08:00:57 1016 08:00:58 884 08:00:59 888 08:01:00 491 08:01:01 - with spread-checks 50: 437 08:01:19 866 08:01:20 777 08:01:21 1023 08:01:22 1118 08:01:23 923 08:01:24 641 08:01:25 859 08:01:26 962 08:01:27 860 08:01:28 929 08:01:29 909 08:01:30 866 08:01:31 849 08:01:32 114 08:01:33 - with spread-checks 50 + this patch: 680 08:01:55 922 08:01:56 962 08:01:57 899 08:01:58 819 08:01:59 843 08:02:00 916 08:02:01 896 08:02:02 886 08:02:03 846 08:02:04 903 08:02:05 894 08:02:06 178 08:02:07 The load is much smoother from the start, this can help initial health checks succeed when many target the same overloaded server for example. This could be backported as it should make border-line configs more reliable across reloads.	2023-05-17 08:10:40 +02:00


195 Commits