haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-10-26 22:20:59 +01:00

Author	SHA1	Message	Date
Aurelien DARRAGON	40eb1dd135	BUG/MEDIUM: sink: fix unexpected double postinit of sink backend Willy experienced an unexpected behavior with the config below: global stats socket :1514 ring buf1 server srv1 127.0.0.1:1514 Indeed, haproxy would connect to the ring server twice since commit 23e5f18b ("MEDIUM: sink: change the sink mode type to PR_MODE_SYSLOG"), and one of the connections would report errors. The reason is that, despite the above commit saying no change of behavior is expected, with the sink forward_px proxy now being set with PR_MODE_SYSLOG, postcheck_log_backend() was being automatically executed in addition to the manual cfg_post_parse_ring() function for each "ring" section. The consequence is that sink_finalize() was called twice for a given "ring" section, which means the connection init would be triggered twice, which in turn resulted in the behavior described above, plus possible unexpected side-effects. To fix the issue, when we create the forward_px proxy, we now set the PR_CAP_INT capability on it to tell haproxy not to automatically manage the proxy (i.e. to skip the automatic log backend postinit), because we are about to manually manage the proxy from the sink API. No backport needed, this bug is specific to 3.3	2025-09-18 16:49:29 +02:00
Willy Tarreau	79ef362d9e	OPTIM: ring: avoid reloading the tail_ofs value before the CAS in ring_write() The load followed by the CAS seem to cause two bus cycles, one to retrieve the cache line in shared state and a second one to get exclusive ownership of it. Tests show that on x86 it's much better to just rely on the previous value and preset it to zero before entering the loop. We just mask the ring lock in case of failure so as to challenge it on next iteration and that's done. This little change brings 2.3% extra performance (11.34M msg/s) on a 64-core AMD.	2025-09-18 15:27:32 +02:00
Willy Tarreau	a727c6eaa5	OPTIM: ring: check the queue's owner using a CAS on x86 In the loop where the queue's leader tries to get the tail lock, we also need to check if another thread took ownership of the queue the current thread is currently working for. This is currently done using an atomic load. Tests show that on x86, using a CAS for this is much more efficient because it allows to keep the cache line in exclusive state for a few more cycles that permit the queue release call after the loop to be done without having to wait again. The measured gain is +5% for 128 threads on a 64-core AMD system (11.08M msg/s vs 10.56M). However, ARM loses about 1% on this, and we cannot afford that on machines without a fast CAS anyway, so the load is performed using a CAS only on x86_64. It might not be as efficient on low-end models but we don't care since they are not the ones dealing with high contention.	2025-09-18 15:08:12 +02:00
Willy Tarreau	d25099b359	OPTIM: ring: always relax in the ring lock and leader wait loop Tests have shown that AMD systems really need to use a cpu_relax() in these two loops. The performance improves from 10.03 to 10.56M messages per second (+5%) on a 128-thread system, without affecting intel nor ARM, so let's do this.	2025-09-18 15:07:56 +02:00
Willy Tarreau	eca1f90e16	CLEANUP: ring: rearrange the wait loop in ring_write() The loop is constructed in a complicated way with a single break statement in the middle and many continue statements everywhere, making it hard to better factor between variants. Let's first reorganize it so as to make it easier to escape when the ring tail lock is obtained. The sequence of instructions remains the same, it's only better organized.	2025-09-18 14:58:38 +02:00
Willy Tarreau	4431e3bd26	OPTIM: sink: reduce contention on sink_announce_dropped() perf top shows that sink_announce_dropped() consumes most of the CPU on a 128-thread x86 system. Digging further reveals that the atomic fetch_or() on the dropped field used to detect the presence of another thread is entirely responsible for this. Indeed, the compiler implements it using a CAS that loops without relaxing and makes all threads wait until they can synchronize on this one, only to discover later that another thread is there and they need to give up. Let's just replace this with a hand-crafted CAS loop that will detect before attempting the CAS if another thread is there. Doing so achieves the same goal without forcing threads to agree. With this simple change, the sustained request rate on h1 with all traces on bumped from 110k/s to 244k/s! This should be backported to stable releases where it's often needed to help debugging.	2025-09-18 08:38:34 +02:00
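The idea described in the commit above (look at the value first, and only attempt the CAS when no other thread is present) can be sketched with C11 atomics. This is a minimal illustration, not HAProxy's code: `BUSY_BIT`, `try_claim()` and `release_claim()` are hypothetical names, and the layout of the real `dropped` field is not reproduced here.

```c
#include <stdatomic.h>
#include <stdint.h>

#define BUSY_BIT 1u	/* hypothetical flag: "a thread is already announcing" */

/* Try to become the announcer. Returns 1 if this caller set BUSY_BIT,
 * 0 if another thread already holds it. */
int try_claim(_Atomic uint32_t *dropped)
{
	uint32_t old = atomic_load_explicit(dropped, memory_order_relaxed);

	do {
		if (old & BUSY_BIT)
			return 0;	/* someone else is there: give up without any CAS */
		/* on failure the CAS reloads <old>, so the flag is re-checked */
	} while (!atomic_compare_exchange_weak(dropped, &old, old | BUSY_BIT));
	return 1;
}

/* Drop the flag once the announcement is done. */
void release_claim(_Atomic uint32_t *dropped)
{
	atomic_fetch_and(dropped, ~BUSY_BIT);
}
```

Unlike an unconditional fetch-or, a caller that sees the flag already set returns immediately and never performs an atomic write.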
Willy Tarreau	361c227465	MINOR: trace: don't call strlen() on the function's name Currently there's a small mistake in the way the trace function and macros work. The calling function name is known as a constant until the macro and passed as-is to the __trace() function. That one needs to know its length and will call ist() on it, resulting in a real call to strlen() while that length was known before the call. Let's use an ist instead of a const char* for __trace() and __trace_enabled() so that we can now completely avoid calling strlen() during this operation. This has significantly reduced the importance of __trace_enabled() in perf top.	2025-09-18 08:31:57 +02:00
Willy Tarreau	06fa9f717f	MINOR: trace: don't call strlen() on the thread-id numeric encoding In __trace(), we're making an integer for the thread id but this one is passed through strlen() in the call to ist() because it's not a constant. We do know that it's exactly 3 chars long so we can manage this using ist2() and pass it the length instead in order to reduce the number of calls to strlen(). Also let's note that the thread number will no longer be numeric for thread numbers above 100.	2025-09-18 08:02:59 +02:00
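The two trace commits above both rely on carrying a length along with a pointer so that strlen() is not called again. A minimal sketch of that idea follows; `struct strview`, `sv_from_cstr()`, `sv_from_len()` and `encode_tid()` are hypothetical stand-ins, not the ist API, and the three-digit decimal encoding is chosen only for illustration (it is not the encoding used by __trace()).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical pointer+length string view. */
struct strview {
	const char *ptr;
	size_t len;
};

/* Length unknown to the caller: the string has to be scanned. */
struct strview sv_from_cstr(const char *s)
{
	struct strview v = { s, strlen(s) };
	return v;
}

/* Length already known to the caller: no scan is needed. */
struct strview sv_from_len(const char *s, size_t len)
{
	struct strview v = { s, len };
	return v;
}

/* Encode a thread number below 1000 as exactly three characters, so
 * the length can be passed as a constant. */
struct strview encode_tid(char buf[4], unsigned int tid)
{
	buf[0] = (char)('0' + tid / 100 % 10);
	buf[1] = (char)('0' + tid / 10 % 10);
	buf[2] = (char)('0' + tid % 10);
	buf[3] = '\0';
	return sv_from_len(buf, 3);
}
```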
Willy Tarreau	d53ad49ad1	BUG/MEDIUM: ring: invert the length check to avoid an int overflow Vincent Gramer reported in GH issue #3125 a case of crash on a BUG_ON() condition in the rings. What happens is that a message that is one byte less than the maximum ring size is emitted, and it passes all the checks, but once inflated by the extra +1 for the refcount, it no longer fits. But the check was made based on message size compared to space left, except that this space left can now be negative, which is a high positive for size_t, so the check remained valid and triggered a BUG_ON() later. Let's compute the size the other way around instead (i.e. current + needed) since we can't have rings as large as half of the memory space anyway, thus we have no risk of overflow on this one. This needs to be backported to all versions supporting multi-threaded rings (3.0 and above). Thanks to Vincent for the easy and working reproducer.	2025-09-17 18:45:13 +02:00
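The class of bug described in the commit above can be reproduced with a few lines of unsigned arithmetic. The sketch below is not the ring code; the function names and the way the one-byte overhead is accounted for are assumptions made for illustration.

```c
#include <stddef.h>

/* Fragile form: the room left is computed by subtraction. When
 * used + 1 exceeds size, the unsigned result wraps around to a huge
 * value and the comparison wrongly succeeds. */
int fits_by_subtraction(size_t size, size_t used, size_t msg_len)
{
	size_t room = size - used - 1;	/* may wrap around */

	return msg_len <= room;
}

/* Robust form: add up what is needed and compare it to the size. This
 * cannot overflow as long as sizes stay far below SIZE_MAX / 2. */
int fits_by_addition(size_t size, size_t used, size_t msg_len)
{
	return used + 1 + msg_len <= size;
}
```

With `size = 16` and `used = 16`, the first form accepts a 4-byte message because `room` wraps around, while the second form rejects it.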
Willy Tarreau	8c077c17eb	MINOR: server: add the "cc" keyword to set the TCP congestion controller It is possible on at least Linux and FreeBSD to set the congestion control algorithm to be used with outgoing connections, among the list of supported and permitted ones. Let's expose this setting with "cc". Unknown or forbidden algorithms will be ignored and the default one will continue to be used.	2025-09-17 17:19:33 +02:00
Willy Tarreau	4ed3cf295d	MINOR: listener: add the "cc" bind keyword to set the TCP congestion controller It is possible on at least Linux and FreeBSD to set the congestion control algorithm to be used with incoming connections, among the list of supported and permitted ones. Let's expose this setting with "cc". Permission issues might be reported (as warnings).	2025-09-17 17:03:42 +02:00
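At the socket level, Linux and FreeBSD select the congestion control algorithm with the TCP_CONGESTION socket option. The helper below is a sketch of that system call only; `set_tcp_cc()` is a hypothetical name, and how the two "cc" keywords above invoke it or report failures is not shown.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Ask the kernel to use congestion control algorithm <algo> (e.g.
 * "cubic" or "reno") on socket <fd>. Returns 0 on success, -1 on
 * error (unknown or forbidden algorithm, bad descriptor, or option
 * not available on this platform). */
int set_tcp_cc(int fd, const char *algo)
{
#ifdef TCP_CONGESTION
	return setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION,
	                  algo, (socklen_t)strlen(algo));
#else
	(void)fd;
	(void)algo;
	return -1;
#endif
}
```

A caller that wants the "ignore and keep the default" behavior described for the server keyword would simply not treat a -1 return as fatal.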
Willy Tarreau	fef4cfbd21	IMPORT: ebtree: only use __builtin_prefetch() when supported It looks like __builtin_prefetch() appeared in gcc-3.1 as there's no mention of it in 3.0's doc. Let's replace it with eb_prefetch() which maps to __builtin_prefetch() on supported compilers and falls back to the usual do{}while(0) on other ones. It was tested to properly build with tcc as well as gcc-2.95. This is ebtree commit 7ee6ede56a57a046cb552ed31302b93ff1a21b1a.	2025-09-17 14:30:32 +02:00
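A wrapper of the kind described in the commit above can be sketched as follows. The macro name `my_prefetch()` and the detection logic are assumptions for illustration; the actual eb_prefetch() definition in ebtree may differ.

```c
#include <stddef.h>

/* Detect __builtin_prefetch(): either the compiler advertises it via
 * __has_builtin, or it is a GCC-compatible compiler >= 3.1. */
#if defined(__has_builtin)
# if __has_builtin(__builtin_prefetch)
#  define HAVE_BUILTIN_PREFETCH 1
# endif
#endif
#if !defined(HAVE_BUILTIN_PREFETCH) && defined(__GNUC__) && \
    (__GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 1))
# define HAVE_BUILTIN_PREFETCH 1
#endif

#ifdef HAVE_BUILTIN_PREFETCH
# define my_prefetch(addr, rw) __builtin_prefetch((addr), (rw))
#else
# define my_prefetch(addr, rw) do { } while (0)
#endif

struct item {
	struct item *next;
	long value;
};

/* Walk a list while hinting the CPU about the next element. */
long sum_items(const struct item *it)
{
	long total = 0;

	while (it) {
		my_prefetch(it->next, 0);	/* hint only; no-op in the fallback */
		total += it->value;
		it = it->next;
	}
	return total;
}
```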
Willy Tarreau	6c54bf7295	IMPORT: eb32/eb64: place an unlikely() on the leaf test In the loop we can help the compiler build slightly more efficient code by placing an unlikely() around the leaf test. This shows a consistent 0.5% performance gain both on eb32 and eb64. This is ebtree commit 6c9cdbda496837bac1e0738c14e42faa0d1b92c4.	2025-09-17 14:30:32 +02:00
Willy Tarreau	6af17d491f	IMPORT: eb32/eb64: reorder the lookup loop for modern CPUs The current code calculates the next troot based on a calculation. This was efficient when the algorithm was developed many years ago on K6 and K7 CPUs running at low frequencies with few registers and limited branch prediction units but nowadays with ultra-deep pipelines and high latency memory that's no longer efficient, because the CPU needs to have completed multiple operations before knowing which address to start fetching from. It's sad because we only have two branches each time but the CPU cannot know it. In addition, the calculation is performed late in the loop, which does not help the address generation unit to start prefetching next data. Instead we should help the CPU by preloading data early from the node and calculating troot as soon as possible. The CPU will be able to postpone that processing until the dependencies are available and it really needs to dereference it. In addition we must absolutely avoid serializing instructions such as "(a >> b) & 1" because there's no way for the compiler to parallelize that code nor for the CPU to pre-process some early data. What this patch does is relatively simple: - we try to prefetch the next two branches as soon as the node is known, which will help dereference the selected node in the next iteration; it was shown that it only works with the next changes though, otherwise it can reduce the performance instead. In practice the prefetching will start a bit later once the node is really in the cache, but since there's no dependency between these instructions and any other one, we let the CPU optimize as it wants. - we preload all important data from the node (next two branches, key and node.bit) very early even if not immediately needed. This is cheap, it doesn't cause any pipeline stall and speeds up later operations. - we pre-calculate 1<<bit that we assign into a register, so as to avoid serializing instructions when deciding which branch to take. - we assign the troot based on a ternary operation (or if/else) so that the CPU knows upfront the two possible next addresses without waiting for the end of a calculation and can prefetch their contents every time the branch prediction unit guesses right. Just doing this provides significant gains at various tree sizes on random keys (in million lookups per second): eb32 1k: 29.07 -> 33.17 +14.1% 10k: 14.27 -> 15.74 +10.3% 100k: 6.64 -> 8.00 +20.5% eb64 1k: 27.51 -> 34.40 +25.0% 10k: 13.54 -> 16.17 +19.4% 100k: 7.53 -> 8.38 +11.3% The performance is now much closer to the sequential keys. This was done for all variants ({32,64}{,i,le,ge}). Another point, the equality test in the loop improves the performance when looking up random keys (since we don't need to reach the leaf), but is counter-productive for sequential keys, which can gain ~17% without that test. However sequential keys are normally not used with exact lookups, but rather with lookup_ge() that spans a time frame, and which does not have that test for this precise reason, so in the end both use cases are served optimally. It's interesting to note that everything here is solely based on data dependencies, and that trying to perform fewer operations upfront always ends up with lower performance (typically the original one). This is ebtree commit 05a0613e97f51b6665ad5ae2801199ad55991534.	2025-09-17 14:30:31 +02:00
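The difference between a computed branch index and a ternary selection over preloaded values, as described in the commit above, can be shown on a deliberately simplified node. `struct node` and both functions are hypothetical and do not reflect ebtree's real layout or its tagged pointers; the sketch only demonstrates that the two forms select the same branch and makes no performance claim.

```c
#include <stdint.h>

/* Simplified node: two branches and the bit position tested here. */
struct node {
	struct node *branch[2];	/* [0] = left, [1] = right */
	uint32_t key;
	unsigned int bit;
};

/* Computed index: the next address is only known after a shift, a
 * mask and an indexed load have completed, in that order. */
struct node *next_by_index(const struct node *n, uint32_t key)
{
	return n->branch[(key >> n->bit) & 1];
}

/* Preload both candidates and the mask early, then pick one with a
 * ternary: both possible next addresses are known upfront. */
struct node *next_by_select(const struct node *n, uint32_t key)
{
	struct node *l = n->branch[0];
	struct node *r = n->branch[1];
	uint32_t mask = (uint32_t)1 << n->bit;

	return (key & mask) ? r : l;
}
```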
Willy Tarreau	dcd4d36723	IMPORT: ebtree: delete unusable ebpttree.c Since commit 21fd162 ("[MEDIUM] make ebpttree rely solely on eb32/eb64 trees") it was no longer used and no longer builds. The commit message mentions that the file is no longer needed, probably that a rebase failed and left the file there. This is ebtree commit fcfaf8df90e322992f6ba3212c8ad439d3640cb7.	2025-09-17 14:30:31 +02:00
Aurelien DARRAGON	31b3be7aae	CLEANUP: log: remove deadcode in px_parse_log_steps() When logsteps proxy storage was migrated from eb nodes to bitmasks in 6a92b14 ("MEDIUM: log/proxy: store log-steps selection using a bitmask, not an eb tree"), some unused eb node related code was left over in px_parse_log_steps(). Not only is this code unused, it also resulted in wasted memory since an eb node was allocated for nothing. This should fix GH #3121	2025-09-17 11:31:17 +02:00
Willy Tarreau	3d73e6c818	BUG/MEDIUM: pattern: fix possible infinite loops on deletion (try 2) Commit e36b3b60b3 ("MEDIUM: migrate the patterns reference to cebs_tree") changed the construction of the loops used to look up matching nodes, and since we don't need two elements anymore, the "continue" statement now loops on the same element when deleting. Let's fix this to make sure it passes through the next one. While this bug is 3.3 only, it turns out that 3.2 is also affected by the incorrect loop construct in pat_ref_set_from_node(), where it's possible to run an infinite loop since commit 010c34b8c7 ("MEDIUM: pattern: consider gen_id in pat_ref_set_from_node()") due to the "continue" statement being placed before the ebmb_next_dup() call. As such the relevant part of this fix (pat_ref_set_from_elt) will need to be backported to 3.2.	2025-09-16 16:32:39 +02:00
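The pitfall described in the commit above is a "continue" that is reached before the iterator has advanced. A minimal sketch on a plain linked list follows; `struct elt` and `delete_gen()` are hypothetical and unrelated to the real pat_ref structures.

```c
#include <stddef.h>

struct elt {
	struct elt *next;
	int gen_id;
	int deleted;
};

/* Mark every element whose generation is <gen> as deleted and return
 * how many were marked. The successor is fetched at the top of each
 * iteration and the advance sits in the for() clause, so "continue"
 * can never loop on the same element. */
int delete_gen(struct elt *head, int gen)
{
	struct elt *cur, *next;
	int count = 0;

	for (cur = head; cur; cur = next) {
		next = cur->next;
		if (cur->gen_id != gen)
			continue;	/* still moves on to <next> */
		cur->deleted = 1;
		count++;
	}
	return count;
}
```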
Willy Tarreau	f1b1d3682a	Revert "BUG/MEDIUM: pattern: fix possible infinite loops on deletion" This reverts commit 359a829ccb8693e0b29808acc0fa7975735c0353. The fix is neither sufficient nor correct (it triggers ASAN). Better redo it cleanly rather than accumulate invalid fixes.	2025-09-16 16:32:39 +02:00
Willy Tarreau	359a829ccb	BUG/MEDIUM: pattern: fix possible infinite loops on deletion Commit e36b3b60b3 ("MEDIUM: migrate the patterns reference to cebs_tree") changed the construction of the loops used to look up matching nodes, and since we don't need two elements anymore, the "continue" statement now loops on the same element when deleting. Let's fix this to make sure it passes through the next one. No backport is needed, this is only 3.3.	2025-09-16 11:49:01 +02:00
Willy Tarreau	4edff4a2cc	CLEANUP: vars: use the item API for the variables trees The variables trees use the immediate cebtree API, better use the item one which is more expressive and safer. The "node" field was renamed to "name_node" to avoid any ambiguity.	2025-09-16 10:51:23 +02:00
Willy Tarreau	c058cc5ddf	CLEANUP: tools: use the item API for the file names tree The file names tree uses the immediate cebtree API, better use the item one which is more expressive and safer.	2025-09-16 10:41:19 +02:00
Willy Tarreau	2d6b5c7a60	MEDIUM: connection: reintegrate conn_hash_node into connection Previously the conn_hash_node was placed outside the connection due to the big size of the eb64_node that could have negatively impacted frontend connections. But having it outside also means that one extra allocation is needed for each backend connection, and that one memory indirection is needed for each lookup. With the compact trees, the tree node is smaller (16 bytes vs 40) so the overhead is much lower. By integrating it into the connection, we're also eliminating one pointer from the connection to the hash node and one pointer from the hash node to the connection (in addition to the extra object bookkeeping). This results in saving at least 24 bytes per total backend connection, and only inflates connections by 16 bytes (from 240 to 256), which is a reasonable compromise. Tests on a 64-core EPYC show a 2.4% increase in the request rate (from 2.08 to 2.13 Mrps).	2025-09-16 09:23:46 +02:00
Willy Tarreau	ceaf8c1220	MEDIUM: connection: move idle connection trees to ceb64 Idle connection trees currently require a 56-byte conn_hash_node per connection, which can be reduced to 32 bytes by moving to ceb64. While ceb64 is theoretically slower, in practice here we're essentially dealing with trees that almost always contain a single key and many duplicates. In this case, ceb64 insert and lookup functions become faster than eb64 ones because all duplicates are a list accessed in O(1) while it's a subtree for eb64. In tests it is impossible to tell the difference between the two, so it's worth reducing the memory usage. This commit brings the following memory savings to conn_hash_node (one per backend connection), and to srv_per_thread (one per thread and per server): struct before after delta conn_hash_node 56 32 -24 srv_per_thread 96 72 -24 The delicate part is conn_delete_from_tree(), because we need to know the tree root the connection is attached to. But thanks to recent cleanups, it's now clear enough (i.e. idle/safe/avail vs session are easy to distinguish).	2025-09-16 09:23:46 +02:00
Willy Tarreau	95b8adff67	MINOR: connection: pass the thread number to conn_delete_from_tree() We'll soon need to choose the server's root based on the connection's flags, and for this we'll need the thread it's attached to, which is not always the current one. This patch simply passes the thread number from all callers. They know it because they just set the idle_conns lock on it prior to calling the function.	2025-09-16 09:23:46 +02:00
Willy Tarreau	efe519ab89	CLEANUP: backend: use a single variable for removed in srv_cleanup_idle_conns() Probably due to older code, there's a boolean variable used to set another one which is then checked. Also the first check is made under the lock, which is unnecessary. Let's simplify this and use a single variable. This only makes the code clearer, it doesn't change the output code.	2025-09-16 09:23:46 +02:00
Willy Tarreau	f7d1fc2b08	MINOR: server: pass the server and thread to srv_migrate_conns_to_remove() We'll need to have access to the srv_per_thread element soon from this function, and there's no particular reason for passing it list pointers so let's pass the server and the thread so that it is autonomous. It also makes the calling code simpler.	2025-09-16 09:23:46 +02:00
Willy Tarreau	d1c5df6866	CLEANUP: server: use eb64_entry() not ebmb_entry() to convert an eb64 There were a few leftovers from an earlier version of the conn_hash_node that was using ebmb nodes. A few calls to ebmb_first() and ebmb_entry() were still present while acting on an eb64 tree. These are harmless as one is just eb_first() and the other container_of(), but it's confusing so let's clean them up.	2025-09-16 09:23:46 +02:00
Willy Tarreau	3d18a0d4c2	CLEANUP: backend: factor the connection lookup loop The connection lookup loop is made of two identical blocks, one looking in the idle or safe lists and the other one looking into the safe list only. The second one is skipped if a connection was found or if the request looks for a safe one (since already done). Also the two are slightly different due to leftovers from earlier versions in that the second one checks for safe connections and not the first one, and the second one sets is_safe which is not used later. Let's just rationalize all this by placing them in a loop which checks first from the idle conns and second from the safe ones, or skips the first step if the request wants a safe connection. This reduces the code and shortens the time spent under the lock.	2025-09-16 09:23:46 +02:00
Willy Tarreau	d18d972b1f	MEDIUM: server: index server ID using compact trees The server ID is currently stored as a 32-bit int using an eb32 tree. It's used essentially to find holes in order to automatically assign IDs, and to detect duplicates. Let's change this to use compact trees instead in order to save 24 bytes in struct server for this node, plus 8 bytes in struct proxy. The server struct is still 3904 bytes large (due to alignment) and the proxy struct is 3072.	2025-09-16 09:23:46 +02:00
Willy Tarreau	66191584d1	MEDIUM: listener: index listener ID using compact trees The listener ID is currently stored as a 32-bit int using an eb32 tree. It's used essentially to find holes in order to automatically assign IDs, and to detect duplicates. Let's change this to use compact trees instead in order to save 24 bytes in struct listener for this node, plus 8 bytes in struct proxy. The struct listener is now 704 bytes large, and the struct proxy 3080.	2025-09-16 09:23:46 +02:00
Willy Tarreau	1a95bc42c7	MEDIUM: proxy: index proxy ID using compact trees The proxy ID is currently stored as a 32-bit int using an eb32 tree. It's used essentially to find holes in order to automatically assign IDs, and to detect duplicates. Let's change this to use compact trees instead in order to save 24 bytes in struct proxy for this node, plus 8 bytes in the root (which is static so not much relevant here). Now the proxy is 3088 bytes large.	2025-09-16 09:23:46 +02:00
Willy Tarreau	eab5b89dce	MINOR: proxy: add proxy_index_id() to index a proxy by its ID This avoids needlessly exposing the tree's root and the mechanics outside of the low-level code.	2025-09-16 09:23:46 +02:00
Willy Tarreau	5e4b6714e1	MINOR: listener: add listener_index_id() to index a listener by its ID This avoids needlessly exposing the tree's root and the mechanics outside of the low-level code.	2025-09-16 09:23:46 +02:00
Willy Tarreau	5a5cec4d7a	MINOR: server: add server_index_id() to index a server by its ID This avoids needlessly exposing the tree's root and the mechanics outside of the low-level code.	2025-09-16 09:23:46 +02:00
Willy Tarreau	4ed4cdbf3d	CLEANUP: server: use server_find_by_id() when looking for already used IDs In srv_parse_id(), there's no point doing all the low-level work with the tree functions to check for the existence of an ID, we already have server_find_by_id() which does exactly this, so let's use it.	2025-09-16 09:23:46 +02:00
Willy Tarreau	0b0aefe19b	MINOR: server: add server_get_next_id() to find next free server ID This was previously achieved via the generic get_next_id() but we'll soon get rid of generic ID trees so let's have a dedicated server_get_next_id(). As a bonus it reduces the exposure of the tree's root outside of the functions.	2025-09-16 09:23:46 +02:00
Willy Tarreau	23605eddb1	MINOR: listener: add listener_get_next_id() to find next free listener ID This was previously achieved via the generic get_next_id() but we'll soon get rid of generic ID trees so let's have a dedicated listener_get_next_id(). As a bonus it reduces the exposure of the tree's root outside of the functions.	2025-09-16 09:23:46 +02:00
Willy Tarreau	b2402d67b7	MINOR: proxy: add proxy_get_next_id() to find next free proxy ID This was previously achieved via the generic get_next_id() but we'll soon get rid of generic ID trees so let's have a dedicated proxy_get_next_id().	2025-09-16 09:23:46 +02:00
Willy Tarreau	f4059ea42f	MEDIUM: stktable: index table names using compact trees Here we're saving 64 bytes per stick-table, from 3392 to 3328, and the change was really straightforward so there's no reason not to do it.	2025-09-16 09:23:46 +02:00
Willy Tarreau	d0d60a007d	MEDIUM: proxy: switch conf.name to cebis_tree This is used to index the proxy's name and it contains a copy of the pointer to the proxy's name in <id>. Changing that for a ceb_node placed just before <id> saves 32 bytes to the struct proxy, which is now 3112 bytes large. Here we need to continue to support duplicates since they're still allowed between type-incompatible proxies. Interestingly, the use of cebis_next_dup() instead of cebis_next() in proxy_find_by_name() allows us to get rid of an strcmp() that was performed for each use_backend rule. A test with a large config (100k backends) shows that we can get 3% extra performance on a config involving a static use_backend rule (3.09M to 3.18M rps), and even 4.5% on a dynamic rule selecting a random backend (2.47M to 2.59M).	2025-09-16 09:23:46 +02:00
Willy Tarreau	fdf6fd5b45	MEDIUM: server: switch the host_dn member to cebis_tree This member is used to index the hostname_dn contents for DNS resolution. Let's replace it with a cebis_tree to save another 32 bytes (24 for the node + 8 by avoiding the duplication of the pointer). The struct server is now at 3904 bytes.	2025-09-16 09:23:46 +02:00
Willy Tarreau	413e903a22	MEDIUM: server: switch conf.name to cebis_tree This is used to index the server name and it contains a copy of the pointer to the server's name in <id>. Changing that for a ceb_node placed just before <id> saves 32 bytes to the struct server, which remains 3968 bytes large due to alignment. The proxy struct shrinks by 8 bytes to 3144. It's worth noting that the current way duplicate names are handled remains based on the previous mechanism where dups were permitted. Ideally we should now reject them during insertion and use unique key trees instead.	2025-09-16 09:23:46 +02:00
Willy Tarreau	0e99f64fc6	MEDIUM: server: switch addr_node to cebis_tree This contains the text representation of the server's address, for use with stick-tables with "srvkey addr". Switching them to a compact node saves 24 more bytes from this structure. The key was moved to an external pointer "addr_key" right after the node. The server struct is now 3968 bytes (down from 4032) due to alignment, and the proxy struct shrinks by 8 bytes to 3152.	2025-09-16 09:23:46 +02:00
Willy Tarreau	91258fb9d8	MEDIUM: guid: switch guid to more compact cebuis_tree The current guid struct size is 56 bytes. Once reduced using compact trees, it goes down to 32 (almost half). We're not on a critical path and size matters here, so better switch to this. It's worth noting that the name part could also be stored in the guid_node at the end to save 8 extra byte (no pointer needed anymore), however the purpose of this struct is to be embedded into other ones, which is not compatible with having a dynamic size. Affected struct sizes in bytes: Before After Diff server 4032 4032 0* proxy 3184 3160 -24 listener 752 728 -24 *: struct server is full of holes and padding (176 bytes) and is 64-byte aligned. Moving the guid_node elsewhere such as after sess_conn reduces it to 3968, or one less cache line. There's no point in moving anything now because forthcoming patches will arrange other parts.	2025-09-16 09:23:46 +02:00
Willy Tarreau	e36b3b60b3	MEDIUM: migrate the patterns reference to cebs_tree cebs_tree are 24 bytes smaller than ebst_tree (16B vs 40B), and pattern references are only used during map/acl updates, so their storage is pure loss between updates (which most of the time never happen). By switching their indexing to compact trees, we can save 16 to 24 bytes per entry depending on alignment (here it's 24 per struct but 16 practical as malloc's alignment keeps 8 unused). Tested on core i7-8650U running at 3.0 GHz, with a file containing 17.7M IP addresses (16.7M different): $ time ./haproxy -c -f acl-ip.cfg Save 280 MB RAM for 17.7M IP addresses, and slightly speeds up the startup (5.8%, from 19.2s to 18.2s), a part of which possibly being attributed to having to write less memory. Note that this is on small strings. On larger ones such as user-agents, ebtree doesn't reread the whole key and might be more efficient. Before: RAM (VSZ/RSS): 4443912 3912444 real 0m19.211s user 0m18.138s sys 0m1.068s Overhead Command Shared Object Symbol 44.79% haproxy haproxy [.] ebst_insert 25.07% haproxy haproxy [.] ebmb_insert_prefix 3.44% haproxy libc-2.33.so [.] __libc_calloc 2.71% haproxy libc-2.33.so [.] _int_malloc 2.33% haproxy haproxy [.] free_pattern_tree 1.78% haproxy libc-2.33.so [.] inet_pton4 1.62% haproxy libc-2.33.so [.] _IO_fgets 1.58% haproxy libc-2.33.so [.] _int_free 1.56% haproxy haproxy [.] pat_ref_push 1.35% haproxy libc-2.33.so [.] malloc_consolidate 1.16% haproxy libc-2.33.so [.] __strlen_avx2 0.79% haproxy haproxy [.] pat_idx_tree_ip 0.76% haproxy haproxy [.] pat_ref_read_from_file 0.60% haproxy libc-2.33.so [.] __strrchr_avx2 0.55% haproxy libc-2.33.so [.] unlink_chunk.constprop.0 0.54% haproxy libc-2.33.so [.] __memchr_avx2 0.46% haproxy haproxy [.] pat_ref_append After: RAM (VSZ/RSS): 4166108 3634768 real 0m18.114s user 0m17.113s sys 0m0.996s Overhead Command Shared Object Symbol 38.99% haproxy haproxy [.] cebs_insert 27.09% haproxy haproxy [.] ebmb_insert_prefix 3.63% haproxy libc-2.33.so [.] __libc_calloc 3.18% haproxy libc-2.33.so [.] _int_malloc 2.69% haproxy haproxy [.] free_pattern_tree 1.99% haproxy libc-2.33.so [.] inet_pton4 1.74% haproxy libc-2.33.so [.] _IO_fgets 1.73% haproxy libc-2.33.so [.] _int_free 1.57% haproxy haproxy [.] pat_ref_push 1.48% haproxy libc-2.33.so [.] malloc_consolidate 1.22% haproxy libc-2.33.so [.] __strlen_avx2 1.05% haproxy libc-2.33.so [.] __strcmp_avx2 0.80% haproxy haproxy [.] pat_idx_tree_ip 0.74% haproxy libc-2.33.so [.] __memchr_avx2 0.69% haproxy libc-2.33.so [.] __strrchr_avx2 0.69% haproxy libc-2.33.so [.] _IO_getline_info 0.62% haproxy haproxy [.] pat_ref_read_from_file 0.56% haproxy libc-2.33.so [.] unlink_chunk.constprop.0 0.56% haproxy libc-2.33.so [.] cfree@GLIBC_2.2.5 0.46% haproxy haproxy [.] pat_ref_append If the addresses are totally disordered (via "shuf" on the input file), we see both implementations reach exactly 68.0s (slower due to much higher cache miss ratio). On large strings such as user agents (1 million here), it's now slightly slower (+9%): Before: real 0m2.475s user 0m2.316s sys 0m0.155s After: real 0m2.696s user 0m2.544s sys 0m0.147s But such patterns are much less common than short ones, and the memory savings do still count. Note that while it could be tempting to get rid of the list that chains all these pat_ref_elt together and only enumerate them by walking along the tree to save 16 extra bytes per entry, that's not possible due to the problem that insertion ordering is critical (think overlapping regex such as /index.* and /index.html). Currently it's not possible to proceed differently because patterns are first pre-loaded into the pat_ref via pat_ref_read_from_file_smp() and later indexed by pattern_read_from_file(), which has to only redo the second part anyway for maps/acls declared multiple times.	2025-09-16 09:23:46 +02:00
Willy Tarreau	ddf900a0ce	IMPORT: cebtree: import version 0.5.0 to support duplicates The support for duplicates is necessary for various use cases related to config names, so let's upgrade to the latest version which brings this support. This updates the cebtree code to commit 808ed67 (tag 0.5.0). A few tiny adaptations were needed: - replace a few ceb_node with ceb_root since pointers are now tagged ; - replace cebu.h with ceb.h since both are now merged in the same include file. This way we can drop the unused cebu*.h files from cebtree that are provided only for compatibility. - rename immediate storage functions to cebXX_imm_XXX() as per the API change in 0.5 that makes immediate explicit rather than implicit. This only affects vars and tools.c:copy_file_name(). The tests continue to work.	2025-09-16 09:23:46 +02:00
Valentine Krasnobaeva	f8acac653e	BUG/MINOR: resolvers: always normalize FQDN from response RFC1034 states the following: By convention, domain names can be stored with arbitrary case, but domain name comparisons for all present domain functions are done in a case-insensitive manner, assuming an ASCII character set, and a high order zero bit. This means that you are free to create a node with label "A" or a node with label "a", but not both as brothers; you could refer to either using "a" or "A". In practice, most DNS resolvers normalize domain labels (i.e., convert them to lowercase) before performing searches or comparisons to ensure this requirement is met. While HAProxy normalizes the domain name in the request, it currently does not do so for the response. Commit 75cc653 ("MEDIUM: resolvers: replace bogus resolv_hostname_cmp() with memcmp()") intentionally removed the `tolower()` conversion from `resolv_hostname_cmp()` for safety and performance reasons. This commit re-introduces the necessary normalization for FQDNs received in the response. The change is made in `resolv_read_name()`, where labels are processed as an unsigned char string, allowing `tolower()` to be applied safely. Since a typical FQDN has only 3-4 labels, replacing `memcpy()` with an explicit copy that also applies `tolower()` should not introduce a significant performance degradation. This patch addresses a rare edge case, as most resolvers perform this normalization themselves. This fixes GitHub issue #3102. This fix may be backported to all stable versions since 2.5 (included).	2025-09-15 18:02:16 +02:00
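Replacing memcpy() with a copy that lowercases each byte, as the commit above describes, can be sketched like this. `copy_label_lower()` is a hypothetical helper, not the body of resolv_read_name(); it only shows why reading the source as unsigned char makes the tolower() call well defined.

```c
#include <ctype.h>
#include <stddef.h>

/* Copy <len> bytes of a DNS label from <src> to <dst>, converting
 * ASCII upper case to lower case on the way. The source is read as
 * unsigned char because tolower() requires a value representable as
 * unsigned char (or EOF). */
void copy_label_lower(char *dst, const unsigned char *src, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		dst[i] = (char)tolower(src[i]);
}
```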
Remi Tricot-Le Breton	257df69fbd	BUG/MINOR: ocsp: Crash when updating CA during ocsp updates If an ocsp response is set to be updated automatically and some certificate or CA updates are performed on the CLI, if the CLI update happens while the OCSP response is being updated and is then detached from the update tree, it might be wrongly inserted into the update tree in 'ssl_sock_load_ocsp', and then reinserted when the update finishes. The update tree then gets corrupted and we could end up crashing when accessing other nodes in the ocsp response update tree. This patch must be backported up to 2.8. This patch fixes GitHub #3100.	2025-09-15 15:34:36 +02:00
Aurelien DARRAGON	6a92b14cc1	MEDIUM: log/proxy: store log-steps selection using a bitmask, not an eb tree An eb tree was used in anticipation of an unbounded number of custom log steps configured at proxy level. It turns out that it makes no sense to configure that many logging steps for a proxy, and the cost of the eb tree is non negligible in terms of memory footprint, especially when used in a default section. Instead, let's use a simple bitmask, which allows up to 64 logging steps configured at proxy level. If we lack space some day (and need more than 64 logging steps to be configured), we could simply modify "struct log_steps" to spread the bitmask over multiple 64-bit integers, with some adjustments where the mask is set and checked.	2025-09-15 10:29:02 +02:00
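Storing a selection of up to 64 steps in a single integer, as the commit above describes, comes down to two bit operations. The sketch below uses hypothetical names (`struct step_mask`, `step_set()`, `step_test()`) and does not reproduce the real "struct log_steps".

```c
#include <stdint.h>

/* Hypothetical selection of up to 64 steps, one bit per step. */
struct step_mask {
	uint64_t bits;
};

/* Select step <step>. Returns 1 on success, 0 if it does not fit. */
int step_set(struct step_mask *m, unsigned int step)
{
	if (step >= 64)
		return 0;
	m->bits |= (uint64_t)1 << step;
	return 1;
}

/* Returns non-zero if step <step> is selected. */
int step_test(const struct step_mask *m, unsigned int step)
{
	return step < 64 && ((m->bits >> step) & 1);
}
```

Going beyond 64 steps would mean turning `bits` into an array and indexing it with `step / 64`, which is the extension the commit message alludes to.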
Aurelien DARRAGON	be417c1db2	BUG/MEDIUM: http_ana: fix potential NULL deref in http_process_req_common() As reported by @kenballus in GH #3118, a potential NULL-deref was introduced in 3da1d63 ("BUG/MEDIUM: http_ana: handle yield for "stats http-request" evaluation"). Indeed, px->uri_auth may be NULL when stats directive is not involved in the current proxy section. The bug went unnoticed because it didn't seem to cause any side-effect so far and valgrind didn't catch it. However ASAN did, so let's fix it before it causes harm. It should be backported with 3da1d63.	2025-09-15 10:28:59 +02:00

20112 Commits