haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-10-29 07:31:00 +01:00

Author	SHA1	Message	Date
Ilia Shipitsin	27a6353ceb	CLEANUP: assorted typo fixes in the code, commits and doc	2025-04-03 11:37:25 +02:00
Aurelien DARRAGON	f72a66eef2	MINOR: pattern: publish event_hdl events on pat_ref updates Now that PAT_REF events were defined in previous commit, let's actually publish them from pattern API where relevant. Unlike server events, pattern reference events are only published in the pat_ref subscriber's list on purpose, because in some setups patref updates (updates performed on a map for instance from action or cli) are very frequent, and we don't want to impact pattern API performance just for that. Moreover, as the main use case is to be able to subscribe to maps updates from Lua, allowing a per-pattern reference registration is already enough. No additional data is provided for such events (also for performance reason) Care was taken not to publish events when the update doesn't affect the live subset (the one targeted by curr_gen).	2024-11-29 07:22:25 +01:00
Aurelien DARRAGON	aa69a02d7f	MEDIUM: pattern: always consider gen_id for pat_ref lookup operations Historically, pat_ref lookup operations were performed on the whole pat_ref elements list. As such, set, find and delete operations on a given key would cause any matching element in pat_ref to be considered. When prepare/commit operations were added, gen_id was implemented in order to be able to work on a subset from pat_ref without impacting the current (live) version from pat_ref, until a new subset is committed to replace the current one. While the logic was good, there remained a design flaw from the historical implementation: indeed, legacy functions such as pat_ref_set(), pat_ref_delete() and pat_ref_find_elt() kept performing the lookups on the whole set of elements instead of considering only elements from the current subset. Because of this, mixing new prepare/commit operations with legacy operations could yield unexpected results. For instance, before this commit: echo "add map #0 key oldvalue" | socat /tmp/ha.sock - echo "prepare map #0" | socat /tmp/ha.sock - New version created: 1 echo "add map @1 #0 key newvalue" | socat /tmp/ha.sock - echo "del map #0 key" | socat /tmp/ha.sock - echo "commit map @1 #0" | socat /tmp/ha.sock - -> the result would be that "key" entry doesn't exist anymore after the commit, while we would expect the new value to be there instead. Thanks to the previous commits, we may finally fix this issue: for set, find_elt and delete operations, the current generation id is considered. With the above example, it means that the "del map #0 key" would only target elements from the current subset, thus elements in "version 1" of the map would be immune to the delete (as we would expect it to work).	2024-11-26 16:12:31 +01:00
Aurelien DARRAGON	010c34b8c7	MEDIUM: pattern: consider gen_id in pat_ref_set_from_node() Don't set all duplicates from a given node if they don't have the same gen_id. Indeed, now we consider the gen_id to only work on the same pattern ref revision.	2024-11-26 16:12:26 +01:00
Aurelien DARRAGON	4792f27892	MINOR: pattern: add pat_ref_gen_delete() function pat_ref_gen_delete(ref, gen_id, key) tries to delete all samples belonging to <gen_id> and matching <key> under <ref> The goal is to be able to target a single subset from <ref>	2024-11-26 16:12:21 +01:00
Aurelien DARRAGON	a131c542a6	MINOR: pattern: add pat_ref_gen_find_elt() function pat_ref_gen_find_elt(ref, gen_id, key) tries to find <elt> element belonging to <gen_id> and matching <key> in <ref> reference. The goal is to be able to target a single subset from <ref>	2024-11-26 16:12:16 +01:00
Aurelien DARRAGON	c9d6af3c6d	MINOR: pattern: add pat_ref_gen_set() function pat_ref_gen_set(ref, gen_id, value, err) modifies to <value> the sample of all patterns matching <key> and belonging to <gen_id> (generation id) under <ref> The goal is to be able to target a single subset from <ref>	2024-11-26 16:12:11 +01:00
Aurelien DARRAGON	3d250b3be8	MINOR: pattern: split pat_ref_set() split pat_ref_set() function in 2 distinct functions. Indeed, since 0844bed7d3 ("MEDIUM: map/acl: Improve pat_ref_set() efficiency (for "set-map", "add-acl" action perfs)"), pat_ref_set() prototype was updated to include an extra <elt> argument. But the logic behind it is not explicit because the function will not only try to set <elt>, but also its duplicate (unlike pat_ref_set_elt() which only tries to update <elt>). Thus, to make it clearer and better distinguish between the key-based lookup version and the elt-based one, restore pat_ref_set() previous prototype and add a dedicated pat_ref_set_elt_duplicate() that takes <elt> as argument and tries to update <elt> and all duplicates.	2024-11-26 16:12:05 +01:00
Willy Tarreau	555994c968	OPTIM: pattern: only apply LRU cache for large enough lists As shown in issue #1518, the LRU cache has a non-null cost that can sometimes be above the match cost it's trying to avoid. After a number of tests, it appears that: - "simple" match operations (sub, beg, end, int etc) reach a break-even after ~20 patterns in list - "heavy" match operations (reg) reach a break-even after ~5 patterns in list Let's only consult the LRU cache when the number of patterns in the expression is at least as large as this limit. Of course there will always be outliers but it already starts good. Another improvement consists in reducing the cache size to further speed up lookups, which makes sense if less expressions use the cache.	2024-11-15 15:33:04 +01:00
Aurelien DARRAGON	aba3ed62ae	MINOR: pattern: add pat_ref_free() helper func For now, pat_ref structs are never freed, except during init in case of error. The freeing is done directly in the init functions because we don't have a helper for that. Not having a helper func to properly free pat_ref struct doesn't encourage us to free unused pat_ref structs, plus it is error-prone if new dynamic members are added to pat_ref struct in the future. To fix that, let's add a pat_ref_free() helper func and use it where relevant (which means only under pat_ref init function for now..)	2024-11-07 11:36:13 +01:00
Aurelien DARRAGON	e8a0dbff93	OPTIM: pattern: use malloc() to initialize new pat_ref struct As mentioned in the previous commit, in _pat_ref_new(), it was not strictly needed to explicitly assign all struct members to 0 since the struct was allocated with calloc() which does the zeroing for us. However, it was verified that we already initialize all fields explicitly, thus there is no reason to keep using calloc() instead of malloc(). In fact using malloc() is less expensive, so let's use that instead now.	2024-11-07 11:36:08 +01:00
Aurelien DARRAGON	d1397401f0	MINOR: pattern: add _pat_ref_new() helper func pat_ref_newid() and pat_ref_new() are two functions to create and initialize a pat_ref struct based on input parameters. Both function perform the same generic allocation and initialization for pat_ref struct, thus there is quite a lot of code redundancy. This is error-prone if the pat_ref init sequence has to be updated at some point. To reduce maintenance costs, let's add a _pat_ref_new() helper func that takes care of the generic allocation and base initialization for pat_ref struct.	2024-11-07 11:36:01 +01:00
Willy Tarreau	9f8d9c9e8b	BUG/MINOR: pattern: do not leave a leading comma on "set" error messages Commit 4f2493f355 ("BUG/MINOR: pattern: pat_ref_set: fix UAF reported by coverity") dropped the condition to concatenate error messages and as such introduced a leading comma in front of all of them. Then commit 911f4d93d4 ("BUG/MINOR: pattern: pat_ref_set: return 0 if err was found") changed the behavior to stop at the first error anyway, so all the mechanics dedicated to the concatenation of error messages is no longer needed and we can simply return the error as-is, without inserting any comma. This should be backported where the patches above are backported.	2024-09-10 08:55:29 +02:00
Aurelien DARRAGON	68cfb222b5	BUG/MEDIUM: pattern: prevent UAF on reused pattern expr Since c5959fd ("MEDIUM: pattern: merge same pattern"), UAF (leading to crash) can be experienced if the same pattern file (and match method) is used in two default sections and the first one is not referenced later in the config. In this case, the first default section will be cleaned up. However, due to an unhandled case in the above optimization, the original expr which the second default section relies on is mistakenly freed. This issue was discovered while trying to reproduce GH #2708. The issue was particularly tricky to reproduce given the config and sequence required to make the UAF happen. Fortunately, GitHub user @asmnek not only provided useful information, but since he was able to consistently trigger the crash in his environment he was able to nail down the crash to the use of pattern file involved with 2 named default sections. Big thanks to him. To fix the issue, let's push the logic from c5959fd a bit further. Instead of relying on "do_free" variable to know if the expression should be freed or not (which proved to be insufficient in our case), let's switch to a simple refcounting logic. This way, no matter who owns the expression, the last one attempting to free it will be responsible for freeing it. Refcount is implemented using a 32bit value which fills a previous 4 bytes structure gap: int mflags; /* 80 4 */ /* XXX 4 bytes hole, try to pack */ long unsigned int lock; /* 88 8 */ (output from pahole) Even though it was not reproduced in 2.6 or below by @asmnek (the bug was revealed thanks to another bugfix), this issue theoretically affects all stable versions (up to c5959fd), thus it should be backported to all stable versions.	2024-09-09 16:07:05 +02:00
Aurelien DARRAGON	8157c1caf2	BUG/MEDIUM: pattern: prevent uninitialized reads in pat_match_{str,beg} Using valgrind when running map_beg or map_str, the following error is reported: ==242644== Conditional jump or move depends on uninitialised value(s) ==242644== at 0x2E4AB1: pat_match_str (pattern.c:457) ==242644== by 0x2E81ED: pattern_exec_match (pattern.c:2560) ==242644== by 0x343176: sample_conv_map (map.c:211) ==242644== by 0x27522F: sample_process_cnv (sample.c:1330) ==242644== by 0x2752DB: sample_process (sample.c:1373) ==242644== by 0x319917: action_store (vars.c:814) ==242644== by 0x24D451: http_req_get_intercept_rule (http_ana.c:2697) In fact, the error is legit, because in pat_match_{beg,str}, we dereference the buffer on len+1 to check if a value was previously set, and then decide to force NULL-byte if it wasn't set. But the approach is no longer compatible with current architecture: data past str.data is not guaranteed to be initialized in the buffer. Thus we cannot dereference the value, else we expose us to uninitialized read errors. Moreover, the check is useless, because we systematically set the ending byte to 0 when the conditions are met. Finally, restoring the older value after the lookup is not relevant: indeed, either the sample is marked as const and in such case it is already duplicated, or the sample is not const and we forcefully add a terminating NULL byte outside from the actual string bytes (since we're past str.data), so as we didn't alter effective string data and that data past str.data cannot be dereferenced anyway as it isn't guaranteed to be initialized, there's no point in restoring previous uninitialized data. It could be backported in all stable versions. But since this was only detected by valgrind and isn't known to cause issues in existing deployments, it's probably better to wait a bit before backporting it to avoid any breakage.. although the fix should be theoretically harmless.	2024-09-09 15:57:30 +02:00
Aurelien DARRAGON	3449525a02	BUG/MINOR: pattern: prevent const sample from being tampered in pat_match_beg() This is a complementary patch to a68affeaa ("BUG/MINOR: pattern: a sample marked as const could be written"). Indeed the same logic from pat_match_str() is used there, but we lack the check to ensure that the sample is not const before writing data to it. It could be backported to all stable versions.	2024-09-09 15:57:23 +02:00
Valentine Krasnobaeva	911f4d93d4	BUG/MINOR: pattern: pat_ref_set: return 0 if err was found pat_ref_set_elt() returns 0 if we run out of memory or can't parse a new map value. Any error message emitted by pat_ref_set_elt() is saved in the err buffer, if it is provided by the caller. These error messages are accumulated during the loop. pat_ref_set() is used to update values in map, referred to the same given key. If during the update pat_ref_set_elt() fails, let's return 0 to caller immediately. We have the same non-unique key and the same new value in each loop. So it seems quite odd to accumulate the same error messages and print them in CLI: > add map @1 mytest.map << + 1.0.1.11 TestA + 1.0.1.11 TESTA + 1.0.1.11 test_a + > set map mytest.map 1.0.1.11 15 unable to parse '15' unable to parse '15' unable to parse '15'. cli_parse_set_map(), which calls pat_ref_set() to update map, will return only one error message with this patch: > set map mytest.map 1.0.1.11 15 unable to parse '15'. hlua_set_map() and http_action_set_map() don't provide error buffer and will just exit on the first error. This should be backported in all stable versions.	2024-08-13 16:13:43 +02:00
Valentine Krasnobaeva	4f2493f355	BUG/MINOR: pattern: pat_ref_set: fix UAF reported by coverity memprintf() performs realloc and then updates the pointer to an output buffer, where it has written the data. So free() is called on the previous buffer address, if it was provided. pat_ref_set_elt() uses memprintf() to write its error message as well as pat_ref_set(). So, when we re-enter into the while loop the second time and pat_ref_set_elt() has returned, the err ptr (previous value of merr) is already freed by memprintf() from pat_ref_set_elt(). 'if (!found)' condition is false at this point, because we've found a node at the first loop. So, the second memprintf(), in order to write error messages, does again free(*err). This should be backported in all stable versions.	2024-08-13 16:13:41 +02:00
Aurelien DARRAGON	3b0bf5097b	MINOR: map: mapfile ordering also matters for tree-based match types Willy made me realize that tree-based matching may also suffer from out-of-order mapfile loading, as opposed to what's being said in b546bb6d ("BUG/MINOR: map: list-based matching potential ordering regression") and the associated REGTEST. Indeed, in case of duplicated keys, we want to be sure that only the key that was first seen in the file will be returned (as long as it is not removed). The above fix is still valid, and the list-based match regtest will also prevent regressions for tree-based match since mapfile loading logic is currently match-type agnostic. But let's clarify that by making both the code comment and the regtest more precise.	2024-01-11 11:13:54 +01:00
Aurelien DARRAGON	b546bb6d67	BUG/MINOR: map: list-based matching potential ordering regression An unexpected side-effect was introduced by 5fea597 ("MEDIUM: map/acl: Accelerate several functions using pat_ref_elt struct ->head list") The above commit tried to use eb tree API to manipulate elements as much as possible in the hope to accelerate some functions. Prior to 5fea597, pattern_read_from_file() used to iterate over all elements from the map file in the same order they were seen in the file (using list_for_each_entry) to push them in the pattern expression. Now, since eb api is used to iterate over elements, the ordering is lost very early. This is known to cause behavior changes with existing setups (same conf and map file) when compared with previous versions for some list-based matching methods as described in GH #2400. For instance, the map_dom() converter may return a different matching key from the one that was returned by older haproxy versions. For IP or STR matching, matching is based on tree lookups for better efficiency, so in this case the ordering is lost in the name of performance. The order in which they are loaded doesn't matter because tree ordering is based on the content, it is not positional. But with some other types, matching is based on list lookups (e.g.: dom), and the order in which elements are pushed into the list can affect the matching element that will be returned (in case of multiple matches, since only the first matching element in the list will be returned). Despite the documentation not officially stating that the file ordering should be preserved for list-based matching methods, it's probably best to be conservative here and stick to historical behavior. Moreover, there was no performance benefit from using the eb tree api to iterate over elements in pattern_read_from_file() since all elements are visited anyway. This should be backported to 2.9.	2024-01-10 18:02:13 +01:00
Aurelien DARRAGON	d7964c52ce	BUG/MEDIUM: map/acl: pat_ref_{set,delete}_by_id regressions Some regressions were introduced by 5fea59754b ("MEDIUM: map/acl: Accelerate several functions using pat_ref_elt struct ->head list") pat_ref_delete_by_id() fails to properly unlink and free the removed reference because it bypasses the pat_ref_delete_by_ptr() made for that purpose. This function is normally used everywhere the target reference is set for removal, such as the pat_ref_delete() function that matches pattern against a string. The call was probably skipped by accident during the rewrite of the function. With the above commit also comes another undesirable change: both pat_ref_delete_by_id() and pat_ref_set_by_id() directly use the <refelt> argument as a valid pointer (they do dereference it). This is wrong, because <refelt> is unsafe and should be handled as an ID, not a pointer (hence the function name). Indeed, the calling function may directly pass user input from the CLI as <refelt> argument, so we must first ensure that it points to a valid element before using it, else it is probably invalid and we shouldn't touch it. What this patch essentially does, is that it reverts pat_ref_set_by_id() and pat_ref_delete_by_id() to pre 5fea59754b behavior. This seems like it was the only optimization from the patch that doesn't apply. Hopefully, after reviewing the changes with Fred, it seems that the 2 functions are only being involved in commands for manipulating maps or acls on the cli, so the "missed" opportunity to improve their performance shouldn't matter much. Nonetheless, if we wanted to speed up the reference lookup by ID, we could consider adding an eb64 tree for that specific purpose that contains all pattern references IDs (ie: pointers) so that eb lookup functions may be used instead of linear list search. 
The issue was raised by Marko Juraga as he failed to perform an acl removal by reference on the CLI on 2.9 which was known to work properly on other versions. It should be backported on 2.9. Co-Authored-by: Frédéric Lécaille <flecaille@haproxy.com>	2023-12-08 14:26:06 +01:00
Christopher Faulet	67c03508d6	MEDIUM: pattern: Add support for virtual and optional files for patterns Before this patch, it was not possible to use a list of patterns, map or a list of acls, without an existing file. However, it could be handy to just use an ID, with no file on the disk. It is pretty useful for everyone dynamically managing these lists. It could also be handy to try to load a list from a file if it exists without failing if not. This way, it could be possible to make a cold start without any file (instead of empty file), dynamically add and del patterns, dump the list to the file periodically to reuse it on reload (via an external process). In this patch, we use some prefixes to be able to use virtual or optional files. The default case remains unchanged. Regular files are used. A filename, with no prefix, is used as reference, and it must exist on the disk. With the prefix "file@", the same is performed. Internally this prefix is skipped. Thus the same file, with or without "file@" prefix, references the same list of patterns. To use a virtual map, "virt@" prefix must be used. No file is read, even if the following name looks like a file. It is just an ID. The prefix is part of ID and must always be used. To use an optional file, ie a file that may or may not exist on a disk at startup, "opt@" prefix must be used. If the file exists, its content is loaded. But HAProxy doesn't complain if not. The prefix is not part of ID. For a given file, optional files and regular files reference the same list of patterns. This patch should fix the issue #2202.	2023-12-06 10:24:41 +01:00
Christopher Faulet	660e4185e1	MINOR: pattern: Use reference name as filename to read patterns from a file It is only a small API refactoring. The filename is no longer used when pat_ref_read_from_file_smp() or pat_ref_read_from_file() functions are called. The filename was already used to create the reference on the list of patterns. Thus, we now directly use info from this reference.	2023-12-06 10:24:41 +01:00
Willy Tarreau	3ac9912837	OPTIM: pattern: save memory and time using ebst instead of ebis In the pat_ref_elt struct, the pattern string is stored outside of the node element, using a pointer to an strdup(). Not only this needlessly wastes at least 16-24 bytes per entry (8 for the pointer, 8-16 for the allocator), it also makes the tree descent less efficient since both the node and the string have to be visited for each layer (hence at least two cache lines). Let's use an ebmb storage and place the pattern right at the end of the pat_ref_elt, making it a variable-sized element instead. The set-map test below jumps from 173 to 182 kreq/s/core, and the memory usage drops from 356 MB to 324 MB: http-request set-map(/dev/null) %[rand(1000000)] 1 This is even more visible with large maps: after loading 16M IP addresses into a map, the process uses this amount of memory: - 3.15 GB with haproxy-2.8 - 4.21 GB with haproxy-2.9-dev11 - 3.68 GB with this patch So that's a net saving of 32 bytes per entry here, which cuts in half the extra cost of the tree, and loading a large map takes about 20% less time.	2023-11-27 11:25:07 +01:00
Willy Tarreau	58185669d8	BUG/MEDIUM: pattern: don't trim pools under lock in pat_ref_purge_range() There's a subtle issue that results from pat_ref_purge_range() trying to release memory. Since commit 0d93a8186 ("MINOR: pools: work around possibly slow malloc_trim() during gc") that was backported to 2.3, trim_all_pools() now protects itself against concurrent malloc() and free() by isolating itself. The problem is that pat_ref_purge_range() must be called under a lock, which is precisely what's done in cli_io_handler_clear_map(). Thus during a clearing of a map, if another thread tries to access or update an entry in the same map, it will wait for the ref->lock to be released, and trim_all_pools() will wait for all threads to be harmless, thus causing a deadlock. Note that disabling memory trimming cannot work around the problem here because it's tested only under isolation. The solution here consists in moving the call to trim_all_pools() to the caller, out of the lock. This must be backported as far as 2.4.	2023-11-04 07:55:37 +01:00
Aurelien DARRAGON	0189a4679e	MINOR: pattern/ip: simplify pat_match_ip() function pat_match_ip() has been updated several times over the last decade to introduce new features, but it was never cleaned up. The result is that the function is pretty hard to read, and there are multiple duplicated code blocks so it becomes error-prone to maintain it, plus it bloats the haproxy binary for nothing. In this patch, we move the tree search (ip4 / ip6) logic into 2 dedicated helper functions. This allows us to refactor pat_match_ip() without touching to the original behavior.	2023-09-21 09:50:56 +02:00
Aurelien DARRAGON	f80122db26	MINOR: pattern/ip: offload ip conversion logic to helper functions Now that v4tov6() and v6tov4() were reworked to match behavior from pat_match_ip() function in ("MINOR: tools/ip: v4tov6() and v6tov4() rework"), we can remove code duplication in pat_match_ip() by directly using those dedicated functions where relevant.	2023-09-21 09:50:55 +02:00
Frédéric Lécaille	81815a9a83	MEDIUM: map/acl: Replace map/acl spin lock by a read/write lock. Replace ->lock type of pat_ref struct by HA_RWLOCK_T. Replace all calls to HA_SPIN_LOCK() (resp. HA_SPIN_UNLOCK()) by HA_RWLOCK_WRLOCK() (resp. HA_RWLOCK_WRUNLOCK()) when a write access is required. There is only one read access which is needed. This is in the "show map" command callback, cli_io_handler_map_lookup() where a HA_SPIN_LOCK() call is replaced by HA_RWLOCK_RDLOCK() (resp. HA_SPIN_UNLOCK() by HA_RWLOCK_RDUNLOCK). Replace HA_SPIN_INIT() calls by HA_RWLOCK_INIT() calls.	2023-08-25 15:42:03 +02:00
Frédéric Lécaille	5fea59754b	MEDIUM: map/acl: Accelerate several functions using pat_ref_elt struct ->head list Replace as much as possible list_for_each*() around ->head list, member of pat_ref_elt struct by use of its ->ebpt_root member which is an ebtree.	2023-08-25 15:42:01 +02:00
Frédéric Lécaille	745d1a269b	MEDIUM: map/acl: Improve pat_ref_set_elt() efficiency (for "set-map", "add-acl" action perfs) Store a pointer to the expression (struct pattern_expr) into the data structure used to chain/store the map element references (struct pat_ref_elt), e.g. the struct pattern_tree when stored into an ebtree or struct pattern_list when chained to a list. Modify pat_ref_set_elt() to stop inspecting all the expressions attached to a map and to look for the <elt> element passed as parameter to retrieve the sample data to be parsed. Indeed, thanks to the pointer added above to each pattern tree node or list element, they all can be inspected directly from the <elt> passed as parameter and its ->tree_head and ->list_head member: the pattern tree nodes are stored into elt->tree_head, and the pattern list elements are chained to elt->list_head list. This inspection was also the job of pattern_find_smp() which is no longer useful. This patch removes the code of this function.	2023-08-25 15:41:59 +02:00
Frédéric Lécaille	0844bed7d3	MEDIUM: map/acl: Improve pat_ref_set() efficiency (for "set-map", "add-acl" action perfs) Organize reference to pattern element of map (struct pat_ref_elt) into an ebtree: - add an eb_root member to the map (pat_ref struct) and an ebpt_node to its element (pat_ref_elt struct), - modify the code to insert these nodes into their ebtrees each time they are allocated. This is done in pat_ref_append(). Note that the ->head member (struct list) of the map (struct pat_ref) could have been removed. It was kept because it is still necessary to dump the map contents from the CLI in the order the map elements have been inserted. This patch also modifies http_action_set_map() which is the callback at least used by "set-map" action. The pat_ref_elt element returned by pat_ref_find_elt() is no longer ignored, but reused if not NULL by pat_ref_set() as first element to lookup from. This latter is also modified to use the ebtree attached to the map in place of the ->head list attached to each map element (pat_ref_elt struct). Also modify pat_ref_find_elt() to make it use ->eb_root map ebtree added to the map by this patch in place of inspecting all the elements with a strcmp() call.	2023-08-25 15:41:56 +02:00
Willy Tarreau	821fc95146	MINOR: pattern: do not needlessly lookup the LRU cache for empty lists If a pattern list is empty, there's no way we can find its elements in the pattern cache, so let's avoid this expensive lookup. This can happen for ACLs or maps loaded from files that may optionally be empty for example. Doing so improves the request rate by roughly 10% for a single such match for only 8 threads. That's normal because the LRU cache pre-creates an entry that is about to be committed for the case the list lookup succeeds after a miss, so we bypass all this.	2023-08-22 07:27:01 +02:00
Willy Tarreau	9b060f148e	MINOR: pattern: use trim_all_pools() instead of a conditional malloc_trim() First this will ensure that we serialize the threads and avoid severe contention. Second it removes ugly ifdefs and conditions.	2023-03-22 17:30:28 +01:00
Miroslav Zagorac	d8a97d8f60	BUG/MINOR: illegal use of the malloc_trim() function if jemalloc is used In the event that HAProxy is linked with the jemalloc library, it is still shown that malloc_trim() is enabled when executing "haproxy -vv": .. Support for malloc_trim() is enabled. .. It's not so much a problem as it is that malloc_trim() is called in the pat_ref_purge_range() function without any checking. This was solved by setting the using_default_allocator variable to the correct value in the detect_allocator() function and before calling malloc_trim() it is checked whether the function should be called.	2023-03-22 14:14:50 +01:00
Willy Tarreau	51d38a26fe	BUG/MEDIUM: pattern: only visit equivalent nodes when skipping versions Miroslav reported in issue #1802 a problem that affects atomic map/acl updates. During an update, incorrect versions are properly skipped, but in order to do so, we rely on ebmb_next() instead of ebmb_next_dup(). This means that if a new matching entry is in the process of being added and is the first one to succeed in the lookup, we'll skip it due to its version and use the next entry regardless of its value provided that it has the correct version. For IP addresses and string prefixes it's particularly visible because a lookup may match a new longer prefix that's not yet committed (e.g. 11.0.0.1 would match 11/8 when 10/7 was the only committed one), and skipping it could end up on 12/8 for example. As soon as a commit for the last version happens, the issue disappears. This problem only affects tree-based matches: the "str", "ip", and "beg" matches. Here we replace the ebmb_next() values with ebmb_next_dup() for exact string matches, and with ebmb_lookup_shorter() for longest matches, which will first visit duplicates, then look for shorter prefixes. This relies on previous commit: MINOR: ebtree: add ebmb_lookup_shorter() to pursue lookups Both need to be backported to 2.4, where the generation ID was added. Note that nowadays a simpler and more efficient approach might be employed, by having a single version in the current tree, and a list of trees per version. Manipulations would look up the tree version and work (and lock) only in the relevant trees, while normal operations would be performed on the current tree only. Committing would just be a matter of swapping tree roots and deleting old trees contents.	2022-08-01 11:59:46 +02:00
Tim Duesterhus	d5fc8fcb86	CLEANUP: Add haproxy/xxhash.h to avoid modifying import/xxhash.h This solves setting XXH_INLINE_ALL in a cleaner way, because the imported header is not modified, easing future updates. see 6f7cc11e6dd0f01b437fba893da2edd2362660a2	2021-09-11 19:58:45 +02:00
Dragan Dosen	a75eea78e2	MINOR: map/acl: print the count of all the map/acl entries in "show map/acl" The output of "show map/acl" now contains the 'entry_cnt' value that represents the count of all the entries for each map/acl, not just the active ones, which means that it also includes entries currently being added.	2021-05-25 08:44:45 +02:00
Willy Tarreau	da7f11bfb5	CLEANUP: pattern: remove the unused and dangerous pat_ref_reload() This function was not used anymore after the atomic updates were implemented in 2.3, and it must not be used given that it does not yield and can easily make the process hang for tens of seconds on large acls/maps. Let's remove it before someone uses it as an example to implement something else!	2021-05-11 16:49:55 +02:00
Willy Tarreau	a13afe6535	MINOR: pattern: support purging arbitrary ranges of generations Instead of being able to purge only values older than a specific value, let's support arbitrary ranges and make pat_ref_purge_older() just be one special case of this one.	2021-04-30 15:36:31 +02:00
Willy Tarreau	2b71810cb3	CLEANUP: lists/tree-wide: rename some list operations to avoid some confusion The current "ADD" vs "ADDQ" is confusing because when thinking in terms of appending at the end of a list, "ADD" naturally comes to mind, but here it does the opposite, it inserts. Several times already it's been incorrectly used where ADDQ was expected, the latest of which was a fortunate accident explained in 6fa922562 ("CLEANUP: stream: explain why we queue the stream at the head of the server list"). Let's use more explicit (but slightly longer) names now: LIST_ADD -> LIST_INSERT LIST_ADDQ -> LIST_APPEND LIST_ADDED -> LIST_INLIST LIST_DEL -> LIST_DELETE The same is true for MT_LISTs, including their "TRY" variant. LIST_DEL_INIT keeps its short name to encourage to use it instead of the lazier LIST_DELETE which is often less safe. The change is large (~674 non-comment entries) but is mechanical enough to remain safe. No permutation was performed, so any out-of-tree code can easily map older names to new ones. The list doc was updated.	2021-04-21 09:20:17 +02:00
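The insert-versus-append distinction behind the rename in 2b71810cb3 can be illustrated with a minimal circular doubly-linked list. This is a sketch only: `sk_list`, `sk_list_insert()` and `sk_list_append()` are hypothetical stand-ins and not HAProxy's actual list macros.

```c
#include <stddef.h>

/* Hypothetical minimal circular list node: next and previous pointers. */
struct sk_list {
    struct sk_list *n;
    struct sk_list *p;
};

/* An empty list is a head pointing to itself in both directions. */
static inline void sk_list_init(struct sk_list *head)
{
    head->n = head->p = head;
}

/* Behaviour named LIST_INSERT after the rename (formerly LIST_ADD):
 * the element becomes the first one, right after the head.
 */
static inline void sk_list_insert(struct sk_list *head, struct sk_list *el)
{
    el->n = head->n;
    el->p = head;
    head->n->p = el;
    head->n = el;
}

/* Behaviour named LIST_APPEND after the rename (formerly LIST_ADDQ):
 * the element becomes the last one, right before the head.
 */
static inline void sk_list_append(struct sk_list *head, struct sk_list *el)
{
    el->p = head->p;
    el->n = head;
    head->p->n = el;
    head->p = el;
}
```

Since the commit states that no permutation was performed, out-of-tree code only has to swap names, not argument order.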
Willy Tarreau	295a89c029	MINOR: pattern: make the pat_lru_seed read_mostly This seed is created once at boot and is used in every LRU hash when caching results. Let's mark it read_mostly.	2021-04-10 19:27:41 +02:00
Willy Tarreau	9057a0026e	CLEANUP: pattern: make all pattern tables read-only Interestingly, all arrays used to declare patterns were read-write while only hard-coded. Let's mark them const so that they move from data to rodata and don't risk to experience false sharing.	2021-04-10 17:49:41 +02:00
Willy Tarreau	61cfdf4fd8	CLEANUP: tree-wide: replace free(x);x=NULL with ha_free(&x) This makes the code more readable and less prone to copy-paste errors. In addition, it allows to place some __builtin_constant_p() predicates to trigger a link-time error in case the compiler knows that the freed area is constant. It will also produce compile-time error if trying to free something that is not a regular pointer (e.g. a function). The DEBUG_MEM_STATS macro now also defines an instance for ha_free() so that all these calls can be checked. 178 occurrences were converted. The vast majority of them were handled by the following Coccinelle script, some slightly refined to better deal with "&*x" or with long lines: @ rule @ expression E; @@ - free(E); - E = NULL; + ha_free(&E); It was verified that the resulting code is the same, more or less a handful of cases where the compiler optimized slightly differently the temporary variable that holds the copy of the pointer. A non-negligible amount of {free(str);str=NULL;str_len=0;} are still present in the config part (mostly header names in proxies). These ones should also be cleaned for the same reasons, and probably be turned into ist strings.	2021-02-26 21:21:09 +01:00
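The free-and-reset idiom that 61cfdf4fd8 folds into one call can be sketched as follows. This is a simplified, hypothetical stand-in named `sk_ha_free()`, not HAProxy's real `ha_free()`: it omits the `__builtin_constant_p()` checks and the DEBUG_MEM_STATS hook mentioned in the commit message, and it assumes the GCC/Clang `__typeof__` extension.

```c
#include <stdlib.h>

/* Hypothetical simplified stand-in for the idiom: takes the ADDRESS of a
 * pointer, frees what it points to and resets the pointer to NULL.
 * The argument is evaluated only once thanks to the local copy.
 */
#define sk_ha_free(pptr) do {                \
        __typeof__(pptr) sk_p = (pptr);      \
        free(*sk_p);                         \
        *sk_p = NULL;                        \
    } while (0)
```

A call such as `sk_ha_free(&buf);` replaces the two statements `free(buf); buf = NULL;`. Calling it again on the same pointer is harmless because `free(NULL)` is a no-op.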
Willy Tarreau	dc2410d093	CLEANUP: pattern: rename pat_ref_commit() to pat_ref_commit_elt() It's about the third time I get confused by these functions, half of which manipulate the reference as a whole and those manipulating only an entry. For me "pat_ref_commit" means committing the pattern reference, not just an element, so let's rename it. A number of other ones should really be renamed before 2.4 gets released :-/	2021-01-15 14:11:59 +01:00
Thayne McCombs	8f0cc5c4ba	CLEANUP: Fix spelling errors in comments This is from the output of codespell. It's done at once over a bunch of files and only affects comments, so there is nothing user-visible. No backport needed.	2021-01-08 14:56:32 +01:00
Dragan Dosen	967e7e79af	MEDIUM: xxhash: use the XXH3 functions to generate 64-bit hashes Replace the XXH64() function calls with the XXH3 variant function XXH3_64bits_withSeed() where possible.	2020-12-23 06:39:21 +01:00
Thierry Fournier	a68affeaa9	BUG/MINOR: pattern: a sample marked as const could be written The functions add final 0 to string if the final 0 is not set, but don't check the flag CONST. This patch duplicates the strings if the final zero is not set and the string is CONST. Should be backported until 2.2 (at least)	2020-11-11 10:43:15 +01:00
Willy Tarreau	38d41996c1	MEDIUM: pattern: turn the pattern chaining to single-linked list It does not require heavy deletion from the expr anymore, so we can now turn this to a single-linked list since most of the time we want to delete all instances of a given pattern from the head. By doing so we save 32 bytes of memory per pattern. The pat_unlink_from_head() function was adjusted accordingly.	2020-11-05 19:27:09 +01:00
Willy Tarreau	867a8a5a10	MINOR: pattern: prepare removal of a pattern from the list head Instead of using LIST_DEL() on the pattern itself inside an expression, we look it up from its head. The goal is to get rid of the double-linked list while this usage remains exclusively for freeing on startup error!	2020-11-05 19:27:09 +01:00
Willy Tarreau	2817472bb0	MINOR: pattern: during reload, delete elements from the ref, not the expression Instead of scanning all elements from the expression and using the slow delete path there, let's use the faster way which involves pat_delete_gen() while the elements are detached from their reference.	2020-11-05 19:27:09 +01:00

225 Commits