haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-19 13:41:27 +02:00

Author	SHA1	Message	Date
Willy Tarreau	7867cebf31	BUG/MAJOR: queue: set SF_ASSIGNED when setting strm->target on dequeue Commit 82cd5c13a ("OPTIM: backend: skip LB when we know the backend is full") has uncovered a long-buried bug in the dequeuing code: when a server releases a connection, it picks a new one from the proxy's or its queue. Technically speaking it only picks a pendconn which is a link between a position in the queue and a stream. It then sets this pendconn's target to itself, and wakes up the stream's task so that it can try to connect again. The stream then goes through the regular connection setup phases, calls back_try_conn_req() which calls pendconn_dequeue(), which sets the stream's target to the pendconn's and releases the pendconn. It then reaches assign_server() which sees no SF_ASSIGNED and calls assign_server_and_queue() to perform load balancing or queuing. This one first destroys the stream's target and gets ready to perform load balancing. At this point we're load-balancing for no reason since we already knew what server was available. And this is where the commit above comes into play: the check for the backend's queue above may detect other connections that arrived in between, and will immediately return FULL, forcing this request back into the queue. If the server had a very low maxconn (e.g. 1 due to a long slowstart), it's possible that this evicted connection was the last one on the server and that no other one will ever be present to process the queue. Usually a regularly processed request will still have its own srv_conn that will be used during stream_free() to dequeue other connections. But if the server had a down-up cycle, then a call to pendconn_grab_from_px() may start to dequeue entries which had no srv_conn and which will have no server slot to offer when they expire, thus maintaining the situation above forever. Worse, as new requests arrive, there are always some requests in the queue and the situation feeds on itself.
The correct fix here is to properly set SF_ASSIGNED in pendconn_dequeue() when the stream's target is assigned (as it's what this flag means), so as to avoid a load-balancing pass when dequeuing. Many thanks to Pierre Cheynier for the numerous detailed traces he provided that helped narrow this problem down. This could be backported to all stable versions, but in practice only 2.3 and above are really affected since the presence of the commit above. Given how tricky this code is it's better to limit it to those versions that really need it.	2021-06-16 09:05:35 +02:00
Willy Tarreau	c1a689f2eb	BUILD: queue: include tools.h from queue.c It uses memprintf() without including the file because it inherited it from other ones.	2021-05-08 13:59:05 +02:00
Willy Tarreau	4781b1521a	CLEANUP: atomic/tree-wide: replace single increments/decrements with inc/dec This patch replaces roughly all occurrences of an HA_ATOMIC_ADD(&foo, 1) or HA_ATOMIC_SUB(&foo, 1) with the equivalent HA_ATOMIC_INC(&foo) and HA_ATOMIC_DEC(&foo) respectively. These are 507 changes over 45 files.	2021-04-07 18:18:37 +02:00
Willy Tarreau	1db427399c	CLEANUP: atomic: add an explicit _FETCH variant for add/sub/and/or Currently our atomic ops return a value but it's never known whether the fetch is done before or after the operation, which causes some confusion each time the value is desired. Let's create an explicit variant of these operations suffixed with _FETCH to explicitly mention that the fetch occurs after the operation, and make use of it at the few call places.	2021-04-07 18:18:37 +02:00
Willy Tarreau	59b0fecfd9	MINOR: lb/api: let callers of take_conn/drop_conn tell if they have the lock The two algos defining these functions (first and leastconn) do not need the server's lock. However it's already present in pendconn_process_next_strm() so the API must be updated so that the functions may take it if needed and that the callers indicate whether they already own it. As such, the call places (backend.c and stream.c) now do not take it anymore, queue.c was unchanged since it's already held, and both "first" and "leastconn" were updated to take it if not already held. A quick test on the "first" algo showed a jump from 432 to 565k rps by just dropping the lock in stream.c!	2021-02-18 10:06:45 +01:00
Willy Tarreau	751153e0f1	OPTIM: server: switch the actconn list to an mt-list The remaining contention on the server lock solely comes from sess_change_server() which takes the lock to add and remove a stream from the server's actconn list. This is both expensive and pointless since we have mt-lists, and this list is only used by the CLI's "shutdown server sessions" command! Let's migrate to an mt-list and remove the need for this costly lock. By doing so, the request rate increased by ~1.8%.	2021-02-18 10:06:45 +01:00
Christopher Faulet	cd7126b396	CLEANUP: queue: Remove useless tests on p or pp in pendconn_process_next_strm() This patch removes unnecessary tests on p or pp pointers in the pendconn_process_next_strm() function. This should make cppcheck happy and avoid false reports of null pointer dereference. This patch should fix the issue #1036.	2021-02-11 11:48:36 +01:00
Willy Tarreau	5472aa50f1	BUG/MEDIUM: queue: fix unsafe proxy pointer when counting nbpend As reported by Coverity in issue #917, commit 96bca33 ("OPTIM: queue: decrement the nbpend and totpend counters outside of the lock") introduced a bug when moving the increments outside of the loop, because we can't always rely on the pendconn "p" here as it may be null. We can retrieve the proxy pointer directly from s->proxy instead. The same is true for pendconn_redistribute(), though the last "p" pointer there was still valid. This patch fixes both. No backport is needed, this was introduced just before 2.3-dev8.	2020-10-24 12:57:41 +02:00
Willy Tarreau	96bca33d75	OPTIM: queue: decrement the nbpend and totpend counters outside of the lock We don't need to do that inside the lock. However since the operation used to be done in deep functions, we have to make it resurface closer to visible parts. It remains reasonably self-contained in queue.c so that's not that big of a deal. Some places (redistribute) could benefit from a single operation for all counts at once. Others like pendconn_process_next_strm() are still called with both locks held but now it will be possible to change this.	2020-10-22 17:32:28 +02:00
Willy Tarreau	56c1cfb179	OPTIM: queue: make the nbpend counters atomic Instead of incrementing, decrementing them and updating their max under the lock, make them atomic and keep them out of the lock as much as possible. For __pendconn_unlink_* it would be wise to decide to move these counters outside of the function, inside the callers so that a single atomic op can be done per counter even for groups of operations.	2020-10-22 17:32:28 +02:00
Willy Tarreau	c7eedf7a5a	MINOR: queue: reduce the locked area in pendconn_add() Similarly to previous changes, we know if we're dealing with a server or proxy lock so let's directly lock at the finest possible places there. It's worth noting that a part of the operation consisting in an increment and update of a max could be done outside of the lock using atomic ops and a CAS.	2020-10-22 17:32:28 +02:00
Willy Tarreau	3e3ae2524d	MINOR: queue: split __pendconn_unlink() in per-srv and per-prx The function is called with the lock held and does too many tests for things that are already known from its callers. Let's split it in two so that its callers call either the per-server or per-proxy function depending on where the element is (since they had to determine it prior to taking the lock).	2020-10-22 17:32:28 +02:00
Willy Tarreau	ac66d6bafb	MINOR: proxy; replace the spinlock with an rwlock This is an anticipation of finer grained locking for the queues. For now all lock places take a write lock so that there is no difference at all with previous code.	2020-10-22 17:32:28 +02:00
Willy Tarreau	ef71f0194c	BUG/MINOR: queue: properly report redistributed connections In commit 5cd4bbd7a ("BUG/MAJOR: threads/queue: Fix thread-safety issues on the queues management") the counter of transferred connections was accidentally lost, so that when a server goes down with connections in its queue, it will always be reported that 0 connections were transferred. This should be backported as far as 1.8 since the patch above was backported there.	2020-10-21 12:04:53 +02:00
Willy Tarreau	b2551057af	CLEANUP: include: tree-wide alphabetical sort of include files This patch fixes all the leftovers from the include cleanup campaign. There were not that many (~400 entries in ~150 files) but it was definitely worth doing it as it revealed a few duplicates.	2020-06-11 10:18:59 +02:00
Willy Tarreau	dfd3de8826	REORG: include: move stream.h to haproxy/stream{,-t}.h This one was not easy because it was embarking many includes with it, which other files would automatically find. At least global.h, arg.h and tools.h were identified. 93 total locations were identified, 8 additional includes had to be added. In the rare files where it was possible to finalize the sorting of includes by adjusting only one or two extra lines, it was done. But all files would need to be rechecked and cleaned up now. It was the last set of files in types/ and proto/ and these directories must not be reused anymore.	2020-06-11 10:18:58 +02:00
Willy Tarreau	1e56f92693	REORG: include: move server.h to haproxy/server{,-t}.h extern struct dict server_name_dict was moved from the type file to the main file. A handful of inlined functions were moved at the bottom of the file. Call places were updated to use server-t.h when relevant, or to simply drop the entry when not needed.	2020-06-11 10:18:58 +02:00
Willy Tarreau	a55c45470f	REORG: include: move queue.h to haproxy/queue{,-t}.h Nothing outstanding here. A number of call places were not justified and removed.	2020-06-11 10:18:58 +02:00
Willy Tarreau	4980160ecc	REORG: include: move backend.h to haproxy/backend{,-t}.h The files remained mostly unchanged since they were OK. However, half of the users didn't need to include them, and about as many actually needed to have it and used to find functions like srv_currently_usable() through a long chain that broke when moving the file.	2020-06-11 10:18:58 +02:00
Willy Tarreau	c2b1ff04e5	REORG: include: move http_ana.h to haproxy/http_ana{,-t}.h It was moved without any change, however many callers didn't need it at all. This was a consequence of the split of proto_http.c into several parts that resulted in many locations to still reference it.	2020-06-11 10:18:58 +02:00
Willy Tarreau	5e539c9b8d	REORG: include: move stream_interface.h to haproxy/stream_interface{,-t}.h Almost no changes, removed stdlib and added buf-t and connection-t to the types to avoid a warning.	2020-06-11 10:18:58 +02:00
Willy Tarreau	8b550afe1e	REORG: include: move tcp_rules.h to haproxy/tcp_rules.h There's no type file on this one which is pretty simple.	2020-06-11 10:18:58 +02:00
Willy Tarreau	cea0e1bb19	REORG: include: move task.h to haproxy/task{,-t}.h The TASK_IS_TASKLET() macro was moved to the proto file instead of the type one. The proto part was a bit reordered to remove a number of ugly forward declarations of static inline functions. About ten C and H files had their dependency dropped since they were not using anything from task.h.	2020-06-11 10:18:58 +02:00
Willy Tarreau	e6ce10be85	REORG: include: move sample.h to haproxy/sample{,-t}.h This one is particularly tricky to move because everyone uses it and it depends on a lot of other types. For example it cannot include arg-t.h and must absolutely only rely on forward declarations to avoid dependency loops between vars -> sample_data -> arg. In order to address this one, it would be nice to split the sample_data part out of sample.h.	2020-06-11 10:18:58 +02:00
Willy Tarreau	c761f843da	REORG: include: move http_rules.h to haproxy/http_rules.h There was no include file. This one still includes types/proxy.h.	2020-06-11 10:18:57 +02:00
Willy Tarreau	d0ef439699	REORG: include: move common/memory.h to haproxy/pool.h Now the file is ready to be stored into its final destination. A few minor reorderings were performed to keep the file properly organized, making the various sections more visible (cache & lockless). In addition and to stay consistent, memory.c was renamed to pool.c.	2020-06-11 10:18:57 +02:00
Willy Tarreau	92b4f1372e	REORG: include: move time.h from common/ to haproxy/ This one is included almost everywhere and used to rely on a few other .h that are not needed (unistd, stdlib, standard.h). It could possibly make sense to split it into multiple parts to distinguish operations performed on timers and the internal time accounting, but at this point it does not appear much important.	2020-06-11 10:18:56 +02:00
Willy Tarreau	3f567e4949	REORG: include: split hathreads into haproxy/thread.h and haproxy/thread-t.h This splits the hathreads.h file into types+macros and functions. Given that most users of this file used to include it only to get the definition of THREAD_LOCAL and MAXTHREADS, the bare minimum was placed into thread-t.h (i.e. types and macros). All the thread management was left to haproxy/thread.h. It's worth noting the drop of the trailing "s" in the name, to remove the permanent confusion that arises between this one and the system implementation (no "s") and the makefile's option (no "s"). For consistency, src/hathreads.c was also renamed thread.c. A number of files were updated to only include thread-t which is the one they really needed. Some future improvements are possible like replacing empty inlined functions with macros for the thread-less case, as building at -O0 disables inlining and causes these ones to be emitted. But this really is cosmetic.	2020-06-11 10:18:56 +02:00
Willy Tarreau	4c7e4b7738	REORG: include: update all files to use haproxy/api.h or api-t.h if needed All files that were including one of the following include files have been updated to only include haproxy/api.h or haproxy/api-t.h once instead: - common/config.h - common/compat.h - common/compiler.h - common/defaults.h - common/initcall.h - common/tools.h The choice is simple: if the file only requires type definitions, it includes api-t.h, otherwise it includes the full api.h. In addition, in these files, explicit includes for inttypes.h and limits.h were dropped since these are now covered by api.h and api-t.h. No other change was performed, given that this patch is large and affects 201 files. At least one (tools.h) was already freestanding and didn't get the new one added.	2020-06-11 10:18:42 +02:00
Willy Tarreau	8d2b777fe3	REORG: ebtree: move the include files from ebtree to include/import/ This is where other imported components are located. All files which used to directly include ebtree were touched to update their include path so that "import/" is now prefixed before the ebtree-related files. The ebtree.h file was slightly adjusted to read compiler.h from the common/ subdirectory (this is the only change). A build issue was encountered when eb32sctree.h is loaded before eb32tree.h because only the former checks for the latter before defining type u32. This was addressed by adding the reverse ifdef in eb32tree.h. No further cleanup was done yet in order to keep changes minimal.	2020-06-11 09:31:11 +02:00
Willy Tarreau	e3b57bf92f	MINOR: sample: make sample_parse_expr() able to return an end pointer When an end pointer is passed, instead of complaining that a comma is missing after a keyword, sample_parse_expr() will silently return the pointer to the current location into this return pointer so that the caller can continue its parsing. This will be used by more complex expressions which embed sample expressions, and may even permit to embed sample expressions into arguments of other expressions.	2020-02-14 19:02:06 +01:00
Willy Tarreau	9ada030697	BUG/MINOR: queue/threads: make the queue unlinking atomic There is a very short race in the queues which happens in the following situation: - stream A on thread 1 is being processed by a server - stream B on thread 2 waits in the backend queue for a server - stream B on thread 2 is fed up with waiting and expires, calls stream_free() which calls pendconn_free(), which sees the stream attached - at the exact same instant, stream A finishes on thread 1, sees one stream is waiting (B), detaches it and wakes it up - stream B continues pendconn_free() and calls pendconn_unlink() - pendconn_unlink() now detaches the node again and performs a second deletion (harmless since idempotent), and decrements srv/px->nbpend again => the number of connections on the proxy or server may reach -1 if/when this race occurs. It is extremely tight as it can only occur during the test on p->leaf_p though it has been witnessed at least once. The solution consists in testing leaf_p again once the lock is held to make sure the element was not removed in the mean time. This should be backported to 2.0 and 1.9, probably even 1.8.	2019-11-14 14:58:39 +01:00
Willy Tarreau	5e83d996cf	BUG/MAJOR: queue/threads: avoid an AB/BA locking issue in process_srv_queue() A problem involving server slowstart was reported by @max2k1 in issue #197. The problem is that pendconn_grab_from_px() takes the proxy lock while already under the server's lock while process_srv_queue() first takes the proxy's lock then the server's lock. While the latter seems more natural, it is fundamentally incompatible with many other operations performed on servers, namely state change propagation, where the proxy is only known after the server and cannot be locked around the servers. However reversing the lock in process_srv_queue() is trivial and only the few functions related to dynamic cookies need to be adjusted for this so that the proxy's lock is taken for each server operation. This is possible because the proxy's server list is built once at boot time and remains stable. So this is what this patch does. The comments in the proxy and server structs were updated to mention this rule that the server's lock may not be taken under the proxy's lock but may enclose it. Another approach could consist in using a second lock for the proxy's queue which would be different from the regular proxy's lock, but given that the operations above are rare and operate on small server lists, there is no reason for overdesigning a solution. This fix was successfully tested with 10000 servers in a backend where adjusting the dyncookies in loops over the CLI didn't have a measurable impact on the traffic. The only workaround without the fix is to disable any occurrence of "slowstart" on server lines, or to disable threads using "nbthread 1". This must be backported as far as 1.8.	2019-07-30 14:02:06 +02:00
Christopher Faulet	fc9cfe4006	REORG: proto_htx: Move HTX analyzers & co to http_ana.{c,h} files The old module proto_http does not exist anymore. All code dedicated to the HTTP analysis is now grouped in the file proto_htx.c. So, to finish the polishing after removing the legacy HTTP code, proto_htx.{c,h} files have been moved in http_ana.{c,h} files. In addition, all HTX analyzers and related functions prefixed with "htx_" have been renamed to start with "http_" instead.	2019-07-19 09:24:12 +02:00
Willy Tarreau	bff005ae58	BUG/MEDIUM: queue: fix the tree walk in pendconn_redistribute. In pendconn_redistribute() we scan the queue using eb32_next() on the node we've just deleted, which is wrong since the node is not in the tree anymore, and it could dereference one node that has already been released by another thread. Note that we cannot use eb32_first() in the loop here instead because we need to skip pendconns having SF_FORCE_PRST. Instead, let's keep a copy of the next node before deleting it. In addition, the pendconn retrieved there is wrong, it uses &node as the pointer instead of node, resulting in very quick crashes when the server list is scanned. Fortunately this only happens when "option redispatch" is used in conjunction with "maxconn" on server lines, "cookie" for the stickiness, and when a server goes down with entries in its queue. This bug was introduced by commit 0355dabd7 ("MINOR: queue: replace the linked list with a tree") so the fix must be backported to 1.9.	2019-05-27 10:29:59 +02:00
Olivier Houchard	b4df492d01	MEDIUM: queues: Use the new _HA_ATOMIC_* macros. Use the new _HA_ATOMIC_* macros and add barriers where needed.	2019-03-11 17:02:38 +01:00
Joseph Herlant	d8499ecb6e	CLEANUP: Fix a typo in the queue subsystem Fixes a typo in the code comments of the queue subsystem.	2018-12-02 18:40:11 +01:00
Willy Tarreau	8ceae72d44	MEDIUM: init: use initcall for all fixed size pool creations This commit replaces the explicit pool creations that are made in constructors with a pool registration. Not only does this simplify the pools declaration (it can be done on a single line after the head is declared), but it also removes references to pools from within constructors. The only remaining create_pool() calls are those performed in init functions after the config is parsed, so there are no more users of potentially uninitialized pools now. It has been the opportunity to remove no less than 12 constructors and 6 init functions.	2018-11-26 19:50:32 +01:00
Willy Tarreau	0108d90c6c	MEDIUM: init: convert all trivial registration calls to initcalls This switches explicit calls to various trivial registration methods for keywords, muxes or protocols from constructors to INITCALL1 at stage STG_REGISTER. All these calls have in common to consume a single pointer and return void. Doing this removes 26 constructors. The following calls were addressed : - acl_register_keywords - bind_register_keywords - cfg_register_keywords - cli_register_kw - flt_register_keywords - http_req_keywords_register - http_res_keywords_register - protocol_register - register_mux_proto - sample_register_convs - sample_register_fetches - srv_register_keywords - tcp_req_conn_keywords_register - tcp_req_cont_keywords_register - tcp_req_sess_keywords_register - tcp_res_cont_keywords_register - flt_register_keywords	2018-11-26 19:50:32 +01:00
Willy Tarreau	61c112aa5b	REORG: http: move HTTP rules parsing to http_rules.c These ones are mostly called from cfgparse.c for the parsing and do not depend on the HTTP representation. The functions' prototypes were moved to proto/http_rules.h, making this file work exactly like tcp_rules. Ideally we should stop calling these functions directly from cfgparse and register keywords, but there are a few cases where that wouldn't work (stats http-request) so it's probably not worth trying to go this far.	2018-10-02 18:28:05 +02:00
Willy Tarreau	deca26c452	BUG/MAJOR: queue/threads: make pendconn_redistribute not lock the server Since commit 3ff577e ("MAJOR: server: make server state changes synchronous again"), srv_update_status() is called with the server lock held. It calls (among others) pendconn_redistribute() which used to take this lock, causing CPU loops by default, or crashes if built with -DDEBUG_THREAD. Since this function is not called from any other place anymore, it doesn't require the lock on its own so let's simply drop it from there. No backport is needed, this is 1.9-specific.	2018-08-21 18:11:03 +02:00
Patrick Hemmer	248cb4c503	MEDIUM: queue: adjust position based on priority-class and priority-offset The priority values are used when connections are queued to determine which connections should be served first. The lowest priority class is served first. When multiple requests from the same class are found, the earliest (according to queue_time + offset) is served first. The queue offsets can span over roughly 17 minutes after which the offsets will wrap around. This allows up to 8 minutes spent in the queue with no reordering.	2018-08-10 15:06:48 +02:00
Patrick Hemmer	268a707a3d	MEDIUM: add set-priority-class and set-priority-offset This adds the set-priority-class and set-priority-offset actions to http-request and tcp-request content. At this point they are not used yet, which is the purpose of the next commit, but all the logic to set and clear the values is there.	2018-08-10 15:06:31 +02:00
Patrick Hemmer	0355dabd7c	MINOR: queue: replace the linked list with a tree We'll need trees to manage the queues by priorities. This change replaces the list with a tree based on a single key. It's effectively a list but allows us to get rid of the list management right now.	2018-08-10 15:06:27 +02:00
Patrick Hemmer	da282f4a8f	MINOR: queue: store the queue index in the stream when enqueuing We store the queue index in the stream and check it on dequeueing to figure how many entries were processed in between. This way we'll be able to count the elements that may later be added before ours.	2018-08-10 15:06:25 +02:00
Patrick Hemmer	ffe5e8c638	MINOR: stream: rename {srv,prx}_queue_size to *_queue_pos The current name is misleading as it implies a queue size, but the value instead indicates a position in the queue. The value is only the queue size at the exact moment the element is enqueued. Soon we will gain the ability to insert anywhere into the queue, upon which clarity of the name is more important.	2018-08-10 15:04:14 +02:00
Willy Tarreau	a8694654ba	BUG/MEDIUM: queue: prevent a backup server from draining the proxy's connections When switching back from a backup to an active server, the backup server currently continues to drain the proxy's connections, which is a problem because it's not expected to be able to pick them. This patch ensures that a backup server will only pick backend connections if there is no active server and it is the selected backup server or all backup servers are supposed to be used. This issue seems to have existed forever, so this fix should be backported to all stable versions.	2018-08-07 10:52:01 +02:00
Olivier Houchard	ecfe673f61	MINOR: threads/queue: Get rid of THREAD_WANT_SYNC in the queue code. Now that we can wake one thread sleeping in the poller, we don't have to use THREAD_WANT_SYNC any more. This gives a significant performance boost on highly contended accesses (servers with maxconn 1), showing a jump from 21k to 31k conn/s on a test involving 8 threads.	2018-07-26 20:55:02 +02:00
Willy Tarreau	3201e4e428	MEDIUM: queue: get rid of the pendconn lock This lock was necessary to manipulate the pendconn element between concurrent places, but was causing great difficulties in the list walk by having to iterate over multiple entries instead of being able to safely pick the first one (in fact the first element was always the right one but the locking model was hard to prove). Here since we know we can always rely on the queue's locks, we take the queue's lock every time we need to modify the element. In practice it was already the case everywhere except in pendconn_dequeue() which only works on an element that was already detached. This function had to be protected against the risk of meeting an incompletely detached element (which could be unlinked but not yet assigned). By taking the queue lock around the LIST_ISEMPTY test, it's enough to ensure that a concurrent thread either didn't begin or had completed the operation. The true benefit really is in pendconn_process_next_strm() where we can again safely work with the first element of each queue. This will significantly simplify next updates to this code.	2018-07-26 17:32:51 +02:00
Willy Tarreau	7c6f8a2b0d	MINOR: queue: implement pendconn queue locking functions The new pendconn_queue_lock() and pendconn_queue_unlock() functions are made to make it more convenient to lock or unlock the pendconn queue either at the proxy or the server depending on pendconn->srv. This way it is possible to remove the open-coding of these locks at various places. These ones have been used in pendconn_unlink() and pendconn_add(), thus significantly simplifying the logic there.	2018-07-26 17:32:51 +02:00

106 Commits