haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-08 08:07:10 +02:00

Author	SHA1	Message	Date
Willy Tarreau	901972e261	MINOR: queue: update the stream's pend_pos before queuing it Since commit `c7eedf7a5` ("MINOR: queue: reduce the locked area in pendconn_add()") the stream's pend_pos is set out of the lock, after the pendconn is queued. While this entry is only manipulated by the stream itself and there is no bug caused by this right now, it's a bit dangerous because another thread could decide to look at this field during dequeuing and could randomly see something else. Also in case of crashes, memory inspection wouldn't be as trustable. Let's assign the pendconn before it can be found in the queue.	2021-06-18 18:21:18 +02:00
Willy Tarreau	7867cebf31	BUG/MAJOR: queue: set SF_ASSIGNED when setting strm->target on dequeue Commit `82cd5c13a` ("OPTIM: backend: skip LB when we know the backend is full") has uncovered a long-buried bug in the dequeuing code: when a server releases a connection, it picks a new one from the proxy's or its queue. Technically speaking it only picks a pendconn which is a link between a position in the queue and a stream. It then sets this pendconn's target to itself, and wakes up the stream's task so that it can try to connect again. The stream then goes through the regular connection setup phases, calls back_try_conn_req() which calls pendconn_dequeue(), which sets the stream's target to the pendconn's and releases the pendconn. It then reaches assign_server() which sees no SF_ASSIGNED and calls assign_server_and_queue() to perform load balancing or queuing. This one first destroys the stream's target and gets ready to perform load balancing. At this point we're load-balancing for no reason since we already knew what server was available. And this is where the commit above comes into play: the check for the backend's queue above may detect other connections that arrived in between, and will immediately return FULL, forcing this request back into the queue. If the server had a very low maxconn (e.g. 1 due to a long slowstart), it's possible that this evicted connection was the last one on the server and that no other one will ever be present to process the queue. Usually a regularly processed request will still have its own srv_conn that will be used during stream_free() to dequeue other connections. But if the server had a down-up cycle, then a call to pendconn_grab_from_px() may start to dequeue entries which had no srv_conn and which will have no server slot to offer when they expire, thus maintaining the situation above forever. Worse, as new requests arrive, there are always some requests in the queue and the situation feeds on itself.
The correct fix here is to properly set SF_ASSIGNED in pendconn_dequeue() when the stream's target is assigned (as it's what this flag means), so as to avoid a load-balancing pass when dequeuing. Many thanks to Pierre Cheynier for the numerous detailed traces he provided that helped narrow this problem down. This could be backported to all stable versions, but in practice only 2.3 and above are really affected since the presence of the commit above. Given how tricky this code is it's better to limit it to those versions that really need it.	2021-06-16 09:05:35 +02:00
Willy Tarreau	c1a689f2eb	BUILD: queue: include tools.h from queue.c It uses memprintf() without including the file because it inherited it from other ones.	2021-05-08 13:59:05 +02:00
Willy Tarreau	4781b1521a	CLEANUP: atomic/tree-wide: replace single increments/decrements with inc/dec This patch replaces roughly all occurrences of an HA_ATOMIC_ADD(&foo, 1) or HA_ATOMIC_SUB(&foo, 1) with the equivalent HA_ATOMIC_INC(&foo) and HA_ATOMIC_DEC(&foo) respectively. These are 507 changes over 45 files.	2021-04-07 18:18:37 +02:00
Willy Tarreau	1db427399c	CLEANUP: atomic: add an explicit _FETCH variant for add/sub/and/or Currently our atomic ops return a value but it's never known whether the fetch is done before or after the operation, which causes some confusion each time the value is desired. Let's create an explicit variant of these operations suffixed with _FETCH to explicitly mention that the fetch occurs after the operation, and make use of it at the few call places.	2021-04-07 18:18:37 +02:00
Willy Tarreau	59b0fecfd9	MINOR: lb/api: let callers of take_conn/drop_conn tell if they have the lock The two algos defining these functions (first and leastconn) do not need the server's lock. However it's already present in pendconn_process_next_strm() so the API must be updated so that the functions may take it if needed and that the callers indicate whether they already own it. As such, the call places (backend.c and stream.c) now do not take it anymore, queue.c was unchanged since it's already held, and both "first" and "leastconn" were updated to take it if not already held. A quick test on the "first" algo showed a jump from 432 to 565k rps by just dropping the lock in stream.c!	2021-02-18 10:06:45 +01:00
Willy Tarreau	751153e0f1	OPTIM: server: switch the actconn list to an mt-list The remaining contention on the server lock solely comes from sess_change_server() which takes the lock to add and remove a stream from the server's actconn list. This is both expensive and pointless since we have mt-lists, and this list is only used by the CLI's "shutdown server sessions" command! Let's migrate to an mt-list and remove the need for this costly lock. By doing so, the request rate increased by ~1.8%.	2021-02-18 10:06:45 +01:00
Christopher Faulet	cd7126b396	CLEANUP: queue: Remove useless tests on p or pp in pendconn_process_next_strm() This patch removes unnecessary tests on the p or pp pointers in the pendconn_process_next_strm() function. This should make cppcheck happy and avoid false reports of null pointer dereferences. This patch should fix issue #1036.	2021-02-11 11:48:36 +01:00
Willy Tarreau	5472aa50f1	BUG/MEDIUM: queue: fix unsafe proxy pointer when counting nbpend As reported by Coverity in issue #917, commit `96bca33` ("OPTIM: queue: decrement the nbpend and totpend counters outside of the lock") introduced a bug when moving the increments outside of the loop, because we can't always rely on the pendconn "p" here as it may be null. We can retrieve the proxy pointer directly from s->proxy instead. The same is true for pendconn_redistribute(), though the last "p" pointer there was still valid. This patch fixes both. No backport is needed, this was introduced just before 2.3-dev8.	2020-10-24 12:57:41 +02:00
Willy Tarreau	96bca33d75	OPTIM: queue: decrement the nbpend and totpend counters outside of the lock We don't need to do that inside the lock. However since the operation used to be done in deep functions, we have to make it resurface closer to visible parts. It remains reasonably self-contained in queue.c so that's not that big of a deal. Some places (redistribute) could benefit from a single operation for all counts at once. Others like pendconn_process_next_strm() are still called with both locks held but now it will be possible to change this.	2020-10-22 17:32:28 +02:00
Willy Tarreau	56c1cfb179	OPTIM: queue: make the nbpend counters atomic Instead of incrementing, decrementing them and updating their max under the lock, make them atomic and keep them out of the lock as much as possible. For __pendconn_unlink_* it would be wise to decide to move these counters outside of the function, inside the callers so that a single atomic op can be done per counter even for groups of operations.	2020-10-22 17:32:28 +02:00
Willy Tarreau	c7eedf7a5a	MINOR: queue: reduce the locked area in pendconn_add() Similarly to previous changes, we know if we're dealing with a server or proxy lock so let's directly lock at the finest possible places there. It's worth noting that a part of the operation consisting in an increment and update of a max could be done outside of the lock using atomic ops and a CAS.	2020-10-22 17:32:28 +02:00
Willy Tarreau	3e3ae2524d	MINOR: queue: split __pendconn_unlink() in per-srv and per-prx The function is called with the lock held and does too many tests for things that are already known from its callers. Let's split it in two so that its callers call either the per-server or per-proxy function depending on where the element is (since they had to determine it prior to taking the lock).	2020-10-22 17:32:28 +02:00
Willy Tarreau	ac66d6bafb	MINOR: proxy; replace the spinlock with an rwlock This is an anticipation of finer grained locking for the queues. For now all lock places take a write lock so that there is no difference at all with previous code.	2020-10-22 17:32:28 +02:00
Willy Tarreau	ef71f0194c	BUG/MINOR: queue: properly report redistributed connections In commit `5cd4bbd7a` ("BUG/MAJOR: threads/queue: Fix thread-safety issues on the queues management") the counter of transferred connections was accidentally lost, so that when a server goes down with connections in its queue, it will always be reported that 0 connections were transferred. This should be backported as far as 1.8 since the patch above was backported there.	2020-10-21 12:04:53 +02:00
Willy Tarreau	b2551057af	CLEANUP: include: tree-wide alphabetical sort of include files This patch fixes all the leftovers from the include cleanup campaign. There were not that many (~400 entries in ~150 files) but it was definitely worth doing it as it revealed a few duplicates.	2020-06-11 10:18:59 +02:00
Willy Tarreau	dfd3de8826	REORG: include: move stream.h to haproxy/stream{,-t}.h This one was not easy because it was embarking many includes with it, which other files would automatically find. At least global.h, arg.h and tools.h were identified. 93 total locations were identified, 8 additional includes had to be added. In the rare files where it was possible to finalize the sorting of includes by adjusting only one or two extra lines, it was done. But all files would need to be rechecked and cleaned up now. It was the last set of files in types/ and proto/ and these directories must not be reused anymore.	2020-06-11 10:18:58 +02:00
Willy Tarreau	1e56f92693	REORG: include: move server.h to haproxy/server{,-t}.h extern struct dict server_name_dict was moved from the type file to the main file. A handful of inlined functions were moved at the bottom of the file. Call places were updated to use server-t.h when relevant, or to simply drop the entry when not needed.	2020-06-11 10:18:58 +02:00
Willy Tarreau	a55c45470f	REORG: include: move queue.h to haproxy/queue{,-t}.h Nothing outstanding here. A number of call places were not justified and removed.	2020-06-11 10:18:58 +02:00
Willy Tarreau	4980160ecc	REORG: include: move backend.h to haproxy/backend{,-t}.h The files remained mostly unchanged since they were OK. However, half of the users didn't need to include them, and about as many actually needed to have it and used to find functions like srv_currently_usable() through a long chain that broke when moving the file.	2020-06-11 10:18:58 +02:00
Willy Tarreau	c2b1ff04e5	REORG: include: move http_ana.h to haproxy/http_ana{,-t}.h It was moved without any change, however many callers didn't need it at all. This was a consequence of the split of proto_http.c into several parts that resulted in many locations to still reference it.	2020-06-11 10:18:58 +02:00
Willy Tarreau	5e539c9b8d	REORG: include: move stream_interface.h to haproxy/stream_interface{,-t}.h Almost no changes, removed stdlib and added buf-t and connection-t to the types to avoid a warning.	2020-06-11 10:18:58 +02:00
Willy Tarreau	8b550afe1e	REORG: include: move tcp_rules.h to haproxy/tcp_rules.h There's no type file on this one which is pretty simple.	2020-06-11 10:18:58 +02:00
Willy Tarreau	cea0e1bb19	REORG: include: move task.h to haproxy/task{,-t}.h The TASK_IS_TASKLET() macro was moved to the proto file instead of the type one. The proto part was a bit reordered to remove a number of ugly forward declaration of static inline functions. About a tens of C and H files had their dependency dropped since they were not using anything from task.h.	2020-06-11 10:18:58 +02:00
Willy Tarreau	e6ce10be85	REORG: include: move sample.h to haproxy/sample{,-t}.h This one is particularly tricky to move because everyone uses it and it depends on a lot of other types. For example it cannot include arg-t.h and must absolutely only rely on forward declarations to avoid dependency loops between vars -> sample_data -> arg. In order to address this one, it would be nice to split the sample_data part out of sample.h.	2020-06-11 10:18:58 +02:00
Willy Tarreau	c761f843da	REORG: include: move http_rules.h to haproxy/http_rules.h There was no include file. This one still includes types/proxy.h.	2020-06-11 10:18:57 +02:00
Willy Tarreau	d0ef439699	REORG: include: move common/memory.h to haproxy/pool.h Now the file is ready to be stored into its final destination. A few minor reorderings were performed to keep the file properly organized, making the various sections more visible (cache & lockless). In addition and to stay consistent, memory.c was renamed to pool.c.	2020-06-11 10:18:57 +02:00
Willy Tarreau	92b4f1372e	REORG: include: move time.h from common/ to haproxy/ This one is included almost everywhere and used to rely on a few other .h that are not needed (unistd, stdlib, standard.h). It could possibly make sense to split it into multiple parts to distinguish operations performed on timers and the internal time accounting, but at this point it does not appear much important.	2020-06-11 10:18:56 +02:00
Willy Tarreau	3f567e4949	REORG: include: split hathreads into haproxy/thread.h and haproxy/thread-t.h This splits the hathreads.h file into types+macros and functions. Given that most users of this file used to include it only to get the definition of THREAD_LOCAL and MAXTHREADS, the bare minimum was placed into thread-t.h (i.e. types and macros). All the thread management was left to haproxy/thread.h. It's worth noting the drop of the trailing "s" in the name, to remove the permanent confusion that arises between this one and the system implementation (no "s") and the makefile's option (no "s"). For consistency, src/hathreads.c was also renamed thread.c. A number of files were updated to only include thread-t which is the one they really needed. Some future improvements are possible like replacing empty inlined functions with macros for the thread-less case, as building at -O0 disables inlining and causes these ones to be emitted. But this really is cosmetic.	2020-06-11 10:18:56 +02:00
Willy Tarreau	4c7e4b7738	REORG: include: update all files to use haproxy/api.h or api-t.h if needed All files that were including one of the following include files have been updated to only include haproxy/api.h or haproxy/api-t.h once instead: - common/config.h - common/compat.h - common/compiler.h - common/defaults.h - common/initcall.h - common/tools.h The choice is simple: if the file only requires type definitions, it includes api-t.h, otherwise it includes the full api.h. In addition, in these files, explicit includes for inttypes.h and limits.h were dropped since these are now covered by api.h and api-t.h. No other change was performed, given that this patch is large and affects 201 files. At least one (tools.h) was already freestanding and didn't get the new one added.	2020-06-11 10:18:42 +02:00
Willy Tarreau	8d2b777fe3	REORG: ebtree: move the include files from ebtree to include/import/ This is where other imported components are located. All files which used to directly include ebtree were touched to update their include path so that "import/" is now prefixed before the ebtree-related files. The ebtree.h file was slightly adjusted to read compiler.h from the common/ subdirectory (this is the only change). A build issue was encountered when eb32sctree.h is loaded before eb32tree.h because only the former checks for the latter before defining type u32. This was addressed by adding the reverse ifdef in eb32tree.h. No further cleanup was done yet in order to keep changes minimal.	2020-06-11 09:31:11 +02:00
Willy Tarreau	e3b57bf92f	MINOR: sample: make sample_parse_expr() able to return an end pointer When an end pointer is passed, instead of complaining that a comma is missing after a keyword, sample_parse_expr() will silently return the pointer to the current location into this return pointer so that the caller can continue its parsing. This will be used by more complex expressions which embed sample expressions, and may even permit to embed sample expressions into arguments of other expressions.	2020-02-14 19:02:06 +01:00
Willy Tarreau	9ada030697	BUG/MINOR: queue/threads: make the queue unlinking atomic There is a very short race in the queues which happens in the following situation: - stream A on thread 1 is being processed by a server - stream B on thread 2 waits in the backend queue for a server - stream B on thread 2 is fed up with waiting and expires, calls stream_free() which calls pendconn_free(), which sees the stream attached - at the exact same instant, stream A finishes on thread 1, sees one stream is waiting (B), detaches it and wakes it up - stream B continues pendconn_free() and calls pendconn_unlink() - pendconn_unlink() now detaches the node again and performs a second deletion (harmless since idempotent), and decrements srv/px->nbpend again => the number of connections on the proxy or server may reach -1 if/when this race occurs. It is extremely tight as it can only occur during the test on p->leaf_p though it has been witnessed at least once. The solution consists in testing leaf_p again once the lock is held to make sure the element was not removed in the meantime. This should be backported to 2.0 and 1.9, probably even 1.8.	2019-11-14 14:58:39 +01:00
Willy Tarreau	5e83d996cf	BUG/MAJOR: queue/threads: avoid an AB/BA locking issue in process_srv_queue() A problem involving server slowstart was reported by @max2k1 in issue #197. The problem is that pendconn_grab_from_px() takes the proxy lock while already under the server's lock while process_srv_queue() first takes the proxy's lock then the server's lock. While the latter seems more natural, it is fundamentally incompatible with many other operations performed on servers, namely state change propagation, where the proxy is only known after the server and cannot be locked around the servers. However reversing the lock in process_srv_queue() is trivial and only the few functions related to dynamic cookies need to be adjusted for this so that the proxy's lock is taken for each server operation. This is possible because the proxy's server list is built once at boot time and remains stable. So this is what this patch does. The comments in the proxy and server structs were updated to mention this rule that the server's lock may not be taken under the proxy's lock but may enclose it. Another approach could consist in using a second lock for the proxy's queue which would be different from the regular proxy's lock, but given that the operations above are rare and operate on small servers list, there is no reason for overdesigning a solution. This fix was successfully tested with 10000 servers in a backend where adjusting the dyncookies in loops over the CLI didn't have a measurable impact on the traffic. The only workaround without the fix is to disable any occurrence of "slowstart" on server lines, or to disable threads using "nbthread 1". This must be backported as far as 1.8.	2019-07-30 14:02:06 +02:00
Christopher Faulet	fc9cfe4006	REORG: proto_htx: Move HTX analyzers & co to http_ana.{c,h} files The old module proto_http does not exist anymore. All code dedicated to the HTTP analysis is now grouped in the file proto_htx.c. So, to finish the polishing after removing the legacy HTTP code, proto_htx.{c,h} files have been moved in http_ana.{c,h} files. In addition, all HTX analyzers and related functions prefixed with "htx_" have been renamed to start with "http_" instead.	2019-07-19 09:24:12 +02:00
Willy Tarreau	bff005ae58	BUG/MEDIUM: queue: fix the tree walk in pendconn_redistribute. In pendconn_redistribute() we scan the queue using eb32_next() on the node we've just deleted, which is wrong since the node is not in the tree anymore, and it could dereference one node that has already been released by another thread. Note that we cannot use eb32_first() in the loop here instead because we need to skip pendconns having SF_FORCE_PRST. Instead, let's keep a copy of the next node before deleting it. In addition, the pendconn retrieved there is wrong, it uses &node as the pointer instead of node, resulting in very quick crashes when the server list is scanned. Fortunately this only happens when "option redispatch" is used in conjunction with "maxconn" on server lines, "cookie" for the stickiness, and when a server goes down with entries in its queue. This bug was introduced by commit `0355dabd7` ("MINOR: queue: replace the linked list with a tree") so the fix must be backported to 1.9.	2019-05-27 10:29:59 +02:00
Olivier Houchard	b4df492d01	MEDIUM: queues: Use the new _HA_ATOMIC_* macros. Use the new _HA_ATOMIC_* macros and add barriers where needed.	2019-03-11 17:02:38 +01:00
Joseph Herlant	d8499ecb6e	CLEANUP: Fix a typo in the queue subsystem Fixes a typo in the code comments of the queue subsystem.	2018-12-02 18:40:11 +01:00
Willy Tarreau	8ceae72d44	MEDIUM: init: use initcall for all fixed size pool creations This commit replaces the explicit pool creation that are made in constructors with a pool registration. Not only this simplifies the pools declaration (it can be done on a single line after the head is declared), but it also removes references to pools from within constructors. The only remaining create_pool() calls are those performed in init functions after the config is parsed, so there is no more user of potentially uninitialized pool now. It has been the opportunity to remove no less than 12 constructors and 6 init functions.	2018-11-26 19:50:32 +01:00
Willy Tarreau	0108d90c6c	MEDIUM: init: convert all trivial registration calls to initcalls This switches explicit calls to various trivial registration methods for keywords, muxes or protocols from constructors to INITCALL1 at stage STG_REGISTER. All these calls have in common to consume a single pointer and return void. Doing this removes 26 constructors. The following calls were addressed : - acl_register_keywords - bind_register_keywords - cfg_register_keywords - cli_register_kw - flt_register_keywords - http_req_keywords_register - http_res_keywords_register - protocol_register - register_mux_proto - sample_register_convs - sample_register_fetches - srv_register_keywords - tcp_req_conn_keywords_register - tcp_req_cont_keywords_register - tcp_req_sess_keywords_register - tcp_res_cont_keywords_register - flt_register_keywords	2018-11-26 19:50:32 +01:00
Willy Tarreau	61c112aa5b	REORG: http: move HTTP rules parsing to http_rules.c These ones are mostly called from cfgparse.c for the parsing and do not depend on the HTTP representation. The functions' prototypes were moved to proto/http_rules.h, making this file work exactly like tcp_rules. Ideally we should stop calling these functions directly from cfgparse and register keywords, but there are a few cases where that wouldn't work (stats http-request) so it's probably not worth trying to go this far.	2018-10-02 18:28:05 +02:00
Willy Tarreau	deca26c452	BUG/MAJOR: queue/threads: make pendconn_redistribute not lock the server Since commit `3ff577e` ("MAJOR: server: make server state changes synchronous again"), srv_update_status() is called with the server lock held. It calls (among others) pendconn_redistribute() which used to take this lock, causing CPU loops by default, or crashes if built with -DDEBUG_THREAD. Since this function is not called from any other place anymore, it doesn't require the lock on its own so let's simply drop it from there. No backport is needed, this is 1.9-specific.	2018-08-21 18:11:03 +02:00
Patrick Hemmer	248cb4c503	MEDIUM: queue: adjust position based on priority-class and priority-offset The priority values are used when connections are queued to determine which connections should be served first. The lowest priority class is served first. When multiple requests from the same class are found, the earliest (according to queue_time + offset) is served first. The queue offsets can span over roughly 17 minutes after which the offsets will wrap around. This allows up to 8 minutes spent in the queue with no reordering.	2018-08-10 15:06:48 +02:00
Patrick Hemmer	268a707a3d	MEDIUM: add set-priority-class and set-priority-offset This adds the set-priority-class and set-priority-offset actions to http-request and tcp-request content. At this point they are not used yet, which is the purpose of the next commit, but all the logic to set and clear the values is there.	2018-08-10 15:06:31 +02:00
Patrick Hemmer	0355dabd7c	MINOR: queue: replace the linked list with a tree We'll need trees to manage the queues by priorities. This change replaces the list with a tree based on a single key. It's effectively a list but allows us to get rid of the list management right now.	2018-08-10 15:06:27 +02:00
Patrick Hemmer	da282f4a8f	MINOR: queue: store the queue index in the stream when enqueuing We store the queue index in the stream and check it on dequeueing to figure how many entries were processed in between. This way we'll be able to count the elements that may later be added before ours.	2018-08-10 15:06:25 +02:00
Patrick Hemmer	ffe5e8c638	MINOR: stream: rename {srv,prx}_queue_size to *_queue_pos The current name is misleading as it implies a queue size, but the value instead indicates a position in the queue. The value is only the queue size at the exact moment the element is enqueued. Soon we will gain the ability to insert anywhere into the queue, upon which clarity of the name is more important.	2018-08-10 15:04:14 +02:00
Willy Tarreau	a8694654ba	BUG/MEDIUM: queue: prevent a backup server from draining the proxy's connections When switching back from a backup to an active server, the backup server currently continues to drain the proxy's connections, which is a problem because it's not expected to be able to pick them. This patch ensures that a backup server will only pick backend connections if there is no active server and it is the selected backup server or all backup servers are supposed to be used. This issue seems to have existed forever, so this fix should be backported to all stable versions.	2018-08-07 10:52:01 +02:00
Olivier Houchard	ecfe673f61	MINOR: threads/queue: Get rid of THREAD_WANT_SYNC in the queue code. Now that we can wake one thread sleeping in the poller, we don't have to use THREAD_WANT_SYNC any more. This gives a significant performance boost on highly contended accesses (servers with maxconn 1), showing a jump from 21k to 31k conn/s on a test involving 8 threads.	2018-07-26 20:55:02 +02:00
Willy Tarreau	3201e4e428	MEDIUM: queue: get rid of the pendconn lock This lock was necessary to manipulate the pendconn element between concurrent places, but was causing great difficulties in the list walk by having to iterate over multiple entries instead of being able to safely pick the first one (in fact the first element was always the right one but the locking model was hard to prove). Here since we know we can always rely on the queue's locks, we take the queue's lock every time we need to modify the element. In practice it was already the case everywhere except in pendconn_dequeue() which only works on an element that was already detached. This function had to be protected against the risk of meeting an incompletely detached element (which could be unlinked but not yet assigned). By taking the queue lock around the LIST_ISEMPTY test, it's enough to ensure that a concurrent thread either didn't begin or had completed the operation. The true benefit really is in pendconn_process_next_strm() where we can again safely work with the first element of each queue. This will significantly simplify next updates to this code.	2018-07-26 17:32:51 +02:00
Willy Tarreau	7c6f8a2b0d	MINOR: queue: implement pendconn queue locking functions The new pendconn_queue_lock() and pendconn_queue_unlock() functions are made to make it more convenient to lock or unlock the pendconn queue either at the proxy or the server depending on pendconn->srv. This way it is possible to remove the open-coding of these locks at various places. These ones have been used in pendconn_unlink() and pendconn_add(), thus significantly simplifying the logic there.	2018-07-26 17:32:51 +02:00
Willy Tarreau	88930dd364	MINOR: queue: use a distinct variable for the assigned server and the queue The pendconn struct uses ->px and ->srv to designate where the element is queued. There is something confusing regarding threads though, because we have to lock the appropriate queue before inserting/removing elements, and this queue may only be determined by looking at ->srv (if it's not NULL it's the server, otherwise use the proxy). But pendconn_grab_from_px() and pendconn_process_next_strm() both assign this ->srv field, making it complicated to know what queue to lock before manipulating the element, which is exactly why we have the pendconn_lock in the first place. This commit introduces pendconn->target which is the target server that the two aforementioned functions will set when assigning the server. Thanks to this, the server pointer may always be relied on to determine what queue to use.	2018-07-26 17:32:51 +02:00
Willy Tarreau	c1a60d6218	MINOR: queue: make sure pendconn->strm->pend_pos is always valid pendconn_add() used to assign strm->pend_pos very late, after unlocking the queue, so that a watching thread could see a random value in pendconn->strm->pend_pos even while holding the lock on the element and the queue itself. While there's currently nothing wrong with this, it costs nothing to arrange it and will simplify code analysis later.	2018-07-26 17:32:51 +02:00
Willy Tarreau	6bdd05c0ef	DOC: queue: document the expected locking model for the server's queue The locking model is not trivial and is worth documenting to avoid seeing apparent bugs everywhere while they are not.	2018-07-26 17:32:51 +02:00
Willy Tarreau	d0ad4a87f0	MEDIUM: queue: make pendconn_free() work on the stream instead Now pendconn_free() takes a stream, checks that pend_pos is set, clears it, and uses pendconn_unlink() to complete the job. It's cleaner and centralizes all the bookkeeping work in pendconn_unlink() only and ensures that there's a single place where the stream's position in the queue is manipulated.	2018-07-26 17:32:51 +02:00
Willy Tarreau	9624faec86	MINOR: queue: centralize dequeuing code a bit better For now the pendconns may be dequeued at two places : - pendconn_unlink(), which operates on a locked queue - pendconn_free(), which operates on an unlocked queue and frees everything. Some changes are coming to the queue and we'll need to be able to be a bit stricter regarding the places where we dequeue to keep the accounting accurate. This first step renames the locked function __pendconn_unlink() as it's for use by those aware of it, and introduces a new general purpose pendconn_unlink() function which automatically grabs the necessary locks before calling the former, and pendconn_cond_unlink() which additionally checks the pointer and the presence in the queue.	2018-07-26 17:32:48 +02:00
Ilya Shipitsin	7741c854cd	BUILD/MINOR: fix build when USE_THREAD is not defined src/queue.o: In function `pendconn_redistribute': /home/ilia/haproxy/src/queue.c:272: undefined reference to `thread_want_sync' src/queue.o: In function `pendconn_grab_from_px': /home/ilia/haproxy/src/queue.c:311: undefined reference to `thread_want_sync' src/queue.o: In function `process_srv_queue': /home/ilia/haproxy/src/queue.c:184: undefined reference to `thread_want_sync' collect2: error: ld returned 1 exit status make: *** [Makefile:900: haproxy] Error 1 To be backported to 1.8.	2018-03-26 17:17:59 +02:00
Christopher Faulet	fd83f0bfa4	BUG/MEDIUM: threads/queue: wake up other threads upon dequeue The previous patch about queues (`5cd4bbd7a` "BUG/MAJOR: threads/queue: Fix thread-safety issues on the queues management") revealed a performance drop when multithreading is enabled (nbthread > 1). This happens when pending connections handled by other threads are dequeued. If these other threads are blocked in the poller, we have to wait for the poller's timeout (or any I/O event) to process the dequeued connections. To fix the problem, at least temporarily, we "wake up" the threads by requesting a synchronization. This may seem a bit overkill to use the sync point to do a wakeup on threads, but it fixes this performance issue. So we can now think calmly about the right way to address this kind of issue. This patch should be backported in 1.8 with the commit `5cd4bbd7a` ("BUG/MAJOR: threads/queue: Fix thread-safety issues on the queues management").	2018-03-19 22:16:58 +01:00
Christopher Faulet	5cd4bbd7ab	BUG/MAJOR: threads/queue: Fix thread-safety issues on the queues management The management of the servers and the proxies queues was not thread-safe at all. First, the accesses to <strm>->pend_pos were not protected. So it was possible to release it on a thread (for instance because the stream is released) and to use it at the same time on another one (because we redispatch pending connections for a server). Then, accessing the stream's information (flags and target) from anywhere is forbidden. To be safe, the stream's state must always be updated in the context of process_stream. So to fix these issues, the queue module has been refactored. A lock has been added in the pendconn structure. And now, when we try to dequeue a pending connection, we start by unlinking it from the server/proxy queue and we wake up the stream. Then, it is the stream's responsibility to really dequeue it (or release it). This way, we are sure that only the stream can create and release its <pend_pos> field. However, be careful. This new implementation should be thread-safe (hopefully...). But it is not optimal and in some situations, it could be really slower in multi-threaded mode than in single-threaded one. The problem is that, when we try to dequeue pending connections, we process them from the oldest to the newest regardless of thread affinity. So we need to wait for the other threads to wake up to really process them. If threads are blocked in the poller, this will add a significant latency. This problem happens when maxconn values are very low. This patch must be backported in 1.8.	2018-03-19 10:03:06 +01:00
Willy Tarreau	103e5663c8	BUG/MAJOR: threads/queue: avoid recursive locking in pendconn_get_next_strm() pendconn_get_next_strm() is called from process_srv_queue() under the server lock, and calls stream_add_srv_conn() with this lock held, while the latter tries to take it again. This results in a deadlock when a server's maxconn is reached and haproxy is built with thread support.	2017-11-26 18:50:30 +01:00
Willy Tarreau	bafbe01028	CLEANUP: pools: rename all pool functions and pointers to remove this "2" During the migration to the second version of the pools, the new functions and pool pointers were all called "pool_something2()" and "pool2_something". Now there's no more pool v1 code and it's a real pain to still have to deal with this. Let's clean this up now by removing the "2" everywhere, and by renaming the pool heads "pool_head_something".	2017-11-24 17:49:53 +01:00
Christopher Faulet	2a944ee16b	BUILD: threads: Rename SPIN/RWLOCK macros using HA_ prefix This removes any name conflicts, especially on Solaris.	2017-11-07 11:10:24 +01:00
Christopher Faulet	8ba59148ae	MEDIUM: threads/queue: Make queues thread-safe The lists of pending connections are now protected using the proxy or server lock, depending on the context.	2017-10-31 13:58:32 +01:00
Christopher Faulet	29f77e846b	MEDIUM: threads/server: Add a lock per server and atomically update server vars The server's lock is used, among other things, to lock access to the active connection list of a server.	2017-10-31 13:58:31 +01:00
Christopher Faulet	ff8abcd31d	MEDIUM: threads/proxy: Add a lock per proxy and atomically update proxy vars Now, each proxy contains a lock that must be used when necessary to protect it. Moreover, all proxy's counters are now updated using atomic operations.	2017-10-31 13:58:30 +01:00
Emeric Brun	52a91d3d48	MEDIUM: check: server states and weight propagation re-work The server state and weight were reworked to handle "pending" values updated by checks/CLI/LUA/agent. These values are committed to be propagated to the LB stack. In further development related to multi-threading, the commit will be handled in a sync point. Pending values are named using the prefix 'next_'. Current values used by the LB stack are named 'cur_'.	2017-09-05 15:23:16 +02:00
Christopher Faulet	f3a55dbd22	MINOR: queue: Change pendconn_from_srv/pendconn_from_px into private functions	2017-06-27 14:38:02 +02:00
Christopher Faulet	87566c923b	MINOR: queue: Change pendconn_get_next_strm into private function	2017-06-27 14:38:02 +02:00
Andrew Rodland	e168feb4a8	MINOR: proxy: add 'served' field to proxy, equal to total of all servers' This will allow lb_chash to determine the total active sessions for a proxy without any computation. Signed-off-by: Andrew Rodland <andrewr@vimeo.com>	2016-10-25 20:21:32 +02:00
Willy Tarreau	e7dff02dd4	REORG/MEDIUM: stream: rename stream flags from SN_* to SF_* This is in order to keep things consistent.	2015-04-06 11:23:57 +02:00
Willy Tarreau	87b09668be	REORG/MAJOR: session: rename the "session" entity to "stream" With HTTP/2, we'll have to support multiplexed streams. A stream is in fact the largest part of what we currently call a session, it has buffers, logs, etc. In order to catch any error, this commit removes any reference to the struct session and tries to rename most "session" occurrences in function names to "stream" and "sess" to "strm" when that's related to a session. The files stream.{c,h} were added and session.{c,h} removed. The session will be reintroduced later and a few parts of the stream will progressively be moved over there. It will more or less contain only what we need in an embryonic session. Sample fetch functions and converters will have to change a bit so that they'll use an L5 (session) instead of what's currently called "L4" which is in fact L6 for now. Once all changes are completed, we should see approximately this: L7 = http_txn, L6 = stream, L5 = session, L4 = connection | applet. There will be at most one http_txn per stream, and the same session will possibly be referenced by multiple streams. A connection will point to a session and to a stream. The session will hold all the information we need to keep even when we don't yet have a stream. Some more cleanup is needed because some code was already far from being clean. The server queue management still refers to sessions at many places while comments talk about connections. This will have to be cleaned up once we have a server-side connection pool manager. Stream flags "SN_*" still need to be renamed, it doesn't seem like any of them will need to move to the session.	2015-04-06 11:23:56 +02:00
Willy Tarreau	9943d3117e	MINOR: server: make use of srv_is_usable() instead of checking eweight srv_is_usable() is broader than a plain eweight check as it considers not only the weight but the server's state as well. Future changes will allow a server to be in drain mode with a non-zero weight, so we should migrate to use that function instead.	2014-05-23 14:29:11 +02:00
Willy Tarreau	4aac7db940	REORG: checks: put the functions in the appropriate files ! Checks.c has become a total mess. A number of proxy or server maintenance and queue management functions were put there probably because they were used there, but that makes the code untouchable. And that's without saying that their names do not always relate to what they really do! So let's do a first pass by moving these ones : - set_backend_down() => backend.c - redistribute_pending() => queue.c:pendconn_redistribute() - check_for_pending() => queue.c:pendconn_grab_from_px() - shutdown_sessions => server.c:srv_shutdown_sessions() - shutdown_backup_sessions => server.c:srv_shutdown_backup_sessions() All of them were moved at once.	2014-05-22 11:27:00 +02:00
Willy Tarreau	892337c8e1	MAJOR: server: use states instead of flags to store the server state Servers used to have 3 flags to store a state, now they have 4 states instead. This avoids lots of confusion for the 4 remaining undefined states. The encoding from the previous to the new states can be represented this way, with the bits listed in the order (SRV_STF_RUNNING, SRV_STF_GOINGDOWN, SRV_STF_WARMINGUP): 0 x x = SRV_ST_STOPPED; 1 0 0 = SRV_ST_RUNNING; 1 0 1 = SRV_ST_STARTING; 1 1 x = SRV_ST_STOPPING. Note that the case where all bits were set used to exist and was randomly dealt with. For example, the task was not stopped, the throttle value was still updated and reported in the stats and in the http_server_state header. It was the same if the server was stopped by the agent or for maintenance. It's worth noting that the internal function names are still quite confusing.	2014-05-22 11:27:00 +02:00
Willy Tarreau	c93cd16b6c	REORG/MEDIUM: server: split server state and flags in two different variables Till now, the server's state and flags were all saved as a single bit field. It causes some difficulties because we'd like to have an enum for the state and separate flags. This commit starts by splitting them in two distinct fields. The first one is srv->state (with its counter-part srv->prev_state) which are now enums, but which still contain bits (SRV_STF_*). The flags now lie in their own field (srv->flags). The function srv_is_usable() was updated to use the enum as input, since it already used to deal only with the state. Note that currently, the maintenance mode is still in the state for simplicity, but it must move as well.	2014-05-22 11:27:00 +02:00
Willy Tarreau	87eb1d6994	MINOR: server: create srv_was_usable() from srv_is_usable() and use a pointer We used to call srv_is_usable() with either the current state and weights or the previous ones. This causes trouble for future changes, so let's first split it in two variants : - srv_is_usable(srv) considers the current status - srv_was_usable(srv) considers the previous status	2014-05-13 22:34:55 +02:00
Willy Tarreau	3fdb366885	MAJOR: connection: replace struct target with a pointer to an enum Instead of storing a couple of (int, ptr) in the struct connection and the struct session, we use a different method : we only store a pointer to an integer which is stored inside the target object and which contains a unique type identifier. That way, the pointer allows us to retrieve the object type (by dereferencing it) and the object's address (by computing the displacement in the target structure). The NULL pointer always corresponds to OBJ_TYPE_NONE. This reduces the size of the connection and session structs. It also simplifies target assignment and compare. In order to improve the generated code, we try to put the obj_type element at the beginning of all the structs (listener, server, proxy, si_applet), so that the original and target pointers are always equal. A lot of code was touched by massive replaces, but the changes are not that important.	2012-11-12 00:42:33 +01:00
Willy Tarreau	f8e8b76ed3	BUG/MEDIUM: zero-weight servers must not dequeue requests from the backend It was reported that a server configured with a zero weight would sometimes still take connections from the backend queue. This issue is real, it happens this way : 1) the disabled server accepts a request with a cookie 2) many cookie-less requests accumulate in the backend queue 3) when the disabled server completes its request, it checks its own queue and the backend's queue 4) the server takes a pending request from the backend queue and processes it. In response, the server's cookie is assigned to the client, which ensures that some requests will continue to be served by this server, leading back to point 1 above. The fix consists in preventing a zero-weight server from dequeuing pending requests from the backend. Making use of srv_is_usable() in such tests makes the tests more robust against future changes. This fix must be backported to 1.4 and 1.3.	2012-01-20 16:18:53 +01:00
Willy Tarreau	4426770013	CLEANUP: rename possibly confusing struct field "tracked" When reading the code, the "tracked" member of a server makes one think the server is tracked while it's the opposite, it's a pointer to the server being tracked. This is particularly true in constructs such as : if (srv->tracked) { Since it's the second time I get caught misunderstanding it, let's rename it to "track" to avoid the confusion.	2011-10-28 15:35:33 +02:00
Simon Horman	af51495397	[MINOR] Add active connection list to server The motivation for this is to allow iteration of all the connections of a server without the expense of iterating over the global list of connections. The first use of this will be to implement an option to close connections associated with a server when it is marked as being down or in maintenance mode.	2011-06-21 22:00:12 +02:00
Willy Tarreau	7d0aaf39d1	[MEDIUM] stats: split frontend and backend stats It's very annoying that frontend and backend stats are merged because we don't know what we're observing. For instance, if a "listen" instance makes use of a distinct backend, it's impossible to know what the bytes_out means. Some points take care of not updating counters twice if the backend points to the frontend, indicating a "listen" instance. The thing becomes more complex when we try to add support for server side keep-alive, because we have to maintain a pointer to the backend used for last request, and to update its stats. But we can't perform such comparisons anymore because the counters will not match anymore. So in order to get rid of this situation, let's have both frontend AND backend stats in the "struct proxy". We simply update the relevant ones during activity. Some of them are only accounted for in the backend, while others are just for frontend. Maybe we can improve a bit on that later, but the essential part is that those counters now reflect what they really mean.	2011-03-13 22:00:23 +01:00
Willy Tarreau	827aee913f	[MAJOR] session: remove the ->srv pointer from struct session This one has been removed and is now totally superseded by ->target. To get the server, one must use target_srv(&s->target) instead of s->srv now. The function ensures that non-server targets still return NULL.	2011-03-10 23:32:17 +01:00
Willy Tarreau	9e000c6ec8	[CLEANUP] stream_interface: use inline functions to manipulate targets The connection target involves a type and a union of pointers, let's make the code cleaner using simple wrappers.	2011-03-10 23:32:17 +01:00
Willy Tarreau	664beb8610	[MINOR] session: add a pointer to the new target into the session When dealing with HTTP keep-alive, we'll have to know if we can reuse an existing connection. For that, we'll have to check if the current connection was made on the exact same target (referenced in the stream interface). Thus, we need to first assign the next target to the session, then copy it to the stream interface upon connect(). Later we'll check for equivalence between those two operations.	2011-03-10 23:32:16 +01:00
Willy Tarreau	d132f746f2	[BUG] queue: don't dequeue proxy-global requests on disabled servers If a server is disabled or tracking a disabled server, it must not dequeue requests pending in the proxy queue, it must only dequeue its own ones. The problem that was caused is that if a backend always had requests in its queue, a disabled server would continue to take traffic forever. (was commit 09d02aaf02d1f21c0c02672888f3a36a14bdd299 in 1.4)	2010-08-17 21:39:07 +02:00
Willy Tarreau	ac68c5d92c	[OPTIM] counters: move some max numbers to the counters struct There are a few remaining max values that need to move to counters. Also, the counters are more often used than some config information, so get them closer to the other useful struct members for better cache efficiency.	2009-10-04 23:26:19 +02:00
Willy Tarreau	922a806075	[BUG] do not dequeue the backend's pending connections on a dead server Kai Krueger found that the previous patch was incomplete, because there is an unconditional call to process_srv_queue() in session_free() which still causes a dead server to consume pending connections from the backend. This call was made unconditional so that we don't leave unserved connections in the server queue, for instance connections coming in with "option persist" which can bypass the server status check. However, the server must not touch the backend's queue if it is down. Another fear was that some connections might remain unserved when the server is using a dynamic maxconn if the number of connections to the backend is too low. Right now, srv_dynamic_maxconn() ensures this cannot happen, so the call can remain conditional. The fix consists in allowing a server to process its own queue whatever its state, but not to touch the backend's queue if it is down. Its queue should normally be empty when the server is down because it is redistributed when the server goes down. The only remaining cases are precisely the persistent connections with "option persist" set, coming in after the queue has been redispatched. Those ones must still be processed when a connection terminates. (cherry picked from commit `cd485c4480`)	2008-12-07 23:51:12 +01:00
Willy Tarreau	28a9e529f8	[BUG] dynamic connection throttling could return a max of zero conns srv_dynamic_maxconn() is clearly documented as returning at least 1 possible connection under throttling. But the computation was wrong, the minimum 1 was divided and got lost in case of very low maxconns. Apply the MAX(1, max) before returning the result in order to ensure that a newly appeared server will get some traffic. (cherry picked from commit `819970098f`)	2008-12-07 23:30:38 +01:00
Willy Tarreau	fdccded0e8	[MEDIUM] indicate a reason for a task wakeup It's very frequent to require some information about the reason why a task is running. Some flags have been added so that a task now knows if it got woken up due to I/O completion, timeout, etc...	2008-11-02 10:19:08 +01:00
Willy Tarreau	ec6c5df018	[CLEANUP] remove many #include <types/xxx> from C files It should be stated as a rule that a C file should never include types/xxx.h when proto/xxx.h exists, as it gives less exposure to declaration conflicts (one of which was caught and fixed here) and it complicates the file headers for nothing. Only types/global.h, types/capture.h and types/polling.h have been found to be valid includes from C files.	2008-07-16 10:30:42 +02:00
Willy Tarreau	7c669d7e0f	[BUG] fix the dequeuing logic to ensure that all requests get served The dequeuing logic was completely wrong. First, a task was assigned to all servers to process the queue, but this task was never scheduled and was only woken up on session free. Second, there was no reservation of server entries when a task was assigned a server. This means that as long as the task was not connected to the server, its presence was not accounted for. This was causing trouble when detecting whether or not a server had reached maxconn. Third, during a redispatch, a session could lose its place at the server's and get blocked because another session at the same moment would have stolen the entry. Fourth, the redispatch option did not work when maxqueue was reached for a server, and it was not possible to do so without indefinitely hanging a session. The root cause of all those problems was the lack of pre-reservation of connections at the server's, and the lack of tracking of servers during a redispatch. Everything relied on combinations of flags which could appear similarly in quite distinct situations. This patch is a major rework but there was no other solution, as the internal logic was deeply flawed. The resulting code is cleaner, more understandable, uses less magics and is overall more robust. As an added bonus, "option redispatch" now works when maxqueue has been reached on a server.	2008-06-20 15:08:06 +02:00
Willy Tarreau	7a63abd84f	[BUG] log: reported queue position was off by one The reported queue position in the logs was 0 for the first pending request in the queue, which is wrong because it means that one request will have to be completed before the queued one may execute. It caused the undesired side effect that 0/0 was reported when either 0 or 1 request was pending in the queue. Thus, we have to increment the queue size before reporting the value.	2008-06-20 15:08:04 +02:00
Willy Tarreau	7008987813	[BUG] queue management: wake oldest request in queues When a server terminates a connection, the next session in its own queue was immediately processed. Because of this, if all server queues are always filled, then no new anonymous request will be processed. Consider oldest request between global and server queues to choose from which to pick the request. An improvement over this will consist in adding a configurable offset when comparing expiration dates, so that cookie-less requests can get either less or more priority.	2008-06-20 15:07:40 +02:00
Willy Tarreau	9909fc13f1	[MEDIUM] implement the slowstart parameter for servers The new 'slowstart' parameter for a server accepts a value in milliseconds which indicates after how long a server which has just come back up will run at full speed. The speed grows linearly from 0 to 100% during this time. The limitation applies to two parameters : - maxconn: the number of connections accepted by the server will grow from 1 to 100% of the usual dynamic limit defined by (minconn,maxconn,fullconn). - weight: when the backend uses a dynamic weighted algorithm, the weight grows linearly from 1 to 100%. In this case, the weight is updated at every health-check. For this reason, it is important that the 'inter' parameter is smaller than the 'slowstart', in order to maximize the number of steps. The slowstart never applies when haproxy starts, otherwise it would cause trouble to running servers. It only applies when a server has been previously seen as failed.	2007-11-30 17:42:05 +01:00
Willy Tarreau	e4d7e55061	[MAJOR] ported pendconn to mempools v2 A pool_destroy() was also missing in deinit()	2007-05-13 20:19:55 +02:00
Willy Tarreau	d825eef9c5	[MAJOR] replaced all timeouts with struct timeval The timeout functions were difficult to manipulate because they were rounding results to the millisecond. Thus, it was difficult to compare and to check what expired and what did not. Also, the comparison functions were heavy with multiplies and divides by 1000. Now, all timeouts are stored in timevals, reducing the number of operations for updates and leading to cleaner and more efficient code.	2007-05-12 22:35:00 +02:00
Willy Tarreau	96bcfd75aa	[MAJOR] replaced rbtree with ul2tree. The rbtree-based wait queue consumes a lot of CPU. Use the ul2tree instead. Lots of cleanups and code reorganizations made it possible to reduce the task struct and simplify the code a bit.	2007-04-29 13:43:53 +02:00
Willy Tarreau	a8cff1d6a7	[BUILD] fixed a warning on OpenBSD : MIN/MAX redefined	2007-04-09 16:10:57 +02:00
Willy Tarreau	e2e27a5c8d	[MEDIUM] removed now unused fiprm and beprm from proxies The fiprm and beprm were added to ease the transition between a single listener mode to frontends+backends. They are no longer needed and make the code a bit more complicated. Remove them.	2007-04-01 00:01:37 +02:00
Willy Tarreau	8603431822	[MEDIUM] split fe->maxconn into fe->maxconn and be->fullconn The maxconn argument is used only for the listeners, and the fullconn is used only for the backends. If unset, it inherits maxconn's value which itself can inherit the default or the global value (we might need to change this).	2006-12-29 00:10:33 +01:00
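
Note on commit 892337c8e1 above: the flag-to-state encoding it describes can be sketched in C as follows. This is only an illustration under stated assumptions: the numeric values given to the SRV_STF_* bits, the ordering of the enum, and the helper name srv_state_from_flags() are all hypothetical and are not taken from the haproxy sources. Only the mapping itself comes from the commit message.

```c
/* Hypothetical bit values: the commit message names these flags
 * but does not give their numeric values. */
#define SRV_STF_RUNNING    0x1
#define SRV_STF_GOINGDOWN  0x2
#define SRV_STF_WARMINGUP  0x4

/* The four states named in the commit message; the enum ordering
 * here is an assumption made for this example. */
enum srv_state {
	SRV_ST_STOPPED,
	SRV_ST_STARTING,
	SRV_ST_RUNNING,
	SRV_ST_STOPPING,
};

/* Hypothetical helper: maps an old 3-bit flag combination to one of
 * the 4 states, following the table in the commit message. */
static enum srv_state srv_state_from_flags(unsigned int flags)
{
	if (!(flags & SRV_STF_RUNNING))
		return SRV_ST_STOPPED;   /* 0 x x */
	if (flags & SRV_STF_GOINGDOWN)
		return SRV_ST_STOPPING;  /* 1 1 x */
	if (flags & SRV_STF_WARMINGUP)
		return SRV_ST_STARTING;  /* 1 0 1 */
	return SRV_ST_RUNNING;           /* 1 0 0 */
}
```

With this mapping, the formerly ambiguous "all bits set" combination resolves to SRV_ST_STOPPING, and every combination without SRV_STF_RUNNING resolves to SRV_ST_STOPPED, so the 8 possible flag combinations collapse to exactly 4 states.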


157 Commits