haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-10-30 16:11:01 +01:00

Author	SHA1	Message	Date
Christopher Faulet	a8ce497aac	BUG/MINOR: resolvers: Reset server IP when no ip is found in the response For A/AAAA resolution, if no ip is found for a server in the response, the server is set to RMAINT status. However, its address must also be reset. Otherwise, it is still reported by the cli on "show servers state" commands. This may be confusing. This patch may be backported as far as 2.0.	2021-06-24 17:22:36 +02:00
Willy Tarreau	cdc83e0192	MINOR: queue: add a pointer to the server and the proxy in the queue A queue is specific to a server or a proxy, so we don't need to place this distinction inside all pendconns, it can be in the queue itself. This commit adds the relevant fields "px" and "sv" into the struct queue, and initializes them accordingly.	2021-06-24 10:52:31 +02:00
Willy Tarreau	df3b0cbe31	MINOR: queue: add queue_init() to initialize a queue This is better and cleaner than open-coding this in the server and proxy code, where it has all chances of becoming wrong once forgotten.	2021-06-24 10:52:31 +02:00
Willy Tarreau	9ab78293bf	MEDIUM: queue: simplify again the process_srv_queue() API (v2) This basically undoes the API changes that were performed by commit 0274286dd ("BUG/MAJOR: server: fix deadlock when changing maxconn via agent-check") to address the deadlock issue: since process_srv_queue() doesn't use the server lock anymore, it doesn't need the "server_locked" argument, so let's get rid of it before it gets used again.	2021-06-24 10:52:31 +02:00
Willy Tarreau	16fbdda3c3	MEDIUM: queue: use a dedicated lock for the queues (v2) Till now whenever a server or proxy's queue was touched, this server or proxy's lock was taken. Not only this requires distinct code paths, but it also causes unnecessary contention with other uses of these locks. This patch adds a lock inside the "queue" structure that will be used the same way by the server and the proxy queuing code. The server used to use a spinlock and the proxy an rwlock, though the queue only used it for locked writes. This new version uses a spinlock since we don't need the read lock part here. Tests have not shown any benefit nor cost in using this one versus the rwlock so we could change later if needed. The lower contention on the locks increases the performance from 362k to 374k req/s on 16 threads with 20 servers and leastconn. The gain with roundrobin even increases by 9%. This is tagged medium because the lock is changed, but no other part of the code touches the queues, with nor without locking, so this should remain invisible.	2021-06-24 10:52:31 +02:00
Willy Tarreau	3f70fb9ea2	Revert "MEDIUM: queue: use a dedicated lock for the queues" This reverts commit fcb8bf8650ec6b5614d1b88db54f1200ebd96cbd. The recent changes since 5304669e1 MEDIUM: queue: make pendconn_process_next_strm() only return the pendconn opened a tiny race condition between stream_free() and process_srv_queue(), as the pendconn is accessed outside of the lock, possibly while it's being freed. A different approach is required.	2021-06-24 07:26:28 +02:00
Willy Tarreau	ccd85a3e08	Revert "MEDIUM: queue: simplify again the process_srv_queue() API" This reverts commit c83e45e9b001591633188a480a896c935d3c9625. The recent changes since 5304669e1 MEDIUM: queue: make pendconn_process_next_strm() only return the pendconn opened a tiny race condition between stream_free() and process_srv_queue(), as the pendconn is accessed outside of the lock, possibly while it's being freed. A different approach is required.	2021-06-24 07:22:18 +02:00
Willy Tarreau	c83e45e9b0	MEDIUM: queue: simplify again the process_srv_queue() API This basically undoes the API changes that were performed by commit 0274286dd ("BUG/MAJOR: server: fix deadlock when changing maxconn via agent-check") to address the deadlock issue: since process_srv_queue() doesn't use the server lock anymore, it doesn't need the "server_locked" argument, so let's get rid of it before it gets used again.	2021-06-22 18:57:15 +02:00
Willy Tarreau	fcb8bf8650	MEDIUM: queue: use a dedicated lock for the queues Till now whenever a server or proxy's queue was touched, this server or proxy's lock was taken. Not only this requires distinct code paths, but it also causes unnecessary contention with other uses of these locks. This patch adds a lock inside the "queue" structure that will be used the same way by the server and the proxy queuing code. The server used to use a spinlock and the proxy an rwlock, though the queue only used it for locked writes. This new version uses a spinlock since we don't need the read lock part here. Tests have not shown any benefit nor cost in using this one versus the rwlock so we could change later if needed. The lower contention on the locks increases the performance from 491k to 507k req/s on 16 threads with 20 servers and leastconn. The gain with roundrobin even increases by 6%. The performance profile changes from this: 13.03% haproxy [.] fwlc_srv_reposition 8.08% haproxy [.] fwlc_get_next_server 3.62% haproxy [.] process_srv_queue 1.78% haproxy [.] pendconn_dequeue 1.74% haproxy [.] pendconn_add to this: 11.95% haproxy [.] fwlc_srv_reposition 7.57% haproxy [.] fwlc_get_next_server 3.51% haproxy [.] process_srv_queue 1.74% haproxy [.] pendconn_dequeue 1.70% haproxy [.] pendconn_add At this point the differences are mostly measurement noise. This is tagged medium because the lock is changed, but no other part of the code touches the queues, with nor without locking, so this should remain invisible.	2021-06-22 18:43:56 +02:00
Willy Tarreau	a05704582c	MINOR: server: replace the pendconns-related stuff with a struct queue Just like for proxies, all three elements (pendconns, nbpend, queue_idx) were moved to struct queue.	2021-06-22 18:43:14 +02:00
Amaury Denoyelle	0274286dd3	BUG/MAJOR: server: fix deadlock when changing maxconn via agent-check The server_parse_maxconn_change_request locks the server lock. However, this function can be called via agent-checks or lua code which already lock it. This bug has been introduced by the following commit : commit 79a88ba3d09f7e2b73ae27cb5d24cc087a548fa6 BUG/MAJOR: server: prevent deadlock when using 'set maxconn server' This commit tried to fix another deadlock which can occur because previously server_parse_maxconn_change_request required the server lock to be held. However, it may call internally process_srv_queue which also locks the server lock. The locking policy has thus been updated. The fix is functional for the CLI 'set maxconn' but fails to address the agent-check / lua counterparts. This new issue is fixed in two steps : - changes from the above commit have been reverted. This means that server_parse_maxconn_change_request must again be called with the server lock. - to counter the deadlock fixed by the above commit, process_srv_queue now takes an argument to render the server locking optional if the caller already held it. This is only used by server_parse_maxconn_change_request. The above commit was subject to backport up to 1.8. Thus this commit must be backported in every release where it is already present.	2021-06-22 11:39:20 +02:00
Amaury Denoyelle	34897d2eff	MINOR: ssl: support ssl keyword for dynamic servers Activate the 'ssl' keyword for dynamic servers. This is the final step to have ssl dynamic servers feature implemented. If activated, ssl_sock_prepare_srv_ctx will be called at the end of the 'add server' CLI handler. At the same time, update the management doc to list all ssl keywords implemented for dynamic servers.	2021-06-18 16:42:26 +02:00
Amaury Denoyelle	b89d3d3de7	MINOR: server: disable CLI 'set server ssl' for dynamic servers 'set server ssl' uses ssl parameters from default-server. As dynamic servers do not reuse any default-server parameters, this command makes no sense for them.	2021-06-18 16:42:25 +02:00
Christopher Faulet	0ba54bb401	BUG/MINOR: server/cli: Fix locking in function processing "set server" command The commit c7b391aed ("BUG/MEDIUM: server/cli: Fix ABBA deadlock when fqdn is set from the CLI") introduced 2 bugs. The first one is a typo on the server's lock label (s/SERVER_UNLOCK/SERVER_LOCK/). The second one is about the server's lock itself. It must be acquired to execute the "agent-send" subcommand. The patch above is marked to be backported as far as 1.8. Thus, this one must also be backported as far as 1.8. BUG/MINOR: server/cli: Don't forget to lock server on agent-send subcommand	2021-06-18 09:16:32 +02:00
Christopher Faulet	dcac418062	BUG/MEDIUM: resolvers: Add a task on servers to check SRV resolution status When a server relies on a SRV resolution, a task is created to clean it up (fqdn/port and address) when the SRV resolution is considered as outdated (based on the resolvers 'timeout' value). It is only possible if the server inherits outdated info from a state file and is no longer selected to be attached to a SRV item. Note that most of the time, a server is attached to a SRV item. Thus when the item becomes obsolete, the server is cleaned up. It is important to have such a task to be sure the server will be free again to have a chance to be resolved again with fresh information. Of course, this patch is a workaround to solve a design issue. But there is no other obvious way to fix it without rewriting the whole resolvers part. And it must be backportable. This patch relies on following commits: * MINOR: resolvers: Clean server in a dedicated function when removing a SRV item * MINOR: resolvers: Remove server from named_servers tree when removing a SRV item All the series must be backported as far as 2.2 after some observation period. Backports to 2.0 and 1.8 must be evaluated.	2021-06-17 16:52:35 +02:00
Christopher Faulet	c7b391aed2	BUG/MEDIUM: server/cli: Fix ABBA deadlock when fqdn is set from the CLI To perform servers resolution, the resolver's lock is first acquired then the server's lock when necessary. However, when the fqdn is set via the CLI, the opposite is performed. So, it is possible to experience an ABBA deadlock. To fix this bug, the server's lock is acquired and released for each subcommand of "set server" with an exception when the fqdn is set. The resolver's lock is first acquired. Of course, this means we must be sure to have a resolver to lock. This patch must be backported as far as 1.8.	2021-06-17 16:52:14 +02:00
Christopher Faulet	a386e78823	BUG/MINOR: server: Forbid to set fqdn on the CLI if SRV resolution is enabled If a server is configured to rely on a SRV resolution, we must forbid to change its fqdn on the CLI. Indeed, in this case, the server retrieves its fqdn from the SRV resolution. If the fqdn is changed via the CLI, this conflicts with the SRV resolution and leaves the server in an undefined state. Most of the time, the SRV resolution remains enabled with no effect on the server (no update). Sometimes the A/AAAA resolution for the new fqdn is not enabled at all. It depends on the server state and resolver state when the CLI command is executed. This patch must be backported as far as 2.0 (maybe to 1.8 too ?) after some observation period.	2021-06-17 16:17:14 +02:00
Miroslav Zagorac	8a8f270f6a	CLEANUP: server: a separate function for initializing the per_thr field To avoid repeating the same source code, allocating memory and initializing the per_thr field from the server structure is transferred to a separate function.	2021-06-17 16:07:10 +02:00
Amaury Denoyelle	8ff0434b61	BUG/MEDIUM: server: do not auto insert a dynamic server in px addr_node Until then, the servers were automatically attached on their creation into the proxy addr_node tree via _srv_parse_init. In case of an invalid dynamic server which is instantly freed, no detach operation was made leaving a NULL server in the tree. Change this mode of operation by marking the attach operation as optional in _srv_parse_init. This operation is not conducted for a dynamic server. The server is attached only at the end of the CLI handler when it is marked as valid. This must be backported up to 2.4.	2021-06-15 11:42:53 +02:00
Amaury Denoyelle	1613b4a75d	BUG/MINOR: server: do not keep an invalid dynamic server in px ids tree A bug is present when trying to create a dynamic server with a fixed id. If the server is detected invalid due to a later parsing arguments error, the server is not removed from the proxy used ids tree before being freed. Change the mode of operation of 'id' keyword parsing handler. The insertion in the backend tree is removed from the handler and is not taken in charge by parse_server for configuration parsing. For the dynamic servers, the insertion is called at the end of the 'add server' CLI handler when the server has been validated. This must be backported up to 2.4.	2021-06-15 11:42:53 +02:00
Amaury Denoyelle	406aaef55a	BUG/MEDIUM: server: do not forget to generate the dynamic servers ids If no id is specified by the user for a dynamic server, it is necessary to generate a new one. This operation is now done at the end of 'add server' CLI handler. The server is then inserted into the proxy ids tree. Without this, several features may be broken for dynamic servers. Among them, there is the "first" lb algorithm, the persistence using stick-tables or the uniqueness internal check of srv_parse_id. This must be backported up to 2.4.	2021-06-15 11:42:53 +02:00
Amaury Denoyelle	82d7f77463	BUG/MEDIUM: server: clear dynamic srv on delete from proxy id/name trees Do not leave deleted server in used_server_id/used_server_addr backend trees. This might lead to crashes if a deleted server is used through these trees. At this moment, dynamic servers are only added in used_server_id if they have a fixed id. They are never inserted in used_server_addr as this code is missing. So these new delete instructions are noop. However, a fix will be provided soon to insert properly all dynamic servers in both used_server_id and used_server_addr trees so the deletion counterpart will be mandatory in the CLI server delete handler. This must be backported to 2.4.	2021-06-15 11:38:06 +02:00
Amaury Denoyelle	31ddd76fef	BUG/MEDIUM: server: extend thread-isolate over much of CLI 'add server' Some config parsing handlers were designed to be run at startup on a single-thread. When executing at runtime for dynamic servers, thread-safety is not guaranteed. This is the case for example in srv_parse_id which manipulates backend used_ids tree. One solution could be to add locks but it might be tricky to find all affected functions and it can be an easy source of deadlock. The other solution which has been chosen is to use thread-isolation over almost all of the cli_parse_add_server CLI handler. For now this solution is sufficient. If some users make heavy use of the 'add server', hurting the overall performance, it will be necessary to design a much thinner solution. This must be backported up to 2.4.	2021-06-15 11:19:43 +02:00
Emeric Brun	caef19e0c7	BUG/MAJOR: resolvers: segfault using server template without SRV RECORDs This patch fixes the issue by adding a test in srvrq before registering the server on it during server template init. This was a regression due to commit : 3406766d57fc11478d54a6fa2d048cbfe4524a4e This should be backported with this previous commit (until 2.0)	2021-06-14 11:04:02 +02:00
Emeric Brun	3406766d57	MEDIUM: resolvers: add a ref between servers and srv request or used SRV record This patch adds a ref into servers to register them onto the record answer item used to set their hostnames. It also adds a head list into 'srvrq' to register servers free to be assigned to a SRV record. A head of a tree is also added to srvrq to put servers which present a hostname in the server state file, to re-link them quickly to the matching record as soon as an item presents the same name. This results in better performances on SRV record response parsing. This is an optimization but it could avoid triggering haproxy's internal watchdog in some circumstances. And for this reason it should be backported as far as we can (2.0 ?)	2021-06-11 16:16:16 +02:00
Emeric Brun	bd78c912fd	MEDIUM: resolvers: add a ref on server to the used A/AAAA answer item This patch adds a head list into answer items on servers which use this record to set their IPs. It makes lookup on duplicated ip faster and allows checking immediately whether an item is still valid when renewing the IP. This results in better performances on A/AAAA resolutions. This is an optimization but it could avoid triggering haproxy's internal watchdog in some circumstances. And for this reason it should be backported as far as we can (2.0 ?)	2021-06-11 16:16:16 +02:00
Emeric Brun	12ca658dbe	BUG/MINOR: resolvers: answser item list was randomly purged or errors In case of SRV records, the answer item list was purged by the error callback of the first requester which considers the error could not be safely ignored. It makes this item list unavailable for subsequent requesters even if they consider the error could be ignored. On A resolution or do_resolve action error, the answer items were never trashed. This patch reworks the error callbacks and the code to check the return code. If a callback returns 1, we consider the error was ignored and the answer item list must be kept. Conversely, if all error callbacks of all requesters of the same resolution return 0, the list will be purged. This patch should be backported as far as 2.0.	2021-06-11 16:16:16 +02:00
Amaury Denoyelle	efbf35caf9	BUG/MINOR: server: explicitly set "none" init-addr for dynamic servers Define srv.init_addr_methods to SRV_IADDR_NONE on 'add server' CLI handler. This explicitly states that no resolution will be made on the server creation. This is not a real bug as the default value (SRV_IADDR_END) has the same effect in practice. However the intent is clearer and prevent to use the default "libc,last" by mistake which cannot execute on runtime (blocking call + file access via gethostbyname/getaddrinfo). The doc is also updated to reflect this limitation. This should be backported up to 2.4.	2021-06-10 17:44:05 +02:00
Amaury Denoyelle	5e560e80c7	MINOR: server: use ha_alert in server parsing functions Replace memprintf usage in _srv_parse* functions by ha_alert calls. This has the advantage to simplify the function prototype by removing an extra char** argument. As a consequence, the CLI handler of 'add server' is updated to output the user messages buffers if not empty.	2021-06-07 17:19:33 +02:00
Amaury Denoyelle	9d0138ab08	MINOR: server: use parsing ctx for server init addr Initialize the parsing context in srv_init_addr. This function is called after configuration check. This will standardize the stderr output on startup with the parse_server function.	2021-06-07 17:19:30 +02:00
Amaury Denoyelle	0fc136ce5b	REORG: server: use parsing ctx for server parsing Use the parsing context in parse_server. Remove redundant manual format-string specifying the current file/line/server parsed.	2021-06-07 17:19:24 +02:00
Amaury Denoyelle	c008a63582	CLEANUP: server: fix cosmetic of error message on sni parsing Fix memprintf used in server_parse_sni_expr. Error messages should not be ending with a newline as it will be inserted in the parent function on the ha_alert invocation.	2021-06-07 16:58:16 +02:00
Remi Tricot-Le Breton	f1800e64ef	BUG/MINOR: server: Missing calloc return value check in srv_parse_source Two calloc calls were not checked in the srv_parse_source function. Considering that this function could be called at runtime through a dynamic server creation via the CLI, this could lead to an unfortunate crash. It was raised in GitHub issue #1233. It could be backported to all stable branches even though the runtime crash could only happen on branches where dynamic server creation is possible.	2021-05-31 10:50:32 +02:00
Amaury Denoyelle	79a88ba3d0	BUG/MAJOR: server: prevent deadlock when using 'set maxconn server' A deadlock is possible with 'set maxconn server' command, if there is pending connection ready to be dequeued. This is caused by the locking of server spinlock in both cli_parse_set_maxconn_server and process_srv_queue. Fix this by reducing the scope of the server lock into server_parse_maxconn_change_request. If connection are dequeued, the lock is taken a second time. This can be seen as suboptimal but as it happens only during 'set maxconn server' it can be considered as tolerable. This issue was reported on the mailing list, for the 1.8.x branch. It must be backported up to the 1.8.	2021-05-19 17:52:05 +02:00
Willy Tarreau	b00a8e30f1	BUILD: server: include missing proxy.h in server.c It's needed for a number of functions and definitions but was missing.	2021-05-08 20:24:09 +02:00
Willy Tarreau	ba6300ea62	BUILD: server: include tools.h from server.c A lot of functions from tools.h are used there but the file was only inherited via other ones.	2021-05-08 19:37:41 +02:00
Amaury Denoyelle	24abb0cdc1	BUG/MINOR: server: do not report diag for peer servers with null weight Only check servers attached to a proxy with PR_CAP_LB. This does not need to be backported as the diag message was added in the current 2.4-dev branch.	2021-05-07 15:20:54 +02:00
Willy Tarreau	b205bfdab7	CLEANUP: cli/tree-wide: properly re-align the CLI commands' help messages There were 102 CLI commands whose help were zig-zagging all along the dump making them unreadable. This patch realigns all these messages so that the command now uses up to 40 characters before the delimiting colon. About a third of the commands did not correctly list their arguments which were added after the first version, so they were all updated. Some abuses of the term "id" were fixed to use a more explanatory term. The "set ssl ocsp-response" command was not listed because it lacked a help message, this was fixed as well. The deprecated enable/disable commands for agent/health/server were prominently written as deprecated. Whenever possible, clearer explanations were provided.	2021-05-07 11:51:26 +02:00
Amaury Denoyelle	3109ccfe70	MINOR: srv: close all idle connections on shutdown Implement a function to close all server idle connections. This function is called via a global deinit server handler. The main objective is to avoid leaving sockets in TIME_WAIT state. To limit the set of operations on shutdown and prevent task rescheduling, only the ctrl stack closing is done.	2021-05-05 14:33:51 +02:00
Amaury Denoyelle	eafd701dc5	MINOR: server: fix doc/trace on lb algo for dynamic server creation The text mentioned that only backends with consistent hash method were supported for dynamic servers. In fact, it is only required that the lb algorithm is dynamic.	2021-04-29 14:59:42 +02:00
Amaury Denoyelle	d6b4b6da3f	BUG/MINOR: server: fix potential null gcc error in delete server gcc still reports a potential null pointer dereference in delete server function even with a BUG_ON before it. Remove the misleading NULL check in the for loop which should never happen. This does not need to be backported.	2021-04-21 12:02:30 +02:00
Amaury Denoyelle	e558043e13	MINOR: server: implement delete server cli command Implement a new CLI command 'del server'. It can be used to remove a dynamically added server. Only servers in maintenance mode can be removed, and without pending/active/idle connection on it. Add a new reg-test for this feature. The scenario of the reg-test needs to first add a dynamic server. It is then deleted and a client is used to ensure that the server is non joinable. The management doc is updated with the new command 'del server'.	2021-04-21 11:00:31 +02:00
Amaury Denoyelle	d38e7fa233	MINOR: server: add log on dynamic server creation Add a notice log to report the creation of a new server. The log is printed at the end of the function.	2021-04-21 11:00:31 +02:00
Amaury Denoyelle	cece918625	BUG/MEDIUM: server: ensure thread-safety of server runtime creation cli_parse_add_server can be executed in parallel by several CLI instances and so must be thread-safe. The critical points of the function are : - server duplicate detection - insertion of the server in the proxy list The mode of operation has been reversed. The server is first instantiated and parsed. The duplicate check has been moved at the end just before the insertion in the proxy list, under the thread isolation. Thus, the thread safety is guaranteed and server allocation is kept outside of locks/thread isolation.	2021-04-21 11:00:30 +02:00
Amaury Denoyelle	fb247946a1	BUG/MINOR: server: free srv.lb_nodes in free_server lb_nodes is allocated for servers using lb_chash (balance random or hash-type consistent). It can be backported up to 1.8.	2021-04-21 11:00:03 +02:00
Willy Tarreau	2b71810cb3	CLEANUP: lists/tree-wide: rename some list operations to avoid some confusion The current "ADD" vs "ADDQ" is confusing because when thinking in terms of appending at the end of a list, "ADD" naturally comes to mind, but here it does the opposite, it inserts. Several times already it's been incorrectly used where ADDQ was expected, the latest of which was a fortunate accident explained in 6fa922562 ("CLEANUP: stream: explain why we queue the stream at the head of the server list"). Let's use more explicit (but slightly longer) names now: LIST_ADD -> LIST_INSERT LIST_ADDQ -> LIST_APPEND LIST_ADDED -> LIST_INLIST LIST_DEL -> LIST_DELETE The same is true for MT_LISTs, including their "TRY" variant. LIST_DEL_INIT keeps its short name to encourage to use it instead of the lazier LIST_DELETE which is often less safe. The change is large (~674 non-comment entries) but is mechanical enough to remain safe. No permutation was performed, so any out-of-tree code can easily map older names to new ones. The list doc was updated.	2021-04-21 09:20:17 +02:00
Willy Tarreau	dcb121fd9c	BUG/MINOR: server: make srv_alloc_lb() allocate lb_nodes for consistent hash The test in srv_alloc_lb() to allocate the lb_nodes[] array used in the consistent hash was incorrect, it wouldn't do it for consistent hash and could do it for regular random. No backport is needed as this was added for dynamic servers in 2.4-dev by commit f99f77a50 ("MEDIUM: server: implement 'add server' cli command").	2021-04-20 11:39:54 +02:00
Willy Tarreau	14015b8880	MINOR: server: move idle_conn_task to read_mostly This pointer is used when adding connections to the idle list and is never changed, let's move it to the read_mostly section.	2021-04-10 19:27:41 +02:00
Amaury Denoyelle	da0e7f61e0	MINOR: server: diag for 0 weight server Output a diagnostic report if a server has been configured with a null weight.	2021-04-01 18:03:37 +02:00
Ilya Shipitsin	ba13f16aa2	CLEANUP: assorted typo fixes in the code and comments This is 21st iteration of typo fixes	2021-03-20 09:28:58 +01:00
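
Several of the commits above (c7b391aed2, 0ba54bb401) revolve around lock ordering between the resolver's lock and the server's lock. As a rough, language-neutral illustration of that idea only, here is a minimal Python sketch, not HAProxy's C code. The names `Server`, `resolver_update`, `cli_set_fqdn` and `run` are hypothetical and exist only for this example. Both code paths take the resolver lock first and the server lock second, so neither thread can hold one lock while waiting for the other in the opposite order.

```python
import threading


class Server:
    """Hypothetical stand-in for a server object protected by its own lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.fqdn = "old.example"
        self.updates = 0


def resolver_update(resolver_lock, srv):
    # Resolution path: resolver lock first, then server lock.
    with resolver_lock:
        with srv.lock:
            srv.updates += 1


def cli_set_fqdn(resolver_lock, srv, name):
    # CLI path: same order as the resolution path (resolver, then server).
    # Taking srv.lock first here would recreate the ABBA pattern.
    with resolver_lock:
        with srv.lock:
            srv.fqdn = name
            srv.updates += 1


def run(iterations=1000):
    """Run both paths concurrently and return the resulting server state."""
    resolver_lock = threading.Lock()
    srv = Server()

    def resolver_loop():
        for _ in range(iterations):
            resolver_update(resolver_lock, srv)

    def cli_loop():
        for _ in range(iterations):
            cli_set_fqdn(resolver_lock, srv, "new.example")

    threads = [threading.Thread(target=resolver_loop),
               threading.Thread(target=cli_loop)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return srv
```

Calling `run()` should always terminate, because every path acquires the two locks in the same order. Swapping the order in only one of the two functions is what makes a deadlock possible.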


787 Commits