haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-09 00:27:08 +02:00

Author	SHA1	Message	Date
Amaury Denoyelle	14c3c5c121	MEDIUM: server: allow to remove servers at runtime except non purgeable Relax the condition on "delete server" CLI handler to be able to remove all servers, even non dynamic, except if they are flagged as non purgeable. This change is necessary to extend the use cases for dynamic servers with reload. It's expected that each dynamic server created via the CLI is manually committed in the haproxy configuration by the user. Dynamic servers will be present on reload only if they are present in the configuration file. This means that non-dynamic servers must be allowed to be removable at runtime. The dynamic servers removal reg-test has been updated and renamed to reflect its purpose. A new test is present to check that non-purgeable servers cannot be removed.	2021-08-25 15:53:54 +02:00
Amaury Denoyelle	0626961ad3	MINOR: server: mark referenced servers as non purgeable Mark servers that are referenced by configuration elements as non purgeable. This includes the following list : - tracked servers - servers referenced in a use-server rule - servers referenced in a sample fetch	2021-08-25 15:53:54 +02:00
Amaury Denoyelle	bc2ebfa5a4	MEDIUM: server: extend refcount for all servers In a future patch, it will be possible to remove every server at runtime, both static and dynamic. This requires to extend the server refcount for all instances. First, refcount manipulation functions have been renamed to better express the API usage. * srv_refcount_use -> srv_take The refcount is always initialized to 1 on the server creation in new_server. It's also incremented for each check/agent configured on a server instance. * free_server -> srv_drop This decrements the refcount and if null, the server is freed, so code calling it must not use the server reference after it. As a bonus, this function now returns the next server instance. This is useful when calling on the server loop without having to save the next pointer before each invocation. In these functions, remove the checks that prevent refcount on non-dynamic servers. Each reference to "dynamic" in variable/function naming has been eliminated as well.	2021-08-25 15:53:54 +02:00
Amaury Denoyelle	0a8d05d31c	BUG/MINOR: stats: use refcount to protect dynamic server on dump A dynamic server may be deleted at runtime at the same moment when the stats applet is pointing to it. Use the server refcount to prevent deletion in this case. This should be backported up to 2.4, with an observability period of 2 weeks. Note that it requires the dynamic server refcounting feature which has been implemented on 2.5; the following commits are required : - MINOR: server: implement a refcount for dynamic servers - BUG/MINOR: server: do not use refcount in free_server in stopping mode - MINOR: server: return the next srv instance on free_server	2021-08-25 15:53:43 +02:00
Amaury Denoyelle	f5c1e12e44	MINOR: server: return the next srv instance on free_server As a convenience, return the next server instance from servers list on free_server. This is particularly useful when using this function on the servers list without having to save the next pointer before calling it.	2021-08-25 15:29:19 +02:00
William Lallemand	4c395fce21	MINOR: server: check if srv is NULL in free_server() Check if srv is NULL before trying to do anything in free_server(), like most free()-like functions do.	2021-08-20 10:20:51 +02:00
Amaury Denoyelle	3eb42f91d9	BUG/MEDIUM: server: support both check/agent-check on a dynamic instance A static server is able to support simultaneously both health check and agent-check. Adjust the dynamic server CLI handlers to also support this configuration. This should not be backported, unless dynamic server checks are backported.	2021-08-11 14:41:47 +02:00
Amaury Denoyelle	13f2e2ceeb	BUG/MINOR: server: do not use refcount in free_server in stopping mode Currently there is a leak at process shutdown with dynamic servers with check/agent-check activated. Check purges are not executed on process stopping, so the server is not liberated due to its refcount. The solution is simply to ignore the refcount on process stopping mode and free the server on the first free_server invocation. This should not be backported, unless dynamic server checks are backported. In this case, the following commit must be backported first. `7afa5c1843` MINOR: global: define MODE_STOPPING	2021-08-09 17:53:30 +02:00
Amaury Denoyelle	b65f4cab6a	MEDIUM: server: implement agent check for dynamic servers This commit is the counterpart for agent check of "MEDIUM: server: implement check for dynamic servers". The "agent-check" keyword is enabled for dynamic servers. The agent check must manually be activated via "enable agent" CLI. This can enable the dynamic server if the agent response is "ready" without an explicit "enable server" CLI.	2021-08-06 11:09:48 +02:00
Amaury Denoyelle	2fc4d39577	MEDIUM: server: implement check for dynamic servers Implement check support for dynamic servers. The "check" keyword is now enabled for dynamic servers. If used, the server check is initialized and the check task started in the "add server" CLI handler. The check is explicitly disabled and must be manually activated via "enable health" CLI handler. The dynamic server refcount is incremented if a check is configured. On "delete server" handler, the check is purged, which decrements the refcount.	2021-08-06 11:09:48 +02:00
Amaury Denoyelle	d6b7080cec	MINOR: server: implement a refcount for dynamic servers It is necessary to have a refcount mechanism on dynamic servers to be able to enable check support. Indeed, when deleting a dynamic server with check activated, the check will be asynchronously removed. This is mandatory to properly free the check resources in a thread-safe manner. The server instance must be kept alive for this.	2021-08-06 11:09:48 +02:00
Amaury Denoyelle	fca18172d9	MINOR: server: initialize fields for dynamic server check Set default inter/rise/fall values for dynamic servers check/agent. This is required because dynamic servers do not inherit from a default-server.	2021-08-06 11:08:04 +02:00
Amaury Denoyelle	c755efd5c6	MINOR: server: unmark deprecated on enable health/agent cli Remove the "DEPRECATED" marker on "enable/disable health/agent" commands. Their purpose is to toggle the check/agent on a server. These commands are still useful because their purpose is not covered by the "set server" command. Mostly, there was confusion with the 'set server health/agent' commands, which in fact serve another goal. Note that the indication "use 'set server' instead" has been added since 2016 on the commit `2c04eda8b5` REORG: cli: move "{enable|disable} health" to server.c and `58d9cb7d22` REORG: cli: move "{enable|disable} agent" to server.c Besides, these commands will become required to enable check/agent on dynamic servers which will be created with check disabled. This should be backported up to 2.4.	2021-08-06 10:09:50 +02:00
Willy Tarreau	d332f1396b	BUG/MINOR: server: update last_change on maint->ready transitions too Nenad noticed that when leaving maintenance, the servers' last_change field was not updated. This is visible in the Status column of the stats page in front of the state, as the cumulated time spent in the current state is wrong, it starts from the last transition (typically ready->maint). In addition, the backend's state was not updated either, because the down transition is performed by set_backend_down() which also emits a log, and it is this function which was extended to update the backend's last_change, but it's not called for down->up transitions so that was not done. The most visible (and unpleasant) effect of this bug is that it affects slowstart so such a server could immediately restart with a significant load ratio. This should likely be backported to all stable releases.	2021-08-04 19:41:01 +02:00
Amaury Denoyelle	bd8dd841e5	BUG/MINOR: server: remove srv from px list on CLI 'add server' error If an error occurred during the CLI 'add server' handler, the newly created server must be removed from the proxy list if already inserted. Currently, this can happen on the extremely rare error during server id generation if there is no id left. The removal operation is not thread-safe, it must be conducted before releasing the thread isolation. This can be backported up to 2.4. Please note that dynamic server track is not implemented in 2.4, so the release_server_track invocation must be removed for the backport to prevent a compilation error.	2021-08-04 14:57:06 +02:00
Willy Tarreau	ba3ab7907a	MEDIUM: servers: make the server deletion code run under full thread isolation In 2.4, runtime server deletion was brought by commit `e558043e1` ("MINOR: server: implement delete server cli command"). A comment remained in the code about a theoretical race between the thread_isolate() call and another thread being in the process of allocating memory before accessing the server via a reference that was grabbed before the memory allocation, since the thread_harmless_now()/thread_harmless_end() pair around mmap() may have the effect of allowing cli_parse_delete_server() to proceed. Now that the full thread isolation is available, let's update the code to rely on this. Now it is guaranteed that competing threads will either be in the poller or queued in front of thread_isolate_full(). This may be backported to 2.4 if any report of breakage suggests the bug really exists, in which case the two following patches will also be needed: MINOR: threads: make thread_release() not wait for other ones to complete MEDIUM: threads: add a stronger thread_isolate_full() call	2021-08-04 14:49:36 +02:00
Amaury Denoyelle	08be72b827	BUG/MINOR: server: fix race on error path of 'add server' CLI if track If an error occurs during a dynamic server creation with tracking, it must be removed from the tracked list. This operation is not thread-safe and thus must be conducted under the thread isolation. Track support for dynamic servers has been introduced in this release. This does not need to be backported.	2021-08-04 09:18:12 +02:00
Amaury Denoyelle	56eb8ed37d	MEDIUM: server: support track keyword for dynamic servers Allow the usage of the 'track' keyword for dynamic servers. On server deletion, the server is properly removed from the tracking chain to prevent NULL pointer dereferencing.	2021-07-16 10:22:58 +02:00
Amaury Denoyelle	79f68be207	MINOR: srv: do not allow to track a dynamic server Prevents the use of the "track" keyword for a dynamic server. This simplifies the deletion of a dynamic server, without having to worry about servers which might track it. A BUG_ON is present in the dynamic server delete function to validate this assertion.	2021-07-16 10:08:55 +02:00
Amaury Denoyelle	669b620e5f	MINOR: srv: extract tracking server config function Extract the post-config tracking setup in a dedicated function srv_apply_track. This will be useful to implement track support for dynamic servers.	2021-07-16 10:08:55 +02:00
Remi Tricot-Le Breton	0498fa4059	BUG/MINOR: ssl: Default-server configuration ignored by server When a default-server line specified a client certificate to use, the frontend would not take it into account and create an empty SSL context, which would raise an error on the backend side ("peer did not return a certificate"). This bug was introduced by `d817dc733e` in which the SSL contexts are created earlier than before (during the default-server line parsing) without setting it in the corresponding server structures. It then made the server create an empty SSL context in ssl_sock_prepare_srv_ctx because it thought it needed one. It was raised on redmine, in Bug #3906. It can be backported to 2.4.	2021-07-13 18:35:38 +02:00
Christopher Faulet	81ba74ae50	BUG/MEDIUM: resolvers: Make 1st server of a template take part to SRV resolution The commit `3406766d5` ("MEDIUM: resolvers: add a ref between servers and srv request or used SRV record") introduced a regression. The first server of a template based on SRV record is no longer resolved. The same bug exists for a normal server based on a SRV record. In fact, the server used during parsing (used as reference when a server-template line is parsed) is never attached to the corresponding srvrq object. Thus with following lines, no resolution is performed because "srvrq->attached_servers" is empty: server-template test 1 _http.domain.tld resolvers dns ... server test1 _http.domain.tld resolvers dns ... This patch should fix the issue #1295 (but not confirmed yet it is the same bug). It must be backported everywhere the above commit is.	2021-06-29 20:52:37 +02:00
Christopher Faulet	07ecff589d	MINOR: resolvers: Reset server IP on error in resolv_get_ip_from_response() If resolv_get_ip_from_response() returns an error (or an unexpected return value), the server is set to RMAINT status. However, its address must also be reset. Otherwise, it is still reported by the cli on "show servers state" commands. This may be confusing. Note that it is a theoretical patch because this code path does not exist. Thus it is not tagged as a BUG. This patch may be backported as far as 2.0.	2021-06-24 17:22:36 +02:00
Christopher Faulet	a8ce497aac	BUG/MINOR: resolvers: Reset server IP when no ip is found in the response For A/AAAA resolution, if no ip is found for a server in the response, the server is set to RMAINT status. However, its address must also be reset. Otherwise, it is still reported by the cli on "show servers state" commands. This may be confusing. This patch may be backported as far as 2.0.	2021-06-24 17:22:36 +02:00
Willy Tarreau	cdc83e0192	MINOR: queue: add a pointer to the server and the proxy in the queue A queue is specific to a server or a proxy, so we don't need to place this distinction inside all pendconns, it can be in the queue itself. This commit adds the relevant fields "px" and "sv" into the struct queue, and initializes them accordingly.	2021-06-24 10:52:31 +02:00
Willy Tarreau	df3b0cbe31	MINOR: queue: add queue_init() to initialize a queue This is better and cleaner than open-coding this in the server and proxy code, where it has all chances of becoming wrong once forgotten.	2021-06-24 10:52:31 +02:00
Willy Tarreau	9ab78293bf	MEDIUM: queue: simplify again the process_srv_queue() API (v2) This basically undoes the API changes that were performed by commit `0274286dd` ("BUG/MAJOR: server: fix deadlock when changing maxconn via agent-check") to address the deadlock issue: since process_srv_queue() doesn't use the server lock anymore, it doesn't need the "server_locked" argument, so let's get rid of it before it gets used again.	2021-06-24 10:52:31 +02:00
Willy Tarreau	16fbdda3c3	MEDIUM: queue: use a dedicated lock for the queues (v2) Till now whenever a server or proxy's queue was touched, this server or proxy's lock was taken. Not only this requires distinct code paths, but it also causes unnecessary contention with other uses of these locks. This patch adds a lock inside the "queue" structure that will be used the same way by the server and the proxy queuing code. The server used to use a spinlock and the proxy an rwlock, though the queue only used it for locked writes. This new version uses a spinlock since we don't need the read lock part here. Tests have not shown any benefit nor cost in using this one versus the rwlock so we could change later if needed. The lower contention on the locks increases the performance from 362k to 374k req/s on 16 threads with 20 servers and leastconn. The gain with roundrobin even increases by 9%. This is tagged medium because the lock is changed, but no other part of the code touches the queues, with nor without locking, so this should remain invisible.	2021-06-24 10:52:31 +02:00
Willy Tarreau	3f70fb9ea2	Revert "MEDIUM: queue: use a dedicated lock for the queues" This reverts commit `fcb8bf8650`. The recent changes since `5304669e1` MEDIUM: queue: make pendconn_process_next_strm() only return the pendconn opened a tiny race condition between stream_free() and process_srv_queue(), as the pendconn is accessed outside of the lock, possibly while it's being freed. A different approach is required.	2021-06-24 07:26:28 +02:00
Willy Tarreau	ccd85a3e08	Revert "MEDIUM: queue: simplify again the process_srv_queue() API" This reverts commit `c83e45e9b0`. The recent changes since `5304669e1` MEDIUM: queue: make pendconn_process_next_strm() only return the pendconn opened a tiny race condition between stream_free() and process_srv_queue(), as the pendconn is accessed outside of the lock, possibly while it's being freed. A different approach is required.	2021-06-24 07:22:18 +02:00
Willy Tarreau	c83e45e9b0	MEDIUM: queue: simplify again the process_srv_queue() API This basically undoes the API changes that were performed by commit `0274286dd` ("BUG/MAJOR: server: fix deadlock when changing maxconn via agent-check") to address the deadlock issue: since process_srv_queue() doesn't use the server lock anymore, it doesn't need the "server_locked" argument, so let's get rid of it before it gets used again.	2021-06-22 18:57:15 +02:00
Willy Tarreau	fcb8bf8650	MEDIUM: queue: use a dedicated lock for the queues Till now whenever a server or proxy's queue was touched, this server or proxy's lock was taken. Not only this requires distinct code paths, but it also causes unnecessary contention with other uses of these locks. This patch adds a lock inside the "queue" structure that will be used the same way by the server and the proxy queuing code. The server used to use a spinlock and the proxy an rwlock, though the queue only used it for locked writes. This new version uses a spinlock since we don't need the read lock part here. Tests have not shown any benefit nor cost in using this one versus the rwlock so we could change later if needed. The lower contention on the locks increases the performance from 491k to 507k req/s on 16 threads with 20 servers and leastconn. The gain with roundrobin even increases by 6%. The performance profile changes from this: 13.03% haproxy [.] fwlc_srv_reposition 8.08% haproxy [.] fwlc_get_next_server 3.62% haproxy [.] process_srv_queue 1.78% haproxy [.] pendconn_dequeue 1.74% haproxy [.] pendconn_add to this: 11.95% haproxy [.] fwlc_srv_reposition 7.57% haproxy [.] fwlc_get_next_server 3.51% haproxy [.] process_srv_queue 1.74% haproxy [.] pendconn_dequeue 1.70% haproxy [.] pendconn_add At this point the differences are mostly measurement noise. This is tagged medium because the lock is changed, but no other part of the code touches the queues, with nor without locking, so this should remain invisible.	2021-06-22 18:43:56 +02:00
Willy Tarreau	a05704582c	MINOR: server: replace the pendconns-related stuff with a struct queue Just like for proxies, all three elements (pendconns, nbpend, queue_idx) were moved to struct queue.	2021-06-22 18:43:14 +02:00
Amaury Denoyelle	0274286dd3	BUG/MAJOR: server: fix deadlock when changing maxconn via agent-check The server_parse_maxconn_change_request locks the server lock. However, this function can be called via agent-checks or lua code which already lock it. This bug has been introduced by the following commit : commit `79a88ba3d0` BUG/MAJOR: server: prevent deadlock when using 'set maxconn server' This commit tried to fix another deadlock which can occur because previously server_parse_maxconn_change_request required the server lock to be held. However, it may call internally process_srv_queue which also locks the server lock. The locking policy has thus been updated. The fix is functional for the CLI 'set maxconn' but fails to address the agent-check / lua counterparts. This new issue is fixed in two steps : - changes from the above commit have been reverted. This means that server_parse_maxconn_change_request must again be called with the server lock. - to counter the deadlock fixed by the above commit, process_srv_queue now takes an argument to render the server locking optional if the caller already held it. This is only used by server_parse_maxconn_change_request. The above commit was subject to backport up to 1.8. Thus this commit must be backported in every release where it is already present.	2021-06-22 11:39:20 +02:00
Amaury Denoyelle	34897d2eff	MINOR: ssl: support ssl keyword for dynamic servers Activate the 'ssl' keyword for dynamic servers. This is the final step to have ssl dynamic servers feature implemented. If activated, ssl_sock_prepare_srv_ctx will be called at the end of the 'add server' CLI handler. At the same time, update the management doc to list all ssl keywords implemented for dynamic servers.	2021-06-18 16:42:26 +02:00
Amaury Denoyelle	b89d3d3de7	MINOR: server: disable CLI 'set server ssl' for dynamic servers 'set server ssl' uses ssl parameters from default-server. As dynamic servers do not reuse any default-server parameters, this command makes no sense for them.	2021-06-18 16:42:25 +02:00
Christopher Faulet	0ba54bb401	BUG/MINOR: server/cli: Fix locking in function processing "set server" command The commit `c7b391aed` ("BUG/MEDIUM: server/cli: Fix ABBA deadlock when fqdn is set from the CLI") introduced 2 bugs. The first one is a typo on the server's lock label (s/SERVER_UNLOCK/SERVER_LOCK/). The second one is about the server's lock itself. It must be acquired to execute the "agent-send" subcommand. The patch above is marked to be backported as far as 1.8. Thus, this one must also be backported as far as 1.8. BUG/MINOR: server/cli: Don't forget to lock server on agent-send subcommand	2021-06-18 09:16:32 +02:00
Christopher Faulet	dcac418062	BUG/MEDIUM: resolvers: Add a task on servers to check SRV resolution status When a server relies on a SRV resolution, a task is created to clean it up (fqdn/port and address) when the SRV resolution is considered as outdated (based on the resolvers 'timeout' value). It is only possible if the server inherits outdated info from a state file and is no longer selected to be attached to a SRV item. Note that most of the time, a server is attached to a SRV item. Thus when the item becomes obsolete, the server is cleaned up. It is important to have such task to be sure the server will be free again to have a chance to be resolved again with fresh information. Of course, this patch is a workaround to solve a design issue. But there is no other obvious way to fix it without rewriting all the resolvers part. And it must be backportable. This patch relies on following commits: * MINOR: resolvers: Clean server in a dedicated function when removing a SRV item * MINOR: resolvers: Remove server from named_servers tree when removing a SRV item All the series must be backported as far as 2.2 after some observation period. Backports to 2.0 and 1.8 must be evaluated.	2021-06-17 16:52:35 +02:00
Christopher Faulet	c7b391aed2	BUG/MEDIUM: server/cli: Fix ABBA deadlock when fqdn is set from the CLI To perform servers resolution, the resolver's lock is first acquired then the server's lock when necessary. However, when the fqdn is set via the CLI, the opposite is performed. So, it is possible to experience an ABBA deadlock. To fix this bug, the server's lock is acquired and released for each subcommand of "set server" with an exception when the fqdn is set. The resolver's lock is first acquired. Of course, this means we must be sure to have a resolver to lock. This patch must be backported as far as 1.8.	2021-06-17 16:52:14 +02:00
Christopher Faulet	a386e78823	BUG/MINOR: server: Forbid to set fqdn on the CLI if SRV resolution is enabled If a server is configured to rely on a SRV resolution, we must forbid to change its fqdn on the CLI. Indeed, in this case, the server retrieves its fqdn from the SRV resolution. If the fqdn is changed via the CLI, this conflicts with the SRV resolution and leaves the server in an undefined state. Most of the time, the SRV resolution remains enabled with no effect on the server (no update). Sometimes the A/AAAA resolution for the new fqdn is not enabled at all. It depends on the server state and resolver state when the CLI command is executed. This patch must be backported as far as 2.0 (maybe to 1.8 too ?) after some observation period.	2021-06-17 16:17:14 +02:00
Miroslav Zagorac	8a8f270f6a	CLEANUP: server: a separate function for initializing the per_thr field To avoid repeating the same source code, allocating memory and initializing the per_thr field from the server structure is transferred to a separate function.	2021-06-17 16:07:10 +02:00
Amaury Denoyelle	8ff0434b61	BUG/MEDIUM: server: do not auto insert a dynamic server in px addr_node Until then, the servers were automatically attached on their creation into the proxy addr_node tree via _srv_parse_init. In case of an invalid dynamic server which is instantly freed, no detach operation was made leaving a NULL server in the tree. Change this mode of operation by marking the attach operation as optional in _srv_parse_init. This operation is not conducted for a dynamic server. The server is attached only at the end of the CLI handler when it is marked as valid. This must be backported up to 2.4.	2021-06-15 11:42:53 +02:00
Amaury Denoyelle	1613b4a75d	BUG/MINOR: server: do not keep an invalid dynamic server in px ids tree A bug is present when trying to create a dynamic server with a fixed id. If the server is detected invalid due to a later parsing arguments error, the server is not removed from the proxy used ids tree before being freed. Change the mode of operation of 'id' keyword parsing handler. The insertion in the backend tree is removed from the handler and is not taken in charge by parse_server for configuration parsing. For the dynamic servers, the insertion is called at the end of the 'add server' CLI handler when the server has been validated. This must be backported up to 2.4.	2021-06-15 11:42:53 +02:00
Amaury Denoyelle	406aaef55a	BUG/MEDIUM: server: do not forget to generate the dynamic servers ids If no id is specified by the user for a dynamic server, it is necessary to generate a new one. This operation is now done at the end of 'add server' CLI handler. The server is then inserted into the proxy ids tree. Without this, several features may be broken for dynamic servers. Among them, there is the "first" lb algorithm, the persistence using stick-tables or the uniqueness internal check of srv_parse_id. This must be backported up to 2.4.	2021-06-15 11:42:53 +02:00
Amaury Denoyelle	82d7f77463	BUG/MEDIUM: server: clear dynamic srv on delete from proxy id/name trees Do not leave deleted server in used_server_id/used_server_addr backend trees. This might lead to crashes if a deleted server is used through these trees. At this moment, dynamic servers are only added in used_server_id if they have a fixed id. They are never inserted in used_server_addr as this code is missing. So these new delete instructions are noop. However, a fix will be provided soon to insert properly all dynamic servers in both used_server_id and used_server_addr trees so the deletion counterpart will be mandatory in the CLI server delete handler. This must be backported to 2.4.	2021-06-15 11:38:06 +02:00
Amaury Denoyelle	31ddd76fef	BUG/MEDIUM: server: extend thread-isolate over much of CLI 'add server' Some config parsing handlers were designed to be run at startup on a single-thread. When executing at runtime for dynamic servers, thread-safety is not guaranteed. This is the case for example in srv_parse_id which manipulates backend used_ids tree. One solution could be to add locks but it might be tricky to find all affected functions and it can be an easy source of deadlock. The other solution which has been chosen is to use thread-isolation over almost all of the cli_parse_add_server CLI handler. For now this solution is sufficient. If some users make heavy use of the 'add server', hurting the overall performance, it will be necessary to design a much thinner solution. This must be backported up to 2.4.	2021-06-15 11:19:43 +02:00
Emeric Brun	caef19e0c7	BUG/MAJOR: resolvers: segfault using server template without SRV RECORDs This patch fixes the issue by adding a test in srvrq before registering the server on it during server template init. This was a regression due to commit : `3406766d57` This should be backported with this previous commit (until 2.0)	2021-06-14 11:04:02 +02:00
Emeric Brun	3406766d57	MEDIUM: resolvers: add a ref between servers and srv request or used SRV record This patch adds a ref into servers to register them onto the record answer item used to set their hostnames. It also adds a head list into 'srvrq' to register servers free to be affected to a SRV record. A head of a tree is also added to srvrq to put servers which present a hostname in the server state file, to re-link them quickly to the matching record as soon as an item presents the same name. This results in better performances on SRV record response parsing. This is an optimization but it could avoid to trigger the haproxy's internal watchdog in some circumstances. And for this reason it should be backported as far as we can (2.0 ?)	2021-06-11 16:16:16 +02:00
Emeric Brun	bd78c912fd	MEDIUM: resolvers: add a ref on server to the used A/AAAA answer item This patch adds a head list into answer items on servers which use this record to set their IPs. It makes lookup on duplicated ip faster and allows to check immediately if an item is still valid renewing the IP. This results in better performances on A/AAAA resolutions. This is an optimization but it could avoid to trigger the haproxy's internal watchdog in some circumstances. And for this reason it should be backported as far as we can (2.0 ?)	2021-06-11 16:16:16 +02:00
Emeric Brun	12ca658dbe	BUG/MINOR: resolvers: answser item list was randomly purged or errors In case of SRV records, the answer item list was purged by the error callback of the first requester which considers the error could not be safely ignored. It makes this item list unavailable for subsequent requesters even if they consider the error could be ignored. On A resolution or do_resolve action error, the answer items were never trashed. This patch reworks the error callbacks and the code to check the return code. If a callback returns 1, we consider the error was ignored and the answer item list must be kept. At the opposite, if all error callbacks of all requesters of the same resolution return 0, the list will be purged. This patch should be backported as far as 2.0.	2021-06-11 16:16:16 +02:00
Amaury Denoyelle	efbf35caf9	BUG/MINOR: server: explicitly set "none" init-addr for dynamic servers Define srv.init_addr_methods to SRV_IADDR_NONE on 'add server' CLI handler. This explicitly states that no resolution will be made on the server creation. This is not a real bug as the default value (SRV_IADDR_END) has the same effect in practice. However the intent is clearer and prevent to use the default "libc,last" by mistake which cannot execute on runtime (blocking call + file access via gethostbyname/getaddrinfo). The doc is also updated to reflect this limitation. This should be backported up to 2.4.	2021-06-10 17:44:05 +02:00
Amaury Denoyelle	5e560e80c7	MINOR: server: use ha_alert in server parsing functions Replace memprintf usage in _srv_parse* functions by ha_alert calls. This has the advantage to simplify the function prototype by removing an extra char** argument. As a consequence, the CLI handler of 'add server' is updated to output the user messages buffers if not empty.	2021-06-07 17:19:33 +02:00
Amaury Denoyelle	9d0138ab08	MINOR: server: use parsing ctx for server init addr Initialize the parsing context in srv_init_addr. This function is called after configuration check. This will standardize the stderr output on startup with the parse_server function.	2021-06-07 17:19:30 +02:00
Amaury Denoyelle	0fc136ce5b	REORG: server: use parsing ctx for server parsing Use the parsing context in parse_server. Remove redundant manual format-string specifying the current file/line/server parsed.	2021-06-07 17:19:24 +02:00
Amaury Denoyelle	c008a63582	CLEANUP: server: fix cosmetic of error message on sni parsing Fix memprintf used in server_parse_sni_expr. Error messages should not be ending with a newline as it will be inserted in the parent function on the ha_alert invocation.	2021-06-07 16:58:16 +02:00
Remi Tricot-Le Breton	f1800e64ef	BUG/MINOR: server: Missing calloc return value check in srv_parse_source Two calloc calls were not checked in the srv_parse_source function. Considering that this function could be called at runtime through a dynamic server creation via the CLI, this could lead to an unfortunate crash. It was raised in GitHub issue #1233. It could be backported to all stable branches even though the runtime crash could only happen on branches where dynamic server creation is possible.	2021-05-31 10:50:32 +02:00
Amaury Denoyelle	79a88ba3d0	BUG/MAJOR: server: prevent deadlock when using 'set maxconn server' A deadlock is possible with the 'set maxconn server' command if there is a pending connection ready to be dequeued. This is caused by the locking of the server spinlock in both cli_parse_set_maxconn_server and process_srv_queue. Fix this by reducing the scope of the server lock into server_parse_maxconn_change_request. If connections are dequeued, the lock is taken a second time. This can be seen as suboptimal but as it happens only during 'set maxconn server' it can be considered as tolerable. This issue was reported on the mailing list, for the 1.8.x branch. It must be backported up to 1.8.	2021-05-19 17:52:05 +02:00
Willy Tarreau	b00a8e30f1	BUILD: server: include missing proxy.h in server.c It's needed for a number of functions and definitions but was missing.	2021-05-08 20:24:09 +02:00
Willy Tarreau	ba6300ea62	BUILD: server: include tools.h from server.c A lot of functions from tools.h are used there but the file was only inherited via other ones.	2021-05-08 19:37:41 +02:00
Amaury Denoyelle	24abb0cdc1	BUG/MINOR: server: do not report diag for peer servers with null weight Only check servers attached to a proxy with PR_CAP_LB. This does not need to be backported as the diag message was added in the current 2.4-dev branch.	2021-05-07 15:20:54 +02:00
Willy Tarreau	b205bfdab7	CLEANUP: cli/tree-wide: properly re-align the CLI commands' help messages There were 102 CLI commands whose help were zig-zagging all along the dump making them unreadable. This patch realigns all these messages so that the command now uses up to 40 characters before the delimiting colon. About a third of the commands did not correctly list their arguments which were added after the first version, so they were all updated. Some abuses of the term "id" were fixed to use a more explanatory term. The "set ssl ocsp-response" command was not listed because it lacked a help message, this was fixed as well. The deprecated enable/disable commands for agent/health/server were prominently written as deprecated. Whenever possible, clearer explanations were provided.	2021-05-07 11:51:26 +02:00
Amaury Denoyelle	3109ccfe70	MINOR: srv: close all idle connections on shutdown Implement a function to close all server idle connections. This function is called via a global deinit server handler. The main objective is to avoid leaving sockets in TIME_WAIT state. To limit the set of operations on shutdown and prevent task rescheduling, only the ctrl stack closing is done.	2021-05-05 14:33:51 +02:00
Amaury Denoyelle	eafd701dc5	MINOR: server: fix doc/trace on lb algo for dynamic server creation The text mentioned that only backends with consistent hash method were supported for dynamic servers. In fact, it is only required that the lb algorithm is dynamic.	2021-04-29 14:59:42 +02:00
Amaury Denoyelle	d6b4b6da3f	BUG/MINOR: server: fix potential null gcc error in delete server gcc still reports a potential null pointer dereference in the delete server function even with a BUG_ON before it. Remove the misleading NULL check in the for loop which should never happen. This does not need to be backported.	2021-04-21 12:02:30 +02:00
Amaury Denoyelle	e558043e13	MINOR: server: implement delete server cli command Implement a new CLI command 'del server'. It can be used to remove a dynamically added server. Only servers in maintenance mode can be removed, and without pending/active/idle connections on them. Add a new reg-test for this feature. The scenario of the reg-test needs to first add a dynamic server. It is then deleted and a client is used to ensure that the server is not reachable. The management doc is updated with the new command 'del server'.	2021-04-21 11:00:31 +02:00
Amaury Denoyelle	d38e7fa233	MINOR: server: add log on dynamic server creation Add a notice log to report the creation of a new server. The log is printed at the end of the function.	2021-04-21 11:00:31 +02:00
Amaury Denoyelle	cece918625	BUG/MEDIUM: server: ensure thread-safety of server runtime creation cli_parse_add_server can be executed in parallel by several CLI instances and so must be thread-safe. The critical points of the function are : - server duplicate detection - insertion of the server in the proxy list The mode of operation has been reversed. The server is first instantiated and parsed. The duplicate check has been moved at the end just before the insertion in the proxy list, under the thread isolation. Thus, the thread safety is guaranteed and server allocation is kept outside of locks/thread isolation.	2021-04-21 11:00:30 +02:00
Amaury Denoyelle	fb247946a1	BUG/MINOR: server: free srv.lb_nodes in free_server lb_nodes is allocated for servers using lb_chash (balance random or hash-type consistent). It can be backported up to 1.8.	2021-04-21 11:00:03 +02:00
Willy Tarreau	2b71810cb3	CLEANUP: lists/tree-wide: rename some list operations to avoid some confusion The current "ADD" vs "ADDQ" is confusing because when thinking in terms of appending at the end of a list, "ADD" naturally comes to mind, but here it does the opposite, it inserts. Several times already it's been incorrectly used where ADDQ was expected, the latest of which was a fortunate accident explained in `6fa922562` ("CLEANUP: stream: explain why we queue the stream at the head of the server list"). Let's use more explicit (but slightly longer) names now: LIST_ADD -> LIST_INSERT, LIST_ADDQ -> LIST_APPEND, LIST_ADDED -> LIST_INLIST, LIST_DEL -> LIST_DELETE. The same is true for MT_LISTs, including their "TRY" variant. LIST_DEL_INIT keeps its short name to encourage to use it instead of the lazier LIST_DELETE which is often less safe. The change is large (~674 non-comment entries) but is mechanical enough to remain safe. No permutation was performed, so any out-of-tree code can easily map older names to new ones. The list doc was updated.	2021-04-21 09:20:17 +02:00
Willy Tarreau	dcb121fd9c	BUG/MINOR: server: make srv_alloc_lb() allocate lb_nodes for consistent hash The test in srv_alloc_lb() to allocate the lb_nodes[] array used in the consistent hash was incorrect, it wouldn't do it for consistent hash and could do it for regular random. No backport is needed as this was added for dynamic servers in 2.4-dev by commit `f99f77a50` ("MEDIUM: server: implement 'add server' cli command").	2021-04-20 11:39:54 +02:00
Willy Tarreau	14015b8880	MINOR: server: move idle_conn_task to read_mostly This pointer is used when adding connections to the idle list and is never changed, let's move it to the read_mostly section.	2021-04-10 19:27:41 +02:00
Amaury Denoyelle	da0e7f61e0	MINOR: server: diag for 0 weight server Output a diagnostic report if a server has been configured with a null weight.	2021-04-01 18:03:37 +02:00
Ilya Shipitsin	ba13f16aa2	CLEANUP: assorted typo fixes in the code and comments This is 21st iteration of typo fixes	2021-03-20 09:28:58 +01:00
Amaury Denoyelle	304672320e	MINOR: server: support keyword proto in 'add server' cli Allow to specify the mux proto for a dynamic server. It must be compatible with the backend mode to be accepted. The reg-tests has been extended for this error case.	2021-03-18 16:22:10 +01:00
Amaury Denoyelle	fc465a54fd	MINOR: server: enable standard options for dynamic servers Enable a subset of server options to be used as keywords on the CLI command 'add server'. These options are safe and can be applied flawlessly for a dynamic server.	2021-03-18 16:22:10 +01:00
Amaury Denoyelle	f99f77a500	MEDIUM: server: implement 'add server' cli command Add a new cli command 'add server'. This command is used to create a new server at runtime attached on an existing backend. The syntax is the following one : $ add server <be_name>/<sv_name> [<kws>...] This command is only available through experimental mode for the moment. Currently, no server keywords are supported. They will be activated individually when deemed properly functional and safe. Another limitation is put on the backend load-balancing algorithm. The algorithm must use consistent hashing to guarantee a minimal reallocation of existing connections on the new server insertion.	2021-03-18 15:52:07 +01:00
Amaury Denoyelle	76e10e78bb	MINOR: server: prepare parsing for dynamic servers Prepare the server parsing API to support dynamic servers. - define a new parsing flag to be used for dynamic servers - each keyword contains a new field dynamic_ok to indicate if it can be used for a dynamic server. For now, no keywords are supported. - do not copy settings from the default server for a new dynamic server. - a dynamic server is created in maintenance mode and requires an explicit 'enable server' command. - a new server flag named SRV_F_DYNAMIC is created. This flag is set for all servers created at runtime. It might be useful later, for example to know if a server can be purged.	2021-03-18 15:51:12 +01:00
Amaury Denoyelle	30c0537f5a	REORG: server: use flags for parse_server Modify the API of parse_server function. Use flags to describe the type of the parsed server instead of discrete arguments. These flags can be used to specify if a server/default-server/server-template is parsed. Additional parameters are also specified (parsing of the address required, resolve of a name must be done immediately). It is now unneeded to use strcmp on args[0] in parse_server. Also, the calls to parse_server are more explicit thanks to the flags.	2021-03-18 15:37:05 +01:00
Amaury Denoyelle	cf58dd79e3	REORG: server: attach servers in parse_server Move server linked into proxy backend list outside of _srv_parse_init to parse_server. This is groundwork for dynamic servers support. There will be two differences in case of a dynamic server : - the server will be attached to the proxy list only at the very end of the operations when everything is ok - the server will be directly attached to the end of the server proxy list	2021-03-18 15:37:05 +01:00
Amaury Denoyelle	7d27efef23	REORG: server: rename internal functions from parse_server Use a standard convention for the functions used through parse_server. Use the prefix _srv_parse and specify their private scope in a comment.	2021-03-18 15:37:05 +01:00
Amaury Denoyelle	9394a9444e	REORG: server: move alert traces in parse_server Move every ha_alert calls in parsing functions into parse_server. Parsing functions now support a pointer-to-string argument which will be allocated with an error message if needed via memprintf. parse_server has then the responsibility to display errors with ha_alert. This is groundwork for dynamic server. No traces should be printed on stderr as a response to a cli command. cli_err will replace ha_alert in this case.	2021-03-18 15:37:05 +01:00
Amaury Denoyelle	a8f442e078	REORG: server: split parse_server The huge parse_server function is split into two smaller ones. * _srv_parse_init allocates a new server instance and parses the address parameter * _srv_parse_kw parses the current server keyword This simplifies the parse_server function a bit. Besides, it will be useful for dynamic server creation.	2021-03-18 15:37:05 +01:00
Amaury Denoyelle	3b89c11d4d	MINOR: server: remove fastinter from mistyped kw list This keyword is already present in server kw list from checks.c.	2021-03-18 15:37:05 +01:00
Amaury Denoyelle	587b71e402	REORG: server: move keywords in srv_kws Move the server keywords hardcoded in parse_server into the srv_kws list of server.c. Now every server keyword is checked through srv_find_kw. This has the effect of reducing the size of parse_server. As a side-effect, the common kw list can be reduced. This change has been made to be able to quickly discard these keywords in case of a dynamic server.	2021-03-18 15:37:05 +01:00
Amaury Denoyelle	828adf0121	REORG: server: add a free server function Create a new server function named free_server. It can be used to deallocate a server and its members.	2021-03-18 15:37:05 +01:00
Christopher Faulet	59b2925733	BUG/MINOR: resolvers: Add missing case-insensitive comparisons of DNS hostnames DNS hostname comparisons were fixed to be case-insensitive (see `b17b88487` "BUG/MEDIUM: dns: Consider the fact that dns answers are case-insensitive"). However 2 comparisons are still case-sensitive. This patch must be backported as far as 1.8.	2021-03-16 11:25:04 +01:00
Christopher Faulet	c392d461d6	CLEANUP: resolvers: Use ha_free() in srvrq_resolution_error_cb() Two occurrences to "free(A);A=NULL;" may be replaced by a call to ha_free() in the srvrq_resolution_error_cb() function.	2021-03-12 17:42:47 +01:00
Christopher Faulet	d83a6df5cd	BUG/MEDIUM: resolvers: Skip DNS resolution at startup if SRV resolution is set At startup, if a SRV resolution is set for a server, no DNS resolution is created. We must wait for the first SRV resolution to know if it must be triggered. It is important to do so for two reasons. First, during a "classical" startup, a server based on a SRV resolution has no hostname. Thus the created DNS resolution is useless. It is better to wait for the first SRV resolution. It is not really a bug at this stage, it is just useless. Second, in the same situation, if the server state is loaded from a file, its hostname will be set a bit later. Thus, if there is no additional record for this server, because there is already a DNS resolution, it inhibits any new DNS resolution. But there is no hostname attached to the existing DNS resolution. So no resolution is performed at all for this server. To avoid any problem, it is fairly easier to handle this special case during startup. But this means we must be prepared to have no "resolv_requester" field for a server at runtime. This patch must be backported as far as 2.2.	2021-03-12 17:41:28 +01:00
Christopher Faulet	0efc0993ec	BUG/MEDIUM: resolvers: Don't release resolution from a requester callbacks Another way to say it: "Safely unlink requester from a requester callbacks". Requester callbacks must never try to unlink a requester from a resolution, for the current requester or another one. First, these callback functions are called in a loop on a request list, not necessarily safe. Thus unlinking a resolution at this place may be unsafe. And it is useless to try to make these loops safe because all this stuff is placed in a loop on a resolution list. Unlinking a requester may lead to releasing a resolution if it is the last requester. However, the unlink is necessary because we cannot reset the server state (hostname and IP) with some pending DNS resolution on it. So, to work around this issue, we introduce the "safe" unlink. It is only performed from a requester callback. In this case, the unlink function never releases the resolution, it only resets it if necessary. And when a resolution is found with an empty requester list, it is released. This patch depends on the following commits : * MINOR: resolvers: Purge answer items when a SRV resolution triggers an error * MINOR: resolvers: Use a function to remove answers attached to a resolution * MINOR: resolvers: Directly call srvrq_update_srv_state() when possible * MINOR: resolvers: Add function to change the srv status based on SRV resolution All the series must be backported as far as 2.2. It fixes a regression introduced by the commit `b4badf720` ("BUG/MINOR: resolvers: new callback to properly handle SRV record errors").	2021-03-12 17:41:28 +01:00
Christopher Faulet	6b117aed49	MINOR: resolvers: Directly call srvrq_update_srv_state() when possible When the server status must be updated from the result of a SRV resolution, we can directly call srvrq_update_srv_state(). It is simpler and this avoids a test on the server DNS resolution. This patch is mandatory for the next commit. It also relies on "MINOR: resolvers: Directly call srvrq_update_srv_state() when possible".	2021-03-12 17:41:28 +01:00
Christopher Faulet	5efdef24c1	MINOR: resolvers: Add function to change the srv status based on SRV resolution srvrq_update_srv_status() updates the server status based on the result of a SRV resolution. For now, it is only used from snr_update_srv_status() when appropriate.	2021-03-12 17:41:28 +01:00
Christopher Faulet	51d5e3bda7	MINOR: resolvers: Purge answer items when a SRV resolution triggers an error When a SRV request triggers an error, if we decide to handle the error because the last_valid duration is expired, the answer list may be purged. All items are considered as obsolete.	2021-03-12 17:41:28 +01:00
Christopher Faulet	49531e8471	BUG/MINOR: resolvers: Ignore DNS resolution for expired SRV item If no ADD item is found for a SRV item in a SRV response, a DNS resolution is triggered. When it succeeds, we must be sure the SRV item is still alive. Otherwise the DNS resolution must be ignored. This patch depends on the commit "MINOR: resolvers: Move last_seen time of an ADD into its corresponding SRV item". Both must be backported as far as 2.2.	2021-03-12 17:41:28 +01:00
Christopher Faulet	bca680ba90	BUG/MINOR: resolvers: Unlink DNS resolution to set RMAINT on SRV resolution When a server is set in RMAINT because of a SRV resolution failure, the server DNS resolution, if any, must be unlinked first. It is mandatory to handle the change in the context of a SRV resolution. This patch must be backported as far as 2.2.	2021-03-12 16:43:37 +01:00
Christopher Faulet	5130c21fbb	BUG/MINOR: resolvers: Reset server address on DNS error only on status change When a DNS resolution error is detected, in snr_resolution_error_cb(), the server address must be reset only if the server status has changed. In this case, it means the server is set to RMAINT. Thus the server address may be reset. This patch fixes a bug introduced by commit `d127ffa9f` ("BUG/MEDIUM: resolvers: Reset address for unresolved servers"). It must be backported as far as 2.0.	2021-03-12 16:43:37 +01:00
Christopher Faulet	bd0227c109	BUG/MINOR: resolvers: Consider server to have no IP on DNS resolution error When an error is received for a DNS resolution, for instance a NXDOMAIN error, the server must be considered to have no address when its status is updated, not the opposite. Concretely, because this parameter is not used on the error path in snr_update_srv_status(), there is no impact. This patch must be backported as far as 1.8.	2021-03-12 16:43:37 +01:00
Willy Tarreau	736adef511	BUG/MINOR: cfgparse/server: increment the extra keyword counter one at a time This was introduced in previous commit `49c2b45c1` ("MINOR: cfgparse/server: try to fix spelling mistakes on server lines"), the loop was changed but the increment left. No backport is needed.	2021-03-12 14:47:10 +01:00
Willy Tarreau	49c2b45c1d	MINOR: cfgparse/server: try to fix spelling mistakes on server lines Let's apply the fuzzy match to server keywords so that we can avoid dumping the huge list of supported keywords each time there is a spelling mistake, and suggest proper spelling instead: $ printf "listen f\nserver s 0 sendpx-v2\n" | ./haproxy -c -f /dev/stdin [NOTICE] 070/095718 (24152) : haproxy version is 2.4-dev11-caa6e3-25 [NOTICE] 070/095718 (24152) : path to executable is ./haproxy [ALERT] 070/095718 (24152) : parsing [/dev/stdin:2] : 'server s' unknown keyword 'sendpx-v2'; did you mean 'send-proxy-v2' maybe ? [ALERT] 070/095718 (24152) : Error(s) found in configuration file : /dev/stdin [ALERT] 070/095718 (24152) : Fatal errors found in configuration.	2021-03-12 14:13:21 +01:00
Willy Tarreau	018251667e	CLEANUP: config: make the cfg_keyword parsers take a const for the defproxy The default proxy was passed as a variable to all parsers instead of a const, which is not without risk, especially when some timeout parsers used to make some int pointers point to the default values for comparisons. We want to be certain that none of these parsers will modify the defaults sections by accident, so it's important to mark this proxy as const. This patch touches all occurrences found (89).	2021-03-09 10:09:43 +01:00
Willy Tarreau	d4e78d873c	MINOR: server: move actconns to the per-thread structure The actconns list creates massive contention on low server counts because it's in fact a list of streams using a server, all threads compete on the list's head and it's still possible to see some watchdog panics on 48 threads under extreme contention with 47 threads trying to add and one thread trying to delete. Moving this list per thread is trivial because it's only used by srv_shutdown_streams(), which simply required to iterate over the list. The field was renamed to "streams" as it's really a list of streams rather than a list of connections.	2021-03-05 15:00:24 +01:00
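
The return-code rule described in commit `12ca658dbe` above (a callback returning 1 means the error was ignored and the answer item list is kept; the list is purged only when every requester's callback returns 0) can be illustrated with a minimal sketch. This is not HAProxy's actual code: the `struct requester` and `struct resolution` types, the `nb_answer_items` counter standing in for the answer item list, and the functions `notify_error`, `cb_ignore` and `cb_fatal` are all hypothetical names chosen for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified types: NOT HAProxy's real structures. */
struct requester {
	/* Returns 1 if this requester ignores the error, 0 otherwise. */
	int (*error_cb)(struct requester *req, int error_code);
	struct requester *next;
};

struct resolution {
	struct requester *requesters; /* singly linked list of requesters */
	int nb_answer_items;          /* stand-in for the answer item list */
};

/* Calls the error callback of every requester of <res>. The answer items
 * are kept as soon as one callback returns 1, and purged only if all of
 * them return 0 (an empty requester list also leads to a purge here).
 * Returns 1 if the items were kept, 0 if they were purged.
 */
int notify_error(struct resolution *res, int error_code)
{
	struct requester *req;
	int keep = 0;

	assert(res != NULL);

	/* Every callback is called: the loop never stops at the first
	 * requester, whatever it returns.
	 */
	for (req = res->requesters; req; req = req->next) {
		if (req->error_cb(req, error_code))
			keep = 1;
	}

	if (!keep)
		res->nb_answer_items = 0; /* purge the (simulated) list */

	return keep;
}

/* Example callback which considers that the error can be ignored. */
int cb_ignore(struct requester *req, int error_code)
{
	(void)req;
	(void)error_code;
	return 1;
}

/* Example callback which considers that the error cannot be ignored. */
int cb_fatal(struct requester *req, int error_code)
{
	(void)req;
	(void)error_code;
	return 0;
}
```

With one requester using `cb_fatal` followed by one using `cb_ignore`, `notify_error()` leaves the items in place; once both use `cb_fatal`, the items are purged. The point of the sketch is that the decision is aggregated over all requesters rather than taken by the first one.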


610 Commits