haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-09 16:47:18 +02:00

Author	SHA1	Message	Date
matthias sweertvaegher	062ea3a3d4	BUILD: solaris: fix compilation errors Compilation on solaris fails because of usage of names reserved on that platform, i.e. 'queue' and 's_addr'. This patch redefines 'queue' as '_queue' and renames 's_addr' to 'srv_addr' which fixes compilation for now. Future plan: rename 'queue' in code base so define can be removed again. Backporting: 2.9, 2.8	2024-03-09 11:24:54 +01:00
Amaury Denoyelle	8a31783b64	BUG/MEDIUM: server: fix dynamic servers initial settings Contrary to static servers, dynamic servers do not initialize their settings from a default server instance. As such, _srv_parse_init() was responsible for setting a minimal set of values to ensure correct behavior. However, some settings were not properly initialized. This caused dynamic servers to not behave as static ones without explicit parameters. Currently, the main issue detected is connection reuse which was completely impossible. This is due to incorrect pool_purge_delay and max_reuse settings incompatible with srv_add_to_idle_list(). To fix the connection reuse, but also more generally to ensure dynamic servers are aligned with other server instances, define a new function srv_settings_init(). This is used to set initial values for both default servers and dynamic servers. For static servers, srv_settings_cpy() is kept instead, using their default server as reference. This patch could have unexpected effects on dynamic servers behavior as it restores proper initial settings. Previously, they were set to 0 via calloc() invocation from new_server(). This should be backported up to 2.6, after a brief period of observation.	2024-02-27 17:02:20 +01:00
Amaury Denoyelle	1b8c5abeeb	BUG/MAJOR: server: fix stream crash due to deleted server Before a dynamic server can be deleted, a set of preconditions must be validated to ensure it is not referenced anymore by a stream or a connection. This is implemented in srv_check_for_deletion(). The various criteria specified were incomplete. This allows a server instance to be deleted while still being referenced by a stream and a connection. This bug was reproduced by using ASAN compilation. A script was used to add and delete a server every second, while using h2load to generate traffic with download of 1k objects. Here is the ASAN error. ==140916==ERROR: AddressSanitizer: heap-use-after-free on address 0x520000020080 at pc 0x63cb25679537 bp 0x701529ff5070 sp 0x701529ff5060 READ of size 1 at 0x520000020080 thread T7 #0 0x63cb25679536 in objt_server include/haproxy/obj_type.h:99 #1 0x63cb2568f465 in process_stream src/stream.c:1823 #2 0x63cb25a4a4a2 in run_tasks_from_lists src/task.c:632 #3 0x63cb25a4bf62 in process_runnable_tasks src/task.c:876 #4 0x63cb2596a220 in run_poll_loop src/haproxy.c:3050 #5 0x63cb2596b192 in run_thread_poll_loop src/haproxy.c:3252 #6 0x701539aa9559 (/usr/lib/libc.so.6+0x8b559) (BuildId: c0caa0b7709d3369ee575fcd7d7d0b0fc48733af) #7 0x701539b26a3b (/usr/lib/libc.so.6+0x108a3b) (BuildId: c0caa0b7709d3369ee575fcd7d7d0b0fc48733af) To fix this, add <curr_used_conns> to the counters checked in srv_check_for_deletion(). Outside of this bug, one case which remains sensitive is for SF_DIRECT streams which referenced a server instance early in process_stream() before connect_server(). This occurs with use-server directive, force-persist rule or cookie persistence. However, after code reexamination, the code is considered reliable as process_stream() is not rescheduled before connect_server() invocation. These observations have been saved in sess_change_server() documentation to ensure it remains valid in the future. This must be backported up to 2.6.	2024-02-22 18:36:54 +01:00
Willy Tarreau	9b680d7411	MINOR: server: split the server deletion code in two parts We'll need to be able to verify whether or not a server may be deleted. For now, both the verification and the action are performed in the same function, at once under thread isolation. The goal here is to extract the verification code into a new function that will perform these checks, return a status between success/recoverable/non-recoverable failure, and will also return a message for the caller.	2024-02-09 20:38:08 +01:00
Willy Tarreau	eaeb67bdb4	BUG/MINOR: server/cli: add missing LF at the end of certain notice/error lines Some cli_err(), cli_msg() or even ha_error() etc are missing the trailing LF, which breaks the continuity of the CLI parsing: the extra LF that serves to mark the end of the command is in fact taken as the missing LF and no extra one is added. This patch adds the missing LF on identified messages. It might be worth trying to proceed in a more generic way with this, given the amount of code that is possibly at risk.	2024-02-08 18:21:52 +01:00
Frédéric Lécaille	860028db47	CLEANUP: quic: Remaining useless code into server part Remove some QUIC definitions of members from server structure as the haproxy QUIC stack does not support the server part (QUIC client) at all at this time. Remove the statements in relation with their initializations. This patch should be backported as far as 2.6 to save memory.	2024-01-04 11:16:06 +01:00
Amaury Denoyelle	b4db3be86e	BUG/MINOR: server: fix server_find_by_name() usage during parsing Since below commit, server_find_by_name() now searches using 'used_server_id' proxy backend tree : `4bcfe30414` OPTIM: server: eb lookup for server_find_by_name() This introduces a regression if server_find_by_name() is used via check_config_validity() during post-parsing. Indeed, used_server_id tree is populated at the same stage so it's possible to not find an existing server. This can cause incorrect rejection of previously valid configuration file. To fix this, servers are now inserted in used_server_id tree during parsing via parse_server(). This guarantees that server instances can be retrieved during post parsing. A known feature which uses server_find_by_name() during post parsing is attach-srv tcp-rule used for reverse HTTP. Prior to the current fix, a config was wrongly rejected if the rule was declared before the server line. This should not be backported unless the mentioned commit is.	2024-01-02 15:52:47 +01:00
Aurelien DARRAGON	bdecff511c	MEDIUM: server: simplify snr_set_srv_down() to prevent confusions snr_set_srv_down() (was formerly known as snr_update_srv_status()), is still too ambiguous because it's not clear whether we will be putting the server under maintenance or not. This is mainly due to the fact that the function behaves differently if has_no_ip is set or not. By reviewing the function callers, it has now become clear that snr_resolution_cb() is always calling the function with a valid resolution so we only want to put the server under maintenance if we don't have a valid IP address. On the other hand snr_resolution_error_cb() always calls the function on error, with either no resolution (for SRV requests) or with failing resolution (all cases except RSLV_STATUS_VALID), so in this case we decide whether to put the server under maintenance case by case (ie: expired? timeout?) As a result, let's simplify snr_set_srv_down() so that it is only called when the caller really thinks that the server should be put under maintenance, which means always for snr_resolution_error_cb(), and only if the resolution didn't yield usable ip for snr_resolution_cb().	2024-01-02 10:29:50 +01:00
Aurelien DARRAGON	689784ed91	CLEANUP: resolvers: remove some more unused RSLV_UPD flags RSLV_UPD_CNAME and RSLV_UPD_NAME_ERROR flags have now become useless since `3cf7f987` ("MINOR: dns: proper domain name validation when receiving DNS response") as they are never set, but we forgot to remove them.	2024-01-02 10:29:41 +01:00
Aurelien DARRAGON	3ebe7bef8d	CLEANUP: server: remove ambiguous check in srv_update_addr_port() A leftover check was left by recent patch series about server addr:svc_port propagation: a check on (msg) being set was performed in srv_update_addr_port(), but msg is always set, so the check is not needed and confuses coverity (See GH #2399)	2024-01-02 10:29:24 +01:00
Ilya Shipitsin	8705e45964	CLEANUP: assorted typo fixes in the code and comments This is 38th iteration of typo fixes	2024-01-02 10:19:48 +01:00
Aurelien DARRAGON	64c9c8ef39	BUG/MINOR: server/dns: use server_set_inetaddr() to unset srv addr from DNS As seen before, server's addr and svc_port should not be updated directly during runtime, because even if the update is performed under the lock, some competing threads might be reading ->addr and ->svc_port without the lock because they simply cannot afford it. To prevent races with such competing threads, server's addr and port should only be updated using server_set_inetaddr() function or similar. This patch depends on: - "MINOR: server: ensure connection cleanup on server addr changes" - "CLEANUP: server/event_hdl: remove purge_conn hint in INETADDR event" - "MEDIUM: server: merge srv_update_addr() and srv_update_addr_port() logic" - "MEDIUM: server: make server_set_inetaddr() updater serializable" - "MINOR: server/event_hdl: expose updater info through INETADDR event" - "MINOR: server: add dns hint in server_inetaddr_updater struct" - "MEDIUM: server/dns: clear RMAINT when addr resolves again" While it could be backported in 2.9 with `cd994407a` ("BUG/MAJOR: server/addr: fix a race during server addr:svc_port updates") to ensure addr and svc_port reset performed by resolver's code comply with the API taking care of pushing the update (and thus avoid any race), some patch dependencies are quite sensitive so it's probably best to avoid backporting for no good reason, or at least wait for it to be considered stable to prevent any breakages.	2023-12-21 14:22:27 +01:00
Aurelien DARRAGON	334ebfa1a2	MEDIUM: server/dns: clear RMAINT when addr resolves again snr_update_srv_status() and srvrq_update_srv_status() will both set or clear the server RMAINT state depending on the result of the current dns resolution. This used to work pretty well in the past, but now that addr:svc_port changes are applied atomically through a dedicated task, the change is performed asynchronously, so this can cause some flapping issues if the server is put out of maintenance while the server's address is still unassigned. To prevent errors, the resolver's code is now only allowed to put the server under maintenance but not to remove it from maintenance: the decision to remove a server from maintenance is performed by the task responsible for updating the server's addr: if the addr resolves again thanks to a valid DNS resolution and the server was previously under RMAINT, then it is cleared from RMAINT state. srvrq_update_srv_status() was renamed srvrq_set_srv_down(), since it is only called to put the server in maintenance as a result of a failing SRV entry. snr_update_srv_status() was renamed snr_set_srv_down() and slightly modified so that it only takes care of putting the server under maintenance when needed. The cli command "set server x/y addr" does not need to remove the RMAINT flag anymore.	2023-12-21 14:22:27 +01:00
Aurelien DARRAGON	33cd676e9e	MINOR: server/event_hdl: expose updater info through INETADDR event Thanks to the previous commit, we can now expose updater info through INETADDR event.	2023-12-21 14:22:27 +01:00
Aurelien DARRAGON	3ac79b504a	MEDIUM: server: make server_set_inetaddr() updater serializable server_set_inetaddr() updater argument is a simple char * string containing infos about the caller responsible for the update. In this patch, we try to make this argument serializable, that is, make it so that we can easily export it without having to keep the original pointer passed by the caller or having to work with strings of variable lengths. This was a prerequisite for exposing more updater information through SERVER_INETADDR event (upcoming patch). Static strings were simply mapped to a fixed ID that can be converted back to a string when needed using server_inetaddr_updater_by_to_str(). One special case was made for the SERVER_INETADDR_UPDATER_DNS_RESOLVER updater since in this case the updater hint has to be generated from the corresponding resolver id / nameserver id combination. This was achieved by saving the nameserver id within the updater struct. Knowing that the resolver id can be guessed from the server struct directly, it was not exposed through the updater struct. This patch depends on: - "MINOR: resolvers: add unique numeric id to nameservers" No functional change should be expected.	2023-12-21 14:22:27 +01:00
Aurelien DARRAGON	ab6fef4882	CLEANUP: server: remove unused server_parse_addr_change_request() function server_parse_addr_change_request() was completely replaced by the newer srv_update_addr_port() function. Considering the function doesn't offer useful features that srv_update_addr_port() couldn't do, we simply remove the function.	2023-12-21 14:22:27 +01:00
Aurelien DARRAGON	f1f4b93a67	MEDIUM: server: merge srv_update_addr() and srv_update_addr_port() logic Both functions are performing the similar tasks, except that the _port() version is doing a bit more work. In this patch, we add the server_set_inetaddr() function that works like the srv_update_addr_port() but it takes parsed inputs instead of raw strings as arguments. Then, server_set_inetaddr() is used as underlying helper function for both srv_update_addr() and srv_update_addr_port() to make them easier to maintain. Also, helper functions were added: - server_set_inetaddr_warn() -> same as server_set_inetaddr() but report a warning on updates. - server_get_inetaddr() -> fills a struct server_inetaddr from srv Since the feedback message generation part was slightly reworked, some minor changes in the way addr:svc_port updates are reported in the logs or cli messages should be expected (no loss of information though).	2023-12-21 14:22:27 +01:00
Aurelien DARRAGON	2d0c7f5935	CLEANUP: server/event_hdl: remove purge_conn hint in INETADDR event Now that the purge_conn hint is being ignored thanks to the previous commit, we can simply get rid of it.	2023-12-21 14:22:27 +01:00
Aurelien DARRAGON	2e3a163e47	MINOR: server: ensure connection cleanup on server addr changes Previously, in srv_update_addr_port(), we forced connection cleanup on server changes. This was done in `6318d33ce` ("BUG/MEDIUM: connections: force connections cleanup on server changes"). However, there is no reason we shouldn't have done the same in srv_update_addr() function, because the end goal is the same: perform runtime changes on server's address. The purge_conn hint propagated through the INETADDR server event was simply there to keep the original behavior (only purge the connection for events originating from srv_update_addr_port()), but to ensure the address change is handled the same way for both code paths, we simply ignore this hint.	2023-12-21 14:22:26 +01:00
Aurelien DARRAGON	545e72546c	BUG/MINOR: server/event_hdl: propagate map port info through inetaddr event server addr:svc_port updates during runtime might set or clear the SRV_F_MAPPORTS flag. Unfortunately, the flag update is still directly performed by srv_update_addr_port() function while the addr:svc_port update is being scheduled for atomic update. Given that existing readers don't take server's lock to read addr:svc_port, they also check the SRV_F_MAPPORTS flag right after without the lock. So we could cause the readers to incorrectly interpret the svc_port from the server struct because the mapport information is not published atomically, resulting in inconsistencies between svc_port / mapport flag. (MAPPORTS flag causes svc_port to be used differently by the reader) To fix this, we publish the mapport information within the INETADDR server event and we let the task responsible for updating server's addr and port position or clear the flag depending on the mapport hint. This patch depends on: - MINOR: server/event_hdl: add server_inetaddr struct to facilitate event data usage - MINOR: server/event_hdl: update _srv_event_hdl_prepare_inetaddr prototype This should be backported in 2.9 with `683b2ae01` ("MINOR: server/event_hdl: add SERVER_INETADDR event")	2023-12-21 14:22:26 +01:00
Aurelien DARRAGON	4e50c31eab	MINOR: server/event_hdl: update _srv_event_hdl_prepare_inetaddr prototype Slightly change _srv_event_hdl_prepare_inetaddr() function prototype to reduce the input arguments by learning some settings directly from the server. Also taking this opportunity to make the function static inline since it's relatively simple and not meant to be used directly.	2023-12-21 14:22:26 +01:00
Aurelien DARRAGON	835263047e	OPTIM: server: ebtree lookups for findserver_unique_* functions `4e5e2664` ("MINOR: proxy: add findserver_unique_id() and findserver_unique_name()") added findserver_unique_id() and findserver_unique_name() functions that were inspired from the historical findserver() function, so unfortunately they don't perform well when used on large backend farms because they scan the whole server list linearly. I was about to provide a patch to optimize such functions when I stumbled on Baptiste's work: `19a106d24` ("MINOR: server: server_find functions: id, name, best_match") It turns out Baptiste already implemented helper functions to supersede the unoptimized findserver() function (at least at runtime when servers have been assigned their final IDs and inserted in the lookup trees): they offer more matching options and rely on eb lookups so they are much more suitable for fast queries. I don't know how I missed that, but they are a perfect base for the server rid matching functions. So in this patch, we essentially revert `4e5e2664` to provide the optimized equivalent functions named server_find_by_id_unique() and server_find_by_name_unique(), then we force existing findserver_unique_*() callers to switch to the new functions. This patch depends on: - "OPTIM: server: eb lookup for server_find_by_name()" This could be backported up to 2.8.	2023-12-21 14:22:26 +01:00
Aurelien DARRAGON	4bcfe30414	OPTIM: server: eb lookup for server_find_by_name() server_find_by_name() function was added in `19a106d24` ("MINOR: server: server_find functions: id, name, best_match"). At that time, only the used_server_id proxy tree was available, thus the name lookup was performed as a linear search. However, used_server_name proxy tree was added in `84d6046a` ("MINOR: proxy: Add a "server by name" tree to proxy."), so we may safely rely on it to perform server name lookups now. This will hopefully make the function quite faster, especially when performing lookups in huge backend farms.	2023-12-21 14:22:26 +01:00
Christopher Faulet	3811c1de25	BUG/MINOR: server: Use the configured address family for the initial resolution A regression was introduced by the commit `c886fb58eb` ("MINOR: server/ip: centralize server ip updates"). The configured address family is lost when the server address is initialized during the startup, for the resolution based on the libc or based on the server state-file. Thus, "ipv4@" and "ipv6@" prefixed are ignored. To fix the bug, we take care to use the configured address family before calling str2ip2() in srv_apply_lastaddr() and srv_apply_via_libc() functions. This patch should fix the issue #2393. It must be backported to 2.9.	2023-12-20 12:21:59 +01:00
Aurelien DARRAGON	c2cd6a419c	BUG/MINOR: server/event_hdl: properly handle AF_UNSPEC for INETADDR event It is possible that a server's addr family is temporarily set to AF_UNSPEC even if we're certain to be in INET context (ipv4, ipv6). Indeed, as soon as IP address resolving is involved, srv->addr family will be set to AF_UNSPEC when the resolution fails (could happen at anytime). However, _srv_event_hdl_prepare_inetaddr() wrongly assumed that it would only be called with AF_INET or AF_INET6 families. Because of that, the function will handle AF_UNSPEC address as an IPV6 address: not only we could risk reading from an uninitialized area, but we would then propagate false information when publishing the event. In this patch we make sure to properly handle the AF_UNSPEC family in both the "prev" and the "next" part for SERVER_INETADDR event and that every member is explicitly initialized. This bug was introduced by 6fde37e046 ("MINOR: server/event_hdl: add SERVER_INETADDR event"), no backport needed.	2023-12-01 20:43:42 +01:00
Willy Tarreau	822d45678f	BUILD: server: shut a bogus gcc warning on certain ubuntu On ubuntu 20.04 and 22.04 with gcc 9.4 and 11.4 respectively, we get the following warning: src/server.c: In function 'srv_update_addr_port': src/server.c:4027:3: warning: 'new_port' may be used uninitialized in this function [-Wmaybe-uninitialized] 4027 | _srv_event_hdl_prepare_inetaddr(&cb_data.addr, &s->addr, s->svc_port, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 4028 | ((ip_change) ? &sa : &s->addr), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 4029 | ((port_change) ? new_port : s->svc_port), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 4030 | 1); | ~~ It's clearly wrong, port_change only changes from 0 to anything else after assigning new_port. Let's just preset new_port to zero instead of trying to play smart with the compiler.	2023-11-30 17:48:03 +01:00
Aurelien DARRAGON	2f2cb6d082	MEDIUM: log/balance: support FQDN for UDP log servers In previous log backend implementation, we created a pseudo log target for each declared log server, and we made the log target's address point to the actual server address to save some time and prevent unnecessary copies. But this was done without knowing that when FQDN is involved (more broadly when dns/resolution is involved), the "port" part of server addr should not be relied upon, and we should explicitly use ->svc_port for that purpose. With that in mind and thanks to the previous commit, some changes were required: we allocate a dedicated addr within the log target when target is in DGRAM mode. The addr is first initialized with known values and it is then updated automatically by _srv_set_inetaddr() during runtime. (the change is atomic so readers don't need to worry about it) addr from server "log target" (INET/DGRAM mode) is made of the combination of server's address (lacking the port part) and server's svc_port.	2023-11-29 08:59:27 +01:00
Aurelien DARRAGON	cd994407a9	BUG/MAJOR: server/addr: fix a race during server addr:svc_port updates For inet families (IP4/IP6), it is expected that server's addr/port might be updated at runtime from DNS, cli or lua for instance. Such updates were performed under the server's lock. Unfortunately, most readers such as backend.c or sink.c perform the read without taking server's lock because they can't afford slowing down their processing for a type of event which is normally rare. But this could result in bad values being read for the server addr:svc_port tuple (ie: during connection establishment) as a result of concurrent updates from external components, which can obviously cause some undesirable effects. Instead of slowing the readers down, as we consider server's addr changes are relatively rare, we take another approach and try to update the addr:port atomically by performing changes under full thread isolation when a new change is requested. The changes are performed by a dedicated task which takes care of isolating the current thread and doesn't depend on other threads (independent code path) to protect against dead locks. As such, server's addr:port changes will now be performed atomically, but they will not be processed instantly, they will be translated to events that the dedicated task will pick up from time to time to apply the pending changes. This bug existed for a very long time and has never been reported so far. It was discovered by reading the code during the implementation of log backend ("mode log" in backends). As it involves changes in sensitive areas as well as thread isolation, it is probably not worth considering backporting it for now, unless it is proven that it will help to solve bugs that are actually encountered in the field. This patch depends on: - `24da4d3` ("MINOR: tools: use const for read only pointers in ip{cmp,cpy}") - `c886fb5` ("MINOR: server/ip: centralize server ip updates") - event_hdl API (which was first seen on 2.8) + `683b2ae` ("MINOR: server/event_hdl: add SERVER_INETADDR event") + BUG/MEDIUM: server/event_hdl: memory overrun in _srv_event_hdl_prepare_inetaddr() + "MINOR: event_hdl: add global tunables" Note that the patch may be reworked so that it doesn't depend on event_hdl API for older versions, the approach would remain the same: this would result in a larger patch due to the need to manually implement a global queue of pending updates with its dedicated task responsible for picking updates and committing them. An alternative approach could consist in per-server, lock-protected, temporary addr:svc_port storage dedicated to "updaters" where only the most recent values would be kept. The sync task would then use them as source values to atomically update the addr:svc_port members that the runtime readers are actually using.	2023-11-29 08:59:27 +01:00
Aurelien DARRAGON	f638d4b1bc	BUG/MEDIUM: server/event_hdl: memory overrun in _srv_event_hdl_prepare_inetaddr() As reported in GH #2358, #2359, #2360, #2361 and #2362: ipv6 address handling may cause memory overrun due to struct in6_addr being handled as sockaddr_in6 which is larger. Moreover, source variable wasn't properly read from since the raw value was used as a pointer instead of pointing to the actual variable's address. This bug was introduced by 6fde37e046 ("MINOR: server/event_hdl: add SERVER_INETADDR event") Unfortunately for us, gcc didn't catch this and, this actually used to "work" by accident since in6_addr struct is made of array so not passing pointer explicitly still resolved to the proper starting address.. Hopefully this was caught by coverity so thanks to Ilya for that. The fix is simple: we simply copy the whole in6_addr struct by accessing it using a pointer and using the proper struct size for the copy.	2023-11-29 08:59:27 +01:00
Aurelien DARRAGON	c886fb58eb	MINOR: server/ip: centralize server ip updates Add a new helper function named _srv_update_inetaddr() to centralize ip addr and port updates during runtime.	2023-11-24 16:27:55 +01:00
Aurelien DARRAGON	683b2ae013	MINOR: server/event_hdl: add SERVER_INETADDR event In this patch we add the support for a new SERVER event in the event_hdl API. SERVER_INETADDR is implemented as an advanced server event. It is published each time the server's ip address or port is about to change. (ie: from the cli, dns, lua...) SERVER_INETADDR data is an event_hdl_cb_data_server_inetaddr struct that provides additional info related to the server inet addr change, but can be casted as a regular event_hdl_cb_data_server struct if additional info is not needed.	2023-11-24 16:27:55 +01:00
Amaury Denoyelle	55e78ff7e1	MINOR: rhttp: large renaming to use rhttp prefix Previous commit renames 'proto_reverse_connect' module to 'proto_rhttp'. This commit follows this by replacing various custom prefixes by 'rhttp_' to make the code uniform. Note that 'reverse_' prefix was kept in connection module. This is because if a new reversible protocol not based on HTTP is implemented, it may be necessary to reuse the same connection functions, which are protocol agnostic.	2023-11-23 17:40:01 +01:00
Willy Tarreau	53da8bfcb6	BUG/MINOR: server: do not leak default-server in defaults sections When a default-server directive is used in a defaults section, it's never freed and the "defaults" proxy gets reset without freeing the fields from that default-server. Normally there is no allocation there, except for the config file location stored in srv->conf.file from a strdup() since commit `9394a9444` ("REORG: server: move alert traces in parse_server") that appeared in 2.4. In addition, if a "default-server" directive appears multiple times in a defaults section, one more entry will be leaked per call. This commit addresses this by checking that we don't overwrite the file upon multiple calls, and by clearing it when resetting the default proxy. This should be backported to 2.4.	2023-11-23 14:32:55 +01:00
Amaury Denoyelle	560cb1332a	MINOR: server: force add to idle on reverse A backend connection is inserted in server idle list via srv_add_to_idle_list(). This function has several conditions which may cause the connection to be rejected instead. One of these conditions is based on the current estimate count of needed connections for the server. If the count of idle connections stored has already reached this estimation, the new connection is rejected. This is in opposition with the purpose of reverse HTTP. On active reverse, haproxy can instantiate several connections to properly serve the future traffic. However, the opposite passive haproxy will have only a low estimate of needed connections and will reject most of them. To fix this, simply check CO_FL_REVERSED connection flag on srv_add_to_idle_list(). If set, the connection is inserted without checking for estimate count. Note that all other conditions are not impacted, so it's still possible to reject a connection, for example if process FD limit is reached. This commit relies on a recent patch which changes CO_FL_REVERSED flag for connection after passive reverse.	2023-11-16 18:43:41 +01:00
Willy Tarreau	79aa638238	MINOR: server: always initialize pp_tlvs for default servers In commit `6f4bfed3a` ("MINOR: server: Add parser support for set-proxy-v2-tlv-fmt") a suspicious check for a NULL srv_tlv was placed in the list_for_each_entry(), that should not be needed. In practice, it's caused by the list head not being initialized, hence the first element is NULL, as shown by Alexander's reproducer below which crashes if the test in the loop is removed: backend dummy default-server send-proxy-v2 set-proxy-v2-tlv-fmt(0xE1) %[fc_pp_tlv(0xE1)] server dummy_server 127.0.0.1:2319 The right place to initialize this field is proxy_preset_defaults(). We'd really need a function to initialize a server :-/ The check in the loop was removed. No backport is needed.	2023-11-13 08:53:28 +01:00
Aurelien DARRAGON	64e0b63442	BUG/MEDIUM: server: invalid address (post)parsing checks This bug was introduced with `29b76ca` ("BUG/MEDIUM: server/log: "mode log" after server keyword causes crash ") Indeed, we cannot safely rely on addr_proto being set when str2sa_range() returns in parse_server() (even if SRV_PARSE_PARSE_ADDR is set), because proto lookup might be bypassed when FQDN addresses are involved. Unfortunately, the above patch wrongly assumed that proto would always be set when SRV_PARSE_PARSE_ADDR was passed to parse_server() (so when str2sa_range() was called), resulting in invalid postparsing checks being performed, which could as well lead to crashes with log backends ("mode log" set) because some postparsing init was skipped as a result of proto not being set and this wasn't expected later in the init code. To fix this, we now make use of the previous patch to perform server's address compatibility checks on hints that are always set when str2sa_range() successfully returns. For log backend, we're also adding a complementary test to check if the address family is of expected type, else we report an error, plus we're moving the postinit logic in log api since _srv_check_proxy_mode() is only meant to check proxy mode compatibility and we were abusing it. This patch depends on: - "MINOR: tools: make str2sa_range() directly return type hints" No backport required unless `29b76ca` gets backported.	2023-11-10 17:49:57 +01:00
Aurelien DARRAGON	12582eb8e5	MINOR: tools: make str2sa_range() directly return type hints str2sa_range() already allows the caller to provide <proto> in order to get a pointer on the protocol matching with the string input thanks to `5fc9328a` ("MINOR: tools: make str2sa_range() directly return the protocol") However, as stated into the commit message, there is a trick: "we can fail to return a protocol in case the caller accepts an fqdn for use later. This is what servers do and in this case it is valid to return no protocol" In this case, we're unable to return protocol because the protocol lookup depends on both the [proto type + xprt type] and the [family type] to be known. While family type might not be directly resolved when fqdn is involved (because family type might be discovered using DNS queries), proto type and xprt type are already known. As such, the caller might be interested in knowing those address related hints even if the address family type is not yet resolved and thus the matching protocol cannot be looked up. Thus in this patch we add the optional net_addr_type (custom type) argument to str2sa_range to enable the caller to check the protocol type and transport type when the function succeeds.	2023-11-10 17:49:57 +01:00
Tim Duesterhus	d7eaa0d553	CLEANUP: Re-apply xalloc_size.cocci (3) This reapplies the xalloc_size.cocci patch across the whole `src/` tree. see `16cc16dd82` see `63ee0e4c01` see `9fb57e8c17`	2023-11-06 20:49:56 +01:00
Willy Tarreau	09eacb8b24	BUG/MINOR: server: remove some incorrect free() calls on null elements In commit `6f4bfed3a` ("MINOR: server: Add parser support for set-proxy-v2-tlv-fmt") a few free() calls were made to an element on error path when it was detected it was NULL. It doesn't have any effect, however there was one case of use-after-free at the end of srv_settings_cpy() that was caught by gcc due to attempting to free the element after freeing its holder. No backport is needed.	2023-11-04 08:56:01 +01:00
Alexander Stephan	6f4bfed3a2	MINOR: server: Add parser support for set-proxy-v2-tlv-fmt This commit introduces a generic server-side parsing of type-value pair arguments and allocation of a TLV list via a new keyword called set-proxy-v2-tlv-fmt. This allows to 1) forward any TLV type with the help of fc_pp_tlv, 2) generally, send out any TLV type and value via a log format expression. To have this fully working the connection will need to be updated in a follow-up commit to actually respect the new server TLV list. default-server support has also been implemented.	2023-11-04 04:56:59 +01:00
Aurelien DARRAGON	1822e8998b	MINOR: server: add helper function to detach server from proxy list Remove some code duplication by introducing a basic helper function to detach a server from its parent proxy. It is supported to call the function even if the server is not yet listed in the proxy list. If the server is not yet listed in the proxy, the function will do nothing. In delete_server(), we previously performed some BUG_ON() to ensure that the detach always succeeded given that we were certain that the server was in the proxy list because it was retrieved through get_backend_server(). However this test is superfluous, we can safely assume that the operation will always succeed if get_backend_server() returned != NULL (we're under full thread isolation), and if it's not the case, then we have a bigger API issue anyway.	2023-10-25 11:59:27 +02:00
Aurelien DARRAGON	e128fc7ce1	BUG/MEDIUM: server: "proto" not working for dynamic servers In `304672320e` ("MINOR: server: support keyword proto in 'add server' cli") improper use of conn_get_best_mux_entry() function was made: First, server's proxy mode was directly passed as "proto_mode" argument to conn_get_best_mux_entry(), but this is strictly invalid because while there is some relationship between proto modes and proxy modes, they don't use the same storage mechanism and cannot be used interchangeably. Because of this bug, conn_get_best_mux_entry() would not work at all for TCP because PR_MODE_TCP equals 0, where PROTO_MODE_TCP normally equals 1. Then another, less sensitive bug, remains: as its name and description imply, conn_get_best_mux_entry() will try its best to return something to the user, only using keyword (mux_proto) input as a hint to return the most relevant mux within the list of muxes that are compatible with proto_side and proto_mode values. This means that even if mux_proto cannot be found or is not available with current proto_side and proto_mode values, conn_get_best_mux_entry() will most probably fall back to a more generic mux. However in cli_parse_add_server(), we directly check the result of conn_get_best_mux_entry() and consider that it will return NULL if the provided keyword hint for mux_proto cannot be found. This will result in the function not raising errors as expected, because most of the time if the expected proto cannot be found, then we'll silently switch to the fallback one, despite the user providing an explicit proto. To fix that, we store the result of conn_get_best_mux_entry() to compare the returned mux proto name with the one we're expecting to get, as it is originally performed in cfgparse during initial server keyword parsing. This patch depends on - "MINOR: connection: add conn_pr_mode_to_proto_mode() helper func" It must be backported up to 2.6.	2023-10-25 11:59:27 +02:00
Aurelien DARRAGON	29b76cae47	BUG/MEDIUM: server/log: "mode log" after server keyword causes crash In `9a74a6c` ("MAJOR: log: introduce log backends"), a mistake was made: it was assumed that the proxy mode was already known during server keyword parsing in parse_server() function, but this is wrong. Indeed, "mode log" can be declared late in the proxy section. Due to this, a simple config like this will cause the process to crash: \|backend test \| \| server name 127.0.0.1:8080 \| mode log In order to fix this, we relax some checks in _srv_parse_init() and store the address protocol from str2sa_range() in server struct, then we set-up a postparsing function that is to be called after config parsing to finish the server checks/initialization that depend on the proxy mode to be known. We achieve this by checking the PR_CAP_LB capability from the parent proxy to know if we're in such case where the effective proxy mode is not yet known (it is assumed that other proxies which are implicit ones don't provide this possibility and thus don't suffer from this constraint). Only then, if the capability is not found, we immediately perform the server checks that depend on the proxy mode, else the check is postponed and it will automatically be performed during postparsing thanks to the REGISTER_POST_SERVER_CHECK() hook. Note that we remove the SRV_PARSE_IN_LOG_BE flag because it was introduced in the above commit and it is no longer relevant. No backport needed unless `9a74a6c` gets backported.	2023-10-25 11:59:27 +02:00
Amaury Denoyelle	f76e94d231	MINOR: backend: refactor insertion in avail conns tree Define a new function srv_add_to_avail_list(). This function is used to centralize connection insertion in available tree. It reuses a BUG_ON() statement to ensure the connection is not present in the idle list.	2023-10-25 10:33:06 +02:00
Amaury Denoyelle	394bd4eb39	BUG/MAJOR: backend: fix idle conn crash under low FD Since the following commit, idle conns are stored in a list as secondary storage to retrieve them in usage order : `5afcb686b9` MAJOR: connection: purge idle conn by last usage The list usage has been extended wherever connection lookups are done both on idle and safe trees. This reduced the code size by replacing two tree loops by a single list loop. LIST_ELEM() is used in this context to retrieve the first idle list element from the server list head. However, macro usage was wrong due to an extra '&' operator which returns an invalid connection reference. This will most of the time cause a crash on conn_delete_from_tree() or affiliated functions. This bug only occurs if the FD pool is exhausted and some idle connections are selected to be killed. It can be reproduced using the following config and h2load command : $ h2load -t 8 -c 800 -m 10 -n 800 "http://127.0.0.1:21080/?s=10k" global maxconn 100 defaults mode http timeout connect 20s timeout client 20s timeout server 20s listen li bind :21080 proto h2 server nginx 127.99.0.1:30080 proto h1 This bug has been introduced by the above commit. Thus no need to backport this fix. Note that LIST_ELEM() macro usage was slightly adjusted also in srv_migrate_conns_to_remove(). The function used toremove_list instead of idle_list connection list element. This is not a bug as they are stored in the same union. However, the new code is clearer as it intends to move connection from the idle_list only into the toremove_list mt-list.	2023-10-25 10:30:45 +02:00
Amaury Denoyelle	9d4c7c1151	MINOR: server: convert @reverse to rev@ standard format Remove the recently introduced '@reverse' notation for HTTP reverse servers. Instead, reuse the 'rev@' prefix already defined for bind lines.	2023-10-20 14:44:37 +02:00
Aurelien DARRAGON	94d0f77deb	MINOR: server: introduce "log-bufsize" kw "log-bufsize" may now be used for a log server (in a log backend) to configure the bufsize of implicit ring associated to the server (which defaults to BUFSIZE).	2023-10-13 10:05:07 +02:00
Aurelien DARRAGON	9a74a6cb17	MAJOR: log: introduce log backends Using "mode log" in a backend section turns the proxy into a log backend which can be used to log-balance logs between multiple log targets (udp or tcp servers). Log backends can be used as regular log targets using the log directive with "backend@be_name" prefix, like so: \| log backend@mybackend local0 A log backend will distribute log messages to servers according to the log load-balancing algorithm that can be set using the "log-balance" option from the log backend section. For now, only the roundrobin algorithm is supported and set by default.	2023-10-13 10:05:06 +02:00
Aurelien DARRAGON	95c4d24825	BUG/MEDIUM: server/cli: don't delete a dynamic server that has streams In cli_parse_delete_server(), we take care of checking that the server is in MAINT and that the cur_sess counter is set to 0, in the hope that no connection/stream resources continue to point to the server, else we refuse to delete it. As shown in GH #2298, this is not sufficient. Indeed, when the server option "on-marked-down shutdown-sessions" is not used, server streams are not purged when srv enters maintenance mode. As such, there could be remaining streams that point to the server. To detect this, a secondary check on srv->cur_sess counter was performed in cli_parse_delete_server(). Unfortunately, there are some code paths that could lead to cur_sess being decremented, and not resulting in a stream being actually shut down. As such, if the delete_server cli is handled right after cur_sess has been decremented with streams still pointing to the server, we could face some nasty bugs where stream->srv_conn could point to garbage memory area, as described in the original github report. To make the check more reliable prior to deleting the server, we don't rely exclusively on cur_sess and directly check that the server is not used in any stream through the srv_has_stream() helper function. Thanks to @capflam who found out the root cause for the bug and greatly helped to provide the fix. This should be backported up to 2.6.	2023-09-21 14:57:01 +02:00
Aurelien DARRAGON	2c9bd3ae80	BUG/MINOR: server: add missing free for server->rdr_pfx rdr_pfx was not being freed during server cleanup, leading to a small memory leak when "redir" argument was used on a server line (HTTP only). This should be backported to all stable versions. [For 2.6 and 2.7: the free should be performed in srv_drop() directly. For older versions: free in deinit() function near the free for the cookie string]	2023-09-15 17:46:49 +02:00
Willy Tarreau	6cbb5a057b	Revert "MAJOR: import: update mt_list to support exponential back-off" This reverts commit `c618ed5ff4`. The list iterator is broken. As found by Fred, running QUIC single-threaded shows that only the first connection is accepted because the accepter relies on the element being initialized once detached (which is expected and matches what MT_LIST_DELETE_SAFE() used to do before). However while doing this in the quic_sock code seems to work, doing it inside the macro shows total breakage and the unit test doesn't work anymore (random crashes). Thus it looks like the fix is not trivial, let's roll this back for the time it will take to fix the loop.	2023-09-15 17:13:43 +02:00
Willy Tarreau	c618ed5ff4	MAJOR: import: update mt_list to support exponential back-off The new mt_list code supports exponential back-off on conflict, which is important for use cases where there is contention on a large number of threads. The API evolved a little bit and required some updates: - mt_list_for_each_entry_safe() is now in upper case to explicitly show that it is a macro, and only uses the back element, doesn't require a secondary pointer for deletes anymore. - MT_LIST_DELETE_SAFE() doesn't exist anymore, instead one just has to set the list iterator to NULL so that it is not re-inserted into the list and the list is spliced there. One must be careful because it was usually performed before freeing the element. Now instead the element must be nulled before the continue/break. - MT_LIST_LOCK_ELT() and MT_LIST_UNLOCK_ELT() have always been unclear. They were replaced by mt_list_cut_around() and mt_list_connect_elem() which more explicitly detach the element and reconnect it into the list. - MT_LIST_APPEND_LOCKED() was only in haproxy so it was left as-is in list.h. It may however possibly benefit from being upstreamed. This required tiny adaptations to event_hdl.c and quic_sock.c. The test case was updated and the API doc added. Note that in order to keep include files small, the struct mt_list definition remains in list-t.h (part of the internal API) and was ifdef'd out in mt_list.h. A test on QUIC with both quictls 1.1.1 and wolfssl 5.6.3 on ARM64 with 80 threads shows a drastic reduction of CPU usage thanks to this and the refined memory barriers. Please note that the CPU usage on OpenSSL 3.0.9 is significantly higher due to the excessive use of atomic ops by openssl, but 3.1 is only slightly above 1.1.1 though: - before: 35 Gbps, 3.5 Mpps, 7800% CPU - after: 41 Gbps, 4.2 Mpps, 2900% CPU	2023-09-13 11:50:33 +02:00
Amaury Denoyelle	5afcb686b9	MAJOR: connection: purge idle conn by last usage Backend idle connections are purged on a recurring basis during the process lifetime. An estimated number of needed connections is calculated and the excess is removed periodically. Before this patch, purge was done directly using the idle then the safe connection tree of a server instance. This has the major drawback of ignoring any specific order: it may remove functional connections while leaving ones which will fail on the next reuse. The problem can be worse when using criteria to differentiate idle connections such as the SSL SNI. In this case, purge may remove connections with a high reuse rate while leaving connections whose criteria never matched once, thus drastically reducing the reuse rate. To improve this, introduce an alternative storage for idle connections, used in parallel with the idle/safe trees. Now, each connection inserted in one of these trees is also inserted in the new list at `srv_per_thread.idle_conn_list`. This guarantees that recently used connections are present at the end of the list. During the purge, use this list instead of the idle/safe trees. Remove first the connections at the front of the list, which were not reused recently. This will ensure that connections that are frequently reused are not purged and should increase the reuse rate, particularly if distinct idle connection criteria are in use.	2023-08-25 15:57:48 +02:00
Amaury Denoyelle	61fc9568fb	MINOR: server: move idle tree insert in a dedicated function Define a new function _srv_add_idle(). This is a simple wrapper to insert a connection in the server idle tree. This is reserved for simple usage and requires the idle_conns lock. In most cases, srv_add_to_idle_list() should be used. This patch does not have any functional change. However, it will help with the next patch as idle connection will be always inserted in a list as secondary storage along with idle/safe trees.	2023-08-25 15:57:48 +02:00
Amaury Denoyelle	77ac8eb4a6	MINOR: connection: simplify removal of idle conns from their trees Small change of API for conn_delete_from_tree(). Now the connection instance is taken as argument instead of its inner node. No functional change introduced with this commit. This slightly simplifies invocation of conn_delete_from_tree(). The most useful change is that this function will be extended in the next patch to be able to remove the connection from its new idle list at the same time as in its idle tree.	2023-08-25 15:57:48 +02:00
Amaury Denoyelle	e6223a3188	MINOR: server: define reverse-connect server Implement reverse-connect server. This server type cannot instantiate its own connection on transfer. Instead, it can only reuse connection from its idle pool. These connections will be populated using the future 'tcp-request session attach-srv' rule. A reverse-connect has no address. Instead, it uses a new custom server notation with '@' character prefix. For the moment, only '@reverse' is defined. An extra check is implemented to ensure server is used in a HTTP proxy.	2023-08-24 14:49:03 +02:00
Amaury Denoyelle	d8d9122a02	MINOR: connection: centralize init/deinit of backend elements A connection contains extra elements which are only used for the backend side. Regroup their allocation and deallocation in two new functions named conn_backend_init() and conn_backend_deinit(). No functional change is introduced with this commit. The new functions are reused in place of manual alloc/dealloc in conn_new() / conn_free(). This patch will be useful for reverse connect support with connection conversion from backend to frontend side and vice-versa.	2023-08-24 14:44:33 +02:00
Amaury Denoyelle	fbe35afaa4	MINOR: proxy: simplify parsing 'backend/server' Several CLI handlers use a server argument specified with the format '<backend>/<server>'. The parsing of this argument is done in two steps, first splitting the string with '/' delimiter and then using get_backend_server() to retrieve the server instance. Refactor these code sections with the following changes : * splitting is reimplemented using ist API * get_backend_server() is removed. Instead use the already existing proxy_be_by_name() then server_find_by_name() which contains duplicated code with the now removed function. No functional change occurs with this commit. However, it will be useful to add new configuration options reusing the same '<backend>/<server>' for reverse connect.	2023-08-24 14:44:33 +02:00
Christopher Faulet	e2e72e578e	BUG/MINOR: server: Don't warn on server resolution failure with init-addr none During startup, when the "none" method for "init-addr" is evaluated, a warning is emitted if a resolution failure was previously encountered. The documentation of the "none" method states it should be used to ignore server resolution failures and let the server start in DOWN state. However, because a warning may be emitted, it is not possible to start HAProxy with "zero-warning" option. The same is true when "-dr" command line option is used. It is counterintuitive and, in a way, this contradicts what is specified in the documentation. So instead, a notice message is now emitted. At the end, if "-dr" command line option is used or if "none" method is explicitly used, it means the admin accepts server resolution failures. There is no reason to emit a warning. This patch should fix the issue #2176. It could be backported to all stable versions but backporting to 2.8 is probably enough for now.	2023-07-20 18:12:44 +02:00
Aurelien DARRAGON	b2f7069479	BUG/MINOR: server: set rid default value in new_server() srv->rid default value is set in _srv_parse_init() after the server is successfully allocated using new_server(). This is wrong because new_server() can be used independently so rid value assignment would be skipped in this case. Fortunately new_server() allocates server data using calloc() so srv->rid is already set to 0 in practice. But if calloc() is replaced by malloc() or other memory allocating function that doesn't zero-initialize srv members, this could lead to rid being uninitialized in some cases. This should be backported in 2.8 with `61e3894dfe` ("MINOR: server: add srv->rid (revision id) value")	2023-07-10 18:28:08 +02:00
Aurelien DARRAGON	19b5a7c7a5	BUG/MINOR: server: inherit from netns in srv_settings_cpy() When support for 'namespace' keyword was added for the 'default-server' directive in `22f41a2` ("MINOR: server: Make 'default-server' support 'namespace' keyword."), we forgot to copy the attribute from the parent to the newly created server. This resulted in the 'namespace' keyword being parsed without errors when used from a 'default-server' directive, but in practice the option was simply ignored. There's no need to duplicate the netns struct because it is stored in a shared list, so copying the pointer does the job. This patch partially fixes GH #2038 and should be backported to all stable versions.	2023-06-14 11:27:29 +02:00
Aurelien DARRAGON	b7f8af3ca9	BUG/MINOR: proxy/server: free default-server on deinit proxy default-server is a specific type of server that is not allocated using new_server(): it is directly stored within the parent proxy structure. However, since it may contain some default config options that may be inherited by regular servers, it is also subject to dynamic members (strings, structures..) that need to be deallocated when the parent proxy is cleaned up. Unfortunately, srv_drop() may not be used directly from p->defsrv since this function is meant to be used on regular servers only (those created using new_server()). To circumvent this, we're splitting srv_drop() to make a new function called srv_free_params() that takes care of the member cleaning which originally takes place in srv_drop(). This function is exposed through server.h, so it may be called from outside server.c. Thanks to this, calling srv_free_params(&p->defsrv) from free_proxy() prevents any memory leaks due to dynamic parameters allocated when parsing a default-server line from a proxy section. This partially fixes GH #2173 and may be backported to 2.8. [While it could also be relevant for other stable versions, the patch won't apply due to architectural changes / name changes between 2.4 => 2.6 and then 2.6 => 2.8. Considering this is a minor fix that only makes memory analyzers happy during deinit paths (at least for <= 2.8), it might not be worth the trouble to backport them any further?]	2023-06-06 15:15:17 +02:00
Christopher Faulet	2c29d1f524	BUG/MINOR: peers: Improve detection of config errors in peers sections There are several misuses in peers sections that are not detected during the configuration parsing and that could lead to undefined behaviors or crashes. First, only one listener is expected for a peers section. If several bind lines or local peer definitions are used, an error is triggered. However, if multiple addresses are set on the same bind line, there is no error while only the last listener is properly configured. On 2.8, there is no crash but side effects are hardly predictable. On older versions, HAProxy crashes if an unconfigured listener is used. Then, there is no check on remote peer names. It is unexpected to have the same name for several remote peers. There is now a test, performed during the post-parsing, to verify all remote peer names are unique. Finally, server parsing options for the peers sections are changed to be sure a port is always defined, and not a port range or a port offset. This patch fixes the issue #2066. It could be backported to all stable versions.	2023-06-05 08:24:34 +02:00
Aurelien DARRAGON	0d2f1acee6	BUG/MINOR: server: memory leak in _srv_update_status_op() on server DOWN When a server is transitioning from UP to DOWN, a log message is generated (e.g.: "Server backend_name/server_name is DOWN"). However since `f71e064` ("MEDIUM: server: split srv_update_status() in two functions"), the allocated buffer tmptrash which is used to prepare the log message is not freed after it has been used, resulting in a small memory leak each time a server goes DOWN because of an operational change. This is a 2.8 specific bug, no backport needed unless the above commit gets backported.	2023-05-17 09:21:01 +02:00
Aurelien DARRAGON	22d584a993	CLEANUP: server: remove useless tmptrash assignments in srv_update_status() Within srv_update_status subfunctions _op() and _adm(), each time tmptrash is freed, we assign it to NULL to ensure it will not be reused. However, within those functions it is not very useful given that tmptrash is never checked against NULL except upon allocation through alloc_trash_chunk(), which happens every time a new log message is generated, sent, and then freed right away, so there are no code paths that could lead to tmptrash being checked for reuse (tmptrash is systematically overwritten since all log messages are independent from each other). This was raised by coverity, see GH #2162.	2023-05-17 09:21:01 +02:00
Aurelien DARRAGON	977688bd57	MINOR: server: fix message report when IDRAIN is set and MAINT is cleared Remaining in drain mode after removing one of the server's admin flags leads to this message being generated: "Server name/backend is leaving forced drain but remains in drain mode." However this is not necessarily true: the server might just be leaving MAINT with the IDRAIN flag set, so the report is incorrect in this case. (FDRAIN was not set so it cannot be cleared) To prevent confusion around this message and to comply with the code comment above it: we remove the "leaving forced drain" precision to make the report suitable for multiple transitions.	2023-05-05 16:28:32 +02:00
Aurelien DARRAGON	dcbc2d2cac	MINOR: checks/event_hdl: SERVER_CHECK event Adding a new event type: SERVER_CHECK. This event is published when a server's check state ought to be reported. (check status change or check result) SERVER_CHECK event is provided as a server event with additional data carrying relevant check's context such as check's result and health.	2023-05-05 16:28:32 +02:00
Aurelien DARRAGON	a163d65254	MINOR: server/event_hdl: add SERVER_ADMIN event Adding a new SERVER event in the event_hdl API. SERVER_ADMIN is implemented as an advanced server event. It is published each time the administrative state changes. (when s->cur_admin changes) SERVER_ADMIN data is an event_hdl_cb_data_server_admin struct that provides additional info related to the admin state change, but can be cast as a regular event_hdl_cb_data_server struct if additional info is not needed.	2023-05-05 16:28:32 +02:00
Aurelien DARRAGON	c249f6d964	OPTIM: server: publish UP/DOWN events from STATE change Reuse cb_data from STATE event to publish UP and DOWN events. This saves some CPU time since the event is only constructed once to publish STATE, STATE+UP or STATE+DOWN depending on the state change.	2023-05-05 16:28:32 +02:00
Aurelien DARRAGON	e3eea29f48	MINOR: server/event_hdl: add SERVER_STATE event Adding a new SERVER event in the event_hdl API. SERVER_STATE is implemented as an advanced server event. It is published each time the server's effective state changes. (when s->cur_state changes) SERVER_STATE data is an event_hdl_cb_data_server_state struct that provides additional info related to the server state change, but can be cast as a regular event_hdl_cb_data_server struct if additional info is not needed.	2023-05-05 16:28:32 +02:00
Aurelien DARRAGON	306a5fc987	MINOR: server/event_hdl: publish macro helper add a macro helper to help publish server events to global and per-server subscription list at once since all server events support both subscription modes.	2023-05-05 16:28:32 +02:00
Willy Tarreau	69530f59ae	MEDIUM: clock: replace timeval "now" with integer "now_ns" This puts an end to the occasional confusion between the "now" date that is internal, monotonic and not synchronized with the system's date, and "date" which is the system's date and not necessarily monotonic. Variable "now" was removed and replaced with a 64-bit integer "now_ns" which is a counter of nanoseconds. It wraps every 585 years, so if all goes well (i.e. if humanity does not need haproxy anymore in 500 years), it will just never wrap. This implies that now_ns is never null and that the zero value can reliably be used as "not set yet" for a timestamp if needed. This will also simplify date checks where it becomes possible again to do "date1<date2". All occurrences of "tv_to_ns(&now)" were simply replaced by "now_ns". Due to the intricacies between now, global_now and now_offset, all 3 had to be turned to nanoseconds at once. It's not a problem since all of them were solely used in 3 functions in clock.c, but they make the patch look bigger than it really is. The clock_update_local_date() and clock_update_global_date() functions are now much simpler as there's no need anymore to perform conversions nor to round the timeval up or down. The wrapping continues to happen by presetting the internal offset in the short future so that the 32-bit now_ms continues to wrap 20 seconds after boot. The start_time used to calculate uptime can still be turned to nanoseconds now. One interrogation concerns global_now_ms which is used only for the freq counters. It's unclear whether there's more value in using two variables that need to be synchronized sequentially like today or to just use global_now_ns divided by 1 million. Both approaches will work equally well on modern systems, the difference might come from smaller ones. Better not change anything for now. One benefit of the new approach is that we now have an internal date with a resolution of the nanosecond and the precision of the microsecond, which can be useful to extend some measurements given that timestamps also have this resolution.	2023-04-28 16:08:08 +02:00
Willy Tarreau	eed5da1037	MINOR: clock: do not use now.tv_sec anymore Instead we're using ns_to_sec(tv_to_ns(&now)) which allows the tv_sec part to disappear. At this point, "now" is only used as a timeval in clock.c where it is updated.	2023-04-28 16:08:08 +02:00
Aurelien DARRAGON	23f352f7d0	MINOR: server/event_hdl: prepare for server event data wrapper Adding the possibility to publish an event using a struct wrapper around existing SERVER events to provide additional contextual info. Using the specific struct wrapper is not required: it is supported to cast event data as a regular server event data struct so that we don't break the existing API. However, casting event data with a more explicit data type allows to fetch event-only relevant hints.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	f71e0645c1	MEDIUM: server: split srv_update_status() in two functions Considering that srv_update_status() is now synchronous again since `3ff577e1` ("MAJOR: server: make server state changes synchronous again"), and that we can easily identify if the update is from an operational or administrative context thanks to "MINOR: server: pass adm and op cause to srv_update_status()". And given that administrative and operational updates cannot be cumulated (since srv_update_status() is called synchronously and independently for admin updates and state/operational updates, and the function directly consumes the changes). We split srv_update_status() in 2 distinct parts: Either <type> is 0, meaning the update is an operational update which is handled by directly looking at cur_state and next_state to apply the proper transition. Also, the check to prevent operational state from being applied if MAINT admin flag is set is no longer needed given that the calling functions already ensure this (ie: srv_set_{running,stopping,stopped}). Or <type> is 1, meaning the update is an administrative update, where cur_admin and next_admin are evaluated to apply the proper transition and deduce the resulting server state (next_state is updated implicitly). Once this is done, both operations share a common code path in srv_update_status() to update proxy and servers stats if required. Thanks to this change, the function's behavior is much more predictable, it is not an all-in-one function anymore. Either we apply an operational change, else it is an administrative change. That's it, we cannot mix the 2 since both code paths are now properly separated.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	76e255520f	MINOR: server: pass adm and op cause to srv_update_status() Operational and administrative state change causes are not propagated through srv_update_status(), instead they are directly consumed within the function to provide additional info during the call when required. Thus, there is no valid reason for keeping adm and op causes within server struct. We are wasting space and keeping unneeded complexity. We now explicitly pass change type (operational or administrative) and associated cause to srv_update_status() so that no extra storage is needed since those values are only relevant from srv_update_status().	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	10518c0d59	CLEANUP: server: fix srv_set_{running, stopping, stopped} function comment Fixing function comments for the server state changing function since they still refer to asynchronous propagation of server state which is no longer in play. Moreover, there were some mixups between running/stopping.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	c54b98ac9a	CLEANUP: server: remove unused variables in srv_update_status() check and px local variable aliases are not very useful. Let's remove them and use s->check and s->proxy instead.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	1746b56e68	MINOR: server: change srv_op_st_chg_cause storage type This one is greatly inspired by "MINOR: server: change adm_st_chg_cause storage type". While looking at current srv_op_st_chg_cause usage, it was clear that the struct needed some cleanup since some leftovers from asynchronous server state change updates were left behind and resulted in some useless code duplication, and making the whole thing harder to maintain. Two observations were made: - by tracking down srv_set_{running, stopped, stopping} usage, we can see that the <reason> argument is always a fixed statically allocated string. - check-related state change context (duration, status, code...) is not used anymore since srv_append_status() directly extracts the values from the server->check. This is pure legacy from when the state changes were applied asynchronously. To prevent code duplication, useless string copies and make the reason/cause more exportable, we store it as an enum now, and we provide srv_op_st_chg_cause() function to fetch the related description string. HEALTH and AGENT causes (check related) are now explicitly identified to make consumers like srv_append_op_chg_cause() able to fetch checks info from the server itself if they need to.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	f3b48a808e	MINOR: server: srv_append_status refacto srv_append_status() has become a swiss-knife function over time. It is used from server code and also from checks code, with various inputs and distinct code paths, making it very hard to guess the actual behavior of the function (resulting string output). To simplify the logic behind it, we're dividing it in multiple contextual functions that take simple inputs and do explicit things, making them more predictable and easier to maintain.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	9b1ccd7325	MINOR: server: change adm_st_chg_cause storage type Even though it doesn't look like it at first glance, this is more like a cleanup than an actual code improvement: Given that srv->adm_st_chg_cause has been used to exclusively store static strings ever since it was implemented, we make the choice to store it as an enum instead of a fixed-size string within server struct. This will allow to save some space in server struct, and will make it more easily exportable (ie: event handlers) because of the reduced memory footprint during handling and the ability to later get the corresponding human-readable message when it's explicitly needed.	2023-04-21 14:36:45 +02:00
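The enum-plus-lookup approach described in the commit above can be sketched as follows. This is a minimal illustration, not the HAProxy code: the `demo_adm_cause` enum, its members, the description strings and `demo_adm_cause_str()` are all hypothetical names chosen for the example.

```c
#include <stddef.h>

/* Hypothetical cause codes; the real enum members in HAProxy may differ. */
enum demo_adm_cause {
	DEMO_ADM_CAUSE_NONE = 0,
	DEMO_ADM_CAUSE_CLI,
	DEMO_ADM_CAUSE_STATS,
	DEMO_ADM_CAUSE_COUNT   /* number of valid causes, keep last */
};

/* Return the human-readable description for <cause>.
 * Unknown values map to an empty string instead of reading out of bounds.
 */
const char *demo_adm_cause_str(enum demo_adm_cause cause)
{
	static const char *const desc[DEMO_ADM_CAUSE_COUNT] = {
		[DEMO_ADM_CAUSE_NONE]  = "",
		[DEMO_ADM_CAUSE_CLI]   = "changed from CLI",
		[DEMO_ADM_CAUSE_STATS] = "changed from stats page",
	};

	if ((unsigned int)cause >= DEMO_ADM_CAUSE_COUNT)
		return "";
	return desc[cause];
}
```

Storing only the enum keeps the per-server footprint to the size of an integer, and the string is fetched lazily by whoever needs to print it.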
Aurelien DARRAGON	85b91375bf	MINOR: server: propagate lb changes through srv_lb_propagate() Now that we have a generic srv_lb_propagate(s) function, let's use it each time we explicitly want to set the status down as well. Indeed, it is tricky to try to handle "down" case explicitly, instead we use srv_lb_propagate() which will call the proper function that will handle the new server state. This will allow some code cleanup and will prevent any logic error. This commit depends on: - "MINOR: server: propagate server state change to lb through single function"	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	8bbe643acc	MINOR: server: propagate server state change to lb through single function Use a dedicated helper function to propagate server state change to lb algorithms, since it is performed at multiple places within srv_update_status() function.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	5f80f8bbc5	MINOR: server: central update for server counters on state change Based on "BUG/MINOR: server: don't miss server stats update on server state transitions", we're also taking advantage of the new centralized logic to update down_trans server counter directly from there instead of multiple places.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	9f5853fa38	BUG/MINOR: server: don't miss server stats update on server state transitions s->last_change and s->down_time updates were manually updated for each effective server state change within srv_update_status(). This is rather error-prone, and as a result there were still some state transitions that were not handled properly since at least 1.8. ie: - when transitioning from DRAIN to READY: downtime was updated (which is wrong since a server in DRAIN state should not be considered as DOWN) - when transitioning from MAINT to READY: downtime was not updated (this can be easily seen in the html stats page) To fix these all at once, and prevent similar bugs from being introduced, we centralize the server last_change and down_time stats logic at the end of srv_update_status(): If the server state changed during the call, then it means that last_change must be updated, with a special case when changing from STOPPED state which means the server was previously DOWN and thus downtime should be updated. This patch depends on: - "MINOR: server: explicitly commit state change in srv_update_status()" This could be backported to every stable version.	2023-04-21 14:36:45 +02:00
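The centralized rule stated in the commit above (update last_change on any effective state change, and add to the downtime only when leaving STOPPED) can be sketched like this. It is a simplified model under stated assumptions: `demo_server`, `demo_state` and `demo_commit_state()` are hypothetical names, and the real function handles many more fields.

```c
#include <time.h>

/* Hypothetical, reduced set of server states for the illustration. */
enum demo_state { DEMO_STOPPED, DEMO_STARTING, DEMO_RUNNING, DEMO_STOPPING };

struct demo_server {
	enum demo_state cur_state;
	time_t last_change;   /* date of the last effective state change */
	long down_time;       /* accumulated seconds spent DOWN */
};

/* Apply <next> and update the stats in one single place. */
void demo_commit_state(struct demo_server *s, enum demo_state next, time_t now)
{
	enum demo_state prev = s->cur_state;

	s->cur_state = next;
	if (prev == next)
		return;   /* no effective change: stats stay untouched */

	/* leaving STOPPED means the server was DOWN until now */
	if (prev == DEMO_STOPPED)
		s->down_time += (long)(now - s->last_change);

	s->last_change = now;
}
```

Because the decision is made from the previous and the new state only, every transition goes through the same two lines and none can be forgotten.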
Aurelien DARRAGON	e80ddb18a8	BUG/MINOR: server: don't miss proxy stats update on server state transitions backend "down" stats logic has been duplicated multiple times in srv_update_status(), resulting in the logic now being error-prone. For example, the following bugfix was needed to compensate for a copy-paste introduced bug: `d332f139` ("BUG/MINOR: server: update last_change on maint->ready transitions too") While the above patch works great, we actually forgot to update the proxy downtime like it is done for other down->up transitions... This is simply illustrating that the current design is error-prone, it is very easy to miss something in this area. To properly update the proxy downtime stats on the maint->ready transition, to cleanup srv_update_status() and to prevent similar bugs from being introduced in the future, proxy/backend stats update are now automatically performed at the end of the server state change if needed. Thus we can remove existing updates that were performed at various places within the function, this simplifies things a bit. This patch depends on: - "MINOR: server: explicitly commit state change in srv_update_status()" This could be backported to all stable versions. Backport notes: 2.2: Replace struct task *srv_cleanup_toremove_conns(struct task *task, void *context, unsigned int state) by struct task *srv_cleanup_toremove_connections(struct task *task, void *context, unsigned short state)	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	22151c70bb	MINOR: server: explicitly commit state change in srv_update_status() As shown in `8f29829` ("BUG/MEDIUM: checks: a down server going to maint remains definitely stucked on down state."), state changes that don't result in explicit lb state change, require us to perform an explicit server state commit to make sure the next state is applied before returning from the function. This is the case for server state changes that don't trigger lb logic and only perform some logging. This is quite error-prone, we could easily forget a state change combination that could result in next_state, next_admin or next_eweight not being applied. (cur_state, cur_admin and cur_eweight would be left with unexpected values) To fix this, we explicitly call srv_lb_commit_status() at the end of srv_update_status() to enforce the new values, even if they were already applied. (when a state change requires lb state update an implicit commit is already performed) Applying the state change multiple times is safe (since the next value always points to the current value). Backport notes: 2.2: Replace struct task *srv_cleanup_toremove_conns(struct task *task, void *context, unsigned int state) by struct task *srv_cleanup_toremove_connections(struct task *task, void *context, unsigned short state)	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	9a1df02ccb	BUG/MINOR: server: incorrect report for tracking servers leaving drain Report message for tracking servers completely leaving drain is wrong: The check for "leaving drain .. via" never evaluates because the condition !(s->next_admin & SRV_ADMF_FDRAIN) is always true in the current block which is guarded by !(s->next_admin & SRV_ADMF_DRAIN). For tracking servers that leave inherited drain mode, this results in the following message being emitted: "Server x/b is UP (leaving forced drain)" Instead of: "Server x/b is UP (leaving drain) via x/a" To fix this: we check if FDRAIN is currently set, else it means that the drain status is inherited from the tracked server (IDRAIN) This regression was introduced with `64cc49cf` ("MAJOR: servers: propagate server status changes asynchronously."), thus it may be backported to every stable version.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	a63f4903c9	MINOR: server/event_hdl: prepare for upcoming refactors This commit does nothing that ought to be mentioned, except that it adds missing comments and slightly moves some function calls out of "sensitive" code in preparation of some server code refactors.	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	d714213862	MINOR: server/event_hdl: add proxy_uuid to event_hdl_cb_data_server Expose proxy_uuid variable in event_hdl_cb_data_server struct to overcome proxy_name fixed length limitation. proxy_uuid may be used by the handler to perform proxy lookups. This should be preferred over lookups relying on proxy_name. (proxy_name is suitable for printing / logging purposes but not for ID lookups since it has a maximum fixed length)	2023-04-21 14:36:45 +02:00
Aurelien DARRAGON	0ddf052972	CLEANUP: server: fix update_status() function comment srv_update_status() function comment says that the function "is designed to be called asynchronously". While this used to be true back then with `64cc49cf` ("MAJOR: servers: propagate server status changes asynchronously.") This is not true anymore since `3ff577e` ("MAJOR: server: make server state changes synchronous again") Fixing the comment in order to better reflect current behavior.	2023-04-21 14:36:45 +02:00
Willy Tarreau	fc458ec8aa	CLEANUP: tree-wide: remove strpcy() from constant strings These ones are generally harmless on modern compilers because the compiler checks them. While gcc optimizes them away without even referencing strcpy(), clang prefers to call strcpy(). Nevertheless they prevent from enabling stricter checks so better remove them altogether. They were all replaced by strlcpy2() and the size of the destination which is always known there.	2023-04-07 18:14:28 +02:00
Aurelien DARRAGON	32483ecaac	MINOR: server: correctly free servers on deinit() srv_drop() function is responsible for freeing the server when the refcount reaches 0. There is one exception: when global.mode has the MODE_STOPPING flag set, srv_drop() will ignore the refcount and free the server on first invocation. This logic has been implemented with `13f2e2ce` ("BUG/MINOR: server: do not use refcount in free_server in stopping mode") and back then doing so was not a problem since dynamic server API was just implemented and srv_take() and srv_drop() were not widely used. Now that dynamic server API is starting to get more popular we cannot afford to keep the current logic: some modules or lua scripts may hold references to existing server and also do their cleanup in deinit phases. In this kind of situation, it would be easy to trigger double-frees since every call to srv_drop() on a specific server will try to free it. To fix this, we take a different approach and try to fix the issue at the source: we now properly drop server references involved with checks/agent_checks in deinit_srv_check() and deinit_srv_agent_check(). While this could theoretically be backported up to 2.6, it is not very relevant for now since srv_drop() usage in older versions is very limited and we're only starting to face the issue in mid 2.8 developments. (ie: lua core updates)	2023-04-05 08:58:16 +02:00
Aurelien DARRAGON	b5ee8bebfc	MINOR: server: always call ssl->destroy_srv when available In srv_drop(), we only call the ssl->destroy_srv() method on specific conditions. But this has two downsides: First, destroy_srv() is responsible for freeing data that may have been allocated in prepare_srv(), but not exclusively: it also frees ssl-related parameters allocated when parsing a server entry, such as ca-file for instance. So this is quite error-prone, we could easily miss a condition where some data needs to be deallocated using destroy_srv() even if prepare_srv() was not used (since prepare_srv() is also conditional), thus resulting in memory leaks. Moreover, depending on srv->proxy to guard the check is probably not a good idea here, since srv_drop() could be called in late de-init paths in which related proxy could be freed already. srv_drop() should only take care of freeing local server data without external logic. Thankfully, destroy_srv() function performs the necessary checks to ensure that a systematic call to the function won't result in invalid reads or double frees. No backport needed.	2023-04-05 08:58:16 +02:00
Aurelien DARRAGON	f175b08bfb	BUG/MINOR: server/del: fix srv->next pointer consistency We recently discovered a bug which affects dynamic server deletion: When a server is deleted, it is removed from the "visible" server list. But as we've seen in previous commit ("MINOR: server: add SRV_F_DELETED flag"), it can still be accessed by someone who keeps a reference on it (waiting for the final srv_drop()). Throughout this transient state, server ptr is still valid (may be dereferenced) and the flag SRV_F_DELETED is set. However, as the server is not part of server list anymore, we have an issue: srv->next pointer won't be updated anymore as the only place where we perform such update is in cli_parse_delete_server() by iterating over the "visible" server list. Because of this, we cannot guarantee that a server with the SRV_F_DELETED flag has a valid 'next' ptr: 'next' could be pointing to a fully removed (already freed) server. This problem can be easily demonstrated with server dumping in the stats: server list dumping is performed in stats_dump_proxy_to_buffer() The function can be interrupted and resumed later by design. ie: output buffer is full: partial dump and finish the dump after the flush This is implemented by calling srv_take() on the server being dumped, and only releasing it when we're done with it using srv_drop(). (drop can be delayed after function resume if buffer is full) While the function design seems OK, it works with the assumption that srv->next will still be valid after the function resumes, which is not true. (especially if multiple servers are being removed in between the 2 dumping attempts) In practice, this did not cause any crash yet (at least this was not reported so far), because server dumping is so fast that it is very unlikely that multiple server deletions make their way between 2 dumping attempts in most setups. 
But still, this is a problem that we need to address because some upcoming work might depend on this assumption as well and for the moment it is not safe at all. ======================================================================== Here is a quick reproducer: With this patch, we're creating a large deletion window of 3s as soon as we reach a server named "t2" while iterating over the list. This will give us plenty of time to perform multiple deletions before the function is resumed. \| diff --git a/src/stats.c b/src/stats.c \| index 84a4f9b6e..15e49b4cd 100644 \| --- a/src/stats.c \| +++ b/src/stats.c \| @@ -3189,11 +3189,24 @@ int stats_dump_proxy_to_buffer(struct stconn sc, struct htx htx, \| * Temporarily increment its refcount to prevent its \| * anticipated cleaning. Call free_server to release it. \| / \| + struct server orig = ctx->obj2; \| for (; ctx->obj2 != NULL; \| ctx->obj2 = srv_drop(sv)) { \| \| sv = ctx->obj2; \| + printf("sv = %s\n", sv->id); \| srv_take(sv); \| + if (!strcmp("t2", sv->id) && orig == px->srv) { \| + printf("deletion window: 3s\n"); \| + thread_idle_now(); \| + thread_harmless_now(); \| + sleep(3); \| + thread_harmless_end(); \| + \| + thread_idle_end(); \| + \| + goto full; /* simulate full buffer / \| + } \| \| if (htx) { \| if (htx_almost_full(htx)) \| @@ -4353,6 +4366,7 @@ static void http_stats_io_handler(struct appctx appctx) \| struct channel res = sc_ic(sc); \| struct htx req_htx, res_htx; \| \| + printf("http dump\n"); \| / only proxy stats are available via http / \| ctx->domain = STATS_DOMAIN_PROXY; \| Ok, we're ready, now we start haproxy with the following conf: global stats socket /tmp/ha.sock mode 660 level admin expose-fd listeners thread 1-1 nbthread 2 frontend stats mode http bind :8081 thread 2-2 stats enable stats uri / backend farm server t1 127.0.0.1:1899 disabled server t2 127.0.0.1:18999 disabled server t3 127.0.0.1:18998 disabled server t4 127.0.0.1:18997 disabled And finally, we execute the following 
script: curl localhost:8081/stats& sleep .2 echo "del server farm/t2" \| nc -U /tmp/ha.sock echo "del server farm/t3" \| nc -U /tmp/ha.sock This should be enough to reveal the issue, I easily manage to consistently crash haproxy with the following reproducer: http dump sv = t1 http dump sv = t1 sv = t2 deletion window = 3s [NOTICE] (2940566) : Server deleted. [NOTICE] (2940566) : Server deleted. http dump sv = t2 sv = ��U [1] 2940566 segmentation fault (core dumped) ./haproxy -f ttt.conf ======================================================================== To fix this, we add prev_deleted mt_list in server struct. For a given "visible" server, this list will contain the pending "deleted" servers references that point to it using their 'next' ptr. This way, whenever this "visible" server is going to be deleted via cli_parse_delete_server() it will check for servers in its 'prev_deleted' list and update their 'next' pointer so that they no longer point to it, and then it will push them in its 'next->prev_deleted' list to transfer the update responsibility to the next 'visible' server (if next != NULL). Then, following the same logic, the server about to be removed in cli_parse_delete_server() will push itself as well into its 'next->prev_deleted' list (if next != NULL) so that it may still use its 'next' ptr for the time it is in transient removal state. In srv_drop(), right before the server is finally freed, we make sure to remove it from the 'next->prev_deleted' list so that 'next' won't try to perform the pointers update for this server anymore. This has to be done atomically to prevent 'next' srv from accessing a purged server. As a result: for a valid server, either deleted or not, 'next' ptr will always point to a non deleted (ie: visible) server. With the proposed fix, and several removal combinations (including unordered cli_parse_delete_server() and srv_drop() calls), I cannot reproduce the crash anymore. 
Example tricky removal sequence that is now properly handled: sv list: t1,t2,t3,t4,t5,t6 ops: take(t2) del(t4) del(t3) del(t5) drop(t3) drop(t4) drop(t5) drop(t2)	2023-04-05 08:58:16 +02:00
Aurelien DARRAGON	75b9d1c041	MINOR: server: add SRV_F_DELETED flag Set the SRV_F_DELETED flag when server is removed from the cli. When removing a server from the cli (in cli_parse_delete_server()), we update the "visible" server list so that the removed server is no longer part of the list. However, despite the server being removed from "visible" server list, one could still access the server data from a valid ptr (ie: srv_take()) Deleted flag helps detecting when a server is in transient removal state: that is, removed from the list, thus not visible but not yet purged from memory.	2023-04-05 08:58:16 +02:00
Christopher Faulet	3a7b539b12	BUG/MEDIUM: connection: Preserve flags when a conn is removed from an idle list The commit `5e1b0e7bf` ("BUG/MEDIUM: connection: Clear flags when a conn is removed from an idle list") introduced a regression. CO_FL_SAFE_LIST and CO_FL_IDLE_LIST flags are used when the connection is released to properly decrement used/idle connection counters. If a connection is idle, these flags must be preserved till the connection is really released. It may be removed from the list but not immediately released. If these flags are lost when it is finally released, the current number of used connections is erroneously decremented. It means this counter may become negative and the counter tracking the number of idle connections is not decremented, suggesting a leak. So, the above commit is reverted and instead we improve a bit the way to detect an idle connection. The function conn_get_idle_flag() must now be used to know if a connection is in an idle list. It returns the connection flag corresponding to the idle list if the connection is idle (CO_FL_SAFE_LIST or CO_FL_IDLE_LIST) or 0 otherwise. But if the connection is scheduled to be removed, 0 is also returned, regardless of the connection flags. This new function is used when the connection is temporarily removed from the list to be used, mainly in muxes. This patch should fix #2078 and #2057. It must be backported as far as 2.2.	2023-03-16 15:34:20 +01:00
Christopher Faulet	5e1b0e7bf8	BUG/MEDIUM: connection: Clear flags when a conn is removed from an idle list When a connection is removed from the safe list or the idle list, CO_FL_SAFE_LIST and CO_FL_IDLE_LIST flags must be cleared. It is performed when the connection is reused. But not when it is moved into the toremove_conns list. It may be an issue because the multiplexer owning the connection may be woken up before the connection is really removed. If the connection flags are not sanitized, it may think the connection is idle and reinsert it in the corresponding list. From this point, we can imagine several bugs. An UAF or a connection reused with an invalid state for instance. To avoid any issue, the connection flags are sanitized when an idle connection is moved into the toremove_conns list. The same is performed at right places in the multiplexers. Especially because the connection release may be delayed (for h2 and fcgi connections). This patch should fix the issue #2057. It must carefully be backported as far as 2.2. Especially on the 2.2 where the code is really different. But some conflicts should be expected on the 2.4 too.	2023-02-28 18:36:29 +01:00
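The contract described for conn_get_idle_flag() in the commit above can be modelled as follows. This is a sketch under assumptions, not the real implementation: the flag values, the `demo_conn` struct and its `to_remove` field (standing in for "scheduled to be removed") are hypothetical.

```c
/* Hypothetical flag values; the real CO_FL_* constants differ. */
#define DEMO_FL_SAFE_LIST 0x01u
#define DEMO_FL_IDLE_LIST 0x02u
#define DEMO_FL_LIST_MASK (DEMO_FL_SAFE_LIST | DEMO_FL_IDLE_LIST)

struct demo_conn {
	unsigned int flags;
	int to_remove;   /* non-zero once the connection is queued for removal */
};

/* Return the idle-list flag of <c>, or 0 if it is not idle or if it is
 * scheduled for removal. The flags themselves are never modified, so the
 * release path can still use them to fix up the used/idle counters.
 */
unsigned int demo_conn_get_idle_flag(const struct demo_conn *c)
{
	if (c->to_remove)
		return 0;
	return c->flags & DEMO_FL_LIST_MASK;
}
```

The point of the design is that "is this connection usable as an idle one?" becomes a read-only question, separate from the bookkeeping flags that must survive until the final release.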
Aurelien DARRAGON	86207e782c	BUG/MINOR: server/add: ensure minconn/maxconn consistency when adding server When a new server was added through the cli using "server add" command, the maxconn/minconn consistency check historically implemented in check_config_validity() for static servers was missing. As a result, when adding a server with the maxconn parameter without the minconn set, the server was unable to handle any connection because srv_dynamic_maxconn() would always return 0. Consider the following reproducer: \| global \| stats socket /tmp/ha.sock mode 660 level admin expose-fd listeners \| \| defaults \| timeout client 5s \| timeout server 5s \| timeout connect 5s \| \| frontend test \| mode http \| bind *:8081 \| use_backend farm \| \| listen dummyok \| bind localhost:18999 \| mode http \| http-request return status 200 hdr test "ok" \| \| backend farm \| mode http Start haproxy and perform the following : echo "add server farm/t1 127.0.0.1:18999 maxconn 100" \| nc -U /tmp/ha.sock echo "enable server farm/t1" \| nc -U /tmp/ha.sock curl localhost:8081 # -> 503 after 5s connect timeout Thanks to ("MINOR: cfgparse/server: move (min/max)conn postparsing logic into dedicated function"), we are now able to perform the consistency check after the new dynamic server has been parsed. This is enough to fix the issue documented here that was reported by Thomas Pedoussaut on the ML. This commit depends on: - ("MINOR: cfgparse/server: move (min/max)conn postparsing logic into dedicated function") It must be backported to 2.6 and 2.7	2023-02-08 14:48:21 +01:00
Aurelien DARRAGON	7d541a91ec	BUG/MINOR: checks: restore legacy on-error fastinter behavior With previous commit, `9e080bf` ("BUG/MINOR: checks: make sure fastinter is used even on forced transitions"), on-error mark-down\|sudden-death\|fail-check are now working as expected. However, on-error fastinter remains broken because srv_getinter(), used in the above commit to check the expiration date, won't return fastinter interval if server health is maxed out (which is the case with on-error fastinter mode). To fix this, we introduce a check flag named CHK_ST_FASTINTER. This flag is set when on-error is triggered. This way we can force srv_getinter() to return fastinter interval whenever the flag is set. The flag is automatically cleared as soon as the new check task expiry is recalculated in process_chk_conn(). This restores original behavior prior to `d114f4a` ("MEDIUM: checks: spread the checks load over random threads"). It must be backported to 2.7 along with the aforementioned commits.	2022-12-07 17:03:55 +01:00
Aurelien DARRAGON	22f82f81e5	MINOR: server/event_hdl: add support for SERVER_UP and SERVER_DOWN events We're using srv_update_status() as the only event source for UP/DOWN server events in an attempt to simplify the support for these 2 events. It seems srv_update_status() is the common path for server state changes anyway. Tested with server state updated from various sources: - the cli - server-state file (maybe we could disable this or at least don't publish in global event queue in the future if it ends in slower startup for setups relying on huge server state files) - dns records (ie: srv template) (again, could be fine-tuned to only publish in server specific subscriber list and no longer in global subscription list if mass dns update tend to slow down srv_update_status()) - normal checks and observe checks (HCHK_STATUS_HANA) (same as above, if checks related state update storms are expected) - lua scripts - html stats page (admin mode)	2022-12-06 10:22:07 +01:00
Aurelien DARRAGON	129ecf441f	MINOR: server/event_hdl: add support for SERVER_ADD and SERVER_DEL events Basic support for ADD and DEL server events are added through this commit: SERVER_ADD is published on dynamic server addition through cli. SERVER_DEL is published on dynamic server deletion through cli. This work depends on: "MINOR: event_hdl: add event handler base api" "MINOR: server: add srv->rid (revision id) value"	2022-12-06 10:22:07 +01:00
Aurelien DARRAGON	61e3894dfe	MINOR: server: add srv->rid (revision id) value With current design, we could not distinguish between previously existing deleted server and a new server reusing the deleted server name/id. This can cause some confusion when auditing stats/events/logs, because the new server will look similar to the old one. To address this, we're adding a new value in server structure: rid. The rid (revision id) value is an unsigned 32-bit value that is set upon server creation. Value is derived from a global counter that starts at 0 and is incremented each time one or multiple server deletions are followed by a server addition (meaning that old name/id reuse could occur). Thanks to this revision id, it is now easy to tell whether the server we're looking at is the same as before or if it has been deleted and re-added in the meantime. (combining server name/id + server revision id yields a process-wide unique identifier)	2022-12-06 10:22:06 +01:00
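The counter rule from the commit above (start at 0, bump once when one or several deletions are followed by an addition) can be sketched as below. All names are hypothetical, and the sketch is single-threaded on purpose; the real code has to deal with concurrency.

```c
#include <stdint.h>

static uint32_t demo_rid_counter;   /* global revision counter, starts at 0 */
static int demo_pending_deletion;   /* set by a deletion, cleared by the next add */

/* Called when a server is deleted: only remember that a name/id may be reused. */
void demo_on_server_del(void)
{
	demo_pending_deletion = 1;
}

/* Called when a server is added: returns the rid to store in the new server.
 * Several deletions in a row cause a single increment.
 */
uint32_t demo_on_server_add(void)
{
	if (demo_pending_deletion) {
		demo_rid_counter++;
		demo_pending_deletion = 0;
	}
	return demo_rid_counter;
}
```

With this rule, two servers that share a name but were created on either side of a deletion always get different rid values, while servers added back to back share the same one.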
Amaury Denoyelle	21e611dc89	MINOR: tools: add port for ipcmp as optional criteria Complete ipcmp() function with a new argument <check_port>. If this argument is true, the function will compare port values besides IP addresses and return true only if both are identical. This commit will simplify QUIC connection migration detection. As such, it should be backported to 2.7.	2022-12-02 14:45:43 +01:00
Willy Tarreau	c21a187ec0	MINOR: server/idle: make the next_takeover index per-tgroup In order to evenly pick idle connections from other threads, there is a "next_takeover" index in the server, that is incremented each time a connection is picked from another thread, and indicates which one to start from next time. With thread groups this doesn't work well because the index is the same regardless of the group, and if a group has more threads than another, there's even a risk to reintroduce an imbalance. This patch introduces a new per-tgroup storage in servers which, for now, only contains an instance of this next_takeover index. This way each thread will now only manipulate the index specific to its own group, and the takeover will become fair again. More entries may come soon.	2022-11-21 19:21:07 +01:00
Willy Tarreau	9dc231a6b2	BUG/MINOR: server/idle: at least use atomic stores when updating max_used_conns In 2.2, some idle conns usage metrics were added by commit `cf612a045` ("MINOR: servers: Add a counter for the number of currently used connections."), which mentioned that the operation doesn't need to be atomic since we're not seeking exact values. This is true but at least we should use atomic stores to make sure not to cause invalid values to appear on archs that wouldn't guarantee atomicity when writing an int, such as writing two 16-bit words. This is pretty unlikely on our targets but better keep the code safe against this. This may be backported as far as 2.2.	2022-11-21 19:21:07 +01:00
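The idea in the commit above (the comparison may be racy, but each write must be indivisible) can be illustrated with C11 atomics. Note that this swaps in `<stdatomic.h>` for illustration; HAProxy uses its own atomic macros, and `demo_srv_counters` and `demo_update_max_used()` are hypothetical names.

```c
#include <stdatomic.h>

struct demo_srv_counters {
	_Atomic unsigned int max_used_conns;
};

/* Record <curr_used> as the new maximum if it is larger than the stored one.
 * The load/compare/store sequence is not atomic as a whole, so a concurrent
 * update may be lost, which is acceptable for an approximate metric. What the
 * atomic store guarantees is that readers never observe a torn value.
 */
void demo_update_max_used(struct demo_srv_counters *c, unsigned int curr_used)
{
	if (curr_used > atomic_load_explicit(&c->max_used_conns, memory_order_relaxed))
		atomic_store_explicit(&c->max_used_conns, curr_used, memory_order_relaxed);
}
```

Relaxed ordering is enough here because the value is a standalone statistic and does not publish any other data.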
Amaury Denoyelle	30fc6da148	MINOR: server: clear prefix on stderr logs after add server cli_parse_add_server() is the CLI handler for 'add server' command. This functions uses usermsgs_ctx to retrieve logs messages from internal ha_alert() calls and display it at the end of the handler. At the beginning of the handler, stderr prefix is defined to "CLI" via usermsgs_clr() function. However, this is not resetted at the end. This causes inconsistency for stderr output : 1. each ha_alert() invocation will reuse "CLI" prefix if 'add server' command was executed before, even in non-CLI context 2. usermsgs_ctx is thread local, so this is only true if this runs on the same thread as 'add server' handler. To fix this, ensure that "CLI" prefix is now resetted after cli_parse_add_server(). This is done thanks to the addition to cli_umsg()/cli_umsgerr() functions. This can be backported up to 2.5 if we prefer to ensure output consistency at the risk of changing stderr behaviors in stable versions. In this case, the previous commit should be backported before : MINOR: cli: define usermsgs print context	2022-11-10 16:42:47 +01:00
Fr�d�ric L�caille	36d1565640	MINOR: peers: Support for peer shards Add "shards" new keyword for "peers" section to configure the number of peer shards attached to such secions. This impact all the stick-tables attached to the section. Add "shard" new "server" parameter to configure the peers which participate to all the stick-tables contents distribution. Each peer receive the stick-tables updates only for keys with this shard value as distribution hash. The "shard" value is stored in ->shard new server struct member. cfg_parse_peers() which is the function which is called to parse all the lines of a "peers" section is modified to parse the "shards" parameter stored in ->nb_shards new peers struct member. Add srv_parse_shard() new callback into server.c to pare the "shard" parameter. Implement stksess_getkey_hash() to compute the distribution hash for a stick-table key as the 64-bits xxhash of the key concatenated to the stick-table name. This function is called by stksess_setkey_shard(), itself called by the already implemented function which create a new stick-table key (stksess_new()). Add ->idlen new stktable struct member to store the stick-table name length to not have to compute it each time a stick-table key hash is computed.	2022-10-24 10:55:53 +02:00
Willy Tarreau	8522348482	BUG/MAJOR: conn-idle: fix hash indexing issues on idle conns Idle connections do not work on 32-bit machines due to an alignment issue causing the connection nodes to be indexed with their lower 32-bits set to zero and the higher 32 ones containing the 32 lower bits of the hash. The cause is the use of ebmb_node with an aligned data, as on this platform ebmb_node is only 32-bit aligned, leaving a hole before the following hash which is a uint64_t: $ pahole -C conn_hash_node ./haproxy struct conn_hash_node { struct ebmb_node node; /* 0 20 */ /* XXX 4 bytes hole, try to pack */ int64_t hash; /* 24 8 */ struct connection * conn; /* 32 4 */ /* size: 40, cachelines: 1, members: 3 */ /* sum members: 32, holes: 1, sum holes: 4 */ /* padding: 4 */ /* last cacheline: 40 bytes */ }; Instead, eb64 nodes should be used when it comes to simply storing a 64-bit key, and that is what this patch does. For backports, a variant consisting in simply marking the "hash" member with a "packed" attribute on the struct also does the job (tested), and might be preferable if the fix is difficult to adapt. Only 2.6 and 2.5 are affected by this.	2022-10-03 12:06:36 +02:00
Aurelien DARRAGON	8d0ff28406	BUG/MEDIUM: server: segv when adding server with hostname from CLI When calling 'add server' with a hostname from the cli (runtime), str2sa_range() does not resolve hostname because it is purposely called without PA_O_RESOLVE flag. This leads to 'srv->addr_node.key' being NULL. According to Willy it is fine behavior, as long as we handle it properly, and is already handled like this in srv_set_addr_desc(). This patch fixes GH #1865 by adding an extra check before inserting 'srv->addr_node' into 'be->used_server_addr'. Insertion and removal will be skipped if 'addr_node.key' is NULL. It must be backported to 2.6 and 2.5 only.	2022-09-17 06:30:59 +02:00
Christopher Faulet	b32cb9b515	REORG: server: Export srv_settings_cpy() function This function will be used to init a proxy with settings of the default proxy. It is mandatory to fix a bug. To do so, it must be exposed.	2022-08-03 11:28:52 +02:00
Christopher Faulet	0b365e3cb5	MINOR: server: Constify source server to copy its settings The source server used to initialize a new server, in srv_settings_cpy() and sub-functions, is now a constant. This patch is mandatory to fix a bug.	2022-08-03 11:28:23 +02:00
Willy Tarreau	245721b329	MINOR: server: indicate when no address was expected for a server When parsing a peers section, it's particularly difficult to tell the difference between the local peer which doesn't have any address, and other peers which need one, and the error messages do not help because with just: peers foo bind :8001 server foo 127.0.0.1:8001 server bar 127.0.0.2:8001 One can get such a confusing message when the local peer is "bar": [peers.cfg:15] : 'server foo/bar' : unknown keyword '127.0.0.1:8001'. It's not clear there why the other peer doesn't trigger an error. With this commit we add a hint in the error message when no address was expected. The error remains quite generic (since deep into the server code) but at least the user gets a hint about why the keyword wasn't understood: [peers.cfg:15] : 'server foo/bar' : unknown keyword '127.0.0.1:8001'. Hint: no address was expected for this server.	2022-05-31 09:25:34 +02:00
Willy Tarreau	cb086c6de1	REORG: stconn: rename conn_stream.{c,h} to stconn.{c,h} There's no more reason for keeping the code and definitions in conn_stream, let's move all that to stconn. The alphabetical ordering of include files was adjusted.	2022-05-27 19:33:35 +02:00
Willy Tarreau	5edca2f0e1	REORG: rename cs_utils.h to sc_strm.h This file contains all the stream-connector functions that are specific to application layers of type stream. So let's name it accordingly so that it's easier to figure what's located there. The alphabetical ordering of include files was preserved.	2022-05-27 19:33:35 +02:00
Willy Tarreau	d0a06d52f4	CLEANUP: applet: use applet_put() everywhere possible This applies the change so that the applet code stops using ci_putchk() and friends everywhere possible, for the much safer applet_put() instead. The change is mechanical but large. Two or three functions used to have no appctx and a cs derived from the appctx instead, which was a reminiscence of old times' stream_interface. These were simply changed to directly take the appctx. No sensitive change was performed, and the old (more complex) API is still usable when needed (e.g. the channel is already known). The change touched roughly a hundred locations, with no less than 124 lines removed. It's worth noting that the stats applet, the oldest of the series, could get a serious lifting, as it's still very channel-centric instead of propagating the appctx along the chain. Given that this code doesn't change often, there's no emergency to clean it up but it would look better.	2022-05-27 19:33:34 +02:00
Willy Tarreau	4596fe20d9	CLEANUP: conn_stream: tree-wide rename to stconn (stream connector) This renames the "struct conn_stream" to "struct stconn" and updates the descriptions in all comments (and the rare help descriptions) to "stream connector" or "connector". This touches a lot of files but the change is minimal. The local variables were not even renamed, so there's still a lot of "cs" everywhere.	2022-05-27 19:33:34 +02:00
Willy Tarreau	0698c80a58	CLEANUP: applet: remove the unneeded appctx->owner This one is the pointer to the conn_stream which is always in the endpoint that is always present in the appctx, thus it's not needed. This patch removes it and replaces it with appctx_cs() instead. A few occurrences that were using __cs_strm(appctx->owner) were moved directly to appctx_strm() which does the equivalent.	2022-05-13 14:28:48 +02:00
Christopher Faulet	6b0a0fb2f9	CLEANUP: tree-wide: Remove any ref to stream-interfaces Stream-interfaces are gone. Corresponding files can safely be removed. In addition, comments are updated accordingly.	2022-04-13 15:10:16 +02:00
Christopher Faulet	a0bdec350f	MEDIUM: stream-int/conn-stream: Move blocking flags from SI to CS Remaining flags and associated functions are moved into the conn-stream scope. These flags are added on the endpoint and not the conn-stream itself. This way it will be possible to get them from the mux or the applet. The functions to get or set these flags are renamed accordingly with the "cs_" prefix and updated to manipulate a conn-stream instead of a stream-interface.	2022-04-13 15:10:15 +02:00
Christopher Faulet	908628c4c0	MEDIUM: tree-wide: Use CS util functions instead of SI ones At many places, we now use the new CS functions to get a stream or a channel from a conn-stream instead of using the stream-interface API. It is the first step to reduce the scope of the stream-interfaces. The main change here is about the applet I/O callback functions. Before the refactoring, the stream-interface was the appctx owner. Thus, it was heavily used. Now, as far as possible, the conn-stream is used. Of course, there remain many calls to the stream-interface API.	2022-04-13 15:10:14 +02:00
Willy Tarreau	ca1acd6080	MINOR: config: add a function to dump all known config keywords All registered config keywords that are valid in the config parser are dumped to stdout organized like the regular sections (global, listen, etc). Some keywords that are known to only be valid in frontends or backends will be suffixed with [FE] or [BE]. All regularly registered "bind" and "server" keywords are also dumped, one per "bind" or "server" line. Those depending on ssl are listed after the "ssl" keyword. Doing so required to export the listener and server keyword lists that were static. The function is called from dump_registered_keywords() for keyword class "cfg".	2022-03-29 18:01:32 +02:00
William Lallemand	0d05867e78	MINOR: server: export server_parse_sni_expr() function Export the server_parse_sni_expr() function in order to create a SNI expression in a server which was not parsed from the configuration.	2022-03-16 15:55:30 +01:00
Amaury Denoyelle	76e8b70e43	MEDIUM: server: remove experimental-mode for dynamic servers Dynamic servers feature is now judged to be stable enough. Remove the experimental-mode requirement for "add/del server" commands. This should facilitate dynamic servers adoption.	2022-03-11 14:28:28 +01:00
Christopher Faulet	86e1c3381b	MEDIUM: applet: Set the conn-stream as appctx owner instead of the stream-int Because appctx is now an endpoint of the conn-stream, there is no reason to still have the stream-interface as appctx owner. Thus, the conn-stream is now the appctx owner.	2022-02-24 11:00:02 +01:00
William Dauchy	a087f87875	BUG/MEDIUM: server: avoid changing healthcheck ctx with set server ssl While giving a fresh try to `set server ssl` (which I wrote), I realised the behavior is a bit inconsistent. Indeed when using this command over a server with ssl enabled for the data path but also for the health check path we have: - data and health check done using tls - emit `set server be_foo/srv0 ssl off` - data path and health check path become plain text - emit `set server be_foo/srv0 ssl on` - data path becomes tls and health check path remains plain text while I thought the end result would be: - data path and health check path come back in tls In the current code we indeed erase all connections while deactivating, but restore only the data path while activating. I made this mistake in the past because I was testing with a case where the health check was plain text by default. There are several ways to solve this issue. The cleanest one would probably be to avoid changing the health check connection when we use `set server ssl` command, and create a new command `set server ssl-check` to change this. For now I assumed this would be ok to simply avoid changing the health check path and be more consistent. This patch tries to address that and also updates the documentation. It should not break the existing usage with health check on plain text, as in this case they should have `no-check-ssl` in defaults. Without this patch, it makes the command unusable in an env where you have a list of servers to add along the way with initial `server-template`, and all using tls for data and healthcheck path. For 2.6 we should probably reconsider and add `set server ssl-check` command for better granularity of cases. If this solution is accepted, this patch should be backported up to >= 2.4. The alternative solution was to restore the previous state, but I believe this will create even more confusion in the future. 
Signed-off-by: William Dauchy <wdauchy@gmail.com>	2022-01-18 12:05:17 +01:00
William Lallemand	2c776f1c30	BUG/MEDIUM: ssl: initialize correctly ssl w/ default-server This bug was introduced by `d817dc73` ("MEDIUM: ssl: Load client certificates in a ckch for backend servers") in which the creation of the SSL_CTX for a server was moved to the configuration parser when using a "crt" keyword instead of being done in ssl_sock_prepare_srv_ctx(). The patch `0498fa40` ("BUG/MINOR: ssl: Default-server configuration ignored by server") made it worse by setting the same SSL_CTX for every server using a default-server, resulting in any SSL option on a server being applied to every server in its backend. This patch fixes the issue by reintroducing a string which stores the path of the certificate inside the server structure, and loading the certificate in ssl_sock_prepare_srv_ctx() again. This is a quick fix to backport, a cleaner way can be achieved by always creating the SSL_CTX in ssl_sock_prepare_srv_ctx() and splitting properly the ssl_sock_load_srv_cert() function. This patch fixes issue #1488. Must be backported as far as 2.4.	2021-12-29 14:42:16 +01:00
Christopher Faulet	70f8948364	BUG/MINOR: cli/server: Don't crash when a server is added with a custom id When a server is dynamically added via the CLI with a custom id, the key used to insert it in the backend's tree of used names is not initialized. The server id must be used but it is only used when no custom id is provided. Thus, with a custom id, HAProxy crashes. Now, the server id is always used to init this key, to be able to insert the server in the corresponding tree. This patch should fix the issue #1481. It must be backported as far as 2.4.	2021-12-07 19:04:33 +01:00
Christopher Faulet	4ab2679689	BUG/MINOR: server: Don't rely on last default-server to init server SSL context During post-parsing stage, the SSL context of a server is initialized if SSL is configured on the server or its default-server. It is required to be able to enable SSL at runtime. However a regression was introduced, because the last parsed default-server is used. But it is not necessarily the default-server line used to configure the server. This may lead to erroneously initializing the SSL context for a server without SSL parameter or to skipping it while it should be done. The problem is the default-server used to configure a server is not saved during configuration parsing. So, the information is lost during the post-parsing. To fix the bug, the SRV_F_DEFSRV_USE_SSL flag is introduced. It is used to know when a server was initialized with a default-server using SSL. For the record, the commit `f63704488e` ("MEDIUM: cli/ssl: configure ssl on server at runtime") has introduced the bug. This patch must be backported as far as 2.4.	2021-12-01 11:47:08 +01:00
Tim Duesterhus	025b93e3a2	CLEANUP: Apply ha_free.cocci Use `ha_free()` where possible.	2021-11-05 07:48:38 +01:00
Emeric Brun	d174f0e59a	MINOR: resolvers/dns: split dns and resolver counters in dns_counter struct This patch adds a union and struct into dns_counter struct to split application specific counters. The only current existing application is the resolver.c layer but in the future we could handle different applications such as dns load balancing with other specific counters. This patch should not be backported.	2021-11-03 17:16:46 +01:00
Amaury Denoyelle	f9d5957cd9	MINOR: server: add ws keyword Implement parsing for the server keyword 'ws'. This is used to configure the mode of selection for websocket protocol. The configuration documentation has been updated. A new regtest has been created to test the proper behavior of the keyword.	2021-11-03 16:24:48 +01:00
Amaury Denoyelle	9c3251d108	MEDIUM: server/backend: implement websocket protocol selection Handle properly websocket streams if the server uses an ALPN with both h1 and h2. Add a new field h2_ws in the server structure. If set to off, reuse is automatically disabled on backend and ALPN is forced to http1.x if possible. Nothing is done if on. Implement a mechanism to be able to use a different http version for websocket streams. A new server member <ws> represents the algorithm to select the protocol. This can override the server <proto> configuration. If the connection uses ALPN for proto selection, it is updated for websocket streams to select the right protocol. Three modes of selection are implemented: - auto : use the same protocol between non-ws and ws streams. If ALPN is used, try to update it to "http/1.1"; this is only done if the server ALPN contains "http/1.1". - h1 : use http/1.1 - h2 : use http/2.0; this requires the server to support RFC8441 or an error will be returned by haproxy.	2021-11-03 16:24:48 +01:00
Willy Tarreau	14e7f29e86	MINOR: protocols: replace protocol_by_family() with protocol_lookup() At a few places we were still using protocol_by_family() instead of the richer protocol_lookup(). The former is limited as it enforces SOCK_STREAM and a stream protocol at the control layer. At least with protocol_lookup() we don't have this limitation. The values were still set for now but later we can imagine making them configurable on the fly.	2021-10-27 17:41:07 +02:00
Willy Tarreau	6878f80427	MEDIUM: resolvers: remove the last occurrences of the "safe" argument This one was used to indicate whether the callee had to follow particularly safe code path when removing resolutions. Since the code now uses a kill list, this is not needed anymore.	2021-10-20 17:54:27 +02:00
Tim Duesterhus	c5aa113d80	CLEANUP: Apply strcmp.cocci This fixes the use of the various *cmp functions to use != 0 or == 0.	2021-10-18 07:17:04 +02:00
Christopher Faulet	dfd10ab5ee	MINOR: proxy: Introduce proxy flags to replace disabled bitfield This change is required to support TCP/HTTP rules in defaults sections. The 'disabled' bitfield in the proxy structure, used to know if a proxy is disabled or stopped, is replaced by a generic bitfield named 'flags'. PR_DISABLED and PR_STOPPED flags are renamed to PR_FL_DISABLED and PR_FL_STOPPED respectively. In addition, everywhere there is a test to know if a proxy is disabled or stopped, there is now a bitwise AND operation on PR_FL_DISABLED and/or PR_FL_STOPPED flags.	2021-10-15 14:12:19 +02:00
Willy Tarreau	bf9498a31b	MINOR: resolvers: fix the resolv_str_to_dn_label() API about trailing zero This function is bogus at the API level: it demands that the input string is zero-terminated and that its length including the trailing zero is passed on input. While that already looks smelly, the trailing zero is copied as-is, and is then explicitly replaced with a zero... Not only all callers have to pass hostname_len+1 everywhere to work around this absurdity, but this requirement causes a bug in the do-resolve() action that passes random string lengths on input, and that will be fixed on a subsequent patch. Let's fix this API issue for now. This patch will have to be backported, and in versions 2.3 and older, the function is in dns.c and is called dns_str_to_dn_label().	2021-10-14 21:24:18 +02:00
Willy Tarreau	260f324c19	REORG: server: uninline the idle conns management functions The following functions are quite heavy and have no reason to be kept inlined: srv_release_conn, srv_lookup_conn, srv_lookup_conn_next, srv_add_to_idle_list They were moved to server.c. It's worth noting that they're a bit at the edge between server and connection and that maybe we could create an idle-conn file for these in the near future.	2021-10-07 01:41:14 +02:00
Willy Tarreau	a8a72c68d5	CLEANUP: ssl/server: move ssl_sock_set_srv() to srv_set_ssl() in server.c This one has nothing to do with ssl_sock as it manipulates the struct server only. Let's move it to server.c and remove unneeded dependencies on ssl_sock.h. This further reduces by 10% the number of includes of opensslconf.h and by 0.5% the number of compiled lines.	2021-10-07 01:41:06 +02:00
Willy Tarreau	80527bcb9d	CLEANUP: server: always include the storage for SSL settings The SSL stuff in struct server takes less than 3% of it and requires lots of annoying ifdefs in the code just to take care of the cases where the field is absent. Let's get rid of this and stop including openssl-compat from server.c to detect NPN and ALPN capabilities. This reduces the total LoC by another 0.4%.	2021-10-07 01:36:51 +02:00
Willy Tarreau	beeabf5314	MINOR: task: provide 3 task_new_* wrappers to simplify the API We'll need to improve the API to pass other arguments in the future, so let's start to adapt better to the current use cases. task_new() is used: - 18 times as task_new(tid_bit) - 18 times as task_new(MAX_THREADS_MASK) - 2 times with a single bit (in a loop) - 1 in the debug code that uses a mask This patch provides 3 new functions to achieve this: - task_new_here() to create a task on the calling thread - task_new_anywhere() to create a task to be run anywhere - task_new_on() to create a task to run on a specific thread The change is trivial and will allow us to later concentrate the required adaptations to these 3 functions only. It's still possible to call task_new() if needed but a comment was added to encourage the use of the new ones instead. The debug code was not changed and still uses it.	2021-10-01 18:36:29 +02:00
Amaury Denoyelle	cd8a6f28c6	MINOR: server: enable slowstart for dynamic server Enable the 'slowstart' keyword for dynamic servers. The slowstart task is allocated in 'add server' handler if slowstart is used. As the server is created in disabled state, there is no need to start the task. The slowstart task will be automatically started on the first 'enable server' invocation.	2021-09-21 14:00:32 +02:00
Amaury Denoyelle	29d1ac1330	REORG: server: move slowstart init outside of checks 'slowstart' can be used without check on a server, with the CLI handlers 'enable/disable server'. Move the code to initialize and start the slowstart task outside of check.c. This change will also be reused to enable slowstart for dynamic servers.	2021-09-21 14:00:32 +02:00
Amaury Denoyelle	725f8d29ff	MINOR: server: enable more check related keywords for dynamic servers Allow to use the check related keywords defined in server.c. These keywords can be enabled now that checks have been implemented for dynamic servers. Here is the list of the new keywords supported : - error-limit - observe - on-error - on-marked-down - on-marked-up	2021-09-21 14:00:32 +02:00
Amaury Denoyelle	79b90e8cd4	MINOR: server: enable more keywords for ssl checks for dynamic servers Allow to configure ssl support for dynamic server checks independently of the ssl server configuration. This is done via the keyword "check-ssl". Also enable to configure the sni/alpn used for the check via "check-sni/alpn".	2021-09-21 14:00:07 +02:00
Amaury Denoyelle	b621552ca3	BUG/MINOR: server: alloc dynamic srv ssl ctx if proxy uses ssl chk rule The ssl context is not initialized for a dynamic server, even if there is a tcpcheck rule which uses ssl on the related backend. This will cause the check initialization to fail with the message : "Out of memory when initializing an SSL connection" This can be reproduced by having the following config in the backend : option tcp-check tcp-check connect ssl and creating a dynamic server with check activated and a ca-file. Fix this by calling the prepare_srv xprt callback when the proxy option PR_O_TCPCHK_SSL is set. Check support for dynamic servers has been merged in the current branch. No backport needed.	2021-09-21 13:56:03 +02:00
Amaury Denoyelle	0f456d5029	BUG/MINOR: server: allow 'enable health' only if check configured Test that checks have been configured on the server before enabling via the 'enable health' CLI. This mirrors the 'enable agent' command. Without this, a user can use the command on the server without checks. This leaves the server in an undefined state. Notably, the stat page reports the server in check transition. This condition was left on the following reorg commit. `2c04eda8b5` REORG: cli: move "{enable|disable} health" to server.c This should be backported up to 1.8.	2021-09-21 11:50:22 +02:00
Tim Duesterhus	d5fc8fcb86	CLEANUP: Add haproxy/xxhash.h to avoid modifying import/xxhash.h This solves setting XXH_INLINE_ALL in a cleaner way, because the imported header is not modified, easing future updates. see `6f7cc11e6d`	2021-09-11 19:58:45 +02:00
Amaury Denoyelle	14c3c5c121	MEDIUM: server: allow to remove servers at runtime except non purgeable Relax the condition on "delete server" CLI handler to be able to remove all servers, even non dynamic, except if they are flagged as non purgeable. This change is necessary to extend the use cases for dynamic servers with reload. It's expected that each dynamic server created via the CLI is manually committed in the haproxy configuration by the user. Dynamic servers will be present on reload only if they are present in the configuration file. This means that non-dynamic servers must be allowed to be removable at runtime. The dynamic servers removal reg-test has been updated and renamed to reflect its purpose. A new test is present to check that non-purgeable servers cannot be removed.	2021-08-25 15:53:54 +02:00
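
Commit bf9498a31b in the list above describes an API where the caller had to pass the hostname length including its trailing zero. As an illustration only, here is a minimal sketch of a hostname to DNS label conversion that takes the length without the trailing zero. The function name `str_to_dn_label` and its signature are hypothetical and are not taken from HAProxy's source; the 63-byte label limit is the one from the DNS wire format.

```c
#include <stddef.h>

/* Hypothetical helper, NOT HAProxy's resolv_str_to_dn_label().
 * Converts the dotted hostname <str> of <str_len> bytes (trailing zero
 * NOT counted) into DNS label format in <dn>: each label is prefixed by
 * its length and the result ends with a zero byte, so "www.example.com"
 * becomes "\3www\7example\3com\0".
 * Returns the number of bytes written (terminating zero included), or -1
 * if <dn> is too small or a label is empty or longer than 63 bytes.
 */
int str_to_dn_label(const char *str, size_t str_len, char *dn, size_t dn_len)
{
    size_t i;
    size_t len_pos = 0; /* offset in <dn> of the current label's length byte */
    size_t label_len;

    /* one leading length byte plus the terminating zero */
    if (dn_len < str_len + 2)
        return -1;

    for (i = 0; i < str_len; i++) {
        if (str[i] == '.') {
            /* close the current label; the dot's slot in <dn> becomes
             * the length byte of the next label */
            label_len = i - len_pos;
            if (label_len == 0 || label_len > 63)
                return -1;
            dn[len_pos] = (char)label_len;
            len_pos = i + 1;
        } else {
            dn[i + 1] = str[i];
        }
    }

    /* close the last label; this also rejects an empty input or a
     * trailing dot */
    label_len = str_len - len_pos;
    if (label_len == 0 || label_len > 63)
        return -1;
    dn[len_pos] = (char)label_len;
    dn[str_len + 1] = 0;

    return (int)(str_len + 2);
}
```

With this shape, a caller passes `strlen(hostname)` directly rather than `strlen(hostname) + 1`, which is the kind of workaround the commit message complains about.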

809 Commits