haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-07 23:56:57 +02:00

Author	SHA1	Message	Date
Willy Tarreau	59b0fecfd9	MINOR: lb/api: let callers of take_conn/drop_conn tell if they have the lock The two algos defining these functions (first and leastconn) do not need the server's lock. However it's already present in pendconn_process_next_strm() so the API must be updated so that the functions may take it if needed and that the callers indicate whether they already own it. As such, the call places (backend.c and stream.c) now do not take it anymore, queue.c was unchanged since it's already held, and both "first" and "leastconn" were updated to take it if not already held. A quick test on the "first" algo showed a jump from 432 to 565k rps by just dropping the lock in stream.c!	2021-02-18 10:06:45 +01:00
Willy Tarreau	751153e0f1	OPTIM: server: switch the actconn list to an mt-list The remaining contention on the server lock solely comes from sess_change_server() which takes the lock to add and remove a stream from the server's actconn list. This is both expensive and pointless since we have mt-lists, and this list is only used by the CLI's "shutdown server sessions" command! Let's migrate to an mt-list and remove the need for this costly lock. By doing so, the request rate increased by ~1.8%.	2021-02-18 10:06:45 +01:00
Willy Tarreau	4e9df2737d	BUG/MEDIUM: checks: don't needlessly take the server lock in health_adjust() The server lock was taken preventively for anything in health_adjust(), including the static config checks needed to detect that the lock was not needed, while the function is always called on the response path to update a server's status. This was responsible for huge contention causing a performance drop of about 17% on 16 threads. Let's move the lock only where it should be, i.e. inside the function around the critical sections only. By doing this, a 16-thread process jumped back from 575 to 675 krps. This should be backported to 2.3 as the situation degraded there, and maybe later to 2.2.	2021-02-18 10:06:45 +01:00
Willy Tarreau	64ba5ebadc	BUG/MINOR: checks: properly handle wrapping time in __health_adjust() There's an issue when a server state changes, we use an integer comparison to decide whether or not to reschedule a test instead of using a wrapping timer comparison. This will cause some health-checks not to be immediately triggered half of the time, and some unneeded calls to task_queue() to be performed in other cases. This bug has always been there as it was introduced with the commit that added the feature, `97f07b832` ("[MEDIUM] Decrease server health based on http responses / events, version 3"). This may be backported everywhere.	2021-02-18 10:06:45 +01:00
Amaury Denoyelle	36441f46c4	MINOR: connection: remove pointers for prehash in conn_hash_params Replace unneeded pointers for sni/proxy prehash by plain data type. The code is slightly cleaner.	2021-02-17 16:43:07 +01:00
Amaury Denoyelle	4c09800b76	BUG/MINOR: backend: do not call smp_make_safe for sni conn hash conn_hash_prehash does not need a nul-terminated string, thus it is only needed to test if the sni sample is not null before using it as connection hash input. Moreover, a bug could be introduced between smp_make_safe and ssl_sock_set_servername call. Indeed, smp_make_safe may call smp_dup which duplicates the sample in the trash buffer. If another function manipulates the trash buffer before the call to ssl_sock_set_servername, the sni sample might be erased. Currently, no function seems to do that except make_proxy_line in case proxy protocol is used simultaneously with the sni on the server. This does not need to be backported.	2021-02-17 16:38:20 +01:00
Willy Tarreau	9805859f24	BUG/MINOR: session: atomically increment the tracked sessions counter In session_count_new() the tracked counter was still incremented with a "++" outside of any lock, resulting in occasional slightly off values such as the following: # table: foo, type: string, size:1000, used:1 0xb2a398: key=127.1.2.3 use=0 exp=86398318 sess_cnt=999959 http_req_cnt=1000004 Now with the correct atomic increment: # table: foo, type: string, size:1000, used:1 0x7f82a4026d38: key=127.1.2.3 use=0 exp=86399294 sess_cnt=1000004 http_req_cnt=1000004 This can be backported to 1.8.	2021-02-16 18:08:12 +01:00
Emeric Brun	267221557f	BUG/MEDIUM: dns: fix multiple double close on fd in dns.c It seems that fd_delete already closes the file descriptor, so we must not close the fd again after that. This should fix issues #1128, #1130 and #1131.	2021-02-15 15:42:44 +01:00
Emeric Brun	0e40fda16a	BUG/MINOR: dns: fix ring attach control on dns_session_new This patch adds a check on ring_attach, which cannot currently fail since we are the first to try to attach. This should fix issue #1126.	2021-02-15 15:24:28 +01:00
Emeric Brun	743afeed33	BUG/MINOR: dns: missing test writing in output channel in session handler This patch handles a case that should never happen when writing to the output channel, since we check the available room beforehand. This patch should fix GitHub issue #1132.	2021-02-15 15:13:01 +01:00
Emeric Brun	526b79219e	BUG/MINOR: dns: dns_connect_server must return -1 unsupported nameserver's type This patch fixes the return code when dns_connect_server is called on an unsupported type (which should not happen). With this, we have the guarantee that after a return of 0 the fd is never -1. This patch should fix GitHub issues #1127, #1128 and #1130.	2021-02-15 15:12:58 +01:00
Emeric Brun	538bb0441c	BUG/MINOR: dns: add test on result getting value from buffer into ring. This patch adds a missing test in dns_session_io_handler when getting the query id from the buffer of the ring. An error should never happen since messages are completely added atomically. This patch should fix GitHub issue #1133.	2021-02-15 15:12:55 +01:00
William Dauchy	3679d0c794	MINOR: stats: add helper to get status string move listen status to a helper, defining both status enum and string definition. this will be helpful to be reused in prometheus code. It also removes this hard-to-read nested ternary. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-15 14:13:32 +01:00
William Dauchy	655e14ef17	MEDIUM: stats: allow to select one field in `stats_fill_li_stats` prometheus approach requires to output all values for a given metric name; meaning we iterate through all metrics, and then iterate in the inner loop on all objects for this metric. In order to allow more code reuse, adapt the stats API to be able to select one field or fill them all otherwise. From this patch it should be possible to add support for listen stats in prometheus. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-15 14:13:32 +01:00
William Dauchy	b26122b032	CLEANUP: check: fix get_check_status_info declaration we always put a \n between function name and `{` Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-15 11:56:31 +01:00
Christopher Faulet	eaab7325a7	BUG/MINOR: server: Remove RMAINT from admin state when loading server state The RMAINT admin state is dynamic and should be removed from the srv_admin_state parameter when a server state is loaded from a server-state file. Otherwise an error is reported, the server-state line is ignored and the server state is not updated. This patch should fix the issue #576. It must be backported as far as 1.8.	2021-02-15 11:56:31 +01:00
Emeric Brun	56fc5d9ebc	MEDIUM: resolvers: add supports of TCP nameservers in resolvers. This patch introduces the new "server" line to set a TCP nameserver in a "resolvers" section: server <name> <address> [param*] Used to configure a DNS TCP or stream server. This supports all "server" parameters found in paragraph 5.2. Some of these parameters are irrelevant for DNS resolving. Note: currently 4 queries are pipelined on the same connection. A batch of idle connections is removed every 5 seconds. "maxconn" can be configured to limit the number of those concurrent connections, and TLS should also be usable if the server supports it. The current implementation limits to 4 pipelined The name of the line in configuration is open to discussion and could be changed before the next release.	2021-02-13 10:03:46 +01:00
Emeric Brun	fd647d5f5f	MEDIUM: dns: adds code to support pipelined DNS requests over TCP. This patch introduces the "dns_stream_nameserver" to use DNS over TCP on strict nameservers. For the upper layer it is analogous to the API used with UDP nameservers, except that the user must switch the nameserver to "stream" mode at init using "dns_stream_init". The fallback from UDP to TCP is not handled and is not the purpose of this feature, which is meant to choose the transport layer during initialization. Currently there is a hardcoded limit of 4 pipelined transactions per TCP connection. A batch of idle connections is expired every 5s. This code is designed to support a maximum DNS message size on TCP: 64k. Note: this code won't retry unanswered queries; this should be handled by the upper layer.	2021-02-13 10:03:46 +01:00
Emeric Brun	c943799c86	MEDIUM: resolvers/dns: split dns.c into dns.c and resolvers.c This patch splits the current dns.c into two files: The first, dns.c, contains code related to DNS message exchange over UDP and, in the future, TCP. We try to remove dependencies on resolving to make it usable by other things such as DNS load balancing. The new resolvers.c inherits the code specific to the actual resolvers. Note: It was really difficult to obtain a clean diff due to the amount of moved code. Note2: Counters and stuff related to stats are not cleanly separated because currently counters for both layers are merged and hard to separate for now.	2021-02-13 10:03:46 +01:00
Emeric Brun	d26a6237ad	MEDIUM: resolvers: split resolving and dns message exchange layers. This patch splits the recv and send functions into two layers. The lowest is responsible for DNS message transactions over the network. With this, we could use the DNS message layer for something other than resolving, load balancing for instance. This patch also reworks the way a nameserver is initialized and introduces the new struct dns_dgram_server to prepare the arrival of dns_stream_server and the support of DNS over TCP. The way to retry a request after a send failure due to EAGAIN was reworked. Previously there was no control and all "pending" queries were replayed each time EAGAIN was hit. This patch introduces a ring to stack messages in case of send failure. This ring is emptied when the poller shows that the socket is ready again to push messages.	2021-02-13 09:51:10 +01:00
Emeric Brun	d3b4495f0d	MINOR: resolvers: rework dns stats prototype because specific to resolvers Counters are currently stored into lowlevel nameservers struct but most of them are resolving layer data and increased in the upper layer So this patch renames the prototype used to allocate/dump them with prefix 'resolv' waiting for a clean split.	2021-02-13 09:43:18 +01:00
Emeric Brun	6a2006ae37	MINOR: resolvers: replace nameserver's resolver ref by generic parent pointer This will allow to use nameservers in something else than a resolver section (load balancing for instance).	2021-02-13 09:43:18 +01:00
Emeric Brun	8a55193d4e	MEDIUM: resolvers: move resolvers section parsing from cfgparse.c to dns.c The resolver section parsing is moved from cfgparse.c to dns.c	2021-02-13 09:43:18 +01:00
Emeric Brun	d30e9a1709	MINOR: resolvers: rework prototype suffixes to split resolving and dns. A lot of prototypes in dns.h are specific to resolvers and must be renamed to split resolving and DNS layers.	2021-02-13 09:43:18 +01:00
Emeric Brun	456de77bdb	MINOR: resolvers: renames resolvers DNS_UPD_* returncodes to RSLV_UPD_* This patch renames some #defines prefixes from DNS to RSLV.	2021-02-13 09:43:18 +01:00
Emeric Brun	30c766ebbc	MINOR: resolvers: renames resolvers DNS_RESP_* errcodes RSLV_RESP_* This patch renames some #defines prefixes from DNS to RSLV.	2021-02-13 09:43:18 +01:00
Emeric Brun	21fbeedf97	MINOR: resolvers: renames some dns prefixed types using resolv prefix. @@ -119,8 +119,8 @@ struct act_rule { - } dns; /* dns resolution */ + } resolv; /* resolving */ -struct dns_options { +struct resolv_options {	2021-02-13 09:43:18 +01:00
Emeric Brun	08622d3c0a	MINOR: resolvers: renames some resolvers specific types to not use dns prefix This patch applies those changes on names: -struct dns_resolution { +struct resolv_resolution { -struct dns_requester { +struct resolv_requester { -struct dns_srvrq { +struct resolv_srvrq { @@ -185,12 +185,12 @@ struct stream { struct { - struct dns_requester dns_requester; + struct resolv_requester requester; ... - } dns_ctx; + } resolv_ctx;	2021-02-13 09:43:18 +01:00
Emeric Brun	750fe79cd0	MINOR: resolvers: renames type dns_resolvers to resolvers. It also renames 'dns_resolvers' head list to sec_resolvers to avoid conflicts with local variables 'resolvers'.	2021-02-13 09:43:17 +01:00
Emeric Brun	85914e9d9b	MINOR: resolvers: renames some resolvers internal types and removes dns prefix Some types are specific to resolver code and are renamed using the 'resolv' prefix instead of 'dns'. -struct dns_query_item { +struct resolv_query_item { -struct dns_answer_item { +struct resolv_answer_item { -struct dns_response_packet { +struct resolv_response {	2021-02-13 09:43:17 +01:00
Emeric Brun	50c870e4de	BUG/MINOR: dns: add missing sent counter and parent id to dns counters. Resolv callbacks are also updated to rely on counters and not on nameservers. "show stat domain dns" will now show the parent id (i.e. resolvers section name).	2021-02-13 09:43:17 +01:00
Emeric Brun	147b3f05b5	CLEANUP: channel: fix comment in ci_putblk. The comment is outdated and refers to old code. Should be backported as far as branch 1.5.	2021-02-13 09:43:17 +01:00
Emeric Brun	e14b98c08e	MINOR: ring: adds new ring_init function. Adds the new ring_init function to initialize a pre-allocated ring struct using the given memory area.	2021-02-13 09:43:17 +01:00
David Carlier	1eb595b8b4	MINOR: tcp: add support for defer-accept on FreeBSD. FreeBSD has a kernel feature (accf) and a sockopt flag similar to the Linux's TCP_DEFER_ACCEPT to filter incoming data upon ACK. The main difference is the filter needs to be placed when the socket actually listens.	2021-02-13 09:05:02 +01:00
Willy Tarreau	4b10302fd8	MINOR: cfgparse: implement a simple if/elif/else/endif macro block handler Very often, especially since reg-tests, it would be desirable to be able to conditionally comment out a config block, such as removing an SSL binding when SSL is disabled, or enabling HTX only for certain versions, etc. This patch introduces a very simple nested block management which takes ".if", ".elif", ".else" and ".endif" directives to take or ignore a block. For now the conditions are limited to empty string or "0" for false versus a non-nul integer for true, which already suffices to test environment variables. Still, it needs to be a bit more advanced with defines, versions etc. A set of ".notice", ".warning" and ".alert" statements are provided to emit messages, often in order to provide advice about how to fix certain conditions.	2021-02-12 18:54:19 +01:00
Willy Tarreau	49962b58d0	MINOR: peers/cli: do not dump the peers dictionaries by default on "show peers" The "show peers" output has become huge due to the dictionaries making it less readable. Now this feature has reached a certain level of maturity which doesn't warrant to dump it all the time, given that it was essentially needed by developers. Let's make it optional, and disabled by default, only when "show peers dict" is requested. The default output reminds about the command. The output has been divided by 5 : $ socat - /tmp/sock1 <<< "show peers dict" | wc -l 125 $ socat - /tmp/sock1 <<< "show peers" | wc -l 26 It could be useful to backport this to recent stable versions.	2021-02-12 17:00:52 +01:00
Christopher Faulet	469676423e	CLEANUP: server: Remove useless "filepath" variable in apply_server_state() This variable is now only used to point on the local server-state file. When the server-state is global, it is unused. So, we now use "localfilepath" instead. Thus, the "filepath" variable can safely be removed.	2021-02-12 16:42:00 +01:00
Christopher Faulet	8952ea636b	BUG/MINOR: server: Don't call fopen() with server-state filepath set to NULL When a local server-state file is loaded, if its name is too long, the error is not properly handled, resulting in a call to fopen() with the "filepath" variable set to NULL. To fix the bug, when this error occurs, we jump to the next proxy, via a "continue" statement. And we take care to set the "filepath" variable after the error handling to be sure. This patch should fix the issue #1111. It must be backported as far as 1.6.	2021-02-12 16:42:00 +01:00
Christopher Faulet	b1d19eab1c	CLEANUP: tcpcheck: Remove a useless test on port variable When a connect rule is evaluated, a test is performed on the "port" variable while it is set to 0 on the line just above. Just remove this useless test to make cppcheck happy. This patch fixes the issue #1113.	2021-02-12 16:42:00 +01:00
Yves Lafon	b4d3708cb7	MINOR: http: add baseq sample fetch Symmetrical to path/pathq, baseq returns the concatenation of the Host header and the path including the query string.	2021-02-12 16:38:50 +01:00
Willy Tarreau	7c0b4d861e	MEDIUM: cfgparse: allow a proxy to designate the defaults section to use Now it becomes possible to specify "from foo" on a frontend/listen/backend or even on a "defaults" line, to mention that defaults section "foo" needs to be used to preset the proxy's settings. When not set, the last section remains used. In case the designated name is found at multiple places, it is rejected and an error indicates two occurrences of the same name. Similarly, if the section name is found, its name must only use valid characters. This allows multiple named defaults section to continue to coexist without the risk that they will cause trouble by accident. When it comes to "defaults" relying on another defaults, what happens is just that a new defaults section is created from the designated one. This will make it possible for example to reuse some settings such as log-format like below: defaults tcp-clear log stdout local0 info log-format "%ci:%cp/%b/%si:%sp %ST %ts %U/%B %{+Q}r" defaults tcp-ssl log stdout local0 info log-format "%ci:%cp/%b/%si:%sp %ST %ts %U/%B %{+Q}r ssl=%sslv" defaults http-clear from tcp-clear mode http defaults http-ssl from tcp-ssl mode http frontend fe1 from http-clear bind :8001 frontend fe2 from http-ssl bind :8002 A small corner case remains in the error detection, if a second defaults section appears with the same name after the point where it was used, and nobody references it, the duplicate will not be detected. This could be addressed by performing the syntactic checks in check_config_validity(), and by postponing the freeing of the defaults, after tagging a defaults section as explicitly looked up by another section. This doesn't seem that important at the moment though.	2021-02-12 16:23:46 +01:00
Willy Tarreau	e90904d5a9	MEDIUM: proxy: store the default proxies in a tree by name Now default proxies are stored into a dedicated tree, sorted by name. Only unnamed entries are not kept upon new section creation. The very first call to cfg_parse_listen() will automatically allocate a dummy defaults section which corresponds to the previous static one, since the code requires to have one at a few places. The first immediately visible benefit is that it allows to reuse alloc_new_proxy() to allocate a defaults section instead of doing it by hand. And the secret goal is to allow to keep multiple named defaults section in memory to reuse them from various proxies.	2021-02-12 16:23:46 +01:00
Willy Tarreau	0a0f6a7e4f	MINOR: proxy: support storing defaults sections into their own tree Now we'll have a tree of named defaults sections. The regular insertion and lookup functions take care of the capability in order to select the appropriate tree. A new function proxy_destroy_defaults() removes a proxy from this tree and frees it entirely.	2021-02-12 16:23:46 +01:00
Willy Tarreau	c02ab03142	MINOR: proxy: also store the name for a defaults section There's an optional name, but till now it was not even saved into the structure, let's keep it.	2021-02-12 16:23:46 +01:00
Willy Tarreau	ab3410c65d	MINOR: cfgparse: use a pointer to the current default proxy In order to make the default proxy configurable, we'll need to have a pointer to it which might differ from &defproxy. cfg_parse_listen() now gets curr_defproxy for this.	2021-02-12 16:23:46 +01:00
Willy Tarreau	5d095c2fac	MINOR: cfgparse: check PR_CAP_DEF instead of comparing pointer against defproxy We want to get rid of this defproxy, let's now simply check the proxy's capabilities instead of comparing its pointer to the known default one.	2021-02-12 16:23:46 +01:00
Willy Tarreau	80dc6fea59	MINOR: proxy: add a new capability PR_CAP_DEF In order to more easily distinguish a default proxy from a standard one, let's introduce a new capability PR_CAP_DEF.	2021-02-12 16:23:46 +01:00
Willy Tarreau	7d0c143185	MINOR: cfgparse: move defproxy to cfgparse-listen as a static We don't want to expose this one anymore as we'll soon keep multiple default proxies. Let's move it inside the parser which is the only place which still uses it, and initialize it on the fly once needed instead of doing it at boot time.	2021-02-12 16:23:46 +01:00
Willy Tarreau	bb8669ae28	BUG/MINOR: server: parse_server() must take a const for the defproxy The default proxy was passed as a variable, which in addition to being a PITA to deal with in the config parser, doesn't feel safe to use when it ought to be const. This will only affect new code so no backport is needed.	2021-02-12 16:23:46 +01:00
Willy Tarreau	54fa7e332a	BUG/MINOR: tcpcheck: proxy_parse_check() must take a const for the defproxy The default proxy was passed as a variable, which in addition to being a PITA to deal with in the config parser, doesn't feel safe to use when it ought to be const. This will only affect new code so no backport is needed.	2021-02-12 16:23:46 +01:00
Willy Tarreau	220fd70694	BUG/MINOR: extcheck: proxy_parse_extcheck() must take a const for the defproxy The default proxy was passed as a variable, which in addition to being a PITA to deal with in the config parser, doesn't feel safe to use when it ought to be const. This will only affect new code so no backport is needed.	2021-02-12 16:23:46 +01:00
Willy Tarreau	818ec78af8	MINOR: proxy: always properly reset the just freed default instance pointers In proxy_free_defaults(); none of the free() calls was followed by a pointer reset. Not only it's hard to figure if one of them is duplicated, but this code started to call other functions which might or might not rely on such just freed pointers. Let's reset them as they should be to make sure there will never be any case of use-after-free. The 3 functions called there were inspected and are all unaffected by this so this remains safe to do right now.	2021-02-12 16:23:46 +01:00
Willy Tarreau	a3320a0509	MINOR: proxy: move the defproxy freeing code to proxy.c This used to be open-coded in cfgparse-listen.c when facing a "defaults" keyword. Let's move this into proxy_free_defaults(). This code is ugly and doesn't even reset the just freed pointers. Let's not change this yet. This code should probably be merged with a generic proxy deinit function called from deinit(). However there's a catch on uri_auth which cannot be freed because it might be used by one or several proxies. We definitely need refcounts there!	2021-02-12 16:23:46 +01:00
Willy Tarreau	3b06eaec86	MEDIUM: proxy: only take defaults when a default proxy is passed. The proxy initialization code relies on three phases, allocation, pre-initialization, and assignments from defaults. This last part is entirely taken from the defaults proxy when arguments are set. This significantly complicates the initialization code as it requires a default proxy to always exist. This patch instead first applies the original default settings on a proxy, and then uses those from a default proxy only if one such is used. This will allow to initialize a proxy out of any default proxy while still using valid defaults. A careful inspection of the function showed that only 4 fields used to be set regardless of the default proxy, and those were moved to init_new_proxy() where they ought to have been in the first place.	2021-02-12 16:23:46 +01:00
Willy Tarreau	7683893c70	REORG: proxy: centralize the proxy allocation code into alloc_new_proxy() This new function takes over the old open-coding that used to be done for too long in cfg_parse_listen() and it now does everything at once in a proxy-centric function. The function does all the job of allocating the structure, initializing it, presetting its defaults from the default proxy and checking for errors. The code was almost unchanged except for defproxy being passed as a pointer, and the error message being passed using memprintf(). This change will be needed to ease reuse of multiple default proxies, or to create dynamic backends in a distant future.	2021-02-12 16:23:46 +01:00
Willy Tarreau	144289b459	REORG: move init_default_instance() to proxy.c and pass it the defproxy pointer init_default_instance() was still left in cfgparse.c which is not the best place to pre-initialize a proxy. Let's place it in proxy.c just after init_new_proxy(), take this opportunity for renaming it to proxy_preset_defaults() and taking out init_new_proxy() from it, and let's pass it the pointer to the default proxy to be initialized instead of implicitly assuming defproxy. We'll soon be able to exploit this. Only two call places had to be updated.	2021-02-12 16:23:46 +01:00
Willy Tarreau	09f2e77eb1	BUG/MINOR: tcpcheck: the source list must be a const in dup_tcpcheck_var() This is just an API bug but it's annoying when trying to tidy the code. The source list passed in argument must be a const and not a variable, as it's typically the list head from a default proxy and must obviously not be modified by the function. No backport is needed as it only impacts new code.	2021-02-12 16:23:46 +01:00
Willy Tarreau	016255a483	BUG/MINOR: http-htx: defpx must be a const in proxy_dup_default_conf_errors() This is just an API bug but it's annoying when trying to tidy the code. The default proxy passed in argument must be a const and not a variable. No backport is needed as it only impacts new code.	2021-02-12 16:23:46 +01:00
Willy Tarreau	b2ec994523	BUG/MINOR: cfgparse: do not mention "addr:port" as supported on proxy lines The very old error message indicating that a proxy name is mandatory still had a reference to the optional addr:port argument while this one is explicitly rejected a few lines later since at least 1.9. This is harmless but confusing. This can be backported to 2.0.	2021-02-12 16:23:45 +01:00
Willy Tarreau	5bbc676608	BUG/MINOR: stats: revert the change on ST_CONVDONE In 2.1, commit `ee4f5f83d` ("MINOR: stats: get rid of the ST_CONVDONE flag") introduced a subtle bug. By testing curproxy against defproxy in check_config_validity(), it tried to eliminate the need for a flag to indicate that stats authentication rules were already compiled, but by doing so it left the issue open for the case where a new defaults section appears after the two proxies sharing the first one: defaults mode http stats auth foo:bar listen l1 bind :8080 listen l2 bind :8181 defaults # just to break above This config results in: [ALERT] 042/113725 (3121) : proxy 'f2': stats 'auth'/'realm' and 'http-request' can't be used at the same time. [ALERT] 042/113725 (3121) : Fatal errors found in configuration. Removing the last defaults remains OK. It turns out that the cleanups that followed that patch render it useless, so the best fix is to revert the change (with the up-to-date flags instead). The flag was marked as belonging to the config. It's not exact but it's the closest to the reality, as it's not there to configure the behavior but to mention that the config parser did its job. This could be backported as far as 2.1, but in practice it looks like nobody ever hit it.	2021-02-12 16:23:45 +01:00
Willy Tarreau	937c3ead34	BUG/MEDIUM: config: don't pick unset values from last defaults section Since 1.3.14 with commit `1fa3126ec` ("[MEDIUM] introduce separation between contimeout, and tarpit + queue"), check_config_validity() looks at the last defaults section to update all proxies' queue and tarpit timeouts if they were not set! This was apparently an attempt to properly set them on the fallback values, except that the fallback values were taken from the default proxy before looking at the current proxy itself. The worst part of it is that it might have randomly worked by accident for some configurations when there was a single defaults section, but has certainly caused too short queue expirations once another defaults section was added later in the file with these explicitly defined. Let's remove the defproxy part and keep only the curproxy ones. This could be backported everywhere, the bug has been there for 13 years.	2021-02-12 16:23:45 +01:00
Christopher Faulet	f5ea269723	CLEANUP: deinit: release global and per-proxy server-state variables on deinit The global server-state base directory and file name are now released on deinit, as well as per-proxy server-state file name.	2021-02-12 16:04:52 +01:00
Christopher Faulet	583b6de68a	BUG/MINOR: server: Fix server-state-file-name directive Since the beginning, this directive is documented to accept an optional file name. But it should also be possible to use it without any argument to use the backend name as file name. However, when no argument is provided, an error is reported during the configuration parsing requesting an argument, a file name or "use-backend-name". And this last special argument is not documented. So, to respect the documentation and to avoid configuration breakages, all modes are now supported. If this directive is called with no argument or with "use-backend-name", the backend name is used as file name for the server-state file. Otherwise, the provided string is used. In addition, we take care to release any previously allocated file name in case this directive is defined multiple times in the same backend. And an error is reported if more than one argument is defined. Finally, the documentation is updated accordingly. Sections supporting this directive are also mentioned. This patch should be backported as far as 1.6.	2021-02-12 16:04:52 +01:00
William Dauchy	ddc7ce9645	MINOR: server: enhance error precision when applying server state server health checks and agent parameters are written the same way as others to be able to enhance code reuse: basically we make use of parsing and assignment at the same place. It makes it difficult for error handling to know whether srv object was modified partially or not. The problem was already present with SRV resolution though. I was a bit puzzled about the approach to take to be honest, and I did not want to go into a full refactor, so I assumed it was ok to simply notify whether the line was failed or partially applied. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-12 16:04:52 +01:00
William Dauchy	d1a7b85a40	MEDIUM: server: support {check,agent}_addr, agent_port in server state logical followup from cli commands addition, so that the state server file stays compatible with the changes made at runtime; use previously added helper to load server attributes. also alloc a specific chunk to avoid mixing with other called functions using it Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-12 16:04:52 +01:00
William Dauchy	63e6cba12a	MEDIUM: server: add server-states version 2 Even if it is possibly too much work for the current usage, it makes sure we don't break states file from v2.3 to v2.4; indeed, since v2.3, we introduced two new fields, so we put them aside to guarantee we can easily reload from a version 1. The diff seems huge but there is no specific change apart from: - introduce v2 where it is needed (parsing, update) - move away from switch/case in update to be able to reuse code - move srv lock to the whole function to make it easier this patch confirm how painful it is to maintain this functionality. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-12 16:04:52 +01:00
William Dauchy	7cabc06da6	MEDIUM: cli: add agent-port command this patch allows setting the agent port at runtime. In order to align with both `addr` and `check-addr` commands, also add the possibility to optionally set the port on the `agent-addr` command. This led to a small refactor in order to use the same function for both `agent-addr` and `agent-port` commands. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-12 16:04:52 +01:00
William Dauchy	b456e1f389	MEDIUM: cli: add check-addr command this patch allows setting the server health check address at runtime. In order to align with the `addr` command, also allow setting the port optionally. This led to a small refactor in order to use the same function for both `check-addr` and `check-port` commands. for `check-port`, we however don't permit the change anymore if checks are not enabled on the server. This command becomes more and more useful for people having a consul like architecture: - the backend server is located on a container with its own IP - the health checks are done by the consul instance located on the host with the host IP Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-12 16:04:52 +01:00
Amaury Denoyelle	edadf192fe	BUG/MINOR: backend: fix compilation without ssl sni_smp/sni_hash are reported as unused on compilation without USE_OPENSSL and may cause compilation failure. This does not need to be backported.	2021-02-12 13:49:42 +01:00
Amaury Denoyelle	1921d20fff	MINOR: connection: use proxy protocol as parameter for srv conn hash Use the proxy protocol frame if proxy protocol is activated on the server line. These connections are no longer added to the private list. If some requests are made with the same proxy fields, they can reuse the idle connection. The reg-tests proxy_protocol_send_unique_id must be adapted as it relied on the side effect behavior that every request from the same connection reused a private server connection. Now, a new connection is created as expected if the proxy protocol fields differ.	2021-02-12 12:54:04 +01:00
Amaury Denoyelle	d10a200f62	MINOR: connection: use src addr as parameter for srv conn hash The source address is used as an input to the server connection hash. The address and port are used as separate hash inputs. These connections are no longer added to the private list. This parameter is set only if used in the transparent-proxy mode.	2021-02-12 12:54:04 +01:00
Amaury Denoyelle	f7bdf00071	MINOR: backend: rewrite alloc of connection src address This commit is similar to "MINOR: backend: rewrite alloc of stream target address" but with source address.	2021-02-12 12:54:04 +01:00
Amaury Denoyelle	01a287f1e5	MINOR: connection: use dst addr as parameter for srv conn hash The destination address is used as an input to the server connection hash. The address and port are used as separate hash inputs. Note that they are not used when statically specified on the server line. This is only useful for a dynamic destination address, typically when the server address is dynamically set via the set-dst action. Most notably, it should fix the set-dst use case (cf github issue #947).	2021-02-12 12:53:56 +01:00
Amaury Denoyelle	68cf3959b3	MINOR: backend: rewrite alloc of stream target address Change the API of the function used to allocate the stream target address. This is done in order to be able to allocate the destination address and use it to reuse a connection sharing with the same address. In particular, the flag stream SF_ADDR_SET is now set outside of the function.	2021-02-12 12:53:56 +01:00
Amaury Denoyelle	9b626e3c19	MINOR: connection: use sni as parameter for srv conn hash The sni parameter is an input to the server connection hash. Do not add anymore connections with dynamic sni in the private list. Thus, it is now possible to reuse a server connection if they use the same sni.	2021-02-12 12:48:11 +01:00
Amaury Denoyelle	293dcc400e	MINOR: backend: compare conn hash for session conn reuse Compare the connection hash when reusing a connection from the session. This ensures that a private connection is reused only if it shares the same set of parameters.	2021-02-12 12:33:05 +01:00
Amaury Denoyelle	1a58aca84e	MINOR: connection: use the srv pointer for the srv conn hash The pointer of the target server is used as a first parameter for the server connection hash calculation. This prevents the hash from being null when no specific parameters are present, and can serve as a simple defense against an attacker trying to reuse a non-conforming connection.	2021-02-12 12:33:05 +01:00
Amaury Denoyelle	81c6f76d3e	MINOR: connection: prepare hash calcul for server conns This is preliminary work for the calculation of the backend connection hash. A structure conn_hash_params is the input for the operation, containing the various specific parameters of a connection. The high bits of the hash will reflect the parameters present as input. A set of macros is written to manipulate the connection hash and extract the parameters/payload.	2021-02-12 12:33:05 +01:00
Amaury Denoyelle	aa890aef3d	MINOR: backend: search conn in idle tree after safe on always reuse With http-reuse always, if no matching safe connection is found, check in idle tree for a matching one. This is needed because now idle connections can be differentiated from each other. If only the safe tree was checked because not empty, but did not contain a matching connection, we could miss matching entry in idle tree.	2021-02-12 12:33:05 +01:00
Amaury Denoyelle	1399d695c0	MINOR: backend: search conn in idle/safe trees after available If no matching connection is found on available, check on idle/safe trees for a matching one. This is needed because now idle connections can be differentiated from each other. If only the available list was checked because not empty, but did not contain a matching connection, we could miss matching entries in idle or safe trees.	2021-02-12 12:33:05 +01:00
Amaury Denoyelle	f232cb3e9b	MEDIUM: connection: replace idle conn lists by eb trees The server idle/safe/available connection lists are replaced with ebmb-trees. This is used to store backend connections, with the new field connection hash as the key. The hash is an 8-byte field, used to reflect specific connection parameters. This is preliminary work to be able to reuse connections with SNI, explicit src/dst address or PROXY protocol.	2021-02-12 12:33:05 +01:00
Amaury Denoyelle	5c7086f6b0	MEDIUM: connection: protect idle conn lists with locks This is preparation work for connection reuse with sni/proxy protocol/specific src-dst addresses. Protect every access to idle conn lists with a lock. This is currently not strictly needed because accesses to the lists are made with atomic operations. However, to be able to reuse connections with specific parameters, the list storage will be converted to eb-trees. As this structure does not have atomic operations, it is mandatory to protect it with a lock. For this, the takeover lock is reused. Its role was to protect during connection takeover. As it is now extended to general idle conns usage, it is renamed to idle_conns_lock. A new lock section is also instantiated named IDLE_CONNS_LOCK to isolate its impact on performance.	2021-02-12 12:33:04 +01:00
Amaury Denoyelle	a3bf62ec54	BUG/MINOR: backend: hold correctly lock when killing idle conn The wrong lock seems to be held when trying to remove another thread connection if max fd limit has been reached (locking the current thread instead of the target thread lock). This could be backported up to 2.0.	2021-02-12 12:32:31 +01:00
Christopher Faulet	cd7126b396	CLEANUP: queue: Remove useless tests on p or pp in pendconn_process_next_strm() This patch removes unnecessary tests on p or pp pointers in pendconn_process_next_strm() function. This should make cppcheck happy and avoid false reports of null pointer dereference. This patch should fix the issue #1036.	2021-02-11 11:48:36 +01:00
Ilya Shipitsin	a1e0f387c7	CLEANUP: remove unused variable assigned found by Coverity this is pure cleanup, no need to backport 2116 if ((end - 1) == (payload + strlen(PAYLOAD_PATTERN))) { 2117 /* if the payload pattern is at the end */ 2118 s->pcli_flags |= PCLI_F_PAYLOAD; CID 1399833 (#1 of 1): Unused value (UNUSED_VALUE) assigned_value: Assigning value from reql to ret here, but that stored value is overwritten before it can be used. 2119 ret = reql; 2120 } This patch fixes the issue #1048.	2021-02-11 11:48:36 +01:00
Christopher Faulet	4b524124db	BUG/MINOR: tools: Fix a memory leak on error path in parse_dotted_uints() When an invalid character is found during parsing in parse_dotted_uints() function, the allocated array of uint must be released. This patch fixes a memory leak on error path during the configuration parsing. This patch should fix the issue #1106. It should be backported as far as 2.0. Note that, for 2.1 and 2.0, the function is in src/standard.c	2021-02-11 11:48:36 +01:00
Christopher Faulet	0aeaa290da	CLEANUP: muxes: Remove useless calls to b_realign_if_empty() In H1, H2 and FCGI muxes, b_realign_if_empty() is called to reset the head of an empty buffer before setting it to a specific value to permit the zero-copy. Thus, we can remove the calls to b_realign_if_empty().	2021-02-11 11:48:36 +01:00
Christopher Faulet	368936703a	MINOR: mux-h1: Be sure EOM flag is set when processing end of outgoing message When a message is sent, an extra check is performed when the parser is switched to MSG_DONE state to be sure the EOM flag is really set. This flag is quite new and replaces the EOM block. Thus, this test is a safeguard waiting for a proper refactoring of the outgoing side.	2021-02-10 16:25:42 +01:00
Christopher Faulet	337243235f	BUG/MEDIUM: mux-h2: Add EOT block when EOM flag is set on an empty HTX message In the H2 mux, when an empty DATA frame is used to finish a message, just to set the ES flag, we now only set the EOM flag on the HTX message. However, if the HTX message is empty, this event will not be properly handled on the other side because there is no effective data to handle. Thus, it is interpreted as an abort by the H1 mux. It is in part caused by the current H1 mux design but also because there is no way to emit an empty HTX block (NOOP HTX block) or to wake up a mux for send when there is no data to finish some internal processing. Thus, for now, to work around this limitation, an EOT HTX block is added by the H2 mux if an EOM flag is added on an empty HTX message. This case is only possible when an empty DATA frame with the ES flag is received. This fix is specific for 2.4. No backport needed.	2021-02-10 16:25:42 +01:00
Christopher Faulet	0a916d2aca	BUG/MINOR: mux-h1: Don't blindly skip EOT block for non-chunked messages In HTTP/2, we may have trailers for messages with a Content-length header. Thus, when the H2 mux receives a HEADERS frame at the end of a message, it always emits TLR and EOT HTX blocks. On the H1 mux, if this happens, these blocks are just skipped because we cannot emit trailers for a non-chunked message. But the EOT HTX block must not be blindly ignored. Indeed, there is no longer an EOM HTX block to mark the end of the message. Thus the EOT block, when found, is the end of the message. So we must handle it to switch to the MSG_DONE state. This fix is specific for 2.4. No backport needed.	2021-02-10 16:25:42 +01:00
Christopher Faulet	0d7e634631	BUG/MINOR: mux-h1: Fix data skipping for bodyless responses When payload is received for a bodyless response, for instance a response to a HEAD request, it is silently skipped. Unfortunately, when this happens, the end of the message is not properly handled. The response remains in the MSG_DATA state (or MSG_TRAILERS if the message is chunked). In addition, when a zero-copy is possible, the data are not removed from the channel buffer and the H1 connection is killed because an error is then triggered. To fix the bug, the zero-copy is disabled for bodyless responses. It is not a problem because there is no copy at all. And the last block (DATA or EOT) is now properly handled. This bug was introduced by the commit `e5596bf53` ("MEDIUM: mux-h1: Don't emit any payload for bodyless responses"). This fix is specific for 2.4. No backport needed.	2021-02-10 16:25:42 +01:00
Christopher Faulet	a22782b597	BUG/MEDIUM: mux-h1: Always set CS_FL_EOI for response in MSG_DONE state During the message parsing, if in MSG_DONE state, the CS_FL_EOI flag must always be set on the conn-stream if following conditions are met : * It is a response or * It is a request but not a protocol upgrade nor a CONNECT. For now, there is no test on the message type (request or response). Thus the CS_FL_EOI flag is not set for a response with a "Connection: upgrade" header but not a 101 response. This bug was introduced by the commit `3e1748bbf` ("BUG/MINOR: mux-h1: Don't set CS_FL_EOI too early for protocol upgrade requests"). It was backported as far as 2.0. Thus, this patch must also be backported as far as 2.0.	2021-02-10 16:25:42 +01:00
Christopher Faulet	bf7175f9b6	BUG/MINOR: http-ana: Don't increment HTTP error counter on internal errors If internal error is reported by the mux during HTTP request parsing, the HTTP error counter should not be incremented. It should only be incremented on parsing error to reflect errors caused by clients. This patch must be backported as far as 2.0. During the backport, the same must be performed for 408-request-time-out errors.	2021-02-10 16:22:32 +01:00
Christopher Faulet	f4b7074784	BUG/MINOR: mux-h1: Don't increment HTTP error counter for 408/500/501 errors The HTTP error counter reflects the number of errors caused by clients. Thus, in the H1 mux, it should only be incremented on parsing errors. This fix is specific for 2.4. No backport needed.	2021-02-10 16:22:32 +01:00
Willy Tarreau	826f3ab5e6	MINOR: stick-tables/counters: add http_fail_cnt and http_fail_rate data types Historically we've been counting lots of client-triggered events in stick tables to help detect misbehaving ones, but we've been missing the same on the server side, and there's been repeated requests for being able to count the server errors per URL in order to precisely monitor the quality of service or even to avoid routing requests to certain dead services, which is also called "circuit breaking" nowadays. This commit introduces http_fail_cnt and http_fail_rate, which work like http_err_cnt and http_err_rate in that they respectively count events and their frequency, but they only consider server-side issues such as network errors, unparsable and truncated responses, and 5xx status codes other than 501 and 505 (since these ones are usually triggered by the client). Note that retryable errors are purposely not accounted for, so that only what the client really sees is considered. With this it becomes very simple to put some protective measures in place to perform a redirect or return an excuse page when the error rate goes beyond a certain threshold for a given URL, and give more chances to the server to recover from this condition. Typically it could look like this to bypass a URL causing more than 10 requests per second: stick-table type string len 80 size 4k expire 1m store http_fail_rate(1m) http-request track-sc0 base # track host+path, ignore query string http-request return status 503 content-type text/html \ lf-file excuse.html if { sc0_http_fail_rate gt 10 } A more advanced mechanism using gpt0 could even implement high/low rates to disable/enable the service. Reg-test converteers_ref_cnt_never_dec.vtc was updated to test it.	2021-02-10 12:27:01 +01:00
Willy Tarreau	e4d247e217	BUG/MINOR: freq_ctr: fix a wrong delay calculation in next_event_delay() The sleep time calculation in next_event_delay() was wrong because it was dividing 999 by the number of pending events, and was directly responsible for an observation made a long time ago that listeners would eat all the CPU when hammered while globally rate-limited, because the more events were queued, the less it would wait, and it would ignore the configured frequency to compute the delay. This was addressed in various ways in listeners through the switch to the FULL state and the wakeup of manage_global_listener_queue() that avoids this fast loop, but the calculation made there remained wrong nevertheless. It's even visible with this patch that the accept frequency is much more accurate at low values now; for example, configuring a maxconnrate of 10 would give between 8.99 and 11.0 cps before this patch and between 9.99 and 10.0 with it. Better fix it now in case it's reused anywhere else and causes confusion again. It may be backported but is probably not worth it.	2021-02-09 17:52:50 +01:00
William Lallemand	3ce6eedb37	MEDIUM: ssl: add a rwlock for SSL server session cache When adding the server side support for certificate update over the CLI we encountered a design problem with the SSL session cache which was not locked. Indeed, once a certificate is updated we need to flush the cache, but we also need to ensure that the cache is not used during the update. To prevent the use of the cache during an update, this patch introduces a rwlock for the SSL server session cache. In the SSL session part this patch only takes the lock for reading, even if it writes. The reason behind this is that in the session part, there is one cache storage per thread so it is not a problem to write in the cache from several threads. The problem is only when trying to write in the cache from the CLI (which could be on any thread) when a session is trying to access the cache. So there is a write lock in the CLI part to prevent simultaneous access by a session and the CLI. This patch also removes the thread_isolate attempt which is eating too much CPU time and was not protecting from the use of a freed ptr in the session.	2021-02-09 09:43:44 +01:00
Ilya Shipitsin	7ff7747a17	BUILD: ssl: guard SSL_CTX_set_msg_callback with SSL_CTRL_SET_MSG_CALLBACK macro both SSL_CTX_set_msg_callback and SSL_CTRL_SET_MSG_CALLBACK defined since ea262260469e49149cb10b25a87dfd6ad3fbb4ba, we can safely switch to that guard instead of OpenSSL version	2021-02-08 13:49:41 +01:00
William Dauchy	060ffc82d6	CLEANUP: tools: typo in `strl2irc` mention `str2irc` does not exist Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-08 10:49:08 +01:00
William Dauchy	f4300902b9	CLEANUP: check: fix some typo in comments a few obvious english typo in comments, some of which introduced by myself quite recently Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-08 10:49:08 +01:00
Ilya Shipitsin	acf84595a7	CLEANUP: assorted typo fixes in the code and comments This is 17th iteration of typo fixes	2021-02-08 10:49:08 +01:00
Christopher Faulet	3d6e0e3e04	BUG/MINOR: mux-h1: Don't emit extra CRLF for empty chunked messages Because of a buggy test when processing the EOH HTX block, an extra CRLF is added for empty chunked messages. This bug was introduced by the commit `d1ac2b90c` ("MAJOR: htx: Remove the EOM block type and use HTX_FL_EOM instead"). This fix is specific for 2.4. No backport needed.	2021-02-08 09:43:36 +01:00
Ilya Shipitsin	f00cdb1856	BUILD: ssl: guard SSL_CTX_add_server_custom_ext with special macro special guard macros HAVE_SSL_CTX_ADD_SERVER_CUSTOM_EXT was defined earlier exactly for guarding SSL_CTX_add_server_custom_ext, let us use it wherever appropriate	2021-02-08 00:11:43 +01:00
Ilya Shipitsin	7bbf5866e0	BUILD: ssl: fix typo in HAVE_SSL_CTX_ADD_SERVER_CUSTOM_EXT macro HAVE_SSL_CTX_ADD_SERVER_CUSTOM_EXT was introduced in `ec60909871` however it was defined as HAVE_SL_CTX_ADD_SERVER_CUSTOM_EXT (missing "S") let us fix typo	2021-02-08 00:11:41 +01:00
Willy Tarreau	133aaa9f11	BUG/MEDIUM: mux-h2: do not quit the demux loop before setting END_REACHED The demux loop could quit on missing data but the H2_CF_END_REACHED flag would not be set in this case. This fixes a remaining situation where previous commit `f09612289` ("BUG/MEDIUM: mux-h2: handle remaining read0 cases") could not be sufficient and still leave CLOSE_WAIT. It's harder to reproduce but was still observed in prod. Now we quit via the end of the loop which already takes care of shutr. This should be backported along with the patch above as far as 2.0.	2021-02-05 12:22:54 +01:00
Remi Tricot-Le Breton	25dd0ad123	BUG/MINOR: sock: Unclosed fd in case of connection allocation failure If allocating a connection object failed right after a successful accept on a listener, the new file descriptor was not properly closed. This fixes GitHub issue #905. It can be backported to 2.3.	2021-02-05 12:14:51 +01:00
Christopher Faulet	1cdc028687	CLEANUP: http-htx: Set buffer area to NULL instead of malloc(0) During error files conversion to HTX message, in http_str_to_htx(), if a file is empty, the corresponding buffer's area is initialized with a malloc(0) and its size is set to 0. There is no problem here. The behaviour is totally defined. But it is not really intuitive. Instead, we can simply set the area to NULL. This patch should fix the issue #1022.	2021-02-05 11:51:44 +01:00
Willy Tarreau	f09612289f	BUG/MEDIUM: mux-h2: handle remaining read0 cases Commit `3d4631fec` ("BUG/MEDIUM: mux-h2: fix read0 handling on partial frames") tried to address an issue introduced in commit `aade4edc1` where read0 wasn't properly handled in the middle of a frame. But the fix was incomplete for two reasons: - first, it would set H2_CF_RCVD_SHUT in h2_recv() after detecting a read0 but the condition was guarded by h2_recv_allowed() which explicitly excludes read0 ; - second, h2_process would only call h2_process_demux() when there were still data in the buffer, but closing after a short pause to leave a buffer empty wouldn't be caught in this case. This patch fixes this by properly taking care of the received shutdown and by also waking up h2_process_demux() on an empty buffer if the demux is not blocked. Given the patches above were tagged for backporting to 2.0, this one should be as well.	2021-02-05 11:48:38 +01:00
Willy Tarreau	ed9892018c	MINOR: cli/show_fd: report local and report ports when known FD dumps are not always easy to match against netstat dumps, and often require an lsof as a third dump. Let's emit the socket family, and the local and remote ports when the FD is an IPv4/IPv6 socket, this will significantly ease the matching.	2021-02-05 10:58:03 +01:00
Willy Tarreau	a84986ae4f	BUG/MINOR: ssl: do not try to use early data if not configured The CO_FL_EARLY_SSL_HS flag was unconditionally set on the connection, resulting in SSL_read_early_data() always being used first in handshake calculations. While this seems to work well (probably that there are fallback paths inside openssl), it's particularly confusing and makes the debugging quite complicated. It possibly is not optimal by the way. This flag ought to be set only when early_data is configured on the bind line. Apparently there used to be a good reason for doing it this way in 1.8 times, but it really does not make sense anymore. It may be OK to backport this to 2.3 if this helps with troubleshooting, but better not go too far as it's unlikely to fix any real issue while it could introduce some in old versions.	2021-02-05 08:04:02 +01:00
Christopher Faulet	a8979a9b59	DOC: server: Add missing params in comment of the server state line parsing srv_use_ssl and srv_check_port parameters were not mentioned in the comment of the function parsing a server state line.	2021-02-04 14:00:43 +01:00
William Dauchy	4858fb2e18	MEDIUM: check: align agentaddr and agentport behaviour in the same manner of agentaddr, we now: - permit to set agentport through `port` keyword, like it is the case for agentaddr through `addr` - set the priority on `agent-port` keyword when used - add a flag to be able to test when the value is set like for agentaddr it makes the behaviour between `addr` and `port` more consistent. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-04 14:00:38 +01:00
William Dauchy	1c921cd748	BUG/MINOR: check: consitent way to set agentaddr small consistency problem with `addr` and `agent-addr` options: for both options, the last one parsed is always used to set the agent-check addr. Thus these two lines don't have the same behavior: server ... addr <addr1> agent-addr <addr2> server ... agent-addr <addr2> addr <addr1> After this patch `agent-addr` will always be the priority option over `addr`. It means we test the flag before setting agentaddr. We also fix all the places where we did not set the flag to be coherent everywhere. I was not really able to determine where this issue is coming from. So it is probable we may backport it to all stable versions where the agent is supported. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-04 13:55:04 +01:00
William Dauchy	fe03e7d045	MEDIUM: server: adding support for check_port in server state We can currently change the check-port using the cli command `set server check-port` but there is a consistency issue when using server state. This patch aims to fix this problem but will be also a good preparation work to get rid of checkport flag, so we are able to know when checkport was set by config. I am fully aware this is not making github #953 moving forward, I however think this might be acceptable while waiting for a proper solution and resolve consistency problem faced with port settings. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-04 10:46:52 +01:00
William Dauchy	69f118d7b6	MEDIUM: check: remove checkport checkaddr flag While trying to fix some consistency problem with the config file/cli (e.g. check-port cli command does not set the flag), we realised checkport flag was not necessarily needed. Indeed tcpcheck uses service port as the last choice if check.port is zero. So we can assume if check.port is zero, it means it was never set by the user, regardless if it is by the cli or config file. In the longterm this will avoid to introduce a new consistency issue if we forget to set the flag. in the same manner of checkport flag, we don't really need checkaddr flag. We can assume if checkaddr is not set, it means it was never set by the user or config. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-04 10:43:00 +01:00
Christopher Faulet	21ca3dfc3a	MINOR: dns: Don't set the check port during a server dns resolution When a server dns resolution is performed, there is no reason to set an unconfigured check port with the server port. Because by default, if the check port is not set, the server's one is used. Thus we can remove this useless assignment. It is mandatory for next improvements.	2021-02-04 10:42:52 +01:00
Christopher Faulet	99497d7dba	MINOR: server: Don't set the check port during the update from a state file When the server state is loaded from a server-state file, there is no reason to set an unconfigured check port with the server port. Because by default, if the check port is not set, the server's one is used. Thus we can remove this useless assignment. It is mandatory for next improvements.	2021-02-04 10:42:45 +01:00
William Dauchy	446db718cb	BUG/MINOR: cli: fix set server addr/port coherency with health checks while reading `update_server_addr_port` I found out some things which can be seen as incoherency. I hope I did not overlook anything: - one comment is stating check's address should be updated if it uses the server one; however the condition checks if `SRV_F_CHECKADDR` is set; this flag is set when a check address is set; result is that we override the check address where I was not expecting it. In fact we don't need to update anything here as server addr is used when check addr is not set. - same goes for check agent addr - for port, it is a bit different, we update the check port if it is unset. This is harmless because we also use server port if check port is unset. However it creates some incoherency before/after using this command, as check port should stay unset throughout the life of the process unless it is set by `set server check-port` command. quite hard to locate the origin of this issue but the function was introduced in commit `d458adcc52` ("MINOR: new update_server_addr_port() function to change both server's ADDR and service PORT"). I was however not able to determine whether this is due to a change of behavior along the years. So this patch can potentially be backported up to v1.8 but we must be careful while doing so, as the code has changed a lot. That being said, the bug being not very impacting I would be fine keeping it for 2.4 only. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-04 09:06:04 +01:00
William Lallemand	e0de0a6b32	MINOR: ssl/cli: flush the server session cache upon 'commit ssl cert' Flush the SSL session cache when updating a certificate which is used on a server line. This prevents connections from being established with a cached session which was using the previous SSL_CTX. This patch also replaces the ha_barrier with a thread_isolate() since there are more operations to do. The reg-test was also updated to remove the 'no-ssl-reuse' keyword which is now unneeded.	2021-02-03 18:51:01 +01:00
Amaury Denoyelle	377d8786a7	BUG/MINOR: mux_h2: fix incorrect stat titles Duplicate titles for the stats H2_ST_{OPEN,TOTAL}_{CONN,STREAM}. These entries are used on csv for the heading. This must be backported up to 2.3. This fixes the github issue #1102.	2021-02-03 17:50:45 +01:00
Willy Tarreau	0630038e77	BUG/MEDIUM: ssl: check a connection's status before computing a handshake As spotted in issue #822, we're having a problem with error detection in the SSL layer. The problem is that on an overwhelmed machine, accepted connections can start to pile up, each of them requiring a slow handshake, and during all this time if the client aborts, the handshake will still be calculated. The error controls are properly placed, it's just that the SSL layer reads records exactly of the advertised size, without having the ability to encounter a pending connection error. As such if injecting many TLS connections to a listener with a huge backlog, it's fairly possible to meet this situation: 12:50:48.236056 accept4(8, {sa_family=AF_INET, sin_port=htons(62794), sin_addr=inet_addr("127.0.0.1")}, [128->16], SOCK_NONBLOCK) = 1109 12:50:48.236071 setsockopt(1109, SOL_TCP, TCP_NODELAY, [1], 4) = 0 (process other connections' handshakes) 12:50:48.257270 getsockopt(1109, SOL_SOCKET, SO_ERROR, [ECONNRESET], [4]) = 0 (proof that error was detectable there but this code was added for the PoC) 12:50:48.257297 recvfrom(1109, "\26\3\1\2\0", 5, 0, NULL, NULL) = 5 12:50:48.257310 recvfrom(1109, "\1\0\1\3"..., 512, 0, NULL, NULL) = 512 (handshake calculation taking 700us) 12:50:48.258004 sendto(1109, "\26\3\3\0z"..., 1421, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = -1 EPIPE (Broken pipe) 12:50:48.258036 close(1109) = 0 The situation was amplified by the multi-queue accept code, as it resulted in many incoming connections to be accepted long before they could be handled. Prior to this they would have been accepted and the handshake immediately started, which would have resulted in most of the connections waiting in the system's accept queue, and dying there when the client aborted, thus the error would have been detected before even trying to pass them to the handshake code.
As a result, with a listener running on a very large backlog, it's possible to quickly accept tens of thousands of connections and waste time slowly running their handshakes while they get replaced by other ones. This patch adds an SO_ERROR check on the connection's FD before starting the handshake. This is not pretty as it requires to access the FD, but it does the job. Some improvements should be made over the long term so that the transport layers can report extra information with their ->rcv_buf() call, or at the very least, implement a ->get_conn_status() function to report various flags such as shutr, shutw, error at various stages, allowing an upper layer to inquire for the relevance of engaging into a long operation if it's known the connection is not usable anymore. An even simpler step could probably consist in implementing this in the control layer. This patch is simple enough to be backported as far as 2.0. Many thanks to @ngaugler for his numerous tests with detailed feedback.	2021-02-02 15:55:53 +01:00
William Lallemand	8695ce0bae	BUG/MEDIUM: ssl/cli: abort ssl cert is freeing the old store The "abort ssl cert" command is buggy and removes the current ckch store, and instances, leading to SNI removal. It must only remove the new one. This patch also adds a check in set_ssl_cert.vtc and set_ssl_server_cert.vtc. Must be backported as far as 2.2.	2021-02-01 17:58:21 +01:00
William Dauchy	19f7cfc8c3	MINOR: stats: improve max stats descriptions In order to unify prometheus and stats description, we need to remove some field references which are specific to stats implementation: - `scur` in max current sessions (also reword current session) - `rate` in max sessions - `req_rate` in max requests - `conn_rate` in max connections Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-01 15:16:33 +01:00
William Dauchy	eedb9b13f4	MINOR: stats: improve pending connections description In order to unify prometheus and stats description, we need to clarify the description for pending connections. - remove the BE reference in counters struct, as it is also used in servers - remove reference of `qcur` field in description as it is specific to stats implementation - try to reword cur and max pending connections description Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-01 15:16:33 +01:00
Christopher Faulet	7aa3271439	MINOR: checks: Add function to get the result code corresponding to a status The function get_check_status_result() can now be used to get the result code (CHK_RES_) corresponding to a check status (HCHK_STATUS_). It will be used by the Prometheus exporter when reporting the check status of a server.	2021-02-01 15:16:33 +01:00
Willy Tarreau	75f72338df	BUG/MINOR: activity: take care of late wakeups in "show tasks" During the call to thread_isolate(), some other threads might have performed some task_wakeup() which will have a call date past the one we retrieved. It could be avoided by taking the current date once we're alone but this would significantly affect the latency measurements by adding the isolation time. Instead we're now only accounting positive times, so that late wakeups normally appear with a zero latency. No backport is needed, this is 2.4.	2021-01-29 15:07:07 +01:00
Willy Tarreau	d597ec2718	MINOR: listener: export manage_global_listener_queue() This one pops up in tasks lists when running against a saturated listener.	2021-01-29 14:29:57 +01:00
Christopher Faulet	c29b4bf946	MINOR: mux-h2: Slightly improve request HEADERS frames sending In h2s_bck_make_req_headers() function, in the loop on the HTX blocks, the most common blocks, the headers, are now handled first, before the start-line. The same change was already performed on the response HEADERS frames. Thus the code is more consistent now.	2021-01-29 13:28:43 +01:00
Christopher Faulet	564981369b	MINOR: mux-h2: Don't tests the start-line when sending HEADERS frame When a HEADERS frame is sent, it is always when an HTX start-line block is found. Thus, in h2s_bck_make_req_headers() and h2s_frt_make_resp_headers() functions, it is useless to test the start-line. Instead of being too defensive, we use BUG_ON() now because it must not happen and must be handled as a bug. This patch should fix the issue #1086.	2021-01-29 13:27:57 +01:00
Christopher Faulet	3702f78cf9	MINOR: ssl-sample: Don't check if argument list is set in sample fetches The list is always defined by definition. Thus there is no reason to test it.	2021-01-29 13:26:24 +01:00
Christopher Faulet	e6e7a585e9	MINOR: sample: Don't check if argument list is set in sample fetches The list is always defined by definition. Thus there is no reason to test it.	2021-01-29 13:26:13 +01:00
Christopher Faulet	72dbcfe66d	MINOR: http-conv: Don't check if argument list is set in sample converters The list is always defined by definition. Thus there is no reason to test it.	2021-01-29 13:26:02 +01:00
Christopher Faulet	623af93722	MINOR: http-fetch: Don't check if argument list is set in sample fetches The list is always defined by definition. Thus there is no reason to test it. There is also plenty of checks on arguments types while it is already validated during the configuration parsing. But one thing at a time. This patch should fix the issue #1087.	2021-01-29 13:25:34 +01:00
Christopher Faulet	bdbd5db2a5	BUG/MINOR: stick-table: Always call smp_fetch_src() with a valid arg list The sample fetch functions must always be called with a valid argument list. When called by hand, if there is no argument to pass, empty_arg_list must be used. In the stick-table code, there are some calls to smp_fetch_src() with NULL as argument list. It is changed to use empty_arg_list instead. It is not really a bug because smp_fetch_src() does not use the argument list. But it is an API bug. This patch may be backported to all stable branches as a cleanup.	2021-01-29 13:24:16 +01:00
Christopher Faulet	1faeb4c710	MINOR: mux-h1: Remove first useless test on count in h1_process_output() h1_process_output() function is never called with no data to send (count == 0). Thus, the first test on count, at the beginning of the function is useless and may be removed. This way, by reading the code, it is obvious the <chn_htx> variable is always defined. This patch should fix the issue #1085.	2021-01-29 13:16:32 +01:00
Willy Tarreau	5c25daa170	MINOR: stick-tables: export process_table_expire() This handler can take quite some time as it deletes a large number of entries under a lock, let's export it so that it's immediately visible in "show profiling".	2021-01-29 12:39:32 +01:00
Willy Tarreau	f6c88421b7	MINOR: peers: export process_peer_sync() to improve traces This one will probably pop up from time to time in "show profiling", better have it resolve.	2021-01-29 12:38:42 +01:00
Willy Tarreau	025fc71b47	MINOR: checks: export a few functions that appear often in trace dumps The check I/O handler, process_chk_conn and server_warmup are often present in complex backtraces as they're impacted by locking or I/O issues. Let's export them so that they resolve cleanly.	2021-01-29 12:35:24 +01:00
Willy Tarreau	ac6322dd36	MINOR: muxes: export the timeout and shutr task handlers These ones appear often in "show tasks" so it's handy to make them resolve.	2021-01-29 12:33:46 +01:00
Willy Tarreau	02922e19ca	MINOR: session: export session_expire_embryonic() This is only to make it resolve nicely in "show tasks".	2021-01-29 12:27:57 +01:00
Willy Tarreau	fb5401f296	MINOR: listener: export accept_queue_process This is only to make it resolve in "show tasks".	2021-01-29 12:25:23 +01:00
Willy Tarreau	7eff06e162	MINOR: activity: add a new "show tasks" command to list currently active tasks This finally adds the long-awaited solution to inspect the run queues and figure what is eating the CPU or causing latencies. We can even see the experienced latencies when profiling is enabled. Example on a saturated process: > show tasks Running tasks: 14983 (4 threads) function places % lat_tot lat_avg process_stream 4948 33.0 5.840m 70.82ms h1_io_cb 2535 16.9 - - main+0x9e670 2508 16.7 2.930m 70.10ms ssl_sock_io_cb 2499 16.6 - - si_cs_io_cb 2493 16.6 - -	2021-01-29 12:12:28 +01:00
Willy Tarreau	cfa7101d59	MINOR: activity: flush scheduler stats on "set profiling tasks on" If a user enables profiling by hand, it makes sense to reset the stats counters to provide fresh new measurements. Therefore it's worth using this as the standard method to reset counters.	2021-01-29 12:10:33 +01:00
Willy Tarreau	1bd67e9b03	MINOR: activity: also report collected tasks stats in "show profiling" "show profiling" will now dump the stats collected by the scheduler if profiling was previously enabled. This will immediately make it obvious what functions are responsible for others' high latencies or which ones are suffering from others, and should help spot issues like undesired wakeups. Example: Per-task CPU profiling : on # set profiling tasks {on|auto|off} Tasks activity: function calls cpu_tot cpu_avg lat_tot lat_avg si_cs_io_cb 5569479 23.37s 4.196us - - h1_io_cb 5558654 13.60s 2.446us - - process_stream 250841 1.476s 5.882us 3.499s 13.95us main+0x9e670 198 - - 5.526ms 27.91us task_run_applet 17 1.509ms 88.77us 205.8us 12.11us srv_cleanup_idle_connections 12 44.51us 3.708us 25.71us 2.142us main+0x158c80 9 48.72us 5.413us - - srv_cleanup_toremove_connections 5 165.1us 33.02us 123.6us 24.72us	2021-01-29 12:10:33 +01:00
Willy Tarreau	4e2282f9bf	MEDIUM: tasks/activity: collect per-task statistics when profiling is enabled Now when the profiling is enabled, the scheduler will update per-function task-level statistics on number of calls, cpu usage and latency, that could later be checked using "show profiling". This will immediately make it obvious what functions are responsible for others' high latencies or which ones are suffering from others, and should help spot issues like undesired wakeups. For now the stats are only collected but not reported (though they are readable from sched_activity[] under gdb).	2021-01-29 12:10:33 +01:00
Willy Tarreau	3fb6a7b46e	MINOR: activity: declare a new structure to collect per-function activity The new sched_activity structure will be used to collect task-level activity based on the target function. The principle is to declare a large enough array to make collisions rare (256 entries), and hash the function pointer using a reduced XXH to decide where to store the stats. On first computation an entry is definitely assigned to the array and it's done atomically. A special entry (0) is used to store collisions ("others"). The goal is to make it easy and inexpensive for the scheduler code to use these to store #calls, cpu_time and lat_time for each task.	2021-01-29 12:10:33 +01:00
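The principle described in the commit above (a 256-entry array indexed by a hash of the function pointer, slots claimed atomically on first use, entry 0 collecting collisions) might be sketched as follows. This is only an illustration under stated assumptions: the names `act_entry`, `act_table`, `act_hash()`, `act_lookup()` and `act_account()` are hypothetical, a multiplicative hash stands in for the reduced XXH mentioned in the message, and pointers hashing to 0 are simply accounted in the shared "others" entry.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define ACT_ENTRIES 256

/* Hypothetical layout: one slot of per-function counters. */
struct act_entry {
	_Atomic(void *) func;           /* owner of the slot, NULL while free */
	atomic_uint_fast64_t calls;     /* number of calls */
	atomic_uint_fast64_t cpu_time;  /* cumulated CPU time */
	atomic_uint_fast64_t lat_time;  /* cumulated latency */
};

static struct act_entry act_table[ACT_ENTRIES];

/* Multiplicative hash keeping the top 8 bits (0..255). This is a stand-in,
 * not the reduced XXH used by the real code.
 */
static unsigned int act_hash(const void *func)
{
	uint64_t h = (uint64_t)(uintptr_t)func * 0x9E3779B97F4A7C15ULL;

	return (unsigned int)(h >> 56);
}

/* Returns the slot owned by <func>, claiming it atomically on first use.
 * Falls back to entry 0 ("others") when the slot belongs to another function.
 */
static struct act_entry *act_lookup(void *func)
{
	unsigned int idx = act_hash(func);
	struct act_entry *e = &act_table[idx];
	void *expected = NULL;

	if (idx == 0)
		return &act_table[0];
	if (atomic_compare_exchange_strong(&e->func, &expected, func))
		return e;               /* slot was free, now ours */
	if (expected == func)
		return e;               /* slot was already ours */
	return &act_table[0];           /* collision */
}

/* Accounts one call with its CPU time and latency into <e>. */
static void act_account(struct act_entry *e, uint64_t cpu, uint64_t lat)
{
	atomic_fetch_add(&e->calls, 1);
	atomic_fetch_add(&e->cpu_time, cpu);
	atomic_fetch_add(&e->lat_time, lat);
}
```

The lookup costs one hash and at most one compare-and-swap, which is what keeps it inexpensive enough to call from a scheduler's hot path.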
Willy Tarreau	aa622b822b	MINOR: activity: make profiling more manageable In 2.0, commit `d2d3348ac` ("MINOR: activity: enable automatic profiling turn on/off") introduced an automatic mode to enable/disable profiling. The problem is that the automatic mode automatically changes to on/off, which implied that the forced on/off modes aren't sticky anymore. It's annoying when debugging because as soon as the load decreases, profiling stops. This makes a small change which ought to have been done first, which consists in having two states for "auto" (auto-on, auto-off) to distinguish them from the forced states. Setting to "auto" in the config defaults to "auto-off" as before, and setting it on the CLI switches to auto but keeps the current operating state. This is simple enough to be backported to older releases if needed.	2021-01-29 12:10:33 +01:00
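The four profiling states described in the commit above can be modelled as a small state machine. The sketch below is an illustration only: the enum and function names are hypothetical, and the `loaded` flag stands in for whatever load condition the real code uses to toggle the automatic mode.

```c
/* Hypothetical state names: two forced states and two "auto" states. */
enum prof_state { PROF_OFF, PROF_AUTO_OFF, PROF_AUTO_ON, PROF_ON };

/* Returns non-zero when profiling is currently operating. */
static int prof_is_active(enum prof_state s)
{
	return s == PROF_AUTO_ON || s == PROF_ON;
}

/* Setting "auto" at run time: switch to an auto state but keep the
 * current operating state.
 */
static enum prof_state prof_set_auto(enum prof_state s)
{
	return prof_is_active(s) ? PROF_AUTO_ON : PROF_AUTO_OFF;
}

/* Load change: only the auto states move, forced on/off stay sticky. */
static enum prof_state prof_on_load(enum prof_state s, int loaded)
{
	if (s == PROF_AUTO_OFF && loaded)
		return PROF_AUTO_ON;
	if (s == PROF_AUTO_ON && !loaded)
		return PROF_AUTO_OFF;
	return s;
}
```

Keeping the forced and automatic states distinct is what makes a forced "on" survive a drop in load.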
Willy Tarreau	4deeb1055f	MINOR: tools: add print_time_short() to print a condensed duration value When reporting some values in debugging output we often need to have some condensed, stable-length values. This function prints a duration from nanosecond to years with at least 4 digits of accuracy using the most suitable unit, always on 7 chars.	2021-01-29 12:10:33 +01:00
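A condensed duration printer of the kind described in the commit above could look like the following sketch. It is not the actual print_time_short(): the name `fmt_duration_short()` is hypothetical, a year is assumed to be 365 days, and digits are truncated rather than rounded so that the output never exceeds 7 characters.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes <ns> into <out> as a right-aligned 7-char
 * string such as "4.196us" or " 23.37s". Returns snprintf()'s result.
 */
static int fmt_duration_short(char *out, size_t size, uint64_t ns)
{
	static const struct { const char *name; uint64_t ns; } units[] = {
		{ "y",  31536000ULL * 1000000000ULL }, /* 365 days (assumption) */
		{ "d",  86400ULL * 1000000000ULL },
		{ "h",  3600ULL * 1000000000ULL },
		{ "m",  60ULL * 1000000000ULL },
		{ "s",  1000000000ULL },
		{ "ms", 1000000ULL },
		{ "us", 1000ULL },
		{ "ns", 1ULL },
	};
	static const uint64_t scale[] = { 1, 10, 100, 1000 };
	const size_t last = sizeof(units) / sizeof(units[0]) - 1;
	size_t u = 0;
	char tmp[32];

	/* pick the largest unit that does not exceed the value */
	while (u < last && ns < units[u].ns)
		u++;

	uint64_t whole = ns / units[u].ns;
	uint64_t rem = ns % units[u].ns;

	if (u == last) {
		/* nanoseconds have no fractional part */
		snprintf(tmp, sizeof(tmp), "%lluns", (unsigned long long)whole);
	} else {
		/* keep 4 significant digits: 1, 2 or 3 integer digits */
		int dec = whole < 10 ? 3 : whole < 100 ? 2 : 1;
		uint64_t frac = rem / (units[u].ns / scale[dec]);

		snprintf(tmp, sizeof(tmp), "%llu.%0*llu%s",
		         (unsigned long long)whole, dec,
		         (unsigned long long)frac, units[u].name);
	}
	return snprintf(out, size, "%7s", tmp);
}
```

Dividing the remainder by `unit / 10^dec` instead of multiplying it first avoids a 64-bit overflow for the largest units.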
Amaury Denoyelle	a81bb7197e	BUG/MINOR: backend: check available list allocation for reuse Do not consider reuse connection if available list is not allocated for the target server. This will prevent a crash when using a standalone server for an external purpose like socket_tcp/socket_ssl on hlua code. For the idle/safe lists, they are considered allocated if srv.max_idle_conns is not null. Note that the hlua code is currently safe thanks to the additional checks on proxy http mode and stream reuse policy not never. However, this might not be sufficient for future code. This patch should be backported in every branch containing the following patch : `7f68d815af` (2.4 tree) REORG: backend: simplify conn_backend_get	2021-01-28 18:12:07 +01:00
Willy Tarreau	02757d02c2	Revert "BUG/MEDIUM: listener: do not accept connections faster than we can process them" This reverts commit `62e8aaa1bd`. While it works extremely well to address SSL handshake floods, it prevents establishment of new connections during regular traffic above 50-60 Gbps, because for an unknown reason the queue seems to have ~1.7 active tasks per connection all the time, which makes no sense as these ought to be waiting on subscribed events. It might uncover a deeper issue but at least for now a different solution is needed. cf issue #822. The test is trivial to run, just start a config with tune.runqueue-depth 10 and inject on 1GB objects with more than 10 connections. Try to connect to the stats socket, it only works once, then the listeners are not dequeued.	2021-01-28 18:11:32 +01:00


10974 Commits