haproxy

mirror of https://git.haproxy.org/git/haproxy.git/ synced 2025-08-07 23:56:57 +02:00

Author	SHA1	Message	Date
Willy Tarreau	220fd70694	BUG/MINOR: extcheck: proxy_parse_extcheck() must take a const for the defproxy The default proxy was passed as a variable, which in addition to being a PITA to deal with in the config parser, doesn't feel safe to use when it ought to be const. This will only affect new code so no backport is needed.	2021-02-12 16:23:46 +01:00
Willy Tarreau	818ec78af8	MINOR: proxy: always properly reset the just freed default instance pointers In proxy_free_defaults(); none of the free() calls was followed by a pointer reset. Not only it's hard to figure if one of them is duplicated, but this code started to call other functions which might or might not rely on such just freed pointers. Let's reset them as they should be to make sure there will never be any case of use-after-free. The 3 functions called there were inspected and are all unaffected by this so this remains safe to do right now.	2021-02-12 16:23:46 +01:00
Willy Tarreau	a3320a0509	MINOR: proxy: move the defproxy freeing code to proxy.c This used to be open-coded in cfgparse-listen.c when facing a "defaults" keyword. Let's move this into proxy_free_defaults(). This code is ugly and doesn't even reset the just freed pointers. Let's not change this yet. This code should probably be merged with a generic proxy deinit function called from deinit(). However there's a catch on uri_auth which cannot be freed because it might be used by one or several proxies. We definitely need refcounts there!	2021-02-12 16:23:46 +01:00
Willy Tarreau	3b06eaec86	MEDIUM: proxy: only take defaults when a default proxy is passed. The proxy initialization code relies on three phases, allocation, pre-initialization, and assignments from defaults. This last part is entirely taken from the defaults proxy when arguments are set. This sensibly complexifies the initialization code as it requires to always have a default proxy. This patch instead first applies the original default settings on a proxy, and then uses those from a default proxy only if one such is used. This will allow to initialize a proxy out of any default proxy while still using valid defaults. A careful inspection of the function showed that only 4 fields used to be set regardless of the default proxy, and those were moved to init_new_proxy() where they ought to have been in the first place.	2021-02-12 16:23:46 +01:00
Willy Tarreau	7683893c70	REORG: proxy: centralize the proxy allocation code into alloc_new_proxy() This new function takes over the old open-coding that used to be done for too long in cfg_parse_listen() and it now does everything at once in a proxy-centric function. The function does all the job of allocating the structure, initializing it, presetting its defaults from the default proxy and checking for errors. The code was almost unchanged except for defproxy being passed as a pointer, and the error message being passed using memprintf(). This change will be needed to ease reuse of multiple default proxies, or to create dynamic backends in a distant future.	2021-02-12 16:23:46 +01:00
Willy Tarreau	144289b459	REORG: move init_default_instance() to proxy.c and pass it the defproxy pointer init_default_instance() was still left in cfgparse.c which is not the best place to pre-initialize a proxy. Let's place it in proxy.c just after init_new_proxy(), take this opportunity for renaming it to proxy_preset_defaults() and taking out init_new_proxy() from it, and let's pass it the pointer to the default proxy to be initialized instead of implicitly assuming defproxy. We'll soon be able to exploit this. Only two call places had to be updated.	2021-02-12 16:23:46 +01:00
Willy Tarreau	09f2e77eb1	BUG/MINOR: tcpheck: the source list must be a const in dup_tcpcheck_var() This is just an API bug but it's annoying when trying to tidy the code. The source list passed in argument must be a const and not a variable, as it's typically the list head from a default proxy and must obviously not be modified by the function. No backport is needed as it only impacts new code.	2021-02-12 16:23:46 +01:00
Willy Tarreau	016255a483	BUG/MINOR: http-htx: defpx must be a const in proxy_dup_default_conf_errors() This is just an API bug but it's annoying when trying to tidy the code. The default proxy passed in argument must be a const and not a variable. No backport is needed as it only impacts new code.	2021-02-12 16:23:46 +01:00
Willy Tarreau	b2ec994523	BUG/MINOR: cfgparse: do not mention "addr:port" as supported on proxy lines The very old error message indicating that a proxy name is mandatory still had a reference to the optional addr:port argument while this one is explicitly rejected a few lines later since at least 1.9. This is harmless but confusing. This can be backported to 2.0.	2021-02-12 16:23:45 +01:00
Willy Tarreau	5bbc676608	BUG/MINOR: stats: revert the change on ST_CONVDONE In 2.1, commit `ee4f5f83d` ("MINOR: stats: get rid of the ST_CONVDONE flag") introduced a subtle bug. By testing curproxy against defproxy in check_config_validity(), it tried to eliminate the need for a flag to indicate that stats authentication rules were already compiled, but by doing so it left the issue open for the case where a new defaults section appears after the two proxies sharing the first one: defaults mode http stats auth foo:bar listen l1 bind :8080 listen l2 bind :8181 defaults # just to break above This config results in: [ALERT] 042/113725 (3121) : proxy 'f2': stats 'auth'/'realm' and 'http-request' can't be used at the same time. [ALERT] 042/113725 (3121) : Fatal errors found in configuration. Removing the last defaults remains OK. It turns out that the cleanups that followed that patch render it useless, so the best fix is to revert the change (with the up-to-date flags instead). The flag was marked as belonging to the config. It's not exact but it's the closest to the reality, as it's not there to configure the behavior but to mention that the config parser did its job. This could be backported as far as 2.1, but in practice it looks like nobody ever hit it.	2021-02-12 16:23:45 +01:00
Willy Tarreau	937c3ead34	BUG/MEDIUM: config: don't pick unset values from last defaults section Since 1.3.14 with commit `1fa3126ec` ("[MEDIUM] introduce separation between contimeout, and tarpit + queue"), check_config_validity() looks at the last defaults section to update all proxies' queue and tarpit timeouts if they were not set! This was apparently an attempt to properly set them on the fallback values, except that the fallback values were taken from the default proxy before looking at the current proxy itself. The worst part of it is that it might have randomly worked by accident for some configurations when there was a single defaults section, but has certainly caused too short queue expirations once another defaults section was added later in the file with these explicitly defined. Let's remove the defproxy part and keep only the curproxy ones. This could be backported everywhere, the bug has been there for 13 years.	2021-02-12 16:23:45 +01:00
Christopher Faulet	f5ea269723	CLEANUP: deinit: release global and per-proxy server-state variables on deinit The global server-state base directory and file name are now released on deinit, as well as per-proxy server-state file name.	2021-02-12 16:04:52 +01:00
Christopher Faulet	583b6de68a	BUG/MINOR: server: Fix server-state-file-name directive Since the beginning, this directive is documented to accept an optional file name. But it should also be possible to use it without any argument to use the backend name as file name. However, when no argument is provided, an error is reported during the configuration parsing requesting an argument, a file name or "use-backend-name". And this last special argument is not documented. So, to respect the documentation and to avoid configuration breakages, all modes are now supported. If this directive is called with no argument or with "use-backend-name", the backend name is used as file name for the server-state file. Otherwise, the provided string is used. In addition, we take care to release any previously allocated file name in case this directive is defined multiple times in the same backend. And an error is reported if more than one argument is defined. Finally, the documentation is updated accordingly. Sections supporting this directive are also mentioned. This patch should be backported as far as 1.6.	2021-02-12 16:04:52 +01:00
William Dauchy	ddc7ce9645	MINOR: server: enhance error precision when applying server state server health checks and agent parameters are written the same way as others to be able to enhance code reuse: basically we make use of parsing and assignment at the same place. It makes it difficult for error handling to know whether srv object was modified partially or not. The problem was already present with SRV resolution though. I was a bit puzzled about the approach to take to be honest, and I did not want to go into a full refactor, so I assumed it was ok to simply notify whether the line was failed or partially applied. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-12 16:04:52 +01:00
William Dauchy	d1a7b85a40	MEDIUM: server: support {check,agent}_addr, agent_port in server state logical followup from cli commands addition, so that the state server file stays compatible with the changes made at runtime; use previously added helper to load server attributes. also alloc a specific chunk to avoid mixing with other called functions using it Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-12 16:04:52 +01:00
William Dauchy	63e6cba12a	MEDIUM: server: add server-states version 2 Even if it is possibly too much work for the current usage, it makes sure we don't break states file from v2.3 to v2.4; indeed, since v2.3, we introduced two new fields, so we put them aside to guarantee we can easily reload from a version 1. The diff seems huge but there is no specific change apart from: - introduce v2 where it is needed (parsing, update) - move away from switch/case in update to be able to reuse code - move srv lock to the whole function to make it easier this patch confirms how painful it is to maintain this functionality. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-12 16:04:52 +01:00
William Dauchy	7cabc06da6	MEDIUM: cli: add agent-port command this patch allows to set agent port at runtime. In order to align with both `addr` and `check-addr` commands, also add the possibility to optionally set port on `agent-addr` command. This led to a small refactor in order to use the same function for both `agent-addr` and `agent-port` commands. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-12 16:04:52 +01:00
William Dauchy	b456e1f389	MEDIUM: cli: add check-addr command this patch allows to set server health check address at runtime. In order to align with `addr` command, also allow to set port optionally. This led to a small refactor in order to use the same function for both `check-addr` and `check-port` commands. for `check-port`, we however don't permit the change anymore if checks are not enabled on the server. This command becomes more and more useful for people having a consul like architecture: - the backend server is located on a container with its own IP - the health checks are done by the consul instance located on the host with the host IP Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-12 16:04:52 +01:00
Amaury Denoyelle	edadf192fe	BUG/MINOR: backend: fix compilation without ssl sni_smp/sni_hash are reported as unused on compilation without USE_OPENSSL and may cause compilation failure This does not need to be backported.	2021-02-12 13:49:42 +01:00
Amaury Denoyelle	1921d20fff	MINOR: connection: use proxy protocol as parameter for srv conn hash Use the proxy protocol frame if proxy protocol is activated on the server line. Do not add anymore these connections in the private list. If some requests are made with the same proxy fields, they can reuse the idle connection. The reg-tests proxy_protocol_send_unique_id must be adapted as it relied on the side effect behavior that every request from a same connection reused a private server connection. Now, a new connection is created as expected if the proxy protocol fields differ.	2021-02-12 12:54:04 +01:00
Amaury Denoyelle	d10a200f62	MINOR: connection: use src addr as parameter for srv conn hash The source address is used as an input to the server connection hash. The address and port are used as separate hash inputs. Do not add anymore these connections in the private list. This parameter is set only if used in the transparent-proxy mode.	2021-02-12 12:54:04 +01:00
Amaury Denoyelle	f7bdf00071	MINOR: backend: rewrite alloc of connection src address This commit is similar to "MINOR: backend: rewrite alloc of stream target address" but with source address.	2021-02-12 12:54:04 +01:00
Amaury Denoyelle	01a287f1e5	MINOR: connection: use dst addr as parameter for srv conn hash The destination address is used as an input to the server connection hash. The address and port are used as separate hash inputs. Note that they are not used when statically specified on the server line. This is only useful for dynamic destination address. This is typically used when the server address is dynamically set via the set-dst action. The address and port are separate hash parameters. Most notably, it should fix the set-dst use case (cf github issue #947).	2021-02-12 12:53:56 +01:00
Amaury Denoyelle	68cf3959b3	MINOR: backend: rewrite alloc of stream target address Change the API of the function used to allocate the stream target address. This is done in order to be able to allocate the destination address and use it to reuse a connection sharing with the same address. In particular, the flag stream SF_ADDR_SET is now set outside of the function.	2021-02-12 12:53:56 +01:00
Amaury Denoyelle	9b626e3c19	MINOR: connection: use sni as parameter for srv conn hash The sni parameter is an input to the server connection hash. Do not add anymore connections with dynamic sni in the private list. Thus, it is now possible to reuse a server connection if they use the same sni.	2021-02-12 12:48:11 +01:00
Amaury Denoyelle	293dcc400e	MINOR: backend: compare conn hash for session conn reuse Compare the connection hash when reusing a connection from the session. This ensures that a private connection is reused only if it shares the same set of parameters.	2021-02-12 12:33:05 +01:00
Amaury Denoyelle	1a58aca84e	MINOR: connection: use the srv pointer for the srv conn hash The pointer of the target server is used as a first parameter for the server connection hash calculation. This prevents the hash from being null when no specific parameters are present, and can serve as a simple defense against an attacker trying to reuse a non-conforming connection.	2021-02-12 12:33:05 +01:00
Amaury Denoyelle	81c6f76d3e	MINOR: connection: prepare hash calcul for server conns This is preliminary work for the calculation of the backend connection hash. A structure conn_hash_params is the input for the operation, containing the various specific parameters of a connection. The high bits of the hash will reflect the parameters present as input. A set of macros is written to manipulate the connection hash and extract the parameters/payload.	2021-02-12 12:33:05 +01:00
Amaury Denoyelle	aa890aef3d	MINOR: backend: search conn in idle tree after safe on always reuse With http-reuse always, if no matching safe connection is found, check in idle tree for a matching one. This is needed because now idle connections can be differentiated from each other. If only the safe tree was checked because not empty, but did not contain a matching connection, we could miss matching entry in idle tree.	2021-02-12 12:33:05 +01:00
Amaury Denoyelle	1399d695c0	MINOR: backend: search conn in idle/safe trees after available If no matching connection is found on available, check on idle/safe trees for a matching one. This is needed because now idle connections can be differentiated from each other. If only the available list was checked because not empty, but did not contain a matching connection, we could miss matching entries in idle or safe trees.	2021-02-12 12:33:05 +01:00
Amaury Denoyelle	f232cb3e9b	MEDIUM: connection: replace idle conn lists by eb trees The server idle/safe/available connection lists are replaced with ebmb-trees. This is used to store backend connections, with the new field connection hash as the key. The hash is an 8-byte field, used to reflect specific connection parameters. This is preliminary work to be able to reuse connections with SNI, explicit src/dst address or PROXY protocol.	2021-02-12 12:33:05 +01:00
Amaury Denoyelle	5c7086f6b0	MEDIUM: connection: protect idle conn lists with locks This is a preparation work for connection reuse with sni/proxy protocol/specific src-dst addresses. Protect every access to idle conn lists with a lock. This is currently strictly not needed because the access to the list are made with atomic operations. However, to be able to reuse connection with specific parameters, the list storage will be converted to eb-trees. As this structure does not have atomic operation, it is mandatory to protect it with a lock. For this, the takeover lock is reused. Its role was to protect during connection takeover. As it is now extended to general idle conns usage, it is renamed to idle_conns_lock. A new lock section is also instantiated named IDLE_CONNS_LOCK to isolate its impact on performance.	2021-02-12 12:33:04 +01:00
Amaury Denoyelle	a3bf62ec54	BUG/MINOR: backend: hold correctly lock when killing idle conn The wrong lock seems to be held when trying to remove another thread connection if max fd limit has been reached (locking the current thread instead of the target thread lock). This could be backported up to 2.0.	2021-02-12 12:32:31 +01:00
Christopher Faulet	cd7126b396	CLEANUP: queue: Remove useless tests on p or pp in pendconn_process_next_strm() This patch removes unnecessary tests on p or pp pointers in pendconn_process_next_strm() function. This should make cppcheck happy and avoid false report of null pointer dereference. This patch should fix the issue #1036.	2021-02-11 11:48:36 +01:00
Ilya Shipitsin	a1e0f387c7	CLEANUP: remove unused variable assigned found by Coverity this is pure cleanup, no need to backport 2116 if ((end - 1) == (payload + strlen(PAYLOAD_PATTERN))) { 2117 /* if the payload pattern is at the end */ 2118 s->pcli_flags |= PCLI_F_PAYLOAD; CID 1399833 (#1 of 1): Unused value (UNUSED_VALUE)assigned_value: Assigning value from reql to ret here, but that stored value is overwritten before it can be used. 2119 ret = reql; 2120 } This patch fixes the issue #1048.	2021-02-11 11:48:36 +01:00
Christopher Faulet	4b524124db	BUG/MINOR: tools: Fix a memory leak on error path in parse_dotted_uints() When an invalid character is found during parsing in parse_dotted_uints() function, the allocated array of uint must be released. This patch fixes a memory leak on error path during the configuration parsing. This patch should fix the issue #1106. It should be backported as far as 2.0. Note that, for 2.1 and 2.0, the function is in src/standard.c	2021-02-11 11:48:36 +01:00
Christopher Faulet	0aeaa290da	CLEANUP: muxes: Remove useless calls to b_realign_if_empty() In H1, H2 and FCGI muxes, b_realign_if_empty() is called to reset the head of an empty buffer before setting it a specific value to permit the zero-copy. Thus, we can remove call to b_realign_if_empty().	2021-02-11 11:48:36 +01:00
Christopher Faulet	368936703a	MINOR: mux-h1: Be sure EOM flag is set when processing end of outgoing message When a message is sent, an extra check is performed when the parser is switched to MSG_DONE state to be sure the EOM flag is really set. This flag is quite new and replaces the EOM block. Thus, this test is a safeguard waiting for a proper refactoring of the outgoing side.	2021-02-10 16:25:42 +01:00
Christopher Faulet	337243235f	BUG/MEDIUM: mux-h2: Add EOT block when EOM flag is set on an empty HTX message In the H2 mux, when an empty DATA frame is used to finish a message, just to set the ES flag, we now only set the EOM flag on the HTX message. However, if the HTX message is empty, this event will not be properly handled on the other side because there is no effective data to handle. Thus, it is interpreted as an abort by the H1 mux. It is in part caused by the current H1 mux design but also because there is no way to emit empty HTX block (NOOP HTX block) or to wakeup a mux for send when there is no data to finish some internal processing. Thus, for now, to work around this limitation, an EOT HTX block is added by the H2 mux if an EOM flag is added on an empty HTX message. This case is only possible when an empty DATA frame with the ES flag is received. This fix is specific for 2.4. No backport needed.	2021-02-10 16:25:42 +01:00
Christopher Faulet	0a916d2aca	BUG/MINOR: mux-h1: Don't blindly skip EOT block for non-chunked messages In HTTP/2, we may have trailers for messages with a Content-length header. Thus, when the H2 mux receives a HEADERS frame at the end of a message, it always emits TLR and EOT HTX blocks. On the H1 mux, if this happens, these blocks are just skipped because we cannot emit trailers for a non-chunked message. But the EOT HTX block must not be blindly ignored. Indeed, there is no longer EOM HTX block to mark the end of the message. Thus the EOT block, when found, is the end of the message. So we must handle it to switch to MSG_DONE state. This fix is specific for 2.4. No backport needed.	2021-02-10 16:25:42 +01:00
Christopher Faulet	0d7e634631	BUG/MINOR: mux-h1: Fix data skipping for bodyless responses When payload is received for a bodyless response, for instance a response to a HEAD request, it is silently skipped. Unfortunately, when this happens, the end of the message is not properly handled. The response remains in the MSG_DATA state (or MSG_TRAILERS if the message is chunked). In addition, when a zero-copy is possible, the data are not removed from the channel buffer and the H1 connection is killed because an error is then triggered. To fix the bug, the zero-copy is disabled for bodyless responses. It is not a problem because there is no copy at all. And the last block (DATA or EOT) is now properly handled. This bug was introduced by the commit `e5596bf53` ("MEDIUM: mux-h1: Don't emit any payload for bodyless responses"). This fix is specific for 2.4. No backport needed.	2021-02-10 16:25:42 +01:00
Christopher Faulet	a22782b597	BUG/MEDIUM: mux-h1: Always set CS_FL_EOI for response in MSG_DONE state During the message parsing, if in MSG_DONE state, the CS_FL_EOI flag must always be set on the conn-stream if following conditions are met : * It is a response or * It is a request but not a protocol upgrade nor a CONNECT. For now, there is no test on the message type (request or response). Thus the CS_FL_EOI flag is not set for a response with a "Connection: upgrade" header but not a 101 response. This bug was introduced by the commit `3e1748bbf` ("BUG/MINOR: mux-h1: Don't set CS_FL_EOI too early for protocol upgrade requests"). It was backported as far as 2.0. Thus, this patch must also be backported as far as 2.0.	2021-02-10 16:25:42 +01:00
Christopher Faulet	bf7175f9b6	BUG/MINOR: http-ana: Don't increment HTTP error counter on internal errors If internal error is reported by the mux during HTTP request parsing, the HTTP error counter should not be incremented. It should only be incremented on parsing error to reflect errors caused by clients. This patch must be backported as far as 2.0. During the backport, the same must be performed for 408-request-time-out errors.	2021-02-10 16:22:32 +01:00
Christopher Faulet	f4b7074784	BUG/MINOR: mux-h1: Don't increment HTTP error counter for 408/500/501 errors The HTTP error counter reflects the number of errors caused by clients. Thus, in the H1 mux, it should only be incremented on parsing errors. This fix is specific for 2.4. No backport needed.	2021-02-10 16:22:32 +01:00
Willy Tarreau	826f3ab5e6	MINOR: stick-tables/counters: add http_fail_cnt and http_fail_rate data types Historically we've been counting lots of client-triggered events in stick tables to help detect misbehaving ones, but we've been missing the same on the server side, and there's been repeated requests for being able to count the server errors per URL in order to precisely monitor the quality of service or even to avoid routing requests to certain dead services, which is also called "circuit breaking" nowadays. This commit introduces http_fail_cnt and http_fail_rate, which work like http_err_cnt and http_err_rate in that they respectively count events and their frequency, but they only consider server-side issues such as network errors, unparsable and truncated responses, and 5xx status codes other than 501 and 505 (since these ones are usually triggered by the client). Note that retryable errors are purposely not accounted for, so that only what the client really sees is considered. With this it becomes very simple to put some protective measures in place to perform a redirect or return an excuse page when the error rate goes beyond a certain threshold for a given URL, and give more chances to the server to recover from this condition. Typically it could look like this to bypass a URL causing more than 10 requests per second: stick-table type string len 80 size 4k expire 1m store http_fail_rate(1m) http-request track-sc0 base # track host+path, ignore query string http-request return status 503 content-type text/html \ lf-file excuse.html if { sc0_http_fail_rate gt 10 } A more advanced mechanism using gpt0 could even implement high/low rates to disable/enable the service. Reg-test converteers_ref_cnt_never_dec.vtc was updated to test it.	2021-02-10 12:27:01 +01:00
Willy Tarreau	e4d247e217	BUG/MINOR: freq_ctr: fix a wrong delay calculation in next_event_delay() The sleep time calculation in next_event_delay() was wrong because it was dividing 999 by the number of pending events, and was directly responsible for an observation made a long time ago that listeners would eat all the CPU when hammered while globally rate-limited, because the more events were queued, the less it would wait, and would ignore the configured frequency to compute the delay. This was addressed in various ways in listeners through the switch to the FULL state and the wakeup of manage_global_listener_queue() that avoids this fast loop, but the calculation made there remained wrong nevertheless. It's even visible with this patch that the accept frequency is much more accurate at low values now; for example, configuring a maxconnrate of 10 would give between 8.99 and 11.0 cps before this patch and between 9.99 and 10.0 with it. Better fix it now in case it's reused anywhere else and causes confusion again. It may be backported but is probably not worth it.	2021-02-09 17:52:50 +01:00
William Lallemand	3ce6eedb37	MEDIUM: ssl: add a rwlock for SSL server session cache When adding the server side support for certificate update over the CLI we encountered a design problem with the SSL session cache which was not locked. Indeed, once a certificate is updated we need to flush the cache, but we also need to ensure that the cache is not used during the update. To prevent the use of the cache during an update, this patch introduces a rwlock for the SSL server session cache. In the SSL session part this patch only locks in read, even if it writes. The reason behind this is that in the session part, there is one cache storage per thread so it is not a problem to write in the cache from several threads. The problem is only when trying to write in the cache from the CLI (which could be on any thread) when a session is trying to access the cache. So there is a write lock in the CLI part to prevent simultaneous access by a session and the CLI. This patch also removes the thread_isolate attempt which is eating too much CPU time and was not protecting from the use of a free ptr in the session.	2021-02-09 09:43:44 +01:00
Ilya Shipitsin	7ff7747a17	BUILD: ssl: guard SSL_CTX_set_msg_callback with SSL_CTRL_SET_MSG_CALLBACK macro both SSL_CTX_set_msg_callback and SSL_CTRL_SET_MSG_CALLBACK defined since ea262260469e49149cb10b25a87dfd6ad3fbb4ba, we can safely switch to that guard instead of OpenSSL version	2021-02-08 13:49:41 +01:00
William Dauchy	060ffc82d6	CLEANUP: tools: typo in `strl2irc` mention `str2irc` does not exist Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-08 10:49:08 +01:00
William Dauchy	f4300902b9	CLEANUP: check: fix some typo in comments a few obvious english typo in comments, some of which introduced by myself quite recently Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-08 10:49:08 +01:00
Ilya Shipitsin	acf84595a7	CLEANUP: assorted typo fixes in the code and comments This is 17th iteration of typo fixes	2021-02-08 10:49:08 +01:00
Christopher Faulet	3d6e0e3e04	BUG/MINOR: mux-h1: Don't emit extra CRLF for empty chunked messages Because of a buggy test when processing the EOH HTX block, an extra CRLF is added for empty chunked messages. This bug was introduced by the commit `d1ac2b90c` ("MAJOR: htx: Remove the EOM block type and use HTX_FL_EOM instead"). This fix is specific for 2.4. No backport needed.	2021-02-08 09:43:36 +01:00
Ilya Shipitsin	f00cdb1856	BUILD: ssl: guard SSL_CTX_add_server_custom_ext with special macro special guard macros HAVE_SSL_CTX_ADD_SERVER_CUSTOM_EXT was defined earlier exactly for guarding SSL_CTX_add_server_custom_ext, let us use it wherever appropriate	2021-02-08 00:11:43 +01:00
Ilya Shipitsin	7bbf5866e0	BUILD: ssl: fix typo in HAVE_SSL_CTX_ADD_SERVER_CUSTOM_EXT macro HAVE_SSL_CTX_ADD_SERVER_CUSTOM_EXT was introduced in `ec60909871` however it was defined as HAVE_SL_CTX_ADD_SERVER_CUSTOM_EXT (missing "S") let us fix typo	2021-02-08 00:11:41 +01:00
Willy Tarreau	133aaa9f11	BUG/MEDIUM: mux-h2: do not quit the demux loop before setting END_REACHED The demux loop could quit on missing data but the H2_CF_END_REACHED flag would not be set in this case. This fixes a remaining situation where previous commit `f09612289` ("BUG/MEDIUM: mux-h2: handle remaining read0 cases") could not be sufficient and still leave CLOSE_WAIT. It's harder to reproduce but was still observed in prod. Now we quit via the end of the loop which already takes care of shutr. This should be backported along with the patch above as far as 2.0.	2021-02-05 12:22:54 +01:00
Remi Tricot-Le Breton	25dd0ad123	BUG/MINOR: sock: Unclosed fd in case of connection allocation failure If allocating a connection object failed right after a successful accept on a listener, the new file descriptor was not properly closed. This fixes GitHub issue #905. It can be backported to 2.3.	2021-02-05 12:14:51 +01:00
Christopher Faulet	1cdc028687	CLEANUP: http-htx: Set buffer area to NULL instead of malloc(0) During error files conversion to HTX message, in http_str_to_htx(), if a file is empty, the corresponding buffer's area is initialized with a malloc(0) and its size is set to 0. There is no problem here. The behaviour is totally defined. But it is not really intuitive. Instead, we can simply set the area to NULL. This patch should fix the issue #1022.	2021-02-05 11:51:44 +01:00
Willy Tarreau	f09612289f	BUG/MEDIUM: mux-h2: handle remaining read0 cases Commit `3d4631fec` ("BUG/MEDIUM: mux-h2: fix read0 handling on partial frames") tried to address an issue introduced in commit `aade4edc1` where read0 wasn't properly handled in the middle of a frame. But the fix was incomplete for two reasons: - first, it would set H2_CF_RCVD_SHUT in h2_recv() after detecting a read0 but the condition was guarded by h2_recv_allowed() which explicitly excludes read0 ; - second, h2_process would only call h2_process_demux() when there were still data in the buffer, but closing after a short pause to leave a buffer empty wouldn't be caught in this case. This patch fixes this by properly taking care of the received shutdown and by also waking up h2_process_demux() on an empty buffer if the demux is not blocked. Given the patches above were tagged for backporting to 2.0, this one should be as well.	2021-02-05 11:48:38 +01:00
Willy Tarreau	ed9892018c	MINOR: cli/show_fd: report local and remote ports when known FD dumps are not always easy to match against netstat dumps, and often require an lsof as a third dump. Let's emit the socket family, and the local and remote ports when the FD is an IPv4/IPv6 socket, this will significantly ease the matching.	2021-02-05 10:58:03 +01:00
Willy Tarreau	a84986ae4f	BUG/MINOR: ssl: do not try to use early data if not configured The CO_FL_EARLY_SSL_HS flag was unconditionally set on the connection, resulting in SSL_read_early_data() always being used first in handshake calculations. While this seems to work well (probably because there are fallback paths inside openssl), it's particularly confusing and makes the debugging quite complicated. It possibly is not optimal by the way. This flag ought to be set only when early_data is configured on the bind line. Apparently there used to be a good reason for doing it this way in 1.8 times, but it really does not make sense anymore. It may be OK to backport this to 2.3 if this helps with troubleshooting, but better not go too far as it's unlikely to fix any real issue while it could introduce some in old versions.	2021-02-05 08:04:02 +01:00
Christopher Faulet	a8979a9b59	DOC: server: Add missing params in comment of the server state line parsing srv_use_ssl and srv_check_port parameters were not mentioned in the comment of the function parsing a server state line.	2021-02-04 14:00:43 +01:00
William Dauchy	4858fb2e18	MEDIUM: check: align agentaddr and agentport behaviour in the same manner of agentaddr, we now: - permit to set agentport through `port` keyword, like it is the case for agentaddr through `addr` - set the priority on `agent-port` keyword when used - add a flag to be able to test when the value is set like for agentaddr it makes the behaviour between `addr` and `port` more consistent. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-04 14:00:38 +01:00
William Dauchy	1c921cd748	BUG/MINOR: check: consistent way to set agentaddr small consistency problem with `addr` and `agent-addr` options: for both options, the last one parsed is always used to set the agent-check addr. Thus these two lines don't have the same behavior: server ... addr <addr1> agent-addr <addr2> server ... agent-addr <addr2> addr <addr1> After this patch `agent-addr` will always be the priority option over `addr`. It means we test the flag before setting agentaddr. We also fix all the places where we did not set the flag to be coherent everywhere. I was not really able to determine where this issue is coming from. So it is probable we may backport it to all stable versions where the agent is supported. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-04 13:55:04 +01:00
William Dauchy	fe03e7d045	MEDIUM: server: adding support for check_port in server state We can currently change the check-port using the cli command `set server check-port` but there is a consistency issue when using server state. This patch aims to fix this problem but will be also a good preparation work to get rid of checkport flag, so we are able to know when checkport was set by config. I am fully aware this is not making github #953 moving forward, I however think this might be acceptable while waiting for a proper solution and resolve consistency problem faced with port settings. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-04 10:46:52 +01:00
William Dauchy	69f118d7b6	MEDIUM: check: remove checkport checkaddr flag While trying to fix some consistency problem with the config file/cli (e.g. check-port cli command does not set the flag), we realised checkport flag was not necessarily needed. Indeed tcpcheck uses service port as the last choice if check.port is zero. So we can assume if check.port is zero, it means it was never set by the user, regardless if it is by the cli or config file. In the longterm this will avoid to introduce a new consistency issue if we forget to set the flag. in the same manner of checkport flag, we don't really need checkaddr flag. We can assume if checkaddr is not set, it means it was never set by the user or config. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-04 10:43:00 +01:00
Christopher Faulet	21ca3dfc3a	MINOR: dns: Don't set the check port during a server dns resolution When a server dns resolution is performed, there is no reason to set an unconfigured check port with the server port. Because by default, if the check port is not set, the server's one is used. Thus we can remove this useless assignment. It is mandatory for next improvements.	2021-02-04 10:42:52 +01:00
Christopher Faulet	99497d7dba	MINOR: server: Don't set the check port during the update from a state file When the server state is loaded from a server-state file, there is no reason to set an unconfigured check port with the server port. Because by default, if the check port is not set, the server's one is used. Thus we can remove this useless assignment. It is mandatory for next improvements.	2021-02-04 10:42:45 +01:00
William Dauchy	446db718cb	BUG/MINOR: cli: fix set server addr/port coherency with health checks while reading `update_server_addr_port` I found out some things which can be seen as incoherency. I hope I did not overlook anything: - one comment is stating check's address should be updated if it uses the server one; however the condition checks if `SRV_F_CHECKADDR` is set; this flag is set when a check address is set; result is that we override the check address where I was not expecting it. In fact we don't need to update anything here as server addr is used when check addr is not set. - same goes for check agent addr - for port, it is a bit different, we update the check port if it is unset. This is harmless because we also use server port if check port is unset. However it creates some incoherency before/after using this command, as check port should stay unset throughout the life of the process unless it is set by `set server check-port` command. quite hard to locate the origin of this issue but the function was introduced in commit `d458adcc52` ("MINOR: new update_server_addr_port() function to change both server's ADDR and service PORT"). I was however not able to determine whether this is due to a change of behavior along the years. So this patch can potentially be backported up to v1.8 but we must be careful while doing so, as the code has changed a lot. That being said, the bug being not very impacting I would be fine keeping it for 2.4 only. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-04 09:06:04 +01:00
William Lallemand	e0de0a6b32	MINOR: ssl/cli: flush the server session cache upon 'commit ssl cert' Flush the SSL session cache when updating a certificate which is used on a server line. This prevents connections from being established with a cached session which was using the previous SSL_CTX. This patch also replaces the ha_barrier with a thread_isolate() since there are more operations to do. The reg-test was also updated to remove the 'no-ssl-reuse' keyword which is now unneeded.	2021-02-03 18:51:01 +01:00
Amaury Denoyelle	377d8786a7	BUG/MINOR: mux_h2: fix incorrect stat titles Duplicate titles for the stats H2_ST_{OPEN,TOTAL}_{CONN,STREAM}. These entries are used on csv for the heading. This must be backported up to 2.3. This fixes the github issue #1102.	2021-02-03 17:50:45 +01:00
Willy Tarreau	0630038e77	BUG/MEDIUM: ssl: check a connection's status before computing a handshake As spotted in issue #822, we're having a problem with error detection in the SSL layer. The problem is that on an overwhelmed machine, accepted connections can start to pile up, each of them requiring a slow handshake, and during all this time if the client aborts, the handshake will still be calculated. The error controls are properly placed, it's just that the SSL layer reads records exactly of the advertised size, without having the ability to encounter a pending connection error. As such if injecting many TLS connections to a listener with a huge backlog, it's fairly possible to meet this situation: 12:50:48.236056 accept4(8, {sa_family=AF_INET, sin_port=htons(62794), sin_addr=inet_addr("127.0.0.1")}, [128->16], SOCK_NONBLOCK) = 1109 12:50:48.236071 setsockopt(1109, SOL_TCP, TCP_NODELAY, [1], 4) = 0 (process other connections' handshakes) 12:50:48.257270 getsockopt(1109, SOL_SOCKET, SO_ERROR, [ECONNRESET], [4]) = 0 (proof that error was detectable there but this code was added for the PoC) 12:50:48.257297 recvfrom(1109, "\26\3\1\2\0", 5, 0, NULL, NULL) = 5 12:50:48.257310 recvfrom(1109, "\1\0\1\3"..., 512, 0, NULL, NULL) = 512 (handshake calculation taking 700us) 12:50:48.258004 sendto(1109, "\26\3\3\0z"..., 1421, MSG_DONTWAIT\|MSG_NOSIGNAL, NULL, 0) = -1 EPIPE (Broken pipe) 12:50:48.258036 close(1109) = 0 The situation was amplified by the multi-queue accept code, as it resulted in many incoming connections to be accepted long before they could be handled. Prior to this they would have been accepted and the handshake immediately started, which would have resulted in most of the connections waiting in the system's accept queue, and dying there when the client aborted, thus the error would have been detected before even trying to pass them to the handshake code.
As a result, with a listener running on a very large backlog, it's possible to quickly accept tens of thousands of connections and waste time slowly running their handshakes while they get replaced by other ones. This patch adds an SO_ERROR check on the connection's FD before starting the handshake. This is not pretty as it requires to access the FD, but it does the job. Some improvements should be made over the long term so that the transport layers can report extra information with their ->rcv_buf() call, or at the very least, implement a ->get_conn_status() function to report various flags such as shutr, shutw, error at various stages, allowing an upper layer to inquire for the relevance of engaging into a long operation if it's known the connection is not usable anymore. An even simpler step could probably consist in implementing this in the control layer. This patch is simple enough to be backported as far as 2.0. Many thanks to @ngaugler for his numerous tests with detailed feedback.	2021-02-02 15:55:53 +01:00
William Lallemand	8695ce0bae	BUG/MEDIUM: ssl/cli: abort ssl cert is freeing the old store The "abort ssl cert" command is buggy and removes the current ckch store, and instances, leading to SNI removal. It must only remove the new one. This patch also adds a check in set_ssl_cert.vtc and set_ssl_server_cert.vtc. Must be backported as far as 2.2.	2021-02-01 17:58:21 +01:00
William Dauchy	19f7cfc8c3	MINOR: stats: improve max stats descriptions In order to unify prometheus and stats description, we need to remove some field reference which are specific to stats implementation: - `scur` in max current sessions (also reword current session) - `rate` in max sessions - `req_rate` in max requests - `conn_rate` in max connections Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-01 15:16:33 +01:00
William Dauchy	eedb9b13f4	MINOR: stats: improve pending connections description In order to unify prometheus and stats description, we need to clarify the description for pending connections. - remove the BE reference in counters struct, as it is also used in servers - remove reference of `qcur` field in description as it is specific to stats implementation - try to reword cur and max pending connections description Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-01 15:16:33 +01:00
Christopher Faulet	7aa3271439	MINOR: checks: Add function to get the result code corresponding to a status The function get_check_status_result() can now be used to get the result code (CHK_RES_) corresponding to a check status (HCHK_STATUS_). It will be used by the Prometheus exporter when reporting the check status of a server.	2021-02-01 15:16:33 +01:00
Willy Tarreau	75f72338df	BUG/MINOR: activity: take care of late wakeups in "show tasks" During the call to thread_isolate(), some other threads might have performed some task_wakeup() which will have a call date past the one we retrieved. It could be avoided by taking the current date once we're alone but this would significantly affect the latency measurements by adding the isolation time. Instead we're now only accounting positive times, so that late wakeups normally appear with a zero latency. No backport is needed, this is 2.4.	2021-01-29 15:07:07 +01:00
Willy Tarreau	d597ec2718	MINOR: listener: export manage_global_listener_queue() This one pops up in tasks lists when running against a saturated listener.	2021-01-29 14:29:57 +01:00
Christopher Faulet	c29b4bf946	MINOR: mux-h2: Slightly improve request HEADERS frames sending In h2s_bck_make_req_headers() function, in the loop on the HTX blocks, the most common blocks, the headers, are now handled in first, before the start-line. The same change was already performed on the response HEADERS frames. Thus the code is more consistent now.	2021-01-29 13:28:43 +01:00
Christopher Faulet	564981369b	MINOR: mux-h2: Don't test the start-line when sending HEADERS frame When a HEADERS frame is sent, it is always when an HTX start-line block is found. Thus, in h2s_bck_make_req_headers() and h2s_frt_make_resp_headers() functions, it is useless to test the start-line. Instead of being too defensive, we use BUG_ON() now because it must not happen and must be handled as a bug. This patch should fix the issue #1086.	2021-01-29 13:27:57 +01:00
Christopher Faulet	3702f78cf9	MINOR: ssl-sample: Don't check if argument list is set in sample fetches The list is always defined by definition. Thus there is no reason to test it.	2021-01-29 13:26:24 +01:00
Christopher Faulet	e6e7a585e9	MINOR: sample: Don't check if argument list is set in sample fetches The list is always defined by definition. Thus there is no reason to test it.	2021-01-29 13:26:13 +01:00
Christopher Faulet	72dbcfe66d	MINOR: http-conv: Don't check if argument list is set in sample converters The list is always defined by definition. Thus there is no reason to test it.	2021-01-29 13:26:02 +01:00
Christopher Faulet	623af93722	MINOR: http-fetch: Don't check if argument list is set in sample fetches The list is always defined by definition. Thus there is no reason to test it. There is also plenty of checks on arguments types while it is already validated during the configuration parsing. But one thing at a time. This patch should fix the issue #1087.	2021-01-29 13:25:34 +01:00
Christopher Faulet	bdbd5db2a5	BUG/MINOR: stick-table: Always call smp_fetch_src() with a valid arg list The sample fetch functions must always be called with a valid argument list. When called by hand, if there is no argument to pass, empty_arg_list must be used. In the stick-table code, there are some calls to smp_fetch_src() with NULL as argument list. It is changed to use empty_arg_list instead. It is not really a bug because smp_fetch_src() does not use the argument list. But it is an API bug. This patch may be backported to all stable branches as a cleanup.	2021-01-29 13:24:16 +01:00
Christopher Faulet	1faeb4c710	MINOR: mux-h1: Remove first useless test on count in h1_process_output() h1_process_output() function is never called with no data to send (count == 0). Thus, the first test on count, at the beginning of the function is useless and may be removed. This way, by reading the code, it is obvious the <chn_htx> variable is always defined. This patch should fix the issue #1085.	2021-01-29 13:16:32 +01:00
Willy Tarreau	5c25daa170	MINOR: stick-tables: export process_table_expire() This handler can take quite some time as it deletes a large number of entries under a lock, let's export it so that it's immediately visible in "show profiling".	2021-01-29 12:39:32 +01:00
Willy Tarreau	f6c88421b7	MINOR: peers: export process_peer_sync() to improve traces This one will probably pop up from time to time in "show profiling", better have it resolve.	2021-01-29 12:38:42 +01:00
Willy Tarreau	025fc71b47	MINOR: checks: export a few functions that appear often in trace dumps The check I/O handler, process_chk_conn and server_warmup are often present in complex backtraces as they're impacted by locking or I/O issues. Let's export them so that they resolve cleanly.	2021-01-29 12:35:24 +01:00
Willy Tarreau	ac6322dd36	MINOR: muxes: export the timeout and shutr task handlers These ones appear often in "show tasks" so it's handy to make them resolve.	2021-01-29 12:33:46 +01:00
Willy Tarreau	02922e19ca	MINOR: session: export session_expire_embryonic() This is only to make it resolve nicely in "show tasks".	2021-01-29 12:27:57 +01:00
Willy Tarreau	fb5401f296	MINOR: listener: export accept_queue_process This is only to make it resolve in "show tasks".	2021-01-29 12:25:23 +01:00
Willy Tarreau	7eff06e162	MINOR: activity: add a new "show tasks" command to list currently active tasks This finally adds the long-awaited solution to inspect the run queues and figure what is eating the CPU or causing latencies. We can even see the experienced latencies when profiling is enabled. Example on a saturated process: > show tasks Running tasks: 14983 (4 threads) function places % lat_tot lat_avg process_stream 4948 33.0 5.840m 70.82ms h1_io_cb 2535 16.9 - - main+0x9e670 2508 16.7 2.930m 70.10ms ssl_sock_io_cb 2499 16.6 - - si_cs_io_cb 2493 16.6 - -	2021-01-29 12:12:28 +01:00
Willy Tarreau	cfa7101d59	MINOR: activity: flush scheduler stats on "set profiling tasks on" If a user enables profiling by hand, it makes sense to reset the stats counters to provide fresh new measurements. Therefore it's worth using this as the standard method to reset counters.	2021-01-29 12:10:33 +01:00
Willy Tarreau	1bd67e9b03	MINOR: activity: also report collected tasks stats in "show profiling" "show profiling" will now dump the stats collected by the scheduler if profiling was previously enabled. This will immediately make it obvious what functions are responsible for others' high latencies or which ones are suffering from others, and should help spot issues like undesired wakeups. Example: Per-task CPU profiling : on # set profiling tasks {on\|auto\|off} Tasks activity: function calls cpu_tot cpu_avg lat_tot lat_avg si_cs_io_cb 5569479 23.37s 4.196us - - h1_io_cb 5558654 13.60s 2.446us - - process_stream 250841 1.476s 5.882us 3.499s 13.95us main+0x9e670 198 - - 5.526ms 27.91us task_run_applet 17 1.509ms 88.77us 205.8us 12.11us srv_cleanup_idle_connections 12 44.51us 3.708us 25.71us 2.142us main+0x158c80 9 48.72us 5.413us - - srv_cleanup_toremove_connections 5 165.1us 33.02us 123.6us 24.72us	2021-01-29 12:10:33 +01:00
Willy Tarreau	4e2282f9bf	MEDIUM: tasks/activity: collect per-task statistics when profiling is enabled Now when the profiling is enabled, the scheduler will update per-function task-level statistics on number of calls, cpu usage and latency, that could later be checked using "show profiling". This will immediately make it obvious what functions are responsible for others' high latencies or which ones are suffering from others, and should help spot issues like undesired wakeups. For now the stats are only collected but not reported (though they are readable from sched_activity[] under gdb).	2021-01-29 12:10:33 +01:00
Willy Tarreau	3fb6a7b46e	MINOR: activity: declare a new structure to collect per-function activity The new sched_activity structure will be used to collect task-level activity based on the target function. The principle is to declare a large enough array to make collisions rare (256 entries), and hash the function pointer using a reduced XXH to decide where to store the stats. On first computation an entry is definitely assigned to the array and it's done atomically. A special entry (0) is used to store collisions ("others"). The goal is to make it easy and inexpensive for the scheduler code to use these to store #calls, cpu_time and lat_time for each task.	2021-01-29 12:10:33 +01:00
Willy Tarreau	aa622b822b	MINOR: activity: make profiling more manageable In 2.0, commit `d2d3348ac` ("MINOR: activity: enable automatic profiling turn on/off") introduced an automatic mode to enable/disable profiling. The problem is that the automatic mode automatically changes to on/off, which implied that the forced on/off modes aren't sticky anymore. It's annoying when debugging because as soon as the load decreases, profiling stops. This makes a small change which ought to have been done first, which consists in having two states for "auto" (auto-on, auto-off) to distinguish them from the forced states. Setting to "auto" in the config defaults to "auto-off" as before, and setting it on the CLI switches to auto but keeps the current operating state. This is simple enough to be backported to older releases if needed.	2021-01-29 12:10:33 +01:00
Willy Tarreau	4deeb1055f	MINOR: tools: add print_time_short() to print a condensed duration value When reporting some values in debugging output we often need to have some condensed, stable-length values. This function prints a duration from nanosecond to years with at least 4 digits of accuracy using the most suitable unit, always on 7 chars.	2021-01-29 12:10:33 +01:00
Amaury Denoyelle	a81bb7197e	BUG/MINOR: backend: check available list allocation for reuse Do not consider reusing a connection if the available list is not allocated for the target server. This will prevent a crash when using a standalone server for an external purpose like socket_tcp/socket_ssl on hlua code. For the idle/safe lists, they are considered allocated if srv.max_idle_conns is not null. Note that the hlua code is currently safe thanks to the additional checks on proxy http mode and stream reuse policy not never. However, this might not be sufficient for future code. This patch should be backported in every branch containing the following patch : `7f68d815af` (2.4 tree) REORG: backend: simplify conn_backend_get	2021-01-28 18:12:07 +01:00
Willy Tarreau	02757d02c2	Revert "BUG/MEDIUM: listener: do not accept connections faster than we can process them" This reverts commit `62e8aaa1bd`. While it works extremely well to address SSL handshake floods, it prevents establishment of new connections during regular traffic above 50-60 Gbps, because for an unknown reason the queue seems to have ~1.7 active tasks per connection all the time, which makes no sense as these ought to be waiting on subscribed events. It might uncover a deeper issue but at least for now a different solution is needed. cf issue #822. The test is trivial to run, just start a config with tune.runqueue-depth 10 and inject on 1GB objects with more than 10 connections. Try to connect to the stats socket, it only works once, then the listeners are not dequeued.	2021-01-28 18:11:32 +01:00
Willy Tarreau	62e8aaa1bd	BUG/MEDIUM: listener: do not accept connections faster than we can process them In github issue #822, user @ngaugler reported some performance problems when dealing with many concurrent SSL connections on restarts, after migrating from 1.6 to 2.2, indicating a long time required to re-establish connections. The Run_queue metric in the traces showed an abnormally high number of tasks in the run queue, likely indicating we were accepting faster than we could process. And this is indeed one of the differences between 1.6 and 2.2, the accept I/O loop and the TLS handshakes are totally independent, so much that they can even run on different threads. In 1.6 the SSL handshake was handled almost immediately after the accept(), so this was limiting the input rate. With large maxconn values, as long as there are incoming connections, new I/Os are scheduled and many of them pass before the handshake, being tagged for low latency processing. The result is that handshakes get postponed, and are further postponed as new connections are accepted. When they are finally able to be processed, some of them fail as the client is gone, and the client had already queued new ones. This causes an excess number of apparent connections and total number of handshakes to be processed, just because we were accepting connections on a temporarily saturated machine. The solution is to temporarily pause new incoming connections when the load already indicates that more tasks are already queued than will be handled in a poll loop. The difficulty with this usually is to be able to come back to re-enable the operation, but given that the metric is the run queue, we just have to queue the global_listener_queue task so that it gets picked by any thread once the run queues get flushed. Before this patch, injecting with SSL reneg with 10000 concurrent connections resulted in 350k tasks in the run queue, and a majority of handshake timeouts noticed by the client. 
With the patch, the run queue fluctuates between 1-3x runqueue-depth, the process is constantly busy, the accept rate is maximized and clients observe no error anymore. It would be desirable to backport this patch to 2.3 and 2.2 after some more testing, provided the accept loop there is compatible.	2021-01-28 16:48:01 +01:00
Christopher Faulet	405f054652	MINOR: h1: Raise the chunk size limit up to (2^52 - 1) The allowed chunk size was historically limited to 2GB to avoid risk of overflow. This restriction is no longer necessary because the chunk size is immediately stored into a 64-bit integer after the parsing. Thus, it is now possible to raise this limit. However, to never feed possibly bogus values to languages that use floats for their integers, we don't accept more than 13 hex digits (2^52 - 1). 4 petabytes is probably enough! This patch should fix the issue #1065. It may be backported as far as 2.1. For the 2.0, the legacy HTTP part must be reviewed. But there is honestly no reason to do so.	2021-01-28 16:37:14 +01:00
Christopher Faulet	73518be595	MINOR: mux-fcgi/trace: add traces at level ERROR for all kind of errors A number of traces could be added or changed to report errors with TRACE_ERROR. The goal is to be able to enable error tracing only to detect anomalies.	2021-01-28 16:37:14 +01:00
Christopher Faulet	26a2643466	MINOR: mux-h1/trace: add traces at level ERROR for all kind of errors A number of traces could be added or changed to report errors with TRACE_ERROR. The goal is to be able to enable error tracing only to detect anomalies.	2021-01-28 16:37:14 +01:00
Amaury Denoyelle	f9dcbeeab3	MEDIUM: h2: send connect protocol h2 settings In order to announce support for the Extended CONNECT h2 method by haproxy, always send the ENABLE_CONNECT_PROTOCOL h2 settings. This new setting has been described in the rfc 8441. After receiving ENABLE_CONNECT_PROTOCOL, the client is free to use the Extended CONNECT h2 method. This can notably be useful for the support of websocket handshake on http/2.	2021-01-28 16:37:14 +01:00
Amaury Denoyelle	c9a0afcc32	MEDIUM: h2: parse Extended CONNECT request to htx Support for the rfc 8441 Bootstrapping WebSockets with HTTP/2 Convert an Extended CONNECT HTTP/2 request into a htx representation. The htx message uses the GET method with an Upgrade header field to be fully compatible with the equivalent HTTP/1.1 Upgrade mechanism. The Extended CONNECT is of the following form : :method = CONNECT :protocol = websocket :scheme = https :path = /chat :authority = server.example.com The new pseudo-header :protocol has been defined and is used to identify an Extended CONNECT method. Contrary to standard CONNECT, Extended CONNECT must have :scheme, :path and :authority defined.	2021-01-28 16:37:14 +01:00
Amaury Denoyelle	efe2276a9e	MEDIUM: mux_h2: generate Extended CONNECT response Support for the rfc 8441 Bootstrapping WebSockets with HTTP/2 Convert a 101 htx response message to a 200 HTTP/2 response.	2021-01-28 16:37:14 +01:00
Amaury Denoyelle	aad333a9fc	MEDIUM: h1: add a WebSocket key on handshake if needed Add the header Sec-Websocket-Key when generating a h1 handshake websocket without this header. This is the case when doing h2-h1 conversion. The key is randomly generated and base64 encoded. It is stored on the session side to be able to verify response key and reject it if not valid.	2021-01-28 16:37:14 +01:00
Amaury Denoyelle	9bf957335e	MEDIUM: mux_h2: generate Extended CONNECT from htx upgrade Support for the rfc 8441 Bootstrapping WebSockets with HTTP/2 Generate an HTTP/2 Extended CONNECT request from a htx Upgrade message. This conversion is done when seeing the header Connection: Upgrade. A CONNECT request is written with the :protocol pseudo-header set from the Upgrade htx header value. The protocol is saved in the h2s structure. This is needed on the response side because the protocol is not present on HTTP/2 response but is needed if the client side is using HTTP/1.1 with 101 status code.	2021-01-28 16:37:14 +01:00
Amaury Denoyelle	7416274914	MEDIUM: h2: parse Extended CONNECT response to htx Support for the rfc 8441 Bootstrapping WebSockets with HTTP/2 Convert a 200 status reply from an Extended CONNECT request into a htx representation. The htx message is set to 101 status code to be fully compatible with the equivalent HTTP/1.1 Upgrade mechanism. This conversion is only done if the stream flag H2_SF_EXT_CONNECT_SENT has been set. This is true if an Extended CONNECT request has already been seen on the stream. Besides the 101 status, the additional headers Connection/Upgrade are added to the htx message. The protocol is set from the value stored in h2s. Typically it will be extracted from the client request. This is only used if the client is using h1 as only the HTTP/1.1 101 Response contains the Upgrade header.	2021-01-28 16:37:14 +01:00
Amaury Denoyelle	5fb48ea7a4	MINOR: mux_h2: define H2_SF_EXT_CONNECT_SENT stream flag This flag is used to signal that an Extended CONNECT has been sent by the server mux on the current stream. This will allow to convert the response to a 101 htx status message.	2021-01-28 16:37:14 +01:00
Amaury Denoyelle	c193823343	MEDIUM: h1: generate WebSocket key on response if needed Add the Sec-Websocket-Accept header on a websocket handshake response. This header may be missing if a h2 server is used with a h1 client. The response key is calculated following the rfc6455. For this, the handshake request key must be stored in the h1 session, as a new field named ws_key. Note that this is only done if the message has been previously identified as a Websocket handshake request.	2021-01-28 16:37:14 +01:00
Amaury Denoyelle	18ee5c3eb0	MINOR: h1: reject websocket handshake if missing key If a request is identified as a WebSocket handshake, it must contain a websocket key header or else it can be rejected, following the rfc6455. A new flag H1_MF_UPG_WEBSOCKET is set on such messages. For the request to be identified as a WebSocket handshake, it must contain the headers: Connection: upgrade Upgrade: websocket This commit is a companion of "MEDIUM: h1: generate WebSocket key on response if needed" and "MEDIUM: h1: add a WebSocket key on handshake if needed". Indeed, it ensures that a WebSocket key is added only from a http/2 side and not for a http/1 bogus peer.	2021-01-28 16:37:14 +01:00
Christopher Faulet	5b82cc5b5c	MEDIUM: http-ana: Deal with L7 retries in HTTP analysers The code dealing with the copy of requests in the L7-buffer and the retransmits during L7 retries has been moved in the HTTP analysers. The copy is now performed in the REQ_HTTP_XFER_BODY analyser and the L7 retries is performed in the RES_WAIT_HTTP analyser. This way, si_cs_recv() and si_cs_send() don't care of it anymore. It is much more natural to deal with L7 retry in HTTP analysers.	2021-01-28 16:37:14 +01:00
Christopher Faulet	991febdfe0	MEDIUM: mux-h2: Don't emit DATA frame for bodyless responses Some responses must not contain data: responses to HEAD requests and 204/304 responses. But there is no guarantee that this will be really respected by the senders or even if it is possible. For instance, the method may be rewritten by an http-request rule (HEAD->GET). Thus, it is not really possible to always strip these data from the response at the receive stage. And the response may be emitted by an applet or an internal service not strictly following the spec. All that to say that we must be prepared to handle payload for bodyless responses on the sending path. In addition, unlike the HTTP/1, it is not really clear whether the trailers are part of the payload or not. Thus, some clients may expect to have the trailers, if any, in the response to a HEAD request. For instance, the GRPC status is placed in a trailer and clients rely on it. But what happens for 204 responses then. Read the following thread for details : https://lists.w3.org/Archives/Public/ietf-http-wg/2020OctDec/0040.html So, thanks to previous patches, it is now possible to know on the sending path if a response must be bodyless or not. So, for such responses, no DATA frame is emitted, except possibly the last empty one carrying the ES flag. However, the TRAILERS frames are still emitted. The h2s_skip_data() function is added to take care to remove HTX DATA blocks without emitting any DATA frame except the last one, if there are no trailers.	2021-01-28 16:37:14 +01:00
Christopher Faulet	7d247f0771	MINOR: h2/mux-h2: Add flags to notify the response is known to have no body The H2 message flag H2_MSGF_BODYLESS_RSP is now used during the request or the response parsing to notify the mux that, considering the parsed message, the response is known to have no body. This happens during HEAD requests parsing and during 204/304 responses parsing. On the H2 multiplexer, the equivalent flag is set on H2 streams. Thus the H2_SF_BODYLESS_RESP flag is set on a H2 stream if the H2_MSGF_BODYLESS_RSP is found after a HEADERS frame parsing. Conversely, this flag is also set when a HEADERS frame is emitted for HEAD requests and for 204/304 responses. The H2_SF_BODYLESS_RESP flag will be used to ignore data payload from the response but not the trailers.	2021-01-28 16:37:14 +01:00
Christopher Faulet	f3e7619041	MINOR: mux-h1: Don't add Connection close/keep-alive header for 1xx messages No connection header must be added by the H1 mux in 1xx messages, including 101. Existing connection headers remain untouched, especially the "Connection: upgrade" of 101 responses. This patch only avoids adding "Connection: close" or "Connection: keep-alive" to 1xx responses.	2021-01-28 16:37:14 +01:00
Christopher Faulet	91fcf21e45	MINOR: mux-h1: Don't emit C-L and T-E headers for 204 and 1xx responses 204 and 1xx responses must not have any payload. Now, the H1 mux takes care of that in last resort. But they also must not have any C-L or T-E headers. Thus, if found on the sending path, these headers are ignored.	2021-01-28 16:37:14 +01:00
Christopher Faulet	e5596bf53f	MEDIUM: mux-h1: Don't emit any payload for bodyless responses Some responses must not contain data: responses to HEAD requests and 204/304 responses. But there is no guarantee that this will really be respected by the senders, or even that it is possible. For instance, the method may be rewritten by an http-request rule (HEAD->GET). Thus, it is not really possible to always strip the payload from the response at the receive stage. And the response may be emitted by an applet or an internal service not strictly following the spec. All that to say that we must be prepared to handle payload for bodyless responses on the sending path. So, thanks to previous patches, it is now possible to know on the sending path if a response must be bodyless or not. So, for such responses, no payload is emitted: all HTX blocks after the EOH are silently removed (including the trailers).	2021-01-28 16:37:14 +01:00
Christopher Faulet	5696f5450e	MINOR: mux-h1: Add a flag on H1 streams with a response known to be bodyless In HTTP/1, responses to HEAD requests and 204/304 responses must not have a payload. The H1S_F_BODYLESS_RESP flag is now set on streams that should handle such responses, on the client side and the server side. On the client side, this flag is set when a HEAD request is parsed and when a 204/304 response is emitted. On the server side, this happens when a HEAD request is emitted or a 204/304 response is parsed.	2021-01-28 16:37:14 +01:00
Christopher Faulet	d1ac2b90cd	MAJOR: htx: Remove the EOM block type and use HTX_FL_EOM instead The EOM block may be removed. The HTX_FL_EOM flag is enough. Most of the time, to know if the end of the message is reached, we just need to have an empty HTX message with the HTX_FL_EOM flag set. It may also be detected when the last block of a message with the HTX_FL_EOM flag is manipulated. Removing EOM blocks simplifies the HTX message filling. Indeed, there are no more edge cases when the message ends but there is no space left to write the EOM block. However, some parts are trickier, especially the compression filter and the FCGI mux. The compression filter must finish the compression on the last DATA block. Before, it was performed on the EOM block, and an extra DATA block with the checksum was added. Now, we must detect the last DATA block to be sure to finish the compression. The FCGI mux, for its part, must be sure to reserve the space for the empty STDIN record on the last DATA block, whereas this record used to be inserted on the EOM block. The H2 multiplexer is probably the part that benefits the most from this change. Indeed, it is now much easier to know when to set the ES flag. The HTX documentation has been updated accordingly.	2021-01-28 16:37:14 +01:00
Christopher Faulet	42432f347f	MINOR: htx: Rename HTX_FL_EOI flag into HTX_FL_EOM The HTX_FL_EOI flag is not well named. For now, it is not very used. But that will change. It will replace the EOM block. Thus, it is renamed.	2021-01-28 16:37:14 +01:00
Christopher Faulet	5be651d4d7	BUG/MAJOR: mux-h1/mux-h2/htx: Fix HTTP tunnel management at the mux level Tunnel management between the H1 and H2 multiplexers is a bit blurred, and the HTX is not defined well enough on this point to make things clear. In fact, establishing a tunnel between an H2 client and an H1 server, or the opposite, is buggy because the two multiplexers don't handle the EOM block the same way when a tunnel is established. The H2 multiplexer is pretty strict and adds an END_STREAM flag when an EOM block is found, while the H1 multiplexer is more flexible. The purpose of this patch is to make the EOM block usage pretty clear and to fix the HTTP multiplexers to really handle HTTP tunnels in the right way. Now, an EOM block is used to mark the end of an HTTP message, semantically speaking. That means it may be followed by tunneled data. Thus, CONNECT requests are now finished by an EOM block, just after the EOH block. On the H1 multiplexer side, a tunnel is now only established on the response path. So a CONNECT request remains in a DONE state waiting for the 2xx response. On the H2 multiplexer side, a flag is used to know an HTTP tunnel is requested, so as not to immediately add the END_STREAM flag on the EOM block. All these changes are sensitive and not backportable because of recent changes. The same problem exists on earlier versions and should be addressed. But it will only be possible with a specific patchset. This patch relies on the following ones : * MEDIUM: mux-h1: Properly handle tunnel establishments and aborts * MEDIUM: mux-h2: Close streams when processing data for an aborted tunnel * MEDIUM: mux-h2: Block client data on server side waiting tunnel establishment * MINOR: mux-h2: Add 2 flags to help to properly handle tunnel mode * MINOR: mux-h1: Split H1C_F_WAIT_OPPOSITE flag to separate input/output sides * MINOR: mux-h1/mux-fcgi: Don't set TUNNEL mode if payload length is unknown	2021-01-28 16:37:14 +01:00
Christopher Faulet	dea2474991	MEDIUM: mux-h1: Properly handle tunnel establishments and aborts In the same way as the H2 mux, we now block data sending on the server side if a tunnel is not fully established. In addition, if some data are still pending for an aborted tunnel, an error is triggered and the server connection is closed. To do so, we rely on the H1C_F_WAIT_INPUT flag to block the output processing. This patch contributes to fixing the tunnel mode between the H1 and the H2 muxes.	2021-01-28 16:37:14 +01:00
Christopher Faulet	91b21dc8d8	MEDIUM: mux-h2: Close streams when processing data for an aborted tunnel In the previous patch ("MEDIUM: mux-h2: Block client data on server side waiting tunnel establishment"), we added a way to block client data for a not fully established tunnel on the server side. This one closes the stream with an ERR_CANCEL error if there are some pending tunneled data while the tunnel was aborted. This may happen on the client side if a non-empty DATA frame or an empty DATA frame without the ES flag is received. This may also happen on the server side if there is a DATA htx block. However, in this last case, we first wait until the response is fully forwarded. This patch contributes to fixing the tunnel mode between the H1 and the H2 muxes.	2021-01-28 16:37:14 +01:00
Christopher Faulet	f95f87650f	MEDIUM: mux-h2: Block client data on server side waiting tunnel establishment On the server side, when a tunnel is not fully established, we must block tunneled data, waiting for the server response. It is mandatory because the server may refuse the tunnel. This happens when a DATA htx block is processed in tunnel mode (H2_SF_BODY_TUNNEL flag set) but before the response HEADERS frame is received (H2_SF_HEADERS_RCVD flag not set). In this case, the H2_SF_BLK_MBUSY flag is set to mark the stream as busy. This flag is removed when the tunnel is fully established or aborted. This patch contributes to fixing the tunnel mode between the H1 and the H2 muxes.	2021-01-28 16:37:14 +01:00
Christopher Faulet	d0db42326d	MINOR: mux-h2: Add 2 flags to help to properly handle tunnel mode H2_SF_BODY_TUNNEL and H2_SF_TUNNEL_ABRT flags are added to properly handle the tunnel mode in the H2 mux. The first one is used to detect tunnel establishment or fully established tunnel. The second one is used to abort a tunnel attempt. It is the first commit having as a goal to fix tunnel establishment between H1 and H2 muxes. There is a subtlety in h2_rcv_buf(). CS_FL_EOS flag is added on the conn-stream when ES is received on a tunneled stream. It really reflects the conn-stream state and is mandatory for next commits.	2021-01-28 16:37:14 +01:00
Christopher Faulet	b385b50fbb	MINOR: mux-h1: Split H1C_F_WAIT_OPPOSITE flag to separate input/output sides The H1C_F_WAIT_OPPOSITE flag is now split into 2 flags, H1C_F_WAIT_INPUT and H1C_F_WAIT_OUTPUT, depending on which side is waiting. The change is a prerequisite to fix the tunnel mode management in HTTP muxes. H1C_F_WAIT_INPUT must be used to block the output side and to wait for an event from the input side. H1C_F_WAIT_OUTPUT does the opposite. It blocks the input side and waits for an event from the output side.	2021-01-28 16:37:14 +01:00
Christopher Faulet	1e857785e9	MINOR: mux-h1/mux-fcgi: Don't set TUNNEL mode if payload length is unknown Responses with no C-L and T-E headers are no longer switched to TUNNEL mode and remain in DATA mode instead. The H1 and FCGI muxes are updated accordingly. This change reflects the real message state. It is not a true tunnel. Data received are still part of the message. It is not a bug. However, this patch may be backported after some observation period (at least as far as 2.2).	2021-01-28 16:37:14 +01:00
Christopher Faulet	8989942cfc	BUG/MINOR: h2/mux-h2: Reject 101 responses with a PROTOCOL_ERROR h2s error As stated in the RFC7540, section 8.1.1, the HTTP/2 removes support for the 101 informational status code. Thus a PROTOCOL_ERROR is now returned to the server if a 101-switching-protocols response is received. Thus, the server connection is aborted. This patch may be backported as far as 2.0.	2021-01-28 16:36:40 +01:00
Christopher Faulet	6e6c7b1284	MEDIUM: http-ana: Refuse invalid 101-switching-protocols responses A 101-switching-protocols response must contain a Connection header with the Upgrade option. And this response must only be received from a server if the client explicitly requested a protocol upgrade. Thus, the request must also contain a Connection header with the Upgrade option. If not, a 502-bad-gateway response is returned to the client. This way, a tunnel is only established if both sides agree. It is closer to what the RFC says, but it remains a bit flexible because there is no check on the Upgrade header itself. However, that's probably enough to ensure a tunnel is not established when not requested. This one is not tagged as a bug. But it may be backported, at least to 2.3. It relies on : * MINOR: htx/http-ana: Save info about Upgrade option in the Connection header	2021-01-28 16:27:48 +01:00
Christopher Faulet	576c358508	MINOR: htx/http-ana: Save info about Upgrade option in the Connection header Add an HTX start-line flag and its counterpart into the HTTP message to track the presence of the Upgrade option into the Connection header. This way, without parsing the Connection header again, it will be easy to know if a client asks for a protocol upgrade and if the server agrees to do so. It will also be easy to perform some conformance checks when a 101-switching-protocols is received.	2021-01-28 16:27:48 +01:00
Christopher Faulet	0f9395d81e	BUG/MAJOR: mux-h1: Properly handle TCP to H1 upgrades It is the second part and the most important of the fix. Since the mux-h1 refactoring, and more specifically since the commit `c4bfa59f1` ("MAJOR: mux-h1: Create the client stream as later as possible"), the upgrade from a TCP client connection to H1 is broken. Indeed, now the H1 mux is responsible for creating the frontend conn-stream once the request headers are fully received. But, to properly support TCP to H1 upgrades, we must inherit from the existing conn-stream. To do so, if the conn-stream already exists when the client H1 connection is created, we create a H1 stream in ST_ATTACHED state, but not ST_READY, and the conn-stream is attached to it. Because the ST_READY state is not set, no data are transferred to the data layer when h1_rcv_buf() is called and shutdowns are inhibited except on client aborts. This way, the request is parsed the same way as for a classical H1 connection. Once the request headers are fully received and parsed, the data stream is upgraded and the ST_READY state is set. A tricky case appears when an H2 upgrade is performed because the H2 preface is matched. In this case, the conn-stream must be detached and destroyed before switching to the H2 mux and releasing the current H1 mux. We must also take care to detach and destroy the conn-stream when a timeout occurs. 
This patch relies on the following series of patches : * BUG/MEDIUM: stream: Don't immediatly ack the TCP to H1 upgrades * MEDIUM: http-ana: Do nothing in wait-for-request analyzer if not htx * MINOR: stream: Add a function to validate TCP to H1 upgrades * MEDIUM: mux-h1: Add ST_READY state for the H1 connections * MINOR: mux-h1: Wake up instead of subscribe for reads after H1C creation * MINOR: mux-h1: Try to wake up data layer first before calling its wake callback * MINOR: stream-int: Take care of EOS in the SI wake callback function * BUG/MINOR: stream: Don't update counters when TCP to H2 upgrades are performed This fix is specific for 2.4. No backport needed.	2021-01-28 16:27:48 +01:00
Christopher Faulet	cdd1e2a44b	BUG/MEDIUM: stream: Don't immediatly ack the TCP to H1 upgrades Instead of switching the stream to HTX mode, the request channel is only reset (the request buffer is transferred to the mux) and the SF_IGNORE flag is set on the stream. This flag prevents any processing in case of abort. Once the upgrade is confirmed, the flag is removed, in stream_upgrade_from_cs(). It is only the first part of the fix. The next one ("BUG/MAJOR: mux-h1: Properly handle TCP to H1 upgrades") is also required. Both rely on the following series of patches : * MEDIUM: http-ana: Do nothing in wait-for-request analyzer if not htx * MINOR: stream: Add a function to validate TCP to H1 upgrades * MEDIUM: mux-h1: Add ST_READY state for the H1 connections * MINOR: mux-h1: Wake up instead of subscribe for reads after H1C creation * MINOR: mux-h1: Try to wake up data layer first before calling its wake callback * MINOR: stream-int: Take care of EOS in the SI wake callback function * BUG/MINOR: stream: Don't update counters when TCP to H2 upgrades are performed This fix is specific for 2.4. No backport needed.	2021-01-28 16:27:48 +01:00
Christopher Faulet	da46a0dca7	MEDIUM: http-ana: Do nothing in wait-for-request analyzer if not htx If the http_wait_for_request() analyzer is called with a non-htx stream, nothing is performed and we return immediately. For now, it is totally unexpected. But it will be true during TCP to H1 upgrades, once fixed. Indeed, there will be a transition period during these upgrades. First the mux will be upgraded and not the stream, and finally the stream will be upgraded by the mux once ready. In the meantime, the stream will still be in raw mode. Nothing will be performed in the wait-for-request analyzer because it will be the mux's responsibility to handle errors. This patch is required to fix the TCP to H1 upgrades.	2021-01-28 16:27:48 +01:00
Christopher Faulet	4ef84c9c41	MINOR: stream: Add a function to validate TCP to H1 upgrades TCP to H1 upgrades are buggy for now. When such an upgrade is performed, a crash is experienced. The bug is the result of the recent H1 mux refactoring, and more specifically of the commit `c4bfa59f1` ("MAJOR: mux-h1: Create the client stream as later as possible"). Indeed, now the H1 mux is responsible for creating the frontend conn-stream once the request headers are fully received. Thus the TCP to H1 upgrade is a problem because the frontend conn-stream already exists. To fix the bug, we must keep this conn-stream and the associated stream and use them in the H1 mux. To do so, the upgrade will be performed in two steps. First, the mux is upgraded from mux-pt to mux-h1. Then, the mux-h1 performs the stream upgrade, once the request headers are fully received and parsed. To do so, stream_upgrade_from_cs() must be used. This function sets the SF_HTX flag to switch the stream to HTX mode, removes the SF_IGNORE flag and finally fills the request channel with some input data. This patch is required to fix the TCP to H1 upgrades and is intimately linked with the next commits.	2021-01-28 16:27:48 +01:00
Christopher Faulet	39c7b6b09d	MEDIUM: mux-h1: Add ST_READY state for the H1 connections An alive H1 connection may be in one of these 3 states : * ST_IDLE : not active and is waiting to be reused (no h1s and no cs) * ST_EMBRYONIC : active with a h1s but without any cs * ST_ATTACHED : active with a h1s and a cs ST_IDLE and ST_ATTACHED are possible for frontend and backend connections. ST_EMBRYONIC is only possible on the client side, when we are waiting for the request headers. The last one is the expected state for an active connection processing data. These states are mutually exclusive. Now, there is a new state, ST_READY. It may only be set if ST_ATTACHED is also set and when the CS is considered as fully active. For now, ST_READY is set at the same time as ST_ATTACHED. But it will be used to fix TCP to H1 upgrades. The idea is to have an H1 connection in ST_ATTACHED state but not ST_READY yet, with more or less the same behavior as an H1 connection in ST_EMBRYONIC state. And when the upgrade is fully achieved, the ST_READY state may be set and the data layer may be notified accordingly. So for now, this patch should not change anything. TCP to H1 upgrades are still buggy. But it is mandatory to make them work properly.	2021-01-28 16:27:48 +01:00
Christopher Faulet	d9ee788b7a	MINOR: mux-h1: Wake up H1C after its creation if input buffer is not empty When a H1 connection is created, we now wakeup the H1C tasklet if there are some data in the input buffer. If not we only subscribe for reads. This patch is required to fix the TCP to H1 upgrades.	2021-01-28 16:27:15 +01:00
Christopher Faulet	ad4daf629e	MINOR: mux-h1: Try to wake up data layer first before calling its wake callback Instead of calling the data layer wake callback function, we now first try to wake it up. If the data layer is subscribed for receives or for sends, its tasklet is woken up. The wake callback function is only called as the last chance to notify the data layer.	2021-01-28 16:22:53 +01:00
Christopher Faulet	89e34c261b	MEDIUM: stream-int: Take care of EOS if the SI wake callback function Because si_cs_process() is also the SI wake callback function, it may be called from the mux layer. Thus, in such cases, it is performed outside any I/O event and si_cs_recv() is not called. If a read0 is reported by the mux, via the CS_FL_EOS flag, the event is not handled, because only si_cs_recv() takes care of this flag for now. It is not a bug, because this does not happen for now. All muxes set this flag when the data layer retrieves data (via mux->rcv_buf()). But it is safer to be prepared to handle it from the wake callback. And in fact, it will be useful to fix the HTTP upgrades of TCP connections (especially TCP>H1>H2 upgrades). To be sure not to handle the same event twice, it is only handled if the shutr is not already set on the input channel.	2021-01-28 16:22:04 +01:00
Amaury Denoyelle	08d87b3f49	BUG/MEDIUM: backend: never reuse a connection for tcp mode The reuse of idle connections should only happen for a proxy with the http mode. In case of a backend with the tcp mode, the reuse selection and insertion in session list are skipped. This behavior is present since commit : MEDIUM: connection: Add private connections synchronously in session server list It could also be further exaggerated by : MEDIUM: backend: add reused conn to sess if mux marked as HOL blocking It can be backported up to 2.3.	2021-01-28 14:18:33 +01:00
William Lallemand	8d67394f69	BUG/MINOR: ssl: init tmp chunk correctly in ssl_sock_load_sctl_from_file() Use chunk_initstr() for a chunk initialisation in ssl_sock_load_sctl_from_file() instead of a manual initialisation which was not initialising head. Fix issue #1073. Must be backported as far as 2.2	2021-01-27 14:58:51 +01:00
William Lallemand	b8868498ed	CLEANUP: ssl: remove dead code in ckch_inst_new_load_srv_store() The new ckch_inst_new_load_srv_store() function which mimics the ckch_inst_new_load_store() function includes some dead code which was used only in the former function. Fix issue #1081.	2021-01-27 14:44:59 +01:00
Christopher Faulet	3888b8cd7b	BUG/MINOR: stats: Add a break after filling ST_F_MODE field for servers The previous patch was pushed too quickly (`399bf72f6` "BUG/MINOR: stats: Remove a break preventing ST_F_QCUR to be set for servers"). It was not an extra break but a misplaced break statement. Thus, now a break statement must be added after filling the ST_F_MODE field in stats_fill_sv_stats(). No backport needed except if the above commit is backported.	2021-01-27 13:32:26 +01:00
Christopher Faulet	399bf72f66	BUG/MINOR: stats: Remove a break preventing ST_F_QCUR to be set for servers There is an extra break statement wrongly placed in the stats_fill_sv_stats() function, just before filling the ST_F_QCUR field. It prevents this field from being set to the right value for servers. No backport needed except if commit d3a9a4992 ("MEDIUM: stats: allow to select one field in `stats_fill_sv_stats`") is backported.	2021-01-27 12:48:38 +01:00
William Lallemand	db26e2b00e	CLEANUP: ssl: make load_srv_{ckchs,cert} match their bind counterpart This patch makes things more consistent between the bind_conf functions and the server ones: - ssl_sock_load_srv_ckchs() loads the SSL_CTX in the server (ssl_sock_load_ckchs() load the SNIs in the bind_conf) - add the server parameter to ssl_sock_load_srv_ckchs() - changes made to the ckch_inst are done in ckch_inst_new_load_srv_store()	2021-01-26 15:19:36 +01:00
William Lallemand	795bd9ba3a	CLEANUP: ssl: remove SSL_CTX function parameter Since the server SSL_CTX is now stored in the ckch_inst, it is not needed anymore to pass an SSL_CTX to ckch_inst_new_load_srv_store() and ssl_sock_load_srv_ckchs().	2021-01-26 15:19:36 +01:00
William Lallemand	1dedb0a82a	CLEANUP: ssl/cli: rework free in cli_io_handler_commit_cert() The new feature allowing the change of server side certificates introduced duplicated free code. Rework the code in cli_io_handler_commit_cert() to be more consistent.	2021-01-26 15:19:36 +01:00
Remi Tricot-Le Breton	bb470aa327	MINOR: ssl: Remove client_crt member of the server's ssl context The client_crt member is not used anymore since the server's ssl context initialization now behaves the same way as the bind lines one (using ckch stores and instances).	2021-01-26 15:19:36 +01:00
Remi Tricot-Le Breton	f3eedfe195	MEDIUM: ssl: Enable backend certificate hot update When trying to update a backend certificate, we should find a server-side ckch instance thanks to which we can rebuild a new ssl context and a new ckch instance that replace the previous ones in the server structure. This way any new ssl session will be built out of the new ssl context and the newly updated certificate. This resolves a subpart of GitHub issue #427 (the certificate part)	2021-01-26 15:19:36 +01:00
Remi Tricot-Le Breton	d817dc733e	MEDIUM: ssl: Load client certificates in a ckch for backend servers In order for the backend server's certificate to be hot-updatable, it needs to fit into the implementation used for the "bind" certificates. This patch follows the architecture implemented for the frontend implementation and reuses its structures and general function calls (adapted for the server side). The ckch store logic is kept and a dedicated ckch instance is used (one per server). The whole sni_ctx logic was not kept though because it is not needed. All the new functions added in this patch are basically server-side copies of functions that already exist on the frontend side with all the sni and bind_conf references removed. The ckch_inst structure has a new 'is_server_instance' flag which is used to distinguish regular instances from the server-side ones, and a new pointer to the server's structure in case of backend instance. Since the new server ckch instances are linked to a standard ckch_store, a lookup in the ckch store table will succeed so the cli code used to update bind certificates needs to be covered to manage those new server side ckch instances.	2021-01-26 15:19:36 +01:00
Remi Tricot-Le Breton	ec805a32b9	MINOR: ssl: Certificate chain loading refactorization Move the certificate chain loading code into a dedicated function that will then be useable elsewhere.	2021-01-26 15:19:36 +01:00
Remi Tricot-Le Breton	442b7f2238	MINOR: ssl: Server ssl context prepare function refactoring Split the server's ssl context initialization into the general ssl related initializations and the actual initialization of a single SSL_CTX structure. This way the context's initialization will be usable by itself from elsewhere.	2021-01-26 15:19:36 +01:00
Amaury Denoyelle	7f68d815af	REORG: backend: simplify conn_backend_get Reorganize the conditions for the reuse of idle/safe connections : - reduce code by using variable to store reuse mode and idle/safe conns counts - consider that idle/safe/avail lists are properly allocated if max_idle_conns not null. An allocation failure prevents haproxy startup.	2021-01-26 14:48:39 +01:00
Amaury Denoyelle	37e25bcd1e	CLEANUP: backend: remove an obsolete comment on conn_backend_get This comment was valid for haproxy 1.8 but now it is obsolete.	2021-01-26 14:48:39 +01:00
Amaury Denoyelle	18c68df558	CLEANUP: srv: fix comment for pool-max-conn Adjust comment for the unlimited value of pool-max-conn which is -1.	2021-01-26 14:48:39 +01:00
Amaury Denoyelle	69c5c3ab33	BUG/MINOR: config: fix leak on proxy.conn_src.bind_hdr_name Leak for parsing of option usesrc of the source keyword. This can be backported to 1.8.	2021-01-26 14:48:39 +01:00
Christopher Faulet	6071c2d12d	BUG/MEDIUM: filters/htx: Fix data forwarding when payload length is unknown It is only a problem on the response path because the request payload length is always known. But when a filter is registered to analyze the response payload, the filtering may hang if the server closes just after the headers. The root cause of the bug comes from an attempt to allow the filters to not immediately forward the headers if necessary. A filter may choose to hold the headers by not forwarding any bytes of the payload. For a message with no payload but a known payload length, there is always an EOM block to forward. Thus holding the EOM block for bodyless messages is a good way to also hold the headers. However, for messages with an unknown payload length, there is no EOM block finishing the message, but only a SHUTR flag on the channel to mark the end of the stream. If there is no payload when it happens, there is no payload at all to forward. In the filters API, it is wrongly detected as a condition to not forward the headers. Because it is not the most used feature and not the obvious one, this patch introduces another way to hold the message headers at the beginning of the forwarding. A filter flag is added to explicitly say the headers should be held. A filter may choose to set the STRM_FLT_FL_HOLD_HTTP_HDRS flag and not forward anything to hold the headers. This flag is removed at each call, thus it must always be explicitly set by filters. This flag is only evaluated if no byte has ever been forwarded because the headers are forwarded with the first byte of the payload. reg-tests/filters/random-forwarding.vtc reg-test is updated to also test responses with unknown payload length (with and without payload). This patch must be backported as far as 2.0.	2021-01-26 09:53:52 +01:00
William Dauchy	d3a9a4992b	MEDIUM: stats: allow to select one field in `stats_fill_sv_stats` The prometheus approach requires outputting all values for a given metric name; meaning we iterate through all metrics, and then iterate in the inner loop on all objects for this metric. In order to allow more code reuse, adapt the stats API to be able to select one field or fill them all otherwise. This patch follows what has already been done on the frontend and backend side. From this patch it should be possible to remove most of the duplicate code on the prometheus side for the server. A few things to note though: - state requires prior calculation, so I moved that to a sort of helper `stats_fill_be_stats_computestate`. - all ST_F*TIME fields require some minor computation, so I moved it to the beginning of the function under a condition. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-01-26 09:24:51 +01:00
William Dauchy	da3b466fc2	MEDIUM: stats: allow to select one field in `stats_fill_be_stats` The prometheus approach requires outputting all values for a given metric name; meaning we iterate through all metrics, and then iterate in the inner loop on all objects for this metric. In order to allow more code reuse, adapt the stats API to be able to select one field or fill them all otherwise. This patch follows what has already been done on the frontend side. From this patch it should be possible to remove most of the duplicate code on the prometheus side for the backend. A few things to note though: - the status and uweight fields require prior computation, so I moved that to a sort of helper `stats_fill_be_stats_computesrv`. - all ST_F*TIME fields require some minor computation, so I moved it to the beginning of the function under a condition. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-01-26 09:24:19 +01:00
Ilya Shipitsin	7704b0e1e1	CLEANUP: assorted typo fixes in the code and comments This is 16th iteration of typo fixes	2021-01-26 09:16:48 +01:00
William Dauchy	2107a0faf5	CLEANUP: stats: improve field selection for frontend http fields While working on backend/servers I realised I could have written that in a better way and avoided one extra break. This slightly improves readability. Also, while being here, fix the function declaration which was not 100% accurate. This patch does not change the behaviour of the code. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-01-25 15:53:28 +01:00
Christopher Faulet	8596bfbafd	BUG/MINOR: stats: Init the metric variable when frontend stats are filled In stats_fill_fe_stats(), some fields are conditional (ST_F_HRSP_* for instance). But unlike unimplemented fields, for those fields, the <metric> variable is used to fill the <stats> array, but it is not initialized. This bug has no impact, because these fields are not used. But it is better to fix it now to avoid future bugs. To fix it, the metric is now defined and initialized in the for loop. The bug was introduced by the commit `0ef54397` ("MEDIUM: stats: allow to select one field in `stats_fill_fe_stats`"). No backport is needed except if the above commit is backported. It fixes the issue #1063.	2021-01-25 15:53:03 +01:00
Ilya Shipitsin	1fc44d494a	BUILD: ssl: guard Client Hello callbacks with HAVE_SSL_CLIENT_HELLO_CB macro instead of openssl version let us introduce new macro HAVE_SSL_CLIENT_HELLO_CB and guard callback functions with it	2021-01-22 20:45:24 +01:00
Christopher Faulet	d808f1759d	BUG/MINOR: stats: Continue to fill frontend stats on unimplemented metric A regression was introduced by the commit `0ef54397b` ("MEDIUM: stats: allow to select one field in `stats_fill_fe_stats`"). stats_fill_fe_stats() function fails on unimplemented metrics for frontends. However, not all stats metrics are used by frontends. For instance ST_F_QCUR. As a consequence, the frontends stats are always skipped. To fix the bug, we just skip unimplemented metric for frontends. An error is triggered only if a specific field is given and is unimplemented. No backport is needed except if the above commit is backported.	2021-01-22 17:42:32 +01:00
Bertrand Jacquin	f4c12d4da2	BUILD/MINOR: lua: define _GNU_SOURCE for LLONG_MAX Lua requires LLONG_MAX defined with __USE_ISOC99 which is set by _GNU_SOURCE, not necessarily defined by default on old compiler/glibc. $ make V=1 TARGET=linux-glibc-legacy USE_THREAD= USE_ACCEPT4= USE_PCRE=1 USE_OPENSSL=1 USE_ZLIB=1 USE_LUA=1 .. cc -Iinclude -O2 -g -Wall -Wextra -Wdeclaration-after-statement -fwrapv -Wno-strict-aliasing -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-missing-field-initializers -DUSE_EPOLL -DUSE_NETFILTER -DUSE_PCRE -DUSE_POLL -DUSE_TPROXY -DUSE_LINUX_TPROXY -DUSE_LINUX_SPLICE -DUSE_LIBCRYPT -DUSE_CRYPT_H -DUSE_GETADDRINFO -DUSE_OPENSSL -DUSE_LUA -DUSE_FUTEX -DUSE_ZLIB -DUSE_CPU_AFFINITY -DUSE_DL -DUSE_RT -DUSE_PRCTL -DUSE_THREAD_DUMP -I/usr/include/openssl101e/ -DUSE_PCRE -I/usr/include -DCONFIG_HAPROXY_VERSION=\"2.4-dev5-73246d-83\" -DCONFIG_HAPROXY_DATE=\"2021/01/21\" -c -o src/hlua.o src/hlua.c In file included from /usr/local/include/lua.h:15, from /usr/local/include/lauxlib.h:15, from src/hlua.c:16: /usr/local/include/luaconf.h:581:2: error: #error "Compiler does not support 'long long'. Use option '-DLUA_32BITS' or '-DLUA_C89_NUMBERS' (see file 'luaconf.h' for details)" .. 
cc -Iinclude -O2 -g -Wall -Wextra -Wdeclaration-after-statement -fwrapv -Wno-strict-aliasing -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-missing-field-initializers -DUSE_EPOLL -DUSE_NETFILTER -DUSE_PCRE -DUSE_POLL -DUSE_TPROXY -DUSE_LINUX_TPROXY -DUSE_LINUX_SPLICE -DUSE_LIBCRYPT -DUSE_CRYPT_H -DUSE_GETADDRINFO -DUSE_OPENSSL -DUSE_LUA -DUSE_FUTEX -DUSE_ZLIB -DUSE_CPU_AFFINITY -DUSE_DL -DUSE_RT -DUSE_PRCTL -DUSE_THREAD_DUMP -I/usr/include/openssl101e/ -DUSE_PCRE -I/usr/include -DCONFIG_HAPROXY_VERSION=\"2.4-dev5-73246d-83\" -DCONFIG_HAPROXY_DATE=\"2021/01/21\" -c -o src/hlua_fcn.o src/hlua_fcn.c In file included from /usr/local/include/lua.h:15, from /usr/local/include/lauxlib.h:15, from src/hlua_fcn.c:17: /usr/local/include/luaconf.h:581:2: error: #error "Compiler does not support 'long long'. Use option '-DLUA_32BITS' or '-DLUA_C89_NUMBERS' (see file 'luaconf.h' for details)" .. Cc: Thierry Fournier <tfournier@arpalert.org>	2021-01-22 16:17:56 +01:00
Bertrand Jacquin	80839ff8e4	MINOR: lua: remove unused variable hlua_init() uses 'idx' only in openssl related code, while 'i' is used in shared code and is safe to be reused. This commit replaces the use of 'idx' with 'i' $ make V=1 TARGET=linux-glibc USE_LUA=1 USE_OPENSSL= .. cc -Iinclude -O2 -g -Wall -Wextra -Wdeclaration-after-statement -fwrapv -Wno-address-of-packed-member -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wno-cast-function-type -Wtype-limits -Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond -Wnull-dereference -DUSE_EPOLL -DUSE_NETFILTER -DUSE_POLL -DUSE_THREAD -DUSE_BACKTRACE -DUSE_TPROXY -DUSE_LINUX_TPROXY -DUSE_LINUX_SPLICE -DUSE_LIBCRYPT -DUSE_CRYPT_H -DUSE_GETADDRINFO -DUSE_LUA -DUSE_FUTEX -DUSE_ACCEPT4 -DUSE_CPU_AFFINITY -DUSE_TFO -DUSE_NS -DUSE_DL -DUSE_RT -DUSE_PRCTL -DUSE_THREAD_DUMP -I/usr/include/lua5.3 -I/usr/include/lua5.3 -DCONFIG_HAPROXY_VERSION=\"2.4-dev5-37286a-78\" -DCONFIG_HAPROXY_DATE=\"2021/01/21\" -c -o src/hlua.o src/hlua.c src/hlua.c: In function 'hlua_init': src/hlua.c:9145:6: warning: unused variable 'idx' [-Wunused-variable] 9145 \| int idx; \| ^~~	2021-01-22 16:14:34 +01:00
Willy Tarreau	2cbe2e7f84	BUILD: debug: fix build warning by consuming the write() result When writing commit `a8459b28c` ("MINOR: debug: create ha_backtrace_to_stderr() to dump an instant backtrace") I just forgot that some distros are a bit extremist about the syscall return values. src/debug.c: In function `ha_backtrace_to_stderr': src/debug.c:147:3: error: ignoring return value of `write', declared with attribute warn_unused_result [-Werror=unused-result] write(2, b.area, b.data); ^~~~~~~~~~~~~~~~~~~~~~~~ CC src/h1_htx.o Let's apply the usual tricks to shut them up. No backport is needed.	2021-01-22 15:58:26 +01:00
Willy Tarreau	2bfce7e424	MINOR: debug: let ha_dump_backtrace() dump a bit further for some callers The dump state is now passed to the function so that the caller can adjust the behavior. A new series of 4 values allow to stop after dumping main instead of before it or any of the usual loops. This allows to also report BUG_ON() that could happen very high in the call graph (e.g. startup, or the scheduler itself) while still understanding what the call path was.	2021-01-22 14:48:34 +01:00
Willy Tarreau	5baf4fe31a	MEDIUM: debug: now always print a backtrace on CRASH_NOW() and friends The purpose is to enable the dumping of a backtrace on BUG_ON(). While it's very useful to know that a condition was met, very often some caller context is missing to figure how the condition could happen. From now on, on systems featuring backtrace, a backtrace of the calling thread will also be dumped to stderr in addition to the unexpected condition. This will help users of DEBUG_STRICT as they'll most often find this backtrace in their logs even if they can't find their core file. A new "debug dev bug" expert-mode CLI command was added to test the feature.	2021-01-22 14:18:34 +01:00
Willy Tarreau	a8459b28c3	MINOR: debug: create ha_backtrace_to_stderr() to dump an instant backtrace This function calls the ha_dump_backtrace() function with a locally allocated buffer and sends the output slightly indented to fd #2. It's meant to be used as an emergency backtrace dump.	2021-01-22 14:15:36 +01:00
Willy Tarreau	123fc9786a	MINOR: debug: extract the backtrace dumping code to its own function The backtrace dumping code was located into the thread dump function but it looks particularly convenient to be able to call it to produce a dump in other situations, so let's move it to its own function and make sure it's called last in the function so that we can benefit from tail merging to save one entry.	2021-01-22 13:52:41 +01:00
Willy Tarreau	2f1227eb3f	MINOR: debug: always export the my_backtrace function In order to simplify the code and remove annoying ifdefs everywhere, let's always export my_backtrace() and make it adapt to the situation and return zero if not supported. A small update in the thread dump function was needed to make sure we don't use its results if it fails now.	2021-01-22 12:12:29 +01:00
Willy Tarreau	3d4631fec6	BUG/MEDIUM: mux-h2: fix read0 handling on partial frames Since commit `aade4edc1` ("BUG/MEDIUM: mux-h2: Don't handle pending read0 too early on streams"), we've met a few cases where an early connection close wouldn't be properly handled if some data were pending in a frame header, because the test now considers the buffer's contents before accepting to report the close, but given that frame headers or preface are consumed at once, the buffer cannot make progress when it's stuck at intermediary lengths. In order to address this, this patch introduces two flags in the h2c connection to store any reported shutdown and failed parsing. The idea is that we cannot rely on conn_xprt_read0_pending() in the parser since it wouldn't consider data pending in the buffer nor intermediary layers, but we know for certain that after a read0 is reported by the transport layer in presence of an RD_SH on the connection, no more progress will be made there. This alone is not sufficient to decide to end processing, we can only do this once these final data have been submitted to a parser. Therefore, now when a parser fails on missing data, we check if a read0 has already been reported on this connection, and if so we set a new END_REACHED flag on the connection to indicate a failure to process the final data. The h2c_read0_pending() function now simply reports this flag's status. This way we're certain that the input shutdown is only considered after the demux attempted to parse the last frame. Maybe over the long term the subscribe() API should be improved to synchronously fail when trying to subscribe for an event that will not happen. This may be an elegant solution that could possibly work across multiple layers and even muxes, and be usable at a few specific places where that's needed. Given the patch above was backported as far as 2.0, this one should be backported there as well. 
It is possible that the fcgi mux has the same issue, but this was not analysed yet. Thanks to Pierre Cheynier for providing detailed traces allowing to quickly narrow the problem down, and to Olivier for his analysis.	2021-01-22 10:54:15 +01:00
Christopher Faulet	341064eb16	BUG/MINOR: stream: Don't update counters when TCP to H2 upgrades are performed When a TCP to H2 upgrade is performed, the SF_IGNORE flag is set on the stream before killing it. This happens when a TCP/SSL client connection is routed to an HTTP backend and the h2 alpn is detected. The SF_IGNORE flag was added for this purpose, to skip some processing when the stream is aborted before a mux upgrade. Some counter updates were skipped this way. But some others are still updated. Now, all counter updates at the end of process_stream(), before releasing the stream, are ignored if the SF_IGNORE flag is set. Note this stream is aborted because we switch from a mono-stream to a multi-stream multiplexer. It works differently for TCP to H1 upgrades. This patch should be backported as far as 2.0 after some observation period.	2021-01-22 09:06:34 +01:00
William Dauchy	b9577450ea	MINOR: contrib/prometheus-exporter: use fill_fe_stats for frontend dump use `stats_fill_fe_stats` when possible to avoid duplicating code; make use of field selector to get the needed field only. this should not introduce any difference of output. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-01-21 18:59:30 +01:00
William Dauchy	0ef54397b0	MEDIUM: stats: allow to select one field in `stats_fill_fe_stats` The prometheus approach requires outputting all values for a given metric name, meaning we iterate through all metrics, and then iterate in the inner loop on all objects for this metric. In order to allow more code reuse, adapt the stats API to be able to select one field or fill them all otherwise. From this patch it should be possible to remove most of the duplicate code on the prometheus side for the frontend. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-01-21 18:59:30 +01:00
William Dauchy	defd15685e	MINOR: stats: add new start time field Another patch in order to try to reconcile haproxy stats and prometheus. Here I'm adding a proper start time field in order to make proper use of the uptime field. That being done, we can move the calculation into `fill_info`. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-01-21 18:59:30 +01:00
William Dauchy	a8766cfad1	MINOR: stats: duplicate 3 fields in bytes in info in order to prepare a possible merge of fields between haproxy stats and prometheus, duplicate 3 fields: INF_MEMMAX INF_POOL_ALLOC INF_POOL_USED Those were specifically named in MB unit which is not what prometheus recommends. We therefore used them but changed the unit while doing the calculation. It created a specific case for that, up to the description. This patch: - removes some possible confusion, i.e. using MB field for bytes - will permit an easier merge of fields such as description First consequence for now, is that we can remove the calculation on prometheus side and move it on `fill_info`. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-01-21 18:59:30 +01:00
Christopher Faulet	1d2d77b27f	MEDIUM: mux-h1: Return a 501-not-implemented for upgrade requests with a body If an HTTP protocol upgrade request with a payload is received, a 501-not-implemented error is now returned to the client. It is valid from the RFC point of view but will be incompatible with the way the H2 websockets will be handled by HAProxy. And it is probably a very uncommon way to perform protocol upgrades.	2021-01-21 15:21:12 +01:00
Christopher Faulet	2eed800d54	MINOR: mux-h1: Be prepared to return 501-not-implemented error during parsing With this patch, the H1 mux is now able to return 501-not-implemented errors to client during the request parsing. However, no such errors are returned for now.	2021-01-21 15:21:12 +01:00
Christopher Faulet	142dd33912	MINOR: muxes: Add exit status for errors about not implemented features The MUX_ES_NOTIMPL_ERR exit status is added to allow the multiplexers to report errors about not implemented features. This will be used by the H1 mux to return 501-not-implemented errors.	2021-01-21 15:21:12 +01:00
Christopher Faulet	e095f31d36	MINOR: http: Add HTTP 501-not-implemented error message Add the support for the 501-not-implemented status code with the corresponding default message. The documentation is updated accordingly because it is now part of status codes HAProxy may emit via an errorfile or a deny/return HTTP action.	2021-01-21 15:21:12 +01:00
Christopher Faulet	7d013e796c	BUG/MEDIUM: mux-h2: Xfer rxbuf to the upper layer when creating a front stream Just like the H1 multiplexer, when a new frontend H2 stream is created, the rxbuf is xferred to the stream at the upper layer. Originally, it is not a bug fix, but just an api standardization. And in fact, it fixes a crash when a h2 stream is aborted after the request parsing but before the first call to process_stream(). It crashes since the commit `8bebd2fe5` ("MEDIUM: http-ana: Don't process partial or empty request anymore"). It is now totally unexpected to have an HTTP stream without a valid request. But here the stream is unable to get the request because the client connection was aborted. Passing it during the stream creation fixes the bug. But the true problem is that the stream-interfaces are still relying on the connection state while only the muxes should do so. This fix is specific for 2.4. No backport needed.	2021-01-21 15:21:12 +01:00
Christopher Faulet	8f100427c4	BUG/MEDIUM: tcpcheck: Don't destroy connection in the wake callback context When a tcpcheck ruleset uses multiple connections, the existing one must be closed and destroyed before opening the new one. This part is handled in the tcpcheck_main() function, when called from the wake callback function (wake_srv_chk). But it is indeed a problem, because this function may be called from the mux layer. This means a mux may call the wake callback function of the data layer, which may release the connection and the mux. It is easy to see how it is hazardous. And actually, depending on the scheduling, it leads to crashes. Thus, we must avoid releasing the connection in the wake callback context, and move this part into the check's process function instead. To do so, we rely on the CHK_ST_CLOSE_CONN flag. When a connection must be replaced by a new one, this flag is set on the check, in the tcpcheck_main() function, and the check's task is woken up. Then, the connection is really closed in the process_chk_conn() function. This patch must be backported as far as 2.2, with some adaptations however because the code is not exactly the same.	2021-01-21 15:21:12 +01:00
Bertrand Jacquin	25439de181	BUG/MINOR: mworker: define _GNU_SOURCE for strsignal() glibc < 2.10 requires _GNU_SOURCE in order to make use of strsignal(), otherwise leading to SEGV at runtime. $ make V=1 TARGET=linux-glibc-legacy USE_THREAD= USE_ACCEPT4= .. src/mworker.c: In function 'mworker_catch_sigchld': src/mworker.c:285: warning: implicit declaration of function 'strsignal' src/mworker.c:285: warning: pointer/integer type mismatch in conditional expression .. $ make V=1 reg-tests REGTESTS_TYPES=slow,default .. ###### Test case: reg-tests/mcli/mcli_start_progs.vtc ###### ## test results in: "/tmp/haregtests-2021-01-19_15-18-07.n24989/vtc.29077.28f6153d" ---- h1 Bad exit status: 0x008b exit 0x0 signal 11 core 128 ---- h1 Assert error in haproxy_wait(), src/vtc_haproxy.c line 792: Condition(*(&h->fds[1]) >= 0) not true. Errno=0 Success .. $ gdb ./haproxy /tmp/core.0.haproxy.30270 .. Core was generated by `/root/haproxy/haproxy -d -W -S fd@8 -dM -f /tmp/haregtests-2021-01-19_15-18-07.'. Program terminated with signal 11, Segmentation fault. 
#0 0x00002aaaab387a10 in strlen () from /lib64/libc.so.6 (gdb) bt #0 0x00002aaaab387a10 in strlen () from /lib64/libc.so.6 #1 0x00002aaaab354b69 in vfprintf () from /lib64/libc.so.6 #2 0x00002aaaab37788a in vsnprintf () from /lib64/libc.so.6 #3 0x00000000004a76a3 in memvprintf (out=0x7fffedc680a0, format=0x5a5d58 "Current worker #%d (%d) exited with code %d (%s)\n", orig_args=0x7fffedc680d0) at src/tools.c:3868 #4 0x00000000004bbd40 in print_message (label=0x58abed "ALERT", fmt=0x5a5d58 "Current worker #%d (%d) exited with code %d (%s)\n", argp=0x7fffedc680d0) at src/log.c:1066 #5 0x00000000004bc07f in ha_alert (fmt=0x5a5d58 "Current worker #%d (%d) exited with code %d (%s)\n") at src/log.c:1109 #6 0x0000000000534b7b in mworker_catch_sigchld (sh=<value optimized out>) at src/mworker.c:293 #7 0x0000000000556af3 in __signal_process_queue () at src/signal.c:88 #8 0x00000000004f6216 in signal_process_queue () at include/haproxy/signal.h:39 #9 run_poll_loop () at src/haproxy.c:2859 #10 0x00000000004f63b7 in run_thread_poll_loop (data=<value optimized out>) at src/haproxy.c:3028 #11 0x00000000004faaac in main (argc=<value optimized out>, argv=0x7fffedc68498) at src/haproxy.c:904 See: https://man7.org/linux/man-pages/man3/strsignal.3.html Must be backported as far as 2.0.	2021-01-21 12:16:52 +01:00
Willy Tarreau	0c0c0a2878	MINOR: mux-h1/show_fd: report as suspicious an entry with too many calls An FD entry that maps to an H1 connection whose stream was woken up more than 1M times is now flagged as suspicious.	2021-01-21 09:18:25 +01:00
Willy Tarreau	06bf83e0ae	MINOR: mux-h2/show_fd: report as suspicious an entry with too many calls An FD entry that maps to an H2C connection whose last stream was woken up more than 1M times is now flagged as suspicious.	2021-01-21 09:17:42 +01:00
Willy Tarreau	4bd5d630ac	MINOR: ssl/show_fd: report some FDs as suspicious when possible If a subscriber's tasklet was called more than one million times, if the ssl_ctx's connection doesn't match the current one, or if the connection appears closed in one direction while the SSL stack is still subscribed, the FD is reported as suspicious. The close cases may occasionally trigger a false positive during very short and rare windows. Similarly the 1M calls will trigger after 16GB are transferred over a given connection. These are rare enough events to be reported as suspicious.	2021-01-21 09:09:05 +01:00
Willy Tarreau	dacfde4ba4	MINOR: cli/show_fd: report some easily detectable suspicious states A file descriptor which maps to a connection but has more than one thread in its mask, or an FD handle that doesn't correspond to the FD, or with no mux context, or an FD with no thread in its mask, or with more than 1 million events is flagged as suspicious.	2021-01-21 09:09:05 +01:00
Willy Tarreau	8050efeacb	MINOR: cli: give the show_fd helpers the ability to report a suspicious entry Now the show_fd helpers at the transport and mux levels return an integer which indicates whether or not the inspected entry looks suspicious. When an entry is reported as suspicious, "show fd" will suffix it with an exclamation mark ('!') in the dump, that is supposed to help detecting them. For now, helpers were adjusted to adapt to the new API but none of them reports any suspicious entry yet.	2021-01-21 08:58:15 +01:00
Willy Tarreau	1776ffb975	MINOR: mux-fcgi: make the "show fd" helper also decode the fstrm subscriber when known When dumping a live fcgi stream, also take the opportunity for reporting the subscriber including the event, tasklet, handler and context.	2021-01-20 17:17:40 +01:00
Willy Tarreau	150c4f8b72	MINOR: mux-h1: make the "show fd" helper also decode the h1s subscriber when known When dumping a live h1 stream, also take the opportunity for reporting the subscriber including the event, tasklet, handler and context. Example: 3030 : st=0x21(R:rA W:Ra) ev=0x04(heOpi) [Lc] tmask=0x4 umask=0x0 owner=0x7f97805c1f70 iocb=0x65b847(sock_conn_iocb) back=1 cflg=0x00002300 sv=s1/recv mux=H1 ctx=0x7f97805c21b0 h1c.flg=0x80000200 .sub=1 .ibuf=0@(nil)+0/0 .obuf=0@(nil)+0/0 h1s=0x7f97805c2380 h1s.flg=0x4010 .req.state=MSG_DATA .res.state=MSG_RPBEFORE .meth=POST status=0 .cs.flg=0x00000000 .cs.data=0x7f97805c1720 .subs=0x7f97805c1748(ev=1 tl=0x7f97805c1990 tl.calls=2 tl.ctx=0x7f97805c1720 tl.fct=si_cs_io_cb) xprt=RAW	2021-01-20 17:17:40 +01:00
Willy Tarreau	98e40b9818	MINOR: mux-h2: make the "show fd" helper also decode the h2s subscriber when known When dumping a valid h2 stream, also dump the subscriber, its events, tasklet context and handler. Example: 128 : st=0x21(R:rA W:Ra) ev=0x01(heopI) [lc] tmask=0x1 umask=0x0 owner=0x7f40380d7370 iocb=0x65b71b(sock_conn_iocb) back=0 cflg=0x00001300 fe=recv mux=H2 ctx=0x1ad23e0 h2c.st0=FRP .err=0 .maxid=3 .lastid=-1 .flg=0x10000 .nbst=2 .nbcs=2 .fctl_cnt=0 .send_cnt=0 .tree_cnt=2 .orph_cnt=0 .sub=1 .dsi=3 .dbuf=16366@0x1ea9380+16441/16448 .msi=-1 .mbuf=[1..1\|32],h=[0@(nil)+0/0],t=[0@(nil)+0/0] last_h2s=0x20a8340 .id=3 .st=OPN .flg=0x4100 .rxbuf=0@(nil)+0/0 .cs=0x20a8440(.flg=0x00100000 .data=0x20a8738) .subs=0x20a8760(ev=1 tl=0x20a89b0 tl.calls=22 tl.ctx=0x20a8738 tl.fct=si_cs_io_cb) xprt=SSL xprt_ctx=0x1aaf4c0 xctx.st=0 .xprt=RAW .wait.ev=1 .subs=0x1ad28e0(ev=1 tl=0x1ab3c70 tl.calls=176 tl.ctx=0x1ad23e0 tl.fct=h2_io_cb) .sent_early=0 .early_in=0	2021-01-20 17:17:39 +01:00
Willy Tarreau	691d503896	MINOR: xprt/mux: export all *_io_cb functions so that "show fd" resolves them In FD dumps it's often very important to figure what upper layer function is going to be called. Let's export the few I/O callbacks that appear as tasklet functions so that "show fd" can resolve them instead of printing a pointer relative to main. For example: 1028 : st=0x21(R:rA W:Ra) ev=0x01(heopI) [lc] tmask=0x2 umask=0x2 owner=0x7f00b889b200 iocb=0x65b638(sock_conn_iocb) back=0 cflg=0x00001300 fe=recv mux=H2 ctx=0x7f00c8824de0 h2c.st0=FRH .err=0 .maxid=795 .lastid=-1 .flg=0x0000 .nbst=0 .nbcs=0 .fctl_cnt=0 .send_cnt=0 .tree_cnt=0 .orph_cnt=0 .sub=1 .dsi=795 .dbuf=0@(nil)+0/0 .msi=-1 .mbuf=[1..1\|32],h=[0@(nil)+0/0],t=[0@(nil)+0/0] xprt=SSL xprt_ctx=0x7f00c86d0750 xctx.st=0 .xprt=RAW .wait.ev=1 .subs=0x7f00c88252e0(ev=1 tl=0x7f00a07d1aa0 tl.calls=1047 tl.ctx=0x7f00c8824de0 tl.fct=h2_io_cb) .sent_early=0 .early_in=0	2021-01-20 17:17:39 +01:00
Willy Tarreau	de5675a38c	MINOR: ssl: provide a "show fd" helper to report important SSL information The SSL context contains a lot of important details that are currently missing from debug outputs. Now that we detect ssl_sock, we can perform some sanity checks, print the next xprt, the subscriber callback's context, handler and number of calls. The process function is also resolved. This now gives for example on an H2 connection: 1029 : st=0x21(R:rA W:Ra) ev=0x01(heopI) [lc] tmask=0x2 umask=0x2 owner=0x7fc714881700 iocb=0x65b528(sock_conn_iocb) back=0 cflg=0x00001300 fe=recv mux=H2 ctx=0x7fc734545e50 h2c.st0=FRH .err=0 .maxid=217 .lastid=-1 .flg=0x0000 .nbst=0 .nbcs=0 .fctl_cnt=0 .send_cnt=0 .tree_cnt=0 .orph_cnt=0 .sub=1 .dsi=217 .dbuf=0@(nil)+0/0 .msi=-1 .mbuf=[1..1\|32],h=[0@(nil)+0/0],t=[0@(nil)+0/0] xprt=SSL xprt_ctx=0x7fc73478f230 xctx.st=0 .xprt=RAW .wait.ev=1 .subs=0x7fc734546350(ev=1 tl=0x7fc7346702e0 tl.calls=278 tl.ctx=0x7fc734545e50 tl.fct=main-0x144efa) .sent_early=0 .early_in=0	2021-01-20 17:17:39 +01:00
Willy Tarreau	108a271049	MINOR: xprt: add a new show_fd() helper to complete some "show fd" dumps. Just like we did for the muxes, now the transport layers will have the ability to provide helpers to report more detailed information about their internal context. When the helper is not known, the pointer continues to be dumped as-is if it's not NULL. This way a transport with no context nor dump function will not add a useless "xprt_ctx=(nil)" but the pointer will be emitted if valid or if a helper is defined.	2021-01-20 17:17:39 +01:00
Willy Tarreau	37be953424	MINOR: cli: make "show fd" also report the xprt and xprt_ctx These ones are definitely missing from some dumps, let's report them! We print the xprt's name instead of its useless pointer, as well as its ctx when xprt is not NULL.	2021-01-20 17:17:39 +01:00
Willy Tarreau	eb0595d039	CLEANUP: cli: make "show fd" use a const connection to access other fields Over time the code has uglified, casting fdt.owner as a struct connection for about everything. Let's have a const struct connection* there and take this opportunity for passing all fields as const as well. Additionally a misplaced closing parenthesis on the output was fixed.	2021-01-20 17:17:39 +01:00
Willy Tarreau	45fd1030d5	CLEANUP: tools: make resolve_sym_name() take a const pointer When `0c439d895` ("BUILD: tools: make resolve_sym_name() return a const") was written, the pointer argument ought to have been turned to const for more flexibility. Let's do it now.	2021-01-20 17:17:39 +01:00


10974 Commits